Kazuo Yamamoto fand-ee
Toshiba Corporation
Tokyo, Japan
Stars
🚀🚀 "Large Model": train a small 26M-parameter GPT entirely from scratch in just 2 hours! 🌏
[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"
MiniCPM-o 4.5: A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone
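Llama Chinese Community: a real-time roundup of the latest Llama learning resources, building the best open-source ecosystem for Chinese Llama large models; fully open source and commercially usable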
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
Hierarchical Reasoning Model Official Release
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
An Autonomous LLM Agent for Complex Task Solving
Mobile-Agent: The Powerful GUI Agent Family
A lightweight LMM-based Document Parsing Model
Klavis AI (YC X25): MCP integration platforms that let AI agents use tools reliably at any scale
ACI.dev is the open source tool-calling platform that hooks up 600+ tools into any agentic IDE or custom AI agent through direct function calling or a unified MCP server. The birthplace of VibeOps.
Align Anything: Training All-modality Model with Feedback
HunyuanVideo-1.5: A leading lightweight video generation model
This inventory management system follows the process currently used in the Ford Asia Pacific after-sales logistics warehousing supply chain. After I left Ford, I started this project. You can share your vacant wa…
[ECCV 2024] Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
Nexent is a zero-code platform for auto-generating agents — no orchestration, no complex drag-and-drop required. Nexent also offers powerful capabilities for agent running control, data processing …
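Recommendation Algorithm: a large-scale recommendation algorithm library that covers classic and recent recommender-system algorithms, including LR, Wide&Deep, DSSM, TDM, MIND, Word2Vec, Bert4Rec, DeepWalk, SSR, AITM, DSIN, SIGN, IPREC, GRU4Rec, Youtube_dnn, NCF, GNN, FM, FFM, DeepFM, DCN, DIN, DIEN, DLRM, MMOE, PLE, ESM…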
FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs o…
The easiest and laziest way to build multi-agent LLM applications.
[ICLR 2025] Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
Mirix is a multi-agent personal assistant designed to track on-screen activities and answer user questions intelligently. By capturing real-time visual data and consolidating it into structured mem…
The next-generation deep reinforcement learning toolkit
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
PromptEnhancer is a prompt-rewriting tool, refining prompts into clearer, structured versions for better image generation.
Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI, specializing in vision-language reasoning.