Stars
PyTorch code and models for VJEPA2 self-supervised learning from video.
GameVerse: Can Vision-Language Models Learn from Video-based Reflection?
Official repo of "Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens"
[CVPR 2026] Thinking in 360°: Humanoid Visual Search in the Wild
A paper list for spatial reasoning
PlayStation 4 emulator for Windows, Linux, macOS and FreeBSD written in C++
Instant voice cloning by MIT and MyShell. Audio foundation model.
Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to implementing productions. Qlib supports diverse ML modeling paradigms, i…
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic.
Guidance for courses in the Department of Computer Science and Technology, Tsinghua University
[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
🤖 AgentVerse 🪐 is designed to facilitate the deployment of multiple LLM-based agents in various applications, which primarily provides two frameworks: task-solving and simulation