Stars
Compute 2D human pose and angles from a video or a webcam.
Fast Hadamard transform in CUDA, with a PyTorch interface
An inequality benchmark for theorem proving
Code for paper: "LLM-based Automated Theorem Proving Hinges on Scalable Synthetic Data Generation"
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
MAGI-1: Autoregressive Video Generation at Scale
A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training
This repository contains the implementation of the paper "MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models".
The official implementation of "Self-play LLM Theorem Provers with Iterative Conjecturing and Proving"
A collection of assignments on large language models, for beginners getting started and for experienced practitioners to practice advanced techniques.
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
Fully open reproduction of DeepSeek-R1
Minimal reproduction of DeepSeek R1-Zero
SONAR, a new multilingual and multimodal fixed-size sentence embedding space, with a full suite of speech and text encoders and decoders.
Large Concept Models: Language modeling in a sentence representation space
OCR, layout analysis, reading order, table recognition in 90+ languages
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines"
[TKDE'25] The official GitHub page for the survey paper "A Survey on Mixture of Experts in Large Language Models".
A Collection of Foundation Driving Models by OpenDriveLab
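🌞 CareGPT (关怀GPT) is a medical large language model that also brings together dozens of publicly available medical fine-tuning datasets and openly available medical LLMs, covering LLM training, evaluation, and deployment, to accelerate the development of medical LLMs. Medical LLM, Open Source Driven for a Healthy Future.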
Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models