Stars
Lightweight coding agent that runs in your terminal
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
HunyuanVideo: A Systematic Framework For Large Video Generation Model
A trainable PyTorch reproduction of AlphaFold 3.
Adaptive Length Image Tokenization via Recurrent Allocation | How many tokens is an image worth?
Tencent Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation
The best OSS video generation models, created by Genmo
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & TIS & vLLM & Ray & Dynamic Sampling & Async Agentic RL)
A Comprehensive Toolkit for High-Quality PDF Content Extraction
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
Text-to-Music Generation with Rectified Flow Transformers
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…
A collection of projects designed to help developers quickly get started with building deployable applications using the Claude API
Efficient Triton Kernels for LLM Training
High-resolution models for human tasks.
SGLang is a fast serving framework for large language models and vision language models.
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
Benchmarking Legal Knowledge of Large Language Models
Real-time face swap and one-click video deepfake with only a single image
Official PyTorch implementation of "Authentic Hand Avatar from a Phone Scan via Universal Hand Model", CVPR 2024.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
Run OpenAI's CLIP and Apple's MobileCLIP model on iOS to search photos.
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation