- Facebook AI Research (FAIR)
- Menlo Park
- rongjiehuang.github.io
Stars
A brief and partial summary of RLHF algorithms.
[ICLR 2026 Oral] ScaleCUA provides open-source computer use agents that can operate across platforms (Windows, macOS, Ubuntu, Android).
OpenClaw-RL: Train any agent simply by talking
MAI-UI: Real-World Centric Foundation GUI Agents ranging from 2B to 235B
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.
This repository contains code and metadata for the How2 dataset
The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…
MAGI-1: Autoregressive Video Generation at Scale
Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
A high-throughput and memory-efficient inference and serving engine for LLMs
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
This is the official repo for the paper "LongCat-Flash-Omni Technical Report"
Large Concept Models: Language modeling in a sentence representation space
Build local voice agents with open-source models
Official implementation of "HumanAesExpert: Advancing a Multi-Modality Foundation Model for Human Image Aesthetic Assessment"
An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation
Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation.
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
An open-source implementation for fine-tuning the Qwen-VL series by Alibaba Cloud.
[ACM MM 2025] FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
Scalable and memory-optimized training of diffusion models