Stars
Machine Learning Engineering Open Book
Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, Slurm, 20+ clouds, on-prem).
[MathCoder, MathCoder-VL] Family of LLMs/LMMs for mathematical reasoning.
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
Implementing DeepSeek R1's GRPO algorithm from scratch
A WebGL2 fluid simulation tool implementing the lattice Boltzmann method (LBM) for advection-diffusion problems.
Skip Context Tree Switching - Reference Implementation
Optical character recognition for Japanese text, with the main focus being Japanese manga
Codebase for Berkeley Humanoid Lite
Code release for "LLMs can see and hear without any training"
[ICLR 2025] Official Implementation of M3: 3D-Spatial Multimodal Memory
[NeurIPS'24] This repository is the implementation of "SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models"
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
A project to improve the skills of large language models
A pattern for an always-on AI Assistant powered by DeepSeek-V3, RealtimeSTT, and Typer for engineering
NVIDIA Cosmos is an open platform of world models, datasets, and tools that enables developers to build Physical AI for robots, autonomous vehicles, smart infrastructure, and more.
🤗 smolagents: a barebones library for agents that think in code.
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
A Pythonic framework to simplify AI service building
A latent text-to-image diffusion model
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
A unified 3D Transformer Pipeline for visual synthesis