Real-time behaviour synthesis with MuJoCo, using Predictive Control
Environment generation code for the paper "Emergent Tool Use"
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
Open, non-commercial SDXL model for quality image generation
Reasoning-powered OCR VLM for converting complex documents to Markdown
Production-tested AI infrastructure tools
Python example app from the OpenAI API quickstart tutorial
React app for inspecting, building and debugging with the Realtime API
Instructions on how to use the Realtime API on microcontrollers
Vision-language-action model for robot control via images and text
800,000 step-level correctness labels on LLM solutions to MATH problems
A Production-ready Reinforcement Learning AI Agent Library
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Phi-3.5 for Mac: Locally-run Vision and Language Models
Analyze computation-communication overlap in V3/R1
Latent Diffusion and Stable Diffusion Implementation
A mix of GAN implementations including progressive growing
Generate embeddings from large-scale graph-structured data
QwQ-32B is a reasoning-focused language model for complex tasks
Qwen2.5-VL is the multimodal large language model series
Chat & pretrained large audio language model proposed by Alibaba Cloud
Advanced bilingual image editing with semantic control
Chat & pretrained large vision language model
Instruction-tuned 7B language model for chat and complex tasks
Powerful 14B LLM with strong instruction and long-text handling