Stars
Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.
NVIDIA TensorRT-RTX is an SDK for high-performance AI inference on NVIDIA RTX GPUs. This repository contains Open-Source Software components of TensorRT-RTX.
LightMem: Lightweight and Efficient Memory-Augmented Generation
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
Tools for merging pretrained large language models.
Unofficial implementation of Titans, SOTA memory for transformers, in PyTorch
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes; NeurIPS 2024; Official code
[CVPR 2025] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
All of the n8n workflows I could find (also from the site itself)
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
A Model Context Protocol (MCP) server that provides image generation capabilities using Bytedance's SeedDream 4.0 model via the FAL AI platform.
VibeVoice: Expressive, longform conversational speech synthesis. (Community fork)
Voice Activity Detector (VAD): low-latency, high-performance, and lightweight
PyTorch code and models for the DINOv2 self-supervised learning method.
[CVPR'24 Best Student Paper] Mip-Splatting: Alias-free 3D Gaussian Splatting
Official PyTorch implementation of BigVGAN (ICLR 2023)
Single Image to 3D using Cross-Domain Diffusion for 3D Generation
An open-source implementation of Large Reconstruction Models
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25 Spotlight).
Added vLLM support to IndexTTS for faster inference.
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
QLoRA: Efficient Finetuning of Quantized LLMs