Stars
DeepCode: Open Agentic Coding (Paper2Code & Text2Web & Text2Backend)
Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis
AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis
[ICML'23] StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
[NeurIPS'21] Projected GANs Converge Faster
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation
A bridge to use Langchain output as an OpenAI-compatible API
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
Efficient vision foundation models for high-resolution generation and perception.
This repo contains the code for a 1D tokenizer and generator
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations
Writing Kubernetes controllers can be simple
End-to-end realtime stack for connecting humans and AI
[CVPR 2025] The First Investigation of CoT Reasoning (RL, TTS, Reflection) in Image Generation
An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines
Official inference framework for 1-bit LLMs
High performance self-hosted photo and video management solution.
Vim-fork focused on extensibility and usability
Clean, modern, Python 3.6+ code generator & library for Protobuf 3 and async gRPC
Docker files and images to run Ceph in containers
Garnet is a remote cache-store from Microsoft Research that offers strong performance (throughput and latency), scalability, storage, recovery, cluster sharding, key migration, and replication feat…
[ECCV 2024] Official PyTorch implementation code for realizing the technical part of Mixture of All Intelligence (MoAI) to improve performance of numerous zero-shot vision language tasks.
The fastest knowledge base for growing teams. Beautiful, realtime collaborative, feature packed, and markdown compatible.