Starred repositories
Paper reading and discussion notes, covering AI frameworks, distributed systems, cluster management, etc.
Boosting GPU utilization for LLM serving via dynamic spatial-temporal prefill & decode orchestration
Dynamic Memory Management for Serving LLMs without PagedAttention
A low-latency & high-throughput serving engine for LLMs
A multi-voice TTS system trained with an emphasis on quality
An official implementation for "X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval"
Zero-Shot Detection via Vision and Language Knowledge Distillation
[ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
[arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation
Official inference code and LongText-Bench benchmark for our paper X-Omni (https://arxiv.org/pdf/2507.22058).
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages
[WACV 2023] Audio-Visual Efficient Conformer (AVEC) for Robust Speech Recognition
An up-to-date list of works on Multi-Task Learning
awesome-autonomous-driving
Official implementation of CrossViT. https://arxiv.org/abs/2103.14899
Artifact from "Hardware Compute Partitioning on NVIDIA GPUs". This is a fork of Bakita's repo. I am not one of the authors of the paper.
Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS
Official PyTorch implementations for "SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation" (NeurIPS 2022)
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
[ICML 2022] ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks
[ICLR 2020] Lite Transformer with Long-Short Range Attention
Code for "Adaptive Frequency Enhancement Network for Remote Sensing Image Semantic Segmentation"