Multilingual large voice generation model with full-stack inference, training, and deployment capabilities.

Python 17,972 1,996 Updated Dec 17, 2025

[NeurIPS 2023 D&B] VidChapters-7M: Video Chapters at Scale

Jupyter Notebook 200 23 Updated Nov 13, 2023

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 19,214 2,044 Updated Oct 21, 2025

The official Python library for the OpenAI API

Python 29,505 4,470 Updated Dec 17, 2025

LLM101n: Let's build a Storyteller

35,885 1,961 Updated Aug 1, 2024
Python 4,460 432 Updated Sep 14, 2025

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

70,511 8,066 Updated Jun 4, 2025

[arXiv 2023] Set-of-Mark Prompting for GPT-4V and LMMs

Python 1,497 111 Updated Aug 19, 2024

Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with Inference, Fine-Tuning, RAG. We also show you how to solve end to end problems using Llama mode…

Jupyter Notebook 18,086 2,672 Updated Nov 3, 2025

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

Python 3,283 209 Updated Mar 5, 2024

Code for Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach

Jupyter Notebook 469 29 Updated Dec 29, 2023

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 24,171 2,680 Updated Aug 12, 2024

[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"

Python 4,755 454 Updated Aug 19, 2024

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Jupyter Notebook 17,244 1,562 Updated Sep 5, 2024

Instruction Tuning with GPT-4

HTML 4,338 306 Updated Jun 11, 2023

The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.

Python 21,235 3,673 Updated Jul 4, 2024

Open-Set Grounded Text-to-Image Generation

Python 2,184 164 Updated Mar 6, 2024

A playbook for systematically maximizing the performance of deep learning models.

29,524 2,408 Updated Jun 18, 2024

[CVPR 2023] Official Implementation of X-Decoder for generalized decoding for pixel, image and language

Python 1,337 159 Updated Oct 5, 2023
Jupyter Notebook 3,046 287 Updated Feb 27, 2023

Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion

Jupyter Notebook 7,766 808 Updated Dec 8, 2022

A compilation of network architectures for vision and others without usage of self-attention mechanism

81 7 Updated Jan 18, 2023

Official implementation of AAAI 2023 paper "Parameter-efficient Model Adaptation for Vision Transformers"

Python 106 8 Updated Aug 7, 2023

[NeurIPS 2022] code for "K-LITE: Learning Transferable Visual Models with External Knowledge" https://arxiv.org/abs/2204.09222

Python 52 2 Updated Jun 12, 2023

Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.

Jupyter Notebook 5,706 539 Updated Aug 29, 2025

A collection of papers on the topic of "Computer Vision in the Wild (CVinW)"

1,350 59 Updated Mar 14, 2024

Toolkit for Elevater Benchmark

Python 76 18 Updated Oct 17, 2023

This is an official PyTorch/GPU implementation of SupMAE.

Jupyter Notebook 79 4 Updated Aug 30, 2022

[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"

Python 798 57 Updated Mar 20, 2024