Free ChatGPT & DeepSeek API key. Free access to the DeepSeek API and GPT4 API, with support for top-ranked, commonly used large models such as gpt | deepseek | claude | gemini | grok.

Python 35,030 2,492 Updated Dec 15, 2025

A Next.js web application that integrates AI capabilities with draw.io diagrams. This app allows you to create, modify, and enhance diagrams through natural language commands and AI-assisted visual…

TypeScript 14,048 1,423 Updated Dec 21, 2025

LangGraph 1.0 Tutorial

Jupyter Notebook 250 26 Updated Dec 18, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 64,282 7,790 Updated Dec 21, 2025

Quickly extract audio and video content and organize it into a structured Markdown note

Python 1,960 286 Updated Jul 26, 2024

Ola: Pushing the Frontiers of Omni-Modal Language Model

Python 381 16 Updated Jun 13, 2025

Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2

Jupyter Notebook 3,127 361 Updated Nov 11, 2025

Fully open reproduction of DeepSeek-R1

Python 25,745 2,405 Updated Nov 24, 2025

Tarsier: a family of large-scale video-language models designed to generate high-quality video descriptions, with strong general video understanding capability.

Python 507 28 Updated Aug 14, 2025

One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks

Python 3,401 461 Updated Dec 18, 2025

PyTorch code and models for the DINOv2 self-supervised learning method.

Jupyter Notebook 12,097 1,145 Updated Dec 17, 2025

A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision, llama-3.2-vision, qwen-vl, qwen2-vl, phi3-v etc.

Python 359 41 Updated Dec 11, 2025
Jupyter Notebook 9 4 Updated Nov 17, 2024

TransNet V2: Shot Boundary Detection Neural Network

Python 809 129 Updated Dec 4, 2023

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 24,197 2,685 Updated Aug 12, 2024

Official inference repo for FLUX.1 models

Python 24,936 1,829 Updated Jul 31, 2025

An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"

Python 1,014 138 Updated Apr 12, 2024

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 17,302 1,445 Updated Nov 28, 2025

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …

Python 11,760 1,071 Updated Dec 21, 2025

[ICCV 2025] LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning

Python 2,108 82 Updated Dec 12, 2025

Inpaint anything using Segment Anything and inpainting models.

Jupyter Notebook 7,545 651 Updated Feb 29, 2024

🔥🔥🔥 [IEEE TCSVT] Latest Papers, Codes and Datasets on Vid-LLMs.

2,978 134 Updated Dec 20, 2025

Collection of AWESOME vision-language models for vision tasks

3,039 229 Updated Oct 14, 2025

A state-of-the-art-level open visual language model | multimodal pretrained model

Python 6,710 449 Updated May 29, 2024

Sync WeChat Reading (WeRead) highlights to Notion

Python 2,839 6,628 Updated May 23, 2025

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

Python 8,316 896 Updated Dec 18, 2025
Python 45 12 Updated Apr 15, 2023

PyTorch implementation of MAE https://arxiv.org/abs/2111.06377

Python 8,153 1,339 Updated Jul 23, 2024

We use MixedWM38, the mixed-type wafer defect pattern dataset, for wafer defect pattern recognition with visual transformers.

Jupyter Notebook 41 13 Updated Oct 1, 2023

Bag of Visual Features with Hamming Embedding and Reranking

Python 55 17 Updated Jun 20, 2018