- Zhejiang University
- Beijing
Starred repositories
⚡ A CLI tool for code structural search, lint and rewriting. Written in Rust
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
[SCIS] MULTI-Benchmark: Multimodal Understanding Leaderboard with Text and Images
[AAAI 2025] Official implementation of "OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on"
Industry leading face manipulation platform
Large World Model -- Modeling Text and Video with Millions Context
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
Official implementation of AnimateDiff.
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
The first "code-first" agent framework for seamlessly planning and executing data analytics tasks.
Question and Answer based on Anything.
Code and dataset for photorealistic Codec Avatars driven from audio
Outfit Anyone: Ultra-high quality virtual try-on for Any Clothing and Any Person
🤯 LobeHub - an open-source, modern design AI Agent Workspace. Supports multiple AI providers, Knowledge Base (file upload / RAG), one-click install MCP Marketplace and Artifacts / Thinking. One-cl…
Free ChatGPT & DeepSeek API Key. Free access to the DeepSeek API and GPT4 API, with support for gpt | deepseek | claude | gemini | grok and other top-ranked, commonly used large models.
The Ultimate Guide to Making Money with AI Side Hustles: Learn how to leverage AI for some cool side gigs and rake in some extra cash. Check out the English versi…
AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI
Can large language models provide useful feedback on research papers? A large-scale empirical analysis.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
LAVIS - A One-stop Library for Language-Vision Intelligence
Code and models for ICML 2024 paper, NExT-GPT: Any-to-Any Multimodal Large Language Model
An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents