- Greater Bay Area
- https://shesung.github.io
Stars
An open-source AI agent that brings the power of Gemini directly into your terminal.
A Datacenter Scale Distributed Inference Serving Framework
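A 2025 roundup of audio/video development, offering comprehensive learning resources that cover materials, papers, books, projects, and examples from fundamentals to hands-on projects, to help you get started quickly and advance step by step. Continuously updated and maintained!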
Python binding to Poppler-cpp pdf library
A code executor for Dify that is compatible with the official sandbox API calls and dependency installation.
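Custom nodes for high-quality text-to-speech in ComfyUI using the IndexTTS model. Supports Chinese and English text, and can clone voice characteristics from reference audio.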
Reference PyTorch implementation and models for DINOv3
Wan: Open and Advanced Large-Scale Video Generative Models
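🚀 Reverse-engineered API for Jimeng (即梦) 3.0 [specialty: top-tier image generation], with zero-configuration deployment and multi-token support. For testing only; for commercial use, please go to the official open platform.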
A browser automation framework and ecosystem.
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Watermark your artworks to stay away from unauthorized diffusion style mimicry!
Low rank adaptation for Vision Transformer
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
[AAAI2025] Revisiting Tampered Scene Text Detection in the Era of Generative AI
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Wan: Open and Advanced Large-Scale Video Generative Models
This repository contains integer operators on GPUs for PyTorch.
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
Bypass MDM Setup for MacOS, up to MacOS Tahoe 26.1
Official repository of In-Context LoRA for Diffusion Transformers
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.