fengjiasun

Starred repositories

Public repository for Agent Skills

Python 62,299 6,109 Updated Feb 4, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 121 11 Updated Feb 3, 2026

Advancing Open-source World Models

Python 2,365 178 Updated Feb 2, 2026

A Pragmatic VLA Foundation Model

Python 647 50 Updated Jan 30, 2026

[ICLR 2026] TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching

Jupyter Notebook 830 74 Updated Jan 28, 2026

Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your claude code/codex/gemini agent will be an AI research agent with full horsepowe…

TeX 2,273 189 Updated Feb 3, 2026

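LLM-driven intelligent analyzer for A-share, H-share, and US stocks: multi-source market data + real-time news + Gemini decision dashboard + multi-channel push notifications; zero cost, completely free to use, runs on a schedule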

Python 9,258 9,683 Updated Feb 3, 2026

Open-Source Frontier Voice AI

Python 22,873 2,497 Updated Feb 3, 2026

Code for "Diffusion Model Alignment Using Direct Preference Optimization"

Python 658 46 Updated Nov 10, 2025

An instruct text-to-speech solution based on LLaSA and CosyVoice2 developed by the ASLP lab and collaborators.

Python 207 11 Updated Jan 20, 2026

text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 12,395 1,251 Updated Nov 4, 2025

SoulX-FlashTalk is the first 14B model to achieve sub-second start-up latency (0.87s) while maintaining a real-time throughput of 32 FPS on an 8xH800 node.

Python 458 40 Updated Jan 30, 2026

Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model.

Python 3,481 456 Updated Jan 29, 2026

The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…

Python 3,266 275 Updated Jan 5, 2026

State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!

Jupyter Notebook 2,135 141 Updated Jan 22, 2026

We introduce temporal working memory (TWM), which aims to enhance the temporal modeling capabilities of Multimodal foundation models (MFMs). This plug-and-play module can be easily integrated into …

Python 311 30 Updated Nov 26, 2025

EVA Series: Visual Representation Fantasies from BAAI

Python 2,642 189 Updated Aug 1, 2024

A 5-way embedding model for text, audio, image, video, and 3D point clouds.

Python 12 3 Updated Nov 13, 2025

A dataset of 100M connections between 5 different modalities.

58 5 Updated Nov 14, 2025

Uses machine learning to denoise audio containing speech

Python 49 3 Updated Jun 22, 2024

Code implementation for the paper "Large-scale Pre-training for Grounded Video Caption Generation" (ICCV 2025)

Python 28 1 Updated Jan 18, 2026

AnyTalker: Scaling Multi-person Talking Video Generation with Interactivity Refinement

Python 276 40 Updated Dec 5, 2025

[ICCV 2025] Implementation for Describe Anything: Detailed Localized Image and Video Captioning

Python 1,447 87 Updated Jun 26, 2025

Video Grounding and Captioning

Python 332 73 Updated Oct 12, 2021

[ISMIR 2025] A curated list of vision-to-music generation: methods, datasets, evaluation and challenges.

118 3 Updated Aug 9, 2025

A curated list of Vision (video/image) to Audio Generation

96 4 Updated Nov 22, 2025

SoulX-Podcast is an inference codebase by the Soul AI team for generating high-fidelity podcasts from text.

Python 3,128 403 Updated Dec 11, 2025

Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…

Python 1,750 159 Updated Jan 29, 2026
Python 68 4 Updated Dec 30, 2025
Python 37 Updated Jul 4, 2024