The AI PPT category killer, the most powerful PPT Skill ever!!! Uses GPT to generate lavish image-format PPT slides, then converts them into fully editable PPTX files.

Python 1,128 105 Updated Jun 7, 2026

Wan: Open and Advanced Large-Scale Video Generative Models

Python 16,332 2,030 Updated Mar 17, 2026
Python 11,597 790 Updated Feb 9, 2026

A toolkit for speaker diarization.

Jupyter Notebook 481 56 Updated May 29, 2026

Official Implementation of LongLive-RAG: A general retrieval-augmented framework for long video generation.

Python 72 Updated Jun 4, 2026

JoyAI-Echo: Pushing the Frontier of Long Audio-Visual Generation

Python 1,644 149 Updated Jun 16, 2026

Official page of ImmerIris: A Large-Scale Dataset and Benchmark for Off-Axis and Unconstrained Iris Recognition in Immersive Applications.

HTML 30 1 Updated Jun 9, 2026

"CLI-Anything: Making ALL Software Agent-Native" -- CLI-Hub: https://clianything.cc/

Python 43,631 4,083 Updated Jun 14, 2026

Implementation of Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players

Python 635 9 Updated Jun 17, 2026

A Minimal and Elegant Framework & Tutorial for Real-Time Interactive World Models

Python 625 12 Updated Jun 15, 2026

Codex skill for converting slide images, PDFs, and image-based PPTX files into editable PowerPoint decks.

Python 774 39 Updated Jun 22, 2026

Multimodal RL training framework for diffusion & omni models

Python 402 58 Updated Jun 22, 2026

Official Repo of "D-OPSD: On-Policy Self-Distillation for Continuously Tuning Step-Distilled Diffusion Models"

Python 256 7 Updated May 22, 2026

A unified framework for easy reinforcement learning in Flow-Matching models

Python 584 47 Updated Jun 18, 2026

AI generates a real, editable PowerPoint from any document — native shapes & animations, speaker notes voiced as audio narration, and the option to follow your own .pptx template, not slide images …

Python 30,351 2,652 Updated Jun 22, 2026

Interactive World Model papers organized by core research challenges.

Python 241 8 Updated Jun 19, 2026

[ICML 2026] World-R1: Reinforcing 3D Constraints for Text-to-Video Generation

Python 392 15 Updated Jun 3, 2026

A Curated List of Awesome Video World Models with AR Diffusion: Covering Algorithms, Applications, and Infrastructure, Aimed at Serving as a Comprehensive Resource for Researchers, Practitioners, a…

TeX 608 17 Updated Jun 4, 2026

GPT-Image-2 API and Prompts

Python 16,907 1,714 Updated Jun 22, 2026

Official Implementation of MultiWorld: Scalable Multi-Agent Multi-View Video World Models

Python 231 12 Updated May 12, 2026

[ICLR 26 Oral] Stable Video Infinity: Infinite-Length Video Generation with Error Recycling

Python 2,501 219 Updated Jun 3, 2026

Helios: Real Real-Time Long Video Generation Model

Python 1,923 149 Updated Jun 10, 2026

ARIS ⚔️ (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in — works…

Python 12,486 1,133 Updated Jun 21, 2026

Open-Source Frontier Voice AI

Python 49,545 5,530 Updated May 6, 2026
Python 15 Updated Mar 30, 2026

[ICML 2026] DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation

Python 268 17 Updated May 22, 2026

ReactMotion: Generating Reactive Listener Motions from Speaker Utterance

Python 105 2 Updated Mar 30, 2026

SOTA Open Source TTS

Python 30,899 2,636 Updated Jun 9, 2026

The official code of "Beyond VLM-Based Rewards: Diffusion-Native Latent Reward Modeling"

Python 60 3 Updated Feb 26, 2026