Beijing Jiaotong University
- Beijing
- https://orcid.org/0000-0003-4635-7032
Stars
Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API.
3D world built by Claude Fable 5 to test its capabilities using three.js
Sleek, mobile-friendly web UI for NVIDIA LocateAnything-3B — open-vocabulary object detection & grounding on your own GPU, via one docker compose up.
GPT-image-2 and seedance2 workflows and prompt templates to produce high-quality AI videos.
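Chinese Patent skill (中国专利.skill): from project documents to a deliverable technical disclosure document, covering patent-point mining, online novelty search against the China National Intellectual Property Administration (CNIPA), desensitized drafting, and a closed self-check loop.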
AI generates a real, editable PowerPoint from any document — native shapes & animations, speaker notes voiced as audio narration, and the option to follow your own .pptx template, not slide images …
Official baseline implementation for the UAVM @ ACM MM 2026 Workshop.
An open-source implementation for fine-tuning the Qwen-VL series by Alibaba Cloud.
An AI SKILL that provides design intelligence for building professional UI/UX across multiple platforms
A free, open source, privacy-first voice input app for macOS.
The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.
Academic Research Skills for Claude Code: research → write → review → revise → finalize
Over 450 terminal color schemes/themes for iTerm/iTerm2. Includes ports to Terminal, Konsole, PuTTY, Xresources, XRDB, Remmina, Termite, XFCE, Tilda, FreeBSD VT, Terminator, Kitty, MobaXterm, LXTer…
Seoul World Model: Grounding World Simulation Models in a Real-World Metropolis
Train your AI self, amplify you, bridge the world
Official Implementation of VGG-Flow (NeurIPS 2025; https://arxiv.org/abs/2512.05116)
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
WebRTC/RTSP/RTMP/HTTP/HLS/HTTP-FLV/WebSocket-FLV/HTTP-TS/HTTP-fMP4/WebSocket-TS/WebSocket-fMP4/GB28181/SRT/STUN/TURN server and client framework based on C++11
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
③[ICML2024] [IQA, IAA, VQA] All-in-one Foundation Model for visual scoring. Can be efficiently fine-tuned on downstream datasets.
A highly extensible Markdown editor. Version control, AI Copilot, mind map, document encryption, code snippet running, integrated terminal, chart embedding, HTML applets, Reveal.js, plug-in, and m…
The official implementation of InfiniteVGGT
Reference PyTorch implementation and models for DINOv3