Stars
Write HTML. Render video. Built for agents.
[Awesome-Spatial-VLMs] This repository is the official, community-maintained resource for the survey paper: Spatial Intelligence in Vision-Language Models: A Comprehensive Survey.
AI generates a real, editable PowerPoint from any document — native shapes & animations, speaker notes voiced as audio narration, and the option to follow your own .pptx template, not slide images …
AI-agent Skill for generating polished HTML slide decks: editorial magazine and Swiss layouts, image prompts, social covers, and a WebGL/low-power presentation runtime.
Huashu Design · HTML-native design skill for Claude Code · High-fidelity prototypes / slides / animations + 20 design philosophies + 5-dimension review + MP4 export · Agent-agnostic
Open-source AI video workspace powered by AI Agents, Nano Banana 2 & Veo 3.1 / Grok / Seedance / OpenAI: novel → character/scene/prop design → script → storyboard → video, with characters and scenes kept consistent across shots
Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
A SOTA Industrial-Grade All-in-One ASR system with ASR, VAD, LID, and Punc modules. FireRedASR2 supports Chinese (Mandarin, 20+ dialects/accents), English, code-switching, and both speech and singi…
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
ViCaS: A Dataset for Combining Holistic and Pixel-level Video Understanding using Captions with Grounded Segmentation (CVPR'25)
Benchmarking Knowledge Transfer in Lifelong Robot Learning
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
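A Simple and Universal Swarm Intelligence Engine, Predicting Anything.
A fund analysis tool focused on the Chinese fund market, offering real-time valuation estimates, backtesting of regular fixed-amount investment plans, fund screening, and fund comparison, and using large language models to analyze market conditions.
🎬 Huobao Drama (火宝短剧) - An AI-Powered End-to-End Short Drama Generator "One Sentence to Complete Drama: Fully Automated from Script to Final Video"
AetherViz Master - An interactive educational visualization architect that turns any teaching topic into an immersive, interactive 3D web page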
The API to search, scrape, and interact with the web at scale. 🔥
An Agentic Framework for Reflective PowerPoint Generation
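A visual no-code/code-free web crawler/spider. 易采集: visual software for browser automation testing, data collection, and web crawling that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.
微舆: a multi-agent public opinion analysis assistant that anyone can use. It breaks through information cocoons, restores the true picture of public opinion, predicts future trends, and supports decision-making. Implemented from scratch, with no dependency on any framework.
Automate browser-based workflows with AI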
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
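An AI-native slides generator based on nano banana pro🍌, moving toward "Vibe PPT". Supports uploading any template image, uploading any source material with intelligent parsing, generating a PPT automatically from a single sentence, an outline, or page descriptions, editing specified regions by spoken instruction, and one-click export to an editable PPT.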
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
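Data collection and download tool for TikTok posts / likes / mixes / livestreams / videos / image sets / music, and for Douyin posts / likes / favorites / favorites folders / videos / image sets / live photos / livestreams / music / collections / comments / accounts / search / trending lists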
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…