Stars
On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
The awesome collection of OpenClaw Skills. Formerly known as Moltbot, originally Clawdbot.
Qwen2.5-Omni is an end-to-end multimodal model by the Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, and video, and of performing real-time speech generation.
A fully featured mobile and web client for Codex and Claude Code, with realtime voice and encryption
Transform your favorite cities into beautiful, minimalist designs. MapToPoster lets you create and export visually striking map posters with code.
Chrome DevTools for coding agents
Open-source visual route-tracing tool for macOS, Windows, and Linux.
Gemini Nano Banana / Pro watermark maintenance tool
Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark"
Axel tries to accelerate the HTTP/FTP download process by using multiple connections for one file. It can use multiple mirrors for a download. Wilmer van der Gaast is the upstream author of Axel. Y …
Toolkit of BDD100K Dataset for Heterogeneous Multitask Learning - CVPR 2020 Oral Paper
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
🏗 Build container images for your Java applications.
An elegant and deeply customizable lyrics visualizer & versatile music player, built with WinUI3/Win2D
Personal CRM. Remember everything about your friends, family and business relationships.
CV/resume generator for academics and engineers, YAML to PDF
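A lightweight, highly customizable hardware performance monitor for the Windows desktop and taskbar. Monitors CPU, GPU, memory, disk, and network speed, with an FPS counter, plugin extensions, and memory cleanup.
An AI-native PPT generator based on nano banana pro🍌, moving toward true "Vibe PPT". Supports uploading any template image; uploading arbitrary materials with intelligent parsing; automatic PPT generation from a single sentence, an outline, or page descriptions; verbal edits to a specified region; and one-click export to editable PPT.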
[2025] Efficient Vision Language Models: A Survey
VS Code comment translation extension. It does not interfere with normal code and makes reading source code faster.
Visualizer for neural network, deep learning and machine learning models
Run Stable Diffusion on Android Devices with Snapdragon NPU acceleration. Also supports CPU/GPU inference.
Automate your mobile devices with natural language commands - an LLM-agnostic mobile agent 🤖
Source code for the paper "Empowering LLM to use Smartphone for Intelligent Task Automation"
Open-AutoGLM hybrid solution - run AI automation on your phone, no computer required
An Open Phone Agent Model & Framework. Unlocking the AI Phone for Everyone
Context-aware AI assistant for your desktop. Ready to respond intelligently, seamlessly integrating multiple LLMs and MCP tools.