Stars
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
Open-source and strong foundation image recognition models.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
DeepFashion2 Dataset https://arxiv.org/pdf/1901.07973.pdf
State-of-the-art CLIP/SigLIP embedding models finetuned for the fashion domain. +57% increase in evaluation metrics vs FashionCLIP 2.0.
LAVIS - A One-stop Library for Language-Vision Intelligence
🥥 Coco AI Server - Search, Connect, Collaborate, AI-powered Enterprise Search, all in one space.
🥥 Coco AI App - Search, Connect, Collaborate, Personal AI Search and Assistant, all in one space.
The 500 AI Agents Projects is a curated collection of AI agent use cases across various industries. It showcases practical applications and provides links to open-source projects for implementation…
PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with the layout fully preserved. Supports services such as Google/DeepL/Ollama/OpenAI and provides CLI/GUI/MCP/Docker/Zotero
Expose your FastAPI endpoints as Model Context Protocol (MCP) tools, with Auth!
A private messenger for Windows, macOS, and Linux.
Screenshot, offline OCR, search and translate, search by image, pin image to the screen, screen recorder, omnidirectional scrolling screenshot, screen translation…
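This project aims to share the technical principles behind large models along with hands-on experience (large-model engineering and deploying large-model applications).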
The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data 🔥
RooCodeInc / Roo-Code
Forked from cline/cline. Roo Code gives you a whole dev team of AI agents in your code editor.
High performance Node.js image processing, the fastest module to resize JPEG, PNG, WebP, AVIF and TIFF images. Uses the libvips library.
Beautiful and responsive UI components and templates for React and Vue (soon) with Tailwind CSS.
Flutter widgets and themes implementing the current macOS design language.
An enterprise-class package of Flutter components for mobile applications. (Bruno is a Flutter component library built on a complete design system.)
A generative speech model for daily dialogue.
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]