Stars
Python tool for converting files and office documents to Markdown.
Models and examples built with TensorFlow
Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini and opensource models.
Real-time face swap and one-click video deepfake with only a single image
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.
Universal memory layer for AI Agents; Announcing OpenMemory MCP - local and secure memory management.
Federated query engine for AI - The only MCP Server you'll ever need
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
Multi-agent framework, runtime and control plane. Built for speed, privacy, and scale.
We have made you a wrapper you can't refuse
Industry leading face manipulation platform
Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow
Fully Local Manus AI. No APIs, No $200 monthly bills. Enjoy an autonomous agent that thinks, browses the web, and codes for the sole cost of electricity. Official updates only via twitter @Martin9…
SearXNG is a free internet metasearch engine which aggregates results from various search services and databases. Users are neither tracked nor profiled.
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
LLM agents built for control. Designed for real-world use. Deployed in minutes.
Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).
Translate the video from one language to another and add dubbing.
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Semantic segmentation models with 500+ pretrained convolutional and transformer-based backbones.
"RAG-Anything: All-in-One RAG Framework"
EmotiVoice: a Multi-Voice and Prompt-Controlled TTS Engine
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
"AutoAgent: Fully-Automated and Zero-Code LLM Agent Framework"