Stars
A feature-rich command-line audio/video downloader
A natural language interface for computers
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN
The lean application framework for Python. Build sophisticated user interfaces with a simple Python API. Run your apps in the terminal and a web browser.
An open-source RAG-based tool for chatting with your documents.
Build Real-Time Knowledge Graphs for AI Agents
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by the OpenAI Solution team.
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
OCR, layout analysis, reading order, table recognition in 90+ languages
LLM agents built for control. Designed for real-world use. Deployed in minutes.
Simple, unified interface to multiple Generative AI providers
Video-based AI memory library. Store millions of text chunks in MP4 files with lightning-fast semantic search. No database needed.
Code for the manim-generated scenes used in 3blue1brown videos
🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Download, model, analyze, and visualize street networks and other geospatial features from OpenStreetMap.
Represent, send, store and search multimodal data
Fast and accurate automatic speech recognition (ASR) for edge devices
Official Claude Code compound engineering plugin
Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.
Concatenate a directory full of files into a single prompt for use with LLMs
OCRFlux is a lightweight yet powerful multimodal toolkit that significantly advances PDF-to-Markdown conversion, excelling in complex layout handling, complicated table parsing and cross-page conte…
Multiversal tree writing interface for human-AI collaboration
What did I do on February 14th 2007? Visualize your (digital) life in Org-mode
A novel created autonomously by a team of 10 AI agents
Utilities for decoding deep representations (like sentence embeddings) back to text