- Argentina
- http://www.twitter.com/andangc
Stars
WAHA - WhatsApp HTTP API (REST API) that you can configure in a click! 3 engines: WEBJS (browser based), NOWEB (websocket nodejs), GOWS (websocket go)
A Comprehensive Toolkit for High-Quality PDF Content Extraction
A toolkit for SQLite databases, with a focus on application development
Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, an…
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
The open-source CapCut alternative
AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Native multi-threading for bulk processing.
MCP Toolbox for Databases is an open source MCP server for databases.
Harmonize is a modern linter for Swift that allows you to write architectural lint rules as unit tests, helping your team to keep your codebase clean, maintainable, and consistent as it grows, with…
12 Weeks, 24 Lessons, AI for All!
AirPrint Bridge: Enable AirPrint for Unsupported Printers on macOS
Pyramids of map tiles in a single file on static storage
Bare metal to production ready in mins; your own fly server on your VPS.
An open version of ChatGPT you can host anywhere or run locally.
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
CORD: A Consolidated Receipt Dataset for Post-OCR Parsing
A docker-powered PaaS that helps you build and manage the lifecycle of applications
Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website …
Structured data extraction and instruction calling with ML, LLM and Vision LLM
Deploy your agentic workflows to production