Stars
Unlock your displays on your Mac! Flexible HiDPI scaling, XDR/HDR extra brightness, virtual screens, DDC control, extra dimming, PIP/streaming, EDID override and lots more!
🐶 Kubernetes CLI To Manage Your Clusters In Style!
Faster way to switch between clusters and namespaces in kubectl
Chris Titus Tech's Windows Utility - Install Programs, Tweaks, Fixes, and Updates
A quick test to get DeepSeek-OCR running on a Mac with either images or PDFs
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.
Structured data extraction and instruction calling with ML, LLM and Vision LLM
[CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
LLM-Driven Extraction of Unstructured Data — Built for API Deployments & ETL Pipeline Workflows
📚 Process PDFs, Word documents and more with spaCy
A collection of code snippets from the publication To Data & Beyond on Substack: https://youssefh.substack.com/
Convert PDF to markdown + JSON quickly with high accuracy
Toolkit for linearizing PDFs for LLM datasets/training
Jsonnet library for generating Grafana dashboards.
Extract and convert data from any document, image, PDF, Word doc, PPT, or URL into multiple formats (Markdown, JSON, CSV, HTML) with intelligent structured data extraction and advanced OCR.
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
MCP servers for the Data Center versions of Atlassian products (Bitbucket, Confluence, Jira)
Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.
A Comprehensive Toolkit for High-Quality PDF Content Extraction
Transforms PDFs, documents, and images into enriched structured data
An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)
Tesseract Open Source OCR Engine (main repository)
AI Powered Knowledge Graph Generator