dwillis

Organizations

@unitedstates

Starred repositories

316 results for source starred repositories written in Python

Robust Speech Recognition via Large-Scale Weak Supervision

Python 90,454 11,327 Updated Sep 8, 2025

Interact with your documents using the power of GPT, 100% privately, no data leaks

Python 56,772 7,598 Updated Nov 13, 2024

The world's simplest facial recognition api for Python and the command line

Python 55,700 13,700 Updated Aug 21, 2024

An open-source RAG-based tool for chatting with your documents.

Python 24,600 2,026 Updated Jul 4, 2025

Build Real-Time Knowledge Graphs for AI Agents

Python 19,879 1,871 Updated Nov 6, 2025

OCR, layout analysis, reading order, table recognition in 90+ languages

Python 18,839 1,286 Updated Oct 21, 2025

A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.

Python 16,808 1,181 Updated Nov 2, 2025

Open Source AI Platform - AI Chat with advanced features that works with every LLM

Python 15,678 2,120 Updated Nov 6, 2025

Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.

Python 15,582 1,161 Updated Nov 6, 2025

Always know what to expect from your data.

Python 10,901 1,643 Updated Nov 6, 2025

An open source multi-tool for exploring and publishing data

Python 10,498 784 Updated Nov 5, 2025

Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.

Python 9,461 1,177 Updated Nov 4, 2025

Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.

Python 9,057 821 Updated Jul 20, 2025

Anomaly detection related books, papers, videos, and toolboxes

Python 9,007 1,797 Updated Apr 24, 2025

🏹 Better dates & times for Python

Python 8,954 700 Updated Oct 25, 2025

SQL for Humans™

Python 7,221 574 Updated Jul 9, 2024

File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.

Python 7,215 398 Updated Feb 21, 2025

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Wo…

Python 7,121 511 Updated Nov 6, 2025

A suite of utilities for converting to and working with CSV, the king of tabular file formats.

Python 6,271 674 Updated Nov 6, 2025

[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models

Python 5,651 574 Updated Jan 16, 2025

High-resolution models for human tasks.

Python 5,198 305 Updated Nov 18, 2024

Structured data extraction and instruction calling with ML, LLM and Vision LLM

Python 5,032 504 Updated Nov 6, 2025

Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

Python 4,868 325 Updated Sep 12, 2025

VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social …

Python 4,861 1,051 Updated Mar 16, 2024

Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.

Python 4,827 295 Updated Feb 5, 2025

A Python library for automating interaction with websites.

Python 4,806 387 Updated Oct 8, 2025

Requests + Gevent = <3

Python 4,580 330 Updated Aug 8, 2024

🆔 A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.

Python 4,394 568 Updated Jul 29, 2025

A generic JSON document store with sharing and synchronisation capabilities.

Python 4,387 422 Updated Nov 5, 2025

A utility for mocking out the Python Requests library.

Python 4,301 362 Updated Aug 8, 2025