Starred repositories
An open source multi-tool for exploring and publishing data
Anomaly detection related books, papers, videos, and toolboxes
Structured data extraction and instruction calling with ML, LLM and Vision LLM
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.
Django application and library for importing and exporting data with admin integration.
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
Concatenate a directory full of files into a single prompt for use with LLMs
news-please - an integrated web crawler and information extractor for news that just works
MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
Using OpenAI's Whisper to automatically generate YouTube subtitles
An API to retrieve and read NFL Game Center JSON data. It can work with real-time data, which can be used for fantasy football.
Doing dirty (but extremely useful) things with equals.
Retrieval Augmented Generation based on LanceDB
Have you ever wanted multiple views to match to the same URL? Now you can.
A modern Python library for writing maintainable web scrapers.
A Python version (almost a port) of ProPublica's TableFu
A tool for creating credentials for accessing S3 buckets
Ultra simple API for geocoding a single string against various web services.
A tiny library for Python text normalisation. Useful for ad-hoc text processing.
An unsurprising Django API framework
A simple Python wrapper for Google’s geocoder API
Script for saving a JSON archive of your tweets.
A Django app to download, extract and load campaign finance and lobbying activity data from the California Secretary of State's CAL-ACCESS database
CLI tool for converting DBF files (dBase, FoxPro etc) to SQLite
A Python client for the ProPublica Congress API
A web-based tool to monitor how your website content is used in Wikipedia