camscottie

70 stars written in Python

Python tool for converting files and office documents to Markdown.

Python 86,579 5,020 Updated Jan 8, 2026

Get your documents ready for gen AI

Python 52,423 3,581 Updated Feb 6, 2026

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

Python 44,178 16,443 Updated Feb 8, 2026

A modular graph-based Retrieval-Augmented Generation (RAG) system

Python 30,805 3,251 Updated Feb 6, 2026

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

Python 20,656 5,048 Updated Feb 7, 2026

A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. Stored as pure Python. All in a modern, AI-native editor.

Python 18,951 907 Updated Feb 7, 2026

Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.

Python 18,657 2,451 Updated Feb 8, 2026

The interactive graphing library for Python ✨

Python 18,240 2,776 Updated Jan 26, 2026

🦉 Data Versioning and ML Experiments

Python 15,345 1,276 Updated Feb 1, 2026

Pyodide is a Python distribution for the browser and Node.js based on WebAssembly

Python 14,194 993 Updated Feb 6, 2026

A PyTorch-based Speech Toolkit

Python 11,189 1,648 Updated Feb 7, 2026

Always know what to expect from your data.

Python 11,130 1,677 Updated Feb 7, 2026

NumPy & SciPy for GPU

Python 10,773 987 Updated Feb 3, 2026

Refine high-quality datasets and visual AI models

Python 10,351 714 Updated Feb 7, 2026

Declarative visualization library for Python

Python 10,246 835 Updated Feb 6, 2026

Open Source framework for voice and multimodal conversational AI

Python 10,229 1,704 Updated Feb 8, 2026

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 9,685 792 Updated May 27, 2025

Python SQL Parser and Transpiler

Python 8,891 1,056 Updated Feb 6, 2026

A system-level, binary package and environment manager running on all major operating systems and platforms.

Python 7,295 2,096 Updated Feb 6, 2026

Python Socket.IO server and client

Python 4,321 628 Updated Feb 6, 2026

Missing data visualization module for Python.

Python 4,186 530 Updated May 14, 2024

A minimal Python framework for building custom AI inference servers with full control over logic, batching, and scaling.

Python 3,801 269 Updated Feb 3, 2026

WebApps in pure Python. No JavaScript, HTML and CSS needed

Python 3,349 130 Updated Feb 6, 2026

Improved file parsing for LLMs

Python 3,151 140 Updated Nov 13, 2024

Data manipulation and transformation for audio signal processing, powered by PyTorch

Python 2,822 760 Updated Feb 8, 2026

Jobs scraper library for LinkedIn, Indeed, Glassdoor, Google, ZipRecruiter & more

Python 2,724 575 Updated Jan 10, 2026

Analytics, Versioning and ETL for multimodal data: video, audio, PDFs, images

Python 2,721 134 Updated Feb 8, 2026

A Python library for audio data augmentation. Useful for making audio ML models work well in the real world, not just in the lab.

Python 2,230 210 Updated Dec 27, 2025

Open source platform for AI Engineering: OpenTelemetry-native LLM Observability, GPU Monitoring, Guardrails, Evaluations, Prompt Management, Vault, Playground. 🚀💻 Integrates with 50+ LLM Providers,…

Python 2,186 236 Updated Feb 6, 2026

Enforce the output format (JSON Schema, Regex etc) of a language model

Python 1,989 81 Updated Aug 24, 2025