121 stars written in Python

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 69,598 13,222 Updated Feb 6, 2026

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 41,550 4,706 Updated Feb 5, 2026

DSPy: The framework for programming—not prompting—language models

Python 32,022 2,606 Updated Feb 5, 2026

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

Python 20,642 5,043 Updated Feb 6, 2026

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Python 17,196 1,373 Updated Oct 6, 2025

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.

Python 16,977 3,718 Updated Jun 2, 2023

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.

Python 15,256 3,564 Updated Feb 5, 2026

An orchestration platform for the development, production, and observation of data assets.

Python 14,902 1,970 Updated Feb 6, 2026

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

Python 14,665 2,256 Updated Dec 1, 2025

A framework for few-shot evaluation of language models.

Python 11,367 3,018 Updated Feb 5, 2026

Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

Python 11,303 882 Updated Jan 13, 2026

Modin: Scale your Pandas workflows by changing a single line of code

Python 10,357 673 Updated Oct 2, 2025

A Lightweight Recommendation System

Python 9,261 716 Updated Oct 13, 2025

🧙 Build, run, and manage data pipelines for integrating and transforming data.

Python 8,641 908 Updated Jan 28, 2026

AI Toolkit for Healthcare Imaging

Python 7,827 1,417 Updated Jan 30, 2026

PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO

Python 7,440 1,027 Updated Jul 3, 2024

The Open Source Feature Store for AI/ML

Python 6,683 1,213 Updated Feb 6, 2026

📚 Parameterize, execute, and analyze notebooks

Python 6,368 445 Updated Jan 5, 2026

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Python 6,175 1,162 Updated May 28, 2023

Voilà turns Jupyter notebooks into standalone web applications

Python 5,892 528 Updated Feb 2, 2026

⚡ TabPFN: Foundation Model for Tabular Data ⚡

Python 5,652 558 Updated Feb 5, 2026

ZenML 🙏: One AI Platform from Pipelines to Agents. https://zenml.io.

Python 5,191 573 Updated Feb 5, 2026

😎 A curated list of awesome MLOps tools

Python 5,010 672 Updated Dec 10, 2025

A collection of scripts to flash Tuya IoT devices to alternative firmwares

Python 5,005 520 Updated Sep 6, 2024

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️

Python 4,882 447 Updated Feb 5, 2026

High-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently.

Python 4,729 667 Updated Feb 5, 2026

Sequence modeling benchmarks and temporal convolutional networks

Python 4,473 904 Updated Mar 28, 2022

A fast inference library for running LLMs locally on modern consumer-class GPUs

Python 4,436 330 Updated Dec 9, 2025

TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.

Python 3,859 941 Updated Jul 10, 2023

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)

Python 3,769 465 Updated Oct 14, 2025