Stars
Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 17+ clouds, or on-prem).
Tensors and Dynamic neural networks in Python with strong GPU acceleration
SGLang is a fast serving framework for large language models and vision language models.
An Open Source Machine Learning Framework for Everyone
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Protocol Buffers - Google's data interchange format
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
A high-throughput and memory-efficient inference and serving engine for LLMs
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Apache Spark - A unified analytics engine for large-scale data processing
PyTorch domain library for recommendation systems
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
Open-source search and retrieval database for AI applications.
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs.
The official home of the Presto distributed SQL query engine for big data
Apache Doris is an easy-to-use, high performance and unified analytics database.
Apache Superset is a Data Visualization and Data Exploration Platform
Google Cloud Client Library for Python
Streaming data platform. Real-time stream processing, low-latency serving, and Iceberg table management.
Apache Beam is a unified programming model for Batch and Streaming data processing.