Stars

ml/data

50 repositories

A curated list of references for MLOps

13,483 1,996 Updated Nov 21, 2024

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution

Python 6,421 724 Updated Dec 23, 2025

lakeFS - Data version control for your data lake | Git for data

Go 5,052 417 Updated Dec 23, 2025

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs

Scala 8,480 1,975 Updated Dec 20, 2025

Data-Centric Pipelines and Data Versioning

Go 6,276 570 Updated Feb 3, 2025

Examples of using Neptune to keep track of your experiments (maintenance only).

Jupyter Notebook 26 13 Updated Mar 30, 2022

A low-latency prediction-serving system

C++ 1,421 279 Updated Apr 26, 2021

Lingvo

Python 2,856 453 Updated Dec 5, 2025

🦉 Data Versioning and ML Experiments

Python 15,219 1,261 Updated Dec 23, 2025

A Kubernetes-based framework for hassle-free handling of datasets

Go 532 75 Updated Dec 16, 2025

SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, xDC replica…

Go 29,039 2,626 Updated Dec 23, 2025

JuiceFS is a distributed POSIX file system built on top of Redis and S3.

Go 12,571 1,129 Updated Dec 23, 2025

C 26 14 Updated May 19, 2021

For recording and retrieving metadata associated with ML developer and data scientist workflows.

C++ 667 171 Updated Apr 3, 2025

PyTorch extensions for high performance and large scale training.

Python 3,391 294 Updated Apr 26, 2025

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

Python 14,318 1,843 Updated Jul 3, 2024

Alluxio, data orchestration for analytics and machine learning in the cloud

Java 7,132 2,956 Updated Apr 29, 2025

Spark RAPIDS plugin - accelerate Apache Spark with GPUs

Scala 953 267 Updated Dec 24, 2025

The open source developer platform to build AI agents and models with confidence. Enhance your AI applications with end-to-end tracking, observability, and evaluations, all in one integrated platform.

Python 23,424 5,093 Updated Dec 24, 2025

Distributed AI Model Training and Fine-Tuning on Kubernetes

Go 1,990 854 Updated Dec 23, 2025

Resource scheduling and cluster management for AI

JavaScript 2,684 550 Updated Jun 6, 2024

NVIDIA's launch, startup, and logging scripts used by our MLPerf Training and HPC submissions

Python 35 18 Updated Sep 12, 2025

A latent text-to-image diffusion model

Jupyter Notebook 72,050 10,549 Updated Jun 18, 2024

The Python Risk Identification Tool for generative AI (PyRIT) is an open source framework built to empower security professionals and engineers to proactively identify risks in generative AI systems.

Python 3,231 632 Updated Dec 23, 2025

Efficient vision foundation models for high-resolution generation and perception.

Python 3,184 229 Updated Sep 5, 2025

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 3,397 285 Updated Jul 17, 2025

VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

Python 3,702 310 Updated Nov 28, 2025

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 24,208 2,688 Updated Aug 12, 2024

Python 2,090 317 Updated Apr 19, 2024

Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models

Python 2,843 378 Updated Jan 7, 2025