Stars
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Harbor is a framework for running agent evaluations and creating and using RL environments.
The agent benchmark that scores the full stack — harness, config, and model — not just the LLM. Trace-based scoring, reliability metrics, configuration diagnostics.
Run agents like Hermes and OpenClaw more securely inside NVIDIA OpenShell with managed inference
Evolve your language agent with Agentic Context Engineering (ACE)
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
Kubernetes controller to validate AI models
Collection of demos for building Llama Stack based apps on OpenShift
Efficient and Scalable Estimation of Tool Representations in Vector Space
Get your documents ready for gen AI
Open source project for data preparation for GenAI applications
Run PyTorch LLMs locally on servers, desktop and mobile
🚀 Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP.
Improve the ROSA customer experience (and customer retention) by leveraging foundation models to do “gpt-chat”-style search of Red Hat customer documentation assets.
📋 A list of open LLMs available for commercial use.
The first open Federated Learning framework implemented in C++ and Python.
Train transformer language models with reinforcement learning.
Label Studio is a multi-type data labeling and annotation tool with a standardized output format
A repo for distributed training of language models with Reinforcement Learning from Human Feedback (RLHF)
Example models using DeepSpeed
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
Open source platform for the privacy-preserving machine learning lifecycle