Stars
Open-source repository for the Pokee Deep Research model.
A generative world for general-purpose robotics & embodied AI learning.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Implementation of the Recurrent Memory Transformer (NeurIPS 2022 paper) in PyTorch
An easy-to-use, scalable, and high-performance RLHF framework based on Ray (PPO, GRPO, REINFORCE++, vLLM, dynamic sampling, and async agentic RL)
A PyTorch native platform for training generative AI models
Recipes for training reward models for RLHF.
🚀 Efficient implementations of state-of-the-art linear attention models
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
Freeing data processing from scripting madness by providing a set of platform-agnostic, customizable pipeline processing blocks.
Scalable toolkit for efficient model alignment
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
JVector: the most advanced embedded vector search engine
Write scalable load tests in plain Python 🚗💨
A Kubernetes web UI that is fully-featured, user-friendly and extensible
LDB: A Large Language Model Debugger via Verifying Runtime Execution Step by Step (ACL'24)
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
A curated collection of ChatGPT prompts to help you use ChatGPT and other LLM tools more effectively.
[EMNLP'23, ACL'24] Compresses prompts and the KV cache to speed up LLM inference and enhance the model's perception of key information, achieving up to 20x compression with minimal performance loss.
💥💻💥 A data-parallel functional programming language
An LLM agent that conducts deep research (local and web) on any given topic and generates a long report with citations.
The Learning Interpretability Tool: interactively analyze ML models to understand their behavior in an extensible, framework-agnostic interface.
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs.
Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.