    Repositories list

    • vllm-rbln

      Public
      vLLM plugin for RBLN NPU
      Python
      Apache License 2.0
      11000 Updated May 20, 2026
    • The agent that grows with you
      Python
      MIT License
      26k001 Updated May 18, 2026
    • modular

      Public
      The Modular Platform (includes MAX & Mojo)
      Mojo
      Other
      2.8k0018 Updated May 7, 2026
    • Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
      Python
      Apache License 2.0
      520000 Updated May 3, 2026
    • One framework to evaluate any VLA model on any robot simulation benchmark.
      Python
      Apache License 2.0
      27001 Updated Apr 20, 2026
    • Jacobi Forcing: Fast and Accurate Diffusion-style Decoding
      Python
      Apache License 2.0
      11000 Updated Apr 15, 2026
    • ⚡ A seamless integration of HuggingFace Transformers & Diffusers with RBLN SDK for efficient inference on RBLN NPUs.
      Python
      Apache License 2.0
      5000 Updated Apr 13, 2026
    • Python
      0000 Updated Apr 7, 2026
    • Yetter Python Client
      Python
      0000 Updated Mar 27, 2026
    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      17k000 Updated Feb 10, 2026
    • Python
      0000 Updated Jan 22, 2026
    • Jupyter Notebook
      1600 Updated Jan 18, 2026
    • GraLoRA

      Public
      Jupyter Notebook
      23400 Updated Nov 18, 2025
    • owlite

      Public
      OwLite is a low-code AI model compression toolkit for AI models.
      Python
      GNU Affero General Public License v3.0
      45400 Updated Nov 14, 2025
    • OwLite Examples repository offers illustrative example codes to help users seamlessly compress PyTorch deep learning models and transform them into TensorRT eng…
      Python
      1911 Updated Nov 14, 2025
    • Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
      Python
      Apache License 2.0
      520000 Updated Nov 12, 2025
    • 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference an…
      Python
      Apache License 2.0
      33k000 Updated Nov 4, 2025
    • vllm-fork

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      17k000 Updated Nov 4, 2025
    • Machine Learning Engineering Open Book
      Python
      Creative Commons Attribution Share Alike 4.0 International
      1.1k100 Updated Sep 1, 2025
    • SGLang is a fast serving framework for large language models and vision language models.
      Python
      Apache License 2.0
      6k000 Updated Aug 28, 2025
    • Ditto is an open-source framework that enables direct conversion of HuggingFace PreTrainedModels into TensorRT-LLM engines.
      Python
      Apache License 2.0
      35700 Updated Jul 16, 2025
    • A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      17k000 Updated Jul 14, 2025
    • LMCache

      Public
      Redis for LLMs
      Python
      Apache License 2.0
      1.2k000 Updated Jul 10, 2025
    • 0000 Updated Jul 9, 2025
    • TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optim…
      C++
      Apache License 2.0
      2.4k001 Updated Jun 26, 2025
    • fal-js

      Public
      The JavaScript client and utilities to fal-serverless with built-in TypeScript definitions
      TypeScript
      MIT License
      43000 Updated May 30, 2025
    • gradio

      Public
      Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
      Python
      Apache License 2.0
      3.5k000 Updated Jan 13, 2025
    • Python
      Apache License 2.0
      48000 Updated Nov 22, 2024
    • Intel Neural Compressor
      Python
      Apache License 2.0
      0000 Updated Oct 22, 2024
    • Isolated DinD (Docker in Docker) container for developing and deploying Docker containers using NVIDIA GPUs and the NVIDIA container toolkit. Useful for deployi…
      Dockerfile
      Mozilla Public License 2.0
      19000 Updated Aug 27, 2024