    Repositories list

    • Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
      Python
      7.8k200 Updated Dec 19, 2025
    • Kotlin
      11013 Updated Dec 18, 2025
    • SkyRL

      Public
      SkyRL: A Modular Full-stack RL Library for LLMs
      Python
      203101 Updated Dec 8, 2025
    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      12k000 Updated Dec 1, 2025
    • Gelato

      Public
      🍨 Gelato — From Data Curation to Reinforcement Learning: Building a Strong Grounding Model for Computer-Use Agents
      Python
      03200 Updated Nov 12, 2025
    • open_clip

      Public
      An open source implementation of CLIP.
      Python
      1.2k13k2730 Updated Nov 4, 2025
    • dclm

      Public
      DataComp for Language Models
      HTML
      1291.4k173 Updated Sep 9, 2025
    • evalchemy

      Public
      Automatic evals for LLMs
      HTML
      725692112 Updated Jun 27, 2025
    • HTML
      2600 Updated Jun 15, 2025
    • open_lm

      Public
      A repository for research on medium sized language models.
      Python
      755223535 Updated Jun 6, 2025
    • 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
      Python
      31k000 Updated May 18, 2025
    • datacomp

      Public
      DataComp: In search of the next generation of multimodal datasets
      Python
      63759273 Updated Apr 28, 2025
    • rtfm

      Public
      Research on Tabular Foundation Models
      Python
      1467120 Updated Dec 13, 2024
    • MixEval

      Public
      The official evaluation suite and dynamic data release for MixEval.
      Python
      41000 Updated Sep 20, 2024
    • An open-source framework for training large multimodal models.
      Python
      3204.1k456 Updated Aug 31, 2024
    • tabliblib

      Public
      A Python library for processing and filtering TabLib
      Python
      31300 Updated Aug 24, 2024
    • MINT-1T

      Public
      🍃 MINT-1T: A one trillion token multimodal interleaved dataset.
      1882710 Updated Jul 31, 2024
    • Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
      Python
      4650180 Updated Jul 15, 2024
    • A benchmark for distribution shift in tabular data
      Python
      1657121 Updated Jun 6, 2024
    • scaling

      Public
      Language models scale reliably with over-training and on downstream tasks
      Jupyter Notebook
      610020 Updated Apr 2, 2024
    • Python
      72700 Updated Mar 21, 2024
    • Editing Models with Task Arithmetic
      Python
      4852290 Updated Jan 11, 2024
    • Python
      25000 Updated Oct 29, 2023
    • patching

      Public
      Patching open-vocabulary models by interpolating weights
      Python
      89110 Updated Sep 28, 2023
    • Python
      2200 Updated Aug 22, 2023
    • LLM training code for MosaicML foundation models
      Python
      578100 Updated Aug 10, 2023
    • CSS
      0300 Updated Jun 2, 2023
    • Simple large-scale training of stable diffusion with multi-node support.
      Python
      913320 Updated May 8, 2023
    • Efficiently process webdatasets
      Python
      0410 Updated Apr 5, 2023
    • Release of ImageNet-Captions
      55100 Updated Jan 20, 2023