lhoestq
🤗

Organizations

@huggingface

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 156,156 31,973 Updated Feb 6, 2026

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 66,963 8,140 Updated Feb 4, 2026

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer.

TypeScript 40,331 2,510 Updated Feb 6, 2026

A library for efficient similarity search and clustering of dense vectors.

C++ 39,003 4,218 Updated Feb 5, 2026

DuckDB is an analytical in-process SQL database management system

C++ 35,912 2,908 Updated Feb 6, 2026

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.

Python 32,703 6,741 Updated Feb 6, 2026

Fast, secure, efficient backup program

Go 32,095 1,708 Updated Feb 1, 2026

🤗 smolagents: a barebones library for agents that think in code.

Python 25,295 2,279 Updated Jan 23, 2026

🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools

Python 21,174 3,090 Updated Feb 4, 2026

Train transformer language models with reinforcement learning.

Python 17,295 2,472 Updated Feb 6, 2026

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics

C++ 16,483 4,004 Updated Feb 6, 2026

State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!

JavaScript 15,319 1,082 Updated Feb 5, 2026

Parallel computing with task scheduling

Python 13,730 1,843 Updated Feb 5, 2026

A framework for few-shot evaluation of language models.

Python 11,369 3,018 Updated Feb 5, 2026

The official repository of Mozilla's Firefox web browser.

JavaScript 11,182 848 Updated Feb 6, 2026

Large Language Model Text Generation Inference

Python 10,751 1,253 Updated Jan 8, 2026

The open source codebase powering HuggingChat

TypeScript 10,490 1,595 Updated Feb 4, 2026

💥 Fast State-of-the-Art Tokenizers optimized for Research and Production

Rust 10,454 1,027 Updated Feb 5, 2026

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Python 9,485 1,280 Updated Feb 4, 2026

Apache DataFusion SQL Query Engine

Rust 8,361 1,935 Updated Feb 6, 2026

A course on aligning smol models.

Jupyter Notebook 6,579 2,302 Updated Nov 10, 2025

A scikit-learn compatible neural network library that wraps PyTorch

Jupyter Notebook 6,149 406 Updated Dec 22, 2025

Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, a…

Rust 6,017 543 Updated Feb 6, 2026

Probabilistic time series modeling in Python

Python 5,131 805 Updated Aug 14, 2025

Source code for pbrt, the renderer described in the third edition of "Physically Based Rendering: From Theory To Implementation", by Matt Pharr, Wenzel Jakob, and Greg Humphreys.

C++ 5,052 1,211 Updated Sep 3, 2023

A lightweight data processing framework built on DuckDB and 3FS.

Python 4,920 443 Updated Mar 5, 2025

Apache OpenDAL: One Layer, All Storage.

Rust 4,879 710 Updated Feb 5, 2026

Jupyter notebooks for the Natural Language Processing with Transformers book

Jupyter Notebook 4,706 1,467 Updated Aug 21, 2024

The simplest, fastest repository for training/finetuning small-sized VLMs.

Python 4,626 461 Updated Oct 27, 2025

Embedding Atlas is a tool that provides interactive visualizations for large embeddings. It allows you to visualize, cross-filter, and search embeddings and metadata.

TypeScript 4,580 259 Updated Feb 3, 2026