Stars
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
An Open Source Machine Learning Framework for Everyone
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
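A collection of industry practice articles on search, recommendation, advertising, user growth, and more (sources: Zhihu, Datafuntalk, technical WeChat official accounts)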
Distributed transactional key-value database, originally created to complement TiDB
A flexible, high-performance serving system for machine learning models
Making large AI models cheaper, faster and more accessible
A library for efficient similarity search and clustering of dense vectors.
Zotero is a free, easy-to-use tool to help you collect, organize, annotate, cite, and share your research sources.
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
High performance server-side application framework
A common bricks library for building scalable and portable distributed machine learning.
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
💫 Industrial-strength Natural Language Processing (NLP) in Python
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)
Models and examples built with TensorFlow
Unsupervised text tokenizer for Neural Network-based text generation.
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
HierarchicalKV is a part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. The key capability of HierarchicalKV is to store key-value feature-embeddings on h…
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
TensorFlow's Visualization Toolkit