Stars
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
A library for efficient similarity search and clustering of dense vectors.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Open Source real-time strategy game engine for early Westwood games such as Command & Conquer: Red Alert written in C# using SDL and OpenGL. Runs on Windows, Linux, *BSD and Mac OS X.
A C++ bare metal environment for Raspberry Pi with USB (32 and 64 bit)
🚀 Efficient implementations of state-of-the-art linear attention models
The official repository for ERNIE 4.5 and ERNIEKit – its industrial-grade development toolkit based on PaddlePaddle.
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
Accessible large language models via k-bit quantization for PyTorch.
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
A Zotero plugin for syncing items and notes into Notion
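Quantitative stock trading for the Tonghuashun (同花顺) client, miniqmt, and Xueqiu (雪球); supports following joinquant/ricequant simulated trading and live Xueqiu portfolios.
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch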
Shared repository for open-sourced projects from the Google AI Language team.
📄 Awesome CV is a LaTeX template for your outstanding job application
A Data Streaming Library for Efficient Neural Network Training
Efficient Training (including pre-training and fine-tuning) for Big Models
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
Example models using DeepSpeed
MiniCPM4 & MiniCPM4.1: Ultra-Efficient LLMs on End Devices, achieving 3+ generation speedup on reasoning tasks
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
"Bootstrapping Relationship Extractors with Distributional Semantics" (Batista et al., 2015) in EMNLP'15 - Python implementation
Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings
Scalable training for dense retrieval models.
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
A sample integration of AWS services with SLURM
Running large language models on a single GPU for throughput-oriented scenarios.