Stars
Self-hosted AI assistant with tool use, multi-agent orchestration, coding copilot and a lightweight Flask + vanilla JS stack.
An introduction to ODEs and their applications in vision and language
The code for "Is ChatGPT Fair for Recommendation? Evaluating Fairness in Large Language Model Recommendation"
Minimalistic large language model 3D-parallelism training
A repository for pretraining a discrete diffusion model (LLaDA), with all components built on the Hugging Face ecosystem.
🔥 Today's Hot List API: an API that aggregates trending data, with support for RSS mode and Vercel deployment | Frontend page: https://github.com/imsyy/DailyHot
GUI for LLaDA Diffusion LLM with Quantization for low end GPU and CPU options.
Kimi K2 is the large language model series developed by the Moonshot AI team
Fast and memory-efficient exact attention
A high-throughput and memory-efficient inference and serving engine for LLMs
A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.
ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation
Reverse Engineering the Abstraction and Reasoning Corpus
Infinity is a high-throughput, low-latency serving engine for text-embeddings, reranking models, clip, clap and colpali
SGLang is a high-performance serving framework for large language models and multimodal models.
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Tools for merging pretrained large language models.
Static memory-efficient Trie-like structures for Python based on marisa-trie C++ library.
The libsais library provides fast linear-time construction of suffix array (SA), generalized suffix array (GSA), longest common prefix (LCP) array, permuted LCP (PLCP) array, Burrows-Wheeler transf…
NumPy and SciPy on Multi-Node Multi-GPU systems
An unofficial implementation of the Infini-gram model proposed by Liu et al. (2024)