Stars
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models
Post-training with Tinker
A tiny yet powerful LLM inference system tailored for research purposes. vLLM-equivalent performance with only 2k lines of code (2% of vLLM).
Code for reproducing our paper "Are Sparse Autoencoders Useful? A Case Study in Sparse Probing"
This repository contains a replication of the iGSM dataset generation process from the Physics of Language Models paper by Zeyuan Allen-Zhu.
TurtleBench: Evaluating Top Language Models via Real-World Yes/No Puzzles.
Lightweight, standalone C++ inference engine for Google's Gemma models.
DeepSeek LLM: Let there be answers
Pond: CXL-Based Memory Pooling Systems for Cloud Platforms (ASPLOS'23)
DeepSeek Coder: Let the Code Write Itself
Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
Extracting spatial and temporal world models from LLMs
Your API ⇒ Paid MCP. Instantly.
A library that provides an embeddable, persistent key-value store for fast storage.
Implementation of Nougat Neural Optical Understanding for Academic Documents
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
Unofficial PyTorch implementation of the DOM-LM paper.
Project 2 (Building Large Language Models) for Stanford CS324: Understanding and Developing Large Language Models (Winter 2022)
Hackable and optimized Transformers building blocks, supporting a composable construction.
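AISystem mainly refers to AI systems, including full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.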
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)