UCAS
- Beijing, China
Stars
Implementation of the paper Fast Inference from Transformers via Speculative Decoding, Leviathan et al. 2023.
Open-source code for the ICLR 2026 paper "Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions"
Letta is the platform for building stateful agents: AI with advanced memory that can learn and self-improve over time.
The paper list of "Memory in the Age of AI Agents: A Survey"
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
This repo summarizes papers for efficient PPML across protocol, model, and system levels.
This is the source code for HufuNet. Our paper has been accepted by IEEE TDSC.
Official code for the ACL 2025 paper "PrivacyRestore: Privacy-Preserving Inference in Large Language Models via Privacy Removal and Restoration"