Code for "LLM-Aligned Geographic Item Tokenization for Local-Life Recommendation".

Python 17 2 Updated Nov 18, 2025

Algorithm powering the For You feed on X

Rust 16,236 2,810 Updated Jan 20, 2026

Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models

Python 4,220 308 Updated Jan 14, 2026

[PyTorch] Code for "FORGE: Forming Semantic Identifiers for Generative Retrieval in Industrial Datasets"

Python 199 20 Updated Feb 9, 2026

Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, sparsely activated memory layers complement compute-heavy dense f…

Python 375 32 Updated Dec 12, 2024

Awesome AI Memory | LLM Memory | A curated knowledge base on AI memory for LLMs and agents, covering long-term memory, reasoning, retrieval, and memory-native system design. Awesome-AI-Memory is a…

Python 659 51 Updated Mar 31, 2026

[ICLR 2026] LightMem: Lightweight and Efficient Memory-Augmented Generation

Python 734 63 Updated Apr 3, 2026

Unofficial implementation of Titans, SOTA memory for transformers, in PyTorch

Python 1,939 202 Updated Feb 9, 2026

DataComp for Language Models

HTML 1,430 131 Updated Sep 9, 2025

😊 TPTT: Transforming Pretrained Transformers into Titans

Python 62 Updated Nov 24, 2025

Minimal reproduction of OneRec

Python 1,382 190 Updated Mar 31, 2026

A reproduction of the DFT method without using Verl. https://arxiv.org/abs/2508.05629

Python 22 2 Updated Oct 14, 2025

[ICLR 2026] The official implementation of the paper “Anchored Supervised Fine-Tuning”

Jupyter Notebook 35 2 Updated Feb 12, 2026

A Reproduction of GDM's Nested Learning Paper

Python 666 96 Updated Feb 25, 2026

[ICLR 2025] MiniPLM: Knowledge Distillation for Pre-Training Language Models

Python 75 9 Updated Nov 23, 2024

[NeurIPS 2025 Spotlight] TPA: Tensor ProducT ATTenTion Transformer (T6) (https://arxiv.org/abs/2501.06425)

Python 448 37 Updated Jan 26, 2026

The core of the plan: large language models

Python 7 3 Updated Feb 17, 2026

🚀🚀 [Large Model] Train a 64M-parameter GPT entirely from scratch in just 2 hours! 🌏

Python 45,610 5,569 Updated Apr 4, 2026

MiniRBT (a series of small Chinese pre-trained models)

Python 301 19 Updated Jul 15, 2025

Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch, with support for plugging into langchain to load a local knowledge base for retrieval-augmented generation (RAG).

Jupyter Notebook 589 67 Updated Jul 11, 2024

Merge of LLMs and TRMs

TeX 1 Updated Feb 4, 2026
Jupyter Notebook 1 Updated Nov 2, 2025

Text-to-Video generation model using a Hierarchical Reasoning Model (HRM) optimized for T4 GPUs.

Python 5 1 Updated Jan 13, 2026

Hierarchical Reasoning Model Official Release

Python 12,373 1,806 Updated Mar 31, 2026

Python Implementation of MUVERA (Multi-Vector Retrieval via Fixed Dimensional Encodings)

Python 407 27 Updated Dec 10, 2025

BERT-based intent and slots detector for chatbots.

Python 239 32 Updated Feb 21, 2025

A fast llama2 decoder in pure Rust.

Rust 1,063 57 Updated Nov 30, 2023