@thu-ml, Tsinghua University
Beijing, China
14:01 (UTC+09:00)
https://bingrui-li.github.io/
@bingruili_
@bingruil.bsky.social
Stars
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
QLoRA: Efficient Finetuning of Quantized LLMs
SimCLRv2 - Big Self-Supervised Models are Strong Semi-Supervised Learners
Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
Structured state space sequence models
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Fast and Easy Infinite Neural Networks in Python
Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.
OLMoE: Open Mixture-of-Experts Language Models
Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"
Understanding Training Dynamics of Deep ReLU Networks
Assignments for Berkeley CS 285: Deep Reinforcement Learning (Fall 2020)
A package of distributionally robust optimization (DRO) methods, implemented via cvxpy and PyTorch.
A library for unit scaling in PyTorch
Code for the paper: Why Transformers Need Adam: A Hessian Perspective
A Python package providing a benchmark with various specified distribution-shift patterns.
[ICLR 2025] How Does Critical Batch Size Scale in Pre-training?
Code for "A Regularized Newton Method for Nonconvex Optimization with Global and Local Complexity Guarantees"
Paper accepted to NAACL.