29 stars written in Jupyter Notebook

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

Jupyter Notebook 14,546 3,375 Updated Aug 12, 2024

Solve puzzles. Learn CUDA.

Jupyter Notebook 11,614 890 Updated Sep 1, 2024

QLoRA: Efficient Finetuning of Quantized LLMs

Jupyter Notebook 10,735 865 Updated Jun 10, 2024

SimCLRv2 - Big Self-Supervised Models are Strong Semi-Supervised Learners

Jupyter Notebook 4,395 651 Updated May 22, 2023

Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 2,822 159 Updated Oct 9, 2025

Structured state space sequence models

Jupyter Notebook 2,765 345 Updated Jul 17, 2024

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 2,658 187 Updated Jun 25, 2024

Fast and Easy Infinite Neural Networks in Python

Jupyter Notebook 2,360 237 Updated Mar 1, 2024

Maximal update parametrization (µP)

Jupyter Notebook 1,614 104 Updated Jul 17, 2024

Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.

Jupyter Notebook 1,486 58 Updated Jun 14, 2025

OLMoE: Open Mixture-of-Experts Language Models

Jupyter Notebook 899 83 Updated Sep 23, 2025

Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"

Jupyter Notebook 552 51 Updated Oct 7, 2025

Understanding Training Dynamics of Deep ReLU Networks

Jupyter Notebook 302 32 Updated Oct 19, 2025

Assignments for Berkeley CS 285: Deep Reinforcement Learning (Fall 2020)

Jupyter Notebook 249 244 Updated May 18, 2021
Jupyter Notebook 241 71 Updated May 10, 2024

A package of distributionally robust optimization (DRO) methods, implemented via cvxpy and PyTorch.

Jupyter Notebook 144 8 Updated May 31, 2025

A library for unit scaling in PyTorch

Jupyter Notebook 132 12 Updated Jul 11, 2025

qwen-nsa

Jupyter Notebook 83 6 Updated Oct 14, 2025

Code for the paper: Why Transformers Need Adam: A Hessian Perspective

Jupyter Notebook 64 8 Updated Mar 11, 2025

A Python package providing a benchmark with various specified distribution shift patterns.

Jupyter Notebook 58 4 Updated Nov 27, 2023
Jupyter Notebook 53 4 Updated May 20, 2024
Jupyter Notebook 18 Updated Jun 12, 2025

Combining SOAP and MUON

Jupyter Notebook 16 Updated Feb 11, 2025

[ICLR 2025] How Does Critical Batch Size Scale in Pre-training?

Jupyter Notebook 10 1 Updated Feb 20, 2025

Code for "A Regularized Newton Method for Nonconvex Optimization with Global and Local Complexity Guarantees"

Jupyter Notebook 8 Updated Oct 31, 2025
Jupyter Notebook 6 Updated Jul 11, 2025

Paper accepted to NAACL

Jupyter Notebook 2 Updated Apr 4, 2024