colossalai · Singapore · Stars
MiroThinker is a deep research agent optimized for complex research and prediction tasks. Our latest model, MiroThinker-1.7, achieves 74.0 and 75.3 on BrowseComp and BrowseComp-ZH, respectively.
fanshiqing / grouped_gemm
Forked from tgale96/grouped_gemm. PyTorch bindings for CUTLASS grouped GEMM.
Fast and memory-efficient exact attention
[CVPR 2026] LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
MiroTrain is an efficient, algorithm-first framework for training research agents.
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
QQQ is an innovative and hardware-optimized W4A8 quantization solution for LLMs.
A static analytical model for LLM distributed training.
Cosmos-RL is a flexible and scalable Reinforcement Learning framework specialized for Physical AI applications.
Efficient Triton Kernels for LLM Training
A PyTorch native platform for training generative AI models
Development repository for the Triton language and compiler
Megatron-style tensor-parallel (TP) training.
Open-Sora: Democratizing Efficient Video Production for All
Tests the GPU bandwidth of collective operators such as all-reduce, all-gather, broadcast, and all-to-all on single-node multi-GPU (2, 4, 8 cards) and multi-node multi-GPU (16 cards) setups,…
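A benchmark like this typically reports "bus bandwidth" rather than raw bytes-per-second, rescaling each collective by how much traffic it actually puts on every link. A minimal sketch of that conversion, following the nccl-tests convention (the function name and example numbers are illustrative, not taken from the repo):

```python
# Minimal sketch (not the repo's code): convert a measured collective time into
# "bus bandwidth" using the standard nccl-tests scaling factors, so results are
# comparable across operators and GPU counts.

def bus_bandwidth_gbs(nbytes: float, seconds: float, world_size: int, op: str) -> float:
    """Bus bandwidth in GB/s for a collective over `world_size` ranks."""
    alg_bw = nbytes / seconds / 1e9  # algorithmic bandwidth, GB/s
    n = world_size
    factors = {
        "all_reduce": 2 * (n - 1) / n,   # ring all-reduce moves 2(n-1)/n of the data per link
        "all_gather": (n - 1) / n,
        "reduce_scatter": (n - 1) / n,
        "all_to_all": (n - 1) / n,
        "broadcast": 1.0,
    }
    return alg_bw * factors[op]

# e.g. 1 GB all-reduced across 8 GPUs in 10 ms:
print(round(bus_bandwidth_gbs(1e9, 0.01, 8, "all_reduce"), 2))  # → 175.0
```

Because of the scaling factor, an all-reduce at the same wall-clock speed reports nearly twice the bus bandwidth of a broadcast of the same payload.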
wangbluo / ColossalAI
Forked from hpcaitech/ColossalAI. Making large AI models cheaper, faster and more accessible
Build a LLaMA fine-tuning script from scratch using PyTorch and the transformers API. It needs to support four optional features: gradient checkpointing, mixed precision, data parallelism, tensor parallel…
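Two of those optional features can be sketched without any LLaMA weights: gradient checkpointing via `torch.utils.checkpoint` and mixed precision via `torch.autocast`. The tiny residual model below is a hypothetical stand-in, not the script itself; data and tensor parallelism would additionally wrap the model in `DistributedDataParallel` and shard the linear layers, which is omitted here.

```python
# Minimal sketch (assumed structure, not the actual fine-tuning script):
# gradient checkpointing + mixed precision on a toy model, CPU-only.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class Block(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
    def forward(self, x):
        return x + self.net(x)

class TinyModel(nn.Module):
    def __init__(self, dim=32, depth=2, use_checkpointing=True):
        super().__init__()
        self.blocks = nn.ModuleList(Block(dim) for _ in range(depth))
        self.use_checkpointing = use_checkpointing
    def forward(self, x):
        for blk in self.blocks:
            if self.use_checkpointing and self.training:
                # Drop intermediate activations; recompute them in backward.
                x = checkpoint(blk, x, use_reentrant=False)
            else:
                x = blk(x)
        return x

model = TinyModel()
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
x = torch.randn(4, 32)
with torch.autocast("cpu", dtype=torch.bfloat16):  # mixed precision region
    loss = model(x).pow(2).mean()
loss.backward()
opt.step()
opt.zero_grad()
```

The same skeleton extends to a real LLaMA: `transformers` models expose `model.gradient_checkpointing_enable()` for the first feature, and the autocast context wraps the forward pass unchanged.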
Making large AI models cheaper, faster and more accessible