weijietong

Organizations

@RoaringBitmap

Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.

Python 31,055 3,705 Updated Apr 13, 2026

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Python 35,407 3,529 Updated Apr 17, 2026

A machine learning compiler for GPUs, CPUs, and ML accelerators

C++ 4,178 783 Updated Apr 17, 2026

Implement a reasoning LLM in PyTorch from scratch, step by step

Jupyter Notebook 4,131 586 Updated Apr 17, 2026

Build smaller, faster, and more secure desktop and mobile applications with a web frontend.

Rust 105,571 3,535 Updated Apr 16, 2026

AnyBlox runtime and tooling

C 37 1 Updated Sep 4, 2025

An extensible, state-of-the-art framework for columnar compression, and the fastest FOSS columnar file format. Formerly at @spiraldb, now an Incubation Stage project at LFAI&Data, part of the Linux…

Rust 2,876 145 Updated Apr 17, 2026

An easy-to-use, header-only C++ wrapper for Linux' perf event API

C++ 140 22 Updated Jan 7, 2026
C++ 23 4 Updated Nov 7, 2025
C++ 924 90 Updated Apr 17, 2026

Goal: Enable awesome tooling for Bazel users of the C language family.

Python 899 187 Updated Aug 11, 2025

The universal proxy platform

Go 32,765 3,854 Updated Apr 17, 2026

[TMLR 2025] Efficient Reasoning Models: A Survey

Python 309 22 Updated Mar 9, 2026

Vector (and Scalar) Quantization, in Pytorch

Python 3,900 325 Updated Apr 17, 2026

Official repository of the xLSTM.

Python 2,150 179 Updated Nov 4, 2025

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide better performance…

Python 3,277 697 Updated Apr 17, 2026

The hub for EleutherAI's work on interpretability and learning dynamics

Jupyter Notebook 2,776 212 Updated Nov 15, 2025

Modeling, training, eval, and inference code for OLMo

Python 6,478 749 Updated Nov 24, 2025

InkFuse - An Experimental Database Runtime Unifying Vectorized and Compiled Query Execution.

C++ 55 3 Updated May 13, 2024

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 9,595 1,796 Updated Apr 17, 2026

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

Python 13,398 2,297 Updated Apr 17, 2026

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 70,231 8,598 Updated Apr 12, 2026

Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.5, DeepSeek, gpt-oss locally.

Python 61,986 5,378 Updated Apr 17, 2026

The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.

Python 11,009 859 Updated Apr 17, 2026

A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations

Python 17,001 1,265 Updated Apr 14, 2026

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batchsizes of 16-32 tokens.

Python 1,057 86 Updated Sep 4, 2024

A concise but complete full-attention transformer with a set of promising experimental features from various papers

Python 5,826 508 Updated Apr 16, 2026

Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.

Go 169,269 15,649 Updated Apr 17, 2026
41 22 Updated Apr 3, 2022

LLM training in simple, raw C/CUDA

Cuda 29,610 3,530 Updated Jun 26, 2025