dengzheng-cloud

Zheng.Deng dengzheng-cloud

8 followers · 14 following

shanghai

Stars

sgl-project / SpecForge

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

Python 566 124 Updated Dec 23, 2025

LMCache / LMCache

Supercharge Your LLM with the Fastest KV Cache Layer

Python 6,418 812 Updated Dec 23, 2025

GeeeekExplorer / nano-vllm

Nano vLLM

Python 10,043 1,256 Updated Nov 3, 2025

harleyszhang / lite_llama

A lightweight llama-like LLM inference framework based on Triton kernels.

Python 167 25 Updated Sep 20, 2025

zinccat / Awesome-Triton-Kernels

Collection of kernels written in Triton language

173 9 Updated Apr 5, 2025

FareedKhan-dev / train-deepseek-r1

Building DeepSeek R1 from Scratch

Jupyter Notebook 730 118 Updated Mar 21, 2025

OpenBMB / UltraEval

[ACL 2024 Demo] Official GitHub repo for UltraEval: An open source framework for evaluating foundation models.

Python 253 22 Updated Oct 30, 2024

microsoft / LMOps

General technology for enabling AI capabilities w/ LLMs and MLLMs

Python 4,230 357 Updated Dec 22, 2025

h3r2tic / dolly

Composable camera rigs

Rust 474 39 Updated Jul 22, 2024

run-ai / llmperf

C++ 60 7 Updated Sep 17, 2024

horseee / Awesome-Efficient-LLM

A curated list for Efficient Large Language Models

Python 1,918 147 Updated Jun 17, 2025

archibate / co_http

An asynchronous HTTP server for teaching purposes, developed from scratch by 小彭老师 and based on C++17 callback functions

C++ 174 22 Updated Jul 24, 2024

PaddleJitLab / CUDATutorial

A self-learning tutorial for CUDA High Performance Programming.

JavaScript 791 75 Updated Jun 30, 2025

Kedreamix / ChatTTS

Forked from 2noise/ChatTTS

TTS

Jupyter Notebook 49 7 Updated Jun 4, 2024

2noise / ChatTTS

A generative speech model for daily dialogue.

Python 38,384 4,167 Updated Dec 3, 2025

HazyResearch / ThunderKittens

Tile primitives for speedy kernels

Cuda 3,012 219 Updated Dec 9, 2025

xlite-dev / LeetCUDA

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda 9,027 887 Updated Dec 4, 2025

adam-maj / tiny-gpu

A minimal GPU design in Verilog to learn how GPUs work from the ground up

SystemVerilog 8,997 704 Updated Aug 18, 2024

gpu-mode / lectures

Material for gpu-mode lectures

Jupyter Notebook 5,446 552 Updated Dec 8, 2025

karpathy / llm.c

LLM training in simple, raw C/CUDA

Cuda 28,456 3,337 Updated Jun 26, 2025

mozilla-ai / llamafile

Distribute and run LLMs with a single file.

C 23,547 1,253 Updated Dec 19, 2025

zhllxt / asio2

Header-only C++ network library based on asio; supports tcp, udp, http, websocket, rpc, ssl, icmp, serial_port, socks5.

C++ 904 196 Updated Oct 28, 2025

UbiquitousLearning / mllm

Fast Multimodal LLM on Mobile Devices

C++ 1,292 156 Updated Dec 23, 2025

hpcaitech / Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Python 28,145 2,818 Updated Apr 30, 2025

baidu-research / baidu-allreduce

Cuda 600 112 Updated Apr 6, 2018

shreyansh26 / FlashAttention-PyTorch

Implementation of FlashAttention in PyTorch

Python 178 20 Updated Jan 12, 2025

ronancpl / HeavenMS

An improved server based on MapleSolaxia (v83 MapleStory private server)

Java 1,149 864 Updated Dec 28, 2019

jundaf2 / INT8-Flash-Attention-FMHA-Quantization

Cuda 159 16 Updated Sep 15, 2023

kohya-ss / sd-scripts

Python 6,807 1,151 Updated Dec 21, 2025

Akegarasu / sd-webui-model-converter

Model conversion extension for stable-diffusion-webui. Supports converting fp16/bf16, no-ema/ema-only, and safetensors models.

Python 339 41 Updated Dec 24, 2024
