  • Alibaba Cloud
  • Hangzhou China
  • 04:21 (UTC +08:00)
  • X @tonyluj

Starred repositories

A Next.js web application that integrates AI capabilities with draw.io diagrams. This app allows you to create, modify, and enhance diagrams through natural language commands and AI-assisted visual…

TypeScript 13,646 1,374 Updated Dec 19, 2025

React app for inspecting, building and debugging with the Realtime API

JavaScript 3,516 1,384 Updated Aug 28, 2025

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 7,676 698 Updated Dec 10, 2025

A framework for efficient model inference with omni-modality models

Python 1,006 136 Updated Dec 19, 2025

A Lightweight LLM Inference Performance Simulator

Python 47 9 Updated Nov 26, 2025

Contexts Optical Compression

Python 21,498 1,922 Updated Oct 25, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 21,760 3,810 Updated Dec 19, 2025

Shared data types for building collaborative software

JavaScript 20,840 731 Updated Dec 19, 2025

WebAssembly Micro Runtime (WAMR)

C 5,686 740 Updated Dec 18, 2025

A highly customizable, adaptable, runtime-agnostic, and WASM/WASI-friendly Gossip protocol (SWIM) that helps manage cluster membership and member failure detection.

Rust 128 8 Updated Dec 15, 2025

A collection of well-tested, serializable CRDTs for Rust

Rust 1,496 62 Updated Jun 16, 2024

Offline optimization of your disaggregated Dynamo graph

Python 134 38 Updated Dec 19, 2025

A transformer-based LLM, written completely in Rust

Rust 2,999 254 Updated Oct 10, 2025

A native implementation of ØMQ in Rust

Rust 1,330 122 Updated Dec 19, 2025

Python tool for converting files and office documents to Markdown.

Python 84,405 4,856 Updated Dec 1, 2025

Raft distributed consensus algorithm implemented in Rust.

Rust 3,261 438 Updated Oct 29, 2025

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

Python 558 119 Updated Dec 18, 2025

📰 Must-read papers and blogs on Speculative Decoding ⚡️

1,060 55 Updated Dec 11, 2025

This is the release repository for Fan Control, a highly customizable fan controlling software for Windows.

18,259 541 Updated Dec 11, 2025

The official Python library for the OpenAI API

Python 29,520 4,471 Updated Dec 19, 2025

Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.

Python 432 14 Updated Dec 16, 2025

A workload for deploying LLM inference services on Kubernetes

Go 140 36 Updated Dec 12, 2025

Fast inference from large language models via speculative decoding

Python 868 93 Updated Aug 22, 2024

Codes for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)

Python 46 1 Updated Dec 9, 2023

A curated list for Efficient Large Language Models

Python 1,916 146 Updated Jun 17, 2025

vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization

Python 2,043 342 Updated Dec 19, 2025

Rust bindings for the C++ API of PyTorch.

Rust 5,193 409 Updated Nov 4, 2025

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.

Python 16,826 3,700 Updated Jun 2, 2023

Ergonomic and modular web framework built with Tokio, Tower, and Hyper

Rust 24,192 1,291 Updated Dec 19, 2025