Qubitium

🙌

....

Qubitium-ModelCloud Qubitium

Golang, Python, Kotlin. GPTQModel maintainer and OSS contributor to SGLang, vLLM, and others. @ModelCloudAi founder

106 followers · 118 following

ModelCloud.ai
Earth/Epoch 2.0
https://modelcloud.ai
@qubitium

Achievements

x4 x3 x3

Stars

google / eng-practices

Google's Engineering Practices documentation

23,243 stars · 2,268 forks · Updated Sep 19, 2024

Venkat2811 / myelon

Ultra-low-latency, high-throughput multiprocess transport over SHM and mmap. LMAX-Disruptor-style cross-process ring substrate.

Rust · 11 stars · 1 fork · Updated Jun 11, 2026

xlite-dev / svdquant-kernels

Forked from ultism/svdquant-kernels

Cross-architecture CUDA kernels for SVDQuant (W4A4 with low-rank correction)

Python · 2 stars · 1 fork · Updated May 28, 2026

inclusionAI / humming

Python · 135 stars · 18 forks · Updated Jun 10, 2026

ModelCloud / GPTQModel

LLM model quantization (compression) toolkit with HW acceleration support for Nvidia, AMD, Intel GPU and Intel/AMD/Apple CPU via HF, vLLM, and SGLang.

Python · 1,177 stars · 187 forks · Updated Jun 12, 2026

muellerzr / h100-performance-tests

Jupyter Notebook · 6 stars · 4 forks · Updated Feb 2, 2024

dphnAI / vessel

Compile docker images into a single self-contained binary

Rust · 19 stars · Updated Apr 30, 2026

Sky1-Linux / acpi-to-dts-tools

Tools for converting ACPI DSDT to Device Tree Source for CIX Sky1 boards

Shell · 4 stars · 2 forks · Updated Feb 1, 2026

Tencent / AngelSlim

Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.

Python · 1,304 stars · 150 forks · Updated Jun 12, 2026

patrick-toulme / pyptx

A Python DSL to write Nvidia PTX for Hopper and Blackwell in JAX and PyTorch

Python · 311 stars · 26 forks · Updated May 8, 2026

h4ckf0r0day / obscura

The headless browser for AI agents and web scraping

Rust · 15,617 stars · 1,058 forks · Updated Jun 13, 2026

CelestoAI / SmolVM

Open-source AI sandbox infrastructure with unified API for VMMs -- Firecracker, QEMU and libkrun.

Python · 594 stars · 46 forks · Updated Jun 13, 2026

firecracker-microvm / firecracker

Secure and fast microVMs for serverless computing.

Rust · 34,910 stars · 2,441 forks · Updated Jun 12, 2026

RightNow-AI / TIDE

Dynamic per-token early exit for LLM inference. Skip layers tokens don't need

Python · 31 stars · 5 forks · Updated Mar 18, 2026

m0at / rvllm

rvLLM: High-performance LLM inference in Rust. Drop-in vLLM replacement.

Rust · 739 stars · 70 forks · Updated Jun 12, 2026

xlite-dev / ffpa-attn

🤖FFPA: Extends FlashAttention-2 via Split-D for large headdims, 1.5x~3×↑🎉 vs SDPA, up to 430T🎉 on H200.

Python · 310 stars · 20 forks · Updated Jun 12, 2026

groxaxo / Qwen3-TTS-Openai-Fastapi

Forked from QwenLM/Qwen3-TTS

Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…

Python · 201 stars · 62 forks · Updated Jun 13, 2026

apple / ml-ssd

deepseek-ai / DeepGEMM

komikndr / raylight

changjonathanc / FlashInfer-Bench_attempt_027

elder-plinius / OBLITERATUS

hao-ai-lab / JacobiForcing

lithos-ai / motus

sionic-ai / b300-ConnectX-8-netplan

google / magika

Barath19 / Boxer3D

NVIDIA-NeMo / DataDesigner

yuntian-group / neural-os

NVIDIA / OpenShell