Qubitium


Google's Engineering Practices documentation

23,243 2,268 Updated Sep 19, 2024

Ultra-low-latency, high-throughput multiprocess transport over SHM and mmap. LMAX-Disruptor-style cross-process ring substrate.

Rust 11 1 Updated Jun 11, 2026

Cross-architecture CUDA kernels for SVDQuant (W4A4 with low-rank correction)

Python 2 1 Updated May 28, 2026
Python 135 18 Updated Jun 10, 2026

LLM model quantization (compression) toolkit with HW acceleration support for Nvidia, AMD, Intel GPU and Intel/AMD/Apple CPU via HF, vLLM, and SGLang.

Python 1,177 187 Updated Jun 12, 2026
Jupyter Notebook 6 4 Updated Feb 2, 2024

Compile Docker images into a single self-contained binary

Rust 19 Updated Apr 30, 2026

Tools for converting ACPI DSDT to Device Tree Source for CIX Sky1 boards

Shell 4 2 Updated Feb 1, 2026

Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.

Python 1,304 150 Updated Jun 12, 2026

A Python DSL to write Nvidia PTX for Hopper and Blackwell in JAX and PyTorch

Python 311 26 Updated May 8, 2026

The headless browser for AI agents and web scraping

Rust 15,617 1,058 Updated Jun 13, 2026

Open-source AI sandbox infrastructure with unified API for VMMs -- Firecracker, QEMU and libkrun.

Python 594 46 Updated Jun 13, 2026

Secure and fast microVMs for serverless computing.

Rust 34,910 2,441 Updated Jun 12, 2026

Dynamic per-token early exit for LLM inference. Skip layers tokens don't need.

Python 31 5 Updated Mar 18, 2026

rvLLM: High-performance LLM inference in Rust. Drop-in vLLM replacement.

Rust 739 70 Updated Jun 12, 2026

🤖FFPA: Extends FlashAttention-2 via Split-D for large headdims, 1.5x~3×↑🎉 vs SDPA, up to 430T🎉 on H200.

Python 310 20 Updated Jun 12, 2026

Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…

Python 201 62 Updated Jun 13, 2026
Python 778 62 Updated Apr 16, 2026

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 7,373 1,044 Updated Jun 4, 2026

Enable true multi-GPU capability in Comfy UI using XDiT XFuser and FSDP, managed by Ray

Python 350 37 Updated Jun 12, 2026

OBLITERATE THE CHAINS THAT BIND YOU

Python 6,474 1,227 Updated Apr 1, 2026

[ICML 2026] Jacobi Forcing: Fast and Accurate Diffusion-style Decoding

Python 117 10 Updated Feb 20, 2026

The open-source agent-serving project

Python 477 32 Updated Jun 8, 2026
Python 4 Updated Apr 25, 2026

Fast and accurate AI powered file content types detection

Python 17,135 1,051 Updated Jun 11, 2026

AR 3D object detection for iPhone with LiDAR — YOLO 2D + BoxerNet 3D lifting

Swift 402 46 Updated Apr 30, 2026

🎨 NeMo Data Designer: Generate high-quality synthetic data from scratch or from seed data.

Python 1,993 181 Updated Jun 12, 2026
Python 192 6 Updated Jan 26, 2026

OpenShell is the safe, private runtime for autonomous AI agents.

Rust 7,074 849 Updated Jun 13, 2026