masahi

Organizations

@apache @dmlc @octoml


Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

Python 5,193 452 Updated Feb 15, 2026

CUDA/Metal accelerated language model inference

C 625 31 Updated May 29, 2025

RPyC (Remote Python Call) - A transparent and symmetric RPC library for python

Python 1,692 250 Updated Aug 14, 2025

📚 Jupyter notebook tutorials for OpenVINO™

Jupyter Notebook 3,040 979 Updated Feb 15, 2026

Embree ray tracing kernels repository.

C++ 2,644 420 Updated Feb 13, 2026

Universal LLM Deployment Engine with ML Compilation

Python 22,042 1,936 Updated Feb 13, 2026

Build system, successor to Buck

Rust 4,255 328 Updated Feb 15, 2026

MoonRay is DreamWorks’ open-source, award-winning, state-of-the-art production MCRT renderer.

CMake 4,586 289 Updated Feb 4, 2026

optimized BERT transformer inference on NVIDIA GPU. https://arxiv.org/abs/2210.03052

C++ 478 37 Updated Mar 15, 2024

Language Modeling with the H3 State Space Model

Assembly 522 51 Updated Sep 29, 2023

An open-source efficient deep learning framework/compiler, written in python.

Python 738 68 Updated Sep 4, 2025

An efficient vector-graphics renderer

Rust 2,648 56 Updated May 16, 2023

A GPU compute-centric 2D renderer.

Rust 3,761 218 Updated Feb 15, 2026

A modern cross-platform low-level graphics library and rendering framework

Batchfile 4,181 367 Updated Feb 16, 2026

AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Python 4,703 382 Updated Jan 12, 2026

Real-time GPU path tracing with an OpenUSD Hydra render delegate

C++ 599 48 Updated Aug 8, 2025

This is the development repository for the OpenFHE library. The current version is 1.4.2 (released on October 20, 2025).

C++ 1,073 273 Updated Feb 14, 2026

3D fluid simulation experiments in Rust, using WebGPU-rs (WIP)

Rust 472 17 Updated Dec 17, 2022
HLSL 475 72 Updated Jan 13, 2026
Python 51 8 Updated Mar 29, 2023

A STARK prover and verifier for arbitrary computations

Rust 884 222 Updated Jul 19, 2025

The Flutter engine

C++ 7,587 5,991 Updated Feb 25, 2025

A General-purpose Task-parallel Programming System using Modern C++

C++ 11,724 1,366 Updated Feb 16, 2026

SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime

Python 2,583 295 Updated Feb 14, 2026

Single C file, Realtime CPU/GPU Profiler with Remote Web Viewer

C 3,286 283 Updated Aug 28, 2024

Vulkan and rust experiments, including a spectral path tracer using Vulkan ray tracing extensions

Rust 131 5 Updated Sep 13, 2025

Instant neural graphics primitives: lightning fast NeRF and more

Cuda 17,275 2,049 Updated Feb 2, 2026

magic-trace collects and displays high-resolution traces of what a process is doing

OCaml 5,232 124 Updated Jan 14, 2026

3D engine with modern graphics

C 6,876 723 Updated Feb 15, 2026

Open Machine Learning Compiler Framework

Python 13,119 3,788 Updated Feb 16, 2026