JOE1994

Organizations

@freebsdkorea @sslab-gatech @llvm

Starred repositories

RenderDoc is a stand-alone graphics debugging tool.

C++ 10,598 1,617 Updated Apr 13, 2026

WinPixEventRuntime + decoder + tests

C++ 100 13 Updated Mar 27, 2024

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 159,354 32,868 Updated Apr 14, 2026

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

C++ 386 80 Updated Apr 13, 2026

Powering AWS purpose-built machine learning chips. Blazing fast and cost effective, natively integrated into PyTorch and TensorFlow and integrated with your favorite AWS services

Python 591 184 Updated Apr 14, 2026

LLVM Code Generation, published by Packt

C++ 242 60 Updated Jan 23, 2026

This is the second repo for the book "LLVM Code Generation". This will be linked to the main repo for this title.

LLVM 39 25 Updated Apr 7, 2026

Backward compatible ML compute opset inspired by HLO/MHLO

MLIR 640 191 Updated Apr 13, 2026

Material for gpu-mode lectures

Jupyter Notebook 5,950 599 Updated Feb 1, 2026

P4_16 reference compiler

C++ 817 510 Updated Apr 14, 2026

Super-fast Structured Outputs

Rust 736 62 Updated Apr 14, 2026

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 42,118 7,437 Updated Apr 14, 2026

Large Language Model Text Generation Inference

Python 10,832 1,261 Updated Mar 21, 2026

Implementation of Speculative Sampling as described in "Accelerating Large Language Model Decoding with Speculative Sampling" by DeepMind

Python 110 16 Updated Feb 29, 2024

Fast and memory-efficient exact attention

Python 23,348 2,614 Updated Apr 14, 2026

SRIOV network device plugin for Kubernetes

Go 511 204 Updated Apr 6, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 76,540 15,566 Updated Apr 14, 2026

Tiny, fast, non-dependent and fully loaded printf implementation for embedded systems. Extensive test suite passing.

C 2,976 553 Updated Apr 3, 2023

LaTeX Examples Document Source

TeX 253 67 Updated Nov 14, 2025

Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)

Cuda 964 352 Updated Aug 19, 2024

QUDA is a library for performing calculations in lattice QCD on GPUs.

C++ 347 113 Updated Apr 8, 2026

LaTeX Examples Document Source

Jupyter Notebook 11 2 Updated Apr 9, 2024

CPU profiling trace viewer

C# 266 21 Updated Apr 10, 2026

A tool and a library for bi-directional translation between SPIR-V and LLVM IR

LLVM 606 266 Updated Apr 14, 2026

PyTorch native quantization and sparsity for training and inference

Python 2,772 481 Updated Apr 14, 2026

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 99,115 27,482 Updated Apr 14, 2026

Demonstration and Template Projects

GDScript 8,589 2,080 Updated Apr 10, 2026

A course on aligning smol models.

Jupyter Notebook 6,629 2,294 Updated Apr 7, 2026