Organizations

@freebsdkorea @sslab-gatech @llvm

Starred repositories

RenderDoc is a stand-alone graphics debugging tool.

C++ 10,135 1,473 Updated Nov 6, 2025

WinPixEventRuntime + decoder + tests

C++ 94 10 Updated Mar 27, 2024

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 152,211 31,076 Updated Nov 7, 2025

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

C++ 360 69 Updated Nov 7, 2025

Powering AWS purpose-built machine learning chips. Blazing fast and cost effective, natively integrated into PyTorch and TensorFlow and integrated with your favorite AWS services

Python 549 175 Updated Nov 6, 2025

LLVM Code Generation, published by Packt

C++ 202 43 Updated Oct 7, 2025

This is the second repo for the book "LLVM Code Generation". This will be linked to the main repo for this title.

LLVM 34 24 Updated Sep 29, 2025

Backward compatible ML compute opset inspired by HLO/MHLO

MLIR 563 161 Updated Nov 6, 2025

Material for gpu-mode lectures

Jupyter Notebook 5,262 526 Updated Sep 23, 2025

P4_16 reference compiler

C++ 790 472 Updated Nov 6, 2025

Super-fast Structured Outputs

Rust 594 39 Updated Oct 20, 2025

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 39,707 6,871 Updated Nov 7, 2025

Large Language Model Text Generation Inference

Python 10,628 1,234 Updated Nov 6, 2025

Implementation of Speculative Sampling as described in "Accelerating Large Language Model Decoding with Speculative Sampling" by DeepMind

Python 105 14 Updated Feb 29, 2024

Fast and memory-efficient exact attention

Python 20,393 2,121 Updated Nov 5, 2025

SRIOV network device plugin for Kubernetes

Go 479 196 Updated Nov 5, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 62,436 11,108 Updated Nov 7, 2025

Tiny, fast, non-dependent and fully loaded printf implementation for embedded systems. Extensive test suite passing.

C 2,887 540 Updated Apr 3, 2023

LaTeX Examples Document Source

TeX 249 65 Updated Jan 5, 2025

Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)

Cuda 890 323 Updated Aug 19, 2024

QUDA is a library for performing calculations in lattice QCD on GPUs.

C++ 330 109 Updated Nov 6, 2025

LaTeX Examples Document Source

Jupyter Notebook 11 2 Updated Apr 9, 2024

CPU profiling trace viewer

C# 238 19 Updated Oct 27, 2025

A tool and a library for bi-directional translation between SPIR-V and LLVM IR

LLVM 584 246 Updated Nov 7, 2025

PyTorch native quantization and sparsity for training and inference

Python 2,492 363 Updated Nov 7, 2025

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 94,788 25,822 Updated Nov 7, 2025

Demonstration and Template Projects

GDScript 7,687 1,970 Updated Nov 6, 2025

A course on aligning smol models.

Jupyter Notebook 6,489 2,300 Updated Nov 4, 2025