View leejaymin's full-sized avatar

Highlights

  • Pro

Organizations

@PowerLab


Integer-only FlashAttention kernel in Triton.

Python 1 1 Updated Jun 12, 2026

PyTorchSim is a Comprehensive, Fast, and Accurate NPU Simulation Framework

Python 128 22 Updated Jun 19, 2026

MCP for handling HWP

Python 257 62 Updated Jan 29, 2026

PyTorch CoreSIG

59 2 Updated Dec 30, 2024
Python 23 5 Updated May 7, 2026

Generate a comprehensive review from an arXiv paper, then turn it into a blog post. This project powers the website below for the HuggingFace's Daily Papers (https://huggingface.co/papers).

Python 840 92 Updated Feb 20, 2025
Python 34 3 Updated Feb 10, 2026

[ECCV 2024] CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs

Python 19 1 Updated Jul 2, 2024

Efficient GPU kernels for mixed-precision Vision Transformers in Triton

Python 17 1 Updated Sep 18, 2025

List of papers related to Vision Transformers quantization and hardware acceleration in recent AI conferences and journals.

106 6 Updated Jun 2, 2024

Code Repository of Evaluating Quantized Large Language Models

Python 134 10 Updated Sep 8, 2024

Hiera: A fast, powerful, and simple hierarchical vision transformer.

Python 6 1 Updated Mar 24, 2024

Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.

Python 650 90 Updated Sep 11, 2024

GPU programming related news and material links

2,182 132 Updated Jun 15, 2026

Shared Middle-Layer for Triton Compilation

MLIR 337 103 Updated Dec 5, 2025

A beautiful, simple, clean, and responsive Jekyll theme for academics

HTML 15,745 13,060 Updated Jun 17, 2026

Command-line program to download videos from YouTube.com and other video sites

Python 140,526 10,679 Updated Feb 19, 2026

Extract your SlidesLive presentation.

Jupyter Notebook 15 4 Updated Apr 19, 2024

A tool to modify ONNX models in a visualization fashion, based on Netron and Flask.

JavaScript 1,629 199 Updated Nov 19, 2025
Python 36 4 Updated Mar 29, 2023

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".

Python 2,320 201 Updated Mar 27, 2024

[CVPR 2023] PD-Quant: Post-Training Quantization Based on Prediction Difference Metric

Python 61 7 Updated Mar 23, 2023

LLM inference in C/C++

C++ 117,220 19,713 Updated Jun 19, 2026

This project aims to split ONNX models by reading a YAML config.

Python 1 1 Updated Aug 2, 2023

Template repository to build PyTorch projects from source on any version of PyTorch/CUDA/cuDNN.

Dockerfile 720 43 Updated Jan 11, 2025

Code and models for mobile-former

Python 132 19 Updated Jul 18, 2022