View akothen's full-sized avatar


Fast and memory-efficient exact attention

Python 21,289 2,248 Updated Dec 25, 2025

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 16,357 3,243 Updated Dec 24, 2025

Official QEMU mirror. Please see https://www.qemu.org/contribute/ for how to submit changes to QEMU. Pull Requests are ignored. Please only use release tarballs from the QEMU website.

C 12,445 6,411 Updated Dec 23, 2025

Lists of company-wise questions available on LeetCode Premium. Every CSV file in the companies directory corresponds to a list of questions on LeetCode for a specific company based on the leetcode …

10,734 2,224 Updated Jun 20, 2025

Official inference library for Mistral models

Jupyter Notebook 10,606 1,002 Updated Nov 21, 2025

Optimizer and compiler/toolchain library for WebAssembly

WebAssembly 8,251 831 Updated Dec 23, 2025

Efficient Triton Kernels for LLM Training

Python 5,978 455 Updated Dec 25, 2025

Efficient vision foundation models for high-resolution generation and perception.

Python 3,185 229 Updated Sep 5, 2025

A list of awesome compiler projects and papers for tensor computation and deep learning.

2,702 323 Updated Oct 19, 2024

💥💻💥 A data-parallel functional programming language

Haskell 2,640 191 Updated Dec 25, 2025

This was originally a collection of papers on neural network accelerators. Now it's more like my selection of research on deep learning and computer architecture.

2,045 388 Updated Nov 8, 2025

GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as…

C++ 1,532 607 Updated Feb 15, 2025

Implementation of Peter Shirley's Ray Tracing In One Weekend book using Vulkan and NVIDIA's RTX extension.

C++ 1,453 129 Updated Jun 26, 2025

Working draft of the proposed RISC-V V vector extension

Assembly 1,059 280 Updated Mar 17, 2024

A flexible and efficient deep neural network (DNN) compiler that generates a high-performance executable from a DNN model description.

C++ 1,004 165 Updated Sep 19, 2024

Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs

Python 910 53 Updated Nov 27, 2025

[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.

Cuda 862 74 Updated Dec 17, 2025

AMD Ryzen™ AI Software includes the tools and runtime libraries for optimizing and deploying AI inference on AMD Ryzen™ AI powered PCs.

Python 719 110 Updated Dec 16, 2025

A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.

Python 695 89 Updated Dec 25, 2025

A listing of compiler, language and runtime teams for people looking for jobs in this area

HTML 690 72 Updated Dec 9, 2025

A compiler for homomorphic encryption

C++ 631 109 Updated Dec 25, 2025

Perplexity GPU Kernels

C++ 544 74 Updated Nov 7, 2025

A list of tutorials, papers, talks, and open-source projects for emerging compilers and architectures

511 41 Updated Jan 15, 2025

Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.

Python 433 15 Updated Dec 16, 2025

A sparse attention kernel supporting mixed sparse patterns

C++ 413 39 Updated Dec 16, 2025

Repository to host and maintain SCALE-Sim code

Python 395 138 Updated Dec 17, 2025

VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks

Python 390 12 Updated Jul 9, 2024

(Cir)cuit (C)ompiler. Compiling high-level languages to circuits for SMT, zero-knowledge proofs, and more.

Rust 312 48 Updated Jun 3, 2025

[ICML 2025] XAttention: Block Sparse Attention with Antidiagonal Scoring

Python 261 19 Updated Jul 6, 2025

Sys: A Static/Symbolic Tool for Finding Good Bugs in Good (Browser) Code

LLVM 234 41 Updated Mar 14, 2022