Akash K. akothen

17 followers · 4 following

Achievements

Highlights

Pro

Stars

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python · 21,289 stars · 2,248 forks · Updated Dec 25, 2025

NVIDIA-NeMo / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python · 16,357 stars · 3,243 forks · Updated Dec 24, 2025

qemu / qemu

Official QEMU mirror. Please see https://www.qemu.org/contribute/ for how to submit changes to QEMU. Pull Requests are ignored. Please only use release tarballs from the QEMU website.

C · 12,445 stars · 6,411 forks · Updated Dec 23, 2025

liquidslr / leetcode-company-wise-problems

Lists of company wise questions available on leetcode premium. Every csv file in the companies directory corresponds to a list of questions on leetcode for a specific company based on the leetcode …

10,734 stars · 2,224 forks · Updated Jun 20, 2025

mistralai / mistral-inference

Official inference library for Mistral models

Jupyter Notebook · 10,606 stars · 1,002 forks · Updated Nov 21, 2025

WebAssembly / binaryen

Optimizer and compiler/toolchain library for WebAssembly

WebAssembly · 8,251 stars · 831 forks · Updated Dec 23, 2025

linkedin / Liger-Kernel

Efficient Triton Kernels for LLM Training

Python · 5,978 stars · 455 forks · Updated Dec 25, 2025

mit-han-lab / efficientvit

Efficient vision foundation models for high-resolution generation and perception.

Python · 3,185 stars · 229 forks · Updated Sep 5, 2025

merrymercy / awesome-tensor-compilers

A list of awesome compiler projects and papers for tensor computation and deep learning.

2,702 stars · 323 forks · Updated Oct 19, 2024

diku-dk / futhark

💥💻💥 A data-parallel functional programming language

Haskell · 2,640 stars · 191 forks · Updated Dec 25, 2025

fengbintu / Neural-Networks-on-Silicon

This is originally a collection of papers on neural network accelerators. Now it's more like my selection of research on deep learning and computer architecture.

2,045 stars · 388 forks · Updated Nov 8, 2025

gpgpu-sim / gpgpu-sim_distribution

GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as…

C++ · 1,532 stars · 607 forks · Updated Feb 15, 2025

GPSnoopy / RayTracingInVulkan

Implementation of Peter Shirley's Ray Tracing In One Weekend book using Vulkan and NVIDIA's RTX extension.

C++ · 1,453 stars · 129 forks · Updated Jun 26, 2025

riscvarchive / riscv-v-spec

Working draft of the proposed RISC-V V vector extension

Assembly · 1,059 stars · 280 forks · Updated Mar 17, 2024

microsoft / nnfusion

A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.

C++ · 1,004 stars · 165 forks · Updated Sep 19, 2024

volcengine / veScale

Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs

Python · 910 stars · 53 forks · Updated Nov 27, 2025

thu-ml / SpargeAttn

[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.

Cuda · 862 stars · 74 forks · Updated Dec 17, 2025

amd / RyzenAI-SW

AMD Ryzen™ AI Software includes the tools and runtime libraries for optimizing and deploying AI inference on AMD Ryzen™ AI powered PCs.

Python · 719 stars · 110 forks · Updated Dec 16, 2025

pytorch / helion

A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.

Python · 695 stars · 89 forks · Updated Dec 25, 2025

mgaudet / CompilerJobs

A listing of compiler, language and runtime teams for people looking for jobs in this area

HTML · 690 stars · 72 forks · Updated Dec 9, 2025

google / heir

A compiler for homomorphic encryption

C++ · 631 stars · 109 forks · Updated Dec 25, 2025

perplexityai / pplx-kernels

Perplexity GPU Kernels

C++ · 544 stars · 74 forks · Updated Nov 7, 2025

KnowingNothing / compiler-and-arch

A list of tutorials, paper, talks, and open-source projects for emerging compiler and architecture

511 stars · 41 forks · Updated Jan 15, 2025

NVIDIA / tilus

Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.

Python · 433 stars · 15 forks · Updated Dec 16, 2025

mit-han-lab / Block-Sparse-Attention

A sparse attention kernel supporting mix sparse patterns

C++ · 413 stars · 39 forks · Updated Dec 16, 2025

scalesim-project / SCALE-Sim

Repository to host and maintain SCALE-Sim code

Python · 395 stars · 138 forks · Updated Dec 17, 2025

Meituan-AutoML / VisionLLaMA

VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks

Python · 390 stars · 12 forks · Updated Jul 9, 2024

circify / circ

(Cir)cuit (C)ompiler. Compiling high-level languages to circuits for SMT, zero-knowledge proofs, and more.

Rust · 312 stars · 48 forks · Updated Jun 3, 2025

mit-han-lab / x-attention

[ICML 2025] XAttention: Block Sparse Attention with Antidiagonal Scoring

Python · 261 stars · 19 forks · Updated Jul 6, 2025

PLSysSec / sys

Sys: A Static/Symbolic Tool for Finding Good Bugs in Good (Browser) Code

LLVM · 234 stars · 41 forks · Updated Mar 14, 2022