Stars
AccelOpt: Self-improving Agents for AI Accelerator Kernel Optimization
A Reconfigurable Accelerator with Data Reordering Support for Low-Cost On-Chip Dataflow Switching
SymEngine is a fast symbolic manipulation library, written in C++
Reference Code Implementation of paper "Evolution of Kernels: Automated RISC-V Kernel Optimization with Large Language Models"
Simulator for LLM inference on an abstract 3D AIMC-based accelerator
[HPCA'21] SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
A Simulation Framework for Memristive Deep Learning Systems
Memory Array Simulation Testbed for Organization, Data, Operations, and Networks
Verilog used to evaluate the FASED dot product hardware unit [IEEE CAL 2026]
🤘 TT-NN operator library and TT-Metalium low-level kernel programming model.
Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executables from a DNN model description.
Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators
[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.
Find shape errors before you run your code!
Artifact of the MICRO'25 paper "Characterizing and Optimizing Realistic Workloads on a Commercial Compute-in-SRAM Device"
An end-to-end Transformer fusion framework that integrates DAG-based pipeline scheduling with whole-encoder and whole-decoder fusion.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Official repo for the paper "An Effective Training Framework for Light-Weight Automatic Speech Recognition Models" accepted at InterSpeech 2025.
Efficient vision foundation models for high-resolution generation and perception.
[PACT'24] GraNNDis: A fast and unified distributed graph neural network (GNN) training framework for both full-batch (full-graph) and mini-batch training.