Baalateja Kataru bkataru

🌩️

boomin at da speed of c ⚡🔧

⟨e|acc⟩ resident technical wizard | ai/ml/sys eng | hacking at it till grug says "software go zoom" | prev. computational physics in neutrino phenomenology

104 followers · 354 following

Highlights

Developer Program Member

Starred repositories

8 stars written in Cuda

karpathy / llm.c

LLM training in simple, raw C/CUDA

Cuda · 29,756 stars · 3,562 forks · Updated Jun 26, 2025

HigherOrderCO / HVM2

A massively parallel, optimal functional runtime in Rust

Cuda · 11,236 stars · 437 forks · Updated Nov 21, 2024

deepseek-ai / DeepEP

DeepEP: an efficient expert-parallel communication library

Cuda · 9,586 stars · 1,211 forks · Updated Apr 29, 2026

deepseek-ai / DeepGEMM

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda · 7,135 stars · 950 forks · Updated Apr 24, 2026

alibaba / rtp-llm

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

Cuda · 1,106 stars · 178 forks · Updated Apr 29, 2026

openai / blocksparse

Efficient GPU kernels for block-sparse matrix multiplication and convolution

Cuda · 1,065 stars · 198 forks · Updated Jun 8, 2023

gigit0000 / qwen3.cu

Single-file, pure CUDA C implementation for running inference on Qwen3 0.6B GGUF. No Dependencies.

Cuda · 24 stars · Updated Nov 26, 2025

wjcunningham7 / causets

Causal set quantum cosmology

Cuda · 2 stars · Updated Apr 4, 2022

Lists (3)

🤖 ai

🤔 maybe contribute

My Repo Watchlist

Starred topics

deepseek-api

observability

prometheus

Monitoring

Database

Rust

SQL

claude

superpowers

skills