eqy

💭

damn that's crazy

`nvprof` is dead, long live `nsys nvprof`

138 followers · 128 following

NVIDIA

Achievements

x3 x3

Stars

mirage-project / mirage

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

Cuda 2,335 224 Updated Jun 24, 2026

xai-org / grok-prompts

Prompts for our Grok chat assistant and the `@grok` bot on X.

Jinja 4,171 461 Updated Nov 17, 2025

drisspg / transformer_nuggets

A place to store reusable transformer components of my own creation or found on the interwebs

Python 80 12 Updated May 30, 2026

KeeyanGhoreshi / PokemonFireredSingleSequence

A list of inputs that will beat the vast majority of Pokemon Firered games

Lua 200 9 Updated Mar 10, 2023

berkeley-cs164-sp25 / hw8-benchmarks

Common Lisp 1 2 Updated Apr 22, 2025

janeyx99 / torch-release-notes

Staging ground for release notes for PyTorch

2 6 Updated Jul 9, 2025

altanh / altanh.github.io

Shell 1 Updated Jan 5, 2026

deepseek-ai / DeepSeek-R1

91,984 11,719 Updated Jun 27, 2025

mcarilli / FlameGraph

Forked from brendangregg/FlameGraph

Stack trace visualizer

Perl 4 1 Updated Sep 2, 2020

NVIDIA / nvidia-resiliency-ext

NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the effective training time by minimizing the downtime due to fa…

Python 301 54 Updated Jun 25, 2026

ridgerchu / matmulfreellm

Implementation for MatMul-free LM.

Python 3,072 202 Updated Dec 2, 2025

eggert / tz

Time zone database and code

C 1,816 258 Updated Jun 25, 2026

MarisaKirisame / megatron

OCaml 6 1 Updated Jun 18, 2025

loopj / short-stack

World's Smallest Nintendo Wii, using a trimmed motherboard and custom stacked PCBs

HTML 791 11 Updated Oct 23, 2025

tinygrad / 7900xtx

Python 456 31 Updated Apr 6, 2025

lightvector / KataGo

GTP engine and self-play learning in Go

C++ 4,717 719 Updated Jun 22, 2026

altanh / gen

graph generation and analysis stuff

1 Updated Feb 26, 2024

sail-sg / zero-bubble-pipeline-parallelism

Forked from NVIDIA/Megatron-LM

Zero Bubble Pipeline Parallelism

Python 459 33 Updated May 7, 2025

crcrpar / pin

A Pin

2 Updated Aug 1, 2023

Lightning-AI / litgpt

20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

Python 13,439 1,464 Updated Jun 25, 2026

gussmith23 / dissertation

TeX 1 Updated Aug 27, 2024

tinygrad / tinygrad

You like pytorch? You like micrograd? You love tinygrad! ❤️

Python 33,164 4,194 Updated Jun 25, 2026

Liuhong99 / Sophia

The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”

Python 1,002 58 Updated Jan 30, 2024

joshpoll / reactive-bluefish-experiments

TypeScript 2 Updated May 17, 2023

ksivaman / TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in bot…

Cuda 2 Updated Sep 29, 2022

google / paxml

Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimentation and parallelization, and has demonstrated industry lead…

Python 556 72 Updated Jun 4, 2026

Luo-Liang / OSDP-public

Composable + Tunable = Optimal

Python 2 Updated Apr 14, 2023

Syllo / nvtop

GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm

C 10,774 397 Updated May 6, 2026

NVIDIA / Fuser

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

C++ 392 81 Updated May 31, 2026

FMInference / FlexLLMGen

Running large language models on a single GPU for throughput-oriented scenarios.

Python 9,363 591 Updated Oct 28, 2024
