pritam5756
  • Ranchi

Python 3 Updated Apr 4, 2026

Train the smallest LM you can that fits in 16MB. Best model wins!

Python 4,760 3,142 Updated Apr 9, 2026

A lightweight inference engine supporting speculative speculative decoding (SSD).

Python 871 65 Updated Mar 22, 2026

KV Cache & LoRA for minGPT

Python 62 8 Updated Mar 4, 2026

Course on Flash-attention in Triton

Jupyter Notebook 98 9 Updated Feb 9, 2026

From Minimal GEMM to Everything

Cuda 195 10 Updated Feb 10, 2026

Lists of company wise questions. Every csv file in the companies directory corresponds to a list of questions on leetcode for a specific company based on the leetcode company tags. Updated as of 20…

22,773 4,514 Updated Jun 20, 2025

Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…

Shell 112,927 18,889 Updated Apr 10, 2026

Student version of Assignment 1 for Stanford CS336 - Language Modeling From Scratch

Python 1,458 1,902 Updated Apr 7, 2026

Official inference repo for FLUX.2 models

Python 2,125 138 Updated Mar 12, 2026

Flux 2 image generation model pure C inference

C 1,923 132 Updated Feb 13, 2026

Claude Code skill that removes signs of AI-generated writing from text

13,475 1,164 Updated Apr 1, 2026

The missing tiktoken training code

Rust 428 46 Updated Jan 3, 2026

A curriculum for learning about GPU performance engineering, from scratch to what the frontier AI labs do

562 66 Updated Mar 2, 2026

Shortest solutions for CS231n 2021-2026

Jupyter Notebook 468 78 Updated Sep 26, 2025

~950 line, minimal, extensible LLM inference engine built from scratch.

Python 469 38 Updated Jan 9, 2026

Simple OSINT script to find Instagram profiles by name and e-mail/phone

Python 2,757 289 Updated Aug 17, 2024

Implementing scalable LLMs in pure JAX (no third-party libraries)

Python 48 6 Updated Apr 12, 2026

List of AI Residency Programs

3,284 271 Updated Apr 4, 2025

Based on Nano-vLLM, a simple replication of vLLM with self-contained paged attention and flash attention implementation

Python 650 87 Updated Mar 16, 2026

Train your own speech AI model from scratch

Python 150 15 Updated Feb 17, 2026

Learn CUDA with PyTorch

Cuda 270 37 Updated Apr 9, 2026

Official code of Why Knowledge Distillation Works in Generative Models: A Minimal Working Explanation (NeurIPS 2025)

Jupyter Notebook 18 1 Updated Dec 23, 2025

Big & Small LLMs working together

Python 1,285 146 Updated Mar 12, 2026

Agentic AI Infrastructure for magnifying HUMAN capabilities.

TypeScript 11,306 1,576 Updated Apr 12, 2026

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,971 286 Updated May 15, 2025

Examples demonstrating available options to program multiple GPUs in a single node or a cluster

Cuda 883 149 Updated Sep 26, 2025

Patterns and behaviors for GPU computing

C++ 1,769 284 Updated Jan 17, 2026