This is a tool for managing GPU partitions for NVIDIA Fabric Manager’s Shared NVSwitch.

C++ 13 9 Updated Apr 29, 2025

You like pytorch? You like micrograd? You love tinygrad! ❤️

Python 32,390 4,053 Updated Apr 13, 2026

NVIDIA Linux open GPU kernel module source

C 16,891 1,658 Updated Apr 3, 2026

Official JAX implementation of End-to-End Test-Time Training for Long Context

Python 583 41 Updated Feb 15, 2026
321 28 Updated Apr 6, 2026

A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology

C++ 1,368 183 Updated Mar 12, 2026

GCAPS: GPU Context-Aware Preemptive Scheduling Approach

C 15 1 Updated Mar 22, 2026
Jupyter Notebook 23 3 Updated May 18, 2025

Official PyTorch implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States

Python 1,354 83 Updated Jul 14, 2024

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 3,975 570 Updated Mar 13, 2026

Nano vLLM

Python 12,861 1,919 Updated Apr 13, 2026

Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.5, DeepSeek, gpt-oss locally.

Python 61,397 5,309 Updated Apr 13, 2026

Resource Multiplexing in Tuning and Serving Large Language Models (USENIX ATC 2025)

Python 8 5 Updated Apr 13, 2026

Naive attempt at implementing TTT paper by letting autograd do the heavy lifting

Python 8 Updated Feb 20, 2026

Unofficial implementation of Titans, SOTA memory for transformers, in PyTorch

Python 1,948 204 Updated Feb 9, 2026

dspy-cli is a tool for creating, developing, testing, and deploying DSPy programs as HTTP APIs.

Python 122 10 Updated Mar 3, 2026

Kernel Tuner

Python 387 64 Updated Apr 13, 2026

NVIDIA Linux open GPU with P2P support

C 12 1 Updated Jan 6, 2026

Artifact from "Hardware Compute Partitioning on NVIDIA GPUs". THIS IS A FORK OF BAKITAS REPO. I AM NOT ONE OF THE AUTHORS OF THE PAPER.

C 59 5 Updated Nov 24, 2025

Dynamic Memory Management for Serving LLMs without PagedAttention

C 477 41 Updated May 30, 2025

The official implementation of the ICML 2024 paper "MemoryLLM: Towards Self-Updatable Large Language Models" and "M+: Extending MemoryLLM with Scalable Long-Term Memory"

Python 309 32 Updated Jul 28, 2025

The open-source RAG platform: built-in citations, deep research, 22+ file formats, partitions, MCP server, and more.

TypeScript 1,947 172 Updated Mar 21, 2026

The best ChatGPT that $100 can buy.

Python 51,767 6,874 Updated Mar 27, 2026

Contexts Optical Compression

Python 22,818 2,099 Updated Jan 27, 2026

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 10,417 773 Updated Mar 30, 2026

A Datacenter Scale Distributed Inference Serving Framework

Rust 6,546 1,019 Updated Apr 13, 2026

Train transformer language models with reinforcement learning.

Python 18,029 2,640 Updated Apr 13, 2026

A framework for optimizing DSPy programs with RL

Python 331 30 Updated Jan 12, 2026