Repositories starred by jepeake: 83 written in Python

The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.

Python 10,537 793 Updated Nov 13, 2025

Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 17+ clouds, or on-prem).

Python 8,948 838 Updated Nov 13, 2025

Accessible large language models via k-bit quantization for PyTorch.

Python 7,742 793 Updated Nov 13, 2025

Utilities intended for use with Llama models.

Python 7,336 1,266 Updated Oct 10, 2025

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python 7,121 392 Updated Jul 11, 2024

An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.

Python 6,284 193 Updated Nov 12, 2025

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

Python 6,146 565 Updated Aug 22, 2025

Efficient Triton Kernels for LLM Training

Python 5,829 429 Updated Nov 11, 2025

Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.

Python 4,985 376 Updated Apr 21, 2025

A PyTorch native platform for training generative AI models

Python 4,699 601 Updated Nov 13, 2025

Perf monitoring CLI tool for Apple Silicon

Python 4,351 183 Updated Apr 18, 2024

🚀 Efficient implementations of state-of-the-art linear attention models

Python 3,829 299 Updated Nov 12, 2025

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 3,340 280 Updated Jul 17, 2025

Open source process design kit for usage with SkyWater Technology Foundry's 130nm node.

Python 3,314 431 Updated Oct 28, 2024

Sparsity-aware deep learning inference runtime for CPUs

Python 3,162 191 Updated Jun 2, 2025

Implementation for MatMul-free LM.

Python 3,037 196 Updated Jul 21, 2025

Run LLMs with MLX

Python 2,837 301 Updated Nov 13, 2025

SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Python 2,520 281 Updated Nov 12, 2025

PyTorch native quantization and sparsity for training and inference

Python 2,502 367 Updated Nov 13, 2025

Minimalistic large language model 3D-parallelism training

Python 2,318 258 Updated Sep 3, 2025

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".

Python 2,218 184 Updated Mar 27, 2024

Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

Python 2,146 157 Updated Jun 2, 2025

dstack is an open-source control plane for running development, training, and inference jobs on GPUs—across hyperscalers, neoclouds, or on-prem.

Python 1,953 202 Updated Nov 13, 2025

Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch

Python 1,891 168 Updated Oct 27, 2025

MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.

Python 1,834 204 Updated Nov 11, 2025

Machine learning on FPGAs using HLS

Python 1,687 488 Updated Nov 12, 2025

Implementing DeepSeek R1's GRPO algorithm from scratch

Python 1,663 76 Updated Apr 18, 2025

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Python 1,550 188 Updated Jul 12, 2024

Modular hardware build system

Python 1,104 113 Updated Nov 13, 2025