Agentic RL on Any Harness at Scale

Python 577 61 Updated Jun 17, 2026

A safetensors extension to efficiently store sparse quantized tensors on disk

Python 292 95 Updated Jun 18, 2026

DFlash: Block Diffusion for Flash Speculative Decoding

Python 5,183 374 Updated May 10, 2026

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, …

Python 14,565 1,487 Updated Jun 18, 2026

🚀 PyTorch Distributed native training library for LLMs/VLMs with OOTB Hugging Face support

Python 610 185 Updated Jun 21, 2026

Ongoing research training transformer models at scale

Python 16,773 4,100 Updated Jun 21, 2026

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide better performance…

Python 3,400 752 Updated Jun 17, 2026

Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).

Python 2,407 286 Updated Feb 20, 2026

3x Faster Inference; Unofficial implementation of EAGLE Speculative Decoding

Python 84 14 Updated Jul 3, 2025

Reproduction of DeepSeek-R1

Python 238 23 Updated Apr 14, 2025

Contrib repository for the OpenTelemetry Collector

Go 4,745 3,666 Updated Jun 20, 2026

An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

Python 9,385 1,148 Updated Jun 21, 2026

Context OpenTelemetry Collector processor

Go 11 7 Updated Jun 20, 2026

The Triton backend for the ONNX Runtime.

C++ 179 83 Updated Jun 15, 2026

Splits Keras with TensorFlow backends into two or more submodels.

Jupyter Notebook 18 4 Updated Feb 20, 2023

DSPy: The framework for programming—not prompting—language models

Python 35,243 2,993 Updated Jun 18, 2026

LLM training in simple, raw C/CUDA

Cuda 30,282 3,657 Updated Jun 26, 2025

ONNXMLTools enables conversion of models to ONNX

Python 1,159 218 Updated Jun 10, 2026

onnxruntime-extensions: A specialized pre- and post- processing library for ONNX Runtime

C++ 469 134 Updated Jun 18, 2026

Triton Model Analyzer is a CLI tool to help with better understanding of the compute and memory requirements of the Triton Inference Server models.

Python 516 86 Updated Jun 16, 2026

🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.

Shell 141 5 Updated Jul 25, 2024

Set of tools to assess and improve LLM security.

Python 4,233 740 Updated Jun 19, 2026

Master programming by recreating your favorite technologies from scratch.

Markdown 517,890 48,994 Updated Feb 21, 2026

Microsoft Automatic Mixed Precision Library

Python 637 49 Updated Dec 1, 2025

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

C++ 20,877 4,003 Updated Jun 21, 2026

The Triton TensorRT-LLM Backend

935 137 Updated Jun 10, 2026

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

Python 13,928 2,484 Updated Jun 21, 2026

🐍 A Python lib for (de)serializing Python objects to/from JSON

Python 291 39 Updated Dec 29, 2023

Large Language Model Text Generation Inference

Python 10,862 1,271 Updated Mar 21, 2026

Simplify your ONNX model

C++ 4,353 430 Updated Jun 16, 2026