ProExpertProg

Organizations

@cryptovoting


Production inference for encoder models - ColBERT, GLiNER, ColPali, embeddings etc. - as vLLM plugins for online and in-process deployment

Python 53 5 Updated Apr 27, 2026

A simple GPU reservation tool for single host shared development systems

Go 26 7 Updated Apr 27, 2026

wentao.site / Hugo Template / A template repository for Hugo based blog

55 3 Updated Mar 21, 2026

Nano vLLM

Python 13,182 2,017 Updated Apr 26, 2026

Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs

Python 1,080 145 Updated Apr 29, 2026

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 3,100 442 Updated Apr 29, 2026

Tile primitives for speedy kernels

Cuda 3,330 276 Updated Apr 29, 2026

AI Tensor Engine for ROCm

Python 420 294 Updated Apr 29, 2026

An efficient, composable design pattern for range processing

C++ 138 6 Updated Mar 6, 2025

A High-Performance JIT-Based C++ Expression/Script Execution Engine with SIMD Vectorization Support

C++ 99 7 Updated Oct 17, 2025

A curated list of awesome SIMD frameworks, libraries and software

239 18 Updated Sep 15, 2024

Performance-portable, length-agnostic SIMD with runtime dispatch

C++ 5,482 431 Updated Apr 29, 2026

A modern C++ wrapper around the FFTW library

C++ 1 Updated Apr 7, 2025

Simple Useful Libraries: C++17/20 header-only dynamic bitset

C++ 175 16 Updated Jan 19, 2026

Top-level directory for documentation and general content

MDX 120 7 Updated Jun 2, 2025

Notebooks using the Neural Magic libraries 📓

Jupyter Notebook 39 6 Updated Jul 24, 2024

Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes

Python 388 28 Updated Jun 2, 2025

Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

Python 2,143 156 Updated Jun 2, 2025

Sparsity-aware deep learning inference runtime for CPUs

Python 3,162 191 Updated Jun 2, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 266 10 Updated Dec 4, 2025

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution

Python 6,653 772 Updated Apr 27, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 78,608 16,254 Updated Apr 29, 2026

A tagged-pointer type for C++.

C++ 37 1 Updated Aug 3, 2023

A utility for creating amalgamated single-header C++ libraries

C++ 64 6 Updated Apr 3, 2022

A machine learning compiler for GPUs, CPUs, and ML accelerators

C++ 4,218 792 Updated Apr 29, 2026

CLI utility for managing your project, a modern touch for C/C++

C++ 120 1 Updated Mar 12, 2024

"See why!" Explains and suggests fixes for compile-time errors for C, C++, C#, Go, Java, LaTeX, PHP, Python, Ruby, Rust, and TypeScript

Python 303 9 Updated Nov 10, 2025

Athena++ radiation GRMHD code and adaptive mesh refinement (AMR) framework

C++ 332 181 Updated Apr 27, 2026