ProExpertProg

Organizations

@cryptovoting


Production inference for encoder models - ColBERT, GLiNER, ColPali, embeddings etc. - as vLLM plugins for online and in-process deployment

Python 40 3 Updated Apr 15, 2026

A simple GPU reservation tool for single host shared development systems

Go 25 5 Updated Apr 6, 2026

wentao.site / Hugo Template / A template repository for Hugo based blog

55 3 Updated Mar 21, 2026

Nano vLLM

Python 12,920 1,934 Updated Apr 13, 2026

Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs

Python 1,010 145 Updated Apr 15, 2026

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 2,990 414 Updated Apr 15, 2026

Tile primitives for speedy kernels

Cuda 3,313 277 Updated Apr 8, 2026

AI Tensor Engine for ROCm

Python 406 281 Updated Apr 16, 2026

An efficient, composable design pattern for range processing

C++ 136 6 Updated Mar 6, 2025

A High-Performance JIT-Based C++ Expression/Script Execution Engine with SIMD Vectorization Support

C++ 99 7 Updated Oct 17, 2025

A curated list of awesome SIMD frameworks, libraries and software

239 18 Updated Sep 15, 2024

Performance-portable, length-agnostic SIMD with runtime dispatch

C++ 5,435 431 Updated Apr 15, 2026

A modern C++ wrapper around the FFTW library

C++ 1 Updated Apr 7, 2025

Simple Useful Libraries: C++17/20 header-only dynamic bitset

C++ 175 16 Updated Jan 19, 2026

Top-level directory for documentation and general content

MDX 120 7 Updated Jun 2, 2025

Notebooks using the Neural Magic libraries 📓

Jupyter Notebook 39 6 Updated Jul 24, 2024

Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes

Python 388 28 Updated Jun 2, 2025

Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

Python 2,142 156 Updated Jun 2, 2025

Sparsity-aware deep learning inference runtime for CPUs

Python 3,163 191 Updated Jun 2, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 266 10 Updated Dec 4, 2025

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution

Python 6,631 769 Updated Apr 10, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 76,760 15,648 Updated Apr 15, 2026

A tagged-pointer type for C++.

C++ 37 1 Updated Aug 3, 2023

A utility for creating amalgamated single-header C++ libraries

C++ 64 6 Updated Apr 3, 2022

A machine learning compiler for GPUs, CPUs, and ML accelerators

C++ 4,171 781 Updated Apr 16, 2026

CLI utility for managing your project, a modern touch for C/C++

C++ 120 1 Updated Mar 12, 2024

"See why!" Explains and suggests fixes for compile-time errors for C, C++, C#, Go, Java, LaTeX, PHP, Python, Ruby, Rust, and TypeScript

Python 303 9 Updated Nov 10, 2025

Athena++ radiation GRMHD code and adaptive mesh refinement (AMR) framework

C++ 328 180 Updated Apr 4, 2026