
Tenstorrent MLIR compiler

C++ 187 59 Updated Oct 9, 2025

🤘 TT-NN operator library, and TT-Metalium low level kernel programming model.

C++ 1,225 277 Updated Oct 9, 2025

Universal LLM Deployment Engine with ML Compilation

Python 21,451 1,835 Updated Oct 6, 2025

Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators

Python 12,706 3,668 Updated Oct 8, 2025

Efficient Triton Kernels for LLM Training

Python 5,725 413 Updated Oct 8, 2025

Blazingly fast LLM inference.

Rust 6,136 459 Updated Oct 6, 2025

Open-source search and retrieval database for AI applications.

Rust 23,778 1,865 Updated Oct 9, 2025

TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM gateway, observability, optimization, evaluation, and experimentation.

Rust 10,399 711 Updated Oct 9, 2025

Efficient platform for inference and serving local LLMs, including an OpenAI-compatible API server.

Rust 482 56 Updated Sep 26, 2025

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 7,141 610 Updated Oct 9, 2025

A minimal tensor processing unit (TPU), inspired by Google's TPU V2 and V1

SystemVerilog 956 75 Updated Aug 21, 2025

A Datacenter Scale Distributed Inference Serving Framework

Rust 5,263 632 Updated Oct 9, 2025

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 40,347 4,575 Updated Oct 9, 2025

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

Python 8,117 877 Updated Oct 8, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 59,709 10,583 Updated Oct 9, 2025

Use your Neovim like using Cursor AI IDE!

Lua 16,083 733 Updated Oct 8, 2025