Qwen3-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.

Jupyter Notebook 19,425 1,795 Updated Jan 30, 2026

An Easy-to-Use and High-Performance AI Deployment Framework

C++ 1,830 222 Updated Apr 25, 2026

MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering high-performance on-device LLMs and Edge AI.

C++ 15,523 2,369 Updated Jun 18, 2026

On-device AI across mobile, embedded and edge for PyTorch

Python 4,745 1,037 Updated Jun 21, 2026

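AIInfra (AI infrastructure) covers the AI system stack, from low-level hardware such as chips up to the upper software stack, in support of training and inference for large AI models.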

Jupyter Notebook 7,382 957 Updated Dec 22, 2025

YOLO-World-ONNX is a Python package for running inference on YOLO-WORLD Open-vocabulary-object detection model using ONNX models. It provides an easy-to-use interface for performing inference on im…

Python 17 2 Updated Feb 6, 2026

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Python 6,423 608 Updated Feb 26, 2025

An example, built on SSD, of generating the txt files used by the mAP plotting code. (The goal is to generate the txt files.)

Python 128 40 Updated Jan 31, 2021

Utilities intended for use with Llama models.

Python 7,638 1,387 Updated Feb 11, 2026

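AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.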

Jupyter Notebook 16,989 2,403 Updated Sep 3, 2025

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 29,464 6,630 Updated Jun 21, 2026

Large Language Model Text Generation Inference

Python 10,863 1,271 Updated Mar 21, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 83,441 18,275 Updated Jun 21, 2026

Inference code for Llama models

Python 59,465 9,791 Updated Jan 26, 2025

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 7,909 700 Updated Jun 18, 2026

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python 7,108 791 Updated Jun 17, 2026

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python 7,232 398 Updated Jul 11, 2024

paraformer (Chinese ASR) online ONNX Runtime for Python

Python 54 5 Updated Mar 27, 2024

Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API.

Python 18,387 1,870 Updated Jun 21, 2026

Triton CLI is an open source command line interface that enables users to create, deploy, and profile models served by the Triton Inference Server.

Python 74 5 Updated Jun 10, 2026

Doing simple retrieval from LLM models at various context lengths to measure accuracy

Jupyter Notebook 2,317 247 Updated Jun 8, 2026

Coral issue tracker (and legacy Edge TPU API source)

C++ 477 130 Updated Oct 27, 2021

LLMem: GPU Memory Estimation for Fine-Tuning Pre-Trained LLMs

Python 30 3 Updated May 31, 2025

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Python 13,607 909 Updated Dec 17, 2024

A Next-Generation Training Engine Built for Ultra-Large MoE Models

Python 5,150 425 Updated Jun 18, 2026

Go ahead and axolotl questions

Python 12,070 1,373 Updated Jun 21, 2026

Instruction Tuning with GPT-4

HTML 4,334 309 Updated Jun 11, 2023

QLoRA: Efficient Finetuning of Quantized LLMs

Jupyter Notebook 10,934 874 Updated Jun 10, 2024