YOLO-World-ONNX is a Python package for running inference on the YOLO-World open-vocabulary object detection model using ONNX models. It provides an easy-to-use interface for performing inference on im…

Python 17 2 Updated Feb 6, 2026

AILab-CVC / YOLO-World

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Python 6,423 608 Updated Feb 26, 2025

bubbliiiing / count-mAP-txt

An example, built on SSD, of generating the txt files used by mAP plotting code. (The goal is to generate the txt files.)

Python 128 40 Updated Jan 31, 2021

meta-llama / llama-models

Utilities intended for use with Llama models.

Python 7,638 1,387 Updated Feb 11, 2026

deepseek-ai / DeepSeek-V3

Python 103,787 16,738 Updated Aug 28, 2025

Infrasys-AI / AISystem

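AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.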

Jupyter Notebook 16,989 2,403 Updated Sep 3, 2025

sgl-project / sglang

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 29,464 6,630 Updated Jun 21, 2026

huggingface / text-generation-inference

Large Language Model Text Generation Inference

Python 10,863 1,271 Updated Mar 21, 2026

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 83,441 18,275 Updated Jun 21, 2026

meta-llama / llama

Inference code for Llama models

Python 59,465 9,791 Updated Jan 26, 2025

InternLM / lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 7,909 700 Updated Jun 18, 2026

open-compass / opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python 7,108 791 Updated Jun 17, 2026

mit-han-lab / streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python 7,232 398 Updated Jul 11, 2024

lovemefan / paraformer-python

Paraformer (Chinese ASR) online ONNX runtime for Python

Python 54 5 Updated Mar 27, 2024

modelscope / FunASR

Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API.

Python 18,387 1,870 Updated Jun 21, 2026

triton-inference-server / triton_cli

Triton CLI is an open source command line interface that enables users to create, deploy, and profile models served by the Triton Inference Server.

Python 74 5 Updated Jun 10, 2026

triton-inference-server / perf_analyzer

Python 146 44 Updated Jun 9, 2026

gkamradt / needle-in-a-haystack

Doing simple retrieval from LLMs at various context lengths to measure accuracy

Jupyter Notebook 2,317 247 Updated Jun 8, 2026

google-coral / edgetpu

Coral issue tracker (and legacy Edge TPU API source)

C++ 477 130 Updated Oct 27, 2021

taehokim20 / LLMem

LLMem: GPU Memory Estimation for Fine-Tuning Pre-Trained LLMs

Python 30 3 Updated May 31, 2025

microsoft / LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Python 13,607 909 Updated Dec 17, 2024

InternLM / xtuner

A Next-Generation Training Engine Built for Ultra-Large MoE Models

Python 5,150 425 Updated Jun 18, 2026

axolotl-ai-cloud / axolotl

Go ahead and axolotl questions

Python 12,070 1,373 Updated Jun 21, 2026

Instruction-Tuning-with-GPT-4 / GPT-4-LLM

Instruction Tuning with GPT-4

HTML 4,334 309 Updated Jun 11, 2023

artidoro / qlora

QLoRA: Efficient Finetuning of Quantized LLMs

Jupyter Notebook 10,934 874 Updated Jun 10, 2024

Stars

QwenLM / Qwen3-VL

nndeploy / nndeploy

alibaba / MNN

pytorch / executorch

Infrasys-AI / AIInfra

Ziad-Algrafi / yolo-world-onnx