6 stars written in Python

A high-throughput and memory-efficient inference and serving engine for LLMs

Python, 74,699 stars, 14,954 forks, updated Mar 30, 2026

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python, 41,872 stars, 7,396 forks, updated Mar 30, 2026

Making large AI models cheaper, faster and more accessible

Python, 41,373 stars, 4,521 forks, updated Mar 16, 2026

Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.

Python, 12,244 stars, 806 forks, updated Mar 23, 2026

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discr…

Python, 8,737 stars, 1,409 forks, updated Jan 28, 2026

TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.

Python, 3,859 stars, 941 forks, updated Jul 10, 2023