JF-D
🎯
Focusing

242 results for source starred repositories

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 3,717 283 Updated Nov 12, 2025

LLM inference in C/C++

C++ 89,624 13,652 Updated Nov 12, 2025

Ongoing research training transformer models at scale

Python 14,170 3,268 Updated Nov 12, 2025

An Open Source Machine Learning Framework for Everyone

C++ 192,407 74,983 Updated Nov 12, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 20,061 3,338 Updated Nov 12, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 15,363 2,485 Updated Nov 12, 2025

Universal memory layer for AI Agents

Python 42,982 4,646 Updated Nov 12, 2025

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 94,981 25,867 Updated Nov 12, 2025

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 39,785 6,894 Updated Nov 12, 2025

A collection of full time roles in SWE, Quant, and PM for new grads.

15,592 1,229 Updated Nov 12, 2025

Making large AI models cheaper, faster and more accessible

Python 41,235 4,538 Updated Nov 12, 2025

MSCCL++: A GPU-driven communication stack for scalable AI applications

C++ 433 72 Updated Nov 12, 2025

🚀 Efficient implementations of state-of-the-art linear attention models

Python 3,817 299 Updated Nov 12, 2025

Open-source high-performance RISC-V processor

Scala 6,726 833 Updated Nov 12, 2025

Checkpoint-engine is a simple middleware to update model weights in LLM inference engines

Python 820 65 Updated Nov 12, 2025

FlagGems is an operator library for large language models implemented in the Triton Language.

Python 753 150 Updated Nov 12, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 4,052 564 Updated Nov 12, 2025

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Python 10,014 1,670 Updated Nov 12, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 62,827 11,215 Updated Nov 12, 2025

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python 2,278 146 Updated Nov 12, 2025

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

C++ 12,106 1,859 Updated Nov 12, 2025

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

Python 40,489 3,127 Updated Nov 12, 2025

Seamless operability between C++11 and Python

C++ 17,437 2,234 Updated Nov 12, 2025

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

C++ 3,900 309 Updated Nov 12, 2025

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 2,390 279 Updated Nov 12, 2025

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 152,421 31,115 Updated Nov 11, 2025

Memory engine and app that is extremely fast, scalable. The Memory API for the AI era.

TypeScript 13,489 1,401 Updated Nov 11, 2025

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

C++ 1,942 149 Updated Nov 11, 2025

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 10,091 733 Updated Nov 11, 2025

Tile primitives for speedy kernels

Cuda 2,881 194 Updated Nov 11, 2025