- Ph.D. Candidate @ CUHK-MMLab, B.E. @ UCAS
- Hong Kong
- https://jf-d.github.io/
Stars
- LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
- Ongoing research training transformer models at scale
- An Open Source Machine Learning Framework for Everyone
- SGLang is a fast serving framework for large language models and vision language models.
- verl: Volcano Engine Reinforcement Learning for LLMs
- Tensors and Dynamic neural networks in Python with strong GPU acceleration
- Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
- A collection of full-time roles in SWE, Quant, and PM for new grads.
- Making large AI models cheaper, faster and more accessible
- MSCCL++: A GPU-driven communication stack for scalable AI applications
- 🚀 Efficient implementations of state-of-the-art linear attention models
- Open-source high-performance RISC-V processor
- Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
- FlagGems is an operator library for large language models implemented in the Triton Language.
- FlashInfer: Kernel Library for LLM Serving
- The Triton Inference Server provides an optimized cloud and edge inferencing solution.
- A high-throughput and memory-efficient inference and serving engine for LLMs
- An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
- TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
- Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
- Seamless operability between C++11 and Python
- Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerator kernels
- xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
- An extremely fast, scalable memory engine and app: the Memory API for the AI era.
- Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
- Hackable and optimized Transformers building blocks, supporting a composable construction.
- Tile primitives for speedy kernels