
Awesome Agent Skills collection list, papers, tools, projects, and resources

44 Updated Feb 16, 2026

A collection of Docker files for the RTEMS RTOS tools and BSP builds

Dockerfile 13 7 Updated Dec 14, 2021

High-performance, light-weight C++ LLM and VLM Inference Software for Physical AI

C++ 339 52 Updated Mar 19, 2026

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 5,143 360 Updated Apr 9, 2026

Lumina Robotics Talent Call | Lumina Community Embodied AI Talent Board | A list for Embodied AI / Robotics Jobs (PhD, RA, intern, etc.)

1,383 26 Updated Feb 25, 2026

Tensor library for machine learning

C++ 14,441 1,551 Updated Apr 14, 2026

LLM inference in C/C++

C++ 103,723 16,859 Updated Apr 15, 2026

Large Language Model (LLM) Systems Paper List

1,923 98 Updated Mar 24, 2026

Open-source Windows and Office activator featuring HWID, Ohook, TSforge, and Online KMS activation methods, along with advanced troubleshooting.

Batchfile 172,026 16,537 Updated Mar 9, 2026

Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond

Python 854 99 Updated Apr 7, 2026

A tiny yet powerful LLM inference system tailored for research purposes. vLLM-equivalent performance with only 2k lines of code (2% of vLLM).

Python 323 37 Updated Jun 10, 2025

[ArXiv 2025] A curated list of papers on on-device large language models, focusing on model compression and system optimization techniques from the survey "On-Device Large Language Models: A Survey…

30 3 Updated Jan 27, 2026

Tile-Based Runtime for Ultra-Low-Latency LLM Inference

Python 705 42 Updated Mar 8, 2026

A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations

Python 16,975 1,263 Updated Apr 14, 2026
Python 3 1 Updated Mar 15, 2026

A collection of compiler learning resources.

Python 2,711 370 Updated Mar 19, 2025

How to optimize some algorithms in CUDA.

Cuda 2,922 268 Updated Apr 9, 2026

NVIDIA Linux open GPU kernel module source

C 16,893 1,658 Updated Apr 3, 2026

Nano vLLM

Python 12,899 1,928 Updated Apr 13, 2026

A Datacenter Scale Distributed Inference Serving Framework

Rust 6,553 1,027 Updated Apr 15, 2026

This is a Chinese translation of the CUDA programming guide

1,933 283 Updated Nov 13, 2024

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 42,125 7,441 Updated Apr 15, 2026

Materials for learning SGLang

799 61 Updated Jan 5, 2026

Artifact from "Hardware Compute Partitioning on NVIDIA GPUs". THIS IS A FORK OF BAKITAS REPO. I AM NOT ONE OF THE AUTHORS OF THE PAPER.

C 59 5 Updated Nov 24, 2025

Jetson embedded platform-target deep learning inference acceleration framework with TensorRT

C++ 30 6 Updated Oct 10, 2025

A tool for examining GPU scheduling behavior.

Cuda 96 22 Updated Aug 17, 2024

[CVPR 2023 Best Paper Award] Planning-oriented Autonomous Driving

Python 4,571 533 Updated Oct 29, 2025

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 12,896 2,344 Updated Apr 13, 2026