Tile-Based Runtime for Ultra-Low-Latency LLM Inference

Python 1,349 82 Updated Jun 8, 2026

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda 11,245 1,149 Updated May 29, 2026

A framework for building native applications using React

C++ 125,992 25,174 Updated Jun 12, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 378,339 79,124 Updated Jun 12, 2026

Flash Attention in ~100 lines of CUDA (forward pass only)

Cuda 1,158 113 Updated Dec 30, 2024

Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).

Python 2,394 285 Updated Feb 20, 2026

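Proxy provider ("airport") recommendations / SSR and V2ray node subscription providers / direct-access mirrors / tool recommendations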

12,210 1,184 Updated Apr 10, 2026

Proxy provider ("airport") recommendations and reviews

15,154 361 Updated Jun 12, 2026

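Table screenshots of the "Energy and General Nutrient Composition of Foods" section of the China Food Composition Tables, Standard Edition (6th Edition), along with JSON files converted to a specific format.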

Python 257 80 Updated Dec 6, 2025

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 3,391 544 Updated Jun 12, 2026

🐳 A curated list of Docker resources and projects

36,200 3,325 Updated Jun 5, 2026

MathJax source code for version 3 and beyond

TypeScript 2,370 240 Updated Jun 11, 2026

The official Python library for the OpenAI API

Python 30,976 4,831 Updated Jun 11, 2026

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 5,564 846 Updated Jun 12, 2026

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 7,368 1,042 Updated Jun 4, 2026

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

8,003 287 Updated May 15, 2025

A Datacenter Scale Distributed Inference Serving Framework

Rust 7,246 1,238 Updated Jun 12, 2026

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 12,701 1,059 Updated Apr 30, 2026
Python 4,522 492 Updated Apr 22, 2026

Optimized primitives for collective multi-GPU communication

C++ 4,803 1,295 Updated Jun 12, 2026

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 9,885 1,906 Updated Jun 11, 2026

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 100,682 27,990 Updated Jun 12, 2026

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

Cuda 1,218 212 Updated Jun 12, 2026

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 58,606 6,411 Updated Apr 30, 2026

</> htmx - high power tools for HTML

JavaScript 48,190 1,602 Updated Jun 12, 2026

Ascend PyTorch adapter (torch_npu). Mirror of https://gitcode.com/Ascend/pytorch

Python 535 72 Updated Jun 12, 2026

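"A Guide to Consuming Open-Source Large Models": a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying open-source large language models (LLMs) and multimodal large models (MLLMs), from China and abroad, in a Linux environment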

Jupyter Notebook 30,868 3,019 Updated Jun 3, 2026

📂 Web File Browser

Go 35,062 3,877 Updated Jun 11, 2026

Structured Outputs

Python 13,952 708 Updated May 18, 2026