Yiming992
🎯
Focusing

Pinned

  1. DeepEP Public

    Forked from deepseek-ai/DeepEP

    DeepEP: an efficient expert-parallel communication library

    Cuda

  2. DeepGEMM Public

    Forked from deepseek-ai/DeepGEMM

    DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

    Cuda

  3. sglang Public

    Forked from sgl-project/sglang

    SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.

    Python

  4. cutlass Public

    Forked from NVIDIA/cutlass

    CUDA Templates for Linear Algebra Subroutines

    C++

  5. nanochat Public

    Forked from karpathy/nanochat

    The best ChatGPT that $100 can buy.

    Python

  6. SageAttention Public

    Forked from thu-ml/SageAttention

    Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.

    Cuda