lovelydett

🚗

Auto-driving

Yuting Xie lovelydett

[ OK ] Done building Yuting Xie. Enjoy!

25 followers · 23 following

Huawei Canada
Markham, Canada
@lovelydett

Lists (2)

🔮 Future ideas

Recent

recent

4 repositories

Stars

10 stars written in Cuda

deepseek-ai / DeepEP

DeepEP: an efficient expert-parallel communication library

Cuda · 8,830 stars · 1,037 forks · Updated Dec 24, 2025

siboehm / SGEMM_CUDA

Fast CUDA matrix multiplication from scratch

Cuda · 987 stars · 149 forks · Updated Sep 2, 2025

Tongkaio / CUDA_Kernel_Samples

CUDA operators hand-written from scratch, plus an interview guide

Cuda · 743 stars · 82 forks · Updated Aug 23, 2025

rapidsai / cuvs

cuVS - a library for vector search and clustering on the GPU

Cuda · 599 stars · 150 forks · Updated Dec 24, 2025

Cjkkkk / CUDA_gemm

A simple high performance CUDA GEMM implementation.

Cuda · 421 stars · 42 forks · Updated Jan 4, 2024

yzhaiustc / Optimizing-SGEMM-on-NVIDIA-Turing-GPUs

Optimizing SGEMM kernel functions on NVIDIA GPUs to close-to-cuBLAS performance.

Cuda · 397 stars · 52 forks · Updated Jan 2, 2025

salykova / sgemm.cu

High-Performance SGEMM on CUDA devices

Cuda · 114 stars · 5 forks · Updated Jan 21, 2025

SJTU-IPADS / reef

REEF is a GPU-accelerated DNN inference serving system that enables instant kernel preemption and biased concurrent execution in GPU scheduling.

Cuda · 103 stars · 11 forks · Updated Dec 24, 2022

yalue / cuda_scheduling_examiner_mirror

A tool for examining GPU scheduling behavior.

Cuda · 89 stars · 20 forks · Updated Aug 17, 2024

RC4ML / Hyperion

Cost-efficient Out-of-core GNN Training System on TB-scale Graph [ICDE 25]

Cuda · 22 stars · 4 forks · Updated Jan 6, 2025
