  • Math.SDU
  • JiNan ShanDong china

124 results for source starred repositories

Low overhead tracing library and trace visualizer for pipelined CUDA kernels

C 65 3 Updated Nov 9, 2025

The official implementation of OSDI'25 paper BlitzScale

Rust 36 2 Updated Sep 20, 2025

UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)

C++ 946 82 Updated Nov 10, 2025

A tiny deep learning training framework implemented from scratch in C++ that follows PyTorch's API.

C++ 122 23 Updated Nov 1, 2025

Tiny C++ LLM inference implementation from scratch

C++ 91 13 Updated Sep 9, 2025

MessagePack is an extremely efficient object serialization library. It's like JSON, but very fast and small.

7,344 522 Updated Aug 10, 2024

Optimized string search routines for Rust.

Rust 1,259 124 Updated Sep 25, 2025

torchcomms: a modern PyTorch communications API

C++ 250 29 Updated Nov 10, 2025

A collection of all of Hu Chenfeng's (户晨风) content

954 121 Updated Oct 12, 2025

🚀🚀 "Large model": train a 26M-parameter GPT completely from scratch in just 2 hours! 🌏

Python 32,991 3,834 Updated Nov 10, 2025

We are committed to open-sourcing quantitative knowledge and translating it into Chinese, aiming to bridge the information gap between the domestic and international quantitative finance industries…

2,530 197 Updated Oct 18, 2025

A Chinese-language financial trading framework based on multi-agent LLMs - the Chinese-enhanced edition of TradingAgents

Python 12,562 2,694 Updated Nov 10, 2025

A very fast linker for Linux

Rust 2,972 77 Updated Nov 10, 2025

Open-source edition of the textbook "Fundamentals of Computer Architecture" (《计算机体系结构基础》, Hu Weiwu et al., 3rd edition)

TeX 3,291 311 Updated Dec 9, 2024

muvm - run programs from your system in a microVM

Rust 708 43 Updated Oct 31, 2025

Writing an OS in 1,000 lines.

C 3,076 241 Updated Nov 3, 2025

A scalable file analysis and data generation platform that allows users to easily orchestrate arbitrary docker/vm/shell tools at scale.

Rust 959 114 Updated Nov 7, 2025

Safe Rust bindings to POSIX-ish APIs

Rust 1,807 224 Updated Nov 10, 2025

Legacy-Mess Detector – assess the “legacy-mess level” of your code and output a beautiful report

Go 6,055 288 Updated Nov 9, 2025

This repository contains a 90-day cybersecurity study plan, along with resources and materials for learning various cybersecurity concepts and technologies. The plan is organized into daily tasks, …

12,079 1,334 Updated Sep 2, 2025

C++20 coroutines with epoll and a queue

C++ 110 19 Updated Dec 25, 2023

A record of the Yang Jingyuan (Wuhan University) incident. Update: downloads of Yang Jingyuan's thesis have passed 310,000.

389 23 Updated Sep 5, 2025

A repository documenting the problems in the thesis of Yang Jingyuan of Wuhan University

HTML 3,702 233 Updated Aug 13, 2025

Patterns and resources of low latency programming.

737 29 Updated Jul 30, 2025

Library for specialized dense and sparse matrix operations, and deep learning primitives.

C 917 197 Updated Oct 10, 2025

Stepwise optimizations of DGEMM on CPU, eventually outperforming Intel MKL, even under multithreading.

C 154 29 Updated Feb 3, 2022

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

Cuda 2,648 259 Updated Nov 6, 2025

🌈 Solutions of LeetGPU

Cuda 46 3 Updated Oct 31, 2025

Nano vLLM

Python 8,622 1,046 Updated Nov 3, 2025

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 4,247 422 Updated Nov 10, 2025