TKONIY

🌋

Working on Data x AI

Yangshen Deng TKONIY

🚀 PhD student at the University of Edinburgh

126 followers · 163 following

University of Edinburgh
Shenzhen, China
https://dengyangshen.netlify.app

Achievements

x3 x2

Highlights

Developer Program Member
Pro

Lists (5)

Starred repositories

anthropics / jacobian-lens

Companion code for the global workspace interpretability paper

Python 1,560 221 Updated Jul 17, 2026

deepseek-ai / FlashMLA

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 12,767 1,103 Updated Apr 30, 2026

lightseekorg / tokenspeed

TokenSpeed is a speed-of-light LLM inference engine.

Python 1,656 197 Updated Jul 24, 2026

HazyResearch / Megakernels

Kernels, of the mega variety :)

Python 786 64 Updated May 26, 2026

mstar-project / mstar

A high-performance, universal serving framework for any-to-any models.

Python 56 10 Updated Jul 24, 2026

CedricVerlinden / cursor-dark

A copy of the Cursor AI editor theme from cursor.sh, repackaged for use in Visual Studio Code without needing the Cursor editor.

37 4 Updated Sep 17, 2025

batchgen-project / batchgen

High-Throughput Batch Inference

C++ 13 1 Updated Jul 14, 2026

princeton-nlp / ProLong

Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"

Python 261 15 Updated Sep 12, 2025

hpdps-group / ElasticMM

ElasticMM: Elastic and Efficient MLLM Serving System

Python 44 2 Updated May 10, 2026

togethercomputer / ParallelKernelBench

Python 44 Updated Jul 1, 2026

uccl-project / CommBench

Can LLMs Write Correct and Efficient GPU Communication Code?

Python 47 2 Updated Jul 7, 2026

deepseek-ai / DeepSpec

DeepSpec: a full-stack codebase for training and evaluating speculative decoding algorithms

Python 6,761 626 Updated Jul 9, 2026

jianuo-huang / Domino

Official implementation of “Domino: Decoupling Causal Modeling from Autoregressive Drafting in Speculative Decoding”.

Python 122 7 Updated Jul 17, 2026

shengshu-ai / TurboServe

TurboServe: Serving Streaming Video Generation Efficiently and Economically

Python 37 2 Updated Jul 12, 2026

vklimkov-nvidia / Speech

Forked from erastorgueva-nv/NeMo

NeMo: a toolkit for conversational AI

Python 1 Updated Jul 21, 2026

NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 13,182 2,389 Updated Jul 7, 2026

Ji-shuo / MRAgent

Python 222 22 Updated Jun 8, 2026

NVlabs / SparDA

Sparse Decoupled Attention for Efficient Long-Context LLM Inference

Python 51 2 Updated Jun 4, 2026

cloudcores / CuAssembler

An unofficial CUDA assembler, for all generations of SASS, hopefully :)

Python 609 110 Updated Apr 20, 2023

openai / mle-bench

MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering

Python 1,655 257 Updated Apr 24, 2026

uccl-project / uccl

UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)

C++ 1,470 164 Updated Jul 24, 2026

NVIDIA / flashdreams

High-performance inference and serving library for interactive autoregressive video and world models

Python 416 40 Updated Jul 22, 2026

maplibre / maplibre-gl-js

MapLibre GL JS - Interactive vector tile maps in the browser

TypeScript 11,133 1,148 Updated Jul 24, 2026

NVlabs / cutile-rs

cuTile Rust provides a safe, tile-based kernel programming DSL for the Rust programming language. It features a safe host-side API for passing tensors to asynchronously executed kernel functions.

Rust 713 50 Updated Jul 23, 2026