🌋
Working on Data x AI

Organizations

@DBGroup-SUSTech

Starred repositories


Memory-efficient multi-layer perceptron implementation in OpenAI Triton.

Python 12 Updated Jan 24, 2025

GPU programming related news and material links

1,728 98 Updated Sep 17, 2025

Material for gpu-mode lectures

Jupyter Notebook 5,144 513 Updated Sep 23, 2025

Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI (Kunlun Inc.), specializing in vision-language reasoning.

Python 2,941 270 Updated Aug 2, 2025

Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video]

Python 706 38 Updated Sep 19, 2025

SkyReels-V2: Infinite-length Film Generative model

Python 4,672 624 Updated Aug 11, 2025
Python 246 18 Updated Sep 21, 2025

Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"

Python 536 43 Updated Oct 8, 2025

Official PyTorch implementation for "Large Language Diffusion Models"

Python 3,012 201 Updated Sep 30, 2025

Ring attention implementation with flash attention

Python 888 84 Updated Sep 10, 2025

ring-attention experiments

Python 152 13 Updated Oct 17, 2024

Pie: Programmable LLM Serving

Rust 25 7 Updated Oct 9, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 14,114 2,515 Updated Oct 9, 2025

PyTorch library for cost-effective, fast and easy serving of MoE models.

Python 244 18 Updated Jul 7, 2025

Fast CUDA matrix multiplication from scratch

Cuda 887 130 Updated Sep 2, 2025

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

C++ 3,345 243 Updated Oct 9, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 3,719 281 Updated Oct 6, 2025

A curated list of Multi-Modal Reinforcement Learning resources (continually updated)

529 20 Updated Sep 12, 2025

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 3,695 290 Updated Jun 12, 2025

[TMLR 2025] Latte: Latent Diffusion Transformer for Video Generation.

Python 1,876 190 Updated Apr 8, 2025

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 7,881 703 Updated May 31, 2024

[ICLR 2025] FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality

Python 250 10 Updated Dec 27, 2024

Wan: Open and Advanced Large-Scale Video Generative Models

Python 14,347 2,024 Updated Jul 17, 2025

A collection of awesome video generation studies.

TeX 640 28 Updated Sep 23, 2025

serverless agents

TypeScript 200 53 Updated Apr 13, 2025
C++ 683 118 Updated Sep 25, 2025

Brain-to-text with test time training

Python 21 Updated Oct 7, 2025

An analytical performance modeling tool for deep neural networks.

Python 91 40 Updated Sep 24, 2020

😼 Elegantly use a clash/mihomo-based proxy environment

Shell 4,763 628 Updated Sep 30, 2025

A Quirky Assortment of CuTe Kernels

Python 612 48 Updated Oct 9, 2025