View neur1n's full-sized avatar
🤔
O'RLY

CUDA Tile IR is an MLIR-based intermediate representation and compiler infrastructure for CUDA kernel optimization, focusing on tile-based computation patterns and optimizations targeting NVIDIA te…

MLIR 833 61 Updated Feb 13, 2026

cuTile is a programming model for writing parallel kernels for NVIDIA GPUs

Python 1,926 115 Updated Feb 17, 2026

Helpful kernel tutorials and examples for tile-based GPU programming

Python 645 46 Updated Feb 17, 2026

How to optimize some algorithms in CUDA.

Cuda 2,821 256 Updated Feb 15, 2026

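⭐AI-driven public opinion & trend monitor with multi-platform aggregation, RSS, and smart alerts. 🎯 Say goodbye to information overload: your AI public-opinion monitoring assistant and trending-topic filter! Aggregates trending topics from multiple platforms plus RSS subscriptions, with precise keyword filtering. AI translation and AI analysis briefings are pushed straight to your phone, and it can also connect to the MCP architecture, empowering AI natural-language…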

Python 46,505 22,204 Updated Feb 9, 2026

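微舆: a multi-agent public-opinion analysis assistant that anyone can use. It breaks through information cocoons, restores the true picture of public opinion, predicts future trends, and supports decision-making! Implemented from scratch, with no dependency on any framework.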

Python 35,602 6,813 Updated Feb 11, 2026

A beta Dota 2 bot script that aims to provide a better bot game experience

Lua 209 36 Updated Feb 11, 2026

A love letter to CN Dota

Python 33 2 Updated Nov 12, 2025

Game Preservation Project

6,899 755 Updated Nov 1, 2024

Yet another im-select implementation for Windows

C++ 10 Updated Oct 24, 2025

CUDA Matrix Multiplication Optimization

Cuda 258 24 Updated Jul 19, 2024

This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, s…

Cuda 1,239 179 Updated Jul 29, 2023

Flexible GPGPU instrumentation

C++ 89 31 Updated Oct 10, 2019

Unofficial description of the CUDA assembly (SASS) instruction sets.

Python 201 19 Updated Jul 18, 2025

An unofficial CUDA assembler, for all generations of SASS, hopefully :)

Python 570 99 Updated Apr 20, 2023

Tensor Core Multiplication at the Speed of CuBLAS in Three Simple Steps

Cuda 3 1 Updated Mar 17, 2024

📚 A curated list of awesome matrix-matrix multiplication (A * B = C) frameworks, libraries and software

61 7 Updated Feb 23, 2025

Awesome curated RSS feed links related to Machine Learning, Artificial Intelligence, Reinforcement Learning

258 23 Updated Dec 12, 2021

Awesome RSS feeds - A curated list of RSS feeds (and OPML files) used in the Recommended Feeds and local news sections of Plenary, an RSS reader, article downloader, and podcast player app for Android

1,953 146 Updated Jul 13, 2024

Some special ebooks: e-books that I personally like and that are also rather distinctive

1,458 339 Updated Feb 13, 2026

AI Crash Course to help busy builders catch up to the public frontier of AI research in 2 weeks

5,721 817 Updated Jan 7, 2026

A curated list for Efficient Large Language Models

Python 1,953 152 Updated Jun 17, 2025

Dynamic Memory Management for Serving LLMs without PagedAttention

C 462 38 Updated May 30, 2025

Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.

Python 621 81 Updated Sep 11, 2024

A curated list of foundation models for vision and language tasks

1,141 58 Updated Jun 23, 2025

A generic cross-platform C library that includes many commonly used components and frameworks, and a new scripting language interpreter. It currently supports C99 and Aspect-Oriented Programming (…

C 1,444 208 Updated Jan 31, 2025

Optimizing SGEMM kernel functions on NVIDIA GPUs to a close-to-cuBLAS performance.

Cuda 407 52 Updated Jan 2, 2025

Assembler for NVIDIA Maxwell architecture

Sass 1,060 171 Updated Jan 3, 2023