sunshinemyson
  • verisilicon.com
  • CD


SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs

Python 168 14 Updated Sep 23, 2025

Topics in Machine Learning Accelerator Design

90 22 Updated Feb 16, 2023

Shared Middle-Layer for Triton Compilation

MLIR 1 Updated Jan 7, 2025

FlagGems is an operator library for large language models implemented in the Triton Language.

Python 750 149 Updated Nov 10, 2025

A PyTorch native platform for training generative AI models

Python 4,667 599 Updated Nov 10, 2025
Jupyter Notebook 127 39 Updated Nov 9, 2025

PrIM (Processing-In-Memory benchmarks) is the first benchmark suite for a real-world processing-in-memory (PIM) architecture. PrIM is developed to evaluate, analyze, and characterize the first publ…

C 162 58 Updated Apr 29, 2024

Ramulator 2.0 is a modern, modular, extensible, and fast cycle-accurate DRAM simulator. It provides support for agile implementation and evaluation of new memory system designs (e.g., new DRAM stan…

C++ 430 112 Updated Oct 20, 2025

Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.

Python 574 70 Updated Sep 11, 2024

LLM training in simple, raw C/CUDA

Cuda 28,113 3,282 Updated Jun 26, 2025

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 4,677 319 Updated Aug 19, 2025

The PJRT plugin implementation for VeriSilicon NPU IP, for the TensorFlow, PyTorch, and other ecosystems.

C++ 7 1 Updated Apr 28, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 62,628 11,144 Updated Nov 10, 2025

Enable VeriSilicon's NPU on the Android platform through NNAPI

C++ 5 5 Updated Jan 21, 2025

My self-study notes, updated for life

Python 3,920 483 Updated Nov 8, 2025

VeriSilicon Tensor Interface Module

C 5 Updated Oct 13, 2023

Shared Middle-Layer for Triton Compilation

MLIR 304 79 Updated Oct 27, 2025
Shell 77 8 Updated Sep 9, 2022

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 152,330 31,100 Updated Nov 9, 2025

Code release for ConvNeXt model

Python 6,182 726 Updated Jan 8, 2023

GitHub Action for clang-format checking

C 126 40 Updated Nov 8, 2025

VeriSilicon Tensor Interface Module

C 239 87 Updated Oct 13, 2025

The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.

C++ 1,675 611 Updated Nov 8, 2025

Acuitylite is an end-to-end neural network deployment tool

Roff 18 5 Updated Nov 4, 2025

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 8,751 1,521 Updated Nov 7, 2025
Python 242 60 Updated Mar 31, 2023

Visually explore, understand, and present your data.

TypeScript 7,024 565 Updated Oct 17, 2025

Lean Algorithmic Trading Engine by QuantConnect (Python, C#)

C# 12,722 3,800 Updated Nov 5, 2025

A GUI client for Windows, Linux, and macOS that supports Xray, sing-box, and others

C# 89,753 13,606 Updated Nov 9, 2025