  • verisilicon.com
  • CD


SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs

Python 168 14 Updated Sep 23, 2025

Topics in Machine Learning Accelerator Design

90 22 Updated Feb 16, 2023

Shared Middle-Layer for Triton Compilation

MLIR 1 Updated Jan 7, 2025

FlagGems is an operator library for large language models implemented in the Triton Language.

Python 747 148 Updated Nov 6, 2025

A PyTorch native platform for training generative AI models

Python 4,654 595 Updated Nov 6, 2025
Jupyter Notebook 126 39 Updated Oct 31, 2025

PrIM (Processing-In-Memory benchmarks) is the first benchmark suite for a real-world processing-in-memory (PIM) architecture. PrIM is developed to evaluate, analyze, and characterize the first publ…

C 162 58 Updated Apr 29, 2024

Ramulator 2.0 is a modern, modular, extensible, and fast cycle-accurate DRAM simulator. It provides support for agile implementation and evaluation of new memory system designs (e.g., new DRAM stan…

C++ 423 111 Updated Oct 20, 2025

Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.

Python 571 70 Updated Sep 11, 2024

LLM training in simple, raw C/CUDA

Cuda 28,083 3,265 Updated Jun 26, 2025

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 4,662 319 Updated Aug 19, 2025

The pjrt-plugin implementation for VeriSilicon NPU IP, targeting TensorFlow, PyTorch, and other ecosystems.

C++ 7 1 Updated Apr 28, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 62,259 11,065 Updated Nov 6, 2025

Enable VeriSilicon's NPU on the Android platform via NNAPI

C++ 5 5 Updated Jan 21, 2025

My self-study notes, updated for life

Python 3,919 483 Updated Nov 5, 2025

VeriSilicon Tensor Interface Module

C 5 Updated Oct 13, 2023

Shared Middle-Layer for Triton Compilation

MLIR 304 79 Updated Oct 27, 2025
Shell 77 8 Updated Sep 9, 2022

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 152,159 31,060 Updated Nov 6, 2025

Code release for ConvNeXt model

Python 6,178 725 Updated Jan 8, 2023

GitHub Action for clang-format checking

C 125 40 Updated Nov 1, 2025

VeriSilicon Tensor Interface Module

C 239 88 Updated Oct 13, 2025

The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.

C++ 1,670 610 Updated Nov 4, 2025

Acuitylite is an end-to-end neural network deployment tool

Roff 18 5 Updated Nov 4, 2025

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 8,733 1,518 Updated Nov 5, 2025
Python 241 60 Updated Mar 31, 2023

Visually explore, understand, and present your data.

TypeScript 6,942 563 Updated Oct 17, 2025

Lean Algorithmic Trading Engine by QuantConnect (Python, C#)

C# 12,692 3,790 Updated Nov 5, 2025

A GUI client for Windows, Linux, and macOS, supporting Xray, sing-box, and others

C# 89,493 13,580 Updated Nov 5, 2025