Stars
[Embodied-AI-Survey-2024] Paper list and projects for Embodied AI
Repository for the QUIK project, enabling the use of 4bit kernels for generative inference
Code Repository of Evaluating Quantized Large Language Models
Awesome LLM compression research papers and tools.
[HPCA 2023] ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design
This is originally a collection of papers on neural network accelerators. Now it's more like my selection of research on deep learning and computer architecture.
A TensorFlow+Keras implementation of "Sample-level CNN Architectures for Music Auto-tagging Using Raw Waveforms"
Code for the paper "FuSeConv: Fully Separable Convolutions for Fast Inference on Systolic Arrays", published at DATE 2021
This is a collection of our zero-cost NAS and efficient vision applications.
Ibex is a small 32-bit RISC-V CPU core, previously known as zero-riscy.
DO NOT CHECK OUT THESE FILES FROM GITHUB UNLESS YOU KNOW WHAT YOU ARE DOING. (See below.)
Sliding DFT for FPGA, targeting Lattice ICE40 1k
Verilog module for calculation of FFT.
Noise reduction in python using spectral gating (speech, bioacoustics, audio, time-domain signals)
Must-have Verilog and SystemVerilog modules
Everything we actually know about the Apple Neural Engine (ANE)
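Implements an NPU compute unit on an FPGA. It can perform matrix operations (ADD/ADDi/ADDs/MULT/MULTi/DOT, etc.), image-processing operations (CONV/POOL, etc.), and nonlinear mappings (RELU/TANH/SIGM, etc.).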
Deep Learning Accelerator (Convolutional Neural Networks)
FPGA implementation of Cellular Neural Network (CNN)
This is a fully parameterized Verilog implementation of computation kernels for accelerating the inference of Convolutional Neural Networks on FPGAs
Datasets, Transforms and Models specific to Computer Vision