matmult

A floating-point matrix multiplication implemented in hardware.

This repo contains an implementation of floating-point matrix multiplication for the PYNQ-Z1 development board.

The hardware module implements the matrix product C = AB, where A, B, and C are 128 x 128 matrices.
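For reference, the product computed by the hardware module can be written as a naive triple loop in software. The sketch below is not the code in the hls directory; the function name `matmult_reference`, the `Matrix` alias, and the flat row-major layout are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Matrix dimension stated in this README: A, B, and C are all N x N.
constexpr std::size_t N = 128;

// Hypothetical storage: a row-major N x N matrix in a flat vector.
using Matrix = std::vector<float>;

// Naive O(N^3) product C = A * B. Software reference only, not the HLS source.
Matrix matmult_reference(const Matrix& a, const Matrix& b) {
    Matrix c(N * N, 0.0f);
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            // Dot product of row i of A with column j of B.
            for (std::size_t k = 0; k < N; ++k) {
                acc += a[i * N + k] * b[k * N + j];
            }
            c[i * N + j] = acc;
        }
    }
    return c;
}
```

A reference like this can be used to check the accelerator output element by element, keeping in mind that floating-point results may differ slightly if the hardware accumulates in a different order.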

This hardware accelerator runs about 2.8x faster than NumPy. Note that NumPy benefits from vectorization and, presumably, from a more efficient algorithm than the naive one implemented in this example.

A 3.5x speedup can be achieved by using the 64-bit AXI-Stream interface, at the cost of additional logic to pack and unpack the matrices.
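The packing step mentioned above can be pictured as placing two 32-bit floats in each 64-bit stream word. The following is a minimal software sketch of that idea, not the repository's logic: the function names `pack_floats` and `unpack_floats` are hypothetical, and putting the even-indexed element in the low half of the word is an assumed ordering.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Pack 32-bit floats into 64-bit words, two per word.
// Assumed ordering: element 2i in the low 32 bits, element 2i+1 in the high 32 bits.
std::vector<std::uint64_t> pack_floats(const std::vector<float>& in) {
    std::vector<std::uint64_t> out((in.size() + 1) / 2, 0);
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t bits;
        std::memcpy(&bits, &in[i], sizeof bits);  // copy the raw bit pattern
        out[i / 2] |= static_cast<std::uint64_t>(bits) << (32 * (i % 2));
    }
    return out;
}

// Inverse of pack_floats: recover `count` floats from the packed words.
std::vector<float> unpack_floats(const std::vector<std::uint64_t>& in,
                                 std::size_t count) {
    std::vector<float> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        std::uint32_t bits =
            static_cast<std::uint32_t>(in[i / 2] >> (32 * (i % 2)));
        std::memcpy(&out[i], &bits, sizeof bits);
    }
    return out;
}
```

With this layout a 128 x 128 matrix needs half as many stream transfers as with a 32-bit interface, which is the source of the extra pack and unpack work on both sides.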

Repo Organization

  • [hls] contains the accelerator C++ source code for high-level synthesis.
  • [boards/Pynq-Z1/matmult] contains the Vivado project.
  • [notebooks] contains the Jupyter Notebook to evaluate the design. This notebook uses the Xilinx/PYNQ Python library.
  • [overlay] contains the generated hardware files. These files were generated using vivado and vivado_hls version 2019.2.

Installation

Build

Building requires Xilinx vivado and vivado_hls version 2019.2. If necessary, a different version can be configured in the Tcl scripts script_solution1.tcl and matmult.tcl.

  • Build the matmult module:
    cd hls
    make clean && make solution1
  • Build the Vivado project:
    cd boards/Pynq-Z1/matmult
    make clean && make all

Credits
