mojoBLAS

A high-performance BLAS (Basic Linear Algebra Subprograms) implementation written in Mojo.

Overview

mojoBLAS is a pure-Mojo BLAS implementation focused on performance. It currently includes:

Level 1 BLAS: vector-vector operations such as dot, axpy, nrm2, scal, and more.
Level 2 BLAS: matrix-vector operations such as gemv, ger, triangular and packed matrix-vector routines.
Level 3 BLAS: matrix-matrix operations such as gemm, syrk, syr2k, symm, trmm, and trsm.
Benchmarking suite: comparison against reference/system BLAS implementations.

The routines currently target real scalar data types, selected through Mojo's DType support. Complex types are not yet supported (see Roadmap).
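To make the three BLAS levels listed above concrete, here is a minimal reference sketch in plain Python of one routine per level. It is not the mojoBLAS API: the `ref_` function names are hypothetical, matrices are plain row-major lists of lists, and transpose, stride, and leading-dimension arguments are left out for brevity.

```python
# Reference semantics only; names and signatures are illustrative assumptions,
# not the mojoBLAS interface.

def ref_dot(x, y):
    """Level 1 (vector-vector): return sum_i x[i] * y[i]."""
    return sum(a * b for a, b in zip(x, y))


def ref_gemv(alpha, A, x, beta, y):
    """Level 2 (matrix-vector): return alpha * A @ x + beta * y (no transpose)."""
    return [alpha * ref_dot(row, x) + beta * yi for row, yi in zip(A, y)]


def ref_gemm(alpha, A, B, beta, C):
    """Level 3 (matrix-matrix): return alpha * A @ B + beta * C (no transposes)."""
    cols_of_B = list(zip(*B))  # columns of B, so each entry is a row-by-column dot
    return [
        [alpha * ref_dot(row, col) + beta * c for col, c in zip(cols_of_B, c_row)]
        for row, c_row in zip(A, C)
    ]


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
zero = [[0.0, 0.0], [0.0, 0.0]]

print(ref_dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))   # 32.0
print(ref_gemv(1.0, A, [1.0, 1.0], 0.0, [0.0, 0.0]))  # [3.0, 7.0]
print(ref_gemm(1.0, A, B, 0.0, zero))              # [[19.0, 22.0], [43.0, 50.0]]
```

The amount of arithmetic per element of data grows from Level 1 to Level 3, which is why Level 3 routines such as gemm usually benefit most from blocking and vectorization.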

Installation

Prerequisites

Pixi
Mojo >=1.0.0b1,<2

Modular community

mojoBLAS is available in the modular-community package repository (https://repo.prefix.dev/modular-community). Add the following to the channels list in your pixi.toml file:

channels = ["https://conda.modular.com/max", "https://repo.prefix.dev/modular-community", "conda-forge"]

Then, you can install mojoBLAS using any of these methods:

From the pixi CLI, run the command pixi add mojoblas.
In the pixi.toml file of your project, add the following dependency:
```
mojoblas = "==0.1.0"
```

Then run pixi install to download and install the package.

Use as a dependency

Add the repository to your pixi.toml:

[workspace]
preview = ["pixi-build"]

[dependencies]
mojo = ">=1.0.0b1,<2"
mojoblas = { git = "https://github.com/shivasankarka/mojoBLAS.git", branch = "main" }

Then run:

pixi install

Clone locally

git clone https://github.com/shivasankarka/mojoBLAS.git
cd mojoBLAS
pixi install

Usage

Basic example

from mojoblas.src.level1 import dot, axpy, nrm2

fn main():
    var x = alloc[Float32](3)
    var y = alloc[Float32](3)

    x[0] = 1.0
    x[1] = 2.0
    x[2] = 3.0
    y[0] = 4.0
    y[1] = 5.0
    y[2] = 6.0

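    # Arguments follow the reference BLAS order: n, then each vector with its stride.
    # Dot product of x and y: 1*4 + 2*5 + 3*6 = 32
    print(dot(3, x, 1, y, 1))
    # y <- 2.0 * x + y, updated in place
    axpy(3, 2.0, x, 1, y, 1)
    print(y[0], y[1], y[2])
    # Euclidean norm of x: sqrt(1 + 4 + 9)
    print(nrm2(3, x, 1))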

    x.free()
    y.free()
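
As a cross-check for the example above, the following plain-Python sketch computes the same three quantities under standard BLAS semantics. It assumes positive strides only, and the `ref_` functions are hypothetical helpers, not part of mojoBLAS. The Mojo program works in Float32, so its printed digits may differ slightly from the double-precision values here.

```python
import math


def ref_dot(n, x, incx, y, incy):
    """Sum of x[i*incx] * y[i*incy] for i in 0..n-1 (positive strides assumed)."""
    return sum(x[i * incx] * y[i * incy] for i in range(n))


def ref_axpy(n, alpha, x, incx, y, incy):
    """In-place update y[i*incy] += alpha * x[i*incx] for i in 0..n-1."""
    for i in range(n):
        y[i * incy] += alpha * x[i * incx]


def ref_nrm2(n, x, incx):
    """Euclidean norm of the n strided elements of x (no overflow scaling)."""
    return math.sqrt(sum(x[i * incx] ** 2 for i in range(n)))


x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]

print(ref_dot(3, x, 1, y, 1))  # 32.0
ref_axpy(3, 2.0, x, 1, y, 1)
print(y)                       # [6.0, 9.0, 12.0]
print(ref_nrm2(3, x, 1))       # sqrt(14), about 3.7417
```

The stride arguments let a routine walk every k-th element of a buffer, for example `ref_dot(3, [1, 0, 2, 0, 3], 2, [4, 5, 6], 1)` also gives 32.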

Available routines

Level 1: asum, axpy, copy, dot, iamax, nrm2, rot, rotg, rotm, rotmg, scal, swap
Level 2: gbmv, gemv, ger, sbmv, spmv, spr, spr2, symv, syr, syr2, tbmv, tbsv, tpmv, tpsv, trmv, trsv
Level 3: gemm, symm, syrk, syr2k, trmm, trsm

Testing

Run the test suites with Pixi:

pixi run test_level1
pixi run test_level2
pixi run test_level3

Benchmarking

The repository includes benchmark scripts that compare mojoBLAS against OpenBLAS and, on Apple silicon, Accelerate. To run the full benchmark suite and generate plots, run the following command:

pixi run -e bench bench_all

Outputs

benchmarks/bench_plot_level1.png
benchmarks/bench_plot_level2.png
benchmarks/bench_plot_level3.png

Project structure

src/ - Mojo source for BLAS implementations
tests/ - Mojo tests and reference data
benchmarks/ - benchmark scripts and plots
docs/ - reference documentation

Roadmap

Completed

Level 1 BLAS
Level 2 BLAS
Level 3 BLAS
Benchmarking suite

Future goals

Optimize current algorithms (goal: match OpenBLAS and Accelerate performance, and more :))
Complex number support
GPU acceleration

Contributing

Contributions are welcome. If you find a bug or performance issue, please open an issue or submit a pull request.

License

This project is licensed under the MIT License. See LICENSE for details.

Acknowledgments

This project is inspired by the Netlib BLAS reference implementation:

http://www.netlib.org/blas/

Special thanks to the Mojo and BLAS communities for the tools and ideas that made this project possible.
