Latency performance measurement framework for ExecuTorch models with SME2 acceleration. Enables operator-level performance analysis, bottleneck identification, and automated reporting.

Python 5 1 Updated Jun 11, 2026

rehohoho / onnx2versal

Generate Versal system design from ONNX model. AI engine kernels. Sub-microsecond speeds for autoencoders.

C++ 18 1 Updated Dec 29, 2024

microsoft / brainsmith

Open-source AI acceleration on FPGA: from ONNX to RTL

Python 54 7 Updated Jun 4, 2026

Xilinx / XRT

Run Time for AIE and FPGA based platforms

C++ 667 536 Updated Jun 10, 2026

Xilinx / device-tree-xlnx

Linux device tree generator for the Xilinx SDK (Vivado > 2014.1)

Tcl 237 204 Updated May 14, 2026

PacktPublishing / Mastering-Embedded-Linux-Development

Mastering Embedded Linux Development Fourth Edition, published by Packt

C 101 35 Updated Apr 22, 2026

arc-research-lab / CHARM

CHARM: Composing Heterogeneous Accelerators on Heterogeneous SoC Architecture

C++ 173 24 Updated Mar 12, 2026

enyac-group / MaxEVA

MaxEVA: Maximizing the Efficiency of Matrix Multiplication on Versal AI Engine (accepted as full paper at FPT'23)

C++ 22 2 Updated Apr 17, 2024

siemens / jailhouse

Linux-based partitioning hypervisor

C 1,935 358 Updated May 18, 2024

google-coral / coralnpu

A machine learning accelerator core designed for energy-efficient AI at the edge.

Emacs Lisp 2,369 294 Updated Jun 11, 2026

mit-han-lab / tinyengine

[NeurIPS 2020] MCUNet: Tiny Deep Learning on IoT Devices; [NeurIPS 2021] MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning; [NeurIPS 2022] MCUNetV3: On-Device Training Under 2…

C 949 161 Updated Nov 27, 2024

andem25 / Sleep-Detector-for-Kria-KV260-Vision-AI-Starter-Kit

Sleep detector using Kria KV260 AI vision and Bluecoin

Python 3 Updated Jun 11, 2025

pulp-platform / dory

A tool to deploy Deep Neural Networks on PULP-based SoC's

Python 94 24 Updated Aug 4, 2025

niklasnolte / MonotonicNetworks

Python 57 13 Updated May 29, 2024

ntuaislab / BRONet

[ICML 2025 Spotlight] Enhancing Certified Robustness via Block Reflector Orthogonal Layers and Logit Annealing Loss

Python 7 1 Updated May 30, 2025

Xilinx / finn

Dataflow compiler for QNN inference on FPGAs

Python 1,007 300 Updated Jun 11, 2026

cs-jsi / chisel4ml

Scala 18 1 Updated Jan 22, 2025

Tommaso Baldi balditommaso

Stars

jafermarq / WinogradAwareNets

uxlfoundation / oneDNN

Tencent / ncnn

andravin / wincnn

dimdano / aie4ml

KULeuven-MICAS / snax_cluster

legion1581 / go2_firmware_tools

quark0 / darts

microsoft / LoRA

davda54 / sam

DependableSystemsLab / TensorFI-BinaryFI

tancheng / CGRA-Flow

fzi-peccia / tvm

ArmDeveloperEcosystem / sme-executorch-profiling