fp32

Here are 6 public repositories matching this topic...

tmarhguy / mac

High-Efficiency 16-bit BFloat16 Multiply-Accumulate (MAC) Unit for ML Acceleration. Verified for SkyWater 130nm (TinyTapeout 07). Includes FP32 accumulation and streaming I/O.

machine-learning asic rtl verilog systemverilog hardware-acceleration cocotb bfloat16 openlane sky130 fp32 tinytapeout librelane

Updated Feb 16, 2026
Python

Dartayous / FP16-vs-FP32-A-GPU-Lab-in-Frames

A reproducible GPU benchmarking lab that compares FP16 vs FP32 training on MNIST using PyTorch, CuPy, and Nsight profiling tools. This project blends performance engineering with cinematic storytelling—featuring NVTX-tagged training loops, fused CuPy kernels, and a profiler-driven README that narrates the GPU’s inner workings frame by frame.

performance-engineering deep-learning reproducible-research cuda pytorch fp16 cupy mixed-precision nsight gpu-benchmark nvtx fp32 tensor-core

Updated Apr 25, 2026
Python

yasser1-0 / FP16-vs-FP32-A-GPU-Lab-in-Frames

🎬 Explore GPU training efficiency with FP32 vs FP16 in this modular lab, utilizing Tensor Core acceleration for deep learning insights.

performance-engineering deep-learning reproducible-research cuda pytorch fp16 cupy mixed-precision nsight gpu-benchmark nvtx fp32 tensor-core

Updated Feb 20, 2026
Python

CodeGeekR / benchmark-ai-tops

AI Performance Benchmark Tool. Unifies CPU, GPU (Metal), and NPU (Neural Engine) tests in GFLOPS and TOPS.

benchmarking cpu ai tensorflow gpu numpy pytorch nvidia tops npu bechmark apple-silicon fp32

Updated Jan 23, 2026
Python

Adxell / CNN-downcasting-fashion-model

The goal of this repository is to downcast a Fashion MNIST model from FP32 to FP16 using PyTorch.

cnn-model fp16 fp32

Updated Sep 22, 2025
Python

calebzf / benchmark-ai-tops

🔍 Benchmark AI performance on Apple Silicon with a unified tool for CPU, GPU, and NPU testing, leveraging advanced strategies for accurate results.

benchmarking cpu ai tensorflow gpu numpy pytorch nvidia tops npu bechmark apple-silicon fp32

Updated May 21, 2026
Python

