Pieman is a simple neural network with a configurable number of hidden layers, optimized using AVX vectorization for performance.
- Multi-layer Neural Network: Configurable number of hidden layers.
- AVX Vectorization: Uses AVX instructions to accelerate forward and backward propagation.
- Aligned Memory Allocation: Ensures optimal data alignment for SIMD operations.
- Model Persistence: Supports saving and loading model weights and biases.
- Training Monitoring: Reports loss at set intervals.
- Compiler: GCC with AVX support (`-mavx -mavx2`)
- Libraries: Standard C libraries (`stdio.h`, `stdlib.h`, `math.h`, `immintrin.h`)
- OS: Linux/macOS/Windows (with proper AVX support)
For the optional beta.py demo, install the Python dependencies:
```
pip install -r requirements.txt
```
Use the provided Makefile to build the project:
```
make
```
Run the executable:
```
./nnet
```
Training will start and display epoch-wise loss values. The trained model is saved as `model.bin`.
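If you prefer to bypass the Makefile, a manual build would look roughly like this (a sketch assuming a single `main.c`; note that GCC also needs `-mfma` for the `_mm256_fmadd_pd` intrinsic described below):
```
gcc -O2 -mavx -mavx2 -mfma -o nnet main.c -lm
```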
You can override default parameters at runtime:
| Option | Description | Default |
|---|---|---|
| `-l` | Number of hidden layers | 28 |
| `-s` | Neurons per hidden layer | 28 |
| `-r` | Learning rate | 1e-2 |
| `-e` | Maximum epochs | 1e9 |
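For example, to train a shallower, wider network with a smaller learning rate (assuming the flags combine freely):
```
./nnet -l 4 -s 64 -r 1e-3
```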
- Input Layer: 784 neurons (default)
- Hidden Layers: 28 layers, each with 28 neurons
- Output Layer: 10 neurons
- Activation Function: Sigmoid
- Loss Function: Mean Squared Error (MSE)
- Optimizer: Gradient Descent (the activation and loss formulas are sketched after this list)
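The activation and loss listed above follow the standard formulas. A minimal C sketch (illustrative, not the repository's exact code):
```c
#include <math.h>

/* Sigmoid activation: squashes any input into (0, 1). */
static double sigmoid(double x) {
    return 1.0 / (1.0 + exp(-x));
}

/* Derivative of the sigmoid, expressed in terms of its output s;
   this is the form used during backpropagation. */
static double sigmoid_derivative(double s) {
    return s * (1.0 - s);
}

/* Mean squared error over the n output neurons (n = 10 by default). */
static double mse(const double *pred, const double *target, int n) {
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        double d = pred[i] - target[i];
        sum += d * d;
    }
    return sum / n;
}
```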
All layers allocate memory using 32‑byte alignment so that 256‑bit AVX instructions can operate on four `double` values at a time. The code pads each layer's size to a multiple of four and performs matrix multiplications using `_mm256_load_pd` and `_mm256_fmadd_pd` for efficient fused multiply‑add operations. A helper `horizontal_sum` function reduces the AVX registers to a scalar result, enabling the forward and backward passes to remain vectorized.
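A minimal sketch of that pattern (illustrative, not the repository's exact code; compile with `-mavx2 -mfma`):
```c
#include <immintrin.h>
#include <stdlib.h>

/* Reduce the four lanes of a 256-bit register to one scalar.
   Plays the role of the project's horizontal_sum helper; the exact
   implementation in the repository may differ. */
static double horizontal_sum(__m256d v) {
    __m128d lo   = _mm256_castpd256_pd128(v);    /* lanes 0-1 */
    __m128d hi   = _mm256_extractf128_pd(v, 1);  /* lanes 2-3 */
    __m128d sum  = _mm_add_pd(lo, hi);           /* pairwise add */
    __m128d swap = _mm_unpackhi_pd(sum, sum);    /* lane 1 -> lane 0 */
    return _mm_cvtsd_f64(_mm_add_sd(sum, swap));
}

/* Dot product of two 32-byte-aligned buffers whose length n has been
   padded to a multiple of four, as described above. Buffers would be
   obtained with, e.g., aligned_alloc(32, n * sizeof(double)). */
static double dot(const double *a, const double *b, size_t n) {
    __m256d acc = _mm256_setzero_pd();
    for (size_t i = 0; i < n; i += 4) {
        __m256d va = _mm256_load_pd(a + i);      /* aligned 256-bit load */
        __m256d vb = _mm256_load_pd(b + i);
        acc = _mm256_fmadd_pd(va, vb, acc);      /* acc += va * vb (FMA) */
    }
    return horizontal_sum(acc);
}
```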
```
👨‍🎓 784 params
┏━━━━━━━━━━━━━━━━━━━━━━┓
┃  Epoch    Loss       ┃
┣━━━━━━━━━━━━━━━━━━━━━━┫
┃      1   0.306073244 ┃
┃  10000   0.000067958 ┃
┃  20000   0.000032532 ┃
┃  30000   0.000021231 ┃
┃  40000   0.000015706 ┃
┃  50000   0.000012440 ┃
┃  60000   0.000010286 ┃
┃  61650   0.000010000 ┃
┗━━━━━━━━━━━━━━━━━━━━━━┛
🏋️ Training Time: 1.16 seconds
💾 Model saved: model.bin
```
The snippet above was captured after building the project with `make` and running `./nnet`. Training stops once the loss drops below the threshold defined by `MAX_ACCEPTABLE_LOSS`, which with the default settings happens at roughly 61k epochs.
The model is saved automatically after training. Check the return value to ensure the operation succeeded:
```c
int rc = save_model("model.bin");
if (rc < 0) {
    fprintf(stderr, "Failed to save model (error %d)\n", rc);
}
// On success you'll see:
// 💾 model.bin
// On failure an error message is printed and rc will be negative.
```
To load a saved model and verify it loads correctly:
```c
rc = load_model("model.bin");
if (rc < 0) {
    fprintf(stderr, "Failed to load model (error %d)\n", rc);
}
// If loading succeeds, no message is printed by the function.
```
Call this after `initialize_network()` to restore the weights and biases before training or inference.
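Putting the pieces together, a typical restore sequence might look like this (a sketch; the fallback message is illustrative, not output the functions actually produce):
```c
initialize_network();                /* allocate layers before loading */
if (load_model("model.bin") < 0) {
    /* fall back to the freshly initialized random weights */
    fprintf(stderr, "No saved model found, training from scratch\n");
}
```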
Modify these macros in `main.c` to adjust the default model:
```c
#define INPUT_SIZE_ORIGINAL          784
#define DEFAULT_HIDDEN_LAYERS        28
#define DEFAULT_HIDDEN_SIZE_ORIGINAL 28
#define OUTPUT_SIZE_ORIGINAL         10
#define DEFAULT_LEARNING_RATE        1e-2
#define DEFAULT_MAX_EPOCHS           1e9
#define MAX_ACCEPTABLE_LOSS          1e-5
#define REPORT_FREQUENCY             10000
```
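The last three macros govern how training runs and reports. A rough sketch of the control flow they imply (`train_epoch()` is a hypothetical stand-in for the real epoch routine):
```c
for (long epoch = 1; epoch <= (long)DEFAULT_MAX_EPOCHS; epoch++) {
    double loss = train_epoch();              /* hypothetical helper */
    if (epoch % REPORT_FREQUENCY == 0)
        printf("%7ld  %.9f\n", epoch, loss);  /* periodic loss report */
    if (loss < MAX_ACCEPTABLE_LOSS)
        break;                                /* early stop, then save */
}
```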
The repository also includes `beta.py`, a small PyTorch script that demonstrates a modern framework for neural networks. To try it out, install the Python dependencies and run the script:
```
pip install numpy torch matplotlib scipy
python3 beta.py
```
This step is completely optional and may require a machine with sufficient memory and AVX2 support.
Remove compiled binaries:
```
make clean
```
Run the minimal test suite with:
```
make test
```
This project is licensed under the MIT License.
`beta.py` is a small PyTorch example that learns the parameters of a Beta distribution from randomly generated inputs. It demonstrates how to build a simple neural network using `numpy`, `torch`, `scipy`, and `matplotlib`.