
GoML - Machine Learning Framework in Go

A somewhat comprehensive, modular machine learning framework built in Go, designed for both educational and production use. GoML provides neural network implementations, data processing utilities, and GPU acceleration support in a clean, well-structured codebase.

Features

  • Neural Networks: Fully connected layers with backpropagation
  • Multiple Activations: ReLU, Sigmoid, Tanh, Softmax, and more
  • Optimization Algorithms: SGD, Adam, with momentum and regularization
  • GPU Acceleration: CUDA, Metal, and OpenCL support via Gorgonia
  • Data Processing: CSV, image data loading with normalization
  • Visualization: Training metrics, confusion matrices, ROC curves
  • Configuration Management: YAML/JSON configuration files
  • Model Persistence: Save and load trained models
  • Extensible Architecture: Easy to add new layers and models

Installation

Prerequisites

  • Go 1.21 or later
  • (Optional) CUDA toolkit for GPU acceleration

Installation Steps

# Clone the repository
git clone https://github.com/AestheticVoyager/goml
cd goml

# Install dependencies
make deps

# Setup project structure
make setup

Quick Start

Basic Usage

package main

import (
    "fmt"
    "log"

    "github.com/AestheticVoyager/goml/internal/datasets"
    "github.com/AestheticVoyager/goml/internal/models"
    "github.com/AestheticVoyager/goml/internal/training"
)

func main() {
    // Load dataset
    dataset := datasets.NewIrisDataset()
    if err := dataset.Load(); err != nil {
        log.Fatal(err)
    }

    XTrain, YTrain, XTest, YTest := dataset.GetData()
    info := dataset.GetInfo()

    // Create model
    config := models.ModelConfig{
        InputSize:    info.InputSize,
        OutputSize:   info.OutputSize,
        LearningRate: 0.01,
        BatchSize:    16,
        Epochs:       500,
        Layers: []models.LayerConfig{
            {Neurons: 16, Activation: models.ReLU},
            {Neurons: 8, Activation: models.ReLU},
            {Neurons: info.OutputSize, Activation: models.Softmax},
        },
    }

    model := models.NewNeuralNetwork(config)

    // Train model
    trainer := training.NewTrainer(model, training.TrainingConfig{
        Epochs:       config.Epochs,
        BatchSize:    config.BatchSize,
        LearningRate: config.LearningRate,
        Verbose:      true,
    })

    if err := trainer.Train(XTrain, YTrain, XTest, YTest); err != nil {
        log.Fatal(err)
    }

    // Evaluate
    accuracy := trainer.Metrics.ValAcc[len(trainer.Metrics.ValAcc)-1]
    fmt.Printf("Final accuracy: %.2f%%\n", accuracy*100)
}

Using Configuration Files

# config.yaml
project:
  name: "my-classification-project"
  output_dir: "./output"

model:
  input_shape: [4]
  output_shape: [3]
  layers:
    - type: "dense"
      units: 16
      activation: "relu"
    - type: "dense"
      units: 8  
      activation: "relu"
    - type: "dense"
      units: 3
      activation: "softmax"

training:
  epochs: 500
  batch_size: 16
  learning_rate: 0.01

Then load it in Go:

// Load configuration
configManager := config.NewConfigManager()
if err := configManager.LoadConfig("config.yaml"); err != nil {
    log.Fatal(err)
}
cfg := configManager.GetConfig()
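
The configuration file maps directly onto Go structs. As a hedged sketch of what that mapping could look like (using an equivalent JSON config and the standard library's encoding/json, since GoML's actual ConfigManager types are not shown here), the struct names below are illustrative, not GoML's API:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LayerSpec and FileConfig mirror the schema of the config file above.
// These are hypothetical names for illustration.
type LayerSpec struct {
	Type       string `json:"type"`
	Units      int    `json:"units"`
	Activation string `json:"activation"`
}

type FileConfig struct {
	Model struct {
		InputShape  []int       `json:"input_shape"`
		OutputShape []int       `json:"output_shape"`
		Layers      []LayerSpec `json:"layers"`
	} `json:"model"`
	Training struct {
		Epochs       int     `json:"epochs"`
		BatchSize    int     `json:"batch_size"`
		LearningRate float64 `json:"learning_rate"`
	} `json:"training"`
}

// parseConfig decodes raw JSON bytes into the typed config struct.
func parseConfig(data []byte) (*FileConfig, error) {
	var cfg FileConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

func main() {
	raw := []byte(`{
	  "model": {"input_shape": [4], "output_shape": [3],
	            "layers": [{"type": "dense", "units": 16, "activation": "relu"}]},
	  "training": {"epochs": 500, "batch_size": 16, "learning_rate": 0.01}
	}`)
	cfg, err := parseConfig(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.Training.Epochs, cfg.Model.Layers[0].Units)
}
```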

Project Structure

goml/
├── cmd/                 # Command-line applications
│   ├── iris/           # Iris classification CLI
│   └── mnist/          # MNIST classification CLI
├── examples/           # Example implementations
│   ├── iris/           # Iris classification example
│   └── mnist/          # MNIST digit recognition example
├── internal/           # Core framework components
│   ├── datasets/       # Data loading and preprocessing
│   │   ├── dataset.go  # Base dataset interface
│   │   ├── iris.go     # Iris dataset implementation
│   │   └── mnist.go    # MNIST dataset implementation
│   ├── models/         # Neural network models
│   │   ├── neuralnet.go # Main neural network implementation
│   │   ├── layer.go     # Layer definitions
│   │   └── activations.go # Activation functions
│   ├── training/       # Training algorithms
│   │   ├── trainer.go   # Training orchestration
│   │   ├── optimizers.go # Optimization algorithms
│   │   └── metrics.go   # Metrics and evaluation
│   └── utils/          # Utility functions
│       ├── preprocessing.go # Data preprocessing
│       └── visualization.go # Plotting and visualization
├── pkg/                # Reusable packages
│   ├── gpu/            # GPU acceleration
│   │   └── accelerator.go # GPU computation management
│   └── config/         # Configuration management
│       └── config.go   # Configuration parsing and validation
├── go.mod              # Go module definition
├── Makefile            # Build automation
└── README.md           # This file

Examples

Iris Classification

The Iris example demonstrates a simple classification task using the classic Iris flower dataset.

# Run the Iris example
make run-iris

Expected Output:

GoML - Iris Classification Example
==================================
Loading dataset...
Dataset: Iris
Training samples: 120, Test samples: 30
Creating neural network...
Training model...
Epoch 0: Loss=1.0986, Acc=0.3333, ValLoss=1.0980, ValAcc=0.3333
Epoch 100: Loss=0.2345, Acc=0.9167, ValLoss=0.2456, ValAcc=0.9000
...
Final Accuracy: 96.67%
Model saved to: examples/iris/output/iris_model.json

MNIST Digit Recognition

The MNIST example shows image classification using the handwritten digits dataset.

make run-mnist

Key Features:

  • 3 hidden layers with dropout regularization
  • Batch normalization support
  • GPU acceleration capability
  • Comprehensive metrics and visualization

Configuration

Model Configuration

Model architecture is defined through configuration:

model:
  architecture: "sequential"
  input_shape: [784]    # 28x28 flattened images
  output_shape: [10]    # Digits 0-9
  layers:
    - type: "dense"
      units: 512
      activation: "relu"
      dropout: 0.2
      batch_norm: true
    - type: "dense"
      units: 256
      activation: "relu"
      dropout: 0.2
    - type: "dense"
      units: 128
      activation: "relu"
      dropout: 0.2
    - type: "dense"
      units: 10
      activation: "softmax"
  initializer:
    weights: "xavier"
    bias: "zeros"
  regularizer:
    l1: 0.0
    l2: 0.0001

Training Configuration

Training parameters and optimization settings:

training:
  epochs: 100
  batch_size: 128
  optimizer:
    type: "adam"
    params:
      learning_rate: 0.001
      beta1: 0.9
      beta2: 0.999
      epsilon: 1e-8
  loss: "categorical_crossentropy"
  metrics: ["accuracy", "loss"]
  learning_rate:
    initial: 0.001
    schedule: "exponential"
    decay: 0.95
    step_size: 10
  callbacks:
    - type: "early_stopping"
      params:
        patience: 10
        min_delta: 0.001

GPU Configuration

Enable and configure GPU acceleration:

gpu:
  enabled: true
  backend: "cuda"  # or "metal", "opencl", "cpu"
  device_id: 0
  memory:
    limit: 4096    # MB
    growth: true
  precision: "fp32" # or "fp16", "mixed"

API Reference

Core Types

NeuralNetwork

The main model class for creating and training neural networks.

type NeuralNetwork struct {
    Config    ModelConfig
    Layers    []*Layer
    IsTrained bool
}

func NewNeuralNetwork(config ModelConfig) *NeuralNetwork
func (nn *NeuralNetwork) Train(X, y *mat.Dense) error
func (nn *NeuralNetwork) Predict(X *mat.Dense) (*mat.Dense, error)
func (nn *NeuralNetwork) Save(path string) error
func LoadNeuralNetwork(path string) (*NeuralNetwork, error)

Dataset Interface

Standard interface for data loading:

type Dataset interface {
    Load() error
    GetData() (*mat.Dense, *mat.Dense, *mat.Dense, *mat.Dense) // X_train, y_train, X_test, y_test
    GetInfo() DatasetInfo
}

Available Datasets

IrisDataset

dataset := datasets.NewIrisDataset()
dataset.SetDataPaths("data/train.csv", "data/test.csv")
err := dataset.Load()

MNISTDataset

dataset := datasets.NewMNISTDataset()
dataset.SetDataPaths(
    "data/train-images.idx3-ubyte",
    "data/train-labels.idx1-ubyte", 
    "data/t10k-images.idx3-ubyte",
    "data/t10k-labels.idx1-ubyte",
)
err := dataset.Load()

Training Components

Trainer

trainer := training.NewTrainer(model, training.TrainingConfig{
    Epochs:       100,
    BatchSize:    32,
    LearningRate: 0.001,
    ValidateEach: 10,
    Verbose:      true,
})

err := trainer.Train(XTrain, YTrain, XTest, YTest)

Optimizers

// SGD with momentum
optimizer := training.NewSGDOptimizer(0.01, 0.9, 0.001)

// Adam optimizer  
optimizer := training.NewAdamOptimizer(0.001, 0.9, 0.999, 1e-8, 0.001)

Visualization

Create plots and visualizations:

viz := utils.NewVisualization("./output")

// Training history
err := viz.TrainingHistoryPlot(loss, accuracy, valLoss, valAcc, "Training Progress")

// Confusion matrix
err := viz.ConfusionMatrixPlot(matrix, classNames, "Confusion Matrix")

// ROC curve
err := viz.ROCCurvePlot(predictions, targets, "ROC Curve")
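
Building the matrix fed to ConfusionMatrixPlot from class predictions can be sketched as below (assuming integer class labels; the exact input format GoML's visualization helpers expect may differ):

```go
package main

import "fmt"

// confusionMatrix builds an nClasses x nClasses count matrix where
// entry [i][j] counts samples whose true class is i and predicted class is j.
func confusionMatrix(yTrue, yPred []int, nClasses int) [][]int {
	m := make([][]int, nClasses)
	for i := range m {
		m[i] = make([]int, nClasses)
	}
	for k := range yTrue {
		m[yTrue[k]][yPred[k]]++
	}
	return m
}

func main() {
	yTrue := []int{0, 0, 1, 1, 2, 2}
	yPred := []int{0, 1, 1, 1, 2, 0}
	for _, row := range confusionMatrix(yTrue, yPred, 3) {
		fmt.Println(row)
	}
}
```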

Advanced Usage

Custom Layers

Add custom layer types by implementing the Layer interface:

type CustomLayer struct {
    config LayerConfig
    // Layer implementation
}

func (l *CustomLayer) Forward(input *mat.Dense, training bool) (*mat.Dense, *BatchStats)
func (l *CustomLayer) Backward(dOutput *mat.Dense, input *mat.Dense, batchStats *BatchStats, learningRate float64) (*mat.Dense, *mat.Dense, *mat.Dense)

Custom Loss Functions

func CustomLoss(predictions, targets *mat.Dense) (*mat.Dense, error) {
    // Implement custom loss calculation
    return loss, nil
}

func CustomLossDerivative(predictions, targets *mat.Dense) (*mat.Dense, error) {
    // Implement loss derivative
    return derivative, nil
}

GPU Acceleration

Enable GPU computation for large models:

gpuConfig := gpu.AcceleratorConfig{
    Type:        gpu.AcceleratorCUDA,
    DeviceID:    0,
    MemoryLimit: 8 * 1024 * 1024 * 1024, // 8GB
    Precision:   gpu.PrecisionFP32,
}

accelerator, err := gpu.NewAccelerator(gpuConfig)
if err != nil {
    log.Printf("GPU not available, using CPU: %v", err)
}

Performance Tips

  1. Use GPU Acceleration: For large datasets and models, enable GPU support
  2. Batch Size Optimization: Larger batches generally train faster but require more memory
  3. Data Preprocessing: Normalize input data for faster convergence
  4. Early Stopping: Prevent overfitting and save training time
  5. Learning Rate Scheduling: Adaptive learning rates can improve convergence

Troubleshooting

Common Issues

GPU Memory Errors:

gpu:
  memory:
    limit: 4096  # Reduce if out of memory
    growth: true

Slow Training:

  • Increase batch size
  • Enable GPU acceleration
  • Simplify model architecture

Poor Accuracy:

  • Check data preprocessing
  • Adjust learning rate
  • Add regularization
  • Increase model capacity

Debug Mode

Enable verbose logging for debugging:

logging:
  level: "debug"
  file: "debug.log"

Contributing

We welcome contributions! Please see our contributing guidelines for details.

Development Setup

# Fork and clone the repository
git clone https://github.com/AestheticVoyager/goml
cd goml

# Create a feature branch
git checkout -b feature/amazing-feature

# Make changes and test
make test

# Commit and push
git commit -m "Add amazing feature"
git push origin feature/amazing-feature

Testing

# Run all tests
make test

# Run specific package tests
go test ./internal/models/...

# Run with coverage
go test ./... -cover

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Birth of GoML
  • Gonum library for scientific computing
  • Gorgonia for tensor operations and GPU support
  • Anyone crazy enough to contribute to GoML

Support

For support and questions:

  • Open an issue on GitHub
  • Read the Friendly Documentation
  • Review Existing Examples
