
GoML - Machine Learning Framework in Go

A somewhat comprehensive, modular machine learning framework built in Go, designed for both educational and production use. GoML provides neural network implementations, data processing utilities, and GPU acceleration support in a clean, well-structured codebase.

Features

  • Neural Networks: Fully connected layers with backpropagation
  • Multiple Activations: ReLU, Sigmoid, Tanh, Softmax, and more
  • Optimization Algorithms: SGD, Adam, with momentum and regularization
  • GPU Acceleration: CUDA, Metal, and OpenCL support via Gorgonia
  • Data Processing: CSV, image data loading with normalization
  • Visualization: Training metrics, confusion matrices, ROC curves
  • Configuration Management: YAML/JSON configuration files
  • Model Persistence: Save and load trained models
  • Extensible Architecture: Easy to add new layers and models

Installation

Prerequisites

  • Go 1.21 or later
  • (Optional) CUDA toolkit for GPU acceleration

Installation Steps

# Clone the repository
git clone https://github.com/AestheticVoyager/goml
cd goml

# Install dependencies
make deps

# Setup project structure
make setup

Quick Start

Basic Usage

package main

import (
    "fmt"
    "log"

    "github.com/AestheticVoyager/goml/internal/datasets"
    "github.com/AestheticVoyager/goml/internal/models"
    "github.com/AestheticVoyager/goml/internal/training"
)

func main() {
    // Load dataset
    dataset := datasets.NewIrisDataset()
    if err := dataset.Load(); err != nil {
        log.Fatal(err)
    }

    XTrain, YTrain, XTest, YTest := dataset.GetData()
    info := dataset.GetInfo()

    // Create model
    config := models.ModelConfig{
        InputSize:    info.InputSize,
        OutputSize:   info.OutputSize,
        LearningRate: 0.01,
        BatchSize:    16,
        Epochs:       500,
        Layers: []models.LayerConfig{
            {Neurons: 16, Activation: models.ReLU},
            {Neurons: 8, Activation: models.ReLU},
            {Neurons: info.OutputSize, Activation: models.Softmax},
        },
    }

    model := models.NewNeuralNetwork(config)

    // Train model
    trainer := training.NewTrainer(model, training.TrainingConfig{
        Epochs:       config.Epochs,
        BatchSize:    config.BatchSize,
        LearningRate: config.LearningRate,
        Verbose:      true,
    })

    if err := trainer.Train(XTrain, YTrain, XTest, YTest); err != nil {
        log.Fatal(err)
    }

    // Evaluate
    accuracy := trainer.Metrics.ValAcc[len(trainer.Metrics.ValAcc)-1]
    fmt.Printf("Final accuracy: %.2f%%\n", accuracy*100)
}

Using Configuration Files

# config.yaml
project:
  name: "my-classification-project"
  output_dir: "./output"

model:
  input_shape: [4]
  output_shape: [3]
  layers:
    - type: "dense"
      units: 16
      activation: "relu"
    - type: "dense"
      units: 8  
      activation: "relu"
    - type: "dense"
      units: 3
      activation: "softmax"

training:
  epochs: 500
  batch_size: 16
  learning_rate: 0.01

Then load it in Go:

// Load configuration
configManager := config.NewConfigManager()
if err := configManager.LoadConfig("config.yaml"); err != nil {
    log.Fatal(err)
}
cfg := configManager.GetConfig()
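
The configuration file maps directly onto Go structs. As a hedged sketch of what that mapping could look like (using an equivalent JSON config and the standard library's encoding/json, since GoML's actual ConfigManager types are not shown here), the struct names below are illustrative, not GoML's API:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LayerSpec and FileConfig mirror the schema of the config file above.
// These are hypothetical names for illustration.
type LayerSpec struct {
	Type       string `json:"type"`
	Units      int    `json:"units"`
	Activation string `json:"activation"`
}

type FileConfig struct {
	Model struct {
		InputShape  []int       `json:"input_shape"`
		OutputShape []int       `json:"output_shape"`
		Layers      []LayerSpec `json:"layers"`
	} `json:"model"`
	Training struct {
		Epochs       int     `json:"epochs"`
		BatchSize    int     `json:"batch_size"`
		LearningRate float64 `json:"learning_rate"`
	} `json:"training"`
}

// parseConfig decodes raw JSON bytes into the typed config struct.
func parseConfig(data []byte) (*FileConfig, error) {
	var cfg FileConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

func main() {
	raw := []byte(`{
	  "model": {"input_shape": [4], "output_shape": [3],
	            "layers": [{"type": "dense", "units": 16, "activation": "relu"}]},
	  "training": {"epochs": 500, "batch_size": 16, "learning_rate": 0.01}
	}`)
	cfg, err := parseConfig(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.Training.Epochs, cfg.Model.Layers[0].Units)
}
```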

Project Structure

goml/
├── cmd/                 # Command-line applications
│   ├── iris/           # Iris classification CLI
│   └── mnist/          # MNIST classification CLI
├── examples/           # Example implementations
│   ├── iris/           # Iris classification example
│   └── mnist/          # MNIST digit recognition example
├── internal/           # Core framework components
│   ├── datasets/       # Data loading and preprocessing
│   │   ├── dataset.go  # Base dataset interface
│   │   ├── iris.go     # Iris dataset implementation
│   │   └── mnist.go    # MNIST dataset implementation
│   ├── models/         # Neural network models
│   │   ├── neuralnet.go # Main neural network implementation
│   │   ├── layer.go     # Layer definitions
│   │   └── activations.go # Activation functions
│   ├── training/       # Training algorithms
│   │   ├── trainer.go   # Training orchestration
│   │   ├── optimizers.go # Optimization algorithms
│   │   └── metrics.go   # Metrics and evaluation
│   └── utils/          # Utility functions
│       ├── preprocessing.go # Data preprocessing
│       └── visualization.go # Plotting and visualization
├── pkg/                # Reusable packages
│   ├── gpu/            # GPU acceleration
│   │   └── accelerator.go # GPU computation management
│   └── config/         # Configuration management
│       └── config.go   # Configuration parsing and validation
├── go.mod              # Go module definition
├── Makefile            # Build automation
└── README.md           # This file

Examples

Iris Classification

The Iris example demonstrates a simple classification task using the classic Iris flower dataset.

# Run the Iris example
make run-iris

Expected Output:

GoML - Iris Classification Example
==================================
Loading dataset...
Dataset: Iris
Training samples: 120, Test samples: 30
Creating neural network...
Training model...
Epoch 0: Loss=1.0986, Acc=0.3333, ValLoss=1.0980, ValAcc=0.3333
Epoch 100: Loss=0.2345, Acc=0.9167, ValLoss=0.2456, ValAcc=0.9000
...
Final Accuracy: 96.67%
Model saved to: examples/iris/output/iris_model.json

MNIST Digit Recognition

The MNIST example shows image classification using the handwritten digits dataset.

make run-mnist

Key Features:

  • 3 hidden layers with dropout regularization
  • Batch normalization support
  • GPU acceleration capability
  • Comprehensive metrics and visualization

Configuration

Model Configuration

Model architecture is defined through configuration:

model:
  architecture: "sequential"
  input_shape: [784]    # 28x28 flattened images
  output_shape: [10]    # Digits 0-9
  layers:
    - type: "dense"
      units: 512
      activation: "relu"
      dropout: 0.2
      batch_norm: true
    - type: "dense"
      units: 256
      activation: "relu"
      dropout: 0.2
    - type: "dense"
      units: 128
      activation: "relu"
      dropout: 0.2
    - type: "dense"
      units: 10
      activation: "softmax"
  initializer:
    weights: "xavier"
    bias: "zeros"
  regularizer:
    l1: 0.0
    l2: 0.0001

Training Configuration

Training parameters and optimization settings:

training:
  epochs: 100
  batch_size: 128
  optimizer:
    type: "adam"
    params:
      learning_rate: 0.001
      beta1: 0.9
      beta2: 0.999
      epsilon: 1e-8
  loss: "categorical_crossentropy"
  metrics: ["accuracy", "loss"]
  learning_rate:
    initial: 0.001
    schedule: "exponential"
    decay: 0.95
    step_size: 10
  callbacks:
    - type: "early_stopping"
      params:
        patience: 10
        min_delta: 0.001

GPU Configuration

Enable and configure GPU acceleration:

gpu:
  enabled: true
  backend: "cuda"  # or "metal", "opencl", "cpu"
  device_id: 0
  memory:
    limit: 4096    # MB
    growth: true
  precision: "fp32" # or "fp16", "mixed"

API Reference

Core Types

NeuralNetwork

The main model class for creating and training neural networks.

type NeuralNetwork struct {
    Config    ModelConfig
    Layers    []*Layer
    IsTrained bool
}

func NewNeuralNetwork(config ModelConfig) *NeuralNetwork
func (nn *NeuralNetwork) Train(X, y *mat.Dense) error
func (nn *NeuralNetwork) Predict(X *mat.Dense) (*mat.Dense, error)
func (nn *NeuralNetwork) Save(path string) error
func LoadNeuralNetwork(path string) (*NeuralNetwork, error)

Dataset Interface

Standard interface for data loading:

type Dataset interface {
    Load() error
    GetData() (*mat.Dense, *mat.Dense, *mat.Dense, *mat.Dense) // X_train, y_train, X_test, y_test
    GetInfo() DatasetInfo
}

Available Datasets

IrisDataset

dataset := datasets.NewIrisDataset()
dataset.SetDataPaths("data/train.csv", "data/test.csv")
err := dataset.Load()

MNISTDataset

dataset := datasets.NewMNISTDataset()
dataset.SetDataPaths(
    "data/train-images.idx3-ubyte",
    "data/train-labels.idx1-ubyte", 
    "data/t10k-images.idx3-ubyte",
    "data/t10k-labels.idx1-ubyte",
)
err := dataset.Load()

Training Components

Trainer

trainer := training.NewTrainer(model, training.TrainingConfig{
    Epochs:       100,
    BatchSize:    32,
    LearningRate: 0.001,
    ValidateEach: 10,
    Verbose:      true,
})

err := trainer.Train(XTrain, YTrain, XTest, YTest)

Optimizers

// SGD with momentum
optimizer := training.NewSGDOptimizer(0.01, 0.9, 0.001)

// Adam optimizer  
optimizer := training.NewAdamOptimizer(0.001, 0.9, 0.999, 1e-8, 0.001)

Visualization

Create plots and visualizations:

viz := utils.NewVisualization("./output")

// Training history
err := viz.TrainingHistoryPlot(loss, accuracy, valLoss, valAcc, "Training Progress")

// Confusion matrix
err := viz.ConfusionMatrixPlot(matrix, classNames, "Confusion Matrix")

// ROC curve
err := viz.ROCCurvePlot(predictions, targets, "ROC Curve")
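
Building the matrix fed to ConfusionMatrixPlot from class predictions can be sketched as below (assuming integer class labels; the exact input format GoML's visualization helpers expect may differ):

```go
package main

import "fmt"

// confusionMatrix builds an nClasses x nClasses count matrix where
// entry [i][j] counts samples whose true class is i and predicted class is j.
func confusionMatrix(yTrue, yPred []int, nClasses int) [][]int {
	m := make([][]int, nClasses)
	for i := range m {
		m[i] = make([]int, nClasses)
	}
	for k := range yTrue {
		m[yTrue[k]][yPred[k]]++
	}
	return m
}

func main() {
	yTrue := []int{0, 0, 1, 1, 2, 2}
	yPred := []int{0, 1, 1, 1, 2, 0}
	for _, row := range confusionMatrix(yTrue, yPred, 3) {
		fmt.Println(row)
	}
}
```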

Advanced Usage

Custom Layers

Add custom layer types by implementing the Layer interface:

type CustomLayer struct {
    config LayerConfig
    // Layer implementation
}

func (l *CustomLayer) Forward(input *mat.Dense, training bool) (*mat.Dense, *BatchStats)
func (l *CustomLayer) Backward(dOutput *mat.Dense, input *mat.Dense, batchStats *BatchStats, learningRate float64) (*mat.Dense, *mat.Dense, *mat.Dense)

Custom Loss Functions

func CustomLoss(predictions, targets *mat.Dense) (*mat.Dense, error) {
    // Implement custom loss calculation
    return loss, nil
}

func CustomLossDerivative(predictions, targets *mat.Dense) (*mat.Dense, error) {
    // Implement loss derivative
    return derivative, nil
}

GPU Acceleration

Enable GPU computation for large models:

gpuConfig := gpu.AcceleratorConfig{
    Type:        gpu.AcceleratorCUDA,
    DeviceID:    0,
    MemoryLimit: 8 * 1024 * 1024 * 1024, // 8GB
    Precision:   gpu.PrecisionFP32,
}

accelerator, err := gpu.NewAccelerator(gpuConfig)
if err != nil {
    log.Printf("GPU not available, using CPU: %v", err)
}

Performance Tips

  1. Use GPU Acceleration: For large datasets and models, enable GPU support
  2. Batch Size Optimization: Larger batches generally train faster but require more memory
  3. Data Preprocessing: Normalize input data for faster convergence
  4. Early Stopping: Prevent overfitting and save training time
  5. Learning Rate Scheduling: Adaptive learning rates can improve convergence

Troubleshooting

Common Issues

GPU Memory Errors:

gpu:
  memory:
    limit: 4096  # Reduce if out of memory
    growth: true

Slow Training:

  • Increase batch size
  • Enable GPU acceleration
  • Simplify model architecture

Poor Accuracy:

  • Check data preprocessing
  • Adjust learning rate
  • Add regularization
  • Increase model capacity

Debug Mode

Enable verbose logging for debugging:

logging:
  level: "debug"
  file: "debug.log"

Contributing

We welcome contributions! Please see our contributing guidelines for details.

Development Setup

# Fork and clone the repository
git clone https://github.com/AestheticVoyager/goml
cd goml

# Create a feature branch
git checkout -b feature/amazing-feature

# Make changes and test
make test

# Commit and push
git commit -m "Add amazing feature"
git push origin feature/amazing-feature

Testing

# Run all tests
make test

# Run specific package tests
go test ./internal/models/...

# Run with coverage
go test ./... -cover

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Birth of GoML
  • Gonum library for scientific computing
  • Gorgonia for tensor operations and GPU support
  • Anyone crazy enough to contribute to GoML

Support

For support and questions:

  • Open an issue on GitHub
  • Read the Friendly Documentation
  • Review Existing Examples
