🚀 GoGPT - Transformer LLM in Pure Go

A complete Large Language Model implementation in pure Go with no external ML frameworks. Built from the ground up using only basic math operations.

🎯 Features

  • Pure Go Implementation - No CGO, no external dependencies for ML
  • Transformer Architecture - Multi-head attention, feed-forward networks, layer normalization
  • Training Pipeline - Pre-training and instruction tuning phases
  • Adam Optimizer - With gradient clipping for stable training (see the sketch after this list)
  • Interactive Chat - Test the model after training
  • Modular Design - Clean separation of components
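
The gradient clipping mentioned above rescales all gradients whenever their combined L2 norm exceeds a threshold (5.0 in this project). A minimal Go sketch; the function and variable names are illustrative, not the repository's actual API:

package main

import (
	"fmt"
	"math"
)

// clipGradients rescales all gradients in place so that their global
// L2 norm does not exceed maxNorm (5.0 in this project's configuration).
func clipGradients(grads [][]float64, maxNorm float64) {
	var sumSq float64
	for _, g := range grads {
		for _, v := range g {
			sumSq += v * v
		}
	}
	norm := math.Sqrt(sumSq)
	if norm <= maxNorm {
		return
	}
	scale := maxNorm / norm
	for _, g := range grads {
		for i := range g {
			g[i] *= scale
		}
	}
}

func main() {
	grads := [][]float64{{3, 4}, {12, 0}} // global norm = 13
	clipGradients(grads, 5.0)
	fmt.Println(grads) // rescaled so the global norm is 5
}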

🏗️ Architecture

Input → Embeddings → Transformer Blocks (×3) → Output Projection → Predictions

Each Transformer block contains (see the Go sketch after this list):

  • Multi-Head Self-Attention
  • Layer Normalization
  • Feed-Forward Network
  • Residual Connections
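
In Go, that wiring can be sketched as below. This is a simplified illustration rather than the repository's actual transformer.go; the type and method names are assumptions:

package main

import "fmt"

// Layer is a stand-in for the project's layer abstraction: anything
// that maps an input matrix to an output matrix.
type Layer interface {
	Forward(x [][]float64) [][]float64
}

// TransformerBlock chains self-attention and feed-forward sublayers,
// each followed by a residual connection and layer normalization.
type TransformerBlock struct {
	Attention Layer
	Norm1     Layer
	FFN       Layer
	Norm2     Layer
}

func (b *TransformerBlock) Forward(x [][]float64) [][]float64 {
	// Self-attention sublayer: residual add, then normalize.
	x = b.Norm1.Forward(add(x, b.Attention.Forward(x)))
	// Feed-forward sublayer: residual add, then normalize.
	return b.Norm2.Forward(add(x, b.FFN.Forward(x)))
}

// add sums two matrices elementwise (the residual connection).
func add(a, b [][]float64) [][]float64 {
	out := make([][]float64, len(a))
	for i := range a {
		out[i] = make([]float64, len(a[i]))
		for j := range a[i] {
			out[i][j] = a[i][j] + b[i][j]
		}
	}
	return out
}

// identity is a no-op Layer used only to exercise the block here.
type identity struct{}

func (identity) Forward(x [][]float64) [][]float64 { return x }

func main() {
	block := &TransformerBlock{identity{}, identity{}, identity{}, identity{}}
	fmt.Println(block.Forward([][]float64{{1, 2}}))
}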

📁 Project Structure

GoGPT/
├── go.mod               # Module definition
├── cmd/
│   └── main.go          # Training pipeline and interactive mode
├── constants.go         # Model hyperparameters
├── tensor.go            # Tensor operations and math utilities
├── ops.go               # Matrix operations
├── layer.go             # Layer interface
├── vocab.go             # Tokenization and vocabulary
├── embeddings.go        # Token embedding layer
├── self_attention.go    # Multi-head attention mechanism
├── feed_forward.go      # Position-wise feed-forward network
├── layer_norm.go        # Layer normalization
├── transformer.go       # Transformer block
├── output_projection.go # Final projection to vocabulary
├── adam.go              # Adam optimizer
├── llm.go               # Main LLM implementation
└── tensor_test.go       # Unit tests

🚀 Quick Start

# Clone the repository
git clone https://github.com/fuziontech/GoGPT.git
cd GoGPT

# Run the training and interactive mode
go run cmd/main.go

# Run tests
go test

🧮 Model Configuration

  • Vocabulary: Dynamic (built from training data)
  • Embedding Dimension: 32
  • Hidden Dimension: 32
  • Max Sequence Length: 40 tokens
  • Number of Heads: 4
  • Number of Layers: 3
  • Gradient Clipping: 5.0
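
These values map onto constants.go roughly as follows; the identifier names below are guesses for illustration, while the values come from the list above:

package gogpt

// Model hyperparameters (see constants.go for the authoritative
// definitions; the identifier names here are illustrative).
const (
	EmbeddingDim = 32  // width of token embeddings
	HiddenDim    = 32  // feed-forward hidden width
	MaxSeqLen    = 40  // maximum tokens per sequence
	NumHeads     = 4   // attention heads per block
	NumLayers    = 3   // transformer blocks
	GradientClip = 5.0 // global-norm clipping threshold
)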

🎓 Training Process

  1. Vocabulary Building - Creates token mappings from training data
  2. Pre-training (100 epochs) - Learns factual knowledge
  3. Instruction Tuning (100 epochs) - Learns conversational patterns
  4. Interactive Mode - Chat with the trained model
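
Condensed to code, the pipeline is two training loops followed by a chat loop. A hedged sketch of the shape of cmd/main.go; the real types and function names will differ:

package main

import "fmt"

// Model stands in for the LLM defined in llm.go.
type Model struct{ steps int }

// Step would run a forward pass, backpropagation, and an Adam update
// (with gradient clipping) for one training example.
func (m *Model) Step(example string) { m.steps++ }

// train makes one optimization pass per example for each epoch.
func train(model *Model, data []string, epochs int) {
	for epoch := 0; epoch < epochs; epoch++ {
		for _, example := range data {
			model.Step(example)
		}
	}
}

func main() {
	pretrainData := []string{"The sun is a star."}
	instructionData := []string{"User: What is the sun? Assistant: a star."}

	model := &Model{}
	// Steps 1-2: build the vocabulary from the data, then pre-train.
	train(model, pretrainData, 100)
	// Step 3: instruction-tune on conversational examples.
	train(model, instructionData, 100)
	// Step 4: an interactive read-generate-print loop would start here.
	fmt.Println("optimization steps taken:", model.steps)
}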

💬 Example Interaction

You: What is the sun?
GoGPT: The sun is a star at the center of our solar system

You: How do plants grow?
GoGPT: Plants grow by converting sunlight into energy through photosynthesis

🔧 Development

# Build the project
go build ./cmd/main.go

# Run with custom parameters (modify constants.go)
go run cmd/main.go

# Run benchmarks
go test -bench=.

🤝 Differences from RustGPT

While inspired by RustGPT, GoGPT makes some Go-specific design choices (the Layer interface is sketched after this list):

  • Uses Go interfaces for the Layer abstraction
  • Leverages Go's garbage collection (no manual memory management)
  • Simplified tensor operations using slices
  • Goroutine-ready architecture (though not parallelized yet)
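
The Layer abstraction in layer.go is the main Go-specific piece. A plausible shape for it; the Backward and Parameters signatures below are assumptions for illustration, not the file's verified contents:

package gogpt

// Layer is implemented by every component in the stack (embeddings,
// attention, feed-forward, layer norm, output projection).
type Layer interface {
	// Forward maps the layer's input to its output.
	Forward(input [][]float64) [][]float64
	// Backward takes the gradient of the loss with respect to the
	// layer's output and returns the gradient with respect to its
	// input, accumulating parameter gradients internally.
	Backward(gradOutput [][]float64) [][]float64
	// Parameters exposes trainable weights for the Adam optimizer.
	Parameters() [][]float64
}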

📊 Performance

The implementation prioritizes clarity over performance. Potential optimizations (a goroutine sketch follows the list):

  • SIMD operations for matrix multiplication
  • Parallel batch processing with goroutines
  • GPU support via CUDA bindings
  • Memory pooling for tensor allocations
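
As an example of the goroutine route: the rows of a matrix product are independent, so they can be computed concurrently. A minimal sketch, not how ops.go is written today:

package main

import (
	"fmt"
	"sync"
)

// parMatMul computes a×b with one goroutine per output row. Each
// goroutine writes only its own row, so no locking is required.
func parMatMul(a, b [][]float64) [][]float64 {
	rows, inner, cols := len(a), len(b), len(b[0])
	out := make([][]float64, rows)
	var wg sync.WaitGroup
	for i := 0; i < rows; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			row := make([]float64, cols)
			for k := 0; k < inner; k++ {
				for j := 0; j < cols; j++ {
					row[j] += a[i][k] * b[k][j]
				}
			}
			out[i] = row
		}(i)
	}
	wg.Wait()
	return out
}

func main() {
	a := [][]float64{{1, 2}, {3, 4}}
	b := [][]float64{{5, 6}, {7, 8}}
	fmt.Println(parMatMul(a, b)) // [[19 22] [43 50]]
}

A production version would cap concurrency at roughly runtime.NumCPU() workers instead of spawning a goroutine per row.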

🚧 Future Improvements

  • Model persistence (save/load)
  • Batch training support
  • More sophisticated sampling (beam search, top-k/top-p)
  • Positional encodings
  • Attention visualization
  • Distributed training support
  • ONNX export

📝 License

MIT License - See LICENSE file for details

🙏 Acknowledgments

Inspired by the RustGPT project, which demonstrates that a modern LLM can be built from scratch for educational purposes.
