A complete Large Language Model implementation in pure Go with no external ML frameworks. Built from the ground up using only basic math operations.
- Pure Go Implementation - No CGO, no external dependencies for ML
- Transformer Architecture - Multi-head attention, feed-forward networks, layer normalization
- Training Pipeline - Pre-training and instruction tuning phases
- Adam Optimizer - With gradient clipping for stable training (see the sketch after this list)
- Interactive Chat - Test the model after training
- Modular Design - Clean separation of components
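To make the Adam Optimizer bullet concrete, here is a minimal, illustrative Adam step for a single parameter matrix. The type and function names (`Adam`, `NewAdam`, `Step`) are assumptions for this sketch, not necessarily what adam.go defines:

```go
package gogpt

import "math"

// Adam holds per-parameter moment estimates for one weight matrix.
// (Illustrative sketch; the repository's adam.go may differ.)
type Adam struct {
	lr, beta1, beta2, eps float64
	m, v                  [][]float64 // first and second moment estimates
	t                     int         // timestep, for bias correction
}

func NewAdam(lr float64, rows, cols int) *Adam {
	a := &Adam{lr: lr, beta1: 0.9, beta2: 0.999, eps: 1e-8}
	a.m, a.v = make([][]float64, rows), make([][]float64, rows)
	for i := range a.m {
		a.m[i] = make([]float64, cols)
		a.v[i] = make([]float64, cols)
	}
	return a
}

// Step applies one bias-corrected Adam update to params in place.
func (a *Adam) Step(params, grads [][]float64) {
	a.t++
	for i := range params {
		for j := range params[i] {
			g := grads[i][j]
			a.m[i][j] = a.beta1*a.m[i][j] + (1-a.beta1)*g
			a.v[i][j] = a.beta2*a.v[i][j] + (1-a.beta2)*g*g
			mHat := a.m[i][j] / (1 - math.Pow(a.beta1, float64(a.t)))
			vHat := a.v[i][j] / (1 - math.Pow(a.beta2, float64(a.t)))
			params[i][j] -= a.lr * mHat / (math.Sqrt(vHat) + a.eps)
		}
	}
}
```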
The overall data flow:

```
Input → Embeddings → Transformer Blocks (×3) → Output Projection → Predictions
```
Each Transformer block contains:
- Multi-Head Self-Attention
- Layer Normalization
- Feed-Forward Network
- Residual Connections
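A block's forward pass can be sketched in a few lines. This assumes the `Layer` interface from layer.go exposes a `Forward` method over `[seq][dim]` slices; the exact signature and post-norm ordering shown here are assumptions:

```go
package gogpt

// Layer is one plausible shape for the interface in layer.go.
type Layer interface {
	Forward(input [][]float64) [][]float64
}

// TransformerBlock wires the four components listed above together.
type TransformerBlock struct {
	attn  Layer // multi-head self-attention
	norm1 Layer // layer normalization after attention
	ffn   Layer // position-wise feed-forward network
	norm2 Layer // layer normalization after the FFN
}

// Forward applies each sublayer inside a residual connection, then
// normalizes (post-norm ordering; the repository may use pre-norm).
func (b *TransformerBlock) Forward(x [][]float64) [][]float64 {
	x = b.norm1.Forward(addResidual(x, b.attn.Forward(x)))
	return b.norm2.Forward(addResidual(x, b.ffn.Forward(x)))
}

// addResidual sums two [seq][dim] activations element-wise.
func addResidual(a, b [][]float64) [][]float64 {
	out := make([][]float64, len(a))
	for i := range a {
		out[i] = make([]float64, len(a[i]))
		for j := range a[i] {
			out[i][j] = a[i][j] + b[i][j]
		}
	}
	return out
}
```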
The repository is laid out as follows:

```
GoGPT/
├── go.mod                  # Module definition
├── cmd/
│   └── main.go             # Training pipeline and interactive mode
├── constants.go            # Model hyperparameters
├── tensor.go               # Tensor operations and math utilities
├── ops.go                  # Matrix operations
├── layer.go                # Layer interface
├── vocab.go                # Tokenization and vocabulary
├── embeddings.go           # Token embedding layer
├── self_attention.go       # Multi-head attention mechanism
├── feed_forward.go         # Position-wise feed-forward network
├── layer_norm.go           # Layer normalization
├── transformer.go          # Transformer block
├── output_projection.go    # Final projection to vocabulary
├── adam.go                 # Adam optimizer
├── llm.go                  # Main LLM implementation
└── tensor_test.go          # Unit tests
```
```bash
# Clone the repository
cd GoGPT

# Run the training and interactive mode
go run cmd/main.go

# Run tests
go test
```

The default hyperparameters (defined in constants.go):

- Vocabulary: Dynamic (built from training data)
- Embedding Dimension: 32
- Hidden Dimension: 32
- Max Sequence Length: 40 tokens
- Number of Heads: 4
- Number of Layers: 3
- Gradient Clipping: 5.0
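These values suggest a constants.go along the following lines; the identifier names are guesses, and `clipGradients` shows one way the 5.0 threshold could be applied as a global L2-norm clip (the repository's actual approach may differ):

```go
package gogpt

import "math"

const (
	EmbeddingDim = 32  // token embedding size
	HiddenDim    = 32  // feed-forward hidden size
	MaxSeqLen    = 40  // maximum sequence length in tokens
	NumHeads     = 4   // attention heads per block
	NumLayers    = 3   // transformer blocks
	GradClip     = 5.0 // gradient clipping threshold
)

// clipGradients rescales grads in place so their global L2 norm
// does not exceed maxNorm.
func clipGradients(grads [][]float64, maxNorm float64) {
	var sumSq float64
	for _, row := range grads {
		for _, g := range row {
			sumSq += g * g
		}
	}
	if norm := math.Sqrt(sumSq); norm > maxNorm {
		scale := maxNorm / norm
		for _, row := range grads {
			for j := range row {
				row[j] *= scale
			}
		}
	}
}
```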
Running the pipeline proceeds through four stages:

- Vocabulary Building - Creates token mappings from training data (see the sketch after this list)
- Pre-training (100 epochs) - Learns factual knowledge
- Instruction Tuning (100 epochs) - Learns conversational patterns
- Interactive Mode - Chat with the trained model
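A minimal version of the vocabulary-building stage might look like this, assuming simple whitespace tokenization (vocab.go's actual rules may be more involved):

```go
package gogpt

import "strings"

// Vocab maps between token strings and integer IDs.
type Vocab struct {
	TokenToID map[string]int
	IDToToken []string
}

// BuildVocab assigns an ID to each distinct token in order of first
// appearance across the training corpus.
func BuildVocab(corpus []string) *Vocab {
	v := &Vocab{TokenToID: make(map[string]int)}
	for _, line := range corpus {
		for _, tok := range strings.Fields(strings.ToLower(line)) {
			if _, seen := v.TokenToID[tok]; !seen {
				v.TokenToID[tok] = len(v.IDToToken)
				v.IDToToken = append(v.IDToToken, tok)
			}
		}
	}
	return v
}
```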
A sample session in interactive mode:

```
You: What is the sun?
GoGPT: The sun is a star at the center of our solar system

You: How do plants grow?
GoGPT: Plants grow by converting sunlight into energy through photosynthesis
```
```bash
# Build the project
go build ./cmd/main.go

# Run with custom parameters (modify constants.go)
go run cmd/main.go

# Run benchmarks
go test -bench=.
```

While inspired by RustGPT, GoGPT makes some Go-specific design choices:
- Uses Go interfaces for the Layer abstraction (see the sketch after this list)
- Leverages Go's garbage collection (no manual memory management)
- Simplified tensor operations using slices
- Goroutine-ready architecture (though not parallelized yet)
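Because every component satisfies the `Layer` interface (sketched earlier), the model's forward pass in llm.go can reduce to a loop over an ordered stack of layers. This is illustrative of the design, not the file's actual code:

```go
package gogpt

// forward threads the input through an ordered stack of layers:
// embeddings, transformer blocks, and the output projection.
func forward(layers []Layer, input [][]float64) [][]float64 {
	for _, l := range layers {
		input = l.Forward(input)
	}
	return input
}
```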
The implementation prioritizes clarity over performance. Potential optimizations:
- SIMD operations for matrix multiplication
- Parallel batch processing with goroutines (see the sketch after this list)
- GPU support via CUDA bindings
- Memory pooling for tensor allocations
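As an example of the goroutine opportunity noted above, a row-parallel matrix multiply could replace a serial version in ops.go. The function name here is illustrative, not the repository's actual API:

```go
package gogpt

import "sync"

// MatMulParallel computes a × b, handing each output row to its own
// goroutine. a is [rows][inner], b is [inner][cols].
func MatMulParallel(a, b [][]float64) [][]float64 {
	rows, inner, cols := len(a), len(b), len(b[0])
	out := make([][]float64, rows)
	var wg sync.WaitGroup
	for i := 0; i < rows; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			row := make([]float64, cols)
			for k := 0; k < inner; k++ {
				aik := a[i][k]
				for j := 0; j < cols; j++ {
					row[j] += aik * b[k][j]
				}
			}
			out[i] = row
		}(i)
	}
	wg.Wait()
	return out
}
```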
Planned additions:

- Model persistence (save/load)
- Batch training support
- More sophisticated sampling (beam search, top-k/top-p)
- Positional encodings
- Attention visualization
- Distributed training support
- ONNX export
MIT License - See LICENSE file for details
Inspired by the RustGPT project - demonstrating that modern LLMs can be built from scratch for educational purposes.