This module provisions a GPU-accelerated EC2 instance running a custom-built openSUSE Leap 15.6 AMI and prepares it for the NVIDIA GPU Operator by installing NVIDIA drivers and RKE2.
- By default provisions G4dn instance types with NVIDIA GPUs.
- Uses openSUSE Leap 15.x AMI.
- Automated driver installation via startupscript.tftpl.
- Configures Security Groups for SSH and Kubernetes API (6443).
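
The commands in this README rely on the aws, git, and terraform CLIs being on the PATH. As a minimal sketch (the function name `check_tools` is hypothetical and the tool list is an assumption drawn from the commands shown here), a preflight check could look like this:

```shell
# Sketch: report which of the given CLIs are missing from PATH.
# check_tools is a hypothetical helper, not part of this repository.
check_tools() {
  missing=""
  for tool in "$@"; do
    # command -v succeeds only if the tool can be resolved by the shell
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done
  if [ -n "$missing" ]; then
    echo "Missing required tools:$missing" >&2
    return 1
  fi
  echo "All required tools found: $*"
}

# Example invocation (uncomment to run):
# check_tools aws git terraform
```

The function returns a non-zero status when anything is missing, so it can gate the rest of a setup script.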
# Install AWS CLI and configure credentials
aws configure
# Verify identity
aws sts get-caller-identity

git clone https://github.com/devenkulkarni/migdemo.git
cd migdemo

migdemo
├── demo
│   ├── gpu-workload.yaml
│   ├── mig-mixed-strategy-deploy-pod.sh
│   ├── mig-mixed-strategy-log-verify.sh
│   ├── mig-single-strategy-deploy-pod.sh
│   └── mig-single-strategy-log-verify.sh
├── gpu-operator
│   └── README.md
├── infra
│   ├── data.tf
│   ├── docs.md
│   ├── main.tf
│   ├── outputs.tf
│   ├── provider.tf
│   ├── scripts
│   │   ├── rke2-localpath-install.sh
│   │   └── startupscript.tftpl
│   ├── terraform.tfvars
│   ├── terraform.tfvars.example
│   └── variables.tf
└── README.md

- Copy ./terraform.tfvars.example to ./terraform.tfvars
- Edit ./terraform.tfvars and set:
  - prefix to give the resources an identifiable name (e.g., your initials or first name)
  - region to specify the AWS region where resources will be created (e.g., us-west-2)
  - zone to specify the AWS availability zone where resources will be created (e.g., us-west-2c)
  - instance_type to specify the instance type (e.g., g4dn.xlarge)
  - rke2_version to specify the version of RKE2 to be installed
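
For illustration, the five settings above could be written out as follows. This is only a sketch: it assumes all five variables are plain strings, the prefix value is made up, and the rke2_version value is a placeholder in the usual RKE2 release-tag format, so check terraform.tfvars.example and infra/variables.tf for what the module actually expects. Run it from the directory that holds terraform.tfvars.example, and note that it overwrites an existing terraform.tfvars.

```shell
# Sketch: write a terraform.tfvars with the five variables described above.
# region, zone and instance_type reuse the examples from this README;
# prefix and rke2_version are illustrative placeholders (assumptions).
TFVARS_FILE="${TFVARS_FILE:-terraform.tfvars}"

cat > "$TFVARS_FILE" <<'EOF'
prefix        = "jd"
region        = "us-west-2"
zone          = "us-west-2c"
instance_type = "g4dn.xlarge"
rke2_version  = "v1.30.1+rke2r1"
EOF

echo "Wrote $TFVARS_FILE"
```

The zone should belong to the chosen region (us-west-2c is in us-west-2), since a mismatch would typically only surface when Terraform tries to create resources.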
# Navigate to the AWS infra implementation directory
cd infra
# Initialize the working directory (downloads providers and modules)
terraform init -upgrade
# Preview the changes (highly recommended)
terraform plan
# Apply the configuration
terraform apply --auto-approve

To tear down the infrastructure and avoid costs:
terraform destroy --auto-approve
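
The apply step above uses --auto-approve, which skips the interactive review. As an alternative sketch, the plan can be saved to a file and exactly that plan applied after it has been inspected. The function names `make_plan` and `apply_plan` and the default file name `tfplan` are hypothetical choices for illustration, not something this module requires.

```shell
# Sketch: separate "plan and review" from "apply" using a saved plan file.
# make_plan and apply_plan are hypothetical helpers; run them from infra/.

make_plan() {
  plan_file="${1:-tfplan}"
  terraform init -upgrade &&
    terraform plan -out="$plan_file" &&
    terraform show "$plan_file"   # print the saved plan for review
}

apply_plan() {
  plan_file="${1:-tfplan}"
  # Applying a saved plan file runs without a confirmation prompt and
  # performs only the actions recorded in that plan.
  terraform apply "$plan_file"
}

# Example invocation (uncomment to run):
# make_plan tfplan
# apply_plan tfplan
```

Because the saved plan is applied as recorded, changes made to the configuration after planning are not picked up until a new plan is created.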