PyTorch Dataloader and Model Training
Aykhan Mahmudov (aykhan.mahmudov.std@bhos.edu.az)
Group IT 21
03.21.2025
1 Overview
This report documents the implementation and analysis of a PyTorch dataloader and model for a card
image classification task. We worked with a dataset containing 53 different classes of playing cards,
implementing a custom dataset class, modifying a pre-trained ResNet18 model, and training the model
with different learning rates. We then analyzed how the choice of learning rate affects model convergence
and performance, finding that a moderate learning rate of 0.05 provided the best balance between training
stability and convergence speed.
2 Dataset Analysis
After examining the dataset downloaded from Kaggle, we observed the following characteristics:
1. Number of Training Images: The dataset contains a large collection of playing card images
organized by class.
2. Number of Classes: The dataset includes 53 distinct classes, representing all 52 standard playing
cards plus the joker.
3. Data Structure: Images are organized in a hierarchical structure, with one folder per class
containing that class's card images (an indexing sketch follows).
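To make this concrete, such a folder hierarchy can be walked once to build parallel lists of image paths and integer labels. The sketch below is illustrative only: the helper name build_index, the root directory, and the accepted file extensions are assumptions, not the dataset's actual loading code.

import os

# A minimal sketch of indexing the class-folder layout; build_index and the
# accepted extensions are assumptions, not the report's actual code.
def build_index(root_dir):
    image_list, label_list = [], []
    class_names = sorted(os.listdir(root_dir))  # one sub-folder per card class
    for label, class_name in enumerate(class_names):
        class_dir = os.path.join(root_dir, class_name)
        for file_name in sorted(os.listdir(class_dir)):
            if file_name.lower().endswith((".jpg", ".jpeg", ".png")):
                image_list.append(os.path.join(class_dir, file_name))
                label_list.append(label)
    return image_list, label_list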
3 Dataset Class Implementation
We implemented the __getitem__ method for the custom dataset class to properly load and process
images:
def __getitem__(self, index):
    # Get the image path and label
    image_path = self.image_list[index]
    label = self.label_list[index]

    # Load the image from disk and convert it to RGB
    image = Image.open(image_path).convert('RGB')

    # Apply transformations if available
    if self.transforms:
        image = self.transforms(image)

    return image, torch.tensor(label)
This implementation ensures proper loading of images from disk, conversion to RGB format, application of transformations, and conversion of labels to PyTorch tensors.
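To illustrate how this __getitem__ is consumed, the dataset can be wrapped in a DataLoader. In the sketch below, the class name CardDataset, its constructor signature, the "train" root path, the transform choices, and the batch size are assumptions standing in for the report's actual setup.

from torch.utils.data import DataLoader
from torchvision import transforms

# Resize to the 224x224 input that ResNet18 expects, then convert to a tensor.
train_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# CardDataset stands in for the custom dataset class above; its name and
# constructor signature are assumptions, as is the "train" root path.
image_list, label_list = build_index("train")
train_dataset = CardDataset(image_list, label_list, transforms=train_transforms)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True, num_workers=2)

images, labels = next(iter(train_loader))
print(images.shape)   # torch.Size([32, 3, 224, 224])
print(labels.shape)   # torch.Size([32])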
4 Model Class Implementation
We modified the forward method of the ExModel class to implement the forward pass through the
network:
def forward(self, image):
    # Pass the image through the ResNet18 backbone
    features = self.resnet18(image)

    # Flatten the output to shape [batch_size, num_features]
    features = features.view(features.size(0), -1)

    # Pass the features through the classifier head
    out = self.classifier(features)

    return out
4.1 The Meaning of Forward in Model Class
The forward method in a PyTorch model class defines the computation performed at every call to the
model. It specifies how data flows through the network layers during the forward pass, transforming
input data (in this case, images) into output predictions (card classes). While PyTorch automatically
handles the backward pass for gradient computation, the forward pass must be explicitly defined.
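As a toy illustration (not the report's ExModel), the following shows that calling a module instance dispatches to forward through nn.Module.__call__, which is also where PyTorch triggers any registered hooks:

import torch
import torch.nn as nn

# A toy module: calling model(x) invokes forward() via nn.Module.__call__.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        return self.linear(x)

model = TinyNet()
x = torch.randn(3, 4)
out = model(x)           # preferred: dispatches to forward() and runs hooks
same = model.forward(x)  # works, but bypasses hooks; not recommended
print(out.shape)         # torch.Size([3, 2])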
4.2 Transfer Learning Significance
Transfer learning leverages knowledge gained from solving one problem and applies it to a different but
related problem. In our implementation, we used a pre-trained ResNet18 model that had already learned
feature representations from millions of images in the ImageNet dataset; a sketch of this adaptation
follows the list below. The benefits of this approach include:
1. Efficiency: Reduces training time significantly compared to training from scratch.
2. Performance: Often leads to better model performance, especially when training data is limited.
3. Feature Reuse: Lower layers of pre-trained networks have learned generic features like edges,
textures, and shapes that are transferable across image domains.
4. Optimization: Provides a better initialization point, helping to avoid poor local minima during
training.
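The sketch below mirrors the adaptation described above using a recent torchvision API. The exact replacement code is an assumption: only the overall approach (strip ResNet18's original 1000-class head and attach a new 53-class classifier) comes from the report.

import torch
import torch.nn as nn
from torchvision import models

# Load ResNet18 with ImageNet weights (torchvision >= 0.13 API assumed).
resnet18 = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

num_features = resnet18.fc.in_features    # 512 for ResNet18
resnet18.fc = nn.Identity()               # remove the original 1000-class head

classifier = nn.Linear(num_features, 53)  # new head for the 53 card classes

x = torch.randn(1, 3, 224, 224)
features = resnet18(x)                    # shape: [1, 512]
logits = classifier(features)             # shape: [1, 53]
print(logits.shape)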
5 Model Training and Learning Rate Analysis
We trained the model using three different learning rates to analyze their impact on model performance (the optimizer setup is sketched after the list):
1. High learning rate: 1.5
2. Medium learning rate: 0.05
3. Very low learning rate: 0.0000005
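The three runs differed only in the optimizer's learning rate. The report does not name the optimizer or loss function, so the SGD optimizer, cross-entropy loss, and the stand-in linear model below are assumptions; only the three learning rates are taken from the experiments.

import torch
import torch.nn as nn
import torch.optim as optim

learning_rates = [1.5, 0.05, 0.0000005]  # high, medium, very low

for lr in learning_rates:
    model = nn.Linear(512, 53)           # stand-in for the card classifier
    optimizer = optim.SGD(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()

    # One illustrative optimization step per learning rate
    inputs = torch.randn(8, 512)
    targets = torch.randint(0, 53, (8,))
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
    print(f"lr={lr}: loss after one step = {loss.item():.4f}")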
5.1 Training Results
Table 1: Learning Rate Comparison

Learning Rate   Training Loss         Validation Loss       F1 Score
1.5             Highly unstable       Highly unstable       Poor
0.05            Steadily decreasing   Good convergence      Best performance
0.0000005       Minimal decrease      Minimal improvement   Minimal improvement
5.2 Learning Rate Impact Analysis
1. High Learning Rate (1.5): The high learning rate caused unstable training with significant
fluctuations in loss values. The model struggled to converge, and the optimization process often
overshot optimal parameters, leading to poor performance metrics.
2. Medium Learning Rate (0.05): This learning rate provided the best results, showing a steady
decrease in training loss, good convergence on the validation set, and consistent improvement in F1
score. The optimization process remained stable while still making meaningful progress each epoch.
3. Very Low Learning Rate (0.0000005): With such a low learning rate, the model showed
minimal improvement over training epochs. Parameter updates were too small to make significant
progress within the given number of epochs, resulting in a model that was effectively under-trained.
6 Observations and Challenges
Throughout the implementation and training process, we encountered several noteworthy challenges and
learning points:
1. Dataset Loading Efficiency: Ensuring proper loading and pre-processing of images required
careful implementation to avoid memory issues.
2. Transfer Learning Adaptation: Modifying the pre-trained ResNet18 model required understanding how to properly remove the final classification layer and add a new one tailored to our
specific classification task.
3. Learning Rate Sensitivity: The experiments clearly demonstrated that model performance is
highly sensitive to learning rate selection, with orders of magnitude differences in learning rates
resulting in dramatically different training outcomes.
4. Balancing Batch Size: Finding the appropriate batch size to balance computational efficiency
with training stability was important, especially given the memory constraints of the training
environment.
5. Tensor Dimensionality: Ensuring correct tensor shapes throughout the data pipeline and model
forward pass required careful debugging and understanding of PyTorch's tensor operations (see the
sketch after this list).
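As a small illustration of that shape bookkeeping, the flattening step from the forward method can be checked in isolation. The batch size and ResNet18's 512-channel pre-head output shape below are standard conventions, not values reported above.

import torch

# Typical ResNet18 output before the classifier head: [batch, 512, 1, 1].
features = torch.randn(32, 512, 1, 1)
flat = features.view(features.size(0), -1)  # flatten all dims except batch
print(flat.shape)                           # torch.Size([32, 512])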
7 Conclusion
This project successfully implemented a complete PyTorch pipeline for image classification using transfer
learning. We completed the dataset class by implementing the __getitem__ method, modified the model
architecture to implement the forward pass, and trained the model with different learning rates.
The analysis of different learning rates clearly demonstrated the critical importance of this hyperparameter for model training success. The medium learning rate (0.05) provided the best balance between
progress speed and training stability, achieving superior performance compared to both higher and lower
learning rates.
The use of transfer learning through the pre-trained ResNet18 model proved to be an effective approach, allowing the model to leverage previously learned features while adapting to the specific characteristics of the card classification task.
This implementation and analysis provide valuable insights into the practical aspects of deep learning
model development and the impact of hyperparameter choices on training outcomes.