
Code and resources for LMFM-12 microalgae image classification, comparing seven CNN architectures and evaluating transfer learning performance


aimialina/LMFM-12


LMFM-12

A Morphologically Diverse Freshwater Microalgae Dataset for Deep Learning-Based Classification with Transfer Learning Analysis

Aimi Alina Binti Hussin, Mohd Ibrahim Shapiai, Shaza Eva Mohamad, Koji Iwamoto, Mohd Farizal Kamaroddin, Kazuhiro Takemoto

We introduce the Light Microscopy Freshwater Microalgae (LMFM-12) dataset, comprising 7,555 curated images from 12 species under multiple magnifications, the largest publicly available freshwater microalgae light microscopy dataset to date. Comprehensive evaluation of seven CNN architectures reveals that randomly initialized models achieve accuracies exceeding 98%, approaching the performance of fully fine-tuned ImageNet-pretrained networks. Through the first application of Singular Vector Canonical Correlation Analysis (SVCCA) to microalgae classification, we suggest that random initialization develops different representational strategies that may be more suited to microscopic morphology, contrasting sharply with ImageNet-adapted features. Despite achieving comparable accuracy, these divergent approaches suggest that effective microalgae classification emerges from learning specialized microscopic features rather than adapting generic visual patterns. Cross-domain evaluation reveals that while ImageNet pretraining achieves superior generalization performance, Grad-CAM++ analysis shows distinct attention patterns between ImageNet and LMFM-12 initialization strategies. This positions LMFM-12 as a useful resource for advancing automated microalgae classification research.

Keywords: microalgae dataset, transfer learning, dataset comparison, SVCCA, image classification


Sections in this paper:

  1. Comparative analysis of model performance across initialization strategies (RD, FT and FB)
  2. SVCCA analysis of hidden representations
  3. Effect of transfer learning on other publicly available phytoplankton datasets

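As a rough illustration of the SVCCA comparison in item 2, the sketch below computes a similarity score between two sets of layer activations with NumPy. It is a minimal sketch, not the code used in the paper: the function name `svcca_similarity`, the 99% variance threshold, and the use of the mean canonical correlation as the summary score are all assumptions made for illustration.

```python
import numpy as np


def svcca_similarity(acts_a, acts_b, var_kept=0.99):
    """Mean canonical correlation between two activation matrices.

    acts_a, acts_b: arrays of shape (n_samples, n_neurons), recorded on the
    same inputs. var_kept is an assumed threshold, not a value from the paper.
    """

    def reduce(x):
        # Step 1 (SV): center, then keep the top singular directions that
        # explain `var_kept` of the variance.
        x = x - x.mean(axis=0, keepdims=True)
        u, s, _ = np.linalg.svd(x, full_matrices=False)
        ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
        k = min(int(np.searchsorted(ratio, var_kept)) + 1, len(s))
        return u[:, :k] * s[:k]

    xa, xb = reduce(np.asarray(acts_a, dtype=float)), reduce(np.asarray(acts_b, dtype=float))

    # Step 2 (CCA): with orthonormal bases Qa and Qb of the two subspaces,
    # the canonical correlations are the singular values of Qa^T Qb.
    qa, _ = np.linalg.qr(xa)
    qb, _ = np.linalg.qr(xb)
    rho = np.linalg.svd(qa.T @ qb, compute_uv=False)
    return float(np.clip(rho, 0.0, 1.0).mean())
```

A score near 1 means the two layers span nearly the same subspace up to a linear transform; a low score indicates that the representations differ, which is the kind of contrast the abstract describes between random and ImageNet initialization.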
Models used are from 2 model libraries:

Timm:

  • MobileNet V2 (mobilenetv2_100)
  • DenseNet 121 (densenet121)
  • ResNeXt 50 (resnext50_32x4d)
  • ConvNeXt Base (convnext_base)
  • VGG 19 (vgg19_bn)

Torchvision:

  • ShuffleNet V2 (shufflenet_v2_x1_0)
  • EfficientNet V2 (efficientnet_v2_s)
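
The identifiers listed above can be passed to each library's model factory. The following is a minimal sketch of how that might look, assuming a 12-class head for the 12 LMFM-12 species; the helper names `library_for` and `build_model` are hypothetical and do not come from this repository. The `pretrained` flag switches between random initialization and ImageNet-pretrained weights.

```python
# Identifiers exactly as listed above, grouped by the library that provides them.
TIMM_MODELS = [
    "mobilenetv2_100",
    "densenet121",
    "resnext50_32x4d",
    "convnext_base",
    "vgg19_bn",
]
TORCHVISION_MODELS = ["shufflenet_v2_x1_0", "efficientnet_v2_s"]

NUM_CLASSES = 12  # LMFM-12 contains 12 species


def library_for(identifier):
    """Return the name of the library that provides a model identifier."""
    if identifier in TIMM_MODELS:
        return "timm"
    if identifier in TORCHVISION_MODELS:
        return "torchvision"
    raise KeyError(f"unknown model identifier: {identifier}")


def build_model(identifier, pretrained=False):
    """Hypothetical helper: build a model with a 12-class output layer."""
    if library_for(identifier) == "timm":
        import timm  # imported lazily so the lookup above works without it

        return timm.create_model(
            identifier, pretrained=pretrained, num_classes=NUM_CLASSES
        )

    import torch.nn as nn
    from torchvision.models import get_model

    model = get_model(identifier, weights="DEFAULT" if pretrained else None)
    if hasattr(model, "fc"):
        # ShuffleNet V2 exposes its final linear layer as `fc`.
        model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)
    else:
        # EfficientNet V2 ends with classifier = Sequential(Dropout, Linear).
        in_features = model.classifier[-1].in_features
        model.classifier[-1] = nn.Linear(in_features, NUM_CLASSES)
    return model
```

For example, `build_model("densenet121", pretrained=True)` would download the ImageNet weights on first use, while `build_model("shufflenet_v2_x1_0")` returns a randomly initialized network.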

If you find our code/dataset/evaluation useful in your research, please cite as follows:

Hussin, A. A., Eva Mohamad, S., Iwamoto, K., & Takemoto, K. (2026). LMFM-12 [Data set]. Zenodo. https://doi.org/10.5281/zenodo.17669912
