
Paper 67

The document presents a mobile application developed for classifying pine seeds into three varieties (Tecunumanii, Oocarpa, Pseudostrobus) using Convolutional Neural Networks (CNN) and Python data analysis. The application was built in three phases: data selection, model training, and implementation, achieving a classification accuracy of 90.32%. It aims to improve seed management in forest nurseries in Peru, addressing challenges in the seed production process.

Mobile application for pine variety classification in an Oxapampa nursery using CNN


Rodrigo Cesar Calle Castillo, Universidad Peruana de Ciencias Aplicadas, Lima, Perú, U201911952@upc.edu.pe [0000-0002-5979-9270]
Ana Maria Gissella Lescano Asto, Universidad Peruana de Ciencias Aplicadas, Lima, Perú, U201521499@upc.edu.pe [0000-0002-7677-8025]
Jymmy Stuwart Dextre Alarcon, Universidad Peruana de Ciencias Aplicadas, Lima, Perú, PCISJDEX@upc.edu.pe [0000-0002-1686-0510]

Abstract: Peru has large expanses of reforested hectares along its coast, with pine being one of the two most commonly used species for this practice. These plants originate from forest nurseries, which are centers for the reproduction and care of plants in their early stages of life before being transplanted to the field. A crucial stage in pine production is the classification of seeds by variety, which is fundamental for plant development. This influences the planting destination, subsequent care, and the transplant environment. In this article, we present a mobile application (app) for classifying pine seeds by variety. The app uses a model based on Convolutional Neural Networks (CNN) designed to distinguish and predict among the Tecunumanii, Oocarpa, and Pseudostrobus pine varieties. The mobile application was built in three phases: Data Selection, Model Training, and Development and Implementation. The model achieved an accuracy of 90.32% in classifying pine seeds, demonstrating effectiveness and potential as a useful tool in managing the seed classification process in forest nurseries.

Keywords: Seeds classification, Pine nuts classification, Artificial intelligence, Convolutional Neural Network, Machine Learning

I. INTRODUCTION

The reforestation of pines in Peru is an important practice in the forestry sector, but the seed production process lacks adequate management.

The pine seed production process in Peru faces challenges that jeopardize the quality of forest plantations and the sustainability of the resource. It was found that 83% of seeds are collected informally and without control from natural forests, without knowing the genetic value or the actual species, directly affecting the quality and variety of seeds distributed in the country [1]. Additionally, poor seed classification by variety can lead to seed mixing during production, negatively impacting seed purity and, consequently, market stability and agricultural yields [2]. Pine seeds have high morphological similarity, making variety classification difficult [3]. Moreover, due to the lack of traceability information in distribution flows, the origins are uncertain and the stated variety cannot be relied on [4]. These difficulties generate uncertainty about the quality and legitimacy of seeds, negatively impacting forest conservation [5] and reducing economic benefits in the forestry sector [6].

There are many studies on using artificial intelligence, machine learning, and deep learning techniques for seed classification. Among the findings, [4] presents a study in Turkey where a mobile application based on CNN architectures was developed for seed variety selection, achieving up to 99% accuracy in validation. Another study [7] compared various classification methods using CNN, achieving a maximum accuracy of 99.9% and recommending the development of mobile applications for this purpose. However, Peru lacks solutions to classify pine seeds by the varieties produced and planted in the country. Therefore, our proposal is to develop a mobile application for seed classification to determine three species of pine seeds (Tecunumanii, Oocarpa, Pseudostrobus) using CNNs with artificial intelligence libraries and Python data analysis for the classification model and React Native for mobile application development.

The development will occur in three phases: (1) data selection, (2) model training, and (3) development and implementation of the app.

This work is organized into sections as follows: Section 2 presents an analysis of related articles, Section 3 shows the proposed model, Section 4 addresses the validation, Section 5 presents the results, and finally, Section 6 outlines the conclusions and future work.

II. RELATED WORKS

This section presents the literature review on seed classification using CNNs in mobile applications. It consists of three phases: planning, development, and findings. During the planning phase, the following were carried out: (1) formulation of research questions, (2) selection of scientific databases, (3) definition of keywords, and (4) definition of inclusion and exclusion criteria. The defined research questions were: Q1: What techniques are currently used by researchers to classify seeds using CNNs? Q2: What are the phases of the proposed techniques? Q3: What algorithms do the found approaches use to build a seed classification solution? Q4: What technologies were found for the development of mobile applications for seed classification?

The defined keywords were: 'seed classification,' 'convolutional neural networks,' 'mobile application,' 'CNN,' 'image classification,' and 'species recognition.' The selected databases for the article search were Scopus, MDPI, and Web of Science.

979-8-3503-7834-4/24/$31.00 ©2024 IEEE


TABLE I. RESEARCH QUESTIONS ARTICLES

Category | References | Quantity
Technique (Q1) | [2] [8] [9] [10] [11] [12] [13] [3] [14] [15] [16] [17] | 12
Phase (Q2) | [9] [11] [14] | 3
Algorithms (Q3) | [18] [11] [9] [14] [10] [3] [2] | 8
Mobile Applications (Q4) | [4] [14] [9] | 3

Regarding techniques, we found 3: Computer Vision [2] [8], CNN (Convolutional Neural Networks) [9] [10] [11] [2] [13] [14] [16] [3], and NIR (near-infrared) [3] [17]. CNNs are considered the most important for our study, as they involve the use of non-invasive techniques that can be adapted to mobile developments due to the wide variety of architectures available. This will help define the technique to be used in the classification of pine seeds in the application to be developed.

Regarding the phases, we found 2 sequences. In the first sequence, the phases converge into three stages for seed classification: (1) dataset formation, (2) training, and (3) testing [9] [14]. In the second sequence, these three phases are supplemented by an additional phase called "Image Feature Extraction" [11].

Regarding the algorithms, we found 5: kNN (k-Nearest Neighbors) [18], DT (Decision Tree) [18] [3], Naïve Bayes [18], Bayes Net [11], and CNN [9] [14] [10] [3] [2]. The most prominent and by far the most studied, with better results for seed classification through images, is CNN. This provides us with a clear path for selecting the algorithm to implement in the selection of pine seeds by variety.

Regarding mobile applications, we found 2: the first one in Python [4] and the second in Java [14]. Both articles use the TensorFlow machine learning library as part of their implementation.

Our analysis of related works concludes that the use of Convolutional Neural Networks (CNN) is closely related to seed classification in general. Architectural variations of this model, like the base model, achieve a high accuracy percentage of over 90% in seed classification. On the other hand, a modern approach will be presented for the development of the application; scientific articles related to the development of mobile applications for seed classification do not provide definitive results on which technology is best suited for this case [14] [9] [4].

III. PROPOSED MODEL

Our proposal is to develop a mobile application based on a Convolutional Neural Network (CNN) model of our own design. The model used by our application was selected by comparing it with models found in related works, all tested under the same scenario and data.

TABLE II. CNN MODEL ACCURACY PERFORMANCE COMPARISON

Model | Test Accuracy % | Test Loss
Own Custom CNN | 97.3125 | 0.14804404973983765
ResNet50 | 96.9785 | 0.1754242479801178
VGG16 | 92.1875 | 0.6280940771102905

The system was implemented in 4 phases: (1) collection of pine seed images, (2) development and training of the classification model, (3) design, development, and implementation of the mobile application, and (4) validation of results. The mobile application is developed under a REST architecture consisting of a Frontend client that displays data to the user and backend logic responsible for processing images, predicting the variety of seeds, providing that information to the Frontend, and storing it in the database as required. In addition to seed classification functionality, the application provides a supplier management system, worker management, and records of classifications performed.

Fig 1. Proposed Mobile Application for Pine Seed Classification

A. Phases of Proposed Model

1) Phase 1. Data Selection: This phase aims to collect a diverse set of images of pine seeds from varieties Pseudostrobus, Oocarpa, and Tecunumanii for use in training, testing, and validating the CNN model. It is divided into two sub-phases:

a) Image Collection: High-definition photographs of three varieties of pine seeds were taken using a 100x magnification lens. The collected images belong to the varieties Oocarpa, Tecunumanii, and Pseudostrobus. All images were captured against a white background, using natural light without any artificial lighting assistance.

b) Dataset Construction: Here, the collected images were distributed into three folders: 80% of the images for training, 10% for testing, and 10% for validation. Each folder has three subfolders named after the seeds to be classified (Oocarpa, Tecunumanii, and Pseudostrobus), which contain the corresponding images of the seed variety.

TABLE III. DATASET DISTRIBUTION

Percentage of images by tags | Tags | Varieties
80% | Training | Tecunumanii, Oocarpa, Pseudostrobus
10% | Testing | Tecunumanii, Oocarpa, Pseudostrobus
10% | Validation | Tecunumanii, Oocarpa, Pseudostrobus

2) Phase 2. Model Training: This phase aims to develop a pine seed classification model based on the base architecture of CNN using the collected images. It is divided into three sub-phases.

a) Training: This stage marked the beginning of the development of the Machine Learning model, specifically Convolutional Neural Networks (CNN). To carry out this process, the Python programming language was used along with specialized libraries in data science and artificial intelligence: Matplotlib, Keras, and TensorFlow.
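The 80/10/10 distribution from the Dataset Construction sub-phase can be sketched in plain Python as follows. This is a minimal illustration, not the authors' script: the function names `split_dataset` and `build_layout`, the fixed seed, and the split labels are assumptions, and the sketch works on file names only rather than copying image files.

```python
import random

VARIETIES = ("Oocarpa", "Tecunumanii", "Pseudostrobus")


def split_dataset(image_names, seed=42):
    """Shuffle one variety's images and split them 80/10/10 (hypothetical helper)."""
    names = sorted(image_names)            # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(names)     # local RNG; global random state is untouched
    n_train = len(names) * 80 // 100       # integer arithmetic avoids float rounding
    n_test = len(names) * 10 // 100
    return {
        "training": names[:n_train],
        "testing": names[n_train:n_train + n_test],
        "validation": names[n_train + n_test:],   # remainder goes to validation
    }


def build_layout(images_by_variety, seed=42):
    """Return {split: {variety: [image names]}}, mirroring the folder structure."""
    layout = {"training": {}, "testing": {}, "validation": {}}
    for variety in VARIETIES:
        parts = split_dataset(images_by_variety.get(variety, []), seed)
        for split_name, names in parts.items():
            layout[split_name][variety] = names
    return layout
```

Splitting each variety separately keeps the three classes in the same proportion in every folder, which a single shuffle over all images would not guarantee.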

Fig 2. Definition of Image Parameters for CNN Model Training

The constants shown at the top determine the image size, the number of channels (RGB), batch size, and number of epochs (the number of times the model will see the complete dataset).

Fig 3. Image resizing function

These layers resize the images to IMAGE_SIZE x IMAGE_SIZE and scale their values between 0 and 1. Data augmentation techniques are also added to improve the model's generalization.
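Since the figures are not reproduced here, the effect of the resizing and scaling step can be illustrated in plain Python. This is a sketch of what such layers compute, not the Keras code used in the paper: the value of IMAGE_SIZE, the nearest-neighbour sampling, and the assumption of 8-bit pixel values (0 to 255) are all illustrative choices, and data augmentation is left out.

```python
IMAGE_SIZE = 256   # assumed value; the paper's constant is not shown in the text
CHANNELS = 3       # RGB


def resize_nearest(image, size):
    """Resize a row-major image (list of rows of pixels) to size x size
    using nearest-neighbour sampling."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[r * src_h // size][c * src_w // size] for c in range(size)]
        for r in range(size)
    ]


def rescale(image):
    """Map 8-bit channel values (0..255) to floats in [0, 1]."""
    return [[tuple(v / 255.0 for v in pixel) for pixel in row] for row in image]


def preprocess(image, size=IMAGE_SIZE):
    """Resize first, then rescale, so every input has the same shape and range."""
    return rescale(resize_nearest(image, size))
```

Applying the same two steps at training time and at prediction time keeps the inputs seen by the model consistent.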
Fig 4. CNN Model

This process defines and constructs a convolutional neural network model for image classification, which includes multiple convolutional and max-pooling layers for feature extraction, followed by dense layers for final classification. Below is the entire flow of the model training.

Fig 5. Training Model Flow

b) Metrics: Two metrics were used for the developed neural networks: "accuracy" and "loss," which refer to the accuracy value in seed classification and the loss value respectively.

The "accuracy" is calculated by comparing the model's predictions with the true labels in the training set.

$$\text{accuracy} = \frac{\text{Number of correct predictions}}{\text{Total number of examples}}$$

The "loss" is calculated by applying the loss function to the model's predictions on the training set. It is defined as:

$$L(y, \hat{y}) = -\sum_{i=1}^{C} y_i \log(\hat{y}_i) \quad (1)$$

Where in (1):
• $y$ is the vector of true labels
• $\hat{y}$ is the vector of predicted probabilities by the model
• $C$ is the number of classes
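The two metrics can be written out directly from their definitions. The sketch below is illustrative only: the function names and the small `eps` guard against log(0) are assumptions, and a training framework would typically report the loss averaged over a batch rather than for a single example.

```python
import math


def accuracy(true_labels, predicted_labels):
    """Fraction of examples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Loss (1) for one example: y_true is a one-hot vector of length C,
    y_pred holds the C predicted probabilities."""
    # eps keeps log() finite if the model outputs an exact zero
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))
```

For a one-hot label only the term of the true class is non-zero, so the loss reduces to the negative logarithm of the probability assigned to the correct variety.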
Fig 6. Metric Values During Training

c) Results: As a result of this training phase, accuracy values above 90% were obtained with the collected images.

Fig 7. Accuracy Result

The confusion matrix makes it possible to visualize the classification errors and determine which seed varieties are being confused most frequently.

Fig 8. Confusion Matrix

Similarly, a file with the extension .h5 was generated. This file encapsulates the CNN model, including the parameters and weights learned during training. This type of file (.h5) is commonly used to store trained models in Python, facilitating their subsequent loading and use in different applications or environments. This .h5 file is the final product of the development and training phase of our model, ready to be used in the next phase.

3) Phase 3. Development and Implementation of the Application: This phase aims to design, develop, and implement an intuitive mobile application that allows users to capture and classify images of pine seeds in real time using the camera of a mobile device. It integrates the classification model efficiently to provide a smooth and accurate user experience. It consists of three sub-phases.

a) Mobile Application Design: The design was created using the interface design application Figma and consists of 6 interfaces dedicated to user authentication, seed classification management, and seed supplier management.

Fig 9. Prototypes of Proposed Mobile Application

b) Mobile Application Development: The development of the application was carried out using functional programming and a component-based architecture supported by the TypeScript programming language and the React Native cross-platform mobile development library with Expo. The main business logic implemented in the mobile application pertains to seed classification. When creating a pine seed classification session, users can take photographs that the application processes through the CNN model, which is consumed and processed in the application logic developed in FastAPI using Python. The application then provides a result that is stored in the database. The process of seed classification through photographs repeats until the user closes the classification session.

Fig 10. Seed Classification API in FastAPI

Fig 11. Service Consuming the Classification Model in FastAPI

Fig 12. Consuming the Pine Seed Classification API from React Native
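The step between the model output and the result returned to the app can be sketched without any web framework. This is a hypothetical illustration, not the service shown in the figures: `CLASS_NAMES`, its order, `softmax`, `to_result`, and the shape of the returned dictionary are all assumptions.

```python
import math

# Assumed class order; it must match the order used when the model was trained.
CLASS_NAMES = ("Oocarpa", "Pseudostrobus", "Tecunumanii")


def softmax(scores):
    """Turn raw scores into probabilities (only needed if the model outputs raw scores)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]   # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def to_result(probabilities, class_names=CLASS_NAMES):
    """Pick the most probable variety and report its probability as a percentage."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return {
        "variety": class_names[best],
        "confidence": round(100 * probabilities[best], 2),
    }
```

A backend endpoint could return a dictionary of this kind as JSON, which the mobile client would then display and store with the classification session.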

IV. VALIDATION

To validate the pine seed classification application by variety, three experiments were conducted. Experiment 1 aimed to determine if the application correctly classifies Tecunumanii pine seeds. Experiment 2 aimed to ensure correct classification of Oocarpa pine seeds, while Experiment 3 focused on verifying the correct classification of Pseudostrobus pine seeds.

For these experiments, the following setups were used: 50 grams of Tecunumanii pine seeds (Container 1), 50 grams of Oocarpa pine seeds (Container 2), and 50 grams of Pseudostrobus pine seeds (Container 3).

Fig 13. Pine Seed of the Tecunumanii, Oocarpa and Pseudostrobus Varieties

For all three experiments, the following tools were used: an Android smartphone, a x100 magnification lens, and a white cardboard canvas.

A. Experiments: All experiments are explained together in the following paragraphs, as they share the same experimental phases and differ only in their objectives (determining different varieties of pine seeds).

1) Software Setup: For the validation of the proposed application, the developed application was installed on a Moto g(30) smartphone.

Fig 14. Capturing an image through the proposed application

2) Hardware Setup: To capture the morphological details of the pine seeds, a x100 magnification lens, model 100x Microscope by APEXEL, was used.

Fig 15. X100 Magnification Lens

Fig 16. Smartphone with x100 Magnification Lens

3) Scenario Setup: In a naturally lit space, a white canvas was placed. This served as a background to rest the pine seeds and perform the classification by capturing photographs with the smartphone camera using the application.
Fig 17. Validation Scenario

4) Classification: Upon opening the application, the classification login functionality was used. This allows for the classification of seeds by variety through photographs taken via the application, storing the information of the results. The 50 grams of seeds determined in each scenario were used for each experiment.

Fig 18. Capturing an image through the proposed application

Fig 19. Seed Classification through the application

The classification information was stored in the Firestore NoSQL database, and a summary of these results could be viewed through the "Classification Details" view within the application.

A seed was considered correctly classified only when the result of the classification corresponded to the variety being analyzed. For example, in Experiment 1, if processing the photograph of scenario 1 resulted in the application indicating that the seed was of the Oocarpa variety, this was considered a failed result. However, if the result indicated that the seed was of the Tecunumanii variety, it was determined to be a successful result, as scenario 1 contained only Tecunumanii seeds. The same logic was applied to the other experiments.

B. Validation of Experiments: The values obtained from the experiments were considered to determine the effectiveness of the application as a tool for classifying pine seeds by variety. The following formula was used to determine the effectiveness.

$$r = \frac{a}{p} \times 100\% \quad (2)$$

Where in (2):
$r$: result
$p$: number of seeds evaluated
$a$: number of seeds correctly classified

V. RESULTS

This section presents the results obtained from the test conducted under the scenario described at the end of the previous section.

In Experiment 1 (classification of 50 grams of Tecunumanii seeds), tests were conducted with a total of 3,227 seed units, and the application reached an accuracy of 86.98%, correctly identifying the variety in a total of 2,807 units.

TABLE IV. RESULTS OF THE FIRST EXPERIMENT WITH TECUNUMANII SEEDS

Seed Variety | Number of Seeds Evaluated | Number of Seeds Correctly Classified | Accuracy Percentage
Tecunumanii | 3,227 | 2,807 | 86.98%

In Experiment 2 (classification of 50 grams of Oocarpa seeds), tests were conducted with a total of 3,494 seed units, and the application reached an accuracy of 95.99%, correctly identifying the variety in a total of 3,354 units.

TABLE V. RESULTS OF THE SECOND EXPERIMENT WITH OOCARPA SEEDS

Seed Variety | Number of Seeds Evaluated | Number of Seeds Correctly Classified | Accuracy Percentage
Oocarpa | 3,494 | 3,354 | 95.99%

In Experiment 3 (classification of 50 grams of Pseudostrobus seeds), tests were conducted with a total of 3,659 seed units. The application reached an accuracy of 87.97%, correctly identifying the variety in a total of 3,219 units.
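Formula (2) can be applied to the counts reported for the three experiments as a worked check. The sketch below is illustrative; the names `effectiveness`, `RESULTS`, `mean_of_varieties`, and `pooled` are hypothetical, and the two ways of combining the experiments are shown side by side because the text does not state which one produces the overall figure.

```python
def effectiveness(successes, evaluated):
    """Formula (2): r = a / p * 100."""
    return successes / evaluated * 100


# (a, p) pairs as reported for Experiments 1 to 3
RESULTS = {
    "Tecunumanii": (2807, 3227),
    "Oocarpa": (3354, 3494),
    "Pseudostrobus": (3219, 3659),
}

per_variety = {name: effectiveness(a, p) for name, (a, p) in RESULTS.items()}

# Unweighted mean of the three per-variety percentages
mean_of_varieties = sum(per_variety.values()) / len(per_variety)

# Pooled accuracy: all correctly classified seeds over all evaluated seeds
total_a = sum(a for a, _ in RESULTS.values())
total_p = sum(p for _, p in RESULTS.values())
pooled = effectiveness(total_a, total_p)
```

With these counts, the unweighted mean of the three percentages rounds to 90.32%, the overall accuracy the paper reports, while pooling all 10,380 seeds gives about 90.37%. The difference is small and arises because the three experiments evaluated different numbers of seeds.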
TABLE VI. RESULTS OF THE THIRD EXPERIMENT WITH PSEUDOSTROBUS SEEDS

Seed Variety | Number of Seeds Evaluated | Number of Seeds Correctly Classified | Accuracy Percentage
Pseudostrobus | 3,659 | 3,219 | 87.97%

The accuracy percentage corresponds to the effectiveness of the model's use through the application concerning the seed variety. We can see that the accuracy percentage is generally high but significantly different for the Tecunumanii variety compared to the Oocarpa variety. However, it is still much higher than the manual selection result provided by the forest nursery, which has an average accuracy percentage of 75.4%.

TABLE VII.

Number of Seeds Evaluated | Number of Seeds Correctly Classified | Accuracy Percentage
10,380 | 9,380 | 90.32%

VI. CONCLUSIONS AND FUTURE WORK

The research allowed us to achieve positive results in classification, with accuracy rates of 86.98% for the first scenario, 95.99% for the second scenario, and 87.97% for the third and final scenario. This resulted in an average classification accuracy of 90.32% through the application. This means that the application has a high level of confidence and can be used to support the seed selection process in a pine seedling production environment of the Tecunumanii, Pseudostrobus and Oocarpa varieties, especially in forest nurseries.

Various studies were found in the literature for seed classification in general and pine seeds specifically. However, none of them presented a classification tool for Tecunumanii, Pseudostrobus, and Oocarpa pine seed varieties.

This study proposed a mobile application for pine seed classification for the three most used varieties in Peru. The application was developed in three phases: (1) data selection, (2) model training, and (3) application development and implementation.

Three experiments were conducted to validate the application at a forest nursery in the province of Oxapampa, Pasco department, Peru. Experiment 1 validated the classification of 50g of Tecunumanii pine seeds using the application. Experiment 2 validated the classification of 50g of Oocarpa pine seeds, and Experiment 3 did the same for Pseudostrobus pine seeds.

As future work, we encourage strengthening the classification model by training it with images of pine seeds in different scenarios, including various types of light (artificial light among them), different backgrounds on which the seed is placed, and varieties of pine seeds not used in the research, such as Pinus Patula, Caribaea, and Radiata, among others. We also want to expand the test scenario to the regions of Cusco and Huancavelica, places where there are pine production centers. This will make the application a more robust system that can be used in different situations not included in the study.

ACKNOWLEDGMENTS

We thank the Universidad Peruana de Ciencias Aplicadas, the Faculty of Engineering, and the research department for their support. We also extend our gratitude to our advisors for their guidance throughout this research. We thank our families for their unwavering support and the company that provided the necessary resources for the validation stage. This work would not have been possible without the support of all those mentioned.

REFERENCES

[1] INIA, «Las Semillas Forestales en el Perú: Desafíos y Oportunidades,» Instituto Nacional de Innovación Agraria, La Molina, 2017.
[2] A. N. F. F. O. N. N. Amin Taheri-Garavand, «Automated in situ seed variety identification via deep learning: a case study in chickpea,» Plants, no. 7, 2021.
[3] J. L. J. J. J. L. D.,. J. M. Y. Z. &. Y. Z. Biaosheng Huang, «Applications of machine learning in pine nuts classification,» Scientific Reports, no. 8799, 2022.
[4] S. T. Yusuf Basol, «A Deep Learning-Based Seed Classification With Mobile Application,» Turkish Journal of Mathematics and Computer Science, vol. 9, no. 1, pp. 192-203, 2021.
[5] S. B. H. D. Servet Caliskan, «Effects of geoclimatic factors on the,» Balekoglu et al. Ecological Processes, vol. 9, no. 55, 2020.
[6] 2. H. Z. H. I. P. B. D. S. K. Mujib Rahman Ahmadzai 1, «The Societal and Economic Impact of Reforestation Strategies and Policies in Southeast Asia,» Forest, vol. 14, no. 1, 2022.
[7] Y. H. A. B. S. A. A. A. L. J. Yonis Gulzar, «Lipid Profile Quantification and Species Discrimination of Pine Seeds through NIR Spectroscopy: A Feasibility Study,» Foods, vol. 11, no. 23, 2020.
[8] S.-H. M. A. F. J. V. A. M. Shima Javanmardi, «Computer-vision classification of corn seed varieties using deep convolutional neural network,» Journal of Stored Products Research, vol. 92, 2021.
[9] Y. H. A. B. S. A. A. A. a. L. J. Yonis Gulzar, «A Convolution Neural Network-Based Seed Classification System,» Symmetry, vol. 20, pp. 2-18, 2020.
[10] E. Dönmez, «Enhancing classification capacity of CNN models with deep feature selection and fusion: A case study on maize seed classification,» Data & Knowledge Engineering, vol. 141, 2022.
[11] S. Q. W. K. M. S. B. B. S. N. S. R. F. J. C. C. a. S. A. Aqib Alia, «Machine learning approach for the classification of corn seed using hybrid features,» International Journal of Food Properties, vol. 23, no. 1, pp. 1110-1124, 2020.
[12] S. T. Yusuf Basol, «A Deep Learning-Based Seed Classification With Mobile Application,» Turkish Journal of Mathematics and Computer Science, vol. 13, no. 1, pp. 192-203, 2021.
[13] L. P. Y. G. J. W. J. M. F. H. &. L. Y. Jianing Ma, «Application of Hyperspectral Imaging to Identify Pine Seed Varieties,» Journal of Applied Spectroscopy, no. 8, pp. 916-923, 2023.
[14] Y. Hamid, S. Wani, A. B. Soomro, A. A. Alwan and Y. Gulzar, «Smart Seed Classification System based on MobileNetV2 Architecture,» IEEE, pp. 217-222, 2022.
[15] M. Ahmed, J. Yasmin, E. Park, G. Kim, M. Kim, C. Wakholi, C. Mo and Cho, «Classification of Watermelon Seeds Using Morphological Patterns of X-ray Imaging: A Comparison of Conventional Machine Learning and Deep Learning,» Sensors, vol. 20, no. 23, 2020.
[16] T. S. X. J. Y. Z. Hao Wu, «Sunflower seeds classification based on sparse convolutional neural networks in multi-objective scene,» Scientific Reports, vol. 12, 2022.
[17] L. D. F. C. Jun Zhang, «Classification of Frozen Corn Seeds Using Hyperspectral VIS/NIR Reflectance Imaging,» Molecules, vol. 24, no. 1, 2019.
[18] C. D. R. Andrea Loddo, «On the Efficacy of Handcrafted and Deep Features for Seed Image Classification,» Journal of Imaging, no. 9, p. 171, 2021.
[19] P. Xu, Q. Tan, Y. Zhang, X. Zha, S. Yang and Yang, «Research on Maize Seed Classification and Recognition Based on Machine Vision and Deep Learning,» Agriculture, vol. 12, 2022.
[20] B. Trugrul, «Classification of Five Different Rice Seeds Grown in Turkey with Deep Learning Methods,» Comunication Series, vol. 64, no. 1, pp. 40-50, 2022.
