AI Image Recognition Advances
Image recognition with machine learning
Parth Kumar Thakur

E-ISSN: 2708-4507
P-ISSN: 2708-4493
IJEM 2022; 2(1): 51-56
© 2022 IJEM
www.microcircuitsjournal.com
Received: 13-01-2022
Accepted: 16-02-2022

Keywords: Facial recognition, machine learning, edge detection, pattern recognition, deep fake
1. Introduction
Machine learning has progressed significantly over the past few decades, overcoming many challenges and becoming capable of accurately recreating complicated 3D structures such as the human face. The technology has continued to improve in recent years, extending from simple detection models that work on objects to more sophisticated methods that analyse visual input in order to recreate, enhance, modify and process the information presented to the system. The capabilities of these algorithms and neural networks have increased further where they have been given datasets produced by earlier neural networks to work with and to refine their results. Deepfake images, videos and other multimedia have arisen alongside this technology, and so, over the past few years, has Artificial Intelligence geared towards detecting these deep fakes and telling them apart from genuine media. One notable application is taking underwater images and using neural networks to remove the water, allowing for a clearer view [1]. Another is the use of multiple images to create a 3D render and model of the scene they show, so that data can be collected and processed with ease [3]. With the increasing demand for self-driving cars, and the accompanying need for safety, better resource and traffic management and decreased pollution, we need better AI models capable of learning faster than ever [4].
The definition of machine learning as a term varies with the context, but in image recognition the underlying idea is always to take images and process them to obtain data. Machine learning applied to security camera footage has made it easier for investigators to detect and tell apart criminals, and many security approaches use machine learning and facial recognition for investigation and criminal identification [7]. Biometric analysis and forensic reports have also begun to rely on image recognition to pick out clues and information that might otherwise have been lost, or to process infrared imaging and ultrasonic mapping for a more thorough investigation of the crime scene. Lastly, perhaps the largest use of machine learning has been in product inspection for objects such as microprocessors, automobiles, and agricultural and food produce [8]. Machine learning has automated such inspection, especially with regard to small and easy-to-miss errors such as those found in microprocessors.
A large part of image processing relates to machine learning and the development of Artificial Intelligence (AI).

Correspondence
Parth Kumar Thakur
School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, Tamil Nadu, India
International Journal of Electronics and Microcircuits                                                      www.microcircuitsjournal.com
Machine vision has become an increasingly important component in the development of neural networks, as it has turned out to be one of the easiest ways of obtaining information that can be taught to a neural network. Its improved capability and its application in robotics, alongside algorithms that can recreate faces, images, geographical locations, terrain, ecological and global maps and art, and that help with image enhancement and video editing, have made Machine Vision synonymous with Artificial Intelligence in the modern world. Neural networks are becoming increasingly capable at processing and analysing images, with real-time application and analysis extending the reach of the field. Many machines, robots and systems already make use of this technology, which is driving rapid growth. From the largest sector to the smallest, machine vision and artificial intelligence are helping to improve development and lifestyle all over the globe. The increasing popularity of self-driving cars is also expected to make traffic accidents rarer and, as electric cars become more prominent alongside them, to help reduce carbon emissions from vehicles.
With applications that keep growing, there is an increasing need for research into better neural networks and machine learning for image processing and machine vision, in order to better develop, understand and improve upon this technology.

2. Neural Network Training Models for Image Processing
Computer vision systems use technology based on image recognition, relying primarily on cameras and other visual devices to obtain data. The machine learning algorithm processes the data inside a feedback loop that checks the accuracy of each detection and assigns feedback on that basis. The algorithm takes the feedback in, and the process is applied to a larger data set, passed through multiple algorithms and run over many iterations to improve accuracy. This method inevitably involves humans, who actively train the machine and provide it with feedback on the accuracy of its recognition. The most popular means of providing the dataset so far has been the modern Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) [5]. Originally intended as a means of telling humans and bots apart, the image recognition CAPTCHA has become a way to provide an accurately labelled data set and to train modern AI through human feedback. The efficacy of this method is likely to make current image recognition CAPTCHAs obsolete, as machines become just as capable as humans at telling objects apart, if not better.
Another emerging method for training neural networks to recognise and tell apart images, including better edge recognition, segmentation and pixel counting, is the use of other machine learning algorithms to train newer ones. Training image processing models on synthetic datasets has opened up a much larger and more flexible source of training data for neural networks. One application is machine learning algorithms that recognise and reconstruct faces, which perform better with a synthetic data set than with the limitations of real images, as shown in Table 1 [6].

Table 1: Real vs Synthetic data set for facial recognition algorithms

Venue / training data   Common NME   Challenging NME   Private FR 10%
CVPR'17                 -            -                 3.67
CVPR'18                 2.98         5.19              0.83
ICCV'19                 2.72         4.52              0.33
CVPR'19                 3.56         6.67              -
ICCV'19                 3.19         6.87              -
CVPR'20                 3.36         5.74              0.17
(real)                  3.37         5.77              1.17
(synthetic)             3.09         4.86              0.50

The data present synthetic training as a faster and more efficient way to train new facial recognition algorithms. It allows for improved facial landmark processing and, because the database is synthetic, removes hindrances such as lighting and angle constraints. This approach also provides greater control over the fidelity of the data and allows for a much more varied dataset that can be generated at scale, tweaked, modified and iterated upon.
A similar approach can be seen in image recognition performed in virtual simulations for self-driving cars [4]. The same simulations can be used to train humans, providing a safer way for learner and beginner drivers to get used to handling a new and as yet unfamiliar vehicle. Random terrain generation combined with image recognition and physics simulation allows realistic models of real-world terrain to be created for these simulations.

3. Applications of Machine Vision and Image Processing
There are an increasing number of applications for computer vision, ranging from data collection, information processing, facial recognition and landmark recognition to real-time self-driving cars, image modification and much more [25-42]. In this section, some of the major applications of this technology are highlighted.

Facial recognition
Neural networks and AI play a massive role in facial recognition, which is one of the primary uses of image recognition. It is a technology whose usage has been expanding rapidly, from biometrics and identification to information and data analysis and graphics rendering. Deep fakes and sophisticated video editing allow extremely realistic renders and videos to be produced by training an AI on enough source images of a person. Alongside its many practical uses, this has raised concerns about abuse and misuse, and about an individual's privacy and security on the internet, more than ever before. Facial recognition is also used in government and military airports [9] as a means of keeping track of arriving and departing passengers and personnel. It is used for face ID unlocking in modern smartphones [10], a popular and widely used security method, though with concerns that it can sometimes be tricked using an image of a person; this has led to the development of infrared and 3D face mapping, which builds a 3D face database of a person and is designed not to be tricked by a flat image. The same technology can be
seen in certain college classrooms, where it is used to take attendance. It is used on social media platforms like Facebook [11]. Cameras and other devices come with facial recognition software to take better images and to improve lens focus [6]. Facial recognition databases also play a major role in crime investigation and modern-day law enforcement [7]. Facial recognition is used in automobiles to prevent car theft and increase security. There is also increasing use of facial recognition in marketing, for better ad targeting and more effective use of resources [12].

Bar code reading
A major use of image recognition is barcode and QR code reading, where an image encodes a piece of data that can then be decoded by the application reading the image. It is often used to store data on physical goods and products in grocery stores and on other physical appliances [13]. Barcodes provide a compact way to store data physically on a device, allowing people to track specific devices, their dates of manufacture, status, location of origin and product descriptions with a barcode reader through any local system. This makes distribution and tracking far easier than with earlier methods, and barcodes have been widely adopted as a means of product management and tracking. The technology has evolved further in the form of QR codes, which use a different image coding format to store information.

Vision guided robots
Vision guided robots (VGR) play an important role in multiple sectors, with applications ranging from medical and surgical robots that improve the procedure and safety of an operation to drones and environmental scanning robots for rescue operations. Vision guided robots also play a big role in military applications and guided missiles, alongside anti-aircraft and missile security systems meant to take out unidentified aircraft navigating the borders or entering a country's airspace without authorisation. They are also used for inspection in the manufacturing and processing industry [14]. This field also covers self-driving cars, otherwise known as autonomous vehicles (AV), which have seen a massive increase in popularity, with large tech companies like Tesla spreading the technology and making it more feasible than ever before [4]. The same technology is used in security and traffic cameras to detect collisions and inform medical or police services if required [18]. Speeding fines can also be issued automatically based on the license plate of the speeding vehicle [19]. Alongside these practical applications, robotics and animatronics have developed massively, with autonomous robots capable of mimicking conversation by using vision guidance to study their environment, the people taking part in the conversation and the appropriate responses to a given situation. Other robots help speed up manufacturing and reduce the number of faulty products by relying on machine vision and guidance to analyse and maintain the quality of the parts being produced.

Medical Usage
Image processing can be used for X-rays, ultrasounds, chemotherapy, cancer detection, and the guided application of treatment and extermination of cancer cells [15, 16]. It can help to treat cancers and tumours and supports a far more targeted approach to the treatment of many illnesses and conditions. There is a wide application of image processing software combined with VGRs in examples such as the da Vinci surgical systems [17]. The development of machine vision in these sectors has reduced the risk involved in increasingly difficult procedures and in treating terminal injuries. Cancer treatment and delicate surgical procedures, such as neurological surgery and organ transplants, have both benefitted greatly from this technology, as image processing and machine vision allow the use of far more complex robotic surgical tools that give surgeons a better and more controlled environment to work in.

Satellite imaging and navigation
Image processing is used in forming maps of the globe from satellite imagery. GPS and navigation systems can also rely on image processing to detect and locate various terrains. The real-world views and 3D maps that Google provides for each location on its maps are captured using 360-degree cameras mounted on vehicles, and the footage is then run through image processing software to produce those views. Locating rescue spots and research locations, monitoring changes to the polar ice caps and many other uses tie into this segment of image recognition. Events such as wildfires, floods and other natural disasters can be detected early, and in some cases anticipated, with the help of satellite imaging and image recognition. In [21], weather prediction models and the detection and study of climate change, increasing carbon emissions and rising sea levels also make use of image recognition to predict weather patterns. In [22], hurricane and cyclone detection makes use of image recognition as well, both to improve detection methodologies and to forecast rainfall so that farmers and the agriculture sector can prepare accordingly for planting and harvesting their crops. This plays a vital role in the development and tracking of agricultural cycles and in the economies of nations whose agriculture depends heavily on the weather. The same techniques can be used to track changing weather conditions across any given region for further climate change and weather studies.

4. Technologies of Machine Vision
There are various technologies pertaining to machine vision and image recognition [30-40], with different methodologies meant for different subsets of image processing and recognition. The following list contains some of the prominent methods and technologies used in machine vision.

Image Stitching and Registering
Image stitching is one of the most basic forms of machine vision, in which multiple images are overlapped and stitched together to form a single image. It can be used to obtain higher resolution, to implement focus and zoom in software in cameras, and to obtain a wider-angle view from multiple images. The technology uses an algorithm to determine image alignment by relating pixel coordinates in one image to those in another, forming a rough guideline of position and angle. Image registration can then use a set of key points and matching features to minimize the errors and differences that may crop up when multiple images are combined. The most common method used for this purpose is Random Sample Consensus (RANSAC), an iterative method that estimates the parameters of a mathematical model from a set of observed points that contains outliers. The technology then has further models for calibration whose purpose is to minimize the differences
between a camera lens image and the image formed by the stitching and registering method. The algorithm observes distortion and exposure differences before correcting and optimising the image so that it is uniform with the overall composition. A further step in image stitching is image blending, which takes the image comparison information from the calibration stage and uses it to combine and adjust the images for an output projection.

Edge Detection
Edge detection combines mathematical methods to detect edges, curves and lines. The technique simplifies image detection by reducing the amount of information that needs to be processed: the image is reduced to a representation of its edges that still retains the structural composition needed to preserve the image's integrity and information [23]. There are various methods of edge detection, which fall into two primary categories: search-based and zero-crossing based. Search-based methods apply a first-order derivative and detect edges from a measure of edge strength. Zero-crossing methods, on the other hand, apply a second-order derivative to find the edges in the image. Some of the masks used for edge detection are the Prewitt operator, the Sobel operator, the Robinson operator and the Laplacian operator. The Prewitt operator is used to detect horizontal and vertical edges. The Sobel operator works quite similarly to the Prewitt operator, detecting both horizontal and vertical edges. The Robinson operator takes a single mask and rotates it through 8 separate directions to obtain the edges in each of those directions. The Kirsch compass mask is another mask operator that can find edges in all directions. Finally, the Laplacian operator is a second-order derivative mask that comes in two variants, the positive Laplacian and the negative Laplacian. One big use of edge detection is in sharpening images, where the detected edges are applied back to the image to make it appear sharper. It can also be used in photo editing to add outlines and effects based on the shapes and edges present in an image.

Blob Detection
Blob detection, also referred to as connected-component labelling (CCL), connected-component analysis (CCA), blob discovery or region extraction, is an algorithm that applies graph theory to analyse and process images. In it, subsets of connected pixels are each given a unique label, which gives connected-component labelling its name. It is used to scan, analyse and detect connected regions in an image. The process works with binary images, but colour images can also be analysed using this method. Since images contain various forms of information, pre-processing to filter out the irrelevant information is often required before blob detection is used. Blob detection locates large clusters of pixels, typically bright pixels set against a dark background; that is, a set of pixels that forms a distinctive collection and can be distinguished from the background of the image. Blob detection is often paired with other image recognition systems, mechanisms and components, and can work on a variety of information and data points depending upon the nature of the image processing and detection system. There are multiple uses of real-time blob detection, which can help in processing and understanding an image by breaking it down into smaller and smaller interconnected components [24]. Some of the primary methods of blob detection are Laplacian of Gaussian (LoG), Difference of Gaussian (DoG), Determinant of Hessian (DoH) and Connected Components Labelling.

5. Conclusion and Future Remarks
In this paper we analysed the usage of Artificial Intelligence, Neural Networks and Machine Learning for the development of machine vision, and examined the various methodologies, models and approaches used in the technologies that make use of this field of image processing. We have shown the widespread usage of image processing across fields such as education, medicinal development, surgical processes, quality maintenance, security, national defence, military operations, robotics, manufacturing and processing, agricultural development, geo satellite mapping, navigation, graphic rendering, 3D modelling and many others, and we have shown the application of machine learning, artificial intelligence and neural networks in the development of this technology. While widely prevalent, the technologies for machine vision and neural networks still remain in their infancy. With technological advances being made each day, increasing research, and growing capability of hardware devices to run and refine these algorithms, we are on the cusp of a world with enhanced and quick image processing capabilities, and with robots capable of navigating and understanding the world around them in a way that has not been possible up until now. The development of this technology, which has become synonymous with artificial intelligence, has met with pushback from individuals concerned about the abuse of deep fakes and the ease with which they can be produced, but various articles and research papers have argued that the benefits far outweigh the disadvantages. Image processing and Artificial Intelligence are technologies that carry significant impact on the future of humanity and mark a vital step in the growth of artificial intelligence and computer science. The developments made in just the last decade have been rapid, and with more importance being given to the field, research and growth have only increased with time.
In this paper, the models described above, proposed by various authors and research papers for the development of image processing and for approaching machine learning in regard to image processing, have highlighted multiple directions for the development of this technology. The paper discusses the various ways in which machine learning algorithms can be trained and refined using different kinds of data sets and feedback loops dependent on both human input and machine input. The paper has also highlighted the differences between the two approaches alongside their advantages and disadvantages. Various applications and uses of the technology have been highlighted as well, alongside the processes and technologies that form the basis of image processing and the methods by which these algorithms analyse and process any given image. Taking into account the various methods of development, the advantages and disadvantages of a synthetic dataset, the methods used for developing and training machine vision, its varied and diverse usage methods alongside various
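As a minimal sketch of the RANSAC idea mentioned under Image Stitching and Registering in Section 4, the following Python example estimates the shift between two images from a set of key point matches, some of which are wrong. It assumes a translation-only model for brevity; practical stitching pipelines usually fit a richer model such as a homography. The function name `ransac_translation`, its parameters and the sample matches are hypothetical and are not taken from the paper or its references.

```python
import random
from math import hypot


def ransac_translation(matches, iterations=50, tolerance=1.5, seed=0):
    """Estimate the (dx, dy) shift between two images from key point matches.

    matches: list of ((x1, y1), (x2, y2)) pairs, some of which may be wrong.
    Returns (dx, dy, inliers), where inliers are the matches that agree.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best_inliers = []
    for _ in range(iterations):
        # 1. Sample the minimal set: a single match defines a translation.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        # 2. Collect every match consistent with this candidate model.
        inliers = [
            m for m in matches
            if hypot(m[1][0] - m[0][0] - dx, m[1][1] - m[0][1] - dy) <= tolerance
        ]
        # 3. Keep the candidate with the largest consensus set.
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # 4. Refit the model on all inliers (here: the mean shift).
    n = len(best_inliers)
    dx = sum(b[0] - a[0] for a, b in best_inliers) / n
    dy = sum(b[1] - a[1] for a, b in best_inliers) / n
    return dx, dy, best_inliers


# Hypothetical matches: eight agree on a shift of (5, -3), three are outliers.
good = [((x, y), (x + 5, y - 3)) for x, y in
        [(0, 0), (2, 7), (4, 1), (6, 9), (8, 3), (10, 5), (12, 8), (14, 2)]]
bad = [((1, 1), (30, 40)), ((3, 5), (-20, 7)), ((9, 9), (9, 60))]
dx, dy, inliers = ransac_translation(good + bad)
```

The outlier matches never gather more than one supporter each, so the consensus step discards them and the final estimate is computed from the agreeing matches only.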
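The search-based edge detection described in Section 4 can be illustrated with a short sketch. The example below applies the standard 3x3 Sobel masks to a small grey-level image held as a list of lists and thresholds the gradient magnitude, which plays the role of the edge strength. The helper names `apply_mask` and `sobel_edges`, the threshold value and the test image are assumptions made for illustration; production code would normally use an image processing library instead of plain Python loops.

```python
from math import hypot

# Standard 3x3 Sobel masks for the horizontal (x) and vertical (y) gradient.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def apply_mask(image, mask):
    """Slide a 3x3 mask over the interior pixels of a 2D list of numbers.

    Border pixels are left at 0 because the mask does not fit there.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = sum(
                mask[i][j] * image[r - 1 + i][c - 1 + j]
                for i in range(3)
                for j in range(3)
            )
    return out


def sobel_edges(image, threshold):
    """Return a binary edge map: 1 where the gradient magnitude exceeds threshold."""
    gx = apply_mask(image, SOBEL_X)
    gy = apply_mask(image, SOBEL_Y)
    return [
        [1 if hypot(gx[r][c], gy[r][c]) > threshold else 0
         for c in range(len(image[0]))]
        for r in range(len(image))
    ]


# Hypothetical 5x6 test image: dark on the left, bright on the right.
step_image = [[0, 0, 0, 9, 9, 9] for _ in range(5)]
edge_map = sobel_edges(step_image, threshold=10)
```

In `edge_map` the two columns on either side of the brightness step are marked as edge pixels on the interior rows, while the flat regions and the untouched border stay at 0. Swapping in the Prewitt masks only changes the weights in `SOBEL_X` and `SOBEL_Y`.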
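To make the connected-component labelling described under Blob Detection in Section 4 more concrete, here is a minimal sketch that labels the 4-connected regions of a binary image with a breadth-first search. It treats the image as a graph whose nodes are foreground pixels and whose edges join neighbouring pixels. The function names `label_components` and `blob_sizes`, the choice of 4-connectivity and the sample image are assumptions made for this illustration; they do not reproduce the method of [24].

```python
from collections import deque


def label_components(binary):
    """Label 4-connected regions of 1-pixels in a binary image (list of lists).

    Returns (labels, count): labels holds 0 for background and 1..count for blobs.
    """
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or labels[r][c] != 0:
                continue  # background pixel, or already part of a blob
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # Visit the up, down, left and right neighbours.
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and labels[ny][nx] == 0):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


def blob_sizes(labels, count):
    """Count the pixels carrying each label 1..count."""
    sizes = [0] * (count + 1)
    for row in labels:
        for value in row:
            sizes[value] += 1
    return sizes[1:]


# Hypothetical binary image with three separate bright regions.
binary_image = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0],
]
labels, count = label_components(binary_image)
sizes = blob_sizes(labels, count)
```

Each blob receives its label in the order in which the scan first meets it, so the pixel counts in `sizes` can be used to filter out regions that are too small to matter, which is one simple form of the pre-processing the section mentions.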
    image processing, 2010.
23. Canny John. A Computational Approach To Edge Detection. Pattern Analysis and Machine Intelligence, IEEE Transactions on. PAMI-8, 1986, 679-698. 10.1109/TPAMI.1986.4767851.
24. Petrović Vladimir, Popović-Bozovic Jelena. Towards Real-Time Blob Detection in Large Images with Reduced Memory Cost, 2016.
25. Sekar K, Tyagi AK. Study of Data Behaviour and Methods for Data Prediction and Analysis, 6th International Conference on Intelligent Computing and Control Systems (ICICCS), 2022, 1-6. DOI:10.1109/ICICCS53718.2022.9788360.
26. Malik S, Tyagi AK, Mahajan S. Architecture, Generative Model, and Deep Reinforcement Learning for IoT Applications: Deep Learning Perspective. In: Pal S., De D., Buyya R. (eds) Artificial Intelligence-based Internet of Things Systems. Internet of Things (Technology, Communications and Computing). Springer, Cham, 2022. https://doi.org/10.1007/978-3-030-87059-1_9
27. Varsha R, Nair SM, Tyagi AK, Aswathy SU, Radha Krishnan R. The Future with Advanced Analytics: A Sequential Analysis of the Disruptive Technology’s Scope. In: Abraham A., Hanne T., Castillo O., Gandhi N., Nogueira Rios T., Hong TP. (eds) Hybrid Intelligent Systems. HIS 2020. Advances in Intelligent Systems and Computing, 2021, 1375. Springer, Cham. https://doi.org/10.1007/978-3-030-73050-5_56
28. Goyal D, Goyal R, Rekha G, Malik S, Tyagi AK. Emerging Trends and Challenges in Data Science and Big Data Analytics, 2020 International Conference on Emerging Trends in Information Technology and Engineering (ic-ETITE), 2020, 1-8. DOI: 10.1109/ic-ETITE47903.2020.316.
29. Malik S, Mire A, Tyagi AK, Arora V. A Novel Feature Extractor Based on the Modified Approach of Histogram of Oriented Gradient. In: Gervasi O. et al. (eds) Computational Science and Its Applications – ICCSA 2020. ICCSA 2020. Lecture Notes in Computer Science, Springer, Cham, 2020, 12254. https://doi.org/10.1007/978-3-030-58817-5_54
30. Gillala Rekha, Krishna Reddy V, Amit Kumar Tyagi. KDOS - Kernel Density based Over Sampling - A Solution to Skewed Class Distribution, Journal of Information Assurance and Security (JIAS). 2020;15(2):44-52. 9p.
31. Gillala Rekha, Amit Kumar Tyagi, Krishna Reddy V. Solving Class Imbalance Problem Using Bagging, Boosting Techniques, with and without Noise Filter Method, International Journal of Hybrid Intelligent Systems. 2019;15(2):67-76.
32. Gillala Rekha, Krishna Reddy V, Amit Kumar Tyagi. A Novel Approach for Solving Skewed Classification Problem using Cluster Based Ensemble Approach, Mathematical Foundations of Computing, 2020 Feb;3(1):1-9.
33. Mishra S, Tyagi AK. The Role of Machine Learning Techniques in Internet of Things-Based Cloud Applications. In: Pal S., De D., Buyya R. (eds) Artificial Intelligence-based Internet of Things Systems. Internet of Things (Technology, Communications and Computing). Springer, Cham, 2022. https://doi.org/10.1007/978-3-030-87059-1_4
34. Gillala Rekha, Krishna Reddy V, Amit Kumar Tyagi. CIRUS - Critical Instances removal based Under-Sampling - A solution for Class Imbalance, IJHIS. 2020;16(2):55-66.
35. Kumari S, Vani V, Malik S, Tyagi AK, Reddy S. Analysis of Text Mining Tools in Disease Prediction. In: Abraham A., Hanne T., Castillo O., Gandhi N., Nogueira Rios T., Hong TP. (eds) Hybrid Intelligent Systems. HIS 2020. Advances in Intelligent Systems and Computing, vol 1375. Springer, Cham, 2021. https://doi.org/10.1007/978-3-030-73050-5_55
36. Amit Kumar Tyagi, Poonam Chahal. Artificial Intelligence and Machine Learning Algorithms, Book: Challenges and Applications for Implementing Machine Learning in Computer Vision, IGI Global, 2020. DOI: 10.4018/978-1-7998-0182-5.ch008
37. Amit Kumar Tyagi, Rekha G. Challenges of Applying Deep Learning in Real-World Applications, Book: Challenges and Applications for Implementing Machine Learning in Computer Vision, IGI Global, 2020, 92-118. DOI: 10.4018/978-1-7998-0182-5.ch004
38. Gillala Rekha, Krishna Reddy V, Amit Kumar Tyagi. An Earth mover's distance-based undersampling approach for handling class-imbalanced data, International Journal of Intelligent Information and Database Systems, 2020 13(2/3/4).
39. Akshara Pramod, Harsh Sankar Naicker, Amit Kumar Tyagi. Machine Learning and Deep Learning: Open Issues and Future Research Directions for Next Ten Years, Book: Computational Analysis and Understanding of Deep Learning for Medical Care: Principles, Methods, and Applications, Wiley Scrivener, 2020.
40. Tyagi, Amit Kumar, Rekha G. Machine Learning with Big Data (March 20, 2019). Proceedings of International Conference on Sustainable Computing in Science, Technology and Management (SUSCOM), Amity University Rajasthan, Jaipur - India, 2019 Feb 26-28.
41. Gillala Rekha, Amit Kumar Tyagi, Krishna Reddy V. A Wide Scale Classification of Class Imbalance Problem and its Solutions: A Systematic Literature Review, Journal of Computer Science. 2019;15(7):886-929. ISSN Print: 1549-3636.
42. Deekshetha HR, Shreyas Madhav AV, Tyagi AK. Traffic Prediction Using Machine Learning. In: Suma, V, Fernando X, Du KL, Wang H. (eds) Evolutionary Computing and Mobile Sustainable Networks. Lecture Notes on Data Engineering and Communications Technologies, Springer, Singapore, 2022, 116. https://doi.org/10.1007/978-981-16-9605-3_68