

DECLARATION

I, a student working under the guidance of Ms. Sriya Basu Mallik (Manager), IT
& ERP Department, Visakhapatnam Steel Plant, hereby declare that the
project entitled "LEAD TIME PREDICTION USING MACHINE
LEARNING" is an original work carried out at Rashtriya Ispat Nigam Limited
(RINL), Visakhapatnam Steel Plant, and is submitted in partial fulfilment of the
requirements for the award of Industrial Training project work. I assure that
this project has not been submitted to any other university or college.

Place:

VISAKHAPATNAM

ACKNOWLEDGEMENT

We would like to express our deep gratitude to our guide, Ms. Sriya Basu Mallik,
Manager, IT & ERP Department, Visakhapatnam Steel Plant, for her guidance, help,
and ongoing support throughout the course of this work, and for explaining the basic
concepts of the yard management system and its functioning along with industry
requirements. We are immensely grateful to her; without her inspiration and valuable
support our training would never have taken off.

We sincerely thank the Training and Development Centre (T&DC) for their guidance
during safety training and their encouragement in the successful completion of our training.

Gundubogula Neeraja
Indukuri Sri Surya Varma
Bogineni Vyshnavi
Devarakonda Sai Sreeja

ABSTRACT

Accurate prediction of end-to-end parts shipment delivery lead time (LT) is crucial for
optimizing manufacturing efficiency, planning, and job scheduling in the steel plant
industry. Transparency in lead times enables companies to reduce operating expenses,
improve capital utilization, increase revenues, and enhance competitive advantage.
Clients can better allocate resources and mitigate risks with confidence in their product
and service delivery schedules. This paper explores the application of machine learning
algorithms to predict order lead times, presenting a comprehensive analysis of different
models and their prediction accuracies.

By leveraging historical data from organizational logistics applications and various
supply chain-specific variables, we demonstrate how machine learning can significantly
enhance job scheduling, service levels, operational efficiency, cost reduction, and
customer satisfaction. We detail the methodology, including the use of Random Forest
and other algorithms, and provide sensitivity analysis results to illustrate the robustness
of our approach.

Keywords: Machine Learning; Lead time; Random Forest; Logistic; Production; Supply
Chain.

CONTENTS
A Brief Overview of Steel Plant......................................07-08

LEAD TIME PREDICTION USING ML


Introduction.......................................................................09-21
Data Cleaning....................................................................21-25
Deploying the Model........................................................26-29
YOLO
YOLO Introduction..........................................................30-35
YOLOV8...........................................................................35-37
Training............................................................................38-42
Prediction..........................................................................43-44
Output................................................................................45

Conclusion.........................................................................46-47
References.........................................................................48

Brief Overview of Steel Plant:

Visakhapatnam Steel Plant (VSP) is the integrated steel plant of Rashtriya Ispat
Nigam Limited in Visakhapatnam, founded in 1971. VSP strikes everyone with a
tremendous sense of awe and wonder, presenting a wide array of excellence in all its
facets: scenic beauty, technology, human resources, management, and product quality.
On the coast of the Bay of Bengal, beside the scenic Gangavaram beach, stand the tall
and massive structures of technological architecture that make up the Visakhapatnam
Steel Plant. The vistas of excellence do not rest with the beauty of the location or the
sophistication of the technology; they march ahead, parading one aspect after another.

The decision of the Government of India to set up an integrated steel plant at
Visakhapatnam was announced by the then Prime Minister, Smt. Indira Gandhi, in
Parliament on 17 January 1971. VSP is the first coastal-based integrated steel plant in
India, located 16 km west of the city of destiny, Visakhapatnam. Bestowed with modern
technologies, VSP has an installed capacity of 3 million tonnes per annum of liquid steel
and 2.656 million tonnes of saleable steel. The saleable steel is in the form of wire rod
coils, structurals, special steel, rebar, forged rounds, etc. At VSP, there is an emphasis on
total automation, seamless integration, and efficient up-gradation. This results in a wide
range of long and structural products that meet the stringent demands of discerning
customers in India and abroad; VSP products meet exacting international quality
standards such as JIS, DIN, BIS, BS, etc.

VSP has become the first integrated steel plant in the country to be certified to all
three international standards: quality (ISO 9001), environment management (ISO 14001),
and occupational health and safety (OHSAS 18001). The certification covers the quality
systems of all operational, maintenance, and service units, besides purchase systems,
training, and marketing functions, spread over 4 regional marketing offices, 20 branch
offices, and 22 stockyards located all over the country. By successfully installing and
efficiently operating pollution and environment control equipment worth Rs. 460 crores
and by greening the barren landscape with more than 3 million plants, VSP has
transformed the steel plant and its township. VSP exports quality pig iron and steel
products to Sri Lanka, Myanmar, Nepal, the Middle East, the USA, and South East Asia
(pig iron). RINL-VSP was awarded "Star Trading House" status during 1997-2000.
Having established a fairly dependable export market, VSP plans to maintain a
continuous presence in the export market.

Different sections at the RINL VSP:

● Coke oven and coal chemicals plant

● Sinter plant

● Blast Furnace

● Steel Melt Shop

● Continuous casting machine

● Light and medium merchant mills

● Calcining and refractory materials plant

● Rolling mills

● Thermal power plant

● Chemical power plant



Introduction:
Predictive data analytics comprises a variety of statistical methods from data
mining, predictive modeling, and ML that analyze historical data to make
predictions about future or unknown events.

"CRISP-DM breaks the process of data mining into six major phases: 1) Business
Understanding; 2) Data Understanding; 3) Data Preparation; 4) Modeling; 5)
Evaluation; 6) Deployment". The fourth stage (modeling) is where machine
learning (ML) algorithms are engaged to build predictions for lead time. In
particular, ML is defined as an automated computational method that "learns"
and extracts information and patterns directly from (historical) data. There are four
approaches: 1) Supervised, 2) Unsupervised, 3) Semi-Supervised, and 4)
Reinforcement Machine Learning.

● Supervised ML – A learning task that uses a fully labeled dataset while
training an algorithm. In other words, it assumes that training instances are
classified or labeled (learning the association between a set of descriptive features
and a target feature). In supervised ML, the model is trained on a labeled
dataset, meaning we have both the raw input data and its outcomes. We divided
our data into a training dataset and a test dataset, where the training dataset is
used to train the model, and the test dataset serves as new data for predicting
results and checking the accuracy of the model. Supervised ML is a relatively
simple, highly accurate, and trustworthy method. There are three key areas where
supervised learning is beneficial: classification, regression, and forecasting.
Classification techniques use the algorithm to predict a discrete value, focusing on
predicting a qualitative response by analyzing data and identifying patterns.
Regression techniques, on the other hand, use continuous data and are classically
applied to predicting, forecasting, and finding relationships among quantitative
data. Forecasting is the process of making predictions about the future based
on past and present data; it is most commonly used to analyze trends.

● Unsupervised ML – Concerns the analysis of unlabeled examples. In
unsupervised learning, the model is presented with a dataset without explicit
instructions on what to do with it. The training dataset in this approach is a
collection of instances without a specific desired outcome or a correct answer.
Unsupervised learning is a computationally complex, less accurate, and less
reliable method.

● Semi-Supervised ML (SSL) – Assumes that unlabeled data is also available at
the time of training in addition to the labeled data. The objective of SSL methods
is to extract information from the unlabeled data that can help in learning a
discriminative model with greater performance.

● Reinforcement Machine Learning – A category of machine learning
techniques that enables an agent (an artificial intelligence agent) to learn in an
interactive environment by trial and error, using feedback from its own actions
and experiences.

"Regression techniques in supervised machine learning predict a single, continuous
target output value using training data". In our scenario, modeling the association
between the continuous variable (e.g. lead time) and one or more predictors (for
example, supplier origin, buyer destination, order type, etc.) using a linear function is
the most suitable approach. One of the major strengths of a regression algorithm is that
its outputs have a probabilistic interpretation and can be regularized to avoid
overfitting. A weakness, however, is that logistic regression may underperform when
there are numerous or nonlinear decision boundaries; the method is not flexible, so it
does not capture more complex relationships.
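As an illustration only (the column names below are assumptions, not the actual RINL schema), a linear model for lead time with categorical predictors could be set up along these lines:

```python
# Minimal sketch: regressing lead time on categorical order attributes with a
# linear model, using one-hot encoding for the categorical predictors.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

df = pd.read_csv("orders.csv")  # hypothetical file of historical orders
features = ["supplier_origin", "buyer_destination", "order_type"]  # assumed columns
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["lead_time_days"], test_size=0.2, random_state=42)

model = Pipeline([
    ("encode", ColumnTransformer(
        [("cat", OneHotEncoder(handle_unknown="ignore"), features)])),
    ("regress", LinearRegression()),
])
model.fit(X_train, y_train)
print("R^2 on held-out orders:", model.score(X_test, y_test))
```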

Background and concept:


A lead time is the latency between the initiation and completion of a process. For
example, the lead time between the placement of an order and delivery of new cars
by a given manufacturer might be between 2 weeks and 6 months, depending on
various particularities. One business dictionary defines "manufacturing lead time"
as the total time required to manufacture an item, including order preparation time,
queue time, setup time, run time, move time, inspection time, and put-away time.
For make-to-order products, it is the time between release of an order and the
production and shipment that fulfill that order. For make-to-stock products, it is
the time taken from the release of an order to production and receipt into finished
goods inventory. This research focuses on make-to-stock products. Lead time in
inventory management is the lapse in time between when an order is placed to
replenish inventory and when the order is received. Lead time affects the amount
of stock a company needs to hold at any point in time. When considering the total
time for a purchase order to be delivered from a supplier, the time taken for the
supplier to accept and process the order must also be factored in. Lead time directly
affects your total inventory levels: the longer your lead time, the more stock you
will need to hold in your inventory. Longer lead times make deliveries more
unpredictable and force a company to rely heavily on demand forecasts to make
orders. Once you have calculated your lead time, the next step is to employ
corrective measures to reduce it. While there have been improvements to shipping
and freight services in recent years, there are other factors that affect lead times.
Orders may take time to process and be approved within your business, your
suppliers will then need to place orders of their own for the materials to create
your products, and there might be delays due to checks done at the ports by
customs. There are different trade control compliance rules and regulations for each
country that need to be factored in by an international organization. Traditionally,
lead time has been calculated by taking all of these factors into consideration,
since each of them adds to the overall lead time. Hence, several factors must be
considered to calculate lead time in a specific industry. However, with the
advancement of technology, predictive analytics techniques can be used to
predict the lead time based on the product and its supplier.

Methodology:

Figure 1: Data Analysis and Processing Steps



From the CRISP-DM model, the first three phases relate to data processing as
described in Figure 1. The ML algorithm unit corresponds to the modeling phase, and
the evaluation is done in the Results section, where pertinent features have been
selected and the accuracy of the prediction model has been calculated.

Model Flow Diagram:

“Each item will have its own lead time profile, and one model cannot be
fitted to all the items"

1. Lead Time: Lead time is the amount of time it takes for an item to be delivered
or produced after an order has been placed. In different industries and contexts,
lead time can vary significantly depending on factors such as the complexity of the
item, the production process, supply chain efficiency, and other related variables.

2. Lead Time Profile: A lead time profile represents the historical data or pattern
of lead times for a specific item. It includes information about lead time
variability, trends, seasonality, and any other factors that may influence the lead
time for that particular item.

3. Machine Learning Models: Machine learning models are algorithms that


learn patterns and relationships from historical data and use that knowledge to
make predictions or decisions on new, unseen data. In the context of lead time
prediction, machine learning models can be trained on historical lead time data for
various items to learn how different factors impact lead times.

4. Challenges of One-Size-Fits-All Model: While it might be tempting to create


a single machine learning model to predict lead times for all items in a company or
supply chain, there are several challenges with this approach.

"For each item, various models are applied, and the model with the best
accuracy is considered for that item"

Here's a brief explanation of the process:

1. Dataset with Multiple Items: Assume we have a dataset that contains


information about multiple items, each having its own specific features, attributes,
or historical data.

2. Model Selection: Instead of using a single generic model for all items, various
machine learning models are considered. These models could belong to different
families (e.g., linear regression, decision trees, support vector machines, neural
networks) or may have different configurations.

3. Model Training and Evaluation: Each of the selected models is trained on the
historical data of a specific item. The data for each item is used to create a separate
training set for that particular item.

4. Cross-Validation: To ensure robustness and avoid overfitting, cross-validation


techniques are often used. The historical data for each item is split into training
and validation sets, and the model's accuracy is evaluated on the validation set.

5. Selecting the Best Model: The model's performance is assessed based on a


chosen evaluation metric (e.g., accuracy, mean squared error, F1 score) for each
item. The model that achieves the highest accuracy or the best performance metric
on the validation set is considered the best model for that specific item.

6. Application Phase: Once the best model for each item is identified during the
training and validation phase, it is used for predicting outcomes or estimating
characteristics of new, unseen data related to that item.

Benefits of this approach:

Item-Specific Models: By using different models for each item, the approach can
capture the unique patterns and characteristics of individual items, leading to more
accurate predictions and better performance.

Adaptability: Different items may have varying levels of complexity and different
relationships between features and outcomes. Using various models allows for
greater adaptability to the diverse nature of the dataset.

Optimization: This approach aims to maximize the predictive performance for


each item, which can be critical for tasks such as demand forecasting, inventory
management, and resource allocation.

Insights and Interpretability: Exploring the results from different models can
also provide valuable insights into the behavior and relationships of each item,
aiding decision-making and process improvement.

However, it's important to note that implementing this approach requires sufficient
data for each item, and it can be computationally intensive when dealing with a
large number of items and complex models. Careful consideration should be given
to the trade-offs between model complexity and interpretability, as well as
practical constraints related to data availability and computational resources.
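As a concrete illustration of steps 1 to 6 above, the following sketch cross-validates a few candidate regressors per item and keeps the best one. The catalogue-number column is taken from the data-cleaning section later in this report; the feature columns and file name are assumptions.

```python
# Sketch of per-item model selection: for each item, candidate regressors are
# cross-validated and the one with the lowest mean absolute error is kept.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

candidates = {
    "linear": LinearRegression(),
    "tree": DecisionTreeRegressor(max_depth=5),
    "forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

def select_best_model(item_df, feature_cols, target_col="lead_time_days"):
    """Return (name, fitted model) with the best cross-validated MAE."""
    X, y = item_df[feature_cols], item_df[target_col]
    scores = {name: -cross_val_score(est, X, y, cv=3,
                                     scoring="neg_mean_absolute_error").mean()
              for name, est in candidates.items()}
    best = min(scores, key=scores.get)
    return best, candidates[best].fit(X, y)

df = pd.read_csv("lead_times.csv")  # hypothetical prepared dataset
best_models = {item: select_best_model(g, ["qty", "supplier_code"])  # assumed features
               for item, g in df.groupby("CATALOG_NO") if len(g) >= 10}
```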

a. Diverse Item Characteristics: Items in a company or supply chain can have


vastly different characteristics, ranging from raw materials to finished products,
each with its own unique production processes, lead time variability, and supply
chain complexities.

b. Non-Linear Relationships: The relationship between lead time and


influencing factors may not be linear across all items. Some items might have lead
times that are highly dependent on certain variables, while others might have more
complex and non-linear dependencies.

c. Overfitting and Underfitting: Using a single model for all items may lead to
either overfitting or underfitting. Overfitting occurs when the model is too specific
to the training data and cannot generalize well to new data. Underfitting happens
when the model is too simplistic and fails to capture the complexities of different
items.

d. Loss of Information: A one-size-fits-all model might fail to capture item-


specific patterns and could result in inaccurate predictions, leading to
inefficiencies in inventory management, production planning, and overall supply
chain operations.

e. Item-Specific Models: To address the challenges mentioned above, one could


develop item-specific machine learning models for lead time prediction. Each
model would be trained on historical data for a particular item, taking into account
its unique characteristics and lead time profile. These item-specific models can
capture more accurate relationships between variables and lead times, leading to
improved predictions and better decision-making.

f. Data Availability: Building item-specific models requires having sufficient


historical data for each item. If certain items have limited historical data, it might
be challenging to create reliable individual models. In such cases, a hybrid
approach could be considered, where items with similar characteristics are
grouped together, and models are developed for each group.

In conclusion, while using a single machine learning model for predicting lead
times might be simpler, it may not provide optimal results due to the diversity and
complexity of items in a supply chain. Developing item-specific models allows for
a more accurate representation of lead time behavior and can lead to better
operational efficiency and planning.

Items are considered if they have a minimum of 10 transactions, and this can be
configured in the system.

Here's a brief explanation of how this works in the context of machine learning:

1. Dataset and Items: Assume we have a dataset that contains information about
various items, and each item is associated with specific features or attributes, such
as product details, historical records, or relevant characteristics.

2. Minimum Transaction Threshold: The system is configured with a minimum


transaction threshold, which in this case is set to 10 transactions. This means that
an item must have at least 10 data points or transactions available in the dataset to
be considered for model training.

3. Filtering Items: Before the machine learning process begins, the system filters
the items based on the minimum transaction threshold. Items that do not meet this
criterion are excluded from further analysis and modeling.

4. Item-Specific Modeling: For each item that meets the minimum transaction
requirement, a separate machine learning model is trained using its relevant
historical data. The model could be chosen based on its performance on the
training data and validation set.

5. Advantages of Minimum Transaction Threshold: Setting a minimum


transaction threshold offers several advantages in machine learning:

a. Data Sufficiency: By requiring a minimum number of transactions, the models


are built on a more robust dataset, which can lead to more accurate and reliable
predictions.

b. Avoiding Overfitting: With a sufficient number of data points, the risk of


overfitting (when a model memorizes the training data without generalizing well
to new data) is reduced.

c. Resource Management: Training machine learning models can be


computationally intensive. By excluding items with insufficient data,
computational resources can be used more efficiently.

d. Generalizability: Models trained on items with an adequate amount of data are


more likely to generalize well to unseen data and perform better in real-world
applications.

6. Configurability: The system's flexibility allows users to adjust the minimum


transaction threshold according to specific needs and domain knowledge.
Depending on the data characteristics and modeling requirements, the threshold
value can be customized.

In conclusion, the minimum transaction threshold in machine learning ensures that


item-specific models are built only for items with enough relevant data, promoting
accurate and reliable predictions. It offers adaptability by allowing users to tailor
the threshold value based on their data availability and modeling objectives.
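A minimal sketch of this filtering step, assuming a pandas DataFrame with a CATALOG_NO column as the item identifier (the column name is taken from the data-cleaning section; the rest is an assumption):

```python
# Keep only items (catalogue numbers) with at least `min_txn` historical records.
import pandas as pd

MIN_TXN = 10  # configurable minimum-transaction threshold

def filter_items(df: pd.DataFrame, item_col: str = "CATALOG_NO",
                 min_txn: int = MIN_TXN) -> pd.DataFrame:
    counts = df[item_col].value_counts()
    eligible = counts[counts >= min_txn].index
    return df[df[item_col].isin(eligible)]

# usage: eligible_df = filter_items(raw_df)
```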

Outliers are detected and removed for each item before applying
models for better prediction accuracy.

Here's a brief explanation of the process:

1. Outliers: Outliers are data points that are unusually distant from other data
points in the dataset. They can be caused by measurement errors, data
corruption, rare events, or other anomalies in the data. Outliers can distort the
relationships learned by machine learning models and lead to inaccurate
predictions.

2. Item-Specific Analysis: Since we are dealing with item-specific models, the


dataset is divided or grouped by items. Each item's data is then analyzed
separately to identify potential outliers.

3. Outlier Detection: There are various statistical and machine learning


techniques available to detect outliers, such as z-score analysis, box plots,
isolation forests, and k-nearest neighbors (KNN) methods. These techniques
help in flagging data points that are likely to be outliers for each item.

4. Outlier Removal: Once outliers are detected, they are removed or treated in a
way that aligns with the specific needs of the modeling task. Depending on the
nature of the outliers and the dataset, different approaches can be employed,
including removal, imputation, or transformation.

5. Model Training: After the outlier removal process, machine learning models
are trained using the cleaned and preprocessed data for each item. The models
can be selected based on their performance during training and validation.

6. Benefits of Outlier Removal:

a. Improved Model Performance: Removing outliers can lead to better model


performance as the models focus on the underlying patterns in the majority of the
data rather than being influenced by extreme values.

b. Robustness: By eliminating the impact of outliers, the models become more


robust and are better equipped to generalize to new, unseen data.

c. Avoiding Overfitting: Outliers can lead to overfitting, where the model


memorizes noise in the data rather than learning meaningful patterns. Removing
outliers can help prevent this issue.

d. Accurate Predictions: Removing outliers reduces the potential for inaccurate


predictions caused by data points that do not align with the normal behavior of the
item.

7. Considerations: It's essential to apply outlier removal techniques with care. In


some cases, outliers may represent genuine and critical events or insights. The
decision to remove outliers should be based on domain knowledge and an
understanding of the data generation process.

In conclusion, removing outliers from the data of each individual item before
applying machine learning models can lead to improved prediction accuracy,
robustness, and generalization. By focusing on the meaningful patterns in the
majority of the data, the models become better equipped to make accurate
predictions for real-world applications.
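As an illustration of this item-wise cleaning step, here is a sketch using a z-score rule, one of the techniques mentioned above; the 3-sigma cutoff and the column names are assumptions.

```python
# Per-item outlier removal: drop rows whose lead time is more than
# z_cutoff standard deviations from that item's mean.
import pandas as pd

def remove_outliers(df, item_col="CATALOG_NO", value_col="lead_time_days",
                    z_cutoff=3.0):
    def trim(group):
        std = group[value_col].std(ddof=0) or 1.0  # guard against zero variance
        z = (group[value_col] - group[value_col].mean()) / std
        return group[z.abs() <= z_cutoff]
    return df.groupby(item_col, group_keys=False).apply(trim)

# usage: cleaned_df = remove_outliers(eligible_df)
```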

1) AVERAGE MODEL IN MACHINE LEARNING

The term "average model" in machine learning typically refers to an ensemble


learning technique known as model averaging or ensemble averaging. Ensemble
methods are powerful techniques that combine the predictions of multiple
individual models to achieve better overall performance and robustness. The idea
behind model averaging is to leverage the strengths of different models while
mitigating their weaknesses, ultimately leading to more accurate and
reliable predictions.

2) Linear model in machine learning

A linear model is a fundamental and widely used concept in machine learning that
represents a relationship between input variables (features) and an output variable
(target) using a linear equation. The primary idea is to model the data as a linear
combination of features, allowing us to make predictions, estimate relationships,
and interpret the impact of different input variables on the target.

3) Polynomial model in machine learning

A polynomial model is a type of regression model used in machine learning to


capture nonlinear relationships between input features and the target variable.
Unlike linear models, which assume a straight-line relationship between the
variables, polynomial models can represent more complex curves, making them
suitable for fitting data that exhibits nonlinear behavior.

4) SUPPORT VECTOR MACHINE IN MACHINE LEARNING

Support Vector Machine (SVM) is a popular and powerful supervised machine


learning algorithm used for classification and regression tasks. SVM is known for
its effectiveness in handling complex datasets and high-dimensional feature
spaces. It works by finding an optimal hyperplane that best separates different
classes or predicts continuous target values. SVM has been widely used in various
real-world applications due to its ability to handle both linearly and nonlinearly
separable data.

5) DECISION TREES IN MACHINE LEARNING

A Decision Tree is a versatile and widely used machine learning algorithm that
can handle both classification and regression tasks. It is a non-parametric
supervised learning method that learns a hierarchical structure of decisions based
on the input features to make predictions. Decision Trees are particularly valuable
because they offer interpretable and intuitive models, making them useful in
various real-world applications.

6) RANDOM FOREST MODEL IN MACHINE LEARNING

Random Forest is a popular and powerful ensemble learning technique used in


machine learning for both classification and regression tasks. It is an extension of
decision trees that leverages the concept of aggregating multiple decision trees to
improve predictive accuracy, reduce overfitting, and handle complex datasets.
Random Forest has proven to be highly effective and robust, making it one of the
most widely used algorithms in various real-world applications.
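To make the six model families above concrete, a sketch could instantiate one candidate from each family using scikit-learn equivalents; the specific hyperparameters shown are illustrative, not the values used in this project.

```python
# One candidate per model family described above (scikit-learn equivalents).
from sklearn.dummy import DummyRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

candidate_models = {
    "average":    DummyRegressor(strategy="mean"),           # 1) average model
    "linear":     LinearRegression(),                        # 2) linear model
    "polynomial": make_pipeline(PolynomialFeatures(degree=2),
                                LinearRegression()),         # 3) polynomial model
    "svm":        SVR(kernel="rbf"),                         # 4) support vector machine
    "tree":       DecisionTreeRegressor(max_depth=6),        # 5) decision tree
    "forest":     RandomForestRegressor(n_estimators=200),   # 6) random forest
}
```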

Data Cleaning:

Importing necessary packages and importing excel sheet:

Here we import a few libraries that are necessary to clean the data, and we import the
Excel sheet that contains the data provided by RINL.
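The original code appears as a screenshot in the report; a minimal sketch of this step might look as follows (the file name is an assumption):

```python
# Load the RINL Excel sheet into a DataFrame and take a first look at it.
import numpy as np
import pandas as pd

df = pd.read_excel("rinl_lead_time_data.xlsx")  # hypothetical file name
print(df.shape)
print(df.head())
```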

Removing Null Values and Duplicates

Sort the data by CATALOG_NO.
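Continuing the sketch above (CATALOG_NO is the column named in the text; the rest of the schema is assumed):

```python
# Basic cleaning: drop nulls and duplicates, then sort by catalogue number.
df = df.dropna()                   # remove rows with null values
df = df.drop_duplicates()          # remove duplicate records
df = df.sort_values("CATALOG_NO")  # sort by catalogue number
```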



Converting the Datatypes

Group the data using Catalogue Number
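A sketch of the datatype conversion and grouping steps, continuing from the cleaned DataFrame above; all column names other than CATALOG_NO are assumptions.

```python
# Convert datatypes, derive the lead time, and group by catalogue number.
import pandas as pd

df["CATALOG_NO"] = df["CATALOG_NO"].astype(str)
df["ORDER_DATE"] = pd.to_datetime(df["ORDER_DATE"])      # hypothetical column
df["RECEIPT_DATE"] = pd.to_datetime(df["RECEIPT_DATE"])  # hypothetical column
df["lead_time_days"] = (df["RECEIPT_DATE"] - df["ORDER_DATE"]).dt.days

item_groups = df.groupby("CATALOG_NO")  # one group per catalogue number
print(item_groups.size().head())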



Model:

Saving the model into pickle file:
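A hedged sketch of the modeling and serialization steps, continuing from the grouped data above; the choice of Random Forest matches the abstract, but the feature columns are hypothetical.

```python
# Fit one model per catalogue number and save the collection to a pickle file.
import pickle
from sklearn.ensemble import RandomForestRegressor

models = {}
for catalog_no, group in item_groups:
    if len(group) < 10:          # minimum-transaction threshold from earlier
        continue
    X = group[["qty", "supplier_code"]]   # hypothetical feature columns
    y = group["lead_time_days"]
    models[catalog_no] = RandomForestRegressor(n_estimators=100).fit(X, y)

with open("lead_time_models.pkl", "wb") as f:
    pickle.dump(models, f)
```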



Deploying the Model:

Predict.py
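A sketch of what predict.py could look like along these lines, assuming Flask is the web framework behind the Templates and Output.html files below; route names, form fields, and template variables are assumptions.

```python
# Load the pickled per-item models and serve predictions through a web form.
import pickle

import pandas as pd
from flask import Flask, render_template, request

app = Flask(__name__)
with open("lead_time_models.pkl", "rb") as f:
    models = pickle.load(f)

@app.route("/", methods=["GET", "POST"])
def predict():
    if request.method == "POST":
        catalog_no = request.form["catalog_no"]
        features = pd.DataFrame([{
            "qty": float(request.form["qty"]),               # hypothetical field
            "supplier_code": int(request.form["supplier_code"]),
        }])
        lead_time = models[catalog_no].predict(features)[0]
        return render_template("output.html", lead_time=round(lead_time, 1))
    return render_template("output.html", lead_time=None)

if __name__ == "__main__":
    app.run(debug=True)
```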

Templates:

Output.html:

OUTPUT:

YOLO
Introduction to YOLO

YOLO (You Only Look Once) is a groundbreaking object detection system that
simplifies the traditional approach to object detection. Object detection involves
identifying objects within an image and drawing bounding boxes around them. Unlike
previous methods that required multiple stages and were computationally expensive,
YOLO transforms object detection into a single, end-to-end regression problem.

Core Concept
The core idea of YOLO is to apply a single convolutional neural network (CNN) to the
full image. This CNN divides the image into an SxS grid and, for each grid cell, predicts
bounding boxes, confidence scores for those boxes, and class probabilities for each box.
The confidence score reflects how confident the model is that the box contains an object
and the accuracy of the box's predicted location. This single-stage detection process
significantly reduces computation time, making YOLO extremely fast compared to its
predecessors.
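In the original YOLO paper, this confidence is defined formally as

\[ \text{confidence} = \Pr(\text{Object}) \cdot \mathrm{IOU}^{\text{truth}}_{\text{pred}}, \]

and at test time each box's class-specific score is obtained by multiplying in the conditional class probability:

\[ \Pr(\text{Class}_i \mid \text{Object}) \cdot \Pr(\text{Object}) \cdot \mathrm{IOU}^{\text{truth}}_{\text{pred}} = \Pr(\text{Class}_i) \cdot \mathrm{IOU}^{\text{truth}}_{\text{pred}}. \]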

History and Evolution of YOLO

YOLOv1 (2015)

YOLOv1 was introduced by Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali
Farhadi in 2015. Their paper, "You Only Look Once: Unified, Real-Time Object
Detection," introduced a new paradigm in object detection.

Motivation and Design: Traditional object detection systems like R-CNN, Fast R-CNN,
and Faster R-CNN involve multiple stages, including region proposal and classification,
which are computationally intensive. YOLOv1 proposed a single-stage detection system
that predicts bounding boxes and class probabilities directly from full images in one
evaluation.

Architecture: YOLOv1’s architecture consists of 24 convolutional layers followed by 2


fully connected layers. It uses a relatively simple and lightweight network, making it
capable of real-time detection.
Performance: YOLOv1 was revolutionary for its speed, capable of processing 45 frames
per second on a standard GPU. However, it struggled with small objects and precise
localization due to its coarse grid structure.

YOLOv2 (2016) - "YOLO9000"

YOLOv2, also known as YOLO9000, was introduced in 2016 with several


enhancements.

Improvements: YOLOv2 introduced batch normalization on all convolutional layers,


a high-resolution classifier, and multi-scale training, improving the model's accuracy
and robustness.
Architecture Changes: YOLOv2 used a new backbone network called Darknet-19,
which includes 19 convolutional layers and 5 max-pooling layers.
YOLO9000: This version introduced YOLO9000, which can detect over 9000 object
categories by combining the COCO dataset and ImageNet dataset, allowing the model to
perform detection and classification simultaneously.

YOLOv3 (2018)
YOLOv3 brought significant enhancements to the YOLO architecture.

Further Enhancements: YOLOv3 introduced Darknet-53, a more complex and deeper


backbone network with 53 convolutional layers, improving feature extraction.
Multi-Scale Predictions: YOLOv3 predicts bounding boxes at three different scales,
which helps in detecting objects of various sizes more accurately.
Class Prediction: Instead of using softmax for classification, YOLOv3 uses independent
logistic classifiers for each label, allowing it to handle overlapping labels better.

YOLOv4 (2020)
YOLOv4, released in 2020, incorporated cutting-edge techniques to further improve
performance.

Advanced Techniques: YOLOv4 uses CSPDarknet53 as the backbone, PANet for


path aggregation, and SPP (Spatial Pyramid Pooling) for better receptive fields. It also
includes various bag-of-freebies and bag-of-specials techniques to improve training and
inference.
Optimization: YOLOv4 balances speed and accuracy, making it suitable for a wide
range of real-time applications.

YOLOv5 (2020)
Developed by Ultralytics, YOLOv5 introduced several new features and improvements.

Usability and Performance: YOLOv5 emphasizes ease of use, modularity, and


deployment capabilities. It is implemented in PyTorch, making it more accessible to the
PyTorch community.
Variants: YOLOv5 is available in different sizes (YOLOv5s, YOLOv5m, YOLOv5l,
YOLOv5x), catering to different performance and speed requirements.

YOLOv6, YOLOv7, and Beyond

Continuous Evolution: Later versions of YOLO have continued to build on the


foundations laid by earlier versions, integrating newer techniques from deep learning
research to enhance performance, accuracy, and efficiency.

Key Features and Innovations

Single-Stage Detection
YOLO’s key innovation is framing object detection as a single regression problem.
Traditional object detection methods involve multiple stages, such as region proposal,
region refinement, and classification, which are computationally expensive and slow.
YOLO, by contrast, applies a single neural network to the full image, which predicts
bounding boxes and class probabilities simultaneously, drastically simplifying and
speeding up the detection process.

Grid-Based Approach
YOLO divides the input image into an SxS grid. Each grid cell is responsible for
predicting a certain number of bounding boxes and their associated confidence scores,
along with class probabilities. This grid-based approach allows YOLO to detect multiple
objects within an image efficiently. The confidence score indicates the likelihood that
the bounding box contains an object and the accuracy of the bounding box's coordinates.

Real-Time Performance
One of YOLO’s most notable advantages is its real-time performance. Thanks to its
single-stage detection process, YOLO can process images at high frame rates, making
it suitable for applications that require quick response times, such as autonomous
driving, video surveillance, and robotics. For instance, YOLOv1 could process images
at 45
frames per second on a standard GPU, with later versions achieving even faster speeds.

Unified Architecture
YOLO’s architecture is unified and streamlined, using a single neural network for
detection. This unification leads to faster and more efficient computations compared to
traditional methods, which often require separate models for region proposal and
classification.

Multi-Scale Detection
Starting from YOLOv3, the model includes multi-scale detection. This means that the
model predicts bounding boxes at three different scales, helping it detect objects of
varying sizes more accurately. This is particularly useful for detecting smaller objects
that might be missed by single-scale detectors.

Community and Ecosystem


YOLO has a strong and active community, with extensive resources available for
researchers and developers. Numerous pre-trained models, open-source implementations,
and tutorials are available, making it easier for new users to get started and for
experienced users to optimize and adapt the models to their specific needs.

Applications of YOLO

Autonomous Vehicles

In the realm of autonomous driving, YOLO’s real-time detection capabilities are critical.
Self-driving cars need to recognize and react to various objects on the road, such as other
vehicles, pedestrians, traffic signs, and obstacles. YOLO’s speed and accuracy enable it
to process visual data in real-time, making it an ideal choice for this application.

Surveillance and Security


YOLO is widely used in surveillance systems to detect and track intruders, monitor
crowds, and ensure security in sensitive areas. Its ability to process video streams in real-
time allows for immediate response to potential threats.

Medical Imaging
In healthcare, YOLO is employed to detect abnormalities in medical images, such as
tumors, lesions, or fractures. This assists doctors in diagnosis and treatment planning.
YOLO’s accuracy and speed make it a valuable tool for analyzing medical imagery
quickly and effectively.

Retail and Inventory Management


YOLO helps in identifying and tracking products in retail environments, improving
inventory management and customer experience. For example, it can be used to monitor
stock levels on shelves, detect misplaced items, and enhance automated checkout
systems.

Robotics
Robots equipped with YOLO-based vision systems can navigate environments, avoid
obstacles, and interact with objects. This is particularly useful in industrial automation,
where robots need to perform tasks such as sorting, picking, and assembling with high
precision and efficiency.

Challenges and Future Directions


Small Object Detection
Despite its many strengths, YOLO has historically struggled with detecting small objects,
particularly when they are located close to larger objects. Research is ongoing to improve
YOLO’s capability to detect smaller objects more accurately, which would enhance its
utility in applications where small object detection is critical.

Real-Time Performance

Balancing speed and accuracy continues to be a significant challenge. While YOLO is


already fast, achieving higher accuracy without sacrificing speed is an ongoing area
of research. Advances in hardware, such as more powerful GPUs and specialized AI
accelerators, as well as optimization techniques like model pruning and quantization, are
crucial for further improvements.

Generalization
Ensuring that YOLO models generalize well across diverse datasets and real-world
scenarios is another important area of research. Models trained on specific datasets might
not perform well on different types of images or in different environments. Techniques
such as data augmentation, transfer learning, and domain adaptation are being explored to
improve generalization.

Integration with Other Technologies


Combining YOLO with emerging technologies like edge computing, 5G, and AI
accelerators will open up new possibilities for real-time applications. For example, edge
computing can bring computation closer to the data source, reducing latency and
improving real-time performance for applications like autonomous driving and
surveillance.

Explainability and Trustworthiness


As YOLO is used in more safety-critical applications, enhancing the interpretability and
trustworthiness of its models becomes increasingly important. Researchers are exploring
methods to make YOLO’s decisions more transparent and to ensure that the models
behave reliably under various conditions. This includes developing techniques for model
explainability, robustness to adversarial attacks, and ensuring fairness in object detection.

YOLOv8: Next-Generation Object Detection

Introduction
YOLOv8 represents the latest iteration in the You Only Look Once (YOLO) series,
building on the strengths and lessons learned from its predecessors. As with previous
versions, YOLOv8 aims to deliver state-of-the-art object detection performance,
balancing speed, accuracy, and ease of use. YOLOv8 incorporates cutting-edge
techniques and optimizations to push the boundaries of what is possible in real-time
object detection.

Key Features of YOLOv8


Architecture
YOLOv8 continues to evolve the neural network architecture to improve performance.
Here are the main architectural features and enhancements:

Backbone Network: YOLOv8 uses a new and improved backbone network designed
for better feature extraction. This backbone is deeper and more complex than those in
earlier versions, incorporating advanced convolutional layers, normalization techniques,
and
activation functions.

Neck and Head: The neck of YOLOv8 is designed to enhance the flow of information
between the backbone and the detection head. It uses advanced feature pyramid networks
(FPN) and path aggregation networks (PAN) to improve multi-scale feature fusion. The
detection head, responsible for predicting bounding boxes and class probabilities, has
also been optimized for better accuracy and efficiency.

Anchors and Anchor-Free Detection: YOLOv8 introduces improvements in anchor


generation and also explores anchor-free detection mechanisms. These changes aim to
simplify the model and improve its ability to detect objects of various shapes and sizes.

Training Techniques
YOLOv8 employs several advanced training techniques to improve model performance:

Data Augmentation: Extensive data augmentation strategies are used to improve the
model's ability to generalize. Techniques such as mosaic augmentation, mixup, and
CutMix are employed to create diverse training samples.

Label Smoothing: This technique helps to regularize the model by smoothing the
labels during training, which can reduce overfitting and improve generalization.

Advanced Loss Functions: YOLOv8 uses sophisticated loss functions, such as CIoU
(Complete Intersection over Union) loss for bounding box regression and focal loss for
classification. These loss functions are designed to provide better gradients and improve
the convergence of the model.

Self-Adversarial Training: This technique involves training the model to be robust


against adversarial examples by incorporating adversarial perturbations during training.

Performance Optimization

YOLOv8 is designed for high performance, both in terms of speed and accuracy. Key
optimizations include:

Quantization and Pruning: These techniques are used to reduce the size of the model
and improve inference speed without significantly sacrificing accuracy.

Model Scaling: YOLOv8 comes in various sizes, such as YOLOv8s (small), YOLOv8m
(medium), YOLOv8l (large), and YOLOv8x (extra-large). This allows users to choose a
model that best fits their resource constraints and performance requirements.

Efficient Computation: YOLOv8 incorporates efficient layer designs and


tensor operations to maximize the utilization of modern hardware, such as GPUs
and AI accelerators.

YOLOv8 Model Details


Backbone
The backbone network in YOLOv8 is responsible for extracting rich features from the
input image. It typically includes:

Convolutional Layers: These layers apply filters to the input image to extract features
such as edges, textures, and shapes.
Normalization Layers: Techniques like batch normalization or group normalization are
used to stabilize and accelerate training.
Activation Functions: Non-linear activation functions like Leaky ReLU or Mish are
used to introduce non-linearity into the model.

Neck
The neck of YOLOv8 connects the backbone to the head and enhances feature
representation:

FPN and PAN: Feature Pyramid Networks (FPN) and Path Aggregation Networks
(PAN) are used to combine features from different layers of the backbone. This
multi-scale feature representation helps in detecting objects of varying sizes.

Head
The head of YOLOv8 is responsible for making the final predictions:

Bounding Box Prediction: The head predicts the coordinates of bounding boxes. It uses
advanced loss functions like CIoU loss to ensure accurate localization.
Class Prediction: The head predicts the class probabilities for each bounding box. It may
use focal loss to handle class imbalance and improve the detection of hard-to-classify
objects.

Training Process

Training YOLOv8 involves the following steps:

Data Preparation: Preparing a diverse and representative dataset is crucial. Data


augmentation techniques are applied to increase the variety of training samples.

Model Initialization: The model is initialized with pre-trained weights or trained from
scratch, depending on the availability of a suitable pre-trained model.
Optimization: The model is trained using optimization algorithms like Adam or SGD.
Learning rate scheduling, weight decay, and other regularization techniques are
employed to ensure stable and efficient training.

Validation and Testing: The model is periodically evaluated on a validation set to


monitor its performance. Hyperparameters are tuned based on the validation results to
achieve the best performance.

CVAT (Computer Vision Annotation Tool)


CVAT (Computer Vision Annotation Tool) is an open-source, web-based tool developed
by Intel for annotating images and videos. It provides a user-friendly interface and robust
functionality for creating annotations that are essential for training machine learning
models, especially in computer vision tasks such as object detection, image segmentation,
and image classification.

TRAINING

The provided code snippet is for training a YOLO (You Only Look Once) model,
specifically using the YOLOv8 architecture, on a custom dataset with three classes:
helmet, head, and person. Here’s a detailed breakdown of the code:

Importing Libraries:

YOLO from ultralytics: This imports the YOLO class from the ultralytics package,
which is used for creating and training YOLO models.

pickle and warnings: These libraries are imported but not used in the snippet. pickle
could be for future use in saving/loading models or data, and warnings is used to filter out
warning messages.

Model Initialization:

model = YOLO("yolov8n.yaml"): This line initializes a YOLO model from a


configuration file (yolov8n.yaml). This file typically contains the model architecture and
hyperparameters.
model = YOLO("yolov8n.pt"): This line loads a pre-trained YOLOv8n model. Using a
pre-trained model is beneficial because it leverages transfer learning, where the model
starts with weights learned from a large dataset (like COCO) and fine-tunes them on
your custom dataset.

Training the Model:

model.train(data="C:/Users/HP/Desktop/YOLO/dataset/config.yaml", epochs=15):
This line starts the training process. It specifies the path to the config.yaml file, which
contains the dataset configuration, and sets the number of epochs for training (15 in this
case).
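For reference, the snippet described above can be reconstructed approximately as follows; the paths, model files, and epoch count are quoted from the text, while the warning filter is an assumption.

```python
# Reconstruction of the described YOLOv8 training snippet.
import pickle    # imported but unused in the original snippet
import warnings

from ultralytics import YOLO

warnings.filterwarnings("ignore")

model = YOLO("yolov8n.yaml")  # build a fresh model from the architecture file
model = YOLO("yolov8n.pt")    # or load pre-trained YOLOv8n weights (transfer learning)

model.train(data="C:/Users/HP/Desktop/YOLO/dataset/config.yaml", epochs=15)
```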

Explanation of config.yaml

The config.yaml file is crucial as it defines the paths to the training, validation, and
testing datasets, as well as the class labels. Here’s an example of what the config.yaml
file looks like and an explanation of each part:

path: C:/Users/HP/Desktop/YOLO/dataset: This is the base path to the dataset directory.


It provides a root path that the subsequent paths can use as a reference.

train: ../dataset/train: This specifies the relative path to the directory containing the
training images and annotations. It is relative to the base path defined above.

test: ../dataset/test: This specifies the relative path to the directory containing the testing
images and annotations. It is used for evaluating the model's performance after training.

val: ../dataset/valid: This specifies the relative path to the directory containing the
validation images and annotations. Validation data is used to tune model hyperparameters
and prevent overfitting during training.

names: This section maps class indices to their respective class names. Each class used
in the dataset is given a unique index and a corresponding name.
0: head
1: helmet
2: person
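Putting the pieces described above together, the config.yaml would look approximately like this (paths and class names are quoted from the text):

```yaml
# Reconstruction of the dataset configuration described above.
path: C:/Users/HP/Desktop/YOLO/dataset
train: ../dataset/train
test: ../dataset/test
val: ../dataset/valid

names:
  0: head
  1: helmet
  2: person
```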

Importance of config.yaml

The config.yaml file plays a vital role in the training process:

Dataset Organization:

It provides a structured way to organize and reference the dataset. By defining paths to
training, validation, and test sets, it ensures that the model knows where to find the
necessary data.

Class Mapping:

The names section maps numerical labels to human-readable class names. This is crucial
for interpreting the model's outputs and for the training process, as the model needs to
know what each label represents.

Flexibility:

Using a configuration file allows for easy modification and experimentation. You
can change dataset paths, add new classes, or adjust other parameters without
altering the training script.

Consistency:

It ensures consistency across different training runs. By keeping the dataset paths and
class labels in a centralized configuration file, you reduce the risk of errors that
might occur if these parameters were hardcoded in multiple places.

Prediction

YOLO Model Initialization:


The code initializes a YOLO model using the YOLO class from the ultralytics library.
The path to the pre-trained model weights (best.pt) is provided as an argument.

Video Processing:
The code opens a video file using OpenCV's VideoCapture class. It retrieves properties of
the video such as frame width, height, and frames per second (fps).

Video Writing:
A VideoWriter object is created to write processed frames with bounding boxes and
labels to an output video file. The codec used is mp4v.

Object Detection and Drawing Bounding Boxes:


The video frames are processed one by one in a loop. The YOLO model is used to detect
objects in each frame, and bounding boxes are drawn around the detected objects.
The color of the bounding box depends on the class of the detected object (head, helmet,
or person).

Text Overlay (Labels):


Class labels (head, helmet, or person) are added as text above the bounding boxes using
OpenCV's putText function.

Video Writing (Continued):


The processed frames with bounding boxes and labels are written to the output video file
using the write method of the VideoWriter object.

Release Resources:
Once all frames have been processed, the video capture (cap) and video writer (out)
objects are released to free up system resources.

Printing Confirmation:
Finally, a message is printed to indicate that the video has been saved successfully with
bounding boxes and labels.
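A hedged reconstruction of the prediction script described above; the file paths, class colors, and variable names are assumptions, while the class labels and mp4v codec come from the text.

```python
# Annotate a video with YOLO detections (head, helmet, person) and save it.
import cv2
from ultralytics import YOLO

model = YOLO("best.pt")                       # trained weights
cap = cv2.VideoCapture("input_video.mp4")     # hypothetical input path

w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = cap.get(cv2.CAP_PROP_FPS)
out = cv2.VideoWriter("output_video.mp4",
                      cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

names = {0: "head", 1: "helmet", 2: "person"}
colors = {0: (0, 0, 255), 1: (0, 255, 0), 2: (255, 0, 0)}  # BGR per class

while True:
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame)                    # run detection on the frame
    for box in results[0].boxes:
        x1, y1, x2, y2 = map(int, box.xyxy[0])
        cls = int(box.cls[0])
        cv2.rectangle(frame, (x1, y1), (x2, y2), colors[cls], 2)
        cv2.putText(frame, names[cls], (x1, y1 - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, colors[cls], 2)
    out.write(frame)

cap.release()
out.release()
print("Video saved with bounding boxes and labels.")
```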

Output :

Conclusion:

Lead Time Prediction :

In conclusion, our project aimed to predict lead time in a steel plant environment using
machine learning techniques. We meticulously curated features such as order details, raw
material availability, production process parameters, equipment status, workforce
availability, quality control metrics, and environmental factors. These features were
crucial in training our models to accurately estimate lead times.

We experimented with various regression algorithms including Random Forest


Regression, Gradient Boosting Regression, and Support Vector Regression. Our models
demonstrated promising performance, as evidenced by evaluation metrics such as mean
absolute error, R-squared, and root mean squared error. However, there are areas for
improvement, particularly in outlier detection, handling class imbalance, and
integrating real-time data for dynamic predictions.

For future work, we recommend exploring outlier detection techniques, addressing class
imbalance through resampling methods, and implementing online learning algorithms for
adaptive predictions. Collaboration with domain experts to refine feature selection and
enhance model interpretability could further improve predictive accuracy and applicability
in real-world scenarios.

Overall, this project lays a solid foundation for more efficient lead time prediction in steel
plant operations. By leveraging machine learning, we can optimize production planning,
resource allocation, and ultimately enhance customer satisfaction. Continued research and
development in this area have the potential to drive significant advancements in the
manufacturing sector.

YOLO
The presented project harnesses the capabilities of the YOLO (You Only Look Once)
object detection model to annotate and track objects within a video stream. By employing
a pre-trained YOLO model, the code efficiently identifies and labels objects of interest,
including heads, helmets, and persons, in real-time video footage. This project serves as a
testament to the practical application of cutting-edge computer vision techniques in
addressing complex challenges across various domains.

At its core, the YOLO model offers a streamlined approach to object detection by
dividing the image into a grid and predicting bounding boxes and class probabilities for
each grid cell. This enables rapid and accurate detection of multiple objects within a
single pass through the neural network. By integrating YOLO with video processing
techniques, the project showcases how advanced machine learning algorithms can be
seamlessly integrated into real-world applications.

The significance of this project lies in its ability to automate and streamline tasks that
would otherwise require manual intervention. In scenarios such as video surveillance,
safety monitoring, and crowd analysis, the automated detection and tracking of objects
can greatly enhance efficiency and accuracy. For instance, in surveillance applications,
the ability to quickly identify and track individuals, including those wearing safety
helmets, can improve security measures and response times.

Moreover, the project underscores the broader implications of computer vision


technology in driving innovation and progress across various industries. From retail
analytics to transportation safety systems, the ability to automatically analyze and
interpret visual data opens up a myriad of opportunities for optimization and
improvement. By leveraging deep learning-based object detection models like YOLO,
organizations can gain valuable insights, make informed decisions, and enhance overall
operational efficiency.

In conclusion, the project exemplifies the transformative potential of YOLO-based


object detection in tackling real-world challenges. Through its seamless integration of
advanced machine learning techniques with video processing capabilities, it
demonstrates the power of computer vision in enhancing automation, efficiency, and
decision-making across diverse applications and industries.

References:

Lead Time Prediction


https://www.javatpoint.com/data-preprocessing-machine-learning
https://www.geeksforgeeks.org/introduction-machine-learning/

YOLO
GitHub – ultralytics/ultralytics: YOLOv8 in PyTorch > ONNX > OpenVINO > CoreML > TFLite
(https://github.com/ultralytics/ultralytics)
