SARANATHAN COLLEGE OF ENGINEERING
PANJAPPUR, TIRUCHIRAPALLI - 620 012
                             (AFFILIATED TO THE ANNA UNIVERSITY)
                Name
                Reg. No..
                Sem./Branch..
                Sub. Code/Subject...
                             Bonafide record of the work done
                            in the Laboratory during the period
                                                to
                                                                           Lecturer - in - charge
Head of the Department
Date
   Submitted for the Practical Examination held at SARANATHAN
   COLLEGE OF ENGINEERING, TIRUCHIRAPALLI on.
          EXAMINERS
            Saranathan College of Engineering
Department of Electronics and Communication Engineering
            Completed the Project named as
           SUPPLY CHAIN OPTIMIZATION
                  Submitted by
                Ramya -813822106080
                Surya Dharshini -813822106104
                Sakthi -813822106084
                Sivasankari -813822106095
                PROJECT TITLE: SUPPLY CHAIN OPTIMIZATION
 Introduction:
               Supply chain optimization involves refining the supply chain to enhance
efficiency, reduce costs, and improve service levels. This project demonstrates how industries
can achieve a more efficient, responsive, and optimized supply chain through the
implementation of lean principles.
 Objectives:
              - Reduce cost: Lower production, transportation, and inventory holding costs.
             - Increase Efficiency: Streamline operations, boost productivity, and optimize
resource use.
              - Improve Customer Service: Enhance delivery speed, order accuracy, and
overall satisfaction.
              - Promote Sustainability: Reduce environmental impact and ensure regulatory
compliance.
            - Foster Collaboration: Strengthen supplier relationships and integrate supply
chain systems.
System Requirements:
 Data:
     - Scalability: Handles large data volumes and adapts to network changes.
     - Data Management: Ensures accurate, reliable data through cleansing and storage.
     - Analytics: Provides advanced tools for historical analysis and optimization.
     - Optimization: Incorporates algorithms for inventory, production, and transportation planning.
     - Visualization: Offers intuitive tools for clear insights and decision-making.
     - Monitoring: Supports real-time tracking of activities and KPIs.
     - Security: Adheres to data security and compliance standards.
Hardware:
       - High-performance servers, networking equipment, and storage systems.
       - Backup and disaster recovery solutions.
       - Security infrastructure including firewalls and encryption mechanisms.
 Software:
       - Server and client operating systems.
       - Database management systems (RDBMS and NoSQL).
       - Supply chain optimization software suites.
       - Analytics, visualization, collaboration, and security software.
       - Integration middleware for connecting with other enterprise systems.
Methodology:
  Data preprocessing:
Data Cleaning:
   •   Address missing data, outliers, and duplicates in supply chain records to ensure
       accuracy and integrity.
Data Transformation:
   •   Scale numerical features for fair comparison and analysis.
   •   Encode categorical variables into numerical representations.
   •   Engineer new features to provide additional insights for optimization.
Data Reduction:
   •   Reduce dimensionality to focus on significant optimization factors.
   •   Sample data to maintain representativeness and accuracy, especially for large-scale
       projects.
Data Integration:
   •   Combine data from various sources into a unified dataset for comprehensive analysis.
   •   Ensure consistency and resolve discrepancies in formats or values.
Data Normalization:
   •   Normalize data to a common scale for appropriate contribution to optimization
       algorithms.
Data Formatting:
   •   Format data appropriately for optimization techniques, ensuring consistency and
       usability.
These preprocessing steps yield accurate, relevant, and well-prepared data for advanced
analysis and optimization, leading to more efficient and cost-effective supply chain
operations.
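As a minimal sketch of these steps, assuming a hypothetical supply chain DataFrame with columns such as 'demand', 'unit_cost', and 'region' (the column names, values, and thresholds below are illustrative assumptions, not project data), the cleaning, encoding, and normalization stages could look like this:

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical raw supply chain records (columns are illustrative assumptions)
raw = pd.DataFrame({
    'demand': [100, 150, None, 180, 5000],   # contains a missing value and an outlier
    'unit_cost': [10, 12, 15, 18, 20],
    'region': ['North', 'South', 'North', 'East', 'South'],
})

# Data cleaning: fill missing demand with the median and cap extreme outliers
raw['demand'] = raw['demand'].fillna(raw['demand'].median())
raw['demand'] = raw['demand'].clip(upper=raw['demand'].quantile(0.95))

# Data transformation: encode the categorical 'region' column numerically
encoded = pd.get_dummies(raw, columns=['region'], prefix='region')

# Data normalization: scale numerical features to a common 0-1 range
scaler = MinMaxScaler()
encoded[['demand', 'unit_cost']] = scaler.fit_transform(encoded[['demand', 'unit_cost']])

print(encoded)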
Existing work:
   1. Traditional Approaches:
           •   Inventory Optimization: EOQ and JIT minimize costs while ensuring stock levels (a minimal EOQ computation is sketched at the end of this section).
          •   Transportation Optimization: Routing algorithms improve efficiency.
          •   Production Planning: Linear programming optimizes production schedules.
   2. Advanced Techniques:
          •   Demand Forecasting: Machine learning enhances accuracy.
          •   Supply Chain Network Design: Optimization models consider facility locations.
          •   Risk Management: Stochastic optimization mitigates risks.
   3. Integration of Technology:
          •   Big Data Analytics: Gain insights from large datasets.
          •   IoT: Sensors provide real-time data for proactive decisions.
          •   Blockchain: Ensures transparency and security.
   4. Industry-Specific Solutions:
          •   Retail: Algorithms optimize shelf space and replenishment.
          •   Manufacturing: Scheduling algorithms optimize production.
          •   Logistics: Route optimization software improves last-mile delivery.
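As a worked illustration of the classical EOQ formula referenced above, EOQ = sqrt(2DS/H), here is a minimal sketch with assumed demand, ordering-cost, and holding-cost values (not project data):

import math

# Economic Order Quantity: EOQ = sqrt(2 * D * S / H)
# D = annual demand (units), S = ordering cost per order, H = annual holding cost per unit
D = 12000   # assumed annual demand
S = 50.0    # assumed cost per order
H = 2.0     # assumed holding cost per unit per year

eoq = math.sqrt(2 * D * S / H)
orders_per_year = D / eoq
print(f'EOQ: {eoq:.0f} units, roughly {orders_per_year:.1f} orders per year')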
Proposed work:
   •   Identify Optimization Opportunities:
          •   Analyze data and use modeling techniques to pinpoint areas for improvement
              in the supply chain.
   •   Develop Tailored Strategies:
          •   Create optimization strategies focusing on cost, service level, agility, and
              sustainability.
   •   Implement and Evaluate:
          •   Put strategies into action and assess their impact on key metrics.
   •   Expected Contributions:
          •   Highlight contributions to theory and practice in supply chain management.
Model Evaluation:
              •     Assess accuracy, precision, recall, and F1-score.
              •     Utilize cross-validation and confusion matrix for validation.
              •     Plot validation curves and ROC curves for analysis (a minimal evaluation sketch follows this list).
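A minimal sketch of this evaluation stage, assuming a placeholder dataset generated with scikit-learn (the features and classifier here are illustrative stand-ins, not the project's actual data):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.metrics import classification_report, confusion_matrix

# Placeholder dataset standing in for the supply chain features
X, y = make_classification(n_samples=200, n_features=4, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

model = RandomForestClassifier(random_state=42)

# Cross-validation on the training split
cv_scores = cross_val_score(model, X_train, y_train, cv=5)
print('Cross-validation accuracy:', cv_scores.mean())

# Fit, then report the confusion matrix plus accuracy, precision, recall, and F1-score
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))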
Flowchart:
Implementation:
  Programs:
import pandas as pd

# Aggregate data (assumes df is a DataFrame with 'product_category', 'quantity',
# and 'unit_price' columns, such as the one built in the data-quality example below)
summary_df = df.groupby('product_category').agg(
    {'quantity': 'sum', 'unit_price': 'mean'}).reset_index()
print(summary_df)
import pandas as pd
# Sample data for products
product_data = {
    'ProductID': [1, 2, 3, 4],
    'Product_Name': ['Product A', 'Product B', 'Product C', 'Product D'],
    'Price': [10, 20, 15, 25]
}
# Sample data for sales
sales_data = {
    'ProductID': [1, 2, 3, 4],
    'Units_Sold': [100, 200, 150, 300]
}
# Create dataframes from the sample data
product_df = pd.DataFrame(product_data)
sales_df = pd.DataFrame(sales_data)
# Merge the two dataframes based on the 'ProductID' column
merged_df = pd.merge(product_df, sales_df, on='ProductID')
print("Merged Data:")
print(merged_df)
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Sample supply chain data
data = {
    'product_id': [101, 102, 103, 104, 105],
    'demand': [100, 150, 200, 180, 220],
    'unit_cost': [10, 12, 15, 18, 20],
    'lead_time_days': [5, 7, 6, 3, 4]
}
# Create DataFrame
df = pd.DataFrame(data)
# Univariate analysis
# Histogram of demand
plt.figure(figsize=(10, 6))
plt.subplot(2, 2, 1)
plt.hist(df['demand'], bins=10, color='skyblue', edgecolor='black')
plt.title('Histogram of Demand')
plt.xlabel('Demand')
plt.ylabel('Frequency')
# Box plot of unit cost
plt.subplot(2, 2, 2)
plt.boxplot(df['unit_cost'])
plt.title('Boxplot of Unit Cost')
plt.ylabel('Unit Cost')
# Bivariate analysis
# Scatter plot of demand vs. lead time
plt.subplot(2, 2, 3)
plt.scatter(df['demand'], df['lead_time_days'], color='green')
plt.title('Scatter Plot of Demand vs. Lead Time')
plt.xlabel('Demand')
plt.ylabel('Lead Time (Days)')
plt.tight_layout()
plt.show()
# Multivariate analysis
# Pairplot for all variables (pairplot creates its own figure, so it is drawn separately)
pair_grid = sns.pairplot(df)
pair_grid.fig.suptitle('Pairplot of Supply Chain Variables', y=1.02)
plt.show()
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
# Sample data for order lead times
order_lead_times = np.random.normal(loc=7, scale=2, size=1000)  # Mean = 7, Standard Deviation = 2
# Univariate visualization using histogram
plt.figure(figsize=(10, 6))
sns.histplot(order_lead_times, kde=True, color='skyblue', bins=30)
plt.title('Distribution of Order Lead Times')
plt.xlabel('Lead Time (days)')
plt.ylabel('Frequency')
plt.grid(True)
plt.show()
import numpy as np
import matplotlib.pyplot as plt
# Sample data for order lead times and order quantities
order_lead_times = np.random.normal(loc=7, scale=2, size=1000)  # Mean = 7, Standard Deviation = 2
order_quantities = np.random.normal(loc=100, scale=20, size=1000)  # Mean = 100, Standard Deviation = 20
# Create a scatter plot
plt.figure(figsize=(10, 6))
plt.scatter(order_lead_times, order_quantities, color='skyblue', alpha=0.6)
plt.title('Order Lead Times vs. Order Quantities')
plt.xlabel('Lead Time (days)')
plt.ylabel('Order Quantity')
plt.grid(True)
plt.show()
import numpy as np
import plotly.graph_objs as go
# Generate sample data
np.random.seed(0)
x = np.random.randn(100)
y = np.random.randn(100)
# Create a scatter plot
scatter_plot = go.Scatter(
  x=x,
  y=y,
  mode='markers',
  marker=dict(
       size=10,
       color='rgba(152, 0, 0, .8)',
       line=dict(
           width=2,
           color='rgb(0, 0, 0)'
       )
  ),
    name='Random Data'
)
# Create layout for the plot
layout = go.Layout(
    title='Interactive Scatter Plot',
    xaxis=dict(title='X-axis'),
    yaxis=dict(title='Y-axis'),
    hovermode='closest'
)
# Combine data and layout into a figure
fig = go.Figure(data=[scatter_plot], layout=layout)
# Display the interactive plot
fig.show()
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, average_precision_score
# Sample dataset creation
data = {
    'past_deliveries': [5, 2, 9, 4, 3, 8, 6, 7, 1, 5],
    'quality_score': [90, 85, 78, 92, 88, 75, 80, 82, 91, 87],
    'communication_rating': [4, 3, 5, 5, 4, 2, 3, 4, 5, 4],
    'on_time': [1, 1, 0, 1, 1, 0, 0, 1, 1, 1]
}
df = pd.DataFrame(data)
# Define features and target variable
X = df[['past_deliveries', 'quality_score', 'communication_rating']]
y = df['on_time']
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Train the Random Forest Classifier
classifier = RandomForestClassifier(random_state=42)
classifier.fit(X_train, y_train)
# Make predictions and evaluate the model
y_pred = classifier.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}')
# Perform ranking analysis (Precision at K and Mean Average Precision)
k = 3  # Set the value of K
y_pred_proba = classifier.predict_proba(X_test)[:, 1]  # Predict probabilities for the positive class
sorted_indices = y_pred_proba.argsort()[::-1]  # Sort predictions by probability in descending order
top_k_actuals = y_test.iloc[sorted_indices][:k]  # True labels of the top K ranked predictions
precision_at_k = top_k_actuals.mean()  # Fraction of the top K predictions that are actually positive
print(f'Precision at {k}: {precision_at_k:.2f}')
# Calculate Mean Average Precision (MAP)
average_precision = average_precision_score(y_test, y_pred_proba)
print(f'Mean Average Precision: {average_precision:.2f}')
import numpy as np
import matplotlib.pyplot as plt
# Sample data for order lead times and order quantities
order_lead_times = np.random.normal(loc=7, scale=2, size=1000)  # Mean = 7, Standard Deviation = 2
order_quantities = np.random.normal(loc=100, scale=20, size=1000)  # Mean = 100, Standard Deviation = 20
# Create a box plot for order lead times grouped by quartiles of order quantities
plt.figure(figsize=(10, 6))
plt.boxplot([order_lead_times[order_quantities < np.percentile(order_quantities, 25)],
        order_lead_times[(order_quantities >= np.percentile(order_quantities, 25)) &
                  (order_quantities < np.percentile(order_quantities, 50))],
        order_lead_times[(order_quantities >= np.percentile(order_quantities, 50)) &
                  (order_quantities < np.percentile(order_quantities, 75))],
        order_lead_times[order_quantities >= np.percentile(order_quantities, 75)]],
       labels=['Q1', 'Q2', 'Q3', 'Q4'])
plt.title('Order Lead Times by Order Quantity Quartiles')
plt.xlabel('Order Quantity Quartiles')
plt.ylabel('Lead Time (days)')
plt.grid(True)
plt.show()
import numpy as np
import matplotlib.pyplot as plt
# Sample data for order lead times
order_lead_times = np.random.normal(loc=7, scale=2, size=1000)  # Mean = 7, Standard Deviation = 2
# Define bins for lead time intervals
bins = np.arange(0, 15, 1) # Define bins from 0 to 15 days in steps of 1 day
# Compute frequency of lead times falling into each bin
hist, bins = np.histogram(order_lead_times, bins=bins)
# Plot the bar chart
plt.figure(figsize=(10, 6))
plt.bar(bins[:-1], hist, width=1, color='skyblue', edgecolor='black')
plt.title('Distribution of Order Lead Times')
plt.xlabel('Lead Time (days)')
plt.ylabel('Frequency')
plt.xticks(np.arange(0, 16, 1)) # Set x-axis ticks from 0 to 15 days
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()
import pandas as pd
# Sample supply chain data
data = {
    'product_id': [1001, 1002, 1003, 1004, 1005],
    'quantity': [10, 15, 0, 20, 1000], # Intentional outlier
    'unit_price': [5.00, 8.00, 2.50, 10.00, 2.00], # Intentional inconsistent entry
    'total_price': [50.00, 120.00, 1.00, 200.00, 2000.00],  # Inconsistent with quantity * unit_price
    'product_category': ['Category A', 'Category B', 'Category A', 'Category C', 'Category D']  # Intentional inconsistency
}
# Create DataFrame
df = pd.DataFrame(data)
# 1. Check for Outliers
outliers = df[df['quantity'] > 100] # Assume threshold for outliers is 100
print("Outliers:")
print(outliers)
print()
# 2. Cross-Field Validation
df['total_price_calc'] = df['quantity'] * df['unit_price']
inconsistent_entries = df[abs(df['total_price'] - df['total_price_calc']) > 0.01]
print("Inconsistent Entries:")
print(inconsistent_entries)
print()
# 3. Check for Duplicate Entries
duplicate_entries = df[df.duplicated(subset='product_id', keep=False)]
print("Duplicate Entries:")
print(duplicate_entries)
print()
# 4. Consistency Checks
inconsistent_categories = df[~df['product_category'].isin(['Category A', 'Category B',
'Category C'])]
print("Inconsistent Categories:")
print(inconsistent_categories)
Output Screenshots:
Future Enhancements:
         •   Integrate advanced machine learning techniques.
         •   Implement real-time monitoring systems.
         •   Develop models to handle uncertainty and disruptions.
         •   Explore blockchain technology for transparency.
         •   Utilize predictive analytics for risk anticipation.
         •   Foster collaborative optimization.
         •   Develop dynamic optimization models for real-time adaptation.
 Conclusion:
This project has developed a machine learning-based system for supply chain optimization.
By leveraging data preprocessing techniques, feature engineering, and an initial selection of
machine learning algorithms, industries and companies can use this system to run their
supply chains more efficiently. In conclusion, supply chain optimization is a cornerstone of
modern business success. By leveraging emerging technologies and committing to continuous
improvement, organizations can streamline operations, enhance customer satisfaction, and stay
ahead of the competition in today's dynamic marketplace. In essence, supply chain optimization
is not just a strategy; it is a mindset, a relentless pursuit of excellence in every aspect of
the supply chain.
SUBMITTED BY,
  RAMYA. V
[NM id: aut252080]
Register no: 813822106080