1234-ad/Iris-Flower-Classification

🌸 Iris Flower Classification Project

Objective

Classify iris flowers into three species (Setosa, Versicolor, Virginica) based on measurements of their petals and sepals.

📊 Dataset

The classic Iris dataset from the UCI Machine Learning Repository, loaded via scikit-learn:

  • 150 samples (50 per species)
  • 4 features:
    • Sepal length (cm)
    • Sepal width (cm)
    • Petal length (cm)
    • Petal width (cm)
  • 3 classes: Setosa, Versicolor, Virginica

🛠️ Technologies Used

  • Python 3.x
  • Libraries:
    • pandas - Data manipulation
    • numpy - Numerical operations
    • matplotlib - Visualization
    • seaborn - Advanced visualization
    • scikit-learn - Machine learning models and metrics

📋 Project Steps

1. Load the Dataset

  • Load the Iris dataset from scikit-learn
  • Create a pandas DataFrame for easier manipulation
  • Display basic information and statistics
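
A minimal sketch of this step, assuming the DataFrame is built directly from `load_iris()`. The variable names `iris` and `df` and the `species` column name are illustrative and may differ from what `iris_classification.py` actually uses:

```python
import pandas as pd
from sklearn.datasets import load_iris

# Load the bundled copy of the dataset (no download needed).
iris = load_iris()

# Build a DataFrame with the four measurements as columns.
df = pd.DataFrame(iris.data, columns=iris.feature_names)

# Add a readable species label (assumed column name: "species").
df["species"] = pd.Categorical.from_codes(iris.target, iris.target_names)

# Basic information and summary statistics.
print(df.head())
df.info()
print(df.describe())
print(df["species"].value_counts())
```

The `value_counts()` call is a quick way to confirm the 50-per-species balance described in the Dataset section.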

2. Exploratory Data Analysis (EDA)

  • Pairplot: Visualize relationships between all feature pairs
  • Histograms: Show distribution of each feature by species
  • Box Plots: Display feature distributions and outliers
  • Correlation Heatmap: Show correlation between features

3. Data Preprocessing

  • Check for missing values (none found)
  • Split data into training (80%) and test (20%) sets, using stratified sampling to maintain class balance
  • Apply feature scaling using StandardScaler
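
The preprocessing steps above might look like the sketch below. The `random_state=42` seed and the variable names are assumptions for reproducibility. Fitting the scaler on the training split only is a common convention to avoid leaking test statistics, not something the project description confirms.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Missing-value check: count NaNs across the feature matrix.
print("Missing values:", int(np.isnan(X).sum()))

# 80/20 split; stratify=y keeps the class proportions equal in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42  # seed is an assumption
)

# Learn mean and standard deviation from the training data only,
# then apply the same transformation to the test data.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("Train class counts:", np.bincount(y_train))
print("Test class counts:", np.bincount(y_test))
```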

4. Model Training

Train and compare three classifiers:

  • Logistic Regression
  • K-Nearest Neighbors (KNN)
  • Decision Tree Classifier
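
A sketch of training and comparing the three classifiers, under stated assumptions: each model is wrapped in a pipeline with `StandardScaler`, hyperparameters are mostly scikit-learn defaults, and the seed is arbitrary. The project's script may configure the models differently.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42  # seed is an assumption
)

# Hyperparameters below are illustrative defaults, not the project's settings.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
}

scores = {}
for name, model in models.items():
    # Scaling inside the pipeline is fitted on the training data only.
    pipeline = make_pipeline(StandardScaler(), model)
    pipeline.fit(X_train, y_train)
    scores[name] = pipeline.score(X_test, y_test)  # test-set accuracy
    print(f"{name}: {scores[name]:.3f}")
```

Scaling matters for Logistic Regression and KNN. A decision tree is insensitive to it, so including it in that pipeline is harmless but not required.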

5. Model Evaluation

Evaluate models using multiple metrics:

  • Accuracy: Overall correctness
  • Precision: Fraction of predicted positives that are correct
  • Recall: Fraction of actual positives that are correctly identified
  • F1-Score: Harmonic mean of precision and recall
  • Confusion Matrix: Detailed prediction breakdown
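
To make the metric definitions concrete, here is a small sketch on hand-made labels (six hypothetical samples, not real model output). Macro averaging is an assumption; the project may use weighted averaging or per-class reports instead.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Hypothetical labels: one class-1 sample is mistaken for class 2.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]

accuracy = accuracy_score(y_true, y_pred)
# average="macro" computes each metric per class, then takes the plain mean.
precision = precision_score(y_true, y_pred, average="macro")
recall = recall_score(y_true, y_pred, average="macro")
f1 = f1_score(y_true, y_pred, average="macro")

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)

print(f"Accuracy:  {accuracy:.3f}")
print(f"Precision: {precision:.3f}")
print(f"Recall:    {recall:.3f}")
print(f"F1-score:  {f1:.3f}")
print(cm)
```

The single error shows up twice in the matrix view: it lowers recall for class 1 (one of its two samples is missed) and precision for class 2 (one of its three predictions is wrong).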

6. Feature Importance

Analyze which features are most important for classification using the Decision Tree model.
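
One way to read importances from a decision tree is the `feature_importances_` attribute. The sketch below fits on the full dataset for brevity, which is a simplification (the project presumably fits on the training split), and the seed is an assumption.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()

# Simplification: fit on all 150 samples just to inspect the importances.
tree = DecisionTreeClassifier(random_state=42)
tree.fit(iris.data, iris.target)

# Impurity-based importances; they are normalized to sum to 1.
importances = pd.Series(
    tree.feature_importances_, index=iris.feature_names
).sort_values(ascending=False)

print(importances)
```

Impurity-based importances describe how this particular tree split the data, so exact values can shift with the seed and the training split.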

🚀 How to Run

Prerequisites

Install required libraries:

pip install pandas numpy matplotlib seaborn scikit-learn

Execution

Run the main script:

python iris_classification.py

📈 Expected Results

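All three models typically achieve high accuracy (95%+) on this dataset. With only 30 test samples, each misclassification changes accuracy by about 3.3 percentage points, so exact figures vary with the random split: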

  • Logistic Regression: ~97-100%
  • K-Nearest Neighbors: ~97-100%
  • Decision Tree: ~97-100%

📁 Output Files

The script generates the following visualization files:

  1. iris_pairplot.png - Scatter plots of all feature combinations
  2. iris_distributions.png - Histograms showing feature distributions
  3. iris_boxplots.png - Box plots for each feature by species
  4. iris_correlation_heatmap.png - Feature correlation matrix
  5. iris_confusion_matrices.png - Confusion matrices for all models
  6. iris_model_comparison.png - Performance metrics comparison
  7. iris_feature_importance.png - Feature importance ranking

🎯 Skills Gained

  • ✅ Loading and exploring datasets
  • ✅ Data visualization techniques (scatter plots, histograms, box plots, heatmaps)
  • ✅ Data preprocessing and scaling
  • ✅ Train-test split methodology
  • ✅ Classification modeling with multiple algorithms
  • ✅ Model evaluation using various metrics
  • ✅ Confusion matrix interpretation
  • ✅ Feature importance analysis
  • ✅ Model comparison and selection

🔍 Key Insights

  1. Petal measurements (length and width) are typically more discriminative than sepal measurements
  2. Setosa is linearly separable from the other two species
  3. Versicolor and Virginica have some overlap, making them slightly harder to distinguish
  4. All three simple classifiers perform excellently on this dataset
  5. The dataset is well-balanced with no missing values
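
Two of these insights are easy to check directly. The sketch below tests whether a single cut on petal length separates Setosa from the other species, and confirms the class balance. The 2.5 cm threshold is an illustrative assumption, not a value taken from the project.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
petal_length = iris.data[:, 2]   # third column is petal length (cm)
is_setosa = iris.target == 0     # class 0 is setosa

threshold = 2.5  # assumed cut-off in cm, chosen for illustration

# Setosa is separable if every setosa sample falls below the threshold
# and every other sample falls above it.
setosa_below = bool(np.all(petal_length[is_setosa] < threshold))
others_above = bool(np.all(petal_length[~is_setosa] > threshold))
print("Setosa separable on petal length alone:", setosa_below and others_above)

# Class balance and missing-value check.
class_counts = np.bincount(iris.target)
print("Samples per class:", class_counts)
print("Missing values:", int(np.isnan(iris.data).sum()))
```

A threshold on one feature is the simplest possible linear boundary, so a perfect split here is enough to show linear separability for Setosa.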

👨‍💻 Author

Created as a beginner-friendly machine learning project to demonstrate classification techniques.

📝 License

This project is open source and available for educational purposes.
