A comprehensive collection of fundamental machine learning algorithms implemented from scratch in Python. This repository is designed for educational purposes, helping developers understand the core concepts behind popular ML algorithms without relying on library abstractions.
- Overview
- Project Structure
- Implemented Algorithms
- Getting Started
- Installation
- Usage Examples
- Algorithm Details
- Contributing
- License
This repository demonstrates hands-on implementations of key machine learning algorithms, complete with:
- Clean, well-documented code
- Example usage scripts for each algorithm
- Detailed README files for each implementation
- Educational focus on understanding fundamentals
Perfect for students, educators, and developers looking to deepen their understanding of machine learning concepts.
ML-Algorithm/
├── KNN/
│ ├── knn.py # K-Nearest Neighbors implementation
│ ├── main.py # Example usage of KNN
│ └── README.md # Detailed KNN documentation
├── Linear-Regression/
│ ├── linear.py # Linear Regression implementation
│ ├── main.py # Example usage of Linear Regression
│ └── README.md # Detailed Linear Regression documentation
├── Logistic-Regression/
│ ├── logistic.py # Logistic Regression implementation
│ ├── main.py # Example usage of Logistic Regression
│ └── README.md # Detailed Logistic Regression documentation
├── SVM/
│ ├── svm.py # Support Vector Machine implementation
│ ├── main.py # Example usage of SVM
│ └── README.md # Detailed SVM documentation
├── requirements.txt # Project dependencies
├── setup.py # Package setup and installation
├── LICENSE # MIT License file
└── README.md # This file
Description: A non-parametric, instance-based classification algorithm that predicts the class of a data point based on the majority class among its k nearest neighbors in the feature space.
Key Characteristics:
- Non-parametric approach
- Instance-based learning
- Suitable for both classification and regression
- Sensitive to feature scaling
Files: KNN/knn.py, KNN/main.py, KNN/README.md
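The majority-vote idea described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the code in KNN/knn.py: the function names `standardize` and `knn_predict`, the Euclidean distance metric, and the toy data are all assumptions made for the example. The scaling step is included because KNN is sensitive to feature scaling.

```python
import numpy as np


def standardize(X_train, X_test):
    """Scale features to zero mean and unit variance using training statistics."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    return (X_train - mean) / std, (X_test - mean) / std


def knn_predict(X_train, y_train, X_test, k=3):
    """Predict a label for each row of X_test by majority vote among k neighbors."""
    predictions = []
    for x in X_test:
        # Euclidean distance from x to every training point
        distances = np.sqrt(((X_train - x) ** 2).sum(axis=1))
        # Indices of the k closest training points
        nearest = np.argsort(distances)[:k]
        # Majority vote; ties resolve to the smallest label value
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        predictions.append(labels[np.argmax(counts)])
    return np.array(predictions)


# Toy data: two well-separated clusters (illustrative values only)
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
y_train = np.array([0, 0, 0, 1, 1, 1])
X_test = np.array([[1.1, 1.0], [5.1, 5.0]])

X_train_s, X_test_s = standardize(X_train, X_test)
preds = knn_predict(X_train_s, y_train, X_test_s, k=3)
print(preds)
```

There is no training step beyond storing the data, which is what "instance-based" means here: all the work happens at prediction time.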
Description: A supervised learning algorithm that models the linear relationship between a dependent variable and one or more independent variables using the least squares method.
Key Characteristics:
- Parametric approach
- Assumes linear relationship
- Minimizes mean squared error (MSE)
- Works well when the relationship between features and target is approximately linear
Files: Linear-Regression/linear.py, Linear-Regression/main.py, Linear-Regression/README.md
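As a rough illustration of the least squares approach described above, the sketch below fits a line with `np.linalg.lstsq`. It is not the code in Linear-Regression/linear.py, which may use a different solver such as gradient descent. The function names `fit_linear_regression` and `mse` and the toy data are assumptions for the example.

```python
import numpy as np


def fit_linear_regression(X, y):
    """Return (weights, bias) that minimize the mean squared error."""
    # Append a column of ones so the bias is learned as an extra coefficient
    X_b = np.column_stack([X, np.ones(len(X))])
    # Solve the least squares problem: minimize ||X_b @ theta - y||^2
    theta, *_ = np.linalg.lstsq(X_b, y, rcond=None)
    return theta[:-1], theta[-1]


def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return float(np.mean((y_true - y_pred) ** 2))


# Noise-free toy data generated from y = 2x + 1 (illustrative values only)
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2.0 * X[:, 0] + 1.0

weights, bias = fit_linear_regression(X, y)
y_pred = X @ weights + bias
print(weights, bias, mse(y, y_pred))
```

Because the toy data is noise-free, the fit recovers a slope close to 2 and an intercept close to 1, with an error near zero.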
Description: A supervised learning algorithm for binary classification that applies the logistic (sigmoid) function to model the probability of class membership.
Key Characteristics:
- Binary classification
- Uses sigmoid activation function
- Gradient descent optimization
- Produces probability estimates
Files: Logistic-Regression/logistic.py, Logistic-Regression/main.py, Logistic-Regression/README.md
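The combination of a sigmoid output and gradient descent listed above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code in Logistic-Regression/logistic.py: the function names, the learning rate `lr=0.1`, the iteration count, and the toy data are all chosen for the example.

```python
import numpy as np


def sigmoid(z):
    """Map real-valued scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic_regression(X, y, lr=0.1, n_iters=1000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iters):
        p = sigmoid(X @ w + b)              # predicted probabilities
        grad_w = X.T @ (p - y) / n_samples  # gradient of mean log loss w.r.t. w
        grad_b = np.mean(p - y)             # gradient of mean log loss w.r.t. b
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy 1-D data: negative inputs are class 0, positive inputs are class 1
X = np.array([[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]])
y = np.array([0, 0, 0, 1, 1, 1])

w, b = fit_logistic_regression(X, y)
probs = sigmoid(X @ w + b)           # probability estimates for class 1
preds = (probs >= 0.5).astype(int)   # threshold at 0.5 for hard labels
print(preds)
```

The 0.5 threshold is a common default rather than a requirement; the probability estimates themselves are often the more useful output.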
Description: A supervised learning algorithm that finds the optimal hyperplane to separate classes by maximizing the margin between them, using hinge loss and gradient descent.
Key Characteristics:
- Effective for binary classification
- Finds optimal decision boundary
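- Decision boundary depends only on points on or inside the margin (the support vectors), so correctly classified points far from the boundary have no effect
- Supports linear classification directly; non-linear classification generally requires kernels or feature transformations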
Files: SVM/svm.py, SVM/main.py, SVM/README.md
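A minimal sketch of a linear SVM trained with hinge loss and per-sample subgradient descent is shown below. It is not the code in SVM/svm.py. The function name `fit_linear_svm`, the hyperparameters `lr` and `lambda_param`, the {-1, +1} label encoding, and the toy data are assumptions for the example.

```python
import numpy as np


def fit_linear_svm(X, y, lr=0.01, lambda_param=0.01, n_iters=1000):
    """Fit a linear SVM; labels in y must be -1 or +1.

    Each sample contributes lambda_param * ||w||^2 + max(0, 1 - y * (w @ x + b)).
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iters):
        for x_i, y_i in zip(X, y):
            margin = y_i * (x_i @ w + b)
            if margin >= 1:
                # Sample is outside the margin: only the regularizer contributes
                w -= lr * (2 * lambda_param * w)
            else:
                # Sample violates the margin: add the hinge loss subgradient
                w -= lr * (2 * lambda_param * w - y_i * x_i)
                b -= lr * (-y_i)
    return w, b


# Toy data: two linearly separable clusters (illustrative values only)
X = np.array([[-2.0, -2.0], [-3.0, -1.0], [-1.0, -3.0],
              [2.0, 2.0], [3.0, 1.0], [1.0, 3.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

w, b = fit_linear_svm(X, y)
preds = np.sign(X @ w + b).astype(int)  # side of the hyperplane gives the class
print(preds)
```

The regularization term keeps the weights small, which corresponds to a wider margin; a larger `lambda_param` trades training accuracy for margin width.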
- Python 3.6 or higher
- pip (Python package manager)
- Clone or download this repository:
git clone <repository-url>
cd ML-Algorithm
- Install dependencies:
pip install -r requirements.txt
Or install manually:
pip install numpy scikit-learn
Each algorithm directory contains a main.py file with example usage. To run any algorithm:
# KNN Example
cd KNN
python main.py
# Linear Regression Example
cd ../Linear-Regression
python main.py
# Logistic Regression Example
cd ../Logistic-Regression
python main.py
# SVM Example
cd ../SVM
python main.py
A typical usage pattern, shown here for KNN:
from KNN.knn import KNN
# Create and train a KNN classifier (X_train, y_train, and X_test are arrays you provide)
knn = KNN(k=3)
knn.fit(X_train, y_train)
# Make predictions
predictions = knn.predict(X_test)
For in-depth information about each algorithm, please refer to the individual README files:
- KNN Documentation
- Linear Regression Documentation
- Logistic Regression Documentation
- SVM Documentation
Each README includes:
- Mathematical foundations
- Algorithm complexity analysis
- Usage examples
- Performance metrics
- Advantages and disadvantages
| Package | Version | Purpose |
|---|---|---|
| NumPy | >= 1.19.0 | Numerical computations |
| scikit-learn | >= 0.24.0 | Data generation and evaluation metrics |
Contributions are welcome! Here's how you can help:
- Fork the repository
- Create a feature branch (git checkout -b feature/improvement)
- Commit your changes (git commit -am 'Add new feature')
- Push to the branch (git push origin feature/improvement)
- Submit a Pull Request
Please ensure your code follows the existing style and includes appropriate documentation.
This project is licensed under the MIT License. See the LICENSE file for more details.
These implementations are designed for educational purposes and serve as learning resources for understanding fundamental machine learning concepts.
- All implementations are written from scratch without relying on high-level ML libraries
- Code is optimized for clarity and understanding rather than performance
- Each algorithm includes detailed comments explaining key concepts
- Example datasets use scikit-learn for convenience, but the core algorithms are custom-built