Erwdev/tb-cxr-detection

TB-CXR Detection

Tuberculosis Detection from Chest X-Ray Images Using Computer Vision & Machine Learning


🚀 Live Demo

👉 Try the App Here!

Upload an X-ray image → get instant TB detection results with visualization!


👥 Development Team

| Name | NPM (Student ID) | Email | Main Task |
|------|------------------|-------|-----------|
| Azhar Maulana | 24/533487/PA/22582 | azharmaulana533487@mail.ugm.ac.id | Preprocessing |
| Revy Satya Gunawan | 24/538296/PA/22835 | revysatyagunawan538296@mail.ugm.ac.id | Segmentation |
| Raditya Nathaniel Nugroho | 24/543188/PA/23069 | radityanathanielnugroho2005@mail.ugm.ac.id | Morphological Processing |
| Benedictus Erwin Widianto | 23/520176/PA/22350 | benedictuserwinwidianto@mail.ugm.ac.id | Feature Extraction + Project Lead |

🎯 Features

  • ✅ Automated Lung Segmentation - K-Means clustering to isolate the lung region
  • ✅ Advanced Preprocessing - CLAHE + Gaussian Blur for image enhancement
  • ✅ Multi-Region Detection - Detects lung, nodule, and cavity regions
  • ✅ Feature Extraction - LBP, GLCM, Edge, and Hough Line features
  • ✅ ML Classification - SLDT-MSA (Stacking + Moth Search Algorithm)
  • ✅ Interactive Visualization - Real-time visualization with Streamlit
  • ✅ Morphological Analysis - Complete analysis of morphological operations

🔬 Technical Implementation (Pipeline)

| Stage | Technique | Input | Output |
|-------|-----------|-------|--------|
| 1. Preprocessing | Grayscale → Gaussian Blur → CLAHE | image_path: str | preprocessed: np.ndarray (H×W) |
| 2. Segmentation | K-Means (3 clusters) + Adaptive Threshold | preprocessed | masks: dict (lung, nodule, cavity) |
| 3. Morphology | Otsu + Erosion/Dilation/Opening/Closing | mask | morphology_results: dict |
| 4. Feature Extraction | Edge (Canny) + Lines (Hough) + GLCM + LBP | img + lung_mask | features: dict (14 features) |
| 5. Classification | SLDT-MSA (Stacked Decision Tree + Moth Search) | feature_vector | prediction: "Normal"/"Tuberculosis" |

Feature Set (14 Features)

  • Shape Features (3): Edge Sum, Number of Lines, Corner Count
  • Texture Features (2): GLCM Contrast, GLCM Homogeneity
  • LBP Features (9): Local Binary Pattern Histogram (9 bins)
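
To make the LBP part of the feature set more concrete, here is a minimal NumPy sketch of a basic 8-neighbour LBP histogram. It is not the code from src/tes.py: the function name `lbp_histogram`, the use of the basic (non-uniform) LBP variant, and the grouping of the 256 possible codes into 9 equal-width bins are all assumptions made for illustration.

```python
from typing import Optional

import numpy as np


def lbp_histogram(img: np.ndarray, mask: Optional[np.ndarray] = None, bins: int = 9) -> np.ndarray:
    """Basic 8-neighbour LBP codes summarised as a normalised histogram (hypothetical helper)."""
    img = img.astype(np.int32)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Offsets of the 8 neighbours, clockwise starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.int32) << bit
    if mask is not None:
        # Keep only codes whose centre pixel lies inside the mask.
        codes = codes[mask[1:-1, 1:-1] > 0]
    hist, _ = np.histogram(codes, bins=bins, range=(0, 256))
    total = hist.sum()
    return hist / total if total > 0 else hist.astype(float)


rng = np.random.default_rng(0)
demo = rng.integers(0, 256, size=(64, 64))
lbp = lbp_histogram(demo)
# Placeholder zeros stand in for the 3 shape and 2 texture features.
feature_vector = np.concatenate([[0.0, 0.0, 0.0], [0.0, 0.0], lbp])
print(feature_vector.shape)  # (14,)
```

Concatenating 3 shape values, 2 texture values, and the 9-bin histogram gives the 14-element vector listed above.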

📊 Dataset

  • Source: Kaggle TB Chest X-ray Database
  • Total: 4,200 images (3,500 Normal, 700 TB)
  • Split: 80% Training, 20% Testing (stratified)
  • Format: PNG/JPG grayscale images
  • Structure:
    data/raw/TB_Chest_Radiography_Database/
      ├── Normal/          → 3,500 images
      └── Tuberculosis/    → 700 images
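
The stratified 80/20 split can be sketched without any dependencies as follows. This is an illustrative stand-in, not the project's training code: the function name `stratified_split` and the seed are assumptions. In practice, scikit-learn's `train_test_split` with its `stratify` argument does the same job.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) with the class ratio preserved in both parts."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx


# Same class counts as the dataset described above.
labels = ["Normal"] * 3500 + ["Tuberculosis"] * 700
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 3360 840
```

With these counts the test set holds 700 Normal and 140 TB images, so the 5:1 class imbalance is the same in both splits.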
    

📁 Project Structure

tb-cxr-detection/
├── README.md
├── requirements.txt         # Dependencies for production
├── requirements-local.txt   # Dependencies for development
├── packages.txt            # System dependencies (deployment)
├── config.yaml             # Configuration file
├── .streamlit/
│   └── config.toml         # Streamlit app configuration
├── app/
│   ├── main.py            # 🎨 Streamlit UI
│   ├── pipeline.py        # Backend analysis pipeline
│   └── utils/
│       └── visualizer.py  # Visualization functions
├── src/
│   ├── __init__.py
│   ├── tes.py             # Complete pipeline (preprocessing → features)
│   ├── preprocessing/
│   ├── segmentation/
│   ├── morphology/
│   └── classification/
│       ├── train.py       # Model training (SLDT-MSA)
│       └── test_a.py      # Model evaluation
├── models/
│   └── tb_model_raw.pkl   # Trained classifier (Git LFS)
├── data/
│   ├── raw/               # Original dataset (gitignored)
│   ├── processed/
│   │   ├── features/
│   │   │   └── dataset.csv  # Extracted features
│   │   ├── images/
│   │   └── masks/
│   └── mock/              # Sample images for testing
├── notebooks/             # Jupyter notebooks (development)
└── tests/                 # Unit tests

🛠️ Setup & Installation

Prerequisites

  • Python 3.10 or 3.11
  • Git
  • (Optional) Git LFS for model file

Installation Steps

# 1. Clone repository
git clone https://github.com/benedictuserwinwidianto/tb-cxr-detection.git
cd tb-cxr-detection

# 2. Create virtual environment
python -m venv .venv

# Windows
.venv\Scripts\activate

# Linux/Mac
source .venv/bin/activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Verify installation
python -c "import streamlit; import cv2; import sklearn; print('✓ All dependencies installed!')"

🚀 Running the Application

Local Development

# From project root directory
streamlit run app/main.py

# App will open at http://localhost:8501

📊 Model Performance

Classifier: SLDT-MSA (Stacking Loopy Decision Tree + Moth Search Algorithm)

Model Architecture

  • Feature Selection: Moth Search Algorithm (MSA)
  • Base Learners: Decision Tree + Random Forest
  • Meta Learner: Decision Tree
  • Optimization: Grid Search (class weight + max depth)
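
The stacking part of this architecture could look roughly like the scikit-learn sketch below. It is an assumption-laden illustration rather than the contents of src/classification/train.py: the data is synthetic, the grid values are arbitrary, and the Moth Search Algorithm feature-selection step is deliberately left out.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the 14-feature dataset, with roughly a 5:1 class imbalance.
X, y = make_classification(
    n_samples=300, n_features=14, n_informative=8,
    weights=[5 / 6, 1 / 6], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Base learners: Decision Tree + Random Forest; meta learner: Decision Tree.
stack = StackingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=25, random_state=0)),
    ],
    final_estimator=DecisionTreeClassifier(random_state=0),
    cv=3,
)

# Grid over class weight and max depth (values chosen only for illustration).
param_grid = {
    "dt__class_weight": [None, "balanced"],
    "dt__max_depth": [3, 5],
    "final_estimator__max_depth": [2, 3],
}
search = GridSearchCV(stack, param_grid, cv=3, scoring="f1")
search.fit(X_train, y_train)

print(search.best_params_)
predictions = search.predict(X_test)
```

Scoring on F1 rather than accuracy is one reasonable choice for an imbalanced dataset like this one; the project may use a different metric.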

💻 Usage Examples

Using Streamlit App

  1. Upload chest X-Ray image (PNG/JPG)
  2. Click "🔬 Analyze Image"
  3. View results:
    • Prediction (Normal/TB) with confidence
    • Segmentation masks
    • Morphological operations
    • Extracted features

Training Custom Model

# Generate features from dataset
python src/tes.py

# Train classifier
python src/classification/train.py

# Test model
python src/classification/test_a.py

🔧 Development Workflow

Branch Strategy

# Create feature branch
git checkout -b feature/module-name-yourname

# Example
git checkout -b feature/segmentation-revy

# Work → Commit → Push
git add .
git commit -m "Add segmentation module"
git push origin feature/segmentation-revy

Pull Request Process

  1. Create a PR from the feature branch to main
  2. Tag @benedictuserwinwidianto + 1 teammate for review
  3. Merge after receiving 1 approval
  4. Delete the feature branch after merging

📚 Documentation

Pipeline Modules

1. Preprocessing (src/tes.py)

preprocess_image(image_path: str) -> np.ndarray
  • Gaussian Blur (3×3)
  • CLAHE (clipLimit=2.0, tileGridSize=8×8)

2. Segmentation (src/tes.py)

segment_lungs(img: np.ndarray) -> dict
  • K-Means clustering (k=3)
  • Adaptive threshold for nodule/cavity
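
The K-Means step can be illustrated with a small NumPy implementation that clusters pixel intensities into k=3 groups. This is a sketch under stated assumptions, not the project's `segment_lungs`: the helper names are hypothetical, the clustering uses intensity only, and treating the darkest cluster as the lung candidate is a simplification (the dark background outside the body would also fall into it).

```python
import numpy as np


def kmeans_intensity(img: np.ndarray, k: int = 3, iters: int = 20) -> tuple:
    """Cluster pixel intensities with Lloyd's algorithm; returns (label image, centres)."""
    pixels = img.reshape(-1).astype(np.float64)
    # Deterministic start: centres spread evenly across the intensity range.
    centres = np.linspace(pixels.min(), pixels.max(), k)
    for _ in range(iters):
        labels = np.argmin(np.abs(pixels[:, None] - centres[None, :]), axis=1)
        for c in range(k):
            members = pixels[labels == c]
            if members.size:  # leave an empty cluster's centre unchanged
                centres[c] = members.mean()
    # Final assignment so labels match the returned centres.
    labels = np.argmin(np.abs(pixels[:, None] - centres[None, :]), axis=1)
    return labels.reshape(img.shape), centres


def darkest_cluster_mask(img: np.ndarray) -> np.ndarray:
    """Binary mask (0/255) of the cluster with the lowest mean intensity."""
    labels, centres = kmeans_intensity(img, k=3)
    return (labels == np.argmin(centres)).astype(np.uint8) * 255


# Toy image with three flat intensity bands.
toy = np.zeros((6, 6), dtype=np.uint8)
toy[:2], toy[2:4], toy[4:] = 20, 120, 230
toy_labels, toy_centres = kmeans_intensity(toy)
toy_mask = darkest_cluster_mask(toy)
print(toy_centres)
```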

3. Morphology (src/tes.py)

apply_morphology(mask: np.ndarray, kernel_size: int) -> dict
  • Otsu thresholding
  • Erosion, Dilation, Opening, Closing

4. Feature Extraction (src/tes.py)

extract_lbp_features(img: np.ndarray, mask: np.ndarray) -> dict
  • Edge detection (Canny)
  • Line detection (Hough)
  • GLCM features
  • LBP histogram
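
The two GLCM texture features (contrast and homogeneity) can be computed from a grey-level co-occurrence matrix as sketched below in plain NumPy. The function name, the quantisation to 8 grey levels, and the single horizontal offset of one pixel are assumptions for illustration; the project may use different settings or a library routine instead.

```python
import numpy as np


def glcm_contrast_homogeneity(img: np.ndarray, levels: int = 8) -> tuple:
    """Contrast and homogeneity from a symmetric, normalised GLCM (offset: 1 pixel right)."""
    # Quantise 8-bit intensities down to `levels` grey levels.
    q = (img.astype(np.int64) * levels) // 256
    left, right = q[:, :-1].ravel(), q[:, 1:].ravel()
    glcm = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(glcm, (left, right), 1.0)  # count each horizontal neighbour pair
    glcm += glcm.T                       # make the matrix symmetric
    glcm /= glcm.sum()                   # normalise to joint probabilities
    i, j = np.indices((levels, levels))
    contrast = float(np.sum(glcm * (i - j) ** 2))
    homogeneity = float(np.sum(glcm / (1.0 + (i - j) ** 2)))
    return contrast, homogeneity


flat_img = np.full((8, 8), 100, dtype=np.uint8)
checker = (np.indices((8, 8)).sum(axis=0) % 2 * 255).astype(np.uint8)
print(glcm_contrast_homogeneity(flat_img))  # (0.0, 1.0)
print(glcm_contrast_homogeneity(checker))
```

A perfectly uniform region gives zero contrast and homogeneity of 1, while rapidly alternating intensities push contrast up and homogeneity down.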

🙏 Acknowledgments


📧 Contact

Project Lead: Benedictus Erwin Widianto
📧 benedictuserwinwidianto@mail.ugm.ac.id
🔗 GitHub Issues

About

Final Project Digital Image Processing ~ Segmentation on Lung Cancer
