Methodology

Uploaded by

rani160das
Copyright
© All Rights Reserved
Methodology:

The methodology adopted in this project follows a structured and systematic framework to
achieve accurate mineral resource detection. The workflow combines geospatial data acquisition,
preprocessing, dimensionality reduction through principal component analysis, supervised
machine learning classification, ensemble modeling, and validation. Each stage of the process is
designed to ensure efficiency, scalability, and reliability in identifying mineral-rich zones. The
detailed steps are outlined below:

1. Data Collection
For this study, geospatial image data in JPG format were used as the primary input for analysis.
These images were sourced from publicly available satellite repositories and geological
references, providing the necessary spectral and spatial information for mineral detection. The
datasets included:
• Satellite imagery (ASTER, Sentinel-2, and Landsat-8, in JPG format), which served as the
core source of spectral reflectance data.
• Topographic representations captured through image layers to reflect terrain variations.
• Soil and vegetation-related indices represented in image form, which helped in
minimizing misclassification caused by non-geological features.
• Reference geological maps and historical mining records (in image format), which
were utilized during the validation stage.
The use of these diverse but image-based inputs ensured that the system remained both
comprehensive and scalable while maintaining a consistent data format for processing.

2. Data Preprocessing
To ensure that the geospatial images were suitable for further analysis, preprocessing was carried
out on the input JPG datasets. The steps included:
• Noise removal and normalization, which enhanced image clarity and maintained
uniformity across all inputs.
• Spatial alignment, ensuring that all images shared a consistent reference frame for
accurate comparison and integration.
• Spectral subsetting, where selected bands were emphasized to highlight diagnostic
features relevant for mineral identification.
These preprocessing measures reduced redundancy and improved the overall quality of the
image data, making them ready for dimensionality reduction and classification.
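
The noise removal and normalization steps could be sketched as below. This is a minimal illustration, assuming NumPy is available, a 3x3 mean filter for noise removal, and min-max scaling for normalization; the source does not name the specific filter or scaling used, and the sample band values are hypothetical.

```python
import numpy as np

def mean_filter_3x3(band: np.ndarray) -> np.ndarray:
    """Simple noise removal: replace each pixel by the mean of its 3x3 neighbourhood."""
    padded = np.pad(band.astype(np.float64), 1, mode="edge")
    rows, cols = band.shape
    out = np.zeros((rows, cols), dtype=np.float64)
    for dr in range(3):
        for dc in range(3):
            out += padded[dr:dr + rows, dc:dc + cols]
    return out / 9.0

def min_max_normalize(band: np.ndarray) -> np.ndarray:
    """Rescale one image band to the range [0, 1]."""
    band = band.astype(np.float64)
    lo, hi = band.min(), band.max()
    if hi == lo:  # flat band: avoid division by zero
        return np.zeros_like(band)
    return (band - lo) / (hi - lo)

# Hypothetical 4x4 band of 8-bit pixel values, as decoded from a JPG.
band = np.array([[10, 12, 11, 200],
                 [11, 10, 13, 12],
                 [12, 11, 10, 11],
                 [10, 13, 12, 10]], dtype=np.uint8)

smoothed = mean_filter_3x3(band)
normalized = min_max_normalize(smoothed)
```

Applying the same scaling to every band keeps the inputs on a common range, so that no single band dominates the later PCA step.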
3. Dimensionality Reduction using PCA
To reduce redundancy and simplify the complexity of the spectral image data, Principal
Component Analysis (PCA) was applied directly to the original JPG geospatial images. The
process began with the original input data (Image 1), which contained multiple correlated
spectral bands. PCA transformed this dataset into new component images that concentrated the
most significant spectral variations while discarding repetitive information.
From this transformation, the first two principal components were extracted as representative
outputs:
• PC1 (Image 2): Highlighted dominant geological patterns and large-scale spectral
variations.
• PC2 (Image 3): Emphasized localized anomalies and subtle spectral differences that are
often linked to mineralization zones.
These PCA outputs not only minimized data size but also enhanced computational efficiency
while preserving key mineralogical information required for further classification and mapping.
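
The PCA transformation described above can be illustrated with a short sketch. It assumes NumPy, treats each pixel as one observation with one value per band, and uses a synthetic three-band image as a stand-in for a real scene; the function name `pca_components` is hypothetical and not taken from the project.

```python
import numpy as np

def pca_components(image: np.ndarray, n_components: int = 2):
    """Return component images and explained-variance ratios for an (H, W, B) band stack."""
    h, w, b = image.shape
    pixels = image.reshape(-1, b).astype(np.float64)   # one row per pixel
    centered = pixels - pixels.mean(axis=0)            # remove the per-band mean
    cov = np.cov(centered, rowvar=False)               # B x B band covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)             # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]                  # reorder by descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centered @ eigvecs[:, :n_components]      # project pixels onto top components
    ratio = eigvals[:n_components] / eigvals.sum()
    return scores.reshape(h, w, n_components), ratio

rng = np.random.default_rng(0)
base = rng.random((32, 32))
# Synthetic 3-band image with strongly correlated bands.
image = np.stack([base,
                  0.9 * base + 0.05 * rng.random((32, 32)),
                  0.8 * base + 0.05 * rng.random((32, 32))], axis=-1)

components, ratio = pca_components(image, n_components=2)
pc1, pc2 = components[..., 0], components[..., 1]
```

Because the synthetic bands are almost copies of one another, nearly all of the variance ends up in `pc1`, which mirrors how PCA concentrates the shared signal of correlated spectral bands into the first component.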

Image 1: Original geospatial input image (JPG format)


Image 2: Principal Component 1 (PC1) highlighting dominant geological patterns
Image 3: Principal Component 2 (PC2) emphasizing localized anomalies
4. Machine Learning Model Training
After preprocessing and dimensionality reduction, three supervised machine learning models
were trained using the transformed datasets. These models were selected for their ability to
handle geospatial image classification effectively:
• Random Forest (RF)
• Support Vector Machine (SVM)
• Decision Tree (DT)
Each model was applied to classify the study area into high, medium, and low mineral potential
zones. Hyperparameters were adjusted to maintain a balance between accuracy and
computational efficiency.
Algorithm 1: Random Forest (RF)
Mathematical Formulation:
A Random Forest classifier constructs an ensemble of k decision trees T_1, T_2, …, T_k.
Each tree is trained on a bootstrap sample of the dataset, with random feature selection at each
split.
The final prediction is obtained by majority voting:

\[ \hat{y} = \operatorname{mode}\{T_1(x), T_2(x), \dots, T_k(x)\} \]

where ŷ is the predicted class for input x and T_j(x) is the class predicted by the j-th tree.
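
The majority-voting rule can be illustrated in a few lines of Python. The sketch below assumes each tree has already produced a class label; the list of votes is hypothetical.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Return the class label predicted by the largest number of trees."""
    counts = Counter(tree_predictions)
    return counts.most_common(1)[0][0]

# Hypothetical votes from five trees for a single pixel.
votes = ["high", "medium", "high", "low", "high"]
prediction = majority_vote(votes)
```

When two classes tie, `most_common` returns the one that was encountered first, so a production implementation may want an explicit tie-breaking rule.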


Algorithm 2: Support Vector Machine (SVM)
Mathematical Formulation:
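SVM identifies the hyperplane that maximizes the margin between classes.
The optimization problem is:

\[ \min_{w,\,b} \; \frac{1}{2}\lVert w \rVert^2 \]

subject to

\[ y_i\,(w \cdot x_i + b) \ge 1, \quad i = 1, \dots, n \]

where w is the weight vector, b is the bias, x_i are the n training samples, and y_i ∈ {−1, +1} are the class labels.
The decision rule for a new input x is:

\[ \hat{y} = \operatorname{sign}(w \cdot x + b) \]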
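
As a small illustration of the decision rule and the margin constraint, the sketch below evaluates both for a hand-picked hyperplane. The values of `w`, `b`, and the samples are hypothetical, and the rule is binary; how the three mineral-potential classes were handled (for example, one-vs-rest) is not stated in the source.

```python
def decision_score(w, x, b):
    """Compute w . x + b for one sample."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def svm_predict(w, x, b):
    """Binary decision rule: +1 on or above the hyperplane, -1 below it."""
    return 1 if decision_score(w, x, b) >= 0 else -1

def satisfies_margin(w, b, samples):
    """Check the constraint y_i (w . x_i + b) >= 1 for every (x_i, y_i) pair."""
    return all(y * decision_score(w, x, b) >= 1 for x, y in samples)

# Hypothetical hyperplane and two labelled training samples.
w, b = [1.0, -1.0], 0.0
samples = [([2.0, 0.0], 1), ([0.0, 2.0], -1)]

label = svm_predict(w, [3.0, 1.0], b)
margin_ok = satisfies_margin(w, b, samples)
```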
Algorithm 3: Decision Tree (DT)
Mathematical Formulation:
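At each node, the best feature is selected using an impurity measure such as Entropy or Gini
Index.
• Entropy:

\[ H(S) = -\sum_{i=1}^{c} p_i \log_2 p_i \]

• Information Gain:

\[ IG(S, A) = H(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\, H(S_v) \]

where S is the dataset, A is an attribute, S_v is the subset of samples for which A takes value v,
p_i is the proportion of samples in S belonging to class i, and c is the number of classes.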
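
Entropy and information gain can be computed directly from their definitions. The sketch below assumes categorical attribute values and uses a tiny hypothetical dataset in which the attribute `pc1` separates the classes perfectly.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy of the full set minus the weighted entropy of each attribute-value subset."""
    n = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    remainder = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

# Hypothetical samples: one categorical attribute and a class label each.
rows = [{"pc1": "bright"}, {"pc1": "bright"}, {"pc1": "dark"}, {"pc1": "dark"}]
labels = ["high", "high", "low", "low"]

h = entropy(labels)
gain = information_gain(rows, labels, "pc1")
```

Here the two classes are equally frequent, so the entropy is 1 bit, and splitting on `pc1` leaves two pure subsets, so the gain equals the full entropy.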

5. Ensemble Model Development


To enhance the reliability of classification, a weighted ensemble model was constructed by
integrating the outputs of Random Forest, Support Vector Machine, and Decision Tree
classifiers. Each model generated probability scores for the identified mineral classes, which
were then combined using a weighted sum approach. Greater weight was assigned to Random
Forest because of its superior individual accuracy, followed by Support Vector Machine and
Decision Tree. The final classification was determined by selecting the class with the maximum
weighted score. This ensemble strategy effectively leveraged the strengths of all three algorithms
while reducing the impact of their individual limitations, leading to a more robust and accurate
prediction.
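
The weighted-sum combination can be sketched as follows. The weights and probability vectors are hypothetical placeholders that only respect the ordering described above (Random Forest, then SVM, then Decision Tree); the actual values used in the project are not given in the source.

```python
CLASSES = ("high", "medium", "low")

def weighted_ensemble(probabilities, weights):
    """Combine per-model class probabilities with a weighted sum and pick the top class."""
    combined = [0.0] * len(CLASSES)
    for model, probs in probabilities.items():
        for i, p in enumerate(probs):
            combined[i] += weights[model] * p
    best = max(range(len(CLASSES)), key=lambda i: combined[i])
    return CLASSES[best], combined

# Hypothetical weights (RF > SVM > DT) that sum to 1.
weights = {"RF": 0.5, "SVM": 0.3, "DT": 0.2}

# Hypothetical class probabilities for one pixel, ordered as in CLASSES.
probabilities = {
    "RF": [0.6, 0.3, 0.1],
    "SVM": [0.3, 0.5, 0.2],
    "DT": [0.2, 0.6, 0.2],
}

label, scores = weighted_ensemble(probabilities, weights)
```

In this example SVM and DT both favour "medium", but the larger weight on Random Forest tips the combined score to "high" (0.43 against 0.42), which shows how strongly the choice of weights can influence the final map.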
6. Mineral Mapping and Visualization
Mineral classification maps were generated from both rock sample images and satellite data to
support the interpretation of mineral distribution. These maps were compared with the
corresponding RGB images to highlight key features that are not easily visible in standard
photography. The classified maps revealed the spatial distribution of different mineral phases, the
presence of fractures and alteration zones, and fine-scale variations across the samples. This
stage demonstrated the robustness of the methodology by confirming that the approach was
effective in capturing mineralogical details in both field-based samples and remote sensing
datasets.
7. Validation and Verification
The final stage of the workflow focused on validating the outputs of the classification models to
ensure that the predictions were both accurate and reliable. For this purpose, multiple strategies
were employed. Historical mining data from the region were used to verify whether the predicted
high-potential mineral zones aligned with previously documented mineral deposits. Geological
survey records served as additional reference maps, allowing a spatial comparison between the
classified outputs and established geological information.
To quantitatively assess model performance, confusion matrix analysis was performed for each
classifier, providing accuracy, precision, recall, and F1-score values. This analysis highlighted
the strengths and weaknesses of each model in distinguishing between high, medium, and low
mineral potential zones. Among the tested approaches, the ensemble classifier consistently
demonstrated superior performance, showing the highest overlap with historical mineral
occurrence sites and reducing misclassifications when compared to individual models. This
confirmed that integrating multiple models through a weighted ensemble strategy offered a more
robust and dependable solution for mineral resource detection.
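
The metrics derived from a confusion matrix can be computed as in the sketch below. The matrix is hypothetical, and macro-averaging over the three classes is an assumption, since the source does not state how per-class values were combined.

```python
def per_class_metrics(matrix, k):
    """Precision, recall and F1 for class k; matrix[i][j] counts true class i predicted as j."""
    tp = matrix[k][k]
    predicted_k = sum(row[k] for row in matrix)   # column sum: everything predicted as k
    actual_k = sum(matrix[k])                     # row sum: everything truly in class k
    precision = tp / predicted_k if predicted_k else 0.0
    recall = tp / actual_k if actual_k else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def summarize(matrix):
    """Overall accuracy plus macro-averaged precision, recall and F1."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    accuracy = sum(matrix[i][i] for i in range(n)) / total
    per_class = [per_class_metrics(matrix, k) for k in range(n)]
    macro = tuple(sum(m[j] for m in per_class) / n for j in range(3))
    return accuracy, macro

# Hypothetical confusion matrix; rows and columns ordered high, medium, low.
confusion = [[8, 1, 1],
             [2, 7, 1],
             [0, 1, 9]]

accuracy, (precision, recall, f1) = summarize(confusion)
```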

8. Workflow Representation
Results:
Dataset Information
The dataset used in this project consisted of geospatial satellite images in JPG format,
obtained from publicly available sources such as ASTER, Sentinel-2, and Landsat-8. These
images contained spectral and spatial details relevant for mineral mapping. Additional reference
images, including geological maps and historical mining records, were used for validation. Only
image-based datasets were employed; no external tabular or numerical datasets were included.
Individual Model Results
After preprocessing and dimensionality reduction using PCA, three supervised machine learning
models were applied to classify the study area into high, medium, and low mineral potential
zones.
• Random Forest (RF): Provided the best generalization performance due to its ensemble
of multiple decision trees.
• Support Vector Machine (SVM): Demonstrated strong separation between classes in
high-dimensional feature space.
• Decision Tree (DT): Offered interpretability but showed lower accuracy compared to RF
and SVM.
The quantitative performance of these models is summarized below.

Model            Accuracy   Precision   Recall   F1-Score
Random Forest    92%        90%         91%      90.5%
SVM              88%        86%         85%      85.5%
Decision Tree    84%        82%         80%      81%

Table 1: Performance metrics of individual machine learning models


Ensemble Model Development
To improve overall classification reliability, an ensemble classifier was developed that integrates
the outputs of RF, SVM, and DT. The ensemble used a weighted sum rule, where weights were
assigned based on individual model performance:

\[ P_{\mathrm{ens}}(x) = W_{RF}\,P_{RF}(x) + W_{SVM}\,P_{SVM}(x) + W_{DT}\,P_{DT}(x) \]

where:
• P_RF(x), P_SVM(x), and P_DT(x) are the probability outputs of each model,
• W_RF, W_SVM, and W_DT are their respective weights, with W_RF > W_SVM > W_DT.
The class with the maximum weighted probability was selected as the final prediction.

Ensemble Model Results


The ensemble approach outperformed the individual models by leveraging their strengths and
minimizing weaknesses.

Model            Accuracy   Precision   Recall   F1-Score
Random Forest    92%        90%         91%      90.5%
SVM              88%        86%         85%      85.5%
Decision Tree    84%        82%         80%      81%
Ensemble         94%        92%         93%      92.5%

Table 2: Comparison between individual models and the ensemble classifier
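
The reported F1-scores can be cross-checked against the reported precision and recall. The sketch below assumes that each F1 value is the harmonic mean of the precision and recall in the same row, which the source does not state explicitly; the numbers are copied from the table.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision %, recall %, reported F1 %) as listed in the comparison table.
reported = {
    "Random Forest": (90.0, 91.0, 90.5),
    "SVM": (86.0, 85.0, 85.5),
    "Decision Tree": (82.0, 80.0, 81.0),
    "Ensemble": (92.0, 93.0, 92.5),
}

# Difference between the recomputed and the reported F1 for each model.
differences = {
    model: abs(f1_score(p, r) - f1)
    for model, (p, r, f1) in reported.items()
}
```

Under that assumption, every recomputed value agrees with the reported one to within rounding, so the four rows are internally consistent.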


The ensemble model produced a final mineral potential classification map by integrating the
predictions of Random Forest, Support Vector Machine, and Decision Tree through a weighted
sum approach. The output highlights high, medium, and low mineral potential zones with
distinct color coding, making it easier to interpret the distribution of mineral-rich areas.
• High mineral potential zones are marked in red, indicating regions with the strongest
likelihood of mineral occurrence.
• Medium mineral potential zones are shown in yellow, representing areas with moderate
probability.
• Low mineral potential zones are displayed in green, reflecting minimal mineralization
likelihood.
This classification output (Image 4) demonstrates how the ensemble model effectively
consolidates the strengths of the individual classifiers to generate a more reliable and accurate
mineral prediction.
Image 4: Final ensemble model output showing high (pink), medium (yellow), and low (orange)
mineral potential zones.

Spectral Analysis for Mineral Identification


To validate the spectral separability of minerals before applying dimensionality reduction and
classification, reflectance spectra of key minerals were analyzed. Figure 5 shows the reflectance
curves of Hematite (red), Calcite (green), and Kaolinite (blue) across the visible to shortwave
infrared (SWIR) region (0.4–2.5 µm).
The following observations were made:
• Hematite (red curve): Exhibits strong absorption in the visible–near infrared region due
to iron oxide content, with distinct reflectance features useful for iron-bearing mineral
detection.
• Calcite (green curve): Shows relatively high and stable reflectance, with absorption
features in the infrared region linked to carbonate groups.
• Kaolinite (blue curve): Demonstrates pronounced absorption features near 2.2 µm,
characteristic of hydroxyl (OH) groups in clay minerals.
The red box in Figure 5 highlights the SWIR range (1.0–2.5 µm), which is especially
diagnostic for mineral mapping. This region was used during PCA to emphasize spectral
variations relevant to mineral discrimination.
Figure 5: Reflectance spectra of Hematite (red), Calcite (green), and Kaolinite (blue), with the
SWIR region highlighted in a red box.
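
Restricting the analysis to the SWIR window amounts to selecting the bands whose wavelengths fall inside 1.0–2.5 µm. The sketch below illustrates this with approximate nominal central wavelengths for a few Sentinel-2 bands; which bands the project actually used is not stated in the source, so the selection here is only an assumption for illustration.

```python
SWIR_RANGE_UM = (1.0, 2.5)

# Approximate nominal central wavelengths (in µm) of selected Sentinel-2 bands.
SENTINEL2_BANDS_UM = {
    "B2": 0.490,   # blue
    "B3": 0.560,   # green
    "B4": 0.665,   # red
    "B8": 0.842,   # near infrared
    "B11": 1.610,  # shortwave infrared
    "B12": 2.190,  # shortwave infrared
}

def bands_in_range(bands, lo, hi):
    """Return the names of bands whose central wavelength lies within [lo, hi]."""
    return [name for name, wavelength in bands.items() if lo <= wavelength <= hi]

swir_bands = bands_in_range(SENTINEL2_BANDS_UM, *SWIR_RANGE_UM)
```

Of the bands listed, only B11 and B12 fall in the window, and B12 sits close to the 2.2 µm hydroxyl feature noted for Kaolinite.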
