A Project Report on
MULTI-CLASS ADAPTIVE ACTIVE LEARNING FOR PREDICTING STUDENT ANXIETY LEVELS
Submitted in partial fulfillment of the requirements for the award of the degree of Bachelor of
Engineering in Computer Science & Engineering during the year 2024-25
By
Amrutha E 4MH21CS003
Harshitha M R 4MH21CS034
Jeevitha S R 4MH21CS038
Madhumitha R 4MH21CS046
2024-25
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
MAHARAJA INSTITUTE OF TECHNOLOGY MYSORE
BELAWADI, NAGUVANAHALLY POST, S.R. PATNA TALUK, MANDYA DIST-571477.
~ ~ ~ ~ ~ ~ ~ ~ ACKNOWLEDGEMENT ~ ~ ~ ~ ~ ~ ~ ~
We, the undersigned, would like to express our sincere gratitude to everyone who
supported and guided us throughout the successful completion of our project.
First and foremost, we extend our heartfelt thanks to our project guide, Prof. Prasanna
G, for their valuable guidance, constant encouragement, and continuous support, which played
a crucial role in shaping the direction and outcome of this project.
Our sincere thanks also go to all the faculty members and staff of the department for their
cooperation, insightful feedback, and technical support throughout the duration of this work.
Last but not least, we would like to express our deepest gratitude to our families and
friends for their unwavering support, patience, and encouragement, which motivated us to work
with commitment and confidence.
Amrutha E 4MH21CS003
Harshitha M R 4MH21CS034
Jeevitha S R 4MH21CS038
Madhumitha R 4MH21CS046
~ ~ ~ ~ ~ ~ ~ ~ ABSTRACT ~ ~ ~ ~ ~ ~ ~ ~
1. INTRODUCTION
1.1 Overview
1.2 Problem Statement
1.3 Solution
1.4 Existing System
1.5 Proposed System
CONCLUSION
REFERENCES
~ ~ ~ ~ ~ ~ ~ ~ List of Figures ~ ~ ~ ~ ~ ~ ~ ~
CHAPTER – 1
INTRODUCTION
1.1 Overview
The increasing awareness of mental health issues among students has highlighted the need for
effective and timely interventions. Traditional anxiety prediction models face limitations due to the
scarcity of labeled data and the static nature of their learning processes. This inadequacy hinders the
ability to identify students who are at risk and offer them appropriate support. Addressing this gap is
crucial for enhancing educational environments and ensuring that students receive the help they need
before anxiety issues escalate. By leveraging advanced techniques in machine learning, particularly
adaptive active learning, we can significantly improve the prediction accuracy and robustness of
anxiety models. This research is motivated by the need for prediction models that remain accurate
when labeled data is scarce and student behavior evolves over time.
1.3 Solution
This project proposes a multi-class adaptive active learning framework for predicting student
anxiety levels with greater accuracy and efficiency. By dynamically selecting the most informative
and uncertain data points for labeling, the model continuously improves its performance with minimal
human annotation. This adaptive learning cycle allows the system to better differentiate between
varying anxiety levels and enables early detection, ensuring timely intervention and support for
students at risk.
1.4 Existing System
Existing approaches rely on static datasets and conventional machine learning models. A major
drawback of these systems is their inability to handle real-time data or adapt to evolving student
behavior. They fail to prioritize the most relevant or uncertain data, resulting in inefficient
learning and a lack of precision when distinguishing between mild, moderate, and severe anxiety
cases. This limits their effectiveness in early intervention and student support.
1.5 Proposed System
Compared to the existing static models, this approach provides several advantages: it improves
prediction accuracy even with fewer labeled samples, captures dynamic behavioral changes among
students, and ensures better identification of varying anxiety levels. This makes the model highly
effective for early detection and timely intervention in student mental health care.
1. Adaptive Active Learning
• Instead of selecting training data randomly, the model uses active learning to intelligently
query the most informative and uncertain data samples from a pool of unlabeled instances.
• This approach significantly reduces the labeling burden for experts and ensures that the training
process focuses on the most impactful data points.
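The query step described above can be sketched with entropy-based uncertainty sampling. This is a minimal illustration on synthetic data; the feature set, pool sizes, and the logistic-regression scorer are placeholder assumptions, not the project's actual configuration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in: a small labeled seed set plus a pool of unlabeled samples.
X, y = make_classification(n_samples=500, n_features=8, n_informative=4,
                           n_classes=4, random_state=0)
X_seed, y_seed, X_pool = X[:50], y[:50], X[50:]

model = LogisticRegression(max_iter=1000).fit(X_seed, y_seed)

# Entropy of the predicted class distribution: high entropy = high uncertainty.
probs = model.predict_proba(X_pool)
entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)

# Query the 10 pool points the model is least sure about for expert labeling.
query_idx = np.argsort(entropy)[-10:]
```

Only these queried samples, rather than a random batch, would then be sent to an annotator, which is what reduces the labeling burden.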
2. Multi-Class Classification
• The system is designed to classify anxiety into multiple levels (e.g., none, mild, moderate, and
severe), providing a nuanced understanding of students' mental states.
• Multi-class models allow for more precise interventions based on the severity level.
3. Adaptivity
• It adapts to changing student behaviors, academic stressors, and environmental conditions over
time.
• This adaptability ensures that the system remains accurate and relevant in dynamic, real-world
educational settings.
CHAPTER – 2
LITERATURE SURVEY
2.1 Literature Review
a) Topic Overview
The central theme explored in this literature review is the application of active learning and adaptive
learning techniques, particularly in multi-class classification tasks related to student performance
and mental health prediction. The reviewed literature spans journal articles, conference papers, and
comparative studies, reflecting diverse approaches and technological advancements in improving
prediction accuracy while minimizing data labeling efforts.
b) Key Challenges
The reviewed works repeatedly identify the following challenges:
• The high cost of labeling data for supervised learning models, especially in domains like
mental health.
• The limited adaptability of static learning models, which fail to adjust to evolving data
patterns.
• Difficulty in early identification of students at risk of mental health issues or poor academic
performance.
• Inefficiencies in traditional sampling methods, which may not prioritize informative data
points.
Existing Solutions
Traditional models often rely on static datasets and conventional machine learning algorithms such as
decision trees, SVMs, and logistic regression. These systems:
• Use uniform sampling or random data selection, leading to sub-optimal training efficiency.
1. Active Learning for Multi-Class Classification (Zhou & Zhang, 2015): Proposes intelligent
sample selection strategies like uncertainty sampling and query-by-committee to reduce
labeling cost and improve accuracy in multi-class settings.
2. Adaptive Learning for Educational Data (Kumar & Gupta, 2018): Introduces models that
dynamically adjust based on incoming data. These systems are better suited for long-term
monitoring of student performance and mental health.
3. Hybrid Models using Active Learning (Singh & Choi, 2020): Combines multi-class
classification with active learning to fine-tune mental health prediction. This iterative model
improves differentiation across mental health categories.
4. Adaptive Active Learning for Early Intervention (Miller & Smith, 2019): Focuses on real-
time identification of at-risk students. It applies adaptive learning to prioritize data relevant to
anxiety and performance.
5. Dynamic Sampling Strategies (Wang & Liu, 2021): Compares multiple dynamic sampling
techniques within active learning frameworks. It shows that intelligently chosen data samples
significantly improve prediction outcomes.
Hybrid (Active + Adaptive): combines the strengths of both models and is ideal for critical
prediction tasks, but incurs higher computational cost and needs well-tuned models.
• Dynamic sampling strategies allow better model training by focusing only on the most
informative and uncertain data points.
• Adaptive systems offer promising avenues for real-time support in education and healthcare,
particularly for early detection and intervention.
The reviewed systems collectively demonstrate that combining active learning and adaptive
techniques is a powerful approach to solving problems involving high-dimensional, dynamic data.
Especially in sensitive domains like student mental health, these systems show great potential in
minimizing labeling effort, enhancing model precision, and supporting early, data-driven decisions.
However, challenges remain in balancing model complexity, interpretability, and real-world
scalability.
The overall impression of the reviewed systems suggests a promising evolution in the field of
intelligent data-driven prediction, particularly in sensitive and dynamic domains like student mental
health. By integrating active learning with adaptive mechanisms, these systems effectively address
one of the most pressing challenges in machine learning—acquiring high-quality labeled data in a
resource-efficient manner. Active learning ensures that only the most informative data points are
selected for labeling, significantly reducing human effort while maintaining or even enhancing the
model’s performance. Meanwhile, adaptivity allows the models to evolve in real-time, responding to
shifts in student behavior, academic pressure, and other contextual variables. This dynamic response
capability is particularly valuable in mental health monitoring, where student conditions can change
rapidly and unpredictably.
Moreover, these systems promote early detection and timely intervention, which are critical for mental
health support. Through continuous learning and re-training, they can capture complex, high-
dimensional patterns that static models often miss. However, despite these strengths, practical
implementation still faces several hurdles. One key concern is model complexity—highly adaptive
systems often rely on sophisticated algorithms that can be difficult to interpret, posing a challenge for
stakeholders like counselors or educators who may not have technical expertise. Additionally,
ensuring scalability without compromising accuracy and ethical standards remains a significant
concern. Privacy, data security, and system transparency must also be addressed for such models to
gain widespread acceptance and integration into educational infrastructures. Overall, while the
trajectory is promising, future developments must focus on balancing accuracy, interpretability, and
practicality to realize the full potential of adaptive, active learning systems in real-world settings.
CHAPTER – 3
Functional Requirements
1. Adaptive active learning
• The system must be able to iteratively select the most informative data points for labeling.
• The framework should support the integration of new labeled data to continuously improve
the model.
2. Multi-class classification
• The model must classify anxiety levels into multiple classes (e.g., low, medium, high).
• It should provide predictions that distinguish between different levels of student anxiety.
3. Prediction accuracy
• The system must enhance the accuracy of anxiety predictions compared to baseline
models.
• It should adapt and update its learning process based on newly labeled data to refine
predictions.
4. Real-time prediction
• The system should be capable of real-time anxiety prediction to enable timely intervention.
5. Data integration
• The framework must integrate with educational data sources to obtain relevant student
information for prediction.
6. User interface
Non-Functional Requirements
1. Scalability
• The system should handle increasing amounts of data and users without performance
degradation.
2. Performance
• The framework must provide predictions with minimal latency to support real-time
applications.
• The system should be efficient in processing and selecting data points for labeling.
4. Security
• Protect sensitive student data and ensure compliance with relevant data protection regulations
(e.g., GDPR, FERPA).
• Implement secure data storage and transmission protocols.
5. Usability
• The user interface should be intuitive and easy to use for educators and administrators.
• Provide clear and actionable insights based on the predictions.
6. Adaptability
• The system should adapt to changes in the educational environment or student population.
• It should allow for updates and enhancements to the model and framework based on user
feedback and new research.
7. Maintainability
Hardware Requirements
Processor - Intel Core i3 or above
Hard Disk - 160 GB
Monitor - SVGA
RAM - 8GB
The successful implementation of the anxiety prediction system requires a balanced configuration
of both hardware and software components. On the hardware side, a machine equipped with an Intel
Core i3 processor or above is recommended to ensure reliable performance during code execution,
model training, and server deployment tasks. The system should have at least 160GB of storage space
to accommodate the operating system, project files, dependencies, and datasets. An SVGA or higher-
resolution monitor is necessary for clear display of visual outputs, while a standard Windows-
compatible keyboard and a two- or three-button mouse will support effective user interaction during
development and testing phases. Additionally, a minimum of 8GB RAM is essential for smooth
multitasking, especially when handling large datasets or executing memory-intensive Python
operations, ensuring that tools like Flask, Scikit-learn, and Pandas function without performance
bottlenecks.
On the software front, the system is built to run on Windows operating systems such as Windows
7, 8, or 10, providing compatibility with a wide range of development tools. The user interface of the
web application is designed using HTML, CSS, Bootstrap, and JavaScript, ensuring a responsive and
user-friendly experience. Python serves as the core programming language due to its robust support
for machine learning and data analysis. Key Python libraries utilized include Flask for web application
development, Pandas and NumPy for data handling, Scikit-learn for implementing machine learning
algorithms, and mysql.connector for database interaction. The Integrated Development Environment
(IDE) used is PyCharm, which offers powerful code editing, debugging, and project management
features. For backend support, XAMPP is used to simulate a local server environment, making it easy
to test and manage the MySQL database. The system is developed using Python 3.6 or higher, ensuring
compatibility with the latest library versions and features. This software configuration creates a stable,
scalable, and efficient development environment suitable for building and deploying an intelligent,
real-time student anxiety prediction system.
CHAPTER – 4
4.1 Existing System
The existing system for predicting student anxiety utilizes a variety of machine learning
algorithms, each offering unique strengths in handling the complexities of mental health
assessment within educational environments:
K-Nearest Neighbors (KNN): In the current system, the KNN algorithm is implemented to
classify levels of anxiety based on similarity measures. It operates on the principle that students
with similar behavioral and emotional features likely experience similar levels of anxiety. This
method is particularly useful for capturing local patterns in the data, but its performance can
degrade with high-dimensional data typically found in student behavioral assessments.
XGBoost (XGB): XGBoost, an implementation of gradient boosted decision trees designed for
speed and performance, is particularly adept at handling varied data types and complex structures
that are common in educational data. It boosts the model's performance by focusing on correcting
the predecessor's errors, thus being highly efficient in capturing non-linear interactions and
relationships among features.
Naive Bayes (NB): The Naive Bayes classifier is used for its ability to handle large datasets
efficiently. Assuming independence between predictors, Naive Bayes calculates the probability of
certain anxiety levels given a set of observed features. It's particularly favored for its simplicity
and speed in making predictions, although its assumption of feature independence may not always
hold in complex educational settings.
Random Forest (RF): Random Forest aggregates multiple decision trees to improve the
classification accuracy and control over-fitting, making it robust against noise and capable of
handling imbalanced data. In the existing system, RF is used to evaluate its effectiveness across a
spectrum of anxiety classifications by leveraging its ensemble learning technique, which enhances
generalizability and accuracy in diverse educational datasets.
Each of these algorithms contributes
differently to the overall prediction system, leveraging their respective strengths to enhance the
accuracy and reliability of anxiety predictions in students. This multiplicity of approaches helps to
ensure robustness and adaptability in the face of varied data characteristics and evolving
educational environments.
4.1.2 Disadvantages
1. Complexity in Implementation: Adaptive active learning systems are inherently complex due to
the need for continuous model retraining and data reevaluation. This complexity could pose challenges
in practical educational settings where resources and technical expertise might be limited.
2. Dependence on Initial Data: The success of the adaptive active learning model hinges on the
initial set of labeled data. If this initial data is not representative or sufficiently diverse, the model may
develop biases or fail to generalize well across different student populations.
3. Computational Cost: Continuously selecting informative data points and retraining the model can
be computationally expensive and time-consuming, which might not be feasible in real-time
applications without substantial computational resources.
4. Privacy Concerns: The collection and analysis of sensitive student data such as mental health
indicators require stringent data privacy measures. Ensuring privacy and securing data can be
challenging and increase the operational complexity and costs.
5. Adaptivity to Diverse Educational Settings: While the framework is scalable, its adaptability to
diverse educational environments with varying levels of technology integration and pedagogical
approaches can be limited.
4.1.3 Advantages
1. Enhanced Accuracy: Utilizing decision trees as base learners allows the system to capture intricate
patterns and decision rules from a range of educational data. This, combined with the refinement of
predictions through a stacking classifier, leads to improved accuracy in identifying different levels of
anxiety.
2. Dynamic Learning: The adaptive active learning component of the system dynamically selects
the most informative unlabeled data points for labeling. This process ensures the model continuously
learns and adapts to new data, enhancing its relevance and accuracy over time.
4.2 Proposed System
• Data Sources: Collect data from various sources such as student surveys, academic performance,
attendance records, and psychological assessments.
• Preprocessing: Clean and preprocess the data to handle missing values, normalize features, and encode
categorical variables. Perform feature extraction and selection to identify relevant attributes for anxiety
prediction.
• Model Selection: Start with baseline models for multi-class classification, such as Logistic Regression,
Decision Trees, or Support Vector Machines (SVMs).
• Training: Train the initial models on the preprocessed data using a standard dataset split (e.g., 70%
training, 30% validation).
• Data Pooling: Maintain a pool of unlabeled data that can be used for active learning.
• Uncertainty Sampling: Implement techniques such as least-confidence or entropy-based
uncertainty sampling, or query-by-committee, to identify data points where the model is least
confident.
• Query Selection: Use an adaptive algorithm to iteratively select the most informative samples from
the pool for labeling.
• Labeling Process: Incorporate a feedback mechanism where selected samples are labeled by experts
(e.g., psychologists, educators).
• Model Update: Retrain the model with the newly labeled data to improve its accuracy and robustness.
• Evaluation: Continuously evaluate model performance using metrics such as accuracy, precision,
recall, and F1-score for multi-class classification.
• Multi-Class Classification: Implement the multi-class classification model to predict various levels
of anxiety (e.g., low, moderate, high).
• Output Interpretation: Provide a detailed report on predicted anxiety levels and their corresponding
probabilities.
• Monitoring: Implement real-time monitoring to track model performance and user interactions.
• Feedback Loop: Establish a feedback loop to refine and improve the model based on user input and
new data.
• Scalability: Ensure the system can handle increasing volumes of data and users.
• Maintenance: Regularly update the model and framework.
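The pooling, query-selection, labeling, model-update, and evaluation steps above can be sketched as one loop. This is a minimal simulation under stated assumptions: synthetic data stands in for student records, and the pool's hidden labels play the role of the expert annotator; batch sizes and the random-forest learner are illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for student features; 4 classes model 4 anxiety levels.
X, y = make_classification(n_samples=1200, n_features=10, n_informative=6,
                           n_classes=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Small labeled seed; the rest of the training split is the unlabeled pool.
labeled = list(range(40))
pool = list(range(40, len(X_train)))

model = RandomForestClassifier(n_estimators=100, random_state=0)
for _ in range(5):                                   # 5 active-learning rounds
    model.fit(X_train[labeled], y_train[labeled])
    probs = model.predict_proba(X_train[pool])
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    picked = np.argsort(entropy)[-20:]               # 20 most uncertain samples
    labeled.extend(pool[i] for i in picked)          # "expert" provides labels
    pool = [p for i, p in enumerate(pool) if i not in set(picked)]

# Final model update and held-out evaluation.
model.fit(X_train[labeled], y_train[labeled])
acc = accuracy_score(y_test, model.predict(X_test))
```

In practice the `picked` samples would go to psychologists or educators for annotation instead of reading the pool's hidden labels, and precision, recall, and F1-score would be tracked alongside accuracy.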
The goal is for UML to become a common language for creating models of object-oriented
computer software. In its current form, UML comprises two major components: a meta-model and a
notation. In the future, some form of method or process may also be added to, or associated with,
UML.
The Unified Modelling Language is a standard language for specifying, visualizing, constructing,
and documenting the artefacts of software systems, as well as for business modelling and other
non-software systems.
The UML represents a collection of best engineering practices that have proven successful in
the modelling of large and complex systems.
The UML is a very important part of developing object-oriented software and the software
development process. The UML uses mostly graphical notations to express the design of software
projects.
• A use case diagram in the Unified Modeling Language (UML) is a type of behavioral diagram
defined by and created from a Use-case analysis.
• Its purpose is to present a graphical overview of the functionality provided by a system in terms
of actors, their goals (represented as use cases), and any dependencies between those use cases.
• The main purpose of a use case diagram is to show what system functions are performed for
which actor. Roles of the actors in the system can be depicted.
In software engineering, a class diagram in the Unified Modelling Language (UML) is a type of static
structure diagram that describes the structure of a system by showing the system's classes, their
attributes, operations (or methods), and the relationships among the classes. It explains which class
contains information.
In a collaboration diagram, the method-call sequence is indicated by a numbering technique, as
shown below: the numbers indicate the order in which the methods are called. The method calls are
similar to those of a sequence diagram, but whereas the sequence diagram does not describe the
object organization, the collaboration diagram shows the object organization.
A deployment diagram represents the deployment view of a system. It is related to the component
diagram, because components are deployed using deployment diagrams. A deployment diagram consists
of nodes, which are the physical hardware used to deploy the application.
Activity diagrams are graphical representations of workflows of stepwise activities and actions with
support for choice, iteration and concurrency. In the Unified Modelling Language, activity diagrams
can be used to describe the business and operational step-by-step workflows of components in a
system. An activity diagram shows the overall flow of control.
A component diagram, also known as a UML component diagram, describes the organization and
wiring of the physical components in a system. Component diagrams are often drawn to help model
implementation details and double-check that every aspect of the system's required function is covered
by planned development.
4.3.8 ER Diagram
An Entity–relationship model (ER model) describes the structure of a database with the help of a
diagram, which is known as Entity Relationship Diagram (ER Diagram). An ER model is a design or
blueprint of a database that can later be implemented as a database. The main components of E-R
model are: entity set and relationship set.
An ER diagram shows the relationship among entity sets. An entity set is a group of similar entities
and these entities can have attributes. In terms of DBMS, an entity is a table or attribute of a table in
database, so by showing relationship among tables and their attributes, ER diagram shows the
complete logical structure of a database. Let’s have a look at a simple ER diagram to understand this
concept.
DFD Diagram
Level 1 Diagram
Level 2 Diagram:
Fig 4.3.10: DFD Level 2 Showing Detailed System Processes and Data Flow
A Data Flow Diagram (DFD) is a traditional way to visualize the information flows within a system.
A neat and clear DFD can depict a good amount of the system requirements graphically. It can be
manual, automated, or a combination of both. It shows how information enters and leaves the system,
what changes the information and where information is stored. The purpose of a DFD is to show the
scope and boundaries of a system as a whole. It may be used as a communication tool between a
systems analyst and any person who plays a part in the system, and it acts as the starting point
for redesigning a system.
Interactive elements like text fields, dropdown menus, radio buttons, and action buttons (e.g.,
"Submit") are used to gather inputs. Upon submitting the input, the system processes the data and
displays the results (such as predicted output or classification) in a clear, formatted section of the page.
Error messages and input validation feedback are also provided to ensure correct data entry.
The design is responsive, allowing it to work seamlessly across various devices, including desktops,
tablets, and smartphones. Technologies like HTML5, CSS3, and JavaScript are used to build the
frontend, while Bootstrap may be used to enhance responsiveness and aesthetic appeal.
CHAPTER – 5
IMPLEMENTATION DETAILS
5.1 Control Flow
The proposed system employs a novel framework that integrates a multi-class adaptive active learning
strategy with decision tree and stacking classifier methodologies to enhance the prediction of student
anxiety levels. Utilizing decision trees as the base learners, the system first captures the underlying
patterns and decision rules from diverse educational data, such as student performance, behavioral
indicators, and psychometric assessments. These decision trees provide a preliminary classification of
anxiety levels, which are then fed into a stacking classifier. The stacking classifier, comprising an
ensemble of various machine learning models, refines these predictions by learning from the decision
outputs of the decision trees, thus improving the overall prediction accuracy and robustness. This
adaptive active learning component dynamically selects and queries the most informative unlabeled
data points for subsequent labeling, ensuring continuous model improvement and adaptation to new
data. The proposed system is designed to be scalable, enabling real-time analytics in educational
settings to facilitate timely interventions and support for students experiencing anxiety.
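The decision-tree-plus-stacking arrangement described above can be sketched with scikit-learn's StackingClassifier. The tree depths, the logistic-regression meta-learner, and the synthetic data are illustrative assumptions, not the project's tuned configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=10, n_informative=6,
                           n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Decision trees act as level-0 base learners; their cross-validated class
# probabilities feed a logistic-regression meta-learner (level-1) that
# refines the final anxiety-level prediction.
stack = StackingClassifier(
    estimators=[("dt1", DecisionTreeClassifier(max_depth=4, random_state=0)),
                ("dt2", DecisionTreeClassifier(max_depth=8, random_state=1))],
    final_estimator=LogisticRegression(max_iter=1000),
)
stack.fit(X_tr, y_tr)
score = stack.score(X_te, y_te)
```

Using trees of different depths gives the meta-learner complementary views of the data, which is the point of stacking.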
5.2 Methodology
The methodology adopted for this project is based on a structured machine learning pipeline aimed
at predicting student anxiety levels from input data. The steps taken to fulfill the project
objectives are described below:
• Data Collection and Preprocessing: The dataset was sourced from reliable records or public
datasets. Preprocessing included handling missing values, encoding categorical features, and
scaling numerical values to improve model performance.
• Deployment: The final model was deployed using a web-based interface (e.g., Flask or Django)
to allow users to enter their details and receive predictions in real time.
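The preprocessing step above (missing values, encoding, scaling) can be sketched as a scikit-learn pipeline. The column names here are hypothetical placeholders, not the project's real schema.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical student records with a missing value in each numeric column.
df = pd.DataFrame({
    "gpa":       [3.2, np.nan, 2.8, 3.9],
    "sleep_hrs": [6.0, 7.5, np.nan, 8.0],
    "year":      ["second", "third", "second", "final"],
})

numeric = ["gpa", "sleep_hrs"]
categorical = ["year"]

# Impute missing numeric values with the median, scale them, and one-hot
# encode the categorical column in a single reusable transformer.
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])
X = preprocess.fit_transform(df)   # 4 rows, 2 scaled + 3 one-hot columns
```

Wrapping the same transformer around both training and the deployed Flask endpoint keeps the web inputs encoded identically to the training data.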
5.3 Algorithm
5.3.1. K-Nearest Neighbors
K-Nearest Neighbors (KNN) is a straightforward yet effective classification algorithm used in various
machine learning tasks. In essence, KNN classifies data points based on the majority class among its
'k' closest neighbors in the feature space. The algorithm computes the distance between the query point
and all other points in the training dataset, typically using metrics like Euclidean or Manhattan
distance. It then identifies the 'k' nearest neighbors and assigns the most common class label among
these neighbors to the query point. KNN is particularly advantageous for its simplicity and ability to
adapt to various types of data without assuming an explicit underlying distribution. In the context of
predicting student anxiety, KNN can effectively categorize anxiety levels by leveraging labeled
examples and providing predictions based on similarity, thus enhancing the model’s ability to capture
nuanced anxiety patterns and improve overall classification accuracy. In this study, KNN achieved
an accuracy of 0.78, demonstrating its potential to differentiate between various levels of student
anxiety.
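A minimal KNN sketch on synthetic data; k = 5 and the Euclidean metric mirror the description above, while the dataset itself is an illustrative stand-in.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=400, n_features=8, n_informative=5,
                           n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Each query point receives the majority label among its 5 nearest training
# samples, measured by Euclidean distance in feature space.
knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean").fit(X_tr, y_tr)
acc = knn.score(X_te, y_te)
```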
5.3.2. Logistic Regression
Logistic Regression is a popular algorithm for binary and multi-class classification problems. In the
context of predicting student anxiety with a multi-class framework, Logistic Regression works by
modeling the probability that a given data point belongs to a particular class. It does this by applying
the logistic function (sigmoid function) to a linear combination of the input features. For multi-class
classification, the model uses a technique called "one-vs-rest" or "softmax" to handle multiple
classes. The logistic function transforms the output into a probability score between 0 and 1, which can
be interpreted as the likelihood of the input data belonging to each class. During training, the algorithm
optimizes the model parameters by minimizing a cost function, typically the cross-entropy loss, which
measures the difference between the predicted probabilities and the actual class labels. The accuracy
of 0.60 indicates that the Logistic Regression model correctly predicted the anxiety levels 60% of the
time, suggesting room for improvement in the model’s performance, potentially through further
tuning, feature engineering, or incorporating more sophisticated algorithms.
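The per-class probability output described above is visible directly through `predict_proba`. The data is synthetic and purely illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=400, n_features=8, n_informative=5,
                           n_classes=4, random_state=0)

# Multinomial (softmax) logistic regression: for each sample, predict_proba
# returns one probability per anxiety class, and each row sums to 1.
lr = LogisticRegression(max_iter=1000).fit(X, y)
probs = lr.predict_proba(X[:5])
```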
5.3.3. XGBoost Classifier
The XGB Classifier, or Extreme Gradient Boosting Classifier, is a powerful machine learning
algorithm based on gradient boosting techniques. It builds an ensemble of decision trees in a sequential
manner where each tree attempts to correct the errors made by the previous ones. This iterative
approach focuses on improving the model's performance by minimizing the loss function, which
measures the difference between predicted and actual values. XGBoost incorporates regularization
techniques to prevent overfitting and enhance the generalization of the model. In this project,
the algorithm's robustness and ability to handle complex data patterns contribute to achieving an
accuracy of 0.80 in predicting student anxiety levels. Its adaptive nature makes it well-suited for
handling diverse and dynamic educational data.
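As an illustration of the sequential boosting mechanism, the sketch below uses scikit-learn's GradientBoostingClassifier as a stand-in; in the project itself the third-party xgboost package's XGBClassifier would be used, which exposes the same fit/predict interface. The data and hyperparameters are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=8, n_informative=5,
                           n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Trees are added sequentially, each fit to the errors of the ensemble so far;
# learning_rate shrinks each tree's contribution to curb overfitting.
gbm = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                 random_state=0).fit(X_tr, y_tr)
acc = gbm.score(X_te, y_te)
```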
5.3.4. Naive Bayes
Naive Bayes is a probabilistic classifier based on Bayes' theorem, assuming that the features used for
classification are conditionally independent given the class label. In the context of predicting student
anxiety levels, the algorithm calculates the probability of a student's anxiety level based on their
features (such as academic performance, social interactions, etc.). It does so by first estimating the
prior probabilities of each anxiety class from the training data. Then, it evaluates the likelihood of each
feature belonging to each class, assuming independence between features. The posterior probability
of each class is computed using Bayes' theorem, and the class with the highest posterior probability is
predicted as the student's anxiety level. Despite its simplicity, Naive Bayes can be effective for multi-
class classification tasks, though in this study, it achieved an accuracy of 0.52, indicating room for
improvement compared to other models.
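A minimal Gaussian Naive Bayes sketch on synthetic data; after fitting, the learned class priors described above are available on the estimator.

```python
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=400, n_features=8, n_informative=5,
                           n_classes=4, random_state=0)

# Fitting estimates a prior P(class) for each anxiety level and a per-feature
# Gaussian likelihood, assuming conditional independence between features.
nb = GaussianNB().fit(X, y)
priors = nb.class_prior_   # learned prior probability of each class
```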
5.3.5. Random Forest
The Random Forest algorithm operates as an ensemble learning method, constructing multiple
decision trees during training and outputting the mode of the classes (classification) or mean prediction
(regression) of the individual trees. Each decision tree in the forest is trained on a random subset of
the data and features, which introduces diversity and reduces overfitting. During the prediction phase,
each tree votes for a class, and the class receiving the majority vote is chosen as the final prediction.
This approach allows the model to capture complex patterns and interactions in the data. In our
project, the Random Forest classifier achieved an accuracy of 0.82, demonstrating its effectiveness in
distinguishing between various levels of student anxiety. The high accuracy underscores the model's
capability to leverage the adaptive active learning framework to iteratively improve predictions and
provide valuable insights into student mental health.
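The bootstrap-plus-voting mechanism above can be sketched with scikit-learn's RandomForestClassifier (synthetic data and settings are illustrative, not the project's):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 20))
y = rng.integers(0, 4, size=400)

# Each of the 100 trees is trained on a bootstrap sample of the rows and
# considers a random subset of features (sqrt of 20) at every split,
# which introduces the diversity that reduces overfitting.
forest = RandomForestClassifier(n_estimators=100, max_features='sqrt',
                                random_state=0)
forest.fit(X, y)

# predict() aggregates the individual trees' votes into a final class.
preds = forest.predict(X[:5])
```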
The Decision Tree Classifier operates as a versatile tool for handling the multi-class classification of
student anxiety levels. The internal mechanism of the Decision Tree Classifier involves recursively
splitting the dataset into subsets based on feature values to create a tree-like model of decisions. At
each node of the tree, the algorithm selects the feature and corresponding threshold that best separates
the data according to a criterion like Gini impurity or information gain. This process continues until
the data in each leaf node belongs to a single class or meets a stopping criterion, such as a maximum
tree depth or minimum sample split. The decision tree structure enables intuitive understanding of how
different features contribute to predictions, and its adaptability in handling various anxiety levels
makes it a fitting choice for predicting and interpreting complex patterns in student anxiety data.
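The recursive splitting and stopping criteria described above can be sketched as follows; the data and feature names are hypothetical, and `export_text` shows why the fitted tree is easy to interpret:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 4))
y = rng.integers(0, 4, size=200)

# Gini-impurity splits with two stopping criteria: a maximum tree depth
# and a minimum number of samples required to split a node.
tree = DecisionTreeClassifier(criterion='gini', max_depth=3,
                              min_samples_split=10, random_state=0)
tree.fit(X, y)

# The fitted tree can be rendered as readable if/else rules.
rules = export_text(tree, feature_names=[f'f{i}' for i in range(4)])
print(rules)
```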
The Stacking Classifier is a powerful ensemble learning technique used in this research to predict
student anxiety levels. It operates by combining the predictions from multiple base models (often
referred to as level-0 models) to improve overall performance. In this approach, the base models
generate predictions based on the input data, and these predictions are then used as input features for
a meta-model (level-1 model), which synthesizes the information to produce the final output. This
method leverages the strengths of various classifiers to enhance predictive accuracy. In our project,
the Stacking Classifier achieved an impressive accuracy of 0.86, demonstrating its effectiveness in
capturing the complex patterns associated with student anxiety. By integrating multiple models'
insights, the Stacking Classifier provides a robust solution that enhances prediction accuracy and
reliability compared to individual models, aligning with the study's goal of improving early
intervention in educational settings.
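The level-0/level-1 arrangement described above can be sketched with scikit-learn's StackingClassifier; the choice of base models and meta-model here is illustrative, not necessarily the combination used in the project:

```python
import numpy as np
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 10))
y = rng.integers(0, 4, size=300)

# Level-0 base models produce out-of-fold predictions (cv=3) that become
# the input features of the level-1 meta-model.
stack = StackingClassifier(
    estimators=[('rf', RandomForestClassifier(n_estimators=25, random_state=0)),
                ('nb', GaussianNB())],
    final_estimator=LogisticRegression(max_iter=500),
    cv=3,
)
stack.fit(X, y)
preds = stack.predict(X[:5])
```

Using out-of-fold predictions (rather than predictions on the training data itself) keeps the meta-model from simply memorizing the base models' training-set behavior.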
App.py
from flask import Flask, render_template, redirect, url_for, request
import numpy as np
import joblib
import mysql.connector

app = Flask(__name__)

# Connect to the local MySQL database that stores user accounts
mydb = mysql.connector.connect(
    host='localhost',
    port=3306,
    user='root',
    passwd='',
    database='Anxiety'
)
mycur = mydb.cursor()

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/about')
def about():
    return render_template('about.html')

@app.route('/registration', methods=['POST', 'GET'])
def registration():
    if request.method == 'POST':
        name = request.form['name']
        email = request.form['email']
        password = request.form['password']
        confirmpassword = request.form['confirmpassword']
        if password == confirmpassword:
            # Reject duplicate registrations for the same email
            sql = 'SELECT * FROM users WHERE email = %s'
            val = (email,)
            mycur.execute(sql, val)
            data = mycur.fetchall()
            if data:
                msg = 'User with this email already exists. Please log in.'
                return render_template('registration.html', msg=msg)
            else:
                sql = 'INSERT INTO users (name, email, password) VALUES (%s, %s, %s)'
                val = (name, email, password)
                mycur.execute(sql, val)
                mydb.commit()
                return redirect(url_for('login'))
        else:
            msg = 'Passwords do not match.'
            return render_template('registration.html', msg=msg)
    return render_template('registration.html')

@app.route('/login', methods=['POST', 'GET'])
def login():
    if request.method == 'POST':
        email = request.form['email']
        password = request.form['password']
        sql = 'SELECT * FROM users WHERE email = %s'
        val = (email,)
        mycur.execute(sql, val)
        data = mycur.fetchall()
        if data:
            if password == data[0][2]:
                return render_template('prediction.html')
            else:
                msg = 'Incorrect password. Please try again.'
        else:
            msg = 'User with this email does not exist. Please register.'
        return render_template('login.html', msg=msg)
    return render_template('login.html')

@app.route('/prediction', methods=['POST', 'GET'])
def prediction():
    if request.method == 'POST':
        # GAD-7 anxiety items, SWL (Satisfaction With Life) items, and
        # demographic fields collected from the prediction form
        GAD1 = int(request.form['GAD1'])
        GAD2 = int(request.form['GAD2'])
        GAD3 = int(request.form['GAD3'])
        GAD4 = int(request.form['GAD4'])
        GAD5 = int(request.form['GAD5'])
        GAD6 = int(request.form['GAD6'])
        GAD7 = int(request.form['GAD7'])
        SWL1 = int(request.form['SWL1'])
        SWL2 = int(request.form['SWL2'])
        SWL3 = int(request.form['SWL3'])
        SWL4 = int(request.form['SWL4'])
        SWL5 = int(request.form['SWL5'])
        Game = int(request.form['Game'])
        Hours = float(request.form['Hours'])
        Gender = int(request.form['Gender'])
        Age = int(request.form['Age'])
        Work = int(request.form['Work'])
        Degree = int(request.form['Degree'])
        GAD_T = int(request.form['GAD_T'])
        SWL_T = int(request.form['SWL_T'])
        abc = [[GAD1, GAD2, GAD3, GAD4, GAD5, GAD6, GAD7, SWL1, SWL2, SWL3,
                SWL4, SWL5, Game, Hours, Gender, Age, Work, Degree, GAD_T, SWL_T]]
        abc_array = np.array(abc)
        # Load the trained stacking ensemble and predict the anxiety level
        loaded_model = joblib.load('stacking_model.pkl')
        result = loaded_model.predict(abc_array)[0]
        if result == 0:
            suggestions = ("It seems like you are facing significant challenges. "
                           "Consider seeking professional help, talking to a mentor, or "
                           "breaking down tasks into smaller steps to reduce stress.")
        elif result == 1:
            suggestions = ("You seem to be handling things well! Keep up the good work "
                           "and maintain a balanced approach to avoid burnout.")
        elif result == 2:
            suggestions = ("You're finding things a bit challenging. It might help to "
                           "take short breaks, reassess your workload, or talk to "
                           "someone you trust for advice.")
        elif result == 3:
            suggestions = ("You're going through a tough time. Consider reaching out "
                           "for support from friends, family, or a counselor, and take "
                           "things one step at a time.")
        else:
            suggestions = ("It appears there was an issue with the prediction. "
                           "Please try again or contact support.")
        return render_template('prediction.html', prediction=suggestions)
    return render_template('prediction.html')

@app.route('/logout')
def logout():
    return render_template('index.html')

if __name__ == '__main__':
    app.run(debug=True)
CHAPTER – 6
TESTING DETAILS
6.1 Unit Testing
Unit testing is the initial phase of software testing where individual units or components of the
software are tested independently to verify that each performs as designed. A "unit" may refer to a
function, method, procedure, or module in the application. The objective is to isolate each part of the
program and ensure that it behaves correctly in terms of logic, inputs, and outputs.
In the context of our project, each function—such as form input validation, prediction logic, data
preprocessing modules, and database operations—was tested independently. We utilized automated
unit testing tools such as pytest or unittest in Python to run these tests efficiently. For example, in
the prediction module, we verified that the model correctly handles missing values and gives
expected results for known inputs.
Unit testing helped ensure a strong foundation by validating each block of the system before
integrating them together.
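A minimal sketch of such a unit test is shown below; the validation helper and the pytest-style test functions are hypothetical examples in the spirit of the tests described, not the project's actual code:

```python
def validate_gad_item(value):
    """Return True if a GAD-7 item score is an integer in the range 0-3."""
    return isinstance(value, int) and 0 <= value <= 3

def test_valid_scores():
    # All four legal GAD-7 responses should pass validation.
    assert all(validate_gad_item(v) for v in (0, 1, 2, 3))

def test_invalid_scores():
    # Out-of-range values and non-integers should be rejected.
    assert not validate_gad_item(4)
    assert not validate_gad_item(-1)
    assert not validate_gad_item("2")

# pytest would collect and run the test_* functions automatically;
# calling them directly also works for a quick check.
test_valid_scores()
test_invalid_scores()
```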
6.2 Integration Testing
Integration tests are designed to test integrated software components to determine if they
actually run as one program. Testing is event driven and is more concerned with the basic outcome
of screens or fields. Integration tests demonstrate that although the components were individually
satisfactory, as shown by successful unit testing, the combination of components is correct and
consistent. Integration testing is specifically aimed at exposing the problems that arise from the
combination of components.
Software integration testing is the incremental integration testing of two or more integrated software
components on a single platform to produce failures caused by interface defects.
The task of the integration test is to check that components or software applications (for example,
components in a software system or, one step up, software applications at the company level)
interact without error.
Test Results: All the test cases mentioned above passed successfully. No defects encountered.
Acceptance Testing
User Acceptance Testing is a critical phase of any project and requires significant participation by the
end user. It also ensures that the system meets the functional requirements.
For our project, user testing involved allowing potential users (students, faculty, or testers) to perform
specific tasks, such as inputting data, generating predictions, or navigating through the system, and
collecting their feedback. We focused on key usability aspects such as:
• Verifying that all core functionalities worked from the user's perspective
We used methods like observation, surveys, and direct interviews to gather qualitative and quantitative
feedback. Several iterations of user testing were conducted, each followed by improvements based on
findings.
Test Results: All the test cases mentioned above passed successfully. No defects encountered.
Ultimately, user testing ensured that our solution was not just technically sound, but also intuitive and
effective for end users.
Functional tests provide systematic demonstrations that the functions tested are available as specified
by the business and technical requirements, system documentation, and user manuals.
Organization and preparation of functional tests is focused on requirements, key functions, or special
test cases. In addition, systematic coverage of business process flows, data fields, predefined
processes, and successive processes must be considered for testing. Before functional testing is
complete, additional tests are identified and the effective value of current tests is determined.
White Box Testing is testing in which the software tester has knowledge of the inner workings,
structure, and language of the software, or at least its purpose. It is used to test areas that cannot be
reached from a black box level.
Black Box Testing is testing the software without any knowledge of the inner workings, structure, or
language of the module being tested. Black box tests, like most other kinds of tests, must be written
from a definitive source document, such as a specification or requirements document. The software
under test is treated as a black box: the tester cannot "see" into it. The test provides inputs and
responds to outputs without considering how the software works.
Test Cases
Registration
Test Case 1: Verify that the registration form accepts valid user inputs and successfully creates a new
account.
Test Case 2: Verify that the registration form rejects invalid inputs (e.g., weak passwords, invalid
email formats).
Expected Result: The system displays appropriate error messages, and registration is not completed.
Test Case 3: Verify that the registration form handles duplicate usernames or emails.
Expected Result: The system displays an error message indicating the username or email is already
in use.
Login
Test Case 4: Verify that users can log in with valid credentials.
Test Case 5: Verify that the login form rejects invalid credentials.
Expected Result: The system displays an error message and does not allow login.
Test Case 6: Verify that the login form handles cases where the user account is inactive or deactivated.
Expected Result: The system displays a message indicating the account is inactive.
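Test cases like these can be automated with Flask's built-in test client. The minimal app below is a stand-in for App.py (a real test would import the actual application and use a test database); the credentials and route behavior are hypothetical:

```python
from flask import Flask, request

app = Flask(__name__)
USERS = {'student@example.com': 'secret'}  # hypothetical stored credentials

@app.route('/login', methods=['POST'])
def login():
    # Accept the login only if the submitted password matches the stored one.
    ok = USERS.get(request.form.get('email')) == request.form.get('password')
    return ('Welcome', 200) if ok else ('Invalid credentials', 401)

# Flask's test client issues requests without running a real server.
client = app.test_client()
good = client.post('/login', data={'email': 'student@example.com',
                                   'password': 'secret'})
bad = client.post('/login', data={'email': 'student@example.com',
                                  'password': 'wrong'})
```

Asserting on `good.status_code` and `bad.status_code` covers Test Cases 4 and 5 automatically.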
CHAPTER – 7
RESULTS DISCUSSION
7.1 Snapshots
HomePage: The HomePage serves as the landing page of the application. It provides an overview
of the project's features, objectives, and benefits. Users can navigate to other sections of the
application from this page.
AboutPage: The AboutPage offers detailed information about the project, including its purpose, goals,
and the technology used. It provides background information on the problem being addressed and the
methods employed.
Registration Page: The Registration Page allows new users to create an account with the application.
It typically includes fields for entering personal information such as name, email, password, and
possibly other details like phone number or address. Users need to fill out this form to gain access to
the application's features.
Login Page: The Login Page enables users to access their existing accounts by entering their
credentials. It usually includes fields for entering a username/email and password.
Prediction Page: The Prediction Page allows users to input data and receive predictions based on
the trained machine learning models. This page includes a form for entering data (e.g., GAD-7 and
SWL questionnaire responses and demographic details).
Outputs
1. System
This is the process of teaching a machine learning model to make accurate predictions or classifications by
exposing it to a dataset. During this phase, data is prepared and split into training, validation, and test sets.
The system takes the data given by the user and predicts the output based on the given data.
2. User
2.1 Registration
The Registration Page allows new users to create an account by entering their personal information. It includes
fields for username, email, password, and other required details. The page features validation to ensure that all
input data is correct and meets the specified requirements. For example, it checks for valid email formats, strong
passwords, and non-duplicate usernames. Users receive real-time feedback on any errors or issues with their
input, ensuring a smooth and secure registration process.
2.2 Login
Validation Messages: Provides immediate feedback if the input is incorrect or if the account details do not
match.
2.3 Evaluation
CONCLUSION
Our team has developed a novel multi-class adaptive active learning framework designed to predict
student anxiety more accurately and efficiently. This innovative approach tackles key limitations of
traditional models, which often depend on large amounts of labeled data and lack flexibility in adapting
to changing behaviors. By utilizing adaptive active learning, the system selectively queries the most
informative data points, reducing labeling effort while significantly improving prediction performance
and robustness.
The integration of multi-class classification enables the model to differentiate between varying
levels of anxiety—such as none, mild, moderate, and severe. This finer level of detail is essential for
designing tailored interventions and timely support strategies in educational settings. Through rigorous
testing and evaluation, the framework consistently outperformed baseline models, demonstrating its
effectiveness in capturing dynamic patterns in student behavior.
This work underscores the potential of adaptive learning methods in educational data mining,
providing a scalable and real-time solution for identifying anxiety among students. By enabling more
responsive, data-driven mental health support, our contribution aims to enhance student well-being
and promote healthier academic environments.
FUTURE ENHANCEMENT
1. Integration of Multi-Modal Data
• Description: Incorporate diverse data sources such as physiological data (e.g., heart rate
variability), behavioral data (e.g., participation in class activities), and academic performance
metrics.
• Benefit: Enhances the robustness of the model by providing a more comprehensive view of
student anxiety.
2. Dynamic Model Adaptation
• Description: Introduce mechanisms for dynamic model adaptation as new data is collected,
allowing the model to continuously improve and stay relevant.
• Benefit: Ensures the model remains accurate and effective over time, even as student
behaviors and anxiety patterns evolve.
3. Ethical and Privacy Safeguards
• Description: Develop frameworks for addressing ethical issues and ensuring data privacy in
the collection and use of student data.
• Benefit: Builds trust with stakeholders and ensures compliance with legal and ethical
standards.
4. Scalability and Integration with Educational Tools
• Description: Explore methods for scaling the model to larger educational institutions or
systems and integrating it into existing educational tools.
• Benefit: Facilitates broader adoption and use of the framework across diverse educational
settings.
5. Cultural and Contextual Adaptation
• Description: Adapt the model to account for cultural and contextual differences in student
anxiety and coping mechanisms.
• Benefit: Ensures the model’s predictions and recommendations are relevant and effective
across different cultural contexts.
6. Integration with Student Support Services
• Description: Create pathways for seamless integration with existing student support services
and resources.
• Benefit: Enhances the overall support network available to students and streamlines access to
help.
REFERENCES
[1] J. Southworth, K. Migliaccio, J. Glover, J. Glover, D. Reed, C. McCarty, J. Brendemuhl, and A.
Thomas, ‘‘Developing a model for AI across the curriculum: Transforming the higher education
landscape via innovation in AI literacy,’’ Comput. Educ., Artif. Intell., vol. 4, Aug. 2023, Art. no.
100127.
[2] X.-Q. Liu, Y.-X. Guo, and Y. Xu, ‘‘Risk factors and digital interventions for anxiety disorders in
college students: Stakeholder perspectives,’’ World J. Clin. Cases, vol. 11, no. 7, pp. 1442–1457, Mar.
2023.
[3] M. K. Khaira, R. L. R. Gopal, S. M. Saini, and Z. M. Isa, ‘‘Interventional strategies to reduce test
anxiety among nursing students: A systematic review,’’ Int. J. Environ. Res. Public Health, vol. 20,
no. 2, p. 1233, Jan. 2023.
[4] K. Meeks, A. S. Peak, and A. Dreihaus, ‘‘Depression, anxiety, and stress among students, faculty,
and staff,’’ J. Amer. College Health, vol. 71, no. 2, pp. 348–354, Feb. 2023.
[5] D. H. Halat, A. Soltani, R. Dalli, L. Alsarraj, and A. Malki, ‘‘Understanding and fostering mental
health and well-being among University Faculty: A narrative review,’’ J. Clin. Med., vol. 12, no. 13,
p. 4425, Jun. 2023.
[6] R. D. Godsil and L. S. Richardson, ‘‘Racial anxiety,’’ Iowa Law Rev., vol. 102, p. 2235, Sep.
2016.
[7] J. Santhosh, D. Dzsotjan, and S. Ishimaru, ‘‘Multimodal assessment of interest levels in reading:
Integrating eye-tracking and physiological sensing,’’ IEEE Access, vol. 11, pp. 93994–94008, 2023.
[8] R. Qasrawi, S. P. Vicuna Polo, D. A. Al-Halawa, S. Hallaq, and Z. Abdeen, ‘‘Assessment and
prediction of depression and anxiety risk factors in schoolchildren: Machine learning techniques
performance analysis,’’ JMIR Formative Res., vol. 6, no. 8, Aug. 2022, Art. no. e32736.
[9] P. J. Bota, C. Wang, A. L. N. Fred, and H. P. Da Silva, ‘‘A review, current challenges, and future
possibilities on emotion recognition using machine learning and physiological signals,’’ IEEE
Access, vol. 7, pp. 140990–141020, 2019.
[10] P. Meshram and R. K. Rambola, ‘‘Diagnosis of depression level using multimodal approaches
using deep learning techniques with multiple selective features,’’ Expert Syst., vol. 40, no. 4, p.
e12933, May 2023.
[12] J. C. Cassady, E. E. Pierson, and J. M. Starling, ‘‘Predicting student depression with measures of
general and academic anxieties,’’ Frontiers Educ., vol. 4, p. 11, Feb. 2019.
[13] H. S. Yi and W. Na, ‘‘How are maths-anxious students identified and what are the key predictors
of maths anxiety? Insights gained from PISA results for Korean adolescents,’’ Asia Pacific J. Educ.,
vol. 40, no. 2, pp. 247–262, Apr. 2020.