
PERSONALISED MUSIC THERAPY SYSTEM

A Project Report Submitted to JNTU-GV, Vizianagaram, in Partial Fulfilment of the Requirements for the Award of the Degree of

Bachelor of Technology
in
Computer Science & Engineering (Artificial Intelligence & Machine Learning)
by
MARPINA HARIKA 21NU1A4217
GANAGALLA PRANAY 21NU1A4211
GURRAM PRANICK DAS 21NU1A4213
NIMMAKAYALA JOY KOUSHIK 21NU1A4219

Under the Guidance of

MRS. J. SANTOSHI KUMARI
Sr. Assistant Professor

Department of Computer Science and Engineering (Artificial Intelligence & Machine learning)

Nadimpalli Satyanarayana Raju Institute of Technology


(AUTONOMOUS)
(Permanently Affiliated to JNTU-GV, Vizianagaram; Approved by AICTE, New Delhi)

Sontyam, Visakhapatnam-531173
2021 – 2025
DECLARATION
We certify that the work contained in this report is original and has been done by us under the guidance of our supervisor, MRS. J. SANTOSHI KUMARI, Sr. Assistant Professor. The work has not been submitted to any other Institute for any degree or diploma. We have followed the guidelines provided by the Institute in preparing the report and have conformed to the norms and guidelines given in the Ethical Code of Conduct of the Institute. Whenever we have used material (data, theoretical analysis, figures, and text) from other sources, we have given due credit by citing the sources in the text of the report and listing their details in the references. Further, we have taken permission from the copyright owners of the sources wherever necessary.

Place: Visakhapatnam
Date:

MARPINA HARIKA 21NU1A4217


GANAGALLA PRANAY 21NU1A4211
GURRAM PRANICK DAS 21NU1A4213
NIMMAKAYALA JOY KOUSHIK 21NU1A4219

CERTIFICATE

This is to certify that the project report entitled “PERSONALIZED MUSIC THERAPY SYSTEM” submitted by MARPINA HARIKA (21NU1A4217), GANAGALLA PRANAY (21NU1A4211), GURRAM PRANICK DAS (21NU1A4213), and NIMMAKAYALA JOY KOUSHIK (21NU1A4219) to the Nadimpalli Satyanarayana Raju Institute of Technology, Sontyam, Visakhapatnam, in partial fulfilment of the requirements for the award of the Degree of Bachelor of Technology in Computer Science and Engineering (Artificial Intelligence & Machine Learning), is a bona fide record of the work carried out by them under our guidance and supervision. The contents of this report, in full or in part, have not been submitted to any other Institute for the award of any Degree.

Internal Guide: MRS. J. SANTOSHI KUMARI, Sr. Assistant Professor, Department of CSE (AI&ML)

Head of the Department: DR. Ramkumar Addagarila, Professor, Department of CSE (AI&ML)

Date

ACKNOWLEDGEMENT

We would like to take this opportunity to express our deepest gratitude to our project supervisor, MRS. J. SANTOSHI KUMARI, Sr. Assistant Professor, Computer Science & Engineering, N S Raju Institute of Technology (A), Visakhapatnam, who has persistently and determinedly guided us throughout the course of this project. It would have been very difficult to complete this project without her enthusiastic support, insight, and advice. We are extremely thankful to DR. Ramkumar Addagarila, Professor & Head of the Computer Science & Engineering (AI&ML) Department, for providing the excellent lab facilities that helped in the successful completion of our project. Our utmost thanks go to our project coordinator, Mr. N. Viswanath Reddy, Sr. Assistant Professor, Computer Science & Engineering (Artificial Intelligence & Machine Learning), for his support throughout our project work.

We take immense pleasure in thanking Dr. S. Sambhu Prasad, Principal, N S Raju Institute of Technology (A), Sontyam, Visakhapatnam, for having permitted us to carry out the project work. We thank the Management of N S Raju Institute of Technology (A), Sontyam, Visakhapatnam, for providing the various resources needed to complete the project successfully. We are thankful to one and all who contributed to this work directly or indirectly.

List of Program Outcomes


As per the Program of Study

PO1: Apply the knowledge of basic sciences and fundamental engineering concepts in solving engineering
problems (Engineering Knowledge)

PO2: Identify, formulate, review research literature, and analyze complex engineering problems reaching
substantiated conclusions using first principles of mathematics, natural sciences, and engineering
sciences. (Problem Analysis)

PO3: Design solutions for complex engineering problems and design system components or processes that
meet the specified needs with appropriate consideration for the public health and safety, and the
cultural, societal, and environmental considerations. (Design/Development of Solutions)

PO4: Perform investigations, design and conduct experiments, analyse and interpret the results to provide
valid conclusions (Investigation of Complex Problems)

PO5: Select/develop and apply appropriate techniques and IT tools for the design & analysis of the systems
(Modern Tool Usage)

PO6: Give reasoning and assess societal, health, legal and cultural issues with competency in professional
engineering practices (The Engineer and Society)

PO7: Demonstrate professional skills and contextual reasoning to assess environmental/societal issues for
sustainable development (The Environment and Sustainability)

PO8: Demonstrate Knowledge of professional and ethical practices (Ethics)

PO9: Function effectively as an individual, and as a member or leader in diverse teams, and in multi-
disciplinary situations (Individual and Team Work)

PO10: Communicate effectively within the engineering community, being able to comprehend and write effective reports and presentations and to give and receive clear instructions (Communication)

PO11: Demonstrate and apply engineering & management principles in their own / team projects in
multidisciplinary environment (Project Finance and Management)

PO12: Recognize the need for, and have the ability to engage in independent and lifelong learning (Life Long
Learning)

LIST OF PROGRAM SPECIFIC OUTCOMES

PSO1: Apply the conceptual knowledge of computer science, machine learning and deep learning to solve
real world problems
PSO2: Develop skills to design and develop systems/applications to provide AI based solutions

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
(ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING)

THE VISION:
To become the centre of excellence for technically competent and innovative computer engineers.

THE MISSION:
• To provide quality education and spread professional and technical knowledge, leading to careers as computer professionals in different domains of industry, governance, and academia.
• To provide a state-of-the-art environment for learning and practice.
• To impart hands-on training in the latest methodologies and technologies.

ABSTRACT

This project aims to create a dynamic and intelligent music recommendation system that responds to the
emotional needs of users in real-time. While current music recommendation systems primarily focus on user
preferences, playlists, or popular trends, they often overlook the emotional context in which a user is listening
to music. Music has long been recognized as a powerful tool for influencing emotions and enhancing well-
being, with the ability to uplift spirits or provide comfort during tough times. However, existing systems fail to
adapt to the user's mood and emotional state during their listening experience.
The innovative approach of this project bridges this gap by leveraging advanced technologies like emotion
recognition and machine learning. The system detects a user’s emotional state—such as anger, sadness, or
happiness—through various inputs, such as facial expressions, voice tone, or even physiological signals. Once
the user's emotional state is determined, the system curates and plays music that aligns with or helps improve
their mood. For instance, soothing and calming melodies might be recommended to a user who is experiencing
anger, while uplifting, energetic tracks may be suggested to counteract feelings of sadness.
This system personalizes the listening experience by acting as an emotional companion, offering users music
that resonates with their current emotional needs. The project offers a more effective and individualized
solution to users’ mental well-being, which is particularly relevant in today's fast-paced and often stressful
world. In doing so, it not only enhances the overall music experience but also redefines the role of music as a
therapeutic tool, providing an emotionally intelligent approach to mental health management.
By combining music with emotion-based technology, the system creates a unique and enriching experience
that goes beyond entertainment—empowering users to navigate their moods more effectively. The project
seeks to transform music into a form of emotional support, providing comfort, energy, and positivity through
personalized, mood-enhancing music recommendations.

CONTENTS

TITLE I
DECLARATION III
CERTIFICATE IV
ACKNOWLEDGEMENT V
List of Program Outcomes VI
List of Program Specific Outcomes VII
Department Vision and Mission VIII
ABSTRACT IX
CONTENTS XI
LIST OF FIGURES XI
CHAPTER 1: INTRODUCTION 1
1.1 Project Objective 1
1.2 Background 2
1.3 Existing System 4
1.4 Proposed System 4
1.5 Problem Statement 4
1.6 Feasibility of Study 6
CHAPTER 2: LITERATURE REVIEW 8
CHAPTER 3: SOFTWARE REQUIREMENT SPECIFICATIONS 11
3.1 System Requirements 13
3.1.1 Hardware Requirements 13
CHAPTER 4: SYSTEM DESIGN 14
4.1 Design Goals 14
4.2 System Architecture 14
4.3 Data Flow Diagram 15
4.4 UML Diagrams 17
4.4.1 Use Case Diagram 18
4.4.2 Class Diagram 19
4.4.3 Sequence Diagram 20
CHAPTER 5: IMPLEMENTATION 21
5.1 Software Environment 21
5.1.1 What is Python 21
5.1.2 Modules 22
5.2 Methodology 23

LIST OF FIGURES

Figure Name Page No.
1.4 Proposed System 11
4.2 System Architecture 20
4.3 Data Flow Diagram 22
4.4.1 Use Case Diagram 24
4.4.2 Class Diagram 25
4.4.3 Sequence Diagram 25
INTRODUCTION

1.1 Project Objective

The objective of this project is to develop a Personalized Music Therapy System that dynamically adapts to
an individual's emotional state and mental well-being. Traditional music recommendation systems primarily
focus on user preferences, playlists, or popular trends, overlooking the real-time emotional needs of users. This
project aims to bridge this gap by integrating emotion recognition and artificial intelligence (AI) to curate music
that enhances mood, reduces stress, and provides therapeutic benefits. By leveraging real-time emotional
inputs, the system offers a personalized and intelligent music experience tailored to the user’s psychological
and emotional state.
The system utilizes advanced technologies such as facial expression analysis, voice tone detection, and
physiological signal monitoring to assess a user’s emotional state. Machine learning algorithms process this
data to determine whether the user is experiencing emotions such as happiness, sadness, anxiety, or stress.
Based on the detected mood, the system recommends and plays music that either aligns with or helps regulate
the emotional state. For example, calming and meditative music may be suggested for stress relief, while
energetic tracks might be played to uplift a low mood.
This AI-driven music therapy system is designed to enhance mental well-being by making music an
interactive and supportive tool. It can be particularly beneficial for individuals facing emotional distress, mental
health challenges, or high levels of stress in their daily lives. The system learns from user feedback and
continuously improves its recommendations, ensuring that each session becomes more personalized and
effective. Over time, it adapts to individual emotional patterns, offering deeper insights into how music
influences a person’s well-being.
Beyond personal use, this system has potential applications in clinical therapy, stress management
programs, wellness centers, and digital health platforms. It can support mental health professionals by
integrating music therapy into treatment plans or provide relief to individuals dealing with anxiety, depression,
or emotional fatigue. By combining AI and music therapy, this project creates an innovative, intelligent, and
emotionally aware system that transforms the way users interact with music, making it a powerful tool for
emotional support and psychological healing.

1.2 BACKGROUND

Music has long been recognized as a powerful tool for influencing emotions, reducing stress, and enhancing
overall well-being. Various studies have shown that music can regulate mood, improve cognitive function, and
even aid in mental health treatments. From ancient times, different cultures have used music as a form of
therapy to heal emotional distress and promote relaxation. However, despite its therapeutic potential, most
modern music applications focus primarily on entertainment, relying on predefined playlists or user preferences
without considering the listener’s real-time emotional state.
Traditional music recommendation systems use algorithms based on historical listening patterns, social trends,
or genre preferences. While these methods offer personalized playlists, they fail to adapt dynamically to a
user’s current mood or psychological needs. Emotional states are constantly changing, influenced by factors
such as stress, fatigue, and external circumstances. A music system that can recognize and respond to these
emotional variations can provide a more meaningful and supportive listening experience, making music not just
a source of enjoyment but also a tool for emotional regulation and mental well-being.
Advancements in artificial intelligence (AI), machine learning, and emotion recognition technologies
have opened new possibilities for integrating music therapy with digital systems. AI-driven solutions can now
analyze facial expressions, voice tone, and physiological signals like heart rate to detect emotions with high
accuracy. These technologies allow for the development of intelligent systems that can curate music in real
time based on a user’s emotional state, thereby offering a personalized therapeutic experience.
Given the increasing levels of stress, anxiety, and mental health challenges in today’s fast-paced world, the
need for emotion-aware and adaptive music therapy solutions has never been greater. A personalized
music therapy system that responds to emotional cues can serve as a valuable tool in daily life, offering
comfort, relaxation, and motivation when needed. This project aims to bridge the gap between music and
mental health by creating an AI-powered system that transforms music into a proactive and intelligent
emotional support tool.

1.3 EXISTING SYSTEM

Current music recommendation systems, such as those used by streaming platforms like Spotify, Apple Music,
and YouTube Music, primarily focus on user preferences, past listening history, and popular trends. These
systems utilize collaborative filtering, content-based filtering, and deep learning models to suggest songs that
align with a user’s taste. While effective for personalized playlists, these platforms lack real-time emotional
adaptability. They do not analyze the listener’s current mood or psychological state, making their
recommendations static rather than responsive to emotional needs.
Some advancements have been made in emotion-based music selection, such as mood-based playlists and
AI-generated recommendations using sentiment analysis. However, these implementations rely on user inputs,
such as selecting a playlist labeled "chill", "happy", or "sad", rather than dynamically detecting emotions. More
sophisticated systems incorporating emotion recognition through facial expressions, voice analysis, or
physiological signals are still in research or limited to specialized applications. Thus, there is a gap in the
market for an intelligent, real-time personalized music therapy system that can seamlessly integrate AI-
driven emotion detection with adaptive music curation for mental well-being.

1.4 PROPOSED SYSTEM

The Personalized Music Therapy System aims to revolutionize music recommendation by integrating real-
time emotion recognition with AI-driven music selection. Unlike existing systems that rely on past listening
history or user-selected playlists, this system dynamically detects a user’s emotional state using advanced
technologies such as facial expression analysis, voice tone detection, and physiological signals (e.g.,
heart rate variability). By analyzing these inputs, the system accurately determines whether the user is feeling
happy, sad, anxious, or stressed and curates music accordingly to enhance or regulate their mood.
This AI-powered system utilizes machine learning algorithms and a music classification model to
categorize songs based on their emotional impact. Once the user’s emotion is identified, the system selects
music that aligns with therapeutic principles—for instance, calming music for stress relief, uplifting beats
for sadness, or soft instrumental melodies for relaxation. Over time, the system learns from user
interactions, refining its recommendations to provide a more personalized and effective therapeutic
experience.
The proposed system is designed for integration with smartphones, music streaming platforms, and
wearable devices, making it accessible to users in various settings, such as during work, relaxation, or therapy
sessions. It can also be used in clinical environments, assisting therapists in music-based treatments for
anxiety, depression, or emotional distress. By combining AI, emotion recognition, and music therapy, this
system transforms music into an intelligent emotional support tool, offering users a unique and proactive
approach to mental well-being.

Fig 1.4: Proposed System

1.5 PROBLEM STATEMENT

In today’s fast-paced world, individuals frequently experience emotional fluctuations, including stress, anxiety,
sadness, and frustration, which can significantly impact their mental well-being. While music has long been
recognized for its therapeutic effects in regulating emotions, existing music recommendation systems primarily
focus on generic preferences, playlists, or trending songs. These systems lack the capability to respond
dynamically to a user’s real-time emotional state, limiting their effectiveness in providing personalized
emotional support through music.
The absence of an intelligent, adaptive system that understands and responds to a listener’s emotions in real
time creates a gap in current music experiences. Users often have to manually search for mood-appropriate
music, which may not always align with their psychological needs. Furthermore, existing platforms do not
incorporate advanced technologies such as emotion recognition through facial expressions, voice tone
analysis, or physiological signals, missing the opportunity to provide a real-time, mood-aware music
therapy solution.
To address this limitation, there is a need for a Personalized Music Therapy System that integrates AI-
driven emotion recognition with automated music curation. By leveraging machine learning and
emotion-based algorithms, the system can analyze emotional cues and recommend music that aligns with or
improves the user’s mood. This innovation will transform music from mere entertainment into an intelligent,
therapeutic tool that enhances emotional resilience, mental well-being, and overall user experience.

1.6 FEASIBILITY OF STUDY

The technical feasibility of this project is strong due to advancements in artificial intelligence (AI), machine
learning, and emotion recognition technologies. Modern deep learning models can accurately analyze
facial expressions, voice tone, and physiological signals to detect emotions in real time. Emotion
classification techniques, such as convolutional neural networks (CNNs) for image analysis and recurrent
neural networks (RNNs) for speech processing, enable precise mood detection. Additionally, music
recommendation systems already leverage AI, making it feasible to integrate emotion-aware algorithms with
existing frameworks. The availability of cloud computing and edge AI further ensures that real-time processing
can be achieved on devices such as smartphones and wearable gadgets. Three key considerations are involved in the feasibility analysis:
• ECONOMICAL FEASIBILITY
• TECHNICAL FEASIBILITY
• SOCIAL FEASIBILITY

1.6.1 Economical Feasibility

Economically, this project has strong potential due to the rapid expansion of AI-driven wellness
technology and digital health markets. The integration of emotion-based music recommendations into
existing streaming platforms (such as Spotify, Apple Music, or YouTube Music) could create new revenue
streams through premium subscriptions, in-app purchases, or licensing AI-powered recommendation
engines. Additionally, partnerships with healthcare providers, meditation apps, and wearable device
manufacturers could drive adoption. The scalability of AI-powered solutions also reduces operational costs
over time, making it financially viable for large-scale deployment.

1.6.2 Technical Feasibility

Technically, the project builds on mature deep learning frameworks, pre-trained emotion recognition models, and readily available music streaming APIs, so only minimal changes to existing infrastructure are required to implement the system. Overall, the project is technically feasible due to AI advancements, socially viable due to growing mental health awareness, and economically promising due to the expanding digital wellness industry. With the right technological infrastructure, market positioning, and strategic partnerships, this Personalized Music Therapy System can become a revolutionary tool in both the music and mental wellness industries.
1.6.3 Social Feasibility

From a social perspective, this system has high acceptability due to the growing awareness of mental health
and emotional well-being. Many individuals experience stress, anxiety, or emotional distress in their daily
lives, and personalized music therapy offers a non-intrusive, accessible, and engaging solution. Unlike
traditional therapy, which may require professional intervention, this system empowers users to regulate their
emotions independently. The increasing adoption of mental wellness apps, AI-driven healthcare solutions,
and music streaming services further supports the societal demand for such an innovation.

REVIEW OF LITERATURE

[1] The study "Mood-Based Music Recommendation System" explores an AI-driven approach to emotion-
aware music recommendations. It employs real-time facial emotion detection using MobileNet (CNN model)
trained on FER 2013 and MMA datasets, classifying emotions into seven categories with 75% accuracy.
The system captures facial expressions via a live camera feed and allows emoji-based manual selection
for better accuracy. Based on detected moods, it recommends Firebase-stored playlists categorized by mood and language, and it is optimized for Android devices using TensorFlow Lite.
While effective, the system relies solely on facial expressions, which may not fully reflect emotional states.
The authors suggest expanding dataset diversity and incorporating movies and TV shows for a broader
entertainment experience.
This research serves as a foundation for emotion-based music recommendation. However, our proposed
system aims to enhance it by integrating voice analysis, physiological signals, and improved machine
learning models, making AI-driven music therapy more accurate, scalable, and effective for mental health
applications.

[2] The paper "Emotion-Based Music Recommendation System" by Mikhail Rumiantcev and Oleksiy
Khriyenko, published in the Proceedings of the 26th Conference of Open Innovations Association
FRUCT (2020), introduces an AI-powered music recommendation system that adapts to users' emotions,
feelings, and contextual activities.
The system combines generalized music therapy principles with artificial intelligence to enhance mental
and physical well-being through personalized music recommendations. It integrates psychological feedback,
sensor-based inputs, and music metadata to assess a user’s emotional state and select appropriate tracks.
The system's architecture utilizes Long Short-Term Memory (LSTM) models for music classification and
Reinforcement Learning (RL) for adaptive recommendations, ensuring continuous personalization and
emotion-based transitions.
A working prototype was developed using MuPsych tools and Spotify integration, analyzing energy,
valence, tempo, and loudness to classify and recommend tracks dynamically. By continuously collecting data
and refining its models, the system aims to optimize emotional state transitions and improve user
experience over time.

[3] The study "Artificial Intelligence Induced Music Genre Prediction using KNN Architecture" by Dr. G.
Srivatsun, Mr. S. Thivaharan, Mr. R. Vishnu Vardhan, and Mr. R. Kumaresan, published in IJERT, Vol. 11,
Issue 06, June 2022, focuses on automating music genre classification using AI and machine learning
techniques.
The system utilizes the GTZAN dataset (1000 music samples across 10 genres) and extracts Mel-Frequency
Cepstral Coefficients (MFCC) features using the LibROSA library. It employs classification models like K-
Nearest Neighbors (KNN), Support Vector Machines (SVM), and Long Short-Term Memory (LSTM)
networks. A web-based application (built with HTML, CSS, JavaScript, and Django) allows users to upload
audio files for genre prediction.
Experimental results show that SVM and LSTM outperform KNN, with the LSTM model achieving up to
83% accuracy. This approach improves music classification efficiency and contributes to personalized
music recommendations and automated labeling in music databases.

[4] The study "Facial Emotion-Based Song Recommendation" by Armaan Khan, Ankit Kumar, Abhishek
Jagtap, and Dr. Mohd. Shafi Pathan, published in IJERT, Vol. 11, Issue 06, June 2022, presents a music
recommendation system based on facial emotion recognition.
The system utilizes a webcam to capture real-time images of the user, processes them using a
Convolutional Neural Network (CNN) model, and recommends songs that match the detected emotion. The
CNN model classifies emotions into seven categories: happy, angry, sad, neutral, fear, disgust, and
surprise. A website interface facilitates capturing expressions, analyzing mood, and playing songs
accordingly.
The CNN model includes six convolutional layers and four dense layers, optimized with the Adam
algorithm, achieving a test accuracy of 62.22%. Future enhancements focus on improving accuracy through
larger datasets and advanced algorithms, with potential applications in face detection and mobile
platforms.

[5] The study "Emotion-Tuned Music Playback System" by H. Immanuel James, J. James Anto Arnold, J.
Maria Masilla Ruban, M. Tamilarasan, and R. Saranya, published in IRJET, Volume 06, Issue 03, March
2019, presents a music recommendation system based on facial emotion detection.
The system processes video input from a webcam to analyze facial expressions using Histogram of
Oriented Gradients (HOG) and Principal Component Analysis (PCA). A Support Vector Machine (SVM)
classifier then categorizes emotions into happy, sad, angry, and surprise before recommending music
accordingly.
The system consists of three modules:
1. Face Detection – Identifies the user's face, reduces noise, and extracts features using HOG and
image pyramids.
2. Emotion Classification – Predicts the user’s emotion using extracted facial landmarks and an SVM
classifier.

3. Music Recommendation – Maps the detected emotion to a pre-assigned playlist and plays mood-
appropriate songs.
This system provides real-time emotion detection and music recommendations, reducing manual playlist
selection efforts. Future improvements include expanding detectable emotions (e.g., disgust and fear)
and integrating additional sensors for better accuracy.

SOFTWARE REQUIREMENT SPECIFICATIONS

Explanation of Components in the System


Programming Language:
• Python 3.6+ – A versatile and widely used programming language, chosen for its strong support for
machine learning, web frameworks, and data processing.
Frameworks and Libraries:
1. Flask – A lightweight web framework used to build and deploy the web application for the music
recommendation system.
2. TensorFlow/Keras – Libraries for machine learning and deep learning, used for emotion detection
through a trained CNN (Convolutional Neural Network) model.
3. OpenCV – A powerful computer vision library used to process facial images, detect faces, and
extract emotion-related features.

4. Spotipy – A Python library that interacts with the Spotify API, enabling song retrieval and playback
based on the detected emotion.
5. NumPy – A fundamental numerical computing library used for handling arrays and matrix
operations, essential in image and model processing.
6. Pandas (optional) – A data analysis library used for managing and structuring playlists efficiently.
7. Pillow (optional) – A Python imaging library useful for handling and modifying images when
processing facial expressions.
Developer Tools:
1. Spotify Developer Account – Required to obtain API credentials for Spotipy, allowing access to
Spotify’s music catalog and streaming services.
2. Code Editor (e.g., Visual Studio Code, PyCharm) – A development environment used for writing,
debugging, and testing the application’s code.
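Before running the project, it is worth confirming that all of the components listed above are installed. The following is a minimal sanity-check script, not part of the report's code; the module names are the usual import names for the packages listed:

# check_env.py - quick check that the libraries listed above are importable.
required = {
    "flask": "Flask web framework",
    "tensorflow": "TensorFlow / Keras deep learning",
    "cv2": "OpenCV computer vision",
    "spotipy": "Spotify Web API wrapper",
    "numpy": "numerical computing",
    "pandas": "data handling (optional)",
    "PIL": "Pillow imaging (optional)",
}

missing = []
for module, purpose in required.items():
    try:
        __import__(module)          # import by name without binding it
        print(f"[ok]      {module:<10} - {purpose}")
    except ImportError:
        missing.append(module)
        print(f"[missing] {module:<10} - {purpose}")

if missing:
    print("Install the missing packages (e.g. with pip) before running the system.")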

3.1 SYSTEM REQUIREMENTS

3.1.1 HARDWARE REQUIREMENTS:

• Processor: A decent processor such as an Intel i5 or better, to handle facial recognition and machine learning computations efficiently.
• RAM: A minimum of 8 GB is required; 16 GB is recommended for smoother performance, especially when training deep learning models.
• GPU (optional): A CUDA-enabled NVIDIA GPU is recommended for TensorFlow acceleration, improving model training and inference speed.

Webcam:
• A built-in or external webcam is needed for real-time video capture, enabling facial emotion recognition.

Internet Connection:
• A stable internet connection is essential for:
  • Fetching music playlists from online services.
  • Interacting with the Spotipy API to play songs based on detected emotions.
SYSTEM DESIGN

4.1 Design Goals

The characteristics that the system should prioritize are called design goals. Numerous design goals can be deduced from the application domain or from the non-functional requirements:
• Accuracy: a set of data points from multiple measurements of the same quantity is accurate if its average is close to the true value of the quantity being measured, and precise if the values are close to each other.
• Speed: the system operates at a responsive and effective pace.
• Consistency: any errors introduced by the system are corrected and stay corrected.
• Completeness: the system is reliable, error-free, and dependable.

4.2 System Architecture

Fig 4.2: System Architecture

Initially, the webcam captures a real-time image of the user's face. Once the image is captured, a feature
extraction process is performed using a Convolutional Neural Network (CNN). In the context of emotion
recognition, this process involves converting facial expressions into numerical representations known as
feature vectors. These vectors capture distinctive facial features that help in classifying emotions. After
extracting features, the CNN model analyzes the expression and categorizes it into one of seven predefined
emotions (Angry, Disgusted, Fearful, Happy, Neutral, Sad, and Surprised). Based on the detected emotion, the
system retrieves a corresponding Spotify playlist and plays music that matches the user's mood. This ensures
a personalized and dynamic music experience.

4.3 DATA FLOW DIAGRAM

A bubble chart is an alternative term for the DFD. A system can be represented using this straightforward
graphical formalism in terms of the input data it receives, the different operations it performs on that data, and
the output data it generates. The data flow diagram, or DFD, is a crucial modeling instrument. The components
of the system are modeled using it. These elements consist of the system's procedure, the data it uses, an
outside party that communicates with it, and the information flows within it. The information is shown in the
figure as it flows through the system and undergoes various transformations. It is a visual method for
representing the flow of information and the changes made to data as it goes from input to output. A system at
any level of abstraction can be represented using a DFD. Levels of DFD can be used to indicate increasing
functional detail and information flow. In this project, the DFD first shows the webcam frame being captured and the face being detected using OpenCV. The detected face is then passed to the trained CNN model, which extracts features and classifies the expression into one of the seven predefined emotions. The detected emotion is mapped to a corresponding Spotify playlist, which is retrieved through the Spotipy API, and the Flask-based web interface finally displays the detected emotion and plays the recommended songs. This DFD showcases the flow of data and the transformations it undergoes, emphasizing the interactions between face detection, emotion classification, playlist retrieval, and user interaction.

Fig 4.3: Data Flow Diagram

4.4 UML DIAGRAMS

UML stands for Unified Modeling Language. UML is a standardized general-purpose modeling language in the
field of object-oriented software engineering. The standard is managed, and was created by, the Object
Management Group. The goal is for UML to become a common language for creating models of object-oriented
computer software. In its current form, UML comprises two major components: a meta-model and a notation. In the future, some form of method or process may also be added to, or associated with, UML.
The Unified Modeling Language is a standard language for specifying, visualizing, constructing, and documenting the artifacts of software systems, as well as for business modeling and other non-software systems. UML represents a collection of best engineering practices that have proven successful in the modeling of large and complex systems.
UML is a very important part of developing object-oriented software and of the software development process. It uses mostly graphical notations to express the design of software projects.

GOALS:
The Primary goals in the design of the UML are as follows:
1. Provide users with a ready-to-use, expressive visual modeling language so that they can develop and
exchange meaningful models.
2. Provide extensibility and specialization mechanisms to extend the core concepts.
3. Be independent of particular programming languages and development processes.
4. Provide a formal basis for understanding the modeling language.
5. Encourage the growth of the OO tools market.
6. Support higher level development concepts such as collaborations, frameworks, patterns and
components.
7. Integrate best practices.

4.4.1 USE CASE DIAGRAM:

A use case diagram in the Unified Modeling Language (UML) is a type of behavioral diagram defined by and
created from a Use-case analysis. Its purpose is to present a graphical overview of the functionality provided by
a system in terms of actors, their goals (represented as use cases), and any dependencies between those use
cases. The main purpose of a use case diagram is to show what system functions are performed for which
actor. Roles of the actors in the system can be depicted.

Fig 4.4.1: Use case diagram

4.4.2 CLASS DIAGRAM:


In software engineering, a class diagram in the Unified Modeling Language (UML) is a type of static structure
diagram that describes the structure of a system by showing the system's classes, their attributes, operations
(or methods), and the relationships among the classes. It explains which class contains information.

Fig 4.4.2: Class diagram

4.4.3 SEQUENCE DIAGRAM:

A Sequence Diagram is a type of UML diagram that illustrates how objects interact in a system over time. It
depicts the sequence of messages exchanged between entities, such as users, databases, and applications, to
complete a specific process. In this project, the sequence diagram will represent the flow from image capture to
emotion detection and music recommendation using CNN and Spotipy API. It helps in visualizing system
interactions, ensuring efficient communication and process flow.

• User starts the application

• System captures the image

• Emotion is predicted using CNN

• System fetches the corresponding playlist from Spotify

• Songs are displayed on the UI

Fig 4.4.3: Sequence diagram

4.4.4 DEPLOYMENT DIAGRAM:

A Deployment Diagram is a UML diagram that represents the physical architecture of a system, showing how
software components are deployed on hardware nodes. It highlights the system's structure, including servers,
databases, and network configurations, ensuring efficient resource allocation. Deployment diagrams help in
understanding how different modules interact in a real-world setup, improving system performance and
scalability. They are crucial for designing distributed applications and ensuring smooth deployment in cloud or
on-premises environments.

In this project, the deployment diagram illustrates how the webcam captures images, processes them through a
CNN model, and integrates with the Spotipy API to recommend music based on detected emotions. The model
runs on Anaconda, using TensorFlow and Flask for processing and hosting the web interface. The deployment
setup ensures seamless real-time emotion detection and playlist selection, enhancing user interaction.

Fig 4.4.4: Deployment diagram

IMPLEMENTATION

The desktop application was implemented using Anaconda, with Jupyter Notebook used for data preprocessing and model training. Python served as the primary programming language, leveraging deep learning techniques for emotion recognition. Instead of traditional face recognition methods, the system employs a Convolutional Neural Network (CNN) built with TensorFlow and Keras: rather than identifying who a face belongs to, the model classifies the facial expression by extracting key features from the image. It applies multiple Conv2D layers to learn spatial patterns and MaxPooling layers to reduce dimensionality while retaining essential features. The final fully connected layers classify the detected face into one of seven emotions (Angry, Disgusted, Fearful, Happy, Neutral, Sad, and Surprised) using a softmax activation function. The model is trained with categorical cross-entropy loss and optimized with the Adam optimizer to improve accuracy. Once an emotion is detected, the system integrates with the Spotify API to recommend and play a suitable music playlist based on the user's mood.
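A minimal sketch of such a model in Keras is shown below; the layer counts, filter sizes, and the 48x48 grayscale input shape are illustrative assumptions, not the exact configuration used in the project.

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

# Illustrative CNN for 48x48 grayscale face crops and seven emotion classes.
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(48, 48, 1)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),   # downsample while keeping salient features
    Dropout(0.25),
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(1024, activation='relu'),
    Dropout(0.5),
    Dense(7, activation='softmax'),   # probabilities over the seven emotions
])

# Categorical cross-entropy loss with the Adam optimizer, as described above.
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()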

5.1 SOFTWARE ENVIRONMENT:

5.1.1 What is Python:


Python is currently the most widely used multi-purpose, high-level programming language. Python allows programming in object-oriented and procedural paradigms. Python programs are generally smaller than those written in other languages such as Java; programmers have to type relatively little, and the indentation requirements of the language keep the code readable. Python is used by almost all tech-giant companies, such as Google, Amazon, Facebook, Instagram, Dropbox, and Uber. The biggest strength of Python is its huge collection of standard libraries, which can be used for the following:

● Machine Learning
● GUI Applications (like Kivy, Tkinter, PyQt, etc.)
● Web frameworks like Django (used by YouTube, Instagram, Dropbox)
● Image processing (like OpenCV, Pillow)
● Test frameworks

5.1.2 Modules and Framework:


TensorFlow and Keras
TensorFlow is an open-source deep learning framework developed by Google, and Keras is a high-level API
built on top of TensorFlow. These libraries are used in this project to design and train a Convolutional Neural
Network (CNN) for facial emotion recognition. TensorFlow efficiently handles GPU-accelerated
computations, making it suitable for large-scale machine learning applications. The key features include:

• Built-in functions for designing and training deep learning models.
• Support for GPU acceleration for high-speed computations.
• Pre-trained models and layers for quick deployment.
• Automatic differentiation and optimization techniques for better performance.

NumPy
NumPy is a versatile package for handling arrays. It offers a high-performance multidimensional array object along with tools for working with these arrays, and it is the fundamental package for scientific computing in Python. Its features include sophisticated (broadcasting) functions, tools for integrating C/C++ and Fortran code, and useful linear algebra, Fourier transform, and random-number capabilities. Beyond its scientific uses, NumPy is also an efficient multi-dimensional container for generic data; because it can define arbitrary data types, it integrates quickly and easily with a wide range of databases. In this project, NumPy is primarily used for:
• Processing image pixel data in numerical formats.
• Performing the mathematical operations required for CNN computations.

Pandas

Pandas is an open-source Python library that offers high-performance data manipulation and analysis capabilities through its robust data structures. Before Pandas, Python was used mainly for data preparation and munging and contributed little to data analysis itself; Pandas closed this gap. Regardless of the source of the data, Pandas supports the five common steps of data processing and analysis: load, prepare, manipulate, model, and analyze. It is used across academic and professional domains including finance, economics, statistics, and analytics.

In this project, Pandas is used for handling and analyzing structured data, particularly where:

• Emotion labels are mapped to their corresponding music playlists.
• CSV files or datasets need to be loaded and processed.
• DataFrames are used to organize and manipulate data efficiently.

OpenCV-Python

OpenCV (Open-Source Computer Vision Library) is an open-source computer vision and machine learning
software library designed to provide a common infrastructure for computer vision applications. It offers a
wide range of tools and functionalities to perform real-time computer vision tasks and image processing.
OpenCV provides a plethora of functions for basic and advanced image processing tasks such as resizing,
filtering, morphological operations, thresholding, and contour detection. It offers various algorithms and
modules for object detection, feature extraction, segmentation, and recognition. For instance, OpenCV
provides Haar cascades and deep learning-based models for face detection and recognition. It offers
integration with machine learning libraries like TensorFlow and PyTorch, allowing users to build and deploy
machine learning models for computer vision tasks.
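For illustration, the following is a minimal face-detection sketch using one of OpenCV's bundled Haar cascades; the cascade file name is the one shipped with OpenCV, and the actual project may use a different detector or preprocessing.

import cv2

# Load the frontal-face Haar cascade shipped with OpenCV.
cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
face_cascade = cv2.CascadeClassifier(cascade_path)

# Read one frame from the default webcam and detect faces in it.
cap = cv2.VideoCapture(0)
ret, frame = cap.read()
cap.release()

if ret:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
    for (x, y, w, h) in faces:
        # Each detection is a bounding box; the resized face crop is what would be fed to the CNN.
        face_crop = cv2.resize(gray[y:y + h, x:x + w], (48, 48))
        print("Detected a face at", (x, y, w, h), "crop shape:", face_crop.shape)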

Spotipy API

Spotipy is a lightweight Python library that provides an easy-to-use wrapper for the Spotify Web API,
enabling seamless interaction with Spotify's vast music database. It allows developers to authenticate users,
search for songs, fetch playlists, retrieve track details, and control music playback programmatically. In this
project, Spotipy is used to access curated “emotion-based playlists”, ensuring that the system plays music
that aligns with the detected facial emotions. By integrating Spotipy, the application enhances the user
experience by automatically selecting mood-appropriate songs based on real-time emotion recognition.

Flask

Flask is a lightweight and versatile web framework for Python that facilitates the creation of web applications
and APIs. Flask is known for its simplicity and minimalism, offering a straightforward yet powerful way to
build web applications without imposing strict rules or dependencies. It provides a simple mechanism for
defining routes (URLs) and associating them with functions (views) that handle requests. This makes it easy
to create different endpoints that respond to various HTTP methods (GET, POST, etc.). Although Flask
comes with a built-in development server, it's also deployable on various production servers like Gunicorn,
uWSGI, or integrating with services like Heroku or AWS for hosting web applications.
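As a sketch of how a Flask route could expose the detection result to the front end, the snippet below uses an illustrative route name and a placeholder helper function; it is not the project's actual app.py.

from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical helper: in the real system this would run the webcam capture
# and CNN classifier; here it returns a fixed value purely for illustration.
def detect_current_emotion():
    return "Happy"

@app.route("/emotion", methods=["GET"])
def emotion():
    detected = detect_current_emotion()
    # The front end can poll this endpoint and update the playlist shown to the user.
    return jsonify({"emotion": detected})

if __name__ == "__main__":
    app.run(debug=True)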

5.1.3 ALGORITHMS:
Convolutional Neural Networks (CNN): Convolutional Neural Networks (CNNs) are widely used for
emotion classification from images or videos. CNNs extract spatial features from facial expressions and learn
hierarchical patterns for better classification. The model processes input images through multiple
convolutional layers, capturing essential facial features.

Softmax Activation Function: The Softmax activation function is used in the final output layer of the CNN
for multi-class emotion classification, ensuring that the predicted probabilities sum up to 1 across all emotion
categories.

Adam Optimizer: The Adam optimizer (Adaptive Moment Estimation) is utilized for training the model. It
combines the advantages of momentum and RMSprop optimizers, ensuring adaptive learning rates for
different parameters. This helps in faster convergence and improved model performance while reducing
training instability.
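As a quick worked example of the softmax step (the raw scores below are illustrative), the scores from the final layer are exponentiated and normalized so that the seven emotion probabilities sum to 1 and the largest one gives the predicted emotion:

import numpy as np

# Illustrative raw scores (logits) for the seven emotion classes.
logits = np.array([2.0, 0.5, 0.1, 3.2, 1.0, 0.3, 0.8])
labels = ["Angry", "Disgusted", "Fearful", "Happy", "Neutral", "Sad", "Surprised"]

# Softmax: subtract the max for numerical stability, exponentiate, normalize.
exp_scores = np.exp(logits - logits.max())
probs = exp_scores / exp_scores.sum()

for label, p in zip(labels, probs):
    print(f"{label:<10} {p:.3f}")
print("sum =", probs.sum())  # ~1.0; the argmax ("Happy" here) is the predicted emotion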

5.2 METHODOLOGY
The complete methodology of the project is based on,

1. Requirement Gathering: Identified the need for an emotion-based music recommendation system using
facial expression recognition. Defined functional requirements such as emotion detection, feature extraction,
and playlist recommendation. Non-functional requirements like accuracy, real-time processing, and seamless
API integration were also determined.
2. Research and Planning: Conducted research on CNN-based emotion recognition models and available
libraries such as TensorFlow, Keras, and OpenCV. Planned the system architecture, outlining key components:
webcam input, deep learning model, Spotify API integration, and user interface. Selected Anaconda as the
development environment and finalized the technology stack.

3. Designing the System: Designed data flow diagrams and sequence diagrams to define the system's
workflow. Specified modules such as image acquisition, emotion classification, and playlist retrieval. Designed
the user interface to display detected emotions and music recommendations.

4. Implementation:
Developed Python scripts to implement:

• OpenCV for real-time webcam image capture.
• TensorFlow and Keras CNN model for emotion detection.
• Flask to create a lightweight web interface for user interaction.
• Spotipy API to fetch and recommend playlists based on detected emotions.

5. Testing: Conducted unit tests for each module to ensure correct functioning. Performed integration testing to
verify seamless interaction between the CNN model, webcam, and Spotify API. Tested the system under
various conditions such as different lighting, facial angles, and expressions to ensure accuracy and robustness.

6. Deployment: Deployed the application on Anaconda for development and testing. Configured the system to
work on local devices with real-time processing. Ensured smooth user experience by optimizing performance
and providing necessary documentation for usage.

5.3 Process
The Emotion-Based Music Recommendation System consists of multiple interconnected Python scripts that
work together to achieve real-time facial expression recognition and dynamic music recommendations. Below
is a detailed breakdown of the process:

Real-time Face and Emotion Detection:


The script "capture_emotions.py" is responsible for capturing real-time facial expressions using OpenCV from a
live webcam feed.It detects human faces within the frame and extracts relevant facial features necessary for
21
classification.The extracted facial data is passed to a deep learning-based Convolutional Neural Network
(CNN) model, built using TensorFlow and Keras, to classify emotions.
The model categorizes facial expressions into predefined emotions such as:
• Happy (suggests upbeat songs)
• Sad (suggests soothing or slow songs)
• Neutral (suggests balanced playlist)
• Angry (suggests calm and relaxing music)
• Surprised (suggests exciting or energetic music)
The system continuously processes frames from the webcam, updating emotion predictions in real time.
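A condensed sketch of what such a capture loop might look like is given below; the model file name and the Haar-cascade-based face detector are assumptions based on the description above, not verbatim project code.

import cv2
import numpy as np
from keras.models import load_model

EMOTIONS = ["Angry", "Disgusted", "Fearful", "Happy", "Neutral", "Sad", "Surprised"]

# Assumed file names; the project's actual detector and trained-model paths may differ.
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
model = load_model("emotion_model.h5")

cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
        # Prepare the face crop the same way the training images were prepared.
        face = cv2.resize(gray[y:y + h, x:x + w], (48, 48)) / 255.0
        probs = model.predict(face.reshape(1, 48, 48, 1), verbose=0)[0]
        emotion = EMOTIONS[int(np.argmax(probs))]
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, emotion, (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
    cv2.imshow("Emotion detection", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press 'q' to stop
        break
cap.release()
cv2.destroyAllWindows()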

Feature Extraction and Emotion Classification:


The "emotion_classifier.py" script further processes the extracted facial data to refine the emotion classification.
The CNN model generates a confidence score for each detected emotion, ensuring the accuracy of
classification. It employs pre-trained models such as MobileNetV2 or ResNet for efficient and accurate feature
extraction. The model applies data augmentation techniques to improve generalization across different lighting
conditions, facial angles, and occlusions (e.g., glasses or masks).

Music Recommendation using Spotipy API:


The "music_recommendation.py" script utilizes Spotify's Spotipy API to fetch and suggest appropriate music
playlists based on detected emotions. The system maps each emotion to a predefined genre or playlist,
ensuring an engaging and emotion-responsive music selection:

• Happy → Pop, Dance, Party Hits
• Sad → Soft Rock, Acoustic, Piano Instrumentals
• Neutral → Lo-Fi, Chill Beats, Relaxing Music
• Angry → Calm Jazz, Classical, Meditation Sounds
• Surprised → EDM, Hip-Hop, Upbeat Songs

The Spotify API fetches real-time playlist updates, ensuring users get fresh and dynamic song
recommendations. Users can either accept the recommendations or request a different playlist based on mood
adjustments.
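A small sketch of how such a mapping could be expressed with Spotipy is shown below; the search-based playlist lookup is an assumption for illustration (the project's sample code later keys pre-selected playlist IDs instead), and the queries simply mirror the genre list above.

import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

# Emotion -> search query mapping, mirroring the genre list above.
EMOTION_TO_QUERY = {
    "Happy": "pop dance party hits",
    "Sad": "soft rock acoustic piano instrumentals",
    "Neutral": "lo-fi chill beats",
    "Angry": "calm jazz classical meditation",
    "Surprised": "edm hip-hop upbeat",
}

# With no arguments, SpotifyClientCredentials reads the SPOTIPY_CLIENT_ID and
# SPOTIPY_CLIENT_SECRET environment variables.
sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials())

def playlists_for_emotion(emotion, limit=5):
    # Return (name, URL) pairs of public playlists matching the detected emotion.
    query = EMOTION_TO_QUERY.get(emotion, "chill")
    results = sp.search(q=query, type="playlist", limit=limit)
    return [(p["name"], p["external_urls"]["spotify"])
            for p in results["playlists"]["items"]]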

Flask Web Interface for User Interaction:


The "app.py" script implements a Flask-based web application, allowing users to interact with the system. The
web interface displays real-time detected emotions along with corresponding music suggestions. The system
provides interactive controls for users to browse and play recommended songs directly from the web
application. The Flask backend manages HTTP requests, interacts with the Spotify API, and dynamically
updates the displayed content based on emotion detection results.

System Deployment and Performance Optimization:


The complete project is executed in an Anaconda environment, ensuring efficient package and dependency management. The Flask web application, TensorFlow-based deep learning model, and Spotify API are
seamlessly integrated to create a smooth user experience.

Testing and Optimization:


• The model undergoes extensive testing to evaluate its emotion detection accuracy and real-time performance.
• Various scenarios, including different lighting conditions, facial orientations, and expressions, are tested to enhance robustness.
• Performance metrics such as latency, accuracy, and computational efficiency are analyzed to ensure smooth operation.

The system is deployed on a local machine or cloud platform, allowing access from multiple devices.

SYSTEM TESTING

The purpose of testing is to discover errors. Testing is the process of trying to discover every conceivable fault
or weakness in a work product. It provides a way to check the functionality of components, sub-assemblies,
assemblies and/or a finished product. It is the process of exercising software with the intent of ensuring that the
Software system meets its requirements and user expectations and does not fail in an unacceptable manner.
There are various types of tests. Each test type addresses a specific testing requirement.

6.1 TYPES OF TESTS


6.1.1 Unit Testing
Unit testing involves the design of test cases that validate that the internal program logic is functioning properly,
and that program inputs produce valid outputs. All decision branches and internal code flow should be
validated. It is the testing of individual software units of the application. It is done after the completion of an
individual unit before integration. This is a structural testing that relies on knowledge of its construction and is
invasive. Unit tests perform basic tests at component level and test a specific business process, application,
and/or system configuration. Unit tests ensure that each unique path of a business process performs
accurately to the documented specifications and contains clearly defined inputs and expected results.

6.1.2 Integration Testing


Integration tests are designed to test integrated software components to determine if they actually run as one
program. Testing is event driven and is more concerned with the basic outcome of screens or fields.
Integration tests demonstrate that although the components were individually satisfactory, as shown by
successfully unit testing, the combination of components is correct and consistent. Integration testing is
specifically aimed at exposing the problems that arise from the combination of components.

6.1.3 Functional Testing


Functional tests provide systematic demonstrations that functions tested are available as specified by the
business and technical requirements, system documentation, and user manuals.
Functional testing is centered on the following items:
Valid Input: identified classes of valid input must be accepted.
Invalid Input: identified classes of invalid input must be rejected.
Functions: identified functions must be exercised.
Output: identified classes of application outputs must be exercised.
Systems/Procedures: interfacing systems or procedures must be invoked.

Organization and preparation of functional tests is focused on requirements, key functions, or special test
cases. In addition, systematic coverage pertaining to identifying Business process flows; data fields, predefined
processes, and successive processes must be considered for testing. Before functional testing is complete,
additional tests are identified and the effective value of current tests is determined.

6.1.4 System Testing


System testing ensures that the entire integrated software system meets requirements. It tests a configuration
to ensure known and predictable results. An example of system testing is the configuration-oriented system
integration test. System testing is based on process descriptions and flows, emphasizing pre-driven process
links and integration points.

6.1.5 White Box Testing


White box testing is testing in which the software tester has knowledge of the inner workings, structure and
language of the software, or at least its purpose. It is used to test areas that cannot be reached from a
black-box level.

6.1.6 Black Box Testing


Black box testing is testing the software without any knowledge of the inner workings, structure or language of
the module being tested. Black box tests, as most other kinds of tests, must be written from a definitive source
document, such as a specification or requirements document. It is testing in which the software under test is
treated as a black box: you cannot “see” into it. The test provides inputs and responds to outputs without
considering how the software works.

6.1.7 Acceptance Testing


User acceptance testing is a critical phase of any project and requires significant participation by the end user.
It also ensures that the system meets the functional requirements.

6.2 Test Strategy and Approach
Field testing will be performed manually and functional tests will be written in detail.
Test Objectives
● All field entries must work properly.
● Pages must be activated from the identified link.
● The entry screen, messages and responses must not be delayed.
Features to be tested
● Verify that the entries are of the correct format
● No duplicate entries should be allowed
● All links should take the user to the correct page.

6.3 Test Cases


All the test cases derived from the features listed above passed successfully; no defects were encountered.

SAMPLE CODE

Spotipy.py
import spotipy
import spotipy.oauth2 as oauth2
from spotipy.oauth2 import SpotifyOAuth
from spotipy.oauth2 import SpotifyClientCredentials
import pandas as pd
import time

# Spotify client ID and client secret are supplied here (left blank in this listing).
auth_manager = SpotifyClientCredentials('', '')
sp = spotipy.Spotify(auth_manager=auth_manager)

def getTrackIDs(user, playlist_id):
    # Return the track IDs of every item in the given user's playlist.
    track_ids = []
    playlist = sp.user_playlist(user, playlist_id)
    for item in playlist['tracks']['items']:
        track = item['track']
        track_ids.append(track['id'])
    return track_ids

def getTrackFeatures(id):
    # Fetch display metadata (name, album, artist) for a single track.
    track_info = sp.track(id)
    name = track_info['name']
    album = track_info['album']['name']
    artist = track_info['album']['artists'][0]['name']
    # release_date = track_info['album']['release_date']
    # length = track_info['duration_ms']
    # popularity = track_info['popularity']
    track_data = [name, album, artist]  # , release_date, length, popularity
    return track_data

# Emotion index -> label predicted by the emotion detection model.
emotion_dict = {0: "Angry", 1: "Disgusted", 2: "Fearful", 3: "Happy",
                4: "Neutral", 5: "Sad", 6: "Surprised"}

# Emotion index -> Spotify playlist ID used for the recommendations.
music_dist = {0: "0l9dAmBrUJLylii66JOsHB?si=e1d97b8404e34343",
              1: "1n6cpWo9ant4WguEo91KZh?si=617ea1c66ab6446b",
              2: "4cllEPvFdoX6NIVWPKai9I?si=dfa422af2e8448ef",
              3: "0deORnapZgrxFY4nsKr9JA?si=7a5aba992ea14c93",
              4: "4kvSlabrnfRCQWfN0MgtgA?si=b36add73b4a74b3a",
              5: "1n6cpWo9ant4WguEo91KZh?si=617ea1c66ab6446b",
              6: "37i9dQZEVXbMDoHDwVN2tF?si=c09391805b6c4651"}
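As a usage sketch (not part of the original listing), the two helpers above can be combined to build the table that the web application displays; the 'spotify' user ID, the choice of emotion index 3 ("Happy"), and the DataFrame construction are assumptions for illustration.

# Illustrative use of getTrackIDs/getTrackFeatures to build a recommendation table.
playlist_id = music_dist[3].split('?')[0]           # strip the ?si=... share suffix
track_ids = getTrackIDs('spotify', playlist_id)     # 'spotify' is a placeholder user
rows = [getTrackFeatures(tid) for tid in track_ids[:15]]
df = pd.DataFrame(rows, columns=['Name', 'Album', 'Artist'])
print(df.head())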

Train.py
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.optimizers import Adam
from keras.preprocessing.image import ImageDataGenerator

# Training and validation directories: one sub-folder per emotion class.
train_dir = 'data/train'
val_dir = 'data/test'

# Rescale pixel values from [0, 255] to [0, 1].
train_datagen = ImageDataGenerator(rescale=1./255)
val_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(48, 48),
    batch_size=64,
    color_mode="grayscale",
    class_mode='categorical')

val_generator = val_datagen.flow_from_directory(
    val_dir,
    target_size=(48, 48),
    batch_size=64,
    color_mode="grayscale",
    class_mode='categorical')

# CNN: stacked Conv2D/MaxPooling blocks followed by a dense classifier
# ending in a 7-way softmax (one output per emotion class).
emotion_model = Sequential()
emotion_model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(48, 48, 1)))
emotion_model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Dropout(0.25))
emotion_model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Dropout(0.25))
emotion_model.add(Flatten())
emotion_model.add(Dense(1024, activation='relu'))
emotion_model.add(Dropout(0.5))
emotion_model.add(Dense(7, activation='softmax'))

# Note: newer Keras versions use learning_rate instead of lr.
emotion_model.compile(loss='categorical_crossentropy',
                      optimizer=Adam(lr=0.0001, decay=1e-6),
                      metrics=['accuracy'])

# 28709 training and 7178 validation images, batch size 64.
emotion_model_info = emotion_model.fit_generator(
    train_generator,
    steps_per_epoch=28709 // 64,
    epochs=75,
    validation_data=val_generator,
    validation_steps=7178 // 64)

emotion_model.save_weights('model.h5')
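Although camera.py is not reproduced in this report, a minimal inference sketch, assuming the same architecture rebuilt as emotion_model, the saved model.h5 weights, and the emotion_dict mapping from Spotipy.py, could look like the following; the OpenCV Haar-cascade face detector is an assumption made for illustration.

# Illustrative real-time inference using the weights saved by Train.py.
import cv2
import numpy as np

emotion_model.load_weights('model.h5')
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

frame = cv2.VideoCapture(0).read()[1]                  # grab one webcam frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
    roi = cv2.resize(gray[y:y + h, x:x + w], (48, 48)) / 255.0
    roi = np.expand_dims(np.expand_dims(roi, -1), 0)   # shape (1, 48, 48, 1)
    prediction = emotion_model.predict(roi)
    print(emotion_dict[int(np.argmax(prediction))])    # e.g. "Happy"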

App.py
from flask import Flask, render_template, Response, jsonify
import gunicorn
from camera import *

app = Flask(__name__)

# Column headings for the recommendation table shown on the page.
headings = ("Name", "Album", "Artist")
df1 = music_rec()
df1 = df1.head(15)

@app.route('/')
def index():
    # Render the home page with the current recommendation table.
    print(df1.to_json(orient='records'))
    return render_template('index.html', headings=headings, data=df1)

def gen(camera):
    # Stream annotated webcam frames and keep the recommendation table
    # in sync with the most recently detected emotion.
    while True:
        global df1
        frame, df1 = camera.get_frame()
        yield (b'--frame\r\n'
               b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n\r\n')

@app.route('/video_feed')
def video_feed():
    # MJPEG stream consumed by the front-end video element.
    return Response(gen(VideoCamera()),
                    mimetype='multipart/x-mixed-replace; boundary=frame')

@app.route('/t')
def gen_table():
    # Return the current recommendations as JSON for table refreshes.
    return df1.to_json(orient='records')

if __name__ == '__main__':
    app.debug = True
    app.run()
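For local testing, the application can typically be started with python App.py, which runs Flask's development server; for the cloud deployment described earlier, a WSGI server such as gunicorn (for example, gunicorn App:app) is the usual choice. The exact command depends on the environment and is given here only as a hedged example.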

OUTPUT SCREENSHOTS

CONCLUSION AND FUTURE SCOPE

9.1 CONCLUSION

The Mood-Based Music Recommendation System effectively integrates emotion detection with AI-driven music
selection to create a personalized and therapeutic music experience. Unlike conventional music
recommendation systems that rely on past listening history, this approach analyzes real-time user emotions
using facial expressions, voice tone, and optionally physiological signals.
By leveraging machine learning (ML) and deep learning (DL) techniques, the system accurately classifies
emotions and recommends music that aligns with the user's mood. The music recommendation engine uses
content-based and collaborative filtering, dynamically adjusting to emotional fluctuations to enhance user
satisfaction.
The system’s real-time feedback loop ensures continuous monitoring of user reactions, allowing adaptive
playlist modifications for better emotional alignment. Its deployment as a mobile or web application with cloud-
based integration enhances accessibility and personalization over time.
In conclusion, this project offers a novel approach to mood-based music recommendations, improving the
emotional well-being of users. Future enhancements could include multi-modal emotion recognition, user
preference learning over time, and integration with wearable devices for even more precise recommendations.

9.2 FUTURE SCOPE

The Mood-Based Music Recommendation System has significant potential for future advancements, leveraging
emerging technologies to enhance accuracy, personalization, and user experience. Below are some key areas
for future development:
1. Multi-Modal Emotion Recognition
 Integration of multiple emotion detection methods (facial expressions, voice tone, text sentiment, and
physiological signals like heart rate).
 Use of EEG-based emotion recognition for a more precise understanding of user moods.
2. Advanced Machine Learning & AI Models
 Implementation of deep learning architectures (e.g., CNNs, RNNs, and Transformer models) for
improved accuracy in mood classification.
 Use of reinforcement learning to dynamically refine recommendations based on user feedback.
3. Integration with Smart Wearables & IoT Devices
 Connecting with smartwatches and fitness bands (e.g., Apple Watch, Fitbit) to monitor heart rate and
stress levels for better mood detection.
 Incorporation of IoT-enabled smart speakers for seamless music playback based on real-time mood
changes.
4. Cross-Platform & Multi-Device Compatibility
 Developing a mobile and web-based ecosystem that synchronizes music recommendations across
multiple devices.
 Integration with voice assistants (Google Assistant, Siri, Alexa) for hands-free control.
5. Real-Time Adaptive Playlists
 Implementing a dynamic recommendation engine that adjusts playlists in real-time based on user
mood variations.
 Creating mood transition playlists to help users shift from negative to positive emotional states.
6. Personalized Therapy & Mental Health Applications
 Integration with mental health applications to provide music therapy for stress, anxiety, and depression.
 Collaboration with psychologists and wellness platforms for medically validated music therapy
solutions.
7. Social & Community-Based Features
 Allowing users to share mood-based playlists with friends.
 Community-based collaborative filtering where users with similar emotional patterns get recommended
music based on shared experiences.
8. Support for Multiple Languages & Cultural Preferences
 Expanding music databases to include regional and international music preferences for a diverse user
base.
 Incorporating natural language processing (NLP) for emotion detection in multilingual text messages.
9. AR/VR Integration for Immersive Experiences
 Developing Augmented Reality (AR) and Virtual Reality (VR)-based music environments where users
can interact with music dynamically.
 AI-powered visual effects that sync with the music to enhance emotional impact.
By implementing these future enhancements, the Mood-Based Music Recommendation System can evolve
into a highly intelligent, personalized, and therapeutic platform that improves user engagement and emotional
well-being.

REFERENCES
[1] M. Keerthana, M. Shruthi, and S. Aravind Kumar, "Emotion Based Music Recommendation System,"
International Journal of Creative Research Thoughts (IJCRT), vol. 9, no. 6, pp. e356–e360, June 2021.

[2] Anand R., Sabeenian R.S., Deepika Gurang, and Kirthika R., "AI-based Music Recommendation System
using Deep Learning Algorithms," International Journal of Creative Research Thoughts (IJCRT), vol. 8, no. 5,
pp. 1234–1240, May 2020.

[3] Sriraj Katkuri, Mahitha Chegoor, K.C. Sreedhar, and M. Sathyanarayana, "Emotion Based Music
Recommendation System," International Journal of Advanced Research in Computer and Communication
Engineering, vol. 9, no. 5, pp. 45–50, May 2020.

[4] Saurav Joshi, Tanuj Jain, and Nidhi Nair, "Emotion Based Music Recommendation System Using LSTM-
CNN Architecture," International Journal of Creative Research Thoughts (IJCRT), vol. 11, no. 2, pp. 678–685,
February 2023.

[5] Tina Babu, Rekha R. Nair, and Geetha A., "Emotion-Aware Music Recommendation System: Enhancing
User Experience Through Real-Time Emotional Context," arXiv preprint arXiv:2311.10796, November 2023.

[6] Xinyu Chang, Xiangyu Zhang, Haoruo Zhang, and Yulu Ran, "Music Emotion Prediction Using Recurrent
Neural Networks," arXiv preprint arXiv:2405.06747, May 2024.

[7] Ramiz Mammadli, Huma Bilgin, and Ali Can Karaca, "Music Recommendation System based on Emotion,
Age and Ethnicity," arXiv preprint arXiv:2212.04782, December 2022.

[8] Erkang Jing, Yezheng Liu, Yidong Chai, Shuo Yu, Longshun Liu, Yuanchun Jiang, and Yang Wang,
"Emotion-aware Personalized Music Recommendation with a Heterogeneity-aware Deep Bayesian Network,"
arXiv preprint arXiv:2406.14090, June 2024.

[9] Florence and Uma, "Emotion-Based Music Recommendation System Using Facial Expression Analysis,"
International Journal of Research Publication and Reviews, vol. 5, no. 11, pp. 7876–7882, November 2024.

[10] Patel et al., "Emotion-Based Music Player Using CNN for Facial Emotion Recognition," International
Journal of Research Publication and Reviews, vol. 5, no. 11, pp. 7883–7890, November 2024.

Nadimpalli Satyanarayana Raju Institute of Technology
(AUTONOMOUS)
(Permanently to JNTU-GV, Vizianagaram, Approved by AICTE, New Delhi)

Sontyam, Visakhapatnam-531173

Department of Computer Science and Engineering (AIML)

Project Guide: MRS. J. SANTOSHI KUMARI, SR Assistant Professor


Along with Project Team Members

