MTE Project Report

This project report presents a study on a movie recommendation system utilizing content-based filtering and sentiment analysis to enhance personalized suggestions. The work, conducted by Faizan Rasool, Surendra Kumar, and Suraj Kumar under Dr. Swati Singh's supervision at Galgotias University, outlines the project's objectives, methodologies, and the significance of integrating user sentiment into recommendation algorithms. The report includes a comprehensive literature survey, software requirements, and a detailed project structure aimed at improving the quality and accuracy of movie recommendations.

A Project Report

On

A Study on an Education Website


Submitted in partial fulfillment of the requirement for the award of the degree of

MASTER OF COMPUTER APPLICATION

Session 2024-25


By

Faizan Rasool Surendra Kumar Suraj Kumar


[23SCSE2030295] [23SCSE2030031] [23SCSE2030057]

Under the guidance of


Dr. Swati Singh

SCHOOL OF COMPUTER APPLICATIONS AND TECHNOLOGY


GALGOTIAS UNIVERSITY, GREATER NOIDA
INDIA
April, 2025
SCHOOL OF COMPUTER APPLICATIONS AND
TECHNOLOGY

GALGOTIAS UNIVERSITY, GREATER NOIDA

CANDIDATE’S DECLARATION

We hereby certify that the work presented in the project entitled “A
Study on an Education Website”, in partial fulfillment of the requirements for the
award of the MCA (Master of Computer Application) and submitted to the School of
Computer Applications and Technology of Galgotias University, Greater Noida, is
an original work carried out from March 2025 to April 2025 under
the supervision of Dr. Swati Singh, School of Computer Applications and
Technology, Galgotias University, Greater Noida.

The matter presented in the project has not been submitted by us for the award
of any other degree at this or any other institution.

Faizan Rasool [23SCSE2030295]

Surendra Kumar [23SCSE2030031]

Suraj Kumar [23SCSE2030057]

This is to certify that the above statement made by the candidates is correct to
the best of my knowledge.

Dr. Swati Singh

Assistant Professor
SCHOOL OF COMPUTER APPLICATIONS AND
TECHNOLOGY

GALGOTIAS UNIVERSITY, GREATER NOIDA

CERTIFICATE

This is to certify that the Project Report entitled “A Study on an Education Website”,
submitted by Faizan Rasool, Surendra Kumar and Suraj Kumar in partial
fulfillment of the requirement for the award of the degree of MCA in the School
of Computer Applications and Technology, Galgotias University, Greater Noida,
India, is a record of the candidates' own work carried out by them under my
supervision. The matter embodied in this report is original and has not been
submitted for the award of any other degree.

Signature of Examiner(s) Signature of Supervisor(s)

Date: April, 2025

Place: Greater Noida


TABLE OF CONTENTS

Content Page No.

List of Figures ----------------------------------------------------------------- 05


List of Abbreviations --------------------------------------------------------- 06
Acknowledgement ----------------------------------------------------- 07
Chapter 1:
Abstract ------------------------------------------------------------------ 08
Introduction ------------------------------------------------------------- 09
Chapter 2:
Literature Survey ------------------------------------------------------ 11
Chapter 3:
Software Requirement Specification ----------------------------- 15
Chapter 4:
System Analysis and Design --------------------------------------- 17
Chapter 5:
Implementation and Result ----------------------------------------- 23
Chapter 6:
Conclusion -------------------------------------------------------------- 26
Chapter 7:
Reference --------------------------------------------------------------- 28

LIST OF FIGURES

Figure 1: Architecture of Content Based Approach

Figure 2: Project Flow Diagram

Figure 3: Architecture of the whole Project

Figure 4: Use-case Diagram

LIST OF ABBREVIATIONS

KNN: K-Nearest Neighbor

DBSCAN: Density-Based Spatial Clustering of Applications with Noise

EDA: Exploratory Data Analysis

RMSE: Root Mean Squared Error

Acknowledgement

We are extremely grateful to all those who helped us complete this project, and we
ask them to accept this acknowledgment.

First, we express our profound gratitude to Mr. Murari Krishna Saha for his constant
support, useful guidance, and constructive feedback throughout the entire
project.

We would like to extend our heartfelt thanks to the faculty members whose collaboration
and constructive suggestions were of immense value in the development process. Their
technical as well as theoretical assistance greatly smoothed the progress of the project.

We would like to thank Galgotias University for providing us with the necessary
resources, tools, and infrastructure for carrying out this project.

We would like to thank our families and friends for their support and motivation, which
kept us going throughout the project.

This project, a movie recommendation system, would not have been possible without
the combined efforts and contributions of everyone involved. Thank you all for your
encouragement and belief in our work.

CHAPTER 1

ABSTRACT

There is so much content available on the internet these days that it is crucial to
present movie suggestions, so that users do not have to spend a lot of time looking for
titles they might enjoy. A movie recommendation system is therefore essential for
providing users with tailored movie recommendations. After conducting extensive online
research and consulting numerous academic papers, we found that recommendations
produced with content-based filtering typically rely on a single technique to convert
text to vectors and a single technique to measure the similarity between those vectors.
In this study, several text-to-vector conversion methods were employed, and the final
recommendation list was obtained by combining the output of several algorithms.

Keywords: Content-based Filtering, Vector similarity, Hybrid approach, Movie
Recommendations, Text to vector

INTRODUCTION

A system that attempts to forecast a user's preferences and provide suggestions based
on those preferences is called a recommendation system or recommendation engine.
These systems are now widely employed in a variety of domains, including restaurants,
food, entertainment, movies, music, books, videos, apparel, and other utilities. They
gather data about user interests and behaviour to make better recommendations in the
future. Movies are a common part of everyday life and span many genres, such as
action, horror, comedy, thriller, educational films, and children's animation. Films
can be readily distinguished by genre, and also by other attributes such as director,
language, or year of release. When we watch movies online, the catalogue to search
through is very large. Movie recommendation systems save us the time of searching by
helping us find films we like among all of these different kinds. A movie
recommendation system therefore needs to be highly dependable and to suggest films
that closely match our tastes. Many businesses use recommendation systems to improve
customer engagement and enhance the purchasing experience; increased revenue and
customer satisfaction are the main benefits. The movie recommendation system is one
important and effective tool of this kind. However, a purely collaborative approach
suffers from low suggestion quality and scalability issues in movie recommendation
systems.

Problem Statement

In the current digital media landscape, the sheer number of films available on streaming
services has surged dramatically, complicating the process for users to find content
that resonates with their preferences. Conventional recommendation systems
frequently fall short in addressing the intricate tastes of individual users, as they
predominantly depend on collaborative filtering, which may overlook specific user
inclinations or newly released content. Furthermore, user reviews and ratings, which
provide critical insights into viewer opinions, are often inadequately utilized in crafting
personalized recommendations.

Objective

The aim of this project is to create a comprehensive movie recommendation system
that utilizes content-based filtering and sentiment analysis to improve the
personalization and relevance of suggestions. The system intends to:

• Examine and recommend films based on various attributes, including genres,
cast, directors, plot summaries, and other relevant metadata.
• Integrate sentiment analysis of user reviews to gauge audience feelings towards
films and further enhance recommendations.
• Ensure a smooth and user-friendly experience, allowing users to receive highly
customized movie suggestions that reflect their preferences and emotional
responses.
• Showcase the effectiveness of merging content-based filtering with sentiment
analysis in achieving superior recommendation accuracy compared to traditional
methods.

This strategy will not only assist users in discovering films they are likely to appreciate
but also lay the groundwork for future advancements, such as hybrid recommendation
systems.

Scope of the Project

The goal of this project is to give people accurate movie suggestions. Its objective
is to improve the quality, scalability, and accuracy of the movie recommendation
system in comparison to pure (single-technique) approaches. This is accomplished with
a hybrid strategy that integrates sentiment analysis with content-based filtering.

CHAPTER 2

LITERATURE SURVEY

Numerous recommendation systems have been created over time using hybrid,
content-based, or collaborative filtering techniques, and many machine learning and
big data methods have been used to implement them.

Movie Recommendation System Using K-Nearest Neighbor and K-Means Clustering

A recommendation system gathers implicit or explicit information about a user's
interests in various products, such as movies. Implicit acquisition uses the viewer's
behavior while watching films, whereas explicit acquisition takes advantage of the
user's past ratings or history. Clustering is another supporting approach used in
building recommendation systems. It arranges a collection of items so that those in
the same cluster resemble one another more than those in other clusters. To obtain the
best-optimized outcome, K-Means Clustering and K-Nearest Neighbor are applied to the
MovieLens dataset. Whereas the conventional technique scatters the data and results in
a high number of clusters, the suggested approach gathers the data and produces fewer
clusters. The suggested scheme thereby optimizes the movie recommendation process and,
using a variety of criteria, forecasts the user's preferred film.
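
The combination of clustering and nearest-neighbour search described above can be
illustrated with a small sketch. It is not the method of the surveyed paper: the
rating vectors, the user names, and the choice of seeding the centroids with the
first k points are all assumptions made for illustration, and the algorithms are
written from scratch rather than taken from a library.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length rating vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_means(points, k, iterations=10):
    """Plain k-means. Centroids are seeded with the first k points (assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its closest centroid.
        labels = [min(range(k), key=lambda c: euclidean(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def neighbors_in_cluster(target, ratings, users, labels, n=1):
    """Return the n users closest to `target` among users in the same cluster."""
    cluster = labels[users.index(target)]
    peers = [u for u, label in zip(users, labels)
             if label == cluster and u != target]
    return sorted(peers, key=lambda u: euclidean(ratings[target], ratings[u]))[:n]

# Hypothetical ratings (1 to 5) for four movies: two action, two romance.
ratings = {
    "u1": [5, 4, 1, 1],
    "u2": [4, 5, 1, 2],
    "u3": [1, 1, 5, 4],
    "u4": [2, 1, 4, 5],
    "u5": [5, 5, 2, 1],
}
users = list(ratings)
labels = k_means([ratings[u] for u in users], k=2)
nearest = neighbors_in_cluster("u1", ratings, users, labels, n=1)
```

Clustering first narrows the neighbour search to users with broadly similar taste;
titles rated highly by the returned neighbours would then be candidate
recommendations for the target user.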

Movie Recommendation System Using Content Based Filtering


Content-based filtering algorithms make recommendations from the characteristics of
the items. This approach is used when information about an item, such as its identity,
location, or description, is known, but little is known about the user. Unlike
collaborative techniques, it makes predictions based only on the user's own
information and completely disregards the contributions of other users. It uses
information supplied by the user, whether explicitly or implicitly. The engine becomes
more accurate as the user provides more input or acts on more of the suggestions.
Every user is presumed to act independently in a content-based recommendation engine.
No information about other users is needed while analysing an item's attributes;
instead, the engine searches for similarities across items and recommends those most
comparable to the items the user already likes. For instance, every member of the
cast, the director, and the writer might be considered a feature if we looked at a
film's content. Items that are significantly similar to the ones the user rated highly
are suggested to them.
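
As a minimal sketch of this idea, each film below is described by a set of features
(director, cast, writer), and the films most similar to one the user liked are
returned. The catalogue, the feature labels, and the use of Jaccard overlap as the
similarity measure are assumptions chosen for illustration; the report itself uses
cosine similarity on text vectors.

```python
def jaccard(a, b):
    """Share of features two items have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def most_similar(liked, catalog, top_n=2):
    """Rank every other item by feature overlap with the liked item."""
    scores = {title: jaccard(catalog[liked], features)
              for title, features in catalog.items() if title != liked}
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]

# Hypothetical catalogue: every director, cast member and writer is a feature.
catalog = {
    "Movie A": {"director:D1", "cast:C1", "cast:C2", "writer:W1"},
    "Movie B": {"director:D1", "cast:C1", "cast:C3", "writer:W1"},
    "Movie C": {"director:D2", "cast:C2", "cast:C4", "writer:W2"},
    "Movie D": {"director:D3", "cast:C5", "writer:W3"},
}

suggestions = most_similar("Movie A", catalog)
```

No data about other users is involved: the ranking depends only on the item features
and on the single item this user liked.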

Sentiment Analysis:
A Movie Recommendation System with Sentiment-Based Filtering extends a generic
recommendation engine with sentiment analysis of user reviews or ratings. The idea
is to merge content-based or collaborative filtering with Natural Language Processing
(NLP) to evaluate how users feel about the films they have watched. Layering in this
sentiment allows recommendations to be generated not only from which movies the user
watched but also from how they felt about those films.

How does a Sentiment-Based Movie Recommendation Work?


Collect Reviews: Gather user reviews, ratings, or comments about the films.
Sentiment Analysis: Apply NLP methods to classify the sentiment of each review as
negative, neutral, or positive.
Profile Building: Build a user profile that records explicit ratings and the sentiment
inferred from reviews.
Recommendation Generation: Recommend movies by combining movie properties
(genres, directors, and cast) with the user's sentiment towards similar movies.
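
The four steps above can be sketched end to end as follows. The word lists, review
texts, genre labels, and scoring rule are all hypothetical, and the tiny lexicon
stands in for a real NLP sentiment model, so this is an illustration of the flow
rather than the system built in this project.

```python
# Toy sentiment lexicon (assumption); a real system would use an NLP model.
POSITIVE = {"great", "loved", "brilliant", "fun"}
NEGATIVE = {"boring", "bad", "awful", "dull"}

def classify_sentiment(review):
    """Step 2: label a review as positive, neutral or negative."""
    words = [w.strip(".,!?") for w in review.lower().split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def build_profile(reviews, genres):
    """Step 3: turn review sentiment into per-genre preference weights."""
    weight = {"positive": 1, "neutral": 0, "negative": -1}
    profile = {}
    for title, review in reviews.items():
        for genre in genres[title]:
            profile[genre] = profile.get(genre, 0) + weight[classify_sentiment(review)]
    return profile

def recommend(reviews, genres):
    """Step 4: score unseen movies by the user's feeling towards their genres."""
    profile = build_profile(reviews, genres)
    scores = {title: sum(profile.get(g, 0) for g in movie_genres)
              for title, movie_genres in genres.items() if title not in reviews}
    liked = [title for title in scores if scores[title] > 0]
    return sorted(liked, key=lambda t: (-scores[t], t))

# Step 1: collected reviews (hypothetical data).
reviews = {"Movie A": "Loved it, great action!", "Movie B": "Boring and dull."}
genres = {
    "Movie A": {"action", "thriller"},
    "Movie B": {"romance"},
    "Movie C": {"action"},
    "Movie D": {"romance", "drama"},
    "Movie E": {"drama"},
}

recommended = recommend(reviews, genres)
```

Here the positive review of an action film raises the score of the unseen action
film, while the negative review of a romance film keeps the unseen romance film out
of the list.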

The work by Goyani and Chaurasiya [1] surveys the methods used in movie
recommendation systems. It covers collaborative filtering, content-based filtering,
hybrid approaches, and deep learning techniques, and examines a number of similarity
metrics. It also notes how many businesses, including Facebook, LinkedIn, Pandora,
Netflix, and Amazon, use recommendation systems. The review offers a concise synopsis
of these approaches and provides useful direction for future recommendation system
research.

Zhang et al. [2] discuss the scalability of movie recommendation systems and
practical usage feedback. They propose a recommendation system based on user
clustering that is highly efficient: it has a lower time complexity than conventional
CF systems while performing comparably.

Ahuja et al. [3] focus on developing a movie recommender system using the K-Means
Clustering and K-Nearest Neighbor algorithms. The system is implemented in Python and
uses the MovieLens dataset. The paper introduces several machine learning concepts,
tools, and techniques, including Content-Based Filtering, KNN, K-Means Clustering, and
Collaborative Filtering, and describes the architecture, process flow, and pseudocode
of the proposed system. The reported results show that the proposed system outperforms
the state-of-the-art techniques, with a best RMSE value of 1.081648.

Choudhury et al. [4] address information overload in the context of movie suggestions
using recommender systems (RS). Four recommendation models are compared: BPNN, SVD,
DNN, and DNN with Trust. With the best accuracy of 83% and a low MSE of 0.74, the DNN
with Trust model is reported as a good option for movie suggestions.

Sahu et al. [5] propose a content-based movie recommendation system that takes a
number of variables into account. A CNN deep learning model predicts the popularity of
a film based on RS results, film ratings, and voting data. The study's reported
accuracy of 96.8% outperforms benchmark models.

Behera et al. [6] present a collaborative filtering approach for movie suggestions
that considers temporal effects. According to their evaluation on the MovieLens
dataset, it outperforms leading models, with improvements of 1.35% and 1.28% on the
ML-100K and ML-1M datasets, respectively.

Gupta et al. [7] use cosine similarity in conjunction with K-NN algorithms and
collaborative filtering. The strategy mitigates the drawbacks of content-based
filtering by combining the advantages of both approaches. The precision obtained using
cosine similarity is comparable to that of Euclidean distance.

Darban and Valipour [8] present a novel graph-based model that considers geography,
demographic information, and user similarities. Autoencoder-based feature extraction
addresses the cold-start problem and increases recommendation reliability.
The reported experimental results show the effectiveness of the proposed method.

CHAPTER 3
SOFTWARE REQUIREMENT SPECIFICATION
This chapter details the hardware and software specifications of the project.

Hardware Requirements

• A PC with Windows or macOS
• Processor with a clock speed of 2.40 GHz to 2.50 GHz
• Minimum of 4 GB RAM

Software Specification

• Text editor or notebook environment (VS Code / Jupyter Notebook / Google Colab)
• Python libraries
• A web browser
• A working internet connection

Software Requirements

Python libraries

We require specific Python packages in order to compute and analyze the data,
including scikit-learn (sklearn), NumPy, pandas, Matplotlib, and the Flask framework.

scikit-learn: This library includes a number of classification, regression, and
clustering algorithms, among them support vector machines, random forests, gradient
boosting, k-means, and DBSCAN, and is designed to interoperate with the Python
numerical and scientific libraries NumPy and SciPy.

NumPy: NumPy is a versatile library designed for array manipulation. It offers
a high-performance multidimensional array object along with various tools for handling
these arrays. This library is essential for scientific computing within the Python
ecosystem.

Pandas: Pandas is an open-source library built upon NumPy and is among the most
widely used Python libraries in data science. It delivers high-performance,
user-friendly data structures and analytical tools for manipulating numerical data and
time series, and is best known for making it easy to import and analyse data. Unlike
NumPy, which focuses on multi-dimensional arrays, Pandas introduces an in-memory
two-dimensional table structure known as DataFrame. Building on NumPy gives it speed
and high performance.

Flask: Flask is a micro-framework for Python that is lightweight and adaptable. It
provides a minimal set of tools for building web applications without imposing
any project structure or including ready-made components for everything, thus leaving
developers free to choose the libraries and tools that they need.

SciPy: SciPy is an open-source scientific and technical computing library for Python.
It includes modules for optimization, linear algebra, integration,
interpolation, special functions, FFT, signal and image processing, and ODE solvers,
alongside a number of other tasks commonplace in science and engineering.

CHAPTER 4

SYSTEM ANALYSIS AND DESIGN

System Architecture of Proposed System:

Fig 1: Architecture of Content Based Approach (diagram labels: User, Likes, Items A B C D,
Similar, Recommend)

Content-based filtering in recommender systems builds a model from a set of item
features and knowledge about the user in order to predict and recommend similar
products to customers. It recommends a product based on the known characteristics of
products and the decisions previously made by the user. The recommender system builds
a profile of the user from accumulated information such as clicks, ratings, and likes
from past interactions. The more the user interacts with the system, the better the
future recommendations will be.

Project Flow:

Fig 2: Project Flow (stages: Data Pre-Processing, Machine Learning Model, Add model to
website, Deploy)

Initially, the datasets required to build the model are loaded. The datasets required
in this project are tmdb_credits.csv and tmdb_movies.csv, both available on
Kaggle.com. Later, we added further datasets to include Indian movies in our project
database. The whole project model is created using a content-based approach and then
integrated into a website using Flask, a Python library for creating web apps.
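
The loading and merging step can be sketched as below. To keep the example
self-contained it reads two small in-memory CSV strings with the standard csv module
instead of the real files. The column names (id, movie_id, genres, overview, cast,
director), the sample rows, and the idea of joining the text fields into a single
"tags" string are assumptions for illustration, not the project's actual schema.

```python
import csv
import io

# Stand-ins for tmdb_movies.csv and tmdb_credits.csv (hypothetical columns and rows).
MOVIES_CSV = """id,title,genres,overview
1,Movie A,Action Adventure,A pilot explores a distant planet
2,Movie B,Romance,Two strangers meet on a train
"""

CREDITS_CSV = """movie_id,cast,director
1,Actor X Actor Y,Director P
2,Actor Z,Director Q
"""

def load_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))

def merge_on_id(movies, credits):
    """Join the two tables on the movie id and build one text field per movie."""
    credits_by_id = {row["movie_id"]: row for row in credits}
    merged = []
    for movie in movies:
        credit = credits_by_id.get(movie["id"])
        if credit is None:
            continue  # skip movies that have no matching credits row
        tags = " ".join([movie["overview"], movie["genres"],
                         credit["cast"], credit["director"]]).lower()
        merged.append({"id": movie["id"], "title": movie["title"], "tags": tags})
    return merged

dataset = merge_on_id(load_rows(MOVIES_CSV), load_rows(CREDITS_CSV))
```

A single text field per movie is convenient because it can be passed directly to a
text-to-vector step before similarity is computed.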

Architecture:

Fig 3: Architecture of the whole project

Use-case Diagram:

Fig 4: Use-case Diagram

Exploratory Data Analysis (EDA) of the final dataset gives some information regarding
the available attributes:

From the above observation, it is clear that most of the popularity values lie between
0 and 200.

Popular movies tend to earn more profit.

Adventure, Science Fiction, Fantasy, Action, and Animation are the five most popular
genres.
CHAPTER 5

IMPLEMENTATION AND RESULTS

The proposed system makes use of different algorithms and methods to implement the
content-based approach.

Similarity Score:

It is a numerical value from 0 to 1 that indicates how similar two items are to
one another. It is determined by measuring the resemblance between the text details of
the two items. Cosine similarity can be used to compute it.

How Does Cosine Similarity Work?

Cosine similarity is a metric used to determine how similar two documents are,
regardless of their size. It is calculated as the cosine of the angle between the two
vectors that represent the documents in a multi-dimensional space. Because it depends
on orientation rather than magnitude, two similar documents can have a high cosine
similarity even if they are far apart in Euclidean distance (due to the size of the
documents). Cosine similarity increases as the angle decreases.
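
A minimal from-scratch sketch of the calculation is given below; it computes the dot
product of two vectors divided by the product of their lengths. The three word-count
vectors are made-up examples, and in practice a library routine would normally be
used instead of this hand-written function.

```python
import math

def cosine_similarity(u, v):
    """cos(angle) = (u . v) / (|u| * |v|); returns 0.0 for an all-zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical word-count vectors over the same three-word vocabulary.
short_doc = [1, 2, 0]
long_doc = [10, 20, 0]   # same word proportions, ten times longer
other_doc = [0, 1, 3]    # different word proportions

same_topic = cosine_similarity(short_doc, long_doc)
different_topic = cosine_similarity(short_doc, other_doc)
far_apart = euclidean_distance(short_doc, long_doc)
```

The short and long documents point in the same direction, so their cosine similarity
is essentially 1 even though their Euclidean distance is large, while the document
with different word proportions scores much lower.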

CountVectorizer:

One useful utility offered by the scikit-learn library in Python is CountVectorizer.
It converts a given text into a vector based on the frequency (count) of each word
that appears in the text. This is useful when we have several such texts and want to
turn each of them into a vector (for use in subsequent text analysis).
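
The counting idea can be illustrated with a short from-scratch sketch. This is not
scikit-learn's CountVectorizer, which applies its own tokenization rules and returns
a sparse matrix; the function names and sample texts here are hypothetical and only
show how word counts become vectors over a shared vocabulary.

```python
from collections import Counter

def fit_vocabulary(texts):
    """Collect every distinct lowercase word, sorted so columns have a fixed order."""
    return sorted({word for text in texts for word in text.lower().split()})

def to_count_vector(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

# Hypothetical movie tag strings.
texts = [
    "space action adventure",
    "space romance",
    "action action thriller",
]

vocabulary = fit_vocabulary(texts)
vectors = [to_count_vector(text, vocabulary) for text in texts]
```

Each text becomes one row with one column per vocabulary word, so any two rows can
be compared with a vector similarity measure such as cosine similarity.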

Snapshots Of Interface:

CHAPTER 6

CONCLUSION

The project successfully produced a content-based movie recommendation system that
suggests movies to users, tailoring recommendations to user preferences. The
recommendations draw on features such as genre, cast, directors, and keywords to infer
how similar one movie is to another, and suggest movies that match a user's past
behavioral preferences.

Key Takeaways:

Efficiency of Content-Based Filtering:

The content-based approach proved that it could provide relevant recommendations
tailored to individual user profiles, underlining the importance of feature
engineering and similarity measures.

Scalability and Personalization:

Such an approach scales well as users are added, because each user's recommendations
depend only on that user's own history and not on other users' data, provided enough
of that history is available. Each user therefore receives recommendations adapted to
their own preferences rather than to collective behavior.

Challenges:

Cold-Start Problem for New Users: The system has trouble with users who do not have
enough historical data, since it operates solely from past interactions.

Narrow Recommendations: Content-based systems often recommend items similar to
what a given user has already interacted with, and hence restrict diversity.

Future Scope:

Future enhancements would address the cold-start problem and recommend more diverse
items through a hybrid approach that combines content-based filtering with
collaborative filtering or deep learning models. We can also enlarge our dataset by
adding a Bollywood movies dataset with overviews and reviews. Evaluation based on
explicit ratings and implicit feedback can further refine the accuracy of the system.

In this way, this project demonstrates the feasibility and efficacy of the
content-based approach to movie recommendations and points to areas of future work
that could together form a much more powerful and holistic recommendation engine.

CHAPTER 7

REFERENCE

[1] Goyani, Mahesh; Chaurasiya, Neha. "A Review of Movie Recommendation System:
Limitations, Survey and Challenges". ELCVIA: Electronic Letters on Computer Vision
and Image Analysis, Vol. 19 No. 3 (2020), p. 18-37. DOI 10.5565/rev/elcvia.1232.
https://ddd.uab.cat/record/232276

[2] J. Zhang, Y. Wang, Z. Yuan and Q. Jin, "Personalized real-time Movie
Recommendation System: Practical prototype and evaluation," in Tsinghua Science
and Technology, vol. 25, no. 2, pp. 180-191, April 2020, doi:
10.26599/TST.2018.9010118.

[3] R. Ahuja, A. Solanki and A. Nayyar, "Movie Recommender System Using K-Means
Clustering AND K-Nearest Neighbor," 2019 9th International Conference on Cloud
Computing, Data Science & Engineering (Confluence), Noida, India, 2019, pp. 263-268,
doi: 10.1109/CONFLUENCE.2019.8776969.

[4] Choudhury, S.S., Mohanty, S.N. & Jagadev, A.K. Multimodal trust based
recommender system with machine learning approaches for movie recommendation.
Int. j. inf. tecnol. 13, 475–482 (2021). https://doi.org/10.1007/s41870-020-00553-2

[5] S. Sahu, R. Kumar, M. S. Pathan, J. Shafi, Y. Kumar and M. F. Ijaz, "Movie
Popularity and Target Audience Prediction Using the Content-Based Recommender
System," in IEEE Access, vol. 10, pp. 42044-42060, 2022, doi:
10.1109/ACCESS.2022.3168161.

[6] Gopal Behera, Neeta Nain, Collaborative Filtering with Temporal Features for Movie
Recommendation System, Procedia Computer Science, Volume 218, 2023, Pages
1366-1373, ISSN 1877-0509, https://doi.org/10.1016/j.procs.2023.01.115.

[7] M. Gupta, A. Thakkar, Aashish, V. Gupta and D. P. S. Rathore, "Movie
Recommender System Using Collaborative Filtering," 2020 International Conference
on Electronics and Sustainable Communication Systems (ICESC), Coimbatore, India,
2020, pp. 415-420, doi: 10.1109/ICESC48915.2020.9155879.

[8] Zahra Zamanzadeh Darban, Mohammad Hadi Valipour, GHRS: Graph-based
Hybrid recommendation system with application to movie recommendation, Expert
Systems with Applications, Volume 200, 2022, 116850, ISSN 0957-4174,
https://doi.org/10.1016/j.eswa.2022.116850.

[9] Priyanshu Modi, Atul Kumar, Bhaskar Kapoor - Filmview: A Review Paper on Movie
Recommendation Systems, JUN 2023 | IRE Journals | Volume 6 Issue 12 | ISSN: 2456-8880.

[10] N Pavitha, Vithika Pungliya, Ankur Raut, Roshita Bhonsle, Atharva Purohit,
Aayushi Patel, R Shashidhar - Movie recommendation and sentiment analysis using
machine learning - Global Transitions Proceedings 3 (2022) 279–284

[11] Yeole Madhavi B., Rokade Monika D., Khatal Sunil S. - Movie
Recommendation System using Content based Filtering - Vol-7 Issue-4 2021
IJARIIE-ISSN(O)-2395-4396-14954

[12] Dr. Ganesh D, Yash Bhansali - Movie Recommendation System Using
Content-Based Filtering - 2022 IJCRT | Volume 10, Issue 4 April 2022 | ISSN: 2320-2882
