Volume 4, Issue 4, April – 2019                                     International Journal of Innovative Science and Research Technology
ISSN No:-2456-2165
Travel Companion

Vishal Israni1, Raaj Raisinghani2, Varun Rathi3
Student, Department of Information Technology
Vivekanand Education Society’s Institute of Technology, Mumbai

Charusheela Nehete4
Assistant Professor, Department of Information Technology
Vivekanand Education Society’s Institute of Technology, Mumbai

Abstract:- This paper presents a generalized and effective methodology for recommending entertainment options that can be helpful while traveling, using various AI and ML algorithms. The product recommends books, movies, TV shows, songs and places to visit to the user based on their past preferences and the time they have at their disposal. Content-based, collaborative filtering, hybrid and demographic-based recommender systems are used to filter the results and recommend them to the user. An additional feature that filters results based on a time constraint is also implemented. Blockchain-based micropayments, together with features such as Proof of Work and Proof of Authority, are used for the pay-per-use feature, and OAuth ensures the user’s authentication.

Keywords:- Blockchain, Machine Learning, Hybrid Filtering, Content-Based Filtering, Collaborative Filtering, Demographic-Based Filtering, Proof of Work, Proof of Authority.

I. INTRODUCTION

Current recommendation systems are quite specific in their usage, and there is a lack of a good application that meets all of a user’s recommendation needs on one integrated platform.

The portal is aimed at providing the user with entertainment which the user will tend to like. The time constraint feature helps the user select relevant content without fussing over irrelevant recommendations. The application is intended to contribute to the social well-being of the individual. The domains covered are movies, TV shows, books, songs and places to visit. The product intends to deliver the most relevant recommendations to the user through a self-explanatory and simple Graphical User Interface, along with the liberty to pay only for what the user really wants, secured by the blockchain.

A. Collaborative Filtering
Collaborative filtering, also referred to as social filtering, filters information by using the recommendations of other people. It is based on the idea that people who agreed in their evaluation of certain items in the past are likely to agree again in the future. A person who wants to see a movie, for example, might ask for recommendations from friends. The recommendations of some friends who have similar interests are trusted more than recommendations from others. This information is used in the decision on which movie to see.[1] The algorithm calculates the similarity between two users or items, and produces a prediction for the user by taking the weighted average of all the ratings. Similarity computation between items or users is an important part of this approach. Multiple measures, such as Pearson correlation and vector cosine based similarity, are used for this. The Pearson correlation similarity of two users x, y is defined as

$$\mathrm{simil}(x,y) = \frac{\sum_{i \in I_{xy}} (r_{x,i} - \bar{r}_x)(r_{y,i} - \bar{r}_y)}{\sqrt{\sum_{i \in I_{xy}} (r_{x,i} - \bar{r}_x)^2}\,\sqrt{\sum_{i \in I_{xy}} (r_{y,i} - \bar{r}_y)^2}}$$

Equation 1.1 : Pearson Correlation Similarity

where $I_{xy}$ is the set of items rated by both user x and user y, $r_{x,i}$ is the rating given by user x to item i, and $\bar{r}_x$ is the mean rating of user x.[2]

B. Content Based Filtering
Content-based filtering, also referred to as cognitive filtering, recommends items based on a comparison between the content of the items and a user profile. The content of each item is represented as a set of descriptors or terms, typically the words that occur in a document. The user profile is represented with the same terms and built up by analyzing the content of items which have been seen by the user. [3] The algorithm makes use of cosine similarity and finds the relation between the user and the item in consideration.

Equation 1.2 : User and Item Correlation

C. Hybrid Recommender Systems
Both content-based and collaborative filtering have strengths and weaknesses. Four specific problems can be distinguished for collaborative filtering:
Cold start, sparsity, first rater, popularity bias.

Content-based filtering, on the other hand, faces issues with:
Content description, over-specialization, subjective problem domain [4].

A system that combines content-based filtering and collaborative filtering could take advantage of both the representation of the content and the similarities among users. Although there are several ways in which to combine
IJISRT19AP624                                                 www.ijisrt.com                                                          240
the two techniques, a distinction can be made between two basic approaches. A hybrid approach combines the two types of information, while it is also possible to use the recommendations of the two filtering techniques independently.[5]

II. METHODOLOGY

The system takes the user’s categorical choice as input and recommends a list of things that he/she will tend to like. The recommendation list can also be filtered on the basis of the time available to the user, i.e. if the user wishes to watch a movie but only has 2 hours to spend, the system will recommend relevant movies of length 2 hours or less.

Proposed system architecture:

Fig 1:- System Architecture

• Domains Covered:-
Movies, TV shows, Songs, Books, Places to visit.

• Special Section:-
Surprise Element. Above are the major sections covered in the system.

• Collecting Data:-
The initial phase for any new user is the collection of previously liked entertainment genres and other parameters. Until then, new users are recommended certain popular and trending elements.

• User Preferences:-
The collected data is then analysed and segregated based on what the user likes, what similar users like, and the similarity distance between the user and the items in the system. Based on all of these, algorithms that generate relevant recommendations in the desired domains are implemented.

A. Song Recommendations
The user is given three types of song recommendations, viz. generic popular song recommendations, song recommendations based on a particular song, and song recommendations based on the entire profile. The dataset used is the Million Song Dataset[6]. The schema of one of the files contains the user id, play count and the song id. The other file contains song_id, title, release, artist_name and year. These two files form the dataset of the song recommender system.

• Generic Song Recommender System:-
This is a naive approach and does not yield personalised results. It can be thought of as a chart of the all-time most listened songs. The generalised approach can be used to recommend songs and overcome the “cold-start” problem [7]. The listen count of each song is found. The listen counts of all the songs are added to obtain a total. The percentage of each song’s listen count is found with respect to the total listen count.

The percentages are sorted and all the values are normalised between 0 and 1. The top 10 songs are then displayed to the user.

• Song Recommendation Based On A Particular Song:-
An indexer named Solr is used for fast and efficient indexing of the dataset. The MoreLikeThis query parser of Solr[8] is used to score songs based on their similarity to a particular song. The similarities are based on artist_name, release and title. The MoreLikeThis query parser then searches over the entire dataset and scores the tuples based on their “likeness” to the original entries. The top ten entries are displayed as the recommendation to the user.

• Song Recommendation Based On Entire Profile:-
Initially, the songs are sorted by popularity. The unique songs of each user and the unique users of each song are calculated. The unique users are only found for the songs which the user has listened to. A co-occurrence matrix is created. The dimensions of this co-occurrence matrix are len(user_songs) × len(songs). Then the similarity between all the songs is found. This similarity is based on the number of common users who have listened to both songs; the similarity measure used is the Jaccard index[9]. A weighted average for all the songs is calculated and sorted, and the top ten songs are displayed to the user.

B. Movie Recommendations
With such advances in technology, movies have become an amazing source of entertainment for almost every age group. Recommending movies while travelling or during free time will thus be valuable from the user’s as well as the business perspective. To construct an efficient movie recommender system, the dataset from Kaggle[10] is used. The construction of the movie recommender in our product line-up progresses as follows.

• Simple Recommender:-
The Simple Recommender offers generalized recommendations to every user based on movie popularity and genre. The idea behind this recommender is that movies that are more popular and more critically acclaimed will have a higher probability of being liked by the general populace. We sorted our movies based on ratings and popularity and displayed the top movies of our list. We used the TMDB Ratings to come
up with our Top Movies Chart. We used IMDB's weighted rating formula to construct the chart. Mathematically, it is represented as follows:

$$WR = \frac{v}{v+m}\,R + \frac{m}{v+m}\,C$$

Equation 2.2.1 : Weighted Rating Formula

where v is the number of votes for the movie, m is the minimum votes required to be listed in the chart, R is the average rating of the movie, and C is the mean vote across the whole report. The algorithm works well and suggests popular movies with high ratings that are likely to be liked by most users. It also works well for genre-specific popular movies. That is, when asked for popular movies under a specific genre, it gives appropriate results for an average user. However, this simple recommender suffers from some severe limitations. For one, it gives the same recommendation to everyone, regardless of the user's personal taste. To recommend based on user choices and give more personalised results, we used content-based and collaborative filtering.

• Content Based:-
For ‘The Dark Knight’, our system is able to identify it as a Batman film and subsequently recommend other Batman films as its top recommendations. But this is not of much use to most people, as it doesn't take into consideration very important features such as cast, crew, director and genre, which determine the rating and the popularity of a movie. Someone who liked The Dark Knight probably likes it more because of Nolan and would hate Batman Forever and every other substandard movie in the Batman franchise. Therefore, more suggestive metadata such as cast, crew and director information was also considered. With these additional considerations, the recommendation list changes towards more relevant results.

• Collaborative Filtering:-
The engine that we built is not really personal, in that it doesn't capture the personal tastes and biases of a user. Anyone querying our engine for recommendations based on a movie will receive the same recommendations for that movie, regardless of who s/he is. Therefore, we took a step towards collaborative filtering to make recommendations to movie watchers. Collaborative filtering is based on the idea that the tastes of similar users can be used to predict how much the current user will like a particular product or service. We used the Surprise library, which provides powerful algorithms such as Singular Value Decomposition (SVD), to minimise RMSE (Root Mean Square Error) and give good recommendations.

• Hybrid:-
The advantages of both collaborative and content-based filtering were exploited to build an efficient recommender engine that provides better and more relevant results by considering: the user’s rating for certain movies; other users’ ratings for similar movies, to correlate liking; and the title, overview, cast, rating and other suggestive metadata about the movie. After considering all the prominent parameters and studying the user profile, the system generates a list of movies relevant to the taste of the user.

C. Books Recommendation:-
For recommending books, collaborative filtering is used: similar users are studied, and recommendations are made based on the books liked by the group of those similar users.

Equation 2.3.1 : Formula for Similarity

The k-nearest neighbour method is used for clustering and forms the basis for user-based collaborative filtering.

The function used for similarity between users is:

Equation 2.3.2 : Similarity between Users

where p(a,i) is the prediction for target user a for item i, w(a,u) is the similarity between users a and u, and K is the neighbourhood of most similar users.

The other approach, i.e. item-based collaborative filtering, uses the following formula:

Equation 2.3.3 : Item Collaboration study

where K is the neighbourhood of most similar items rated by an active user a, and w(i,j) is the similarity between items i and j. kNN clustering is used to cluster items. Integrating both implementations, the hybrid system recommends books that are significantly relevant to the user’s taste.

D. TV shows Recommendation:-
Much like movies, TV shows have become a prominent and trending source of entertainment. We consider the most significant parameter, genre, along with cast. TV shows are also chosen by end users based on their run length. Some users tend not to start a show if it is too long. Again, a hybrid system integrating collaborative and content-based filtering is used to generate a list of recommended TV shows, clustering similar users and the ratings they have given to various shows. Users tend to start watching a TV
show based on ratings and popularity among critics as well as end user ratings. [11]

Fig 2:- Critic Review Plot

Fig 3:- User Review Plot

Based on these parameters, average ratings are calculated, and the recommendation list is generated using similarity measures between similar users and between users and TV shows.

E. Places to Visit:-
The user gets a recommendation of places he/she is most likely to enjoy visiting. The system analyses the user profile and studies the type of places the user usually visits. Thus, if the user has visited more beaches in Mumbai, he/she will be recommended beaches when visiting Bangalore or any other city. The demographic parameter is the proximity between the place and the user's current location. We take the user’s geolocation via the IP address of the device[12]. We have used the W3C Geolocation API and integrated it with the Google Maps API to get the user’s location and represent it on the map. The result of the W3C Geolocation API gives 4 location properties, namely latitude and longitude (coordinates), altitude (height), and the accuracy of the position gathered, which all depend on the location sources. In some queries, altitude may return no value. The four possible methods for locating the user with the help of this API are:

• GPS (Global Positioning System):-
This applies to any device which has GPS capabilities. A smartphone with GPS capabilities and set to high accuracy mode is likely to obtain its location data from this source. GPS calculates location information from the satellite signal. It has the highest accuracy; in most Android smartphones, the accuracy can be up to 10 metres.

• Mobile Network Location:-
Mobile phone tracking is used if a cell phone or wireless modem is used without a built-in GPS chip.

• WiFi Positioning System:-
If WiFi is used indoors, a Wi-Fi positioning system is the likeliest source. Some WiFi spots have location services capabilities.

• IP Address Location:-
Location is detected based on the nearest public IP address of a device (which can be a computer, the router it is connected to, or the ISP the router uses). The location depends on the IP information available, but in many cases where the IP is hidden behind Internet Service Provider NAT, the accuracy is only to the level of a city, region or even country.

The user’s location is indicated by a red marker, whereas the recommended places are denoted by blue markers and are highlighted on the map as well.

F. Surprise Element:-
Many a time, the user is bored and unable to decide what to do, whether to watch a movie, listen to some music or binge on a TV show. In such a case, the system simply pops up a random personalised element (a movie, a book, or any such thing) and recommends that the user go for it.

G. Time Filtration:-
This feature filters results while recommending entities. Its significance is that the user may often have limited time (say 1.5 hours) and ask for a movie. Recommending a 3 hour movie would not be an efficient solution. A time filtration component is therefore useful in such a scenario, recommending entities according to the time entered by the user.

H. Blockchain:-
A blockchain is a digital, immutable, distributed ledger that chronologically records transactions in near real time. The prerequisite for each subsequent transaction to be added to the ledger is the respective consensus of the network participants (called nodes), thereby creating a continuous mechanism of control over manipulation, errors, & data quality. [13]

There are several reasons to switch to cashless transactions with the help of blockchains. Some of them are: low transaction cost, irrevocable and tamper-resistant transactions, high security, fraud minimisation, and tracing and auditing by supervisors [14]. The application of such advanced
features is to implement micropayments along with transparency and security in transactions. Entertainment these days is costly. End users need to purchase monthly or yearly subscription packages just to watch a few digital programmes. The system aims to provide a pipeline where the user can watch recommended digital entities and pay only for what they have used. For example, the user only wishes to watch 2 episodes of some XYZ TV series. So instead of purchasing an entire subscription to watch merely 2 episodes, the user simply pays for what he/she watches (the pay per use concept) [15]. The transactions use the public keys and private keys of senders and receivers. The transaction is signed with the private key of the sender, and the result is termed a digital signature[16]. This signature validates the authenticity of the sender and can be checked with the sender's public key. Further, the proof of concept[17], proof of authority[18] and proof of work[19] features of blockchain verify whether the transaction is valid. This is done by miners[20] of the transactions, who receive a reward in the form of cryptocurrency (a digital currency used for transactions) for the same. This way, the user has the liberty to pay only for what he/she has used, and the transactions are highly secure and transparent as well.

III. CONCLUSION

All the modules use different filtering models, based on their appropriateness. Demographic filtering was found to be suited for the places to visit model, with Straight Line Distance as the admissible heuristic. Similarly, hybrid filtering for movies and a combination of content-based filtering, item-based collaborative filtering and user-based collaborative filtering for books gave the most satisfactory results. For TV shows, hybrid filtering gave efficient results, and collaborative filtering proved to be appropriate for recommending songs. Finally, the transactions were made highly secure using blockchain features, thereby allowing the ‘pay per use’ concept for services used by the user.

REFERENCES
[1]. recommender-systems.org, available at: http://recommender-systems.org/collaborative-filtering/
[2]. wikipedia.org, available at: https://en.wikipedia.org/wiki/Collaborative_filtering
[3]. recommender-systems.org, available at: http://recommender-systems.org/content-based-filtering/
[4]. slideshare.net, available at: https://www.slideshare.net/microlife/recommender-systems-contentbased-and-collaborative-filtering
[5]. recommender-systems.org, available at: http://recommender-systems.org/hybrid-recommender-systems/
[6]. Bertin-Mahieux, T., et al. (2011). The million song dataset. 12th International Society for Music Information Retrieval Conference (ISMIR).
[7]. Lika, B., et al. (2014). "Facing the cold start problem in recommender systems." Expert Systems with Applications, vol. 41(4), pp. 2065-2073.
[8]. Class MoreLikeThis, available at: https://lucene.apache.org/core/7_0_0/queries/org/apache/lucene/queries/mlt/MoreLikeThis.html
[9]. wikipedia.org, available at: https://en.wikipedia.org/wiki/Jaccard_index
[10]. Rounak Banik, 2017. The Movies Dataset, kaggle.com, available at: https://www.kaggle.com/rounakbanik/the-movies-dataset
[11]. Metarecommender, available at: https://nycdatascience.com/blog/student-works/capstone/metarecommendr-recommendation-system-video-games-movies-tv-shows/
[12]. Geolocation API Specification, w3.org, https://www.w3.org/TR/geolocation-API/, 8 November 2016.
[13]. Monitor Deloitte: A new game changer for media industry - page 4.
[14]. imarticus.org, available at: https://imarticus.org/benefits-of-blockchain-technology/
[15]. https://internetofthingsagenda.techtarget.com/blog/IoT-Agenda/How-blockchain-will-enable-pay-as-you-go-insurance
[16]. wikipedia.org, available at: https://en.wikipedia.org/wiki/Digital_signature
[17]. https://en.wikipedia.org/wiki/Proof_of_concept
[18]. https://en.wikipedia.org/wiki/Proof-of-authority
[19]. https://en.wikipedia.org/wiki/Proof-of-work_system
[20]. investopedia.com [online], available at: https://www.investopedia.com/terms/b/bitcoin-mining.asp
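
As an illustration of the Pearson correlation similarity captioned as Equation 1.1 in the Collaborative Filtering subsection, the following minimal Python sketch computes the similarity of two users over the items both have rated. It is not the implementation used in the system. The function name pearson_similarity and the dictionary layout (item id mapped to rating) are hypothetical, and the sketch assumes that each user's mean is taken over the co-rated items only.

```python
from math import sqrt


def pearson_similarity(ratings_x, ratings_y):
    """Pearson correlation of two users over their co-rated items.

    ratings_x, ratings_y: dicts mapping item id -> rating
    (a hypothetical layout chosen for this sketch).
    """
    # I_xy: the set of items rated by both users.
    common = set(ratings_x) & set(ratings_y)
    if len(common) < 2:
        return 0.0  # too little overlap to correlate

    # Assumption: means are computed over the co-rated items only.
    mean_x = sum(ratings_x[i] for i in common) / len(common)
    mean_y = sum(ratings_y[i] for i in common) / len(common)

    numerator = sum(
        (ratings_x[i] - mean_x) * (ratings_y[i] - mean_y) for i in common
    )
    var_x = sum((ratings_x[i] - mean_x) ** 2 for i in common)
    var_y = sum((ratings_y[i] - mean_y) ** 2 for i in common)
    if var_x == 0 or var_y == 0:
        return 0.0  # a user with constant ratings has no defined correlation

    return numerator / (sqrt(var_x) * sqrt(var_y))
```

A value close to 1 indicates that the two users rate their shared items in the same direction, and a value close to -1 that they rate them in opposite directions. Returning 0.0 for degenerate cases is a design choice of this sketch, so that such users simply carry no weight in the weighted average.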
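
The weighted rating of Equation 2.2.1, used by the Simple Recommender, can be sketched in the same way. The symbols v, m, R and C follow the definitions given with that equation; the function names weighted_rating and top_chart, the field names vote_count and vote_average, and the choice of threshold m are assumptions made only for this illustration.

```python
def weighted_rating(v, R, m, C):
    """Weighted rating WR = v/(v+m) * R + m/(v+m) * C.

    v: number of votes for the movie
    R: average rating of the movie
    m: minimum votes required to be listed in the chart
    C: mean vote across the whole report
    """
    return (v / (v + m)) * R + (m / (v + m)) * C


def top_chart(movies, m, limit=10):
    """Rank movies that have at least m votes by weighted rating.

    movies: list of dicts with the hypothetical keys
    'title', 'vote_count' and 'vote_average'.
    """
    if not movies:
        return []
    # C is taken here as the mean of the average ratings of all movies.
    C = sum(movie["vote_average"] for movie in movies) / len(movies)
    qualified = [movie for movie in movies if movie["vote_count"] >= m]
    ranked = sorted(
        qualified,
        key=lambda movie: weighted_rating(
            movie["vote_count"], movie["vote_average"], m, C
        ),
        reverse=True,
    )
    return ranked[:limit]
```

The formula pulls the score of a movie with few votes towards the overall mean C, so a movie with a handful of very high ratings does not outrank a widely rated one. A larger m makes the chart stricter.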