2019 4th International Conference on Information Systems and Computer Networks (ISCON)

GLA University, Mathura, UP, India. Nov 21-22, 2019

Tourist Recommender System using Hybrid Filtering
Maddala Lakshmi Bai
Research Scholar, Department of CSE
IIT(ISM) Dhanbad
India
maddala.18DP000389@cse.iitism.ac.in

Rajendra Pamula
Assistant Professor, Dept. of CSE
IIT(ISM) Dhanbad
India
rajendra@iitism.ac.in

Praphul Kumar Jain
Research Scholar, Department of CSE
IIT(ISM) Dhanbad
India
praphulajn1@gmail.com

Abstract—In recommender systems, predicting the ratings for a new user or a new item is a difficult task. This problem is known as ‘cold-start’. In this paper a hybrid filtering approach that uses content-based filtering (CB), collaborative filtering (CF) and demographic filtering is proposed to address the ‘cold-start’ problem. Hybrid filtering uses demographic details to predict the rating of the new user and hence to find similar items within the neighborhood. The proposed approach uses the advantages and overcomes the drawbacks of existing recommendation methods like CB and CF. Different datasets are used to predict the ratings for the new user from the demography, and the POIs that satisfy the new user are extracted. The results produced by this approach are relatively acceptable. The proposed method can work effectively and efficiently to solve the cold-start problem.

Keywords—Tourism, Tourist, Hybrid filtering, Personalized recommendation, cold-start.

I. INTRODUCTION

It has been observed that tourism has played a vital role in the economic development of different countries in the past few decades. Tourists regard tourism as a way to relax, for a few days or hours, from the tensions and work pressures of their day-to-day life. Many factors motivate a tourist to travel: holiday leisure, visiting friends and relatives, business and professional tasks, health checkups, religious activities and other personal interests. The tourism development departments of different countries are trying to establish relationships with travel agents and tourists. Tourism offers many destination points, attractions, sites and staying facilities. In this scenario, a tourist has to plan a visit by selecting a destination that may include many different points of interest (PoIs). The tourist naturally uses information from travel agents, travel manuals and tourism websites to plan the intended visit. These sources provide a huge amount of information, and because of this information overload it becomes difficult for the tourist to plan a visit. It has also been noted that tourism websites such as https://www.makemytrip.com and https://www.tripadvisor.in do not actually offer personalized recommendations. Instead they offer non-personalized recommendations based on the number of visits to PoIs [12] or on the ratings and reviews [13] of previous visits by tourists.

In this context there is a great demand for a personalized recommender system (PRS) [1] that sends recommendations to the tourist based on the tourist profile. Such systems in the tourism sector are known as personalized tourist recommender systems (PTRS) [2]. The main objective of these systems is to recommend suitable personal packages that match the tourist profile. This improves the tourist experience [3] and frees the tourist from the worry of planning a visit. It also improves the business of the tourism department.

II. LITERATURE SURVEY

A. Need of Recommender systems

A recommender system is a software system that filters information according to the user’s preferences [14], reducing the information overload caused by the huge amount of data. The main purpose of the recommender system is to offer good and suitable personalized services [15] that can satisfy user requirements. The promising characteristic of the recommender system is its capability to predict the user’s preferences and interests and to generate recommendations by analyzing the user’s behavior. The behavior of the user can be stored in the user’s profile. A recommender system can also speed up [16] the purchasing of goods while satisfying the user’s needs.

B. Categories of recommender systems

The primary techniques used in recommendation are classified into three main categories: content-based [4], collaborative filtering [5], and hybrid filtering [6]. Content-based (CB) recommendation recommends items to users based on their past history, which means that item rating information is used to construct the recommendation. It is also known as “Item-based” or “Model-based” filtering. The history is stored in the form of a profile. A new item is evaluated by its similarity to the items purchased in the past, using similarity measures over the attributes of those items. Here vector space models and K-Nearest Neighbor (KNN) [7] measures can be used to find the similarity between the new item and the items already purchased. A new item is assigned to the most suitable class among its K nearest neighbors. The distance between the new item $X = (x_{a1}, x_{a2}, x_{a3}, \ldots, x_{an})$ and the existing item $Y = (y_{a1}, y_{a2}, y_{a3}, \ldots, y_{an})$ is calculated using the Euclidean distance as follows:

$$d(X, Y) = \sqrt{\sum_{i=1}^{n} (y_{ai} - x_{ai})^{2}} \tag{1}$$

978-1-7281-3651-6/19/$31.00 ©2019 IEEE
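As an illustration of the distance in equation (1) and of picking the nearest items under a threshold, a minimal Python sketch follows. The function names, the attribute vectors and the threshold value are hypothetical and are not taken from the paper.

```python
import math

def euclidean_distance(x, y):
    """Equation (1): square root of the summed squared attribute differences."""
    if len(x) != len(y):
        raise ValueError("items must have the same number of attributes")
    return math.sqrt(sum((yi - xi) ** 2 for xi, yi in zip(x, y)))

def nearest_items(new_item, known_items, n=2, threshold=float("inf")):
    """Return up to n (name, distance) pairs within the threshold, closest first."""
    scored = [(name, euclidean_distance(new_item, attrs))
              for name, attrs in known_items.items()]
    within = [pair for pair in scored if pair[1] <= threshold]
    return sorted(within, key=lambda pair: pair[1])[:n]

# Hypothetical attribute vectors, invented for this example only.
known = {
    "pkg_a": (1.0, 2.0, 4.0),
    "pkg_b": (4.0, 6.0, 4.0),
    "pkg_c": (1.0, 2.0, 1.0),
}
print(nearest_items((1.0, 2.0, 3.0), known, n=2, threshold=2.5))
# [('pkg_a', 1.0), ('pkg_c', 2.0)]
```

Here pkg_b is dropped because its distance (about 5.1) exceeds the threshold, so only the two closest items remain as candidates.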


From equation (1) the distances are computed, and the top n items that satisfy a threshold value are used for recommendation.

Collaborative filtering (CF) based approaches try to predict the value of an item for a particular user based on the items that were already rated by other users. That means user information is used here to construct the recommendation. It is also known as “Memory-based” or “user-based” collaborative filtering. New items [8] are predicted for the user based on the similarity of the user’s items with those of other users. Hence, CF recommends items or products to the user based on item similarity with other similar users. This approach needs the user’s item or product history to find the similar users. From these similar users, different items or packages that have already been purchased can be recommended to the new user. There are a number of similarity measures to estimate the distance between two users and so determine the neighborhood, such as Pearson correlation, Cosine similarity, Spearman correlation, Tanimoto coefficient, Euclidean distance and Log likelihood similarity. The Euclidean distance formula is given in equation (1). The Tanimoto coefficient has been found to give an accurate value for the distance between two users. Let $s_1$ and $s_2$ be the two sets of items for which user 1 ($u_1$) and user 2 ($u_2$) have given ratings. The neighborhood is calculated as follows:

$$sim(u_1, u_2) = \frac{|s_1 \cap s_2|}{|s_1| + |s_2| - |s_1 \cap s_2|} \tag{2}$$

where $s_1 \cap s_2$ is the intersection of the item sets of users $u_1$ and $u_2$. When a new user enters the system, collaborative filtering cannot predict the similar users, because there is no previous history from which the similarity between users can be computed. Hence, it suffers from the new user problem called ‘cold-start’.

Sometimes both of these methods can be integrated to form an effective and efficient recommendation system. It is also known as “Knowledge-based” or “Hybrid filtering”. In practice both CB and CF have their own merits and demerits. To overcome the drawbacks of CB and CF when they are used individually, and to build an effective personalized recommender system, these approaches can be combined to form hybrid filtering. This method uses the advantages of CB and CF [9]. For instance, when a user has obtained an item but did not rate it, CB works well. When a user has rated an item that was already rated by a similar user, CF works well. First CB can be performed to get some related information. CF can then be applied to the outcome of CB to get suitable items and avoid the information overload. By integrating the CB and CF approaches, a more powerful personalization solution can be obtained. Smyth and Cotter suggested this kind of personalized hybrid recommendation system [10]. This approach works well for active users, to predict the rating and to find the content and similar users. Claypool et al. [11] proposed a hybrid model based on a weighted average prediction of CB and CF. For each user the weights are determined so that the system can find the optimum combination of CB and CF. However, when a new user arrives who has no past history, this combination of hybrid filtering suffers from the same ‘cold-start’ problem. To address the ‘cold-start’ problem and keep the advantages of different recommendation approaches like CB and CF, another hybrid filtering approach uses demographic details to determine the recommendation. Demographic filtering (DF) [3] addresses the new user cold-start problem based on demographic details like age, gender, country, region and religion.

III. HYBRID RECOMMENDATION WITH DEMOGRAPHIC FILTERING

In this research work a novel personalized hybrid filtering approach is proposed for a tourist recommender system that uses three techniques: CB, CF and DF. Demographic details such as age, gender, country and city are collected at the time of registration through tourism websites. The details are further used in filtering the information according to the user’s requirements. Demographic filtering that uses a decision tree [3] addresses the new user cold-start problem according to the following procedure:

A. New user cold-start problem

Algorithm 1 New user cold-start
i. Extract the records of existing users whose demographic details match those of the new user
ii. Use different combinations of demographic details to extract the information, such as age and country, age and city, or age and gender
iii. The table that satisfies the given demographic details is obtained, as shown in Table 1
iv. The average rating is computed from the Table 1 information and is assigned to the new user
v. The tourism packages that match the newly predicted rating are recommended to the new user using CF.

TABLE I. DEMOGRAPHIC DETAILS FOR NEW USER MATCH

sn   name   city       age   country     rating
1    U1     Sydney     37    Australia   4.2
2    U2     Canberra   45    Australia   3.2
3    U3     Hamilton   21    Australia   2.7

B. New package cold-start problem

The following procedure is used to address the cold-start problem for a tourist package that is not yet rated, in order to predict its rating.

Algorithm 2 New Package cold-start
i. Extract the records of existing users whose demographic details match those of the new user
ii. Use different combinations of demographic details to extract the information, such as age and country, age and city, or age and gender
iii. Filter the information by tourist package name using CB for the above demographic details
iv. The table that satisfies the given demographic details is obtained, as shown in Table 2
v. The average rating is calculated from the Table 2 information and is assigned to the new package
vi. CF is applied to find the neighborhood users that match the newly predicted average.
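The first four steps of Algorithm 1, and the matching steps of Algorithm 2, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the rows are those printed in Table II of this paper, while the helper names and the use of exact equality for the demographic match and for the CB package-name filter are illustrative choices, not the authors' implementation (which the paper describes as decision-tree based).

```python
from statistics import mean

# Rows of Table II from the paper (Table I is the same data without the package column).
RECORDS = [
    {"name": "U1", "city": "Sydney", "package": "Pkg1", "age": 37, "country": "Australia", "rating": 4.2},
    {"name": "U2", "city": "Canberra", "package": "Pkg2", "age": 45, "country": "Australia", "rating": 3.2},
    {"name": "U3", "city": "Hamilton", "package": "Pkg1", "age": 21, "country": "Australia", "rating": 2.7},
]

def demographic_matches(records, **criteria):
    """Steps i-iii: keep the rows whose fields equal every given criterion (assumed exact match)."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]

def predicted_rating(rows):
    """Average rating of the matching rows, or None when nothing matches."""
    return round(mean(r["rating"] for r in rows), 2) if rows else None

# Algorithm 1: a new user from Australia inherits the average rating of matching users.
new_user_rating = predicted_rating(demographic_matches(RECORDS, country="Australia"))

# Algorithm 2: the same match, further filtered by package name, rates a package.
new_package_rating = predicted_rating(
    demographic_matches(RECORDS, country="Australia", package="Pkg1"))

print(new_user_rating, new_package_rating)
```

With the three Table II rows this gives roughly 3.37 for the new user and 3.45 for Pkg1. The final CF step of each algorithm, which looks for packages or users near the predicted rating, is left out of the sketch.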

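The Tanimoto coefficient of Equation (2) reduces to a few set operations. The sketch below is illustrative only: the function name is hypothetical, and the two POI sets are read from the traveler types listed in Table IV of this paper, on the assumption that each traveler type is represented by the set of distinct POIs it appears with.

```python
def tanimoto(s1, s2):
    """Equation (2): |s1 & s2| / (|s1| + |s2| - |s1 & s2|)."""
    s1, s2 = set(s1), set(s2)
    common = len(s1 & s2)
    denominator = len(s1) + len(s2) - common
    # Two empty sets share nothing, so report zero similarity instead of dividing by zero.
    return common / denominator if denominator else 0.0

# Distinct POIs per traveler type, as listed in Table IV.
families = {"Temple of the Reclining Buddha"}
couples = {"Big Buddha Phuket", "Temple of the Reclining Buddha", "Patong Beach"}

print(round(tanimoto(families, couples), 2))  # 0.33
```

Under that assumption the value agrees with the coefficient of 0.33 reported for the families and couples sets in Section IV, although the paper does not spell out exactly which sets it used.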
TABLE II. DEMOGRAPHIC DETAILS FOR PACKAGE MATCH

sn   name   city       Package   age   country     rating
1    U1     Sydney     Pkg1      37    Australia   4.2
2    U2     Canberra   Pkg2      45    Australia   3.2
3    U3     Hamilton   Pkg1      21    Australia   2.7

From Table 2 the average rating is predicted and assigned to the new package. The packages whose rating is similar to that of the new package are identified. The other packages that different users hold at this match can then be extracted, and a personal recommendation is made to the user.

C. Architecture of Proposed PTRS

The architecture of the Personalized Tourist Recommender System (PTRS) shown in Fig. 1 gives the overall working process of the system.

Fig. 1. Working model of PTRS

The proposed system works with personalized hybrid filtering. The hybrid combination may use CB, CF and DF. The main objective of this work is to address the ‘cold-start’ problem that comes with a new user or new POIs. When a new person registers with a tourism website like https://www.tripadvisor.in, the tourist is required to give details like name, current city, country and website. The demographic details such as country, region, age and gender are used to extract the information related to that demography. Once the information that matches the new user’s demography has been collected from the users dataset (Users DS), the country that the tourist wants to visit is used with CB to extract the points of interest in that country. After that, the CB content is filtered with CF to further reduce the information overload. This produces a short list of POIs that match the user demography and are also among the most visited, obtained by filtering the reviews and ratings from the ratings dataset (Ratings DS).

IV. RESULTS AND DISCUSSION

In this paper, the tourism dataset of Thailand ‘thaitourism2.csv’ is used, and the experiment is conducted using the Python library Pandas. The dataset covers the four quarters of each year in the 7-year period from 2010 to 2016 and is based on the monthly immigration statistics, as a means of understanding and analyzing the trends of the tourists. This dataset with various attributes is collected from http://tourism2.tourism.go.th/home/content. It has many regions like Asia, Africa, Europe and America. The content was used to predict the POIs for the new user on various ratings and reviews from the other dataset ‘thairatings.csv’, which was constructed from the information present on the tripadvisor website for Thailand POIs. This dataset has attributes like POI, Reviews, the ratings excellent (5), very good (4), average (3), poor (2) and terrible (1), time of the year, and language. Tourists of different traveler types have been considered. The traveler types include families, couple, solo, business and friends. In the proposed work, as part of the experiment, Indian tourists who can speak English are considered, using the traveler types families, couple and solo.

In this paper, as part of the experiment, the 7 most visited POIs are considered based on the reviews information. This list includes: Historic City of Ayutthaya, Big Buddha Phuket, Golden Triangle, Temple of the Reclining Buddha, Koh Phi Viewpoint, The Sanctuary Of Truth and Patong Beach. These have been visited by different tourist types like families, couples and solo. These POIs have been rated by the tourists as excellent, very good, average, poor and terrible, with ratings 5, 4, 3, 2 and 1 respectively. Those POIs whose reviews and ratings are greater than their mean are listed. This list can include different tourist types. The Tanimoto coefficient is calculated between the tourist types to find the similarity. Those PoIs that are similar are listed and can be recommended to the new user to address the cold-start problem efficiently and effectively.

TABLE III. DATASET ATTRIBUTES AND ENTRIES

region   nationality   year   month   tourists
Africa   AfrOthers     2010   1       6553
Africa   AfrOthers     2010   2       5618
SoAsia   India         2014   7       71630
SoAsia   India         2014   8       74983
SoAsia   SriLanka      2016   8       6040
SoAsia   SriLanka      2016   9       5221

The attributes and entries present in the dataset are shown in Table 3. From this table, the entries whose nationality attribute is India are extracted from the SoAsia region using CB. The data related to Indian tourists during the years 2010 to 2016 is shown in Fig. 2.

Fig. 2. Indian tourists trends to Thailand

Information related to month-wise visits is derived from this. From Fig. 3, it can be noticed that some months have a larger number of visits. This helps to identify the attractions that receive more visits during those months.
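The extraction of Indian tourists from the SoAsia region and the month-wise totals can be illustrated with a short Python sketch. The paper states that Pandas was used; this sketch uses only the standard library so that it stays self-contained, works on the six sample rows printed in Table III rather than on the full ‘thaitourism2.csv’ file, and its function name is hypothetical.

```python
from collections import defaultdict

# The six sample rows printed in Table III: region, nationality, year, month, tourists.
ROWS = [
    ("Africa", "AfrOthers", 2010, 1, 6553),
    ("Africa", "AfrOthers", 2010, 2, 5618),
    ("SoAsia", "India", 2014, 7, 71630),
    ("SoAsia", "India", 2014, 8, 74983),
    ("SoAsia", "SriLanka", 2016, 8, 6040),
    ("SoAsia", "SriLanka", 2016, 9, 5221),
]

def monthly_totals(rows, region, nationality):
    """Sum tourist counts per month for one region and nationality, across all years."""
    totals = defaultdict(int)
    for row_region, row_nationality, _year, month, tourists in rows:
        if row_region == region and row_nationality == nationality:
            totals[month] += tourists
    return dict(sorted(totals.items()))

india = monthly_totals(ROWS, "SoAsia", "India")
print(india)                      # {7: 71630, 8: 74983}
print(max(india, key=india.get))  # 8
```

On the full dataset the same grouping would yield one total per month, from which the busiest months can be read off as in Fig. 3.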

Fig. 3. Month wise Indian tourist’s trends

The POIs most frequently visited by tourists in Quarter 1 and Quarter 2, which match April to June in Fig. 3, are extracted from the ‘thairatings.csv’ dataset. The extract contains different traveler types such as families, couple and solo, along with the review details of POIs whose ratings are higher than the specified average of excellent ratings.

TABLE IV. QUARTERS 1 & 2 POIS

POIs                             Traveler type
Big Buddha Phuket                Couples
Big Buddha Phuket                Couples
Temple of the Reclining Buddha   families
Temple of the Reclining Buddha   Couples
Temple of the Reclining Buddha   Couples
Temple of the Reclining Buddha   Solo
Temple of the Reclining Buddha   Solo
Patong Beach                     Couples
Patong Beach                     Couples

By applying CF to the dataset ‘thairatings.csv’ with the above information, a Tanimoto coefficient of 0.33 is obtained between the families and couples sets. Based on this similarity measure, the POIs listed in Table 5 are predicted to satisfy a new user who plans to visit the above-mentioned country. The mean of the reviews for quarter 1 and quarter 2 is calculated. Similarly, the mean of the ratings is calculated for the same quarters. The POIs that match the above condition are listed. The similarity is calculated with the Tanimoto coefficient of Equation (2), as stated above. Table 5 gives the final prediction for the new user who has not yet rated anything and has no past records.

TABLE V. POIS PREDICTED AND RECOMMENDED

POI                              Reviews   Traveler type
Big Buddha Phuket                15012     couples
Temple of the Reclining Buddha   48868     families
Patong Beach                     15504     couples

V. CONCLUSION

In this paper a personalized recommender system is proposed to address the ‘cold-start’ problem for the new user. Data from different sources, such as the Tripadvisor and Thai tourism websites, was extracted to perform the experiments. The proposed hybrid filtering approach, which uses CB, CF and DF, is used to extract the PoIs to recommend to the tourist personally. As future work, it is suggested to use the proposed approach to address the ‘cold-start’ problem for new items or new PoIs. It has been observed that many datasets do not maintain demographic details, which play a vital role in the recommendation process. It would be very useful for research on item prediction if demographic details were maintained along with the other attributes.

REFERENCES

[1] Francesco Ricci, “Travel recommender systems,” IEEE Intelligent Systems, vol. 17, no. 6, pp. 55-57, 2002.
[2] Renjith, Shini & Anjali, C., “A Personalized Travel Recommender Model Based on Content-based Prediction and Collaborative Recommendation,” International Journal of Computer Science and Mobile Computing, ICMIC13, pp. 66-73, Dec 2013.
[3] Mohamed Elyes Ben Haj Kbaier, Hela Masri & Saoussen Krichen, “A Personalized hybrid tourism recommender System,” IEEE/ACS 14th International Conference on Computer Systems and Applications (AICCSA), pp. 244-50, 2017.
[4] P. Lops, M. de Gemmis, G. Semeraro, “Content-based recommender systems: state of the art and trend,” Recommender systems handbook, Springer, pp. 73-105, 2011.
[5] X. Su, T.M. Khoshgoftaar, A survey of collaborative filtering techniques, Advances in Artificial Intelligence 2009 (2009) 4.
[6] H.-N. Kim, A. El-Saddik, G.-S. Jo, Collaborative error-reflected models for cold-start recommender systems, Decision Support Systems 51 (2011) 519–531.
[7] Sang H. Choi, Young-Seon Jeong and Myong K. Jeong, “A Hybrid recommendation method with reduced data for large-scale application,” IEEE Transaction on systems, and cybernetics, vol. 40, no. 5, Sep 2010.
[8] E. Han and G. Karypis, “Feature based recommendation system,” presented at 14th ACM Int. Conf. Inf. Know. Manag, Bremen, Germany, Oct. 31-Nov. 5, 2005.
[9] R. Burke, “Hybrid recommender systems: survey and experiments,” User Model. User-Adapted Interact., vol. 12, no. 4, pp. 331-370, 2002.
[10] B. Smyth and P. Cotter, “A personalized television listing services,” Commun. ACM, vol. 43, no. 8, pp. 107-111, 2000.
[11] M. Claypool et al., “Combining content-based and collaborative filters in an online newspaper,” in Proc. ACM SIGIR Workshop Recommender Syst., pp. 1-11, 1999.
[12] http://tourism2.tourism.go.th/home/content
[13] https://www.tripadvisor.in/Attraction_Review-g1389361-d2433844-Reviews-Big_Buddha_Phuket-Chalong_Phuket_Town_Phuket.html
[14] Jie Lu, Dianshuang Wu, Mingsong Mao, Wei Wang, Guangquan Zhang, Recommender system application developments: A survey, Decision Support Systems 74 (2015) 12-32.
[15] Jiantao Zhao, Hengwei Zhang, Yue Lian, Analysis and design of personalized recommender system based on collaborating filtering, IOT Workshop, CCIS 312, Springer-Verlag Berlin Heidelberg (2012) 473-480.
[16] Ken Lang. 1995. Newsweeder: Learning to filter netnews. In Machine Learning Proceedings 1995. Elsevier, 331–339.
