Machine Learning Based Drug Recommendation from
Sentiment Analysis of Drug Rating and Reviews
Kodepogu Koteswara Rao a, Kona Sravya b, Kadamati Jaya Phanidra Sai b, Gummadi Giri
Ratna Sai b, Geetha Ganesan c
a Associate Professor, Dept of CSE, PVP Siddhartha Institute of Technology, Vijayawada, India
b Bachelor of Technology, Dept of CSE, PVP Siddhartha Institute of Technology, Vijayawada, India
c Advanced Computing Research Society, Chennai, Tamilnadu, India
Abstract
A recommender system can help users narrow down their requirements and make informed decisions
from large amounts of complex information. Recommendation based on sentiment analysis is a
considerable challenge, because user-generated content is expressed in human language in many
different ways. Many studies have focused on common domains such as reviews of electronic
products, movies, and restaurants, but not enough on health and medical issues. Sentiment analysis
of healthcare in general, and of individuals' drug experiences in particular, may shed considerable
light on how to improve public health and reach the right decision. In this work, we design and
implement a drug recommender system framework that applies sentiment analysis techniques to
drug reviews. The objective of this study is to build a decision support platform that helps patients
make better-informed choices in drug selection. First, we propose a sentiment measurement
approach to drug reviews and generate ratings on drugs. Second, we take into account how useful
the drug reviews are to users, the patients' conditions, and the dictionary sentiment polarity of the
drug reviews. Then we fuse these factors into the recommendation system to list suitable
medicines. Experiments were carried out using the Decision Tree, K Nearest Neighbours, and
Linear Support Vector Classifier algorithms for rating generation and a hybrid model for
recommendation, based on the given open dataset. The analysis is carried out to tune the
parameters of each algorithm to achieve better performance. Finally, the Linear Support Vector
Classifier is selected for rating generation to obtain a good trade-off between model accuracy,
model efficiency, and model scalability.
Keywords
Drug rating, Sentiment, Machine Learning
1. Introduction
With the rise of Web 2.0 platforms, enormous amounts of content, known as social media, are
created by users. Consequently, many researchers have investigated effective algorithms for
sentiment analysis of user-generated content over the last ten years. The field of sentiment analysis,
also known as opinion mining, analyses the opinions, sentiments, beliefs, judgments, attitudes, and
emotions of people towards entities such as products, services, organizations, individuals, events,
and topics. Recently, these application areas have attracted great interest. In sentiment research,
opinions are generally divided into two classes, positive and negative. However, when all candidate
products reflect either positive or negative sentiment, it is hard for people to decide. To make a
decision, people need to know not only whether a product is good but also how good it is. It is also
recognized that different people have different preferences for expressing sentiment. Thus, in many
practical cases, such as drug recommendation, it is more important to provide numerical scores
rather than binary decisions, and to build a decision support system that helps people choose
products. This new application field presents both challenges and research opportunities in clinical
health. A recommender system aims to predict the preferences of users and make suggestions that
would be of interest to them. Traditional recommendation technologies include collaborative
filtering (CF), content-based (CB), knowledge-based (KB), and hybrid approaches, all of which
have specific limitations. CB produces overspecialized recommendations, and CF suffers from
sparsity, scalability, and cold-start problems. However, a few researchers have focused on drug
recommendation systems based on user reviews and have shown that sentiment analysis of
healthcare in general, and of users' drug experience in particular, can shed significant light on how
to improve public health and make the best decisions, and that combining this approach with a
traditional recommender system is more effective. In our research, we focus on opinion mining in
drug reviews, in which patients share their experiences and opinions about medications. We
classify the opinions into ratings and recommend a list of medicines that would be most suitable
for the patient. The proposed approach to sentiment analysis will be useful not only to patients but
also to pharmacists and clinicians, as it provides valuable summaries of public opinion.
WAI-2022: Workshop on Artificial Intelligence, January 27 – 28, 2022, Chennai, India.
EMAIL: geetha@advancedcomputingresearchsociety.org (Geetha Ganesan)
ORCID: 0000-0001-7338-973X (Geetha Ganesan)
©️ 2022 Copyright for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)
2. Problem Statement
Recommendation based on sentiment analysis is a considerable challenge, because user-generated
content is expressed in human language in many different ways. Sentiment analysis of healthcare in
general, and of individuals' drug experiences in particular, may shed significant light on how to
improve public health and reach the right decision. In our case, we implement supervised machine
learning algorithms that generate ratings from drug reviews, together with a recommendation model
that suggests a suitable medication for a specific condition.
2.1. Objective
Recommendation techniques aim to provide users with personalized products and services to cope
with the growing problem of online information overload. Studies have applied various methods for
sentiment analysis, and recommender system techniques have been proposed since the mid-1990s.
Many early studies focus on document-level analysis and address e-business, e-government,
e-learning, e-commerce/e-shopping, e-tourism, and so on. However, recommender technologies are
rare in the field of medicine. This project aims to present a drug recommender system that can
considerably reduce the workload of specialists. Machine learning has proved valuable in numerous
applications, and research and development for automation is increasing. In this study, we build a
drug recommendation system that uses patient reviews to predict sentiment using various
vectorization processes such as Manual Feature Analysis, which can help recommend the top drug
for a given disease using different classification algorithms.
3. Proposed Work
Our drug rating generation and recommender system framework consists of five modules, as shown
in Figure 1: the data pre-processing module (including feature extraction), the rating generation
module, the model evaluation module, the dictionary sentiment analysis module, and the
recommendation model module.
Figure 1: Proposed methodologies
3.1. Data Pre-processing
Data cleaning is the process of finding and fixing (or removing) corrupt or inaccurate records in a
data set. It involves identifying missing, incorrect, incomplete, or irrelevant parts of the data and
then adding, modifying, or deleting the dirty or coarse data. Proper data preparation is a necessary
step, both for a valid experiment and for making a dataset usable by machine learning methods at
all. A number of pre-processing steps are needed so that the machine learning system and
algorithms can read and analyse the data, and to reduce the dataset to the essential items and
attributes for the analysis. Similarly, creating or computing additional attributes from the data can
be important if such derived attributes help the analysis and thereby allow better predictions.
Because we use social media data, the data sets must be cleaned carefully. Social media data cannot
all be processed in a single way, so we applied our own techniques to clean the data for proper
sentiment analysis.
We used the following techniques for pre-processing our drug dataset:
• Tokenization
• Stop word removal
• Handling negative adjectives
• Stemming
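The four steps listed above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the experiments: the stop-word set, the negation cues, and the `not_` prefix used to mark negated words are assumptions made for the example, and stemming is left as a pluggable function so the sketch runs without external libraries.

```python
import re

# Illustrative stand-in for a real stop-word list (assumption).
STOP_WORDS = {"a", "an", "the", "is", "was", "it", "this", "i", "and", "to", "of"}
# Negation cues that are merged into the next content word (assumption).
NEGATIONS = {"not", "no", "never"}


def preprocess(review, stem=lambda word: word):
    """Return the cleaned tokens of one review."""
    # Tokenization: lowercase the text, then keep runs of letters only.
    tokens = re.findall(r"[a-z]+", review.lower())

    cleaned = []
    negate = False
    for token in tokens:
        if token in NEGATIONS:
            negate = True          # remember the negation for the next content word
            continue
        if token in STOP_WORDS:
            continue               # stop word removal
        token = stem(token)        # stemming (identity unless a stemmer is passed)
        cleaned.append("not_" + token if negate else token)
        negate = False
    return cleaned


print(preprocess("This drug was not effective and the pain never stopped."))
```

A real stemmer can be plugged in through the `stem` argument, for example the `stem` method of an NLTK `PorterStemmer` instance.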
3.2. Feature Extraction
Machines cannot interpret characters and words directly. So, when dealing with text data, we need
to represent it as numbers that the machine can process.
Count Vectorizer
Count Vectorizer is a technique for converting text into numerical data. It creates a matrix in
which every unique word is represented by a column, and every text sample from the document
collection is a row. The value of each cell is simply the count of the word in that particular text
sample.
CountVectorizer is a tool provided by the scikit-learn library in Python. It transforms a given text
into a vector based on the frequency (count) of each word that occurs in the entire text. This is
useful when we have multiple such texts and wish to convert each of them into a vector for use in
further text analysis. Count vectorization involves counting the number of times each word appears
in a document (a distinct text such as an article, a book, or even a paragraph). It also enables
pre-processing of the text data before the vector representation is generated. This functionality
makes it a highly flexible feature representation module for text. Count Vectorizer makes it easy
for text data to be used directly in machine learning and deep learning models, for example for text
classification.
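To make the word-count matrix concrete, the following pure-Python sketch builds one by hand. It is a simplified stand-in for scikit-learn's CountVectorizer, not its implementation: the function name `count_vectorize` and the whitespace tokenization are assumptions for illustration, and `max_features` keeps only the most frequent words, in the spirit of the parameter of the same name in scikit-learn.

```python
from collections import Counter


def count_vectorize(texts, max_features=None):
    """Return (vocabulary, matrix) for a list of whitespace-tokenized texts."""
    tokenized = [text.lower().split() for text in texts]

    # Corpus-wide frequencies decide which words survive the max_features cut.
    totals = Counter(word for tokens in tokenized for word in tokens)
    kept = [word for word, _ in totals.most_common(max_features)]
    vocabulary = sorted(kept)          # one column per kept word, alphabetical

    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # One row per text; a word that is absent from the text counts as 0.
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix


vocab, X = count_vectorize(["pain relief fast", "no relief no sleep"])
print(vocab)
print(X)
```

Each row of `X` corresponds to one review and each column to one word of `vocab`, which is the layout described in the text.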
3.3. Method
The three supervised machine learning algorithms used to generate ratings from drug reviews,
alongside the recommendation model that suggests a suitable medication for a specific condition,
are as follows:
Decision Tree (DT)
The decision tree is one of the most widely used hierarchical models for supervised learning. It
identifies local regions through a sequence of recursive splits at decision nodes, each of which
applies a test function. The intuition behind the decision tree algorithm is simple, yet still very
powerful. It splits the data into two subsets so that the data in each subset is more homogeneous
(ideally, all data in the subset belongs to the same target class) than in the parent subset. The two
subsets can then be split again until the homogeneity criterion or another stopping criterion is met.
While growing the decision tree, the same feature can be used for splitting at many nodes. The
ultimate goal of a split is to find the right feature and the right threshold to maximize the
homogeneity of the resulting subgroups/branches.
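The search for a threshold that makes the two subsets as homogeneous as possible can be sketched as follows. The paper does not state which homogeneity measure was used, so Gini impurity is an assumption here, and the feature values and labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 for a pure subset, larger for mixed subsets."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_threshold(values, labels):
    """Try each observed value as a threshold; return (threshold, weighted impurity)."""
    best = (None, gini(labels))        # baseline: no split at all
    for threshold in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        if not left or not right:
            continue                   # a split must produce two non-empty subsets
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


# Hypothetical feature: count of the word "great" per review; label: rating class.
counts = [0, 0, 1, 2, 3]
ratings = [1, 1, 1, 5, 5]
print(best_threshold(counts, ratings))
```

A full decision tree repeats this search over every feature and then recurses into the two resulting subsets until a stopping criterion is met.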
Naïve-Bayes (NB)
Naïve Bayes is a classification technique based on Bayes' Theorem with an assumption of
independence among predictors. In simple terms, a Naïve Bayes classifier assumes that the presence
of a particular feature in a class is unrelated to the presence of any other feature. A Naïve Bayes
model is easy to build and particularly useful for very large data sets. Along with its simplicity,
Naïve Bayes is known to outperform even highly sophisticated classification methods. Two
common variants are:
1. Gaussian naïve Bayes
2. Multinomial naïve Bayes
Bayes' theorem provides a way of calculating the posterior probability P(c|x) from P(c), P(x), and
P(x|c):
P(c|x) = (P(x|c)*P(c))/P(x) (1)
where P(c|x) is the posterior probability of the class (c, target) given the predictor (x, attributes),
P(c) is the prior probability of the class, P(x|c) is the likelihood, that is, the probability of the
predictor given the class, and P(x) is the prior probability of the predictor. When the independence
assumption holds, a Naïve Bayes classifier performs better than other models such as logistic
regression, and it needs less training data. It performs well with categorical input variables
compared with numerical variables. For numerical variables, a normal distribution is assumed (a
bell curve, which is a strong assumption).
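As a small worked example of Equation (1), the snippet below computes the posterior probability of a rating class given that a review contains a particular word. All counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def posterior(likelihood, prior, evidence):
    """Equation (1): P(c|x) = P(x|c) * P(c) / P(x)."""
    return likelihood * prior / evidence


# Hypothetical counts from 100 reviews: 40 are rated 5, and the word "great"
# appears in 30 reviews overall, 24 of which are rated 5.
p_c = 40 / 100            # P(c): prior probability of the class "rating 5"
p_x = 30 / 100            # P(x): prior probability of the predictor "great"
p_x_given_c = 24 / 40     # P(x|c): likelihood of "great" within class 5

p_c_given_x = posterior(p_x_given_c, p_c, p_x)
print(round(p_c_given_x, 2))   # equals 24/30, since 24 of the 30 "great" reviews are rated 5
```

A Naïve Bayes classifier applies the same rule to every feature, multiplies the per-feature likelihoods under the independence assumption, and picks the class with the highest posterior.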
Support Vector Machine (SVM)
The SVM concept is based on the Structural Risk Minimization principle of computational learning
theory [24] and is arguably one of the most robust and accurate methods used in machine learning.
In this approach, the data is analysed and the decision boundaries are defined by hyperplanes. For
data that cannot be easily separated, it uses four kernel functions for classification tasks, namely
linear, polynomial, radial basis, and sigmoid functions, which map the input data into a
high-dimensional feature space in which the data becomes separable. The hyperplane separates the
text vectors of each class such that the margin is kept as wide as possible.
Linear SVC is analogous to SVC with the parameter kernel='linear'. Learning the hyperplane in a
linear SVM is done by transforming the problem using linear algebra. A key insight is that, rather
than using the observations themselves, the linear SVM can be rephrased using the inner product of
any two observations. The inner product between two vectors is the sum of the products of each
pair of input values. The equation for making a prediction for a new input, using the dot product
between the input (X) and each support vector (Xi), is as follows:
F(X) = B0 + ∑ (Ai * (X, Xi)) (2)
Equation (2) involves computing the inner products of a new input vector (X) with all support
vectors in the training data. The coefficients B0 and Ai (one for each input) must be estimated from
the training data by the learning algorithm. The dot product is called the kernel and can be
rewritten as:
K(X, Xi) = ∑ (X * Xi) (3)
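Equations (2) and (3) translate almost directly into code. The sketch below is only an illustration of the prediction step: the support vectors, coefficients, and bias are hypothetical numbers, whereas a trained SVM would estimate them from the training data.

```python
def linear_kernel(x, xi):
    """Equation (3): sum of element-wise products (the dot product)."""
    return sum(a * b for a, b in zip(x, xi))


def decision_function(x, support_vectors, coefficients, bias):
    """Equation (2): bias plus the coefficient-weighted kernel values."""
    return bias + sum(
        a_i * linear_kernel(x, x_i)
        for a_i, x_i in zip(coefficients, support_vectors)
    )


# Hypothetical values; in practice they are estimated from the training data.
support_vectors = [[1.0, 0.0], [0.0, 2.0]]
coefficients = [0.5, -0.25]
bias = 0.1

score = decision_function([2.0, 2.0], support_vectors, coefficients, bias)
print("positive side" if score > 0 else "negative side")
```

The sign of the score indicates on which side of the hyperplane the new input falls.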
The kernel defines the similarity or distance between new data and the support vectors. The dot
product is the similarity measure used for linear SVM, or a linear kernel, because the distance is a
linear combination of the inputs [24].
Recommendation Model
We are familiar with predicting the "rating" or "preference" a user would give to an item.
Recommendation engines are data filtering tools that use algorithms and data to recommend the
most relevant items to a particular user. In other words, they are an automated form of a "shop
counter assistant". When you ask him for an item, he shows not only that item but also related
items you might purchase. They are well trained in cross-selling and up-selling. As the amount of
data on the Internet grows and the number of users has increased significantly, it is important to
search, map, and provide users with the relevant information they need, according to their
preferences and tastes.
Dictionary Sentiment Analysis
In the dictionary sentiment analysis, we performed sentiment analysis using an emotional word
dictionary to overcome the limitations of the package built with data from movie reviews. To
compensate for this, we used the Harvard emotional dictionary to perform additional sentiment
analysis. First, we counted the number of words in the pre-processed data that are included in the
dictionary and calculated the positive ratio.
Positive Ratio = n(P)/(n(P)+n(N)) (4)
Where:
n(P) is the number of positive words in the review and
n(N) is the number of negative words in the review
If the ratio is less than 0.5, we classify the review as negative, and if it is greater than 0.5, we
classify it as positive. We classify the remaining reviews as neutral, including sentences with no
positive or negative terms.
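The positive ratio of Equation (4) and the three-way classification can be sketched as follows. The two word sets are tiny stand-ins for the positive and negative entries of the dictionary, and the function name is hypothetical.

```python
# Tiny stand-ins for the positive and negative word lists (assumption).
POSITIVE = {"good", "great", "effective", "relief"}
NEGATIVE = {"bad", "pain", "worse", "nausea"}


def dictionary_sentiment(tokens):
    """Return (positive ratio or None, label) for a pre-processed review."""
    n_pos = sum(1 for token in tokens if token in POSITIVE)
    n_neg = sum(1 for token in tokens if token in NEGATIVE)
    if n_pos + n_neg == 0:
        return None, "neutral"         # no dictionary words at all
    ratio = n_pos / (n_pos + n_neg)    # Equation (4)
    if ratio < 0.5:
        return ratio, "negative"
    if ratio > 0.5:
        return ratio, "positive"
    return ratio, "neutral"            # exactly balanced


print(dictionary_sentiment(["great", "relief", "pain"]))
```

Reviews without any dictionary words are handled before the division, which avoids a division by zero and matches the neutral case described above.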
4. Experimental Results
For building this model, we use a dataset of drug reviews. The dataset contains the drug name, the
condition the patient has while using the drug, the date on which the review was collected, the
useful count (the number of people who found the review helpful), the rating given by the user for
the drug, and finally the detailed review written by the user.
Since the rating in the dataset is on a scale of 1-10, we reduced the number of classes a review can
fall into by bringing the rating down to a scale of 1-5 by simple division, as shown below:
Rating in dataset (R)    Converted Rating
10, 9, 8                 5
7, 6                     4
5, 4                     3
3, 2                     1
Figure 2: Removing Stop words
For removing stop words, we used the Natural Language Toolkit (NLTK) in Python. The NLTK
library contains stop words for 16 different languages. Since the reviews are in English, we used the
list of English stop words. Since sets in Python provide better time complexity for membership
tests, we converted the list into a Python set before checking whether a word is a stop word.
For tokenization, we used the regular expression (re) module in Python. Also, instead of dealing
with uppercase and lowercase letters separately, we converted all uppercase letters to lowercase.
For stemming, we used the Porter Stemmer algorithm in the NLTK module. Porter Stemmer is a
widely used algorithm for stemming words in the English language.
The following image shows all the steps we used for pre-processing:
Figure 3: Pre Processing
We stored the result of the pre-processing steps in a Python list named 'corpus'. The 'corpus' list
contains the refined reviews, which are then used in the subsequent steps.
After pre-processing, we extracted features using the CountVectorizer from the scikit-learn
library. We limited the maximum number of features to 10,000, a round figure chosen to eliminate
the less frequent words that are unlikely to be useful. Also, considering the size of the dataset,
which contains approximately 53,000 reviews, we used only the first 10,400 reviews.
Figure 4: Features Extraction
Also, we split the data into train data and test data in the ratio 7:3. The train data is used to build
the model and the test data is used to test the accuracy of the model.
Figure 5: Train data and test data
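A 7:3 split can be sketched in plain Python as shown below. Whether the data was shuffled before splitting is not stated, so the shuffle and the fixed seed are assumptions made for reproducibility, and the function name is hypothetical.

```python
import random


def split_7_to_3(rows, labels, seed=0):
    """Shuffle indices reproducibly, then cut after 70% of the data."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = len(indices) * 7 // 10       # integer arithmetic avoids rounding surprises
    train, test = indices[:cut], indices[cut:]
    return (
        [rows[i] for i in train], [rows[i] for i in test],
        [labels[i] for i in train], [labels[i] for i in test],
    )


rows = [[i] for i in range(10)]        # hypothetical feature rows
labels = list(range(10))               # hypothetical ratings
X_train, X_test, y_train, y_test = split_7_to_3(rows, labels)
print(len(X_train), len(X_test))
```

Rows and labels are indexed with the same shuffled positions, so each review stays paired with its rating in both parts.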
We used three classification algorithms for generating ratings: the Gaussian Naïve Bayes
Classifier, the Decision Tree Classifier, and the Support Vector Classifier.
Gaussian Naïve Bayes Classifier
Figure 6: Gaussian Naïve Bayes Classifier
Decision Tree Classifier
Figure 7: Decision Tree Classifier
Support Vector Classifier
Figure 8: Support Vector Machine
Of all 3 classifiers, the Naïve Bayes classifier gave the best accuracy of 60.41%, followed by
Support Vector Classifier and Decision Tree Classifier.
Table 1: Accuracy
Classifier                   Accuracy (%)
Naïve Bayes                  60.41
Decision Tree Classifier     56.63
Support Vector Classifier    56.83
For dictionary sentiment analysis, we split the Harvard emotional dictionary CSV file into two
files, one consisting of positive words and the other consisting of negative words. We imported the
CSV files and stored their contents in two Python sets. We then added code to count the number of
positive and negative words in each review and to calculate the dictionary sentiment polarity of the
review.
Figure 9: Calculate Dictionary Sentiment polarity
Figure 10: Dictionary Sentiment polarity calculation
In the next step, we used the formula shown below to generate the score of each drug for a
particular condition.
Score = (Dictionary Sentiment Polarity * usefulCount) + generatedRating
The implementation is shown below:
Figure 11: Score Calculation
Next, we created a dictionary containing the conditions, the drug names, and the mean score of
each drug for the specified condition.
Figure 12: Creating dictionary
The last step is to take the list of conditions the patient is suffering from and to recommend the
top three drugs for each condition, along with the recommended score for each drug.
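The scoring, grouping, and recommendation steps can be sketched together as follows. This is a hedged illustration rather than the code shown in the figures: the record keys `condition`, `drugName`, and `polarity` are hypothetical, the numeric encoding of polarity (+1, 0, -1) is an assumption, and the sample reviews are invented.

```python
from collections import defaultdict


def recommend(reviews, conditions, top_n=3):
    """Rank drugs per condition by the mean review score."""
    scores = defaultdict(lambda: defaultdict(list))
    for r in reviews:
        # Score = (Dictionary Sentiment Polarity * usefulCount) + generatedRating
        score = r["polarity"] * r["usefulCount"] + r["generatedRating"]
        scores[r["condition"]][r["drugName"]].append(score)

    result = {}
    for condition in conditions:
        means = {
            drug: sum(values) / len(values)
            for drug, values in scores.get(condition, {}).items()
        }
        ranked = sorted(means.items(), key=lambda item: item[1], reverse=True)
        result[condition] = ranked[:top_n]   # highest mean score first
    return result


# Invented sample records; polarity is assumed to be +1, 0, or -1.
reviews = [
    {"condition": "Acne", "drugName": "DrugA", "polarity": 1, "usefulCount": 10, "generatedRating": 5},
    {"condition": "Acne", "drugName": "DrugA", "polarity": -1, "usefulCount": 2, "generatedRating": 2},
    {"condition": "Acne", "drugName": "DrugB", "polarity": 1, "usefulCount": 4, "generatedRating": 4},
    {"condition": "Pain", "drugName": "DrugC", "polarity": 0, "usefulCount": 7, "generatedRating": 3},
]
print(recommend(reviews, ["Acne"]))
```

A condition with no reviews yields an empty list, so the caller can distinguish "no data" from a low score.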
4.1. Input and Output Parameters
Reducing the Scale of Ratings:
INPUT : Data frame with ratings on the scale of 1-10
OUTPUT : Data frame with ratings on the scale of 1-5
Pre-Processing
INPUT : Data frame with unprocessed data
OUTPUT : A Python list containing the reviews that are tokenized, stemmed and
free from stop words.
Count Vectorizer:
INPUT : The Python list containing reviews.
OUTPUT : A matrix in which each cell contains the number of
occurrences of a word (column) in each review (row)
Naïve Bayes Classifier:
INPUT : The count vectorizer matrix with 7000 rows
OUTPUT : Naïve Bayes classifier
Decision Tree Classifier:
INPUT : The count vectorizer matrix with 7000 rows
OUTPUT : Decision Tree classifier
Support Vector Classifier:
INPUT : The count vectorizer matrix with 7000 rows
OUTPUT : Support Vector classifier
Dictionary Sentiment Polarity:
INPUT : Data frame containing reviews
OUTPUT : Data frame containing Dictionary Sentiment Polarity of each review.
Calculating Score of Each Review:
INPUT : Data frame containing Dictionary Sentiment Polarity and ratings
OUTPUT : Data frame containing score for each review.
Grouping Conditions and Drugs:
INPUT : Data frame containing score of each review
OUTPUT : A Python Dictionary containing conditions, drugs and the mean
score of drugs
Recommending Drugs:
INPUT : A Python list containing the conditions the patient has.
OUTPUT : Recommended Drugs in decreasing order of their scores
4.2. Implementation Results
Reducing the Scale of Ratings
Figure 13: Reducing the scale of ratings
Pre-Processing:
After pre-processing the following review, the refined review generated is shown below:
“This med was given as a result of a deep gouge from a dog nail. healing was not occurring after 4
weeks including a trip to an ambulatory care. my doc said they treated it incorrectly. he prescribed
this. i have an appointment at the wound center Tues. dr also said it needed debriding. after 5 days i
see no improvement. if anything, the area is more red and sore.”
Figure 14: Import text wrap
Naïve Bayes Classifier, Decision Tree Classifier, Support Vector Classifier:
Figure 15: Accuracy measures
Dictionary Sentiment Polarity:
Figure 16: Dictionary sentiment Polarity
Calculating Score of Each Review:
Figure 17: Score calculation
Grouping Conditions and Drugs:
Figure 18: Grouping Conditions
Recommending Drugs:
Figure 19: Recommending drugs
5. Conclusion
Finally, the Naïve Bayes model is selected for rating generation to obtain a good trade-off among
model accuracy (60.41%), model efficiency, and model scalability, and its output is used in the
Hybrid Recommendation Model to list suitable medicines.
• In addition, we conducted sentiment analysis using an emotional word dictionary to overcome
the limitations of the drug data used.
• Finally, this study shows that the sentiment values contribute significantly to the prediction
of drug ratings as well as to the recommendations. It also shows significant improvements on a
real-world dataset compared to current methods.
6. Future Scope
The scope for future work is that, when evaluating the context, we can find more linguistic rules,
and to incorporate phrase-level sentiment analysis, we may adapt or build hybrid factorization
models such as tensor factorization, or deep learning techniques. The project can also be extended
to further improve the accuracy and reliability of the recommendation model.
7. References
[1] B. Liu, Sentiment Analysis (Introduction and Survey) and Opinion Mining. 2012.
[2] X. Lei, X. Qian, and G. Zhao, “Rating Prediction Based on Social Sentiment from Textual
Reviews,” IEEE Trans. Multimed., vol. 18, no. 9, pp. 1910–1921, Sep. 2016, doi:
10.1109/TMM.2016.2575738.
[3] Y. Bao and X. Jiang, “An intelligent medicine recommender system framework,” in Proceedings
of the 2016 IEEE 11th Conference on Industrial Electronics and Applications, ICIEA 2016, Oct.
2016, pp. 1383–1388, doi: 10.1109/ICIEA.2016.7603801.
[4] R. Majethia, V. Mishra, A. Singhal, K. Lakshmi Manasa, K. Sahiti, and V. Nandwani,
“PeopleSave: Recommending effective drugs through web crowdsourcing,” in 2016 8th
International Conference on Communication Systems and Networks, COMSNETS 2016, Mar.
2016, doi: 10.1109/COMSNETS.2016.7440000.
[5] R. C. Chen, Y. H. Huang, C. T. Bau, and S. M. Chen, “A recommendation system based on
domain ontology and SWRL for anti-diabetic drugs selection,” Expert Syst. Appl., vol. 39, no. 4,
pp. 3995–4006, Mar. 2012, doi: 10.1016/j.eswa.2011.09.061.
[6] J.-C. Na and W. Y. M. Kyaing, “Sentiment Analysis of User-Generated Content on Drug Review
Websites,” J. Inf. Sci. Theory Pract., vol. 3, no. 1, pp. 6–23, Mar. 2015, doi:
10.1633/jistap.2015.3.1.1.
[7] M. E. Basiri, M. Abdar, M. A. Cifci, S. Nemati, and U. R. Acharya, “A novel method for
sentiment classification of drug reviews using fusion of deep and machine learning techniques,”
Knowledge-Based Syst., vol. 198, p. 105949, Jun. 2020, doi: 10.1016/j.knosys.2020.105949.
[8] S. Vijayaraghavan and D. Basu, “Sentiment Analysis in Drug Reviews using Supervised
Machine Learning Algorithms,” arXiv, Mar. 2020, Accessed: Nov. 20, 2020. [Online].
Available: http://arxiv.org/abs/2003.11643.
[9] V. Doma et al., “Automated Drug Suggestion Using Machine Learning,” in Advances in
Intelligent Systems and Computing, Mar. 2020, vol. 1130 AISC, pp. 571–589, doi:
10.1007/978-3-030-39442-4_42.
[10] A. A. Hamed, R. Roose, M. Branicki, and A. Rubin, “TRecs: Time-aware twitter-based drug
recommender system,” in Proceedings of the 2012 IEEE/ACM International Conference on
Advances in Social Networks Analysis and Mining, ASONAM 2012, 2012, pp. 1027–1031, doi:
10.1109/ASONAM.2012.178.
[11] C. Chen, L. Zhang, X. Fan, Y. Wang, C. Xu, and R. Liu, “A epilepsy drug recommendation
system by implicit feedback and crossing recommendation,” in Proceedings - 2018 IEEE
SmartWorld, Ubiquitous Intelligence and Computing, Advanced and Trusted Computing,
Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People
and Smart City Innovations, SmartWorld/UIC/ATC/ScalCom/CBDCo, 2018, doi:
10.1109/SmartWorld.2018.00197.
[12] D. Chen, D. Jin, T. T. Goh, N. Li, and L. Wei, “Context-Awareness Based Personalized
Recommendation of Anti-Hypertension Drugs,” J. Med. Syst., vol. 40, no. 9, pp. 1–10, Sep.
2016, doi: 10.1007/s10916-016-0560-z.
[13] A. Gottlieb, G. Y. Stein, E. Ruppin, R. B. Altman, and R. Sharan, “A method for inferring
medical diagnoses from patient similarities,” BMC Med., vol. 11, no. 1, p. 194, Sep. 2013, doi:
10.1186/1741-7015-11-194.
[14] K. Shimada, K. Fujikawa, K. Yahara, and T. Nakamura, “Antioxidative Properties of Xanthan on
the Autoxidation of Soybean Oil in Cyclodextrin Emulsion,” 1992. Accessed: Jul. 29, 2020.
[Online]. Available: https://pubs.acs.org/sharingguidelines.
[15] Q. Zhang, G. Zhang, J. Lu, and D. Wu, “A framework of hybrid recommender system for
personalized clinical prescription,” in Proceedings - The 2015 10th International Conference on
Intelligent Systems and Knowledge Engineering, ISKE 2015, Jan. 2016, pp. 189–195, doi:
10.1109/ISKE.2015.98.