
CNN based Capsule Model for Sentiment Analysis
M.E. Dissertation Presentation
Dated: 5 July 2018

Presented by:
Manish Bisht, ME-CSE, SID: 16205012

Supervisor(s):
Dr. Poonam Saini, Assistant Professor
Prof. Anoop Dobhal, Assistant Professor

Department of Computer Science and Engineering
Punjab Engineering College (Deemed to be University), Chandigarh
Outline

⊡ Introduction
⊡ Literature Review
□ Literature survey
□ Research gaps
⊡ Methodology and Problem statement
□ Motivation
□ Problem statement
□ Objectives
□ Methodology
⊡ Implementation
⊡ Simulation
□ Simulation details
□ Simulation Results
⊡ Conclusion and future scope
⊡ References

1. Introduction

Introduction

⊡ Sentiment analysis is the process of computationally identifying and
  categorizing the opinion expressed in a piece of text towards a particular
  topic, product, etc. as positive or negative.

  Examples: "The movie was fabulous!"  "The movie stars Mr. X."  "The movie was horrible!"

Types of Sentiment Analysis (SA)

⊡ Document level
   The main task is to determine the opinion of the whole document; the
   opinion is assumed to be expressed about a single topic.

⊡ Sentence level
   Each sentence is treated as a short document, which can be subjective or
   objective. A subjective (opinionated) sentence expresses sentiment.

⊡ Aspect level
   Allows opinions to be extracted towards individual aspects of entities.

Approaches of SA

Lexicon-based approach

⊡ Dictionary-based approach
   Uses a lexicon that assigns a sentiment score to each term.

⊡ Corpus-based approach
   Identifies opinion words and their polarities in the domain corpus, starting
   from a given seed set of opinion words.
   Builds a new lexicon for the particular domain from another lexicon using a
   domain corpus.

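A toy sketch of the dictionary-based idea above; the lexicon entries and scores are made-up examples, not any standard resource:

```python
# Toy dictionary-based sentiment scorer; lexicon entries are made-up examples.
LEXICON = {"fabulous": 2.0, "good": 1.0, "horrible": -2.0, "bad": -1.0}

def score(tokens):
    """Sum term-level sentiment scores and map the total to a label."""
    total = sum(LEXICON.get(tok.lower(), 0.0) for tok in tokens)
    return "positive" if total > 0 else "negative" if total < 0 else "neutral"

print(score("the movie was fabulous".split()))   # positive
print(score("the movie was horrible".split()))   # negative
```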
Machine Learning approach

⊡ Unsupervised machine learning methods
   Use unlabeled datasets to discover structure and find similar patterns in
   the input data.

⊡ Supervised machine learning methods
   Assume the presence of labeled training data used for the learning process;
   the trained model estimates the output for the input dataset.

2. Literature Review

Literature Survey

1. Efficient Estimation of Word Representations in Vector Space (Mikolov et al., 2013)
   Source: arXiv preprint arXiv:1301.3781
   • Improved vector representation of words.
   • Captures relations between words.

2. GloVe: Global Vectors for Word Representation (Pennington et al., 2014)
   Source: Proceedings of the 2014 Conference on Empirical Methods in Natural
           Language Processing (EMNLP), Qatar
   • Used efficient statistical methods to improve word representation.
   • Faster than word2vec.

3. Twitter Sentiment Classification Using Distant Supervision (Go et al., 2009)
   Source: CS224N Project Report, Stanford
   • Emoticons are used to train the classifier.
   • Novel preprocessing steps were introduced.

Literature Survey

4. Convolutional Neural Networks for Sentence Classification (Kim, 2014)
   Source: arXiv preprint arXiv:1408.5882
   • Shows state-of-the-art performance on 4 NLP tasks using a deep neural network.
   • Initializing the CNN with good word representations improves accuracy.

5. Twitter Sentiment Analysis with Deep Convolutional Neural Networks (Severyn et al., 2015)
   Source: Proceedings of the 38th International ACM SIGIR Conference on Research
           and Development in Information Retrieval, Brazil
   • Initialized the model with word2vec and then tuned it with distant supervision.
   • Produces state-of-the-art results.

6. SemEval-2016 Task 4: Sentiment Analysis in Twitter (Nakov et al., 2016)
   Source: Proceedings of the 10th International Workshop on Semantic Evaluation
           (SemEval-2016), USA
   • An ensemble of two CNNs is used to improve accuracy.
   • Predictions are combined using a random forest classifier.

Literature Survey

7. SemEval-2017 Task 4: Sentiment Analysis in Twitter (Rosenthal et al., 2017)
   Source: Proceedings of the 11th International Workshop on Semantic Evaluation
           (SemEval-2017), Canada
   • An ensemble of several CNNs and LSTMs is used to improve accuracy.

8. Dynamic Routing Between Capsules (Sabour et al., 2017)
   Source: Advances in Neural Information Processing Systems, USA
   • A dynamic routing algorithm is introduced to pass information from one
     level to the next.

9. Sentiment Analysis by Capsules (Wang et al., 2018)
   Source: Proceedings of the 2018 World Wide Web Conference on World Wide Web, USA
   • A hybrid approach of RNN and capsule network is proposed.
   • Achieves state-of-the-art performance on the movie review dataset.

Capsule Network

Figure: Architecture of Capsule Network (Sabour et al., 2017)

Research Gaps

 The pooling layer in a CNN keeps only the most dominant feature. For example,
   a standard CNN treats "you are amazing" and "are you amazing" as sentences
   with the same sentiment.

 Research is mostly limited to textual data.

 A real-time opinion miner can be built.

Research Gap Explained

Figure: Both "You are amazing" and "Are you amazing" pass through convolution +
activation and pooling to identical pooled feature values, illustrating how
pooling discards word order.

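A tiny NumPy illustration of this gap, using made-up feature values: global max pooling over the convolution outputs yields the same vector for both word orders.

```python
# Illustration with made-up feature maps: max pooling keeps only the largest
# activation per filter, so reordered sentences can pool to identical features.
import numpy as np

# rows = positions in the sentence, columns = convolution filters
feat_you_are_amazing = np.array([[0.1, 0.9],
                                 [0.7, 0.2],
                                 [0.3, 0.5]])
feat_are_you_amazing = feat_you_are_amazing[[1, 0, 2]]   # same rows, swapped order

print(feat_you_are_amazing.max(axis=0))   # [0.7 0.9]
print(feat_are_you_amazing.max(axis=0))   # [0.7 0.9] -> identical pooled features
```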
3. Methodology and Problem Statement

Motivation

⊡ Recognizing sentiment is a very natural ability of a human being. Can a
  machine be trained to do it?

⊡ SA aims at extracting sentiment-related knowledge, especially from the huge
  amount of information on the internet.

⊡ It can be used to understand the overall opinion expressed in a set of
  documents.

Problem Statement
To develop a hybrid text classification model using CNN and capsule network
that can improve the accuracy of the sentiment classification task.

Objectives

⊡ To study feature selection methods for text classification and to investigate
  machine learning algorithms that can be applied to the classification problem.

⊡ To analyze the accuracy of existing algorithms on different datasets and to
  analyze their computational time and resource requirements.

⊡ To evaluate the results of the applied techniques and propose an improved
  model based on the results obtained.

Methodology

4. Implementation

Preprocessing

⊡ Remove URLs, special characters

⊡ Stemming and lemmatization

⊡ Tokenization
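A minimal sketch of these steps with NLTK (listed later under libraries); the exact pipeline used in the dissertation is not shown on this slide, and the `clean` helper below is hypothetical:

```python
# Hypothetical preprocessing sketch for the steps above: URL/special-character
# removal, lemmatization, and tokenization. Not the exact dissertation pipeline.
import re
import nltk
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

nltk.download("punkt", quiet=True)
nltk.download("wordnet", quiet=True)

lemmatizer = WordNetLemmatizer()

def clean(text):
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)    # remove URLs
    text = re.sub(r"[^a-zA-Z\s]", " ", text)               # remove special characters
    tokens = word_tokenize(text.lower())                   # tokenization
    return [lemmatizer.lemmatize(tok) for tok in tokens]   # lemmatization

print(clean("The movie was fabulous! http://example.com"))
```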

Word Embeddings Model

⊡ Word embedding vectors initialized with GloVe are used.

⊡ The embedding vectors are pre-trained on an unlabeled corpus of 840 billion
  tokens, with embedding dimension 300.

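One possible way to load these vectors with Gensim (the libraries slide later mentions transforming GloVe to word2vec format); the file names below are assumptions:

```python
# Hypothetical sketch: convert the GloVe 840B/300d vectors to word2vec format
# and load them with Gensim. File names are assumptions.
from gensim.scripts.glove2word2vec import glove2word2vec
from gensim.models import KeyedVectors

glove2word2vec("glove.840B.300d.txt", "glove.840B.300d.w2v.txt")
embeddings = KeyedVectors.load_word2vec_format("glove.840B.300d.w2v.txt")

print(embeddings["movie"].shape)   # (300,) -> 300-dimensional vectors
```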
Proposed System Model

Figure: Architecture of the CNN based Capsule Model, with a sentiment capsule layer

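A rough Keras sketch of the front end implied by this figure (embedding, convolution, primary capsules); the hyperparameters are illustrative assumptions, and the sentiment capsule on top would apply the dynamic routing described two slides later:

```python
# Hypothetical Keras sketch of the CNN front end feeding primary capsules.
# Vocabulary size, filter count, and capsule dimension are illustrative assumptions.
from keras import layers, models, backend as K

VOCAB_SIZE, EMBED_DIM, MAX_LEN = 20000, 300, 50
NUM_FILTERS, CAPSULE_DIM = 256, 8

def squash(s, axis=-1):
    # Squashing non-linearity: keeps direction, maps vector length into [0, 1).
    sq_norm = K.sum(K.square(s), axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / K.sqrt(sq_norm + K.epsilon())

words = layers.Input(shape=(MAX_LEN,), dtype="int32")
x = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(words)        # GloVe-initialized in practice
x = layers.Conv1D(NUM_FILTERS, kernel_size=5, activation="relu")(x)
x = layers.Reshape((-1, CAPSULE_DIM))(x)                  # group conv features into capsules
primary_capsules = layers.Lambda(squash)(x)               # primary capsule outputs
model = models.Model(words, primary_capsules)
model.summary()
```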
Training Operation

Dynamic Routing by Agreement

Algorithm: the following steps are repeated for r iterations.

1. A softmax is taken over the routing logits of each primary capsule i:

   c_i = softmax(b_i)

2. A weighted sum of the prediction vectors is computed for each sentiment capsule j:

   s_j = Σ_i c_ij û_(j|i)

3. The weighted sum is squashed to obtain the output of sentiment capsule j:

   v_j = squash(s_j)

4. The routing logits are updated by the agreement between prediction and output:

   b_ij = b_ij + û_(j|i) · v_j

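A small NumPy sketch of these four steps; the shapes (number of primary capsules, sentiment classes, capsule dimension) are illustrative assumptions:

```python
# Hypothetical NumPy sketch of dynamic routing by agreement (Sabour et al., 2017)
# following the four steps above. Shapes are illustrative assumptions.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def squash(s, axis=-1):
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + 1e-8)

def dynamic_routing(u_hat, r=3):
    """u_hat: prediction vectors û_(j|i), shape (num_primary, num_classes, dim)."""
    num_primary, num_classes, _ = u_hat.shape
    b = np.zeros((num_primary, num_classes))           # routing logits b_ij
    for _ in range(r):
        c = softmax(b, axis=1)                         # step 1: coupling coefficients
        s = (c[..., None] * u_hat).sum(axis=0)         # step 2: weighted sum s_j
        v = squash(s)                                  # step 3: capsule outputs v_j
        b = b + (u_hat * v[None, :, :]).sum(axis=-1)   # step 4: agreement update
    return v

v = dynamic_routing(np.random.randn(10, 2, 16))
print(np.linalg.norm(v, axis=-1))   # capsule lengths act like class scores
```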
5. Simulation Details

Dataset

⊡ The movie review dataset, collected from www.rottentomatoes.com, is used.

⊡ It contains 5331 positive and 5331 negative reviews.

⊡ 10% of the data is used as the test set.

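A minimal sketch of such a 90/10 split; scikit-learn is an assumption here (the deck does not list it), and the placeholder lists stand in for the actual reviews:

```python
# Hypothetical sketch of the 90/10 train/test split described above.
# scikit-learn and the placeholder review lists are assumptions.
from sklearn.model_selection import train_test_split

texts = ["a fabulous movie"] * 5331 + ["a horrible movie"] * 5331   # placeholder reviews
labels = [1] * 5331 + [0] * 5331

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.10, stratify=labels, random_state=42)

print(len(X_train), len(X_test))   # roughly a 9:1 split of the 10,662 reviews
```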
Python (3.6)

Library            Used for

Keras              Inbuilt CNN and RNN layers; GPU support; inbuilt activation functions
TensorFlow         Efficient matrix multiplication
Dill               Serializing objects
Gensim             Inbuilt word2vec model; transforming GloVe to word2vec format
NLTK               Preprocessing of data: tokenization, removing stop words, URLs and
                   special characters
Multiprocessing    Parallel computing; running many operations in parallel
Pandas             Importing and analyzing data
NumPy              Computing and manipulating the dataset

Jupyter Notebook

⊡ A client–server platform that allows Python code to be written and executed
  in a web browser.

⊡ Can be accessed remotely over the web.

⊡ Virtual machine (VM) instances are taken from Google Cloud and accessed
  through their external IP.

Google Cloud

Resource name             Configuration

No. of CPUs               8
RAM per CPU               3.75 GB
Total RAM used            30 GB
No. of GPUs               1
Total GPU RAM             12 GB
Secondary memory used     30 GB

6. Simulation Results

Confusion matrix

n = 1064            Predicted Positive    Predicted Negative

Actual Positive     TP = 437              FN = 95
Actual Negative     FP = 85               TN = 447

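The performance parameters on the next slide follow directly from this matrix; a quick sanity check in Python (rounded values may differ slightly from those reported):

```python
# Sanity check: metrics derived from the confusion matrix above (n = 1064).
TP, FN, FP, TN = 437, 95, 85, 447
n = TP + FN + FP + TN

accuracy    = (TP + TN) / n                 # ~0.83
error_rate  = (FP + FN) / n                 # ~0.17
recall      = TP / (TP + FN)                # true positive rate, ~0.82
fpr         = FP / (FP + TN)                # false positive rate, ~0.16
specificity = TN / (TN + FP)                # ~0.84
precision   = TP / (TP + FP)                # ~0.84
prevalence  = (TP + FN) / n                 # 0.50
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f} recall={recall:.3f} "
      f"precision={precision:.3f} f1={f1:.3f}")
```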
Performance Parameters

Parameter                  Value (percent)

Accuracy                   83
Misclassification rate     16.8
True positive rate         82
False positive rate        16
Specificity                84
Precision                  83
Prevalence                 50
F1 score                   84

Accuracy

Model          Movie Review (MR)

RAE            77.7
RNTN           75.9
LSTM           77.4
Bi-LSTM        79.3
LR-LSTM        81.5
LR-Bi-LSTM     82.1
Tree-LSTM      80.7
CNN            81.5
NCSL           82.9
CNN-Capsule    83.05
RNN-Capsule    83.8

CNN vs RNN capsule

⊡ CNNs tend to be much faster (roughly 5 times) than RNNs. (Sharan Narang, 2018.
  DeepBench. [ONLINE] Available at: https://github.com/baidu-research/DeepBench.
  [Accessed 5 June 2018].)

⊡ The CNN capsule is about 8 times faster than the RNN capsule.

Convolution benchmark (application: sentiment analysis; processor: Tesla V100 FP32):
  Filter size R = 5, S = 20; No. of filters 32; Padding (h, w) = 0, 0;
  Stride (h, w) = 2, 2; Total time 1.03 ms; Fwd TeraFLOPS 7.75

Recurrent benchmark (application: sentiment analysis; processor: Tesla V100):
  Hidden units 1760; Batch size 16; TimeSteps 50; Recurrent type: Vanilla;
  Total time 8.21 ms; Fwd TeraFLOPS 1.19

Conclusion

⊡ The results show that this simple capsule model achieves good sentiment
  classification accuracy without any carefully designed instance
  representations or linguistic knowledge.

⊡ The proposed CNN based capsule model achieves nearly 83% accuracy.

⊡ The objective of training is to maximize the length of the activation vector
  of the correct sentiment capsule while minimizing its reconstruction loss.

Future Work

⊡ The training data must be large enough to train the system properly, so that
  the system can be initialized to produce accurate results.

⊡ The preprocessing steps can be altered or extended to improve the accuracy of
  the model.

⊡ The parameters of the model can be tuned to improve accuracy.

References

⊡ Bautin, M., Vijayarenu, L. and Skiena, S., 2008, April. International Sentiment
Analysis for News and Blogs. In ICWSM.
⊡ Beineke, P., Hastie, T., Manning, C. and Vaithyanathan, S., 2004. Exploring
sentiment summarization. In Proceedings of the AAAI spring symposium on
exploring attitude and affect in text: theories and applications (Vol. 39, pp. 1-4).
⊡ Bojanowski, P., Grave, E., Joulin, A. and Mikolov, T., 2016. Enriching word vectors
with subword information. arXiv preprint arXiv:1607.04606.
⊡ Bollen, J., Mao, H. and Zeng, X., 2011. Twitter mood predicts the stock market.
Journal of computational science, 2(1), pp.1-8.
⊡ Cliche, M., 2017. BB_twtr at SemEval-2017 Task 4: Twitter Sentiment Analysis with
CNNs and LSTMs. arXiv preprint arXiv:1704.06125.
⊡ Crossley, S.A., Kyle, K. and McNamara, D.S., 2017. Sentiment Analysis and Social
Cognition Engine (SEANCE): An automatic tool for sentiment, social cognition, and
social-order analysis. Behavior research methods, 49(3), pp.803-821.

⊡ Deriu, J., Gonzenbach, M., Uzdilli, F., Lucchi, A., Luca, V.D. and Jaggi, M., 2016.
Swisscheese at semeval-2016 task 4: Sentiment classification using an ensemble of
convolutional neural networks with distant supervision. In Proceedings of the 10th
International Workshop on Semantic Evaluation (No. EPFL-CONF-229234, pp.
1124-1128).
⊡ Go, A., Bhayani, R. and Huang, L., 2009. Twitter sentiment classification using
distant supervision. CS224N Project Report, Stanford, 1(12).
⊡ Goodfellow, I., Bengio, Y., Courville, A. and Bengio, Y., 2016. Deep learning (Vol. 1).
Cambridge: MIT press.
⊡ Horrigan, J.A., 2008. Online shopping. Pew Internet & American Life Project Report,
36, pp.1-24.
⊡ Kalchbrenner, N., Grefenstette, E. and Blunsom, P., 2014. A convolutional neural
network for modelling sentences. arXiv preprint arXiv:1404.2188.
⊡ Khan, F.H., Bashir, S. and Qamar, U., 2014. TOM: Twitter opinion mining framework
using hybrid classification scheme. Decision Support Systems, 57, pp.245-257.

⊡ Kim, P., 2006. The forrester wave: Brand monitoring, Q3 2006. Forrester Wave
(white paper).
⊡ Kim, Y., 2014. Convolutional neural networks for sentence classification. arXiv
preprint arXiv:1408.5882.
⊡ Kingma, D.P. and Ba, J., 2014. Adam: A method for stochastic optimization. arXiv
preprint arXiv:1412.6980.
⊡ Lei, T., Barzilay, R. and Jaakkola, T., 2015. Molding cnns for text: non-linear, non-
consecutive convolutions. arXiv preprint arXiv:1508.04112.
⊡ Liu, B., 2012. Sentiment analysis and opinion mining (synthesis lectures on human
language technologies). Morgan & Claypool Publishers, 5(1), pp.1-67.
⊡ Mikolov, T., 2012. Statistical language models based on neural networks.
Presentation at Google, Mountain View, 2nd April.
⊡ Mikolov, T., Chen, K., Corrado, G. and Dean, J., 2013a. Efficient estimation of word
  representations in vector space. arXiv preprint arXiv:1301.3781.

⊡ Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S. and Dean, J., 2013b. Distributed
  representations of words and phrases and their compositionality. In Advances in
  neural information processing systems (pp. 3111-3119).
⊡ O'Connor, B., Balasubramanyan, R., Routledge, B.R. and Smith, N.A., 2010. From
tweets to polls: Linking text sentiment to public opinion time series. Icwsm, 11(122-
129), pp.1-2.
⊡ Pak, A. and Paroubek, P., 2010, May. Twitter as a corpus for sentiment analysis
and opinion mining. In LREc (Vol. 10, No. 2010).
⊡ Pang, B. and Lee, L., 2005, June. Seeing stars: Exploiting class relationships for
sentiment categorization with respect to rating scales. In Proceedings of the 43rd
annual meeting on association for computational linguistics (pp. 115-124).
Association for Computational Linguistics.
⊡ Pennington, J., Socher, R. and Manning, C., 2014. Glove: Global vectors for word
representation. In Proceedings of the 2014 conference on empirical methods in
natural language processing (EMNLP) (pp. 1532-1543).

⊡ Sabour, S., Frosst, N. and Hinton, G.E., 2017. Dynamic routing between capsules.
  In Advances in Neural Information Processing Systems (pp. 3859-3869).
⊡ Saif, H., He, Y., Fernandez, M. and Alani, H., 2016. Contextual semantics for
sentiment analysis of Twitter. Information Processing & Management, 52(1), pp.5-19.
⊡ Severyn, A. and Moschitti, A., 2015, August. Twitter sentiment analysis with deep
convolutional neural networks. In Proceedings of the 38th International ACM SIGIR
Conference on Research and Development in Information Retrieval (pp. 959-962).
ACM.
⊡ Tai, K.S., Socher, R. and Manning, C.D., 2015. Improved semantic representations
from tree-structured long short-term memory networks. arXiv preprint
arXiv:1503.00075.

⊡ Thakkar, H. and Patel, D., 2015. Approaches for sentiment analysis on twitter: A
state-of-art study. arXiv preprint arXiv:1512.01043.
⊡ Tumasjan, A., Sprenger, T.O., Sandner, P.G. and Welpe, I.M., 2010. Predicting
  elections with twitter: What 140 characters reveal about political sentiment.
  Icwsm, 10(1), pp.178-185.
⊡ Wang, Y., Sun, A., Han, J., Liu, Y. and Zhu, X., 2018, April. Sentiment analysis
  by capsules. In Proceedings of the 2018 World Wide Web Conference on World Wide
  Web (pp. 1165-1174). International World Wide Web Conferences Steering Committee.
⊡ Wu, D.D., Zheng, L. and Olson, D.L., 2014. A decision support approach for online
stock forum sentiment analysis. IEEE transactions on systems, man, and
cybernetics: systems, 44(8), pp.1077-1087.
⊡ Zabin, J. and Jefferies, A., 2008. Social media monitoring and analysis: Generating
consumer insights from online conversation. Aberdeen Group Benchmark Report,
37(9).
⊡ Zhu, X., Kiritchenko, S. and Mohammad, S., 2014. Nrc-canada-2014: Recent
improvements in the sentiment analysis of tweets. In Proceedings of the 8th
international workshop on semantic evaluation (SemEval 2014) (pp. 443-447).