Abstract: Sentiment analysis has become a popular research problem in the NLP field. However, very little research has been conducted on sentiment analysis for Chinese. Progress is held back by the lack of large labeled corpora and powerful models. To remedy this deficiency, we build a Chinese Sentiment Treebank over social data. It contains 13550 labeled sentences drawn from movie reviews. Furthermore, we introduce a novel Recursive Neural Deep Model (RNDM) to predict sentiment labels based on recursive deep learning. We consider the problem of classifying a sentence by its overall sentiment, that is, determining whether a review is positive or negative. In predicting sentiment labels at the sentence level, our model outperforms other commonly used baselines, such as Naive Bayes, Maximum Entropy and SVM, by a large margin.

Keywords: Sentiment analysis; Chinese Sentiment Treebank; recursive deep learning

I. INTRODUCTION

Sentiment analysis is the field of study that analyzes people's opinions, sentiments, evaluations, appraisals, attitudes, and emotions towards entities such as products, services, organizations, individuals, issues, events, topics, and their attributes [1].

With the development of Web 2.0 and the enormous growth of social data on the Internet, sentiment analysis has become a popular research problem. In contrast to Web sites where people are limited to the passive viewing of content, Web 2.0 allows users to express their views and opinions more easily on social networking sites, such as Twitter and Facebook.

The opinion information they leave behind is of great value. For example, by collecting movie reviews, film companies can gather feedback from their customers to further improve their products. Customers can decide which movie is worth watching based on fellow customers' movie reviews. Hence, in recent years, sentiment analysis has become a popular topic for many research communities, including artificial intelligence and natural language processing.

There has been a large amount of prior research in sentiment analysis, especially in the domains of product reviews, movie reviews, and blogs. The work of [2] focused on manually constructing several lexica and rules for both polar words and related content-word negators, such as "prevent cancer", where "prevent" reverses the negative polarity of "cancer". The work of [3] introduced an approach based on CRFs with hidden variables that performs very well. RNTN was employed to predict sentiment labels at the phrase and full-sentence levels [4].

However, relatively little investigation has been conducted on Chinese sentiment analysis. Progress is held back by the lack of large labeled corpora and powerful models. To address this problem, we introduce a Chinese Sentiment Treebank and a powerful recursive deep model that can accurately predict the sentiment label at the sentence level in this new corpus. The corpus is based on movie reviews from social networks, and it is the first Chinese sentiment corpus with labeled parse trees. We introduce a novel recursive neural deep model. Our model represents a sentence through word vectors and a parse tree and then computes vectors for higher nodes in the tree using the same composition function.

Sentiment analysis at the sentence level examines each sentence and determines whether it expresses a positive, negative, or neutral opinion. A neutral opinion usually means no opinion. In this paper, we consider the problem of classifying a sentence by its overall sentiment, that is, determining whether a review is positive or negative.

Combined with the Chinese Sentiment Treebank, our recursive deep model performs robustly in predicting sentiment label distributions. We compare it to several baselines such as Naive Bayes (NB), Maximum Entropy and SVM. Our model achieves the best performance. Meanwhile, when trained on the Chinese Sentiment Treebank, all baseline models also obtain high performance.

The rest of the paper is organized as follows. In Section 2, we introduce related work, including word representations, recursive deep learning and sentiment analysis. Section 3 introduces our Chinese Sentiment Treebank. Section 4 describes our recursive neural deep model. Section 5 presents the experiments on predicting Chinese sentiment labels. The conclusion and future work are presented in Section 6.

II. RELATED WORK

This work is mainly connected to three areas of NLP research: word representations, recursive deep learning and sentiment analysis.
TABLE I. EXAMPLES OF MOVIE REVIEWS AFTER FILTERING AND WORD SEGMENTATION

Sentiment     Label  S-Num  Example
Positive (+)  4      4976   (A good story worth reading.)
                            (Does it need a reason for recommending this story?)
                            (It always brings happiness to people.)
              3      6463   (It is a really good movie, deserving your tears and smiles.)
                            (When I was a little child, I saw a good movie that is full of wonderful fantasy.)
                            (The picture of this Korean movie is really beautiful.)
Negative (-)  1      1364   (The movie lacks a real story about everyday life.)
                            (It is too boring to sit through.)
                            (The female leading role of this movie is not suitable for me at all.)
              0      747    (It is too bad to be called a movie.)
                            (It is a piece of junk!)
                            (The story of this movie is really boring.)

Finally, we employed the Stanford parser to parse all sentences [21]. Note that each punctuation mark is also treated as a word. As a result, we obtained the Chinese Sentiment Treebank. Fig. 1 shows two examples from the Chinese Sentiment Treebank. The top one is a positive movie review, while the bottom one is a negative movie review.

Fig. 1. Examples in Chinese Sentiment Treebank.

The Chinese Sentiment Treebank contains 14964 Chinese words and 13550 sentences. It allows us to better predict the sentiment labels of sentences with various machine learning frameworks. We can train and test various models on the new corpus.

Combined with the recursive deep model proposed in this work, the treebank yields good performance on the sentiment label prediction task. The details are described in the next section.

IV. RECURSIVE DEEP LEARNING

In this section, we first introduce the Recursive Neural Deep Model (RNDM). We then describe the details of parameter learning.

A. Recursive Neural Deep Model

RNDM can compute compositional vector representations for phrases of any length and for sentences. These representations are then used as features to predict sentiment labels. Fig. 2 shows the structure of RNDM.

For ease of exposition, we use a Chinese tri-gram (corresponding to "the best movie" in English) to explain our model.

Fig. 2. RNDM for predicting sentiment label. (Diagram: leaf nodes a, b, c; parent nodes p1 = g(b, c) and p2 = g(a, p1); matrices W and Ws; output label +/-.)

When an n-gram is given to the model, it is parsed into a binary tree, and each leaf node, corresponding to a word, is represented as a vector. In Fig. 2, nodes $a$, $b$ and $c$ are the word vector representations of the respective words.

The sentiment label vector has one dimension per sentiment class. $W_s$ is the sentiment classification matrix; $W$ is the
transformation matrix. In our model, we omit the bias for simplicity.

RNDM uses the tensor-based compositionality function [22]. Its main property is that it can directly relate input vectors. In our example, the vector of $p_1$ in Fig. 2, the parent node of $b$ and $c$, is computed through formula (1).

$p_1 = f\left([b; c]^T V^{[1:d]} [b; c] + W [b; c]\right)$    (1)

$V^{[1:d]} \in \mathbb{R}^{2d \times 2d \times d}$ represents the tensor that defines multiple bilinear forms. $[b; c]$ denotes the concatenation of two column vectors, resulting in a vector in $\mathbb{R}^{2d}$. $W \in \mathbb{R}^{d \times 2d}$ is the transformation matrix and also the main parameter to learn. $f = \tanh$ is a standard element-wise nonlinearity.

After computing the first parent node, the network is shifted by one position, takes $a$ and $p_1$ as input vectors, and again computes a potential parent node. The next parent vector $p_2$ in Fig. 2 is computed as formula (2).

$p_2 = f\left([a; p_1]^T V^{[1:d]} [a; p_1] + W [a; p_1]\right)$    (2)

Note that the parent vectors must be of the same dimensionality to be recursively compatible and be used as input to the next composition.

case of using $f = \tanh$, $f'$ can be computed as $f'(x) = 1 - f^2(x)$.

We define the full incoming error for a node $i$ as $\delta^{i,com}$. For the node $p_2$, the received error is computed as formula (6).

$\delta^{p_2,com} = \left(W_s^T (y^{p_2} - t^{p_2})\right) \otimes f'(p_2)$    (6)

Then we compute the error for the two children of $p_2$ by formula (7).

$\delta^{p_2,down} = \left(W^T \delta^{p_2,com} + S\right) \otimes f'([a; p_1])$    (7)

We define $S$ as formula (8).

$S = \sum_{k=1}^{d} \delta_k^{p_2,com} \left(V^{[k]} + (V^{[k]})^T\right) [a; p_1]$    (8)

$p_1$, the right child of $p_2$, then takes half of this vector to compute its full $\delta$, as described in formula (9).

$\delta^{p_1,com} = \delta^{p_2,down}[d+1 : 2d]$    (9)

$\delta^{p_2,down}[d+1 : 2d]$ indicates that $p_1$ is the right child of $p_2$ and hence takes the second half of the error. For $a$, the left child of $p_2$, the error is $\delta^{p_2,down}[1 : d]$.

The full derivative for $V^{[k]}$ is the sum of the derivatives at each node. We can use formula (10) to compute the full derivative for $V^{[k]}$.
A. Naive Bayes

Naive Bayes is a simple model based on the conditional independence assumption. Given a feature vector table, the algorithm computes the posterior probability that a sentence $s$ belongs to a label $c$. We use a multinomial Naive Bayes model as described in formula (12).

$P_{NB}(c \mid s) = \frac{P(c) \prod_{i=1}^{m} P(f_i \mid c)^{n_i(s)}}{P(s)}$    (12)

In this formula, $f_i$ represents a feature and $n_i(s)$ represents the count of feature $f_i$ found in movie review $s$. There are a total of $m$ features. Parameters $P(c)$ and $P(f \mid c)$ are obtained through maximum likelihood estimates, and a smoothing algorithm is used for unseen features.

We assign a movie review to the sentiment label with the highest posterior probability, as described in formula (13).

$c^* = \arg\max_c P_{NB}(c \mid s)$    (13)

B. Maximum Entropy

Maximum Entropy models are feature-based models that make no independence assumptions about their features. We can therefore add features such as bigrams and phrases without worrying about overlapping features. The model is represented by formula (14).

$P_{ME}(c \mid s, \lambda) = \frac{\exp\left(\sum_i \lambda_i f_i(c, s)\right)}{\sum_{c'} \exp\left(\sum_i \lambda_i f_i(c', s)\right)}$    (14)

In this formula, $c$ is the sentiment label, $s$ is the movie review, and $\lambda$ is a weight vector. The weights determine the significance of each feature in classification. A higher weight means that the feature is a strong indicator for the sentiment label.

We report the overall accuracy of predicting the sentiment labels of movie reviews in the test set. The RNDM obtains an accuracy of 90.8%, compared to NB (78.65%), ME (87.46%) and SVM (84.9%), as shown in Fig. 3.

Fig. 3. Accuracy for predictions at sentence level. (Bar chart; legend: NB, ME, SVM, RNDM; x-axis: Machine Learning Methods.)

From Fig. 3, we can see that the RNDM achieves the highest performance in predicting positive/negative sentiment, followed by ME, SVM and NB. Trained on our Chinese Sentiment Treebank, even the baselines NB and ME, despite their simplicity, achieve high accuracy in predicting sentiment labels.

The result highlights that, combined with the Chinese Sentiment Treebank, our model is reliable in predicting the sentiment labels of sentences.
TABLE II. EXAMPLES OF MOVIE REVIEWS WITH CONTRASTIVE CONJUNCTION STRUCTURE

Sentiment     Label  Movie reviews with contrastive conjunction structure
Negative (-)  0      (Wu Zhenyu has good acting skill, but he is not suitable for a role in the comedy.)
                     (It is really exquisite, but not interesting.)
              1      (It has good picture, but with an arrogant and disgusting taste.)
                     (It is full of suspense, but it is really uninteresting.)
                     (I can play the game, but I get dizzy when watching it.)
                     (It has some interesting parts, but the whole story has no main line.)
Positive (+)  3      (At first, I was not interested in it, but in the final part, I was really touched by it.)
                     (The story is simple, but the whole movie is fairly good.)
              4      (Although it lacks a self-help story, the movie has plentiful stories, like the movie Forrest Gump.)
                     (At the beginning, the movie was terrifying for me, but it turns out to be a touching and warm story at the end.)

Fig. 4. Accuracy for predicting sentences with contrastive conjunction structure. (Bar chart of accuracy by method; x-axis: Machine Learning Methods.)

VI. CONCLUSION

In this work, we focus on sentiment analysis for Chinese at the sentence level. We first introduce the Chinese Sentiment Treebank, based on movie reviews from social websites. We then introduce RNDM to predict the sentiment labels of movie reviews at the sentence level. Trained on the Chinese Sentiment Treebank, even the baselines obtain good performance. However, RNDM achieves the highest accuracy in predicting binary sentiment labels at the sentence level. In predicting the sentiment labels of sentences with contrastive conjunction structure, RNDM outperforms the baselines by a large margin.

VII. ACKNOWLEDGMENT

This work is supported by the National Program on Key Basic Research Project (973 program) under Grant 2013CB329302 and the National Natural Science Foundation of China under Grants No. 61175050, No. 61203281 and No. 61303172.

REFERENCES

[1] B. Liu. 2012. Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies series, Morgan & Claypool Publishers.
[2] Y. Choi and C. Cardie. 2008. Learning with compositional semantics as structural inference for subsentential sentiment analysis. In EMNLP.
[3] T. Nakagawa, K. Inui, and S. Kurohashi. 2010. Dependency tree-based sentiment classification using CRFs with hidden variables. In NAACL, HLT.
[4] R. Socher, A. Perelygin, J. Y. Wu, J. Chuang, C. D. Manning, A. Y. Ng, and C. Potts. 2013. Recursive deep models for semantic compositionality over a sentiment treebank. In Proc. EMNLP.
[5] J. Turian, L. Ratinov, and Y. Bengio. 2010. Word representations: a simple and general method for semi-supervised learning. In Annual Meeting of the Association for Computational Linguistics.
[6] R. Collobert. 2011. Deep learning for efficient discriminative parsing. In International Conference on Artificial Intelligence and Statistics.
[7] A. Mnih and G. E. Hinton. 2009. A scalable hierarchical distributed language model. In NIPS, pages 1081-1088.
[8] E. H. Huang, R. Socher, C. D. Manning, and A. Y. Ng. 2012. Improving word representations via global context and multiple word prototypes. In Annual Meeting of the Association for Computational Linguistics.
[9] T. Mikolov, K. Chen, G. Corrado, and J. Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
[10] R. Socher, C. D. Manning, and A. Y. Ng. 2010. Learning continuous phrase representations and syntactic parsing with recursive neural networks. In Proceedings of the NIPS-2010 Deep Learning and Unsupervised Feature Learning Workshop.
[11] R. Socher, B. Huval, C. D. Manning, and A. Y. Ng. 2012. Semantic compositionality through recursive matrix vector spaces. In EMNLP.
[12] B. Pang and L. Lee. 2004. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In ACL.
[13] H. Cui, V. Mittal, and M. Datar. 2006. Comparative Experiments on Sentiment Classification for Online Product Reviews. In Proceedings of the Twenty First National Conference on Artificial Intelligence.
[14] J. Blitzer, M. Dredze, and F. Pereira. 2007. Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification. Association for Computational Linguistics.
[15] P. D. Turney. 2002. Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics.
[16] B. Liu. 2010. Sentiment Analysis and Subjectivity. To appear in Handbook of Natural Language Processing, Second Edition.
[17] Q. Mei, X. Ling, M. Wondra, H. Su, and C. Zhai. 2007. Topic sentiment mixture: modeling facets and opinions in weblogs. In Proceedings of the 16th International Conference on World Wide Web, pages 171-180.
[18] I. Titov and R. McDonald. 2008. Modeling online reviews with multi-grain topic models. In WWW '08: Proceeding of the 17th International Conference on World Wide Web, pages 111-120, New York, NY, USA.
[19] C. Lin and Y. He. 2009. Joint Sentiment/Topic Model for Sentiment Analysis. The 18th ACM Conference on Information and Knowledge Management.
[20] S. Tan and J. Zhang. 2008. An empirical study of sentiment analysis for Chinese documents. Expert Systems with Applications, 34(4):2622-2629.
[21] D. Klein and C. D. Manning. 2003. Accurate unlexicalized parsing. In ACL.
[22] J. Mitchell and M. Lapata. 2010. Composition in distributional models of semantics. Cognitive Science, 34(8):1388-1429.
[23] J. Duchi, E. Hazan, and Y. Singer. 2011. Adaptive subgradient methods for online learning and stochastic optimization. JMLR, 12, July.