Improve CNN and LSTM In Sentiment Analysis for Vietnamese from Data Preprocessing Phase

Authors

  • Duy Ngoc Nguyen 362/84 ward 10, Ho Nai ward, Bien Hoa city, Dong Nai province
  • Diep Ngoc Luu 11, Da Kao ward, District 1, Ho Chi Minh city

Keywords:

Classification, CNN, Convolution Neural Network, Corpus, Deep Learning, Long Short Term Memory, LSTM, Opinion Mining, Sentiment, Sentiment Analysis, Support Vector Machine, SVM

Abstract

Deep learning has achieved good results in many application fields, such as image processing and computer vision. More recently, it has also been applied to natural language processing, again with good results. One problem of interest in this area is subjective opinion classification. A subjective opinion is an individual's thinking or judgment about a product or a socio-cultural event or issue. Subjective opinions have received attention from many producers and businesses who are interested in exploiting the opinions of the community and scientists. This paper experiments with two deep learning models, the convolutional neural network (CNN) and long short-term memory (LSTM), and with a combined CNN and LSTM model. The training data set comprises car reviews in Vietnamese that are pre-processed using an aspect analysis method based on an ontology of semantic and sentimental approaches. Experiments on this data set with the CNN, LSTM, and CNN + LSTM models are used to evaluate the effectiveness of the data preprocessing method used in this paper. The paper also tests sentiment classification on the English sentence collection Stanford Sentiment Treebank (SST) to assess the validity of the models tested on the Vietnamese opinion set. A non-neural method, the support vector machine (SVM), was also tested to evaluate the effectiveness of the paper's data processing method.

Author Biographies

Duy Ngoc Nguyen, 362/84 ward 10, Ho Nai ward, Bien Hoa city, Dong Nai province

Nguyen Ngoc Duy is currently a lecturer in the Faculty of Information Technology at the Posts and Telecommunications Institute of Technology, Ho Chi Minh City campus, Vietnam. Duy received an M.Sc. in Computer Science from the Ho Chi Minh City University of Technology (HCMUT), Vietnam, in 2005 and has been a Ph.D. candidate at HCMUT since 2016. Research interests include machine learning, data mining, and natural language processing.

Diep Ngoc Luu, 11, Da Kao ward, District 1, Ho Chi Minh city

Luu Ngoc Diep is currently a lecturer in the Faculty of Information Technology at the Posts and Telecommunications Institute of Technology, Ho Chi Minh City campus, Vietnam. Diep received an M.Sc. in Electronics and Telecommunications from the Ho Chi Minh City University of Technology (HCMUT), Vietnam, in 2003.

Published

2021-07-15