Improve CNN and LSTM In Sentiment Analysis for Vietnamese from Data Preprocessing Phase
Keywords:
Classification, CNN, Convolution Neural Network, Corpus, Deep Learning, Long Short Term Memory, LSTM, Opinion Mining, Sentiment, Sentiment Analysis, Support Vector Machine, SVM
Abstract
Deep learning has achieved good results in many application fields, such as image processing and computer vision. Recently, it has also been applied to natural language processing, again with good results. One issue of concern in this area is subjective opinion classification. A subjective opinion is an individual's thinking or judgment about a product or a socio-cultural event or issue. Subjective opinions have received attention from many producers and businesses who are interested in exploiting the opinions of the community and scientists. This paper experiments with two deep learning models, the convolutional neural network (CNN) and long short-term memory (LSTM), and with a combined CNN and LSTM model. The training data set comprises Vietnamese car reviews that are pre-processed according to the method of aspect analysis based on an ontology of semantic and sentimental approaches. Experiments on this data set with the CNN, LSTM, and CNN + LSTM models are used to evaluate the effectiveness of the data preprocessing method used in this paper. The paper also tests sentiment classification on the English sentence collection Stanford Sentiment Treebank (SST) to assess the validity of the models tested on the Vietnamese opinion set. A non-neural method, the support vector machine (SVM), was also tested to evaluate the effectiveness of the paper's data processing method.