A topic-driven graph-of-words convolutional network for improving text classification

Authors

  • Tham Vo, Thu Dau Mot University

Keywords:

neural topic modelling, BERT, GCN, graph-of-words

Abstract

In general, text classification is considered a primitive task underlying many common problems in the natural language processing (NLP) domain. In recent years, the dramatic progress of deep learning and the emergence of advanced neural architectures in NLP, such as auto-encoders (AE), attention mechanisms, and transformers, have brought remarkable improvements to multiple NLP tasks, including text categorization. However, even advanced sequential text-embedding models still suffer from a limited capability to preserve long-range dependencies between words and documents at the corpus level. Several graph neural network (GNN) based text classification techniques, such as TextGCN and TensorGCN, have been proposed to cope with this challenge. Yet these GNN-based techniques in turn struggle to integrate global semantic information of texts, such as topics, to facilitate textual representation learning for the classification-driven fine-tuning objective. To deal with these challenges, in this paper we propose a novel framework, called GOWTopGCN, that integrates neural topic modelling with textual-graph-based representation learning built upon the traditional graph-of-words (GOW) paradigm. GOWTopGCN jointly learns rich global semantic information as well as long-range dependency relationships between words and documents within a given text corpus. Extensive experiments on benchmark textual datasets show that our proposed model outperforms state-of-the-art transformer-based and GNN-based text classification baselines.
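For illustration, the graph-of-words (GOW) paradigm referenced in the abstract represents a document as a graph whose nodes are unique words and whose edges connect words co-occurring within a sliding window. The sketch below shows one common variant of this construction (raw co-occurrence counts over a fixed window); it is not the authors' exact formulation, which may use other edge weightings such as PMI.

```python
from collections import Counter
from itertools import combinations

def build_graph_of_words(tokens, window_size=3):
    """Build an undirected graph-of-words.

    Nodes are the unique tokens; each edge weight counts how many
    sliding windows contain both endpoint words. (Illustrative
    variant; the paper's exact edge weighting may differ.)
    """
    edges = Counter()
    for i in range(len(tokens)):
        window = set(tokens[i : i + window_size])
        # Count each unordered word pair co-occurring in this window.
        for u, v in combinations(sorted(window), 2):
            edges[(u, v)] += 1
    nodes = sorted(set(tokens))
    return nodes, dict(edges)

nodes, edges = build_graph_of_words(
    "graph neural networks classify text via graph structure".split()
)
```

The resulting weighted adjacency structure is what GCN-based models such as TextGCN consume as input for message passing over word (and document) nodes.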

Published

2022-03-30