Text classification method based on improved long short-term memory network

Author:

LI Jianping, CHEN Haiou

Affiliation:

School of Computer and Information Technology, Northeast Petroleum University, Daqing 163318, Heilongjiang, P. R. China

Author biography:

LI Jianping (born 1976), male, Ph.D., professor; his main research interests are network information security and natural language processing, (E-mail) ljp@nepu.edu.cn.

Fund Project:

Supported by the National Natural Science Foundation of China (61702093).


    Abstract:

    The traditional long short-term memory (LSTM) network cannot automatically select the most important latent semantic factors in text classification. To address this problem, this paper proposes an improved LSTM model. First, the traditional LSTM is extended to a bidirectional mode, so that the network fully captures the context on both sides of each input feature word. Then, a pooling layer is added in front of the output layer to better select the most important latent semantic factors. Experiments on Internet Movie Database (IMDb) review data show that the model outperforms the traditional LSTM network and other comparable models, demonstrating that the proposed improvements effectively raise text classification accuracy.
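
    To make the described architecture concrete, the following is a minimal Keras sketch of the pipeline the abstract outlines: word embedding, a bidirectional LSTM, a pooling layer in front of the output layer, and a sigmoid output for binary sentiment. It is an illustration under assumed hyperparameters (vocabulary size, embedding dimension, number of LSTM units), not the authors' implementation, and it uses the IMDb reviews bundled with Keras as a stand-in for the review data used in the experiments.

```python
# Illustrative sketch only; hyperparameters below are assumptions, not values from the paper.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 20000   # assumed vocabulary size
MAX_LEN = 200        # assumed review length after padding
EMBED_DIM = 128      # assumed word-embedding dimension

model = models.Sequential([
    layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    # Bidirectional mode: the LSTM reads the sequence forwards and backwards,
    # so each position sees context on both sides of the feature word.
    layers.Bidirectional(layers.LSTM(64, return_sequences=True)),
    # Pooling layer before the output layer: keeps the strongest activation of
    # each hidden unit across time, i.e. the most salient latent semantic factor.
    layers.GlobalMaxPooling1D(),
    layers.Dense(1, activation="sigmoid"),  # binary sentiment label
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Usage example with the Keras-bundled IMDb reviews (a stand-in dataset).
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.imdb.load_data(num_words=VOCAB_SIZE)
x_train = tf.keras.utils.pad_sequences(x_train, maxlen=MAX_LEN)
x_test = tf.keras.utils.pad_sequences(x_test, maxlen=MAX_LEN)
model.fit(x_train, y_train, epochs=3, batch_size=64, validation_data=(x_test, y_test))
```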

Cite this article

LI Jianping, CHEN Haiou. Text classification method based on improved long short-term memory network[J]. Journal of Chongqing University, 2023, 46(5): 111-118.

History
  • Received: 2021-08-11
  • Published online: 2023-05-31