Abstract: Military weapon entity recognition is an important task in ontology construction for the military domain, and automating it with deep learning can improve the efficiency of military intelligence information retrieval. To improve the precision of military weapon entity recognition on unstructured military news data published on the Internet, a weapon entity recognition method that combines a double-layer multi-head self-attention mechanism with the BiLSTM-CRF model is proposed. Building on the BiLSTM-CRF model, the method applies self-attention at two layers: at the embedding layer to extract important input features, and at the BiLSTM layer to extract key character information. Based on the word-formation characteristics of military weapon entities, regular-expression matching templates are established to correct the recognition results. A military weapon data set containing 1,196 pieces of data is constructed. Test results show that the precision, recall and F1 value of the proposed method are 0.9293, 0.9301 and 0.9297 respectively, which improve on the best results of classical deep learning models by 1.15%, 0.97% and 0.97% respectively.
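The reported F1 value can be checked against the reported precision and recall. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the abstract does not state which averaging scheme was used, and the function name `f1_score` is illustrative rather than taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (standard F1)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract.
reported_precision = 0.9293
reported_recall = 0.9301

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 4))  # prints 0.9297, matching the reported F1 value
```

Under this assumption, the three reported figures are mutually consistent to four decimal places.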
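Basic information:
DOI:
CLC number: E91; TP391.1
Citation information:
[1] YU Hailiang, PENG Dongliang, GU Yu. Military weapon entity recognition combining double-layer multi-head self-attention and BiLSTM-CRF[J]. Radio Engineering, 2022, 52(05): 775-782.
Funding information:
Zhejiang Provincial Natural Science Foundation (LY21F030010)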