Journal of Dalian Ocean University  2024, Vol. 39 Issue (1): 153-161    DOI: 10.16535/j.cnki.dlhyxb.2023-201
Complex relation extraction from health aquaculture standards based on an improved BiRTE model
SONG Qishu, YU Hong*, QIAO Shihan, LUO Xuan, LI Guangyu, SHAO Liming, ZHANG Sijia
1. College of Information Engineering, Dalian Ocean University, Dalian 116023, China; 2. Dalian Key Laboratory of Smart Fisheries, Dalian 116023, China; 3. Key Laboratory of Environment Controlled Aquaculture (Dalian Ocean University), Ministry of Education, Dalian 116023, China; 4. Key Laboratory of Marine Information Technology of Liaoning Province, Dalian 116023, China
Abstract: Relation extraction from the text of health aquaculture standards suffers from low accuracy because the text is strongly domain-specific and semantically complex. To address this, a complex relation extraction method based on an improved BiRTE model is proposed. BiRTE, which reduces error propagation through bidirectional extraction and has strong relation extraction capability, was adopted as the base model. To model entities and their semantic associations, RoBERTa was used as the encoder, with whole-word masking and dynamic masking enriching the word vector representation of domain-specific terms in fishery standard texts. On this basis, a self-attention mechanism (Self-Attention, SelfATT) was integrated to combine and focus entity features and relation features, strengthening the connection between entity extraction and relation prediction and thereby improving extraction accuracy. The proposed model (RoBERTa-BiRTE-SelfATT) achieved a precision of 95.9%, a recall of 95.4%, and an F1 score of 95.7% on complex relation extraction from fishery standards, improvements of 4.2%, 3.1%, and 3.8%, respectively, over the original BiRTE model. These results indicate that RoBERTa-BiRTE-SelfATT effectively addresses the inaccurate recognition of domain-specific terms and the difficulty of extracting entity relations caused by semantic complexity in fishery standard texts, and that it is an effective method for extracting complex relations from fishery standards.
Key words:  fishery standard    relation extraction    overlapping relation    complex relation    Self-Attention
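The abstract reports precision, recall, and F1 for extracted relation triples. As a rough illustration only, the sketch below shows one common way such scores are computed: micro-averaged over sentences, counting a predicted triple as correct when subject, relation, and object all match a gold triple exactly. The function name `triple_prf1`, the exact-match criterion, and the toy triples are assumptions made for this example; the page does not state the evaluation protocol the authors used.

```python
from typing import Sequence, Set, Tuple

# A relation triple: (subject entity, relation label, object entity).
Triple = Tuple[str, str, str]


def triple_prf1(
    gold: Sequence[Set[Triple]], pred: Sequence[Set[Triple]]
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over per-sentence triple sets.

    Assumption: a predicted triple is correct only on an exact match of
    all three elements with a gold triple from the same sentence.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must cover the same sentences")

    correct = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)

    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy data. Sentence 1 has an overlapping relation:
# two gold triples share the same subject entity "A".
gold_triples = [
    {("A", "r1", "B"), ("A", "r2", "C")},
    {("D", "r1", "E")},
]
pred_triples = [
    {("A", "r1", "B"), ("A", "r2", "X")},  # second triple has a wrong object
    {("D", "r1", "E")},
]

p, r, f1 = triple_prf1(gold_triples, pred_triples)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Storing the triples of each sentence as a set means that overlapping relations, where several triples share an entity, are scored one triple at a time.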
Published:  2024-03-13      Released:  2024-03-13      Issue publication date:  2024-03-13
CLC number (Chinese Library Classification):  S 932.2; TP 391
Funding: Key Research and Development Program of Liaoning Province (2023JH26/10200015)
Cite this article:
SONG Qishu, YU Hong, QIAO Shihan, LUO Xuan, LI Guangyu, SHAO Liming, ZHANG Sijia. Complex relation extraction from health aquaculture standards based on an improved BiRTE model. Journal of Dalian Ocean University, 2024, 39(1): 153-161.
Link to this article:
https://xuebao.dlou.edu.cn/CN/10.16535/j.cnki.dlhyxb.2023-201  or  https://xuebao.dlou.edu.cn/CN/Y2024/V39/I1/153