
A Method for Identifying the Participants of Autonomous Transportation System Based on a BERT-Bi-LSTM-CRF Model
Citation: Tang Jinjun, Tuo Haonan, Liu You, Fu Qiang. A Method for Identifying the Participants of Autonomous Transportation System Based on a BERT-Bi-LSTM-CRF Model[J]. Journal of Transport Information and Safety (交通信息与安全), 2022, 40(5): 80-90.
Authors: Tang Jinjun, Tuo Haonan, Liu You, Fu Qiang
Affiliation: School of Traffic and Transportation Engineering, Central South University, Changsha 410075, China
Funding: National Key Research and Development Program of China, 2020YFB1600400
Abstract: Participants are a key component of an autonomous transportation system (ATS), and information about them is generally described in text. To build a knowledge graph for autonomous transportation, a large number of participants must be accurately identified from text. To this end, an entity recognition method based on a BERT-Bi-LSTM-CRF model is developed to extract ATS participants. BERT (bidirectional encoder representations from transformers), a pre-trained language model used here for word embedding, captures rich semantic features. These features are fed into a Bi-LSTM (bidirectional long short-term memory) network, which extracts contextual sequence information in both directions, and a CRF (conditional random field) layer then produces the optimal sequence prediction. Raw text related to transportation engineering was collected, preprocessed, and annotated to form a corpus for identifying ATS participants, and comparative entity recognition experiments were carried out on this dataset. The results indicate that the BERT model significantly improves performance on the participant identification task. Compared with other methods such as CNN-LSTM and Bi-LSTM, the proposed method achieves the best overall recognition performance, with an overall F1-score of 86.81% across entity types, which suggests that extracting the semantic features of participants with BERT enhances the generalization ability of the recognition method. The F1-scores for the "user", "operator", "supplier", "planner", and "maintainer" entity types are 90.35%, 92.31%, 90.48%, 93.33%, and 95.00%, respectively. These results verify the effectiveness of the proposed method for identifying ATS participants.

Keywords: intelligent transportation; named entity recognition; knowledge graph; BERT-Bi-LSTM-CRF; knowledge modeling
Received: 2022-01-02
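
The abstract states that the CRF layer yields the optimal sequence prediction from the Bi-LSTM outputs. As a rough illustration of that final step only, the following is a minimal sketch of Viterbi decoding for a linear-chain CRF in plain Python. It is not the authors' implementation: the BIO tagging scheme, the tag name `USER`, and all scores below are assumptions made up for the example, and the per-token emission scores simply stand in for what the Bi-LSTM would produce.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence under a linear-chain CRF.

    emissions[t][tag] is the score of `tag` at token t (a stand-in for
    Bi-LSTM outputs); transitions[(prev, cur)] is the tag-to-tag score,
    and pairs missing from the dict score 0.
    """
    if not emissions:
        return []
    # best[t][tag]: best total score of any path ending in `tag` at token t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[str, str]] = []  # back[t-1][tag]: best predecessor of `tag`
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            prev = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[t][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical tags and scores, chosen only for illustration.
tags = ["O", "B-USER", "I-USER"]
transitions = {
    ("O", "I-USER"): -5.0,      # an inside tag should not follow "O"
    ("B-USER", "I-USER"): 0.5,  # an inside tag naturally follows a begin tag
}
emissions = [
    {"O": 1.0, "B-USER": 0.9, "I-USER": 0.0},
    {"O": 0.2, "B-USER": 0.1, "I-USER": 1.0},
    {"O": 1.5, "B-USER": 0.0, "I-USER": 0.2},
]

# Picking the best tag per token independently ignores the transition scores.
greedy = [max(tags, key=lambda tag: e[tag]) for e in emissions]
decoded = viterbi_decode(emissions, transitions, tags)
print(greedy)   # ['O', 'I-USER', 'O']
print(decoded)  # ['B-USER', 'I-USER', 'O']
```

In this toy case the per-token choice produces an `I-USER` tag directly after `O`, while the jointly decoded sequence avoids it. This is the usual motivation for placing a CRF on top of a sequence encoder.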