Named entity recognition in Chinese electronic medical records based on multi-feature integration (基于多特征融合的中文电子病历命名实体识别)

Authors: Yu Nan (于楠), Wang Pu (王普), Weng Zhuang (翁壮), Fang Liying (方丽英)
Affiliations: Faculty of Information Technology, Beijing University of Technology (Beijing 100124); Beijing Laboratory for Urban Mass Transit (Beijing 100124); Engineering Research Center of Digital Community, Ministry of Education (Beijing 100124); Beijing Key Laboratory of Computational Intelligence and Intelligent System (Beijing 100124)
Keywords: electronic medical record; multi-feature integration; conditional random field model; named entity recognition
Classification number: R318.04
Year·Volume·Issue (pages): 2018·37·3 (279-284)
Abstract:

Objective To automatically identify diseases and symptoms in the unstructured, natural-language components (medical diagnosis and patient condition) of electronic medical records (EMRs) from a tertiary hospital, we establish a conditional random field (CRF) model with multi-feature integration. The aim is structured storage of EMRs, which in turn supports EMR information mining and statistical analysis. Methods The manually labeled corpus was divided into a training set and a testing set. We used NLPIR to segment the text and the CRF++ tool for the experiments. Based on the data characteristics of Chinese EMRs, we selected basic features and feature templates and determined the size of the context window through comparative experiments. We then added two advanced features, the guide word pattern and the word formation pattern, and compared their effects on the experimental results. Results With basic features only, recognition performance was better when the context window was 7. After the advanced features were added, the F-measure reached 92.80% for disease entities and 94.17% for symptom entities. Conclusions A conditional random field model with multi-feature integration can achieve high recognition performance for disease entities and symptom entities in EMRs. The study is of great significance to named entity recognition in EMRs.
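
The abstract mentions CRF++ feature templates, a context window of 7, and F-measure scores, but it does not give the templates or the scoring procedure. The following is only a minimal sketch under stated assumptions: a window of 7 is taken to mean the current token plus three tokens on each side, the template covers a single input column (column 0), and the F-measure is computed by exact match over entity spans. The function names `window_offsets`, `crfpp_unigram_template`, and `f_measure` are hypothetical and do not come from the paper; the toy entities are invented purely for illustration.

```python
def window_offsets(window_size):
    """Return relative token offsets for a symmetric context window.

    Assumption: a window of 7 means offsets -3..3 around the current token.
    """
    if window_size < 1 or window_size % 2 == 0:
        raise ValueError("window_size must be a positive odd number")
    half = window_size // 2
    return list(range(-half, half + 1))


def crfpp_unigram_template(window_size, column=0):
    """Build CRF++ unigram template lines (U00:%x[row,col]) for one column."""
    lines = []
    for index, offset in enumerate(window_offsets(window_size)):
        lines.append(f"U{index:02d}:%x[{offset},{column}]")
    return lines


def f_measure(true_entities, predicted_entities):
    """Return (precision, recall, F1) using exact-match entity comparison.

    Each entity is a hashable tuple such as (start, end, entity_type).
    Assumption: the paper's scoring scheme is not stated in the abstract.
    """
    true_set = set(true_entities)
    pred_set = set(predicted_entities)
    correct = len(true_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(true_set) if true_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Template lines for a context window of 7 over the token column.
    for line in crfpp_unigram_template(7):
        print(line)

    # Toy gold and predicted entities (illustrative only, not paper data).
    gold = {(0, 2, "DISEASE"), (5, 6, "SYMPTOM")}
    pred = {(0, 2, "DISEASE"), (8, 9, "SYMPTOM")}
    print(f_measure(gold, pred))
```

Changing the `window_size` argument regenerates the template for a different window, which is one way the contrast experiments over window sizes could be set up. Advanced features such as the guide word pattern would need additional input columns, which this sketch does not model.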

Address: Editorial Office of Beijing Biomedical Engineering, Beijing Anzhen Hospital, Andingmenwai, Beijing
Tel: 010-64456508  Fax: 010-64456661
Email: llbl910219@126.com