Diagnosis of Digestive Tract Cancer Based on Data Mining

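Authors: You Jia, Chen Hui, Wu Wenfang, Xia Hong, Yang Miao, Liu Zhicheng
Affiliation: School of Biomedical Engineering, Capital Medical University (Beijing 100069)
Keywords: data mining; digestive tract cancer; neural network; Logistic regression; naive Bayes classifier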
Year·Volume·Issue (pages): 2011·30·2 (132-136)
Abstract:

Objective To explore the feasibility of applying data mining techniques to the diagnosis of digestive tract cancer (DTC) from combined serum tumor marker (STM) measurements, and to compare the diagnostic performance of a Logistic regression model, a neural network, and a naive Bayes classifier with that of clinical diagnosis using a single STM or a combination of STMs.

Methods Serum levels of CA19-9, CA242, CA50, and CEA from 301 patients with DTC and 114 patients with benign digestive disease were used to build diagnostic classifiers based on three data mining methods: Logistic regression, a back-propagation (BP) neural network, and the naive Bayes method. The classifiers were tested with 10-fold cross-validation. Sensitivity, specificity, and the area under the receiver operating characteristic (ROC) curve (Az) were used to evaluate and compare the three classifiers, CA19-9 alone, and parallel diagnosis with the four STMs.

Results The sensitivity and Az of the BP neural network were 92.0% and 0.903, higher than the sensitivity of STM parallel diagnosis (83.4%, P<0.001) and the Az of CA19-9 (0.806, P<0.001), respectively, while its specificity (69.3%) was similar to that of STM parallel diagnosis (68.4%, P=1.00). The Logistic regression model had a higher sensitivity (91.4%) than STM parallel diagnosis (P<0.001), a lower specificity (45.6%) than STM parallel diagnosis (P<0.001), and an Az (0.819) similar to that of CA19-9 (P=0.55). The naive Bayes classifier had a lower sensitivity (72.8%) than STM parallel diagnosis (P<0.001), while its specificity (75.4%) and Az (0.797) were similar to those of STM parallel diagnosis and CA19-9 (P=0.13 and P=0.61), respectively.

Conclusions Among the data mining classification methods, the BP neural network was more accurate than a single STM or the parallel diagnosis of the four STMs, whereas Logistic regression and the naive Bayes classifier performed at a level comparable to STM parallel diagnosis. The neural network classifier outperformed the Logistic regression model and the naive Bayes classifier and could be further applied in computer-aided diagnosis.
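
The evaluation quantities named in the abstract (parallel diagnosis, sensitivity, specificity, and the area under the ROC curve) can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that "parallel diagnosis" means a patient is called positive when any one of the four markers exceeds its cutoff; the cutoff values, the function names, and the six toy patients are all hypothetical placeholders, since the abstract gives neither thresholds nor raw data.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical cutoffs for illustration only; the abstract does not state
# the thresholds used in the study.
CUTOFFS: Dict[str, float] = {"CA19-9": 37.0, "CA242": 20.0, "CA50": 25.0, "CEA": 5.0}


def parallel_positive(levels: Dict[str, float],
                      cutoffs: Dict[str, float] = CUTOFFS) -> bool:
    """Parallel rule (assumed): positive if ANY marker exceeds its cutoff."""
    return any(levels[name] > cut for name, cut in cutoffs.items())


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for labels 1 = DTC, 0 = benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve as the Mann-Whitney statistic (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up toy patients (NOT study data): marker levels and true label.
patients = [
    ({"CA19-9": 120.0, "CA242": 35.0, "CA50": 40.0, "CEA": 8.0}, 1),
    ({"CA19-9": 20.0, "CA242": 10.0, "CA50": 12.0, "CEA": 9.5}, 1),
    ({"CA19-9": 15.0, "CA242": 8.0, "CA50": 10.0, "CEA": 2.0}, 1),
    ({"CA19-9": 45.0, "CA242": 12.0, "CA50": 18.0, "CEA": 3.0}, 0),
    ({"CA19-9": 10.0, "CA242": 5.0, "CA50": 9.0, "CEA": 1.5}, 0),
    ({"CA19-9": 12.0, "CA242": 6.0, "CA50": 11.0, "CEA": 2.5}, 0),
]
y_true = [label for _, label in patients]
y_parallel = [int(parallel_positive(levels)) for levels, _ in patients]

sens, spec = sensitivity_specificity(y_true, y_parallel)
auc_ca199 = roc_auc(y_true, [levels["CA19-9"] for levels, _ in patients])
print(f"parallel rule: sensitivity={sens:.3f}, specificity={spec:.3f}")
print(f"CA19-9 alone: AUC={auc_ca199:.3f}")
```

The same two functions apply to a classifier's output: thresholded predictions go to `sensitivity_specificity`, and continuous scores (for example, predicted probabilities collected across cross-validation folds) go to `roc_auc`.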


Address: Editorial Office of Beijing Biomedical Engineering, Anzhen Hospital, Andingmenwai, Beijing
Tel: 010-64456508  Fax: 010-64456661
Email: llbl910219@126.com