A mid-level fusion classification method for omics data

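Authors: Li Mingda (李明达), Zheng Haoran (郑浩然)
Affiliation: School of Computer Science and Technology, University of Science and Technology of China (Hefei 230027)
Keywords: omics data; dimensionality reduction; mid-level fusion; partial least squares; support vector machine
Classification number: R318.04
Year·Volume·Issue (pages): 2016·35·3 (249-253)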
Abstract:

Objective In-depth analysis of omics data helps advance research in areas such as medical diagnosis. Methods that analyze only a single type of omics data cannot solve certain complex biomedical problems. To exploit multiple types of omics data for such problems, a mid-level fusion classification method is proposed. Methods Partial least squares (PLS) is used to reduce the dimensionality of each type of omics data separately, and a support vector machine (SVM) is then used to classify the fused data. Results Two omics datasets, "non-small cell lung cancer vs. renal cancer" and "colorectal cancer vs. colorectal adenoma", were used to test the effectiveness of the method. On both cancer datasets, the method not only effectively reduced the dimensionality of the high-dimensional omics data but also achieved high classification accuracy (area under the receiver operating characteristic curve above 0.95). Conclusions The proposed mid-level fusion method can use multiple types of omics data to classify cancer samples and can effectively improve the accuracy of disease diagnosis.
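
The pipeline described in the abstract (PLS reduction of each omics block, concatenation of the resulting scores, SVM classification, evaluation by ROC AUC) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the number of PLS components, the SVM kernel, or the validation protocol, so the sketch assumes two components per block, a linear SVM without intercept trained by subgradient descent, and a single train/test split. The data are synthetic, and every function name (`make_block`, `fit_pls1`, `pls_scores`, `train_linear_svm`, `roc_auc`, `run_demo`) is hypothetical.

```python
import math
import random


def make_block(rng, labels, n_features, n_informative, shift=1.0):
    """Synthetic 'omics' block: the first n_informative features shift with the label."""
    return [[rng.gauss(0.0, 1.0) + (shift * y if j < n_informative else 0.0)
             for j in range(n_features)] for y in labels]


def fit_pls1(X, y, n_components):
    """PLS with a single response, fitted by successive deflation (NIPALS-style)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    E = [[row[j] - means[j] for j in range(p)] for row in X]  # centered copy of X
    y_mean = sum(y) / n
    f = [v - y_mean for v in y]                                # centered response
    weights, loadings = [], []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]  # X^T y
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(t[i] * f[i] for i in range(n)) / tt
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        weights.append(w)
        loadings.append(load)
    return means, weights, loadings


def pls_scores(model, X):
    """Project rows onto the fitted components, deflating each row as in training."""
    means, weights, loadings = model
    scores = []
    for row in X:
        e = [v - m for v, m in zip(row, means)]
        t_row = []
        for w, load in zip(weights, loadings):
            t = sum(a * b for a, b in zip(e, w))
            e = [a - t * b for a, b in zip(e, load)]
            t_row.append(t)
        scores.append(t_row)
    return scores


def train_linear_svm(X, y, lam=0.01, epochs=100):
    """Linear soft-margin SVM (no intercept): subgradient descent on the hinge loss."""
    w, step = [0.0] * len(X[0]), 0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            step += 1
            eta = 1.0 / (lam * step)
            margin = yi * sum(a * c for a, c in zip(w, xi))
            w = [a * (1.0 - eta * lam) for a in w]       # L2 shrinkage
            if margin < 1.0:                             # margin violated
                w = [a + eta * yi * c for a, c in zip(w, xi)]
    return w


def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y > 0]
    neg = [s for s, y in zip(scores, labels) if y <= 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def run_demo(seed=0):
    rng = random.Random(seed)
    labels = [1 if i % 2 == 0 else -1 for i in range(100)]  # balanced classes
    blocks = [make_block(rng, labels, 50, 5), make_block(rng, labels, 30, 5)]
    n_train = 60
    y_train, y_test = labels[:n_train], labels[n_train:]
    fused_train = [[] for _ in range(n_train)]
    fused_test = [[] for _ in range(len(labels) - n_train)]
    for X in blocks:
        # Fit PLS on training rows only, then concatenate scores across blocks.
        model = fit_pls1(X[:n_train], y_train, n_components=2)
        for row, t in zip(fused_train, pls_scores(model, X[:n_train])):
            row.extend(t)
        for row, t in zip(fused_test, pls_scores(model, X[n_train:])):
            row.extend(t)
    w = train_linear_svm(fused_train, y_train)
    decision = [sum(a * c for a, c in zip(w, row)) for row in fused_test]
    return len(fused_train[0]), roc_auc(decision, y_test)


n_fused, auc = run_demo()
print(f"fused features: {n_fused}, held-out AUC: {auc:.3f}")
```

Here 80 raw features across two blocks are reduced to four fused score features before classification. Fitting PLS on the training rows only keeps the held-out AUC free of label leakage. In practice a library such as scikit-learn (`PLSRegression`, `SVC`, `roc_auc_score`) would likely replace the hand-written pieces.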

Address: Editorial Office of Beijing Biomedical Engineering, Anzhen Hospital, Andingmenwai, Beijing
Tel: 010-64456508  Fax: 010-64456661
E-mail: llbl910219@126.com