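Compared with monomeric proteins, oligomeric proteins have many advantages and take part in a wide range of biological processes. This paper proposes a derived (secondary) feature extraction method and uses it to classify four classes of homo-oligomers from features extracted from the protein primary sequence, with a support vector machine as the classifier and a "one-versus-one" multi-class strategy. Under the jackknife test, the feature set that combines the derived features with amino acid composition features reached, in the weighted case, a best overall classification accuracy of 78.41%. This is 13.09% higher than amino acid composition features alone, 6.86% higher than BG, the best feature set reported in the reference literature, and 5.53% higher than CM1, the best original (primary) feature set. These results indicate that the derived feature extraction method is an effective approach for classifying protein homo-oligomers.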
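As a rough illustration of two ingredients named above, the amino acid composition feature and the jackknife (leave-one-out) test, the following minimal sketch may help. It is not the method evaluated in this paper: a 1-nearest-neighbour rule stands in for the support vector machine so that the example needs only the standard library, and the sequences, labels, and function names are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Return the 20-dimensional vector of residue frequencies for one sequence."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[a] / total for a in AMINO_ACIDS]


def nearest_neighbor_predict(train_x, train_y, x):
    """Stand-in classifier (1-NN, squared Euclidean distance); NOT the paper's SVM."""
    def dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    best = min(range(len(train_x)), key=lambda i: dist(train_x[i], x))
    return train_y[best]


def jackknife_accuracy(features, labels, predict=nearest_neighbor_predict):
    """Hold each sample out once, fit on the rest, and score the held-out sample."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


# Hypothetical toy data: made-up sequences and labels, for illustration only.
toy_sequences = ["AAAAGGGA", "AAGAAGAA", "LLLLKKLL", "KLLKLLLK"]
toy_labels = ["dimer", "dimer", "trimer", "trimer"]
toy_features = [amino_acid_composition(s) for s in toy_sequences]
toy_accuracy = jackknife_accuracy(toy_features, toy_labels)
```

The `predict` argument is left pluggable so that any classifier with the same call shape, for example a one-versus-one SVM from an external library, could be substituted for the stand-in.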
[1] Chou KC. Molecular therapeutic target for type 2 diabetes. J Proteome Res, 2004, 3: 1284-1288.
[2] Chou KC. Review: Structural bioinformatics and its impact to biomedical science. Curr Med Chem, 2004, 11: 2105-2134.
[3] Garian R. Prediction of quaternary structure from primary structure. Bioinformatics, 2001, 17: 551-556.
[4] Chou KC, Cai YD. Predicting protein quaternary structure by pseudo amino acid composition. Proteins: Struct Funct Genet, 2003, 53: 282-289.
[5] Zhang SW, Pan Q, Chen RS, et al. Classification of protein homo-oligomers based on support vector machines. Progress in Biochemistry and Biophysics, 2003, 30(6): 879-883. (in Chinese)
[6] Zhang SW, Pan Q, Zhang HC, et al. Support vector machines for predicting protein homooligomers by incorporating pseudoamino acid composition. Internet Electronic Journal of Molecular Design, 2003, 2(6): 392-402.
[7] Zhang SW, Pan Q, Zhang HC, et al. Prediction of Protein Homooligomer Types by Pseudo Amino Acid Composition: Approached with an Improved Feature Extraction and Naive Bayes Feature Fusion. Amino Acids, 2006, 30(4): 461-468.
[8] Zhang SW, Pan Q, Zhao CH, et al. Classification of multi-class protein homo-oligomers based on a weighted auto-correlation function feature extraction method. Journal of Biomedical Engineering, 2007, 24(4): 721-726. (in Chinese)
[9] Shi JY, Pan Q, Zhang SW, et al. Classification of protein homo-oligomers based on amino acid composition distribution. Acta Biophysica Sinica, 2006, 22(1): 49-56. (in Chinese)
[10] Li Qipeng, Zhang Shaowu, Pan Quan. Using multiscale glide zoom window feature extraction approach to predict protein homooligomer types. 3rd IAPR International Conference on Pattern Recognition in Bioinformatics, PRIB 2008, v 5265 LNBI: 78-86.
[11] Li Qipeng, Zhang Shaowu, Pan Quan. Prediction of protein homooligomer types with a novel approach of glide zoom window feature extraction. Advanced Intelligent Computing Theories and Applications: With Aspects of Theoretical and Methodological Issues, 4th International Conference on Intelligent Computing, ICIC 2008, Proceedings, v 5226 LNCS: 71-78.
[12] Chou PY. Amino acid composition of four classes of proteins. Abstracts of Papers, Part I, Second Chemical Congress of the North American Continent. Las Vegas, 1980.
[13] Nishikawa K, Ooi T. Correlation of the amino acid composition of a protein to its structural and biological characters. J Biochem, 1982, 91: 1821-1824.
[14] Chou KC. Prediction of protein cellular attributes using pseudoamino acid composition. Proteins: Struct Funct Genet, 2001, 43: 246-255.
[15] Matthews BW. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim Biophys Acta, 1975, 405: 442-451.
[16] Fasman GD. Handbook of Biochemistry and Molecular Biology. 3rd ed. Proteins, Volume 1. Cleveland: CRC Press, 1976.