Objective Unbalanced data which means inequality between positive and negative data, is a common problem in clinical data analysis, and this problem may result in bias. Methods for balancing data are various, yet some may over fit or lose data. Methods In this paper, SMOTE arithmetic and the application in R language were introduced briefly and we used SMOTE arithmetic for real unbalanced data. Results The ratio between benign and malignant cases was 1/3 in original data and the ratio was 1 in balanced data. Conclusions The SMOTE arithmetic has good performance in balancing data.
|