Demanding Data Arrangement Using Logistic Regression with Error Matrix

GLNVS KUMAR, AVSM GANESH, ML REKHA

Abstract


Classifying imbalanced data remains challenging, yet it is critical in a wide variety of applications involving the detection of anomalies, failures, and risks. Many conventional methods, which can be categorized as sampling, cost-sensitive, or ensemble approaches, rely on heuristic and task-dependent processes. To achieve better classification performance through a formulation free of heuristics and task dependence, this work uses logistic regression with an error matrix. Its objective function is the harmonic mean of several evaluation criteria derived from an error matrix, including sensitivity, positive predictive value, and the corresponding criteria for negatives. The objective function and its optimization are formulated consistently within the framework of logistic regression with an error matrix, based on minimum classification error and generalized probabilistic descent (MCE/GPD) learning. Owing to the merits of the harmonic mean, logistic regression, and MCE/GPD, the method improves these multiple performance criteria in a well-balanced way.
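
As a rough illustration of the objective described in the abstract, the following Python sketch computes the harmonic mean of four criteria taken from a binary error matrix. The abstract names only sensitivity and positive predictive value explicitly; treating specificity and negative predictive value as the criteria "for negatives" is an assumption, as are the function names and the use of NumPy. The sketch works on hard counts and does not attempt to reproduce the MCE/GPD optimization.

```python
import numpy as np


def confusion_counts(y_true, y_pred):
    """Return (tp, fn, fp, tn) for binary labels, where 1/True is the positive class."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))
    tn = int(np.sum(~y_true & ~y_pred))
    return tp, fn, fp, tn


def harmonic_mean_of_criteria(tp, fn, fp, tn):
    """Harmonic mean of sensitivity, PPV, specificity, and NPV.

    The choice of these four criteria is an assumption for illustration.
    """
    def ratio(num, den):
        # An undefined ratio (empty denominator) is treated as 0 here.
        return num / den if den > 0 else 0.0

    criteria = [
        ratio(tp, tp + fn),  # sensitivity
        ratio(tp, tp + fp),  # positive predictive value
        ratio(tn, tn + fp),  # specificity (assumed criterion for negatives)
        ratio(tn, tn + fn),  # negative predictive value (assumed)
    ]
    # The harmonic mean is 0 as soon as any single criterion is 0.
    if min(criteria) == 0.0:
        return 0.0
    return len(criteria) / sum(1.0 / c for c in criteria)


if __name__ == "__main__":
    # Hypothetical data: 10 positives, 90 negatives, classifier predicts all negative.
    y_true = [1] * 10 + [0] * 90
    y_pred = [0] * 100
    print(harmonic_mean_of_criteria(*confusion_counts(y_true, y_pred)))
```

In the hypothetical example, a classifier that labels every sample negative reaches 90% accuracy, yet its sensitivity is 0, so the harmonic mean is 0. This is the sense in which a harmonic-mean objective rewards balanced performance across criteria rather than a high score on the majority class alone.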


DOI: https://doi.org/10.23956/ijarcsse.v8i6.729

© International Journals of Advanced Research in Computer Science and Software Engineering (IJARCSSE)| All Rights Reserved | Powered by Advance Academic Publisher.