CLASSIFYING IMBALANCED DATA WITH ROUGHLY BALANCED BAGGING
Abstract