A Comparison of Techniques for Virtual Concept Drift Detection

  1. González, Manuel L.
  2. Sedano, Javier
  3. García-Vico, Ángel M.
  4. Villar, José R. [1]

  [1] Universidad de Oviedo, Oviedo, Spain. ROR: https://ror.org/006gksa02

Proceedings:
16th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2021)

ISSN: 2194-5357 2194-5365

ISBN: 9783030878689 9783030878696

Year of publication: 2021

Pages: 3-13

Type: Conference contribution

DOI: 10.1007/978-3-030-87869-6_1

Abstract

Concept Drift is one of the main problems present in data stream processing for Data Mining and Machine Learning. This study focuses on Virtual Concept Drift. A common approach includes i) detecting the drift with a specialized algorithm, and ii) adapting the model to the current scenario. This work studies how well-known pre-processing methods affect the detection of abrupt Virtual Concept Drift in data streams. The proposed pre-processing techniques are: i) removing the trend and ii) transforming the data stream from the time domain to the spectral domain. Moreover, three Virtual Concept Drift detection methods are compared on three publicly available data sets. According to the results, removing the trend yields a slight improvement in the detection of Virtual Concept Drift. In contrast, no Virtual Concept Drift is detected in the spectral domain.
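The two pre-processing steps named in the abstract, followed by a drift check, could be sketched roughly as below. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the trend is removed, which transform is used, or which three detectors are compared. Here the trend is assumed to be linear, the spectral transform is a naive discrete Fourier transform, and a two-window Kolmogorov-Smirnov test stands in for a detector. All function names (`detrend`, `magnitude_spectrum`, `ks_statistic`, `drift_detected`) are hypothetical.

```python
import cmath
import math


def detrend(xs):
    """Subtract the least-squares straight line from xs (assumes a linear trend)."""
    n = len(xs)
    t_mean = (n - 1) / 2
    x_mean = sum(xs) / n
    num = sum((t - t_mean) * (x - x_mean) for t, x in enumerate(xs))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den if den else 0.0
    return [x - (x_mean + slope * (t - t_mean)) for t, x in enumerate(xs)]


def magnitude_spectrum(xs):
    """DFT magnitudes for the non-negative frequencies (naive O(n^2) version)."""
    n = len(xs)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(xs)))
        for k in range(n // 2 + 1)
    ]


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        # Step both samples past every value <= v, then compare the CDFs.
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def drift_detected(reference, current, alpha=0.05):
    """Flag drift when the KS statistic exceeds the asymptotic critical value."""
    n, m = len(reference), len(current)
    critical = math.sqrt(-math.log(alpha / 2) / 2) * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > critical


# Usage: compare a reference window with a later window of the same stream.
reference = [math.sin(i) for i in range(50)]
shifted = [v + 5.0 for v in reference]  # abrupt change in the input distribution
print(drift_detected(detrend(reference), detrend(reference)))
print(drift_detected(reference, shifted))
```

Because virtual concept drift concerns a change in the input distribution, a detector of this kind compares windows of the inputs alone and needs no labels. The same `drift_detected` call can be applied to the raw windows, to the detrended windows, or to the output of `magnitude_spectrum` to compare the three representations.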
