Article title

Regression function and noise variance tracking methods for data streams with concept drift

Authors
Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Two types of heuristic estimators based on Parzen kernels are presented. They are able to estimate the regression function incrementally. The estimators apply two techniques commonly used in concept-drifting data streams, i.e., the forgetting factor and the sliding window. The methods are applicable to models in which both the function and the noise variance change over time. Although nonparametric methods based on Parzen kernels have previously been applied successfully in the literature to online regression function estimation, the problem of estimating the noise variance has generally been neglected. The variance of the considered signal is sometimes of profound interest in itself, e.g., in economics, but it can also be used for determining confidence intervals in the estimation of the regression function, for evaluating the goodness of fit, and for controlling the amount of smoothing. The present paper addresses this issue. Specifically, variance estimators are proposed which are able to deal with concept-drifting data by applying a sliding window and a forgetting factor, respectively. Numerical experiments show that the proposed methods perform satisfactorily in estimating both the regression function and the noise variance.
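The abstract describes the approach only verbally; the Python sketch below illustrates the general mechanism of the forgetting-factor variant: a Nadaraya-Watson-style kernel regression estimate maintained recursively, with the conditional noise variance recovered from exponentially weighted first and second moments. Everything here (the class name ForgettingKernelRegressor, the Gaussian kernel choice, the fixed evaluation grid, and the moment-based variance formula) is an illustrative assumption, not the authors' exact estimator; the paper's sliding-window variant would instead retain only the most recent W samples.

import numpy as np

def gaussian_kernel(u):
    # Standard Gaussian kernel; one illustrative choice of Parzen kernel.
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

class ForgettingKernelRegressor:
    # Hypothetical sketch: tracks f(x) = E[Y|X=x] and sigma^2(x) = Var[Y|X=x]
    # on a fixed grid of query points, down-weighting old data by a
    # forgetting factor so the estimate can follow concept drift.
    def __init__(self, grid, bandwidth=0.2, forgetting=0.99):
        self.grid = np.asarray(grid, dtype=float)  # query points x
        self.h = bandwidth                         # kernel bandwidth
        self.lam = forgetting                      # forgetting factor in (0, 1]
        # Exponentially weighted running sums of the kernel weights w,
        # of w*y, and of w*y^2 at each grid point.
        self.den = np.zeros_like(self.grid)
        self.num = np.zeros_like(self.grid)
        self.num2 = np.zeros_like(self.grid)

    def update(self, x_t, y_t):
        # Incorporate one stream sample (x_t, y_t); past data decay by lam.
        w = gaussian_kernel((self.grid - x_t) / self.h) / self.h
        self.den = self.lam * self.den + w
        self.num = self.lam * self.num + w * y_t
        self.num2 = self.lam * self.num2 + w * y_t ** 2

    def regression(self):
        # Current estimate of the regression function on the grid.
        return self.num / np.maximum(self.den, 1e-12)

    def noise_variance(self):
        # Current estimate of sigma^2(x) = E[Y^2|x] - (E[Y|x])^2,
        # clipped at zero for numerical safety.
        m1 = self.regression()
        m2 = self.num2 / np.maximum(self.den, 1e-12)
        return np.maximum(m2 - m1 ** 2, 0.0)

# Toy drifting stream: the regression function shifts upward over time and
# the noise standard deviation grows, so both tracked quantities drift.
rng = np.random.default_rng(0)
est = ForgettingKernelRegressor(grid=np.linspace(0.0, 1.0, 50))
for t in range(5000):
    x = rng.uniform(0.0, 1.0)
    y = np.sin(2.0 * np.pi * x) + 0.001 * t + rng.normal(0.0, 0.1 + 1e-4 * t)
    est.update(x, y)
f_hat, var_hat = est.regression(), est.noise_variance()

The forgetting factor trades tracking speed for variance: values of lam close to 1 average over a long effective history (stable but slow to react), while smaller values react quickly to drift at the cost of noisier estimates.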
Year
Pages
559–567
Physical description
Bibliography: 43 items, charts.
Creators
author
  • Institute of Computational Intelligence, Częstochowa University of Technology, Armii Krajowej 36, 42-200 Częstochowa, Poland
Bibliography
  • [1] Alippi, C., Boracchi, G. and Roveri, M. (2017). Hierarchical change-detection tests, IEEE Transactions on Neural Networks and Learning Systems 28(2): 246–258.
  • [2] Andrzejewski, W., Gramacki, A. and Gramacki, J. (2013). Graphics processing units in acceleration of bandwidth selection for kernel density estimation, International Journal of Applied Mathematics and Computer Science 23(4): 869–885, DOI: 10.2478/amcs-2013-0065.
  • [3] Bifet, A., Holmes, G., Kirkby, R. and Pfahringer, B. (2010). MOA: Massive online analysis, Journal of Machine Learning Research 11: 1601–1604.
  • [4] Brown, L.D. and Levine, M. (2007). Variance estimation in nonparametric regression via the difference sequence method, Annals of Statistics 35(5): 2219–2232.
  • [5] Carroll, R.J. and Ruppert, D. (1988). Transformation and Weighting in Regression, CRC Press, Boca Raton, FL.
  • [6] Dai, W., Ma, Y., Tong, T. and Zhu, L. (2015). Difference-based variance estimation in nonparametric regression with repeated measurement data, Journal of Statistical Planning and Inference 163: 1–20.
  • [7] Diggle, P.J. and Verbyla, A.P. (1998). Nonparametric estimation of covariance structure in longitudinal data, Biometrics 54(2): 401–415.
  • [8] Ditzler, G., Roveri, M., Alippi, C. and Polikar, R. (2015). Learning in nonstationary environments: A survey, IEEE Computational Intelligence Magazine 10(4): 12–25.
  • [9] Domingos, P. and Hulten, G. (2000). Mining high-speed data streams, Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Boston, MA, USA, pp. 71–80.
  • [10] Duda, P., Jaworski, M. and Rutkowski, L. (2017). Knowledge discovery in data streams with the orthogonal series-based generalized regression neural networks, Information Sciences 460–461: 497–518.
  • [11] Duda, P., Jaworski, M. and Rutkowski, L. (2018). Convergent time-varying regression models for data streams: Tracking concept drift by the recursive Parzen-based generalized regression neural networks, International Journal of Neural Systems 28(02): 1750048.
  • [12] Epanechnikov, V.A. (1969). Non-parametric estimation of a multivariate probability density, Theory of Probability & Its Applications 14(1): 153–158.
  • [13] Fan, J. and Yao, Q. (1998). Efficient estimation of conditional variance functions in stochastic regression, Biometrika 85(3): 645–660.
  • [14] Gama, J., Zliobaite, I., Bifet, A., Pechenizkiy, M. and Bouchachia, A. (2014). A survey on concept drift adaptation, ACM Computing Surveys (CSUR) 46(4): 44:1–44:37.
  • [15] Gasser, T., Kneip, A. and Köhler, W. (1991). A flexible and fast method for automatic smoothing, Journal of the American Statistical Association 86(415): 643–652.
  • [16] Gasser, T., Sroka, L. and Jennen-Steinmetz, C. (1986). Residual variance and residual pattern in nonlinear regression, Biometrika 73(3): 625–633.
  • [17] Greblicki, W. (1974). Asymptotically Optimal Algorithms of Recognition and Identification in Probabilistic Conditions, BI Wrocław University of Technology, Wrocław, (in Polish).
  • [18] Greblicki, W. and Pawlak, M. (2008). Nonparametric System Identification, Cambridge University Press, Cambridge.
  • [19] Györfi, L., Kohler, M., Krzyzak, A. and Walk, H. (2002). A Distribution-Free Theory of Nonparametric Regression, Springer, New York, NY.
  • [20] Hall, P. and Carroll, R.J. (1989). Variance function estimation in regression: The effect of estimating the mean, Journal of the Royal Statistical Society: Series B (Methodological) 51(1): 3–14.
  • [21] Hart, J. (1997). Nonparametric Smoothing and Lack-of-Fit Tests, Springer, New York, NY.
  • [22] Jaworski, M., Duda, P. and Rutkowski, L. (2017). New splitting criteria for decision trees in stationary data streams, IEEE Transactions on Neural Networks and Learning Systems PP(99): 1–14.
  • [23] Krzyzak, A. and Pawlak, M. (1984). Almost everywhere convergence of a recursive regression function estimate and classification, IEEE Transactions on Information Theory 30(1): 91–93.
  • [24] Krzyzak, A. and Pawlak, M. (1987). The pointwise rate of convergence of the kernel regression estimate, Journal of Statistical Planning and Inference 16: 159–166.
  • [25] Mzyk, G. (2007). Generalized kernel regression estimate for the identification of Hammerstein systems, International Journal of Applied Mathematics and Computer Science 17(2): 189–197, DOI: 10.2478/v10006-007-0018-z.
  • [26] Nikulin, V. (2016). Prediction of the shoppers loyalty with aggregated data streams, Journal of Artificial Intelligence and Soft Computing Research 6(2): 69–79.
  • [27] Parzen, E. (1962). On estimation of a probability density function and mode, Annals of Mathematical Statistics 33: 1065–1076.
  • [28] Pietruczuk, L., Rutkowski, L., Jaworski, M. and Duda, P. (2014). The Parzen kernel approach to learning in non-stationary environment, Proceedings of the International Joint Conference on Neural Networks (IJCNN), Beijing, China, pp. 3319–3323.
  • [29] Pietruczuk, L., Rutkowski, L., Jaworski, M. and Duda, P. (2017). How to adjust an ensemble size in stream data mining?, Information Sciences 381: 46–54.
  • [30] Rafajlowicz, E. (1987). Nonparametric orthogonal series estimators of regression: A class attaining the optimal convergence rate in L2, Statistics and Probability Letters 5(3): 219–224.
  • [31] Rafajlowicz, E. (1989). Reduction of distributed system identification complexity using intelligent sensors, International Journal of Control 50(5): 1571–1576.
  • [32] Rao, B.P. (2014). Nonparametric Functional Estimation, Academic Press, Cambridge, MA.
  • [33] Ruppert, D., Wand, M.P., Holst, U. and Hössjer, O. (1997). Local polynomial variance-function estimation, Technometrics 39(3): 262–273.
  • [34] Rutkowski, L. (2004). Generalized regression neural networks in time-varying environment, IEEE Transactions on Neural Networks 15: 576–596.
  • [35] Rutkowski, L. and Galkowski, T. (1994). On pattern classification and system identification by probabilistic neural networks, International Journal of Applied Mathematics and Computer Science 4(3): 413–422.
  • [36] Rutkowski, L., Jaworski, M., Pietruczuk, L. and Duda, P. (2015). A new method for data stream mining based on the misclassification error, IEEE Transactions on Neural Networks and Learning Systems 26(5): 1048–1059.
  • [37] Rutkowski, L., Pietruczuk, L., Duda, P. and Jaworski, M. (2013). Decision trees for mining data streams based on the McDiarmid’s bound, IEEE Transactions on Knowledge and Data Engineering 25(6): 1272–1279.
  • [38] Shaker, A. and Hüllermeier, E. (2014). Survival analysis on data streams: Analyzing temporal events in dynamically changing environments, International Journal of Applied Mathematics and Computer Science 24(1): 199–212, DOI: 10.2478/amcs-2014-0015.
  • [39] Shen, H. and Brown, L.D. (2006). Non-parametric modelling of time-varying customer service times at a bank call centre, Applied Stochastic Models in Business and Industry 22(3): 297–311.
  • [40] von Neumann, J. (1941). Distribution of the ratio of the mean square successive difference to the variance, Annals of Mathematical Statistics 12(4): 367–395.
  • [41] Wang, H., Fan, W., Yu, P.S. and Han, J. (2003). Mining concept-drifting data streams using ensemble classifiers, Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD’03, Washington, DC, USA, pp. 226–235.
  • [42] Weinberg, A.I. and Last, M. (2017). Interpretable decision-tree induction in a big data parallel framework, International Journal of Applied Mathematics and Computer Science 27(4): 737–748, DOI: 10.1515/amcs-2017-0051.
  • [43] Zliobaite, I., Bifet, A., Pfahringer, B. and Holmes, G. (2014). Active learning with drifting streaming data, IEEE Transactions on Neural Networks and Learning Systems 25(1): 27–39.
Notes
Record created under agreement no. 509/P-DUN/2018 from the funds of the Ministry of Science and Higher Education (MNiSW) allocated to science dissemination activities (2018).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-d3e97373-6ae7-4699-a17a-cfd841f92356