

Article title

Learning novelty detection outside a class of random curves with application to COVID-19 growth

Publication languages
EN
Abstracts
EN
Let a class of proper curves be specified by positive examples only. We propose a learning novelty detection algorithm that decides whether a new curve lies outside this class. In contrast to most of the literature, two sources of curve variability are present: the variability inherent to curves from the proper class and observation errors. Therefore, a decision function is first trained on historical data, and then the descriptors of each curve to be classified are learned from noisy observations. When the intrinsic variability is Gaussian, a decision threshold can be established from Hotelling's T² distribution and tuned to more general cases. Expansion coefficients in a selected orthogonal series are taken as descriptors, and an algorithm for learning them is proposed that follows nonparametric curve fitting approaches. A fast version is derived for descriptors based on the cosine series. Additionally, the asymptotic normality of the learned descriptors and a bound on the probability of their large deviations are proved. The influence of this bound on the decision threshold is also discussed. The proposed approach covers curves described as functional data projected onto a finite-dimensional subspace of a Hilbert space, as well as a shape-sensitive description of curves known as the square-root velocity (SRV). It was tested both on synthetic data and on real-life observations of COVID-19 growth curves.
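The two-stage scheme outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, only one plausible reading of it under stated assumptions: curves are observed on the midpoint grid t_i = (i + 0.5)/N of [0, 1], descriptors are the first few cosine-series coefficients, the intrinsic variability is Gaussian, and the Hotelling-type threshold is obtained from a Monte Carlo estimate of an F quantile. The function names `cosine_descriptors`, `fit_detector` and `is_novel`, and all parameter names, are hypothetical.

```python
import numpy as np


def cosine_descriptors(y, n_coef):
    """Estimate the first n_coef cosine-series coefficients of a curve.

    y holds noisy samples taken at the midpoints t_i = (i + 0.5) / N of [0, 1]
    (an assumption of this sketch); y may be 1-D or a 2-D array of curves.
    """
    y = np.asarray(y, dtype=float)
    n = y.shape[-1]
    t = (np.arange(n) + 0.5) / n
    k = np.arange(n_coef)
    # Orthonormal cosine basis: v_0 = 1, v_k = sqrt(2) * cos(k * pi * t).
    basis = np.sqrt(2.0) * np.cos(np.pi * np.outer(k, t))
    basis[0] = 1.0
    # Sample averages y_i * v_k(t_i); this equals a DCT-II up to scaling.
    return y @ basis.T / n


def fit_detector(train_descr, alpha=0.05, n_mc=200_000, seed=0):
    """Train the decision function on descriptors of historical proper curves."""
    n, p = train_descr.shape
    mean = train_descr.mean(axis=0)
    cov = np.cov(train_descr, rowvar=False)
    # For a new Gaussian descriptor independent of the training sample,
    # (x - mean)' S^{-1} (x - mean) ~ p (n + 1)(n - 1) / (n (n - p)) * F(p, n - p).
    # The F quantile is approximated by simulation to keep the sketch NumPy-only.
    rng = np.random.default_rng(seed)
    f_quantile = np.quantile(rng.f(p, n - p, size=n_mc), 1.0 - alpha)
    threshold = p * (n + 1) * (n - 1) / (n * (n - p)) * f_quantile
    return mean, np.linalg.inv(cov), threshold


def is_novel(descr, mean, cov_inv, threshold):
    """Flag a curve as novel when its T^2-type statistic exceeds the threshold."""
    d = np.asarray(descr, dtype=float) - mean
    return float(d @ cov_inv @ d) > threshold
```

In this sketch the observation noise on a new curve is ignored when the threshold is set; the abstract indicates that the paper accounts for it through a large-deviation bound on the learned descriptors.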
Keywords
Year
Pages
195--215
Physical description
Bibliography: 74 items, figures.
Authors
  • Department of Control Systems and Mechatronics, Wrocław University of Science and Technology, Wrocław, Poland
Notes
Record prepared with funds from the Polish Ministry of Science and Higher Education (MNiSW), agreement No. 461252, under the programme "Social Responsibility of Science", module: Popularisation of science and promotion of sport (2021).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-b3f14835-3e0d-4aa3-8af0-55861f42bfde