Journal | 2024 | Vol. 44, Fasc. 1 | 1--13
Article title
Authors
Title variants
Languages of publication
Abstracts
We prove a Bernstein-type bound for the difference between the average of the negative log-likelihoods of independent categorical variables with infinitely many levels (that is, a countably infinite number of categories) and its expectation, namely the Shannon entropy. The result holds for the class of discrete random variables whose tails are lighter than, or of the same order as, those of a discrete power-law distribution. Most commonly used discrete distributions, such as the Poisson distribution, the negative binomial distribution, and the power-law distribution itself, belong to this class. The bound is effective in the sense that we provide a method for computing the constants appearing in it. The new technique we develop also yields a uniform concentration inequality for categorical variables with finitely many levels, achieving the same optimal rate as in the literature but with a much simpler proof.
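The quantity the bound controls can be illustrated numerically. The snippet below is a minimal sketch, not taken from the paper: it assumes SciPy, uses a Poisson(3) distribution as the light-tailed example, and truncates the countably infinite entropy sum at k = 200; all of these choices are illustrative. It compares the empirical average of negative log-likelihoods with its expectation, the Shannon entropy.

```python
# Minimal numerical sketch (illustrative, not from the paper): for
# X_1, ..., X_n i.i.d. Poisson(lam), compare the average negative
# log-likelihood (1/n) * sum_i -log p(X_i) with its expectation,
# the Shannon entropy H(p) = -sum_k p(k) * log p(k).
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(0)
lam, n = 3.0, 10_000  # illustrative rate and sample size

# Empirical average of negative log-likelihoods.
x = rng.poisson(lam, size=n)
avg_nll = -poisson.logpmf(x, lam).mean()

# Shannon entropy of Poisson(lam), truncating the countably infinite
# sum at a point where the remaining tail mass is negligible.
k = np.arange(200)
p = poisson.pmf(k, lam)
entropy = -np.sum(p * np.log(p, where=p > 0, out=np.zeros_like(p)))

print(f"average NLL : {avg_nll:.6f}")
print(f"entropy H(p): {entropy:.6f}")
# The Bernstein-type bound controls how large this deviation can be.
print(f"deviation   : {abs(avg_nll - entropy):.6f}")
```

Rerunning with a different seed or a larger n shows the fluctuation a concentration inequality of this type quantifies: the deviation shrinks as the sample size grows.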
Journal
Year
Volume
Pages
1--13
Physical description
Bibliography: 16 items
Contributors
author
- Department of Statistics, Colorado State University, Fort Collins, CO 80521, USA; Yunpeng.Zhao@colostate.edu
Bibliography
- [1] A. Antos and I. Kontoyiannis, Convergence properties of functional estimates for discrete distributions, Random Structures Algorithms 19 (2001), 163-193.
- [2] V. Baccetti and M. Visser, Infinite Shannon entropy, J. Statist. Mech. Theory Exp. 4 (2013), art. P04010, 12 pp.
- [3] J. Beirlant, E. J. Dudewicz, L. Györfi, and E. C. Van der Meulen, Nonparametric entropy estimation: An overview, Int. J. Math. Statist. Sci. 6 (1997), 17-39.
- [4] D. S. Choi, P. J. Wolfe, and E. M. Airoldi, Stochastic blockmodels with a growing number of classes, Biometrika 99 (2012), 273-284.
- [5] T. M. Cover and J. A. Thomas, Elements of Information Theory, 2nd ed., Wiley-Interscience, Hoboken, NJ, 2006.
- [6] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems, Cambridge Univ. Press, Cambridge, 2011.
- [7] D. P. Dubhashi and A. Panconesi, Concentration of Measure for the Analysis of Randomized Algorithms, Cambridge Univ. Press, Cambridge, 2009.
- [8] Y. Li and B. Tian, Optimal non-asymptotic concentration of centered empirical relative entropy in the high-dimensional regime, Statist. Probab. Lett. 197 (2023), art. 109803, 5 pp.
- [9] M. Raginsky and I. Sason, Concentration of measure inequalities in information theory, communications and coding, arXiv:1212.4663 (2012).
- [10] Z. Ren, Optimal distribution-free concentration for the log-likelihood function of Bernoulli variables, J. Inequal. Appl. 2023, art. 81, 11 pp.
- [11] H. Robbins, A remark on Stirling’s formula, Amer. Math. Monthly 62 (1955), 26-29.
- [12] C. E. Shannon, A mathematical theory of communication, Bell System Tech. J. 27 (1948), 379-423.
- [13] R. Vershynin, High-Dimensional Probability: An Introduction with Applications in Data Science, Cambridge Univ. Press, Cambridge, 2018.
- [14] M. J. Wainwright, High-Dimensional Statistics: A Non-Asymptotic Viewpoint, Cambridge Univ. Press, Cambridge, 2019.
- [15] Y. Zhao, A note on new Bernstein-type inequalities for the log-likelihood function of Bernoulli variables, Statist. Probab. Lett. 163 (2020), art. 108779, 5 pp.
- [16] Y. Zhao, An optimal uniform concentration inequality for discrete entropies on finite alphabets in the high-dimensional setting, Bernoulli 28 (2022), 1892-1911.
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.baztech-e95bc6c6-5521-4139-9ffc-8d2151de58b9