Abstract
We consider the regression model in the situation where the number of available regressors p_n is much larger than the sample size n and the number of nonzero coefficients p_{0n} is small (sparse regression). To choose the regression model, we need to identify the nonzero coefficients. In this situation, however, classical model selection criteria for the choice of predictors, such as the Bayesian Information Criterion (BIC), overestimate the number of regressors. To address this problem, several modifications of BIC have recently been proposed. In this paper we prove weak consistency of some of these modifications under the assumption that n, p_n and p_{0n} all go to infinity.
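For orientation, the criteria discussed in the abstract can be written out as follows. The abstract does not say which modifications the paper treats, so the two modified forms below are an assumption: they are the forms usually associated with the modified BIC of reference [7] and the extended BIC of reference [11], not a quotation of the paper's own definitions. The symbols L(M) (maximized likelihood of model M), k (number of regressors in M), c and γ (tuning constants) are introduced here for illustration only.

```latex
% Classical BIC for a model M with k regressors and sample size n
\mathrm{BIC}(M) = -2\log L(M) + k\log n

% Modified BIC (assumed form): extra penalty growing with the number
% of available regressors p_n; c is a prior guess of the number of
% true regressors
\mathrm{mBIC}(M) = -2\log L(M) + k\log n + 2k\log\!\left(\frac{p_n}{c} - 1\right)

% Extended BIC (assumed form): penalty on the number of models of
% size k, with 0 \le \gamma \le 1
\mathrm{EBIC}_{\gamma}(M) = -2\log L(M) + k\log n + 2\gamma\log\binom{p_n}{k}
```

In both modified forms the additional term depends on p_n, which is what counteracts the tendency of the classical BIC to select too many regressors when p_n is much larger than n.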
Pages
47-55
Physical description
Bibliography: 19 items.
Contributors
author
- Institute of Mathematics and Computer Science, Wrocław University of Technology, ul. Janiszewskiego 14a, 50-372 Wrocław, Poland
References
- [1] F. Abramovich, Y. Benjamini, D. L. Donoho and I. M. Johnstone, Adapting to unknown sparsity by controlling the false discovery rate, Ann. Statist. 34 (2006), pp. 584-653.
- [2] A. Baierl, M. Bogdan, F. Frommlet and A. Futschik, On locating multiple interacting quantitative trait loci in intercross designs, Genetics 173 (2006), pp. 1693-1703.
- [3] A. Baierl, A. Futschik, M. Bogdan and P. Biecek, Locating multiple interacting quantitative trait loci using robust model selection, Comput. Statist. Data Anal. 51 (2007), pp. 6423-6434.
- [4] Y. Benjamini and Y. Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing, J. Roy. Statist. Soc. Ser. B 57 (1) (1995), pp. 289-300.
- [5] P. J. Bickel, Y. Ritov and A. B. Tsybakov, Simultaneous analysis of LASSO and Dantzig selector, Ann. Statist. 37 (2009), pp. 1705-1732.
- [6] M. Bogdan, A. Chakrabarti, J. K. Ghosh and F. Frommlet, Asymptotic Bayes-optimality under sparsity of some multiple testing procedures, Ann. Statist. 39 (2011), pp. 1551-1579.
- [7] M. Bogdan, J. K. Ghosh and R. W. Doerge, Modifying the Schwarz Bayesian information criterion to locate multiple interacting quantitative trait loci, Genetics 167 (2004), pp. 989-999.
- [8] M. Bogdan, J. K. Ghosh and M. Żak-Szatkowska, Selecting explanatory variables with the modified version of Bayesian Information Criterion, Qual. Reliab. Eng. Int. 24 (2008), pp. 627-641.
- [9] K. W. Broman and T. P. Speed, A model selection approach for the identification of quantitative trait loci in experimental crosses, J. Roy. Statist. Soc. Ser. B 64 (2002), pp. 641-656.
- [10] E. Candes and T. Tao, The Dantzig selector: statistical estimation when p is much larger than n, Ann. Statist. 35 (2007), pp. 2313-2351.
- [11] J. Chen and Z. Chen, Extended Bayesian information criterion for model selection with large model space, Biometrika 95 (2008), pp. 759-771.
- [12] J. Chen and Z. Chen, Extended BIC for small n-large-P sparse GLM (2010) (submitted, available at www.stat.nus.edu.sg/~stachenz/ChenChen.pdf).
- [13] Z. Chen and Z. Luo, Extended BIC for linear regression models with diverging number of parameters and high or ultra-high feature spaces (2011) (technical report available at arxiv.org/abs/1107.2502v1).
- [14] F. Frommlet, M. Bogdan and A. Chakrabarti, Asymptotic Bayes optimality under sparsity for general priors under the alternative (2011) (technical report available at arxiv.org/abs/1005.4753v2).
- [15] F. Frommlet, F. Ruhaltinger, P. Twaróg and M. Bogdan, A model selection approach to genome wide association studies, Comput. Statist. Data Anal. (2011) (doi:10.1016/j.csda.2011.05.005).
- [16] E. I. George and D. P. Foster, Calibration and empirical Bayes variable selection, Biometrika 87 (2000), pp. 731-747.
- [17] W. Li and Z. Chen, Multiple interval mapping for quantitative trait loci with a spike in the trait distribution, Genetics 182 (2) (2009), pp. 337-342.
- [18] M. Żak-Szatkowska and M. Bogdan, Modified versions of Bayesian Information Criterion for sparse Generalized Linear Models, Comput. Statist. Data Anal. 55 (11) (2011), pp. 2908-2924.
- [19] J. Zhao and Z. Chen, A two-stage penalized logistic regression approach to case-control genome-wide association studies (2010) (submitted, available at www.stat.nus.edu.sg/~stachenz/MS091221PR.pdf).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-3cdb3bcb-0b20-429f-b58a-f57730825fe7