A new space-saving Bayesian tree construction method for high dimensional data

Kłopotek, M.

Powiadomienia systemowe

Sesja wygasła!

Artykuł - szczegóły

Tytuł artykułu

A new space-saving Bayesian tree construction method for high dimensional data

Autorzy

Kłopotek M.

Wybrane pełne teksty z tego czasopisma

http://www.degruyter.com/view/j/dema

Identyfikatory

Warianty tytułu

Języki publikacji

Abstrakty

Bayesian networks have many practical applications due to their capability to represent joint probability distribution in many variables in a compact way. There exist efficient reasoning methods for Bayesian networks. Many algorithms for learning Bayesian networks from empirical data have been developed. A well-known problem with Bayesian networks is the practical limitation for the number of variables for which a Bayesian network can be learned in reasonable time. A remarkable exception here is the Chow/Liu algorithm learning tree-like Bayesian networks. However, also this algorithm has an important limitation, related to space consumption. The space required is quadratic in the number of variables. The paper presents a novel algorithm overcoming this limitation for the tree-like class of Bayesian networks. The new algorithm space consumption grows linearly with the number of variables while the execution time is comparable with the Chow/Liu algorithm. This opens new perspectives in construction of Bayesian networks from data containing thousands and more variables, e.g. in automatic text categorization.

Słowa kluczowe

machine learnig Bayesian networks Chow/Liu algorithm

Wydawca

De Gruyter

Czasopismo

Demonstratio Mathematica

Rocznik

2002

Tom

Vol. 35, nr 3

Strony

671--684

Opis fizyczny

Bibliogr. 17 poz.

Twórcy

autor

Kłopotek M.

klopotek@ipipan.waw.pl

Institute of Computer Science, Polish Academy of Sciences, ul. Ordona 21, 01-237 Warszawa, Poland
Faculty of Mathematics and Information Science, Warsaw University of Technology, Plac Politechniki 1, 00-661 Warszawa, Poland

Bibliografia

[1] J. Cerquides, Applying General Bayesian Techniques to Improve TAN Induction, Knowledge Discovery and Data Mining, 1999, pp. 292-296, http://citeseer.nj.nec.com/cerquides99applying.html.
[2] J. Cheng, D. A. Bell, W. Liu, An algorithm for Bayesian belief network construction from data, Proceedings of AI & STAT'97, Ft. Lauderdale, Florida, 1997.
[3] J. Cheng, D. A. Bell, W. Liu, Learning belief networks from data: an information theory based approach, Proceedings of the Sixth ACM International Conference on Information and Knowledge Management, 1997.
[4] C. K. Chow, C. N. Liu, Approximating discrete probability distributions with dependence trees, IEEE Transactions on Information Theory, IT-14, No. 3, 1968, pp. 462-467.
[5] C. K. Chou, T. J. Wagner, Consistency of an estimate of tree-dependent probability distribution, IEEE Transactions on Information Theory, IT-19, 1973, 369-371.
[6] N. Friedman, D. Geiger, M. Goldszmidt, Bayesian Network Classifiers, Machine Learning vol. 29, 1997, pp. 131. http://citeseer.nj.nec.com/friedman97bayesian.html
[7] N. Inza, M. Merino, P. Larranaga, J. Quiroga, B. Sierra, M. Girala, Feature Subset Selection by Genetic Algorithms and Estimation of Distribution Algorithms. A Case Study in the Survival of Cirrhotic Patients Treated with TIPS., Artificial Intelligence in Medicine (in press), http://citeseer.nj.nec.com/345995.html
[8] M. A. Kłopotek, S. T. Wierzchoń, M. Michalewicz, M. Bednarczyk, W. Pawłowski, A. Wąsowski, Bayesian Network Mining System, Proc. X International Symposium on Intelligent Information Systems, Zakopane, 18-22 June, 2001, Springer-Verlag, New York 2001. ISBN-3-7908-1407-5. pp. 97-110.
[9] M. A. Kłopotek, On a Deficiency of the FCI Algorithm Learning Bayesian Networks from Data, Demonstratio Math. 33 (2000), 181-194.
[10] M. A. Kłopotek, Methods of Identification and Interpretations of Belief Distributions in the Dempster-Shafer Theory, (in Polish). Publisher: Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland, 1998, ISBN 83-900820-8-x.
[11] M. Meila, An Accelerated Chow and Liu Algorithm: Fitting Tree Distributions to High-Dimensional Sparse Data, http://citeseer. nj.nec.com/285522.html
[12] M. Meila, An accelerated Chow and Liu algorithm: fitting tree distributions to high-dimensional sparse data, http://citeseer. nj.nec.com/363584.html
[13] M. Meila, M. Jordan, Learning with mixtures of trees, J. Machine Learning Research 1 (2000), http://citeseer.nj.nec.com/meila001earning.html, http://www.ai.mit.edu/projects/jmlr/papers/ volume1/meila00a/html/meila00a.html
[14] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo CA, 1988.
[15] J. Suzuki, Learning Bayesian Belief Networks based on the Minimum Description Length Principle: Basic Properties, IEICE Trans, on Foundamentals, Vol. E82-A, Oct. 1999, and the conference paper in UAI-93. http://www .math.sci.osaka-u.ac.jp/suzuki/net. html
[16] R. S. Valiveti, B. J. Oommen, The use of chi-squared statistics in determining dependence tree, Technical Report SCS-TR-153, School of Computer Science, Carleton University, Ottawa, Ontario, March 1989.
[17] R. S. Valiveti, B. J. Oommen, On using the chi-squared statistics for determining statistic dependence, Pattern Recognition Vol. 25 No. 11 (1992), 1389-1400.

Typ dokumentu

Bibliografia

Identyfikator YADDA

bwmeta1.element.baztech-article-PWA1-0044-0022