Powiadomienia systemowe
- Sesja wygasła!
- Sesja wygasła!
Identyfikatory
Warianty tytułu
Języki publikacji
Abstrakty
Constrained optimization is central to large-scale machine learning, particularly in parallel and distributed environments. This paper presents a comprehensive study of augmented Lagrangian-based algorithms for such problems, including classical Lagrangian relaxation, the method of multipliers, the Alternating Direction Method of Multipliers (ADMM), Bertsekas’ algorithm, Tatjewski’s method, and the Separable Augmented Lagrangian Algorithm (SALA). We develop a unified theoretical framework, analyze convergence properties and decomposition strategies, and evaluate these methods on two representative classes of tasks: regularized linear systems and K-means clustering. Numerical experiments on synthetic and real-world datasets show that Bertsekas’ method consistently achieves the best balance of convergence speed and solution quality, while ADMM offers practical scalability under decomposition but struggles in high-dimensional or ill-conditioned settings. Tatjewski’s method benefits significantly from partitioning, whereas the classical Augmented Lagrangian approach proves computationally inefficient for large-scale problems. These findings clarify the trade-offs among augmented Lagrangian algorithms, highlighting Bert-sekas’ method as the most effective for distributed optimization and providing guidance for algorithm selection in large-scale machine learning applications.
Rocznik
Tom
Strony
18
Opis fizyczny
Bibliogr. 31 poz., tab., rys.
Twórcy
autor
- Warsaw University of Technology, Warsaw, Poland
autor
- Warsaw University of Technology, Warsaw, Poland
Bibliografia
- [1] K. Arrow, H. Azawa, K. M. R. Collection, L. Hurwicz, H. Uzawa, H. Chenery, S. Johnson, and S. Karlin, Studies in Linear and Non-linear Programming, ser. Stanford mathematical studies in the social sciences. Stanford University Press, 1958. [Online]. Available: https://books.google.pl/books?id=TkRlnQEACAAJ
- [2] M. R. Hestenes, “Multiplier and gradient methods,” Journal of Optimization Theory and Applications, vol. 4, no. 5, pp. 303-320, nov 1969. [Online]. Available: https://doi.org/10.1007/bf00927673
- [3] M. J. D. Powell, “A method for nonlinear constraints in minimization problems,” Optimization, pp. 283-298, 1969. [Online]. Available: https://ci.nii.ac.jp/naid/20000922074/en/
- [4] D. P. Bertsekas, “Convexification procedures and decomposition methods for nonconvex optimization problems,” Journal of Optimization Theory and Applications, vol. 29, no. 2, pp. 169-197, oct 1979. [Online]. Available: https://doi.org/10.1007/bf00937167
- [5] A. Tanikawa and H. Mukai, “A new technique for nonconvex primal-dual decomposition of a large-scale separable optimization problem,” IEEE Transactions on Automatic Control, vol. 30, no. 2, pp. 133-143, 1985.
- [6] A. C. Nwachukwu and A. Karbowski, “Solution of the simultaneous routing and bandwidth allocation problem in energy-aware networks using augmented Lagrangian-based algorithms and decomposition,” Energies, vol. 17, no. 5, p. 1233, mar 2024. [Online]. Available: http://dx.doi.org/10.3390/en17051233
- [7] P. Tatjewski, “New dual-type decomposition algorithm for nonconvex separable optimization problems,” Automatica, vol. 25, no. 2, pp. 233-242, mar 1989. [Online]. Available: https://doi.org/10.1016/0005-1098(89)90076-9
- [8] D. Gabay and B. Mercier, “A dual algorithm for the solution of nonlinear variational problems via finite element approximation,” Computers & Mathematics with Applications, vol. 2, no. 1, pp. 17-40, 1976. [Online]. Available: https://doi.org/10.1016/0898-1221(76)90003-1
- [9] S. Boyd, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends® in Machine Learning, vol. 3, no. 1, pp. 1-122, 2010. [Online]. Available: https://doi.org/10.1561/2200000016
- [10] J. Eckstein and D. P. Bertsekas, “On the Douglas—Rachford splitting method and the proximal point algorithm for maximal monotone operators,” Mathematical Programming, vol. 55, no. 1-3, pp. 293-318, apr 1992. [Online]. Available: https://doi.org/10.1007/bf01581204
- [11] A. Hamdi, P. Mahey, and J. P. Dussault, “A new decomposition method in nonconvex programming via a separable augmented Lagrangian,” in Recent Advances in Optimization, P. Gritzmann, R. Horst, E. Sachs, and R. Tichatschke, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 1997, pp. 90-104.
- [12] A. Hamdi, “Two-level primal-dual proximal decomposition technique to solve large scale optimization problems,” Applied Mathematics and Computation, vol. 160, no. 3, p. 921 - 938, 2005.
- [13] A. Hamdi and S. K. Mishra, “Decomposition methods based on aug-mented Lagrangians: A survey,” Springer Optimization and Its Applica-tions, vol. 50, p. 175 - 203, 2011.
- [14] A. N. Tikhonov and V. Y. Arsenin, Solutions of Ill-posed Problems. New York: Wiley, 1977.
- [15] P. C. Hansen, Rank-Deficient and Discrete Ill-Posed Problems: Numer-ical Aspects of Linear Inversion. Philadelphia: SIAM, 1998.
- [16] A. N. Tikhonov, “Ill-posed problems of linear algebra and a stable method for their solution,” Soviet Math. Dokl., vol. 5, pp. 1035-1038, 1965.
- [17] G. H. Golub, P. C. Hansen, and D. P. O’Leary, “Tikhonov regularization and total least squares,” SIAM Journal on Matrix Analysis and Applica-tions, vol. 21, no. 1, pp. 185-194, 1999.
- [18] A. E. Hoerl and R. W. Kennard, “Ridge regression: Biased estimation for nonorthogonal problems,” Technometrics, vol. 12, no. 1, pp. 55-67, 1970.
- [19] A. K. Jain, M. N. Murty, and P. J. Flynn, “Data clustering: a review,” ACM Computing Surveys, vol. 31, no. 3, pp. 264-323, 1999.
- [20] S. Lloyd, “Least squares quantization in PCM,” IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129-137, 1982.
- [21] J. MacQueen, “Some methods for classification and analysis of multi-variate observations,” in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, vol. 1. University of California Press, 1967, pp. 281-297.
- [22] E. G. Birgin, J. M. Mart´ınez, and M. Raydan, “Practical augmented Lagrangian methods for constrained optimization,” SIAM Journal on Optimization, vol. 14, no. 2, pp. 500-523, 2005.
- [23] J. Nocedal and S. J. Wright, Numerical Optimization, 2nd ed. New York: Springer, 2006.
- [24] P. Bradley, K. Bennett, and A. Demiriz, “Constrained K-means clustering,” Microsoft Research, Redmond, Tech. Rep. MSR-TR-2000-65, 2000, Microsoft Research Technical Report. [On-line]. Available: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tr-2000-65.pdf
- [25] W. H. Wolberg and O. L. Mangasarian, “Breast cancer Wisconsin (diagnostic),” 1993.
- [26] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vander-plas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duch-esnay, “Scikit-learn: Machine learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825-2830, 2011.
- [27] P. Tschandl, C. Rosendahl, and H. Kittler, “The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions,” Scientific Data, vol. 5, no. 1, 8 2018. [Online]. Available: http://dx.doi.org/10.1038/sdata.2018.161
- [28] N. C. F. Codella, D. Gutman, M. E. Celebi, B. Helba, M. A. Marchetti, S. W. Dusza, A. Kalloo, K. Liopyris, N. Mishra, H. Kittler, and A. Halpern, “Skin lesion analysis toward melanoma detection: A challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), Hosted by the International Skin Imaging Collaboration (ISIC),” 2017. [Online]. Available: https://arxiv.org/abs/1710.05006
- [29] M. Combalia, N. C. F. Codella, V. Rotemberg, B. Helba, V. Vilaplana, O. Reiter, C. Carrera, A. Barreiro, A. C. Halpern, S. Puig, and J. Malvehy, “BCN20000: Dermoscopic lesions in the wild,” 2019. [Online]. Available: https://arxiv.org/abs/1908.02288
- [30] A. Ben Abacha and D. Demner-Fushman, “A question-entailment approach to question answering,” BMC Bioinform., vol. 20, no. 1, pp. 511:1-511:23, 2019. [Online]. Available: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-3119-4
- [31] E. Alsentzer, J. R. Murphy, W. Boag, W. Weng, D. Jin, T. Naumann, and M. B. A. McDermott, “Publicly available clinical BERT embeddings,” CoRR, vol. abs/1904.03323, 2019. [Online]. Available: http://arxiv.org/abs/1904.03323
Typ dokumentu
Bibliografia
Identyfikator YADDA
bwmeta1.element.baztech-065b4149-700b-4349-9814-d59ab9827783
JavaScript jest wyłączony w Twojej przeglądarce internetowej. Włącz go, a następnie odśwież stronę, aby móc w pełni z niej korzystać.