Deep Learning Optimization Tasks and Metaheuristic Methods

Biedrzycki, Rafał; Zawistowski, Paweł; Twardowski, Bartłomiej

doi:10.3233/FI-2019-1828

Artykuł - szczegóły

Tytuł artykułu

Deep Learning Optimization Tasks and Metaheuristic Methods

Autorzy

Biedrzycki Rafał , Zawistowski Paweł , Twardowski Bartłomiej

Wybrane pełne teksty z tego czasopisma

https://fi.episciences.org/

Identyfikatory

DOI

10.3233/FI-2019-1828

Warianty tytułu

Języki publikacji

Abstrakty

In this paper we identify and formulate two optimization tasks solved in connection with training DL models and constructing adversarial examples. This guides our review of optimization methods commonly used within the DL community. Simultaneously, we present findings from the literature concerning metaheuristics and black-box optimization. We focus on well-known optimizers suitable for solving ℝ^N tasks, which achieve good results on benchmarks and in competitions. Finally, we look into the research connected with utilizing metaheuristic optimization methods in combination with deep learning models.

Słowa kluczowe

deep learning metaheuristics optimization

Wydawca

IOS Press

Czasopismo

Fundamenta Informaticae

Rocznik

2019

Tom

Vol. 168, nr 2-4

Strony

185--218

Opis fizyczny

Bibliogr. 127 poz., fot., rys., tab.

Twórcy

autor

Biedrzycki Rafał

rbiedrzy@elka.pw.edu.pl

Institute of Computer Science, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland

autor

Zawistowski Paweł

pawel.zawistowski@pw.edu.pl

Institute of Computer Science, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland

autor

Twardowski Bartłomiej

B.Twardowski@ii.pw.edu.pl

Institute of Computer Science, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland

Bibliografia

[1] Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A. Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2015 pp. 1-9. doi:10.1109/CVPR.2015.7298594.
[2] He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2016 pp. 770-778. doi:10.1109/CVPR.2016.90.
[3] Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2015 pp. 3431-3440. doi:10.1109/CVPR.2015.7298965.
[4] Pohlen T, Hermans A, Mathias M, Leibe B. Fullresolution residual networks for semantic segmentation in street scenes. arXiv preprint:1611.08323, 2017.
[5] Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J. Distributed representations of words and phrases and their compositionality. In: Advances in neural information processing systems. 2013 pp. 3111-3119.
[6] Mikolov T, Grave E, Bojanowski P, Puhrsch C, Joulin A. Advances in Pre-Training Distributed Word Representations. In: Proceedings of the International Conference on Language Resources and Evaluation (LREC 2018). 2018.
[7] Bowman SR, Vilnis L, Vinyals O, Dai AM, Jozefowicz R, Bengio S. Generating sentences from a continuous space. arXiv preprint:1511.06349, 2015.
[8] Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate. arXiv preprint:1409.0473, 2014.
[9] Moosavi-Dezfooli SM, Fawzi A, Frossard P. Deepfool: a simple and accurate method to fool deep neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016 pp. 2574-2582.
[10] Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y. Generative adversarial nets. In: Advances in neural information processing systems. 2014 pp. 2672-2680.
[11] Papernot N, McDaniel P, Goodfellow I, Jha S, Celik ZB, Swami A. Practical black-box attacks against machine learning. In: Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security. ACM, 2017 pp. 506-519.
[12] Carlini N, Wagner D. Towards Evaluating the Robustness of Neural Networks. In: 2017 IEEE Symposium on Security and Privacy (SP). 2017 pp. 39-57. doi:10.1109/SP.2017.49.
[13] Carlini N, Wagner D. Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods. In: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, AISec ’17. ACM, New York, NY, USA. ISBN 978-1-4503-5202-4, 2017 pp. 3-14. doi:10.1145/3128572.3140444.
[14] Athalye A, Carlini N, Wagner D. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. arXiv preprint:1802.00420.
[15] Madry A, Makelov A, Schmidt L, Tsipras D, Vladu A. Towards Deep Learning Models Resistant to Adversarial Attacks. 2017. doi:10.1227/01.NEU.0000255452.20602.C9.1706.06083.
[16] LeCun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998. 86(11):2278-2324.
[17] Goodfellow I, Bengio Y, Courville A. Deep Learning. MIT Press, 2016.
[18] Hochreiter S, Schmidhuber J. Long short-term memory. Neural computation, 1997. 9(8):1735-1780.
[19] Xiao L, Bahri Y, Sohl-Dickstein J, Schoenholz SS, Pennington J. Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks. arXiv preprint:1806.05393, 2018.
[20] Rumelhart DE, Hinton GE, Williams RJ. Learning internal representations by error propagation. Technical report, California Univ San Diego La Jolla Inst for Cognitive Science, 1985.
[21] Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, Devin M, Ghemawat S, Irving G, Isard M, et al. Tensorflow: a system for large-scale machine learning. In: OSDI, volume 16. 2016 pp. 265-283.
[22] Paszke A, Gross S, Chintala S, Chanan G, Yang E, DeVito Z, Lin Z, Desmaison A, Antiga L, Lerer A. Automatic differentiation in pytorch. In: 31st Conference on Neural Information Processing Systems (NIPS 2017). 2017.
[23] Srivastava RK, Greff K, Schmidhuber J. Highway networks. arXiv preprint:1505.00387, 2015.
[24] Chung J, Gulcehre C, Cho K, Bengio Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint:1412.3555, 2014.
[25] Bengio Y, Louradour J, Collobert R, Weston J. Curriculum Learning. In: Proceedings of the 26th Annual International Conference on Machine Learning, ICML ’09. ACM, New York, NY, USA. ISBN 978-1-60558-516-1, 2009 pp. 41-48. doi:10.1145/1553374.1553380.
[26] Bottou L. Online Algorithms and Stochastic Approximations. In: Saad D (ed.), Online Learning and Neural Networks. Cambridge University Press, Cambridge, UK, 1998. Revised, Oct 2012.
[27] Ruder S. An overview of gradient descent optimization algorithms. CoRR, 2016. abs/1609.04747.
[28] Darken C, Chang J, Z JC, Moody J. Learning Rate Schedules For Faster Stochastic Gradient Search. In: Neural Networks for Signal Processing [1992] II., Proceedings of the 1992 IEEE-SP Workshop. IEEE Press, 1992 pp. 3-12.
[29] Dauphin YN, Pascanu R, Gulcehre C, Cho K, Ganguli S, Bengio Y. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization. In: Advances in neural information processing systems. 2014 pp. 2933-2941.
[30] Sutskever I, Martens J, Dahl G, Hinton G. On the importance of initialization and momentum in deep learning. In: International conference on machine learning. 2013 pp. 1139-1147.
[31] Duchi J, Hazan E, Singer Y. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 2011. 12(Jul):2121-2159.
[32] Dean J, Corrado G, Monga R, Chen K, Devin M, Mao M, Ranzato M, Senior A, Tucker P, Yang K, Le QV, Ng AY. Large Scale Distributed Deep Networks. In: Pereira F, Burges CJC, Bottou L, Weinberger KQ (eds.), Advances in Neural Information Processing Systems 25, pp. 1223-1231. Curran Associates, Inc., 2012.
[33] Tieleman T, Hinton G. Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. Technical report, 2012.
[34] Zeiler MD. ADADELTA: An Adaptive Learning Rate Method. CoRR, 2012. abs/1212.5701.
[35] Kingma DP, Ba JL. Adam: A method for stochastic optimization. In: Proc. 3rd Int. Conf. Learn. Representations. 2014.
[36] Zhu C, Byrd RH, Lu P, Nocedal J. Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound constrained optimization. ACM Transactions on Mathematical Software (TOMS), 1997. 23(4):550-560.
[37] Levenberg K. A method for the solution of certain non-linear problems in least squares. Quarterly of applied mathematics, 1944. 2(2):164-168.
[38] Marquardt DW. An algorithm for least-squares estimation of nonlinear parameters. Journal of the society for Industrial and Applied Mathematics, 1963. 11(2):431-441.
[39] Papernot N, McDaniel P, Sinha A, Wellman MP. SoK: Security and Privacy in Machine Learning. In: 2018 IEEE European Symposium on Security and Privacy (EuroS P). 2018 pp. 399-414. doi:10.1109/EuroSP.2018.00035.
[40] Brendel W, Rauber J, Bethge M. Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models. In: International Conference on Learning Representations. 2018.
[41] Glover F. Future Paths for Integer Programming and Links to Artificial Intelligence. Comput. Oper. Res., 1986. 13(5):533-549. doi:10.1016/0305-0548(86)90048-1.
[42] Wolpert DH, Macready WG. No Free Lunch Theorems for Optimization. Trans. Evol. Comp, 1997. 1(1):67-82. doi:10.1109/4235.585893.
[43] Awad NH, Ali M, Liang J, Qu B, Suganthan PN. Problem definitions and evaluation criteria for the CEC 2017 special session and competition on real-parameter optimization. Technical report, Nanyang Technol. Univ., Singapore and Jordan Univ. Sci. Technol. and Zhengzhou Univ., China, 2016.
[44] COCO (COmparing Continuous Optimisers). http://coco.gforge.inria.fr/. Accessed:2018-02-22.
[45] Loshchilov I, Glasmachers T. Black Box Optimization Competition. http://bbcomp.ini.rub.de/. Accessed:2018-01-25.
[46] Suganthan PN. Shared documents. http://web.mysites.ntu.edu.sg/epnsugan/PublicSite/Shared%20Documents. Accessed: 2018-06-07.
[47] Kumar A, Misra RK, Singh D. Improving the local search capability of Effective Butterfly Optimizer using Covariance Matrix Adapted Retreat Phase. In: 2017 IEEE Congress on Evolutionary Computation (CEC). 2017 pp. 1835-1842. doi:10.1109/CEC.2017.7969524.
[48] Hansen N, Ostermeier A. Completely Derandomized Self-Adaptation in Evolution Strategies. Evol. Comput., 2001. 9(2):159-195. doi:10.1162/106365601750190398.
[49] Zhang G, Shi Y. Hybrid Sampling Evolution Strategy for Solving Single Objective Bound Constrained Problems. In: 2018 IEEE Congress on Evolutionary Computation, CEC 2018, Rio de Janeiro, Brazil, July 8-13, 2018. 2018 pp. 1-7. doi:10.1109/CEC.2018.8477908.
[50] Brest J, Maučec MS, Bošković B. Single objective real-parameter optimization: Algorithm jSO. In: IEEE Congr. Evol. Comput. 2017 pp. 1311-1318. doi:10.1109/CEC.2017.7969456.
[51] Storn R, Price K. Differential Evolution - A Simple and Efficient Heuristic for global Optimization over Continuous Spaces. Journal of Global Optimization, 1997. 11(4):341-359. doi:10.1023/A:1008202821328.
[52] Stanovov V, Akhmedova S, Semenkin E. LSHADE Algorithm with Rank-Based Selective Pressure Strategy for Solving CEC 2017 Benchmark Problems. In: 2018 IEEE Congress on Evolutionary Computation (CEC). 2018 pp. 1-8. doi:10.1109/CEC.2018.8477977.
[53] Awad NH, Ali MZ, Suganthan PN. Ensemble sinusoidal differential covariance matrix adaptation with Euclidean neighborhood for solving CEC2017 benchmark problems. In: 2017 IEEE Congress on Evolutionary Computation (CEC). 2017 pp. 372-379. doi:10.1109/CEC.2017.7969336.
[54] Mohamed AW, Hadi AA, Fattouh AM, Jambi KM. LSHADE with semi-parameter adaptation hybrid with CMA-ES for solving CEC 2017 benchmark problems. In: 2017 IEEE Congress on Evolutionary Computation (CEC). 2017 pp. 145-152. doi:10.1109/CEC.2017.7969307.
[55] Jagodziński D, Arabas J. A differential evolution strategy. In: 2017 IEEE Congress on Evolutionary Computation (CEC). 2017 pp. 1872-1876. doi:10.1109/CEC.2017.7969529.
[56] Sallam KM, Elsayed SM, Sarker RA, Essam DL. Improved United Multi-Operator Algorithm for Solving Optimization Problems. In: 2018 IEEE Congress on Evolutionary Computation (CEC). 2018 pp. 1-8. doi:10.1109/CEC.2018.8477759.
[57] Hansen N. Benchmarking a BI-population CMA-ES on the BBOB-2009 Function Testbed. In: Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers, GECCO ’09. ACM, New York, NY, USA. ISBN 978-1-60558-505-5, 2009 pp. 2389-2396. doi:10.1145/1570256.1570333.
[58] Nishida K, Akimoto Y. Benchmarking the PSA-CMA-ES on the BBOB noiseless testbed. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion, GECCO 2018, Kyoto, Japan, July 15-19, 2018. 2018 pp. 1529-1536. doi:10.1145/3205651.3208297.
[59] Nguyen DM. Benchmarking a variant of the CMAES-APOP on the BBOB noiseless testbed. In: Aguirre HE, Takadama K (eds.), Proceedings of the Genetic and Evolutionary Computation Conference Companion, GECCO 2018, Kyoto, Japan, July 15-19, 2018. ACM, 2018 pp. 1521-1528. doi:10.1145/3205651.3208299.
[60] Belkhir N, Dréo J, Savéant P, Schoenauer M. Per Instance Algorithm Configuration of CMA-ES with Limited Budget. In: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO’17. ACM, New York, NY, USA. ISBN 978-1-4503-4920-8, 2017 pp. 681-688. doi:10.1145/3071178.3071343.
[61] Vaneev A. Derivative-Free Optimization Method. https://github.com/avaneev/biteopt. Accessed:2018-02-22.
[62] Pitra Z, Bajer L, Holena M. Doubly Trained Evolution Control for the Surrogate CMA-ES. In: Parallel Problem Solving from Nature - PPSN XIV - 14th International Conference, Edinburgh, UK, September 17-21, 2016, Proceedings. 2016 pp. 59-68. doi:10.1007/978-3-319-45823-6_6.
[63] Audet C, Dennis J, Digabel S. Parallel Space Decomposition of the Mesh Adaptive Direct Search Algorithm. SIAM Journal on Optimization, 2008. 19(3):1150-1170. doi:10.1137/070707518.
[64] Audet C, Dennis J. Mesh Adaptive Direct Search Algorithms for Constrained Optimization. SIAM Journal on Optimization, 2006. 17(1):188-217. doi:10.1137/040603371.
[65] Wessing S. BBComp. https://ls11-www.cs.tu-dortmund.de/staff/wessing/bbcomp. Accessed:2018-02-22.
[66] Ulinski M, Zychowski A, Okulewicz M, Zaborski M, Kordulewski H. Generalized Self-adapting Particle Swarm Optimization Algorithm. In: Parallel Problem Solving from Nature - PPSN. 2018 doi:10.1007/978-3-319-99253-2_3.
[67] Kennedy J, Eberhart R. Particle swarm optimization. In: Proceedings of ICNN’95 - International Conference on Neural Networks, volume 4. 1995 pp. 1942-1948. doi:10.1109/ICNN.1995.488968.
[68] Holland JH. Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, MI, USA, 1975.
[69] Goldberg DE. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1st edition, 1989. ISBN 0201157675.
[70] Goldberg DE. Real-coded Genetic Algorithms, Virtual Alphabets, and Blocking. Complex Systems, 1991. 5.
[71] Michalewicz Z. Genetic Algorithms + Data Structures = Evolution Programs. Springer Verlag, Ann Arbor, MI, USA, 1992, 1994, 1996.
[72] De Jong KA. An Analysis of the Behavior of a Class of Genetic Adaptive Systems. Ph.D. thesis, Ann Arbor, MI, USA, 1975. AAI7609381.
[73] Goldberg DE, Deb K, Clark JH. Genetic Algorithms, Noise, and the Sizing of Populations. Complex Systems, 1992. 6.
[74] Arabas J, Michalewicz Z, Mulawka J. GAVaPS-a genetic algorithm with varying population size. In: Proceedings of the First IEEE Conference on Evolutionary Computation. IEEE World Congress on Computational Intelligence. 1994 pp. 73-78 vol.1. doi:10.1109/ICEC.1994.350039.
[75] Klockgether J, Schwefel HP. Two-phase nozzle and hollow core jet experiments. In: Proc. 11th Symp. Engineering Aspects of Magnetohydrodynamics. Pasadena, CA: California Institute of Technology, 1970 pp. 141-148.
[76] Schwefel HP. Numerical Optimization of Computer Models. John Wiley & Sons, Inc., New York, NY, USA, 1981. ISBN 0471099880.
[77] Schwefel HPP. Evolution and Optimum Seeking: The Sixth Generation. John Wiley & Sons, Inc., New York, NY, USA, 1993. ISBN 0471571482.
[78] Bäck T, Hoffmeister F, Schwefel H. A Survey of Evolution Strategies. In: Belew R, Booker L (eds.), Proceedings of the Fourth International Conference on Genetic Algorithms. Morgan Kaufmann, San Francisco, CA, USA, 1991 pp. 2-9.
[79] Beyer HG, Schwefel HP. Evolution strategies - A comprehensive introduction. Natural Computing, 2002. 1(1):3-52. doi:10.1023/A:1015059928466.
[80] Biedrzycki R, Arabas J, Jasik A, Szymański M, Wnuk P, Wasylczyk P, Wójcik-Jedlińska A. Application of Evolutionary Methods to Semiconductor Double-Chirped Mirrors Design. In: Bartz-Beielstein T, Branke J, Filipič B, Smith J (eds.), Parallel Problem Solving from Nature - PPSN XIII, volume 8672 of Lecture Notes in Computer Science. Springer. ISBN 978-3-319-10761-5, 2014 pp. 761-770. doi:10.1007/978-3-319-10762-2_75.
[81] Chen S, Montgomery J, Bolufé-Röhler A. Measuring the curse of dimensionality and its effects on particle swarm optimization and differential evolution. Applied Intelligence, 2015. 42. doi:10.1007/s10489-014-0613-2.
[82] Mallipeddi R, Suganthan PN. Empirical study on the effect of population size on Differential evolution Algorithm. In: 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence). 2008 pp. 3663-3670. doi:10.1109/CEC.2008.4631294.
[83] Qin AK, Huang VL, Suganthan PN. Differential Evolution Algorithm with Strategy Adaptation for Global Numerical Optimization. IEEE Trans. Evol. Comput., 2009. 13(2):398-417. doi:10.1109/TEVC.2008.927706.
[84] Zhang J, Sanderson AC. JADE: Adaptive Differential Evolution with Optional External Archive. IEEE Trans. Evol. Comput., 2009. 13(5):945-958. doi:10.1109/TEVC.2009.2014613.
[85] Das S, Mullick SS, Suganthan PN. Recent advances in differential evolution - An updated survey. Swarm and Evol. Comput., 2016. 27:1-30. doi:10.1016/j.swevo.2016.01.004.
[86] Neri F, Tirronen V. Recent advances in differential evolution: a survey and experimental analysis. Artificial Intelligence Review, 2010. 33(1):61-106. doi:10.1007/s10462-009-9137-2.
[87] Al-Dabbagh RD, Neri F, Idris N, Baba MS. Algorithmic design issues in adaptive differential evolution schemes: Review and taxonomy. Swarm and Evolutionary Computation, 2018. doi:10.1016/j.swevo.2018.03.008., in press, 2018.
[88] Cazzaniga P, Nobile MS, Besozzi D. The impact of particles initialization in PSO: Parameter estimation as a case in point. In: Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2015 IEEE Conference on. IEEE, 2015 pp. 1-8.
[89] Pedersen MEH, Chipperfield AJ. Simplifying particle swarm optimization. Applied Soft Computing, 2010. 10(2):618-628.
[90] Li X, Tang K, Omidvar MN, Yang Z, Qin K. Benchmark Functions for the CEC2013 Special Session and Competition on Large-Scale Global Optimization. Technical report, School of Computer Science and Information Technology, RMIT University, Melbourne, Australia and School of Computer Science and Technology, University of Science and Technology of China, Hefei, Anhui, China and National University of Defense Technology, Changsha 410073, China, 2013.
[91] Herrera F, Lozano M, Molina D, Lozano Are With M. Test suite for the special issue of soft computing on scalability of evolutionary algorithms and other metaheuristics for large scale continuous optimization problems. Technical report, University of Granada, Spain, 2010.
[92] LaTorre A, Muelas S, Sánchez JMP. Large scale global optimization: Experimental results with MOS-based hybrid algorithms. In: Proceedings of the IEEE Congress on Evolutionary Computation, CEC 2013, Cancun, Mexico, June 20-23, 2013. 2013 pp. 2742-2749. doi:10.1109/CEC.2013.6557901.
[93] LaTorre A, Muelas S, Peña JM. A MOS-based dynamic memetic differential evolution algorithm for continuous optimization: a scalability test. Soft Computing, 2011. 15(11):2187-2199. doi:10.1007/s00500-010-0646-3.
[94] Molina D, Lozano M, Herrera F. MA-SW-Chains: Memetic algorithm based on local search chains for large scale continuous global optimization. In: IEEE Congress on Evolutionary Computation. 2010 pp.1-8. doi:10.1109/CEC.2010.5586034.
[95] Brest J, Maučec MS. Self-adaptive differential evolution algorithm using population size reduction and three strategies. Soft Computing, 2011. 15(11):2157-2174. doi:10.1007/s00500-010-0644-5.
[96] Wang Y, Li B. Two-stage based ensemble optimization for large-scale global optimization. In: IEEE Congress on Evolutionary Computation. 2010 pp. 1-8. doi:10.1109/CEC.2010.5586466.
[97] Mühlenbein H, Paaß G. From recombination of genes to the estimation of distributions I. Binary parameters. In: Voigt HM, Ebeling W, Rechenberg I, Schwefel HP (eds.), Parallel Problem Solving from Nature-PPSN IV. Springer Berlin Heidelberg, Berlin, Heidelberg. ISBN 978-3-540-70668-7, 1996 pp.178-187.
[98] Yang Z, Tang K, Yao X. Scalability of generalized adaptive differential evolution for large-scale continuous optimization. Soft Computing, 2011. 15(11):2141-2155. doi:10.1007/s00500-010-0643-6.
[99] Molina D, LaTorre A, Herrera F. SHADE with Iterative Local Search for Large-Scale Global Optimization. In: 2018 IEEE Congress on Evolutionary Computation (CEC). 2018 pp. 1-8. doi:10.1109/CEC.2018.8477755.
[100] Loshchilov I. A Computationally Efficient Limited Memory CMA-ES for Large Scale Optimization. In: Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation, GECCO’14. ACM, New York, NY, USA. ISBN 978-1-4503-2662-9, 2014 pp.397-404. doi:10.1145/2576768.2598294.
[101] Varelas K, Auger A, Brockhoff D, Hansen N, ElHara OA, Semet Y, Kassab R, Barbaresco F. A Comparative Study of Large-Scale Variants of CMA-ES. In: Auger A, Fonseca CM, Lourenço N, Machado P, Paquete L, Whitley D (eds.), Parallel Problem Solving from Nature - PPSN XV. Springer International Publishing, Cham. ISBN 978-3-319-99253-2, 2018 pp. 3-15.
[102] LaTorre A, Muelas S, Pena JM. A comprehensive comparison of large scale global optimizers. Information Sciences, 2015. 316:517-549. doi:10.1016/j.ins.2014.09.031. Nature-Inspired Algorithms for Large Scale Global Optimization.
[103] Molina D, LaTorre A. Toolkit for the Automatic Comparison of Optimizers: Comparing Large-Scale Global Optimizers Made Easy. In: 2018 IEEE Congress on Evolutionary Computation (CEC). 2018 pp. 1-8. doi:10.1109/CEC.2018.8477924.
[104] Chen Q, Liu B, Zhang Q, Liang J, Suganthan P, Qu B. Problem Definition and Evaluation Criteria for CEC 2015 Special Session and Competition on Bound Constrained Single-Objective Computationally Expensive Numerical Optimization. Technical report, Computational Intelligence Laboratory, Zhengzhou University, Zhengzhou, China, Nanyang Technological University, Singapore, 2014.
[105] Rueda JL, Erlich I. MVMO for bound constrained single-objective computationally expensive numerical optimization. In: 2015 IEEE Congress on Evolutionary Computation (CEC). 2015 pp. 1011-1017. doi:10.1109/CEC.2015.7257000.
[106] Erlich I, Venayagamoorthy GK, Worawat N. A Mean-Variance Optimization algorithm. In: IEEE Congress on Evolutionary Computation. 2010 pp. 1-6. doi:10.1109/CEC.2010.5586027.
[107] Berthier V. Experiments on the CEC 2015 expensive optimization testbed. In: 2015 IEEE Congress on Evolutionary Computation (CEC). 2015 pp. 1059-1066. doi:10.1109/CEC.2015.7257007.
[108] Igel C, Suttorp T, Hansen N. A Computational Efficient Covariance Matrix Update and a (1+1)-CMA for Evolution Strategies. In: Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation, GECCO ’06. ACM, New York, USA. ISBN 1-59593-186-4, 2006 pp. 453-460. doi:10.1145/1143997.1144082.
[109] Wilke CO, Wang JL, Ofria C, Lenski RE, Adami C. Evolution of Digital Organisms at High Mutation Rates Leads to Survival of the Flattest. Nature, 2001. 412:331-333. doi:10.1038/35085569.
[110] Arabas J, Biedrzycki R. Quasi-Stability of Real Coded Finite Populations. In: Bartz-Beielstein T, Branke J, Filipič B, Smith J (eds.), Parallel Problem Solving from Nature - PPSN XIII, volume 8672 of Lecture Notes in Computer Science. Springer. ISBN 978-3-319-10761-5, 2014 pp. 872-881. doi:10.1007/978-3-319-10762-2_86.
[111] Schonfeld J, Ashlock DA. A comparison of the robustness of evolutionary computation and random walks. In: Proceedings of the 2004 Congress on Evolutionary Computation (IEEE Cat. No.04TH8753), volume 1. 2004 pp. 250-257 Vol.1. doi:10.1109/CEC.2004.1330864.
[112] Mendes R, Cortez P, Rocha M, Neves J. Particle swarms for feedforward neural network training. In: Proceedings of the 2002 International Joint Conference on Neural Networks. IJCNN’02 (Cat. No.02CH37290), volume 2. 2002 pp. 1895-1899 vol.2. doi:10.1109/IJCNN.2002.1007808.
[113] Gudise VG, Venayagamoorthy GK. Comparison of particle swarm optimization and backpropagation as training algorithms for neural networks. In: Proceedings of the 2003 IEEE Swarm Intelligence Symposium. SIS’03 (Cat. No. 03EX706). IEEE, 2003 pp. 110-117.
[114] M Ibrahim A, El-Amary N. Particle Swarm Optimization Trained Recurrent Neural Network for Voltage Instability Prediction. Journal of Electrical Systems and Information Technology, 2017. 5. doi:10.1016/j.jesit.2017.05.001.
[115] Zhang JR, Zhang J, Lok TM, Lyu MR. A hybrid particle swarm optimization-back-propagation algorithm for feedforward neural network training. Applied mathematics and computation, 2007. 185(2):1026-1037.
[116] Wierstra D, Schaul T, Peters J, Schmidhuber J. Natural evolution strategies. In: Evolutionary Computation, 2008. CEC 2008. (IEEE World Congress on Computational Intelligence). IEEE Congress on. IEEE, 2008 pp. 3381-3387.
[117] Wierstra D, Schaul T, Glasmachers T, Sun Y, Peters J, Schmidhuber J. Natural Evolution Strategies. J. Mach. Learn. Res., 2014. 15(1):949-980.
[118] Salimans T, Ho J, Chen X, Sidor S, Sutskever I. Evolution strategies as a scalable alternative to reinforcement learning. arXiv preprint:1703.03864, 2017.
[119] Ilyas A, Engstrom L, Athalye A, Lin J, Athalye A, Engstrom L, Ilyas A, Kwok K. Black-box Adversarial Attacks with Limited Queries and Information. In: Proceedings of the 35th International Conference on Machine Learning, ICML. 2018.
[120] Su J, Vargas DV, Sakurai K. One pixel attack for fooling deep neural networks. IEEE Transactions on Evolutionary Computation, 2019.
[121] Choromanska A, Henaff M, Mathieu M, Arous GB, LeCun Y. The loss surfaces of multilayer networks. In: Artificial Intelligence and Statistics. 2015 pp. 192-204.
[122] Safran I, Shamir O. Spurious Local Minima are Common in Two-Layer ReLU Neural Networks. arXiv preprint:1712.08968, 2017.
[123] LIANG S, Sun R, Li Y, Srikant R. Understanding the Loss Surface of Neural Networks for Binary Classification. 2018. 80:2835-2843.
[124] Du S, Lee J, Tian Y, Singh A, Poczos B. Gradient Descent Learns One-hidden-layer CNN: Dont be Afraid of Spurious Local Minima. 2018. 80:1339-1348.
[125] Laurent T, Brecht J. Deep linear networks with arbitrary loss: All local minima are global. In: International Conference on Machine Learning. 2018 pp. 2908-2913.
[126] Du SS, Lee JD, Li H, Wang L, Zhai X. Gradient Descent Finds Global Minima of Deep Neural Networks. CoRR:1811.03804v2, 2018.
[127] Rozsa A, Gunther M, Boult TE. Towards Robust Deep Neural Networks with BANG. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2018 pp. 803-811.

Uwagi

Opracowanie rekordu w ramach umowy 509/P-DUN/2018 ze środków MNiSW przeznaczonych na działalność upowszechniającą naukę (2019).

Typ dokumentu

Bibliografia

Identyfikator YADDA

bwmeta1.element.baztech-9cc94438-91a6-43b9-ace6-531c44bbc53c