Control Systems and Computers, N1, 2023, Article 2
https://doi.org/10.15407/csc.2023.01.018
Control Systems and Computers, 2023, Issue 1 (301), pp. 18-32
UDC 004.8 + 004.032.26
O.O. HOLTSEV, PhD Student, International Research and Training Centre for Information Technologies and Systems of the NAS and MES of Ukraine, Acad. Glushkov ave., 40, Kiev, 03187, Ukraine, ORCID: https://orcid.org/0000-0002-1846-6648, rcwolf@adg.kiev.ua
V.I. GRITSENKO, Corresponding Member of the Ukrainian Academy of Sciences, Director, International Research and Training Centre for Information Technologies and Systems of the NAS and MES of Ukraine, Scopus ID: 7101892671, Acad. Glushkov ave., 40, Kiev, 03187, Ukraine, ORCID: https://orcid.org/0000-0002-6250-3987, vig@irtc.org.ua
A SHORT OVERVIEW OF THE MAIN CONCEPTS
OF ARTIFICIAL NEURAL NETWORKS
A significant increase in computer performance, the accumulation of the large amounts of data needed to train deep neural networks, and the development of training methods that make it possible to train networks of a hundred or more layers quickly and efficiently have led to significant progress in deep neural networks and allowed them to take a leading position among machine learning methods. This work considers neural network paradigms, together with their methods of training and operation, such as the Rosenblatt perceptron, multilayer perceptrons, radial basis function networks, the Kohonen network, the Hopfield network, the Boltzmann machine, and deep neural networks. A comparative consideration of these paradigms leads to the conclusion that, although all of them successfully solve the tasks set before them, deep neural networks are currently the most effective mechanism for solving practical intellectual tasks.
Keywords: artificial intelligence, artificial neural networks, machine learning methods, deep neural networks.
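As an informal illustration of the earliest of the paradigms listed above, the minimal Python sketch below (not taken from the article; the data, names, and parameters are illustrative assumptions) shows the classic Rosenblatt-style perceptron learning rule applied to a toy, linearly separable problem.

    import numpy as np

    def train_perceptron(X, y, epochs=20, lr=1.0):
        """Classic Rosenblatt-style perceptron rule: whenever a sample is
        misclassified, shift the weights and bias toward correcting it."""
        w = np.zeros(X.shape[1])   # weight vector
        b = 0.0                    # bias term
        for _ in range(epochs):
            for xi, target in zip(X, y):        # targets are +1 / -1
                if target * (xi @ w + b) <= 0:  # misclassified (or on the boundary)
                    w += lr * target * xi
                    b += lr * target
        return w, b

    # Toy usage: learn the linearly separable AND function (labels +1 / -1),
    # purely to show the update rule in action.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([-1, -1, -1, 1])
    w, b = train_perceptron(X, y)
    print(np.sign(X @ w + b))  # expected output: [-1. -1. -1.  1.]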
Received 21.10.2022