Identifiers
Title variants
Publication languages
Abstracts
In this paper, we introduce several system theoretic problems brought forward by recent studies on neural models of motor control. We focus our attention on three topics: (i) the cerebellum and adaptive control, (ii) reinforcement learning and the basal ganglia, and (iii) modular control with multiple models. We discuss these subjects from both neuroscience and systems theory viewpoints with the aim of promoting interplay between the two research communities.
Year
Volume
Pages
77-104
Physical description
Bibliography: 39 items, figures, charts.
Authors
author
- Information Sciences Division, ATR International; CREST, Japan Science and Technology Corporation, 2-2-2 Hikaridai, Seika, Soraku, Kyoto 619-0288, Japan
author
- Graduate School of Frontier Science, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
author
- Graduate School of Frontier Science, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
Bibliography
- [1] Albus J.S. (1971): A theory of cerebellar function. - Math. Biosci., Vol. 10, pp. 25-61.
- [2] Åström K.J. and Wittenmark B. (1989): Adaptive Control. - Massachusetts: Addison-Wesley.
- [3] Barto A.G. (1995): Adaptive critics and the basal ganglia, In: Models of Information Processing in the Basal Ganglia (Houk J.C., Davis J.L. and Beiser D.G., Eds.). - Cambridge, MA: MIT Press, pp. 215-232.
- [4] Barto A.G., Sutton R.S. and Anderson C.W. (1983): Neuronlike adaptive elements that can solve difficult learning control problems. - IEEE Trans. Syst. Man Cybern., Vol. 13, pp. 834-846.
- [5] Bertsekas D.P. and Tsitsiklis J.N. (1996): Neuro-Dynamic Programming. - Belmont, MA: Athena Scientific.
- [6] Dayan P. (1992): The convergence of TD(λ) for general λ. - Machine Learn., Vol. 8, pp. 341-362.
- [7] Doya K. (1999): What are the computations of the cerebellum, the basal ganglia, and the cerebral cortex? - Neural Netw., Vol. 12, No. 7-8, pp. 961-974.
- [8] Doya K. (2000): Reinforcement learning in continuous time and space. - Neural Comput., Vol. 12, No. 1, pp. 243-269.
- [9] Doya K., Katagiri K., Wolpert D.M. and Kawato M. (2000a): Recognition and imitation of movement patterns by a multiple predictor-controller architecture. - Tech. Rep. Institute of Electronic, Information, and Communication Engineers, TL2000-11, pp. 33-40.
- [10] Doya K., Samejima K., Katagiri K. and Kawato M. (2000b): Multiple model-based reinforcement learning. - Tech. Rep. KDB-08, Kawato Dynamic Brain Project, ERATO, Japan Science and Technology Corporation.
- [11] Ghahramani Z. and Wolpert D.M. (1997): Modular decomposition in visuomotor learning. - Nature, Vol. 386, pp. 392-395.
- [12] Ghez C. and Thach W.T. (2000): The cerebellum, In: Principles of Neural Science, 4th Ed. (Kandel E.R., Schwartz J.H. and Jessell T.M., Eds.). - New York: McGraw-Hill, pp. 832-852.
- [13] Haruno M., Wolpert D.M. and Kawato M. (1999): Multiple paired forward-inverse models for human motor learning and control, In: Advances in Neural Information Processing Systems, No. 11 (Kearns M.S., Solla S.A. and Cohn D.A., Eds.). - Cambridge, MA: MIT Press, pp. 31-37.
- [14] Hikosaka O., Nakahara H., Rand M.K., Sakai K., Lu X., Nakamura K., Miyachi S. and Doya K. (1999): Parallel neural networks for learning sequential procedures. - Trends in Neurosci., Vol. 22, No. 10, pp. 464-471.
- [15] Houk J.C., Adams J.L. and Barto A.G. (1995): A model of how the basal ganglia generate and use neural signals that predict reinforcement, In: Models of Information Processing in the Basal Ganglia (Houk J.C., Davis J.L. and Beiser D.G., Eds.). - Cambridge, MA: MIT Press, pp. 249-270.
- [16] Imamizu H., Miyauchi S., Sasaki Y., Takino R., Pütz B. and Kawato M. (1997): Separated modules for visuomotor control and learning in the cerebellum: A functional MRI study, In: NeuroImage: Third International Conference on Functional Mapping of the Human Brain (Toga A.W., Frackowiak R.S.J. and Mazziotta J.C., Eds.). - Copenhagen, Denmark: Academic Press, Vol. 5, p. S598.
- [17] Imamizu H., Miyauchi S., Tamada T., Sasaki Y., Takino R., Pütz B., Yoshioka T. and Kawato M. (2000): Human cerebellar activity reflecting an acquired internal model of a new tool. - Nature, Vol. 403, pp. 192-195.
- [18] Ito M. (1993): Movement and thought: identical control mechanisms by the cerebellum. - Trends in Neurosci., Vol. 16, No. 11, pp. 448-450.
- [19] Ito M., Sakurai M. and Tongroach P. (1982): Climbing fibre induced depression of both mossy fibre responsiveness and glutamate sensitivity of cerebellar Purkinje cells. - J. Physiol., Vol. 324, pp. 113-134.
- [20] Kawagoe R., Takikawa Y. and Hikosaka O. (1998): Expectation of reward modulates cognitive signals in the basal ganglia. - Nature Neurosci., Vol. 1, No. 5, pp. 411-416.
- [21] Kawato M., Furukawa K. and Suzuki R. (1987): A hierarchical neural network model for control and learning of voluntary movement. - Biol. Cybern., Vol. 57, pp. 169-185.
- [22] Kawato M. and Gomi H. (1992): A computational model of four regions of the cerebellum based on feedback-error learning. - Biol. Cybern., Vol. 68, pp. 95-103.
- [23] Marr D. (1969): A theory of cerebellar cortex. - J. Physiol., Vol. 202, pp. 437-470.
- [24] Miyamura A. and Kimura H. (2000): Mathematical foundations of feedback error learning method. - (submitted).
- [25] Montague P.R., Dayan P. and Sejnowski T.J. (1996): A framework for mesencephalic dopamine systems based on predictive Hebbian learning. - J. Neurosci., Vol. 16, pp. 1936-1947.
- [26] Morimoto J. and Doya K. (2000): Acquisition of stand-up behavior by a real robot using hierarchical reinforcement learning. - Proc. 17th Int. Conf. Machine Learning, Vancouver, Vol. 1, pp. 623-630.
- [27] Morse A.S. (1996): Supervisory control of families of linear set-point controllers-part 1: Exact matching. - IEEE Trans. Automat. Contr., Vol. 41, No. 10, pp. 1413-1431.
- [28] Nakahara H., Doya K. and Hikosaka O. (1998): Benefit of multiple representations for motor sequence control in the basal ganglia loops. - BSIS Tech. Rep. 98-05, RIKEN Brain Science Institute.
- [29] Pawelzik K., Kohlmorgen J. and Müller K.R. (1996): Annealed competition of experts for a segmentation and classification of switching dynamics. - Neural Comput., Vol. 8, pp. 340-356.
- [30] Schultz W., Dayan P. and Montague P.R. (1997): A neural substrate of prediction and reward. - Science, Vol. 275, pp. 1593-1599.
- [31] Shima K., Mushiake H., Saito N. and Tanji J. (1996): Role for cells in the presupplementary motor area in updating motor plans. - Proc. Nat. Acad. Sci., Vol. 93, pp. 8694-8698.
- [32] Suri R.E. and Schultz W. (1998): Learning of sequential movements by neural network model with dopamine-like reinforcement signal. - Experim. Brain Res., Vol. 121, pp. 350-354.
- [33] Sutton R.S. (1988): Learning to predict by the methods of temporal differences. - Machine Learn., Vol. 3, pp. 9-44.
- [34] Sutton R.S. and Barto A.G. (1998): Reinforcement Learning. - Cambridge, MA: MIT Press.
- [35] Tesauro G. (1994): TD-Gammon, a self-teaching backgammon program, achieves master-level play. - Neural Comput., Vol. 6, pp. 215-219.
- [36] Watkins C.J.C.H. (1989): Learning from delayed rewards. - Ph.D. Thesis, Cambridge University.
- [37] Wilson C.J. (1998): Basal ganglia, In: The Synaptic Organization of the Brain, 3rd Ed. (Shepherd G.M., Ed.). - New York: Oxford University Press, pp. 329-375.
- [38] Wolpert D.M. and Kawato M. (1998): Multiple paired forward and inverse models for motor control. - Neural Netw., Vol. 11, pp. 1317-1329.
- [39] Zames G. (1981): Feedback and optimal sensitivity: Model reference transformations, multiplicative seminorms, and approximate inverses. - IEEE Trans. Automat. Contr., Vol. AC-26, No. 2, pp. 301-320.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BPZ1-0012-0004