ccnbook-motor_summary

Differences

ccnbook-motor_summary [2018/06/02 21:52]
n.arakawa created
ccnbook-motor_summary [2018/06/02 22:14]
n.arakawa
Line 1: Line 1:
 The aim of this article is to present an actor-critic model based on [[https://grey.colorado.edu/CompCogNeuro/index.php/CCNBook/Main|the CCNBook]].  As the description of reinforcement learning in [[https://grey.colorado.edu/CompCogNeuro/index.php/CCNBook/Motor|the Motor chapter]] seems a bit ‘roundabout’, this memo tries to simplify it.
 ----
-The chapter is based on the hypothesis that **the basal ganglia (BG) uses the actor-critic type** of reinforcement learning, which in turn is based on the finding that **the dopamine output from SNc encodes TD (temporal difference) δ** used in AC learning.
+The chapter is based on the hypothesis that **the basal ganglia (BG) uses the actor-critic type** of reinforcement learning, which in turn is based on the finding that **the dopamine output from [[https://en.wikipedia.org/wiki/Pars_compacta|SNc]] encodes TD (temporal difference) δ** used in AC learning.
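
For reference, the TD error δ mentioned above is commonly written in the one-step form below. This is the standard textbook TD(0) expression, not notation taken from the CCNBook; the symbols γ (discount factor), V (the Critic's value estimate), s_t (state at time t) and r_{t+1} (the reward that follows) are assumptions introduced here.

```latex
\delta_t = r_{t+1} + \gamma \, V(s_{t+1}) - V(s_t)
```

A positive δ means the outcome was better than the Critic predicted, and a negative δ means it was worse.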
  
-{{https://grey.colorado.edu/mediawiki/sites/CompCogNeuro/images/8/85/fig_actor_critic_basic.png}}
-
-**Figure 7.6**: Basic structure of the actor critic architecture (from [[https://grey.colorado.edu/CompCogNeuro/index.php/File:fig_actor_critic_basic.png|CCNBook]])
+|  {{https://grey.colorado.edu/mediawiki/sites/CompCogNeuro/images/8/85/fig_actor_critic_basic.png}}  |
+|  **Figure 7.6**: Basic structure of the actor critic architecture (from [[https://grey.colorado.edu/CompCogNeuro/index.php/File:fig_actor_critic_basic.png|CCNBook]])  |
  
  
 Since the SNc provides δ, it is presumably part of the Critic. Going by Figures 7.2 & 7.4 of the chapter, the Actor comprises the loop of Frontal Cortex, Striatum, and Thalamus, with the dopamine (δ) input going to the Striatum (Figure 1).
  
-Figure 1
+|  {{ccnmotornutshell1.png}}  |
+|  **Figure 1**  |
  
 Here, the Actor determines its action based on the state representation (of the environment) in the Frontal Cortex (FC), which is in turn formed from its inputs (not shown) from other cortical areas, the amygdala, the hippocampus, and other subcortical nuclei (the input sources vary with the FC area).  While the Frontal Cortex provides the output options, the Striatum selects which option is output.
 The reward r to the Critic is explained in the PVLV (Primary Value, Learned Value) model section of the chapter (Figure 7.8).
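
The division of labour described above can be sketched in a few lines of Python. This is a minimal tabular actor-critic under stated assumptions, not the CCNBook's model: the two-state toy task, the names `step`, `train`, `td_error`, the learning rates, and the discount factor are all hypothetical. The Critic holds value estimates `V` and emits δ; the Actor holds action preferences and selects among them; the same δ updates both.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into selection probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def td_error(r, v_s, v_next, gamma):
    """One-step TD error: reward plus discounted next value minus current value."""
    return r + gamma * v_next - v_s


def step(s, a):
    """Hypothetical toy task: returns (reward, next_state); None ends the episode.

    State 0: action 1 moves on to state 1, action 0 ends the episode.
    State 1: action 1 yields reward 1, action 0 yields nothing; both end it.
    """
    if s == 0:
        return (0.0, 1) if a == 1 else (0.0, None)
    return (1.0, None) if a == 1 else (0.0, None)


def train(episodes=2000, alpha_critic=0.1, alpha_actor=0.1, gamma=0.9, seed=0):
    rng = random.Random(seed)
    n_states, n_actions = 2, 2
    V = [0.0] * n_states                                  # Critic: state values
    prefs = [[0.0] * n_actions for _ in range(n_states)]  # Actor: preferences
    for _ in range(episodes):
        s = 0
        while s is not None:
            # Actor: select one of the available options.
            probs = softmax(prefs[s])
            a = rng.choices(range(n_actions), weights=probs)[0]
            r, s_next = step(s, a)
            # Critic: compute delta (the role the text assigns to dopamine).
            v_next = 0.0 if s_next is None else V[s_next]
            delta = td_error(r, V[s], v_next, gamma)
            # The same delta trains both the Critic and the Actor.
            V[s] += alpha_critic * delta
            prefs[s][a] += alpha_actor * delta
            s = s_next
    return V, prefs


if __name__ == "__main__":
    V, prefs = train()
    print("values:", V)
    print("policy in state 1:", softmax(prefs[1]))
```

The Actor update here simply adds δ to the preference of the action just taken, which is one of the simplest actor-critic variants; a policy-gradient form would also scale the update by the gradient of the log-probability.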
  
-Figure 7.8: Biological mapping of the PVLV algorithm
-VS: Ventral Striatum
-VTA: Ventral Tegmental Area
-PPT: Pedunculopontine Tegmental Nucleus
-LHA: Lateral Hypothalamic Nucleus
-CNA: Central Nucleus of the Amygdala
-CS: Conditioned Stimuli
-US: Unconditioned Stimuli (〜Reward)
-An apparent problem of Figure 7.8 is that SNc does not receive a reward signal (US).  The problem is solved in Figure PV.1 in the PVLV page, where SNc is substituted with VTA.
+|  {{https://grey.colorado.edu/mediawiki/sites/CompCogNeuro/images/thumb/e/e9/fig_pvlv_bio_no_cereb.png/400px-fig_pvlv_bio_no_cereb.png}}  |
+|  **Figure 7.8**: Biological mapping of the PVLV algorithm  (from [[https://grey.colorado.edu/CompCogNeuro/index.php/File:fig_pvlv_bio_no_cereb.png|CCNBook]])  |
+|  VS: Ventral Striatum, VTA: [[https://en.wikipedia.org/wiki/Ventral_tegmental_area|Ventral Tegmental Area]], PPT: [[https://en.wikipedia.org/wiki/Pedunculopontine_nucleus|Pedunculopontine Tegmental Nucleus]], LHA: Lateral Hypothalamic Nucleus, CNA: Central Nucleus of the Amygdala, CS: Conditioned Stimuli, US: Unconditioned Stimuli (〜Reward)  |
+
+An apparent problem of Figure 7.8 is that SNc does not receive a reward signal (US).  The problem is solved in Figure PV.1 in [[https://grey.colorado.edu/CompCogNeuro/index.php/CCNBook/Sims/Motor/PVLV|the PVLV page]], where SNc is substituted with VTA.
    
 Figure PV.1
  • ccnbook-motor_summary.txt
  • Last modified: 2018/06/02 22:32
  • by n.arakawa