ccnbook-motor_summary [2018/06/02 22:21] n.arakawa |
ccnbook-motor_summary [2018/06/02 22:32] n.arakawa |
==== CCNBook Motor in a Nutshell ====
The aim of this article is to present an actor-critic model based on [[https://grey.colorado.edu/CompCogNeuro/index.php/CCNBook/Main|the CCNBook]]. Because the description of reinforcement learning in [[https://grey.colorado.edu/CompCogNeuro/index.php/CCNBook/Motor|the Motor chapter]] seems somewhat roundabout, this memo tries to simplify it.
----

| {{https://grey.colorado.edu/mediawiki/sites/CompCogNeuro/images/thumb/e/e9/fig_pvlv_bio_no_cereb.png/400px-fig_pvlv_bio_no_cereb.png}} |
| **Figure 7.8**: Biological mapping of the PVLV algorithm (from [[https://grey.colorado.edu/CompCogNeuro/index.php/File:fig_pvlv_bio_no_cereb.png|CCNBook]]) |
| VS: Ventral Striatum, VTA: [[https://en.wikipedia.org/wiki/Ventral_tegmental_area|Ventral Tegmental Area]], PPT: [[https://en.wikipedia.org/wiki/Pedunculopontine_nucleus|Pedunculopontine Tegmental Nucleus]], LHA: Lateral Hypothalamic Area, CNA: Central Nucleus of the Amygdala, CS: Conditioned Stimuli, US: Unconditioned Stimuli (≈ reward) |

| {{https://grey.colorado.edu/mediawiki/sites/CompCogNeuro/images/thumb/0/02/fig_bvpvlv_pv_lv_only.png/800px-fig_bvpvlv_pv_lv_only.png?500}} |
| **Figure PV.1** (from [[https://grey.colorado.edu/CompCogNeuro/index.php/File:fig_bvpvlv_pv_lv_only.png|CCNBook]]) |
| LHB: [[https://en.wikipedia.org/wiki/Habenula#Lateral_habenula|Lateral Habenula]], RMTg: Rostromedial Tegmental Nucleus |

If you want to distinguish all the parts shown in Figure PV.1, you should keep them in your model. However, if the function of the circuit is that of the Critic in actor-critic (AC) learning, this complexity is not necessary in engineering terms. Figure 2 shows a model in which the complexity is encapsulated (parts such as the amygdala, VTA/SNc, and part of the striatum are hidden). Note that the TD error δ encodes r<sub>t+1</sub> + γV(s<sub>t+1</sub>) − V(s<sub>t</sub>), where r stands for the reward, V(s) for the evaluation of the state s, and γ for the discount factor.
| {{ccnmotornutshell2.png}} |
| **Figure 2** |
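
The TD error δ = r<sub>t+1</sub> + γV(s<sub>t+1</sub>) − V(s<sub>t</sub>) can be illustrated with a minimal Python sketch. The function names, the state labels ''cs'' and ''us'', the learning rate ''alpha'', and the numeric values are assumptions made for illustration; they are not taken from the CCNBook or from the PVLV model.

```python
def td_error(reward, value_next, value_current, gamma=0.9):
    """Return delta = r_{t+1} + gamma * V(s_{t+1}) - V(s_t)."""
    return reward + gamma * value_next - value_current


def critic_update(values, state, next_state, reward, gamma=0.9, alpha=0.1):
    """Move V(state) a small step in the direction of the TD error.

    `values` is a dict mapping state labels to V(s); `alpha` is a
    hypothetical learning rate chosen only for this sketch.
    """
    delta = td_error(reward, values[next_state], values[state], gamma)
    values[state] += alpha * delta
    return delta


# Toy example (assumed): a cue state "cs" followed by a rewarded state "us".
values = {"cs": 0.0, "us": 0.0}
delta = critic_update(values, "cs", "us", reward=1.0)
```

With all values initialised to zero, the first rewarded transition gives a positive δ, so V of the cue state increases slightly. Repeating the update would shrink δ as the prediction improves.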
You might want to distinguish the reward system from the punishment system, but the physiology behind such a distinction may not yet be well understood.
A simple overall actor-critic (AC) scheme can be modeled as in Figure 3. Note that the Frontal Cortex is also included in the State box.
| {{ccnmotornutshell3.png}} |
| **Figure 3** |
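
As a rough illustration of how a State, a Critic, and an Actor can interact in an AC scheme, the following Python sketch runs a tabular actor-critic on a toy task. Everything specific here is an assumption: the two-state task, the softmax action selection, the preference update rule (a textbook-style rule in which the TD error adjusts the preference of the chosen action), and all parameter names and values. It is not claimed to be the model drawn in Figure 3 or the CCNBook implementation.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into selection probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run_actor_critic(episodes=500, gamma=0.9, alpha_critic=0.1,
                     alpha_actor=0.1, seed=0):
    rng = random.Random(seed)
    n_states, n_actions = 2, 2                 # hypothetical toy task
    values = [0.0] * n_states                  # Critic: V(s)
    prefs = [[0.0] * n_actions for _ in range(n_states)]  # Actor

    for _ in range(episodes):
        state = 0
        while state is not None:
            probs = softmax(prefs[state])
            action = rng.choices(range(n_actions), weights=probs)[0]
            # Assumed dynamics: action 1 in state 0 leads to state 1;
            # action 1 in state 1 ends the episode with reward 1;
            # action 0 ends the episode with no reward.
            if action == 1 and state == 0:
                next_state, reward = 1, 0.0
            elif action == 1 and state == 1:
                next_state, reward = None, 1.0
            else:
                next_state, reward = None, 0.0
            v_next = 0.0 if next_state is None else values[next_state]
            # The TD error from the Critic drives both updates.
            delta = reward + gamma * v_next - values[state]
            values[state] += alpha_critic * delta
            prefs[state][action] += alpha_actor * delta
            state = next_state
    return values, prefs


values, prefs = run_actor_critic()
```

In this sketch a single scalar δ trains both the Critic (state values) and the Actor (action preferences), which is the engineering simplification the AC scheme relies on. After training, the Actor prefers action 1 in both states of the toy task.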
=== Reference ===
Daphna Joel, Yael Niv, Eytan Ruppin: [[https://www.princeton.edu/~yael/Publications/NN2002.pdf|Actor–critic models of the basal ganglia]], Neural Networks 15 (2002).