RFR: Continual Few-shot Reinforcement Learning

Proposed by Cerenaut and the Whole Brain Architecture Initiative
(What is a Request for Research/RFR?)

This project takes a different approach to RL, inspired by evidence that the hippocampus replays memories directly to the frontal cortex [1, 2, 3]. This replay is likely used for model building, in contrast to the mainstream view in cognitive science and ML, where 'experience replay' ultimately serves to improve a policy. The predicted benefits are sample efficiency, better generalization to new tasks, and the ability to learn new tasks without forgetting old ones. The project objective is to improve biological models and to advance the state of the art in continual reinforcement learning.
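The contrast above can be illustrated with a minimal sketch: replayed transitions are used to fit a *model* of the environment's dynamics rather than to improve a policy. Everything below is a toy assumption for illustration only, with a linear environment s' = As + Ba and least-squares fitting standing in for the world model's learning rule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ground-truth linear dynamics: s' = A s + B a.
A_true = np.array([[0.9, 0.1], [0.0, 0.95]])
B_true = np.array([[0.1], [0.05]])

# "Episodic memory": stored (s, a, s') transitions, playing the role
# of hippocampal traces gathered during behaviour.
states = rng.normal(size=(200, 2))
actions = rng.normal(size=(200, 1))
next_states = states @ A_true.T + actions @ B_true.T

# "Replay" the stored transitions to train a world model: here, a
# least-squares fit of s' from (s, a) instead of a policy update.
X = np.hstack([states, actions])
W, *_ = np.linalg.lstsq(X, next_states, rcond=None)
A_hat, B_hat = W[:2].T, W[2:].T

# With noiseless data the replayed transitions recover the dynamics.
print(np.allclose(A_hat, A_true, atol=1e-6))
print(np.allclose(B_hat, B_true, atol=1e-6))
```

The same replay buffer that mainstream deep RL would feed to a policy-gradient or Q-learning update is here consumed by a dynamics model, which is the core architectural difference the project proposes to study.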

Required Knowledge:
Machine Learning and Deep Learning, or some background plus a willingness to learn. Python is required, along with some experience with PyTorch or TensorFlow.

[1] D. Kumaran, D. Hassabis, and J. L. McClelland, “What Learning Systems do Intelligent Agents Need? Complementary Learning Systems Theory Updated,” Trends in Cognitive Sciences, vol. 20, no. 7, 2016, doi: 10.1016/j.tics.2016.05.004.
[2] J. L. McClelland, B. L. McNaughton, and R. C. O’Reilly, “Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory,” Psychological Review, vol. 102, no. 3, pp. 419–457, 1995, doi: 10.1037/0033-295X.102.3.419.
[3] R. C. O’Reilly, R. Bhattacharyya, M. D. Howard, and N. Ketz, “Complementary learning systems,” Cognitive Science, vol. 38, no. 6, pp. 1229–1248, 2014, doi: 10.1111/j.1551-6709.2011.01214.x.

Status: Open

Monash University master's student Luke Yang completed a project at the end of 2023. We found that hippocampus-inspired replay to a world model can be an effective, sample-efficient approach to continual RL. It forms the basis for future projects. A preprint is available here; the implementation is available here.

Contact: rfr [at] wba-initiative.org