Blog

Gato demonstrated that broad common sense is learnable from multimodal data

Hiroshi Yamakawa, Yutaka Matsuo, Koichi Takahashi,
Shin’ichi Asakawa, Ryutaro Ichise, Takashi Omori, Satoshi Kurihara,
Naoyuki Sato, 
Yoshimasa Tawatsuji, Ayako Fukawa

The Whole Brain Architecture Initiative

Overview

This article presents our view that DeepMind’s recent announcement of Gato [DeepMind, 2022] has contributed to the advancement of artificial general intelligence (AGI) in that it experimentally demonstrated that a single AI system can acquire broad common-sense knowledge from labeled, multi-domain data.  We also believe that it has stimulated discussion of the completion criteria for AGI.  Finally, we discuss how it could be beneficial for WBA, or brain-inspired, AGI development.

1. Introduction

On May 12, 2022, DeepMind, a Google company, announced Gato, a single AI system capable of performing 604 tasks, trained mainly by imitation learning on expert supervised data.  The announcement bore the sensational title “A Generalist Agent,” and it seems to have sparked a renewed debate on AGI (see the press coverage at the end of this article).  This time, most discussions seemed to assume that AGI would eventually be realized; there were few arguments that AGI is not feasible.

We, as a non-profit organization, promote the development of brain-inspired AGI in a democratic manner, based on the Whole Brain Architecture (WBA) approach: “to create a human-like artificial general intelligence (AGI) by learning from the architecture of the entire brain.”  We believe that learning from the brain at a granularity related to human cognitive behavior helps the design of AGI.  We also believe that the WBA thus constructed will be an AGI useful for human-AI interaction and for understanding humans, and that it could serve as a vessel for uploading the human brain, since it will solve tasks with a mechanism similar to that of the human brain.  WBAI has therefore been working since 2015 to build AGI, though with technical and organizational differences from DeepMind.

In the following, we position Gato within the discussions held at WBAI and consider its effects on WBA research.

2. Two Capabilities That Support General Intelligence

Since Gato aims for AGI, it would be inappropriate to focus on its problem-solving performance on individual tasks; even a sage who lived 100 years ago and had poor programming skills would still be a sage.

Here, we would like to start discussion from our definition of AGI: “AGI is an artificial intelligence that automatically acquires versatile problem-solving capabilities in various domains and solves problems unexpected at the time of its design.”  To add to this definition, the AGI we aim for should have both extensive knowledge acquisition and flexible knowledge utilization capabilities at the same level as an individual human brain:

  • Extensive knowledge acquisition refers to the ability to solve problems in a variety of problem domains and to acquire the necessary knowledge from experience.  It is ‘general’ with respect to the breadth of task coverage (*1).  At the individual level, this problem-solving capability is acquired through experience and imitation in each problem domain.  In machine learning as well, problem-solving capability is acquired from a large amount of data in each specific domain.  It is thought to play a central role in the acquisition of common sense [Yamakawa, 2020, in Japanese].
  • Flexible knowledge utilization is the ability to combine and transfer the knowledge one possesses in order to solve problems beyond what was envisioned at the time of design (*2).  This capability shows its value when a system performs well even with little data available for learning.  It can be seen as a demonstration of creativity [Yamakawa, 2020, in Japanese].  Examples range from applying stored mathematical formulas to realistic problems, up to the great discoveries of universal laws in the natural world.  (A rough sketch contrasting the two capabilities follows this list.)
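
As a rough illustration of how these two capabilities call for different measurements, the following Python sketch (our own hypothetical example; the task names, scores, and threshold are made up and are not taken from Gato or any WBAI tool) counts breadth as the number of tasks a single system handles well after full in-domain training, and flexibility as the average performance on held-out tasks given only a little adaptation data.

    # A minimal, hypothetical sketch contrasting the two capabilities.
    # "scores" maps task name -> success rate after full in-domain training;
    # "few_shot_scores" maps held-out task name -> success rate after seeing
    # only a handful of demonstrations.  All names and numbers are made up.

    def breadth_of_acquisition(scores: dict, threshold: float = 0.5) -> int:
        """Extensive knowledge acquisition: count the tasks a single system
        handles at or above an expert-relative threshold."""
        return sum(1 for s in scores.values() if s >= threshold)

    def flexibility_of_utilization(few_shot_scores: dict) -> float:
        """Flexible knowledge utilization: average performance on tasks that
        were not in the training distribution, given little adaptation data."""
        if not few_shot_scores:
            return 0.0
        return sum(few_shot_scores.values()) / len(few_shot_scores)

    # Example: broad but not flexible (our reading of a Gato-like profile).
    trained = {"atari_pong": 0.8, "robot_stacking": 0.6, "image_captioning": 0.7}
    held_out = {"novel_puzzle": 0.1, "unseen_robot_arm": 0.2}
    print(breadth_of_acquisition(trained))       # -> 3
    print(flexibility_of_utilization(held_out))  # -> about 0.15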

The importance of the two capabilities depends on the situation in which they are used.  An all-in-one system that has acquired common sense in various fields will be highly usable as a general-purpose technology and will reduce development and operation costs.  For dialogue AI and policy-decision-support AI, interpersonal communication skills will be emphasized, and the importance of flexibly applying knowledge will also increase.  Flexible knowledge utilization will also be a central capability for antifragile AI that deals with unexpected system troubles and for scientific AI that generates hypotheses and explores the intellectual frontier for humanity.

From this standpoint, Gato’s contribution lies in its wide range of knowledge acquisition capabilities.  Such breadth could, in principle, also be achieved by a “big-switch” AI consisting of a mishmash of specialized AIs, each focused on a specific task [Yamakawa, 2016, in Japanese].  In the machine learning of the 2020s, training models (such as transformers) with large amounts of data to process various tasks is no longer a novelty.  Even so, Gato seems to have left its mark on the common-sense issue, as it demonstrated that a single AI system with a uniform structure can handle hundreds of tasks.  Looking back at the history of AI for a moment, traditional symbolic AI research aimed to build common sense by having people describe vast amounts of knowledge, as in the Cyc project [Cyc], but the problem remained that knowledge cannot be exhaustively written down.  While Gato’s approach requires the cost of preparing a large amount of labeled data, it is superior to the previous approach in that it can incorporate knowledge that is difficult to verbalize.  In the future, complementing and integrating these two directions may increase the common sense of AI, including its social capabilities [Shrobe, 2018].
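
To make the architectural distinction concrete, here is a minimal sketch of the two options (our own illustration, not DeepMind’s code; every class, method, and task name is hypothetical): a “big-switch” AI dispatches each task to a separate specialist, while a Gato-like system serializes every observation into tokens handled by one model with a single set of weights.

    # Hypothetical contrast between a "big-switch" of specialists and a single
    # uniform sequence model.  Class, method, and task names are illustrative.

    class BigSwitchAI:
        """A mishmash of specialized solvers selected by an explicit switch."""
        def __init__(self, specialists: dict):
            self.specialists = specialists  # task name -> dedicated solver

        def act(self, task_name: str, observation):
            return self.specialists[task_name](observation)  # one branch per task

    class UniformSequenceModel:
        """One model with one set of weights for every task (Gato-like in
        spirit): every observation is turned into tokens of the same kind."""
        def act(self, observation) -> list:
            tokens = self.tokenize(observation)  # images, text, actions -> ints
            return self.predict_next(tokens)     # one network, no task switch

        def tokenize(self, observation) -> list:
            # Stand-in tokenizer: hash any observation into a short token list.
            return [hash(repr(observation)) % 1024]

        def predict_next(self, tokens: list) -> list:
            # Stand-in for a trained transformer's next-token prediction.
            return [(t + 1) % 1024 for t in tokens]

    # Usage: the switch needs a per-task solver; the uniform model does not.
    switch = BigSwitchAI({"pong": lambda obs: "move_up"})
    print(switch.act("pong", {"frame": 0}))
    print(UniformSequenceModel().act({"frame": 0}))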

While flexible knowledge utilization has been advancing in current machine learning in the forms of transfer learning, domain adaptation, zero/one/few-shot learning, and so on, we do not find a corresponding contribution from Gato: it has not reached the point where it can cope with intelligence tests at a level close to that of humans.

3. Gato’s Effect on WBA

3.1 It demonstrated diverse computational functions of the neocortex in software

In the construction of the WBA, the i-th goal of “solving task Ai” is replaced by the derived goal of “solving task Ãi,” that is, solving Ai with a mechanism similar to that of the human brain.

We have developed a methodology for WBA development called Brain Reference Architecture (BRA) driven development [Yamakawa, 2021].  In BRA-driven development, hypothetical component diagrams (HCDs) consistent with mesoscopic anatomical structures are created for the entire brain in order to design software implementations (*3).  The resulting BRAs are integrated into the whole-brain reference architecture (WBRA).

Currently, we are creating HCDs for various brain regions while systematizing computational functions so that they are consistent with the anatomical structure.  Before setting about this work, it is desirable to have evidence that the computational functions assumed for the regions of interest (ROIs) are feasible in principle (whatever the neural mechanism may be).  Strong evidence would be the existence of artifacts, such as software, that realize those computational functions.  For example, an existing implementation of SLAM helped demonstrate the plausibility of a similar function in the hippocampus [Taniguchi, 2022].
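
As a rough illustration of what an HCD might look like as data (a hypothetical structure of our own, not the actual BRA format), the sketch below records ROIs, the computational function hypothesized for each, and their anatomical projections, so that a proposed function can be checked against the connectivity it assumes.

    # Hypothetical sketch of an HCD as a data structure.  The dataclass fields,
    # the simplified connectivity, and the function labels are illustrative and
    # not the actual BRA format used in BRA-driven development.
    from dataclasses import dataclass, field

    @dataclass
    class ROI:
        name: str
        hypothesized_function: str                       # assumed computation
        projects_to: list = field(default_factory=list)  # anatomical links

    def connection_exists(hcd: dict, src: str, dst: str) -> bool:
        """A functional hypothesis assuming information flow from src to dst
        is only admissible if the anatomical projection actually exists."""
        return dst in hcd[src].projects_to

    # Simplified hippocampal-formation entries for illustration only.
    hcd = {
        "entorhinal_cortex": ROI("entorhinal_cortex", "interface to neocortex",
                                 ["dentate_gyrus", "CA3"]),
        "dentate_gyrus": ROI("dentate_gyrus", "pattern separation", ["CA3"]),
        "CA3": ROI("CA3", "pattern completion", ["CA1"]),
    }
    print(connection_exists(hcd, "entorhinal_cortex", "CA3"))  # True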

As mentioned above, Gato has demonstrated, as working software, that a single AI system consisting of a uniform mechanism can solve various tasks.  If this function is mapped onto a brain region, it would correspond to the uniform circuits of the neocortex.  That is, the result confirms that the functions of the neocortex can be implemented as an artifact, even if the manner of implementation does not correspond to that of the brain.  Since the neocortex is responsible for the human-like and versatile functions of the brain, Gato’s achievement backs the development of WBA.

3.2 Reconsideration of the capability requirements for the completed WBA

“Can we say that Gato has reached AGI?”  Perhaps the reason this question seems controversial is the lack of criteria for determining what constitutes the completion of AGI.

With our WBA approach “to create a human-like AGI by learning from the architecture of the entire brain,” we currently set the following two completion requirements:

  • Capability Requirement: Realization of a list of typical capabilities (tasks) required for AGI.
  • Brain Component Requirement: Major brain organs are implemented, and each of them is used in at least one of the tasks above.

The capability requirement has been a source of concern at WBAI for some time.  Some AGI evaluation methods are known to be human-oriented and others are not; meanwhile, since the WBA approach learns from the human brain, the list of typical tasks should be based on human capabilities.

Now, how can we determine a list of typical tasks that a person can perform?  It is quite difficult to determine such a list for use in AGI evaluation: tasks are difficult to enumerate and can be infinite depending on the environment.  Furthermore, even if the tasks have been determined, AI systems tend to be tuned to the fixed targets (*4), which often makes it difficult to evaluate the flexible knowledge utilization capabilities of AGI.

This issue is not unique to the WBA approach and has been discussed for years in the AGI field.  So, if a standard framework for AGI evaluation is established through progress in the field, we will be able to adopt it as a requirement for WBA.

When the WBRA is nearing completion, it may still not be possible to obtain an appropriate task list from the AGI research community.  In such a situation, the next best measure would be to form a capability requirement by selecting a task list such that every component in the WBRA is used in at least a specified number of tasks [Yamakawa, 2021].
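
Selecting such a task list can be viewed as a covering problem.  The sketch below is a hypothetical illustration (the greedy strategy, task names, and component names are our own and not part of any WBAI tool): it keeps adding tasks until every WBRA component appears in at least the required number of selected tasks.

    # Hypothetical greedy selection of a capability-requirement task list: keep
    # adding tasks until every WBRA component is used in at least `min_uses`
    # selected tasks.  Task and component names are made up for illustration.
    from collections import Counter

    def select_task_list(task_components: dict, components: set,
                         min_uses: int = 1) -> list:
        usage = Counter()
        selected = []
        remaining = dict(task_components)
        while any(usage[c] < min_uses for c in components) and remaining:
            # Pick the task that covers the most still-under-covered components.
            best = max(remaining, key=lambda t: sum(
                1 for c in remaining[t] if usage[c] < min_uses))
            selected.append(best)
            usage.update(remaining.pop(best))
        return selected

    tasks = {
        "navigate_maze": {"hippocampus", "neocortex", "basal_ganglia"},
        "recall_story": {"hippocampus", "neocortex"},
        "track_object": {"neocortex", "cerebellum"},
    }
    organs = {"hippocampus", "neocortex", "basal_ganglia", "cerebellum"}
    print(select_task_list(tasks, organs, min_uses=1))
    # -> ['navigate_maze', 'track_object']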

4. Conclusion

In this article, we first pointed out that AGI should have, at the human level, both the ability to acquire extensive knowledge, related to common sense, and the ability to use knowledge flexibly, related to creativity.  We then noted that the value of DeepMind’s recent announcement of Gato for the AGI field is that it experimentally demonstrated that a single AI system can learn broad common-sense knowledge from labeled multimodal data.

Gato’s results also have some positive effects on the development of brain-inspired AGI.  First, the demonstration of a wide range of knowledge acquisition capabilities as software supports brain-inspired, BRA-driven development of AGI, as it showed that the computational functions realized by the neocortex are feasible in principle.  Second, Gato’s announcement seems to have provided an opportunity for a wide range of AI researchers to rethink the meaning of AGI and the conditions for its completion.  AGI developers will, on occasion, declare that they have developed AGI.  We hope that a consensus on AGI evaluation will gradually form on such occasions, as discussion is stimulated in the research community.

Notes

  1. It roughly corresponds to the learning process of System 1 in the dual process theory.
  2. It roughly corresponds to the computational process of System 2 in the dual process theory.
  3. It includes the Structure-constrained Interface Decomposition (SCID) method.
  4. Test scores will increase even for a human being if the same intelligence test is taken multiple times.

References

[Cyc] Cyc. Wikipedia. https://en.wikipedia.org/wiki/Cyc
[DeepMind, 2022] DeepMind (2022). A Generalist Agent. https://www.deepmind.com/publications/a-generalist-agent
[Shrobe, 2018] Shrobe, H. (2018). Machine Common Sense (MCS). DARPA. https://www.darpa.mil/program/machine-common-sense
[Taniguchi, 2022] Taniguchi, A., Fukawa, A., & Yamakawa, H. (2022). Hippocampal formation-inspired probabilistic generative model. Neural Networks: The Official Journal of the International Neural Network Society, 151, 317–335.
[Yamakawa, 2021] Yamakawa, H. (2021). The whole brain architecture approach: Accelerating the development of artificial general intelligence by referring to the brain. Neural Networks: The Official Journal of the International Neural Network Society, 144, 478–495.
[Yamakawa, 2016] Yamakawa, H. (2016). A proposal of the knowledge-description-length minimization principle hypothesis for general intelligence (in Japanese). Proceedings of the Annual Conference of JSAI, JSAI2016, 2E4OS12a3. https://doi.org/10.11517/pjsai.JSAI2016.0_2E4OS12a3
[Yamakawa, 2020] Yamakawa, H. (2020). The current state of the evolved WBA approach (in Japanese). The 5th WBA Symposium, p. 8. https://www.slideshare.net/wba-initiative/wba-239076014

Press Coverage on Gato

DeepMind’s ‘Gato’ is mediocre, so why did they build it?
‘The Game is Over’: AI breakthrough puts DeepMind on verge of achieving human-level artificial intelligence | The Independent
DeepMind’s new Gato AI makes me fear humans will never achieve AGI
The hype around DeepMind’s new AI model misses what’s actually cool about it