RFR: Self-Organising Architectures

Proposed by Project AGI and the Whole Brain Architecture Initiative


Summary

Overview

The mammalian neocortex is the most recently evolved part of the brain and is crucial for our advanced sensory processing and cognition. If you spread it out, it is physically a thin sheet. Connections between regions functionally organise it into a hierarchy (this has been widely documented; see, for example, Principles of Neuroscience for a thorough description). In other words, the layers are virtual, stacked through connectivity, rather than physically separated.

Moving across the surface of the neocortex, there are gradual changes in the encoded representations and in integration between sensory modalities. This has been described in the Gradiental Model by Elkhonon Goldberg [1].

The Gradiental Model strongly implies that neocortical organisation is interactive, continuous and emergent. Within the constraints of fixed sensory input and evolved neural parameters, the hierarchy is self-organising, both in how resources are allocated to represent input and in how receptive fields and modalities are integrated hierarchically.

In contrast, in state-of-the-art machine learning (ML), the architecture (receptive fields, their connectivity and integration) is almost always fixed and specified beforehand by the algorithm designer. Mimicking the Gradiental Model should lead to more effective use of resources, better modelling of input and substantially greater flexibility across a range of environments and tasks.

Objective

The objective is to create a single-layer neural network that exhibits gradiental and hierarchical features in a self-organising manner. Future projects will look at hierarchies and at combining multiple sensing modalities.

Success Criteria

A successful project will demonstrate an algorithm design and implementation, along with an experimental setup and results comparing the performance of the system against architectures with a fixed organisation. As a stretch goal, demonstrate a compelling advantage, although this may be very hard to do.

Dataset and Tests

For the first phase, using a single sensing modality, we suggest a simple classification problem based on vision, as it is intuitive to visualise. MNIST and CIFAR-10 are a good place to start, as they are well-known datasets with existing performance benchmarks for comparison. They are also easily obtainable and usable in ML frameworks. The classification can be done either supervised or unsupervised. For the latter, it is common practice to use an additional simple discriminative algorithm (such as logistic regression) to test the unsupervised model’s output features.
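As a rough illustration of the unsupervised evaluation route, the sketch below fits a small logistic-regression probe on frozen features and reports its accuracy. The toy feature vectors, the names train_probe and accuracy, and the hyperparameters are illustrative assumptions; a real MNIST or CIFAR-10 run would need a multi-class probe, held-out test data and features taken from the trained model.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(w, b, x):
    return b + sum(wi * xi for wi, xi in zip(w, x))

def train_probe(features, labels, epochs=200, lr=0.5):
    """Fit binary logistic regression on frozen features with plain SGD."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            g = sigmoid(logit(w, b, x)) - y   # gradient of log-loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def accuracy(w, b, features, labels):
    preds = [int(logit(w, b, x) > 0) for x in features]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Toy stand-ins for the output features of an unsupervised model (assumed data).
features = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.9], [0.1, 0.7]]
labels = [0, 0, 1, 1]
w, b = train_probe(features, labels)
print("probe accuracy:", accuracy(w, b, features, labels))
```

Only the probe is trained on labels here; the features themselves stay fixed, so the score reflects how linearly separable the unsupervised representation is.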

Selecting meaningful criteria and choosing how to test and compare algorithms is part of the scope. We’ll be in close contact to assist.

Detailed Project Description

We start with a 2D surface of cells resembling the surface of the cortex. It is not organized into any form of hierarchy a priori; physically, it’s just a single layer. Each region (or, more loosely, group) of cells receives input from an area of the surface (i.e. a set of adjacent cells) called a receptive field. Every cell in the group shares the same fixed-size receptive field, which can target any part of the surface. The output of cells is mapped recursively back to the input surface, as illustrated in the figure below. This means that a hierarchy can emerge: cells that process external input produce output, and other cells target their receptive fields on this output as their input. The latter cells implicitly become layer 2 of the hierarchy or architecture.

Fig. 1: Illustration of a 1D physical layer being organised into a virtual hierarchy through recursive connections. Note that the learning of these connections is referred to as the ‘secondary-learning’ algorithm.
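To make the idea of an implicit hierarchy concrete, here is a minimal sketch of a 1D surface in which some positions carry external input and the rest hold region outputs. The Region class, the constants RF_SIZE and N_EXTERNAL, and the surface layout are all hypothetical choices made for illustration; the level of a region is derived purely from where its receptive field points.

```python
from dataclasses import dataclass

RF_SIZE = 4      # fixed receptive-field width shared by all regions (assumed)
N_EXTERNAL = 8   # surface positions 0..7 carry external input (assumed)

@dataclass
class Region:
    name: str
    rf_start: int   # first surface position read by the receptive field
    out_start: int  # first surface position this region writes its output to
    out_size: int   # number of cells (output positions) in the region

def sources(region, regions):
    """Other regions whose output positions overlap this receptive field."""
    lo, hi = region.rf_start, region.rf_start + RF_SIZE
    return [r for r in regions
            if r is not region and r.out_start < hi and lo < r.out_start + r.out_size]

def implied_level(region, regions, _seen=()):
    """1 if the region reads only external input, else 1 + its deepest source."""
    if region.name in _seen:   # recursive loop: ignore the back-edge
        return 0
    srcs = sources(region, regions)
    if not srcs:
        return 1
    return 1 + max(implied_level(s, regions, _seen + (region.name,)) for s in srcs)

regions = [
    Region("A", rf_start=0, out_start=8, out_size=4),    # reads external input
    Region("B", rf_start=4, out_start=12, out_size=4),   # reads external input
    Region("C", rf_start=10, out_start=16, out_size=4),  # reads output of A and B
    Region("D", rf_start=16, out_start=20, out_size=4),  # reads output of C
]
for r in regions:
    print(r.name, implied_level(r, regions))
```

Nothing in the data structure names a layer: moving a region's rf_start is enough to change its level, which is the property the secondary-learning rule would exploit.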

Each region is a type of neural network with its own learning rule, e.g. a self-organising map (SOM) variant or a conventional supervised dense layer. One aspect of this project is selecting a suitable neural network. The other is implementing a secondary-learning rule that makes recursive connections to form an effective hierarchy that improves performance. In other words, the objective of the secondary-learning rule is to choose the size and placement of receptive fields so that overall network performance is optimized.

The most likely proxy objective for the secondary-learning algorithm is for each region to seek out areas of under-utilized input. It should be a ‘competitive process’ between regions, so that the input is distributed evenly between them. A good starting point is to fix the receptive field size and let the placement of each field wander.
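One very simple way to picture this competition is a greedy sweep in which each region, in turn, moves its fixed-size receptive field to the window that other regions cover least. This is only a sketch under stated assumptions: "utilisation" is approximated by a count of overlapping receptive fields, the surface is 1D, and the function names coverage and relocate are hypothetical.

```python
def coverage(starts, rf_size, surface_len, skip=None):
    """Count how many receptive fields cover each surface position."""
    counts = [0] * surface_len
    for i, s in enumerate(starts):
        if i == skip:
            continue
        for p in range(s, s + rf_size):
            counts[p] += 1
    return counts

def relocate(starts, rf_size, surface_len, sweeps=3):
    """Greedy competition: each region moves to the least-covered window."""
    starts = list(starts)
    candidates = range(surface_len - rf_size + 1)
    for _ in range(sweeps):
        for i in range(len(starts)):
            # Coverage by every region except the one being moved.
            counts = coverage(starts, rf_size, surface_len, skip=i)
            starts[i] = min(candidates, key=lambda s: sum(counts[s:s + rf_size]))
    return starts

# Three regions that all start on the same patch of a 12-cell surface.
placed = relocate([0, 0, 0], rf_size=4, surface_len=12)
print(placed, coverage(placed, 4, 12))
```

In this toy case the regions spread out until every position is covered exactly once. A real rule would replace the coverage count with a measure of how well the input is already being modelled.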

How can the secondary-learning rule be achieved? Some options are a measure based on Shannon entropy, or some kind of free-energy minimization scheme (as introduced by Friston). Another is predictive coding: if cells can ‘suppress’ output by predicting it, other cells’ receptive fields would look elsewhere for unsuppressed input.
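The predictive-coding option can be sketched in a few lines. Assume, purely for illustration, that each surface position has an activity value and that some region already predicts part of the surface; the residual (activity minus prediction) then plays the role of unsuppressed input, and a second region targets the window with the largest residual. The function names and toy values are hypothetical.

```python
def residual(surface, prediction):
    """Unsuppressed activity: what is left after subtracting the prediction."""
    return [abs(x - p) for x, p in zip(surface, prediction)]

def best_window(signal, rf_size):
    """Start index of the fixed-size window with the largest summed signal."""
    starts = range(len(signal) - rf_size + 1)
    return max(starts, key=lambda s: sum(signal[s:s + rf_size]))

surface = [1.0, 1.0, 1.0, 1.0, 0.5, 0.5, 0.5, 0.5]
# An existing region predicts the first half of the surface perfectly (assumed).
prediction = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]

print(best_window(surface, 4))                        # strongest raw activity
print(best_window(residual(surface, prediction), 4))  # strongest unsuppressed activity
```

Without suppression the second region would be drawn to the same strongly active patch as the first; with suppression it settles on the weaker but still unexplained half of the surface.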

Parameters such as region size (i.e. how many cells share a receptive field) and number of cells (total resources) can be fixed manually or empirically. As mentioned above, you may also like to start with a fixed-size receptive field. It may also help to set a starting position and/or bias the receptive field placements.

Difficulties anticipated

The discussion above is focussed on the secondary-learning rule. However, the base learning algorithm will need to continue to operate successfully while the secondary-learning rule dynamically modifies the receptive fields. For this reason, a neural net that can learn online and continue to adapt, such as a SOM, is necessary.
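For reference, a single online update of a 1D SOM looks roughly like the sketch below. The learning rate, the neighbourhood width and the function name som_step are illustrative assumptions, and both parameters are held constant rather than decayed so that the map can keep adapting if a receptive field moves and the input statistics change.

```python
import math

def som_step(weights, x, lr=0.5, sigma=1.0):
    """One online update of a 1D SOM; returns the index of the winning cell."""
    # Best-matching unit: the cell whose weight vector is closest to x.
    dists = [sum((w - xi) ** 2 for w, xi in zip(cell, x)) for cell in weights]
    bmu = dists.index(min(dists))
    for j, cell in enumerate(weights):
        # Gaussian neighbourhood on the cell grid, centred on the winner.
        h = math.exp(-((j - bmu) ** 2) / (2 * sigma ** 2))
        for k in range(len(cell)):
            cell[k] += lr * h * (x[k] - cell[k])
    return bmu

# Three cells with 2D weight vectors (toy values).
weights = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
print(som_step(weights, [1.0, 1.2]))   # index of the winning cell

# The input statistics shift, e.g. because the receptive field was moved.
for _ in range(20):
    som_step(weights, [5.0, 5.0])
print(weights[2])                      # the nearest cell has tracked the new input
```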

There are many ways that a secondary-learning scheme for receptive fields could be trained; the flip side of this openness is that there is a large space of potential solutions to investigate.

A further difficulty is preventing cycles of self-excitation, which could cut the cells involved off from the rest of the surface and from external input. It is likely that lateral inhibition plays a role in preventing this in the neocortex.
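A simple diagnostic for this failure mode, separate from any inhibition mechanism, is to check which regions still have a path back to external input. The sketch below assumes the connectivity has been summarised as a mapping from each region to the set of sources its receptive field reads, with the label "external" standing for external input; the function name and the example graph are hypothetical.

```python
def grounded_regions(sources):
    """Regions with a path back to external input.

    `sources` maps a region name to the set of names it reads from;
    the special name "external" denotes the external input.
    """
    grounded = {"external"}
    changed = True
    while changed:
        changed = False
        for region, srcs in sources.items():
            if region not in grounded and srcs & grounded:
                grounded.add(region)
                changed = True
    grounded.discard("external")
    return grounded

sources = {
    "A": {"external"},
    "B": {"A"},
    "C": {"D"},   # C and D read only from each other
    "D": {"C"},
}
isolated = set(sources) - grounded_regions(sources)
print(sorted(isolated))
```

Regions flagged as isolated could have their receptive fields reset or biased back towards grounded parts of the surface; how best to do that is left open here.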

Background Information

Artificial Neural Networks

In ANNs, the vast majority of architectures are hand-designed to fulfil a specific narrow purpose. The structure of the network, in terms of the number of layers and cells per layer, is almost always fixed by a mix of empirical discovery and intuition (it’s a dark art, they say). In convolutional networks, which are quite dominant in visual processing and, more recently, in sequential (time series) processing, additional geometry determining receptive fields and pooling regions is also specified by the researcher.

Note, however, that the function, purpose and role of individual layers in an ANN, and the data presented to them, are learned. The most successful deep learning algorithms, such as residual networks, spontaneously form ensembles of layers (typically 5-30 layers) during training in response to different inputs. Effectively, the network becomes a set of interacting networks that collectively specialize in different input.

References

[1] Goldberg E, “Gradiental Approach to Neocortical Functional Organization”, J Clin Exp Neuropsychol, vol. 11, no. 4, 1989