Package deep provides the DeepLeabra variant of Leabra, which performs predictive learning by attempting to predict the activation states over the Pulvinar nucleus of the thalamus (in posterior sensory cortex). These pulvinar activation states are strongly driven phasically every 100 msec by deep layer 5 intrinsic bursting (5IB) neurons, which have strong, focal (essentially 1-to-1) connections onto the Pulvinar Thalamic Relay Cell (TRC) neurons. The predictions are generated by layer 6 corticothalamic (CT) neurons, which provide numerous weaker projections to these same TRC neurons. See O'Reilly et al. (2020) for the model, and Sherman & Guillery (2006) for details on the circuitry.
Computationally, it is important for the CT neurons to reflect the prior burst activation within their home cortical microcolumn, instead of the current superficial layer activation, so that the system is forced to make a genuine prediction instead of just copying the current state. This is achieved using a CTCtxt projection, which operates much like a simple recurrent network (SRN) context layer (e.g., Elman, 1990).
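To make the SRN-like behavior concrete, here is a minimal, self-contained Go sketch (illustrative only, not the actual leabra API; all names here are made up for this example) of a CT context that always holds the previous burst rather than the current activation:

    package main

    import "fmt"

    // ctxtUnit holds the SRN-style, temporally delayed context for a CT unit:
    // it reflects the Burst activation from the previous update, not the
    // current superficial-layer activation.
    type ctxtUnit struct {
        Ctxt float64 // context input that drives the CT prediction now
    }

    // update is called at the end of a burst quarter: the current burst
    // becomes the context that drives the CT prediction on the next step.
    func (c *ctxtUnit) update(superBurst float64) {
        c.Ctxt = superBurst
    }

    func main() {
        var ct ctxtUnit
        for step, burst := range []float64{0.9, 0.1, 0.7} {
            // The CT unit sees only the prior burst, so it must genuinely
            // predict the current state rather than copying it.
            fmt.Printf("step %d: CT context = %.1f, current burst = %.1f\n",
                step, ct.Ctxt, burst)
            ct.update(burst)
        }
    }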
This same corticothalamic circuitry is also important for broader inhibitory competition among cortical areas that cannot practically interact directly in cortex, given the long physical distances. The thalamic reticular nucleus (TRN) integrates excitatory inputs from the CT and TRC neurons, and projects pooled inhibition across multiple spatial scales back to the TRC neurons. These TRC neurons then project back primarily into layer 4 (and more weakly into other layers) and thus convey the attentionally modulated predictive activation back into cortex.
Computationally, it makes sense that attention and prediction are linked: you only predict the subset of information that you're attending to (otherwise it is too overwhelming to predict everything), and prediction helps to focus the attentional spotlight in anticipation of what will happen next, and according to the regularities that predictive learning has discovered (e.g., coherently moving objects).
However, reconciling the attentional and predictive roles of this circuitry introduces some complexity. The attentional function depends on a closed-loop connectivity pattern organized around pools of neurons (e.g., hypercolumns), such that CT -> TRN -> TRC -> Cortex is all properly aligned. This is generally how the pulvinar is organized topographically (Shipp, 2003). However, from a predictive-learning perspective it is simpler to coordinate the TRC according to its driver 5IB projections so you can directly measure how well the prediction matches these driver inputs. Furthermore, each cortical pool is involved in predicting across multiple different drivers across multiple layers in the hierarchy (e.g., IT cortex predicts V1, V2, V4 drivers), so the closed-loop TRC is driven by diverse inputs that need to be coordinated, whereas the driver-based organization can handle this more simply with multiple projections. This is how we organized things computationally in previous implementations.
In the current implementation, we use the closed-loop connectivity, with a Drivers list on the TRCLayer that manages the aggregation of driver inputs from multiple different layers and also handles the conversion between the group-level topologies of these layers (which was previously handled by projection pattern logic). The wiring diagram below shows the closed-loop connectivity; a sketch of the driver aggregation follows it.

Wiring diagram:
    V1Super -----> V2 --Ctxt--> CT
       |            ^           |  \
     Burst          |  (pool    |   v
       |            |   loop)   |  TRN
       v            |           |  /
     Pred <-----> TRC <---------o (inhib)
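Conceptually, the Drivers-based aggregation can be sketched as follows (a function-level Go sketch, not the actual TRCLayer code; the real implementation also remaps the group-level topology of each driver layer, which is simply assumed to be contiguous here):

    // driverLayer is an illustrative stand-in for a SuperLayer acting as a
    // driver: it supplies one Burst value per unit, organized into pools.
    type driverLayer struct {
        Name  string
        Burst []float64 // burst activations, one per unit
    }

    // aggregateDrivers concatenates the Burst values from each driver layer
    // into a single driver-input slice for the TRC layer, in the order given
    // by the Drivers list.
    func aggregateDrivers(drivers []driverLayer) []float64 {
        var drive []float64
        for _, d := range drivers {
            drive = append(drive, d.Burst...)
        }
        return drive
    }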
This package has 3 primary specialized Layer types:
- SuperLayer: implements the superficial layer neurons, which function just like standard leabra.Layer neurons, while also directly computing the Burst activation signal that reflects the deep layer 5IB bursting activation, via thresholding of the superficial layer activations (Bursting is thought to have a higher threshold).

- CTLayer: implements the layer 6 regular spiking CT corticothalamic neurons that project into the thalamus. They receive the Burst activation via a CTCtxtPrjn projection type, typically once every 100 msec, and integrate it into the CtxtGe value, which is added to other excitatory conductance inputs to drive the overall activation (Act) of these neurons. Due to the bursting nature of the Burst inputs, these CT layer neurons reflect what the superficial layers encoded on the previous timestep -- thus they represent a temporally delayed context state. CTLayer can also send Context via self projections, reflecting the extensive deep-to-deep lateral connectivity that provides more extensive temporal context information.

- TRCLayer: implements the TRC (Pulvinar) neurons, upon which the prediction generated by CTLayer projections is projected in the minus phase. This is computed via standard Act-driven projections that integrate into standard Ge excitatory input in TRC neurons. The 5IB Burst-driven plus-phase "outcome" activation state is driven by direct access to the corresponding driver SuperLayer (not via standard projection mechanisms); see the wiring diagram above and the sketch after this list.
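The following function-level Go sketch pulls these three computations together for a single microcolumn and its TRC unit (illustrative only, not the actual leabra API; the burstThr value and all names are assumptions):

    // deepUnit collects illustrative per-unit values for one cortical
    // microcolumn and its corresponding TRC unit.
    type deepUnit struct {
        Burst    float64 // thresholded copy of the superficial activation (5IB-like)
        CtxtGe   float64 // delayed context conductance driving the CT neuron
        TRCMinus float64 // TRC minus phase: the CT-driven prediction
        TRCPlus  float64 // TRC plus phase: the Burst-driven outcome
    }

    // burstThr is an illustrative bursting threshold on superficial activation.
    const burstThr = 0.5

    // update computes the deep-layer values from the current superficial
    // activation and the CT prediction conductance.
    func (u *deepUnit) update(superAct, ctPredGe float64) {
        if superAct >= burstThr {
            u.Burst = superAct // bursting has a higher threshold than ordinary firing
        } else {
            u.Burst = 0
        }
        u.TRCMinus = ctPredGe // minus phase: standard Ge input from the CT prediction
        u.TRCPlus = u.Burst   // plus phase: direct access to the driver's Burst
        u.CtxtGe = u.Burst    // handed off to CT at the end of the quarter, for the next step
    }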
The alpha-cycle quarter(s) in which Burst is updated and broadcast are set in BurstQtr (default Q4; it can also be, e.g., Q2 and Q4 for beta-frequency updating). During these quarters, the Burst value is computed in SuperLayer and is continuously accessed by TRCLayer neurons to drive their plus-phase outcome states.
At the end of the burst quarter(s), in the QuarterFinal method, CTCtxt projections convey the Burst signal from Super to CTLayer neurons, where it is integrated into the Ctxt value representing the temporally delayed context information.
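An illustrative sketch of this scheduling (again only a sketch with assumed names, not the actual BurstQtr or QuarterFinal API):

    // quarter indexes the four 25-msec quarters of the 100-msec alpha cycle.
    type quarter int

    const (
        Q1 quarter = iota
        Q2
        Q3
        Q4
    )

    // burstQtrs is the set of quarters in which Burst is updated and
    // broadcast; the default is Q4, and beta-frequency updating would
    // use Q2 and Q4.
    var burstQtrs = map[quarter]bool{Q4: true}

    // quarterFinal is a stand-in for the end-of-quarter update: if this was
    // a burst quarter, the superficial Burst becomes the CT context (CtxtGe)
    // used on the next timestep.
    func quarterFinal(qtr quarter, superBurst float64, ctxtGe *float64) {
        if burstQtrs[qtr] {
            *ctxtGe = superBurst
        }
    }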
The basic anatomical facts of the TRN strongly constrain its role in attentional modulation. With the exception of inhibitory projections from the GPi / SNr (the BG output nuclei), it receives exclusively excitatory inputs: from CT projections, and a weaker excitatory feedback projection from the same TRC neurons to which it in turn sends GABA inhibition. Thus, its main function appears to be providing pooled feedback inhibition to the TRC, with various degrees of pooling on the input side and of diffusion on the output side. Computationally, this pooling seems ideally situated to enable inhibitory competition to operate across multiple different scales.
Given the pool-level organization of the CT -> TRC -> Cortex loops, the pool should be the finest grain of this competition. One contribution of the TRN is thus to support layer-level inhibition across pools -- but this is already implemented by the layer-level inhibition in standard Leabra. Critically, if we assume that inhibition is generally hierarchically organized, then the broader level of inhibition would operate between layers. The TRN implementation therefore just supports this broadest level of inhibition, providing a visual representation of the layers and their respective inhibition levels.
In addition, the TRC layer itself supports a Gaussian topographic level of inhibition among pools, representing the finer-grained inhibition that would otherwise be provided by the TRN.
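A rough sketch of such finer-grained, topographic inhibition among pools (illustrative only; sigma and gain are assumed free parameters, pools are treated as a 1-D array, and the standard library math package is assumed imported):

    // poolInhib returns an inhibitory conductance for each pool as a
    // Gaussian-weighted sum of neighboring pools' average activity, so
    // strongly active pools suppress their neighbors the most.
    func poolInhib(poolAct []float64, sigma, gain float64) []float64 {
        inhib := make([]float64, len(poolAct))
        for i := range poolAct {
            for j, act := range poolAct {
                d := float64(i - j)
                w := math.Exp(-(d * d) / (2 * sigma * sigma))
                inhib[i] += gain * w * act
            }
        }
        return inhib
    }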
Perhaps the most important contribution that the TRC / TRN can provide is a learning modulation at the pool level, as a function of inhibition.
It is relatively easy to make something that locks in a given attentional pattern, but a problem arises when you then need to change things in response to new inputs -- often the network suffers from too much attentional lock-in...
Grossberg (1999) emphasized that it can be beneficial for attention to modulate the inputs to a given area, so it gets "folded" into the input stream. Another way of thinking about this is that it is more effective to block a river further upstream, before further "compounding" effects might set in, rather than waiting until everything has piled in and you have to push against a torrent. This is achieved by modulating the layer 4 inputs to an area, which happens by modulating forward projections.
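A minimal sketch of this upstream modulation (illustrative only; in the model the attention factors would be derived from TRC / TRN activity): the forward, layer-4-targeting excitatory input to each pool is scaled by that pool's attention factor before it is integrated, rather than being suppressed after the fact.

    // modulateForwardGe scales the forward (layer-4-targeting) excitatory
    // input of each pool by its attention factor in [0, 1], so attention
    // is folded directly into the input stream.
    func modulateForwardGe(fwdGe, attn []float64) []float64 {
        out := make([]float64, len(fwdGe))
        for i := range fwdGe {
            out[i] = fwdGe[i] * attn[i]
        }
        return out
    }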
See PBWM for information about the Prefrontal-cortex Basal-ganglia Working Memory (PBWM) model, which builds on this deep framework to support gated working memory.
- Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14(2), 179–211.

- O’Reilly, R. C., Russin, J. L., Zolfaghar, M., & Rohrlich, J. (2020). Deep Predictive Learning in Neocortex and Pulvinar. arXiv:2006.14800 [q-bio]. http://arxiv.org/abs/2006.14800

- Sherman, S. M., & Guillery, R. W. (2006). Exploring the Thalamus and Its Role in Cortical Function. MIT Press. http://www.scholarpedia.org/article/Thalamus

- Shipp, S. (2003). The functional logic of cortico-pulvinar connections. Philosophical Transactions of the Royal Society of London B, 358(1438), 1605–1624. http://www.ncbi.nlm.nih.gov/pubmed/14561322