René Doursat
 PhD, Habil.

Professor of Complex Systems & Deputy
   Head, Informatics Research Centre,
   School of Computing, Math & Digital Tech,
   Manchester Metropolitan University, UK

Research Affiliate, BioEmergences Lab,
   CNRS (USR3695), Gif s/Yvette, France

Steering Committee & Fmr. Director,
   Complex Systems Institute, Paris (ISC)

Officer (Secretary), Board of Directors,
   International Society for Artificial Life



email: R.Doursat@mmu.ac.uk





Books
   • Growing Adaptive Machines (Springer, 2014)
   • Morphogenetic Engineering (Springer, 2012)
   • Cognitive Morphodynamics (Peter Lang, 2011)

Edited Proceedings
   • Artificial Life: ALife'14, ECAL'15 (MIT Press, 2014, 2015)
   • Evolutionary Computation: GECCO'12, '13 (ACM, 2012, 2013)
   • Artificial Life: ECAL'11 (MIT Press, 2011)
   • Swarm Intelligence: ANTS'10 (Springer, 2010)
   • IT Revolutions: ICST'08 (Springer, 2009)



PhD Dissertation  
A contribution to the study of representations in the nervous system and in artificial neural networks
The central theme of my 1991 doctoral thesis under the guidance of Elie Bienenstock was the relationship between neural code and mental representation. If we make the assumption that all mental "entities" (sensation, perception, concept, word, external object, action, etc.) are represented in the nervous system as states of neuronal activities, a fundamental problem of cognitive neuroscience is to elucidate the structure and properties of such representational states.
I conducted three different, yet interrelated studies advocating Christoph von der Malsburg's theory of temporal correlations as the basis of the neural code: a handwritten character classifier (see 2. Elastic Matching), a model of cortical self-organization (see 3. Synfire Chains), and a review of the limits of statistical learning in neural networks (see 1. Bias/Variance). I developed these various theoretical and practical models and carried out numerical simulations with the purpose of illustrating the relevance of the alternative neural code originally proposed by von der Malsburg.

The nature of the neural code has often been debated since the beginnings of modern neuroscience, but it is generally accepted that the average firing rate of neurons constitutes an important part of it. In short, the classical view holds that mental entities are coded by cell assemblies (Hebb, 1949), which are spatial patterns of average activity.

Following Christoph von der Malsburg's "Correlation theory of brain function" (1981) and the work of my thesis advisor, Elie Bienenstock (see, e.g., von der Malsburg and Bienenstock, 1986), I defended another format of representation that involves higher-order moments, or temporal correlations, among neuronal activities. Here, mental representations are not exclusively based on individual mean activity rates ⟨x_i⟩, which are events of order 1, but more generally on order-N events ⟨x_{i1} x_{i2} ... x_{iN}⟩ and, in particular, on correlations between two neurons, ⟨x_i x_j⟩.

Naturally, the traditional order-1 code stems from classical observations in the primary sensory areas (e.g., visual cortex), in which cells seem to possess selective response properties. From these experiments, it was inferred that one such neuron, or a small cluster of neurons, could individually and independently represent one specific type of stimulus (e.g., the orientation of an edge).

However, to obtain the global representation of an object, these local features must eventually be linked and integrated. The problem is that this integration is unlikely to be carried out by highly specialized cells at the top of a hierarchical processing chain (the conjectural "grandmother" cells that fire only when you see your grandmother). Equally unlikely would be for the assembly of feature-coding activity rates to be maintained in a distributed state, because of the impossibility of overlapping two such states without losing relational information (the so-called "binding problem"). If two cells coding for "red" and "circle" are active and two other cells coding for "green" and "triangle" also become active, then this global state of activation is indistinguishable from the alternative combination "red triangle" and "green circle" (von der Malsburg, 1987).

This is why we advocated the idea that feature integration requires higher-order codes, able to represent relationships between elementary components that are initially uncorrelated (in the above example, the spike trains of "red" and "circle" would be synchronous with each other and out of phase with those of "green" and "triangle"). These correlation events bring to the representation format a structure that is fundamentally missing from the mere feature lists of Hebbian cell assemblies.

To use a chemical metaphor, we could say that feature lists are to molecular formulas (e.g., C3H8O) what correlations are to line-bond diagrams (e.g., 1-propanol vs. 2-propanol).
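As a toy illustration of this order-2 code (a hypothetical sketch of my own, not taken from the thesis), the following Python snippet builds four binary spike trains in which "red"/"circle" fire in synchrony and "green"/"triangle" fire in counter-phase: the mean rates (order 1) are identical and therefore ambiguous, while the pairwise correlations (order 2) recover the binding.

```python
def spikes(phase, n_bins=20):
    """A binary spike train firing in every other time bin, offset by `phase`."""
    return [1 if (t + phase) % 2 == 0 else 0 for t in range(n_bins)]

def rate(x):
    """Order-1 statistic: the mean firing rate <x_i>."""
    return sum(x) / len(x)

def correlation(x, y):
    """Order-2 statistic: the mean coincident firing <x_i x_j>."""
    return sum(a * b for a, b in zip(x, y)) / len(x)

# Scene "red circle + green triangle": red/circle fire in synchrony,
# green/triangle fire in synchrony but out of phase with the first pair.
train = {"red": spikes(0), "circle": spikes(0),
         "green": spikes(1), "triangle": spikes(1)}

# All four mean rates are equal: the order-1 code cannot distinguish this
# scene from the alternative binding "red triangle + green circle".
assert len({rate(t) for t in train.values()}) == 1

# The order-2 code resolves the binding: synchronized pairs belong together.
assert correlation(train["red"], train["circle"]) > correlation(train["red"], train["triangle"])
```

In rate terms the two scenes are the same "feature list"; only the coincidence structure tells them apart.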

The first part does not offer a new method or algorithm but rather aims at bringing to light general problems and limitations encountered by statistical learning processes, especially of the generalist or "nonparametric" kind. The main goal of this study is to stress the crucial importance of identifying the right format of representation and giving it priority over other concerns about the "adaptability" or power of generalization of a learning system.

We put into practice this recommendation in the second part by designing a handwritten character classification method based on order-2 correlations. Images are represented by 2-D deformable lattices instead of unstructured lists of pixels, while the "distance" between two input images is defined as the cost-functional of a graph-matching process. The success rates achieved by this criterion are superior to those of feed-forward neural classifiers, which are implicitly based on Hamming or Euclidean metrics. This study illustrates both the importance of a problem-specific format of representation and the particular appropriateness of higher-order codes in this matter.

The third part is more speculative and attempts to answer Fodor and Pylyshyn's (1988) influential criticism about the lack of structured representations in neural networks. We show that the compositionality of language and cognition can actually arise from the simultaneous self-organization of connectivity and activity in an initially random cortical network.

References

von der Malsburg, C. (1981) The correlation theory of brain function. Internal report, MPI for Biophysical Chemistry, Göttingen.

von der Malsburg, C. & Bienenstock, E. (1986) In Disordered Systems and Biological Organization, Springer: 247-272.

Publications

Bienenstock, E. & Doursat, R. (1990) Spatio-temporal coding and the compositionality of cognition. In Proceedings of the Workshop on Temporal Correlations and Temporal Coding in the Brain, April 25-27, 1990, Paris, France; R. Lestienne, ed.: pp. 42-47.

Doursat, R. (1991) Contribution à l'étude des représentations dans le système nerveux et dans les réseaux de neurones formels. PhD Thesis, Université Pierre et Marie Curie (Paris 6). INTRO 

1. Bias/Variance  
The bias/variance dilemma in formal neural networks  
The first part, in collaboration with Stuart Geman, does not offer a new method or algorithm but rather aims at bringing to light general problems and limitations encountered by statistical learning processes, especially of the generalist or "nonparametric" kind. The main goal of this study is to stress the crucial importance of identifying the right format of representation and giving it priority over other concerns about the "adaptability" or power of generalization of a learning system. It became an oft-cited paper published in Neural Computation in 1992.
In this work, we addressed the issue of representation within the framework of statistical estimation theory. During the renewal of interest in connectionist models in the 1980s, the great majority of neural network methods focused on classification or estimation problems, especially regression. These generalist systems are concerned with interpolating sample data through gradual statistical approximation, without prior or built-in knowledge of the problem at hand. This is also called nonparametric inference.

Nonparametric estimation has been successful in numerous application domains and performs in some cases better than logical inference (expert systems, AI). However, it must also be asked whether this statistical approach is equally relevant to the cognitive and neurobiological domains. Does it provide an answer to the issue of neural representation? Can it solve complex cognitive problems such as invariant perception or language?

In short, when dealing with the nervous system and attempting to unravel the deep mechanisms of cognition, one may question whether the learning paradigm that drives the core of these methods is effectively as crucial as it is often claimed.

We show that the efficacy of any statistical estimator is bounded by universal limits. The mean squared error produced by an estimator (averaged over different learning data sets) decomposes into two terms, the squared bias and the variance, which clearly point to the cause of these limitations. The bias term represents the average discrepancy between the system's predictions and the expected response, while the variance term represents the fluctuations of these predictions across the various sample sets.
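This decomposition can be checked numerically. The following sketch (an illustrative Monte Carlo experiment of my own, not the paper's setting) averages a deliberately biased estimator of a Gaussian mean over many "training sets" and verifies that the mean squared error splits exactly into squared bias plus variance.

```python
import random

random.seed(0)
TRUE_MEAN = 2.0
N_SAMPLES, N_TRIALS = 10, 20000

def shrunk_mean(data, lam=0.5):
    """A deliberately biased estimator: the sample mean shrunk toward 0."""
    return lam * sum(data) / len(data)

# draw many independent "training sets" and record the estimate on each
estimates = [shrunk_mean([random.gauss(TRUE_MEAN, 1.0) for _ in range(N_SAMPLES)])
             for _ in range(N_TRIALS)]

mean_est = sum(estimates) / N_TRIALS
bias_sq = (mean_est - TRUE_MEAN) ** 2                              # systematic discrepancy
variance = sum((e - mean_est) ** 2 for e in estimates) / N_TRIALS  # fluctuation across sets
mse = sum((e - TRUE_MEAN) ** 2 for e in estimates) / N_TRIALS

# the decomposition MSE = bias^2 + variance holds (up to rounding error)
assert abs(mse - (bias_sq + variance)) < 1e-6
```

Shrinking trades variance for bias: with lam = 1 the estimator is unbiased but has four times the variance measured here, which is the dilemma in miniature.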

Our conclusion is that statistical estimation can yield good results in "simple" problems that do not require the system to extrapolate to unknown examples in a nontrivial way. Otherwise, if one expects a high power of generalization after the first interpolation phase, we claim that it becomes necessary to design into the system an adequate format of representation for the data and that the quality of this representation must have priority over tuning the learning parameters or selecting the examples. This is especially critical in "complex" problems of a cognitive nature.

This view is now widely accepted, yet it is still hoped in many cases that learning methods can actually discover by themselves the relevant data representation. In particular, these methods should be able to extract from the examples hidden component features that are characteristic of their class. This hope, again, has to be measured against the complexity of the problem.

Publications

Cited over 2400 times (as per Google Scholar):
Geman, S., Bienenstock, E. & Doursat, R. (1992) Neural networks and the bias/variance dilemma. Neural Computation 4(1): 1-58. PAPER

Doursat, R. (1991) Contribution à l'étude des représentations dans le système nerveux et dans les réseaux de neurones formels. PhD Thesis, Université Pierre et Marie Curie (Paris 6). PART 3 

2. Elastic Matching  
Elastic matching for handwritten character recognition  
We put the first part's recommendation into practice in the second part by designing a handwritten character classification method based on order-2 correlations. Images are represented by 2-D deformable lattices instead of unstructured lists of pixels, while the "distance" between two input images is defined as the cost-functional of a graph-matching process. The success rates achieved by this criterion are superior to those of feed-forward neural classifiers, which are implicitly based on Hamming or Euclidean metrics.
In this part we described a concrete implementation of a shape recognition model inspired by von der Malsburg (1981), who proposed an original representation format for the nervous system based on an order-2 neural coding, in which relationships between objects in a visual scene are represented by temporal correlations between neuronal activities. The present study illustrates both the importance of a problem-specific format of representation and the particular appropriateness of higher-order codes in this matter.

Von der Malsburg's theory was prompted by the realization that the average-rate-based format inevitably makes complex cognitive problems, including visual perception, very difficult or even impossible to handle. In short, the classical Hebbian "cell assembly" format lacks the structure necessary to code for relational information (see Introduction). In this framework, therefore, the only way to represent relationships and avoid the "superposition catastrophe" (see the "red circle/green triangle" example of the introduction) is to dedicate new cells to each new combination of features.

This solution is obviously not realistic and it is much more natural to code relations with temporal correlations among cellular activities rather than with new cells. Moreover, through a Hebbian-like positive feedback loop on the millisecond timescale, these correlations reinforce the synaptic connections that support them. Therefore, instead of correlations, a relational structure can be equivalently represented by dynamical links.

  
We developed here a shape-recognition algorithm directly motivated by these principles but also simplified and made more computationally efficient for practical purposes. The core of the model is a template matching operation between two labeled graphs that are relational representations of two shapes. This graph-matching procedure is construed as a functional optimization that searches the best (oriented) mapping from the nodes of the first object to the nodes of the second object.

The cost-functional contains two constraint terms, one penalizing the elastic deformation of edges between neighboring nodes, the other penalizing mismatches between the labels carried by pairs of mapped nodes. The best mapping is thus a trade-off between structural integrity and feature integrity.
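A minimal sketch of such a two-term cost-functional (toy data structures and weights of my own choosing, not the thesis implementation): two shapes are labeled graphs on a small lattice, and the best mapping is found by exhaustive search, trading edge deformation against label mismatch.

```python
from itertools import permutations

# Each shape is a labeled graph: node -> (lattice position, feature label).
# SHAPE_B is SHAPE_A with its last node displaced (an "elastic" deformation).
SHAPE_A = {0: ((0, 0), "end"), 1: ((1, 0), "line"), 2: ((2, 0), "curve"), 3: ((2, 1), "end")}
SHAPE_B = {0: ((0, 0), "end"), 1: ((1, 0), "line"), 2: ((2, 0), "curve"), 3: ((2, 2), "end")}
EDGES = [(0, 1), (1, 2), (2, 3)]  # neighborhood structure, shared by both lattices

def cost(src, dst, mapping, w_elastic=1.0, w_label=2.0):
    """Two-term cost: elastic edge deformation + feature-label mismatches."""
    elastic = 0.0
    for i, j in EDGES:
        (x1, y1), (x2, y2) = src[i][0], src[j][0]
        (u1, v1), (u2, v2) = dst[mapping[i]][0], dst[mapping[j]][0]
        # squared difference between each edge vector and its image
        elastic += ((x2 - x1) - (u2 - u1)) ** 2 + ((y2 - y1) - (v2 - v1)) ** 2
    mismatch = sum(src[i][1] != dst[mapping[i]][1] for i in src)
    return w_elastic * elastic + w_label * mismatch

def best_match(src, dst):
    """Exhaustive search over node mappings (tractable only for toy graphs)."""
    candidates = [dict(zip(src, p)) for p in permutations(dst)]
    best = min(candidates, key=lambda m: cost(src, dst, m))
    return best, cost(src, dst, best)

mapping, c = best_match(SHAPE_A, SHAPE_B)
assert mapping == {0: 0, 1: 1, 2: 2, 3: 3}     # correct correspondence is found...
assert c > 0.0                                 # ...at the price of some deformation
assert best_match(SHAPE_A, SHAPE_A)[1] == 0.0  # matching a shape to itself is free
```

On real data the search is of course not exhaustive; and, as in the thesis, the minimum cost itself can serve as a pseudo-metric for nearest-neighbor classification against stored templates.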

We carried out numerical experiments on a public database of 1200 16x16-pixel handwritten digits and compared the results with traditional feed-forward layered network methods. Defining our own graph-matching cost-functional as a "pseudo-metric" in image space, we applied a simple nearest-neighbor decision criterion and found significantly lower error rates. This demonstrated that designing an appropriate representation format (here, 2-D relational) is often more important than fine-tuning the parameters of a learning process.

I later created a new version of 2-D graph matching based on temporal correlations and phase-locking in spiking neural networks (Doursat et al. 1995).

References

von der Malsburg, C. (1981) The correlation theory of brain function. Internal report, MPI for Biophysical Chemistry, Göttingen.

Publications

Bienenstock, E. & Doursat, R. (1994) A shape-recognition model using dynamical links. Network: Computation in Neural Systems 5(2): 241-258 [18 pages]. PAPER

Doursat, R. (1991) Contribution à l'étude des représentations dans le système nerveux et dans les réseaux de neurones formels. PhD Thesis, Université Pierre et Marie Curie (Paris 6). PART 2 

3. Synfire Chains  
An epigenetic development model of the nervous system  
The third part approached the issue of neural representation from a more abstract and speculative viewpoint. We wanted to address the compositionality of cognitive processes and language, i.e., the faculty of assembling elementary constituent features into complex representations. Answering Fodor and Pylyshyn's (1988) influential criticism about the lack of structured representations in neural networks, we showed that compositionality can arise from the simultaneous self-organization of connectivity and activity in an initially random network.
Already apparent in invariant perceptual tasks, where objects are categorized according to the relationships among their parts, compositionality is particularly striking in language and is also referred to as constituency. Language is often described as a "building block" system, in which the operative objects are symbols endowed with an internal combinatorial structure. This structure allows elementary symbols to be assembled in many different ways into complex symbols, whose meaning is sensitive to their internal arrangement. Again, chemistry provides a useful metaphor if we compare symbols with molecules and symbolic composition with the various possible reactions and products that depend on the geometrical structure of molecules.

In this context, the issue of an appropriate format of representation of mental entities is of particular importance and our proposal is that the nervous system uses a higher-order code to represent linguistic entities.

The goal of the present neural model was to show that compositionality can arise from the gradual ontogenetic development of the nervous system during the early stages of synaptogenesis. By this, we adhered to Chomsky's conception that language actually "grows" and matures in children's brains like a limb or an organ. This claim might sound surprising at first but is in accordance with well-known observations and general principles of neural development.

The visual system and many other cortical areas display striking regularities in their connectivity, which self-organize during fetal and postnatal development (with or without input from external stimuli) and account for their functional specialization. Similarly, it is postulated here that the faculty of language (as opposed to any specific language) is supported by specialized neural pathways that develop through a feedback interaction between neuronal activities and synaptic efficacies.

Starting from an initially disordered network with low random activity, certain synaptic connections are gradually selected and strengthened to the detriment of others. This focusing of the connectivity is also accompanied by a gradual increase and durability of correlated firing. Connections and correlations reinforce each other through heterosynaptic cooperation, while the global stability of the network is maintained through a constraint of competition.

On the whole, complex spatiotemporal patterns of spiking activity spontaneously emerge within the network. Such patterns have been experimentally detected in mammalian species and termed synfire chains (Abeles, 1982). We claim that they constitute the elementary components, or building blocks, of compositionality: they have the required properties discussed above, i.e., an internal combinatorial structure that allows them to assemble in multiple ways, thereby opening the way to a virtually infinite hierarchy of combinations.
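The focusing dynamics can be caricatured in a few lines (a deliberately simplified sketch of my own, not the thesis model): a supralinear Hebbian-like update amplifies already-strong synapses, while a normalization of each neuron's total outgoing weight implements the competition that keeps the network globally stable.

```python
import random

random.seed(1)
N = 8  # neurons

def normalize(row):
    """Competition: keep each neuron's total outgoing weight equal to 1."""
    s = sum(row)
    return [w / s for w in row]

# initially disordered connectivity: random weights, normalized per neuron
W = [normalize([random.random() for _ in range(N)]) for _ in range(N)]

def step(W, lr=0.5):
    """Hebbian-like positive feedback (supralinear in w), then competition."""
    return [normalize([w + lr * w * w for w in row]) for row in W]

def focus(W):
    """Mean strongest-synapse weight: ~1/N when diffuse, -> 1 when focused."""
    return sum(max(row) for row in W) / N

before = focus(W)
for _ in range(500):
    W = step(W)
after = focus(W)

# selection: each neuron ends up dominated by a single strong pathway
assert after > 0.99 > before
```

The supralinear term stands in for the correlation-driven reinforcement, and the row normalization for the competitive constraint; together they turn a diffuse random graph into a few strong pathways, which is the qualitative behavior described above.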

For a more recent version of this study, see the SYNDEVO project.

References

Abeles, M. (1982) Local Cortical Circuits. Springer.

Fodor, J. & Pylyshyn, Z. (1988) Connectionism and cognitive architecture: A critical analysis. Cognition 28: 3-71.

Publications

Doursat, R. & Bienenstock, E. (2006) Neocortical self-structuration as a basis for learning. 5th International Conference on Development and Learning (ICDL 2006), May 31-June 3, 2006, Indiana University, Bloomington, IN. IU, ISBN 0-9786456-0-X. PAPER SLIDES POSTER

Doursat, R. (1991) Contribution à l'étude des représentations dans le système nerveux et dans les réseaux de neurones formels. PhD Thesis, Université Pierre et Marie Curie (Paris 6). PART 3