
Attractor Spaces as Modules:

 A Semi-Eliminative Reduction of Symbolic AI to Dynamic Systems Theory

 

Teed Rockwell

2419A Tenth St

Berkeley, CA 94710

510/ 548-8779 Fax 548-3326

74164.3703@compuserve.com

 

 

 

 


 

 


 

Abstract: I propose a semi-eliminative reduction of Fodor’s concept of module to the concept of Attractor Basin, which is used in Cognitive Dynamic Systems Theory (DST). I show how Attractor basins perform the same explanatory function as modules in several DST-based research programs. Attractor basins in some organic dynamic systems have even been able to perform cognitive functions which are equivalent to the If/Then/Else construct in the computer language LISP. I suggest directions for future research programs which could find similar equivalencies between organic dynamic systems and other cognitive functions. Research that went in these directions could help us discover how (and/or if) it is possible to use Dynamic Systems Theory to more accurately model the cognitive functions that are now being modeled by subroutines in Symbolic AI computer models. If such a reduction of subroutines to basins of attraction is possible, it could free AI from the limitations that prompted Fodor to say that it was impossible to model certain higher level cognitive functions.

 

 

What is this Thing Called Modularity?

 

                  To some degree, Fodor's claim that Cognitive science divides the mind into modules tells us more about the minds doing the studying than the mind being studied. The knowledge game is played by analyzing the object of study into parts, and then figuring out how those parts are related to each other. This is the method regardless of whether the object being studied is a mind or a solar system. If a module is just another name for a part, then to say that the mind consists of modules is simply to say that it is comprehensible. Fodor comes close to acknowledging this in the following passage.

 

The condition for successful science (in physics, by the way, as well as psychology) is that nature have joints to carve it at: relatively simple subsystems which can be artificially isolated and which behave, in isolation, in something like the way that they behave in situ. (Fodor 1983 p.128)

 

                 If this were really as unconditionally true as Fodor implies in this sentence, Fodor's Modularity of Mind would have been a collection of tautologies. In fact, Fodor goes to great lengths to show that his central claims are not tautologies, but rather reasonable generalizations from what has been discovered in the laboratory at this point. Fodor gets more specific in passages like this one:

 

       One can conceptualize a module as a special purpose computer with a proprietary data base, under the conditions that a) the operations that it performs have access only to the information in its database (together of course with specifications of currently impinging proximal stimulations) and b) at least some information that is available to at least some other cognitive processes is not available to the module. (Fodor 1985 p. 3)

 

 Fodor sums up these two conditions with the phrase "informationally encapsulated", and adds that modules are by definition also "domain specific", i.e. each module deals only with a certain domain of cognitive problems. He also claims that what we call higher cognitive functions cannot be either informationally encapsulated or domain specific. Instead, these higher processes must be "isotropic" (i.e. every fact we know is in principle relevant to their success) and "Quinean" (i.e. they must rely on emergent characteristics of our entire system of knowledge). Roughly speaking, “isotropic” is the opposite of “informationally encapsulated” and “Quinean” is the opposite of “domain specific”. Because modular processes are by definition neither Quinean nor isotropic, there is "a principled distinction between cognition and perception" (ibid.).

 

This is more than a little ironic because Artificial Intelligence usually studies cognition, not perception. When it does study perception, it does so by describing it as a form of cognition. But Fodor claims that “Cognitive Science has gotten the architecture of the mind exactly backwards” (Fodor 1985 p.497) when it sees perception as a form of cognition. Thinking beings are by definition capable of responding flexibly and skillfully to a variety of different situations.  Perception, according to Fodor, is by its very nature reflexive and rigid. It consists of unthinking responses to the immediate environment, over which our conscious rational minds have essentially no control. These kinds of processes are much easier to model than “what is most characteristic and most puzzling about the higher cognitive mind: Its non-encapsulation, its creativity, its holism, and its passion for the analogical.”(ibid.) Fodor consequently claims that, given the conceptual tools of cognitive science, it is not possible to have a science of the higher "Quinean-Isotropic" cognitive functions, such as thought or belief.

 

This analysis of the current state of Cognitive Psychology is also backed up with considerable scientific detail by Uttal 2001. Uttal specifically says that he believes Fodor’s distinction between perceptual faculties, which are modular, and higher order faculties, which are not, is essentially correct. (p.115)  Nevertheless, Uttal does raise some points which could be used to justify  misgivings about any kind of modular psychology. He points out that the reason that it is easier to correlate brain function with perception is mainly a function of the nature of perception, not of the brain itself.

 

. . .the dimension of each sensory modality is well defined. For example, vision is made up of channels or modules sensitive to color, brightness, edges and so on. . . .Because a thought may be triggered by a spoken word, by a visual image or even by a very abstract symbol, we can establish neither its links to the physical representation nor its anatomical locus. (p.114)

 

In other words, when you can’t precisely define the nature of your stimulus, it will be difficult to replicate a consistent stimulus-response connection. The reason that stimulus-response connections can be established between brain states and perceptions is that everybody knows what a ray of light is, and there are precise quantitative ways of measuring its characteristics. It is therefore not surprising that we can produce precise variations in neural behavior by precisely varying the ray of light. But the existence of these replicable S-R connections need not imply that these variations are being produced entirely by an autonomous module. As Uttal points out, the fact that different parts of the brain influence different mental or behavioral processes does not require us to accept “the hypothesized role of these regions as the unique locations of the mechanisms underlying these processes” (p. 11). Just because a certain kind of neural activity is necessary for perception does not mean that it is sufficient. There is also evidence which gives reason to question the modular hypothesis. Uttal cites research which “argued strongly for the idea that even such an apparently simple learning paradigm as classical conditioning is capable of activating widely distributed regions of the brain.” (p.13). If ‘simple’ stimulus-response connections are not modular, is there any reason to think anything else is?

 

The case for perceptual modularity looks even weaker when we shift from Cognitive Psychology to Artificial Intelligence. If the brain regions being studied by Cognitive Psychology were really informationally encapsulated and domain specific perceptual modules, it ought to be possible to build machines that duplicated their functions using the modular architecture of computer science. Unfortunately, although classical Artificial Intelligence has had some success in duplicating what are often thought of as higher brain functions, its biggest failures have been its attempts to understand perception, as Hubert Dreyfus has documented in great detail (Dreyfus 1972/1994). If Fodor and Dreyfus are both right, this would mean that Cognitive Science is suffering from a serious lack of consensus in two of its branches. Neuropsychology cannot localize the higher functions which can be mechanically duplicated by the modular architecture of Artificial Intelligence. And Artificial Intelligence cannot use modular architecture to duplicate the perceptual functions which Neuropsychology claims are localized modules. It seems that even Fodor’s final exclamation of “modified rapture” was too optimistic.

 

This paper, however, will be an attempt to offer a hopeful alternative to this gloomy picture. Fodor admits that this limitation may only be true "of the sorts [of computational models] that cognitive sciences are accustomed to employ" (Fodor 1983 p.128). An examination of the presuppositions of those computational models may reveal this to be a limitation of only one particular kind of cognitive science. The distributed connectionist systems have so far had the most success with replicating perception in ways that are commonly thought of as being “non-modular” in some sense. I will argue, however, that the cognitive abilities of these and other dynamic systems may be modular in another sense, which need not share the limitations that Fodor thinks are essential to modular architecture.

 

Fodor and the Symbolic Systems Hypothesis

 

                  I believe that the limitations described by Fodor do hold for the paradigm that gave birth to cognitive science, which is often called the symbolic systems hypothesis. Its most fundamental claim is that a mind is a computational device that manipulates symbols according to logical and syntactical rules. All computers and computer languages operate by means of symbolic systems, so another way of phrasing this claim is to say that a mind is a kind of computer. The symbolic systems hypothesis is still basically alive and well, but now that it is no longer universally accepted it is often given disparaging nicknames, like Haugeland's "GOFAI" (for good old fashioned artificial intelligence) or Dennett's "high church computationalism". Fodor remains the most articulate preacher of the gospel of high church computationalism, and when his concept of modularity goes beyond the tautologous claim that minds are analyzable, it almost always brings in strong commitments to the claim that a mind is some kind of computer. Fodor’s modules are really what computer programmers call subroutines, which is why he defines modules in the quote above as "special purpose computers with proprietary data bases." GOFAI scientists model cognitive processes by breaking them down into subroutines by means of block diagrams, then breaking those subroutines down into simpler subroutines and so on until the entire cognitive process has been broken down into subroutines that are simple enough to be written in computer code. (See Minsky 1985 to see this process in action.) Most of what Fodor said about modules in Fodor 1983 follows from the fact that subroutines are domain specific and informationally encapsulated (i.e. they are each designed for specific tasks, and only communicate with each other through relatively narrow input-output connections). At the time, Fodor believed (and apparently still believes) that GOFAI was and is the only game in town for AI.
When Fodor says that it is impossible to model Quinean and isotropic properties, what he really means is that it is impossible to model them with the conceptual tools of GOFAI, and in this narrow sense of "impossible" he is probably right.

 

Dynamic Systems Theory as an Alternative Paradigm

 

                  This paper will deal with whether there are similar constraints on the new sorts of models currently available to cognitive science, which were not available when Modularity of Mind was written.   Recent developments in non-linear dynamics have made it possible to use physics equations to describe systems which have the kind of flexibility that seems to justify calling them cognitive systems. This has resulted in a branch of cognitive science called Dynamic Systems Theory (DST). There is now much controversy over whether it is possible for the principles of DST to replace or supplement the computer inspired view of cognition that is often called the symbolic systems hypothesis, or Good Old Fashioned Artificial Intelligence (GOFAI).

 

                  Because Fodor's modularity theory reveals both the strengths and the weaknesses of the symbolic systems hypothesis, it provides excellent criteria for the evaluation of the relative merits of DST and GOFAI. Fodor claims, I believe correctly, that GOFAI explains cognitive behavior by dividing a system up into interacting modules. In order to be on equal footing with the symbolic systems hypothesis, DST must enable us to account for the functions and properties that Fodor calls modular. And if DST is also able to account for Quinean and isotropic mental processes (or show why the distinction between modular and Quinean-isotropic processes is spurious), it would be clearly superior to the symbolic systems hypothesis, for which these processes are, by Fodor's own admission, a complete mystery.

 

                  This paper will describe a concept used in DST which I think has many significant isomorphisms with the concept of module. These isomorphisms may enable us to reduce the concept of module to this concept from DST when we are talking about organic systems. Computers, of course, have real modules, because we build them that way. But we may decide that artificial intelligence differs from organic intelligence because the former approximates this feature of organic systems with a brittle modular metaphor that is significantly different from the real thing. A reductive account of the properties that Fodor calls modular would not enable us to accept everything he (or anyone else) says about modules. Whenever a new theory replaces an old one, it does so by contradicting some parts of the old theory and accepting others. If it accepts a substantial part of the old theory, the new theory is called a reduction. If it rejects most of the old theory, we say that the new theory eliminates the old theory. The best contemporary theories of reduction claim that there is a continuum connecting these two extremes of elimination and reduction (see Bickle 1998). We decide where on the continuum a particular theory replacement belongs by comparing the old and the new theory, and seeing how many and what sort of isomorphisms exist between the two. I will not try to definitively answer whether the example I am discussing is an elimination or a reduction, partly because this is a question that can only be answered by future research, and partly because I believe that attempting to make this distinction completely sharp is more misleading than useful. Hopefully, however, my analysis will give some sense of where on the reduction/elimination continuum we might place the relationship between DST and the modular structures of the symbolic systems hypothesis.

 

A Brief Introduction to DST

In a multidisciplinary paper, it is frequently necessary to include a brief summary of a science which takes a lifetime to fully understand. Such summaries will sometimes belabor what is obvious, other times oversimplify ideas that have important complications, and still have parts which will be difficult to understand for many educated and intelligent people. The following summary will probably have all of those faults, but it will, at least, be focused towards those factors which are relevant to the philosophical concerns of this paper. Its goal will be the understanding of the essential nature of what the mathematical equations are measuring, rather than of the equations themselves.

 

                  A dynamic system is created when conflicting forces of various kinds interact, then resolve into some kind of partly stable, partly unstable, equilibrium. The relationships between these forces and substances create a range of possible states that the system can be in. This set of possibilities is called the state space of the system. The dimensions of the state space are the variables of the system. Every newspaper contains graphs which plot the relationship between two variables, such as inflation and unemployment, or wages and price increases, or crop yield and rainfall etc.  A graph of this sort is a representation of a set of points in a two-dimensional space. It is also possible to make a graph which adds a third variable and thus represents a three dimensional space, using the tricks of perspective drawing. Because our visual field has only three dimensions, that is the highest number of variables that we can visualize in a computational space. But the mathematics is the same regardless of how many variables the space contains. The state space of the sort of dynamic system studied by cognitive scientists will have many more dimensions than this, each of which measures variations in a different biologically and/or cognitively relevant variable: Air pressure, temperature, concentration of a certain chemical, even (surprise!)  a position in physical space.

 

However, although these variables define the range of possibilities for the system, only a few of these possibilities actually occur. To study a dynamic system is to look for mathematically describable patterns in the way the values of the variables change and fluctuate within the borders of its state space. The patterns that a system tends to settle into are called attractors, basins of attraction, or invariant sets. I believe that these invariant sets have the potential to provide a reductive explanation for what Fodor calls modules, i.e. that science may eventually decide that modules in dynamic systems really are basins of attraction, just as light really is electromagnetic radiation.

 

  In Port and Van Gelder 1995, an invariant set is defined as "a subset of the state space that contains the whole orbit of each of its points. Often one restricts attention to a given invariant set, such as an attractor, and considers that to be a dynamical system in its own right." (p.574) In other words, an invariant set is not just any set of points within the state space of the system. When several interrelated variables fluctuate in a predictable and law-like way, the point that describes the relationship between those variables travels through state space in a path which is called an orbit. The set of points which contains that orbit is called an invariant set because the variations in that part of the system repeat themselves within a permanent set of boundaries. The second sentence in the above quote from Port and Van Gelder is encouraging for our project. If an invariant set can be considered as a dynamic system in its own right, this seems isomorphic with Fodor's claim that modules are domain specific and informationally encapsulated.
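The definition of an invariant set can be illustrated with a minimal sketch in Python. This is only an illustration and is not drawn from Port and Van Gelder: it assumes a frictionless oscillator (dx/dt = v, dv/dt = -x) as the dynamic system, and the names orbit and radius are hypothetical. The state space has two dimensions, position and velocity. A point that starts on the unit circle has an orbit that never leaves that circle, so the circle contains the whole orbit of each of its points.

```python
import math

def orbit(x0, v0, dt=0.1, steps=200):
    """Follow the exact flow of dx/dt = v, dv/dt = -x from (x0, v0).

    The exact solution is a rotation in the (x, v) plane, so each
    step rotates the current state by the angle dt.
    """
    c, s = math.cos(dt), math.sin(dt)
    x, v = x0, v0
    points = [(x, v)]
    for _ in range(steps):
        x, v = x * c + v * s, -x * s + v * c
        points.append((x, v))
    return points

def radius(point):
    """Distance of a state from the origin of the state space."""
    return math.hypot(point[0], point[1])

# Start on the unit circle and record how far each visited state
# lies from the origin.
points = orbit(1.0, 0.0)
radii = [radius(p) for p in points]
print(min(radii), max(radii))
```

The smallest and largest radius agree up to rounding error, which is what it means for the circle to be invariant. Note that this circle is an invariant set but not an attractor: nearby orbits stay on their own circles rather than being drawn toward this one.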

 

                  Port and Van Gelder define "attractor" as " the regions of the state space of a dynamical system toward which trajectories tend as time passes. As long as the parameters are unchanged, if the system passes close enough to the attractor, then it will never leave that region." (p.573). The conditional clause of the second sentence holds the key to the cognitive abilities of dynamic systems. For of course the parameters of every dynamic system do change, and these changes cause the system to tend towards another attractor, and thus initiate a different complex pattern of behavior in response to that change.
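Both halves of this definition can be seen in a one-variable sketch. The system below is an assumption chosen for simplicity, not an example from the literature cited here: its single variable x obeys dx/dt = -(x - a), where a is the parameter, and the function name settle is hypothetical. As long as a is unchanged, every trajectory tends toward x = a and stays there. When a is changed, the same states tend toward a different attractor.

```python
def settle(x, a, dt=0.01, steps=2000):
    """Euler-integrate dx/dt = -(x - a) and return the final state."""
    for _ in range(steps):
        x += -dt * (x - a)
    return x

# With the parameter held at a = 1.0, very different starting
# states all end up next to the same attractor.
starts = [-3.0, 0.5, 5.0]
finals_before = [settle(x0, a=1.0) for x0 in starts]

# Changing the parameter to a = -2.0 moves the attractor, and the
# settled states now travel toward the new one.
finals_after = [settle(x, a=-2.0) for x in finals_before]

print(finals_before)
print(finals_after)
```

The design choice here is to keep the equation trivial so that the role of the parameter is the only thing on display. Systems of cognitive interest have many variables and many parameters, but the relation between a parameter change and a change of attractor is the same in kind.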

 

                  The simplest example of an attractor is an attractor point, such as the lowest point in the middle of a pendulum swing. The flow of this simple dynamic system is continually drawn to this central attractor point, and after a time period determined by a variety of factors (the force of the push, the length of the string, the friction of the air etc.) eventually settles there. A slightly more complex system would settle into not just an attractor point but an attractor basin, i.e. a set of points that describes a region of that space. These attractors are called basins of attraction because the system "settles" into one of these patterns as its parameters shift, not unlike the way a rolling ball will settle into a basin on a shifting irregular surface. A soap bubble[1] is the result of a single fairly stable attractor basin, caused by the interaction of the surface tension of the soap molecules with the pressure of the air on its inside and outside. Because a spherical shape has the smallest surface area for a given volume, uniform pressure on all sides makes the bubble spherical. But when the air pressure around the soap bubble changes, e.g. when the wind blows, the shape of the bubble also changes. The bubble then becomes a simple, easily visible dynamic system of a sort, marking out a region in space that changes as the tensions that define its boundaries change. To see how these same principles can eventually reach a level of complexity that makes them a plausible embodiment of thought and consciousness, imagine the following developments.

 

                  1) The soap bubble could get caught up in an air current that flows regularly so that, even though the soap bubble is not staying the same shape, it changes shape in a repeating pattern. As I mentioned earlier, this pattern is often called an orbit, because the trajectory that describes this repeating change forms something like a loop traveling through the state space of the system. Systems that settle into orbits are usually more complicated than those which settle only into attractor basins which are temporally static, particularly when those orbits follow patterns that are more complicated than mere loops.

 

2) Instead of having the soap bubble fluctuate in three dimensional space, imagine that it is fluctuating in a multi-dimensional computational state space.  As I mentioned earlier, state space is not limited to the three dimensions of physical space, for it can have a separate dimension for every changeable parameter in the system. The most popular example in cognitive science of a system that operates within a multi-dimensional state space is a connectionist neural network. Connectionist nets consist of arrays of neurons, and each neuron in a given array has a different input or output voltage. Each of those voltages is seen as a point along a dimension of a Cartesian coordinate system, so that an array of ten neurons, for example, would describe a ten-dimensional space. But in other kinds of dynamic systems analyses, any variable parameter can be a dimension in a Cartesian computational space. Our friend the soap bubble can be interpreted as a visual representation of the air pressure coming from each of the three dimensions in physical space, if all other background conditions remain stable. And when the various interacting forces and variables in a dynamic system are designated as dimensions in a multi-dimensional space, it becomes possible to predict and describe the relationships between different attractor basins in that system. This is the most relevant disanalogy between a soap bubble and the more complicated dynamic systems studied by cognitive scientists.  Because:

 

                  3) A soap bubble has really only one stable attractor basin. Although the attractor space that produces a soap bubble is fairly flexible, the bubble pops and dissolves if too much pressure is put on it from any one side. But in certain systems, there are fluctuations of the variables which can cause the system to settle into a completely different attractor space. These systems thus consist of several different basins of attraction, which are connected to each other by means of what are called bifurcations.  This makes it possible for the system to change from one attractor basin to another by varying one parameter or group of parameters.

 

                  This propensity to bifurcate between different attractor basins is what differentiates relatively stable systems (like soap bubbles) from unstable systems (like living organisms or ecosystems). In this sense, all living systems are unstable, because they don’t settle into an equilibrium state that isolates them from their surroundings. Organisms are constantly taking in food, breathing in air, and excreting waste products back into the environment they are interacting with. We usually think of unstable processes as formless and incomprehensible, but this is often not the case. Certain unstable systems have a tendency to settle into patterns which still fluctuate, but fluctuate within parameters that are comprehensible enough to produce an illusion of concreteness. When the various forces that constitute the processes shift in interactive tension with each other, a basin of attraction destabilizes in a way that makes the system bifurcate, i.e. shift to another basin of attraction. This kind of system is sometimes called multi-stable, because its changes between various basins of attraction are predictable and (to some degree) comprehensible.
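A shift between basins of attraction can be sketched with a standard textbook system, which I assume here purely for illustration; it is not taken from any of the research discussed in this paper, and the function name settle is hypothetical. The variable x obeys dx/dt = x - x^3 + h. When the parameter h is zero there are two attractors, at x = -1 and x = +1, and the starting state decides which one the system settles into. When h is raised above roughly 0.385 the left-hand attractor ceases to exist, and a system that was resting there moves to the remaining one.

```python
def settle(x, h, dt=0.01, steps=5000):
    """Euler-integrate dx/dt = x - x**3 + h and return the final state."""
    for _ in range(steps):
        x += dt * (x - x**3 + h)
    return x

# Two basins when h = 0: states left of zero settle near -1,
# states right of zero settle near +1.
left = settle(-0.2, h=0.0)
right = settle(0.2, h=0.0)

# Raising h to 0.5 destroys the left attractor, so the system
# that was resting near -1 crosses over to the right.
pushed = settle(left, h=0.5)

# Restoring h to 0 does not send it back: it now sits in the
# right-hand basin and settles near +1.
released = settle(pushed, h=0.0)

print(left, right, pushed, released)
```

Varying a single parameter has moved the system from one basin to another, and the move persists after the parameter returns to its old value. Which basin the system occupies therefore records something about its history, which is one reason such systems are candidates for doing cognitive work.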

 

                  My claim is that, in a system that can shift between basins of attraction in a biologically viable way, the attractor basins can be seen as functionally isomorphic with the modules that are fundamental to GOFAI cognitive systems. There is a lot of experimental work that supports this possibility. We will first consider some work on infant and animal locomotion, in which some experimenters identify attractor basins with modules, and alter the concept of module significantly in doing so. However, because their research is measuring abilities which are not ordinarily thought of as cognitive, this work provides only an important first step in the direction I am suggesting. I will then show how neurobiologist Walter Freeman has used the concepts of bifurcation and attractor landscapes to explain how olfactory processing in the rabbit brain can produce perceptual categories. Because Fodor has argued that perception is the cognitive ability that is best duplicated by modular architecture, this shows that DST can provide an alternative to some of the most important information processing models that are the basis of GOFAI systems. However, the fact that DST can be used to model perception does not necessarily show that it will be equally effective in modeling the “higher level” cognitive processes that are the basis of rational inference. There is, however, other work on animal motion which indicates that bifurcation between attractor basins can sometimes significantly resemble the switching between possible branches of decision trees, which is the fundamental cognitive process of AI computer languages such as LISP. I therefore propose that one of the most fruitful directions for future research would be to determine whether dynamic systems are capable of duplicating all the types of decision-making performed by computers. The work that has been done so far seems to indicate that the answer might be “yes”.

 

Thelen and Smith on Modularity Without Modules

 

In Thelen and Smith 1994, the authors argue that their research on infant locomotor development gives evidence that “cognitive development is equally modular, heterochronic, context dependent, and multidimensional.” (p.17) Not surprisingly, this discussion will focus on their claim that cognitive development is modular. This claim is proposed as an alternative to the idea that infant locomotion develops to maturity by the gradual unfolding of what is called a Central Pattern Generator (or CPG) i.e. a unified program stored in the brain or DNA that controls the motor processes from a central location in the nervous system. Although there was some evidence that such a thing existed in cats, the behaviors that were isolated in cats during the search for a CPG controlled only a part of what is essential for locomotion. Experimenters were able to isolate the spinal cord neural firings from both the cat’s brain and from perceptual influences, and thus produce a “spinal cat” that could walk on a treadmill if supported. But Thelen and Smith (hereinafter T&S) argue that this set of behaviors was only a “module” that could not produce walking behaviors without the help of several other “modules”.  A spinal cat, for example, cannot stand up without assistance, or reorganize neuromuscular patterns to deal with terrain that was more irregular than a treadmill.

 

When T&S studied the development of walking behaviors in human infants, they discovered that the separate components necessary for walking appeared and reappeared at different times in the infant’s life, and in response to different environmental stimuli. For example, it is widely acknowledged that newborns have the ability to make coordinated step-like movements when held erect. This ability disappears at about two months, and does not reappear until the infant learns to walk many months later. T&S discovered, however, that even after the ability to make these step-like movements had supposedly disappeared, these infants would still occasionally make them under certain conditions, such as lying on their backs, or walking on treadmills, or a change in emotional mood. (pp.10-11) This vital fragment of the ability to walk was a very early part of the infant’s repertoire, which was eventually assembled together with other behavioral components to make walking possible. Walking did not emerge because of the switching on of a Central Pattern Generator. T&S’s conclusion was that locomotion in humans and non-human vertebrates had “homogeneity of process from afar, but modularity. . .when viewed close up.” (p.17)

 

The work that T&S cite on non-human vertebrates was done not only with cats, but also with frogs and chickens. Stehouwer and Farel 1983 describes the discovery that the underlying neural activity for hind limb stepping was found in bullfrog tadpoles before they had hind limbs. When the tadpoles had grown vestigial limbs which were not yet fully capable of walking, it was possible to get them to perform walking movements by supporting them on dry rough surfaces in much the way that T&S supported human babies on treadmills. Watson and Bekoff 1990, revealed a similar modularity in the motor movements of chickens. There is a particular motion that a prenatal chick uses to break its shell when hatching and which it never uses again—unless the right context is created, by bending the chick’s neck forward to simulate the position of a chick embryo which has grown too big for its shell. In other contexts, the hatched chick uses a completely different set of movements: stepping with alternate legs, hopping with both legs together, even swimming when placed in water. T&S claim that all of this data supports the view that animals “can  generate patterned limb activity very early in life, but walking alone requires more—postural stability, strong muscles and bones, motivation to move forward, a facilitative state of arousal, and an appropriate substrate. Only when these components act together does the cat truly walk.” (p.20)

 

It may seem at first that T&S are attacking a straw person with these arguments. Would anyone seriously claim that every aspect of the ability to walk must be stored in a Central Pattern Generator, or deny the possibility that a CPG could rely on pre-existing muscle patterns to do its job? And would anyone deny that the CPG must have parts, and cannot be a single undifferentiated whole? And if it has parts, why shouldn’t those parts manifest at different times in the history of the organism? However, T&S are in fact criticizing a specific position commonly held by their colleagues which has serious problems. Of course everyone acknowledges that the biological processes in the nervous system cannot be completely responsible for locomotion. We can’t walk on our nerves. But T&S claim that traditionally researchers have privileged those aspects of locomotion skills which occur in the nervous system as being “a fundamental neural structure that encodes the pattern of movement” (p. 8) and consider everything else necessary for locomotion as being somehow less important. They even quote one researcher who claims that the pattern must be stored in the genes. (p. 6) They are quite right to consider this distinction as ad hoc and misleading, and to insist that the parts of the locomotion process that take place outside the brain and/or genes are every bit as important as the so-called neurological or genetic “encoder”.

 

To some degree, the question of whether locomotor development is controlled by a module in the brain is obviously an empirical one. But although data and research are clearly necessary for answering this question, they are not sufficient. There are significant disanalogies between computers and biological systems which the computer metaphor forces us to ignore, and which can render the centralized control theory dangerously unfalsifiable. Mackey 1980 argues that because “the true concept of programming transcends the centralist-peripheralist arguments. . . the term ‘central program’ is an oxymoron, and the concept unviable in the real world” (pp. 97 and 100.)  After all, no computer program completely controls anything from a central point. If it did, it would be, as Mackey points out, more like a tape or phonograph record. The cognitive power of a program comes from its ability to respond in different ways to different inputs. The instructions in the program “detail the operations to be performed on receipt of specific inputs” (ibid. p. 97), and without those inputs it would not be a program at all. One could use these facts about computer programs to respond to all of the objections that T&S raise against the CPG. One need only say that what happens in the brain and/or genes is not a complete Central Pattern Generator. CPGs should instead be seen as “generalized motor ‘schemata’, which encode only general movement plans but not specific kinematic details.” (Thelen and Smith 1994). The problem with this answer is that it can deal not only with all of T&S’s objections, but with every possible objection that anyone could ever make. Whenever an organism makes a motion, there will always be something happening in the nervous system. This version of the CPG theory enables you to call that neural activity the CPG, and everything else in the body or environment mere “kinematic details”. 
And there would be no reason you couldn’t do this regardless of the empirical results.  Clearly it is not acceptable to use a scientific theory which predetermines your answer before the data is in.

 

Why then does the distinction between program and hardware work so well when we are talking about computers? In a computer, what is going on inside the CPU is the program, and what is going on outside the CPU is obviously “peripheral” in some significant sense. Why doesn’t this distinction carry over to biology? I believe that this is only because of the way computers are made and used in our society, and that there is no comparable set of criteria that would enable us to make the distinction for biological systems. Computers of a given brand are all engineered the same, which makes it possible for the programmers to ignore the hardware and create a control structure that resides in a central location. Mackey’s description (mentioned above) says that a computer program must “detail the operations to be performed on receipt of specific inputs”. In order to specify these operations, however, the program must have a taxonomy of the possible inputs it will receive, so it can specify responses to each of them. With a computer, we can tell ourselves that if we know the central program we know how it works. The hardware never changes so it can be safely ignored. T&S point out, however, that neural activity, unlike computer programs, does not have the advantage of knowing precisely what kind of input it will receive. No two human infant bodies are alike, and the bodily structure of the infant changes radically as the infant matures. Because of these differences, radically different neurological development is needed in order to produce the same behavior in two different people. Although there are obviously things going on in the nervous system that are necessary for developing locomotor skills, studying the nervous system isn’t going to tell us the essential story if we don’t also know the peripheral inputs that the nervous system must interact with.

 

There is. . .no essence of locomotion either in the motor cortex or in the spinal cord. Indeed it would be equally credible to assign the essence of walking to the treadmill than to a neural structure, because it is the action of the treadmill that elicits the most locomotor-like behavior. (Thelen and Smith 1994 p.17)

 

We can thus see that although T&S frequently use the word “module” to refer to the components that make locomotion possible, their use of this term is very different from Fodor’s (as they explain in considerable detail on pp.34-37). They strongly emphasize that they do not mean that there is an organ in the brain that produces or controls each of these components. Furthermore, T&S’s modules, unlike Fodor’s, are neither static nor informationally encapsulated. They grow and change through time, and their borders overlap with each other. Their interactions with each other are also not hardwired, which Fodor says is an essential characteristic of modules (Fodor 1983 p.98). And most importantly, T&S’s modules do not carve a cognitive system at its functional joints. T&S’s main point is that what is happening in the nervous system, or in the muscles, or in the bones, is functionally useless until it sets up an effective equilibrium with various other parts of the body and with the particular environment the organism is interacting with. That is why no particular part of the nervous system or genes can be seen as a Central Pattern Generator. That is also why T&S refer to Fodor’s modules as “autonomous modules” to distinguish them from theirs.

 

There is nothing wrong in principle with not following Fodor’s usage of the word “module”. But I need to return to something closer to Fodor’s definition of “module” if I am to make the central philosophical point of this paper. Fodor sees a module as an organ in the brain that performs a single functional role all by itself. He would probably describe T&S’s “modules” as being fragments of modules. Consequently, if we were trying to find something in a dynamic system which could be reductively equivalent to Fodor’s autonomous modules, T&S’s modules would not be up for the job. When we look at chapter 4 of Thelen and Smith 1994, however, we see that they are providing us with a detailed alternative that might be up for the job. They are making a claim which I will describe thus: 1) When an organism interacts with its environment, the attractor basins of the resulting dynamic system perform the functions that Fodor attributes to modules. 2) In order to study these Fodorian modules, we must focus our attention not on physical space, but on state space.

 

As I mentioned earlier, the difference between a reductive identity and an outright elimination is only one of degree. When we say that the concept of light can be reduced to the concept of electromagnetic radiation, we are still acknowledging that the resulting new concept of light is very different from the old one. For the reasons described above, among many others, neither T&S nor I believe that the concept of attractor space is exactly identical to Fodor’s concept of module. This is made more obvious by the fact that T&S are more interested in what DST can do that classical cognitive models cannot: interpret change as development and growth, rather than dismissing it as ‘noise’. But a new theory can replace an old one only if it is capable of explaining both what is inexplicable and what is explicable to the old theory. T&S want to show that DST can do things that GOFAI cannot. I want to show that DST might also be able to replace GOFAI on its home turf if we identify attractor spaces with Fodorian modules. And chapter four of Thelen and Smith 1994, especially pp. 80 through 86, takes the first steps towards doing exactly that.

 

When T&S began researching the development of infant motor skills, they assumed that they could account for them by measuring the neural voltages that were being sent from the infant’s nervous system to its muscles. Unfortunately, there was no repeating pattern that could be found. It was not even possible to find a constant relationship between the voltages sent to the flexor and extensor muscles. In theory, there had to be a precise alternation between signals sent to the flexor and extensor muscles in order for the infant to move its legs. In practice it didn’t always work out that way. However, T&S were able to account for these variations with greater accuracy when they saw the infant’s locomotor skills as emerging from the interaction of several different factors, including the elasticity of the muscles and tendons, the excess body weight produced by subcutaneous fat, the length of the bones, etc. When all of these factors were combined into what T&S called a collective variable, it was possible to make sense of how the infant was learning to move its legs. The effective movement emerged because this collective variable gradually settled into “evolving and dissolving attractors” (p. 85). When the skill was fully developed, the attractor was a deep basin in the state space, i.e. only radical changes in the variables that defined the space would throw the system out of equilibrium. But the times in which the infant was still learning to walk would be mathematically described by saying that the collective variables formed a system with shallow attractors, i.e. a slight change in any variable could cause the infant to topple over. To say that the infant was learning to walk was to say that these basins of attraction were adjusting themselves so that they gradually became deeper and more stable. Any attempt to describe this process by referring to changes in only one of these variables, such as the nervous system, would be essentially incomplete. The only thing that you could identify as being the embodiment of the walking skill was the system of attractor basins that existed when all of these factors interacted in a single dynamic system. Consequently, it is these attractor spaces that must be identified as the “walking module”.
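The contrast between deep and shallow basins can be made concrete with a toy model. The following Python sketch is only an illustration under stated assumptions: it collapses the collective variable into a single number x, places it in a hypothetical Gaussian-shaped well, and asks whether a constant push of a given size drives the state out of the basin. The function name, the shape of the well, and all numerical values are assumptions made for illustration and are not taken from T&S.

```python
import math


def stays_in_basin(depth, push, dt=0.01, steps=5000, escape_at=3.0):
    """Return True if the state remains in the basin while a constant push acts.

    Hypothetical toy model (not T&S's equations): the state x sits in the well
    V(x) = -depth * exp(-x**2) and follows overdamped dynamics
    dx/dt = -V'(x) + push, integrated with simple Euler steps.
    """
    x = 0.0  # start at the bottom of the well
    for _ in range(steps):
        restoring = -2.0 * depth * x * math.exp(-x * x)  # this is -V'(x)
        x += dt * (restoring + push)
        if abs(x) > escape_at:  # the state has left the basin
            return False
    return True


# The same perturbation applied to a deep and to a shallow basin.
deep_holds = stays_in_basin(depth=2.0, push=0.5)
shallow_holds = stays_in_basin(depth=0.3, push=0.5)
```

In this sketch the deep basin absorbs the push and the shallow basin does not, which is one simple sense in which a skill that is still being learned can be toppled by a change that a mature skill would tolerate.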

 

T&S claim, I think correctly, that a complete living organism is best understood by identifying all the significant variables that constitute its behavior, both inside and outside the head, then measuring the patterns that emerge as the resulting system fluctuates from one attractor space to another. Nevertheless, what happens in the head is of course necessary (but not sufficient) for these behavioral components to interact and create a dynamic system. And there is good reason to believe that the brain itself is best understood as a dynamic system. All DST analyses are incomplete, and limiting the system being studied to parameters of brain states can often be a useful way of drawing the borders of a dynamic system. This is what neurobiologist Walter Freeman has elected to do, and like T&S, he has found that identifying mental representations and functions with attractor basins is the most effective way of understanding perception in the laboratory animals he has studied. The fact that he came to this conclusion is strong evidence that these attractor spaces are doing the work that Fodor attributed to perceptual modules.

 

Freeman and the Attractor Landscape of the Olfactory Brain.

 

                  Unlike T&S, neurobiologist Walter Freeman is willing to study cognitive functions by focusing entirely on the brain. Nevertheless, their similarities are more important than their differences, for Freeman believes that the brain itself is a dynamic system and not a system made up of mechanical modules. Furthermore, Freeman was able to use dynamic systems theory to account for a mental process that would be considered cognitive by even the most orthodox GOFAI devotee. After training rabbits to recognize different kinds of odors, and measuring the neurological signals on their olfactory bulbs, he decided that the best way to account for their discriminative abilities was with the concepts of DST.

 

To use the language of dynamics. . . there is a single large attractor for the olfactory system, which has multiple wings that form an “attractor landscape”. . . .This attractor landscape contains all the learned states as wings of the olfactory attractor, each of which is accessed when the appropriate stimulus is presented. Each attractor has its own  basin of attraction, which was shaped by the class of stimuli the animals received during training. No matter where in each basin the stimulus puts the bulb, the bulb goes to the attractor of that basin, accomplishing generalization to the class. (Freeman 2000 p.80)

 

Freeman has thus come very close to articulating the thesis of this paper: that cognition is best explained by identifying mental functions, not with organs or modules, but with attractor basins. I, however, agree with Thelen and Smith that neurological activity is not sufficient to explain cognitive functions, and therefore we need to analyze the attractor basins created by interacting variables throughout the brain/body/world nexus. This is not a criticism of Freeman’s scientific work. Every DST analysis has to focus on some variables and ignore others. Focusing on the brain is as good a choice as any, as long as one remembers that it is not the only possible choice. But I am saying that there is no essential difference between T&S’s use of these principles and Freeman’s. T&S are, I believe, correct in saying that locomotion cannot be effectively understood with a modularity theory that assumes that each locomotive module must be located in a particular spot in the brain. The most effective alternative is to explain locomotion by identifying what T&S call modules with attractor basins in state space.
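The way a basin of attraction generalizes over a class of stimuli, as in the passage from Freeman quoted above, can be illustrated with a very small model. The Python sketch below is an assumption-laden toy and not Freeman's model of the olfactory bulb: it uses a one-dimensional state whose attractors sit at the whole numbers, so that every starting state between n - 0.5 and n + 0.5 ends up at attractor n. The function names and the choice of dynamics are hypothetical.

```python
import math


def settle(x, dt=0.01, steps=2000):
    """Let the state relax under dx/dt = -sin(2*pi*x).

    In this hypothetical landscape the stable fixed points (attractors) are
    the integers and the basin boundaries are the half-integers.
    """
    for _ in range(steps):
        x += dt * (-math.sin(2.0 * math.pi * x))
    return x


def learned_state(stimulus_state):
    """Return the index of the attractor that a stimulus-induced state falls into."""
    return round(settle(stimulus_state))


# Three different starting states inside one basin all reach attractor 0,
# while a state on the other side of the boundary reaches attractor 1.
same_class = [learned_state(s) for s in (-0.3, 0.1, 0.45)]
other_class = learned_state(0.8)
```

No matter where in the basin the stimulus places the state, the outcome is the same attractor, so the classification into a learned category is done by the shape of the landscape rather than by a separate comparison step.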

 

Some might be tempted to ask whether T&S would need such a complex conceptual apparatus. Is it really possible to think of locomotion as a cognitive activity in a robust, non-metaphorical sense? Distinguishing between different perceived items, such as odors, is a paradigm example of the traditional view of perceptual cognition. But do walking, running, and jumping really deserve to be members of the same category? If we are going to answer that question fairly, we must have a definition of cognition which will not prejudice our judgments in favor of the traditional linguistic and perceptual idea of cognition. Fortunately Newell and Simon already have formulated a definition, which was deliberately designed not to tip the scales in favor of their own symbolic system hypothesis.

 

. . .we measure the intelligence of a system by its ability to achieve stated ends in the face of variations, difficulties, and complexities posed by the task environment. (in Haugeland 1997 p.83)

 

The popular notion of muscular activity assumes that it is simply an unconscious mechanical activity which is “switched on” by the brain. One of the reasons that Descartes believed that mind and body were fundamentally distinct was that he believed it was impossible for a physical device to make rational decisions that would vary so as to be equally appropriate in different contexts. This was understandable, because the most sophisticated machinery of his time was clockwork. The humanoid automata that he had seen could do relatively complicated things, but they were all stored in the machine in advance and would always be the same regardless of what the outside world did. (see Dreyfus 1972 pp. 235-6) Once you wound up a clockwork dummy, it would continue to do the same actions every time you flipped the switch, even if that meant piling into a wall or plunging into a fountain. It was only after the computer was invented that it was possible for a machine to have some of the flexibility that we associate with rational thought. Today, however, we are still in the grips of a Cartesian Materialism, which assumes that the computer metaphor is applicable only to the brain. It is often assumed that motor control does not involve decision making, but is rather a matter of the brain flipping the switches of a variety of preset muscular clockwork systems.

 

However, modern biology seriously weakens this distinction between brain as computer and body as clockwork. We now know that every step we take requires a constant flow of information between an organism and its environment, and a variety of adjustments and “decisions” made in response to that information. Ordinary walking is cognitive by Newell and Simon’s definition, for it does have to “achieve stated ends in the face of variations, difficulties and complexities posed by the task environment”. It is “not controlled by an abstraction, but in a continual dialogue with the periphery” (Thelen and Smith 1994 p. 9). The following examples show that, in order to achieve those stated ends, the walking organism must make “decisions” that can be seen as functionally equivalent to the conditional branching which GOFAI expresses in computer languages. And there is good reason to believe that these examples could be the first of many more.

 

 How Horses (and Other Animals) Move

 

                  The ambulatory system of a horse divides into four distinct attractor spaces, colloquially referred to as walk, trot, canter and gallop. Each of these consists of a set of motions governed by complex input from both the environment and the horse's nervous and muscular system. Careful laboratory study has made it possible to map the dynamics of each gait (see Kelso 1995 p.70), and each map reveals a multidimensional state space that contains a great enough variety of possible states to respond to variations in the terrain, the horse's heart and breathing rate etc., and yet is regular enough to be recognizable as only one of these four types of locomotion. There are no hybrid ways for a horse to move that are part trot and part walk; the horse is always definitely doing one or the other. And if all other factors remain stable, the primary parameter that determines the horse's utilization of each gait is usually how fast the horse is moving. From speed A to B the horse walks, from speed B to C it trots, and so on. There is not an exact speed at which the transition always occurs. If there were, a horse would wobble erratically between the two gaits whenever it ran anywhere near those speeds. What usually happens, however, is that the horse rarely travels at these borderline speeds (unless it is being used as a laboratory subject). Instead it travels at certain speeds around the middle of each range for each gait, because those are the ones that require the minimum oxygen and/or metabolic energy consumption per unit of distance traveled. This means that a graph correlating the horse's choice of gait with its speed usually consists of bunches of dots, rather than a straight line, because certain speeds are not efficient with any of the four possible gaits.
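The point that there is no exact transition speed can be illustrated with a small sketch. One common way of modeling this kind of behavior is hysteresis: the gait already in use persists until the speed has moved clearly past a nominal boundary. That modeling choice is an assumption of this sketch rather than a claim taken from Kelso, and the boundary speeds, the margin, and the function name below are all hypothetical.

```python
GAITS = ["walk", "trot", "canter", "gallop"]
BOUNDARIES = [5.0, 10.0, 15.0]  # hypothetical nominal speeds separating the gaits
MARGIN = 1.0                    # hypothetical half-width of the overlap zone


def next_gait(current_gait, speed):
    """Return the gait after a speed change, given the gait currently in use.

    Inside the overlap zone around a boundary the current gait is kept,
    so the horse does not flip back and forth at borderline speeds.
    """
    i = GAITS.index(current_gait)
    # Shift to a faster gait only when the speed is clearly above the boundary.
    while i < len(BOUNDARIES) and speed > BOUNDARIES[i] + MARGIN:
        i += 1
    # Shift to a slower gait only when the speed is clearly below the boundary.
    while i > 0 and speed < BOUNDARIES[i - 1] - MARGIN:
        i -= 1
    return GAITS[i]


# At a borderline speed the outcome depends on the gait already in use.
from_walk = next_gait("walk", 5.5)
from_trot = next_gait("trot", 5.5)
```

At the borderline speed of 5.5 a walking horse keeps walking and a trotting horse keeps trotting, so there is no single speed at which the switch always happens, and small fluctuations near a boundary do not produce wobbling between gaits.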

 

                   We can make a computer model of the horse's ability to adapt its gait to its speed using LISP, which is one of the most popular GOFAI languages. LISP models cognitive processes by means of commands that tell the program how to behave when it comes to a branch in the flow of information, which seems isomorphic to a bifurcation in the flow of a dynamic system from state space to state space. We'll start by positing four subroutines we'll call WALK, TROT, CANTER, and GALLOP, as well as a fifth subroutine we'll call CURRENT-SPEED which measures how fast the horse is moving. Because we are only modeling the decision making process, rather than the entire dynamic system, we will accept those as unexplained primitives. To these five subroutines, we will add some basic subroutines from LISP:

1) “defun”,  which defines a new subroutine

 

2) “equal” which compares two numbers and checks whether they are equal

 

3) “<” which compares two numbers and checks whether the first is less than the second.

 

4) “if. . . else”, the conditional which performs the decision making process.

 

 

We can now describe a possible program that essentially duplicates the decision function of the horse's dynamic ambulatory system. We will posit convenient speeds for each gait of 5, 10, 15, and 25. The LISP term “defun” will establish the word "GO" as the name of this program.

 

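(defun GO (CURRENT-SPEED)
    ; A stationary horse selects no gait.
    (if   (equal CURRENT-SPEED 0)  0
    ; Each remaining test is reached only if the one before it failed.
    (if   ( <    CURRENT-SPEED 5)  (WALK)
    (if   ( <    CURRENT-SPEED 10) (TROT)
    (if   ( <    CURRENT-SPEED 15) (CANTER)
    (if   ( <    CURRENT-SPEED 25) (GALLOP)
    ; The third argument of "if" is its else branch: here the program
    ; measures the speed again and repeats the decision.
          (GO (CURRENT-SPEED)) ) ) ) ) ) )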

 

The complete program would have to contain four definitions that looked something like this: (defun WALK () (make the horse walk)), and so on for the subroutines trot, canter, and gallop. The phrase “make the horse walk” is of course deliberately empty hand waving, because the details of the four gait programs are of no significance for the point I am making. There is however other research which finds similar kinds of conditional decision-making within the individual gaits used by animals. Taylor 1978, for example, describes research done with several different kinds of animals, including birds, lions, and kangaroos, showing how changes in gait require “recruitment” of different muscles and tendons. When any of these animals is walking, a certain set of muscles and tendons is brought into play, and it is possible to measure how much energy is being used by each muscle by measuring glycogen depletion. When an animal increases its speed, however, it must run another “program” that decreases the reliance on those muscles, and simultaneously recruits a different set of muscles. Taylor also discovered that the relationship between speed and glycogen depletion turned out to be dependent on several other factors as well. Gravitational energy is stored by means of the stretch and recovery of muscles and tendons in the faster gaits, making it possible for certain animals to actually use less muscle energy when traveling at faster speeds. These relationships can only be described accurately by means of complex conditional relationships very much like computer subroutines.

 

                  To some degree, these examples[2] are an extension of Tim Van Gelder's "computational governor" thought experiment (Van Gelder 1995). Van Gelder's thought experiment showed that if a computer were to duplicate the function performed by the device which controls the speed of a Watt steam engine, it would require fairly sophisticated computations. Van Gelder claimed that this task was clearly cognitive, that the Watt governor performed this task without computations, and that the same kind of physics which underlies DST was the best explanation for the Watt governor's cognitive abilities. This prompted some to say that the Watt governor really was computational after all (Bechtel 1998), and others to say that the task was too simple to be called cognitive, and therefore the analogy was spurious (Eliasmith 1997). Others pointed out that the Watt governor was merely a feedback loop, and therefore DST must be (in Van Gelder's own words summing up this criticism) "Cybernetics returned from the dead" (Van Gelder 1999). My Horse LISP example is meant to be a partial answer to these last two criticisms. The paradigm cognitive ability in computer science is often considered to be decision making, i.e. choosing between alternatives. An If-then-else command is certainly more of a decision making device than a feedback loop, and this example shows that the ambulatory system of a horse is a dynamic system that, among other things, performs the function of an If-then-else command.

 

 

Some Possible Futures for DST modules

                  If the horse's ambulatory system is capable of making the kind of cognitive distinctions that we ordinarily associate with high level computer programs like LISP, and dynamic systems theory can explain how this is done by equations that show bifurcations connecting sets of attractors, then perhaps we have something like a reduction of certain aspects of the LISP computer language to DST. And if it were possible to duplicate enough other branching functions performed by LISP with bifurcations in dynamic systems, it would be tempting to conclude that the symbolic system hypothesis had been reduced to being a subset of DST, in much the same way that Newtonian physics was reduced to being a subset of Einsteinian Physics.[3] This opens the possibility of an interesting treasure hunt --one in which establishing that there was no treasure would be as important as discovering it. There seem to be four possible extremes in what research might eventually discover.

 

1)  Dynamic Systems are incapable of performing many bifurcations that are essential to cognition, and the horse case described above is an isolated example from which we cannot extrapolate. If it were possible to prove this mathematically, we might decide that Cognitive DST was a blind alley.

 

 2) Dynamic Systems are implementations of classical computations, because basins of attraction (or some other feature of dynamic systems) are identical with  certain computer subroutines the way electromagnetism is identical with light.

 

3) Dynamic Systems are cognitive, but in a way that has nothing to do with classical computationalism. This would mean that DST could eventually eliminate computationalist theories of mind, the way chemistry eliminated the alchemical essences. (although computational theories would remain as useful to engineers as ever.)

 

 4) DST reduces computational theories of mind not with identities but with more ambiguous relationships that make the reduction more "bumpy" than "smooth". This would force us to change our ideas both of what a subroutine is, and what a dynamic system is.

 

                  I personally would bet on 4), but any one of these conclusions would be an important discovery. For example, it would be very convenient if we could simply find styles of dynamic bifurcation that corresponded to each of the five LISP primitives described in McCarthy 1960. Then we would have a perfect reductive identity between LISP and those particular dynamic systems, which would produce the result described in 2). But the chances of things working out exactly that neatly are very slim, for a variety of reasons.

 

                  For one thing, it is far more likely that most cognitively effective DST bifurcations will require several lines of code, or even whole programs, to be modeled effectively. Conversely, a computer program simulating a dynamic system would contain elements that would be unnecessary in the original system. Our model of the horse ambulatory system, for example, contains several elements that presuppose a computer's need to search and choose before each action. The recursive terms in our horse LISP subroutine made it possible to compare the value of the incoming speed variable to each of the gait subroutines in sequence until the correct one was found. Dynamic systems do not have any need for this kind of comparing function. They shift among different sets of attractors when certain parameters change in value, but in no sense do they "consider" other alternatives before they shift. They do it right the first time. A connectionist net, for example, does need a training period to adjust its weights to produce the proper output. But unlike a computer program, it does not need to reconsider all of the wrong choices after it has been trained.[4] These dissimilarities could be a strength, however, if they helped to account for many of the differences between real organic systems and their GOFAI idealizations, such as the former's ability to move fast enough to interact with the real world.
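The difference between stepping through a list of comparisons and simply settling into an attractor can be illustrated with a standard textbook example of a bifurcation. The Python sketch below uses the pitchfork normal form dx/dt = r*x - x**3, which is brought in here purely as an illustration and is not a model of any animal. The function name and the numerical settings are assumptions.

```python
def settle_state(r, x0=0.1, dt=0.01, steps=20000):
    """Relax the state under dx/dt = r*x - x**3 and return where it ends up.

    For r < 0 the only attractor is x = 0. For r > 0 that point becomes
    unstable and two new attractors appear at +sqrt(r) and -sqrt(r).
    The code contains no conditional test on r.
    """
    x = x0
    for _ in range(steps):
        x += dt * (r * x - x ** 3)
    return x


# Changing the control parameter r changes the set of available attractors.
below = settle_state(r=-1.0)               # settles at 0
above = settle_state(r=1.0)                # settles at +1
mirrored = settle_state(r=1.0, x0=-0.1)    # settles at -1
```

Nothing in this sketch compares r against a threshold or examines the alternatives in sequence. The outcome changes when r crosses zero only because the landscape itself has changed, which is the sense in which the dynamic system does not need a comparing function.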

 

                  Secondly, if we discover that a bifurcating dynamic system can duplicate the branching functions of computer subroutines connected by a LISP decision tree, the dynamic system will still remain free of many of the limitations of the modular architecture of computers. In a sense, the attractor spaces in a dynamic system are both informationally encapsulated and domain specific to some degree. But they also possess a flexibility that frees them from the limitations that were unavoidable for Fodor's modules.

 

Can There be Distributed Modules?

 

                 When connectionism first appeared on the AI scene, it was seen as radically non-modular, because everyone was struck by the fact that it used what was called distributed representation. The usual claim, both defended and attacked, was that in a connectionist net there was no single place where a particular bit of information was represented.  I believe that the proper approach to this controversy is to remember that a connectionist net is one kind of dynamic system, and that this means its fundamental parts are not modules that exist in physical space, but basins of attraction that exist in computational space. It may be that connectionist AI was guilty of a kind of misplaced concreteness when it saw itself as modeling the behavior of organlike neural structures, rather than the state spaces of dynamic patterns.  I am tempted to think of the connectionist modules used in contemporary AI as little dynamic systems imprisoned like birds in cages, so that they can communicate with other modules only by means of input-and-output devices.

 

                   The current engineering perspective tends to see connectionism as one more trick in an AI toolbox which is still running on fundamentally GOFAI principles. The two most common approaches for interfacing connectionism with GOFAI systems are:


1) Creating a virtual connectionist environment on a standard digital computer system. These virtual connectionist programs function as modules within a fundamentally undistributed system. Although there is arguably distributed processing going on within the virtual module created by these programs, the module communicates to the rest of the system by means of standard input and output connections. It thus functions by exchanging information the same way any other modular system exchanges information. These connectionist programs are really only subroutines that the digital computer calls up when it needs to activate them in a larger programming context. This is why the designers of the Joone neural net framework claim that their programming environment makes it so that “Everyone can write new {connectionist} modules to implement new algorithms or new architectures” (www.jooneworld.com).

 

2) Using computer chips which perform genuinely analog connectionist processing. Until recently very little AI work has been done with such chips, but the reasons for the initial failure of genuinely connectionist modules are of little philosophical significance.

 

{The first analog connectionist chips} failed for two reasons. First, the actual improvement in performance over software running on a conventional processor was not that great. Secondly, five to 10 years ago you could not implement sufficiently large neural networks in silicon. (Hamish Grant quoted in P.  Clark 1999)

 

There is thus no reason to deny that genuinely analog connectionist chips will eventually be quite common. However, even if genuinely connectionist processors do replace virtual ones, this would not change the fundamentally modular nature of the systems in which they are embedded. Even though there is no question that the processing taking place within these chips is genuinely distributed, the distribution stops when you hit the borders of the chip. The fundamental computational tool inside these modules is state space transformations, just as in the dynamic systems we discussed earlier. But the state spaces in the connectionist chips are unnaturally easy to isolate. This makes them useful for engineering, but very misleading biologically. Of course, real neurons really do have inputs and outputs with reasonably exact voltages and weight summations. And by replicating those in silicon, it becomes possible to create modules that perform state space transformations on specific inputs.  But as long as we see this as the only way of utilizing connectionism, the relationship between connectionist and other dynamic systems becomes obscured, and connectionism loses almost all of its original revolutionary force. A connectionist net becomes rather like an AI "toy world" version of a dynamic system, and is still subject to many of the objections raised by Dreyfus against GOFAI systems. (See Dreyfus 1994 p.xxxiii-xxxviii)

 

The creation of these connectionist modules makes sense from an engineering perspective, at least in the short term. It enables us to use GOFAI and connectionist systems in partnership, which results in the fullest utilization of all of our engineering resources. But it also closes the door on further development of the distributed representations that make organic systems so much more flexible than GOFAI systems. The fundamental building blocks of these systems are, like those of any other GOFAI system, physically distinct modules that are located on different parts of a circuit board (or, in the case of the virtual nets, different regions of RAM or hard disk space), not attractor basins located in different regions of computational space. They are thus prevented from performing Quinean and Isotropic processes for all the reasons described in Fodor 1983.

 

                 However, if we could show how a DST system could perform the functions of a GOFAI system using attractor spaces as something like distributed modules, further progress might be possible. To most people in the field, the idea of a distributed module is a contradiction in terms, and as long as this is the case DST will never be able to establish reductive relationships with the modular concepts of GOFAI. Some proponents of DST like it that way, for they want to go for broke and pursue a total elimination of GOFAI by DST. But I don't think this is a very realistic view of how either reductions or eliminations operate in the history of science. If no relationship at all is established between one domain of discourse and another, there is no way of establishing that the two discourses are even talking about the same thing. As Dennett pointed out, no one would accept an elimination of the concept of Santa Claus which claimed that Santa Claus is a skinny man named Fred who lives in Florida, plays the violin, never buys gifts for anyone and hates children (Dennett 1991 p. 85). At the very least we need to show why people thought that the concepts of the old theory were legitimate. We may eventually decide that a given reductive relationship is so cock-eyed that it would be better described as an elimination than an identity. But we have to begin by positing identities between things in the old and new theories, and in this case, the concept of a distributed module seems to be a good place to start.

 

The Nature of Distribution

 

                   Fodor claims most of the time that his modules are not organs with concrete locations in the brain, but rather abstract faculties defined by the functions they perform. A module is thus "individuated by its characteristic operations, it being left open whether there are distinct areas of the brain that are specific to the function that the system carries out" (Fodor 1983 p.13). In practice, however, Fodor usually speaks as though his modules probably are organs in some sense. This is most noticeable on p.98 of Fodor 1983, which has the heading "Input systems are associated with fixed neural architecture". I can see no difference between an organ and fixed neural architecture. Although Fodor admits that there might not be distinct areas of the brain for each function, he apparently does not take this possibility seriously. The only real cash value of this admission for Fodor is to permit him to describe the function abstractly, and ignore as mere "hardware problems" exactly how the function is physically embodied.

                   

                  This strategy became more obvious when many people began to claim that connectionist systems were not modular because they used distributed processing. Fodor's response was basically to say that connectionist systems were distributed only physically, and that functionally they were still modular (Fodor and Pylyshyn 1988). However, he has never really explained how a system could be physically distributed yet functionally modular. I think, however, that if DST does deliver on its promise as a cognitive science paradigm, there is a sense in which distributed systems can be described as modular, although with several important qualifiers.

 

                  Van Gelder 1991 claims, I think correctly, that the essence of distribution is summed up in a concept he calls superposition. For our purposes, I think Van Gelder's concept of superposition is effectively illustrated by the following series of examples. Let us consider a set of 26 cards, each of which has a letter of the alphabet on it. In this case, the representation of the alphabet is completely modular and undistributed. Each card represents exactly one letter of the alphabet, without any reliance on the other cards. Now let us suppose that instead of twenty-six cards we have only 10 cards. We lay the cards out on the floor in a set pattern, and paint each card white on one side and black on the other. We then represent "A" by turning a certain combination of black and white surfaces face up (odd cards black, even cards white, for example), another combination is posited as representing "B", and so on for all twenty-six letters. In this case the representation of the alphabet is superposed on all ten cards, because no one card represents any one letter. Despite the fact that we have only ten cards, and all of them are used to represent a single letter, we do not end up with less representational power, but far, far more. There are, in fact, 2^10 (that is, 1024) possible combinations of black and white, leaving us 998 (1024-26) other possible combinations to represent whatever else we like (the Russian and Sanskrit alphabets, perhaps). If we used ten six-sided cubes with a different color on each side, instead of cards with only two sides, the number of possible combinations would be 6^10 (that is, 60,466,176).
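The arithmetic behind the card example can be checked with a few lines of Python. This is only a sketch of the counting argument; the function and variable names are my own and do not come from Van Gelder.

```python
# Counting the configurations available in the card example.
N_LETTERS = 26


def state_count(sides_per_item: int, n_items: int) -> int:
    """Number of distinct face-up configurations of n_items objects,
    each of which can show one of sides_per_item faces."""
    return sides_per_item ** n_items


binary_states = state_count(2, 10)        # ten black/white cards
spare_states = binary_states - N_LETTERS  # left over after encoding A..Z
cube_states = state_count(6, 10)          # ten six-colored cubes

print(binary_states)  # 1024
print(spare_states)   # 998
print(cube_states)    # 60466176
```

The one-card-per-letter scheme, by contrast, gains exactly one state for each card added, which is why the superposed scheme has so much more representational power for fewer physical parts.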

 

                  In the one-card-per-letter system, each card could be seen as a module both physically and functionally. One physical piece of paper performs exactly one linguistic function, no more, no less. In the superposed system with the black and white cards, however, no one card is a letter. Instead, each card functions like an axis in a Cartesian state space, and each axis has exactly two points on it (one for the black side of the card and one for the white). When the cards are replaced with six-sided cubes, each cube functions as an axis with six points on it. And because the cards are performing this Cartesian function, they make it possible to conceive of each of these letters as a point in the 10-dimensional state space defined by these 10 cards or cubes. Consequently, we can functionally represent all twenty-six letters without needing twenty-six distinct physical cards.
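The idea of letters as points in a ten-dimensional state space can be sketched in Python as follows. Only the pattern for "A" (odd cards black, even cards white) comes from the example above; the assignment of the other twenty-five letters, the stride of 37, and all of the names are hypothetical choices made purely for illustration.

```python
import string

N_CARDS = 10  # each card is one axis of the state space; 1 = black, 0 = white


def to_point(n: int) -> tuple:
    """Write the integer n as a point in the 10-dimensional binary state space."""
    return tuple((n >> (N_CARDS - 1 - k)) & 1 for k in range(N_CARDS))


# "A" follows the text: odd-numbered cards black, even-numbered cards white.
A_POINT = tuple(1 if card % 2 == 1 else 0 for card in range(1, N_CARDS + 1))
A_INDEX = int("".join(map(str, A_POINT)), 2)

# Hypothetical assignment for B..Z: step through the space by an odd stride.
# An odd stride is invertible modulo 2**10, so the 26 points are all distinct.
STRIDE = 37
LETTER_TO_POINT = {
    letter: to_point((A_INDEX + STRIDE * i) % 2 ** N_CARDS)
    for i, letter in enumerate(string.ascii_uppercase)
}
POINT_TO_LETTER = {point: letter for letter, point in LETTER_TO_POINT.items()}


def decode(point):
    """Return the letter located at this point, or None if it is unassigned."""
    return POINT_TO_LETTER.get(tuple(point))


print(LETTER_TO_POINT["A"])            # (1, 0, 1, 0, 1, 0, 1, 0, 1, 0)
print(decode(LETTER_TO_POINT["Q"]))    # Q
```

In this sketch every letter is recovered from the joint state of all ten cards, and each individual card shows black for some letters and white for others, so no card taken alone identifies a letter.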

 

In their classic paper, Ramsey, Stich, and Garon (1991) have claimed that because beliefs are represented distributively in a connectionist system, we should therefore conclude that strictly speaking, from a scientific standpoint, beliefs are not represented in our brains at all. However, the above explication of distributed representation shows why this inference is invalid. The ten cards in the example above really do represent all twenty-six letters of the alphabet, even though no one card represents any single letter. They are able to do this because there is a point in the state space of possible card positions that represents each letter. And similarly, the basins of attraction in a dynamic system are capable in principle of doing every bit as much real cognitive work as an undistributed “physical” module. They are every bit as physically real as gravity, capacitance, voltage or any other theoretical scientific entity that we cannot see, feel, hear, or trip over. The state space modules described by DST are grist for the scientific mill, capable of being studied and measured. And more importantly, because they are more like events than objects, they have a speed and flexibility which is lacking in the material entities that Fodor called modules, and anatomists call organs.

 

 State Spaces Vs. Organs

 

                 The concept of "organ" is, as Fodor points out, the biological equivalent of a module, and it will probably continue to be useful. But it is based on a possibly misleading assumption: that morphology is always an accurate guide to function, because the body, like a Hi-Fi set, is supposedly divided up into distinct modules with materially delineable borders. Those who accept this assumption acknowledge that it may require microscopes, or sophisticated staining techniques, to find those borders. But the assumption remains that once one has marked out those borders, one has carved the brain or body at its fundamental joints, and that the purpose of neuroscience is then to answer questions like "what is the hippocampus for?". But attractor basins and orbits are far more volatile entities than modules, and their boundaries and functions are far more flexible. Strictly speaking, they are much more like events than objects. They endure longer than most events, but in this respect they resemble events like tornadoes or waterfalls, which seem object-like because the flow of their constituting processes coheres in a stable pattern.

 

                  We must not discount the possibility that morphology is not the essential factor in determining function. The gray matter of the brain could be seen as primarily a medium through which dynamic ripples eddy and coalesce into state spaces. The fact that people can often relearn skills lost after brain damage, even though the "organs" supposedly responsible for those skills have been damaged or surgically removed, could offer support for that possibility.

Although damage to the central nervous system often results in permanent disruption. . .In some cases there is a return to normal or near-normal levels of performance even after extensive loss of nervous tissue. . .some see the fact of sparing and recovery as a direct challenge to the principle of localization (Laurence and Stein 1978 pp. 369 & 394)

 

The idea that functions are not modular and localized is still highly controversial, and understandably so. To say that the brain can perform complex functions without different parts of it doing different things seems to be either magical or nonsensical. But if mental functions were localized in state space, but not strongly attached to a material location in the brain, this would provide an alternative to traditional modularity which is in principle capable of making sense. In their concluding remarks, Laurence and Stein admit that modern neuroscience “cannot claim to have solved the riddle of recovery” (ibid. p. 401), so clearly there is room for the development of new theories. Such a theory might also be able to account for the fact that brain scans of different people (or even the same person at different times) often reveal brain functions taking place at noticeably different locations (John O’Keefe, Institute of Cognitive Neuroscience, University College, London, personal communication). However, the most dramatic support for separating function from location is Merzenich and Kaas’s research on the primary somatosensory cortex of New World monkeys.

 

Before Merzenich and Kaas, the model for somatosensory function was the modular view of Penfield and Rasmussen. In 1957, they published a map of the surface of the cortex, which purported to show a one-to-one relationship between parts of the body and the parts of the brain that controlled them. This view of brain function implied “both that these neatly ordered representations were established in early life by the anatomical maturation of the nervous system, and that they were functionally static thereafter.” (Thelen and Smith 1994 p.136). The work of Merzenich and Kaas, however, “has forced a drastic reexamination of those beliefs.” (ibid). Their experiments showed that which part of the brain controlled which part of the body was shaped by how the monkeys used their hands, and that the region of the cortex that controlled any given finger could be shifted from one part of the brain to another by inhibiting the monkey’s ability to move its hand and fingers. If a digit was amputated, the region of cortex which formerly controlled the missing digit would be ‘taken over’ by the remaining digits. If two digits were fused together, and used by the monkey for an extended time as a single digit, the two controlling regions in the cortex would fuse together. Once the monkey’s digits were freed and separated, the two controlling regions separated again. Nor was this experiment an isolated anomaly. “Subsequently, investigators have found similar reorganization for somatic senses in subcortical areas and in the visual, auditory, and motor cortices in monkeys, and in other mammals.” (ibid). This is not the sort of thing that happens in a modular system, where each part does the job it was designed to do and nothing else.

 

On the other hand, we also shouldn't assume that we must make an either/or choice between a dynamic brain and a brain assembled from organs. To continue the ripple metaphor (if it is only a metaphor), even a river's dynamic flow is shaped by the contours of the riverbank. In a similar way, perhaps the biological structure of the brain would make it more likely that certain dynamic patterns would stabilize into recurring attractor spaces, just as certain kinds of eddies and tide pools are more likely to form if the river banks contain the appropriately shaped inlets. Only future research will determine the exact relationship between the “organs” in the brain and the dynamic patterns that flow through them. But there seem to be strong indications that we can’t unquestioningly adopt the “one organ, one function” relationship presupposed by traditional modularity theory.

 

Informational Encapsulation and DST

 

                  The connections between state spaces in embodied dynamic systems are not hardwired the way they are in AI connectionist systems. The bifurcations between attractor spaces are not specific connective neurons; they are only abstract measurements of forces. As the parameters that shape these forces shift, so do the cognitive characteristics of the attractor. For this reason, it is highly implausible that attractor basins in dynamic systems are informationally encapsulated in the way that Fodor claimed modules must be. Many of Fodor's arguments for informational encapsulation (for example, the fact that modules must have fast response times) require only that the module be informationally encapsulated at the time it is performing its function. Fodor is correct when he says that "in the rush and scramble of panther identification, there are many things I know about panthers whose bearing on the likely pantherhood of the present stimulus I do not wish to have to consider" (Fodor 1983 p. 71 italics in original). After the rush and scramble have subsided, however, there is no reason that the module which enables us to instantaneously identify panthers shouldn't receive all sorts of input from other sources, and reconfigure itself so as to make more effective responses the next time we see a panther. Consider another example: From my experience as a musician, and from conversations with other musicians, I know that learning to sight read music requires the development of sets of quick response connections between the eyes and hands. However, a set of responses which is very effective for one style of music does not yield the necessary speed for another style. Even if one can sight read Bach fluently, it is likely that difficulties will arise the first time one tries to sight read Duke Ellington until one has learned several new pieces in his style. But because the sight reading "module" is not permanently informationally encapsulated, it is possible for it to take in new information the more one studies a new style of music, and thus learn the reflex-like speed that makes fluent sight reading possible the next time around.

 

                  Furthermore, it is possible in principle for attractor spaces to receive influxes of new information with very few changes in material structure. Fodor claims that "if you facilitate the flow of information from A to B by hardwiring a connection between them, then you provide B with a kind of access to A that it doesn't have to locations C, D, E, . . ." (Fodor 1983 p.98). But although this is true for hardwired modules, it is not true for bifurcations in a dynamic system. For them, informational interpenetration is probably the rule, rather than the exception. We must remember that "invariant set" is a highly conditional term in DST. "Invariant" really only means that there is a pattern that stays stable long enough to be noticed and (partially) measured. Given the number of parameters that must reach some kind of equilibrium for an invariant set to emerge, it is highly unlikely that they will always remain stable enough to produce anything that could be called informational encapsulation. The slightest flicker in the parameters that hold an invariant set stable could bring in information from almost anywhere else in the system, which could change the system (hopefully for the better) when it restabilized.
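The dependence of an attractor on its parameters can be illustrated with a toy example. The sketch below is not taken from any of the research discussed here: it assumes the standard one-dimensional textbook system dx/dt = r*x - x^3 as a stand-in, with a single control parameter r, and the function name, step size, and starting points are arbitrary choices of mine.

```python
def settle(x0: float, r: float, dt: float = 0.01, steps: int = 20000) -> float:
    """Euler-integrate the assumed toy system dx/dt = r*x - x**3 from x0
    and return the state the system has settled into."""
    x = x0
    for _ in range(steps):
        x += dt * (r * x - x ** 3)
    return x


starts = (-1.5, -0.2, 0.2, 1.5)

# With r negative there is a single basin: every start settles near 0.
print([settle(x0, r=-1.0) for x0 in starts])

# With r positive the landscape has bifurcated into two basins:
# negative starts settle near -1, positive starts settle near +1.
print([settle(x0, r=1.0) for x0 in starts])
```

Nothing in the "material" of this toy system changes between the two runs. Only the parameter r shifts, yet the number of basins and the boundary between them change with it, which is the sense in which an invariant set stays invariant only as long as its parameters hold still.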