Invited for Behavioral and Brain Sciences.
Karl F. MacDorman
Department of Systems and Human Science
Osaka University
Toyonaka, Osaka 560
Japan
kfm@me.es.osaka-u.ac.jp
http://www.cl.cam.ac.uk/~kfm11
Cognitive theories based on a fixed feature set suffer from the frame and symbol grounding problems. Flexible features and other empirically acquired constraints (e.g., analog-to-analog mappings) provide a framework for letting extrinsic relations influence symbol manipulation. By offering a biologically plausible basis for feature learning, nonorthogonal multiresolution analysis and dimensionality reduction, informed by functional constraints, may contribute toward a solution to the symbol grounding problem.
Keywords: categorization; dimensionality reduction; feature learning; the frame problem; Gabor wavelets; intentional contents; multiresolution analysis; principal components analysis; nonorthogonality; object concepts; object recognition; receptive field profiles; sensorimotor predictions; the symbol grounding problem
Schyns et al. present compelling arguments (sect. 2.7) and evidence for the ability to learn diagnostic features. It is accordingly useful to begin by considering why fixed features have proven so popular. Perhaps most important, fixed features provide a straightforward basis for framing cognitive theories (sect. 1.1). Fixed features and combination rules enable a symbol system to simulate key aspects of human thinking such as its systematicity, productivity, and semantic coherence. Fixed features also simplify the mechanics of deduction and abstract planning by separating the symbol system from the details of sensorimotor control.
Trouble arises because, in current systems, symbol manipulation turns on properties intrinsic to the system (i.e., syntactic constraints and the physical properties of the implementation media). However, even proponents of symbol systems (Fodor 1994) now admit that extrinsic relations influence intentional contents -- in particular, the causal relation between a thought's content and what it represents. To ignore this fact leads to the symbol grounding problem (Harnad 1990).
Standard artificial intelligence solutions fail because (1) outside of simple domains, a programmer cannot anticipate all necessary elementary features and, hence, cannot set up a priori feature detectors; (2) representing all sensorimotor information symbolically creates unreasonable computational demands (Janlert 1996); and (3) analog information needs to bear on abstract reasoning. Otherwise, symbol systems lack empirical constraints and (having only formal constraints) can define a limitless number of "kooky" concepts (Fodor 1987). This excess of freedom contributes to the frame problem. Its solution requires finding a representational form that obviates the need to reason about stabilities (Janlert 1996). However, our best hope for that is precisely the bottom-up perceptual and functional constraints that current symbol systems lack (Harnad 1993). This is one reason why we may need representational forms that fall along an analog-categorical continuum (Harnad 1987).
By letting extrinsic relations (as mediated by the body) influence internal symbol manipulation, feature learning offers a sounder foundation for cognitive theories. Feedback from failed sensorimotor predictions could form the basis for learning analog-to-categorical mappings (MacDorman 1997a, 1997b). These could in turn ground symbols and rules. Sensorimotor predictions correlate the sensory and internal consequences of motor activity. When they are violated, orienting reactions ensue that guide attention toward the source of error. These predictions underlie the ability to distinguish object affordances. For this reason they must integrate information from multiple modalities. Such multimodal predictions provide a framework for more passive or abstract feature learning. Active sensorimotor predictions constitute an individual's current world-model.
More specific issues are discussed in the following sections:
Viewpoint. There is good reason to doubt the existence of a unique task-independent view-based representation of an object (Schyns et al., sect. 2.1). One alternative is to learn a viewpoint-independent representation of an object by developing predictions concerning how bodily movements transform the object's (viewpoint-dependent) sensory projections (MacDorman 1997a, 1997b). Using these, it is possible to shift from a viewer-centered to an object-centered frame of reference, because predictions concerning how self-induced movements cause sensory transformations can be used to compensate for those movements (see also Edelman 1998).
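The compensation step can be illustrated with a deliberately simple sketch. It assumes a two-dimensional world in which the only self-induced movement is a turn of the viewer, and it substitutes a known rigid rotation for what would, on the account above, be a learned prediction. The function names predict_projection and compensate are hypothetical and chosen only for this illustration.

```python
import math

def predict_projection(points, turn):
    """Predict viewer-centered coordinates after the viewer turns by `turn` radians.

    Assumption: a counterclockwise turn of the viewer makes the scene appear
    rotated clockwise by the same angle in the viewer's own frame.
    """
    c, s = math.cos(-turn), math.sin(-turn)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def compensate(points, turn):
    """Undo the predicted effect of a self-induced turn on the projection."""
    return predict_projection(points, -turn)

# Hypothetical object outline, described before any movement.
outline = [(1.0, 0.0), (0.0, 2.0)]
# What the viewer would receive after turning by 0.7 radians.
view = predict_projection(outline, 0.7)
# Using the prediction to cancel the movement restores the stable description.
stable = compensate(view, 0.7)
```

Here the viewpoint-dependent coordinates in view differ from outline, whereas stable agrees with it up to rounding error. A learned predictor would have to approximate such a mapping from experience rather than have it supplied.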
Nonorthogonality. As Schyns et al. point out, the categorization process can profitably inform the dimensionality reduction of its input (sect. 3.4.2). Although this approach makes a featural interpretation of principal components easier, one drawback is the high cost of each recalculation of the eigenvectors used by the Karhunen-Loeve transform. Another drawback is that biological sensorimotor systems use nonorthogonal coordinates. Orthogonality would exact a high cost both genetically and in terms of neural development. It leaves visual processing susceptible to changes in receptive field profiles. If the brain used orthogonal coordinates, neural death could easily render some patterns imperceptible and others indistinguishable.
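The point about neural death can be made concrete with a toy example. The sketch below is not a model of cortex: it assumes a two-dimensional input space and linear units whose response is the dot product of the input with a preferred direction. The names unit, lesion, perceptible, and distinguishable are hypothetical. It compares an orthogonal pair of units with a redundant nonorthogonal set of three after one unit is lost.

```python
import math

def unit(angle_deg):
    """Preferred direction of a linear unit, given as an angle in degrees."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))

def responses(pattern, units):
    """Response of each unit: dot product of its direction with the pattern."""
    return [u[0] * pattern[0] + u[1] * pattern[1] for u in units]

def perceptible(pattern, units, tol=1e-9):
    """True if at least one surviving unit responds to the pattern."""
    return any(abs(r) > tol for r in responses(pattern, units))

def distinguishable(p, q, units, tol=1e-9):
    """True if at least one surviving unit responds differently to p and q."""
    return any(abs(a - b) > tol
               for a, b in zip(responses(p, units), responses(q, units)))

def lesion(units, index):
    """Return the unit set with one unit removed (a stand-in for neural death)."""
    return [u for i, u in enumerate(units) if i != index]

orthogonal = [unit(0), unit(90)]               # minimal orthogonal code
nonorthogonal = [unit(0), unit(60), unit(120)]  # redundant, nonorthogonal code

damaged = lesion(orthogonal, 1)
lost = not perceptible((0.0, 1.0), damaged)                      # True
merged = not distinguishable((1.0, 0.0), (1.0, 1.0), damaged)    # True
```

With the orthogonal pair, losing the second unit leaves the pattern (0, 1) without any response and makes (1, 0) and (1, 1) produce identical responses. With the three nonorthogonal units, any two survivors still span the plane, so the same patterns remain perceptible and distinguishable whichever single unit is removed.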
Multiresolution analysis. Results in character and face recognition research from using wavelets to parse input signals at various scales support Schyns et al.'s conclusion that larger features need not be composed from smaller ones (sect. 3.1). Daugman (1980) proposed that a parameterized family of two-dimensional Gabor filters (nonorthogonal wavelets) offers a suitable model of the anisotropic receptive field profiles of single neurons in several areas of the primary visual cortex. The two-dimensional Gabor filters are able to account for the selective tuning of simple cells for characteristic scale, localization, orientation, and quadrature phase relationships. Daugman (1985) has shown using chi-squared tests that this family of elementary functions fits the profiles of 97% of simple cells in the cat visual cortex. Within-category invariance (with between-category invariance filtered out) learned from the output of Gabor filters may underpin flexible features (MacDorman 1997b).
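As a rough illustration of the tuning properties just listed, the following sketch builds a two-dimensional Gabor function (a sinusoidal carrier under a Gaussian envelope) and measures the combined response of an even and an odd filter in quadrature. It uses a common textbook parameterization rather than Daugman's exact formulation. The parameter values (a 21 by 21 patch, wavelength 8, sigma 4) and the helper names kernel, grating, and energy are assumptions made only for this example.

```python
import math

def gabor(x, y, wavelength, theta, phase, sigma, aspect=1.0):
    """Value at (x, y) of a 2D Gabor function with orientation theta (radians)."""
    # Rotate coordinates so that the carrier varies along direction theta.
    xr = x * math.cos(theta) + y * math.sin(theta)
    yr = -x * math.sin(theta) + y * math.cos(theta)
    envelope = math.exp(-(xr ** 2 + (aspect * yr) ** 2) / (2.0 * sigma ** 2))
    carrier = math.cos(2.0 * math.pi * xr / wavelength + phase)
    return envelope * carrier

def kernel(size, wavelength, theta, phase, sigma):
    """Sample the Gabor function on a size-by-size grid centered on the origin."""
    half = size // 2
    return [[gabor(x, y, wavelength, theta, phase, sigma)
             for x in range(-half, half + 1)]
            for y in range(-half, half + 1)]

def grating(size, wavelength, theta):
    """A cosine grating used here as a hypothetical test stimulus."""
    half = size // 2
    return [[math.cos(2.0 * math.pi
                      * (x * math.cos(theta) + y * math.sin(theta)) / wavelength)
             for x in range(-half, half + 1)]
            for y in range(-half, half + 1)]

def energy(image, wavelength, theta, sigma, size):
    """Combine even (cosine) and odd (sine) filter outputs, a quadrature pair."""
    even = kernel(size, wavelength, theta, 0.0, sigma)
    odd = kernel(size, wavelength, theta, -math.pi / 2.0, sigma)
    r_even = sum(image[r][c] * even[r][c] for r in range(size) for c in range(size))
    r_odd = sum(image[r][c] * odd[r][c] for r in range(size) for c in range(size))
    return math.hypot(r_even, r_odd)

stimulus = grating(21, 8.0, 0.0)
matched = energy(stimulus, 8.0, 0.0, 4.0, 21)            # filter aligned with grating
crossed = energy(stimulus, 8.0, math.pi / 2.0, 4.0, 21)  # filter rotated by 90 degrees
```

Under these assumptions, matched is much larger than crossed, which is the orientation selectivity the text attributes to simple cells. Changing wavelength and sigma together would give filters tuned to other scales, in the spirit of a multiresolution analysis.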
Schyns et al.'s use of alternative materials is laudable. Virtual reality may soon permit us to pursue a truly bottom-up multimodal investigation of feature learning. Subjects possessed of 'bodies' with utterly different kinematics, sensors, and actuators will move in alien perceptual worlds, and we will get to study them.
Daugman, J. G. (1980). Two-dimensional spectral analysis of cortical receptive field profiles. Vision Research, 20, 847-856.
Daugman, J. G. (1985). Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. Journal of the Optical Society of America, 2(7), 1160-1169.
Edelman, G. M. (1998). Representation is representation of similarities. Behavioral and Brain Sciences, 21(3).
Fodor, J. A. (1987). Modules, frames, fridgeons, sleeping dogs, and the music of the spheres. In Z. W. Pylyshyn (Ed.), The robot's dilemma. Norwood, NJ: Ablex.
Fodor, J. A. (1994). Jerry A. Fodor. In S. Guttenplan (Ed.), A companion to the philosophy of mind. Oxford: Blackwell.
Harnad, S. (1987). Category induction and representation. In S. Harnad (Ed.), Categorical perception. Cambridge: Cambridge University Press.
Harnad, S. (1990). The symbol grounding problem. Physica D, 42(1-3), 335-346.
Harnad, S. (1993). The frame problem as a symptom of the symbol grounding problem. Psycoloquy, 4(34), frame-problem.11.
Janlert, L.-E. (1996). The frame problem: Freedom or stability? With pictures we can have both. In K. M. Ford & Z. W. Pylyshyn (Eds.), The robot's dilemma revisited. Norwood, NJ: Ablex.
MacDorman, K. F. (1997a). How to ground symbols adaptively. In S. O'Nuallain, P. McKevitt & E. MacAogain (Eds.), Readings in computation, content and consciousness. Amsterdam: John Benjamins.
MacDorman, K. F. (1997b). Symbol grounding: Learning categorical and sensorimotor predictions for coordination in autonomous robots. Technical Report No. 423, Computer Laboratory, Cambridge University (e-mail librarian@cl.cam.ac.uk).