Edmund T. Rolls
- Published in print: 2016
- Published Online: November 2016
- ISBN: 9780198784852
- eISBN: 9780191836299
- Item type: chapter
- Publisher: Oxford University Press
- DOI: 10.1093/acprof:oso/9780198784852.003.0025
- Subject: Neuroscience, Molecular and Cellular Systems, Behavioral Neuroscience
This chapter shows that the feature hierarchy approach has a number of advantages in performing object recognition over other approaches (see Section 25.3), and that some of the key computational issues that arise in these architectures have solutions (see Sections 25.4 and 25.5). The neurophysiological and computational approach taken here focuses on a feature hierarchy model in which invariant representations can be built by self-organizing learning based on the statistics of the visual input. The model can use temporal continuity in an associative synaptic learning rule with a short-term memory trace, and/or it can use spatial continuity in Continuous Spatial Transformation learning. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, size, and lighting. The model uses a feature combination neuron approach, with the relative spatial positions of the features specified in the feature combination neurons, and this provides a solution to the binding problem. The model has been extended to provide an account of how invariant representations of the global motion produced by objects, such as looming, rotation, and object-based movement, may be formed in the dorsal visual system. It has also been extended to incorporate top-down feedback connections, to model the control of attention by biased competition in, for example, spatial and object search tasks (see further Chapter 6). Further extensions account for how the visual system can select single objects in complex visual scenes, how multiple objects can be represented in a scene, and how invariant representations of single objects can be learned even when multiple objects are present in the scene. A further extension accounts for how the visual system can select multiple objects in complex visual scenes, using a simulation of saliency computations in the dorsal visual system and then, with fixations on the salient parts of the scene, performing view-invariant visual object recognition with the simulation of the ventral visual stream, VisNet. It has also been suggested, in a unifying proposal, that adding a fifth layer to the model and training the system in spatial environments will enable hippocampus-like spatial view neurons or place cells to develop, depending on the size of the field of view (Section 24.3.11). We have thus seen how many of the major computational issues that arise when formulating a theory of object recognition in the ventral visual system (such as feature binding, invariance learning, the recognition of objects in cluttered natural scenes, the representation of multiple objects in a scene, and learning invariant representations of single objects when there are multiple objects in the scene) could be solved in the cortex, with tests of the hypotheses performed by simulations that are consistent with complementary neurophysiological results.
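To make the trace learning rule mentioned above concrete, here is a minimal NumPy sketch of a Hebbian update gated by a short-term memory trace of postsynaptic activity, in the spirit of the rule used in VisNet-style models; the function name, parameter values, and weight normalization are illustrative assumptions for the sketch, not the book's specification.

```python
import numpy as np

def trace_rule_update(w, x, y_trace_prev, eta=0.8, alpha=0.01):
    """One step of a Hebbian update using a short-term memory trace.

    The postsynaptic trace mixes the current activation with the trace
    from the previous stimulus presentation, so features that occur close
    together in time (e.g. transforms of one object) strengthen the same
    synapses. Parameter names and values are illustrative, not VisNet's.
    """
    y = np.maximum(w @ x, 0.0)                      # postsynaptic firing (rectified)
    y_trace = (1.0 - eta) * y + eta * y_trace_prev  # short-term memory trace
    w = w + alpha * np.outer(y_trace, x)            # associative (Hebb-like) update
    w /= np.linalg.norm(w, axis=1, keepdims=True)   # keep weight vectors bounded
    return w, y_trace

# Example: present successive "transforms" of the same object
rng = np.random.default_rng(0)
w = rng.random((50, 100))
w /= np.linalg.norm(w, axis=1, keepdims=True)
y_trace = np.zeros(50)
for x in rng.random((2, 100)):                      # two transforms in sequence
    w, y_trace = trace_rule_update(w, x, y_trace)
```

Because the trace carries activity across successive presentations, transforms of the same object seen close together in time come to drive the same output neurons, which is how temporal continuity supports invariance learning.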
The approach described in this chapter is unifying in a number of ways. First, a set of simple organizational principles, involving a hierarchy of cortical areas with convergence from stage to stage, and competitive learning using a modified associative learning rule with a short-term memory trace of preceding neuronal activity, provides a basis for understanding much of the processing in the ventral visual stream, from V1 to the inferior temporal visual cortex. Second, the same principles help to explain some of the processing in the dorsal visual stream by which invariant representations of the global motion of objects may be formed. Third, the same principles, continued from the ventral visual stream onwards to the hippocampus, help to show how spatial view and place representations may be built from the visual input. Fourth, in all these cases, learning is possible because the system can extract invariant representations by utilizing the spatio-temporal continuities and statistics in the world that help to define objects, moving objects, and spatial scenes. Fifth, a great simplification and economy in terms of brain design is that the computational principles need not differ across the cortical areas in these hierarchical systems for some of the important properties of their processing to be realized. The principles of cortical operation illustrated here include the following. One is that advantage is taken of the statistics of inputs from the world to help learning, with, for example, temporal and spatial continuity being relevant. Another is that neurons need to learn to respond to non-linear combinations of their inputs, in the case of vision including their spatial arrangement, which is provided by the convergent topology from area to area of the visual cortex, using principles such as competitive learning. Another is that the system must be able to form sparse distributed representations with neurons that encode perceptual and invariance properties, so that the next stage of cortical processing can read the information using dot product decoding, as in a pattern associator, autoassociator, or competitive network. Another is the use of hierarchical cortical computation with convergence from stage to stage, which breaks the computation down into neuronally manageable steps. A further principle is breaking the computation down into manageable parts by, for example, not trying to analyze the whole of a scene simultaneously, but instead using successive fixations to objects in different parts of the scene, while maintaining in short-term memory a limited representation of the whole scene. In conclusion, we have seen in this chapter how a major form of perception, the invariant recognition of objects, involves not only the storage and retrieval of information, but also major computations to produce invariant representations. Once these invariant representations have been formed, they are used for many processes, including not only recognition memory (see Section 24.2.6), but also associative learning of the rewarding and punishing properties of objects for emotion and motivation (see Chapter 24), the memory for the spatial locations of objects and rewards (see Chapter 24), the building of spatial representations based on visual input (Section 24.3.11), and as an input to short-term memory (Section 4.3.1), attention (Chapter 6), and decision systems (Section 5.6).
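As an illustration of the competitive learning and sparse, dot-product-readable representations described above, here is a minimal NumPy sketch of a k-winners-take-all competitive step; the function name, the value of k, and the learning rate are assumptions made for the sketch, not values from the chapter.

```python
import numpy as np

def competitive_step(w, x, k=5, alpha=0.05):
    """One step of k-winners-take-all competitive learning.

    Neurons compute dot products with the input; the k most active
    neurons keep firing (a sparse distributed representation) and move
    their weight vectors toward the current input. Constants here are
    illustrative choices, not values from the chapter.
    """
    y = w @ x                                   # dot-product activations
    winners = np.argsort(y)[-k:]                # indices of the k most active
    out = np.zeros_like(y)
    out[winners] = y[winners]                   # sparse output pattern
    w[winners] += alpha * (x - w[winners])      # move winners toward the input
    w[winners] /= np.linalg.norm(w[winners], axis=1, keepdims=True)
    return w, out
```

The sparse output pattern can then be read by the next stage with another dot product, as in a pattern associator; repeated across a layered hierarchy with convergent connectivity, steps of this kind are the building blocks of a VisNet-style model.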
Loring W. Tu
- Published in print: 2020
- Published Online: January 2021
- ISBN: 9780691191751
- eISBN: 9780691197487
- Item type: chapter
- Publisher: Princeton University Press
- DOI: 10.23943/princeton/9780691191751.003.0013
- Subject: Mathematics, Educational Mathematics
This chapter explores integration on a compact connected Lie group. One of the great advantages of working with a compact Lie group is the possibility of extending the notion of averaging from a finite group to the compact Lie group. If the compact Lie group is connected, then there exists a unique bi-invariant top-degree form with total integral 1, which simplifies the presentation of averaging. The averaging operator is useful for constructing invariant objects. For example, suppose a compact connected Lie group G acts smoothly on the left on a manifold M. Given any C∞ differential k-form ω on M, by averaging all the left translates of ω over G, one can produce a C∞ invariant k-form on M. As another example, on a G-manifold one can average all translates of a Riemannian metric to produce an invariant Riemannian metric.
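In symbols, the averaging construction can be sketched as follows; the notation is one conventional choice, not necessarily the book's.

```latex
% Let dg be the unique bi-invariant top-degree form on G with
% \int_G dg = 1, and let \ell_g\colon M \to M, \ell_g(p) = g \cdot p,
% denote the action of g \in G. The average of a k-form \omega on M is
\[
  \bar{\omega} \;=\; \int_{g \in G} \ell_g^{*}\,\omega \; dg .
\]
% Invariance: for any h \in G,
\[
  \ell_h^{*}\,\bar{\omega}
  \;=\; \int_{g \in G} (\ell_g \circ \ell_h)^{*}\,\omega \; dg
  \;=\; \int_{g \in G} \ell_{gh}^{*}\,\omega \; dg
  \;=\; \bar{\omega},
\]
% where the last equality uses the right-invariance of dg under the
% change of variable u = gh.
```

The same averaging applied fiberwise to a Riemannian metric remains positive definite, since an average of inner products is again an inner product, which is why it yields an invariant Riemannian metric.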