Connectionism seems well suited to explaining perception, because the vast amount of data converging across a sensory surface such as the retina calls for parallel processing. Perceptrons were among the earliest connectionist models of pattern recognition, although they were not very successful in many complex domains. There were two main reasons for this failure. The first was that the role played by hidden units (arranged in multiple layers) in such complex processes had not yet been recognized. The second was that the perceptron convergence learning rule could not be applied to multilayer systems. Current approaches to explaining visual processes rely on von Neumann computation: they explain such processes in terms of the derivation of, and operations on, symbolic structures that represent visual information. This approach has led to an appreciation of vision as a set of subprocesses, each of which ideally requires its own theory. Connectionist systems appear to be appropriate for such a task.
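The limitation of the single-layer perceptron mentioned above can be made concrete with a minimal sketch. It trains one threshold unit with the perceptron learning rule on two toy problems: AND, which is linearly separable, and XOR, which is not and therefore cannot be solved without hidden units. The function names, the learning rate, and the epoch limit are illustrative assumptions, not taken from any particular published model.

```python
def predict(weights, bias, x):
    """Threshold unit: fires (1) when the weighted input sum exceeds zero."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train_perceptron(samples, epochs=50, lr=1.0):
    """Train a single unit with the perceptron rule.

    Returns (weights, bias, converged), where converged is True only if a
    full pass over the samples produced no classification errors.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in samples:
            delta = target - predict(weights, bias, x)
            if delta != 0:
                errors += 1
                # Move the weights toward (or away from) the misclassified input.
                weights = [w + lr * delta * xi for w, xi in zip(weights, x)]
                bias += lr * delta
        if errors == 0:
            return weights, bias, True
    return weights, bias, False


# Toy truth tables (illustrative data only).
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

w_and, b_and, and_converged = train_perceptron(AND)
w_xor, b_xor, xor_converged = train_perceptron(XOR)
print("AND learned:", and_converged)
print("XOR learned:", xor_converged)
```

Under these assumptions the rule settles on AND within a few passes, whereas on XOR every pass contains at least one error, however many epochs are allowed.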
One important area of visual perception concerns recognition and high-level vision. In high-level perceptual processing, the hierarchical use of information is a vital aspect. Pandemonium, an early example of a connectionist system, has a similar architecture. In Pandemonium, various processing demons are assigned to specialist tasks or subtasks. All the tasks are performed on a blackboard, which serves as a broad communication channel and so makes its contents accessible to all the demons. The contents of the blackboard change over time owing to the activity of the demons, and their combined activity helps in arriving at a solution. Pandemonium is a good demonstration that problems in visual analysis can be treated as forms of multiple simultaneous constraint satisfaction. Unlike low-level and intermediate-level perception, higher-level perception requires access to stored knowledge: the results of perceptual analysis must be compared with stored representations of categories in order to identify objects in the visual field. Thus the two main problems in this area are the derivation of adequate representations and the control of information flow.
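The blackboard arrangement described above can be sketched as a small loop in which each demon reads the shared blackboard and posts its own conclusions back to it. This is only a toy illustration under stated assumptions: the stroke encoding, the three demons, and the function `run_blackboard` are hypothetical and are not drawn from the original Pandemonium system.

```python
def run_blackboard(blackboard, demons, max_cycles=10):
    """Let every demon read and update the shared blackboard until it is stable."""
    for _ in range(max_cycles):
        changed = False
        for demon in demons:
            # Each demon sees the whole blackboard and returns entries to post.
            for key, value in demon(blackboard).items():
                if blackboard.get(key) != value:
                    blackboard[key] = value
                    changed = True
        if not changed:
            break
    return blackboard


# Hypothetical specialist demons for a toy letter-recognition task.
def vertical_demon(bb):
    return {"has_vertical": "|" in bb["strokes"]}


def horizontal_demon(bb):
    return {"has_horizontal": "-" in bb["strokes"]}


def letter_demon(bb):
    # Higher-level demon: relies on what the feature demons have posted.
    if bb.get("has_vertical") and bb.get("has_horizontal"):
        return {"letter": "T"}
    if bb.get("has_vertical"):
        return {"letter": "I"}
    return {}


demons = [letter_demon, vertical_demon, horizontal_demon]
result_t = run_blackboard({"strokes": ["|", "-"]}, demons)
result_i = run_blackboard({"strokes": ["|"]}, demons)
print(result_t["letter"], result_i["letter"])
```

The letter demon is deliberately listed first, so it can only reach a decision on a later cycle, once the feature demons have written to the blackboard. The solution emerges from the combined activity of the demons and not from any single one of them.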
One important characteristic of the visual system is that its representations are independent of viewing conditions (viewing angle, viewing distance, location in the visual field). Object recognition therefore requires a stable shape description that is independent of viewpoint, which in turn means that a canonical coordinate system must be established within the object before its shape is described. Hinton (1981) devised a network that derives object-centered representations for simple two-dimensional patterns. The network had four types of units: image-based feature units, object-based feature units, letter units and mapping units. An input activates a subset of image-based feature units, which leads to simultaneous activity in several mapping units, which in turn activate object-based feature units. Over time, the network relaxes to a state in which only one object unit and one mapping unit are active.
While Hinton's model demonstrates a few examples of multiple constraint satisfaction, one of the main difficulties faced by any connectionist system is that of binding the right features to their objects. The binding question is complex, and one solution is to bind by location. In more complex cases, however, the number of units required to represent all possible relations between all the components making up an object increases exponentially, which makes the binding problem very serious. Further difficulties arise when objects overlap. To overcome this problem, Hummel & Biederman (1992) proposed an approach that depends on synchrony in the activation of units in a network. In this approach, units whose activation varies together are bound together. Hummel & Biederman's (1992) work incorporates these features to model object recognition. Their model has seven layers of units, and each layer deals with a different class of visual or spatial feature.
The model also uses a second type of connection between units, called fast enabling links (FELs). Signals travelling along FELs between two active units propagate faster than the activation cycle, which helps the active units achieve synchrony. Arranging FELs among units that should or could be bound together solves the binding problem in the model: units representing features with a high probability of belonging together in the real world are linked by FELs. The model thus has a very efficient mechanism for binding, produces invariant representations, and is a good demonstration of constraint analysis.
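The idea of binding by synchrony can be illustrated with a deliberately simplified sketch. It does not reproduce the actual dynamics of Hummel & Biederman's network. Here each active unit simply starts with an arbitrary firing phase, and a FEL forces the two active units it connects onto a shared phase. The unit names, the phase values, and the functions `synchronize` and `bound_groups` are all hypothetical.

```python
def synchronize(initial_phases, fels, active):
    """Pull FEL-linked active units onto a common firing phase.

    initial_phases: dict mapping unit name -> arbitrary starting phase
    fels:           iterable of (unit_a, unit_b) fast enabling links
    active:         set of currently active unit names
    """
    phase = {unit: initial_phases[unit] for unit in active}
    changed = True
    while changed:
        changed = False
        for a, b in fels:
            # A FEL only has an effect when both of its units are active.
            if a in phase and b in phase and phase[a] != phase[b]:
                phase[a] = phase[b] = min(phase[a], phase[b])
                changed = True
    return phase


def bound_groups(phase):
    """Group units that fire in the same phase; each group is one binding."""
    groups = {}
    for unit, p in phase.items():
        groups.setdefault(p, set()).add(unit)
    return sorted(sorted(group) for group in groups.values())


# Illustrative scene: two objects are present, a third unit is inactive.
initial_phases = {"cone_axis": 0.1, "cone_base": 0.7,
                  "brick_edge": 0.4, "brick_face": 0.9,
                  "cylinder_side": 0.5}
fels = [("cone_axis", "cone_base"),
        ("brick_edge", "brick_face"),
        ("brick_face", "cylinder_side")]
active = {"cone_axis", "cone_base", "brick_edge", "brick_face"}

phase = synchronize(initial_phases, fels, active)
groups = bound_groups(phase)
print(groups)
```

In this sketch the features of each object end up firing together and out of phase with the other object. No dedicated unit is needed for every possible feature combination, which is the combinatorial cost that binding by synchrony is meant to avoid.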
Connectionist networks also demonstrate the relation between representation and action. Martinetz, Ritter & Schulten's (1990) model of the visuomotor coordination of a robot arm is one example of such a relation. The three-joint arm had to learn to use information provided by two cameras in order to reach objects in a three-dimensional workspace. The cameras were used to determine the retinal coordinates of an object in the workspace, and each pair of retinal coordinates, one from each camera, was combined into a four-component vector. Learning involved adapting the network's values by applying the delta learning rule. The model is an example of the close coupling of visual representation and action.
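The delta rule itself can be sketched in a few lines. The example below is not the robot-arm model described above. It only shows the rule adapting a linear map from a four-component input vector to three output values, standing in for camera coordinates and joint settings. The target mapping `TRUE_W`, the random training data, the learning rate, and the function names are assumptions made purely for illustration.

```python
import random


def delta_rule_step(W, x, target, lr=0.1):
    """Apply one delta-rule update; return the squared error before the update."""
    output = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    error = [t - o for t, o in zip(target, output)]
    for row, e in zip(W, error):
        for j, xj in enumerate(x):
            # Delta rule: weight change = learning rate * error * input.
            row[j] += lr * e * xj
    return sum(e * e for e in error)


def train(samples, n_out=3, n_in=4, epochs=200, lr=0.1):
    """Train from zero weights; return the weights and mean error per epoch."""
    W = [[0.0] * n_in for _ in range(n_out)]
    history = []
    for _ in range(epochs):
        total = 0.0
        for x, target in samples:
            total += delta_rule_step(W, x, target, lr)
        history.append(total / len(samples))
    return W, history


# Hypothetical linear relation between camera vector and joint settings.
TRUE_W = [[0.5, -0.2, 0.1, 0.3],
          [-0.4, 0.6, 0.2, -0.1],
          [0.3, 0.1, -0.5, 0.4]]

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = []
for _ in range(50):
    x = [rng.uniform(-1, 1) for _ in range(4)]  # (x1, y1, x2, y2) from two cameras
    target = [sum(w * xi for w, xi in zip(row, x)) for row in TRUE_W]
    samples.append((x, target))

W, history = train(samples)
print("first-epoch error:", history[0])
print("last-epoch error:", history[-1])
```

Because the toy target is exactly linear and noise-free, the error shrinks toward zero. A real arm has a nonlinear relation between camera coordinates and joint angles, so this sketch shows only the form of the learning rule.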