How our brain learns to recognise objects
Understanding how the brain recognises objects is a central challenge for understanding human vision, and for designing artificial vision systems. (No computer system comes close to human vision.) A new study by MIT neuroscientists suggests that the brain learns to solve the problem of object recognition through its vast experience in the natural world.
Take, for example, a dog. It may be sitting nearby or far away, or standing in sunshine or shadow. Although each variation in the dog’s position, pose or illumination produces a different pattern of light on the retina, we still recognise it as a dog.
One possible way to acquire this ability to recognise an object, despite these variations, is through a simple form of learning. Objects in the real world usually do not suddenly change their identity, so any two patterns appearing on the retina in rapid succession likely arise from the same object. Any difference between the two patterns probably means the object has changed its position rather than having been replaced by another object. Therefore, by simply learning to associate images that appear in rapid succession, the brain might learn to recognise objects even if viewed from different angles, distances, or lighting conditions.
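As a rough illustration of this idea (a toy sketch, not the model or analysis used in the study), the rule "associate whatever appears in rapid succession" can be written in a few lines of Python. The view labels, the blank frames (`None`) separating exposures and the function names are all assumptions made for the example.

```python
from collections import Counter

def learn_associations(frames):
    """Count how often two different views appear in consecutive frames."""
    counts = Counter()
    for previous, current in zip(frames, frames[1:]):
        # Blank frames (None) break the temporal link between exposures.
        if previous is None or current is None or previous == current:
            continue
        counts[frozenset((previous, current))] += 1
    return counts

def strongest_partner(counts, view):
    """Return the view most often seen immediately before or after `view`."""
    partners = Counter()
    for pair, n in counts.items():
        if view in pair:
            (other,) = pair - {view}
            partners[other] += n
    return partners.most_common(1)[0][0] if partners else None

# Hypothetical experience: in the normal world a small dog grows into a
# large dog; in the altered world it is swapped for a large rhino.
normal = ["dog_small", "dog_large", None, "rhino_small", "rhino_large", None] * 50
altered = ["dog_small", "rhino_large", None, "rhino_small", "dog_large", None] * 50

print(strongest_partner(learn_associations(normal), "dog_small"))   # dog_large
print(strongest_partner(learn_associations(altered), "dog_small"))  # rhino_large
```

With ordinary experience the learner links the small dog to the large dog. If the temporal regularity is broken, the same rule links the small dog to the large rhino instead.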
Making monkeys… of monkeys
To test this idea, called ‘temporal contiguity,’ graduate student Nuo Li and associate professor James DiCarlo at MIT’s McGovern Institute for Brain Research ‘tricked’ monkeys by exposing them to an altered visual world in which the normal rules of temporal contiguity did not apply. They recorded electrical activity from individual neurons in a region of the monkey brain called the inferior temporal cortex (IT), where object recognition is thought to happen. IT neurons respond selectively to particular objects; a neuron might, for example, fire more strongly in response to images of a Dalmatian than to pictures of a rhinoceros, regardless of its size or position on the retina.
In the new study, which appears in the 23 September issue of Neuron, monkeys observed an object on a computer screen as the object became larger or smaller, as though it were approaching or receding from view. However, in some cases, the researchers replaced an object with another as it changed in size. For example, as a Dalmatian became larger on the screen, it suddenly transformed into a rhinoceros.
‘We know that IT neurons are involved in object recognition, so our prediction was that these neurons would become confused,’ explains Li. ‘By exposing them to this artificial visual experience, we undermined the regularities that we hypothesised teach neurons to recognise the object at multiple sizes.’
Neurons become confused
After a few hours, each IT neuron did indeed become confused. For example, a neuron that preferred a dog to a rhino (regardless of size) began to lose this preference specifically among large dogs and large rhinos (the size at which the temporal contiguity rules had been broken by the researchers). In some cases, the object preferences even started to reverse, and the neuron would begin to prefer large rhinos to large dogs. In other words, the altered visual experience was not merely degrading existing patterns of selectivity but also creating new ones. The new results offered the strongest support yet for the ‘temporal contiguity’ hypothesis of object representation.
‘Our monkeys saw only a few hundred examples of these altered visual stimuli during the experiment,’ says DiCarlo. ‘If we extrapolate to a lifetime of visual experience, we think this effect is a major contributor to object constancy.’
In earlier work, Li and DiCarlo had shown a similar effect when object identities were switched during eye movements. ‘But we couldn’t tell whether that result was peculiar to eye movements – for example, you don’t need eye movements to observe a large image of a car changing into a small image of the same car,’ says DiCarlo. ‘Our new results strongly suggest that this is a general mechanism for learning about object identity across a wide range of real-world conditions.’
Monkeys become confused
The researchers did not measure the monkeys’ behaviour during the new study, so it is not known whether the monkeys themselves became confused by the altered visual experience. However, similar effects have been seen in human studies, and it seems likely that the changing pattern of neural activity would lead to changes in perceptual judgments. DiCarlo and colleagues plan to test this in future studies.
Source: ‘Unsupervised natural visual experience rapidly reshapes size invariant object representation in inferior temporal cortex,’ by Li N, DiCarlo JJ. Neuron, 23 Sept 2010.
Funding: McGovern Institute for Brain Research at MIT, McKnight Endowment Fund for Neuroscience, and NIH.
Caption: Neurons in the inferior temporal cortex (IT) confuse the identity of two different objects (in this case, a BMW Z3 and a Volkswagen Beetle) after repeated exposure to the two objects in rapid succession. (Image: James DiCarlo)

