RESEARCH TEAM
Principal investigators: Marcelo Bertalmío (Institute of Optics, CSIC); Jesús Malo López (Department of Optics, Physics Faculty, University of Valencia); Felix Wichmann (Department of Computer Science, Neural Information Processing Group, University of Tübingen, Germany)
Team members: Raúl Luna (Institute of Optics, CSIC), Javier de la Portilla Muelas (Institute of Optics, CSIC), Ilias Rentzeperis (Institute of Optics, CSIC)
DESCRIPTION
The last few decades have brought spectacular advances in the field of artificial vision, like the development of self-driving vehicles or robots that can assist surgeons in the most delicate operations. Yet the reality is that today’s artificial vision systems – based on neural networks trained to find patterns in large databases – still present serious shortcomings.
They often, for instance, make simple mistakes a human being would never make: “You show them a chair and they recognize it as a chair, and then you rotate it by a small angle, and suddenly they mistake it for an elephant,” remarks Felix Wichmann, leader of the Neural Information Processing Group in the Department of Computer Science at the University of Tübingen (Germany).
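This rotation sensitivity is easy to probe in practice. The sketch below is a minimal illustration only, assuming a pretrained torchvision ResNet-18 as a stand-in for "today's artificial vision systems" and a placeholder input tensor where a real photograph of a chair would go; it simply checks whether small rotations change the model's prediction.

# Minimal sketch: probe a pretrained classifier's sensitivity to small rotations.
# ResNet-18 is only an illustrative stand-in; the input below is a placeholder
# where a real, normalized photograph (e.g. of a chair) would be used.
import torch
import torchvision.models as models
import torchvision.transforms.functional as TF

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

def top_class(img_batch):
    """Return the index of the most probable class for a (1, 3, H, W) batch."""
    with torch.no_grad():
        return model(img_batch).argmax(dim=1).item()

image = torch.rand(1, 3, 224, 224)  # placeholder input for illustration only
original_label = top_class(image)

for angle in (5, 10, 15):  # rotations a human observer would barely register
    rotated = TF.rotate(image, angle)
    if top_class(rotated) != original_label:
        print(f"Prediction changed after a {angle}-degree rotation")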
At the same time, neural networks are prone to serious errors when an image changes in a way imperceptible to the human eye, with the tiniest variation in just a few pixels. The result can be monumental mix-ups, such as a vision network confusing a banana with a crocodile. This kind of error may pose critical risks to user safety in situations like autonomous driving, where a car can be easily misled: simply place a sticker on a stop sign and it can no longer properly interpret what it has to do. The goal of the project, selected in the Mathematics, Statistics, Computer Science and Artificial Intelligence area, is to try to overcome these limitations of neural networks.
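A standard way to construct such imperceptible pixel-level changes is a gradient-based (fast-gradient-sign style) perturbation. The sketch below is only an assumed illustration of that general technique, not the project's method: it again uses a pretrained torchvision ResNet-18 and a placeholder input tensor, and shows how a perturbation far below the threshold of human perception can flip the network's own prediction.

# Minimal sketch of the "few pixels" failure mode: a fast-gradient-sign style
# perturbation, invisible to the eye, that can flip a classifier's prediction.
# ResNet-18 and the random input are illustrative stand-ins only.
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

image = torch.rand(1, 3, 224, 224, requires_grad=True)  # placeholder input
label = model(image).argmax(dim=1)  # the network's own prediction on the clean image

# One gradient step in the direction that increases the loss for that prediction.
loss = F.cross_entropy(model(image), label)
loss.backward()
epsilon = 2 / 255  # perturbation magnitude well below what a human would notice
adversarial = (image + epsilon * image.grad.sign()).detach()

print("clean prediction:    ", label.item())
print("perturbed prediction:", model(adversarial).argmax(dim=1).item())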
“Neural networks need much more data to achieve the same level of visual performance as a human, or in some cases, even a duck or an insect,” says Marcelo Bertalmío, a scientific researcher at the Institute of Optics (CSIC). These constraints emerge, the project PIs explain, because today’s artificial vision systems are based on outdated neuroscience, with approaches developed back in the 1970s and 80s.
“There has been a lot of progress in the last 10 or 20 years showing that the human brain is actually much more complicated,” says Felix Wichmann. “That’s why our aim with this project is to transform the inner workings of neural networks based on what we now know about the brain’s complexity.”
For Jesús Malo, Professor of Optometry and Visual Sciences at the University of Valencia, a fundamental ability of the human visual system is what specialists in the field call “adaptation,” that is, “being equipped to deal in a meaningful way with new data we haven’t seen before.” Thanks to this ability to generalize and classify visual inputs on the basis of previous experience, “if we rotate a chair, we don’t make the silly mistake of thinking it’s an elephant instead.” The problem with current artificial vision systems, in his view, is that they draw on “overly simplistic” models of visual neurons.
The hope is that the multidisciplinary cooperation of this research team, with expertise in optics, neuroscience, computation and artificial intelligence, will yield models much closer to the biological workings of neurons in the human brain. That, in turn, should facilitate the design of more sophisticated, safer and more efficient systems, with applications not just in robotic vision but in other artificial intelligence fields such as text interpretation and generation.