Pablo Morales-Álvarez, Pablo Ruiz, Scott Coughlin, Rafael Molina and Aggelos K. Katsaggelos
SPANISH SOCIETY OF STATISTICS AND OPERATIONS RESEARCH (SEIO)-BBVA FOUNDATION AWARDS
Best Applied Contribution in Statistics
2023
For their paper “Scalable variational Gaussian processes for crowdsourcing: Glitch detection in LIGO,” published in IEEE Transactions on Pattern Analysis and Machine Intelligence; a “very innovative” piece of work, according to the award committee.
CONTRIBUTION
Gravitational waves have opened up a new way of exploring the Universe since they were first detected in 2015 by the LIGO observatory (United States), confirming one of Albert Einstein’s predictions in his theory of general relativity. Many of the signals received by this observatory, however, are not gravitational waves but noise patterns produced by movements of the Earth or other external phenomena.
To aid in their identification, researcher Pablo Morales Álvarez, Assistant Professor of Statistics and Operations Research at the University of Granada, together with his team, developed an AI system able to distinguish between gravitational waves and other noise patterns from among the signals picked up by LIGO. Their findings were written up in the paper “Scalable variational Gaussian processes for crowdsourcing: Glitch detection in LIGO,” published in IEEE Transactions on Pattern Analysis and Machine Intelligence.
“We use a statistical technique known as the Gaussian process, which permits analysis of a labelled data set, in this case signals; some of them labelled as gravitational waves and others as noise,” the researcher explains.
The team also made use of a novel technique in statistics known as crowdsourcing: “These algorithms need a very large data set to learn from, and expert gravitational wave physicists don’t have the time to annotate that much data. So we turned to volunteers trained to distinguish gravitational waves from noise, who managed to annotate a very large set,” Morales explains. Specifically, more than 30,000 volunteers took part in the scheme, providing more than seven million annotations for upwards of one million signals.
“We now have access to mountains of data across multiple areas, so these crowdsourcing techniques can be applied in many fields.” Right now, he says, “we are using them to detect cancers in biopsy images. Pathologists don’t have time to label thousands of digitized biopsy images, so we have entrusted the labelling to medical student volunteers. The idea is to use the data set to make predictions, with the ultimate goal of developing a support system for the medical diagnosis of this type of condition.”