Literature-based AI system could accelerate cancer discoveries

Charlotte Edwards 28 November 2018 (Last Updated December 23rd, 2019 10:23)

Computer scientists and cancer researchers at the University of Cambridge in the UK have created an AI-driven literature-based discovery system to aid scientists in search of cancer-related discoveries.

These skin cancer cells from a mouse show how cells attach at contact points. Credit: NIH Image Gallery.

The system, called LION LBD (Literature-Based Discovery), is designed to save researchers the hours they would otherwise spend manually sifting through mountains of published research.

LION LBD is the first literature-based discovery system aimed at supporting cancer research. An article detailing the system’s results has been published in the journal Bioinformatics.

As global cancer research continues to grow, so does the supporting scientific literature, which is now so vast that researchers struggle to find the information relevant to their own studies.

Professor Anna Korhonen, co-director of the Cambridge Language Technology Lab and one of the leaders of the LION LBD development, said: “As a cancer researcher, even if you knew what you were looking for, there are literally thousands of papers appearing every day. LION LBD uses AI to help scientists keep up-to-date with published discoveries in their field, but could also help them make new discoveries by combining what is already known in the literature by making connections between sources that may appear to be unrelated.”

Literature-based discovery is an approach that aims to make new discoveries by combing through, and connecting, information from otherwise disconnected sources.

The design of the new system enables users to discover indirect associations between studies in a database of tens of millions of publications, while also allowing each text to be explored in its original context.

“For example, you may know that a cancer drug affects the behaviour of a certain pathway, but with LION LBD, you may find that a drug developed for a totally different disease affects the same pathway,” Korhonen said.
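The kind of indirect association Korhonen describes can be illustrated with a toy sketch. The Python below is not LION LBD's method or API. It is a minimal example of the general idea, often called the ABC pattern in literature-based discovery, in which two concepts that never appear together are linked through a shared intermediate concept. The paper data, concept names and function names are all hypothetical, invented here for illustration.

```python
from collections import defaultdict

# Hypothetical toy corpus: each "paper" is the set of concepts it mentions.
papers = [
    {"drug_A", "pathway_X"},     # a cancer drug known to affect pathway X
    {"pathway_X", "cancer_Y"},   # pathway X is studied in cancer Y
    {"drug_B", "pathway_X"},     # a drug for another disease also affects pathway X
    {"drug_B", "disease_Z"},     # the disease drug B was developed for
]

def build_cooccurrence(papers):
    """Map each concept to the set of concepts it co-occurs with in any paper."""
    graph = defaultdict(set)
    for concepts in papers:
        for a in concepts:
            for b in concepts:
                if a != b:
                    graph[a].add(b)
    return graph

def indirect_links(graph, start):
    """Find concepts never mentioned alongside `start` but reachable through
    one shared intermediate concept. Returns {concept: intermediates}."""
    direct = graph[start]
    found = defaultdict(set)
    for mid in direct:
        for end in graph[mid]:
            if end != start and end not in direct:
                found[end].add(mid)
    return dict(found)

graph = build_cooccurrence(papers)
links = indirect_links(graph, "drug_B")
for concept, via in sorted(links.items()):
    print(f"drug_B -> {sorted(via)} -> {concept}")
```

In this toy corpus, drug_B is never mentioned in the same paper as cancer_Y, yet the two are connected through pathway_X. A real system would also need to extract the concepts from text and rank the candidate links, which this sketch does not attempt.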

LION LBD has a particular focus on the molecular biology of cancer and uses machine learning and natural language processing techniques to detect references to the hallmarks of cancer in the text. The system has demonstrated an ability to identify undiscovered links and rank relevant concepts highly among potential connections.

The technology is available as an interactive web-based interface or a programmable API.

The researchers are currently trying to extend the scope of LION LBD and are working closely with cancer researchers to improve the technology for the end users.