Looking for a tiny needle in an extremely large haystack

FOR most people, medical research conjures up images of doctors, stethoscopes and white coats. But for my colleagues and me in Oxford University’s new Big Data Institute, medical research involves nothing but computers.

For centuries, medical discovery meant studying individual patients in great detail.

But in the 1950s, the pioneering work of Oxford’s Sir Richard Doll, who first proved the link between smoking and cancer, demonstrated the value of collecting information on large numbers of individuals.

These days, studies such as UK Biobank or the Million Women Study, both Oxford-led initiatives, have hundreds of thousands or even millions of participants.

What’s changed in the last few years is the scale on which information on individuals can be collected. For example, we have been sequencing the genomes of children with undiagnosed genetic disorders, such as crippling skeletal malformations or severe epilepsy, to try to identify the cause of their condition.

A genome sequence is ‘big data’; each genome generates information equivalent to 200 hours of video and would take the entire memory of an iPod to store.

Typically, we’re looking for one genetic change in a genome of three billion letters. That’s a very small needle in a very large haystack. Not surprisingly, making this work requires computing – and computing on a large scale.

But when the search is successful, it can change the lives of the families affected. Not only does it provide an explanation for the family, it can tell us about how likely the parents are to have another affected child and even suggest new ways of treating the disease.

Genome sequencing is just one source of big data, and information from all such sources is being collected on increasingly large numbers of patients.

Imaging information, such as advanced MRI scans that show activity within the brain, is becoming widespread in medical diagnosis. Storing such information on a large scale and then mining it to understand why people get diseases and how best to treat them is a major headache.

But storing such data is just part of the problem. There simply aren’t the statistical and computational methods out there to help us make sense of everything we can measure.

This is where Oxford University’s new initiative comes in. We are working to bring together experts who are good at collecting, storing and analysing these vastly complex sources of information in a new Big Data Institute. The hope is that many new medical discoveries will come from such integration.

Already, we’re beginning to identify new genetic causes for both rare and common diseases.

We’re finding ways of identifying patient groups that will respond well to a drug and others that are likely to have side effects.

And further afield, the new institute will help in the fight against malaria and other infectious diseases, for example in actively monitoring where resistance to frontline drugs is spreading.

This is an important time for big data in medicine. There is much to discover and much to get right. But I firmly believe that Oxford University will play a big role in demonstrating the medical benefits that combining all this information will bring for us all.

Two public consultation events to provide information on plans for a building to house the new Big Data Institute on the University of Oxford’s Old Road Campus in Headington took place on March 28 and 29. An amenities building for the whole Old Road Campus is also being proposed.
