Pete Gething is head of the Malaria Atlas Project (MAP), Associate Professor at the Oxford University Nuffield Department of Medicine and Director of the World Health Organization (WHO) Collaborating Centre in Geospatial Disease Modelling. After our partial relocation to the new Big Data Institute, where MAP is now permanently based, I had a chance to catch up with him and learn all about maps – why they are such a powerful tool and the challenges and risks involved in getting them right.
Hi Pete, tell me about the WHO Collaborating Centre that you direct.
We have always tried to build our research around providing evidence that can directly influence malaria control policy. Some work back in 2015 nicely exemplifies this and the importance of working with WHO as a formal WHO Collaborating Centre in Geospatial Disease Modelling.
Global financing for malaria control was in jeopardy in about 2009 because of the financial crash. Policy makers at the governmental level needed to be able to point to solid evidence that their previous investment was having an impact. They needed to understand the bigger picture of whether interventions were working and we were well-placed to fill that gap.
We had already accumulated a lot of data on malaria prevalence and coverage of malaria interventions such as the distribution of bed nets, antimalarial drugs, and insecticide sprays. Our challenge was to put together an analytical framework to synthesise all this information to understand whether the pattern of malaria risk had changed and if we could – at least in a statistical sense – point to a causal link between changing prevalence and interventions.
Our analysis, published in Nature, showed the huge impact that the distribution and use of bed nets had on decreasing malaria prevalence. In partnership with the WHO we visited the Houses of Parliament to speak with the Department for International Development and other policymakers to make the case that their investment was directly translating into saving lives on a massive scale. This was one part of a successful advocacy campaign that maintained the commitment of the UK government to malaria control.
What’s the nature of your formal partnership with the WHO?
We’ve had a long-standing history of close collaboration with the WHO. We have been providing many of the statistics that are included in the annual World Malaria Report, the go-to document for latest updates on malaria globally. But being a collaborating centre actually makes it easier for the statistics we generate to become officially adopted as United Nations sanctioned numbers.
It’s obviously a great recognition of our efforts too, and it’s good to see that what we’re doing is useful to policy makers. It’s an unambiguous stamp that our work is being taken up and is influencing policy. This also helps when we approach third parties such as individual countries or other organisations to gather data: the WHO is a trusted institution that many are already collaborating with.
What is it about maps that makes them such a powerful tool?
It may sound a bit cliché but a picture does paint a thousand words, doesn’t it? We are displaying a massive amount of information: our global map has around 5-6 million pixels on a 1 km × 1 km grid. That’s 5-6 million little pieces of information displayed in just one single image! I think the human brain is well wired to look at a map and immediately understand it, by taking in the variation, the areas of high and low values, and the texture of the patterns.
Maps are a powerful tool to get a lot of information across in a very condensed way. Having said this, the flipside is that people have an inherent desire to trust a map as soon as it’s seen, so you have to be right in what you are saying, you have to be sure you are doing a good job and the work is robust. So the founding principle of MAP has been to not just make maps but to make sure they are as robust and as correct as they possibly can be, which means that we need to understand all the uncertainty within them.
Quality is obviously very important. How do you make sure you achieve it?
To do it well we need a lot of data, and the more empirically based the data are, the more we are going to be able to trust them. We put a lot of resources into getting as much data as we can and into overcoming the political and technical challenges in assembling data from across all the different malaria-endemic countries. We have a group called the Repository of Open Access Data for the Malaria Atlas Project (ROADMAP) which is purely dedicated to obtaining data for downstream MAP modelling. ROADMAP pulls in data on prevalence, intervention use, the environment, and so on, and does all the necessary formatting, standardising and curating to build modern, fit-for-purpose database infrastructures. A key aspect is that every data point is tied to a time and a place. There is a huge pipeline of work that needs to happen before we begin any modelling.
The main analytical framework of our modelling work is based on Bayesian geostatistics. Whilst it’s quite easy to put data into a simple bit of software and generate a smoothed map, understanding what the variation in the data looks like, how it’s structured in time and space, and assessing how much of that variation the model explains is a lot harder. We’re transparent about uncertainty in our models, and in fact, understanding this uncertainty is another of the driving principles of the whole operation. So we have another group of methodologists who have been primarily concerned with developing our models for several years now.
Can you tell me a bit more about the different scales you’re working at? How does the work differentiate between them?
Our aim is to inform policy at different levels. On an international scale, our work can inform the prioritisation of resources between countries, for example through the WHO or Gates Global Fund. But we’re also increasingly focussing on operations that are going on within a country, such as the use and amount of bed nets needed in different districts and other local-level decisions.
For that, we increasingly need to integrate different types of data. Case reporting, for example, tells us the number of people diagnosed with malaria at clinics, as collated by a Ministry of Health. Historically, we used this at quite an aggregate level, but this is not reliable if you are looking at very fine-scale trends. Increasingly we are able to access data collected at the level of the individual health facility. This gives you a latitude and longitude for a clinic and how many people with malaria came to the clinic in a given month. This spatially and temporally rich resource gives information for clinics around the country, but it has actually brought up a whole raft of methodological challenges in terms of how we can use these data.
One of the challenges is understanding your denominator population. Once you’ve got the number of people with malaria showing up at a clinic, you want to know which population these cases arose from. This then lets you change your count of malaria cases into a rate, which can be standardised and which allows you to move forward and map that rate across the country.
However, it turns out that this is exceptionally complicated to understand. You need to know which clinics people use and what the catchment area around a clinic is. And when you look at data about people’s decisions, about why they go and how they choose between a set of different clinics, the behaviours are really complicated and are influenced by many things which are often very unintuitive. People will walk past three clinics to go to the fourth. So understanding the proportion of cases in a population turns out to be a very difficult question to answer.
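As a rough illustration of the denominator problem described above, the sketch below turns monthly clinic case counts into rates per 1,000 people. It is not MAP's method: the clinic names, coordinates, case counts and catchment populations are all made-up values, and the `catchment_pop` field simply assumes that the hard part, knowing which population a clinic actually serves, has already been solved.

```python
from dataclasses import dataclass


@dataclass
class ClinicMonth:
    """One health-facility record for one month (all values hypothetical)."""
    name: str
    lat: float           # clinic latitude
    lon: float           # clinic longitude
    cases: int           # malaria cases reported that month
    catchment_pop: int   # ASSUMED denominator: people served by the clinic


def incidence_per_1000(cases: int, population: int) -> float:
    """Convert a raw case count into a rate per 1,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000.0 * cases / population


# Two clinics reporting the same number of cases (illustrative data only).
records = [
    ClinicMonth("Clinic A", -1.95, 30.06, cases=12, catchment_pop=4000),
    ClinicMonth("Clinic B", -2.10, 29.80, cases=12, catchment_pop=16000),
]

rates = {r.name: incidence_per_1000(r.cases, r.catchment_pop) for r in records}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} cases per 1,000 people")
```

The two clinics report identical counts, yet the rates differ fourfold because the assumed catchment populations differ. If people walk past three clinics to reach a fourth, the true denominator for each clinic is not simply the population living nearest to it, and every rate computed this way inherits that error.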
So you can’t just use a census estimate of how many people live in a place?
That’s right. Understanding where people live and which population they belong to is a big challenge. Andy Tatem’s group at the University of Southampton is working on the WorldPop Project, mapping populations down to a 200m by 200m resolution grid based on a combination of census data with satellite images and other things. This is brilliant work that gives you a static snapshot of where people live.
But it’s a whole other thing to understand how people behave. We don’t yet understand the reasons why people might seek healthcare at a specific clinic, or how far they travel to work or where they are exposed to malaria. For example, a person might go to work in a forest or coastal area but seek help in their home village or somewhere on the way. This can potentially cause a huge disconnect between the data we get and what’s actually going on. A lot of our work goes into trying to bridge that gap to figure out how we can convert from cases reported at facilities to malaria infection risk in the community.
What are the next big steps for malaria mapping and modelling?
We are certainly going to continue the global scale prevalence mapping, burden estimation trends, and intervention impact assessments. Although this is reasonably established now, it needs yearly updating to make sure the estimates are as accurate as possible. This will remain one of our cornerstones: it’s what MAP is all about. We will also continue to support the WHO with all the global numbers for malaria, and we have recently started to provide this information to the Institute for Health Metrics and Evaluation in Seattle.
The more cutting-edge research going on at MAP is more focused on the question of elimination in countries or sub-regions with very little malaria which will be able to eliminate the disease in the near future. This raises the question of what role MAP can take in supporting elimination programmes when local transmission is no longer an issue. This is rather a different concern than that of predicting risk across countries when the aim is merely to control the disease and observe overall trends.
As we try to eliminate malaria, we’ll observe the last 1,000 cases in countries. Then the last 500, 100, 10 and, ultimately, the last case. But this then becomes a kind of needle in a haystack problem, because if you don’t know where those parasites are, it’ll be very challenging to target your control and elimination activities precisely. There is a danger of having inefficient and expensive activities running to eliminate the last parasites.
This raises a completely different set of challenges about what types of data we need, how we map them and how we deal with the statistical aspects of getting data that are nearly all zero – so no malaria – when we know there are hidden pockets of transmission going on. This will need extensive new methodological development, and we will have to investigate how to exploit new information sources.
Can you give me an example of how you are tackling this?
We have a project in Haiti where during the dry season the vector species, Anopheles albimanus, has very specific breeding sites. These are brackish, slightly saline pools behind beaches that are often formed when a storm washes a load of sand inland, blocking parts of a stream or river. So clearly, knowing where these pools are, which changes every year, is important for mapping risk. Our old approach at the global scale of using environmental covariates from satellite imagery is just too coarse here, too crude. So we are now focussing much more on very, very high-resolution imagery to try and be very specific about picking out certain features of a landscape that then feed into our evaluation of risk.
We’re also investing a lot into exploring new types of malaria data, as traditional data sources may not be very well suited for this needle in a haystack problem. Serology, for example, can give that extra level of insight into past transmission, by identifying blood biomarkers associated with past infection, but it’s a far more complicated metric to understand, map, and standardise across populations.
Another approach is the work of MalariaGEN, who are using genetics and genomics to understand more about levels of transmission in a population, where infections are found and where they originated. This has the potential to tell us about the geographical flux of parasites through time and across areas, as well as how different populations of parasites are related.
Are you hoping your recent relocation to the Big Data Institute (BDI) in Oxford will help with some of these new challenges?
In simple terms, our work will require greater and greater volumes of data which will be more and more complex. At the same time, we have to keep improving our basic infrastructure, i.e. our ability to get data, and then store, curate, and disseminate it.
In addition, and more substantively on the algorithmic and scientific side, the issue is not simply how to make sense of increasing volumes of data. What’s much more challenging is to separate signal from noise, to understand which data are useful and which are not, and to triangulate different types of data that may represent very different things.
So many of our challenges are wrapped up in the kind of work that many groups at the BDI are involved with. There is a common thread in that we are all embracing similar challenges in terms of volume, complexity, and analytical power, which are the challenges of dealing with larger, more complex datasets. Moreover, this is only going to increase as we move forward, amassing not only actual malaria metric data on prevalence, serology, genomics, and so on, but also the existing and new kinds of environmental information that we increasingly rely on.
The new BDI building is a very good home for us. There is huge potential for collaboration as we are sitting alongside lots of people with techniques and experience we can draw on. And hopefully, we can have some expertise to send back the other way as well.
Learn more about the Malaria Atlas Project
Bhatt, Weiss et al. The effect of malaria control on Plasmodium falciparum in Africa between 2000 and 2015. Nature. 2015 Oct 8;526(7572):207-11. doi: 10.1038/nature15535. Epub 2015 Sep 16.