Our approach

Collecting samples in Kenya. Photo credit: Victoria Cornelius.

MalariaGEN was founded in 2005 on the belief that next-generation DNA sequencing tools and technologies would provide new opportunities to understand how genetic variation drives important aspects of malaria biology and epidemiology.

At the time, this approach was relatively untested in malaria research, even radical. The research community was fragmented and tended to pursue small, local studies. Variations in methods made comparisons between isolated studies difficult. Clinical and genomic data were not often brought together. The ethical implications of collecting and analysing genomic data were largely unexplored, particularly in low- and middle-income settings.

Pursuing our vision of large-scale genomic studies that integrate genetic data with clinical and epidemiological data would prove to be a complex and multi-faceted challenge.

Below you can learn a bit more about our history and how our collaborative research has evolved over the last decade. Our experience continues to inform our approach:


Recognising the network as the foundation of our scientific research

With initial support from the Wellcome Trust and the Foundation for the National Institutes of Health as part of the Bill & Melinda Gates Grand Challenges in Global Health Initiative, we focused on understanding how people develop natural immunity to malaria. Read more about our work in the Global Grand Challenges Project Retrospective on Protective Immunity (Kwiatkowski).

We wanted to address the question of why, in regions where people are repeatedly exposed to malaria parasites, some people die from the infection while others survive. 

The first step was to bring together researchers from a number of malaria endemic countries in a series of workshops to discuss and define how we might work together. The first of these meetings was held in Oxford, UK, in July 2005. Participating researchers, who subsequently became known as MalariaGEN Investigators, began to define the edges of our first two projects: Consortial Project 1 and Consortial Project 2.

These projects were the first to implement a partner study model, whereby we work with individual investigators who are pursuing independent research at specific locations. Today, several MalariaGEN projects are composed of partner studies, and each project defines partner studies in its own way. Partner studies remain an important means of recognising the scientific goals and work of our partners, as well as forming a cornerstone of our approach to data sharing.

This initial gathering and the subsequent meetings in France and Cameroon were critical opportunities to open a dialogue about the many practical and ethical considerations involved in collecting and sharing data across an international network.

Addressing ethical challenges helps to define our values

Our human consortial projects involve collecting tens of thousands of samples, sharing expertise, generating genome-wide data on millions of genetic variations, and curating clinical data (for example, gender, ethnicity, and parasitaemia) on severe malaria phenotypes. The scale and complexity of this research meant that we needed to agree on policies for managing shared resources and handling genetic data at an early stage.

This is particularly the case for human genetic data, where it is essential to preserve privacy and to safeguard the interests of the communities that participate in our research.

Another key concern to emerge from these early debates within our community was the need to ensure the sustainability of our research collaborations.

During the inaugural meeting in July 2005, an initial proposal for managing data sharing, intellectual property and publications was presented to the network. The proposal was refined and endorsed during this meeting, and it represents our first consensus on these issues. We’ve subsequently developed more specific guidelines, governing, for example, how samples and data are transferred among MalariaGEN partners and how data are released publicly.

A Programme Management Committee was formed to oversee the establishment of our human Consortial Project 1 and Consortial Project 2. As these projects matured, oversight was moved to a Governance Committee, along with the Publications and Presentations Committee and the Independent Data Access Committee.

Our deep experience in this arena taught us a great deal about the necessity of taking a considerate, balanced approach to collaborative research – something that is also reflected in the projects that we subsequently established to study the malaria parasites and the mosquitoes that transmit them.

We take seriously the trust placed in us by our partners, and acknowledge their legitimate interests in the data resources created through our collaborations. Each project defines its data release policies in a way that strikes a balance between maximising availability for the wider community and protecting the legitimate interests of contributors. These data release policies vary, but they are all designed to be equitable and appropriate, and to acknowledge the contributions of the researchers who conducted the original scientific study and generated the data.

In the process, we helped to define ethical best practices in seeking informed consent and sharing data in low- and middle-income countries, practices that are woven into our own research. We’ve also built lasting partnerships that continue to build significant data resources and advance our understanding of the genetics of malaria and, importantly, have helped to build research capacity in malaria-endemic countries.

Supporting capacity amongst multidisciplinary researchers

Capacity building is an important component of our work, and these efforts are also rooted in our early work studying the human genome. In 2006, we created a Data Bursary Scheme to support MalariaGEN Data Fellows to develop capacity for statistics and genetic data analysis. MalariaGEN Data Fellows often worked in malaria-endemic areas, helping to manage data for a specific partner study. This scheme closed in 2010, but we continue to assist our collaborators through scientific meetings and trainings, the provision of online tools, and one-on-one support. Through our collaboration with the MRC Centre for Genomics and Global Health and the P. falciparum Community Project, we also support the Plasmodium Diversity Network Africa (PDNA).

Building technical and analytical infrastructure

Our collaborators are at the heart of what we do, and our projects and their policies provide a framework for working together. With these foundations in place, the next step was to build the technical infrastructure to collect and manage data on a large scale.

We’ve pioneered sampling techniques, notably sequencing parasite DNA from small sample volumes, typically blood spots collected on filter paper, and we’ve developed methods to decrease contamination by human DNA when isolating parasite DNA from patient blood samples. We’ve also put in place informatics pipelines capable of genotyping many thousands of samples, and online tools to securely return these data to partners and the wider research community.

These tools have been particularly important as data production scaled up to include parasite and mosquito genomes, as well as human genomes.

Without these technical foundations, we would not have been able to produce the large data resources which we now have – and which are now providing new insights into the evolutionary battle between humans, malaria parasites and the mosquitoes that transmit them.

Designing tools to make data more accessible and drive discovery

As our data resources have grown in size and complexity, we’ve recognised the need to build tools that make these data more accessible to people with various levels of expertise and located in different parts of the world. A good portion of our effort in this arena has been focused on developing web-based applications.

One of our first tools to be made publicly available was LookSeq, which allows users to view DNA sequences aligned against a reference genome – a useful tool for browsing genetic variants in sequence reads.

We also developed a tool, ExplorerCat, which allowed people to explore single nucleotide polymorphisms (SNPs) in publicly available data. As an extension of this work, we developed My Genotypes, a tool that our partners can use to privately browse genotypes for the samples that they’ve submitted to MalariaGEN projects. Our My Studies software also helps our partners to track their samples and data in our systems.

These early tools informed our efforts to build a richer data tool that served genetic and genomic data alongside other information. MapSeq was our first attempt to integrate various types of related data, including geographical metadata and information on the community of researchers involved in generating the data. Although MapSeq didn’t survive, the vision did. Teaming with the MRC Centre for Genomics and Global Health, we first built a series of project-specific web applications (P. falciparum Community Project, P. falciparum Crosses) that were precursors to a new flexible software framework called Panoptes, which currently powers a number of our project-specific web applications (Ag1000G, Pf3k).