Insights into malaria susceptibility using genome-wide data on 17,000 individuals from Africa, Asia and Oceania

16 Dec 2019
Nature Communications 2019 10 5732 DOI: 10.1038/s41467-019-13480-z

This resource page accompanies the publication from Consortial Project 1 entitled "Insights into malaria susceptibility using genome-wide data on 17,000 individuals from Africa, Asia and Oceania", which describes a large multi-centre study of the genetic determinants of resistance to malaria, conducted by MalariaGEN partners from 12 sites in 11 countries. On this page we provide information about the partner studies, the data and how to access it, and the software that accompanies the manuscript.

The full citation for the manuscript is: Malaria Genomic Epidemiology Network. Insights into malaria susceptibility using genome-wide data on 17,000 individuals from Africa, Asia and Oceania. Nat Commun 10, 5732 (2019) doi:10.1038/s41467-019-13480-z 

Please see a full list of MalariaGEN Consortium members, past and present.

Background

A person’s risk of developing severe malaria is influenced by many different genetic and environmental factors, but relatively little is known about their precise nature and how they interact. In 2005, the Malaria Genomic Epidemiology Network, or MalariaGEN, was established (Malaria Genomic Epidemiology Network. A global network for investigating the genomic epidemiology of malaria. Nature 456, 732–737 (2008)). This consortium of researchers works together on large-scale genomic studies that integrate genetic data with clinical and epidemiological data, to address the question of why, in regions where people are repeatedly exposed to malaria parasites, some people die from the infection while others survive.

At the outset of the project, MalariaGEN investigators agreed on principles for sharing data and on standardised clinical definitions, and worked together to define best ethical practices across different local settings, including the development of guidelines for informed consent. These principles underlie the arrangements for access to data from the project, which are described below.

Content of the data release

The following describes the content of the data release that accompanies our Nature Communications publication.

This release contains details on contributing partner studies, samples included in the project, metadata, genomic data including raw sequence reads and genome-wide genotyping, a set of directly typed genotypes used for validation and replication, and several software packages developed for handling such data.

Datasets from Consortial Projects 1 and 3 are released through a managed process, described here, which includes information on applying for access to the data. In addition, some data is available under open-access terms, as noted below. Full details of data processing pipelines and analytical results can be found in the accompanying manuscript.

  • Genotype data from severe malaria affected individuals, population controls and other individuals (managed access):
    • Genome-wide Illumina SNP genotype data and association test results from our analysis of severe malaria in eleven populations (Gambia, Mali, Burkina Faso, Ghana, Nigeria, Cameroon, Tanzania, Malawi, Kenya, Vietnam and Papua New Guinea). The datasets available include raw Illumina Omni 2.5M genotype data from each population analysed.  In addition, we provide a set of processed data for a subset of samples that passed our quality control process, including phased and imputed SNP genotypes and a set of association test summary statistics.
    • Direct typing using the Sequenom MassArray (Agena Bioscience) platform of over 500 selected genetic variants.  Data is available on cases, controls and family trios from the above populations, whole-genome sequenced individuals from Consortial Project 3 and the Gambian Genome Variation Project, and a set of HapMap individuals.
    • HLA allele types for 32 Gambian trios: HLA allele calls obtained from Sanger sequencing of 32 Gambian severe malaria cases and their parents.
  • Illumina short-read whole-genome sequence data of six ethnic groups from Burkina Faso, Cameroon and Tanzania (managed access): This release contains six packages of Illumina whole-genome sequence data and Illumina Omni 2.5M genotype data for individuals from three African countries. Individuals were sampled either as nominally unrelated individuals (Burkina Faso) or as family trios (Cameroon and Tanzania).
  • Open-access Illumina short-read whole-genome sequence data - Gambian Genome Variation Project: The Gambian Genome Variation Project (GGVP) is a collaboration of the MRC Unit in The Gambia, the Wellcome Sanger Institute, the MRC Centre for Genomics and Global Health at Oxford University, and the MalariaGEN Resource Centre. The purpose of the project was to support the discovery and understanding of genetic variants that influence human disease.
  • Open-access association test and meta-analysis summary statistics: A set of summary statistics for association tests between severe malaria cases and population controls collected in eleven populations.

Publications using these data should acknowledge and cite the source of the data using the following format: 

"This study makes use of data generated by MalariaGEN. A full list of the investigators who contributed to the generation of the data is available from www.MalariaGEN.net. Funding for this project was provided by Wellcome Trust (WT077383/Z/05/Z) and the Bill & Melinda Gates Foundation through the Foundation of the National Institutes of Health (566) as part of the Grand Challenges in Global Health Initiative."

Software

As part of this project, several software packages were developed to handle, curate and analyse genome-wide data in general. These are provided as is; details of their use and acknowledgement are included on the individual software pages and in the packages.

  • QCTOOL (a package for manipulation of genome-wide genetic datasets)
  • SNPTEST (software for genome-wide association testing)
  • inthinnerator (a helper tool for LD thinning and annotation of GWAS datasets)
  • BINGWA (software for Bayesian and frequentist meta-analysis)
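For readers working with the per-population association summary statistics, the frequentist side of a meta-analysis can be illustrated with a minimal sketch of fixed-effect inverse-variance weighting. This is a generic textbook calculation and is not taken from BINGWA or any of the packages listed above; the function name `fixed_effect_meta` and the input values are hypothetical and chosen only for illustration. The sketch assumes each population contributes an effect size estimate (for example, a log odds ratio) and its standard error.

```python
import math


def fixed_effect_meta(betas, ses):
    """Combine per-population effect estimates by inverse-variance weighting.

    betas: effect size estimates (e.g. log odds ratios), one per population
    ses:   the corresponding standard errors
    Returns (combined beta, combined standard error, z score, two-sided p-value).
    """
    if len(betas) != len(ses) or len(betas) == 0:
        raise ValueError("betas and ses must be non-empty and of equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    # Each population is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)

    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided p-value under a standard normal approximation.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


if __name__ == "__main__":
    # Made-up values for three hypothetical populations (not real results).
    example_betas = [-0.30, -0.22, -0.41]
    example_ses = [0.10, 0.15, 0.20]
    beta, se, z, p = fixed_effect_meta(example_betas, example_ses)
    print(f"beta={beta:.4f} se={se:.4f} z={z:.3f} p={p:.3g}")
```

A fixed-effect model assumes the same true effect in every population. Where effects may differ between populations, random-effects or Bayesian models are generally more appropriate, and the listed software pages should be consulted for the methods actually used in the manuscript.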

Partner study information

  • Partner study information (PDF) (full details of partner studies, including study description, contact information and key people): In total, 12 study sites contributed samples, clinical data and expertise to this project.

 

Publications that have used the MalariaGEN Consortial data resource, prior to and including the upcoming publication

  1. Malaria Genomic Epidemiology Network. Insights into malaria susceptibility using genome-wide data on 17,000 individuals from Africa, Asia and Oceania. Nat Commun 10, 5732 (2019).
  2. Malaria Genomic Epidemiology Network, Band, G., Rockett, K.A., Spencer, C.C. & Kwiatkowski, D.P. A novel locus of resistance to severe malaria in a region of ancient balancing selection. Nature 526, 253-7 (2015).
  3. Malaria Genomic Epidemiology Network. Reappraisal of known malaria resistance loci in a large multi-centre study. Nature Genetics 46, 1197–1204 (2014).
  4. Shah, S.S. et al. Heterogeneous alleles comprising G6PD deficiency trait in West Africa exert contrasting effects on two major clinical presentations of severe malaria. Malar J 15, 13 (2016).
  5. Clarke, G.M. et al. Characterisation of the opposing effects of G6PD deficiency on cerebral malaria and severe malarial anaemia. Elife 6, e15085 (2017).
  6. Jallow, M. et al. Genome-wide and fine-resolution association analysis of malaria in West Africa. Nat Genet 41, 657-65 (2009).
  7. Band, G. et al. Imputation-based meta-analysis of severe malaria in three African populations. PLoS Genet 9, e1003509 (2013).
  8. Busby, G.B. et al. Admixture into and within sub-Saharan Africa. Elife 5, e15266 (2016).
  9. Leffler, E.M. et al. Resistance to malaria through structural variation of red blood cell invasion receptors. Science 356, eaam6393 (2017).
  10. Toure, O. et al. Candidate polymorphisms and severe malaria in a Malian population. PLoS One 7, e43987 (2012).
  11. Olaniyan, S.A. et al. Tumour necrosis factor alpha promoter polymorphism, TNF-238 is associated with severe clinical outcome of falciparum malaria in Ibadan southwest Nigeria. Acta Trop 161, 62-7 (2016).
  12. Apinjoh, T.O. et al. Association of cytokine and Toll-like receptor gene polymorphisms with severe malaria in three regions of Cameroon. PLoS One 8, e81071 (2013).
  13. Apinjoh, T.O. et al. Association of candidate gene polymorphisms and TGF-beta/IL-10 levels with malaria in three regions of Cameroon: a case-control study. Malar J 13, 236 (2014).
  14. Opi, D.H. et al. Two complement receptor one alleles have opposing associations with cerebral malaria and interact with alpha(+)thalassaemia. Elife 7 (2018).
  15. Ndila, C.M. et al. Human candidate gene polymorphisms and risk of severe malaria in children in Kilifi, Kenya: a case-control association study. Lancet Haematol 5, e333-e345 (2018).
  16. Shah, S.S. et al. Genetic determinants of glucose-6-phosphate dehydrogenase activity in Kenya. BMC Med Genet 15, 93 (2014).
  17. Uyoga, S. et al. Glucose-6-phosphate dehydrogenase deficiency and the risk of malaria and other diseases in children in Kenya: a case-control and a cohort study. Lancet Haematol 2, e437-44 (2015).
  18. Mackinnon, M.J. et al. Environmental Correlation Analysis for Genes Associated with Protection against Malaria. Mol Biol Evol 33, 1188-204 (2016).
  19. Kariuki, S.M. et al. The genetic risk of acute seizures in African children with falciparum malaria. Epilepsia 54, 990-1001 (2013).
  20. Muriuki, J.M. et al. The ferroportin Q248H mutation protects from anemia, but not malaria or bacteremia. Sci Adv 5, eaaw0109 (2019).
  21. Manjurano, A. et al. Candidate human genetic polymorphisms and severe malaria in a Tanzanian population. PLoS One 7, e47463 (2012).
  22. Manjurano, A. et al. USP38, FREM3, SDC1, DDC, and LOC727982 Gene Polymorphisms and Differential Susceptibility to Severe Malaria in Tanzania. J Infect Dis (2015).
  23. Manjurano, A. et al. African glucose-6-phosphate dehydrogenase alleles associated with protection from severe malaria in heterozygous females in Tanzania. PLoS Genet 11, e1004960 (2015).
  24. Sepulveda, N. et al. Malaria Host Candidate Genes Validated by Association With Current, Recent, and Historical Measures of Transmission Intensity. J Infect Dis 216, 45-54 (2017).
  25. Ravenhall, M. et al. Novel genetic polymorphisms associated with severe malaria and under selective pressure in North-eastern Tanzania. PLoS Genet 14, e1007172 (2018).
  26. Dunstan, S.J. et al. Variation in human genes encoding adhesion and proinflammatory molecules are associated with severe malaria in the Vietnamese. Genes Immun 13, 503-8 (2012).
  27. Manning, L. et al. A Toll-like receptor-1 variant and its characteristic cellular phenotype is associated with severe malaria in Papua New Guinean children. Genes Immun 17, 52-9 (2016).

Publications arising from the MalariaGEN Consortial data resource by external users

As described above, the Consortial Project 1 data are available through the European Genome-phenome Archive under a managed-access policy. A list of researchers who have been granted access to the data by an Independent Data Access Committee can be found here. Publications include:

  1. Loley Ch., et al. A unifying framework for robust association testing, estimation, and genetic model selection using the generalised linear model. European Journal of Human Genetics (2013). 21:1442-8. PMID:23572026. (https://doi.org/10.1038/ejhg.2013.62)
  2. Howie B., et al. Genotype imputation with thousands of genomes. G3: Genes, Genomes, Genetics (2011), 1:457-470. PMID:22384356. (https://doi.org/10.1534/g3.111.001198)
  3. Timmann, C., et al. Genome-wide association study indicates two novel resistance loci for severe malaria. Nature. (2012), 489:443-6. PMID:22895189. (https://doi.org/10.1038/nature11334)
  4. Gurdasani D., et al. The African Genome Variation Project shapes medical genetics in Africa. Nature. (2015). 517:327-32. PMID:25470054. (https://doi.org/10.1038/nature13997)
  5. Howey R., and Cordell HJ. Imputation without doing imputation: a new method for the detection of non-genotyped causal variants. Genetic Epidemiology. (2014) 38:173-190. PMID:24535679. (https://doi.org/10.1002/gepi.21792)

MalariaGEN Resource Page 25, version 0.2, 14 November 2019