Pf3k

Started 2014

Pf3k is an international collaboration using the latest sequencing technologies to provide a high-resolution view of natural variation in the malaria parasite Plasmodium falciparum.

Objectives & Coordination

The Pf3k project is led by researchers at the Broad Institute, the University of Oxford and the Wellcome Trust Sanger Institute.

Our primary goal is to undertake a comprehensive analysis of genome variation in 3,000 parasite samples representing the major malaria endemic regions of the world. In doing so, we'll:

  • Provide an open set of P. falciparum genome sequence data that captures common variation across multiple populations in different parts of the world
  • Use a combination of short- and long-read sequencing technologies in controlled settings to establish standards for accuracy and completeness in the inference of P. falciparum genome sequence variation and to characterise the quality of information obtained from standard approaches
  • Combine information from read-mapping, full de novo assembly, variant assembly and iterative reassembly of specific genes to obtain the most comprehensive resource on P. falciparum variation to date
  • Develop new high-quality reference genomes that will increase the resolution and accuracy of variation analysis across the whole sample set
  • Analyse the data to learn about parasite population structure, epidemiology and history, mutational and recombinational processes generating diversity, evolutionary processes including drug resistance and immune evasion, and how such phenomena differ between populations and regions

The primary output of the project will be an open access data resource with companion publications on genomic diversity and population genetics that together provide a detailed description of P. falciparum genome variation across the major malaria endemic regions.

Other outputs will include papers on methodology and on the standardisation of protocols for P. falciparum sequence analysis and genotype calling. All of the underlying data will be made publicly available for use by the scientific community, initially under Fort Lauderdale conditions.

Scientific working groups will drive forward specific areas of analysis including statistics and population genetics (led by Gil McVean and Roberto Amato), technology benchmarking (led by Dan Neafsey and Jim Stalker) and reference genomes (led by Matt Berriman and Thomas Otto). The MalariaGEN Resource Centre will provide support for partner studies, data production pipelines, communications and project management. The Project is overseen by the Pf3k Management Committee, which comprises the working group leaders, with support from members of the MalariaGEN Resource Centre.

Pilot phase

The Pf3k project will have several discrete phases, beginning with a pilot phase which commenced in June 2014. During the pilot phase, the Project is analysing Illumina short-read sequence data on 2,512 samples from multiple locations in Africa and Asia, together with laboratory samples for benchmarking and methods development. The MalariaGEN P. falciparum Community Project and the Broad Institute, together with their partners, have contributed the samples for the pilot phase. The Project will generate genotype calls by a range of different methods, compare those methods and report performance metrics for each.

Planned analyses

During the pilot phase, the Project is undertaking a series of planned analyses that will form the basis of a manuscript, 'A global reference for genomic variation in Plasmodium falciparum', using Pilot Phase data (2,512 samples). The analyses cover:

  • Sequence data and quality including SNPs, short tandem repeats, haplotypes and patterns of linkage disequilibrium
  • Population genetic phenomena such as population comparisons, mutation and recombination rates (haplotype structure and LD)
  • Signals of selection and demographic analyses
  • Merozoite surface proteins
  • var genes and genes implicated in drug resistance

Removing Pf3k Pilot Phase Terms of Use

The Pf3k Pilot Phase Terms of Use applied to Pilot Phase data releases when they were first made public. In September 2016 these restrictions were lifted from Pf3k pilot data release packages 1-5, and the data are now available open access.

Sampling locations

  • Bangladesh (BD)
  • Cambodia (KH)
  • Congo (Democratic Republic of the) (CD)
  • Ghana (GH)
  • Guinea (GN)
  • Laos (LA)
  • Malawi (MW)
  • Mali (ML)
  • Myanmar (MM)
  • Nigeria (NG)
  • Senegal (SN)
  • Thailand (TH)
  • The Gambia (GM)
  • Vietnam (VN)

Data

The Project will publicly release data on a regular basis and prior to publication. Raw sequence reads will be deposited in either the European Nucleotide Archive (ENA) or the NCBI. Alignments and variant calls will be released on individual samples, and data formats and software developed by the Project will be made publicly available. Associated sample information will be made available in the public domain through the MalariaGEN website and other public databases as appropriate. Public release of the data will be associated with contact information for the lead investigators that have contributed the samples.

At the time of their release, these data were subject to the Pf3k Pilot Phase Terms of Use. In September 2016, these restrictions were lifted and the data are now available open access.

Current

9 Feb 2016

Pf3k pilot data release 5

Species: P. falciparum

Sample set: 2,512 field isolates; 5 lab clonal samples; 96 crosses samples; 27 mixed lab strains

Sample information, accession numbers, analysis BAMs, and a set of de novo genotype calls, both indels and SNPs, built using a pipeline based on GATK best practices

16 Oct 2015

Pf3k pilot data release 4

Species: P. falciparum

Sample set: 2,517 samples from 14 countries and 5 lab strains (7G8, GB4, KH02, KE01, GN01)

Sample information, analysis BAMs, and de novo genotype calls built using a pipeline based on GATK best practices

14 Apr 2015

Pf3k pilot data release 3

Species: P. falciparum

Sample set: 2,512 samples from 14 countries

Sample information, accession numbers, and genotypes

Archive

6 Nov 2014

Pf3k pilot data release 2

Species: P. falciparum

Sample set: 1,931 samples from 12 countries

Sample information, accession numbers, and genotypes

13 Aug 2014

Pf3k 1.0 pilot data release

Species: P. falciparum

Sample set: 1,794 samples from 11 countries

Sample information and accession numbers
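
The later releases listed above include genotype calls covering both SNPs and indels. As a rough illustration of how calls of this kind can be summarised, the sketch below counts SNP and indel records in VCF-formatted text using only the Python standard library. It assumes the calls are in standard tab-delimited VCF; the function names and the tiny example records are hypothetical and are not taken from any Pf3k release.

```python
import io
from collections import Counter


def classify_variant(ref, alt):
    """Label a REF/ALT pair as 'snp', 'indel' or 'other'."""
    alt_alleles = alt.split(",")
    # Missing, spanning-deletion or symbolic ALT alleles are not classified here.
    if any(a in (".", "*") or a.startswith("<") for a in alt_alleles):
        return "other"
    alleles = [ref] + alt_alleles
    if all(len(a) == 1 for a in alleles):
        return "snp"
    if len({len(a) for a in alleles}) > 1:
        return "indel"
    return "other"  # e.g. multi-nucleotide substitutions of equal length


def count_variant_types(handle):
    """Count variant classes across the data lines of a VCF text stream."""
    counts = Counter()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        ref, alt = fields[3], fields[4]  # VCF columns 4 and 5
        counts[classify_variant(ref, alt)] += 1
    return counts


# Hypothetical records, for illustration only.
EXAMPLE_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr_example\t100\t.\tA\tT\t50\tPASS\t.\n"
    "chr_example\t250\t.\tAT\tA\t60\tPASS\t.\n"
    "chr_example\t400\t.\tG\tGTA,C\t40\tPASS\t.\n"
)

counts = count_variant_types(io.StringIO(EXAMPLE_VCF))
print(dict(counts))
```

A record with several ALT alleles is counted once, and is treated as an indel if any allele differs in length from the others. For real release files a dedicated VCF library would usually be a better choice than hand parsing.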

People

Investigators involved in the Pf3k pilot phase include:

Prof Abdoulaye Djimdé
Associate Professor of Parasitology and Mycology; Chief of the Molecular Epidemiology and Drug Resistance Unit
Malaria Research and Training Centre, University of Science, Techniques and Technologies of Bamako, Mali
Wellcome Trust International Fellow
Malaria Programme, Wellcome Trust Sanger Institute, UK
Dr Alfred Amambua-Ngwa
Medical Research Council Unit, The Gambia
Alistair Miles
Head of Epidemiological Informatics
Wellcome Trust Centre for Human Genetics, University of Oxford, UK
Prof Alister Craig
Professor of Molecular Parasitology and Dean of Biological Sciences
Liverpool School of Tropical Medicine, UK
Malawi-Liverpool-Wellcome Trust Clinical Research Programme
Prof Arjen Dondorp
Mahidol Oxford Tropical Medicine Research Unit (MORU), University of Oxford, Thailand
Dr Brian Brunk
EuPathDB Project Manager
University of Pennsylvania, USA
Prof Chris Newbold
Weatherall Institute of Molecular Medicine, University of Oxford, UK
Dr Dan Neafsey
Associate Director, Genomic Center of Infectious Disease
Broad Institute, USA
Visiting Scientist
Harvard T. H. Chan School of Public Health
Prof Dominic Kwiatkowski
Professor of Genomics and Global Health
Wellcome Trust Centre for Human Genetics, University of Oxford, UK
Head of Malaria Programme
Wellcome Trust Sanger Institute, UK
Professorial Fellow in Genomics and Global Health
St John's College, University of Oxford, UK
Prof Gilean McVean
Group Head / PI and Unit Director and Acting Director of the Oxford Big Data Institute
Wellcome Trust Centre for Human Genetics, University of Oxford, UK
Professor of Statistical Genetics and University Lecturer in Mathematical Genetics
Department of Statistics, University of Oxford, UK
Dr Gordon Awandare
Senior Lecturer and Head, Department of Biochemistry, Cell and Molecular Biology
University of Ghana, Legon, Ghana
Director
West African Centre for Cell Biology of Infectious Pathogens (WACCBIP)
Dr John O'Brien
Assistant Professor of Mathematics
Bowdoin College, Brunswick, USA
Dr Joe Zhu
Post-doctoral researcher
Wellcome Trust Centre for Human Genetics, University of Oxford, UK
Dr Matt Berriman
Senior Group Leader
Wellcome Trust Sanger Institute, UK
Prof Nicholas J White
Professor of Tropical Medicine and Chair of Wellcome Trust SE Asian Tropical Medicine Research Programmes
Mahidol Oxford Tropical Medicine Research Unit (MORU), University of Oxford, Thailand
Prof Nicholas Day
Mahidol Oxford Tropical Medicine Research Unit (MORU), University of Oxford, Thailand
Dr Olivo Miotto
Senior Informatics Fellow
Mahidol Oxford Tropical Medicine Research Unit (MORU), University of Oxford, Thailand
Prof Philip Bejon
KEMRI Wellcome Trust Research Programme, Kenya
Dr Richard Pearson
Principal Bioinformatician
Wellcome Trust Centre for Human Genetics, University of Oxford, UK
Visiting Bioinformatician
Wellcome Trust Sanger Institute, UK
Dr Roberto Amato
Staff Scientist
Wellcome Trust Sanger Institute, UK
Sorina Maciuca
Wellcome Trust Centre for Human Genetics, University of Oxford, UK
Dr Thomas D Otto
Senior Staff Scientist
Wellcome Trust Sanger Institute, UK
Honorary Senior Lecturer
London School of Hygiene & Tropical Medicine, UK
Dr Zamin Iqbal
PI/Group Leader
Wellcome Trust Centre for Human Genetics, University of Oxford, UK