Pf3k pilot data release 2

Project: Pf3k

Released on 6 Nov 2014

This data release contains sample information, accession numbers, and baseline genotypes for 1,931 samples: the 1,794 samples included in the Pf3k 1.0 pilot data release plus an additional 137 samples contributed by the Broad Institute.

At the time of their release, these data were subject to the Pf3k Pilot Phase Terms of Use. In September 2016, these restrictions were lifted and this dataset is now available open access.

2.0 Data

This data set comprises sample information for the 1,794 samples included in the 1.0 pilot data release as well as 137 samples contributed by the Broad Institute. It originally also included analysis BAMs for these samples, which have since been removed (see Release notes).

This data set includes:

  • A table of sample metadata in tab-delimited and Excel file formats. This table includes:
    • Accessions for downloading the sequence reads from the European Nucleotide Archive (ENA)
    • Sampling location
    • Contributing partner study ID and contact person
    • Mapping metadata including sequence coverage metrics

These data can be downloaded from the Wellcome Trust Sanger Institute public ftp site.
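
Once the tab-delimited metadata table has been downloaded, it can be read with standard tools. The following is a minimal sketch in Python that groups ENA accessions by sampling location and optionally skips low-coverage samples. The column names (`sample`, `ena_accession`, `country`, `mean_coverage`) and the example rows are hypothetical placeholders, not the actual headers or values of the released file; check the real header line and adjust the names accordingly.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt for illustration only; the real file's column
# names and values are assumptions here and will differ.
EXAMPLE_TSV = (
    "sample\tena_accession\tcountry\tmean_coverage\n"
    "S0001\tERR000001\tGhana\t85.2\n"
    "S0002\tERR000002\tCambodia\t40.7\n"
    "S0003\tERR000003\tGhana\t12.3\n"
)


def accessions_by_location(handle, min_coverage=0.0):
    """Group ENA accessions by sampling location.

    Rows whose coverage is below min_coverage are skipped.
    """
    grouped = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        if float(row["mean_coverage"]) >= min_coverage:
            grouped[row["country"]].append(row["ena_accession"])
    return dict(grouped)


if __name__ == "__main__":
    # For a downloaded file, pass open(path, newline="") instead.
    print(accessions_by_location(io.StringIO(EXAMPLE_TSV), min_coverage=20.0))
```

The resulting accessions can then be used to request the corresponding sequence reads from the ENA.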

2.1 Data

This data set contains baseline genotypes for the 2.0 sample set. These genotypes are based on a set of high-quality SNP loci from the MalariaGEN partner studies. However, the samples have not been through de novo variant discovery, and the genotypes should not be taken as a quality-controlled output of the Pf3k project. They are provided for public interest and as a basis for future methods development. For more information, see the README files on the ftp site.

This data set includes:

These data can be downloaded from the Wellcome Trust Sanger Institute public ftp site.

Release notes

9 Feb 2016
Analysis BAMs removed

This release previously contained analysis BAM files, one per sample, aligned to the 3D7_v3 reference. These data have been superseded; please see Pf3k pilot data release 5 for the latest analysis BAMs for this sample set.