Pf3k pilot data release 2
Project: Pf3k

Released on 6 Nov 2014.

Parasite

This data release contains sample information, accession numbers, and baseline genotypes for 1,931 samples: the 1,794 samples included in the Pf3k 1.0 pilot data release and an additional 137 samples contributed by the Broad Institute.

At the time of their release, these data were subject to the Pf3k Pilot Phase Terms of Use. In September 2016 these restrictions were lifted, and the dataset is now available under open access.

Data sets

2.0 Data

This data set comprises sample information and analysis BAMs from the 1,794 samples included in the 1.0 pilot data release as well as 137 samples contributed by the Broad Institute.

This data set includes:

  • A table of sample metadata in tab-delimited and Excel file formats. This table includes:
    • Accessions for downloading the sequence reads from the European Nucleotide Archive (ENA)
    • Sampling location
    • Contributing partner study ID and contact person
    • Mapping metadata including sequence coverage metrics
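As a rough illustration of working with the tab-delimited version of this table, the sketch below reads a table with a header row into a list of dictionaries using only the Python standard library. The column names (`sample`, `ena_accession`, `country`) and the two rows are made-up placeholders, not values from the release; the real column names are given in the header row of the downloaded file.

```python
import csv
import io


def load_sample_metadata(handle):
    """Read a tab-delimited table with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Hypothetical excerpt: column names and values are placeholders only.
EXAMPLE = (
    "sample\tena_accession\tcountry\n"
    "S0001\tACC_PLACEHOLDER_1\tCountryA\n"
    "S0002\tACC_PLACEHOLDER_2\tCountryB\n"
)

rows = load_sample_metadata(io.StringIO(EXAMPLE))

# Group sample identifiers by sampling location.
by_country = {}
for row in rows:
    by_country.setdefault(row["country"], []).append(row["sample"])
```

For a real file, the same function can be given an open file handle, for example `open(path, newline="", encoding="utf-8")`, where `path` points to the downloaded table.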

These data can be downloaded from the Wellcome Trust Sanger Institute public FTP site.

NOTE: Many browsers no longer support links to FTP sites. If you have difficulty following the link, you may need to change your browser settings or use a dedicated FTP client.

Go to FTP
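Where a browser cannot open the FTP link, a short script is one alternative. The sketch below uses Python's standard `ftplib` with an anonymous login, which is an assumption about how the public site is accessed. The URL shown is a placeholder, not the address of this release; substitute the address behind the FTP link above.

```python
from ftplib import FTP
from urllib.parse import urlsplit


def split_ftp_url(url):
    """Return (host, path) for an ftp:// URL."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("expected an ftp:// URL")
    return parts.hostname, parts.path


def download(url, destination):
    """Fetch one file over anonymous FTP and write it to destination."""
    host, path = split_ftp_url(url)
    with FTP(host) as ftp:
        ftp.login()  # anonymous login (assumed to be permitted)
        with open(destination, "wb") as out:
            ftp.retrbinary("RETR " + path, out.write)


# Placeholder URL for illustration only.
EXAMPLE_URL = "ftp://ftp.example.org/pub/release/README"
host, path = split_ftp_url(EXAMPLE_URL)
```

Calling `download(url, "README")` with a real address would then save the file locally.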

2.1 Data

This data set contains baseline genotypes for the 2.0 sample set. These genotypes are based on a set of high-quality SNP loci from the MalariaGEN partner studies. However, the samples have not been through de novo variant discovery, so the genotypes should not be taken as a quality-controlled output of the Pf3k project. They are provided for public interest and as a basis for future methods development. For more information, see the README files on the FTP site.

This data set includes:

  • A VCF file (http://vcftools.sourceforge.net/specs.html) containing genotypes for all 2.0 samples at 682k high-quality SNP loci.
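To show how genotypes are laid out in a VCF file, the sketch below walks the text format directly: it skips the `##` meta-information lines, takes sample names from the `#CHROM` header line, and pulls the `GT` field for each sample. The records, chromosome name, and sample names are invented for illustration and are not taken from the release. For a file of this size a dedicated VCF library would usually be a better choice; this is only a minimal reading of the format.

```python
import io


def iter_genotypes(lines):
    """Yield (chrom, pos, ref, alt, {sample: GT string}) for each VCF record."""
    samples = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the 9 fixed columns
            continue
        chrom, pos, _id, ref, alt = fields[:5]
        # FORMAT (column 9) names the colon-separated per-sample fields.
        gt_index = fields[8].split(":").index("GT")
        calls = {
            name: value.split(":")[gt_index]
            for name, value in zip(samples, fields[9:])
        }
        yield chrom, int(pos), ref, alt, calls


# Invented two-record example; not data from the release.
EXAMPLE_VCF = "\n".join([
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsampleA\tsampleB",
    "chr1\t100\t.\tA\tT\t.\tPASS\t.\tGT:AD\t0/0:12,0\t1/1:0,9",
    "chr1\t250\t.\tG\tC\t.\tPASS\t.\tGT:AD\t./.:0,0\t0/1:5,4",
]) + "\n"

records = list(iter_genotypes(io.StringIO(EXAMPLE_VCF)))
```

A compressed file can be passed in the same way by opening it in text mode, for example with `gzip.open(path, "rt")`.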

These data can be downloaded from the Wellcome Trust Sanger Institute public FTP site.

Go to FTP

Release notes

Analysis BAMs removed
7 Jul 2021

This release previously contained analysis BAM files, one-per-sample, aligned to the 3D7_v3 reference. These data have been superseded; please see Pf3k pilot data release 5 for the latest analysis BAMs for this sample set.

Open access

Archived

Citations

To cite this release directly, please use the following format:

The Pf3K Project (2014): pilot data release 2. http://www.malariagen.net/data_package/pf3k-2/