The As1.0: Anopheles stephensi Phase 1 data resource integrates single nucleotide polymorphism (SNP) calls from whole-genome sequencing of 639 mosquitoes collected from different sites across Ethiopia, Sudan, Djibouti, Yemen, Afghanistan, Pakistan, Saudi Arabia, Iran, Kenya and India.
All of these samples were contributed and sequenced as part of the Controlling Emergent Anopheles stephensi in Sudan and Ethiopia (CEASE) project, and were collected between 2005 and 2025. Outputs from this project are described here (https://pmc.ncbi.nlm.nih.gov/articles/PMC11974716/).
Data sets
As1.0 Contributing Studies
1363 - Anopheles stephensi vector surveillance in Ethiopia
1364 - Anopheles stephensi vector surveillance in Sudan
1365 - Anopheles stephensi vector surveillance in Djibouti
1366 - Anopheles stephensi vector surveillance in Yemen
1367 - Anopheles stephensi vector surveillance in Afghanistan
1368 - Anopheles stephensi vector surveillance in Pakistan
1369 - Anopheles stephensi vector surveillance in Saudi Arabia
1370 - Anopheles stephensi vector surveillance in Iran
1385 - Anopheles stephensi vector surveillance from colony samples in Djibouti
1386 - Anopheles stephensi vector surveillance in Kenya
1458 - Anopheles stephensi vector surveillance in Ethiopia
1459 - Anopheles stephensi vector surveillance in Sudan
As1.0 Terms of Use
Data from this project will be made publicly available before journal publication. Unless otherwise stated, analyses of project data are ongoing and publications are in preparation by project partners, and it is not permitted to use project data for publication (including any type of communication with the general public) without prior permission from the originating partner studies.
Although malaria is generally an endemic rather than an epidemic disease, and the focus of this project is on surveillance of disease vectors rather than pathogens, our data terms of use build on MalariaGEN's approach to data sharing, and adopt norms which have been established for rapid sharing of pathogen genomic data during disease outbreaks. The primary rationale for this approach is that malaria remains a public health emergency, where ethically appropriate and rapid sharing of genomic surveillance data can help to detect and respond to biological threats such as new forms of insecticide resistance, and to adapt malaria vector control strategies to different settings and changing circumstances.
The publication embargo for all data in this release will expire on the 5th of April 2028.
If you have any questions regarding these terms of use, please contact support@malariagen.net.
As1.0 Data Availability
This data release includes sample metadata, whole-genome sequence data, and genome-wide SNP calls for 625 wild-caught mosquitoes collected from Ethiopia, Sudan, Djibouti, Yemen, Afghanistan, Pakistan, Saudi Arabia, Iran, India and Kenya, and for 14 colony mosquitoes derived from wild-caught samples collected in Djibouti. All of these individuals are An. stephensi s.l. The data are hosted in Google Cloud. For more information about accessing data in the cloud, please see the cloud data access guide.
As1.0 Whole genome sequencing and variant calling
DNA was extracted from individual whole mosquitoes using the Qiagen DNeasy Blood and Tissue Kit. Sequencing was performed commercially to a target coverage of 30X using 2 x 150 bp paired-end reads on an Illumina NovaSeq X instrument. Sequence reads were aligned to the An. stephensi reference genome UCISS2018 (https://vectorbase.org/vectorbase/app/record/dataset/DS_869a805bc4).
Full methods are described in "The origin, invasion history and resistance architecture of Anopheles stephensi in Africa".
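To give a feel for what a 30X target with 2 x 150 bp paired reads implies, the sketch below applies the usual approximation coverage = (read pairs x 2 x read length) / genome size. The function name and the genome size are assumptions made for illustration: the 200 Mb figure is a round placeholder, not a value taken from this release, and should be replaced with the actual length of the reference assembly. The result is a lower bound, since it ignores duplicates, unmapped reads and uneven coverage.

```python
import math


def read_pairs_for_coverage(target_coverage: float,
                            genome_size_bp: int,
                            read_length_bp: int = 150) -> int:
    """Minimum number of read pairs whose total bases reach the target coverage.

    Uses coverage = (pairs * 2 * read_length) / genome_size, rounded up.
    """
    bases_per_pair = 2 * read_length_bp  # two reads per pair
    return math.ceil(target_coverage * genome_size_bp / bases_per_pair)


# Illustrative placeholder only: substitute the real reference genome length.
ASSUMED_GENOME_SIZE_BP = 200_000_000

pairs = read_pairs_for_coverage(30, ASSUMED_GENOME_SIZE_BP)
print(f"Read pairs needed for 30X: {pairs:,}")
```

Under the assumed 200 Mb genome, this comes to 20 million read pairs per mosquito; a larger or smaller assembly scales the figure proportionally.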