A conceptual framework employing genomic ancestry blocks
Jutta Buschbom
Statistical Genetics, November 22-24, 2019
Poster presented at the 61st Phylogenetic Symposium “Reticulate Evolution”, Göttingen, Germany, Nov. 22-24, 2019.
Abstract
The ability to reliably assign an individual to its taxonomic lineage and geographic origin, even in species groups with unclear taxonomic boundaries and/or gene flow, is essential for gaining robust and detailed insight into the current state of a taxon, as well as into the processes shaping its responses to rapidly and potentially unpredictably changing climatic and environmental conditions.
Accumulating experience with long-lived, widely distributed species that form isolation-by-distance genetic patterns shows that reliable population assignment is very difficult to achieve. Current within-species methodological developments therefore focus on approaches that reconstruct ancestry blocks, that is, admixture linkage haploblocks.
Building on existing ancestry-block-based reconstructions of geographic origin and historical migration events developed for humans, I propose a conceptual framework that extends these approaches to taxonomic groups with multiple hybridizing species, such as the white oaks (Quercus sect. Quercus).
Range-wide, genome-wide sequence datasets allow the comprehensive reconstruction of ancestry blocks, which can be linked to both species (taxon) identity and geographic origin. In a first step, reconstructed ancestry blocks are classified according to taxon. Here, ancestry blocks that are obligately present in a taxon and differentiated between taxa are candidates for speciation islands, which can be used for barcoding applications.
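The first step can be illustrated with a minimal Python sketch. It is not part of the proposed framework itself: the taxon names, block labels and the function `candidate_diagnostic_blocks` are hypothetical, and it assumes that each individual has already been reduced to a taxon label plus the set of ancestry-block clades it carries. "Obligately present" is read here as present in every sampled individual of a taxon, and "differentiated" as absent from all other taxa.

```python
from collections import defaultdict

# Hypothetical toy data: individual -> (taxon, ancestry-block clades carried).
individuals = {
    "ind1": ("taxon_A", {"A1", "B1", "C1"}),
    "ind2": ("taxon_A", {"A1", "B2", "C1"}),
    "ind3": ("taxon_B", {"A2", "B1", "C2"}),
    "ind4": ("taxon_B", {"A2", "B2", "C2"}),
}


def candidate_diagnostic_blocks(individuals):
    """Return, per taxon, blocks found in all its individuals and in no other taxon."""
    by_taxon = defaultdict(list)
    for taxon, blocks in individuals.values():
        by_taxon[taxon].append(blocks)

    candidates = {}
    for taxon, block_sets in by_taxon.items():
        # Blocks shared by every sampled individual of this taxon.
        obligate = set.intersection(*block_sets)
        # Blocks observed in any individual of any other taxon.
        elsewhere = set()
        for other, other_sets in by_taxon.items():
            if other != taxon:
                elsewhere |= set.union(*other_sets)
        candidates[taxon] = obligate - elsewhere
    return candidates


candidates = candidate_diagnostic_blocks(individuals)
```

In this toy example, blocks `B1` and `B2` occur in both taxa and are therefore not candidates, whereas `A1`/`C1` and `A2`/`C2` are retained for their respective taxa. With real data, strict presence/absence would presumably be replaced by frequency thresholds.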
In a second step, within each class of ancestry blocks associated with one taxon, ancestry is correlated with the geographic origin of the source individuals. Each geographic region can harbor distinctly different clades of ancestry blocks, originating from the multiple taxa present in the region. Yet, when considered within each taxon, these different clades are all characteristic of that region.
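As a rough illustration of this second step, the following sketch tabulates, separately for each taxon, how the observations of an ancestry-block clade are distributed across regions. The records, region names and the function `region_profile` are assumptions made for the example; a real analysis would use an appropriate statistical model rather than raw proportions.

```python
from collections import Counter, defaultdict

# Hypothetical records: (taxon, region of origin, ancestry-block clade).
records = [
    ("taxon_A", "north", "A_n"),
    ("taxon_A", "north", "A_n"),
    ("taxon_A", "south", "A_s"),
    ("taxon_A", "south", "A_n"),
    ("taxon_B", "north", "B_n"),
    ("taxon_B", "south", "B_s"),
]


def region_profile(records):
    """For each (taxon, clade), return the proportion of observations per region."""
    counts = defaultdict(Counter)
    for taxon, region, clade in records:
        counts[(taxon, clade)][region] += 1

    profile = {}
    for key, region_counts in counts.items():
        total = sum(region_counts.values())
        profile[key] = {region: n / total for region, n in region_counts.items()}
    return profile


profile = region_profile(records)
```

Because the table is keyed by taxon and clade together, the clades `A_n` and `B_n` can both point to the northern region even though they belong to different taxa, which mirrors the idea that a region is characterized by a different clade within each taxon.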
In a final step, the presence of compound genomic haploblocks can be investigated. These will be characteristic of geographic regions at different temporal and spatial scales. Such compound haploblocks would show specific patterns of (multi-taxon) ancestry blocks characteristic of individuals belonging to, e.g., a geographic provenance.
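One simple way to think about compound haploblocks is as ordered patterns of taxon-of-origin labels along a genomic segment. The sketch below, under that assumption, lists the patterns observed in exactly one provenance. The provenance names, the single-letter taxon labels and the function `private_patterns` are hypothetical and serve only to make the idea concrete.

```python
from collections import defaultdict

# Hypothetical data: (provenance, ordered taxon-of-origin labels of
# consecutive ancestry blocks along one genomic segment).
haplotypes = [
    ("prov_1", ("A", "B", "A")),
    ("prov_1", ("A", "A", "B")),
    ("prov_2", ("A", "A", "B")),
    ("prov_2", ("B", "B", "A")),
]


def private_patterns(haplotypes):
    """Return, per provenance, the compound patterns seen in no other provenance."""
    seen_in = defaultdict(set)
    for provenance, pattern in haplotypes:
        seen_in[pattern].add(provenance)

    private = defaultdict(set)
    for pattern, provenances in seen_in.items():
        if len(provenances) == 1:
            (only,) = provenances  # unpack the single provenance
            private[only].add(pattern)
    return dict(private)


private = private_patterns(haplotypes)
```

Here the pattern `("A", "A", "B")` occurs in both provenances and is therefore uninformative, while the remaining two patterns are each private to one provenance.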
In evolving and highly complex natural systems, the mosaic structure of ancestries (that is, genealogies) within the genome can provide the additional information necessary for achieving sufficiently accurate, decisive and reliable assignment results. Robust, useful models of the mosaic genealogical structure of the genome thus contribute to the inference of species (taxon) membership in groups with gene flow and, at the same time, to population assignment to geographic regions with increased spatial resolution.