Friday, July 22, 2016

Sneak peek: Basal-rich K7


I've got a new test. Currently I'm only using it to explore ancient genomes, but at some point I'll make another version available to the general public, one way or another. However, that might take a bit of work and time, because I'll need to mitigate the calculator effect and so on.

Below are a few results featuring ancient and present-day samples. Right now, I have to test many of the samples in separate ADMIXTURE runs, so it's taking ages.

Please note that the Basal-rich component is unlikely to be a perfect representation of the hypothetical Basal Eurasian population. At the same time, it's likely that the two hunter-gatherer components, AG3-MA1 and Villabruna, contain some Basal Eurasian admixture. I'll update this post tomorrow.

Afanasievo RISE511
AG3-MA1 53.8
Andamanese 0.24
Basal-rich 7.54
Oceanian 0.31
Southeast_Asian 1.26
Sub-Saharan 0.8
Villabruna 36.06

Anatolia_Neolithic I0709
AG3-MA1 0
Andamanese 0
Basal-rich 45.44
Oceanian 0.06
Southeast_Asian 0
Sub-Saharan 0.12
Villabruna 54.37

Han HGDP00774
AG3-MA1 5.47
Andamanese 2.36
Basal-rich 0.04
Oceanian 0.96
Southeast_Asian 91.11
Sub-Saharan 0.03
Villabruna 0.03

Iran_Hotu I1293
AG3-MA1 47.72
Andamanese 1.41
Basal-rich 43.01
Oceanian 1.38
Southeast_Asian 4.44
Sub-Saharan 0.01
Villabruna 2.04

Iran_Neolithic I1290
AG3-MA1 45.3
Andamanese 0.88
Basal-rich 52.2
Oceanian 0.9
Southeast_Asian 0.12
Sub-Saharan 0.5
Villabruna 0.09

Kalash HGDP00267
AG3-MA1 45.03
Andamanese 4.5
Basal-rich 30.21
Oceanian 1.1
Southeast_Asian 6.4
Sub-Saharan 0
Villabruna 12.77

Karitiana HGDP00999
AG3-MA1 42.07
Andamanese 0.01
Basal-rich 0
Oceanian 0
Southeast_Asian 57.92
Sub-Saharan 0
Villabruna 0

Yamnaya_Kalmykia RISE552
AG3-MA1 56.24
Andamanese 0.19
Basal-rich 10.58
Oceanian 0.03
Southeast_Asian 0.02
Sub-Saharan 0.81
Villabruna 32.14
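
As a quick sanity check on output like this, each sample's components should add up to roughly 100, give or take rounding. Here's a minimal Python sketch using three of the samples above. The `results` dictionary, the helper names and the 0.05 tolerance are illustrative assumptions, not part of ADMIXTURE itself.

```python
# Component percentages copied from three of the samples listed above.
results = {
    "Afanasievo RISE511": {
        "AG3-MA1": 53.8, "Andamanese": 0.24, "Basal-rich": 7.54,
        "Oceanian": 0.31, "Southeast_Asian": 1.26, "Sub-Saharan": 0.8,
        "Villabruna": 36.06,
    },
    "Anatolia_Neolithic I0709": {
        "AG3-MA1": 0, "Andamanese": 0, "Basal-rich": 45.44,
        "Oceanian": 0.06, "Southeast_Asian": 0, "Sub-Saharan": 0.12,
        "Villabruna": 54.37,
    },
    "Han HGDP00774": {
        "AG3-MA1": 5.47, "Andamanese": 2.36, "Basal-rich": 0.04,
        "Oceanian": 0.96, "Southeast_Asian": 91.11, "Sub-Saharan": 0.03,
        "Villabruna": 0.03,
    },
}

TOLERANCE = 0.05  # assumed allowance for rounding to two decimals


def component_total(components):
    """Return the sum of all component percentages for one sample."""
    return sum(components.values())


def dominant_component(components):
    """Return the name of the largest component for one sample."""
    return max(components, key=components.get)


for sample, components in results.items():
    total = component_total(components)
    status = "ok" if abs(total - 100.0) <= TOLERANCE else "check"
    top = dominant_component(components)
    print(f"{sample}: total={total:.2f} ({status}), top={top}")
```

The same check can be run on the rest of the samples by adding them to `results`.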

Thursday, July 14, 2016

Early Neolithic genomes from the eastern Fertile Crescent (Broushaki et al. 2016)


Open access at Science:

Abstract: We sequenced Early Neolithic genomes from the Zagros region of Iran (eastern Fertile Crescent), where some of the earliest evidence for farming is found, and identify a previously uncharacterized population that is neither ancestral to the first European farmers nor has contributed significantly to the ancestry of modern Europeans. These people are estimated to have separated from Early Neolithic farmers in Anatolia some 46-77,000 years ago and show affinities to modern day Pakistani and Afghan populations, but particularly to Iranian Zoroastrians. We conclude that multiple, genetically differentiated hunter-gatherer populations adopted farming in SW-Asia, that components of pre-Neolithic population structure were preserved as farming spread into neighboring regions, and that the Zagros region was the cradle of eastward expansion.


Broushaki et al., Early Neolithic genomes from the eastern Fertile Crescent, Science 14 Jul 2016, DOI: 10.1126/science.aaf7943

See also...

Economic overhaul + population shift in Late Neolithic Iran

Modeling Steppe_EMBA

qpAdm tour of Iran

Yamnaya =/= Eastern Hunter-Gatherers + Iran Chalcolithic

Monday, July 11, 2016

Genome-wide variants of Eurasian facial shape differentiation


Very interesting new preprint at bioRxiv:

Abstract: It is a long standing question as which genes define the characteristic facial features among different ethnic groups. In this study, we use Uyghurs, an ancient admixed population to query the genetic bases why Europeans and Han Chinese look different. Facial trait variations were analyzed based on high dense 3D facial images; numerous biometric spaces were examined for divergent facial features between European and Han Chinese, ranging from inner-landmarks to dense shape geometrics. A series of genome-wide association analyses were conducted on a discovery panel of Uyghurs. Six significant loci were identified and four of which, rs1868752, rs118078182, rs60159418 at or near UBASH3B, COL23A1, PCDH7 and rs17868256 were replicated in two independent cohorts of Uyghurs or Southern Han Chinese. We further developed a quantitative model to predict 3D faces based on 277 top GWAS SNPs. In hypothetic forensic scenarios, this model was found to significantly enhance the rate of suspect verification, suggesting a practical potential of related research.

Lu Qiao et al., Detecting Genome-wide Variants of Eurasian Facial Shape Differentiation: DNA based Face Prediction Tested in Forensic Scenario, bioRxiv, posted July 11, 2016, doi: http://dx.doi.org/10.1101/062950

Layers of Ancient North Eurasian-related ancestry in East Asia


Back in May I hypothesized that present-day East Asians were prehistoric hybrids of partly Ancient North Eurasian (ANE) origin. I got the idea from a series of TreeMix runs (see here).

This was essentially confirmed recently in the Lazaridis et al. 2016 preprint. Refer to page 147 in the paper's supplementary information PDF here.

However, based on more recent TreeMix runs featuring data from Lazaridis et al., I'd say the situation is more complex than just some minor ANE-related admixture in East Asians. I suspect now that all East Asians, including even the Onge, an ancient isolate population from the Andaman Islands, harbor significant ANE-related ancestry that may have arrived in East Asia in separate waves.

Here's what I'm talking about. Note that all of the samples on the East Asian node - Upper Paleolithic west Siberian forager Ust-Ishim, Han Chinese and Onge - are influenced by a massive migration edge from the base of the AG3-MA1 or ANE branch. However, as per the second graph, only the ancestors of more northerly East Asians, like those of the Han, appear to have been recipients of the latest ANE-related admixture into East Asia.



Indeed, when I add the Natufians from the Epipaleolithic Levant to the analysis, Ust-Ishim and the East Asians join AG3-MA1 on the same branch, but now receive a 36% migration edge from a point basal to all Eurasians. This is not admixture from the hypothesized Basal Eurasian clade, but probably from another basal clade, specific to East Asians, which I'd say occasionally shows up as pseudo Sub-Saharan admixture in East Asians.


But obviously, we'll need a solid selection of ancient genomes from across space and time in East Asia to confirm these results. Rumor has it that they're on their way.

Saturday, July 9, 2016

Modeling Steppe_EMBA


Lazaridis et al. showed that their Steppe_EMBA grouping, which included Afanasievo, Poltavka and Yamnaya, as well as two Potapovka samples, one Russia_EBA sample and one Srubnaya_outlier sample, was best modeled in the following two ways using qpAdm:

Steppe_EMBA
Eastern Hunter-Gatherer (EHG) 0.568
Iran Chalcolithic (Iran_ChL) 0.432

Steppe_EMBA
Caucasus Hunter-Gatherer (CHG) 0.181
Eastern Hunter-Gatherer (EHG) 0.527
Iran Chalcolithic (Iran_ChL) 0.292

I'm not a huge fan of either of these models, especially the first one, even though I understand that they're both statistically very sound. For one, the uniparental markers don't match, and for another, TreeMix seems to disagree (see here).


So let's try something a little different and see what happens when I model Steppe_EMBA as EHG, CHG, and Anatolia Chalcolithic.

Outgroups
Anatolia_Neolithic
Andamanese_Onge
Chukchi
Han
Israel_Natufian
Karitiana
Kostenki14
Levant_Neolithic
MA1
Mbuti.DG
Papuan
WHG

Steppe_EMBA
Anatolia Chalcolithic (Anatolia_ChL) 0.128
Caucasus Hunter-Gatherer (CHG) 0.375
Eastern Hunter-Gatherer (EHG) 0.497

As far as I can tell, it's a very decent fit, especially considering that I'm using 12 outgroups and three reference populations. To me, at least, the standard errors look surprisingly low for such a complex model: 0.033, 0.046 and 0.020, respectively.
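
To put those standard errors in context, here's a rough Python sketch that divides each coefficient by its standard error. It assumes the standard errors are quoted in the same order as the sources in the table, and the `model` dictionary and `z_score` helper are names made up for illustration. This is only a rule-of-thumb reading of the numbers; it doesn't replace the model fit statistics that qpAdm reports itself.

```python
# Coefficients and standard errors quoted above for the Steppe_EMBA model.
# Assumption: the standard errors are listed in the same order as the sources.
model = {
    "Anatolia_ChL": (0.128, 0.033),
    "CHG": (0.375, 0.046),
    "EHG": (0.497, 0.020),
}


def z_score(coefficient, standard_error):
    """Coefficient expressed in units of its own standard error."""
    return coefficient / standard_error


for source, (coef, se) in model.items():
    print(f"{source}: {coef:.3f} ± {se:.3f}, z = {z_score(coef, se):.1f}")

# The admixture coefficients of a single model should sum to about 1.
total = sum(coef for coef, _ in model.values())
print(f"sum of coefficients = {total:.3f}")
```

On these figures even the smallest coefficient, Anatolia_ChL, sits almost four standard errors above zero.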

Now, I'm not arguing here that Chalcolithic Anatolia is the answer. What I'm saying is that multiple lines of evidence do not support Chalcolithic Iran as a real source of admixture for Steppe_EMBA, and I'm offering what I see as a plausible alternative among the currently available samples.

I know that this is a work in progress for the Broad MIT/Harvard team, and we'll have to wait for more ancient samples and another paper or two before a consensus is reached on the topic.

But here's my prediction: Steppe_EMBA only has 10-15% admixture from the post-Mesolithic Near East, not including the North Caucasus, and basically all of this comes via female-mediated gene flow from farming communities in the Caucasus and perhaps present-day Ukraine.

Friday, July 8, 2016

Khazar shmazar #2


Open access at Genome Biology and Evolution:

Abstract: In a recent interdisciplinary study, Das and co-authors have attempted to trace the homeland of Ashkenazi Jews and of their historical language, Yiddish (Das et al. 2016. Localizing Ashkenazic Jews to Primeval Villages in the Ancient Iranian Lands of Ashkenaz. Genome Biology and Evolution). Das and co-authors applied the geographic population structure (GPS) method to autosomal genotyping data and inferred geographic coordinates of populations supposedly ancestral to Ashkenazi Jews, placing them in Eastern Turkey. They argued that this unexpected genetic result goes against the widely accepted notion of Ashkenazi origin in the Levant, and speculated that Yiddish was originally a Slavic language strongly influenced by Iranian and Turkic languages, and later remodeled completely under Germanic influence. In our view, there are major conceptual problems with both the genetic and linguistic parts of the work. We argue that GPS is a provenancing tool suited to inferring the geographic region where a modern and recently unadmixed genome is most likely to arise, but is hardly suitable for admixed populations and for tracing ancestry up to 1000 years before present, as its authors have previously claimed. Moreover, all methods of historical linguistics concur that Yiddish is a Germanic language, with no reliable evidence for Slavic, Iranian, or Turkic substrata.

Flegontov et al., Pitfalls of the geographic population structure (GPS) approach applied to human genetic history: A case study of Ashkenazi Jews, Genome Biol Evol (2016) doi: 10.1093/gbe/evw162

See also...

Khazar shmazar

Irano-Turko-Slavic roots of Ashkenazi Jews?

Wednesday, July 6, 2016

qpAdm tour of Iran


Just wanted to see if I could model Early Neolithic versus Chalcolithic Zagros farmer ancestry in present-day Iranians using qpAdm. I reckon I can, more or less. The outcomes below are all fairly solid statistical fits, especially considering the complexity of the models and the close similarity between the Early Neolithic and Chalcolithic Zagros farmers.

Outgroups
Bichon
Chukchi
Karelia_HG
Karitiana
Kostenki14
Levant_Neolithic
MA1
Mbuti.DG
Mota
Papuan
Ust_Ishim

Iranian_Bandari
Iran_Chalcolithic 0.136 ± 0.121
Iran_Neolithic 0.631 ± 0.152
Yamnaya_Samara 0.164 ± 0.033
Han 0.026 ± 0.017
Yoruba 0.044 ± 0.013

Iranian_Lor
Iran_Chalcolithic 0.723 ± 0.078
Iran_Neolithic 0.106 ± 0.079
Yamnaya_Samara 0.130 ± 0.024
Han 0.041 ± 0.011

Iranian_Mazandarani
Iran_Chalcolithic 0.558 ± 0.066
Iran_Neolithic 0.209 ± 0.065
Yamnaya_Samara 0.178 ± 0.022
Han 0.055 ± 0.010

Iranian_Persian
Iran_Chalcolithic 0.617 ± 0.064
Iran_Neolithic 0.181 ± 0.062
Yamnaya_Samara 0.148 ± 0.022
Han 0.054 ± 0.010
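
Because the Early Neolithic and Chalcolithic Zagros farmers are so similar, it's worth seeing which coefficients are actually well resolved. The Python sketch below builds an approximate 95% interval (coefficient ± 1.96 standard errors) for each source and lists the ones whose interval reaches zero. The normal approximation, the 1.96 multiplier and the helper names are assumptions made for illustration, not anything qpAdm outputs.

```python
# Coefficient and standard error pairs copied from the four models above.
models = {
    "Iranian_Bandari": {
        "Iran_Chalcolithic": (0.136, 0.121),
        "Iran_Neolithic": (0.631, 0.152),
        "Yamnaya_Samara": (0.164, 0.033),
        "Han": (0.026, 0.017),
        "Yoruba": (0.044, 0.013),
    },
    "Iranian_Lor": {
        "Iran_Chalcolithic": (0.723, 0.078),
        "Iran_Neolithic": (0.106, 0.079),
        "Yamnaya_Samara": (0.130, 0.024),
        "Han": (0.041, 0.011),
    },
    "Iranian_Mazandarani": {
        "Iran_Chalcolithic": (0.558, 0.066),
        "Iran_Neolithic": (0.209, 0.065),
        "Yamnaya_Samara": (0.178, 0.022),
        "Han": (0.055, 0.010),
    },
    "Iranian_Persian": {
        "Iran_Chalcolithic": (0.617, 0.064),
        "Iran_Neolithic": (0.181, 0.062),
        "Yamnaya_Samara": (0.148, 0.022),
        "Han": (0.054, 0.010),
    },
}


def interval(coef, se, width=1.96):
    """Approximate interval around a coefficient (normal approximation)."""
    return coef - width * se, coef + width * se


def poorly_resolved(sources, width=1.96):
    """Names of sources whose approximate interval includes zero."""
    return [
        name
        for name, (coef, se) in sources.items()
        if interval(coef, se, width)[0] <= 0
    ]


for population, sources in models.items():
    flagged = poorly_resolved(sources)
    print(f"{population}: {', '.join(flagged) if flagged else 'all sources resolved'}")
```

By this rough yardstick, the Zagros-related split is least certain for the Bandari and Lor, while the Mazandarani and Persian models resolve all four sources.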

However, please note that despite the close similarity between the Early Neolithic and Chalcolithic Zagros farmers, the latter did not, for the most part, descend from the former. In fact, it's very likely that the Chalcolithic farmers were largely, or perhaps even entirely, derived from newcomers to present-day Iran from somewhere to the west of the Zagros Mountains (see here).

It's true that in the basic four-way qpAdm model in Lazaridis et al. the Chalcolithic Zagros farmers are largely modeled as Neolithic Zagros farmers (or Iran_N). However, a more comprehensive analysis in the same paper explains them as a mixture of Caucasus Hunter-Gatherers (CHG), Neolithic farmers from the Levant, and Neolithic Zagros farmers, with admixture ratios of 0.631, 0.202 and 0.167, respectively.

I can basically reproduce the same model with the outgroups listed above, except with Israel_Natufian in place of Levant_Neolithic, which I have to use as one of the reference populations.

Iran_Chalcolithic
Caucasus_HG 0.522 ± 0.111
Iran_Neolithic 0.246 ± 0.108
Levant_Neolithic 0.232 ± 0.026

The qpAdm algorithm is freely available at GitHub here. All of the present-day and ancient samples are freely available at the Reich Lab website here.
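
For anyone who wants to try the Iran_Chalcolithic model themselves, here's a minimal Python sketch that prepares the three text inputs a qpAdm run typically needs: a left population list (target first, then the sources), a right population list (the outgroups) and a parameter file. The file names, the `dataset` prefix and the order of the outgroups are assumptions, and the population labels have to match whatever is in your own `.ind` file.

```python
from pathlib import Path

# Model from the table above: target first, then the reference populations.
TARGET = "Iran_Chalcolithic"
SOURCES = ["Caucasus_HG", "Iran_Neolithic", "Levant_Neolithic"]

# Outgroups from this post, with Israel_Natufian in place of Levant_Neolithic.
# Assumption: the order is kept as listed in the post.
OUTGROUPS = [
    "Bichon", "Chukchi", "Karelia_HG", "Karitiana", "Kostenki14",
    "Israel_Natufian", "MA1", "Mbuti.DG", "Mota", "Papuan", "Ust_Ishim",
]


def build_inputs(target, sources, outgroups, prefix="dataset"):
    """Return the text of the left list, right list and parameter file."""
    left = "\n".join([target, *sources]) + "\n"
    right = "\n".join(outgroups) + "\n"
    par = "\n".join([
        f"genotypename: {prefix}.geno",  # hypothetical dataset file names
        f"snpname: {prefix}.snp",
        f"indivname: {prefix}.ind",
        "popleft: left.txt",
        "popright: right.txt",
        "details: YES",
    ]) + "\n"
    return {"left.txt": left, "right.txt": right, "qpAdm.par": par}


def write_inputs(files, directory):
    """Write each generated file into the given directory."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    for name, content in files.items():
        (directory / name).write_text(content)


files = build_inputs(TARGET, SOURCES, OUTGROUPS)
print(files["left.txt"])
print(files["qpAdm.par"])
```

After calling `write_inputs(files, "some_directory")`, the run itself would be something like `qpAdm -p qpAdm.par` from inside that directory, assuming the dataset files sit alongside it.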

See also...

Ulan IV

Monday, July 4, 2016

Economic overhaul + population shift in Late Neolithic Iran


Courtesy of Arbuckle et al. at the Journal of Archaeological Science. Emphasis is mine:

Abstract: In this paper we address the timing of and mechanisms for the appearance of domestic cattle in the Eastern Fertile Crescent (EFC) region of SW Asia through the analysis of new and previously published species abundance and biometric data from 86 archaeofaunal assemblages. We find that Bos exploitation was a minor component of animal economies in the EFC in the late Pleistocene and early Holocene but increased dramatically in the sixth millennium BC. Moreover, biometric data indicate that small sized Bos, likely representing domesticates, appear suddenly in the region without any transitional forms in the early to mid sixth millennium BC. This suggests that domestic cattle were imported into the EFC, possibly associated with the spread of the Halaf archaeological culture, several millennia after they first appear in the neighboring northern Levant.

These findings more or less correlate with the results in the new Lazaridis et al. preprint:

During subsequent millennia, the early farmer populations of the Near East expanded in all directions and mixed, as we can only model populations of the Chalcolithic and subsequent Bronze Age as having ancestry from two or more sources. The Chalcolithic people of western Iran can be modelled as a mixture of the Neolithic people of western Iran, the Levant, and Caucasus Hunter Gatherers (CHG), consistent with their position in the PCA (Fig. 1b).

In other words, the small cows weren't just imported into the Eastern Fertile Crescent; they came with people who also made a major genetic impact on the region.

Here's my own PCA featuring the relevant Lazaridis et al. samples. Key: Caucasus_HG = Caucasus Hunter-Gatherer; Iran_ChL = Iran Chalcolithic; Iran_HG = Iran Hunter-Gatherer; Iran_N = Iran Neolithic; Levant_N = Levant Neolithic.


Thus, it would seem that after the early Neolithic farmers from Iran migrated to South Asia, they were largely replaced in their own homeland by Halaf pastoralists and/or related groups. Moreover, their descendants in South Asia, and especially South Central Asia, were then largely replaced by pastoralists from the Bronze Age Eurasian steppe (for instance, see here).

Obviously, this doesn't square too well with the idea of a Proto-Indo-European homeland in the Zagros Mountains of western Iran, does it?

See also...

qpAdm tour of Iran

Yamnaya =/= Eastern Hunter-Gatherers + Iran Chalcolithic

Sunday, July 3, 2016

The mother of all TreeMix runs #1


Interestingly, the most obviously admixed population in this analysis is the Karitiana Indians from the Amazon basin: a two-way mixture between East Eurasians and Upper Paleolithic Siberians. Who woulda thunk it? Not me, that's for sure.

Saturday, July 2, 2016

Ulan IV


If Indo-Iranian languages didn't expand from the Andronovo horizon, but rather from an earlier archaeological steppe culture, which is what it seems like based on the latest analysis of ancient genomes from the steppe (see page 123 here), then I reckon the best option is the Catacomb Culture.

As far as I can tell, one of the Yamnaya samples from Allentoft et al. 2015, RISE552 from the Ulan IV burial, might actually be a Catacomb sample. That's because Ulan IV is classified as a West Manych Catacomb Culture site. Check out this awesome paper on one of the graves from this site here.


Here's a qpAdm model of the Kalasha from the Hindu Kush featuring Ulan IV RISE552 based on over 200K SNPs:

Outgroups
Bichon
Chukchi
Israel_Natufian
Karitiana
Kostenki14
MA1
Mbuti.DG
Papuan
Ust_Ishim

Kalash
Ulan_IV 0.609 ± 0.051
Iran_Neolithic 0.184 ± 0.066
Andamanese_Onge 0.175 ± 0.041
Han 0.032 ± 0.023

I'm not saying this model is definitive by any stretch, but it's more or less statistically sound, with fairly low standard errors for each of the coefficients (0.051, 0.066, 0.041, 0.023 respectively). It's also very similar to the optimal qpAdm model of the Kalasha in Lazaridis et al. 2016.

Interestingly, it also matches closely a TreeMix analysis that I posted at my other blog last year, months before I even knew that ancient genomes from Neolithic Iran were on the way (see here). This is what I said in that blog entry:

Both of these models are correct; they just show the same thing in different ways. So if we mesh them together the Kalash and Pathans come out ~65% LNE/EBA European (which includes substantial Caucasus or Caucasus-related ancestry), ~12% ASI, and ~23% something as yet undefined.

If I had to guess, I'd say the mystery ~23% was Neolithic admixture from what is now Iran.

That's not bad considering how difficult it is to make predictions about ancient population movements without direct evidence from ancient DNA. In any case, it's a lot better than what has been published on the topic in some major journals.