Friday, May 6, 2016

Villabruna cluster =/= Near Eastern migrants

I've been running a lot of Treemix analyses with the samples from the recent Qiaomei Fu et al. paper. And the impression I'm getting is that the authors missed the elephant in the room, the one with R1b painted on its big butt.

Now, it's true that Treemix output can't be used as unambiguous evidence in support of complex models. That's because in the absence of key samples the algorithm can get exceedingly creative in modeling the available data, sometimes to such extremes that the results might seem absurd.

However, when something keeps showing up again and again, even when using somewhat different samples and marker sets, and makes sense in the context of haploid and archaeological data, then at the very least it deserves serious consideration.

Here's a nice Treemix series that more or less captures the meat and potatoes of my many Treemix experiments with the Qiaomei Fu et al. dataset. For the archaeological contexts and other details about these ancient samples see here.

Below is my interpretation of the results:

- Villabruna is a sister clade of the earlier European Vestonice clade, but with significant input from an AfontovaGora3-related North Eurasian population, perhaps one that was living north of the Black Sea after the Kostenki people went the way of the dodo

- Hence, the R1b lineage carried by Villabruna I9030, the individual in this Treemix series, probably comes from the Eurasian steppe

- Kotias, a Caucasus Hunter-Gatherer, is in large part derived from the same North Eurasian population, hence the close relationship between Villabruna and Caucasus Hunter-Gatherers

- Villabruna and/or closely related foragers contributed significant ancestry to Neolithic Anatolians, and thus, indirectly, possibly to all extant Near Eastern and even many African populations

- Kotias is also closely related to Neolithic Anatolians, but probably mostly via a more basal population, perhaps the so called Basal Eurasians, native to the Near East prior to the Villabruna and/or related gene flow across the Near East and parts of Africa

- Present-day East Asians might be ancient hybrids with admixture from the same or very similar North Eurasian population, although as per the above mentioned quirks of Treemix, it's possible that the North Eurasians that contributed ancestry to Villabruna, Caucasus foragers and Eurasian steppe populations were in fact partly East Asian

So basically what I'm seeing are back migrations from Europe and the Eurasian steppe or Siberia to the Near East soon after the Ice Age. Yes, from Europe, although I admit that things can get fuzzy here.

That's because at the time much of the Aegean Sea was dry land, and thus there was no geographic barrier between the European Balkans and Asian Anatolia. The two regions, which might seem very distinct to us today, were basically one. So when I say Europe, I actually mean the ancient landmass now divided between the Balkans and Anatolia.

Or not? Are there any D-stats that we can run to either confirm or debunk my Treemix-based hypothesis? Feel free to post your proposals in the comments and I'll try and run them as soon as possible.

Update 07/05/2016: The plot thickens somewhat. I added MA1 to the lineup, and now Villabruna shows minor Kotias-related ancestry. In other words, probably something from the Caucasus. The full series is available in a zip file here.

However, although interesting, this result doesn't change much. In fact, it adds more weight to the argument that Villabruna inherited his Y-chromosome from steppe foragers. That's because if this migration edge were a reflection of gene flow from Anatolia to Europe, I'd expect it to run from the Anatolia Neolithic branch, rather than the Kotias branch.

Chad Rohlfsen said...

No. When added, the errors and chi go up, and the tail goes down. Same with the inverse. They do fit better than their models of UP Europeans though. Still, I'm just playing around with the other option.

Chad Rohlfsen said...

Unlike EHG as a mix of ANE and WHG, this model actually passed, but minimally. Chi of 11 and tail .04. Like I said, this may only be covering for UHG, which will make WHG as ANE shifted.

Aram said...


What route do you propose for the R1b-Z2103 expansion? After all, this SNP's TMRCA is closer to that of L23, so it can be more informative about the origin of L23 and M269.

Details about Z2106, Z2109 and CTS7822 are also welcome. All of these SNPs (except CTS7822, which is relevant to the Balkans) are found in Yamna, Z2103* included.

Gioiello said...

@ Davidski

In this thread many spoke about hg. R1a-Z93*. Apart from the fact that India and Asia have above all the subclades from Z94 and downstream, this morning a friend of mine, the Italian-American Grisi, wrote to me that not only is he R1a-Z93 and negative for all the downstream SNPs, but also the last result from said he is negative also for KMS141, signed as Z93* Parent.
Italy has few R1a, but, as I have been saying for so long, all the oldest subclades are here, starting from R1a-M420*, and they may be very old here.

Rob said...


I'd like to ask, in your least poetic terms :), can you summarise the basal clades (not haplotypes) which exist in Italy & Spain ?

Chad Rohlfsen said...


I plan on taking a look at that and seeing if that phylogeny works in ADMIXTUREGRAPH. David could try the TreeMix with Africans, Neandertal, Oase1, Ust_Ishim, Iberia_EN, Anatolia_EN, Satsurblia, Kostenki14, GoyetQ116-1, Vestonice16, MA1, AfontovaGora3, Karitiana, Han, and Dai. We'll see if it creates something like these odd qpAdm runs I have or Basal Eurasian.

Chad Rohlfsen said...

Oh yeah, I forgot to add we'll need a WHG or two in that TreeMix run. One without wouldn't hurt to see if it puts the UP by EN + Ust Ishim or Oase1.

Matt said...

OT: All, with how Basal Eurasian is defined in this new paper (the new clarification being that in theory affinity to ENA vs Mbuti is an imperfect measure and affinity to Ust Ishim and Oase being what we should look for), what do we think about using either of the ratios:

a) f4 (GoyetQ116-1,Mbuti:Pop,Kostenki14) : f4 (GoyetQ116-1,Mbuti:Vestonice16,Kostenki14)


b) f4 (UstIshim,Mbuti:Pop,Kostenki14) : f4 (UstIshim,Mbuti:GoyetQ116-1,Kostenki14)

to try and find a level of Basal Eurasian?

(Cross check the latter with
b2) f4 (Oase1,Mbuti:Pop,Kostenki14) : f4 (Oase1,Mbuti:GoyetQ116-1,Kostenki14) ).

Seems like this would leverage having a greater number of old West Eurasian samples that don't have a lot of affinity to any particular recent populations (unlike the Bichon and Motala that Davidski used before), and also in theory don't have Basal Eurasian, and also the "unaffiliated" populations (Oase1 and Ust Ishim).

Vestonice16 and Kostenki14 taken above because IIUC they are supposed to form a clade relative to GoyetQ116-1, and likewise GoyetQ116-1, Vestonice16 and Kostenki14 all form a clade relative to Ust Ishim.

a) I think is really a sensor of "West Eurasian" clade ancestry, so would be decreased by ANE clade though, plus would be disrupted by El Miron type ancestry (should be low in moderns, may be some in WHG), so the b) models might work better for actual Basal Eurasian.

Davidski, does the SNP overlap suck too much for any of the above tests, or are they viable?

Matt said...

Actually, scratch that, wouldn't work. The second term in all those ratios would be approximately zero :(. (where we probably want it to be approximately close to one?)

Matt said...

What about: f4 (Oase1,Mbuti,Pop,Ust_Ishim) : (Oase1,Mbuti,Yoruba,Ust_Ishim)?
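
For anyone who wants to see what a ratio like this actually computes, here is a minimal Python sketch. It assumes each population is given as a list of per-SNP allele frequencies over the same SNPs. The function names `f4` and `f4_ratio` are made up for illustration and this is not the ADMIXTOOLS implementation. It also shows why a near-zero second term is a problem: the ratio is undefined when the denominator vanishes.

```python
from typing import Sequence, Tuple

Freqs = Sequence[float]


def f4(a: Freqs, b: Freqs, c: Freqs, d: Freqs) -> float:
    """f4(A,B;C,D): mean over SNPs of (a - b) * (c - d)."""
    n = len(a)
    if n == 0 or not (n == len(b) == len(c) == len(d)):
        raise ValueError("all four populations need the same non-empty SNP set")
    return sum((pa - pb) * (pc - pd) for pa, pb, pc, pd in zip(a, b, c, d)) / n


def f4_ratio(num: Tuple[Freqs, Freqs, Freqs, Freqs],
             den: Tuple[Freqs, Freqs, Freqs, Freqs],
             eps: float = 1e-12) -> float:
    """Ratio of two f4 statistics; refuses to divide by a ~zero denominator."""
    denominator = f4(*den)
    if abs(denominator) < eps:
        raise ZeroDivisionError("second f4 term is approximately zero")
    return f4(*num) / denominator


if __name__ == "__main__":
    # Toy frequencies only (hypothetical values, not real data).
    oase1 = [0.0, 1.0, 1.0, 0.0]
    mbuti = [1.0, 0.0, 1.0, 0.0]
    pop = [0.5, 0.5, 0.0, 1.0]
    yoruba = [1.0, 0.0, 0.0, 1.0]
    ust_ishim = [0.0, 1.0, 1.0, 1.0]
    print(f4_ratio((oase1, mbuti, pop, ust_ishim),
                   (oase1, mbuti, yoruba, ust_ishim)))
```

With real data the frequencies would come from genotype files, and a standard error would be needed before reading anything into the ratio.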

human443 said...

I've noticed some strange effects when Goyet-Q116 is involved in D-stats...can someone run a few just to see for sure if anything is going on.

Mbuti Ust_Ishim GoyetQ116-1 Kostenki14
Mbuti GoyetQ116-1 Han Onge
GoyetQ116-1 Ust_Ishim Han Onge
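
As a reminder of what each of these four-population lines is measuring, here is a small Python sketch of a D-statistic computed from per-SNP allele frequencies. It assumes the commonly used form, a sum of (w - x)(y - z) over SNPs divided by a sum of (w + x - 2wx)(y + z - 2yz). The function name `d_stat` and the toy values are illustrative only, and sign conventions differ between tools, so treat the sign here as an assumption.

```python
from typing import Sequence


def d_stat(w: Sequence[float], x: Sequence[float],
           y: Sequence[float], z: Sequence[float]) -> float:
    """D(W, X; Y, Z) from per-SNP allele frequencies (sketch, one sign convention)."""
    n = len(w)
    if n == 0 or not (n == len(x) == len(y) == len(z)):
        raise ValueError("all four populations need the same non-empty SNP set")
    num = 0.0
    den = 0.0
    for pw, px, py, pz in zip(w, x, y, z):
        num += (pw - px) * (py - pz)
        den += (pw + px - 2.0 * pw * px) * (py + pz - 2.0 * py * pz)
    if den == 0.0:
        raise ZeroDivisionError("no informative SNPs")
    return num / den


if __name__ == "__main__":
    # Toy pseudo-haploid calls (0 or 1 per SNP), hypothetical values.
    w = [1.0, 0.0, 1.0]
    x = [0.0, 1.0, 0.0]
    y = [1.0, 0.0, 0.0]
    z = [0.0, 1.0, 1.0]
    print(d_stat(w, x, y, z))
```

A value near zero is what you'd expect if W and X are symmetrically related to Y and Z; whether a deviation means anything depends on its Z-score.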

Olympus Mons said...

Which are the Iberia EN?

Olympus Mons said...

Buzz Alert!
Max Planck is about to reveal several results for several Iberia Bell Beaker sites. Hummm, my Shulaveri R1b(M269) and friends E1bM81 from North Africa should be revealed. ;)
And maybe with luck an H2a or H13...

Chad Rohlfsen said...

Iberia Early Neolithic.

Where did you hear this about Beakers? Anything we can read?

Davidski said...


Oase1 Mbuti Satsurblia Ust_Ishim : Oase1 Mbuti Yoruba Ust_Ishim 0.082402
Oase1 Mbuti Kotias Ust_Ishim : Oase1 Mbuti Yoruba Ust_Ishim 0.064586
Oase1 Mbuti Anatolia_Neolithic Ust_Ishim : Oase1 Mbuti Yoruba Ust_Ishim 0.048734
Oase1 Mbuti Stuttgart Ust_Ishim : Oase1 Mbuti Yoruba Ust_Ishim 0.064584
Oase1 Mbuti Iberia_EN Ust_Ishim : Oase1 Mbuti Yoruba Ust_Ishim 0.060454
Oase1 Mbuti Iberia_MN Ust_Ishim : Oase1 Mbuti Yoruba Ust_Ishim 0.024079
Oase1 Mbuti Karelia_HG Ust_Ishim : Oase1 Mbuti Yoruba Ust_Ishim -0.036115
Oase1 Mbuti Villabruna Ust_Ishim : Oase1 Mbuti Yoruba Ust_Ishim -0.043548
Oase1 Mbuti Loschbour Ust_Ishim : Oase1 Mbuti Yoruba Ust_Ishim 0.001228


I don't have the Onge.

Mbuti Ust_Ishim GoyetQ116-1 Kostenki14 -0.0122 -1.82 629888
Mbuti GoyetQ116-1 Han Dai 0.0001 0.083 658457
GoyetQ116-1 Ust_Ishim Han Dai 0.0012 0.586 656924
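
Assuming the three numbers after the population names are the D value, its Z-score and the SNP count, the Z-score is what decides whether a result counts as a real deviation from zero (an absolute Z of about 3 is the usual rule of thumb). The sketch below shows the basic idea of getting a Z-score from a delete-one-block jackknife. It takes per-block numerator and denominator sums as input and weights all blocks equally, which is a simplification; the function name and toy numbers are made up for illustration.

```python
import math
from typing import Sequence, Tuple


def jackknife_z(block_num: Sequence[float],
                block_den: Sequence[float]) -> Tuple[float, float, float]:
    """Return (estimate, standard error, Z) for a ratio statistic.

    block_num[i] and block_den[i] are the numerator and denominator sums
    of the statistic within genomic block i.
    """
    n = len(block_num)
    if n < 2 or n != len(block_den):
        raise ValueError("need at least two blocks of matching length")
    total_num = sum(block_num)
    total_den = sum(block_den)
    estimate = total_num / total_den
    # Recompute the statistic with each block left out in turn.
    leave_one_out = [(total_num - bn) / (total_den - bd)
                     for bn, bd in zip(block_num, block_den)]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((v - mean_loo) ** 2 for v in leave_one_out)
    se = math.sqrt(variance)
    if se == 0.0:
        raise ZeroDivisionError("zero jackknife variance")
    return estimate, se, estimate / se


if __name__ == "__main__":
    # Toy block sums (hypothetical values, not real data).
    est, se, z = jackknife_z([1.0, 2.0, 3.0], [10.0, 10.0, 10.0])
    print(est, se, z)
```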

postneo said...

@jaydeep, Onur, Ryu

1) Z93 of South Asia vs Europe are not bifurcated. All clades are represented with no regional structure in South Asia. In Central Asia and the Ashkenazi groups the representation is less parsimonious.

2) There is some evidence of mixing between a "NE Eurasian like" pop and a "ME" like pop in South Asia, but this could easily be due to the mixing of IVC pops as they moved eastwards. This is well represented archaeologically.

3) As for the split between Guj A and D, attributing that to Andronovo/Sintashta is a stretch. There seems to be a vast distance autosomally on plots seen so far. There might have been a common link population at some point but it seems quite distant.

Davidski said...


No one said anything about a NE Eurasian-like population.

The population mentioned was the Bronze Age Eastern European steppe population, rich in Z93, and very similar to present-day Northern and Eastern Europeans.

I should also mention that the oldest of these Z93 samples, Poltavka outlier, is very close in time to the major expansion of Z93 as inferred recently from the deep sequencing of South Asian Y-chromosomes. This sample is fully European and shows no signs of being a recent migrant to Europe from South or Central Asia.

So in other words, there was a massive migration from the steppe into South Asia during the Bronze Age, bringing with it Z93 and European admixture.

Matt said...

@ Davidski, thanks, looks mostly in the right direction, but magnitude incorrect and some noise. Populations with Basal Eurasian (Stuttgart, Kotias, Satsurblia) are only weakly less related to Oase1 than Ust_Ishim is, when compared to Yoruba.

If possible, could you test the methodology of

f4 (Kostenki14,Mbuti:Pop,Ust-Ishim) : f4 (Kostenki14,Mbuti:GoyetQ116-1,Ust-Ishim)


f4 (GoyetQ116-1,Mbuti:Pop,Ust-Ishim) : f4 (GoyetQ116-1,Mbuti:Kostenki14,Ust-Ishim)

for the same test pops (plus MA-1, AG3 and Dai)?

Based on this phylogeny of a weak K14+GoyetQ116-1 clade. That's more of a Western Eurasian clade ancestry test though.


f4 (Oase1 Chimp Pop Ust_Ishim : Oase1 Chimp Mbuti Ust_Ishim)?

Davidski said...

Let's move things to the new thread...

Jaydeepsinh Rathod said...


R1b is well spread out in Central Asia, Iran and Pakistan, besides India. It is even present in Nepal & Bhutan. And in most of these places there is a presence of very ancient subclades of R1b. Couple that with the presence of R1a & R2a around most of the above countries, and that is sufficient to suggest that R1b in SC Asia & in Iran might be an old presence.


Thanks for the clarification. I think I now understand your point better. Let us wait for more amazing aDNA finds. It is at least clear now that R1b in western Europe was present even before the Younger Dryas.

Jaydeepsinh Rathod said...


Thank you for the stats. The stats seem to support the division of South Asian people into roughly 2 ancestral groups aka ANI & ASI. However, ASI in South Asia is represented by mtDNA M and this has been suggested in a recent paper to have come into South Asia through a back-migration from SE Asia. This makes the situation quite interesting. On the other hand, the West Eurasian ANI might also be a very old presence in South Asia.

According to Metspalu et al 2011,

"Overall, PCA reveals that the genetic landscape of South Asia is characterized by two principal components of which PC2 is specific to India and PC4 to a wider area encompassing Pakistan, the Caucasus, and Central Asia."

PC2 is apparently the ASI and PC4 the ANI. Later on in admixture, they label the PC2 component as k6 & PC4 as k5. Regarding k5, which is the West Eurasian component with greatest density in SC Asia & Caucasus they say,

"However, we found that haplotypic diversity of this ancestry component (i.e. k5) is much greater than that of those dominating in Europe (k4, depicted in dark blue) and the Near East (k3, depicted in light blue), thus pointing to an older age of the component and/or long-term higher effective population size."

They also say the following regarding this West Eurasian component (k5) in South Asia,

"However, considering the geographic spread of this component within India, there is only a very weak correlation (r = 0.4) between probability of membership in this cluster and distance from its closest core area in Baluchistan (Figure S6). Instead, a more steady cline (correlation r = 0.7 with distance from Baluchistan) of decrease of probability for ancestry in the k5 light green ancestral population can be observed as one moves from Baluchistan toward north (north Pakistan and Central Asia) and west (Iran, the Caucasus, and, finally, the Near East and Europe)."

Since this component is most likely closely related to CHG, there appears to be no significant cline of CHG ancestry in South Asia.

Regarding the South Asian specific k6 they say,

"In contrast to widespread light green ancestry, the dark green ancestry component, k6 is primarily restricted to the Indian subcontinent with modest presence in Central Asia and Iran. Haplotype diversity associated with dark green ancestry is greatest in the south of the Indian subcontinent, indicating that the alleles underlying it most likely arose there and spread northwards. It is notable that this ancestry component also exhibits greater haplotype diversity than European or Near Eastern components."


Jaydeepsinh Rathod said...


Both the ancestral components in South Asia therefore appear to have a very ancient presence. The West Eurasian cline in South Asia need not be explained by a series of migrations into South Asia. It is equally probable that West Eurasians separated from East Eurasians around SC Asia. The ASI like people might have been a 3rd group - a basal South Asian if you like - which migrated into Peninsular India and remained relatively confined geographically.

From then on, a group may have separated from the West Eurasian group & migrated towards Europe & another towards the Middle East to admix with the Basal Eurasians. Hence, a West Eurasian presence within SC Asia need not be recent.

Since Villabruna appears to be closer to South Asians than Kostenki is, this must be a result of some later admixture between Villabruna & SC Asians. I will not speculate however on the direction of the migration.

The West Eurasian cline in South Asia, with respect to both Villabruna & Kostenki, might be explained in the following manner. The ANI West Eurasian & ASI may have managed to remain in different parts of South Asia without getting admixed or maybe they did admix but not majorly. Even if there was some admixture, a cline from North to South would still remain with respect to the West Eurasians. Meanwhile there may have been periodic migrations of West Eurasian ANI people from SC Asia towards Middle East & Northward towards the steppe.

The major ANI-ASI admixture has been dated to after 4200 BP and maybe there is some truth to it (though I am not entirely convinced because ASI is found even among Kalash, Central Asians & the Iranians). By 4200 BP, the ANI, in spite of an early presence in South Asia, would still have been closer to the West Eurasians compared to the ASI. Later on, as the ANI ancestry penetrated into Peninsular India, it would create a cline towards West Eurasians correlating with the ANI cline.

Davidski said...

Villabruna doesn't have any South Asian ancestry. But South Asians have Villabruna ancestry.

Also, the North/East European R1a-Z282 separated from the Asian (and South Asian) R1a-Z93 only around 5,000 years ago. North/East Europeans don't have South Asian admixture, but South Asians have North/East European admixture.

It's really not an issue worth debating. South Asians have West Eurasian admixture. Both from the Near East and from Europe.

Nirjhar007 said...

Hi Jaydeep,

Yes, what you say surely has logic. But I am not very interested in Basal clades, the main reason being that they have nothing to do with PIE. Too old! :). But on M-269 related clades, it will be interesting to see what the aDNA from SC Asia says.

