Friday, April 3, 2015

The teal people: did they actually exist, and if so, who were they?

The ADMIXTURE analysis in Haak et al. 2015 includes a series of intriguing teal colored components from K=16 to K=20 (see image here). The main reason I'm so intrigued by these components is that they generally make up over 40% of the genetic structure of the potentially Proto-Indo-European Yamnaya genomes.

But there's only so much one can learn by staring at a bar graph, so I thought I'd have a go at isolating the same signal with ADMIXTURE to study it in more detail. You can view the results of my experiment in the spreadsheet here.

I wasn't able to completely nail any one of the teal components from Haak et al., because I don't have access to all of the samples used in the paper (I'd have to sign a waiver to get them). Nevertheless, the signal looks basically the same.

Below is a bar graph based on the output featuring selected populations and ancient genomes from Europe and Asia. The Fst genetic distances between the nine components are available here.

Note that the teal component peaks in the Caucasus and the Hindu Kush, and generally shows a strong correlation with regions of relatively high MA1-related or Ancient North Eurasian (ANE) admixture. On the other hand, the orange component peaks among Early European Farmers (EEF), who basically lack ANE.

To learn about the structure of the three main West Eurasian components - blue, orange and teal - I made synthetic individuals from the P output to represent each of the components, and tested them with my K8 model. As expected, the teal component harbors a high level of ANE, while the orange component lacks it altogether. Refer to the spreadsheet here.
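One way to build such synthetic individuals, sketched here in Python, is to draw diploid genotypes from a single component's allele frequencies. This is an illustration only, not necessarily the procedure used for this post: it assumes each row of the P output holds one allele frequency per component, and the `toy_p` table, the column index and the function name `synthetic_genotypes` are all hypothetical.

```python
import random


def synthetic_genotypes(p_rows, component, n_individuals, seed=0):
    """Draw diploid genotypes (0, 1 or 2 allele copies per SNP) for one component.

    p_rows: list of rows, one per SNP; each row lists one allele frequency
            per component (assumed layout, see the lead-in).
    component: zero-based column index of the component to simulate.
    """
    rng = random.Random(seed)
    individuals = []
    for _ in range(n_individuals):
        genotype = []
        for row in p_rows:
            freq = row[component]
            # Two independent draws, one per chromosome copy.
            copies = int(rng.random() < freq) + int(rng.random() < freq)
            genotype.append(copies)
        individuals.append(genotype)
    return individuals


# Toy stand-in for a P file: rows are SNPs, columns are components.
toy_p = [
    [0.10, 0.90, 0.50],
    [0.80, 0.20, 0.50],
    [0.00, 1.00, 0.30],
]

# Five synthetic individuals drawn purely from the second column.
teal_like = synthetic_genotypes(toy_p, component=1, n_individuals=5)
```

Because every genotype is drawn from one column only, each synthetic individual should come out as close to 100% of that component when run back through the same model.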

It's very likely that the teal and orange components from Haak et al. share these traits. I think this is more than obvious by looking at their frequencies across space and time in Eurasia.

I also analyzed the synthetic individuals with PCA based on their K8 ancestry proportions. The samples representing the orange component fall just south of the Stuttgart genome from Neolithic Germany, and this is basically where I expect Neolithic genomes from the Near East to cluster when they become available.

Interestingly, the samples representing the blue component are dead ringers for Scandinavian hunter-gatherers (SHG). However, I suspect this is something of a coincidence caused by the small number of Western European hunter-gatherer (WHG) and Eastern hunter-gatherer (EHG) genomes in the dataset. The algorithm probably doesn't have enough variation to latch onto to create both WHG and EHG components, and in the end settles for something in between, which just happens to resemble SHG.

But the fact that the orange and blue samples more or less pass for ancient populations leaves open the possibility that the same might be said for the teal samples.

So did the teal people actually exist, and if so, who were they?

My view at the moment is that a population very similar to the teal samples formed in Central Asia or the North Caucasus during the Neolithic as a result of admixture between MA1-like and Near Eastern groups. This population, I believe, then expanded into the Russo-Kazakh steppe by the onset of the Eneolithic.

Were they perhaps the Proto-Indo-Europeans? Probably not. I'd say they were Neolithic farmers who eventually played a role in the formation of the Proto-Indo-Europeans. In any case, someone had to bring the Caucasian or Central Asian admixture to the steppe, and I have it on good authority that it was already present among the Khvalynsk population of the Eneolithic, albeit at a lower level than among the Yamnaya of the early Bronze Age.


Haak et al., Massive migration from the steppe was a source for Indo-European languages in Europe, Nature, Advance online publication, doi:10.1038/nature14317

Update 16/11/2015: 'Fourth strand' of European ancestry originated with (Caucasus) hunter-gatherers isolated by Ice Age


Grey said...

If there are potential medical benefits from tailored medicine then as it becomes more obvious then rich people and governments in other parts of the world will start funding research into their own ancestries. If so the coverage will snowball rapidly.

rozenfag said...

Excerpt: “With the help of forensic experts, we will try to reconstruct their DNA,” Prof Jadhav said.

“We tried doing the same with the help of a Japanese anthropologist five years ago, when a Harappan-era graveyard was discovered at Farmana village in Rohtak district, but failed,” he added.

Nirjhar007 said...

Old News Roz.

Krefter said...

A display with individuals from the 4,000 YBP site from the "strzyżowski burial culture of the early Bronze Age" from Hrubieszow, Poland (southeast corner of Poland) will be at the "Museum University of Science Hrubieszow" from April 24 to the end of July.

This has already been posted but anyways here's mtDNA found on the site. The H clades are more typical of Yamna-CWC than Neolithic Poland-Central Europe.

"U5b1, H1b, H2a, H1b, H1a, H2a, H6"

They tested Y DNA, but I haven't heard any news on what haplogroup the "warrior" belonged to.

Does anyone know if they tested autosomal DNA?

Mike Thomas said...

@ Maju

"c. 5000 BCE, another wave from Anatolia (related to Tell Halaf, it seems) that conquered much of the Balcans (Dimini-Vinca) and surely changed their demography"

That's one opinion, Maju. Certainly the opinion of scholars from SEE today is that the Vinca culture emerged in situ. Granted, the social changes from the earlier Starcevo were marked.

I guess studies will prove it either way, if the Greek data is anything to go by.

Maju said...

"That's your opinion, Maju."

It's my opinion based on opinion of at least some scholars. Also:

"... Certainly the opinion of scholars from SEE"...

All the scholars or some particular individual you happen to favor? I mean: seriously, cheap propaganda tricks is what your phrasing is made of.

Not just social changes were important, there is new pottery painted with new colors, new art, burned villages, new villages... and a connection with East Anatolia (Can Hassan) and the Euphrates (Tell Halaf).

If we accept now that Cardial implied replacement (even against continuity of local Epipaleolithic techs in most sites), how are we not going to imagine the same or worse with the Sesklo-Dimini change?!

Mike Thomas said...

@ Maju

"cheap propaganda tricks is what your phrasing is made of. "

I was merely stating, by the by, that, as with any cultural development, the discourse on the origins of the Vinca culture is dominated by debates about diffusionism vs internal developments, and migrationism vs autochthonism.

That isn't propaganda, but a well-balanced assessment of the current academic status quo. Now, you are convinced of a migration event. Perhaps. I wasn't actually arguing against it per se.

And as I clearly stated, the genetic evidence might indeed suggest that there was a second lot of (mid-Neolithic) migrations into SEE (referencing the findings of the recent "palatial mtDNA" paper - although this isn't Vinca country).

So I fail to see the need for your whiney little comment. You can really come across as pompous, self-important and crabby. Rather surprising given that for many facts you appear to be out of touch, and even utterly ignorant of recent developments.

Krefter said...

"Analysis of over 2,000 HVR1 profiles From SouthWest Asia"

Mike Thomas said...


Do you think full Y sequencing is likely to under or over estimate node ages?

Davidski said...

It depends on the precise methodology, but YFull has a big database now and calibrates its estimates with ancient genomes, so I'd say it's as accurate as anyone, and certainly more so than any paper published to date.

Maju said...

@Mike: I was "whiney" because of the way you presented the case: you didn't say "there are other opinions", you quite aggressively and one-sidedly attempted to demolish my argument with what I immediately realized was a cheap publicity trick ("experts say..."), which you may have used unconsciously but was clearly a manipulation anyhow. And the best way to fence off a manipulative attack in what should be a dispassionate and rational debate is always to expose it: "you are doing trick number 7", for example. So I did.

"... for many facts you appear to be out of touch, and even utterly ignorant of recent developments".

Maybe (not sure because you seldom argue your point, just attempt to disqualify). In any case you are not going to persuade me (and possibly nobody else either) with aggressive comments like that. If you'd be less aggressive and more pedagogic, you'd surely make a greater impact.

So instead of saying: "you are ignorant and don't know half of it", you could say: "I think you may have missed this and that study, [refs.], where a different conclusion is argued quite solidly". That way we could debate as adults and (wannabe) scientists, in an environment of mutual respect.

Thank you in advance for your (expected) change of attitude.

Mike Thomas said...

@ Davidski

" YFull has a big database now and calibrates its estimates with ancient genomes, so I'd say it's as accurate as anyone, and certainly more so than any paper published to date."

Yes I agree

Karl_K said...

@Maju Re:Mike

"In any case you are not going to persuade me (and possibly nobody else either) with aggressive comments like that. If you'd be less aggressive and more pedagogic, you'd surely make a greater impact."

I totally agree. Maju and I are trying to be civil, but these burning nail arguments are starting to, well... burn... like nails?

(FYI Maju; your Spanish idioms might sometimes be confusing to others)

Maju said...

"FYI Maju; your Spanish idioms might sometimes be confusing to others"...

You mean using "propaganda" as a synonym of "publicity"? I noticed, but only later. We don't really make that distinction: commercial or political, it's the same old evil Goebbels trying to brainwash you.

But, anyhow, you know: languages change through corruption. The bigger and faster-growing the empire, the bigger and faster the change. I reclaim my right to use English wrongly at least now and then. At least until China takes over.

Krefter said...

I found this from a poster at Eupedia. This will probably help tell why South Asians are distinct.

Mike Thomas said...


Do you want to do a new post recapping the current status of aDNA evidence from NW Russia and the North Pontic region? Although limited to mtDNA, there have been several studies with at times different conclusions, and this raises fundamental questions as to whether the North Pontic region was WHG or EHG before the Eneolithic.

Davidski said...

Mike, I'm not sure if the results are contradictory, since we only have genome-wide data from one part of the Yamnaya horizon and a few Corded Ware samples from Germany.

I think we'll eventually see significant substructures within the Yamnaya horizon, probably with some regions showing some of that orange EEF and also inflated WHG.

To me it just seems like common sense. So let's just wait for some more genomes from the steppe.

Mike Thomas said...

Yeah sure
But I was referring to like a roundup of the several mtDNA studies to date, and what they currently suggest.

By contradictory I meant differences of opinion between the studies themselves. E.g. the provenance of mtDNA Hg C, the role of various admixtures and from where, etc. There'd be quite a bit of stuff there.

Simon_W said...

@ Alberto

It was you who had started speaking metaphorically of M417 as „the grandfather“ etc, of course I didn't mean literally that they grew up in the same house.

But true, if „the grandfather“ had already spread over parts of Europe and West Asia before the two descendent clade founders had originated, it's possible that the one „son“ originated in Europe and the other one in Asia. However, we don't know when the grandfather had spread to the places where he is found now. We just know that both descendants go back to the same root, and hence the distribution of the uncle lends some credibility to the thesis that the origin was in Europe rather than in Asia. But it has to be realized, this is no compelling argument, it's nothing more than a weak hint.

Simon_W said...

@ Maju

Well, what's the meaning of „being centered somewhere“? There may be different definitions for this, but as far as absolute and relative numbers are concerned, R1a-L664 is certainly centered on northern, central and northwestern Europe. In the FTDNA R1a project there is not a single L664 from West Asia, or from Asia in general, for that matter. Sampling bias cannot explain this, as the project has tested enough West Asian people to find two West Asian R1a*(xSRY10831.2). R1a-CTS4385* (xL664) is another matter, however. It's still such a recent discovery that it's not yet included in the ISOGG tree. And it's definitely rare. The FTDNA R1a project has just one individual from England. But this doesn't prove much, because, as you said, Underhill's Turkish M417* individuals were surely not tested for this SNP, which allows for the possibility that they had it too. Which raises the question how the presence of a full-grown clade matters more than the existence of some unassigned paragroup individuals. A clade may be the result of a random founder effect which turned an individual of a paragroup with some private mutations into a clade of many people. This doesn't mean that the distribution of this clade matters more than the unassigned paragroup individuals.

Simon_W said...

I don't believe that the teal population, if it existed, may have originated in the North Caucasus during the Neolithic as a result of admixture between MA1-like and Near Eastern groups. Because as late as the Neolithic we can hardly expect to find MA1-like people in the Caucasus foreland in Southern Russia, other than the regular EHG. But PCA plots clearly show that EHG + Near Easterners ≠ the teal component.

Simon_W said...

Correction: Two R1a-CTS4385 (xL664) individuals in the R1a project, apparently both of British origin. Doesn't change anything though.

Maju said...

@Simon: "In the FTDNA R1a project there is not a single L664 from West Asia, or from Asia in general, for that matter. Sampling bias cannot explain this".

It absolutely can, or are you telling me that FTDNA is now investing its profits in sampling areas where they have almost no customers? Obviously not: FTDNA is not an academic institution. They have made some interesting phylogenetic research with the data they get from almost exclusively NW European ancestry people but they have never performed field research as such, neither in West Asia nor in Southern Europe nor anywhere else. Nor will they, nor is it admittedly their task.

"... as the project has tested enough West Asian people to find two West Asian R1a*(xSRY10831.2)".

Who are you kidding? The quality of the West Asian data that a single academic field study such as Underhill's provides is clearly much much better. But what really matters is the issue of comparative sampling: how many West Asians and particularly Turks, Kurds and Iranians (no, I don't think affluent Kuwaitis and Israelis count here) has FTDNA sampled in comparison with NW Europeans? What's the ratio: 1-100?, 1-1000?, 1-10,000? There's no way to make meaningful comparisons because you will always find many more rare lineages in a huge sample than in a tiny one. FTDNA samples are meaningless for the kind of analysis you want to imply.
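The sample-size point can be made concrete with a small calculation. Assuming carriers are sampled independently, the chance of seeing at least one carrier of a lineage with population frequency f among n samples is 1 - (1 - f)^n. The frequency and sample sizes below are hypothetical, not FTDNA figures.

```python
def prob_at_least_one(freq, sample_size):
    """Chance that a sample of the given size contains at least one carrier."""
    return 1.0 - (1.0 - freq) ** sample_size


# Hypothetical rare lineage carried by 1 in 1,000 men in both regions.
rare_freq = 0.001

# Hypothetical sample sizes with a 1:1000 ratio between two regions.
small_sample = prob_at_least_one(rare_freq, 10)
large_sample = prob_at_least_one(rare_freq, 10_000)

print(f"n=10:     {small_sample:.3f}")
print(f"n=10,000: {large_sample:.3f}")
```

With the same underlying frequency, the larger sample is far more likely to turn up the lineage, so a raw count of hits says little unless it is scaled by the number of people tested in each region.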

But regardless... L664 is just some of that M417* by another name and M417* was indeed found by Underhill in Turkey. This is more than enough to radically question your biased narrative, your wishful thinking.

Simon_W said...

Nobody denies that the FTDNA samples are primarily northwest European, much more than anything else. But do you believe R1a*(xSRY10831.2) isn't rare in West Asia? Do you think it's common there? Hardly so. Yet they found an Arab from northern Iraq (Mosul) with this haplogroup, and another West Asian without proper personal information. And nobody could claim that there are not enough FTDNA customers from southern, southwestern or eastern Europe for at least one L664 to be found, yet there were no L664 from these places. (Except one from the Baltic IIRC.) There's no way to get around it, it's a north-central-northwest European haplogroup.

Maju: „ L664 is just some of that M417* by another name and M417* was indeed found by Underhill in Turkey. This is more than enough to radically question your biased narrative, your wishful thinking.“

Did you read my post at all? You just repeated what I said: „Which raises the question how the presence of a full-grown clade (i.e. L664) matters more than the existence of some unassigned paragroup individuals (M417*). A clade may be the result of a random founder effect which turned an individual of a paragroup with some private mutations into a clade of many people. This doesn't mean that the distribution of this clade matters more than the unassigned paragroup individuals.“

Simon_W said...

I think the only evidential advantage of a fully-grown, geographically confined clade versus a couple of unresolved paragroup individuals is that the members of a clade cannot be explained away as erratics, whereas the ancestors of unresolved paragroup members in theory could be from anywhere. Admittedly not a big advantage.

Simon_W said...

Even if R1a-Z93 originated in southern central Asia and spread from there, it doesn't follow that its ancestor was from West Asia rather than from the Eurasian steppe.

Underhill found six R1a1-SRY10831.2*(xM417/Page7) chromosomes, five of which were from Iran and the sixth from the Caucasus. But we now know, thanks to ancient DNA, that precisely this haplogroup was present in Karelian EHG and in EHG on the Upper Dvina around 4000 BC. Why did Underhill not find it in modern Karelians and northwestern Russians? The only reasonable answer being: Because in these places it has disappeared in the meantime. And therefore it obviously follows that modern absence of evidence isn't evidence for absence in prehistory!

The 24 R1a-M420*(xSRY10831.2) chromosomes found by Underhill in Iran and Kurdistan might be a similar case – we cannot safely conclude that it wasn't present in EHG and that it was only present in West Asia. The above example proved beyond doubt that such reasoning can be erroneous.

To me R1a-M417 looks a lot like an Indo-European marker. And since the PIE homeland presumably was on the PC steppe, it isn't far fetched to assume that R1a-M417 spread from there.

Now it seems true that Yamnaya and Corded Ware people had a lot of West Asian and „teal“ admixture. But we cannot safely ascribe R1a-M417 to an expansion of West Asian „teal“ people. The said admixture may be associated with R1b-P297, with R1a-M417, with both or with neither haplogroup. And at least I note that the purely R1b Yamnaya sample had more teal/West Asian than the Corded sample without R1b.

Maju said...

@Simon: If you have a subclade of M417 (Z93 and ALSO other smaller lineages within the paragroup M417*, such as the sample detected by Underhill in Turkey) centered in West Asia and you have a pre-existent stage (upstream of M417) in which all the action is concentrated in West Asia, parsimony obliges, because there is not enough evidence demanding a European centrality for the only dubious node, which is M417.

Only if you can gather enough evidence (not just in NW Europe but in the overall West & South-Central Eurasian region) about M417 being strongly associated to a European centrality, convincingly breaching the parsimony principle, then I would not be able to object. But so far there's nothing of that, just wishful thinking based on very unequal samples.

Simon_W said...

Sorry, I was on holiday:

You said all the action upstream of M417 is concentrated in West Asia. That's simply not true. We've got a Karelian hunter-gatherer with R1a1 and a hunter-gatherer from the upper Dvina 4000 BC also with R1a1. How can you ignore these? You might ignore them if R1a1 had been rare in hunter-gatherers of northeastern Europe. But the idea that it was rare is very unlikely, because we have just very few samples, and it's unlikely to find rare haplogroups in a tiny population sample.

On the other hand the present-day evidence you're drawing upon isn't compelling because there is no guarantee that basal forms of a haplogroup still have to be in place today close to where they originated. This would presuppose a continuity of settlement of at least part of the original population, and moreover that the marker didn't get washed away by drift.

Note: I'm not saying that R1a1 or M417 originated in Europe, and I don't have the wish that it originated in Europe. Europe or Asia, what does it matter?

Maju said...

@Simon: apples and apples, oranges and oranges. Compare modern frequencies with modern frequencies, and compare Karelian aDNA... with Iranian aDNA - when we have some.

Anyhow, no survey of n=1 is significant. It's like imagining that C1 originated in Castile-León because the only ancient sample we have is from there. It's a data point but an isolated data point that can hardly be compared to anything.

My interpretation anyhow is that R1a upstream of M417 (and also partly downstream as discussed previously) was an Iran-Kurdistan thing (plus Caucasus, Anatolia surely) and that some (private or minor) sublineage branches were scattered around much like C1 or pre-R1 in the Mal'ta boy. We don't know yet what lineages actually dominated among those peoples (N1 in Karelia already?) nor how diverse were their sub-branches of these essentially Asian lineages.

The only thing certain about ancient Y-DNA is that I (at the very least I2) is a European-specific haplogroup since the Paleolithic: that is something that both ancient and modern DNA data confirm beyond any reasonable doubt. Therefore I outside Europe seems to indicate migrations from the subcontinent. However I outside Europe is rather limited (Guanche mummies and a thin scatter in North Africa and West Asia that doesn't say much).

Simon_W said...

Yes, we may separate the modern distributions and the aDNA. Regarding the former I agree that there is certainly some logic in the assumption that the phylogeographical pattern, i.e. the different distributions of older markers and paragroups versus younger clades is to some degree correlated with the actual spread of the haplogroup in question. That's why e.g. there is haplogroup A0 in Africa, coinciding with an African root of the human yDNA tree. The question is just how exact and closely this pattern is related with the actual spread. The older the haplogroup in question, the more time the carriers of old variants had to migrate, that's also logical. Therefore it's unlikely that there is a 100% match between modern distributions and the actual spread. And there is also the problem that refuge areas like mountains (Caucasus, Iran!) or islands preserve older variants better than plain, open areas which are more easily overrun by new arrivals – also logical. That's why we see a correlation of EEF haplogroup G2a with mountains.

R1a1 wasn't exclusively an Iran-Kurdistan-Caucasus thing as the modern patterns would make us believe. Now we know for sure, thanks to aDNA evidence, that it was common in hunter-gatherers of northeastern Europe. Yes we can assume that it was common there – otherwise it wouldn't have been found twice in a sample of two individuals. (Here is the second one btw: ) You don't happen to come across a rare haplogroup twice in a sample of two. It's possible, but very unlikely, and the more unlikely the rarer the haplogroup. We cannot say for sure that it was the predominant haplogroup, but it was hardly rare. The same with C1a2 in Iberian HGs. It's extremely uncommon in modern Iberians, but I have no doubt that it was at least not uncommon in Iberian HGs. Therefore small samples of aDNA are not insignificant, to the contrary, they are very important and valuable. Does it mean C1 originated in Iberia? Of course not, and there is ancient DNA evidence to the contrary (C1a2 having been found in an EEF of Hungary, C1 in the UP hunter-gatherer K14).
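The "two out of two" argument can be put in numbers with a short sketch. Assuming the two individuals are independent draws from a population where the haplogroup has frequency f, the chance that both carry it is f squared. The frequencies below are hypothetical illustrations, not estimates for any real population.

```python
def prob_all_carriers(freq, sample_size):
    """Chance that every individual in an independent sample carries the haplogroup."""
    return freq ** sample_size


# Hypothetical population frequencies: rare, moderate, common.
for freq in (0.01, 0.10, 0.50):
    both = prob_all_carriers(freq, 2)
    print(f"frequency {freq:.2f}: chance both of two samples carry it = {both:.4f}")
```

The rarer the haplogroup is assumed to be, the faster this probability shrinks, which is the sense in which two hits out of two point to a haplogroup that was not rare. The independence assumption matters: close relatives buried together would not count as separate draws.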

Is R1a1 essentially Asian? Again, I don't consider this to be an important question. The Eurasian steppe belt is essentially Eurasian, what does it matter on which side of the artificial divide R1a1 originated? It's more informative to note that R1a1 is essentially West Eurasian, not East or South Eurasian. N1 on the other hand does seem to be associated with some East Eurasian admixture. (I think it's definitely possible that N1 was already in Karelia at the same time as the early R1a1 – it presumably spread with the Pit-Comb Ware, and some crania of the Pit-Comb Ware people show unmistakable eastern, East Eurasian-like influence.)

I agree about I being the most European specific of all haplogroups.

Maju said...

@Simon: I totally forgot about the Upper Dvina findings (I'm definitely not paying the same attention as I used to), you are correct.

However these lineages don't seem to be tested for R1a1 downstream mutations. So IMO they represent the seedling of Z282 (in a stage between M417 and Z282), which seems to have expanded precisely from that area or probably a bit farther to the south, towards the Upper Dniepr. This seems to imply that the big R1a1 expansion is rather pre-Neolithic or borderline Neolithic at the latest, rather than the shorter scholastic time-frames arbitrarily favored by Underhill.

But indeed the Karelian R1a1* is probably an offshoot of this early M417 arrival to Europe, North German (and the few English) M417* surely also belongs to this pre-Kurgan preliminary setting. It is very interesting that R1a1 is found with N1c since c. 2500 BCE, which indicates the Uralic presence nearby. But even more interesting is the high frequency of mtDNA H (CRS is probably H1, almost certainly not "H2", that's a clear labeling mistake).

If we get back to the wider picture, this to me implies that, at about that same time, since c. 6000 BCE (or maybe a bit earlier: I can accept up to c. 11 Ka BP, i.e. c. 9000 BCE), there was also a parallel development of the same kind in Iran leading to Z93, with the Turkish M417* being possibly related to this and/or the migration to Europe.

I'm integrating both data-sets: the Europe-only ancient DNA and the wider modern DNA one of Underhill. I'm also integrating the most plausible chronologies, which are necessarily older than the one arbitrarily favored by this researcher.

"R1a1 wasn't exclusively an Iran-Kurdistan-Caucasus thing as the modern patterns would make us believe. Now we know for sure, thanks to aDNA evidence, that it was common in hunter-gatherers of northeastern Europe".

We only know the exact affiliation of the Karelian HG, the others are blurry and could (IMO should or even must) be downstream of the R1a1 node and rather be precursors of Z282 (uncertain about the exact stage but M417 at the very least). Else you have to argue for a back-migration of M417 to Iran before it became Z93, which seems impossible for all I know about prehistory. So IMO M417 is when R1a migrates to Europe and Z93 when it does to Central/South Asia. But M417 is not European as such, only a subset of it is.

Simon_W said...

IMO the presence of R1a1* in a Karelian HG means that M417 may just as well have originated on the steppe, or even further north. This R1a1 in eastern Europe might be a precursor of M417. (Of course not exactly this one, but he surely had remote relatives with R1a1 in the not so distant neighbourhood.) It also proves that the evidence presented by Underhill is blurry and incomplete: According to present-day distributions R1a1* cannot have been present in Karelia, only in Iran and the Caucasus.

I don't see why M417 would have to have back-migrated to Iran before it became Z93. The population which carried Z93 to Iran may easily have included members of other haplogroups and M417-carriers who lacked Z93. The same with European M417*.

The Yamnaya and Corded people did indeed have some West Asian autosomal admixture, and there is a Kartvelian-related language influence in PIE. But if this was associated with a y-haplogroup, this is more likely to have been R1b, since R1b1b and R1b1c are purely Asian and African clades of R1b. And R1b-L23 is rather common in one Kartvelian region, whereas R1a is completely uncommon in Kartvelians.

For linguistic reasons I favour a PIE homeland on the steppe, I think there we agree. But wouldn't it be strange if the patriarchal PIEs who disseminated their language over a very wide area were on the paternal side exclusively descended from non-IE West Asians? Of course it's possible, but not very plausible.

Don't get me wrong: I'm just trying to be realistic. According to the latest Eurogenes K6 I'm 11.7% Middle Eastern on top of the EEF and Yamnaya-related ancestry, and proud of it.

Maju said...

@Simon: I can't but disagree. The Patriarchal IEs (not anymore PIEs after the first expansion and dilution event, for example Sredny-Stog II in Europe) left a mark but, as good patriarchal (and not just patrilocal), their main goal was to expand their family, i.e. their collection of slaves (famulus = house slave, from which family = set of slaves, including women and one's offspring), and that implies means other than just bio-genetic. You don't want you or your (main) lineage to work but to rule, that's the patriarchal concept, and you have to work if you are the only ones around.

If we accept, and I do grosso modo, that (most) R1a in West Europe is of IE origins (say Corded Ware or similar), you still get impressive IE Y-DNA scores. Let's assume that the original R1a score among CW IEs is exactly the same as modern Polish (58% per Eupedia), that means:

→ 40% IE patrilineal ancestry in North/East Germany
→ 16% in West/South Germany
→ 8% in England
→ 3% in Spain (extremes: 16% in Cantabria, 0% among Basques)

That's already quite impressive, IMO. BUT if you consider as baseline the frequency of West Germany (9%), which implies accepting a dilution to 16% first of all, as per above, then:
→ England: 50%
→ Spain: 22% (extremes: 94% in Cantabria, 0% among Basques)

And that is truly amazing for a bunch of disorganized Celts (and their Italic/-oid and Germanic cousins). It seems to imply that Celts had much more of an impact in far away areas than their Corded Ware precursors in nearby areas (although this is a very rough simplification, admittedly). Obviously, if Celts carried only 9% R1a, much of the rest of their lineages was R1b indeed (47% if we accept the modern West German baseline as valid) but there is still a large percentage that is pre-Celtic and hence pre-IE, for example in England still 44% of pre-IE lineages would be R1b (vs. 24% Celtic+). Anyhow this is a very rough approximation because you would also have to factor Germanics and possibly also Italians (Romans), but still shows that unless you adopt an extremist position re. a very recent origin of R1b in Central Europe, able even to penetrate massively (but without a hint of R1a) pre-IE populations like Basques, R1b is clearly pre-IE in Western Europe (at least most of it).
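The arithmetic behind the two lists above can be sketched as follows. The implied share of IE patrilineal ancestry is simply the local R1a frequency divided by the R1a frequency assumed for the source population. Only the 58% and 9% baselines and the 50% and 22% shares are taken from the comment; the local R1a frequencies are back-calculated from those shares, and the function names are hypothetical.

```python
POLISH_BASELINE = 58.0       # % R1a assumed for Corded Ware (modern Polish value)
WEST_GERMAN_BASELINE = 9.0   # % R1a in West Germany, used as the second baseline


def implied_share(local_r1a_pct, baseline_pct):
    """Share (%) of source-population patrilineal ancestry implied by local R1a."""
    return 100.0 * local_r1a_pct / baseline_pct


def local_freq_from_share(share_pct, baseline_pct):
    """Invert implied_share: local R1a (%) that would yield the given share."""
    return share_pct * baseline_pct / 100.0


# West Germany measured against the Polish baseline.
west_germany = implied_share(WEST_GERMAN_BASELINE, POLISH_BASELINE)

# Back-calculate local R1a from the shares quoted against the 9% baseline.
england_r1a = local_freq_from_share(50.0, WEST_GERMAN_BASELINE)
spain_r1a = local_freq_from_share(22.0, WEST_GERMAN_BASELINE)

# Re-express those against the Polish baseline.
england_vs_polish = implied_share(england_r1a, POLISH_BASELINE)
spain_vs_polish = implied_share(spain_r1a, POLISH_BASELINE)

print(round(west_germany), round(england_vs_polish), round(spain_vs_polish))
```

The two lists are therefore the same local frequencies divided by two different baselines, so every conclusion drawn from them depends on which baseline is accepted as the original R1a frequency of the incoming population.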

Simon_W said...

Maju, I wasn't claiming that R1b was originally IE or the marker of IE centum populations, like some people do. But the fact remains that there was R1b on the steppe, and in Yamnaya, and this may well have been associated with the Georgian-/Armenian-like West Asian autosomal admixture in Yamnaya and Corded Ware, that was my point. How many Corded Ware y-chromosomes have been analyzed so far? Only few, so the possibility remains that some had R1b, just like their Yamnaya neighbours had.

Regarding R1b in western Europe, this is again another topic. But going by the IBS stats posted by David, one of the German Bell Beaker people was closest to modern Basques. And I commented, in the respective thread, that given this autosomal similarity, and the high frequency of R1b in Bell Beaker people, it's not far fetched to think that this woman from Germany spoke a Basque-related language. But all the EN and MN samples so far, and they're quite a lot, are closer to Sardinians than to Basques. I concluded that the genetic signature of Basques originated where east and west met, and more precisely where the eastern influence faded out, and that they had R1b since that time. While IE speaking central European populations had less R1b, not just more R1a, but also more of the local I2, so that the further IE expansion westward caused the pattern of highest R1b frequency on the Atlantic facade.

Maju said...

@Simon: "But the fact remains that there was R1b on the steppe, and in Yamnaya"...

But that is a different lineage from what is found in Western Europe, a lineage particular to parts of Asia and that remote easternmost border of Europe. There's no relation whatsoever with Western R1b, just as there's no relation whatsoever with Sudan-Chad R1b. All originated at some point in West Asia, but that happened who-knows-when and otherwise there is no connection, just like there is no particular connection between R and O, even if they are "cousins". This last is a good example: in order to understand their connection you have to go to Middle Paleolithic Sundaland but, thinking shallowly or knowing just a few basic facts, someone could (and sometimes did in fact) argue for a common origin somewhere in the steppe, even a relatively recent one. Facts show that it is not the case at all. Luckily for the understanding of this matter, R and O (or their precursors P and NO) have simple, well-defined labels, but that's not the case with R1b, whose nomenclature is ever-changing, and who on earth easily remembers the defining SNP markers (so many!)? So people talk of "R1b" as if it were something homogeneous, instead of R1b-Western, and that is as extremely misleading as trying to understand R based on O, or as dealing with E1b without the much-needed nuances affecting each of its sublineages (anyhow, even E1b is simpler to understand than R1b, it seems).

"Regarding R1b in western Europe, this is again another topic".

Is it? The way you present it, it seems it's just a single huge topic labeled "R1b", without distinctions. It should indeed be a totally different topic, but that implies adding a label to "R1b" such as R1b-Caspian, R1b-Sudan or R1b-Western.

"But going by to the IBS stats posted by David, one of the German Bell Beaker people was closest to modern Basques".

I can't say. Going by the data posted by Skoglund & Malmström last year, the closest thing to modern Basques and nearby SW Europeans were Swedish Megalithic Farmers, so I'm still awaiting more refined analyses on the autosomal matter. It's possible that Bell Beaker people were descendants of SW Europeans, as the chronology of the phenomenon (and its Megalithic precursor) suggests, but it's also possible that different analyses of autosomal data are just giving contradictory solutions... so I'm remaining skeptical for the time being and will await more data and more analyses before finally making up my mind. Lazaridis was revealing; Haak is confusing instead.


Maju said...


"But all the EN and MN samples so far, and they're quite a lot, are closer to Sardinians than to Basques".

Not if you follow Skoglund and Dasakali: both Swedish researchers, independently, published papers where Gökhem people cluster with Basques (Dasakali, one sample) and with Basques and Spanish (Skoglund, all four samples). In this case there was one being intermediate between Basques and Orcadians (but to the "West" of French), so maybe Irish or British-like...

The Haak study lacks contrasting information, such as different PCAs with different samples, Europe-only analyses particularly, and it is in this sense much worse than Lazaridis (except that it has more ancient samples). It is a study, but it is not the last word, nor does it automatically override the previous data, as they haven't even bothered discussing how the previous data and theirs are in contradiction on some key issues such as this one of what they so annoyingly mislabel "middle Neolithic" (i.e. Early Chalcolithic) populations, which cluster in a certain position in their study that clearly contradicts what previous (but very recent and very important) studies found.

So I see no reason to take Haak's results as better than Skoglund's and Dasakali's, particularly on this matter of Gökhem (and maybe other Early Chalcolithic samples). I do have the impression that the results are somewhat "cooked up" in Haak's, really, in order to push ahead with a genetic Indoeuropeanist agenda (even then they are more cautious than many opinions here, particularly re. R1b).

"I concluded that the genetic signature of Basques originated where east and west met, and more precisely where the eastern influence faded out, and that they had R1b since that time".

Where East and West meet? That's a very blurry label. I'm assuming you mean Germany from context. I do not see that: I see genetic continuity among Basques since at least the Early Chalcolithic (very possibly the Early Neolithic, at least in some sites like Paternanbidea), I see Basques as the first modern population among those studied in Europe in genetic terms (particularly mtDNA), and all that fits well with archaeology and the logic of Vasconic being most likely the language family of the mainline European farmers (i.e. excluding Eastern European ones).

Simon_W said...

@ Maju

Actually I was drawing upon analyses made by Davidski, not directly from the Haak paper. He did use genetic raw data from the Haak paper, but also from other sources. And he calculated the amount of IBS sharing between ancient individuals and modern populations. Going by this analysis, the first ancient individual to have the highest IBS sharing with Basques was Bell Beaker I0108. That's IBS sharing, not PCA; I acknowledge it's possible to arrive at different conclusions using different methods. And anyway, I just noticed that he didn't include the Skoglund_MN samples in his analysis, which is a big downside. Especially Gok2, with his strong WHG admixture, may be less similar to Sardinians.

Regarding the R1b in Yamnaya: The majority of it indeed belonged to a brother clade of west European M412, leading to the Near Eastern L277. But there was also an L23 with no further downstream mutation present and one P297. The latter two are upstream of the West European M412.

Simon_W said...

Regarding R1 in Europe, I tend to agree with Krefter, who hit the nail on the head quite nicely. Quote:

Founder effect. 2 random R1 men (R1a-Z283, R1b-L11) who lived ~6,000YBP< represent almost 50% of west and northeast European paternal lineages. Most paternal lineages from that time period are extinct or very very rare. There are other clear founder effects from that time period within I1a-DF29 and I2a2a-M223. Also, as R1b-L11 moved in west Europe it kept having regional founder effects (R1b-L21, R1b-DF27, R1b-U152, R1b-U106).

You don't see the same trend at all with mtDNA. One ~5,000YBP maternal lineage in a region of Europe will probably represent at most a few percent of the maternal lineages.

Maju said...

What happens if the actual date is not 6 Ka BP but 12 or 15 Ka BP? It changes everything! Even just 2-3 millennia earlier (which is likely, because you're drinking from the most recentist possible guesstimates) would make the founder effect Neolithic.

Your problem is that you give way too much credibility to some "molecular clock" hunches, and that is like arguing about the origin of languages based on the legend of the tower of Babel (i.e. pseudo-science).

The recent mtDNA data from Burgundy seems to support an area including at least France and the Basque Country of continuity since Neolithic times, so...

Oaie Porc said...

So the teal score is the Central Asian component in the K9 test? I score over 23% on that :D (I am ~ 1/2 Central Moldavian, 3/8 Southwest Ukrainian, 1/8 Southeast Polish)


East_Asian -
Siberian 3.15%
Sub-Saharan -
Oceanian -
Central_Asian 23.60%
South_Asian 0.36%
Amerindian -
European 44.57%
EEF 28.31%



Yamnaya_related 39.83%
WHG_extra 3.49%
ENA 1.78%
Middle_Eastern 13.76%
Pre-Yamnaya 40.43%
Sub-Saharan 0.71%

jv said...

Thank you, thank you (old post, but I had to comment). I think my ancient Grandmothers were Teal. I think my Haplogroup originated in the Central Asian countries. She may have been a HG or with Fisher tribes in the Pre-Yamnaya Era. Maybe she lived in the Elshanka Culture, as pottery skills would be in high demand. jv

