Friday, October 16, 2015

Basques are not simply a fusion of Iberian hunter-gatherers and early farmers


I thought I'd revisit the issue of Basque origins with my new Principal Component Analysis (PCA) of West Eurasia. The useful thing about this PCA is that it gets around two problems that routinely affect PCA featuring ancient samples: projection bias, otherwise known as shrinkage, and exaggerated outcomes for individuals with high counts of homozygous genotypes.


A couple of recent papers argued that Basques were the direct descendants of local hunter-gatherers and early Neolithic farmers who arrived in Iberia from the eastern Mediterranean. This is probably correct for the most part, but it doesn't tell the whole story.

On the PCA above, Basques are quite distinct from Early Neolithic, Middle Neolithic and Copper Age Iberians (marked Iberia_EN, Iberia_MN and Iberia_CA, respectively), because they are significantly more eastern. In fact, they cluster with the only Bronze Age Iberian on the plot (Iberia_BA), which is the same individual that I found to harbor steppe-related ancestry (see here).

Thus, the story told by the PCA is that Basques are the progeny of Bronze Age Iberians, who, unlike their Copper Age predecessors, experienced a pulse of steppe-related admixture from the east.

Formal statistics back this up. For instance, here's a quote from the recently revised Mathieson et al. preprint:

However, the statistic f4(Basque, Iberia_Chalcolithic; Yamnaya_Samara,Chimp)=0.00168 is significantly positive (Z=8.1), as is the statistic f4(Spanish, Iberia_Chalcolithic; Yamnaya_Samara, Chimp)= 0.00092 (Z=4.6). This indicates that steppe ancestry occurs in present-day southwestern European populations, and that even the Basques cannot be considered as mixtures of early farmers and hunter-gatherers without it (4).
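For anyone who hasn't met f4 before, here is a minimal sketch of how a statistic of this shape is formed from allele frequencies: the average over SNPs of (a - b) * (c - d). The frequencies are made-up toy numbers, not data from the preprint, and the function and variable names are purely illustrative.

```python
def f4(freq_a, freq_b, freq_c, freq_d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    terms = [(a - b) * (c - d)
             for a, b, c, d in zip(freq_a, freq_b, freq_c, freq_d)]
    return sum(terms) / len(terms)

# Hypothetical allele frequencies at four SNPs (toy values only)
test_pop = [0.60, 0.20, 0.90, 0.50]   # plays the role of the test population
baseline = [0.50, 0.30, 0.80, 0.50]   # plays the role of the older baseline
source   = [0.80, 0.10, 1.00, 0.40]   # plays the role of the candidate source
outgroup = [0.00, 1.00, 0.00, 0.00]   # outgroup with fixed alleles

value = f4(test_pop, baseline, source, outgroup)
print(f"f4 = {value:+.4f}")
```

With an outgroup in the last slot, a positive value means the first population shares more alleles with the third than the second one does. The Z-score is the statistic divided by its standard error, so the quoted 0.00168 with Z=8.1 implies a standard error of roughly 0.0002.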

The key question now is who brought the steppe-related ancestry to Basque country. Were they Indo-Europeans or speakers of Proto-Basque? Also, did they actually come from the steppe, or somewhere nearby, like the Carpathian Basin?

The reason I mention the Carpathian Basin is that, as per the PCA, Basques more or less cluster between Copper Age Iberians and some of the Bronze Age Hungarians (marked Hungary_BA). But this is just one possibility, and I'm not sure at this stage how plausible it looks with, say, formal statistics.

In this analysis I used samples from the Allentoft et al., Gunther et al., Haak et al. and Lazaridis et al. datasets, all of which are publicly available. The latter two are found at the Reich Lab site here. If you're confused by some of the acronyms in the PCA key, see here.

94 comments:

Alberto said...

Thank you David. This is a good summary of what we know up to now. I'll add here some stats that you and Chad ran for me some time ago and that might be relevant:

Spain_MN Basque Yamnaya Mbuti -0.0047 -2.203 16014 16165 343197
Spain_MN Basque EHG Mbuti -0.0011 -0.400 15893 15928 338080
Spain_MN Basque Georgian Mbuti -0.0031 -1.617 16019 16119 343660
Spain_MN Basque WHG Mbuti 0.0093 3.465 16431 16128 343625
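Rows like these appear to follow the usual D-stat output layout: four populations, the D value, its Z-score, two site-pattern counts and the number of SNPs used. That column reading is an assumption, since the rows come without a header. As a rough sanity check under that assumption, D can be recomputed from the two counts:

```python
rows = """\
Spain_MN Basque Yamnaya Mbuti -0.0047 -2.203 16014 16165 343197
Spain_MN Basque EHG Mbuti -0.0011 -0.400 15893 15928 338080
Spain_MN Basque Georgian Mbuti -0.0031 -1.617 16019 16119 343660
Spain_MN Basque WHG Mbuti 0.0093 3.465 16431 16128 343625
"""

def recompute_d(line):
    """Return (reported D, D recomputed from the two count columns)."""
    fields = line.split()
    reported = float(fields[4])
    # Assumed meaning: fields 6 and 7 are the two site-pattern counts
    first, second = int(fields[6]), int(fields[7])
    return reported, (first - second) / (first + second)

checks = [recompute_d(line) for line in rows.splitlines()]
for reported, recomputed in checks:
    print(f"reported {reported:+.4f}   recomputed {recomputed:+.4f}")
```

The recomputed values match the reported ones to rounding, which supports reading the fifth column as (first count - second count) / (first count + second count).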

About the possible connection with the Carpathian samples (the ones with high WHG ancestry and rather low ANE): yes, I also speculated about that population being ancestral to modern Basques, though there are a couple of caveats. First, they would need to have had not a 50% impact on the local MN population, but something closer to 70-80%, while populations with higher ANE (around 15-16% on average in Bell Beakers) would require a 50% impact, which is more reasonable. And second, those Hungarians were not even R1b, were they? That again makes them a less likely source for modern Basques, who are some 85% R1b.

Alberto said...

Ah, also your latest K12 can be relevant. In it, the Euro_HG cluster is pure WHG, without any ANE that can hide the eastern component in populations with a high WHG/ANE ratio. It's an unsupervised run (as far as I know) with real samples, no zombies or anything like that:

https://docs.google.com/spreadsheets/d/1ajolEB_2NXAnxtGJXSbwKSF1CvDvFiGy5otpq5f-C1g/edit?usp=sharing

Krefter said...

@Alberto,

Okay, that makes sense. The D-stats you posted rule out the possibility that extra WHG is why Basques are eastern-shifted compared to Copper Age North Spanish.

Chad Rohlfsen said...

The Dstats also show that the MN samples, even being more Near Eastern, share more drift with WHG, showing the eastern ancestry isn't just EHG, but also Caucasus shifted.

Maju said...

I appreciate that you take a look at Basques but I don't think your take is good enough.

On one side, the extra Yamna may only be something Yamna-like (for example extra "pure EHG" or even just extra WHG would produce similar results), so that's not good enough. Other tests should be performed.

On another side, the drift of ATP9 is clearly towards modern Britain, that produces an artificial overlap with Spaniards in the original paper and here a probably also artificial overlap with modern Basques. Just yesterday I updated my entry on the Atapuerca paper mentioning a hoard of Bronze Age weapons, not too far from the site, that bears all the signs of British origin. While the date (unclear) seems to be more recent than ATP9, it underlines the very dynamic Atlantic interactions of the Bronze Age, including the Atapuerca and Basque areas with Britain. In my understanding the ATP9 "steppe drift" is actually "British drift". And that means that BA Brits already had some pseudo-steppe affinity (can't be considered true steppe affinity unless checked for Caucasus component).

Something I perceive in your PCA is that "Far East" Bell Beaker samples seem to envelop the Basque cluster by the 'North' and the 'South'. I say "Far East" because that's what East Germany is for the Bell Beaker phenomenon: a remote, most peripheral area that can hardly be representative of the whole phenomenon. A serious possibility is that the origin of the pseudo-steppe ancestry in Basques is within Bell Beaker and related (Megalithic, Bronze Age), which could well have remixed the Atlantic ancestry, long before the Indoeuropean speakers made any inroads in the West. It's plausible that, to some extent at least, BB attracted people who were less indoeuropeanized, less part of the IE elites, and in fact, in Central Europe, BB burials show some traits that imitate in reverse the Corded Ware ones, which may well be a political-religious statement. Also, at least in some parts of the CW+BB area, BB sites do not show any continuity with Corded Ware ones but rather appear in different places with different patterns of... everything. I'm thinking of Switzerland right now, but it may be less obvious in the "Eastern Province", where a closer (but not at all strict) overlap with CW seems apparent, at least in Moravia.

My two cents.

Davidski said...

Extra WHG wouldn't produce the same result. You'd just get a result like the most WHG shifted Iberia_CA individual on the plot above.

The extra EHG/ANE is definitely there, and it may have come from Britain I suppose. Impossible to say at the moment.

Shaikorth said...

Spain_MN Basque Yamnaya Mbuti -0.0047 -2.203 16014 16165 343197
Spain_MN Basque EHG Mbuti -0.0011 -0.400 15893 15928 338080

Pure extra EHG doesn't seem plausible because Basques' EHG shift compared to MN is insignificant. Yamnaya shift is clearly higher.

http://1.bp.blogspot.com/-LIug8kEPKW4/UFtC9XP-VsI/AAAAAAAAD1A/HTmsEscIGoI/s1600/MDLPwestasian.jpg

Having Caucasus components like above (or not) isn't clear-cut evidence because ADMIXTURE components can mask affinities.

Maju said...

@David: that is assuming that your PCA variant is actually most optimal. You do claim that and I don't have any specific reason to contest your claim but that's actually because I don't understand the details well enough to judge either (a 'materials and methods' section, so to say, would help), while academic papers are all the time producing a different plot, with WHGs much more "to the south" and with Basques sitting right on the straight line between WHGs and the Neolithic cluster, so... for me it's yet uncertain matter.

The best reason I have to believe that Basques actually do have more EHG (but also more WHG) than Spain_EN (MN/CA may well not be the optimal baseline) are Matt's graphs, so I do agree with the statement "The extra EHG/ANE is definitely there", but associated to a similar amount of extra WHG.

"... and it may have come from Britain I suppose. Impossible to say at the moment".

I agree that it's early to say with any certainty, but there are some bits of evidence pointing to Britain and/or West France (ref. Gurgy) IMO, among others the LCT allele issue.

@Shaikorth: I would think that Spain_MN/CA is not the best proxy for the ancestry, because they are clearly much more WHG-like in the WHG/EHG ratio than both Basques/Gascons and Spain_EN. Arguable, I know, but a possibility worth considering, among other reasons because Spain_MN/CA lack the LCT allele found in Basques since Chalcolithic and the modern-like mtDNA pool found in Basques since Neolithic (in Paternabidea at least). They are clearly a parallel kind of "transition" towards greater HG levels but not at all optimal for direct ancestors. Hence I suggest using Spain_EN instead.

The Caucasus affinity (always relative) can be detected with f4 or f3 stats, so that should not be a problem. What is clear is that Basques can be described as "zero Caucasus" relative to all other Europeans and West Eurasians in general. It's almost their defining trait, and it is a defining trait that is totally "anti-Kurgan". They have even less Caucasus affinity than Neolithic and Chalcolithic EEF-like peoples: there's nothing in West Eurasia past and present that is less Caucasus-like than Basques, except HGs.

Grey said...

"Also, did they actually come from the steppe, or somewhere nearby, like the Carpathian Basin?"

My guess would be a chain of events where a guy from the steppe moves to a, marries local, his son moves to b, marries local, his son moves to Ebro, marries local - hence the increase in paleoeuropean at the end.

(where a and b might be Carpathian Basin and Brittany or Crete and Sardinia or whatever)

Ariele Iacopo Maggi said...

Shaikorth
http://1.bp.blogspot.com/-LIug8kEPKW4/UFtC9XP-VsI/AAAAAAAAD1A/HTmsEscIGoI/s1600/MDLPwestasian.jpg

When I see a map like that I think that we are missing something big that happened at some point in the Bronze Age. It looks to me like some extra Caucasus ancestry passed through southern Italy and Greece to reach Germany, France up to Scotland, Scandinavia and Portugal. I mean, why does Germany share extra affinities with the Caucasus compared to Poland? The fact that Yamna had Caucasus ancestry does not necessarily imply that every drop of Caucasus came from Yamna, right? (The Roman Empire could be a solution, but what about Scandinavia and Germany?) But btw you are right, this map does not look like a good proxy for Yamna-like admixture....

Chad Rohlfsen said...

It's not WHG, Maju. Didn't you see that MN is closer to WHG? So, it can't be just EHG either. This isn't difficult to figure out. Drop biases and actually think!

Chad Rohlfsen said...

Again, look at that k12b. Basques and western Europeans in general, have a lot of Gedrosia, which is the ANE bearing component of West Asian. Look at Basques closer to Yamnaya and Georgians, but no difference with EHG, and farther from WHG. Use your head.

Chad Rohlfsen said...

That all means the shift is between Yamnaya and Georgians, not Yamnaya and EHG.

truth said...

In the study of Martinez-Cruz 2012 there is 20% of R-L21 in Basques, which is typically a British Isles and NW French subclade, so there definitely was some kind of migration from there.

Matt said...

You can see this pattern of shift between Iberia Chalcolithic and Basque in the FST stats from Mathieson 2015 as well:
http://i.imgur.com/YdgnsNK.png (or a comparison with an adjustment for the Iberian Chalcolithic looking a little more drifted than Basques are - http://i.imgur.com/iS54Eq3.png).

In terms of FST, the populations the Basque sample is most shifted towards relative to Iberia Chalcolithic are the ancient steppe populations.
OTOH shifts comparing Sardinians to Iberians and Anatolian Neolithic - http://i.imgur.com/BH8vKXL.png seem to have a slightly different pattern (but not totally unalike).

(http://i.imgur.com/WQPfVE3.png, http://i.imgur.com/bWLn6ZS.png, http://i.imgur.com/1knjlJA.png, http://i.imgur.com/cn2x2Tg.png, http://i.imgur.com/yoTLPum.png, http://i.imgur.com/2JKyFGD.png for some more examples of these FST differences).

Maju said...

@Chad: I'm saying that Spain_EN looks to my eyes a better baseline than MN/CA for various reasons. So it's a matter of opinion. I'm using Spain_EN in any case, I don't think there's any particular relation between modern Basques and the Atapuerca and La Mina samples but there is with the overall Mediterranean Neolithic baseline.

"Again, look at that k12b."

What's that? A "calculator"? I don't pay much attention to zombie-based calculators because they heavily rely on aprioristic biases by the designer (they may work well for this case but horribly for another, the designer's judgment can be better or worse). In any case if the methodology is approximately correct (i.e. there's not another zombie hiding part of that stuff), Basques should appear as lower than all other West Eurasians, except maybe Finnics, the other deeply rooted non-IE population of Europe, just as the map linked above by Shaikorth shows.

Chad Rohlfsen said...

Your logic doesn't make sense. Why EN, when Northern Spain got farming just before Northern Europe? MN makes more sense: it is closer to Basques than EN, but still closer to WHG. Forget the zombies. Whether you like them or not, Dstats agree with them.

Chad Rohlfsen said...

The eastern ancestry of the Basque falls between Yamnaya and Georgians. You can deny all you like, but you'll be wrong again. Your people are on the same cline as other Western Europeans. It's only proportions that differ.

Maju said...

@Matt: OK, I've been looking at your figures, calculator on hand, and it seems that, at least based on Fst, my hypothesis does not stand - because the ratio Fst(Basque,X)/Fst(Iberia_EN,X) is worse (implies greater change) for X=Yamna and even X=Georgian than for X=EHG. I retire it.

However the fact that Basques are the least Georgian-like of all modern Europe and West Asia, tied with Lithuanians (in your sample) does stand. They are also the least Yamna-like except Sardinians and Bedouins. However these "records" do not stand before ancient Neolithic/Chalcolithic populations, so something has changed most likely (unless new ancient samples tell some other story) and it seems that it was in Yamnaya direction (best ratio of all).

Maju said...

"Why EN, when Northern Spain got farming just before Northern Europe?"

Just... 1000 years earlier (at least if you mean Britain and Denmark by "Northern Europe" and the Basque Country by "Northern Spain").

Modern archaeology confirms consolidated farming in the Basque Country c. 5000 BCE, just 500 years after Eastern Spain and 700 after Provence. Another thing could be the regions west of the Basque Country, notably Galicia, which (barring future revolutionary findings) can still get a quite more recent date, barely pre-dating Megalithism, as in Britain, Ireland and Denmark.

Also it's important to understand the fine detail of Basque Early Neolithic genetics, so far only described by mtDNA, showing three different areas: (1) a southern one at the Ebro banks which can be considered comparable to later ATP and that in the anthropometrics of the literature is often described as "Mediterranean", (2) a northern or coastal one which seems continuous with previous Epipaleolithic people and (3) an intermediate one in the Pamplona basin that displays modern-like mtDNA (like Gurgy). Either #2 or #3 are traditionally described as "Pyrenaic" (Keltid?), a type that was considered approximative to modern typical Basque phenotype (anthropometrics just for the record, I don't pay too much credibility to them in any case).

So I don't see any particular reason to associate modern Basques to the Ebro basin Neo- and Chalcolithic peoples, although, naturally they must have got some influence, at least in those specific areas. They were areas that suffered much in many wars anyhow, unlike the mountain refuges, because of their agricultural wealth and border nature, even before the Celts arrived most probably.

Beyond all that, I don't see any mtDNA pool nor LCT+ traits in the Ebro basin peoples that could be considered ancestral to modern Basques in significant terms. If anything I'd look rather in Aquitaine and even further into France but most importantly in the local findings that indicate already modern traits (mtDNA pool in Early Neolithic Paternabidea and LCT+ in a subpopulation detected in war cemeteries in the Ebro Valley, which may well be piedmont, mountain or coastal peoples from the same Basque Country but probably not the Ebro banks' locals).

Maju said...

@Truth: not 20% but rather 5-13%. It could come from Ireland or Britain but it could also come from Western France (incl. Gascony, Brittany). All those P-312 subclades are present in France, which appears as the most likely origin of all of them. The high frequencies in inland and mountain areas, as well as in the Labourd corridor, may rather support the "French" origin model, but I guess that only haplotype analysis would allow for a proper assessment.

Chad Rohlfsen said...

Well, like it or not, MN works better as the pre-Yamnaya like admixture. The Basque do have WHG, on top of EN. As for Lithuanians and such being closer to Georgians than the Basque, you're wrong. It's actually pretty even, across Europe. The difference is EHG. Lithuanians are more EHG to Georgian like, than others. This tells us that they had EHG, prior to LN/EBA admixture.

result: Chimp Georgian Basque Lithuanian -0.0009 -0.664 16907 16938 354212
result: Chimp Georgian Basque Norwegian -0.0009 -0.696 16910 16941 354212
result: Chimp Georgian Basque Orcadian -0.0009 -0.689 16919 16948 354212
result: Chimp Georgian Basque English -0.0004 -0.340 16914 16929 354212
result: Chimp Georgian Basque French -0.0016 -1.576 16914 16967 354212

Alberto said...

@Matt

Thanks for all those figures. I'm surprised that Fst distances work this well, actually.

It's interesting also in some of those comparisons to look at both ends. Depending on what you want to check, significant numbers are the most positive or the most negative, while those closer to 0 are not significant.

For example, in the one comparing Basque and Lithuanian, the most negative is Spain_EN. Looking at David's PCA above, indeed Basques are half way between Spain_EN and Lithuanians (gray dots closest to Motala_HG, I assume).

But then looking at the other end the most positive figure is with Samara_Eneolithic and EHG. And then looking again at the PCA, indeed Lithuanians are half way between Basques and EHG.

Alberto said...

@Maju

"However the fact that Basques are the least Georgian-like of all modern Europe and West Asia, tied with Lithuanians (in your sample) does stand. They are also the least Yamna-like except Sardinians and Bedouins. However these "records" do not stand before ancient Neolithic/Chalcolithic populations, so something has changed most likely (unless new ancient samples tell some other story) and it seems that it was in Yamnaya direction (best ratio of all)."

I agree, that's basically the big picture. The fine details are much more complicated and need a lot more samples, but it's very unlikely that they will change this picture with some unexpected population of EHG+WHG in the Atlantic area (for the record, I personally would like such a population to appear instead of accepting this picture, but I'm realistic and History is most of the time rather ugly).

Maju said...

@Chad: It's Matt's Fst data, don't blame it on me if it is different than yours.

Anyhow, I'm a bit perplexed by your formulation. Following the original description of D-statistics (p. 129 onwards), it should be D(H1,H2,H3,chimp), where a positive result means that H3 is closer to H2 than to H1 and a negative result the opposite. In any case, the outgroup (chimp, but can be other) is apparently the last one and I'm not sure if that can be changed and how it affects the result. Can someone explain that to me?

Maju said...

PS- the fact that the French, who'd be typically the most Caucasus-like of all the populations listed produce the strongest negative score seems to reinforce my notion that something is wrong in your stats, Chad.

Maju said...

PS2- Not sure if I'm reading too much into it, but the negative results seem to indicate that the rightmost X population would be closer to Georgians in all cases, because that result should be read as "Basques are closer to Chimp than to Georgians, relative to X", right?

Alberto said...

The D-stats are insignificant, mostly. Georgians have high ANE, but also high "Anatolian farmer". So comparing Basques and North Europeans it evens out and they're about the same distance (the PCA above shows that distance being similar too. It would be different with North Caucasus populations like Lezgins, who will be closer to North Europeans).

The only strange result is indeed with French. They should be a bit closer to Georgians than the rest (though not much), but the stat shows them actually a bit more distant. I don't think it means much. Could be any small detail in the samples compared. A different set could give other results. Things are never perfect.

Chimp as an outgroup might not be the best choice (Mbuti, Ju_Hoan_North might give more stable results), but it shouldn't be the culprit, I think. The outgroup is supposed to be equally distant from both populations on the other side, so the negative result should be driven by (in this case) Basques being closer to Georgian than X (but insignificantly so).

Matt said...

Alberto: Yes, if you look at the ranks only they do seem to work surprisingly similarly to the D stats, although I do find that there's more of a sharp signal in some instances with the FST. I didn't notice any new patterns really. Well observed re: the correspondence with the PCA. Comparing the Lithuanian and Yamnaya FSTs, one end (closer to Lithuanian) is the Central_MN and Remedello, while the other is Poltavka and EHG, which for both pairs is less connected with any PCA pattern, but probably because there just aren't any single populations in the set that map that way on the PCA.

Re: distance from Georgians, the Basque population does have the equal highest FST population differentiation from Georgians in modern Europe, along with Lithuanians, in Mathieson et al's table.

That's compatible with an outgroup stat showing that they are similar to other Europeans in differentiation from Georgians net of their overall genetic drift.

The ancient populations seem generally to have a higher degree of genetic drift / differentiation, probably from a combination of being more homogeneous and drift building up during the paleolithic / mesolithic periods of small population size (WHG / EHG / Motala), rapid expansion during Early Neolithic (Spain_EN / MN in particular) and possibly error (Remedello, unless something very strange was happening with that subpop).

The Basques and Lithuanians do seem to more resemble the ancients in being relatively unadmixed and homogeneous still, but both seem affected by admixture, from isolation by distance (or this plus founder effect) if nothing else.

Using that FST comparison method, the Basque look at the same place as Iberia_Chalcolithic on an EHG-Georgian axis, http://i.imgur.com/8YEJX6h.png, but by itself I don't know if that says a lot, as clearly WHG mixture also pushes a population towards the EHG end of the EHG-Georgian axis (noting the differences between the EEF on this axis).

batman said...

Bellish beakers, battleaxes and dehumanization - from Samara, Khavalynsk and Sredni Stog:

http://s155239215.onlinehome.us/turkic/btn_Archeology/Mallory/BC3000SamaraAndDD.gif

http://s155239215.onlinehome.us/turkic/btn_Archeology/Mallory/JMalloryEneolothBronzeAgeEn.htm

Krefter said...

@Maju,

LCT and mtDNA aren't very important. The core of Iberian/SW French autosomal DNA is from Copper Age Spain, and one way or another the same is true for their mtDNA. How could there be modern mtDNA if autosomally there is so much diversity? LCT can go up and down in frequency via selection, so it isn't a concrete ancestry marker.

Romulus said...

This is easy. Basque/Vasconic is the original language of the R1b types from the Steppe, hence why it is related to Kartvelian, and was spoken by the Beaker people. Basques have a lot of R1b and little steppe admixture because they were minorities (who later became a majority) in a heavily populated EEF area. Indo European, in part, spread from CWC and the Celtic language expanded NW to Britain at a later date with Hallstatt/La Tene.

arbogan said...

Still waiting for something more informative like maykop or the north iranian HGS. These results are not surprising at all and should be expected.

Tobus said...

@Maju: the outgroup (chimp, but can be other) is apparently the last one and I'm not sure if that can be changed and how it affects the result. Can someone explain that to me?

Short answer: The D-stat calculation gives a valid result with the outgroup in *any* position - you use the relative position of the pops/outgroup to interpret that result.

Long answer:
D-stats should be considered as two pairs. A run of D(A, B; C, D) is looking at which of A/B is closest to C/D.

To work this out, you look at those sites where A and B have different alleles, and you count how many times C has A's allele and D has B's ("ABAB"), and vice versa ("ABBA"). Sites where A/B have the same allele are not counted. Sites where C/D are the same or have a 3rd allele are not counted. The final score is a signed ratio of the two counts: Dscore = ("ABAB" - "ABBA") / ("ABAB" + "ABBA").

You can see that a +ve score means there were more ABAB sites than ABBA ones, meaning either A/C, or B/D, or both, have more alleles in common. A -ve means the reverse - more ABBA than ABAB so A/D and/or B/C have higher affinity. If you use an outgroup then one of the "and/or"s can be rejected a priori leaving only one possible cause for the score. If "D" is the outgroup then the result indicates which of A/B is closer to C (since their affinity to outgroup D will be the same). If "A" is the outgroup then the score is C or D's relative affinity to B. If the position of the outgroup is reversed in its pair, say D(A, B; D, C), the result will be identical except you'll have -ve instead of +ve, or vice versa. So the outgroup can go anywhere in the calculation, you interpret what the score means based on the order of populations/outgroup.
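The counting rule described above can be sketched in a few lines of Python. This is a toy version that takes one allele per site for each of four samples. Real tools work from population allele frequencies over hundreds of thousands of SNPs, and the sequences and names here are invented for illustration.

```python
def d_score(a, b, c, d):
    """D(A, B; C, D) from one allele per site for each of four samples."""
    abab = abba = 0
    for wa, wb, wc, wd in zip(a, b, c, d):
        if wa == wb:
            continue              # A and B agree: site not counted
        if wc == wa and wd == wb:
            abab += 1             # C has A's allele, D has B's
        elif wc == wb and wd == wa:
            abba += 1             # C has B's allele, D has A's
        # any other pattern (C == D, or a third allele) is skipped
    # Raises ZeroDivisionError if there are no informative sites
    return (abab - abba) / (abab + abba)

# Hypothetical alleles at six sites
pop_a = "AAGGTC"
pop_b = "GGAATT"
pop_c = "AAGATT"
pop_d = "GGAGTT"

forward = d_score(pop_a, pop_b, pop_c, pop_d)    # 3 ABAB, 1 ABBA
reversed_pair = d_score(pop_a, pop_b, pop_d, pop_c)
swapped_pairs = d_score(pop_c, pop_d, pop_a, pop_b)
print(forward, reversed_pair, swapped_pairs)
```

In this toy data, reversing the order inside one pair flips the sign, while swapping the two pairs as wholes leaves the score unchanged, which is why the outgroup can sit in any position as long as the result is read accordingly.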

There is also the Z-score which indicates how "significant" (in mathematical terminology, not rhetoric) this is - how far away from randomness. Basically it is the Dscore divided by its standard error, and the standard error is usually estimated by splitting the genome into large blocks and checking how much the Dscore changes when each block is left out in turn. If the net difference is spread evenly throughout the genome, the standard error is small and the Z-score is high; if it comes from just a few blocks of DNA, the Z-score is low and the Dscore could more easily have happened randomly. The Z-score is affected by the number of sites compared, but usually greater than 2 or 3 (or less than -2/-3) is taken as significant evidence of the extra affinity. Z-scores less than 1 (ie. between -1 and 1) are usually considered to show no detectable difference.
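A common way to obtain such a Z-score is a block jackknife: recompute the score with each genomic block left out in turn, and use the spread of those values as the standard error. Below is a simplified, unweighted sketch with invented per-block counts. Real software typically uses a weighted version over many more blocks.

```python
import math

def d_from_counts(abab, abba):
    return (abab - abba) / (abab + abba)

def jackknife_z(blocks):
    """blocks: list of (abab, abba) counts, one pair per genomic block."""
    total_abab = sum(x for x, _ in blocks)
    total_abba = sum(y for _, y in blocks)
    full = d_from_counts(total_abab, total_abba)
    # Recompute D leaving each block out in turn
    loo = [d_from_counts(total_abab - x, total_abba - y) for x, y in blocks]
    n = len(loo)
    mean_loo = sum(loo) / n
    # Simple (unweighted) delete-one jackknife variance
    variance = (n - 1) / n * sum((v - mean_loo) ** 2 for v in loo)
    return full, full / math.sqrt(variance)

# Hypothetical counts: same genome-wide totals, different distribution
even  = [(55, 45), (54, 46), (56, 44), (55, 45), (53, 47), (57, 43)]
lumpy = [(105, 45), (45, 45), (45, 45), (45, 45), (45, 45), (45, 45)]

d_even, z_even = jackknife_z(even)
d_lumpy, z_lumpy = jackknife_z(lumpy)
print(f"even:  D={d_even:.3f}  Z={z_even:.2f}")
print(f"lumpy: D={d_lumpy:.3f}  Z={z_lumpy:.2f}")
```

Both toy datasets give the same overall score, but the one where the excess is spread evenly across blocks gets a much higher Z than the one where a single block carries all of it.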

Krefter said...

Basque and SW French/Spanish/Portuguese are basically the same within European genetic diversity.

Not everyone in Iberia/SW France was Basque before Rome (modern ethnic ties formed after the modern genetic situation came to be). Maybe Basque have been genetically isolated, I don't know. Is there any reason they see them as isolated besides retaining a pre-Roman language? The expansion of R1b-DF27 and other events aren't only a part of Basque history. The focus should extend to all of SW Europe.

bellbeakerblogger said...

Quick thoughts:
If you extracted the Corded Ware parts of a German Beaker, the entire Beaker population would be shifted down, closer to Basques.
Basques in turn could be modeled as a two-way mix of pre-Artenacian and Maritime Beakers (which I think works culturally as well) that then developed in isolation for the remainder of the BA.
There is this expectation that the earliest Beakers will look like Corded Ware. I've warned about the problem with this before: all the Beakers thus far are Corded hybrids.

German Dziebel said...

@Davidski

Thanks for reacting to my comment on ANE in Basques with a post. I think fundamentally ANE is a pan-Eurasian phenomenon (with Amerindian roots) and as such it must be older and more ancestral to EHG and WHG. It has survived in such refugia as Northeast Europe (Lithuanians), the Caucasus (Kartvelians, etc.) and Iberia (Basques). In Iberia it got diluted twice - first by WHG, then by farmers. Same in northeast Europe and the Caucasus but to a lesser degree. Linguistically, it may potentially correspond to Indo-European-Uralic-Eskimo-Aleut (the northern wing), on the one hand, and to Basque-NorthCaucasian-Burushaski-Na-Dene (the southern wing), on the other. Both families are pan-Eurasian, both families have representatives in America.

I don't think ANE got injected into WHG + EEF-only proto-Basques by a pastoral steppe population. Not only is it an ad hoc scenario, it also misses an important datapoint: Mesolithic La Brana is more Amerindian than East Asians (and closely related to Kostenki, which, among modern populations, is best matched by none other than Lithuanians), hence La Brana must have contained ANE.

Maju said...

@Tobus: I very much appreciate your detailed answer, however I remain in doubt.

You said: "The D-stat calculation gives a valid result with the outgroup in *any* position - you use the relative position of the pops/outgroup to interpret that result".

What happens if, as Chad did, the positions are totally inverted from the original model? Is Chad's interpretation (Basques have more affinity to Georgians than the listed X pops.) correct or, as I suspect, is the exact opposite true (Basques have, if anything, less affinity to Georgians)?

"There is also the Z-score which indicates how "significant" (in mathematical terminology, not rhetoric) this is - how far away from randomness".

Of course. I didn't even check that part but I reckon that it is very important. As Alberto says, using Chimp as outgroup here may aid in causing low significance and a distant (but neutral) H. sapiens pop., say Ju-Hoan, Yoruba or Papuan, should be used instead for improved results.

Maju said...

@BBB: Good points. I almost fully concur. My doubt is about your proposal of using Artenacian + Maritime BB, populations about whose genetics we know nothing so far.

@Krefter: Strongly disagree. In nearly every population anywhere mtDNA and autosomal DNA are very much related, something that cannot be said of Y-DNA. Also mtDNA, with reasonable caution, provides some evidence where autosomal DNA is not yet known. The LCT issue is definitely most important and cannot be ignored even if strong selection is at play. After all we see in that subpopulation the first known example of one where selection acted (if that was the reason of fixation and not a mere founder effect) and a potential source of further expansion of the allele. Every bit of info is important, especially where no other data is available.

"Basque and SW French/Spanish/Portuguese are basically the same within European genetic diversity".

LOL, even a cursory look at the PCA (or any other data) shows that is not true. Basques and Iberians are two clearly distinct clusters with nearly no overlap, in ADMIXTURE analysis Basques lack both the Caucasus component and the North African one, in Y-DNA Basques lack G2a or E1b, and have almost unique R1b subclades.

What you say is an insult to intelligence! Iberians are as close to Basques as they are to French or Italians, maybe even less.

"Is there any reason they see them as isolated besides retaining a pre-Roman language?"

Genetics maybe? Something apparent even from the time of blood groups, the famous Rh⁻ peak, trait shared with Irish most notably and other "Celtic" Atlantic peoples, just like R1b, LCT+, etc. There are clear differences, there are also similitudes. We should not exaggerate the differences but we should not be artificially blind to them either.

"The focus should extend to all of SW Europe".

I agree that there should be a focus to SW Europe in general and even to Western Europe in general but that's beyond the point. Your tendency to oversimplify South European diversity and try to make us more homogeneous than we actually are reminds me of some old school anthropometry (Coon particularly) with its zillion Northern "subraces" and much larger catch-all categories for Mediterraneans or peoples from other areas of the World. It's like understanding other groups and their (objectively not less important) nuances bothers you. Unspoken and maybe subconscious racism ultimately. Get over that: you'll feel better. Everyone matters the same, everyone is interesting their own way.

Chad Rohlfsen said...

BBB,
How can they be hybrids when the Haak and Allentoft Beakers and Corded samples overlap so much? Why not have L51 from Ukraine, mixing more with EEF than Corded mixed with EEF around Belarus and NW Russia?

Maju,
How about instead of always being a doubting Thomas, and acting as if you know more than everyone else, you actually pay attention and research on your own if you won't take anyone's word? Whether or not you use the inverse on Dstats doesn't matter. It makes little sense to tell someone they're wrong, when you haven't the slightest idea about it yourself.

result: Chimp Georgian Basque Lithuanian -0.0009 -0.664 16907 16938 354212
result: Basque Lithuanian Georgian Chimp 0.0009 0.664 16938 16907 354212
result: Gorilla Georgian Basque Lithuanian -0.0011 -0.813 15742 15778 329241
result: Basque Lithuanian Georgian Gorilla 0.0011 0.813 15778 15742 329241
result: Yoruba Georgian Basque Lithuanian -0.0003 -0.320 16586 16597 354212
result: Basque Lithuanian Georgian Yoruba 0.0003 0.320 16597 16586 354212
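For anyone who wants to check the arithmetic in the rows above, here is a minimal sketch. It assumes the two count columns are BABA and ABBA site counts, in that order, and that D = (BABA - ABBA) / (BABA + ABBA); that reading reproduces the printed D values, but it is an assumption about the output layout rather than something stated in this thread. The function name `d_stat` is made up for illustration.

```python
def d_stat(baba, abba):
    """D computed from site-pattern counts, assuming D = (BABA - ABBA) / (BABA + ABBA)."""
    return (baba - abba) / (baba + abba)

# (W, X, Y, Z, BABA, ABBA) copied from the result lines above.
rows = [
    ("Chimp", "Georgian", "Basque", "Lithuanian", 16907, 16938),
    ("Basque", "Lithuanian", "Georgian", "Chimp", 16938, 16907),
    ("Gorilla", "Georgian", "Basque", "Lithuanian", 15742, 15778),
    ("Basque", "Lithuanian", "Georgian", "Gorilla", 15778, 15742),
    ("Yoruba", "Georgian", "Basque", "Lithuanian", 16586, 16597),
    ("Basque", "Lithuanian", "Georgian", "Yoruba", 16597, 16586),
]

for w, x, y, z, baba, abba in rows:
    # Reordering the populations only swaps the two counts, so D flips sign
    # while its magnitude stays the same.
    print(f"D({w}, {x}; {y}, {z}) = {d_stat(baba, abba):+.4f}")
```

Under that assumption the inverted ordering carries the same information as the "classical" one: the value is identical apart from its sign, so only the reading of the sign changes.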

Chad Rohlfsen said...

It looks quite simply like Basque and Lithuanians even out as Basques carry more of the ANE shifted West Asian stuff and Lithuanians carry more of the ANE shifted EHG.

Chad Rohlfsen said...

Just wait until we have aDNA from Ukraine 3500-3000BCE. But, we all know you'll have some ridiculous explanation that flies in the face of the data, once again.

Krefter said...

@Maju,

Your classification of modern mtDNA is too simple, because it's based on mtDNA H. It makes no sense. I don't understand how in mtDNA there's a such thing as "modern" when there's so much autosomal diversity. There's no way "modern mtDNA" expanded all the way from Portugal to Estonia after 4,000 years ago.

LCT in a site from Neolithic Spain is important but nonetheless Copper age Spanish are the main ancestors of moderns, the particular samples we have aren't dead-ends because they lacked LCT (actually one did have LCT).

I use North Sea as a category, Levant as a category, Arabia as a category, Caucasus as a category, NE Euro as a category, etc. It's no different with Iberia/SW France. Most on this blog do the same.

I have no reason to apologize for not being an expert on diversity in Iberia. Besides it's true Iberia/SW France forms a cluster opposed to other people.

Why would Basque be more similar to Italians than people who live right next to them? There's special attention on Basque because linguistics in the last few hundred years found they weren't Indo European. Before that it was seen as just another region in Spain or France.

Basque belong to unique R1b in the sense they belong to a recent Basque founder effect, just like what we see in mtDNA. They still belong to DF27 like Iberians and SW French and unlike Italians.

bellbeakerblogger said...

@Chad
All of the Beakers thus far come from tributaries of the Elbe, Saale and Danube. These were Corded Settlement areas mostly prior to the Beakers or alongside. No question they were mixed in both directions both racially and culturally. There's evidence of both.

I'm assuming if a Mittle-Saale Beaker is less steppic than a rough contemporary Corded person (knowing he was some mixture of the two) then his older Beaker ancestry must have been much less immediately prior. I'd expect an Eastern like ancestry originally in Maritime Beakers, but I don't see any reason to believe it'll be a twin of CWC.

Maju said...

@Chad: On your personalized criticism, I tend to behave like the kid who points out the naked emperor. Sometimes I'm wrong, the emperor is not actually naked and I'm the one ashamed, but in most cases he is, because I try not to laugh at well dressed emperors. The emperor and the court are like people who accept what others say without criticism, the kid is the one who puts all that ridiculous blind faith (scholasticism for example, but band-wagonism in general) in evidence. Someone has to and someone will sooner or later. Scientia vincere tenebras.

Anyhow, I appreciate that you redid the Dstats in the "classical" form:

Basque Lithuanian Georgian Chimp 0.0009 0.664 16938 16907 354212

A positive result should be read as Georgians are closer to Lithuanians than to Basques. However the value is tiny and the Z-score <1, so statistically irrelevant, mere noise, (this part following what Tobus just said).

Right?

"It looks quite simply like Basque and Lithuanians even out"...

That would be the case if the result would be statistically significant, yes. And that was what I was saying based on Matt's Fst data. So what's the problem?

"... as Basques carry more of the ANE shifted West Asian stuff and Lithuanians carry more of the ANE shifted EHG".

Slow down. Where do you get all those ideas from? And what is the relationship of ANE with Caucasus affinity, if any at all?

Maju said...

@Krefter: "Your classification of modern mtDNA is too simple"...

Fair enough. Sadly in many studies the data is not nuanced enough to reach much further and if we want to make comparisons we have to do some such concessions to simplicity. This is something that not just I do, many scholarly articles also do. Where more detailed data is available, we can always look more carefully at it afterwards. For example I paid careful attention at the surprisingly high U* in Basque-Ebro EN mtDNA. A shallow look would make them appear as almost Epipaleolithic (high U) but the clades have no local precedents, so they probably arrived from somewhere else, maybe Mediterranean Iberia or even further away. I do try to do both levels of the analysis: the generic and the nuanced.

"There's no way "modern mtDNA" expanded all the way from Portugal to Estonia after 4,000 years ago".

Well, that's what the data says, right? There may be several sources but the process is convergent. For instance one source may have affected the SW (roughly the Southern BB province) and carry high frequencies of H3, while other sources may have affected more northernly regions with much lower levels of this lineage but similarly high H1 and H* (and a long etcetera of possibilities). I'm not going that far, not yet (even with lots of well-described ancient DNA it'd be quite labyrinthine). All I say is that modern genetic pools have lots (40-60%) H and that most ancient ones (Neolithic or Kurgan alike) do not even approach these figures. Why the change? There must be an explanation. And there must be a source (or several).

"Copper age Spanish are the main ancestors of moderns"

That's not so straightforward. Not considering the few samples we have from Atapuerca or La Mina (all from the same small area incidentally, a not too important region in what refers to cultural changes, at least before the Celts). The mtDNA does not fit at all, while there are other sites (so far unsampled for autosomal DNA) that fit much better (Paternabidea, Gurgy in the Neolithic or even the Basque peripheral Chalcolithic sites sampled already 16 years ago by Dr. De La Rúa and where later studies detected a subpopulation with fixated LCT allele).

I say: follow the right tracks, not the ones that obviously do not match the prey we are chasing.

"I have no reason to apologize for not being an expert on diversity in Iberia".

It's not about being an expert, it's just looking at any PCA and seeing the obvious: Basques stand out like a sore thumb!

"Besides it's true Iberia/SW France forms a cluster opposed to other people".

I must disagree: Iberians - French - Basque+Gascons form three similarly differentiated and similarly related clusters. French tend more to North Europe, Iberians to Italy and Basques to "nowhere" (WHG in many scholarly PCAs, although not in David's). There is no clade Basques+Iberians vs French, not at all: it is a balanced near-equilateral triangle with three clearly distinctive vertices.

"Why would Basque be more similar to Italians than people who live right next to them?"

They are not. Iberians are the ones tending to Italy, not Basques. For Chaos sake: just look at any West Eurasian or European PCA!

"There's special attention on Basque because linguistics"...

Not really. The main reason is genetic distinctiveness (within the European relative homogeneity). Language difference has some implications but the main reason, since Cavalli-Sforza, since long before him actually (blood groups) is genetic.

...

Maju said...

...

"(...) in the last few hundred years (...) Before that it was seen as just another region in Spain or France".

Before a few hundred years ago there was no Spain (officially created c. 1720) and barely a France. You are just seeing matters from the viewpoint of modern maps created by and for "nation-states" but if something is true it is that the Basque nation (ethnos, but often also politically distinct) is much older than those two Roman colonial products. At the very least it probably existed in some way when Celts cut them from Iberians and Ligurians in the 1300-550 BCE bracket. We are talking of a very old nation, not always a nation-state (or something close to it) but certainly a nation-people (ethnos). You see it from the Indoeuropean viewpoint, notably Roman colonial (provincial) demarcations, but we don't see it that way, because ethno-social reality is not that way. Similarly one can think of, say, South Africa but people inside those largely artificial colonial borders may think in terms of Zulu, Xhosa, Venda, Coloured, Afrikaans, etc.

"Basque belong to unique R1b in the sense they belong to a recent Basque founder effect"...

There's nothing "recent" about Basque genesis. Unless by "recent" you mean Neolithic or Chalcolithic. You are interpreting facts according to your bias, and that's never a good idea.

"They still belong to DF27 like Iberians and SW French and unlike Italians".

True, although DF27 seems to extend also (at lower frequencies) to places like Brittany or Bavaria, and Basques are also one of few populations with significant frequencies of S116* (the other one I know of are Irish). This is hardly the whole story anyhow: again you're focusing on Y-DNA and trying to fit everything inside Y-DNA. And this has many problems: one that we don't yet have enough ancient Y-DNA data to say anything about Western European R1b, another that autosomal DNA and mtDNA, as well as LCT+ and possibly other traits (Rh⁻ for instance) need explaining and that explaining can't be done easily only in Y-DNA terms.

Anyhow, notice that when I talk of "modern" mtDNA pool, I mention Paternabidea (Basque) but also Gurgy (North French) and a wider Atlantic question mark area. R1b-S116 clearly originates in France (probably not in the Gascon/Basque area but more data is needed to be sure), so maybe ancient France is the place to look at and ancient Basques are not a "red herring" but probably a less central element. They are in any case one of two (or more) flashing lights that are crying out loud for whoever wants to hear: look in Atlantic Europe if you want to find answers.

Krefter said...

@Maju,

My original argument is correct: Basque are similar to other Iberians. Discussion about Copper age and Bronze age ancestors of Basque involves other Iberians. I never said Basque are identical to other Iberians. There was no reason for a debate. Therefore discussion should mention all Iberians. Yes Basque have been isolated for a while, but they still have lots of common ancestry with Spanish, Portuguese, French. It could be Bronze or Copper age common ancestry but is still a reason to include other Iberians in the discussion.

"There's nothing "recent" about Basque genesis. Unless by "recent" you mean Neolithic or Chalcolithic. You are interpreting facts according to your bias, and that's never a good idea. "

Lots happens in 4,000 or 5,000 or whatever years. Even if a people are isolated from admixture they can have founder effect lineages and drift. I've read lots of Basque R1b is R1b-M153 and that this marker is specific to the Basque country and its surroundings. It definitely wasn't around in the Copper age, but I do know it was found in remains from 500 AD.

"Anyhow, notice that when I talk of "modern" mtDNA pool, I mention Paternabidea (Basque) but also Gurgy (North French) and a wider Atlantic question mark area. "

This needs to be considered. H frequencies though aren't very important. Deep-subclades of other haplogroups that can be defined with low coverage do. With Corded Ware and Unetice for example we don't see 40% H but we see subclades that match with moderns. This tells us there's plenty of mtDNA continuum.

Alberto said...

@Matt

I used these Fst distances to see if I could get an idea of how D-stats of the type:

D(Mbuti, X)(Anatolia_Neolithic, Georgian)
D(Mbuti, X)(Anatolia_Neolithic, Lezgin)

would behave in similar pattern as the ones:

D(Chimp, X)(WHG, EHG)

First I checked the Fst distance of a few populations to WHG vs EHG:

https://docs.google.com/spreadsheets/d/1ziR7-OWaI-geplFOSrkWTHhDp6XG2je-xDAhoyHnAjY/edit?usp=sharing

While by Fst distance, all populations appear closer to EHG (except Sardinian, which is equal to both), the pattern is still the same as the D-stats. That is, Yamnaya very strong signal towards EHG, then Central_LNBA Clearly weaker but still strong, and then modern populations clearly weaker than Central_LNBA.

Then I checked the Anatolia_Neolithic vs Georgian:

https://docs.google.com/spreadsheets/d/1E2ZwazKfcVMd-jNSPBs23SdihGsQiGj7anFg7dwBbs4/edit?usp=sharing

Here the pattern is: Yamnaya has a strong signal towards Georgian, then Central_LNBA quite diluted, and then modern populations stay in the same range (actually increases a bit in Northern Europe, but decreases a bit in Southern Europe).

I finally checked with Anatolia_Neolithic vs Lezgin:

https://docs.google.com/spreadsheets/d/10IwbBjlVrlMXp_ceDfrSJc9v1BVNA5hb_rvB21RbvuM/edit?usp=sharing

With analogous results as with Georgian.

So it still seems like between CWC (et al. LNBA) and modern populations, EHG affinity diluted significantly, but the "Georgian-like" side of Yamnaya (sort of, we'd need the real population, not Georgian and Lezgin to be sure) remained stable.

We'll need more samples from different times and places to know if and why this happened.

Dospaises said...

@Maju
"Basques are also one of few populations with significant frequencies of S116* (the other one I know of are Irish)."

Which study are you referring to that found that? There is only one I can think of and that one found DF27 to be 71.50% in rural Basques and S116* was only 16.06% while the Irish were only 0.68% DF27 and S116* was 17.81%. While the S116* looks to be a similar amount, it is only a small fraction of the DF27 level in Basques, and the Irish have very little DF27. Additionally that study did not use Next Generation Sequencing so the S116* could be determined if the DF27 testing failed or if the S116* in the Basque and the Irish is distinct which is probably the case.

Dospaises said...

@Krefter
"I've read lots of Basque R1b is R1b-M153 and that this marker is specific to the Basque country and its surroundings."

M153 only exists at a rate of 6.55% in autochthonous Basque even though they are 70.74% DF27 and 37.99% Z196 and Z220.
http://www.fsigeneticssup.com/article/S1875-1768%2815%2930174-8/abstract

Chad Rohlfsen said...

A positive result means that c is closer to a than b, when chimp is d. You're backwards again. Thanks for those nice words. You remind me of the stubborn old man that refuses to give up his ignorant and prejudiced views.

Matt said...

@ Alberto: Re: the absolute FST distances of WHG and EHG, these are strongly affected by within population drift and diversity, but the differences should be mostly linear with the D stats (although FSTs themselves are I think maybe not linear with the amount of contribution from one population or another, so hard to predict % contributions from).

Picking up diversity within a group tends to reduce the group's differentiation from all other groups (which is why the Mordovians, who are a mix of groups who are not themselves close to Africans, come out closer to Africans as a group than other Northern / Northeast European populations).
Graphing the relationship between the (FST Anatolian Neo - FST Georgian) - http://i.imgur.com/x9LIgU5.png. It is mostly linear with (FST WHG - FST EHG) and (FST Central MN - Yamnaya Samara). There are some differences, with modern Europeans, Africans and East Asians being closer to the Georgians relative to ancient Anatolians than is implied by the relationship between shift between EHG:WHG or Central MN:Yamnaya_Samara though.

http://i.imgur.com/4sOw4Gr.png - same with Armenians

There is definitely a relationship, although I am sure ancient samples will eventually find a better proxy than Georgians or Armenians. http://i.imgur.com/YvfvuE3.png - for a totally unrelated distinction (the BedouinB-Anatolian Neolithic difference, appears totally unrelated to Central_MN-Anatolian Neolithic), a very close one (Anatolian Neolithic vs Central MN and Iberian Chalcolithic) and another example of a fairly close one.

Out of the pairs in the above graph, the difference FST Central MN - Yamnaya Samara and FST Iberia MN - Afanasievo probably map OK to phylogenic relatedness because they have more or less the same FST from the African outgroup (even though Mota has thrown some question on whether they are perfectly a true outgroup).

Maju said...

@Dospaíses: Effectively I was talking of Valverde & Illescas 2015. At no moment I meant that it implied any particular affinity with Irish, which seems to be your reading, but actually I was emphasizing only upstream S116 diversity in these two populations (and probably others, data for France and Britain is missing). It's plausible that, given the lack of general relatedness under S116, the Irish S116* and the Basque S116* will be eventually revealed as two or more different clades. What matters here is basal S116 diversity.

"Additionally that study did not use Next Generation Sequencing so the S116* could be determined if the DF27 testing failed or if the S116* in the Basque and the Irish is distinct which is probably the case".

Are you challenging the testing of not one ancient individual but of large swathes of modern samples? I don't understand this objection that ends up like saying something like "but well, never mind, it's probably all well". Care to elaborate?

Maju said...

@Rob: Sorry not finding it plausible in general terms. Bronze/Iron Age changes probably happened in some areas but what would Kurgan people bring: Unetice-like Neolithic-style lineages and decreasing levels of LCT+? It does not add up. It may actually work for Portugal but not for the bulk of Europe.

Alberto said...

@Matt

Thanks once more for the graphs.

Yes, once plotted the difference does look pretty linear in the Anatolia_Neolithic-Georgia vs WHG-EHG one. The LNBA populations do appear above the line, while modern Europeans (and Northern ones more so) above the line, but the difference doesn't look really striking. And again, we're testing with Georgians, not with the real ancient population, so that could well explain this effect.

On a completely different note: I find interesting the Anatolia_Neolithic - BedouinB difference. It's another hint that BedouinB indeed has some decent amount of ANE, in line with other tests.

Maju said...

@Chad:

I said, based on Reich 2010: "a positive result means that H3 is closer to H2 than to H1 and a negative result the opposite" for the original formula D(H1,H2,H3,outgroup).

You're telling me the opposite: "A positive result means that c is closer to a than b, when chimp is d". In other words: that H3 would be closer to H1 than to H2.

But that contradicts the original supp. materials section on which D-statistics are first described. Examples taken from page 135 (H1 H2 H3: value% : interpretation):

→ San Yoruba Han: 13.3 : Han closer to Yoruba than to San (positive score: H3 closer to H2 than to H1).

→ French Yoruba Neandertal: -4.6 : Neandertal gene flow with non-Africans (negative score: H3 closer to H1 than to H2).

So, who is being the stubborn one on this? Aren't you in a major error and needing to make a correction? Please check and recheck all you want, I've done my homework.

Maju said...

Erratum: Green 2010, not "Reich".

link to the PDF for easy reference.

Dospaises said...

@Maju
"What matters here is basal S116 diversity."
If the Irish S116* and the Basque S116* are eventually revealed as two or more different clades why will it still matter that they both had what was S116* diversity? Once enough people get enough SNP testing and the YFull dates are taken into consideration the Irish will probably be an arrival to Ireland at a date later than the arrival of DF27 to the continent. So its existence there won't be evidence of S116 being as widespread in the Neolithic as you are purporting.

Which ancient individual are you referring to?

I don't have an objection. I stated a caution and the reason is because, as stated above, if they are completely different subclades of S116 or show a split in more recent times within the subclades then the existence in Ireland is irrelevant. NGS testing would have allowed for a better granularity of the subclades and if they had actually tested positive for subclades below DF27 then we would have known that the DF27 testing had failed in those cases. I am referring only to the Valverde test subjects when I mention DF27 possibly failing.

I agree that there needs to be more testing of western Europe for DF27 and its subclades.

Maju said...

@Krefter: "My original argument is correct: Basque are similar to other Iberians".

Relative to what? Not the French for sure. I could say, bringing the argument to an extreme, English are similar to Zulus... compared to what? Neanderthals, chimpanzees, mosquitoes, trees, rocks! Similar is relative.

"Yes Basque have been isolated for a while, but they still have lots of common ancestry with Spanish, Portuguese, French".

Yeah, as long as you balance the issue including French I'm fine with it. The similitude is still relative and each of the three groups (Basques/Gascons, French and Iberians) form a distinctive cluster with clear peculiarities.

"Lots happens in 4,000 or 5,000 or whatever years".

Can happen (not the same as "does happen"). But in this particular case we have it reasonably documented in terms of mtDNA. And the change is negligible, especially compared to... anywhere else in Europe.

"Even if a people are isolated from admixture they can have founder effect lineages and drift".

Do you really understand how founder effect and drift works? Basically you need very low population levels for either one to happen in the dramatic way you are imagining.

"I've read lots of Basque R1b"...

You're again bringing the matter outside of the available ancient DNA information we do have, to the slippery terrain of Y-DNA. I decline to follow that game.

"It definitely wasn't around in the Copper age"...

Really? Do you have an exclusive ancient Y-DNA survey of the Basque Country? *sarcasm meant*

"With Corded Ware and Unetice for example we don't see 40% H but we see subclades that match with moderns".

Like what? Be specific please.

Alberto said...

@Maju

You found some strange D-stats maybe using a different program (?). Quite unlucky after doing the research. Here is the (very brief) documentation of the program used in these above and most other ones (AdmixTools):

https://github.com/DReichLab/AdmixTools/blob/master/README.Dstatistics

The output of qpDstat is informative about the direction of gene flow. So for 4 populations (W, X, Y, Z) as follows -
If the Z-score is +ve, then the gene flow occured either between W and Y or X and Z
If the Z-score is -ve, then the gene flow occured either between W and Z or X and Y.
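The rule quoted above from the README can be written out as a small helper, which may make it easier to apply to the rows posted earlier. This is only a sketch: the function name `interpret_dstat` is hypothetical, and the |Z| >= 3 cutoff is a commonly used convention that I am assuming here, not something the README excerpt specifies.

```python
def interpret_dstat(w, x, y, z, z_score, threshold=3.0):
    """Read a qpDstat-style result for populations (W, X, Y, Z) by the sign of its Z-score.

    The threshold is an assumed significance cutoff, not part of the quoted rule.
    """
    if abs(z_score) < threshold:
        return "not significant"
    if z_score > 0:
        # +ve: gene flow either between W and Y or between X and Z
        return f"gene flow between {w} and {y} or between {x} and {z}"
    # -ve: gene flow either between W and Z or between X and Y
    return f"gene flow between {w} and {z} or between {x} and {y}"


# The row discussed above: Basque Lithuanian Georgian Chimp, Z = 0.664
print(interpret_dstat("Basque", "Lithuanian", "Georgian", "Chimp", 0.664))
```

With Z = 0.664 the row falls under any reasonable cutoff, so by this reading neither direction is supported, whichever sign convention one prefers.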

Grey said...

Basques

If understood the argument above correctly

- Basques have some shift toward the steppe
but
- at the same time are least shifted to Georgian / caucasus

so time?

i.e. Basque's steppe admixture came from an earlier time before Yamnaya became more Georgian?

.

for example

case 1)
steppe dude A (100% steppe)
moves to x, marries farmer woman
son (1/2 steppe + 1/2 farmer)
moves to Pyrenees, marries mixed farmer/paleo megalith woman
son 1/4 steppe, 1/2 farmer, 1/4 paleo

some centuries later after steppe mixed with caucasus

case 2)
steppe dude B (now 1/2 steppe, 1/2 caucasus)
moves to x, marries farmer woman
son moves to Brittany, marries mixed farmer/paleo megalith woman
son, 1/8 steppe, 1/8 caucasus, 1/2 farmer, 1/4 paleo

the proportions mentioned aren't supposed to be exact just a simple illustration of the idea.
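The fractions in the two cases can be checked with a few lines of Python. This is just the toy arithmetic from the cases above (each child gets half of each parent's ancestry); the helper name `child` and the assumption that the "mixed farmer/paleo" woman is exactly half and half are mine, for illustration.

```python
from fractions import Fraction


def child(father, mother):
    """Average two ancestry dictionaries: half of each component from each parent."""
    components = set(father) | set(mother)
    return {c: (father.get(c, 0) + mother.get(c, 0)) / 2 for c in components}


farmer_woman = {"farmer": Fraction(1)}
# Assumed 50/50 split for the "mixed farmer/paleo megalith woman".
mixed_woman = {"farmer": Fraction(1, 2), "paleo": Fraction(1, 2)}

# Case 1: steppe dude A is 100% steppe.
dude_a = {"steppe": Fraction(1)}
case1 = child(child(dude_a, farmer_woman), mixed_woman)

# Case 2: steppe dude B is 1/2 steppe, 1/2 caucasus.
dude_b = {"steppe": Fraction(1, 2), "caucasus": Fraction(1, 2)}
case2 = child(child(dude_b, farmer_woman), mixed_woman)

for name, result in (("case 1", case1), ("case 2", case2)):
    print(name, {c: str(f) for c, f in sorted(result.items())})
```

Both grandsons end up with the same farmer and paleo shares; the only difference is whether the remaining quarter is all steppe or split between steppe and caucasus.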

You'd generally expect individual stories to average out en masse...

unless you were looking at a region where it looked like there were big founder effects in which case one of the individual stories would also be the big story.

.

(alternatively Basques and Lithuanians have more of some other component than other populations and that is distorting things)

Grey said...

"There's no way "modern mtDNA" expanded all the way from Portugal to Estonia after 4,000 years ago."

There was a paper (some time ago now) that suggested the modern *frequency* of mtdna in Europe had spread from Iberia - which might just mean selection in place of existing mtdna.

Apparently there is a connection between mtdna and body temperature so maybe *hot* mtdna was adaptive for foragers and less so for settled people in their houses?

capra internetensis said...

@Grey

I do think selection on mtDNA is probably important.

But Neolithic farmers did not get to sit around inside of their houses all day, and European Mesolithic foragers were already quite sedentary. So the mechanism you suggest is unlikely.

Krefter said...

@Maju,
"Really? Do you have an exclusive ancient Y-DNA survey of the Basque Country? *sarcasm meant*"

We just got 9 new Y DNA samples from Copper age Burgos Spain. All are I, G2a, and H2. You can't keep saying "Get more Atlantic DNA". The lack of P312 isn't a fluke. P312 wasn't hiding somewhere.

Grey said...

capra

yeah, it's not something i'd bet the house on - just enough for a little side bet

Maju said...

"We just got 9 new Y DNA samples from Copper age Burgos Spain. All are I, G2a, and H2. You can't keep saying "Get more Atlantic DNA". The lack of P312 isn't a fluke. P312 wasn't hiding somewhere".

Burgos Atlantic? OK, you just broke my Basque geographic schemes (all that is too Mediterranean for what I'm considering, although close enough and good try). Anyway, the key issue is that ATP mtDNA pool is totally un-modern, so not what I'm looking for.

Did we get E1b-V13, J2b at the first try? Nope. E1b-V13 or J2b are similarly important in Europe as is G2a but for some reason G2a keeps showing up everywhere (even in Merovingians!), while other Neolithic lineages appear to have been less common in the studied sites. Can you explain that to me? I say it's luck combined with very patchy sampling.

But it's pointless to discuss while the data is not yet there. Why no samples from Britain for instance? AFAIK they are as technologically advanced as the Germans (or almost) and do not have the bias against genetics the French have - or do they?

When we have a wide Atlantic European sample, we can retake this discussion. Until then, let's stick to the facts, like mtDNA, LCT+, etc.

And remember that absence of evidence is not evidence of absence, notably when the survey is still so horribly bad in their coverage of the relevant geography.

Rob said...

Maju

Obviously it's hazardous to guess, but Mesolithic Britain will be Loschbour-like and I2a.

Krefter said...

@Maju,
"I say it's luck combined with very patchy sampling."

No it isn't a fluke. G2a, H2, I2 dominated Neolithic Y DNA. We see the same pattern in Anatolia in 6300 BC.

https://docs.google.com/spreadsheets/d/12G2cfjG0wHWarsl5bB99ridFmvUWzqlZfZ6_e_R6oIA/edit#gid=1870266760

My point with the results is a lot has changed in Spanish Y DNA since 2800 BC. Atlantic samntic. We have lots of Neolithic/Copper age French and Spanish Y DNA, I don't care about the Atlantic coast. The fact is in 2800 BC DF27 is absent in lots of Y DNA samples where it dominates today. Your theory of P312 expansion with Megalithic or before 2800 BC in general has been proven incorrect. I considered a local origin till we got this 2800 BC Y DNA.

You're clearly biased and don't want R1b-P312 having an Eastern origin and for Basque to have any Eastern ancestry. This is why I and Chad get frustrated. It's just a matter of time before someone gets Western Steppe DNA and finds R1b-L51.

Maju said...

You are not presenting a model that makes sense either, Krefter. Now you don't only have to explain how R1b-S116 and R1b-L11 in general originated and expanded "out of nowhere" but you also need to posit explanations for the expansion of E1b-V13 and J2b among others.

IMO part of your problem is that you tend to use broad categories like "Spain", taking the part for the whole, without gaining sufficient knowledge of that whole first. Like the blind men and the elephant, you know.

"This is why I and Chad get frustrated".

And people with a pre-established narrative like you guys are why I and a growing number of people are getting frustrated. Cry me a river, I have an ocean to cry for you.

Rob said...

Maju
So what's your vision of how M269 clades expanded- within a Europe -wide context ?

Krefter said...

@Maju,

J2 and E1b-V13 were very rare in Neolithic European samples. Our only J2 that was tested for downstream clades is from Anatolia and it was J2a. There's also a J2a1 from Bronze age Hungary. Anyways, EEF had lots of common ancestry with West Asians, so we should expect to see a J2 or E1b or R1b1c or R1b1a2 pop up once and a while. That doesn't mean modern J2/E1b/R1b descend from them.

A small percentage of Italian J2 is J2b. Italian J2 is of the same variety as SouthWest Asian J2(largely or mostly J2a1b-M67). It looks like most J2 came to Italy after the Neolithic from West Asia. Sardinians have the least J/E1b in Italy, which is consistent with this idea. . Italian E1b is also of the same variety as SouthWest Asian, most is M123 and M78. I don't know about V13 frequencies.

In the Balkans there's a lot of J2b(which also exists in SW Asia) but still some typical SW Asian J2. High frequencies of J2b and E-V13 could have a million differnt origins from Mesolithic to Bronze age, and looking at Early Neolithic Y DNA from Germany/Hungary won't tell us where it comes from.

My opinion is that E1b and J2 in Italy is mostly of post-Neolithic SouthWest Asian origin. Typical SW Asian mtDNA also pops up in Italy more often than elsewhere. Davidski has theorized Bronze, Iron age gene flow from SW Asia into SE Europe. IMO the Jews were definitly not the first Semetic people in Europe. I'd say this is also the case for Iberia and most of Europe to a lesser extent, where you find typical West Asian J2, J1, and E1b. There's no very convincing evidence for this though. Ancient DNA is needed.

For the Balkans I don't have a guess. They live right next to West Asia, so there could have been constant gene flow after EEF arrived over 8,000 years ago. Just like Italy, they have a pull towards SW Asia. IMO, there's definitely common ancestry between SE Europe and SW Asia that postdates the initial arrival of farming.

Maju said...

@Rob: → visual answer

For more details you may want to go to the entry I took the map from or follow the links from there, or use the search function or ask again, preferably something more specific.

Eastern Europe is irrelevant: their R1b-M269 is practically zero and their basal diversity does not seem notable in any way either. Obviously the little they have arrived from further West or South. Not everything is about Kurgans (and believe it or not I have always been a staunch defender of the Kurgan model of IE expansion, but cultural, political and even military expansion is not the same as genocidal expansion: the Stockholm syndrome works equally well).

Maju said...

@Krefter: "It looks like most J2 came to Italy after the Neolithic from West Asia".

Might be but without J1? When we look at modern West Asia J1 and J2 are mixed all around: the frequencies vary but they are never isolated. Even in Ethiopia Semitic migration (which is very ancient) seems to have introduced J2. However when we look at Europe or North Africa they are nearly only one or the other. J2 is not only found in Italy anyhow: it's important in Greece, Bulgaria, Romania, Spain and many other places. So it's much more easy to explain as a Neolithic lineage with more success in some areas and less such in others. Same for E1b-V13 or G2a. Actually the three pretty much go in a package, so to say.

E1b in Europe comes in two varieties: one is E1b-M78 (most of which is E1b-V13), which seems to have spread in the Neolithic from Greece and ultimately from the Levant, Egypt, etc. In fact it would seem as the most clear marker of the "Basal Eurasian" (i.e. African-like) component in EEFs and their modern European descendants.

The other is E1b-M81, which is clearly from Morocco, has a peculiar second home in West Iberia (along with mtDNA U6 and an array of rare L(xM,N) lineages), where it has frequencies approaching 10%, and has some lesser scatter further North along the Atlantic to Britain, France, etc. There is one mining district in North Wales with very high frequencies of this lineage (can't recall the name of the place, sorry). This one is clearly unrelated to EEFs but may have expanded or re-expanded within Atlantic Neolithic (I suspect it's an HG lineage in Portugal and Asturias, dating from Solutrean times).

"My opinion is that E1b and J2 in Italy"...

The problem is that these lineages are not just found in Italy but all around Europe, particularly to the South. With a distribution that is very similar to that of G2a. So they fit best with a Neolithic expansion model and at least E1b-V13 was clearly present in those times. It's possible I guess that Danubian Neolithic had a more G2a-dominated Y-pool (founder effect) but there is much more Neolithic around than LBK.

As for our previous discussion, I want to emphasize that we just adhere to two different hypotheses or theoretical models without yet sufficient evidence (at least not enough to persuade each other). The scientific method to decide who is right is to search for more evidence - and that's precisely all I'm asking for: properly survey blank areas, particularly those with the greatest potential. It's just like finding the Higgs boson: what do scientists do? Argue and argue endlessly until one gives up? Nope: build a hadron collider under Geneva and test all the possible range of options.

Our brains react defensively when our belief systems are challenged, and that certainly creates frustration. Let's just accept that fact of life and relax. In the end what matters is to uncover the truth.

Maju said...

BTW, I just happened to see that one of the co-authors uploaded the Gamba 2014 paper to Academia.edu. I had not got the opportunity to read it before and I notice that, again, their ADMIXTURE analysis shows Basques (also Lithuanians, Orcadians) with zero Caucasian component.

It's like every academic article (except the Haak anomaly) shows that.

Also again the PCA shows HGs distributed off the Atlantic axis, and not like David's graph. I reckon that there must be something to what David is showing us here (per the discussion above) but I do have a hard time fully believing it and more so explaining it to third parties (who would normally react: but I read X and Y paper and it looks totally different).

Krefter said...

@Maju,
"E1b in Europe comes in two varieties"

M123 has a presence too. E1b-M123 is 2-4% in Italy, which is the same frequency we see in Iraq, Turkey, and Lebanon. In the same study M123 popped up in Greece, Croatia, Hungary, and Ukraine. Some E1b-M123 could be of recent Jewish origin because Ashkenazi and Spanish Jews have the highest frequency at 10%+.

"So it's much more easy to explain as a Neolithic lineage with more success in some areas and less such in others. Same for E1b-V13 or G2a. Actually the three pretty much go in a package, so to say."


IMO, only a single lineage can suddenly be successful. It doesn't make sense, for example, if 5 different R1a lineages suddenly became popular at the same time in the same population.

We see a founder effect in Sardinia and Copper Age North Italy (3/3 have I2a1-M26) with I2a1-M26. I2a1-M26 was a rare lineage in the original Neolithic package and a single M26 lineage went through a founder effect in Sardinia and maybe Italy.

It makes sense that most E1b and J2 in the Balkans is a Neolithic founder effect, because most is J2b and E-V13. However in Italy we see multiple J2 lineages that are popular. In my view this can't be a founder effect.

Unlike with R1b-L51 and R1a-M417, I'm not very confident about the opinion I lean towards when it comes to J2 and E1b in Italy and the Balkans. It's because we don't have ancient DNA from either location (except Copper Age by Italy's northern border).

IMO, it's interesting there could be a large amount of undocumented immigration by Semitic-speakers via the Mediterranean Sea.

Krefter said...

@Maju,

"As for our previous discussion, I want to emphasize that we just adhere to two different hypotheses or theoretical models without yet sufficient evidence (at least not enough to persuade each other)....... what do scientists do? Argue and argue endlessly until one gives up? Nope: "

Agreed, and so I don't see a reason to argue about it anymore.

Maju said...

@Krefter: "IMO, only a single lineage can suddenly be successful. It doesn't make sense, for example, if 5 different R1a lineages suddenly became popular at the same time in the same population".

Why not? A population does not have a single founder but many, and often these founders carry different lineages (i.e. at some point they came together by alliance, not relatedness). Of course that luck in drift in founder micro-effects can push up some of these and down some others but in an expansive population like the EEFs, there is room for a lot of variety (within the initial pool plus whatever lineages they may incorporate from assimilated HGs).

Real populations have diverse lineages, and that was surely almost always the case. Only in cases of long isolation do we see fixation in one or a few such lineages (and anyhow few is more than one).

"We see a founder effect in Sardinia and Copper age North Italy(3/3 have I2a1-M26) with I2a1-M26. I2a1-M26 was a rare lineage in original Neolithic package and a single M26 lineage went through a founder effect in Sardinia and maybe Italy".

And the Pyrenees. It's the second most common Basque lineage, for instance. But, anyhow, don't Sardinians also carry other (presumably Neolithic) lineages like G2a, E1b-V13, J2b and R1b-V88? Yes, they do. Did these arrive in separate episodes? It's possible but not particularly likely. So even in a case like that of Sardinia we see clear diversity in the founder population(s).

"However in Italy we see multiple J2 lineages that are popular. In my view this can't be a founder effect".

Call it whatever. Obviously the term "founder effect" is normally used to refer to a particular type of bottleneck, that caused by migration, so if the bottleneck is very wide and does not look much like a "neck" then it may well be justified not to use it. But in any case the lineages are most likely not native to Italy, so they came from somewhere else and many at least could have arrived with EEFs (or whatever other populations that established them in "founder effects" of sorts).

A somewhat comparable case could be US colonial/immigrant "founder effects". They imply many many different lineages, even if we focus only on the European ancestry part, from many diverse populations. But any serious analysis will show that the diversity relative to origin is reduced (necessarily) and that is a mild bottleneck and a mild "founder effect".

Either we give up using the term "founder effect" for such extensive, diverse migrations or we do use it but in a particular way that accepts that the neck is wide (but never so wide as to allow the whole original population's diversity to migrate: many necessarily stay back at home). In any case when we have such a wide type of migration, founder effects are milder. On the other hand such a wide "hose" allows for much easier impact of the migrant population, as the numbers involved are much greater and that's a clear advantage.

In a typical "narrow" founder effect, the founders are few and have to overcome that initial minority status (unless the settled area was effectively uninhabited) to leave their mark.

"IMO, it's interesting there could be a large amount of undocumented immigration by Semitic-speakers via the Mediterranean Sea".

I would not exclude it in the case of Sicily but Sicily and Malta are anomalous. Otherwise Semitic speakers in particular should have some J2 but also lots of J1. And the impact of J1 in Europe is very limited. If you want to play the (strict) "founder effect" card, then you have to give up the "large amount" (of migrants) one. Or vice versa.

Simon_W said...

Even if there were indigenous West European groups with modern-like mtDNA pools, this doesn't prove that their autosomes were modern-like too. Because in the case of asymmetric sex-biased gene-flow, males may have contributed significant admixture which didn't alter the mtDNA.

Simon_W said...

And sex-biased gene-flow isn't a fantastic, unlikely idea. There are known examples in history where exactly this happened, e.g. on the Canary Islands the Guanche ancestry is much more from females than from males.

Simon_W said...

And indeed lactase persistence is just caused by a marker that moreover came under strong selective pressure, its presence or absence doesn't prove modern-like overall DNA. That would be like concluding that SHG must have been like modern Scandinavians, because they had the alleles for light hair.

So while there is the theoretical possibility of a mystery population on the Atlantic fringe I don't consider this to be likely at all.

Simon_W said...

There is zero evidence in ancient DNA for an origin of R1b-L51 in Neolithic western Europe. In fact evidence for Neolithic R1b there is scarce, there is just one Cardium male with an equivalent to V88, and then ATP3 who was positive for one SNP that occurs in M269, but evidence that he also had the necessary other >60 SNPs to qualify as M269 is lacking. On the other hand the PC steppe had M269 and L23, and Final Neolithic samples from Germany, classified as Bell Beakers are the first positive evidence for P312. Needless to say that these guys differed a lot from the MN Germans, as they deviated into a steppe direction. This all has been known for a while, it's nothing new, just valid, valuable evidence that an unbiased observer has to take into account.

Simon_W said...

I consider myself virtually free of any ethnocentric bias, being of rather mixed ancestry. If I do have a bias, it's to take the ancient DNA evidence very seriously, more than any other category of evidence. But in my opinion that's the most reasonable thing to do.

Maju said...

@Simon: The Canary Islands, as well as at least some American colonial cases, are exceptional because they require a sustained immigration for many centuries, generation after generation, and cannot be explained by a one-off migration of patriarchal nature that later re-expands but from a local source. This last is what we usually see reflected in the haploid bias: that the Y-DNA immigrants were not able to cause a major autosomal effect and that the autosomal pool remains much more similar to the mtDNA pool and keeps little or no relation with the Y-DNA one.

The cases you propose as examples need a sustained imperialist, truly colonial, structure. Did Metal Ages' Europe have those? I don't think so. Even with such a powerful empire as the Roman one, we mostly fail to see any demic-colonial legacy, much less one of the type you suggest.

"And indeed lactase persistence is just caused by a marker that moreover came under strong selective pressure, its presence or absence doesn't prove modern-like overall DNA".

It is less clear than you imagine that LCT+ was so strongly selected. The overall process is not well understood but in any case we do see a West to East flow of the allele, or at least its modern-like dominance. This flow logically should have carried Western genetic pools, i.e. it implies migration ripples from the West in order to make a difference: an allele that is not present cannot be selected for.

In any case when Mathieson et al. fail to address, and even dare to dismiss, the very clear available evidence on Western LCT+ in the Chalcolithic, they are denying their own selection thesis any legitimacy, because junk in = junk out. You can't cherry-pick the evidence and then pretend that your conclusions have any merit.

"So while there is the theoretical possibility of a mystery population on the Atlantic fringe I don't consider this to be likely at all".

An act of faith you make. Because where is the evidence? Heh!

"If I do have a bias, it's to take the ancient DNA evidence very seriously, more than any other category of evidence. But in my opinion that's the most reasonable thing to do".

Well, we need to take into consideration all the available evidence. Maybe if we had modern quality ancient samples that would not be the case but that is effectively impossible. And worse: our current samples leave way too many regions unreported, at least considering only autosomal DNA.

There is much wider evidence from ancient mtDNA anyhow and this one often, almost always in the Old World, keeps a strong correlation with the autosomal DNA (unlike the Y-DNA). And that is also ancient DNA, ancient DNA that you arbitrarily choose to ignore.

In general your bias is much more extreme than you admit to because you attend to only one type of evidence (autosomal ancient DNA), a type that is not really good enough in many aspects, notably the coverage of the geography. It has happened before, many times, that what is a rule for Central Europe is just totally wrong in other parts of Europe, for example with pre-Neolithic mtDNA H. And in this also many people have committed the error, the "scientific crime", of denying the facts in order to be able to stick to their narratives. It is not a good idea: unless it's only a denial phase that you later overcome, it damages your credibility, even before your own self.

Simon_W said...

@ Chad

Just a quick remark about that Gedrosia component you mentioned. While I agree with the main thrust of your argument, it's a bad piece of evidence for the Caucasus affinity of Basques. Because that K12b calculator also has a Caucasus component which, unlike the Gedrosia component, is modal in the Caucasus, and Basques have 0% of this.

Simon_W said...

@ Alberto

„Ah, also your latest K12 can be relevant. In it, the Euro_HG cluster is pure WHG, without any ANE that can hide the eastern component in populations with high WHG/ANE ratio. It's an unsupervised run (as far as I know) with real samples, no zombies or anything like that:

https://docs.google.com/spreadsheets/d/1ajolEB_2NXAnxtGJXSbwKSF1CvDvFiGy5otpq5f-C1g/edit?usp=sharing“


Good point. In that run, the French Basques have 17.1% Afanasievo, while BR1 from Gamba et al. has 18.0% and RISE479, one of the more HG admixed individuals of Bronze Age Hungary, has just 15.4%. This would necessitate a 100% replacement, which is completely impossible. Bell Beakers from Haak et al. have 31.3% Afanasievo, which would still mean a 54.6% admixture in Basques, but at least, that's not impossible.

Presumably Proto-Basque wasn't always confined to the tiny Basque country, it may have been spoken in a much larger area. And the impulse from eastern Europe faded out as it spread westwards, up to a point where the locals acquired still substantial eastern admixture, but didn't adopt the language. And these in turn may have transmitted the admixture to the Basque country proper. This scenario seems more natural than the idea that IEs flooded the Basque country.
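
For anyone who wants to check the percentages above, here is a minimal sketch of the arithmetic, assuming a simple two-way mix in which the incoming population supplies all of the Afanasievo-like component and the locals have none. The function name `required_admixture` and the dictionary labels are made up for illustration; the percentages are the ones quoted from the K12 run.

```python
def required_admixture(target_pct: float, source_pct: float) -> float:
    """Fraction of ancestry the source must contribute so that the target
    ends up with target_pct of a component, assuming locals carry none of it."""
    return target_pct / source_pct


# Afanasievo-like percentages quoted above from the K12 run
french_basque = 17.1
candidate_sources = {
    "BR1": 18.0,
    "RISE479": 15.4,
    "Bell_Beaker": 31.3,
}

for name, pct in candidate_sources.items():
    frac = required_admixture(french_basque, pct)
    print(f"{name}: {frac:.1%} of Basque ancestry would have to come from this source")
```

Under that assumption Bell Beakers give the 54.6% mentioned above, BR1 needs close to total replacement, and RISE479 would need more than 100%, which is impossible. If the locals already had some of this component, the required fractions would be lower.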

Simon_W said...

@ Maju

Figure S8 in Günther et al. shows shared drift of the ATP individuals with modern pops, as measured by f3 stats. ATP9 shares most drift with Basques, Sardinians and Brits. She doesn't share less drift with Basques than with Brits. So it definitely cannot be said that ATP9 seems rather British than Basque. Now, admittedly the D-stats (Figure S13) suggest that Basques have more affinity with Gok2 than with ATP9. The closer affinity to Gok2 holds true in all comparisons with ATP individuals, but is least pronounced in the case of ATP2, which seems to suggest that indeed ATP9 isn't closer to Basques than ATP2. Presumably ATP9 is rather some kind of side branch, on the way to modern Spaniards, but still lacking enough steppe admixture to score high in f3 stats with the Spanish. But clearly PCA plots are not just garbage; if a plot shows just the first two dimensions, then it's clear that it doesn't tell the whole story. But since component 1 and 2 are the two most important components of the genetic variation, the plot nonetheless tells something important. The fact that in the first two dimensions ATP9 is very close to Basques, closer than ATP2 and Gok2, tells us that something important had arrived at the time of ATP9 that is also had by Basques, but not by ATP2 and Gok2. And this „something“ is evidently steppe affinity. So even though on the whole ATP2 seems closer to Basques, he's far from being a perfect fit, notably because he lacks steppe admixture. Yes it's steppe, not pseudo-steppe, because what's measured in the f3 stat is affinity with the modern British who definitely have it, not affinity with the yet unknown Bronze Age Britons.

Regarding the increased EHG and WHG of Basques compared to Spain_EN, you seem to believe it both had to rise together, but that's far from self-evident. Why not first a serious increase in WHG, and later an increase in EHG? Would make sense, because there must have been a lot of WHG ancestry on the western end of Europe, long before the EHG-rich Corded Ware had spread in central Europe.

Regarding R1b, I just have the impression that every time we analyse ancient yDNA from an area where R1b is now very common, it isn't found in the ancient samples. There seems to be a pattern at work. It may be argued that Mesolithic Spain and Luxemburg, also early Neolithic Catalonia and middle Neolithic Southern France are too early. But what about Chalcolithic Northern Spain, Chalcolithic and early Bronze Age Northern Italy, and even the Chalcolithic-Megalithic Seine-Oise-Marne culture? That often gets forgotten, but we have two yDNAs from the dolmen of La Pierre Fritte, near Villeneuve-sur-Yonne, dated to as late as 2750 – 2725 BC, belonging to the megalithic SOM culture, and neither was R1b. And mind you, that site is very close to Gurgy, your purported point of modernity.

Simon_W said...

Maju said: „Eastern Europe is irrelevant: their R1b-M269 is practically zero and their basal diversity does not seem notable in any way either. Obviously the little they have arrived from further West or South. Not everything is about Kurgans“

With all due respect, this doesn't make sense. We now know that Yamnaya was bursting with R1b-M269, they nearly all had it! So the fact that this haplogroup is rare now in eastern Europe only means that it was displaced in the course of prehistory, but doesn't mean that Eastern Europe is irrelevant.

As for the still underexplored Atlantic fringe: I'm certainly not against more sampling there, it can never harm. But what big news can we expect from there? The farmers were not from an entirely different early Neolithic source. The WHG cannot have been very different from other WHG, considering that they were similar in Spain, Luxemburg and Hungary. Thus the biggest uncertainties are the ratios of these constituents, and the question to what extent SHG diffused into NW Europe. The rest will be details...

Re: your discussion with Krefter on mtDNA, Haak et al. also defined a conglomerate of „LN/BA haplogroups“, which include I, U2, T1 and R. I would also add my own, K2, and perhaps there are some more. These mostly came from the steppe. They didn't change the European mtDNA pool a lot, but it cannot be considered fully modern without them.

In general I surely agree that mtDNA reflects autosomal DNA better than yDNA. But nonetheless it's only part of the story and all extrapolations to autosomal DNA are speculation. And we know that the autosomal DNA in central Europe changed quite remarkably with the advent of the Corded Ware. If the incoming males mixed with local females more often than the incoming women with the local males, which seems plausible given the likelihood of status differences, then the resulting mtDNA may suggest more continuity with the pre-IE population than there really is.

Maju said...

@Simon: My tentative stand re. ATP (and also La Mina, aka "Spain_MN") is that they are a "side branch" and not directly related to modern Basques in any strong way, much like they are not particularly related to modern Sardinians or modern Iberians. ATP9 seems to have ancient British admixture and may be indicative of demic/genetic flows within the Atlantic Bronze complex, which in terms of material exchanges, at least of manufactured artifacts, is even richer than its Chalcolithic precursors at long distances.

PCAs are not "just garbage" but we know well that they have some serious issues and that, when available, other data must be used with preference.

"The fact that in the first two dimensions ATP9 is very close to Basques, closer than ATP2 and Gok2"...

Contradicts the quantitative shared drift datum, which is actually lower. It is a clear example of the limitations of PCA analysis. Maybe using only SW European data we could see it more clearly in PCAs but with all those NE Europeans and West Asians dominating the PC polarities, we see only a very blurry picture. A 3-dimensional representation with PC3 could also help, I guess, but I have yet to see one.

"Why not first a serious increase in WHG, and later an increase in EHG?"

It could be but how do you explain EHG without Caucasus component? Which is the vector population in terms that I can identify in the archaeological record? As for the possible {WHG+EHG} solution, I reckon that it is just tentative in any case but could fit with a possible pre-IE Atlantic flow of EHG (gradually becoming more WHG and maybe also more EEF) via Scandinavia or some nearby regions. They could be refugees or just "trading partners" in the Atlantic route of amber prior to Corded Ware. Can't say, just trying to make sense of the available data, something that would be much easier if we had some meaningful Atlantic aDNA.

"Regarding R1b, I just have the impression that every time we analyse ancient yDNA from an area where R1b is now very common"...

Where are the samples from the Basque Country, Ireland, Gascony, Scotland? Those are the areas where R1b is most common today on Earth. Just a handful of samples from Spain, that very obviously have the "wrong" mtDNA pool, are very clearly not enough. We just can conclude that, as far as we know, the "greater" Ebro basin (incl. Languedoc and Upper Duero) lacked it (except for some upstream forms). All the rest remains unsampled. And that "all the rest" is a huge area: almost all Western Europe!

So "every time" we are sampling roughly the same very specific area. Time to try something new.

"we have two yDNAs from the dolmen of La Pierre Fritte, near Villeneuve-sur-Yonne, dated to as late as 2750 – 2725 BC, belonging to the megalithic SOM culture, and neither was R1b".

I was not aware. I cannot find it in the usual reference sites. Can you reference it? If confirmed it could be an important piece in the puzzle, although it cannot be the last word either.

"We now know that Yamnaya was bursting with R1b-M269"...

A Volga-specific subclade and not at all within mainline European R1b-M412. Why do I have to underline that time and again? It's as if you said "E1b" instead of E1b-V13 or E1b-M81: it can cause all kinds of confusion.

...

Maju said...

...

"As for the still underexplored Atlantic fringe: I'm certainly not against more sampling there, it can never harm. But what big news can we expect from there?"

Everything: Gokhem alone is much more important to explain modern Western European genetics than most of the other samples. Similarly the mtDNA and LCT data from the region seem crucial.

"The WHG cannot have been very different from other WHG, considering that they were similar in Spain, Luxemburg and Hungary".

WHG seems to be equivalent to "Magdalenian" (epi-Magdalenian peoples to be more precise). However the Hamburgian-Ahrensburgian culture of NW Europe seems to be relatively unrelated (and Motala genetics may support this notion). I really want a good sampling of Britain for example because it may provide huge relevant information.

... "and the question to what extent SHG diffused into NW Europe".

Almost without doubt it was fundamental in Low Germany, the Netherlands, Denmark and at least half of Britain: all the regions around the North Sea (once partly emerged). We should not ignore SHG or something like that (not sure how good Motala is as a representative of the wider macro-population).

"The rest will be details..."

The devil is in the details, you know.

If we'd go only by those locations sampled with full nDNA, the picture would be most confusing. For example, mtDNA H could still be considered to be only a Neolithic input, when in fact it was clearly present in many Paleolithic populations, but not in anyone that has been sequenced for nDNA. Modern-like mtDNA pools would be imagined to have arisen only after the Chalcolithic, when we know that they already existed in the Neolithic (but not in any site sampled for nDNA, except Gökhem). LCT+ would be imagined (as Mathieson wrongly does) to have been selected only after the Chalcolithic, when we know for a fact that, in some populations, it had been selected before it (but again none that has been sampled for nDNA).

So there's a lot to learn potentially from widely sequencing Atlantic Europe. Sooner or later it will be done. Until then we can park this discussion I guess.

Grey said...

Maju

"It could be but how do you explain EHG without Caucasus component? Which is the vector population in terms that I can identify in the archaeological record?"

If the big difference in Basques is that this element in their ancestry is missing the Caucasus component most populations have, then that implies the possibility they might have got it before the mixture

which is interesting.


Maju said...

@Grey: yes, that's a key part of the problem, from the autosomal side of the data. The Caucasus+EHG admixture is present in Yamna, Corded Ware and in general all Kurgan derived populations, with a somewhat constant ratio between both components. Hence that complex, and not each component separately, is what constitutes Kurgan or Yamna-like admixture.

What about the populations that have a different apportionment, with no effective Caucasus component but still EHG type admixture? I mentioned Basques, but I've come to realize that the pattern also affects NW Europeans, even Lithuanians and such, although their EHG-WHG ratio is much more slanted to EHG than among Basques. There are unexplained sources of EHG admixture: not all is Kurgan.

Simon_W said...

@ Maju

Alright, for the sake of complete certainty it would be useful to see some Chalcolithic yDNA from Ireland, the Scottish highlands, Brittany, and the Basque country. Can't be that hard and expensive to get that. I just won't change my opinion that it's unlikely at the moment that these extreme western places were also the center of gravity of R1b in the Chalcolithic, because according to estimates by yfull, R1b-L23 isn't age old (but I know you don't believe in age calculations) and we know it was very common on the eastern European steppe, that is, very, very far away from the extreme west of Europe. And Bell Beaker males from central & southeastern Germany and Bohemia had a similarly high incidence of R1b as those extreme western places have at present, although the incidence of R1b in modern central Europeans is much lower, which seems to indicate an east-west shift.

I have no problem admitting that the Yamnaya tested so far was predominantly of a side branch to west European R1b, because that's true (though not simply Volga/Bashkir specific, but most of them even closer to Caucasian Avars and Tabassarans, with a close relative in Sardinians). But not all of them: I0443 from Allentoft was just R1b-L23, which is a close ancestor of R1b-L51, just separated by 4 SNPs.

The Chalcolithic yDNA from SOM / La Pierre Fritte was in Lacan 2011:
http://thesesups.ups-tlse.fr/1392/1/2011TOU30177.pdf
The haplogroups were just predicted by STR testing, though.

I've reconsidered it: To me a sampling of Chalcolithic and Bronze Age northwest Europeans would also be interesting, most of all to see how and when steppe admixture and R1b reached these regions.

Maju said...

L51 (= M412) is still a transitional clade and not yet the Western-specific sublineages. Anyhow, what is clear is that the Yamna R1b is part of L23(xM412), which is a West Asia centered paragroup. It is as "useful" (useless) to discuss Western European R1b as any random R1b from Turkey, Iran, Jordan, etc.

Thanks for the link of Lacan 2011. I should have remembered that (but I didn't). It is exactly as you say re. Y-DNA: 5 G2a, 1 E1b-V13, i.e. a typical "first Neolithic" pool. However their mtDNA is also a typical "first Neolithic" pool: 3 K1a, 2 T2b, 1 U5, 1 H3 (very low in H, dominated by "Neolithic" lineages: K and T in this particular case). Hence it is not comparable to Gurgy or Paternabidea but rather fits with the genetics of Western Cardial and other "first Neolithic" sites (LBK, etc.)

What the study of Gurgy clearly underlines, much as I did before for Paternabidea, is that their mtDNA pool is not so much like that of their known precursors or contemporaries but rather like that of the peoples that came afterwards. This is almost necessarily a most important clue.