Tuesday, August 9, 2016

On the enigmatic early Neolithic farmers from Iran

There still seems to be a lot of confusion around the traps, including in the comments at this blog, about the genetic structure of the early Neolithic Iranian farmers.

They're certainly a unique and mysterious West Eurasian population, but I'd say the picture is generally pretty straightforward considering that they were dug up on the border between the Near East and Central Asia.

As per my K7 test, they're closely related to other West Eurasians, and especially Near Easterners, via an ancient component that appears to be a mixture of Basal Eurasian and something very similar to the Villabruna cluster (see post here and the last page of the accompanying comments).

Apart from that, they harbor a lot of AG3-related ancestry, albeit probably only distantly related. My guess for now is that this is mostly admixture from an as yet unsampled Central Asian forager population, perhaps with elevated affinity to Ust_Ishim (update: probably not, see here).

The graphs below are based on the datasheet available here. Like I say, these ancient Zagros farmers are unique and eastern shifted, but, at the same time, don't show the type of Southeast Asian pull that characterizes present-day South and South Central Asians.


Rob said...

Sorry, maybe I'm missing something, but in your K7 the Iran Neol. samples have zero 'Villabruna'. Only the Natufians register it.

(BTW the link to your 'comments start here' doesn't work)

Davidski said...

The Basal-rich component in the K7 is roughly 50/50 Basal Eurasian/Villabruna-related. We can see that in the Fst distances between the clusters.

So unlike other West Eurasians, Iran_Neolithic doesn't have any extra Villabruna stuff, but it does have it, and quite a bit of it too.

I can't make that link to the relevant comments work. But the last page of the comments here is where all of the useful info is.

Rob said...

I see. If 'Basal-rich' is ~50% 'Villabruna-like', and given the added Villabruna (Natufians ~20%; Anatolians 45%), then this 'Villabruna-like' population had a very hefty presence in the Levant & Anatolia.

3 possible scenarios are:
1) Villabruna-like was 'native' to the Near East, and the true Basal came from elsewhere after the LGM
2) the Near East must have been structured, with Basal and proto-WHG-like groups
3) the 'Basal' was native to the Near East, and there was a large-scale reflux of groups from somewhere to the north.

Have you or Chad excluded contributions from other UP Europeans, like Kostenki, Vestonice, Goyet, etc.?

Davidski said...

I haven't, but remember, Near Eastern affinity in Europe rises sharply with the appearance of the Villabruna cluster.

Rob said...

If not due to a "common third source" admixing into both, as suggested by the Fu paper, then a migration from the Near East into Europe during the late glacial is something undocumented in archaeological annals, afaik. Pity the Palaeolithic of Anatolia is such a blank.

Davidski said...

I don't think there was a migration to Europe during the late Paleolithic, because the ElMiron cluster is basically the same thing as the Villabruna cluster, except without the elevated Near Eastern and steppe affinity.

So it looks like the Villabruna cluster came from the Balkans, and it either also migrated to the Near East, or its close cousins were living in Anatolia, and even down the coast into the Levant.

Rob said...

I'd consider the Balkan and Anatolian Late Glacial essentially as one unit. They were separated (culturally & geographically) from the Levant, but must have migrated south after the LGM.

epoch2013 said...


El Miron does show elevated Middle-Eastern affinity in the D-stats in the Fu paper, and GoyetQ-2 even more. See figure 4b.

Davidski said...

As high as Villabruna?

epoch2013 said...


No, but GoyetQ-2 is almost as high as Falkenstein. OTOH HohleFels has none. But it looks like a tad more gradual climb than Fu et al. seem to suggest. Well, take a look at the pic.

This does place the Middle-Eastern affinity in a population in Iberia at the end of LGM, though. Can hardly be caused by a migration I would say.

Chad Rohlfsen said...

I think they had ElMiron / Magdalenians as 32% Villabruna and 68% GoyetQ/Aurignacian. I'd have to double check though.

You're getting close, but I'm not sure it's possible to make a cluster shared equally with Iran and Natufians. If the BE rich is about 50% BE, you're still missing 10-15% BE in Iranians which might be getting shoved in ANE. I'll take a stab at it too after I'm done with the other test.

Richard Rocca said...

Davidski said..."Apart from that, they harbor a lot of AG3-related ancestry, albeit probably only distantly related. My guess for now is that this is mostly admixture from an as yet unsampled Central Asian forager population, perhaps with elevated affinity to Ust_Ishim."

Remember that two of the three Iranian Neolithics belong to R2. Not difficult to speculate that that is where the AG3-like ancestry came from IMO.

epoch2013 said...


El Miron proper. The thing is that she lived at the end of the LGM, at which time Iberia was considered a refugium, a bit too early I think to consider a direct migration from the Middle East.

@Rob and David

If the Balkans were the origin of Villabruna/WHG then it had to arrive later, as several Romanian samples show no special affinity to Villabruna (apart from the overall relatedness to K14 that almost all ice age HGs show). It was gone during the Mesolithic, as Greek Mesolithic HGs were mtDNA K1c.

Not that I have a better idea, mind you.

Chad Rohlfsen said...


What are you talking about? I mentioned nothing of the Middle East.

bellbeakerblogger said...

I'm curious if Villabruna is actually R2 instead of R1b. Does anyone know the basis for this call?

Matt said...

Kind of seems unusual on these stats, Sardinian is more of an outlier than the early Neolithic. Perhaps it's because of Yoruba group, or without a D(Yoruba,EN)(Ust_Ishim,X) ref stat.

PCAs:, Neighbour joining:

This pair of stats is also strange in that all the ancients are on one side (relatively more related to Onge) and all the moderns are on the other (relatively more related to Dai):

(Must be due to how they're sampled?)

epoch2013 said...


Previous reactions did discuss that.

Richard Rocca said...

@bellbeakerblogger... Villabruna is undoubtedly R1b.. and more specifically R1b1a.

Grey said...

if sequence

1) pre-lgm distribution
2) everyone dives into their nearest local refuge
3) post-lgm expansion

then i guess one logical possibility (data permitting) is a wide spread pre-lgm base population which divides into 2 or more regional refuges during the LGM?


for example - related population all along the north coast of the med pre-LGM splits into separate Iberian, Adriatic and Aegean refuges and then re-expands after?

epoch2013 said...


In other words, I brought up El Miron to respond to previously discussed Anatolian migration suggestions. I reckoned you responded to that.

andrew said...

Who are South Central Asians as distinct from South Asians? The term does not have a clear geographic meaning in my mind.

bellbeakerblogger said...

Thanks, just wanted to be sure it had been independently verified

Matt said...

Cross graphs of K7 "Basal Rich" components with these stats:

(Some are matched by taking CHG's K7 as Satsurblia's etc.)

Lots of them correlate quite well. I think the graph of "Basal Rich" vs the sum of the AG3 and Bichon stat is interesting to look at.

Alberto said...

I don't think it matters much for the overall picture, but the stats with Ust-Ishim in that position look a bit counter intuitive. For example:

D(Yoruba, AG3; Ust_Ishim, Anatolia_Neolithic) = 0.0431
D(Yoruba, AG3; Ust_Ishim, Armenian) = 0.0339

It looks like Ust_Ishim is "masking" a good part of the crown Eurasian part of each of the populations in the 4th position, so it ends up depending largely on the Basal Eurasian part:

D(Yoruba, AG3; Ust_Ishim, Hungary_EN) = 0.0454
D(Yoruba, AG3; Ust_Ishim, Italian_Tuscan) = 0.0417
D(Yoruba, AG3; Ust_Ishim, Iran_Neolithic) = 0.0383

It's actually the first case I've seen where, if you placed two of the X populations in the same stat, the result clearly would not match the expectations from the two separate ones (that is, D(Yoruba, AG3; Anatolia_Neolithic, Armenian) will probably be positive and not negative as the first 2 stats above could suggest).
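To make the expectation in this comment concrete: the f4 numerator behind a D-stat is exactly additive, so f4(Yoruba, AG3; Anatolia_Neolithic, Armenian) equals f4(Yoruba, AG3; Ust_Ishim, Armenian) minus f4(Yoruba, AG3; Ust_Ishim, Anatolia_Neolithic). D itself carries a denominator that differs for every quadruple (and the marker sets can differ too), so the same subtraction on D values is only a rough guide. Applied naively to the two values quoted above, it gives 0.0339 - 0.0431 = -0.0092, which is the "negative" expectation being questioned. Below is a minimal Python sketch of that arithmetic. The allele frequencies are random placeholders and the function names are hypothetical; nothing here comes from the datasheet.

```python
import random

def f4(a, b, c, d):
    """Average of (a - b) * (c - d) over SNPs, from per-SNP allele frequencies."""
    return sum((w - x) * (y - z) for w, x, y, z in zip(a, b, c, d)) / len(a)

def d_stat(a, b, c, d):
    """D in one common normalisation (sign conventions differ between tools)."""
    num = sum((w - x) * (y - z) for w, x, y, z in zip(a, b, c, d))
    den = sum((w + x - 2 * w * x) * (y + z - 2 * y * z)
              for w, x, y, z in zip(a, b, c, d))
    return num / den

def implied_by_subtraction(stat_with_x1, stat_with_x2):
    """Value of stat(A, B; X1, X2) if stat(A, B; Ref, X) were exactly additive."""
    return stat_with_x2 - stat_with_x1

# Hypothetical allele frequencies for five populations (random, illustration only).
rng = random.Random(0)
yoruba, ag3, ust_ishim, anatolia_n, armenian = (
    [rng.uniform(0.05, 0.95) for _ in range(1000)] for _ in range(5)
)

# Exact additivity holds for f4 (up to floating-point rounding).
direct = f4(yoruba, ag3, anatolia_n, armenian)
via_ref = implied_by_subtraction(
    f4(yoruba, ag3, ust_ishim, anatolia_n),
    f4(yoruba, ag3, ust_ishim, armenian),
)
print(abs(direct - via_ref) < 1e-12)

# The same subtraction on D is only approximate, because each D has its own denominator.
d_direct = d_stat(yoruba, ag3, anatolia_n, armenian)
d_via_ref = implied_by_subtraction(
    d_stat(yoruba, ag3, ust_ishim, anatolia_n),
    d_stat(yoruba, ag3, ust_ishim, armenian),
)
print(d_direct, d_via_ref)

# The two D values quoted in the comment, treated as if they were additive.
print(round(implied_by_subtraction(0.0431, 0.0339), 4))
```

If the directly computed stat really does come out positive, that would suggest the per-stat normalisation or marker overlap is behaving differently across these stats, rather than the underlying f4 values.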

Matt said...

@ Alberto, in general, yeah, in these stats, it does look like all ancient samples are displaced more towards the Bichon, EHG, AG3 end of the vectors than you would often expect based on PCA, even comparing, for ex, CHG vs Georgian (Georgian should be more ANE + WHG rich?), etc. Is that the effect of Ust Ishim or Yoruba though?

Alberto said...


"Is that the effect of Ust Ishim or Yoruba though?"

Couldn't say for sure without testing, but I can't think of a reason why Yoruba would produce this effect (usually it's a neutral outgroup), while Ust-Ishim is an oddball that shares similar drift with WHG, ANE, ENA... kind of "masking" part of those components in the population next to it and giving the Basal Eurasian part more weight in the stat.

Or who knows, maybe it's something else. But I had not seen before this effect with other "secondary" outgroups (BedouinB, Kotias,...). The pattern of higher or lower shared drift between B and D would usually be consistent with what we see in more clear double outgroup stats.

Davidski said...

Thanks Matt, this one's very nice. I put it in the post.

Davidski said...

By the way, the Onge vs Dai stat appears to be affected by a lower number of markers shared between the Onge and the modern samples.

I'll remove the Onge from the datasheet and try not to do that again.

Davidski said...


The distinction between South and South Central Asians is a genetic one.

I can run many of the Pakistani and Afghan groups in these sorts of tests that focus on West Eurasian genetic diversity without skewing the results for the other samples.

However, when I start adding groups from India, the analysis is skewed, some information pertaining to West Eurasian diversity is lost, and the analysis becomes one of West Eurasia plus South Asia.

Rob said...


Good points, I'd have to re-read, but if the rise of Villabruna admixture is more gradual than implied in the body of the text, then this only further points to structure within UP Europe than external migration.

IIRC, we have 2 Romanian samples: Oase and Ciocl. Neither clustered with others or deemed "ancestral" to later foragers. But they're not alone, neither was K12, nor one of the Goyets.

I feel a shortfall of the paper is that it didn't analyse the outliers well enough, and did not fully evaluate how different UP streams or founder lineages might have arrived into Europe

With all those outliers in mind, one can hypothesise that the proto-Villabruna lineage was still there, somewhere, and just hasn't been sampled. I don't see any evidence for any movement into SEE in the Mesolithic. Quite the contrary, forager groups in the Balkans were emigrating or "dying off" by the Holocene.

Davidski said...

Actually, all of the stats appear to be slightly affected by missing markers in the modern samples.

But I won't re-run the stats on different markers, because the focus of this post is on Iran_N, which has a 100% marker count in all stats.

So please keep that in mind and focus on the ancient samples.

ryukendo kendow said...

@ David

Nice demonstration of AG3 in Iran_N over EHG.

Not sure those stats are so clean for the ENA purposes though... why was Ust-Ishim placed in third position? Especially as it seems certain groups do in fact have varying distance to Ust-Ishim not caused by crown-basal differences, which will cause unwanted fluctuations..

The off-cline shift of Iran_N is very apparent in stats without Ust Ishim in third position, but it's also apparent in stats for Ami, albeit less so:

I.e., these stats are straight out double outgroup stats, graphed.

That said, it's true that even if there is ENA-like influence in Iran_N, it is by no means large, at most still less than Saami, Tajiks or Persians.

ryukendo kendow said...

Matt and Alberto, you may be interested in this:

David passed me a set of stats for GujaratiA and D a few days ago, which I graphed this time with labels; the thing that really caught my eye, was the fact that the population most displaced towards the lower right, i.e. sharing the most drift with both GujaratiA and D symmetrically, is Yamnaya.

This implies that all the ancestry that does not form an outgroup to West Eurasians in GujaratiA and D both, a.k.a. ANI in Gujarati, is most similar to Yamnaya. Which is quite weird: will a 50/50 Iran_N-like plus Sintashta-like contribution result in an ANI that is most similar to Yamnaya? In fact, don't qpAdm and ADMIXTURE give even higher levels of Iran_N vs Steppe, more than 50/50? It seems to me that we may be overestimating the levels of Iran_N in most S Asians, and that the West Eurasian contribution in S Asia prior to the Steppe contribution was already quite ANE- or CHG-shifted? Or else it was otherwise diluted by more crown Eurasian ancestry than we have generally come to expect, either West Eurasian or some other undiscovered Central Asian thing...

Davidski said...

"Why was Ust-Ishim placed in third position?"

It's the most basal Eurasian we have, so I thought it might help as a second outgroup in distilling the differences in shared drift with WHG, ANE and ENA.

The graphs worked out OK, apart from the ancients vs moderns hiccup, so I didn't bother trying any other references.

ryukendo kendow said...

OK, I do think that the question of ENA influence is an open question; Ust-Ishim like ancestry may be quite significant, while Ami-like influence is probably not very large...

Thanks for the stats for Velamas and UP_Brahmin again. The stats are truly excellent, allows for so much resolution!

Here are the latest recap stats for GujaratiA vs GujaratiD, sorted by D score, as expected topped by Sintashta, Andronovo and Srubnaya, but also LBK_EN and Orcadian, making me think EEF really is quite good at distinguishing Steppe West Eurasian vs Neolithic West Eurasian:
Now sorted by Z score, significant differences favouring moderns, but once again indicating the relative importance of EEF as seen in prevalence of West Europeans:

Now for Brahmin_UP vs Chamar, by D, rather similar:
by Z, quite different, favouring EEF even more:

Now Velamas vs Chenchu is very interesting, with Iran_Neol and Iran_Chalc tops, followed by a tail of quite Caucasus-like ancients and moderns:
With Z score Chalcolithic is now tops:

PCA of all three sets of asymmetry stats:

I think the last figure is quite a beautiful demonstration of the parallel nature of Brahmin/non-Brahmin autosomal differences from different parts of India (those two sets point in the same direction), while HG/Neolithic autosomal differences are quite different, seeming to point more at a space intermediate between Iran_Neol and various Caucasus pops like Armenia MLBA. Maybe the populations of South Asia were somewhat CHG and EHG/ANE shifted before the Steppe incursions already.
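On why the rankings by D and by Z can differ: Z is D divided by its standard error, which is usually estimated with a block jackknife over the genome, so a stat with a modest D but very consistent signal across blocks can outrank a stat with a larger but noisier D. The sketch below is a simplified, equal-weight delete-one jackknife in Python, not the weighted procedure used by the standard tools. The per-block sums and the function name are hypothetical values for illustration only.

```python
def block_jackknife_z(num_blocks, den_blocks):
    """Return (D, standard error, Z) from per-block sums of the D numerator
    and denominator, using an equal-weight delete-one block jackknife."""
    g = len(num_blocks)
    total_num = sum(num_blocks)
    total_den = sum(den_blocks)
    d_full = total_num / total_den

    # Recompute D with each block left out in turn.
    leave_one_out = [
        (total_num - n) / (total_den - d)
        for n, d in zip(num_blocks, den_blocks)
    ]
    mean_loo = sum(leave_one_out) / g

    # Jackknife variance of the estimate.
    variance = (g - 1) / g * sum((x - mean_loo) ** 2 for x in leave_one_out)
    se = variance ** 0.5
    return d_full, se, d_full / se

# Two hypothetical stats with the same D but different block-to-block noise.
den = [10.0, 10.0, 10.0, 10.0, 10.0]
steady = block_jackknife_z([0.9, 1.1, 1.0, 1.2, 0.8], den)
noisy = block_jackknife_z([0.5, 1.5, 1.0, 1.8, 0.2], den)

print("steady: D=%.3f SE=%.4f Z=%.1f" % steady)
print("noisy:  D=%.3f SE=%.4f Z=%.1f" % noisy)
```

Both made-up stats have the same D, but the steadier one gets the higher Z, which is the kind of reshuffling seen between the D-sorted and Z-sorted lists.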

ryukendo kendow said...

^^ correction, seemingly pointing at a space between Yamnaya and Iranian populations.

Here is another West Eurasian vs East Eurasian plot, this time WHG vs Ami, where the drag of Iran_N towards Ami is quite apparent too:

It is less than Scythian_IA, Mordovian, Kargopol Russian etc, and about on par with Lezgin and some Persians.

Shaikorth said...

Iran_N and Ami share ancestry from the ANE branch while Natufians have Villabruna-related ancestry Iran_N doesn't, and this has more recently spread more widely. Perhaps then it would be more useful to test Onge (assuming the Iran_N ANE doesn't share the Onge pull of MA-1) or even Papuans instead of Ami. Denisovan ancestry shouldn't confound the latter, according to Sankararaman et al. it seems to be stable in Europe and Near East regardless of f4(X, Yoruba; Australian, Ust'-Ishim) using high coverage sequences.

Alberto said...


Yes, all those stats are very interesting. In the double outgroup ones, it's significant that the 4 populations (Brahmin_UP, Chamar, Velamas and Chenchu) all share more drift with Afanasievo than with any other population tested. This gives a good idea of the west Eurasian side of South Asians (Afanasievo being like 30% Iran_N + 70% EHG, and EHG itself being in the high end in those stats).

Then the stats of Brahmin_UP vs. Chamar show that bias towards Sintashta/Andronovo, reflecting the influence of steppe people in some groups (GujaratiA being a more clear example, probably from the Scythian kingdom times). The bias is quite subtle if you take Afanasievo as a baseline, but it's there.

Interesting the one comparing Velamas and Chenchu. Velamas are biased toward Iran_N while EHG is at the lowest end. This could reflect a farming vs. hunting-gathering signature in South Asia? Since this bias is not subtle, but quite significant.

"Maybe the populations of South Asia were somewhat CHG and EHG/ANE shifted before the Steppe incursions already."

I obviously agree, since I've been arguing this for years. I can't see EHG groups moving to South Asia in the Bronze Age. But this debate has been going on for too long, and too many people take it too personally, so I'd rather wait for ancient DNA to come and tell us what it is.

Davidski said...

The stats are easily explained by the models below, which show that Steppe_MLBA influence in South/Central Asia increased from the Indo-Aryan to the Iranian invasions.

Before the steppe invasions, South Asians were a mixture of Iran_N, Iran_Hotu and Onge/Dai.

That's pretty much it. Nothing amazing or unexpected.

Iran_Neolithic 53.25
Yamnaya-Catacomb_Ulan 28.75
Andronovo_Kytmanovo 7.4
Han 7.1
Andamanese_Onge 2.15
Papuan 1.35

distance%=0.384 / distance=0.00384

Iran_Neolithic 54.55
Andronovo_Kytmanovo 16.35
Yamnaya-Catacomb_Ulan 16.35
Han 8.25
Andamanese_Onge 3.1
Papuan 1.4

distance%=0.482 / distance=0.00482

Iran_Neolithic 41.7
Andronovo_Kytmanovo 30.65
Yamnaya-Catacomb_Ulan 19.15
Han 6.95
Andamanese_Onge 1.05
Papuan 0.5

distance%=0.5318 / distance=0.005318
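For readers unfamiliar with how such percentages come about: each model above expresses a target as a weighted mix of reference populations, with weights that sum to 100% and a distance that measures the remaining misfit (the distance% figure is simply the distance times 100). The Python sketch below is not the nMonte script. It is a plain random hill-climb written under stated assumptions, with made-up coordinates and hypothetical function names, and is only meant to show the idea.

```python
import random

def mix(weights, sources):
    """Weighted average of the source coordinate vectors."""
    n = len(sources[0])
    return [sum(w * s[i] for w, s in zip(weights, sources)) for i in range(n)]

def distance(target, weights, sources):
    """Euclidean distance between the target and the modelled mix."""
    modelled = mix(weights, sources)
    return sum((t - m) ** 2 for t, m in zip(target, modelled)) ** 0.5

def fit(target, sources, steps=20000, step_size=0.05, seed=0):
    """Random hill-climb: shift a little weight from one source to another
    and keep the change only if the distance to the target shrinks."""
    rng = random.Random(seed)
    k = len(sources)
    weights = [1.0 / k] * k            # start from an equal mix
    best = distance(target, weights, sources)
    for _ in range(steps):
        i, j = rng.sample(range(k), 2)
        delta = min(rng.uniform(0, step_size), weights[i])  # never go negative
        trial = list(weights)
        trial[i] -= delta
        trial[j] += delta              # total weight stays at 1
        d = distance(target, trial, sources)
        if d < best:
            weights, best = trial, d
    return weights, best

# Made-up coordinates for three hypothetical reference populations.
refs = {
    "Ref_A": [1.0, 0.0, 0.0, 0.0],
    "Ref_B": [0.0, 1.0, 0.0, 0.0],
    "Ref_C": [0.0, 0.0, 1.0, 0.5],
}
names = list(refs)
sources = [refs[n] for n in names]
target = mix([0.6, 0.3, 0.1], sources)   # a target built as a known mix

weights, dist = fit(target, sources)
for name, w in sorted(zip(names, weights), key=lambda p: -p[1]):
    print(name, round(100 * w, 2))
print("distance%%=%.4f / distance=%.6f" % (100 * dist, dist))
```

Because the weights are constrained to be non-negative and to sum to one, a population can only enter the model as a positive share, and a small residual distance indicates that the chosen references can reproduce the target's coordinates reasonably well.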

Olympus Mons said...


"Maybe the populations of South Asia were somewhat CHG and EHG/ANE shifted before the Steppe incursions already"

Yes, and maybe the Steppe got incursions from South India before that?

At least they did have dogs from South Asia, didn't they? :)

Davidski said...

Bronze Age steppe populations didn't have any ancestry from South Asia.

How did you miss this fact? Do you understand anything at all about population genetics?

ryukendo kendow said...

Can you come up with a model for GujaratiA and D? I still think 50% Iran_N plus 30% Steppe ancestry makes it impossible for ANI as a whole to be similar to Yamnaya, but then ANI in Kalash and Pathan may be quite different than ANI in Gujarati.

Olympus Mons said...

It's not like we have buckloads of ancient DNA! Dave, we have nothing. Nothing in terms of ancient DNA that allows us to make strong inferences. - Like we are not finding "new populations" every year and we are not seeing complete shifts in population genetics in less than a millennium. And it's not like you don't like to talk about "ghost" populations, right?

We get DNA from one inhumation of one person. That is the genetic story of that person. It could be the son of a girl snatched 2000 miles away and gang raped by ten guys!

Davidski said...

Bronze Age steppe populations don't have any ancestry from South Asia.

Do you actually understand this or not?

Olympus Mons said...

Yes I do! And I don't know dip sh**t about ancient South Asian DNA.
Can you point me to a link?

I was joking about R1a and south asia. How many ancient dna we have for r1a steppe bronze age and how many for chalcolithic ancient south asia? Can you also point me to?

ryukendo kendow said...

@ Shaikorth

Good suggestion about the Papuan, it does seem that the behaviour of Iran_N is a dead ringer for actual ENA ancestry here though...

Whether or not its South Asian-like will need double outgroups with Onge. Do you know any nMonte datasheets with Onge columns?

Davidski said...


You don't need to know anything about ancient South Asian DNA to know that Bronze Age steppe populations don't have any ancestry from South Asia.

You just need a really solid understanding of the genetic structure of Bronze Age steppe populations.

But you lack this at the moment, partly due to a lack of experience in this area, but I think also partly due to the fact that you don't want to accept reality, because if you did, that would make your theory about the origins of Bell Beakers untenable.

Shaikorth said...


Now that I look that graph I'm not sure it's all that clear with even the Papuan, given how much AG-MA-1 is outlying...

But back to South Asians, here's Loschbour donation/WC1 donation ratios from Broushaki et al. supplementary tables for various populations. Correlates pretty well with steppe ancestry. Everything that has less than Sindhi seems to get 0's, but that doesn't mean no steppe ancestry since there's no high coverage EHG in the set and Loschbour is an imperfect proxy. Just means "low".

Lazaridis set
Tajik_Pomiri 0.3623
Kalash 0.2062
Burusho 0.1779
Pathan 0.1638
Sindhi 0.0146
GujaratiA 0.1179
GujaratiB 0.0845

Busby set
Tajik 0.3768
Kalash 0.1765
Burusho 0.1563
Pathan 0.1292
Sindhi 0

Hazaras, Uzbeks etc. have higher Han donation but also WHG/WC1 ratios close to or above Tajiks.

Olympus Mons said...


Which would be fine. Really. That is not my point.
You may not notice this, but being a "visible" guy any small "shift" in you is noticeable. And you do show some confirmation bias.
That is all I was actually joking about.
I know enough about genetics to agree with you that there is presently no indication of Dna coming out of India into steppe and that being part of R1a samples we have.

Still Regarding, bell beaker - Wanna bet?

Nirjhar007 said...

Tick tock tick tock

Davidski said...


Some type of South Asian-related ancestry is possible for some of the early Iranian farmers, because there are hints of it in my K7 and in these f3 graphs. But these signals are minor, and appear to be stronger in Iran_Hotu than in the farmers.


Those Harappan results will just show what we already know; Bronze Age Indians were a mixture of Neolithic Iranians and local foragers.

You're fooling yourself if you think we'll suddenly find out that Harappans migrated to Europe. That ship sailed a long time ago.

Nirjhar007 said...

But I am not an OIT proponent, not for PIE, not sure yet for R1a...

Davidski said...

Well you do argue against the idea that Bronze Age Europeans migrated to South Asia, which does mean you're in some serious denial.

And you were also babbling something here about Indo-Iranians migrating from South Asia to Eastern Europe, which is ridiculous.

As for R1a. Get over it. The OIT for R1a ship sailed years ago as well.

Olympus Mons said...

... And I am wrong for saying that Yamna R1b found near the Black Sea and up river were actually fleeing Shulaveri. And that those Shulaveri fleeing south were in Tell Tsaf, in Merimda, El Omari, and then Iberia.... How is that anyway disproved by current ancient dna?

Nirjhar007 said...

You are sounding like a teenage girl, David. You've known for years that I don't propose OIT for PIE. But it seems you've misunderstood.

Neither I propose IIr migrating from India as whole, but SC Asian area is a very good candidate.

It's not intelligent to connect a SNP with the spread of one whole linguistic group.

As for the origins of R1a, S Asia is a very serious area. Without getting aDNA from there, nothing conclusive can or should be suggested.

Davidski said...

"How is that anyway disproved by current ancient dna?"

Well duh, there's no evidence that Yamnaya got its Y-chromosomes from south of the Caucasus, because:

- it's mostly of EHG/steppe origin, and EHG/steppe ancestry shows a very high correlation with Y-hg R1

- there's no evidence of paternal gene flow replacing the highly patriarchal groups that were already on the steppe during the Copper Age and already carried R1

- there's no evidence of R1 in any ancient remains from south of the Caucasus dating to before the Bronze Age, not even a single instance

- you claiming repeatedly that the evidence will show up in all its glory as soon as the right samples are tested is not evidence

Davidski said...

"Neither I propose IIr migrating from India as whole, but SC Asian area is a very good candidate."


"As for the origins of R1a, S Asia is a very serious area."

Not even an option.

Olympus Mons said...

I particularly like this from a guy called dvidsky on a blog called eurogenes

Davidski said...

But did you read it and understand it?

Nirjhar007 said...


Calm down.

Nick Patterson (Broad) said...

A problem to ponder is the deep history of ANE. Ancient Iran/Caucasus of course has plenty. Does this originate from Northeast Eurasia (Mal'ta boy), or did it originate very anciently in Iran/Caucasus/Central Asia and spread northeast? I don't know.

Nirjhar007 said...

It should be Central Asia .

FrankN said...

On the ANE/ ENA influence - has anybody yet had the time to look at that Jomon aDNA?

Not too long ago I linked a paper here that argued for EN Indian, especially Bengal, pottery being derived from Jomon cord-impressed ware. Also interesting is that Ainu and Munda/Santali/Nihali share the same word for dog, namely seta/sita, a root not found anywhere else around the world.

Considering that the earliest pottery-making culture was the Jomon, which also has AFAIK the earliest East Asian archeological evidence of the domestic dog, one might assume technology spread out of Jomon during the Epi-Paleolithic and Mesolithic, reaching South Asia by 7000 BC at latest. From around the same time, we also have the earliest European evidence of pottery, from the Samara area. Moreover, the earliest American pottery, ca. 3500 BC on the coast of Ecuador, has been frequently linked to Jomon influence.

If Jomon is the origin of the worldwide spread of pottery, we should expect traces of Jomon aDNA having been distributed alongside with it, and leaving traces with ENA / ANE / ASI.

ryukendo kendow said...

Thanks David for the stats.

@ Shaikorth

But ENA ancestry does actually exist in MA-1, doesn't it? MA-1 is off cline from EHG, SHG and WHG, too close to East Asians, in Lazaridis.

On another note, I realised the whole 'ANI in Indians similar to Yamnaya' mystery may be a non-issue. All those plots assume that all non-West-Eurasian ancestry in Indians forms a clean outgroup to Europeans and Middle Easterners, i.e. ENA is a separate branch, cleanly split from (European, Middle Eastern). This actually isn't true, ENA is actually closer to Europeans than to Middle Easterners. The ENA ancestry in South Asians will tend to bias them towards Europeans. This may reconcile the qpAdm, nMonte, double outgroup plots, chromopainter runs and ADMIXTURE analyses showing large quantities of Iran_N and little extra ANE in Indians apart from Steppe and Iran_N, vs direct comparisons of Indians with West Eurasians showing high levels of drift with European ANE-rich populations among West Eurasians.

There still may be some CHG and ANE in pre-IE India, but nothing like the extremely high level needed to get to an overall Yamnaya-like profile when ~30-20% Steppe ancestry is added--this may be an artifact of ENA ancestry biasing all Indians towards EHG, ANE and Europeans.

postneo said...

Very ambiguous

ryukendo kendow said...

Lol agree completely. Just to clarify though, I do think the need for extremely high levels of EHG/ANE/CHG in pre-steppe India is more or less completely out, there's a reason why this phenomenon shows up in direct comparisons with a single W Eur population only, and never in any other analysis with multiple sources.

Chad Rohlfsen said...


Wasn't it EHG they had as WHG and a pop on the Han Onge cline, which is ANE? I don't remember them stating that ANE is ENA admixed, as a final model. I thought it was just about MA1 being the West Eurasian that intersects the West and East cline. That would go with the Onge+MA1 = Han.

IIRC, MA1 is not really closer to ENA than WHG, even with UP admixture in WHG. I don't think anything really points to ENA in MA1. Where you really see it is in comparison to AG3, but that is a poor coverage and damaged sample.

The best fit I've had for Han outside of MA1 and Onge is Onge, MA1, and Afanasievo. This was a good fit by adding Natufians and Iran EN to outgroups.

Shaikorth said...

Yeah, but they tested that against Onge, and I think the outlier status of AG3-MA1 is still too significant even when that's accounted for.
Now was it specific to Onge? Too little data to say perhaps, but the fits they got for Dai, She, Ami and Atayal etc. as mixes of Onge and ANE didn't take less AG2 (not Onge-shifted) than they took MA-1 and the P-values were slightly higher with AG2.
@Chad, it's in the extended data figures of Lazaridis 2016. MA-1 is shifted towards Onge compared to WHG, SHG, EHG and AG2.

The Burusho results in the Broushaki paper made me wonder about their origin, the WHG/Iran_N ratio is not elevated compared to Kalash and Pathan unlike Tajiks and Turkics and with moderns they're fitted as Pamiri+Bengali mixes in Lazaridis set, or as Brahmin-Tharu-Kurmi (Pathan + Yi without Indian donor pops) mixes in the Busby set. Indicators that whatever genetically differentiates them from their neighbours seems to come from the southeast, not Siberia? I proposed something like this before the paper came out.

postneo said...

In dog years the Himalayan wolf and Indian wolf are like pure australopithecines and Neanderthal descendants vs humans. But there must have been hybrids. E.g. the Mexican wolf appears to be a recent hybrid between wolf and coyote driven by humans killing wolves.

Pity the paper does not discuss or show Indian and Himalayan wolves in the figs.

FrankN said...

As we are back to ancient dogs, which makes sense to me as a possible indicator of UP population movement, here a link to a little noted paper on East Asian dog and wolf aDNA:

Essentially, they show:

a) An East Siberian Dog (Zhokhov Island, 8710 BP) belonging to mtDNA clade A. That clade had already previously been documented for contemporary American Dogs (Koster Cave, IL);

b) "The median joining network for ancient canid sequences (Fig 2) shows a Yana haplotype (S805) [Wolf, 27 kya] that is one step away from a Zhokhov haplotype (S902) that represents one of the main phylogenetic clades among canids (Clade A). Several ancient canid haplotypes are oriented around the Yana S805 including some of the oldest canid haplotypes reported to date, including the Ulakhan-Sullar specimen (Canis cf. variabilis, S809) from our study, in addition to ancient canid haplotypes from Belgium dated to 30,000 YBP and 36,000 YBP and a canid haplotype from Kostenki, Russia dated to 22,000 YBP [10]."

c) "Both Zhokhov and Aachim haplotypes appear to be genetically indistinguishable from domestic dogs, which may suggest that domestic dogs were in Arctic Siberia by at least 8,000 14C YBP. On the other hand, the Zhokhov and Aachim canids show genetic affinity with Asiatic wolves from Fig 3. In particular, C. lupus chanco [Tibetan Wolf] and C. lupus hodophilax [Japanese Wolf, extinct] have either shared haplotypes or are separated by one mutation from the Siberian canids, including one Yana canid."

They conclude:
"While a recent genome-wide analysis of Chinese indigenous dogs showed a closer genetic affinity to domestic dogs [66], another explanation is that the Siberian canids retained a genetic signature from admixture with local breeds through geographical isolation, which has been suggested in other ancient dogs breeds."

Note also this older paper:
"We show that the evolvement of Natufian dogs were the product of unconscious selection of commensal wolves quasi-isolated under the special anthopogenic habitats created within and around Natufian sites, and at least ritually assimilated to human society. There is no evidence that Neolithic dogs are direct descendants of Natufian ancestors. Their multiregional origination is a widely accepted phenomenon. We suggest that Neolithic dogs were either domesticated anew, or were introduced from elsewhere to the southern Levant."

There are two widely discussed cases of possible early dog domestication, Le Goyet (BE) and Razboinichya Cave (Altai), both dating to around 33 kya. Both specimens clustered around Clade A, but have ultimately been determined as "failed domestication attempts". Failed in the sense that they didn't contribute to existing dogs. But maybe they didn't fail culturally, i.e. they may have provided later populations the knowledge of how to domesticate and subsequently exploit dogs. From such a cultural base could well have sprung a number of later, apparently "independent" domestication events across all of Eurasia, including the equally "failed" Natufian one (which, however, may not have failed completely but constitute the source of the Israeli Wolf admixture in the Mbuti's Basenji).

Matt said...

All, apologies if OT to the topics you're talking about above, but I thought this might possibly be of interest.

Since Davidski said there might be some marker issues comparing ancient and modern, I thought I'd run a PCA with just the ancients.

One thing I did notice is that if you run a cline through the Euro Neolithic, then it does oddly seem like it stretches back to Iran_N, *not* Levant_N like I would expect:

Graphing them direct shows that particularly in AG3/MA1 vs Bichon, what seems to happen is that the projected end of the Anatolia->Euro_Early_Neo->Euro_Chal cline seems to project closer to Iran_Neolithic, rather than Levant Neolithic.

Or put in other words, the ANE vs Bichon affinity of a projected end point of the Anatolia to Europe cline seems too high in these stats to be represented by Levant_N, and instead it seems like it's more like Iran_N, though not quite there. Also seems true for EHG vs Bichon affinity.

(Btw, in these graphs, "Anatolia_N-Iberia_C" is effectively the back projection from Anatolia_N in the opposite direction to Iberia_C, of the same magnitude (in theory like Anatolia - ~25% WHG), while "Iran_N 0.533, Levant N 0.477" is the ratio of Iran_N to Levant_N given in Laz's model for Anatolia_N).

I don't know if that might visually help explain why Lazaridis's models for Anatolia worked out as it did (Levant_N on its own does not seem formally quite right on the WHG-ANE spectrum).

OTOH, there's something not right with that, if ADMIXTURE is placing Anatolia_N and Levant_N together (and for more than just because they are both quite WHG ish). Hopefully ADMIXTURE / PCA are not just accidentally bracketing Levant and Anatolia together as "relatively WHG rich Neolithic" in a mistaken way.
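
For anyone who wants to rebuild the two constructed points mentioned in the parenthetical above (the "Anatolia_N-Iberia_C" back projection and the Iran_N/Levant_N mix), here is a minimal sketch. It assumes each population is simply a vector of stat values. The coordinates are made-up placeholders, not values from the datasheet, and `back_project` and `mix` are hypothetical helper names.

```python
import numpy as np

def back_project(anchor, toward):
    """Return the point as far from `anchor` as `toward` is, in the opposite direction."""
    anchor = np.asarray(anchor, dtype=float)
    toward = np.asarray(toward, dtype=float)
    return anchor - (toward - anchor)

def mix(points, weights):
    """Weighted average of population coordinates; weights are renormalised to sum to 1."""
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return (weights / weights.sum()) @ points

# Placeholder coordinates (NOT real data): [AG3/MA1 stat, Bichon stat]
anatolia_n = np.array([0.10, 0.20])
iberia_c = np.array([0.12, 0.26])
iran_n = np.array([0.15, 0.08])
levant_n = np.array([0.07, 0.16])

# "Anatolia_N-Iberia_C": step from Anatolia_N away from Iberia_C by the same magnitude
anatolia_minus_iberia = back_project(anatolia_n, iberia_c)

# "Iran_N 0.533, Levant N 0.477": the ratio as quoted in the comment
iran_levant_mix = mix([iran_n, levant_n], [0.533, 0.477])

print(anatolia_minus_iberia, iran_levant_mix)
```

The weights as quoted add up to 1.01, which is why the sketch renormalises them instead of assuming they sum to 1.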

Davidski said...


PCA can do some funny stuff, even when based on formal stats, depending on the variables and samples you have to work with.

A better way to test if Anatolia_N, Levant_N and Natufians are on a cline, and indeed on specific clines of diversity, is to graph a couple of well chosen D-stats.

I'll have a think about what they should be, although you might already have some ideas.


One hypothesis that might be worth testing is an origin of ANE in the Altai-Sayan LGM refuge, followed by a two pronged expansion west; one taking a route south of the Caspian and becoming the AG3-like ANE in Iranian farmers, and another moving north of the Caspian and becoming WHG-rich EHG.

I have no direct evidence for any of this, but there are some indirect clues, like the spread of bears from the Altai-Sayan both to West Eurasia and the Americas, and the spread of pottery west from Siberia into the Near East and Europe.

Rob said...

@ Nick & Dave

Where did ANE originate ?
Depends on when, and what is meant by originate .
By all recent field surveys, Central Asia appears virtually uninhabited during the LGM and a few thousand years on either side

A recent review (Davis/Ranov) finds that "essentially the Upper Paleolithic, in contrast to the Middle Paleolithic or Epi-Paleolithic, is unknown. Why that is the case remains an open question". Probably because it was cold and hyperarid.

If there is no continuous sequence through the LGM in the Caucasus either, that leaves the Altai, the Indus, or some yet to be discovered cryptic refugia in the Asian mountain caves (?)

The Altai looks like a refuge for Siberia and the Americas. I don't think it explains haplogroup R.

Davidski said...

Why wouldn't it explain R, considering R and Q are sister clades and MA1 from Siberia belongs to R?

Paternal founder effects can easily explain the lack of native R in the Americas.

FrankN said...

@Rob: The Caucasus has been evidenced as an LGM refugium for a number of plants and animals. What we have is a cultural break during the LGM at Satsurblia. However, this is easily explained by the cave being located above 500 m asl, and as such in the possible glaciation zone. I don't think that break rules out a West Georgian human refugium further downhill and closer to the Black Sea. On the contrary, there is remarkable pre-/post-LGM cultural continuity in West Georgia, e.g. as concerns the use of dyed flax for clothing (not wool, as in the Levant, nor lime bast as in NW Europe). While the origin of flax domestication hasn't been established yet, a number of indicators point towards the SE Pontic.

You are correct that there are no archeological finds proving LGM human settlement in Central Asia. However, there is quite some indication from faunal and botanical studies that the upper Amu Darya basin formed an important LGM refugium. Not "cold and hyperarid", but possibly reasonably well rain-fed by Westerlies that could gather more humidity than today over a by then much larger Caspian Sea.

@Dave: Your two-pronged ANE westward expansion out of the Altai is an interesting idea. However, I am not sure whether the late glacial climate allowed for the southern prong - those Hindu Kush and Pamir passes are higher than 3500 m asl. If there was such a southern prong, it most likely opened up during the Mesolithic, which might already be too late to explain ANE in Iran-Hotu and Zagros_EN (WC1 etc.). Alternatively, one might consider a bifurcation in the Southern Urals/Samara area, with one group moving south to the Caspian Sea and beyond, and the other following the Volga up northwards.

Rob said...


Thanks, your points are all valid, and things might change in future. But the fact at present is that the area has been researched well, and settlement (even in the Amu Darya region) only picks up in the Mesolithic, and the contrast between MP and UP is also stark. So something was happening during the UP, and we need to ask where the Mesolithic settlers of CA came from if there is virtually no evidenced settlement between 33 and 14 kya. We can't toss this aside.


I have difficulties seeing a link between the Altai and post-LGM Europe. The earliest connection occurs with the appearance of Afontova-like industries in EE, microblades and such. But that is c. 10 kya, thus 4,000 years too late to explain Villabruna. Moreover, the way from the Altai to the Ponto-Caspian steppe might have been blocked by the post-LGM transgression of the Caspian Sea toward the Urals.

FrankN said...

Rob - "well-researched" is quite a euphemism as concerns Afghanistan. The site of Aq Kupruk, originally described in the 1950s, is just coming under re-investigation:

Barring detailed reports, from what I could gather it appears that Aq Kupruk III has evidence of human occupation some 20-15 ky ago, i.e. from directly after the LGM.

North of the Amu Darya, a number of sites have been recently discovered that put into question the traditional belief of a widely depopulated pre-LGM Central Asia. A good overview on the Kulbulak bladelet tradition, ranging from Afghanistan to Kazakhstan, and C14-dated to as late as 23-21 ka BC uncal. (Dodekatym II, UZ), is found in the link below. The Kulbulak tradition bears parallels to the UP Baradostian Culture in the Central Zagros, predecessor of the Zarzian. Moreover, apparently the pre-LGM Kulbulak tradition blends quite seamlessly into the Central Asian Mesolithic, leading the authors to conclude:
"Because no absolute dates of Mesolithic sites excavated in the late 20th century are available, and several sites were attributed to that stage solely on the basis of the microblade and microlithic technique, the discovery of developed microliths in the Upper Paleolithic (Dodekatym-2) requires a revision of earlier interpretations of “Mesolithic” sites. In the first place, absolute dates are needed. When they have been obtained, Upper Paleolithic (specifically Kulbulakian) sources of local Mesolithic cultures will hopefully become more apparent."

Rob said...

Awesome, thanks Frank

ryukendo kendow said...

@ Shaikorth

Oh I don't know Shaik, I've seen stats that MA-1 actually demonstrates Papuan shift elsewhere, which is reflected in ADMIXTURE as well. It may well be the case that the Onge signal underestimates the Papuan shift. It's pretty interesting because EHG demonstrates only East Asian shift, and SHG only Native American shift, i.e. no ENA shifts at all. Which seems almost to suggest a history/timeseries of contact with changing ENA across Siberia.

While the overall clading is more or less done, I think the situation may be more complex at the margins. That does raise questions of how we can reliably identify minor % ancestry in component-defining samples though.

Nirjhar007 said...

On a different note, very interesting study-

Davidski said...


To test whether the Natufians, Levant_N, Anatolia_N and early European farmers really do form a cline, I basically reproduced the usual PCA of West Eurasia by plotting two D-stats against each other: D(Mbuti,AG3-MA1)(Mota,X) vs D(Mbuti,Bichon-Villabruna)(Mota,X).

So I reckon the answer is yes, although the Anatolians are indeed a bit off cline, and pulling towards CHG, probably because of low level CHG admixture, which we discussed here...

Also, it seems like groups rich in WHG plot above the diagonal, while those rich in ANE below it.

Again, the earliest Iranian farmers look very unique. WC1 looks extremely basal. Hope that's not a technical artifact here, but probably not, because it matches my genotype PCA and the one from the Broushaki et al. paper.

I'll add these to the post.
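
For readers less familiar with the stats being plotted in the comment above, here is a minimal sketch of how a statistic of the form D(A,B)(C,D) can be computed from per-SNP allele frequencies, using the commonly used normalised form. It is an illustration under stated assumptions, not the pipeline used for the graphs: `d_stat` is a hypothetical helper, the frequencies are toy values, and the sign convention (positive when B shares more alleles with D than with C) is assumed.

```python
import numpy as np

def d_stat(p_a, p_b, p_c, p_d):
    """D(A,B)(C,D) from per-SNP allele frequencies (one array per population)."""
    p_a, p_b, p_c, p_d = (np.asarray(p, dtype=float) for p in (p_a, p_b, p_c, p_d))
    # Numerator: covariance-like term between the A-B and C-D frequency differences
    num = np.sum((p_a - p_b) * (p_c - p_d))
    # Denominator: normalisation that keeps D within [-1, 1]
    den = np.sum((p_a + p_b - 2 * p_a * p_b) * (p_c + p_d - 2 * p_c * p_d))
    return num / den

# Toy frequencies for four SNPs (NOT real data)
mbuti = [0.0, 0.0, 1.0, 1.0]
ag3 = [1.0, 0.0, 1.0, 0.0]
mota = [0.0, 0.0, 1.0, 1.0]
test_pop = [1.0, 0.0, 1.0, 0.0]   # identical to ag3 in this toy case

# One axis of the plot would be D(Mbuti,AG3)(Mota,X) for each test population X;
# the other axis would use Bichon-Villabruna in place of AG3.
x_value = d_stat(mbuti, ag3, mota, test_pop)
print(x_value)
```

Each test population then becomes one point, with the AG3-MA1 stat on one axis and the Bichon-Villabruna stat on the other.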

Davidski said...


On a different note, very interesting study-

Yeah, fascinating.

But why isn't Dienekes posting about the ancient Iranian DNA? Maybe because his whole Gedrosia PIE angle now looks about as likely as OIT.

Davidski said...

By the way, Matt, if you're feeling energetic today, how do these stats match my K7 Basal-rich + AG3-MA1 and Basal-rich + Villabruna components?

Matt said...

@ Davidski:

I added a couple of projected clines onto those graphs, and also put those two stats in PCA mode with the clines added.

I do still think that, if you're running a model with WHG, then straight up Levant+WHG doesn't quite seem to work for these stats with D(Outgroup1,ANE/WHG)(Outgroup2,Test). The cline seems to reach back to points between either Levant_N and CHG or Levant_N and Iran_N (albeit closer to Levant_N).

Though I'm not saying the Iran_N model is what actually happened! Rather partial CHG, like you say, seems a bit more geographically likely. It's more I'm trying to understand why Lazaridis's models turned out as they did, rather than straight Levant+WHG.

Also at the same time, using the double outgroup stats, albeit limited to only EHG and Bichon, Hungary_HG has a different position on that cline, so that could contribute (Levant+Hungary_HG+CHG, mainly Levant and WHG, models Anatolia better in ANE vs Bichon/WHG than Levant+WHG, and geographically seems quite likely?).

By the way, Matt, if you're feeling energetic today, how do these stats match my K7 Basal-rich + AG3-MA1 and Basal-rich + Villabruna components?

Sure thing:

AG3-MA1 vs K7 ANE:

Villabruna-Bichon vs K7 Villabruna:

AG3-MA1+Villabruna-Bichon vs K7 Basal Rich:

All the above stuck in a big dumb correlation PCA (which should ignore different scales on the stats):

Least correlated seems like AG3-MA1 vs K7 ANE, where particularly Iran_N but also CHG and recent Iranians seem like the strongest forces in breaking correlation.

Correlation of the asymmetry in AG3-MA1 to Villabruna stats with K7 ANE:

The correlation is good, but note Iran_N and CHG seem to have quite a low level of preference for ANE (none in the case of CHG).
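
The "big dumb correlation PCA" mentioned above can be sketched as follows. This is a minimal illustration, assuming the usual meaning of a correlation PCA (each column is z-scored first, so stats on different scales get equal weight). `correlation_pca` is a hypothetical helper and the input is random placeholder data, not the actual stats.

```python
import numpy as np

def correlation_pca(X):
    """PCA on z-scored columns, i.e. on the correlation rather than the covariance matrix."""
    X = np.asarray(X, dtype=float)
    # Standardise each column so that its scale no longer matters
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U * S                       # sample coordinates on the PCs
    explained = S**2 / np.sum(S**2)      # share of variance per PC
    return scores, Vt, explained

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))                          # placeholder: 20 populations, 4 stats
X_rescaled = X * np.array([1.0, 100.0, 0.01, 5.0])    # same stats on very different scales

_, _, var_a = correlation_pca(X)
_, _, var_b = correlation_pca(X_rescaled)
print(np.allclose(var_a, var_b))
```

Rescaling any column leaves the result unchanged, which is the sense in which this kind of PCA ignores the different scales of D-stats and ADMIXTURE proportions.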

ryukendo kendow said...

Nice plots Matt!

To play devil's advocate, the similarity to Villabruna-Bichon in formal stats will be linearly correlated to actual proportion of WHG ancestry only when all the WHG ancestry in the pops we are comparing, whether in Natufian, Anatolian, or MN European, split off from Bichon-Villabruna at a similar depth. Otherwise there will be 'excess' dragging towards Villabruna in populations with WHG that split from Villabruna very recently. Maybe the cline from Anatolian to Euro EEF has increasing WHG ancestry in those Euro EEF that split off at a shallower depth, while the cline from Natufian to Anatolia_Neolithic has WHG that split off at a deeper depth, which will cause two clines with different slopes to appear in the plot.

Not saying this is what happened, it's a possibility.

Davidski said...

Holy crap, this is awesome.

There's a clear difference between MA1 and AG3; MA1 is a horrible reference for the Iranians, while AG3 is pretty good.

But it looks like we're missing an ancient sample along the MA1 to AG3 cline, but beyond AG3, that contributed ancestry in a big way to the Near East and Caucasus.

Btw, I removed WC1 from this analysis; he was missing markers and generally looking a bit wobbly all by himself.

Davidski said...

Also check out how on the Bichon-Villabruna/MA1 plot the European Steppe_EMBA samples love MA1. Amazing affinity.

Nirjhar007 said...

But why isn't Dienekes posting about the ancient Iranian DNA? Maybe because his whole Gedrosia PIE angle now looks about as likely as OIT.
Maybe he knows something we don't, just arranging a big post. I know you are feeling the suspense...

Davidski said...

Just arranging a big post.

Arranging his underwear, more like it. He'll need a fresh pair of boxers when he sees ancient genomes from Bronze Age Greece.

Nirjhar007 said...

Are you sure that you will not be needing a fresh pair of boxers when you see ancient genomes from Bronze Age Greece?

The importance of the Mycenaean aDNA is huge for the IE issue.

Olympus Mons said...

what is your expectation regarding Bronze age Greece?

Nirjhar007 said...

I think the Mycenaean genome, if of good quality, will be a cornerstone for decoding the IE puzzle.

Chad Rohlfsen said...


Didn't you see the LN Greek sample? It's more Levantine like than Anatolians. Only a big chunk of LN EBA European can get you to modern Greeks.

Olympus Mons said...

@Chad Rohlfsen
That is something I was also thinking... Greek populations by the Bronze Age, according to time, space (geography) and what we currently know anthropologically, should be something like Chalcolithic Anatolians with a massive insertion of Steppe Chalcolithic and Eastern Europe Chalcolithic/Copper Age genomes.

Aren't you saying that if Mycenaean is very close to the Late Copper/Early Bronze Age Steppe genome then IE is from the steppe? Or are you saying that Mycenaean will be so close to later European IE genomes (Bronze Age Europe) that they might be the source?

Note: does anyone know of a Balkans Chalcolithic/Copper Age genome?

Nirjhar007 said...


We first need to see where the BA Greek genome takes us. Ideally, some 3rd millennium BC samples from Greece are also needed, but without a doubt a massive revelation is coming.

Olympus Mons said...


Just to be clear. I call Yamnaya the Mr 1000KM. So, from some point in the north shores of Black Sea, they moved a radius of 1000KM and left traces.

Yamnaya is not! not Mr 3500KM, people that reached Iberia, or Spain, or the British Isles. No way, not without lots of kurgans in the pathway or by using UFOs to travel.

But we will see.

Nirjhar007 said...


I am saying that Mycenaean aDNA will be a good paradigm, to filter out what is IE and what is not.

Nirjhar007 said...

To the interested researchers,

There is an interesting new article :
Population history in third-millennium-BC Europe: assessing the contribution of genetics
Marc Vander Linden
Several recent high-profile aDNA studies have claimed to have identified major migrations during the third millennium BC in Europe. This contribution offers a brief review of these studies, and especially their role in understanding the genetic make-up of modern European populations. Although the technical sophistication of aDNA studies is beyond doubt, the underlying archaeological assumptions prove relatively naive and the findings at odd with more ‘traditional’ archaeological data. Although the existence of past migrations needs to be acknowledged and fully considered by archaeologists, it does not offer either a robust explanatory factor or an enduring platform for interdisciplinary dialogue between archaeology and genetics. Alternative hypotheses are briefly explored.

KEYWORDS: aDNA, Yamnaya, Corded Ware, Bell Beaker, demography,

Strandloper said...

According to Google Maps, it would take 45 days to walk from Samara to Lisbon.
On horseback substantially less time.

A group of raiders on horseback could easily traverse that distance in less than a month, maybe just a couple of weeks.

epoch2013 said...


" Although the technical sophistication of aDNA studies is beyond doubt, the underlying archaeological assumptions prove relatively naive and the findings at odd with more ‘traditional’ archaeological data."

The recent findings of aDNA studies are only at odds with archaeological models that were created after the second world war, the so called "pots, not people" doctrine. The DNA findings rehabilitate the pre-war archaeological models.

Olympus Mons said...

Hence what you are saying is that all this aDNA context we all talk about here is horseshit, right? If Google says that it is 45 days walking from Samara to Lisbon, then finding inhumations in one place has, per your words, absolutely no relevance for ancient population genetics.

It's a 5 day walk from the Caucasus to Ukraine. Strange it took 2 thousand years for agriculture to get from one point to the other, right?

So, all the R1b found in Yamnaya could be guys on a hunting party? Say coming from China for all we know.
Well, it's your point of view.

Olympus Mons said...

and Strandloper,
you need to look at a hypsometric map of Europe!

It's hundreds upon hundreds of rivers and mountain ranges, one upon the other, upon the other over and over, each of them able to have kept even DNA differences in populations for millennia upon millennia.

From Samara to Lisbon in those days, only on board UFOs. And the funny thing is that once there, people promptly changed their ways! Amazing. Even started to bury their beloved in a different manner (screw kurgans). What was that? A bacterial infection that changes behavior?

Matt said...

@ Ryu: Otherwise there will be 'excess' dragging towards Villabruna in populations with WHG that split from Villabruna very recently.

Yes, quite possible and actually possibly the Hungarian HG may have some of that tendency.


@ Davidski, is it possible to get a D (Mbuti, Ust Ishim) (Mota, X) and D (Mbuti, Kostenki14) (Mota, X) for these sets of populations?

I was having a look at simple graphs using some of the double outgroup stats you made before using Ust Ishim (I think D(Mbuti, Ust Ishim)(Chimp, X) ) compared to Bichon:

So it sort of recreates the typical West Eurasia PCA, pretty noisily. That seems to make sense in that Ust Ishim stands in for ANE.

But, if I'm thinking correctly, what that also means is that Basal Eurasian should decrease on the West->East vector on West Eurasia PCAs and not just / less so on the South->North direction. Taken literally I guess that stat would show Lezgins should have the same amount of "Basal", roughly, as Sardinians, and Chechen in the same neighbourhood as Orcadian / Belarusian, taking only the Ust Ishim stat as proxy for level of Basal Eurasian.

Those outgroup stats have some odd outliers (e.g. EHG is registering as having low sharing with Ust Ishim) though, so I was wondering if a D (Mbuti, Ust Ishim) (Mota, X) stat might be smoother and make more sense because of lack of using the Chimp group.

I'd be interested to compare the stat for Ust Ishim with the sum of the stats for D (Mbuti, AG3) (Mota, X) and D (Mbuti, Bichon) (Mota, X), to see if there are any outliers, which would imply some form of non-Basal Eurasian ancestry other than AG3 / Bichon (ENA being the expected one).

FrankN said...

Matt: "The cline seems to reach back to points between either Levant_N and CHG or Levant_N and Iran_N (albeit closer to Levant_N)."

For all we know, the "Neolithic package" was probably assembled around the Euphrates bend, a bit east of Aleppo. It drew on exploitation of steppe cereals native to the area, other cereals from the Southern Levant (Jordan valley), Zagros goat/sheep domestication, cattle domestication most likely from E. Anatolia (Çayönü), and pig domestication possibly from the same area (but note "specialised pig hunting" in Satsurblia, where pigs account for 71% of all faunal remains).
The area lies at the junction of major obsidian roads, which facilitated access to the above-mentioned components, and subsequent spread of the "full package" in all directions.

Geographically, we are talking about the junction of the Zagros-C. Anatolia and Levant-E. Anatolia-Caucasus roads. Once we get PPN aDNA from that area, I am confident it will plot close to the junctions of the Levant-CHG and Zagros-C/W. Anatolia clines, which is just where the lines in your plots lead to.

On a more general note: With all the NE aDNA coming out, we may need to think about a clearer nomenclature to avoid confusion. I suggest in this regard:

a) Doing away with the "Iran" label - the country is too large and genetically differentiated, now and then, to be subsumed under a simple country label. Instead, I vote for:
- Zagros EN (ZEN) for WC1 & other Broushaki samples,
- Zagros LN (ZLN) for Lazaridis' Iran_N,
- Zagros Chalc (ZCH) for Laz's Iran_Chalc,
- S. Caspian HG (SCHG) for Hotu (and keeping the SC label for everything else we may get from that area),

b) Anatolia - the same. We need to differentiate between West/Central/East (in future possibly also North). So, I propose:
- Aegean_Neolithic (AEN) for what we used to call Anatolian Neolithic (but which, acc. to Hofmanova, appears to extend to Greece also),
- Cappadocian Neolithic (CDN) for the new Broushaki samples.

c) Levant - there will most likely be a North-South differentiation. Let's relabel Laz's "Levant" as "Jordan River".

d) The same most likely will need to be done for Iberia and Germany. As concerns the latter, we traditionally have only been dealing with Elbe-Saale aDNA, but we already have had some CWC and BB aDNA from Southern Germany coming in, which may require a more differentiated look.

Strandloper said...

@Olympus Mons

My only point is that it is physically possible for people to travel that far in a summer season.

The river crossings and mountain passes would have been already established.

Olympus Mons said...

Man. After this comment it's not just beer if you ever drop by Lisbon... It's the full lunch package. With a Touriga Nacional red wine and all.

We cannot take one single sample and extrapolate to 300,000 people. In Iberia by the Copper Age I figure there would have been 3 distinct populations. And that must be true for every single region of West Eurasia.

Olympus Mons said...

I didn’t want to be rude, sorry. I know what you mean.
But like most people here I truly think there is a problem, a conceptual problem, regarding time and geography. Always remember: it took 6,000 years for agriculture to go from Anatolia to Iberia or the British Isles… and Google says it’s a 40 day walk!

So, when someone tries to convince me that A is related to B over 1000 miles away and doesn't allow 1000 years in between, as a rule of thumb, I just think they are talking about Westeros, the Iron Throne and Lannisters… not about real world events.

postneo said...

Yes, Google Maps assumes paved roads, a commercial network/support system that sells water, food and lodging, GPS satellites, and foreknowledge of where Samara and Lisbon are.

It took Alain Bombard only 3 months to cross the Atlantic, but still it took modern humans like Columbus or the Vikings a bit longer to cross the Atlantic, like 100,000 years after they left Africa.

The flight to the moon did not take much time either.

@Olympus Mons
Apparently bacteria in yoghurt can lower social anxiety among teens!

postneo said...

Real people would need to take a break from work or survival grind to trek 45 days. I would be very willing.

I think that's the real constraint. Columbus went through a bunch of funding rounds for his gig. That's the real enabler.

postneo said...

Mycenaean DNA may be the first ancient sample connected to an attested IE language, right? My bet is that it will indicate that IE speakers were genetically heterogeneous.

Rob said...

I think the genome-wide study on Mycenaeans is still a little way off, unless the upcoming study on the (central) European BA also sampled some of Greece.

Yes, I think we can already see multiple movements into (? as well as out of) Greece and the Balkans in the Copper Age. It'll be fascinating to put it into the broader context.

Strandloper said...

The Appalachian Trail is around 3500 km long. The fastest thru-hike time is 46 days.
If you leave in the beginning of May then you can get there by the end of September, no problem.

Groups of well armed men on horseback could spend their summer raiding villages and farms for food and supplies as they kept moving along. Eventually they would hit the coast.

capra internetensis said...

Neolithic farmers had to adapt to different latitudes and climates all along the way. Obviously the speed of migrations can vary wildly depending on circumstances.

The Thule Eskimos appear to have covered the ~4000 km from Alaska to Greenland in a matter of decades, and that's a heck of a harsh environment. Obviously proper steppe nomads could get around, e.g. the Kalmyks who left Russia in 1771 travelled from Kalmykia to Dzhungaria in half a year. During the conquest of Siberia the Cossacks expanded from the Urals to the Pacific in about 60 years, a distance of around 7000 km by modern highway. This is the historical period, but we are still talking about people travelling across difficult, unknown territory by foot and horse and wagon.
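
To put the figures being thrown around in this thread on a common footing, here is a minimal sketch of the arithmetic. It uses only two of the distance/time pairs quoted above (the Appalachian Trail record and the Cossack expansion), and `rate` is a hypothetical helper name.

```python
def rate(distance_km, duration):
    """Average speed as distance per unit of time (the unit is whatever `duration` is in)."""
    return distance_km / duration

# Figures as quoted in the comments above
trail_km_per_day = rate(3500, 46)       # Appalachian Trail, fastest thru-hike, in days
cossack_km_per_year = rate(7000, 60)    # Urals to the Pacific, in years

print(round(trail_km_per_day, 1), "km/day")      # 76.1 km/day
print(round(cossack_km_per_year, 1), "km/year")  # 116.7 km/year
```

The two numbers are in different units and describe very different things (one athlete versus a multi-generation expansion), so they bracket what is physically possible without saying what actually happened.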

Rob said...

I fail to see the relevance of loosely applied false analogies, such as modern hipster trekkers or a specially bred, 18th century military caste.

CWC and Yamnaya can be described as semi-pastoral at best, with rather scarce evidence for horse riding. In fact, they probably pulled their wagons with oxen, and specialised in cattle herding.

Given that Yamnaya began 3300 BC, but we have (to date) no evidence of any steppe admixture in Iberia until 1700 BC (I'm aware we don't yet have actual BB samples), we don't need to make baseless speculations about imagined "mounted warriors" darting across the length and breadth of Europe, although it does make for a good Sci-Fi flick.

Davidski said...

It's rather unlikely that we'll ever get genomes from the first migrants to Iberia who carried steppe ancestry, and it's extremely unlikely that Iberia_BA ATP9 was one of these people.

In other words, right now the Late Bronze Age is the latest that steppe ancestry arrived in Iberia, but it need not also be the earliest.

Rob said...

Yes we can

We're not talking about *the first man* to set foot in Iberia, we can work with a 100 year period.

Just sample an early Beaker burial from the north of Spain with single inhumation (as I'm sure the BB study underway is doing). Steppe admixture will appear at 2400 BC *at the earliest*, which means they had 600+ years to meander toward Iberia.

epoch2013 said...


OK, Let's assume your guess is right: 2400 BC is the first Yamnaya. IIRC the first BB cultural signs are 2900 BC and the BB phenomena is supposed to have originated in Iberia to expand eastward.

Do you think the Iberian origin hypothesis of BB is wrong? Or more complicated than we think and is BB not, or only partially explained by migrations? The fact that the "pots, not people" doctrine has been debunked on most occasions does not necessarily mean it is irrelevant in all occasions.

capra internetensis said...


The point is that *under the right circumstances* people can migrate very quickly without modern technology or infrastructure, not to argue that the movements of Copper Age prospectors/pastoralists/whatevers closely resemble those of 17th century fur traders.

Are you suggesting that 1 km/year is in fact a generally applicable rule? That the spread of steppe genetic influence to Iberia should be modelled as a glacial advance across Europe?

Aram said...

The FN Greece Klei10 looks only Levantine shifted if it is compared to the Barcin EN. But if we compare it to EN Greece Rev5 then an extra shift toward CHG & Kum is visible.

Aram said...


What do you think about the Lake Van region? The map in Broushaki et al. showed that the Neolithic picked up late there, some 3000-5000 years later than in the neighbouring Fertile Crescent.

Davidski said...

I can't see any CHG per se in Klei10. The shift on the PCA also doesn't look very strong. That sort of shift is well within the range of individual variance.

Aram said...

It is easy to predict what will happen in Chalcolithic Greece. We will see a strong surge of CHG. But what will happen in Bronze Age Greece is much harder to imagine.

Davidski said...

Chalcolithic Greeks will cluster with or near Anatolia_Chalcolithic.

But today, Greeks are much more northern than that. And not all of this northern shift can be attributed to Slavic incursions, especially not in Crete.

ryukendo kendow said...

David, which two samples are these?

Aram said...

Difficult to disagree with you, Davidski.
I link that Chalcolithic Greece with the Minoans, who may have some relation to the Hurrians.

Matt said...

@ Davidski, thanks for that.

When I drop the D(Mbuti,Ust_Ishim)(Mota,X), D(Mbuti,Bichon)(Mota,X), D(Mbuti,AG3)(Mota,X) all in a PCA, in dimensions 1 and 2, Ust Ishim just points squarely forward on PC1 ("north"):

There's only a very, very small degree to which the D(Mbuti,Ust_Ishim)(Mota,X) is not explained by sharing with D(Mbuti,Bichon)(Mota,X), D(Mbuti,AG3)(Mota,X), seemingly being slightly higher for Caucasus and early farmer populations than expected for the two stats (while Levant_Neolithic seems lower than expected).

But that's very much to a small degree and the PC1 of D(Mbuti,Bichon)(Mota,X) and D(Mbuti,AG3)(Mota,X) is very much correlated with the D(Mbuti,Ust_Ishim)(Mota,X). So the two between them seem to signal Ust Ishim (and so, in theory, Basal Eurasian) quite well.

K7 Basal Rich vs D(Mbuti,Ust_Ishim)(Mota,X):

Rob said...


The Minoans were based in Crete.

It would be incorrect to claim that all of Chalcolithic Greece was "Minoan".
The mainland probably spoke different language(s).

Aram said...

"It would be incorrect to claim that all of Chalcolithic Greece was "Minoan"
The mainland probably spoke different language(s)"

Very probable. Crete has its own peculiarities on the Y DNA level.

Rob said...

@ Capra

"The point is that *under the right circumstances* people can migrate very quickly without modern technology or infrastructure, not to argue that the movements of Copper Age prospectors/pastoralists/whatevers closely resemble those of 17th century fur traders.

Are you suggesting that 1 km/year is in fact a generally applicable rule? That the spread of steppe genetic influence to Iberia should be modelled as a glacial advance across Europe?"

Ha of course not, as we know steppe admixture was already present in 2600 BC central Germany (c/- BB samples); and even the 'sedentary early farmers' moved into Europe quickly - at some points - which went beyond 'demic diffusion', but actually involved active, goal-oriented praxis.

Which leads me to my next point: the advance of steppe-like ancestry was punctuated in Europe, & not a sweeping east-to-west movement. Quite obviously it moved from the steppe to north-central Europe quickly (probably because it was a flat plain, with lower population densities), but then took some time to diffuse further south into Italy & Iberia (as we can see, or rather not see, from the Remedello (3200-2000 BC) and Iberian samples we currently have). So I submit that north-central Europe was a "secondary homeland" for these steppe folk.

So my caution wasn't against human mobility, which would be an absurd position, but against some of the comical remarks above (not yours) about fantasized horse-people making a conquest of western Europe in 45 days.

Rob said...

@ Epoch

"OK, Let's assume your guess is right: 2400 BC is the first Yamnaya. IIRC the first BB cultural signs are 2900 BC and the BB phenomena is supposed to have originated in Iberia to expand eastward.

Do you think the Iberian origin hypothesis of BB is wrong? Or more complicated than we think and is BB not, or only partially explained by migrations? The fact that the "pots, not people" doctrine has been debunked on most occasions does not necessarily mean it is irrelevant in all occasions."

I don't quite understand your question, Epoch. I didn't state the first Yamnaya is from 2400 BC. Yamnaya began 3300 BC in the Black Sea region. What I was actually hypothesizing is that we won't see any steppe-like admixture in Iberia before 2400 BC, or Italy for that matter.

I can't pretend to solve the issue of BB dating, as even specialists in the field continually debate it. There have been problems with over-dating in Iberian archaeology, as well as issues with what actually counts as Bell Beaker. My understanding is that BB comprised a "package" of items, which only came together in central Europe (say, Germany). Some of these items came from Iberia, moving north, others came from the East (steppe, Balkans), and some from Italy, culminating c. 2400 BC.

It remains to be seen what happens next, but there is a theory that the 'eastern faction' (=the R1b dudes) then began to dominate the BB network, marginalized other lineages, and from central Europe, expanded south into Iberia, as well as back east into former CWC territory (to a partial extent). But we only have German/ Czech BB, so it'll be interesting to see Iberian & Italian BB too, to see if this is what really happened, or whether all the genetic exchange was mediated by female exogamy (with the rise of R1b in Italy & Iberia related to different, later events).

Davidski said...


Maros RISE374
Hungary_MBA RISE349

Davidski said...

OK, I posted a new datasheet and graphs to correct the problem caused by the missing markers in the modern samples. Things should look a lot better now.

Nirjhar007 said...

Rob and others,

I am not sure about BB, but surely CWC, as has perhaps been suggested many times, is not a good fit as an "offshoot" of Yamnaya; we have to imagine a hypothetical Proto-Yamnaya kind of movement for that. The earliest dates of CWC, as you have also pointed out before, come from the Central European area.
CWC is itself quite diverse and not exactly uniform, and if I'm not wrong so is BB.

Even for the Yamnaya proposal, we have to imagine different sets of paternal groups and tribes: the hypothetical, yet to be proven R1a and of course the demonstrated R1b. But chances are weak that Yamnaya or the Yamnaya area was the source of CWC R1a.

Rob said...

It's just my own theory, but by "Central Europe" I meant east-central Europe, the south Baltic to middle Dnieper region.

Yes, archaeologists increasingly viewed CWC as a horizon, where diverse communities afford a common code and exchanged marital partners. Even physical anthropologists saw different skull shapes, etc. Whilst we might find some other lineages with more sampling, it certainly looks to be more or less an R1a-M417 thing.

Olympus Mons said...

@Rob and Epoch,
I just love it when you talk about Westeros, and Sansa Stark and Rob Stark… It's just that instead of saying Westeros you say Yamnaya. But that is all you are doing.

So, by 3500-3200 Iberia saw an upsurge of hundreds of thousands of people arriving. And apparently not hippies, because that is the only period where you actually find signs of heavy interpersonal violence: arrows in the back, lots of arms broken in defensive moves (parry fractures).
And just take small Portugal, not even the entire Iberian peninsula:
From 3200 BC places like the Mercador villages are raised with DEFENSE WALLS, and lots of horses and arrows are the trademark. And the horses are too old to be food, with no cut marks and no sign of heavy work, so they are interpreted as being for “SOLDIERING” (hey…3000 BC!).
By 2900 BC the people in Porto Torrão had built a city of about 400 ha, something the size of Ur, the huge city of the Sumerians. So there, and in Perdigões and nearby sites, there would have been over 100,000 people at that point. At the same time, up north, Zambujal and VNSP were already huge, huge. And strontium shows that lots of them were born in, say, the south and died in Zambujal, and so forth. So big. And, funnily enough, already by 2800 BELL BEAKERS. So lots and lots of people in a network that was not really beatniks, but fanatic about arrows and blades, as we see in Mercador and in Leceia (the soldiering place of Zambujal).

Then by 2500 along come Sansa Stark and the Lannisters and whatnot, so the super Yamnaya people, and there had to be bucketloads of them, right? Because it is not a hundred or a thousand that take on a land of hundreds of thousands who were already very defensive-wall kind of people.
But above all, it's strange that these new guys actually forgot all their standards and took up, lo and behold, a lifestyle pretty similar to that of the people already living in the region… hmm, where is that movie running?

Rob said...

@ O.M.

"By 3500-3200 Iberia saw an Upsurge of people of the hundreds of thousands arriving"

That sure is a lot of people. What kind of ships did they use to cross the Mediterranean ?

You're probably right, and your theory is indeed fascinating. You write "But leaving on different margin of the river this guys and girls developed a different mutation (M-73) of the R1b haplogroup".

So women have Y Chromosomes ?
I think you're onto something.

Gill said...

David, not sure I'm reading your edit right, but you think the source of AG3-like ancestry in Neolithic Iranians was closer to the ANE contributor to Native Americans, possibly actually AG3 related?


Arch Hades said...

Are these the same Iranians that Lazaridis was saying had a huge genetic impact on the Yamnaya? Doesn't seem like it.

Olympus Mons said...

Thanks, will correct it. There must be dozens of these sorts of mistakes/typos that I don't even notice anymore... it needs to be someone else. Once again, thanks.

Olympus Mons said...

The Bell Beakers,
In Iberia, they were the people who, after 500 years of bunkering in (3300-2800 BC), building settlement upon settlement with defense walls (those guys arrived in Iberia really scared of something, or grumpy) and cities to settle in, were the ones that started to move away from the fortified places. They started to go into open-air settlements. They were so confident by then, or just so skilled in defending themselves, that, like an explosion, by 2800 BC they moved away from the main walled centers and just settled outside (some pottery inside, some outside). Not sure if they were really an elite or were just people fed up with living “inside” who could not take it anymore.
It’s my personal opinion that the package arriving in Iberia (3500 BC) was made up of at least, at least, two different incoming peoples, the North African and the tallish, sturdy and Caucasoid race that lived in Merimde/El-Omari, mixed with local people (all those arrows in the back, and especially the excess of young males in the Chalcolithic, might mean something regarding young mtDNA H girls around).
Bell Beakers still had the same living standards as the previous period. It's just that they didn’t seem to need defensive walls anymore and were probably becoming very clannish again. They just moved to a quiet place and settled in groups of 20 or 30, and we still see agriculture, pastoralism and all the same lifestyle, but now they were able, using the same tools as in the previous way of life, to move to unexplored agro-pastoral places without the need for fortifications. And they kept on moving outside Iberia (2600 BC?)… and to believe that by just 2500 BC, 2400 BC they would be coming back INTO Iberia with different admixture and a different male patrilineage… is just pure madness!

Olympus Mons said...

@ Rob,
As per Katie Manning's work (see the MP4 video, awesome), hundreds of thousands of people were disappearing at the same time as the population increased by hundreds of thousands in Iberia. You see that increase from 3600 BC, as you see people flocking into the North African Oran region and the Western rift. So from 3600 to 3200 BC people were not crossing the "Mediterranean" but the Strait of Gibraltar (not exactly the same ring to it, is it?). And the Strait of Gibraltar was crossed by raft, or at those dates even a good swimmer could do it. So I don't know how good a swimmer those guys were, but I imagine they could build a raft, couldn't they?!
So for someone who believes Yamnaya (I imagine you don't think there were 5 guys, right?) could "fly" from the Black Sea to Zambujal in 100 years, just imagine what a population fleeing from death (the 5.9 kiloyear event) would do with a bitch distance like the Strait of Gibraltar...

Olympus Mons said...

And about boats: those hundreds of thousands of people flocking into the western rift in Morocco were not planning to sail or fish... they just wanted to get to the other side.
That other side is currently under water.
Actually, one of the interesting stories I am trying to follow is the place that some are claiming to be "part of Atlantis", which is currently under water at the right place for "crossers" and supposedly has very big settlements.

Alberto said...


It seems to me that Portuguese literature is in agreement with what I find in Spanish literature. To quote (*) a relevant part for what you're discussing:

Bell Beaker Chalcolithic
According to all syntheses (e.g. Soares and Silva 1998,2010; Jorge 2000; Cardoso 2007), important changes occurred at the end of the Chalcolithic, during the second half of the 3rd millennium BC. Most walled settlements and ditched enclosures were abandoned or contracted their perimeters. A return to more mobile strategies of land use seems to emerge out of the fragmentation of former sedentary settlement systems. The existence of long-distance exchanges is no longer as evident in the archaeological record as before. All of these changes become more perceivable when later Bell Beaker styles, whether local (Palmela style) or incised typologies, emerge in Southern Portugal. Most authors (e.g. Cardoso 2004; Soares and Silva 2010) suggest that this process will culminate in the emerging of warrior lineages at the beginning of the Bronze Age.

The Bell Beaker period starts around 2400 BC (earliest), and the Bronze Age in Iberia starts around 2200-2000 BC. I can't see the dates matching for an Out of Iberia scenario, nor can I turn discontinuity into continuity by inventing some psychological change in the people.

(BTW, the paper quoted also discusses the horse remains found in Southern Portugal. The pre-BB Chalcolithic from Zambujal (the biggest assemblage of faunal remains) yields the highest number of equids (E. ferus, most likely), at 315 specimens out of a total of 64,807 specimens identified. That's around 0.5%. Not exactly the Botai culture of the west.)
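
As a quick check on the percentage quoted above, the equid share in the Zambujal assemblage can be computed directly from the two counts given (315 of 64,807). The variable names are just illustrative:

```python
# Counts as quoted in the comment above (pre-BB Chalcolithic, Zambujal).
equid_specimens = 315
total_specimens = 64_807

# Share of equids among all identified faunal specimens, in percent.
equid_share_pct = 100 * equid_specimens / total_specimens
print(f"Equid share: {equid_share_pct:.2f}%")  # prints "Equid share: 0.49%"
```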


Samuel Andrews said...

@Arch Hades,
"Are these the same Iranians that Lazaridis was saying had a huge genetic impact on the Yamnaya? Doesn't seem like it."

The Iranians Laz thinks contributed are Chalcolithic Iranians who were basically the same as modern people in Iran and Iraq. IMO, Chalcolithic Iranians were about 40% Anatolia_N, 40% Iran_Neo, and 20% CHG.

Yamnaya was probably about 55% EHG, 30% CHG, and 15% EEF. Those are the numbers that make the most sense right now.

The reason Iran_Chl fits well as an ancestor of Yamnaya is that they have both CHG/Iran_Neo and EEF in them, while CHG just has CHG.

epoch2013 said...


Sorry, the line "your guess is right: 2400 BC is the first Yamnaya" should have read "your guess is right: first Steppe ancestry arrived in Iberia 2400 BC". Posted before my first coffee if that is any excuse.

The Rathlin sample showed clear affinity with an area in Germany but also slightly with Portugal. I could imagine (part of) the BB package originating in an MN culture, traveling eastward, being picked up by IE folk carrying Steppe ancestry and then expanding westward.

Matt said...

Sort of amusing myself here, but further experiment with these few stats:

Since they sort of recreate a West Eurasia PCA, I wondered about how you could use them to recreate population proportions.

One way that's probably not totally accurate, but which is quick to do, is to assume that the true values are similar to what is in the Basal-Rich K7, and then use a multiple regression equation on the 3 stats to estimate population proportions.

So when I did this, I got the following values:

Multiple Regression Estimated K7 vs Actual K7

Mostly very similar, particularly for the Basal Rich. Differences mostly seem to be: 1) most of the ME goes down, for that to encompass the difference between Iran_N's stats and everyone else's; 2) slight differences in AG3-MA1, Villabruna and Basal-rich for some populations, where the stats would predict more Villabruna and less AG3 (CHG, Iberian Chal/MN) or more AG3 and less Villabruna / Basal-rich (Anatolia_Neolithic, LBK_EN, Jordan_EBA).
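
For anyone wanting to try the same kind of thing, here is a minimal sketch of such a multiple regression, assuming NumPy. The numbers are synthetic placeholders generated inside the script, not the real D-stats or K7 values, and the names (`fit_component`, `d_stats`, `k7_component`) are made up for illustration. It fits one K7 component as a linear function of the three D(Mbuti,Ancient)(Mota,X) stats and returns the fitted estimates:

```python
import numpy as np

def fit_component(d_stats, k7_component):
    """Least-squares fit of one K7 component on the D-stats (plus intercept).

    d_stats: (n_populations, n_stats) array of D-statistics.
    k7_component: (n_populations,) array of K7 proportions for one cluster.
    Returns (coefficients, fitted_values); the last coefficient is the intercept.
    """
    design = np.column_stack([d_stats, np.ones(len(d_stats))])
    coef, *_ = np.linalg.lstsq(design, k7_component, rcond=None)
    return coef, design @ coef

# Synthetic placeholder data: 12 "populations", 3 stats (Ust_Ishim, Bichon, AG3).
rng = np.random.default_rng(0)
d_stats = rng.uniform(0.25, 0.35, size=(12, 3))
true_coef = np.array([-2.0, 3.0, -1.5, 0.4])  # arbitrary, for the demo only
k7_component = np.column_stack([d_stats, np.ones(12)]) @ true_coef

coef, estimated = fit_component(d_stats, k7_component)
print(np.round(coef, 3))
print(np.round(estimated - k7_component, 6))  # residuals, ~0 on noiseless data
```

In practice each K7 component would get its own fit, and the estimated proportions would probably need clipping and renormalizing so they sum to 1. The comment doesn't say whether that was done, so the step is left out here.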

Comparison of PCA on the estimates vs real values:

The main net effect is to shift CHG and Armenia_Chal/MLBA closer to the North Caucasus, shift Jordan_EBA and Anatolia_N closer to the present day East Med, change the shape of the intra-Europe and Bronze Age Europe clines...

(Average of true values from K7 and estimates on PCA is -

epoch2013 said...


Good read. The Iberian origin for BB as proved by dating is popular, though. See the first article in Similar but Different, a recent book on BB.

epoch2013 said...


And another thing from that article: In figure 7 it shows that pre-BB has few, but BB has abundant red deer remains. Red deer is undoubtedly game. That means a very clear behavioural change. It also shows in wild boar, not nearly as pronounced.

Olympus Mons said...

If you wanted to be honest in a discussion of those issues, you would not phrase or frame either of those two topics the way you just did. Which is fine. It's not like people here are pursuing “truth”.
Just wanted to make that clear. And it’s not a problem, because these are just comments.

*See, Zambujal has the largest faunal assemblage; those guys even consumed cattle that strontium tells us came from as far away as southern Portugal… And, amazingly, they ate every day, so if they didn’t eat horse (unlike Botai), horse remains should in fact be a very small part. It’s not rocket science, right? Secondly, Zambujal was “the capital” of the region. As in so many of these “capitals”, you do not find many horses, unlike the “soldiering” places that “protect” Zambujal: VNSP guarding the entrance near Muge (where the Tagus river is easily crossed on foot) and Leceia near the Tagus river, where it protects against anybody crossing the river. Both have more arrows and related lithics than any other place you see in Iberia.

*Let's take Leceia to clear up that Bell Beaker “dating” that you mentioned. First, let's establish some facts regarding just one place:

a. You do not contest that Bell Beaker pottery was found in Portugal between 2900 BC and 2800 BC. Correct?

b. You do not contest that a trademark of Bell Beaker, or rather the trademark after the pottery that gave them their name, was the “open-air sites” that propagated through Europe, and that this is actually seen in Leceia, transitioning from the fortified sites to the plains outside those walled places by the late 2700s BC. So you do not contest that the Bell Beakers in Leceia already had the pottery and the open-air sites with the full way-of-living package (arrows, blades, farming and pastoralism). Correct? So before 2500 you already have at least 4 out of 5 of the major components of Bell Beakers all over Europe. Right?

c. You also do not contest that in Leceia we already see a stratified, hierarchical and specialized society. So that is something we also see with Bell Beaker societies, correct?

d. Copper! Leceia runs from the 4th millennium BC to the early 2nd. Copper is in Leceia from the earliest stages. And most copper found in Leceia should actually be coming from Ossa Morena in Spain, which is also where the amphibolite to make hard tools came from. So well before Bell Beaker those guys were smelting copper. So at the initial spread of Bell Beakers those guys had already been importing copper and working it for a very long time. So, one more trait added to the departing stages of Bell Beaker. Right?

Actually I could go on… but you get the picture, right? So to you Bell Beaker is just the guys that buried their dead the same way as these Bell Beakers in Leceia, oh, but added to the graves the same tools and utensils that they already had in Leceia?

Oh I see. Really big difference.

Olympus Mons said...

Red deer was the game of the place. Once you move to a new place, you need to sustain yourself with higher levels of hunting. That has been true in Iberia since 3200 BC. Porto Carretas, one of the first (if not the first) sites to be erected on the left bank of the Guadiana when the arrows and carinated pottery people arrived in Iberia, shows a staggering amount of arrows, red deer and... horses!

Grey said...

if there were pre-existing trade routes i'd imagine individual traders/artisans or family groups could get from Crimea to Iberia by sea quite rapidly

all you need is for the PIE to have something to trade or some kind of skill in demand - connected to horses maybe

similarly something like horse traders gradually spreading west - if they're in small groups and not a threat they might be able to spread through the cracks like gypsies

so personally i won't be surprised if there's early Yamnaya-like ancestry in Iberia - not necessarily a lot though

ryukendo kendow said...

@ David

David, do you mind doing a qpAdm of these two samples? Just to check if any population at all in ancient Europe requires excess CHG.

Arch Hades said...

Thanks Samuel for clearing that up. I think Lazaridis' 2nd model with Yamnaya being 50% EHG + 50% CHG & Chalcolithic Iranian is probably most reasonable. I haven't seen much proof of any non-trivial Anatolian-Aegean farmer admixture in the Yamnaya. At least not in the Eastern Yamnaya we have so far. Another thing is that CHG and Chalcolithic Iranians are very similar to one another on PCA.

Arch Hades said...

David why do you think Dienekes will shit his pants at bronze age Greek genomes? Let me guess you think they'll be heavily steppe derived and have bloated EHG admixture levels? If that's the case, why do Bronze Age Armenians not show this bloated level of EHG or Euro HG ancestry? Slightly less than modern Greeks, it seems. Heck, they only scored like 14-15% "Yamnaya" or steppe in some of your very own formal runs when CHGs and ENFs were added into the mix. Considering Armenian is linguistically supposed to be closest to Greek, I'm a bit skeptical. I'm skeptical of any of the more southern IE-speaking groups having bloated EHG or even steppe admixture, especially Greeks, Armenians, and Anatolians. These places were all highly urban and densely populated in ancient times, unlike Northern Europe. I suppose, though, looking at some of those Mycenaean reconstructions (if they are accurate), they look pretty robust and non-gracile, which could indicate bloated EHG or WHG ancestry.

Karl_K said...

@Arch Hades

"David why do you think Dienekes will shit his pants at bronze age Greek genomes? Let me guess you think they'll be heavily steppe derived and have bloated EHG admixture levels?"

It doesn't matter if they even have only 5%. Even this disfavors his old favorite models of IE spread.

Davidski said...


Yamnaya is not 50/50 EHG/CHG, and that's not what Lazaridis et al. model it as. They model Steppe_EMBA (which includes Yamnaya) as 56.8/43.2 EHG/Iran Chalcolithic.

But I reckon my TreeMix/qpAdm models make more sense...

However, I believe that even my TreeMix/qpAdm models bloat the southern ancestry in Yamnaya, and something like 65/25/10 EHG/CHG/Balkan EEF is closer to the truth.

As for Dienekes, why do you assume that there's an EHG admixture threshold at which he'll shit his pants? I think you'll find that he'll shit his pants when the Mycenaean genomes come out because at least some of them will show some steppe-related input one way or another, and that will be that.

Davidski said...


I won't get time to model those Hungarians this weekend, but I wouldn't be surprised if they did have extra CHG from Mid/Late Neolithic or Copper Age Anatolia.

Alberto said...


Thanks for all the graphs. It's really nice that your estimates correlate pretty well with Davidski's K7. One thing that it made me wonder is how accurate it is to estimate that the Basal-rich component is about 50% True Basal and 50% Villabruna-like. I think that's probably the case, but it could be tested with the data here. So for example taking Caucasus_HG's values (your estimates):

Basal-rich/AG3-MA1/Villabruna: 46.76/36.94/16.24

(Villabruna + 50% Basal-rich) = 39.6. So 39.6 - 37 (AG3-MA1) = 2.6%

This would predict that the stats D(Mbuti;Bichon-Villabruna)(Mota;Caucasus_HG) - D(Mbuti;AG3-MA1)(Mota;Caucasus_HG) should be close to 0, very slightly positive: 0.2735 - 0.2738 = -0.0003 (minimally negative, but basically correct - CHG has about equal amounts of WHG and ANE).

The only one that would be (slightly) negative would be EMBA_Steppe pops. For example, Afanasievo:

(Villabruna + 50% Basal-rich) - AG3-MA1 = 42 - 47 = -5%

D(Mbuti;Bichon-Villabruna)(Mota;Afanasievo) - D(Mbuti;AG3-MA1)(Mota;Afanasievo) = 0.3059 - 0.3149 = -0.009

While Iberia_MN should be very positive:

(Villabruna + 50% Basal-rich) - AG3-MA1 = 81.5 - 1.8 = ~80%

D(Mbuti;Bichon-Villabruna)(Mota;Iberia_MN) - D(Mbuti;AG3-MA1)(Mota;Iberia_MN) = 0.3295 - 0.276 = 0.0535

So what I'm wondering is how well those values correlate in a plot with all the populations. If the correlation is good, it would be a nice confirmation that:

- The WHG/ANE ratios are correct (adding that 50% from Basal-rich to Villabruna)
- The Basal Eurasian estimates should also be correct (taking the other 50% of Basal-rich)

Do you think you could try to plot those values? (If you think it makes sense and have the time for it).
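
A minimal sketch of the check described above, using only the three populations and values quoted in this comment (Caucasus_HG, Afanasievo, Iberia_MN). It recomputes (Villabruna + 50% Basal-rich) - AG3-MA1 for CHG from the estimated proportions, then correlates the K7-based differences with the D-stat differences. Three points are far too few to mean much, so this only shows the shape of the calculation, not a result; the helper name `pearson` is made up for the example:

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Caucasus_HG estimates quoted above: Basal-rich / AG3-MA1 / Villabruna (in %).
basal_rich, ag3_ma1, villabruna = 46.76, 36.94, 16.24
chg_diff = (villabruna + 0.5 * basal_rich) - ag3_ma1

# K7-based differences (in %) and D-stat differences, as quoted in the comment.
k7_diff = {"Caucasus_HG": chg_diff, "Afanasievo": 42 - 47, "Iberia_MN": 81.5 - 1.8}
d_diff = {
    "Caucasus_HG": 0.2735 - 0.2738,
    "Afanasievo": 0.3059 - 0.3149,
    "Iberia_MN": 0.3295 - 0.276,
}

pops = list(k7_diff)
r = pearson([k7_diff[p] for p in pops], [d_diff[p] for p in pops])
print(f"CHG K7 difference: {chg_diff:.2f}%")
print(f"Pearson r over {len(pops)} populations: {r:.3f}")
```

With the full datasheet, the two dictionaries would simply hold one entry per population, and the same function would give the correlation for the whole plot.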

Davidski said...

I'm making a huge new datasheet with the new Anatolian samples to test correlations between the K7 and D-stats results. Should be ready later today.

Alberto said...


Great, thanks. I do think that the Basal estimates (at least the pattern) make more sense than the Laz 2016 ones. So it would be great if it could all be correlated with D-stats.

Alberto said...


So I did find one reliable date of Bell Beaker pottery in Portugal from around 2700/2600 BC:

If you read this paper, what do you think about it?

Matt said...

@ Alberto:

I think this is what you are looking for with the statistics:

First, just for the straight up K7 values:
1. (No extra Villabruna from Basal Rich):
2. 30% extra VB from BR:
3. 50% extra VB from BR:

Second, for the estimates I got from using the D(Mbuti;Ancient)(Mota;X) stats multiple regressed against K7 values:
1. (No extra Villabruna from Basal Rich):
2. 30% extra VB from BR:
3. 50% extra VB from BR:

(All at

TBH it does seem like adding any Villabruna proportions from Basal Rich implies the stats should be in favour of Villabruna for some populations where they aren't.

But that might be because using these two stats this way is not as good as D(Bichon, AG3)(Mota,Test) instead, or because of some quality of Bichon / AG3.

(Also, for info, ran the same estimating experiment using the full list of D(Yoruba,Ancient)(UstIshim,Test) stats from the other datasheet Davidski put up and those are much the same, really. Same pattern of CHG seeming more Villabruna / WHG related, Anatolia_Neolithic seeming more AG3 related, while Jordan_EBA and the modern Caucasus and some Mediterraneans being less Basal_Rich, than they do in the K7 proportions.)

Olympus Mons said...

Give me 15 minutes and I will give you a reply from the Leceia fortified place itself! I will take the laptop and answer directly from the place itself. If the gate is open I will even sit in the FM hut itself while typing. :)

Rob said...

..........don't forget your Wifi

Alberto said...


Yes, that's what I was interested in seeing, thanks!

The stat being positive or negative will probably vary depending on the samples used (AG3 gives higher values than AG3-MA1 together, and probably Loschbour would give higher values than Villabruna-Bichon), so knowing the exact WHG/ANE ratio might be tricky.

But it's important that the values correlate very well when adding 30% or 50% of the Basal-rich to the Villabruna cluster. Especially in your estimates it looks really good (similarly good with 30% and 50%, which is strange, maybe because the best fit would be around 40%?).

And importantly, if the WHG/ANE ratios match this well, it should probably mean that the Basal Eurasian estimates are mostly correct too (if we assume that West Eurasians are largely a mix of these 3 components).

Davidski said...

Here's that new datasheet for checking correlations between the K7 and D-stats.

In the end I couldn't include those new Anatolian samples due to a slight lack of markers. But Anatolia_Neolithic is now Barcin_Neolithic in all analyses.

Just had a quick look, and the correlation for Iran_Neolithic between the AG3-MA1 cluster and AG3 affinity is basically perfect. But the inverse correlation between the Basal-rich cluster and AG3 affinity, or indeed AG3+Villabruna affinity, isn't as good.

So there's something missing for Iran_Neolithic in the D-stats. It seems like there's a component hiding somewhere in its K7 scores that isn't well proxied by AG3, Dai, Karitiana, Ust-Ishim or Villabruna.

In fact, if anyone's still wondering whether Iran_Neolithic has any ASI from South Asia, just plot the Basal-rich cluster vs Dai and that'll give you an answer; a big fat no.

Matt said...

@ Alberto, though note, the correlation with the stats should be a little better with the estimates, because the estimates are modelled on what is effectively a bit like a statistical compromise between what the stats say and the K7 values. Doesn't mean they're closer to correct (esp. as there could be *some* information that matters in some stats that are not in the Ust Ishim, Bichon, AG3 set), but they should be more consistent with the stats (since that's what they're from).

(Also, with the other estimates I was building from the D(Yoruba,Ancient)(UstIshim,X) stats, these are the values: (.csv) and the PCAs: Doesn't really affect the PCA but some bloating of SE Asian / Andamanese because the multiple regression has a tough time accounting for the SCA Asian populations. Assuming the Basal Rich in those is 60% Basal, then Basal would go from 40% Levant Neolithic and Natufians to 35% Iran Neolithic, 30% Anatolia and CHG, 20% Iberia_Chalcolithic then 10% Steppe EMBA, consistent with around 1/3 CHG into Steppe... If the estimates for Basal Rich are close and that's how much it has and if Basal Eurasian were a real people).
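
The arithmetic behind that last parenthesis can be written out as a short sketch. The 60% Basal share of Basal Rich is the assumption stated above, the Basal figures are the rounded ones from this comment, and the single-source reading (all Basal in Steppe EMBA arriving via a CHG-like source) is a simplification added here purely for illustration:

```python
# Rounded Basal Eurasian estimates from the comment (fraction of ancestry),
# obtained as 0.6 * Basal Rich under the stated 60% assumption.
basal = {
    "Levant_Neolithic": 0.40,
    "Natufian": 0.40,
    "Iran_Neolithic": 0.35,
    "Anatolia": 0.30,
    "CHG": 0.30,
    "Iberia_Chalcolithic": 0.20,
    "Steppe_EMBA": 0.10,
}

BASAL_SHARE_OF_BASAL_RICH = 0.6  # assumption stated in the comment

# Basal Rich proportion implied by each Basal estimate.
implied_basal_rich = {pop: b / BASAL_SHARE_OF_BASAL_RICH for pop, b in basal.items()}

# If all Basal in Steppe_EMBA arrived through a CHG-like source (simplification),
# the implied CHG-like fraction is the ratio of the two Basal estimates.
chg_fraction_in_steppe = basal["Steppe_EMBA"] / basal["CHG"]
print(f"Implied CHG-like share in Steppe_EMBA: {chg_fraction_in_steppe:.2f}")
```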

Matt said...

@ Davidski, I think that's correct that there is something missing from the D-stats that pulls Iran_N and Caucasus away from Basal-Rich without them actually necessarily having more ANE; as much as I'm sitting here running estimates on the basis that the D-stats can explain all the relatedness in K7 (and maybe add some extra information), seems likely that there's some extra relatedness between the CHG+Iranian Neolithic that pulls them together and is swept up into the AG3-MA1 component.

Also at the same time, probably something binding together the populations that get high K7 Basal Rich. Barcin_Neolithic, Levant_Neolithic, Jordan_EBA and the pre-Bronze Age European Neolithic look tighter on the K7 and genotype PCA than the stat differences for Villabruna/Bichon+EHG/AG3 place them at.

Maybe something that diverges the Basal side of the North Iranian+Caucasus HGs from the Basal side of Levant+Anatolian Epipaleolithic (Boncuklu and Natufians).

(Probably premature speculation - stats actually make me wonder whether there are some differences in the Euro HGs Iberia_EN and LBK_EN mixed with, with Iberia being more western WHG and LBK more like the Hungary-KO1 types with their possible EHG influence. Then they usually end up drawn closer together than they would appear just from the AG3/EHG vs Bichon/Villabruna stats due to the same source from the Boncuklu-Natufian related side of their ancestry).

One related thing I was wondering, although probably more for the post above, thinking about how the Gülşah Merve Kılınç paper made a big deal over how the Boncuklu share much more drift with one another measured by the f3 outgroup method, compared to later Anatolians and also European Neolithic, is anything like this possible with the Natufian, Levant_Neolithic and Iran_Neolithic samples, to compare? Or is the coverage too bad?

Particularly for the Natufians vs Levant_Neolithic it could be interesting, because the same Epipaleolithic situation is mirrored there (though I believe the shift in ancestry for the Levant Neolithic may be greater); whether you see the same pattern of increasing internal genetic diversity in the later populations and the same pattern of the Natufian HGs being quite more tightly related.

Olympus Mons said...


Alberto, I see where you are going. And no.
Look, Leceia was built from the outset as a military powerhouse. It was deliberately designed to be a military outpost from 2900 onward. Leceia had containers of arrows near the towers (just imagine the readiness for battle) and inhumations that nobody even cared to bury outside its walls.
So, whoever was building huts around it with Bell Beakers, believe me, was one of “them”. No exogenous group of people would ever get even close to Leceia. And those small Bell Beaker settlements were freely popping up around it. So: local people. Different from the other neighboring people that did not have Bell Beakers, but one of “them”.

What Joao Cardoso is saying in the paper is what I have been saying all along. Whoever got to Iberia by 3300 BC was not one people, but different peoples. It's just that it took several centuries for them to start diverging. I bet, I bet, that if one manages to sample Porto Carretas and Paraiso, and all the other “military” outposts built by 3200 BC, and then samples the later Bell Beakers of the second half of the 3rd millennium, who curiously made a point of reoccupying, after a long period of abandonment, the same “warrioring places” (Porto Carretas, Paraiso, etc.), you will find the admixture of the same Bell Beakers of Leceia; but if you sample other places around you will find the admixture of other peoples, and in yet other places the admixture of local Neolithic Iberia. So everyone should stop talking about Iberia as a monolithic place in the Chalcolithic, because that is just rubbish.

It's not “sexy” to talk about North Africa, but let me tell you: a very well populated North Africa, fleeing the horrific aeolian winds of the birth of the Sahara devastating everything in their wake, pushing people and their cattle year upon year north and into the Tassili mountains and other oasis places, made the population arriving in Iberia in the second half of the 4th millennium not a very patient or friendly people.

Olympus Mons said...

Unfortunately, because of a fire nearby, the firemen didn't let me pass. The fire was in Talaide and didn't get near the Leceia site...

Olympus Mons said...

A lesson to all,
Did you know that for a very long time (some) archaeologists thought that Bell Beaker had come from predynastic Egypt? For instance, because Bell Beaker decorations are so similar to Egypt's C-ware...
The things you learn every day!

One of these days we'll even discover that the still-active Egyptian legend and folklore that the peoples of Europe had their origins in a group of people that left from Heliopolis has some validity. :=)

Olympus Mons said...

And to finish this subject… it’s important to understand dates (time!).
I don’t think Bell Beaker culture was born in Zambujal, Leceia or VNSP. Nope. Remember: the oldest dates for Bell Beaker are clearly a bit north of that, in north Portugal (maybe even 2900, but for sure between 2900 and 2800). The next dates are in Leceia, Zambujal and the Lisbon surroundings. Then it shows up in Galiza (2600 BC), so at pretty much the same time as in the Lisbon area. Then it starts to pop up in other places. I know your confusion: in the Meseta (Spanish Estremadura, Madrid and all) it does not show up before 2500-2400. Even in the Portuguese northeast it shows up much later. So, those guys really went from north Portugal to Galiza, the Basque Country and straight out of Iberia into the rest of Europe in 300 years. At the same time, from that point, south of the Douro river (and this means a couple of mistakes in my Shulaveri thesis), they went ballistic all over Portugal, Spain and even north Africa, at a point in time when the ones that stayed behind were already worn out by warfare or whatever mess they got into that led them to build all those fortified settlements.

But make no mistake. Bell Beaker (pottery, arrows and copper, domesticated cattle and hierarchical and specialized tasks) … was pretty much there in the beginning in central and north Portugal. The wrist guards were an added element that, by the second half of the 3rd millennium, had already been added at least to the Meseta group, I suppose.

Olympus Mons said...

I swear this is the last one....
Sampling DNA from El Portalon or Atapuerca and thinking you know what admixture Iberia had in the Chalcolithic/Copper Age.... is just ignorance! The really interesting guys were everywhere but!

Davidski said...


Impossible to test the Natufians and Levant farmers individually with f3-stats. I'd need them to be more or less of comparable quality and real diploids.

Davidski said...


What do you make of this pattern?

Excellent inverse correlation between the Basal-rich cluster and AG3 affinity.

Poorer correlation between the AG3-MA1 cluster and AG3 affinity.


Excellent correlation between the Villabruna cluster and Villabruna affinity.

Poorer inverse correlation between the Basal-rich cluster and Villabruna affinity.

Why is the relative success of these results reversed for AG3 and Villabruna?

Chad Rohlfsen said...

I've got F3's for Iran, Levant, and Natufians using top quality genomes from each. The results are interesting. I'll post them soon. The results may show we need more samples from 32-8kya in Eastern Europe, the Caucasus, and Eastern Anatolia.

Chad Rohlfsen said...

Rather poor failures

Levant_EN1 Iran_EN1 Anatolia_EN1 0.010289 0.001905 5.400 317344
Levant_EN1 GoyetQ116-1 Anatolia_EN1 0.008580 0.002022 4.244 296503

Moderate failures

Levant_EN1 AG3 Anatolia_EN1 0.006530 0.002437 2.680 106757
Levant_EN1 MA1 Anatolia_EN1 0.005224 0.001997 2.616 262047
Levant_EN1 ElMiron Anatolia_EN1 0.005824 0.002024 2.878 247666
Levant_EN1 LaBrana1 Anatolia_EN1 0.006590 0.001749 3.769 347918
Levant_EN1 Bichon Anatolia_EN1 0.004616 0.001972 2.341 361173
Levant_EN1 Hungary_HG Anatolia_EN1 0.005240 0.002029 2.583 250762
Levant_EN1 Villabruna Anatolia_EN1 0.006535 0.001990 3.283 327372
Levant_EN1 Motala_HG Anatolia_EN1 0.006082 0.001517 4.011 360225
Levant_EN1 Samara_CA Anatolia_EN1 0.005687 0.001776 3.202 274472
Levant_EN1 Georgian Anatolia_EN1 0.005683 0.001152 4.931 196958

second lowest set of failures.. second one is interesting..

Levant_EN1 Loschbour Anatolia_EN1 0.003105 0.001848 1.680 361331
Levant_EN1 Kotias Anatolia_EN1 0.003380 0.001831 1.846 361266

Check out the least failing group...

Levant_EN1 Yamnaya_Samara1 Anatolia_EN1 0.001923 0.001457 1.320 362635

Now... the only one that was successful.. and missed in the paper with using just the best coverage samples.

Levant_EN1 Karelia_HG Anatolia_EN1 -0.000190 0.001686 -0.113 342440

Chad Rohlfsen said...

Fairly similar with Natufians, but none were negative. Here, Karelia_HG input is still better than WHG going from Natufian to Anatolia, and kind of significantly so (WHG is nearly twice as high on the f3), but Kotias and Georgians are the closest to succeeding.

result: Natufian1 GoyetQ116-1 Anatolia_EN1 0.020209 0.002342 8.629 172902
result: Natufian1 ElMiron Anatolia_EN1 0.016845 0.002281 7.386 145112
result: Natufian1 LaBrana1 Anatolia_EN1 0.012422 0.002034 6.108 202003
result: Natufian1 Loschbour Anatolia_EN1 0.011251 0.002029 5.545 209803
result: Natufian1 Bichon Anatolia_EN1 0.014029 0.002107 6.660 209428
result: Natufian1 Hungary_HG Anatolia_EN1 0.011934 0.002232 5.346 146154
result: Natufian1 Villabruna Anatolia_EN1 0.014401 0.002145 6.713 190259
result: Natufian1 Motala_HG Anatolia_EN1 0.009434 0.001614 5.847 208988
result: Natufian1 Karelia_HG Anatolia_EN1 0.007521 0.002070 3.634 198453
result: Natufian1 AG3 Anatolia_EN1 0.011190 0.002817 3.973 63739
result: Natufian1 MA1 Anatolia_EN1 0.014753 0.002205 6.691 152573
result: Natufian1 Iran_EN1 Anatolia_EN1 0.010939 0.002092 5.230 185583
result: Natufian1 Kotias Anatolia_EN1 0.006843 0.002071 3.304 209222
result: Natufian1 Samara_CA Anatolia_EN1 0.009358 0.001859 5.034 160030
result: Natufian1 Yamnaya_Samara1 Anatolia_EN1 0.008396 0.001651 5.086 210286
result: Natufian1 Georgian Anatolia_EN1 0.006196 0.001417 4.373 113431

Chad Rohlfsen said...

F3s are lining up with that qpAdm which showed much more CHG and EHG admixture going from Levant to Anatolia, vs WHG.

Davidski said...


This is a neutral Z score. Come on, you need something around Z -2 at least.

Levant_EN1 Karelia_HG Anatolia_EN1 -0.000190 0.001686 -0.113 342440
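For anyone following along, the Z score in this kind of output is simply the f3 estimate divided by its standard error. Here is a minimal Python sketch using the figures from the row above. The function name is made up for illustration, and the -2 cutoff is only the rough rule of thumb mentioned in this comment, not a fixed standard.

```python
def z_score(f3, std_err):
    """Return the Z score: the estimate divided by its standard error."""
    return f3 / std_err

# Figures from the Levant_EN1 / Karelia_HG / Anatolia_EN1 row quoted above.
f3, std_err = -0.000190, 0.001686
z = z_score(f3, std_err)
print(round(z, 3))  # -0.113

# Rough rule of thumb from the comment above (an assumption, not a standard).
looks_like_admixture = z <= -2
print(looks_like_admixture)  # False
```

A Z score this close to zero means the estimate is indistinguishable from zero given its standard error.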

Try Karitiana instead of Karelia_HG. It might improve the Z score. But it won't prove anything, because all you're doing is matching up some allele frequencies that kind of match the ancestral populations that make up Anatolia_EN.

The fact is though, Anatolia_EN is not a mixture involving Levant_Neolithic.

Chad Rohlfsen said...

It's not significant, but it is the only thing that doesn't fail. I think you need Levant. Neolithic Anatolians are more Levantine-like and more BE than the HG's of Turkey from the new paper. Remember, Anatolians are significantly closer to Iran than Levant. Even though Levant is closer to Iran in BE. WHG is also significantly closer to Levant than Iran, so it isn't WHG into Anatolia that causes the stat, but only ANE/EHG/CHG input that can. That with all the Dstats and qpAdm makes a great case.

Davidski said...

I reckon the Anatolians have minor CHG admixture and also ANE via their Villabruna-related ancestry.

None of the stats you posted contradict this. Sorry.

Karl_K said...

Hot off the presses!

"genome sequencing that revealed that farmers from India moved to Iran 7,000-8,000 years ago"

Davidski said...

These people are morons.

Chad Rohlfsen said...

But if all these groups are just your Basal rich and WHG, why is WHG significantly positive here, unlike EHG? There is a lot more going on than BE, plus WHG and minor CHG. I just don't think 14% CHG and 10% EHG is unreasonable. Stats are pointing to a more complex mix.

Davidski said...


Here are some f3 mixture stats for a few average Barcin Anatolian farmers modeled as mixtures of various foragers and the most basal Barcin Anatolian farmers.

Hungary_HG Barcin_Neolithic1 Barcin_Neolithic2 -0.002799 -2.101
Bichon Barcin_Neolithic1 Barcin_Neolithic2 -0.001471 -1.185
Karelia_HG Barcin_Neolithic1 Barcin_Neolithic2 -0.001006 -0.77
Motala_HG Barcin_Neolithic1 Barcin_Neolithic2 0.000276 0.269
Loschbour Barcin_Neolithic1 Barcin_Neolithic2 0.000492 0.345

Only Hungary_HG gives a signal that looks more than just noise. This makes sense, since it's the forager genome from the closest site to Western Anatolia.
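As a rough illustration of why a negative f3 points to mixture, here is a small Python sketch of the naive statistic: the average over SNPs of (c - a) * (c - b), where c is the target's allele frequency and a and b are the two candidate sources. The allele frequencies are invented, and this is only a sketch under those assumptions. As far as I know, the real software also applies a finite-sample correction and gets the standard error from a block jackknife, both of which are skipped here.

```python
def naive_f3(target, source1, source2):
    """Average of (c - a) * (c - b) over SNPs, from allele frequencies."""
    products = [(c - a) * (c - b) for c, a, b in zip(target, source1, source2)]
    return sum(products) / len(products)

# Toy allele frequencies at four SNPs (invented purely for illustration).
source1 = [0.10, 0.80, 0.30, 0.60]
source2 = [0.90, 0.20, 0.70, 0.40]
mixed = [(a + b) / 2 for a, b in zip(source1, source2)]  # 50/50 mix

print(naive_f3(mixed, source1, source2) < 0)  # True: target sits between sources
print(naive_f3(source1, mixed, source2) > 0)  # True: an unmixed target is positive
```

The statistic only goes negative when the target's frequencies sit between those of the two sources at enough SNPs, which is why poorly chosen sources give positive or near-zero values even for a genuinely mixed target.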

Now, using Levant_Neolithic as one of the mixture sources instead of the most basal Barcin farmers basically fails. Not one Z score worth mentioning.

Bichon Levant_Neolithic Barcin_Neolithic2 0.00047 0.294
Karelia_HG Levant_Neolithic Barcin_Neolithic2 0.001843 1.17
Loschbour Levant_Neolithic Barcin_Neolithic2 0.003562 2.215
Hungary_HG Levant_Neolithic Barcin_Neolithic2 0.00432 2.79
Motala_HG Levant_Neolithic Barcin_Neolithic2 0.004253 3.534

Seems like Levant_Neolithic isn't useful in this context. That's not to say a population related to Levant_Neolithic didn't help to form the Barcin farmers, but we don't yet have that population to get useful results with f3 mixture stats.

The stats you're getting are very subtle and very mysterious. Unlike the Hungary_HG result above, it's impossible to say what they mean exactly.

Rob said...

Broadly, the 'Levant Neolithic' samples from Lazaridis were all from Israel, and hg E. We'd need more samples from pre-Neolithic central-eastern Anatolia & northern Syria.

Davidski said...

Not even broadly speaking. The southern Levant farmers just don't appear to be relevant to the Anatolian farmers as a mixture source.

If they're forced to be a mixture source, then strange things are likely to happen.

Definitely, we need samples from the Northern Levant, but I'm getting the feeling that this WHG-like and Basal combo was a feature of Near Eastern populations for a very long time. I have no idea where one would look to find the true, or even useful, reference populations for the most basal of the Anatolian Barcin farmers.

Not Iran IMO.

Davidski said...

Alright, I managed to almost get a signal of admixture for the Barcin Anatolians using Levant_Neolithic. Guess what the other mixture reference is?

Levant_Neolithic Boncuklu_Neolithic Barcin_Neolithic -0.005525 0.002109 -2.62

So Barcin looks like a mix of Boncuklu and something from the Levant. Thus, the question becomes, what is Boncuklu exactly? It does show some eastern CHG-like admixture, but not much.

Rob said...


Boncuklu is right on the line where we might expect Epipalaeolithic Anatolia to be split into west & east regions. This sample is actually pre-ceramic Neolithic, not Epipalaeo, but I suspect it would be representative of the eastern Anatolian Epipalaeolithic, which is concentrated to the southeast, toward Syria.

On the west side, the Marmara shows closest analogies to Mesolithic sites in Ukraine & Crimea, rather than Greece or the Balkans, whilst Antalya was linked to Greek Thrace.
What survived where during the LGM is complicated, but this area is possibly where a mixed WHG-like/Basal population could have existed for a very long time, thus obviating the need to try and force something "WHG" with "Levant Farmer". It might be worth looking at which of the Palaeolithic European samples we have is the best fit for the "WHG-like" in Anatolians, as well as Natufians, and comparing the two.

Are you getting a feeling for the time depth at which the Basal in Anatolia, the Levant and Iran split?

Davidski said...

This looks plausible for Boncuklu (although the high CHG is probably not crucial, judging by the rest of the output that I'm seeing here).

left pops:

best coefficients: 0.619 0.241 0.139
std. errors: 0.229 0.237 0.115
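One quick way to read an output like this is to compare each coefficient with its standard error. Below is a small Python sketch using the figures above. The `source_N` labels are placeholders, since the list of left pops isn't shown here, so nothing in it says which number belongs to which population.

```python
# Mixture coefficients and standard errors copied from the output above.
# The source labels are placeholders; the "left pops" list is not shown.
coefficients = [0.619, 0.241, 0.139]
std_errors = [0.229, 0.237, 0.115]

ratios = [coef / se for coef, se in zip(coefficients, std_errors)]

for i, (coef, se, ratio) in enumerate(zip(coefficients, std_errors, ratios), 1):
    print(f"source_{i}: {coef:.3f} +/- {se:.3f} (coef/SE = {ratio:.2f})")
```

A coefficient that is only about as large as its standard error is hard to tell apart from zero, so it shouldn't be leaned on too heavily.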

And then the Barcin Anatolians are just Boncuklu with some extra Levant-related stuff.

The problem with using UP Europeans is that they have too much archaic.

Davidski said...

No idea how Basal Eurasian fits into this and when it split.

Matt said...

@ Davidski, I've been trying to come back with thoughts about those stats and correlations. Nothing clear at the moment. If I come up with something that adds rather than reduces confusion, I'll post it.

If you or @ Alberto were interested, I did try another experiment to estimate proportions from the stats. I wasn't quite happy with the model of using a regression on the K7 positions to get estimates out of the stats (and it's sort of a circular way to cross check against K7).

Basically, I transformed the full set of stats into a PCA, taking the first 3 dimensions, which explain 99% of the variance, then estimated positions of ANE, WHG, Basal_Rich and ASI within that PCA, *then* used 4mix on the PCA data to estimate proportions.

(Set of images showing the process from start to finish here: The proxies took a few iterations to get right.

Tried using multivariate regression on the PCA, but 4mix worked a lot more cleanly).

Basically typical 4mix / nMonte methods, really.
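To give a feel for the last step, here is a rough Python sketch of the general idea: search for the proportions of four reference points that best reproduce a target's position in the PCA space. This is not the actual 4mix script. The function name, the 5% step and the coordinates are all invented for illustration.

```python
from itertools import product

def fit_mixture(target, sources, step=5):
    """Brute-force search for source proportions (in % steps) closest to target."""
    n = len(sources)
    best = None
    for head in product(range(0, 101, step), repeat=n - 1):
        if sum(head) > 100:
            continue
        # The last source takes whatever share is left over.
        weights = head + (100 - sum(head),)
        # Weighted average of the source coordinates.
        mix = [sum(w * s[d] for w, s in zip(weights, sources)) / 100
               for d in range(len(target))]
        dist = sum((m - t) ** 2 for m, t in zip(mix, target)) ** 0.5
        if best is None or dist < best[0]:
            best = (dist, weights)
    return best

# Invented 3-D "PCA" coordinates for four dummy source populations.
sources = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 0.0)]
target = (0.5, 0.3, 0.2)  # built as 50/30/20/0 of the sources
dist, weights = fit_mixture(target, sources)
print(weights)  # (50, 30, 20, 0)
```

The fit is only as good as the positions chosen for the dummy sources, so moving any of them changes the proportions that come out.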

The resulting proportions seem to do a better job of recreating the PCA on the D-stats (because there's less constraint on fitting to the stats than doing a regression on the K7 values) and still have pretty close correlations with K7:

Getting back to your questions though, like you'd expect from the correlations with K7 against the D-stats+PCA+dummy ancients models, those estimates do show pretty much exactly the same patterns as the K7 components do, when correlated against the D-stats, and they are explicitly built to model all the variance on the D-stats! I think that's at least confirmation that there is mostly agreement between the two sets of data (K7 and D-stats), even though I have some trouble explaining exactly *how*... ;)

And even if there may be some extra data outside these D-stats that contributes to K7 (perhaps at the expense of what these D-stats are talking about)...

(The estimates I derived from the statPCA+virtual ancients+4mix are on a datasheet in csv format here:

Neighbour joining comparison of estimates vs K7:

Here's also the PCA data I used (based on the D-stats):

and the "model" ancients detransformed from the PCA back into D-stats: and how they neighbour join with others based on those stats alone ).

Matt said...

Do any models like Levant_N + Villabruna + AG3, or Levant_N + Villabruna + Iran_Hotu + AG3, work for Boncuklu / Barcin? With the O9 as outgroups (Ust_Ishim, Kostenki14, MA1, Han, Papuan, Onge, Chukchi, Karitiana, Mbuti).

(Or using any other European UP in place of Kostenki14).

Chad Rohlfsen said...

I'm going to merge and make my Geno files, but this would be consistent with Barcin being about 90% Boncuklu, 10% Levant. That would match my exact qpAdm output of 71% Levant, 14% CHG, 15% WHG.

Davidski said...


Can't get any good models with AG3 or Iran_Hotu, at least not with those outgroups. It seems that CHG is the best proxy for the eastern influence in the Anatolians.

But the standard errors always approach the actual CHG mixture coefficient, which makes me think that we don't yet have any good reference pops for the CHG-related admix in the Anatolians.

By the way, I noticed that the Basal-rich estimates in your 4mix results for the Steppe EMBA samples are actually a bit lower than those in my K7.

Matt said...

Thanks for giving it a shot.

Yeah, the Basal_Rich in that model averages at around 85% of the same-named component in K7, though it depends on the population. Most proportionately reduced are Steppe and populations from recent Iran / South Central Asia / Caucasus, while populations that increase or stay constant are European, and the very early-mod Neolithic populations (Iran Neolithic, Israel Neolithic). Presumably, if you had to put an interpretation on it, I guess that estimate might be more Basal.

I placed the position for the dummy Basal_Rich on the PCA at a cline intersection in a place that could cut a cline straight through Iran_N towards where I'd put the ANE population, and at the same time cut straight through Iberia_MN to where I'd placed the WHG. If it had been placed closer to recent West Eurasia and got higher proportions, then I'd have had to move the ANE at least (probably "south" and "east") to have got Iran_N fitting as just the dummy Basal_Rich and ANE without picking up my ASI (or failing to fit). Then the ANE would've dropped out for the Steppe_EMBA populations, where I had most confidence in the K7 ANE estimate, and where the 4mix+PCA model them. (Though possibly that could've brought a closer match to the K7 overall, since the K7 has more ANE in SCA and less in Europe). Or else move the WHG "west" and Basal_Rich "east", and the Early Neolithic would not work as fairly simple mixes of Basal_Rich and WHG as well.

Ultimately, although I had a mind to what would fit, e.g. looking for proportions to work out as consistent with K7 as I could get, there was an element of choice in positioning the "dummies" at about where I thought they should be, so they're not that rock solid as estimates. They're simply mixes of virtual populations in the PCA space that do work, and correlate fairly well with the K7, not necessarily the only or the actual ones. (If anyone used the same methodology again in future, they could constrain it by adding a Villabruna cluster / EHG sample to set some constraints on positions of Basal_Rich / ANE).

Chad Rohlfsen said...


What are errors like with Natufians and Iran in the outgroups?

Chad Rohlfsen said...

Oh, the other thing is that Levant is supposed to be a mix of Natufians and a group from near the Boncuklu group. So, that might be another reason for errors. Use the 9 and Iran in the outgroups, with Natufians in the pleft.
