Friday, December 26, 2014

The fateful triangle


Not long ago Lazaridis et al. proposed that most present-day Europeans were derived from three distinct ancestral populations: Ancient North Eurasians (ANE), Early European Farmers (EEF) and Western European Hunter-Gatherers (WHG).

However, this is essentially a stop-gap model, which will in all likelihood be replaced by a partly revised and more robust model once someone manages to sequence a genome or two from the Neolithic Near East. That's because EEF is clearly a hybrid component, largely made up of ancient Near Eastern ancestry and something very WHG-like, sometimes in very different proportions depending on the location and archeological context of the EEF genomes being analyzed.

So what will this new model look like, you might ask? Probably like this, where EEF is replaced by an Early Neolithic Farmer (ENF) component from the ancient Near East, or something very similar:


The diagram above is basically a Principal Component Analysis (PCA) based on output from my new West Eurasia K8 test (see here), in which the Near Eastern component is synonymous with ENF.

I'm quite certain that these results are very close to the truth. However, just in case the Near Eastern ancestry proportions are a little bit too high (and we won't know until we see those ancient genomes from the Near East), I've got another version that offers lower bound Near Eastern estimates.


It might be useful to keep in mind that I rotated the plots to fit geography. As a result, Component 1, which packs around 85% of the variance on both plots, appears smaller than Component 2, which only carries around 10% of the variance.
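
As a rough illustration of the rotation point: turning a 2D plot changes which screen direction each component runs along, but not how much variance each component carries. The sketch below uses made-up scores and a 90-degree turn, both of which are assumptions for illustration and not values from the K8 output.

```python
import math

def rotate(points, degrees):
    """Rotate 2D points counter-clockwise about the origin."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def variance(values):
    """Population variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

# Hypothetical (PC1, PC2) scores: PC1 spreads far more than PC2.
scores = [(-3.0, 0.5), (-1.0, -0.5), (1.0, 0.5), (3.0, -0.5)]
rotated = rotate(scores, 90)

var_pc1 = variance([x for x, _ in scores])
var_pc2 = variance([y for _, y in scores])
# After a 90-degree turn, PC1 runs along the vertical screen axis.
var_horizontal = variance([x for x, _ in rotated])
var_vertical = variance([y for _, y in rotated])
```

The component with most of the variance now runs vertically, so how large it looks depends on the shape of the plot and not on the data.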

A spreadsheet with West Eurasia K8 results for a wide variety of populations is available here. Please note that there are two sheets, with the second sheet showing the lower bound Near Eastern ancestry proportions.

We'll probably learn of more ancient European meta-populations as many more genomes are sequenced from across Eurasia. Nevertheless, I doubt this will affect the model outlined above. That's because I'm expecting all such meta-populations to be mixtures of ANE, ENF and/or WHG, as well as, in some cases, extra-West Eurasian components.

However, I suspect that West Eurasia will have to be modeled in a different way from Europe, with, amongst other things, the so-called Basal Eurasian component replacing ENF. But for this to happen we'll need at least one ancient genome that is in large part of Basal Eurasian origin. In any case, that's a whole different subject.

See also...

4mix: four-way mixture modeling in R

ANE is the primary cause of west to east genetic differentiation across West Eurasia

Bell Beaker, Corded Ware, EHG and Yamnaya genomes in the fateful triangle

134 comments:

Davidski said...

By the way, I got the triangle idea from Matt.

Nirjhar007 said...

So in a general sense, what does the 'Triangle' tell us?

Davidski said...

Most of all it shows us the basic genetic structure of West Eurasians, and in particular Europeans.

But what it also illustrates is that Northwestern Eurasia was populated by some highly differentiated populations until the Neolithic, when these groups began mixing, and, as a result, at least three of them are no longer to be seen in their unmixed state anywhere. However, modern West Eurasians carry significant contributions from each of these three groups, and this is one of the main reasons why they appear so closely related.

Nirjhar007 said...

I understand, but can you differentiate the Pre-IE ANE component in Europe from the Indo-European ANE?

Helgenes50 said...

To differentiate the Pre-IE ANE in Europe from the Indo-European ANE, could drawing a line between Motala and ENF be a solution?

Davidski said...

I can't differentiate it because it's the same component.

However, almost all of the ANE in modern Europe west of the Black Sea probably arrived there after the Neolithic with various Indo-European groups. This parallels the massive spread of Y-haplogroup R across Europe after the Neolithic.

Hunter-gatherers like Motala12 didn't contribute much, because their ANE ratios were too low and WHG ratios too high. Their Y-DNA also didn't make much of an impact.

Some of the ANE in Mediterranean Europe arrived there rather recently (Iron Age to early Middle Ages) from West Asia. But even this is likely to be of Indo-European origin in a roundabout sort of way.

Nirjhar007 said...

(Sorry for the wrong question before.) Do you think the 4.2 kyo event brought any IE/West Asians to Europe?

Davidski said...

I have no idea about that. The point I was making before was that some ANE arrived in parts of southeastern Mediterranean Europe from West Asia during the Greek and Roman periods.

This is probably one of the main reasons why some southeastern Europeans are best fitted as Stuttgart/MA-1, because their WHG decreased during these admixture events.

Nirjhar007 said...

But you once said the 4.2 kyo event brought some 'West Asian' component to Europe :)

Alberto said...

I was thinking that if we had a genome from pre-Celtic Iberia (for example from 1500 BC, when Celts had still not arrived but the 2 populations of HG and Farmers were probably already completely mixed), it could give us a good hint about the 3rd population. For example, if that genome clustered with Sardinians or Stuttgart, it would mean that the 3rd population was eastern and northern (something like Baltic), but if that genome was more where Gokhem2 is, then that 3rd population would have to be eastern and southern (more like Caucasus).

In the absence of this genome, is it possible to make an experiment like checking the f3 stats of:

Spanish, Stuttgart, Lithuanian
Spanish, Gokhem2, Abkhazian

Assuming Gok2 and Stuttgart have similar coverage (which I don't know), would that work and be informative in any way? Or would those results be rather random and meaningless?
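
For readers unfamiliar with the statistic: an f3 of the form f3(Target; A, B) averages (t - a)(t - b) over markers, where t, a and b are allele frequencies in the three populations. A clearly negative value suggests the target is a mixture of populations related to A and B. Below is a naive sketch in Python. The frequencies are invented, and it skips the finite-sample correction and standard errors that real tools apply, so treat it as an illustration of the idea only.

```python
def f3(target, source_a, source_b):
    """Naive f3(Target; A, B): mean of (t - a) * (t - b) over markers."""
    products = [(t - a) * (t - b) for t, a, b in zip(target, source_a, source_b)]
    return sum(products) / len(products)

# Invented allele frequencies at five markers (not real data).
pop_a = [0.10, 0.80, 0.30, 0.60, 0.90]
pop_b = [0.70, 0.20, 0.90, 0.10, 0.40]
# A hypothetical target built as an exact 50/50 mix of A and B.
mixed = [(a + b) / 2 for a, b in zip(pop_a, pop_b)]

stat = f3(mixed, pop_a, pop_b)  # negative for a target lying between A and B
```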

Matt said...

Thanks for the props.

Davidski: However, just in case the Near Eastern ancestry proportions are a little bit too high (and we won't know until we see those ancient genomes from the Near East)

By the way, we'll probably learn of more ancient European meta-populations as many more genomes are sequenced from across Eurasia.

Nevertheless, I doubt this will affect the model outlined above. That's because I'm expecting all such meta-populations to be mixtures of ANE, ENF and/or WHG, as well as, in some cases, extra-West Eurasian ancestral components.

However, I suspect that West Eurasia will have to be modeled in a different way from Europe, with, amongst other things, the so called Basal Eurasian component replacing ENF. But for this to happen we'll need at least one ancient genome that is in large-part of Basal Eurasian origin. In any case, that's a whole different subject.


I definitely think these paras sound correct, as addenda to this model. For me,

- It seems pretty likely to me that the original Near Eastern farmers, who expanded into Europe, South Asia and Africa, may fit "southwest" of Yemen Jews. Mainly I would expect from PCA that the Eurasian ancestry in East Africans, once African ancestry is masked out, seems to have particularly less ANE and also WHG affinity than in Yemen Jews.

- It seems like there might be some divergence between ANE in South Asia and in other parts of West and North Eurasia. I think in particular that ANE might have started to diverge around the time of Mal'ta (just before or after) and may have been absorbed into other regional populations at some time after 24,000 BP, like EHG in Eastern Europe, proto-Amerind in the Far Northeast, and something that shows up as a South Asian component in South Asia. Not because they were conquered or conquered others necessarily, but just as the central population in a world where lots of other populations were in contact with them.

I'd guess this might be something Reich and Patterson will talk about in their paper, if they can sufficiently mask for non-ANE ancestry in South Asians.

- There might also be some genetic divergence between offshoots of early farmers. Not as high as in Dienekes' "Womb of Nations" model that supposed high drift among offshoots of early farmers and low admixture before and after the expansion of farming. This was a reasonable model at the time, as it could produce a similar situation, with *very* high drifts in farmers and high divergences between farmers and HGs, but we now know from the ancient samples we are more in a world with relatively lower drift among early farmers and higher admixture. That would just have the effect of making farmers globally further away from one another and world populations than the FSTs here would suggest though, not really change relative position much within West Eurasia.

The basic model isn't going to change much though.

Matt said...

Here are some more PCA plots along the same concept with Dienekes' Globe13 admixture, to compare how this performs in comparison.

http://i.imgur.com/ACIhBqn.jpg - Dienekes Globe13 K10 (Globe10). Compared with where I think David's populations would overlay - http://i.imgur.com/VorfHNW.jpg. With 3 West Eurasian components.

http://i.imgur.com/aWgCTZH.jpg - Globe11. Again with 3 West Eurasian components.

http://i.imgur.com/1NzlPpV.jpg - Globe13 (plus rotation http://i.imgur.com/D7tQ7I0.jpg, plus rotation and stretch - http://imgur.com/I5Akq4y)

Note this is compressed at the Near Eastern and Mediterranean end, because with 4 West Eurasian components more of the difference of Med and WA from Southwest Asian and similarity of Med and WA to Atlantic Baltic falls into PC3. So here's the Globe13 based MDS which forces everything to fit 2 dimensions - http://imgur.com/RXB8rfq

You can see they all reproduce the same basic clinal structure, and can more or less account for variation in modern day West Eurasians to a similar degree. (I don't think I'd get anything like this out of Eurogenes K13 or above, because it's designed to have a number of European clusters with slight divergences to produce Oracles better, so wouldn't work the same when put through PCA.)

Where Globe13 falls short compared with the new Eurogenes K8 ANE, WHG, Near Eastern model is that 1) we have no evidence that the cluster points in Globe13 actually overlap real populations, and 2) Globe13's PCA can't contain the real ancient samples we have at the same time as modelling the divergences similarly to genotype.

A combination of divergence between different types of ANE and survival of populations distant from pure ANE and WHG I think will be mainly why these 3 components (or components close to them) don't form in modern ADMIXTURE.

Shaikorth said...

"In the absence of this genome, is it possible to make an experiment like checking the f3 stats of:

Spanish, Stuttgart, Lithuanian
Spanish, Gokhem2, Abkhazian
"

David's previous f3 mix experiment gave stronger shared drift stats for Spanish; Sardinian, Pathan and Spanish; Sardinian, Dai than for Sardinian, Abkhasian and Sardinian, Estonian (for some reason there wasn't a Lithuanian stat for them).
http://eurogenes.blogspot.com/2014/07/f3-stats-100-present-day-populations.html

So it might be interesting to do Spanish; EEF, Dai/She and Spanish; EEF, Pathans and if the Yoruba/Mbuti related signals are replicated with Lithuanians in place of Estonians.

Chad Rohlfsen said...

If we're to go by Stuttgart's supposed 44% BE, your ENF is a little over 60% BE.

Arch Hades said...

What populations from the Near East would bring in ANE to Mediterranean Europe? Wouldn't that mostly come in from the North?

jackson_montgomery_devoni said...

So David this new West Eurasia K8 test of yours is accurate for all West Eurasians? Even those who are outliers such as Finns and Southern Italians?

Tone said...

I'm not very knowledgeable so forgive me if this is off, but couldn't we get a very rough approximation for Eastern Hunter Gatherers by drawing a line from Gok2 through Modern Swedes and intersecting with the WHG/ANE line?

I'm not sure where Modern Swedes are on the plot but here's my guess:

https://dl.dropboxusercontent.com/u/10788581/K8b_PCA_2.png

If this is true, then ANE is coming from a mix with an EHG source, who are basically a hybrid group of WHG and something else very high in ANE which is from parts very far away or from a very isolated group.

Again, forgive me for being wrong or obvious, but I'm just guessing and trying to wrap my mind around it.

epoch1970 said...

Where are the Germans, Belgians and Dutch in that spreadsheet?

Shaikorth said...

Tone,

If this PCA mimics a genotype-based one, the area you circled is likely where Belorussians and northern Swedes cluster. Most of southern and central Swedes should be in the "northernmost" part of the tight NW European group so a bit "southwest" from the circled area.

As for Germans, they should be "south" of Swedes and Belgians and Dutch further "southwest".

Davidski said...

Alberto,

Here are those f3 stats.

https://docs.google.com/spreadsheets/d/1iK3HWHlNzXPtCtWArkux5xRhDQ7WsDdMcOwFJXxMaQ4/edit?usp=sharing

But I think there might be a problem in assuming that present-day Iberians are simply a mixture of Iberian hunter-gatherers, Mediterranean Neolithic farmers and early Celts. That's because I think various admixtures entered Iberia during the Roman and Islamic periods.

Arch,

I was thinking of Bronze and Iron Age groups like the proto Etruscans and Minoans who perhaps came from Anatolia. They might have brought ANE with them, but very little or no WHG, which would've raised the level of ANE and Near Eastern ratios in parts of southeastern Europe, but lowered WHG. But of course we'll need ancient DNA to see whether these groups actually came from West Asia, and if so, whether they carried any detectable ANE.

Jackson,

Yes, this test is accurate for Southern Italians and Finns.

By the way, here are spatial maps of some of the K8 components done by Srkz.

https://drive.google.com/file/d/0B9o3EYTdM8lQUzVTS21GdFl4Mk0/view?usp=sharing

https://drive.google.com/file/d/0B9o3EYTdM8lQOS1YODhvOWpWNzg/view?usp=sharing

https://drive.google.com/file/d/0B9o3EYTdM8lQc0VfVnU3ZE1YZnM/view?usp=sharing

https://drive.google.com/file/d/0B9o3EYTdM8lQTGdURTZLTXY2LTA/view?usp=sharing

Grey said...

@Arch

"What populations from the Near East would bring in ANE to Mediterranean Europe?"

I was assuming the comment referred to Hittites and suchlike spreading ANE in the Near East from the more eastern and less WHG-mixed end of ANE, and then that component affecting the Med. at a later date via the Near East, so a sequence

north -> south -> west

Grey said...

@Tone

Interesting thoughts.

Alberto said...

David, thanks for those stats.

There seems to be a clear preference for Gokhem2-Pathan over any other combination. I also find it interesting that, on the other hand, when using Lithuanian there is a clear preference for Stuttgart over Gokhem2, which seems to show that the axes work with the "fateful triangle" you made in this post.

Anyway I don't think we can read too much into this. In the end only ancient genomes can give us real answers.

As for the North African (or Jewish) admixture in Spaniards, which is small but undeniable, I'm not sure it shows much significance here. The results from areas where this admixture is higher (like Murcia or Andalusia for North Africans, and Extremadura or Castilla for Jews) are equivalent to those where the admixture is hardly existent (like Cantabria). The Roma (Gypsy) population in Spain has similar distribution to North African (much more in the south than the north), but they hardly mixed with Spaniards and are mostly a separate population so I don't think that could explain the Pathan preference over Abkhazian.

Richard Rocca said...

David, do not forget the Phoenicians and Carthaginians in Spain.

jackson_montgomery_devoni said...

Great! Thank you David.

Davidski said...

Tone,

I'd say that's pretty close, and indeed your EHG marker is more or less where I'd put it myself.

However, I think the populations that spread ANE across West Eurasia during the Copper Age would be positioned somewhere below the WHG-ANE dotted line. That's because I'm pretty sure now that they carried some ENF ancestry.

Ponto said...

The Phoenicians/Carthaginians are way over-rated.

They were not settlers or farmers or empire builders just typical Middle Eastern merchants interested in profit and loss statements, balance sheets, the bottom line and obtaining the goods and services they wanted via trade. Military occupation was just for obtaining a monopoly.

The Greeks were different. They built new towns, were active farmers, colonizers and eventually absorbed culturally and linguistically the indigenes.

Davidski said...

The early Greeks are definitely among the top candidates to have carried ANE to coastal Iberia and southern France, but not Basque country.

truth said...

Davidski,

I don't think so. The Greeks were very small populations in Iberia; they settled in a few coastal so it was never something big.

truth said...

*coastal towns, I meant.

Gaspar said...

@ponto

The Greeks are overrated; they were never a huge populace, in either present or ancient times. The bulk of the population for migrational purposes was due to the 1200 BC Bronze Age collapse, when indigenous Anatolians flooded into coastal Greek settlements in Anatolia and sailed to Catalan and southern French lands.

Chad Rohlfsen said...

Ionian Greeks

ryukendo kendow said...

@ Matt @ Davidski @ Tone
Overall the positions seem to be rather accurate. For example, the Armenians and Motala both have approx 15% ANE, and a line from Motala down to Armenian passes through Scandinavians, who also have approx the same amount. This line is approx parallel to the WHG-ENF line, which is what we would expect of an 'isocline' of 15% ANE, other components kept constant. Likewise, French to Volga-Ural forms an 'isocline' of approx 45% WHG, parallel to ENF-ANE line.
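
The isocline argument can be checked with a bit of geometry. In the sketch below the triangle corners sit at arbitrary coordinates (an assumption; the real plot's corners come from the PCA), and a population's position is taken to be the proportion-weighted average of the corners, which a PCA only approximates. Populations with the same ANE share then fall on a line parallel to the WHG-ENF edge.

```python
# Arbitrary corner positions for illustration only.
CORNERS = {"ANE": (1.0, 1.0), "WHG": (-1.0, 1.0), "ENF": (0.0, -1.0)}

def position(ane, whg, enf):
    """Map ancestry proportions (summing to 1) to a point in the triangle."""
    weights = {"ANE": ane, "WHG": whg, "ENF": enf}
    x = sum(w * CORNERS[k][0] for k, w in weights.items())
    y = sum(w * CORNERS[k][1] for k, w in weights.items())
    return (x, y)

def cross(u, v):
    """2D cross product; zero when u and v are parallel."""
    return u[0] * v[1] - u[1] * v[0]

# Two hypothetical populations, both 15% ANE, with different WHG/ENF splits.
p = position(0.15, 0.70, 0.15)
q = position(0.15, 0.10, 0.75)

iso_direction = (q[0] - p[0], q[1] - p[1])
edge_direction = (CORNERS["ENF"][0] - CORNERS["WHG"][0],
                  CORNERS["ENF"][1] - CORNERS["WHG"][1])
parallel = abs(cross(iso_direction, edge_direction)) < 1e-9
```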

Tone, I think this means that the arrow-drawing you tried to do might be relatively accurate.

Matt, I think your point about the 50% contribution is valid. If we take that figure to be a precise one (which I think is a good idea--no reason to produce things like 36% or 73% if 50% is only a fuzzy estimate from the peak of your abilities), then any line from Armenian to any point on the WHG-ANE line, when bisected, would produce a point south of modern-day eastern Euros. If EHG was south of the WHG-ANE line, then that point would be even further to the south.

The triangle once again brings out the contradictions in the information we've been given. If we base our estimates against the 36% 'nonlocal' EHG in Corded Ware, and Corded Ware as present in north-central Europe (Swedish or Polish, both are used as proxies in the graphic below), then a point on the WHG-ANE cline approx twice as far from CW as CW is from Gokhem would land us very close to EHG. But if Yamnaya is 50% EHG + 50% Armenian, then the Yamnaya produced by this estimate would land far to the east below the Volga-Ural pops and in the triangle formed by N.Cauc, Kazakhs, and Volga-Ural, which, while geographically pleasing, means that Corded Ware cannot be 75% Yamnaya.
http://imgur.com/PnYhQ8i,pxhemPB,YfVJ2HN,pEYLDwY,N145ene#1

If we base our estimates on the fact that Corded Ware is 75% Yamnaya, then Yamnaya will be three times closer to CW than CW is to a point on the WHG-ENF line, but this produces a Yamnaya that cannot be 50% Armenian + 50% Karelian, like what Matt said.
http://imgur.com/PnYhQ8i,pxhemPB,YfVJ2HN,pEYLDwY,N145ene#3

We can modify David's last estimate to produce Yamnaya that is both 50% Armenian + 50% Karelian and approx 75% of Corded Ware, but this leaves us wondering how the researchers got 36% EHG, as CW minus the EHG would leave us with a Southeast-Euro, Bulgarian-like residual with WHG, ENF and ANE all represented already. But this may or may not be a big problem.
http://imgur.com/PnYhQ8i,pxhemPB,YfVJ2HN,pEYLDwY,N145ene#0

To sum it up, the big issue is whether it is possible for 1) Yamnaya to be 50% EHG + 50% Armenian, 2) Corded Ware to be 75% Yamnaya, 3) Corded Ware to be 36% EHG, and 4) Corded Ware to center in North-Central Europe, all at the same time.
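
Each of these constraints is an instance of the lever rule: if a mixture M is a fraction f of source A and 1 - f of source B, then M sits on the segment A-B, and A = B + (M - B) / f. Here is a sketch with placeholder plot coordinates (not read off the actual figures), treating plot positions as if they mixed linearly, which a PCA only approximates.

```python
import math

def extrapolate_source(mix, other, fraction):
    """Given mix = fraction * source + (1 - fraction) * other, return source."""
    return tuple(o + (m - o) / fraction for m, o in zip(mix, other))

def distance(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Placeholder coordinates for illustration only.
corded_ware = (0.0, 0.0)
farmer_point = (-1.0, -3.0)   # some point on the WHG-ENF line

# Constraint 2: Corded Ware modelled as 75% Yamnaya.
yamnaya = extrapolate_source(corded_ware, farmer_point, 0.75)

# With a 75% share, Yamnaya is three times closer to Corded Ware
# than Corded Ware is to the other source.
ratio = distance(corded_ware, farmer_point) / distance(corded_ware, yamnaya)
```

The same reasoning with a 36% EHG share puts EHG 0.64 / 0.36, or roughly 1.8 times, as far from Corded Ware as Corded Ware is from its other source, which is the "approx twice as far" figure used above.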

ryukendo kendow said...

Add to the last 4) Corded Ware as 36% EHG.

Chad Rohlfsen said...

I think we accomplished that on the last figures.

ryukendo kendow said...

The last figures suffer from what Matt pointed out, requiring more than the necessary amount of Karelian. I already put them in the triangle in the second images.

Davidski said...

I saw the 36% in Corded Ware described as "low bound non-local", so it's not clear what it represents.

But in any case, the 73% Yamnaya admixture was said to be a better fit.

Chad Rohlfsen said...

73% Yamnaya was based on the EHG, not the remainder. Anyhow, the last run should be close.

ryukendo kendow said...

@ Chad
You just contradicted David.

@ Davidski
I think your earlier estimates were closer. Your latest one cannot fit the 50/50 datapoint.


Well it should be out soon.

Chad Rohlfsen said...

73% Yamnaya should line up with something about 30% NE / 70% WHG. Check the line and distance.

Chad Rohlfsen said...

Or 34/66. It should work, and within reason. As I stated, I feel the true percentage is between 50-60% with ANE picked up with local sources. It's just a fit and not literal.

Chad Rohlfsen said...

Give me exact quotes from Reich. I'll fit it all. No pulling stuff out of rear ends. Cite the source please.

Chad Rohlfsen said...

I'll get started in about 4-5 hours, after work.

Chad Rohlfsen said...

Going by David's comments, it seems that everything has been achieved here, unless something has changed. Check the line from EHG, if they are 60WHG/40ANE, to Gok2. Corded should be at least 36% EHG. Then the line from Yamnaya, through Corded, should have it at 73%. If we have to, we can make Corded 35-36% NE. That will bring the line on the NE/WHG side down to higher NE.

"Razib also tweeted a few times from the talk, and as far as I can tell, his main point was that the Yamnaya samples showed affinity to the Ancient North Eurasian (ANE) proxy Mal'ta boy, but were also partly of Near Eastern origin, and indeed could be modeled as a 50/50 mixture between present-day Armenians and ancient Karelian hunter-gatherers. He also said that the ancient Karelians were classified as eastern hunter-gatherers (let's call them EHG for now), along with the hunter-gatherers from the Samara Valley, which probably means they carried a lot of ANE admixture.

Moreover, he added that Corded Ware genomes from late Neolithic Germany could be modeled as 75% Yamnaya, while another source from the talk revealed to me that they carried a minimum of 36% EHG."

Chad Rohlfsen said...

That minimum could mean that the lowest ANE among Corded was around 14-15%. The line from near Gok2 to EHG should reveal that, with Corded east of the line.

Chad Rohlfsen said...

David,
Have you tried IR1? Would he be around 42/42/16?

Davidski said...

IR1 gets 43/42/15, but IMO it doesn't have enough markers for an accurate run.

ryukendo kendow said...

@ Chad
The latest estimates do not place Yamnaya as 50% Armenian + 50% Karelian. You gave the quote here.

Matt already gave his analysis of this in the comments on the previous post.

Chad Rohlfsen said...

I'm sticking with where it's at

Davidski said...

rk,

You're putting the Armenian marker too far east. The Human Origins Armenians cluster here (red dots).

https://drive.google.com/file/d/0B9o3EYTdM8lQSmhKYzkwNVJfOUk/view?usp=sharing

ryukendo kendow said...

@ Chad
Let's see.

@ Davidski
Yup, I stand corrected. Matters little though. A line from EHG to Armenian, when bisected, will still be below where East Europeans are, and if EHG is not on the WHG-ANE line like you suspect then it will be even more below.

I still think your earlier estimates with Yamnaya just below the East Euro arch were better.

Chad Rohlfsen said...

The position is not as important as the numbers. The PCA compresses the North, as seen from Gok2's 50% NE being closer to WHG than NE. So, if my Yamnaya is slightly higher than half, that's fine. The numbers are all matched up with requirements.

Matt said...

@ Ryu, for visual purposes here is a triangle with combinations of 50:50 Armenian_centroid:(WHG:ANE:Near_East).

http://i.imgur.com/ZfPyS6Z.png

Anything which can be modeled as a combination of less than literally 50% Armenian will be outside the triangle. Anything which can be modeled as 50% Armenian with whatever combinations of other populations will be along its surface.

Yamnaya should fit within that triangle, at some point, depending on what other populations it combines with and in what ratio, unless the estimate of 50% is very approximate or other factors intrude.
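
In terms of proportions rather than plot positions, a population can be modelled as 50% Armenian plus 50% of some other mix only if doubling it and subtracting the Armenian profile leaves no negative component. Here is a sketch of that check. The Armenian centroid values are an assumption loosely based on the roughly 15% ANE figure mentioned earlier in the thread, not the spreadsheet values.

```python
# Assumed (ANE, WHG, Near Eastern) proportions; illustrative only.
ARMENIAN = (0.15, 0.10, 0.75)

def partner_for_half_mix(target, half_source=ARMENIAN):
    """Return the profile X such that target = 0.5 * half_source + 0.5 * X."""
    return tuple(2 * t - h for t, h in zip(target, half_source))

def fits_half_mix(target, half_source=ARMENIAN, tol=1e-9):
    """True if target can be 50% half_source + 50% of a valid mixture."""
    return all(x >= -tol for x in partner_for_half_mix(target, half_source))

# Hypothetical targets: the first leaves a valid partner profile,
# the second would need a negative Near Eastern share in the partner.
inside = fits_half_mix((0.25, 0.35, 0.40))
outside = fits_half_mix((0.26, 0.40, 0.34))
```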

I've put the Armenian and Polish centroids on as Black dots.

@ David, on another topic I was having a look at the full spreadsheet and noticed the Neolithic Hungarian KO1 models as 100% WHG under this ADMIXTURE.

That was a little surprising to me, as it positioned closer to modern day Europe and the Near East on PCA when you ran it through PCA, compared to Loschbour and La Brana.

http://eurogenes.blogspot.com.au/2014/10/genetic-continuity-and-shifts-across.html - "I don't think KO1 has any ANE. It's just a less extreme version of Loschbour. You can compare the two...

https://drive.google.com/file/d/0B9o3EYTdM8lQdkJXZF9BSFdsRk0/view?usp=sharing"
- Loschbour

"https://drive.google.com/file/d/0B9o3EYTdM8lQYTE4X1U2V3Rmb1k/view?usp=sharing" - KO1

https://drive.google.com/file/d/0B9o3EYTdM8lQXy1fbWd5RkpxQ1U/view?pli=1 - Losch, KO1 and ANE altogether

We had thought that was as it had some Near Eastern ancestry.

My alternative idea back then was that KO1 was closer because it represented a strain of HG ancestry that was closer to the HG ancestry in the Near East.

Does that seem more likely on the back of this test which finds KO1 as 100% WHG, or do you think there is any alternate explanation here? Perhaps a technical one to do with the number of SNPs, etc.

Davidski said...

KO1 has around 75% of the markers missing in this test, and it's a low coverage sample, so its result isn't valid IMO.

Having said that, La Brana-1 also clusters clearly below Loschbour on the typical West Eurasian PCA, and also scores 100% WHG in this test.

ryukendo kendow said...

@ Matt
Thanks, that is a very elegant figure indeed.

@ Chad
No, they do not. The last numbers you suggested, 34/40/26, produce at most 40% Armenian, and if EHG had NE, then even less.

Chad Rohlfsen said...

Here David,

Yamnaya 39/36/27
Corded 35/46/19
Beaker 39/47/14

Try those with an EHG at 60/40, and Armenians.

Ryu,
That will be exactly 50% of the way to Armenians.

Chad Rohlfsen said...

Sorry,

Yamnaya 39/34/27

Chad Rohlfsen said...

See where that goes. I may have to take down some ANE and add WHG to Yamnaya, and then a hair from Corded.

Matt said...

@ Ryukendo, thanks.

@ Davidski, yeah, in both cases perhaps that KO1 / La Brana's genome harbours relatively more similarity to southern people relative to Loschbour and Motala...

This could be because they have more similar composition (have Near Eastern ancestry), contributed to southern people (more donation from La Brana to Southern relative to Northern Europeans, even if it is low overall) or did not contribute at all to anyone, so fall towards 0 (0 being no differential relationship to any of the samples)...

Also thinking at the moment it'll be interesting to see chromosome painting for this test if it ever gets that far. The segments should be really broken up and evenly distributed after such ancient timescales for admixture.

Chad Rohlfsen said...

Final numbers David.

Yamnaya 39/35/26
Corded. 35/46/19
Beaker 40/46/14
EHG 60/40

That should all line up.
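
As a quick arithmetic check on these figures, assume the triplets are ordered Near Eastern / WHG / ANE. That ordering is an inference from the "60WHG/40ANE" description of EHG and is not confirmed in the thread. On that assumption, one can solve for what the other half of a 50/50 Yamnaya and the remaining 27% of Corded Ware would have to look like:

```python
# Proportions in percent, assumed order: (Near Eastern, WHG, ANE).
YAMNAYA = (39.0, 35.0, 26.0)
CORDED = (35.0, 46.0, 19.0)
EHG = (0.0, 60.0, 40.0)

def residual_source(mix, known, known_share):
    """Solve mix = known_share * known + (1 - known_share) * X for X."""
    rest = 1.0 - known_share
    return tuple((m - known_share * k) / rest for m, k in zip(mix, known))

# What the non-EHG half of Yamnaya would need to be (the "Armenian" side).
implied_armenian_like = residual_source(YAMNAYA, EHG, 0.50)

# What the non-Yamnaya 27% of Corded Ware would need to be.
implied_local = residual_source(CORDED, YAMNAYA, 0.73)
```

Under that ordering, the implied local source comes out at roughly a quarter Near Eastern and three quarters WHG with almost no ANE, in the same neighbourhood as the "30% NE / 70% WHG" figure suggested earlier.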

Matt said...

Chad, I won't comment on why you have chosen those compositions for each point, but here is where they plot if you do not know already -

http://i.imgur.com/SNdlU1S.png.

Your Beaker point is basically indistinguishable from modern English, while Corded is more or less identical to Polish. I won't comment on whether that makes sense.

Davidski said...

Red = Yamnaya
Orange = Corded Ware
Yellow = Bell Beaker
Pink = EHG

https://drive.google.com/file/d/0B9o3EYTdM8lQc3pyRjl6SEo5a3c/view?usp=sharing

https://drive.google.com/file/d/0B9o3EYTdM8lQRkZRV05KZ0thUUU/view?usp=sharing

I think that's very close. We won't get any closer without more information.

Chad Rohlfsen said...

Yeah, that pretty much matches up.

Chad Rohlfsen said...

I bet the southern steppe and Maykop fill in the rest of the gap.

Chad Rohlfsen said...

Would an EHG at 64/36 or 63/37 match better?

Davidski said...

Not really, because EHG would probably be too close to Motala12.

Anyway, it's not a big difference, and we have no way of knowing which is correct.

ZeGrammarNazi said...

David, that looks very close to the hand drawn plot you uploaded originally.

It will be very interesting to compare this to the data we are going to get from Reich and company.

ryukendo kendow said...

@ Davidski @ Chad
Thanks for the new figures. I think these will mesh with the data better. As to how good it is in an absolute sense, we'll have to wait.

@ Matt @ Davidski
I think there is a chance that this estimate would be close, or closer to correct, whatever the motivations that underlie it.

The only thing that disturbs me at this point in time is how these ANE:WHG:NE ratios are going to be partitioned among ADMIXTURE components. It seems to me quite clear that 1) Yamnaya is probably going to carry ANE and WHG under East Euro in Eurogenes, 2) Yamnaya would be low in Baltic, and 3) Yamnaya will carry Near East influence over and on top of whatever it does under Mediterranean, under West Asian or East Med. I think the second point because Baltic is high among Basques, Finns, and Hunter-Gatherer aDNA, and seems to me to conform to a relict distribution and thus be autochthonous to Europe; I think the last point, firstly because of Razib's tweets, and secondly because the NE level of Yamnaya is very high w.r.t. WHG, and the approx 1:1 ratio of NE:WHG found in Mediterranean 'EEF'-type components would end up gobbling almost all the WHG if it were to account for all the NE, leaving little for East Euro or North Sea or what have you. So some of the NE has to be accounted for by components with higher NE 'concentration' than Med.

The thing is, if Yamnaya contributed 73% to Corded Ware, and CW is the precursor to modern-day East Euros and Balts, then one should be able to infer things about Yamnaya from how modern day East Euros differ from other Euros. But these have high Baltic, and have some of the lowest levels of East Med/West Asian in Europe. Once again there seems to be something off with this chain of inference.

So I expect the ADMIXTURE run of Yamnaya and CW genomes to be very interesting indeed. My own suspicion is that Yamnaya would turn out like I said, and CW will have a 'weird' admixture result different from any modern-day Euros, but its 3-ratios will nonetheless land it among North-central Europeans.

Davidski said...

Yamnaya shouldn't have lower Baltic than present-day Erzya, unless we're talking about remains from some of the more southern burials on the Kuban steppe and near the Balkans.

Maikop, Sintashta and southern Catacomb genomes might have very low Baltic, perhaps at levels comparable to those among Tadjiks and Lezgins.

ryukendo kendow said...

@ Davidski
If you have time, is it possible to plot the components in the triangle?

Thanks!

Tone said...

After looking at Davidski's positioning of Yamnaya, Corded Ware, and Bell Beaker on the "Fateful Triangle", I'm still dumbfounded by the mystery of R1B.

One could draw a relative line that passes roughly through the Beakers, CW and Yamnaya, from GOK2 to the ANE source of the triangle.

I can see how CW is the "child" of Yamnaya. CW are modern Europeans, born from the mixing of Yamnaya and Neolithic populations. Modern North East Euros, the descendants of CW, are basically GOK2 pulled eastward by steppe folk.

I can't seem to come up with a working theory on the spread of R1B. The Beakers appear to be an extension of the same process that moved North East Europeans away from the Neolithic axis and towards ANE. However, judging from ydna, Bell Beakers (R1B) are not an extension of Yamnaya (R1A). Unless the ydna is wrong. Also, the Beakers are thought to originate in Iberia. That could be wrong too since they appear to be fully North West European.

I can come up with a few scenarios, and I'm not confident in any of them:

1. The first R1B Beakers were from the East and had a similar Caucasian/EHG origin as Yamnaya. They invaded Europe first via the Steppe and their trail was eradicated by Yamnaya R1A groups who followed on their heels a relatively short time later. Iberian origin is either false or was some sort of back migration.

2. They came from the East and "parachuted" into Iberia and the British Isles via boat. They had vastly superior technology. They encountered GOK2 and Sardinian type natives, mixed with them and repopulated the West relatively quickly via founder effect.

What about a North African origin for R1B? I can't see it based on where the Beakers fall on the plot.

But I'm probably totally wrong. Questions. Questions. Questions.

Chad Rohlfsen said...

Ryu,
A triangle might sort it out. I have a feeling that EHG will have to move much closer to 65/35, to make it perfect.

Matt said...

@Ryu - "The only thing that disturbs me at this point in time is how these ANE:WHG:NE ratios are going to be partitioned among ADMIXTURE components. "

"If you have time, is it possible to plot the components in the triangle?"

I think it is entirely possible that they *cannot* (be partitioned / plotted). I'm sure David has his own take on this, in terms of the Eurogenes components, but using the Dienekes Globe13 as an example, it might be worth bearing in mind that -

It looks a lot like, from Dienekes Globe13 components that at least one of his four ultimate West Eurasian components, Mediterranean (and perhaps also Southwest Asian and West Asian), forms *outside* the ANE:WHG:Near_Eastern component triangle.

Although I can't be certain (new adna may surprise us, etc.), see here - http://imgur.com/YRtxqSY at the position of Mediterranean. Now if you have a triangle where Sardinian sits on the surface of one of the edges, up from around where Yemeni Jews or Saudis are, it seems plain the Globe13 Mediterranean component actually cannot fit within that triangle.

As such, Globe13 Med can't actually be defined within the triangles as some mix of the ANE:WHG:Near East triangle populations, even though the four Globe13 West Eurasian populations are more or less as good a model for modern populations (they fail for the ancients).

It is also possible that the Eurogenes components don't form within the triangle either. Davidski has his own scrupulous criteria to produce ADMIXTURE components, but it is not certain to me that, even following as scrupulous a method as can be, they will necessarily fall within the triangle, unless there are modern populations who are something like 99% that component.

So all that attempting to work out the position of modern ADMIXTURE components as mixes of ANE:WHG:Near_Eastern/EEF/Basal may have been a bit of a fool's errand - modern components may just be able to mix to produce the same outcomes (for modern populations) as ANE, WHG and Near Eastern do, but not able to themselves be produced from ANE, WHG and Near Eastern.
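
To make the "inside or outside the triangle" point concrete, here is a minimal sketch in Python. It assumes each reference population and each component has already been reduced to a 2D position (for example, PCA coordinates). The corner coordinates and the function names `barycentric` and `inside_gamut` are hypothetical and are not taken from Globe13 or the K8 test. A point can be written as a non-negative mix of the three corners only if all three of its barycentric weights are non-negative.

```python
import numpy as np

def barycentric(point, a, b, c):
    """Return weights (wa, wb, wc), summing to 1, with wa*a + wb*b + wc*c == point."""
    a, b, c, p = (np.asarray(v, dtype=float) for v in (a, b, c, point))
    # Solve wa*(a - c) + wb*(b - c) = p - c for the first two weights.
    m = np.column_stack([a - c, b - c])
    wa, wb = np.linalg.solve(m, p - c)
    return np.array([wa, wb, 1.0 - wa - wb])

def inside_gamut(point, a, b, c, tol=1e-9):
    """True if the point is a non-negative mixture of the three corners."""
    return bool(np.all(barycentric(point, a, b, c) >= -tol))

# Hypothetical 2D positions for the three corners (NOT real PCA output).
ANE, WHG, ENF = (0.0, 1.0), (-1.0, 0.0), (1.0, 0.0)

print(inside_gamut((0.0, 0.5), ANE, WHG, ENF))   # a point within the triangle
print(inside_gamut((0.0, -0.5), ANE, WHG, ENF))  # a point beyond the WHG-ENF edge
```

A component that lands outside the triangle gets at least one negative weight, which is the sense in which it "cannot fit" as a mix of the three corner populations.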

Davidski said...

I can try running synthetic samples made from the K15 allele frequencies with the K8. If they don't skew the test I'll be able to plot them within the K8 triangle.

It might take a few days though.

ryukendo kendow said...

@ Davidski
Thank you David.

@ Matt
Ahh I see. Kinda like primary colors and gamuts.

ryukendo kendow said...

I'm going to make a final hypothesis for ADMIXTURE results for Yamnaya. I might be totally off, but let's just see how much inference from modern populations takes us. I'll be using Eurogenes k15.

For the ANE/WHG components, East Euro will dominate by far, being around 2X more than any of the next most common components, which will be North Sea, Atlantic and Baltic. North Sea would be higher than either Atlantic or Baltic, which will both be low. I'm sorry David, I'll have to disagree with you on this one.

For the agricultural components, there will be at most a tiny fraction of west med, and same for east med. Most of it will be accounted for in West Asian, which will have a ratio to East Euro anywhere in the range of 1:2 to 4:5, but probably on the higher side.

I'm going to hypothesize further that modern East Euros differ from Yamnaya mostly in having more baltic, pulling them in the WHG direction; also that West and Central Euros differ mostly in more Atlantic, North Sea and the two Meds, pulling away from the ANE direction; also Central Asians mostly in more West Asian, away from the WHG direction. Lastly, I think Corded Ware finds will differ by region more than Yamnaya finds will.

Davidski said...

I just realized something. Maybe the author of this article wasn't talking about ANE when she said it reached a peak of 29% in Lithuania and was lowest in southern Europe, but rather about EHG?

video.sciencemag.pnw.orc.scoolaid.net/content/345/6201/1106.full.pdf

ryukendo kendow said...

@ Davidski
Interesting.

David, why do you think KO1 did not have ANE?

Davidski said...

I don't think there ever was any ANE among continental European hunter-gatherers west of the Dnieper.

This might have something to do with the microblade cultures that potentially spread ANE into Europe after the LGM. For whatever reason they didn't manage to penetrate into Central Europe and the Balkans.

Matt said...

@Ryukendo "Ahh I see. Kinda like primary colors and gamuts."

Good example, I didn't know of that analogy at all. That's essentially how I'd think of it.

For real implications, it wouldn't mean that the Eurogenes K15 ADMIXTURE components can't be approximated by ANE, WHG, Near Eastern. Although they might end up forming their own clusters when Eurogenes K15 simulations are put with ANE, WHG and Near Eastern simulations at high K15, or they also might show with some non-European components at K8 (as Oetzi seems to have placed based on David's comment in the post above).

Just that as the Eurogenes K15 might be outside of the genetic "gamut" of K8's ANE, Near Eastern, WHG, if you for example then went "Right, as Czechs are X Baltic and Baltic is X WHG, Y Near Eastern, Z ANE, etc." over all components in Czechs to try and work out the ANE, WHG, Near Eastern proportions in Czechs from that, it might be "off" a bit what the new K8 test finds directly.

ryukendo kendow said...

@ Davidski
Read the link. Weird, that figure is neither the 36% nonlocal nor the ~15% ANE. The facts in the labs are probably changing as we speak.

@ Matt
Thanks!

Agree, everything's happening in a large number of dimensions here. E.g. the East Med component is definitely highly NE, but can't be just that, because it peaks in Iranian Jews specifically. So it probably tracks NE generally in a 'larger dimension' where most of the variation in the dataset is found, but tracks something where Iranian Jews and other Zagros/South Cauc pops differ from Levantines and Arabs in the NE in a 'smaller dimension', where that difference is uncovered specifically.

By the way, that's why I don't agree that North Caucasus- or West Asia-centered components are the result of drift only. Those that are, such as Kalash- or Palestinian-centered components, are usually only found in those pops, and in very low levels in a few other pops, obviously capturing a dimension where Kalash/Palestinian differ from everyone else. But West Asian/Caucasus does not behave like that.

@ Davidski
I'm beginning to think that KO1 might have carried a trace of ANE. In the plots with Loschbour and KO1, KO1 scores directly south of Loschbour, while the 0-ANE isocline slopes gently to the 'southwest' from Loschbour.
https://drive.google.com/file/d/0B9o3EYTdM8lQXy1fbWd5RkpxQ1U/view

Felix's ADMIXTURE for KO1 produced a slice of East Euro. I'm aware of course that low data quality might have played a role here.
http://www.fc.id.au/2014/11/analysis-of-hungarian-neolithic-ko1.html

Just as confirmation, could you plot him in the triangle plot you have above in this post?

Davidski said...

KO1 plots on the WHG dot because he comes out 100% WHG in this test.

ryukendo kendow said...

@ David
Oops, missed it.

Ok, where does a synthetic WHG from this run plot in a normal PCA with Loschbour and KO1? Could you try this?

Thanks.

Davidski said...

WHG samples cluster with Loschbour. I can't put KO1 on this plot because it's low coverage and doesn't have enough markers.

https://drive.google.com/file/d/0B9o3EYTdM8lQc3k1dGdoR01aNk0/view?usp=sharing

ryukendo kendow said...

@ Davidski

David, I'm sorry for raising this old issue again, but judging from a comparison between the pca you just posted and the one for KO1 vs Loschbour, there seems to be no way we can dismiss the possibility of ANE in KO1 out of hand.

https://drive.google.com/file/d/0B9o3EYTdM8lQXy1fbWd5RkpxQ1U/view

Furthermore, KO1's admixture results resemble those of Motala much more than those of Loschbour.

I'm aware that this doesn't mean a great deal in terms of sources for ANE ancestry in Europe, but if ANE is found conclusively in KO1 in even tiny amounts, then an EHG-like continuum might have extended further west than we thought, especially considering how close the Karelians are to mainland Europe already, and this EHG might force higher levels of EHG in Northeastern Europeans than from Yamnaya alone.

You mentioned you might do f4 stats with KO1, Loschbour/Brana and Mal'ta, in the Hungarian genomes article, but I'm not sure if these were ever posted. Could you post them here?

Thanks!

Davidski said...

Well, the K8 shows 0% ANE for KO1, so it must be close to 0%. PCA based on pairwise IBS aren't always straightforward to interpret, because it's often difficult to spot the direction of gene flow. I have a hunch that KO1 is closer to modern Europeans because groups more closely related to KO1, rather than Loschbour or La Brana-1, contributed more gene flow to modern Europeans.

Anyway, I think these f3 stats are hard to argue with.

https://drive.google.com/file/d/0B9o3EYTdM8lQM1BONmpnUXZEeTg/view?usp=sharing

https://drive.google.com/file/d/0B9o3EYTdM8lQLTJoSzVTcmFNZzQ/view?usp=sharing

The only way this could happen is if KO1 had significant Basal Eurasian admixture, which is unlikely. If he had both ANE and clear Basal Eurasian he'd be pretty close to modern Europeans, and that would've been obvious by now, not only to us.

Davidski said...

By the way, this is what synthetic individuals made from the West Eurasian K15 allele frequencies get in the K8 test.

https://docs.google.com/spreadsheets/d/1zgol5rF0jWRugwUlsOCeObVZ0opPVxFLjQ_LPIgPcds/edit?usp=sharing

Needless to say, they all look more or less like modern West Eurasian samples and fit well into the fateful triangle.

ryukendo kendow said...

@ Davidski

Thanks for the stats! Now we have evidence against a great deal of ANE introgression into the pop that KO1 belonged to.

It seems like the ANE-WHG continuum stopped much further east in the mainland, but extended into the Scandinavian peninsula and surrounds. What I would really like to see are samples from the baltic region, which is, after all, very close to Karelia.

The new figures are beautiful confirmations of what many of us here have expected about the Eurogenes K15, with Baltic being WHG-rich and ANE-poor and East Euro ANE-rich and WHG-poor relative to each other. West Asian becomes the most ANE of all components, which, to me at least, adds to the inference that this component is tracking autosomal influence from the steppe in Europe, in addition to latter-day historical movements in the Med, which seem to be represented by East Med as well.

ryukendo kendow said...

BTW, none of the components except West Asian and East Euro have an ANE total ratio close to 0.25, which is what we had from the last estimate for Yamnaya, so attempting to 'make' a genome with the ratios provided by Chad using the figures here is a pointless exercise unless one makes it almost 100% West Asian + East Euro.

If the EHG are even higher in ANE:WHG ratio than this, as I suspect, it is even further out of scope. But this is expected, as no modern pops have ANE anywhere near that high, and these components are based on moderns.

Chad Rohlfsen said...

Hmmm..
Could the East Euro be covering the East Eurasian of IR1?

Chad Rohlfsen said...

Ryu,
If the EHG mixed with Armenians, makes a Yamnaya, then the EHG are probably less than 40%ANE. We can't keep Corded in North Central Europe if Yamnaya is over 26% ANE. They may be as low as 22-23% ANE.

Yes, trying to mix and match components may be pointless. They probably will have some West Asian and East Euro. I wouldn't doubt a bit of East Med and Baltic as well.
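
The arithmetic behind a bound like "EHG less than 40% ANE" can be sketched as a simple two-way mixture. This is only an illustration: the 50/50 split and the Armenian ANE value below are assumptions made up for the example, not measured figures, and the function names are hypothetical.

```python
def mixture(frac_a, value_a, value_b):
    """Ancestry share in a two-way mix: frac_a of source A plus (1 - frac_a) of source B."""
    return frac_a * value_a + (1.0 - frac_a) * value_b

def required_in_a(target, frac_a, value_b):
    """Solve mixture(frac_a, x, value_b) == target for x."""
    if not 0.0 < frac_a <= 1.0:
        raise ValueError("frac_a must be in (0, 1]")
    return (target - (1.0 - frac_a) * value_b) / frac_a

# Assumptions for illustration only:
YAMNAYA_ANE = 0.26   # the upper figure mentioned in the comment above
EHG_FRACTION = 0.5   # ASSUMED 50/50 split between EHG and an Armenian-like source
ARMENIAN_ANE = 0.12  # PLACEHOLDER, not a measured value

ehg_ane = required_in_a(YAMNAYA_ANE, EHG_FRACTION, ARMENIAN_ANE)
print(f"EHG would need about {ehg_ane:.0%} ANE under these assumptions")
```

Lowering the Yamnaya target or raising the placeholder Armenian value pushes the required EHG figure down, which is the direction of the argument above.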

Chad Rohlfsen said...

We will probably see Amerindian and Siberian to cover the rest of the ANE.

Helgenes50 said...

David,

About the synthetic individuals made from the West Eurasian K15 allele frequencies

for the ANE, all the percentages are close to those of the populations with the same Name, for instance Eastern Euro 24 %

There is an exception, the Atlantic which peaks among the Basques with 12 % ?

Matt said...

Thanks for that David, it does seem like the components are mostly exactly where you'd predict them to be, e.g your West Asian component at 26% ANE, 71% ENF.

At the same time it does seem like the Eurogenes K15 and K8 components have to measure *slightly* different quantities.

To demonstrate, Sardinian in K15 has a population average of like 21% Atlantic, 17% East Med, 5% North Sea and 2.5% Red Sea, and the rest more or less West Med.

That should translate into a Sardinian population average of about 5% ANE, if you translate from the proportions above, because all those components except Red Sea have a fair amount of ANE (I don't want to reproduce all those sums here and clog up the comments).

Whereas Sardinian directly scores no ANE in the K8 at all.
5% ANE would place a population essentially between K8 Sardinian and K8 South Italian.
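
The translation described here is just a weighted sum, and a rough sketch in Python looks like this. The Sardinian K15 averages are the ones quoted above, with West Med taken as the remainder. Only the Atlantic ANE figure (12.01%) appears in this thread. The other per-component ANE values are placeholders and should be replaced with the spreadsheet values before reading anything into the result. The function name `implied_share` is hypothetical.

```python
def implied_share(component_mix, ancestry_per_component):
    """Weighted sum over components: share of component * ancestry within that component."""
    missing = set(component_mix) - set(ancestry_per_component)
    if missing:
        raise KeyError(f"no ancestry value for: {sorted(missing)}")
    return sum(share * ancestry_per_component[name]
               for name, share in component_mix.items())

# Sardinian K15 population averages as quoted above; West Med is "the rest".
sardinian_k15 = {"Atlantic": 0.21, "East_Med": 0.17, "North_Sea": 0.05, "Red_Sea": 0.025}
sardinian_k15["West_Med"] = 1.0 - sum(sardinian_k15.values())

# ANE fraction of each synthetic K15 component in the K8 test.
# Only Atlantic (12.01%) is quoted in this thread; every other value is a
# PLACEHOLDER for illustration, not a result.
ane_in_component = {
    "Atlantic": 0.1201,
    "East_Med": 0.05,   # placeholder
    "North_Sea": 0.10,  # placeholder
    "Red_Sea": 0.0,     # placeholder
    "West_Med": 0.01,   # placeholder
}

print(f"implied ANE: {implied_share(sardinian_k15, ane_in_component):.3f}")
```

Comparing this indirect figure with the ANE that a population scores directly in the K8 is what exposes the small mismatch described above.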

This is minor though.

Would be cool if you could do a similar exercise with the other K15 components, mainly because it would be interesting to see how Siberian, Native American and South Asian slice across the ANE+other components. I appreciate it's gotta be time consuming for you.

Chad Rohlfsen said...

Some good news.. Reich said that he will get ahold of people, regarding the Varna and Hamangia remains!

Matt said...

Re: the Eurogenes West Eurasian K15 components and their relationship to the ancient K8 components, thinking on it, one other approach besides using ADMIXTURE clusters might be to put all the synthetic samples from K15 and all the synthetic samples from K8 on the same PCA, along with the modern West Eurasian samples, and see where they sit. Or compare it to K15 synthetics on the triangle - http://i.imgur.com/vvzPMqy.jpg

Davidski said...

It'd be funny if Eastern Euro turns out to be just like a Yamnaya genome.

Helgenes50 said...

I didn't notice that I almost have the same percentage as the Atlantic

Atlantic 12.01 42.17 45.81
FR20 11.89 42.36 45.09

The Atlantic component plots in NW France
although this one peaks among the Basques
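
One simple way to put a number on "almost the same percentage" is the straight-line distance between the two rows. A minimal sketch follows, using the figures exactly as quoted. The first column is ANE, and the other two are left unlabelled because the comment does not name them.

```python
import math

# Rows exactly as quoted above (percentages).
atlantic = (12.01, 42.17, 45.81)
fr20 = (11.89, 42.36, 45.09)

def euclidean(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

print(f"distance: {euclidean(atlantic, fr20):.2f} percentage points")
```

The result is well under one percentage point, so on these three columns the individual and the synthetic Atlantic sample are nearly indistinguishable.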

Davidski said...

But Basques are also more West Med than NW French.

ryukendo kendow said...

@ Chad
That might be the case. I suspect this, however, because circumstantial evidence makes me suspect the link between Yamnaya and modern Eastern Europeans through Corded Ware is weaker than 73%, whatever the link between Yamnaya and CW itself is.

@ Matt
Thanks for the image!

I think the components ADMIXTURE produces only track 3-ratios very roughly, while tracking commonality of descent/contribution far more strongly. I'll bet that using ADMIXTURE results to produce 3-ratios for populations produces high levels of distortion, especially at the edges of the 'cloud' of populations in the dataset, e.g. in pops like Sardinian, Basque, or any of the unadmixed ancient samples.

Judging from the description published for ADMIXTURE in the paper introducing it, in 'edge' cases ADMIXTURE would be forced to use contribution/common descent to attribute components to data that is somewhat outside the region best modeled by the components produced by the algorithm.

Another thing: ADMIXTURE attempts to infer a number of ancestral pops, so the position of Atlantic in the triangle is highly intriguing. Forgive me if this sounds ridiculous, but it almost seems to suggest that Basques are a mix of an intrusive 'atlantic' population into the West Med farmers that existed there prior. Might Atlantic be tracking BB?? Speculation alert.

@ Davidski
BTW, since all synthetic genomes in the HG-dominant components have ANE and thus none are good proxies for a pure WHG genome, it seems that when ADMIXTURE chooses, say, East Euro, or Baltic or North Sea for some HG genome, it might be telling us something about the provenance of the WHG there.

I would bet that the WHG that increased in east Europe post-IE was not for the most part autochthonous to Europe--that would be represented by Baltic, which was always there in the HG genomes, and which fell from BR1 to BR2 to IR1--but was something from EHG in Eastern Euro, which did not exist in large amounts east to Hungary, as seen in KO1, but did exist in SHG and possibly the ancient Karelians and which peaks today in Maris in the Volga region.

Davidski said...

Here are the K15 West Eurasian components on the K8 PCA plots.

https://drive.google.com/file/d/0B9o3EYTdM8lQdEFDU1JkajhoN28/view?usp=sharing

https://drive.google.com/file/d/0B9o3EYTdM8lQM0lWOGVxdGhBRUk/view?usp=sharing

Who wants to make some bets?

Eastern Euro = Yamnaya

Baltic = Battle-Axe Corded Ware

North Sea = Single Grave Corded Ware

Atlantic = Bell Beaker

West Asian = Maikop

And I guess no need to bet that West Med is almost a dead ringer for EEF.

Davidski said...

By the way, I'll run some of the other K15 components later this week.

Chad Rohlfsen said...

What would setting Bedouins to 8% SSA, like your Red Sea, do to the other components? It seems to have that east Eurasian too...

Davidski said...

The Bedouins without ANE do have around 8% of the so called Sub-Saharan component.

But changing that won't have much of an effect as long as it stays under 10%. However, if I reduced the so called East Eurasian admixture in the Bedouins, this would shift them closer to ENF.

Matt said...

Under that model, based on population averages for K15, that would mean

Sardinian = 50% Cardial EEF, 20% Bell Beaker, 17.5% East Med (Phoenician contacts?), 5% Single Grave Corded Ware, 7.5% misc other

French Basque = 45% Bell Beaker, 25% Cardial EEF, 17% Single Grave Corded Ware, 5% Battle-Axe Corded Ware, 8% misc other

Lithuanian = 36% BACW, 20% SGCW, 20% Yamnaya, 16% BB, misc other.

I was thinking Eastern Euro might map instead of Yamnaya to Migration Period expansion from the Volga region. There's only 5% East Eurasian, so there's some but no strong Turkic / Siberian influence which could peg it at early Migration Period, and the most distinctive element of the Volga is still their high European/North Eurasian HG ancestry. While at 27% ENF, Eastern Euro is a little lower on ENF than the estimate from Reich would suggest (should be around 38% based on 50% Armenian?). Could still be fairly close to Yamnaya though.

Thanks for testing the other Eurogenes K15 components.

Davidski said...

The Eastern Euro has 5% of the East Eurasian probably because it's in large part derived from modern Volga-Ural groups, like the Uralic Erzya and Turkic Chuvashs. This is why ancient genomes are so much better.

Gill said...

David does that mean that those components aren't the only ANE ones or that ANE overall is severely underestimated in K15 compared to K7/K8?

Gill said...

Referring to the spreadsheet of synthetic K15 individuals put through K8.

Davidski said...

The Amerindian, Siberian and South Asian K15 components also carry ANE, but I haven't analyzed them yet with the K8.

Davidski said...

Someone should blog about this. Unfortunately it's somewhat outside of my scope.

Woodland modification in Bronze and Iron Age central Anatolia: An anthracological signature for the Hittite state?

http://www.sciencedirect.com/science/article/pii/S0305440314004828

Davidski said...

OK, I've analyzed them now. Please let me know if there seem to be any errors.

https://docs.google.com/spreadsheets/d/1zgol5rF0jWRugwUlsOCeObVZ0opPVxFLjQ_LPIgPcds/edit?usp=sharing

ryukendo kendow said...

@ Davidski
There seems to be a very glaring thing that's kinda off. MA-1 scores in all the clusters that contain ANE, except one, West Asian, despite West Asian being the cluster that is assigned the highest ANE score in all the components.

Do you think you might have unearthed signs of population structure in the ANE population? Maybe MA-1 forms a clade with the ANE found in old South Asians like Pulliyar, who have high South Asian, plus EHG ANE and Amerindian ANE and Siberian ANE, to the exclusion of ANE in West Asia?

Tesmos said...

Davidski,

Interesting so North Sea could be truly associated with Single grave culture after all? Is my 42-44% North Sea score (mostly) derived from the single grave culture?

Chad Rohlfsen said...

North Sea can't just be for single grave. It's too high in Beaker areas, like Britain. I'll bet it's split between the two.

Seinundzeit said...

David,

Thanks for posting the full results!

RK,

If I recall correctly, MA1 does score West Asian in K13 (9%). "West Asian" clusters in other calculators are quite different from K13's component. For example, HarappaWorld's "Caucasian" component is probably close to 0% ANE, and is strongly ENF-shifted. I guess it all depends on which part of the Caucasus the "West Asian" component in question peaks. I'm 40% "West Asian" for K13 (it's my largest component), but only 18% "Caucasian" for HarappaWorld, which has to be due to differences in ANE-shift for these clusters. Although, I can't recall if MA1 scored "West Asian" in K15? If not, I guess that 70% ENF for "West Asian" makes it too different for MA1 to register it.

Just a pedantic note, but the most ANE-admixed Eurasian component is "South Asian", which is 2.3% ahead of "West Asian" in this regard. I think it's interesting how these components are behaving like extreme versions of actual populations. The West Asian component could just be a Northeastern Caucasian individual (but without WHG), while the South Asian component is just like a scheduled caste/tribal South Indian individual.

Chad Rohlfsen said...

I'm going to go out on a limb here, and say that Bell Beaker will be about ...

40% North Sea
30% Atlantic
10% Baltic
10% East Euro
6% West Med
4% West Asian

Chad Rohlfsen said...

http://haldanessieve.org/2015/01/07/the-effect-of-the-dispersal-kernel-on-isolation-by-distance-in-a-continuous-population/

Davidski said...

I'm not saying Single Grave will score 100% North Sea. What I'm saying is that Single Grave genomes will score similar percentages to those of the North Sea synthetic sample in the K8 test.

Chad Rohlfsen said...

Gotcha. Yes, it will probably be close to there.

ryukendo kendow said...

@ Sein
Thanks for your explanation. The West Asian in K15 however, which is what I am referring to, peaks in the North Caucasus, and is the second most ANE component after South Asian. The ANE-poor 'West Asian/Caucasus' in other runs that peaks in South Caucasians like Georgians or Armenians finds its counterpart here in East Med, I think, and finds its way into Sardinians and Neolithic individuals from central Europe, presumably from the Danubian as opposed to the Cardial stream.

The ENF thing might be correct though, as neither WHG nor East Eurasian are as far from ANE as ENF. I'm still kinda dissatisfied however, as less extreme 'North Caucasian' versions of West Asian score in MA-1 in other runs in published papers.

Also, maybe it's nothing. But substructure in ANE would certainly be intriguing.

Chad Rohlfsen said...

David,
Did you notice any differentiation within ANE, while pulling it from various populations? For instance, was there some distance or separate clusters between Amerindians, Motala, and South Central Asia?

Davidski said...

I never noticed ANE splitting up into North Euro, South Asian and Amerindian etc. versions. It's always more or less the same cluster.

However, sometimes it spills over into WHG in Europe, and then it falls in frequency everywhere else, or into East Eurasian in the Americas and Siberia, and then likewise it falls in frequency everywhere else. That's pretty much it.

Gill said...

One thing I didn't really understand: should Mal'ta score 100% ANE, or are the GEDmatch K7 results more accurate, where it's like 50% ANE, 33% WHG, 10% ASE, etc.? What does the latter represent? Just some kind of affinity to these other components, or does it represent a common ancestor of all three?

Gill said...

Also, I don't suppose it'd be possible to make a reverse calculator spreadsheet that derives modern components (K15 or K13) from the ANE K7/K8 results?

Davidski said...

Mal'ta boy won't score 100% ANE at GEDmatch because the file over there doesn't come from the sequence that had most of the contamination taken out.

But even if it did, there'd be other problems like lots of missing markers and mostly low coverage calls.

In other words, the results are largely spurious and there's no way to get them right.

Davidski said...

Here's the K8 for Kostenki14

ANE 0.131916
South_Eurasian 0.076173
Near_Eastern 0.279334
East_Eurasian 0.050992
WHG 0.303079
Oceanian 0.068435
Pygmy 0.018069
Sub-Saharan 0.072002

He basically clusters where he was buried.

http://imageshack.com/a/img540/4461/hXZziU.png

http://imageshack.com/a/img633/9951/0I4Xle.png
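
As a quick sanity check on the Kostenki14 figures above, the sketch below confirms that the eight K8 proportions sum to 1 and then rescales the three West Eurasian components (ANE, Near_Eastern, WHG) to sum to 1 on their own. The rescaling is only a convenience for comparing him against the triangle. It is an assumption of this example, not the way the sample was placed on the plots.

```python
# K8 proportions for Kostenki14, copied from the list above.
k14 = {
    "ANE": 0.131916,
    "South_Eurasian": 0.076173,
    "Near_Eastern": 0.279334,
    "East_Eurasian": 0.050992,
    "WHG": 0.303079,
    "Oceanian": 0.068435,
    "Pygmy": 0.018069,
    "Sub-Saharan": 0.072002,
}

total = sum(k14.values())

# Share held by the three triangle components, and their relative weights.
west_eurasian = ("ANE", "Near_Eastern", "WHG")
subtotal = sum(k14[name] for name in west_eurasian)
rescaled = {name: k14[name] / subtotal for name in west_eurasian}

print(f"all components sum to {total:.6f}")
print(f"triangle components account for {subtotal:.1%}")
for name, value in rescaled.items():
    print(f"  {name}: {value:.1%} of the West Eurasian part")
```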

Matt said...

Thanks for the further stats. That the ANEness of Siberian is as high as North Sea and Baltic is a little surprising, and it's lower in Amerind than I would've thought. South Asian's very split, suggesting it either isn't well characterized by K8 or is pretty admixed.

Any chance of finishing off the remaining 4 components? Just for completeness and to see what the total conversions are like.
For the most part I don't think there'll be any surprises - Oceanian will just be Oceanian, Sub-Saharan may fall into either Pygmy or Sub-Saharan, while South East Asian should split between South Eurasian and East Eurasian. What I'd expect to see would be Northeast African splitting into ENF and Sub-Saharan at this level.

ryukendo kendow said...

@ Davidski
I agree with Matt, I would especially like to see how Northeast African turns out. Could you do the rest? Thanks!

BTW, the fact that Kostenki doesn't seem to hold any basal Eurasian in treemix and formal stats, but does have some Crown Eurasian that localises into Middle Easterners today, makes me think that something like K14 was responsible for the UHG and miscellaneous Crown Eurasian in Middle Easterners today.

ryukendo kendow said...

@ Matt
Thanks for the analysis. What is the difference between the pca with unadjusted fsts and that with adjustments? Could we overlay them?

Matt said...

@ Ryu,

Here is the PCA of distances between the components with unadjusted FST -

http://i.imgur.com/gkwhiH7.jpg or zoomed out http://imgur.com/0YHb09s

First two dimensions take up approx 70% of the variance. 3 & 4 (http://imgur.com/hKQBjgK) take the total up to around 90%.
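
For anyone wanting to try something similar, one common way to turn a matrix of pairwise distances into plot coordinates with a "variance explained" per dimension is classical multidimensional scaling. The sketch below is a swapped-in illustration of that technique and not necessarily the procedure used for the plots above. It treats the FST values as plain distances, and the tiny matrix is a made-up toy, not real FST data.

```python
import numpy as np

def classical_mds(dist):
    """Classical MDS: coordinates and per-dimension variance share from a distance matrix."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Double-centre the squared distances to get a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    positive = np.clip(eigvals, 0.0, None)     # drop negative (non-Euclidean) parts
    coords = eigvecs * np.sqrt(positive)
    explained = positive / positive.sum()
    return coords, explained

# Toy distance matrix for three hypothetical components lying on a line.
toy = [[0.0, 1.0, 2.0],
       [1.0, 0.0, 1.0],
       [2.0, 1.0, 0.0]]

coords, explained = classical_mds(toy)
print(np.round(explained, 3))
```

With a real FST matrix, the cumulative sum of `explained` gives figures comparable to the "first two dimensions" and "3 & 4" percentages quoted above.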

Matt said...

@ Ryu - here are the overlaid plots:

http://imgur.com/EFtPcTP and http://imgur.com/jJTS3fj

The second of these may be more interesting to look at because I reversed one of the axes so that the West_Med_>West_Asian dimensions run in the same direction.

Chad Rohlfsen said...

David,
Where would these two samples plot?

58% ENF, 35% WHG, 7% ANE

42% ENF, 46% WHG, 12% ANE

Davidski said...

They plot here...

https://drive.google.com/file/d/0B9o3EYTdM8lQVGIxQ2FZLVZGVGs/view?usp=sharing
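
If you want to place three-way proportions like these on a plain triangle yourself, the standard ternary-plot conversion is enough. A minimal sketch follows, assuming an equilateral triangle with WHG at the bottom left, ENF at the bottom right and ANE at the top. That layout is an arbitrary choice for the example, and the result is not the same thing as the rotated PCA plots in the post.

```python
import math

def ternary_to_xy(enf, whg, ane):
    """Map (ENF, WHG, ANE) proportions to x, y inside an equilateral triangle."""
    total = enf + whg + ane
    if total <= 0:
        raise ValueError("proportions must sum to a positive number")
    enf, whg, ane = enf / total, whg / total, ane / total
    # Corners: WHG at (0, 0), ENF at (1, 0), ANE at (0.5, sqrt(3)/2).
    x = enf + 0.5 * ane
    y = (math.sqrt(3) / 2) * ane
    return x, y

# The two samples from the question above, as (ENF, WHG, ANE) percentages.
samples = {"sample_1": (58, 35, 7), "sample_2": (42, 46, 12)}

for name, (enf, whg, ane) in samples.items():
    x, y = ternary_to_xy(enf, whg, ane)
    print(f"{name}: x={x:.3f}, y={y:.3f}")
```

The function normalises its inputs, so percentages and fractions give the same position.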

ryukendo kendow said...

Wonderful!

Gonna read the paper quite carefully. In any case, the tree models are gorgeous. The positions of Ust Ishim and Kostenki that we have reached on this blog so many months before have been confirmed! Congrats to all who put their heads together, and to you too David!

It's also great that they're adding branches to ANE-WHG, making it more like the bush it was in real life. And also the new f4's. I expect such methodological progress to really pay off in the future.

First comments.

This throws YDNA into the air. R1b in yamnaya, but not Euro R1b? And found today in pops with almost no WHG too, many non-IE.
And something extremely glaring. The R1b in Spain is very divergent, occurs in I0410 which is from a neolithic context, and I0410 does not differ from the rest of the neols at all in ADMIXTURE--they're really homogenous, suggesting the presence of this Y-marker in his pop at least >several centuries there. Considering the fact that no basal R1b exists in Europe really, the closest archaeo R1b is so far away, I'm beginning to think that R1b in C+W+S Asia might be quite old? Such that ENF pops can have it with almost no ANE.

But the autosomal picture confirms what we already know.

Their autosomal affinities are estimated wrt modern components as West Asian and E.Euro, in approx equal parts, and their ENF ancestry is not from Europe.

Why did they project their PCA samples? Pure ANE turns up in their PCA closer to E. Euros than French are to E Euros. That pca is useless. Facepalm.

David, you need to redo some of their work lol.

ANE levels turn out to be approx 27-28%, pulling off their f4 method and the tree, which is the low end of my estimated range and they're further east like I said, Hooray!, but we need to see how it turns out in David's analysis.

In any case, I would caution against paying too much attention to the YDNA results. The autosomal more or less shows that these people, or something very, very like them, influenced Europe to a great degree. Remember that CW is more Yamnaya than any of the people after, and that Yamnaya ancestry appears suddenly, which makes a migration much more likely than a diffusion scenario.