Saturday, June 25, 2016

D-stats/nMonte open thread #3


For the latest datasheets with D-stats of the form D(Chimp,Columns)(Mbuti.DG,Rows), featuring samples from Lazaridis et al. 2016, see here, here and here.

Datasheets with D-stats of the form D(Chimp,Rows)(Mbuti.DG,Columns) are available here, here and here. D-stats 1 and 1b include Iran_Chalcolithic in both the rows and columns, while D-stats 3 and 3b have Eastern_HG in both the rows and columns.

The interesting question is, which of these sheets is the best for estimating admixture proportions, primarily in populations from West Eurasia?

186 comments:

Matt said...

Not using these for modelling at the moment, however, some quick observations:

- If you take Iran_Chalcolithic (column) - Kotias (column) then the populations most at the Kotias end are Steppe_EMBA / Steppe_LNBA, WHG, EHG, Caucasians, while populations at the Iran_Chalcolithic end are Levant_Neolithic, EEF, BedouinB, and to a lesser extent Iran_Neolithic. These and some Mediterranean populations are really the only populations that share more with Iran_Chalcolithic than with Kotias.

So yeah, the Iran_Chalcolithic has too much Levant_Neolithic in it to form such a strong dimension of its own as Kotias, just as we'd expect from the PCA. Shame the Iran_Neolithic isn't in good enough shape to form a column as well as a row. Also CHG probably does have some extra EHG in it to attract it to the Steppe.

- Conversely to this, only BedouinB, Israel_Natufian, Somali, Moroccan, Masai, Tunisian, Loschbour, Esan and Yoruba share more with BedouinB than Iran_Chalcolithic. Seems BedouinB is very isolated and probably has some, quite low, African ancestry?

- Iran_Chalcolithic vs Georgian drift isolates West Europeans, some Steppe and WHG at one end, vs the ancient Middle East at the other end, with only ancient Iranians sharing more drift with Iran_Chalcolithic than with Georgians.

- Finally, comparing Anatolia_Neolithic to Iran_Chalcolithic, all populations are closer to Anatolia_Neolithic, except the Indians, Satsurblia, and the ancient Iranians themselves. Even present day Caucasians come out closer to Anatolia_Neolithic.

Euro HG admixture, of any sort, seems to have a very large effect on increasing drift sharing! I guess this reflects that Euro HG, whether east or west, were a relatively small population size, closely related grouping, and any ancestry from their clade draws a sample quite strongly to them and each other.
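
The column comparisons in this comment amount to subtracting one D-stat column from another and sorting the rows by the result. A minimal sketch, assuming the sheet has been loaded into a dict of rows; `column_contrast` is a hypothetical helper name and the values are made-up placeholders, not figures from the datasheets:

```python
def column_contrast(sheet, col_a, col_b):
    """Return (row, col_a - col_b) pairs, sorted from most col_b-like to most col_a-like."""
    diffs = [(row, stats[col_a] - stats[col_b]) for row, stats in sheet.items()]
    return sorted(diffs, key=lambda pair: pair[1])


# Placeholder values for illustration only; real values come from the D-stat sheet.
sheet = {
    "Pop_A": {"Iran_Chalcolithic": 0.390, "Kotias": 0.401},
    "Pop_B": {"Iran_Chalcolithic": 0.385, "Kotias": 0.380},
    "Pop_C": {"Iran_Chalcolithic": 0.372, "Kotias": 0.372},
}

for row, diff in column_contrast(sheet, "Iran_Chalcolithic", "Kotias"):
    print(f"{row:8s} {diff:+.4f}")
```

Rows at the top of the printout share relatively more with the second column, rows at the bottom with the first.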

Seinundzeit said...

David,

Thanks!

Great stuff, this allows a more intensive exploration of possibilities, compared to PCA.

Kalash:
63.55% ancient Eastern Europe/Steppe
20.40% Iran_Neolithic
16.05% Munda

Pashtun
56.35% ancient Eastern Europe/Steppe
22.05% Iran_Neolithic
21.60% Munda

I'll try some other combinations.

Matt said...

To maybe inform models to test, here are some PCA of the data:

http://i.imgur.com/h0gufw8.png

WHG are probably somewhat stretched away from other samples by use of Bichon (which they share more with than other Villabruna / WHG clade do) and EHG and ANE are probably closer due to lack of columns in this datasheet. Still kind of interesting.

Made me think that if the admixture into Yamnaya were direct from a very basal Iran_Neolithic clade, then the kind of sex biased explanations for their levels of R1 might not be required. If you only had 10% from a very basal farmer population into Steppe Eneolithic then 10% more into Yamnaya, it's much more possible for their y lineages to simply go extinct, without any sex bias. But I guess that wouldn't explain Neolithic mtdna.

http://i.imgur.com/iBJnSYI.png - First PCA from above with an South Asian cline superimposed

It actually sort of shocks me how distant the earliest farmers appear on some of these PCA made from these stats (I used the first sheet, btw). On the last I posted up, the Levant Neolithic appears about as distant from Iberia_EN as Villabruna is (and the Natufians are more distant from Iberia_EN than Villabruna). Iran_Neolithic is far further from the modern Caucasus than Yamnaya is.

On the last note, Sein, if you have time, would you mind trying a model of Lezgin as Steppe_EMBA plus Iran_Neolithic and Levant_Neolithic?

Seinundzeit said...

Assuming a model in which ASI=ENA, these are interesting fits.

Munda
60.45% ENA (mostly Ami, with some substantial Papuan)
26.15% Iran_Neolithic
13.40% ANE

Kalash
36.15% Iran_Neolithic
33.05% ANE
16.55% WHG
14.25% ENA (only Ami, doesn't receive any Papuan)

Pashtun
38.7% Iran_Neolithic
29.80% ANE
16.05% WHG
15.45% ENA (only Ami, doesn't receive any Papuan)

GujaratiB
38.35% Iran_Neolithic
29.9% ANE
21.70% ENA (only Ami, doesn't receive any Papuan)
10.05% WHG

This seems very reasonable.

If we assume that GujaratiB are comparable to UP Brahmins, and ASI=ENA, then David is right that the Indians on his West Eurasian plot are probably less than 25% ASI.

But that is assuming that ASI is ENA, which might not be the case.

Matt,

Sure thing.

I tried it, but the model seems odd:

72.1% EMBA Steppe
24.6% Levant_Neolithic
3.3% Ulchi
0% Iran_Neolithic

It's not a great fit, in terms of distance.

Just for comparison, the same setup, but with the Kalash. Oddly, this works slightly better for the Kalash than it does for Lezgins :

59% EMBA Steppe
20.3% Iran_Neolithic
15.35% Ulchi
5.35% Levant_Neolithic

This is a much better model for Lezgins, in terms of distance:

37.25% Armenian_Chalcolithic
25.60% Armenia_MLBA
20.70% EMBA Steppe
11.75% Anatolia_Chalcolithic
4.70% Ulchi
0% Armenian_EBA
0% Iran_Chalcolithic

Matt said...

Thanks Sein, re: the first fit for Lezgin, while it seems unintuitive, it would make a lot of sense given the relative positions of Levant_Neolithic and Steppe_EMBA on PCA 1+2 from these D-stats, to illustrate:
http://i.imgur.com/xawh5j4.png

On that PCA (and therefore the underlying stats) Lezgin could be modelled as an admixture from Levant_Neolithic to Afanasievo and really is much closer along that line to Afanasievo, so no surprise it comes out in the proportions in nMonte that it does, when given those choices. (Although when looking at the higher dimensions of data it's probably not an ideal fit, as you describe).

By these methods "The World's First Farmers" seem extremely distant. Even from present day people of the Near / Middle East.

Seinundzeit said...

Matt,

Good points.

For whatever it's worth, I tried to create models comparable to what we see in the preprint:

Kalash
59.6% BA Steppe (mostly Andronovo, with substantial Afanasievo)
27% Iran_Neolithic
13.4% ENA

or

50.60% Iran_Late_Neolithic
37.65% Samara_Eneolithic
11.75% ENA

or

56.65% Iran_Late_Neolithic
31.15% Eastern_HG
12.2% ENA

Very similar to what we see in the paper.

The ENA levels here seem much more reasonable, the paper's estimates are somewhat higher than what is usually seen.

Also, Andronovo/Sintashta do well for Central/South Asians, with this method, even though the paper found a preference for EMBA steppe populations.

Matt said...

So, just trying a couple of models for Yamnaya_Samara and Iberia_MN.

First using as calc the Euro HGs, the early Neolithic Near East, Satsurblia, BedouinB and Ami as an East Asian extra:

Yamnaya_Samara: Eastern_HG 54.5, Satsurblia 36.7, Hungary_HG 6.65, BedouinB 2.15 - distance% = 2.438 %

(Biggest differences look like the model is more Kotias column, underfitted to India_South, Iran_Chalcolithic, Georgian, Anatolia_Neolithic, BedouinB)

Iberia_MN: Hungary_HG 44.2, Levant_Neolithic 35.85, Satsurblia 19.95 - distance% = 4.9998 %

(particularly underfitted to Iberia_EN2, BedouinB, Anatolia_Neolithic column, more slightly overfitted to Bichon).

Taking out Satsurblia and BedouinB as calc populations:

Iberia_MN: Levant_Neolithic 48.5, Hungary_HG 45.3, Eastern_HG 5.4, Ami 0.8 - distance% = 5.4208 %

Yamnaya_Samara: Eastern_HG 74.25, Levant_Neolithic 18.65, Iran_Neolithic 5.15, Hungary_HG 1.95 - distance% = 4.235 %

Also, modeling Satsurblia, Anatolia_Neolithic and Samara_Eneolithic under the same conditions:

Satsurblia: Iran_Neolithic 59.75, Eastern_HG 30.05, Hungary_HG 7.25, Levant_Neolithic 2.95 - distance% = 8.4268 %

Anatolia_Neolithic: Levant_Neolithic 73.9, Hungary_HG 21.55, Eastern_HG 4.55 - distance% = 5.1847 %

Samara_Eneolithic: Eastern_HG 85.45, Levant_Neolithic 7.7, Motala_HG 4.75, Ami 2.1 - distance% = 2.736 %

So some results sort of as expected there, some quite surprising (Yamnaya Samara having a preference for Levant_Neolithic, not Iran_Neolithic, but Satsurblia has a preference for Iran_Neolithic). Fits generally seem quite bad for using the first farmers only, indicating drifts really need later / more proximate farmers to work, with these columns which include later Neolithic and post-Neolithic West Eurasian pops.

I think the EHG may be slightly overfitted in these relative to if you had a Samara column and using Karelia in calc.

Couple moderns with the last set of ancestor populations:

Basque_Spanish: Levant_Neolithic 36.1, Hungary_HG 35.25, Eastern_HG 27.55, Ami 1.1 - distance% = 5.9668%

English_Cornwall: Eastern_HG 38.7, Levant_Neolithic 34.55, Hungary_HG 25.7, Ami 1.05 - distance% = 5.2278%

Finnish: Eastern_HG 47.9, Levant_Neolithic 22.8, Hungary_HG 22.65, Ami 6.65 - distance% = 4.6266 %

Moderns don't seem to have any need for Iran_Neolithic, which would be expected via the PCA models where they can be fitted within a triangle of Levant_Neolithic, EHG and WHG. Thinking about it, Yamnaya's result makes sense from this perspective as well, since it also fits in the same triangle. Pretty distant fits though. The actual real populations varied slightly from those samples (Euro HGs plus First Farmers), and then experienced some drift history of their own, even if the proportions from the rough major groups may be roughly correct.

Matt said...

Modeling the recent Levant, Arabian Peninsula, North Africa with the populations:

AG3-MA1, Ami, Denisovan, Eastern_HG, Esan_Nigeria, Iran_Neolithic, Israel_Natufian, LaBrana1, Hungary_HG, Levant_Neolithic, Loschbour, Masai_Kinyawa, Motala_HG, Neandertal_Altai, Villabruna:

Palestinians: Levant_Neolithic 64.3, Iran_Neolithic 26.8, Eastern_HG 5.4, Ami 3.1, Loschbour 0.4 - distance% = 1.9323 %

BedouinB: Levant_Neolithic 87.3, Iran_Neolithic 7.95, Ami 4.1, Eastern_HG 0.65 - distance% = 4.0007 %

Cypriot: Levant_Neolithic 62, Eastern_HG 28.35, Hungary_HG 6.65, Ami 3 - distance% = 5.2898 %

Druze: Levant_Neolithic 68.75, Eastern_HG 27.75, Ami 3.3 - distance% = 4.4076 %

Tunisian: Levant_Neolithic 62.15, Masai_Kinyawa 11.8, Iran_Neolithic 11.75, Esan_Nigeria 4.2, Loschbour 3.55, Ami 3.1, Eastern_HG 2.7, Israel_Natufian 0.75 - distance% = 1.6545 %

Moroccan: Levant_Neolithic 55.35, Masai_Kinyawa 13.55, Iran_Neolithic 7.7, Esan_Nigeria 7.45, Israel_Natufian 6.9, Loschbour 5.15, Ami 2.6, Eastern_HG 1.3 - distance% = 1.5674 %

Slightly unexpected behavior by Cypriot and Druze with the EHG fractions. Bit suspicious of the Ami fractions. Had a go at including Munda as well, in case it was a South Asian effect, but nothing happened there.

Shaikorth said...

Levant_N also took all the SSA in Palestinians. Does it work for Jordanians and BedouinA too?

ryukendo kendow said...

@ Davidski
David, some evidence to support your points about CT influence.

W Iran Chalcolithic:
Ami2 Anatolia_Neolithic2 BedouinB2 Bichon
Yamnaya_Samara 0.3495000 0.3874000 0.3644000 0.4050000
fitted 0.3474262 0.3821304 0.3596636 0.4055374
dif -0.0020738 -0.0052696 -0.0047364 0.0005374
Bougainville2 Cypriot2 Dai2 Eskimo_Naukan Georgian2
Yamnaya_Samara 0.3368000 0.3846000 0.3476000 0.3703000 0.3928000
fitted 0.3353328 0.3787594 0.3471134 0.3707862 0.3849632
dif -0.0014672 -0.0058406 -0.0004866 0.0004862 -0.0078368
Iberia_EN2 India_South2 Karitiana Kinh_Vietnam Kostenki14
Yamnaya_Samara 0.3899000 0.36100 0.3767000 0.3485000 0.3716000
fitted 0.3839462 0.35795 0.3793858 0.3464136 0.3708734
dif -0.0059538 -0.00305 0.0026858 -0.0020864 -0.0007266
Kotias Mansi2 Mixe Mota Motala_HG2 Munda2
Yamnaya_Samara 0.396400 0.389100 0.3765000 0.1327000 0.4202000 0.3488000
fitted 0.396795 0.385652 0.3779894 0.1308236 0.4229992 0.3473238
dif 0.000395 -0.003448 0.0014894 -0.0018764 0.0027992 -0.0014762
Papuan2 Ulchi2 Ust_Ishim Yamnaya_Samara2 Yoruba2
Yamnaya_Samara 0.332200 0.3538000 0.333400 0.4289000 0.1064000
fitted 0.331034 0.3528402 0.333745 0.4126284 0.1048956
dif -0.001166 -0.0009598 0.000345 -0.0162716 -0.0015044
[1] "distance%=2.2517 / distance=0.022517"


Yamnaya_Samara
"Eastern_HG" 57
"Satsurblia" 24.2
"Iran_Chalcolithic" 15.6
"Loschbour" 3.2
"Iran_Late_Neolithic" 0
"Iran_Neolithic" 0
"AG3-MA1" 0
"LaBrana1" 0
"Ulchi" 0
"Nganasan" 0
___________________
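
In the nMonte output above, each `dif` value is the fitted value minus the observed one, so negative entries mark columns the model underfits. A small sketch that ranks them, using five of the columns from the Iran_Chalcolithic run above; `rank_residuals` is a hypothetical helper name, not part of nMonte:

```python
def rank_residuals(columns, observed, fitted):
    """Return (column, fitted - observed) pairs, most underfitted first."""
    difs = [(c, f - o) for c, o, f in zip(columns, observed, fitted)]
    return sorted(difs, key=lambda pair: pair[1])


# Values copied from the Yamnaya_Samara / fitted rows of the run above.
columns = ["Ami2", "Anatolia_Neolithic2", "BedouinB2", "Bichon", "Yamnaya_Samara2"]
observed = [0.3495000, 0.3874000, 0.3644000, 0.4050000, 0.4289000]
fitted = [0.3474262, 0.3821304, 0.3596636, 0.4055374, 0.4126284]

for column, dif in rank_residuals(columns, observed, fitted):
    print(f"{column:20s} {dif:+.7f}")
```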
W Anatolia_N:
Ami2 Anatolia_Neolithic2 BedouinB2 Bichon Bougainville2 Cypriot2 Dai2 Eskimo_Naukan
Yamnaya_Samara 0.349500 0.387400 0.3644000 0.4050000 0.3368000 0.3846000 0.3476000 0.3703000
fitted 0.346844 0.385435 0.3602808 0.4043751 0.3346968 0.3796747 0.3464768 0.3707785
dif -0.002656 -0.001965 -0.0041192 -0.0006249 -0.0021032 -0.0049253 -0.0011232 0.0004785
Georgian2 Iberia_EN2 India_South2 Karitiana Kinh_Vietnam Kostenki14 Kotias Mansi2 Mixe
Yamnaya_Samara 0.3928000 0.3899000 0.361000 0.3767000 0.3485000 0.3716000 0.3964000 0.3891000 0.3765000
fitted 0.3852933 0.3870117 0.356962 0.3796727 0.3458671 0.3710158 0.3969764 0.3859297 0.3781327
dif -0.0075067 -0.0028883 -0.004038 0.0029727 -0.0026329 -0.0005842 0.0005764 -0.0031703 0.0016327
Mota Motala_HG2 Munda2 Papuan2 Ulchi2 Ust_Ishim Yamnaya_Samara2 Yoruba2
Yamnaya_Samara 0.1327000 0.420200 0.3488000 0.3322000 0.3538000 0.3334000 0.4289000 0.1064000
fitted 0.1297136 0.423938 0.3468531 0.3306118 0.3524295 0.3338011 0.4131951 0.1040831
dif -0.0029864 0.003738 -0.0019469 -0.0015882 -0.0013705 0.0004011 -0.0157049 -0.0023169
[1] "distance%=2.1257 / distance=0.021257"


Yamnaya_Samara
"Eastern_HG" 59.9
"Satsurblia" 27.6
"Anatolia_Neolithic" 12.3
"Loschbour" 0.2
"Iran_Late_Neolithic" 0
"Iran_Neolithic" 0
"Iran_Chalcolithic" 0
"AG3-MA1" 0
"LaBrana1" 0
"Ulchi" 0
"Nganasan" 0

ryukendo kendow said...

W Anatolia_Chalcolithic:

Ami2 Anatolia_Neolithic2 BedouinB2 Bichon
Yamnaya_Samara 0.34950000 0.38740000 0.36440000 0.4050000
fitted 0.34755355 0.38885065 0.36250355 0.4032658
dif -0.00194645 0.00145065 -0.00189645 -0.0017342
Bougainville2 Cypriot2 Dai2 Eskimo_Naukan Georgian2
Yamnaya_Samara 0.3368000 0.3846000 0.34760000 0.3703000 0.3928000
fitted 0.3349659 0.3815877 0.34703515 0.3707192 0.3867984
dif -0.0018341 -0.0030123 -0.00056485 0.0004192 -0.0060016
Iberia_EN2 India_South2 Karitiana Kinh_Vietnam Kostenki14
Yamnaya_Samara 0.3899000 0.36100000 0.3767000 0.34850000 0.37160000
fitted 0.3895384 0.35807235 0.3794726 0.34636465 0.37152525
dif -0.0003616 -0.00292765 0.0027726 -0.00213535 -0.00007475
Kotias Mansi2 Mixe Mota Motala_HG2
Yamnaya_Samara 0.3964000 0.38910000 0.37650000 0.13270000 0.42020000
fitted 0.3966289 0.38679335 0.37777255 0.13129195 0.42413395
dif 0.0002289 -0.00230665 0.00127255 -0.00140805 0.00393395
Munda2 Papuan2 Ulchi2 Ust_Ishim Yamnaya_Samara2
Yamnaya_Samara 0.34880000 0.33220000 0.3538000 0.3334000 0.4289000
fitted 0.34792215 0.33024515 0.3531155 0.3330186 0.4140418
dif -0.00087785 -0.00195485 -0.0006845 -0.0003814 -0.0148582
Yoruba2
Yamnaya_Samara 0.1064000
fitted 0.1051131
dif -0.0012869
[1] "distance%=1.8289 / distance=0.018289"


Yamnaya_Samara
"Eastern_HG" 55.55
"Anatolia_Chalcolithic" 24.45
"Satsurblia" 19.1
"Loschbour" 0.9
"Iran_Late_Neolithic" 0
"Iran_Neolithic" 0
"Iran_Chalcolithic" 0
"AG3-MA1" 0
"LaBrana1" 0
"Ulchi" 0
"Nganasan" 0
"Anatolia_Neolithic" 0

Davidski said...

Yeah, I saw that. I'll post a few models in the blog entry when I get around to it. Anatolia Chalcolithic is preferred by Yamnaya to all Iranian and other Near Eastern samples.

ryukendo kendow said...

Haha thought so. Once I saw that I had an idea what your 'in a few hours' would be.

Davidski said...

I just discovered a couple of small errors in the datasheets; the same Karitiana and Kinh_Vietnam were in both the rows and columns, hence they each had an empty cell.

I've made the corrections, and also uploaded a third sheet with Karelia_HG in the rows and Eastern_HG in the columns.

Please let me know if there are any other problems with the sheets and I'll correct them.

Matt said...

Karelia_HG's stats seem a little disordered. There are entries in all columns but it has 0.4512 for Mota and only 0.1264 for Mixe?

Davidski said...

OK, hang on.

Davidski said...

Alright, second attempt. This sheet has Eastern_HG in the rows and Eastern_HG2 in the columns.

https://drive.google.com/file/d/0B9o3EYTdM8lQbWJmbXhFRU0xbU0/view?usp=sharing

a said...

In light of the new clusters/ancient samples, are there any plans to update Eurogenes K7 for GEDmatch?

Davidski said...

I'm still getting to know these samples. Once I figure out what they're about I'll be able to use them accordingly in a new test. The Eurogenes selection at GEDmatch does need updating, so yeah.

Olympus Mons said...

Davidski,

"Anatolia_Chalcolithic" 24.45" .... I am telling everybody. Chalcolithic/Bronze Age was after the population revolution made by the Ubaid expansion. It was a population "nuclear explosion".

It favors AC for the same reason that the highest variance on r1b is precisely where this new R1b sample (I1635) was found and even bigger variance precisely in Anatolia... see this map.

https://www.researchgate.net/figure/269179016_fig6_Figure-5-Geographical-distribution-maps-of-haplogroup-frequencies-and-genetic-variances


Just read the abstract on Hovhannisyan et al.
http://www.ncbi.nlm.nih.gov/pubmed/25452838


Olympus Mons said...

Just to be even clearer...
R1b L23 was born in Western Anatolia by 4.900 bc (ish) (where the highest variance exists) and then moved... I don't care.
R1b L51 was born in the Delta of Nile by 4.300 bc (ish) and by 3500 bc was in Iberia.

Davidski said...

Just to be even clearer...
R1b L23 was born in Western Anatolia by 4.900 bc (ish) (where the highest variance exists) and then moved... I don't care.
R1b L51 was born in the Delta of Nile by 4.300 bc (ish) and by 3500 bc was in Iberia.


Nope.

ryukendo kendow said...

@ Davidski

Is it possible to include Ari Blacksmith and Egyptians in the most recent Dstats sheet?

Olympus Mons said...

@Davidski - Nope?

Oh, we will see my friend, we will see. Unfortunately, as with anything else these days, we seem to have to rely on Germans to clear the shit out (they are the ones running things in Merimde..).

Olympus Mons said...

Minor correction --- NOT western Anatolia... but really eastern Anatolia... really eastern Anatolia (as per map), because that is where bucketloads of diverse R1b cluster by 4.900 bc before moving.

Ariele Iacopo Maggi said...

http://imgur.com/mgk14CR

Davidski said...

@rk

Is it possible to include Ari Blacksmith and Egyptians in the most recent Dstats sheet?

Only Egyptians.

Actually, I forgot to add Balochs and Brahuis to these sheets. I'll fix that tomorrow.

ryukendo kendow said...

@ Davidski

Thanks. If you can add any other South Asians or Afroasiatic populations in the datasheet it'll be great too. The Afroasiatics are especially interesting.

Matt said...

Using new data sheet and only the in theory least admixed calc populations:

AG3-MA1, Ami, Eastern_HG, ElMiron, Hungary_HG, Iran_Neolithic, Israel_Natufian, LaBrana1, Levant_Neolithic, Loschbour, Motala_HG, Villabruna, Yoruba:

Yamnaya_Samara: Eastern_HG 41.85, Motala_HG 21.05, Iran_Neolithic 16.4, Levant_Neolithic 10.7, Hungary_HG 5.8, Ami 4.2 - distance% = 4.9913 %

For Yamnaya Samara, compared to the same models, adding EHG as a column somewhat reduces EHG in favour of Motala and Levant Neolithic in favour of Iran Neolithic.

Iberia_Chalcolithic: Levant_Neolithic 57.3, Hungary_HG 41.85, Ami 0.85 - distance% = 3.8422 %

Esperstedt_MN: Levant_Neolithic 52.8, Hungary_HG 45.45, Ami 1.75 - distance% = 6.1163 %

Cypriot: Levant_Neolithic 63.5, Eastern_HG 16.05, Hungary_HG 14.3, Ami 6.15 - distance% = 5.4246 %

Basque_Spanish: Hungary_HG 42.75, Levant_Neolithic 37.75, Eastern_HG 15.25, Ami 4.25 - distance% = 6.0891 %

English_Cornwall: Levant_Neolithic 35.9, Hungary_HG 33.35, Eastern_HG 23.25, Ami 4.8, Motala_HG 2.7 - distance% = 5.3971 %

EHG goes down for all.

(The "problem" (or not) of ENA going up with more remote ancestors gets stronger when using these more ancient populations.)

Modifying the above set of calc pops to include Munda and testing South Asia:

GujaratiD: Munda 54.85, Eastern_HG 13.25, Levant_Neolithic 12.85, Iran_Neolithic 12.3, Motala_HG 6.75 - distance% = 3.6981 %

Kalash: Eastern_HG 26.6, Levant_Neolithic 24.05, Munda 22.95, Iran_Neolithic 10.3, Motala_HG 9.45, Ami 5.6, Hungary_HG 1.05 - distance% = 5.1863 %

Using instead Munda plus ANE plus various ME / Steppe cultures:

Kalash: Steppe LNBA-IA 40 (Scythian_IA 28.95, Poltavka 12.05), Near East CHL-MLBA 37 (Iran_Chalcolithic 20.85, Armenia_MLBA 16.15), Munda 22 - distance% = 1.5259 %

GujaratiD: Munda 50.55, Near East LN-CHL 31 (Iran_Late_Neolithic 17.05, Armenia_Chalcolithic 14.35), Steppe LNBA 18 (Poltavka 16.5, Andronovo 1.5) - distance% = 1.7413 %

GujaratiA: Near East CHL 36 (Armenia_Chalcolithic 22.1, Iran_Late_Neolithic 9.65, Iran_Chalcolithic 4.45), Munda 34.7, Steppe LNBA 29 (Poltavka 21.6, Andronovo 7.5) - distance% = 1.2648 %

Assuming ASI % matches Munda %, and modelling ANI and ASI by regression equation from the fits above with the latest datasheet, the D-stats for ANI and ASI are:

,Ami2,Anatolia_Neolithic2,BedouinB2,Bichon,Bougainville2,Cypriot2,Dai2,Eastern_HG2,Eskimo_Naukan,Georgian2,Iberia_EN2,India_South2,Iran_Chalcolithic2,Karitiana,Kinh_Vietnam,Kostenki14,Kotias,Mansi2,Mixe,Mota,Motala_HG2,Munda2,Papuan2,Ulchi2,Ust_Ishim,Yamnaya_Samara2,Yoruba2
ANI,0.3511,0.3897,0.3732,0.3907,0.3374,0.3929,0.3499,0.4054,0.3672,0.4011,0.3903,0.3607,0.3901,0.3713,0.3496,0.3634,0.4018,0.383,0.3696,0.1317,0.3991,0.3487,0.3335,0.3548,0.3316,0.4083,0.1119

ASI,0.382,0.3322,0.3217,0.3356,0.3744,0.3338,0.384,0.3487,0.3715,0.3354,0.3319,0.41,0.3369,0.3717,0.3818,0.3404,0.3372,0.36,0.3717,0.1399,0.3391,0.4048,0.371,0.376,0.3459,0.3474,0.1081
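
One possible reading of the regression step above, offered as an assumption rather than the exact procedure used: for each D-stat column, fit a straight line of the stat against the Munda fraction across the South Asian populations, then read off the line at 0% (ANI-like) and 100% (ASI-like). The Munda fractions below are taken from the three fits above; the D-stat values and the helper names are placeholders.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def extrapolate_endpoints(fractions, stats_by_column):
    """Per column, predict the D-stat at fraction 0 and at fraction 1 of the cline."""
    endpoints = {}
    for column, values in stats_by_column.items():
        intercept, slope = fit_line(fractions, values)
        endpoints[column] = (intercept, intercept + slope)
    return endpoints


# Munda fractions for Kalash, GujaratiA, GujaratiD from the fits above.
fractions = [0.22, 0.347, 0.5055]
# Placeholder D-stat values for illustration only.
stats_by_column = {"Ami2": [0.358, 0.362, 0.367], "Kotias": [0.388, 0.379, 0.369]}

for column, (ani, asi) in extrapolate_endpoints(fractions, stats_by_column).items():
    print(f"{column:8s} ANI={ani:.4f} ASI={asi:.4f}")
```

With only three populations on the cline, the extrapolated endpoints are sensitive to small changes in any one fit.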

ANI models with set of least admixed populations as:

ANI - Eastern_HG 36.6, Levant_Neolithic 34.5, Motala_HG 10.15, Ami 8.95, Iran_Neolithic 6, Hungary_HG 3.8 - distance% = 6.3287 %

PCAd - http://i.imgur.com/smBiYLA.png

huijbregts said...

How do you guys use nMonte to calculate the admixture percentages from Dstat sheets?

The simplest way is just to use all the columns and apply nMonte in the same way as with a calculator sheet.
nMonte presupposes that the columns are orthogonal (independent).
This is guaranteed in David's datasheets, which are PCA scores of raw DNA composition.
It is also safe with calculator sheets, because orthogonality of the columns is highly valued by the authors.
But the columns of D-stats sheets are not orthogonal.
If the columns are not orthogonal, the calculation of the distance is incorrect.
And as the distance is used to guide the Monte Carlo process, the resulting estimate of the mixture composition is also incorrect.
This is worse than a negligible estimation error; the result may be way off.
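
To make the role of the distance concrete, here is a rough sketch of a distance-guided random search over mixture proportions. It is not the nMonte script itself: the slot-swapping scheme, the function names and the parameters are assumptions chosen for illustration, and the reference rows are placeholders.

```python
import math
import random


def mix_distance(target, refs, counts, slots):
    """Euclidean distance between the target and the slot-weighted mix of references."""
    total = 0.0
    for j, t in enumerate(target):
        mix = sum(counts[n] * refs[n][j] for n in refs) / slots
        total += (mix - t) ** 2
    return math.sqrt(total)


def fit_mixture(target, refs, slots=200, iters=20000, seed=0):
    """Stochastic hill climb: move one slot between references, keep it if distance drops."""
    rng = random.Random(seed)
    names = list(refs)
    counts = {n: 0 for n in names}
    for _ in range(slots):
        counts[rng.choice(names)] += 1
    best = mix_distance(target, refs, counts, slots)
    for _ in range(iters):
        src, dst = rng.choice(names), rng.choice(names)
        if src == dst or counts[src] == 0:
            continue
        counts[src] -= 1
        counts[dst] += 1
        trial = mix_distance(target, refs, counts, slots)
        if trial < best:
            best = trial
        else:
            # Undo the move when it does not improve the fit.
            counts[src] += 1
            counts[dst] -= 1
    return {n: 100.0 * counts[n] / slots for n in names}, best


# Placeholder rows for illustration only (three references, three columns).
refs = {"Ref_A": [1.0, 0.0, 0.0], "Ref_B": [0.0, 1.0, 0.0], "Ref_C": [0.0, 0.0, 1.0]}
target = [0.5, 0.3, 0.2]
weights, dist = fit_mixture(target, refs)
print(weights, f"distance%={100 * dist:.4f}")
```

Because every accepted move is judged only by the distance, anything that distorts the distance also distorts the proportions the search settles on.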

Fortunately, using a PCA, the columns of the Dstat sheet can be transformed to (a smaller number of) orthogonal PCA scores.
So IMO the correct workflow is:
1. Choose a set of relevant rows, containing one target row and a number of reference rows. Use all the Dstat columns.
2. Calculate the PCA. I have worked with k=5 and no centering or scaling.
3. Collect the scores with k columns.
4. Use the scores as input for nMonte. The resulting distances are very small.
Realize that these are not distances between DNA percentages, but distances between Dstat values, which are far more homogeneous.
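
Steps 2 and 3 of the workflow above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code actually used: it finds the principal axes by power iteration on the uncentered, unscaled column cross-product matrix, `uncentered_pca_scores` is a hypothetical name, and the rows are placeholder values.

```python
import math
import random


def uncentered_pca_scores(rows, k, iters=500, seed=0):
    """Project rows onto the top-k principal axes, with no centering or scaling."""
    rng = random.Random(seed)
    m = len(rows[0])
    # Cross-product matrix C = X^T X (columns x columns).
    c = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    axes = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(m)]
        for _ in range(iters):
            w = [sum(c[i][j] * v[j] for j in range(m)) for i in range(m)]
            # Keep the new axis orthogonal to the axes already found.
            for a in axes:
                dot = sum(x * y for x, y in zip(w, a))
                w = [x - dot * y for x, y in zip(w, a)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-15:
                raise ValueError("k exceeds the numerical rank of the data")
            v = [x / norm for x in w]
        axes.append(v)
    return [[sum(x * y for x, y in zip(r, a)) for a in axes] for r in rows]


# Placeholder D-stat rows (target first, then references); not datasheet values.
rows = [
    [0.35, 0.39, 0.36],
    [0.34, 0.38, 0.33],
    [0.13, 0.11, 0.10],
    [0.40, 0.42, 0.43],
]
for score in uncentered_pca_scores(rows, k=2):
    print(" ".join(f"{s:+.6f}" for s in score))
```

The score rows returned here would then stand in for the raw D-stat rows as input to the mixture fit.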

PC1 PC2 PC3 PC4 PC5
Bell_Beaker_Germany -1.845426e+00 9.013090e-03 1.246191e-02 -3.611150e-03 -1.538591e-04
fitted -1.845420e+00 9.007781e-03 1.245808e-02 -3.609276e-03 -1.470620e-04
dif 5.866561e-06 -5.308716e-06 -3.831517e-06 1.873857e-06 6.797083e-06
[1] "distance%=0.0011 / distance=1.1e-05"
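
The reported distance looks consistent with the Euclidean norm of the `dif` row, with `distance%` being that value times 100. A quick check on the five residuals printed above, treating that relationship as an inference from the printed numbers rather than a statement about nMonte's source code:

```python
import math

# dif row copied from the Bell_Beaker_Germany output above (PC1..PC5).
dif = [5.866561e-06, -5.308716e-06, -3.831517e-06, 1.873857e-06, 6.797083e-06]

distance = math.sqrt(sum(d * d for d in dif))
print(f"distance%={100 * distance:.4f} / distance={distance:.2g}")
```

This prints `distance%=0.0011 / distance=1.1e-05`, matching the line above.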

Bell_Beaker_Germany
"Poltavka_outlier" 72.85
"Esperstedt_MN" 13.45
"Yamnaya_Samara" 4.55
"Iberia_EN" 4.25
"Anatolia_Chalcolithic" 2.95
"Motala_HG" 1.55
"Loschbour" 0.3
"Iberia_MN" 0.1
"Hungary_EN" 0
"Yamnaya_Kalmykia" 0

I did this exercise to see whether this workflow is successful, but I was surprised by the result.
I had expected a lot of Yamnaya, but not Poltavka_outlier.
Has anybody done a Dstat on Bell_Beaker/Poltavka_outlier?
Did I forget an important row? Or is an important column missing in the Dstat sheet?

@Davidski
Can I get a row Baalberge_MN?

Davidski said...

@huijbregts

Bell Beaker Germany and Poltavka outlier are closely related samples with similar genetic structures, so using Poltavka outlier as a reference for Bell Beaker Germany eats up much of the ancient components that make up Bell Beaker Germany.

It's like using Irish as a reference for English, and seeing most of the ancient ancestry proportions disappear.

huijbregts said...

@ Davidski
Thanks. I did not know that they were this closely related.
It confirms my idea that this is the correct way to estimate mixture proportions with Dstats/nMonte even though the columns are not orthogonal.


Alberto said...

These sheets are with the outgroups switched, I guess? I see the same effects of underestimating SSA. Who knows why.

The Armenian_Chalcolithic samples don't really seem to have much, if any, EHG after all:

Armenia_Chalcolithic
"Anatolia_Chalcolithic" 65.8
"Iran_Chalcolithic" 24.9
"Loschbour" 5.5
"Ami" 2.6
"Eastern_HG" 0.75
"Satsurblia" 0.45
"Anatolia_Neolithic" 0
"Yoruba" 0
"Iran_Late_Neolithic" 0
"Iran_Neolithic" 0
"Israel_Natufian" 0
"Jordan_EBA" 0
"Levant_Neolithic" 0

What Anatolia_ChL and Armenia Chalcolithic do is to eat up the Anatolia_Neolithic and CHG, probably expected since they're more modern samples. Though they take most of the Yamnaya from southern Europe too, even if Yamnaya is more modern than they are:

Greek
"Armenia_Chalcolithic" 45.1
"Anatolia_Chalcolithic" 31.65
"Anatolia_Neolithic" 9.1
"Loschbour" 8.9
"Eastern_HG" 3.5
"Ami" 1.75
"Satsurblia" 0
"Yoruba" 0
"Iran_Chalcolithic" 0
"Iran_Late_Neolithic" 0
"Iran_Neolithic" 0
"Israel_Natufian" 0
"Jordan_EBA" 0
"Levant_Neolithic" 0
"Yamnaya_Samara" 0

(Same source populations, but not showing those with 0 for cleanness):

Italian_Tuscan
"Anatolia_Chalcolithic" 41.55
"Armenia_Chalcolithic" 36.45
"Loschbour" 10.55
"Anatolia_Neolithic" 8.25
"Eastern_HG" 1.8
"Ami" 1.4

Italian_Bergamo
"Anatolia_Chalcolithic" 71.2
"Loschbour" 16.1
"Armenia_Chalcolithic" 8.9
"Yamnaya_Samara" 2
"Ami" 1.8

Bulgarian
"Anatolia_Chalcolithic" 44.8
"Armenia_Chalcolithic" 31.95
"Loschbour" 12.2
"Yamnaya_Samara" 8.3
"Ami" 2.3
"Eastern_HG" 0.45

Spanish
"Armenia_Chalcolithic" 38.25
"Anatolia_Neolithic" 25.25
"Loschbour" 13.55
"Anatolia_Chalcolithic" 11
"Yamnaya_Samara" 9
"Ami" 1.65
"Eastern_HG" 1.3

French_West
"Anatolia_Chalcolithic" 47.95
"Loschbour" 18.3
"Yamnaya_Samara" 17.55
"Armenia_Chalcolithic" 10.25
"Eastern_HG" 4.8
"Ami" 1.15

English_Kent
"Anatolia_Chalcolithic" 52.35
"Yamnaya_Samara" 24.4
"Loschbour" 18.75
"Armenia_Chalcolithic" 3.2
"Ami" 1.3

Lithuanian
"Anatolia_Chalcolithic" 37.2
"Yamnaya_Samara" 35.95
"Loschbour" 21.75
"Eastern_HG" 3.5
"Ami" 1.6

Ukrainian_West
"Anatolia_Chalcolithic" 40.4
"Yamnaya_Samara" 22.6
"Loschbour" 18.45
"Armenia_Chalcolithic" 13.55
"Eastern_HG" 2.6
"Ami" 2.4

Davidski said...

@Alberto

Your models for Northern Europe aren't realistic, because they suggest that it was populated by pure hunter-gatherers until the Late Neolithic.

The problem is that you don't have any Middle Neolithic/Copper Age European samples in your reference list. If you add them, you'll see essentially the same old results for Northern Europeans.

Lithuanian
Yamnaya_Samara 47.1
Esperstedt_MN 35.95
Motala_HG 8.7
Loschbour 6.2
Ulchi 2.05
Anatolia_Chalcolithic 0

distance%=1.7212 / distance=0.017212

It's likely that there are also similar problems with your models for Southern Europe, although these might not be possible to correct yet due to a lack of sampling from the Balkans.

Kristiina said...

Matt, it would be interesting to know what is your admixture analysis for the Samara hunter-gatherer and the Karelian hunter-gatherer using AG3-MA1, Ami, Eastern_HG, ElMiron, Hungary_HG, Iran_Neolithic, Israel_Natufian, LaBrana1, Levant_Neolithic, Loschbour, Motala_HG, Villabruna and Yoruba?

Alberto said...

Yes, Esperstedt_MN got dropped while testing Bell Beaker and seeing disagreements with previous sheets, with a significant pull towards Yamnaya. In general I keep seeing inconsistencies in this other method of running the double outgroup D-stats. It's not just that Spanish doesn't get SSA admixture, which is a minor problem for Europe but not for the Near East, where we have the new samples to finally model those populations; it also seems to pull populations towards Euro_HG and ENA (probably that's why the high levels of Ami).

For a sanity check, I compared with the paper's models, and I see the same bias. For example, a simple 2 way admixture of Levant_Neolithic as Natufian + Anatolia_Neolithic shouldn't show any big discrepancy, since Natufians are pretty close to Levant_Neolithic. The paper has them as 67% Natufian + 33% Anatolia_Neolithic. Cross checking with the PCA based sheet I get:


Levant_Neolithic:I0867
"Israel_Natufian:I1072" 69.2
"Anatolia_Neolithic:I0707" 30.8

So quite close. But with this D-stats sheet:

Levant_Neolithic
"Israel_Natufian" 54.6
"Anatolia_Neolithic" 45.4

Which again shows a significant pull towards the more northern population.
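
For a two-way model like the Natufian + Anatolia_Neolithic check above, the least-squares mixing fraction has a closed form, which gives a quick cross-check on a Monte Carlo estimate. A minimal sketch under that assumption; `two_way_fraction` is a hypothetical helper, the rows are placeholders rather than datasheet values, and nMonte itself searches randomly rather than using this formula:

```python
def two_way_fraction(target, ref_a, ref_b):
    """Least-squares share w of ref_a in target = w*ref_a + (1-w)*ref_b, clipped to [0, 1]."""
    d = [a - b for a, b in zip(ref_a, ref_b)]
    num = sum((t - b) * x for t, b, x in zip(target, ref_b, d))
    den = sum(x * x for x in d)
    if den == 0.0:
        raise ValueError("reference rows are identical")
    return min(1.0, max(0.0, num / den))


# Placeholder rows for illustration only.
ref_a = [0.40, 0.30, 0.35]
ref_b = [0.30, 0.40, 0.33]
target = [0.67 * a + 0.33 * b for a, b in zip(ref_a, ref_b)]

w = two_way_fraction(target, ref_a, ref_b)
print(f"ref_a {100 * w:.1f}%, ref_b {100 * (1 - w):.1f}%")
```

Running the same two rows through both sheets and comparing the fractions is one way to quantify the kind of pull described here.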

Anatolia_Chalcolithic in the paper is modelled as 67% Anatolia_Neolithic + 33% Iran_Chalcolithic. Here I get this:

Anatolia_Chalcolithic
"Anatolia_Neolithic" 48.5
"Iran_Chalcolithic" 37.3
"Eastern_HG" 14.2

But this is a sample from Barcin, 3800 BC. So that strong pull toward EHG looks again a bit strange.

So I'm wondering if there's any particular reason for using this other method for the stats? Did you see any specific advantage vs. the previous one?

Shaikorth said...


"Palestinians: Levant_Neolithic 64.3, Iran_Neolithic 26.8, Eastern_HG 5.4, Ami 3.1, Loschbour 0.4 - distance% = 1.9323 % "

Does this give any SSA to Jordanians, Palestinians and BedouinA or does it all go to Levantine_n (or Natufian)?

Davidski said...

OK, I updated the sheets. See links in the post above. Now they include Baalberge_MN, Balochi, Brahui, Egyptians and a few extra Asian pops. I don't have a huge choice in this dataset, but it's the one I gotta run because it has the right markers for double outgroup D-stats like this.

Alberto, I'm currently running D(Chimp,Columns)(Mbuti.DG,Rows), which is the usual way to run these stats. I can't remember, what else did I run?

ryukendo kendow said...

@ Shaikorth

I'm going to post some fits here later, but a fraction of Natufian ancestry can be replaced by a combination of Esan_Nigeria, Munda, Indian_S, and Maasai_Kinyawa at varying proportions with some runs, which still produces very good fits. This parallels Ted Kandell's PCA, where Natufians stream away from Starcevo_EN towards a space in between Africans and South Asians, best represented by Ust Ishim and Australians at the terminus.

Alberto said...

@Davidski

Yes, that's what I supposed from the results. The old way was D(Chimp,Rows)(Mbuti,Columns) and didn't have these problems. When you first tested this other method I was comparing side by side both sheets that had the same columns and rows, only the method changed. And I was getting worse models, with significantly higher distances. Remember the case of Spanish_Extremadura not getting any Yoruba or even Moroccan as compared to the other sheet getting it with the same pops? It seems the effect is a more generalised pull towards Euro_HG away from Basal and African, or something like that.

@Shaikorth

I get this for Palestinian:

Palestinian
"Iran_Chalcolithic" 49.8
"Levant_Neolithic" 27
"Israel_Natufian" 12.8
"Anatolia_Chalcolithic" 3.65
"Ami" 2.9
"Yoruba" 2.15
"Loschbour" 1.7
"Anatolia_Neolithic" 0
"Eastern_HG" 0
"Satsurblia" 0
"Iran_Late_Neolithic" 0
"Iran_Neolithic" 0

But because of the problem I mentioned above, I think this is underestimating SSA. The biggest negative residual (underfitting) is in the Yoruba column, followed by BedouinB and Cypriot, while the biggest positive ones (overfitting) are for Iran_Chalcolithic and Kostenki14.

There's no BedouinA or Jordanian in this sheet. Here's BedouinB (same pops as above, only showing those that get some %):

BedouinB
"Levant_Neolithic" 37.8
"Iran_Chalcolithic" 32.3
"Israel_Natufian" 28.2
"Ami" 1.7

But similar pattern in the residuals: underfitting BedouinB and Yoruba, overfitting Anatolia_Neolithic, Iran_Chalcolithic, Kostenki14, EHG, Motala...
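
A minimal sketch of how per-column residuals like these could be computed, assuming the residual is the modelled value minus the observed one, so that negative means underfitting and positive means overfitting. The function name, the source names and all the numbers are invented for illustration.

```python
def column_residuals(target, sources, weights):
    """Modelled minus observed D-stat for each column.

    target:  {column: value} for the population being modelled
    sources: {source_name: {column: value}}
    weights: {source_name: proportion}, proportions summing to 1
    """
    out = {}
    for col, observed in target.items():
        modelled = sum(w * sources[name][col] for name, w in weights.items())
        out[col] = modelled - observed
    return out


# Invented numbers, only to show the sign convention.
target = {"Yoruba": 0.010, "Kostenki14": 0.020}
sources = {
    "SourceA": {"Yoruba": 0.004, "Kostenki14": 0.030},
    "SourceB": {"Yoruba": 0.008, "Kostenki14": 0.020},
}
res = column_residuals(target, sources, {"SourceA": 0.5, "SourceB": 0.5})
worst_first = sorted(res, key=res.get)  # most underfitted column first
```

In this toy case the Yoruba column comes out negative (underfitted) and the Kostenki14 column positive (overfitted).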

Davidski said...

OK, hang on.

Karl_K said...

Hanging... on...

huijbregts said...

@ Davidski
Thank you for adding Baalberge_MN to the sheet. It is obviously an important component of the Bell Beaker admixture.

Bell_Beaker_Germany without Baalberge_MN
"Poltavka_outlier" 73.05
"Esperstedt_MN" 15.7
"Anatolia_Chalcolithic" 6.15
"Yamnaya_Samara" 3.3
"Loschbour" 1.7
"Yamnaya_Kalmykia" 0.1
"Hungary_EN" 0
distance=1e-05

Bell_Beaker_Germany with Baalberge_MN
"Poltavka_outlier" 38.75
"Baalberge_MN" 28.1
"Yamnaya_Samara" 17.5
"Yamnaya_Kalmykia" 6.75
"Anatolia_Chalcolithic" 4.3
"Esperstedt_MN" 2.7
"Loschbour" 1.9
"Hungary_EN" 0
distance=6e-06

I feel good about the less extreme relationship of Poltavka_outlier to Bell_Beaker_Germany.
But the volatility of these estimations is disappointing.

P.S. This sheet still contains a duplicate of Georgian.

Matt said...

@ Kristiina, I can't fit either of those rows as they aren't available, but using those calc populations:

Eastern_HG: Motala_HG 50.8, AG3-MA1 49.2 - distance% = 7.1186 %

Underfitting / overfitting: http://textuploader.com/53n4n (link to save space)

The combination basically underfits to columns Yamnaya, EHG, all recent West Eurasians, Native Americans and East Asians, and overfits to Ust Ishim, Papuan, Mota, WHG.

Modeling without Motala_HG:

Eastern_HG: AG3-MA1 57.7, Hungary_HG 42.3 - distance% = 7.9113 %

Underfitting / overfitting:

Same issues http://textuploader.com/53nl6

Motala_HG itself modelled with the same groups:

Motala_HG: Hungary_HG 87.35, AG3-MA1 12.65 - distance% = 4.7928 %

If you had the ANE (AG3-MA1) as a column as well as a row, then it would probably go down, but we don't have enough samples for anything like that still, I don't think (there was a third ANE in the Fu paper, but low quality?).
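
For two-source models like the ones above, the best mixing proportion has a closed form. The sketch below assumes the distance is the plain Euclidean distance between the target row and the mixed row, and that distance% is simply 100 times that (which matches the "distance% / distance" pairs printed elsewhere in this thread). The function name and toy vectors are invented.

```python
import math


def fit_two_sources(target, src_a, src_b):
    """Least-squares share of src_a (the rest is src_b), clipped to [0, 1].

    All three arguments are equal-length lists of D-stats over the same columns.
    Returns (share_of_a, distance), distance being the Euclidean misfit.
    """
    diff_ab = [a - b for a, b in zip(src_a, src_b)]
    diff_tb = [t - b for t, b in zip(target, src_b)]
    denom = sum(x * x for x in diff_ab)
    if denom == 0:
        raise ValueError("the two sources are identical")
    share = sum(x * y for x, y in zip(diff_tb, diff_ab)) / denom
    share = min(1.0, max(0.0, share))          # proportions cannot leave 0..1
    misfit = [t - (share * a + (1 - share) * b)
              for t, a, b in zip(target, src_a, src_b)]
    return share, math.sqrt(sum(x * x for x in misfit))


# Invented toy rows over three columns.
share, dist = fit_two_sources([0.2, 0.2, 0.2], [0.1, 0.2, 0.3], [0.3, 0.2, 0.1])
print(f"SourceA {100 * share:.1f}, SourceB {100 * (1 - share):.1f} "
      f"- distance% = {100 * dist:.4f} %")
```

A high distance% with only two sources, as in the Eastern_HG fits, just means no point on the line between the two source rows comes close to the target row.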

Kristiina said...

Matt, thanks a lot! I am just thinking about this Mesolithic/Neolithic admixture between WHG and EHG in the EHG area. Am I right that there isn't any Basal from Anatolia/Iran in the Samara HG? How about CHG? Can we now determine if there is any CHG or rather Iran Neolithic in the Samara hunter-gatherer? Does Hungary_HG contain any CHG or Iran Neolithic? Am I right that in your fittings the West/"SHG" versus East/"ANE" ratio is nearly 1:1?

There is no straight route from Sweden to Volga, so Hungary HG is in a much better position to find its way to Russia.

There is no need to answer all my questions, but these are the questions that puzzle me.

Davidski said...

@huijbregts

Thanks, I got rid of the Georgian duplicate rows.

Btw, it's usually best to explore data with unsupervised tests like PCA, TreeMix or Admixture. On the other hand, supervised tests like D-stats/nMonte are best suited to carefully crafted models based on multiple lines of evidence, including unsupervised tests, uniparental markers and even linguistics and archeology.

Onur said...

@Davidski

"Btw, it's usually best to explore data with unsupervised tests like PCA, TreeMix or Admixture."

But even in them choice of populations to analyze is a very important matter.

Davidski said...

Use a lot to start with, and these days, there's a lot to choose from.

Samuel Andrews said...

Just had my first look at the D-stat spreadsheets. I never expected the type of results we got from Iran_Neo and Natufian. It explains why it was so hard to get reasonable models for Middle Easterners with the genomes we had before Lazaridis et al. 2016. Natufian is close to Anatolia_N and Iran_Neo is close to CHG, but exactly what relationship each grouping has with the other is a mystery to me.

Levant_N looks like a mixture of Natufian and Anatolia_N, and Caucasus/Aegean populations fit better as Anatolia_N plus something else than as Levant_N or Natufian plus something else. This suggests Anatolia_N-like people existed in the Mesolithic/Paleolithic in Anatolia and its surroundings.

Samuel Andrews said...

I did a pretty detailed analysis of ancient/modern Iran. Here's a summary of the results. There's a link to it below.

https://docs.google.com/spreadsheets/d/1p8lwXd-ZMJ0yOWunJHc1cSAqzll2RAH7QImpzH0AL28/edit#gid=0

Iran_Late_Neolithic: 53% North (35% CHG, 15% Iran_Neo), 30% South (Anatolia_N or Levant_N), 17% South Indian (would reduce if you take out their Iran_Neo ancestry).

Iran_Chalcolithic: 50% South(Best represented by Anatolia_N), 45% North(Best represented by CHG), 5% South Indian

Modern Iran: 70% Iran_Chal, 10% South Indian, 20% LNBA European, 3-5% Siberian(?)

I'm confident South Asian admixture was in Iran since the Late Neolithic. The EHG/Steppe admixture in modern Iran looks real. When I use Srubnaya the results are always 15-20%. There's also extra South Asian in modern Iranians, so maybe Indo-Iranian languages arrived with a South Asian/Steppe hybrid population.

Samuel Andrews said...

I remember some of you earlier predicting Burusho or Balochi will turn out mostly Iran_Neolithic with little Steppe admixture. It doesn't look that way. Most of my predictions about Iran_Neo were wrong too. Their affinities were impossible to predict.

I was confused by this idea because I had seen Burusho and Balochi mtDNA and they have some Steppe mtDNA like other South Asians. Using the spreadsheets provided by this thread they're scoring about 30% LNBA European. They have more Iran_Neo than other SC Asians, but they're definitely a similar mix as other SC Asians.

ryukendo kendow said...

I know the current data is somewhat problematic, and I'm also getting the highest residuals with Yoruba for many groups, but I'm still going to post these ME runs here since intra-Eurasian patterns should be affected less. I was just throwing a list of stuff at modern MEs and seeing what sticks. I expected the ratios and identities of the source populations to vary wildly and somewhat by chance, since there should be many ways to get the same ancestry proportions for the 'big four' from a closely related set of source populations. But I was extremely surprised to find a rather consistent, almost ADMIXTURE-like cline of ancestry from a subset of the aDNA samples in the modern ME. Maybe the short drift paths are sufficiently discriminatory after all?

These runs were done on a maximalist model for the number of source populations for the algorithm to choose from, here is the list:
ANCIENTS:
Anatolia_Chalcolithic
Anatolia_Neolithic
Armenia_Chalcolithic
Armenia_EBA
Armenia_MLBA
Bell_Beaker_Germany
Hungary_BA
Hungary_CA
Hungary_EN
Iberia_Chalcolithic
Iberia_EN
Iberia_MN
Iran_Chalcolithic
Jordan_EBA
LBK_EN
Scythian_IA
Unetice_EBA
Andronovo
Corded_Ware_Germany
Esperstedt_MN
Levant_Neolithic
Poltavka
Poltavka_outlier
Samara_Eneolithic
Srubnaya
Villabruna
Loschbour
LaBrana1
Hungary_HG
AG3-MA1
Afanasievo
Iran_Late_Neolithic
Iran_Neolithic
Israel_Natufian
Motala_HG
Satsurblia
Yamnaya_Kalmykia
Yamnaya_Samara


MODERNS:
Nganasan
Basque_French
Ulchi
Dai
Munda
India_South
Esan_Nigeria
Masai_Kinyawa

The fits below therefore have a long tail of zeros, which I omit. So the specificity of the fits kinda surprises me:
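
To illustrate why a large candidate list still ends up with mostly zeros, here is a simplified Monte Carlo hill-climb in the same spirit. It is only a sketch and does not reproduce the actual nMonte script: the slot scheme, the function names, the step count and the toy data are all my assumptions.

```python
import math
import random


def distance(target, sources, counts, slots):
    """Euclidean distance between the target and the slot-averaged mixture."""
    total = 0.0
    for i, t in enumerate(target):
        mix = sum(n * sources[name][i] for name, n in counts.items()) / slots
        total += (mix - t) ** 2
    return math.sqrt(total)


def monte_carlo_fit(target, sources, slots=200, steps=5000, seed=0):
    """Hill-climb over `slots` equal shares, each assigned to one source.

    Returns ({source: percent}, distance). Sources that never help end near 0.
    """
    rng = random.Random(seed)
    names = list(sources)
    counts = {name: 0 for name in names}
    for _ in range(slots):                       # random starting mixture
        counts[rng.choice(names)] += 1
    best = distance(target, sources, counts, slots)
    for _ in range(steps):
        donor = rng.choice([n for n in names if counts[n] > 0])
        taker = rng.choice(names)
        if donor == taker:
            continue
        counts[donor] -= 1                       # move one slot donor -> taker
        counts[taker] += 1
        trial = distance(target, sources, counts, slots)
        if trial <= best:
            best = trial                         # keep the swap
        else:
            counts[donor] += 1                   # undo the swap
            counts[taker] -= 1
    return {n: 100.0 * c / slots for n, c in counts.items()}, best


# Invented toy rows over three columns; the target is an even mix of A and B.
toy_sources = {
    "A": [0.10, 0.20, 0.30],
    "B": [0.30, 0.20, 0.10],
    "C": [0.90, 0.80, 0.70],
    "D": [0.05, 0.60, 0.40],
}
toy_target = [0.20, 0.20, 0.20]
percents, dist = monte_carlo_fit(toy_target, toy_sources)
```

Because a swap is only kept when it does not increase the distance, sources that pull the mixture away from the target lose their slots over time, which is where the long tail of zeros comes from.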





ryukendo kendow said...

[1] "distance%=1.4611 / distance=0.014611" (All Columns included, Africans, Dai and S Asian rows added)


Chechen
"Armenia_Chalcolithic" 26.2
"Anatolia_Chalcolithic" 25.5
"Andronovo" 18.55
"Scythian_IA" 11.3
"Satsurblia" 10.75
"Armenia_MLBA" 3.1
"Dai" 2.9
"Nganasan" 1.7

Adygei
"Anatolia_Chalcolithic" 28.8
"Armenia_Chalcolithic" 25.8
"Andronovo" 22.95
"Satsurblia" 10.85
"Scythian_IA" 5.9
"Dai" 5.7

[1] "distance%=1.7896 / distance=0.017896"

Abkhasian
"Anatolia_Chalcolithic" 38.9
"Armenia_Chalcolithic" 30.4
"Satsurblia" 16.35
"Andronovo" 10.55
"Dai" 3.8





[1] "distance%=1.5127 / distance=0.015127"


Armenian
"Anatolia_Chalcolithic" 41.2
"Iran_Chalcolithic" 34.35
"Basque_French" 16
"Scythian_IA" 6.05
"India_South" 1.45
"Dai" 0.95


[1] "distance%=1.5528 / distance=0.015528" (Basque French Dropped)
Armenian
"Anatolia_Chalcolithic" 37.35
"Iran_Chalcolithic" 31.1
"Poltavka_outlier" 12.9
"Armenia_Chalcolithic" 12.8
"LBK_EN" 2.95
"Dai" 1.55
"India_South" 1.35

[1] "distance%=1.5648 / distance=0.015648"


Armenian
"Anatolia_Chalcolithic" 37.35
"Iran_Chalcolithic" 31.95
"Armenia_Chalcolithic" 9.75
"Scythian_IA" 7.4
"Poltavka_outlier" 7.1
"LBK_EN" 4.25
"Munda" 1.8
"Iberia_EN" 0.4






[1] "distance%=1.0121 / distance=0.010121"

Lezgin
"Armenia_MLBA" 38.45
"Scythian_IA" 29.2
"Anatolia_Chalcolithic" 14.55
"Armenia_Chalcolithic" 9.35
"Satsurblia" 6.45
"Nganasan" 1.6
"India_South" 0.4

[1] "distance%=1.069 / distance=0.01069" (Dropped Scythian IA)


Lezgin
"Armenia_Chalcolithic" 25.2
"Armenia_MLBA" 20.4
"Andronovo" 17.3
"Anatolia_Chalcolithic" 15.35
"Afanasievo" 9.3
"Satsurblia" 8.1
"Dai" 2.35
"Nganasan" 2


[1] "distance%=1.0953 / distance=0.010953" (Dropped Georgian Column, African, Dai and S Asians added)


Georgian
"Anatolia_Chalcolithic" 33.9
"Armenia_MLBA" 31.45
"Satsurblia" 13.4
"Armenia_Chalcolithic" 10
"Basque_French" 6.7
"India_South" 1.75
"Dai" 1.65
"Iberia_EN" 1.15



[1] "distance%=1.1121 / distance=0.011121" (Dropped Georgian Column, Dropped basque French)


Georgian
"Armenia_MLBA" 32.9
"Anatolia_Chalcolithic" 30.1
"Armenia_Chalcolithic" 14.95
"Satsurblia" 12.85
"Iberia_EN" 4
"Poltavka_outlier" 2.2
"Dai" 1.85
"India_South" 1.15

[1] "distance%=1.1396 / distance=0.011396" (Dropped Georgian Column, Dropped India_South)

Georgian
"Armenia_MLBA" 34.85
"Anatolia_Chalcolithic" 27.7
"Armenia_Chalcolithic" 15.25
"Satsurblia" 13.3
"Iberia_EN" 5
"Ulchi" 2
"Scythian_IA" 1.65
"Poltavka_outlier" 0.25

ryukendo kendow said...

[1] "distance%=1.1256 / distance=0.011256" (All Columns included, Africans, Dai and S Asian rows added)

Assyrian
"Iran_Chalcolithic" 35.9
"Armenia_MLBA" 28.25
"Armenia_Chalcolithic" 18.5
"Iberia_EN" 12.75
"Anatolia_Chalcolithic" 2.95
"Dai" 1.65


[1] "distance%=1.2593 / distance=0.012593" (Africans, Dai and S Asians added, All Columns)


Lebanese_Christian
"Armenia_MLBA" 36.85
"Iran_Chalcolithic" 20.35
"LBK_EN" 18.7
"Levant_Neolithic" 16.25
"Jordan_EBA" 2.85
"Iberia_EN" 2.25
"Munda" 2.25
"Dai" 0.45
"India_South" 0.05

[1] "distance%=1.2891 / distance=0.012891" (All Columns, Africans, S Asians and Dai dropped)


Lebanese_Christian
"Armenia_MLBA" 36.05
"Iran_Chalcolithic" 22.45
"Levant_Neolithic" 15.5
"LBK_EN" 12.1
"Iberia_EN" 6.7
"Jordan_EBA" 5.5
"Ulchi" 1.7


[1] "distance%=0.7127 / distance=0.007127" (Africans, Dai or S Asian dropped, ME Columns dropped)


Druze
"Armenia_MLBA" 32.35
"Iran_Late_Neolithic" 18.7
"Iberia_EN" 18.15
"Levant_Neolithic" 7.95
"Israel_Natufian" 7.95
"Iran_Chalcolithic" 6.25
"Anatolia_Chalcolithic" 6.05
"Ulchi" 2.6


[1] "distance%=1.1377 / distance=0.011377" (Africans, Dai and S Asians added, All Columns)

Druze
"Armenia_MLBA" 38.55
"Iran_Chalcolithic" 22.6
"Levant_Neolithic" 16.4
"LBK_EN" 15.1
"Iberia_EN" 2.45
"Munda" 1.95
"India_South" 1.9
"Jordan_EBA" 0.95
"Dai" 0.1


[1] "distance%=1.1861 / distance=0.011861" (Africans, Dai or S Asian dropped, All columns)

Druze
"Armenia_MLBA" 37.65
"Iran_Chalcolithic" 25.65
"Levant_Neolithic" 15.1
"Iberia_EN" 8.35
"LBK_EN" 6.1
"Jordan_EBA" 4.95
"Ulchi" 1.9
"Armenia_Chalcolithic" 0.3



[1] "distance%=0.4933 / distance=0.004933" (Africans, Dai and S Asians added, ME columns dropped)


Druze
"Iberia_EN" 23.75
"Armenia_MLBA" 22.7
"Anatolia_Chalcolithic" 19.75
"Iran_Chalcolithic" 11.25
"Iran_Late_Neolithic" 9.9
"Armenia_Chalcolithic" 5.2
"Munda" 4.7
"Esan_Nigeria" 1.35
"India_South" 0.7
"Masai_Kinyawa" 0.7


[1] "distance%=1.1968 / distance=0.011968"


Palestinian
"Levant_Neolithic" 38.5
"Armenia_MLBA" 34.2
"Iran_Chalcolithic" 12.9
"India_South" 5.6
"Israel_Natufian" 4.2
"Masai_Kinyawa" 2.85
"Munda" 1.1
"Esan_Nigeria" 0.65

[1] "distance%=1.2545 / distance=0.012545" (Indian South dropped)


Palestinian
"Armenia_MLBA" 33.4
"Levant_Neolithic" 26.3
"Iran_Chalcolithic" 21
"Israel_Natufian" 12.55
"Masai_Kinyawa" 3.65
"Dai" 2.55
"Iberia_EN" 0.45
"Armenia_Chalcolithic" 0.05
"Esan_Nigeria" 0.05

[1] "distance%=1.2754 / distance=0.012754" (Dai dropped)


Palestinian
"Armenia_MLBA" 31.65
"Iran_Chalcolithic" 23.25
"Levant_Neolithic" 23.1
"Israel_Natufian" 14.2
"Masai_Kinyawa" 3.75
"Ulchi" 2.4
"Iberia_EN" 1.65

ryukendo kendow said...

[1] "distance%=1.0912 / distance=0.010912" (BedouinB column dropped, plus Africans, Dai and S Asian rows added)


BedouinB
"Levant_Neolithic" 34.1
"Iran_Chalcolithic" 25.45
"Armenia_MLBA" 17.35
"LBK_EN" 6.95
"Masai_Kinyawa" 5.6
"Iberia_EN" 5.2
"Munda" 3.55
"India_South" 1.35
"Armenia_Chalcolithic" 0.35
"Esan_Nigeria" 0.1

[1] "distance%=1.4916 / distance=0.014916" (BedouinB column dropped, No African, Dai or S Asian rows)

BedouinB
"Israel_Natufian" 33.9
"Levant_Neolithic" 29.85
"Iran_Late_Neolithic" 15.25
"Armenia_MLBA" 9.3
"Iran_Chalcolithic" 7.9
"Ulchi" 2.5
"Satsurblia" 1.1
"Iran_Neolithic" 0.2


[1] "distance%=1.1713 / distance=0.011713" (BedouinB column dropped, plus Africans, Dai rows, S Asian rows dropped)
BedouinB
"Iran_Chalcolithic" 32.1
"Levant_Neolithic" 27.3
"Armenia_MLBA" 12.45
"Iberia_EN" 10.95
"Masai_Kinyawa" 5.9
"Armenia_Chalcolithic" 4.6
"Israel_Natufian" 4.4
"Ulchi" 2.3

Naturally, quite surprised by the strong affinity for Armenia MLBA. I threw the same set of populations, less Armenia MLBA, at Armenia MLBA and obtained the following:
[1] "distance%=1.052 / distance=0.01052"
Armenia_MLBA
"Armenia_Chalcolithic" 46.6
"Armenia_EBA" 27.65
"Satsurblia" 10.9
"Basque_French" 5.4
"Andronovo" 4.2
"Afanasievo" 3.25
"Masai_Kinyawa" 1
"Ulchi" 0.55
"Dai" 0.45



[1] "distance%=1.087 / distance=0.01087" (Dropped Dai)



Armenia_MLBA
"Armenia_Chalcolithic" 36.25
"Armenia_EBA" 26.3
"Satsurblia" 12.55
"Afanasievo" 8.75
"Levant_Neolithic" 8.4
"Basque_French" 4.45
"Ulchi" 1.4
"Andronovo" 1



[1] "distance%=1.0883 / distance=0.010883" (Dropped Basque French)


Armenia_MLBA
"Armenia_Chalcolithic" 36.95
"Armenia_EBA" 26.9
"Satsurblia" 12.65
"Levant_Neolithic" 9.4
"Andronovo" 6.8
"Afanasievo" 6.05
"Ulchi" 1.25

ryukendo kendow said...

A few comments:

Seems like the North Caucasus foothills are a sink, influenced mostly by Anatolia_Chl and Armenia_Chl plus Satsurblia for the most part, and later steppe movements represented by Andronovo, Afanasievo and Scythian_IA.

As one moves to the south Caucasus, we begin to see more Iran_Chl and Armenia_MLBA, the pattern found in Levantines and Mesopotamians.

Lezgins have a very large share of Scythian IA, which I dropped, which then decomposed to become Nganasan, Andronovo, and Afanasievo. Nganasan seems to distinguish Lezgins from the rest. Seems to suggest post Iron Age steppe contribution, which may be the case for the Scythian IA in other N Caucasian populations as well.

Armenians and to an extent Georgians get Basque_French on the first try, 16% in Armenians. Dropping Basque French decomposes the fraction to Poltavka outlier and LBK_EN, which seems to suggest late contributions from Europe.

Assyrians and N_Levantines have surprisingly large fractions of LBK_EN or Iberia_EN; it seems that region has been considerably 'Europeanized' from the Bronze Age till today.

From the Levant south to Arabia it seems to transition from mostly Armenia_MLBA plus others, to mostly a combination of Levant_Neolithic and Iran_Chalcolithic plus varying quantities of Natufian.

There is a pronounced tendency for small fractions of Levant_Neolithic, Natufian or Jordan_EBA, i.e. all the Natufian-influenced genomes, to be replaced by a combination of Masai_Kinyawa, Munda, India_S, Esan_Nigeria and Dai, which you can see especially in the fits for Bedouin, Armenia_MLBA and Palestinian. In other words, Natufians resemble a combination of East and South Asians and Africans, which is also reflected in David's PCA and Ted Kandell's plot, where Natufians are displaced in both PC1 (towards Africans) and PC3 (towards East Asians, South Asians and Australian Aborigines), and the combination ends up pointing towards Ust-Ishim and Oase? No idea why this happens.

Armenia MLBA seems very interesting, as it seems to show contribution from both the Steppe directly and from Levant_Neolithic, onto a base that resembles present-day North Caucasian populations overall. It seems to be the first ancient that shows Levant_Neolithic so far north, and also the first that shows direct European-like/Steppe-like contribution. Which perhaps makes sense, because this was after the Akkadians, the first Semitic speakers in the region, had just come and gone, and right smack in the heyday of the Mitanni, IIRC, and this skeleton was unearthed in a kurgan and thus shows both of the recently arrived superstrates? Either way, Armenia_MLBA seems to have quite an important role in the N Levant and Mesopotamia, maybe reflecting a rather cosmopolitan base population in the Middle Bronze Age, on top of which was added the LBK_EN and Iberia_EN from later Phoenician and Greek contacts???

Overall I like nMonte a bit more than before, since it seems to have some ability to discriminate very short drift paths relating populations that diverged less than a few thousand years ago, and it confirms ADMIXTURE phenomena, e.g. the 'West Med/Sardinian' influence on the Northern Levant, the European-like influence in Armenians, and the North Caucasian peak in Georgians, which seems to match their preservation of the highest levels of Satsurblia ancestry, etc. And Iran_N populations in fact did not seem to influence the Caucasus and Levant all that strongly, which confirms Iran_N's tendency to like S and SC Asian clusters as opposed to Caucasus ones in Kurd's ADMIXTURE runs. This is quite surprising to me honestly.

Rob said...

Interesting about the earliest steppe influence being found in Armenia_MLBA. This dates 2300 - 1500 BC ? It's too late for Anatolian languages, but I wonder if it explains the appearance of Mitanni.

ryukendo kendow said...

Another thing: when Natufians are replaced with E and S Asians and Africans, Iberia_EN also tends to increase. It seems like Natufians are further away from everyone in Eurasia, but weirdly close to E and S Asians for some reason?

Moving on, the fits for Moroccans make sense as well:

[1] "distance%=1.2051 / distance=0.012051"
Moroccan
"Israel_Natufian" 18.6
"Armenia_MLBA" 17.95
"Levant_Neolithic" 14.55
"Masai_Kinyawa" 14.15
"Iberia_EN" 11.9
"Basque_French" 11.2
"Esan_Nigeria" 9.15
"Ulchi" 1.35
"Iran_Late_Neolithic" 1.1
"Esperstedt_MN" 0.05


[1] "distance%=1.2214 / distance=0.012214" (Dropped Basque French)
Moroccan
"Armenia_MLBA" 24.95
"Israel_Natufian" 21.35
"Masai_Kinyawa" 12.15
"Iberia_EN" 10.8
"Esan_Nigeria" 10.5
"Esperstedt_MN" 9.95
"Levant_Neolithic" 8.75
"Ulchi" 1.55

The high levels of East African ancestry in Moroccans in ADMIXTURE receives confirmation, as does the high level of 'West Med/Sardinian' influence. Also looks like there is input from a very 'Southern' population along the N-Levant S-Levant cline. With BedouinB:

[1] "distance%=0.6255 / distance=0.006255"
Moroccan
"BedouinB" 51.85
"Esperstedt_MN" 13.4
"Esan_Nigeria" 11.8
"Masai_Kinyawa" 10.05
"Iberia_EN" 6.75
"Basque_French" 3.5
"Loschbour" 1.95
"Ulchi" 0.7

Somalis also seem to fit, but have a long tail of small percentages:
[1] "distance%=1.0097 / distance=0.010097"
Somali
"Masai_Kinyawa" 72.7
"Levant_Neolithic" 11.65
"Israel_Natufian" 6.3
"Iran_Late_Neolithic" 4.8
"Armenia_MLBA" 2.1
"Iberia_EN" 1.9
"Ulchi" 0.55

So I added a modern to clear things up:

[1] "distance%=0.9118 / distance=0.009118"
Somali
"Masai_Kinyawa" 73.3
"Druze" 17.25
"Israel_Natufian" 6.75
"Levant_Neolithic" 2.7

A more southern version of Druze. With Bedouin:

[1] "distance%=0.7693 / distance=0.007693"


Somali
"Masai_Kinyawa" 72.9
"BedouinB" 23.9
"Iberia_EN" 2.9
"Esperstedt_MN" 0.3

Interestingly, the 'Southwest Asian' component is resurrected here. In ADMIXTURE a BedouinB component eats up BedouinB, and forms other local peaks where it eats up the major part of the Eurasian portion of NW Africans and Somalis. Maybe the AA languages (except Omotic, whose speakers behave differently in ADMIXTURE; shame we can't include Ari Blacksmith or Ari Cultivator in this analysis) spread with a BedouinB-like people? Levant_Neolithic plus some Iran_N or Iran_LN, with less of the European and Caucasus influence we see today in the Levant and ME.

If I drop BedouinB:

[1] "distance%=0.8958 / distance=0.008958"

Somali
"Masai_Kinyawa" 73.85
"Iberia_EN" 10.9
"Iran_Late_Neolithic" 7.7
"Levant_Neolithic" 5.65
"Munda" 0.8
"Esan_Nigeria" 0.4
"Armenia_MLBA" 0.3
"Esperstedt_MN" 0.2
"Dai" 0.2

Once again we get this weird replacement of some Natufian-like affinities with S and E Asians and Africans, with increase in Iberia_EN.

ryukendo kendow said...

@Rob

Actually this is not the first time it appears, it's just especially pronounced here. Sorry for the vague statement.

Jay Wallace said...

ryukendo, can we get some Ashkenazi results?

Rob said...

@ Ryu

Thanks for clarifying. I read " also the first that shows direct European-like/Steppe-like contribution" as 'the first' ;)

If this wasn't the first, which period was?

ryukendo kendow said...

I did a time series of the area, from the earliest to the latest, using the same mass-source-throwing method. Some mixes will be obviously ahistorical, because there is admixture from the future to the past, so I will rerun these by dropping the later populations until I get the first mostly historical mix that the algorithm spits out.

First:

[1] "distance%=0.9693 / distance=0.009693"


Iran_Chalcolithic
"Iran_Late_Neolithic" 38.75
"Armenia_EBA" 24.15
"Anatolia_Chalcolithic" 21.45
"Jordan_EBA" 11.3
"Satsurblia" 2.95
"Ulchi" 0.75
"Anatolia_Neolithic" 0.65

Dropping Anatolia_Chl, Jordan EBA, Armenia_EBA:
[1] "distance%=1.1226 / distance=0.011226"

Iran_Chalcolithic
"Iran_Late_Neolithic" 48.4
"Anatolia_Neolithic" 20.1
"Armenia_Chalcolithic" 15.35
"Satsurblia" 8.05
"Armenia_MLBA" 7.05
"Iran_Neolithic" 0.95
"Nganasan" 0.1

Dropping Armenia_Chl:
[1] "distance%=1.2362 / distance=0.012362"


Iran_Chalcolithic
"Iran_Late_Neolithic" 49.4
"Anatolia_Neolithic" 29.55
"Satsurblia" 15.3
"LBK_EN" 3.2
"Nganasan" 1.55
"Afanasievo" 1

Dropping Nganasan
[1] "distance%=1.2442 / distance=0.012442"


Iran_Chalcolithic
"Iran_Late_Neolithic" 49.45
"Anatolia_Neolithic" 30.75
"Satsurblia" 15.15
"LBK_EN" 1.55
"Scythian_IA" 1.2
"Ulchi" 1.2
"Afanasievo" 0.7

Even as stuff gets dropped the fits are still very good, so e.g. dropping Jordan EBA and not seeing Levant Neolithic appear was probably a sign that the fitting was spurious. On the other hand dropping Nganasan causes Scythian_IA and Ulchi to appear, so the ENA implied in the fit was probably accurate.

It seems there was a movement of Anatolia_Neolithic types, plus a very small slice of WHG-EHG and ENA into Iran from the Neolithic to the Chalcolithic.
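
The drop-and-rerun procedure used in this comment can be sketched as a loop. This is only a mechanical version of what is being done by hand: `fit` stands for whatever fitting routine is used, and the set of "too late" sources is an input the user has to supply (here taken from the populations dropped above as ahistorical). The function name and the stub fit are invented.

```python
def refit_until_historical(fit, target, candidates, too_late):
    """Refit repeatedly, removing any fitted source that postdates the target.

    fit:        callable (target, pool) -> {source: percent}, using only
                sources from pool
    candidates: starting list of source names
    too_late:   set of source names judged later than the target
    Returns a list of (pool, weights) pairs, one per run.
    """
    pool = list(candidates)
    runs = []
    while True:
        weights = fit(target, pool)
        runs.append((list(pool), weights))
        offenders = {s for s, w in weights.items() if w > 0 and s in too_late}
        if not offenders:
            return runs
        pool = [s for s in pool if s not in offenders]


def stub_fit(target, pool):
    # Stand-in for a real fit: splits evenly between the first two sources.
    top = pool[:2]
    return {s: 100.0 / len(top) for s in top}


runs = refit_until_historical(
    stub_fit,
    "Iran_Chalcolithic",
    ["Armenia_EBA", "Iran_Late_Neolithic", "Jordan_EBA", "Anatolia_Neolithic"],
    too_late={"Armenia_EBA", "Jordan_EBA"},
)
```

Keeping every intermediate run, rather than only the last one, makes it possible to see what a dropped source decomposes into, which is the check used above for Jordan_EBA and Nganasan.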
_______________________

ryukendo kendow said...

[1] "distance%=0.9565 / distance=0.009565"

Armenia_Chalcolithic
"Armenia_MLBA" 34.1
"Anatolia_Chalcolithic" 31.85
"Scythian_IA" 9.4
"Iran_Chalcolithic" 7.95
"LBK_EN" 7.3
"Esperstedt_MN" 7.1
"India_South" 2.3

Very ahistorical, so Dropping Armenia_MLBA, Anatolia_Chalcolithic, Scythian_IA, India_South:
[1] "distance%=1.1912 / distance=0.011912"

Armenia_Chalcolithic
"Iran_Chalcolithic" 32.85
"LBK_EN" 32.75
"Andronovo" 19.15
"Satsurblia" 8.2
"Afanasievo" 4.2
"Iberia_EN" 1.45
"Nganasan" 1.4

Dropping Andronovo:
[1] "distance%=1.2293 / distance=0.012293"


Armenia_Chalcolithic
"Iran_Chalcolithic" 28.9
"LBK_EN" 26.3
"Afanasievo" 21.25
"Iberia_EN" 14
"Satsurblia" 7.7
"Nganasan" 1.85

Afanasievo and Iberia_EN/LBK_EN go up considerably. Lastly dropping LBK_EN:
[1] "distance%=1.3815 / distance=0.013815"


Armenia_Chalcolithic
"Iran_Chalcolithic" 35.1
"Esperstedt_MN" 20.1
"Afanasievo" 14.85
"Anatolia_Neolithic" 13.9
"Bell_Beaker_Germany" 7.75
"Satsurblia" 6.45
"Nganasan" 1.85

Esperstedt_MN appears, a fraction of Anatolia_Neol appears and a portion of the European-like Neol ancestry has combined with Afanasievo to create Bell Beaker Germany. Looks like there was very strong Steppe influence in the Kura Araxes, and a very European-like EN ancestry for some reason...
_______________________________________________

ryukendo kendow said...

[1] "distance%=1.9547 / distance=0.019547"


Anatolia_Chalcolithic
"LBK_EN" 48.05
"Armenia_EBA" 24.85
"Satsurblia" 13.15
"Andronovo" 8.8
"Scythian_IA" 3.65
"Poltavka_outlier" 1.5

Armenia_EBA ahistorical, dropped:
[1] "distance%=1.8448 / distance=0.018448"

Anatolia_Chalcolithic
"Armenia_Chalcolithic" 45.8
"LBK_EN" 37.35
"Satsurblia" 11.9
"Poltavka_outlier" 4.95

Poltavka_Outlier dropped:
[1] "distance%=1.8534 / distance=0.018534"


Anatolia_Chalcolithic
"Armenia_Chalcolithic" 45.2
"LBK_EN" 39.25
"Satsurblia" 12.2
"Samara_Eneolithic" 2
"Srubnaya" 1.35


So much LBK_EN is weird to me, so I dropped it:
[1] "distance%=1.9292 / distance=0.019292"


Anatolia_Chalcolithic
"Armenia_Chalcolithic" 52.1
"Anatolia_Neolithic" 18.7
"Iberia_EN" 11.15
"Satsurblia" 9.95
"Bell_Beaker_Germany" 8.1

Very similar behaviour to Kura-Araxes: the European-like Neol ancestry refuses to go away, popping up again as Iberia_EN, with the other half combining with the small percents of EHG-Steppe ancestry to create Bell_Beaker_Germany. Anatolia_Chalcolithic looks like it's influenced by Armenia_Chalcolithic (are there signs of Kura-Araxes influence so far west? Especially as Satsurblia appears too, which seems to suggest an east-to-west movement), plus some influences from Europe with a few percent of Steppe ancestry.



_______________________________________________

ryukendo kendow said...

[1] "distance%=0.6653 / distance=0.006653"


Armenia_EBA
"Anatolia_Chalcolithic" 27.1
"Iran_Chalcolithic" 24.15
"Satsurblia" 11.05
"Anatolia_Neolithic" 9.05
"Jordan_EBA" 7.9
"Yamnaya_Kalmykia" 7.35
"Iran_Neolithic" 6.15
"Armenia_MLBA" 5.75
"Hungary_BA" 1.5

Interestingly, except for Hungary BA and Armenia MLBA, none of this is ahistorical. Dropping Armenia MLBA:

[1] "distance%=0.6678 / distance=0.006678"

Armenia_EBA
"Iran_Chalcolithic" 28.6
"Anatolia_Chalcolithic" 25.85
"Satsurblia" 11.8
"Jordan_EBA" 10.3
"Yamnaya_Kalmykia" 9.1
"Anatolia_Neolithic" 8.55
"Iran_Neolithic" 4.2
"Hungary_BA" 1.6

It looks like Anatolia_Chalcolithic ancestry traveled back east and combined with another wave of Iran_Chalcolithic, plus another wave of Yamnaya Kalmykia and Jordan EBA (so Rob, the importance I was ascribing to Armenia MLBA in bringing Levantine ancestry and 'SW Asian' so far north should rather be ascribed to Armenia_EBA). Interestingly Yamnaya Kalmykia ancestry also appears.

Dropping Hungary_BA:
[1] "distance%=0.6686 / distance=0.006686"

Armenia_EBA
"Iran_Chalcolithic" 27.95
"Anatolia_Chalcolithic" 24.7
"Satsurblia" 11.9
"Anatolia_Neolithic" 10.7
"Yamnaya_Kalmykia" 10.45
"Jordan_EBA" 9.85
"Iran_Neolithic" 4.45

Dropping Jordan_EBA, to see if Levantine affinities are spurious
[1] "distance%=0.6842 / distance=0.006842"

Armenia_EBA
"Iran_Chalcolithic" 30.5
"Anatolia_Chalcolithic" 23.3
"Anatolia_Neolithic" 13.7
"Satsurblia" 12.1
"Yamnaya_Kalmykia" 10.5
"Iran_Neolithic" 5.85
"Levant_Neolithic" 4.05
No they aren't, as Levant Neolithic appears. So Armenia was quite cosmopolitan by the EBA period.

ryukendo kendow said...

[1] "distance%=1.087 / distance=0.01087"

Armenia_MLBA
"Armenia_Chalcolithic" 36.25
"Armenia_EBA" 26.3
"Satsurblia" 12.55
"Afanasievo" 8.75
"Levant_Neolithic" 8.4
"Basque_French" 4.45
"Ulchi" 1.4
"Andronovo" 1


From the EBA till MLBA, more influence from Anatolia, Europe, the Steppe and the Levant. Dropping Basque French:

[1] "distance%=1.0883 / distance=0.010883"


Armenia_MLBA
"Armenia_Chalcolithic" 36.95
"Armenia_EBA" 26.9
"Satsurblia" 12.65
"Levant_Neolithic" 9.4
"Andronovo" 6.8
"Afanasievo" 6.05
"Ulchi" 1.25


Isolates like the Assyrians and Lebanese Christians seem to have maintained this mix pretty well, under the cosmopolitanism from the later civilised ages:

[1] "distance%=1.1256 / distance=0.011256"

Assyrian
"Iran_Chalcolithic" 35.9
"Armenia_MLBA" 28.25
"Armenia_Chalcolithic" 18.5
"Iberia_EN" 12.75
"Anatolia_Chalcolithic" 2.95
"Dai" 1.65

[1] "distance%=1.2891 / distance=0.012891"
Lebanese_Christian
"Armenia_MLBA" 36.05
"Iran_Chalcolithic" 22.45
"Levant_Neolithic" 15.5
"LBK_EN" 12.1
"Iberia_EN" 6.7
"Jordan_EBA" 5.5
"Ulchi" 1.7

Rob said...

👌🏻
Fascinating
Surely this kind of EEF + steppe which keeps showing up suggests something from the west Black Sea (the Anatolian Chalcolithic is c4000 BC) so refugees from Varna-Karanovo, or something C-T like; or north Caucasus- Majkop, are what initially come to mind.

Rob said...

"Looks like it's influenced by Armenia_Chalcolithic (are there signs of Kura-Araxes influence so far west?)"

Can you clarify this question, Ryu ?

ryukendo kendow said...

Overall it seems there was
- a few percents of EHG and ENA plus very large influx of Anatolia Neolithic in Iran Chl
- massive influx of Steppe ancestry plus European Neolithic plus Iran Chl with the Kura Araxes
- massive influx of European Neolithic plus a few percents of Steppe/EHG in Anatolia Chalcolithic, though the other half of Anatolia Chl is very stably fit as Kura Araxes/Armenia Chl itself.
- then Armenia EBA sees much more Iranian affinities, plus a bit more Steppe and some Levantine affinities for the first time.
- Armenia MLBA is even more cosmopolitan
- Modern N Levantines and Mesopotamians have further influx from Europe, the Levant, and Iran.

Looks like this is what a few thousand years of civilisation can do to your ancestry.

ryukendo kendow said...

@ Rob

Since Armenia Chl is Kura-Araxes IIRC, what do you make of the large fraction of Armenia_Chl in Anatolia_Chl?

Also, since Armenia_Chl already has so much ancestry from Europe, do you think it spoke Anatolian? Or was it Anatolia_Chl that was responsible, or even Armenia_EBA with its 10% Yamnaya Kalmykia?

It will be nice to inform this picture with linguistics and archaeology.

Rob said...

Ryu

"Since Armenia Chl is Kura-Araxes IIRC, what do you make of the large fraction of Armenia_Chl in Anatolia_Chl?"

It makes sense. The KA expanded into Anatolia from the South Caucasus Piedmont. It dates from 3300 BC, so contemporary to if not slightly anterior to Yamnaya; but both are 500 years after the beginning of Majkop; which I suspect is the source of what we're seeing

The KA phenomenon was culturally and economically diverse, this differs to CWC or Yamnaya, say. I think this would translate into multilinguality, but would one such idiom be some early IE? Why not

Rob said...

"Also, since Armenia_Chl already has so much ancestry from Europe, do you think it spoke Anatolian? Or was it Anatolia_Chl that was responsible, or even Armenia_EBA with its 10% Yamnaya Kalmykia?"

I wonder if this "European farmer" like ancestry comes from Black Sea steppe / Majkop, instead of directly from Europe

For language, we'd need to ponder more and get a few more data points.
As I've often said, there might not be a simple, linear picture like the arrows in books ;)

ryukendo kendow said...

The thing that really mystifies me is the tendency for all these post-Neolithic groups to favour Neolithic ancestry from Europe instead of Anatolia, even groups in the Caucasus such as Kura-Araxes. As can be seen from some of the alternative fits above, this tendency simply refuses to go away, and keeps popping up even if you drop the offending populations, like a game of whack-a-mole.

Maybe there was an intense interaction between the Yamnaya and the Balkan Neolithics that started creating such cultural dynamism all around the Black Sea, as you said. A bit strange how such ancestry reached the Caucasus without us noticing though; don't the Barcin and Kumtepe samples close off the historical window?

Rob said...

I think the Barcin samples were too early (? pre-5000 BC). Kumtepe 6 was also late Neolithic.
Kum 4 is 3200 BC, thus perfect, but Dave wasn't able to analyse it.

Yep, something was happening...
I'll say more about this in due course.

Rob said...

At some point will you look at the reverse: European aDNA (BB, Yamnaya, BA Hungary, CWC) in light of the new Near Eastern data?

ryukendo kendow said...

@ Rob
That will have to come much later, maybe in a day or two.

Seems like the mass-source-throwing method is pretty useful. In the meantime, I already did a mass-source-throw at Hungarians previously, and these were the first, second, and third runs:

"Basque_French" 49.6
"Srubnaya" 16.45
"Lezgin" 12.85
"Hungary_BA" 7.4
"Cypriot" 5.75
"Afanasievo" 3.85
"Motala_HG" 1.95
"Nganasan" 1.4
"Loschbour" 0.65
"Hungary_HG" 0.1

Dropping Basque, Lezgin and Cypriot from the list:
[1] "distance%=0.6667 / distance=0.006667"
"Srubnaya" 36.85
"Iberia_MN" 25.1
"Druze" 14.45
"Andronovo" 10.7
"Poltavka_outlier" 4.8
"Armenia_MLBA" 4
"Loschbour" 2.3
"Nganasan" 1.8

Dropping Druze from the list:
"Srubnaya" 33.55
"Iberia_MN" 23
"Armenia_MLBA" 14.65
"Andronovo" 9.6
"Poltavka_outlier" 7.35
"LBK_EN" 5.25
"Loschbour" 2.45
"BedouinB" 2.05
"Nganasan" 1.3
"Ulchi" 0.8

The European half of Hungarians seems pretty accurate to me. The other half, the ME ancestry, seems to have reached Hungary pretty late, with the Armenia MLBA plus Bedouin % in the third run almost exactly equalling the Druze plus Armenia MLBA % in the second run and the Cypriot plus Lezgin % in the first run.
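For anyone wondering how fits like the ones above come about, here is a minimal Python sketch of a Monte Carlo mixture fit in the spirit of nMonte. It is not the actual nMonte script (which is in R): the function name fit_mixture, the slot-swapping scheme and the toy source rows A, B and C are all assumptions invented for illustration. The idea is to hold a fixed number of "slots", each assigned to a source population, and to keep a random reassignment only when it brings the averaged D-stat row closer to the target row.

```python
import math
import random

def fit_mixture(target, sources, slots=200, cycles=20000, seed=0):
    """Hill-climbing Monte Carlo fit (hypothetical sketch, not nMonte itself).

    target  : list of D-stats for the population being modelled
    sources : dict mapping source name -> list of D-stats (same columns)
    Returns (percentages per source, Euclidean distance of the fit).
    """
    rng = random.Random(seed)
    names = list(sources)
    ncol = len(target)
    # Start from a random assignment of slots to sources.
    assign = [rng.choice(names) for _ in range(slots)]
    sums = [sum(sources[n][j] for n in assign) for j in range(ncol)]

    def dist(col_sums):
        return math.sqrt(sum((col_sums[j] / slots - target[j]) ** 2
                             for j in range(ncol)))

    best = dist(sums)
    for _ in range(cycles):
        i = rng.randrange(slots)
        new, old = rng.choice(names), assign[i]
        if new == old:
            continue
        trial = [sums[j] - sources[old][j] + sources[new][j]
                 for j in range(ncol)]
        d = dist(trial)
        if d < best:  # keep the swap only if the fit improves
            assign[i], sums, best = new, trial, d
    props = {n: 100.0 * assign.count(n) / slots for n in names}
    return props, best

# Toy data: three made-up source rows and a target built as 60/30/10.
sources = {"A": [0.30, 0.10, 0.20],
           "B": [0.10, 0.30, 0.25],
           "C": [0.20, 0.20, 0.40]}
target = [0.6 * a + 0.3 * b + 0.1 * c
          for a, b, c in zip(sources["A"], sources["B"], sources["C"])]

props, d_full = fit_mixture(target, sources)

# "Dropping" a source from the list and refitting, as in the runs above.
without_c = {k: v for k, v in sources.items() if k != "C"}
props_drop, d_drop = fit_mixture(target, without_c)
print(props, d_full)
print(props_drop, d_drop)
```

Dropping a source that the target really needs forces the remaining sources to compensate and the distance goes up, which is roughly what happens between the successive runs when populations are removed from the list.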

MfA said...

Kura-Araxes is Armenia EBA, not copper age.

ryukendo kendow said...

@ Mfa

What is Armenia Chalcolithic?

ryukendo kendow said...

Updating the conclusions to reflect the error:

Overall it seems there was
- a few percent of EHG and ENA plus a very large influx of Anatolia Neolithic in Iran Chl
- a massive influx of Steppe ancestry plus European Neolithic plus Iran Chl with the Armenia_Chl
- a massive influx of European Neolithic plus a few percent of Steppe/EHG in Anatolia Chalcolithic, though the other half of Anatolia Chl is very stably fit as Armenia Chl itself.
- then Armenia EBA sees much more Iranian affinities, plus 10% Yamnaya Kalmykia and ~10% Levantine affinities for the first time.
- Armenia MLBA is even more cosmopolitan
- Modern N Levantines and Mesopotamians have further influx from Europe, the Levant, and Iran.

Looks like this is what a few thousand years of civilisation can do to your ancestry.

Gökhan said...

David, could you add Turkish_trabzon D-stats to your sheet, or just write them down here? I would appreciate it if you did.

MfA said...

I don't know; they didn't mention which culture in the paper.

On the other hand, the EBA samples are from the first phases of Kura-Araxes.

Talin necropolis (Aragatsotn Province, Republic of Armenia)
The necropolis is located at the limits of the city of Talin, and is distributed on both sides of the Talin-Gyumri Highway. The Early Iron Age remains are found in the northwestern part of the necropolis (north-western limits of Talin), in a cemetery occupying about 3 square kilometers. The Early Bronze Age and Late Bronze Age cemeteries occupy around one square kilometer. Systematic archaeological excavations at the site have been conducted since 1984, and over one hundred tombs dating from the last quarter of the 4th millennium BC through Hellenistic period have been excavated. The Early Bronze Age is represented by a ritualistic enclosure and four tombs. These are dated to the first phase of the Kura-Araxes culture, which overspread the region in the second half of the fourth millennium BCE to the early part of the third millennium BCE. The tombs are earth and stone tumuli, 0.4-0.6 meters high, but differ in their construction, with some having been built within pits, and others at ground level. Burial 115 was excavated as part of a group of 12 tombs in 2014, during rescue archaeology prior to road construction during the North-South Corridor Highway project 34,35.
• TA3/R8 (I1658): 3347-3092 calBCE (OxA-31874, 4492±29 bp). Early Bronze Age I, Burial 115, petrous bone from skull N1.



Kalavan-1 burial ground (Gegharkunik Province, Republic of Armenia)
Kalavan-1 is an open-air site 1,640 meters above sea level on the southwest slopes of the Aregunyats Range north of Lake Sevan, Northeast Armenia. Archaeological and geological investigations were conducted here between 2005 and 2009 as part of a collaborative Armenian and French project. The excavation revealed two main levels of occupation dated to the Terminal Palaeolithic, overlain by an Early Bronze Age Kura-Araxes burial ground. The total excavated area approaches 70 meters squared. Five burial pits were uncovered, of which four, referred to as UF1, UF2, UF8 and UF9, contained single primary burials, while the fifth (UF5) is a multiple burial that held the remains of at least three individuals. Six consistent radiocarbon dates on human skeletal material from UF5, UF8 and UF9 span 2900-2400 BCE, during the later part of the Kura-Araxes cultural horizon, and this is the range we use for the undated sample. Stone heaps rising to approximately 0.7m in height marked the graves of the adults. These structures were oval-shaped with a major axis of 1 meter, reaching 1.7 meters above the multiple burial. The position of the body in the pits varied: sitting, tightly flexed, and flexed. Post-sepulchral recovery of skulls and long bones occurred. The adult burials were furnished with the same assemblage of black burnished pottery that has the strongest association with the Kura basin ceramics and UF9 also contained bronze ornaments: a ring and a bracelet found near the skull. The child burial was in flexed position on its right side and was adorned with a neck ornament composed of dog molars and two stone beads, one of which was made of carnelian36,37. The two human remains (petrous bones) used in ancient DNA analyses came from the Early Bronze Age III period burials UF1 and UF9:

Rob said...

The KA falls between late Chalcolithic to "EBA", but really, it's a copper age culture technologically, not bronze.

Whatever the case, our copper age samples are c 4200 BC, thus Alikemek & Sioni horizons.

Hhhhhm. That's very early; too early for Majkop (heck earlier than Majkop); and the steppe were still just foragers.
Changes things a bit (thanks MfA).
...

Alberto said...

I read several papers about the Kura-Araxes origins (sorry, no bookmarks right here), and it was more or less clear that the people who would become the Kura-Araxes culture started to arrive in the area around 4200 BCE, even if the Kura-Araxes culture itself takes over around 3700-3500 BC. The Areni-1 cave samples seem to support this.

For the archaeological context of those Armenia_ChL samples:

https://www.researchgate.net/publication/259334407_Areni-1_Cave_Armenia_A_Chalcolithic-Early_Bronze_Age_settlement_and_ritual_site_in_the_southern_Caucasus

ryukendo kendow said...

One last thing to leave here for a while: I have some earlier runs which seem to show that the BMAC were quite Caucasus-like by the time the IArs met them. This seems plausible, as the Caucasus and surrounds seem to have been shifted north dramatically towards the Steppe by European-like Neolithic and European-like EHG/WHG/Satsurblia flows long before any Iron Age, or even IE, migrations had occurred, and presumably this would have extended to other places.

Throwing the list at Kalash:
[1] "distance%=1.0253 / distance=0.010253"


Kalash
"India_South" 28.4
"Scythian_IA" 23.15
"Satsurblia" 13.6
"Afanasievo" 8.55
"Armenia_MLBA" 7.9
"Armenia_Chalcolithic" 7.65
"Iran_Chalcolithic" 6.1
"Nganasan" 3.6
"Anatolia_Chalcolithic" 1.05

Steppe = 23 + 9 + 4 + 1 = 36%
Putative BMAC = 13 + 8 + 6 = 27%


[1] "distance%=1.233 / distance=0.01233" (India South Dropped)


Kalash
"Munda" 21.1
"Scythian_IA" 20.05
"Afanasievo" 16.8
"Iran_Chalcolithic" 13.55
"Armenia_Chalcolithic" 13.4
"Satsurblia" 13.4
"Nganasan" 1.7

Steppe = 20 + 16 + 2 = 38%
putative BMAC = 14 + 13 = 27%

[1] "distance%=1.8466 / distance=0.018466" (Munda Dropped)

Kalash
"Afanasievo" 25.85
"Iran_Late_Neolithic" 25.45
"Armenia_Chalcolithic" 18.35
"Dai" 12.3
"Satsurblia" 7.9
"Scythian_IA" 6.95
"Nganasan" 2.1
"Ulchi" 1.1

To me, it seems like the first run is the most accurate representation, but ultimately this depends on the source of ASI and IVC, and to get rid of this I ran GujaratiA as a mix of GujaratiD plus the throwing list.

[1] "distance%=0.4061 / distance=0.004061"


GujaratiA
"GujaratiD" 67.6
"Armenia_Chalcolithic" 10.3
"Afanasievo" 6.35
"Scythian_IA" 4.7
"Armenia_MLBA" 4
"Yamnaya_Kalmykia" 2.55
"Iran_Chalcolithic" 2.2
"Basque_French" 0.85
"Hungary_HG" 0.55
"Loschbour" 0.4
"Andronovo" 0.25
"Motala_HG" 0.2
"Nganasan" 0.05

Steppe = 6.35 + 4.7 + 2.55 + 0.85 + 0.55 + 0.4 + 0.25 + 0.2 + 0.05 = 15.9
Caucasus = 10.3 + 4 + 2.2 = 16.5
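The grouping sums above are easy to get wrong by hand, so here is a small Python sketch that redoes them from the GujaratiA fit just listed. The percentages are copied from that run; the assignment of populations to "Steppe" and "Caucasus" is simply read off the two sum lines above and is not an independent classification.

```python
# Percentages from the first GujaratiA run above.
fit = {
    "GujaratiD": 67.6, "Armenia_Chalcolithic": 10.3, "Afanasievo": 6.35,
    "Scythian_IA": 4.7, "Armenia_MLBA": 4.0, "Yamnaya_Kalmykia": 2.55,
    "Iran_Chalcolithic": 2.2, "Basque_French": 0.85, "Hungary_HG": 0.55,
    "Loschbour": 0.4, "Andronovo": 0.25, "Motala_HG": 0.2, "Nganasan": 0.05,
}

# Grouping as used in the two sum lines above (the commenter's choice).
groups = {
    "Steppe": ["Afanasievo", "Scythian_IA", "Yamnaya_Kalmykia",
               "Basque_French", "Hungary_HG", "Loschbour", "Andronovo",
               "Motala_HG", "Nganasan"],
    "Caucasus": ["Armenia_Chalcolithic", "Armenia_MLBA",
                 "Iran_Chalcolithic"],
}

totals = {name: round(sum(fit[p] for p in pops), 2)
          for name, pops in groups.items()}
grand_total = round(sum(fit.values()), 2)
print(totals, grand_total)
```

Keeping the grouping in one dictionary makes it obvious which source was counted under which label, and a population counted twice or not at all shows up as soon as the group totals plus GujaratiD fail to add up to the grand total.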


[1] "distance%=0.4061 / distance=0.004061"


GujaratiA
"GujaratiD" 67.55
"Armenia_Chalcolithic" 11
"Afanasievo" 5.5
"Scythian_IA" 4.45
"Armenia_MLBA" 4.2
"Yamnaya_Kalmykia" 3.15
"Iran_Chalcolithic" 1.95
"Hungary_HG" 0.9
"Andronovo" 0.8
"Loschbour" 0.3
"Nganasan" 0.1
"Motala_HG" 0.1

Steppe = 5.5 + 4.45 + 3.15 + 0.9 + 0.8 + 0.3 + 0.1 + 0.1 = 15.3
Caucasus = 11 + 4.2 + 1.95 = 17.15

[1] "distance%=0.4222 / distance=0.004222"


GujaratiA
"GujaratiD" 67.7
"Armenia_MLBA" 10.6
"Srubnaya" 7.8
"Scythian_IA" 5.75
"Iran_Chalcolithic" 3.95
"Afanasievo" 2.05
"Andronovo" 1.55
"Motala_HG" 0.5
"Hungary_HG" 0.1

ryukendo kendow said...

Here are some others:
[1] "distance%=0.8778 / distance=0.008778"


Tajik_Shugnan
"Scythian_IA" 23.65
"Armenia_Chalcolithic" 17.5
"Andronovo" 14.8
"India_South" 14.3
"Afanasievo" 7.55
"Satsurblia" 7.1
"Armenia_MLBA" 6.65
"Nganasan" 4.85
"Iran_Chalcolithic" 3.6

[1] "distance%=0.7203 / distance=0.007203"


Pathan
"India_South" 36.55
"Armenia_MLBA" 22.25
"Scythian_IA" 15.7
"Armenia_Chalcolithic" 7.85
"Satsurblia" 6.4
"Afanasievo" 5.8
"Nganasan" 2.45
"Anatolia_Chalcolithic" 1.5
"Iran_Chalcolithic" 1.5


HarappaDNA a while ago showed how both 'North European' and 'Caucasus' components were higher in high caste peoples in S Asia, and this may go part of the way towards explaining why. The mysterious local peak in excess Satsurblia ancestry in Kalash and Pathan is weird though, no idea why this is the case, but perhaps this reflects the mysterious local peak in MA-1 affinity among them as well.

Davidski said...

New datasheets with D-stats of the form D(Chimp,Rows)(Mbuti.DG,Columns) are now available at the links above.

So which of these sheets is the best?

Davidski said...

@Gökhan

Yeah, I'll run them tomorrow.

Rob said...

Ryu

Quickly: did you include Anatolia Neolithic in the analysis of Armenian Chalc & Anatolian Chalc? (Because there is no Anatolian_Neolithic in Anatolian-Chalc?!)

ryukendo kendow said...

@ Rob

Yeah, I did! I'm very surprised myself!! The entire area becomes very European from the Chalcolithic on.

MfA said...

David, is it possible to add Kurdish samples as well?

Davidski said...

@MfA

Which are the two Kurds in the Turkish set in the Human Origins?

Alberto said...

@Davidski

Thanks for those new sheets. I know that's a lot of work and time, so let's try to make it worth it.

On a very quick test I can confirm that there is a difference and not the strong pull towards Euro_HG that I was seeing before. And Spanish gets a bit of SSA now too.

I'll test and report whatever seems to stand out more. Let's see if we can figure this out.

MfA said...

Turkish Adana23113
Turkish Istanbul20040

David, if a single sample is enough you can use the Adana one; the Istanbul one seems a bit mixed.

Alberto said...

@Rob

Yes, those Armenian_ChL samples are important to figure out what was going on around 4000 BC and what came after. I'll look at those with the new sheet too to cross check RK's findings.

Rob said...

Thanks all

Samuel Andrews said...

@Ryu,
"massive influx of European Neolithic plus a few percents of Steppe/EHG in Anatolia Chalcolithic"

There was no influx of European Neolithic because Anatolia was its ancestral homeland.

Aram said...

Alberto

The people who arrived in the South Caucasus circa 4500-4000 BC are known as Uruk migrants.
But as I've said many times here, I don't think they were from Uruk. This high level of EHG proves that I was correct.

Onur said...

@MfA

Turkish Adana23113
Turkish Istanbul20040

David, If single sample is enough you can use the Adana, Istanbul one seems like a bit mixed.


There was also one person with Kurdish-like results in the Behar et al. Turkish sample set of Cappadocia; do you remember which one that was?

MfA said...

@Onur

That was "tur182". Unfortunately, the Behar samples aren't available in the Human Origins set.

Gökhan said...

David, I compared D-stats3 and D-stats1b for the Turkish sample.

It seems ds1b gives better results than ds3, in that ds1b gives East Eurasian ancestry but ds3 does not.

Samuel Andrews said...

@Everyone,

In D-stats Natufian is as distant from East Asians as modern Egyptians. More distant from East Asians than Iran_N, even though Iran_N appears to have more Basal Eurasian. To me this means Natufians had African ancestry. Lazaridis et al. 2016 didn't find affinity between Natufian and a large collection of modern Africans though. Saying Natufians have descent from an ancient African population with little affinity to most modern Africans sounds crazy but is possible.

This is what I get for Natufian when I model them as Basal Eurasian+UP North Eurasian+Yoruba+Ulchi+Nganasan. I took out all columns with Middle Eastern ancestry and my Basal reference has a 0.3 score with all Eurasians.

Natufian
Basal=30.8%
GoyetQ116-1=42.3%
Villabruna=12.8%
Eastern_HG=7.6%
Yoruba=6.5%
@ A=0.015741

I did the same test with all pre-Metal age Middle Easterners. The only other one who scores in Yoruba was Levant_N with 2.3%.

Chad Rohlfsen said...

The KA samples are 2900-2400, BA. That 4200 date on the CA Armenians is still 300 years after the start of Khvalynsk. It's possible there's no R1b south of the Caucasus before 2900BCE, as well. We'll have to wait and see.

Also, there is no SSA in Natufians. Stats show that. Iran also doesn't have more basal than Natufians. I believe that's on table 7.4 or 9.4. They're fairly equal. I'll check for ENA/Onge in Iran tonight.

Grey said...

"Maybe there was an intense interaction between the Yamnaya and the Balkan Neolithics that started creating such cultural dynamism all around the Black sea, as you said. A bit strange how such ancestry reached the caucasus without us noticing though, don't the Barcin and Kumtepe samples close off the historical window?"

sailing?

Samuel Andrews said...

@Chad,
"Also, there is no SSA in Natufians. Stats show that."

If they have no SSA, how do you explain D(Chimp, Natufian)(Mbuti, East Asia)=0.31 and D(Chimp, Modern Levant)(Mbuti, East Asia)=0.33?

Gökhan said...

Dstat1b nmonte results for Turkish sample

Armenia_MLBA 45.25
Iran_Chalcolithic 18.75
Anatolia_Neolithic 18.65
Baalberge_MN 4.15
Armenia_EBA 2.85
Han 2.70
Nganasan 2.55
Jordan_EBA 1.60
Munda 1.55
Satsurblia 1.10
Bougainville 0.45
Ulchi 0.40

Dstat3 nmonte results for Turkish sample

Armenia_Chalcolithic 52.80
Armenia_MLBA 21.15
Anatolia_Chalcolithic 12.30
Dai 5.60
Iberia_EN 4.65
Esperstedt_MN 1.55
Poltavka_outlier 1.00
Ami 0.95

DStat1b makes much more sense. In Dstat1b nMonte detected East Eurasian ancestry of around 6%, which is close to the Turkish average in several calculators. In my opinion you should discard Dstat3.

huijbregts said...

@ Davidski
"So which of these sheets is the best?"

In sheets 1, 2 and 3 the Denisovan and Neandertal rows are outliers in the first principal component.
As a consequence, the higher dimensions get less of the variance and the PCA seems 'flatter'.
Paste the following lines into a spreadsheet:

sheet PC1 PC2 PC3 PC4 PC5
1 0.960339 0.026818 0.004684 0.002693 0.002547
2 0.959983 0.026903 0.004821 0.002785 0.002509
3 0.960220 0.026428 0.005067 0.002703 0.002449
1b 0.862578 0.102225 0.012318 0.010410 0.004781
2b 0.861910 0.102503 0.012028 0.010730 0.004966
3b 0.863855 0.100077 0.012681 0.010254 0.005010

This table gives the proportion of the variance explained by each of the first 5 principal components.
It is obvious that in sheets 1, 2 and 3 the first PC steals variance from the higher dimensions.
This is undesirable.
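To illustrate the point about outlier rows, here is a small Python sketch that computes the share of total variance carried by the first principal component, with and without one extreme row. The numbers are invented toy values, not taken from any of the datasheets, and pc1_share is a hypothetical helper that assumes a plain column-centered PCA and uses power iteration to find the top eigenvalue of the covariance matrix.

```python
import math

def pc1_share(rows, iters=500):
    """Fraction of total variance on PC1 (columns centered, no scaling)."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    x = [[r[j] - means[j] for j in range(m)] for r in rows]
    # m x m sample covariance matrix
    cov = [[sum(x[i][a] * x[i][b] for i in range(n)) / (n - 1)
            for b in range(m)] for a in range(m)]
    total = sum(cov[a][a] for a in range(m))
    # Power iteration for the largest eigenvalue.
    v = [1.0 + 0.1 * a for a in range(m)]
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(t * t for t in w))
        v = [t / norm for t in w]
    top = sum(v[a] * sum(cov[a][b] * v[b] for b in range(m))
              for a in range(m))
    return top / total

# Toy D-stat-like rows (made up): five ordinary rows plus one outlier row
# with much lower values, standing in for an archaic sample.
ordinary = [
    [0.33, 0.38, 0.32],
    [0.34, 0.36, 0.33],
    [0.32, 0.37, 0.35],
    [0.35, 0.35, 0.31],
    [0.33, 0.39, 0.34],
]
outlier = [0.05, 0.06, 0.05]

share_without = pc1_share(ordinary)
share_with = pc1_share(ordinary + [outlier])
print(share_without, share_with)
```

With the outlier row included, almost all of the variance is the contrast between that row and everything else, so PC1 swallows it and the remaining components look flat, which is the pattern in sheets 1, 2 and 3 above.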

Shaikorth said...

@Samuel Andrews also this from Lazaridis et al 2016's supplementary table 3:

Fst(Natufian-Mbuti)/Fst(Natufian-Papuan) = 0.9522

Fst(BedouinA-Mbuti)/Fst(BedouinA-Papuan) = 0.9778

Fst(BedouinB-Mbuti)/Fst(BedouinB-Papuan) = 0.9901

This ratio is 1 or more for non-Africans without African admixture, and even for some Near Eastern populations that have low amounts of it (Jordanians etc). 1.03 for both Anatolian and Iranian Neolithic. Natufian affinities seem unresolved. MA-1 is also Onge-shifted compared to EHG and WHG according to the paper's figures which is something that doesn't seem to have come up before.

For the king said...

Can you guys model the modern Iranian populations (Lor, Persian and Mazandarani) ?

Gökhan said...

I vote for Dstat2b. I got best fits from that datasheet.

Olympus Mons said...

@Aram and Alberto,
Yes. That is the problem.
We have DNA for the population that lived in the southern Caucasus by the 9th millennium BC (Kotias/CHG) and we have DNA for the guys arriving by 4500 BC (Kura-Araxes)... But not the ones in between: the Shulaveri-Shomu. They arrived by the 8th millennium BC and got kicked out by 5000 BC. No DNA, but they are the KEY. Let the record show.
Once we get them, you will see a match with Bell Beaker and the birth of M269.

Olympus Mons said...

... and also, Shulaveri gave (at least in part) the CHG and the Levant to Yamnaya, diluting their EHG...

Olympus Mons said...

@Chad Rohlfsen,
" It's possible there's no R1b south of the Caucasus before 2900BCE..."

I would bet you there are bucketloads of R1b (and M269) by 5000 BCE in the southern Caucasus.
It's in the only population that has not been DNA sampled: the Shulaveri-Shomu.
And you know what... as per the latest papers, their cattle and sheep came from Anatolia and not Iran...

Matt said...

@ Davidski, while it's not possible to add the new ancients as columns in the double outgroup sheet, would the following sets be possible at all, to run?

D(Mbuti, Pop, Iran_Neolithic, Levant_Neolithic) - http://textuploader.com/53wn9
D(Mbuti, Pop, Loschbour, Levant_Neolithic) - http://textuploader.com/53wn0
D(Mbuti, Pop, Eastern_HG, Iran_Neolithic) - http://textuploader.com/53wno
D(Mbuti, Pop, Loschbour, Eastern_HG) - http://textuploader.com/53wnj
D(Mbuti, Pop, Loschbour, Israel_Natufian) - http://textuploader.com/53wnl
D(Mbuti, Pop, Levant_Neolithic, Israel_Natufian) - http://textuploader.com/53wnq
D(Mbuti, Pop, Iran_Neolithic, Kotias) - http://textuploader.com/53wnc
D(Mbuti, Pop, Levant_Neolithic, Anatolia_Neolithic) - http://textuploader.com/53wnh
D(Mbuti, Pop, Loschbour, Anatolia_Neolithic) - http://textuploader.com/53wnx

I'm interested particularly in whether recent, Bronze Age and later, populations tend to be closest to WHG, Levant_Neolithic, Iran_Neolithic, EHG and also whether they tend to be closer to Natufian or WHG. It's an interesting question to me whether present day people in Europe are closer to the earliest farmers in the Levant, or to European HG (models under nMonte suggest closer to European HG).

Gökhan said...

@for the king:

Here you go

Iranian_Lor

distance: 0,6453

Iran_Chalcolithic 51,10
Armenia_MLBA 31,35
Baalberge_MN 5,80
Munda 5,00
Jordan_EBA 4,25
Nganasan 2,15
Denisovan 0,15
Yoruba 0,10
Bougainville 0,05
Masai_Kinyawa 0,05


Iranian Mazandarani
distance:0.7819

Armenia_MLBA 46.45
Iran_Chalcolithic 42.60
India_South 4.90
Munda 3.55
Nganasan 0.95
Satsurblia 0.65
Armenia_EBA 0.50
Han 0.25
Karitiana 0.15

Iranian_Persian
distance:0.5318

Armenia_MLBA 42.35
Iran_Chalcolithic 33.15
Jordan_EBA 12.10
Munda 5.75
India_South 2.55
Ulchi 1.30
Baalberge_MN 1.05
Nganasan 0.85
Denisovan 0.30
Han 0.20
Israel_Natufian 0.15
Neandertal_Altai 0.15

Alberto said...

I've been testing D-stats3 vs. Dstats-3b, because those 2 have the EHG in the columns and share the same columns overall. For now only with the basic models for Europe based on the "big 4" European ancestors + Yoruba and Ami for the extra bits.

My impression so far is that 3b gives better results, with much lower distances and better distributed residuals. It tends to favour Anatolia_Neolithic over Loschbour (I'll check about this with EEF and MN farmers from Europe), and does better with the SSA and ENA (IMO). Just 3 examples:

With D-stats3:

Spanish
"Anatolia_Neolithic" 54.95
"Satsurblia" 14.75
"Loschbour" 14.25
"Eastern_HG" 12.45
"Ami" 3.6
"Yoruba" 0
distance=0.018571

With D-stats3b:

Spanish
"Anatolia_Neolithic" 58.6
"Eastern_HG" 14.35
"Satsurblia" 13.95
"Loschbour" 10.15
"Ami" 2.1
"Yoruba" 0.85
distance=0.011487

With D-stats3:

English_Cornwall
"Anatolia_Neolithic" 39.95
"Eastern_HG" 22.15
"Loschbour" 18.8
"Satsurblia" 16.25
"Ami" 2.85
"Yoruba" 0
distance=0.029808

With D-stats3:

Russian_Kargopol
"Eastern_HG" 29.4
"Anatolia_Neolithic" 26.1
"Loschbour" 19.25
"Satsurblia" 15
"Ami" 10.25
"Yoruba" 0
distance=0.034389

With D-stats3b:

Russian_Kargopol
"Anatolia_Neolithic" 35.3
"Eastern_HG" 32
"Satsurblia" 14.2
"Loschbour" 10.5
"Ami" 8
"Yoruba" 0
distance=0.014532

I'll move to some ancients now and West Asia.

Alberto said...

Sorry, English Cornwall here with D-stats3b:

English_Cornwall
"Anatolia_Neolithic" 47.4
"Eastern_HG" 24.35
"Satsurblia" 15
"Loschbour" 12.1
"Ami" 1.15
"Yoruba" 0
distance=0.014556

Alberto said...

@Aram

Yes, I agree. Those people could not have come from the south. The paper I linked above has detailed information about them, but I'm not great at the details about crops, pottery, etc., so I can't give an informed opinion on where they could have come from. I'll look into the models soon.

Samuel Andrews said...

@For the King,

Here's a link with results for Iranians.

https://docs.google.com/spreadsheets/d/1p8lwXd-ZMJ0yOWunJHc1cSAqzll2RAH7QImpzH0AL28/edit#gid=0

Modern Iranians fit well as Iran Chalcolithic+Steppe+South India. Could be pre-IE Iranians+proto-Iranian speakers.

For the king said...

@Gokhan and @Samuel Andrews

Awesome work guys! I wonder if the extra South Indian in Iranians came from post-BMAC Indo-Iranians, or from undiscovered Iranian Neolithic/HG populations?

German Dziebel said...

@Davidski

I remember that Stuttgart was shown to be closer to Amerindians than it was to East Asians. Are these new aDNA samples still closer to Amerindians than they are to East Asians?

ryukendo kendow said...

@ Gokhan

Nice, so it seems the Armenia MLBA+Iran Chl pattern extends from N Levantines to Assyrians to Iranians, even in other sheets.

@ Shaikorth

Shaik, MA-1 is closer to many ENA populations based on double outgroup stats that we ran ourselves, e.g. in Chad's stats. Weirdly not in other topologies though, not sure why that is.

We satisfied ourselves many times in the past by saying that e.g. this or that stat was insignificant, but I'm starting to think that, beyond a certain quite small value, no stat should be insignificant, not when we have whole arrays to compare it to, at least in my experience.

Agree about the Natufians being unresolved; it seems most likely there was a ghost population in North Africa, or a ghost population left over from OoA.

Alberto said...

Continuing with sheet 3 vs. 3b, trying to see why 3 favours Loschbour clearly over Anatolia Neolithic compared to 3b, some relatively easy 2 way admixtures with European Neolithic farmers:

With 3:

LBK_EN
"Anatolia_Neolithic" 91.7
"Loschbour" 8.3
distance=0.016261

With 3b:

LBK_EN
"Anatolia_Neolithic" 94.35
"Loschbour" 5.65
distance=0.007387

This one I think is clear. 3b seems like the better model. But with the others things are less clear.

Iberia_EN
"Anatolia_Neolithic" 87.75
"Loschbour" 12.25
distance=0.030124

Iberia_EN
"Anatolia_Neolithic" 90.55
"Loschbour" 9.45
distance=0.021458

_________________

Esperstedt_MN
"Anatolia_Neolithic" 77.6
"Loschbour" 22.4
distance=0.02717

Esperstedt_MN
"Anatolia_Neolithic" 85
"Loschbour" 15
distance=0.010968
_________________

Iberia_MN
"Anatolia_Neolithic" 74.55
"Loschbour" 25.45
distance=0.023669

Iberia_MN
"Anatolia_Neolithic" 76.25
"Loschbour" 23.75
distance=0.018863

It seems that AN + Loschbour work fine for LBK, but not so well for others, so things become more blurry. Looking at the residual from Esperstedt_MN which has the biggest difference (also in distance), D-stats3 is underfitting AN and overfitting Bichon, and D-stats3b is overfitting AN while Bichon is almost spot on.

So not too clear. I'd still lean toward 3b as slightly preferable overall in these cases, but would need more complex models maybe to have a more definitive answer (adding other WHGs maybe, I'll try to check that).
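To put the sheet 3 vs. 3b comparison for the Neolithic farmers side by side, here is a small Python sketch that tabulates the distances quoted above and the ratio between them. It assumes that in each unlabeled pair the first result is from sheet 3 and the second from sheet 3b, as in the labeled LBK_EN pair; that ordering is an inference from the layout, not something stated for every pair.

```python
# (distance with sheet 3, distance with sheet 3b), copied from the fits above.
# Ordering within each pair is assumed to follow the LBK_EN example.
distances = {
    "LBK_EN":        (0.016261, 0.007387),
    "Iberia_EN":     (0.030124, 0.021458),
    "Esperstedt_MN": (0.02717,  0.010968),
    "Iberia_MN":     (0.023669, 0.018863),
}

ratios = {pop: d3b / d3 for pop, (d3, d3b) in distances.items()}
gaps = {pop: d3 - d3b for pop, (d3, d3b) in distances.items()}

for pop in distances:
    print(f"{pop:15s} 3={distances[pop][0]:.6f} "
          f"3b={distances[pop][1]:.6f} ratio={ratios[pop]:.2f}")

biggest_gap = max(gaps, key=gaps.get)
print("largest absolute difference:", biggest_gap)
```

A lower distance on one sheet does not by itself mean the ancestry proportions are more accurate, since the two sheets use different row and column sets, so this is only a convenient way to see where the two sheets disagree most.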

Chad Rohlfsen said...

There's an unsampled group in North Africa that has West Eurasian mtDNA: the Iberomaurusians. North Africa has an ancient West Asian population, which will likely be further from ANE than WHG. They will likely have BE too, as most of their mtDNA is like that of farmers.

ryukendo kendow said...

Kurd has just posted what looks like very strong evidence that ASI ancestry, whatever that is, existed in Iran N.

Iran_N Kotias Onge Chimp 0.012 1.01 34330
Interesting that this is so significant on 30k snps when many stats struggle to reach that D on 100,000s of snps.

Kotias Iran_Chl Onge Chimp -0.017 -1.53 37422

Since Onge and Dai are equidistant to West Eurasians, either there was Onge-->Iran_N, or there was X-->Onge and X-->Iran_N.

ryukendo kendow said...

Another run done some time ago. Whatever the true affinities of Natufians, it looks like there was no input from Proto-WHGs/Paleolithic European HGs into Anatolia_Neolithic that can explain its requirement for a WHG population further 'west' on the WHG-EHG cline than any sample we have now.


column               Anatolia_Neolithic  fitted      dif
Ami2                 0.330500            0.326198    -0.004302
Anatolia_Neolithic2  0.4295000           0.3887905   -0.0407095
BedouinB2            0.3822000           0.3650823   -0.0171177
Bichon               0.3876000           0.4157568    0.0281568
Bougainville2        0.3224000           0.3202543   -0.0021457
Cypriot2             0.3998000           0.3720618   -0.0277382
Dai2                 0.3295000           0.3248864   -0.0046136
Eskimo_Naukan        0.3380000           0.3352885   -0.0027115
Georgian2            0.3947000           0.3685553   -0.0261447
Iberia_EN2           0.4233000           0.3917611   -0.0315389
India_South2         0.3451000           0.3355922   -0.0095078
Karitiana            0.3408000           0.3352351   -0.0055649
Kinh_Vietnam         0.3300000           0.3249574   -0.0050426
Kostenki14           0.3628000           0.3615754   -0.0012246
Kotias               0.3785000           0.3591586   -0.0193414
Mansi2               0.3598000           0.3547743   -0.0050257
Mixe                 0.3400000           0.3360858   -0.0039142
Mota                 0.1314000           0.1283547   -0.0030453
Motala_HG2           0.3903000           0.4056333    0.0153333
Munda2               0.3329000           0.3273437   -0.0055563
Papuan2              0.3190000           0.3163644   -0.0026356
Ulchi2               0.3319000           0.3277136   -0.0041864
Ust_Ishim            0.3261000           0.3220135   -0.0040865
Yamnaya_Samara2      0.3852000           0.3791989   -0.0060011
Yoruba2              0.1051000           0.1027874   -0.0023126

[1] "distance%=7.8506 / distance=0.078506"


Anatolia_Neolithic
"Israel_Natufian" 52.3
"Hungary_HG" 40
"Iran_Neolithic" 7.1
"Ami" 0.6
"ElMiron" 0
"GoyetQ116-1" 0
"Loschbour" 0
"Vestonice16" 0
"Villabruna" 0
"Yoruba" 0

ryukendo kendow said...


[1] "Ncycles= 1000"
column           Anatolia_Neolithic  fitted      dif
Ami2             0.3305000           0.3272568   -0.0032432
BedouinB2        0.38220000          0.36067125  -0.02152875
Bichon           0.3876000           0.4090611    0.0214611
Bougainville2    0.32240000          0.32120085  -0.00119915
Cypriot2         0.3998000           0.3711161   -0.0286839
Dai2             0.3295000           0.3260855   -0.0034145
Eskimo_Naukan    0.33800000          0.33667635  -0.00132365
Georgian2        0.39470000          0.36988515  -0.02481485
Iberia_EN2       0.42330000          0.38536865  -0.03793135
India_South2     0.3451000           0.3388738   -0.0062262
Karitiana        0.34080000          0.33756705  -0.00323295
Kinh_Vietnam     0.3300000           0.3259242   -0.0040758
Kostenki14       0.3628000           0.3599133   -0.0028867
Kotias           0.3785000           0.3653923   -0.0131077
Mansi2           0.35980000          0.35543845  -0.00436155
Mixe             0.3400000           0.3379668   -0.0020332
Mota             0.13140000          0.12875225  -0.00264775
Motala_HG2       0.39030000          0.40185565   0.01155565
Munda2           0.33290000          0.32922895  -0.00367105
Papuan2          0.3190000           0.3173185   -0.0016815
Ulchi2           0.3319000           0.3289465   -0.0029535
Ust_Ishim        0.32610000          0.32198495  -0.00411505
Yamnaya_Samara2  0.38520000          0.38091985  -0.00428015
Yoruba2          0.1051000           0.1023192   -0.0027808

(Anatolia_neolithic column dropped)
[1] "distance%=6.5608 / distance=0.065608"


Anatolia_Neolithic
"Hungary_HG" 36.8
"Israel_Natufian" 35.55
"Iran_Neolithic" 27.65
"ElMiron" 0
"GoyetQ116-1" 0
"Loschbour" 0
"Vestonice16" 0
"Villabruna" 0
"Yoruba" 0
"Ami" 0

The second one looks quite like the model obtained from the paper itself.

Another point though, all of these estimates are suboptimal because no fake model of Anatolia Neolithic is ever going to capture the pretty long drift path caused by post-admixture gene flow between the actual Anatolia_neolithic and the Middle Eastern and West Eurasian populations, so any possible model is always going to be severely underfitted to any post-Neolithic West Eurasian population. A problem which the qpWave-->qpAdm work process had some success in doing away with. So there is quite likely some compensatory misfitting. I actually suspect all post-Neolithic West Eurasian columns should be dropped in our Basal Eurasian experiments and experiments to break down the 'Big Four' or 'Big Six' (Big four plus CHG and Anatolia_Neolithic), since these drift paths are quite long when we're talking about the population history from the Big Four to modern times.

Davidski said...

@rk

"Kurd has just posted what looks like very strong evidence that ASI ancestry, whatever that is, existed in Iran N."

Nah, we ran stats using Onge with the Neolithic Iranians against Kotias and Neolithic Anatolians using 400-500K SNPs.

The Anatolians and Kotias were both (insignificantly) closer to Onge, probably because they're less basal.

So it doesn't look like the ancient western Iranians had any ASI. The difference between them and South Central Asians with only around 12% ASI in this respect in D-stats is huge.

ryukendo kendow said...

@ David

Thanks, nice to know. It seems like the 30k SNP comparisons had such a low marker count that a small fluctuation may have swayed the result in the other direction.

ryukendo kendow said...

The following would still be interesting to run though:
Chimp Iran_N Onge Dai
Chimp Iran_N Onge Ami
Chimp Iran_N Onge Korean
Chimp Iran_N Onge Ulchi
Chimp Iran_N Onge Eskimo_Naukan
Chimp Iran_N Onge Karitiana

Kurd said...

@ David

"Nah, we ran stats using Onge with the Neolithic Iranians against Kotias and Neolithic Anatolians using 400-500K SNPs.

The Anatolians and Kotias were both (insignificantly) closer to Onge, probably because they're less basal."

You may want to re-check because I believe the Onge have <100K HO overlapping SNPs.

Also, I use only the highest coverage Iran_N sample. With Iran_Chl I use the highest 2 coverage samples.

The net Iranian shift is consistent for S Indians.

result: Kotias Iran_N Andamanese Chimp -0.0124 -1.046 2264 2320 34330
result: Kotias Iran_N Onge Chimp -0.0123 -1.008 2231 2286 34330
result: Kotias Iran_N Paniyas Chimp -0.0135 -1.132 2255 2317 34330
result: Kotias Iran_N Palliyar Chimp -0.0173 -1.474 2257 2337 34330
result: Anatolia_N Iran_LN Andamanese Chimp -0.0203 -1.816 1360 1416 20275
result: Anatolia_N Iran_LN Onge Chimp -0.0236 -1.959 1340 1405 20275
result: Anatolia_N Iran_LN Paniyas Chimp -0.0222 -1.902 1354 1415 20275
result: Anatolia_N Iran_LN Palliyar Chimp -0.0236 -2.045 1353 1419 20275
result: Kotias Iran_LN Andamanese Chimp -0.0307 -1.960 1306 1389 20275
result: Kotias Iran_LN Onge Chimp -0.0305 -1.873 1295 1377 20275
result: Kotias Iran_LN Paniyas Chimp -0.0282 -1.749 1304 1380 20275
result: Kotias Iran_LN Palliyar Chimp -0.0232 -1.493 1312 1375 20275
result: Anatolia_N Iran_Chl Andamanese Chimp -0.0125 -1.757 2471 2533 37427
result: Anatolia_N Iran_Chl Onge Chimp -0.0110 -1.487 2468 2524 37427
result: Anatolia_N Iran_Chl Paniyas Chimp -0.0132 -1.789 2459 2525 37427
result: Anatolia_N Iran_Chl Palliyar Chimp -0.0144 -1.964 2473 2545 37427
result: Kotias Iran_Chl Andamanese Chimp -0.0217 -2.019 2435 2543 37422
result: Kotias Iran_Chl Onge Chimp -0.0169 -1.534 2430 2513 37422
result: Kotias Iran_Chl Paniyas Chimp -0.0162 -1.452 2454 2535 37422
result: Kotias Iran_Chl Palliyar Chimp -0.0154 -1.419 2459 2536 37422
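For anyone wanting to sanity-check lines like these, here is a minimal sketch that parses a `result:` line and recomputes D from the two site counts. It assumes the columns are the four populations followed by D, Z, BABA, ABBA and the SNP count, and it treats |Z| >= 3 as the usual rule-of-thumb significance cutoff; both are assumptions on my part, and `parse_result_line` and `d_from_counts` are hypothetical helper names.

```python
def parse_result_line(line):
    """Parse one 'result:' line into a dict (column meaning is assumed)."""
    fields = line.split()
    if len(fields) != 10 or fields[0] != "result:":
        raise ValueError(f"unexpected line: {line!r}")
    return {
        "pops": tuple(fields[1:5]),
        "D": float(fields[5]),
        "Z": float(fields[6]),
        "BABA": int(fields[7]),
        "ABBA": int(fields[8]),
        "n_snps": int(fields[9]),
    }

def d_from_counts(baba, abba):
    """D = (BABA - ABBA) / (BABA + ABBA), from the rounded site counts."""
    return (baba - abba) / (baba + abba)

lines = [
    "result: Kotias Iran_N Andamanese Chimp -0.0124 -1.046 2264 2320 34330",
    "result: Kotias Iran_N Onge Chimp -0.0123 -1.008 2231 2286 34330",
]
rows = [parse_result_line(line) for line in lines]
for row in rows:
    recomputed = d_from_counts(row["BABA"], row["ABBA"])
    significant = abs(row["Z"]) >= 3  # assumed rule-of-thumb cutoff
    print(row["pops"], row["D"], round(recomputed, 4), significant)
```

The recomputed D only matches the reported one approximately, because the printed counts are rounded. None of the |Z| values in the list above reaches 3.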

Davidski said...

There's an Onge set that has almost 600K overlapping SNPs with the Human Origins.

Davidski said...

@rk

The following would still be interesting to run though:
Chimp Iran_N Onge Dai
Chimp Iran_N Onge Ami
Chimp Iran_N Onge Korean
Chimp Iran_N Onge Ulchi
Chimp Iran_N Onge Eskimo_Naukan
Chimp Iran_N Onge Karitiana


We looked at these sorts of stats as well using a lot of markers. Iran_N is closer by something like 4-5 Z scores to Eskimos and Amerindians relative to the Onge. East Eurasians that don't have much ANE show Z scores of around 2.

ryukendo kendow said...

So East Asians are closer to Iran_N than Onge? Thanks! That's good to know.

Do you mind adding an Onge column and row to the Dstats datasheet for sheet 3b? If you have time of course.

By the way, I have something extremely interesting to show you in a bit.

Davidski said...

Gokhan & MfA

Here are those Turkish stats.

https://drive.google.com/file/d/0B9o3EYTdM8lQd19pRFpNeHVVbU0/view?usp=sharing

Rob said...

Ryu
So something simply WHG-like appears to have admixed into Anatolian farmers?
What about Natufians (modelling with basal ghost and earlier European UP)?

ryukendo kendow said...

(Rob, you may want to see this as well, let it tickle your brain.) It seems that Bell Beaker has some low level of clearly Middle Eastern/Natufian/SW Asian and/or African ancestry (which in turn seems to localize to either the Balkans/Anatolia or Iberia or both), but which in any case testifies to long-range contact and quite cosmopolitan origins...

Once again, using the mass-throwing exercise, letting things stick and then expunging 'future-->past' admixture events one after another, the first mix for Bell Beaker is this:

[1] "distance%=0.1702 / distance=0.001702"


Bell_Beaker_Germany
"Unetice_EBA" 20.85
"Yamnaya_Samara" 18.2
"Iberia_Chalcolithic" 16.9
"Hungary_BA" 11.1
"Anatolia_Chalcolithic" 10.75
"Armenia_MLBA" 5.95
"Motala_HG" 3.75
"Samara_Eneolithic" 3.15
"Loschbour" 2.7
"Iberia_EN" 2.4
"Baalberge_MN" 1.8
"Yamnaya_Kalmykia" 1.45
"Moroccan" 0.6
"Israel_Natufian" 0.15
"Satsurblia" 0.15
"Villabruna" 0.1

Other than the high level of Unetice EBA, it's mostly not ahistorical. Dropping Unetice:
[1] "distance%=0.1762 / distance=0.001762"

Bell_Beaker_Germany
"Yamnaya_Samara" 25.85
"Iberia_Chalcolithic" 23.2
"Hungary_BA" 12.95
"Anatolia_Chalcolithic" 10.05
"Armenia_MLBA" 8.9
"Motala_HG" 4.85
"Baalberge_MN" 4.1
"Samara_Eneolithic" 3.4
"Poltavka_outlier" 2.85
"Loschbour" 1.7
"Andronovo" 0.5
"Iberia_EN" 0.5
"Moroccan" 0.5
"Satsurblia" 0.4
"Villabruna" 0.15
"Iberia_MN" 0.05
"Ulchi" 0.05

Interesting stuff. For the longest time people have been debating a Southwestern, a Southeastern, a Central European, or even a North African origin of the Bell Beakers, which seems to get some support from the above analysis. From Iberia:

"Iberia_Chalcolithic" 23.2
"Iberia_EN" 0.5
"Iberia_MN" 0.05

From the Balkans, presumably:

"Hungary_BA" 12.95
"Anatolia_Chalcolithic" 10.05
"Armenia_MLBA" 8.9

Another "Exotic":
"Moroccan" 0.5

ryukendo kendow said...

If any of these signals are stable, wow, that's saying something. Since I was most weirded out by the Armenia MLBA (4%) and Moroccan (.5%) percentages, I then dropped these two, and lo and behold:

[1] "distance%=0.1874 / distance=0.001874"


Bell_Beaker_Germany
"Yamnaya_Samara" 20.7
"Iberia_Chalcolithic" 17.5
"Hungary_BA" 15.9 <-----------Increased by 3%
"Anatolia_Chalcolithic" 12.5 <---Increased by 2%
"Baalberge_MN" 6.25
"Andronovo" 4.85
"Motala_HG" 4.35
"Yamnaya_Kalmykia" 3.9
"Corded_Ware_Germany" 2.85
"Samara_Eneolithic" 2.65
"Basque_French" 2.6
"Loschbour" 1.6
"Satsurblia" 1.5
"Poltavka_outlier" 1.15
"Israel_Natufian" 1 <---------!!!
"Levant_Neolithic" 0.55 <------!!!
"Ulchi" 0.1
"Poltavka" 0.05

So the presence of these is probably not an artifact. I'm inclined to associate the 'Moroccan' with Israel_Natufian, and Armenia MLBA with the Levant_Neolithic percentages, which, together with the ~20% Iberia Chalcolithic and ~20% Anatolia Chalcolithic, seems to suggest a Southwestern and Southeastern contribution into the Bell Beaker in Germany apart from the Yamnaya contribution(?) This got me interested in the scoring of Southwest and Balkan Europe during the Copper and Bronze ages; possibly they are hiding some diversity. So I dropped Iberia_Chalcolithic and Anatolia_Chalcolithic:

[1] "distance%=0.2274 / distance=0.002274"


Bell_Beaker_Germany

"Hungary_BA" 17.5 <----------Increases 2%
"Hungary_EN" 13.75 <---------Increases to 13% from 0%
"Yamnaya_Kalmykia" 11.6
"Yamnaya_Samara" 10.45
"Andronovo" 9.5
"Baalberge_MN" 5.8
"Corded_Ware_Germany" 5.65 <--Increases 2%
"Iberia_MN" 4.35 <----------Increases 4%
"Satsurblia" 4.3 <-----------Increases 3%
"Levant_Neolithic" 3.6 <---Increases 3%
"Poltavka_outlier" 2.9
"Motala_HG" 2.6
"Hungary_HG" 2.4
"Samara_Eneolithic" 2
"Villabruna" 1.8
"Iberia_EN" 1.7 <-----------Increases 2%
"Masai_Kinyawa" 0.1 <-----------Appears for the first time

Hungarian and Iberian source populations increase in contribution, Levant Neolithic increases further, plus Maasai Kinyawa appears.
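The "increases by x%" notes are just differences between two runs, so they can be tabulated mechanically. A minimal sketch, using a subset of the percentages from the last two Bell_Beaker_Germany runs above; `share_changes` and the 1-point reporting threshold are hypothetical choices for illustration, not part of nMonte.

```python
def share_changes(before, after, min_change=1.0):
    """Return (population, change) pairs, largest gain first.

    Populations missing from a run count as 0%. Changes smaller than
    min_change percentage points are left out.
    """
    pops = set(before) | set(after)
    changes = {p: round(after.get(p, 0.0) - before.get(p, 0.0), 2) for p in pops}
    kept = [(p, c) for p, c in changes.items() if abs(c) >= min_change]
    return sorted(kept, key=lambda pc: -pc[1])

# Subset of the run that still had the two Chalcolithic sources.
before = {
    "Yamnaya_Samara": 20.7, "Iberia_Chalcolithic": 17.5, "Hungary_BA": 15.9,
    "Anatolia_Chalcolithic": 12.5, "Satsurblia": 1.5, "Levant_Neolithic": 0.55,
}
# Same populations after dropping Iberia_Chalcolithic and Anatolia_Chalcolithic.
after = {
    "Hungary_BA": 17.5, "Hungary_EN": 13.75, "Yamnaya_Samara": 10.45,
    "Satsurblia": 4.3, "Levant_Neolithic": 3.6,
}
changes = share_changes(before, after)
for pop, change in changes:
    print(f"{pop:24s} {change:+.2f}")
```

This only shows where the shares moved; it says nothing about whether a move is larger than run-to-run noise.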

ryukendo kendow said...

So I started exploring Iberia_Chalcolithic:

[1] "distance%=0.375 / distance=0.00375"


Iberia_Chalcolithic
"Iberia_MN" 47.25
"Iberia_EN" 15.55
"Baalberge_MN" 11.3
"Esperstedt_MN" 8.05
"Anatolia_Neolithic" 7.05
"Villabruna" 4.6
"Loschbour" 3.05
"Moroccan" 2.5 <-------
"Masai_Kinyawa" 0.5 <-------
"Hungary_HG" 0.05
"Basque_French" 0.05
"Esan_Nigeria" 0.05 <-------

2.5% "Moroccan" ancestry will indeed be diluted to about 0.5% "Moroccan" in BB Germany, assuming a 23% Iberia_Chalcolithic contribution into BB Germany, as seen from the second run (2.5% x 0.23 is roughly 0.6%). The fit is very good.
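The dilution argument is plain multiplication, sketched here with the numbers from the runs above (2.5% Moroccan in Iberia_Chalcolithic, 23.2% Iberia_Chalcolithic in Bell_Beaker_Germany). `diluted_share` is a hypothetical helper, and it assumes the ancestry reaches the target only through that one source.

```python
def diluted_share(share_in_source, source_share_in_target):
    """Expected share (in %) of an ancestry in the target.

    Both inputs are percentages. Assumes the ancestry arrives only via
    the given source population.
    """
    return share_in_source * source_share_in_target / 100.0

moroccan_in_iberia_chl = 2.5   # % Moroccan in the Iberia_Chalcolithic run
iberia_chl_in_bb = 23.2        # % Iberia_Chalcolithic in the second BB run
expected = diluted_share(moroccan_in_iberia_chl, iberia_chl_in_bb)
print(f"{expected:.2f}%")
```

That lands close to the 0.5% "Moroccan" the second Bell Beaker run actually reports.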

Dropping Basque French:

[1] "distance%=0.3746 / distance=0.003746"


Iberia_Chalcolithic
"Iberia_MN" 47.45
"Iberia_EN" 15.4
"Baalberge_MN" 11.3
"Esperstedt_MN" 7.95
"Anatolia_Neolithic" 7.2
"Villabruna" 4.7
"Loschbour" 3
"Moroccan" 2.4
"Masai_Kinyawa" 0.6

This actually improves the fit further. Dropping Moroccan:

[1] "distance%=0.3715 / distance=0.003715"


Iberia_Chalcolithic
"Iberia_MN" 48.8
"Iberia_EN" 13.8
"Baalberge_MN" 12.4
"Esperstedt_MN" 9.85
"Anatolia_Neolithic" 6.25
"Villabruna" 5
"Loschbour" 2.15
"Esan_Nigeria" 0.85 <------
"BedouinB" 0.75 <------
"Masai_Kinyawa" 0.1 <------
"LBK_EN" 0.05

The fit improves even more. Dropping BedouinB:
[1] "distance%=0.3727 / distance=0.003727"


Iberia_Chalcolithic
"Iberia_MN" 46.2
"Iberia_EN" 14.75
"Baalberge_MN" 12.4
"Esperstedt_MN" 10.15
"Anatolia_Neolithic" 7
"Villabruna" 4.5
"Loschbour" 2.95
"Esan_Nigeria" 0.95
"Cypriot" 0.7
"LBK_EN" 0.4
Not so good, fit declines. Dropping Cypriot and Druze:

[1] "distance%=0.3713 / distance=0.003713"


Iberia_Chalcolithic
"Iberia_MN" 48.65
"Iberia_EN" 13.1
"Baalberge_MN" 12.8
"Esperstedt_MN" 10.1
"Anatolia_Neolithic" 6.15
"Villabruna" 4.45
"Loschbour" 2.55
"LBK_EN" 1.2
"Esan_Nigeria" 0.95
"Masai_Kinyawa" 0.05
The fit improves to the best yet, giving us 1% of African ancestry total.

ryukendo kendow said...

Just trying to see what happens when Africans are dropped:

[1] "distance%=0.379 / distance=0.00379"


Iberia_Chalcolithic
"Iberia_MN" 48.15
"Baalberge_MN" 13.65
"Iberia_EN" 12.6
"Esperstedt_MN" 10.95
"Villabruna" 5.55
"Anatolia_Neolithic" 5.2
"Levant_Neolithic" 1.55
"Loschbour" 1.4
"Esan_Nigeria" 0.85
"Israel_Natufian" 0.1

While the fit is still very, very good, it worsens. The fit with a small percentage of African ancestry is probably the best, and makes geographical sense as well.
______________________

Since we already checked out Anatolia_Chalcolithic and Armenia MLBA, I moved on to Hungary BA:

[1] "distance%=0.677 / distance=0.00677"

Hungary_BA
"Baalberge_MN" 25.95
"Sintashta" 15.25
"Hungary_EN" 14.85
"Bell_Beaker_Germany" 14.45
"Hungary_HG" 10.2
"Srubnaya" 5.65
"Samara_Eneolithic" 5.1
"Unetice_EBA" 4.35
"Motala_HG" 3.05
"Armenia_EBA" 0.75
"Bougainville" 0.3
"Poltavka_outlier" 0.1

ryukendo kendow said...

Dropping Sintashta, Bell Beaker, Srubnaya, and Unetice, since these are all later than Bell Beaker and will presumably be influenced by the cosmopolitanism of Bell Beaker:

[1] "distance%=0.6864 / distance=0.006864"


Hungary_BA
"Baalberge_MN" 31.45
"Hungary_EN" 16
"Poltavka_outlier" 12.8
"Hungary_HG" 12.05
"Yamnaya_Kalmykia" 12.05
"Armenia_EBA" 6 <-------------------- Increased 5.25%
"Samara_Eneolithic" 5.7
"Motala_HG" 3.85
"Bougainville" 0.1

Dropping Bougainville:

[1] "distance%=0.6868 / distance=0.006868"


Hungary_BA
"Baalberge_MN" 32.05
"Hungary_EN" 15.2
"Poltavka_outlier" 12.8
"Hungary_HG" 12
"Yamnaya_Kalmykia" 10.9
"Armenia_EBA" 6.8 <-------------------- Increased 1%
"Samara_Eneolithic" 6.35
"Motala_HG" 3.85
"Ami" 0.05

Sintashta added back:

[1] "distance%=0.6844 / distance=0.006844"


Hungary_BA
"Baalberge_MN" 30.4
"Hungary_EN" 16.35 <----------Increased 1%
"Hungary_HG" 11.5
"Sintashta" 11.15
"Poltavka_outlier" 8.7
"Yamnaya_Kalmykia" 7.4
"Samara_Eneolithic" 6.1
"Armenia_EBA" 4.45 <-------------Decreased 2%
"Motala_HG" 3.75
"Ami" 0.2

This model has the best fit of all the models, though Sintashta did not reach the Pannonian Plain. This makes me think that Sintashta may carry Middle Eastern ancestry apart from Satsurblia.


Both Sintashta and Armenia_EBA removed:

[1] "distance%=0.6933 / distance=0.006933"


Hungary_BA
"Baalberge_MN" 29.25
"Hungary_EN" 20.6 <-----------------+4%
"Yamnaya_Kalmykia" 18.1
"Poltavka_outlier" 12.3 <----------+4%
"Hungary_HG" 12.05
"Motala_HG" 3.5
"Samara_Eneolithic" 3.3
"Satsurblia" 0.65 <-----------------+.5%, appears for the first time
"Moroccan" 0.15 <-----------------+.1%
"Ami" 0.1

So it seems the Middle Eastern ancestry carried by Hungary BA might be real, which anyway supports its Y-DNA haplogroup, J, for which this is the earliest sample in Europe apart from that in the Karelia_HG.

ryukendo kendow said...

It seems Middle Eastern/Natufian/African affinities also exist in the Hungary EN. In the original post by Davidski about the Hungarian Neolithic genomes, some Koros Neolithic genomes were shifted towards Bedouin in the PCA. Running this:


[1] "distance%=0.4988 / distance=0.004988"


Hungary_EN
"Anatolia_Neolithic" 76.1
"Iberia_EN" 14.3
"Motala_HG" 5.5
"Hungary_HG" 3.35
"Hungary_CA" 0.3
"Esan_Nigeria" 0.25 <-----
"LBK_EN" 0.1
"Masai_Kinyawa" 0.1 <-----

Quite surprising. Dropping Maasai:

[1] "distance%=0.4988 / distance=0.004988"


Hungary_EN
"Anatolia_Neolithic" 76.55
"Iberia_EN" 13.9
"Motala_HG" 5.45
"Hungary_HG" 3.35
"Hungary_CA" 0.4
"Esan_Nigeria" 0.3
"Levant_Neolithic" 0.05

Dropping Esan_Nigeria:

[1] "distance%=0.5234 / distance=0.005234"


Hungary_EN
"Anatolia_Neolithic" 73.6
"Iberia_EN" 12.55
"Motala_HG" 4.9
"Levant_Neolithic" 4.85
"Hungary_HG" 4.1

A .3% of Esan Nigeria translates to a ~5% increase in Levant Neolithic. The fit worsens, but is still very good. If I drop Esan_Nigeria but retain Maasai:

[1] "distance%=0.5234 / distance=0.005234"


Hungary_EN
"Anatolia_Neolithic" 73.6
"Iberia_EN" 12.55
"Motala_HG" 4.9
"Levant_Neolithic" 4.85
"Hungary_HG" 4.1

No change. So it seems Hungary_EN either has a fraction of Levantine Neolithic ancestry or tiny amounts of Sub-Saharan ancestry; I have trouble believing the latter though as 'Negroid' populations like Esan_Nigeria do not reach the coasts until well after the historical period.

ryukendo kendow said...

Just to compare them against another Neolithic population, presumably 'purer', LBK_EN:
[1] "distance%=0.3018 / distance=0.003018"


LBK_EN
"Anatolia_Neolithic" 68.65
"Iberia_EN" 13.05
"Hungary_CA" 7
"Iberia_MN" 6.4
"Orcadian" 2.8
"Baalberge_MN" 1
"Motala_HG" 0.7
"Karitiana" 0.4

Dropping Orcadian, Hungary_CA, and Karitiana:

[1] "distance%=0.3272 / distance=0.003272"


LBK_EN
"Anatolia_Neolithic" 71.35
"Iberia_EN" 12.45
"Iberia_MN" 8.4
"Basque_French" 2.65
"Baalberge_MN" 2.55
"Poltavka" 1.45
"Motala_HG" 0.95
"Nganasan" 0.2

Not going to continue dropping, as no sign of African or Middle Eastern ancestry appears, though very low levels of EHG and ENA appear to be present in LBK_EN.

It seems like there are at least three times when ancestry from South of the Mediterranean crossed Northwards by the time of the Bell Beaker, whether they are African, Middle Eastern or Natufian; once in the Hungary_EN period, once during the Bronze Age in the form of BR1/2 and once/twice during the Bell Beaker period.

MfA said...

Thank you Dave.

@Krefter, Ryu

Which Dstats file have you used?

ryukendo kendow said...

About the Middle Eastern input into the Bell Beaker: it's likely that the high CHG ratios found in the Bell Beaker compared to the Corded Ware, as reflected in ADMIXTURE and other algorithms, are due to this process of Bell Beaker cosmopolitanism.

Aram said...

ryukendo

"""Maybe there was an intense interaction between the Yamnaya and the Balkan Neolithics that started creating such cultural dynamism all around the Black sea, as you said. A bit strange how such ancestry reached the caucasus without us noticing though"""

Yes, it's strange that we didn't notice that until now. Recently I wrote on Anthrogenica about the Balkanic influence on Armenia. I noticed it when analysing the Y-DNA data meticulously. And the most amazing thing is that the influence came straight from the north of the Black Sea, not via Anatolia.
Something happened when Steppe "touched" North Balkans/Carpathian region.


Aram said...

But I must say that archaeologists knew that.
Aegean connections of Trialeti culture. Cyclopian masonry of fortifications in Armenia, Greece and Crimea, coloured ceramics, Balkanic deities in Hayasa and other stuff. It was not massive but it is sufficient to explain some linguistic and archaeologic issues.

Davidski said...

I'll take a look at the Bell Beakers we have with TreeMix using the new samples from the ancient Near East.

ryukendo kendow said...

The picture for them seems really complex, many of the admixtures are so small that they should be quite easy to miss. Especially if multiple admixture edges have to enter into the same population, which Treemix seems to abhor.

Nonetheless hopefully what is apparent in nMonte is confirmed there too.

Rob said...

@ Ryu

Thanks again, some interesting findings. Some things which caught my eye, in addition:

1) For BB;
I guess no one is surprised by the Yamnaya input, but the thing which catches my eye is the near absence of input from MN Germany. Rather this has been replaced by Copper Age Iberia. This is perplexing.

Whilst it might signify that ‘out of Iberia’ component long talked about, it might be an artefact of sample choice, and the tricks of terminology, again. I.e. Copper Age Iberia is more contemporary to BB _Germany than MNE_Germany, with the former dating as late as 2200 BC, whilst the latter is as early as 5000 BC.

The Hungarian input is not surprising, given its geographic & cultural centrality, and the fact that we have an early R1b from Vucedol. Neither is the “Moroccan connection”.

Maybe Frank can comment on this if he’s around

(NB: terms can trick us, so we should always note absolute dates for historicity. Example: 3500 BC would be “Copper Age” in Hungary, “Eneolithic” in Austria, “terminal Neolithic” in Greece, and “proto-Bronze Age” in the steppe or Bulgaria.)

2) Nothing too shocking in Copper Age Iberia (which dates to 3200 – 2200 BC, depending on def.) The latter half is contemporary with BB phase, and the current samples show continuity from the middle Neolithic western European milieu (although these Copper Age Iberian samples aren’t actually from Beaker contexts).

3) EBA Hungary

Did you include Starcevo or LBK here? If so, its preference for MN Germany is notable. It is somewhat surprising at first glance, but it does make sense, because the Balkan Neolithic collapsed, and I suspect new Neolithic ancestry came from central Europe, where the earliest proto-Boleraz assemblages are found (Slovakia/northern Hungary). This is also corroborated by the increasing appearance of I2a2.

20% steppe influence is consistent with previous estimates.

What happens if you throw Co-1 from Baden into this also ?

4) The modelling of LBK & Hungary EN. I am confused by the use of Iberia EN. Isn’t this “ahistoric”? Iberia EN came after LBK and Hungary EN, chronologically & spatially.


Here, I think it would be more important to see what a mix of Barcin/Kumtepe 6 (i.e. inland NW Anatolia) vs Greek Neolithic ('seaborne route') does. Same for Iberia EN.

ryukendo kendow said...

@ Rob

I did not choose any of the components, the components 'chose themselves', if that makes any sense, since all the ancient samples present in David's dataset are present in the list thrown at the target populations.
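To make "the components chose themselves" a bit more concrete, here is a toy Monte Carlo fit loosely in the spirit of this kind of modelling. It is not the actual nMonte script: the batch size, cycle count, acceptance rule and Euclidean distance are my assumptions, and the source names and vectors are invented numbers, not real D-stat rows. The target is built as exactly 60% Source_A plus 40% Source_B, so the fit should give Source_C close to nothing without being told to.

```python
import math
import random

def monte_carlo_mix(target, sources, batch=200, cycles=20000, seed=0):
    """Fit mixture shares by random single-slot swaps (hill climbing).

    A batch of slots each holds one source; the model is the batch average.
    A swap is kept only if it lowers the Euclidean distance to the target.
    """
    rng = random.Random(seed)
    names = list(sources)
    dims = range(len(target))
    picks = [rng.choice(names) for _ in range(batch)]
    total = [sum(sources[p][d] for p in picks) for d in dims]

    def distance(vec_sum):
        return math.sqrt(sum((vec_sum[d] / batch - target[d]) ** 2 for d in dims))

    best = distance(total)
    for _ in range(cycles):
        slot = rng.randrange(batch)
        new = rng.choice(names)
        old = picks[slot]
        trial = [total[d] - sources[old][d] + sources[new][d] for d in dims]
        dist = distance(trial)
        if dist < best:
            picks[slot], total, best = new, trial, dist
    shares = {n: 100.0 * picks.count(n) / batch for n in names}
    return shares, best

# Invented toy vectors (NOT real data); target = 0.6 * A + 0.4 * B.
sources = {
    "Source_A": [0.10, 0.02, 0.05, 0.00],
    "Source_B": [0.00, 0.08, 0.01, 0.04],
    "Source_C": [0.03, 0.03, 0.09, 0.07],
}
target = [0.060, 0.044, 0.034, 0.016]
shares, best = monte_carlo_mix(target, sources)
print(shares, round(best, 6))
```

Sources that do not help the fit simply lose their slots, which is why a long candidate list can be thrown at a target and most of it ends up at or near 0%.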

Thought this may surprise some people, who expected a simple scenario of local mixture with Central European Neols, and then movement into Western Europe. This seems more like a movement from Iberia, specifically Iberia and not other places due to the slight African ancestry in Iberians distinguishing them from other Neolithics, combined with a movement from the Steppe and the Balkans with some Middle Eastern ancestry, creating a mixed superstrate in the Bell Beaker. Or perhaps the movement was due to elite circulation in a network stretching from Iberia to a balkanised and steppe-ised Central Europe, thus allowing the two populations to be contemporaneous.

About the Anatolia_Chalcolithic thing, I think we should also hold in mind that some of the populations are replaceable, as the constant dropping shows, and are present due to their slight tendency towards this or that column e.g. Cypriot, Bedouin, such that Anatolia MLBA can be replaced with a mix of Baalberge MN, Armenia MLBA, and levant Neolithic, which in turn is replaced by a mix of Hungary_EN, Armenia EBA and Masai Kinyawa, etc. etc. The identities are a bit less fixed than a single run will have us think.

ryukendo kendow said...

Also, BB are seen in most analyses to be relatively simple 2-way mixes, but I think, after this analysis, that this is mostly due to constraining the analysis to Central European neolithics and Steppe, and that throwing a mixed bag at them shows us something much different.

About CO1, it's already present in the list, but it's not chosen in the model. For the Iberia_EN issue, I did not clear up all the ahistorical populations in the LBK_EN model, as the focus was more on seeing whether the Levant-African affinity is only in Hungary_EN or in all post-Anatolia Neolithic populations, so I stopped dropping after a while.

Alberto said...

Not to bore with models comparing D-stats3 vs. D-stats3b, just a summary. The SSA admixture, I think, is quite clearly better handled by 3b. For example, Palestinians with 3b get 3.6% Esan (with Natufian included in the pops), while with 3 they get 1.6% and an underfitted Yoruba column.

Then there is the tendency of a bias toward Euro_HG and ENA that I first noticed. Not always easy to say which is more correct, though in general the distances in 3b are quite lower and the residuals better balanced. For example, a model for Bell Beaker with Yamnaya and MN + WHG, with 3 they get 51% Yamnaya and a distance of 0.014524, while with the same populations with 3b they get 43.5% Yamnaya + 3.5% Satsurblia, which seems more in line with what we've seen before, and the distance goes down to 0.0043 (BTW, also they get 0.75 Yoruba with this last model, while in the other they don't, but Yoruba column stays underfitted).

So overall I didn't find any model that with D-stats3 looks clearly better, but I did find many that with 3b do look clearly better. So for now, unless someone else is seeing something different, I'll stay with 3b for the models I post. I have RK's first models posted above for some comparison, so I hope that will suffice.

Rob said...

Ryu

Yep, what we're seeing with BB makes sense, and certainly has ramped up the former models. I suspect we'll see similar things with other periods, esp when we start getting later Bronze Age & Iron Age samples in the future.

About LBK, etc., I see; it wasn't your main aim to decode the individual components of the central European Neolithic. But I think this, too, would be interesting in the future: which it prefers out of Greek vs Anatolian, and the same with EN Iberia. I think the actual Greek Neolithic paper attempted a similar thing.

Davidski said...

Matt,

https://drive.google.com/file/d/0B9o3EYTdM8lQUVVJUDVzcXJETk0/view?usp=sharing

huijbregts said...

@ Ryu
I agree with Rob that Iberia_Chalcolithic is an odd reference population for Bell_Beaker_Germany.
Especially since you yourself found 20% Baalberge_MN+Esperstedt_MN in Iberia_Chalcolithic; that is a heavy Bell Beaker smell.
By the way which Dstats sheet did you use? I hope it was not D-stats1.

ryukendo kendow said...

For the BB analyses, D stats 3b.

I'm not sure that Esperstedt MN and Baalberge MN in Iberia Chalcolithic represents much of anything; to me their combined 20% on top of a base of 80% Iberia_MN and Iberia_EN suggests long term gene flow in communities in the Neolithic.

On the other hand, if I drop Iberia Chalcolithic, the one that strongly increases is Hungary_EN, which is a more unexpected pattern of behaviour. Esperstedt MN and Baalberge MN don't show up.

To clarify my view on what I see these stats as suggesting:

Many populations, when dropped, are replaced by populations similar to them, or more basic populations which constitute them. For example, when Armenia MLBA and Moroccan are dropped, they are replaced with Hungary BA, Anatolia Chalcolithic, Levant Neolithic and Natufian. When Anatolia_Chalcolithic are dropped, they are replaced with Hungary EN, Satsurblia, Hungary BA. This means that the Levant Neolithic, Satsurblia, and other such Middle Eastern affinities in the Bell Beaker really do exist, as they are repeatedly reflected in source populations suggested by the algorithm, even after the best sources are removed; the 'exotic' choices are not present due to chance. This is the first layer of interpretation.

The second layer comes from the fact that Anatolia_Chalcolithic and Iberia Chalcolithic were suggested in the first place, telling us that short drift paths are shared in common between these sources and the target population to the exclusion of other populations like them. This layer of evidence is much weaker, as the dropped populations do not result in large declines in goodness of fit, so the evidence, while suggestive, can be quite weak.

Nevertheless, in contrast to the previous work having Bell Beaker as a simple two way mix of Central Europeans and Steppe peoples, which led some people to suggest a simple movement from Central Europe to Western Europe, this analysis seems to suggest that there was a movement from Iberia, (specifically Iberia and not other places due to the slight African ancestry in Iberians distinguishing them from other Neolithics,) combined with a movement from the Steppe and the Balkans with some Middle Eastern ancestry, creating a mixed superstrate in the Bell Beaker. Or perhaps the movement was due to elite circulation in a network stretching from Iberia to a balkanised and steppe-ised Central Europe, thus allowing the two populations to merge in the elite members. I think the old two-way mixtures were limited by a constrained choice of source populations, and a failure to appreciate how complex the situation could possibly be.

Alberto said...

A first look at the Armenia_Chalcolithic samples. Starting with this model:

Armenia_Chalcolithic
"Iran_Chalcolithic" 44.25
"Anatolia_Neolithic" 31.6
"Eastern_HG" 16.05
"Satsurblia" 6.65
"Israel_Natufian" 1.45
"Hungary_HG" 0
"Loschbour" 0
"Esan_Nigeria" 0
"Esperstedt_MN" 0
"Iran_Neolithic" 0
"Motala_HG" 0
"Ami" 0
"Levant_Neolithic" 0
distance=0.007587

So here the high EHG does appear. The paper has it best modelled as 52.5% Anatolia_N, 29.2% Iran_N and 18.3% EHG. Iran Chalcolithic is older than Armenia_Chalcolithic (going back to 4800 BC), but trying with it:

Armenia_Chalcolithic
"Anatolia_Neolithic" 49.2
"Iran_Neolithic" 21.55
"Eastern_HG" 16.65
"Satsurblia" 12.6
distance=0.009873

This comes relatively close to what the paper shows, and still showing the Iranian input. But to check what is Iran_Chalcolithic:

Iran_Chalcolithic
"Iran_Neolithic" 44.35
"Anatolia_Neolithic" 38.8
"Satsurblia" 14.85
"Ami" 1.2
"Eastern_HG" 0.8
"Hungary_HG" 0
"Loschbour" 0
"Esan_Nigeria" 0
"Esperstedt_MN" 0
"Israel_Natufian" 0
"Motala_HG" 0
"Levant_Neolithic" 0
distance=0.01166

Like a mix of Iran_N and something more northern, from Anatolia/Caucasus area. The mystery is where did this high EHG come from.

And what I'm not seeing is the high "European" ancestry in the above RK's models. The 20% Esperstedt_MN doesn't show at all. Adding Iberia_EN and the ahistorical Bell_Beaker and Afanasievo:

Armenia_Chalcolithic
"Iran_Chalcolithic" 40.65
"Anatolia_Neolithic" 31.55
"Afanasievo" 12.6
"Eastern_HG" 9.8
"Satsurblia" 3.5
"Israel_Natufian" 1.8
"Esan_Nigeria" 0.1
"Hungary_HG" 0
"Loschbour" 0
"Esperstedt_MN" 0
"Iran_Neolithic" 0
"Motala_HG" 0
"Ami" 0
"Levant_Neolithic" 0
"Bell_Beaker_Germany" 0
"Iberia_EN" 0
distance=0.007351

It does take Afanasievo, but still keeping a good part of EHG. No Bell Beaker or Iberia_EN. So Basically these samples look a 3 way mix of something Iranian, something Anatolia/Caucasus and something EHG.

Rob said...

Thanks Alberto

In Ryu's model yesterday, I was very surprised to see the absence of ANF in Anatolia_Chalcolithic, but thought it within the realm of possibility, as the "Western Farmer" population did appear to have shifts & declines in population. The migration of Balkan-like farmers was perplexing too, as I'd never imagined that.

What does Anatolian Chalc look like ?

With the EHG in Armenian Chalcolithic, does it prefer Afanasievo & "EHG" over a group like Khvalynsk or Yamnaya?

ryukendo kendow said...

Looks like the bias towards Euro_HG in D stats 3 (on which yesterday's mixes were done) may have dragged the neolithics along as well.

Kristiina said...

"No use on carrying coals to Newcastle"
In the end, it is somewhat amusing if it turns out that proto-IE was not spoken on the steppe or Caucasus or Near East but was spoken in a culture such as Vinča culture (c. 5700–4500 BC) which provides the earliest known example of copper metallurgy, or Globular Amphora Culture ca. 3400–2800 BC in the proximity with Vinča culture. Maybe it is in Globular Amphora Culture that we will see R1a1 and R1b together.

Of course, this is only one of the several possible Europe-centered models. In the future it will be rejected, modified or confirmed as other models.

Karl_K said...

@RK

"this analysis seems to suggest that there was a movement from Iberia"

I feel like many people here have been saying this for a very very very long time. This is expected. The Bell Beaker people have never fit well with a 2 population model.

I really like your analysis, but who is surprised?

Alberto said...

@Ryu

Yes, it seemed to me that the very big shift towards "Europe" was at least partially due to a technical problem. Things look more balanced with this other sheet.

@Rob

Yes, if I add Samara_Eneolithic it takes a good part of EHG:

Armenia_Chalcolithic
"Iran_Chalcolithic" 41.2
"Anatolia_Neolithic" 31.65
"Afanasievo" 11.15
"Samara_Eneolithic" 9
"Eastern_HG" 3.05
"Satsurblia" 2.65
"Israel_Natufian" 1.2
"Esan_Nigeria" 0.1
distance=0.007178

Not surprising since Samara_Eneolithic is very EHG (but a mix of 3 different samples, so I'm not usually including it as a source).

For Anatolia_Chalcolithic (pops that get 0% not shown):

Anatolia_Chalcolithic
"Anatolia_Neolithic" 55.25
"Satsurblia" 23.95
"Iran_Chalcolithic" 7.75
"Eastern_HG" 6.35
"Levant_Neolithic" 5.65
"Esan_Nigeria" 1.05
distance=0.013831

By the 1% SSA and the not great distance it seems that the sample is a bit noisy (it is only one and low coverage). But otherwise it looks quite less "Iranian" and "EHG" than Armenia_ChL. If we had Kotias instead of Satsurblia it would probably be mostly Anatolia_Neolithic and CHG, but probably still with some bit of extra EHG. Adding Armenia_ChL to the source pops (historically correct, since they are a few centuries older):

Anatolia_Chalcolithic
"Anatolia_Neolithic" 45.55
"Armenia_Chalcolithic" 34.6
"Satsurblia" 18.4
"Esan_Nigeria" 1.1
"Eastern_HG" 0.25
"Levant_Neolithic" 0.1
distance=0.013177

ryukendo kendow said...

Not sure that I've seen any *genetic* evidence for this, till now.

Rob said...

Expected but it needed to be demonstrated.

Rob said...

Kudos Ryu

Alberto

Thanks. It's curious that Armenian Chalcolithic prefers Afanasievo and older-type EHG over the nearer Yamnaya Kalmykia, but again chronology could be responsible.

More importantly, if there was steppe type input in Chalcolithic Armenia (4000 BC); we should bank on it being present in Late Neolithic Eastern Europe (eg C-T, post Varna groups); & the Baltic.

About Anatolian Chalcolithic: Hhhmm
That extra CHG again. What does that mean if it prefers archaic CHG over more contemporary choices ?

But basically massive input from / via Armenia .

Alberto said...

@Rob

I had only added Afanasievo as an ahistorical steppe population because that's what showed up in RK's model above. Yamnaya_Kalmykia does work better and takes more EHG when added:

Armenia_Chalcolithic
"Iran_Chalcolithic" 40.1
"Anatolia_Neolithic" 29.75
"Yamnaya_Kalmykia" 20.7
"Eastern_HG" 5.9
"Israel_Natufian" 1.85
"Satsurblia" 1.7
distance=0.006881

But these samples are some 1500 years after the Armenia_ChL ones. Also, in the paper's f3 stats for admixing populations, Armenia_ChL didn't show signs of recent admixture, which makes it more mysterious (though it could be a technical problem, but I have no reason for thinking it is).

In any case, yes, probably the EHG-CHG like admixture (or Yamnaya-like) was present along the Black Sea long before Yamnaya. It would be interesting to get samples from Varna, C-T and other cultures from the area to see if it had reached there already. Also awaiting those Globular Amphora ones.

Kristiina said...

Are you so interested in Armenia_Chalcolithic (4300–3400 BCE), because your idea is that the Anatolian IE branch was introduced by people who brought EHG to Armenia 4300 BC? As many of you connect yDNA R with proto-IE, do you think that it was R1b men who brought EHG to Armenia 4300 BC, but it just accidentally happened that the Chalcolithic samples were L1a and the later sample with more WHG and less EHG was R1b and not the yDNA that in reality decreased the steppe affinity? Of course this is not impossible.

In any case, if you look at this map: https://en.wikipedia.org/wiki/Indo-European_languages#/media/File:IE5500BP.png you see that proto-IEs may have been more Balkans-shifted than previously presumed.

Olympus Mons said...

@davidski...
Are you reading ryukendo kendow on BB? Oh yes, I am here, and I already have the popcorn.

Olympus Mons said...

@ryukendo kendow,
I don't know who you are, but if you ever drop by Lisbon, just send me an email. Lunch is on me!

Rob said...

Kristiina
Individual lineages are of secondary importance, but are still relevant. But really, what matters more is full analysis of all regions.
Armenia is highly interesting for 2 reasons. First, we have a darn good view of it now, from the Neolithic to the Iron Age. So that's an amazing overview to contribute toward a pan-Eurasian understanding.

Secondly, the South Caucasus' position is a connection between Anatolia-Balkans, the steppe and Central Asia.

MfA said...

There is also U4a in Armenia_ChL, steppe marker.

Armenia_ChL could be a highly drifted tribe. That's probably why it doesn't show recent admixture. AFAIK no one has checked IBS between the ChL samples yet. All males are L1a, some of them could even be 1st degree relatives according to the carbon dates, and two are K1a8, so it doesn't seem like there was much diversity.

Olympus Mons said...

@Rob,

On the southern Caucasus, you do not have the most important "people": the ones who were there from the 8th millennium (when they arrived) to the end of the 6th millennium (when they were kicked out completely).
Prior to them was CHG; after them was a mix of lots of different people that formed the Chalcolithic and the Bronze Age you see in here, but not "them".

Olympus Mons said...

@Davidski,
So, BB did have some SSA? I thought it was a "defect" of sampling. :)

Not only did they have SSA, but they picked up L3 women in Egypt (Merimde and el-Omari).

Recap of the "real" history from Shulaveri2BellBeaker:
1- 7th millennium in the southern Caucasus as Shulaveri-Shomu, where M269 was born, apparently coming from Anatolia (because of cattle and goat DNA).
2- By 4900 BC "they" were completely kicked out; by then they were a mix of EHG, CHG and Anatolia Neolithic. All their settlements were abandoned, and some have a layer of ashes between them and the ones that replaced them (Sioni, going on to Kura-Araxes), with different pottery, different architecture, etc.
3- By 4800 BC they were in Tel Tsaf, northern Israel. So I suppose the place they were kicked out to by Ubaid or L1a from Iran was to the west, and that is why the 2 places with the highest variance of R1b are eastern Anatolia and… the place in Armenia where that R1b was just found, near Lake Sevan.
4- By 4700 BC they were in Nalchik, north Caucasus, and so forth; that is why Yamnaya is so close to Bell Beaker. They, the Shulaveri, diluted the EHG in them and gave them CHG and Levant DNA.
5- By 4700 BC they were settling heavily in Merimde and el-Omari in the Nile delta in Egypt, and having cattle binge parties in Fayum, near the lake. Was L51 born there?
6- 4000 BC: again, the same as with Ubaid, the pre-dynastic pharaonic Egypt, with the crazy Badarian in southern Egypt moving north, kicked them out.
7- By 3700 BC they were arriving in Iberia, kicked out by the 5.9 kiloyear climatic event that made the Sahara desert, alongside the Berber (E1b1) guys.
8- By 3300 BC they were amassing in large cities in Iberia, such as Porto Torrão in the lowlands of Iberia, as big as the city of Ur.
9- By 3000 BC they were building the city of Zambujal, where the Bell Beaker actually arose.
10- By 2700 BC they had crossed the Pyrenees. … And that is the Bell Beaker story.

Isn’t that what the DNA is telling us? At least they told everyone. When Periplus, the 700 BC Greek mariner, met them there, they told him who they were.
“We are the people who have been living here for a long time, but were kicked out of our homeland (southern Caucasus) by an attack of serpents (Ubaid/Uruk)… We are the Oestrimni!”

See chapter – Those who fled the serpents.
http://shulaveri2bellbeaker.blogs.sapo.pt/suppl-i-they-who-fled-the-serpents-5061


Davidski said...

So, BB did have some SSA? I thought it was a "defect" of sampling.

Not of sampling, but of post-mortem deamination damage.

Some of the Bell Beaker samples aren't UDG treated, and this is often expressed as very minor Sub-Saharan admixture.

Olympus Mons said...

Ryu,

"...Since I was most weirded out by the Armenia MLBA (4%) and Moroccan (.5%) percentages, I then dropped these two..."

Hey, don't drop them so quickly... they (the Bell Beaker stock of people) were in Armenia up until 4900 BC and went by Morocco by 3500 BC... that is where the Strait of Gibraltar is. :) - So, do you really have to drop them?!

Kristiina said...

Rob, yes, I agree. There are so many languages spoken around Caucasus that it is not really at all easy to sort out the linguistic history in the area.

Maybe the first metallurgists in the Balkans spoke a Northwest Caucasian type language and the Caucasian substrate in proto-IE comes from there. In my model above, Globular Amphora Culture spoke proto-IE and replaced the earlier Corded Ware language in the North. Corded Ware area overlaps in the east with the Uralic area and that could explain the similarities between Uralic and IE languages.

In any case, I think that the carriers of the expansive languages must have had a technological/ political advantage with respect to groups speaking other languages.

Kristiina said...

Globular Amphora Culture is interesting from the IE point of view:

"A further highly interesting aspect is the connection between some human burials and cattle burials or deposits. In particular, regularly observed deposits consisting of two animals in antithetic crouched position, which are widely interpreted as a harnessed bovine team, seem to be characteristic of the time period for the GAC. These findings underline the extraordinary status enjoyed by domestic animals, which is often used to argue that the agricultural practices of the GAC were mainly based on cattle breeding."

"What were the reasons for the GAC’s integration with other local groups and its widespread expansion? One possibility could be the desire to access local raw materials, such as salt, amber, copper or flint. Or perhaps it was the complementary system of agriculture? The agricultural system in question permitted the opening up of previously unpopulated areas with less fertile soils. With the climatic decline, it offered the local cultural groups the acceptable alternative of subsistence agriculture, which then caused the further expansion of the GAC in those regions."

https://www.topoi.org/project/topoi-1-20/

It seems that the wheel was also present in Bohemia/Moravia at the same time:
The number of inhabitants started growing with the spread of new agricultural techniques c. 3500-2000 BCE (the wheel, the lister or sulky plough cattle breeding).
Central Europe in the High Middle Ages: Bohemia, Hungary and Poland, c.900–c p. 45

According to Wikipedia: "The first evidence of wheeled vehicles appears in the second half of the 4th millennium BCE, near-simultaneously in Mesopotamia (Sumerian civilization), the Northern Caucasus (Maykop culture) and Central Europe (Cucuteni-Trypillian culture), so the question of which culture originally invented the wheeled vehicle is still unsolved.

The earliest well-dated depiction of a wheeled vehicle (here a wagon — four wheels, two axles) is on the Bronocice pot, a c. 3500 – 3350 BCE clay pot excavated in a Funnelbeaker culture settlement in southern Poland."

"Cow" and "wagon" are maybe the two most important words reconstructed for proto-IE, and they are attested early in this area.

Matt said...

@ Davidski: Thanks for these - https://drive.google.com/file/d/0B9o3EYTdM8lQUVVJUDVzcXJETk0/view?usp=sharing.

I see you've already noticed that:

D (Mbuti.DG Ami Iran_Neolithic Levant_Neolithic) = -0.0134 -3.216 19361 19888 394093

D (Mbuti.DG Munda Iran_Neolithic Levant_Neolithic) = -0.0185 -4.66 19200 19922 394093

implies a stronger connection to ENA, and particularly to Munda (as ENA+South Indian), for Iran_Neolithic than is present for Levant_Neolithic.

The strongest stats for Iran_Neolithic vs Levant_Neolithic are:

D (Mbuti.DG Iran_Late_Neolithic Iran_Neolithic Levant_Neolithic) = -0.075 -11.972 10330 12005 220049

D (Mbuti.DG GujaratiC Iran_Neolithic Levant_Neolithic) = -0.0244 -6.694 19318 20283 394093

then the recent Mediterranean and Early European Farmers are at the other end.

Also:

D (Mbuti.DG Samara_Eneolithic Iran_Neolithic Levant_Neolithic) = -0.0022 -0.417 15359 15427 302806

D (Mbuti.DG Yamnaya_Samara Iran_Neolithic Levant_Neolithic) = -0.0083 -2.124 19878 20212 393394

Right direction, but not very significant or relatively weak, possibly because of the dominance of EHG ancestry.
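As a cross-check on how to read these lines, here is a small sketch that parses a few of the stats quoted above and recomputes D from the two count columns. It assumes the five numbers after the populations are D, Z, BABA count, ABBA count and number of SNPs, and it uses |Z| >= 3 as the significance cut-off; both are assumptions (the cut-off is only a common convention), and `parse_dstat` and `d_from_counts` are hypothetical helper names.

```python
# Lines copied verbatim from the comment above.
lines = [
    "D (Mbuti.DG Ami Iran_Neolithic Levant_Neolithic) = -0.0134 -3.216 19361 19888 394093",
    "D (Mbuti.DG Munda Iran_Neolithic Levant_Neolithic) = -0.0185 -4.66 19200 19922 394093",
    "D (Mbuti.DG Samara_Eneolithic Iran_Neolithic Levant_Neolithic) = -0.0022 -0.417 15359 15427 302806",
    "D (Mbuti.DG Yamnaya_Samara Iran_Neolithic Levant_Neolithic) = -0.0083 -2.124 19878 20212 393394",
]

Z_CUTOFF = 3.0  # ASSUMPTION: conventional threshold, not taken from the thread

def parse_dstat(line):
    """Split one line into populations and numeric fields (assumed column order)."""
    head, tail = line.split(")", 1)
    pops = head.split("(", 1)[1].split()
    d, z, baba, abba, nsnps = tail.lstrip(" =:").split()
    return pops, float(d), float(z), int(baba), int(abba), int(nsnps)

def d_from_counts(baba, abba):
    """D recomputed as the normalised difference of the two site-pattern counts."""
    return (baba - abba) / (baba + abba)

results = []
for line in lines:
    pops, d, z, baba, abba, nsnps = parse_dstat(line)
    recomputed = d_from_counts(baba, abba)
    significant = abs(z) >= Z_CUTOFF
    results.append((pops[1], d, recomputed, z, significant))
    print(f"{pops[1]:<18} D={d:+.4f} recomputed={recomputed:+.4f} Z={z:+.3f} "
          f"{'significant' if significant else 'not significant'}")
```

Under these assumptions the recomputed values agree with the reported D to four decimals, and only the Ami and Munda stats pass the cut-off, which fits the reading that the Samara_Eneolithic and Yamnaya_Samara stats point the right way but are weak.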

Also interesting:

D (Mbuti.DG Masai_Kinyawa Levant_Neolithic Israel_Natufian) = 0.0018 0.577 9717 9682 238817
D (Mbuti.DG Somali Levant_Neolithic Israel_Natufian) = 0.0032 0.986 10119 10055 238817

So it doesn't seem that Levant_Neolithic is significantly more related to modern East Africans than Natufians are.

Also, D(Mbuti.DG Pop Levant_Neolithic Israel_Natufian) is most extreme for EEF, not WHG, which suggests that Levant_Neolithic does not work as WHG+Israel_Natufian, and that there is important extra shared drift beyond that mix that favours Anatolians.

Generally, the stats for D(Mbuti.DG Pop Ancient1 Ancient2) for The Four (Levant_Neolithic, Iran_Neolithic, EHG, WHG), plus also Anatolia_Neolithic, seem to show that drift sharing for moderns is greatest with Loschbour (even though this is small compared to the sharing of Loschbour with other members of the WHG clade).

E.g.

D (Mbuti.DG Spanish Loschbour Levant_Neolithic) = -0.0404 -10.932 18815 20401 388452
D (Mbuti.DG Sardinian Loschbour Levant_Neolithic) = -0.0233 -6.238 19197 20112 388452

but then even

D (Mbuti.DG Spanish Loschbour Anatolia_Neolithic) = -0.0111 -3.467 24662 25217 501891

D (Mbuti.DG Sardinian Loschbour Anatolia_Neolithic) = 0.009 2.875 25221 24769 501891

Comparing D(Mbuti.DG Pop Anatolia_Neolithic Levant_Neolithic), the top populations, most related to Anatolia compared to Levant (judging by Z), were:

D (Mbuti.DG LBK_EN Levant_Neolithic Anatolia_Neolithic): 0.0351 14.885 23077 21510 453450
D (Mbuti.DG Lithuanian Levant_Neolithic Anatolia_Neolithic): 0.0333 14.902 22682 21220 454017
D (Mbuti.DG Sardinian Levant_Neolithic Anatolia_Neolithic): 0.0331 15.303 22814 21353 454017

Less significant for others.

Kristiina said...

I correct myself again!

Kuznetsov and Khokhlov write in "ETHNOCULTURAL RELATIONS OF THE STEPPE HABITANTS OF EASTERN EUROPE IN THE EARLY BRONZE AGE" that
"The initial period of the Bronze Age is represented by Yamna cultural and historic community. Comparison of radiocarbon dates of the two main areas of this community, the western (territory of Ukraine) and the eastern (the Volga River and Ural regions), confirms the hypothesis about the eastern origin of Yamna culture. The western area of Yamna cultural and historic community covers the period from 3000 to 2300 BC, while the eastern one covers the period from 3500 to 2900 BC. The eastern origin and the further expansion to the west of the bearers of Yamna culture is also confirmed by the data on funeral customs and inventory."
http://www.nbuv.gov.ua/old_jrn/Soc_Gum/Archeology/2011_1/Atr_1.pdf

Yamna Samara yDNA is mainly R1b-Z2103 and one R1b-L23. R1b-Z2103 is not typical for IE speakers.

On the basis of the current evidence, Globular Amphora Culture starts in the Kujawy region of Poland 3400 BC (with only a difference of 100 years to Yamna Samara), and c. 2900 BC it transforms into Corded Ware "in a number of "centers" which subsequently formed their own local networks" (Wikipedia).

According to Woidich, 2014, "The replacement of the Globular Amphora culture by another supraregional cultural complex—the Corded Ware culture—is already indicated by the trend towards cord decorations in its younger stage of development. The transition thereby did not occur abruptly but rather within in a gradual process. This process is reflected both in culturally mixed inventories and in transformation phenomena. In the second quarter of the 4th millennium the Corded Ware complex spreads successively into all regions occupied by the Globular Amphora culture. The Corded Ware culture might have benefited from the established large-scale communication network between the Dnieper region in the east and eastern Holstein in the west during its expansion."
http://journal.topoi.org/index.php/etopoi/article/viewFile/182/212
It may well turn out that R1a1-M417 and R1b-L51 are found in the Globular Amphora culture.

The question is: can we find a culture from which Yamna Samara and Globular Amphora culture could be derived? We have Khvalynsk and Majkop but I do not know if they are/will be genetically or culturally a good fit. In any case, considering the probable Ural region origin of Yamna, it is not a surprise that Volga Uralics are genetically more Yamna than IE speakers.

I am sorry! I put my previous post in a wrong thread.

Davidski said...

New datasheets with only the UDG treated Bell Beakers (hence, they shouldn't show minor Sub-Saharan) and Remedello BA, not UDG treated.

https://drive.google.com/file/d/0B9o3EYTdM8lQWldwcHZSa2hhZEE/view?usp=sharing

Simon_W said...

Very interesting those analyses run by rk.

I first tried to connect the different language families of the Caucasus with the different genetic influences. Northwest Caucasians (Adygei, Abkhasians) seem to be predominantly an equal mix of (or intermediate between) Anatolian and Armenian Chalcolithic. While South Caucasians (Georgians) have much more Anatolian than Armenian Chalcolithic. They also have a lot of Armenia_MLBA, but this they have in common with the Northeast Caucasian Lezgins who have even more of this. So I would tentatively associate Armenia_MLBA with Northeast Caucasian, even though the other Northeast Caucasians, the Chechens have only 3.1% Armenia_MLBA (elite dominance influence?). Armenians resemble South Caucasians in their predominance of Anatolian Chalcolithic over Armenian Chalcolithic, but they differ most of all by the strong Iranian Chalcolithic impact. They don't seem to have significantly more steppe ancestry than the Abkhasians and not more EEF ancestry than Georgians.

What I find striking is the apparent correlation between Armenia_MLBA and R1b! As per Eupedia Lezgins have 21.5% R1b, Chechens have 2%, and Georgians 10% (but with some strong local pockets where they even top the Lezgins in R1b frequency, according to a study I've seen). Judging from rk's analysis Lezgins have 38.45% Armenia_MLBA, Chechens 3.1% and Georgians 32.9%.
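To put a number on the apparent correlation just described, here is a minimal sketch that computes Pearson's r from the three pairs of percentages quoted above. The figures are the ones in the comment (R1b frequencies as cited from Eupedia, Armenia_MLBA proportions from rk's analysis); `pearson` is a hypothetical helper written for this example. With only three populations any coefficient is very loosely constrained, so treat it as an illustration of the arithmetic rather than evidence.

```python
import math

# (R1b %, Armenia_MLBA %) as quoted in the comment above.
data = {
    "Lezgin": (21.5, 38.45),
    "Chechen": (2.0, 3.1),
    "Georgian": (10.0, 32.9),
}

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

r1b = [pair[0] for pair in data.values()]
mlba = [pair[1] for pair in data.values()]
r = pearson(r1b, mlba)

print(f"Pearson r = {r:.2f} (n = {len(data)})")
```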

This gets even more fascinating when taking the German Bell Beakers into consideration. They seem to have 8.9% Armenia_MLBA that was neither in Yamnaya, nor in Iberia_Chalcolithic, nor in Hungary_BA. That's amazing! And 8.9% is surely too much to be a fluke or a result of deamination. So where did this come from? My guess: From Iberia, where it must have arrived shortly before.

Simon_W said...

I'm not sure what to think of the linguistic hypotheses trying to link Basque with Northeast Caucasian. But prima facie there seems to be some intriguing evidence, check this out:
http://www.people.fas.harvard.edu/~witzel/mt26s.html

Especially the comparison with the mystery languages is interesting:
http://www.people.fas.harvard.edu/~witzel/s4.gif
http://www.people.fas.harvard.edu/~witzel/s3.gif

Of course if there is a relationship it could hardly go back to the time of the Armenian_MLBA, that would seem much too recent.