Saturday, January 2, 2016

Spatio-temporal segment sharing analysis featuring eight ancient genomes


No one's done this yet, probably because at this stage it's still a crazy idea. But sometimes crazy ideas actually work. Here's a map:


The map is based on the spreadsheet below, which shows the total amount of relatively large, probably in most part Identity-by-Descent (IBD), genome-wide tracts shared by the ancient individuals in centimorgans (cM). An extended version of the table, including ~1500 present-day Eurasians, can be viewed here.


I used Beagle 3 and fastIBD for the job. The dataset included just over 300K SNPs that showed a call rate of 100% in all of the ancient samples, so as not to potentially bias their results by imputing missing markers.
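The marker filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline actually used for the post: the sample names, the 0/1/2 genotype coding and the `MISSING` marker are all hypothetical.

```python
MISSING = None  # assumed marker for an uncalled genotype in this sketch


def fully_called_snps(genotypes):
    """Return indices of SNPs that are called in every sample.

    `genotypes` maps a sample name to a list of genotype calls,
    one per SNP, with all lists having the same length.
    """
    rows = list(genotypes.values())
    n_snps = len(rows[0])
    return [i for i in range(n_snps)
            if all(row[i] is not MISSING for row in rows)]


# Toy data: three hypothetical samples, five SNPs, genotypes coded 0/1/2.
ancient = {
    "sample_A": [0, 1, 2, MISSING, 1],
    "sample_B": [1, 1, MISSING, 0, 2],
    "sample_C": [2, 0, 1, 1, 0],
}
kept = fully_called_snps(ancient)
print(kept)  # indices of SNPs with a 100% call rate
```

Only the markers that survive this kind of filter would then be passed on for phasing and segment detection, so nothing has to be imputed for the ancient samples.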

To do this by the book, I'd need to run many more ancient individuals, at least a few from each archaeological culture of interest, sequenced at comparably high coverage and genotyped in exactly the same way. This might be possible within a year or two.

Having said that, the results from my quick and dirty test run make perfect sense. Here are a few observations:

- The Corded Ware individual from Germany shows a close relationship to the Yamnaya individual from the North Caspian region, but no relationship to the two Neolithic farmers from Central Europe, NE1 and Stuttgart, supporting the idea that the Corded Ware Culture was introduced into Central Europe by migrants from the Pontic-Caspian Steppe.

- The Srubnaya individual from the North Caspian shares a lot of cM with the Corded Ware individual, and also shows a stronger relationship to other ancient Central Europeans than to the Yamnaya individual buried only kilometers away, suggesting that the Srubnaya Culture was introduced to the Pontic-Caspian Steppe from Central Europe or surrounds.

- The closer relationship between the Yamnaya individual and the Late Bronze Age Hungarian, BR2, than between the latter and the Corded Ware individual, gels with archaeological data showing that Yamnaya groups moved into the Carpathian Basin via the Balkans.

- Weak segment sharing between the Yamnaya individual and Kotias, a Mesolithic Caucasus hunter-gatherer (CHG) from western Georgia, suggests that the Yamnaya population did not receive its CHG admixture from the southwestern Caucasus.

- Elevated segment sharing between BR2 and present-day speakers of Baltic and Slavic languages suggests that BR2, or his close relatives, contributed genealogically in a significant way to the Balto-Slavic expansions that affected most of East Central and Eastern Europe during the Iron Age and early Medieval period.

The ancient DNA data used in my experiment came from the following studies:

Gamba, C. et al. Genome flux and stasis in a five millennium transect of European prehistory. Nat. Commun. 5:5257 doi:10.1038/ncomms6257 (2014).

Haak et al., Massive migration from the steppe was a source for Indo-European languages in Europe, Nature, Advance online publication, doi:10.1038/nature14317

Jones, E. R. et al. Upper Palaeolithic genomes reveal deep roots of modern Eurasians. Nat. Commun. 6:8912 doi:10.1038/ncomms9912 (2015).

Lazaridis et al., Ancient human genomes suggest three ancestral populations for present-day Europeans, Nature, 513, 409–413 (18 September 2014), doi:10.1038/nature13673

Mathieson et al., Genome-wide patterns of selection in 230 ancient Eurasians, Nature, 528, 499–503 (24 December 2015), doi:10.1038/nature16152

115 comments:

Nirjhar007 said...

As for technical interpretations, I'd better stay out of it, but I congratulate you on this new idea :)

Slumbery said...

Davidski
"Weak segment sharing between the Yamnaya individual and Kotias, a Mesolithic Caucasus hunter-gatherer (CHG) from western Georgia, suggests that the Yamnaya population did not receive its CHG admixture from the southwestern Caucasus."

You cannot rule out the NW Caucasus as a source area with any confidence. You should be wary of the big time difference. The Kotias samples are very old, and I'd expect them to be strongly diluted by a related "CHG"-based population from the south (most likely the SE) during the Neolithic. Also, the time difference can be a technical problem for the test itself: shared segments become less and less recognisable over time.
I'm not saying that Yamnaya CHG came from the direction of the NW Caucasus, just that this test is not enough to rule that out.

Davidski said...

I didn't rule out the NW Caucasus or any other part of the Caucasus based on this data, I just said the southwestern Caucasus (ie. Georgia) looked unlikely.

Indeed, if you look at this Yamnaya sample's matches with modern Eurasians, you'll see that the North Caucasus looks like a very plausible source of CHG admixture in Yamnaya.

Krefter said...

This is a great tool, especially for aDNA. It'd be very interesting to see results for the Hinxtons and LNBA genomes, which probably have lots of IBD with moderns in the same region.

The connection between BR2 and Balto-Slavs is interesting. He is somewhat close to where proto-Slavs lived, but how could he be connected to Balts? I noticed in another study that Balts had lots of recent ancestry shared with Slavs, but how is this possible? It was only Slavs who expanded over a large area recently (in the Early Middle Ages).

Slumbery said...

Davidski
I managed to confuse NW with SW. What I wanted to say is that, based on this test alone, you cannot rule out the southwestern Caucasus either.
Also, does this differentiation make sense on a 5000+ year time scale? It is possible that all of the populations around the Caucasus got diluted, replaced or even displaced, and partially switched places, between Kotias and the time directly before Yamnaya. Apparently by related populations, but still.

Nevertheless, considering geography and the likely pre-migration population density distribution, I'd expect that the main mass of the Neolithic migration came up the eastern side from modern-day Azerbaijan and branched from there to the various regions around the Caucasus (whichever intermediate population might be the source of Yamnaya CHG is another story).

Davidski said...

The CHG in Yamnaya looks unadmixed with anything from the Near East, so my bet is that the CHG in Yamnaya comes from the southern steppes just north of the Caucasus.

Rob said...

Dave
That's possible
But how do you suppose a CHG population just sat isolated in the north Caucasus for 12,000 years?

Aram said...

Avars have an R1b-Z2103* Z2106- lineage that is a direct offshoot of Yamna RISE550 without any intermediary. Tabassarans have Z2106* Z2109-, which is also a locally specific branch. These are Northeast Caucasian peoples. So an EHG-CHG contact/movement zone on the eastern side of the Caucasus is much more plausible than one in the western Caucasus.

Karl_K said...

@Davidski

"probably in most part Identity-by-Descent (IBD)"

The results are nice, but there is absolutely no way you can call it IBD. You cannot be sure about the phasing from these ancient genomes.

This is probably IBS.

Davidski said...

Some interesting Srubnaya clusters. Srubnaya is supposed to be proto-Iranian.

Pop1 Pop2 Chr Bp1 Bp2 cM
Srubnaya Brahui 1 40216993 41821779 1.895235
Srubnaya Lebanese_Muslim 1 40216993 41821779 1.895235
Srubnaya Russian_Kargopol 1 40216993 42239717 2.442538
Srubnaya Balochi 1 40216993 41821779 1.895235
Srubnaya Kshatriya 1 40216993 41821779 1.895235

Srubnaya French 1 210655572 212556744 1.87174999999999
Srubnaya Norwegian 1 210655572 212556744 1.87174999999999
Srubnaya Kalash 1 210655572 212556744 1.87174999999999
Srubnaya Kalash 1 210655572 212556744 1.87174999999999

Srubnaya Brahui 2 47438918 49169599 2.35278
Srubnaya Croatian 2 47438918 49153719 2.301603
Srubnaya Russian_Kargopol 2 47438918 49516532 2.551582
Srubnaya Slovakian 2 47438918 49516532 2.551582

Srubnaya Bulgarian 2 47808430 49516532 1.770597
Srubnaya Russian 2 47808430 49143829 1.51456899999999
Srubnaya Russian 2 47808430 50179975 2.519082
Srubnaya Russian_Kargopol 2 47808430 49515633 1.770504
Srubnaya Finnish 2 47808430 49641080 2.10446399999999
Srubnaya Orcadian 2 47808430 49516532 1.770597

Davidski said...

See, that's the thing. Most of these ancient samples come from populations with low, or even very low, effective population sizes. And today Eurasia is awash with their haplotypes, which they spread very rapidly and recently, causing massive founder effects in many places. So how difficult can it be to accurately phase their genomes given enough modern samples?

BR2 shares massive chunks with multiple West and East Slavs, who also share these chunks with each other. This is usually IBD.

ryukendo kendow said...

@ Davidski
Very nice! Good luck for the HG tests. I expect that, as FrankN, Matt and Alberto have built up such a convincing picture, we may well finally see some HG survival in the Baltic and surrounds.

@ All
Meanwhile, here are the top 20 populations for average sharing with each one of the following populations, in ascending order:

YAMNAYA SAMARA
Belarusian Average 1.005753371
Greek_Macedonia Average 1.041922
Adygei Average 1.059369846
Chechen Average 1.130984227
Ket Average 1.155049
Serbian Average 1.157626614
Ukrainian_Kharkov Average 1.1580815
Swedish Average 1.168241996
French Average 1.230036672
Polish Average 1.252698335
Ukrainian_Poltava Average 1.31667932
Kumyk Average 1.333556462
Danish Average 1.3430527
Norwegian Average 1.3830079
Irish Average 1.466585977
German Average 1.531293727
Hungarian Average 1.542163083
Dutch Average 1.719936833
Erzya Average 2.0386389
Ukrainian_Lviv Average 2.199672
Latvian Average 2.438538883

ryukendo kendow said...


CORDED WARE
Latvian Average 2.343604333
Slovakian Average 2.402174367
Estonian Average 2.411931054
Danish Average 2.46217125
Hungarian Average 2.468146228
Ukrainian_Kharkov Average 2.576615
Ukrainian_Lviv Average 2.587891833
Belarusian Average 2.588533857
Ukrainian_Belgorod Average 2.7060215
Dutch Average 2.728601667
Moksha Average 2.7305268
Russian_Kargopol Average 2.830523667
Swedish Average 2.847339696
Serbian_Bosnia Average 2.975926
Finnish Average 3.143216527
French Average 3.166053296
Mari Average 3.17147434
German Average 3.3017427
Irish Average 3.306465154
Ukrainian_Poltava Average 3.4775692
Scottish Average 3.7208718


SRUBNAYA
Ukrainian_Lviv Average 2.9312005
Tajik Average 3.004709757
Dharkar Average 3.010097
Polish Average 3.041164153
Kalash Average 3.04122627
Slovakian Average 3.094962107
Estonian Average 3.240149929
Croatian Average 3.338848475
Pathan Average 3.342796418
Lithuanian Average 3.4186915
Latvian Average 3.53712965
English Average 3.56383166
Danish Average 3.673768075
Kshatriya Average 3.71316304
Orcadian Average 3.964734
Irish Average 4.009592884
Swedish Average 4.041079083
German Average 4.153687709
Ukrainian_Belgorod Average 4.490012533
Ukrainian_Kharkov Average 6.252296
Ukrainian_Poltava Average 6.3027062

The magnitudes differ for the samples, with the oldest one, Yamnaya, showing the smallest large-tract contribution, which is expected. The contribution of Corded Ware to modern Europeans seems to be more direct. Srubnaya is even more direct, and, unlike the others, also displays tract donation to S and C Asians.
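For anyone who wants to reproduce lists like these from the spreadsheet, here is a minimal sketch of the averaging step. It assumes the data have already been reduced to (population, total cM shared by one individual) pairs; the population labels and numbers below are made up for illustration and are not taken from the spreadsheet.

```python
from collections import defaultdict
from statistics import mean


def population_averages(rows):
    """Average per-individual sharing (cM) within each population, ascending."""
    by_pop = defaultdict(list)
    for population, cm in rows:
        by_pop[population].append(cm)
    averages = {pop: mean(values) for pop, values in by_pop.items()}
    return sorted(averages.items(), key=lambda item: item[1])


# Hypothetical per-individual totals against one ancient sample.
rows = [
    ("Pop_A", 1.0), ("Pop_A", 3.0),
    ("Pop_B", 0.5), ("Pop_B", 0.5), ("Pop_B", 2.0),
    ("Pop_C", 4.0),
]
ranked = population_averages(rows)
for population, avg in ranked:
    print(f"{population} Average {avg:.3f}")
```

Note that a population represented by a single individual, like `Pop_C` here, gets an "average" that is really just one person's score, which is worth keeping in mind when reading the tails of the lists.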

Rob said...

Dave

Which HGs are you going to look at specifically? All ?

Davidski said...

I'm hoping to include pairs of hunter-gatherers and MN samples from nearby sites:

La Brana/Iberia MN
Loschbour/Esperstedt MN
KO1/Hungary CA

And then also throw in the best Motala genome and Karelia HG.

Alberto said...

Thanks David, a very welcome addition to the tests you already run regularly, and certainly one that will become more and more important in the coming year or two. These still rather experimental results are already quite interesting.

About the low sharing of Kotias with Yamnaya (and overall with everyone), I don't know what to think. On one hand I'm not too surprised since I've been of the opinion that populations directly descended from those CHG are probably not really relevant. The relevant CHG-like populations should come from somewhere else, in or near Iran. On the other hand, for Yamnaya admixture itself, these CHG seem to be geographically in the right place. I don't think, though, that a population from the North West Caucasus around 5000 BC could not be descended from these South West Caucasus samples. So ruling out the SW Caucasus would probably also rule out the NW Caucasus. In light of these results, I remain open to my original idea of an east of the Caspian origin of that population. Time will tell.

Very interesting once more this connection between Srubnaya and South Asia. The shared R1a-Z93 obviously has to mean something, so this cannot be coincidence.

BR2: If it got its EHG ancestry from Yamnaya (so from real EHG and not from Motala-like populations), then it must have also gotten a good amount of WHG from another source. Looks like a bit of coincidence, so let's see what the analysis of different HGs say about it.

Karl_K said...

@Davidski

"BR2 shares massive chunks with multiple West and East Slavs, who also share these chunks with each other. This is usually IBD."

It is not something that matters in terms of the results. But IBD implies that recombination has not occurred within the haplotype.

If a long ancestral haplotype is recreated by recombination because fragments are extremely common in the population, then it is still IBS, not IBD.

Rob said...

Dave

In light of Karl's comment, is it possible to also do an IBD and compare the two, or is the phasing unknowable?

Davidski said...

This algorithm looks specifically for IBD. The fact that it picks up a lot of IBS tracts isn't a problem.

The point is that this is a linked analysis that locates large ancestral haplotypes, so it says more about genealogical ties than the classic unlinked IBS test.

George Okromchedlishvili said...

That's what I have been saying here.

The very obvious source of CHG ancestry is not South but North Caucasians who also received admixture from the EHG-like(?) people.

The real question is whether we should look at North-East or North-West Caucasians?

Given some linguistic arguments I think NW-Caucasus looks the most favorable now.

Jack Rusher said...

If it's not too much bother, I'd like to see a version of that spreadsheet with date, latitude and longitude for the pre-modern samples. I think there's an opportunity for a nice visualization...

Matt said...

Hey all,

I had a quick look over the last hour at the spreadsheet, and at averaging the population values (for modern populations) as Ryu was doing above, for these.

Feeding these population averages through a PCA, for all populations, you get:

http://i.imgur.com/6kTx5xb.png

There looks to be some interesting information there, but I think part of the problem may be that these populations have quite high internal variance (SD) in their sharing, so populations in the panel with small N can end up in strange positions, and that affects the whole PC. Adjacent populations that are similar show some differences in position.

(Theoretically with enough coverage, populations would vary a lot less in their sharing).

This is also part of why the Cassidy paper made use of medians rather than averages: medians reduce the effect of outliers more than averages do.

For averages, using just populations of N>10 (N is sample size):
http://i.imgur.com/dvdJ5s4.png

Which seems to make somewhat more sense and form more of a picture where the geographically close populations are also close.

There's a lot of effect from the different sizes of haplotype sharing in a PCA like this, particularly the strong sharing between East-Central Europe and the relatively recent (and East-Central European) BR2, which you can see in graphs for various stats against one another: http://i.imgur.com/PMhge7y.png

With a PCA based on correlations rather than covariance, the information about the magnitude of the different sharing isn't used, so accordingly you see something which is a closer match to typical PCA (but not identical still):

http://i.imgur.com/KCrL3Fg.png

(here is the same PCA for pops of over N > 15 - http://i.imgur.com/y1lRsYk.png . There aren't many of them)
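The covariance versus correlation point can be illustrated without running a full PCA. A PCA on the correlation matrix is equivalent to a covariance PCA on z-scored columns, so the sketch below just compares column variances before and after standardizing. The table is hypothetical: rows stand for populations, columns for sharing with two ancient samples, one with small values and one with large ones.

```python
from statistics import mean, pstdev


def column_variances(table):
    """Population variance of each column in a list-of-rows table."""
    return [pstdev(col) ** 2 for col in zip(*table)]


def standardize(table):
    """Z-score every column, which turns covariance into correlation."""
    cols = list(zip(*table))
    scaled = [[(x - mean(c)) / pstdev(c) for x in c] for c in cols]
    return [list(row) for row in zip(*scaled)]


# Hypothetical population averages (cM) against two ancient samples.
table = [
    [1.0, 10.0],
    [2.0, 30.0],
    [3.0, 20.0],
    [2.0, 40.0],
]
raw = column_variances(table)
scaled = column_variances(standardize(table))
print(raw)     # the second column dominates the total variance
print(scaled)  # after z-scoring, every column contributes equally
```

In the unscaled table the high-magnitude column would drive the first component almost by itself, which is the BR2 effect described above; after standardizing, only the pattern of sharing matters.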

Davidski said...

Thanks Matt, that makes sense.

It appears to be a robust methodology, so I'll start my next run focusing on European hunter-gatherers.

Btw, if anyone can put together some maps showing median sharing across Eurasia with these ancients that'd be awesome. Sub-Saharan Africa can be marked as 0, which is what it is, apart from some stray hits.

Davidski said...

Jack,

Here's that info you wanted...

https://docs.google.com/spreadsheets/d/1VlI6MFAuZiOe58a9kaDF8mlwNWby06svw5sghlTKdro/edit?usp=sharing

Using median scores or removing outliers from each present-day population is probably a good idea when making graphics involving modern populations.
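As a quick illustration of why the median helps, here is a toy example with made-up per-individual scores for one small population, where a single individual carries one unusually long shared tract:

```python
from statistics import mean, median

# Hypothetical per-individual sharing (cM); the last value is an outlier.
sharing = [0.0, 1.2, 1.5, 1.8, 12.0]

avg = mean(sharing)    # pulled upward by the single outlier
mid = median(sharing)  # depends only on the middle of the distribution

print(f"mean={avg:.2f} median={mid:.2f}")
```

Making the outlier even larger would move the mean further but leave the median where it is, which is the behaviour you want for small populations on a map.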

Kurti said...

Haven't I been saying that CHG from Paleolithic Georgia is simply far too old to be considered the CHG source in Yamna? Haven't I also been saying that some of the peer-reviewed papers were stating that it looks like some kind of Maykop-like population was living on the steppes and getting genetic admixture from CT and H&G groups?

And haven't I been attacked for this statement as "being on drugs"?

So once again, these 12000-year-old Paleolithic CHG samples are too old. Who knows what the Caucasus looked like in 6000 BC?

Davidski said...

The population that mixed with EHG to form Samara Eneolithic (Khvalynsk) and Yamnaya was definitely CHG. It just may not have been from the western Caucasus.

There's no evidence in any of the data that the CHG admixture on the steppe came from Maykop, and indeed Maykop probably didn't even exist during most of the Khvalynsk period, so that might be a problem.

And the reason I said you were on something is because you kept saying that CHG was not the other half of Yamnaya, despite strong formal statistical evidence that it is, and also claiming that Samara Eneolithic samples were analyzed in the Jones paper, when they weren't.

Rob said...

Is anyone aware of the specific dating of the Khvalynsk samples from the Mathieson paper?
Because the Khvalynsk horizon was rather broad, c. 4500/4200 - 3000 BC, during which pre-Maykop and Maykop certainly did exist.

Balaji said...

Davidski, this was a great idea from you with such interesting results. I agree with you that Yamnaya did not receive its CHG-like ancestry from western Georgia. Mountains are very effective in impeding gene flow and thus the CHG-like ancestry in the north Caucasus would be different from Kotias. We saw something similar in the South Caucasus where the amount of CHG ancestry is very different in Georgian and Armenian.

Primate_Gorilla Armenian LBK_EN Kotias -0.0235 -5.95 271793
Chimp Georgian LBK_EN Kotias -0.0003 -0.081 507266

The CHG-like ancestry in Sardinian must also be different. The CHG-like component in South Asia is also different from Kotias, and following Dienekes we may call it GHG (G for Gedrosia).

You had created the Smarter Bear doodle plot with the Sintashta-Andronovo horizon circled in.

https://drive.google.com/file/d/0B9o3EYTdM8lQWmhKWk14bFhkcjg/view?usp=sharing

Could you please put Kotias and Satsurblia on the same plot?

The South Asian trend line, because GHG is different from CHG, is parallel to and shifted to the right from the Kotias/Satsurblia line. This plot also illustrates the great chasm between Kalash and the Sintashta-Andronovo horizon. This is not because of the small amount of ASI in Kalash (perhaps 15%). Instead it is because Kalash is probably about 50% GHG and 35% ANE and 15% ASI whereas Sintashta is 20% BEA, 30% WHG/UHG, 30% CHG and 20% ANE. They are just made of different components. “East is East and West is West, and never the twain shall meet.”

Ryukendo Kendow,

Thanks for summarizing the data for sharing between present-day populations and the ancient ones. Srubnaya does show a connection with South Asia and as Alberto commented the shared R1a-Z93 must also have something to do with it. Devotees of the Steppe hypothesis will claim that Srubnaya contributed ancestry to South Asians. I believe it is the other way around.

ryukendo kendow said...

@ Balaji

Balaji, you had this plot
https://drive.google.com/file/d/0B9o3EYTdM8lQb1R2MDJmS2h0Nk0/view

It is not true that "because GHG is different from CHG, is parallel to and shifted to the right from the Kotias/Satsurblia line...". Since this is a plot of MA-1 versus WHG, it's not a PCA, where all kinds of drift, such as differences between CHG and a hypothetical 'GHG', have an equal effect; it's one where only differences on the WHG-ANE axis are allowed to manifest properly. So it is clear that the reason South Asians are shifted to the right of Kotias and Satsurblia is that they are more MA-1 shifted than Kotias and Satsurblia.

The position of SC Asians is easily accounted for: the combined action of MA-1, pulling to the northeast (but not parallel to the cline, as ANE increases similarity to MA-1 more rapidly than it does to WHG, resulting in excess movement towards the east), and ASI, pulling towards the southwest (parallel to the cline, as ASI reduces similarity to both ANE and WHG equally), produces the displacement towards the right.

Balaji, how do you explain the fact that upper caste peoples are more similar to Europeans?

ryukendo kendow said...

@ David @ Balaji

The idea that SC Asians contain no ancestry related to WHG or EHG is also in all likelihood false, an inference derived from outdated ADMIXTURE runs... This stat does not seem confounded as far as I can tell:

MA-1 Karelia_HG Pathan Chamar

Populations carrying Basal Eurasian are all on the same side of the comparison, so only the crown Eurasian affinities of Pathan and Chamar should matter. David, or anyone willing to run this? It should contradict the previous statement and clarify things once and for all.

Davidski said...

Okay, hang on, I'll update my dataset to make sure South Asians, including Paniya, share around 500K with the Mathieson ancient samples.

What other stats should I run with that once I get it up?

Davidski said...

Balaji,

Now also including Srubnaya...

https://drive.google.com/file/d/0B9o3EYTdM8lQY0FvWXdFR014emc/view?usp=sharing

rk,

MA1 Karelia_HG Pathan Chamar -0.0118 -5.376 305352
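For readers new to these rows: they appear to follow the usual D-statistic layout of four populations, then D, a Z-score and the number of SNPs used. Below is a rough sketch of the point estimate in its common allele-frequency form. It is not the program used for the numbers in this thread, the frequencies are invented, and sign conventions can differ between tools. The Z-score is not computed here, since it would need a block jackknife over the genome.

```python
def d_statistic(freqs):
    """D(W, X; Y, Z) from per-SNP allele frequencies.

    `freqs` is a list of (w, x, y, z) tuples, each value being the
    frequency of the same reference allele in one population at one SNP.
    """
    num = 0.0
    den = 0.0
    for w, x, y, z in freqs:
        num += (w - x) * (y - z)
        den += (w + x - 2 * w * x) * (y + z - 2 * y * z)
    return num / den


# Invented frequencies at three SNPs for four hypothetical populations.
toy = [
    (1.0, 0.0, 1.0, 0.0),
    (0.0, 1.0, 0.0, 1.0),
    (1.0, 0.0, 0.0, 1.0),
]
d = d_statistic(toy)
print(round(d, 4))
```

With this convention a positive value means W and Y (or X and Z) share more alleles than expected under a simple tree, a negative value means the opposite pairing, and swapping the first two populations flips the sign.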

ryukendo kendow said...

@ Davidski

David, this is wonderful, as a whole backlog of stats has built up lol.

Loschbour LBK_EN Paniya Dai
Loschbour Kotias Paniya Dai
Karelian_HG LBK_EN Paniya Dai
Karelian_HG Kotias Paniya Dai
MA1 LBK_EN Paniya Dai
MA1 Kotias Paniya Dai
Kostenki14 LBK_EN Paniya Dai
Kostenki14 Kotias Paniya Dai

Loschbour LBK_EN Pulliyar Dai
Loschbour Kotias Pulliyar Dai
Karelian_HG LBK_EN Pulliyar Dai
Karelian_HG Kotias Pulliyar Dai
MA1 LBK_EN Pulliyar Dai
MA1 Kotias Pulliyar Dai
Kostenki14 LBK_EN Pulliyar Dai
Kostenki14 Kotias Pulliyar Dai

Primate_Gorilla Pulliyar Georgian Kotias
Primate_Gorilla Pulliyar Armenian Kotias
Primate_Gorilla Paniya Georgian Kotias
Primate_Gorilla Paniya Armenian Kotias

Primate_Gorilla Mota Austroasiatic LBK_EN
Primate_Gorilla Mota Pulliyar LBK_EN
Primate_Gorilla Mota Paniya LBK_EN

Primate_Gorilla Mota Austroasiatic Kotias
Primate_Gorilla Mota Pulliyar Kotias
Primate_Gorilla Mota Paniya Kotias

Primate_Gorilla Hadza Austroasiatic LBK_EN
Primate_Gorilla Hadza Pulliyar LBK_EN
Primate_Gorilla Hadza Paniya LBK_EN

Primate_Gorilla Hadza Austroasiatic Kotias
Primate_Gorilla Hadza Pulliyar Kotias
Primate_Gorilla Hadza Paniya Kotias

But more importantly, I wonder if you can perform treemix?

(Denisovan, San, Mota, Kotias, Anatolia_Neolithic, Karelian_HG, Loschbour, Dai, Papuan, Karitiana, Pulliyar), 11 migration edges.

(Denisovan, San, Mota, Kotias, Anatolia_Neolithic, Karelian_HG, Loschbour, Dai, Papuan, Karitiana, Pulliyar, Austroasiatic), 12 migration edges.

(San, Mota, Kotias, Anatolia_Neolithic, Karelian_HG, Loschbour, Dai, Karitiana, Pulliyar, Kalash), 12 migration edges.

Thanks very much!

Kurti said...

Dave

"And the reason I said you were on something is because you kept saying that CHG was not the other half of Yamnaya, despite strong formal statistical evidence that it is, and also claiming that Samara Eneolithic samples were analyzed in the Jones paper, when they weren't."

And here is the main problem. You don't seem to read my posts properly. I said that "CHG", as in Caucasus hunter-gatherers, are unlikely to be the source of the "CHG" admixture in Yamna, simply out of the logic that 1. by 6000 BC they wouldn't have been hunter-gatherers anymore (and if they were, most likely not the same CHG as we know).
2. The CHG admixture came from a group that was rather predominantly CHG-like, simply out of the logic that Kotias and Satsurblia are too old to be the source, just like Mal'ta is too old and probably mixed to be the source of ANE in West Eurasia.

And again, I never implied that Maykop is a direct ancestor of Yamna. I said a Maykop-LIKE population (a group of herders who were ancestral to Maykop as well as to the CHG ancestry of Yamna) was living on the steppes. That means a group which probably originated somewhere in the Caucasus split, with one part expanding slightly north into the steppes and becoming Yamna, and another staying where it was and becoming Maykop.


Kurti said...

This same herder group must also have formed Kura Araxes, could be descended from Leyla Tepe, and came ultimately from the Iranian Plateau, imo.

Krefter said...

Kotias lived 4,000 years before Yamnaya so we shouldn't expect recent common ancestry between the two. Comparing Yamnaya to Kotias is like comparing Sintashta to Irish. It's exactly the same. Yamnaya's CHG ancestors could have been from the South Caucasus; there's no convincing reason they could not have been.

Davidski said...

Here are the top populations for Kotias in terms of median cM sharing.

Georgian 1.508603
North_Ossetian 1.4183625
Adygei 1.131372
Balkar 0.94442100000001
Kumyk 0.66086800000001
Chechen 0.589434
Abkhasian 0.46324950000001

So looking at that, it seems that a lot of individuals in the Western/Northwestern Caucasus still share significant cM with Kotias.

That Yamnaya guy lived almost 5,000 years ago, and had significant CHG ancestry, probably around 45%. So a score of less than 1 cM in an analysis that picks up both IBD and IBS tracts doesn't look too impressive. But it might just be the luck of the draw.

Ryan said...

I'd say that new ancient remains might give us greater precision on where exactly the CHG in Yamnaya comes from, but given the active conflicts to the south and east, if the answer is tied to the spread of animal domestication we may not get an answer for a while.

David - can you use Chromosome painter or something like that to see if the CHG in Neolithic samples and Sardinians shares a common origin with the CHG found in the Steppe?

Krefter said...

Let's forget about Yamnaya. That mystery is solved. We need aDNA from the Balkans and Italy ASAP. We also need aDNA from West Asia. I'm tired of different teams of scientists sampling the same general cultural/regional samples over and over again.

Davidski said...

rk,

Here are those D-stats. Keep in mind that my South Asian samples now overlap at around 500K SNPs with the Mathieson dataset, not with Human Origins or Kotias. Also, I'm not able to run TreeMix at the moment.

Loschbour LBK_EN Paniya Dai -0.0036 -1.143 113726
Loschbour Kotias Paniya Dai -0.0075 -1.818 96728

Karelia_HG LBK_EN Paniya Dai -0.0116 -3.563 129215
Karelia_HG Kotias Paniya Dai -0.0146 -3.513 109291

MA1 LBK_EN Paniya Dai -0.0069 -1.857 96182
MA1 Kotias Paniya Dai -0.01 -2.19 81586

Kostenki14_UP LBK_EN Paniya Dai -0.0114 -3.144 124327
Kostenki14_UP Kotias Paniya Dai -0.0147 -3.347 105577

Loschbour LBK_EN Pulliyar Dai -0.0098 -2.817 112443
Loschbour Kotias Pulliyar Dai -0.0128 -2.75 95648

Karelia_HG LBK_EN Pulliyar Dai -0.016 -4.624 127819
Karelia_HG Kotias Pulliyar Dai -0.0172 -3.969 108112

MA1 LBK_EN Pulliyar Dai -0.0095 -2.384 95113
MA1 Kotias Pulliyar Dai -0.0115 -2.346 80684

Kostenki14_UP LBK_EN Pulliyar Dai -0.0149 -4.125 122939
Kostenki14_UP Kotias Pulliyar Dai -0.0145 -3.2 104386

Gorilla Pulliyar Georgian Kotias -0.0072 -1.724 100000
Gorilla Pulliyar Armenian Kotias -0.0068 -1.643 100000

Gorilla Paniya Georgian Kotias -0.01 -2.405 101132
Gorilla Paniya Armenian Kotias -0.0099 -2.346 101132

Gorilla Mota Austroasiatic_Munda LBK_EN 0.0092 2.3 112672
Gorilla Mota Pulliyar LBK_EN 0.0085 1.995 112806
Gorilla Mota Paniya LBK_EN 0.0082 1.934 114086

Gorilla Mota Austroasiatic_Munda Kotias 0.0088 1.571 95575
Gorilla Mota Pulliyar Kotias 0.007 1.199 95683
Gorilla Mota Paniya Kotias 0.0043 0.759 96766

Gorilla Hadza Austroasiatic_Munda LBK_EN 0.0111 3.409 118043
Gorilla Hadza Pulliyar LBK_EN 0.0103 2.951 118181
Gorilla Hadza Paniya LBK_EN 0.0085 2.562 119525

Gorilla Hadza Austroasiatic_Munda Kotias 0.0099 2.095 99884
Gorilla Hadza Pulliyar Kotias 0.0097 2.009 99998
Gorilla Hadza Paniya Kotias 0.0062 1.284 101130

Ryan,

I can't run Chromopainter. It takes too long. But check out these median fastIBD scores for Yamnaya. Kotias isn't doing too badly compared to modern Caucasians. And compared to the Srubnaya genome, this Yamnaya sample has virtually nothing in common with South Central Asians.

Ukrainian_Lviv 2.18742900000001
German 1.50379
Erzya 1.4604985
Polish 1.22854999999998
French 0.927386
Kotias 0.70265
Chechen 0.59699999999999
Adygei 0.52449000000001
Abkhasian 0
Armenian 0
Balkar 0
Balochi 0
Brahui 0
Burusho 0
Georgian 0

ryukendo kendow said...

@ Davidski

Thanks very much for the stats!

The ones that strike me as interesting are these:
Gorilla Pulliyar Georgian Kotias -0.0072 -1.724 100000
Gorilla Pulliyar Armenian Kotias -0.0068 -1.643 100000

Gorilla Paniya Georgian Kotias -0.01 -2.405 101132
Gorilla Paniya Armenian Kotias -0.0099 -2.346 101132

Which corroborate these stats from Chad(?) I think:
Primate_Gorilla Kotias Paniya Dai -0.0121 -2.873 101132
Primate_Gorilla Anatolia_Neolithic Paniya Dai -0.0079 -2.663 119765
Primate_Gorilla Armenian Paniya Dai -0.0097 -3.271 119904
Primate_Gorilla Georgian Paniya Dai -0.009 -3.024 119904

To run with this, maybe we can try the following:
Primate_Gorilla Cypriot Paniya Dai
Primate_Gorilla Druze Paniya Dai
Primate_Gorilla Iraqi_Jew Paniya Dai
Primate_Gorilla Georgian_Jew Paniya Dai
Primate_Gorilla Assyrian Paniya Dai
Primate_Gorilla Adygei Paniya Dai
Primate_Gorilla Abkhasian Paniya Dai
Primate_Gorilla Lezgin Paniya Dai
Primate_Gorilla Adygei Paniya Dai
Primate_Gorilla Chechen Paniya Dai

Davidski, the reason why I'm asking for Treemix is because Basal rich populations like Kotias and LBK_EN share more drift with Karitiana than they do with Austroasiatic:

Karitiana Austroasiatic Kotias Chimp 0.02 4.391 112280
Karitiana Austroasiatic LBK_EN Chimp 0.019 5.183 132890

Karitiana Paniya Kotias Chimp 0.0104 2.14 113679
Karitiana Paniya LBK_EN Chimp 0.0109 2.79 134550

Which is of course a very unexpected finding no matter how you look at it. I asked for the Mota and Hadza stats to see if there is anything about the "Afro-Eurasian" thing going on, which may push S Asians away from all other Eurasians. However, all the stats with S Asians are confounded as there are too many ancestry fractions to control for for just four slots in the stat. Treemix may finally answer this question. Hopefully, when you are running Treemix again you can give us a heads-up and we can pursue this question.

capra internetensis said...

@ryukendo

Why unexpected? What in the Munda should outweigh the WHG in LBK's relationship to the ANE in Karitiana?

Davidski said...

rk,

Because of the way my datasets are set up, using Chimp increases the number of markers for this set.

Gorilla Cypriot Paniya Dai -0.009 -2.926 119904
Gorilla Druze Paniya Dai -0.0082 -2.759 119904
Gorilla Iraqi_Jew Paniya Dai -0.0099 -3.171 119904
Gorilla Georgian_Jew Paniya Dai -0.0075 -2.435 119904
Gorilla Assyrian Paniya Dai -0.0097 -3.154 118516
Gorilla Adygei Paniya Dai -0.0028 -0.913 119904
Gorilla Abkhasian Paniya Dai -0.007 -2.268 119904
Gorilla Lezgin Paniya Dai -0.0058 -1.876 119904
Gorilla Adygei Paniya Dai -0.0028 -0.913 119904
Gorilla Chechen Paniya Dai -0.004 -1.314 119904

Chimp Cypriot Paniya Dai -0.0088 -4.11 484613
Chimp Druze Paniya Dai -0.0082 -3.912 484613
Chimp Iraqi_Jew Paniya Dai -0.0101 -4.808 484613
Chimp Assyrian Paniya Dai -0.0094 -4.42 484613
Chimp Adygei Paniya Dai -0.004 -1.898 484613
Chimp Abkhasian Paniya Dai -0.0077 -3.654 484360
Chimp Lezgin Paniya Dai -0.007 -3.263 484613
Chimp Adygei Paniya Dai -0.004 -1.898 484613
Chimp Chechen Paniya Dai -0.0051 -2.435 484360

ryukendo kendow said...

@ Davidski

Arranged from largest to smallest:

Gorilla Iraqi_Jew Paniya Dai -0.0099 -3.171 119904
Gorilla Assyrian Paniya Dai -0.0097 -3.154 118516
Gorilla Cypriot Paniya Dai -0.009 -2.926 119904
Gorilla Druze Paniya Dai -0.0082 -2.759 119904
Gorilla Georgian_Jew Paniya Dai -0.0075 -2.435 119904
Gorilla Abkhasian Paniya Dai -0.007 -2.268 119904
Gorilla Lezgin Paniya Dai -0.0058 -1.876 119904
Gorilla Chechen Paniya Dai -0.004 -1.314 119904
Gorilla Adygei Paniya Dai -0.0028 -0.913 119904

Chimp Iraqi_Jew Paniya Dai -0.0101 -4.808 484613
Chimp Assyrian Paniya Dai -0.0094 -4.42 484613
Chimp Cypriot Paniya Dai -0.0088 -4.11 484613
Chimp Druze Paniya Dai -0.0082 -3.912 484613
Chimp Abkhasian Paniya Dai -0.0077 -3.654 484360
Chimp Lezgin Paniya Dai -0.007 -3.263 484613
Chimp Chechen Paniya Dai -0.0051 -2.435 484360
Chimp Adygei Paniya Dai -0.004 -1.898 484613

Before I say anything more, could we also have
Chimp Georgian Paniya Dai
Chimp Armenian Paniya Dai
Chimp Iranian_Jew Paniya Dai
Chimp Turkish_Balikesir Paniya Dai
Chimp Turkish_Adana Paniya Dai
Chimp North_Ossetian Paniya Dai
Chimp Armenia_BA Armenian Paniya

ryukendo kendow said...

Davidski, just two more:
Chimp Turkish_Jew Paniya Dai
Chimp Kurdish Paniya Dai

If the pattern holds, I will post abt it. Thanks for the stats!

Ryan said...

@rk - I think Davidski already did a Treemix that showed ANE admixture into CHG.

David - what do you think the significance of those Yamnaya fastIBD figures is?

Davidski said...

rk,

From two different datasets...

Chimp Georgian Paniya Dai -0.0102 -3.461 134979
Chimp Armenian Paniya Dai -0.011 -3.682 134979
Chimp Iranian_Jew Paniya Dai -0.0112 -3.742 134979
Chimp Turkish_Balikesir Paniya Dai 0.0057 1.865 134979
Chimp Turkish_Adana Paniya Dai -0.0044 -1.51 134979
Chimp North_Ossetian Paniya Dai -0.001 -0.348 134979
Chimp Armenia_BA Armenian Paniya -0.0674 -18.004 85897
Chimp Turkish_Jew Paniya Dai -0.0086 -2.858 134979

Chimp Georgian Paniya Dai -0.0102 -4.791 484613
Chimp Armenian Paniya Dai -0.0102 -4.761 484360
Chimp Iranian_Jew Paniya Dai -0.0092 -4.218 484603
Chimp Turkish Paniya Dai -0.002 -0.985 484613
Chimp North_Ossetian Paniya Dai -0.0001 -0.036 484360
Chimp Yamnaya_Kalmykia Armenian Paniya -0.053 -22.751 458640
Chimp Sephardic_Jew Paniya Dai -0.0085 -4.107 484613
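A rough way to compare the two result blocks above is to back out the standard error implied by each row. This sketch assumes the second numeric column is a Z-score equal to D divided by its standard error, which is how such output is normally read. The D values are rounded to four decimals in the rows above, so the implied errors are approximate.

```python
# (D, Z, SNP count) copied from the two blocks above for populations
# that appear in both datasets.
smaller_set = {
    "Georgian": (-0.0102, -3.461, 134979),
    "Armenian": (-0.0110, -3.682, 134979),
    "Iranian_Jew": (-0.0112, -3.742, 134979),
}
larger_set = {
    "Georgian": (-0.0102, -4.791, 484613),
    "Armenian": (-0.0102, -4.761, 484360),
    "Iranian_Jew": (-0.0092, -4.218, 484603),
}


def implied_se(d, z):
    """Standard error implied by Z = D / SE (assumed relationship)."""
    return abs(d / z)


for pop in smaller_set:
    se_small = implied_se(*smaller_set[pop][:2])
    se_large = implied_se(*larger_set[pop][:2])
    print(f"{pop}: SE ~{se_small:.4f} at ~135K SNPs, ~{se_large:.4f} at ~485K SNPs")
```

The D values barely move between the two datasets, while the implied standard errors shrink with more markers, so the larger Z-scores reflect tighter errors rather than a stronger effect.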

Ryan,

I'd say those stats show that Yamnaya Samara doesn't have an important relationship with any present-day Caucasians or South Asians.

ryukendo kendow said...

@ Davidski

Thanks very much for the stats!

I am getting quite a clear picture... Is it possible to run the following series? I just want to corroborate everything first.

*Chimp Iraqi_Jew GujaratiD Dai
Chimp Georgian GujaratiD Dai
Chimp Armenian GujaratiD Dai
*Chimp Iranian_Jew GujaratiD Dai
*Chimp Cypriot GujaratiD Dai
*Chimp Druze GujaratiD Dai
Chimp Abkhasian GujaratiD Dai
Chimp Lezgin GujaratiD Dai
Chimp Chechen GujaratiD Dai
Chimp Adygei GujaratiD Dai
Chimp North_Ossetian GujaratiD Dai

(the starred stats are to remind myself, ignore the stars)

Paniya GujaratiD Adygei North_Ossetian
Paniya GujaratiD Chechen North_Ossetian
Paniya GujaratiD Lezgin North_Ossetian
Paniya GujaratiD Abkhasian North_Ossetian
Paniya GujaratiD Druze North_Ossetian
Paniya GujaratiD Cypriot North_Ossetian
Paniya GujaratiD Iranian_Jew North_Ossetian
Paniya GujaratiD Armenian North_Ossetian
Paniya GujaratiD Georgian North_Ossetian
Paniya GujaratiD Iraqi_Jew North_Ossetian

Paniya GujaratiD BedouinB North_Ossetian
Paniya GujaratiD Yemenite_Jew North_Ossetian
Paniya GujaratiD Syrian North_Ossetian
Paniya GujaratiD Jordanian North_Ossetian
Paniya GujaratiD Kurdish North_Ossetian

Paniya Sephardic_Jew North_Ossetian Chimp
Paniya Sephardic_Jew Lezgin Chimp

Chimp Paniya Armenia_BA Armenian

ryukendo kendow said...

@ Capra

Capra, Jones et al. showed that CHG+Onge produced the most negative f3s for a variety of S Asians. Assuming the CHG arrived in S Asia in the Neolithic, then in the comparison

Chimp LBK_EN Munda Karitiana or
Chimp CHG Munda Karitiana

by far the longest drift path is the path shared by (CHG in Kotias, CHG in Munda), which is measured starting from when Basal Eurasian split off from Crown Eurasian, to the split of Kotias and the CHG-like neolithic ancestry in Indians; i.e. the later split is neolithic. This path is far longer than the other path shared by (? crown Eurasian in Kotias, ANE in Karitiana), which is measured from when WHG+ANE split off from Basal Eurasian, to when ? and ANE split off from each other in the Paleolithic. So the negative term should be larger.

Either S Indians are dragged off by something v basal, or Kotias does not actually share such a long drift path with the Basal Eurasian in S Indians; i.e. it is a poor representative.

^^ @ Ryan

Ryan said...

@David - I guess that means that Yamnaya isn't actually PIE, but rather a close relative. I suppose that shouldn't be that surprising given that the location is a bit on the periphery of the Pontic Steppe, and that you earlier mentioned the "true" PIE probably had a bit more WHG than Yamnaya IIRC?

Could you check for IBD sharing between Yamnaya and Uyghurs when you get the chance? I'd be curious to see if this is a bit of a centum/satem thing (with the Slavic groups getting their Yamnaya IBD from neighbouring Uralics instead).

@Rk - "Either S Indians are dragged off by something v basal, or Kotias does not actually share such a long drift path with the Basal Eurasian in S Indians; i.e. it is a poor representative."

I think that's looking likely for IE as well. Kotias is just the best proxy we have so far. I suspect the real thing is probably buried somewhere between Kurdistan and Chechnya right now, but those aren't easy places to go to without getting shot.

David - how much would it affect your models if the "true" EHG in PIE was actually somewhat between SHG and EHG, and the "true" CHG in PIE was actually somewhat between Kotias and EHG?

I guess what I'm getting at is - what if there never were any pure groups, but rather clines, and as a consequence what if the tree model doesn't provide a particularly good fit?

capra internetensis said...

@Ryu

I am on my phone so it is hard to find stuff; in Jones et al. I can only find:

D(Yoruba, Mala; Onge, Kharia) 0.0240 13.062
Where the best stat is not for Kotias but the low caste South Indian Mala.

There is no Paniya or Pulliyar in the sample, the nearest thing being Mala. I'm not sure how close they'd be.

The f3 is (Kharia: Lahu, Mala) and is not significant.

It's not clear to me that we should expect Munda and full-on South Indian hill tribes to carry much of that CHG like ancestry that is important in caste populations.

capra internetensis said...

But I guess maybe that is your point?

Romulus said...

Isn't EHG itself just a mixture of ANE and WHG? how much WHG is in EHG vs Mal'ta boy?

Balaji said...

Davidski,

Thanks for the updated Smarter Bear Plot.

https://drive.google.com/file/d/0B9o3EYTdM8lQY0FvWXdFR014emc/view?usp=sharing

Can you comment on why it looks different from the old plot? In particular Kotias and Satsurblia have been considerably displaced.

https://drive.google.com/file/d/0B9o3EYTdM8lQb1R2MDJmS2h0Nk0/view?pli=1

Ryukendo Kendow,

You had written of the following possibility, “Kotias does not actually share such a long drift path with the Basal Eurasian in S Indians; i.e. it is a poor representative.”. To this I say Amen! Kotias was ensconced in the wilderness of the Caucasus surrounded by mountains. How likely was it for his contemporaries or descendants to get out of there, learn agriculture from Anatolia_Neolithic and then settle the Indian Subcontinent? That is why I proposed GHG as being the dominant West Eurasian component in South Asia rather than CHG and that agriculture in South Asia was an indigenous development.

Still, like Capra Internetensis, I do not find the following surprising.

Karitiana Austroasiatic Kotias Chimp 0.02 4.391 112280
Karitiana Austroasiatic LBK_EN Chimp 0.019 5.183 132890

Karitiana Paniya Kotias Chimp 0.0104 2.14 113679
Karitiana Paniya LBK_EN Chimp 0.0109 2.79 134550

BEA, CHG, GHG, WHG and ANE are all West Eurasian components which are closer to each other than to any ENA. A PCA plot of world populations will show all West Eurasians with varying amounts of these components fairly close to each other. Karitiana has about 44% ANE. The GHG in Austroasiatic and Paniya is 20% or less.

Davidski said...

Balaji,

Some of the ancient samples and numbers of markers have changed since I ran the plot last. This is probably causing the shift. I'll have to re-run all of the samples again in the same way with the new dataset to see what's going on. I'll do that later today.

Davidski said...

rk,

Chimp Iraqi_Jew GujaratiD Dai -0.0408 -17.479 594924
Chimp Georgian GujaratiD Dai -0.0414 -17.746 594924
Chimp Armenian GujaratiD Dai -0.042 -18.392 594924
Chimp Iranian_Jew GujaratiD Dai -0.0404 -17.363 594924
Chimp Cypriot GujaratiD Dai -0.0392 -17.011 594924
Chimp Druze GujaratiD Dai -0.039 -17.847 594924
Chimp Abkhasian GujaratiD Dai -0.0385 -16.414 594924
Chimp Lezgin GujaratiD Dai -0.0384 -16.69 594924
Chimp Chechen GujaratiD Dai -0.0363 -15.81 594924
Chimp Adygei GujaratiD Dai -0.0335 -14.898 594924
Chimp North_Ossetian GujaratiD Dai -0.0294 -13.01 594924
Paniya GujaratiD Adygei North_Ossetian -0.0017 -1.727 134979
Paniya GujaratiD Chechen North_Ossetian -0.0027 -2.56 134979
Paniya GujaratiD Lezgin North_Ossetian -0.0029 -2.596 134979
Paniya GujaratiD Abkhasian North_Ossetian -0.0023 -2.085 134979
Paniya GujaratiD Druze North_Ossetian -0.0008 -0.836 134979
Paniya GujaratiD Cypriot North_Ossetian -0.0007 -0.674 134979
Paniya GujaratiD Iranian_Jew North_Ossetian -0.0012 -1.042 134979
Paniya GujaratiD Armenian North_Ossetian -0.0032 -3.032 134979
Paniya GujaratiD Georgian North_Ossetian -0.0042 -4.028 134979
Paniya GujaratiD Iraqi_Jew North_Ossetian -0.0013 -1.035 134979
Paniya GujaratiD BedouinB North_Ossetian 0.0019 1.621 134979
Paniya GujaratiD Yemenite_Jew North_Ossetian 0.0007 0.58 134979
Paniya GujaratiD Syrian North_Ossetian -0.0012 -1.099 134979
Paniya GujaratiD Jordanian North_Ossetian 0.002 1.755 134979
Paniya GujaratiD Kurdish North_Ossetian -0.0019 -1.971 134968
Paniya Turkish_Jew North_Ossetian Chimp -0.0505 -18.292 134979
Paniya Turkish_Jew Lezgin Chimp -0.0529 -18.887 134979
Chimp Paniya Armenia_BA Armenian 0.0018 0.386 85897

Paniya Gujarati Adygei North_Ossetian -0.0008 -1.17 490577
Paniya Gujarati Chechen North_Ossetian -0.0024 -3.815 493288
Paniya Gujarati Lezgin North_Ossetian -0.0031 -4.739 493288
Paniya Gujarati Abkhasian North_Ossetian -0.0022 -3.582 493288
Paniya Gujarati Druze North_Ossetian -0.0006 -0.895 490577
Paniya Gujarati Cypriot North_Ossetian -0.0018 -2.482 493288
Paniya Gujarati Iranian_Jew North_Ossetian -0.0018 -1.827 493281
Paniya Gujarati Armenian North_Ossetian -0.0024 -3.445 493288
Paniya Gujarati Georgian North_Ossetian -0.0034 -4.934 493288
Paniya Gujarati Iraqi_Jew North_Ossetian -0.0016 -2.031 493288
Paniya Gujarati BedouinB North_Ossetian 0.0024 2.863 490577
Paniya Gujarati Yemenite_Jew North_Ossetian 0.0029 3.649 493288
Paniya Gujarati Syrian North_Ossetian 0.0013 1.912 493288
Paniya Gujarati Jordanian North_Ossetian 0.0023 3.489 493288
Paniya Gujarati Kurdish North_Ossetian -0.0016 -2.577 493288
Paniya Sephardic_Jew North_Ossetian Chimp -0.0531 -27.661 487690
Paniya Sephardic_Jew Lezgin Chimp -0.0548 -28.611 487966
Chimp Paniya Yamnaya_Kalmykia Armenian -0.01 -4.29 458640

Davidski said...

Ryan,

I think PIE was Sredny Stog and maybe Khvalynsk. Yamnaya was at best late PIE, but very likely some sort of IE. The fact that this Yamnaya individual shows weak IBD links with South Asians isn't a problem, because Indo-Iranians came from the Middle Bronze to Late Bronze Age steppe, which is what the IBD stats of the Srubnaya individual support. Early Sredny Stog samples will look like the Khvalynsk samples we already have IMO, and will be the benchmark for modeling PIE ancestry.

Here are the stats for the Uyghurs...

Corded_Ware Stats
Mean 1.157087
Standard Error 0.71844003003272
Median 0
Mode 0
Standard Deviation 2.27190685714318
Sample Variance 5.16156076753421
Kurtosis 7.05815416604571
Skewness 2.57759534599704
Range 7.25196199999999
Minimum 0
Maximum 7.25196199999999
Sum 11.57087
Count 10

Hungary_BA Stats
Mean 1.274979
Standard Error 0.38140336599237
Median 1.4054035
Mode 0
Standard Deviation 1.2061033437907
Sample Variance 1.45468527590312
Kurtosis -1.64051563826124
Skewness 0.1079533226031
Range 3.16348000000001
Minimum 0
Maximum 3.16348000000001
Sum 12.74979
Count 10

Srubnaya Stats
Mean 0.6773942
Standard Error 0.32324009683185
Median 0
Mode 0
Standard Deviation 1.02217493708202
Sample Variance 1.04484160199863
Kurtosis 1.68246301254289
Skewness 1.48573564641118
Range 2.97348200000002
Minimum 0
Maximum 2.97348200000002
Sum 6.77394200000003
Count 10

Yamnaya Stats
Mean 0.107098
Standard Error 0.107098
Median 0
Mode 0
Standard Deviation 0.33867361284871
Sample Variance 0.11469981604
Kurtosis 10
Skewness 3.16227766016838
Range 1.07097999999999
Minimum 0
Maximum 1.07097999999999
Sum 1.07097999999999
Count 10
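The Yamnaya block can be reproduced from the summary itself. With a count of 10, a minimum of 0, and a sum equal to the maximum, nine of the ten Uyghur comparisons must be zero and one must be about 1.07098 cM. The sketch below rebuilds that list (an inference from the summary, not the raw data) and recomputes the mean, sample variance, and standard error with Python's standard library.

```python
import math
import statistics

# Inferred from the summary: Sum == Maximum and Minimum == 0 with Count 10,
# so one individual shares ~1.07098 cM and the other nine share nothing.
yamnaya_cm = [1.07098] + [0.0] * 9

n = len(yamnaya_cm)
mean = statistics.mean(yamnaya_cm)
variance = statistics.variance(yamnaya_cm)   # sample variance, n - 1 denominator
std_dev = math.sqrt(variance)
std_err = std_dev / math.sqrt(n)

print(f"mean={mean:.6f} variance={variance:.6f} sd={std_dev:.6f} se={std_err:.6f}")
```

When only one of n values is nonzero, the standard error works out to the same number as the mean, which is why those two lines coincide in the Yamnaya block. In other words, that mean rests on a single shared segment.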

ryukendo kendow said...

@ Davidski
David, I think (?) I see a relationship between the Basal Eurasian ancestry in S Indians and the 'SW Asian' component. Either that, or S Indians are somehow closer to populations carrying cryptic African ancestry.

The evidence for this is not particularly striking, but the patterns are pretty internally consistent, as neither the proportion of CHG ancestry nor the proportion of ENF ancestry demonstrate the same relationship. I.e. given some fraction of CHG, the presence of ENF ancestry does not make Middle Easterners further away from S Indians, and the presence of even more CHG ancestry does not make them closer. Modern SW-asian carrying populations are all closer to S Indians than Kotias itself, and there is a pattern where the more SW-asian carrying populations are even closer to S Indians.

Any other ways to test this topic? Treemix would be best I think...

In the meantime just to make sure one of my inferences is not confounded:
Dai Paniya Georgian Kotias
Dai Paniya Armenian Kotias


@ Balaji

We already had a previous conversation on the topic, where Kol and Chamar strongly favour Kotias over LBK_EN at 90k markers, while
Dai Paniya LBK_EN Kotias 0.0055 1.742 92265
Dai Austroasiatic LBK_EN Kotias 0.0054 1.962 91505
Paniya and Austroasiatic already demonstrate shifts close to significance at 90k markers. Given 500k markers I'm pretty sure the signal will be strong across India.

There is the question of what it means to say that Indians carry 'GHG', because as of now, 'CHG' is any ancestry that split off from Kotias later than Kotias split off from LBK_EN, so 'GHG' is just CHG. Unless you mean that the CHG ancestry that reached India is somehow modified with +ANE or +SW Asian or +African? but then that is just CHG+some other ancestry.

@ Capra
Capra, this applies to Paniya as well:

Karitiana Paniya Kotias Chimp 0.0104 2.14 113679
Loschbour Paniya Kotias Chimp 0.0371 5.928 96728

Karitiana Paniya LBK_EN Chimp 0.0109 2.79 134550
Loschbour Paniya LBK_EN Chimp 0.0781 15.845 113726

Paniya, like Mala, as Dravidian agricultural outcastes (not hill tribes), should have a strong signal of Kotias ancestry, and the drift path to (Kotias, CHG in Paniya) should be much longer than the drift path to (ANE/EHG in Karitiana, ??? in Kotias).

capra internetensis said...

@Ryu

I don't know if the legal difference between Paniya as Scheduled Tribes and Mala as Scheduled Castes reflects any important difference in population history. However, the limited uniparental data I have shows a difference: Paniya (from Tamil Nadu, n=72) had 90% F* and C1b, 3% R1a, while Mala (from Andhra Pradesh, n=17) had 24% F* and H, 47% R1, the rest L and J. So Mala and Paniya are not *necessarily* very comparable.

Anyway, regardless of the tribals, the proposed connection to Southwest Asian component sounds promising.

Davidski said...

rk,

I talked about a pseudo-SSA affinity among South Asians here.

http://eurogenes.blogspot.com.au/2014/03/ancient-north-eurasian-ane-levels.html

Btw...

Dai Paniya Georgian Kotias 0.003 1.074 113679
Dai Paniya Armenian Kotias 0.0021 0.77 113679

Davidski said...

Balaji,

https://drive.google.com/file/d/0B9o3EYTdM8lQYkJ0bnc0eUk4dWs/view?usp=sharing

Here's the datasheet.

https://drive.google.com/file/d/0B9o3EYTdM8lQX25SZ2pFX0ZPTXc/view?usp=sharing

ryukendo kendow said...

@ Davidski

Thanks David.

When you are running treemix, it will be nice to see some runs to observe what's going on.

@ FrankN
Frank, are you aware of any old connections between S Asia and Africa?

ryukendo kendow said...

@ Capra

Capra, thanks for the figures.

Haploid genetics is not my forte, do you observe any patterns in S Indians that might be of interest? For example, what does that extremely high % of R1 signify? Are there any haploid markers which might tie Indians to Africans/SW Asians?

Davidski said...

Y-DNA Haplogroup T comes to mind.

Rob said...

@ Ryu

"are you aware of any old connections between S Asia and Africa"

Apart from the middle Stone Age and putative rapid southern coastal dispersal route ?

Jack Rusher said...

I did some visualizations that were like a 3D Sankey diagram showing shared ancestry as flow between nodes that were located at (x,y) by lat/long and (z) by sample age. Sadly, they were not great, and -- even worse -- I think I'll need to start from alleles to get the graph I want.

Alberto said...

On the topic of formal stats measuring shared drift and not behaving like IBS, one interesting effect that Matt has commented about before is that they can serve to some extent as a "linked" analysis too. For example, about local continuity of Neolithic populations (from the Cassidy et al. paper):

Mbuti Esperstedt_MN : LBK_EN Hungarian_EN -0.0319 -9.997
Mbuti Spanish_MN : Hungarian_EN Spanish_EN 0.0379 14.28

For comparison (without this pseudo(?)-linked effect):

Mbuti Gok2 : LBK_EN Hungarian_EN 0.0021 0.562
Mbuti Gok2 : Hungarian_EN Spanish_EN 0.009 2.199

More examples, here "normal" ones (without the effect):

Mbuti Ballynahatty : Irish_Bronze Corded_Ware_LN -0.0061 -1.638
Mbuti CO1 : Irish_Bronze Corded_Ware_LN -0.0101 -2.087
Mbuti Gok2 : Irish_Bronze Corded_Ware_LN -0.0112 -2.538
Mbuti Otzi : Irish_Bronze Corded_Ware_LN -0.0076 -1.966
Mbuti Spanish_CA : Irish_Bronze Corded_Ware_LN -0.0068 -2.339

And 2 with the effect:

Mbuti Spanish_MN : Irish_Bronze Corded_Ware_LN 0.0106 3.572
Mbuti Esperstedt_MN : Irish_Bronze Corded_Ware_LN 0.0302 7.433

That second one is quite impressive, actually. Investigating it a bit more, I checked the f3(Dinka; X,Y) stats from Haak et al.:

https://drive.google.com/file/d/0B2ZfdVZaNXDxcWdfRElhYlY2Qms/view?usp=sharing

It's interesting that Corded Ware shares as much drift with Esperstedt_MN as with Yamnaya, when we know that by admixture it would be about 73% Yamnaya, 27% Germany_MN. Probably meaning that while Esperstedt_MN had direct input into those CW samples from Germany, Yamnaya (the samples from Samara, at least) didn't (other MN samples are lower, as expected).

There are more interesting things in those figures, like the very high shared drift between Unetice and Motala (this one confirmed by qpAdm, so not necessarily an amplified effect of direct input of the exact population), or CW sharing with Motala almost as much as with EHGs, high MN and EHG/SHG shared drift but not that high with Yamnaya even in these LNBA cultures (Yamnaya being the only CHG-rich population - though it seems that CHG increased with time while EHG decreased with time, so not necessarily both populations were tied by Yamnaya admixture)...

Rob said...

Alberto

Yes it'll be quite interesting to see what more genomes from east Central Europe will bring to the big picture

I remember that even in Baalberg there were the poorly resolved but potentially interesting samples like R1* and a couple of I2s (in fact none were G2, H or J). And they had some minor evidence of EHG-type mix?

Davidski said...

Motala and other SHG have very high ratios of WHG. This will skew a lot of the formal results in their favor, especially in some cases where high WHG comes together with steppe admixture to produce a pseudo-SHG effect.

Unetice might well have some SHG admixture, because they may have been in part of Scandinavian origin, but if so, it's basically limited to them.

I don't think SHG had any role in the formation of Corded Ware or Bell Beaker, which also means that their impact on the modern genetics of Northern Europeans is low.

I'm actually doing an IBD run now with Rathlin1, and it looks like LNBA Europeans were one big family all the way from Ireland to Samara via the Corded Ware horizon and the Hungarian Plain. It'll be interesting to see if Rathlin1 shows a strong relationship to any Asian groups, like Srubnaya does to South Asians.

Krefter said...

@About Ancient Greek DNA,

According to a museum there's little difference between Greek Mesolithic and Neolithic mtDNA. This makes it likely EEF are mostly hunter gatherers from the Aegean/Anatolia who became farmers.

Plus there's continuation(mtDNA, nuclear?) between Neolithic and Bronze age Greece. So, this probably means there were humongous movements of people from West Asia into Turkey and Greece during or after the Bronze age.

http://www.anthrogenica.com/showthread.php?5044-An-interesting-article&p=100266&viewfull=1#post100266

Davidski said...

The smelting of iron made it cheaper and easier to produce weapons, which became very useful in the turmoil that followed the Bronze Age collapse. So I reckon it was in the Iron Age that this happened.

But I'm skeptical that there was really strong continuity in Greece from Mesolithic to Neolithic and then to Bronze Age. I'm betting on largescale replacements during these periods too, but by populations with broadly similar mtDNA haplogroups.

Rob said...

Krefter

"According to a museum there's little difference between Greek Mesolithic and Neolithic mtDNA. This makes it likely EEF are mostly hunter gatherers from the Aegean/Anatolia who became farmers. "

We need that clarified, because there is no Greek Mesolithic genome-wide data. We have one mtDNA haplogroup. Moreover, the archaeological evidence does not lend itself to local domestication by Aegean-Anatolian foragers. There are stratigraphic hiatuses and differential landscape use.

"Plus there's continuation(mtDNA, nuclear?) between Neolithic and Bronze age Greece. So, this probably means there were humongous movements of people from West Asia into Turkey and Greece during or after the Bronze age. "

I doubt it - there's not a shred of evidence for it. Rather, I bet, as I've stated repeatedly, it must have occurred after c. 4000 BC, but probably before 2500 BC.

Rob said...

Dave

"The smelting of iron made it cheaper and easier to produce weapons, which became very useful in the turmoil that followed the Bronze Age collapse. "

Yes I'm sure this made an impact. But the legend goes the Dorians came from the north.
So how should you account for the CHG seen now?

Jared Knows said...

Africa > South Asia: http://dispatchesfromturtleisland.blogspot.com/2012/05/how-did-african-crops-get-to-india.html, http://www.academia.edu/1139491/African_crops_in_prehistoric_South_Asia_a_critical_review

Davidski said...

Rob,

I think most of the CHG now present in the southern Balkans arrived there after the Bronze Age collapse and during the upheavals of the Iron Age.

I also think that samples from elite Mycenaean graves won't resemble closely any present-day Greek population.

Rob said...

Dave

I know you're usually good with predictions, so that must entail what Krefter said - massive Iron Age migrations from west Asia to Greece, but also Italy, Iberia and Central Europe.

I guess my knowledge of prehistory isn't as good as I thought it was .. ;)

Karl_K said...

@Krefter

"According to a museum there's little difference between Greek Mesolithic and Neolithic mtDNA. This makes it likely EEF are mostly hunter gatherers from the Aegean/Anatolia who became farmers."

Until there are autosome sequences from mesolithic Greece, this is definitely not in any way "likely". That is a huge stretch of the evidence. The only mtDNA from Greece were two K1c haplogroups from the same location.

Karl_K said...

K1c, K2b and K2c subclades have NEVER been found among Neolithic farmers.

Davidski said...

Don't be so hard on yourself. I'm pretty sure you know that Spain was settled by Greeks, all sorts of people from within the Roman Empire, and even Arabs. Italy was too.

Central Europe was influenced by gradual gene flow from the south since the Iron Age. I know that Italians settled in Polish cities, so I'm sure they moved to Germany too, probably in larger numbers.

Rob said...

;)

Ariele Iacopo Maggi said...

"I also think that samples from elite Mycenaean graves won't resemble closely any present-day Greek population."

They will lack the slavic parts.

Ariele Iacopo Maggi said...

"Plus there's continuation(mtDNA, nuclear?) between Neolithic and Bronze age Greece. So, this probably means there were humongous movements of people from West Asia into Turkey and Greece during or after the Bronze age. "

Do you mean during IE invasions? (ironic)

Davidski said...

Elite Mycenaeans will resemble Hungary_BA BR2 IMO.

Check out his IBD results above.

Rob said...

Dave

"Elite Mycenaeans will resemble Hungary_BA BR2 IMO."

I agree Dave. And BA Hungary will be similar to former Cucuteni areas.
But I also think there will be CHG in BA Greece, as some migration from Anatolia is evident.

Ryan said...

Yet we know Mycenaeans were writing in Greek.

Crete has a lot of CHG doesn't it? I'd think the Minoans could be a potential source of CHG.

Thanks for the data re: Uyghurs David. I agree re: Yamnaya being an early (and mostly dead end) branch of IE. I'd propose to you that it would have been a centum language, and that von Bradke's hypothesis that Sintashta/Srubna are the source of satemization is correct and supported by your IBD data, as well as the R1b->R1a shift.

Romulus said...

Minoans had mtDNA H13, same as CHG. I bet there is some CHG in them. I think Mycenaean Greeks will be fairly close to Italians given that ancient Etruscan sample clusters with modern Italians.

Davidski said...

"I think Mycenaean Greeks will be fairly close to Italians given that ancient Etruscan sample clusters with modern Italians."

This statement makes no sense whatsoever.

Balaji said...

Ryukendo Kendow,

You wrote the following, “I see a relationship between the Basal Eurasian ancestry in S Indians and the 'SW Asian' component. Either that, or S Indians are somehow closer to populations carrying cryptic African ancestry.” You did not state explicitly what the evidence was, but I think it is the following D-statistics calculated by Davidski, which I have sorted in order:

Paniya Gujarati Yemenite_Jew North_Ossetian 0.0029 3.65 493288
Paniya Gujarati BedouinB North_Ossetian 0.0024 2.86 490577
Paniya Gujarati Jordanian North_Ossetian 0.0023 3.49 493288
Paniya Gujarati Syrian North_Ossetian 0.0013 1.91 493288
Paniya Gujarati Druze North_Ossetian -0.0006 -0.9 490577
Paniya Gujarati Adygei North_Ossetian -0.0008 -1.17 490577
Paniya Gujarati Kurdish North_Ossetian -0.0016 -2.58 493288
Paniya Gujarati Iraqi_Jew North_Ossetian -0.0016 -2.03 493288
Paniya Gujarati Iranian_Jew North_Ossetian -0.0018 -1.83 493281
Paniya Gujarati Cypriot North_Ossetian -0.0018 -2.48 493288
Paniya Gujarati Abkhasian North_Ossetian -0.0022 -3.58 493288
Paniya Gujarati Armenian North_Ossetian -0.0024 -3.45 493288
Paniya Gujarati Chechen North_Ossetian -0.0024 -3.82 493288
Paniya Gujarati Lezgin North_Ossetian -0.0031 -4.74 493288
Paniya Gujarati Georgian North_Ossetian -0.0034 -4.93 493288

In comparison to GujaratiD, Paniya seems to prefer SW Asian populations with African admixture. Conversely GujaratiD appears to favor NW Asian populations. But this is not because of any SW Asian or cryptic African ancestry in Paniya. This trend is only because GujaratiD has more GHG than Paniya. We can see this also by looking at the following sorted D-statistics.

Chimp Armenian Paniya Dai -0.0102 -4.76 484360
Chimp Georgian Paniya Dai -0.0102 -4.79 484613
Chimp Iraqi_Jew Paniya Dai -0.0101 -4.81 484613
Chimp Iranian_Jew Paniya Dai -0.0092 -4.22 484603
Chimp Cypriot Paniya Dai -0.0088 -4.11 484613
Chimp Druze Paniya Dai -0.0082 -3.91 484613
Chimp Abkhasian Paniya Dai -0.0077 -3.65 484360
Chimp Lezgin Paniya Dai -0.0070 -3.26 484613
Chimp Chechen Paniya Dai -0.0051 -2.44 484360
Chimp Adygei Paniya Dai -0.0040 -1.9 484613

Chimp Armenian GujaratiD Dai -0.0420 -18.39 594924
Chimp Georgian GujaratiD Dai -0.0414 -17.75 594924
Chimp Iraqi_Jew GujaratiD Dai -0.0408 -17.48 594924
Chimp Iranian_Jew GujaratiD Dai -0.0404 -17.36 594924
Chimp Cypriot GujaratiD Dai -0.0392 -17.01 594924
Chimp Druze GujaratiD Dai -0.0390 -17.85 594924
Chimp Abkhasian GujaratiD Dai -0.0385 -16.41 594924
Chimp Lezgin GujaratiD Dai -0.0384 -16.69 594924
Chimp Chechen GujaratiD Dai -0.0363 -15.81 594924
Chimp Adygei GujaratiD Dai -0.0335 -14.9 594924

The order is exactly the same for GujaratiD and Paniya. There is no reason to believe that the ANI in Paniya is of a different kind from that in GujaratiD. In both as well as in other South Asians, ANI is predominantly GHG. It is just that Paniya has 17% ANI (according to Moorjani) and GujaratiD perhaps 45% ANI.
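The claim about matching order can be checked directly from the two sorted lists above. This sketch sorts the populations by their GujaratiD value and then tests whether the Paniya values are non-decreasing in that same order; a "less than or equal" comparison is used because Armenian and Georgian tie at -0.0102 in the Paniya set.

```python
# D values copied from the two sorted lists above.
paniya = {
    "Armenian": -0.0102, "Georgian": -0.0102, "Iraqi_Jew": -0.0101,
    "Iranian_Jew": -0.0092, "Cypriot": -0.0088, "Druze": -0.0082,
    "Abkhasian": -0.0077, "Lezgin": -0.0070, "Chechen": -0.0051,
    "Adygei": -0.0040,
}
gujarati_d = {
    "Armenian": -0.0420, "Georgian": -0.0414, "Iraqi_Jew": -0.0408,
    "Iranian_Jew": -0.0404, "Cypriot": -0.0392, "Druze": -0.0390,
    "Abkhasian": -0.0385, "Lezgin": -0.0384, "Chechen": -0.0363,
    "Adygei": -0.0335,
}

# Populations ordered from most negative to least negative GujaratiD value.
order_by_gujarati = sorted(gujarati_d, key=gujarati_d.get)

# Paniya values read off in that order; the ranking matches if they never decrease.
paniya_in_that_order = [paniya[pop] for pop in order_by_gujarati]
same_order = all(
    a <= b for a, b in zip(paniya_in_that_order, paniya_in_that_order[1:])
)
print(order_by_gujarati)
print("Same ranking:", same_order)
```

The check only compares point estimates. Many adjacent populations differ by a few ten-thousandths, so it says nothing about whether neighbouring ranks are statistically distinguishable.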

G. Dekaen said...

I calculated averages for most of the populations for the Srubna, BR2, Corded and Yamna samples in case anyone is interested in making a map. I can't make maps yet, although hopefully I will in the future, but here are the averages for now:

https://docs.google.com/spreadsheets/d/13C10gN0N1QSBsMTFtbDVYSXwOgY9xq8CgwZAfKvD194/edit#gid=0

Some interesting patterns emerge; I think I'll ask about those tomorrow. For now, I wanted to ask whether the quality of the genomes has an effect on the degree of shared cM; for example, I noticed that the best quality genome, BR2 21.25x has much stronger connections to any population compared to Srubna 8.2x. The same pattern applies to the Corded (17.5x) and Yamna (6.2x) samples. Or does BR2's much greater sharing with modern populations indicate that it or populations that were related to it really did contribute that much more to most European populations than Srubna?

I compared the above four genomes to eliminate the effect time has on shared segments, i.e. segments breaking down with age. By eliminating that variable and comparing sets of samples that are of the same relative age, i.e. Srubna and BR2/Corded and Yamna, I hoped to get a more accurate idea of the proportion of genes each contributed to modern day populations. I hope this methodological assumption was sound. Please let me know.

Thank you and enjoy.

Rob said...

Dave

Have you looked at IBD for the Iron Age Scythian and the Iron Age Cimmerian from Hungary?

Richard Rocca said...

@David... I think there is enough pent up demand for an ADMIXTURE test that includes the big four components (WHG/EEF/EHG/CHG), be it via Gedmatch or as a "pay-per-view". Do you have time to make one available in the short term?

Shaikorth said...

@Dekaen
Low coverage inflates the apparent homozygosity of ancient genomes, which breaks up the longer segments that could otherwise match modern populations, so less IBD is detected.
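A rough way to see why: at a truly heterozygous site, a few reads can easily all come from the same allele, and the site then looks homozygous. The sketch below assumes each read picks either allele with probability 0.5 and ignores sequencing error, reference bias and whatever the genotype caller actually does, so it is only a toy model. The function name is made up for illustration.

```python
def prob_het_looks_hom(depth: int) -> float:
    """Chance that all `depth` reads at a truly heterozygous site carry the same allele.

    Assumes each read samples either allele with probability 0.5, with no
    sequencing error or reference bias (a deliberate simplification).
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    # Two ways to see a single allele: all reads show allele A, or all show allele B.
    return 2 * 0.5 ** depth

for depth in (1, 2, 4, 8, 16):
    print(f"{depth:>2} reads: {prob_het_looks_hom(depth):.4f}")
```

Average coverage is not the depth at every site; many positions in a 6x or 8x genome will be covered by far fewer reads than the mean, which is where this effect bites.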

Homozygosity figures for many ancient genomes are here; you can see which ones have good coverage:

https://verenich.wordpress.com/2015/06/28/%D0%BF%D0%B0%D0%BB%D0%B5%D0%BE%D0%B3%D0%B5%D0%BD%D0%BE%D0%BC%D1%8B-%D1%82%D0%B5%D1%85%D0%BD%D0%B8%D1%87%D0%B5%D1%81%D0%BA%D0%B0%D1%8F-%D0%B8%D0%BD%D1%84%D0%BE%D1%80%D0%BC%D0%B0%D1%86%D0%B8%D1%8F/

Modern populations tend to have homozygosity below 70%, extreme cases like Karitiana get up to 75%.

Alexandros said...

I believe the discussion about Mycenaean and Minoan autosomal DNA is a very complex one. There are many 'ifs' and 'buts' involved.

If Mycenaeans are in fact a hybrid population between Neolithic/Chalcolithic Greeks and a population originating from the Ukrainian steppe (this is what archaeologists have been suggesting for years), then maybe they would resemble something like Bronze Age Hungarians, maybe with less HG ancestry. If however there was no great migration into Greece from the steppe and the Mycenaean culture consisted of the indigenous (Neolithic) Greek population ruled by a steppe-derived elite, then Mycenaean Greeks would more closely resemble the EEF populations analyzed so far. I am personally leaning towards the former, i.e. I do believe substantial migration and settlement did take place.

Regarding Minoans, similar things apply. Were they a hybrid between Neolithic Cretans and a migrant population from Anatolia/Levant (IE or not), or were they simply the direct descendants of Neolithic/Chalcolithic Cretans? What we know for sure is that the Minoans were speaking a non-IE language and were eventually destroyed by the clearly IE (at least culturally) Mycenaeans.

Karl_K said...

@Shaikorth

"Modern populations tend to have homozygosity below 70%, extreme cases like Karitiana get up to 75%."

You need to put a qualifier on this. In terms of the whole genome sequence, over 99.5% of sites in any human genome are homozygous. Counting only a smaller set of more common SNPs results in lower apparent homozygosity.
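A small sketch of why the choice of sites matters: under Hardy-Weinberg proportions (an assumption of this sketch, as is the function name), the expected share of homozygous genotypes at a biallelic site is p² + q², which is close to 1 for rare variants and drops to 0.5 when both alleles are equally common.

```python
def expected_homozygosity(minor_allele_freq: float) -> float:
    """Expected fraction of homozygous genotypes at a biallelic site under Hardy-Weinberg."""
    p = minor_allele_freq
    q = 1.0 - p
    return p * p + q * q

for maf in (0.001, 0.05, 0.25, 0.5):
    print(f"MAF {maf}: {expected_homozygosity(maf):.3f}")
```

A panel enriched for common SNPs therefore reports much lower homozygosity than the genome as a whole, where most positions do not vary at all.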

Shaikorth said...

The qualifier would be "in the SNP set used for the ancients".

Rob said...

Alexandros

Bronze Age Greece is even more complex than you suggest. IMO it's not a 2 way mix of Copper age "natives" and steppe "migrants", but more like a 4 way mix of components- it's just a matter of getting aDNA to calculate their relative contributions

Minoans will be a simpler case: EEF and CHG.

Krefter said...

@Alexandros and Rob,

Something important to understand is Neolithic Greeks had much less WHG ancestry than other Neolithic Europeans. Therefore, Early Greek speakers could not have been just like Bronze age Hungarians, because they had significantly less WHG ancestry.

Minoan mtDNA doesn't strongly suggest they were very EEF or West Asian. Their mtDNA isn't typical for either at all. Their results are generic West Eurasian with some examples of Euro-specific haplogroups.

ryukendo kendow said...

@ Balaji

Balaji, I arranged the Gujarati stats as well before I posted that. As I mentioned, I thought that the evidence was weak, and that Gujarati might in fact end up with the same set of scores, which was the reason why they were plonked in together with North Ossetian, to show that pattern.

Later, I thought about it, and realised that the North Ossetian stats also do not get rid of the African confound, so we effectively have no evidence at all that S Indians are more African-shifted than N Indians, at least not from this set of stats. The only thing to go by is the ranking of which populations produce the most negative figures, which surprisingly does not seem to correlate very well with how 'CHG' the population is beyond a certain level of CHG ancestry, not exactly at least.

Rob said...

Krefter

"Something important to understand is Neolithic Greeks had much less WHG ancestry than other Neolithic Europeans. Therefore, Early Greek speakers could not have been just like Bronze age Hungarians, because they had significantly less WHG ancestry.

Minoan mtDNA doesn't strongly suggest they were very EEF or West Asian. Their mtDNA isn't typical for either at all. Their results are generic West Eurasian with some examples of Euro-specific haplogroups."


I'm not sure that first part makes sense, and I think you've confused who said what.

G. Dekaen said...

@Shaikorth

Apologies Shaikorth, I'm not sure if I understood what you were saying. I am not yet familiar with IBS vs IBD; what are they and what is the difference between them if you don't mind me asking?

What I think I understood from your post is that poorer quality genomes are more likely to be homozygous making them less likely to share larger/more segments with present-day populations. Is that correct?

In that case, calculating averages for the purpose of comparing pairs was pretty futile? Is that also correct?

If so, I'll just note a few interesting things for each sample:

BR2
1. Peaks at an average of 7-8cM in a spot roughly stretching from N. Ukraine through Belarus/Poland, into Lithuania and then stretching (although not including Latvia/Estonia) into NW Russia. There is a secondary local spike in Hungary as well.

2. Although the sample size is small (6), Ukraine - Belgorod shows noticeably less BR2 sharing than the other Ukrainian samples. I'm assuming this Belgorod is Bilhorod-Dnistrovskiy on the Black Sea indicating a different ancestry for S. Ukrainians vs. N. Ukrainians. Also, Belgorod is the only Ukrainian sample that shares more with Srubna than BR2, indicating higher Steppe ancestry?

3. BR2 was located in the Balkans, yet displays much lower affinity for Balkan populations compared to E. Slavs. The northern Balkan groups (Serbs, Slovenians, Croats, Bosnians) share around 4.5-6.5 cM while southern and eastern Balkan groups such as Romanians, Bulgarians, Macedonians share approx. 3-4cM with BR2. N. Balkans sharing is probably elevated by their higher Slavic ancestry post-500AD, making it likely that the entire Balkans was at one point much less related to BR2 than E. Europeans. So, either BR2 represents an expansion of people ultimately from E. Europe into Hungary and different from S. Balkan peoples OR BR2 was a typical Balkanite from 1100BC whose people were later displaced from the south (Dacians, Illyrians, Pannonians etc.) or west (???). I can't recall BR2's admixture proportions to determine which is more likely, only that he was Y-DNA J2a1. It may be interesting to point out that BR2 shared less with Kotias than Srubna (0.87 vs 1.93), could that mean that CHG didn't expand into/via the Balkans until after BR2?

G. Dekaen said...

4. BR2 shows a strong frequency among NW Finno-Ugric populations (Estonians, Moksha, Erzya, but not Finns) with 5-7.5cM shared (and Mari with 4.5cM). I doubt this is due to Slavic admixture since there is limited introgression of Y-DNA I2a1b L621 among these populations. It could be due to shared Baltic substrate or simply indicate that BR2 had a strong connection to E. Europe and was intrusive in the Balkans (related to the previous point).

Srubna

1. Peaks among E. and S. Ukrainians, 4-6cM, although with small sample sizes. This is not entirely surprising though given Srubna's location. Its most robust peak however is with English, Irish, Germans and Swedes where it averages 3.5-4cM (you can include Danes there as well, also 3.6cM), basically NW Europe. IIRC, the last paper on ancient genomes that I read showed that Srubna had western Neolithic farmer admixture; so, I wonder if that would explain the connection, perhaps a reflux of farmer genes from Bell Beaker/descended cultures like Unetice pushing eastwards contributing to Srubna. If so, this interaction probably happened across Poland/Baltics and into Ukraine since they also all share 3-3.5cM with Srubna in contrast to neighbours Hungary and Belarus (2.4 and 1.2). If so, this gene-swapping was probably mediated by Srubna acquiring western females since Y-DNA R1b and I are lacking within Srubna.

2. Outside Europe, Srubna peaks among Kshatriyas, Tajiks, Pathans, and Kalash, sharing 3-3.5cM. It shares the least with Munda speakers (Kol 0.7) while showing minimal difference between Hindu, Dravidian and Burushaski speakers (1.5cM), with the other two both having populations from 0.7-2.0+cM. In light of this, it's most likely that Srubna represents Iranic speakers rather than Indo-Iranians or simply that Srubna/related populations were progressively diluted going from S.C. Asia into India. Curiously, Kurds and Iranians share very little with Srubna, 0.7-0.8cM.

3. Within the Caucasus, Srubna shares the most with NE Caucasian speakers, Lezgins and Chechens 2-2.5cM followed by N. Ossets (1.8), NW speakers (1-1.5) and then S. Caucasians (Armenians, Assyrians, Georgians) <1cM.

Rob said...

@ Dekaen

Please keep going, but it's worth pointing out that Hungary isn't really "the Balkans".
It's more Central European

G. Dekaen said...

4. Srubna is obviously overwhelmingly descended from Corded Ware, sharing 12cM with it, which also explains Srubna's abundance of R1a as opposed to Yamnayan R1b. Also, going back to point 1, Srubna shares much more with Stuttgart than with Hungary NE1, which is the opposite of BR2. Could this be further evidence for my idea that Srubna's strong connection with NW Europeans is due to farmer influx from that area (i.e. Germany/C. Europe) rather than the Balkans?

Corded and Yamna

1. Corded peaks among Finns, French, Germans, Irish, Kargopol Russians, Mari and Swedes, 2.8-3.5cM. Not sure what the connection is. They all seem to be mostly on the periphery of historical Corded territory, potentially indicating some sort of displacement in the center, maybe from the Balkans? The S. Balkans does seem to share the least with Corded (unsurprisingly), 0.7-1.5. The core Corded territories still have elevated Corded sharing: Poland (2.2), Belarus (2.6), Baltics (1.8-2.4). This roughly coincides with the peaks in BR2, which might indicate that BR2, or a similar population with some Balkan admixture, pushed into formerly core Corded territory, pushing Corded ancestry to the periphery and diluting it in its core.

2. Corded also seems to have been a strong substrate to NW Finno-Ugric languages with most sharing 2.4-3.2cM (Finns, Estonians, Moksha, Mari) with the exception of Erzyans, 1.6.

3. There is no noticeable difference in Corded sharing among Indic or Iranic populations.

4. Yamna peaks among Germans, Erzyans, Irish, Hungarians, Kumyks, Norwegians, Poles and French, 1.2-2cM. Some lower sampled populations like Ukrainians and Latvians have 2-2.5cM shared. Given that other Balts (Estonians, Lithuanians in the geographic sense) have <0.5, the Latvian result is probably an anomaly. Further on this point, other NE European peoples like Russians, Finns, Moksha and Mari have <1cM, with Moksha/Mari less than 0.2cM; Erzyans are quite the curious anomaly. This leads me to believe something similar happened to Yamna ancestry as did to Corded, where it was pushed to the periphery. Yamna seems to have expanded into C. Europe straddling the Hungarian/Polish border, with some spillage into Belarus (1.0cM) and then continuing into C. Europe. At some later point, some expansion (probably from the Balkans) pushed into C./E. Europe and diluted Yamna in places like Poland (1.2) and Hungary (1.5), pushing it further NW. This is roughly what I proposed a year ago in February when the Yamna genomes were released and showed weird peaks in NW Europe IIRC.

5. Again, no noticeable difference between Indic or Iranic populations regarding Yamna sharing.

G. Dekaen said...

@Rob

I suppose you're technically right, Hungary is just north of what's traditionally considered the "Balkans." I called Hungary "Balkan" out of personal preference, I just think it looks neater on a map when you consider it as such :P.

Rob said...

Dekaen

No I see what you mean

The results make sense, and are what we'd expect given the regular flux seen in Europe, beyond the Copper age right through to early modern era

That is what complicates the analysis. The "peaks" we're seeing now are very likely related to a layered history, depending on which region you're looking at, with a faster rate of turnover in the immediate periphery of the core of Europe.

FrankN said...

@rk: Not sure whether you will still read this..
I was busy with a few other things, so I only just read your request for background on India-Africa relations. In fact, most of what I was going to say has already been said by others (Jared, thx for the great link!).
However, a few things seem worth adding:

1. There was intensive exchange between India and E. Africa by the end of the 2nd mill. BC. Aside from the crops mentioned in Jared's paper (beans, various types of millet), Africa by that time also exported the domesticated donkey. In return, we see rice cultivation commencing on the middle Niger by the mid 2nd mill. BC at the latest.

2. The link apparently reaches through to the Niger river and into Senegal (suggested center of pearl millet domestication). This provides further background on the often speculated relation between Wolof and Dravidian languages (more recently, the World Language Tree has added Nihali here). A number of obvious linguistic parallels relate to Wanderwords, and technology originating from the Caucasus:
- Bronze: Wol. xanjar, Tel. xancara
- Forge: Bamb. numu, Tel. inumu
- Blacksmith's caste: Wol. kamara, Tel. kamara
- Curved hoe: Wol. konko, Naiki konki
- Sheep: Wol. xar, Brahui xar
- Cow: Wol. nag, Peul. nagge, Tamil naaku

3. Looks like a migration through the Horn of Africa. This migration can be dated as post-Mota, i.e. during the second half of the 3rd m BC, which corresponds well with the time frame under (1), and is also credible as concerns the transfer of metallurgy terms.
https://en.wikipedia.org/wiki/Land_of_Punt

4. We even have a genetic signature of that migration, namely yDNA T. Here some selected frequencies from the respective WP page:
- Dire Dawa (SE Ethiopia) 82.4%
- Djibouti 75%
- Anteoni (Plateau Malagassy, upper caste) 50%
- Peul (Fulbe) 17.8%
- Kanuri (Cameroon) 4.8%
* *
- Kurru (A. Pradesh) 55.6%
- Bauris (W. Bengal) 52.6%
- Bajo Sea Nomads (Sulawesi) 7.4%
- S.Indians avg. 5.9%
- Gujarati 3.4%
* *
- Tajiks (Logar, SW Iran) 50%
- Armenians (Sasun) 20.2%
- Assyrians 10.3%
- Kurds (Sorani) 8.5%
- Oman 8.3%
- Leb. Muslims 4.9%
- Alans 2.9%
- Pashtuns 0.7%
* *
- Sciacca, Sicily 30%
- Chios 25%
- German Stilfser, Tyrol 23.5%
- Abruzzesi 20%
- Albanians 14.5%
- Cadiz 10.7%
- Corsicans 8.1%
- Dutch 5.6%
- Gotland 5%
- Alsatians 5%
- Berlin 3.9%
- Sardinians 2.3%
- Basques 1.8%
- Lviv 1%

Quite a strange beast! The pattern goes against anything one would expect. Still, this probably falls under CHG.

5. And now it gets really interesting: yDNA T has been found in a LBK sample from Karsdorf, Germany (I0795). Dave's K8 has it as 93% AnatNeol, and 7% WHG, which looks somehow wrong.
From the yDNA T WP article:
"In (a) study by Mathieson et al (..) Haplogroup T1a-M70, previously found in LBK sites from Germany, was not present in Barcin nor Mentese Neolithic settlements. This fact together with the absence of the mtDNA lineages carried by both of the T1a individuals from Karsdorf (..) could mean that the Early European Neolithic T1a-M70 had a different migration pattern and, therefore, a different geographical origin."
In fact, this seems to me a further indication, aside from millet occurring in some Elbe-Saale and Polish LBK sites, for a second Neolithic migration stream that originated on the Black Sea and went north-east of the Carpathians into Central Europe.
Dave, you might want to pay special attention to I0795 in your next IBD run.

Don't know whether this helps to explain some of the weird d-stats. I myself now at least understand how the post-Mota African admixture comes to be associated with LBK.

FrankN said...

A few additional T frequencies that also seem worth mentioning:

- Movima (NE Bolivia) 20%
- Qulla (Tucuman, Arg.) 6.9%
- Kuna (Panama) 6.3%

As reminder:
- Bajo Sea Nomads (Sulawesi) 7.4%

I believe I had already linked somewhere here the coconut DNA study evidencing human-mediated transfer from the Southern Philippines (or Sulawesi?) to the Colombian-Panamanian coast around 300 BCE. Aside from T, another yDNA hg shared by SEA and a few tribes in Colombia and Ecuador has also been demonstrated (would need to look up details).

Even without these migrations, Karitiana are not a particularly good outgroup when looking at S. Indians. Both Raghavan 2015, and Skoglund/Reich 2015, in their studies on the Peopling of the Americas, demonstrated gene flow into Karitiana and Surui from a yet unidentified source, genetically somewhere in-between Onge and Papuans (geographically, that in-between could be around Sulawesi/Southern Philippines). Moreover, Karitiana have been shown to possess Denisova admixture.

Chris Davies said...

@ FrankN - "Looks like a migration through the Horn of Africa. This migration can be dated as post-Mota, i.e. during the second half of the 3rd m BC, which corresponds well with the time frame under (1), and is also credible as concerns the transfer of metallurgy terms.
https://en.wikipedia.org/wiki/Land_of_Punt
We even have a genetic signature of that migration, namely yDNA T."

I can definitely see a genetic connection between West Africa and Persian Gulf/Baluchistan/Pakistan/India with HLA haplotypes.

But the age of haplogroup T in Oman is quite recent, supposedly dated to only 1,600ybp, compared with 13,700ybp in Egypt.

FrankN said...

@Chris Davies: yDNA T 13,700yBP in Egypt? That sounds interesting. Do you have a source on it?

Chris Davies said...

@ FrankN "@Chris Davies: yDNA T 13,700yBP in Egypt? That sounds interesting. Do you have a source on it?"

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182266/table/TB3/

The table uses the old nomenclature for haplogroup 'T' [K2-M70] because it is from 2004. The median age estimates show 13.7k yrs for Egypt and only 1.6k years for Oman.

The table is from this paper:-

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182266/

["The Levant versus the Horn of Africa: Evidence for Bidirectional Corridors of Human Migrations"]

Alexandros said...

@Chris Davies

Although an interesting theory, the study you suggest provides insufficient evidence. The title of the table says it all 'Y Chromosome Haplogroup Variance and Expansion Times Based on 10 STR Loci'. Calculating expansion times from just 10 STRs is highly inaccurate and has proved to be wrong several times, given the recent ancient Y-DNA data. There are several examples of wrongly determined STR-based expansion and coalescent times in the pre-ancient DNA literature.
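To illustrate how fragile such dates are, here is a minimal sketch of a variance-based age estimate. It assumes a simple stepwise mutation model in which repeat-count variance at each locus grows by roughly the mutation rate per generation; the haplotypes, the two mutation rates and the 25-year generation time are illustrative assumptions, and this is not the exact procedure used in the paper's table.

```python
from statistics import mean, pvariance

def variance_age_years(haplotypes, mutation_rate, generation_years):
    """Estimate an expansion time from Y-STR repeat counts.

    Under a simple stepwise mutation model the variance of repeat counts
    at a locus grows by about `mutation_rate` per generation, so
    age ~= mean per-locus variance / mutation_rate generations.
    """
    loci = list(zip(*haplotypes))  # transpose: one tuple of repeat counts per locus
    mean_var = mean(pvariance(locus) for locus in loci)
    return mean_var / mutation_rate * generation_years

# Hypothetical repeat counts for 4 men at 3 loci (not real data).
haplotypes = [
    (13, 24, 10),
    (14, 24, 11),
    (13, 25, 10),
    (14, 23, 11),
]

# Two illustrative per-locus, per-generation rates that differ about threefold.
for rate in (0.002, 0.0007):
    years = variance_age_years(haplotypes, rate, 25)
    print(f"rate {rate}: about {years:.0f} years")
```

The same haplotypes give ages almost three times apart depending only on the assumed rate, before even considering the small number of loci and samples.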

capra internetensis said...

Looking at the fully sequenced T samples on Y Full's tree, there are only 2 Omanis (compared to 10 in the old study cited above), and their TMRCA is 2300 years. Adding in Qataris and Emiratis to make 10 samples the TMRCA increases to 8700 years (all still belong to T1a1a1).

Bahrainis (n=5) have both T1a1 and T1a2, while Saudis (n=39) and Kuwaitis (n=16) have all three branches of T1a, so all have a TMRCA of around 16 thousand years.

For Africa, there are only 2 Sudanese (TMRCA 600 years) and 2 Egyptian samples - the latter fall into both T1a1 and T1a2, like the Bahrainis.

The data is not really adequate but probably the diversity of T in Egypt is high, which is not unexpected. However, this isn't too relevant to the Indian Ocean. We know very little about African T and almost nothing about Indian T; we need more information before we can test the possible connection there.

FrankN said...

@capra:

We know from the Mota study about a Eurasian introgression into Africa shortly after 2,500 BC, which extended as far as West Africa. So far, only two non-African yDNA hgs have been found in the Ethiopian highlands (Oromo), i.e. close to where Mota was buried: T, and J1(xP58). Of those two, only T has so far been identified in reasonable concentrations in West Africa.
J1(xP58) appears to even have been less well studied than T. In particular, it is still unclear how much of the J1 that has been found in India relates to it. In any case, J1(xP58) seems to be of (East) Caucasian origin, with frequencies peaking in Kubachis (99%), Dargins (69%), Avars (58%) and Lezghins (44%), and a sizeable 18% in Assyrians from modern Iraq. It makes up for 8.3% in Ethiopian Amhara, and 2/3 (3.3 of 4.9%) of the J1 found with Sudanese speakers of Nilo-Saharan languages.
Among Arabs, where P58 dominates, it is relatively rare. According to the WP page, J1(xP58) hasn't been found in the UAE so far. It is 0.8% (vs. 37.2% P58) in Oman, 1.4% (vs. 19.7% P58) in Egypt, and 4.8% (vs. 67.7% P58) in Yemen.

An entry of J1(xP58), and possibly T, from Egypt into the Horn of Africa can be safely excluded. We have Egyptian written sources from the time in question that clearly display inhabitants of the area (Land of Punt) as foreign, of a different culture and possibly also language.
OTOH, there is multiple evidence of IVC contact with the Horn of Africa and the Ethiopian highlands during that period. The extent to which the Arab peninsula was involved is unclear. yDNA, but also cultural indicators as e.g. late knowledge of bananas on the Omani coast, speak for a rather limited role at best.

The most parsimonious way to combine genetic and archeological patterns seems to me a combination of Elamites and (proto-)Malay Sea Nomads in linking the Horn of Africa to the Indian peninsula. Note that Elamite and Malay have quite a number of possibly shared terms:

- Elam. si-in-nu "to come, arrive", Mal. sini "here"?
- Elam. mak-, Mal. makan "eat"
- Elam. mi-ik-ki "flower", Mal. mekar "to blossom"
- Elam. bu-ur, Mal. buah "fruit"
- Elam. pu(n)-, Mal. penuh "full"
- Elam. ba-ha, Mal. baik "good"
- Elam. hu-la-ap-na/ hu-ra-ap-na, Mal. hijau "green"?
- Elam. kur-pi "hands", Mal. ceker "foot, claw"
- Elam. uk-ku "head; because, according to", Mal. ekor "head, suite, aftermath, tail"
- Elam. ga-az, Mal. gasak "to hit"
- Elam. an-ka, Mal. alangkah "how"
- Elam. *hahp "to listen, hear", Mal. harapkan "to abide, await"?
- Elam. ab-ba-ra, Mal. berat "heavy"
- Elam. tur-, Mal. tahu "to know"
- Elam. ti-pi "neck", Mal. tepi "shoulder, side, fringe"
- Elam. si-um-me "his nose", Mal. men-cium(i) "to smell, sniff, scent, detect, kiss"
- Elam. ki "one", Mal. eka "one, single"
- Elam. zik-ki, Mal. ceking "thin"
- Elam. hu-sa "tree, forest", Mal. hutan "forest, wood"
- Elam. mar "two", Mal. mirip "after"?
- Elam. iz-za "walk", Mal. jejak "trail, trace, track, imprint"?
- Elam. zu-ul "water", Mal. cair "liquid" (adj.), curahan "flow, flood"
- Elam. ni-ka, Mal. kami "we"?
- Elam. ap-pa, Mal. apa "what"

A high number of Elamite parallels (15-16) has also been identified with Afro-Asiatic, and here especially Omotic, but note also possible links to Caucasian and IE.
http://starling.rinet.ru/Texts/elam.pdf

Simon_W said...

Interesting that the insular Celts share more with the Corded Ware than with the Yamnaya from Samara. But still more with BR2. So it seems the R1b folks didn't reach the Isles without Corded admixture, as I had suspected before.

The special relationship of BR2 to the Balto-Slavs, on the other hand, I didn't expect. Amazing, they share more with him than with the Corded Ware. I guess it must be related to post-Corded Ware Bronze Age movements.