Saturday, May 28, 2016

Indian genetic history in three simple graphs

Enjoy, but please, those of you still sore about the passing of the Out of India, Out of Armenia, Out of Your Hat, or indeed, Out of Your Ass Indo-European hypotheses, try not to fill up the comments with the usual inane drivel. Thanks in advance for your cooperation.


Atriðr said...

Lots to look forward to in these next months.

Seinundzeit said...


If possible (when you find the time, and inclination), could you try that Samara_Eneolithic topology, but with the Kalash? Thanks in advance.

Davidski said...

Can't do much with the Kalash in these trees.

Seinundzeit said...



I was wondering, have you tried to construe modern populations using "basal/unadmixed" ancient samples, with TreeMix?

If you think it's worthwhile, I was hoping you could eventually try a "basic"/scaffold tree with these unadmixed populations:

Mbuti, Biaka, Yoruba


Dai, Han

Kostenki14, Villabruna, GoyetQ116-1

MA1, AG3 (either of them, or both)

If these populations yield a topology that makes sense, you could then add a single modern (or ancient) population of interest, and test the possibilities. None of these populations are mixed with each other (in theory), so they should provide a nice framework for something of that nature.

But before adding any other ancient/modern population, a simple tree with these samples would be very interesting/informative.

As always, only if you find the time (and inclination) to do this. Thanks in advance.
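The two-step procedure described above (scaffold tree first, then one test population with migration edges) could be scripted roughly as follows. This is only a sketch under stated assumptions: it assumes TreeMix is installed as `treemix`, the input file names are hypothetical, rooting on Mbuti and the block size of 500 SNPs are illustrative choices, and the code only builds the command lines rather than running them.

```python
import shlex

# Scaffold populations listed in the comment above.
SCAFFOLD = [
    "Mbuti", "Biaka", "Yoruba",
    "Dai", "Han",
    "Kostenki14", "Villabruna", "GoyetQ116-1",
    "MA1", "AG3",
]


def treemix_command(counts_file, out_stem, root="Mbuti",
                    migration_edges=0, block_size=500):
    """Build a TreeMix command line as a list of arguments.

    counts_file and out_stem are hypothetical file names; the root
    population and block size are assumptions, not the blog's settings.
    """
    cmd = ["treemix", "-i", counts_file, "-o", out_stem,
           "-root", root, "-k", str(block_size)]
    if migration_edges > 0:
        # -m sets the number of migration edges TreeMix may add.
        cmd += ["-m", str(migration_edges)]
    return cmd


def plan_runs(test_pop, max_edges=1):
    """Scaffold-only tree first, then scaffold plus one test population."""
    runs = [treemix_command("scaffold.frq.gz", "scaffold_m0")]
    for m in range(max_edges + 1):
        runs.append(treemix_command(
            f"scaffold_plus_{test_pop}.frq.gz",
            f"{test_pop}_m{m}",
            migration_edges=m,
        ))
    return runs


if __name__ == "__main__":
    for cmd in plan_runs("Kalash", max_edges=2):
        print(shlex.join(cmd))
```

Keeping the scaffold run free of migration edges mirrors the request for "a simple tree" before any admixed population is tested.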

Shaikorth said...

Including MA-1 and Dai/Han might already mess with the topology as we saw here:

Maybe they can be replaced with Papuan and Denisovan (to bring Papuan ENA into an optimal place in the tree).

Seinundzeit said...

In theory, it probably won't, since none of these scaffold populations have "Basal Eurasian" admixture, while MA1's neighbor on that tree does.

Also, in that tree MA1 gets 40% admixture from a South Asian population (intermediate between caste Indians and Munda), not from an East Asian population.

But it would be interesting to see both a tree with Dai/Han for ENA, and a separate tree with Papuan and Denisovan, although the latter will introduce admixture events into the topology. What we really want is a simple tree with no admixing populations, which we can then use as a framework to test admixed ancient/modern populations for basic ancient genetic components.

Balaji said...


Thank you so much. I do find these results pleasing and your “wild stab in the dark” did reach its mark. But let me restrain myself and say that these TreeMix graphs do not yet resolve the question of the Indo-European Urheimat one way or the other. We have to wait for a little while longer for more evidence.

Davidski said...


Samara Eneolithic is certainly a mixture of Eastern European Hunter-Gatherers (EHG) and Caucasus Hunter-Gatherers (CHG).

So if you're looking for OIT based on these results, you have to push back the genesis of Indo-European in South Asia to the late Paleolithic at least. And also hope that EHG and/or CHG are from South Asia.

Whatever it takes, I'm guessing, but you won't find much support for your theories here, or anywhere else not populated by hardcore OIT fans.


This topology works OK for some pops, but not for others. Like other topologies, its results depend on the genetic structure and demographic history of the test pop.

For instance, it works great for Karitiana Indians, because they're a two-way mix of highly differentiated components. It also works great for the high caste North Indians, probably because this sample is a mixture of different high caste Indian groups. But it doesn't work for Pathans or Kalash, probably because the samples I have are from the same regions and maybe too homogeneous.

Seinundzeit said...



Fascinating stuff.

I completely agree with you.

For whatever it's worth, as a simple tree, the relationships are just as we expected (and in line with that recent Q Fu et al. paper), so it seems this basic topology is a success with TreeMix (we would see the same sort of results with qpGraph).

And the Karitiana/North Indian results do look quite good.

I wonder, does the high caste North Indian sample get any further migration edges, ones that make sense?

Also, if you find the time, could you try Anatolia_Neolithic, to see how it fares in this setup? It would be interesting to see if a "Basal Eurasian" edge is replicated, or if something unexpected happens. Considering that the basic scaffold works well/looks accurate, the replication of a "Basal Eurasian" migration edge into Neolithic Anatolians/EEF (or the lack of such an edge) would mean a lot for the concept of "Basal Eurasian".

Shaikorth said...

Sein, that MA-1 edge isn't South Asian per se. It comes from the root of the Munda-Dai-Han branch, which implies that all of them have some extra affinity to MA-1 and have complex ancestry. It'd be interesting to see if the non-Denisovan part of Papuans behaves similarly or is actually "pure ENA".

Seinundzeit said...


You know, it would be interesting to see if Papuan/Australian populations also possess this ANE affinity, minus the Denisovan ancestry.

If my memory serves me right, I think there was a paper that construed Papuan/Australian populations as a mix of ENA (Onge-like) and an unidentified ghost population which was very rich in Denisovan ancestry.

But perhaps that might be too much complexity for TreeMix.

Shaikorth said...

That's a sensible theory. On this Asian (West, South, East) PCA without Denisovans, Papuans cluster exactly like Andamanese on the first two dimensions.

This is from Basu et al., which made the questionable choice of labeling modern populations after components: ANI, ASI, AAA and ATB are just modern Indian populations (North, South, Austroasiatic, Tibeto-Burman).

Davidski said...

Yeah, that topology does pick up a 30% Basal Eurasian edge into Anatolia Neolithic. But I reckon inflated archaic admix in Villabruna is making it lower than it should be.

I also tried it with CHG, but it didn't work.

Gill said...

Some of the Haryana Jatts have ASI/South Indian numbers that are just low enough to indicate foreign ancestry. I'm going to refer to the HarappaWorld calculator because most have used it.

The average among Punjabi Jatts has been a near consistent ~30% from across the entire area. It starts to get lower than 27% in Pakistanis who have Baloch/Sindhi/Pathan/Pashtun ancestry, and presumably the Punjabi Jatts approaching 27% are from closer to Western Punjab.

As you go east or north (into Kashmir), the S-Indian/ASI goes up. Sometimes Oceanian goes up as you go north and ASI doesn't, but they are both part of the "South Indian" component in all calculators, so you see it go up to 30-40% into Uttar Pradesh among Brahmins and other upper castes.

But Haryana, which is in between Punjab and UP, has Jatts with sometimes as low as 25-26% S-Indian. The HarappaWorld average for 5 Haryana Jatt samples is 26.56% S-Indian.

So if they're a mix, what are they a mix of? If we assume it's modern day North Indian Brahmins, then it would be something with high Baloch/Gedrosian and Northeastern European. Something like 14% S-Indian, 42% Baloch, 14% Caucasian, 23% NE-Euro, 2-3% Amerindian, 3% Mediterranean.

But if it's modern day South Indians or lower castes from the North who have around 50+% S-Indian, ~30% Baloch, and next to no NE-Euro then Haryana Jatts would be a mix of them and a population that is 0% S-Indian, 46% Baloch, 15% Caucasian, 33% NE-Euro, a bunch of Siberian/Amerindian, and 3-5% Mediterranean.
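The arithmetic behind this kind of two-source reasoning can be sketched as a simple linear mixture. This is an illustration only: it assumes calculator percentages combine linearly, treats "50+%" as exactly 50%, and the function names are hypothetical, not part of HarappaWorld.

```python
def mixing_fraction(target, source_a, source_b):
    """Fraction of source A needed so that a linear mix of A and B
    reproduces the target percentage of one component."""
    if source_a == source_b:
        raise ValueError("sources must differ in this component")
    return (target - source_b) / (source_a - source_b)


def mixed_value(fraction_a, value_a, value_b):
    """Percentage of any other component expected in the mix."""
    return fraction_a * value_a + (1.0 - fraction_a) * value_b


if __name__ == "__main__":
    # S-Indian: Haryana Jatt average 26.56%, source A 50% (assumed
    # lower bound of "50+%"), source B 0%.
    alpha = mixing_fraction(26.56, 50.0, 0.0)
    print(f"share of the high S-Indian source: {alpha:.2f}")

    # Baloch implied by the same mix: ~30% in source A, 46% in source B.
    print(f"implied Baloch: {mixed_value(alpha, 30.0, 46.0):.1f}%")
```

Under these assumptions the second scenario is roughly a half-and-half mix, and the same fraction can be reused to check whether the other components come out at plausible levels.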

So, basically... either Bronze Age South Central Asia must have been flooded with high-ANE Baloch/Gedrosian as a real component or it was basically high-ANE Bronze Age Central Asian Steppe populations mixed with CHG. But the high ANE might be an artifact of something else, because explaining it in South Asia is difficult. Unless there was a population from like the eastern fringes of the Andronovo horizon with Amerindian levels of ANE/MA1.

Gill said...

^ My point being, whatever Steppe population contributed that extra European WHG-like stuff to Haryana Jatts, it came via a high-ANE population like CHG or something with even more ANE. So if David's theory about a Corded Ware-like population going from North-Central Europe to the Steppe is correct (to explain the European levels of WHG in Bronze Age Central Asia), there was an additional admixture step with a very high ANE population in perhaps South Central Asia (what keeps registering as Gedrosian) before hitting North India.

It depends on whether any of those Steppe samples recovered so far feature really high ANE in addition to WHG (basically, Karelia/EHG-like). In which case the population movement into Central Asia could have been from the Eastern Europe/Volga-Ural region, not North-Central.

Gill said...

In any case, find ancient R1a-L657 and you find the source.

Jaydeepsinh Rathod said...


Have you read this recent paper on the origins of the Harappan Civilization?

It is now becoming increasingly clear that the pre-Harappan & Early Harappan levels in the state of Haryana are very old. Haryana now also has the largest Harappan site discovered so far, in Rakhigarhi. There is a tentative possibility that the origin of the Harappan civilization itself might be from around the region of Haryana & North Rajasthan.

I am inclined to believe that the Jatts of Haryana are the direct descendants of those Early Harappans. We now have evidence of Pre-Harappan & Harappan phases in succession at Bhiranna in Haryana from 9500 BP to around 3500 BP without any break in cultural continuity or evolution. A large agriculture-based population should have existed for a long period in this region. This is further exemplified by the largest Harappan site of Rakhigarhi. It therefore looks very unlikely that a steppe population of nomads could have made any significant dent in the population of these Harappans, the early inhabitants of Haryana.

It is therefore curious that it is in Haryana that you also have the Jatts with the least amount of ASI. Since the intrusion of the steppe population, if it ever happened, could not have significantly altered the demographics of the Harappans in Haryana, does this not suggest that the Harappans themselves might have been high in ANI & low in ASI? The only way to get around this is if there was a catastrophic population reduction among the Harappans. We have no evidence for this, and hence the most logical & simplest explanation seems to be that the Haryana Jatts are the direct descendants of the Harappans that lived in Haryana since 9500 BP.

If there is any affinity of these Jatts with Eastern Europeans, a movement of the steppe people into South Asia is not the only explanation. A movement from the South into the steppe is also a very real possibility.

Jaydeepsinh Rathod said...


"Samara Eneolithic is certainly a mixture of Eastern European Hunter-Gatherers (EHG) and Caucasus Hunter-Gatherers (CHG).

So if you're looking for OIT based on these results, you have to push back the genesis of Indo-European in South Asia to the late Paleolithic at least. And also hope that EHG and/or CHG are from South Asia."

EHG is certainly not from South Asia. But CHG can be. The CHG in the Caucasus is likely the ancestor of the modern Caucasus component and a close cousin of the Gedrosian component. As I have pointed out earlier, Metspalu et al 2011 have already argued that the Caucasus & the Gedrosian components had already separated by 12500 BP. So the Gedrosian/CHG in South Asia is older than 12500 BP and is not a result of any movement from the Caucasus during the Holocene.

The CHG in Samara Eneolithic could just be a proxy for its close cousin the Gedrosian that could have arrived there from SC Asia where it resides in greatest concentration.

Gill said...

There's not enough European admixture in Jatts (~20-23%) or South Asian admixture in Europeans (next to no ASI/South Indian, but a little Gedrosian) for an out of India theory for WHG/EHG. Those components are definitely from Northeastern Europe and the Volga-Ural area.

The IBS, admixture, etc. all point to the WHG/EHG-like admixture in Jatts being related to that found in modern day Europeans. And since R1a-Z93 likely emerged near the Urals, chances are extremely high it's from the Steppe (and recovered ancient DNA from the Steppe, not far from India, shows populations which are extremely similar to Europeans... and aside from the Roma, there are no Indian populations near Europe).

So there could have been an ancient CHG movement from South Central Asia to Europe, via the Steppe, but once the modern European admixture combo came about (the ANE/CHG mixed with WHG), it came back to the Steppe. But South Central Asia (Indus/Balochistan) is not the same as India proper and Haryana Jatts have lower levels of Gedrosian than surrounding populations.

Speaking of which, ancient DNA from Iran!

Seinundzeit said...


Amazing, thanks!

For whatever it's worth, perhaps 30% is more reasonable, as Villabruna is much more distant from East Asians compared to Loschbour, and the 40% estimate was produced for EEF when we only had Loschbour/La-Brana.

Also, I guess this shows that UHG is genetically continuous with the Villabruna cluster (which Matt's experimentation also confirmed).

Out of curiosity, how do Satsurblia (or Kotias) behave in this topology?

Thanks in advance.

Davidski said...

Kotias and Satsurblia don't receive any admixture in this graph even with up to five migration edges.

Seinundzeit said...

That's interesting.

This probably brings us back to your point about the demographic history of the test population in question.

Davidski said...

@Jaydeepsinh Rathod

"As I have pointed out earlier, Metspalu et al 2011 have already argued that the Caucasus & the Gedrosian components had already separated by 12500 BP. So the Gedrosian/CHG in South Asia is older than 12500 BP and is not a result of any movement from the Caucasus during the Holocene."

The Metspalu paper is outdated and irrelevant.

There's ample evidence that South Asia was influenced by the Near East during the Neolithic and Bronze Age, including throughout the Harappan period.

"The CHG in Samara Eneolithic could just be a proxy for its close cousin the Gedrosian that could have arrived there from SC Asia where it resides in greatest concentration."

Samara Eneolithic and Yamnaya are mixtures of EHG and CHG. There's nothing South Asian about them.

postneo said...

Looks like this is still ongoing.

Micky dodd "why is it so hard etc..."

It's not actually hard at all. There is a huge chunk who don't care, including many people in my family who like the idea of being "foreign". It's just that in 1.2 billion people a small minority will be naysayers, and perhaps even such a puny lot has a disproportionate effect. I, for example, am one of these. Perhaps it's hard for you to tolerate such things.

The question to be asked is why Europeans have such hang-ups about being migrants from Asia. Why such hand-wringing and nervous laughter? Why the hurry to jump to conclusions without sampling?

Bharatiya, you seem to be overly wrapped up in the politics of this. Let's look at the data without wading into Marxism etc.

The early Z93 from Europe does make a good case for migration from Eastern Europe, I agree.
The aDNA case for a 1500 BC population movement from Poland to South Asia, on the other hand, is very weak. It's just a wishful connection of dots.

As per David's own response, there's no evidence of paternal flow from Eastern Europe to Central Asia. Not because it did not happen, but simply because nobody has even bothered to sample and establish it empirically yet.

I see a lot of crap about Srubnaya being Vedic. Lots of cultures on the globe can be Vedic by such loose standards.

Let's wait and see.

Davidski said...


There is plenty of evidence of gene flow from Eastern Europe to Central Asia during the Bronze Age, and not just paternal gene flow. All sorts of gene flow, that brought steppe admixture, Z93 and the lactase persistence allele to South Asia.

And nobody ever said Srubnaya was Vedic. Srubnaya is supposed to be proto-Iranian. It's Potapovka and Sintashta that show Vedic-related rituals in their Kurgans.

You have a habit of twisting the facts and the arguments of others that are uncomfortable for you. The reason you do this is because otherwise you wouldn't have an argument of your own.

