
Thursday, November 26, 2015

The Khvalynsk men


This is where the three Samara Eneolithic or Khvalynsk samples from the recent Mathieson et al. paper plot on my Principal Component Analysis (PCA) of ancient West Eurasia. They're labeled as Steppe_CA (steppe Copper Age). I've also marked them with their Y-chromosome haplogroups.


Individual 10433, belonging to Y-chromosome haplogroup R1a, is almost a pure Eastern European Hunter-Gatherer, which is perhaps surprising, considering he was buried with copper artifacts. On the other hand, sample 10434, the one belonging to haplogroup Q1a, and positioned further east than the other two, appears to have been whacked over the head a few times and simply thrown into a ditch.

The PCA also has most of the other samples featured in Mathieson et al., including Neolithic Anatolians (labeled Anatolia_N), as well as extra samples from Allentoft et al. and Jones et al.

See also...

The Khvalynsk men #2

160 comments:

Nirjhar007 said...

the one belonging to haplogroup Q1a, and positioned further east than the other two, appears to have been whacked on the head a few times and simply thrown in a ditch.
I see.

Dmytro said...

I am willing to bet that when their yDNA is in, a majority of Dnipro-Donetsk men will be R1a, and so will much if not most of Serednj Stih (Sredny Stog), as well as much of the Dnipro-Donetsk admixed North Trypilian Chapajevka people (early inhumation phase). Interesting times ahead...

Bernard said...

@Davidsky
"Individual 10433, belonging to Y-chromosome haplogroup R1a, is almost a pure Eastern hunter-gatherer, which is somewhat surprising, considering he was buried with copper artifacts."
The individual buried with copper artifacts is 10122 of R1b Y haplogroup.
See pages 9 and 10 of Supplementary Information:
"10122 / SVP35 (grave 12)
Male (confirmed genetically), age 20-30, positioned on his back with raised knees, with 293 copper artifacts, mostly beads, amounting to 80% of the copper objects in the combined cemeteries of Khvalynsk I and II. Probably a high-status individual, his Y-chromosome
haplotype, R1b1, also characterized the high-status individuals buried under kurgans in later Yamnaya graves in this region, so he could be regarded as a founder of an elite group of patrilineally related families. His MtDNA haplotype H2a1 is unique in the Samara series."

Davidski said...

This is what it says on page 10.

10433 / SVP46 (grave 1)
Male (confirmed genetically), age 30-35, positioned on his back with raised knees, with a copper ring and a copper bead. His R1a1 haplotype shows that this haplotype was present in the region, although it is not represented later in high-status Yamnaya graves. His U5a1i MtDNA haplotype is part of a U5a1 group well documented in the Samara series.


These qualify as copper artifacts.

Bernard said...

Maybe he got them by trade.

Davidski said...

Bernard,

Are you autistic or something?

Bernard said...

LOL

Rob said...

Davo, is it possible to see where that ancient African genome from a couple of weeks ago would plot?

Davidski said...

I can't put a Sub-Saharan African on a West Eurasian plot. You'd just see a tight ball of West Eurasians and a lone African.

Here's a global plot with Mota and Kotias.

https://drive.google.com/file/d/0B9o3EYTdM8lQMDZnNUk1OXZIcnc/view?usp=sharing

Put their names into the search field to find them.

Rob said...

Thanks

Nirjhar007 said...

I can't read it.

Gökhan said...

David, do you have any plan to create a new calculator using all of those new Anatolian, CHG and Greek samples?

Matt said...

@ Davidski, off topic, but, would it still be possible to run these lists of D (Chimp,Test)(Mbuti,Pop) stats at some point?:

Ust Ishim - http://txt.do/5mw14
Dai - http://txt.do/5mw0c
Yoruba - http://txt.do/5mw0p

Rob said...

@ Gökhan

I second your question

I suspect adding the new CHG to the mix might alter things (possibly significantly) by cannibalizing some of the other components, possibly including the EEF fraction.
It'll be very interesting to see.

Roy King said...

@Davidski
"Here's a global plot with Mota and Kotias.

https://drive.google.com/file/d/0B9o3EYTdM8lQMDZnNUk1OXZIcnc/view?usp=sharing"
Please do a global plot of PC2 vs PC3 and PC2 vs PC4 with Mota and Kotias.
Thanks! These are very helpful.

Open Genomes said...

David, you've done Human Origins World 1&2 PCAs for various ancient samples. Can you add NE1 and KO2 (as a proxy for the Neolithic Anatolians) in the same plot as Mota and Kotias? If you can accommodate K14 and MA-1, and the WHGs, SHGs, and EHGs, add those too.

As you can see from the World 1&2 PCA, Stuttgart is *not* where the Bedouin B are at all. Stuttgart is "above the peak of the apex of the triangle", higher up than the Sardinians. La Brana-1 is not far from Kotias, near the North_Ossetians and the other WHGs are in the same vicinity. This is the true picture of Eurasian variation. We have a continuum that goes from the "EF" (or rather, Kebaran Hunter-Gatherers at the end of the LGM) to the Ulchi at the other end. Kotias is already "on the way" into Eurasia, which is why the CHGs cluster with Kostenki and MA-1 on the TreeMix graphs apparently even from 41,500 years onward, when the climate turned colder and drier, the populations became isolated, and the drift began.

The basic point here is that you cannot show the real picture of Eurasian variation without including Africans, particularly Mota and the Hadza. Are PC 1&2 83% of the variation? Other dimensions, PC 1&3, 2&3, etc. would be valuable too.

Shaikorth said...

@Open Genomes, the "triangle"-shaped PCA with Africans actually doesn't show much about the real variation of Eurasians as Africans will squeeze some of the Eurasian variation out and the Sardinia vs. East Asia polarity is largely due to drift or sample sizes. This obscures certain things easily verifiable with formal testing, like Papuans actually being the most divergent Eurasians.

If you want Africans and Eurasians on the same plot, I think SpaceMix from Coop et al. provides a decent one, like:
http://oi64.tinypic.com/jt13k4.jpg

Davidski said...

Placing ancient samples in the less significant PCA dimensions is difficult. But here's a PCA datasheet with data for lots of ancient samples and nine dimensions.

https://drive.google.com/file/d/0B9o3EYTdM8lQVGVPLXg0WjVfQjQ/view?usp=sharing

It can be plotted with any plotting software like Gnuplot, Past3 or Excel. The Gnuplot arguments for a 1&2 dimensional PCA are...

plot 'Global9.txt' using 3:4:1 with labels

For a 1&3 PCA they are...

plot 'Global9.txt' using 3:5:1 with labels

For a 1&9 PCA they are....

plot 'Global9.txt' using 3:10:1 with labels

For a 3D PCA they are...

splot 'Global9.txt' using 3:4:5:1 with labels

You can zoom in with the zoom tool, and you can print images like this...

set term png size 2000, 1000

...hit enter...

set output "MDS.png"

...hit enter...

plot 'Global9.txt' using 3:4:1 with labels
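For readers without Gnuplot, the same column selection can be sketched in Python. This is only an illustration. It assumes, as the `using 3:4:1` argument implies, that column 1 holds the sample label and that columns 3 onward hold PC1, PC2 and so on. The whitespace-separated layout, the function name and the two demo rows are made up for the example and are not taken from the datasheet.

```python
def select_columns(lines, x_col=3, y_col=4, label_col=1):
    """Mimic gnuplot's `using x:y:label` with 1-based column numbers."""
    points = []
    for line in lines:
        fields = line.split()
        if len(fields) < max(x_col, y_col, label_col):
            continue  # skip blank or short lines
        try:
            x = float(fields[x_col - 1])
            y = float(fields[y_col - 1])
        except ValueError:
            continue  # skip a header row or non-numeric values
        points.append((fields[label_col - 1], x, y))
    return points


# Hypothetical rows in the assumed layout: label, population, PC1, PC2, PC3
demo_rows = [
    "Sample_A Pop_X 0.012 -0.034 0.005",
    "Sample_B Pop_Y -0.020 0.041 -0.013",
]
pc1_vs_pc2 = select_columns(demo_rows)                    # like `using 3:4:1`
pc1_vs_pc3 = select_columns(demo_rows, x_col=3, y_col=5)  # like `using 3:5:1`
```

The resulting (label, x, y) tuples can be passed to whatever plotting library is at hand.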

Davidski said...

Someone else just used population averages from the Global9 datasheet to run a West Eurasia only PCA. Pretty cool.

https://drive.google.com/file/d/0B9o3EYTdM8lQVE1yZWIzdDdNM2s/view?usp=sharing
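How those population averages were computed isn't described here. One plausible reading is a plain per-dimension mean over the samples of each population, which could be sketched as follows. The function name and the demo values are hypothetical.

```python
from collections import defaultdict


def population_averages(rows):
    """rows: iterable of (population, [pc1, pc2, ...]) -> {population: mean vector}."""
    sums = {}
    counts = defaultdict(int)
    for pop, coords in rows:
        if pop not in sums:
            sums[pop] = [0.0] * len(coords)
        for i, value in enumerate(coords):
            sums[pop][i] += value
        counts[pop] += 1
    return {pop: [s / counts[pop] for s in total] for pop, total in sums.items()}


# Hypothetical samples: two from Pop_X, one from Pop_Y
demo = [
    ("Pop_X", [0.010, -0.030]),
    ("Pop_X", [0.014, -0.034]),
    ("Pop_Y", [-0.020, 0.040]),
]
averages = population_averages(demo)
```

Averaging first means each population contributes one point, so large sample sets no longer dominate the plot.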

ryukendo kendow said...

@ Davidski

Hi David, are you able to segment markers from the X chromosome in the dataset away from the markers for the rest of the autosome?

The recent paper on Algerian genetics performed this, and was able to use ADMIXTURE on the X vs the autosome to show that European, West Asian, and Sub-Saharan gene flow into North Africa was female-biased, while modal North African ancestry is male-biased. This may be used to investigate the sociological aspects of admixture on the East European plains.

Once the segmentation takes place, formal methods can be used too.

On another note, could we try running the following three stats?

Austroasiatic Kanjar LBK_EN Kotias
Austroasiatic Pulliyar LBK_EN Kotias
Austroasiatic Paniya LBK_EN Kotias

Is there any reason that groups like Singapore_Indian were not run previously? These are agricultural Tamils, and can probably be used as a stand-in for most Dravidians.

Some people are getting the idea that Kotias is not a close representative of the other half of Yamnaya or a portion of the ancestry of SC Asians, because of the lack of specific preference for Kotias over LBK_EN. I think this is v unlikely to be true, and is a result of excess Basal-ness of Kotias, and we can confirm this with the following:

Karelian_HG Yamnaya LBK_EN Kotias

Thanks in advance!

@ Kristiina
Thank you very much for the explanation!

It seems like mtDNA haplogroups tend to reflect structure that is a great deal more ancient than what I'm used to seeing in Y-Haplogroups.

@ Sein @ jParada @ Davidski @ Chad

This ASI 'mystery' is really intriguing, and I'm willing to try many methods to crack this.

Before we do anything, Chad, you seemed to have done some qpAdm modelling of South Indians if I remember correctly. Could you share them with us here? We can use them as a baseline from which to start.

Another thing, the 10% of Dai-like ancestry in the Kalash in qpAdm implies that, prior to the influx of the 60% Andronovo-like ancestry into SC Asia, it must have been ~25%, almost a quarter, Dai-like, which is pretty incredible to think about.
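The arithmetic behind the ~25% figure can be spelled out. The sketch assumes a simple two-way dilution in which the incoming Andronovo-like source carried no Dai-like ancestry of its own. That is an assumption of the example, not something the qpAdm output states.

```python
def pre_admixture_fraction(current_fraction, incoming_fraction):
    """Share a component had before dilution by an incoming source that lacked it."""
    if not 0 <= incoming_fraction < 1:
        raise ValueError("incoming_fraction must be in [0, 1)")
    return current_fraction / (1 - incoming_fraction)


# 10% Dai-like today, after a 60% influx: 0.10 / 0.40
dai_before = pre_admixture_fraction(0.10, 0.60)
```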

ryukendo kendow said...

@ Chad

Oh yes, I saw your message, but could not reply because I don't have >10 comments there. Interested indeed, what do you propose?

Chad Rohlfsen said...

rk,

I can do a few runs again, with some South Asians. I haven't converted my new set with the Paniyas yet, but I can do several groups. It would be preferable to use CHG, but I'm not set up for that yet. The thing that I'm looking at is the potential for ENA and CHG in EHG. I've got several intriguing Dstats which I will post here in a couple minutes. I have to move to my laptop with Plink and Admixture.

VOX said...

Davidski, given that Yamnaya = EHG + CHG and given the archaeological context of this mix, would it be possible to date this admixture event using ROLLOFF, which would also test this model's accuracy?
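For context, ROLLOFF-style dating rests on admixture linkage disequilibrium decaying roughly exponentially with genetic distance, with the decay rate giving the number of generations since mixing. The following is a minimal sketch of that idea and not the ROLLOFF program itself. It assumes a single pulse of admixture; the function name and the synthetic curve are illustrative only.

```python
import math


def generations_since_admixture(distances_morgans, weighted_ld):
    """Least-squares fit of ln(LD) = ln(A0) - n * d; returns n in generations."""
    pairs = [(d, math.log(a)) for d, a in zip(distances_morgans, weighted_ld) if a > 0]
    if len(pairs) < 2:
        raise ValueError("need at least two positive LD values")
    mean_d = sum(d for d, _ in pairs) / len(pairs)
    mean_y = sum(y for _, y in pairs) / len(pairs)
    sxx = sum((d - mean_d) ** 2 for d, _ in pairs)
    sxy = sum((d - mean_d) * (y - mean_y) for d, y in pairs)
    return -sxy / sxx  # slope of the log curve is -n


# Synthetic decay curve generated with n = 100 generations
ds = [0.005 * k for k in range(1, 21)]
curve = [0.3 * math.exp(-100 * d) for d in ds]
estimate = generations_since_admixture(ds, curve)
```

With a single clean pulse the fit recovers the generation count; mixing spread over many generations would not follow one exponential, so a single fitted date becomes harder to interpret.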

Open Genomes said...

Here is a 3D Global9 PC plot of PC1-PC2-PC3:

Global9 3D PC plot with ancient DNA samples

We can see that it's really critical to show *all* samples, including Africans. The presence of Africans, Oceanians, East Asians, and Native Americans changes the picture completely for Eurasia.

Rather than "compressing" Eurasians, the presence of Africans shows us some fascinating things: The WHG-SHG-EHG group trends downward toward MA-1, who in turn leads to the Inuit and Na-Dene, and distantly to the Native Americans. However, Ust'-Ishim leads off on a separate upper South-Asian / Austroasiatic edge towards a vertex consisting of Japanese and Taiwanese Aboriginals. These in turn, on another edge, lead downward through Paleo-Siberians (Chukchi, etc.) to Native Americans.

A closer examination of the upper left reveals that the Early Farmers (EF) are nowhere near the Bedouin B, who appear to be admixed with Sub-Saharans, but rather, represent their own separate Eurasian vertex, today only populated by WHG-admixed Sardinians. The Kebaran Levantine Hunter-Gatherers would be even more isolated, beyond KO2 (Starcevo_EN) and the Anatolian Neolithic.

The lesson here is that all ancient and modern genomes need to be plotted together, and only then can we "zoom in" on a particular region of interest, knowing which way the drift is headed. The real "projection bias" (or rather "biased projection" ;) is when certain regions are left out, and an arbitrary 2-dimensional projection leaves out key variation that makes samples appear to be "related" when in fact they are not.

Rob said...

Open Genome

I'm not an expert on PCAs but I think your PCA looks very good

Davidski said...

rk,

Austroasiatic Kanjar LBK_EN Kotias 0.0024 0.971 111571
Austroasiatic Pulliyar LBK_EN Kotias -0.0009 -0.285 111572
Austroasiatic Paniya LBK_EN Kotias 0.0007 0.264 111881
Mbuti Yamnaya_Kalmykia LBK_EN Kotias 0.011 3.253 496714
Mbuti Yamnaya_Samara LBK_EN Kotias 0.0081 2.517 504561
Mbuti Yamnaya_Kalmykia Anatolia_Neolithic Kotias 0.0147 4.431 497977
Mbuti Yamnaya_Samara Anatolia_Neolithic Kotias 0.0122 3.918 505714
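The rows above can be read programmatically. The sketch below assumes the three trailing numbers are the D value, its Z-score and the SNP count, which is a common layout for D-statistic output but is not stated in the thread. The |Z| >= 3 cut-off is likewise a widely used convention rather than anything fixed here, and the function names are made up.

```python
def parse_dstat(line):
    """Split a row 'W X Y Z D Zscore nSNPs' into a dict (column meaning assumed)."""
    w, x, y, z, d, zscore, nsnps = line.split()
    return {"pops": (w, x, y, z), "D": float(d), "Z": float(zscore), "snps": int(nsnps)}


def is_significant(stat, threshold=3.0):
    """Flag rows whose absolute Z-score reaches the chosen threshold."""
    return abs(stat["Z"]) >= threshold


rows = [
    "Austroasiatic Kanjar LBK_EN Kotias 0.0024 0.971 111571",
    "Mbuti Yamnaya_Kalmykia LBK_EN Kotias 0.011 3.253 496714",
    "Mbuti Yamnaya_Samara LBK_EN Kotias 0.0081 2.517 504561",
]
stats = [parse_dstat(r) for r in rows]
flags = [is_significant(s) for s in stats]
```

Under those assumptions only the Yamnaya_Kalmykia row of these three clears the threshold; the Yamnaya_Samara row points the same way but falls short of it.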

South Asia is still difficult to crack, because of all the layers of admixture, and the geographic and social clines in these layers that exist there.

But Yamnaya always shows a clear preference for Kotias. This can be seen in ADMIXTURE especially, and I might post some results later today or during the week.

The interesting thing is that Yamnaya Samara often shows inflated affinity to WHG, and in ADMIXTURE also some admixture from WHG. Whatever this is, it might be pulling Yamnaya Samara closer to LBK_EN too.

Btw, I don't think it's possible to run any X chromosome tests with the steppe samples. They usually have much less than 4000 SNPs on the X, which isn't enough.

Vox,

Kotias-related admixture entered the steppe during the Khvalynsk period, at the latest. This can be seen on the PCA above, with the Khvalynsk samples forming a cline from EHG to the Bronze Age steppe.

I don't think ROLLOFF can provide more accurate dates than ancient DNA, especially in this case, because most of the admixture appears to have happened gradually over a long period of time.

So the interesting question is why the admixture happened during the Khvalynsk period. As per the sad tale above of the Q1a man ending up dead in a ditch, it might have happened amidst both hostile and friendly relations with people coming from the south and east.

In other words, I suspect that in some cases men were killed and their women taken, and in others women were married off and moved hundreds of kilometers from their homes to be with their husbands.

Chad Rohlfsen said...

David,

Looking at some various stats with Ust_Ishim and the hunters, I'm curious about something. I think it may be possible that Motala is closer to CHG, but not because Motala is closer to Crown West Eurasian, but that they have some CHG, maybe about the same as Karelia. I've seen that despite being closer to ENA, compared to other hunters, they're also further from Ust_Ishim more significantly than WHG. I'm wondering if this means that they have a few percent of actual ENA and CHG. Maybe, someone else can come up with some more stats to test.

Could you test the following:

Primate_Gorilla Kotias Iberia_Mesolithic Karelia_HG
Primate_Gorilla Kotias Loschbour Karelia_HG
Primate_Gorilla Kotias Hungary_HG Karelia_HG
Primate_Gorilla Kotias Motala_HG Karelia_HG
Primate_Gorilla Satsurblia Iberia_Mesolithic Karelia_HG
Primate_Gorilla Satsurblia Loschbour Karelia_HG
Primate_Gorilla Satsurblia Hungary_HG Karelia_HG
Primate_Gorilla Satsurblia Motala_HG Karelia_HG

Davidski said...

Chad,

Primate_Gorilla Kotias Iberia_Mesolithic Karelia_HG 0.0104 1.467 374334
Primate_Gorilla Kotias Loschbour Karelia_HG 0.0016 0.232 384919
Primate_Gorilla Kotias Hungary_HG Karelia_HG 0.0011 0.142 325855
Primate_Gorilla Kotias Motala_HG Karelia_HG -0.0038 -0.711 417770
Primate_Gorilla Satsurblia Iberia_Mesolithic Karelia_HG 0.0161 2.06 305759
Primate_Gorilla Satsurblia Loschbour Karelia_HG 0.0039 0.49 312556
Primate_Gorilla Satsurblia Hungary_HG Karelia_HG 0.003 0.366 258522
Primate_Gorilla Satsurblia Motala_HG Karelia_HG 0.0027 0.45 339369

Aram said...

Davidski

Your PCA is much better scaled than anything I see in these recent studies. Just a few questions to figure out who is who there.
Who are the most 'Southern' Near Easterns (Crosses) ? Bedouin_Bs?
And what sign are the modern Armenians? Caucasian Circles or Near Eastern Crosses?
Thanks in advance.

a said...

@Open Genome
I agree with Rob-
Fantastic and interesting 3d plot.

Chad Rohlfsen said...

Rk,
One way to crack it would be

Primate_Gorilla Paniya Dai Kotias
Primate_Gorilla Paniya Dai Anatolia_Neolithic

That might tell us how close the West and Wast Eurasian are and which is better.

Chad Rohlfsen said...

East, sorry.

Davidski said...

Aram,

https://drive.google.com/file/d/0B9o3EYTdM8lQc1ZqMDUzZVBaWDA/view?usp=sharing

https://drive.google.com/file/d/0B9o3EYTdM8lQLTZ3OXZ3YmFIa2c/view?usp=sharing

Chad,

Primate_Gorilla Paniya Dai Kotias -0.0499 -9.72 101132
Primate_Gorilla Paniya Dai Anatolia_Neolithic -0.0526 -16.075 119765

Kristiina said...

@ Chad “I'm wondering if this means that [Motala] have a few percent of actual ENA and CHG”

Haplogroups also support that view. On the one hand we see C1f in Mesolithic Karelia (a sister clade of C1c found in Apache and Arsario) and C1e is found in modern Icelanders (sister clade of C1b is found in Apache and Cayapa). People (yDNA Q1a?) who brought these haplogroups to Scandinavia probably carried Northeast Asian ENA.

On the other hand, we see H which is probably H2a2b (which I previously erroneously named H2b) and yDNA J in Mesolithic Karelia. CHG was probably carried along to Fennoscandia with these haplogroups. H2a2b is still sporadically found in all Fennoscandia. All mtDNA H2a (http://so-many-ancestors.blogspot.fi/2015/03/matrilineal-monday-haplogroup-h2a.html) looks like having spread from north of Caucasus to Fennoscandia.

As for yDNA J, this FamilyTree map is interesting https://www.familytreedna.com/public/Y-DNA_J/default.aspx?section=ymap . Distribution of “J2b M12 confirmed near predicted and suspected, subclade un recognizable” in Scandinavia could be the result of a Caucasian wave. J2b M12 is still today frequent in Vologda and Rybinsk area and Volga Ural.

Alberto said...

Comparing these ones:

Mbuti Yamnaya_Kalmykia LBK_EN Kotias 0.011 3.253 496714
Mbuti Yamnaya_Samara LBK_EN Kotias 0.0081 2.517 504561

Primate_Gorilla Yamnaya Kotias LBK_EN -0.001 -0.245 271071

Is it Gorilla screwing things up or something else?

ryukendo kendow said...

@ Chad

Chad, we have to make rather complex comparisons because of just what Davidski described: the complex layers of admixture. E.g. we have to make comparisons against Japanese and Karitiana just to see if West Eurasian ancestry exists in S Indians. We know S Indians strongly favour Dai over others, but we don't know what else might be hiding underneath that overall Dai-likeness.

That's why the qpAdm you did many months ago for some S Indian populations, I can't remember which exactly, would be really useful here.

On another note, when Indians are allowed to go on a pca plot, the plot is not dominated by a W Eurasian corner, a Onge corner and a Dai corner as you would expect. In fact the Pulliyar and Paniya dominate the 'Indian' corner, and Onge are in between the S Indians and Dai, slightly closer to Dai than Austroasiatics are.

This implies that the S Indian 'extreme' is not 100% Onge-like, which opens up another avenue of attack. Would you be willing to run some D-stats with Onge again? What S Indians do your dataset contain? Do they contain Kharia, Ho, Mala, Paniya, Pulliyar?

Chad Rohlfsen said...

The ASI got better fits as a mix of Onge, Papuan, and Atayal. I posted them over at Anthrogenica, but can't find them.

Chad Rohlfsen said...

Wasn't there a Mesolithic Indian that showed more modern-like admixture? Using another South or South Central Asian group without a lot of ENA could be better.

Chad Rohlfsen said...

Primate_Gorilla LBK_EN Paniya Dai
Primate_Gorilla Anatolia_Neolithic Paniya Dai
Primate_Gorilla Armenian Paniya Dai
Primate_Gorilla Georgian Paniya Dai
Primate_Gorilla Kotias Paniya Dai
Primate_Gorilla Paniya Armenian Dai
Primate_Gorilla Paniya Georgian Dai

These might at least give an idea to the amount of ENA in ASI and Paniya in general. We might have not been looking with the right pop.

Alberto,
To check if it is the West Eurasian in Mbuti, we could look at...
Primate_Gorilla Mbuti Anatolia_Neolithic Kotias

Chad Rohlfsen said...

I can't seem to find all of those qpAdm stats at Anthrogenica, unfortunately. I'm about to send people on a scavenger hunt.

ryukendo kendow said...

@ Chad

Thank you!

Agree w you, the above stats seem pretty solid. @ David, could you run these?

Also, was there really a Mesolithic Indian??? Rly? How come nobody's talking about it? Wouldn't that have hit the gene blogging world like a ******* meteorite??

Chad Rohlfsen said...

These are three different models for the Kharia

15.5% Onge
26.6% Papuan
30.8% Atayal
05.3% Bedouin
21.8% Georgian
chi-square .342 tail-prob .951898

23.2% Onge
18.2% Papuan
30.9% Atayal
01.0% Hadza
26.7% Georgian
chi-square .706 tail-prob .871684

37.1% Onge
32.2% Atayal
02.2% Hadza
28.5% Georgian
chi-square .811 tail-prob .936969

Chad Rohlfsen said...

Here's another..

36.2% Onge
32.9% Atayal
02.2% Hadza
25.9% Georgian
02.6% Corded Ware
Chi- 0.882 tail- 0.829837
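The tail probabilities quoted with these models are upper-tail chi-square probabilities, and they can be checked by hand. The degrees of freedom are not given in the thread. The values used below (3 for the five-source models, 4 for the four-source one) are an assumption that happens to reproduce the quoted figures closely. The helper is a standard closed-form survival function for integer degrees of freedom, written with the standard library only.

```python
import math


def chi2_tail_prob(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-h) * sum_{i < df/2} h^i / i!
        term = math.exp(-h)
        total = term
        for i in range(1, df // 2):
            term *= h / i
            total += term
        return total
    # Odd df: erfc(sqrt(h)) plus a finite sum of half-integer terms
    total = math.erfc(math.sqrt(h))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-h)
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= h / (i + 0.5)
    return total


# (chi-square, assumed df, tail-prob as quoted above)
quoted = [(0.342, 3, 0.951898), (0.706, 3, 0.871684), (0.811, 4, 0.936969)]
recomputed = [chi2_tail_prob(x, df) for x, df, _ in quoted]
```

A tail probability near 1 means the model's fit statistic is small for its degrees of freedom, which is why all of these models pass.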

a said...

Suspense;

Any idea when we will see the Mathieson et al 2015, and Jones et al 2015 data converted/incorporated into Eurogenes K6-K10 & K15, for public viewing? For example the above R1a-R1b-Q Khvalynsk samples?

Alberto said...

@Chad

I doubt it, but if it's West Eurasian admixture in Mbuti these ones should tell better and identify the possible offender(s) and the degree of it. Yoruba has at least double the West Eurasian admixture than Mbuti, and Mota possibly none. So the 3 should be significantly different. Chimp and Gorilla should be equal, and closer to Mota:

Mbuti Yamnaya_Kalmykia LBK_EN Kotias
Yoruba Yamnaya_Kalmykia LBK_EN Kotias
Mota Yamnaya_Kalmykia LBK_EN Kotias
Chimp Yamnaya_Kalmykia LBK_EN Kotias
Primate_Gorilla Yamnaya_Kalmykia LBK_EN Kotias

It would be important to sort this out first, because either the stats with Mbuti are wrong, or the stats with Gorilla are wrong (or maybe it's something else at play).

Gihanga Rwanda said...

@Alberto

Actually one of the strangest things about that Mota study is it estimated a purported ~7% layer of Western Eurasian "admixture" across the board, with the Mbuti at 6% and Yoruba at 7%; the Dinka, Ju'hoansi, and Bantu speakers had identical results.

Alberto said...

@Gihanga Rwanda

Yes, you're right about Yoruba. I thought it was quite higher than Mbuti. Then probably instead of Yoruba something like:

Khomani Yamnaya_Kalmykia LBK_EN Kotias

could work to know if it's West Eurasian admixture in SSA making those differences in the stats. Though Mota vs Mbuti/Yoruba would tell by itself too.

Davidski said...

D-stats...

Primate_Gorilla LBK_EN Paniya Dai -0.0078 -2.578 119527
Primate_Gorilla Anatolia_Neolithic Paniya Dai -0.0079 -2.663 119765
Primate_Gorilla Armenian Paniya Dai -0.0097 -3.271 119904
Primate_Gorilla Georgian Paniya Dai -0.009 -3.024 119904
Primate_Gorilla Kotias Paniya Dai -0.0121 -2.873 101132
Primate_Gorilla Paniya Armenian Dai 0.0405 12.77 119904
Primate_Gorilla Paniya Georgian Dai 0.0402 12.863 119904

Mbuti Yamnaya_Kalmykia LBK_EN Kotias 0.011 3.253 496714
Yoruba Yamnaya_Kalmykia LBK_EN Kotias 0.0123 3.713 496714
Mota Yamnaya_Kalmykia LBK_EN Kotias 0.0106 2.427 451956
Chimp Yamnaya_Kalmykia LBK_EN Kotias 0.0119 2.984 496714
Primate_Gorilla Yamnaya_Kalmykia LBK_EN Kotias 0.0115 2.787 445305
Khomani Yamnaya_Kalmykia LBK_EN Kotias 0.0163 5.048 496714

By the way, rk, Chad was probably talking about this...??

http://eurogenes.blogspot.com.au/2015/08/archeogenetics-of-roopkund-skeletons.html

Because there's definitely no Mesolithic ancient DNA from South Asia yet.

Alberto said...

Thanks Davidski.

So Khomani does introduce some bias, but all the rest of the outgroups are pretty much the same. I wonder then why the difference between these two:

Primate_Gorilla Yamnaya Kotias LBK_EN -0.001 -0.245 271071
Primate_Gorilla Yamnaya_Kalmykia LBK_EN Kotias 0.0115 2.787 445305

Maybe the first one was using only transversion sites? Anyway I think all the others make more sense than that first one.
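One thing to keep in mind when comparing the two rows is that they list LBK_EN and Kotias in opposite order, and by definition D(W, X; Y, Z) = -D(W, X; Z, Y). A small sketch that puts the first row into the same orientation as the second; it assumes the numbers after the population names are D and then the Z-score, and the function name is made up.

```python
def swap_last_pair(pops, d, z):
    """Swap populations 3 and 4; D and its Z-score both change sign."""
    w, x, y, zpop = pops
    return (w, x, zpop, y), -d, -z


first = (("Primate_Gorilla", "Yamnaya", "Kotias", "LBK_EN"), -0.001, -0.245)
second = (("Primate_Gorilla", "Yamnaya_Kalmykia", "LBK_EN", "Kotias"), 0.0115, 2.787)

first_reoriented = swap_last_pair(*first)
same_sign = (first_reoriented[1] > 0) == (second[1] > 0)
```

Once reoriented, the first row reads 0.001 (Z 0.245) against 0.0115 (Z 2.787), so the signs agree but the magnitudes still differ. The swap alone does not account for the gap; the rows also differ in Yamnaya label and SNP count.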

Re: Indian Mesolithic sample, I also remember not long ago something about it. It was some HG from the Gangetic plain that showed some degree of affinity with modern inhabitants of the region, but it was not DNA, only craniometric data, I seem to remember. Not sure, though.

Chad Rohlfsen said...

Yeah, that might be right David.

Looking at those stats, The Paniya sure as hell don't look 50% West Eurasian. Maybe, not half that. Any ideas, rk?

Looking at those Yamnaya numbers, I see no issue with Gorilla. Africans move the numbers with their differing relationship to West Eurasians.

Chad Rohlfsen said...

Time for Treemix?

Chad Rohlfsen said...

Maybe using Ju_hoan_North, Mbuti, Yoruba, Mota, Denisovan, Papuan, Atayal, Dai, Paniya, Anatolia_Neolithic, Kotias, Andronovo_BA?

Roy King said...

OGF via Ted Kandell did a nice 3D interactive graphic for the world PC1 vs PC2 vs PC3 furnished by Davidski:

http://www.open-genomes.org/analysis/PCA/Eurogenes_Global9_PC_1-2-3_plot_with_aDNA.html

Davidski said...

Very nice indeed.

Btw, qpAdm shows Paniya to be 65/35 CHG/Dai and 0% BA steppe.

https://drive.google.com/file/d/0B9o3EYTdM8lQOUlmV2RCRV9BQ3c/view?usp=sharing

Open Genomes said...

Here is an INTERACTIVE 3-D PCA Plot of Global9 PC1-PC2-PC3 which can be rotated and enlarged, and where samples can be identified when you mouse over them.

Interactive 3-D Eurogenes Global9 PCA Plot with ancient and modern samples

Here is a PCA projection / guide showing the population and migration edges for Eurasia in the foreground:

Eurogenes Global9 PCA Plot showing populations and migration edges

A 3-D PCA plot is much more informative than a 2-D projection, because any projection can appear to falsely superimpose samples and foreshorten distances. With this interactive 3-D plot it's easy to see the true relationships between populations and ancient samples, and even the directions of admixture.
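The foreshortening point can be shown with two made-up samples that coincide on PC1 and PC2 but differ on PC3. The coordinates below are hypothetical and are not taken from the Global9 datasheet.

```python
import math


def distance(a, b, dims):
    """Euclidean distance between two samples using only the listed dimensions."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in dims))


# Hypothetical coordinates: (PC1, PC2, PC3)
sample_a = (0.010, 0.020, -0.030)
sample_b = (0.010, 0.020, 0.050)

d_2d = distance(sample_a, sample_b, dims=(0, 1))     # what a PC1 vs PC2 plot shows
d_3d = distance(sample_a, sample_b, dims=(0, 1, 2))  # what the 3-D plot shows
```

In the 2-D view the two samples sit on top of each other, while in three dimensions they are clearly apart.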

For example, it's possible to see that Mota clusters with the Hadza and Sandawe rather than with the Aari Cultivators of Ethiopia.

There does seem to be a correlation between Y haplogroups and the plot. Notice that the Early Farmer (EF) (Y-DNA G, T, and H2) is a completely basal branch of Eurasians right at "Out of Africa", and that there is a progression of Y-DNA J => H1/H3 => NO => O toward the Austronesians, while another migration edge is roughly I2 => R1a/R1b => C2 => Q1a toward the Americas.

Have a look, see what you can find, and have fun!

a said...

Open Genomes T.K& company. gratitude
You can see a line from R*-H.G.`s-Karelia-Samara to R1s leading into Europe.

Chad Rohlfsen said...

David,

I see one qpAdm that failed. Is there more? Can you drop Dai and input Papuans and Atayal and Australian and Atayal? Thanks!

Chad Rohlfsen said...

It might not hurt to add Anatolia_Neolithic too.

Qagan said...

I have off topic question.

I am confused: is the ANE a West or East Eurasian component, a mix of both, or a unique component?

I ask this because I notice that, for example, the Ulchi sample scores approximately 13% ANE according to estimations by Lazaridis in this thread: http://www.anthrogenica.com/showthread.php?4990-Studies-Find-Mysterious-Link-Between-Native-Americans-And-Indigenous-Australasians/page2

but at the same time scores approximately 100% East Eurasian in this admixture result of a run at K3: http://www.anthrogenica.com/showthread.php?5711-E-Eurasian-vs-W-Eurasian

Does this mean that Ulchi samples actually have some West Eurasian and how much is it?

Thank you very much

Davidski said...

Ulchi are part EHG, which was classified as ANE in the Laz paper, and yes, this represents West Eurasian admixture in them.

Qagan said...

@Davidski:

You mean Ulchi are 1/4 EHG? Do you have a spreadsheet with the averages for each population?

Thank you very much

Davidski said...

Qagan,

As far as I can remember based on some tests I ran, Ulchi are around 15% EHG or ANE. I'd need to double check that. I don't have a spreadsheet.

Chad,

The Papuans and Australians are basically interchangeable in this model. But the standard errors are too high for these results to be considered valid IMO.

https://drive.google.com/file/d/0B9o3EYTdM8lQWl80SUdyLXN5YWM/view?usp=sharing

Qagan said...

Davidski,

Thank you very much. Yes, if you can check on it, I will appreciate it. So if I want to find out the actual West Eurasian ancestry I need to look at the ANE percentages in each population?

Davidski said...

OK, for Ulchis I'm getting 8.3% MA1 and 6.6% Karelia_HG, with ~2% error margins. Scroll down to the bottom here...

https://drive.google.com/file/d/0B9o3EYTdM8lQLThCdjAydGtadVE/view?usp=sharing

https://drive.google.com/file/d/0B9o3EYTdM8lQb0Q0LVJkV2NwY3M/view?usp=sharing

The reason for these different estimates is the lack of correct ancient reference samples. But the upshot is that yes, Ulchis have some West Eurasian admixture of the hunter-gatherer kind.

Qagan said...

Thank you very much. Do you know why Ulchis are shown as 100% East Eurasian in the Eurasia K3 run?

So Ulchis have around 6.6-8.3% West Eurasian admix based on MA1 and Karelia_HG? Sorry for asking such question out of ignorance but I am still pretty much new in population genetics.

Alberto said...

@Open Genomes

Thank you, that looks amazing. Very informative, indeed.

Is it possible with that same data to make a West Eurasian only one? Or you'd need a different dataset for that?

Alberto said...

@David

In qpAdm the standard errors refer to the best coefficients option alone? Because otherwise the second option:

Paniya
Atayal: 25.6%
Papuan: 30.6%
Kotias: 43.7%

chisq: 2.758 tail prob: 0.43

Looks quite decent. So maybe just running that same without Anatolia_Neolithic gives lower errors by picking this second option as the best one.

ryukendo kendow said...

Thank you very much Chad and David! These are great stats, and provide us with a working model to force out ASI.

For any further investigation, we kinda need this to be a collaborative process, since you (Chad) do not have CHG while Davidski does, and you have the Onge but David does not and cannot have them. David, do you mind sharing the CHG data with Chad? Also, is Treemix easy to run, or at least easy to teach? It would be preferable if you (Chad) could run Treemix with Onge, since that's not possible on David's side.

For the stats that were already run:
Primate_Gorilla LBK_EN Paniya Dai -0.0078 -2.578 119527
Primate_Gorilla Anatolia_Neolithic Paniya Dai -0.0079 -2.663 119765
Primate_Gorilla Armenian Paniya Dai -0.0097 -3.271 119904
Primate_Gorilla Georgian Paniya Dai -0.009 -3.024 119904
Primate_Gorilla Kotias Paniya Dai -0.0121 -2.873 101132

These, together with the qpAdm, show that LBK_EN and Kotias-like ancestry is indeed found in the Paniya, but

Primate_Gorilla Paniya Armenian Dai 0.0405 12.77 119904
Primate_Gorilla Paniya Georgian Dai 0.0402 12.863 119904

also that Paniya is much more like ENA than like Basal-rich populations in overall ancestry; agree with you (Chad) that Basal probably does not form more than 25% of ASI. Then look at this:

Dai Paniya LBK_EN Kotias 0.0055 1.742 92265
Nganasan Paniya LBK_EN Kotias 0.0055 1.615 92265
Dai Pulliyar LBK_EN Kotias 0.0043 1.302 91609
Nganasan Pulliyar LBK_EN Kotias 0.0041 1.15 91609

Which shows, surprisingly, that this Basal is only nonsignificantly Kotias-like.

To make sure that this signal is the same in multiple ASI-rich populations, could we run:

Primate_Gorilla LBK_EN Austroasiatic Dai
Primate_Gorilla Kotias Austroasiatic Dai
Primate_Gorilla LBK_EN Pulliyar Dai
Primate_Gorilla Kotias Pulliyar Dai

To exclude the possibility that Crown West Eurasian is pulling S Indians to Basal-rich populations, and to force the stat to get higher in magnitude:

Loschbour LBK_EN Austroasiatic Dai
Loschbour Kotias Austroasiatic Dai
Karelian_HG LBK_EN Austroasiatic Dai
Karelian_HG Kotias Austroasiatic Dai
MA1 LBK_EN Austroasiatic Dai
MA1 Kotias Austroasiatic Dai
Kostenki14 LBK_EN Austroasiatic Dai
Kostenki14 Kotias Austroasiatic Dai

Loschbour LBK_EN Paniya Dai
Loschbour Kotias Paniya Dai
Karelian_HG LBK_EN Paniya Dai
Karelian_HG Kotias Paniya Dai
MA1 LBK_EN Paniya Dai
MA1 Kotias Paniya Dai
Kostenki14 LBK_EN Paniya Dai
Kostenki14 Kotias Paniya Dai

Loschbour LBK_EN Pulliyar Dai
Loschbour Kotias Pulliyar Dai
Karelian_HG LBK_EN Pulliyar Dai
Karelian_HG Kotias Pulliyar Dai
MA1 LBK_EN Pulliyar Dai
MA1 Kotias Pulliyar Dai
Kostenki14 LBK_EN Pulliyar Dai
Kostenki14 Kotias Pulliyar Dai

The greater the increase in the magnitude of these stats compared to when Chimp is in position 1, the less the result is due to Western Crown Eurasian. The patterns between each one of the HGs would be interesting too.

ryukendo kendow said...

If the stats show consistently that Basal Eurasian-rich populations share drift with populations like Austroasiatic/Kharia, Paniya, and Pulliyar even when West Eurasian ancestry is corrected for, I will treat some of the other stats where West Eurasian does not pick Dai over Austroasiatic as irrelevant, as such a small amount of Basal Eurasian in Austroasiatic may well be offset by a small amount of West Eurasian ancestry in Austroasiatic to make Loschbour equally far from Austroasiatic and Dai.

So much for the Basal Eurasian in ASI.

Now for the next series of stats. Would you (Chad) be interested in running these, since you have the Onge? We were running Onge and ENA stats before we got totally distracted by CHG, so we never got around to seeing if ASI is really a clade with Onge or not. Your (Chad's) qpAdm and ADMIXTURE runs strongly imply that ASI does *not* in fact clade with Onge in the Papuan(Onge(Dai)) tree. The following D-stats would illuminate this:

Chimp Kharia Onge Dai
Chimp Kharia Onge Japanese
Chimp Kharia Onge Papuan
Kharia Papuan Onge Dai
Hadza Kharia Onge Dai
Ust-Ishim Kharia Onge Dai
Papuan Kharia Onge Dai
LBK_EN Kharia Onge Dai

There is a signal of 25% Georgian in the qpAdm, and in various ADMIXTURE e.g. Dienekes and HarappaDNA, the S Indian component is decomposed into 50% Caucasus ancestry, so including CHG would work, but since you don't have the CHG here could we try

Kharia Onge Dai Georgian
Kharia Onge Dai Lezgin
Kharia Onge Dai Armenian
Kharia Onge Dai Abkhazian
Kharia Onge Dai Balochi
Kharia Onge Dai Brahui

There is a small but consistent signal of Hadza and Bedouin ancestry in the qpAdm, which either means there is something very very divergent in the S Indians, or is just noise...? Chad, did you try any models without Hadza or Bedouin? Does the fit get considerably worse?

I agree that Treemix would be wonderful now that we have the whole set of ASI, Onge, CHG etc. But David doesn't have Onge and Chad doesn't have CHG :/ Nevertheless, for a tree containing the following set:

(Chimp Hadza Kotias Paniya Papuan Dai Atayal Karitiana Loschbour Motala_HG Karelian_HG)

If the branch from/into Paniya splits before Papuan and East Asian split, which is a configuration we've seen before, this is a strong indication that ASI does not form a clade with Onge. David, would you be willing to try this?

Balaji said...

Alberto, Davidski,

Thanks for clarifying that, from the D statistics, the Yamnaya (and Afanasievo) are the only populations that we know to favor Kotias over LBK_EN. Even Corded Ware, which is supposed to be 80% Yamnaya, favors LBK_EN over Kotias.

Mbuti Corded_Ware_LN Kotias LBK_EN 0.0178 4.702 302875

All modern European populations must strongly favor LBK_EN. South Asian and East Asian populations do not choose between LBK_EN and Kotias. For the Near East, I found the following statistics that Davidski calculated.

Primate_Gorilla BedouinB LBK_EN Kotias -0.0382 -9.955 271793
Primate_Gorilla Armenian LBK_EN Kotias -0.0235 -5.95 271793

It will be good to find out how more Near Eastern populations choose between the two. I suspect that they will favor LBK_EN, even people of the Caucasus. Davidski could you calculate the following D statistics when you get the time?

Chimp Lithuanian LBK_EN Kotias
Chimp Georgian LBK_EN Kotias
Chimp Lezgin LBK_EN Kotias
Chimp Assyrian LBK_EN Kotias
Chimp Syrian LBK_EN Kotias
Chimp Itanian LBK_EN Kotias

Davidski said...

West_Eurasia9 PCA datasheet...

https://drive.google.com/file/d/0B9o3EYTdM8lQZG5KMnl6UjhrRjA/view?usp=sharing

Balaji,

This expanded Corded Ware sample varies from 75% Yamnaya to as little as 35%. The average is about 60%. The rest is Middle Neolithic European.

ryukendo kendow said...

Hi Balaji,

Many Indians, even relatively ASI-rich low-caste Indians, favour Kotias when their ENA ancestry is subtracted from the picture:

Dai Chamar LBK_EN Kotias 0.0089 3.215 91609

Though it's true that non-Aryan speakers seem to favour Kotias weakly, if at all.

Kurti said...

I don't understand why people are always talking about Kotias while asking for the inclusion of CHG in a new calculator. Isn't it clear from the paper that Kotias is (a) the younger and (b) the EF-admixed (25%), and therefore the less pure, of the two CHG samples?

Satsurblia is the one which shows no signs of outside admixture whatsoever, so if either is used for future calculators it should be Satsurblia, not Kotias.

Davidski said...

Alberto,

https://drive.google.com/file/d/0B9o3EYTdM8lQNmcteVdhLTM1Ykk/view?usp=sharing

Balaji,

Chimp Lithuanian LBK_EN Kotias -0.0273 -7.781 507266
Chimp Georgian LBK_EN Kotias -0.0003 -0.081 507266
Chimp Lezgin LBK_EN Kotias -0.0021 -0.631 507266
Chimp Assyrian LBK_EN Kotias -0.0245 -5.684 112556
Chimp Syrian LBK_EN Kotias -0.0232 -6.978 507266
Chimp Italian_Tuscan LBK_EN Kotias -0.0371 -10.629 507266
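Since a Z-score is the statistic divided by its standard error, the table above also lets one back out an approximate standard error for each row. The sketch below does only that arithmetic; `implied_se` is a hypothetical helper, and because D is printed to four decimals the values are crude for rows with a tiny D such as Georgian.

```python
# (test population, D, Z, nSNPs) copied from the table above.
ROWS = [
    ("Lithuanian", -0.0273, -7.781, 507266),
    ("Georgian", -0.0003, -0.081, 507266),
    ("Lezgin", -0.0021, -0.631, 507266),
    ("Assyrian", -0.0245, -5.684, 112556),
    ("Syrian", -0.0232, -6.978, 507266),
    ("Italian_Tuscan", -0.0371, -10.629, 507266),
]


def implied_se(d, z):
    """Back out the standard error from Z = D / SE."""
    return abs(d / z)


se_by_pop = {pop: implied_se(d, z) for pop, d, z, _ in ROWS}
widest = max(se_by_pop, key=se_by_pop.get)

for pop, d, z, nsnps in ROWS:
    print(f"{pop:15s} D={d:+.4f} Z={z:+.3f} SE~{se_by_pop[pop]:.4f} nSNPs={nsnps}")
```

On these numbers the Assyrian row has the largest implied standard error, which fits it being the only row computed on far fewer SNPs.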

Kurti,

Satsurblia looks less mixed because it's a low coverage haploid genome. Kotias looks more mixed because it's a high coverage diploid individual representing a whole population against various heavily drifted modern populations.

By the way, the Scythian from Mathieson shares highest drift with Latvians and Lithuanians. :p

Tobus said...

@ryukendo:

Chimp Kharia Onge Dai 0.022 7.992
Chimp Kharia Onge Japanese 0.0166 6.218
Chimp Kharia Onge Papuan -0.0397 -10.151
Kharia Papuan Onge Dai -0.0301 -10.838
Hadza Kharia Onge Dai 0.0227 10.255
Ust_Ishim Kharia Onge Dai 0.0231 6.536
Papuan Kharia Onge Dai 0.0301 10.838
LBK_EN Kharia Onge Dai 0.0171 7.504

Kharia Onge Dai Georgian -0.0032 -1.547
Kharia Onge Dai Lezgin -0.0046 -2.214
Kharia Onge Dai Armenian -0.0024 -1.166
Kharia Onge Dai Abkhasian -0.0043 -2.05
Kharia Onge Dai Balochi -0.007 -3.678
Kharia Onge Dai Brahui -0.0076 -3.935

ryukendo kendow said...

Hi Tobus,

Looks like you have the Onge too, cheers! Yep, ASI does not form a clade with Onge.

Do you know how to run TreeMix?

@ Davidski
Since Chad does not have the CHG, could you help us run the following?

Primate_Gorilla LBK_EN Austroasiatic Dai
Primate_Gorilla Kotias Austroasiatic Dai
Primate_Gorilla LBK_EN Pulliyar Dai
Primate_Gorilla Kotias Pulliyar Dai

Loschbour LBK_EN Austroasiatic Dai
Loschbour Kotias Austroasiatic Dai
Karelian_HG LBK_EN Austroasiatic Dai
Karelian_HG Kotias Austroasiatic Dai
MA1 LBK_EN Austroasiatic Dai
MA1 Kotias Austroasiatic Dai
Kostenki14 LBK_EN Austroasiatic Dai
Kostenki14 Kotias Austroasiatic Dai

Loschbour LBK_EN Paniya Dai
Loschbour Kotias Paniya Dai
Karelian_HG LBK_EN Paniya Dai
Karelian_HG Kotias Paniya Dai
MA1 LBK_EN Paniya Dai
MA1 Kotias Paniya Dai
Kostenki14 LBK_EN Paniya Dai
Kostenki14 Kotias Paniya Dai

Loschbour LBK_EN Pulliyar Dai
Loschbour Kotias Pulliyar Dai
Karelian_HG LBK_EN Pulliyar Dai
Karelian_HG Kotias Pulliyar Dai
MA1 LBK_EN Pulliyar Dai
MA1 Kotias Pulliyar Dai
Kostenki14 LBK_EN Pulliyar Dai
Kostenki14 Kotias Pulliyar Dai

Thanks in advance.

Alberto said...

@Davidski

Thanks. So that's interesting for the methodology of using qpAdm. Probably as RK said a while back, better to add one by one to test each combination separately and see if they improve the model or not.

Also, knowing that Paniya is ASI+CHG and takes no European LNBA, it could serve as a more realistic base than Dai for getting accurate results about Andronovo/Sintashta admixture. For example, to model an Indo-Aryan population as Paniya + Kotias + X, where X can be Sintashta/Andronovo, EHG, MA1,...

Kurti said...

Davidski said

"By the way, the Scythian from Mathieson shares highest drift with Latvians and Lithuanians. :p"


According to which study lol. He fits perfectly as a "mixed" individual based on his admixture results, similar to Yamna and Andronovo (if not even slightly more South and eastern shifted) on PCA plots.

Maybe he shares "highest drifts" with them but that doesn't mean he is automatically very close to them either. As we know things such as "highest" are relative ;)

Kurti said...

And about the Kotias and Satsurblia issue: the former might be high coverage, but the study itself states there is EEF-like admixture in Kotias, probably from Anatolian farmers slowly reaching the Caucasus. Satsurblia, on the other hand, is low coverage, yes, but he doesn't seem to show signs of EF admixture. In combination with his age, this is a strong indication that he is less mixed.

Shaikorth said...

Kurti, Lithuanians and Latvians share the most drift with many populations that may look more distant based on ADMIXTURE or PCA, both ancient and modern:

Mordovian Lithuania : Chuvash MBUTI -0.0039 -4.050
[Kargopol] Russian Lithuania : Chuvash MBUTI -0.0033 -3.406

It should come as no surprise if they peak the Scythian sharing.

capra internetensis said...

@ryukendo

Kharia are Austro-Asiatic, they share recent O2a1 haplogroups with Dai extensively.

VOX said...

Tobus, can you try:

Chimp Onge Dai Japanese

Correct me if I'm wrong, but I think this would indicate if the Dai have South Eurasian type admixture, if negative enough.

Kurti said...

@Shaikorth

That's what I tried to say. No surprise with them sharing "highest" compared to other populations. But highest is relative. Turkic groups in Iran have "highest" East Asian admixture but that doesn't make them remotely similar to East Asians.
Just to give a more extreme and drastic example to make my point clear.

Obviously any ancient Satem Indo-European and Uralic group from the Steppe region will share significant drift with Lithuanians/Latvians. But from what I have seen, the Iron Age Scythian sample looks like it belongs to a population which can be modeled as in between Lithuanians and a different West and South_Central Asian group. This leads to my statement years ago that the North and East Iranic groups are the ones who fill the gap between the North Caucasus, South_Central Asians and East Europeans.

Davidski said...

Turkic groups in Iran have "highest" East Asian admixture but that doesn't make them remotely similar to East Asians.

I didn't say Lithuanians share the highest drift with the Scythian from among Europeans. I said Lithuanians share the highest drift with the Scythian.

And no, the Scythian can't be modeled as Lithuanian/South Central Asian, because he lacks South Asian ancestry.

Chad Rohlfsen said...

The use of Dai is a worse fit in qpAdm. In fact, Dai are closer to West Eurasian Caucasus pops than the Onge. I've said before, and I still believe that the Dai do have West Eurasian admixture, and cause more problems. Here are the Kharia, with the Dai

Onge 46.0%
Dai 27.6%
Hadza 1.0%
Georgian 25.4%

chisq 2.330 tail prob .675224

The Dai clearly have West Eurasian ancestry, and show clear affinity to EHG, MA1, and Nganasan. This is why they are closer to South Asians, it is their West Eurasian ancestry and ENA. The Onge, clearly model with better fits for ASI, which is probably a mix of Onge and Atayal-like stuff.

result: Gorilla Karelia_HG Onge Dai 0.0162 4.088 15538 15043 317554
result: Gorilla Yamnaya Onge Dai 0.0077 2.285 14404 14185 297501
result: Gorilla LBK_EN1 Onge Dai 0.0057 1.820 15807 15629 328505
result: Gorilla Spain_EN Onge Dai 0.0050 1.516 15556 15401 323550
result: Gorilla Armenian Onge Dai 0.0082 2.800 15876 15618 329241
result: Gorilla Georgian Onge Dai 0.0082 2.813 15883 15626 329241
result: Gorilla BedouinB Onge Dai 0.0063 2.160 15655 15458 329241
result: Gorilla Iraqi_Jew Onge Dai 0.0085 2.793 15828 15562 329241
result: Gorilla Kharia Onge Dai 0.0212 7.772 16330 15652 329241
result: Gorilla Onge Atayal Dai 0.0039 1.689 14572 14459 329241
result: Gorilla Onge Han Dai 0.0030 2.085 14592 14505 329241
result: Gorilla Onge Dai Atayal -0.0039 -1.689 14459 14572 329241
result: Gorilla Nganasan Onge Dai 0.0739 24.203 17301 14920 329241
result: Gorilla Dai Onge Nganasan 0.0577 16.732 17301 15414 329241
result: Gorilla Atayal Onge Dai 0.1093 35.865 18007 14459 329241
result: Gorilla Australian Onge Dai -0.0057 -1.582 15567 15747 329240
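The D column in output like this can be cross-checked from the two count columns. The sketch below assumes those columns are the BABA and ABBA site-pattern counts, in that order, which is how this kind of output is usually laid out but is not stated in the thread; `d_from_counts` is a hypothetical helper.

```python
# Rows copied from the output above ("result:" prefix dropped):
# pop1 pop2 pop3 pop4, reported D, Z, two site-pattern counts, SNP total.
# Assumption: the two counts are BABA and ABBA, in that order.
ROWS = """\
Gorilla Karelia_HG Onge Dai 0.0162 4.088 15538 15043 317554
Gorilla Yamnaya Onge Dai 0.0077 2.285 14404 14185 297501
Gorilla Kharia Onge Dai 0.0212 7.772 16330 15652 329241
Gorilla Nganasan Onge Dai 0.0739 24.203 17301 14920 329241
Gorilla Onge Dai Atayal -0.0039 -1.689 14459 14572 329241
"""


def d_from_counts(baba, abba):
    """D as the normalised difference of the two site-pattern counts."""
    return (baba - abba) / (baba + abba)


checks = []
for line in ROWS.strip().splitlines():
    fields = line.split()
    reported = float(fields[4])
    baba, abba = int(fields[6]), int(fields[7])
    checks.append((fields[1], reported, d_from_counts(baba, abba)))

for pop, reported, recomputed in checks:
    print(f"{pop:12s} reported={reported:+.4f} recomputed={recomputed:+.4f}")
```

Swapping the two counts flips the sign of D, which is why the mirrored Atayal/Dai rows carry the same magnitude with opposite signs.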

Dai
Atayal 89.4%
MA1 9.9%
Australian 0.7%

chisq 4.236 tail prob .237128

Dai
Atayal 58.3%
Nganasan 30.5%
Australian 11.2%

chisq 3.484 tail prob .322788

Dai
Atayal 36.3%
Nganasan 44.2%
Onge 19.6%

chisq 1.511 tail prob .679826

I'm still working on better fits.
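For anyone wanting to sanity-check the "tail prob" figures, they behave like the upper tail of a chi-square distribution. The sketch below uses only the standard library; the degrees of freedom (4 for the Kharia model, 3 for the Dai models) are not given in the output and are assumed here only because they reproduce the reported values.

```python
import math


def chi2_tail(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    half = x / 2.0
    if df % 2:
        q, k = math.erfc(math.sqrt(half)), 1  # closed form for df = 1
    else:
        q, k = math.exp(-half), 2  # closed form for df = 2
    # Recurrence: Q(k + 2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


# (chisq, assumed df, reported tail prob) from the models above.
REPORTED = [
    (2.330, 4, 0.675224),
    (4.236, 3, 0.237128),
    (3.484, 3, 0.322788),
    (1.511, 3, 0.679826),
]

for chisq, df, reported in REPORTED:
    print(f"chisq={chisq} df={df} computed={chi2_tail(chisq, df):.4f} reported={reported}")
```

The small differences left over are consistent with the chisq values having been rounded to three decimals before printing.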

Shaikorth said...

Chad, can you do these too:

Gorilla Karelia_HG Onge Atayal
Gorilla Georgian Onge Atayal
Gorilla Nganasan Onge Atayal

Tobus said...

@VOX:

Chimp Onge Dai Japanese -0.0016 -0.996

Tobus said...

.. and while I'm at it:

@Shaikorth:

Gorilla Karelia_HG Onge Atayal 0.0165 3.809
Gorilla Georgian Onge Atayal 0.0062 1.89
Gorilla Nganasan Onge Atayal 0.0745 20.851

Shaikorth said...

Thanks, looks like whatever affinities there are between EHG/ANE and Dai extend to Atayal.

Seinundzeit said...

A side note, but I just realized that the qpAdm model of Pashtuns and Kalash as CHG + EHG + AEN/EEF + ENA is strikingly similar to Zack's old K11 Onge run. A comparison, using Pashtuns:

57.5% CHG + 17.7% EHG + 12.8% ENA + 12% AEN/EEF

South Asia=48%
European=19%
SW Asian=17%
Onge=11%
Siberian=2%
American=1%
East Asian=1%

Obviously, we are dealing with radically different methods, and radically different kinds of output. Any direct comparison is somewhat problematic. And anything based on formal stats takes precedence over ADMIXTURE output. The qpAdm output is determinative. But I'm simply struck by the similarity. For example, "South Asian" is a West Eurasian component that peaks in South Asia, West Asia, the Caucasus, and has a bias towards appearing more strongly in Northern Europe rather than Southern Europe. Basically, it acts like CHG. "SW Asian" is a composite of the EEF-like, Bedouin-like, and CHG-like components that often appear in ADMIXTURE. Here, it takes the place of AEN/EEF for Pashtuns, but takes some CHG with it. "South Asian" + some of "SW Asian" is identical to the amount of CHG shown by qpAdm. The "European" score is almost identical to the EHG percentage. And the percentage of "Onge" + "East Asian" is identical to the ENA score in qpAdm. I just find it interesting that the two sets of results are so similar. Probably a good indication that this model closely approximates reality.
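The arithmetic behind that comparison can be laid out explicitly. The mapping between components is the one proposed in the comment above; treating part of "SW Asian" as CHG is only the reading suggested there, and the variable names are hypothetical.

```python
# qpAdm model for Pashtuns, in percent (from the comment above).
qpadm = {"CHG": 57.5, "EHG": 17.7, "ENA": 12.8, "AEN/EEF": 12.0}

# Zack's K11 Onge run for Pashtuns, in percent (from the comment above).
k11 = {
    "South Asia": 48, "European": 19, "SW Asian": 17, "Onge": 11,
    "Siberian": 2, "American": 1, "East Asian": 1,
}

# ENA-like share in K11 under the proposed mapping: Onge + East Asian.
ena_like = k11["Onge"] + k11["East Asian"]

# Share of "SW Asian" that would have to be counted as CHG to match qpAdm.
sw_needed_for_chg = qpadm["CHG"] - k11["South Asia"]
sw_left_over = k11["SW Asian"] - sw_needed_for_chg

print("ENA-like (K11) vs ENA (qpAdm):", ena_like, qpadm["ENA"])
print("European (K11) vs EHG (qpAdm):", k11["European"], qpadm["EHG"])
print("SW Asian needed to reach CHG:", sw_needed_for_chg)
print("SW Asian left over vs AEN/EEF:", sw_left_over, qpadm["AEN/EEF"])
```

Laid out this way, the match is close rather than exact: the ENA and EHG pairs differ by about one point each, and the K11 Siberian and American components are left out of the mapping.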

A weird detail that I just noticed, Pashtuns have about the same amount of EHG as populations from the British Isles, while the Kalash have about the same amount as Scandinavian and Eastern European populations, looking at the Haak et al. supplements.

David,

If possible, could you try to model Pashtuns as Andronovo + Paniya + Armenian, and Kalash as Andronovo + Paniya + Georgian? In the absence of South Asian aDNA, the Paniya are great for this sort of thing. Thanks in advance.

Chad Rohlfsen said...

Treemix did show a 16% edge from the root of EHG into all non Papuan ena.

FrankN said...

@Chad: "The Dai clearly have West Eurasian ancestry, and show clear affinity to EHG, MA1, and Nganasan. This is why they are closer to South Asians."
There may be another, much more simple explanation for Dai being close to South Asians: IVC is known to have grown rice, at least during its late stages. While the "homeland" of rice domestication hasn't yet been unambiguously determined, Yunnan ranks at the top of the candidate regions. Since migration of crops tends to be associated with migration of people, I deem a migration of Dai-like people into the Indus Valley by around, say, the first half of the 3rd mill. BC, anything but unlikely.

Conversely, Yunnan is the world's single largest tin producer today - a commodity that is indispensable for bronze production, but only found in mineable concentrations in a few places around the globe. Bronze appears rather early in Yunnan, in high technical and artistic sophistication, geographically disconnected from the main entrance route of bronzeworking into East Asia along the northern branch of the silk road.

https://en.wikipedia.org/wiki/Yunnan#History:
"By this time (2nd ct. BC), agricultural technology in Yunnan had improved markedly. The local people used bronze tools, plows and kept a variety of livestock, including cattle, horses, sheep, goats, pigs and dogs. Anthropologists have determined that these people were related to the people now known as the Tai."

https://en.wikipedia.org/wiki/Dian_Kingdom

Moreover, the standard theory of bronzemaking being disseminated southward from Northern China into SEA is more and more getting into conflict with C14 dating of SEA sites. Discussion is still on-going.

https://en.wikipedia.org/wiki/Bronze_Age#Southeast_Asia

VOX said...

"Chimp Onge Dai Japanese -0.0016 -0.996"

Hi, Tobus, thanks for the stats. It looks like Dai might be slightly closer to Onge, although not at significant levels. According to the analysis of Khrunin et al, the Han, Sherpa, Dai and Malaysians harbour about 19% Australian-like admixture. Does anybody else have any ideas?

Chad Rohlfsen said...

It's equal because the Dai can be modeled as Atayal, Siberian, and Papuan, and the Japanese as Atayal and Siberian. It's the same with ADMIXTURE, and you can see it on a PCA.

Balaji said...

Davidski,

Thanks for the D-stats. It is interesting to compare Georgians to Armenians.

Primate_Gorilla Armenian LBK_EN Kotias -0.0235 -5.95 271793
Chimp Georgian LBK_EN Kotias -0.0003 -0.081 507266

Whereas Armenian has much more LBK_EN than CHG related ancestry, Georgian shows no preference for either LBK_EN or CHG. Clearly Georgian has much more CHG ancestry than Armenian. The Caucasus mountains have been quite effective in impeding gene flow. I had meant to request the statistics for Iranian but had mistyped.

Chimp Iranian LBK_EN Kotias

I expect Iranian to favor LBK_EN over Kotias but I may be wrong and perhaps Iranian too will have no preference.

Open Genomes said...

@a and @Alberto - Thanks! :)

Eurogenes Global9 3-D PCA Plot PC1-PC2-PC3

Eurogenes Global9 PCA plot PC1-PC2-PC3 2-D projection showing populations and some Y haplogroups

Alberto, I think the "secret" of this 3-D PCA Plot is that in fact it does include Africans. The way to examine West Eurasians closely is just to rotate the plot appropriately, and then zoom in real close on that smaller section of Eurasia, and examine the 3-D relationships. There's no way we could have seen the "pull" toward Sub-Saharan Africans in the Palestinians, Bedouin and North Africans which is *not* from the EF Early Farmers unless we have Mota and the other Africans on the plot. The most striking finding is that the EF (Early Farmers) are indeed *the* "Basal Eurasian" branch, and the CHGs (Caucasus Hunter-Gatherers) are in fact *not* any sort of "Basal Eurasian" but something headed out to Ust'-Ishim, the Austroasiatics, and the Austronesians of Taiwan. Call it "ASI/ANI" if you will. ("ASI" includes the Andamanese.) Likewise, the WHG/SHG/EHG group is related to Mal'ta boy and on to the Native Americans, it's a kind of "WHG-ANE" continuum. Of course, this is precisely what we've seen in TreeMix, except there's a bit of confusion between the Early *European* Farmers and the EF "Basal Eurasians".

Given that there are "poles of drift" - or rather, the "points, tail and string of the kite" ;) then perhaps TreeMix would work best with Atayal, Karitiana, Ust'-Ishim, MA-1, Kostenki K14, and the Starcevo Early Farmer KO2 (who seems to be "ultra-Anatolian Farmer"), with Mota and the Ju-Hoan as outgroups. The Papuans / Australians may prove useful too. That way, adding the CHGs (preferably Satsurblia over Kotias) and the various "European" Hunter-Gatherers will reveal the real combination of admixture found in any "test" individual or population.

Also, is Saqqaq Man in the data? What about Clovis Anzick-1 and Kennewick? Given that Native Americans are one "pole" of admixture, these ancient genomes are going to be very important, particularly to fill in the "ANE" migration path between Mal'ta boy and the Native Americans. It seems that since the R1a1* Karelian EHG comes out at "16% Native American", it's very important to have these ancient Native Americans to distinguish any so-called "ANE" from the Caucasus and the area of Tajikistan from something related to Scandinavian Q-L805 which is in a "Native American" Y clade, Q-M930 that also includes Q-M3 (Kennewick) and below Q-M1107 which includes the Q-Z780 sister clade of Q-M930 to which Clovis Anzick-1 belongs. It would seem that this one single Q-L805 represents the unique instance of actual Beringian admixture in north Eurasia.
BTW, the sample I0434 from Khvalynsk is Q-L474 xL56, in the same clade as Saqqaq Man.

I think with these additional ancient American samples the North Eurasian drift towards the Native Americans will become clearer.

Really, all Eurasians are some combination of drift toward or away from KO2, the Ami/Atayal, and the Karitiana, and back towards Mota, except for the Oceanians whose Denisovan admixture pulls them in another direction.

Let's see what these other ancient American genomes do to the PCA and TreeMix.

Davidski said...

Balaji,

Chimp Iranian LBK_EN Kotias -0.0099 -2.857 507266

capra internetensis said...

Would someone who is running D stats mind doing

Chimp Ust_Ishim Karitiana Clovis ?

As a test for artifacts of age differences.

@OpenGenomes

Q-L805 is not the only Beringian suspect, there are also the Eurasian mitochondrial C1 clades (and it is possible that more upstream clades like L330 are Beringian too). We don't know if I0434 is more related to Saqqaq than any given Q1a; Saqqaq is in Q1a1a-NWT01(xM120) specifically, the Khvalynsk man is just Q1a(xQ1a2).

Amerindians are a "pole" of admixture because they have a lot of specific drift, not necessarily because they have any importance as an ancestral population outside the New World - though I do think there is likely to be significant Beringian ancestry in Eurasia, I strongly doubt it is the main source of ANE.

Arch Hades said...

Khvalynsk didn't have wheeled vehicles or the domesticated horse. It was only the later Yamnaya that had those, right? If I'm not mistaken.

If that's true then they might be some very early form of Proto-Indo-European, but their culture isn't classic Proto-Indo-European.

Krefter said...

@Davidski,

Can you post ADMIXTURE or PCA results for the Scythian?

ryukendo kendow said...

@ Davidski

David, is it possible to run the stats that I posted previously? If you are too busy to do more than a few each time, we can postpone barraging you with stats till a time when it's better for you.

@ Open Genomes

Agree with Capra on this. A PCA with fewer Native Americans and more Oceanians, in keeping with how divergent they really are, takes on a strikingly different shape.
(Hats off to Matt for demonstrating this a while ago)

Davidski said...

Arch,

Khvalynsk is generally considered Pre-Proto-PIE, while Yamnaya late PIE. Samara and/or Sredny Stog are generally seen as Proto-PIE.

Capra,

Chimp Ust_Ishim Karitiana Clovis 0.0024 0.38 352350

Krefter,

https://drive.google.com/file/d/0B9o3EYTdM8lQTm1JU2xoYWwwMmc/view?usp=sharing

I'm still working on the ADMIXTURE stuff, but I can tell you that this Scythian has around 10% of Siberian ancestry, and I'm not talking about ANE here. Much more than Finns, but not as much as the Chuvash.

rk,

Must've missed it. Please re-post the list.

Balaji said...

Davidski,

Thank you very much. The statistics that you have provided show that CHG could not have moved from the Caucasus to the Indian Subcontinent. CHG had a hard enough time going from Georgia to Armenia. From the Caucasus to Iran to the Subcontinent there are more formidable geographical barriers.

Iranian has more CHG-related ancestry than Armenian. I think this is because Iranian received CHG-related ancestry both from the Caucasus and from India. Still Iranian has more LBK_EN related ancestry than CHG-related ancestry.

Chimp Iranian LBK_EN Kotias -0.0099 -2.857 507266

All this goes to show that agriculture in South Asia which is at least 10,000 years old did not come with migrants from the Near East. They would have had more of LBK_EN ancestry. It was an indigenous development. This also means that ANI has been in the Subcontinent since the Late Pleistocene. ADMIXTURE analysis further suggests that CHG-related ancestry in South Asia is of the Gedrosia kind different from the Caucasus kind.

Shaikorth said...

These stats should allow for a f4 ratio estimate to check how much ancestry full Austronesians share with ANE, same method was used for Siberians by Flegontov et al.

f4(Loschbour, Gorilla; Atayal, Onge)
f4(Loschbour, Gorilla; MA-1, Onge)
f4(Onge, Gorilla; Loschbour, MA-1) (Z<2 here ensures the Onge is a decent reference)

Tobus said...

@Shaikorth:
Loschbour Gorilla Atayal Onge 0.0006 0.14
Loschbour Gorilla MA1 Onge 0.0569 7.769
Onge Gorilla Loschbour MA1 -0.0086 -1.151

Karl_K said...

@Balaji

"agriculture in South Asia which is at least 10,000 years old did not come with migrants from the Near East. They would have had more of LBK_EN ancestry."

It is an interesting situation for sure. The archaeology shows multiple centers of early farming with local plants. Yet clearly, agriculturalists across the entire fertile crescent very early on started using the exact same domesticated crops. Any useful traits were bred into their own local landraces.

So it seems that in that region 10,000 years ago, at least some seeds and animals were traded and passed around much faster than people were admixing.

And some farming knowledge must have also been passed around. How else could such diverse people all coincidentally domesticate the exact same eight plant species at exactly the same time?

Shaikorth said...

Thanks Tobus, the result here is just 1%, which may be too low to explain the preference of Caucasus/EHG/Siberia for Atayal over Onge.

If you have time, could you do the same stats but MA-1 replaced with Karelia HG and Kostenki14?
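The roughly 1% figure follows from taking the ratio of the first two statistics Tobus posted. This sketch assumes those values can be treated as f4 statistics (the thread does not say whether they are f4 or D values) and gives a point estimate only, with no standard error.

```python
# Values posted by Tobus above: (statistic, Z).
f4_atayal = (0.0006, 0.14)    # Loschbour Gorilla Atayal Onge
f4_ma1 = (0.0569, 7.769)      # Loschbour Gorilla MA1 Onge
f4_check = (-0.0086, -1.151)  # Onge Gorilla Loschbour MA1

# f4-ratio point estimate of the ANE-related share in Atayal.
alpha = f4_atayal[0] / f4_ma1[0]

# Reference check from the comment above: |Z| < 2 for the third statistic.
onge_ok = abs(f4_check[1]) < 2

print(f"alpha ~ {alpha:.3f} ({alpha * 100:.1f}%)")
print("Onge acceptable as reference:", onge_ok)
```

With a numerator whose own Z-score is only 0.14, the estimate is not distinguishable from zero on these numbers.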

ryukendo kendow said...

@ Davidski

I was suggesting to you and Chad methods to break ASI up into its component parts. Realised that you may not notice comments without the '@ Davidski' at the front, will remember this.

The following stats are just to make 100% sure we are tracking Basal Eurasian in ASI.

Primate_Gorilla LBK_EN Austroasiatic Dai
Primate_Gorilla Kotias Austroasiatic Dai
Primate_Gorilla LBK_EN Pulliyar Dai
Primate_Gorilla Kotias Pulliyar Dai

Loschbour LBK_EN Austroasiatic Dai
Loschbour Kotias Austroasiatic Dai
Karelian_HG LBK_EN Austroasiatic Dai
Karelian_HG Kotias Austroasiatic Dai
MA1 LBK_EN Austroasiatic Dai
MA1 Kotias Austroasiatic Dai
Kostenki14 LBK_EN Austroasiatic Dai
Kostenki14 Kotias Austroasiatic Dai

Loschbour LBK_EN Paniya Dai
Loschbour Kotias Paniya Dai
Karelian_HG LBK_EN Paniya Dai
Karelian_HG Kotias Paniya Dai
MA1 LBK_EN Paniya Dai
MA1 Kotias Paniya Dai
Kostenki14 LBK_EN Paniya Dai
Kostenki14 Kotias Paniya Dai

Loschbour LBK_EN Pulliyar Dai
Loschbour Kotias Pulliyar Dai
Karelian_HG LBK_EN Pulliyar Dai
Karelian_HG Kotias Pulliyar Dai
MA1 LBK_EN Pulliyar Dai
MA1 Kotias Pulliyar Dai
Kostenki14 LBK_EN Pulliyar Dai
Kostenki14 Kotias Pulliyar Dai

Is it also possible to run a Treemix with the following populations at 10 migration edges? :
(Chimp Denisova Mota Kotias Paniya Papuan Dai Karitiana Loschbour Karelian_HG)

Last of all, Chad has the Onge, but you cannot have them. It will be great to have a dataset with both Onge and CHG, so is it possible for you to pass the data for CHG to Chad or Tobus in any form?

@ Tobus

Once again, thank you very much for these statistics.

Kharia share drift with Dai to the exclusion of Onge
Chimp Kharia Onge Dai 0.022 7.992
Chimp Kharia Onge Japanese 0.0166 6.218
Hadza Kharia Onge Dai 0.0227 10.255
Ust_Ishim Kharia Onge Dai 0.0231 6.536
LBK_EN Kharia Onge Dai 0.0171 7.504
(Chad, I agree with Shaikorth that ~1% ancestry is probably insufficient to explain a Z-score of 6-7.)

Kharia share drift with Onge to the exclusion of Papuan
Chimp Kharia Onge Papuan -0.0397 -10.151
Kharia Papuan Onge Dai -0.0301 -10.838
Papuan Kharia Onge Dai 0.0301 10.838

Kharia resembles West Asian populations when its ENA is subtracted
Kharia Onge Dai Georgian -0.0032 -1.547
Kharia Onge Dai Lezgin -0.0046 -2.214
Kharia Onge Dai Armenian -0.0024 -1.166
Kharia Onge Dai Abkhasian -0.0043 -2.05
Kharia Onge Dai Balochi -0.007 -3.678
Kharia Onge Dai Brahui -0.0076 -3.935

Chad Rohlfsen said...

I think Dai having West Eurasian ancestry is causing them to look closer than the Onge. It's a weak fit to make the Dai without Siberian and Onge Admixture. Using Onge and Atayal makes a much better fit for ASI.

Open Genomes said...

@capra internetensis, the idea that Native Americans are a "pole of admixture" does not mean that in fact Eurasians have (any substantial) "Beringian" ancestry, aside from the somewhat small clades you mentioned. Rather, the correct term should be a "pole of drift", where Central Siberian migration to the Americas was the end result of a process of isolation and drift we already see in Eurasia, i.e. with Mal'ta boy. Regardless, this "pole of drift" is in fact important for our understanding of Eurasian drift and admixture. As we know, "ANE" was modeled on the derived alleles shared between the Karitiana and Mal'ta boy, so this represents what has been called "ANE", even if the concept may not be entirely accurate regarding more southerly Eurasians such as in the Northeast Caucasus and the other "ANE hotspot" around Tajikistan.

The value in using the ancient Native American genomes is of course that they are much closer in time (and space) to the Eurasian source of the drift. It may be that Saqqaq man is purely Paleo-Siberian (probably a Koryak) and therefore from a completely different source population and migration than Clovis and Kennewick. Between this apparent "Dorset" population in Eastern Canada and Greenland, the Na-Dene related to the Kets and other Yeniseians, the Amerinds / "First Americans" (with an "East Asian" as well as an "ANE" component to their ancestry), and the apparent minor "Papuan" element among a few South American tribes like the Karitiana, we can see there were quite a few populations that contributed to this "pole of drift". This is really why Native Americans are at the extreme of a "triangle" rather than a "line". (Notice too that South American tribes "make a turn" in the general drift in the Americas, due to some additional ancestral element, perhaps this "Papuan" ancestry.)

Regardless, since the Native Americans are at several extremes of drift that was already taking place before the settlement of the Americas, all of these ancestral components accentuate and emphasize this Eurasian drift in a way that would not be possible if they were not on the 3-D PCA.

I can think of other apparent population isolates that are not in this Human Origins Array dataset, namely the Onge and the Tibetans, the Semang of Malaysia and the Aeta of the Philippines. I suspect that the Tibetans in particular will show up at some unusual place within the triangle because of their long isolation due to their physical adaptations to the extreme altitude of the Tibetan Plateau. We can see this in their unusually high percentage of Y haplogroup D, just like the Andamanese and Japanese, other physically isolated East Asian populations.

Perhaps something can be done to "round out" the dataset by including these other isolates along with the ancient Americans?

I suspect that this may create some "pull outward" even for such Siberian-admixed populations like the Karelian Hunter-Gatherers and further clarify the PCA plot.

The main point here is that even a close examination of the PCA of a small region on the plot such as Europe cannot be done properly without including *all* extremes of drift in the same analysis. Without the Atayal, the Karitiana, the Mbuti and the San on the very same plot as the LBK, Corded Ware and Bell Beaker samples, we would never have seen that the CHGs were very different from the Early Farmers ("Basal Eurasians") and headed in the direction of the Austronesians, or that the European Hunter-Gatherers (all three groups) were in fact headed in the direction of the Native Americans. Nor would we have seen the true nature of "LBK" (in fact, EF) admixture in Africa, or the fact that the EFs are the only true "Basal Eurasians" and not at all "Bedouin_B-like".

Open Genomes said...

@David, can we have a Global9 with Clovis, Kennewick, Saqqaq, Tibetans, Onge, Semang, Aeta (or related people), to "round out" the PCA?
It seems reasonable that the Europeans and the CHGs will be "less compressed" if these were on the plot, because they should accentuate the sources of drift in Eurasia. Thanks.

Shaikorth said...

However Atayal gives similar West Eurasian shifts compared to Onge as Dai do, so if there is that kind of ancestry in Dai it should be in Atayal too. This would leave Onge as the one pure ENA reference since Papuans etc. are complicated by archaic admixture.

Gorilla Karelia_HG Onge Dai 0.0162
Gorilla Nganasan Onge Dai 0.0739

Gorilla Karelia_HG Onge Atayal 0.0165
Gorilla Nganasan Onge Atayal 0.0745

postneo said...

While knowledge may have been passed around, the same species of crops and animals were not domesticated everywhere. Even for barley, two different strains were domesticated, with different regional centers.

Alberto said...

After reading FrankN's interesting comment and looking at the stats, it's looking more like most of what we call ASI could be a late migration from SE Asia to India. This migration is also supported by a recent study from National Geographic regarding Y haplogroup O-M95.

This would also be a more parsimonious explanation for the late estimates of admixture between ANI and ASI. It looks quite clear that ANI was in the Indus Valley long before 2200 BC (oldest estimate date of admixture), so it could have been ASI which arrived during the late Harappan period there.

I don't think that all the ANI-ASI will be a single event/migration. It's probably going to be quite more complicated, with different waves at different times, both ways. But it's looking like the biggest event might have been this hypothetical late Bronze Age migration from SE Asia to India.

This would increase the chances of the Harappan DNA (if/when it comes) being pure ANI, which would be quite interesting too.

Karl_K said...

"it's looking more like most of what we call ASI could be a late migration from SE Asia to India"

Then who was in India before that? ANE like people, CHG like people? This should have been a territory with a large sustained population for a very long time. How could they have disappeared with so little a trace in such recent history?

jparada said...

@davidski

So, was the Scythian a Baltic speaker? Anyway, these Baltic peoples seem to be quite isolated: not only do they share most drift with Mesolithic Euros, but now also with an Iron Age steppe individual.

@kurti

Why should Europeans and West Asians fall on an uninterrupted cline? If anything, before the Neolithic they were farther away from each other than they are now.

Chad Rohlfsen said...

The Kharia have more admixture from a Dai/Atayal group. Paniyas should be more Onge like. Onge like people are probably native to South Asia, and I would be surprised if something ANI like dates to the Paleolithic/Mesolithic. That may be why the South Asian cluster is a pain in the ass to break down. It's a mixed and heavily drifted group. I think the Austronesian came later on, more like the Mesolithic to Neolithic timeframe.

Chad Rohlfsen said...

Typo above. I meant to say that I wouldn't be surprised if something ANI like dated back to the Paleolithic/Mesolithic timeframe.

Alberto said...

Yes, before ASI arrived to North India, the people would be something like CHG + ANE. That's still the base of the populations of North India and Pakistan, so they didn't disappear, they just got influx from ASI populations.

Chad is probably right. There was probably an Onge-like component in South India earlier, and during the Bronze Age a SE Asian migration might have taken place, bringing Austro-Asiatic and expanding southern populations to the north.

Probably a complicated history, but in any case the point is mostly about the study about ANI-ASI admixture from a few years back with age estimates between 2200 BC and 900 BC (?). This was taken by many as a proof of Aryan invasions, but it's looking more that what it was detecting was a Dai-like migration to India and the subsequent ASI (a mix of Dai and Onge) expansion to the north.

Hypothetical, of course. But now more parsimonious than the old theory of Aryan Invasion, I think.

postneo said...

ASI could have been there in peninsular and central India for a long time before moving to Harappan areas.

capra internetensis said...

Thanks David

Zero result, no sign of any age effect (at least using genomes with decent coverage).

@Alberto

There was certainly a late Neolithic migration (or multiple waves of migration) from Southern China/Southeast Asia into India (c. 2000 BC?), bringing Austroasiatic languages, polished shouldered axes, and corded ware, as well as Y haplogroup O2a1-M95. Some of the Neolithic Gangetic sites have very early dates, before 6000 BC, but I'm not sure whether these are securely associated with Southeast Asian elements.

But this wave is associated with Austroasiatic tribals, and to a lesser degree with East Indians generally, O2a1 is Holocene age (there are not enough samples but I suspect in East India it is largely or almost entirely the young O2a1a2-F789 clade). South Indians with high ASI have negligible Y DNA O and do not show the Southeast Asian component in HarappaWorld admixture that Munda do (nor any significant East Eurasian outside of what is contained in the South Indian component).

There are earlier connections with Southeast Asia, e.g. Hoabinhian-type lithics, but this is all poorly dated. During the LGM there was mostly horrible desert lying to the northwest of India (though along the edge of the Himalayas and Pamirs was probably OK) while India was covered mostly with savanna grassland and open forest, separated from not too different habitats in Southeast Asia by the Naga Hills.

Altogether I see no reason to think ASI is predominantly due to late gene flow from the East. I also think the Harappans were mainly ANI, but the earliest admixture dates come from Dravidian speakers of South India, and may represent the arrival of Neolithic/Chalcolithic farmers/pastoralists from the north. I guess the situation in the subcontinent was quite complicated, with plenty of gene flow in and out, and with major autochthonous components. It will be very hard to disentangle without aDNA.

capra internetensis said...

The North Indian admixture dates from Moorjani et al are very late indeed, Iron Age and historical era. Considerably later than the appearance of Southeast Asian Neolithic elements in the Ganges valley and the eastward migration of the Harappans. There must have been early admixture events but they are being obscured by late ones.

capra internetensis said...

How about

f4(Onge, Gorilla; Karelia_HG, Loschbour)
f4(Onge, Gorilla; Dai, Loschbour)

for East Asian admixture in EHG?

Rob said...

Alberto
Hi. interesting hypothesis. Like Capra, however, I was going to point out that Y hg O, and austro-asiatic are not common enough in India to account for ASI ? But I'm sure you've thought about this :)

Chad Rohlfsen said...

result: Gorilla Onge Loschbour Karelia_HG -0.0035 -0.526 14410 14510 314720
result: Gorilla Onge Dai Loschbour -0.0593 -12.307 15648 17621 326345
result: Gorilla Onge Dai Karelia_HG -0.0627 -12.192 15043 17056 317554
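
For anyone wanting to sanity-check lines like these: the first number in each result appears to be reproducible from the two site counts as (BABA - ABBA) / (BABA + ABBA), which is the usual D-statistic form. Below is a minimal Python sketch, assuming the columns are D, Z, BABA, ABBA and number of SNPs (that layout is inferred from the numbers themselves, not stated in the thread), and using |Z| >= 3 as the conventional significance cut-off.

```python
# Assumed column layout: pop1 pop2 pop3 pop4 D Z BABA ABBA nSNPs
LINES = """\
result: Gorilla Onge Loschbour Karelia_HG -0.0035 -0.526 14410 14510 314720
result: Gorilla Onge Dai Loschbour -0.0593 -12.307 15648 17621 326345
result: Gorilla Onge Dai Karelia_HG -0.0627 -12.192 15043 17056 317554
"""


def parse(line):
    """Split one 'result:' line into populations, D, Z and the two site counts."""
    fields = line.split()
    pops = tuple(fields[1:5])
    d, z = float(fields[5]), float(fields[6])
    baba, abba = int(fields[7]), int(fields[8])
    return pops, d, z, baba, abba


def d_from_counts(baba, abba):
    """D statistic as the normalised difference of BABA and ABBA site counts."""
    return (baba - abba) / (baba + abba)


ROWS = [parse(line) for line in LINES.splitlines()]

for pops, d, z, baba, abba in ROWS:
    recomputed = d_from_counts(baba, abba)
    # |Z| >= 3 is a common rule of thumb, not something fixed by the thread.
    verdict = "significant" if abs(z) >= 3 else "not significant"
    print(pops, "reported", d, "recomputed", round(recomputed, 4), verdict)
```

On this reading, only the first line (Loschbour vs Karelia_HG) fails to reach significance.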

Ryan said...

"And no, the Scythian can't be modeled as Lithuanian/South Central Asian, because he lacks South Asian ancestry."

Don't the Kalash share a strange amount of drift with Lithuanians? Is the Scythian a ghost population that contributed to both Lithuanians and the Kalash?

Davidski said...

Is the Scythian a ghost population that contributed to both Lithuanians and the Kalash?

No, the early Indo-Europeans from the Bronze Age steppe carrying R1a-M417/Z645 is the ghost population that contributed to Scythians, Lithuanians and South Asians.

There might be some minor Scythian ancestry in Lithuanians. But it can't be much considering the very low level of R1a-Z93 in the East Baltic and Siberian admixture at only a couple per cent, if even that. That Scythian is R1a-Z93 and has around 10% of Siberian admixture.

Alberto said...

@Capra

Thanks, I don't know much about the details of Indian prehistory so it's good to hear a good summary and that it's not in disagreement with what I'm more or less seeing.

Indeed, the Y hg O is restricted to Austro-asiatic and not too relevant in itself, but the tests so far don't seem to show 2 clearly different types of ASI. The ENA in Paniya and in Kharia don't look too different, and both look quite Dai-like, and Dai itself being a mix of Atayal-like and Onge or Papuan-like. So yes, probably a complicated history, but with a result that these components are mixed more or less equally in South Indian and in SE Asian populations.

Maybe further tests will be able to find the difference (like Admixture does, though Admixture could be doing so for other reasons), but for now it looks to me that whichever migrations from SE Asia to India took place seem to have homogenized the ENA component, regardless of hg O or AA language. Or maybe we just don't have any good proxy for the "real" ASI so it shows up as Atayal+Onge/Papuan because Onge is just too drifted and not too related to continental ASI.

Tobus said...

@Shaikforth: If you have time, could you do the same stats but MA-1 replaced with Karelia HG and Kostenki14?

Loschbour Gorilla Karelia_HG Onge 0.1131 16.563
Onge Gorilla Loschbour Karelia_HG 0.0035 0.525

Loschbour Gorilla Kostenki14 Onge 0.06 8.919
Onge Gorilla Loschbour Kostenki14 0.0175 2.604


@ryukendo: is it possible for you to pass the data for CHG to Chad or Tobus in any form?

David has said the CHG data is freely available from the author, so I should be able to get it easily enough. The issue is that it can take a while to process and merge with my existing data set, depending on the format etc., and I don't have the time to dedicate to that at present... maybe this weekend I'll give it a go.

Shaikorth said...

Karelian gives only 0.5% into Atayal; that's definitely too little to explain the significant shift of Karelia HG towards Atayal over Onge. Maybe the formula of the Flegontov paper, though it worked for Siberians and Native Americans, just isn't good enough here.

Kostenki was 1%, but that isn't a very good reference since Onge shares additional drift over Loschbour over it.

Alberto said...

@Shaikorth

I'm not sure if the above stats are correct, but the ones requested by Capra and run by Chad give some 6% Dai into Karelia_HG, so that might be enough to explain it.

Shaikorth said...

That should bring EHG closer to Onge too, assuming it is in the same clade as Dai and Atayal. North Siberians and Native Americans also prefer Atayal over Onge, which should not happen if both Atayal and Onge are fully ENA.

This is probably going to reveal nothing but we might still try

Loschbour Gorilla Motala12 Onge
Onge Gorilla Loschbour Motala12

capra internetensis said...

@Alberto

Unfortunately the sign is wrong, it's -6%, noise result.

Chad Rohlfsen said...

result: Loschbour Gorilla Motala_HG Onge 0.1788 36.462 19709 13729 318936
result: Onge Gorilla Loschbour Motala_HG 0.0047 0.909 13729 13600 318936

I'm not sure where you guys are getting these 1% and 6% numbers from. That is not what qpAdm shows. Best fit for Dai is a mix of Atayal, Nganasan, and Onge. If we have more Karelia into Dai and more Onge into Dai, then both can appear more closely related than they are. Onge are further from West Eurasians than the Dai and Atayal. Just Onge into Dai would make the Dai a bit further from West Eurasians, but the additional West Eurasian into Dai almost evens it out.

bellbeakerblogger said...

@ Alberto, Capra,

You are probably already well aware of this, but it hasn't been mentioned here. The early Mehrgarh folk were largely Sundadonts, which pretty much necessitates ancestry from SE Asia (or Dai-like)

One thing to keep in mind is that Sundaland has been very heavily Sinicized, so a good proxy might be something more akin to Ainu at its northernmost edge (assuming they and Okinawans are slightly more Jomonese, less Yayoi-ized in their make-up).
In that case, this Dai-Mehrgarh component of ASI or whatever, might have had paternal haplogroups more something like C & D..?

Chad Rohlfsen said...

Here's some stats. Dai is closer to Karelia than Onge. Dai is closer to Onge than Atayal and Ami. Ami is significantly closer to Atayal than Dai. They're not all the same.

result: Gorilla Loschbour Dai Atayal -0.0019 -0.683 14224 14277 326345
result: Gorilla Motala_HG Dai Atayal -0.0012 -0.514 14039 14074 321796
result: Gorilla Karelia_HG Dai Atayal 0.0004 0.131 13906 13895 317554
result: Gorilla Karitiana Dai Atayal 0.0029 1.129 14755 14671 329241
result: Gorilla Karitiana Dai Onge -0.0585 -17.356 15151 17034 329241
result: Gorilla Ami Dai Atayal 0.0255 11.476 15341 14579 329241
result: Gorilla Han Dai Atayal -0.0003 -0.140 14901 14910 329241
result: Gorilla Nganasan Dai Atayal 0.0010 0.429 14756 14727 329241
result: Gorilla Nganasan Ami Atayal -0.0013 -0.541 14392 14429 329241
result: Gorilla Nganasan Dai Ami 0.0022 1.182 14761 14695 329241
result: Gorilla Onge Dai Atayal -0.0039 -1.689 14459 14572 329241
result: Gorilla Onge Dai Ami -0.0028 -1.503 14453 14535 329241

Davidski said...

@ bbb

You are probably already well aware of this, but it hasn't been mentioned here. The early Mehrgarh folk were largely Sundadonts, which pretty much necessitates ancestry from SE Asia (or Dai-like).

I have mentioned this in the past in other comment threads. Here is a paper on the topic.

http://pages.uoregon.edu/jrlukacs/Dr.%20John%20R.%20Lukacs%20Website/downloads/MR%203%20dentmorph%20VII%20conf.pdf

FrankN said...

@ Capra: "There was certainly a late Neolithic migration (or multiple waves of migration) from Southern China/Southeast Asia into India (c. 2000 BC?), bringing Austroasiatic languages .."

The Austroasiatic (Munda) is doubtless, but should AFAIK have come from somewhere more south than Yunnan. The issue of the Tai-Kadai homeland appears to be nearly as intensively debated as the IE homeland - as such I refrain from any opinion whether during the Late Neolithic the Dai already lived where they are recorded today, or much further towards the South Chinese coast. In any case, they don't appear to be a particularly good proxy for a "pure" population.

http://irssh.com/yahoo_site_admin/assets/docs/15_IRSSH-113-V2N1.51195721.pdf

Here is another Sino-Indian link that (for climatic/ ecological reasons) possibly went via Yunnan:
https://en.wikipedia.org/wiki/Silk_in_the_Indian_subcontinent

"Recent archaeological discoveries in Harappa and Chanhu-daro suggest that sericulture, employing wild silk threads from native silkworm species, existed in South Asia during the time of the Indus Valley Civilization dating between 2450 BC and 2000 BC, while evidence for silk production in China dates back to around 2570 BC and earlier. The Indus silks were obtained from more than one species, Antheraea and Philosamia (Eri silk). Antheraea assamensis and A. mylitta were widely used. It is widely believed that silk process techniques of degumming and reeling were purely Chinese technology."

Hence, I tend to stick to my "first half of the 3rd millennium" dating. 2000 BC seems to be slightly too young for the move into India, though possibly correct for a "tin explorer and bronze producer" India-to-SW China/ SEA scenario.

Chad Rohlfsen said...

Honestly, It kind of looks like 2-3 clades of ENA (Australoid/Papuan, Onge, Atayal/Ami), with the Atayal/Ami branch having West Eurasian closer to EHG/MA1 than Loschbour.

Chad Rohlfsen said...

Nevermind. Maybe, just two branches. I've got the Papuans as follows..

Onge 95.9%
Denisovan 4.1%

chisq 2.467 tail prob .781509

standard errors were both 0.8%. The fit is worse with Atayal included and standard errors over 30%.

Chad Rohlfsen said...

Oddly though....

Onge 91%
Denisovan 4%
MA1 5%

chisq 1.810 tail prob .770567

slightly better fit, but with std errors between 09-12.3%
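
A quick way to put the two fits side by side is to recompute the tail probabilities. This is only a sketch: the degrees of freedom are not given in the output pasted above, so df = 5 for the two-source model and df = 4 for the three-source model are assumptions, chosen because they reproduce the reported tail probabilities to within rounding of the chisq values. The function `chi2_sf` is a hand-rolled closed form for integer df, not a qpAdm routine.

```python
import math


def chi2_sf(x, df):
    """Upper tail probability of a chi-square variable with positive integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term = math.exp(-half)
        total = term
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return total
    # Odd df: erfc(sqrt(x/2)) plus a finite series in x^(j - 1/2) / (2j - 1)!!
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        total += term
    return total


# (label, chisq, assumed df, tail probability as reported in the comments)
FITS = [
    ("Onge + Denisovan", 2.467, 5, 0.781509),
    ("Onge + Denisovan + MA1", 1.810, 4, 0.770567),
]

for label, chisq, df, reported in FITS:
    print(label, "recomputed", round(chi2_sf(chisq, df), 4), "reported", reported)
```

If those df are right, the three-source model has the lower chisq but also a marginally lower tail probability, so "slightly better" holds for the raw chisq only.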

FrankN said...

@Chad: " I've got the Papuans as follows..

Onge 95.9%
Denisovan 4.1%"

Interesting, and confirming something I have already been supposing for some time. The possible links may be the "Sea Nomads", today scattered in three groups (Andaman Sea, Southern Sumatra, Borneo/ Sulawesi/ Southern Philippines), but possibly a far more widespread phenomenon in ancient times (I wonder whether any DNA analysis has ever been done on them..)

https://en.wikipedia.org/wiki/Sama-Bajau_peoples

Why an "anciently more widespread phenomenon"? For a start, Sulawesi/E. Borneo, i.e. today's epicentre of the Sama Bajau, corresponds to the genetic and linguistic "homeland" of the Malagasy people on Madagascar.
Nearby Halmahera, the largest island of the Moluccas, has been demonstrated as the genetic origin of the Polynesian rat, and is as such believed to be the origin of the Lapita expansion (from 3,000 BC) into Melanesia, Samoa and Tonga. There is evidence for obsidian trade from New Britain to NE Borneo at the end of the 4th mill. BC, a distance of 3,500 km!

http://journals.lib.washington.edu/index.php/BIPPA/article/view/9484/8471

In addition, domesticated coconut from the South Philippines was shipped to Southern Ecuador around 300 BC:
http://link.springer.com/article/10.1007%2Fs10722-008-9362-6

Last but not least, there is the story of the banana: Originally domesticated on New Guinea, with additional hybridisation in the Southern Philippines and a second one somewhere around the South China Sea, all before 3,000 BC. From 2,000 BC, there is archeological evidence of bananas in Pakistan. Most banana terms on the Indian subcontinent can be traced back to *qaRutay, a root that developed in the Northern Philippines. Reflexes of this root are present both on Northern Sumatra and the Nicobars, and along a 'land route' through North Vietnam, Yunnan, Burma and Northern Bangladesh, which makes it difficult to define the migration path. By about the same time at the latest, Papuan/East Indonesian domesticates reached East Africa (East African 'banana'-terms are pre-Bantu substrate, which provides a terminus ante quem). A separate transfer brought bananas from around the Celebes Sea directly, i.e. without the genotypes in question being found in India, Arabia or East Africa, to West Africa, with the first archeological evidence (Cameroon) dating to around 500 BC.
http://www.pnas.org/content/108/28/11311.full

English 'banana' is believed to have been borrowed from Wolof 'banaana', which may be a reflex of the root *punti that is widespread around Eastern Indonesia, and also spread eastward into Melanesia. The origin of Spanish 'platano' is somewhat obscure: It is assumed to have been borrowed from a Carib language, which, however, would imply pre-Columbian presence of bananas in the Caribbean.

In short: A maritime network centered around the Celebes Sea, today's home of the Sama Bajau "sea-nomads", appears to have existed at least from 3,000 BC onwards. Around 300 BC, this network stretched from Cameroon to Ecuador - probably sporadically, but intensively enough to transplant bananas and coconuts, and allow for colonisation of Madagascar and Polynesia. The Andamans (plus Sri Lanka, Maldives etc.) would have been apt stopovers.
If anybody gets bored over long winter nights and feels like running some admix statistics along the above-mentioned routes - say Buginese (Sulawesi) vs. Onge, Mbum (Cameroon), Amerindians from Ecuador - I'd be curious about the results.

capra internetensis said...

@Chad

We are doing f4 ratio estimates for the connection between Dai and EHG, so far with no success.

@bellbeakerblogger

I am somewhat skeptical of the value of dental morphology in tracing long-term genetic relationships (as opposed to the appearance of a novel population, etc). Sundadonty in particular seems to be a relatively generic pattern, possibly close to the ancestral form; e.g. Africans and some mixed populations (like South Siberians) cluster near to Sundadonts.

@FrankN

I expect it was complicated, as usual. The questions of Daic, Tibeto-Burman, and Austroasiatic homelands have already seen some genetic study at relatively low resolution, but the proliferation of full sequences and the likelihood of more ancient DNA from China are very promising.

Nirjhar007 said...

As I wait for the Indian DNA to wash out some of the bullshit here, here is something related to the anthropology:
There appear to be two types of hunters in the Holocene. The first type clustered strongly with Upper Paleolithic Europeans and was concentrated in the Ganges plain and further west; some even find that these north Indians were taller than other Mesolithic populations of Eastern and Western Europe! The second type of hunter contrasted with the Ganges type and was concentrated in the South. A good hypothesis is that the Ganges type was perhaps related to ANE and the southern type was related to South Eurasian (ASI).
The Harappans were largely Caucasoid, the same as the modern North Indian populations around Haryana etc.
I think we just have to wait for the DNA to solve the riddle.
http://scholarspace.manoa.hawaii.edu/bitstream/handle/10125/29101/AP_V49No1_singh.pdf?sequence=1

Nirjhar007 said...

^ Make it ANI instead of ANE.

ryukendo kendow said...



@ Tobus

Thank you for taking this up. It's because of people like you that the comments section of Eurogenes as a platform far surpasses any other place in the gene blogging world :)

@ Capra @ Chad

Capra, thanks for the heads up about the Kharia being Austroasiatic. Indeed, their 'Austroasiatic ancestry' would have split off from Dai far later than the ASI ancestry split off from (hypothetically) the Onge, meaning that the (Ancestral Austroasiatic, Dai) clade has a far longer drift path than the (ASI, Onge) clade, which would bias the Kharia strongly towards Dai even if ASI is more closely related to the Onge.

Chad, your points make sense too, and both your and Capra's points may explain the stats with Kharia. To validate this, could we run:
Chimp Pulliyar Onge Dai
Chimp Pulliyar Onge Japanese
Chimp Pulliyar Onge Papuan
Kharia Pulliyar Onge Dai
Hadza Pulliyar Onge Dai
Ust-Ishim Pulliyar Onge Dai
Papuan Pulliyar Onge Dai
LBK_EN Pulliyar Onge Dai

Any other ASI-rich population would do, should the Pulliyar not be in useable form.
Tobus, do you have the Paniya in useable form?
Chimp Paniya Onge Dai
Chimp Paniya Onge Japanese
Chimp Paniya Onge Papuan
Kharia Paniya Onge Dai
Hadza Paniya Onge Dai
Ust-Ishim Paniya Onge Dai
Papuan Paniya Onge Dai
LBK_EN Paniya Onge Dai


@ Alberto
I don't think ASI in India can be that young. Many populations with ENA ancestry contain high levels of the EDAR mutation for thick straight hair, which swept to fixation in East Asians and Native Americans 10ky ago, and all populations carrying East Asian post-neolithic ancestry, incl Southeast Asians, Polynesians and Indians such as the Austroasiatics, have high levels of EDAR, while all ENA populations without East Asian post-Neolithic ancestry, such as Papuans and Onge, do not.

http://blogs.discovermagazine.com/gnxp/files/2010/10/edar1.png
http://blogs.discovermagazine.com/gnxp/files/2010/10/edar2-300x206.png

All Dravidians in that paper (Metspalu 2010) have no EDAR at all, even the ones interspersed among the Austroasiatics, so whatever Dai-like ancestry they have must have split from Dai-like peoples more than 10 ky ago. The only ones with EDAR are in fact some Tharu people, who have recent East Asian ancestry and whose language contains a substrate.


postneo said...

Yes, EDAR-like hair is insignificant except in the north east.
The dominant hair phenotype resembles that of either Australian Aborigines or Veddas.

Surprisingly, similar hair traits are not seen in the intervening Papuans, Fijians, or SE Asians.
Body hair is also higher.

http://australiaresorts.us/cat10/australia-aborigines.html
http://www.pbase.com/jonathanthomson/image/115501053
https://en.wikipedia.org/wiki/File:1981_event_Australian_aboriginals.jpg





Tobus said...

@ryukendo:
I don't have Pulliya or Paniya (unless they're also known by different names). Do you know which data set they're from? If I'm going to rebuild with CHG I might as well add these pops too.

Karl_K said...

"EDAR mutation for thick straight hair, which swept to fixation in East Asians and Native Americans 10ky ago,"

It was surely long before 10ky ago, as it must have been near fixation for both the founding Native American populations and Chinese Neolithic populations.

Was there any data on Anzick-1 or Kennewick man on the EDAR variant? I assume MA1 did not carry the 370A variant, but is that actually the case?

ryukendo kendow said...

@ Davidski

David, are you using the Behar et al dataset Paniya?


@ Tobus

If he is, here they are. This set is a gold mine.

http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE21478

ryukendo kendow said...

@Karl K

Yeah, 10ky is a conservative estimate.

Ebizur said...

Ryukendo wrote,

"...all populations carrying East Asian post-neolithic ancestry, incl Southeast Asians, Polynesians and Indians such as the Austroasiatics, have high levels of EDAR, while all ENA populations without East Asian post-Neolithic ancestry, such as Papuans and Onge, do not."

Chaubey et al. (2011) have published the following figures for the frequency of the 1540C allele of the EDAR gene in their Indian samples grouped by language family:

Language group                  n     1540C

Tibeto-Burman                   57    0.61
Austroasiatic (Khasi-Aslian)    20    0.40
Austroasiatic (Munda)           379   0.05
Indo-European                   338   0.01
Dravidian                       283   0.00

61% in Tibeto-Burmans (but with only 57 samples), 40% in Khasis (but with only 20 samples), 5% in Mundas, 1% in Aryans, 0% in Dravidians.

The frequency of EDAR 1540C does appear to be moderately high in the Khasis (though not nearly fixed as it is in e.g. Native Americans, northern Han Chinese, or Koreans), but it is actually quite low in Kolarian populations of India. Rather than saying that "Indians such as the Austroasiatics...have high levels of EDAR," I think it would be prudent to say that Munda-speaking populations exhibit non-zero frequencies of the EDAR 1540C allele.
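
To put a rough number on the "only 57 samples" and "only 20 samples" caveats, here is a small Python sketch that attaches 95% Wilson score intervals to the frequencies quoted above. It treats n as the number of independent observations behind each frequency; whether n counts individuals or chromosomes is not stated here, so that is an assumption, and the resulting intervals are illustrative rather than anything reported by Chaubey et al.

```python
import math

# (language group, n, frequency of EDAR 1540C) as quoted in the comment above
FREQS = [
    ("Tibeto-Burman", 57, 0.61),
    ("Austroasiatic (Khasi-Aslian)", 20, 0.40),
    ("Austroasiatic (Munda)", 379, 0.05),
    ("Indo-European", 338, 0.01),
    ("Dravidian", 283, 0.00),
]


def wilson_interval(p, n, z=1.96):
    """Wilson score interval for a proportion p observed in n trials."""
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the boundaries.
    return max(0.0, centre - half), min(1.0, centre + half)


for group, n, p in FREQS:
    lo, hi = wilson_interval(p, n)
    print(f"{group:30s} n={n:<4d} p={p:.2f}  95% interval {lo:.3f} to {hi:.3f}")
```

Under that assumption the Khasi interval is very wide, while the Munda, Indo-European and Dravidian intervals stay close to zero, which fits the more cautious wording suggested above.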

Alberto said...

@Capra

I would think the -ve sign is irrelevant there, no? I mean, it's just because Chad changed the order of the Onge and Gorilla, so both results are negative. And -/- = + (the results would be -ve only if one was -ve and the other +ve).

@Shaikorth

But it doesn't seem very clear that Dai and Onge form any kind of tight clade. The 6% Dai in Karelia_HG is probably more from a Han-like source from Siberia, so not related to Onge.

@RK

Yes, I'm not saying that ASI didn't actually exist in South India long before as a specific component. But with the samples we have now and the tests run so far, it doesn't seem to show a specifically different pattern from SE Asian. Maybe it's just that we don't have the right samples to see it, but it looks strange that Paniya is no more Onge-like vs. Atayal-like than Dai is. Some kind of homogenization seems to have taken place, even if it didn't bring AA language, Y hg O, EDAR or straight hair to Paniya (or the opposite to Dai). But let's see whether further tests can actually find differences in the components or not.

Sisophon said...

Razib shared data which included Paniya samples a few years ago, but I recall that some of the samples looked like they were mislabeled. Or there are two unrelated populations of Paniya? Anyway, when you get the Paniya samples, please check that you have a single population before analyzing them as a meaningless mixture.

In the samples I have, GSM536916 is not the same population as GSM536806, GSM536807 and GSM536808 but they are all labeled Paniya.

And it is possible that I made some copy and paste mistake when I was first learning how to work with the data following Razib's tutorials. I am not an expert in this.

Garvan


Davidski said...

The Paniya I have are all pretty much the same.

Davidski said...

Open Genomes,

Top 9 eigenvectors for Clovis, Kennewick and Saqqaq.

https://drive.google.com/file/d/0B9o3EYTdM8lQMFBkS1F1ZUJIMkE/view?usp=sharing

The latter two, however, were missing quite a few markers in this analysis. So they're a bit iffy.

Btw, I don't have the Asian populations you specified.

capra internetensis said...

@Alberto

result: Gorilla Onge Loschbour Karelia_HG -0.0035 -0.526
result: Gorilla Onge Dai Loschbour -0.0593 -12.307

The position of Loschbour is switched, so the negative result of the numerator is really positive, and overall it is negative.

Since Loschbour is insignificantly closer to Onge than Karelia_HG is, using these references Karelia_HG appears to have no East Eurasian ancestry at all.
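
The sign bookkeeping can be sketched in Python. This assumes the reported D values are used as stand-ins for f4 (as in the comments above), that swapping the two populations within either pair flips the sign of the statistic, and that the ratio wanted is f4(Onge, Gorilla; Karelia_HG, Loschbour) / f4(Onge, Gorilla; Dai, Loschbour). The helper name `reorient` is made up for illustration.

```python
def reorient(stat, value, target):
    """Re-express f4(stat) = value in the population order given by target.

    Only swaps within the first pair or within the second pair are handled;
    each such swap flips the sign of the statistic.
    """
    (a, b, c, d) = stat
    (ta, tb, tc, td) = target
    sign = 1.0
    if (a, b) == (tb, ta):
        sign = -sign
    elif (a, b) != (ta, tb):
        raise ValueError("first pair does not match target")
    if (c, d) == (td, tc):
        sign = -sign
    elif (c, d) != (tc, td):
        raise ValueError("second pair does not match target")
    return sign * value


# Values as posted: Gorilla Onge Loschbour Karelia_HG and Gorilla Onge Dai Loschbour
numerator = reorient(
    ("Gorilla", "Onge", "Loschbour", "Karelia_HG"), -0.0035,
    ("Onge", "Gorilla", "Karelia_HG", "Loschbour"),
)
denominator = reorient(
    ("Gorilla", "Onge", "Dai", "Loschbour"), -0.0593,
    ("Onge", "Gorilla", "Dai", "Loschbour"),
)

alpha = numerator / denominator
print(f"numerator={numerator:+.4f} denominator={denominator:+.4f} ratio={alpha:+.3f}")
```

With both statistics put in the same orientation the ratio comes out at roughly -6%, i.e. the wrong sign for any Dai-related admixture into Karelia_HG.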

Alberto said...

@Capra

Ah, you are right. I thought only Onge and Gorilla were switched in both stats, but Loschbour is also switched in the first one, but not in the second one.

I think that stat would work with Samara_HG, but not with Karelia_HG. They might have quite different affinity to Onge.

Simon_W said...

@ Open Genomes

Interesting work your 3D PCA, thanks for sharing. But I've made some observations that seem kind of odd:
- The CHG are in no way outliers but cluster closely with some modern people from SE Europe, West Asia and the Caucasus.
- The BA Armenians plot very far from each other. One is like a true outlying pole of genetic variation, much more than the CHG, while another one plots close to central European Bell Beakers, Sintashta, Swedish Battle Axe and modern Latvians! I didn't see anything that extremely northern or divergent among the BA Armenians in previous analyses. So in this PCA the IA Armenian can be modeled as a mix of different BA Armenians.
- A modern Makrani plots close to Estonians, Loschbour and Bichon. That's too odd to be true.
- The Andronovo people are extremely diverse. Some close to Corded Ware, another one far off in the Aleut area. I think that may be the admixed one, so this observation is less odd than the others.

Kristiina said...

Open Genomes, were you able to add Clovis, Kennewick and Saqqaq to your PC analysis or to your 3D model? I find your model very illustrative and advanced!