Wednesday, July 6, 2016

qpAdm tour of Iran

Just wanted to see if I could model Early Neolithic versus Chalcolithic Zagros farmer ancestry in present-day Iranians using qpAdm. I reckon I can, more or less. The outcomes below are all fairly solid statistical fits, especially considering the complexity of the models and the close similarity between the Early Neolithic and Chalcolithic Zagros farmers. Update 23/08/2016: I added most of the Iranian Zoroastrians from Broushaki et al. 2016 to the analysis.


Iran_Chalcolithic 0.136 ± 0.121
Iran_Neolithic 0.631 ± 0.152
Yamnaya_Samara 0.164 ± 0.033
Han 0.026 ± 0.017
Yoruba 0.044 ± 0.013

Iran_Chalcolithic 0.723 ± 0.078
Iran_Neolithic 0.106 ± 0.079
Yamnaya_Samara 0.130 ± 0.024
Han 0.041 ± 0.011

Iran_Chalcolithic 0.558 ± 0.066
Iran_Neolithic 0.209 ± 0.065
Yamnaya_Samara 0.178 ± 0.022
Han 0.055 ± 0.010

Iran_Chalcolithic 0.617 ± 0.064
Iran_Neolithic 0.181 ± 0.062
Yamnaya_Samara 0.148 ± 0.022
Han 0.054 ± 0.010

Iran_Chalcolithic 0.592 ± 0.061
Iran_Neolithic 0.172 ± 0.061
Yamnaya_Samara 0.209 ± 0.021
Han 0.027 ± 0.010

However, please note that despite the close similarity between the Early Neolithic and Chalcolithic Zagros farmers, the latter did not, for the most part, descend from the former. In fact, it's very likely that the Chalcolithic farmers were largely, or perhaps even entirely, derived from newcomers to present-day Iran from somewhere to the west of the Zagros Mountains (see here).

It's true that in the basic four-way qpAdm model in Lazaridis et al. the Chalcolithic Zagros farmers are largely modeled as Neolithic Zagros farmers (or Iran_N). However, a more comprehensive analysis in the same paper explains them as a mixture of Caucasus Hunter-Gatherers (CHG), Neolithic farmers from the Levant, and Neolithic Zagros farmers, with admixture ratios of 0.631, 0.202 and 0.167, respectively.

I can basically reproduce the same model with the outgroups listed above, except with Israel_Natufian in place of Levant_Neolithic, because I have to use Levant_Neolithic as one of the reference populations.

Caucasus_HG 0.522 ± 0.111
Iran_Neolithic 0.246 ± 0.108
Levant_Neolithic 0.232 ± 0.026
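A quick way to judge how close this reproduction is to the published ratios is to express each difference in units of the standard error reported here. The sketch below does only that. It assumes the published "Neolithic farmers from the Levant" component corresponds to the Levant_Neolithic label, and it ignores the uncertainty on the published estimates, which is not quoted in this post.

```python
# Published three-way model for the Chalcolithic Zagros farmers, as quoted in the post.
published = {"Caucasus_HG": 0.631, "Levant_Neolithic": 0.202, "Iran_Neolithic": 0.167}

# Reproduction from this post: (estimate, standard error).
reproduced = {
    "Caucasus_HG": (0.522, 0.111),
    "Iran_Neolithic": (0.246, 0.108),
    "Levant_Neolithic": (0.232, 0.026),
}


def gap_in_se(published, reproduced):
    """Absolute difference between the two estimates, divided by the reproduction's SE."""
    return {
        pop: abs(published[pop] - est) / se
        for pop, (est, se) in reproduced.items()
    }


gaps = gap_in_se(published, reproduced)
for pop, gap in gaps.items():
    print(f"{pop:18s} {gap:.2f} SE")
```

Every component lands within about 1.2 standard errors of the published value, which is consistent with calling the two models basically the same.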

The qpAdm algorithm is freely available on GitHub here. All of the present-day and ancient samples are freely available at the Reich Lab website here.
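For anyone who wants to try a run like these, here is a minimal sketch of how the qpAdm input files might be generated. The parameter keys (genotypename, snpname, indivname, popleft, popright, details) are the ones qpAdm reads from its parameter file, and the left list starts with the target followed by the sources. Everything else is an assumption for illustration: the "dataset" file prefix, the file names, and the Outgroup_* labels are hypothetical placeholders, not the outgroups used in this post.

```python
import tempfile
from pathlib import Path


def write_qpadm_inputs(workdir, prefix, target, sources, outgroups):
    """Write left/right population lists and a qpAdm parameter file into workdir."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    left = workdir / "left.txt"
    right = workdir / "right.txt"
    par = workdir / "qpadm.par"

    # Left list: target population first, then the candidate sources.
    left.write_text("\n".join([target, *sources]) + "\n")
    # Right list: the outgroups.
    right.write_text("\n".join(outgroups) + "\n")

    par.write_text(
        f"genotypename: {prefix}.geno\n"
        f"snpname: {prefix}.snp\n"
        f"indivname: {prefix}.ind\n"
        f"popleft: {left.name}\n"
        f"popright: {right.name}\n"
        "details: YES\n"
    )
    return par


# Hypothetical example: placeholder dataset prefix and outgroup labels.
workdir = tempfile.mkdtemp()
par_path = write_qpadm_inputs(
    workdir,
    prefix="dataset",
    target="Iranian_Lor",
    sources=["Iran_Chalcolithic", "Iran_Neolithic", "Yamnaya_Samara", "Han"],
    outgroups=["Outgroup_A", "Outgroup_B", "Outgroup_C", "Outgroup_D", "Outgroup_E"],
)
print(par_path.read_text())
```

With real file names and population labels in place, the run itself would be `qpAdm -p qpadm.par` from inside that directory.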

Davidski said...

I'm gonna swing back to using qpAdm now. The recently published ancient samples make it a lot easier and more effective to use.

ryukendo kendow said...

Why don't we drop down one level and use Iran_Neolithic, Anatolia_Neolithic, Levant_Neolithic, CHG, and Yamnaya, instead of the later admixed populations? This may give us a bit more insight, especially as there's still wide variance in the proportions estimated above, which probably doesn't reflect reality.

ryukendo kendow said...

To be fair though, pushing the Iran_N percent in Iran Chl into the estimates for present day Iranians will make them about a third Iran_N, which is probably around the actual figure.
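The arithmetic behind that "about a third" can be sketched as follows: total Iran_N is the direct Iran_Neolithic coefficient plus the Iran_Chalcolithic coefficient multiplied by the Iran_N share of Iran_Chalcolithic. This is only a back-of-the-envelope sketch. It assumes the 0.246 share from the three-way Iran_Chalcolithic model in the post (the published figure quoted there, 0.167, could be swapped in), it uses the four four-source models from the post, and it ignores standard errors entirely.

```python
def total_iran_n(direct_iran_n, iran_chl, iran_n_share_of_chl):
    """Direct Iran_N plus the Iran_N carried indirectly via Iran_Chalcolithic."""
    return direct_iran_n + iran_chl * iran_n_share_of_chl


# (Iran_Chalcolithic, Iran_Neolithic) coefficients from the four-source models in the post.
models = [(0.723, 0.106), (0.558, 0.209), (0.617, 0.181), (0.592, 0.172)]

# Iran_N share of Iran_Chalcolithic, taken from the three-way model in the post.
share = 0.246

totals = [round(total_iran_n(iran_n, chl, share), 3) for chl, iran_n in models]
print(totals)
```

On these inputs the totals fall between roughly 0.28 and 0.35. The five-source model with the Yoruba component is a different case: its direct Iran_Neolithic coefficient is already 0.631, so its total comes out near 0.66.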

Seinundzeit said...


Looks pretty solid.

Can you see what happens when the Onge are included in the modelling (obviously, only once you have the time, and inclination, to do so)?

I think that'll be interesting. Thanks in advance.

ryukendo kendow said...

@ David,

On another note, the stats for Kostenki, El Miron etc. did not display much interesting behaviour. The slight similarity of SW European Neolithics and La Brana to El Miron is recovered, but there is no special similarity of Levantine Neolithics and Iran Neolithics to Kostenki. If the lower percentages for basal Eurasian in the Middle East are validated and there is some UHG in the Middle East, it must have split off before WHG and ANE separated, showing up as excess similarity to none of Kostenki, El Miron, Mal'ta or Loschbour, but showing up as excess generalised West Eurasian affinity as a whole.

Davidski said...


Why don't we drop down one level and use Iran_Neolithic, Anatolia_Neolithic, Levant_Neolithic, CHG, and Yamnaya, instead of the later admixed populations?

Not working. Probably too complex for this program. Dropping Anatolia_Neolithic doesn't help.

If the lower percentages for basal Eurasian in the Middle East are validated and there is some UHG in the Middle East, it must have split off before WHG and ANE separated, showing up as excess similarity to none of Kostenki, El Miron, Mal'ta or Loschbour, but showing up as excess generalised West Eurasian affinity as a whole.

We'll need direct evidence from across space and time in the Near East to make sense of this. Right now I don't understand it at all.


Can you see what happens when the Onge are included in the modelling?

It only seems to work for the Mazandarani and Persian Iranians, and the Onge really cuts into the Iran_Neolithic. What might be happening here is that their Iran_Neolithic ancestry might actually be from local foragers like the Hotu individual, so the algorithm is compensating with the Onge.

Iran_Chalcolithic 0.651 ± 0.091
Iran_Neolithic 0.079 ± 0.108
Yamnaya_Samara 0.180 ± 0.022
Andamanese_Onge 0.062 ± 0.038
Han 0.028 ± 0.018

Iran_Chalcolithic 0.658 ± 0.093
Iran_Neolithic 0.122 ± 0.109
Yamnaya_Samara 0.149 ± 0.022
Andamanese_Onge 0.031 ± 0.039
Han 0.040 ± 0.019

Seinundzeit said...



That's a fascinating pattern. Can't wait to see Upper Paleolithic/Mesolithic aDNA from Central Asia/South Asia, that'll give us a clear idea.

Roy King said...

Very impressive analysis! Do you have any data on modern Iranians between Lorestan and the Bandari? I think that your results might offer insight into the Elamite people and language. My first guess is that the Elamite civilization reflects Iran_Chalcolithic, and, indirectly, Halafian influences. But it would be helpful to look at the autosomal data of people from the Ilam Province region.

postneo said...

"Obviously, this doesn't square too well with the idea of a Proto-Indo-European homeland in the Zagros Mountains of western Iran, does it?"

It squares as well as anything else:

Do you know what language the Halafians spoke? Do you know what language Iran Chl spoke?
What if the Halafians brought pre-PIE or adopted PIE in Iran, or PIE arose spontaneously in Iran after all this, or was brought to Iran from somewhere else after all these events?

All of these remain open.

For the king said...

@Roy The Iranian Shirazi (Persian) sample is right between Iranian Bandaris and Lors. Proto Elamite tablets are found everywhere in Iran, including the far eastern edges. However, the city of Susa (SW Iran) has the most Proto Elamite tablets. The Elamites were probably Iran ChL like, and the Sumerians probably had a lot of Iran ChL like admixture. Also, people from Ilam province are closest to the Iranian Lor samples, followed by the Persian/Shirazi samples.

Samuel Andrews said...

These results are consistent with D-stat based results. "Iranian_Bandari" scoring mostly Iran_Neo as opposed to Iran_Chl and some African though is surprising. Are they a genetic outlier in Iran?

David, can you also do qpAdm for Assyrians? In D-stats they're pretty similar to Iranians. I think they also have some Steppe admixture.

Davidski said...


These are the only regional samples currently available from Iran. Here are their geographic coordinates:

Bandari 27.183 56.267
Lor 33.465 48.339
Mazandarani 36.529 52.671
Persian 29.591 52.584


The PIE homeland question is not as open as you think and hope. There's been 200 years of study on the topic, and historical linguistics, ancient DNA and archeology are converging very nicely in favor of the steppe Kurgan hypothesis.


There's something missing for Assyrians in this model. Adding Anatolia_Neolithic improves it, but it'd take a while to find a really good model for them.

Anatolia_Neolithic 0.288 ± 0.069
Iran_Chalcolithic 0.528 ± 0.100
Yamnaya_Samara 0.128 ± 0.026
Han 0.039 ± 0.011
Yoruba 0.017 ± 0.006

Btw, Bandari Iranians have Afro-Iranian admixture.

Roy King said...

The Persians are from Shiraz, like "For the King" suggested. Thus the Elamite area (near Shiraz) also is predominantly Iran_Chalcolithic. If Elamite constituted a substratum language in Mesopotamia (see Speiser, 1930) with its reduplicating syllabic typology (see Banana language, e.g. Inana, Kubebe, etc.), then the Iranian Chalcolithic could very well be a post-Neolithic expansion across the Zagros.

Roy King said...

How do the Brahui model in your admixture analysis? Are they also predominantly Iran_Neolithic rather than Iran_Chalcolithic?

ryukendo kendow said...

@ Roy
Here is a previous model for the Brahui:

[1] "distance%=1.3504 / distance=0.013504"

"Iran_Chalcolithic" 38.4
"Afanasievo" 16.15
"Andamanese_Onge" 13.6
"Armenia_Chalcolithic" 12.95
"Armenia_MLBA" 6.5
"Satsurblia" 6.5
"Dai" 3.8
"Eastern_HG" 1.6
"Karitiana" 0.5

But then, Iran_N seems not to be represented well in the nMonte datasheets, so let's wait and see.

Nevertheless, are you suspecting a spread of an Iran_Chl-like population along with Elamo-dravidian? Especially as Haplogroup J2a, whose distribution in S Asia matches the boundaries of the Indus Valley, emerges only in Iran_Chl.

ryukendo kendow said...

@ David

Just pointing out that your treemixes have, ever since Qiaomei Fu's genomes, grouped Basal-carrying populations together in a clade parallel to ANE and WHG, with only 5-20% basal contribution from a 'Basal Eurasian' edge, with the rest of their ancestry from their own clade, and that EHG/ANE genomes have not been sending edges into Iran_N or CHG, all of which is a suggestive pattern.

Davidski said...


Using the same outgroups as above, this is a pretty decent model actually, apart maybe from the standard error for the Iran_Chalcolithic admixture estimate being higher than the estimate itself.

Iran_Chalcolithic 0.093 ± 0.098
Iran_Neolithic 0.549 ± 0.122
Yamnaya_Samara 0.251 ± 0.024
Andamanese_Onge 0.048 ± 0.044
Han 0.059 ± 0.021

In any case, it's very similar to the result in Laz et al., despite the somewhat different outgroups and the extra ancient Iranian reference population.
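To put the caveat about the standard error in more concrete terms, each coefficient can be divided by its standard error to get a rough z-score. The sketch below does that for the five-source model in this comment and also checks that the coefficients sum to roughly 1. It is only a rule-of-thumb reading of the output, not a substitute for qpAdm's own fit statistics.

```python
# Five-source model from this comment: (estimate, standard error).
model = {
    "Iran_Chalcolithic": (0.093, 0.098),
    "Iran_Neolithic": (0.549, 0.122),
    "Yamnaya_Samara": (0.251, 0.024),
    "Andamanese_Onge": (0.048, 0.044),
    "Han": (0.059, 0.021),
}


def z_scores(model):
    """Estimate divided by its standard error, per source population."""
    return {pop: est / se for pop, (est, se) in model.items()}


z = z_scores(model)
coefficient_sum = sum(est for est, _ in model.values())

for pop, score in z.items():
    print(f"{pop:18s} z = {score:5.2f}")
print(f"sum of coefficients = {coefficient_sum:.3f}")
```

A z-score below 1 for Iran_Chalcolithic is just another way of saying the standard error exceeds the estimate, so on these numbers that component can't be distinguished from zero, while Iran_Neolithic and Yamnaya_Samara are well clear of it.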

FeyliDNAprojectadministrator said...

Hello David, I'd highly appreciate it if you could model the Feyli samples here with the above group (Iran_Chalcolithic, Han, Natufian, Yamnaya_Samara), or if it would be possible to switch out the Yamnaya for Sintashta/Andronovo. Perhaps also southern Kurds if you have the samples available.

I've given you access to the Feyli samples here:

Looking forward to your response.

Davidski said...

Hang on, I'll have a look. But I reckon you need a Feyli project analyst to do this properly, because it would take a few weeks to explore these samples in a useful way.

For the king said...

Interesting that Bandaris have higher Iran N than Brahuis, 2 of the samples have +10% SSA tho. It would be useful to remove those outliers. The Feylis would be almost Identical to the Lor samples we already have.

Roy King said...

"Nevertheless, are you suspecting a spread of an Iran_Chl-like population along with Elamo-dravidian? Especially as Haplogroup J2a, whose distribution in S Asia matches the boundaries of the Indus Valley, emerges only in Iran_Chl."

Yes, that's exactly my hypothesis--namely, that J2a may track Elamite or perhaps Elamo-Dravidian. What seems to be the case is that the Iran_Chalcolithic samples may better represent the spread of Elamite than the Iran_Neolithic samples.

Rob said...

I commented to the same effect on the last thread with regard to the Chalcolithic turnover in Iran:

"Anyhow, whilst SW Iran (The Zagros) might be interesting for Elamite ethnogenesis, ....."

I think the Zagros area has always been favoured for * Elamo-Dravidian

ryukendo kendow said...

Roy, in the previous paper on rapid recent expansion of Y-DNA lineages J was classed with E as a lineage with 'Agricultural' expansion of subclades, while R1a and b were classed with a Bronze Age expansion with a flatter and more star-like burst than J. Now that J1 and J2 appear in agropastoral populations only in the Chalcolithic and Bronze Ages, what do you make of this?

I suspect the situation in East Asia and the ME and SC Asia may be more similar to the situation in Europe than was made out before, with R2 playing an analogous role to G2 in Europe. Many of the lineages with 'agricultural' expansions may turn out to be post-initial Neolithic. It may well be the case that no major lineages present today date back to the initial neolithic at all--those all being rendered minor lineages due to later overprinting. But this is just a suspicion, which needs to be backed by more sampling.

Davidski said...

Here are a few models for the Feyli Kurds. Yeah, very similar to the Iranian_Lor set, but it's nice to see such consistent results.

Iran_Chalcolithic 0.707 ± 0.070
Iran_Neolithic 0.125 ± 0.072
Yamnaya_Samara 0.139 ± 0.023
Han 0.029 ± 0.011

Iran_Chalcolithic 0.615 ± 0.070
Iran_Neolithic 0.174 ± 0.066
Andronovo 0.185 ± 0.028
Han 0.026 ± 0.011

Iran_Chalcolithic 0.626 ± 0.066
Iran_Neolithic 0.138 ± 0.062
Sintashta 0.191 ± 0.030
Han 0.045 ± 0.010

Btw, modeling anyone present-day as part Natufian is probably not a good idea.

Aram said...

Iran_Chl looks Kassitic to me.

"Lurs are a mixture of aboriginal Iranian tribes, originating from Central Asia and the pre-Iranic tribes of western Iran, such as the Kassites (whose homeland appears to have been in what is now Lorestan) and Gutians. Michael M. Gunter states that they are closely related to the Kurds but that they 'apparently began to be distinguished from the Kurds'"

Aram said...


Imho this Iran_Chl expanded into Central Asia. Y Dna points to that.
Can You test some Central Asian Turks?

MfA said...

@Dave, can you try Adana23113 as well?

What's with the extra Iran_N over Iran_Chl? One would expect Iran_Chl to cover for modern Iranians and Kurds, but apparently not. Could it be that Iranics from Central Asia brought extra Iran_N along with the steppe admixture from areas like BMAC?

MfA said...

I think there should also be extra Levant_BA, Anatolia Chl type ancestries. Maybe the extra Iran_N acts like a balancing factor to accommodate them.

MfA said...

@Arame, Lors started to be distinguished from Kurds around the 18th century and onwards. Many Little Lors still identify as Kurds, while Great Lors only as Lors.

Aram said...


Ok. I didn't know that. Imho Kurds will have more Gutian like ancestry. What do you think about that?

Davidski said...

These models don't work for Adana23113 or Turkmen. I'd have to come up with new models. Can't do it now though.

Aram said...

Ok Davidski. You did a great job about Iran_Chl. Thanks.
Btw I expect to see Anatolian_Chl in Minoan Crete. With loads of J2.

For the king said...

Lors had a separate identity since Mongol times. 18th century and onwards is way too recent. Lors were already listed as their own groups in Safavid and Afsharid armies. Unlike Kurds, most speak Perside SW Iranian languages, they also cluster closest to Persian Shirazis. Some sources might refer to Lors as "Kurd", but the word "Kurd" was just a generic term for nomads.

MfA said...

@For the king
Kurd already was used in the ethnic sense by 900-1100 AD.

"Ibn al-Athir (13th c.) already mentioned them as 'Lor Kurds', and Sharafkhan Bidlisi (16th c.) also includes them among the Kurds. From the 19th century Orientalist travel reports, it seems that the separate Lor identity by then had materialized, although many comment on their closeness to Kurds in lifestyle and culture."

Aram said...

Bandari have high level of R2 in Iran. The other highest are the Persians of Isfahan. Bandaris have also lots of L and SS African E and B.

FeyliDNAprojectadministrator said...

Excellent! Thanks David. The Yamnaya/Iranian Chalcolithic model seems relatively consistent with the geography and the history of the region. And indeed, as people have stated in the comment section, the best candidate for the pre-Indo-Iranian culture that inhabited what corresponds today to Kermanshah, northern/central Ilam, and Lorestan is the Kassite one: a highlander culture located north of the Elamites that was very influential in Mesopotamia, and that persisted well into the Achaemenid era as "Kossaei".

However, this is not to say that they were genetically distinct from other Zagros based cultures that existed at the same time. The material cultures from the very south to the very north were very similar, and they were likely all interrelated.

The Sintashta/Andronovo models are also interesting. They suggest that Indo-Iranian ancestry has a higher range than expected. The Sintashta model in particular, with more "ASI", seems accurate, as the Andronovo, which is part East Eurasian, seems to absorb some of the ENA affinity related alleles.

Either way. Thanks for taking time to model these, David.

Olympus Mons said...

Yes, Elamite makes sense. Still Snake people.
Choga Sefid, at the heart of the Elamite territory, is the place with the most severe cranial deformation, giving an Ophidian (snake) form.

However, Halaf gets me confused. Halaf was transitional to Ubaid. I would not be surprised if Halaf was a very different stock than this Iran_Chalcolithic.

So, Hassuna-Samarra, Ubaid, Uruk, Ur, Elamite are all the same thing. They all have in common being snake people. Not Halaf or anyone else.

Olympus Mons said...

Just a curiosity... Can it be seen how much Iran_chalc there is in Cyprus?

They were the latest as far as I know "Ophidians".

Davidski said...


Anatolia_Neolithic 0.578 ± 0.056
Iran_Chalcolithic 0.200 ± 0.091
Yamnaya_Samara 0.155 ± 0.035
Han 0.043 ± 0.010
Yoruba 0.023 ± 0.007

Kurd Dgk said...

Modeling Iraqi Kurd samples with qpAdm, using the same outgroups as Lazaridis 2016, I got excellent fits. Iran Chl was not used for a couple of reasons: the fits were not as good as with Iran N, and Iran Chl has been shown to be an Iran N - CHG composite:

            Iraqi Kurd C1   Iraqi Kurd C3
IRAN NEOL   44%             42%
EHG         19%             21%
HAN         0%              0%
CHISQ       0.9             1.8

ak2014b said...

@Roy King, and @ryukendo kendow
"Especially as Haplogroup J2a emerges only in Iran_Chl."

Isn't it older than the Chalcolithic? The Mesolithic Hotu Cave Iranian sample was already J2a. Here's an Anthrogenica discussion on the 9100-8600 BCE sample being J2a.