
Thursday, July 31, 2014

Turks probably came from south Siberia


The good people of the Estonian Biocentre have just released a preprint at bioRxiv focusing on the genetic origins of Turkic-speaking nomads. It's a solid effort based on a wide range of samples and several standard analyses, including a massive fastIBD run. The authors' conclusions are very sensible and probably correct:

Most of the Turkic peoples studied, except those in Central Asia, genetically resembled their geographic neighbors, in agreement with the elite dominance model of language expansion. However, western Turkic peoples sampled across West Eurasia shared an excess of long chromosomal tracts that are identical by descent (IBD) with populations from present-day South Siberia and Mongolia (SSM), an area where historians center a series of early Turkic and non-Turkic steppe polities. The observed excess of long chromosomal tracts IBD (> 1cM) between populations from SSM and Turkic peoples across West Eurasia was statistically significant. Finally, we used the ALDER method and inferred admixture dates (~9th–17th centuries) that overlap with the Turkic migrations of the 5th–16th centuries. Thus, our results indicate historical admixture among Turkic peoples, and the recent shared ancestry with modern populations in SSM supports one of the hypothesized homelands for their nomadic Turkic and related Mongolic ancestors.


However, even though the paper includes a lot of detail, I still find it somewhat underwhelming. The blame lies with Lazaridis et al. 2013/2014, which really raised the bar for this type of work, using several ancient genomes and very sophisticated techniques to try and unravel the deep ancestry of Europeans (see here and here). It's probably unreasonable of me to expect most population genetics papers to be so thorough, but it's still disappointing when they're not.

Also, thanks to Lazaridis et al. as well as a few other recent ancient DNA studies, we now know that the prehistory of Eurasia was more complex than anyone had imagined only a few years ago. Once upon a time it was OK to blame any sort of seemingly eastern genetic signals in Europe on Genghis Khan or Attila the Hun. These days you'd look like a bit of an idiot trying that sort of thing.

So yes, in this case the authors probably got it right, and they probably did pick up signals of Turkic migrations from south Siberia and surrounds. But let's wait and see what a good number of ancient genomes reveal about the origins, direction and time frames of population movements across the Eurasian steppe and Taiga belt.

Citation...

Bayazit Yunusbayev, Mait Metspalu, Ene Metspalu, et al., The Genetic Legacy of the Expansion of Turkic-Speaking Nomads Across Eurasia, bioRxiv posted online July 30, 2014


Wednesday, July 9, 2014

More ancient genomes from Sweden: Pitted Ware forager Ajvide58 and TRB farm girl Gokhem2


Both of these genomes were published earlier this year by Skoglund et al. 2014 (see here).

My analysis shows that Ajvide58 is very similar to Mesolithic Swedish forager StoraFörvar11 (see here), and also in part Ancient North Eurasian (ANE). This can be seen in the Principal Component Analysis (PCA) of West Eurasian populations, in which Ajvide58 is shifted east relative to La Brana-1 (which lacks ANE), much like StoraFörvar11.

However, the Eurogenes K15 results suggest to me that the level of ANE in Ajvide58 is lower than in StoraFörvar11. That's because Ajvide58 shows less of the Eastern European component (17.54% vs. 23.23%), and none of the South Asian component. These two components, along with the Amerindian component, dominate MA-1's K15 results (see here).
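As a quick check on that comparison, the sketch below sums the three K15 components named above for both foragers, using the percentages listed in these posts. Treating the raw sum as a stand-in for ANE is a simplification for illustration only, not a formal estimate, and the function name is hypothetical.

```python
# K15 percentages copied from the result tables in these posts.
K15 = {
    "Ajvide58": {"Eastern_Euro": 17.54, "South_Asian": 0.0, "Amerindian": 2.09},
    "SfF11": {"Eastern_Euro": 23.23, "South_Asian": 4.36, "Amerindian": 4.52},
}

# The components that dominate MA-1's K15 results, per the text.
ANE_ASSOCIATED = ("Eastern_Euro", "South_Asian", "Amerindian")


def ane_associated_total(sample):
    """Sum the ANE-associated components (a rough proxy, not an ANE estimate)."""
    return round(sum(K15[sample][c] for c in ANE_ASSOCIATED), 2)


for name in K15:
    print(name, ane_associated_total(name))
```

On these numbers the combined score is lower in Ajvide58 than in StoraFörvar11, which is the same direction as the Eastern European component on its own.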

On the other hand, Gokhem2 appears not to harbor any ANE ancestry; note its extreme western shift on the PCA of West Eurasia and complete lack of the Eastern Euro, Amerindian and South Asian components in the Eurogenes K15. This is in line with all scientific literature to date, which indicates that ANE was basically missing from Western and Central Europe during the Mesolithic and Neolithic. Indeed, this sample's best matching population in the Oracle is the Sardinians, one of the few present-day European groups with only a trace amount of ANE.

The absence of ANE in Gokhem2 and all other ancient European genomes from a farming context, like Stuttgart and Oetzi, is a very important point. That's because Neolithic farmers largely replaced indigenous hunter-gatherers across most of Europe, including in Scandinavia. As a result, it's probably safe to assume that this process reduced the amount of ANE in Scandinavia to much less than what was carried there by the indigenous foragers (15-19%). However, present-day Scandinavians carry around 17% of ANE, which must mean that there was another migration wave into Northern Europe after the Neolithic, coming from an area rich in ANE. This was probably the Indo-European expansion from the middle Volga region (see here).

Nevertheless, Gokhem2 does carry forager admixture, which can be seen in its non-trivial levels of Eurogenes K15 components strongly associated with indigenous European forager ancestry: North Sea at 16.74% and Baltic at 3.78%. What this suggests is that the admixture event between the Near Eastern and European ancestors of the TRB farmers didn't take place in Scandinavia, but rather somewhere on the European mainland where ANE wasn't present at the time. Interestingly, the Oracle results are in agreement, because, for instance, they feature La Brana-1 but not Ajvide58.

Eurogenes K15 - Ajvide58

North_Sea 35.8
Atlantic 18.76
Baltic 24.56
Eastern_Euro 17.54
West_Med 0.02
West_Asian 0
East_Med 0
Red_Sea 0
South_Asian 0
Southeast_Asian 0
Siberian 0
Amerindian 2.09
Oceanian 0
Northeast_African 0
Sub-Saharan 1.22

4 Ancestors Oracle results

Principal Component Analyses (PCA) featuring West Eurasian, Eurasian and global reference sets, respectively, show that Ajvide58 is outside the range of modern West Eurasian genetic variation, which is in line with the results of all other ancient European foragers sequenced to date. The cross marks the spot (click on the images to download high resolution PDFs of the plots):




Eurogenes K15 - Gokhem2

North_Sea 16.74
Atlantic 27.71
Baltic 3.78
Eastern_Euro 0
West_Med 45.34
West_Asian 0
East_Med 4.66
Red_Sea 0.78
South_Asian 0
Southeast_Asian 0
Siberian 0
Amerindian 0
Oceanian 0.98
Northeast_African 0.01
Sub-Saharan 0

4 Ancestors Oracle results




The Eurogenes K15 and Alexandr Burnashev's 4 Ancestors Oracle are available for use free of charge at GEDmatch for anyone with genotype data from 23andMe and similar personal genomics companies. Look for the Ad-mix option and then the Eurogenes tab.


Saturday, July 5, 2014

Analysis of Mesolithic Swedish forager StoraFörvar11


StoraFörvar11, or SfF11, is a late Mesolithic genome from a cave on the small island of Stora Karlsö, just off the coast of Gotland. It was published earlier this year by Skoglund et al. along with several other ancient genomes dating to the Neolithic from Gotland and mainland Sweden (see here). Belonging to Northeast European-specific mitochondrial haplogroup U5a1, SfF11 appears to be the archetypal Scandinavian forager, with no detectable Neolithic farmer admixture but considerable Ancient North Eurasian (ANE) ancestry related to Upper Paleolithic hunter-gatherers from Siberia, such as MA-1 and AG2 (see here).

Please note, SfF11 was superimposed onto the first Principal Component Analysis (PCA) plot below, which initially only included La Brana-1, the ancient Mesolithic genome from northern Spain, and present-day West Eurasians. I did this to avoid creating a cluster with the two ancient genomes based not on genuine genetic affinities between them but on their relatively poor quality. I obtained the PC coordinates for SfF11 from an almost identical 13K SNP PCA plot which can be seen here.

Note also the clear eastern affinity shown by SfF11 relative to La Brana-1, which in all likelihood is the result of the above-mentioned shared ANE ancestry with MA-1, featured on the second PCA. To date, all ancient genomes from Western and Central Europe have basically lacked this admixture, while Scandinavian hunter-gatherers carried it at levels of 15-19%. As hypothesized by Lazaridis et al. 2013, it's likely that Eastern European hunter-gatherers harbored even greater levels of ANE, and it's probably a good bet that they introduced it into Scandinavia during and/or before the Mesolithic.



Below are the Eurogenes K15 ancestry proportions for SfF11, and below that the 4 Ancestors Oracle results. Even though the K15 test was based on just 8K SNPs, the outcome appears robust, and correlates closely with results from more sophisticated formal mixture tests in scientific literature, in which European hunter-gatherers show a strong relationship to present-day East Baltic populations, especially Lithuanians. Moreover, among the best 4-way Oracle fits for SfF11 is 3/4 La Brana-1 and 1/4 MA-1, which is extremely close to the actual genetic structure of Scandinavian foragers: around 80% Western European Hunter-Gatherer (WHG) and around 20% ANE.

The unusually high South and Southeast Asian scores can probably be explained by shared ANE ancestry with South Asians and lack of the so-called Basal Eurasian admixture, respectively. Indeed, the latter is a very good bet considering the complete absence of any sort of Mediterranean and Near Eastern signals in these results.
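To make the idea of an Oracle fit a little more concrete, here is a minimal sketch of how an equal-weight mixture could be scored against a sample. It assumes the score is the plain Euclidean distance between K15 percentage vectors, which is not confirmed anywhere in this post, and the function names are hypothetical. The reference population averages used by the Oracle aren't reproduced here, so the only vectors compared are the three ancient genomes whose K15 results appear on this page.

```python
import math

COMPONENTS = ["North_Sea", "Atlantic", "Baltic", "Eastern_Euro", "West_Med",
              "West_Asian", "East_Med", "Red_Sea", "South_Asian",
              "Southeast_Asian", "Siberian", "Amerindian", "Oceanian",
              "Northeast_African", "Sub-Saharan"]

# K15 percentages as listed on this page, in COMPONENTS order.
SFF11 = [23.97, 5.62, 29.24, 23.23, 0, 0, 0, 0, 4.36, 5.97, 0.34, 4.52, 2.17, 0.58, 0]
AJVIDE58 = [35.8, 18.76, 24.56, 17.54, 0.02, 0, 0, 0, 0, 0, 0, 2.09, 0, 0, 1.22]
GOKHEM2 = [16.74, 27.71, 3.78, 0, 45.34, 0, 4.66, 0.78, 0, 0, 0, 0, 0.98, 0.01, 0]


def mixture(vectors):
    """Equal-weight average of one or more reference vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(COMPONENTS))]


def distance(sample, model):
    """Euclidean distance between two percentage vectors (assumed metric)."""
    return math.sqrt(sum((s - m) ** 2 for s, m in zip(sample, model)))


print(f"SfF11 vs Ajvide58: {distance(SFF11, mixture([AJVIDE58])):.2f}")
print(f"SfF11 vs Gokhem2:  {distance(SFF11, mixture([GOKHEM2])):.2f}")
```

Under that assumed metric SfF11 sits much closer to the Ajvide58 forager than to the Gokhem2 farmer. Passing a repeated vector to `mixture` (for example three copies of one reference and one of another) gives the 3/4 plus 1/4 style of fit discussed above.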

Eurogenes K15

Baltic 29.24
North_Sea 23.97
Eastern_Euro 23.23
Southeast_Asian 5.97
Atlantic 5.62
Amerindian 4.52
South_Asian 4.36
Oceanian 2.17
Northeast_African 0.58
Siberian 0.34
West_Med 0
West_Asian 0
East_Med 0
Red_Sea 0
Sub-Saharan 0

4 Ancestors Oracle

Least-squares method.

Using 1 population approximation:
1 Estonian @ 14.153281
2 Erzya @ 14.620788
3 Kargopol_Russian @ 14.700492
4 Southwest_Russian @ 15.448751
5 Ukrainian @ 15.825631
6 Lithuanian @ 15.842059
7 Ukrainian_Belgorod @ 16.110345
8 East_Finnish @ 16.435534
9 Belorussian @ 16.531115
10 Ukrainian_Lviv @ 16.638975
11 Estonian_Polish @ 16.671571
12 Polish @ 17.379799
13 South_Polish @ 17.805012
14 Russian_Smolensk @ 17.812963
15 Finnish @ 18.279374
16 La_Brana-1 @ 19.903407
17 Southwest_Finnish @ 21.942936
18 Moldavian @ 23.158096
19 Croatian @ 23.266324
20 Hungarian @ 24.020402

Using 2 populations approximation:
1 Erzya+Estonian @ 12.292066
2 Estonian+Kargopol_Russian @ 13.190123
3 Erzya+La_Brana-1 @ 13.192429
4 Erzya+Lithuanian @ 13.414829
5 Erzya+Ukrainian @ 13.440955
6 Erzya+Ukrainian_Lviv @ 13.540859
7 Erzya+Finnish @ 13.602815
8 East_Finnish+Lithuanian @ 13.693698
9 Kargopol_Russian+Lithuanian @ 13.735122
10 Estonian+Southwest_Russian @ 13.994994
11 East_Finnish+Erzya @ 14.077424
12 Estonian+Ukrainian_Belgorod @ 14.113102
13 Kargopol_Russian+Ukrainian @ 14.126683
14 Estonian+Estonian @ 14.153281
15 Belorussian+Erzya @ 14.180946
16 Erzya+Southwest_Russian @ 14.186181
17 Kargopol_Russian+Ukrainian_Lviv @ 14.247527
18 Estonian+Ukrainian @ 14.247854
19 Erzya+Polish @ 14.291491
20 Estonian+Lithuanian @ 14.31161

Using 3 populations approximation:
1 50% Estonian +25% Lithuanian +25% MA-1 @ 11.982448
2 50% Lithuanian +25% Estonian +25% MA-1 @ 12.169832
3 50% Estonian +25% Estonian +25% MA-1 @ 12.225538
4 50% Erzya +25% Estonian +25% La_Brana-1 @ 12.250755
5 50% Erzya +25% Estonian +25% Estonian @ 12.292066
6 50% Lithuanian +25% La_Brana-1 +25% MA-1 @ 12.473574
7 50% Erzya +25% La_Brana-1 +25% Lithuanian @ 12.480595
8 50% Lithuanian +25% Finnish +25% MA-1 @ 12.547096
9 50% Erzya +25% Estonian +25% Ukrainian_Lviv @ 12.657215
10 50% Erzya +25% Estonian +25% Ukrainian @ 12.660239
11 50% Erzya +25% Estonian +25% Lithuanian @ 12.661794
12 50% Estonian +25% Erzya +25% Kargopol_Russian @ 12.679962
13 50% Erzya +25% Erzya +25% La_Brana-1 @ 12.695461
14 50% Erzya +25% La_Brana-1 +25% Ukrainian @ 12.707643
15 50% Estonian +25% Erzya +25% Estonian @ 12.716859
16 50% Erzya +25% Finnish +25% Lithuanian @ 12.72455
17 50% Erzya +25% Estonian +25% Finnish @ 12.737834
18 50% Erzya +25% La_Brana-1 +25% Ukrainian_Lviv @ 12.753404
19 50% Lithuanian +25% Lithuanian +25% MA-1 @ 12.768751
20 50% Estonian +25% Belorussian +25% MA-1 @ 12.780747

Using 4 populations approximation:
1 Estonian+Estonian+Lithuanian+MA-1 @ 11.982448
2 Estonian+Lithuanian+Lithuanian+MA-1 @ 12.169832
3 Estonian+Estonian+Estonian+MA-1 @ 12.225538
4 Erzya+Erzya+Estonian+La_Brana-1 @ 12.250755
5 Erzya+Erzya+Estonian+Estonian @ 12.292066
6 Estonian+La_Brana-1+Lithuanian+MA-1 @ 12.434074
7 La_Brana-1+Lithuanian+Lithuanian+MA-1 @ 12.473574
8 Erzya+Erzya+La_Brana-1+Lithuanian @ 12.480595
9 Finnish+Lithuanian+Lithuanian+MA-1 @ 12.547096
10 Erzya+Erzya+Estonian+Ukrainian_Lviv @ 12.657215
11 Erzya+Erzya+Estonian+Ukrainian @ 12.660239
12 Estonian+Lithuanian+MA-1+Ukrainian @ 12.66118
13 Erzya+Erzya+Estonian+Lithuanian @ 12.661794
14 Erzya+Estonian+Estonian+Kargopol_Russian @ 12.679962
15 Erzya+Erzya+Erzya+La_Brana-1 @ 12.695461
16 Estonian+Lithuanian+MA-1+Ukrainian_Lviv @ 12.697136
17 Erzya+Erzya+La_Brana-1+Ukrainian @ 12.707643
18 Erzya+Estonian+Estonian+Estonian @ 12.716859
19 Erzya+Erzya+Finnish+Lithuanian @ 12.72455
20 Erzya+Erzya+Estonian+Finnish @ 12.737834
21 Estonian+Finnish+Lithuanian+MA-1 @ 12.746305
22 Erzya+Erzya+La_Brana-1+Ukrainian_Lviv @ 12.753404
23 Lithuanian+Lithuanian+Lithuanian+MA-1 @ 12.768751
24 Belorussian+Estonian+Estonian+MA-1 @ 12.780747
25 Estonian+Estonian+MA-1+Ukrainian @ 12.797031
26 Estonian+Estonian+La_Brana-1+MA-1 @ 12.807529
27 Erzya+Estonian+Estonian+Ukrainian @ 12.813496
28 Estonian+Estonian+MA-1+Ukrainian_Lviv @ 12.822931
29 Erzya+Estonian+Kargopol_Russian+La_Brana-1 @ 12.831473
30 Erzya+Estonian+Estonian+Lithuanian @ 12.839613
31 Chuvash+Estonian+Estonian+Lithuanian @ 12.851803
32 Belorussian+Estonian+Lithuanian+MA-1 @ 12.855733
33 Erzya+Estonian+Estonian+Ukrainian_Lviv @ 12.857349
34 East_Finnish+Erzya+Estonian+Lithuanian @ 12.875013
35 Erzya+Estonian+Kargopol_Russian+Lithuanian @ 12.901956
36 Erzya+Estonian+La_Brana-1+Lithuanian @ 12.90565
37 Erzya+Kargopol_Russian+La_Brana-1+Lithuanian @ 12.914481
38 Erzya+Estonian+Estonian+La_Brana-1 @ 12.921321
39 Erzya+Estonian+Estonian+Southwest_Russian @ 12.931952
40 Lithuanian+Lithuanian+MA-1+Ukrainian @ 12.932804

Gaussian method.

Using 1 population approximation:
1 East_Finnish @ 12.111642
2 Finnish @ 12.136433
3 Tatar @ 12.260871
4 Chuvash @ 12.287812
5 Kargopol_Russian @ 13.238854
6 Erzya @ 13.290701
7 Ukrainian @ 14.224517
8 North_Swedish @ 14.501487
9 Mari @ 14.582022
10 La_Brana-1 @ 15.102585
11 Ukrainian_Lviv @ 15.466692
12 Moldavian @ 16.561361
13 Ukrainian_Belgorod @ 16.829215
14 Southwest_Finnish @ 17.044556
15 Southwest_Russian @ 17.644306
16 Estonian_Polish @ 17.912619
17 Swedish @ 18.055712
18 Estonian @ 18.417704
19 Hungarian @ 18.442869
20 Lithuanian @ 18.500045

Using 2 populations approximation:
1 La_Brana-1+Mari @ 9.086839
2 Kargopol_Russian+La_Brana-1 @ 9.216681
3 La_Brana-1+MA-1 @ 9.529079
4 Chuvash+La_Brana-1 @ 9.628936
5 Erzya+La_Brana-1 @ 9.741056
6 La_Brana-1+Tatar @ 10.312023
7 East_Finnish+La_Brana-1 @ 10.369729
8 Chuvash+Estonian @ 10.38245
9 Estonian+La_Brana-1 @ 10.698394
10 Chuvash+Finnish @ 10.701826
11 Estonian+Tatar @ 10.72273
12 Estonian+Shors @ 10.734028
13 Chuvash+Lithuanian @ 10.781409
14 Chuvash+East_Finnish @ 10.832523
15 Chuvash+Kargopol_Russian @ 11.058841
16 Finnish+Tatar @ 11.078731
17 East_Finnish+Tatar @ 11.104768
18 Lithuanian+Shors @ 11.131471
19 Chuvash+Ukrainian @ 11.241182
20 Estonian+Hakas @ 11.257456

Using 3 populations approximation:
1 50% La_Brana-1 +25% Estonian +25% MA-1 @ 6.880967
2 50% La_Brana-1 +25% La_Brana-1 +25% MA-1 @ 7.035486
3 50% La_Brana-1 +25% Lithuanian +25% MA-1 @ 7.1341
4 50% Estonian +25% La_Brana-1 +25% MA-1 @ 7.18973
5 50% La_Brana-1 +25% East_Finnish +25% MA-1 @ 7.57191
6 50% La_Brana-1 +25% Finnish +25% MA-1 @ 7.600389
7 50% Lithuanian +25% La_Brana-1 +25% MA-1 @ 7.628929
8 50% La_Brana-1 +25% Estonian_Polish +25% MA-1 @ 7.697983
9 50% La_Brana-1 +25% Belorussian +25% MA-1 @ 7.70291
10 50% La_Brana-1 +25% Kargopol_Russian +25% MA-1 @ 7.781779
11 50% La_Brana-1 +25% MA-1 +25% Southwest_Finnish @ 7.798672
12 50% La_Brana-1 +25% Erzya +25% MA-1 @ 7.80171
13 50% La_Brana-1 +25% MA-1 +25% Polish @ 7.929863
14 50% La_Brana-1 +25% MA-1 +25% Southwest_Russian @ 7.935151
15 50% La_Brana-1 +25% MA-1 +25% Russian_Smolensk @ 8.031297
16 50% La_Brana-1 +25% MA-1 +25% North_Swedish @ 8.049602
17 50% La_Brana-1 +25% MA-1 +25% Ukrainian_Belgorod @ 8.049701
18 50% La_Brana-1 +25% MA-1 +25% Ukrainian @ 8.06409
19 50% La_Brana-1 +25% MA-1 +25% South_Polish @ 8.188305
20 50% Finnish +25% La_Brana-1 +25% MA-1 @ 8.237496

Using 4 populations approximation:
1 Estonian+La_Brana-1+La_Brana-1+MA-1 @ 6.880967
2 La_Brana-1+La_Brana-1+La_Brana-1+MA-1 @ 7.035486
3 La_Brana-1+La_Brana-1+Lithuanian+MA-1 @ 7.1341
4 Estonian+Estonian+La_Brana-1+MA-1 @ 7.18973
5 Estonian+La_Brana-1+Lithuanian+MA-1 @ 7.414412
6 East_Finnish+La_Brana-1+La_Brana-1+MA-1 @ 7.57191
7 Finnish+La_Brana-1+La_Brana-1+MA-1 @ 7.600389
8 La_Brana-1+Lithuanian+Lithuanian+MA-1 @ 7.628929
9 Estonian+Finnish+La_Brana-1+MA-1 @ 7.689347
10 Estonian_Polish+La_Brana-1+La_Brana-1+MA-1 @ 7.697983
11 Belorussian+La_Brana-1+La_Brana-1+MA-1 @ 7.70291
12 East_Finnish+Estonian+La_Brana-1+MA-1 @ 7.712903
13 Finnish+La_Brana-1+Lithuanian+MA-1 @ 7.779771
14 Kargopol_Russian+La_Brana-1+La_Brana-1+MA-1 @ 7.781779
15 La_Brana-1+La_Brana-1+MA-1+Southwest_Finnish @ 7.798672
16 Erzya+La_Brana-1+La_Brana-1+MA-1 @ 7.80171
17 East_Finnish+La_Brana-1+Lithuanian+MA-1 @ 7.850763
18 Estonian+Estonian_Polish+La_Brana-1+MA-1 @ 7.890161
19 Belorussian+Estonian+La_Brana-1+MA-1 @ 7.906509
20 Estonian+La_Brana-1+MA-1+Southwest_Finnish @ 7.927839
21 La_Brana-1+La_Brana-1+MA-1+Polish @ 7.929863
22 La_Brana-1+La_Brana-1+MA-1+Southwest_Russian @ 7.935151
23 Estonian+Kargopol_Russian+La_Brana-1+MA-1 @ 7.940811
24 Erzya+Estonian+La_Brana-1+MA-1 @ 7.965223
25 La_Brana-1+Lithuanian+MA-1+Southwest_Finnish @ 7.991558
26 La_Brana-1+Lithuanian+MA-1+North_Swedish @ 8.029449
27 La_Brana-1+La_Brana-1+MA-1+Russian_Smolensk @ 8.031297
28 Belorussian+La_Brana-1+Lithuanian+MA-1 @ 8.038993
29 Estonian_Polish+La_Brana-1+Lithuanian+MA-1 @ 8.046271
30 La_Brana-1+La_Brana-1+MA-1+North_Swedish @ 8.049602
31 La_Brana-1+La_Brana-1+MA-1+Ukrainian_Belgorod @ 8.049701
32 La_Brana-1+La_Brana-1+MA-1+Ukrainian @ 8.06409
33 Estonian+La_Brana-1+MA-1+North_Swedish @ 8.075392
34 Estonian+La_Brana-1+MA-1+Polish @ 8.08945
35 Kargopol_Russian+La_Brana-1+Lithuanian+MA-1 @ 8.100132
36 Estonian+La_Brana-1+MA-1+Southwest_Russian @ 8.108852
37 Erzya+La_Brana-1+Lithuanian+MA-1 @ 8.127814
38 Estonian+La_Brana-1+MA-1+Ukrainian @ 8.153751
39 La_Brana-1+Lithuanian+MA-1+Polish @ 8.17359
40 La_Brana-1+La_Brana-1+MA-1+South_Polish @ 8.188305
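The 3 and 4 population tables above look like two views of the same fits: each 4 population row lists four slots, so a name that appears twice corresponds to a 50% share. Below is a minimal sketch of that conversion, assuming every slot carries exactly 25%. That assumption is consistent with the distances shared between the two tables, but it isn't documented here, and `quarter_weights` is a hypothetical helper name.

```python
from collections import Counter


def quarter_weights(row):
    """Parse a ranked Oracle row like '2 A+A+A+B @ 7.03' into (rank, shares, distance)."""
    rank, rest = row.split(None, 1)          # leading rank number
    combo, dist = rest.rsplit("@", 1)        # population slots and fit distance
    names = [name.strip() for name in combo.split("+")]
    shares = {pop: count / len(names) for pop, count in Counter(names).items()}
    return int(rank), shares, float(dist)


rank, shares, dist = quarter_weights(
    "2 La_Brana-1+La_Brana-1+La_Brana-1+MA-1 @ 7.035486"
)
print(rank, shares, dist)
# shares == {'La_Brana-1': 0.75, 'MA-1': 0.25}
```

This is the 3/4 La Brana-1 and 1/4 MA-1 fit mentioned in the text, read straight off the second row of the Gaussian 4 population table.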
