All about haplotype H.

Nicked straight from here, just a ‘scrapbooked’ item.

What does my H subclade mean?

The following is a simplified chart which displays the positions and their corresponding haplogroups:

H – Mitochondrial haplogroup H is a predominantly European haplogroup that participated in a population expansion beginning approximately 20,000 years ago. Today, about 40% of all mitochondrial lineages in Europe are classified as haplogroup H. It is rather uniformly distributed throughout Europe suggesting a major role in the peopling of Europe, and descendant lineages of the original haplogroup H appear in the Near East as a result of migration. Future work will better resolve the distribution and historical characteristics of this haplogroup.

H1 – H1 is the most common branch of haplogroup H. It represents 30% of people in haplogroup H, and 46% of the maternal lineages in Iberia. 13-14% of all Europeans belong to this branch, and H1 is about 13,000 years old.

H1a – H1a is a branch of H1. Further research will better resolve the distribution and historical characteristics of this haplogroup.

H1b – H1b is detected at its highest frequency in Eastern Europe and North Central Europe. It is also found in about 5% of haplogroup H lineages in Siberian Mansis.

H2 – H2 is somewhat common in Eastern Europe and the Caucasus, but likely spread from Western Europe because it is not found in significant frequency in the Near East. It is found in its highest frequency in Germany and Scotland.

H2a – Haplogroup H2a is found most frequently in Eastern Europe, and at a low frequency in Western Europe. Unlike its parent branch H2, H2a’s geographical distribution extends to Central Asia.

H2b – H2b is the branch to which the CRS belongs. Further research will better resolve the distribution and historical characteristics of this haplogroup.

H3 – H3 is the second most common branch of H. Like H1, it is found mainly in Western Europe. However, H3 is not found in significant frequencies in the Near East. It is at its highest frequency in Iberia and Sardinia, and is about 10,000 years old.

H4 – H4 is an uncommon branch and is found at low frequencies in both Europe and the Near East. Further research will better resolve the distribution and historical characteristics of this haplogroup.

H5 – H5 is distributed across Iberia, Central, Eastern, and Southeastern Europe, and is also found at low frequencies in the Near East, where it may have originated.

H5a – H5a is found at its highest frequency in Central Europe and is about 7-8 thousand years old. It is found at low frequency in Europe, and since it is not found or is rare in the Caucasus and the Near East it likely has a European origin.

H6 – H6 is an older branch of haplogroup H. Its age is estimated at around 40,000 years. Studies suggest that this haplogroup is Middle Eastern or Central Asian in origin. It is also found at very low frequencies in Europe. Further research will better resolve the distribution and historical characteristics of this haplogroup.

H6a – H6a has similar distribution to its parent branch H6. Further research will better resolve the distribution and historical characteristics of this haplogroup.

H6c – H6c is found at very low frequency, and can be found in European populations. Further research will better resolve the distribution and historical characteristics of this haplogroup.

H7 – H7 is an uncommon branch and is found at low frequencies in both Europe and the Near East. Further research will better resolve the distribution and historical characteristics of this haplogroup.

H8 – Like H6, H8 has roots in the Near East and Central Asia. It is very uncommon in Europe. Further research will better resolve the distribution and historical characteristics of this haplogroup.

H9 – H9 is an uncommon branch of H. Further research will better resolve the distribution and historical characteristics of this haplogroup.

H10 – H10 is an uncommon branch of H. Further research will better resolve the distribution and historical characteristics of this haplogroup.

H11 – H11 is an uncommon branch of H. Further research will better resolve the distribution and historical characteristics of this haplogroup.

H12 – H12 is an uncommon branch of H. Further research will better resolve the distribution and historical characteristics of this haplogroup.

H13 – H13 is an uncommon branch and is found at low frequencies in Europe, the Near East, and the Caucasus. Further research will better resolve the distribution and historical characteristics of this haplogroup.

H14 – H14 is an uncommon branch of H. Further research will better resolve the distribution and historical characteristics of this haplogroup.

H15 – H15 is an uncommon branch of H. Further research will better resolve the distribution and historical characteristics of this haplogroup.


Disuniting Uniformity: A Pied Cladistic Canvas of mtDNA Haplogroup H in Eurasia

It has been often stated that the overall pattern of human maternal lineages in Europe is largely uniform. Yet this uniformity may also result from an insufficient depth and width of the phylogenetic analysis, in particular of the predominant western Eurasian haplogroup (Hg) H that comprises nearly a half of the European mitochondrial DNA (mtDNA) pool. Making use of the coding sequence information from 267 mtDNA Hg H sequences, we have analyzed 830 mtDNA genomes, from 11 European, Near and Middle Eastern, Central Asian, and Altaian populations. In addition to the seven previously specified subhaplogroups, we define fifteen novel subclades of Hg H present in the extant human populations of western Eurasia. The refinement of the phylogenetic resolution has allowed us to resolve a large number of homoplasies in phylogenetic trees of Hg H based on the first hypervariable segment (HVS-I) of mtDNA. As many as 50 out of 125 polymorphic positions in HVS-I were found to be mutated in more than one subcluster of Hg H. The phylogeographic analysis revealed that sub-Hgs H1*, H1b, H1f, H2a, H3, H6a, H6b, and H8 demonstrate distinct phylogeographic patterns. The monophyletic subhaplogroups of Hg H provide means for further progress in the understanding of the (pre)historic movements of women in Eurasia and for the understanding of the present-day genetic diversity of western Eurasians in general.
The mitochondrial DNA (mtDNA) sequences of Europeans are sorted into ten major phylogenetic clades, or haplogroups, alphabetically named H, J, K, N1, T, U4, U5, V, X, and W (Torroni et al. 1994, 1996; Macaulay et al. 1999; Richards et al. 2000). Haplogroup (Hg) H alone constitutes about one half of the European mtDNA pool and, along with other aforementioned lineages, is widespread also in western Asia (Macaulay et al. 1999; Richards et al. 2000; Tambets et al. 2000; Kivisild et al. 2003), Central Asia (Comas et al. 1998; Metspalu et al. 1999), Siberia (Saillard et al. 2000a; Derbeneva et al. 2002a; Derenko et al. 2003), southern Asia (Passarino et al. 1996; Kivisild et al. 1999, 2003; Bamshad et al. 2001), and northern Africa (Corte-Real et al. 1996; Rando et al. 1998; Stevanovitch et al. 2004; fig. 1A). At least 267 Hg H mtDNA genomes have been sequenced (nearly) completely (Reid, Vernham, and Jacobs 1994; Rieder et al. 1998; Levin, Cheng, and Reeder 1999; Ingman et al. 2000; Finnilä, Lehtonen, and Majamaa 2001; Maca-Meyer et al. 2001; Herrnstadt et al. 2002, correction by Herrnstadt, Preston, and Howell 2003; Mishmar et al. 2003). Out of seven Hg H sub-Hgs defined so far, Hgs H1 and H2 (Finnilä, Lehtonen, and Majamaa 2001) along with Hgs H3 and H4 (Herrnstadt et al. 2002) and Hgs H5, H6, and H7 (Quintans et al. 2004) cover 74% of Finnish, 68% of U.S./U.K. and 77% of Galician Hg H sequences, respectively.


 FIG. 1.— Spatial frequency distributions of haplogroup H (A) and its subhaplogroup H1b (B). H1b frequencies are given as percentages with respect to Hg H.

Attempts to classify Hg H lineages by first hypervariable segment (HVS-I) have been hindered by a frequent occurrence of mutations at fast-evolving nucleotide sites—so-called mutational hot-spots (Richards et al. 2000; Allard et al. 2002). Furthermore, HVS-I sequence information leaves a substantial fraction of Hg H genomes phylogenetically unresolved: on average one-third of the lineages share a haplotype identical with the Cambridge reference sequence (CRS; Anderson et al. 1981). Therefore, progress in the analysis of Hg H diversity, and, indeed, of the understanding of the phylogeography of western Eurasian maternal lineages, depends critically on the use of full genome information in mtDNA samples, representative in size and geography.

The phylogeny of 267 published Hg H mtDNA coding region sequences (Reid, Vernham, and Jacobs 1994; Rieder et al. 1998; Levin, Cheng, and Reeder 1999; Ingman et al. 2000; Finnilä, Lehtonen, and Majamaa 2001; Maca-Meyer et al. 2001; Herrnstadt et al. 2002; Mishmar et al. 2003) was analyzed (fig. S1 in the Supplementary Material online). Markers for nine subclades were selected for genotyping in 563 Hg H mtDNA samples of European, Near and Middle Eastern, Central Asian and Altaian origin. To find the relative mutation rates of particular nucleotide positions, the frequencies of phylogenetically independent mutations were calculated from 987 published mtDNA coding region sequences (Marzuki et al. 1991; Reid, Vernham, and Jacobs 1994; Arnason, Xu, and Gullberg 1996; Polyak et al. 1998; Rieder et al. 1998; Levin, Cheng, and Reeder 1999; Ingman et al. 2000; Finnilä, Lehtonen, and Majamaa 2001; Maca-Meyer et al. 2001; Torroni et al. 2001b; Derbeneva et al. 2002b; Herrnstadt et al. 2002; Kivisild et al. 2002 and references therein; Kong et al. 2003; Mishmar et al. 2003).

The samples were selected at random from nine populations: 50 Finno-Ugric speakers from the Volga-Ural region (10 Udmurts, 10 Mokshas, 16 Erzyas, 7 Permyak Komis, 7 Zyrian Komis); 50 Estonians; 165 Eastern Slavs (127 Russians, 38 Ukrainians from various districts of Russia and the Ukraine); 50 Slovaks; 50 French from southern France, Lyon, Low Normandy, and Poitiers; 50 individuals from the Balkans (17 Croats, 17 Albanians, 16 Greeks); 50 Turks; 50 individuals from the Near and the Middle East (10 Jordanians, 8 Lebanese, 7 Saudis, 12 Syrians, 13 Iranians); 48 individuals from Central Asia (17 Altaians, 11 Kirghiz, 3 Kazakhs, 11 Tajiks, 6 Uzbeks). Sixteen Russian and six Ukrainian HVS-I sequences have been published by Malyarchuk and Derenko (2001a), 33 Russian HVS-I and second hypervariable segment (HVS-II) sequences by Malyarchuk et al. (2002), and all of the Volga-Ural region mtDNA HVS-I sequences by Bermisheva et al. (2002). All the samples harbored a C at nucleotide position (np) 7028, which is diagnostic for Hg H and was inferred from the absence of the AluI restriction site at np 7025 (Torroni et al. 1994). All mutations and position numbers in this study are given with respect to Anderson et al. (1981) as revised by Andrews et al. (1999).
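The sample counts quoted above can be cross-checked with a few lines of Python. This is only a bookkeeping sketch: the dictionary layout and variable names are my own, while the numbers are the ones given in the text (563 genotyped samples, 267 published coding-region sequences, 830 genomes in total).

```python
# Per-population counts as listed in the paragraph above.
# The grouping into a dictionary is illustrative, not part of the paper.
populations = {
    "Volga-Ural Finno-Ugric": [10, 10, 16, 7, 7],   # Udmurts, Mokshas, Erzyas, Permyak Komis, Zyrian Komis
    "Estonians": [50],
    "Eastern Slavs": [127, 38],                     # Russians, Ukrainians
    "Slovaks": [50],
    "French": [50],
    "Balkans": [17, 17, 16],                        # Croats, Albanians, Greeks
    "Turks": [50],
    "Near and Middle East": [10, 8, 7, 12, 13],     # Jordanians, Lebanese, Saudis, Syrians, Iranians
    "Central Asia and Altai": [17, 11, 3, 11, 6],   # Altaians, Kirghiz, Kazakhs, Tajiks, Uzbeks
}

group_totals = {name: sum(counts) for name, counts in populations.items()}
genotyped = sum(group_totals.values())

PUBLISHED_CODING_SEQUENCES = 267  # published Hg H coding-region sequences cited in the text

for name, total in group_totals.items():
    print(f"{name}: {total}")
print("genotyped samples:", genotyped)
print("genomes in the coalescence analysis:", genotyped + PUBLISHED_CODING_SEQUENCES)
```

The group totals reproduce the figures in the text: nine populations, 563 genotyped samples, and 830 genomes once the 267 published sequences are added.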

Four hundred forty-eight samples were screened for 14 polymorphisms in the mtDNA coding region and three in HVS-II in addition to HVS-I sequence variation. A hierarchical strategy was applied to 104 Russian and 11 Ukrainian mtDNAs (Appendix S2 in the Supplementary Material online). HVS-I variation for all of the samples was scored between nps 16024–16383. Nucleotide changes at positions 73, 951, 3010, 4336, 4452, 4769, 4793, 5004, 8448, 9066, 9380, 13101, 13759, and 16482 were determined by restriction fragment length polymorphisms (RFLPs; Appendixes S1 and S2). Nucleotide states at positions 239, 456, 3915, and 6776 were detected by direct sequencing or allele-specific polymerase chain reaction (PCR; Appendixes S1 and S2). Nucleotide positions 239 and 3915 were sequenced in samples having 16362C and/or lacking a Hin6I restriction site at np 9380 and/or having a DdeI site at np 16478. We note that the transition at np 239 nearly always occurs with the 16362C allele, as it was not found in Hg H variants with 16362T in 2,350 published HVS-II sequences (Hofmann et al. 1997; Parson et al. 1998; Dimo-Simonin et al. 2000; Malyarchuk et al. 2003; Vanecek, Vorel, and Sip 2004; Pereira, Cunha, and Amorim 2004). Credible regions of the obtained haplogroup frequencies were computed with the Sampling program kindly provided by Vincent Macaulay.

The phylogeny of the samples was studied by the construction of a reduced median network (fig. 2A). In the network analysis 479 samples were included (see Appendix S1), including the 31 Finnish sequences taken from Finnilä, Lehtonen, and Majamaa (2001), while 115 Eastern Slav mtDNAs, which were analyzed hierarchically (see Appendix S2), have not been included. The reduced median network (Bandelt et al. 1995; rho set at 2) was constructed with the Network program (Fluxus Technology Ltd., Clare, Suffolk, UK), followed by a median joining algorithm (Bandelt, Forster, and Röhl 1999; epsilon set at 0), as explained at the Fluxus-Engineering Web site. Nucleotide positions were divided into three classes of transition rates—fast (16093, 16129, 16189, 16304, 16311, and 16362), intermediate (16172, 16209, 16278, 16293), and slow (the remainder of the positions between 16024 and 16383)—and assigned class weights 1, 2, and 4, respectively. Transversions and coding region mutations were weighted 8.
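The weighting scheme described above can be written down as a small lookup. This is a sketch of the rule as stated in the text, not the Network program's own configuration format. The function name is hypothetical, and treating nps 577–16023 as "coding" is an assumption borrowed from the coalescence paragraph. Positions outside both ranges (for example HVS-II sites) are rejected because the text does not give their weights.

```python
# Transition-rate classes for HVS-I positions, as listed in the text.
FAST = {16093, 16129, 16189, 16304, 16311, 16362}
INTERMEDIATE = {16172, 16209, 16278, 16293}

HVS1 = range(16024, 16384)    # nps 16024-16383 inclusive
CODING = range(577, 16024)    # assumption: "coding" taken as nps 577-16023


def character_weight(position: int, is_transversion: bool = False) -> int:
    """Return the network weight for a mutation at `position` (hypothetical helper)."""
    if position in CODING:
        return 8                      # coding-region mutations are weighted 8
    if position not in HVS1:
        raise ValueError(f"no weight stated in the text for np {position}")
    if is_transversion:
        return 8                      # transversions are weighted 8
    if position in FAST:
        return 1
    if position in INTERMEDIATE:
        return 2
    return 4                          # remaining (slow) HVS-I positions


for np_ in (16189, 16278, 16300, 3010):
    print(np_, character_weight(np_))
```

Giving the hot-spot sites the lowest weight means the network prefers to explain conflicts by a repeated hit at a fast site instead of a repeated coding-region change.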

FIG. 2.— Backbone of the phylogenetic tree of mtDNA haplogroup H subclades studied here. Nonsynonymous (#) and synonymous (”) mutations indicated. Mutations are transitions unless an exact base change has been shown.

To obtain the frequencies of Hg H and its sub-Hg H1b in different populations, we compiled a data set of 26,105 HVS-I sequences from various sources listed in table S2 in the Supplementary Material online. The frequency data in individual populations was grouped into broader geographical regions (see table S2) and summary frequencies obtained were mapped (fig. 1). Maps were obtained using Surfer version 7 (Golden Software, Inc., Golden, Colo.) with the Kriging procedure. Estimates at each grid node were obtained by consideration of the entire data set.

Altogether, 830 mitochondrial genomes were included in the coalescence analysis. A subset of the obtained coalescence estimates is presented in table 1 and all of the results in table S1. An average transitional distance from the root haplotype (rho) was calculated. Coalescence time has been calculated taking one transitional step between nucleotide positions 16090–16365 (“HVS”) equal to 20,180 years (Forster et al. 1996) and one base substitution between nucleotide positions 577–16023 (“coding”) equal to 5,138 years (Mishmar et al. 2003). Standard deviation of the rho estimate (sigma) was calculated as in Saillard et al. (2000b), and SD denotes the deviation in years. The 115 Eastern Slav samples analyzed hierarchically and not shown in figure 2A have been included in the coalescence analysis. Note that the coding sequence data is derived mainly from European populations.
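The dating arithmetic is just rho (and sigma) multiplied by the clock rate. The sketch below applies it to the two H1 rows of Table 1, assuming the three numeric columns in those rows are n, rho, and sigma in that order. The function and constant names are mine.

```python
# Clock rates quoted in the text.
HVS_YEARS_PER_STEP = 20_180      # one transition in nps 16090-16365 (Forster et al. 1996)
CODING_YEARS_PER_STEP = 5_138    # one substitution in nps 577-16023 (Mishmar et al. 2003)


def coalescence_age(rho: float, sigma: float, years_per_step: int) -> tuple[float, float]:
    """Convert rho and its standard deviation sigma into years (hypothetical helper)."""
    return rho * years_per_step, sigma * years_per_step


# H1 rows of Table 1, read as (rho, sigma); this column reading is an assumption.
h1_coding = coalescence_age(2.04, 0.25, CODING_YEARS_PER_STEP)
h1_hvs = coalescence_age(1.18, 0.35, HVS_YEARS_PER_STEP)

for label, (age, sd) in (("H1 coding", h1_coding), ("H1 hvs", h1_hvs)):
    # Table 1 reports values rounded to the nearest hundred years.
    print(f"{label}: {round(age, -2):,.0f} ({round(sd, -2):,.0f})")
```

Rounded to the nearest hundred years, these come out at 10,500 (1,300) and 23,800 (7,100), matching the dates printed for H1 in Table 1.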

Table 1 Coalescence Ages of Some Haplogroup H Subclusters

Clade  Motif                     Clock(a)  n    rho   sigma  Date (SD)
H1     3010                      coding    90   2.04  0.25   10,500 (1,300)
H1     3010                      hvs       149  1.18  0.35   23,800 (7,100)
H1*    3010, H1 excl. H1a, b, f  hvs       96   0.66  0.15   13,200 (3,000)
H2a    4769–951                  coding    6    1.17  0.44   6,000 (2,300)
H2a1   951–4769–16354            hvs       27   0.56  0.25   11,200 (5,000)
H3     6776                      coding    31   2.16  0.32   11,100 (1,600)
H3     6776                      hvs                         16,100 (8,000)

NOTE.—n is the number of samples belonging to the clade under analysis. Note that the 115 Eastern Slav samples analyzed hierarchically and not shown in figure 2 have been included in the coalescence analysis.

a The defining mutation motif of the root haplotype has been specified. “Coding” and “hvs” refer to age estimates that derive from the coding region or hypervariable region variation, respectively. The calculation of rho, sigma, and date (SD) are described in the Materials and Methods section. See the complete results of the coalescence analysis in table S1.

Figure 3 shows the backbone of the phylogenetic tree of Hg H subclades studied here. We have corrected the names of sub-Hgs H5 and H6 as defined by Quintans et al. (2004) as 4336C and 3915A, respectively, to H5a and H6a, following the hierarchical principle described by Richards et al. (1998). Note that the most parsimonious phylogenetic tree has two branching events based on shared HVS-I nucleotide transitions: one between sub-Hgs H1b and H1f and the second between sub-Hgs H6 and H8. Because the transitions, at nucleotide positions (nps) 16189 and 16362, involve mutational hot-spots, the indicated sub-Hgs, though monophyletic, are not necessarily sister clades, as depicted in figure 3.

FIG. 3.— (A) A phylogenetic network of mtDNA variants that belong to nine subhaplogroups of haplogroup H (see Appendix S1). Some related haplotypes belonging to paraphyletic H* are also shown. The legend shows the color code for studied populations. The number of individuals is shown beside the circles above the legend. Mutations are transitions unless a transversional base change has been shown. “d” denotes deletion. Letter coding for the restriction enzymes is as follows: a: AciI, b: SspI, c: Bsh1236I, d: Alw44I, e: MboI, f: Eco32I, g: DdeI, h: Eco47I, i: AluI, j: BsuRI, k: MspI. The gain of a restriction site is marked by a “+” following the site description; the mark “-” indicates the loss of a site. (B) Frequencies (%) of sub-Hgs H2a, H3, H6, and H8 relative to haplogroup H pool in eleven populations. The legend shows the patterns corresponding to different sub-Hgs. A: Finnish data of Finnilä, Lehtonen, and Majamaa (2001); B: Estonian; C: Volga-Ural region Finno-Ugric; D: Eastern Slavic; E: Slovak; F: French; G: U.K. and U.S. data of Herrnstadt et al. (2002); H: Balkan; I: Turk; J: Near and Middle Eastern; K: central Asian and Altaian. See the Materials and Methods section for details of the studied populations.

The number of internal branches in Hg H is significantly higher than in other mtDNA haplogroups widespread in Europe. In the majority of European mtDNA variants—J, T, K, X and U5—the coding region variation is described by only a few extant basal subclades (Finnilä et al. 2000; Finnilä and Majamaa 2001; Herrnstadt et al. 2002; Reidla et al. 2003). In contrast, there are 57 basic branches stemming from the founder node of Hg H in the parsimonious phylogenetic tree relating 267 Hg H coding region sequences (fig. S1).
One hundred twenty-five variable positions were detected in 594 (563 + 31 Finnish sequences of Finnilä, Lehtonen, and Majamaa 2001) Hg H HVS-I sequences. Among them, recurrent transitions were observed in 50 positions (40%) in different subclades (table 2). The sites with the highest number of recurrences match the HVS-I hot-spot sites identified previously (Hasegawa et al. 1993; Malyarchuk and Derenko 2001b; Allard et al. 2002). The most variable positions, 16093 and 16311, had received parallel hits in seven different subclusters; 16189 in six; 16092, 16304, and 16362 each in five; and 16129, 16209, 16249, and 16325 each in four subclusters. Another 12 HVS-I mutations were found in three and 28 substitutions in two different phylogenetic contexts. Because quite a few of these hot-spot mutations are present in HVS-I haplotypes that have been highlighted as having founder status in Europe (Richards et al. 2000), our results document again that additional coding region information is essential and unavoidable in defining monophyletic subclades of Hg H reliably (Torroni et al. 1993; Bandelt et al. 2001; Kivisild et al. 2002). We also found that a reversion of A to the ancestral base G at np 73 of the HVS-II, noticed in Hg H first by Torroni et al. (1996), has occurred independently at least four times in Hg H phylogeny (see also Helgason et al. 2000).
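The recurrence figures in this paragraph can be tallied to confirm that they add up to the 50 positions (40%) quoted. The sketch uses only the counts stated in the text; the variable names are illustrative.

```python
# Number of HVS-I positions by how many subclusters they were hit in,
# as stated in the paragraph above.
positions_by_hits = {
    7: 2,    # 16093 and 16311
    6: 1,    # 16189
    5: 3,    # 16092, 16304, 16362
    4: 4,    # 16129, 16209, 16249, 16325
    3: 12,   # "another 12 HVS-I mutations"
    2: 28,   # "28 substitutions in two different phylogenetic contexts"
}

VARIABLE_POSITIONS = 125  # variable HVS-I positions detected in 594 sequences

recurrent_positions = sum(positions_by_hits.values())
recurrent_share = recurrent_positions / VARIABLE_POSITIONS

print("recurrent positions:", recurrent_positions)
print(f"share of variable positions: {recurrent_share:.0%}")
```

The same counts correspond to the rows of Table 2, which lists two positions with seven hits down to 28 positions with two.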

Table 2 Nucleotide Positions in the First Hypervariable Segment (HVS-I) That Have Received More Than One Transitional Hit in Haplogroup H

Position(a)  Total(b)  Haplogroup H Subclusters

93 7 + +   + +   + +   +
311 7 + + +   +   +   + +
189 6 +       + + +   + +
92 5     + +   +     + +
304 5 + +     +   +     +
362 5 +       + +   +   +
129 4 + +           +   +
209 4 +         +     + +
249 4 +   +           + +
325 4 +         + +     +
75 3 +               + +
169 3 +               + +
172 3 +       +         +
192 3     +   +         +
218 3 + +               +
234 3         +       + +
243 3     +     +       +
261 3 +           +     +
291 3 + +               +
294 3 +       +         +
301 3 +   +           +  
354 3   + +             +
51 2 +                 +
86 2 +       +          
111 2           +       +
114 2 +                 +
126 2             +     +
145 2       +           +
148 2 +                 +
162 2 +                 +
167 2   +       +        
176 2     +             +
188 2         +         +
193 2 + +                
240 2       +           +
256 2 +                 +
259 2 +                 +
265 2             +   +  
266 2 +       +          
270 2 + +                
271 2         +         +
274 2   +               +
278 2                 + +
288 2 +             +    
290 2 +                 +
293 2                 + +
298 2     +   +          
299 2 +                 +
300 2 +         +        




a HVS-I nucleotide position (minus 16000).

b Total number of subhaplogroups where the particular transition has occurred. Analysis is based on mtDNA positions 16024–16365 in 594 haplogroup H mitochondrial genomes. Note that 115 Eastern Slav samples that are not shown in figure 2A are included here.

In the coding region, a transition at np 3010 that defines sub-Hg H1 (Finnilä, Lehtonen, and Majamaa 2001) is phylogenetically equally problematic. The derived state at np 3010 has been detected in haplogroups C, D, H, J, L2, L3, and U, making this base pair one of the fastest evolving mtDNA coding region positions (Ingman et al. 2000; Finnilä, Lehtonen, and Majamaa 2001; Maca-Meyer et al. 2001; Torroni et al. 2001b; Herrnstadt et al. 2002). Character conflicts at np 3010 and at more conserved nps 1462 (occurs also in Hgs H*, H2, T), 6272 (H*, L3), 6776 (H3), 8470 (H3), 12172 (H*, L1, U2), and 14869 (H*, K2, L3) were found (fig. S1). Given the data, the number of independent 3010A incidences in Hg H may possibly be as many as four (fig. S1).
Sub-Hg H4 was previously defined by an array of eight mutations (nps 3992, 4024, 5004, 8269, 9123, 10044, 14365, and 14582) through an analysis of haplotypes that occurred in at least two individuals (Herrnstadt et al. 2002). However, re-examination of the sequence data of Herrnstadt et al. (2002) revealed that only six mutations at nps 3992, 4024, 5004, 9123, 14365, and 14582 appear to be necessary to characterize the clade (fig. S1). Consequently, here we name the bough defined by a G-to-A mutation at np 8269, which further embraces the 10044 twig, as H4a.

While applying the RFLP method we discovered three previously unknown mutations: a transition at np 13760 abolishing the AciI site at np 13757 defining sub-Hg H11, a transition at np 5005 eliminating the H4-defining DdeI site at np 5003, and a transition at np 8449 eliminating the H11-defining np 8446 SspI site. Therefore, we confirmed the presence of H4-specific T at np 5004 by sequencing the position in all 12 samples lacking the DdeI 5003 site. The monophyly of sub-Hg H11 is well established by the combination of two RFLPs and by the characteristic HVS-I mutation pattern. These results show that classical indirect DNA polymorphism detection methods, like RFLP, should be backed up by direct sequencing in order to avoid ambiguous or even erroneous inference of phylogeny.

The next paragraphs address the main phylogeographic results. The largest subcluster is sub-Hg H1, which comprises about 30% of Hg H and 13% of the total European mtDNA pool. H1 is most frequent in the Iberian Peninsula, covering about 46% of local Hg H lineages (Pereira et al. 2004; Quintans et al. 2004). In the Near East the frequency of H1 does not exceed 6% (P < .025), and its relative frequency with respect to Hg H is lower than that seen in Europe (14%). In the Central Asian populations, where Hg H makes up about 11% of the local mtDNA pool, only 6% of H samples belong to sub-Hg H1 (table 3).


Table 3 Frequencies of Haplogroup H Subhaplogroups in Studied Populations

                      Populations(b)
Subcluster(a)         VUF  Fin  Est  ESlav(c)  Slk  Fre  Blk  Tur  NE   Asia  Her    Total
H1*                   14   2    10   8 + 28    3    11   4    6    7    3     69     165
H1a                   3    4    2    4 + 3     4    2    0    2    0    0     nd     24
H1b                   0    0    6    2 + 7     2    0    2    0    0    0     nd     19
H1f                   0    8    1    1 + 0     0    0    0    0    0    0     0      10
H2*(d)                0    2    1    1 + 4     0    2    0    1    0    1     14     26
H2a                   0    2    3    9 + 5     1    0    0    0    3    6     3      32
H3                    0    2    3    3 + 4     2    6    4    0    0    1     25     50
H4                    2    0    0    0 + 3     2    0    0    2    3    0     10     22
H5*                   0    1    1    2 + 1     4    4    3    2    1    0     nd     19
H5a                   0    3    1    1 + 6     1    1    5    2    0    1     10     31
H6                    2    0    2    1 + 9     3    3    4    0    5    10    10     49
H7                    4    0    1    0 + 2     1    4    3    1    1    3     5      25
H8                    0    0    0    0         1    0    0    1    3    5     0      10
H11                   4    0    1    2 + 8     6    0    2    1    1    2     3      30
H sample size         50   31   50   50 + 115  50   50   50   50   50   48    214    808
H frequency (%)       40   40   44   40        42   47   45   26   19   11    49(e)
Total sample size(f)

NOTE.—nd: no data.

a Definitions of the subclusters are in figure 3.

b Abbreviations for the populations are as follows: VUF: Volga-Ural region Finno-Ugric speakers; Fin: Finnish sequences are taken from Finnilä, Lehtonen, and Majamaa (2001); Est: Estonians; ESlav: Eastern Slavs; Slk: Slovaks; Fre: French; Blk: Balkan peoples; Tur: Turks; NE: Near and Middle Easterners; Asia: Asian peoples; Her: Herrnstadt et al. (2002) coding mtDNA sequences represent the populations of the United Kingdom and United States. See the Materials and Methods section for details of the studied samples.

c Note that the table includes the data of 115 Eastern Slav mtDNAs, which are analyzed hierarchically and are not represented in the network of figure 2A (see Appendix S2).

d Subcluster H2* includes all members of H2 that do not belong to H2a.

e Haplogroup H makes up 49% of the British population (Piercy et al. 1993; Richards et al. 1996; Helgason et al. 2001). The same percentage was used to estimate the frequency of haplogroup H in Western Europe, represented here by the data of Herrnstadt et al. (2002).

f Total sample size estimates are based on the knowledge of haplogroup H frequencies in different populations that has accumulated from numerous published and unpublished sources.
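The H1 shares quoted for the Near East (14% of Hg H) and Central Asia (about 6% of Hg H) can be recomputed from Table 3. The sketch assumes the population columns run in the order given in footnote b, so that the second-to-last and third-to-last population columns are Asia and NE; the names and data layout are my own.

```python
# H1 counts read from the NE and Asia columns of Table 3 (column reading is an assumption).
h1_counts = {
    "NE":   {"H1*": 7, "H1a": 0, "H1b": 0, "H1f": 0},
    "Asia": {"H1*": 3, "H1a": 0, "H1b": 0, "H1f": 0},
}
h_sample_size = {"NE": 50, "Asia": 48}          # "H sample size" row
h_share_of_pool = {"NE": 0.19, "Asia": 0.11}    # "H frequency (%)" row

results = {}
for pop, counts in h1_counts.items():
    h1_total = sum(counts.values())
    share_of_h = h1_total / h_sample_size[pop]          # H1 as a fraction of Hg H
    share_of_pool = share_of_h * h_share_of_pool[pop]   # H1 as a fraction of all mtDNA
    results[pop] = (share_of_h, share_of_pool)
    print(f"{pop}: H1 = {share_of_h:.1%} of Hg H, about {share_of_pool:.1%} of the mtDNA pool")
```

Under that reading the Near Eastern H1 share of Hg H is 14% and the Central Asian share is a little over 6%, in line with the text, and the Near Eastern share of the whole mtDNA pool stays well under the 6% ceiling mentioned there.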

Sub-Hg H1b is found throughout the area of the spread of Hg H, more frequently found in Eastern Europe and north central Europe (about 7% and 5% of Hg H, respectively; fig. 1B). It was also found to make up about 5% of Hg H in Siberian Mansis. A minor sub-Hg H1f constitutes a quarter of the selected subset of Finnish Hg H genomes of Finnilä, Lehtonen, and Majamaa (2001), being almost absent elsewhere in Europe. Confirmation of the high frequency of this rare variant of mtDNA among northern central Finns, characterized by HVS-I motif 16093–16189, can be found in the Finnish data of Meinilä, Finnilä, and Majamaa (2001), reflecting founder effects in the Finnish population history (Nevanlinna 1972; de la Chapelle and Wright 1998; Kittles et al. 1999; Peltonen, Palotie, and Lange 2000). In our previous study (Tambets et al. 2004), we assumed monophyly of the transition at np 16162. This mutation occurs in the motif 73–3010–16162, which defines H1a. The finding of an Eastern Slav individual bearing haplotype 73–16093–16162 and lacking H1 defining transition at np 3010 hints that motif 73–16162 may have arisen, though rarely, more than once in haplogroup H (fig. 2A). Alternatively, and bearing in mind that position 3010 is a mutational hot-spot (see above), one may consider a parallel occurrence or a reverse mutation at this position.
Like H1b, sub-Hg H2a occurs more frequently (P < .05) in Eastern than in Western European Hg H genomes, 6.5% and 1.1%, respectively, when averaged over populations (table 3 and fig. 2B). The spread of H2a extends to Central Asia, mimicking to some extent, albeit at a lower frequency, the phylogeography of Y-chromosomal Hg R1a (Rosser et al. 2000; Wells et al. 2001). In contrast, sub-Hg H3 was found to be more frequent (P < .05) in the Western (11.7%) than in the Eastern European Hg H pool (4.1%) and is virtually absent in Anatolia and the Near East (fig. 2B), resembling in its phylogeography the spread of Y-chromosomal Hg R1b associated 49a,f TaqI haplotype 15 (Semino et al. 1996; Cinnioglu et al. 2004). The high frequency of mtDNA Hg H3—in combination with Y chromosomal Ht 15—extends to the Iberian Peninsula, where H3 constitutes about 17% of Hg H and is the highest detected so far (Pereira et al. 2004; Quintans et al. 2004).

The coalescence ages of H2a1 and H3 fall to the period of postglacial recolonization in Europe (table 1), suggested first for mtDNA Hg V (Torroni et al. 1998, 2001a). We also note that mtDNA bearing “St. Luke motif,” 16235–16293 (Vernesi et al. 2001), belong to sub-Hg H2 (fig. 2A), being particularly frequent in Germany and Scotland (Helgason et al. 2001; Pfeiffer et al. 2001).

The Near Eastern samples cluster together with Central Asian mtDNAs in the sub-Hgs H6b and H8, which are very rare in Europe. This finding demonstrates a separate flow of maternal lineages south of the Caspian and the Black Sea in addition to well-known long-lasting migrations of pastoral nomads alongside the steppe belt that connects the Danube Basin, over the Pontic-Caspian, with Central Asia, Altay, and Manchuria.

In contrast to that found in Europeans, sub-Hgs H6 and H8 among Central Asian/Altaian populations are characterized by distinctly divergent haplotypes (fig. 2A). This finding may reflect a long-time separation of Asian and European H6 and H8 mtDNA pools and/or an earlier expansion of H6 in the eastern part of its present range. Indeed, the coalescence age of H6 in Central Asians is very deep—40,400 years (SD 16,400 years; table S1). Because the Asian branches of sub-Hg H6 are highly divergent and seem to be among the oldest in Hg H (table S1), they pose an interesting problem, deserving specific study with a much larger sample size at hand.

The commonly used HVS-I clock (Forster et al. 1996) places the initial expansion of Hg H in the Near East to about 23,000 to 28,000 years before the present (Richards et al. 2000). The ancestral clades of Hg H, pre-HV, and HV* have their combined present range predominantly in the Near and Middle East, and in the Caucasus (Metspalu et al. 1999; Richards et al. 2002), implying this could have been the region where the pre-HV/HV clade started to diversify and, possibly, where the earliest Hg H variants might have first appeared.

However, most subclusters of Hg H exhibit coalescence ages, corresponding to the beginning of their expansion in the Late Upper Paleolithic (tables 1 and S1). In this respect our results support an earlier proposition that Hg H was the major mtDNA haplogroup participating in the recolonization of Europe after the Last Glacial Maximum (Torroni et al. 1998; Richards et al. 2000). It is also important to note that the expansion time estimates derived from the coding region and HVS-I of Hg H are often in reasonable agreement with each other (tables 1 and S1). Sub-Hgs H1 and H3 have their highest frequencies in the Iberian Peninsula. These sub-Hgs may have been the companions of mtDNA Hg V in the postglacial repeopling of Europe from a refuge area in Iberia (Torroni et al. 1998). However, in contrast to Hg V, suggested coalescence ages of H1 and H3—13,400 ± 3,000 and 8,600 ± 2,800 years ago, respectively (Pereira et al. 2004)—do not imply deeper phylogeny of H1 and H3 in Iberia compared to the rest of Europe (tables 1 and S1).

These results demonstrate that a seemingly uniform spread of this major human mtDNA clade in western Eurasian populations hides within itself a complex structure of phylogeographically informative subclades. However, it is evident that additional knowledge at the level of complete mtDNA sequences is still needed for a truly comprehensive cataloguing of Hg H diversity, in particular more effectively covering its variation in the Mediterranean, Near and Middle Eastern, and Central Asian/Altaian populations. Nevertheless, even now it is tempting to speculate that much deeper coalescence ages, close to/overlapping with the boundary between the Middle and Upper Paleolithic, for some Hg H branches in Central Asian/Altaian populations, suggest that the time depth of this predominant haplogroup may be much deeper than its apparent general signal for expansion in Europe. It is, therefore, possible that the carriers of pre-Aurignacian industry identified in Zagros as well as in Altay (Otte and Derevianko 2001) were anatomically modern humans already possessing Hg H.

Supplementary Appendixes S1 and S2, figure S1, and tables S1 and S2 are available at the journal’s Web site as well as the Web site of the University of Tartu, Department of Evolutionary Biology (


8 responses to “All about haplotype H.

  1. Hi thanks for this good work. I am trying to locate the origins of the mtDNA samples
    taken in SLOVENIA. I had my haplogroup done and I am H and from SLOVENIA, but that was the only study that did NOT identify the towns from which the mtDNA was taken. I am trying to locate my mother’s origins and my grandfather’s on her side. Every other
    sample on the mtDNA map had the location where the samples were taken. I emailed the author of the study Malyarchuk but have no answer now for 2 years. HELP??
    I cant speak Russian or Slovenian so cant write to him in his own language..maybe that is the problem?

    thanks for anything you can supply.

    theresa immordino

  2. I am in the H* subclade and I do not see this subclade on any of the branches, or its distribution.

  3. Great work! I am H with mutation 16304c and 16311c, and from Norway. Does this mean that I am H5 and originate from Turks and East Slavians??

  4. I got my results back and I am Haplogroup H (subclade H). What does that mean when it is not broken down into say H5, etc. I don’t understand this. Help. Thanks

  5. Haplogroup H is a descendent of F. (Not R as shown in your diagram.) All H’s siblings are Semitic; hence H too would be so. In the book, the Antiquities of the Jews, noted Jewish historian says that Heber (an assistor of Abraham) had a son Joktan whose sons settled in India. This points to a pre-historic Semitic Hebrew speaking India (At least up to the time of king David) before the Arian expansion and assimilation.

  6. I couldn’t find an H- with the 16519C…would you know which subgroup this puts me in?? Thanks
