Ornella Semino,1 A. Silvana Santachiara-Benerecetti,1 Francesco Falaschi,2 L. Luca Cavalli-Sforza,3 and Peter A. Underhill3
Abstract
The genetic structure of 126 Ethiopian and 139 Senegalese Y chromosomes was investigated by a hierarchical analysis of 30 diagnostic biallelic markers selected from the worldwide Y-chromosome genealogy. The present study reveals that (1) only the Ethiopians share with the Khoisan the deepest human Y-chromosome clades (the African-specific Groups I and II) but with a repertoire of very different haplotypes; (2) most of the Ethiopians and virtually all the Senegalese belong to Group III, whose precursor is believed to be involved in the first migration out of Africa; and (3) the Ethiopian Y chromosomes that fall into Groups VI, VIII, and IX may be explained by back migrations from Asia. The first observation confirms the ancestral affinity between the Ethiopians and the Khoisan, which has previously been suggested by both archaeological and genetic findings.
Within extant African populations, both linguistic (Greenberg 1963) and genetic (Hiernaux 1975; Excoffier et al. 1987; Cavalli-Sforza et al. 1994, pp. 169–171) evidence indicates that most sub-Saharan populations are more closely related to each other, whereas Pygmy, Khoisan, and eastern African populations are the most differentiated. Paradoxically, genetic comparisons of Khoisan and Ethiopian populations show both polarity and affinity with respect to one another. This has been shown by the principal-components (PC) analysis of 79 classical protein polymorphisms (Cavalli-Sforza et al. 1993, 1994, p. 191). Although the second PC indicates that the Ethiopian and Khoisan populations are the most divergent, the third PC shows a close relationship. Although intermediary Bantu-speaking populations currently separate these two groups geographically, archeological findings suggest that the Khoisan territory once extended above the equator, to present-day southern Ethiopia and Sudan (Nurse et al. 1985, p. 105).
In a previous study (Passarino et al. 1998), the genetic structure of the Ethiopian population was investigated using mtDNA and some nonrecombinant Y-chromosome (NRY) markers previously studied in the Khoisan (Soodyall and Jenkins 1992; Spurdle and Jenkins 1992). These markers, because of their uniparental inheritance and lack of recombination, are particularly useful for inferring the history of populations through female and male lineages separately. Although the mtDNA did not reveal a particular relationship between Ethiopians and the Khoisan, affinities were suggested by Y-chromosome analyses. The YAP−/49a,f haplotype 26 (A2C0D0F0I1) combination appeared to be typical of these two groups (with a frequency of ~7% in the Ethiopians and 10%–15% in the Khoisan; for the Khoisan frequency, see the discussion by Passarino et al.). With the exception of some Jewish subjects, particularly Ethiopian Jews (Ritte et al. 1993; Santachiara-Benerecetti et al. 1993), the 49a,f haplotype 26 is absent or extremely rare in all surveyed populations (found only by Torroni et al. and Persichetti et al.). However, because of the variability of the complex 49a,f system, a polyphyletic origin for haplotype 26 could not be excluded. A later combined study of seven biallelic markers and four microsatellites showed that the Ethiopians and Khoisan shared the “archaic” haplotype 1A (Hammer et al. 1998), defined by the marker SRY10831 A→G (Whitfield et al. 1995), but that they did not share the microsatellite variants (Scozzari et al. 1999). Most recently, a great number of Y-chromosome biallelic markers have become available (Underhill et al. 1997, 2000, 2001; Hammer et al. 2001). These markers, because of their very low mutability, have most likely arisen only once during human evolution, thus allowing a clear-cut definition of the worldwide Y-chromosome genealogy (Underhill et al. 2001).
To better understand the relationships between Ethiopians and the other African populations, we have now screened 126 Ethiopian (78 Oromo and 48 Amhara) and 139 Senegalese DNAs of our collection, for the diagnostic markers of the major haplogroups of the Y-chromosome genealogy (Underhill et al. 2001). The results obtained using a hierarchical approach are illustrated in figure 1, in which the Ethiopian and Khoisan samples examined by Underhill et al. (2000) are also included.
Figure 1. Phylogenetic tree of the Y-chromosome haplotypes and their percent frequencies in the two Ethiopian groups (Oromo and Amhara) and in the Senegalese of the present study, compared with the frequencies in the Ethiopians and Khoisan previously reported by …
Groups I and II are essentially restricted to Africans and appear to be the most divergent clades within the tree. They show a patchy distribution, with high frequencies among isolated hunter-gatherer groups and in some peoples of Ethiopia and Sudan. Such a distribution was interpreted as the survival of some ancient lineages through more recent population events (Underhill et al. 2001). In particular, Group I, observed in 43.6% of the Khoisan (usually considered to be descendants of an early African population), is present in all of the Ethiopian samples: its frequency is 10.3% in the Oromo sample and 14.6% in the Amhara sample of the present study, and is 13.6% in the ethnically undefined sample reported by Underhill et al. (2000). In contrast, it was not found in the Senegalese. It is worth noting that the Ethiopian YAP−/49a,f haplotype 26 lineage, which is common within the Khoisan (Spurdle and Jenkins 1992; Passarino et al. 1998), corresponds to Group I and possibly reflects the signal seen in the third PC of classical polymorphisms (Cavalli-Sforza et al. 1993, 1994). However, figure 1 shows that the Ethiopian and Khoisan samples within Group I fall into different haplotypes (haplotypes 1, 2, and 5 in Ethiopians vs. haplotypes 4, 6, and 7 in the Khoisan), in agreement with an ancient divergence from the same ancestral population, as has been suggested by microsatellite data (Scozzari et al. 1999).
Group III of Underhill et al. (2001) includes three main clusters (PN2 [Hammer et al. 1997], M75, and M33) that are uniquely characterized by M40 (corresponding to the SRY4064 transition of Whitfield et al.). The majority of African Y chromosomes fall into either the M2 or M35 subclades of the PN2 cluster (Underhill et al. 2001).
Figure 1 shows that virtually all of our Ethiopian YAP+ Y chromosomes fall into either haplotype 16, characterized only by the PN2 mutation (Hammer et al. 1997), or the M35-related haplotypes 17, 18, and 19. A new M35 haplotype (haplotype 21) was observed in two Oromo. This is defined by the G→A transition (M281 in fig. 1) at position 280 within the sequence-tagged site containing the M67 and M68 mutations associated with Group VI, which were described by Underhill et al. (2000). Noteworthy are the particularly high frequency of haplotype 18, defined by M78, which also characterizes most of the European YAP+ chromosomes (O.S., unpublished data), and the absence of haplotype 20, identified by the M81 mutation, which is the most frequent M35 lineage in North Africa (Bosch et al. 2001). In a comparison of the different groups of Ethiopians, the Oromo show an incidence (62.8%) of the M35 cluster higher than that in the Amhara (35.4%, P<.005); the Amhara value is similar to the frequency (31.8%) found in the Ethiopian sample of Underhill et al. (2000). A substantial proportion (17.0%) of Y chromosomes belonging to the M75 cluster (haplotype 22) is a distinctive feature of the latter sample. In contrast, almost all Senegalese (98.6%) are YAP+, and the majority of them (81.3%) fall into the M2 subclade, but only one of them shows the M191 mutation (haplotype 12) (Underhill et al. 2001). This mutation accounts for ~40% of the M2 members, who are mainly Pygmies (Underhill et al. 2000). Group III is less frequent in the Khoisan (28.2%), who share with Ethiopians only the M35 haplotype 19 (10.3%). Conversely, the M2 component, which occurs at a frequency of 17.9% in the Khoisan, is virtually absent in the Ethiopians.
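The nesting of the Group III markers named above lends itself to a short illustration of hierarchical typing, in which a downstream marker is tested only after its parent marker has been found in the derived state. The following Python sketch is an illustration under stated assumptions, not the laboratory protocol of the study: the tree is restricted to the markers mentioned in the text, and the function name `type_hierarchically` and the representation of a sample as a set of derived markers are hypothetical.

```python
# Marker hierarchy for Group III, limited to the markers named in the text.
# This is an illustrative subset, not the full Y-chromosome genealogy.
GROUP_III_TREE = {
    "M40": {
        "PN2": {
            "M2": {"M191": {}},
            "M35": {"M78": {}, "M81": {}},
        },
        "M75": {},
        "M33": {},
    }
}


def type_hierarchically(derived_markers, tree):
    """Walk the tree from the root, descending only through derived markers.

    derived_markers: set of marker names at which a (hypothetical) sample
    carries the derived allele.
    Returns (path, tests): the chain of derived markers found and the
    number of marker tests that were needed to find it.
    """
    path = []
    tests = 0
    level = tree
    while level:
        next_level = None
        for marker, children in level.items():
            tests += 1  # one typing assay per marker examined
            if marker in derived_markers:
                path.append(marker)
                next_level = children
                break
        if next_level is None:
            # No marker at this level is derived: the sample stops here.
            break
        level = next_level
    return path, tests


# Hypothetical sample carrying the M78 lineage (haplotype 18 in the text).
m78_sample = {"M40", "PN2", "M35", "M78"}
# Hypothetical sample characterized only by PN2 (haplotype 16 in the text).
pn2_only_sample = {"M40", "PN2"}

print(type_hierarchically(m78_sample, GROUP_III_TREE))
print(type_hierarchically(pn2_only_sample, GROUP_III_TREE))
```

In this toy tree of nine markers, the M78 sample is resolved with five tests and the PN2-only sample with four, which is the practical advantage of typing from the root toward the tips rather than testing every marker in every sample.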
Group VI was observed almost exclusively as the 12f2 subgroup in the Ethiopians. Its frequency is by far the highest in the Amhara (33.4%, vs. 3.8% for the Oromo [P<.0001] and 3.4% for the other Ethiopian data [P<.0001]). This difference, not revealed in the study by Passarino et al. (1998), in which the Oromo were underrepresented, might reflect distinct population histories. It is reported (Levine 1974) that the Amhara experienced a strong influence from Middle Eastern populations, in which the 12f2 8-kb allele has a very high frequency and probably originated (Santachiara-Benerecetti et al. 1993; Semino et al. 1996; Quintana-Murci et al. 2001). This is further supported by the opposite distribution of the M35 subclade (35.4% for the Amhara, vs. 62.8% for the Oromo [P<.005] and 31.8% for the other Ethiopian data). Group VI also includes two Senegalese who, however, are currently defined only by the M89 mutation (haplotype 27) and lack any other known mutation characterizing the M89 subgroups.
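The Oromo-versus-Amhara comparisons can be checked approximately from the published percentages and sample sizes (78 Oromo, 48 Amhara). The sketch below rests on two assumptions that are not stated in the text: that carrier counts can be recovered by rounding percentage × sample size, and that an uncorrected Pearson chi-square test on a 2×2 table is an acceptable stand-in for whatever test the authors used. The function names are hypothetical.

```python
import math


def carriers(percent, n):
    """Recover an approximate carrier count from a reported percentage (assumption)."""
    return round(percent / 100 * n)


def chi_square_2x2(k1, n1, k2, n2):
    """Pearson chi-square (no continuity correction) for two proportions.

    k1 of n1 carriers in group 1, k2 of n2 carriers in group 2.
    Returns (chi2, p), with p taken from the chi-square distribution, 1 df.
    """
    a, b = k1, n1 - k1
    c, d = k2, n2 - k2
    n = n1 + n2
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


N_OROMO, N_AMHARA = 78, 48

# M35 cluster: 62.8% of Oromo vs. 35.4% of Amhara.
chi2_m35, p_m35 = chi_square_2x2(
    carriers(62.8, N_OROMO), N_OROMO, carriers(35.4, N_AMHARA), N_AMHARA
)

# 12f2 subgroup of Group VI: 33.4% of Amhara vs. 3.8% of Oromo.
chi2_12f2, p_12f2 = chi_square_2x2(
    carriers(33.4, N_AMHARA), N_AMHARA, carriers(3.8, N_OROMO), N_OROMO
)

print(f"M35:  chi2 = {chi2_m35:.2f}, p = {p_m35:.4f}")
print(f"12f2: chi2 = {chi2_12f2:.2f}, p = {p_12f2:.2e}")
```

Under these assumptions the reconstructed tables give a chi-square of about 8.9 for M35 and about 20.2 for 12f2, in line with the reported thresholds of P<.005 and P<.0001. A continuity-corrected or exact test would give somewhat larger P values.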
Groups VIII and IX were also found in the Ethiopians as haplotypes characterized by the mutations M70 (haplotype 28) and M173 (haplotype 29), respectively. M70 was observed in few of our Ethiopians (~5%), and M173 was found in just one subject in the Ethiopian sample of Underhill et al. (2000). The finding of M70 is intriguing, since it has so far been observed to be widely scattered in several continents at a low frequency (Semino et al. 2000; Underhill et al. 2000). The M173 and related lineages are common and widespread in European and in western and central Asian populations (Semino et al. 2000; Underhill et al. 2000; Bosch et al. 2001; Wells et al. 2001); the observation of one M173 in Ethiopia could, therefore, represent a recent admixture event.
In conclusion, the present study underscores the complexity and substructure of the Ethiopian Y-chromosome gene pool. First, the presence of different Y-chromosome haplotypes belonging to African-specific Group I in all groups of Ethiopians and in the Khoisan (at frequencies of ~13% and 44%, respectively) confirms that these populations share an ancestral paternity, as was previously suggested by the 49a,f data (Passarino et al. 1998), and it indicates that Group I was part of the proto-African Y-chromosome gene pool. The virtual absence of this clade in the other African ethnic groups suggests that they could derive from a more recent ancestral population that went through a long period of differentiation before expansion. In addition, Group II, the next closest to the NRY genealogy root and typically an African group, is shared by Ethiopians and the Khoisan but to a lesser degree. In the case of Group II, the split responsible for the differences observed between Ethiopian and Khoisan haplotypes is also old. Second, most of the Ethiopian Y chromosomes, the rest of the Khoisan Y chromosomes, and the majority of the Senegalese Y chromosomes belong to Group III, which is also mainly African but whose precursor is believed to be involved in the first migration out of Africa (Underhill et al. 2001). Third, the remainder of the Ethiopian Y chromosomes (Groups VI, VIII, and IX) may be explained by back migrations from Asia.