According to this recent DNA study of barley, it was domesticated twice: once in Jordan/Israel, and once somewhere east of the Zagros Mountains in Iran. The earliest known domestication of barley is about 4,000 years later than that of emmer wheat, and barley seems to have been domesticated during the Neolithic farmers' expansion, not before it.
Genetic evidence for a second domestication of barley (Hordeum vulgare) east of the Fertile Crescent
December 21, 2006.
Cereal agriculture originated with the domestication of barley and early forms of wheat in the Fertile Crescent. There has long been speculation that barley was domesticated more than once. We use differences in haplotype frequency among geographic regions at multiple loci to infer at least two domestications of barley: one within the Fertile Crescent and a second 1,500–3,000 km farther east. The Fertile Crescent domestication contributed the majority of diversity in European and American cultivars, whereas the second domestication contributed most of the diversity in barley from Central Asia to the Far East.
The domestication of barley is fundamental to understanding the origins and early diffusion of agrarian culture. Barley, as one of the earliest and most important crops in Neolithic agriculture (1), sits at the nexus of what many regard as the most fundamental technological transformation in human history. The oldest archaeological remains of domesticated barley and early forms of wheat are found in human Neolithic sites in the Fertile Crescent such as Abu Hureyra and Jericho (Fig. 1) and are dated to ≈8500 calibrated years BC
Fig. 1. The geographic distribution of sampled wild barley accessions and the locations of the human Neolithic sites mentioned in the text that contain early evidence of barley cultivation. The 25 wild barley accessions where all 18 loci were sequenced are indicated by filled circles. An additional 20 accessions (see Materials and Methods) were sequenced at four loci and are indicated by asterisks. Samples with majority assignment to the eastern cluster are shown in red, and samples with majority assignment to the western cluster are shown in blue. The Neolithic sites indicated include Jericho (Palestine), Abu Hureyra (Syria), Jarmo (Iraq), Ali Kosh (Iran), Jeitun (Turkmenistan), and Mehrgarh (Pakistan).
For 2,000 years or more, barley along with einkorn and emmer wheat were the primary cereal crops (1). Unlike wheat and other Fertile Crescent founder crops, the natural range of wild barley, the progenitor of cultivated barley, extends east into Central Asia to present day Kyrgyzstan, Afghanistan, and western Pakistan (1). Barley has been continuously cultivated for >8,000 years in southern Central Asia, east of the Fertile Crescent, but it has never been clear whether barley was domesticated locally or imported along with other founder crops from the Fertile Crescent (5).
Domestication of food plants involved not only profound modifications of human societies but also genetic changes in wild plants as cultivated forms were selected. Perhaps the most essential domestication trait in barley, the presence of nonbrittle ears of grain, is controlled by two distinct genetic loci (6–9). In domesticated barley (with nonbrittle ears), grains remain attached to the upright stems where they can be readily harvested. The locus responsible for nonbrittle ears differs among landraces from eastern and western Asia, suggesting independent origins (10) and fueling decades of discussion among archaeologists and biologists as to the number of domestications of barley (5, 11–13). Zohary (13) identifies two types of genetic evidence likely to be informative as to the number and/or locations of domestication.
First, the specific allelic composition of domesticates is the result of subsampling populations of the wild progenitor (13). If the wild progenitor has marked differences in allele frequencies among geographic regions, i.e., when there is geographically based genetic differentiation, allelic composition is especially likely to be informative as to the number and locations of origin of domesticates. Allelic composition in barley could be very informative because in wild barley roughly half of sequenced loci exhibit significant differentiation between the eastern and western portions of the species range (14–18). Differences in allele frequencies among eastern and western landrace barleys have also been reported (14, 19, 20). For example, at three of four esterase loci examined by Kahler and Allard (20), Central Asian and Far East landraces had alleles at ≈20% or greater frequency, which were found at much lower frequencies in European landraces and in wild barley from Israel and Turkey. Although there has been criticism of the methods used (21), neighbor-joining clustering based on distance among amplified fragment length polymorphism genotypes led Badr et al. (14) to conclude that cultivated barley had a single origin.
Zohary (13) also argues that independent domestications are likely to select for nonallelic mutations that govern the principal domestication-related traits (e.g., nonbrittle ears in cereals and the loss of germination inhibition). Distinct genetic loci governing the same domestication-related trait but found in different portions of the geographic range of a crop constitute strong evidence for multiple independent domestications (13). The genetic determination of three of the primary domestication-related traits in barley has been analyzed: (i) nonbrittle ears (22); (ii) “kernel row type,” which controls whether two or six rows of grains are produced (23–26); and (iii) the presence or absence of hulls around the grain (27, 28). Although the two genetic loci for nonbrittle ears are very closely linked (22, 29), haplotype data from a flanking marker is consistent with the earlier finding of Takahashi (6, 10) that different loci predominate in eastern and western cultivars (26, 30). The kernel row-type trait is also controlled by two separate genetic loci (24). The presence of hull-less grains is controlled by a single locus and appears to have had a single origin in domesticated barley, apparently somewhere east of the Fertile Crescent (27).
Wild barley offers an exceptional opportunity to detect multiple domestications because it has relatively high levels of nucleotide sequence diversity (15, 31, 32) and, more importantly, because loci sampled from different geographic regions often show marked differences in haplotype frequency (18). Ten of the 18 loci sampled by Morrell et al. (17, 18) show significant (P ≤ 0.05) geographic differentiation based on the nearest neighbor test (33) (Table 1). The most extreme geographic differentiation observed in wild barley is 2.2% per-site divergence among the major haplotypes from the Adh3 locus (15, 17); Dhn4 and G3pdh show similar per-site divergence among the major haplotypes in the eastern and western portions of the species range (17, 18).
Table 1. Results of the nearest neighbor test (10,000 replicates) for geographic structure, Snn, at 18 wild barley loci
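The nearest neighbor test mentioned above can be illustrated with a minimal sketch. It assumes that Snn is the average, over sequences, of the fraction of each sequence's nearest neighbors (by number of differing sites, ties included) that come from the same region, and that significance is judged by permuting the region labels. The function names and the toy sequences are hypothetical, and this is not the LIBSEQUENCE implementation the authors used.

```python
import random

def hamming(a, b):
    """Number of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def snn(seqs, labels):
    """Mean fraction of nearest neighbors that share the sequence's own label."""
    n = len(seqs)
    total = 0.0
    for i in range(n):
        dists = [(hamming(seqs[i], seqs[j]), j) for j in range(n) if j != i]
        dmin = min(d for d, _ in dists)
        nearest = [j for d, j in dists if d == dmin]  # ties are all kept
        same = sum(labels[j] == labels[i] for j in nearest)
        total += same / len(nearest)
    return total / n

def snn_permutation_p(seqs, labels, replicates=10000, seed=0):
    """Observed Snn and the share of label permutations with Snn >= observed."""
    rng = random.Random(seed)
    observed = snn(seqs, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(replicates):
        rng.shuffle(shuffled)
        if snn(seqs, shuffled) >= observed:
            hits += 1
    return observed, hits / replicates

# Toy data (made up): three "eastern" and three "western" sequences.
toy_seqs = ["AAAA", "AAAT", "AATA", "TTTT", "TTTA", "TTAT"]
toy_labels = ["E", "E", "E", "W", "W", "W"]
print(snn_permutation_p(toy_seqs, toy_labels, replicates=500))
```

With only six sequences the permutation P value cannot become very small, which is one reason a real analysis needs larger samples per region.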
Geographic patterns of genetic differentiation are associated with the major topographic feature within the range of wild barley, namely the Zagros Mountains (Fig. 1), that trend northwest to southeast and roughly bisect the range of the species. The Zagros also delineate the eastern edge of the Fertile Crescent; thus, the most dramatic differences in haplotype composition in wild barley occur between the Fertile Crescent and the portion of the range east of the Zagros (17). Differences in haplotype frequencies among regions also suggest that human activity, including transportation of cultivated barley among regions, has not homogenized genetic diversity across the range of the wild progenitor.
We have resequenced seven loci that show significant geographic structure in wild barley (Table 2) (18) in a sample of 32 cultivated barleys, the majority of which are landraces used in traditional agriculture [Fig. 2; see also supporting information (SI) Table 3]. The data set includes 196 SNPs from cultivated barley. The correspondence of haplotypes observed in cultivated and wild barleys can be used to estimate the probability that a landrace is derived from one or more portions of the range of the progenitor. Thus, we ask whether the haplotypic composition of landraces is concordant with that of wild barley from the same region.
Table 2. Seven loci sequenced in both wild and cultivated barley. Columns: locus; aligned length, bp; parsimony-informative SNPs.
Fig. 2. The geographic distribution of sampled barley landraces. The estimated probabilities of eastern and western wild barley origin for each sample are shown in red and blue, respectively. A landrace sample from Peru is not depicted.
To identify geographic discontinuities in genetic diversity in wild barley, we used resequencing data from 18 loci and 25 accessions that included a total of 684 SNPs (Fig. 1 and Table 1). A genetic assignment algorithm (34) was applied to the data with K = 2–4 clusters, without using prior information on geographic origin. The model-based algorithm identifies up to K clusters (where K may be unknown), each of which is characterized by distinct allele frequencies at each locus. With K = 2 clusters, the sample is split between the eastern and western portions of the species range, with the transition between predominantly eastern and western assignment occurring among accessions from the northern Zagros Mountains. However, samples from the northeastern extreme of the range cluster partially with western samples. With K = 3 or 4, northeastern samples are differentiated from both western and other eastern samples. This observation is consistent with haplotype composition; northeastern samples carry many private alleles, or haplotype segments, not sampled elsewhere.
Comparison of the percentage of alleles shared among samples to great circle geographic distance also reflects a change in allelic composition between the eastern and western clusters. In SI Fig. 4A, samples are compared both within and between eastern and western clusters. A group of highly differentiated pairs of samples occurs at ≈1,100 km, between accessions from the Mediterranean coast and the Zagros region, reflecting a change in allelic composition (Fig. 1 and SI Fig. 4A). A very similar pattern is also evident in wild barley based on the seven loci also sequenced in barley landraces (SI Fig. 4B) (discussed below).
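The comparison of shared alleles against great circle distance can be sketched as follows. This is an illustrative outline only: it assumes one allele code per haplotype segment per sample, uses the haversine formula with a mean Earth radius of 6,371 km, and the sample names, coordinates, and allele codes are made up.

```python
import math
from itertools import combinations

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine great circle distance in km; inputs are in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def percent_shared_alleles(a, b):
    """Percentage of haplotype segments at which two samples carry the same allele."""
    if len(a) != len(b):
        raise ValueError("samples must be scored at the same segments")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)

def pairwise_comparisons(samples):
    """samples: list of (name, lat, lon, alleles). Returns one row per pair."""
    rows = []
    for (n1, la1, lo1, a1), (n2, la2, lo2, a2) in combinations(samples, 2):
        rows.append((n1, n2, great_circle_km(la1, lo1, la2, lo2),
                     percent_shared_alleles(a1, a2)))
    return rows

# Hypothetical samples: names, coordinates, and allele codes are invented.
samples = [
    ("W1", 32.0, 35.5, [1, 1, 2, 1]),
    ("W2", 36.0, 38.0, [1, 1, 2, 2]),
    ("E1", 37.5, 58.0, [3, 2, 2, 4]),
]
for row in pairwise_comparisons(samples):
    print(row)
```

Plotting the last column against the third for within-cluster and between-cluster pairs gives the kind of comparison described for SI Fig. 4.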
For STRUCTURE analysis, the data must be treated as unique alleles. Each of the loci was broken into alleles (haplotype segments) based on direct evidence of recombination (see Materials and Methods). Among the 61 haplotype segments from the 18 loci, there are 329 total alleles, 182 of which are private to either eastern or western wild barley. Of these alleles, 40 occur at a frequency of ≥22% within one of the two partitions of the sample. Eighteen of these alleles are private to the eastern cluster, and 22 are private to the western. If the species-wide frequency of an allele is ≥22%, there is >95% probability that the allele would be sampled in accessions from both the eastern and western cluster.
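The sampling argument behind the 22% threshold can be illustrated with a simple binomial sketch. It assumes accessions are independent draws and that an allele has the same frequency in both clusters. The cluster size of 12 used below is a hypothetical value (the text gives only the total of 25 accessions), and the authors' exact calculation is not stated.

```python
def prob_sampled(freq, n):
    """Probability that an allele at frequency freq appears at least once in n draws."""
    return 1.0 - (1.0 - freq) ** n

def min_frequency_for_detection(n, prob=0.95):
    """Smallest allele frequency detected with probability >= prob in n draws."""
    return 1.0 - (1.0 - prob) ** (1.0 / n)

# Hypothetical cluster size; not taken from the paper.
n_cluster = 12
print(round(min_frequency_for_detection(n_cluster), 4))
print(round(prob_sampled(0.22, n_cluster), 4))
```

Under these assumptions, an allele that is common species-wide but absent from one cluster of this size is unlikely to be missing by sampling chance alone, which is the logic that makes high-frequency private alleles informative.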
Levels of nucleotide sequence diversity among the eastern and western cluster are similar, with average θπ for all sites over all 18 loci of 0.00579 for eastern wild barley and 0.00597 for western wild barley.
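For readers unfamiliar with θπ, the following sketch computes it as the average number of pairwise differences per site among aligned sequences. It assumes equal-length sequences with no gaps or missing data, and the toy sequences are made up. How the authors averaged across loci is not specified here.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Average pairwise differences per site among aligned, equal-length sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total_diffs / (len(pairs) * length)

# Toy alignment (made up) with three sequences of four sites.
toy = ["AAAA", "AAAT", "AATT"]
print(nucleotide_diversity(toy))
```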
Taken together, these results suggest that the clusters of wild samples identified in STRUCTURE analysis result from long-standing differences in allelic composition in populations east and west of the Zagros Mountains rather than recent demographic events (e.g., a bottleneck associated with recent founding of eastern populations).
Given that the primary geographic structure in wild barley differentiates western wild barley (from the Fertile Crescent) from eastern wild barley (from the Zagros and further east) we used STRUCTURE to estimate the probability of assignment for each cultivated barley accession to either of these two regions. We found that landrace accessions from east or west of the Zagros have very different probabilities of assignment to the western wild barley cluster (from the Fertile Crescent region). The median probability of assignment for western landraces is 99.4%, with a range of 62.2–99.8% (Fig. 2 and SI Fig. 5). Eastern landraces have a median probability of assignment to the western cluster of only 33.6%, with a range from 0.2% to 78.8%. Three-quarters of eastern landraces have <50% assignment to the western wild barley cluster. Accessions from the Zagros Mountains and Caspian Sea region show the lowest probability of origin in the Fertile Crescent (Fig. 2 and SI Fig. 5). Changes in specification of the data model in STRUCTURE (see Materials and Methods) do not change the primary assignment of individual landrace accessions but do result in less discrete clustering.
As was the case in wild barley, comparison of the percentage of shared alleles with geographic distance within versus between eastern and western landraces indicates differentiation among geographic regions (SI Fig. 6).
At six of the seven loci sequenced in the landrace samples (all but Dhn4), haplotypes that were found exclusively in eastern wild samples also were found to be present in eastern landraces (see SI Fig. 7). The haplotypes include 10 SNPs found only in eastern wild barley and eastern landraces.
These lines of evidence strongly suggest an independent domestication of barley outside the Fertile Crescent. All landrace accessions from east of the Zagros, across Asia to North Korea, have substantial identity to eastern wild barleys (Fig. 2 and SI Fig. 5), with assignment probabilities suggesting a mix of eastern and western origin. These data suggest that eastern landraces have been subject to admixture from imported, western landraces. Archaeological evidence indicates extensive introduction of other Fertile Crescent founder crops into central and southern Asia; for example, einkorn wheat was introduced by ≈6,000 B.C. (4, 5, 11). Thus, it would be surprising if western barley landraces had not been imported along with other domesticates. Introgression with western domesticated barley is likely to have been ongoing; Ordon et al. (19) report high levels of similarity between Japanese barley bred for malting quality and European cultivars.
Which geographic regions have contributed to modern cultivars? To address this question we considered 10 modern cultivars and three barley genetic stocks from North America and Europe. The cultivars have a median 89.7% probability of assignment to the western wild cluster, consistent with the introduction of barley into Europe and ultimately North America from the Fertile Crescent region (Fig. 3) (35). However, two U.S. cultivars and one line of the genetic stocks have <50% probability of assignment to the western cluster, probably because modern barley breeding programs have introduced genetic material from eastern wild barley in the quest to exploit novel germplasm (36). More generally, assignment tests indicate that wild barley from the Fertile Crescent contributed the majority of genetic diversity in present-day European and American cultivars, whereas wild barley from Central Asia contributed half or more of the genetic diversity in barley cultivated from Central Asia to the Far East.
Fig. 3. Assignment of cultivated barley samples and barley genetic stocks relative to the eastern and western wild barley clusters. Each column represents a single individual, with the probability of assignment to eastern wild barley shown in red.
Where could a second barley domestication have occurred? The relatively broad, species-wide sample mesh in the present study suggests an origin of eastern landraces in the western foothills of the Zagros or points farther east. Much of the region immediately east of the Zagros is a high-elevation plateau, where both wild barley populations (4) and known human Neolithic sites are relatively rare (5). However, the locations of early Neolithic agropastoral settlements suggest three general regions in which the secondary domestication could have taken place. In the foothills of the Zagros, at such sites as Ali Kosh and Jarmo (Fig. 1), domesticated barley is dated to ≈7,000–8,000 cal. B.C. Domesticated barley is found at the Indus Valley site of Mehrgarh (in present day Pakistan) from ≈7,000 cal. B.C. Finally, in the piedmont zone between the Kopet Dag mountain range and Kara Kum Desert (east of the Caspian Sea in present day Turkmenistan), cultivated barley was present by ≈6,000 cal. B.C. (see ref. 37). Both naked and hulled six-row barleys were cultivated at Mehrgarh and by the Jeitun culture (in the Caspian Sea region) (5). Mehrgarh lies near the eastern edge of the range of wild barley (1); thus, it is possible that barley found at Mehrgarh was domesticated locally. At Jeitun in southern Turkmenistan (the type site of the Jeitun Culture on the Kopet Dag piedmont), domesticated forms of (probably six-row) barley and einkorn wheat (Triticum monococcum) were being cultivated, and domesticated goats and sheep were being herded, by 6,000 cal. B.C., indicating the presence of a well developed agrarian society (5, 37).
Sample size in the present study limits our ability to compare haplotype composition in various portions of the eastern range of wild barley with that of eastern landraces. However, among the haplotype segments (from six loci) most indicative of eastern origin, i.e., those shared by eastern landraces and eastern wild barley but not sampled in western wild barleys, all but one were found in wild barleys from east of the Caspian Sea, and five segments were exclusive to that region; only one of the 11 segments was sampled only in the western Zagros. Based on a combination of archaeological and genetic data, the most likely location for a second origin of barley is ≈1,500–3,000 km east of sites in the Fertile Crescent where western barley is most likely to have been domesticated.
Evidence for two domestications must be weighed against the alternative, a single domestication in the Fertile Crescent followed by extensive introgression (14). Introgression from Central Asian wild barley into landraces imported from the Fertile Crescent would be necessary because wild barley samples from the Fertile Crescent do not carry all of the haplotype diversity found in eastern landraces. Expanding agrarian cultures could have imported landraces from the Fertile Crescent to Central Asia (5), where introgression with wild barley occurred (14).
There are two principal problems with this scenario (a single domestication followed by introgression). First, in what was perhaps 2,000 or more years before agrarian culture expanded into Central Asia, cultivated barley was subject to human selection for agronomically desirable polygenic traits, such as seed size and loss of seed dormancy (12, 38). After hybridization between wild and landrace barley, repeated back-crossing to the recurrent parent (imported landraces) would be necessary to recover agronomically important traits, diminishing the potential genetic contribution of the donor parent (eastern wild barley). The second issue is that a single domestication cannot explain the two independent origins of the domestication-related traits, nonbrittle ears and kernel row type. As Zohary (13) points out, the fixation of independent mutations at nonallelic, nonbrittle ear loci in cultivated barley is strongly suggestive of at least two domestications. With the addition of multilocus haplotype data demonstrating a strong geographic discontinuity in the probable wild founder populations of barley landraces, both of Zohary’s criteria for identifying multiple domestications are satisfied.
Modern analytical methods, combined with high-throughput DNA sequencing at multiple informative loci permit a clearer view of historical events associated with domestication of a major founder crop and reveal at least two initial domestications at the dawn of agriculture. Because both wild and domesticated barleys maintain high levels of informative nucleotide sequence diversity (31), it is probable that more intensive sampling from an appropriate geographic mesh can provide a finer scale view of the history of domestication.
Plant Materials and Data Collection. Seeds of wild and cultivated barley landraces and cultivars were obtained from the U.S. Department of Agriculture, Agricultural Research Service, National Small Grains Collection (Aberdeen, ID). Sampled wild barleys were included in previous studies (e.g., refs. 16, 31, and 39). Seeds for U.S. and Canadian cultivated barleys and genetic stocks were provided by the laboratories of Timothy Close (University of California, Riverside) or Patrick Hayes (Oregon State University, Corvallis). Sampled accessions were drawn from across the natural geographic and cultivated range of the species, with a special emphasis on landrace barleys from regions that archaeological evidence suggests were early sites for barley cultivation (see SI Table 3). Leaf material from individual plants was harvested, and DNA was extracted by using DNAzol ES (Molecular Research Center, Cincinnati, OH) according to the instructions of the manufacturer.
Seven loci (Table 2) were completely resequenced in a panel of 32 cultivated barleys (SI Table 3). Nineteen of the accessions are listed by the U.S. Department of Agriculture GRIN (The Germplasm Resources Information Network) database as landraces. Sampled landraces are primarily from western and central Asia, with one sample each from Africa, Europe, and South America (see SI Table 3 for the country and geographic locality of origin of each accession). The balance of the sample includes 10 modern cultivars and three North American genetic stocks. The modern cultivars include one each from Scotland and Poland and eight from the U.S. and Canada.
For three loci, Cbf3, Dhn9, and ORF1, additional wild accessions were resequenced, resulting in sample sizes of 52, 45, and 47.
PCR amplification, sequencing, and fragment assembly follow the methods of Morrell et al. (17), i.e., direct sequencing of PCR products using Big Dye V. 3.1 (Applied Biosystems, Foster City, CA) followed by assembly of sequence fragments with PHRED/PHRAP/CONSED (40–42). POLYPHRED V. 5.04 was used for polymorphism detection (43). For the Cbf3 locus, sequences from five wild accessions were computationally phased by using PHASE V. 2.1 (44–46). Error detection using triplets of SNPs (ref. 38 and D. M. Toleno, P.L.M., and M.T.C., unpublished data) was used to confirm the accuracy of haplotypes at all loci.
Data Analysis. Tests for geographic structure at individual loci (Table 1) used the nearest neighbor test (33) as implemented in LIBSEQUENCE (47). Diversity statistics for haplotype segments were calculated with GDA 1.1 (48). The haplotype diagrams in SI Fig. 7 were generated by using SNAP MAP (49). Great circle distance among samples (SI Figs. 4 and 6) was calculated by using the FIELDS package in the R statistical language and programming environment. Unless specified, all other analyses used custom scripts written in R.
To estimate the probability that individual cultivated barley accessions derive from wild barley either within or outside the Fertile Crescent, we used a model-based genetic assignment algorithm implemented in the computer program STRUCTURE V. 2.1 (34, 50). As in previous studies employing resequencing data (51), we have recoded haplotypes as unique alleles, as required by the STRUCTURE data model. Portions of each locus that have a partially independent genealogical history, as indicated by the presence of a detectable recombination event (52), are treated as individual haplotype segments. All parsimony-informative (nonsingleton) mutations were used to define haplotype segments.
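The recoding step can be illustrated with a small sketch. It assumes segment boundaries have already been determined (the recombination detection itself is not shown), treats a site as parsimony-informative when at least two states each occur in two or more sequences, and assigns integer allele codes in order of first appearance. The function names and toy sequences are hypothetical and do not reproduce the authors' scripts.

```python
from collections import Counter

def parsimony_informative_sites(seqs):
    """Positions where at least two states are each present in >= 2 sequences."""
    sites = []
    for pos in range(len(seqs[0])):
        counts = Counter(s[pos] for s in seqs)
        if sum(1 for c in counts.values() if c >= 2) >= 2:
            sites.append(pos)
    return sites

def recode_as_alleles(seqs):
    """Map each sequence of one haplotype segment to an integer allele code."""
    sites = parsimony_informative_sites(seqs)
    codes = {}
    alleles = []
    for s in seqs:
        hap = "".join(s[p] for p in sites)  # haplotype over informative sites only
        if hap not in codes:
            codes[hap] = len(codes) + 1
        alleles.append(codes[hap])
    return alleles

# Toy segment (made up): site 0 is informative, site 3 carries a singleton.
toy_segment = ["ACGT", "ACGA", "TCGT", "TCGT"]
print(recode_as_alleles(toy_segment))
```

Because the singleton at the last site is ignored, the first two sequences receive the same allele code even though they differ at one position.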
A STRUCTURE analysis, based on 61 haplotype segments from the 18 loci previously surveyed in wild barley (31), was used to assign each wild barley individual to a cluster of origin without using prior information as to the geographic origin of individual samples. Assignment of cultivated barleys was based on data from seven loci that showed geographic structure (Table 1) and were highly informative for assignment (53) (SI Table 4) into the two groups defined by all 18 loci.
We explored several options in STRUCTURE, with initial analyses based on a minimal set of assumptions regarding the data. The initial setup used a model with uncorrelated allele frequencies among populations with no admixture; sampled segments were treated as unlinked, and the geographic location of origin was not used to assist clustering. This setup reflects prior data from wild barley populations suggesting that major portions of the range of wild barley have large differences in haplotype frequency (18) with moderate levels of gene flow (17) but no evidence of recent population admixture.
The probability of assignment of domesticated barleys to eastern or western clusters was inferred with K = 2; allele frequencies were estimated with wild barley treated as a learning sample. In this analysis, geographic population of origin was used to cluster wild barleys but not used for domesticates. Cultivated barleys have a known collection location, but their probability of derivation from various portions of the range or their wild progenitor is unknown and inferred based on allele frequencies in the wild barley learning sample. Ten replicate runs of STRUCTURE were carried out, with 100,000 replicates for burn-in and 200,000 replicates during analysis. Reported probabilities of assignment are based on the run with the highest likelihood.
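The idea of assigning a cultivated sample by using wild barley as a learning sample can be sketched in a much simplified form. STRUCTURE itself estimates allele frequencies and assignments by Markov chain Monte Carlo; the plug-in calculation below is not that algorithm. It assumes no admixture, unlinked segments, one allele per segment, equal prior weight for each cluster, and allele frequencies smoothed with a pseudocount. All names and data are hypothetical.

```python
import math
from collections import Counter

def assignment_probabilities(learning, query, pseudocount=1.0):
    """Posterior probability that query derives from each learning cluster.

    learning: dict mapping cluster name -> list of samples (allele code per segment)
    query:    list with one allele code per segment
    """
    n_loci = len(query)
    # Distinct alleles per segment across all learning samples plus the query.
    allele_sets = []
    for l in range(n_loci):
        seen = {query[l]}
        for samples in learning.values():
            seen.update(s[l] for s in samples)
        allele_sets.append(seen)

    log_like = {}
    for cluster, samples in learning.items():
        logp = 0.0
        for l in range(n_loci):
            counts = Counter(s[l] for s in samples)
            freq = (counts[query[l]] + pseudocount) / (
                len(samples) + pseudocount * len(allele_sets[l]))
            logp += math.log(freq)
        log_like[cluster] = logp

    # Normalize in log space for numerical stability (equal priors assumed).
    top = max(log_like.values())
    weights = {k: math.exp(v - top) for k, v in log_like.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

# Made-up learning sample with two segments per individual.
wild = {
    "east": [[1, 1], [1, 1], [1, 2]],
    "west": [[2, 3], [2, 3], [2, 3]],
}
print(assignment_probabilities(wild, [1, 1]))
```

The pseudocount keeps an allele that is unobserved in one cluster from forcing a probability of exactly zero, which matters when learning samples are small.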
To explore variation in assignment due to model specification, the panel of wild individuals was resampled with replacement to create 20 test samples from both the eastern and western cluster and 20 samples with the allele at each locus drawn randomly and with equal probability from either cluster. Multiple replicate searches were used to explore the impact of model specification on assignment of these test samples where the genetic contribution of each source population is known. Data models tested included a linkage model that accounts for the use of adjacent chromosomal segments (this must be specified along with an admixture model) and with map distances among segments based on the midpoint distance in kilobases. We also explored combinations of models with or without correlated allele frequencies, admixture, and the use of location of origin for wild barley samples to assist clustering. Test samples were clustered with wild barley as a learning sample. The addition of admixture, correlated allele frequencies, and the linkage model resulted in lower probabilities of assignment of test samples to their known population of origin and increased variance of assignment in all test samples. Using prior information on population of origin for wild samples resulted in higher probability of assignment of test samples to their population of origin.
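The construction of test samples with known ancestry can be sketched as below. The text does not say whether whole individuals or per-segment alleles were resampled; this sketch assumes, as one possible reading, that each segment's allele is drawn with replacement from the relevant pool, and that for mixed samples the source cluster is chosen with equal probability at each segment. Function names and data are hypothetical.

```python
import random

def draw_sample(pool, rng):
    """Synthetic sample: each segment's allele comes from a random individual in pool."""
    n_loci = len(pool[0])
    return [rng.choice(pool)[l] for l in range(n_loci)]

def draw_mixed(east, west, rng):
    """Synthetic sample: each segment is drawn from east or west with equal probability."""
    n_loci = len(east[0])
    return [rng.choice(rng.choice([east, west]))[l] for l in range(n_loci)]

def make_test_samples(east, west, n=20, seed=0):
    """Return n eastern, n western, and n mixed test samples."""
    rng = random.Random(seed)
    return {
        "east": [draw_sample(east, rng) for _ in range(n)],
        "west": [draw_sample(west, rng) for _ in range(n)],
        "mixed": [draw_mixed(east, west, rng) for _ in range(n)],
    }

# Made-up wild panels with three segments per individual.
east_panel = [[1, 1, 1], [1, 2, 1]]
west_panel = [[5, 6, 7], [5, 6, 8]]
tests = make_test_samples(east_panel, west_panel)
print(len(tests["east"]), len(tests["west"]), len(tests["mixed"]))
```

Running an assignment method on such samples shows how often it recovers the known source, which is the kind of check the model comparison above relies on.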