key: cord-0810365-o5tv0ah8
authors: Hassan, Diyar Ahmed; Hama-Ali, Emad Omer
title: Evaluation of gene flow and genetic diversity in rice accessions across Kurdistan region-iraq using SSR markers
date: 2021-11-07
journal: Mol Biol Rep
DOI: 10.1007/s11033-021-06920-x
sha: fa8c2a77dbf9ec17c9c50f61ecbada3e7c5d260c
doc_id: 810365
cord_uid: o5tv0ah8

BACKGROUND: In recent years, farmers have complained that the only way to obtain seeds is to select plants that show good performance under local climate conditions in the region. This study aimed to investigate the diversity of rice accessions grown in the region in order to build a breeding program.

METHODS AND RESULTS: A total of 62 accessions of rice from farmers and research stations were collected from the Kurdistan region, including short-grain and long-grain types, for molecular genetics and diversity analysis. In this study, 37 polymorphic simple sequence repeat (SSR) markers were selected, and the data were analyzed with several molecular genetics software programs. The results show that these SSR markers are very effective for this investigation, generating a total of 152 observed alleles (Na) and 75.166 effective alleles (Ne), with averages of 4.1 and 2.03 alleles per locus, respectively. The average polymorphic information content (PIC) per locus was 0.404. The research presented here confirms two subpopulations, japonica (C1 and C2) and indica (C3), based on molecular genetics data analysis. Analysis of molecular variance revealed that 72% of the variance was due to variation among populations and 28% to variation within populations.

CONCLUSIONS: Altogether, these results indicate that there is very low gene flow. They also show the importance of studying genetic diversity and relationships before starting breeding and improvement programs for rice in the Kurdistan region.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s11033-021-06920-x.

Rice (Oryza sativa L.)
or Asian cultivated rice (indica and japonica) is among the most agronomically and nutritionally important crops worldwide. The indica genotypes are tropical rice cultivars that are grown in lowland conditions. In contrast, the japonica genotypes can be either tropical cultivars adapted to rainfed upland conditions or temperate cultivars adapted to lowland conditions. Based on literary evidence [1], it is believed that rice was introduced into the Kurdistan region of Iraq around the 12th century BC. Rice is the leading food consumed in Iraq and Kurdistan. In the south and middle of Iraq, amber (long-grain) rice is cultivated, and in northern Iraq (the Kurdistan Region), both long- and short-grain rice are cultivated [2]. Approximately 70% of rice is imported, and only 30% is produced locally. Improving the quality and quantity of rice production is therefore important for farmers in the Kurdistan region (Erbil, Sulaymaniyah and Duhok). In Kurdistan, there are no published data on genetic diversity in rice; the only available information is the local names that farmers assign based on variation in paddy and grain traits.

Rice consumption in Iraq was estimated at 45.7 kg of milled rice per person in 2017, according to Helgilibrary [3]. In the middle and south of Iraq, planting starts in June-July and harvesting takes place in September-October [4]. In the Kurdistan Region, the planting season begins in April or May, and the plants are harvested in October [5]. According to Jeong et al. [6], average rice imports have doubled worldwide, and Iraq is among the top ten importers of milled rice. Total global rice trade is expected to grow by 1.37% annually over the next 10 years because of high demand from importing countries [6]. There is a lack of data on rice production in the Kurdistan region. According to the assessment of Ewaid et al.
[7], the average rice production from 2007 to 2016 was 271,173 tons in all provinces of Iraq, excluding the Kurdistan region.

Researchers worldwide have conducted numerous successful studies defining the genetic diversity of rice using different molecular markers. Random amplified polymorphic DNA (RAPD) markers were used first [8, 9] and were later combined with simple sequence repeat (SSR) markers [10] [11] [12]. Amplified fragment length polymorphism (AFLP) was also used to analyze the genetic variability of rice [13, 14], in some cases in conjunction with SSR [15]. Additionally, single-nucleotide polymorphism (SNP) markers [16] [17] [18] were used along with SSR [19, 20]. Among these DNA markers, SSR has been considered the best marker for studying the genetic diversity of rice over the last two decades [21] [22] [23].

There are no published data on the genetic diversity of rice in the region, and there is no rice breeding program. The only activity is the selection, by farmers, of plants that perform well under local climate conditions. This study is the first step toward a diversity analysis of the rice accessions grown in the region and toward building a breeding program.

During 2019, samples of approximately 100 rice accessions were collected from farmers' fields and from different research institutes in Iraq and the Kurdistan region. The samples were classified based on morphological characteristics, such as seed color, grain size, presence or absence of awns, life cycle, and geographic location (Table S1). A collection of 62 rice accessions (50 from farmers, 9 from the Directory of Agricultural Research in Sulaymaniyah, and 3 from the Al-Mishkhab Research Center) was selected and planted in 2020 for molecular investigation (Fig. 1). Seeds were soaked for 3 days in the lab and, on May 20, transferred to the field at Preamagrun-Gaba village (latitude 35° 43′ 14.35′′ N, longitude 45° 04′ 32.02′′); the plants were harvested in late October.
Cultural practices, including irrigation, weed control and fertilization, were carried out during the season.

Fresh leaves from each accession were collected from 25-day-old seedlings. The clean leaves were ground with a pestle and mortar in liquid nitrogen. A total of 150 mg of each ground sample was used for genomic DNA extraction using the Quick-DNA™ Plant/Seed Miniprep Kit, Catalogue No. D6020 (Zymo Research, Irvine, CA, USA). The quantity and quality of DNA were determined with a NanoDrop ND-2000/2000c spectrophotometer (ThermoScientific, USA) and checked on a 1% agarose gel in 1X TBE buffer. The gel was viewed using a Labnet gel documentation system (LabNet International Inc., Edison, NJ, USA).

A total of 37 polymorphic SSR markers were obtained from the Gramene database [24] based on their polymorphism information content (PIC) values in previous research (Table S2) and, after primer screening, were used for all 62 rice accessions. The total PCR volume was optimized to 20 µl and included 2 µl of DNA template (approximately 15 ng), 6.0 µl of AccuPower® PCR PreMix master mix (Bioneer, Korea), 1.5 µl of each primer (forward and reverse), and 9.0 µl of double-distilled water. Amplification was carried out in a thermocycler (Applied Biosystems) under the following conditions: 5 min at 95 °C; 35 cycles of 50 s at 95 °C, 50 s at the annealing temperature (Table S2), and 50 s at 72 °C; followed by 7 min at 72 °C. The amplified products were visualized on ethidium bromide-stained 3% MetaPhor agarose gels in 1% TBE alongside a 100 bp DNA ladder (Add Bio). The gel was viewed using the Labnet gel documentation system (LabNet International Inc., Edison, NJ, USA), and the band size for each SSR primer was estimated from the gel image by comparing the sample bands with the DNA ladder.

The SSR data were analyzed using POPGENE v1.32 [25] software to determine the allele frequency, Na, Ne and gene diversity per locus in each accession.
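The per-locus statistics named above all follow from allele frequencies. The sketch below shows one way to compute them, assuming the textbook definitions (Ne = 1/Σp², gene diversity = 1 − Σp², and the Botstein et al. formula for PIC). It is not POPGENE's implementation, which may apply sample-size corrections, and the function name `locus_stats` and the example band sizes are hypothetical.

```python
from collections import Counter


def locus_stats(alleles):
    """Return Na, Ne, gene diversity (He) and PIC for one locus.

    `alleles` is a flat list of allele calls (for example band sizes in bp),
    one entry per sampled gene copy. Assumes textbook, uncorrected formulas.
    """
    counts = Counter(alleles)
    total = sum(counts.values())
    freqs = [c / total for c in counts.values()]

    na = len(freqs)                      # observed number of alleles
    hom = sum(p * p for p in freqs)      # expected homozygosity, sum of p_i^2
    ne = 1.0 / hom                       # effective number of alleles
    he = 1.0 - hom                       # gene diversity

    # PIC = He minus the sum over allele pairs i < j of 2 * p_i^2 * p_j^2
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(na)
        for j in range(i + 1, na)
    )
    return {"Na": na, "Ne": ne, "He": he, "PIC": he - cross}


# Hypothetical locus with two equally frequent alleles (150 bp and 160 bp).
print(locus_stats([150, 150, 160, 160]))
# {'Na': 2, 'Ne': 2.0, 'He': 0.5, 'PIC': 0.375}
```

Averaged over the 37 loci, the totals reported in the abstract give 152/37 ≈ 4.1 for Na and 75.166/37 ≈ 2.03 for Ne, consistent with the stated per-locus means.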
To detect subpopulations among the genotypes, STRUCTURE 2.3.4 software [26] was used. The run parameters were set to a burn-in period of 100,000 iterations followed by 100,000 Markov chain Monte Carlo (MCMC) replications. K was varied from 2 to 10, and 10 replicate runs were performed for each value of K. The best number of subpopulations (K) was selected with Structure Harvester [27]. A dendrogram of rice genotypes was generated with the unweighted pair group method with arithmetic mean (UPGMA) in PowerMarker v3.25 [28] and then visualized using MEGA X [29]. GenAlEx v6.5 [30] was used for principal coordinates analysis (PCoA) and analysis of molecular variance (AMOVA).

Sixty-two rice accessions were investigated using 37 polymorphic SSR markers. All genotypes were collected from Iraq: 50 from farmers in the Kurdistan region (Erbil, Sulaymaniyah and Duhok), nine from the Directory of Agricultural Research in Sulaymaniyah, and three from the Al-Mishkhab Rice Research Station near Najaf. A total of 37 polymorphic SSR markers (sequence details are available in Table S2) were selected for genotyping these collections after primer screening. Allele frequency results (Table 1) show a total Na of 152, with an average of 4.1 alleles per locus. The maximum Na of 7 alleles was observed for primers RM20, RM257, and RM294. A minimum of 2 alleles was observed for RM23, RM171, RM172,

The population genetic structure of the 62 rice genotypes was obtained from the STRUCTURE program using the SSR genotypic data. The study presents crucial evidence for defining the population structure of Kurdistan region rice genotypes. The best K (number of subpopulations) was estimated by Structure Harvester and is presented in Fig. 2, which indicated K = 3. Consequently, the K = 3 results from STRUCTURE were chosen and are illustrated in Fig. 2.
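Structure Harvester implements the Evanno method [27], which chooses K from the rate of change of the log-likelihood between successive K values. A minimal sketch of that ΔK calculation follows, assuming the second difference is taken on replicate means of ln P(D) and divided by the standard deviation across replicates at K. The function name and the log-likelihood values are invented for illustration and are not the study's data.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Compute Delta K for each K that has both neighbours K-1 and K+1.

    `lnp_by_k` maps K to a list of ln P(D) values from replicate runs.
    Delta K = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd L(K).
    """
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in lnp_by_k or k + 1 not in lnp_by_k:
            continue  # undefined at the smallest and largest K tested
        spread = stdev(lnp_by_k[k])
        if spread == 0:
            continue  # identical replicates: ratio undefined, skip this K
        second_diff = abs(
            mean(lnp_by_k[k + 1]) - 2 * mean(lnp_by_k[k]) + mean(lnp_by_k[k - 1])
        )
        delta[k] = second_diff / spread
    return delta


# Hypothetical replicate log-likelihoods for K = 2..5 (three runs each).
lnp = {
    2: [-3000.0, -3002.0, -2998.0],
    3: [-2500.0, -2501.0, -2499.0],
    4: [-2450.0, -2460.0, -2440.0],
    5: [-2420.0, -2440.0, -2400.0],
}
delta_k = evanno_delta_k(lnp)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

Because ΔK needs both neighbouring K values, a scan of K = 2 to 10 yields ΔK only for K = 3 to 9.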
The blue cluster (C1) contains 45 short-grain rice genotypes, the red cluster (C2) contains eight short-grain rice genotypes, and the green cluster (C3) contains nine long-grain rice genotypes. All genotypes assigned to these subpopulations were considered pure because their membership scores exceeded 0.80, as shown in Fig. 2.

A distance (dissimilarity) matrix was obtained from PowerMarker [28], and the tree was built with the UPGMA clustering method and visualized using MEGA X [29]. As shown in Fig. 3, the UPGMA circular tree has three main clusters, blue (45), red (8), and green (9), matching the STRUCTURE results.

The present study shows a structural analysis of rice genotypes from Iraq with three clusters (C1, C2, and C3) using 37 SSR markers. Additionally, the phylogenetic tree confirms the STRUCTURE result with the same three clear clusters. However, C1 and C2 fell within the same subdivision (Fig. 3), whereas C3 separated independently. In addition, all accessions in clusters C1 and C2 are short-grain rice, but the C2 grains are colored, while the C3 samples are long-grain. Furthermore, from C1 and C2, only one genotype (R49) is registered in the International Rice Genebank Collection, under the name Bazian (IRGC 9506). Bazian is the short-grain rice preferred by local consumers. In addition, R42 and R54, which are locally known as Tahalf or Alliance rice, were introduced to the region by the Food and Agriculture Organization (FAO) in the late 1990s. From C3, the genotypes R43 and R44 are registered under the name Ambar (IRGC 9505). All three varieties from the Al-Mishkhab research center clustered in the C3 indica subpopulation (R43-Ambar-furat, R44-Ambar-muaazra, and R45-Yasamin).

The genotypic data were rearranged based on the results obtained from STRUCTURE and PowerMarker.
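The UPGMA step used for the tree above can be illustrated with a small pure-Python sketch. It assumes a symmetric dissimilarity matrix as input and is not PowerMarker's implementation. The function name `upgma`, the genotype labels and the distances are hypothetical.

```python
def upgma(labels, dist):
    """Cluster items by UPGMA (average linkage weighted by cluster size).

    `labels` is a list of names and `dist` a symmetric matrix (list of lists).
    Returns the merge history as (cluster_a, cluster_b, height) tuples, where
    each cluster is a tuple of labels and height is half the merge distance.
    """
    n = len(labels)
    clusters = {i: (labels[i],) for i in range(n)}
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    merges = []
    next_id = n
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        (a, b), dab = min(d.items(), key=lambda kv: kv[1])
        size_a, size_b = len(clusters[a]), len(clusters[b])
        merges.append((clusters[a], clusters[b], dab / 2))
        # Distance from the merged cluster to every other cluster is the
        # size-weighted mean of the two old distances.
        new_d = {}
        for c in clusters:
            if c in (a, b):
                continue
            dac = d[(min(a, c), max(a, c))]
            dbc = d[(min(b, c), max(b, c))]
            new_d[(c, next_id)] = (size_a * dac + size_b * dbc) / (size_a + size_b)
        # Drop every distance that involves either merged cluster.
        d = {pair: v for pair, v in d.items() if a not in pair and b not in pair}
        d.update(new_d)
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


# Hypothetical dissimilarities: g1 and g2 are close, g3 is nearby, g4 is distant.
labels = ["g1", "g2", "g3", "g4"]
dist = [
    [0.0, 0.2, 0.5, 0.9],
    [0.2, 0.0, 0.5, 0.9],
    [0.5, 0.5, 0.0, 0.9],
    [0.9, 0.9, 0.9, 0.0],
]
for left, right, height in upgma(labels, dist):
    print(left, right, round(height, 3))
```

In this toy input the two closest items merge first, the nearby item joins them next, and the distant item joins last. This loosely mirrors the reported pattern of C1 and C2 sharing a branch while C3 separates, but the numbers are illustrative only.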
Then, PCoA and AMOVA were performed using GenAlEx.

Principal coordinates analysis (PCoA) showed a clear separation between the rice subpopulations (Fig. 3). They were clearly distributed along the first two coordinates (1 vs. 2): subpopulation C1 was located in quadrant 1, subpopulation C2 in quadrant 4, and subpopulation C3 in quadrant 2. Additionally, subpopulations C1 and C2 were more closely related to each other than to C3, which agrees with the phylogenetic tree output. The first 3 axes explained 77.31% of the cumulative variation (Table 2).

Table 3 shows the results of AMOVA based on 37 SSR markers (input as an allelic distance matrix for F-statistics analysis). The partition of molecular variance for the three subpopulations revealed that 71% of the variance was due to variation among populations and 29% to variation within individuals (Table 3). The estimated variance among individuals (within single populations) was zero (0). A fixation index (FST) value of 0.726 was recorded and was significant (P value = 0.001), and the gene flow (Nm) value was 0.095. Furthermore, when AMOVA was rerun with the among-individuals-within-populations source suppressed, the partition changed only slightly, to 72% (among populations) and 28% (within populations). Finally, the AMOVA results based on PhiPT (Table 4) revealed a larger change in the percentage of molecular variance, to 75% (among populations) and 15% (within populations).

According to the pairwise population FST results (Table 5, below the diagonal), pairwise FST values of 0.685, 0.745, and 0.751 were recorded between subpopulations C1 and C2, C1 and C3, and C2 and C3, respectively. The Nm results (Table 5, above the diagonal) showed that the highest Nm (0.115) occurred between C1 and C2 and the lowest (0.083) between C2 and C3.

Evaluation of genetic diversity is an important factor for rice germplasm collection and breeding programs.
In this study, 37 SSR markers were selected for genotyping 62 accessions of rice. The allele numbers in the present study are similar to those found by [31, 32, 34], who reported 2 to 7 alleles per locus in their genetic studies of rice from south Asia and India. The average gene diversity (0.442) in this study agrees with previous results [23, 32].

Most researchers investigating the genetic diversity of rice using SSR markers have used PIC as an indicator of the capability of markers to detect polymorphisms. Singh et al. [20] demonstrated that the PIC value depends on many factors, such as the diversity of the germplasm, population size, genotyping method and the marker loci in the genome. Accordingly, different PIC values have been reported: 0.240 [34], 0.416 [32], 0.420 [19], 0.483 [23], 0.560 [40], 0.570 [41], 0.630 [31], 0.704 [42], and 0.738 [43].

SSR has been considered the best marker for studying the genetic diversity of rice and characterizing rice germplasm over the last two decades [21] [22] [23]. The reliability of the genetic structure analysis of any population depends on the number of molecular markers used in the investigation. Zhang et al. [23] reported that 72 SSR markers are sufficient for population structure analysis of 150 rice varieties. Scaling proportionally (72 × 62/150 ≈ 30), about 30 SSR markers should be sufficient for population structure analysis of the 62 rice genotypes in the present study. However, we used 37 SSR markers to make the results more reliable.

The short-grain rice accessions grouped under clusters C1 and C2, and the long-grain accessions grouped under C3, based on the circular dendrogram (Fig. 3). The results suggest that clusters C1 and C2 belong to the japonica subpopulation and cluster C3 to the indica subpopulation. These results are in agreement with those obtained by [21] in the eastern Himalayan region of northeast India, in Thailand [40], in Egypt [41], and by [32], who indicated two major (japonica and indica) subpopulations.
In China, [43] identified three major groups: indica, temperate japonica, and tropical japonica.

In the present study, the allelic form of the SSR data was used, which is the most standard input for obtaining AMOVA results in GenAlEx. A high level of genetic variation among populations (71%) was found among the studied rice accessions (Table 3). When the allelic form of SSR data is used and there are differences among individuals within populations, there is no need to suppress this source of variance. However, the estimated variance among individuals (within single populations) was set to zero (0) (Table 3) because the estimate was slightly negative, meaning that no significant differences among individuals were found. The among-individuals source could therefore be suppressed (Table 3). In addition, some researchers use PhiPT, in which each diploid genotype is treated as a unit (in a quantitative fashion), because they want to know how different the genotypes are from one individual to the next [44].

Genetic differentiation is classified as low (FST = 0.00-0.05), moderate (FST = 0.05-0.15) or high (FST > 0.30) [45]. In this study, a high level of genetic differentiation (FST = 0.726, p < 0.001) was found between subpopulations C1, C2 and C3 (Table 3). The lowest pairwise FST value of 0.685 was recorded between subpopulations C1 and C2, and the highest of 0.751 was recorded between subpopulations C2 and C3. Similarly, Verma et al. [46], Gouda et al. [47] and Suvi et al. [48] reported very high genetic differentiation, with FST values of 0.827, 0.490 and 0.407 among subpopulations, respectively. The Nm value (0.095) indicates very low or limited gene flow between the subpopulations (C1, C2 and C3) (Table 4). According to [49], an Nm value of less than one indicates limited gene exchange between populations.
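The link between FST and Nm can be made concrete with a short sketch. It assumes the common island-model relation Nm = (1/FST − 1)/4. The source does not state which formula was used, but this relation reproduces the reported pairwise values. The function name `gene_flow_nm` is hypothetical.

```python
def gene_flow_nm(fst):
    """Estimate gene flow as Nm = (1 / FST - 1) / 4 (island-model assumption)."""
    if not 0 < fst <= 1:
        raise ValueError("FST must be greater than 0 and at most 1")
    return (1.0 / fst - 1.0) / 4.0


# Pairwise FST values as reported in the text (Table 5).
pairwise_fst = {
    ("C1", "C2"): 0.685,
    ("C1", "C3"): 0.745,
    ("C2", "C3"): 0.751,
}
for pair, fst in pairwise_fst.items():
    nm = gene_flow_nm(fst)
    # Nm below one is read as limited gene exchange between populations.
    print(pair, round(nm, 3), "limited" if nm < 1 else "not limited")
```

Applied to the reported pairwise FST values, this gives Nm ≈ 0.115 (C1 vs. C2), 0.086 (C1 vs. C3) and 0.083 (C2 vs. C3), in line with the reported highest and lowest values. Applied to the rounded overall FST of 0.726, it gives about 0.094, close to the reported 0.095; the small gap is presumably due to rounding of FST.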
The PCoA, AMOVA, FST and Nm results were the key evidence for identifying the problem (very low gene flow) in the rice accessions in the region. It is possible that most of the individuals within a population are very closely related and that most farmers do not exchange seeds. The government does not distribute seeds to farmers. Therefore, each year, farmers keep part of their seed for planting the following year.

In conclusion, the molecular diversity of the rice accessions in the region was divided into indica and japonica subpopulations based on a step-by-step analysis with STRUCTURE, PowerMarker, MEGA X, and GenAlEx software. Additionally, SSR proved effective for clarifying the genetic background of rice genotypes in the region and showed that most rice accessions within each subpopulation are very closely related despite carrying different local names. The findings of the present study are a good starting point for a rice breeding program and domestication of new species of rice in the region.

The online version contains supplementary material available at https://doi.org/10.1007/s11033-021-06920-x.

Between archaeology and text: the origins of rice consumption and cultivation in the Middle East and the Mediterranean
IOP conference series: earth and environmental science
Rice consumption per capita in Iraq
Irrigation water reduction using system of rice intensification compared with conventional cultivation methods in Iraq
Sensitivity of winter crops to climate variability in the irrigated subtropics of Iraq
Review of rice: production, trade, consumption, and future demand in Korea and worldwide
Assessment of main cereal crop trade impacts on water and land security in Iraq
Genetic variability analysis of partially salt tolerant local and inbred rice (Oryza sativa L.) through molecular markers
Genetic diversity assessment of rarely cultivated traditional indica rice (Oryza sativa L) varieties
Genetic diversity analysis of rice cultivars (Oryza sativa L.) differing in salinity tolerance based on RAPD and SSR markers
Molecular marker based genetic diversity analysis in rice (Oryza sativa L.) using RAPD and SSR markers
A comparative study of genetic relationships among the AA-genome Oryza species using RAPD and SSR markers
AFLP-based analysis of genetic diversity, population structure, and relationships with agronomic traits in rice germplasm from north region of Iran and world core germplasm set
AFLP markers for the study of rice biodiversity
Assessment of genetic diversity within and among Basmati and non-Basmati rice varieties using AFLP, ISSR and SSR markers
Genetic diversity of released Malaysian rice varieties based on single nucleotide polymorphism markers
Genome-wide patterns of nucleotide polymorphism in domesticated rice
Genomic diversity and introgression in O. sativa reveal the impact of domestication and breeding on the rice genome
Genetic diversity and population structure in a European collection of rice
Comparison of SSR and SNP markers in estimation of genetic diversity and population structure of Indian rice varieties
Genetic structure and diversity of indigenous rice (Oryza sativa) varieties in the Eastern Himalayan region of Northeast India
Genetic structure and diversity in Oryza sativa L
Population structure and genetic diversity in a rice core collection (Oryza sativa L.) investigated with SSR markers
POPGENE (version 1.31). Microsoft Windows-based freeware for population genetic analysis. University of Alberta and the Centre for International Forestry Research
Inference of population structure using multilocus genotype data
STRUCTURE HARVESTER: a website and program for visualizing STRUCTURE output and implementing the Evanno method
PowerMarker: an integrated analysis environment for genetic marker analysis
MEGA X: molecular evolutionary genetics analysis across computing platforms
GENALEX 6: genetic analysis in Excel. Population genetic software for teaching and research
Analysis of population structure and genetic diversity in rice germplasm using SSR markers: an initiative towards association mapping of agronomic traits in Oryza sativa
Molecular diversity and multilocus organization of the parental lines used in the International Rice Molecular Breeding Program
Population structure, diversity and trait association analysis in rice (Oryza sativa L.) germplasm for early seedling vigor (ESV) using trait linked SSR markers
Genetic diversity and population structure in aromatic and quality rice (Oryza sativa L.) landraces from North-Eastern India
Genetic diversity and population structure using linked SSR markers for heat stress tolerance in rice
Assessment of the genetic diversity and population structure in temperate japonica rice germplasm used in breeding in Chile, with SSR markers
Population structure and genetic diversity analysis of Indian and exotic rice (Oryza sativa L.) accessions using SSR markers
Simple sequence repeat (SSR) markers for assessing genetic diversity among the parental lines of hybrid rice (Oryza sativa L.)
Genetic diversity and allelic frequency of selected Thai and exotic rice germplasm using SSR markers
Analysis of population structure and genetic diversity of Egyptian and exotic rice
Population genetic structure of Oryza sativa in East and Southeast Asia and the discovery of elite alleles for grain traits
Genetic diversity and classification of Oryza sativa with emphasis on Chinese rice germplasm
The interpretation of population structure by F-statistics with special regard to systems of mating
Evolution and the genetics of populations: a treatise in four volumes: vol 4: variability within and among natural populations
Variability assessment for root and drought tolerance traits and genetic diversity analysis of rice germplasm using SSR markers
Comparisons of sampling methods for assessing intra-and inter-accession genetic diversity in three rice species using genotyping by sequencing
Assessment of the genetic diversity and population structure of rice genotypes using SSR markers
Rare alleles as indicators of gene flow

The authors acknowledge the Kurdistan Institution for Strategic Studies and Scientific Research (KISSR) for allowing us to work in their labs during the COVID-19 pandemic in 2020, and the Directory of Agricultural Research in Sulaymaniyah for providing seed of nine rice accessions.