key: cord-0760626-1oo7j8c5
authors: Ali Albsheer, Musab M.; Hussien, Ayman; Kwiatkowski, Dominic; Hamid, Muzamil Mahdi Abdel; Ibrahim, Muntaser E.
title: The Duffy T-33C is an insightful marker of human history and admixture
date: 2020-08-11
journal: Meta Gene
DOI: 10.1016/j.mgene.2020.100782
sha: 4f7b612fcf0520e849e7a2e317e211b9033b9f8f
doc_id: 760626
cord_uid: 1oo7j8c5

A contrasting genotype and allele frequency pattern between Africans and non-Africans at the Duffy (T-33C) locus is reported. Its near fixation in various populations suggests that it is no longer under natural selection, and that the current distribution is possibly a relic of distant extreme selection combined with genetic drift during the out-of-Africa exodus. We put this difference to use in inferring the ancestral state of ambiguous loci in different populations.

documented and attributed to the large African effective population size (Watkins et al., 2001; Elhassan et al., 2014), genetic drift (Wang et al., 2010) and possible selection episodes that might have occurred during or prior to the exodus from the continent. These differences may help shed light on hitherto hidden chapters and open questions in evolutionary history and the genetic basis of phenotypic variation (Tishkoff et al., 2002). Here we report a contrasting genotype and allele frequency pattern between Africans and non-Africans in the Duffy antigen gene. The pattern was originally observed in a sample of 812 Sudanese (Masalit and Hausa) from the MalariaGEN genotype panel. The C allele was the most frequent among Africans, reaching or approaching fixation (100%) in the majority of groups. The reverse was observed in non-Africans, with the C allele at a frequency of 1% in Europeans (Iberians) and Amerindians (Mayan). In contrast, the T allele frequency was as high as 100% in Europeans (Figure 1a).
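The allele frequencies quoted above follow from genotype counts by simple allele counting. The snippet below is a minimal sketch of that arithmetic only; the function name and the example counts are hypothetical and are not taken from the study data.

```python
def allele_frequencies(n_tt, n_tc, n_cc):
    """Return (freq_T, freq_C) at a biallelic locus from diploid genotype counts.

    Each individual carries two alleles, so a TT homozygote contributes two
    T alleles and a TC heterozygote contributes one T and one C.
    """
    total_alleles = 2 * (n_tt + n_tc + n_cc)
    if total_alleles == 0:
        raise ValueError("at least one genotyped individual is required")
    freq_t = (2 * n_tt + n_tc) / total_alleles
    return freq_t, 1.0 - freq_t


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not study data):
    # 0 TT, 2 TC and 98 CC individuals.
    freq_t, freq_c = allele_frequencies(0, 2, 98)
    print(f"T allele frequency: {freq_t:.3f}, C allele frequency: {freq_c:.3f}")
```

A population in which almost every individual is CC therefore shows a C allele frequency close to 1, which is the near-fixation pattern described for most African groups.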
The total genotype data for this locus comprise 3533 individuals from global populations, of which 2504 were from the 1000 Genomes Project (phase 3) (https://www.ncbi.nlm.nih.gov/variation/tools/1000genomes) and 217 were genotypes from published data on Sudanese (Shaygia and Manasir) (Kempinska-Podhorodecka et al., 2012). Based on these allele and genotype frequencies, we set out to examine the likelihood of an interesting scenario related to human migrations. The polymorphism within the Duffy antigen gene (rs2814778) seems to be a relic of past extreme selection that conferred an adaptive advantage on humans in a specific environment during their exodus out of Africa, ~40,000 years ago, with frequencies and impact amplified thereafter by genetic drift acting on the relatively small number of migrants. Generally speaking, diseases are among the most recognizable forces exerting selection pressure, with the malaria parasite being a notable example. The influence of malaria on the human genome is estimated to date back 10,000-15,000 years, concurrent with the advent of agriculture (Kwiatkowski et al., 2005), which places it much later than the first out-of-Africa episode. It has been widely documented that malaria was one of the factors shaping population diversity, as manifested in atypical variants of hemoglobin, such as hemoglobin S (HbS), C (HbC) or E (HbE), as well as in variability in the Duffy blood group system (Kempinska-Podhorodecka et al., 2012). DARC (the Duffy antigen receptor for chemokines) has also recently been associated with susceptibility to HIV infection and rate of progression to AIDS (He et al., 2008), and has been shown to predict white blood cell and neutrophil counts (benign ethnic neutropenia) in African Americans (Reich et al., 2009), as well as susceptibility to several other conditions capable of exerting equally significant selective pressure (Carvalho et al., 2011).
Such a contrasting pattern is unique and the utility of rs2814778 with its spectacular frequency distribution (Figure. Masalit, as manifested in this locus, had sustained no major gene flow from non-African chromosomes, or b) gene flow has taken place but introgression of the introduced allele has ceased and the allele has disappeared due to a combined effect of drift and selection. According to Y-chromosome data (Hassan et al., 2008), around 71.9% of Masalit carry the paternal haplogroup E1b1b, mainly the V32 subclade, and approximately 6.3% belong to haplogroup J1, considered a marker par excellence of Afro-Asiatic ancestry and back migration. This might suggest significant patrilineal gene flow from neighboring Afro-Asiatic-speaking populations, although it does not tally with the historical time frame of 500-1500 years for the migration of Arabic-speaking populations into the Sahel and a hypothesized subsequent admixture scenario (Bereir et al., 2007). To consider other possible scenarios of an older back-to-Africa migration in a different time frame, such as the hypothetical introduction into Africa by Neolithic farmers of Y haplogroup R1b, which is common among Hausa (40%), followed by a selection episode and loss of the accompanying T allele due to decreased fitness, we modeled such scenarios using the program PopG (Felsenstein/Kuhner lab). The analysis shows that, starting from a minor allele (T in this case) frequency of 0.01, complete fixation in the absence of drift required 300 generations (~7500 years) even with extreme selection/fitness values, and ~10,000 years with the less extreme fitness values seen in major human epidemics and outbreaks (Figure 2a, b). A starting frequency of 0.001, which is more comparable to present values, would require 1400-1600 generations (~30,000-40,000 years) to reach fixation (Figure 2c).
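The kind of drift-free calculation described above can be illustrated with the standard one-locus, two-allele recursion for diploid viability selection. The snippet below is a minimal sketch and not the PopG program itself: the function names, the fitness values and the 0.999 cut-off used to stand in for "fixation" are all assumptions chosen for illustration, and the 25-year generation time is the one implied by the conversion of 300 generations to ~7500 years.

```python
YEARS_PER_GENERATION = 25  # implied by 300 generations ~ 7500 years in the text


def next_frequency(p, w_aa, w_ab, w_bb):
    """One generation of deterministic selection on allele A (frequency p).

    w_aa, w_ab and w_bb are the relative fitnesses of the AA, AB and BB
    genotypes. No drift, mutation or migration is modeled.
    """
    q = 1.0 - p
    mean_fitness = p * p * w_aa + 2 * p * q * w_ab + q * q * w_bb
    return (p * p * w_aa + p * q * w_ab) / mean_fitness


def generations_to_threshold(p0, w_aa, w_ab, w_bb,
                             threshold=0.999, max_generations=100_000):
    """Count generations until allele A reaches `threshold`.

    Returns None if the threshold is not reached within `max_generations`.
    """
    p = p0
    for generation in range(max_generations + 1):
        if p >= threshold:
            return generation
        p = next_frequency(p, w_aa, w_ab, w_bb)
    return None


if __name__ == "__main__":
    # Hypothetical fitness values (assumption, not taken from the paper).
    fitness = dict(w_aa=1.0, w_ab=0.99, w_bb=0.98)
    for start in (0.01, 0.001):
        gens = generations_to_threshold(start, **fitness)
        print(f"start={start}: {gens} generations, "
              f"~{gens * YEARS_PER_GENERATION} years")
```

Under any fixed set of fitness values in which the tracked allele is favored, a lower starting frequency lengthens the time to near-fixation, which is the qualitative point of the comparison between the 0.01 and 0.001 scenarios.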
A back-to-Africa scenario seems even less likely given the abundance in Hausa and Masalit of very old ancestral mitochondrial and Y-chromosome haplogroups, such as haplogroups A and B, that have no or very low-frequency distribution outside Africa (Hassan et al., 2008). We have previously shown that the Y chromosome can be a reliable marker of allele introgression into a population genome and for dating such events (Bereir et al., 2007). The Duffy mutation (rs2814778) seems an insightful marker of African versus non-African ancestry and of population admixture, as exemplified by admixture patterns in African Americans (Estalote et al., 2005) and some northern and eastern Africans. It would be interesting to use biomarkers such as the Y chromosome and/or mtDNA to quantify and inform on such admixture patterns in groups of hypothesized mixed ancestry in continental sets and at the fringes of population interactions.

Manasir (from Sudan) seemingly consistent with an admixture between Africans and non-Africans in a population known to have such mixed ancestry.
Simulation of allele frequency change over time for a scenario of allele T into Africa 2.a) an initial frequency of 0.001 similar to the current frequencies and fitness of extreme selection at

Co-Introgression of Y-Chromosome Haplogroups and the Sickle Cell Gene across Africa's Sahel
Duffy Blood Group System and the Malaria Adaptation Process in Humans
The Episode of Genetic Drift Defining the Migration of Humans out of Africa Is Derived from a Large East African Population Size
The Mutation G298A → Ala100Thr on the Coding Sequence of the Duffy Antigen / Chemokine Receptor Gene in Non-Caucasian Brazilians
Y-Chromosome Variation among Sudanese: Restricted Gene Flow, Concordance with Language, Geography, and History
Analysis for Genotyping Duffy Blood Group in Inhabitants of Sudan, the Fourth Cataract of the Nile
How Malaria Has Affected the Human Genome and What Human Genetics Can Teach Us about Malaria
Neutrophil Count in People of African Descent Is Due To a Regulatory Variant in the Duffy Antigen Receptor for Chemokines Gene
Genetic Analysis of African Populations: Human Evolution and Complex Disease
Correction for Chiaroni et Al., Y Chromosome Diversity, Human Expansion, Drift, and Cultural Evolution
Patterns of Ancestral Human Diversity: An Analysis of Alu-Insertion and Restriction-Site Polymorphisms