key: cord-0887264-7vita6bt authors: Paixão, Pedro Thiago Medeiros; Nascimento, Ana Carolina Campana; Nascimento, Moysés; Azevedo, Camila Ferreira; Oliveira, Gabriela França; da Silva, Felipe Lopes; Caixeta, Eveline Teixeira title: Factor analysis applied in genomic selection studies in the breeding of Coffea canephora date: 2022-03-14 journal: Euphytica DOI: 10.1007/s10681-022-02998-x sha: 91608c1195bafd07712fb6aabc9b07547ba2e3a2 doc_id: 887264 cord_uid: 7vita6bt Brazil stands out worldwide in the production of coffee. The observed increases in its productivity and morpho agronomic traits are the results of the improvement of several methodologies applied in obtaining improved cultivars, among which the predictive methods of genetic value stand out. These contribute significantly to the selection of higher genotypes, increasing the genetic gain per unit time. In this context, genomic-wide selection (GWS) is a tool that stands out, since it allows predicting the future phenotype of an individual based only on molecular information. Performing joint selection of traits is the interest of most breeding programs, and factor analysis (FA) has been used to assist in this end. The aim of this study was to evaluate the use of FA in the context of GWS, in genotypes of Coffea canephora. It was found that FA was efficient to elucidate the relationships between the traits and generate new variables. The factors formed can assist in the selection, as in addition to allowing joint interpretations, they present good estimates of predictive capacity, heritability and accuracy. Furthermore, high agreement was observed between the individuals selected based on the factors and those selected considering the individual traits. Additionally, it was observed agreement between the top 10% individuals selected based on the “vigor factor” and each variable individually. However, the selection based on “vigor factor” presented individuals with more suitable size from the phytotechnical point of view. Coffee is one of the most consumed drinks in the world, presenting importance both in terms of its potential health and socioeconomic effects (Porto et al. 2019) . The value of this crop has a wide scope, including producer, industry and consumer and consists of the main revenue of many developing countries (Davis et al. 2020; Huded et al. 2020 ). On the world stage, Brazil stands out as the largest producer of coffee, accounting for about 70% of world exports. Even with the combined effect of the bienniality and the coronavirus crisis, by August/2020 Brazil had already accumulated 3.257 million bags exported in the year (ICO 2020). Abstract Brazil stands out worldwide in the production of coffee. The observed increases in its productivity and morpho agronomic traits are the results of the improvement of several methodologies applied in obtaining improved cultivars, among which the predictive methods of genetic value stand out. These contribute significantly to the selection of higher genotypes, increasing the genetic gain per unit time. In this context, genomic-wide selection (GWS) is a tool that stands out, since it allows predicting the future phenotype of an individual based only on molecular information. Performing joint selection of traits is the interest of most breeding programs, and factor analysis (FA) has been used to assist in this end. The aim of this study was to evaluate the use of FA in the context of GWS, in genotypes of Coffea canephora. It was found that FA was efficient to elucidate the relationships between the traits and generate new variables. The factors formed can assist in the selection, as in Due to the importance of coffee cultivation, breeding programs aim to obtain more productive cultivars, adapted and with better drink quality (Barbosa et al. 2019; Sousa et al. 2017) . Thus, in the genetic improvement of the coffee plant, different traits have been worked, such as those associated to plant architecture, adaptation and stability in different cultivation environments and to the size and quality of fruits and grains, as well as resistance and tolerance to biotic and abiotic stress (Marie et al. 2020; Oliveira et al. 2021; Setotaw et al. 2020) . In this context, genomic selection (GS) (Meuwissen et al. 2001) , has been successfully used in genetic improvement to increase genetic gain per generation through early selection and to improve prediction accuracy . Therefore, GS has an essential role in perennials (Resende et al. 2012; Sousa et al. 2019) . The literature presents some applications of GS in coffee. For Coffea arabica, the most cultivated species of the genus, the accuracy of the GS in breeding considering 18 agronomic traits (Sousa et al. 2019 ) and the effect of biennality on the accuracy of genomic prediction for productivity (Carvalho et al. 2020) were evaluated. For the other cultivated species, C. canephora, the efficiency of GS was analyzed for eight traits of agronomic interest (Alkimim et al. 2020) and the performance of different statistical methodologies were tested within and between environments to predict coffee bean production, rust incidence and green bean yield (Ferrão et al. 2017; . However, in these studies, GS was applied to individual traits, that is, the results obtained, although promising, are valid only for a single trait. Nevertheless, since these traits are generally correlated, an approach that simultaneously incorporates information from multiple traits into the selection process may be of interest. One possible method for this approach is factor analysis (FA). FA aims to synthesize the information contained in a set of n observed variables seeking a new set of p variables (p < n) , called common latent factors (Macciotta et al. 2012) . Thus, the latent variables (common factors) can simultaneously represent a set of traits, without the loss of biological significance, so that the scores of the new variables formed can be treated as pseudo-phenotypes and used in later analyses. In an GWS context, such methodology allows the selection of individuals considering a set of traits simultaneously. Some studies demonstrate potential applications of FA in genetic improvement. In animals, FA was successfully used to obtain estimates of genetic parameters between common latent factors in cattle and dairy buffaloes (Aspilcueta-Borquis et al. 2012; Macciotta et al. 2006; . Paiva et al. (2019) , used formed latent variables, identified as pseudo-phenotypes, in the genetic evaluation of broiler chickens under Bayesian structure. Latent variables also allowed to simultaneously study a set of important characters in pigs, for later use in GS Analyses (Teixeira et al. 2016) . For plant species, these studies are still incipient. In the coffee plant, FA was applied to select genotypes of C. arabica with high simultaneous potential of variables of commercial interest (Barbosa et al. 2019) . However, despite the importance and considering the best of our knowledge, there are no reports in the literature of the use of FA for genomic prediction in the genus Coffea. In this context, we aim to: (1) investigate the relationship between several economically important traits in C. canephora through factor analysis; (2) use the latent variables identified as pseudo-phenotypes in genomic prediction; (3) to compare the individuals selected using FA and the individual trait approach. Populations composed of clones of C. canephora from the varietal groups Conilon and Robusta and hybrids from crosses between these groups were used. These genotypes make up the breeding program of the Empresa de Pesquisa Agropecuária de Minas Gerais (Epamig) in partnership with the Universidade Federal de Viçosa (UFV) and the Empresa Brasileira de Pesquisa Agropecuária-Café (Embrapa Café). The Conilon genotypes were assigned by the Instituto Capixaba de Pesquisa, Assistência Técnica e Extensão Rural (INCAPER) and the Robusta material by the Centro Agronómico Tropical de Investigación y Enseñanza (CATIE) and introduced in the breeding program. The varietal groups Conilon and Robusta analyzed consisted of 51 and 32 genotypes, respectively. In addition to these genotypes, 82 interpopulational hybrids were obtained through artificial crosses between five genotypes of the Conilon group and five of the Robusta group. The genotypes were planted during three consecutive years (2014-2016) following a incomplete block design with up to 35 replicates and single tree plots. The following morpho agronomic traits were evaluated for three years in a row (2014 and 2016): Vegetative vigor (Vig), evaluated by the overall aspect of the plant, using a scale from 1 to 10, 1 being assigned to the fully-depleted plants and 10 to the highly vigorous plants; Plant height (Phe), measured in centimeters (cm) from ground level to the last apical point of the coffee plant with aid of a measuring tape; Canopy projection diameter (Cdi), given in centimeters (cm) by means of a ruler and in the direction perpendicular to the planting line; and the Production per plant (Prod), evaluated by harvesting all the fruits present in a genotype, and measured the total volume in litres of coffee, freshly harvested. DNA samples were extracted from young and fully expanded leaves of the same 165 phenotyped coffee plants. The DNA concentration was standardized and sent to RAPiD Genomics, located in Florida, USA, for identification of molecular markers SNPs (Single nucleotide polymorphisms) ). The set of 18,111 identified SNPs markers were submitted to the quality analysis implemented in GemonicLand software . Quality control was performed by MAF (minor allele frequency-higher than or equal to 5%) and call rate (CR-higher than or equal to 90%). The density of SNPs before and after quality control was from 18,111 to 14,387 SNPs, respectively, representing a reduction of the initial set of markers by 20.56%. The phenotypes were corrected for environmental effects of years and blocks using the software Selegen REML/BLUP (Resende 2016) . The model used is equivalent to: where y is the vector of phenotypic data; u is the vector of the effects of means of years added to the general mean (assumed to be fixed); a is the vector of additive genetic effects of individuals (assumed to be random and distributed as a ∼ N(0, I 2 a ); c is the vector of effects of the specific combining ability between the parents Conilon and Robusta (assumed to be random and distributed as c ∼ N(0, I 2 c ); s is the vector of permanent effects of individuals (assumed to be random and distributed as s ∼ N(0, I 2 s ); b is the vector of the effects of the permanent environment of the blocks (assumed to be random and distributed as b ∼ N(0, I 2 b ), and e is the vector of errors (assumed to be random and distributed as e ∼ N(0, I 2 e ). All effects were considered uncorrelated. The incidence matrices of the effects are represented by upper case letters. The phenotypes were adjusted, as de los Campos et al. (2013) , considering the following expression y * = y − Xû − Qb. In order to study several traits simultaneously, the correlation of the corrected phenotypic data set was estimated, and factor analysis was applied in the resulting matrix. The factor analysis model, considering the p-dimensional random vector Y = [Y 1, Y 2 ,…, Y p ] T with mean (p × 1) and matrix of variances and covariances (p × p) is given by: in which Γ = ij is a matrix (p × m) of coefficients known by rank factor loads m ≤ p, F is a random vector (m × 1) of non-observable latent common factors and is a vector (p × 1) of random errors called specific factors (Ferreira 2018) . The adequacy of the correlation structure was evaluated using the Kaiser-Meyer-Olkin criterion (KMO) method and Bartlett test (Cerny and Kaiser 1977) . The number of factors m was determined according to (Ferreira 2011) , that is, considering a percentage of explanation of at least 70% of the proportion of the total variance of the traits. To maximize the variability of factor loads and facilitate the interpretation of the distribution of variables in the respective factors, Varimax rotation was used. In this way, the allocation of variables in each factor was made by means of factorial loads, which consist of the correlation between each variable and the respective factors. Thus, the variables that were part of the factor are those most correlated to this. Subsequently, the factorial scores used were obtained by the regression method, given by: Page 4 of 9 Vol:. (1234567890) in which ̂ pxm is the matrix of factor loads, ̂ pxm the matrix of specific factors, Y j the vector referring to the j-th sample unit and the vector of means referring to the phenotypic variables evaluated (Ferreira 2018). The factor scores calculated for each genotype were used as pseudo-phenotypes (traits) along with the 14,387 SNPs markers in a GWS analysis, in order to estimate the genomic genetic values (GEBVs) of the 165 individuals. To this end, the mixed linear model G-BLUP was used to estimate the individual additive genetic values, according to the expression: in which y is the vector of pseudo-phenotypes ( N × 1 , in which N is the number of individuals); b is the vector of fixed effects ( p × 1 , in which p is the number of fixed effects), u a is the vector of additive genetic effects of individuals ( N × 1 , random), the structures of variances given by u a ∼ N 0, G a 2 a in which 2 a is the additive variance of the character and G a (N × N ) is the matrix of genomic kinship between individuals for the additive effects given by where p i and q i are the allele frequencies of locus i, W is an incidence matrix for the allelic substitution vectors of markers; ϵ is the vector of random residual effects ( N × 1) with ∼ N 0, I 2 e in which 2 e the residual variance; X is the incidence matrix for fixed effects ( N × p ); Z is the incidence matrix for random effects ( N × N ) (Resende et al. 2014 ). In addition, for comparison purposes, analyses for each measured trait were performed. After estimating the genetic merit for each interpretable factor and for each trait individually, the genomic genetic correlation between the traits was estimated. Predictive models were also compared using predictive capacity, heritability and accuracy. Predictive capacity (PC) was estimated from the correlation between the phenotypic values observed for each individual ( y ) and their respective GEBVs (ŷ) , that is: r y,ŷ = Cov(y,ŷ) Var(y).Var (ŷ) . In order for the PC not to be overestimated, the cross-validation technique was used. This method allows to evaluate the generalization capacity of a y = Xb + Zu a + predictive model in a data set (Sousa et al. 2019) . The cross-validation method used was K-Folds, being considered in this work K equal to 5 folds. Heritability ( h 2 ), which measures the proportion of phenotypic variation in the population attributed to the genetic cause, is given by means of the ratio between genotypic variance 2 g and phenotypic vari- The accuracy depends on the h 2 obtained and the PC, being a measure that is associated with the accuracy in the selection, and is obtained by the estimator (Resende 2015) r q,q = r y,ŷ ∕ √ h 2 in which r y,ŷ = PC and h 2 is the heritability. Based on genetic merit, the best 10% individuals were selected for each approach used. In possession of these individuals, the agreement between the selected individuals was calculated, based on each trait individually and by the factors, using the Cohen's Kappa coefficient (Cohen 1960) . This index can be measured by: k = Pr (a)−Pr (e) 1−Pr (e) , in which Pr(a) − Pr(e) w represents the proportion of observations in which agreement occurred beyond what as expected randomly and 1 − Pr(e) , the proportion of observations that no agreement occurred. This measure varies from zero to 1, and the higher the index, the greater the agreement between the groups formed. Factor analysis was performed in R software (R Core Team, 2019), using the principal function through the psych package (Revelle 2019). Genomic statistical analyses were performed using Genomic-Land software . The estimated values of the correlation coefficient between the evaluated traits ranged from 0.28 to 0.84 (Fig. 1) . According to the T test, at 1% probability, all correlation values were considered significantly different from zero. The value obtained by the KMO statistic was 0.58. In addition, the Bartlett sphericity test was statistically significant ( p − value < 0.01 ), indicating that the correlation matrix does not have a diagonal structure, demonstrating the adequacy of the factor analysis. According to the criterion that the total variation should exceed 70% (Teixeira et al. 2015) , two factors with biological interpretations were formed (Tables 1 and 2 ). The first factor (Factor 1), that explained about 65% of the original variance (Table 1) , grouped traits associated with the morphology of the individual (Vig, Phe and Cdi) ( Table 2 ). This factor has been referred to here as the "vigor factor". All these traits are positively and highly correlated with the factor (Table 2) . Therefore, higher values for these traits will be associated with higher values of the "vigor factor". The second factor (Factor 2), which explained about 22% of the variation (Table 1) , was constituted only by the production trait (Table 2 ), this being named as "production factor". It was found that the variables belonging to the factors presented values for communalities greater than 0.75, indicating that more than 75% of the variability of these traits was explained by the two factors formed ( Table 2) . The heritability estimates for the evaluated traits were from low to high (Resende, 1997) , with values ranging from 0.0034 to 0.6029, respectively (Table 3) . The estimated predictive capacity for the four traits (Vig, Phe, Cdi and Prod) and for the two factors (vigor and production factors) ranged from −0.03 to 0.57 and are presented in Table 3 . For these traits, the highest accuracy values were 0.74 and 0.68, obtained for Cdi and Factor 1 respectively (Table 3) . Specifically, the value of the PC referring to the "vigor factor" (0.51) is higher when compared to the variables Phe (0.37), Vig (0.39), and is close to the PC of the Cdi trait (0.57) ( Table 3 ). In addition, the accuracy value referring to the "vigor factor" (0.68) is higher when compared to the traits Phe (0.55) and Vig (0.57), which were analyzed individually by the G-BLUP method, and lower when compared to the variable Cdi (0.74) ( Table 3) . On the other hand, the estimated predictive capacity for production and Factor 2 (production factor) are similar (Table 3) . Estimates of correlations of genomic genetic values between the four traits (Vig, Phe, Cdi and Prod) and the two formed factors (Factor 1 and Factor 2) ranged from -0.15 to 0.98 (Table 4) . Coefficients of agreement between the top 10% individuals selected based on each variable individually and based on the factors are presented in Table 4 . Estimated Cohen's Kappa values ranged from 0.34 to 0.87. Specifically, considering the first factor (Vigor), only the Apl trait had an estimated Cohen's Kappa value considered low (0.34). All other estimated Cohen's Kappa values were greater than 0.50, indicating a very good (Factor 1 x Vig = 0.74; F1 x Cdi = 0.74) and excellent (Factor 2 x Prod = 0.87) rating (Landis and Koch 1977) . The obtained data were also used to calculate the mean, median, standard deviation (SD) and coefficient of variation (CV) for the best 10% genotypes. The best genotypes for each trait were selected individually and considering the "vigor factor". In general, a reduction in phenotypic values was observed when the selection was performed on the basis of the factor (Table 5 ). In this study, the individual genetic merits of the accesses for four traits (Vig, Phe, Cdi and Prod) and for the two formed factors (vigor and production factors) were predicted by means of the GBLUP model. The joint phenotyping and large-scale genotyping data of C. canephora genotypes were used to evaluate predictive capacity and accuracy through cross-validation with 5 folds (CV). In addition, Cohen's Kappa correlation and concordance coefficients (based on the highest 10% GEBVs) were estimated among the GEBV values. The factor analysis was able to satisfactorily represent a set of highly correlated variables, since the KMO value was greater than 0.5 (Hair et al. 2006) and the Bartlett sphericity test was statistically significant (p-value < 0.01). Since FA is based on a phenotypic correlation matrix, it improves the incorporation of previous knowledge about the underlying genetic Table 4 Estimates of the correlation between the genomic genetic values for the evaluated characteristics and the identified factors (upper triangular) and Cohen's Kappa coefficients (lower triangular) Vig: vegetative vigor; Phe: plant height; Cdi: canopy projection diameter; Prod: production per plant. ** = Significant at 1% probability of error by T test; ns = not significant. E = Excellent, VG = Very Good and R = Reasonable Table 5 Mean, median, standard deviation (SD) and coefficient of variation (CV), for the top 10% genotypes selected for each characteristic individually and based on the "vigor factor" Vig: vegetative vigor; Phe: plant height; Cdi: canopy projection diameter; *Vig: vegetative vigor selected based on the "vigor factor"; *Phe: height of the selected plant based on the "vigor factor "; *Cdi: diameter of the selected canopy projection based on the "vigor factor" architecture, thus allowing the simultaneous selection of characters. With the exception of the traits production and the factor related to it, the values of heritability were moderate to high (Resende 1997) indicating the possibility of success in the selection based on the evaluated traits (Vig, Cdi, Phe) and the "vigor factor". The low heritability observed for production indicates that such a trait is controlled by more genes and, therefore, the selection is more complex (Sousa et al. 2019) . The estimation predictive ability and accuracy indicate that it is possible to use the "vigor factor" to predict the genetic merit of genotypes. PC values similar to or higher than those obtained by individual analysis, as obtained in our work, have also been observed in pigs and have been efficiently used in the improvement of this species (Teixeira et al. 2016 ). On the other hand, the predictive capacity for the trait and the production factor was low. Low values of accuracy and predictive capacity for C. canephora production were reported by Alkimin et al. (2020) , corroborating with the results obtained in the present study. In fact, these low values of accuracy and predictive capacity for production are reasonable, since this is a complex trait (Leroy et al. 2011) . Although the values were low, the accuracy and predictive capacity of the model can be used to give greater confidence in the evaluation and predicted genetic value of individuals (Resende 2015) . The traits vegetative vigor (Vig), plant height (Phe) and canopy diameter (Cdi) showed positive correlation with the first factor, "vigor factor". In this case, it is expected that selecting individuals based on the factor means getting taller plants with greater vigor and canopy diameter. This result suggests that this factor may be included as a pseudo-phenotype in C. canephora breeding programs. Ferreira et al. (2005) , stressed that the selection of genotypes based on a set of characters of importance is necessary, aiming at simultaneous gains in several traits of agronomic interest. On the other hand, considering the second factor, "production factor", since only production presented a high correlation with it, the individual analysis of this trait may be more appropriate. This result is corroborated by those obtained by Ferreira et al. (2005) and Barbosa et al. (2019) , which evaluated the possibility of using selection indices in factorial complexes, in characters of genotypes of C. canephora and C. arabica, respectively. These authors also did not group the production trait into any factor. Positive and significant correlations between GEBVs values (Table 4) were observed between vegetative vigor and canopy projection diameter (0.90), between plant height and canopy diameter (0.88) and between vegetative vigor and plant height (0.72). Since the "vigor factor" represents this set of traits, higher correlation values are observed between them. Given that this estimate captures only the genomic effect, this result also suggests that some genes that influence the expression of these phenotypes may be the same or the occurrence of pleiotropy. In order to detail this result, we can cite the study of genomic association in genotypes of C. canephora of Silva (2018) . Using density of 17,885 markers, the presence of SNP with significant associations in the genes for these three traits was observed. Cohen's Kappa agreement coefficients (based on the highest 10% GEBVs) were estimated among the GEBV values. According to Landis and Koch (1977) , Kappa coefficient values ranging from 0.61 to 0.80 indicate substantial agreement. Therefore, the agreement between the "vigor factor" and the traits Vig (0.74) and Cdi (0.74) can be considered significant. However, the agreement between this factor and Apl (0.34) was considered regular. In the context of C. canephora breeding, this result may be useful. Specifically, the objective is to identify vigorous genotypes with a larger canopy diameter, traits that are positively correlated to plant height. This behavior constitutes a problem/obstacle in the selection of this crop, since from the phytotechnical point of view, low-sized coffee cultivars are better (Ferreira et al. 2005) , as they allow the thickening of plants in the area, providing greater productivity, in addition to reducing the susceptibility of plants to cold winds, facilitating harvesting and cultural treatments (Carvalho 2008) . However, as observed by the Kappa coefficient, it is verified that the genotypes selected by means of the "vigor factor" have a lower agreement with those selected considering the plant height trait (Kappa coefficient = 0.34). In terms of phenotypic values, the selection based on the "vigor factor" allowed to select smaller individuals, with a reduction of 26.46% on average in height. For the other traits (Cdi = 2.45% and Vig = 8.62%), the percentage reduction was lower when compared to that obtained for the Phe trait. 42 Page 8 of 9 Vol:. (1234567890) This result is relevant from the phytotechnical point of view, because as discussed earlier, low-sized coffee cultivars are better for allowing the thickening of plants in the area. The results obtained in the work showed the efficiency of using factor analysis along with genomic selection to predict the individual genetic merits of characters related to the "vigor factor" in the coffee crop. The use of this strategy in the improvement of C. canephora is especially important because it is a perennial and long-cycle species and, therefore, the development of new cultivars can take decades (Resende et al. 2012; Sousa et al. 2019 ). Therefore, the use of genomic selection can speed up the process, making breeding more efficient and competitive. Additionally, the use of the vigor factor can reduce the effect of the correlation between vigor, canopy diameter and plant height, being interesting from the phytotechnical point of view. The factor analysis was efficient for the improvement of Coffea canephora, since it allowed to elucidate the structure of relations of the evaluated traits and form new variables, interpreted as "vigor factor" and "production factor". The formed factors presented a satisfactory percentage of explained variability and can be treated as new phenotypes in genomic selection studies. The traits vegetative vigor, plant height and canopy diameter in C. canephora can be improved jointly due to its high degree of association. The use of factors proved to be a promising approach associated with GWS for the breeding of C. canephora, because in addition to allowing joint interpretations, they were efficient to estimate predictive capacity, heritability and accuracy. In addition, agreement was observed between the top 10% individuals selected based on the "vigor factor" and each variable individually, with the increase in the advantage of the "vigor factor" selecting individuals of more suitable size. Selective efficiency of genome-wide selection in Coffea canephora breeding Genetic parameters of total milk yield and factors describing the shape of lactation curve in dairy buffaloes Genomi-cLand: Software for genome-wide association studies and genomic prediction Recommendation of Coffea arabica genotypes by factor analysis The effect of bienniality on genomic prediction of yield in arabica coffee Cultivares de café A study of a measure of sampling adequacy for factor-analytic correlation matrices A coeficiente of agrément for nominal scales Lost and found: Coffea stenophylla and C. affinis, the forgotten coffee crop species of West Africa Whole genome regression and prediction methods applied to plant and animal breeding A mixed model to multiple harvest-location trials applied to genomic prediction in Coffea canephora Accurate genomic prediction of Coffea canephora in multiple environments using whole-genome statistical models Seleção simultânea de Coffea canephora por meio da combinação de análise de fatores e índices de seleção Análise Fatorial Exploratória Genetic diversity and population structure analysis of coffee (Coffea canephora) germplasm collections in Indian Gene Bank employing SRAP and SCoT markers International Coffee Organization -ICO (2020) Multivariate Data Analysis Upper Saddle River Pearson Prentice Hall The measurement of observer agreement for categorical data Improving the quality of African robustas: QTLs for yield-and quality-related traits in Coffea canephora Use of multivariate analysis to extract latent variables related to level of production and lactation persistency in dairy cattle Use of multivariate factor analysis to define new indicator variables for milk composition and coagulation properties in Brown Swiss cows G × E interactions on yield and quality in Coffea arabica: new F1 hybrids outperform American cultivars Prediction of total genetic value using genome-wide dense marker maps Quantile regression applied to genome-enabled prediction of traits related to flowering time in the common bean Quantile regression in genomic selection for oligogenic traits in autogamous plants: A simulation study Genetic evaluation for latent variables derived from factor analysis in broilers Effect of asparaginase enzyme in the reduction of asparagine in green coffee R: A language and environment for statistical computing. R Foundation for Statistical Computing Software Selegen-REML/BLUP: a useful tool for plant breeding Genomic selection for growth and wood quality in Eucalyptus: capturing the missing heritability and accelerating breeding for complex traits in forest trees Inferência Bayesiana, Regressão Aleatória, Seleção Genômica, QTLGWAS High-throughput targeted genotyping of Coffea arabica and Coffea canephora using next generation sequencing Avanços da genética biométrica florestal Genética quantitativa e de populações. Ltda Procedures for Personality and Psychological Research Genome introgression of Híbrido de Timor and its potential to develop high cup quality C. arabica cultivars Estudo de associação genômica ampla (GWAS) em Coffea canephora. Dissertation, Universidade Federal de Viçosa Sousa TV, Caixeta ET, Alkimim ER et al (2017) Molecular markers useful to discriminate Coffea arabica cultivars with high genetic similarity Early selection enabled by the implementation of genomic selection in Coffea arábica breeding Determinação de fatores em características de suínos Factor analysis applied to genome prediction for high-dimensional phenotypes in pigs The authors are grateful for the financial support of the Fundação de Amparo à Pesquisa do Estado de Minas Gerais -FAPEMIG (APQ-01638-18), the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior -CAPES (Code 001) and the Consórcio Pesquisa Café. The authors have no conflict of interest to declare that are relevant to the content of this article. The authors have no financial or proprietary interests in any material discussed in this article.