key: cord-0977168-o0w8fj27 authors: Sui, Juanjuan; Qu, Changqing; Yang, Jingxia; Zhang, Wenna; Ji, Yuntao title: Transcriptome changes in the phenylpropanoid pathway in senescing leaves of Toona sinensis date: 2019-06-17 journal: Acta Physiol Plant DOI: 10.1007/s11738-019-2915-9 sha: 021fb425e5590a5735e7b0082fc1d7be8dab9ea5 doc_id: 977168 cord_uid: o0w8fj27

Toona sinensis is a deciduous tree native to eastern and southeastern Asia that has important culinary and cultural value. To expand current knowledge of the transcriptome and functional genomics of this species, a de novo transcriptome sequence analysis of young and mature leaf tissues of T. sinensis was performed using the Illumina platform. Over 8.1 Gb of data were generated, assembled into 64,541 unigenes, and annotated with known biological functions. Proteins involved in primary metabolite biosynthesis were identified based on similarity to known proteins, including some related to the biosynthesis of carbohydrates, amino acids, lipids, and energy. Analysis of unigenes differentially expressed between young and mature leaves (transcriptomic libraries ‘YL’ and ‘ML’, respectively) showed that the KEGG pathways of phenylpropanoid, naringenin, lignin, cutin, suberin, and wax biosynthesis were significantly enriched in mature leaves. These results not only expand knowledge of the transcriptome characteristics of this valuable species, but also provide a useful transcriptomic dataset to accelerate research on its metabolic mechanisms and functional genomics. This study can also further the understanding of the unique aromatic metabolism and Chinese medicinal properties of T. sinensis.

ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1007/s11738-019-2915-9) contains supplementary material, which is available to authorized users.

Chinese mahogany (Taihe Toona sinensis Roem, syn.
Cedrela sinensis, family Meliaceae) is a perennial woody tree that grows to 25 m in height and is used as a source of food, timber, and medicine, particularly in Anhui Province, China. The edible buds and young leaves are regarded as nutritious food and are commonly used to make the condiment Toona Paste, which has a floral and onion-like flavor (Park et al. 1996; Edmonds and Staniforth 1998). The unique flavor results from various natural compounds, including triterpenes, phenolics, flavonoids, and the amino acid lysine (Mu et al. 2007; Zhou et al. 2011; Kakumu et al. 2014; Zhang et al. 2015). The mature, fibrous leaves of T. sinensis are used in Chinese traditional medicines to treat conditions ranging from diarrhea and other intestinal complaints to reproductive concerns and cancer. Recently, other biological properties of T. sinensis leaf extracts have been reported, including anti-inflammatory, analgesic, antioxidant, anti-diabetic, anti-neoplastic, and anti-atherosclerotic activities, inhibition of boil growth, and inhibition of replication of the severe acute respiratory syndrome (SARS) coronavirus and of the pandemic influenza A (H1N1) virus (Hsu et al. 2003; Chia et al. 2010; Huang et al. 2012; Yang et al. 2013, 2014; You et al. 2013). The nutritional value and potential health benefits of T. sinensis require further investigation. Currently, only very limited information is available about the compounds contributing to the flavor of young leaves and the medicinal content of mature leaves of T. sinensis. Only a few reports have addressed the effects of flavonoids on the taste of young leaves. Flavonoids, lysine, and polyphenols increase the antioxidant capacity of plant cells and associated tissues, and are responsible for the antioxidant properties of T. sinensis buds and young leaves (Vinodhini and Lokeswari 2014).
Recent rapid developments in bioinformatics have allowed the transcriptome approach to emerge as a powerful method for direct sequencing. RNA-Seq, or whole transcriptome shotgun sequencing, can now be used for transcriptome studies due to its high-throughput and high-resolution capabilities (Young et al. 2010; Torre et al. 2014). RNA-Seq allows analysis of complex transcriptional regulation and variable metabolic pathways of different flavonoids, including across different groups or tissues (Shi et al. 2014). Previous transcriptome studies in other species have increased understanding of multiple aspects of the biochemistry, development, and metabolism of leaves and shoots, and have given new insights into the biosynthesis of metabolic compounds (Long et al. 2014; Wang et al. 2015). In this study, we sequenced the transcriptomes of young and mature leaves of T. sinensis. RNA sequencing data were de novo assembled and annotated, and candidate gene expression changes were characterized. For the first time, molecular regulation of the phenylpropanoid and naringenin biosynthesis pathways was characterized in this species. The transcriptome differences between young and mature leaves described in the current study provide crucial resources for gene annotation, gene discovery, and gene function analysis. Moreover, our sequencing results enhance understanding of the biosynthesis of phenylpropanoid and cutin, and provide insights into the potential molecular mechanisms of pharmacological action in T. sinensis, which may help improve the production and yield of phenylpropanoid in T. sinensis for medicinal or culinary purposes.

Mature 5-year-old trees of the T. sinensis cultivar 'Heiyouchun' were sampled from a T. sinensis industry demonstration zone in Taihe County, Anhui, China. The first to third pinnate fronds with purple color were identified as young leaves (YL), and the sixth to eighth green pinnate fronds were considered mature leaves (ML) (Fig. 1a).
Young and mature leaves were harvested randomly from three T. sinensis clones, which were propagated by asexual reproduction and were thus genetically identical to the 'Heiyouchun' cultivar. At least 20 YL or ML were mixed in each sample pool for RNA-seq analysis. All samples were immediately immersed in liquid nitrogen and stored at − 80 °C. Total RNA was extracted from leaf samples with TRIzol Reagent (Cat. #15596026, Invitrogen, Carlsbad, CA, USA) and then treated with DNase I (Cat. #18047019, Invitrogen) according to established methods. To determine RNA quality and concentration, 1 µl of each RNA sample was electrophoresed (2% agarose, 1x TBE) and quantified using a NanoDrop ND-1000 (Thermo Scientific). In addition, the RNA integrity number (RIN) was determined with the Agilent 2100 BioAnalyzer (Agilent Technologies, Santa Clara, CA, USA). Library construction required at least 20 µg of total RNA meeting the following criteria: concentration ≥ 250 ng/µl, OD 260/280 = 1.8-2.2, OD 260/230 ≥ 2.0, 28S:18S ≥ 1.0, and RIN > 8.0. Qualified total RNA was then combined with oligo(dT) magnetic beads. RNA-Seq libraries were prepared using the TruSeq RNA Sample Prep Kit (RS-122-2001, Illumina Inc., San Diego, CA, USA). Buffer reagent was used to fragment the extracted mRNA, and the fragmented mRNA was reverse transcribed into cDNA; the purified short fragments were end-repaired and ligated with adaptors. The cDNA was enriched by PCR amplification and its quality was confirmed with the BioAnalyzer, after which RT-PCR was used to quantify the cDNA library. The library was sequenced on an Illumina HiSeq™ 4000 (BGI, Shenzhen, China), generating paired-end reads 150 bp in length.
Raw reads were preprocessed using the filter-fq software (v1.2.0; https://github.com/bowentan/filterfq) to discard adapters (> 5 bp), low-quality fragments (quality score Q ≤ 19), fragments with N (unknown nucleotide) content > 5%, and fragments shorter than 50 bp (including redundant sequences). These clean, high-quality data were used to calculate Q20 and Q30 values, GC content, and sequence duplication levels, and for all downstream analyses. The resulting paired-end reads were clustered using TGICL software (Pertea et al. 2003) to analyze the length and distribution of the transcriptional and unigene clusters.

Fig. 1 Constructed RNA libraries of young and mature leaves and alignment of unigenes annotated by databases. a The first to third pinnate fronds with purple color are identified as young leaves (YL), indicated by yellow stars, and the sixth to eighth green pinnate fronds are mature leaves (ML), indicated by white stars. At least 20 YL and ML were harvested randomly from three T. sinensis 'Heiyouchun' cultivars and mixed in two sample pools to construct RNA libraries. b The final clean data of YL and ML were obtained from raw data by discarding the adapters (> 5 bp), low-quality fragments (quality score Q ≤ 19), fragments with N (unknown nucleotide) content > 5%, and fragments shorter than 50 bp (including redundant sequences) with Trinity software. c Comparison of the matching sequences with recently used NCBI sequence homologies showed that 20,515 (43.06%) of the 64,541 identified unigenes were successfully annotated using BLAST searches of the public Nr, Nt, BLASTp, BLASTx databases

Paired-end sequences were separated into two files, with "left" reads in the "left.fq" file and "right" reads in the "right.fq" file. Reads that uniquely mapped to the left contigs were considered to be derived from T. sinensis. Reads matching at the genus level were treated as right reads.
Unmatched reads at this stage of the process were considered a set of singleton reads and were also placed into the right.fq file. Potential transcripts and unigenes were assembled from the pooled clean reads of the left.fq and right.fq files using Trinity software (v20140717) (Manfred 2011). Trinotate was used to perform the functional annotation of unigenes and ORFs (Bryant et al. 2017). All unigene sequences were processed for identification and functional annotation, including homology searches against known databases: NCBI's Nt (nonredundant nucleotide sequences), GO (Gene Ontology), COG (Clusters of Orthologous Groups), and KEGG (Kyoto Encyclopedia of Genes and Genomes). The aligned protein with the highest similarity was used for functional annotation of each unigene sequence. First, BLASTx and BLASTn (both with match length ≥ 90 bp, e value < 1e−5, ≤ 1 mismatch and ≤ 1 gap, and identity ≥ 90%) were used to align unigenes to protein databases and Nt, respectively. Subsequently, ESTScan software was used to determine sequence direction. Blast2GO was then employed to determine GO annotation against the GO database for unigenes annotated by NCBI Nr (nonredundant protein sequences) (Conesa et al. 2005; Götz et al. 2008). InterProScan (v5) was used to give further protein annotation. Prediction of protein-coding regions was performed with OrfPredictor software (Min et al. 2005). Additionally, GO functional classification of all unigenes was performed using the Web Gene Ontology Annotation Plot (WEGO) software (Ye et al. 2006), which visualizes and characterizes gene functions and distributions across different pathways.

Gene expression levels were estimated by mapping clean reads to the Trinity transcript assembly using RNA-Seq by Expectation-Maximization (RSEM) (Li and Dewey 2011) for each sample.
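Read-count estimates of this kind are usually rescaled by sequencing depth and unigene length before samples are compared. A minimal sketch of the FPKM calculation follows, with C the number of reads uniquely aligned to a unigene, N the total number of reads uniquely aligned to all unigenes, and L the unigene length in bp. The function and argument names are illustrative assumptions and are not part of RSEM or of any package used in this study.

```python
def fpkm(reads_on_unigene, total_mapped_reads, unigene_length_bp):
    """Return FPKM = 10^6 * C / (N * L / 10^3).

    reads_on_unigene   -- C, reads uniquely aligned to the unigene
    total_mapped_reads -- N, reads uniquely aligned to all unigenes
    unigene_length_bp  -- L, unigene length in base pairs
    """
    if total_mapped_reads <= 0 or unigene_length_bp <= 0:
        raise ValueError("N and L must be positive")
    # Dividing L by 10^3 expresses length in kb; multiplying by 10^6
    # expresses sequencing depth per million mapped reads.
    return 1e6 * reads_on_unigene / (total_mapped_reads * unigene_length_bp / 1e3)


# Hypothetical numbers, chosen only to show the scaling behaviour.
short_unigene = fpkm(500, 10_000_000, 1000)
long_unigene = fpkm(500, 10_000_000, 2000)
```

With the same read count and library size, doubling the unigene length halves the FPKM value, which is the length correction the normalization is meant to provide.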
The abundance of each gene was normalized and calculated from the unigene expression via the FPKM method:

$$\mathrm{FPKM} = \frac{10^{6}\, C}{N L / 10^{3}},$$

where C and N represent the count of mapped reads uniquely aligned to a unigene and the sum of sequenced reads uniquely aligned to all unigenes, respectively, and L represents the length of a unigene in base pairs. The DEGSeq package in R was used to conduct differential expression analysis for the young and mature leaves by modeling count data with negative binomial distributions (Anders and Huber 2010). P values were adjusted to reduce false positives due to multiple testing (Storey and Tibshirani 2003), with a q value < 0.05 and |log2(ratio)| ≥ 1 set as the thresholds for significantly differential expression between the two samples. The identified differentially expressed genes (DEGs) were analyzed according to KEGG enrichment pathways and GO functional categories. GO enrichment analyses were conducted in GOseq, with the Wallenius' noncentral hypergeometric distribution used to search for and map all significantly enriched GO terms among the DEGs (Young et al. 2010). KEGG online tools were used for pathway enrichment analysis of the DEGs (http://www.kegg.jp/) (Mao et al. 2005).

To minimize genetic differences among individuals, at least 20 young leaves (YL) or mature leaves (ML) from three individual T. sinensis 'Heiyouchun' cultivars were mixed in each sample pool for RNA extraction. To construct high-quality YL and ML RNA libraries, total RNA quality was determined by agarose gel electrophoresis, NanoDrop, and RNA integrity number (RIN) value, as shown in Supplemental Fig. 1. The quantity of YL RNA was 38 µg (concentration = 411 ng/µl, OD 260/280 = 2.1, OD 260/230 = 2.0, 28S/18S = 1.8, RIN = 10), and the quantity of ML RNA was 24 µg (concentration = 532 ng/µl, OD 260/280 = 2.1, OD 260/230 = 1.3, 28S/18S = 1.7, RIN = 8.9). The quality was satisfactory for use in constructing libraries. To generate a complete T. sinensis leaf transcriptome, two cDNA libraries from YL and ML were constructed and sequenced using the Illumina HiSeq™ 4000 platform, generating 3.82 and 3.10 Gb of raw RNA-seq data, respectively. After deletion of adaptor-polluted, redundant, and other low-quality sequences, 3.32 and 2.80 Gb of clean reads from YL and ML, respectively, were retained and assembled. For these clean reads, the Q30 scores (sequencing error rate, 0.1%) were 97.32% and 95.44%, and the GC contents were 40.06% and 40.26%, for the 'YL' and 'ML' transcriptome libraries, respectively (Fig. 1b, Table 1). After filtration, the Trinity tool was used to assemble the high-quality clean sequences from each library independently, and the assemblies were then merged, generating 102,881 transcripts and 64,541 unigenes. The transcripts and unigenes totaled 107,527,675 bp and 53,892,623 bp, with GC contents of 40.16% and 39.94%, respectively. The N50 and N90 values were 1758 and 417 bp for transcripts and 1563 and 313 bp for unigenes. The mean lengths of transcripts and unigenes were 1045 bp and 835 bp, respectively (Table 2). An overview of the sequence size distribution of transcripts and unigenes is shown in Supplemental Table 1. The quality and quantity of raw sequence data were sufficient to perform further analysis. Of the unigenes, 63.16% (40,767) were between 200 and 600 bp in length, 12.11% (7814) were between 600 and 1000 bp, 14.19% (9160) were between 1 and 2 kb, 8.90% (5743) were between 2 and 4 kb, and unigenes longer than 4 kb accounted for only 1.64% (1057) (Table 3).

To obtain functional annotations, we aligned all generated unigenes against the NCBI databases and sequence homologies by serial BLASTx searches with a cut-off e value of 1e−5. In total, 75,779,461 raw reads (27.80% of the total reads) were annotated.
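The N50 and N90 statistics reported for the assembly can be read as length-weighted quantiles: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least 50% of the assembled bases. The sketch below follows that common definition, which is assumed here because the text does not state the exact convention used by the assembler; the function name and the toy lengths are hypothetical.

```python
def nxx(lengths, fraction=0.5):
    """Return the Nxx statistic (N50 for fraction=0.5, N90 for 0.9).

    Sequences are sorted from longest to shortest and accumulated until
    the running total reaches `fraction` of all assembled bases; the
    length of the sequence that crosses the threshold is returned.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    raise ValueError("lengths must contain at least one sequence")


# Toy assembly with nine contigs (hypothetical lengths in bp).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = nxx(toy_lengths, 0.5)
toy_n90 = nxx(toy_lengths, 0.9)
```

Because the statistic is weighted by bases rather than by sequence count, N50 can be much larger than the mean length when most sequences are short, as is the case for the unigene set described here.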
Of these unigenes, 1746 were annotated with the Nr database and 726 with Nt (Fig. 1c); 33,791 with UniProt (including Swiss-Prot, TrEMBL, and PIR-PSD); 20,515 with GO (Supplemental Table 2); 8696 with COG; 5482 with KEGG; and 23,970 with PFAM. The 20,515 unigenes annotated with GO were assigned to categories including molecular functions, cellular components, and biological processes (Fig. 2). The two most abundant groups of unigene sequences within biological processes belonged to cellular processes (4644, 22.63%) and metabolic processes (4375, 21.33%). Unigenes assigned to cellular components were distributed in cell and cell parts (6044 unigenes, 33.83%), organelles (2052, 11.48%), and plasma membrane (3377, 18.90%). Unigenes assigned to molecular functions played roles in binding (4744, 42.59%) and catalytic activity (4107, 36.87%), whereas 20.53% represented activity proteins, including transporters, structural molecules, molecular transducers, enzyme regulators, receptors, antioxidants, electron carriers, and transcription factors. COG analysis aligned 8696 unigenes for functional classification (Fig. 3). A general function was predicted for 14.26% (1240 unigenes); translation, ribosomal structure, and biogenesis accounted for 9.50% (826); posttranslational modification for 8.41% (732); carbohydrate transport and metabolism for 6.41% (556); amino acid transport and metabolism for 5.76% (501); replication for 4.37% (380); and transcription for 2.94% (256). Assembled unigenes were assigned to metabolic pathways in the KEGG database based on sequence similarity (Fig. 4).
Of the 5482 unique mapped sequences, 14.61% (801) were assigned to amino acid metabolism pathways and 8.31% (456) to ribosome metabolism and translation pathways; 2.77% (152) were involved in the immune system; 2.04% (112) were classified under biosynthesis of secondary metabolites; 1.17% (64) under metabolism of terpenoids and polyketides; 0.97% (53) were assigned to phenylpropanoid biosynthesis; and 0.53% (14) to flavonoid biosynthesis.

To identify genes with different expression levels between YL and ML, the unigene expression levels were calculated with the RPKM method, which accounts for the effects of both sequencing depth and gene length on the read count (Fig. 5a). A total of 15,172 unigenes were differentially expressed (q value < 0.05 and |log2(ratio)| ≥ 1) between the two samples and were thus identified as differentially expressed genes (DEGs). Among these DEGs, 9648 were up-regulated and 5524 were down-regulated in ML compared with YL (Fig. 5b). DEGs mapped within each GO term category were counted. The hypergeometric test revealed that a total of 67 functional groups, spanning molecular functions, cellular components, and biological processes, showed remarkable enrichment in DEGs compared with the transcriptomic background (Fig. 6).

Several enriched pathways, including amino acid biosynthesis, signal transduction, and metabolic pathways, were identified using KEGG enrichment analysis of DEGs. A total of 308 pathways containing DEGs are shown in Table 4 and Supplemental Table 4, with 22 metabolic pathways significantly over-represented. The most significantly enriched pathways in YL samples were primarily related to plant biological and human pathogen resistance metabolism pathways, including general ribosome (ko03010), cell cycle (ko04110), RNA transport (ko03013), ribosome biogenesis in eukaryotes (ko03008), DNA replication (ko03030), HTLV-I infection (ko05166), Fanconi anemia pathway (cellular response to DNA interstrand crosslink) (ko03460), and systemic lupus erythematosus (ko05332). The most enriched pathways in ML samples were related to secondary metabolism, including ribosome (ko03010); phenylpropanoid biosynthesis (ko00940); pathogenic E. coli infection (ko05130); axon guidance (ko04360); hypertrophic cardiomyopathy (HCM) (ko05410); cutin, suberin, and wax biosynthesis (ko00037); dilated cardiomyopathy (ko05414); photosynthesis antenna protein (ko00196); malaria (similar to plant galactolipid metabolism pathway) (ko05144); flavonoid biosynthesis (ko00941); nitrogen metabolism (ko00910); carotenoid biosynthesis (ko00906); limonene and pinene degradation; and stilbenoid, diarylheptanoid and gingerol biosynthesis (ko00906). Excluding ribosome, phenylpropanoid biosynthesis was the most significantly enriched of the metabolic pathways related to the specific medicinal traits of T. sinensis.

Fig. 3 Functional classification with the COG database for assigned unigenes. A total of 8696 unigenes were aligned to data in the COG database for functional classification

Many unigenes related to phenylpropanoid biosynthesis were identified in the ML transcriptome (Fig. 7). Transcriptome analysis revealed that 53 enzyme genes related to phenylpropanoid biosynthesis (Table 5) [...] Fig. 1). These results indicated that caffeoyl-CoA, flavonoids, and lignin were each metabolized in an enzyme-dependent manner and accumulated in ML extracts. In addition, almost all major enzyme genes involved in cutin, suberin, and wax biosynthesis were annotated in this pathway (Supplemental Fig. 2; Supplemental Table 3). Despite this increased information, the complexity of the molecular mechanism for the biosynthesis of cutin, suberin, and wax in mature leaves of T. sinensis remains uncertain and requires further study.

Transcriptome sequencing can be used to efficiently and effectively analyze the cellular transcriptome.
Many computational software packages and pipelines are widely used for RNA-seq data analysis, including edgeR (Robinson et al. 2010), DESeq (Anders and Huber 2010), DEGSeq (Wang et al. 2010), and limma (Smyth 2004). edgeR is normally used to determine differential expression with empirical Bayes estimation and exact tests based on a negative binomial model. edgeR can be used with small numbers of replicates and over-dispersed data to assess differential gene expression. TMM normalization and the Benjamini-Hochberg procedure are used by default to control for sequencing depth and FDR, respectively (Robinson et al. 2010). Similar to edgeR, DESeq also uses a negative binomial model, a scaling factor normalization procedure, and the Benjamini-Hochberg procedure to control for sequencing depth and FDR across samples, but exhibits more general dispersion estimation and balanced selection of DEGs. DESeq can technically be used for experiments without any biological replicates, but this is not recommended (Anders and Huber 2010). Limma was originally used for microarray data analysis but was later extended to RNA-seq data. TMM normalization from the edgeR package and 'voom' conversion to the log2 scale are used to determine weights prior to linear modeling. The Benjamini-Hochberg procedure is used by default to estimate FDR (Smyth 2004). DEGseq exports gene expression values in a table format, which are then directly processed by edgeR. It analyzes gene expression based on a random sampling model or on raw counts in a Poisson distribution model. DEGseq can also be applied to identify differential expression of exons or pieces of transcripts, with or without a small number of replicates.

Fig. 5 Comparison of expression levels between mature leaves and young leaves. The unigenes' expression levels based on RPKM (a) and fold change (b). q value < 0.05 and |log2(ratio)| ≥ 1

Fig. 6 GO annotation categories with differentially expressed unigenes. All DEGs were mapped to each GO database term and counted within the corresponding GO term categories. DEGs were defined with a cutoff of |log2(ratio)| ≥ 1 and q value < 0.05

Table 4 Significant KEGG enrichment analysis of young leaf and mature leaf DEGs of T. sinensis

In our study, to obtain higher sequencing depth and detect subtle gene expression changes, we directly pooled 20 individual biological replicates into the YL and ML sample groups. Because of the lack of replicates, DEGseq was more suitable than the other programs for differential gene expression analysis. When the DEGseq package analyzes a single replicate, it first homogenizes the samples according to its internal algorithm (this homogenization avoids bias to some extent) and then analyzes differences based on the homogenized data rather than on the original input data. To identify reliable DEGs, the software calculates the corresponding p value and corrected q value. In addition, DESeq detects DEGs based on gene expression levels using statistical methods built on the negative binomial distribution. The obtained p values are corrected to control false-positive results according to the Benjamini and Hochberg method. Genes with a corrected q value < 0.05 and |log2(ratio)| ≥ 1 were defined as DEGs.

In this study, using transcriptome sequencing analysis, we obtained 64,541 unigenes with an N50 value of 1563 bp and a mean length of 835 bp and used these for assembly evaluation by comparison with NCBI and sequence homologies. In total, 20,515 (43.06%) of these unigenes were successfully annotated using BLAST searches of the public Nr, PFAM, Swiss-Prot, GO, COG, and KEGG databases. The resulting RNA-Seq data provided a high-quality annotated assembly for T. sinensis generated by comprehensive analysis.
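The two DEG thresholds used throughout this study (corrected q value < 0.05 and |log2(ratio)| ≥ 1) can be sketched as a small filter. The example below applies a plain Benjamini-Hochberg adjustment to a list of p values and then tests the ML/YL expression ratio; it is a generic illustration and does not reproduce the internal calculations of DEGseq or DESeq. The function names, the tuple layout, and the `floor` value used to avoid division by zero are all assumptions made for this sketch.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values are
    # monotone: q_(k) = min over j >= k of p_(j) * m / j.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_degs(genes, q_cutoff=0.05, lfc_cutoff=1.0, floor=0.01):
    """Select DEGs from (gene_id, expr_yl, expr_ml, pvalue) tuples.

    `floor` (hypothetical) replaces expression values below it so the
    ML/YL ratio is always defined.
    """
    qvalues = benjamini_hochberg([g[3] for g in genes])
    calls = []
    for (gene_id, expr_yl, expr_ml, _), q in zip(genes, qvalues):
        lfc = math.log2(max(expr_ml, floor) / max(expr_yl, floor))
        if q < q_cutoff and abs(lfc) >= lfc_cutoff:
            direction = "up" if lfc > 0 else "down"  # in ML relative to YL
            calls.append((gene_id, direction, lfc, q))
    return calls


# Hypothetical expression values and p values for four unigenes.
example = [
    ("g1", 10.0, 40.0, 0.001),  # 4-fold higher in ML
    ("g2", 20.0, 5.0, 0.002),   # 4-fold lower in ML
    ("g3", 10.0, 15.0, 0.001),  # significant but below the fold threshold
    ("g4", 10.0, 30.0, 0.5),    # large ratio but not significant
]
example_calls = call_degs(example)
```

In this toy input only g1 and g2 pass both criteria, which shows why a gene must satisfy the significance and the fold-change thresholds together to be counted as a DEG.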
Distribution patterns annotated similarly across several databases indicated that YL and ML of T. sinensis undergo multiple unique developmental processes (Fig. 1; Supplemental Table 2). The large number of annotated enzymes suggests the presence of genes associated with different pathways of primary and secondary metabolite biosynthesis across life stages (Zhang et al. 2016; Zhao et al. 2017). Our findings demonstrate that the phenylpropanoid and lignin biosynthesis pathways were among the most enriched. Nine differentially expressed unigenes, including 4CL, CoumCoA3H, HCT, CYP98A, CCR, REF1, CAD, and POD, were up-regulated in ML compared with YL. ML were significantly enriched in phenylpropanoids, consistent with increased content of flavonoid, lignin, cutin, and wax. In plants, control of phenylpropanoid biosynthesis is complex and plays a significant role in pathogen resistance, anthocyanin biogenesis, and pharmacology (Jimenez and Riguera 1994). In this transcriptome study, we identified most of the catabolic genes associated with phenylpropanoid synthesis, demonstrating an understanding of the precise pathway in plants (Shi et al. 2013).

Fig. 7 Schematic diagram of the phenylpropanoid biosynthesis pathway. Differentially expressed genes involved in the phenylpropanoid biosynthesis pathway in response to leaf senescence in T. sinensis. The red-colored names of enzymes indicate the response pattern (up-regulated) of the unigenes that encoded the corresponding enzyme in mature leaf. Numbers of putative unigenes encoding enzymes are given for T. sinensis in parentheses

Genetic, molecular, and biochemical evidence suggests that synthesis and catabolism of phenylpropanoid amino acids are regulated by previously undescribed coordinated mechanisms (Burkhard et al. 2001; Grabherr et al. 2011). Information from the current study will advance understanding of the regulation of phenylpropanoid metabolism in T.
sinensis, which will provide valuable information for the future production of high-phenylpropanoid crops with medical applications.

Differential expression analysis for sequence count data
A tissue-mapped axolotl de novo transcriptome enables identification of limb regeneration factors
Structural insight into Parkinson's disease treatment from drug inhibited DOPA decarboxylase
Anti-Neoplastic effects of gallic acid, a major component of Toona sinensis leaf extract, on oral squamous carcinoma cells
Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research
Plate 348. Toona sinensis. Curtis's Bot Mag
High-throughput functional annotation and data mining with the Blast2GO suite
Full-length transcriptome assembly from RNA-Seq data without a reference genome
Effects of Toona sinensis leaf extract on lipolysis in differentiated 3T3-L1 adipocytes
In vitro and in vivo activity of gallic acid and Toona sinensis leaf extracts against HL-60 human premyelocytic leukemia
Phenylethanoid glycosides in plants: structure and biological activity
Phytochemical analysis and antileukemic activity of polyphenolic constituents of Toona sinensis
RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome
De novo assembly of the desert tree Haloxylon ammodendron (C. A. Mey.) based on RNA-Seq data provides insight into drought response, gene discovery and marker identification
Full-length transcriptome assembly from RNA-Seq data without a reference genome
Automated genome annotation and pathway identification using the KEGG Orthology (KO) as a controlled vocabulary
OrfPredictor: predicting protein-coding regions in EST-derived sequences
Mapping and quantifying mammalian transcriptomes by RNA-Seq
Rapid determination of volatile compounds in Toona sinensis (A. Juss.) Roem. by MAE-HS-SPME followed by GC-MS
Phenolic compounds from the rachis of Cedrela sinensis
TIGR Gene Indices clustering tools (TGICL): a software system for fast clustering of large EST datasets
edgeR: a Bioconductor package for differential expression analysis of digital gene expression data
Transcriptomic analysis of a tertiary relict plant, extreme xerophyte Reaumuria soongorica to identify genes related to drought adaptation
Genomewide transcriptome analysis of genes involved in flavonoid biosynthesis between red and white strains of Magnolia sprengeri pamp
Linear models and empirical Bayes methods for assessing differential expression in microarray experiments
Statistical significance for genomewide studies
RNA-seq analysis of Quercus pubescens leaves: de novo transcriptome assembly, annotation and functional markers development
Antioxidant activity of the isolated compounds, methanolic and hexane extracts of Toona ciliata leaves
Phenolic antioxidants from Chinese toon (fresh young leaves and shoots of Toona sinensis)
DEGseq: an R package for identifying differentially expressed genes from RNA-seq data
De novo transcriptome sequencing in Pueraria lobata to identify putative genes involved in isoflavones biosynthesis
Antiproliferative activity and apoptosis-inducing mechanism of constituents from Toona sinensis on human cancer cells
Toona sinensis inhibits LPS-induced inflammation and migration in vascular smooth muscle cells via suppression of reactive oxygen species and NF-κB signaling pathway
WEGO: a web tool for plotting GO annotations
The effectiveness and mechanism of Toona sinensis extract inhibit attachment of pandemic influenza A (H1N1) virus
Gene ontology analysis for RNA-seq: accounting for selection bias
De novo transcriptome characterization of Lilium 'Sorbonne' and key enzymes related to the flavonoid biosynthesis
De novo assembly and comparative transcriptome analysis provide insight into lysine biosynthesis in Toona sinensis Roem
Identification of putative flavonoid-biosynthetic genes through transcriptome analysis of Taihe Toona sinensis bud
Nutrition quality analysis of different Toona sinensis cultivars in Dazhu county

The authors have declared that no competing interests exist.

Data availability All relevant supporting data can be found within the additional files accompanying this article. RNA-Seq raw data reads have been deposited in NCBI SRA under the accession number PRJNA516485.

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.