key: cord-0300104-j92b66jc
authors: Playfoot, Christopher J.; Duc, Julien; Sheppard, Shaoline; Dind, Sagane; Coudray, Alexandre; Planet, Evarist; Trono, Didier
title: Transposable elements and their KZFP controllers are drivers of transcriptional innovation in the developing human brain
date: 2020-12-14
journal: bioRxiv
DOI: 10.1101/2020.12.14.422620
sha: 327e1e8ffea3f75a9accc5ce5239df6773e93fa7
doc_id: 300104
cord_uid: j92b66jc

Transposable elements (TEs) constitute 50% of the human genome and many have been co-opted throughout human evolution due to gain of advantageous regulatory functions controlling gene expression networks. Several lines of evidence suggest these networks can be fine-tuned by the largest family of TE controllers, the KRAB-containing zinc finger proteins (KZFPs). One tissue permissive for TE transcriptional activation (termed ‘transposcription’) is the adult human brain, however comprehensive studies on the extent of this process and its potential contribution to human brain development are lacking. In order to elucidate the spatiotemporal transposcriptome of the developing human brain, we have analysed two independent RNA-seq datasets encompassing 16 distinct brain regions from eight weeks post-conception into adulthood. We reveal an anti-correlated, KZFP:TE transcriptional profile defining the late prenatal to early postnatal transition, and the spatiotemporal and cell type specific activation of TE-derived alternative promoters driving the expression of neurogenesis-associated genes. We also demonstrate experimentally that a co-opted antisense L2 element drives temporal protein re-localisation away from the endoplasmic reticulum, suggestive of novel TE dependent protein function in primate evolution. This work highlights the widespread dynamic nature of the spatiotemporal KZFP:TE transcriptome and its potential importance throughout neurotypical human brain development.
KZFPs constitute the largest family of transcription factors encoded by mammalian genomes. These proteins harbor an N-terminal Krüppel-associated box (KRAB) domain and a C-terminal zinc finger array, which, for many, mediates sequence-specific DNA recognition. The KRAB domain of a majority of KZFPs recruits the transcriptional co-repressor KAP1 (KRAB-associated protein 1, also known as Tripartite motif protein 28, TRIM28), which acts as a scaffold for heterochromatin inducers such as the histone methyl-transferase SETDB1, the histone deacetylating NuRD complex, heterochromatin protein 1 (HP1) and DNA methyltransferases (Ecco et al. 2017). Many KZFPs bind to and repress TEs, a finding that led to the 'arms race' hypothesis, which states that waves of genomic invasion by TEs throughout evolution drove the selection of KZFP genes after they first emerged in the last common ancestor of tetrapods, lung fish and coelacanth some 420 million years ago (Jacobs et

Intriguingly, KZFPs are collectively more highly expressed in the human brain than in other adult tissues, suggesting a prominent impact for these epigenetic regulators and their TEeRS targets in the function of this organ (Nowick et al. 2009; Imbeault et al. 2017; Farmiloe et al. 2020; Turelli et al. 2020). In line with this hypothesis, we recently described how ZNF417 and ZNF587, two primate specific KZFPs repressing HERVK (human endogenous retrovirus K) and SVA (SINE-VNTR-Alu) integrants in human embryonic stem cells (hESC), are expressed in specific regions of the human developing and adult brain (Turelli et al. 2020). Through the control of TEeRS, these KZFPs influence the differentiation and neurotransmission profile of neurons and prevent the induction of neurotoxic retroviral proteins and an interferon-like response (Turelli et al. 2020).
Furthermore, expression of LINE1, another class of TEs, has been noted in human neural progenitor cells (hNPCs) and in the adult human brain, occasionally leading to de novo retrotransposition events (Muotri et

Table 1 & 2). We first examined KZFP gene expression in these two brain regions, which are representative of the forebrain and the hindbrain, respectively (Supplemental Fig. S1B). The large majority of KZFPs expressed in the DFC exhibited higher levels at early prenatal stages, dropped shortly before birth and remained low thereafter (Fig. 1A). When comparing early prenatal (2A-3B; 8-18 post-conception weeks) and adult (11; age 20-60+ years) stages, about half (169/333) of KZFPs were more highly expressed in the former and only 1.5% (5/333) in the latter, the rest being stable (Fig. 1B). This temporal pattern was less striking in the cerebellum (Supplemental Fig. S2A

expressed KZFPs arose at particular times in evolution, we determined their ages. We found KZFPs either significantly downregulated or upregulated from early prenatal to adult stages to be significantly younger than those displaying no differences between these developmental periods (Fig. 1C, Wilcoxon test p ≤ 0.01). This delineates two subsets amongst KZFPs participating in brain development, one evolutionarily recent and more transcriptionally dynamic, the other more conserved and transcriptionally static.

Table 3). This was true in all brain regions, although its transcripts persisted longer in the cerebellum compared to other areas (Fig. 1G

We next examined KAP1, which encodes a protein that serves as corepressor for many KZFPs (Ecco et al. 2017). Its expression levels were globally higher than those of any KZFP, albeit also with a drop from prenatal to postnatal stages except in the cerebellum (Fig. 1G; Supplemental Table 1 & 2).
We also probed DNMT1, which encodes the maintenance DNA methyltransferase important for TE repression in neural progenitor cells and other somatic tissues beyond the early embryonic period (Jönsson et al. 2019). Although displaying overall patterns comparable to those seen for KZFPs and KAP1, DNMT1 expression progressively increased in the cerebellum to reach its highest level in the adult (Fig. 1G; Supplemental Table 1 & 2). In sum, KZFPs and their main epigenetic cofactors exhibit a largely homogeneous, dynamic spatiotemporal reduction in expression during human brain development.

Having determined that the expression of most KZFPs drops at late stages of prenatal brain development, we examined the behaviour of their TE targets. Young TEs are highly repetitive, which complicates the mapping of TE-derived RNA-seq reads to unique genomic loci, thus biasing against the scoring of their expression. We therefore first analysed RNA-seq reads mapping to multiple TE loci within the same subfamily, regardless of positional information. In the DFC, discrete subfamilies, predominantly from the LTR class and to a lesser extent the SINE class, exhibited temporally distinct dynamics, concordant between datasets (Pearson correlation coefficient ≥ 0.7) (Fig. 2A; Supplemental Table 4). The same was true for the cerebellum, but with moderately different subfamilies passing our threshold for concordance between datasets (Supplemental Fig. S3A; Supplemental Table 4). In the DFC, for example, the LTR7C and SVA-D subfamilies exhibited higher postnatal expression, whereas LTR70 and HERVK13-int behaved inversely, albeit without marked differences between brain regions

Table 6). For example, ZNF611 is a previously characterised major regulator of SVA-D in early embryogenesis, and the two exhibited strongly anti-correlated expression throughout human brain development (Fig. 2D inset).
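The concordance threshold (Pearson correlation coefficient ≥ 0.7) and the anti-correlated KZFP:TE profiles described above can be illustrated with a minimal sketch. The function names and the per-stage expression values below are hypothetical and are not data from the study; the sketch only shows how a Pearson coefficient separates concordant from anti-correlated temporal profiles.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(vx * vy)

def concordant(profile_a, profile_b, threshold=0.7):
    """True when two temporal profiles correlate at or above the threshold."""
    return pearson(profile_a, profile_b) >= threshold

# Hypothetical per-stage expression values (illustration only).
kzfp_profile = [9.0, 8.5, 7.0, 4.0, 3.5, 3.0]   # falls from prenatal to adult
te_profile = [2.0, 2.5, 4.0, 6.5, 7.0, 7.5]     # rises over the same stages

r = pearson(kzfp_profile, te_profile)            # strongly negative
```

Applied to the same subfamily in two datasets, `concordant` mirrors the between-dataset filter; applied to a KZFP and its TE target, a strongly negative `r` corresponds to the anti-correlation reported for ZNF611 and SVA-D.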
We next expanded our study by examining the expression of individual TE integrants, assigning RNA-seq reads to their genomic source loci and comparing early prenatal (stages 2A to 3B) and adult (stage 11) samples for the 16 available brain regions (Supplemental Fig. S1A & B). We found between 5,000 and 7,000 significantly differentially expressed TE loci in each region, with 4,000 loci common to both DFC and CB datasets (Supplemental Fig. S3C; Supplemental Table 7 & 8). Integrants belonging to fourteen TE subfamilies from the LTR, LINE and SINE classes were significantly more expressed in adult samples, with HERVH-int, MSTA-int and L2 elements significantly enriched in most brain regions (Fig. 2E). The cerebellum again exhibited distinct patterns, with significant enrichment of LTR12C and MIR elements instead (Fig. 2E). Conversely, integrants from 11 TE subfamilies were more expressed in the early prenatal period, largely in specific brain regions (Supplemental Fig. S3D). Together, these results highlight the spatiotemporal dynamic nature of the transposcriptome in the developing human brain.

Table 9). Amongst pre- or postnatal TcGTs, developmental trajectories differed substantially, with some detected exclusively at either stage. For example, an L2a-driven isoform of CTP synthase 2 (CTPS2), whose product catalyses CTP

Table 9). Some were present in all available tissues, but the vast majority were brain restricted (Fig. 3C).

We concentrated deeper analyses on the 68 high-confidence brain developmental TcGTs. Thirty-seven different TE subfamilies accounted for their promoters but MIRs and L2s, belonging respectively to the SINE and LINE families, contributed almost half, perhaps due in part to their high prevalence in the genome (MIR3 and L2a: 87,870 and 166,340 integrants, respectively) (Fig. 3E), and LTRs about a fifth.
A large range of evolutionary ages was represented, from the ~20 myo (million year old) LTR12C to the ~177 myo MIRs and L2s.

Of these 68 high-confidence TcGTs, 38.2% (26/68) were postnatal-specific, 51.5% (35/68) were continually detected and 10.3% (7/68) were prenatal-restricted (Fig. 4A). Furthermore, the 5' end of these TcGTs coincided with ATAC-seq peaks from neurons in 26.5% (18/68), from non-neuronal cells in 22% (15/68), and from both in 51.5% (35/68) of cases (Fig. 4A). Some TcGTs were present in all brain regions, whereas others exhibited regional specificity (Supplemental Fig. S5A).

Table 9).

To estimate the relative contribution of the TE and non-TE promoters to the expression of the 68 genes involved in high-confidence TcGTs, we compared their transcription levels in samples where the TcGT was or was not detected (Fig. 4B). In some cases, the TcGT was associated with higher levels of gene expression in a temporal manner, such as the postnatally detected L2a:KCNAB2 (top) and, most strikingly, MamGypLTR1b:ALDH1A1 (top mid), compared to their non-TE-driven counterparts (Fig. 4B). The continually detected, non-neuronal LTR33B:ZNF317 (bottom mid) was associated with high expression throughout brain development, suggestive of a constitutive TE-derived promoter. Conversely, some TcGTs were associated with higher prenatal expression, such as with L2a:CTPS2

Fig. S5B).

To verify that the TE and genic exon belonged to the same mRNA transcript, we next aimed to experimentally confirm TcGT candidates in the SH-SY5Y neuroblastoma cell line. Using qRT-PCR primers within the TE TSS and subsequent genic exon, we detected appreciable expression of TcGTs in this cell system (Supplemental Fig. S6A). However, this did not formally demonstrate that transcription was driven by the TE.
To address this point, we targeted a CRISPR-based activation system (CRISPRa) to the TSS region of TcGTs in 293T cells (Chavez et al. 2015) (Fig. 5A). We picked candidates based on the ease of gRNA design and the potential mechanistic or biological relevance of their protein product. We selected three anti-sense L2-driven, cell type-specific TcGTs predicted to encode for proteins involved in brain development: KCNAB2, DYSF and DDRGK1, the first in its canonical protein isoform and the other two as N-truncated isoforms. Activation of each of these three TcGTs could be induced with the CRISPRa system, confirming that they were indeed driven by their respective TE promoters (Fig. 5A).

TcGT-encoded protein isoforms can display differential subcellular localisation

Having noted that 22% of high-confidence TcGTs were predicted to encode N-truncated proteins (Fig. 4A), we hypothesised that this could, in some cases, result in derivatives deprived of important subcellular localization domains, such as the endoplasmic reticulum (ER)-targeting N-terminal signal peptide. We focused on L2:DDRGK1 as it was enriched postnatally, neuron-specific, not annotated in ENSEMBL and experimentally validated by our 293T-based CRISPRa experiment (Fig. 4A; Fig. 5A

product is anchored to the ER membrane by an N-terminal 27 amino acid signal peptide (Fig. 5B) and plays a role in ER homeostasis and ER-phagy (Liang et al. 2020; Liu et al. 2017). In the predicted translated product of the L2:DDRGK1 TcGT, the signal peptide is replaced by a 10 amino acid L2-encoded sequence, conserved in new-world primates, but harboring non-synonymous substitutions in old-world primates (Fig. 5B; Supplemental Fig. S7A). Of note, this L2 integrant is absent in mice (Supplemental Fig. S7A).
Furthermore, the L2:DDRGK1 TcGT is detected in the developing rhesus macaque brain with the same prenatal to postnatal expression dynamics as in humans (Supplemental Fig. S7B & C). We therefore transfected HEK293T cells with plasmids expressing HA-tagged versions of either the canonical "wild-type" (WT) DDRGK1 transcript or its TcGT counterpart and examined the subcellular localization of the resulting proteins by indirect immunofluorescence (Fig. 5C) and by cellular fractionation followed by western blotting (Fig. 5D). Confocal microscopy revealed that WT:DDRGK1-HA largely co-localized with BIP, an ER membrane marker, while L2:DDRGK1-HA displayed a diffuse cytosolic pattern (Fig. 5C). Cellular fractionation further confirmed that the WT DDRGK1 isoform was sequestered in the membrane fraction, whereas the L2:DDRGK1 counterpart was enriched in the cytosol (Fig. 5D).

As N-truncated isoforms made up the largest category of in silico predicted TcGT products besides full-length proteins, we next asked how widespread this type of TE-induced protein re-localisation might be. For this, we intersected a database of signal peptide-containing proteins with our initial list of 480 TcGT-encoded protein products (Fig. 5E; Supplemental Table 9). Of 94 TcGT products predicted to be N-truncated, 12 contained a putative signal peptide in the canonical isoform. This prediction was supported in 11 cases in silico by SignalP 5.0 (Almagro Armenteros et al. 2019), which predicted that in all of these instances the TcGT isoforms lacked this putative signal peptide (Supplemental Fig. S8). Therefore, subcellular re-targeting may be a frequent consequence of TE-driven protein innovation.

Reads were mapped to the human (hg19) or macaque (rheMac8) genome using hisat2 (Kim et al. 2015) with parameters hisat2 -k 5 --seed 42. Counts on genes and TEs were generated using featureCounts (Liao et al. 2014).
To avoid read assignment ambiguity between genes and TEs, a GTF file containing both was provided to featureCounts. For repetitive sequences, an in-house curated version of the Repbase database was used (fragmented LTR and internal segments belonging to a single integrant were merged), generated as previously described (Turelli et

differentially expressed when the fold change between groups was greater than two and the p-value was smaller than 0.05. A moderated t-test (as implemented in the limma package of R) was used to test significance. P-values were corrected for multiple testing using the Benjamini-Hochberg method (Benjamini and Hochberg 1995). Temporal expression correlation analyses of individual genes, TE integrants or subfamilies were performed between Brainspan and Cardoso datasets using the 'Pearson' method. For inter-regional correlations within the Brainspan dataset, only expressed genes or TEs common to all regions were considered. Bam files and sashimi plots were visualised using the Integrative Genomics Viewer (Katz et al. 2015; Robinson et al. 2011).

First, a per-sample transcriptome was computed from the RNA-seq bam file using StringTie (Kovaka et al. 2019) with parameters -j 1 -c 1. Each transcriptome was then intersected, using BEDTools (Quinlan and Hall 2010), with Ensembl hg19 (or rheMac8) coding exons and curated RepeatMasker annotations to extract TcGTs with one or more reads spliced between a TE and a genic exon for each sample. Second, a custom Python program was used to annotate and aggregate the sample-level TcGTs into counts per stage (defined in Supplemental Fig. S1B). In brief, for each dataset, a GTF containing all annotated TcGTs was created, and TcGTs having their first exon overlapping an annotated gene, or a TSS not overlapping a TE, were discarded. From this filtered file, TcGTs associated with the same gene and having a TSS within 100bp of each other were aggregated.
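The TSS aggregation step just described can be sketched as follows. This is not the authors' custom program: the function name, the record layout and the coordinates are hypothetical, and the sketch assumes that "within 100bp of each other" means single-linkage chaining along sorted TSS positions (each TSS joins the current group if it lies at most 100 bp from the previous one).

```python
from collections import defaultdict

def aggregate_tss(tcgts, max_gap=100):
    """Group TcGTs of the same gene whose TSSs lie within max_gap bp.

    tcgts: iterable of (gene, chrom, strand, tss) tuples.
    Returns a list of groups; each group is a list of the input tuples.
    Assumption: neighbours are chained, so a group may span more than
    max_gap overall as long as consecutive TSSs are close enough.
    """
    by_key = defaultdict(list)
    for rec in tcgts:
        gene, chrom, strand, _tss = rec
        by_key[(gene, chrom, strand)].append(rec)

    groups = []
    for key in sorted(by_key):
        recs = sorted(by_key[key], key=lambda r: r[3])
        current = [recs[0]]
        for rec in recs[1:]:
            if rec[3] - current[-1][3] <= max_gap:
                current.append(rec)       # close to the previous TSS: same aggregate
            else:
                groups.append(current)    # gap too large: start a new aggregate
                current = [rec]
        groups.append(current)
    return groups

# Hypothetical sample-level TcGT records (coordinates are made up).
records = [
    ("DDRGK1", "chr20", "-", 1000),
    ("DDRGK1", "chr20", "-", 1060),
    ("DDRGK1", "chr20", "-", 1500),
    ("KCNAB2", "chr1", "+", 200),
]
groups = aggregate_tss(records)
```

A fixed-window definition (every TSS within 100 bp of the first member of the group) would be an equally plausible reading of the text and could give different groups for long chains of nearby TSSs.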
Finally, for each aggregate, its occurrence per group was computed and a consensus transcript was generated for each TSS aggregate. For each exon of a TcGT aggregate, its percentage of occurrence across the different samples was computed, and the exon was integrated in the consensus if present in more than 30% of the samples in which the TcGT was detected. All samples available in both datasets were used regardless of mapped read count.

From the resulting master file, additional criteria were applied to determine prenatal, postnatal or continually expressed TcGTs.
1. Only TcGTs that were present in at least 20% of prenatal, postnatal or 20% of both pre- and postnatal samples (continual) were kept for each dataset.
2. To ensure TcGTs were robustly detectable in the different datasets, TcGT files were merged based on the same TSS TE and associated gene name.
3. TcGTs were required to exhibit the same temporal transcriptional behaviour in both datasets, i.e. a 2-fold change in TcGT detection pre- vs postnatal and vice versa, or a lower fold change in both datasets (continual).
This resulted in the 480 robustly detectable temporal TcGTs in Fig. 3A and Supplemental Table 9. These TcGTs were further filtered for strong promoter regions using a BEDTools intersect of the 200bp up- and downstream of the TcGT TSS with FANTOM5 CAGE-seq (Forrest et al. 2014) and BOCA neuronal and non-neuronal consensus ATAC-seq peak bed files (Fullard et al. 2018). TcGT TSS loci were also intersected with ENSEMBL (GRCh37.p13) transcriptional start sites to determine non-annotated transcripts.

Protein product prediction

DNA sequences were retrieved for each TcGT consensus and protein products were derived from the longest ORF in the three reading frames using Biopython (Cock et al. 2009).
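A longest-ORF search over the three forward reading frames can be sketched as below. The sketch is written in plain Python rather than with Biopython so that it runs without dependencies, and it is not the authors' code. It assumes that an ORF starts at an ATG and ends at the first in-frame stop codon (or at the end of the sequence), and that ties are resolved in favour of the lowest frame; the paper does not state these details.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def longest_orf(seq):
    """Return (protein, frame) for the longest ATG-initiated ORF in frames 0-2.

    Returns ("", None) when no start codon is found. Codons containing
    characters other than A/C/G/T are translated as 'X'.
    """
    seq = seq.upper().replace("U", "T")
    best = ("", None)
    for frame in range(3):
        protein = []
        in_orf = False
        for i in range(frame, len(seq) - 2, 3):
            aa = CODON_TABLE.get(seq[i:i + 3], "X")
            if not in_orf:
                if aa == "M":             # first ATG opens an ORF
                    in_orf = True
                    protein = ["M"]
            elif aa == "*":               # stop codon closes the ORF
                if len(protein) > len(best[0]):
                    best = ("".join(protein), frame)
                in_orf = False
            else:
                protein.append(aa)
        # Assumption: an ORF still open at the end of the sequence also counts.
        if in_orf and len(protein) > len(best[0]):
            best = ("".join(protein), frame)
    return best

# Toy sequence (not from the study): the only ATG is in frame 2.
protein, frame = longest_orf("AAATGGCTTGGAAATAA")
```

With Biopython installed, the per-frame translation could instead be done with `Bio.Seq.Seq(...).translate()`; the frame loop and the longest-ORF selection would remain the same.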
The resulting translation products were aligned against the protein sequence of the most similar cognate gene isoforms (exons intersect between TcGTs and each gene isoform) and classified into several categories. Proteins with no alignment for any isoform were classified as out-of-frame, therefore not clear or not aligned. In-frame peptides were further classified according to their N-terminal modifications: Normal, TcGT ORF peptides align perfectly with cognate ORF peptides; N-add, TcGT ORF peptides encode novel in-frame N-terminal amino acids followed by the full-length cognate protein sequence; N-truncated, TcGT ORF peptides lack parts of the cognate N-terminal protein sequence and might contain novel in-frame N-terminal amino acids. TcGTs that we could not clearly classify, such as TcGTs including C-terminal modifications, were grouped in the 'other' category. If the classification was ambiguous for different protein isoforms, the normal category was always favoured.

Gateway LR Clonase II Enzyme mix (ThermoFisher) as per manufacturer's instructions. pTRE-3HA produces proteins with three C-terminal HA tags in a doxycycline-dependent manner.

Approximately 400,000 HEK293T cells in different wells of a 6-well plate were transfected with either pTRE-WT:DDRGK1-HA or pTRE-L2:DDRGK1-HA, whose expression was induced for 48 hours by adding 1µg/ml doxycycline to the media. After 48 hours, wells were washed with 1ml ice-cold PBS and cells were scraped and transferred to Eppendorf tubes on the second wash. After centrifugation at 300rcf for five minutes at 4 °C, PBS was aspirated, cells re-suspended in 400µl ice-cold cytoplasmic isolation buffer (10mM KOAc, 2mM MgOAc, 20mM HEPES pH7.5, 0.5mM DTT, 0.015% digitonin) and centrifuged at 900rcf for five minutes at 4 °C.
Supernatant was collected as the cytoplasmic fraction and the remaining pellet was re-suspended in 400µl of membrane isolation buffer (10mM HEPES, 10mM KCl, 0.1mM EDTA pH8, 1mM DTT, 0.5% Triton X-100, 100mM NaF), then centrifuged for 10 minutes at 900rcf at 4 °C to pellet nuclei, with the supernatant collected as the membrane fraction. Pelleted nuclei were resuspended in 400µl of lysis buffer (1% NP-40, 500mM Tris-HCl pH8, 0.05% SDS, 20mM EDTA, 10mM NaF, 20mM benzamidine) for 10 minutes on ice, centrifuged for 10 minutes at

Three washes with PBS were performed, followed by incubation with anti-mouse and anti-rabbit Alexa 488 or 568 (ThermoFisher 1:800) for one hour. DAPI (1:10000) was added in the last 10 minutes of incubation, samples washed three times with PBS and coverslips mounted on slides with ProLong Gold Antifade Mountant (ThermoFisher). Images were acquired on a SP8 upright confocal microscope (Leica) and processed in ImageJ.

No additional high throughput data was generated in this study.

We thank all members of the Trono Lab for helpful and insightful discussions, along with Samuel Corless and Nezha Benabdallah for critical reading of the manuscript.
ALDH1A1 is a Marker of Astrocytic Differentiation during Brain Development and Correlates with Better Survival in Glioblastoma Patients
LTR retroelement expansion of the human cancer transcriptome and immunopeptidome revealed by de novo transcript assembly
A gene related to Caenorhabditis elegans spermatogenesis factor fer-1 is mutated in limb-girdle muscular dystrophy type 2B
Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing
Evolution of the mammalian transcription factor binding repertoire via transposable elements
TRIM28 Controls a Gene Regulatory Network Based on Endogenous Retroviruses in Human Neural Progenitor Cells
Hot L1s account for the bulk of retrotransposition in the human population
Gene expression across mammalian organ development
A meta-analysis of genome-wide association studies identifies 17 new Parkinson's disease risk loci
Highly efficient Cas9-mediated transcriptional programming
ZFP30 promotes adipogenesis through the KAP1-mediated activation of a retrotransposon-derived Pparg2 enhancer
Regulatory activities of transposable elements: from conflicts to benefits
Regulatory evolution of innate immunity through co-option of endogenous retroviruses
Endogenous retroviruses function as species-specific enhancer elements in the placenta
Biopython: freely available Python tools for computational molecular biology and bioinformatics
Endogenous retroviral LTRs as promoters for human genes: A critical assessment
CRISPOR: intuitive guide selection for CRISPR/Cas9 genome editing experiments and screens
Gage FH. 2009. L1 retrotransposition in human neural progenitor cells
Relationship between the LHPP Gene Polymorphism and Resting-State Brain Activity in Major Depressive Disorder
The GENCODE v7 catalog of human long noncoding RNAs: Analysis of their gene structure, evolution, and expression
Transposable elements resistant to epigenetic resetting in the human germline are epigenetic hotspots for development and disease
Transposable Elements and Their KRAB-ZFP Controllers Regulate Gene Expression in Adult Tissues
KRAB zinc finger proteins
L1-associated genomic regions are deleted in somatic cells of the healthy human brain
Widespread correlation of KRAB zinc finger protein binding with brain-developmental gene expression patterns
A promoter-level mammalian expression atlas
An atlas of chromatin accessibility in the adult human brain
The impact of transposable elements on mammalian development
Bioconductor: open software development for computational biology and bioinformatics
Novel Bioinformatics Approach Identifies Transcriptional Profiles of Lineage-Specific Transposable Elements at Distinct Loci in the Human Dorsolateral Prefrontal Cortex
The interactome of KRAB zinc finger proteins reveals the evolutionary history of their functional diversification
The Dfam database of repetitive DNA families
A comprehensive catalog of human KRAB-associated zinc finger genes: Insights into the evolutionary history of a large family of transcriptional repressors
KRAB zinc-finger proteins contribute to the evolution of gene regulatory networks
Endogenous retroviruses drive KRAB zinc-finger protein family expression for tumor suppression
Haussler D. 2014. An evolutionary arms race between KRAB zinc-finger genes ZNF91/93 and SVA/L1 retrotransposons
Transposable elements drive widespread expression of oncogenes in human cancers
Transposable Elements: A Common Feature of Neurodevelopmental and Neurodegenerative Disorders
Activation of neuronal genes via LINE-1 elements upon global DNA demethylation in human neural progenitors
Spatio-temporal transcriptome of the human brain
Quantitative visualization of alternative exon expression from RNA-seq data
Brain Transcriptome Databases: A User's Guide
Transposable elements reveal a stem cell specific class of long noncoding
BLAT---The BLAST-Like Alignment Tool
HISAT: a fast spliced aligner with low memory requirements
Transcriptome assembly from long-read RNA-seq alignments with StringTie2
Biological functions and signaling of a transmembrane semaphorin
The Human Transcription Factors
Measuring and interpreting transposable element expression
voom: precision weights unlock linear model analysis tools for RNA-seq read counts
Integrative functional genomic analysis of human brain development and neuropsychiatric risks
Human endogenous retrovirus-K contributes to motor neuron disease
A Genome-wide ER-phagy Screen Highlights Key Roles of Mitochondrial Metabolism and ER-Resident
featureCounts: an efficient general purpose program for assigning sequence reads to genomic features
Identification of bona fide B2 SINE retrotransposon transcription through single-nucleus RNA-seq of the mouse hippocampus
Dysferlin, a novel skeletal muscle gene, is mutated in Miyoshi myopathy and limb girdle muscular dystrophy
A critical role of DDRGK1 in endoplasmic reticulum homoeostasis via regulation of IRE1α stability
Gateways to the FANTOM5 promoter level mammalian expression atlas
Channel β Subunit Kvβ2 (Kcnab2)
The human transcriptome across tissues and individuals
Tissue-specific usage of transposable element-derived promoters in mouse development
Transcriptional landscape of the prenatal human brain
Somatic mosaicism in neuronal precursor cells mediated by L1 retrotransposition
L1 retrotransposition in neurons is modulated by MeCP2
C2H2 zinc finger proteins greatly expand the human regulatory lexicon
Large-scale meta-analysis of genome-wide association data identifies six new risk loci for Parkinson's disease
Evidence for HTR1A and LHPP as interacting genetic risk factors in major depression
Differences in human and chimpanzee gene expression patterns define an evolving network of transcription factors in brain
Transposable Elements and KZFPs Facilitate Human Embryonic Genome Activation and Control Transcription in Naive Human ESCs
BEDTools: a flexible suite of utilities for comparing genomic features
Integrative genomics viewer
Widespread contribution of transposable elements to the innovation of gene regulatory networks
ZNF445 is a primary regulator of genomic imprinting
Diseases of the nERVous system: retrotransposon activity in neurodegenerative disease
Molecular Criteria for Defining the Naive Human Pluripotent State
Ongoing evolution of KRAB zinc finger protein-coding genes in modern humans
Transposable Elements, Polydactyl Proteins, and the Genesis of Human-Specific Transcription Networks
Deplancke B, et al. 2020. Primate-restricted KRAB zinc finger proteins and target retrotransposons control gene expression in human neurons
Primer3-new capabilities and interfaces
Ubiquitous L1 Mosaicism in
Identification of a cDNA encoding an isoform of human CTP synthetase
1 can regulate the ZNF300 promoter in APL-derived promyelocytes HL-60
A single-cell RNA-seq survey of the developmental landscape of the human prefrontal cortex
Emerging Roles of Long Non-Coding RNAs as Drivers of Brain Evolution