key: cord-007757-4mri8kyq
authors: van de Sluis, Bart; Voncken, Jan Willem
title: Transgene Design
date: 2010-10-04
journal: Transgenic Mouse Methods and Protocols
DOI: 10.1007/978-1-60761-974-1_6
sha: 
doc_id: 7757
cord_uid: 4mri8kyq

Transgenics are powerful mouse models to understand the biological functions of genes. This chapter gives a short overview of the requirements and considerations in designing a transgene. In addition, important choices that have to be made in advance to successfully design and generate a transgenic mouse model are discussed. Methods for DNA purification for microinjection are also provided in this chapter.

The application of transgenesis has increased exponentially since its introduction in the early 1980s and is still one of the most powerful methods to study gene function. As approaches to solve scientific problems became more complex, transgene design evolved alongside. At this moment, the versatility in strategies and applications of transgenic animal models is both staggering and exciting. A fully comprehensive overview of all variations in transgene design described in the scientific press would be illusory and would exceed the aim of this chapter. Nevertheless, the strategy to generate a transgenic animal model (i.e., design of a transgene) warrants special attention to ensure the highest chances for success. Therefore, this chapter provides a concise overview of the elementary requirements of a transgenic construct and some considerations in transgene design. In addition, a number of aspects are discussed that may influence choices early on in the process of designing and generating a transgenic mouse model.

The first choice in transgene design concerns the donor species (origin) of the transgene DNA and the biological properties of the transgene (see Subheading 1.1). Secondly, transgenes, including elements that control their expression, may either be fully derived from a native genomic locus or be assembled from genomic DNA or copy DNA (cDNA) and (heterologous) regulatory elements (see Subheading 1.2). In addition, a range of regulatory systems offers a certain degree of control over transgene expression (see Subheadings 1.3 and 1.4). Size constraints, inherent to particular cloning systems, may limit the use of native regulatory elements: if a transgene becomes too large for regular plasmid or cosmid-based vectors, or when genetic complementation is desired (e.g., with DNA fragments spanning large genomic deletions), one can switch to systems that allow cloning of very large DNA segments (see Subheading 1.3; Chapter 9). A number of frequently encountered drawbacks are worth paying extra attention to; these are summarized in the Notes.

Since the purity of the microinjected DNA is the very first determinant of success, detailed protocols are provided for the purification of conventionally sized (i.e., at most 20 kb) transgenes (see Subheading 2). For the purification of large DNA segments, the reader is referred to Chapter 9.

In considering transgenic technology, the choice of origin of the transgene, i.e., the species the transgene originates from, is an important one. The origin of a transgene may range from prokaryotes (e.g., reporter genes such as β-galactosidase) to worms or flies and higher eukaryotes like man. Human DNA is applied most widely to generate transgenes in experimental biomedical research. This choice offers several advantages. First, from a biomedical viewpoint, many known genetic disorders in humans have been mapped and extensively characterized at the molecular level: mutations or deletions have often been identified and mutant alleles are readily available (1-7). This makes it possible to generate transgenic models with "diseased" alleles and study structure-function relationships in the context of common (mouse) alleles. Second, despite structural divergence, most genes have been well conserved between mouse and man. This may become an important issue when, for instance, the biological activity of the (trans)gene product is dependent on protein-protein interactions or homodimerization. Such interactions may no longer occur between proteins originating from species that have diverged too much during evolution. It is also possible that the (trans)gene product alters expression of the mouse homolog, either at a transcriptional, translational, or posttranslational (stability) level. Needless to say, these aspects are important, since they may all affect the outcome of experiments. Third, at a practical level, screening for founder mice generated by pronuclear microinjection may be difficult when the transgene is also derived from the mouse; the same holds for expression analysis. Such analyses would require quantitation at the DNA and mRNA level, respectively. Structural differences between human and mouse genes make it relatively easy to screen for transgenic animals. Primary sequence differences, often concentrated in noncoding regions (introns), provide a convenient way to distinguish the transgene from the endogenous mouse gene by simple restriction endonuclease analysis. Sequence differences are not necessarily confined to noncoding regions but may also occur in coding regions (exons): the mRNA transcribed from the human transgene and that from the endogenous murine counterpart may differ in size and/or nucleotide composition. The latter may become useful in case size similarity hampers straightforward interpretation. If gene products differ at the amino acid level, Western analysis also presents a means to discriminate between endogenous and transgene-related expression, provided antisera are available that specifically detect the (trans)gene product. Alternatively, the transgene-encoded protein can easily be distinguished by adding an epitope to the protein's amino acid sequence, such as a synthetic Flag tag, a multiple-histidine (His) tag, protein-derived tags (e.g., hemagglutinin (HA), c-myc), or virus-based tags (e.g., V5, PY (polyoma)). In addition, follow-up studies such as immunoprecipitation, immunohistochemistry, flow cytometry, chromatin immunoprecipitation (ChIP assay), and protein purification may benefit significantly from such tags when immunological tools against the transgene product are limited. Importantly, the effect of tag addition should be evaluated in vitro prior to embarking on in vivo experiments, as the addition of small peptide tags may affect protein function in vivo.
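The restriction-based screening idea described above can be illustrated in silico. The following is a minimal sketch, not a validated genotyping tool: the two short sequences are invented toy examples (they are not real human or mouse loci), and the helper names `cut_positions` and `fragment_lengths` are hypothetical. The only external fact assumed is that EcoRI recognizes GAATTC and cuts after the first G. A single base difference that destroys one site changes the predicted fragment pattern, which is what makes a transgene distinguishable from its endogenous counterpart.

```python
def cut_positions(seq: str, site: str, cut_offset: int = 0) -> list[int]:
    """Return 0-based cut positions for a recognition site on a linear sequence."""
    seq, site = seq.upper(), site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + cut_offset)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq: str, site: str, cut_offset: int = 0) -> list[int]:
    """Fragment lengths (longest first) after complete digestion of linear DNA."""
    cuts = cut_positions(seq, site, cut_offset)
    bounds = [0] + cuts + [len(seq)]
    lengths = [b - a for a, b in zip(bounds, bounds[1:]) if b > a]
    return sorted(lengths, reverse=True)


# Toy sequences (hypothetical): the "mouse-like" copy differs by one base,
# which removes the first EcoRI site present in the "human-like" copy.
ECORI_SITE = "GAATTC"   # EcoRI cuts G^AATTC, i.e. one base into the site
ECORI_OFFSET = 1
human_like = "ATGCCGAATTCGGTTAACCGGATCCTTAGGAATTCAAGT"
mouse_like = "ATGCCGAGTTCGGTTAACCGGATCCTTAGGAATTCAAGT"

human_fragments = fragment_lengths(human_like, ECORI_SITE, ECORI_OFFSET)
mouse_fragments = fragment_lengths(mouse_like, ECORI_SITE, ECORI_OFFSET)

print("human-like fragments:", human_fragments)
print("mouse-like fragments:", mouse_fragments)
print("patterns differ:", human_fragments != mouse_fragments)
```

In practice the same comparison would be run on the real transgene and endogenous sequences with the enzymes available in the laboratory; the sketch only shows why a differing site yields a diagnostic band pattern.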

In summary, there is a wide choice in the origin of the DNA used to construct a transgene. As holds for the choice of expressional control (see Subheading 1.3), the choice of transgene origin is mostly determined by the aim of the experimental model itself. For biomedical studies, the use of human transgenes may be preferred, if not obligatory. If the study of gene function is the aim and overexpression is the experimental approach, a human gene may simply offer a practical solution for screening purposes. In addition, an alternative approach to generate transgenic mice is discussed in Chapter 10.

Although cDNA-based expression vectors on average work fine in vitro and designing a transgenic construct using cDNA may seem straightforward, cDNA-based transgenes perform less reliably in vivo: they often function, but expression levels are frequently low, and such transgenes are often silenced. It appears important to preserve the intron-exon boundaries at least to some extent in a transgene. The native intron-exon structure of a gene need not be preserved in its entirety though. If a genomic DNA transgene is too large for conventional cloning techniques, combining cDNA sequences with a few genomic intron-exon boundaries may circumvent this problem (see Fig. 1). Inclusion of only one generic intron in a transgene has been shown to augment transgene expression significantly (8, 9).

It appears that the origin of the intron need not necessarily be the same as that of the (trans)gene of interest, but may in fact be heterologous or even a hybrid of sequences from different origins. Often, (part of) the first (noncoding) exon attached to a promoter is used in combination with coding sequences within a transgene; care should be taken that transgene translation starts at the intended ATG, and not in upstream heterologous exon sequences (see Subheading 1.3). Moreover, the effect of including introns in a transgene seems independent of their position within the transcriptional unit, although 3′-positioned splice acceptor and donor sequences have been known to result in aberrant splicing products. These observations suggest that recognition and processing by spliceosomes is instrumental in the observed upregulation of transgene expression. In addition, some endogenous introns appear to harbor regulatory elements with structural and functional similarities to enhancers, locus control regions (LCRs), or matrix attachment regions (MARs) (see Subheading 2.2), which direct transgene expression in a position-independent or cell type-specific fashion (9-17). Since regulatory regions are usually not cloned along with cDNA, these have to be provided "separately" and are most often not endogenous (1). Endogenous regulatory elements may be included when the DNA originated from a genomic clone (2). The experimenter has a certain degree of freedom to tailor transgene design to specific requirements (3).

The choice of regulatory elements that drive transgene expression is broad (Fig. 2), and is primarily determined by the aim of the model. However, in all instances, a number of indispensable elements that control gene expression need to be included in a transgene.

The promoter, the region of DNA at which gene expression is initiated by binding of the RNA polymerase transcriptional machinery, is the most basic and essential element controlling gene expression. The promoter region should comprise a Kozak/ATG sequence at which transgene translation commences (see Subheading 1.2; (18)). If the expression pattern of a transgene needs to parallel that of the endogenous mouse gene, one needs to include native regulatory elements. Regulatory elements can be included that augment transgene expression, such as enhancers, which typically act in an orientation-independent manner. MARs, scaffold attachment regions (SARs), and chromosomal insulators are believed to insulate (trans)gene expression from the influences of surrounding chromatin (15). LCRs confer position-independent and copy number-dependent expression characteristics to a transgene. In addition, LCRs provide transgene expression at physiological levels, often with a cell lineage-specific enhancer activity. The application of LCRs in transgenesis is discussed in detail elsewhere (reviewed in (15)). The advantage of including such elements in transgenes is obvious: whereas transgenes with "minimal" promoters may become inactive upon insertion into transcriptionally silent chromatin, transgenes carrying, for instance, LCRs will not. However, not all endogenous loci contain such elements and most often their position relative to coding regions within the locus is not known. If faithful reproduction of the endogenous expression profile is required (see also Subheading 1.4), without actual knowledge of the position of regulatory elements within a transgene, there is an obvious advantage in using large DNA segments as transgenes (see Subheading 1.3).
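The requirement that translation start at the intended Kozak/ATG, and not at an ATG in upstream heterologous exon sequence, can be checked on the predicted transcript before cloning. The sketch below is illustrative only. It assumes the commonly cited core of the Kozak context (a purine at position -3 and a G at position +4 relative to the A of the ATG); the three-level labels "strong", "adequate", and "weak", the function names, and the toy transcript are all hypothetical and not taken from this chapter.

```python
def _as_dna(seq: str) -> str:
    """Normalize an mRNA or DNA string to upper-case DNA letters."""
    return seq.upper().replace("U", "T")


def upstream_atgs(transcript: str, intended_start: int) -> list[int]:
    """Return 0-based indices of every ATG that begins before the intended start codon."""
    dna = _as_dna(transcript)
    return [i for i in range(intended_start) if dna.startswith("ATG", i)]


def kozak_strength(transcript: str, atg_index: int) -> str:
    """Classify the context of an ATG by the -3 purine and the +4 G (assumed rule)."""
    dna = _as_dna(transcript)
    if dna[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG")
    minus3 = dna[atg_index - 3] if atg_index >= 3 else None
    plus4 = dna[atg_index + 3] if atg_index + 3 < len(dna) else None
    has_purine = minus3 in ("A", "G")
    has_g = plus4 == "G"
    if has_purine and has_g:
        return "strong"
    if has_purine or has_g:
        return "adequate"
    return "weak"


# Toy transcript (hypothetical): a 5' leader containing a stray ATG,
# followed by the intended start codon in a GCCACCATGG context.
toy_transcript = "GGCATGCTTAGCGCCACCATGGCTGCATAA"
intended = 18  # index of the A in the intended ATG

for idx in upstream_atgs(toy_transcript, intended):
    print(f"upstream ATG at {idx}: {kozak_strength(toy_transcript, idx)} context")
print(f"intended ATG at {intended}: {kozak_strength(toy_transcript, intended)} context")
```

Any upstream ATG flagged this way is a prompt to inspect the construct by eye (reading frame, distance to the intended start), not a verdict on its own.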

In the early days of transgenesis, it was often difficult to obtain faithful transgene expression patterns, i.e., patterns that parallel expression of their endogenous counterparts, for a number of reasons (e.g., lack of knowledge regarding the nature and location of regulatory sequences of a locus; size restrictions of cloning systems). The use of a full-length, relatively small mammalian gene (i.e., 15-20 kb), including 5′, 3′, and internal regulatory regions, may yield faithful transgene expression patterns. In such a fortunate situation, not only coding sequences, but also cell lineage-specific and other regulatory elements are located within or close to the intron-exon structure of a locus. However, the exact location of elements that exert transcriptional control over a (trans)gene of interest is not always known, and often these may be many kb removed from the actual transcriptional start site. For a number of applications, like genetic complementation of large deletions and gene therapy, it is imperative to include such regulatory features in a transgene (19-21). Fortunately, when transgenes become too large (i.e., up to 100-150 kb) for "conventional" plasmid-based cloning, a number of modern cloning techniques overcome this hurdle: one can resort to cloning systems employing P1 artificial chromosomes (PAC), yeast artificial chromosomes (YAC), or bacterial artificial chromosomes (BAC) (see Chapter 9). In addition, other cloning methodologies have been described to generate plasmids for biotechnology purposes, such as recombineering (see Chapter 11). In principle, any gene can be cloned into these systems. Exceedingly large YACs (>500 kb) are transferred into embryonic stem cells first and via this route used to generate transgenic mice (see Chapter 9). An obvious and important advantage of using large stretches of genomic DNA is that with these systems the chances of obtaining cell lineage-specific, integration site-independent, and copy number-dependent expression characteristics are greatly improved (22, 23).
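The size-driven choice of cloning system described above can be summarized as a simple decision rule. This is a rough sketch under stated assumptions: the 20 kb limit for conventionally sized transgenes and the 500 kb threshold for YACs that go through embryonic stem cells are read from this chapter, while the function name, the constant names, and the exact wording of the returned suggestions are hypothetical. Real projects will also weigh factors the rule ignores, such as clone availability and local expertise.

```python
# Thresholds taken from the text; treat them as rough guides, not hard limits.
CONVENTIONAL_MAX_KB = 20.0   # "conventionally sized (i.e., at maximum 20 kb)"
LARGE_YAC_MIN_KB = 500.0     # "exceedingly large YACs (>500 kb)"


def suggest_cloning_system(insert_kb: float) -> str:
    """Suggest a cloning and delivery route for a transgene of the given size (kb)."""
    if insert_kb <= 0:
        raise ValueError("insert size must be positive")
    if insert_kb <= CONVENTIONAL_MAX_KB:
        return "plasmid or cosmid vector, pronuclear microinjection"
    if insert_kb <= LARGE_YAC_MIN_KB:
        return "PAC, BAC, or YAC (see Chapter 9)"
    return "YAC transferred into embryonic stem cells first (see Chapter 9)"


for size_kb in (8, 20, 120, 650):
    print(f"{size_kb:>4} kb -> {suggest_cloning_system(size_kb)}")
```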

If overexpression or ectopic expression is required, general-type heterologous promoters (e.g., widely applied viral promoters) and/or enhancers are commonly used. The use of heterologous and autologous (i.e., endogenous to the gene of interest) regulatory elements is often combined (see for example Figs. 1 and 3). The very first transgenic mouse models made use of the general-type metallothionein promoter (pMT) (24, 26-28). This promoter was used to control expression of human, rat, or viral transgenes and, although showing a relatively high level of basal expression, proved to be further inducible with glucocorticoids, heavy metals, or bacterial endotoxin (LPS) (29, 30).

The use of heterologous promoters and other regulatory elements has become widespread and, as indicated before, is determined primarily by the aim of the animal model (see Fig. 2). To ensure global and ubiquitous expression of a particular transgene, general-type promoters, such as those derived from histones, β-actin, or housekeeping genes (e.g., phosphoglycerate kinase (PGK)), but also viral promoters are often applied. Needless to say, if a heterologous promoter is chosen to drive transgene expression, one should adhere to those promoters that have been proven to function in vivo, or thoroughly test the novel system first. In the latter case, the experimenter should realize that in vitro expression characteristics of novel promoters may be very different from those in vivo. It is therefore strongly recommended to test novel promoters in vivo before use in a transgenic animal model.

If tissue-restricted expression patterns are required, specific promoters that confer this selectivity are chosen. In order to determine the minimal requirements for tissue-restricted expression of a particular promoter, it needs to be dissected at the molecular level. Classical promoter studies can be applied in vitro and in vivo to map the elements present within and surrounding a gene of interest. The promoter is tested in vivo by fusing it to reporter genes, such as β-galactosidase, CAT, luciferase, or GFP. This approach is routinely used to examine spatio-temporal expression patterns and tissue specificity of native promoter sequences in vivo. To bypass embryonic lethality of a transgene or to study the effects of tissue-restricted transgene (over)expression, tissue-specific or inducible regulatory systems, or combinations of these (binary transgenic systems), may be employed. Several binary transgenic systems have been developed, often employing prokaryotic expression control systems (31-38). A number of exciting applications of transgenic technology are described in Chapters 9, 12, and 19.

1. Qiagen Gel Extraction kit.

1. Thermoblock, adjustable temperature control.

2. Microcentrifuge capable of 12,000 × g.

13. Ice cold 100% EtOH, 70% EtOH.

The transgene is released from vector sequences by restriction endonuclease (REN) digestion. Removal of prokaryotic sequences should be carried out as completely as possible (see Note 2). Depending on the materials available on location, several methods can be pursued to extract the linearized transgene from agarose gels. A commercially available agarose gel extraction kit, such as the Qiagen Gel Extraction kit, may be used. Although optional, some laboratories apply Elutip-D columns for subsequent purification of DNA for microinjection, with very satisfactory and consistent results (see Subheading 3.1.4; Note 8).

1. Aside from promoter choice, the overall structure of a transgene has considerable influence on its activity. In eukaryotes, foreign DNA sequences (e.g., viral DNA, transgenes) are often methylated and inactivated as part of a host defense system against transcription of potentially harmful genes (39-41). Although the exact mechanisms by which (trans)genes are transcriptionally silenced are not clear in all aspects, the presence of prokaryotic (i.e., plasmid) sequences should be minimized in a transgenic construct. The strategy (i.e., the restriction sites) by which the transgene is released from plasmid sequences is therefore a crucial part of the transgene design. In addition, it appears that the presence of bona fide intron-exon structures in a transgene circumvents some of these problems (see Subheading 1.2).

2. Expression of the transgene may be low in several independent animal lines or not detectable at all. Although this may indicate a flaw in the design of the construct, it is also possible that the (trans)gene product causes embryonic death because transgene (over)expression interferes with normal embryogenesis. In the latter case, the use of different regulatory sequences should be considered (see Subheadings 1.3 and 1.4). To rule out poor construct design, it is strongly recommended to evaluate the biological activity (i.e., basic expression) of a transgene and the size of the transgenic mRNA in cultured cells (e.g., by transient transfection or electroporation and subsequent Northern analysis).

3. High copy number insertions are often associated with a significant decrease in transgene expression as a result of silencing (42-44), most likely because these are perceived as repeats in the mammalian genome. The presence of regulatory elements that confer position-independent and copy number-dependent transgene expression may circumvent this problem. Occasionally, insertion as inverted repeats may also affect transgene activity (29).

4. An inserted transgene may affect the expression of (nearby) endogenous genes, which may influence the phenotype in an unforeseen manner. Transgene insertion may cause haploinsufficiency, or when integrated in an imprinted locus or in a gene on the X-chromosome, it may even cause a null-mutant phenotype, entirely unrelated to the intended model. To ascribe a certain phenotype to transgene activity, it is therefore imperative to include more than one independent transgenic line in the studies, as is standard practice for ES cell-mediated genetic manipulation in mice (see Chapters 10 and 12).

5. Although without doubt more involved than conventional transgenesis, it is possible to study the behavior of recessive mutations in mice by overexpression at supra-physiological levels (18). Alternatively, a (conditional) knockin for the mutation may be generated, or the transgenic lines may be backcrossed to a (conditional) knockout (see Chapters 10, 12, and 15) for the endogenous gene (19).

6. In our experience, the slightest impurities will have a serious impact on the efficiency with which transgenic founders are generated. Often, residual ethidium bromide is a source of trouble. Traces of ethidium bromide in a DNA preparation are easily detected on an agarose gel from which ethidium bromide has been omitted. If DNA is detectable in this fashion, the preparation should be reextracted with phenol-chloroform-isoamyl alcohol a number of times.

7. Manipulation of DNA that is to be used to generate transgenic animals via pronuclear injection should be done with the utmost care: shield DNA as much as possible from UV light (i.e., daylight, UV light box) when in the presence of ethidium bromide, since DNA can be damaged and mutated as a result. Working surfaces and equipment (UV tray; scalpels) should be clean when isolating fragments from agarose gels.

8. Since DNA for microinjection needs to be extremely pure, it is not precipitated with coprecipitants. Moreover, some coprecipitants, such as Dextran T-500 (Pharmacia), are toxic to zygotes.

9. Elutip-D purification is a convenient method to purify DNA for pronuclear injection (see Chapter 2). Yields are not very high (limited by the loading capacity of the column), but the DNA obtained is ultrapure. As an alternative to Elutip-D purification, dialysis against TE buffer for microinjection is often used. However, care should be taken that the materials used are absolutely free of soap and other contaminants, since these substances have a strongly negative effect on the survival of injected fertilized eggs.

Transgenic models of Huntington's disease

Mitochondrial dysfunction in neurodegenerative diseases

Duchenne muscular dystrophy and the neuromuscular junction: the utrophin link

Mouse models of human genetic disease: which mouse is more like a man?

Molecular genetics and transgenic model of Gerstmann-Sträussler-Scheinker disease

Recent insights into the molecular pathogenesis of Huntington disease

Brain dystrophin, neurogenetics and mental retardation

A generic intron increases gene expression in transgenic mice

Functional analysis of the human adenosine deaminase gene thymic regulatory region and its ability to generate position-independent transgene expression

Elements regulating somatic hypermutation of an immunoglobulin kappa gene: critical role for the intron enhancer/matrix attachment region

Sequences containing the second-intron enhancer are essential for transcription of the human apolipoprotein B gene in the livers of transgenic mice

Multiple neuron-specific enhancers in the gene coding for the human neurofilament light chain

High-level expression of the rat whey acidic protein gene is mediated by elements in the promoter and 3′ untranslated region

Position-independent expression of whey acidic protein transgenes

Locus control regions: coming of age at a decade plus

Cell-specific expression of alpha 1(I) collagen-hGH minigenes in transgenic mice

Independent regulatory elements in the nestin gene direct transgene expression to neural stem cells or muscle precursors

At least six nucleotides preceding the AUG initiator codon enhance translation in mammalian cells

Complementation of null CF mice with a human CFTR YAC transgene

A human YAC transgene rescues craniofacial and neural tube development in PDGFRalpha knockout mice and uncovers a role for PDGFRalpha in prenatal lung growth

A YAC mouse model for Huntington's disease with full-length mutant huntingtin, cytoplasmic toxicity, and selective striatal neurodegeneration

A yeast artificial chromosome covering the tyrosinase gene confers copy number-dependent expression in transgenic mice

Copy number-dependent expression of a YAC-cloned human CFTR gene in a human epithelial cell line

Metallothionein-human GH fusion genes stimulate growth of mice

Acute leukaemia in bcr/abl transgenic mice

Dramatic growth of mice that develop from eggs microinjected with metallothionein-growth hormone fusion genes

Transgenic mice containing growth hormone fusion genes

Somatic expression of herpes thymidine kinase in mice following injection of a fusion gene into eggs

Transmission distortion and mosaicism in an unusual transgenic mouse pedigree

Regulation of metallothionein gene expression

Timing is everything in life: conditional transgene expression in the cardiovascular system

Flipping the oncogene switch: illumination of tumor maintenance and regression

Spatial and temporal regulation of a lacZ reporter transgene in a binary transgenic mouse system

Temporal control of gene expression in transgenic mice by a tetracycline-responsive promoter

Inducible gene expression and gene modification in transgenic mice

Temporal control of the Cre recombinase in transgenic mice by a tetracycline responsive promoter

A modified tetracycline-regulated system provides autoregulatory, inducible gene expression in cultured cells and transgenic mice

Transgenic mouse models for lung cancer

A genetic program for deletion of foreign DNA from the mammalian genome

Patterns of DNA methylation - evolutionary vestiges of foreign DNA inactivation as a host defense mechanism. A proposal

Mammalian cDNA and prokaryotic reporter sequences silence adjacent transgenes in transgenic mice

The vagaries of variegating transgenes

Conspiracy of silence among repeated transgenes

Repeat-induced gene silencing in mammals