OPEN ACCESS ARTICLE
TECHNOLOGY DEVELOPMENT
a Invitrogen Corporation, Carlsbad, California, USA;
b Cellartis AB, Göteborg, Sweden
Key Words. Human embryonic stem cells • phiC31 integrase • Site-specific integration
Correspondence: Bhaskar Thyagarajan, B.V.Sc., Ph.D., Invitrogen Corporation, 1600 Faraday Avenue, Carlsbad, California 92008, USA. Telephone: 760-268-7460; Fax: 760-602-6691; e-mail: bhaskar.thyagarajan{at}invitrogen.com
Received April 17, 2007;
accepted for publication October 9, 2007.
First published online in STEM CELLS EXPRESS October 25, 2007.
| ABSTRACT |
|---|
promoter. Stable clones were selected by antibiotic resistance and further characterized. The frequency of integration suggested candidate hot spots in the genome, which were mapped using a plasmid rescue strategy. The pseudo-attP profile in hESC differed from those reported earlier in differentiated cells. Clones derived using this method retained the ability to differentiate into all three germ layers, and fidelity of expression of GFP was verified in differentiation assays. GFP expression driven by the Oct4 promoter recapitulated endogenous Oct4 expression, whereas GFP expression driven by the EF1α promoter was stable and persistent. Our results demonstrate the utility of phiC31 integrase to target pseudo-attP sites in hESC and show that integrase-mediated site-specific integration can efficiently create stably expressing engineered human embryonic stem cell clones. Disclosure of potential conflicts of interest is found at the end of this article.
| INTRODUCTION |
|---|
An example of this is the generation of reporter cell lines, where specific promoter activity can easily be monitored via expression of fluorescent protein. The exact culture conditions and medium additives required to maintain human embryonic stem cells (hESC) in their pluripotent state are currently being identified and optimized. In addition, the pathways involved in directing specific differentiation of hESC are still under investigation. The development of a hESC reporter line expressing a fluorescent marker under control of a developmental stage-specific promoter would aid in the study of stem cell biology by providing a rapid readout that indicates the state of cells in culture. Earlier studies have used randomly integrating vectors, either retroviruses or plasmid DNA, to generate hESC-derived lines expressing green fluorescent protein (GFP) [3–5]. These studies describe the development of hESC lines expressing GFP driven by either a constitutive (EF1α), an inducible (EF1α tied to tetO element), or a lineage-specific (Oct4) promoter. A recent study describes the construction of human embryonic stem cells by transfection of plasmid DNA [6]. Although stable lines were successfully generated, the silencing of randomly integrated transgenes in this study underscores the importance of choosing the appropriate vector.
Use of lentiviral vectors for engineering hESC is popular because of the high efficiency of gene delivery afforded by viral infection. Although this system has proven useful, limited payload capacity and the potential for gene disruption from random integration of DNA could limit its utility in stem cells. Two recent articles describe the generation of hESC lines with integrated transgenes [7, 8]. Vallier et al. describe the use of a recombinase to create GFP-expressing hESC lines [7]. This approach first requires the creation of a recombinase-expressing line, followed by random integration of the expression construct. Although this is an elegant method to induce expression of the gene of interest, the efficiency of producing transgenic lines using this method can be low. Zeng et al. describe a baculovirus vector coupled with elements of adeno-associated virus to obtain transduction, integration, and long-term expression of the transgene [8]. This protocol was efficient at generating transgenic hESC, and the integrated transgene showed long-term expression. The large capacity of baculoviruses also overcomes the disadvantage of retroviruses, which have a limited payload capacity. However, the long-term effects of expression of the adeno-associated virus rep protein in hESC are still unknown, and it is possible that they could lead to undesirable effects [9].
Here, we focus on using a site-specific integration approach to direct expression constructs to transcriptionally active chromosomal regions using phiC31 integrase. This study outlines a method for using this site-specific integration system to generate engineered hESC lines. We describe here the construction of a pluripotency-specific reporter line using the human Oct4 promoter to drive GFP expression, as well as a constitutively expressing GFP line.
The integrase from the Streptomyces phage phiC31 has been shown to target donor plasmids containing a native attB site into pseudo-attP sites in the human genome [10, 11]. phiC31 integrase has been shown to function in vitro, in cell culture systems, and in vivo. This integrase has been successfully used in cells derived from a number of species, including human, mouse, rat, rabbit, Chinese hamster, Drosophila, and plants [12–25]. Unlike the better-known recombinases Cre and Flp, the phiC31 integrase catalyzes recombination between two nonidentical sites. This feature, along with the apparent lack of a corresponding excisionase enzyme, makes the recombination reaction unidirectional, ensuring that constructs integrated into the genome do not act as substrates for the reverse reaction. The result is an improvement in integration efficiency compared with random integration. Another extremely useful feature of phiC31 integrase is the ability of the enzyme to target pseudo-attP sites present in the genome. These pseudo-attP sites bear some resemblance to the native attP site and have been shown to be present in transcriptionally active areas of the genome [10]. Many of the sites described have been shown to be in intronic regions of genes. Since these pseudo-attP sites typically tend to be in open chromatin [10], our hypothesis was that there would be less interference with expression of the transgene, and any changes in expression of the transgene would be solely due to regulation of the promoter used. These features make this integrase a potentially useful tool for construction of transgenic lines from unmodified cells, since the targeted sites are already present in the genome.
In this study, we have used phiC31 integrase to create variant hESC-derived lines [26, 27] containing the GFP gene driven by either the human Oct4 promoter or the human EF1α promoter. We also describe a simplified vector construction design using a targeting vector that is a substrate for Multisite Gateway (Invitrogen Corporation, Carlsbad, CA, http://www.invitrogen.com). This greatly reduces the effort involved in cloning and allows the creation of multiple constructs in the same background. The combination of Multisite Gateway technology and site-specific recombinases provides a powerful tool for the construction of transgenic lines in human embryonic stem cells, which in turn can be used as versatile platforms for the study of stem cell biology.
| MATERIALS AND METHODS |
|---|
Cell Culture and Transfection
BG01v cells (49, XXY, +12, +17) were obtained from BresaGen, Inc. (Athens, GA). SA002 cells (47, +13, XY) were obtained from Cellartis AB (Göteborg, Sweden, http://www.cellartis.se). All reagents were obtained from Invitrogen unless indicated otherwise. The cells were maintained either on a mouse embryonic fibroblast (MEF) feeder layer in Dulbecco's modified Eagle's medium (DMEM)/Ham's F-12 (F12) medium supplemented with 20% knock-out serum replacement (KSR), 4 ng/ml basic fibroblast growth factor, 1 ml of nonessential amino acids, and 100 µM β-mercaptoethanol or on Matrigel (BD Biosciences, Franklin Lakes, NJ, http://www.bdbiosciences.com) in the same medium conditioned on an MEF feeder layer. Fresh medium was provided to the cells every day, and the cells were passaged every 4–5 days.
One day prior to transfection with Lipofectamine 2000 (Invitrogen), cells were treated with Accutase (Sigma-Aldrich, St. Louis, http://www.sigmaaldrich.com) and plated on Matrigel in conditioned medium. Lipofectamine 2000-mediated transfection was carried out according to the manufacturer's protocol. We typically used 4 µg of the expression vector and 4 µg of the phiC31 integrase expression vector to transfect 2 million cells. Control transfections omitted the phiC31 integrase plasmid or the GFP expression vector. After transfection, cells were allowed to recover for 1 day, and selection was started with medium containing Hygromycin at a concentration of 10 µg/ml. After 14–21 days of selection, individual colonies were manually picked and expanded for further analysis.
Electroporation was carried out with the BTX ECM630 electroporator (Harvard Bioscience, Holliston, MA, http://www.btxonline.com). Six to 8 million cells were harvested using Accutase and resuspended in 800 µl of OptiPro SFM (Invitrogen). These cells were placed in an electroporation cuvette with a gap of 0.4 cm. Cells were electroporated with a pulse of 500 V at 250 µF. Electroporated cells were plated on MEF feeders and allowed to recover for 48–72 hours before selection was started with Hygromycin (10 µg/ml; Invitrogen). As with lipid-mediated transfection, individual drug-resistant clones were manually picked and expanded for further analysis.
Plasmid Rescue and Sequence Analysis
Genomic DNA isolated from individual clones was digested with the restriction enzymes NheI, SpeI, and XbaI. The enzymes were heat-inactivated, and the DNA was self-ligated at low DNA and T4 DNA ligase concentrations. After overnight incubation at 16°C, the DNA was extracted with phenol:chloroform, ethanol-precipitated, and resuspended in water. Electrocompetent DH10B Escherichia coli were then electroporated with the ligated DNA using the Gene Pulser II (Bio-Rad, Hercules, CA, http://www.bio-rad.com) under the recommended conditions. The resulting transformation was plated on Luria-Bertani (LB)-agar plates containing ampicillin. Plasmid DNA isolated from the resulting colonies was sequenced using the primer ChoSeqR (5'-TCCCGTGCTCACCGTGACCAC-3'). Sequence data were analyzed using Sequencher software. The genomic integration site was determined by matching the sequence read to the database at BLAT (http://genome.ucsc.edu) [30].
Analysis of 23 pseudo-site sequences rescued in this study was carried out by the Web-based MEME motif finder (http://meme.sdsc.edu/meme/meme.html) [31]. This program was used to find motifs ranging from 6 to 50 base pairs in 100 base pairs of sequence surrounding the point of crossover. The wild-type phiC31 attP site was also included in the analysis. A common motif was discovered in all the pseudo-sites, and a consensus sequence was generated based on these analyses using WebLogo Version 2.8.2 (http://weblogo.berkeley.edu).
Differentiation and Silencing Assays
Cells were induced to form embryoid bodies in differentiation medium as described, with some modifications [32]. Differentiation medium was composed of DMEM/F12 supplemented with 10% fetal bovine serum, 1% nonessential amino acids, and 100 µM β-mercaptoethanol. Four days after the start of differentiation, embryoid bodies were plated on culture plates to be further differentiated as monolayers. After 21 days, the differentiation potential was measured by immunocytochemistry for markers specific for the three different lineages. Primary antibodies were obtained from various sources and used at the following dilutions: pluripotent marker of Oct4, 1:500 (Abcam, Cambridge, MA, http://www.abcam.com); endoderm marker of α-Fetoprotein, 1:500 (Santa Cruz Biotechnology Inc., Santa Cruz, CA, http://www.scbt.com); mesoderm marker of Smooth Muscle Actin, 1:200 (Sigma-Aldrich); mesoderm marker of Brachyury, 1:1,000 (R&D Systems Inc., Minneapolis, http://www.rndsystems.com); ectoderm marker of βIII-Tubulin (TUJ1), 1:1,000 (Invitrogen); and ectoderm marker of Nestin, 1:500 (BD Biosciences). Secondary antibodies were obtained from Molecular Probes (Eugene, OR, http://probes.invitrogen.com) and used at the following dilutions: Alexa 594-conjugated anti-mouse IgG, 1:1,000; and Alexa 594-conjugated anti-rabbit IgG, 1:1,000. Data on GFP expression levels were collected using a FACScan instrument (BD Biosciences), and data were analyzed using FlowJo (Tree Star, Ashland, OR, http://www.treestar.com).
| RESULTS |
|---|
phage recombination site sequences. Recombination of the amplified products with the recipient pDONR vectors generated the Entry vectors, which could then be used for multiple constructions. Appropriate Entry vectors were recombined with the Destination vector in one step to generate expression vectors containing the gene of interest driven by the promoter of choice. In this study, we used this strategy to generate two vectors that consist of the GFP gene driven by either the constitutive EF1α promoter or the hESC-specific human Oct4 promoter (Fig. 1B). We then used phiC31 integrase to insert the plasmids into the hESC genome. This enzyme directs integration of expression vectors into pseudo-attP sites in the human genome in an efficient manner. To this end, we engineered our Destination vector such that it would contain a recombination site for phiC31 integrase. To allow for selection of integration events, we also incorporated the Hygromycin phosphotransferase gene driven by the HSV-TK promoter. To obtain cells with integration events, the cells of interest were transfected with the expression vectors along with a plasmid encoding the expression of phiC31 integrase. The integrase protein catalyzed the integration of the expression vector into genomic pseudo-attP locations. Stable integration events were selected by expression of the drug-resistance marker present on the plasmids.
The expression constructs were transfected in the absence and presence of the phiC31 integrase plasmid into BG01v cells. Typically, the frequency of integration after 2 weeks of drug selection in the presence of integrase was 2 × 10⁻⁵. Data from three controlled experiments show that the average increase in colony number was 1.4-fold over random integration. In the absence of integrase, 80 colonies were obtained from three experiments, and in the presence of integrase, 114 colonies were obtained. These data suggest that phiC31 integrase can mediate integration into pseudosites in hESC.
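The fold increase and the integration frequency follow directly from the pooled colony counts. The Python sketch below reproduces the arithmetic; the function names are illustrative, and the figure of 2 million cells per experiment is an assumption borrowed from the lipid-mediated transfection protocol in Materials and Methods, not a value stated for these three experiments.

```python
def fold_increase(with_integrase: int, without_integrase: int) -> float:
    """Ratio of colony counts obtained with and without integrase."""
    if without_integrase <= 0:
        raise ValueError("control colony count must be positive")
    return with_integrase / without_integrase


def integration_frequency(colonies: int, experiments: int,
                          cells_per_experiment: float) -> float:
    """Stable colonies per transfected cell, pooled over all experiments."""
    if experiments <= 0 or cells_per_experiment <= 0:
        raise ValueError("experiments and cell number must be positive")
    return colonies / (experiments * cells_per_experiment)


# Colony counts reported in the text, pooled over three experiments.
COLONIES_WITH_INTEGRASE = 114
COLONIES_WITHOUT_INTEGRASE = 80
N_EXPERIMENTS = 3

# ASSUMPTION: 2 million cells per experiment, as in the Lipofectamine 2000
# protocol; the cell number for these specific experiments is not given.
CELLS_PER_EXPERIMENT = 2_000_000

fold = fold_increase(COLONIES_WITH_INTEGRASE, COLONIES_WITHOUT_INTEGRASE)
freq = integration_frequency(COLONIES_WITH_INTEGRASE, N_EXPERIMENTS,
                             CELLS_PER_EXPERIMENT)

print(f"fold increase over random integration: {fold:.1f}")
print(f"integration frequency per cell: {freq:.1e}")
```

Under that assumption, the counts give a fold increase of about 1.4 and a frequency of about 1.9 × 10⁻⁵, in line with the values quoted in the text.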
Pseudosite Profile in hESC
To show that clones obtained were the result of phiC31-mediated site-specific integration, the site of integration was determined by a plasmid rescue strategy. The attB-genome junctions were sequenced, and the data were analyzed by comparison with the BLAT database (http://genome.ucsc.edu/cgi-bin/hgBlat). Table 1 shows the sites of integration of various clones derived from BG01v or SA002. Of 90 clones screened, plasmid rescue data were obtained for 56 clones. Of these, 51 clones were a result of phiC31-mediated integration and 5 were a result of random integration. The chromosomal loci for the random integration events were not determined. The 51 integrase-mediated clones showed integration into 23 different pseudo-attP sites. As has previously been observed, there were small deletions (5–25 bases) observed at the site of integration [11, 15].
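The clone tallies above can be checked with a few lines of Python. This is only a bookkeeping sketch: the function name and dictionary keys are illustrative, and the counts are the ones reported in the text.

```python
def summarize_plasmid_rescue(screened: int, rescued: int,
                             integrase_mediated: int, random_events: int,
                             distinct_sites: int) -> dict:
    """Derive simple proportions from plasmid rescue clone counts."""
    if integrase_mediated + random_events != rescued:
        raise ValueError("integrase-mediated and random clones must sum to rescued clones")
    if rescued > screened:
        raise ValueError("cannot rescue more clones than were screened")
    return {
        # Fraction of screened clones that yielded integration site data.
        "rescue_rate": rescued / screened,
        # Fraction of rescued clones arising from phiC31-mediated integration.
        "site_specific_fraction": integrase_mediated / rescued,
        # Fraction of rescued clones arising from random integration.
        "random_fraction": random_events / rescued,
        # Average number of integrase-mediated clones per pseudo-attP site.
        "mean_clones_per_site": integrase_mediated / distinct_sites,
    }


# Counts reported in the text.
summary = summarize_plasmid_rescue(
    screened=90, rescued=56, integrase_mediated=51,
    random_events=5, distinct_sites=23,
)

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With these counts, about 62% of screened clones yielded rescue data, about 91% of those were integrase-mediated, and each pseudo-attP site was hit by roughly 2.2 clones on average. An average above one is consistent with some sites being recovered repeatedly.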
It has previously been reported that pseudo-attP sites show some similarity to the native phiC31 attP site and that they share a common motif that contains a strong inverted repeat [10]. The pseudosites observed in hESC were subjected to similar analysis, and we found that these sites shared a common motif with the phiC31 attP site (Fig. 2A). This motif is present close to the crossover region in most of the sites, suggesting involvement in the recombination reaction. A consensus sequence for this motif was derived using the MEME motif finder [31]. The consensus sequence of this motif is shown in Figure 2B. The consensus shows a strong inverted repeat centered on the core, providing further evidence for the hypothesis that the integrase binds to each half-site [33].
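To illustrate what an inverted repeat centered on a core means in sequence terms, the sketch below scores how well the left arm of a site pairs with the reverse complement of its right arm. It is not the MEME analysis used in this study; the function names, the default core length of 2 bases, and the example sequences are all hypothetical and chosen only for illustration.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def inverted_repeat_score(seq: str, core_len: int = 2) -> float:
    """Fraction of arm positions that form an inverted repeat around the core.

    The sequence is split into a left arm, a central core, and a right arm of
    equal length. A perfect inverted repeat has a left arm identical to the
    reverse complement of the right arm (score 1.0).
    """
    seq = seq.upper()
    arm_len = (len(seq) - core_len) // 2
    if arm_len <= 0:
        raise ValueError("sequence too short for the requested core length")
    left_arm = seq[:arm_len]
    right_arm = seq[len(seq) - arm_len:]
    matches = sum(a == b for a, b in zip(left_arm, reverse_complement(right_arm)))
    return matches / arm_len


# HYPOTHETICAL sequences, not taken from the study.
perfect_site = "GGTACCTTGGTACC"   # arms GGTACC / GGTACC around core TT
unrelated_site = "AAAAAATTAAAAAA"  # arms cannot pair with each other

print(inverted_repeat_score(perfect_site))
print(inverted_repeat_score(unrelated_site))
```

A score near 1 indicates that the two half-sites mirror each other, which is the pattern expected if the integrase binds each half-site in the same way.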
promoter [34]. As shown in Figure 3A, the EF1α promoter directs strong expression of GFP in these cells. Fluorescence-activated cell sorting analysis of three independent Oct4-GFP clones and one EF1α-GFP clone revealed that the EF1α promoter directs higher levels of expression compared with the hOct4 promoter (Fig. 3BI). This expression is maintained upon long-term culture, as shown in Figure 3BII and 3BIII. Irrespective of the promoter, there was no significant reduction in GFP expression even after 10 passages, which is approximately 4–5 weeks in culture.
Characteristics of GFP Lines
Three independent BG01v-phOct4-GFP clones (YA06, YA15, and YA18) and one SA002-phOct4-GFP clone (YB1403) were studied for their ability to differentiate into the three germ layers by inducing formation of embryoid bodies (EBs). Immunostaining of the embryoid bodies is shown in Figure 4. Expression of endodermal (α-Fetoprotein), ectodermal (βIII-Tubulin and Nestin), and mesodermal (muscle-specific actin and Brachyury) markers was detected in EBs derived from all four lines.
promoter was still present upon differentiation.
| DISCUSSION |
|---|
The utility of the site-specific integration system described here is enhanced by the ability to use recombinase-mediated cloning (Multisite Gateway technology) to generate expression constructs for delivery into hESC. As part of this study, we show assembly of an expression vector containing the GFP gene driven by either the human Oct4 promoter or the EF1α promoter. Although these were relatively simple two-fragment constructions, our laboratory and others commonly use this system to assemble as many as five fragments in one reaction, and assembly of up to eight fragments in two steps has been reported [35]. The ability to assemble multiple regulatory elements and markers using this technology allows the creation of complex expression constructs that would otherwise be next to impossible to generate.
Although site-specific integration using phage integrases ensures stable and controllable levels of gene expression, a main challenge with using plasmid DNA lies in gaining efficient transfection of hESC in culture. In this study, we have obtained satisfactory results using two nonviral gene delivery methods. First, a lipid-based transfection reagent, Lipofectamine 2000, was used to transiently transfect BG01v cells, with efficiencies averaging 10%–20%. A second variant line, SA002, was refractory to transfection by Lipofectamine 2000. For this line, we used electroporation (BTX ECM630) and obtained reasonable transfection levels, typically between 20% and 30%. Electroporation was also effective in BG01v cells, indicating that this method could potentially be adapted for efficient transfection of other human ESC lines. These methods do not seem to affect the growth characteristics of these cells, as we have been able to keep them in culture for many passages without differentiation. The frequency of stable transfection was independent of the method of transfection used.
One of the attributes of the phiC31 family of integrases is that they tend to target a relatively small number of loci (pseudo-attP sites) in the human genome. Of particular benefit to cell engineering is that these loci tend to exist in transcriptionally active regions. For ESC engineering, it is important to categorize these loci and determine which, if any, remain active upon differentiation. Once a database of active and nonsilenced pseudo-attP sites has been developed, it could be used to help streamline the selection process for subsequent engineered cell lines. Using BG01v and SA002 as relatively easily managed models for hESC lines [26, 27], we sought to map and categorize the loci that we targeted to determine whether pseudosites used in hESC are the same as those previously identified in terminally differentiated cells. We used a plasmid rescue-based strategy that has been used successfully in previous studies to determine the site of integration of the plasmids [10, 11]. We hypothesized that since significant chromatin remodeling occurs from the time cells differentiate past the ESC stage to a terminal adult tissue, we would identify at least some previously unmapped pseudosites in hESC. We were able to obtain integration site data for only 60% of the clones analyzed. It is possible that the clones for which plasmid rescue data could not be obtained are a result of random integration or that the integration site was not amenable to plasmid rescue. At this stage, it is not possible to distinguish between these two events.
Interestingly, our data indicate that the pseudo-attP site profile in human ESC is clearly distinct from that seen in differentiated cells. Of the 23 pseudosites that we identified in this study, only five have been reported previously. All the others, including the two most prominent hot spots for recombination in hESC (6p11.2 and 13q32.3), have not been identified previously. We are currently determining expression levels at these pseudosites in human ESC and the effect of differentiation on expression level. Only 2 of the 23 pseudo-attP sites were present in exons, and of these, only the site on chromosome 1 is a possible hot spot. The gene disrupted by integration into the chromosome 1 pseudosite, CDCP2 (GenBank NM_201546), has not been the subject of extensive study and is described as being weakly similar to Procollagen C-Proteinase enhancer protein. All the other hot spots for integration are either in intergenic regions or in introns of genes, suggesting that phiC31-mediated integration into stem cell lines will not result in deleterious consequences. Previous reports have suggested that the number of phiC31 pseudo-attP sites could be as high as 1,000. This suggests that the system may not be as site-specific as one would prefer. Our data, however, suggest that a large fraction of the integration events (30%–60%) occur in fewer than 10 hot spots in hESC. These data are very similar to those obtained in differentiated and transformed cells [11, 15]. Given the observed frequency of targeting at hot spots of recombination, one could screen approximately 20 colonies and have a high probability of obtaining one that is in a desirable locus. This offers a significant advantage over established methods of generating transgenic hESC lines.
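The screening estimate can be made concrete with a simple probability argument: if each colony independently lands in a desirable locus with probability p, the chance that at least one of n colonies does so is 1 - (1 - p)^n. The Python sketch below applies this to the 30%–60% hot spot fraction quoted above. Independence between colonies and a single fixed p are simplifying assumptions, and the function names are illustrative.

```python
import math


def prob_at_least_one(p_hit: float, n_colonies: int) -> float:
    """Probability that at least one of n independent colonies is a hit."""
    if not 0.0 <= p_hit <= 1.0:
        raise ValueError("p_hit must be between 0 and 1")
    if n_colonies < 0:
        raise ValueError("n_colonies must be non-negative")
    return 1.0 - (1.0 - p_hit) ** n_colonies


def colonies_needed(p_hit: float, target: float) -> int:
    """Smallest number of colonies giving at least `target` chance of a hit."""
    if not 0.0 < p_hit <= 1.0:
        raise ValueError("p_hit must be in (0, 1]")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be in (0, 1)")
    if p_hit == 1.0:
        return 1
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_hit))


# Hot spot fractions quoted in the text (30% and 60% of integration events).
# ASSUMPTION: colonies are independent and each has the same hit probability.
for p in (0.30, 0.60):
    chance = prob_at_least_one(p, 20)
    needed = colonies_needed(p, 0.95)
    print(f"p = {p:.2f}: P(at least one hit in 20 colonies) = {chance:.4f}, "
          f"colonies for 95% confidence = {needed}")
```

Under these assumptions, screening 20 colonies gives a better than 99.9% chance of recovering at least one hot spot integrant even at the lower bound of 30%. The chance of hitting one particular hot spot would be lower, since p would then be that site's individual share of integration events.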
Analysis of the pseudo-attP sequences revealed the presence of a common sequence motif at the crossover site (Fig. 2). The consensus sequence derived for this motif showed the presence of inverted repeats around the core, similar to earlier reports [10]. The consensus sequence that we observed shows significant identity to the previously reported consensus [10]. Of the 26-base pair sequence, 17 bases are identical to the previously reported consensus at the first level, and 25 of the 26 bases are identical when the second and third level consensus sequences are included. These data strongly suggest that phiC31 integrase is targeting pseudo-attP sites with a very similar sequence pattern in hESC. The difference in observed pseudosites could be attributed to the differences in open regions of the chromosome between hESC and terminally differentiated cells.
As expected, GFP expression driven by the Oct4 promoter was not as strong as the expression driven by the EF1α promoter. The Oct4 promoter fragment that was used in this study has been shown to have the elements necessary for expression in ESC [28, 29]. Our studies confirmed that this is indeed the case, as can be seen in multiple lines (Fig. 3). GFP expression coincided with Oct4 expression and was not present in cells that did not express Oct4, indicating that the promoter fragment used was ESC-specific. This was further confirmed by differentiation of the Oct4-GFP cells into embryoid bodies. GFP expression driven by the Oct4 promoter was completely absent after differentiation, and EBs derived from these cells could not be distinguished from a control line. In contrast, despite a slight reduction in expression, we were still able to detect GFP expression driven by the EF1α promoter after differentiation. Integration into pseudosites mediated by phiC31 integrase did not affect the pluripotency of these cells. Upon differentiation, the cells stopped expressing GFP and began expressing markers of differentiation. As seen in Figure 4, these cells maintained the ability to differentiate into all three lineages, indicating that they had not lost their pluripotency. The EF1α-GFP clone was also able to differentiate into all three lineages (data not shown).
The ability to target well-expressed loci in unmodified cells allows for easy creation of transgenic ESC lines expressing markers of interest. Multisite Gateway technology facilitates the construction of plasmids with multiple expression constructs. Combining Multisite Gateway technology with site-specific integration provides a powerful tool to study stem cell biology and differentiation and an efficient solution to some of the factors impeding generation of stably transfected hESC clones. The use of phiC31 integrase allows for efficient generation of stable clones with sustained expression of the gene of interest. Therefore, this system satisfies several of the criteria that we set for the ideal cell engineering tool: reliable and stable insertion into the host genome, long-term expression of the introduced genes, and no detectable adverse effects. The method described here targets pseudo-attP sites, some of which are hot spots for integration. These sites can be mapped and characterized as to their position with respect to neighboring genes. As expected from previous studies, most pseudo-attP sites identified in this study appear in regions not known to code for any open reading frames. This does not rule out the possibility that insertion disrupts expression of noncoding RNA or some yet undefined regulatory region, however. The sites identified in this study are mostly distinct from pseudo-attP sites previously described in terminally differentiated and transformed cells. The ability to map the insertion sites and categorize them as hot spots allows them to be analyzed with respect to their transcriptional activity after differentiation to specific lineages. This offers the opportunity to create a database of hot spots and the embryonic lineages they may be active in. New cell clones could then be generated, mapped, and compared with the database for a rapid triage of those usable for a particular experiment.
| DISCLOSURE OF POTENTIAL CONFLICTS OF INTEREST |
|---|
| REFERENCES |
|---|