Mono-allelic retrotransposon insertion addresses epigenetic transcriptional repression in human genome

Background: Retrotransposons have been extensively studied in plants and animals and have been shown to have an impact on human genome dynamics and evolution. Their ability to move within genomes gives retrotransposons the capacity to cause genome instability. Methods: We examined the polymorphically inserted AluYa5, an evolutionarily young Alu, in the progesterone receptor gene to determine the effects of Alu insertion on its molecular environment. We used mono-allelic inserted cell lines, which carry both Alu-present and Alu-absent alleles. To determine epigenetic changes and gene expression, we performed restriction enzyme digestion, Pyrosequencing, and chromatin immunoprecipitation. Results: We observed that the polymorphic insertion of an evolutionarily young Alu increases the level of DNA methylation in the surrounding genomic area and generates inactive histone tail modifications. Consequently, the Alu insertion deleteriously inactivates expression of the neighboring gene. Conclusion: The mono-allelic Alu insertion cell line clearly showed that polymorphically inserted repetitive elements inactivate neighboring gene expression by bringing about aberrant epigenetic changes.


Background
Retrotransposons have been extensively studied in plants and animals and have been shown to have an impact on human genome dynamics and evolution. Retrotransposons make up about 42% of the human genome, while DNA transposons account for around 2-3% [1][2][3]. According to the 2001 analysis, which has been confirmed overall by the 2004 update (International Human Genome Sequencing Consortium 2004), short interspersed elements (SINEs), such as Alu or SINE-R/VNTR/Alu (SVA), account for 13% of the sequenced human genome, long interspersed elements [LINE-1 (L1)] for 20%, and long-terminal repeat (LTR) retrotransposons, such as endogenous retrovirus (ERV), for 8%. Retrotransposons increase their copy number by retrotransposition via an RNA intermediate.
Attempted or successful retrotranspositions carry a high risk of eliciting chromosome breaks, deletions, translocations, and recombinations [4]. It is estimated that one new Alu insertion arises during gametogenesis in every 21 births [5], transferring the retrotransposon's genetic information to the next generation [6,7]. These retrotransposition events are likely to change the activity of genes at the insertion site, either increasing or decreasing transcriptional activity. In some cases, this alteration of gene expression leads to the development of disease, including cancer [8]. DNA methylation of the retrotransposon is thought to be the mechanism that controls the retrotransposition rate. A large number of recent publications report that complex disease, cancer, aging, and environmental challenges are associated with aberrant retrotransposon DNA methylation.
In fact, not all retrotransposons are able to retrotranspose to other genomic locations. Currently, most L1s are inactive and cannot retrotranspose to new genomic locations [9], while a small number of human-specific L1 (L1HS) elements remain retrotransposition competent. Retrotransposons were seeded in the human genome several million years ago and have many subfamilies defined by distinct patterns of diagnostic base substitutions. Subfamilies may be classified as young, intermediate, or old, reflecting the time since their members began to retrotranspose. The expansion of Alu subfamilies (Yc1, Ya5, Ya2, Yb9, Yb8, Y, Sg1, Sx, and J; young to old, respectively) is superimposed on primate evolution. The evolutionarily young L1, Alu, and SVA elements are currently able to transpose in the human genome; hence the ongoing insertions of the youngest subfamilies are not yet fixed in the human genome and represent polymorphic loci [10]. Some polymorphic insertions are known to be responsible for more than 30 human genetic diseases [11][12][13]. A genetic polymorphism named PROGINS has been identified in the progesterone receptor (PGR) gene, involving the insertion of an Alu subfamily element [14]. This Alu insertion polymorphism in the PGR gene is associated with endometriosis [15,16], ovarian cancer with diethylstilbestrol exposure [17], breast cancer [18], and obesity [19]. Insertional polymorphic retrotransposons are often observed in a mono-allelic fashion, meaning that the retrotransposon is inserted into only one of an individual's two alleles. For instance, the PGR gene on chromosome 11 carries a newly inserted AluYa5 element between exons 5 and 6. In this study, we examine DNA methylation and histone modification at a locus carrying a mono-allelic insertion of a young Alu, AluYa5, and address the direct effect of the retrotransposon on gene expression.

Nucleic acid isolation and bisulfite treatment
Genomic DNA was isolated by standard proteinase K digestion and phenol-chloroform extraction [20]. Total RNA was collected and extracted from cultured cells with the RNeasy Protect minikit (QIAGEN Inc., Valencia, CA) according to the manufacturer's recommended protocol. Reverse transcription was performed by using the first strand cDNA synthesis kit (NEB, Beverly, MA, USA). Bisulfite modification of genomic DNA has been described previously [21]. PCR primer sequences for Alu polymorphism with genomic DNA were forward: TTGAGTAAAGCCTCTAAAAT and reverse: TTCTTGCTAAATGTCTGTT, and with bisulfite DNA were forward: GAAATTTGAAGGAAATAAATATTAGTGT and reverse: CATTTAATTATCCAAAAATATTTTCTTACTAA.
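As a quick sanity check on the bisulfite primers, the sketch below tests whether each primer avoids the base affected by bisulfite conversion. It assumes the usual design convention (not stated in the text) that a forward primer for converted DNA contains no C and the corresponding reverse primer contains no G. The function name is hypothetical, and the sequences are taken from the text with typesetting spaces and hyphens removed.

```python
# Bisulfite PCR primers as given in the text (typesetting breaks removed).
BISULFITE_FORWARD = "GAAATTTGAAGGAAATAAATATTAGTGT"
BISULFITE_REVERSE = "CATTTAATTATCCAAAAATATTTTCTTACTAA"
# Genomic (unconverted DNA) forward primer, used here only for contrast.
GENOMIC_FORWARD = "TTGAGTAAAGCCTCTAAAAT"


def is_conversion_independent(primer: str, strand: str) -> bool:
    """Return True if the primer lacks the base altered by bisulfite conversion.

    Assumed convention: a forward primer on converted DNA carries no C,
    and a reverse primer (complementary to the converted strand) carries no G.
    """
    forbidden = {"forward": "C", "reverse": "G"}[strand]
    return forbidden not in primer.upper()


forward_ok = is_conversion_independent(BISULFITE_FORWARD, "forward")
reverse_ok = is_conversion_independent(BISULFITE_REVERSE, "reverse")
genomic_ok = is_conversion_independent(GENOMIC_FORWARD, "forward")
```

Under this convention both bisulfite primers pass, whereas the genomic forward primer, which contains cytosines, does not.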

Quantitation of allele-specific gene expression by Pyrosequencing
PCR products from genomic DNA or cDNA were used for Pyrosequencing analysis as previously described [21].
Briefly, the PCR product of each gene was used for individual sequencing reactions. Streptavidin-Sepharose beads (Amersham Biosciences) and a Vacuum Prep Tool (Biotage AB) were used to purify the single-stranded biotinylated PCR products according to the manufacturer's recommendation. The appropriate sequencing primer was annealed to the purified PCR product and used for a Pyrosequencing reaction on the PSQ 96HS system (Biotage). Raw data were analyzed with the allele quantitation algorithm of the PSQ 96HS software. PCR primer sequences for Alu polymorphism by Pyrosequencing were forward: TTTTCGAAACTTACATATTGA, reverse biotin labeled: TTTAGTATTAGATCAGGTGC, and sequencing primer: GATCCTACAAACA. For allele-specific expression, the primers were forward: TAGTCAAGTGGTCTAAATCATTGC, reverse biotin labeled: TTTAGTATTAGATCAGGTGC, and sequencing primer: GATCCTACAAACA. To validate DNA methylation detection by Pyrosequencing, we designed control oligos for 100% DNA methylation (PSQ-C oligo: 5'-TATTAGATCGACGGGAACAAACGTTGAATTC-3') and 0% DNA methylation (PSQ-T oligo: 5'-TATTAGATCAACGGGAACAAACGTTGAATTC-3'). The sequencing primer for the control oligos is 5'-CAACGTTTGTTCCCGT-3'. We mixed the PSQ-C oligo (or PSQ-T oligo) with the sequencing primer in PyroMark Annealing Buffer (QIAGEN Inc., Valencia, CA) and performed Pyrosequencing with the sequencing entry C/TGATC.
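The relationship between the control oligos, the sequencing primer, and the sequencing entry C/TGATC can be illustrated with a short sketch. It assumes that the sequencing primer anneals to the control oligo by reverse complementarity and is extended toward the 5' end of the oligo. The helper names are hypothetical, and the sequences are taken from the text with typesetting breaks removed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

PSQ_C = "TATTAGATCGACGGGAACAAACGTTGAATTC"  # 100% methylation control
PSQ_T = "TATTAGATCAACGGGAACAAACGTTGAATTC"  # 0% methylation control
SEQ_PRIMER = "CAACGTTTGTTCCCGT"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def read_after_primer(template: str, primer: str, n_bases: int) -> str:
    """Return the first n_bases synthesized when the primer is extended."""
    binding_site = reverse_complement(primer)
    start = template.find(binding_site)
    if start < 0:
        raise ValueError("primer does not anneal to this template")
    # Extension copies the template region located 5' of the binding site.
    upstream = template[:start]
    return reverse_complement(upstream)[:n_bases]


read_c = read_after_primer(PSQ_C, SEQ_PRIMER, 5)
read_t = read_after_primer(PSQ_T, SEQ_PRIMER, 5)
```

With these inputs the first base read differs between the two controls (C for PSQ-C, T for PSQ-T) and the following four bases are GATC in both, which matches the sequencing entry C/TGATC.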

Screening of AluYa5 insertional polymorphisms in cell lines
To find insertional polymorphic retrotransposons, we screened Raji, Jurkat, HT15, H1299, MCF, and K562 cell lines using the primer sets listed in the Methods section. The primers flanked the newly inserted retrotransposon AluYa5 at the chr11:100,911,358-100,912,065 locus (Assembly: hg19), so the presence of the Alu insertion could be determined from the length of the PCR amplicon. An allele with the full Alu insertion generates a 476 bp product, while an allele without the insertion produces a 150 bp product. Among the cell lines we tested, HT15 and H1299 showed bands of two different sizes after PCR amplification, indicating that Alu has inserted in only one allele of the locus (Figure 1). MCF and K562 showed insertion of Alu into both alleles (476 bp products). Raji and Jurkat cell lines, however, did not carry an Alu insertion in either allele (150 bp products).
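The band-size logic used for this screen can be sketched as a small genotype caller. The 476 bp and 150 bp product sizes come from the text; the function name, the genotype labels, and the 20 bp tolerance for reading band sizes off a gel are assumptions made for illustration.

```python
ALU_PRESENT_BP = 476  # amplicon size with the Alu insertion (from the text)
ALU_ABSENT_BP = 150   # amplicon size without the Alu insertion (from the text)


def call_alu_genotype(band_sizes_bp, tolerance_bp=20):
    """Classify a cell line from the PCR band sizes observed on a gel.

    tolerance_bp is an assumed allowance for imprecise band sizing.
    """
    def has_band(expected_bp):
        return any(abs(size - expected_bp) <= tolerance_bp
                   for size in band_sizes_bp)

    present = has_band(ALU_PRESENT_BP)
    absent = has_band(ALU_ABSENT_BP)
    if present and absent:
        return "mono-allelic insertion"
    if present:
        return "insertion in both alleles"
    if absent:
        return "no insertion"
    return "undetermined"


# Hypothetical band readings for three kinds of cell line.
two_bands = call_alu_genotype([476, 150])
upper_band_only = call_alu_genotype([476])
lower_band_only = call_alu_genotype([150])
```

A cell line showing both bands, as reported for HT15 and H1299, is classified as carrying a mono-allelic insertion.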

Alu insertion dependent DNA cytosine methylation
To examine the retrotransposon-derived DNA methylation spreading theory [23], we determined the DNA methylation status of the Alu-present and Alu-absent alleles, using the mono-allelic inserted cell lines HT15 and H1299. The PCR amplicon from bisulfite-treated DNA was digested with the restriction enzyme HpyCH4III, which cuts the 5'..ACNGT..3' site located in the amplicon sequence only when the allele is methylated (Figure 2A). Both mono-allelic Alu-inserted cell lines, HT15 and H1299, showed partial digestion of only the Alu-present allele, indicating that DNA methylation exists only in the Alu-inserted allele (Figure 2B). The Alu-inserted allele in the H1299 cell line showed slightly more methylation than the Alu-inserted allele in the HT15 cell line (Figure 2C). We did not observe digestion of the Alu-absent allele.
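The principle behind this digestion assay can be sketched as follows: bisulfite treatment converts unmethylated cytosine so that it is read as T after PCR, while methylated cytosine remains C, so an ACNGT site survives only when its cytosine is methylated. The fragment below is a made-up sequence used purely for illustration, not the actual amplicon, and the function names are hypothetical.

```python
import re

# HpyCH4III recognition site as given in the text: 5'..ACNGT..3'
HPYCH4III_SITE = re.compile(r"AC[ACGT]GT")


def bisulfite_convert(seq: str, methylated_positions: set) -> str:
    """Simulate the top strand after bisulfite treatment and PCR.

    Unmethylated C is read as T; C at a methylated position stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def is_cut(seq: str) -> bool:
    """Return True if the sequence still contains an ACNGT site."""
    return HPYCH4III_SITE.search(seq) is not None


# Hypothetical fragment with one ACGGT site; the site's C is at index 4.
fragment = "TCGACGGTAAT"
methylated = bisulfite_convert(fragment, methylated_positions={4})
unmethylated = bisulfite_convert(fragment, methylated_positions=set())
```

In this toy case only the methylated version keeps the recognition site and would be digested, which is the behavior the assay relies on to distinguish the two alleles.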

Alu insertion derived inactive histone modification
To determine whether Alu insertion causes histone tail modifications, we performed ChIP-PCR with two histone modification antibodies, against H3K9ac or H3K9me3. Acetylation at Lys-9 on histone H3 (H3K9ac) is an active chromatin marker and is often associated with positive gene expression; conversely, methylation at Lys-9 on histone H3 (H3K9me3) is an inactive chromatin marker and is correlated with repressed gene expression [24]. After chromatin immunoprecipitation with the two antibodies for active or repressive histone markers, followed by PCR amplification, we observed differential histone modification between Alu-present and Alu-absent alleles. The active marker H3K9ac is present only in the Alu-absent allele; however, the inactive histone marker H3K9me3 exists in both alleles of the locus (Figure 3). This difference in histone modification occurs only in young Alu subfamilies, not in all Alu subfamilies. ChIP coupled with PCR amplification of AluJ, AluYb8, and L1HS showed different distributions of histone modifications. AluJ, the oldest Alu subfamily, co-located with both the active marker H3K9ac and the inactive marker H3K9me3. However, the young Alu subfamily AluYb8 had at least eight times more of the inactive histone marker H3K9me3. In addition, human-specific L1HS did not show a different distribution of active or inactive histone markers (Figure 4).

Gene expression repressed by Alu insertion in the genome
To examine differential gene expression between the Alu-present and Alu-absent alleles, we developed an allele-specific gene expression detection method using Pyrosequencing. To distinguish between the two alleles, we genotyped the single nucleotide polymorphism (SNP) rs1042839, located at chr11:100921952-100922452 (GRCh37/hg19 assembly, 2009), since this SNP is correlated with the occurrence of the Alu insertion [25,26]. To confirm this co-existence, we genotyped this SNP in the six cell lines we worked with and compared the genotypes with their Alu insertion statuses (Table 1). The heterozygous Alu-inserted cell lines HT15 and H1299 showed a heterozygous C/T genotype, the Alu-absent cell lines Hep3B2 and HL-60 had a C/C genotype, and the Alu-present cell lines MCF and K562 had a T/T genotype. We confirmed that the T allele co-exists with the Alu insertion, while the C allele co-exists with the absence of the Alu insertion in heterozygous Alu-inserted cell lines. Next, we used this SNP to identify Alu-inserted alleles for allele-specific gene expression detection in a Pyrosequencing reaction. After reverse transcription-PCR with mRNA from the heterozygous Alu-inserted cell line H1299, we amplified the locus flanking the SNP to detect the expression level of each allele (Figure 5). Surprisingly, we observed unequal gene expression levels between the Alu-present and Alu-absent alleles, 10.5% and 89.5% respectively, despite a nearly equal distribution of both alleles in the genome (46.7% Alu-present allele vs 53.3% Alu-absent allele in the genotyping data). Thus the presence of Alu in the gene body repressed gene expression from the allele containing the Alu element.
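One way to read these percentages together is to normalize the allelic expression ratio by the allelic ratio measured in genomic DNA, which corrects for any assay bias between the two alleles. This normalization is a common convention and is shown here as an assumption; the text does not state that it was applied. The function name is hypothetical, and the four input values are those reported above.

```python
def normalized_allelic_ratio(expr_a, expr_b, genomic_a, genomic_b):
    """Expression of allele A relative to allele B, corrected for the
    allele proportions measured in genomic DNA."""
    return (expr_a / genomic_a) / (expr_b / genomic_b)


# Values reported in the text (percentages from Pyrosequencing).
ratio = normalized_allelic_ratio(
    expr_a=10.5,     # Alu-present allele, cDNA
    expr_b=89.5,     # Alu-absent allele, cDNA
    genomic_a=46.7,  # Alu-present allele, genomic DNA
    genomic_b=53.3,  # Alu-absent allele, genomic DNA
)
```

With the reported values the normalized ratio is roughly 0.13, meaning that under this convention the Alu-present allele is expressed at well under a fifth of the level of the Alu-absent allele.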

Discussion
We examined the polymorphically inserted young Alu, AluYa5, in the PGR gene to determine the effects of Alu insertion on the environment of the nearby gene. We used mono-allelic inserted cell lines, which carry both Alu-present and Alu-absent alleles. We observed that the polymorphic insertion of an evolutionarily young Alu increases the level of DNA methylation in the surrounding genomic area and generates inactive histone tail modifications. Consequently, the Alu insertion deleteriously inactivates expression of the neighboring gene (Figure 6).
Using a mono-allelically inserted young Alu subfamily is a novel approach to addressing the cis-effects of retrotransposons or retrotransposition on neighboring genomic structures. These effects were observed in a single cell line system, in which virtually all conditions at the particular locus are the same; the only difference is the presence or absence of a retrotransposon insertion. Thus this system bypasses many concerns that experimental artifacts could confound conclusions about the function of retrotransposons in the genome.
Generally, our results agree with previous reports that retrotransposons may repress gene expression through an epigenetic mechanism. Our study strongly supports the observations that young active retrotransposons insert in areas that lack cytosine methylation. Retrotransposons spread DNA methylation into neighboring regions, generating repressive histone modifications. This causes a significant inactivation of gene expression. Hollister et al. reported a correlation between transposable elements and gene silencing; however, the caveat was that the data do not show whether repetitive elements tend to insert preferentially near lowly expressed genes or whether the insertion of repetitive elements causes the low gene expression [27]. In contrast, our mono-allelic inserted cell line system clearly showed that repetitive elements cause the inactivation of neighboring gene expression.
It has been estimated that approximately one out of every 21 births, 212 births, and 916 births carries a new Alu, L1, and SVA insertion, respectively [10]. Thus there is a great deal of retrotransposition in the current human genome. It has been known that evolutionarily young repetitive elements have the capability to retrotranspose to other genomic locations. In our study, the

Conclusions
The mono-allelic Alu insertion cell line clearly showed that polymorphically inserted repetitive elements inactivate neighboring gene expression by bringing about aberrant epigenetic changes.