Molecular functions of DNA methylation
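DNA methylation, together with histone modifications and non-histone proteins, determines chromatin structure and accessibility. DNA methylation therefore helps to regulate gene expression, transposon silencing, chromosome interactions (Fig. 4) and trait inheritance (Box 1).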
Gene regulation
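Gene-associated DNA methylation in plants occurs in promoter regions or within the transcribed gene body. In some cases DNA methylation promotes gene expression, for example at the ROS1 gene in A. thaliana and at hundreds of genes that inhibit fruit ripening in tomato, but in most cases promoter DNA methylation represses transcription. Promoter DNA methylation represses gene expression directly, by blocking the binding of transcription activators or promoting the binding of transcription repressors, or indirectly, by promoting repressive histone modifications such as H3K9me2 and inhibiting permissive histone modifications such as histone acetylation (Fig. 4a). How promoter methylation activates transcription is not well understood. Presumably, DNA methylation enhances the binding of some transcription activators or inhibits the binding of some transcription repressors. Promoter DNA methylation usually results from the spreading of methylation from nearby transposons or other repeats. Transposons and repeats near genes are also targeted by active DNA demethylation, which protects the genes from transcriptional silencing. Where promoter DNA methylation activates gene expression, active demethylation instead leads to transcriptional silencing.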
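Fig. 4 | DNA methylation controls gene expression, transposon silencing and chromosome interactions. a | DNA methylation (m) in gene promoters usually represses transcription, but in some cases it can increase transcription [3,8,12,106,107]. b | DNA methylation in gene bodies is found predominantly in the CG context [12,13,120,121], and its function is unclear. c | DNA methylation in some heterochromatic introns attracts the ANTI-SILENCING 1 (ASI1)-ASI1-IMMUNOPRECIPITATED PROTEIN 1 (AIPP1)-ENHANCED DOWNY MILDEW 2 (EDM2) complex, which promotes the selection of alternative, distal polyadenylation sites (red asterisk) during mRNA processing [132-136]. ASI1 associates with chromatin and binds RNA; EDM2 recognizes dimethylated histone H3 lysine 9 (H3K9me2) in heterochromatin. d | DNA methylation is also important for silencing transposons and other DNA repeats, which are located primarily in pericentromeric heterochromatin [12,120]. e | DNA methylation is involved in chromosome interactions among pericentromeric regions and some interactive heterochromatic islands, which are repressive chromatin regions located on otherwise euchromatic chromosome arms and characterized by an abundance of transposons and small RNAs [145,146]. In all panels, transposons and other repeats are shown in yellow and genes in blue. Black and yellow chromosome regions represent centromeres and pericentromeric regions, respectively. POL II, RNA polymerase II.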
In A. thaliana, only approximately 5% of the genes are methylated in promoter regions. As a result, DNA methylation does not regulate the transcription of many genes, and most mutants with decreased or increased DNA methylation do not have severely impaired growth or development [11]. By contrast, crop plants with larger genomes can have a higher transposon content and more transposons that are close to genes; consequently, there are more genes with promoter methylation [3]. Therefore, DNA methylation has more important roles in gene regulation in several crop plants than in A. thaliana, and DNA methylation mutants in these crop plants are generally either lethal or have severe growth and developmental defects [3,116-119].
The gene bodies of over one-third of A. thaliana genes are methylated [12]. In contrast to transposons and repeats, which are usually heavily methylated in all three cytosine contexts, DNA methylation in gene bodies has very little non-CG methylation [12,13,120,121] (Fig. 4b). Gene body methylation (gbM) preferentially occurs at exons and is absent from the transcription start and stop sites [121]. As a conserved feature in most angiosperms, genes with gbM tend to be longer than unmethylated genes and are generally constitutively expressed [12,121,122]. In two angiosperms, Eutrema salsugineum and Conringia planisiliqua, genome-wide loss of gbM was attributed to the loss of CMT3 [123,124]. Levels of gbM decreased in A. thaliana with reduced histone H3.3 levels, and this correlated with increased density of the linker histone H1, suggesting that gbM is facilitated by histone H3.3, which inhibits histone H1-dependent chromatin folding and consequently increases chromatin accessibility to DNA methylases [125].
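The three cytosine contexts mentioned above are conventionally written CG, CHG and CHH, where H stands for A, C or T. As a minimal illustrative sketch (not taken from the review), the following Python snippet classifies each cytosine on the forward strand of a sequence by its context. The function names `cytosine_context` and `context_counts` are hypothetical, and the reverse strand is ignored for brevity.

```python
from collections import Counter

def cytosine_context(seq, i):
    """Return 'CG', 'CHG' or 'CHH' for the cytosine at index i, else None.

    Only the forward strand is inspected. None is returned when the base
    is not a cytosine or when too few unambiguous downstream bases exist.
    """
    seq = seq.upper()
    if seq[i] != "C":
        return None
    tail = seq[i + 1:i + 3]          # up to two downstream bases
    if tail[:1] == "G":
        return "CG"                  # symmetric CG context
    if len(tail) < 2 or any(b not in "ACGT" for b in tail):
        return None                  # sequence end or ambiguous base
    # The first downstream base is H (A, C or T) at this point.
    return "CHG" if tail[1] == "G" else "CHH"

def context_counts(seq):
    """Count forward-strand cytosines per context (hypothetical helper)."""
    counts = Counter()
    for i in range(len(seq)):
        ctx = cytosine_context(seq, i)
        if ctx is not None:
            counts[ctx] += 1
    return counts

if __name__ == "__main__":
    print(context_counts("CGCAGCTTC"))
```

In this scheme, gene body methylation as described above would show up almost exclusively at positions labelled CG, whereas methylated transposons and repeats would carry methylation at positions of all three labels.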
Gene body CG methylation is almost completely absent in the A. thaliana met1-3 mutant, in which steady-state mRNA levels of gbM genes do not appear to be globally increased relative to unmethylated genes [12]. Additionally, natural variation in gbM does not correlate with global gene expression levels in A. thaliana populations [126]. On the other hand, a comparison between the grass Brachypodium distachyon and rice (Oryza sativa Japonica group) showed that gbM is strongly conserved among orthologues of the two species and affects a biased subset of long, slowly evolving genes [121]. Thus, the biological importance of gbM in angiosperms seems to be species dependent. Considering that enrichment of the histone variant H2A.Z in gene bodies correlates with gene responsiveness to environmental and developmental stimuli and that the genomic distributions of H2A.Z and DNA methylation in A. thaliana are anticorrelated, gbM was proposed to reduce gene expression variability by excluding H2A.Z from nucleosomes [127]. In addition, gbM in plants may prevent aberrant transcription from internal cryptic promoters [128]. Indeed, in mouse cells, intragenic DNA methylation protects the gene body from spurious Pol II entry and cryptic transcription initiation [129]. It was also suggested that gbM increases pre-mRNA splicing efficiency in plants [127], which is consistent with the observation that a small portion of alternative exon-intron junctions are affected by the global loss of CG methylation in the O. sativa met1-2 mutant [130].
Some gene introns harbour transposons or other repeats, which are heavily methylated in all cytosine sequence contexts and regulate mRNA processing, for example, alternative polyadenylation. Loss of DNA methylation in a long interspersed nuclear element retrotransposon in the intron of the homeotic gene DEFICIENS causes alternative splicing and premature termination and consequently the generation of the unproductive mantled somaclonal variant of oil palm [131]. An intron of the A. thaliana INCREASE IN BONSAI METHYLATION 1 (IBM1; also known as JMJ25) gene, which encodes a histone H3K9 demethylase, contains a heterochromatic repeat element that is recognized by a newly discovered protein complex that promotes distal polyadenylation of IBM1 transcripts [132-136] (Fig. 4c). This protein complex consists of ANTI-SILENCING 1 (ASI1), ENHANCED DOWNY MILDEW 2 (EDM2) and ASI1-IMMUNOPRECIPITATED PROTEIN 1 (AIPP1). ASI1 is an RNA-binding protein that contains a BAH domain that may mediate its chromatin association with the heterochromatin region within the IBM1 intron [132,133]. EDM2 contains a composite plant homeodomain (PHD) that recognizes both the transcription-repressing H3K9me2 and transcription-activating H3K4me3 modifications, which together characterize introns that contain heterochromatin repeats [134]. AIPP1 interacts with both ASI1 and EDM2, thereby promoting the formation of the complex, which also promotes distal polyadenylation of many other genes that similarly harbour intronic heterochromatin [135], although the mechanism by which the complex promotes alternative polyadenylation is unknown. Mutation of ASI1, EDM2 or AIPP1 indirectly causes gene silencing owing to the loss of full-length, functional transcripts of IBM1. ASI1 also associates with AIPP2, which has a PHD domain, AIPP3, which has a BAH domain, and the POL II carboxy-terminal domain phosphatase CARBOXY-TERMINAL PHOSPHATASE-LIKE 2 [135]. Intriguingly, mutations in these three proteins had opposing effects on gene regulation compared with mutations in the ASI1-AIPP1-EDM2 complex [135].