* This work was supported in part by National Institutes of Health Grant R01 GM079641 (to O.G.). O.G. is a co-founder of EpiCypher, Inc. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
1 The abbreviations used are: PTM, post-translational modification; PCNA, proliferating cell nuclear antigen; ORI, origins of replication; BAH, bromo-adjacent homology; ORC, origin of replication complex; MEF, mouse embryonic fibroblast; MGS, Meier-Gorlin syndrome.
2 O. Gozani, unpublished observations.
3 A. Kuo and O. Gozani, unpublished observations.
Covalent post-translational modifications (PTMs) of proteins can regulate the structural and functional state of a protein in the absence of primary changes in the underlying sequence. Common PTMs include phosphorylation, acetylation, and methylation. Histone proteins are critical regulators of the genome and are subject to a highly abundant and diverse array of PTMs. To highlight the functional complexity added to the proteome by lysine methylation signaling, here we will focus on lysine methylation of histone proteins, an important modification in the regulation of chromatin and epigenetic processes. We review the signaling pathways and functions associated with a single residue, H4K20, as a model chromatin and clinically important mark that regulates biological processes ranging from the DNA damage response and DNA replication to gene expression and silencing.
The human genome consists of greater than 20,000 protein-coding genes (
of proteins by chemical moieties such as phosphorylation, acetylation, and methylation. These covalent modifications occur largely on the side chains of distinct amino acid residues and regulate protein function by diverse mechanisms that together greatly expand the complexity of the proteome (
Histones are some of the most abundant proteins in eukaryotic cells. Two copies of histone H2A, H2B, H3, and H4 form an octameric structure that is wrapped by ∼147 bp of double-stranded DNA (dsDNA) to form the nucleosome, the core structural unit of chromatin and the first step in packaging of the genome (
). Here we will focus on lysine methylation of histone proteins, an important modification that was first identified on histones in the 1960s and is now appreciated to fundamentally regulate chromatin dynamics (
Proteins are reversibly methylated on the nitrogen side chain of lysine residues (Fig. 1). This reaction, although subtly changing the primary structure of the modified peptide, greatly increases the information encoded within the molecule. Lysine residues can accept up to three methyl groups, forming mono-, di-, and tri-methylated derivatives (referred to here as Kme1, Kme2, and Kme3, respectively; Fig. 1), with unique activities frequently being coupled to the specific extent of methylation on the lysine residue (
). Here, to highlight the functional complexity that can be added to the proteome via lysine methylation, we focus on the signaling pathways and functions associated with methylation of a single residue, H4K20, as a model chromatin and clinically important mark that regulates diverse biological processes ranging from the DNA damage response and DNA replication to gene expression and silencing. For more detailed and comprehensive reviews of H4K20 methylation biology, we refer the reader to two excellent reviews, Refs.
Lysine 20 is the major site of methylation on histone H4. Depending on the cell type, up to ∼80% of H4 molecules can be di-methylated, whereas H4K20me1 and H4K20me3 are generally less abundant, for example being present on 10 and 5% of nucleosomes in asynchronous HeLa cells, respectively, with similar ratios observed in mouse embryonic fibroblasts (MEFs) and other cell types (
). Three distinct SET domain-containing lysine (K) methyltransferase (KMT) enzymes, SETD8 (SET8, PR-SET7), SUV4-20H1, and SUV4-20H2, are responsible for the generation of the three different methyl states of H4K20 (
). SETD8 was identified as the first KMT for H4K20, and genetic ablation of Setd8 in flies completely abolished all three methylation states, initially suggesting that it might catalyze mono-, di-, and tri-methylation (
). However, biochemical analysis showed that SETD8 only catalyzed the addition of one methyl moiety, converting unmodified H4K20 to H4K20me1. Structural studies provided the molecular basis for this specificity, demonstrating that the active site of SETD8 is not able to accommodate a lysine carrying more than a single methyl group (
); these results suggested that the accumulation of H4K20me1 was a result of failure in the Suv4-20h1/Suv4-20h2 double knock-out cells to convert H4K20me1 to the higher methylated states. Similar results were observed upon depletion of the two homologous proteins, suv420h1/h2, in the zebrafish Danio rerio (
Thus, in higher eukaryotes, SETD8 appears to be the sole mono-methyltransferase for H4K20, and successive methylation of H4K20me1 by SUV4-20H1 and SUV4-20H2 generates the preponderance of global H4K20me2 and H4K20me3.
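The stepwise logic summarized above can be sketched as a small state model. This is only an illustration under simplifying assumptions: the names `H4K20`, `WRITERS`, and `highest_reachable_state` are hypothetical, SUV4-20H1 and SUV4-20H2 are grouped as one activity, and weak side activities of the enzymes are ignored.

```python
from enum import IntEnum


class H4K20(IntEnum):
    """Methylation states of histone H4 lysine 20."""
    ME0 = 0
    ME1 = 1
    ME2 = 2
    ME3 = 3


# Simplified substrate rules (an assumption for illustration):
# each writer adds one methyl group only to the listed substrate states.
WRITERS = {
    "SETD8": {H4K20.ME0},                    # me0 -> me1 only
    "SUV4-20H1/H2": {H4K20.ME1, H4K20.ME2},  # me1 -> me2 -> me3
}


def highest_reachable_state(active_enzymes, start=H4K20.ME0):
    """Return the highest methyl state reachable with the given writers."""
    state = start
    while state < H4K20.ME3:
        # Stop as soon as no active writer accepts the current state.
        if not any(state in WRITERS[e] for e in active_enzymes):
            break
        state = H4K20(state + 1)
    return state


if __name__ == "__main__":
    for enzymes in ({"SETD8", "SUV4-20H1/H2"}, {"SUV4-20H1/H2"}, {"SETD8"}):
        print(sorted(enzymes), "->", highest_reachable_state(enzymes).name)
```

In this toy model, removing SETD8 leaves H4K20 unmethylated, and removing the SUV4-20 activity stalls methylation at H4K20me1, which mirrors the qualitative outcome of the genetic experiments described above.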
SUV4-20H1 and SUV4-20H2 share a high degree of sequence similarity in their catalytic SET domain (
), in vitro methylation assays using recombinant SUV4-20H1 and SUV4-20H2 indicate that both proteins catalyze H4K20me2 with similar kinetics and that these enzymes are more active on H4K20me1 as substrate than on unmethylated H4K20 (
). Nonetheless, deletion of the individual enzymes in MEFs showed clear differences with respect to H4K20 methylation as Suv4-20h1 deletion resulted in an ∼60% decrease in H4K20me2 levels without affecting H4K20me3. In contrast, genetic ablation of Suv4-20h2 largely eradicated H4K20me3 but did not have an impact on global levels of H4K20me2 (
). It is worth noting that both Suv4-20h1 and Suv4-20h2 are capable of mono-methylation but with a 3-fold lower activity compared with their activity on mono-methylated peptides and a 250-fold lower activity relative to SETD8. The molecular basis for this specificity was revealed in structural studies on the SET domain of mouse SUV4-20H2, which showed that a single methyl group on the substrate lysine helps to lock the lysine in place in the active site (
). With respect to tri-methylation, a significant structural rearrangement of the active site has to take place in order to accommodate an H4K20me3 peptide, which is energetically unfavorable, potentially explaining why this activity is difficult to detect in vitro when utilizing the SET domains of SUV4-20H1 and SUV4-20H2 (
). The discrepancy between the in vitro and in vivo data argues that the catalytic specificity of SUV4-20H1 and SUV4-20H2 does not come solely from the SET domain and is also influenced by domains outside the SET domain and/or differential interacting proteins. Thus, SUV4-20H1 and SUV4-20H2 are both di-methyltransferases in vitro, but we have much to learn in order to understand how H4K20me3 is regulated by SUV4-20H1 and SUV4-20H2 in vivo.
In multicellular organisms, H4K20 methylation appears to be important for development and organismal viability. Specifically, knock-out Setd8 mouse embryos die before they reach the 8-cell stage, and flies lacking Setd8 have developmental defects and only make it into the late larval stage (
). However, it is worth noting that SETD8 has non-histone substrates that may contribute to the knock-out phenotype (see below). Indeed, flies expressing H4 that contains a K20A substitution display significant developmental delay (24–48 h) and have significant death at the larval stage (46%), but overall they have a less severe phenotype than the Setd8 knock-out flies, in which 100% death at the larval stage is observed (
). This striking difference may in part be attributed to the more restricted expression pattern of SUV4-20H2 in the mouse embryo relative to SUV4-20H1, which is ubiquitously expressed throughout the embryo (
). Taken together, the data in higher eukaryotes support the notion that establishing a proper H4K20me pattern is important for development, yet the non-essential role for SUV4-20H2 in this process requires additional studies as does separating the function of the individual KMTs versus the specific methylation event on H4.
In mice, the deletion of Setd8 gives a much more severe phenotype as compared with the double deletion of Suv4-20h1 and Suv4-20h2 (
). This suggests that loss of all methylation states on H4K20 is more detrimental than the loss of only H4K20me2 and H4K20me3. At the same time, SETD8 has also been shown to methylate non-histone substrates such as the tumor suppressor p53 (
). To date, a non-histone target for SUV4-20H1 and SUV4-20H2 has yet to be identified. SETD8 mono-methylates p53 on lysine 382 to suppress p53-mediated transcriptional activation of highly responsive target genes (
). SETD8 methylation of NUMB also impacts p53 functions, suggesting SETD8 integrates multiple pathways to regulate the key tumor suppressor p53. Together, these studies indicate that SETD8 deletion phenotypes are likely due to a combination of the absence of H4K20 methylation and the loss of SETD8 functions in non-histone methyl-lysine signaling pathways.
Lysine methylation, although a relatively stable modification, is dynamic and is reversed by lysine demethylase enzymes (
). Overexpression of PHF8 in U2OS cells reduced H4K20me1 and depletion of PHF8 led to an increase in H4K20me1 signal at the G2/M- and G1-phase of the cell cycle. Why PHF8 is active on dimethyl-lysines on H3K9 and H3K27 but not H4K20 is not understood. Regardless, these studies suggest that H4K20me1 levels can be dynamically regulated by demethylation, and it may only be a matter of time until demethylases of H4K20me2/3 are discovered.
It is becoming increasingly clear that cross-talk between histone modifications plays an important role in chromatin signaling. For instance, double knock-out of the H3K9 methyltransferases Suv39h1 and Suv39h2 in MEFs depletes the H3K9me3 signal and also abolishes H4K20me3 signals at pericentric heterochromatin without affecting H4K20me1/2 (
). However, H3K9me3 signals persist at DAPI-dense foci in cells in the absence of SUV4-20H1 and SUV4-20H2. Interestingly, the C-terminal domain of SUV4-20H2 can bind to isoforms of HP1, key components of heterochromatin, which through their chromodomains are direct binders of H3K9me3 (
). This suggests a model in which H3K9me3 recruitment of HP1 brings in SUV4-20H2 to establish an H4K20me3 landscape at heterochromatin. The molecular function of H4K20me (and what binds to this mark) in heterochromatin formation is a question that should be further elucidated over the next several years. A different variant of cross-talk exists between methylation at H4K20 and acetylation at H4K16. These marks are mutually restrictive on fly chromosomes, and the H4K16ac-containing peptide serves as a less efficient substrate for methylation by SETD8 than unmodified peptide (
). One intriguing speculation is that each mark modulates the ability of reader domains for the other mark. For example, might acetylation at H4K16 dynamically regulate the binding of a reader of H4K20me? To date, such a case has not been identified, but certainly cross-talk between PTMs present on histones and chromatin-regulatory factors would provide for a complex new layer in the regulation of the genome.
H4K20 Methylation and the Cell Cycle
H4K20 methylation levels, for example H4K20me1, are quite dynamic across the cell cycle (
). In late G1, the levels of H4K20me1 begin to decline, and H4K20me1 levels are lowest during S-phase. After the completion of DNA replication, H4K20me1 levels start to increase and peak during mitosis (see Fig. 2) (
). These observations suggested that the changes in H4K20me1 are due to parallel cell cycle changes in SETD8 protein levels. In this regard, SETD8 was found to be degraded in late G1-phase by the proteasome through CRL4Cdt2 ubiquitin-mediated destruction (
). The second PCNA-interacting protein domain of SETD8 mediates the interaction with PCNA in cells and is required for degradation of SETD8. Taken together, SETD8 and its product H4K20me1 are dynamically regulated throughout the cell cycle, and the drop in H4K20me1 levels in S-phase is likely a direct consequence of the active degradation of SETD8. Other possible mechanisms by which H4K20me1 levels might be regulated include cell cycle-regulated demethylation by PHF8 (
) or other histone lysine demethylase enzymes (KDMs) and by conversion of H4K20me1 to H4K20me2; in this regard, we have observed cell cycle-regulated changes in H4K20me2 levels, with the peak in H4K20me2 occurring in late G1/early S-phase, the same time that H4K20me1 levels decrease,
H4K20 methylation, in diverse organisms and in an evolutionarily conserved manner, is implicated in the DNA damage response. In the fission yeast Schizosaccharomyces pombe, strains carrying a deletion of the H4K20 KMT set9 (and therefore lacking H4K20 methylation) are hypersensitive to DNA-damaging agents (
). A similar phenotype was observed in strains expressing an H4K20R mutant allele, indicating a direct role for set9-mediated H4K20 methylation in the DNA damage response (rather than a different set9 target, for example). In mammalian cells, depletion of SETD8 results in the formation of γH2AX foci (a hallmark of DNA damage) and increased sensitivity to a variety of genotoxic stresses (
). In addition, Suv4-20h1/Suv4-20h2 double knock-out MEFs are sensitive to DNA-damaging agents and display an increased number of chromosomal breaks relative to control MEFs in response to DNA damage (
). Overall, the data argue that H4K20 methylation plays an important and evolutionarily conserved role in the DNA damage response.
The different methylation states of H4K20me can be bound by specific reader domains, which can distinguish between the site and the degree of methylation (for an overview of several reported H4K20-binding proteins, see Table I). Reader domains typically contain a hydrophobic pocket made up of two to four aromatic residues that interact with the methylated lysine as well as make additional contacts with the sequence surrounding the methylated residue (
). 53BP1 and its yeast homolog Crb2 are key proteins in DNA damage-signaling pathways (for example, 53BP1 associates with sites of DNA damage at an early stage in the repair pathway) that contain a conserved tandem tudor domain (
). This domain on both proteins binds to synthetic histone H4 peptides that are mono- and di-methylated at Lys-20 with a slight preference for di-methyl versus mono-methyl peptides. Consistent with these observations, the reduction of SETD8 and the expression of a catalytically inactive mutant enzyme both result in reduced formation of 53BP1 foci at DNA damage sites (
). Thus, although the specific contribution of H4K20me1 versus H4K20me2 in stabilizing 53BP1 at DNA damage foci remains to be fully elucidated, it is clear that both modification states, via 53BP1, play an important role in the cellular response to DNA damage. In fission yeast, Crb2 localization to DNA damage sites is greatly reduced in the Δset9 strain that lacks H4K20 methylation (
), lending further support for an evolutionarily conserved mechanism linking H4K20me and the DNA damage response through a direct interaction between tandem tudor domain-containing proteins and H4K20me1/2.
The faithful and precise duplication of the genome in S-phase is essential for cell division and organismal development. The replication machinery fires only once per cell cycle from distinct genomic sites termed origins of replication (ORI). During the G1 phase of the cell cycle, a pre-replication complex, which consists of the origin of replication complex (ORC), the minichromosome maintenance complex (MCM), and two additional factors named Cdc6 and Cdt1, forms at ORIs throughout the genome (
). In yeast, ORIs can be identified due to a defined DNA sequence element. However, in higher eukaryotes a similar element or other type of DNA sequence motif has yet to be identified, and rather it is proposed that chromatin modifications might aid in specifying metazoan ORIs, potentially via recruitment of components of the replication machinery (
H4K20me plays an important role in cell cycle progression. For instance, double knock-out Suv4-20h1/2−/− MEFs proliferate more slowly than wild-type MEFs, which is likely due to a block in the G1- to S-phase transition and delayed S-phase entry (
). Cells depleted of SETD8 also show reduced proliferation rates and accumulate in S-phase, and the cells that do make it into the G2-phase fail to go through mitosis, consistent with them carrying a defect that arose from an aberrant S-phase (
). Cells in which induced depletion of SETD8 occurs in S-phase proceed normally through mitosis. However, these cells are delayed during the subsequent S-phase, suggesting a role for SETD8 in-between DNA replication cycles that is important for S-phase progression (
). These cells show an S-phase block as well as a G2/M arrest, arguing that the timing of methylation is important for proper progression through S-phase. Interestingly, the phenotype resulting from SETD8 stabilization appears to be dependent on SUV4-20H1 and SUV4-20H2, as the expression of degradation-resistant SETD8 in Suv4-20h1/Suv4-20h2 double knock-out MEFs reduces the number of cells arrested in G2/M, indicating that the higher methylation states of H4K20 also play a key role in the cell cycle (
). Together, these observations illustrate that correct deposition of H4K20me on the genome is important for faithful progression through S-phase.
The genome needs to be replicated once and only once per cell cycle, and defects in replication can lead to loss of key genomic information or re-replication, which can cause genomic instability. Normally, once an origin fires, mechanisms are in place to ensure that a second firing event, which would lead to re-replication, does not occur (
). Moreover, preventing SETD8 degradation in S-phase results in the accumulation of SETD8 at foci that co-localize with BrdU, PCNA, and DNA-polymerase ε, markers of active replicating forks, further supporting a direct role for SETD8 at origins (
). In this context, reporter assays show that targeting SETD8 (and subsequently SUV4-20H1 and SUV4-20H2) to an origin can result in local H4K20me and recruitment of ORC and minichromosome maintenance complexes (
), suggesting that H4K20me2 and H4K20me3 may be involved in stabilizing the ORC complex at ORIs via direct binding to the key ORC component proteins ORC1 and ORCA.
Mutations within several factors that comprise the pre-DNA replication machinery have been identified in individuals affected by Meier-Gorlin syndrome (MGS), an autosomal recessive primordial dwarfism disorder (
). Specifically, mutations were identified in genes coding for the pre-DNA replication machinery proteins ORC1, ORC4, ORC6, CDT1, and CDC6 in individuals that have features of MGS, suggesting an important role for DNA replication in the development of primordial dwarfism. The MGS-associated mis-sense mutations in the largest of the ORC proteins, ORC1, fall within the BAH (
). Two of these BAH-localized mutations decrease H4K20me2-binding affinity by about 4-fold, suggesting that disruption of the ORC1-H4K20me2 interaction plays an etiologic role in the pathogenesis of MGS (
). Indeed, morpholino-mediated knockdown of drOrc1 in zebrafish resulted in a dwarfism phenotype that was rescued by complementation with human wild-type ORC1 but not by ORC1 H4K20me2-binding mutants. Moreover, double knockdown of drSuv4-20h1 and drSuv4-20h2 also resulted in the fish displaying a dwarfism phenotype (
); thus, H4K20me2 binding by ORC1 and the presence of H4K20me2/3 are both required for normal growth in zebrafish, consistent with a role for the ORC1-H4K20me2 interaction in MGS. Further supporting a connection between ORC1 and H4K20 methylation in organismal growth is the observation that Suv4-20h1/2−/− mice were born small, and in flies the replacement of all H4 genes by H4K20A delayed development by 24–48 h (
). Together, these results argue that methylation of H4K20, likely me2, the most abundant histone mark, is an important modification that regulates development in higher eukaryotes via its role as a landing dock for ORC1 and potentially other reader domains.
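To give a rough sense of what the approximately 4-fold decrease in H4K20me2-binding affinity noted above could mean for occupancy, the following worked relation assumes simple 1:1 equilibrium binding. That binding model, the symbols, and the choice of ligand concentration are illustrative assumptions and are not taken from the studies discussed here.

```latex
% Fraction of reader bound (\theta) at free ligand concentration [L],
% assuming simple 1:1 equilibrium binding with dissociation constant K_d:
\theta_{\mathrm{WT}} = \frac{[L]}{[L] + K_d}
% A 4-fold decrease in affinity corresponds to K_d' = 4 K_d:
\theta_{\mathrm{mut}} = \frac{[L]}{[L] + 4K_d}
% Illustrative case [L] = K_d:
\theta_{\mathrm{WT}} = \tfrac{1}{2}, \qquad \theta_{\mathrm{mut}} = \tfrac{1}{5}
```

Under these assumptions, a 4-fold change in affinity reduces occupancy from 50% to 20% when the ligand concentration equals the wild-type dissociation constant. The effect shrinks as the ligand concentration rises well above the dissociation constant.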
One model integrating our knowledge for how H4K20 methylation may impact the regulation of DNA replication is shown in Fig. 3. By a mechanism yet to be elucidated, SETD8 is targeted to ORIs to place H4K20me1 during G2/M. Subsequent recruitment of SUV4-20H1/2 during the G1-phase leads to generation of H4K20me2/3 at origins. These two marks, H4K20me2 via the ORC1 BAH domain and H4K20me3 via ORCA, are involved in stabilizing the ORC complex at ORIs, or one or both could alternatively play a signaling function at origins, perhaps during the licensing phase. Biologically, abrogating these interactions may lead to inefficient or unstable pre-replication complex formation, which could in turn lead to delayed cell cycle progression and insufficient cellular proliferation during critical developmental phases that necessitate rapid cell division. Thus, mutations in these pathways could lead to organismal dwarfism of a proportional nature like what is observed with MGS. In addition to periods of development, rapid cell proliferation is also a feature of cancer cells, raising the possibility that rendering origin recognition and/or use inefficient via targeting of the ORC1-H4K20me2 interaction may have therapeutic value in the treatment of fast growing tumors. Testing these ideas as well as understanding the precise role of the ORC1-H4K20me2 interaction in regulating replication are important questions moving forward. In addition, as H4K20me2 is a highly abundant species, genome-wide studies during the cell cycle are needed to understand whether this mark is enriched at origins relative to other genomic regions and the relationship of this mark to other chromatin modifications (like H4K16Ac) that play a role in origin selection.
H4K20me Regulates Transcription and Chromatin Compaction
H4K20me has been linked to transcriptional regulation, and H4K20me1 is enriched at the bodies of actively transcribed genes (
). However, RNAi-mediated reduction of SETD8 can increase the expression levels of specific target genes, and an increase in H4K20me1 levels by PHF8 knockdown reduces the mRNA levels of PHF8-bound genes (
). Together, these data implicate H4K20me1 in transcriptional regulation, with the consequence of repression or activation likely dictated by specific reader proteins that couple the mark to downstream outcomes.
With respect to repression, SETD8-mediated H4K20me1 may regulate local compaction of the chromatin fiber. The depletion of SETD8 reduces chromatin condensation in interphase cells, suggesting that the normal function of H4K20me1 is to promote a condensed state, which would be consistent with gene silencing (
Role of Mass Spectrometry in Characterizing Methylation at H4K20
Mass spectrometry approaches have greatly facilitated the characterization of H4K20 methyl biology. As described above, both bottom-up and top-down approaches in combination with quantitative strategies have provided insight into the kinetics of H4K20 methylation through the cell cycle, during development, and in different disease states. In addition, mass spectrometry approaches have been key in developing our understanding of dynamic cross-talk between H4K20 methylation and other histone modifications.
) utilized mass spectrometry to characterize an elegant in vitro chromatin assembly system. In this study, mass spectrometry was used to determine the order of H4 PTMs during deposition of histones on DNA templates in the context of Drosophila extracts. Mono-methylation of H4K20 was found to follow acetylation events and to be required for subsequent de-acetylation, implicating H4K20me1 in chromatin assembly via marking the fully assembled nucleosome after histone deposition (
). Specifically, H4K20 tri-methylation is enriched in embryonic stem cells and in induced pluripotent stem cells relative to more differentiated cells. In contrast, H4K20 mono- and di-methylation are higher in differentiated cells compared with embryonic stem cells and induced pluripotent stem cell samples. Using a top-down approach, Kelleher and co-workers (
) provided a comprehensive and quantitative analysis of the modified forms of histone H4. This study determined the relative abundance of 42 uniquely modified histone H4 species. One observation arising from this analysis provided further evidence that dimethylation of H4K20 is the most abundant modification on H4.
As nucleosomes contain two copies of each core histone, a longstanding question in the field is whether modifications are found symmetrically or asymmetrically on histones in the context of nucleosomes. Moreover, once armed with this information, do these principles hold for all histone PTMs, or are there specific PTMs that are symmetric, whereas others are more commonly associated asymmetrically with nucleosomes? Furthermore, if this is the case, are there functional consequences to these differences? The only approach that can begin to address these types of questions is one taking advantage of mass spectrometry. In an elegant study, Reinberg and co-workers (
) developed a strategy involving quantitative liquid chromatography and tandem mass spectrometry (LC-MS/MS) of modification state-specific antibody immunoprecipitations from micrococcal nuclease-digested extracts. In this way, if the mark of interest (the one recognized by the antibody being used for the immunoprecipitation) is present on 100% of the peptides spanning the modification, then the modification is present symmetrically on nucleosomes. However, if only 50% of peptides contain the modification of interest, then the modification is asymmetrically associated with nucleosomes. Enrichment between 50 and 100% indicates a mixed population (symmetric and asymmetric nucleosomes). Using a thoroughly vetted H4K20me1 antibody (which is an essential requirement for the analysis), roughly 50% of H4K20me1-containing nucleosomes are symmetric and 50% asymmetric (
). A future question of interest will be to test whether the different populations of H4K20me1 nucleosomes are linked to alternative pathways and functional outcomes as well as the mechanisms that result in one fate versus the other.
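The interpretation rule described above (100% modified peptides indicates symmetric nucleosomes, 50% indicates asymmetric nucleosomes, and intermediate values indicate a mixture) can be written as a short calculation. The sketch below is an illustration and not the published analysis pipeline: the function name is hypothetical, and it assumes that every immunoprecipitated nucleosome carries the mark on at least one H4 copy and that modified and unmodified peptides are quantified equally well.

```python
def symmetric_fraction(modified_peptide_fraction: float) -> float:
    """Estimate the fraction of immunoprecipitated nucleosomes that are symmetric.

    If a fraction s of nucleosomes carries the mark on both H4 copies and
    (1 - s) carries it on one copy, the modified peptide fraction is
    f = s * 1.0 + (1 - s) * 0.5, which rearranges to s = 2 * f - 1.
    """
    f = modified_peptide_fraction
    if not 0.5 <= f <= 1.0:
        # Under the stated assumptions f cannot fall outside this range.
        raise ValueError("modified peptide fraction must lie between 0.5 and 1.0")
    return 2.0 * f - 1.0


if __name__ == "__main__":
    for f in (0.5, 0.75, 1.0):
        print(f"modified peptide fraction {f:.2f} -> symmetric fraction {symmetric_fraction(f):.2f}")
```

Under these assumptions, a population that is half symmetric and half asymmetric would correspond to about 75% of the spanning peptides carrying the mark.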
H4K20me is essential for development, and alterations in H4K20me are associated with a variety of diseases ranging from cancer to developmental disorders like MGS. Here, we have only highlighted some of the many functions associated with the different methyl states of H4K20. It is important to note that the diversity of functions associated with a methylation event is largely dictated by the reader domain binding event, which senses and transduces the modification within a spatial and temporal context. For example, H4K20me2 can be linked to DNA damage versus DNA replication based on whether it is bound by 53BP1 (via its tandem tudor domain) or ORC1 (via its BAH domain). Understanding how different readers are differentially targeted represents a new challenge in the field that, when addressed, will help us further elucidate the mechanisms by which lysine methylation regulates biology. Given that the methylation of a single residue, H4K20, can function in roles ranging from the DNA damage response and DNA replication to mitosis and transcription, there is great potential for the broad signaling network defined by lysine methylation to expand the functionality of the proteome, and as such, we anticipate exciting new biology resulting from the study of protein lysine methylation as well as the identification of new targets for therapeutics to treat human disease.