
On Genome Topology 2: The Fractal Globule

As a follow-up to my last post on the use of Hi-C to discover highly self-interacting genomic ‘topological domains’, I wanted to discuss a very interesting aspect of the original paper describing Hi-C. As well as finding a division of the genome into two chromatin compartments, Lieberman-Aiden et al. used their Hi-C data to compare and contrast two models of the topology of chromatin folding within the nucleus.

In this first description of Hi-C, Lieberman-Aiden et al. divided their genome-wide contact matrix into 1Mb regions (i.e. about ten times coarser than the resolution used in the Dixon et al. study). They found that, at this level of resolution, the genome can be partitioned into two types of spatial compartment, termed A and B. Greater interaction occurs within each compartment than across compartments. Compartment A displays a more open form of chromatin, with a high gene density and high levels of gene expression. Compartment B shows a more densely packed, closed chromatin state. Although the authors do not equate these compartments to euchromatin and heterochromatin, they sound distinctly similar to this old cytogenetic division.

In a later section of the paper, Lieberman-Aiden et al. discuss how their Hi-C data can be used to test models of the three-dimensional folding of chromatin. The ‘Equilibrium Globule’ model has been used to describe polymers in a poor solvent at equilibrium. In this model, chromatin is pictured as being in a densely knotted configuration. The ‘Fractal Globule’ model, by contrast, describes polymers self-organising into long-lived, non-equilibrium conformations:

“This highly compact state is formed by an unentangled polymer when it crumples into a series of small globules in a “beads-on-a-string” configuration. These beads serve as monomers in subsequent rounds of spontaneous crumpling until only a single globule-of-globules-of-globules remains. The resulting structure resembles a Peano curve, a continuous fractal trajectory that densely fills 3D space without crossing itself”

Figure from Lieberman-Aiden et al. (2009), panels C and D. (C) Top: An unfolded polymer chain, 4000 monomers (4.8 Mb) long. Coloration corresponds to distance from one endpoint, ranging from blue to cyan, green, yellow, orange, and red. Middle: An equilibrium globule. The structure is highly entangled; loci that are nearby along the contour (similar color) need not be nearby in 3D. Bottom: A fractal globule. Nearby loci along the contour tend to be nearby in 3D, leading to monochromatic blocks both on the surface and in cross-section. The structure lacks knots. (D) Genome architecture at three scales. Top: Two compartments, corresponding to open and closed chromatin, spatially partition the genome. Chromosomes (blue, cyan, green) occupy distinct territories. Middle: Individual chromosomes weave back-and-forth between the open and closed chromatin compartments. Bottom: At the scale of single megabases, the chromosome consists of a series of fractal globules.

When the intrachromosomal contact probability is plotted against genomic distance $s$, a power-law scaling is observed between ~500kb and ~7Mb. The observed scaling ($s^{-1.08}$) is much closer to that predicted for the fractal globule model ($s^{-1}$) than to that predicted for the equilibrium globule ($s^{-3/2}$). Likewise, data from 3D-FISH on the 3D distance between pairs of loci are in agreement with a fractal globule topology.
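To make the scaling argument concrete, here is a minimal sketch of how such an exponent could be estimated: a power law is a straight line on log-log axes, so its slope can be fitted by least squares. The function name and the synthetic, noise-free data are my own illustrative assumptions, not the authors' analysis or their data.

```python
import numpy as np

def fit_power_law_exponent(distances_bp, contact_prob):
    """Fit P(s) = prefactor * s**exponent by least squares on log-log axes.

    Hypothetical helper for illustration; returns (exponent, prefactor).
    """
    log_s = np.log10(np.asarray(distances_bp, dtype=float))
    log_p = np.log10(np.asarray(contact_prob, dtype=float))
    # A power law is linear in log-log space: log P = exponent * log s + log prefactor
    slope, intercept = np.polyfit(log_s, log_p, 1)
    return slope, 10 ** intercept

# Synthetic data (an assumption, not measurements) spanning ~500kb to ~7Mb,
# generated with the exponent quoted in the post.
s = np.logspace(np.log10(5e5), np.log10(7e6), 20)
p = 2.0 * s ** -1.08

exponent, prefactor = fit_power_law_exponent(s, p)
print(f"fitted exponent: {exponent:.2f}")
```

A fitted exponent near -1 would point towards the fractal globule, and one near -1.5 towards the equilibrium globule. Real Hi-C data are noisy, so the fit should be restricted to the distance range where the log-log plot is actually linear.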

It therefore seems that, at the scale of several megabases, chromatin is organised in these knot-free conformations of globules within globules, allowing unfolding and refolding, whilst also enabling maximally dense packing. I must admit that I don’t have much insight into the meaning of this, but frankly fractals are cool, and I love the idea of crumpling into globules of globules!

Lieberman-Aiden, E., van Berkum, N., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B., Sabo, P., Dorschner, M., Sandstrom, R., Bernstein, B., Bender, M., Groudine, M., Gnirke, A., Stamatoyannopoulos, J., Mirny, L., Lander, E., & Dekker, J. (2009). Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome. Science, 326 (5950), 289-293. DOI: 10.1126/science.1181369

On Genome Topology

The study of higher-order genomic structure using novel chromosome conformation capture techniques is an important growth area of biological research. These methods are being used to study long-range interactions between or within chromosomes, and promise to elucidate the spatial organisation of the genome and its functional significance. One such technique, Hi-C, which allows the identification of chromatin interactions across the entire genome, was used in a recent paper to show that mammalian chromosomes are divided into highly self-interacting ‘topological domains’.

Hi-C works by purifying chromosomal interactions and then sequencing the products. Briefly, chromosomes are first cross-linked by treatment with formaldehyde; the DNA is then chopped up and the ends of the fragments are chemically marked; the fragments are then ligated together under conditions that favour ligation of cross-linked fragments. Thus the two fragments joined in each ligation product were originally in close proximity to each other. After shearing, the marked fragments are purified, and the resulting library of interacting fragments is ‘massively parallel sequenced’. Upon alignment of the reads to a reference genome sequence, one can construct a genome-wide contact matrix.
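The final step, going from aligned read pairs to a contact matrix, can be sketched roughly as follows. This is a toy illustration under my own assumptions: a single chromosome, read pairs already reduced to two mapped positions, and made-up coordinates. The function name is hypothetical and this is not the pipeline used in either paper.

```python
import numpy as np

def build_contact_matrix(read_pairs, chrom_length_bp, bin_size_bp):
    """Count ligation products between fixed-size genomic bins.

    read_pairs: iterable of (pos_a, pos_b) positions, in bp, for the two ends
    of each sequenced ligation product (hypothetical input format).
    """
    n_bins = -(-chrom_length_bp // bin_size_bp)  # ceiling division
    matrix = np.zeros((n_bins, n_bins), dtype=int)
    for pos_a, pos_b in read_pairs:
        i, j = pos_a // bin_size_bp, pos_b // bin_size_bp
        matrix[i, j] += 1
        if i != j:
            # Contacts are unordered, so keep the matrix symmetric
            matrix[j, i] += 1
    return matrix

# Made-up read pairs on a 4Mb toy chromosome, binned at 1Mb
pairs = [(100_000, 2_500_000), (150_000, 2_900_000),
         (1_200_000, 1_800_000), (3_100_000, 400_000)]
contacts = build_contact_matrix(pairs, chrom_length_bp=4_000_000,
                                bin_size_bp=1_000_000)
print(contacts)
```

The bin size sets the resolution of the analysis: smaller bins resolve finer structure, but each bin then holds fewer reads, so more sequencing depth is needed.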

Dixon et al. applied Hi-C to mouse ES cells, human ES cells and human fibroblasts, and also used existing data from mouse cortex. They found that when they analysed their data at a bin size of less than 100kb, highly self-interacting regions emerged. For example, in mouse ES cells, 2,200 of these ‘topological domains’, with a median size of 880kb, occupied ~91% of the genome. The topological domains were separated by short segments at which chromatin interactions ended abruptly, termed ‘topological boundary regions’. Interestingly, the boundary regions generally remained the same between embryonic stem cells and differentiated cells, in both mouse and human. Hence, the overall domain architecture is largely unchanged between cell types. Surprisingly, there was also quite a high degree of conservation of boundary zones between human and mouse.
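One way to picture what ‘interactions ending abruptly’ means in a contact matrix: a bin at the end of a domain interacts mostly with bins upstream of it, while the first bin of the next domain interacts mostly downstream. The sketch below scores that bias on a toy block-diagonal matrix. It is a deliberately simplified illustration of the idea, with hypothetical function names and made-up values, and is not the domain-calling method used by Dixon et al.

```python
import numpy as np

def directional_bias(matrix, window):
    """Per-bin score in [-1, 1]: negative = mostly upstream contacts,
    positive = mostly downstream contacts (within `window` bins)."""
    n = matrix.shape[0]
    bias = np.zeros(n)
    for i in range(n):
        up = matrix[i, max(0, i - window):i].sum()
        down = matrix[i, i + 1:i + 1 + window].sum()
        total = up + down
        if total > 0:
            bias[i] = (down - up) / total
    return bias

def find_boundaries(matrix, window):
    """Return indices of bins that start a new domain, i.e. where the bias
    flips from upstream (negative) to downstream (positive)."""
    bias = directional_bias(matrix, window)
    return [i + 1 for i in range(len(bias) - 1)
            if bias[i] < 0 and bias[i + 1] > 0]

# Toy matrix (made-up values): two self-interacting domains of five bins each,
# with weak background contact between them.
toy = np.ones((10, 10))
toy[:5, :5] = 10
toy[5:, 5:] = 10

boundaries = find_boundaries(toy, window=3)
print(boundaries)  # [5]
```

On real data the signal is far noisier than in this toy, so a raw sign flip would need smoothing or a statistical model before it could be trusted.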

These boundary zones seem to correspond to insulator or barrier elements that are known to divide different chromatin domains and prevent heterochromatin from spreading. For instance, the HoxA locus is divided into two compartments by a known insulator element, which was found to be a topological boundary region in both human and mouse. Dixon et al. also found that the distribution of the heterochromatin-associated histone modification H3K9me3 was segregated at boundary regions in differentiated cells. As the topological domains generally remain constant between stem and differentiated cell types, the boundaries seem to pre-mark the end points for heterochromatic spreading during cellular differentiation. This also shows that the topological domains are not a consequence of heterochromatin formation.

In agreement with the linkage of boundary zones to insulator elements, Dixon et al. found that they were enriched for binding sites for the insulator protein CTCF. However, only 15% of global CTCF binding sites were in boundary zones, suggesting a more complex composition and function for the boundary zones. Looking at the distributions of other cellular factors, the researchers showed that boundary zones are associated with high levels of transcription, being enriched for transcription start sites, housekeeping genes, and promoter-associated histone marks. Interestingly, they also observed an enrichment for SINE retrotransposons. This is in agreement with a recent paper (that I wrote about) linking SINEs to the genomic spread of CTCF binding sites during evolution.

The discovery that the genome is partitioned into these topological domains is part of a growing literature dissecting genomic macro-structure. Dixon et al. compared topological domains with various other recently defined higher-order levels of genomic organisation: ‘A+B’ compartments (Lieberman-Aiden et al.), lamina-associated domains, replication time zones, and large organised chromatin K9 modification domains. They concluded that topological domains are related to, but distinct from, each of these previously characterised architectures. This list gives one some idea of the complexity of higher-order genomic structure, and of how shallow our understanding of it remains. However, this tranche of new chromosome capture techniques, combined with methods for high-throughput analysis of chromatin composition, is yielding a wealth of data. In the next few years we should have a far more nuanced and complete appreciation of the interplay between chromosomal architecture, chromatin state and genetic regulation. A mouth-watering prospect.

Dixon, J., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J., & Ren, B. (2012). Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature, 485 (7398), 376-380. DOI: 10.1038/nature11082

Lieberman-Aiden, E., van Berkum, N., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B., Sabo, P., Dorschner, M., Sandstrom, R., Bernstein, B., Bender, M., Groudine, M., Gnirke, A., Stamatoyannopoulos, J., Mirny, L., Lander, E., & Dekker, J. (2009). Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome. Science, 326 (5950), 289-293. DOI: 10.1126/science.1181369