The genetic effects of cohesin dysfunction

Academic year: 2021

UNIVERSITA’ DEGLI STUDI DI PISA Facoltà di Scienze Matematiche, Fisiche e Naturali

Research Doctorate School in Biological and Molecular Sciences

“The genetic effects of cohesin dysfunction”

Supervisor: Dott. Antonio Musio
PhD student: Francesco Cucco

Index

INTRODUCTION
Cohesin structure
Cohesin and cell cycle
Roles beyond cohesion
Regulation of gene expression
DNA repair and genome stability maintenance
Cohesin and human disease
Cornelia de Lange Syndrome
Cohesin mutations in cancer
Colorectal cancer
AIM OF PhD RESEARCH
MATERIALS AND METHODS
DNA extraction
Mutation analysis for SMC1A
Whole-exome sequencing
In silico analysis of SMC1A mutations
RNA extraction and retrotranscription
Quantitative real-time PCR analysis
Protein extraction
Co-immunoprecipitation
Western blotting
Cell culture
SMC1A cDNA mutagenesis and transfection
siRNA treatment
Cytogenetic analysis
Immunocytochemistry
RESULTS
SMC1A mutations in colorectal adenomas
CdLS: looking for “new” or “old” causative gene(s)?
CdLS: genotype-phenotype correlation
DISCUSSION
CONCLUSION

INTRODUCTION

The survival of all organisms requires the production of genetically identical daughter cells through a process known as cell division. To ensure genetic identity, the genome must be replicated and segregated prior to the actual division process. Specifically, during M phase, the array of chromosomes must be segregated so that each daughter cell inherits a complete complement: mistakes at this point in the cell cycle can produce daughter cells that deviate from the normal karyotype, which in turn can result in a loss of fitness or, in multicellular organisms, in various diseases. The solution to this challenge is sister chromatid cohesion mediated by cohesin, a multimeric ring-shaped protein complex that encircles the replicated sister chromatids (Nasmyth and Haering, 2009). Following DNA replication in S phase, the original and new chromatids are held together by cohesin, thus maintaining their identity as sisters. Cohesion is maintained throughout the rest of S phase, G2 and into early mitosis, when the chromosomes align at the centre of the cell on the microtubule spindle apparatus. At anaphase onset, the cohesin ring is opened, chromatid cohesion is lost and the sisters separate, allowing spindle forces to pull them to opposite sides of the cell. Because of the importance of cohesion in chromosome segregation, considerable effort has been devoted to elucidating the molecular mechanisms underlying this finely regulated process. Proteins that are essential for cohesion were first identified in several model organisms through the analysis of specific mutants in which sister chromatids separated precociously, before anaphase (Davis et al., 1971; Miyazaki and Orr-Weaver, 1992; Guacci et al., 1997; Birkenbihl and Subramani, 1992; Holt and May, 1996).
Four of these cohesion proteins (Smc1, Smc3, Scc1/Mcd1/Rad21, and Scc3/Irr1) were later shown to be subunits of the protein complex called “cohesin” to indicate its essential function in sister chromatid cohesion (Losada et al., 1998; Toth et al., 1999). The proper function of cohesin in chromosome segregation requires the activity of auxiliary factors that together make up the “cohesin network”. Since the cohesin network plays a role in one of the most fundamental cellular processes, it is not surprising that the cohesin complex and its auxiliary factors are highly conserved throughout evolution (Tab. 1).

Table 1: Nomenclature of principal cohesin network factors in different organisms (Modified from Onn et al., 2008).

Cohesin structure

Cohesin is a ring-shaped molecule capable of embracing DNA molecules topologically. It is a heterotetrameric complex composed of two SMC (Structural Maintenance of Chromosomes) proteins and two non-SMC proteins. In humans, the former are called SMC1A and SMC3, while the latter are named RAD21 and STAG1/2. The SMC proteins are organized into five domains: the N-terminal globular domain, the hinge domain separating two coiled-coil regions, and the C-terminal globular portion. The hinge domain acts as a hub driving the folding of the SMC proteins. Indeed, the two coiled-coil regions fold back on themselves around the hinge domain, giving rise to an antiparallel structure of about 45 nm that brings the globular termini together, forming an ATP Nucleotide Binding Domain (NBD) with ATPase activity (Fig. 1).

Figure 1: Structure and folding of SMC proteins.

The precise function of ATP binding and/or hydrolysis is still elusive. It has been hypothesized that ATP binding forces the SMC1A and SMC3 NBDs together, whereas hydrolysis permits their dissociation. However, it is currently unclear whether NBD association/dissociation per se facilitates cohesin loading onto chromatin. The NBDs present three highly conserved motifs: the "Walker A" motif, located in the N-terminal domain, and the "Walker B" and "signature" motifs, both located within the C-terminal domain. All SMC proteins act as homo- or heterodimers (Haering et al., 2002), and SMC1 and SMC3 dimerize through tight interactions between residues of the hinge domain. Crystallography studies have shown that this connection at the hinge domain gives rise to a structure similar to the clamps of DNA polymerase, but with a much smaller central hole (Nasmyth and Haering, 2009). The resulting “V”-shaped structure is subsequently closed by RAD21. The NBDs of SMC3 and SMC1A are bound tightly by RAD21’s N- and C-terminal domains, respectively, creating a huge tripartite ring. In particular, RAD21 interacts with the NBD of SMC1A via its Winged Helix Domain (WHD), whereas the molecular details of the interaction with SMC3 have not been fully characterized yet (Nasmyth and Haering, 2009). The fourth subunit of the complex is associated with RAD21, and two mutually exclusive isoforms have been found in vertebrate somatic cells: Stromal Antigen 1 (SA1) and Stromal Antigen 2 (SA2). SA2 seems to be primarily responsible for the cohesion of sister chromatids (Remeseiro et al., 2012), while SA1 seems to be involved in other cohesin functions that will be discussed later.

Figure 2: Structure of mammalian cohesin complex.

As mentioned, the binding of the cohesin complex to DNA molecules is believed to be topological (Haering et al., 2008). Different models have been proposed to explain how cohesin holds sister chromatids firmly together (Fig. 3):

A. "One ring model" (or "embrace" model): the two sister chromatids, each represented by a fiber of about 10 nm, would be captured within a single cohesin ring with an inner diameter of about 40 nm.

B. "Two ring model": each cohesin ring would encircle a single chromatid, and cohesion between sister chromatids would be stabilized by the interaction between SA2 subunits belonging to two different rings.

C. "Bracelet model": sister chromatids would be kept in close proximity by cohesin oligomers generated through the interaction of RAD21 subunits.

Figure 3: Proposed models of sister chromatid cohesion mediated by cohesin. (Modified from Mehta et al., 2012).

Cohesin and cell cycle

Cohesin binding to chromatin is dynamic and occurs during the G1/S phase in budding yeast or in the telophase of the previous cell cycle in vertebrates (Haering et al., 2004). In yeast, cohesin binds along chromosome arms every 10 to 20 kb, whereas its density is higher at centromeric regions (Kogut et al., 2009). The loading of cohesin along chromosomes requires the activity of NIPBL (a homolog of fungal Scc2 and Drosophila Nipped-B), as shown by studies in Saccharomyces cerevisiae (Ciosk et al., 2000), Schizosaccharomyces pombe (Bernard et al., 2006), Xenopus laevis (Gillespie et al., 2004) and mammalian cells (Seitan et al., 2006). The Scc2/Scc4 complex promotes loading of cohesin onto DNA by stimulating the ATPase activity associated with the SMC NBDs, which might in turn allow transient opening of the hinge domain, permitting the passage of DNA into the cohesin ring. According to this model, the hinge domain represents the “entry gate” of cohesin for DNA molecules (Nasmyth et al., 2011). Cohesin associates with chromosomes before DNA replication, but cohesion can only be established once DNA has been replicated during S phase. The establishment of sister chromatid cohesion depends on an acetyltransferase called Eco1/Ctf7 (Skibbens et al., 1999; Unal et al., 2007). Recently it has been shown that a key function of Eco1 is to acetylate cohesin on two lysine residues that are both located near the ATPase domain of Smc3 (Unal et al., 2008). Smc3 acetylation renders cohesin resistant to Wapl and its binding partner Pds5B, counteracting their tendency to remove cohesin from chromosomes. Acetylation could also stabilize protein–protein interactions among cohesin core subunits. The finding that Smc3 acetylation is at least partially dependent on DNA replication could explain why cohesin subunits produced in G2 or M phase cannot generate sister chromatid cohesion despite forming functional cohesin rings that associate stably with chromosomes (Ben-Shahar et al., 2008). Removal of cohesin in both mitosis and meiosis occurs in two sequential steps. The first (known as the prophase pathway) takes place during prophase and prometaphase, when most cohesin dissociates from chromosome arms but not from centromeres. The second takes place shortly before the onset of anaphase, when all remaining cohesin (mainly at centromeres but also on arms) dissociates due to cleavage of its kleisin subunit by Separase (Hauf et al., 2001; Uhlmann et al., 1999).
In the first step, the RAD21 and STAG2 subunits are phosphorylated by Polo-Like Kinase 1 (PLK1). These phosphorylation reactions decrease the binding of cohesin to chromatin (Sumara et al., 2002), causing its dissociation only along the chromosome arms (Hauf et al., 2005), whereas the bulk of centromeric cohesin remains unchanged. Centromeric cohesin is protected from the prophase pathway by the centromere-specific SGO1 protein. Furthermore, WAPL interacts with SA2 (Gandhi et al., 2006), whose phosphorylation is also required for cohesin dissociation in mitosis. It is therefore possible that SA2 phosphorylation changes cohesin in a way that facilitates its dissociation by WAPL during the prophase pathway. The ring model predicts that cohesin removal in the prophase pathway must involve ring opening, but the key questions are whether the entry and exit gates are the same and whether they both involve cohesin’s hinge. In this regard, it has recently been shown that cohesin's proposed DNA exit gate is formed by interactions between Scc1 and the coiled-coil region of Smc3 (Peters et al., 2014; Nasmyth et al., 2014). At anaphase, APC/C becomes active, leading to ubiquitination and subsequent degradation of numerous substrates, including the Separase inhibitor Securin. Separase, also known as Separin, is a cysteine protease that, once chromosomes are bioriented on the mitotic or meiotic spindle, completely removes cohesin from chromosomes by cleaving the Rad21/Scc1 subunit, triggering sister chromatid disjunction. Although Separase only cleaves ∼10% of cohesin complexes in mitotic mammalian cells (Waizenegger et al., 2000), these reactions are essential for sister chromatid separation, because both expression of noncleavable mutants of Scc1 and inhibition of Separase expression cause defects in chromosome segregation (Hauf et al., 2001).

Figure 4: Cohesin dynamics and its regulators during cell cycle in vertebrates. (Modified from Peters et al., 2008).

Roles beyond cohesion

Surprisingly, cohesin is expressed in a wide range of mammalian tissues (Sumara et al., 2000), including post-mitotic neurons, which normally do not replicate their DNA and thus cannot establish cohesion (Wendt et al., 2008). A stronger clue that cohesin may have non-mitotic functions comes from patients with Cornelia de Lange Syndrome (CdLS), which is caused by mutations in genes coding for cohesin network proteins. More than 50% of CdLS probands carry a mutation in only one allele of the NIPBL gene (Krantz et al., 2004; Tonkin et al., 2004), which causes only a modest (30%) reduction in the level of NIPBL protein and does not appear to be accompanied by major defects in sister chromatid cohesion. These observations suggest that cohesin is involved in processes that are independent of its canonical role in sister chromatid cohesion.

Regulation of gene expression

Experimental evidence suggests an important new role for cohesin in the regulation of gene expression. The involvement of cohesin in this process has already been shown in Saccharomyces cerevisiae at the HMR locus, a silenced region of the yeast genome. This region is flanked by sequences known as "silencers", which are bound by specific "Sir" (Silent information regulator) proteins. The association between Sirs and nucleosomes prevents several factors involved in transcription from interacting with their target sequences. Among the factors that define the silencing boundaries of the HMR locus, Smc1 is one of the most important, since its mutations result in a dramatic extension of the silenced region into adjacent regions (Donze et al., 1999). A screen conducted in Drosophila melanogaster, aimed at identifying factors necessary for the activation of the homeotic genes Cut and Ultrabithorax by distal enhancers, identified Nipped-B (the Drosophila homologue of human NIPBL) even before its role in sister chromatid cohesion was made clear (Rollins et al., 1999). This finding suggested that the cohesin complex could mediate the interactions between distal enhancers and promoters (Rollins et al., 2004; Dorsett et al., 2005). Based on these observations, a model has been developed in which cohesin acts as an insulator, blocking communication between enhancers and promoters. It has been suggested that Nipped-B-mediated removal of cohesin from chromatin allows the activation of genes by distal enhancers (Dorsett, 2007) (Fig. 5). Further evidence came from a study in zebrafish aimed at identifying factors regulating the expression of the Runx locus. Runx genes are involved in both differentiation and development. It has been shown that halving Rad21 expression results in a drastic reduction of Runx-1 mRNA levels (Horsfield et al., 2007).

Figure 5: Cohesin may act as an insulator, preventing the interactions between enhancers and promoters in Drosophila melanogaster. Removal of cohesin restores these interactions, leading to gene expression. (Modified from Dorsett, 2007).

The hypothesis that the cohesin complex can act as an insulator was recently supported by the discovery of a high level of co-localization of cohesin with CTCF. CTCF (CCCTC-binding Factor) is considered the main insulator in vertebrates and is highly conserved from Drosophila to mammals (Felsenfeld et al., 2004; Moon et al., 2005). It is plausible that CTCF is involved in the formation of loops that facilitate some interactions between enhancers and promoters while impeding others. Several groups have used the 3C (Chromosome Conformation Capture) technique to show the existence of a CTCF-mediated loop at the H19/Igf2 locus (Kurukuti et al., 2006; Yoon et al., 2007). The same mechanism has been found in the proximity of the beta-globin locus in a mouse model (Splinter et al., 2006). In addition, two groups have independently shown a high level of co-localization between CTCF and the cohesin complex in mammalian cells. In the first study, Wendt et al. (2008), through a genome-wide approach in HeLa cells, found that about 90% of the regions occupied by the cohesin complex are also bound by CTCF. In the second study, Parelho and collaborators (2008) showed similar data in mouse pre-B cells and thymocytes using the ChIP-on-chip technique. In particular, they found that 80% of CTCF binding sites are also occupied by cohesin and that about 65% of the regions recognized by cohesin also bind CTCF (Parelho et al., 2008). The discovery of a high degree of co-localization between CTCF and cohesin makes it possible to speculate that the cohesin complex, together with CTCF, prevents long-range interactions between enhancers and promoters (Fig. 6).

Figure 6: Cohesin, together with CTCF, prevents long-range interactions between enhancers and promoters. (Modified from Losada, 2014).
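The two co-localization percentages reported by Parelho et al. (2008) differ because each uses a different denominator: one is the fraction of CTCF sites that overlap cohesin, the other the fraction of cohesin regions that overlap CTCF. The following minimal Python sketch illustrates this asymmetry. The coordinates are invented toy values and not data from any study, the function names are hypothetical, and the overlap rule (at least one shared base on the same chromosome) is an assumption, since the criterion used in the cited studies is not stated here.

```python
from typing import List, Tuple

# (chromosome, start, end), half-open interval [start, end)
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True when both intervals are on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_overlapping(query: List[Interval], reference: List[Interval]) -> float:
    """Fraction of `query` intervals that overlap at least one `reference` interval."""
    if not query:
        return 0.0
    hits = sum(1 for q in query if any(overlaps(q, r) for r in reference))
    return hits / len(query)


# Toy, made-up binding sites (assumption: for illustration only).
ctcf_sites = [
    ("chr1", 100, 200), ("chr1", 500, 600),
    ("chr2", 50, 150), ("chr2", 900, 1000),
    ("chr3", 10, 60),
]
cohesin_sites = [
    ("chr1", 150, 250), ("chr1", 550, 650),
    ("chr2", 100, 180), ("chr2", 950, 1050),
    ("chr3", 300, 400), ("chr3", 700, 800),
]

# Same overlapping pairs, two different denominators.
ctcf_with_cohesin = fraction_overlapping(ctcf_sites, cohesin_sites)
cohesin_with_ctcf = fraction_overlapping(cohesin_sites, ctcf_sites)

print(f"CTCF sites occupied by cohesin: {ctcf_with_cohesin:.0%}")
print(f"Cohesin regions bound by CTCF: {cohesin_with_ctcf:.0%}")
```

In this toy set the same four overlapping pairs give 4 of 5 CTCF sites but only 4 of 6 cohesin regions, which is why the two figures need not agree.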

DNA repair and genome stability maintenance

Studies in yeast showed that mutations in the cohesin complex lead to sensitivity to different mutagenic agents, including gamma radiation, camptothecin and hydroxyurea. Furthermore, Scc1 mutant cells are hypersensitive to DNA damage (Kim et al., 2002; Schar et al., 2004). ATM and ATR are serine/threonine kinases implicated in the cellular response to DNA damage, regulating both the G1/S and G2/M checkpoints. It has been shown that ATM and ATR complexes phosphorylate histone H2AX in the presence of DNA damage, allowing the loading of cohesin at DNA damage sites (Unal et al., 2004), whereas cohesin is recruited at breaks by Eco1/Ctf7 activation in G2/M phase (Strom et al., 2007; Unal et al., 2007). These studies provided evidence that DNA damage repair depends on the ability of cohesin to mediate sister chromatid cohesion. This is likely because Double Strand Breaks (DSBs) are preferentially repaired by Homologous Recombination (HR) between sister chromatids, suggesting that the close proximity of sister chromatids improves the efficiency of DNA repair (Sjogren and Nasmyth, 2001). In mitotic cells, two different populations of cohesin contribute to the DNA repair process: cohesin engaged in holding sister chromatids together at the time of the break, and cohesin subsequently recruited to the chromatin surrounding the break itself (Strom et al., 2004). Cohesin also performs these functions in human cells, where SMC1A is a target of the ATM kinase and is phosphorylated upon treatment with ionizing radiation (IR) (Kim et al., 2002; Yazdi et al., 2002). Phosphorylation of SMC1A recruits ATM at the DNA DSB sites through the action of BRCA1 and NBS1 (Kitagawa et al., 2004). The phosphorylation of SMC1A, however, may be independent of ATM. Cultured human cells treated with high concentrations of aphidicolin undergo replicative stress, leading to stalling of replication forks and DSBs.
This damage triggers the phosphorylation of SMC1A by ATR on a residue different from the one phosphorylated by ATM (Serine 966 instead of Serine 957) (Musio et al., 2005). Furthermore, replication fork stalling induces increased synthesis of SMC1A (Musio et al., 2005), which can in turn promote the stabilization of the cleavage sites and the recruitment of the machinery responsible for DNA repair. In summary, cohesin participates in two distinct DNA repair pathways, depending on the type of DNA damage. Cohesin’s properties are essential for DNA repair and the maintenance of genomic stability (Mannini et al., 2010). Moreover, phosphorylation of cohesin components can lead to activation of cell cycle checkpoints, allowing cycle arrest and DNA repair (Fig. 7). The SMC1A-SMC3 heterodimer is also a member of the recombination complex RC-1, together with DNA ligase III, DNA polymerase ε and several endonucleases (Jessberger et al., 1996). This complex catalyzes different reactions of DNA repair by recombination, using sister chromatids as a template. Using sister chromatids instead of homologous chromosomes has the advantage of reducing problems associated with loss of heterozygosity, translocations and internal deletions.

Figure 7: Cohesin-mediated DNA damage response. RPs = repair proteins. (Modified from Jessberger, 2009).

Cohesin and human disease

Given the critical cellular processes now known to be associated with cohesin, it is not surprising that human disorders have recently been found to be caused by perturbation of cohesin structural components and regulators. Germline cohesin mutations lead to disorders known as cohesinopathies, such as CdLS and Roberts Syndrome (RBS), while somatic cohesin mutations have recently been associated with different types of cancer, such as Colorectal Cancer (CRC), Acute Myeloid Leukemia (AML), Urothelial Bladder Cancer (UBC) and Glioblastoma (GBM) (Losada, 2014).

Cornelia de Lange Syndrome

CdLS is the most frequent of the cohesinopathies and owes its name to the Dutch pediatrician who, in 1933, first described two cases and proposed diagnostic criteria. CdLS is a developmental disorder with wide phenotypic variability that affects the organism at different levels. Its incidence is estimated at 1:10,000 live births, but this could be an underestimate, since subjects with mild phenotypes are sometimes difficult to identify (Dorsett et al., 2009). Patients with CdLS have typical dysmorphic features: synophrys, long eyelashes, depressed nasal root with an uptilted tip of the nose and anteverted nares, long philtrum, thin upper lip, small widely spaced teeth, small brachycephalic head, and low-set, posteriorly angulated ears (Liu and Krantz, 2008). Abnormal development of the jaw (micrognathia) and cleft palate are also frequently present. Another typical trait is limb malformation, which in most cases involves a reduction of the first metacarpal, brachydactyly and polydactyly, while in other, more severe cases the forearm can be completely absent. The short stature of patients with CdLS is usually detectable around the second trimester of pregnancy and typically persists in the postnatal period: height, weight and head circumference remain below the normal childhood growth curve until adulthood (Kline et al., 1993). At a systemic level, the gastrointestinal tract is generally compromised. The abnormalities most frequently encountered are vesicoureteral reflux, pelvic dilatation and renal dysplasia (Selicorni et al., 2005). One of the main complications is gastro-esophageal reflux, which can be found in 65% of CdLS probands independently of the severity of the disease (Luzzani et al., 2003).
Heart function is also affected: about 25% of probands present congenital abnormalities that primarily affect the septum between the atria and the ventricles (Mehta et al., 1997). In addition to developmental delay, patients show cognitive impairment, with IQ scores in the mild to moderate range (30-85). Language is particularly affected, while memory is generally less impaired. Regarding behavior, patients frequently show obsessive-compulsive attitudes and sometimes suffer from attention deficit, hyperactivity, depression and autism (Hyman et al., 2002; Moss et al., 2005). From the genetic point of view, CdLS is a genetically heterogeneous, dominantly inherited disorder. The majority of cases are sporadic; however, some cases have been attributed to germline mosaicism (De Scipio et al., 2005; Niu et al., 2006; Slavin et al., 2012). About 60% of CdLS probands have a heterozygous mutation in the NIPBL gene (Gillis et al., 2004; Krantz et al., 2004; Tonkin et al., 2004). Mutations in NIPBL are responsible for both severe and mild forms. Most of the described NIPBL mutations lead to a truncated protein, so haploinsufficiency, due to the reduction of wild-type protein, is believed to be the molecular mechanism in NIPBL-mutated CdLS probands. A smaller fraction of probands carry a mutation in SMC1A (~5%) or SMC3 (~1%) (Musio et al., 2006; Deardorff et al., 2007; Ansari et al., 2014). SMC1A maps to the short arm of the X chromosome, in a region that escapes the inactivation process (Brown et al., 1995). The mutations found so far within the SMC1A and SMC3 coding regions are essentially missense mutations or small in-frame deletions (the only exception being a nonsense mutation found in SMC3). Mutations in SMC proteins have been associated with milder forms of CdLS than mutations in NIPBL. Since the great majority of mutations in SMC1A and SMC3 are missense mutations or in-frame deletions, the best hypothesis is that greater changes in SMC proteins would not be compatible with life and are therefore negatively selected (Revenkova et al., 2009). It is believed that in SMC1A- and SMC3-mutated cases the disease is due to a dominant negative effect of the mutated SMC proteins. In fact, since cohesin acts as a complex, a single mutated subunit is sufficient to alter the functionality of the ring (Deardorff et al., 2007).
Finally, 5% of probands harbor mutations in HDAC8 and RAD21 (Deardorff et al., 2012a; Deardorff et al., 2012b), which are associated with moderate traits, intermediate between the phenotypes of patients with SMC and NIPBL mutations. Currently, 30% of clinically diagnosed CdLS probands do not carry an identifiable mutation in any of the cohesin network genes. This would indicate that new disease-causing genes remain to be identified.

Figure 8: Frequencies of mutated genes in CdLS.

Since cohesin holds sister chromatids together, it was thought that the etiopathology of CdLS could reside in defects in this process. In a first study, metaphases of subjects with mutations in NIPBL were analyzed. The frequency of premature separation of sister chromatids (PSCS) amounted to 41%, versus 9% in controls (Kaur et al., 2005). These data therefore seemed to support this hypothesis. However, a similar study conducted on CdLS patients bearing a mutation in SMC1A or SMC3 found no difference in the degree of PSCS compared with controls (Revenkova et al., 2009). Finally, in 2009 a further study, conducted on 150 metaphase spreads from lymphoblastoid cell lines (LCLs) belonging to 29 CdLS patients and 24 healthy subjects, confirmed the general inadequacy of PSCS rates as a specific marker for CdLS (Castronovo et al., 2009). Next, microarray analysis showed that CdLS patients have an altered gene expression profile (Liu et al., 2009). The same study showed that about 22% of LCL genes have intragenic cohesin binding sites, with an enrichment at the Transcription Start Site (TSS) and the Transcription Termination Site (TTS). Specifically, the highest peak was detected in the proximity of the TSSs, while the second peak maps at the level of the TTSs. The discovery that most of the genes differentially expressed in CdLS showed such cohesin peaks suggested a correlation between intragenic cohesin binding sites and the regulation of gene expression. Transcriptionally active genes also show a peak of cohesin binding near the 5'UTR, supporting the notion that cohesin is involved in the modulation of gene expression. Liu and collaborators (2009) also showed that the number of cohesin binding sites is reduced by 29.7% in cells from CdLS probands compared with healthy subjects, and that the maximum reduction occurs in the proximity of the TSSs (43.4%). In particular, about 50% of the differentially expressed genes maintain their cohesin binding sites near the TSS, while the remaining 50% lose them. Although this evidence points to a global transcription impairment at the base of CdLS pathogenesis, the expression level of dysregulated genes is in general only slightly perturbed in these patients (only 13.8% of them show a fold change greater than 1.5). It is therefore thought that the cumulative effect of slight changes in the expression levels of several genes could be the molecular mechanism underlying CdLS.
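To make the "fold change greater than 1.5" criterion concrete, the following minimal Python sketch computes the fraction of genes whose expression differs between patient and control samples by more than a given fold change. It is only an illustration under stated assumptions: the gene names and expression values are invented, the function names are hypothetical, and the symmetric definition of fold change (larger value divided by smaller value, so that up- and down-regulation are treated alike) is an assumption, since the exact definition used by Liu et al. (2009) is not given here.

```python
from typing import Dict, Tuple


def fold_change(patient: float, control: float) -> float:
    """Symmetric fold change: the larger mean expression divided by the smaller one."""
    if patient <= 0 or control <= 0:
        raise ValueError("expression values must be positive")
    ratio = patient / control
    return ratio if ratio >= 1.0 else 1.0 / ratio


def fraction_above_threshold(
    expression: Dict[str, Tuple[float, float]], threshold: float = 1.5
) -> float:
    """Fraction of genes whose fold change (patient vs control) exceeds `threshold`."""
    if not expression:
        return 0.0
    strongly_changed = sum(
        1
        for patient, control in expression.values()
        if fold_change(patient, control) > threshold
    )
    return strongly_changed / len(expression)


# Hypothetical dysregulated genes: name -> (mean patient level, mean control level).
dysregulated = {
    "GENE_A": (120.0, 100.0),
    "GENE_B": (90.0, 100.0),
    "GENE_C": (100.0, 130.0),
    "GENE_D": (210.0, 100.0),  # up-regulated more than 1.5-fold
    "GENE_E": (100.0, 110.0),
    "GENE_F": (140.0, 100.0),
    "GENE_G": (100.0, 125.0),
    "GENE_H": (50.0, 100.0),   # down-regulated more than 1.5-fold
}

fraction_strong = fraction_above_threshold(dysregulated)
print(f"Genes with fold change > 1.5: {fraction_strong:.1%}")
```

In this toy set most genes change by less than 1.5-fold, which mirrors the qualitative point of the text: many small perturbations rather than a few large ones.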

Cohesin mutations in cancer

Sequencing of 4,742 human cancer samples across 21 cancer types has identified STAG2 as one of 12 genes that are mutated at relatively high frequencies in at least four tumor types (Lawrence et al., 2014). Mutations in genes encoding cohesin network factors such as SMC1A and NIPBL were initially identified in CRC (Barber et al., 2008), and mutations in STAG2 were later found in Glioblastoma (GBM), Ewing’s sarcoma and melanoma (Solomon et al., 2011). Mutations in STAG2 are also the most common in Urothelial Bladder Cancer (UBC), with rates of 10–15% in aggressive tumors and up to 30% in low-grade tumors (Guo et al., 2013). Although most identified mutations are heterozygous, the SMC1A and STAG2 genes are located on the X chromosome, which makes the corresponding mutations functionally homozygous, at least in males. It is unlikely that cells can proliferate in the absence of cohesin. SMC1A mutations are prevalently missense, preserving the production of the protein but probably exerting a dominant negative effect. The proposed link between tumorigenesis and cohesin dysfunction is essentially the increase in genome instability due to faulty DNA replication and/or repair and chromosome missegregation (Duijf and Benezra, 2013). On the one hand, aneuploidy and genome instability affect cell survival; on the other hand, they can also accelerate tumor evolution and adaptability (Holland and Cleveland, 2012). An association between aneuploidy and cohesin mutations in cancer has been reported in some studies (Solomon et al., 2013), but not in others (Balbas-Martinez et al., 2013). The role of cohesin in genome organization could also promote carcinogenesis in alternative ways. The most obvious effect would be changes in the expression of crucial oncogenes or tumor suppressors. Furthermore, altered organization of replication factories may slow replication and increase replicative stress (Guillou et al., 2010). A local alteration of chromosomal domain organization could alter the replication timing of the domain and thereby affect its mutation rate, its epigenetic modifications or the frequency of structural rearrangements (Stamatoyannopoulos et al., 2009). Although this evidence suggests an important role for cohesin mutations in cancer development, these mutations remain poorly characterized in terms of the folding and stability of the complex itself, its association with chromatin and its interaction with other proteins. Distinct aspects of cohesin regulation and function could be tissue-specific and thereby explain why mutations in cohesin are more prominent in certain types of tumors. So far, synergy with mutations in additional pathways is also unclear.
The generation of cell and animal models deficient for cohesin or carrying the mutations identified in cancer will help us to understand the functional importance of such mutations and hopefully contribute to improving the diagnosis and treatment of patients.

Figure 9: Recurrence of mutations in cohesin network factors in different types of cancer. (CRC = colorectal cancer; AML = acute myeloid leukemia; UBC = urothelial bladder cancer; GBM = glioblastoma; DS-AMKL = Down syndrome-acute megakaryoblastic leukemia).

Colorectal cancer

Colorectal cancer (CRC) is a leading cause of death, with over 1 million cases worldwide and 27 deaths per 100,000 population in high-income countries (data updated to 2011, from www.who.int). The mechanisms of onset of this cancer are not yet completely understood, but genetic, epigenetic and environmental factors are believed to contribute (Weitz et al., 2005). From the biomedical point of view, CRC is a malignant tumor that affects the large intestine and more frequently the rectum, sigmoid and anus. The disease is caused by the uncontrolled proliferation of cells of the mucosa lining the large intestine, with the formation of adenomas (precancerous lesions), which have a high probability of transforming into malignant lesions (Michor et al., 2005). The epithelium of the colon has a peculiar morphological and functional organization with a high rate of self-renewal. In particular, intestinal stem cells are organized in groups at the base of the intestinal mucosa, inside protective niches called “crypts”. Intestinal stem cells are responsible for the production of immature progenitors, which generate the mature cell types: enterocytes, goblet cells and endocrine cells. These cells lose their proliferative capacity and acquire their differentiated phenotype before migrating to the intestinal lumen. When they reach the apex of the crypt they undergo apoptosis and can be phagocytized by stromal cells or released into the intestinal lumen. The process of differentiation, migration and apoptosis takes 3-6 days (Lipkin et al., 1963). The normal architecture of the crypt is therefore maintained through a balance between cell proliferation at the base of the crypt and apoptosis at the surface. The dysregulation of this process can promote the development of CRC because of the accumulation of proliferating cells in the mucosa and the failure of genetically aberrant cells to die (Risio et al., 1996; Bedi et al., 1995). One of the first effects of this imbalance is the formation of polyps along the walls of the intestine. CRCs develop through an ordered series of events beginning with the transformation of normal colonic epithelium to an adenomatous intermediate and then ultimately adenocarcinoma, in the so-called “adenoma–carcinoma sequence” (Fig. 10). This progression occurs over the course of many years and has been well characterized from a molecular point of view. Every histological stage is associated with a specific mutation in tumor suppressor genes or proto-oncogenes. These mutations arise within normal tissue in a characteristic sequence leading to early adenoma/dysplastic crypt, late adenoma and carcinoma.
Generally the process begins with mutations in the tumor suppressor gene APC (Adenomatous Polyposis Coli) (Powell et al., 1992). Normally APC binds to β-catenin and to kinases, forming a complex. Through this complex, β-catenin is phosphorylated, ubiquitinated and degraded, silencing the Wnt/β-catenin pathway. If APC is mutated, the degradation complex is impaired, resulting in an accumulation of β-catenin in the nucleus, where it activates the transcription of genes such as c-myc and cyclin D1. These genes promote uncontrolled cell proliferation and thus the development of preneoplastic lesions (Mann et al., 1999). Subsequently, mutations in oncogenes such as K-RAS, which is involved in the transduction of proliferative signals, lead to the formation of late adenomas. Finally, the transformation into adenocarcinoma is often due to mutations in the tumor suppressor gene TP53, also known as “the guardian of the genome” for its multiple roles in genome stability maintenance. From this point, the fast accumulation of mutations leads rapidly to the metastatic stage (Rajagopalan et al., 2003).

Figure 10: Adenoma-carcinoma progression and its stage-associated mutational events.

About 85% of sporadic CRCs are characterized by this chain of mutational events. One of the hallmarks of this type of colorectal tumor is a particular form of genomic instability termed CIN (Chromosomal INstability), which involves the gain or loss of whole chromosomes or parts of them (leading to aneuploidy) (Lengauer et al., 1997). The remaining 15% of sporadic CRCs are characterized by microsatellite instability (MSI). This form of CRC is divided into two categories: MSI-L, low-frequency microsatellite instability (1-28%), and MSI-H, high-frequency microsatellite instability (≥29%). MSI is the result of mutations in the DNA mismatch repair (MMR) system, composed of the MMR genes encoding the proteins MLH1, MSH2, MSH3, MSH6, PMS2 and PMS1, highly conserved from prokaryotes to eukaryotes (Ionov et al., 1993; Thibodeau et al., 1993). In addition to the sporadic forms, two hereditary CRCs have been described. These tumors, characterized by early onset and increased aggressiveness, are known as Familial Adenomatous Polyposis (FAP) and Hereditary Nonpolyposis Colorectal Cancer (HNPCC), or Lynch syndrome (Pavlovic-Calic et al., 2007). These familial forms are caused by germline heterozygous loss of APC and MMR genes, respectively. In 2008 Barber and collaborators showed that cohesin genes are mutated in CRCs: 132 samples of sporadic CRC were examined and 10 somatic mutations were identified in four genes of the cohesin network. In particular, they found 4 mutations in SMC1A, 4 mutations in NIPBL, 1 mutation in SMC3 and 1 in STAG3. All SMC1A mutations were missense, with a frequency of about 4%. Among the SMC1A mutations, two affect the coiled-coil region, one the hinge domain and one the C-terminal domain. They also reported that the down-regulation of these genes in human cells leads to chromosomal instability and defects in sister chromatid cohesion. This evidence suggested that somatic mutations in the cohesin complex could be one of the main triggers of CIN, which is typical of most sporadic CRCs (Barber et al., 2008).

AIM OF PhD RESEARCH

Cohesin has a leading role in a wide range of chromosome-related processes. Its importance in cell biology is highlighted by the observation that cohesin alterations cause congenital human disorders and, as recently reported, could have a critical role in the development of many types of cancer. This PhD research aimed to investigate the genetic effects of cohesin dysfunction, with particular emphasis on CdLS and CRC. CdLS is genetically heterogeneous, and 30% of patients lack mutations in the known genes. In the first part of this research project we planned to identify additional causative CdLS gene(s) by WES analysis and, in parallel, to collect all of the identified CdLS mutations in order to properly define genotype-phenotype correlations.

Most sporadic CRCs are characterized by CIN, which arises from the early steps of carcinogenesis. Mutations in the cohesin complex have been found in colorectal carcinomas, though at low frequencies. Because of the importance of cohesin in maintaining genome stability and preventing aneuploidy, our hypothesis was that cohesin mutations may occur in the early steps of colorectal tumorigenesis and could be the crucial event in the development of the CIN that characterizes this type of cancer. In particular, we focused on the SMC1A protein, not only because it is one of the most conserved subunits, but also because it is directly involved in the DNA damage response. To test this hypothesis we performed a mutational screening of SMC1A in colorectal adenomas, an early step of colorectal carcinogenesis. In addition, we investigated the effects on genome stability of the cohesin mutations identified in colorectal adenomas.

MATERIALS AND METHODS

DNA extraction

Genomic DNA was extracted from paraffin-embedded samples with the NucleoSpin Tissue kit (Macherey-Nagel) according to the manufacturer’s protocol. DNA concentration was estimated with a BioPhotometer D30 (Eppendorf).

Genomic DNA was extracted from cell pellets and human blood with the GenElute™ Mammalian Genomic DNA Miniprep Kit (Sigma) according to the manufacturer’s protocol. DNA concentration was estimated with a BioPhotometer D30 (Eppendorf).

DNA was extracted from agarose gel with the GenElute™ Gel Extraction Kit (Sigma) according to the manufacturer’s protocol. DNA concentration was estimated with a BioPhotometer D30 (Eppendorf).

Mutation analysis for SMC1A

Primer pairs were designed to amplify exons, exon–intron boundaries and short flanking intronic sequences. Amplified DNA was run on a 1.5% agarose gel, extracted and sequenced by the classical Sanger method. Sequences were analyzed with Chromas Lite v.2.33 software.

Whole-exome sequencing

Libraries were generated using the TruSeq Sample Preparation Kit v2 (Illumina), and the enriched fragments were sequenced on an Illumina HiSeq 2000 platform using the paired-end sequencing protocol with a read length of 100 bp. Construction of the libraries, sequencing and primary data analysis were performed by an external service. The sequences were further filtered against public databases, such as dbSNP and the 1000 Genomes Project, to remove common variants, and were then analyzed to exclude nongenic, intronic and synonymous variants. The sequence variations were also validated by Sanger sequencing and confirmed as de novo mutations by comparison with parental DNA.
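
The filtering logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used by the external service: the record fields (`gene`, `hgvs`, `consequence`, `known_ids`), the consequence labels, the gene names and the database identifier are all hypothetical placeholders.

```python
from dataclasses import dataclass, field

# Consequence classes excluded by the filter described in the text.
EXCLUDED = {"nongenic", "intronic", "synonymous"}

@dataclass
class Variant:
    gene: str                 # placeholder gene name
    hgvs: str                 # placeholder coding-level description
    consequence: str          # hypothetical label, e.g. "missense"
    # Hypothetical field: identifiers under which the variant is already
    # listed in public databases (dbSNP, 1000 Genomes); empty if novel.
    known_ids: set = field(default_factory=set)

def filter_variants(variants):
    """Keep variants that are absent from public databases and
    do not belong to an excluded consequence class."""
    return [v for v in variants
            if not v.known_ids and v.consequence not in EXCLUDED]

# Invented example calls.
calls = [
    Variant("GENE1", "c.100A>G", "missense"),
    Variant("GENE2", "c.200C>T", "synonymous"),
    Variant("GENE3", "c.300+50G>A", "intronic"),
    Variant("GENE4", "c.400G>A", "missense", {"rs0000001"}),
]
kept = filter_variants(calls)
print([v.gene for v in kept])
```

Only the novel missense call survives in this toy input; the common variant and the intronic and synonymous calls are dropped.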

In silico analysis of SMC1A mutations

The effects of the amino acid changes were analyzed with dedicated software such as PolyPhen, SIFT and MutationTaster. The evolutionary conservation of the amino acid sequences was analyzed with ClustalW.

RNA extraction and retrotranscription

Total RNA was extracted from cell pellets by RNeasy Mini Kit (Qiagen) according to the manufacturer’s protocol and cDNA synthesis was performed by iScript™ cDNA Synthesis Kit (Bio-Rad) according to the manufacturer’s protocol.

Quantitative real-time PCR analysis

Quantitative real-time PCR (qPCR) was performed using the QuantiTect SYBR Green PCR mix (Qiagen) on a Rotor-Gene 3000 (Corbett). Each sample was run in duplicate and the experiment was repeated at least three times. The primers for qPCR amplification of SMC1A were 5′-CCAAGCGGCGTATTGATGAA-3′ (forward) and 5′-GCATCCATGTTCTTGCCCAA-3′ (reverse). HPRT was used as an internal control. The primers for HPRT were 5′-AGCCAGACTTTGTTGGATTTG-3′ (forward) and 5′-TACTAAGCAGATGGCCACAGA-3′ (reverse). The results are expressed as fold enrichment relative to untreated control cells.
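
The text does not state how the fold values were derived from the qPCR readings. A common choice for this kind of design (target gene normalized to a reference gene, treated relative to control) is the 2^-ΔΔCt method, and the sketch below assumes it. The function name and the Ct values are invented for illustration.

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method.

    Assumption: this method is not named in the text; it is shown
    only as one plausible way to obtain fold values."""
    d_ct_treated = ct_target_treated - ct_ref_treated    # normalize to HPRT
    d_ct_control = ct_target_control - ct_ref_control    # normalize to HPRT
    dd_ct = d_ct_treated - d_ct_control                  # treated vs. control
    return 2 ** (-dd_ct)

# Invented Ct values: SMC1A and HPRT in transfected and untreated cells.
fc = fold_change(20.0, 24.0, 23.0, 24.0)
print(fc)
```

With these invented values the target crosses the threshold three cycles earlier after normalization, which corresponds to an eightfold increase.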

Protein extraction

Whole proteins were extracted from early colorectal adenomas (FFPE tissues) with the Qproteome FFPE Tissue kit (Qiagen) according to the manufacturer’s protocol. Protein concentration was estimated by the Bradford assay.

Whole proteins were extracted from cell pellets by immersion in lysis buffer, homogenization with a TissueLyser (Qiagen), incubation and centrifugation. The supernatant containing the whole protein extract was collected. Protein concentration was estimated by the Bradford assay.

Co-immunoprecipitation

Protein extracts were dissolved in 1 ml of incubation buffer. The solution was precleared with 20 µl of Dynabeads Protein G (Invitrogen) for 1 h. The supernatants were then incubated with 3 µg of anti-SMC1A or anti-RAD21 antibodies coupled to 40 µl of Dynabeads Protein G. The loaded suspensions were precipitated, washed four times with incubation buffer and then resuspended in SDS-loading buffer.

Western blotting

Commercially available antibodies used in this study were as follows: anti-SMC1A (Bethyl Laboratories), anti-SMC3 (Bethyl Laboratories), anti-RAD21 (Bethyl Laboratories) and anti-Actin (Santa Cruz Biotechnology). Samples were boiled in sample buffer and separated by SDS–PAGE. The proteins were transferred to a nitrocellulose membrane (Amersham) and incubated with the primary antibody. After removal of the unbound primary antibody, membranes were incubated with a secondary antibody–peroxidase conjugate (Sigma), processed for detection by chemiluminescence (Amersham) and imaged on Biomax film (Kodak). Anti-Actin antibody was used as an internal control.

Cell culture

Human primary fibroblasts were grown in Dulbecco’s modified Eagle’s medium (DMEM, Gibco BRL) supplemented with 10% fetal calf serum and antibiotics in a humidified 5% CO2 atmosphere.

SMC1A cDNA mutagenesis and transfection

Site-directed mutagenesis of the cDNA clone containing SMC1A (OriGene) was performed with the QuikChange Site-Directed Mutagenesis Kit (Stratagene) according to the manufacturer’s instructions. By this approach, we introduced the c.2027A>G, c.2479C>T and c.3421C>T SMC1A mutations. All mutations were confirmed by sequencing. Transfections were performed with Lipofectamine LTX according to the manufacturer’s protocol.

siRNA treatment

SMARTpool siRNA against SMC1A was purchased from Dharmacon. Cells (at 40–60% confluence) were transfected with 20 nM SMC1A siRNA using Oligofectamine Reagent (Invitrogen). Cells were analyzed for aneuploidy and genome stability 48 h post-transfection.

Cytogenetic analysis

Nocodazole was added to the cultures for 90 min, followed by a 20-min incubation in 0.075 M KCl at 37°C and multiple changes of Carnoy’s fixative. Cells were dropped onto clean, wet slides. One hundred metaphases were analyzed. Micronuclei, chromosome aneuploidy and aberrations were visualized by staining slides with Giemsa or propidium iodide and scored by direct microscopic observation.

Immunocytochemistry

Cells were fixed in 2% paraformaldehyde for 10 min, permeabilized for 5 min on ice in 0.2% Triton X-100 and blocked in PBS with 1% BSA for 30 min at room temperature. Thereafter, cells were incubated with anti-Tubulin antibody (Abcam) for 1 h, washed in PBS, 1% BSA and incubated with Alexa Fluor 488-conjugated goat anti-rabbit secondary antibody (Molecular Probes) for 1h. Nuclei were stained with DAPI.

Statistical analysis

Results were analyzed by Student’s t-test. P-values below 0.05 were considered statistically significant.
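
As an illustration of the test, the sketch below computes a two-sample Student's t statistic and its two-sided P-value using only the Python standard library. The text does not specify whether the test was paired or unpaired; the unpaired, equal-variance form is assumed here. The function names and the replicate values are invented for the example, and the P-value is obtained by numerical integration of the t density rather than from a statistics package.

```python
import math
from statistics import mean, variance

def student_t(a, b):
    """Unpaired Student's t statistic with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df

def two_sided_p(t, df, steps=10000):
    """Two-sided P-value: 1 - 2 * integral of the t density over [0, |t|],
    evaluated with Simpson's rule (steps must be even)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    h = abs(t) / steps
    total = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 1.0 - 2.0 * (total * h / 3.0)

# Invented replicate values (percentage of abnormal cells per experiment).
control = [3.0, 4.0, 2.0]
treated = [28.0, 31.0, 30.0]
t, df = student_t(control, treated)
p = two_sided_p(t, df)
print(df, p < 0.05)
```

In routine work a statistics library would normally be used instead; the explicit integration is shown only to keep the example self-contained.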

RESULTS

SMC1A mutations in colorectal adenomas

We collected 48 formalin-fixed paraffin-embedded (FFPE) tissue samples of colorectal adenomas from patients of both genders. Samples were retrospectively selected from the files of both the Unit of Surgical Pathology of the Azienda Ospedaliero-Universitaria Pisana and the Humanitas Clinical and Research Center (Fig. 1).

Figure 1: Standard Hematoxylin and Eosin (H&E) staining was performed on microtome sections of early colorectal adenomas for histopathological examination. Histological diagnoses were formulated according to the 2010 World Health Organization (WHO) Classification (4th Edition). Two samples of tubular adenomas with low-grade dysplasia are shown.

DNA was isolated from each sample in order to amplify the coding sequence of the SMC1A gene, and PCR was then performed for each exon. Each amplicon was sequenced by the classical Sanger method. All identified variants were re-amplified and re-sequenced using both forward and reverse primers to exclude sequencing artifacts. Through this approach, we identified 11 somatic mutations in the SMC1A gene, a frequency of 22.9% (11 of 48) (Fig. 2).

Figure 2: Mutational screening in early colorectal adenomas allowed us to identify eleven SMC1A mutations. (A) Nucleotide change c.40T>C identified in patient 1. (B) Nucleotide change c.101delA identified in patient 2. (C) Nucleotide change c.620A>G identified in patient 3. (D) Nucleotide change c.734T>C identified in patient 4. (E) Nucleotide change c.1360A>C identified in patient 5. (F) Nucleotide change c.1957T>C identified in patient 6. (G) Nucleotide change c.2210T>C identified in patient 7. (H) Nucleotide change c.2479C>T identified in patient 8. (I) Nucleotide change c.2662A>G identified in patient 9. (J) Nucleotide change c.3106G>A identified in patient 10. (K) Nucleotide change c.3421C>T identified in patient 11.

SMC1A mutations were equally distributed between genders (five females and six males). Nine of them were missense mutations (c.40T>C, c.620A>G, c.734A>G, c.1360A>C, c.1957T>C, c.2210T>C, c.2662A>G, c.3106G>A and c.3421C>T) leading to amino acid changes, whereas two mutations (c.101delA and c.2479C>T) caused a premature stop codon (Tab. 1).

Table 1: Mutational screening in colorectal early adenomas allowed us to identify eleven SMC1A mutations equally distributed between genders. Nine are missense mutations, whereas two mutations caused a premature stop codon.

Next, we sequenced the SMC1A gene in 20 colorectal cancers in order to investigate whether SMC1A mutations also occur in cancer tissues. We identified only one mutation, the c.2027A>G missense mutation, which causes the D207G amino acid change. The frequency of SMC1A mutations in colorectal cancer is therefore 5% (1 of 20), in agreement with previous data (Barber et al., 2008). These data show that the mutation frequency of SMC1A is higher in precancerous lesions than in colorectal cancers. The effect of the mutations on the SMC1A protein was predicted using the Sorting Intolerant From Tolerant program (SIFT, http://sift-dna.org). Of the nine missense mutations identified, five were predicted to be not tolerated and damaging to protein activity (Tab. 2).
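
The two frequencies compared above follow directly from the sample counts given in the text (11 of 48 adenomas, 1 of 20 cancers). A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
def frequency(mutated, screened):
    """Percentage of screened samples carrying a mutation, to one decimal place."""
    return round(100 * mutated / screened, 1)

adenoma = frequency(11, 48)   # 11 mutated samples out of 48 adenomas
cancer = frequency(1, 20)     # 1 mutated sample out of 20 cancers
print(adenoma, cancer)        # 22.9 5.0
```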

Table 2: Effects of SMC1A missense mutations predicted by SIFT software.

Furthermore, the analysis of protein sequences from human and animal models (Pan troglodytes, Bos taurus, Sus scrofa, Canis lupus familiaris, Mus musculus, Macaca mulatta, Gallus gallus and Xenopus laevis), aligned by the ClustalW method (http://www.ebi.ac.uk/Tools/msa/clustalw2), showed that the mutations affect evolutionarily conserved amino acids (Fig. 3).

Figure 3: Two examples of sequence alignments from human and animal models (Patients 11 and 6). These alignments show that the SMC1A mutations affect evolutionarily conserved amino acids.

The reported mutations map along the entire length of the protein. However, most of them are located in the coiled-coil domains (Fig. 4).

Figure 4: Distribution of mutations along SMC1A protein.

To investigate whether mutated SMC1A protein interacts with the other cohesin subunits, we performed co-IP with RAD21 and SMC1A antibodies using early colorectal adenoma extracts from samples positive for SMC1A mutations. We found that both SMC1A and SMC3 coimmunoprecipitated with IP-RAD21 and both SMC3 and RAD21 were detected in IP-SMC1A. No SMC1A, SMC3 or RAD21 signals were detected in control western blotting using IgG-coated beads (Fig. 5).

Figure 5: Co-IP studies using RAD21 and SMC1A antibodies in early colorectal adenoma extracts.

To determine whether SMC1A mutations cause CIN and aneuploidy, we examined the effects of the identified mutations. In particular, we tested the mutation identified in colorectal cancer (c.2027A>G) and two SMC1A mutations detected in early colorectal adenomas (c.3421C>T and c.2479C>T). The first two mutations cause an amino acid change, whereas the last one introduces a premature stop codon. First, we performed site-directed mutagenesis to generate vectors carrying these mutations. All mutations were confirmed by direct sequencing (data not shown). To investigate the effects of the SMC1A mutations, primary human fibroblasts were transfected with the mutated vectors. As a control, we also transfected cells with a vector containing the wild-type sequence of SMC1A. Transfection efficiency was comparable among vectors carrying wild-type and mutant SMC1A and ranged from 80 to 90%. Quantitative PCR showed strong SMC1A expression 24 h after transfection (Fig. 6A). Overexpression of wild-type and mutated SMC1A proteins does not affect their incorporation into the cohesin complex, since co-IP experiments using anti-SMC1A antibody showed that SMC1A interacted with SMC3 in all the samples (Fig. 6B).

Figure 6: (A) SMC1A mRNA expression levels after transfection with SMC1A vectors (wild type, 2027, 2479 and 3421). (B) Mutated SMC1A proteins coimmunoprecipitated with SMC3 in all the samples (1 = wt SMC1A; 2 = 2027 SMC1A; 3 = 2479 SMC1A; 4 = 3421 SMC1A; 5 = positive control; - = IgG negative control).

Cells were treated with nocodazole to induce mitotic arrest, and Giemsa-stained metaphase spreads were analyzed in a blinded fashion (Fig. 7A). Transfection with mutated SMC1A vectors induced chromosome aneuploidy, although to different extents: the frequency of aneuploid cells ranged from 8% with c.2027A>G to 29% with c.2479C>T. Both untreated cells and cells transfected with the wild-type SMC1A vector showed 3% aneuploidy, suggesting that overexpression of wild-type SMC1A is not sufficient to induce CIN per se (Fig. 7B).
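
The aneuploidy frequency reported above is the share of scored metaphases whose chromosome number deviates from the normal human complement of 46. The sketch below shows this calculation for 100 metaphases. The per-metaphase counts are invented: only the total of 29 aneuploid cells mirrors the value reported for c.2479C>T, while the split between gains and losses is an assumption made for the example.

```python
def aneuploid_percentage(chromosome_counts, euploid=46):
    """Percentage of metaphases whose chromosome count differs
    from the euploid number."""
    aneuploid = sum(1 for n in chromosome_counts if n != euploid)
    return 100 * aneuploid / len(chromosome_counts)

# Invented counts for 100 metaphases: 71 euploid, 17 with a loss, 12 with a gain.
counts = [46] * 71 + [45] * 17 + [47] * 12
pct = aneuploid_percentage(counts)
print(pct)
```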

Figure 7: SMC1A mutations identified in early colorectal adenomas lead to CIN. (A) Examples of metaphase spreads showing chromosome gain or loss following transfection with a vector containing the c.2479C>T mutation. (B) Transfections led to high frequencies of aneuploid cells compared with control cells and cells transfected with the SMC1A wild-type vector.

To further characterize the effects of SMC1A mutations, we investigated the outcome of the c.101delA SMC1A mutation, which leads to a premature stop codon. Since this mutation causes the translation of a short protein (50 amino acids), it is likely that the mutation is not tolerated. We postulated that the c.101delA mutation could lead to haploinsufficiency by reducing the levels of wild-type protein. We therefore used an siRNA approach to decrease wild-type SMC1A protein in human fibroblasts. Downregulation of SMC1A was observed 48 h after siRNA treatment (Fig. 8).

Figure 8: SMC1A siRNA treated cells showed lower levels of SMC1A in comparison to mock cells. Actin was used as a loading control.

Cytogenetic analysis following siRNA treatment showed that SMC1A silencing led to chromosome aneuploidy in 32% of siRNA-treated cells versus 3% of mock cells (Fig. 9).

Figure 9: SMC1A silencing leads to high frequency of aneuploid cells in comparison to mock cells.

Furthermore, the analysis of anaphases in both siRNA-treated cells and cells transfected with mutated SMC1A vectors revealed that the frequency of abnormal anaphases was significantly higher than in control cells (Fig. 10).

Figure 10: Quantification of abnormal anaphases in both vector-transfected and siRNA-treated cells. ∗ P<0.05.

Next, we investigated additional markers of genomic instability to further corroborate the hypothesis that SMC1A mutations lead to CIN. In addition to chromosome gain and loss, micronucleus formation can arise as a consequence of missegregation. Imaging of both cells transfected with mutated SMC1A vectors and cells treated with SMC1A siRNA revealed the presence of abnormal figures, such as lobulated nuclei, micronuclei (Fig. 11A) and rare anucleated cells (Fig. 11B).

Figure 11: (A) Imaging of a lobulated nucleus with a micronucleus, and of micronuclei, in cells transfected with the 3421 vector. Nuclei were stained with propidium iodide. (B) Anucleated cell following SMC1A silencing. Cells were incubated with anti-tubulin antibody, and nuclei were stained with DAPI.

CdLS: looking for “new” or “old” causative gene(s)?

WES was performed on six unrelated CdLS probands negative for NIPBL, SMC1A, SMC3, RAD21 and HDAC8 mutations. All these patients present a mild phenotype. Genomic DNA was extracted from blood samples. WES and the subsequent bioinformatic analysis were performed by an external service, which also provided the lists of all the annotated variations for each patient. WES yielded thousands of variants per chromosome for each patient. Although all of the probands had been screened as negative for mutations in the known genes, we first looked at the CdLS causative genes NIPBL, SMC1A, SMC3, RAD21 and HDAC8. Unexpectedly, we found two de novo NIPBL mutations in two of our probands. The first was a missense mutation (R1789E; Fig. 12A), caused by a heterozygous deletion of two nucleotides in exon 28 followed by the insertion of two other nucleotides (c.5365_5366delCGinsGA). The second NIPBL mutation was a frameshift insertion of one base (c.7976_7977insT), leading to a premature stop codon (P2659fs; Fig. 12B). Direct Sanger sequencing was performed to confirm the mutations identified by WES. In addition, the probands’ parents were screened in order to confirm that these variations were de novo mutations.

Figure 12: (A) NIPBL missense mutation R1789E, caused by heterozygous deletion of two nucleotides and the following insertion of two other nucleotides (c.5365_5366delCGinsGA). (B) NIPBL insertion of one base (c.7976_7977insT), leading to an altered reading frame and a premature stop codon.

We then proceeded to analyze the four probands who were negative for the known CdLS causative genes. Because of the heterogeneity of CdLS, we analyzed each patient’s variant list individually. In the first step of data filtering we used both the dbSNP and HapMap databases to remove common variants (polymorphisms). Subsequently, we annotated the novel alterations representing potentially pathogenic mutations, focusing on nonsynonymous variants (NS), splice acceptor and donor site mutations (SS) and coding indels (I). Mendelian diseases are in most cases characterized by mutations in coding sequences, so in the preliminary analysis we excluded from the annotation the alterations occurring in intronic regions and in 3′ and 5′ untranslated regions. By this approach, we found about 200 novel variants for each patient. Genes harbouring these variations were collected and classified by their function. Interestingly, we found a frameshift insertion of one base (c.4401_4402insA) in the EP300 gene, leading to a premature stop codon (Y1467fs; Fig. 13). The mutation was found in a girl showing a CdLS-like phenotype. EP300 mutations have been found to be causative for Rubinstein-Taybi syndrome (RTS) (Roelfsema et al., 2005). Direct Sanger sequencing confirmed the mutation in the patient, whereas it was absent in her parents.

Figure 13: EP300 insertion of one base (c.4401_4402insA), leading to an altered reading frame and a premature stop codon.

We did not identify any disease-associated mutation in the remaining three patients. Therefore, all genes harboring identified variants in these three patients were classified by their function. We selected 75 genes as the most relevant in terms of biological function. First we confirmed the mutations in each patient by Sanger sequencing, then we performed the same screening on their parents to select only de novo mutations. Overall, we found 7 de novo mutations. Proband A carries 2 frameshift deletions, 1 nonsense and 1 missense mutation; proband B carries 1 missense and 1 nonsense mutation; and proband C carries only 1 missense mutation (Fig. 14). These mutations are currently under investigation to determine whether they could be causative for CdLS.
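
The selection of de novo mutations described above amounts to keeping the variants confirmed in the proband that are found in neither parent. A minimal sketch of that comparison follows; the function name and the variant labels are hypothetical placeholders, not calls from the study.

```python
def de_novo(proband, mother, father):
    """Variants present in the proband but in neither parent."""
    return sorted(set(proband) - set(mother) - set(father))

# Placeholder variant labels, invented for illustration.
proband = {"GENE_A:c.10del", "GENE_B:c.25C>T", "GENE_C:c.40A>G"}
mother = {"GENE_B:c.25C>T"}
father = {"GENE_C:c.40A>G"}
result = de_novo(proband, mother, father)
print(result)
```

In this toy trio only the variant absent from both parents is retained; inherited variants are discarded regardless of which parent carries them.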

Figure 14: 75 genes were selected from the WES analysis of three CdLS probands negative for the known disease-causative genes. The 7 de novo mutations found are currently under investigation.

CdLS: genotype-phenotype correlation

In order to properly define a CdLS genotype–phenotype correlation, we collected 326 CdLS-causing mutations in NIPBL, SMC1A, SMC3, RAD21 and HDAC8, including missense, nonsense, small deletions and insertions, splice site mutations and genomic rearrangements. Information on NIPBL, SMC1A, SMC3, RAD21 and HDAC8 mutations was derived from public databases (dbSNP, www.ncbi.nlm.nih.gov, and LOVD, www.LOVD.nl) and published reports (Fig. 15).

Figure 15: Phenotypic characteristics of CdLS caused by mutations in different cohesin regulatory and structural components. (A–D) 28-year-old girl with truncating mutation in NIPBL. (E–H) 7-year-old boy with missense mutations in NIPBL. (I–L) 3-year-old girl with in-frame insertion/deletion mutation of HDAC8. (M and N) 15-year-old girl with missense mutation in SMC1A. (O and P) 3-year-old boy with deletion of RAD21. (Q–U) 57-year-old man with in-frame deletion of SMC3 (shown as a teenager in “Q”).

Of the 278 NIPBL heterozygous mutations identified in CdLS probands, 216 fall in coding sequences, 45 in noncoding regions, and 17 involve gross genomic alterations. Most of the mutations are nonsense, splice site or frameshift mutations that result in a predicted truncated protein, which presumably leads to a partial reduction in NIPBL production and thus to haploinsufficiency (Tab. 3). The 24 SMC1A mutations, all falling within the coding region of the gene, consist of 19 missense mutations and 5 small in-frame deletions. Until recently, the only mutation identified in SMC3 was a small in-frame deletion of 3 bp in a male proband. However, 15 new SMC3 mutations have recently been reported in specific cohorts of CdLS and CdLS-like phenotypes (Ansari et al., 2015; Gil-Rodríguez et al., 2015). All of these novel SMC3 mutations are missense mutations or in-frame indels (with the exception of a single nonsense mutation), and they did not significantly change the frequency of SMC3 mutations among CdLS probands (~1%). Three mutations have been identified in RAD21: 2 missense mutations and 1 microdeletion that includes the RAD21 gene. Finally, mutations in the HDAC8 gene (5 missense and 1 nonsense) have been identified in six CdLS probands.

Table 3: Type and Number of NIPBL Mutations Identified in CdLS Probands.

Mutational data analysis demonstrates an NIPBL genotype–phenotype correlation. Most NIPBL mutations lead to a partial reduction in NIPBL production. This notion is supported by the identification of gross deletions leading to the loss of the entire NIPBL region (Hulinsky et al., 2005; Russo et al., 2012). Truncated and presumably non-functional NIPBL proteins are associated with a more severe phenotype characterized by typical facial features, severe-to-profound developmental and cognitive delay with lack of meaningful communication, severe growth retardation, and structural abnormalities of the limbs and other organs. Missense mutations are, in general, associated with a milder phenotype characterized by the absence of limb abnormalities and by less severe developmental and growth involvement (Fig. 16). The clinical phenotype of probands with gross genomic rearrangements correlates with the size of the rearrangement and the number of exons involved, suggesting that the loss of specific and larger coding sequences in NIPBL could have a specific and additive effect on phenotypic severity. However, there are a few notable exceptions. Missense mutations and in-frame deletions involving the NIPBL HEAT domain, in particular the H2–H4 repeats, have been identified in probands with phenotypes ranging from moderate to severe, with limb reduction, severe cognitive impairment and growth retardation. This suggests that the HEAT domains are critical for protein function, and that mutations there likely affect NIPBL’s interaction with other proteins and/or chromatin (Jahnke et al., 2008).

Figure 16: Genotype–phenotype correlations in CdLS due to the type of NIPBL mutations.

The clinical picture of CdLS probands carrying SMC mutations is more homogeneous and is characterized by a mild to moderate phenotype, more similar to that of NIPBL-mutated probands with missense changes. The great majority of SMC mutations are missense or in-frame indels and fall in the coiled-coil domains. SMC-mutated probands show mild to moderate cognitive impairment and facial features that differ from classical forms of CdLS, with a tendency toward normal birth weight and head circumference and a lack of gross structural anomalies of the limbs. No mutations have been identified in the functional hinge domain, with the exception of codon 496 of SMC1A, which is located in the hinge/coiled-coil transition region (Mannini et al., 2012). This evidence suggests that mutations in the hinge domain are not tolerated, at least in male cells, which have only a single copy of SMC1A, and are negatively selected because no cohesin complex could be formed. Mutations in RAD21 cause a human cohesinopathy that overlaps with the SMC-mutated CdLS phenotype. Common features include short stature, synophrys, micrognathia and brachydactyly, but there are some facial differences and cognitive involvement is remarkably mild. Similarly to NIPBL haploinsufficiency, HDAC8 mutations result in a modest reduction of transcription. Individuals with HDAC8 mutations display clinical features that overlap to some extent with classical forms of CdLS; however, like their SMC-mutant counterparts, they do not display gross limb anomalies. In summary, analysis of the inter- and intragenic mutational spectrum seen in CdLS probands reveals an evolving picture of genotype–phenotype correlations (Fig. 17).

Figure 17: Diagram representing the genotype–phenotype correlation for the five CdLS causative genes.
