CD4+ T cell differentiation explored through Markov process

Academic year: 2023

Università degli Studi di Napoli Federico II

PhD Programme in Physics

Cycle XXIX

Coordinator: Prof. Salvatore Capozziello

CD4+ T cell differentiation explored through Markov process

Scientific Disciplinary Sector FIS/02

PhD candidate: Andrea Piccolo

Supervisor: Prof. Mario Nicodemi

If someone tells you that the universe is a machine, think how beautiful it is to pretend, at least for an instant, that it is not.

(Luther Blisset, Lettera al figlio sull’utilità della scuola)

Contents

1 Introduction

2 Cell transformations: basic concepts

3 Th2 differentiation: our first results

3.1 Th2 differentiation resolved by cell generation

3.2 RNA-seq analysis indicates three major states

4 ABC Model

5 Results

5.1 Single cell fate: mathematical modelling of three cell states quantifies the link between acceleration of proliferation and differentiation

5.2 Asymmetric division and robustness of the model

5.3 Validating the A, B and C cell states and parameters by expression profiling

5.4 Single-cell RNA-seq links CD4+ T-cell division rates to differentiation state in an in vivo Th1 infection model

6 Summary and Conclusion

Chapter 1

Introduction

Cell transformations and changes in cell state affect a number of biological processes and have significant implications, for instance, in medical research. Examples include stem-cell reprogramming, differentiation and carcinogenesis. These are complex processes characterized by coordinated alterations in the expression of thousands of genes and proteins. Although much progress has been made in understanding these phenomena, an accurate and more complete comprehension of them is still in its infancy.

In this thesis I discuss a project carried out in collaboration with Professor Sarah Teichmann’s group (MRC Laboratory of Molecular Biology and EMBL-EBI, Cambridge). We have been working on the transformation of naïve cells into Th2 cells, using data derived for this purpose in Teichmann’s lab. Here we investigate population data on cells at different stages of differentiation (FACS data), which contain a wealth of information on the microscopic nature of the events that lead naïve cells to become Th2 cells.

Many differentiation processes occur hand-in-hand with a change in cell cycle status: this can be cell cycle arrest, as in the monocyte to macrophage transition [35], cell cycle entry, as for the pre-adipocyte to adipocyte differentiation [2], or entry and subsequent cell division, as in T helper (Th) cell differentiation [3].

Th cell differentiation is the process by which naïve CD4+ T cells transition to effector lymphocytes, and it is central to mammalian adaptive immunity. After antigen stimulation of the T-cell receptor in the presence of specific cytokines, naïve Th cells start dividing rapidly to reach a differentiated state, the best understood being Th1, Th2, Th17 and pTregs [4]. So far, several master regulators have been identified (e.g. Gata3 for Th2, Tbet for Th1, Rorgt for Th17 and Foxp3 for pTregs) [4] and there is considerable insight into their regulatory networks [5].

While much is known about CD8+ (killer) T cells [6], the expansion of CD4+ (helper) T cells during an infection is less well understood at the cellular and molecular levels.

How does the coupling between differentiation and the cell cycle occur in CD4+ T cells? Are the two processes independent and orthogonal, as suggested by Duffy and Hodgkin [7], or linked through molecules and hence intertwined [8]? Does differentiation occur in a gradual manner, as suggested by many studies, including a recent single-cell analysis of lung epithelial development [9], or in a cooperative, switch-like manner?

Here, we use a new approach to tackle these questions: we extract biologically intermediate states of differentiation from a single chronological time point. By sorting separate cell populations from a single culture of asynchronous, dividing cells, we aimed to reduce the biological variability in cytokine exposure, confluence, etc. With this approach we minimize the biological noise in our data and focus on the processes of cell division and differentiation.

We used in-depth transcriptome profiling coupled with bioinformatics data analysis to identify three major cell states during Th2 differentiation. By counting cells in each cell generation using flow cytometry, we modelled the rates of death, division and differentiation using a discrete-time Markov branching process. This model gives information about which kinds of transition are most likely to happen at the single-cell level; for example, it can tell whether an activated (not yet differentiated) cell is more likely to give rise to differentiated cells through a division or directly. The model revealed a higher cell division rate for differentiated cells compared with proliferating, activated cells. We validated these findings by DNA staining and by single-cell live imaging of Th2 cells. These in vitro data supported the idea of a fine-tuned relationship between cell cycle speed and differentiation status in CD4+ T cells.

Finally, we related our findings from an ex vivo cell culture model of Th2 differentiation to single-cell transcriptomes of Th1 cells from a mouse model of malaria infection. The in vivo cytokine-secreting Th1 cells also cycle more quickly than in vivo activated cells, showing the universal relevance of our results to primary activation of T cells. This implies that an acceleration of effector CD4+ T cell expansion upon differentiation is part of the immune system’s mechanism of pathogen clearance during primary activation.

This thesis has been written on the basis of the article [1] written together with Valentina Proserpio at EMBL European Bioinformatics Institute (EBI) of Cambridge.

Chapter 2

Cell transformations: basic concepts

When we talk about cell transformation, we refer to processes in which a cell changes from one type to another. In this thesis we focus on differentiation, a transformation process in which a stem cell differentiates into a specific cell type. However, there are many kinds of cell transformation, such as reprogramming and carcinogenesis. A key role in such transformations, especially in differentiation and reprogramming, is played by cell potency. Here we briefly discuss cell potency, differentiation, reprogramming and carcinogenesis, to give the general background needed to understand the biology behind this thesis, which focuses on mathematical and statistical rather than biological questions.

Cell Potency Cell potency is a general term that describes a stem cell’s ability to differentiate into different cell types. The more cell types a stem cell can differentiate into, the greater its potency. Potency is also described as the gene activation potential within a cell; it forms a continuum that begins with totipotency, which designates a cell with the greatest differentiation potential, and continues through pluripotency, multipotency and oligopotency to unipotency. The word comes from the Latin potens, which means “having power”.

Figure 2.1 – Stem cell hierarchy and multipotency of neural stem cells. This schematic illustration depicts the generation of differentiated cell types from multipotent neural stem cells. Different specialized multipotent stem cells are produced from pluripotent embryonic stem cells. Multipotent neural stem cells produce the three lineages: neurons, oligodendrocytes, and astrocytes. In this schematic illustration, the curved arrows represent self-renewing ability. [Diagram labels: totipotent cell; pluripotent stem cell; multipotent neural stem cell; other multipotent stem cell; neural progenitor; glial progenitor; neural precursor; oligodendrocyte precursor; astrocyte precursor; neurons; astrocytes; oligodendrocytes.]

Figure 2.1 shows an example of the generation of differentiated cell types from multipotent neural stem cells.

Totipotency is the ability of a single cell to divide and produce all of the differentiated cells in an organism; examples of totipotent cells are spores and zygotes. In the spectrum of cell potency, totipotency represents the cell with the greatest differentiation potential. Toti comes from the Latin totus, which means “entirely”. It is possible for a fully differentiated cell to return to a state of totipotency. This conversion to totipotency is complex, not fully understood and the subject of recent research. Research in 2011 has shown that cells may differentiate not into a fully totipotent cell, but instead into a “complex cellular variation” of totipotency [36].

In cell biology, pluripotency (from the Latin plurimus, meaning very many, and potens, meaning having power) refers to a stem cell that has the potential to differentiate into any of the three germ layers: endoderm (interior stomach lining, gastrointestinal tract, the lungs), mesoderm (muscle, bone, blood, urogenital), or ectoderm (epidermal tissues and nervous system).

Induced pluripotent stem cells, commonly abbreviated as iPS cells or iPSCs, are a type of pluripotent stem cell artificially derived from a non-pluripotent cell, typically an adult somatic cell, by inducing a “forced” expression of certain genes and transcription factors. These transcription factors play a key role in determining the state of these cells, and they also highlight the fact that somatic cells preserve the same genetic information as early embryonic cells. Because of their great similarity to embryonic stem cells (ESCs), iPSCs have been of great interest to the medical and research community. iPSCs could potentially have the same therapeutic implications and applications as ESCs but without the controversial use of embryos in the process, a topic of great bioethical debate. In fact, the induced pluripotency of somatic cells into undifferentiated iPS cells was originally hailed as the end of the controversial use of embryonic stem cells. However, iPSCs were found to be potentially tumorigenic and, despite advances, were never approved for clinical stage research [37].

Multipotency describes progenitor cells which have the gene activation potential to differentiate into multiple, but limited, cell types. For example, a multipotent blood stem cell is a hematopoietic cell, which can differentiate into several types of blood cell, such as lymphocytes, monocytes and neutrophils, but cannot differentiate into brain cells, bone cells or other non-blood cell types.

In biology, oligopotency is the ability of progenitor cells to differentiate into a few cell types. It is a degree of potency. Examples of oligopotent stem cells are the lymphoid or myeloid stem cells. A lymphoid cell, specifically, can give rise to various blood cells such as B and T cells, but not to a different blood cell type such as a red blood cell.

Examples of progenitor cells are vascular stem cells, which have the capacity to become either endothelial or smooth muscle cells.

In cell biology, unipotency is the capacity of a stem cell to differentiate into only one cell type. It is currently unclear whether true unipotent stem cells exist. Hepatoblasts, which differentiate into hepatocytes (which constitute most of the liver) or cholangiocytes (epithelial cells of the bile duct), are bipotent. A close synonym for unipotent cell is precursor cell.

Differentiation In developmental biology, cellular differentiation is the process by which a less specialized cell becomes a more specialized cell type. Differentiation occurs numerous times during the development of a multicellular organism as the organism changes from a simple zygote to a complex system of tissues and cell types.

Differentiation is a common process in adults as well: adult stem cells divide and create fully differentiated daughter cells during tissue repair and during normal cell turnover.

Differentiation dramatically changes a cell’s size, shape, membrane potential, metabolic activity, and responsiveness to signals. These changes are largely due to highly controlled modifications in gene expression. With a few exceptions, cellular differentiation almost never involves a change in the DNA sequence itself. Thus, different cells can have very different physical characteristics despite having the same genome.

Reprogramming Cellular reprogramming describes the process in which a fully differentiated, specialized cell type is induced to transform into a different cell type that it would not otherwise become under normal physiological conditions. Cellular reprogramming has been achieved using a variety of methods, including somatic cell nuclear transfer, cell-cell fusion and, most recently, the introduction of four transcription factors. Most scientists have focused on reprogramming somatic cells into pluripotent stem cells, but recently some researchers have begun to focus on reprogramming somatic cells into multipotent stem cells, which have a more restricted developmental potential and are closer to the cell population the researcher ultimately wants to engineer.

Carcinogenesis Carcinogenesis or oncogenesis or tumorigenesis is literally the creation of cancer. It is a process by which normal cells are transformed into cancer cells. It is characterized by a progression of changes at the cellular, genetic and epigenetic level that ultimately reprogram a cell to undergo uncontrolled cell division, thus forming a malignant mass.

Naïve T cell

A T cell, also called a T lymphocyte, is a type of leukocyte (white blood cell) that is an essential part of the immune system. A naïve T cell is a mature T cell that has not yet encountered its cognate antigen. T cells are one of two primary types of lymphocytes (B cells being the second type) that determine the specificity of the immune response to antigens (foreign substances) in the body. T cells originate in the bone marrow and mature in the thymus. In the thymus, T cells multiply and differentiate into helper, regulatory, or cytotoxic T cells or become memory T cells. They are then sent to peripheral tissues or circulate in the blood or lymphatic system.

T helper cells (Th cells) assist other white blood cells in immunologic processes, including maturation of B cells into plasma cells and memory B cells, and activation of cytotoxic T cells and macrophages. These cells are also known as CD4+ T cells (cluster of differentiation 4) because they express the CD4 glycoprotein on their surfaces. Helper T cells become activated when they are presented with peptide antigens by MHC class II molecules, which are expressed on the surface of antigen-presenting cells (APCs). Once activated, they divide rapidly and secrete small proteins called cytokines that regulate or assist in the active immune response. In particular, IL-13 is a cytokine secreted by many cell types, but especially by T helper type 2 (Th2) cells. IL-13 has effects on immune cells that are similar to those of the closely related cytokine IL-4. Helper T cells can differentiate into one of several subtypes, including Th1, Th2, Th3, Th17, Th9, or TFh, which secrete different cytokines to facilitate different types of immune responses.

Cytotoxic T cells (TC cells, CTLs, T-killer cells, killer T cells) destroy virus-infected cells and tumor cells, and are also implicated in transplant rejection. These cells are also known as CD8+ T cells since they express the CD8 glycoprotein at their surfaces. These cells recognize their targets by binding to antigen associated with MHC class I molecules, which are present on the surface of all nucleated cells. Through IL-10, adenosine, and other molecules secreted by regulatory T cells, the CD8+ cells can be inactivated to an anergic state, which prevents autoimmune diseases.

Regulatory T cells (suppressor T cells) are crucial for the maintenance of immunological tolerance. Their major role is to shut down T cell-mediated immunity toward the end of an immune reaction and to suppress autoreactive T cells that escaped the process of negative selection in the thymus. Suppressor T cells along with helper T cells can collectively be called regulatory T cells due to their regulatory functions.

Memory T cells are a subset of infection-fighting, and potentially cancer-fighting, T cells that have previously encountered and responded to their cognate antigen; thus, the term antigen-experienced T cell is often applied. Such T cells can recognize foreign invaders, such as bacteria or viruses, as well as cancer cells. Memory T cells have become “experienced” by having encountered antigen during a prior infection, encounter with cancer, or previous vaccination. At a second encounter with the invader, memory T cells can reproduce to mount a faster and stronger immune response than the first time the immune system responded to the invader.

RNA sequencing technology

RNA-Seq (RNA sequencing) technology makes it possible to discover and profile the transcriptome of any organism. RNA-seq (Wang 2009) is rapidly replacing gene expression microarrays in many labs, and allows RNAs to be quantified, discovered and profiled. In this technique, mRNA (and other RNAs) are first converted to cDNA, which is then used as the input for next-generation sequencing library preparation. Specifically, RNA-Seq makes it possible to examine alternatively spliced transcripts, post-transcriptional modifications, gene fusions, mutations/SNPs and changes in gene expression over time, or differences in gene expression between groups or treatments.

In addition to mRNA transcripts, RNA-Seq can examine different populations of RNA, including total RNA and small RNA such as miRNA and tRNA, and can be used for ribosomal profiling. Single-cell sequencing examines the sequence information from individual cells with optimized next-generation sequencing (NGS) technologies, providing a higher resolution of cellular differences and a better understanding of the function of an individual cell in the context of its microenvironment.

Figure 2.2 – Principal steps for RNA sequencing technology.

Briefly, long RNAs are first converted into a library of cDNA fragments through either RNA fragmentation or DNA fragmentation (Figure 2.2). Sequencing adaptors are subsequently added to each cDNA fragment and a short sequence is obtained from each cDNA using high-throughput sequencing technology. The resulting sequence reads are aligned to the reference genome or transcriptome and used to generate a base-resolution expression profile for each gene. Three principal units of measure are used for gene expression in RNA-seq experiments: RPKM (Reads Per Kilobase of transcript per Million mapped reads), FPKM (Fragments Per Kilobase of transcript per Million mapped reads) and TPM (Transcripts Per Million).
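To make the difference between these units concrete, the following minimal Python sketch computes RPKM and TPM from raw read counts and gene lengths. The function names and the toy numbers are illustrative assumptions, not part of the thesis pipeline; FPKM follows the same formula as RPKM with fragments counted instead of reads.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(counts)
    return [
        c / (length / 1e3) / (total_reads / 1e6)
        for c, length in zip(counts, lengths_bp)
    ]


def tpm(counts, lengths_bp):
    """Transcripts per million: normalise by length first, then scale to 1e6."""
    rates = [c / (length / 1e3) for c, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    return [r / total_rate * 1e6 for r in rates]


# Hypothetical toy data: read counts and gene lengths (bp) for three genes.
counts = [100, 200, 700]
lengths_bp = [1000, 2000, 7000]

rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)
```

By construction, TPM values always sum to one million within a sample, which makes them easier to compare across samples than RPKM or FPKM.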

FACS

In biotechnology, flow cytometry is a laser- or impedance-based biophysical technology employed in cell counting, cell sorting, biomarker detection and protein engineering, in which cells are suspended in a stream of fluid and passed through an electronic detection apparatus. Fluorescence-activated cell sorting (FACS) is a specialized type of flow cytometry. It provides a method for sorting a heterogeneous mixture of biological cells into two or more containers, one cell at a time, based upon the specific light scattering and fluorescent characteristics of each cell. It is a useful scientific instrument as it provides fast, objective and quantitative recording of fluorescent signals from individual cells as well as physical separation of cells of particular interest. In our work, staining precursor naïve cells with CellTrace Violet dye allowed us to discriminate, count and profile cells that have undergone different numbers of cell divisions, while primary cells derived from Il13-eGFP homozygous reporter mice allowed us to identify differentiated Th2 cells.

Chapter 3

Th2 differentiation: our first results

3.1 Th2 differentiation resolved by cell generation

In Vivo After antigen stimulation of the T-cell receptor [10], naïve CD4+ T cells start dividing quickly and some cells initiate expression of cytokines, which is the hallmark of differentiated effector cells. To probe this process in vivo, we isolated and sequenced CD3+/CD4+/CD62L- single cells from the spleen and both mediastinal and mesenteric lymph nodes of Nippostrongylus brasiliensis (Nb)-infected mice 5 days post-infection (Figure 3.1). To remove cells with poor-quality libraries, we performed a quality control analysis and retained data from 78 cells. To separate the fast-cycling cells from the slow-cycling ones, we clustered them according to the expression of cell cycle genes (Figure 3.2). We ranked the cells according to the aggregate expression of G2/M genes, used as a “cell cycle score” that reflects the speed of the cell cycle [1]. We observed that the cells expressing higher amounts of G2/M genes were also significantly enriched in interleukin 4 (IL4) expression (p-value = 0.008, Fisher’s exact test). To verify that those G2/M-high cells were proliferating faster and were enriched in IL4 expression, we looked at the expression level of proliferation marker genes [11, 12]. The cells enriched in those genes also expressed significantly higher amounts of IL4 (p-value = 0.001, Fisher’s exact test), confirming that cytokine-producing cells are cycling faster.

Based on this observation, we proceeded to study the link between cell cycle speed and differentiation in Th2 cells in more detail in an in vitro cell culture system.
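As an illustration of the kind of enrichment test quoted above, the following sketch implements a one-sided Fisher’s exact test on a 2x2 table using only the Python standard library. The group sizes are not reported in this section, so the counts below are hypothetical and are not meant to reproduce the p-values in the text; whether the original analysis was one- or two-sided is also an assumption.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a), where X is hypergeometric: the number of
    marker-positive cells that fall in the first group when the row and
    column totals are held fixed.
    """
    row1 = a + b        # cells in group 1 (e.g. G2/M-high)
    col1 = a + c        # marker-positive cells (e.g. IL4+)
    n = a + b + c + d   # all cells
    upper = min(row1, col1)
    favourable = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return favourable / comb(n, col1)


# Hypothetical counts, NOT the thesis data: 3 of 10 "high" cells are
# positive and 0 of 68 "low" cells are positive (78 cells in total).
p_value = fisher_exact_greater(3, 7, 0, 68)
```

With very few positive cells, as here, an exact test is preferable to a chi-squared approximation because the expected counts in some cells of the table are far below five.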

Figure 3.1 – Overview of the experiment. CD3+/CD4+/CD62L- T helper cells were isolated from lungs, mesenteric and mediastinal lymph nodes of Nb-infected mice on day 5 post-infection. After cell capture and cDNA generation with the C1 system, samples were sequenced with an Illumina HiSeq sequencer.

Figure 3.2 – Seventy-eight single cells were clustered according to the expression of G2/M genes (logTPM) as a measure of cell cycle speed. TPM, transcripts per million. Three cells expressing IL4 clustered within the group of cells expressing high levels of G2/M genes (in the red box) (p-value = 0.008, Fisher’s exact test).

In Vitro - Flow Cytometry As a marker of differentiation we employed IL13 instead of IL4, as its expression is less susceptible to changes in IL4 concentration in the medium. Staining precursor naïve cells with CellTrace Violet dye allowed us to discriminate cells that have undergone different numbers of cell divisions (Figure 3.3). Primary cells derived from IL13-eGFP homozygous reporter mice allowed us to identify differentiated Th2 cells [11]. Using this system, we observe, consistent with previously published data for other cytokines [8, 12], that the proportion of differentiated cells (with fluorescent IL13+ reporter expression) increases linearly in each consecutive generation (Figure 3.4). In previous reports, cytokine-producing cells have been detected only from the third generation onwards [8], while we detected these cells already in earlier generations. This is probably due to our use of a green fluorescent protein (GFP) reporter for the endogenous cytokine, instead of the traditional staining with fluorescent antibody.
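The logic of dye-dilution gating can be sketched in a few lines of Python. The sketch assumes that the CellTrace Violet signal is halved exactly at each division, so that the generation number is the base-2 logarithm of the ratio between the undivided and the measured intensity; real gating is done on the histogram peaks, and the function names and measurements below are hypothetical.

```python
from math import log2


def generation_from_dye(intensity, undivided_intensity):
    """Estimate the number of divisions from the dye signal, assuming the
    dye is split evenly between daughter cells at each division."""
    if intensity <= 0 or undivided_intensity <= 0:
        raise ValueError("intensities must be positive")
    return max(0, round(log2(undivided_intensity / intensity)))


def positive_fraction_by_generation(cells, undivided_intensity):
    """cells: iterable of (dye_intensity, is_gfp_positive) pairs.

    Returns {generation: fraction of GFP-positive cells in that generation}.
    """
    totals, positives = {}, {}
    for intensity, is_positive in cells:
        g = generation_from_dye(intensity, undivided_intensity)
        totals[g] = totals.get(g, 0) + 1
        positives[g] = positives.get(g, 0) + int(is_positive)
    return {g: positives[g] / totals[g] for g in sorted(totals)}


# Hypothetical measurements (arbitrary fluorescence units), not thesis data.
cells = [
    (1000, False),
    (510, False), (490, True),
    (260, True), (240, False),
    (120, True),
]
fractions = positive_fraction_by_generation(cells, undivided_intensity=1000)
```

Plotting such per-generation fractions against the generation number is what reveals whether the proportion of reporter-positive cells grows linearly, as described above.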

To dissect whether sudden or gradual changes in cell state occur during Th2 differentiation, we performed a transcriptome-wide characterization of cells that had undergone different numbers of mitotic divisions after 3.5 days of activation. From this single time point, we sorted and carried out mRNA-sequencing (mRNA-seq) of three non-overlapping populations of cells that were not expressing GFP and had, respectively, not divided (generation 0 negative, G0N), divided twice (G2N) or divided four times (G4N). We also profiled a fourth population of cells that had divided four times and was positive for GFP (G4P; Figure 3.5).

Hierarchical clustering of these datasets indicates that there are three major states. The G0N cluster is clearly separate from the other populations, indicating a major difference between cells that have not yet undergone mitosis and cells that have entered the cell cycle. The G4P cluster is more distant from the other two dividing populations, indicating that the expression of a single marker of differentiation (IL13) occurs concomitantly with global changes in the expression profile of growing lymphocytes. In contrast, G2N and G4N cluster together, sharing similar expression profiles. These results are supported and validated by quantitative PCR (qPCR) and flow cytometry of individual genes and proteins [1].

Figure 3.3 – Schematic representation of the division/differentiation process from a naïve cell to fully differentiated Th2 cells. The CellTrace Violet content is roughly equally distributed between daughter cells after each mitotic division. Cells expressing the Il13-eGFP Th2 differentiation marker are shown in green. TCR, T-cell receptor.

Figure 3.4 – Flow cytometry plot of CellTrace Violet versus Il13-eGFP for differentiated Th2 cells at day 3.5. Consecutive generations (from G0 to G5) are visualized as pink gates. The upper gates contain IL13-positive cells (P), and the lower gates contain IL13-negative cells (N). Ratio of GFP-positive cells to the total number of cells per generation (average and standard deviation of three biological replicates).

Figure 3.5 – Cells in the highlighted gates were sorted by FACS and profiled by mRNA-sequencing. Hierarchical clustering of the distance matrices between RNA expression profiles.

3.2 RNA-seq analysis indicates three major states

Deep transcriptomic analysis reveals three discrete cell states during Th2 differentiation. While some groups of genes increase or decrease apparently continuously across the four RNA-seq data sets from G0N to G2N, G4N and G4P, there are also groups of genes that have non-monotonic patterns of expression (Figure 3.6). Therefore, from these patterns alone it is unclear whether differentiation occurs through a single gradual progression or via discrete intermediate states.

Figure 3.6 – Heatmap of all ~14,000 protein-coding genes (rows) per generation (columns). At the top and bottom, genes with a monotonic increase/decrease are shown. In the middle, genes are ranked according to distance from G4P and G0N.

We analyzed differentially expressed genes (DEGs) between subgroups and found roughly 1500 DEGs between G0N and G2N and between G4N and G4P, but only 170 between G2N and G4N (Figure 3.7). Gene Ontology (GO) enrichment analysis (Figure 3.8) showed that G0N-G2N DEGs are enriched in “ATP biosynthetic process”, “Mitosis” and “Cell Cycle”. The majority of these genes (70 to 85%) are up-regulated in G2N versus G0N, providing further confirmation that G0N cells are not actively proliferating.

At the same time, the high expression of the activation marker Cd69 [13], the levels of L-selectin (Cd62l, Sell) and Cd44 [14] in G0N cells, and the increase in size of some of these cells indicate that they have been partially activated and so are no longer naïve cells.

Figure 3.7 – Number and percentage of differentially expressed genes between samples.

Figure 3.8 – Gene Ontology enrichment analysis was performed on differentially expressed genes between G0N and G2N (gray bars) and G4N and G4P (green bars). The threshold p-value of 0.05 is shown as a dotted line.

Our GO analysis of DEGs between G4N and G4P indicated that the terms “Regulation of cytokine secretion”, “Cytokine activity” and “Regulation of T cell proliferation” represent the main categories of genes that are specifically differentially expressed together with IL13 transcription. This means that the expression of IL13 coincides with the expression of other genes important for Th2 function (IL3, IL4, IL5, IL6 [1]).

We also analysed the expression changes of genes belonging to four important categories: signaling ligands (SLs), surface molecules (SMs), transcription factors (TFs) and cell cycle genes (Figure 3.9).

Figure 3.9 – Expression changes of genes belonging to four important categories: signaling ligands (SLs), surface molecules (SMs), transcription factors (TFs) and cell cycle genes.

Among the upregulated TFs we found some genes for which a role in Th2 differentiation has already been demonstrated (Gata3, Batf3, Epas1 [15]) and some whose function still remains to be elucidated. In the SM group we observed induction of IL2ra and IL7r, which are known to be involved in lymphocyte differentiation [16]. Moreover, the strong downregulation of Ifngr1 observed in conjunction with cell activation is consistent with previous reports [17]. Overall, the vast majority (~75%) of the differentially expressed SLs are upregulated from one generation to the next.

Within the upregulated genes, we calculated the average Z-score across conditions for each of the four populations (Figure 3.10). TFs and SMs are promptly upregulated soon after entering the cell cycle (G2N) and no further increase is detected after further cell division. SMs show a second prominent increase in their expression in cells in which IL13 is also expressed. Entering the cell cycle has no notable effect on cytokine expression levels, and only the production of IL13 correlates with the expression of all SLs. Importantly, all the Th2-specific cytokines follow the same pattern of expression: they remain lowly expressed after entry into the cell cycle and undergo a sharp boost only from G4N to G4P.

Figure 3.10 – Average Z-scores for upregulated genes belonging to different functional categories, calculated from the heatmaps in Figure 3.9.
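The averaging behind such a summary can be sketched as follows: each gene is standardised across the four populations, and the resulting Z-scores are then averaged over the genes of a category. This is a minimal sketch under stated assumptions: the gene names and expression values are made up, and the use of the population standard deviation is a choice made here, not one documented in the text.

```python
from statistics import mean, pstdev

CONDITIONS = ["G0N", "G2N", "G4N", "G4P"]


def z_scores(values):
    """Standardise one gene's expression values across conditions."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def average_z_by_condition(expression):
    """expression: {gene: [value per condition]} -> {condition: mean Z-score}."""
    per_gene = [z_scores(values) for values in expression.values()]
    return {
        cond: mean(z[i] for z in per_gene)
        for i, cond in enumerate(CONDITIONS)
    }


# Hypothetical expression values (e.g. log TPM) for two made-up genes.
expression = {
    "gene_a": [1.0, 3.0, 3.0, 5.0],
    "gene_b": [0.0, 2.0, 2.0, 4.0],
}
avg_z = average_z_by_condition(expression)
```

Standardising per gene before averaging prevents a few highly expressed genes from dominating the category-level trend.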

The expression level of cell cycle genes is strongly upregulated from G0N to G2N, as expected by definition for these two subpopulations. More interestingly, we observed a second sharp increase in the expression of cell cycle genes in G4P cells, emphasizing the concomitant upregulation of cell cycle and differentiation genes, as we already observed from ex vivo Th2 single-cell RNA-seq. Finally, the expression of the Th2 signature increases from G0N to the consecutive negative generations and further increases from G4N to G4P (Figure 3.11). This group of genes includes most of the genes with a role in Th2 specification (Ecm1, IL24, Batf, IL10, Nfil3, Gata3, IL4ra) [18, 19, 20].

Figure 3.11 – Heatmap of Th2 signature gene expression across generations.

In summary, these data suggest that the G0N to G2N/G4N transition represents the exit from cell cycle arrest and entry into the proliferative cell state. The difference between G4N and G4P must result from a second major switch, which represents differentiation to the Th2 effector state with expression of the characteristic cytokines, including IL13. Together with expression of these cytokines, there is a parallel further increase in the expression of cell cycle genes.

The combination of the above results leads us to characterize Th2 cell differentiation as consisting of three major transcriptionally distinct states, which we name state A (activated cells, corresponding to G0N), B (proliferating cells, corresponding to both G2N and G4N) and C (cytokine-expressing cells, corresponding to G4P) (Figure 3.12).

Figure 3.12 – A three-state differentiation model in which G0N cells are named A cells (“Activated” cells), G2N and G4N cells are named B cells (“Proliferating” cells) and G4P cells are named C cells (“Cytokine-producing” cells).

Validation of the three-state model at single-cell resolution. Our description of three cell states during Th2 differentiation comes from population mRNA-seq data. Therefore, we aimed to verify our hypothesis at single-cell resolution by performing high-throughput single-cell qPCR analysis with dozens of genes in parallel in 46 cells from each population. We obtained a good overall correlation between the RNA-seq data and the average of the single-cell qPCR results (Figures 3.13 and 3.14).

Based on these data, we aimed to assign each cell to one of the three specific states we identified. We employed principal component analysis (PCA) and, to quantify the separation, a linear support vector classifier (SVC) was trained using “one-hot” labels (e.g. is it G4N or not) for each of the conditions on the first two component values (Figure 3.15). We observed that the SVC is able to distinguish G0N and G4P from the other cells with good accuracy (scores are 0.83 and 0.72, respectively).


Figure 3.13 – Additional analysis confirmed the existence of three discrete states at the single-cell level. Violin plots represent the distribution of selected genes in single cells. The insets show bulk RNA-seq results.

Figure 3.14 – Additional analysis confirmed the existence of three discrete states at the single-cell level. Correlation between bulk data (x-axis) and single-cell qPCR (100-ct mean; y-axis). Each gene is represented by a line linking its values in G0N and G4P.


Conversely, it fails to distinguish G2N and G4N states from the other cells and also the mixture of G2N and G4N from the rest of the cells. This analysis supports the existence of three states represented by G0N, G4P and a mixture of G2N and G4N.

In order to probe the transcriptional regulation of these states on a single-cell basis, we focused on 11 highly expressed transcription factors (Epas1, Myb, Mycn, Jhdm1d, Pou6f1, Pparh, Tcf7, Txk, Zc3h12c, Zfp36, Hlx) and discretized them into “on” or “off” states in each cell. This yielded 124 unique binary states, 117 of which can be connected by single-gene changes to yield a state graph as in Moignard et al. [21] (Figure 3.16).
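The construction of such a state graph can be illustrated with a short Python sketch. This is not the SCNS toolkit used for Figure 3.16: the single `threshold` used for discretization, the function names and the reading of "connected" as "having at least one single-gene neighbour" are all assumptions made for illustration.

```python
from itertools import combinations

def binarize(expression, threshold):
    """Return a tuple of 0/1 flags, one per gene, for a single cell.

    `threshold` is a hypothetical cut-off; the actual discretization rule
    is not specified in the text.
    """
    return tuple(int(value > threshold) for value in expression)

def state_graph(cells, threshold):
    """Collect the unique binary states and link states differing in exactly one gene."""
    states = sorted({binarize(cell, threshold) for cell in cells})
    edges = [
        (s, t)
        for s, t in combinations(states, 2)
        if sum(x != y for x, y in zip(s, t)) == 1  # Hamming distance of 1
    ]
    return states, edges

def connected_states(edges):
    """States that have at least one single-gene neighbour (assumed meaning of 'connected')."""
    linked = set()
    for s, t in edges:
        linked.update((s, t))
    return linked
```

With 11 transcription factors each cell maps to one of 2^11 possible binary states, so the 124 observed states occupy only a small part of the available state space.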

From this analysis we could observe that cells of the same state share a similar TF organization, as they cluster close to each other, underlining how TFs could act as master regulators of cell fate. We could also verify that the differentiation from A (in grey, Figure 3.16) to C (in green) requires the transition through at least one B cell (in red), further confirming the intermediate nature of this cell state.

Taken together, these data confirm the concept of a three-state model of differentiation during Th2 primary activation. We also quantified the homogeneity of cells belonging to each of the three states as the average of Spearman cell-to-cell correlation (p value < 0.05) within each state (Figure 3.17).

We observed an increase in the correlation across cells belonging to state C with respect to states A and B. This suggests that A and B cells are flexible and heterogeneous after primary activation, both before and after entering the cell cycle. In contrast, the cells that are more differentiated are more similar to each other, representing a more homogeneous population. These results, in agreement with data from Arsenio et al. on CD8+ T cells [22], support the concept of a commitment toward a more specific state in concert with the expression of the specific cytokines.


Figure 3.15 – Principal component analysis (PCA) with a linear support vector classifier (SVC, yellow line) trained with “one-hot” labels (e.g. is it G4N or not) for each of the conditions. The first two principal component values were used to separate each of the generations and the G2N/G4N cells from the other cells (in blue).


Figure 3.16 – State graph of 117 connected binary cell states for 11 transcription factors, constructed using the SCNS toolkit. Each edge represents the change in expression of a single gene. Grey circles are G0N cells, red circles are G2N/G4N cells and green circles represent G4P cells


Figure 3.17 – Average of Spearman correlation values between any two different cells with p value < 0.05 is reported for the indicated population.


Chapter 4 ABC Model

Figure 4.1 – The model predicts T-cell behaviour at the cellular level. Overview of the differentiation process that converts a naïve cell into a fully differentiated Th2 cell. Each naïve cell goes through three different states: state A (Undivided), state B (Proliferating) and state C (Cytokine expressing).

We model the cell dynamics as a stochastic Markov process with three states, named A, B and C (Figure 4.1 and Figure 4.2). For the sake of mathematical simplicity, we consider a discrete-time branching process, whose time step unit is named t. We suppose that, from one time step to the next, an A cell can die (rate Ad), stay the same (rate Ai), divide giving rise to two type B cells (rate As), or divide asymmetrically into a B and a C cell (rate Aa). A B cell can die (rate Bd), stay the same (rate Bi), duplicate (rate Bs), give rise to a type C cell (rate Bt), or divide asymmetrically into a B and a C cell (rate Ba). Similarly, C cells can die, stay the same or divide (rates Cd, Ci, Cs). A number of other state changes could be considered; for simplicity we only include the above ones, which are motivated on biological grounds. In our FACS data at day 3.5 the last populated subgroup is G6N, which indicates that six division steps must have occurred since day 0. From that we derive t = 14 h (84 h / 6 divisions), a value compatible with the known duration of the cell cycle.
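The transition rules above can be illustrated with a small stochastic simulation. The following is a minimal Python sketch under stated assumptions, not the code used in this work: the probability values in `RATES` are placeholders chosen only so that the probabilities of each state sum to one, and the names `OFFSPRING`, `step` and `simulate` are hypothetical.

```python
import random

# Placeholder transition probabilities per time step (NOT the fitted values).
RATES = {
    "A": {"die": 0.10, "stay": 0.60, "sym": 0.25, "asym": 0.05},
    "B": {"die": 0.00, "stay": 0.30, "sym": 0.55, "trans": 0.10, "asym": 0.05},
    "C": {"die": 0.00, "stay": 0.20, "sym": 0.80},
}

# Daughter cells produced by each event, written as a string of state labels.
OFFSPRING = {
    ("A", "die"): "", ("A", "stay"): "A", ("A", "sym"): "BB", ("A", "asym"): "BC",
    ("B", "die"): "", ("B", "stay"): "B", ("B", "sym"): "BB",
    ("B", "trans"): "C", ("B", "asym"): "BC",
    ("C", "die"): "", ("C", "stay"): "C", ("C", "sym"): "CC",
}

def step(counts, rates, rng):
    """Advance every cell by one time step and return the new cell counts."""
    new = {"A": 0, "B": 0, "C": 0}
    for state, n_cells in counts.items():
        events = list(rates[state])
        weights = [rates[state][event] for event in events]
        # Draw one event per cell of this state.
        for event in rng.choices(events, weights=weights, k=n_cells):
            for daughter in OFFSPRING[(state, event)]:
                new[daughter] += 1
    return new

def simulate(n_naive, n_steps, rates=RATES, seed=0):
    """Start from n_naive A cells and iterate the branching process n_steps times."""
    rng = random.Random(seed)
    counts = {"A": n_naive, "B": 0, "C": 0}
    for _ in range(n_steps):
        counts = step(counts, rates, rng)
    return counts
```

With a 14 h time step, `simulate(1000, 6)` would correspond to following 1000 activated cells up to day 3.5. Averaging many such runs should approach the mean values given by the recursive relations of the model.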

The ABC model can be analytically fully solved. The recursive relations of the average number of A, B and C cells, A(t), B(t) and C(t) can be derived from the Master Equation of the Markov process:

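\[
\begin{pmatrix} A(t+1) \\ B(t+1) \\ C(t+1) \end{pmatrix}
=
\begin{pmatrix}
\alpha & 0 & 0 \\
\eta & \beta & 0 \\
\delta & \epsilon & \gamma
\end{pmatrix}
\begin{pmatrix} A(t) \\ B(t) \\ C(t) \end{pmatrix}
\tag{4.1}
\]

where

\[
\begin{aligned}
\alpha &= A_i \\
\beta &= 2B_s + B_i + B_a \\
\gamma &= 2C_s + C_i \\
\eta &= 2A_s + A_a \\
\epsilon &= B_t + B_a \\
\delta &= A_a
\end{aligned}
\tag{4.2}
\]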

The eigenvalues α, β and γ of the above matrix give the long-time-scale growth rates of the A, B and C populations, respectively. The composition of the different subpopulations, G0N, G1N, . . . , G6N, can be analogously derived at any time, t, as a function of the microscopic parameters of the model.

Figure 4.2 – The model with examples of state-specific cell transitions and their corresponding probabilities. Transition probabilities are labelled as follows: d death, i stay identical, s symmetric division, a asymmetric division, t transdifferentiation.

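ABC model equation. Populations at time step N can be obtained from the following equation, where the subscript denotes the generation (number of divisions) and the superscript the time step:

\[
\begin{pmatrix}
A_0^N \\ B_1^N \\ C_1^N \\ B_2^N \\ C_2^N \\ B_3^N \\ C_3^N \\ B_4^N \\ C_4^N \\ B_5^N \\ C_5^N \\ B_6^N \\ C_6^N
\end{pmatrix}
=
\left(\begin{array}{ccccccccccccc}
A_i & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
2A_s + A_a & B_i & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
A_a & B_t & C_i & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 2B_s + B_a & 0 & B_i & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & B_a & 2C_s & B_t & C_i & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 2B_s + B_a & 0 & B_i & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & B_a & 2C_s & B_t & C_i & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 2B_s + B_a & 0 & B_i & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & B_a & 2C_s & B_t & C_i & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 2B_s + B_a & 0 & B_i & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & B_a & 2C_s & B_t & C_i & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 2B_s + B_a & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & B_a & 2C_s & 0 & 0
\end{array}\right)^{N}
\begin{pmatrix}
A_0^0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0
\end{pmatrix}
\tag{4.3}
\]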
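The generation-resolved bookkeeping can also be written without an explicit 13 × 13 matrix. The Python sketch below is an illustration under assumptions, not the implementation used here: the function name `generation_means` and the `rates` dictionary keys are hypothetical, and, as a simplifying choice, cells in the last tracked generation are allowed to stay or transdifferentiate while their further divisions are dropped.

```python
def generation_means(rates, n_steps, a0=1.0, max_gen=6):
    """Mean number of A, B and C cells per generation after n_steps time steps.

    rates: dict with keys Ai, As, Aa, Bi, Bs, Bt, Ba, Ci, Cs (probabilities per step).
    Returns (a, b, c) where b[g] and c[g] are the mean numbers of B and C cells
    that have divided g times; A cells are undivided, so `a` is a single number.
    """
    a = a0
    b = [0.0] * (max_gen + 1)
    c = [0.0] * (max_gen + 1)
    for _ in range(n_steps):
        new_b = [0.0] * (max_gen + 1)
        new_c = [0.0] * (max_gen + 1)
        # The first division of an A cell feeds generation 1.
        new_b[1] += (2 * rates["As"] + rates["Aa"]) * a
        new_c[1] += rates["Aa"] * a
        for g in range(1, max_gen + 1):
            # Events without division keep the generation unchanged.
            new_b[g] += rates["Bi"] * b[g]
            new_c[g] += rates["Bt"] * b[g] + rates["Ci"] * c[g]
            # Divisions move the daughters to the next generation.
            if g < max_gen:
                new_b[g + 1] += (2 * rates["Bs"] + rates["Ba"]) * b[g]
                new_c[g + 1] += rates["Ba"] * b[g] + 2 * rates["Cs"] * c[g]
        a = rates["Ai"] * a
        b, c = new_b, new_c
    return a, b, c
```

Because each time step allows at most one division, no cell can exceed generation 6 within the first six steps, so over that window the sums of `b` and `c` over generations coincide with the total mean numbers B(t) and C(t) of the three-state recursion.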

Dynamic solution. The network which links the three states A, B and C is summarized in Figure 4.1 and Figure 4.2.

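The recursive equations for the average number of A, B and C cells at time step t, A(t), B(t) and C(t), are:

\[
\begin{aligned}
A(t+1) &= A_i\,A(t) = \alpha A(t) \\
B(t+1) &= (2A_s + A_a)\,A(t) + (B_i + 2B_s + B_a)\,B(t) = \eta A(t) + \beta B(t) \\
C(t+1) &= A_a\,A(t) + (B_t + B_a)\,B(t) + (C_i + 2C_s)\,C(t) = \delta A(t) + \epsilon B(t) + \gamma C(t)
\end{aligned}
\]

That is

\[
\begin{pmatrix} A(t) \\ B(t) \\ C(t) \end{pmatrix}
=
\begin{pmatrix} \alpha & 0 & 0 \\ \eta & \beta & 0 \\ \delta & \epsilon & \gamma \end{pmatrix}
\begin{pmatrix} A(t-1) \\ B(t-1) \\ C(t-1) \end{pmatrix}
= \dots =
\begin{pmatrix} \alpha & 0 & 0 \\ \eta & \beta & 0 \\ \delta & \epsilon & \gamma \end{pmatrix}^{t}
\begin{pmatrix} A(0) \\ B(0) \\ C(0) \end{pmatrix}
\tag{4.4}
\]

Its eigenvalues are α, β and γ, with eigenvectors (for distinct α, β and γ):

\[
|\alpha\rangle =
\begin{pmatrix}
1 \\[4pt]
\dfrac{\eta}{\alpha - \beta} \\[8pt]
\dfrac{\delta(\alpha - \beta) + \epsilon\eta}{(\alpha - \beta)(\alpha - \gamma)}
\end{pmatrix}
\tag{4.5}
\]

\[
|\beta\rangle =
\begin{pmatrix}
0 \\[4pt] 1 \\[4pt] \dfrac{\epsilon}{\beta - \gamma}
\end{pmatrix}
\tag{4.6}
\]

\[
|\gamma\rangle =
\begin{pmatrix}
0 \\ 0 \\ 1
\end{pmatrix}
\tag{4.7}
\]

We can thus decompose the vector (A, B, C) in this way

\[
\begin{pmatrix} A(0) \\ B(0) \\ C(0) \end{pmatrix}
= a\,|\alpha\rangle + b\,|\beta\rangle + c\,|\gamma\rangle
\tag{4.8}
\]

where

\[
a = A(0)
\tag{4.9}
\]

\[
b = B(0) - \frac{\eta}{\alpha - \beta}\,A(0)
\tag{4.10}
\]

\[
c = C(0) + \frac{\epsilon}{\gamma - \beta}\,B(0) + \frac{\delta(\gamma - \beta) + \epsilon\eta}{(\alpha - \gamma)(\beta - \gamma)}\,A(0)
\tag{4.11}
\]

thus

\[
\begin{pmatrix} A(N) \\ B(N) \\ C(N) \end{pmatrix}
= \alpha^{N} a\,|\alpha\rangle + \beta^{N} b\,|\beta\rangle + \gamma^{N} c\,|\gamma\rangle
\tag{4.12}
\]

The solution is:

\[
\begin{aligned}
A(N) &= \alpha^{N} A(0) \\[4pt]
B(N) &= \beta^{N} B(0) + \eta\,\frac{\alpha^{N} - \beta^{N}}{\alpha - \beta}\,A(0) \\[4pt]
C(N) &= \gamma^{N} C(0) + \epsilon\,\frac{\beta^{N} - \gamma^{N}}{\beta - \gamma}\,B(0) \\
&\quad + \left[ \delta\,\frac{\alpha^{N} - \gamma^{N}}{\alpha - \gamma}
+ \epsilon\eta \left( \frac{\alpha^{N}}{(\alpha - \beta)(\alpha - \gamma)}
- \frac{\beta^{N}}{(\alpha - \beta)(\beta - \gamma)}
+ \frac{\gamma^{N}}{(\alpha - \gamma)(\beta - \gamma)} \right) \right] A(0)
\end{aligned}
\]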
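A closed-form solution of this kind is easy to check numerically by iterating the recursion. The Python sketch below does so under stated assumptions: `eps` and `delta` are labels chosen here for the B-to-C coefficient (Bt + Ba) and the A-to-C coefficient (Aa), the numbers in `EXAMPLE_RATES` are placeholders rather than fitted parameters, and the formula in `closed_form` requires the three growth rates to be distinct.

```python
# Placeholder single-cell rates per time step (NOT the fitted values).
EXAMPLE_RATES = dict(Ai=0.6, As=0.25, Aa=0.05,
                     Bi=0.3, Bs=0.55, Bt=0.1, Ba=0.05,
                     Ci=0.2, Cs=0.8)

def coefficients(rates):
    """Map single-cell rates to the entries of the 3x3 mean-value matrix."""
    alpha = rates["Ai"]
    beta = 2 * rates["Bs"] + rates["Bi"] + rates["Ba"]
    gamma = 2 * rates["Cs"] + rates["Ci"]
    eta = 2 * rates["As"] + rates["Aa"]
    eps = rates["Bt"] + rates["Ba"]   # B -> C contribution
    delta = rates["Aa"]               # A -> C contribution
    return alpha, beta, gamma, eta, eps, delta

def iterate(rates, a0, b0, c0, n_steps):
    """Apply the recursion for A(t), B(t), C(t) n_steps times."""
    alpha, beta, gamma, eta, eps, delta = coefficients(rates)
    a, b, c = a0, b0, c0
    for _ in range(n_steps):
        # The right-hand side is evaluated with the old values of a, b and c.
        a, b, c = alpha * a, eta * a + beta * b, delta * a + eps * b + gamma * c
    return a, b, c

def closed_form(rates, a0, b0, c0, n):
    """Closed-form mean values after n steps (alpha, beta, gamma must differ)."""
    alpha, beta, gamma, eta, eps, delta = coefficients(rates)
    an, bn, gn = alpha ** n, beta ** n, gamma ** n
    a = an * a0
    b = bn * b0 + eta * (an - bn) / (alpha - beta) * a0
    mixed = (an / ((alpha - beta) * (alpha - gamma))
             - bn / ((alpha - beta) * (beta - gamma))
             + gn / ((alpha - gamma) * (beta - gamma)))
    c = (gn * c0
         + eps * (bn - gn) / (beta - gamma) * b0
         + (delta * (an - gn) / (alpha - gamma) + eps * eta * mixed) * a0)
    return a, b, c
```

Calling `iterate(EXAMPLE_RATES, 100.0, 0.0, 0.0, 6)` and `closed_form(EXAMPLE_RATES, 100.0, 0.0, 0.0, 6)` should return the same three numbers up to floating-point rounding.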

Chapter 5 Results

5.1 Single cell fate: mathematical modelling of three cell states quantifies the link between acceleration of proliferation and differentiation

To further dissect our three-state hypothesis, and to verify and quantify the existence of a difference in the proliferation rate of different cells, we investigated the cellular (as opposed to molecular) events underlying cell differentiation across such states. We exploited flow cytometry data at day 3.5 of differentiation to discriminate between different models of cell differentiation. We considered a simple, schematic mathematical model of the behaviour of individual cells and their transformation dynamics across the three states A, B and C (Figure 5.1).


Figure 5.1 – Overview of the differentiation process that converts a naïve cell into a fully differentiated Th2 cell. Each naïve cell goes through three different states: state A (Undivided), state B (Proliferating) and state C (Cytokine expressing).

In our model, an activated naïve cell becomes an A cell that can then divide and give rise to B cells, which in turn can transdifferentiate into effector C cells. We assume that each cell can stochastically divide, die or differentiate into another state at given cell state-specific rates described by a Markov process (Figure 5.2).

Figure 5.2 – The model with examples of state-specific cell transitions and their corresponding probabilities. Transition probabilities are labelled as follows: d death, i stay identical, s symmetric division, t transdifferentiation. In the table, best fits of the model transition probabilities (expressed as probability per 14 h) from flow cytometry data at day 3.5 are reported. Data are representative of three independent mice.

The death rate of an A cell is named Ad, and Ai is the rate at which an A cell remains identical. Since upon activation an A cell can start dividing, we consider the transition where an A cell divides symmetrically into two type B cells (rate As).


Figure 5.3 – Population dynamics of the three states over a 3-day (panel a) and 2-day (panel b) period as predicted by the model.

Analogously, a B cell can die (rate Bd), stay the same (rate Bi), duplicate (rate Bs) or transdifferentiate into a type C cell (rate Bt). C cells, similarly, can die, stay the same and divide (rates Cd, Ci and Cs).

The corresponding transition probabilities are shown in Figure 5.2 (lower panel). The model, with the same parameters, also accurately describes two additional fluorescence-activated cell sorting (FACS) datasets collected independently at day 2 and day 3 of differentiation (Figure 5.3).

We also calculated the AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) for two-, three- and four-state models to verify whether our three-state model has the best performance. AIC and BIC analysis supports the idea that in vitro Th2 primary activation is best described by three states (Figure 5.4).

Figure 5.4 – AIC and BIC for two-, three- and four-state models (the asterisks indicate the minimum values).
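For readers unfamiliar with these criteria, the comparison can be sketched in Python as follows. The text does not state which likelihood was used, so this sketch assumes a least-squares fit with Gaussian errors, for which both criteria can be written in terms of the residual sum of squares (RSS) up to an additive constant; the function names and the layout of the `fits` dictionary are hypothetical.

```python
import math

def aic(rss, n_points, n_params):
    """Akaike Information Criterion for a least-squares fit (Gaussian errors assumed)."""
    return n_points * math.log(rss / n_points) + 2 * n_params

def bic(rss, n_points, n_params):
    """Bayesian Information Criterion for a least-squares fit (Gaussian errors assumed)."""
    return n_points * math.log(rss / n_points) + n_params * math.log(n_points)

def best_model(fits, n_points, criterion):
    """Return the name of the model with the lowest criterion value.

    fits maps a model name to a tuple (rss, n_params).
    """
    return min(
        fits,
        key=lambda name: criterion(fits[name][0], n_points, fits[name][1]),
    )
```

Both criteria reward a low RSS and penalize the number of free parameters, with the BIC penalty growing with the number of data points. A model with more states is therefore preferred only if it improves the fit by more than the cost of its extra parameters.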

Importantly, the day 3.5 fit appears to be robust because independent fits of day 2 and day 3 data return fitting parameters very close to those of the day 3.5 fit (Figure 5.5).

Starting from the static picture of the system at day 3.5, our model predicts the detailed dynamics of the population composition (Figure 5.6).

By solving the master equation of the model, the average number of A, B and C cells, as well as the composition of the subpopulations, can be derived at any time (Figure 5.7). By fitting FACS data at day 3.5, the values of the single-cell transition probabilities can be determined (Figure 5.2). The model fits our FACS data well in terms of the composition of the different cell subpopulations at day 3.5 (Figure 5.7).

With the parameters returned by the day 3.5 FACS data fit, the model predicts a twofold faster proliferation of C cells with respect to B cells, as expected from the gene expression profiles of cell cycle genes mentioned above. Interestingly, the differentiation rates of A and B cells (given by As and Bt, respectively) are approximately one order of magnitude smaller than the growth rates of the populations of the three states, α, β and γ (Figure 5.8).

Figure 5.5 – Comparison of the parameters extracted from fits at days 2, 3 and 3.5 and comparison of the parameters extracted from flow cytometry data at days 2, 3 and 3.5.

Figure 5.6 – Population dynamics of the three states over a 4-day period as predicted by the model.

Figure 5.7 – Cell subpopulations in the flow data at day 3.5 and the model prediction with parameters extracted at day 3.5.

It is worth noting that the predicted death rates of B and C cells are very small (Bd ≈ 0, Cd ≈ 0) during the first 3.5 days of differentiation. We were able to validate the difference between the death rate of A cells and that of the other two states in independent experiments: apoptotic cells, measured as the sub-G1 DNA peak by Hoechst staining and flow cytometry, were only present in G0N cells and completely absent in G1N, G2N, G3N and GFP-positive cells (Figure 5.9).

To verify the higher division rate of differentiated cells, we also compared the cell cycle distribution of Hoechst-stained IL13-positive (C) and IL13-negative (B) cells (Figure 5.10). Using the G2-M/G1 ratio as an indicator of the proportion of cycling cells, we observed that cells expressing IL13 cycle faster than IL13-negative cells in the same generation, confirming our model prediction.

Figure 5.8 – Comparison of the parameters extracted from fits at days 2, 3 and 3.5 and comparison of the parameters extracted from flow cytometry data at days 2, 3 and 3.5.

Figure 5.9 – Apoptotic cells, measured as the sub-G1 DNA peak by Hoechst staining and flow cytometry, were only present in G0N cells and completely absent in G1N, G2N, G3N and GFP-positive cells.

Figure 5.10 – The ratio between proportions of cells in G2 versus G1 is used as a measure of cell cycle speed when comparing both positive and negative cells within each individual generation (experiments are representative of four independent mice; error bars indicate standard deviation, p value < 0.01).

Also, from the transcriptional point of view, cell cycle genes are highly upregulated in G4P compared with G4N cells, further suggesting an increase in cell cycle speed co-occurring with cytokine expression (Figure 5.11).

To give a more quantitative estimate of the cell cycle length in G4P versus G4N cells, we employed an automated imaging system to image single lymphocytes over a 20-40 h period. A MATLAB program, developed in house, was employed to extract data from the frames, and the division times for G4P and G4N cells were computed to be 12.5 ± 4.2 h and 18.7 ± 3.5 h, respectively (Figure 5.12).

These further experiments not only confirm the acceleration of cell division that occurs concomitantly with Th2 differentiation but also precisely quantify the difference in cell cycle length of Th cells during primary activation.


Figure 5.11 – Cell cycle genes are highly upregulated in G4P compared with G4N cells.

Figure 5.12 – Live imaging of G4P versus G4N cells. Representative pictures from the live imaging time course experiment of G4N (top) and G4P (bottom) cells. Distribution of time of first division for G4N (gray) and G4P (green) cells (p-value <0.001)


Figure 5.13 – Asymmetric division considered in the As and OA models.

5.2 Asymmetric division and robustness of the model

We also tested an extended version of the model including asymmetric divisions (named the “As” model), in which we added the possibility of A and B cells dividing asymmetrically and giving rise to a B cell and a C cell (rates Aa and Ba, Figure 5.13).

The As model does fit the day 3.5 FACS data, but it returns very low asymmetric transition rates in most of the different fits. This suggests that asymmetric transitions are extremely rare and, in fact, can be considered negligible with respect to symmetric ones. These results are supported by the fact that, when we considered a model (OA model) in which C cells can derive only from asymmetric division of A or B cells (As = Bt = 0), we obtained asymmetric transition parameters close to 0 (Ba ∼ 0) and C cells remained a very small fraction of the population even at long times, as γ < β (Figure 5.14).

We evaluated the AIC and BIC and found that both the AIC and BIC minima correspond to the original model (Figure 5.15), i.e. the one without asymmetric division.

Taken together, these results suggest that asymmetric transitions do not substantially contribute to Th2 differentiation.


Figure 5.14 – The As (i) and OA (j) model-predicted dynamics of the population fractions of the three states over a 4-day period.

Figure 5.15 – AIC and BIC evaluated for day 3.5 across different models.


Figure 5.16 – To each state we assigned a specific expression profile as visualized here.

5.3 Validating the A, B and C cell states and parameters by expression profiling

Finally, we validated our model predictions with a dual experimental and computational approach. By combining the population dynamics predictions from the model with an RNA-seq data time-course, we aimed to link the cellular identity and the molecular characteristics of these cells.

First, we assigned a defined expression profile to each of the states: the G0N expression profile to A cells, the G2N profile to B cells and the G4P profile to C cells (Figure 5.16). Next, our model allows us to estimate the proportion of cells in each of the three states at different time points during Th2 differentiation (Figure 5.17).

Based on the expression profile of each of the states, we are able to predict ensemble transcriptomic profiles at different time points.
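One simple way to combine state proportions with state-specific profiles is a fraction-weighted average. The Python sketch below illustrates that idea only: the text does not spell out the exact mixing rule, so the linear mixing of RPKM values, the `pseudocount` used before taking logarithms, and all function and gene names are assumptions.

```python
import math

def state_fractions(counts):
    """Convert mean cell numbers per state into fractions of the total population."""
    total = sum(counts.values())
    return {state: n / total for state, n in counts.items()}

def predict_profile(fractions, profiles):
    """Fraction-weighted average of the per-state expression values (linear RPKM).

    profiles maps a state name to a dict {gene: RPKM}; all states are assumed
    to list the same genes.
    """
    genes = next(iter(profiles.values())).keys()
    return {
        gene: sum(fractions[s] * profiles[s][gene] for s in fractions)
        for gene in genes
    }

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def log_correlation(predicted, measured, pseudocount=1.0):
    """Correlate log-transformed predicted and measured values over shared genes."""
    genes = sorted(set(predicted) & set(measured))
    log_pred = [math.log(predicted[g] + pseudocount) for g in genes]
    log_meas = [math.log(measured[g] + pseudocount) for g in genes]
    return pearson(log_pred, log_meas)
```

In such a scheme the A, B and C counts predicted by the model at a given time point would be passed to `state_fractions`, and the resulting mixture compared with a measured bulk profile through `log_correlation`.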

To verify the accuracy of our predictions, we performed a time-course mRNA-seq experiment (6, 12, 24, 48 and 84 h post-activation) in the same culture conditions used before (Figure 5.17). We analyzed the expression profiles of all the genes at all time points and plotted the predicted versus the measured log(RPKM) values (Figure 5.18).


Figure 5.17 – Experimental setup of the time-course expression profiling. At the indicated time points, total cells were collected and total mRNA was sequenced.

The correlation coefficient between the two was high (r ∼ 0.83) and discrepancies were mainly at low expression levels, as expected.

Then we calculated the correlation between the single-generation datasets, the model prediction and the new time-course data (Figure 5.19). As expected, the correlation of G0N decreases along consecutive time points while the correlation of G2N, G4N and G4P rises over time. Reassuringly, our predictions consistently have the highest correlation coefficient with the observed data over the whole time course (0.83 on average). Only the last point (84 h) correlates better with G4P expression data.

Globally, we observed that ∼43% of total genes have a correlation r > 0.5; if we consider only DEGs, about 60% of them have a correlation r > 0.5 (Figure 5.20).

To further verify our prediction, we classified genes into positive and negative signatures, i.e. genes that are overexpressed in one state only (positive) and genes that are downregulated in one state only (negative) (Figure 5.21).


Figure 5.18 – Model validation in vitro and in vivo. a Cell culture expression time course: correlations of gene expression levels (logRPKM) between the model prediction and measured data at 6, 18, 24, 48 and 84 h. The colour scale represents the density of transcripts as a percentage of all expressed genes.


Figure 5.19 – Pearson correlation between model-predicted expression values (thick light blue line) and generation profiles (thin lines, grey for G0N, red for G2N, orange for G4N and green for G4P).

Figure 5.20 – Pearson correlation distribution between measured and predicted values for all genes and DEGs only.


Figure 5.21 – Identification of positive and negative signature genes as specifically ON or OFF genes in that particular state and their relative abundances.

To minimize the noise, we considered only the top 30% of the significant positive signature genes, as those genes should be most representative of each particular cell state. We compared our prediction (Figure 5.22) with measured data. Inspection of the data indicated that the trend was consistent for each of the three states, with the A-positive signature decreasing over time, while B and C gene levels increase.

Figure 5.22 – Time-course data and the predicted expression levels of the top 30% positive genes. The median is shown as a black solid bar.


Figure 5.23 – For each of the states, the linear regression of the median values (Z-score normalized) is visualized as a red line for the time-course data and as a blue line for the predicted ones. Errors were calculated using data from different RNA-seq replicates.

Linear regressions on the median expression (Figure 5.23) for the predicted and the experimentally determined data behave similarly, suggesting that our predictions fit well for all three states across all time points. Overall, the excellent agreement between the predicted and the measured RPKM values (Figures 5.18, 5.19 and 5.23) shows that our model is accurate in terms of transcriptomic changes in Th2 differentiation.

These results confirm not only the three cell states during Th2 differentiation but also the robustness of the cellular parameters inferred by the model.


Figure 5.24 – Overview of splenic Th cells isolated from PcAS PbTII-infected mice at 2, 3 and 4 days post-infection.

5.4 Single-cell RNA-seq links CD4+ T-cell division rates to differentiation state in an in vivo Th1 infection model

To verify the link between cell cycle speed and differentiation rate in vivo and to ask if the model can be extended from Th2 to Th1 differentiation, we studied the CD4+ T-cell response against Plasmodium chabaudi AS (PcAS). Antigen-specific PbTII CD4+ T cells (CD45.1) were transferred into wild-type CD45.2 recipients and recovered from spleens at day 2, 3 and 4 post-infection (Figure 5.24).

As a measure of differentiation status inferred from the single-cell RNA-seq data, we developed a differentiation score based on the expression of “Th1 differentiation signature genes” [23]. We used aggregated G2/M gene expression levels across 26 genes as a “cell cycle score” reflecting division rate (analogous to [24]). Both of them are reported as
