
Microarray data analysis


Academic year: 2021


(1)

MicroArray based on in situ oligo synthesis

Photolithography

(2)

4/7/2003 2

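[Figure: probe set layout. Each cell of a probe pair holds either the perfect match (PM) or the mismatch (MM) oligo designed on the gene sequence; a probe set groups the probe pairs for one gene.]

PM: ACCAGATCTGTAGTCCATGCGATGC
MM: ACCAGATCTGTAATCCATGCGATGC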
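The PM/MM probe pair shown on this slide can be checked with a few lines of Python. This is only an illustrative sketch: the function name `mismatch_positions` is hypothetical and not part of any Affymetrix tool. It reports where the two oligos differ, which for these 25-mers is the central base.

```python
def mismatch_positions(pm: str, mm: str) -> list[int]:
    """Return the 1-based positions at which two equal-length oligos differ."""
    if len(pm) != len(mm):
        raise ValueError("PM and MM probes must have the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(pm, mm)) if a != b]


# Probe pair taken from the slide.
pm = "ACCAGATCTGTAGTCCATGCGATGC"
mm = "ACCAGATCTGTAATCCATGCGATGC"
print(len(pm), mismatch_positions(pm, mm))  # 25 [13]
```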

(3)

[Flowchart: data analysis workflow. Step labels in the figure: a, b, c; α, β, γ; s1, s2, s3.]

• Input: Affymetrix .cel files.
• MAS 5.0 branch: scaling → MAS 5.0 data → keep probe sets with % calls A < 90% → calculate the background (Bgd) threshold → Is Avg Int > Bgd? Yes → filtered MAS 5.0 data.
• dCHIP branch: dCHIP normalization and PM modelling → dCHIP data → keep probe sets with % calls A < 90% → Is SE in the upper (3%) tail of the distribution? No → filtered dCHIP data.
• Is the gene present in both the MAS and dCHIP sets? Yes → Is differential expression passing the SAM test? Yes → Are SAM data congruent with CyberT p values? Yes → final data set.

MAS 5.0 scaling:

Arrays are scaled to the same value of median array intensity.
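A minimal sketch of the scaling idea stated on this slide: every array is multiplied by a constant so that all arrays share one median intensity. The function name, the dictionary layout and the default target (the mean of the array medians) are assumptions made for illustration; this is not the MAS 5.0 implementation.

```python
from statistics import median


def scale_to_common_median(arrays, target=None):
    """Rescale each array so that its median intensity equals `target`.

    arrays: dict mapping an array name to a list of intensities.
    target: common median; defaults to the mean of the array medians
            (an assumption made for this sketch).
    """
    medians = {name: median(values) for name, values in arrays.items()}
    if target is None:
        target = sum(medians.values()) / len(medians)
    return {
        name: [v * target / medians[name] for v in values]
        for name, values in arrays.items()
    }


# Hypothetical intensities for two chips.
raw = {"chip1": [100.0, 200.0, 400.0], "chip2": [50.0, 100.0, 200.0]}
scaled = scale_to_common_median(raw)
print({name: median(values) for name, values in scaled.items()})
```

After scaling, both hypothetical chips have the same median, so intensities can be compared across arrays.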

(4)

Invariant Set Normalization Method.

dCHIP normalization:

In this normalization procedure an array with median overall intensity is chosen as the baseline array against which the other arrays are normalized at probe intensity level.

Subsequently a subset of PM probes, with small within-subset rank difference in the two arrays, serves as the basis for fitting a normalization curve.
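The two steps described above can be sketched as follows. This is a simplified, single-pass illustration and not dCHIP's code: the rank-difference threshold `max_rank_diff`, the helper names and the piecewise-linear curve (flat beyond the outermost invariant probes) are all assumptions made to keep the example short.

```python
import bisect


def ranks(values):
    """Rank of each value within its array (0 = smallest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order):
        result[i] = rank
    return result


def invariant_set(baseline, target, max_rank_diff):
    """Indices of probes whose rank differs little between the two arrays."""
    rb, rt = ranks(baseline), ranks(target)
    return [i for i in range(len(baseline)) if abs(rb[i] - rt[i]) <= max_rank_diff]


def normalize_to_baseline(baseline, target, max_rank_diff):
    """Map target intensities onto the baseline scale using the invariant probes."""
    knots = sorted((target[i], baseline[i]) for i in invariant_set(baseline, target, max_rank_diff))
    xs = [x for x, _ in knots]
    ys = [y for _, y in knots]

    def curve(x):
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        j = bisect.bisect_right(xs, x)  # xs[j - 1] <= x < xs[j]
        x0, x1, y0, y1 = xs[j - 1], xs[j], ys[j - 1], ys[j]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return [curve(x) for x in target]


# Hypothetical PM intensities: the target array is twice as bright as the baseline.
baseline = [10.0, 20.0, 30.0, 40.0, 50.0]
target = [20.0, 40.0, 60.0, 80.0, 100.0]
print(normalize_to_baseline(baseline, target, max_rank_diff=0))
```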

(5)

MAS 5.0 software:

• Log ratio versus average log intensity for two replicates
• without data scaling (marker *, r² = 0.8995)
• with data scaling (second marker, r² = 0.8995).

dCHIP software:

• Log ratio versus average log intensity for two replicates
• without data normalization (marker *; r² = 0.8995)
• with data normalization

(6)

dCHIP

dCHIP allows the calculation of a model-based expression index for each array as well as a probe-sensitivity index for each probe.

Fitting the experimental probe set values with the calculated model makes it possible to define a standard error (SE), which indicates the hybridization quality of each probe set.

(7)


(8)

MAS 5.0 software:

• Distribution of intensity values in the set of non-expressed genes (genes called A, absent, in >90% of the arrays under analysis). The dashed line indicates the threshold value delimiting the 10% upper tail of the distribution.

dCHIP software:

• The distribution of the SE associated with all probe sets was evaluated and, assuming that the vast majority of the probe sets have a good hybridization quality profile, the SE value delimiting the 3% upper tail of the distribution (dashed line) is used as the threshold to filter out those probe sets which could give misleading results due to their low hybridization quality.
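Both thresholds on this slide are upper-tail cut-offs of an empirical distribution (10% for the intensities of absent genes, 3% for the SE values). The sketch below shows one way to compute such a cut-off and apply the SE filter; the function names and the rounding rule used to pick the tail size are assumptions, not the procedure used by MAS 5.0 or dCHIP.

```python
def upper_tail_threshold(values, tail_fraction):
    """Return the value above which roughly `tail_fraction` of `values` lie."""
    ordered = sorted(values)
    n_tail = round(tail_fraction * len(ordered))  # number of values allowed above the cut-off
    index = max(0, len(ordered) - n_tail - 1)
    return ordered[index]


def filter_by_se(standard_errors, tail_fraction=0.03):
    """Keep the probe sets whose SE does not fall in the upper tail."""
    cutoff = upper_tail_threshold(list(standard_errors.values()), tail_fraction)
    return {probe_set: se for probe_set, se in standard_errors.items() if se <= cutoff}


# Hypothetical SE values for 100 probe sets.
se_values = {f"probe_set_{i}": float(i) for i in range(1, 101)}
kept = filter_by_se(se_values)
print(len(kept), upper_tail_threshold(list(se_values.values()), 0.03))
```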

(9)


(10)

• Intensity values for the same data set calculated with MAS 5.0 and with dCHIP.

• Distribution of fold change generated using MAS scaled data (black line) or dCHIP

(11)


(12)


(13)

Correction of type I errors

• Step-down permutation test:

• Given g hypotheses to test (g tests, each with m control and n experimental samples) and a significance level α (e.g. α = 0.05).

• X permutations are generated:

$X = \binom{m+n}{m} = \dfrac{(m+n)!}{m!\,(m+n-m)!}$

• The X permutations yield t-test based probabilities (P_calculated), which are sorted in increasing order.

• The null hypothesis H0 is rejected if the probability (P_observed) of the t-test computed on the unpermuted data falls within the lowest 100α% of the P_calculated values (e.g. 100 × 0.05 = 5% of the P_calculated with the lowest values).
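A minimal sketch of the permutation test for a single gene, assuming m control and n experimental values. It enumerates all C(m+n, m) relabelings and reports the fraction whose |t| is at least as large as the observed one, which plays the role of comparing P_observed with the ranked P_calculated. The function names and the use of a Welch-type t statistic are assumptions made for illustration.

```python
from itertools import combinations
from math import comb, sqrt
from statistics import mean, variance


def t_statistic(a, b):
    """Welch-type t statistic for two samples (each with at least two values)."""
    denom = sqrt(variance(a) / len(a) + variance(b) / len(b))
    diff = mean(a) - mean(b)
    if denom == 0:
        return 0.0 if diff == 0 else float("inf") * (1 if diff > 0 else -1)
    return diff / denom


def permutation_p_value(ctrl, exp):
    """Fraction of relabelings whose |t| is at least the observed |t|."""
    pooled = list(ctrl) + list(exp)
    m = len(ctrl)
    observed = abs(t_statistic(ctrl, exp))
    hits = 0
    for chosen in combinations(range(len(pooled)), m):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        if abs(t_statistic(a, b)) >= observed - 1e-12:
            hits += 1
    return hits / comb(len(pooled), m)


# Hypothetical expression values for one gene: 4 controls and 4 treated samples.
ctrl = [1.0, 1.1, 0.9, 1.05]
exp = [3.0, 3.2, 2.9, 3.1]
print(comb(len(ctrl) + len(exp), len(ctrl)), permutation_p_value(ctrl, exp))
```

With 4 + 4 samples there are 70 relabelings, so the smallest achievable two-sided value is 2/70; with 3 + 3 samples it would be 2/20 = 0.1, which can never pass α = 0.05.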

(14)

Correction of type I errors

• Step-down permutation test:

• If, after sorting the P_observed values in increasing order, the first test has a P_observed that falls within the lowest 100α% of the P_calculated values, it passes the test and the null hypothesis H0 can be rejected.

• For the second test, the data belonging to the first test are discarded, including its permutations; if P_observed falls within the lowest 100α% of the P_calculated values, it passes the test and H0 can be rejected.

• The procedure is repeated until a P_observed is found that does not fall within the 100α% of the P_calculated values; at that point testing stops.

(15)


SAM

(16)


(17)

Bayes' theorem

• Given a model M = M(w) for a data set D, and knowing the prior probability of the model P(M), it is possible to quantify how well the model is supported, P(M|D), on the basis of the observed data D.

$\log P(M|D) = \log P(D|M) + \log P(M) - \log P(D)$

Terms labelled on the slide: posterior, prior, model, data.
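A small worked example of the log form of Bayes' theorem, comparing two candidate models for the same data. The model names, priors and likelihoods are made-up numbers used only to show the arithmetic; log P(D) is obtained by summing P(D|M)P(M) over the candidate models.

```python
from math import exp, log


def log_posteriors(log_likelihoods, priors):
    """log P(M|D) = log P(D|M) + log P(M) - log P(D) for each candidate model."""
    log_joint = {m: log_likelihoods[m] + log(priors[m]) for m in priors}
    # log P(D) = log sum_M P(D|M) P(M), computed with the log-sum-exp trick
    top = max(log_joint.values())
    log_evidence = top + log(sum(exp(v - top) for v in log_joint.values()))
    return {m: v - log_evidence for m, v in log_joint.items()}


# Hypothetical numbers: two models for one gene.
priors = {"changed": 0.1, "unchanged": 0.9}
log_likelihoods = {"changed": log(0.02), "unchanged": log(0.001)}
posterior = {m: exp(v) for m, v in log_posteriors(log_likelihoods, priors).items()}
print(posterior)
```

Here the data favour the "changed" model strongly enough to outweigh its low prior: 0.002 / (0.002 + 0.0009) is about 0.69.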

(18)

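Bayesian regularized t-test (Baldi & Long 2001)

$t = \dfrac{m_C - m_T}{\sqrt{\dfrac{\sigma_C^2}{n_C} + \dfrac{\sigma_T^2}{n_T}}}$

where m, σ² and n are the mean, the variance and the number of replicates of the control (C) and treated (T) samples.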

(19)

Bayesian regularized t-test

[Figure: standard deviation (st.dev) versus mean for the controls (Ctrls) and the Bayesian-regularized controls (BaysCtrls), on the raw scale and on the natural-log scale (ln Ctrls, ln BaysCtrls).]

(20)

Bayesian regularized t-test

• Optimization of the Bayesian t-test parameters:

• The average variability of the genes (w → σ0) and the Bayesian confidence coefficient (ν0 → ν0σ0²) must be defined by the user.

• Type I errors are compensated using the Bonferroni correction coefficient.
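To make the role of σ0 and ν0 concrete, the sketch below computes a regularized variance and the resulting t value, followed by a Bonferroni threshold. The regularized variance uses the expression commonly quoted from Baldi & Long (2001), (ν0·σ0² + (n − 1)·s²) / (ν0 + n − 2); the slides do not spell it out, so treat that formula, the function names and the example numbers as assumptions.

```python
from math import sqrt
from statistics import mean, variance


def regularized_variance(values, sigma0_sq, nu0):
    """Blend the sample variance s^2 with a background variance sigma0^2.

    Assumed form: (nu0 * sigma0^2 + (n - 1) * s^2) / (nu0 + n - 2).
    """
    n = len(values)
    return (nu0 * sigma0_sq + (n - 1) * variance(values)) / (nu0 + n - 2)


def regularized_t(ctrl, treated, sigma0_sq_ctrl, sigma0_sq_treated, nu0):
    """t statistic computed with regularized variances in place of sample variances."""
    var_c = regularized_variance(ctrl, sigma0_sq_ctrl, nu0)
    var_t = regularized_variance(treated, sigma0_sq_treated, nu0)
    return (mean(ctrl) - mean(treated)) / sqrt(var_c / len(ctrl) + var_t / len(treated))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / n_tests


# Hypothetical values: 4 replicates per condition, background variance 1.0, nu0 = 10.
ctrl = [1.0, 2.0, 3.0, 4.0]
treated = [5.0, 6.0, 7.0, 8.0]
print(regularized_variance(ctrl, 1.0, 10))
print(regularized_t(ctrl, treated, 1.0, 1.0, 10))
print(bonferroni_threshold(0.05, 10))
```

A larger ν0 pulls the variance of each gene more strongly towards the background value σ0², which stabilizes the test when there are few replicates.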

(21)

Correction of type I errors

• Hypothesis H0: do the mean of the treated samples and the mean of the controls belong to the same distribution?

• Type I error: rejecting hypothesis H0 when it is in fact true.

• Sidak significance point:

$K(g, \alpha) = 1 - (1 - \alpha)^{1/g}$

• If the calculated p-values are smaller than K(g, α), the corresponding null hypotheses are rejected.

α = accepted error level (e.g. 0.05); g = number of independent tests
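The Sidak significance point K(g, α) = 1 − (1 − α)^(1/g) is a one-line computation. The sketch below (the function name is illustrative) prints it for α = 0.05 and a few values of g.

```python
def sidak_point(g, alpha=0.05):
    """Sidak significance point K(g, alpha) = 1 - (1 - alpha)**(1/g)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / g)


for g in (5, 4, 3, 2, 1):
    print(g, round(sidak_point(g), 4))
```

As g grows the per-test threshold shrinks, approaching the Bonferroni value α/g.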

(22)

Correction of type I errors

• Westfall and Young step-down procedure:

• The p-values are written in increasing order p(1), p(2), …, p(g), with p(j) ≤ p(k) for j < k. The corresponding null hypotheses are H0(1), H0(2), …, H0(g).

• Step 1: if p(1) < K(g, α), the null hypothesis is rejected.

• Step 2: if p(2) < K(g−1, α), the null hypothesis is rejected.

• Step j: if p(j) > K(g−j+1, α), all null hypotheses from that point on are accepted and no further tests are performed.
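The step-down rule as stated on this slide can be sketched in a few lines: sort the p-values, compare the j-th smallest with K(g − j + 1, α) = 1 − (1 − α)^(1/(g − j + 1)), and stop at the first failure. The function names and the fifth p-value are hypothetical. The sketch follows the steps as listed here; the resampling-based adjustment usually associated with Westfall and Young is not implemented.

```python
def sidak_point(g, alpha):
    """K(g, alpha) = 1 - (1 - alpha)**(1/g)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / g)


def step_down(p_values, alpha=0.05):
    """Return one boolean per input p-value: True if its null hypothesis is rejected."""
    g = len(p_values)
    order = sorted(range(g), key=lambda i: p_values[i])
    rejected = [False] * g
    for step, i in enumerate(order, start=1):
        if p_values[i] < sidak_point(g - step + 1, alpha):
            rejected[i] = True
        else:
            break  # all remaining null hypotheses are accepted
    return rejected


# Five tests; the last p-value is a made-up example.
p_values = [0.047, 1e-7, 2e-5, 5e-7, 0.30]
print(step_down(p_values))  # [False, True, True, True, False]
```

Note that 0.047 would pass an uncorrected α = 0.05 test but fails here, because at that step it is compared with K(2, 0.05), roughly 0.0253.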

(23)

Correction of type I errors

$K(g, \alpha) = 1 - (1 - \alpha)^{1/g}$

P (gene differentially expressed)    α'
< 10^-6                              1 − (1 − 0.05)^(1/5) = 0.0102
< 10^-6                              1 − (1 − 0.05)^(1/4) = 0.0127
2 × 10^-5                            1 − (1 − 0.05)^(1/3) = 0.0170
0.047                                1 − (1 − 0.05)^(1/2) = 0.0253

(24)
(25)

Mining differential expression data

• In order to grasp information on transcriptionally controlled molecular mechanisms it is important to:

• group genes by transcription profile

• link functions to differentially expressed genes

• link transcriptional features to co-regulated genes

(26)

[Figure: expression profiles of the DOWN, UP and MIX clusters, with cluster sizes (137), (91) and (156), obtained by K-way clustering and by quality-adaptive clustering.]

(27)

Linking functions to differentially expressed genes: GO

• Gene Ontology (GO) is a dynamic controlled vocabulary that can be applied to all organisms even as knowledge of gene and protein roles in cells is accumulating and changing.

• GO might link differentially expressed genes to specific functional classes.

(28)

GO

• Molecular Function: the tasks performed by individual gene products; examples are transcription factor and DNA helicase.

• Biological Process: broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions.

• Cellular Component: subcellular structures, locations, and macromolecular complexes; examples include nucleus,

(29)

GO & p63-regulated genes

• Gene Ontology annotations can be retrieved from the LocusLink database.

• Genes can be clustered using GO features

(30)

p63-regulated genes GO

[Figure: GO classes of p63-regulated genes: signal transduction; lipid metabolism; oncogenesis; transcription from Pol II promoter and regulation from Pol II promoter.]

(31)

[Figure: number of items (0 to 8) per GO class (signal transduction, lipid metabolism, oncogenesis, transcription, cell proliferation) in the Down, Mix and Up clusters; panel label: P53 RE promoters.]

Gene Ontology classes are spread within p63 differential expression clusters

(32)

Analysis of scientific literature

Information Extraction

• Texts can be retrieved (e.g. MedLine queries) using specific keywords (a priori knowledge).

Text Mining

• Word patterns are located in the text.

(33)

[Flowchart: literature mining workflow.]

• LocusLink extraction: LL ids and gene names → LL id → symbol vocabulary.
• Abstract extraction by symbols from NCBI MedLine (gene names AND/OR keywords) → MedLine subset of gene-related abstracts.
• Clustering of abstracts by their information content.

(34)
(35)

• The Agrawal Apriori rule induction algorithm is a powerful method to find regularities in a set of documents. The tool tries to find sets of items that are frequently found together, so that from the presence of certain items in a set of documents one can infer that other items are present.
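The core of Apriori-style rule induction is a level-wise search for itemsets that occur together often enough. The sketch below is a compact illustration of that idea, not the tool used in the slides; the function name, the support threshold and the example "documents" (sets of terms) are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search for itemsets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]
    frequent = {}
    while current:
        kept = {}
        for candidate in current:
            s = support(candidate)
            if s >= min_support:
                kept[candidate] = s
        frequent.update(kept)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        candidates = {a | b for a, b in combinations(kept, 2) if len(a | b) == len(a) + 1}
        # Prune step: every k-item subset of a candidate must itself be frequent.
        current = [
            c for c in candidates
            if all(frozenset(sub) in kept for sub in combinations(c, len(c) - 1))
        ]
    return frequent


# Hypothetical documents, each reduced to the set of terms it contains.
documents = [
    {"p63", "apoptosis", "p53"},
    {"p63", "apoptosis"},
    {"p63", "apoptosis", "p53"},
    {"keratinocyte"},
]
for itemset, s in sorted(frequent_itemsets(documents, 0.5).items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

The prune step is what makes the search tractable: an itemset can only be frequent if all of its subsets are frequent, so most candidates are discarded without counting them.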

(36)

Promoter analysis

• Identification of common transcriptional elements within co-regulated genes might help to unmask hidden regulatory mechanisms.

• Identification of transcriptional regulatory element (TRE) clusters linked to co-regulated genes might help to identify, in the genome, uncharacterized genes regulated by the same mechanism.

(37)

Promoter analysis: limitations

• Computational tools are quite inefficient in the identification of eukaryotic promoters.

Our approach:

• Rely on NCBI genomic annotations.

• Repeat the analysis as annotations are upgraded

(38)

[Flowchart: extraction of non-coding regions.]

• Input: Hs genome data, Dr genome data, Mm genome data.
• Automatic download (Curl/Zlib).
• Non-coding regions extraction (LL Id linked): Upstream, 5'UTR, Introns, 3'UTR, Downstream, located relative to the transcription start site.

(39)

Promoter analysis

• Transcriptional signatures might be linked to co-regulated genes.

• Transcriptional signatures might be characterized by the presence of TRE clusters.

• The identification of TRE clusters (e.g. CISTER) relies on the availability of a set of potential TREs mapped on promoters.

(40)

• TRE identification by statistical approaches is strongly dependent upon the tool used to search for them.

• Number of motifs for each promoter sequence, calculated on the basis of Match program results (154 human promoter sequences); default cutoff; Transfac 5.2 matrices.

• Motif distribution using the Patser program (154 human promoter sequences); cutoff p < 10^-5; Transfac 5.2 matrices;

(41)

Regulatory regions binary representation

[Figure: the Upstream, 5'UTR and Intron regions of each gene are scanned with Patser (Hertz & Stormo 1999) using Transfac matrices (an example A/C/G/T alignment matrix is shown), and each gene is described by a binary vector over the motifs M1, M2, ..., Mn.]

         M1 M2 ... Mn
Gene i   1 1 1 1 0 1 1 0 1 0 1 0
Gene j   1 0 0 1 1 1 0 1 0 0 0 1

(42)

[Flowchart: from upstream sequences to binary motif vectors.]

• Input: TRANSFAC alignment matrices and upstream sequences (2 Kb).
• Upstream sequences are shuffled and scanned with PATSER until 1000 shufflings are reached; the resulting scores define the cutoff (cutoff = 100α%).
• The original upstream sequences are scanned with PATSER to obtain raw transcriptional location scores.
• Raw score > cutoff? Yes → 1; No → 0.
• Output: one binary vector per gene (Gene1 001..101, Gene2 111..000, GeneN 100..111).
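The shuffling step can be sketched as an empirical null distribution: score many shuffled copies of the upstream sequence, take the score that delimits the upper 100α% tail as the cutoff, and record 1 when the real sequence scores above it. Everything below is a toy illustration: `best_window_score` is a stand-in for a PATSER raw score, and the matrix, sequence and function names are hypothetical.

```python
import random


def best_window_score(sequence, pwm):
    """Highest sum of per-position scores over all windows (toy stand-in for a raw score)."""
    width = len(pwm)
    return max(
        sum(pwm[k][sequence[start + k]] for k in range(width))
        for start in range(len(sequence) - width + 1)
    )


def shuffled_cutoff(sequence, pwm, alpha=0.05, n_shuffles=1000, seed=0):
    """Score delimiting the upper alpha tail of the scores of shuffled sequences."""
    rng = random.Random(seed)
    letters = list(sequence)
    null_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        null_scores.append(best_window_score("".join(letters), pwm))
    null_scores.sort()
    n_tail = round(alpha * n_shuffles)
    return null_scores[max(0, n_shuffles - n_tail - 1)]


def binary_profile(sequence, pwms, alpha=0.05, n_shuffles=1000):
    """One 0/1 entry per matrix: 1 if the raw score exceeds the shuffling-based cutoff."""
    return [
        int(best_window_score(sequence, pwm) > shuffled_cutoff(sequence, pwm, alpha, n_shuffles))
        for pwm in pwms
    ]


# Toy matrix that scores 1 per matching base of the word TATA.
tata = [
    {"A": 0, "C": 0, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 0, "T": 0},
]
upstream = "G" * 10 + "TATA" + "G" * 10
print(binary_profile(upstream, [tata]))
```

Shuffling preserves base composition, so the cutoff adapts to each sequence: a motif is called present only if it scores higher than composition alone would explain.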

(43)

Promoter analysis

• A commercial association rule like "If a customer buys wine and bread, he often buys cheese, too." can be rewritten in a gene-oriented way:

• "If an upstream region contains at least one AP2 and one SP1, it often contains Oct1, too."

• Transcriptional signatures can be mined by consolidated statistical techniques used in other fields (e.g. association rule induction).
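A gene-oriented rule such as {AP2, SP1} → {Oct1} is usually judged by its support and confidence. The sketch below computes both over a set of genes; the gene names and motif assignments are invented, and support is taken here as the fraction of genes containing every item of the rule (an assumption, since some tools define support on the antecedent only).

```python
def rule_stats(gene_motifs, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent over all genes."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_antecedent = [m for m in gene_motifs.values() if antecedent <= set(m)]
    with_both = [m for m in with_antecedent if consequent <= set(m)]
    support = len(with_both) / len(gene_motifs)
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


# Hypothetical upstream regions described by the motifs found in them.
gene_motifs = {
    "geneA": {"AP2", "SP1", "Oct1"},
    "geneB": {"AP2", "SP1", "Oct1"},
    "geneC": {"AP2", "SP1"},
    "geneD": {"SP1"},
}
support, confidence = rule_stats(gene_motifs, {"AP2", "SP1"}, {"Oct1"})
print(support, confidence)
```

In this toy data the rule holds in 2 of 4 genes (support 0.5) and in 2 of the 3 genes that contain both AP2 and SP1 (confidence about 0.67).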

(44)

[Flowchart: rule induction on p63-responsive genes.]

• Input: 304 p63-responsive genes, described by 317 PWMs.
• Is the PWM present in >70% of the genes? Yes: Sp1, Major T Ag. No: the PWM is passed to rule induction.
• Apriori rules induction (confidence = 100%, support ≥ 10%): 20049 rules were generated.
• Ranking of the rules by frequency in the UP, DW and MX expression classes; counts shown in the figure: 12, 95, 19, 26.

Common PWMs: AP-2 alfa, AP-2 gamma, MAZ, MAZR
UP & MX PWMs: Muscle initiator, TFII-I
DW PWM: Sp3
MX PWMs: E12

(45)

Common Rules

AP-2alpha AP-2gamma GCbox

AP-2alpha AP-2gamma GCbox MAZ MAZR

UP-MX Specific Rules

GCbox Muscle initiator sequence-20 Muscle initiator sequences-19

GCbox MAZ MAZR Muscle initiator Muscle initiator sequences-19

DW Specific Rules

AP-2alpha AP-2gamma GCbox Sp3

(46)

Are the transcriptional rules associated in clusters?

[Flowchart: Apriori rules → Transfac matrices describing each rule → CISTER → transcriptional rules clusters → cloning with TATA box upstream to luciferase reporter gene → cotransfection with p63 in Saos-2. The rules are derived from the p63-responsive genes expression classes.]

(47)

Cytogenetics and arrays

• Arrays can also be used for purposes other than simple expression analysis.

• An interesting approach is the use of arrays to obtain fine maps of the genome (<100 Kb) that make it possible to identify microdeletions and duplications that cannot be visualized with conventional cytogenetic techniques

(48)

CGH

• CGH (comparative genomic hybridization) is a molecular cytogenetics method for identifying genetic changes.

• Alterations are classified as DNA gain or DNA loss.

• Equal amounts of biotin-labelled tumour DNA and digoxigenin-labelled normal reference DNA are hybridized to normal metaphase chromosomes.

• The tumour DNA is visualized with fluorescein (FITC) and the normal DNA with rhodamine (TRITC); both are detected with a fluorescence microscope.

(49)

ER- ER+

• Left: data generated on 81 ER+ patients.
• Right: data generated on 21 ER- patients.

• A profile leaning to the left indicates under-representation of the genetic material in the tumour cells.

• A profile leaning to the right indicates amplification of the genetic material

(50)

CGH arrays

BAC array

Cy3 Cy5

• In array CGH the normal and the tumour DNA are labelled as in canonical CGH, but instead of hybridizing these DNAs to metaphase chromosomes, a filter is used that carries the BACs used to link the human genome sequence to the cytogenetic data.

• The array format has the advantage of being more sensitive and of having a better resolution than the conventional use of CGH on metaphase chromosomes, since the clones are directly linked to sequence information

(51)

Integration between cytogenetics and genomic sequences

• The use of this approach in ML-2 tumour cells made it possible to identify copy-number abnormalities for specific sequences.

• DNA copies were lost in 1p, 6q, 11q and 20p.
• Duplications were observed in 12, 13 and 20q.

• With this approach it was possible not only to identify specific chromosomal abnormalities: the use of CGH arrays also made it possible to map the breakpoint precisely onto a specific BAC associated with a genomic sequence.
