Microarrays based on in situ oligo synthesis
Photolithography
4/7/2003 2
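[Figure: probe pair layout. Each probe pair consists of a perfect match (PM) cell and a mismatch (MM) cell; the probe pairs designed on one gene sequence form a probe set.]
PM: ACCAGATCTGTAGTCCATGCGATGC
MM: ACCAGATCTGTAATCCATGCGATGC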
Data analysis workflow
• Input: Affymetrix .cel files.
• MAS 5.0 branch: MAS 5.0 scaling → MAS 5.0 data → % of A calls < 90%? → calculate the background (Bgd) threshold → is the average intensity > Bgd? Yes → filtered MAS 5.0 data.
• dCHIP branch: dCHIP normalization and PM modelling → dCHIP data → % of A calls < 90%? → is the SE in the upper (3%) tail of the distribution? No → filtered dCHIP data.
• Is the gene present in both the MAS and dCHIP sets? (Yes/No)
• Does the differential expression pass the SAM test? Yes → continue.
• Are the SAM data congruent with the CyberT p values? Yes → final data set.

MAS 5.0 scaling:
Arrays are scaled to the same value of median array intensity.
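The scaling step can be illustrated with a minimal sketch. It follows the slide's wording (every array is brought to the same median intensity); the function name, the toy intensities and the target value are assumptions made for illustration, and this is not the MAS 5.0 implementation.

```python
from statistics import median

def scale_to_common_median(arrays, target=None):
    """Rescale each array so that its median intensity equals `target`.

    If no target is given, the median of the per-array medians is used
    (an assumption; any common value would do).
    """
    medians = [median(a) for a in arrays]
    if target is None:
        target = median(medians)
    return [[x * target / m for x in a] for a, m in zip(arrays, medians)]

# Hypothetical intensities for two arrays (three probe sets each).
arrays = [[100.0, 200.0, 400.0], [50.0, 100.0, 300.0]]
scaled = scale_to_common_median(arrays, target=150.0)
```

After scaling, both toy arrays have a median of 150, while the ratios between probe sets within each array are unchanged.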
Invariant Set Normalization Method.
dCHIP normalization:
In this normalization procedure an array with median overall intensity is chosen as the baseline array against which other arrays are normalized at probe intensity level.
Subsequently a subset of PM probes, with small within-subset rank difference in the two arrays, serves as the basis for fitting a normalization curve.
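A simplified sketch of the idea: pick the baseline array, select the probes whose rank barely changes between the baseline and another array, and fit a curve through them. The rank-difference threshold, the toy data and the use of a straight least-squares line in place of dCHIP's smooth normalization curve are all assumptions for illustration.

```python
from statistics import median

def ranks(values):
    """Rank of each value (0 = smallest); ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

def pick_baseline(arrays):
    """Index of the array whose overall (median) intensity is the median one."""
    overall = [median(a) for a in arrays]
    return overall.index(sorted(overall)[len(overall) // 2])

def invariant_set(baseline, other, max_rank_diff):
    """Indices of probes whose rank differs by at most max_rank_diff."""
    rb, ro = ranks(baseline), ranks(other)
    return [i for i in range(len(baseline)) if abs(rb[i] - ro[i]) <= max_rank_diff]

def fit_line(xs, ys):
    """Least-squares line y = a + b*x (a stand-in for a smooth curve)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

# Hypothetical PM intensities for three arrays (five probes each).
arrays = [
    [10.0, 20.0, 30.0, 40.0, 50.0],
    [20.0, 40.0, 60.0, 80.0, 100.0],
    [30.0, 60.0, 90.0, 500.0, 120.0],   # probe 3 changes rank
]
base = pick_baseline(arrays)                      # array with the median overall intensity
inv = invariant_set(arrays[base], arrays[2], max_rank_diff=0)
a, b = fit_line([arrays[2][i] for i in inv], [arrays[base][i] for i in inv])
normalized = [a + b * x for x in arrays[2]]       # array 2 mapped onto the baseline scale
```

Only the rank-stable probes drive the fit, so a probe that really changes between arrays (probe 3 in the toy data) does not distort the normalization curve.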
MAS 5.0 software:
Log ratio versus average log intensity for two replicates, without data scaling (*, r² = 0.8995) and with data scaling (r² = 0.8995).
dCHIP software:
Log ratio versus average log intensity for two replicates, without data normalization (*; r² = 0.8995) and with data normalization.
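The quantities plotted in these panels can be computed as below. This is a minimal sketch assuming base-2 logarithms and strictly positive intensities; the function name and the toy values are illustrative only.

```python
import math

def ma_values(rep1, rep2):
    """Return (average log2 intensity, log2 ratio) for each probe set."""
    points = []
    for x, y in zip(rep1, rep2):
        lx, ly = math.log2(x), math.log2(y)
        points.append(((lx + ly) / 2, lx - ly))
    return points

# Hypothetical intensities of two probe sets in two replicates.
points = ma_values([8.0, 64.0], [2.0, 64.0])
```

For well-normalized replicates the log ratios should scatter around zero across the whole intensity range.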
dCHIP
dCHIP allows the calculation of a model-based expression index for each array, as well as a probe-sensitivity index.
By fitting the experimental probe set values to the calculated model, it is possible to define a standard error (SE) value, which gives an indication of the hybridization quality of each probe set.
MAS 5.0 software:
Distribution of intensity values in the not expressed genes set (genes called A in >90% of the arrays under analysis). The dashed line indicates the threshold value delimiting the 10% upper tail of the distribution.
dCHIP software:
The distribution of the SE associated to all probe sets was evaluated and, assuming that the vast majority of the probe sets have a good hybridization quality profile, the SE value delimiting the 3% upper tail of the distribution (dashed line) is used as a threshold to filter out those probe sets which could give misleading results due to their low hybridization quality.
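Both thresholds are upper-tail cut points of an empirical distribution. The sketch below shows one simple way to compute such a cut point and apply the SE filter; the function names, the integer-percent parameter and the toy SE values are assumptions, not the procedure used by either package.

```python
def upper_tail_threshold(values, tail_percent):
    """Largest value that is NOT in the upper `tail_percent` % of the data."""
    ordered = sorted(values)
    n_tail = len(ordered) * tail_percent // 100      # number of values in the tail
    return ordered[len(ordered) - n_tail - 1]

def filter_by_se(se_by_probe_set, tail_percent=3):
    """Keep the probe sets whose SE is at or below the upper-tail threshold."""
    cutoff = upper_tail_threshold(list(se_by_probe_set.values()), tail_percent)
    return {name: se for name, se in se_by_probe_set.items() if se <= cutoff}

# Hypothetical SE values 1..100 for 100 probe sets.
se = {f"ps{i}": float(i) for i in range(1, 101)}
kept = filter_by_se(se)                                    # drops the 3 largest SE values
bgd = upper_tail_threshold(list(se.values()), 10)          # 10% upper-tail cut point
```

The same helper serves for the 10% tail of the absent-call intensities and for the 3% tail of the SE distribution.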
Intensity values for the same data set calculated with MAS 5.0 and with dCHIP.
Distribution of fold change generated using MAS scaled data (black line) or dCHIP
Type I error correction
Step-down permutation test:
Given g hypotheses to test and a significance level α (e.g. α = 0.05), X permutations are generated. With g tests, m control samples and n experimental samples:
\[ X = \binom{n+m}{m} = \frac{(n+m)!}{m!\,\big((n+m)-m\big)!} \]
The X permutations yield t-test-based probabilities (P_calculated), which are sorted in ascending order.
The null hypothesis H0 is rejected if the probability (P_observed) of the t-test computed on the unpermuted data falls within the 100α% of the P_calculated with the lowest values (e.g. 100 × 0.05 = 5% of the P_calculated).
Type I error correction
Step-down permutation test:
If, after sorting the P_observed values in ascending order, the first test has a P_observed within the lowest 100α% of the P_calculated, it passes the test and H0 can be rejected.
For the second test, the data of the first test are discarded, together with its permutations; if its P_observed falls within the lowest 100α% of the P_calculated, it passes the test and H0 can be rejected.
The procedure is repeated until a P_observed is found that does not fall within the lowest 100α% of the P_calculated; at that point testing stops.
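A minimal sketch of the permutation idea for a single gene. It enumerates every way of assigning the pooled values to the control group, which gives the (n+m)!/(n! m!) relabelings, and reports how often a relabeling is at least as extreme as the observed data. Two assumptions: the t statistic itself is ranked rather than its p value, and the step-down loop over genes is left out. Sample values are invented.

```python
from itertools import combinations
from math import comb, sqrt
from statistics import mean, variance

def t_stat(a, b):
    """Unequal-variance t statistic between two groups."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

def permutation_p_value(ctrl, exp):
    """Fraction of relabelings with |t| at least as large as the observed |t|."""
    pooled = ctrl + exp
    observed = abs(t_stat(ctrl, exp))
    hits = total = 0
    for idx in combinations(range(len(pooled)), len(ctrl)):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(t_stat(a, b)) >= observed - 1e-12:   # tolerance for rounding
            hits += 1
    return hits / total, total

# Hypothetical expression values: m = 4 controls, n = 4 experimental samples.
ctrl = [1.0, 1.2, 0.9, 1.1]
exp = [2.0, 2.3, 1.9, 2.2]
p, n_perm = permutation_p_value(ctrl, exp)
```

With 4 + 4 samples there are 70 relabelings, so the smallest attainable two-sided p value is 2/70; this is why small experiments limit what a permutation test can declare significant.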
SAM
Bayes' theorem
Given a model M = M(w) for a data set D, and knowing the prior probability of the model, P(M), it is possible to quantify how well the model is supported, P(M|D), on the basis of the observed data D.
\[ \log P(M \mid D) = \log P(D \mid M) + \log P(M) - \log P(D) \]
Here P(M|D) is the posterior, P(M) the prior of the model, and P(D) the probability of the data.
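Bayesian regularized t-test (Baldi & Long 2001)
\[ t = \frac{m_C - m_T}{\sqrt{\dfrac{\sigma_C^2}{n_C} + \dfrac{\sigma_T^2}{n_T}}} \]
where m, σ² and n are the mean, the variance and the number of replicates of the control (C) and treated (T) samples.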
Bayesian regularized t-test
[Figure: standard deviation (st.dev) versus mean for the controls, comparing the empirical estimates (Ctrls) with the Bayesian regularized ones (BaysCtrls); one panel on the raw scale, one on the natural-log scale (ln Ctrls, ln BaysCtrls).]
Bayesian regularized t-test
Optimization of the Bayesian t-test parameters:
The average gene variability (w → σ0) and the Bayesian confidence coefficient (ν0 → ν0σ0²) must be defined by the user.
Compensation for type I errors is performed using the Bonferroni correction.
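A sketch of how the two user-defined parameters enter the test. The regularized variance used here, (ν0·σ0² + (n−1)·s²) / (ν0 + n − 2), is recalled from Baldi & Long (2001) and should be checked against the paper; the function names, the background variances and the sample values are assumptions for illustration.

```python
from math import sqrt
from statistics import mean, variance

def regularized_variance(values, sigma0_sq, nu0):
    """Blend the empirical variance s^2 with a background variance sigma0^2.

    nu0 acts as the number of pseudo-observations backing sigma0^2.
    """
    n = len(values)
    return (nu0 * sigma0_sq + (n - 1) * variance(values)) / (nu0 + n - 2)

def regularized_t(ctrl, treated, sigma0_sq_c, sigma0_sq_t, nu0):
    """t statistic computed with regularized variances in the denominator."""
    var_c = regularized_variance(ctrl, sigma0_sq_c, nu0)
    var_t = regularized_variance(treated, sigma0_sq_t, nu0)
    return (mean(ctrl) - mean(treated)) / sqrt(var_c / len(ctrl) + var_t / len(treated))

def bonferroni_alpha(alpha, g):
    """Per-test significance level for g tests."""
    return alpha / g

# Hypothetical log-intensities; background variance 2.0 backed by nu0 = 10.
ctrl, treated = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
t = regularized_t(ctrl, treated, sigma0_sq_c=2.0, sigma0_sq_t=2.0, nu0=10)
```

With few replicates the empirical variance is unstable, so a larger ν0 pulls the estimate toward the background variance of genes with similar expression levels.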
Type I error correction
Hypothesis H0: do the mean of the treated samples and the mean of the controls belong to the same distribution?
Type I error: rejecting H0 when it is true.
Sidak significance point:
\[ K(g, \alpha) = 1 - (1 - \alpha)^{1/g} \]
α = accepted error level (e.g. 0.05); g = number of independent tests.
If the calculated p values are smaller than K(g, α), the corresponding null hypotheses are rejected.
Type I error correction
Westfall and Young step-down procedure:
The p values are sorted in ascending order, p(1), p(2), …, p(g), with p(j) ≤ p(k) for j < k. The corresponding null hypotheses are H0(1), H0(2), …, H0(g).
Step 1: if p(1) < K(g, α), the null hypothesis is rejected.
Step 2: if p(2) < K(g−1, α), the null hypothesis is rejected.
Step j: if p(j) > K(g−j+1, α), all null hypotheses from that point on are accepted and no further tests are performed.
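Type I error correction
Worked example (α = 0.05, thresholds starting from g = 5); a gene is called differentially expressed when P < α':

P            α'
< 10^-6      1 − (1 − 0.05)^(1/5) = 0.0102
< 10^-6      1 − (1 − 0.05)^(1/4) = 0.0127
2 × 10^-5    1 − (1 − 0.05)^(1/3) = 0.0170
0.047        1 − (1 − 0.05)^(1/2) = 0.0253
…

\[ K(g, \alpha) = 1 - (1 - \alpha)^{1/g} \]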
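The step-down rule with Sidak significance points can be sketched in a few lines. The function names and the fifth p value (0.5) are assumptions added to complete the example; the other p values are stand-ins for the ones listed in the table.

```python
def sidak_k(g, alpha):
    """Sidak significance point K(g, alpha) = 1 - (1 - alpha)**(1/g)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / g)

def step_down(p_values, alpha=0.05):
    """Indices (into p_values) of the null hypotheses that are rejected."""
    g = len(p_values)
    order = sorted(range(g), key=lambda i: p_values[i])   # ascending p values
    rejected = []
    for j, i in enumerate(order, start=1):
        if p_values[i] < sidak_k(g - j + 1, alpha):
            rejected.append(i)
        else:
            break        # this and all remaining hypotheses are accepted
    return rejected

# Illustrative p values for g = 5 tests, in arbitrary gene order.
p_values = [2e-5, 1e-7, 0.047, 1e-7, 0.5]
rejected = step_down(p_values)
```

The fourth smallest p value (0.047) would pass an uncorrected 0.05 threshold but fails against K(2, 0.05) = 0.0253, so testing stops there.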
Mining differential expression data
In order to grasp information on transcriptionally controlled molecular mechanisms it is important to:
• group genes by transcription profile
• link functions to differentially expressed genes
• link transcriptional features to co-regulated genes
[Figure: expression profiles of the DOWN, UP and MIX classes (counts as printed: 137, 91, 156), obtained by K-way clustering and quality-adaptive clustering.]
Linking functions to differentially expressed genes: GO
Gene Ontology (GO) is a dynamic controlled vocabulary that can be applied to all organisms even as knowledge of gene and protein roles in cells is accumulating and changing.
GO might link differentially expressed genes to specific functional classes.
GO
Molecular Function:
the tasks performed by individual gene products; examples are transcription factor and DNA helicase.
Biological Process:
broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions.
Cellular Component:
subcellular structures, locations, and macromolecular complexes; examples include nucleus, …
GO & p63-regulated genes
Gene Ontology annotations can be retrieved from the LocusLink database.
Genes can be clustered using GO features.
GO: p63-regulated genes
[Figure: number of items per GO class in the Down, Mix and Up expression clusters. Classes shown: signal transduction, lipid metabolism, oncogenesis, cell proliferation, transcription from polII promoter & regulation from polII promoter; P53 RE promoters.]
Gene Ontology classes are spread within p63 differential expression clusters.
Analysis of scientific literature
• Information Extraction
  Texts can be retrieved (e.g. MedLine queries) using specific keywords (a priori knowledge).
• Text Mining
  Word patterns are located in the text.
[Figure: literature-mining workflow. LocusLink extraction builds an LL id → symbol vocabulary (LL id, gene names); abstracts are extracted from NCBI MedLine by symbols, using gene names AND/OR keywords, giving a MedLine subset of gene-related abstracts; the abstracts are then clustered by their information content.]
The Agrawal Apriori rules induction algorithm is a powerful method to find regularities in a set of documents. The tool tries to find sets of items that are frequently found together, so that from the presence of certain items in a set of documents one can infer that other items are present.
Promoters analysis
Identification of common transcriptional elements within co-regulated genes might help to unmask hidden regulative mechanisms.
Identification of transcriptional regulative element (TRE) clusters linked to co-regulated genes might help to identify, in the genome, uncharacterized genes regulated by the same mechanism.
Promoters analysis: limitations
Computational tools are quite inefficient in the identification of eukaryotic promoters.
Our approach:
• Rely on NCBI genomic annotations.
• Repeat the analysis as annotations are upgraded.
[Figure: Hs, Dr and Mm genome data → automatic download (Curl/Zlib) → extraction of non-coding regions (LL Id linked): upstream, 5'UTR, introns, 3'UTR and downstream regions, shown relative to the transcription start site.]
Promoters analysis
Transcriptional signatures might be linked to co-regulated genes.
Transcriptional signatures might be characterized by the presence of TRE clusters.
The identification of TRE clusters (e.g. CISTER) relies on the availability of a set of potential TREs mapped on promoters.
TREs identification, by statistical approaches, is strongly dependent upon the tool used to search them.
[Figure: number of motifs for each promoter sequence, calculated on the basis of Match program results (154 human promoter sequences); default cutoff; Transfac 5.2 matrices.]
[Figure: motifs distribution by using the Patser program (154 human promoter sequences); cutoff p< -105 [sic]; Transfac 5.2 matrices.]
Regulative regions binary representation
[Figure: each gene (Gene i, Gene j) is described by a binary vector over the motifs M1, M2, ..., Mn, computed on the upstream, 5'UTR and intron regions. A Transfac alignment matrix (rows A, C, G, T; columns 1 to 11) is shown as an example.]
Patser (Hertz & Stormo 1999), Transfac matrices
[Figure: workflow. TRANSFAC alignment matrices and upstream sequences (2 kb) are passed to PATSER to obtain raw transcriptional location scores. The upstream sequences are also shuffled (1000 shufflings) and scored with PATSER to set the cutoff at 100α%. Raw score > cutoff? Yes → 1, No → 0, giving one binary string per gene (Gene1 001..101, Gene2 111..000, …, GeneN 100..111).]
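The binarization logic in the workflow can be sketched with a toy scorer. This is not PATSER: the score is a plain sum of matrix counts, the matrix is invented rather than taken from Transfac, and the function names, seed and integer-percent cutoff are assumptions. Only the shuffle-then-threshold idea is taken from the slide.

```python
import random

# Hypothetical 3-column count matrix (rows A, C, G, T); not a real Transfac entry.
COUNTS = {"A": [8, 0, 0], "C": [0, 8, 0], "G": [0, 0, 8], "T": [0, 0, 0]}

def best_score(seq, counts):
    """Highest sum of matrix counts over all windows of the sequence."""
    width = len(counts["A"])
    return max(
        sum(counts[base][k] for k, base in enumerate(seq[i:i + width]))
        for i in range(len(seq) - width + 1)
    )

def shuffled_cutoff(seq, counts, alpha_percent=5, n_shuffles=1000, seed=0):
    """Score at the (100 - alpha_percent)th percentile of shuffled sequences."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_shuffles):
        letters = list(seq)
        rng.shuffle(letters)                      # preserves base composition
        scores.append(best_score("".join(letters), counts))
    scores.sort()
    index = len(scores) * (100 - alpha_percent) // 100
    return scores[min(index, len(scores) - 1)]

def binary_call(seq, counts):
    """1 if the raw score exceeds the shuffling-derived cutoff, else 0."""
    return 1 if best_score(seq, counts) > shuffled_cutoff(seq, counts) else 0

with_motif = "T" * 15 + "ACG" + "T" * 15          # toy upstream region with one site
without_motif = "T" * 33                          # toy upstream region with none
calls = [binary_call(with_motif, COUNTS), binary_call(without_motif, COUNTS)]
```

Shuffling keeps the base composition of each upstream region, so the cutoff adapts to sequences that would score highly by composition alone.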
Promoters analysis
A commercial association rule like "If a customer buys wine and bread, he often buys cheese, too." can be rewritten in a gene oriented way:
"If an upstream region contains at least one AP2 and one SP1, it often contains Oct1, too."
Transcriptional signatures can be mined by consolidated statistical techniques used in other fields (e.g. association rule induction).
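The two numbers that rank such a rule, support and confidence, can be computed directly from the binary motif description of each gene. In this sketch support is the fraction of genes containing both antecedent and consequent (some Apriori implementations report the antecedent's support instead); the gene names and motif sets are invented.

```python
def rule_stats(genes, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_antecedent = [motifs for motifs in genes.values() if antecedent <= motifs]
    with_both = [motifs for motifs in with_antecedent if consequent <= motifs]
    support = len(with_both) / len(genes)
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence

# Hypothetical motif content of four upstream regions.
genes = {
    "gene1": {"AP2", "SP1", "Oct1"},
    "gene2": {"AP2", "SP1", "Oct1"},
    "gene3": {"AP2", "SP1"},
    "gene4": {"SP1"},
}
support, confidence = rule_stats(genes, {"AP2", "SP1"}, {"Oct1"})
```

Here the rule "AP2 and SP1 imply Oct1" holds in two of the three regions that contain both AP2 and SP1, and those two regions are half of all genes.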
[Figure: rule-mining workflow. 304 p63-responsive genes, described by 317 PWMs → Apriori rules induction (confidence = 100%, support ≥ 10%) → 20049 rules were generated → rules ranked by frequency in the UP, DW and MX expression classes → is the PWM present in >70% of the genes? (Yes/No); Sp1, Major T Ag. Counts as printed: 12, 95, 19, 26 (DW, MX, UP).]
Common PWMs: AP-2 alfa, AP-2 gamma, MAZ, MAZR
UP & MX PWMs: Muscle initiator, TFII-I
DW PWM: Sp3
MX PWMs: E12
Common Rules
• AP-2alpha, AP-2gamma, GCbox
• AP-2alpha, AP-2gamma, GCbox, MAZ, MAZR
UP-MX Specific Rules
• GCbox, Muscle initiator sequence-20, Muscle initiator sequences-19
• GCbox, MAZ, MAZR, Muscle initiator, Muscle initiator sequences-19
DW Specific Rules
• AP-2alpha, AP-2gamma, GCbox, Sp3
Are the transcriptional rules associated in clusters?
[Figure: workflow. Apriori rules → Transfac matrices describing each rule → CISTER → transcriptional rules clusters → cloning with TATA box upstream to luciferase reporter gene → cotransfection with p63 in Saos-2 → p63-responsive genes expression classes.]
Cytogenetics and arrays
Arrays can also be used for purposes other than simple expression analysis.
An interesting approach is the use of arrays to obtain fine maps of the genome (<100 kb) that allow the identification of microdeletions and duplications that cannot be visualized with conventional cytogenetic techniques.
CGH
CGH is a molecular cytogenetics method for the identification of genetic changes.
Alterations are classified as DNA gain or DNA loss.
Equal amounts of biotin-labeled tumor DNA and digoxigenin-labeled normal reference DNA are hybridized to normal metaphase chromosomes.
The tumor DNA is visualized with fluorescein (FITC) and the normal DNA with rhodamine (TRITC), and both are detected with a fluorescence microscope.
[Figure: CGH profiles; labels: Loss, ER-, ER+.]
Left: data generated on 81 ER+ patients. Right: data generated on 21 ER- patients.
A profile tending to the left indicates under-representation of the genetic material in the tumor cells.
A profile tending to the right indicates amplification of the genetic material.
CGH arrays
[Figure: BAC array; Cy3, Cy5.]
In array CGH, normal and tumor DNA are labeled as in conventional CGH, but instead of being hybridized to metaphase chromosomes they are hybridized to a filter carrying the BACs used to link the human genome sequence to the cytogenetic data.
The array format has the advantage of being more sensitive and having better resolution than conventional CGH on metaphase chromosomes, since the clones are directly linked to sequence information.
Integration of cytogenetics and genomic sequences
Applying this approach to ML-2 tumor cells made it possible to identify copy-number abnormalities for specific sequences.
DNA copies were lost in 1p, 6q, 11q and 20p. Duplications were observed in 12, 13 and 20q.
This approach not only identified specific chromosomal abnormalities: the CGH arrays also made it possible to map the breakpoint precisely to a specific BAC associated with a genomic sequence.