As Different as Chalk and Cheese:
Meta-statistics for Biomarker Detection
Ron Wehrens Pietro Franceschi
Biostatistics and Data Management
http://cri.fmach.eu/BDM
StatSeq Workshop, Verona April 18–19, 2012
Biomarker identification (2-class situation)
G1
G2
G1
Biomarker identification (2-class situation)
G1
G2
G1
G2
Common approaches
Context Tools Challenges
Univariate statistical tests multiple testing
correlations
Multivariate PCLDA, PLSDA, VIP, ... model validation
variable selection optimization criterion
Common approaches
Context Tools Challenges
Univariate statistical tests multiple testing
correlations
Multivariate PCLDA, PLSDA, VIP, ... model validation
variable selection optimization criterion
local optima
Spike-in apple data
10 control apples
10 treated apples - spiked 9 chemical compounds 3 sets of concentrations ESI+ and ESI−
In each: 22 “true” biomarkers
Total: six comparisons between treated and control
One apple sample – the result
Higher criticism
“God loves 0.05 just as much as 0.06” – selecting optimal thresholds
Second-level of significance testing (Tukey, 1976) “Real differences are rare and weak”
Here: extension to multivariate methods Compare to “standard” thresholds:
α = 0.05,VIP= 1
Dohono and Jin, PNAS (2008) Wehrens and Franceschi – submitted.
Higher Criticism: theory
HC= max i √ N(i/N − pi) p i/N(1 − i/N) ● ● ● ● ● ● ● ●● ●● ●● ●● ●●● ●●●● ●●● ●●●●●●●●● ●●●●●● ●●● ●●●● ●● ●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ●●●● ●● ●●●●●●●●● ●●●●●●●●●●●●●●●●●●● ●● ●●●● ●●●●●●●●●●●●●●●●● ●● ●● 0.00 0.02 0.04 0.06 0.08 0.10 1.0 2.0 3.0 4.0 HC thresholding, alpha = 0.1 Ordered scores HCiResults for apple data: selection efficiency
Efficiency (in %) 0.1 0.2 0.3 0.4 0.5 Negative ionization PCLDA (2 LVs) PCLDA (4 LVs) PCLDA (10 LVs) PLSDA (2 LVs) PLSDA (4 LVs) PLSDA (10 LVs) VIP (2 LVs) VIP (4 LVs) VIP (10 LVs) t test 0.1 0.2 0.3 0.4 0.5 Positive ionization Group 1 Group 2 Group 3Stability-based biomarker selection
Central idea: good biomarkers are stable
Perturb data by subsampling (samples as well as variables)
Fit many models on many perturbed data sets Which variables come up consistently at the top? Parameters: defining subsampling, consistency
Wehrens et al., Anal. Chim. Acta (2011)
SBBS results
Simulations based on real apple data
0.0 0.1 0.2 0.3 0.4 0.5 0.5 0.6 0.7 0.8 0.9 1.0 Scenario E FPR TPR ● ● ● ● ● ● ● PCLDA (4 PCs): CBBS PCLDA (4 PCs): SBBS t−test: CBBS t−test: SBBS 0.0 0.1 0.2 0.3 0.4 0.5 0.5 0.6 0.7 0.8 0.9 1.0 Scenario E FPR TPR ● ● ● ● ● ● ● PLSDA (4 LVs): CBBS PLSDA (4 LVs): SBBS t−test: CBBS t−test: SBBS
Biomarker selection – conclusions
Both SBBS and HC form viable alternatives Much better than “standard” thresholds HC: very few parameters
... but time-consuming SBBS: faster
... but more settings
... and you need > 10 replicates
Both implemented in R package BioMark, available from http://cran.r-project.org