ENIGMA COINSTAC: Increasing neuroimaging data diversity with managed privacy

Academic year: 2021

INTRODUCTION

• One of the central problems across the social, behavioral, and brain sciences is the use of small local samples.

• One solution to this problem is to collect samples from multiple international locations (sites) and combine these into a single analysis:

• This improves power as well as international representation in the analyses

• A roadblock to this data-sharing procedure is concern for participant privacy

Two solutions have been created for these problems in the neurosciences.

ENIGMA

The Enhancing NeuroImaging Genetics through Meta-Analysis (ENIGMA) project uses a variety of approaches including meta-analyses across international participating sites:

• This meta-analytic approach implements separate but identical analyses on distributed datasets, and then aggregates the results in a preplanned (prospective) meta-analysis

• This achieves some of the power of large samples without requiring data sharing, enhancing research without sacrificing participant privacy

• This approach has proven successful in many studies of genetics and neuroimaging

• ENIGMA has moved beyond these simple analyses to more complicated and computationally intensive ones, which the approach described here also supports

COINSTAC

The Collaborative Informatics and Neuroimaging Suite Toolkit for Anonymous Computation (COINSTAC) project has been developed to:

• Automatically analyze data across multiple sites

• Maintain data privacy and confidentiality

• Transport the derived results to a central location for meta-analysis

Here we consider an example workflow for the COINSTAC system, implementing an ENIGMA-type prospective meta-analysis.

The process begins when the lead site defines the local and global analyses.

The local analysis is sent out to be run at each site to generate intermediate results that are combined in the global analysis that is run at the lead site.

Participating sites (collaborating sites with similar data) can see how their data is to be used (they can see all of the analyses proposed).

If they agree, they may join the consortium to participate in the meta-analysis, and map their data to the analysis.

Once the participating sites have joined the consortium and mapped their data, the lead site launches all of the local analyses.

The participating sites automatically run the local analysis and return the intermediate data to the lead site.

The lead site combines the intermediate data from the local sites into the global analysis, generating the meta-analytic results.

If agreed upon, these results are returned to all the participating sites.
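The local/global split described above can be illustrated with a minimal sketch. It is not COINSTAC's API: the function names, the choice of a Fisher z-transformed correlation as the intermediate result, and the fixed-effect inverse-variance pooling are all assumptions made for illustration, and the numbers are invented.

```python
import math

def local_analysis(volumes, severities):
    """Run at each site: return summary statistics only, never raw rows."""
    n = len(volumes)
    mean_v = sum(volumes) / n
    mean_s = sum(severities) / n
    cov = sum((v - mean_v) * (s - mean_s) for v, s in zip(volumes, severities))
    var_v = sum((v - mean_v) ** 2 for v in volumes)
    var_s = sum((s - mean_s) ** 2 for s in severities)
    r = cov / math.sqrt(var_v * var_s)
    # Fisher z-transform; its sampling variance is approximately 1 / (n - 3).
    return {"z": math.atanh(r), "variance": 1.0 / (n - 3), "n": n}

def global_analysis(site_results):
    """Run at the lead site: inverse-variance weighted (fixed-effect) pooling."""
    weights = [1.0 / res["variance"] for res in site_results]
    pooled_z = sum(w * res["z"] for w, res in zip(weights, site_results)) / sum(weights)
    return {
        "pooled_r": math.tanh(pooled_z),
        "se_z": math.sqrt(1.0 / sum(weights)),
        "n_total": sum(res["n"] for res in site_results),
    }

# Invented data for two hypothetical sites (arbitrary units).
site_a = local_analysis([1, 2, 3, 4, 5], [5, 3, 4, 2, 1])
site_b = local_analysis([2, 4, 6, 8, 10, 12], [6, 5, 5, 3, 2, 1])
meta = global_analysis([site_a, site_b])
```

Only the small dictionaries returned by `local_analysis` would leave a site in this sketch; the participant-level values stay local.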

Although not used in this example, COINSTAC implements procedures where this stage of the analysis can be iterative, with the sites repeatedly communicating with the lead site to implement calculations requiring shared information.

This allows computation of much more complex analyses including machine learning techniques.
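As a rough illustration of such an iterative exchange, the sketch below fits a one-variable linear regression by gradient descent, with each site sending back only its local gradient on every round. This is a generic decentralized gradient descent written for this example, not a COINSTAC algorithm; all names and data are hypothetical, and the "sites" are simulated in a single process.

```python
def local_gradient(xs, ys, w, b):
    """Run at a site: gradient of the squared error summed over local rows."""
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys))
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys))
    return grad_w, grad_b, len(xs)

def decentralized_fit(sites, learning_rate=0.01, rounds=5000):
    """Run at the lead site: `sites` is a list of (xs, ys) pairs.

    Each round, the current (w, b) goes out and only gradients come back.
    """
    w, b = 0.0, 0.0
    for _ in range(rounds):
        replies = [local_gradient(xs, ys, w, b) for xs, ys in sites]
        n_total = sum(n for _, _, n in replies)
        w -= learning_rate * sum(gw for gw, _, _ in replies) / n_total
        b -= learning_rate * sum(gb for _, gb, _ in replies) / n_total
    return w, b

# Invented data lying exactly on y = 2x + 1, split across two sites.
sites = [([0, 1, 2], [1, 3, 5]), ([3, 4], [7, 9])]
w, b = decentralized_fit(sites)
```

In a real deployment the calls to `local_gradient` would run at separate institutions, so the loop body corresponds to one round of communication.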

You can think of “mapping the data” as pointing the local analysis to the correct columns of a simple spreadsheet, but much more complicated data structures can be accommodated.
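For the simple spreadsheet case, the mapping could look like the sketch below: each site supplies a dictionary from the variable names the analysis expects to its own column headers. The variable names, column headers, and `map_columns` helper are hypothetical and are not taken from COINSTAC.

```python
import csv
import io

# Variable names the consortium's analysis expects (hypothetical).
REQUIRED = ("hippocampus_volume", "negative_symptom_score")

def map_columns(csv_text, column_map):
    """Return rows keyed by the analysis variable names, using one site's column map."""
    missing = [name for name in REQUIRED if name not in column_map]
    if missing:
        raise ValueError(f"unmapped variables: {missing}")
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {name: float(row[column_map[name]]) for name in REQUIRED}
        for row in reader
    ]

# One site's spreadsheet, with its own (invented) column headers and values.
site_csv = "id,hipp_vol_mm3,sans_total\ns01,4100.5,12\ns02,3980.0,20\n"
site_map = {
    "hippocampus_volume": "hipp_vol_mm3",
    "negative_symptom_score": "sans_total",
}
rows = map_columns(site_csv, site_map)
```

Because the analysis only ever sees the shared variable names, each site can keep its own spreadsheet layout.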

These analyses are defined at the start so that participating sites know what is expected.

The lead site invites other sites to the consortium.

1Matthew D. Turner, 1Harshadvan Gazula, 2Peter Kochunov, 3Paul Thompson, 3Neda Jahanshad, 3Chris Ching, 3Fabrizio Pizzagalli, 4Greg Strauss, 5Anthony Ahmed, ENIGMA SZ Working Group, 1Vince D. Calhoun, 6Theo van Erp, 1Jessica A. Turner

Our Example

Meta-analysis is usually thought of as a post hoc procedure, but this is neither required nor always desirable. The procedure used here is a prospective analysis: a meta-analysis designed around data availability and developed before any analysis is done. It is one of the simplest types of analysis that the COINSTAC architecture supports.

[Figure labels: Lead Site; Participating Sites]

1Georgia State University; 2University of Maryland; 3University of Southern California; 4University of Georgia; 5Cornell University; 6University of California, Irvine

NEUROIMAGING EXAMPLE

• We are currently using this framework for a distributed meta-analysis involving 10 sites in the US, Germany, Italy, and Australia

• We have neuroimaging data from over 3,000 individuals

• We are examining the relationships between subcortical and cortical brain volumes and negative symptom severity in schizophrenia, using this local/global analysis in a meta-analysis framework

• Future analyses will include decentralized machine learning approaches for classification and prediction of symptom severity based on brain structural and functional patterns

[Figure labels: Local; Global]

For more information, contact: mturner46@gsu.edu
