
3.1 Patient selection

Patients with epilepsy of various aetiologies were recruited at the Department of Paediatric Neurology and Muscular Disease Unit, IRCCS Istituto Giannina Gaslini, between August 2019 and May 2020. Clinical and instrumental data, including brain MRI and EEG findings, treatment data and genetic results, were reviewed through our local database.

Stool consistency was assessed with the BST, which illustrates stool shapes together with precise descriptions of their consistency on an ordinal scale ranging from the hardest (type 1) to the softest (type 7). Types 1 and 2 are considered abnormally hard stools, types 3 and 4 are generally considered normal stool forms, and types 5, 6 and 7 are considered abnormally liquid stools. To simplify the collection of BST scores, a single score was recorded when a patient indicated two different stool types: scores of 3-4 were recorded as 3, scores of 1-2 as 1, and scores of 5-7 as 7.
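The score-collapsing rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the procedure used for data entry: the function name `collapse_bst` is hypothetical, and rejecting answers that span more than one category (for example types 2 and 3) is an assumption, since the text does not cover that case.

```python
def collapse_bst(reported_types):
    """Collapse one or more reported BST types (1-7) into a single score.

    Rule from the text: 1-2 -> 1, 3-4 -> 3, 5-7 -> 7 when two different
    types are indicated. A single reported type is returned unchanged.
    """
    types = set(reported_types)
    if not types or not types <= set(range(1, 8)):
        raise ValueError("BST types must be integers from 1 to 7")
    if len(types) == 1:
        return next(iter(types))
    if types <= {1, 2}:
        return 1  # abnormally hard stools
    if types <= {3, 4}:
        return 3  # normal stool forms
    if types <= {5, 6, 7}:
        return 7  # abnormally liquid stools
    # Assumption: mixed-category answers are not handled by the stated rule
    raise ValueError("reported types span more than one category")
```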

The occurrence of GI symptoms was assessed with reference to the validated Rome IV Diagnostic Questionnaire (the Rome Foundation, Inc.; https://theromefoundation.org), which investigates the frequency and intensity of the following GI symptoms: abdominal pain, constipation, diarrhea, reflux, bloating, dyspepsia, nausea and vomiting. To evaluate urinary metabolites, early morning urine samples from hospitalized epileptic patients were collected and stored at -80 °C soon after collection.

We excluded patients with progressive neurological disorders (e.g., autoimmune encephalitis and CNS tumors). The control group included age-matched neurotypical children recruited from the Blood Transfusion Centre of our Institute.

Written informed consent was obtained from patients or their parents/legal guardians, and the study was approved by our local Independent Ethics Committee (IEC).

3.2 Statistical analysis

Epileptic patients were stratified into subgroups: the Isolated Epilepsy (IE) group, including patients with epilepsy but without intellectual disability or associated neuropsychiatric comorbidities; the Epilepsy plus (Epi+) group, defined as patients with epilepsy who also showed intellectual disability and/or psychiatric symptoms; and the drug-resistant (drug-R) and drug-sensitive (drug-S) groups, which, according to the International League Against Epilepsy (ILAE), included patients taking ≥ 2 and < 2 anti-seizure medications (ASMs), respectively.

Groups were compared using Fisher's exact test. The p-value thresholds were set at 0.05 (statistically significant) and 0.001 (highly statistically significant).
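As an illustration of the group comparison, a two-sided Fisher's exact test on a 2×2 table can be written with the Python standard library alone. This is a minimal sketch, not the software used in the study; the function names are hypothetical, and treating the thresholds as inclusive (p ≤ 0.05, p ≤ 0.001) is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small relative tolerance guards against floating-point ties)
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(p, 1.0)


def significance_label(p):
    """Map a p-value to the two thresholds named in the text (assumed inclusive)."""
    if p <= 0.001:
        return "highly significant"
    if p <= 0.05:
        return "significant"
    return "not significant"
```

For example, `fisher_exact_two_sided(8, 2, 1, 5)` evaluates a table with 8 and 2 counts in the first group and 1 and 5 in the second; the counts here are invented purely for illustration.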

3.3 Chemicals

Ammonium formate, acetonitrile, methanol and formic acid (LC-MS grade) were purchased from Sigma Aldrich Srl (Milan, Italy). Water was purified by reverse osmosis and filtered through a Milli-Q purification system (Millipore, Milford, MA, USA).

A reversed-phase ACQUITY C18 BEH column (1.7 µm, 2.1 × 100 mm; Waters S.p.A., Sesto San Giovanni, Milan, Italy) and a HILIC ACQUITY BEH Amide column (1.7 µm, 2.1 × 150 mm; Waters S.p.A., Sesto San Giovanni, Milan, Italy) were used.

3.4 Mass spectrometry analysis of urine samples

Urine (50 µL) was extracted by adding 150 µL of cold (-20 °C) methanol, vortex-mixed and centrifuged at 14,000 rpm for 10 min; the extracts were used for both hydrophilic interaction liquid chromatography (HILIC) and reversed-phase (RP) chromatography. Samples were analyzed using a Vanquish Horizon UHPLC system coupled to a Q-Exactive Orbitrap mass spectrometer.

The extracted metabolites were diluted 1:2 with H2O, and 5 µL samples were injected directly onto the RP and HILIC columns. The linear gradient for the reversed-phase column started at 1% B and increased to 100% B over 15 minutes at a flow rate of 250 µL/min; the column was then re-equilibrated for 5 minutes at 1% B. The linear gradient for the HILIC column started at 90% B and decreased to 30% B over 15 minutes at a flow rate of 200 µL/min; the column was then re-equilibrated at 90% B for 9 min.

Mass spectrometry (MS) data were acquired in full scan mode in both positive and negative ionization, using a resolution of 70,000, an AGC target of 1e6 and a maximum injection time of 100 ms. In the identification phase, performed separately for each polarity, experiments were run in data-dependent acquisition mode, alternating MS and MS/MS scans. A maximum of 5 MS/MS experiments were triggered per MS scan. The intensity threshold was set at 1.6e5, using an isolation window of 1.4 Da. The m/z values of signals already selected for MS/MS were placed on an exclusion list for 20 s. For MS1 and MS2 scans, respectively, resolutions of 70,000 and 17,500, AGC targets of 1e6 and 1e5, and maximum injection times of 100 ms and 50 ms were used. If no inclusion-list entries were detected in a scan event, other masses were selected. Normalized stepped collision energies of 30, 40 and 50 were used.

Raw data files were processed with Compound Discoverer™ 3.1 software, including peak detection, peak alignment and peak integration. Raw files were aligned with adaptive curve settings. Unknown compounds were detected with a 5 ppm mass tolerance, a signal-to-noise ratio of 3, a 30% relative intensity tolerance for the isotope search and a minimum peak intensity of 500,000, and were then grouped with tolerances of 5 ppm for mass and 0.2 min for retention time. A procedural blank sample was used for background subtraction and noise removal during the pre-processing step. Peaks with less than a 3-fold increase compared to the blank samples were removed from the list.
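The tolerances quoted above translate into simple arithmetic checks. The sketch below is not Compound Discoverer's internal algorithm; it only illustrates what a 5 ppm mass tolerance, a 0.2 min retention-time tolerance and a 3-fold blank filter mean numerically. All function names are hypothetical, and keeping peaks that are absent from the blank is an assumption.

```python
def ppm_error(observed_mz, reference_mz):
    """Mass error in parts per million relative to the reference m/z."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def within_ppm(observed_mz, reference_mz, tol_ppm=5.0):
    """True if the two m/z values agree within the ppm tolerance."""
    return abs(ppm_error(observed_mz, reference_mz)) <= tol_ppm


def same_feature(mz_a, rt_a, mz_b, rt_b, tol_ppm=5.0, tol_rt_min=0.2):
    """True if two peaks agree in both mass and retention time (minutes)."""
    return within_ppm(mz_a, mz_b, tol_ppm) and abs(rt_a - rt_b) <= tol_rt_min


def passes_blank_filter(sample_area, blank_area, min_fold=3.0):
    """Keep a peak only if it is at least min_fold times the blank signal."""
    if blank_area <= 0:
        # Assumption: a peak not detected in the blank is always kept
        return True
    return sample_area / blank_area >= min_fold
```

At m/z 200, for instance, a 5 ppm window corresponds to ±0.001 Da, so a peak observed at 200.0005 matches a reference at 200.0000 while one at 200.0020 does not.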

Mass spectral peaks in the processed data were searched against both the ChemSpider™ chemical structure database and the mzCloud spectral library for metabolite identification. A customized AMRT database integrated into Compound Discoverer (CD) was also used for metabolite identification.

3.5 Untargeted metabolomics Analysis

The numbers of annotated and normalized metabolite features obtained from the three detection platforms were 11,190 for C18-positive, 11,110 for C18-negative, and 4,773 for HILIC-positive. Features from each platform were filtered with relaxed criteria: features satisfying both an “mzVault Best Match” value of 50 and an “mzCloud Best Match” value of 70 were kept, and the remainder were discarded. Accordingly, 277, 327, and 174 features passed the filtration process for C18-positive, C18-negative, and HILIC-positive, respectively. All the filtered data from the three platforms were merged. Since more than half of the features in each data type did not pass the normality test, the Mann-Whitney U test was carried out between the IE and Epi+ subgroups, or between the drug-R and drug-S subgroups, of epilepsy patients. Among multiple abundance entries for the same metabolite, the one with the minimum p-value was retained; if more than one entry shared the same minimum p-value, their abundances were averaged. Lastly, contaminants were filtered out. After the filtration and merging steps, data for a total of 369 metabolites were obtained for 43 patient samples. For the drug-resistant/drug-sensitive group analysis, two patient samples without a relevant record were discarded.
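The filtering and de-duplication steps described in this section can be sketched as follows. This is an illustrative reconstruction, not the script used in the study: the dictionary keys (`mzvault_best_match`, `mzcloud_best_match`, `name`, `p_value`, `abundances`) are hypothetical, and reading the two match values as minimum scores (at least 50 and at least 70) is an assumption.

```python
from collections import defaultdict


def keep_feature(feature, min_mzvault=50.0, min_mzcloud=70.0):
    """Relaxed filter: both match scores must reach their (assumed) minimum."""
    return (feature["mzvault_best_match"] >= min_mzvault
            and feature["mzcloud_best_match"] >= min_mzcloud)


def collapse_duplicates(entries):
    """Keep one abundance vector per metabolite name.

    The entry with the smallest p-value is retained; entries tied at that
    minimum p-value are averaged sample by sample.
    """
    by_name = defaultdict(list)
    for entry in entries:
        by_name[entry["name"]].append(entry)

    merged = {}
    for name, group in by_name.items():
        best_p = min(e["p_value"] for e in group)
        best = [e for e in group if e["p_value"] == best_p]
        n_samples = len(best[0]["abundances"])
        averaged = [
            sum(e["abundances"][i] for e in best) / len(best)
            for i in range(n_samples)
        ]
        merged[name] = {"p_value": best_p, "abundances": averaged}
    return merged
```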

3.6 Univariate Analysis

Normality and homoscedasticity were assessed in R with the Shapiro-Wilk and Levene's tests, respectively. Since most of the data did not follow a normal distribution, a non-parametric univariate test, the Mann-Whitney U test, was performed to find metabolites that differed significantly between the two groups. Metabolites with p-values equal to or less than 0.05 were used in downstream analysis.
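For readers unfamiliar with the test, the Mann-Whitney U statistic and a two-sided p-value can be computed as in the sketch below. The study ran the test in R; this standard-library Python version is only an illustration and uses the normal approximation with tie correction and no continuity correction, so its p-values can differ slightly from those of packages that use exact distributions or a continuity correction. The function names are hypothetical.

```python
from math import erfc, sqrt


def average_ranks(values):
    """1-based ranks of values, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)

    # Tie correction: each group of t tied values contributes t^3 - t
    counts = {}
    for r in ranks:
        counts[r] = counts.get(r, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u, 1.0  # all observations identical

    z = (u - n1 * n2 / 2) / sqrt(variance)
    return u, min(erfc(abs(z) / sqrt(2)), 1.0)
```

A metabolite would then be carried forward when the returned p-value is at most 0.05, as stated in the text.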

3.7 Principal Component Analysis (PCA)

Principal Component Analysis (PCA) is a dimensionality reduction method used as an unsupervised classification method (161) that can also be used to test the separation between two groups of interest; it was performed using the R packages “factoextra” and “FactoMineR” (162). Additionally, PERMANOVA, a multivariate version of ANOVA that tests the significance of the difference between cluster centroids, was performed with Euclidean distance using the adonis function in the R package “vegan” (163).
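The idea behind PERMANOVA with Euclidean distance can be illustrated with a short standard-library sketch: a pseudo-F ratio is computed from pairwise squared distances, and its significance is estimated by shuffling the group labels. This is not the vegan implementation used in the study; the function names, the default of 999 permutations and the fixed random seed are assumptions made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pseudo_f(points, labels):
    """Pseudo-F statistic from pairwise squared distances."""
    n = len(points)
    groups = {}
    for i, label in enumerate(labels):
        groups.setdefault(label, []).append(i)

    ss_total = sum(sq_dist(points[i], points[j])
                   for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for idx in groups.values():
        ss_within += sum(sq_dist(points[idx[a]], points[idx[b]])
                         for a in range(len(idx))
                         for b in range(a + 1, len(idx))) / len(idx)
    ss_between = ss_total - ss_within
    a = len(groups)
    # Assumes ss_within > 0, i.e. points within a group are not all identical
    return (ss_between / (a - 1)) / (ss_within / (n - a))


def permanova(points, labels, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    f_obs = pseudo_f(points, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance so relabelings equivalent to the original count as ties
        if pseudo_f(points, shuffled) >= f_obs * (1 - 1e-12):
            hits += 1
    return f_obs, (hits + 1) / (n_perm + 1)
```

In practice `points` would hold the per-sample metabolite abundance vectors and `labels` the subgroup of each sample.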

3.8 Partial least squares discriminant analysis (PLS-DA)

Partial least squares discriminant analysis (PLS-DA), often considered a supervised version of PCA, is a machine learning algorithm that can be used for classification and feature selection, especially in metabolomics studies (164). Like most machine learning methods, PLS-DA is sensitive to imbalanced data (165). The Synthetic Minority Oversampling Technique (SMOTE), an oversampling method, was therefore applied on the KNIME Analytics Platform to balance the IE/Epi+ or drug-resistant/drug-sensitive data before PLS-DA (166). For PLS-DA, the R package “mixOmics” was used, with the number of components set to 5 (167). The performance of the model was evaluated through 10-fold cross-validation. In addition to the area under the curve (AUC) score, two model evaluation metrics, the R2Y and Q2 scores, were recorded. R2Y describes the variance of the class response (Y) captured by the model, and Q2 reflects the predictive ability of the model (168). Although there is no exact threshold for Q2, models with Q2 scores higher than 0.5 are considered to have good predictive ability (169). As a further validation technique, a permutation test was performed with 999 permutations using the R package “RVAideMemoire”. When the model passed the performance tests, Variable Importance in Projection (VIP) scores were calculated from the final validated model to quantify the contribution of each metabolite to the separation of the classes (170). Metabolites with VIP scores higher than 1.0 were selected. Lastly, the predictive ability of the model was tested on a test dataset, and the error rate was recorded.
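The two goodness metrics and the VIP cut-off can be illustrated with the generic definitions below: R2Y is computed from fitted values and Q2 from cross-validated predictions, both as one minus the ratio of a residual sum of squares to the total sum of squares. This is a sketch under that assumption; software such as mixOmics may compute Q2 per component with a different denominator, so the values it reports need not match. The function names and the 0/1 coding of the class response are hypothetical.

```python
def explained_fraction(y_true, y_pred):
    """1 - RSS/TSS.

    With fitted values as y_pred this gives R2Y; with cross-validated
    predictions as y_pred it gives Q2 (generic definition).
    """
    mean_y = sum(y_true) / len(y_true)
    tss = sum((y - mean_y) ** 2 for y in y_true)
    rss = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1.0 - rss / tss


def select_by_vip(vip_scores, threshold=1.0):
    """Names of metabolites with VIP above the threshold, highest first."""
    selected = [name for name, vip in vip_scores.items() if vip > threshold]
    return sorted(selected, key=lambda name: -vip_scores[name])
```

With classes coded as 0 and 1, a model whose cross-validated predictions are no better than the class mean gives a Q2 of 0, and perfect predictions give 1.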

3.9 Pathway Analysis

For pathway analysis, metabolites with both p-values ≤ 0.05 from the Mann-Whitney U test and VIP scores > 1 from PLS-DA were selected. The respective KEGG IDs of the metabolites were manually retrieved from the KEGG Compound Database, based on a match with the name and formula in our dataset (171). The list of identified KEGG IDs was submitted to the online pathway enrichment analysis tool Metabolites Biological Role (MBROLE) 2.0 (172). For the analysis, “KEGG Pathways” was selected for annotation, and organism-dependent enrichment analysis was carried out by selecting “Homo sapiens” in the options. This tool calculates p-values of the enriched pathways based on the hypergeometric test; for this study, enriched pathways with FDR-corrected p-values ≤ 0.05 were recorded.
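The statistics behind this kind of enrichment analysis can be sketched as follows: a hypergeometric upper-tail p-value per pathway, followed by a false discovery rate correction. This is not MBROLE's code; the function names are hypothetical, and the use of the Benjamini-Hochberg procedure for the FDR correction is an assumption made for the example.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: compounds in the background set
    K: background compounds annotated to the pathway
    n: compounds in the submitted list
    k: submitted compounds annotated to the pathway
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Pathways whose adjusted p-value is at most 0.05 would then be the ones recorded.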
