This section presents the main results of the cleaning procedure, following the steps described in section 3.7.
The first part shows the feature selection results: the step-by-step outcomes are presented only for the gross movement pattern, whereas the final result of the redundant feature removal is shown for every movement pattern.
The second part presents the results of the noisy-instance removal only for the gross movement pattern; the other patterns give similar results.
4.4.1 Selected features
The first step of the feature selection algorithm proposed in this work is the estimation of the feature importance. Figure 4.6 shows a bar chart of the importance estimates for the gross movement cluster of movement patterns. According to these estimates, most of the features proposed in section 3.5.4 carry essential information for the bradykinesia prediction.
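As an illustration of how such importance estimates are obtained, the following is a minimal sketch of a simplified ReliefF weighting for a categorical target. It is not the implementation used to produce figure 4.6: the function name `relieff_importance`, the Manhattan distance on min-max scaled features, and the default of 10 neighbours are assumptions made for this example.

```python
import numpy as np

def relieff_importance(X, y, n_neighbors=10):
    """Simplified ReliefF weights for a classification target (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y).ravel()
    classes, y_idx, counts = np.unique(y, return_inverse=True, return_counts=True)
    n_samples, n_features = X.shape
    priors = counts / n_samples

    # Scale every feature to [0, 1] so per-feature differences are comparable.
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0
    Xs = (X - lo) / span

    weights = np.zeros(n_features)
    for i in range(n_samples):
        diff = np.abs(Xs - Xs[i])   # per-feature difference to every sample
        dist = diff.sum(axis=1)     # Manhattan distance (assumed metric)
        for c in range(len(classes)):
            idx = np.flatnonzero(y_idx == c)
            idx = idx[idx != i]     # never use the sample itself as a neighbour
            if idx.size == 0:
                continue
            nearest = idx[np.argsort(dist[idx])[:n_neighbors]]
            mean_diff = diff[nearest].mean(axis=0)
            if c == y_idx[i]:
                # Nearest hits: a feature that differs within a class is penalised.
                weights -= mean_diff
            else:
                # Nearest misses: a feature that differs across classes is rewarded,
                # weighted by the prior of the other class.
                weights += priors[c] / (1.0 - priors[y_idx[i]]) * mean_diff
    return weights / n_samples
```

A feature with a large positive weight separates the classes well, while a weight near zero or below zero indicates a feature that varies as much within a class as between classes, which is how negative bars in an importance chart can be read.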
The second result of the procedure is the correlation matrix presented in figure 4.7.
A significant number of features are highly correlated, so the redundant ones must be removed. The final result of the redundant feature removal is shown in figure 4.8, where the black squares in the correlation matrix mark the discarded features.
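One common way to combine an importance ranking with a correlation matrix is a greedy filter: features are visited from most to least important, and a feature is kept only if it is not strongly correlated with any feature already kept. The sketch below assumes this scheme; the function name `drop_redundant_features` and the threshold of 0.9 are hypothetical, and the procedure of section 3.7 may differ in its details.

```python
import numpy as np

def drop_redundant_features(X, importance, threshold=0.9):
    """Greedy correlation filter (illustrative sketch, threshold is an assumption).

    Returns the sorted column indices of the retained features.
    """
    X = np.asarray(X, dtype=float)
    importance = np.asarray(importance, dtype=float)

    # Absolute Pearson correlation between columns; constant columns give NaN,
    # which is treated here as "no correlation".
    corr = np.nan_to_num(np.abs(np.corrcoef(X, rowvar=False)))

    kept = []
    for j in np.argsort(importance)[::-1]:      # most important feature first
        if all(corr[j, k] < threshold for k in kept):
            kept.append(int(j))
    return sorted(kept)
```

With this rule, whenever two features are highly correlated, the one with the lower importance estimate is the one discarded.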
[Figure 4.6 plot: "Upper gross movements ReliefF Feature Importance". Bar chart with one bar per feature (axis "Features") and the importance on the other axis ("Predictor Importance Estimates", ticks from -0.01 to 0.035).]
Figure 4.6: Feature importance estimates computed using ReliefF on the gross movement cluster. According to the algorithm, the most important features for predicting the bradykinesia severity are the medication intake, the segment velocity features, and the entropy. The least important features for the bradykinesia estimates are the cross-correlation time lags and the dominant frequency.
[Figure 4.7 plot: "Upper gross movements Feature correlation matrix". Both axes list the features; the color scale ranges from -1 to 1.]
Figure 4.7: Feature correlation matrix of the gross movement cluster. Most features are highly correlated with each other, which justifies the removal of the redundant ones. A small number of features show very low correlation with the others, such as the energy ratio, the features derived from the channel cross-correlation, the zero-crossing rate, and the medication intake.
In addition, figure 4.9 gives a general overview of the results for the different movement patterns, with the features ranked according to their importance.
The features selected in all the clusters are invariant with respect to the movement, whereas the less frequently selected features depend on the movement pattern.
Figure 4.8: Feature correlation matrix of the gross movement pattern after the removal of redundant predictors. The black squares are the removed features; in general, some correlation remains among the retained features, but they carry most of the information for the bradykinesia estimates.
Figure 4.9: Retained features of the different movement patterns. The movement clusters have some features in common, such as the dominant frequency, the auto-correlation range, and the zero crossing, while other features detect the bradykinesia only in particular movement patterns.
4.4.2 Data cleaning
The results are shown qualitatively using t-SNE projections; the numerical results are described in section 4.7.1.
Figure 4.10 illustrates the projection of the gross movement cluster before the removal of noisy and outlier data points. The classes are not clearly separated.
Then, the result of the k-means clustering algorithm is shown in figure 4.11, where the clusters should correspond to the classes of bradykinesia severity.
Figure 4.10: t-SNE projection of the gross movement cluster before the cleaning. The overlap among the classes is severe and the variability is still high.
Figure 4.11: k-means clustering outcome of the gross movement pattern.
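The idea that the clusters should correspond to the severity classes can be turned into a cleaning rule. The sketch below assumes one such rule: k-means is run with as many clusters as classes, each cluster is assigned its majority class, and instances whose label disagrees with the majority class of their cluster are flagged as noisy. The rule, the plain NumPy k-means, and the names `kmeans` and `flag_noisy_instances` are assumptions for illustration and do not reproduce the procedure of section 3.7.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means; returns (cluster index per sample, centers)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute each center; an empty cluster keeps its previous center.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment, consistent with the returned centers.
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1), centers

def flag_noisy_instances(X, y, seed=0):
    """Boolean mask: True for instances whose label matches their cluster majority."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y).ravel()
    clusters, _ = kmeans(X, k=len(np.unique(y)), seed=seed)
    keep = np.zeros(len(X), dtype=bool)
    for j in np.unique(clusters):
        members = clusters == j
        values, counts = np.unique(y[members], return_counts=True)
        majority = values[counts.argmax()]
        keep |= members & (y == majority)   # retain only majority-class members
    return keep
```

The cleaned data set is then `X[keep], y[keep]`. In practice the features would usually be standardized before clustering, since k-means relies on Euclidean distances.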
Finally, figure 4.12 illustrates the projection after the cleaning, which shows an improved class separation compared to figure 4.10.
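A before/after comparison of this kind can be produced by projecting the feature matrix to two dimensions with t-SNE, once on the full data set and once on the cleaned one. The sketch below assumes scikit-learn's `TSNE`; the helper name `tsne_project`, the standardization step, and the perplexity value are choices made for this example, not the settings used for the figures.

```python
import numpy as np
from sklearn.manifold import TSNE

def tsne_project(X, perplexity=30.0, seed=0):
    """Project the rows of X to 2-D with t-SNE (illustrative sketch)."""
    X = np.asarray(X, dtype=float)

    # Standardize so that no single feature dominates the pairwise distances.
    std = X.std(axis=0)
    std[std == 0] = 1.0
    Xz = (X - X.mean(axis=0)) / std

    # The perplexity must stay below the number of samples.
    perplexity = min(perplexity, (len(Xz) - 1) / 3)
    tsne = TSNE(n_components=2, perplexity=perplexity, random_state=seed)
    return tsne.fit_transform(Xz)
```

Each row of the returned array gives the 2-D coordinates of one instance, which can be plotted as a scatter plot coloured by severity class. Since t-SNE preserves local neighbourhoods rather than global distances, such plots support a qualitative reading of the class overlap only.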