So far, most of the research work done in the context of privacy-preserving data mining and data analytics focuses on an organization-centric model for the personal data management

Academic year: 2021

The work of Francesca Pratesi during her PhD was aimed at addressing two problems:

a) the assessment and the enforcement of privacy in big data analytical frameworks;

b) the privacy preservation in personal data exchange within a user-centric data ecosystem.

Over the last years, knowledge discovery technologies have become increasingly central because of the availability of large quantities of data that users provide while using different kinds of services, the so-called Big Data. Big data offer many new opportunities to understand our society because they describe the activities of the population in detail; sophisticated analysis techniques have therefore been developed to gather, store and analyze increasingly complex data. The worrying side of the story is that big data contain sensitive personal information in great detail, so the opportunities for discovering knowledge grow together with the risks of privacy violation. Several techniques have been proposed to develop technological frameworks that counter privacy violations without losing the benefits of big data analytics. The “Privacy-by-Design” paradigm aims to design practical and impactful services in such a way that the quality of results can coexist with high protection of personal data.

The Privacy-by-Design paradigm aims to protect privacy by inscribing it into the design specifications of information technologies, from the very start.

So far, most of the research work done in the context of privacy-preserving data mining and data analytics has focused on an organization-centric model for personal data management. This model has some drawbacks that limit the exploitation of human data. First, personal data are often fragmented, which does not permit a holistic view of individuals. Second, users are not involved in the life cycle of their data and have very limited possibilities to control them.

Furthermore, as personal data are mainly under the control of organizations, the focus of authorities has been more on the protection of the personal data, to reduce the risks of uncontrolled uses, than on the promotion of their full utilization when paired with a higher control from their “owners”.

To counter these problems, we are witnessing a change of perspective towards a user-centric model for personal data management. This model aims to give users an active role, introducing transparency and full control over the management of personal data. This sheds new light on the privacy issues: most of the existing solutions consider an architecture where central sites make the data private before releasing them; in the new model, the privacy transformation has to be performed before the data leave the user. It is also necessary to provide organizations (data providers) and data analysts with tools for privacy risk assessment and for enforcing a desired level of privacy protection.

The contribution of the research work, and consequently of the PhD thesis of Francesca Pratesi, focuses on investigating the assessment and the enforcement of privacy in Big Data. Both goals are analyzed in two different contexts: the standard organization-centric model and the relatively novel user-centric data ecosystem.

Concerning the first point, the thesis proposes an analytical framework able to set the data free in the organization-centric model for personal data management. The idea is to provide organizations (data owners or data providers) and data analysts or service developers with tools for privacy risk assessment and for enforcing the desired level of privacy protection. The framework was instantiated using mobility data [5,6,7], GSM data [9], purchasing data [10] and a combination of them [11].

Regarding the second goal, instead, the thesis explores the enforcement of privacy in GSM data, when GSM profiles are used for quantifying typologies of city-users. In particular, the thesis proposes a privacy-preserving technique for guaranteeing privacy while the quantification results are preserved.

The thesis also investigates how the above problems may be addressed in the user-centric ecosystem and, in particular, how the Privacy-by-Design methodology can be adapted or extended to take into account some important aspects that characterize this setting. Specifically, the candidate sought to adapt the proposed risk assessment framework to the user-centric model, where each user has access only to his/her own data. Here, the thesis proposes a method to estimate individual risk [12,13] by searching for correlations between privacy risk and some individual features. The method is based on classification and regression, which makes it possible to build a predictor that can also be used by individuals not represented in the training set.

Finally, a privacy-preserving technique for individual mobility data based on differential privacy was developed [1,2,3,4]. The method guarantees individual privacy protection and an acceptable quality of the analytical results at the collective level.
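To make the trade-off between individual protection and collective quality concrete, the sketch below applies the standard Laplace mechanism to a per-user vector of movement counts before it leaves the user, and then sums the noisy vectors. This is a generic textbook mechanism, not necessarily the technique of [1,2,3,4]; the names `laplace_noise` and `perturb_counts`, the sensitivity value and the toy counts are assumptions made for illustration only.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_counts(counts, epsilon, sensitivity=1.0, rng=None):
    """Add Laplace noise of scale sensitivity/epsilon to each count.

    `sensitivity` is assumed to bound how much one user's data can change
    the counts (L1 norm); choosing it correctly is up to the caller.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return {key: value + laplace_noise(scale, rng) for key, value in counts.items()}


# Toy usage: 2000 hypothetical users each report one trip on the same link.
rng = random.Random(0)
noisy_reports = [perturb_counts({"A->B": 1}, epsilon=1.0, rng=rng) for _ in range(2000)]
collective_total = sum(report["A->B"] for report in noisy_reports)
print("noisy collective total (true value is 2000):", collective_total)
```

Each individual report is heavily perturbed, but because the noise has zero mean, the relative error of the sum shrinks as the number of users grows.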

Moreover, Francesca Pratesi actively participated in several FP7 and H2020 European projects, such as LIFT (Grant Agreement FP7-ICT-2009-C n. 255951, http://lift-eu.org) [2], Petra (Grant Agreement FP7-ICT-60904, http://petraproject.eu/) [5,6] and SoBigData (Grant Agreement #654024, http://www.sobigdata.eu) [10,11,12,13], and in some industrial projects involving partners such as UniCoop Tirreno (http://kdd.isti.cnr.it/project/livlab-livorno), Toyota (http://kdd.isti.cnr.it/project/risk-analysis-publishing-vehicular-data-toyota) and Siemens (http://kdd.isti.cnr.it/project/privacy-risk-assessment-siemens).

Francesca Pratesi has been a teaching assistant for the course Fondamenti Teorici di Programmazione (Bachelor Degree Course in Digital Humanities, University of Pisa) from 2014/2015 to 2016/2017.

Francesca Pratesi was also a Program Committee member of three international workshops (Dyno@ASONAM 2016, DyNo@ECML-PKDD 2017, PAP@ECML-PKDD 2017) and of one international conference (Data analysis & Social Mining for the Interconnected Society @ Goodtechs 2017), and a Local Committee member of two conferences (the XIII AI*IA Symposium and the 39th ACM SIGIR Conference).

Francesca Pratesi also contributed to the writing of a book chapter [8].

Finally, Francesca Pratesi fulfilled her duties as a student, taking PhD exams for 6 credits, attending 3 cycles of seminars and spending 4 months abroad to carry out her research with international partners (in particular, with the laboratory “People in Motion Lab” at the University of New Brunswick, Canada, and with the laboratory “Human Dynamics” at the Media Lab, Massachusetts Institute of Technology, USA).

References

[1] A Monreale, S Rinzivillo, F Pratesi, F Giannotti, D Pedreschi, Privacy-by-design in big data analytics and social mining, EPJ Data Science 3 (1), 1-26

[2] A Monreale, WH Wang, F Pratesi, S Rinzivillo, D Pedreschi, G Andrienko, N Andrienko, Privacy-preserving distributed movement data aggregation, AGILE 2013

[3] F Pratesi, A Monreale, WH Wang, S Rinzivillo, D Pedreschi, G Andrienko, N Andrienko, Privacy-Aware Distributed Mobility Data Analytics, SEBD 2013

[4] F Pratesi, A Monreale, WH Wang, S Rinzivillo, D Pedreschi, G Andrienko, N Andrienko, Differential privacy in distributed mobilytics, submitted to EPJ Data Science

[5] M Berlingerio, V Bicer, A Botea, S Braghin, N Lopes, R Guidotti, F Pratesi, Mobility Mining for Journey Planning in Rome, ECML PKDD 2015

[6] A Botea, S Braghin, N Lopes, R Guidotti, F Pratesi, Managing travels with PETRA: The Rome use case, IEEE ICDEW 2015

[7] F Pratesi, A Monreale, R Trasarti, F Giannotti, D Pedreschi, T Yanagihara, PRISQUIT: a system for assessing privacy risk versus quality in data sharing, submitted to Transactions on Data Privacy (TDP)

[8] G Amato, L Candela, D Castelli, A Esuli, F Falchi, C Gennaro, F Giannotti, A Monreale, M Nanni, P Pagano, L Pappalardo, D Pedreschi, F Pratesi, F Rabitti, S Rinzivillo, G Rossetti, S Ruggieri, F Sebastiani, M Tesconi, How Data Mining and Machine Learning Evolved from Relational Data Base to Data Science, A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years, Springer International Publishing, 2018, pp. 287-306

[9] P Cintia, L Gabrielli, F Giannotti, A Monreale, F Pratesi, Privacy-Aware Sociometer: a Mitigation Strategy for Quantification of City Users, ready to be submitted to Data & Knowledge Engineering (DKE)

[10] R Pellungrini, F Pratesi, L Pappalardo, Assessing Privacy Risk in Retail Data, 1st International Workshop on Personal Analytics and Privacy (PAP) @ECML PKDD 2017

[11] F Pratesi, A Monreale, F Giannotti, D Pedreschi, Privacy Preserving Multidimensional Profiling, Goodtechs 2017 (to appear)

[12] R Pellungrini, L Pappalardo, F Pratesi, A Monreale, Fast estimation of privacy risk in human mobility data, 3rd International Workshop on TEchnical and LEgal aspects of data pRIvacy and SEcurity 2017

[13] R Pellungrini, L Pappalardo, F Pratesi, A Monreale, A data mining approach to assess privacy risk in human mobility data. ACM Transactions on Intelligent Systems and Technology (TIST 9:3, special issue on urban intelligence, to appear)
