Understanding spreading and evolution in complex networks

Academic year: 2021

Letizia Milli

Nowadays, the increasing availability of Big Data describing our desires, opinions, sentiments, purchases, relationships and social connections provides access to a huge source of information on an unprecedented scale. In recent decades, Social Network Analysis (SNA) has received increasing attention from several different fields of research. This popularity is certainly due to the flexibility offered by graph theory: a powerful tool that allows many phenomena to be reduced to a common analytical framework whose basic building blocks are nodes and their relations. In particular, the analysis of diffusive phenomena that unfold on top of complex networks is a task that attracts growing interest from multiple fields of research. Understanding the mechanism behind the global spread of an epidemic or of information is fundamental for applications in diverse areas such as epidemiology and viral marketing. The contribution of the Ph.D. thesis focuses on the investigation of spreading phenomena on complex networks and related problems; the thesis develops useful tools for understanding, monitoring and signaling diffusion phenomena. The thesis of Letizia Milli aims to address several problems related to diffusion on complex networks, such as:

• developing a new library to model, simulate and study diffusion phenomena that unfold over complex networks

• finding a statistical method to validate in advance a forecast obtained by a model

• finding a partition of users that maximizes the spreading; this is of great significance for applications across various domains. For instance, the knowledge of influential spreaders is crucial for the design of effective strategies to control the outbreak of epidemics, while targeting the innovators in information diffusion is helpful for conducting successful campaigns in commercial product promotions

Two libraries were developed in the thesis: DyNetX, a package designed to model evolving graph topologies, and NDlib, a simulation framework aimed at modeling, simulating and studying diffusion phenomena that unfold over complex networks. The former extends the NetworkX project by giving the user facilities to describe and analyze networks whose nodes and edges change over time, and it is the first library specifically intended to support the easy description of complex time-dependent networks. The latter, NDlib, is a multilevel solution: it offers a programmatic interface to developers, an experimental server to those centers that need to provide simulations as a service and, finally, a visual interface for those, students as well as non-technicians, who want to run simulations and experiments but do not have the time to learn a new library or programming language. The purpose of this simulation framework is to empirically compare the effects of diffusion processes according to different diffusion models over several network topologies within different contexts. Covered models include classic and network epidemic models, threshold models and opinion dynamics models; the repertoire of models is extensible. NDlib is about to be released on SoBigData.eu. It is currently hosted on GitHub and on PyPI, and its online documentation is on ReadTheDocs.
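To make the notion of a network epidemic model concrete, the following is a minimal sketch of a discrete-time SIR process on a graph, written in plain Python. It does not use NDlib's API: the function names (simulate_sir, ring_graph), the parameters beta (infection probability) and gamma (recovery probability), and the update rule are assumptions chosen only for illustration.

```python
import random


def ring_graph(n):
    """Hypothetical helper: a ring of n nodes as {node: set(neighbors)}."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def simulate_sir(adjacency, seeds, beta, gamma, steps, rng):
    """Discrete-time SIR on a graph given as {node: set(neighbors)}.

    Returns a list of (susceptible, infected, removed) counts,
    one entry per step, including the initial state.
    """
    status = {node: "S" for node in adjacency}
    for node in seeds:
        status[node] = "I"

    def counts():
        values = list(status.values())
        return values.count("S"), values.count("I"), values.count("R")

    history = [counts()]
    for _ in range(steps):
        infected = [n for n, s in status.items() if s == "I"]
        new_status = dict(status)
        for node in infected:
            # Each infected node tries to infect each susceptible neighbor.
            for neighbor in sorted(adjacency[node]):
                if status[neighbor] == "S" and rng.random() < beta:
                    new_status[neighbor] = "I"
            # Each infected node is removed with probability gamma.
            if rng.random() < gamma:
                new_status[node] = "R"
        status = new_status
        history.append(counts())
    return history


history = simulate_sir(
    ring_graph(50), seeds=[0], beta=0.5, gamma=0.2, steps=30,
    rng=random.Random(42),
)
for step, (s, i, r) in enumerate(history[:5]):
    print(step, s, i, r)
```

Swapping ring_graph for another topology, or changing beta and gamma, is the kind of comparison across models and networks that the framework described above is meant to support at a much larger scale.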

Then, with NDlib leveraging DyNetX, the thesis compared the behaviors of classical spreading models (SI and SIR) when applied on top of the same social network whose topology dynamics are described at different temporal granularities. It was shown that analyzing diffusive phenomena without considering topology dynamics causes a relevant overestimate of the actual velocity and amplitude of the spreading. Moreover, taking advantage of the topology evolution enables the analyst to identify special context-dependent events (activity peaks) as well as activity patterns (weekend/weekdays).
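A toy example can illustrate why ignoring topology dynamics may overestimate spreading. The sketch below runs a deterministic SI process (infection probability 1) once on the aggregated static graph and once on the time-ordered snapshots. The data, the function names and the deterministic rule are illustrative assumptions; this is not the experimental setup of the thesis and it does not use DyNetX.

```python
def si_static(edges, seed, steps):
    """Deterministic SI on the aggregated graph: every edge is always usable."""
    infected = {seed}
    for _ in range(steps):
        newly = set()
        for u, v in edges:
            if u in infected or v in infected:
                newly.update((u, v))
        infected |= newly
    return infected


def si_temporal(snapshots, seed):
    """Deterministic SI where only the edges of the current snapshot are usable."""
    infected = {seed}
    for edges in snapshots:
        newly = set()
        for u, v in edges:
            if u in infected or v in infected:
                newly.update((u, v))
        infected |= newly
    return infected


# Hypothetical interaction data: one list of edges per time step.
snapshots = [
    [("b", "c")],  # t = 0
    [("a", "b")],  # t = 1
    [("c", "d")],  # t = 2
]
aggregated = [edge for edges in snapshots for edge in edges]

static_reach = si_static(aggregated, "a", steps=len(snapshots))
temporal_reach = si_temporal(snapshots, "a")
print(sorted(static_reach))    # nodes reached when time ordering is ignored
print(sorted(temporal_reach))  # nodes reached when time ordering is respected
```

On the aggregated graph the infection travels along the path a, b, c, d, whereas in the temporal version the contact between b and c has already happened before b becomes infected, so c and d are never reached.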

Continuing the study of spreading phenomena, and in particular of the diffusion of innovations, the thesis used NDlib to compare alternative modeling choices that simulate the diffusion of information, news, ideas and trends with passive and/or active processes. Two novel approaches were introduced whose aim is to provide active and mixed schemas applicable to the simulation of innovation and idea diffusion.

The thesis also tackled two issues related to diffusion. The first is the quantification task. Indeed, in the literature there is no method to validate the results obtained with a good model of diffusion. For example, pandemic forecasts have now been confirmed using actual surveillance data collected during the pandemic. The use of quantification methods could be of great help in this context. We could use labeled initial data to learn a reliable quantifier of an epidemic and then apply the quantifier to big data to monitor the epidemic, e.g., to learn the number of infected individuals and then validate our predictive model. To this end, techniques were proposed for quantification on networks that exploit the homophily effect observed in many social networks.
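As an illustration of what a quantifier does, the sketch below implements the standard "classify and count" estimate and its adjusted variant, which corrects the observed positive rate using the classifier's true-positive and false-positive rates. This is a generic textbook baseline, not the network-based, homophily-aware technique proposed in the thesis; the function names and the example numbers are assumptions.

```python
def classify_and_count(predictions):
    """Naive prevalence estimate: fraction of items predicted positive."""
    return sum(predictions) / len(predictions)


def adjusted_count(predictions, tpr, fpr):
    """Correct the naive estimate with the classifier's TPR and FPR.

    Solves  observed = tpr * p + fpr * (1 - p)  for the true prevalence p.
    """
    if tpr == fpr:
        raise ValueError("tpr and fpr must differ")
    observed = classify_and_count(predictions)
    estimate = (observed - fpr) / (tpr - fpr)
    # Clip to the valid range for a proportion.
    return min(1.0, max(0.0, estimate))


# Hypothetical scenario: 1000 individuals, a classifier with TPR 0.8 and
# FPR 0.1 (estimated on labeled data) flags 310 of them as infected.
predictions = [1] * 310 + [0] * 690
naive = classify_and_count(predictions)
adjusted = adjusted_count(predictions, tpr=0.8, fpr=0.1)
print(naive, adjusted)
```

In this made-up scenario the naive count is 0.31, while the adjusted estimate is close to 0.3, the prevalence that would produce 310 flagged individuals under the stated error rates.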

The second diffusion-related topic is the search for a special niche among early adopters. Innovations are continuously launched on markets, such as new products on the retail market or new artists on the music scene. Some innovations become a success, others do not. Forecasting which innovations will succeed at the beginning of their lifecycle is hard. The thesis presented a data-driven study that accounts for the existence of a special niche among early adopters: individuals who consistently tend to adopt successful innovations before they reach success, called Hit-Savvy. Hit-Savvy can be discovered in very different markets and retain over time their ability to anticipate the success of innovations. This problem is strictly related to the task of influence maximization and to the research on word-of-mouth and viral marketing, that is, the problem of finding a small subset of nodes that could maximize the spread of influence. The experiments presented in the thesis, [...] contribution, it devised a predictive analytical process, exploiting Hit-Savvy as signals, which achieves high accuracy in the early-stage prediction of successful innovations, far beyond the reach of state-of-the-art time series forecasting models. This analysis was carried out in the mean-field context; the method was tested on two datasets without a network.
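The idea of adopters who tend to pick successful innovations early can be illustrated with a deliberately simplified score: for each adopter, the share of their early adoptions that later turned out to be hits. The actual procedure used in the thesis is not described in this abstract, so everything below (the function name, the early_cutoff and min_adoptions parameters, the toy adoption log) is a hypothetical sketch.

```python
from collections import defaultdict


def early_adopter_scores(adoptions, hits, early_cutoff, min_adoptions=2):
    """Score adopters by the share of their early adoptions that became hits.

    adoptions: iterable of (adopter, item, time_since_launch)
    hits: set of items known, in hindsight, to have succeeded
    early_cutoff: adoptions with time_since_launch <= early_cutoff are early
    min_adoptions: adopters with fewer early adoptions are left out
    """
    early_total = defaultdict(int)
    early_hits = defaultdict(int)
    for adopter, item, elapsed in adoptions:
        if elapsed <= early_cutoff:
            early_total[adopter] += 1
            if item in hits:
                early_hits[adopter] += 1
    return {
        adopter: early_hits[adopter] / total
        for adopter, total in early_total.items()
        if total >= min_adoptions
    }


# Hypothetical adoption log: (adopter, item, weeks since the item's launch).
adoptions = [
    ("ann", "x", 1), ("ann", "y", 2), ("ann", "z", 9),
    ("bob", "x", 8), ("bob", "w", 1), ("bob", "v", 2),
    ("cat", "y", 1),
]
hits = {"x", "y"}

scores = early_adopter_scores(adoptions, hits, early_cutoff=3)
print(scores)
```

Adopters with a consistently high score over many items would be candidates for the kind of early signal described above; a real analysis would also need to check that the score is stable over time and not a product of chance.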
