Personal information value from the user perspective: an empirical study on the students of Ca’ Foscari

Academic year: 2021

Master's Degree programme – Second Cycle (D.M. 270/2004)

in Economia e Gestione delle Aziende – International Management

Final Thesis

Personal information value from the user perspective: an empirical study on the students of Ca’ Foscari

Supervisor

Ch. Prof. Massimo Warglien

Graduand

Marco Byloos

Matriculation Number 842130

Academic Year

I would like to thank my supervisor for his wise advice during the writing of this document and for teaching the course on “Making decisions”, which sparked my imagination and inspired me to write my master's thesis on this topic. I would also like to thank the more than one thousand students of Ca’ Foscari University who took part in this survey: if this research has any merit, one thousandth of it belongs to each participant, and I want to share it with you!

Thank you to:

Marco B., Stela B., Luca P., Marco M., Myshi G., Ana T., Enrico C., Nicola B., Pierbarra, Omar R., Roberto P., Sam Haoliang H., Alessio B., Thuong N., Nicoló C., Alessia I., Lucrezia C., Benedetta B., Giulia Z., Michele S., Christian S., Riccardo P., Riccardo B., Alessia d. C., Virginia L., Alessandro M., Naldy M., Silvia, Elisa C., Stefania M., Silvia, Hicham M., Antonietta S., Giada A., Beatrice F., Sara F., Marianna, Anna B., Federico D., Siderisgravehood, Riccardo d. A., Elisa G., Eda F., Elisa M., Aleksandra G., Roberta G., Riccardo A., Alvise M., Veronica B., Eleonora d. R., Angela d. T., Thomas C., Quan P., Margherita R., Martina Z., Filippo L.P., Amos B., Silvia S., Riccardo C., Costanza S., Luna M., Shahpara H., Chiara B., Laura M., Pier Alberto M., Giovanni O., William T., Michael B., Anna P., Marangoni G., Bianca V., Tania M., Martina C., Ilaria B., Francesca d. F., Emanuele B., Olga P., Michela, Veronica, Valentina M., Chiara M., Chiara S., Gaia, Tani, Arianna B., Christian G., Linda P., Mattia B., Alberto F., Tommaso F., Giorgia F., Nicoló P., Matteo, Daniela M., Regina P., Sabrina d. P., Rossella Z., Clarissa P., Chiara, Giacomo C., Dawda J., Gianluca B., Sara Z., Elena d. M., Lorenzo C., Teresa Z., Federico B., Cecilia d. G., Paola R.B., Emma C., Maria, Diletta C., Silvia G., Sebastiano A., Francesca, Luo F., Mattia T., Beatrice G., Luce, Alessandra V., Beatrice Z., Elisa P., Carol, Afnan, Vera L., Kuraishinju, Daniele, Giuliana, Chiara R., Elisabetta Sofia P., Emilia R., Susan R., Martina B., Valentica C., Rita, Chiara M., Michelangelo S., Michelangelo M., Riccardo R., Francesca, Silvia C., Johnny Haze, Lucia T., Count of Oak, Spica, Francesco R., Federico P., Marta B., Laura B., Arianna G., Lavinia S., Alessandro M., Lucia d. N., Max C., Miri L., Alessia B., Arianna, Alice, Mattia G., Michele B.C., Annalaura G., Nicola F., Andrea R., Dario M.G., Martina S., Giacomo D., Diana B., Laura T., Laura F., Maria Teresa, Giorgia P., Serena V., BruceKetta, Firiel, Nicolo' G., LSP., Beatrice C., Emma R., Fundor333, EpicCiqui, Giulia B., Giorgia, Lorenzo C., Mohamed A., Ilaria T., Raffaella M., Matteo M., Francesco M., Elena C., Gabriele C., Federica R., Enrico, Nicolo' S., Eva T., Stefano R., Anna P., Marco S., Giada P., Chiara, Fabbrizio F., Sandy, Marta N., Giada R., Sabrina, Fatima, Giulia B., Francesca R., Enrico A., Elisa R., Filippo P., Ginevra, Stefania L., Chiara C., Luca N., Silvia, Paolo M., Matteo M., Giulia S., Ariana E., Eleonora B., Mariana G., Alessio P., Marta T., Mirko, Margherita N., Alessandro B., Margherita P., Veronica, lucyliu, Giada, Mattia, Merysolee, Kalpana M., Arianna C., Giulia, Riccardo d. R., Claudia, Amanda V., Giacomo, Sara, Stefano F., Yili, endi, Cristina P., Elena C., Gianmarco D., Adriana P., Priscilla, Fabio, NoraB, Giovanni P., Ludovico, Giacomo, Marta G., and all the other anonymous participants.

Abstract

The web offers users a multitude of resources and services free of charge, and most of the income needed to run those platforms derives from the users themselves: their time, their activities and the use of their personal information. While the success of the companies behind those free online services relies on a correct evaluation of the value of their users, the typical user seems to underestimate his or her bargaining power. This research, based on data gathered from over a thousand students of Ca’ Foscari University, aims to gauge how users estimate the value of their personal information online through two independent estimation variables, and to find the relations between the values given and the respondents' background of online experience.

Contents

Contents ... IV

List of the tables ... VII

List of the figures ... VII

List of the (ordinary) linear regressions ... XI

List of abbreviations ... XV

Introduction ... 1

1 Data as the currency of the 21st century ... 3

1.1 The dawn of free services online with the Web 2.0 ... 3

1.2 Once an amusing experiment, now the main hub of personal information ... 5

1.3 The value of personal information ... 8

2 Call to quantify the value of the personal information ... 11

2.1 The previous attempts to frame the phenomenon ... 11

2.2 The Ca’ Foscari University students as the target population ... 13

2.3 The roadmap of the test ... 13

2.3.1 The successful gathering of data and the 5 minute effortless survey mission ... 13

2.3.2 The interviewing method ... 14

2.3.3 The processing of the obtained data set ... 15

2.3.4 The search for meaningful correlations between the variables ... 15

3 The survey ... 17

3.1 Gauging the effects of time, how a long experience raises the awareness ... 17

3.1.1 Checking whether longer experiences are worth more than the new adventures ... 17

3.1.2 Measuring the dependency of the online activities with the intensities of use ... 18

3.2 Observing the effects of a wide social network offer ... 20

3.3 Monitoring the effects of different privacy management behaviors ... 21

3.3.1 Partitioning the identity revealers with the level of public participation ... 21

3.3.3 Assessing the tracking acceptance, knowledge and annoyance ... 23

3.4 Collecting the opinions about the free services revenue sources ... 24

3.5 Collecting the value perceived of the personal information ... 25

3.5.1 The maximum fee offer to retain the main social network functionality ... 25

3.5.2 A pay or leave option to check the estimations consistency ... 25

3.6 The layout of the survey and the opportunity for additional data ... 26

4 The survey results ... 27

4.1 The survey duration ... 28

4.2 A general overview ... 29

4.3 Focus on the acquaintance of the web ... 36

4.3.1 The Year of access, the indirect measure for the total permanence on the Web ... 36

4.3.2 The Year of 1st social, the birthday in the socializing side of the Web ... 37

4.3.3 The ratio between the years spent online with or without the social networks ... 37

4.3.4 The Hours spent surfing and socializing, the intensities of the online activities ... 39

4.3.5 The balance of the time-consuming activities online... 40

4.4 Focus on: The distribution of the social network profiles ... 41

4.4.1 The Socials used and the extraction of the Numbers of socials used ... 41

4.4.2 The social networks interoperability and ecosystems ... 43

4.5 Focus on: The basic online privacy habits ... 44

4.5.1 The Public participation and the choice to disclose the identity ... 44

4.5.2 The Terms of service cognition, assessing an implicit consensus ... 44

4.5.3 The Tracking acceptance between awareness and annoyance ... 45

4.5.4 Choice patterns between the TOS cognition and the Tracking acceptance ... 46

4.6 Focus on: The participants opinion about the free services income sources ... 48

4.6.1 Income opinions, the first place source is clear ... 48

4.6.2 A synthetic ranking for the remaining positions ... 49

4.7.2 The outcome of the Pay or leave option ... 52

4.7.3 The coherence of the two estimation variables values ... 53

4.8 The success of the optional fields... 57

4.9 Exit polls and visual analysis of the consistence of the results ... 58

5 The analysis of the survey results ... 69

5.1 Effects of the two online permanences on the estimation variables ... 70

5.2 Effects of intensity of use over the estimation variables ... 80

5.3 The effects of a wide social network offer over the estimation variables ... 88

5.3.1 Focus on the Socials used ... 88

5.3.2 Focus on the Number of socials ... 111

5.4 The effects of the Public participation over the estimation variables ... 119

5.5 The effects of cognition of the rules over the estimation variables ... 123

5.6 The effects of tracking acceptance over the estimation variables ... 127

5.7 The effects of the income opinions over the estimation variables ... 131

5.8 The effects of open sourcing over the estimation variables ... 135

5.9 Generalizing the estimation variables with the framing variables ... 147

Conclusions ... 154

Appendices ... 156

a) The survey ... 156

I) Introduction block ... 156

II) First block: Framing and first evaluation... 156

III) Second block: Exposition to real estimates and coherency check ... 157

IV) Third block: Extra data retrieval, further acknowledgement and thanksgiving ... 158

b) The landing page layout ... 160

References ... 162

List of the tables

Table 1 - Major Social networks in 2017 (in Italy), source: Statista ... 7

Table 2 - Births and assimilations of the social networks... 9

Table 3 - Capitalization per user and earning per user of the social networks ... 10

Table 4 - Distribution matrix of the combinations of TOS cognition and Tracking acceptance ... 46

Table 5 - Ranking of the income sources with a residual method ... 49

Table 7 - Ranking of the income with linear proportional weights ... 50

List of the figures

Figure 1 - Distribution of the survey duration among the participants ... 28

Figure 2 - Average, mode, median, max and min values for the Year of access and the Year of 1st social .... 29

Figure 3 - Average, mode and median of the Minutes spent surfing and socializing ... 30

Figure 4 - Distribution of the social networks used by the surveyed ... 31

Figure 5 - Distribution of the Number of socials concurrently used by the surveyed ... 31

Figure 6 - Distribution of the Public participation habits of the participants ... 32

Figure 7 – Distribution of the reading habits and the level of TOSs cognition among the surveyed ... 32

Figure 8 - Distribution of the Tracking acceptance towards the cookies and similar devices ... 32

Figure 9 – Distribution of the opinions about the main income source of the free services ... 33

Figure 10 - Average, mode, median, max and min surveyed threshold to keep the main social functionality ... 34

Figure 11 – Distribution of the Pay or leave option outcomes ... 34

Figure 12 - Distribution of the Extra data left by the surveyed ... 35

Figure 13 - Distribution on the Year of access to the web among the surveyed ... 36

Figure 14 - Distribution on the Year of the 1st social network experience ... 37

Figure 15 - Distribution of the total years spent online ... 37

Figure 16 - Distribution of the years spent online on the social networks... 38

Figure 17 - Distribution of the ratio between the years spent with socials over the total years spent online ... 38

Figure 18 - Distribution of the Hours spent surfing online ... 39

Figure 19 - Distribution of the Hours spent socializing on the social networks ... 39

Figure 20 - Distribution of the balance of the activities done online ... 40

Figure 21 - Distribution of the Social networks used by the surveyed divided in the four categories ... 42

Figure 22 - Distribution of the Number of social networks used by each student. ... 42

Figure 23 - Number of social accounts per category ... 43

Figure 26 – Distribution of the level of TOSs cognition and the reading habits among the surveyed ... 45

Figure 27 - Reactions towards the tracking nature of the modern Web 2.0 ... 46

Figure 28 - Distribution of the Tracking acceptance answers for each alternative of the TOS cognition .. 47

Figure 29 - Distribution of the TOS cognition answers for each answer of the Tracking acceptance ... 47

Figure 30 - Distribution of the main source of income opinions ... 48

Figure 31 - Distribution of the Income opinion rankings for the four alternatives ... 49

Figure 32 - Full spectrum of the maximum fee estimated by the surveyed. ... 51

Figure 33 - Distribution of the ranges of the maximum fee offered to retain the main social account ... 52

Figure 34 – Distribution of the outcome of the Pay or leave option ... 53

Figure 35 - Correlation between the Pay or leave option outcome and the Maximum fee estimated ... 54

Figure 36 - Correlation between the Pay or leave outcome and the Maximum fee ranges ... 55

Figure 37 - Correlation between the Maximum fee ranges and the Pay or leave outcome ... 55

Figure 38 – Distribution of the combinations of data of the Extra data variable ... 57

Figure 39 - Exit polls of the Year of access variable ... 58

Figure 40 - Visualization of the convergence for the Year of access variable ... 59

Figure 41 - Exit polls of the Year of the 1st social variable ... 59

Figure 42 - Visualization of the convergence for the Year of 1st social variable ... 60

Figure 43 - Exit polls of the Hours spent surfing variable ... 60

Figure 44 - Visualization of the convergence for the Hours spent surfing variable ... 60

Figure 45 - Exit polls of the Hours spent socializing variable ... 61

Figure 46 - Visualization of the convergence for the Hours spent socializing variable ... 61

Figure 47 - Exit polls of the Socials used variable ... 61

Figure 48 - Visualization of the convergence for the Socials used variable ... 62

Figure 49 - Exit polls of the Number of socials variable ... 62

Figure 50 - Visualization of the convergence for the Number of socials variable ... 63

Figure 51 - Exit polls of the Public participation variable ... 63

Figure 52 - Visualization of the convergence for the Public participation variable ... 63

Figure 53 - Exit polls of the TOS cognition variable ... 64

Figure 54 - Visualization of the convergence for the TOS cognition variable ... 64

Figure 55 - Exit polls of the Tracking acceptance variable ... 64

Figure 56 - Visualization of the convergence for the Tracking acceptance variable ... 65

Figure 57 - Exit polls of the Income opinion variable ... 65

Figure 58 - Visualization of the convergence for the Income opinion (Targeted advertising ranking) variable ... 65

Figure 60 - Visualization of the convergence for the Maximum fee estimation variable ... 66

Figure 61 - Exit polls of the Pay or leave option variable ... 66

Figure 62 - Visualization of the convergence for the Pay or leave option outcome variable ... 67

Figure 63 - Exit polls of the Extra data variable ... 67

Figure 64 - Visualization of the convergence for the Extra data variable ... 67

Figure 65 - Correlations between the Year of access online and the Maximum fee estimated ... 70

Figure 66 - Correlations between the Year of 1st social and the Maximum fee estimated ... 70

Figure 67 - Correlations between the Year of access online and the Maximum fee ranges ... 71

Figure 68 - Correlations between the Year of 1st social and the Maximum fee ranges ... 72

Figure 69 - Correlations between the Year of access online and the outcome of the Pay or leave option .... 73

Figure 70 - Correlations between the Year of 1st social and the outcome of the Pay or leave option ... 74

Figure 71 - Distribution of the Year of access influence over the Maximum fee estimations ... 75

Figure 72 - Distribution of the coefficient of the Year of access influence over the Pay or leave outcomes ... 75

Figure 73 - Distribution of the Year of access influence over the User average evaluations ... 76

Figure 74 - Distribution of the Year of 1st social influence over the Maximum fee estimations ... 77

Figure 75 - Distribution of the Year of 1st social influence over the Pay or leave outcomes ... 77

Figure 76 - Distribution of the Year of 1st social influence over the User average evaluations ... 78

Figure 77 - The influence of the Year of access variable on the two estimation variables ... 79

Figure 78 - The influence of the Year of 1st social variable on the two estimation variables ... 79

Figure 79 - Correlation between the Hours spent surfing and the Max fee estimated ... 80

Figure 80 - Correlation between the Hours spent socializing and the Max fee estimated ... 80

Figure 81 - Correlation between the Hours spent surfing and the Maximum fee ranges ... 81

Figure 82 - Correlations between the Hours spent socializing and the Maximum fee ranges ... 81

Figure 83 - Correlation between the Hours spent surfing and the Pay or leave outcome ... 82

Figure 84 - Correlation between the Hours spent socializing and the Pay or leave outcome ... 82

Figure 85 - Distribution of the Hours spent surfing influence over the Max fee estimations ... 83

Figure 86 - Distribution of the Hours spent surfing influence over the Pay or leave outcomes ... 83

Figure 87 - Distribution of the Hours spent surfing influence over the User average evaluations .... 84

Figure 88 - Distribution of the Hours spent socializing influence over the Max fee estimations ... 85

Figure 89 - Distribution of the Hours spent socializing influence over the Pay or leave outcomes .. 85

Figure 90 - Distribution of the Hours spent socializing influence over the User average evaluations ... 86

Figure 91 - The influence of the Hours spent surfing variable on the two estimation variables ... 87

Figure 92 - The influence of the Hours spent socializing variable on the two estimation variables ... 87

Figure 93 - Correlation between the Socials used and the Maximum fee estimations ... 88

Figure 94 - Correlation between the Socials used and the Maximum fee ranges ... 89

Figure 95 - Correlation between the Socials used and the Pay or leave outcome ... 90

Figure 96 - Distribution of the Socials used influence over the Max fee estimations ... 97

Figure 97 - Distribution of the Socials used influence over the Pay or leave outcomes ... 103

Figure 98 - Distribution of the Socials used influence over the User average evaluations ... 110

Figure 99 - The influence of the Socials used variable on the two estimation variables ... 110

Figure 100 - Correlation between the Number of socials used at the same time and the Max fee ... 111

Figure 101 - Correlation between the Number of “pure” socials used and the Max fee ... 111

Figure 102 - Correlation between the Number of socials and the Maximum fee ranges ... 112

Figure 103 - Correlation between the Number of “pure” socials and the Maximum fee ranges ... 112

Figure 104 - Correlation between the Number of socials and the Pay or leave outcome ... 113

Figure 105 - Correlation between the Number of “pure” socials and the Pay or leave outcome .... 113

Figure 106 - Analysis of the users with only one social and their Pay or leave outcome ... 114

Figure 107 - Distribution of the Number of socials influence over the Max fee estimations ... 115

Figure 108 - Distribution of the Number of “pure” socials influence over the Max fee estimations .... 115

Figure 109 - Distribution of the Number of socials influence over the Pay or leave outcomes ... 116

Figure 110 - Distribution of the Number of “pure” socials influence over the Pay or leave outcomes .. 116

Figure 111 - Distribution of the Number of socials influence over the User average evaluations .. 117

Figure 112 - Distribution of the Number of "pure" socials influence over the User average evaluations .. 117

Figure 113 - The influence of the Number of socials variable on the two estimation variables ... 118

Figure 114 - The influence of the Number of “pure” socials variable on the two estimation variables ... 118

Figure 115 - Correlation between the Public participation and the Maximum fee estimations ... 119

Figure 116 - Correlation between the Public participation and the Maximum fee ranges ... 120

Figure 117 - Correlation between the Public participation and the Pay or leave outcomes ... 120

Figure 118 - Distribution of the Public participation influence over the Max fee estimations ... 121

Figure 119 - Distribution of the Public participation influence over the Pay or leave outcomes .... 121

Figure 120 - Distribution of the Public participation influence over the User average evaluations ... 122

Figure 121 - The influence of the Public participation variable on the two estimation variables ... 122

Figure 122 - Correlation between the TOS cognition and the Maximum fee estimations ... 123

Figure 123 - Correlation between the TOS cognition and the Maximum fee ranges ... 124

Figure 124 - Correlation between the TOS cognition and the Pay or leave outcomes ... 124

Figure 125 - Distribution of the TOS cognition influence over the Max fee estimations ... 125

Figure 126 - Distribution of the TOS cognition influence over the Pay or leave outcomes ... 125

Figure 128 - The influence of the TOS cognition variable on the two estimation variables ... 126

Figure 129 - Correlation between the Tracking acceptance and the Maximum fee estimations ... 127

Figure 130 - Correlation between the Tracking acceptance and the Maximum fee ranges ... 128

Figure 131 - Correlation between the Tracking acceptance and the Pay or leave outcome ... 128

Figure 132 - Distribution of the Tracking acceptance influence over the Max fee estimations ... 129

Figure 133 - Distribution of the Tracking acceptance influence over the Pay or leave outcomes .. 129

Figure 134 - Distribution of the Tracking acceptance influence over the User average evaluations ... 130

Figure 135 - The influence of the Tracking acceptance variable on the two estimation variables .. 130

Figure 136 - Correlation between the Income opinion and the Maximum fee estimations ... 131

Figure 137 - Correlation between the Income opinion and the Maximum fee ranges ... 132

Figure 138 - Correlation between the Income opinion and the Pay or leave outcome ... 132

Figure 139 - Distribution of the Income opinion influence over the Max fee estimations ... 133

Figure 140 - Distribution of the Income opinion influence over the Pay or leave outcomes ... 133

Figure 141 - Distribution of the Income opinion influence over the User average evaluations ... 134

Figure 142 - The influence of the Income opinion variable on the two estimation variables ... 134

Figure 143 - Correlation between the extra data effects and the Maximum fee estimations ... 135

Figure 144 - Correlation between the different kind of references (name) left and the Maximum fee estimations ... 135

Figure 145 - Correlation between the different kind of reference (email) left and the Maximum fee estimations ... 136

Figure 146 - Correlation between the Extra data and the Max fee estimation ranges ... 137

Figure 147 - Correlation between the Extra data and the Pay or leave outcome ... 137

Figure 148 - Distribution of the Extra data influence over the Max fee estimations ... 140

Figure 149 - Distribution of the Extra data influence over the Pay or leave outcomes ... 142

Figure 150 - Distribution of the Extra data influence over the User average evaluations ... 144

Figure 151 - The influence of the Extra data variable on the two estimation variables ... 145

List of the (ordinary) linear regressions

Regression 1 - Correlation between the two independent estimation variables ... 56

Regression 2 - Correlation between the two independent estimation variables ... 56

Regression 3 - Regression of all the Year of access entries over the Maximum fee estimation ... 75

Regression 4 - Regression of all the Year of access entries over the Pay or leave option out ... 75

Regression 5 - Regression of all the Year of access entries over the User average evaluation. ... 76

Regression 6 - Regression of all the Year of 1st social entries over the Max fee estimations. ... 76

Regression 7 - Regression of all the Year of 1st social entries over the Pay or leave option out. .... 77

Regression 9 - Regression of all the Hours spent surfing entries over the Maximum fee estimations ... 83

Regression 10 - Regression of all the Hours spent surfing entries over the Pay or leave out. ... 83

Regression 11 - Regression of all the Hours spent surfing entries over the User average evaluation. ... 84

Regression 12 - Regression of all the Hours spent socializing entries over the Max fee estimation. ... 84

Regression 13 - Regression of all the Hours spent socializing entries over the Pay or leave out. ... 85

Regression 14 - Regression of all the Hours spent socializing entries over the User average eval. ... 86

Regression 15 - Regression of “I don’t use any social” answer over the Maximum fee estimations. .... 91

Regression 16 - Regression of “Facebook” answer over the Maximum fee estimations. ... 91

Regression 17 - Regression of “Fb Messenger” answer over the Maximum fee estimations. ... 91

Regression 18 - Regression of “WhatsApp” answer over the Maximum fee estimations. ... 92

Regression 19 - Regression of “Instagram” answer over the Maximum fee estimations. ... 92

Regression 20 - Regression of “Twitter” answer over the Maximum fee estimations. ... 92

Regression 21 - Regression of “Pinterest” answer over the Maximum fee estimations. ... 93

Regression 22 - Regression of “Tumblr” answer over the Maximum fee estimations. ... 93

Regression 23 - Regression of “reddit” answer over the Maximum fee estimations. ... 93

Regression 24 - Regression of “YouTube” answer over the Maximum fee estimations. ... 94

Regression 25 - Regression of “VKontakte” answer over the Maximum fee estimations. ... 94

Regression 26 - Regression of “StumbleUpon” answer over the Maximum fee estimations... 94

Regression 27 - Regression of “Google+” answer over the Maximum fee estimations. ... 95

Regression 28 - Regression of “LinkedIn” answer over the Maximum fee estimations. ... 95

Regression 29 - Regression of “Telegram” answer over the Maximum fee estimations. ... 95

Regression 30 - Regression of “Snapchat” answer over the Maximum fee estimations. ... 96

Regression 31 - Regression of “Skype” answer over the Maximum fee estimations. ... 96

Regression 32 - Regression of “Others” answer over the Maximum fee estimations. ... 96

Regression 33 - Regression of the “I don’t use any social” answer over the Pay or leave outcome. ... 97

Regression 34 - Regression of the “Facebook” answer over the Pay or leave outcome. ... 97

Regression 35 - Regression of the “FB Messenger” answer over the Pay or leave outcome. ... 98

Regression 36 - Regression of the “WhatsApp” answer over the Pay or leave outcome. ... 98

Regression 37 - Regression of the “Instagram” answer over the Pay or leave outcome. ... 98

Regression 38 - Regression of the “Twitter” answer over the Pay or leave outcome. ... 99

Regression 39 - Regression of the “Pinterest” answer over the Pay or leave outcome. ... 99

Regression 40 - Regression of the “Tumblr” answer over the Pay or leave outcome. ... 99

Regression 41 - Regression of the “reddit” answer over the Pay or leave outcome. ... 100

Regression 43 - Regression of the “VKontakte” answer over the Pay or leave outcome. ... 100

Regression 44 - Regression of the “StumbleUpon” answer over the Pay or leave outcome. ... 101

Regression 45 - Regression of the “Google+” answer over the Pay or leave outcome. ... 101

Regression 46 - Regression of the “LinkedIn” answer over the Pay or leave outcome. ... 101

Regression 47 - Regression of the “Telegram” answer over the Pay or leave outcome. ... 102

Regression 48 - Regression of the “Snapchat” answer over the Pay or leave outcome. ... 102

Regression 49 - Regression of the “Skype” answer over the Pay or leave outcome. ... 102

Regression 50 - Regression of the “Others” answer over the Pay or leave outcome. ... 103

Regression 51 - Regression of the “I don’t use any social” answer over the User average eval. .... 104

Regression 52 - Regression of the “Facebook” answer over the User average eval. ... 104

Regression 53 - Regression of the “FB Messenger” answer over the User average eval. ... 104

Regression 54 - Regression of the “WhatsApp” answer over the User average eval. ... 105

Regression 55 - Regression of the “Instagram” answer over the User average eval. ... 105

Regression 56 - Regression of the “Twitter” answer over the User average eval. ... 105

Regression 57 - Regression of the “Pinterest” answer over the User average eval. ... 106

Regression 58 - Regression of the “Tumblr” answer over the User average eval. ... 106

Regression 59 - Regression of the “reddit” answer over the User average eval. ... 106

Regression 60 - Regression of the “YouTube” answer over the User average eval. ... 107

Regression 61 - Regression of the “VKontakte” answer over the User average eval. ... 107

Regression 62 - Regression of the “StumbleUpon” answer over the User average eval. ... 107

Regression 63 - Regression of the “Google+” answer over the User average eval. ... 108

Regression 64 - Regression of the “LinkedIn” answer over the User average eval. ... 108

Regression 65 - Regression of the “Telegram” answer over the User average eval. ... 108

Regression 66 - Regression of the “Snapchat” answer over the User average eval. ... 109

Regression 67 - Regression of the “Skype” answer over the User average eval. ... 109

Regression 68 - Regression of the “Others” answer over the User average eval. ... 109

Regression 69 - Regression of all the Number of socials entries over the Max fee estimation. ... 114

Regression 70 - Regression of all the Number of “pure” socials entries over the Max fee estimation. . 114

Regression 71 - Regression of all the Number of socials entries over the Pay or leave out. ... 115

Regression 72 – Regression of all the Number of “pure” socials entries over the Pay or leave out. .. 115

Regression 73 - Regression of all the Number of socials entries over the User average evaluation. ... 116

Regression 74 - Regression of all the Number of “pure” socials entries over the User average eval. .. 117

Regression 75 - Regression of the Public participation over the Maximum fee estimations. ... 121

Regression 77 - Regression of the Public participation over the User average evaluations. ... 122

Regression 78 - Regression of the TOS cognition over the Maximum fee estimations. ... 125

Regression 79 - Regression of the TOS cognition over the Pay or leave outcome. ... 125

Regression 80 - Regression of the TOS cognition over the User average evaluations. ... 126

Regression 81 - Regression of the Tracking acceptance over the Maximum fee estimations. ... 129

Regression 82 - Regression of the Tracking acceptance over the Pay or leave outcome. ... 129

Regression 83 - Regression of the Tracking acceptance over the User average evaluations. ... 130

Regression 84 - Regression of the Income opinion over the Maximum fee estimations... 133

Regression 85 - Regression of the Income opinion over the Pay or leave outcome. ... 133

Regression 86 - Regression of the Income opinion over the User average evaluation. ... 134

Regression 87 - Regression of the “Left the name” event over the Max fee estimations. ... 138

Regression 88 - Regression of the “Left name” event categories over the Max fee estimations. .. 138

Regression 89 - Regression of the “Left the email” event over the Max fee estimations. ... 138

Regression 90 - Regression of the “Left email” event categories over the Max fee estimations. .. 139

Regression 91 - Regression of the “Left a comment” event over the Max fee estimations. ... 139

Regression 92 - Regression of the “Left blank” event over the Max fee estimations. ... 139

Regression 93 - Regression of the “Left the name” event over the Pay or leave outcome. ... 140

Regression 94 - Regression of the “Left the name” event categories over the Pay or leave out. ... 140

Regression 95 - Regression of the “Left the email” event over the Pay or leave outcome. ... 141

Regression 96 - Regression of the “Left the name” event categories over the Pay or leave out. ... 141

Regression 97 - Regression of the “Left a comment” event over the Pay or leave outcome. ... 141

Regression 98 - Regression of the “Left blank” event over the Pay or leave outcome. ... 142

Regression 99 - Regression of the “Left the name” event over the User average evaluation. ... 142

Regression 100 - Regression of the “Left the name” event categories over the User average eval. .... 143

Regression 101 - Regression of the “Left the email” event over the User average evaluation. ... 143

Regression 102 - Regression of the “Left the email” event categories over the User average eval.143 Regression 103 - Regression of the “Left a comment” event over the User average evaluation. ... 144

Regression 104 - Regression of the “Left blank” event over the User average evaluation. ... 144

Regression 105 - Regression of the significative framing variables over the Maximum fee estimations ... 147

Regression 106 - Regression of the (purified) significative framing variables over the Max fee estimations .. 148

Regression 107 - Regression of the significative framing variables over the Max fee estimations 149 Regression 108 - Regression of the (purified) significative framing variables over the Max fee estimations .. 150

Regression 109 - Regression of the significative framing variables over the Max fee estimations 151 Regression 110 - Regression of the (purified) significative framing variables over the Max fee estimations .. 152


List of abbreviations

FB Messenger for Facebook Messenger.

TOS(s) for Terms Of Service; (s) when referring to more than one service.

Avg. for Average.

Coeff.(s) and Corr. for Coefficient(s) and Correlations in the titles of the figures.

Est., out. and eval. for estimation, outcome and evaluation in the descriptions of the figures.

DV for Dependent Variable in the descriptions of the regressions.


Introduction

The vast majority of free online services are offered by private companies, which need a tangible economic return to keep renewing their offer to the consumer. The economic return of a free service is not a traditional one, since the user/customer is not asked to pay directly for the transaction; the revenue must therefore arise from collateral, implicit sources. Two of the most relevant sources are an embedded advertising system and the profitable management, and even the trade, of stacks of personal information coming from the users. Three additional, less relevant sources can be found in private donations, as in the case of the messaging platform Telegram; in crowdfunding solutions, as in the case of Wikipedia; and in freemium business models, where optional features are sold to premium users to cover the other costs, as has recently become widespread in the videogames industry. Other solutions may be effective in some peculiar situations, but the overwhelming tide of free services since the start of the Web 2.0 hints that there must be a common source.

While the funding mechanism and dynamics of advertising are well known and often disclosed to the public through tools like Google AdWords, the treatment of the collected data remains more mysterious. Data is the crucial resource of this century, as it permits extraordinary efficiency in sales and the creation of new, formerly unimaginable markets. Companies appear to be well aware of its value and strive to gain access to the greatest possible quantity of it.

This study investigates, from the user side, how much users’ personal data is worth, or is believed to be worth, when it is given in exchange for services or other forms of compensation. It starts with a brief introduction to the world of the Web 2.0, where data is the fulcrum of everything and the major players, namely the free service providers and especially the social networks, thrive on user-generated content. It then analyses the value of user data from the companies’ perspective and the previous studies on the users’ perception of that value, which call for further inspection. After that, the new aspects to take into account for a new survey on this topic are explained with detailed motivations. The last two chapters present the raw survey results as well as all the correlations between the user framing variables and the users’ estimations, in order to find which variable has the greatest impact on their value perception.


1 Data as the currency of the 21st century

The digital innovation tide of the last 10 years brought an overwhelming explosion of online services of every kind, and their evolution is far from decelerating, as we are only now becoming familiar with the first Artificial Intelligences that sooner or later will perform every iterative and computing-intensive task. The static World Wide Web of the ‘90s, after an initial incubation period of experiments and slow connections, reached with the advent of broadband and later the boom of the mobile world the critical mass necessary for the never-ending blooming of communities, services, platforms and digital companies of the new Web. This became the matrix that, to date, connects the lives of 3 885 567 619 users¹. The impossible and unthinkable of yesterday is now routine, and the resources needed in terms of time and dedication have been drastically reduced. The world has never been so connected, and people from every country and social class can interact, filling the cauldrons of ideas that push progress forward. One of the main drivers of this revolution of humankind seems to be the apparent gratuitousness of online resources, which allows a massive gathering of participants that would be impossible, or at least improbable, to achieve with a subscription model. The cost of running the global internet network alone sits between 100 and 200 billion dollars², hence the overall expenditure needed to keep the gargantuan machine of the Web functioning should overshadow the GDP of several nations. How can the humongous costs of this immense machine be reconciled with the necessity to offer services that need to be free in order to trigger the massive participation that fuels their success? The answer may reside far from direct money transactions and closer to the employment of the users’ time, activities and information. This chapter focuses on the main protagonists of the new web revolution, the social networks, and on their fuel, the data.

1.1 The dawn of free services online with the Web 2.0

The term “Web 2.0” appeared in 2004 to define the clear differences between the two mindsets adopted for the development of web applications. As described in the first Web 2.0 manifesto (O'Reilly, 2005), the new web was to be understood no longer as a network protocol but as a platform where satellite utilities gravitate around the core nodes of the network, like the giant ecosystem created by Google, where countless accessory services are continuously added on top of the main search engine platform. A complete and integrated user experience becomes the ultimate target of the IT companies.

¹ (Internet World Stats, 2017), referring to the last available report of June 30th 2017

² Estimation made by a Quora Engineer (Price, 2012)


Another striking point of the revolution is the exploitation of collective intelligence: part, if not most, of the content is generated by the people, and the continuous interlinking of resources has made directories obsolete. A clear example of the power of collective intelligence is Wikipedia, the most visited encyclopedia in the world, whose contents are all uploaded by its users. Another significant mention goes to Twitter and its intense use of tags, in particular hashtags, through which similar bits of information are easily and efficiently indexed and notified thanks to a simple action performed by the users while uploading their content.

Software has shifted its distribution from a product to a service, exchanging full ownership for licensed use and becoming a perpetual beta version, frequently updated with new features often inspired by the requests of the userbase. Offline applications blended with web applications, transforming the desktop experience into a webtop one, where the computation is done on the server and the content data is saved in the cloud rather than on a local disk. Browsers extended their functions until they nearly became substitutes for operating systems, like the famous Google Chrome browser, also available as Chrome OS on Chromebooks; physical devices are now more terminal clients than computing centers, and the recent mobile frenzy has only exacerbated the phenomenon. The convergent trend of the software also influenced the hardware and how it was to be managed. The most open programming languages and standards prevailed over the closed ones, favoring interoperability and, later, the rise of the Internet of Things.

While in the previous Web model the data was the content carefully uploaded by the webmasters, in the new one everything that is digitalized can, should and will become data, and the Web 2.0 companies have their fulcrum in data management. The data generated by the users’ online activity, along with their uploaded content, is stored in the cloud and, although it remains somehow available to its owner, he/she can easily lose control over the destiny of his/her tracking data. E-commerce businesses, for example, keep track of the value of their products through their customers’ transactions and feedback. The multimedia content sharing platforms keep track of the popularity of each piece of content and of the visitors’ comments. The social networks keep track of the users’ activities and relations. Finally, the search engines keep track of all the public information generated on all the other websites. Moreover, the advertising business model permits an indirect remuneration based on the views and on the “clicks” of a banner embedded into the host website. Crossing the preferences in the user profiles with the advertising content generates “targeted advertising”, which is more likely to interest the visitors and can greatly improve the companies’ revenues. Since the users’ preferences are solid fuel boosters for an effective advertising campaign, the conditions exist for the creation of new markets for user profiling. Data has become so important that in many situations it has become a currency with its own economy. Indeed, aside from the obvious e-commerce case, the majority of the most visited websites offer services without a direct cash return.

The success of the big data companies demonstrates that the data-based economy unlocks new possibilities for the companies as well as new benefits for the users, especially when there are simple needs to be fulfilled that would be too expensive to serve “ad hoc” for a single person but can easily be mass-scaled. When the number of participants is large enough, data can be efficiently used instead of money. Indeed, data is the currency of the 21st century.

1.2 Once an amusing experiment, now the main hub of personal information

Among the revolutionary web applications and services that bloomed in the last two decades, the social networks are the best suited to collect data on and profile their users, since they are the crossroads of the online surfing experience and can obtain access to the users’ most sensitive information as a byproduct of their activity on the affiliated websites and services.

Everything started in 1997 with a website based on the concept of the six degrees of separation. SixDegrees.com allowed users to list friends, family members and acquaintances on the site and to exchange messages or post on bulletin boards with the people inside their first, second and third degree. The site reached up to 3.5 million members³ and laid the foundations that in less than a decade evolved into the current social network model.

Years later other networks started to appear, less focused on the socializing aspect but still based on the crowdsourcing opportunities of the web. In 1999 the platform Blogger permitted the creation of personal websites and their easy management through posts. In 2001 Wikipedia started its quest to become the world’s richest and most up-to-date source of general reference, run only by the activity of its users, who have to collaborate and interact in order to check the facts and maintain the quality standards of the website. The same year saw the rise of StumbleUpon, another network of users where the “stumblers” tagged, reviewed and ranked the online resources they visited.

The new millennium released the power of the networks to the productivity sectors. The year 2002 gave birth to the RSS feed format, which allowed a fast and effective automatic exchange of updates between web resources. LinkedIn, the employment-oriented social network, was launched just the year after, along with Skype, the peer-to-peer client that made the Voice-over-IP standard widespread and initiated the migration from voice-call, SMS and MMS mobile services to more web-based all-in-one platforms.

Social networking was then introduced to the masses. While between 2003 and 2004 Second Life and games like World of Warcraft created massively populated virtual worlds, Myspace and later Facebook, based on the real world rather than on a virtual one, forged the modern stereotype of the social network, where users create personal profiles that are de facto personal homepages. The personal profile pages and updates could be made visible to anyone or only to selected contacts, and fostered the creation of virtual public or private relationships and exchanges of information between groups with the same interests.

The entertainment sector was socially refreshed starting in 2005, when Google Video and YouTube appeared and paved the way for the mass distribution of streaming content. Meanwhile Reddit reinvented the forums and began aggregating user stories, thoughts and questions in which every user could participate. In 2006 Twitter allowed the world to rapidly exchange bits of information such as news, updates, moods and opinions. The following year the concepts of blog and social network were further bridged by Tumblr, where the blogroll becomes the user profile page.

In the same years, in the Eastern European world, VKontakte revamped and re-proposed the Facebook format with more attention to the privacy of the users, but failed to become globally widespread. The decisive breakthrough came in 2007, when Apple released the first canonical smartphone, the iPhone, which set the basis for an active and continuous online presence of the users. The mobile world led by Apple, Google with Android and, to a lesser extent, Microsoft with its recently abandoned Windows phones greatly extended the userbase of the social networks and made the traffic generated explode. From this point on, the social networks are no longer a gimmick for people used to dealing with technology but an indispensable commodity for the average person.

Thanks to the constant online presence allowed by the use of smartphones, a new wave of more advanced web services invaded the market. Among them, the most relevant to mention are Pinterest and Instagram in 2010, the former being a social network where users tag and make collections of external website content and the latter being a photographer-oriented picture sharing social


and Telegram in 2013, an encrypted private messaging platform born as a grassroots reaction to the shocking disclosures of the NSA spying activities by E. Snowden.

The formation of new social networks is still under way and many other platforms are arising; however, the ones eligible to be taken into account for this study are summarized in Table 1.

| Social Network | Typology | Costs | Platforms (in bold the debut platform) | Debut |
|---|---|---|---|---|
| WhatsApp | Advanced Instant Messaging | Free* | Web, iOS, Android, Windows Phone, BlackBerry OS, Symbian | 2009 |
| Facebook | Social network based on profile, relationships and posts | Free | Web, iOS, Android, Windows Phone | 2004 |
| YouTube | Multimedia content sharing - Videos | Free** | Web, iOS, Android, Windows Phone, BlackBerry OS | 2005 |
| Instagram | Multimedia content sharing - Pictures and clips | Free | Web, iOS, Android, Windows Phone | 2010 |
| Fb Messenger | Advanced Instant Messaging | Free | Web, iOS, Android, Windows Phone, BlackBerry OS, Tizen | 2011 |
| Skype | Advanced Instant Messaging | Free** | Web, iOS, Android, Windows Phone, BlackBerry OS, PalmOS, Symbian, Windows, macOS, UNIX/Linux, PSP, Xbox | 2003 |
| Telegram | Advanced Instant Messaging | Free | Web, iOS, Android | 2013 |
| LinkedIn | Employment oriented and profile based social network | Free | Web, iOS, Android, Windows Phone | 2003 |
| Twitter | Public information sharing platform | Free | Web, iOS, Android, Windows Phone | 2006 |
| Snapchat | Advanced Instant Messaging | Free | iOS, Android | 2011 |
| Google+ | Social network based on profile, relationships and posts | Free | Web, iOS, Android, Windows Phone | 2011 |
| Pinterest | Multimedia content sharing - Collections of content | Free | Web, iOS, Android, Windows Phone | 2009 |
| Tumblr | Social network based on posts and reposts | Free | Web, iOS, Android, Windows Phone, BlackBerry OS | 2007 |
| reddit | Social news aggregation, web content rating, and discussion social network | Free | Web, iOS, Android, Windows Phone, BlackBerry OS | 2005 |
| VKontakte | Social network based on profile, relationships and posts | Free | Web, iOS, Android, Windows Phone, BlackBerry OS | 2006 |
| StumbleUpon | Public information sharing platform | Free | Browsers toolbar, iOS, Android, Windows Phone | 2001 |


1.3 The value of personal information

Every day, about 230 billion e-mails are sent, over 660 million tweets are submitted on Twitter, more than 6 billion videos are watched on YouTube, nearly 70 million photos are uploaded on Instagram, 110 billion blog posts appear on Tumblr, at least 230 million calls are made through Skype and 5 billion searches are made on Google⁴.

There is a thriving market for personal information, and users provide their personal information abundantly and typically for free. People generously supply opinions, identity, age, addresses, preferences, phone numbers, and more. People also deliberately ‘over-disclose’ without much thought about the potential monetary value or benefits of their digital traces (Sören Preibusch, 2012). This suggests that people undervalue and carelessly share their personal information; indeed, about half of the 1 059 Facebook users interviewed in (Sarah Spiekermann, 2012), when asked how much they would pay to save their entire Facebook profile from deletion, replied that they would not pay at all.

However, according to the data gathered by (Fujitsu, 2010) on a global scale, 88% of people worry about who has access to their data, 86% state that they have recently become more security conscious about their data, 83% are concerned when they hear that their data may be stored overseas, and 80% think that governments should regulate the market and impose high penalties on companies that do not use data responsibly. This apparent contradiction can be explained by a distinction between the sense of privacy, which concerns the disclosure of secret information only, and the sense of ownership of all personal information. The context and the framing of the questions may also affect people’s reactions and their answers.

Companies, on the other hand, seem to know and to evaluate the monetary value of the information they deal with, and often the success of a platform is proportional to the number of active, content-generating users and to their traffic. That is why Google+, which may have reached 3 billion registered accounts globally, is one of the least successful social networks to date: it is publicly considered a dead wasteland, with estimates of only a few million monthly active users, an impression supported by the company’s refusal to disclose the real numbers.


There is also a tendency towards the conglomeration of complementary companies into ecosystems, as seen in Table 2, as in the case of Facebook, which bought the more specialized WhatsApp and Instagram to complete the offer around its profile-based social network. Companies can collect and use the information for profit, like Google, which built up a stock market valuation of about 650 billion US dollars (as of August 2017) on customized advertising, or like Facebook, which manages billions of profiles that, compared with its market capitalization, should be valued at around 140$ each⁵. The other social networks are shown in Table 3. A new social network must break through the market with a revolutionary idea or an innovative approach to become popular and survive in the long term, but once its model has proven valuable the incumbents either try to clone its functionalities or try to acquire the emerging company directly. This is the case of Google, which failed to launch its proprietary Google Video platform and acquired YouTube a year later, and of Microsoft, which bought the more popular and flexible Skype despite already having its own Windows Live Messenger.

| Social Network | Debut | Acquisition | Founder | Owner | Bought for | Monthly active users |
|---|---|---|---|---|---|---|
| WhatsApp | 2009 | 2014 | WhatsApp Inc. | Facebook Inc. | 19 bln $ | 1 200 mln |
| Facebook | 2004 | - | Facebook Inc. | Facebook Inc. | - | 2 047 mln |
| YouTube | 2005 | 2006 | YouTube Inc. | Alphabet Inc. | n.a. | 1 500 mln |
| Instagram | 2010 | 2012 | Burbn, Inc. | Facebook Inc. | 1 bln $ | 700 mln |
| Fb Messenger | 2011 | - | Facebook Inc. | Facebook Inc. | - | 1 200 mln |
| Skype | 2003 | 2011 | Skype Ltd. | Microsoft Corp. | 8.75 bln $ | 300 mln |
| Telegram | 2013 | - | Telegram LLC | Telegram LLC | - | 100 mln |
| LinkedIn | 2003 | 2016 | LinkedIn Corp. | Microsoft Corp. | 26.4 bln $ | 106 mln |
| Twitter | 2006 | - | Twitter Inc. | Twitter Inc. | - | 328 mln |
| Snapchat | 2011 | - | Snap Inc. | Snap Inc. | - | 255 mln |
| Google+ | 2011 | - | Alphabet Inc. | Alphabet Inc. | - | *540 mln |
| Pinterest | 2009 | - | Pinterest Inc. | Pinterest Inc. | - | 175 mln |
| Tumblr | 2007 | 2013 | Tumblr Inc. | Oath Inc. | 1.1 bln $ | 357 mln |
| reddit | 2005 | 2006 | reddit | Condé Nast | n.a. | 169 mln |
| VKontakte | 2006 | 2009 | VKontakte Ltd | Mail.Ru group | n.a. | 81 mln |

*peak value in 2013, then Google stopped disclosing the active user number

Table 2 - Births and assimilations of the social networks

⁵ Source: statista.com


| Social Network | Market capitalization⁶ [Acquisition price] | Earnings⁷ | Active users⁸ (when acquired) | Capitalization per user (when acquired) | Earnings per user |
|---|---|---|---|---|---|
| WhatsApp | [19 bln $ (2014)] | part of Facebook | 1 200 mln (450 mln) | (42$ / user) | n.a. |
| Facebook | 486 bln $⁹ | (10.6 bln $) | 2 047 mln | 237$⁹ / user | (5.2$ / user) |
| YouTube | [1.65 bln $ (2006)] (90 bln $ now)¹⁰ | part of Alphabet | 1 500 mln (50 mln) | 60$¹⁰ / user (33$ / user) | n.a. |
| Instagram | [1 bln $ (2012)] | part of Facebook (10.6 bln $) | 700 mln (30 mln) | (33$ / user) | n.a. |
| Fb Messenger | - | part of Facebook | 1 200 mln | part of Facebook | - |
| Skype | [8.75 bln $ (2011)] | part of Microsoft | 300 mln (170 mln) | (51$ / user) | n.a. |
| Telegram | not public | not public | 100 mln | n.a. | n.a. |
| LinkedIn | [26.4 bln $ (2016)] | part of Microsoft | 106 mln | (61$ / user) | n.a. |
| Twitter | 11.6 bln $ | -0.46 bln $ | 328 mln | 35.4$ / user | -1.4$ / user |
| Snapchat | 16.2 bln $ | -0.51 bln $ | 255 mln | 63.6$ / user | -2$ / user |
| Google+ | 634 bln $⁹ | part of Alphabet (19.5 bln $) | 540 mln | 1174$⁹ / user | (36$ / user) |
| Pinterest | 12.3 bln $¹¹ | not public | 175 mln | 70.3$¹¹ / user | n.a. |
| Tumblr | [1.1 bln $ (2013)] | part of Oath/Yahoo | 357 mln (260 mln) | (4.2$ / user) | n.a. |
| reddit | [20 mln $ (2006)] | part of Condé Nast | 169 mln | n.a. | n.a. |
| VKontakte | [112 mln $¹² (2009)] | part of Mail.Ru | 81 mln (60 mln) | (1.9$¹² / user) | n.a. |
| StumbleUpon | [75 mln $ (2007)] | not public | 30 mln (2.5 mln) | (30$ / user) | n.a. |

Table 3 - Capitalization per user and earnings per user of the social networks

⁶ Source: MSN Money on 11th August, companies listed on NASDAQ and NYSE

⁷ Source: MSN Money on 11th August, companies listed on NASDAQ and NYSE

⁸ Source: Statista. Some entries are outdated by up to one year due to lack of public data (e.g. Telegram had 100 mln users in February 2016, growing by 350 thousand per day)

⁹ Data referring to the entire group
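The per-user figures of Table 3 appear to follow from a simple ratio: the market capitalization (or the acquisition price) divided by the number of active users, and likewise for the earnings. The following is a minimal sketch of that arithmetic for a few rows of the table. The function name `per_user` and the dictionary layout are illustrative assumptions and are not part of the original analysis; the input figures are copied from Table 3.

```python
# Illustrative sketch: recomputes some per-user ratios of Table 3.
# Input figures are copied from the table (US dollars); all names are hypothetical.

def per_user(total_usd: float, users: float) -> float:
    """Return the amount in dollars attributable to each user."""
    if users <= 0:
        raise ValueError("the number of users must be positive")
    return total_usd / users

BLN = 1e9
MLN = 1e6

# name: (market capitalization, earnings, monthly active users)
listed = {
    "Facebook": (486 * BLN, 10.6 * BLN, 2047 * MLN),
    "Twitter": (11.6 * BLN, -0.46 * BLN, 328 * MLN),
    "Snapchat": (16.2 * BLN, -0.51 * BLN, 255 * MLN),
}

# name: (acquisition price, active users at the time of the acquisition)
acquired = {
    "WhatsApp": (19 * BLN, 450 * MLN),
    "Instagram": (1 * BLN, 30 * MLN),
    "Skype": (8.75 * BLN, 170 * MLN),
}

cap_per_user = {n: per_user(cap, u) for n, (cap, _, u) in listed.items()}
earn_per_user = {n: per_user(e, u) for n, (_, e, u) in listed.items()}
price_per_user = {n: per_user(p, u) for n, (p, u) in acquired.items()}

for name in listed:
    print(f"{name}: {cap_per_user[name]:.1f} $/user, "
          f"earnings {earn_per_user[name]:.1f} $/user")
for name, value in price_per_user.items():
    print(f"{name}: {value:.1f} $/user when acquired")
```

With these inputs the ratios match the table after rounding, with the exception of Snapchat, where the ratio comes to about 63.5$ against the 63.6$ reported, presumably because the table was computed from less rounded inputs.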


2 Call to quantify the value of the personal information

While the companies evidently find value in the users’ personal information they collect, since they run their services for free, it is more difficult to assess how far this is from the users’ own perception of the value of their personal information online, because the pieces of information dropped online are given without any monetary exchange. This study aims to take the analysis of the perceived value of users’ personal information a step further than what has been done so far.

2.1 The previous attempts to frame the phenomenon

The quest to measure, or at least to frame within one order of magnitude, the value that users attribute to the personal information they drop online has been taken up by several studies with heterogeneous approaches. The great obstacle in retrieving estimations of this value is the objective absence of a defined cost, as the services are exempt from monetary fees. When people are asked directly how much they value their personal information, the response can vary from a careless negative answer to amounts that they would never see in their entire life. This happens because of the framing effect, which can temporarily alter their perception. The quantification of the real value they perceive should therefore be made through implicit means.

The amount of personal information dropped online should be intrinsically connected with a more prolonged and intense use of the web, especially of the social networks, and an efficient way to assess the value users attribute to their data is to gauge how much they are willing to pay, or even to sacrifice, in order to remain connected. The most basic approach is to identify the users’ willingness to pay for their internet connection, as attempted in (Gregory Rosston, 2010); however, the desire for faster or more reliable connections may be caused not only by the importance of their personal data but also by the use of bandwidth-intensive or low-latency applications, such as the streaming of high-quality videos or multiplayer videogames. Another approach built on an indirect measurement is the study conducted in (Wallsten, 2013), which tries to analyze how many minutes of general-purpose activities people sacrifice to stay online. There are contextualization issues with this approach, since it requires a reliable and complete set of monetary conversion rates for every substituted activity. In addition, people may have traded off the old activities for their online surrogates, invalidating the assumption that they are sacrificing some activities to remain online longer. This last problem was avoided in (Erik Brynjolfsson, 2012), in which the value of the free online activities was quantified by the surplus that these activities brought to the consumer. This time the major issue is the mass of different activities done online by every user, which prevents an easy and flawless translation into estimations. Moreover, not all the online activities that grant a surplus involve the use of sensitive personal data, e.g. the use of a search engine, an online videogame match, or the downloading of multimedia content.

Another approach to probe the value perception of personal information is to interview people with a survey; however, the data gathered in this way can be influenced by multiple factors. First, the context, especially in the presence of a possible surge of privacy concern, can alter the perceptions of the interviewees and may artificially raise their perception of the value of their personal information, as seen in (Luc Wathieu, 2005). Secondly, the nature of a particular piece of personal information might strongly affect the respondents’ ability to evaluate it, especially when that data, compared to the locally accepted parameters, is felt to be not normal. Thirdly, the study on the evaluation of privacy in (Bernardo A. Huberman, 2005) hints at the existence of a steep gradient, where “normal” or positively abnormal pieces of information are deliberately shared mostly for free, while undesirable pieces of information are kept secret or shared only if the benefits far exceed the shame or the danger of losing control over them. Finally, the simple framing of the question, or a preliminary priming with privacy concerns, can affect the contextual evaluation of personal information, as seen in (Alessandro Acquisti, 2012) and (Leslie K. John, 2011).

Even when adequate measures are taken to counter the influence of the undesired external factors seen above, the interview should not directly ask users for the value of their personal information; it should rather aim at a very closely related variable, to prevent an internal negotiation in the respondent’s mind and to avoid triggering a conservative sense of ownership. In (Sarah Spiekermann, Psychology of Ownership and Asset Defense: Why People Value Their Personal Information Beyond Privacy, 2012) the 1 059 participants have to imagine a scenario in which their Facebook account is going to be closed because M. Zuckerberg has changed his mind, and they have to make a realistic monetary offer to retain all the personal information contained in their profile. The amount offered to save all the personal information from the imminent black-out of the service can easily be translated into their maximum willingness to pay for the personal information they currently have online; their offer is therefore not merely a proxy but a strongly correlated indirect measurement of the perceived value. Even though this method suffers heavily from the users’ subjectivity in evaluating their private assets, this can be mitigated with the use of a large sample, and it seems to be the most accurate way to frame the average perceived value.


2.2 The Ca’ Foscari University students as the target population

The ideal population for the study should be well defined, with similar characteristics of age, occupation, education, etc., to prevent the influence of variables not taken into consideration. This population has been identified in the students of the Venetian university Ca’ Foscari, where the average age of the interviewees should be more than 18 and less than 25 years old¹³, the levels of education should be comparable and, in general, the characteristics and backgrounds should be sufficiently homogeneous, since their main occupation is that of university student. The results obtained from their survey should then be taken as representative of the young population of North-eastern Italy, and not as absolute values valid at a global level that can be generalized without adequate corrections and precautions. For further research, the method used in this study could be replicated in other universities or in other contexts to confirm the general tendencies in terms of personal information evaluation and eventually to frame the differences with locally or culturally diverse populations.

2.3 The roadmap of the test

The test of the users’ evaluation of their personal information consisted of gathering the data from the chosen population, aggregating and re-elaborating it to extract implicit variables and, finally, correlating the variables with each other to discover the ones with the greatest impact.

2.3.1 The successful gathering of data and the 5 minute effortless survey mission

The gathering of data was done in four major steps, all aimed at delivering an effortless and inspiring experience to the respondents and at extracting the most information with only ten questions. The first consisted in the creation and refinement of an elaborate survey layout, to allow the highest degree of data extraction with the lowest number of questions, with the goal of keeping the entire experience within a 5-minute time span. The short completion time yielded a high number of valid responses out of the total entries and caused minimal annoyance to the respondents¹⁴. The tripartite structure of the survey kept the ratio of unfinished surveys low: an initial profiling of the respondent; a subsequent breakthrough disclosure of the data on the companies’ value per user, with an attached pay or leave option to check the consistency of their evaluation; and a final optional thanksgiving and engagement section. Finally, the careful selection and framing of the questions allowed the gathering of indirect and deducible data, such as the number of social networks concurrently used, shortening the survey from the initial sketch to the final version officially employed.

¹³ The average age of graduation in 2015 was 25.1 years, an average shared by the bachelor’s and the master’s degree students, who in total should spend at least 3 years before graduating for the bachelor’s degree and 3+2 for the master’s degree (University Ca' Foscari of Venice, 2016)

The second milestone consisted of building an online landing page to bring the whole project to a more open level, where the respondents could find additional information about this research, check the anonymous aggregate data of all the participants, updated every one hundred entries collected, and find the contacts for sending additional suggestions even after completing the survey. This dedicated website, whose layout is available in the appendices of this document, also presented a share button, a graphical indicator of the number of surveys collected and a constantly updated list of the participants who explicitly left their name to be thanked in this research, to further improve their engagement. Almost 25% of the participants decided to leave their name or pseudonym, and a similar share left their email address to receive a copy of the research.

The third milestone was a closed beta test of the survey and of the landing page with some selected students, to check their reliability and to look for errors or other misunderstandings before opening the test to all the other students of Ca’ Foscari. Although the beta test successfully cleared the survey of most of the issues, some minor adjustments were made in the later stages¹⁵.

The fourth and final step consisted of distributing the survey and gathering 1,389 total entries, of which 1,016 (about 73%) were eligible for use in the statistics of this research. The students were contacted during June and July 2017, and the survey remained open until the end of September 2017, giving them at least two months to complete it.

2.3.2 The interviewing method

The participants received a link to the online survey platform, where they could answer the questions electronically from desktop or mobile devices. All the explicit questions were mandatory, and no respondent abandoned the survey before answering them: the lowest progress recorded was 64%, i.e. the full survey without the non-mandatory data field and before the final thank-you section.

15 Several dozen respondents entered hours (e.g. “3 hours”) instead of minutes (e.g. “180 minutes”) in the second

2.3.3 The processing of the obtained data set

After the gathering, all the data was cleaned of outliers and of void, invalid and misspelled entries, e.g. points and commas misused as decimal separators, estimates entered as sentences instead of numerical values, and entries like “2 hours” in the field that required minutes.
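The cleaning rules described above can be illustrated with a minimal Python sketch. It is not the procedure actually used in this research: the function name `parse_minutes`, the list of hour markers and the choice to treat both points and commas as decimal separators are assumptions made for the example, and outlier removal is left out.

```python
import re
from typing import Optional

# First number in the entry, with either "." or "," as decimal separator (assumption).
_NUMBER = re.compile(r"\d+(?:[.,]\d+)?")
# Hypothetical markers showing that the respondent typed hours instead of minutes.
_HOURS = re.compile(r"(?<![a-z])(hours?|hrs?|h)(?![a-z])")


def parse_minutes(raw: str) -> Optional[float]:
    """Return the entry as a number of minutes, or None if no number is found."""
    text = raw.strip().lower()
    match = _NUMBER.search(text)
    if match is None:
        return None  # void entry, or an estimate written only in words
    value = float(match.group().replace(",", "."))
    if _HOURS.search(text):
        value *= 60  # e.g. "2 hours" typed into a field that expected minutes
    return value
```

Entries for which the function returns `None` would still need to be reviewed or discarded by hand.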

The data was then processed for the qualitative analyses. Where the answers consisted of unbounded numerical values, all the entries were re-categorized into regularly spaced ranges. For example, in the evaluation of the maximum fee to retain the functionality of the main social network, all the entries were assigned to the 6 ranges of “0-10”, “10-30”, “30-50”, “50-70”, “70-90” and “90+” euros.
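As an illustration of this re-categorization, the following Python sketch assigns a fee to one of the six ranges. The text does not state to which range a boundary value belongs, so the sketch assumes that each range includes its lower bound and excludes its upper bound (a fee of exactly 10 euros falls in “10-30”) and that fees are non-negative; the names `EDGES`, `LABELS` and `fee_range` are hypothetical.

```python
from bisect import bisect_right

# Upper edges of the first five ranges, in euros; "90+" is open-ended.
EDGES = [10, 30, 50, 70, 90]
LABELS = ["0-10", "10-30", "30-50", "50-70", "70-90", "90+"]


def fee_range(euros: float) -> str:
    """Return the label of the range containing a non-negative fee."""
    # bisect_right counts how many edges are <= euros, which is the
    # index of the range under the lower-bound-inclusive assumption.
    return LABELS[bisect_right(EDGES, euros)]
```

Counting the labels returned for all the entries, for instance with `collections.Counter`, gives the frequency of each range.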

In addition, the data was aggregated so that it could be displayed visually in the chapter on the survey results.

2.3.4 The search for meaningful correlations between the variables

Finally, the individual variables collected by the survey were analyzed in pairs in search of correlations. In particular, every framing variable, such as the time spent online or the privacy management habits, was compared with the estimation variables and with the average of the two, to discover which question topics affect people's perception and how strongly. The significant framing variables were then collected to check which ones best described the estimation variables, with a generalized formula for every estimation variable.
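The pairwise comparison can be sketched in Python as follows. This section does not name the correlation measure, so Pearson's coefficient is assumed here purely for illustration; the function names and the dictionary layout (one list of values per framing variable, aligned by respondent) are also hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("samples must have the same length, at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant sample has no defined correlation")
    return cov / sqrt(var_x * var_y)


def rank_framing_variables(
    framing: Dict[str, Sequence[float]], estimate: Sequence[float]
) -> List[Tuple[str, float]]:
    """Sort the framing variables by the strength of their correlation."""
    scores = {name: pearson_r(values, estimate) for name, values in framing.items()}
    # Strongest absolute correlation first, whatever its sign.
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
```

The ranking only shows which framing variables move together with an estimation variable; on its own it does not establish whether a correlation is statistically significant.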
