CLUSTERING COEFFICIENT 6-19 - N ETWORK ANALYSIS

N ETWORK ANALYSIS

6.3. CLUSTERING COEFFICIENT 6-19

To compute network transitivity, we need to count the number of triangles, which is equal to k. With n_Λ(v_i) =1 and n_Λ(x) =n_Λ(y) = (^k+1₂ ) = ¹₂k(k+1), we find that

τ(G) = ^k

2·0.5·k(k+1) +k = ¹ k+2 and thus

k→∞limτ(G) =₀

We can extend the notion of transitivity to weighted graphs following an approach suggested by Opsahl and Panzarasa [2009]. In this case, we need to assign a weight to triples and triangles, after which we compute the transitivity of a graph by considering the ratio of the cumulative weights on the triangles and that of the triples. Let us start with defining precisely what the weight of a triple or triangle is.

Definition 6.16:Let G be a simple, undirected weighted graph and consider vertex v ∈ V(G). If H is a triple or a triangle at v where edges e₁and e₂ are incident with v, then the triple weight w_Λ(H)and triangle weight w_∆(H), respectively is equal to the average of the weights of e1and e2, i.e.,

w_Λ(H)^def= ¹₂ ^w(e1) +w(e2) and w_∆(H)^def= ¹₂ ^w(e1) +w(e2) In principle, the triple of triangle weight can also be defined as, for exam-ple, max{w(_e₁)_{, w}(_e₂)}, but we shall not consider such details here. Using these definitions, we can then define the transitivity of a weighted graph as follows.

Definition 6.17:Let G be a simple, undirected weighted graph with H_∆its set of triangles, and H_Λits set of triples. The network transitivity τ(G)is defined as

τ(G)^def=

H∈H∑_∆ w_∆(H)

H∈H∑_Λ w_Λ(H)

Note that this definition is identical to that of transitivity in an unweighted graph when setting weights equal to 1.

Finally, Opsahl and Panzarasa [2009] extend their definition of transitiv-ity to directed graphs, be they weighted or not. In this case, they simply use the same definition of weights for triples and triangles, respectively, but restrict the enumeration of these subgraphs to so-called nonvacuous triples and transitive triangles:

Definition 6.18:Consider a (strict) directed graph D. Let H be a triple at v, with its neighbors u and w in H. H is a nonvacuous triple if eitherh−→u, vi,h−→v, wi ∈ A(H)orh−→w, vi,h−→v, ui ∈A(H). If H was a triangle at v, then H is transitive if A(H) = {h−→_{u, v}i,h−→v, wi,h−−→u, wi}or A(H) = {h−→_{w, v}i,h−→v, ui,h−−→w, ui}.

In other words, H as a triple is nonvacuous if there exists either a(u, w) -path via v or a(w, u)-path via v, and H as a triangle is transitive if w can be reached from u both through an arch−−→u, wiand a path in H via v, or u can be reached from w through an arch−−→w, uiand a directed path through v. Figure 6.6 shows all possible (non)vacuous triples and (non)transitive triangles.

Non-vacuous Non-vacuous

Vacuous Vacuous

Transitive Transitive

Non-transitive Non-transitive

Figure 6.6: (Non)vacuous triples and (non)transitive triangles at (the marked) ver-tex v.

We will often use the clustering coefficient or network transitivity to compare different random graphs. Both metrics are used in practice, yet computing network transitivity for large graphs can be somewhat ineffi-cient provided special measures are taken. We will not go into details here, but will return to various examples when discussing concrete examples of random graphs throughout the remaining chapters.

6.4 Centrality

Another important metric for network analysis is deciding on whether there are any vertices “more important” than others. The importance of a vertex

6.4. CENTRALITY 6-21

is, of course, dependent on what a graph is actually modeling. For exam-ple, when dealing with networks representing relationships between peo-ple, a vertex with a high degree may characterize an influential person. In a communication network, however, the importance of a vertex may be de-termined by the number of shortest paths of which it is member, for in that case it may be an indication of its workload regarding processing and for-warding messages.

In network analysis, this concept of importance is referred to as central-ity[Kotschutzki et al., 2005]. Perhaps one of the simplest notions of central-ity is identifying the center of a graph. It is formed by those vertices whose eccentricity is equal to the radius of a graph:

Definition 6.19:Consider a (strongly) connected graph G. The center C(G) of a graph G is the set of vertices with minimal eccentricity, i.e., C(G)^def= {^v ∈ V(G)|e(v) =rad(G)}.

Intuitively, a vertex is at the center of a graph when it is at minimal distance from all other vertices. Using the eccentricity of a vertex u, we can then define its centrality as:

Definition 6.20:Consider a (strongly) connected graph G. The (eccentricity based) vertex centrality cE(u)of a vertex u∈V(G)is defined as 1/e(u).

All vertices in the center of a graph have maximal centrality, whereas indeed all vertices at the “edges” of a graph have very low centrality. Returning to graph G_simplefrom Figure 6.4, we find that the center consists only of vertex 7. With some computational effort, it can be shown that graph GA_complex from Figure 6.1 has no less than 43 vertices in its center, whereas GBcomplex

has only two vertices in the center.

Eccentricity can be used for determining whether certain functions in a network have been optimally placed. For example, when deciding on placing certain buildings in a city, we may want to take into account that those buildings should be conveniently reached, such as fire stations. In effect, the decision is to place certain functionality within a specific range of all nodes.

Eccentricity measures the maximum distance to any other node in a net-work. In some cases, it is more important to know how close a node is to all other nodes. This means that we need to take into account all the distances from one node to the others. In that case, we simply take the total distance of one node to every node into account:

Definition 6.21:Consider a (strongly) connected graph G. The closeness c_C(u)of a vertex u∈V(G)is defined as cC(u)^def= ^1/ ∑v∈V(G)d(u, v).

Returning to our example, it is clear that a fire station should be close to any arbitrarily chosen node. In that case, we want to optimize on the trav-eling distance when a fire breaks out. However, matters become different in the case of services that need to be accessed simultaneously from different nodes, such as with hospitals, a town hall, shopping centers, and so forth.

This is where closeness comes into play. In those cases, we want to place a service conveniently close to as many nodes as possible, which is clearly a different criterion than minimizing the maximum distance that needs to be traveled.

For G_simplewe find the following values for the closeness of its vertices.

Although vertex 7 forms the center of G_simple, it is not the vertex closest to all others, which is vertex 1.

Vertex: 1 2 3 4 5 6 7

∑ d(u,·) 21 22 27 32 24 37 29

cC(u): 0.048 0.045 0.037 0.031 0.042 0.027 0.034 Note that comparing closeness between vertices of different graphs may not be very useful. For example, when considering unweighted graphs, we see that the closeness of a vertex decreases as the graph consists of more vertices. For this reason, comparing the closeness of vertices is useful only relative to a given graph.

Vertex centrality and closeness are both related to the reachability of a vertex, and as such may indeed indicate the importance of a vertex. How-ever, we have also seen another type of important vertices, namely cut ver-tices, whose removal actually partitions a graph. One can argue that such vertices form the center of a graph. Based on this observation, notably re-searchers in the social sciences have introduced what is referred to as be-tweenness. The basic idea is simple: if a vertex lies on many shortest paths connecting two other vertices, it is an important vertex. The reasoning is that the removal of such a vertex will directly influence the cost of the con-nectivity between other vertices, as other (i.e., longer) shortest paths will have to be followed. Formally, we have:

Definition 6.22:Let G be a simple, (strongly) connected graph. Let S(x, y)be the set of shortest paths between two vertices x, y ∈ V(G), and S(x, u, y) ⊆ S(x, y) the ones that pass through vertex u∈V(G). The betweenness centrality c_B(u) of vertex u is defined as

cB(u) =

∑

x6=y

|S(x, u, y)|

|S(x, y)|

Note that because G is (strongly) connected, |S(x, y)| > 0 for all pairs of distinct vertices x and y.

6.4. CENTRALITY 6-23

In the following chapters we will apply these and other metrics to spe-cific types of graphs. As we’ll see, more metrics can be defined to differ-entiate and characterize graphs, but many of these metrics are more easily explained and motivated given the specific context in which graphs and networks are used to model real-world situations.

C

HAPTER

7

Nel documento Graph Theory and Complex Networks (pagine 157-163)