• Non ci sono risultati.

Subscription-Driven Self-Organization in Content-Based Publish/Subscribe 2. 1. 4. 3.

N/A
N/A
Protected

Academic year: 2021

Condividi "Subscription-Driven Self-Organization in Content-Based Publish/Subscribe 2. 1. 4. 3."

Copied!
1
0
0

Testo completo

(1)

M I D L A B M i d d l e w a r e L a b o r a t o r y

M I D L A B

M i d d l e w a r e L a b o r a t o r y

http://www.dis.uniroma1.it/~midlab

Università degli studi di Roma La Sapienza

R. Baldoni R. Beraldi L. Querzoni A. Virgillito

Subscription-Driven Self-Organization in Content-Based Publish/Subscribe

e-mail: {baldoni, beraldi, querzoni, virgi}@dis.uniroma1.it

Background

In content-based publish/subscribe systems subscriptions and events are defined over a predetermined event schema, composed by a set of n attributes. The event schema can be graphically represented as an n-dimensional space where events are embodied by points and subscriptions by subspaces; an event matches a subscription if the corresponding point falls inside the subspace defined by the subscription.

A distributed event notification service is composed by a set of processes, namely event brokers, interconnected through transport-level links which form an acyclic topology. Clients can issue subscriptions and publish events. Events are diffused through the system from publisher to interested subscribers using routing tables whose content is based on previously issued subscriptions.

e s

2:publish(e)

P1 S1

1:subscribe(s)

3:notify(e) B2

B1 B3

The problem

If a broker is not directly interested in the event it acts as a simple router: we call such broker a “pure forwarder”. A pure forwarder introduces one useless TCP hop on the event diffusion path and requires two useless event matching operations.

B7

notify(e) notify(e) publish(e)

B3 B2

B10

B9 B1

B8

B4

B5 P1

S2 S1

Pure

forwarders

Idea

notify(e) notify(e) publish(e)

B3 B2

B10

B9 B1

B8

B4

B7

B5 P1

S2 S1

Pure

forwarders

The basic idea of our approach is to rearrange the topology of the brokers' network in order to directly connect brokers hosting similar subscriptions without affecting network-level metrics. This network rearrangement should lead to a reduction of the average number of “pure forwarders” experienced during event diffusion.

The algorithm

Main goal of the algorithm is to put close brokers hosting similar subscriptions inside the application-level network. To measure the similarity between two subscriptions we use an heuristic , namely associativity, based on the geometrical intersection of the respective zones (hyper-rectangles shapes) occupied in the event space. Each broker B tries to rearrange the network in order to obtain an increment of its associativity AS(B) while not decreasing the global associativity of the system AS. Self-organization takes place only if it leads to an increase of AS: increasing the associativity of the whole pub/sub system, intuitively means increasing the probability that brokers sharing common interests get close to each other. The algorithm can be split in four phases: triggering, tear-up link discovery, teardown link selection, and reconfiguration which includes the content-based routing tables update.

Tear-Up Link Discovery: a discovery request message DREQ is sent along the link. The message is forwarded by a broker if there is the possibility that a higher associativity can be obtained with a broker behind one or more of its links. This phase returns to B (via DREP messages) the address of the broker that can assure the highest associativity.

B3 B2

B10

B9 B1

B8

B4

B7

B5 DREQ

DREQ DREQ DREQ

DREQ

DREQ DREP

DREP

DREP DREP

DREP

DREP

2.

Triggering: the algorithm is triggered by a broker B when it detects the possibility of increasing its associativity. Specifically this can happen when B suspects that behind one of its links there could be a broker B' which can increase its associativity AS(B). This can be done by the broker simply looking at its content- based routing table.

subscribe(s2) B3

B2

B10

B9 B1

B8

B4

B7

B5

S2

s1 s2 s1

1.

Reconfiguration: during this last phase the two brokers B and B' are connected through the creation of a new link, while the link chosen in the previous phase is teared down. After this topology change, data structures at brokers along the path have to be updated. In order to avoid network partitioning, due to concurrent self-organizations, the path connecting B and B' remains locked during all this phase.

B3 B2

B10

B9 B1

B8

B4

B7

B5 ROUTEUPDATE

ROUTEUPDATE

ROUTEUPDATE ROUTEUPDATE

ROUTEUPDATE

4.

Tear-Down Link Selection: the aim of this phase is to select the link that has to be teared down during the reconfiguration phase. This link is selected among those constituting the path between B and B'. Specifically the link connecting brokers with the lowest associativity is chosen. If the associativity of the new link that has to be created between B and B' is higher than the associativity of the chosen link the algorithm passes to the last phase.

AS(B10,B3) = 0.7

AS(B3,B9) = 0,8

AS(B8,B10) = 0,4

AS(B5,B7) = 0,3 AS(B4,B8) = 0,5

AS(B7,B4) = 0,1

B3 B2

B10

B9 B1

B8

B4

B7

B5

AS(B9,B5) = 0,6

3.

Results

A prototype of the system has been implemented and exploited for a simulation study of the self-organization algorithm, which confirms that, when limiting the scope of neighbor discovery to a reasonable value (in terms of network latency), the self organization algorithm achieves significant improvement on the TCP-hop metric with respect to a pure content-based routing algorithm, without affecting notifications' average network latency. Moreover, the overhead introduced by the self-organization algorithm can be amortized, with respect to the usage of the content-based routing algorithm alone, only if there are at least, on the average, more than 20 notifications for each subscription (this is actually a daylife working condition for a pub/sub system).

15 20 25 30 35 40

10 20 30 40 50 60 70 80 90 100

pub/sub ratio

TCP hops

Plain CBR SOCBR with d=100ms SOCBR with d=∞

0,0 0,5 1,0 1,5 2,0 2,5 3,0 3,5 4,0 4,5 5,0

10 20 30 40 50 60 70 80 90 100

pub/sub ratio

Average latency x pub (sec.)

Plain CBR SOCBR with d=100ms SOCBR d=∞

For further details on the algorithm and the simulation study please refer to:

R. Baldoni, R. Beraldi, L. Querzoni, and A. Virgillito

“Subscription-driven self-organization in content-based publish/subscribe”

Technical report, downloadable from http://www.dis.uniroma1.it/midlab/docs/bbqv04techrep.pdf Dipartimento di Informatica e Sistemistica, Mar 2004.

Riferimenti

Documenti correlati

 The hdfs command can be executed in a Linux shell to read/write/modify/delete the content of the distributed file system..  The parameters/arguments of hdfs command are used

 The input pairs are grouped in partitions based on the integer value returned by a function applied on the key of each input pair.  This operation can be useful to improve

 The stationary distribution at a node is related to the amount/proportion of time a random walker spends visiting that node..  It is when the distribution does not change

Results are organized as follows: first we evaluate the application-level routing performance (number of TCP hops per event notification) and the associativity metric of the system

To improve per- formance in these settings, we propose the use of a buffer- ing mechanism: each node accumulates notifications for a given amount of time and sends

7 Another topic-based system with a similar approach is Bayeux [9], built over the overlay network infrastructure named Tapestry [56].. Rebecas topology is given by an acyclic graph

This component is used by broker B i to manage brokers that want to leave the network and maintain the network healthiness through algorithms that handle broker crash failures..

The overlay infrastructure represents the organization of the various en- tities that compose the system, (e.g., overlay network of dedicated servers, peer-to- peer structured