Brief Announcement: Distributed
Self-Organizing Event Space Partitioning for Content-Based Publish/Subscribe Systems ?
Roberto Beraldi, Adriano Cerocchi, Fabio Papale, and Leonardo Querzoni University of Rome “La Sapienza”, Rome, Italy
[beraldi,cerocchi,querzoni]@dis.uniroma1.it
Context and motivations. Publish/subscribe systems have commonly been divided in two large families on the basis of their event-selection model [2]: topic- based and content-based systems. The former trade reduced subscription expres- siveness with simpler implementations and higher performance. Conversely, the latter allow to accurately map published data in a complex event schema on top of which expressive subscriptions can be defined, but incur the cost of more complex implementations that delivers reduced performance on large distributed settings. System developers are thus faced with a choice about which kind of sys- tem is best suited to the target application. A common solution to this dilemma lies in the event space partitioning [4] technique: the event schema is partitioned in a number of subspaces that are then statically mapped to topics. The parti- tioning must be globally known and subscribers are expected to subscribe those topics where subspaces that have a non-empty intersection with their content- based subscriptions have been mapped. Undesired events (false positives) can be filtered out at the receiver side. The event space partitioning granularity strongly affects the performance of such systems: if it is excessively coarse-grained too much resources are wasted to deliver false positives, while if it is too fine-grained the number of topics that will be generated, and that must be managed by the topic based system, could easily become huge. Current solutions [3] provide sub- optimal approximations that are calculated offline and then statically applied to the system.
Contribution. We propose a self-organizing algorithm that builds and dynam- ically adapts at run-time an event space partitioning that eventually provides subscribers with a desired level of performance (i.e. percentage of false posi- tives below a specified threshold) while striving to limit the number of topics that must be managed. The novelty of our solution lies in its ability to work on the basis of run-time performance indices measured by subscribers. Each sub- scriber monitors the ratio between the false positives (an event notified to the subscriber that does not match any of its content-based subscriptions) and the total number of events received for each topic it is subscribed to. If this ratio raises above a predefined global threshold T
F Pthe subscriber starts a procedure for partitioning the subspace mapped to that topic. At the end of this proce- dure it updates its topic subscriptions and starts again monitoring performance
?
This work was partially supported by the BLEND and SOFIA European projects
and by the DOTS-LCCI Italian project.
S1 S2
X Y
/
/X1 /X2 /X3
/X1/Y1 /X1/Y2 /X1/Y3
/X1/Y1/X1 /X1/Y1/X2 /X1/Y1/X3
/X3/Y1 /X3/Y2 /X3/Y3
/X2 /X3/Y3/X3/Y2/X3/Y1 /X1/Y3/X1/Y2/X1/Y1/X1 /X1/Y1/X2 /X1/Y1/X3
e