Event-Based Data Dissemination on Inter-Administrative Domains: Is It Viable?

Roberto Baldoni, Leonardo Querzoni, Sirio Scipioni

Dipartimento di Informatica e Sistemistica “A. Ruberti”

Sapienza Università di Roma

{baldoni,querzoni,scipioni}@dis.uniroma1.it

Abstract

Middleware for timely and reliable data dissemination is a fundamental building block of the Event-Driven Architecture (EDA), which is the ideal platform for developing a large class of sense-and-react applications such as air traffic control, defense systems, etc. Many of these middleware products are compliant with the Data Distribution Service (DDS) specification, and they have traditionally been designed to be deployed in strictly controlled and managed environments, where they show predictable behavior and performance. However, the enterprise setting is far from being managed, as it can be characterized by geographic inter-domain scale and heterogeneous resources.

In this paper we present a study aimed at assessing the strengths and weaknesses of a commercial DDS implementation deployed in an unmanaged setting. Our experimental campaign shows that, if the application manages a small number of homogeneous resources, this middleware can perform timely and reliably as long as there is no event fragmentation at the network transport level. In a more general setting, with fragmentation and heterogeneous resources, reliability and timeliness rapidly degenerate, pointing out a clear need for research in the field of self-configuring, scalable event dissemination with QoS guarantees in unmanaged settings.

1 Introduction

It is a common belief that sense-and-react applications (including air traffic control, defense systems, fraud detection, etc.) cannot be built without using Event-Driven Architecture (EDA) technologies [3]. This is due to the necessity of sub-second responses to changing conditions in the environment, and such responses can only be achieved if the sensing portion of the application has data pushed to it rather than polling the environment [4]. Timeliness and reliability in data distribution are essential to guarantee the correctness and safety of an application in such domains. If the system fails to deliver data on time, instability may arise, possibly resulting in threats to either infrastructures or human lives.


Historically, most of the standards for data distribution middleware, e.g. the CORBA Event [9] and Notification [10] services, the Java Message Service (JMS) [7], etc., as well as most proprietary solutions, lacked the support necessary for real-time, mission- and safety-critical systems. The main drawbacks of these solutions are related to limited (or non-existent) support for Quality of Service (QoS) and to the lack of architectural properties needed to promote dependability and survivability. Recently, in order to fill this gap, the Object Management Group (OMG) proposed the Data Distribution Service (DDS) [6] specification. This standard gathers the experience of proprietary real-time pub/sub middleware solutions independently engineered and evolved within the industrial process control and defense systems application domains. The resulting standard is based on a completely decentralized architecture and provides an extremely rich set of configurable QoS properties. Current DDS implementations are designed for managed and controlled environments, where they deliver predictable performance characterized by high throughput, low latency and loss rates close to zero even under stress conditions, as remarked by several studies [8, 12]. These studies compare, in LAN scenarios, various DDS implementations and other publish/subscribe systems such as OpenSplice, OpenDDS, TAO DDS, and RTI DDS.

However, the enterprise setting is radically different from a managed one: it is often characterized by an inter-administrative (i.e. unmanaged) geographic scale, shared network channels and heterogeneous resources, with the consequent unpredictability of, for example, end-to-end latency and load.

In this paper we present a summary of the results of an extensive campaign of experiments whose aim was to assess the strengths and weaknesses of current DDS implementations in such an unmanaged environment. During the campaign we tested three DDS implementations deployed on PlanetLab in national and continental scenarios. All of them showed similar behaviors and shortcomings; this is why, in the next sections, we report for the sake of brevity only the results obtained with the DDS middleware by Real-Time Innovations Inc. [1], as it showed the best performance coupled with the most stable behavior. These middleware products can perform timely and reliably in an unmanaged setting only if there is no event fragmentation at the DDS level and if resources are few and homogeneous. In a more general unmanaged setting, reliability and timeliness of event dissemination rapidly degenerate. Lessons learned from the campaign point out shortcomings in these middleware systems that translate into research challenges. These challenges include: the need to design self-tuning mechanisms within the middleware to cope with a constantly changing setting (e.g., end-to-end latency, load on a single node), the design of scalable algorithms for efficient event routing also in the presence of event fragmentation, and adaptive mechanisms to compensate for possible diversity of resources in terms of computational power. Answering these questions is mandatory to enlarge as much as possible the region in which event dissemination performs timely and reliably in an unmanaged setting.

The paper is organized as follows: Section 2 introduces the testbed we realized, and Section 3 shows and discusses the results; finally, Section 4 concludes the paper, summarizing the lessons we learned from this practical experience.

2 Testbed

The testbed we developed¹ for this study is based on PlanetLab [11], an open overlay network composed of more than 800 nodes located at 400 sites. PlanetLab allowed us to deploy a testbed where congestion, latency and other network parameters are similar to those of a real environment.

We defined two different scenarios for our tests: national and European. They are composed, respectively, of 1 publisher and 2 subscribers, and of 1 publisher and 20 subscribers. Another important difference between these scenarios is the geographic location of nodes: in the national one all nodes are located in Italy (in particular we used nodes in Rome, Milan and Naples), while in the European scenario nodes have been selected from several European countries (e.g. four nodes in France, two in Germany, etc.). In both scenarios we tested a simple application where a single publisher distributes data to various subscribers embodied by the remaining nodes.

All the results reported in the next sections were obtained using an application written in C++ that exploits the RTI DDS libraries. These libraries enable publish-subscribe communication and several services, such as flow control and reliable communication. In our tests, we configured several QoS policies and transport protocol parameters in order to optimize the performance of RTI DDS in the considered scenarios. Here we provide only a basic list of the most important QoS policies and parameter settings used for our tests. The first, and we think the most important, QoS policy is Reliability, which specifies what kind of communication paradigm we want to use in our test: reliable or best-effort. Other relevant QoS policies are related to reliable communication; for example History.kind controls the behavior of the Service when the value of an instance changes before it is finally communicated to some of its existing DataReader (the receiving component on a subscriber node) entities; we set this policy to KEEP_ALL to maintain all values. A last fundamental parameter is the heartbeat period, which controls the time interval between two subsequent sends of heartbeat messages. These messages announce to the DataReader that it should have received all events up to the current one and can also be used by the DataWriter (the sending component on a publisher node) to request the DataReader to send back an acknowledgement. In our tests this parameter is set to 1 second. Other configuration parameters are related to the transport layer.

¹ The code needed to run the tests, together with a document including all configuration details, is available at the following URL: http://www.dis.uniroma1.it/~midlab/dds_testbed.zip

RTI DDS provides a plugin mechanism that can be used to select different kinds of transport primitives: shared memory, unicast, multicast, etc. Due to the impossibility of changing the network infrastructure (the Internet) of our testbed, we were forced to use UNICAST IPv4 based on UDP/IP. A fundamental transport layer parameter is the fragment size of UDP packets: the maximum allowed fragment size determines the number of UDP messages into which an event is fragmented. We will show in the next section that this parameter has a critical impact on loss rate and latency in reliable communications.
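As an illustration, the following sketch shows how the QoS policies and transport parameters just described (Reliability, History, heartbeat period, maximum message size) can be expressed with the classic RTI DDS C++ API. The identifier names follow our reading of the RTI documentation and may differ across product versions; the snippet is therefore a minimal sketch of this kind of configuration, not the exact code we deployed.

```cpp
// Illustrative sketch of the QoS configuration discussed above (classic RTI DDS
// C++ API). Identifier names are assumptions and may vary across versions.
#include "ndds/ndds_cpp.h"

void configure_writer_qos(DDSPublisher* publisher, DDS_DataWriterQos& writer_qos) {
    publisher->get_default_datawriter_qos(writer_qos);

    // Reliability: RELIABLE or BEST_EFFORT communication paradigm.
    writer_qos.reliability.kind = DDS_RELIABLE_RELIABILITY_QOS;

    // History: KEEP_ALL, so no value is dropped before being delivered.
    writer_qos.history.kind = DDS_KEEP_ALL_HISTORY_QOS;

    // Heartbeat period (1 second in our tests): how often the DataWriter
    // announces the latest sample and solicits acknowledgements.
    writer_qos.protocol.rtps_reliable_writer.heartbeat_period.sec = 1;
    writer_qos.protocol.rtps_reliable_writer.heartbeat_period.nanosec = 0;
}

void configure_transport(DDS_DomainParticipantQos& participant_qos) {
    // Maximum UDP payload per message: events larger than this value are
    // fragmented by the middleware (property name assumed from RTI docs).
    DDSPropertyQosPolicyHelper::add_property(
        participant_qos.property,
        "dds.transport.UDPv4.builtin.parent.message_size_max", "65536",
        DDS_BOOLEAN_FALSE);
}
```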

During our tests we measured both the data distribution latency (the time needed for an event to travel from the publisher to the subscriber, i.e. half the Round Trip Time) and the maximum throughput (i.e. the number of events received per second) allowed by the testbed. During tests conducted with the best-effort policy we also kept track of the loss rate (i.e. the number of events lost). Every point plotted in the graphs reports the average value measured over 20 independent runs of our test application, each executed with a time interval of 12 hours from the preceding one. For latency tests we collected 1000 samples per run, produced at a rate of 1 per second. For throughput tests we executed, in each run, 10 independent event production bursts, each with a duration of 30 seconds, separated by a time interval of 30 seconds. During each burst the application tries to send as many events as possible and measures the number of events sent by the publisher and received by the subscribers.
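To make the latency methodology concrete, the sketch below shows how a single sample can be derived from a request/reply exchange under the half-RTT approximation stated above; send_event and wait_for_echo are hypothetical placeholders standing in for our test application logic, not functions of the RTI API.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helpers standing in for the test application logic.
bool send_event(const void* payload, std::size_t size);
bool wait_for_echo();

// One latency sample: half of the measured round-trip time, in milliseconds.
// Each run collects 1000 such samples at one per second; each plotted point
// averages 20 independent runs.
double sample_one_way_latency_ms(const void* payload, std::size_t size) {
    const auto start = std::chrono::steady_clock::now();
    send_event(payload, size);   // publisher -> subscriber
    wait_for_echo();             // subscriber echoes the event back
    const std::chrono::duration<double, std::milli> rtt =
        std::chrono::steady_clock::now() - start;
    return rtt.count() / 2.0;    // one-way latency is approximated as RTT / 2
}
```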

3 Experiments

3.1 Fragment Size vs. Loss Rate

First, we want to focus on the loss rate and show the impact it has on the latency of reliable communications. We analyzed the behaviour of RTI DDS using different fragment sizes. This parameter defines the maximum size of the UDP packets generated and sent by the middleware; therefore, it also defines the number of packets into which each event will be fragmented.

Figure 1 reports the RTI DDS behaviour with best-effort communication in the national scenario, varying both fragment size and event size. The first interesting outcome of this test is that the number of events lost during each run is quite small (under 4%) as long as we consider small event sizes. When events become bigger (right half of the graph), reducing the fragment size has a drastic effect on the loss rate. In this case, the middleware must retransmit each event several times in order to guarantee reliable communication, therefore introducing huge latencies.

[Figure 1. National scenario: impact of fragment size on event loss rate. Loss rate (%) vs. event size (B), for fragment sizes of 1KB, 4KB, 8KB, 16KB and 64KB.]

[Figure 2. National scenario: impact of fragment size on latency. Latency (ms) vs. event size (B), for fragment sizes of 8KB and 64KB.]

Figure 2 reports the latency for the same experiments using reliable communication. This figure shows that the loss rate has an impact on latency in reliable communications: latency grows rapidly up to an unacceptable level when the loss rate is high. Fragmentation is a vulnerable point of this kind of middleware, because fragmenting an event increases the probability of a loss.
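This effect can be made explicit with a simple back-of-the-envelope model. Assuming, purely for illustration, that each UDP fragment is lost independently with probability $p$, an event of size $S$ sent with fragment size $F$ is split into $\lceil S/F \rceil$ fragments and is delivered in best-effort mode only if all of them arrive:

\[
P_{\mathrm{loss}}(\text{event}) \;=\; 1 - (1 - p)^{\lceil S/F \rceil}.
\]

For example, with $p = 0.01$ an event fitting in a single fragment is lost with probability about 0.01, while an event split into 128 fragments is lost with probability $1 - 0.99^{128} \approx 0.72$; the symbols $p$, $S$ and $F$ are introduced here only for this estimate.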

3.2 Best-Effort/Reliable communication vs. Latency

Table 1 reports the results obtained in a LAN-based setting. The numbers clearly show how both best-effort and reliable communications experience similar latencies, with a small standard deviation (one order of magnitude below the latency), when the middleware is deployed on a controlled, high-performance setting. The conditions at the basis of these results are hardly met in an unmanaged WAN setting, where latency and standard deviation often have the same order of magnitude.

Latency            RTT (µsec)   StdDev (µsec)
Best-Effort 128B   148          12
Best-Effort 16KB   610          13
Reliable    128B   152          15
Reliable    16KB   640          18

Throughput         msg/sec      MBytes/sec
Best-Effort 128B   67904.51     8.289
Best-Effort 16KB   7415.12      115.861
Reliable    128B   64918.30     7.192
Reliable    16KB   7315.12      114.29875

Table 1. Summary of LAN results

[Figure 3. National scenario: latency and its standard deviation vs. event size.]

Figure 3 reports both the latency and its standard deviation when the middleware is deployed in the national scenario. Interestingly, the latencies experienced for both best-effort and reliable communication remain low and stable for a wide range of event sizes (up to 16KBytes), with a standard deviation of the same order of magnitude.

This result is quite important, as it shows how RTI DDS can, in principle, be fruitfully employed in unmanaged, shared scenarios, since it delivers consistent performance (see the detail in Figure 3).

This encouraging result is however limited to small events. Starting from events of 32KBytes or larger, the experienced latency and loss rate start to grow rapidly. This phenomenon is evident in the right part of Figure 3 where, for events of 64KBytes and 128KBytes, the latency for reliable communication rapidly becomes unacceptable. In fact, we experienced that, in our scenarios, the probability of a packet loss is larger for packets of 64KBytes than for packets of 8KBytes. Consequently, sending events larger than 64KB can be problematic and should be avoided when possible.

[Figure 4. National scenario: loss rate of two different subscribers. Loss rate (%) vs. event size (B) for Subscriber 1 and Subscriber 2.]

3.3 Heterogeneous nodes

The results shown in the previous section are even more evident if we consider the different performance experienced by the two subscriber nodes in the national scenario. These two nodes are characterized by widely different capabilities in terms of bandwidth and link reliability. Figure 4 reports the loss rate experienced at each node. Subscriber 1 shows low loss rates, even with large messages, while subscriber 2 has consistently large loss rates that get worse as we reach large event sizes. This subscriber is responsible for introducing the large latencies reported by the graphs in the previous section.

Figure 5 shows more clearly how a slow node can slow down the throughput of the publisher for the topic it is subscribed to. The curves showing the behaviour at the subscriber side remark how the two subscribers have clearly different capabilities: subscriber 2 is, in fact, not able to sustain a large throughput due to its limited resources.

With best-effort communication this is not a real problem, as the publisher generates events at its maximum rate, and only subscriber 2 is affected by its own poor performance. With reliable communication, instead, the graph clearly shows how the throughput at the publisher side (and, therefore, the whole throughput for the topic) is limited by the performance of subscriber 2. This problem is caused by the fact that the DDS specification defines a transport layer based on the UDP protocol. Consequently, reliability must be implemented at an upper layer in the middleware. In this case, implementing reliable communication over UDP channels requires defining reliability in terms of events: events have to be stored in a queue at the publisher side until they are acknowledged by every subscriber. The queue stores produced events up to a maximum: when the queue is full, the publisher stops producing further events. Figure 5 shows that a slow subscriber is able to limit the throughput of the publisher for the same topic. A max blocking time parameter can be configured in RTI DDS to discard events in send queues after a predefined time interval. This parameter has to be manually configured, and if it is too small it can drastically reduce the level of reliability. Moreover, the resulting frequent retransmissions drastically decrease throughput, depending on the number of events transmitted. This aspect can be a crucial point when using a DDS product in an unmanaged setting similar to the one we are considering here, where nodes can have widely different capabilities.

[Figure 5. National scenario: throughput for reliable and best-effort communications. Throughput (Mbit/s) vs. event size (B) for Subscriber 1, Subscriber 2, publisher best-effort (Pub BE) and publisher reliable (Pub RE).]
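To make this blocking behaviour concrete, the sketch below models, in a strongly simplified way that is independent of RTI's actual implementation, a bounded publisher-side queue in which a slot is freed only after every subscriber has acknowledged the oldest event; with one slow subscriber the queue fills up and the publisher's send() blocks, throttling the whole topic.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>

// Simplified model of reliable publishing over unicast UDP (not RTI's code):
// each event stays in a bounded queue until acknowledged by ALL subscribers.
class ReliableSendQueue {
public:
    ReliableSendQueue(std::size_t capacity, std::size_t num_subscribers)
        : capacity_(capacity), num_subscribers_(num_subscribers) {}

    // Called by the publisher: blocks when the queue is full, i.e. when the
    // slowest subscriber has not yet acknowledged the oldest queued event.
    void send(long event_id) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return queue_.size() < capacity_; });
        queue_.push_back(Entry{event_id, 0});
        // ...the per-subscriber UDP transmissions would happen here...
    }

    // Called when an acknowledgement arrives for the oldest queued event.
    void acknowledge_oldest() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty()) return;
        if (++queue_.front().acks == num_subscribers_) {
            queue_.pop_front();      // slot freed only after ALL subscribers ack
            not_full_.notify_one();  // unblock the publisher
        }
    }

private:
    struct Entry { long event_id; std::size_t acks; };
    const std::size_t capacity_;
    const std::size_t num_subscribers_;
    std::deque<Entry> queue_;
    std::mutex mutex_;
    std::condition_variable not_full_;
};
```

In this model, a max blocking time would simply bound how long send() is allowed to wait before the event is discarded, which is exactly the trade-off between reliability and throughput described above.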

3.4 Scalability in terms of number of users

In Figure 6 we analyze the European scenario, where we have 20 subscribers deployed at several sites in the European area. In this scenario we want to show the behaviour of RTI DDS in terms of throughput when a publisher has to distribute the same event to a large set of subscribers (20 in our tests). We have to remember that in our scenario we cannot use IP multicast, because we cannot configure routers belonging to different administrative domains (a common condition in an unmanaged environment). Consequently, we have to use unicast IPv4 as the transport layer for RTI DDS. Using UDP-based unicast to transfer events, 20 subscribers require 20 separate data sends per event; therefore the overall throughput suffers a strong performance degradation. This is evident from Figure 6, where the global throughput for best-effort communication is clearly negatively affected by the larger number of recipients.
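A back-of-the-envelope bound makes this degradation explicit: if the publisher's uplink sustains a raw bandwidth $B$ and every event must be transmitted once per subscriber over unicast, the throughput observable by each of the $N$ subscribers is at most

\[
T_{\mathrm{sub}} \;\le\; \frac{B}{N},
\]

so moving from 2 to 20 subscribers divides the achievable per-topic throughput by roughly an order of magnitude; the symbols $B$, $N$ and $T_{\mathrm{sub}}$ are introduced here only for this estimate.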


[Figure 6. European scenario: impact of the number of subscribers on throughput. Throughput (Mbit/s) vs. event size (B) for 2 and 20 subscribers.]

3.5 Impact of Static Configuration

During our tests the lack of self-organization and on-the-fly adaptiveness clearly emerged. We had to manually configure and test several parameters that can strongly impact performance. Moreover, due to the lack of control over the environment and its inherent dynamics, these parameters must also be adapted and changed from test to test. For example, the performance of reliable communication in RTI DDS is strictly related to the heartbeat frequency: a heartbeat frequency that was too low did not allow our test application to maintain a strictly reliable communication paradigm, or introduced large latencies. Moreover, the automatic discovery mechanism of RTI DDS is unable to work in our inter-administrative scenario; a publisher, however, does not have to know each subscriber directly, since it can automatically add a subscriber to its list when it is contacted by the subscriber itself.
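As an example of the manual configuration this requires, the sketch below lists unicast discovery peers explicitly so that participants can find each other without multicast; the initial_peers field and the udpv4:// peer syntax follow our reading of the RTI documentation and may differ across versions, and the publisher host is the one used in our national scenario (see Appendix B), so the snippet should be read as illustrative only.

```cpp
// Illustrative sketch: explicit unicast peers in place of multicast-based
// discovery, which is unavailable across administrative domains.
// Field names and peer syntax are assumptions based on RTI documentation.
#include "ndds/ndds_cpp.h"

DDSDomainParticipant* create_participant_with_peers(int domain_id) {
    DDS_DomainParticipantQos participant_qos;
    DDSTheParticipantFactory->get_default_participant_qos(participant_qos);

    // Replace the default peer list with the publisher's unicast address,
    // so that each subscriber contacts the publisher directly.
    participant_qos.discovery.initial_peers.ensure_length(1, 1);
    participant_qos.discovery.initial_peers[0] =
        DDS_String_dup("udpv4://planetlab1.elet.polimi.it");

    return DDSTheParticipantFactory->create_participant(
        domain_id, participant_qos, NULL /* listener */, DDS_STATUS_MASK_NONE);
}
```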

4 Conclusions and Lessons Learned

Future distributed enterprise computing systems will very likely mix SOA and EDA. There are indeed two main reasons to foresee this merge. Firstly, SOA can be effectively used for normal operations within an enterprise computing system. But what about behaviors that deviate from expectations? These are, for example, typical events in monitoring, auditing or stock management. For these operations EDA offers much better capabilities than SOA to quickly react to such asynchronous events [4]. Secondly, we are close to the advent of “mashup” enterprises [2] that fuse streams of data from sources available on the Internet to create added-value streams [5]². Reliable and timely event dissemination on unmanaged systems is an essential element for enabling this picture.

² Amazon and eBay already offer Web Services that are used by third parties to create value-added services.

The DDS is, at the moment, the only standard that specifies QoS policies such as reliability and timeliness for event dissemination. Even though its implementations have been designed for a managed environment, we positively remarked that, in some limited settings, they can be used as-is. In particular, during our experimental campaign RTI DDS showed the best aptitude for this porting from managed to unmanaged settings. If there is no event fragmentation at the DDS level (i.e., the event size is smaller than the maximum payload of a UDP packet) and if resources are few and homogeneous (i.e., every computer has the same computational power), RTI DDS can be effectively used on top of an unmanaged environment. Unfortunately, in more general settings performance rapidly degenerates and data dissemination becomes unpredictable.

To enlarge the setting in which data dissemination can be performed in a timely and reliable way on top of an unmanaged environment, our study shows that there is a need to introduce innovative solutions in at least three important architectural aspects, each of which represents a challenging research area:

Adaptivity and self-configuration capability: during the setup phase of our tests we were forced to configure and readapt the software differently on each node, adapting the software configuration to a continuously changing setting.

The same problem was also evidenced by the lack of an automatic mechanism for node discovery. These issues are a consequence of the complete lack of self-configuration capabilities in the software and of its inability to automatically adapt to changed environmental conditions.

Efficient event routing primitives: the main event routing and transport mechanism employed by RTI DDS (and many of its siblings) is IP multicast. In our unmanaged inter-administrative scenario, the middleware cannot use IP multicast, so it resorts to a point-to-point UDP-based communication primitive. This limitation has a strong impact on scalability: it introduces high latency in best-effort communications and strongly reduces the overall throughput in reliable communications. From this point of view we advocate the exploitation of application-level multicast services, able to avoid the pitfalls imposed by the unreliable shared channels.

Management of heterogeneous resources: the unmanaged and heterogeneous environment that characterizes our setting clearly created some difficulties for the tested software. Some of these problems are related to the strong diversity among the capabilities of the various nodes. For example, the software was not able to efficiently support data distribution between a very slow and a very fast node, simply adapting the whole process to the slowest participant. This behavior is certainly not acceptable in our setting of interest. Therefore, we think that future middleware products in this field should implement mechanisms able to exploit node diversity, rather than simply adapting to the worst case.

5 Acknowledgements

The authors would like to thank Stefania della Bina and Fabio Falcinelli for their help in implementing the test applications, using several publish/subscribe systems such as OpenSplice and RTI DDS, in both LAN and PlanetLab environments.

References

[1] Real-Time Innovations Inc., http://www.rti.com/products/data_distribution/index.html.

[2] D. Butler. Mashups mix data into global service. Nature, 439(7072):6–7, 2006.

[3] K. M. Chandy. Sense and respond systems. In International Computer Measurement Group Conference, pages 59–66, 2005.

[4] K. M. Chandy. Event-driven applications: Costs, benefits and design approaches. In Gartner Application Integration and Web Services Summit, 2006.

[5] K. M. Chandy, L. Tian, and D. M. Zimmerman. Enterprise computing systems as information factories. In Proceedings of 10th IEEE International Enterprise Distributed Object Computing Conference (EDOC), 2006.

[6] Object Management Group. Data Distribution Service for Real-Time Systems Specification, 2002.

[7] Sun Microsystems Inc. Java Message Service API, rev. 1.1, 2002.

[8] B. McCormick and L. Madden. Open architecture publish-subscribe benchmarking. In OMG Real-Time Embedded System Workshop, 2005.

[9] Object Management Group. CORBA event service specification, version 1.1. OMG Document formal/2000-03-01, 2001.

[10] Object Management Group. CORBA notification service specification, version 1.0.1. OMG Document formal/2002-08-04, 2002.

[11] PlanetLab Consortium. http://www.planet-lab.org/.

[12] Vanderbilt University. RT-DEEP project, http://www.dre.vanderbilt.edu/DDS/.


Appendix A RTI DDS Configuration Parameters and QoS policies

The following Table reports the complete list of configuration parameters and QoS policies used for our tests with RTI DDS.

RTI DDS Participant QoS                    Value
transport builtin mask                     UDPv4
receiver pool buffer size                  64KB

RTI DDS Transport Properties               Value
message size max                           64KB
send socket buffer size                    64KB
receive socket buffer size                 64KB

RTI DDS Flow Controller Properties         Value
max tokens                                 Unlimited
tokens leaked per period                   Unlimited
period                                     10 ms
bytes per token                            64KB

RTI DDS Topic QoS                          Value
durability                                 TRANSIENT_LOCAL
reliability                                BEST_EFFORT or RELIABLE
history                                    KEEP_ALL

RTI DDS DataReader / DataWriter QoS        Value
publish mode.kind                          DDS_ASYNCHRONOUS_PUBLISH_MODE_QOS
max blocking time                          10 s
size send queue                            400 packets
size output queue                          400 packets
size receive queue                         400 packets
initial instance                           1
heartbeat period                           30 ms
heartbeats per max samples                 400
max bytes per nack response                131072 Byte
max heartbeat retries                      25
max nack response delay                    25 ms
min nack response delay                    0 ms
min/max heartbeat response delay           0 ms
fast heartbeat period (throughput)         30 ms
fast heartbeat period (latency)            100 ms

Appendix B PlanetLab nodes employed in the tests

PlanetLab nodes in the national scenario:

Publisher planetlab1.elet.polimi.it;

Subscriber 1 planetlab-1.dis.uniroma1.it;


Subscriber 2 planetlab02.dis.unina.it;

PlanetLab nodes in the european scenario:

Publisher planet1.zib.de;

Subscriber 1 merkur.planetlab.haw-hamburg.de;

Subscriber 2 planetlab2.diku.dk;

Subscriber 3 planetlab1.cesnet.cz;

Subscriber 4 planetlab2.info.ucl.ac.be;

Subscriber 5 planck227.test.ibbt.be;

Subscriber 6 planetlab02.mpi-sws.mpg.de;

Subscriber 7 planetlab02.ethz.ch;

Subscriber 8 planetlab02.dis.unina.it;

Subscriber 9 planetlab1.csg.uzh.ch;

Subscriber 10 planetlab0.cs.stir.ac.uk;

Subscriber 11 planetlab02.cnds.unibe.ch;

Subscriber 12 planetlab2.informatik.uni-kl.de;

Subscriber 13 planetlab2.itwm.fhg.de;

Subscriber 14 plab2-c703.uibk.ac.at;

Subscriber 15 planetlab1.ifi.uio.no;

Subscriber 16 planet2.manchester.ac.uk;

Subscriber 17 planetlab1.exp-math.uni-essen.de;

Subscriber 18 planetlab1.hiit.fi;

Subscriber 19 planet2.colbud.hu;

Subscriber 20 planetlab1.eecs.iu-bremen.de;
