dkcorbabench is a small suite of benchmarks for comparing the dynamic footprint of various CORBA implementations.
The 'footprint' benchmark measures how long it takes to send and receive an echo to N servers in parallel, where both the client and all servers are running as separate processes on the local system.
'footprint' takes two arguments:
max_n (the maximum number of servers to try to run), and
max_t (the maximum allowed latency in milliseconds).
It then loops, creating a new server and measuring latency
to all servers, until either max_n or max_t is exceeded.
After each measurement, it prints the latency
and displays the first two lines of /proc/meminfo (if running on Linux).
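The /proc/meminfo step described above can be sketched as follows. This is a minimal illustration, not the code shipped in the tarball: the helper names first_lines and print_meminfo_head are hypothetical, and the sketch assumes that "displays the first two lines" means copying them unchanged to standard output.

```cpp
#include <cstddef>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical helper: return up to `count` leading lines of the file at `path`.
// Returns an empty vector if the file cannot be opened (e.g. not on Linux).
std::vector<std::string> first_lines(const std::string &path, int count) {
    std::vector<std::string> lines;
    std::ifstream in(path.c_str());
    std::string line;
    while (static_cast<int>(lines.size()) < count && std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}

// Hypothetical helper: print the first two lines of /proc/meminfo, if present.
void print_meminfo_head() {
    const std::vector<std::string> lines = first_lines("/proc/meminfo", 2);
    for (std::size_t i = 0; i < lines.size(); ++i) {
        std::cout << lines[i] << '\n';
    }
}
```

Calling print_meminfo_head() after each latency measurement would give output of the kind described above; on a system without /proc/meminfo it prints nothing.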
To just measure performance, choose a low value for max_t.
To measure a combination of memory consumption and performance,
choose a high value for max_t, or run the benchmark on a machine
with much too little memory relative to its cpu speed.
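To make the roles of max_n and max_t concrete, here is a rough sketch of the same loop with plain Unix pipes standing in for the ORB, so that it compiles without any CORBA library. It is not the code from the tarball: PipeServer, spawn_server, ping_all and run_pipe_footprint are hypothetical names, and the sketch stops the loop instead of aborting when the latency limit is exceeded.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

// Hypothetical bookkeeping for one forked echo server.
struct PipeServer {
    pid_t pid;
    int to_child;    // parent writes requests here
    int from_child;  // parent reads echoed replies here
};

static long now_ms() {
    using namespace std::chrono;
    return static_cast<long>(
        duration_cast<milliseconds>(steady_clock::now().time_since_epoch()).count());
}

// Fork one child that echoes its input pipe to its output pipe until EOF.
static bool spawn_server(std::vector<PipeServer> &servers) {
    int req[2], rep[2];
    if (pipe(req) != 0) return false;
    if (pipe(rep) != 0) { close(req[0]); close(req[1]); return false; }
    pid_t pid = fork();
    if (pid < 0) {
        close(req[0]); close(req[1]); close(rep[0]); close(rep[1]);
        return false;
    }
    if (pid == 0) {
        // Child: drop descriptors inherited from the parent so that EOF
        // reaches each server once the parent closes its write end.
        for (std::size_t i = 0; i < servers.size(); ++i) {
            close(servers[i].to_child);
            close(servers[i].from_child);
        }
        close(req[1]);
        close(rep[0]);
        char buf[64];
        ssize_t n;
        while ((n = read(req[0], buf, sizeof buf)) > 0) {
            if (write(rep[1], buf, static_cast<std::size_t>(n)) != n) break;
        }
        _exit(0);
    }
    close(req[0]);
    close(rep[1]);
    PipeServer s = {pid, req[1], rep[0]};
    servers.push_back(s);
    return true;
}

// Send one message to every server, then collect every reply.
// Returns elapsed milliseconds, or -1 on an I/O error.
static long ping_all(const std::vector<PipeServer> &servers) {
    static const char msg[] = "Hello!";
    const ssize_t len = static_cast<ssize_t>(sizeof msg);
    long start = now_ms();
    for (std::size_t i = 0; i < servers.size(); ++i) {
        if (write(servers[i].to_child, msg, sizeof msg) != len) return -1;
    }
    for (std::size_t i = 0; i < servers.size(); ++i) {
        char buf[sizeof msg];
        ssize_t got = 0;
        while (got < len) {
            ssize_t n = read(servers[i].from_child, buf + got,
                             static_cast<std::size_t>(len - got));
            if (n <= 0) return -1;
            got += n;
        }
    }
    return now_ms() - start;
}

// Add servers one at a time until max_n is reached or latency exceeds max_t.
// Returns the number of servers that stayed within max_t milliseconds.
int run_pipe_footprint(int max_n, long max_t) {
    std::vector<PipeServer> servers;
    int measured = 0;
    while (measured < max_n && spawn_server(servers)) {
        long latency = ping_all(servers);
        if (latency < 0 || latency > max_t) break;
        ++measured;
        std::printf("Latency to %d children is %ld ms\n", measured, latency);
    }
    // Closing the write ends makes every child see EOF and exit.
    for (std::size_t i = 0; i < servers.size(); ++i) {
        close(servers[i].to_child);
        close(servers[i].from_child);
    }
    for (std::size_t i = 0; i < servers.size(); ++i) {
        waitpid(servers[i].pid, 0, 0);
    }
    return measured;
}
```

A call such as run_pipe_footprint(100, 50) corresponds to the "low max_t" case: the loop is likely to end on latency well before memory runs short. A very large max_t lets the loop continue into swapping, so memory consumption starts to dominate the result.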
Two versions of 'footprint' are included:
CPU | RAM | Swap | CPU speed | Compiler | Kernel | unix pipes result | OmniOrb3.0.4 result | Tao1.2 result |
---|---|---|---|---|---|---|---|---|
pentium 2 | 128 MB | ide hard drive | 450 MHz | gcc2.96-98 | Red Hat 7.2 | 800 servers | 117 servers | n/a |
dual pentium 3 | 416 MB | ide hard drive | 650 MHz | gcc2.96-98 | Red Hat 7.2 SMP | 2650 servers | 194 servers | 272 servers |
dual pentium 3 | 48 MB | ide hard drive | 650 MHz | gcc2.96-98 | Red Hat 7.2 SMP | 350 servers | 55 servers | 64 servers |
dual pentium 3 | 16 MB | ide hard drive | 650 MHz | gcc2.96-98 | Red Hat 7.2 SMP | 61 servers | 9 servers | 5 servers |
dual pentium 3 | 14 MB | ide hard drive | 650 MHz | gcc2.96-98 | Red Hat 7.2 SMP | 54 servers | 7 servers | 0 servers |
dual pentium 3 | 12 MB | ide hard drive | 650 MHz | gcc2.96-98 | Red Hat 7.2 SMP | 41 servers | 6 servers | 0 servers |
dual pentium 3 | 11 MB | ide hard drive | 650 MHz | gcc2.96-98 | Red Hat 7.2 SMP | 33 servers | 5 servers | 0 servers |
dual pentium 3 | 10 MB | ide hard drive | 650 MHz | gcc2.96-98 | Red Hat 7.2 SMP | 26 servers | 1 server | 0 servers |
dual pentium 3 | 9 MB | ide hard drive | 650 MHz | gcc2.96-98 | Red Hat 7.2 SMP | 16 servers | 1 server | 0 servers |
dual pentium 3 | 8 MB | ide hard drive | 650 MHz | gcc2.96-98 | Red Hat 7.2 SMP | 10 servers | 0 servers | 0 servers |
The above results were obtained without linking -lomniDynamic3 into the server. With that library linked in, 25% fewer servers could be spawned on one system with 64 MB RAM. Thus it looks like simply linking in the dynamic Any support adds significant overhead per process.
On the 'footprint1' benchmark, TAO 1.2 seems to beat OmniOrb3 at 48 MB of RAM and above, but OmniOrb3 wins at 16 MB and below.
The source code is GPL'd, and is in the file dkcorbabench-0.4.tar.gz. The change log is in the file ChangeLog.
dkcorbabench uses autoconf to adapt to the platform you're running on. The autoconf macros used to adapt to the platform's ORB are adapted from those at corbaconf.kiev.ua.
Only omniorb3 and TAO 1.2 have been tested so far, although I intend to support more ORBs later.
I compiled and installed omniorb3 from sources using the scripts in omni_scripts.tar.gz.
To build dkcorbabench for the x86 workstation, I used the commands
    ./configure --with-omni=/opt/omniorb3/native/
    make

To cross-compile dkcorbabench for the ppc405, I used the commands
    CC=${MY_GCC3_CROSS_TOOL}gcc CXX=${MY_GCC3_CROSS_TOOL}g++ \
    CFLAGS="$MY_TARGET_CFLAGS" CXXFLAGS="$MY_TARGET_CFLAGS" \
    ./configure --with-omni=/opt/omniorb3/405_linux_2.0_glibc2.1 --host=powerpc-linux
    make

where MY_GCC3_CROSS_TOOL is the location and prefix of the cross-development tools, and
I compiled TAO 1.2 from sources for the x86 using the script mintao.sh. I had to turn off the minimum TAO switch. I hope to turn it back on by compiling the server with minimum TAO and the client with maximum TAO, and, um, running them on separate machines or something.
To build dkcorbabench with Tao for the x86 workstation, I set the usual ACE_ROOT and TAO_ROOT environment variables, then used the commands
    ./configure
    make
For footprint1, the measurement is performed by the following code:
    /** Ping n children, return how many milliseconds it takes. */
    static int ping_n_servers(int n, Echo_var *servers)
    {
        int i;
        int start = time_in_ms();
        CORBA::Request_var *req = new CORBA::Request_var[n];
        const char *arg = "Hello!";

        /* Send N pings */
        for (i=0; i<n; i++) {
            req[i] = servers[i]->_request("echoString");
            req[i]->add_in_arg() <<= CORBA::string_dup(arg);
            req[i]->set_return_type(CORBA::_tc_string);
            req[i]->send_deferred();
        }

        /* Wait for N replies */
        for (i=0; i<n; i++) {
            req[i]->get_response();
            const char* ret;
            req[i]->return_value() >>= ret;
        }

        int elapsed = time_in_ms() - start;
        delete[] req;   /* release the request array allocated above */
        return elapsed;
    }

    void footprint1(int max_n, int max_t)
    {
        int i;
        Echo_var servers[max_n];

        for (i=0; i<max_n; i++) {
            int latency;

            /* pseudocode: spawn new echo server. */
            /* pseudocode: servers[i] = reference to new echo server. */

            // Invoke method now, so TCP connect time isn't included in measurement
            servers[i]->echoString("Hello!");

            latency = ping_n_servers(i+1, servers);
            if (latency > max_t)
                abort();
            printf("Latency to %d children is %d ms\n", i+1, latency);
        }
    }
Hi Dan,

I see that you are evaluating ORBs in low memory situations. I can't help but chime in here with an unorthodox possibility.

Just for curiosity, I have implemented your footprint client and server example in Tcl using the Combat ORB. Client and Server are attached; in addition, you will need
- Tcl 8.3.x from http://tcl.sourceforge.net/
- [incr Tcl] 3.x from http://incrtcl.sourceforge.net/
- Combat/Tcl from http://www.fpx.de/Combat/#Download-Tcl

Your footprint benchmark runs unchanged. Performance is expectably worse than that of C++ clients and servers. On my 266 MHz machine, the 100ms latency threshold is enough for 2 children only.

Combat is probably not the ORB you want to use for high throughput. However, it has a small footprint, so I'm confident that it will keep its "performance" even in low memory conditions. So maybe you might want to try running it on your test system.

Another plus is that as you add more features, scripted clients and servers grow only a little. On the Combat home page, you can find an Account example; its statically-linked Win32 server is 460k, the (graphical) client 860k.

This is not supposed to be a sales pitch, but I hope I made you curious.

Have fun,
Frank