dkcorbabench

Description

dkcorbabench is a small suite of benchmarks for comparing the dynamic footprint of various CORBA implementations.

The 'footprint' benchmark measures how long it takes to send an echo request to, and receive a reply from, each of N servers in parallel, where the client and all servers run as separate processes on the local system.

'footprint' takes two arguments:
max_n (the maximum number of servers to try to run), and
max_t (the maximum allowed latency in milliseconds).
It then loops, spawning a new server and measuring the latency to all servers so far, until either max_n or max_t is exceeded. After each measurement, it prints the latency and the first two lines of /proc/meminfo (if running on Linux).
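The meminfo display amounts to printing the first couple of lines of a file. A minimal sketch, assuming a helper I'll call print_first_lines (my name, not necessarily what the benchmark uses):

```cpp
#include <cstdio>

/* Print the first n lines of a file, e.g. /proc/meminfo on Linux.
 * Returns the number of lines actually printed (0 if the file
 * could not be opened, e.g. on non-Linux systems). */
static int print_first_lines(const char *path, int n)
{
    FILE *fp = fopen(path, "r");
    if (!fp)
        return 0;
    char buf[256];
    int printed = 0;
    while (printed < n && fgets(buf, sizeof(buf), fp)) {
        fputs(buf, stdout);
        printed++;
    }
    fclose(fp);
    return printed;
}
```

After each measurement, the benchmark would then do something like print_first_lines("/proc/meminfo", 2).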

To measure performance alone, choose a low value for max_t.
To measure a combination of memory consumption and performance, choose a high value for max_t, or run the benchmark on a machine with far too little memory for its CPU speed.

Two versions of 'footprint' are included:


Results

With max_n set to 1000 and max_t set to 100 ms, results are as follows:

CPU             RAM     Swap            CPU speed  Compiler    Kernel           unix pipes result  OmniOrb3.0.4 result  Tao1.2 result
pentium 2       128 MB  ide hard drive  450 MHz    gcc2.96-98  Red Hat 7.2      800 servers        117 servers          n/a
dual pentium 3  416 MB  ide hard drive  650 MHz    gcc2.96-98  Red Hat 7.2 SMP  2650 servers       194 servers          272 servers
dual pentium 3  48 MB   ide hard drive  650 MHz    gcc2.96-98  Red Hat 7.2 SMP  350 servers        55 servers           64 servers
dual pentium 3  16 MB   ide hard drive  650 MHz    gcc2.96-98  Red Hat 7.2 SMP  61 servers         9 servers            5 servers
dual pentium 3  14 MB   ide hard drive  650 MHz    gcc2.96-98  Red Hat 7.2 SMP  54 servers         7 servers            0 servers
dual pentium 3  12 MB   ide hard drive  650 MHz    gcc2.96-98  Red Hat 7.2 SMP  41 servers         6 servers            0 servers
dual pentium 3  11 MB   ide hard drive  650 MHz    gcc2.96-98  Red Hat 7.2 SMP  33 servers         5 servers            0 servers
dual pentium 3  10 MB   ide hard drive  650 MHz    gcc2.96-98  Red Hat 7.2 SMP  26 servers         1 server             0 servers
dual pentium 3  9 MB    ide hard drive  650 MHz    gcc2.96-98  Red Hat 7.2 SMP  16 servers         1 server             0 servers
dual pentium 3  8 MB    ide hard drive  650 MHz    gcc2.96-98  Red Hat 7.2 SMP  10 servers         0 servers            0 servers

Notes

The above results were obtained without linking -lomniDynamic3 into the server. With that library linked in, 25% fewer servers could be spawned on a system with 64 MB of RAM. Thus the simple act of linking in dynamic Any support appears to add significant per-process overhead.

On the 'footprint1' benchmark, TAO 1.2 seems to beat OmniOrb3 at 48 MB of RAM and above, but OmniOrb3 wins at 16 MB and below.


Downloading/Compiling

The source code is GPL'd, and is in the file dkcorbabench-0.4.tar.gz. The change log is in the file ChangeLog.

dkcorbabench uses autoconf to adapt to the platform you're running on. The autoconf macros used to adapt to the platform's ORB are adapted from those at corbaconf.kiev.ua.

Only omniorb3 and TAO 1.2 have been tested so far, although I intend to support more ORBs later.

Omniorb3

I compiled and installed omniorb3 from sources using the scripts in omni_scripts.tar.gz.

To build dkcorbabench for the x86 workstation, I used the commands

    ./configure --with-omni=/opt/omniorb3/native/
    make

To cross-compile dkcorbabench for the ppc405, I used the commands

    CC=${MY_GCC3_CROSS_TOOL}gcc CXX=${MY_GCC3_CROSS_TOOL}g++ \
    CFLAGS=$MY_TARGET_CFLAGS CXXFLAGS=$MY_TARGET_CFLAGS \
    ./configure --with-omni=/opt/omniorb3/405_linux_2.0_glibc2.1 --host=powerpc-linux
    make
where MY_GCC3_CROSS_TOOL is the location and prefix of the cross-development tools, and
MY_TARGET_CFLAGS are the compiler flags needed to build for the ppc405.

TAO 1.2

I compiled TAO 1.2 from sources for the x86 using the script mintao.sh. I had to turn off the minimum TAO switch, but hope to turn it back on eventually by compiling the server with minimum TAO but the client with full TAO, and, um, running them on separate machines or something.

To build dkcorbabench with Tao for the x86 workstation, I set the usual ACE_ROOT and TAO_ROOT environment variables, then used the commands

    ./configure
    make

Pseudocode

For footprint1, the measurement is performed by the following code:

/**
 * Ping n children, return how many milliseconds it takes.
 */
static int ping_n_servers(int n, Echo_var *servers)
{
    int start = time_in_ms();

    CORBA::Request_var *req = new CORBA::Request_var[n];
    const char *arg = "Hello!";

    /* Send N pings as deferred DII requests */
    for (int i = 0; i < n; i++) {
        req[i] = servers[i]->_request("echoString");
        req[i]->add_in_arg() <<= arg;    /* the Any copies the string */
        req[i]->set_return_type(CORBA::_tc_string);
        req[i]->send_deferred();
    }

    /* Wait for N replies */
    for (int i = 0; i < n; i++) {
        req[i]->get_response();
        const char *ret;
        req[i]->return_value() >>= ret;  /* ret points into the Any; don't free it */
    }

    delete [] req;
    return time_in_ms() - start;
}
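The time_in_ms() helper is not shown above; a plausible implementation using gettimeofday (an assumption on my part, not necessarily the benchmark's actual code) is:

```cpp
#include <sys/time.h>

/* Wall-clock time in milliseconds since an arbitrary epoch.
 * Only differences between two calls are meaningful; the value
 * wraps because of the cast to int. */
static int time_in_ms(void)
{
    struct timeval tv;
    gettimeofday(&tv, 0);
    return (int)(tv.tv_sec * 1000LL + tv.tv_usec / 1000);
}
```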

footprint1(int max_n, int max_t)
{
    Echo_var servers[max_n];
    for (int i = 0; i < max_n; i++) {
        spawn new echo server;
        servers[i] = reference to new echo server;
        // Invoke a method now, so TCP connect time isn't included in the measurement
        servers[i]->echoString("Hello!");
        int latency = ping_n_servers(i+1, servers);
        if (latency > max_t)
            abort();
        printf("Latency to %d children is %d ms\n", i+1, latency);
    }
}
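"spawn new echo server" above is pseudocode; on Unix the usual way is fork/exec. A minimal sketch (spawn_server is my name for the helper; the real benchmark would also arrange to receive the child's object reference, e.g. via a pipe or a stringified IOR file):

```cpp
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Spawn the given program as a child process.
 * Returns the child's pid, or -1 if fork() failed.
 * If exec fails, the child exits with status 127. */
static pid_t spawn_server(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid == 0) {
        execv(path, argv);
        _exit(127);    /* exec failed */
    }
    return pid;
}
```

The parent would then narrow the child's IOR to an Echo_var and store it in servers[i].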

Links

Here are a few pages that discuss footprint reduction. I haven't followed any of the OmniOrb suggestions yet, and haven't successfully tried minimum corba or soreduce on my benchmark.

Tcl implementation

Frank wrote:
Hi Dan,

 I see that you are evaluating ORBs in low memory situations. I can't
help but chime in here with an unorthodox possibility.

 Just for curiosity, I have implemented your footprint client and
server example in Tcl using the Combat ORB. Client and Server are
attached; in addition, you will need

  - Tcl 8.3.x from http://tcl.sourceforge.net/
  - [incr Tcl] 3.x from http://incrtcl.sourceforge.net/
  - Combat/Tcl from http://www.fpx.de/Combat/#Download-Tcl

 Your footprint benchmark runs unchanged.

 Performance is expectably worse than that of C++ clients and servers.
On my 266 MHz machine, the 100ms latency threshold is enough for 2
children only. Combat is probably not the ORB you want to use for high
throughput.

 However, it has a small footprint, so I'm confident that it will keep
its "performance" even in low memory conditions. So maybe you might want
to try running it on your test system.

 Another plus is that as you add more features, scripted clients and
servers grow only a little. On the Combat home page, you can find an
Account example; its statically-linked Win32 server is 460k, the
(graphical) client 860k.

 This is not supposed to be a sales pitch, but I hope I made you
curious.

 Have fun,
        Frank

Last change: 27 Feb 2002
Copyright 2002 Dan Kegel