Autobahn performance with many pubsub topics

#1

Hi all,

I’m working on an application that uses a custom solution for pubsub, and currently evaluating switching to autobahn. So far I’ve been quite happy with the autobahn API and its ease of use. One concern I have is how autobahn’s performance characteristics change as the number of pubsub topics grows.

For example, each pubsub client in our application subscribes to at least 100 different topics and a few clients regularly publish on a handful of them (that number could easily grow to 500 and beyond). I’m worried that the amount of computation required for autobahn to publish a particular message will drastically increase as the number of topics and clients both increase, leading to performance degradation.

Have there been any tests that measure autobahn’s performance with large numbers of topics and clients? Alternatively, could anyone comment on the internal implementation for routing pubsub messages? For example, assume there are n subscribers, t topics, and that each subscriber subscribes to each topic. Suppose a message gets published on a particular topic. Ideally the time complexity of sending this message to all clients would be O(n), but I could imagine a situation where it takes O(n * t) or worse, depending on how efficient the routing mechanism is.

Thanks in advance,

Nick


#2

Hi all,

I'm working on an application that uses a custom solution for pubsub,
and currently evaluating switching to autobahn. So far I've been quite
happy with the autobahn API and its ease of use. One concern I have is

Thanks! That's good to hear. Shameless self-promo: getting an API right that is both easy to use and still flexible is much more complicated than writing the implementation.

how autobahn's performance characteristics change as the number of
pubsub topics grows.

For example, each pubsub client in our application subscribes to at
least 100 different topics and a few clients regularly publish on a
handful of them (that number could easily grow to 500 and beyond). I'm
worried that the amount of computation required for autobahn to publish
a particular message will drastically increase as the number of topics
and clients both increase, leading to performance degradation.

Have there been any tests that measure autobahn's performance with large
numbers of topics and clients? Alternatively, could anyone comment on

We have tested a single AutobahnPython instance running on a 2 core 4GB VM with 180k concurrently active connections. A little OS tuning was needed (we used FreeBSD/kqueue, but Linux/epoll should be fine also) to allow for such large numbers of open TCPs, but otherwise, it's pretty straightforward.
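As an aside, the per-process file descriptor limit is one of those knobs. A sketch of checking and raising it from Python's stdlib `resource` module (kernel-wide limits like kern.maxfiles on FreeBSD or fs.file-max on Linux are separate, root-level tuning):

```python
# Sketch: inspect (and, up to the hard limit, raise) the per-process
# cap on open file descriptors -- one prerequisite for holding very
# many concurrent TCP connections in a single process. Unix-only.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("fd limit: soft=%d, hard=%d" % (soft, hard))

# Raise the soft limit as far as the hard limit allows (no root needed).
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
```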

the internal implementation for routing pubsub messages? For example,
assume there are *n* subscribers, *t* topics, and that each subscriber
subscribes to each topic. Suppose a message gets published on a
particular topic. Ideally the time complexity of sending this message to
all clients would be O(n), but I could imagine a situation where it
takes O(n * t) or worse, depending on how efficient the routing
mechanism is.

It's good that you care about such things .. since usually that's the kind of stuff that might come back to bite you.

However, rest assured, Autobahn has very efficient event dispatching.

The dispatching in the standard case will basically incur an O(1) lookup in a dict (hash access) for the subscribers of the given topic, and then an iteration over those: O(n), where n is the subscriber count for _that_ topic.

Moreover: Autobahn will serialize and WS-frame the event to be dispatched only _once_, and then just push the buffered octets onto each TCP connection that leads to a receiver.

You can look for yourself - follow the code from:

https://github.com/tavendo/AutobahnPython/blob/master/autobahn/autobahn/wamp.py#L990
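In rough pseudo-Python, the pattern is something like this (illustrative names only, not Autobahn's actual classes):

```python
# Sketch of hash-based pubsub dispatch: O(1) topic lookup, O(n) fan-out
# over that topic's subscribers, with the event serialized only once.
import json

class Broker:
    def __init__(self):
        # topic URI -> set of subscriber transports
        self.subscriptions = {}

    def subscribe(self, topic, transport):
        self.subscriptions.setdefault(topic, set()).add(transport)

    def publish(self, topic, event):
        receivers = self.subscriptions.get(topic)  # O(1) dict lookup
        if not receivers:
            return 0
        payload = json.dumps(["EVENT", topic, event])  # serialized once
        for transport in receivers:                    # O(n) in subscribers
            transport.send(payload)                    # same buffer reused
        return len(receivers)
```

So the total cost of a publish is independent of the number of other topics t: a topic nobody publishes to costs nothing at dispatch time.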

Hope this helps,

/Tobias

···

On 17.10.2013 04:43, Nick Fishman wrote:

Thanks in advance,

Nick

--
You received this message because you are subscribed to the Google
Groups "Autobahn" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to autobahnws+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


#3

Hi Tobias, thanks for the quick response.

Hi all,

I’m working on an application that uses a custom solution for pubsub,

and currently evaluating switching to autobahn. So far I’ve been quite

happy with the autobahn API and its ease of use. One concern I have is

Thanks! That's good to hear. Shameless self-promo: getting an API right
that is both easy to use and still flexible is much more complicated
than writing the implementation.

Definitely agreed!

how autobahn’s performance characteristics change as the number of

pubsub topics grows.

For example, each pubsub client in our application subscribes to at

least 100 different topics and a few clients regularly publish on a

handful of them (that number could easily grow to 500 and beyond). I’m

worried that the amount of computation required for autobahn to publish

a particular message will drastically increase as the number of topics

and clients both increase, leading to performance degradation.

Have there been any tests that measure autobahn’s performance with large

numbers of topics and clients? Alternatively, could anyone comment on

We have tested a single AutobahnPython instance running on a 2 core 4GB
VM with 180k concurrently active connections. A little OS tuning was
needed (we used FreeBSD/kqueue, but Linux/epoll should be fine also) to
allow for such large numbers of open TCPs, but otherwise, it’s pretty
straightforward.

That’s good to know. In this particular test (or others), did you collect any metrics on data throughput? I’m particularly interested in how much data we can push through autobahn, since there’s no prepackaged way to horizontally scale or cluster to multiple instances (based on your response here). This means the performance of a single instance becomes even more critical.

It appears that wsperf should do exactly this, but I’ve had trouble figuring out how to get it working. I followed the AutobahnTestSuite docs, but after building websocketpp I don’t see the wsperf binary anywhere (and in fact it’s listed in that project’s .gitignore). The websocketpp docs mention that wsperf should be one of the examples, but I also haven’t found it there. Perhaps the websocketpp library isn’t well-maintained, but I would certainly appreciate any pointers so I don’t have to write a test suite myself.

the internal implementation for routing pubsub messages? For example,

assume there are *n* subscribers, *t* topics, and that each subscriber

subscribes to each topic. Suppose a message gets published on a

particular topic. Ideally the time complexity of sending this message to

all clients would be O(n), but I could imagine a situation where it

takes O(n * t) or worse, depending on how efficient the routing

mechanism is.

It’s good that you care about such things … since usually that’s the
kind of stuff that might come back to bite you.

However, rest assured, Autobahn has very efficient event dispatching.

The dispatching in the standard case will basically incur an O(1) lookup
in a dict (hash access) for the subscribers of the given topic, and then
an iteration over those: O(n), where n is the subscriber count for
that topic.

Moreover: Autobahn will serialize and WS-frame the event to be dispatched
only once, and then just push the buffered octets onto each TCP
connection that leads to a receiver.

You can look for yourself - follow the code from:

https://github.com/tavendo/AutobahnPython/blob/master/autobahn/autobahn/wamp.py#L990

Thank you, this is exactly what I was looking for.

···

On Thursday, October 17, 2013 3:11:38 AM UTC-7, Tobias Oberstein wrote:

On 17.10.2013 04:43, Nick Fishman wrote:

Hope this helps,

/Tobias

Thanks in advance,

Nick



#4

    We have tested a single AutobahnPython instance running on a 2 core 4GB
    VM with 180k concurrently active connections. A little OS tuning was
    needed (we used FreeBSD/kqueue, but Linux/epoll should be fine also) to
    allow for such large numbers of open TCPs, but otherwise, it's pretty
    straightforward.

That's good to know. In this particular test (or others), did you
collect any metrics on data throughput? I'm particularly interested in
how much data we can push through autobahn, since there's no prepackaged
way to horizontally scale or cluster to multiple instances (based on
your response here
<https://groups.google.com/forum/#!topic/autobahnws/w1JeODYoedE>). This
means the performance of a single instance becomes even more critical.

We don't yet have WAMP testsuite/perf tests .. they're upcoming .. recent AutobahnTestsuite has added some basic machinery, but it's not yet completed.

Regarding plain WS performance, you might look at the numbers in the 9.x sections of

http://autobahn.ws/testsuite/reports/servers/index.html

Testing details:
https://github.com/tavendo/AutobahnTestSuite/tree/master/examples/publicreports

It appears that wsperf
<https://github.com/zaphoyd/websocketpp/wiki/wsperf> should do exactly
this, but I've had trouble figuring out how to get it working. I
followed the AutobahnTestSuite docs
<https://github.com/tavendo/AutobahnTestSuite#mode-wsperfcontrol>, but
after building websocketpp I don't see the wsperf binary anywhere (and
in fact it's listed in that project's .gitignore). The websocketpp docs
<http://www.zaphoyd.com/software/wsperf> mention that wsperf should be
one of the examples, but I also haven't found it there. Perhaps the
websocketpp library isn't well-maintained, but I would certainly
appreciate any pointers so I don't have to write a test suite myself.

This is work in progress and, in any case, covers only raw WS, not WAMP.

Sorry, I do understand what you are after (hard numbers, in multiple dimensions): it's just not there yet.

However, we are committed to adding those .. it's just a lot of work.

Also: there will be a scale-out story ..

/Tobias

···

     > the internal implementation for routing pubsub messages? For
    example,
     > assume there are *n* subscribers, *t* topics, and that each
    subscriber
     > subscribes to each topic. Suppose a message gets published on a
     > particular topic. Ideally the time complexity of sending this
    message to
     > all clients would be O(n), but I could imagine a situation where it
     > takes O(n * t) or worse, depending on how efficient the routing
     > mechanism is.

    It's good that you care about such things .. since usually that's the
    kind of stuff that might come back to bite you.

    However, rest assured, Autobahn has very efficient event dispatching.

    The dispatching in the standard case will basically incur an O(1) lookup
    in a dict (hash access) for the subscribers of the given topic, and
    then
    an iteration over those: O(n), where n is the subscriber count for
    _that_ topic.

    Moreover: Autobahn will serialize and WS-frame the event to be dispatched
    only _once_, and then just push the buffered octets onto each TCP
    connection that leads to a receiver.

    You can look for yourself - follow the code from:

    https://github.com/tavendo/AutobahnPython/blob/master/autobahn/autobahn/wamp.py#L990

Thank you, this is exactly what I was looking for.

    Hope this helps,

    /Tobias

     >
     > Thanks in advance,
     >
     > Nick
     >



#5
We have tested a single AutobahnPython instance running on a 2 core 4GB
VM with 180k concurrently active connections. A little OS tuning was
needed (we used FreeBSD/kqueue, but Linux/epoll should be fine also) to
allow for such large numbers of open TCPs, but otherwise, it's pretty
straightforward.

That’s good to know. In this particular test (or others), did you

collect any metrics on data throughput? I’m particularly interested in

how much data we can push through autobahn, since there’s no prepackaged

way to horizontally scale or cluster to multiple instances (based on

your response here

<https://groups.google.com/forum/#!topic/autobahnws/w1JeODYoedE>). This

means the performance of a single instance becomes even more critical.

We don’t yet have WAMP testsuite/perf tests … they’re upcoming … recent
AutobahnTestsuite has added some basic machinery, but it’s not yet completed.

Regarding plain WS performance, you might look at the numbers in the 9.x
sections of

http://autobahn.ws/testsuite/reports/servers/index.html

Testing details:

https://github.com/tavendo/AutobahnTestSuite/tree/master/examples/publicreports

Thanks, that’s very helpful (especially the part about running Autobahn via PyPy, I was looking for that info earlier).

It appears that wsperf

<https://github.com/zaphoyd/websocketpp/wiki/wsperf> should do exactly

this, but I’ve had trouble figuring out how to get it working. I

followed the AutobahnTestSuite docs

<https://github.com/tavendo/AutobahnTestSuite#mode-wsperfcontrol>, but

after building websocketpp I don’t see the wsperf binary anywhere (and

in fact it’s listed in that project’s .gitignore). The websocketpp docs

<http://www.zaphoyd.com/software/wsperf> mention that wsperf should be

one of the examples, but I also haven’t found it there. Perhaps the

websocketpp library isn’t well-maintained, but I would certainly

appreciate any pointers so I don’t have to write a test suite myself.

This is work in progress and, in any case, covers only raw WS, not
WAMP.

Sorry, I do understand what you are after (hard numbers, in multiple
dimensions): it’s just not there yet.

However, we are committed to adding those … it’s just a lot of work.

I understand; it’s non-trivial to do this right and collect the right performance metrics. This is important even if it’s not a top priority. I must say though, the Autobahn testsuite itself is pretty awesome and has already pushed the development of WebSocket implementations forward.

Also: there will be a scale-out story …

That will be awesome, I can’t wait :-)

···

On Monday, October 21, 2013 3:50:32 AM UTC-7, Tobias Oberstein wrote:

/Tobias

 > the internal implementation for routing pubsub messages? For
example,
 > assume there are *n* subscribers, *t* topics, and that each
subscriber
 > subscribes to each topic. Suppose a message gets published on a
 > particular topic. Ideally the time complexity of sending this
message to
 > all clients would be O(n), but I could imagine a situation where it
 > takes O(n * t) or worse, depending on how efficient the routing
 > mechanism is.
It's good that you care about such things .. since usually that's the
kind of stuff that might come back to bite you.
However, rest assured, Autobahn has very efficient event dispatching.
The dispatching in the standard case will basically incur an O(1) lookup
in a dict (hash access) for the subscribers of the given topic, and
then
an iteration over those: O(n), where n is the subscriber count for
_that_ topic.
Moreover: Autobahn will serialize and WS-frame the event to be dispatched
only _once_, and then just push the buffered octets onto each TCP
connection that leads to a receiver.
You can look for yourself - follow the code from:
https://github.com/tavendo/AutobahnPython/blob/master/autobahn/autobahn/wamp.py#L990

Thank you, this is exactly what I was looking for.

Hope this helps,
/Tobias
 >
 > Thanks in advance,
 >
 > Nick
 >



#6

Thanks, that's very helpful (especially the part about running Autobahn
via PyPy, I was looking for that info earlier).

Comparing (1) Autobahn/wsaccel/ujson with (2) Autobahn/PyPy .. there are a couple of things to note:

- with mass data, (1) is faster right now .. the AOT-compiled native code just runs faster than JITed code

- I have ideas about how to further push wsaccel .. but see below

- for small-sized messaging, (2) is faster .. this could be because PyPy accelerates the branchy code

- the PyPy GC (at least in old measurements) was not as consistent as CPython's .. which leads to better 99.9th-percentile latency for (1) than for (2)

- the optimum would be to have Autobahn/PyPy + vectorized (SSE3+) native code acceleration for WS (and probably JSON). This can be done, but must be done differently: a wsaccel rewrite to a real native module (no Cython), interfaced via cffi
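For context: one such hot loop is WebSocket frame masking per RFC 6455, where each payload byte is XORed with a cycling 4-byte key. A pure-Python version (illustrative, not wsaccel's actual code) looks like:

```python
def mask_payload(key, payload):
    """XOR-mask a WebSocket payload (RFC 6455).

    key: the 4-byte masking key; payload: bytes.
    Masking is its own inverse, so the same function also unmasks.
    """
    # Tight per-byte loop: exactly the kind of code where vectorized
    # native acceleration wins big over both CPython and a JIT.
    return bytes(b ^ key[i % 4] for i, b in enumerate(payload))
```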

/Tobias
