Horizontally Scaling Cloned 'Unique' Workers

#1

Hi all,

I am developing an architecture in which a ‘master’ worker is always running and maintains a list of the ‘children’ workers it controls. These children workers have unique IDs, and I need the ability to make calls into specific children workers that share the same source base. My initial idea was to register the methods under the same URI and use CallOptions (https://github.com/crossbario/autobahn-python/blob/master/autobahn/wamp/types.py#L698), but this does not provide the same ‘eligible’ option as PublishOptions (https://github.com/crossbario/autobahn-python/blob/master/autobahn/wamp/types.py#L484). I cannot seem to use pub/sub for this, as I will not receive a response on the same topic.

One option appears to be to use unique URIs for the register/call pairs, but on the face of it this seems like a poor choice and complicates the registration process (especially when using decorators).
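For illustration only, here is a minimal sketch of how per-worker URIs could be combined with decorators. Nothing in it is Autobahn API: the `procedure` decorator, the `endpoints_for` helper, the `DatasetWorker` class and the `com.example.worker` prefix are all hypothetical names. The decorator records only a relative name, and the worker ID is folded into the full URI when the instance is created.

```python
from typing import Callable, Dict


def procedure(name: str) -> Callable:
    """Mark a method as remotely callable under the relative name `name`."""
    def mark(func: Callable) -> Callable:
        func._rel_uri = name  # hypothetical marker, read by endpoints_for()
        return func
    return mark


class DatasetWorker:
    """Every child worker runs this same class; only worker_id differs."""

    def __init__(self, worker_id: str) -> None:
        self.worker_id = worker_id

    @procedure("status")
    def status(self) -> str:
        return f"worker {self.worker_id} ok"


def endpoints_for(worker: DatasetWorker,
                  prefix: str = "com.example.worker") -> Dict[str, Callable]:
    """Map '<prefix>.<worker_id>.<name>' to each marked bound method."""
    endpoints: Dict[str, Callable] = {}
    for attr in dir(worker):
        member = getattr(worker, attr)
        rel = getattr(member, "_rel_uri", None)
        if callable(member) and rel is not None:
            endpoints[f"{prefix}.{worker.worker_id}.{rel}"] = member
    return endpoints
```

In a real component, each (URI, method) pair from `endpoints_for(worker)` would presumably be handed to the session's `register()` call once the session has joined, so the decorated source stays identical across workers.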

What would be a good pattern here to achieve the ability to ‘call’ into specific crossbario containers?

Thanks!

-Vetsin

0 Likes

#2

Hi, I’m doing something that sounds sort of similar, so I don’t know if this helps …

I have a bunch of containers that connect to crossbar using round-robin scheduling, but I essentially need “sticky” sessions, so once a client has connected to a specific instance, it needs to keep communicating with that specific instance. The method I’ve settled on is to have each container register on two topics: one is (for example) “com.example.generic” and the other is “com.example.(n)”, where (n) is a unique identifier issued to each container.

So on the first request, the topic is set to “com.example.generic” and the call is load-balanced over, say, three containers and serviced by container #2.

Container #2 will include an “002” in the call return, then the caller will change the topic it uses from “com.example.generic” to “com.example.002” and use that new topic for the remainder of the session.

It’s literally a few extra lines of code …
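As a rough sketch of the caller side of this pattern (not the actual code in use): `FakeRouter` below is a hypothetical in-memory stand-in for the router's round-robin dispatch, and `StickyCaller` and the `"worker"` key in the result are likewise assumptions made for the example. With Autobahn, the shared registration would presumably use `RegisterOptions(invoke='roundrobin')` on the generic URI.

```python
import itertools
from typing import Any, Callable, Dict, List

GENERIC = "com.example.generic"


class FakeRouter:
    """In-memory stand-in for the router: round-robins the generic URI."""

    def __init__(self, worker_ids: List[str]) -> None:
        # Each container registers under its own URI ...
        self._handlers: Dict[str, Callable[[Any], dict]] = {
            f"com.example.{wid}": self._make_handler(wid) for wid in worker_ids
        }
        # ... and all of them share the generic URI, served in turn.
        self._cycle = itertools.cycle(worker_ids)

    @staticmethod
    def _make_handler(wid: str) -> Callable[[Any], dict]:
        def handle(payload: Any) -> dict:
            # The worker includes its own identifier in the call result.
            return {"worker": wid, "echo": payload}
        return handle

    def call(self, uri: str, payload: Any) -> dict:
        if uri == GENERIC:
            uri = f"com.example.{next(self._cycle)}"
        return self._handlers[uri](payload)


class StickyCaller:
    """Starts on the generic URI, then pins itself to the answering worker."""

    def __init__(self, router: FakeRouter) -> None:
        self._router = router
        self._uri = GENERIC

    def call(self, payload: Any) -> dict:
        result = self._router.call(self._uri, payload)
        # Switch to the worker-specific URI for the rest of the session.
        self._uri = f"com.example.{result['worker']}"
        return result
```

The sticky part is the single assignment in `StickyCaller.call`; everything else only exists to make the sketch runnable without a router.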

If there’s a better method, I’m all ears; now that I’m using load balancing, I seem to have many requirements that sit in the sticky session category …

0 Likes

#3

Hi,

not sure I get what you want .. because "horiz. scaling workers" and "unique workers" seem to contradict.

With horiz. scaling workers, I'd expect the workers to be identical and not relevant which worker serves what call.

Crossbar.io has built in support for scaling like this:

https://github.com/crossbario/crossbar-examples/tree/master/scaling-microservices

Cheers,
/Tobias

On 16.02.2017 at 01:40, Matthew Gill wrote:

0 Likes

#4

Hi Tobias,

I think maybe there are many different use-cases, and whereas horizontal scaling is needed for load distribution (and resilience), that doesn’t necessarily mean that workers have to be stateless. Once you add state to a process, it ‘does’ matter which worker you talk to. Take the instance of ‘sticky’ sessions I referred to earlier: this is a long-standing issue/requirement for load-balancing traditional high-capacity web applications and something addressed by software like HAProxy. It uses similar scheduling to distribute web requests over multiple back-end connections, but provides a specific facility whereby, once established, a front-end session continues to talk to the same back-end server used to service the first request in the session.

http://blog.haproxy.com/2012/03/29/load-balancing-affinity-persistence-sticky-sessions-what-you-need-to-know/

For example: I’m currently working on a web analytics package and want to distribute logging over a number of back-end workers, so I’m running a number of workers and registering with crossbar using a round-robin scheduler. A session might consist of a dozen page logs, and when complete, that session is packaged up and submitted to an aggregator for post-processing. I found the process of extracting logging information from multiple workers, with a view to aggregating it into one session log, too expensive and complex. So instead I make sessions persistent, and logging for a single session is only ever recorded by one worker, so the majority of logging goes to a specific server, yet horizontal scaling still allows me to spread the load over multiple workers … if that makes sense?

It’s very easy to implement persistence (from my perspective), but on reflection it’s an interesting thought that maybe crossbar could at some point implement persistent paths over the scheduler … ?

Gareth.

0 Likes

#5

Hi Gareth,

I can follow the use case. Thanks for laying out in detail!

So, I'd be open to making sticky sessions work in Crossbar.io shared registrations / load-balancing. I think we could do that nicely (fitting into the overall design). And it would be useful for scenarios like the one you describe (scale-out of stateful callees).

I do think that would be a much better solution than adding something like "eligible" for RPC, because that puts the burden onto the caller and introduces more coupling (the eligible/exclude stuff when using session IDs is borderline .. it introduces coupling).

If you care about this, could you pls file an issue on the CB repo? I can't promise when we would be able to implement it, but I'd be +1 on it.

Cheers,
/Tobias

On 16.02.2017 at 12:24, Gareth Bult wrote:

0 Likes

#6

OK: https://github.com/crossbario/crossbar/issues/965

:slight_smile:

0 Likes

#7

Ok, while on the subject ( :slight_smile: ) I find I generally have to mirror Crossbar metadata in order to add additional application specific metadata. Is there any way to extend the metadata such that the pre-existing metadata could be used rather than duplicating the effort?

Use case:

I use Google OAuth2, and on ticket authentication I have the opportunity to use the ticket ID to read the user’s Google profile on the server, something that would appear to be more difficult to do at a later stage. At this point I cache the Google profile using the session ID as an index, then I hook into “on_leave” to remove the cached information. If I could tag the Google profile to the session metadata, it would make managing this much easier … at the moment I have “self.profiles” in the authenticator class, and have to provide a “get_profile” topic to recover the information, rather than using a standard crossbar metadata call.
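To make the bookkeeping concrete, here is a minimal sketch of the kind of mirror described above. The class and method names (`ProfileCache`, `on_authenticated`) are hypothetical, and the wiring to the authenticator and to the router's session-leave notification is left out; only the “cache on authentication, drop on leave, look up by session ID” logic is shown.

```python
from typing import Any, Dict, Optional


class ProfileCache:
    """Application-side mirror of per-session data, keyed by session ID."""

    def __init__(self) -> None:
        self._profiles: Dict[int, Dict[str, Any]] = {}

    def on_authenticated(self, session_id: int, profile: Dict[str, Any]) -> None:
        # Called once the ticket has been verified and the profile fetched.
        self._profiles[session_id] = profile

    def on_leave(self, session_id: int) -> None:
        # Called when the session goes away; unknown IDs are ignored.
        self._profiles.pop(session_id, None)

    def get_profile(self, session_id: int) -> Optional[Dict[str, Any]]:
        # What a "get_profile" endpoint would return for a live session.
        return self._profiles.get(session_id)
```

If session metadata could be extended, all three methods would collapse into attaching the profile at authentication time, which is the point of the request.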

0 Likes

#8

Just as a matter of interest; if you take a look at https://linux.co.uk (which is a Wordpress website), you will notice that it’s maintaining a persistent connection to a Crossbar server across page transitions.
(and it’s not using iFrames or web workers)

… implemented as a generic Wordpress plugin … whereas it’s by no means bullet proof, it should work on many Wordpress sites … :slight_smile:

… makes integrating autobahn / crossbar with ‘legacy’ web stuff potentially more do-able … in this case I’m recording live analytics … :slight_smile:

0 Likes

#9

Hi Gareth,

Just as a matter of interest; if you take a look at https://linux.co.uk
(which is a Wordpress website), you will notice that it's maintaining a
persistent connection to a Crossbar server across page transitions.

Neat!

+1 for LetsEncrypt too. I see you are using Nginx for static Web, but have LetsEncrypt (with different keys) on both! ++1

(and it's not using iFrames or web workers)

Neither using iframes nor web workers? How are you doing it? Curious =)

.. implemented as a generic Wordpress plugin .. whereas it's by no means
bullet proof, it should work on many Wordpress sites .. :slight_smile:
.. makes integrating autobahn / crossbar with 'legacy' web stuff
potentially more do-able ... in this case I'm recording live analytics .. :slight_smile:

Yeah, having the ability to keep a WAMP connection open over page transitions for classic Web (non single-page apps) is a powerful thing.

Like measuring the time users are actually active on pages by checking for mouse / scroll activity etc in real-time

Cheers,
/Tobias

On 16.02.2017 at 16:41, Gareth Bult wrote:

0 Likes

#10

Thanks all. I believe a sticky-session solution would fully support my desired use-case. To further elaborate on my use case my ‘unique workers’ are unique because they process unique data sets. Each worker shares the same source base, but all further operations for that given data set should be executed against their given worker. Since I do not care which worker picks up a data set the sticky session will work just fine, though I would request that this new feature have a method to ‘discover’ that session again if it were somehow dropped (via metaapi?).

Thanks!

-Vetsin

On Thursday, February 16, 2017 at 9:28:24 AM UTC-8, Tobias Oberstein wrote:

0 Likes

#11

Thanks all. I believe a sticky-session solution would fully support my
desired use-case. To further elaborate on my use case my 'unique workers'
are unique because they process unique data sets. Each worker shares the
same source base, but all further operations for that given data set should
be executed against their given worker. Since I do not care which worker
picks up a data set the sticky session will work just fine, though I would
request that this new feature have a method to 'discover' that session
again if it were somehow dropped (via metaapi?).

Another approach would be to have the worker selected from the hashed value of the client session's auth ID, provided that it is stable, e.g. because the client is authenticating or is at least tracked via an HTTP cookie ..
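A minimal sketch of that idea, assuming a fixed list of worker IDs and a plain hash-modulo mapping; `pick_worker` is a hypothetical helper, not something Crossbar.io provides:

```python
import hashlib
from typing import List


def pick_worker(authid: str, workers: List[str]) -> str:
    """Deterministically map an auth ID onto one of the given workers."""
    if not workers:
        raise ValueError("at least one worker is required")
    # Use a cryptographic digest rather than hash(), which is salted
    # per process and would not give the same answer on every node.
    digest = hashlib.sha256(authid.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(workers)
    return workers[index]
```

The same auth ID always lands on the same worker as long as the worker list is unchanged. With plain modulo, adding or removing a worker remaps most clients, so a real implementation would likely want something like consistent hashing.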

I also thought we would have something about "partitioned calls and publications", which were supposed to have the callee/subscriber worker selected based on a stable caller/publisher client mapping.

Essentially, it would allow the workers with the callee/subscriber instances to preload the respective data shard ("partition"). Maybe the sketch referred to the concept as "sharded" vs shared registrations at one point. Forgot. The whole concept isn't sufficiently defined yet to even be implemented ..

On 16.02.2017 at 19:26, Vetsin wrote:

0 Likes

#12

Neither using iframes nor web workers? How are you doing it? Curious =)

In principle, I’m intercepting all link clicks, then using xdr to load the target into a dummy element, then overwriting the head and body of the current page with the head and body from the dummy element.

Net effect, new page, persistent global context … load autobahn in global and you’re away! :slight_smile:

Yeah, having the ability to keep a WAMP connection open over page
transitions for classic Web (non single-page apps) is a powerful thing.

Like measuring the time users are actually active on pages by checking
for mouse / scroll activity etc in real-time

Indeed, the target is to implement a real-time version of Google Analytics; the trick has been to engineer sufficient scalability to make it viable to release it as a public Wordpress plugin.

Starting from the premise that there are ~ 70M Wordpress sites out there, it’s proving to be an interesting engineering challenge … effectively infinite capacity.

I think I’m there, just need to finish the front-end.

This is what I have at the moment, you can watch people connect/disconnect, switch pages etc, all in real-time … need some graphs and pie charts next :slight_smile:

But as you say, the ability to track mouse movement etc leaves much scope for analysing how people use specific pages, where their eye is drawn and exactly which links get clicked on.

Certainly plenty of room for feature creep in future …

0 Likes

#13

Hi Gareth!

That’s a really nice dashboard. We used to have an application which did some tracking across our Web sites (though not persistent across page loads), which had an activity graph - mouse moves, scroll actions etc. - maybe something for your future feature creep :slight_smile:

Regards,

Alex

0 Likes

#14

Working on the activity graph atm … will post a screenshot once I have it working … :slight_smile:
At the moment I’m using Masonry and jQuery animation so it’s quite fun to watch as people log in and out … but I guess I’m going to need a cut-off at some point as a busy website is going to overrun the screen … :frowning:

0 Likes

#15

Still messing around with it, but essentially this is now live / working … responsive SVG takes the pain out of resizing … :slight_smile:

d3 is a bit interesting to get to grips with, but I think I’ve wrapped my head around it …

0 Likes

#16

Just as a matter of interest, I’ve now installed the Wordpress plugin on 4x sites and they all still seem to be working, despite the horrendous way in which I’m manipulating the DOM … :wink:

0 Likes