Client reconnect, ping-pong query

#1

Hi, this is a funny one …

I’m using a Python client against Crossbar with a ReconnectingClientFactory, and it seems to work well: I can stop and start Crossbar and the client happily reconnects when it can, no problems. However … I have an issue elsewhere (!) which causes the Python client to lock up (I’m assuming it holds the GIL) by running a callInThread() with a function that then makes a long-running system call.

After 100 seconds, Crossbar reports a “pong” failure and disconnects the client.

When the client comes back to life (another 30 seconds) it reports the disconnection, and that it will retry in 1 second.

It then reports that the factory is closed, and no further retries take place.

It would appear that a ping-pong failure at the crossbar-end is somehow breaking the client retry.

Anyone any ideas on what might be going wrong?

… if it makes any difference, it’s an SSL connection …

Client factory code looks like this;

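from twisted.internet import reactor
from twisted.internet.protocol import ReconnectingClientFactory
from autobahn.twisted.websocket import WampWebSocketClientFactory


class ComponentFactory(WampWebSocketClientFactory, ReconnectingClientFactory):
    """WAMP Reconnect Wrapper."""

    maxDelay = 300      # upper bound on the retry delay, in seconds
    initialDelay = 1    # starting value for the retry delay, in seconds
    factor = 1.1        # multiplier applied to the delay on each retry
    debug = True

    def clientConnectionFailed(self, conn, reason):
        """When the connection fails."""
        # Stop retrying once the reactor is shutting down.
        if not reactor.running:
            self.stopTrying()
        ReconnectingClientFactory.clientConnectionFailed(self, conn, reason)

    def clientConnectionLost(self, conn, reason):
        """When the connection is lost."""
        # Stop retrying once the reactor is shutting down.
        if not reactor.running:
            self.stopTrying()
        ReconnectingClientFactory.clientConnectionLost(self, conn, reason)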
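For reference, the retry schedule these settings produce can be sketched in plain Python. This is only an approximation of what Twisted's ReconnectingClientFactory does: it ignores the random jitter Twisted applies to each delay, and the helper name `backoff_delays` is made up for illustration.

```python
def backoff_delays(initial_delay, factor, max_delay, attempts):
    """Return the nominal delay (in seconds) before each of the first `attempts` retries."""
    delays = []
    delay = initial_delay
    for _ in range(attempts):
        # Each retry multiplies the previous delay by `factor`, capped at `max_delay`.
        delay = min(delay * factor, max_delay)
        delays.append(delay)
    return delays


# Values taken from the ComponentFactory settings; jitter is deliberately ignored.
delays = backoff_delays(initial_delay=1, factor=1.1, max_delay=300, attempts=100)
print(["%.2f" % d for d in delays[:5]])
print("retries before the cap is reached:", delays.index(300) + 1)
```

With a factor this close to 1 the delay grows slowly, so a client that does keep retrying should stay at short intervals for quite a while after a disconnect.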

And the connect code;

    component_config = ComponentConfig(realm=relm, extra=self)
    factory = ApplicationSessionFactory(config=component_config)
    factory.session = self.component
    server_url = "%s://%s:%s/ws" % (proto, host, port)
    transport = ComponentFactory(factory, server_url)
    transport.setProtocolOptions(acceptMaskedServerFrames=True)
    class CtxFactory(ssl.ClientContextFactory):

        def getContext(self):
            self.method = SSL.SSLv23_METHOD
            ctx = ssl.ClientContextFactory.getContext(self)
            ctx.use_certificate_file('certificate.pem')
            ctx.use_privatekey_file('private_key.pem')
            return ctx

    ctx = CtxFactory()
    connectWS(transport, ctx)
    return Application(name)


 __  __  __  __  __  __      __     __
/  `|__)/  \/__`/__`|__) /\ |__)  |/  \
\__,|  \\__/.__/.__/|__)/~~\|  \. |\__/

Crossbar.io : 16.10.1
Autobahn : 0.16.1 (with JSON, MessagePack, CBOR, UBJSON)
Twisted : 16.5.0-EPollReactor
LMDB : 0.92/lmdb-0.9.18
Python : 3.5.2/CPython
OS : Linux-4.4.0-45-generic-x86_64-with-Ubuntu-16.10-yakkety
Machine : x86_64
Release key : RWQ2MDk26PKBMNUZG2Jok1tMBB1SKyci+N7dtcep8jrikTl4NvI1Rnux



#2

Hi Gareth,

Could you try using ApplicationRunner.run(xx, auto_reconnect=True) and see if you can reproduce this?
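A minimal sketch of what that test could look like, assuming Autobahn's Twisted `ApplicationRunner`. The helper names `build_url` and `run_with_auto_reconnect` are made up for illustration, and the SSL client certificate setup from the original post is left out:

```python
def build_url(proto, host, port):
    """Same URL layout as the connect code in the original post."""
    return "%s://%s:%s/ws" % (proto, host, port)


def run_with_auto_reconnect(session_class, url, realm):
    """Run a WAMP session class and let Autobahn handle reconnects."""
    # Imported lazily so build_url() can be used without Autobahn installed.
    from autobahn.twisted.wamp import ApplicationRunner

    runner = ApplicationRunner(url=url, realm=realm)
    # run() starts the Twisted reactor and blocks until it stops.
    runner.run(session_class, auto_reconnect=True)
```

It would be called with your own ApplicationSession subclass, for example `run_with_auto_reconnect(MyComponent, build_url("wss", "localhost", 8080), "realm1")`, where the class name, host, port and realm are placeholders.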

Cheers,

Tobias


In reply to “Gareth Bult” (garet...@gmail.com), 17.11.2016, 2:18 pm.



#3

Mmm, not easily. Is there an equivalent option for “twistd”?
(my code is all auto-generated and based around twistd)

I’ve fixed my “other” problem … just as a matter of interest the “lxc” module for Python doesn’t really play terribly nice.

Even if you run things like “lxc.copy” inside a sub-process through “callInThread”, it still locks up for the whole time it takes to copy an entire filesystem, which is not good.

This however seems to work properly without hanging anything up; (run in a thread)

# this is evil and hangs the application
# master = lxc.Container('iflex_template')
# container = master.copy(name)

# this works as expected
waitforme = Deferred()
self.protocol = cloneProtocol(waitforme)
params = ["bash", "-c", "/usr/bin/lxc-copy -n iflex_template -N {}".format(name)]
# pass the protocol instance created above rather than building a second one
reactor.spawnProcess(self.protocol, "/bin/bash", params, env=os.environ)
yield waitforme
container = lxc.Container(name)

# ProcessProtocol is the base class that reactor.spawnProcess() expects
class cloneProtocol(protocol.ProcessProtocol):
    """Fire the `waitforme` Deferred once the spawned process has ended."""

    def __init__(self, waitforme):
        self.waitforme = waitforme

    def connectionMade(self):
        """initial connection"""
        pass

    def processEnded(self, reason):
        """process an end of session"""
        print("-- Ended CLONE process -- {}".format(reason))
        self.waitforme.callback(self)

    def childDataReceived(self, childFD, data):
        """come here when we have some data from the session"""
        data = data.decode('utf-8')
        print("Data from lxc-copy :: ", data)

    def childConnectionLost(self, childFD):
        """connection to child lost"""
        pass
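If the custom protocol is only there to wait for the process to finish, Twisted's `getProcessOutputAndValue` might be a shorter route to the same thing. This is an untested sketch, not what I actually run: the helper names `lxc_copy_args` and `clone_container` are made up, and it calls lxc-copy directly instead of going through bash.

```python
import os


def lxc_copy_args(template, name):
    """Argument list for lxc-copy, using the same -n/-N flags as the command above."""
    return ["-n", template, "-N", name]


def clone_container(template, name):
    """Start lxc-copy without blocking the reactor.

    Returns a Deferred that fires with (stdout, stderr, exit_code) when the
    process ends normally.
    """
    # Imported lazily so lxc_copy_args() can be used without Twisted installed.
    from twisted.internet.utils import getProcessOutputAndValue

    return getProcessOutputAndValue(
        "/usr/bin/lxc-copy",
        lxc_copy_args(template, name),
        env=dict(os.environ),
    )
```

Inside an @inlineCallbacks function this would be used as `out, err, code = yield clone_container("iflex_template", name)`, followed by `container = lxc.Container(name)` as before.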
