Re: [paho-dev] MQTT C Client - Concurrency Question/Problem


On 09/26/2014 07:44 AM, Sergio Torassa wrote:
Hi Ian,

FWIW, from a user perspective I would avoid options that are not backward compatible, like the second one. It could break existing applications that do not follow the steps it requires (first disconnect and then destroy). I would much prefer the third one: it solves the issue of the leaked sockets and has no side effects (if a client is destroyed there is no reason to keep it connected, as it can't be used anymore).

m2c

sergio
On 09/26/2014 10:20 AM, Franz Schnyder wrote:
Hi Ian

Thanks for the help and the fast replies. From my point of view, my
favorite would be option 3. I think option 1 would also already be an
improvement and would help to discover the problem faster. I think
option 2 is like throwing an exception out of a destructor, which I
think should be avoided.

Regards
Franz
I agree that the third option could be the nicest. A possible complication is that the disconnect operation could take some time, as it involves a network operation, and in the asynchronous client there is a callback to notify the application of disconnect success. Could this be a cut-down disconnect, just clearing up the data structures and closing the socket?

The other question that comes to mind is, would the application like to know that it called destroy() when the client was still connected, in case this was unintentional?

Ian


On Fri, Sep 26, 2014 at 12:11 AM, Ian Craggs
<icraggs@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
Hi Franz,

It seems like it could be a good idea for the API to protect or warn against
this in some way, because this is not a good side effect. Some options:

1) A trace error entry if the client being destroyed is not disconnected.
2) Change destroy so that it returns an error code, and refuses to destroy
the client if it is not disconnected.
3) Change destroy so that it disconnects the client first, if it isn't
already.
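
To make the difference between the three options concrete, here is a minimal sketch against a mock client. Everything in it is hypothetical: `MockClient`, the `mock_*` functions and the `open_sockets` counter are illustrative stand-ins, not the Paho data structures or the Paho API, and the "socket" is only a counter.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a client handle; NOT the Paho structure. */
typedef struct {
    int connected; /* 1 while the pretend socket is open */
} MockClient;

enum { MOCK_OK = 0, MOCK_STILL_CONNECTED = -1 };

/* Counts pretend sockets that are open, including leaked ones. */
static int open_sockets = 0;

int mock_open_sockets(void) { return open_sockets; }

MockClient *mock_create(void) {
    MockClient *c = calloc(1, sizeof *c);
    assert(c != NULL);
    return c;
}

void mock_connect(MockClient *c) {
    if (!c->connected) { c->connected = 1; open_sockets++; }
}

void mock_disconnect(MockClient *c) {
    if (c->connected) { c->connected = 0; open_sockets--; }
}

/* Option 1: log an error, destroy anyway. The socket still leaks. */
void mock_destroy_warn(MockClient **handle) {
    if ((*handle)->connected)
        fprintf(stderr, "error: destroying a client that is still connected\n");
    free(*handle);
    *handle = NULL;
}

/* Option 2: refuse to destroy a connected client and report it. */
int mock_destroy_strict(MockClient **handle) {
    if ((*handle)->connected)
        return MOCK_STILL_CONNECTED;
    free(*handle);
    *handle = NULL;
    return MOCK_OK;
}

/* Option 3: disconnect first if needed, then destroy. No leak. */
void mock_destroy_auto(MockClient **handle) {
    mock_disconnect(*handle);
    free(*handle);
    *handle = NULL;
}
```

In this sketch only option 2 changes what existing callers observe (the handle survives and a return code must be checked), while option 1 leaves the leak in place and merely reports it.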

We can keep the bug open to make sure that the high CPU use has gone away,
and for me to add a fix to protect against this, whatever that might be.

Ian


On 09/25/2014 07:58 PM, Franz Schnyder wrote:

Hi Ian

Looks like I was too quick to raise the bug. I think the problem is a
result of my "wrong use" of the library API.

I tried to find out why Socket_getReadySocket does not return 0 but
always returns a socket, even though there was no network traffic at that
time. I found that my process had, besides the 3 sockets in use, some
"leaked" sockets:

    sudo lsof -a -p <pid>
    ...
    MqttSnGW  5632 root      12u  IPv4      14541      0t0     TCP PiTwo:44388->104.40.130.232:1883 (CLOSE_WAIT)
    MqttSnGW  5632 root      14u  IPv4      19362      0t0     TCP PiTwo:44666->104.40.130.232:1883 (ESTABLISHED)
    MqttSnGW  5632 root      15u  IPv4      14985      0t0     TCP PiTwo:44403->104.40.130.232:1883 (CLOSE_WAIT)
    MqttSnGW  5632 root      16u  IPv4      19365      0t0     TCP PiTwo:44667->104.40.130.232:1883 (CLOSE_WAIT)
    MqttSnGW  5632 root      17u  IPv4      19371      0t0     TCP PiTwo:44669->104.40.130.232:1883 (ESTABLISHED)
    MqttSnGW  5632 root      18u  IPv4      19427      0t0     TCP PiTwo:44674->104.40.130.232:1883 (CLOSE_WAIT)
    MqttSnGW  5632 root      19u  IPv4      19446      0t0     TCP PiTwo:44676->104.40.130.232:1883 (CLOSE_WAIT)
    MqttSnGW  5632 root      20u  IPv4      19488      0t0     TCP PiTwo:44680->104.40.130.232:1883 (CLOSE_WAIT)
    MqttSnGW  5632 root      21u  IPv4      19505      0t0     TCP PiTwo:44682->104.40.130.232:1883 (CLOSE_WAIT)
    MqttSnGW  5632 root      22u  IPv4      19543      0t0     TCP PiTwo:44686->104.40.130.232:1883 (ESTABLISHED)
    ...

I then found out that in some situations my code destroyed
(MQTTClient_destroy) a connected client without a prior call to
MQTTClient_disconnect, which results in the 'leaked' sockets. I changed my
code so that it always disconnects a client prior to destroying it,
and the 'leaked' sockets are gone. The gateway has now been running for
more than one day, the CPU usage is still normal, and Socket_getReadySocket
returns 0 when there is no network traffic. So I'm quite confident that the
problem is gone.
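
A plausible explanation for why a leaked socket keeps being reported as ready: once the peer has closed its end, a select()-style readiness check reports the local end as readable on every call, because a read would return end-of-file immediately. The sketch below shows that behaviour under two assumptions: that the library's polling is built on something like select(), and that an AF_UNIX socketpair is an acceptable stand-in for a TCP connection in CLOSE_WAIT. The helper names `is_ready_for_read` and `make_half_closed_socket` are made up for this illustration.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Returns 1 if select() reports fd readable within timeout_ms,
 * 0 if the timeout expires first, -1 on error. */
int is_ready_for_read(int fd, int timeout_ms) {
    fd_set readfds;
    struct timeval tv;
    int rc;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return FD_ISSET(fd, &readfds) ? 1 : 0;
}

/* Creates a connected pair and closes the peer end, leaving our end
 * open, much like a socket the application forgot to close.
 * Returns our end, or -1 on failure. */
int make_half_closed_socket(void) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    close(sv[1]); /* the "broker" side goes away */
    return sv[0];
}
```

Calling `is_ready_for_read` on the descriptor returned by `make_half_closed_socket` reports it as readable on every call until the descriptor is closed, so a polling loop that never closes it has no idle state to fall back to.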

I will add this information to my bug report and leave it to you to decide
if the library should handle a destroy without prior disconnect or if this
is the responsibility of the library user.

Regards
Franz






_______________________________________________
paho-dev mailing list
paho-dev@xxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from
this list, visit
https://dev.eclipse.org/mailman/listinfo/paho-dev


--
Ian Craggs
icraggs@xxxxxxxxxx                 IBM United Kingdom
Paho Project Lead; Committer on Mosquitto

