Re: [mosquitto-dev] client drops

I ran some informal tests last night and the results seem a bit different, although not vastly so. But at least my results move me out of panic mode. Here's what I did:
  • Spun up an Ubuntu 14.04 server (512 MB RAM, 1 CPU) on Digital Ocean. That's their cheapest $5/month server. This server is for the Mosquitto broker.
  • Set the hard and soft open file limits to something ridiculously high.
  • Installed Mosquitto. Configuration:
    • TLS/SSL connections only, with an insecure certificate (domain name mismatch; I'm just testing)
    • authentication via client ID prefix
    • log type: all
  • Spun up 5 more Ubuntu 14.04 servers (512 MB RAM, 1 CPU) on Digital Ocean. These are for the load testing.
  • Each of the load-testing servers spun up 1700 clients, each roughly emulating our hardware. Each client:
    • had keepalive set to 60 seconds
    • subscribed to a topic shared with 5 other clients
    • subscribed to its own unique topic
    • published a short message (just a number) to a topic shared by 5 peer clients every 3 to 6 seconds
    • did everything at QoS 0
  • So that's a total of 5*1700 = 8500 clients, each publishing every 3 to 6 seconds to 5 other clients, all on one Mosquitto broker.
The result was that the number of connected clients held solidly at 8501 overnight (the 8500 test clients plus the one subscriber I used to read the connected-client count). Mosquitto hovered between 80% and 100% CPU usage throughout.
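
For a rough sense of the message rate this setup puts on the broker, here is a back-of-envelope sketch in Python. It is an estimate under stated assumptions, not a measurement: it assumes publish intervals are spread evenly between 3 and 6 seconds (mean 4.5 s) and that each publish is delivered to exactly 5 subscribers. The function name `estimate_load` is made up for illustration.

```python
# Back-of-envelope load estimate for the test described above.
# Assumptions (not stated in the thread): publish intervals are uniformly
# distributed between the minimum and maximum, and every publish is
# delivered to exactly `fanout` subscribers.


def estimate_load(clients, min_interval_s, max_interval_s, fanout):
    """Return (inbound publishes per second, outbound deliveries per second)."""
    mean_interval_s = (min_interval_s + max_interval_s) / 2
    inbound = clients / mean_interval_s  # publishes arriving at the broker
    outbound = inbound * fanout          # copies the broker sends out
    return inbound, outbound


if __name__ == "__main__":
    servers, clients_per_server = 5, 1700
    clients = servers * clients_per_server  # 8500 clients in total
    inbound, outbound = estimate_load(clients, 3, 6, 5)
    print(f"{clients} clients: ~{inbound:.0f} publishes/s in, "
          f"~{outbound:.0f} deliveries/s out")
```

Under those assumptions the broker would see roughly 1,900 publishes per second coming in and about 9,400 deliveries per second going out, all over TLS on a single CPU.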

When I added another 1700 clients using a 6th server, things started breaking down.

So those aren't amazing results, but they're better than a meltdown at 3800 clients.

I never got the SSL errors that originally led me to start this thread. However, I forgot to check for them when I spun up the 6th server, so I will run the test again and check for that.

John



On Mon, Jun 5, 2017 at 3:35 PM, Karl Palsson <karlp@xxxxxxxxxxxx> wrote:

If you have 3800 clients connected without doing serious kernel
trickery, you're way in the deep end of poll() breakdown range.
Something like htop would probably show that your CPU time is
in "system", not in "user" land. I've not done any in-depth
testing for a few years now. Here's where I broke it all down
back in 2013:
https://lists.launchpad.net/mosquitto-users/msg00335.html

There's more out there now, but I believe that at 3800, you're
just about at the end.
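
As an alternative to eyeballing htop, the user/system split Karl mentions can also be sampled from `/proc/<pid>/stat` on Linux. The following is a minimal sketch, not something from the thread: the function names are invented for illustration, and it relies on `utime` and `stime` being fields 14 and 15 of that file, as documented in proc(5).

```python
import time


def parse_cpu_ticks(stat_line):
    """Return (utime, stime) in clock ticks from one /proc/<pid>/stat line."""
    # Field 2 (the command name) is wrapped in parentheses and may itself
    # contain spaces or parentheses, so split on the LAST ')' first.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so fields 14 and 15 are rest[11] and rest[12].
    return int(rest[11]), int(rest[12])


def system_share(user_ticks, system_ticks):
    """Fraction of CPU time spent in the kernel (0.0 if no time was used)."""
    total = user_ticks + system_ticks
    return system_ticks / total if total else 0.0


def sample_system_share(pid, seconds=5.0):
    """Measure the system-time share of a process over a short window."""
    def read():
        with open(f"/proc/{pid}/stat") as f:
            return parse_cpu_ticks(f.read())

    user_before, system_before = read()
    time.sleep(seconds)
    user_after, system_after = read()
    return system_share(user_after - user_before, system_after - system_before)
```

Calling `sample_system_share(pid)` with the broker's PID during one of the CPU spikes returns a value between 0 and 1. A value close to 1 would be consistent with the time going to the kernel rather than to Mosquitto's own code, which is what Karl expects to see.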

Cheers,
Karl P

John Harrison <john@xxxxxxxxxxx> wrote:
> Hi Karl. Thanks for the questions.
>
> The persistence write interval, as I understand it, is the same
> as the autosave interval which defaults to 30 minutes. The
> surges of 99.9% CPU happen for a few seconds every 10 seconds
> or so.
>
> Here's some data that I believe answers your other questions:
>
> $SYS/broker/clients/connected: 3827
> $SYS/broker/clients/expired: 0
> $SYS/broker/clients/disconnected: -1
> $SYS/broker/clients/total: 3825
> $SYS/broker/load/connections/+: 73.78, 68.74, 67.17
> $SYS/broker/load/bytes/received/+: 20925.18, 22602.58, 22880.39
> $SYS/broker/load/bytes/sent/+: 141812.25, 151984.54, 160541.45
> $SYS/broker/load/messages/sent/+: 2883.43, 3147.90, 3278.97
> $SYS/broker/load/messages/received/+: 1621.39, 1658.80, 1668.28
> $SYS/broker/load/publish/received/+: 120.27, 136.55, 142.08
> $SYS/broker/load/publish/sent/+: 1498.97, 1680.24, 1824.82
> $SYS/broker/messages/received: 351311395
> $SYS/broker/messages/sent: 591194360
> $SYS/broker/messages/stored: 2591
> $SYS/broker/publish/messages/received: 23615914
> $SYS/broker/publish/messages/sent: 280424529
>
> John
>
> On Thu, Jun 1, 2017 at 5:32 AM, Karl Palsson
> <karlp@xxxxxxxxxxxx> wrote:
>
> >
> > John Harrison <john@xxxxxxxxxxx> wrote:
> > > Hello,
> > >
> > > We have found as the number of clients has increased on our
> > > Mosquitto production server, more and more clients are getting
> > > dropped. We don't use any plugins, authentication is by client
> > > prefix, and we are using SSL/TLS in our connections to the
> > > Mosquitto broker.
> > >
> > > To try to understand the problem better I parsed through a
> > > day's worth of the verbose log of the broker (size 1 Gig),
> > > isolating the lines with errors on them, removing the
> > > timestamps, and finally finding duplicates and sorting. The
> > > first 3 lines (the most common lines appearing in the logs) are
> > > below. The first column shows the number of times that line
> > > appeared in the 24 hour log:
> > >
> > > >    3498 OpenSSL Error: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac
> > > >    1616 Socket error on client <unknown>, disconnecting.
> > > >    1197 OpenSSL Error: error:140F3042:SSL routines:SSL_UNDEFINED_CONST_FUNCTION:called a function you should not call
> > >
> > >
> > > Mosquitto seems to average 20% CPU but the hills and valleys
> > > are huge: it seems to spend 4/5 of its time at almost 0% CPU
> > > then peaks at 100% CPU for the other 1/5 time. Looking at
> > > tcpdump it seems the messages are coming in scattered, not all
> > > at once, so I'm not really understanding these hills and
> > > valleys and don't know if it is related. There's nothing in
> > > the syslog for clues on what is going on.
> >
> > Do those spikes correspond to your persistence write
> > interval? Do you have lots of offline queued clients, or large
> > amounts of retained messages? How many simultaneous clients?
> >
> > Cheers,
> > Karl P
> >
> > >
> > > I found this:
> > > https://github.com/eclipse/mosquitto/issues/383
> > >
> > > which makes me wonder if the problem might be at least
> > > partially solved by using haproxy for my SSL termination? (I'm
> > > thinking haproxy would work as well as Nginx and is something
> > > we were looking at doing for a future Mosquitto cluster.)
> > >
> > > Any clues, help, thoughts would be appreciated.
> > >
> > > John
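
The log summary described in John's original message (isolate the error lines, strip the timestamps, count duplicates, sort by frequency) can be sketched in Python as follows. This is an illustration rather than the script John actually used. It assumes each log line starts with a numeric timestamp followed by a colon, and it treats any line containing the word "error" (case-insensitive) as an error line. Both are assumptions; adjust them to match the real log.

```python
import re
from collections import Counter

# Assumed log line shape: "<numeric timestamp>: <message>".
TIMESTAMP = re.compile(r"^\d+:\s*")


def summarize_errors(lines, top=3):
    """Return the `top` most common error messages as (message, count) pairs."""
    counts = Counter()
    for line in lines:
        message = TIMESTAMP.sub("", line.strip())
        # Assumed filter: keep only lines that mention "error".
        if "error" in message.lower():
            counts[message] += 1
    return counts.most_common(top)


def format_summary(summary):
    """Render the summary with the count in the first column."""
    return "\n".join(f"{count:7d} {message}" for message, count in summary)
```

To use it, open the log file and pass the file object straight in, for example `print(format_summary(summarize_errors(open(path))))`, where `path` points at the broker's log. Because the counter streams line by line, a 1 GB log does not need to fit in memory.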




