
Re: [mosquitto-dev] mosquitto sometimes disconnects client after publish

I was able to obtain a packet capture for the first client which I've attached. Here is the breakdown of what I'm seeing:

Client: 172.16.0.114
Mosquitto Broker: 10.16.208.139 (port 8883)

The numbers below are the packet numbers in the capture file.
- 1: Client connects to broker
- 17?: Client starts publishing first batch of messages
- 386: First batch of messages complete
- 387: Client starts publishing second batch of messages
- 388: Missing packet?
- 389: Mosquitto ACKs packet 387
- 390: Mosquitto starts spamming TCP DUP ACK packets for 387
- 549: Mosquitto stops spamming TCP DUP ACK packets for 387
- 550-578: Lots of retransmissions
- 784: Second batch of messages complete
- 785: Third batch of messages starts
- 788: Mosquitto sends FIN ACK, closing the connection (WHY?)
- 790-880: Mosquitto responds to remaining messages with RST

The main question is why mosquitto would suddenly close the connection after starting to receive the 3rd batch of messages. The mosquitto logs at the point where the 3rd batch starts are:

18:52:24
1491418344: OpenSSL Error: error:140E0197:SSL routines:SSL_shutdown:shutdown while in init
18:52:24
1491418344: Socket error on client admin, disconnecting.

Does anyone know how an OpenSSL error would happen "while in init", if the connection has been open for a while? Note that the mosquitto broker currently supports connections by username/password over TLS using a cert (which is how this client is connecting), but also supports TLS-PSK from other clients.


On Wed, Apr 5, 2017 at 11:59 AM Jeff Armstrong <jeff@xxxxxxxxxxxxxxxxx> wrote:
I did not get a packet capture as I don't currently have ssh access to that instance, but am pretty certain it came from the broker. First, I connected directly to the instance, rather than through the load balancer, and the problem persisted. Second, my other client application also saw the issue. Here is a breakdown of what happened:

  • mosquitto and my client started up normally

  • client successfully connects, subscribes to a few topics, and then sends a batch of publishes to various topics (a batch of publishes is about 20 different publishes, each to a different topic, with each message being only 11 bytes)

  • mosquitto sees first batch of publishes from my client

  • mosquitto sees the second batch of publishes from the client 5 minutes later (the interval between batches is 5 minutes)

  • mosquitto does NOT see MOST of the third batch of publishes from mqttd at the 10 minute mark. Below are example logs from mosquitto showing that only 2 messages get published (out of about 20):

05:38:51
1491370731: Received PUBLISH from zenreach (d0, q0, r0, m0, '2.1/LOC/SMMug3jKfsoLKlXy/LS/PAQ', ... (11 bytes))
05:38:51
1491370731: Received PUBLISH from zenreach (d0, q0, r0, m0, '2.1/LOC/SMMug3jKfsoLKlXy/LS/PAQ', ... (11 bytes))

  • during third batch of publishes, my client starts printing timeout errors on each publish (I'm using WaitTimeout() with a timeout of 5 seconds as shown in my original post)

  • about 15 minutes after the errors started occurring, mosquitto disconnects the client user because of timeout. This makes sense since the keepalive is set to 10 minutes and since mosquitto isn't receiving any publishes (or pings even), it should disconnect the client. Below is the log from mosquitto:

05:53:51
1491371631: Client zenreach has exceeded timeout, disconnecting.
05:53:51
1491371631: Socket error on client zenreach, disconnecting.
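
As a rough check of the timing above, here is a minimal sketch that parses the epoch prefixes of the quoted log lines and compares the silent period with the keepalive grace period. It assumes the broker applies the MQTT 3.1.1 rule of disconnecting after 1.5 times the keepalive with no control packet, and that the quoted PUBLISH was the last packet the broker received from the client. The helper names (`parse_epoch`, `to_utc`) are made up for illustration and are not part of mosquitto.

```python
import re
from datetime import datetime, timezone

# Log lines copied from the excerpts above: "<unix epoch seconds>: <message>".
LOG_LINES = [
    "1491370731: Received PUBLISH from zenreach (d0, q0, r0, m0, "
    "'2.1/LOC/SMMug3jKfsoLKlXy/LS/PAQ', ... (11 bytes))",
    "1491371631: Client zenreach has exceeded timeout, disconnecting.",
]

KEEPALIVE_S = 10 * 60  # keepalive stated in the post: 10 minutes


def parse_epoch(line):
    """Return the leading epoch timestamp of a log line (hypothetical helper)."""
    match = re.match(r"(\d+): ", line)
    if match is None:
        raise ValueError("no epoch prefix in line: %r" % line)
    return int(match.group(1))


def to_utc(epoch):
    """Convert epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc)


last_publish = parse_epoch(LOG_LINES[0])
disconnect = parse_epoch(LOG_LINES[1])

# Seconds between the last PUBLISH the broker logged and the disconnect.
silence_s = disconnect - last_publish

# Assumption: grace period is 1.5 x keepalive, per the MQTT 3.1.1 Keep Alive rule.
grace_s = int(KEEPALIVE_S * 1.5)

print("last publish:", to_utc(last_publish).strftime("%H:%M:%S"))
print("disconnect:  ", to_utc(disconnect).strftime("%H:%M:%S"))
print("silent for %d s, grace period %d s" % (silence_s, grace_s))
```

If those assumptions hold, the silent period equals the grace period, which would suggest the broker received nothing at all from the client after the two logged publishes, rather than receiving and dropping later packets.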

The question here is, why do all publishes just suddenly stop working on the 3rd batch (including pings)?


On Tue, Apr 4, 2017 at 9:35 PM cheng <tangch318@xxxxxxxxx> wrote:
Did you capture packets on the mosquitto broker side and verify whether the RESET packet came from the mosquitto broker?

2017-04-05 1:19 GMT+08:00 Jeff Armstrong <jeff@xxxxxxxxxxxxxxxxx>:
I'm running a mosquitto broker (1.4.10) on AWS behind an ELB. I wrote a client that publishes batches of messages to a single topic. For example, it may send 20 messages of about 5k each, and repeat that every minute.

The first batch of messages always goes through, but subsequent batches are either completely missing or partially missing. I noticed that after a failed batch of messages is published, the client gets disconnected. I can see the TCP reset in the packet capture from the client side. Even though there are no publishing errors, and the disconnect seems to occur AFTER publishing the 2nd batch, the 2nd batch of messages doesn't show up on the broker.

I'm unable to reproduce this running mosquitto locally, which makes it seem like it's a problem with mosquitto specific to AWS or the load balancer. My first thought was that the load balancer was timing out the connections, but the timeout on ELB is 60s, and I set the client keepalive to 5s and the problem persists.

Any ideas?

Thanks,
Jeff
--
Jeff Armstrong
Software Engineer
Greenfield Labs

_______________________________________________
mosquitto-dev mailing list
mosquitto-dev@xxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/mosquitto-dev


Attachment: mqtt_filtered.pcap
Description: Binary data

