Hi Karl,
And thanks for your reply.
I am indeed expecting retries from the server (the PUBRELs in the log trace). The fact that they happen is fine. Part of the aim of the test is to check that retries do happen and that is the reason why the retry interval is set so low for Mosquitto.
What is not expected is the fact that I am seeing a retry of an older message (2) after a publish of a newer message (3). This should not be happening not matter how long the retry interval is, in my understanding of the MQTT ordering guarantees at QOS 2 and max in flight 1.
This could be happening because the two messages were both in flight from the get go, which is also not expected.
As you can see from the trace the messages 2 and 3 are published by Mosquitto to client_sub one after the other without waiting for the QOS 2 PUBCOMP ack for msg 2 before publishing msg 3. This is happening for message 1 though. Message 2 is published after PUBCOMP is received for message 1. Note the timestamp have second precision so there is some uncertainty of the ordering in the logs. Either way my tests is failing so re-ordering is real. ( is there a way to configure a higher precision in Mosquitto?).
1502390751: New client connected from 192.168.99.1 as client_sub (c0, k60).
1502390751: Sending CONNACK to client_sub (1, 0)
1502390751: Received SUBSCRIBE from client_sub
1502390751: topic_1 (QoS 2)
1502390751: Sending PUBLISH to client_sub (d0, q2, r0, m1, 'topic_1', ... (666 bytes))
...
1502390752: Received PUBREC from client_sub (Mid: 1)
1502390752: Sending PUBREL to client_sub (Mid: 1)
...
1502390753: Received PUBCOMP from client_sub (Mid: 1)
1502390753: Sending PUBLISH to client_sub (d0, q2, r0, m2, 'topic_1', ... (665 bytes))
1502390753: Sending PUBLISH to client_sub (d0, q2, r0, m3, 'topic_1', ... (668 bytes))
1502390753: Received PUBREC from client_sub (Mid: 2)
1502390753: Sending PUBREL to client_sub (Mid: 2)
1502390753: Received PUBREC from client_sub (Mid: 3)
1502390753: Sending PUBREL to client_sub (Mid: 3)
The above trace occurs once client_sub came back to life.
They are a response to client_pub having published them during client_sub downtime at 2 sec intervals:
1502390742: Received PUBLISH from client_pub (d0, q2, r0, m1, 'iotgw2-1502390760800', ... (666 bytes))
1502390742: Sending PUBREC to client_pub (Mid: 1)
1502390742: Received PUBREL from client_pub (Mid: 1)
1502390742: Sending PUBCOMP to client_pub (Mid: 1)
1502390744: Received PUBLISH from client_pub (d0, q2, r0, m2, 'iotgw2-1502390760800', ... (665 bytes))
1502390744: Sending PUBREC to client_pub (Mid: 2)
1502390744: Received PUBREL from client_pub (Mid: 2)
1502390744: Sending PUBCOMP to client_pub (Mid: 2)
1502390746: Received PUBLISH from client_pub (d0, q2, r0, m3, 'iotgw2-1502390760800', ... (668 bytes))
1502390746: Sending PUBREC to client_pub (Mid: 3)
1502390746: Received PUBREL from client_pub (Mid: 3)
1502390746: Sending PUBCOMP to client_pub (Mid: 3)
So is the QOS 2 PUBCOMP ack not the trigger for Mosquitto (or any MQTT broker) with max inflight window set to 1 to send the next message PUBLISHED on a particular topic for a particular subscriber of that topic? If not what is? Is it the PUBREC? How can I obtain this expected behaviour with Mosquitto and Paho?
What about when retries do happen because processing finishes (i.e. PUBCOMP is sent) later than the retry interval set o Mosquitto (or any MQTT broker)? How can we make sure ordering is still guaranteed?
Thanks,
Martin