Re: [mosquitto-dev] Storing all Messages in Database - Avoiding Single Subscriber `#` Anti-pattern

Hi Karl,

On Tue, Sep 6, 2016 at 8:03 PM, Karl Palsson <karlp@xxxxxxxxxxxx> wrote:
>
> Drasko DRASKOVIC <drasko.draskovic@xxxxxxxxx> wrote:
>> It is strange though that in the official examples I find
>> exactly the Single Subscriber anti-pattern:
>> https://github.com/eclipse/mosquitto/blob/master/examples/mysql_log/mysql_log.c,
>> line:
>
> Says who? Who says it's an antipattern? You want to save all the
> messages in some "other" way, but you don't want to get all the
> messages via even a local networking connection (which are pretty
> heavily optimized in most OSs)

You probably missed my first e-mail on this subject, where I posted
the link where I explained the problem in more detail:
https://groups.google.com/forum/#!topic/rabbitmq-users/KVMNkAsW-ac. I
did not want to repeat it all, but basically you do want to watch this
video: https://www.youtube.com/watch?v=VoTclkxSago (the fun part
starts at the 11th minute), and read this article:
http://www.hivemq.com/blog/mqtt-sql-database (look at the section "Isn’t
the wildcard subscriber some kind of bottleneck?").

>
> If your bridging configuration results in you creating hotspots,
> that's a problem with how you're setting up your bridging and
> clustering, not that you need some "special" way of writing out
> files that's not called subscribing.

No matter how you configure the bridge, your backend database client
will connect to only one host. And it will subscribe to the `#` topic,
which forces all other nodes in the bridge to forward every message to
this node so that it can be pushed to the backend client.

Let's say that you have 3 nodes in the bridge, and your DB client
subscribes on Node 2 (to the `#` topic). Even if a client connected to
Node 1 publishes something on topic XY, wanting to send a message to,
for example, some subscriber on Node 3, this message will also have to
go to Node 2. And so on, and so on. So when you make a bridge of 100
nodes, you will still have just one node (Node 2 in this case) that is
under heavy load, i.e. you will not be able to spread the load over
all 100 nodes.
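
To put rough numbers on this, here is a minimal sketch in Python. It
is a toy model, not a measurement: it assumes every node receives the
same number of publishes from its local clients, that the bridge
forwards each of them once to the node holding the `#` subscription,
and it ignores deliveries to ordinary subscribers. The function name
and the figure of 1000 messages per node are made up for illustration.

```python
from collections import Counter


def archive_load(num_nodes, msgs_per_node, db_node):
    """Count messages each node handles when one client subscribes to '#'.

    Toy model (assumption): each node gets msgs_per_node local publishes,
    and every publish on another node is forwarded once to db_node.
    """
    load = Counter()
    for node in range(1, num_nodes + 1):
        # Every node handles the publishes of its own local clients.
        load[node] += msgs_per_node
        if node != db_node:
            # The '#' subscription pulls a copy of each message to db_node.
            load[db_node] += msgs_per_node
    return load


load = archive_load(num_nodes=100, msgs_per_node=1000, db_node=2)
print("Node 1 handles:", load[1])
print("Node 2 handles:", load[2])
```

In this model Node 2 handles as many messages as the whole cluster
produces, no matter how many nodes are added.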

The only way to spread the load is to let each of the 100 nodes send
its internal messages directly to the database backend client (rather
than routing them through Node 2). But this is not possible if your
client is an MQTT subscriber (as it will be connected to only one node).

So, to resolve this, I do not want my DB backend client to be an MQTT
subscriber (especially not on the `#` topic, which creates a firehose,
single-point-of-failure node). It should rather be either some kind of
TCP client that takes message data from within the MQTT nodes
internally or, most probably, a Kafka queue subscriber, with each MQTT
broker publishing its messages to Kafka.
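
The shape of what I have in mind can be sketched like this. It is only
an in-memory illustration: `PartitionedLog` is a stand-in for a
partitioned queue such as Kafka, and `BrokerNode.on_publish` stands for
a broker-side hook that, as far as I know, Mosquitto does not expose
today. All names, the partition count and the topic layout are
assumptions, and no real Kafka or Mosquitto API is used.

```python
from zlib import crc32


class PartitionedLog:
    """In-memory stand-in (assumption) for a partitioned queue like Kafka."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, topic, payload):
        # Hash the topic so that one topic always lands in one partition.
        idx = crc32(topic.encode("utf-8")) % len(self.partitions)
        self.partitions[idx].append((topic, payload))
        return idx


class BrokerNode:
    """Hypothetical broker node that forwards its own local publishes."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.handled = 0

    def on_publish(self, topic, payload):
        # Each node writes only what its own clients published, so no
        # single node has to see the traffic of the whole cluster.
        self.handled += 1
        self.log.append(topic, payload)


log = PartitionedLog(num_partitions=4)
nodes = [BrokerNode(f"node{i}", log) for i in range(1, 4)]

# 300 publishes spread evenly over the 3 nodes.
for i in range(300):
    nodes[i % 3].on_publish(f"sensors/{i % 10}/temp", b"21.5")

stored = sum(len(p) for p in log.partitions)
print([n.handled for n in nodes], stored)
```

Every message still ends up stored, but each node only handles its own
share, and the DB writers can consume the partitions in parallel.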

>
> What _is_ an antipattern is publishing all your data into a flat
> topic called "data" or something, and then being _required_ to
> subscribe to a very broad topic even if you're only interested in
> a subset of it.
>
> If you really _do_ want to listen to all the data, subscribing to
> "#" is _precisely_ the way to do so.

I wish it was that simple... But this approach is not scalable.

BR,
Drasko

