Hi Jiji,
I’m not sure whether I can attach a file to this mailing list, so I am pasting everything inline. We made a lot of changes, mainly in Linux, to:
- Increase the number of potential concurrent connections.
- Decrease CPU usage, to get the most out of a single core.
Content of /etc/security/limits.conf
* soft nofile 999999
* hard nofile 999999
root soft nofile 999999
root hard nofile 999999
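To confirm that the raised limit is really in effect for a given process (limits.conf is only applied to sessions that go through pam_limits), it can be checked from inside the process. This is a minimal sketch using Python's standard resource module. The helper names and the 100000 target are illustrative assumptions, not part of our setup.

```python
import resource

def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

def enough_descriptors(wanted, soft_limit):
    """True if soft_limit allows at least `wanted` open descriptors."""
    if soft_limit == resource.RLIM_INFINITY:
        return True  # unlimited
    return soft_limit >= wanted

soft, hard = nofile_limits()
print("soft:", soft, "hard:", hard)
# 100000 is a hypothetical target, roughly one descriptor per connection.
print("enough for 100000 connections:", enough_descriptors(100000, soft))
```

Run it in the same kind of session that starts the broker, since a login shell and an init job can end up with different limits.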
Content of /etc/rc.local
defrt=`ip route | grep "^default" | head -1`
ip route change $defrt initcwnd 10
ifconfig eth0 txqueuelen 2000
ifconfig eth1 txqueuelen 2000
Content of /etc/pam.d/sshd
@include common-auth
account required pam_nologin.so
@include common-account
session [success=ok ignore=ignore module_unknown=ignore default=bad] pam_selinux.so close
session required pam_loginuid.so
@include common-session
session optional pam_motd.so motd=/run/motd.dynamic noupdate
session optional pam_motd.so
session optional pam_mail.so standard noenv
session required pam_limits.so
session required pam_env.so
session required pam_env.so user_readenv=1 envfile=/etc/default/locale
session [success=ok ignore=ignore module_unknown=ignore default=bad] pam_selinux.so open
@include common-password
Content of /etc/ssh/sshd_config
Port 22
Protocol 2
HostKey /etc/ssh/ssh_host_rsa_key
HostKey /etc/ssh/ssh_host_dsa_key
HostKey /etc/ssh/ssh_host_ecdsa_key
UsePrivilegeSeparation yes
KeyRegenerationInterval 3600
ServerKeyBits 768
SyslogFacility AUTH
LogLevel INFO
LoginGraceTime 120
PermitRootLogin yes
StrictModes yes
RSAAuthentication yes
PubkeyAuthentication yes
IgnoreRhosts yes
RhostsRSAAuthentication no
HostbasedAuthentication no
PermitEmptyPasswords no
ChallengeResponseAuthentication no
X11Forwarding no
X11DisplayOffset 10
PrintMotd no
PrintLastLog yes
TCPKeepAlive yes
AcceptEnv LANG LC_*
Subsystem sftp /usr/lib/openssh/sftp-server
UsePAM yes
Content of /etc/sysctl.conf
fs.file-max = 99999999
vm.swappiness = 10
vm.min_free_kbytes = 65536
net.ipv4.ip_local_port_range = 1024 65535
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.rmem_default = 16777216
net.core.wmem_default = 16777216
net.core.optmem_max = 40960
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.core.netdev_max_backlog = 90000
net.ipv4.tcp_max_syn_backlog = 30000
net.ipv4.tcp_max_tw_buckets = 2000000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 10
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.udp_rmem_min = 8192
net.ipv4.udp_wmem_min = 8192
net.ipv4.conf.all.log_martians = 1
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.tcp_max_syn_backlog = 1024
net.ipv4.tcp_sack = 0
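Note that net.ipv4.tcp_max_syn_backlog appears twice in the listing above (30000 and then 1024). Since sysctl applies the file line by line, the later value is presumably the one that takes effect. A small Python sketch for spotting such repeated keys follows. The function name is hypothetical and the sample text is only an excerpt of the listing.

```python
def find_duplicate_keys(text):
    """Map each key assigned more than once to its values, in file order."""
    seen = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments and anything that is not key = value.
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        seen.setdefault(key, []).append(value)
    return {key: values for key, values in seen.items() if len(values) > 1}

sample = """\
net.ipv4.tcp_max_syn_backlog = 30000
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 1024
"""
print(find_duplicate_keys(sample))
# {'net.ipv4.tcp_max_syn_backlog': ['30000', '1024']}
```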
Content of /etc/init/mosquitto.conf
description "Mosquitto MQTTv3.1 broker"
author "Roger Light <roger@xxxxxxxxxx"
limit nofile 999999 999999
start on net-device-up
respawn
exec /usr/sbin/mosquitto -c /etc/mosquitto/mosquitto.conf
Relevant parameters configured in /etc/mosquitto/mosquitto.conf
allow_duplicate_messages true
autosave_interval 1800
autosave_on_changes false
connection_messages true
log_dest syslog
log_timestamp true
log_type all
max_inflight_messages 1
max_queued_messages 1000
message_size_limit 10240
persistence true
persistence_file mosquitto.db
retry_interval 5
store_clean_interval 10
sys_interval 15
listener 8883
tls_version tlsv1
We were testing Mosquitto + Java clients + Node.js clients + TLS + mysql-auth-plug + MySQL, using a physical server to host Mosquitto and several (8-10, I don’t remember exactly) clients. Everything was in our LAN (we tested AWS and others, but not for 100K connections).
We needed more than one VM because of ephemeral ports (one VM = max 64K connections to the broker) and because the virtual machines we configured were not very large.
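As a rough check on the 64K figure: each connection from one client IP to the same broker IP and port needs its own local port, so the range in net.ipv4.ip_local_port_range caps the connections per client. A minimal Python sketch of that arithmetic, assuming one source IP per VM and the 1024-65535 range from the sysctl.conf above (the function names are only for illustration):

```python
import math

def max_connections_per_client_ip(low=1024, high=65535):
    """Local ports available for outgoing connections (inclusive range)."""
    return high - low + 1

def client_ips_needed(target_connections, low=1024, high=65535):
    """Minimum number of client source IPs to open target_connections."""
    per_ip = max_connections_per_client_ip(low, high)
    return math.ceil(target_connections / per_ip)

print(max_connections_per_client_ip())  # 64512
print(client_ips_needed(100_000))       # 2
print(client_ips_needed(1_000_000))     # 16
```

This only gives the lower bound set by ports. In practice more VMs may be needed if each VM is too small to drive that many connections.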
We wanted to find out whether we could support one million users who are not all connected at the same time. We do not know how many users would be connected simultaneously if we had a million users, so we started trying to get as much out of Mosquitto as possible. Just for fun!
With the configuration I’ve pasted in this email we achieved more than 100K connections (as a peak; sometimes more, sometimes less). Let’s say Mosquitto is able to establish the connections. However, from about 50K connections onward we started to get lots of timeouts because keepalive messages did not reach the clients in time.
In our experience, the problem is not memory usage. You can buy several GB of RAM at a very low price now. The memory used by our broker remained under 10 GB (of 48 GB). Also, 23 of our 24 Xeon cores were underused while one was at 100% after a few tens of thousands of connections. I cannot give you the specific numbers, sorry. So we think the main problem is that Mosquitto is single-threaded and cannot take advantage of multi-core/multi-processor systems. Our cores run at 2.5 GHz and are definitely not enough to support more than 25-30K connections without disconnections. Mosquitto does not scale horizontally too well.
Jiji, you can view old mails at http://dev.eclipse.org/mhonarc/lists/mosquitto-dev/.
Hope this information is helpful for you.
Regards.