MQTT issue since V 3.13.0 update

Hello,

I started using TTN v3 last month. I set up a simple Python script with paho.mqtt on my server to retrieve the data from my temperature sensors (10 of them) over MQTT. It runs in the background. It worked perfectly fine until the update to V 3.13.0 on May 21. Since then I’ve stopped receiving uplink messages, even though I can see that my gateway and my sensors are online and working well.

Nothing has changed on my side since the beginning of the month, and after reading the log for the 3.13.0 update I can’t figure out what is different regarding MQTT on TTN.

So far, I’ve tried:

  • Rebooting my server (and restarting my script, obviously)

  • Testing the connection. I can see in my script that the connection is established when I run it. I also get the confirmation message in the TTN console, so everything looks fine on that side. The TTN console also shows “receive uplink”, “forward uplink” and “forward data message to application server”. However, no uplink messages appear in the log on my server.

  • Resetting the API key

I spent hours connecting and disconnecting to see if the problem would resolve itself, and I finally got it working without any change to my script, so I guess the script wasn’t the problem. However, I was testing it in the foreground at the time. When I stopped it and restarted it in the background, the same problem came back: no uplink messages on my server, despite everything else showing it should work.

I’ve also tested it on my desktop to see if my server was the problem, and I saw the same behaviour (both run Debian-based Linux distros).

I waited a bit and did some more testing, and I finally managed to get a connection and data for a couple of hours yesterday, but it silently fails after a few hours (on both my desktop and my server).

I’m in Canada using the nam1.cloud. Am I the only one experiencing this issue? I’m unsure what to look at for debugging purposes, since it’s an intermittent issue.

Thanks!

There is a discussion about this problem on the Slack ops channel.

Personally, I am finding that once the MQTT connection drops (and that happens randomly!) it is not possible to reconnect on the first attempt.
In the end I had to script a loop with a one-second pause on each iteration; eventually, after about 10 to 20 retries, the connection succeeds. I agree: this has only been a problem since the recent upgrade.

Update: having just looked at my log, I can see the last time I dropped connection it needed 55 retries to connect again to the port without reporting “Error: The connection was lost”
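For anyone wanting to try the same workaround in Python, here is a minimal sketch of such a retry loop. It is not the actual script described above; the function name `connect_with_retry`, its parameters, and the default of 60 attempts are made up for illustration. It assumes `connect` is any zero-argument callable that raises `OSError` on failure, for example a lambda wrapping a paho client's `connect()` call.

```python
import time


def connect_with_retry(connect, max_attempts=60, pause=1.0, sleep=time.sleep):
    """Call connect() until it succeeds and return the number of attempts used.

    `connect` is a hypothetical zero-argument callable that raises OSError
    (e.g. ConnectionRefusedError) when the connection cannot be made.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            connect()
            return attempt
        except OSError as exc:
            print(f"attempt {attempt} failed: {exc}")
            # Pause before the next try, but not after the final attempt.
            if attempt < max_attempts:
                sleep(pause)
    raise ConnectionError(f"gave up after {max_attempts} attempts")
```

Capping the number of attempts means the script eventually fails loudly instead of spinning forever, which makes this kind of intermittent problem easier to spot in a log.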

This is reported as being fixed. I did some MQTT testing this morning without being aware that there was an issue and didn’t experience any issues.

I saw a disruption in my MQTT application at 2021-05-23 22:55:30,179 CEST (connection lost).

I am using the paho Java client. The MQTT settings used in my application are:

  • mostly default, this includes ‘clean session’ set to true
  • automatic reconnect is explicitly enabled

Note that automatic reconnect in combination with the ‘clean session’ means that you get re-connected on network troubles, but any subscriptions are lost! So if you use this combination, you won’t see any data coming in anymore even though the client was reconnected.

In my code, the subscription is done in the connectComplete notification callback, see:

With this code, the connection was automatically re-established and the subscription restored about 3 minutes later, at 2021-05-23 22:58:23,016 CEST.
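For the Python paho client the original poster is using, the same idea might look roughly like the sketch below: subscribe inside the `on_connect` callback, so the subscription is restored on every reconnect. This is not the Java code referred to above. The topic filter `v3/+/devices/+/up`, the logger name and the helper `build_client` are assumptions for illustration, and the version check is only there because paho-mqtt 2.x expects a callback API version that 1.x does not have.

```python
import logging

log = logging.getLogger("ttn-mqtt")

# Assumed topic filter for TTN v3 uplinks; adjust to your application.
TOPIC = "v3/+/devices/+/up"


def on_connect(client, userdata, flags, reason_code, properties=None):
    """Runs on every (re)connect, so the subscription is always restored."""
    # With a clean session the broker drops subscriptions on disconnect,
    # so subscribe here rather than once after the first connect().
    client.subscribe(TOPIC, qos=0)
    log.info("connected (%s), subscribed to %s", reason_code, TOPIC)


def on_message(client, userdata, msg):
    log.info("uplink on %s: %d bytes", msg.topic, len(msg.payload))


def build_client(username, password):
    """Hypothetical helper that wires the callbacks into a paho client."""
    # Imported here so the callbacks above can be exercised without paho.
    import paho.mqtt.client as mqtt

    # paho-mqtt 2.x requires a callback API version; 1.x has no such enum.
    if hasattr(mqtt, "CallbackAPIVersion"):
        client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
    else:
        client = mqtt.Client()
    client.username_pw_set(username, password)
    client.on_connect = on_connect
    client.on_message = on_message
    return client
```

After `build_client(...)`, calling `connect()` with your cluster's MQTT host and port and then `loop_forever()` should let paho handle reconnects, with `on_connect` firing again each time.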


Thanks everyone for your replies :)

Indeed, it now looks like it’s working again. I’ll follow @hphillip’s and @bertrik’s advice to make my script more robust in the future.

Have a nice day!
