TTIG does not support downlinks when not frequently receiving uplinks?

Hey @kersing/ @arjanvanb,
It is indeed the server that drops the connection. The TTIG currently runs version 2.0.0 of the Basic Station packet forwarder, which does not implement WS Pong. With the upgrade to 2.0.4, which we are planning, we’ll be able to use this to avoid terminating active gateway connections.

Quick note: We increased the keepalive time from 60s to 600s. So your gateways will now only reconnect every 10 minutes if there’s no traffic through them.
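
To illustrate the keepalive behaviour described above, here is a minimal Python sketch of a server-side idle timer. It is not the actual server code: the `IdleTracker` class and its methods are hypothetical, and it assumes that any traffic (an uplink, or a WS Pong once the gateway supports it) resets the timer.

```python
from dataclasses import dataclass

KEEPALIVE_OLD_S = 60    # previous keepalive window mentioned above
KEEPALIVE_NEW_S = 600   # current keepalive window mentioned above


@dataclass
class IdleTracker:
    """Hypothetical per-connection idle timer (illustration only)."""
    keepalive_s: float
    last_seen: float = 0.0  # time of the last traffic, in seconds

    def on_traffic(self, now: float) -> None:
        # Assumption: any uplink (or Pong) counts as activity and resets the timer.
        self.last_seen = now

    def should_drop(self, now: float) -> bool:
        # The connection is dropped once it has been idle longer than the window.
        return now - self.last_seen > self.keepalive_s


if __name__ == "__main__":
    old = IdleTracker(KEEPALIVE_OLD_S)
    new = IdleTracker(KEEPALIVE_NEW_S)
    # A gateway that has been silent for two minutes:
    print(old.should_drop(120.0), new.should_drop(120.0))
```

Under these assumptions, a gateway that is silent for two minutes would have been dropped with the old window but stays connected with the new one.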

I guess you’re keeping an eye on Basic Station Integration: Race condition in re-connection handling causes permanent failure of uplink forwarding · Issue #1730 · TheThingsNetwork/lorawan-stack · GitHub but just to be sure: this introduced new problems with internet providers that reset the connection every 24 hours (and maybe also assign a new IP address lease). From Slack:

@JackGruber Today at 7:47 AM

Every morning the same problem: the TTIG gateway does not transmit data anymore.

@strenker Today at 10:34 AM

That’s exactly what I have seen as well. My ISP reset the internet connection (which always includes new IP addresses) in the night from 2020-01-29 to 2020-01-30, and my TTIG started having trouble. Before that date, ISP resets were never a problem.

Such an ISP reset of the internet line normally takes approximately 90 seconds to recover.

Hey yeah indeed.
We (TTI) should improve how we communicate these changes.

We’ve deployed a potential fix here: TTIG Problems, - no location data, wrong date/time, wrong channel and stability issues
We’ve currently deployed the update only to the US West cluster. EU soon to follow.

If you still have issues, please post your observations in this thread (not in the one I posted).

Thanks,
Krishna

Krishna, apparently this does not work for all TTIGs, or some regression has occurred since February? A few instances that seem related:

…for which on Slack some more details are given:

@Martin_Kautenburger 2020-05-07 11:31 PM

Hi, I own 3 TTIGs, and the devices disconnect a few times a day. A few months ago the TTIGs worked very well, but now they are not usable. I see on ttnmapper that there are many TTIGs with the same problem.

One more:

So thanks for posting these here @arjanvanb.

I haven’t made any updates to the bridge in a while so there aren’t any regressions.

The received status messages are a strong indication of how often gateways reconnect and, based on the graph below, the reconnects are non-periodic and non-synchronised (as I expect).

Received Status message

[Excerpt from our metrics for the last 24 hours]
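
For anyone who wants to check the "non-periodic" observation for their own gateway, one possible approach is to look at the gaps between the times it reconnected. The Python sketch below is only an assumption about how such a check could look, not how the metrics above were produced: `reconnect_gaps` and `looks_periodic` are hypothetical helpers, the timestamps are assumed to be in seconds, and the 10% tolerance is arbitrary.

```python
from statistics import mean, pstdev


def reconnect_gaps(timestamps):
    """Return the gaps (in seconds) between consecutive reconnect timestamps."""
    ts = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ts, ts[1:])]


def looks_periodic(timestamps, tolerance=0.1):
    """Rough heuristic: gaps are 'periodic' if their spread is small
    relative to their mean (coefficient of variation <= tolerance)."""
    gaps = reconnect_gaps(timestamps)
    if len(gaps) < 2:
        return False  # not enough data to judge
    avg = mean(gaps)
    if avg == 0:
        return False
    return pstdev(gaps) / avg <= tolerance


if __name__ == "__main__":
    # Reconnects exactly every 600 s versus irregular reconnects.
    print(looks_periodic([0, 600, 1200, 1800]))
    print(looks_periodic([0, 50, 900, 1000]))
```

Evenly spaced reconnects (for example, one every 600 seconds) would point at a timeout, while irregular gaps would fit better with flaky connectivity or client-side disconnects.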

The disconnections could be due to:

  1. Client-side disconnects (need to check with the gateway guys)
  2. Flaky internet connectivity (but I don’t think this is the case for all)

I’ll do some tests locally and report back.

In the meantime, a disconnection will not affect your ability to forward traffic when the gateways reconnect.
In our previous update, we re-wrote the connection logic with special attention to unclean client-side disconnects.

That doesn’t seem to match the following?

I still see people referring to not getting downlink due to inactivity:

It’s unclear to me if that should have been fixed.

I tried testing this locally but I’m unable to reproduce it. Even with the shitty WiFi in my apartment, the TTIG does not disconnect periodically or, when it does, it comes back online and can route packets immediately.

@arjanvanb: What doesn’t match? The reconnection logic and server side WS pings work together and are not opposite features.

My debugging is not being helped by the fact that a lot of people are reporting (here and on Slack) that the gateways are disconnected when it’s only the NOC/Console not displaying the connection.

Can someone who’s facing multiple disconnections (i.e., the gateway LED rapidly blinking often before stabilising) please send me your EUI so I can trace it on the server?