TTIG Problems - no location data, wrong date/time, wrong channel and stability issues

KrishnaIyerEaswaran2 · February 19, 2020, 1:59pm

Hello Everyone,

As you may have noticed over the last few days, we have been updating some of our backend components in preparation for V3. As part of this, we’ve deployed a fix to handle unclean disconnection of TCP connections. This should improve the stability of the gateways during ISP resets or power cycles.

We’re testing a server-side WebSocket ping, which will prevent the server from disconnecting gateways every x seconds, and we hope to deploy that soon.

If you still see issues, please comment on this thread with some debug information.

Thanks,
Krishna

Franz_Refle · February 20, 2020, 4:24pm

Hello Krishna,

Many thanks for your information about the unclean disconnection.
My critical TTIG, which used to fail to reconnect after a power failure or WLAN break, has been online since yesterday 13:00. After an outage today (I think WLAN) at 12:15 it reconnected at 13:00 without my intervention. I think your fix could have been successful. I’ll watch this behaviour over the next days and report here.
2020/02/21 7:30 … by chance I just saw my TTIG blinking red/green … which means it had lost its WLAN connection and was establishing it again. After briefly blinking fast green the LED went solid green … which means it had connected to CUPS and was configured correctly. Looking at the TTN Console there was an outage of about 10 min. Perfect … I think and hope the issue I described above is solved. Many thanks … the timeout of 600 instead of the initial 60 seconds isn’t a big problem for me and my apps.
My problem seems to be, as mentioned above by TD-er, WLAN disconnects, which I’ll now try to minimize. I’ll set up a dedicated AP for the TTIG, operating in b/g mode only.
My second TTIG, connected to a corporate WLAN that is switched off every night, has been running stably for the last two days … also a big improvement. This device was set up for testing purposes only and will be switched off within the next 2 weeks.

KrishnaIyerEaswaran2 · February 21, 2020, 9:29am

Hello Everyone,

Our latest update seems to have gone well and we got some positive feedback on Forum/Slack. So we’re going to deploy another update where the server supports WebSocket Pings. This will keep TTIG connections alive even when there’s no upstream data. The ping interval is set to 30s.
This will be deployed within the next hour in the US-West cluster and, based on its performance, soon in the EU cluster.
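For readers unfamiliar with WebSocket pings, here is a minimal sketch of what a server-side keepalive involves, based only on RFC 6455 and not on the actual TTN server code. A ping is a control frame with opcode 0x9, and frames sent from server to client are unmasked. The names `build_ping_frame` and `ping_due` are hypothetical; only the 30 s interval comes from the post above.

```python
PING_INTERVAL_S = 30  # interval quoted in the post; everything else is illustrative


def build_ping_frame(payload: bytes = b"") -> bytes:
    """Build an unmasked (server-to-client) WebSocket ping frame per RFC 6455."""
    if len(payload) > 125:
        # Control frames may carry at most 125 bytes of payload.
        raise ValueError("control frame payload must be at most 125 bytes")
    first_byte = 0x80 | 0x9  # FIN bit set, opcode 0x9 = ping
    # Second byte: mask bit clear, 7-bit payload length.
    return bytes([first_byte, len(payload)]) + payload


def ping_due(last_ping_s: float, now_s: float,
             interval_s: float = PING_INTERVAL_S) -> bool:
    """Return True once at least interval_s seconds have passed since the last ping."""
    return now_s - last_ping_s >= interval_s


if __name__ == "__main__":
    print(build_ping_frame().hex())   # 8900
    print(ping_due(0.0, 30.0))        # True
```

The gateway answers each ping with a pong, so the connection carries traffic at least every 30 s even when no uplinks arrive.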

Regards,
Krishna

KrishnaIyerEaswaran2 · February 21, 2020, 9:53am

Hey thanks for the confirmation. With the newer release, even this 10 min wait time will not be necessary. Let’s see how the update goes.

Franz_Refle · February 26, 2020, 6:47am

Hello Krishna,

Within the last 6 days there was only one outage on my “critical TTIG” (maybe it lost its WLAN connection), lasting some hours. Your latest update seems to be a big improvement. Is the update in the US-West cluster OK? When will it be deployed in the EU cluster?

regards
Franz

KrishnaIyerEaswaran2 · February 26, 2020, 10:36am

Hey Franz,

The updates are now rolled out to the EU cluster as well.

Regards,
Krishna

Franz_Refle · February 27, 2020, 8:07am

Hello Krishna,

my TTIGs work like a charm … I’m really impressed
many thanks for this solution 8-;))

cu
Franz

JeroenKl · February 28, 2020, 8:33am

Yes @KrishnaIyerEaswaran2, it works fine now.
But still missing the TTNmapper integration (read: not shown on the map with use of the app)
Yes, location is set correctly and visible for the public

BR,

Jeroen

Verkehrsrot · February 28, 2020, 6:29pm

After I noticed the new firmware rollout post here, I reinstalled my TTIG today.
Now it delivers time of day in metadata, great!
But: it’s about 1 second late, compared to my MatchX1701 gateway which uses GPS time.

Or is the MatchX1701 wrong?

Need to analyze this further. Will keep you posted here.

"gateways": [
    {
      "gtw_id": "eui-40d63cfffeMATCHX",
      "timestamp": 64530420,
      "time": "2020-02-28T18:20:49.389245Z",
      "channel": 5,
      "rssi": -69,
      "snr": 8,
      "latitude": 52.53737,
      "longitude": 13.41779,
      "altitude": 60
    },
    {
      "gtw_id": "eui-58a0cbfffeXXTTIG",
      "timestamp": 568394828,
      "time": "2020-02-28T18:20:48.352178096Z",
      "channel": 0,
      "rssi": -78,
      "snr": 8.25
    }
  ]
}
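To put a number on the "about 1 second", the two `time` fields above can be subtracted. This is a small sketch, not part of any TTN tooling. The helper `parse_rfc3339_ns` is a hypothetical name, and it works in integer nanoseconds because the TTIG value has nine fractional digits, more than Python's `datetime` can hold.

```python
from datetime import datetime, timezone


def parse_rfc3339_ns(text: str) -> int:
    """Return nanoseconds since the Unix epoch for a UTC timestamp ending in 'Z'."""
    if not text.endswith("Z"):
        raise ValueError("expected a UTC timestamp ending in 'Z'")
    whole, _, frac = text[:-1].partition(".")
    seconds = datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    # Pad or truncate the fractional part to exactly nine digits (nanoseconds).
    nanos = int(frac.ljust(9, "0")[:9]) if frac else 0
    return int(seconds.timestamp()) * 1_000_000_000 + nanos


# The two "time" values from the metadata above.
matchx_ns = parse_rfc3339_ns("2020-02-28T18:20:49.389245Z")
ttig_ns = parse_rfc3339_ns("2020-02-28T18:20:48.352178096Z")

delta_ns = matchx_ns - ttig_ns
print(delta_ns)  # 1037066904 ns, i.e. the MatchX stamp is about 1.037 s later
```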

LoRaTracker · February 28, 2020, 11:39pm

‘GPS Time’ is normally understood to be the time counted from the GPS epoch at the beginning of 1980. Since then 18 leap seconds have been added to UTC, so there is now an 18 second difference between the time the GPS network uses as a base and UTC time.

I suspect what you mean is ‘time obtained from a GPS’. The time taken from the GPS (and then used by the gateway) can be different from UTC time.

It’s often assumed that GPSs put out UTC time; this is not always the case.
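As a rough illustration of that 18 second offset, the sketch below converts between seconds since the GPS epoch (6 January 1980) and UTC. The function names are hypothetical, and the fixed offset of 18 is an assumption that only holds for times after the most recent leap second (end of 2016); earlier dates would need a leap second table.

```python
from datetime import datetime, timedelta, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
GPS_UTC_OFFSET_S = 18  # GPS time is ahead of UTC by this much (valid since 2017)


def gps_seconds_to_utc(gps_seconds: float,
                       leap_seconds: int = GPS_UTC_OFFSET_S) -> datetime:
    """Convert seconds since the GPS epoch to a UTC datetime."""
    return GPS_EPOCH + timedelta(seconds=gps_seconds - leap_seconds)


def utc_to_gps_seconds(utc: datetime,
                       leap_seconds: int = GPS_UTC_OFFSET_S) -> float:
    """Convert a timezone-aware UTC datetime to seconds since the GPS epoch."""
    return (utc - GPS_EPOCH).total_seconds() + leap_seconds


if __name__ == "__main__":
    stamp = datetime(2020, 2, 28, 18, 20, 48, tzinfo=timezone.utc)
    gps_s = utc_to_gps_seconds(stamp)
    # Ignoring the offset would give a wall-clock time that is 18 s ahead of UTC.
    print(GPS_EPOCH + timedelta(seconds=gps_s) - gps_seconds_to_utc(gps_s))
```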

Verkehrsrot · February 29, 2020, 9:54am

I know all that, but it does not explain why we see a difference of around 1 second here.

The MatchX gateway uses time obtained from GPS to feed the PPS input line of Semtech’s concentrator chip. The packet concentrator code (Semtech) draws absolute time from NMEA sentences of the same GPS signal. To me it looks like this code has a bug: if an NMEA record arrives near the top of the second, it is interpreted with the next PPS pulse instead of the previous one. This makes the absolute time advance by 1 second.

If this is the root cause here, the “new” TTIG time will probably be correct.

arjanvanb · February 29, 2020, 10:16am

Just for the sake of completeness:

I don’t think it was new firmware; these were changes in the backend instead (like maybe the bridge sitting between the TTIG and V2, or between V3 and V2). Given that, even though the TTIG indeed knows the time, maybe the time in the metadata is added at some later stage. I guess that’s unlikely though, if only because with large latency, even with perfect clocks, that would yield a time that is ahead rather than late.

If one wants to know, maybe one can compare UART logging to the values in the meta data.

LoRaTracker · February 29, 2020, 10:30am

That depends, but I don’t know the software.

If at power-up/startup the gateway waits for the GPS (which has also just powered up) to get a fix or the time, then updates its internals with that time and uses the PPS to increment it, then the time the gateway uses could indeed be out by a second or more.

If however the extraction of the time from the GPS NMEA data is a continuous process (rather than just at startup), then after at most 12.5 minutes of good reception the GPS ought to be putting out a time that is the equivalent of UTC.

Verkehrsrot · February 29, 2020, 10:45am

It’s always that tricky +/- 1 second problem, which may be a result of unsophisticated code for PPS and time handling.

kersing · February 29, 2020, 10:56am

This applies to all packet forwarders based on the Semtech code.

LoRaTracker · February 29, 2020, 11:12am

Indeed so, but all too often coders are not aware that GPSs do not always put out UTC.

Verkehrsrot · February 29, 2020, 11:40am

But at least the coders at Semtech/Cycleo were aware of that:

And each time an NAV-TIMEGPS UBX message has been received:

get the concentrator timestamp (using lgw_get_trigcnt, mutex needed to protect access to the concentrator)

get the GPS time contained in the UBX message (using lgw_gps_get)

call the lgw_gps_sync function (use mutex to protect the time reference that should be a global shared variable).

Then, in other threads, you can simply use that continuously adjusted time reference to convert internal timestamps to GPS time (using lgw_cnt2gps) or the other way around (using lgw_gps2cnt). Internal concentrator timestamps can also be converted to/from UTC time using the lgw_cnt2utc/lgw_utc2cnt functions.
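The idea behind such a time reference can be sketched as follows. This is not the libloragw implementation: the names `TimeReference`, `cnt_to_utc_ns` and `utc_ns_to_cnt` are hypothetical, the counter is assumed to be a 32-bit microsecond counter that wraps roughly every 71.6 minutes, and the crystal drift correction the real library applies is left out.

```python
from dataclasses import dataclass

COUNTER_MODULUS = 1 << 32  # assumption: 32-bit counter ticking once per microsecond


@dataclass
class TimeReference:
    count_us: int  # concentrator counter captured at the last sync
    utc_ns: int    # UTC at that same instant, in ns since the Unix epoch


def cnt_to_utc_ns(ref: TimeReference, count_us: int) -> int:
    """Convert a counter value to UTC, assuming it was taken after the sync point."""
    # The modulo handles a counter that wrapped around since the sync.
    elapsed_us = (count_us - ref.count_us) % COUNTER_MODULUS
    return ref.utc_ns + elapsed_us * 1000


def utc_ns_to_cnt(ref: TimeReference, utc_ns: int) -> int:
    """Convert a UTC instant at or after the sync point back to a counter value."""
    elapsed_us = (utc_ns - ref.utc_ns) // 1000
    return (ref.count_us + elapsed_us) % COUNTER_MODULUS


if __name__ == "__main__":
    # Sync taken shortly before the counter wraps around.
    ref = TimeReference(count_us=4_294_967_000, utc_ns=0)
    print(cnt_to_utc_ns(ref, 704))        # 1000000 (1 ms after the sync)
    print(utc_ns_to_cnt(ref, 1_000_000))  # 704
```

Because the reference pairs a counter value with an absolute time, pairing the wrong second with a sync point would shift every converted timestamp by the same whole second.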

Verkehrsrot · February 29, 2020, 11:52am

The TTIG seems to pull system time in some unknown way over port 443 from the TTN network server:

1970-01-01 00:00:14.572 [TCE:INFO] Connecting to INFOS: wss://lns.eu.thethings.network:443
1970-01-01 00:00:15.325 [TCE:INFO] Infos: 58a0:cbff:feXX:XXX muxs-::0 wss://lns.eu.thethings.network:443/traffic/eui-58A0CBFFFEXXXXXX
1970-01-01 00:00:15.461 [TCE:VERB] Connecting to MUXS…
1970-01-01 00:00:16.202 [TCE:VERB] Connected to MUXS.
2020-02-29 11:48:03.166 [RAL:WARN] Ignoring unsupported/unknown field: antenna_gain
2020-02-29 11:48:03.173 [RAL:INFO] Lora gateway library version: Version: 4.1.1;

Verkehrsrot · February 29, 2020, 12:09pm

Did some simple real time analysis, see enclosed photo.
The time of the TTIG seems pretty precise, to within a few milliseconds.
[Image: Screenshot_20200229-130536]

arjanvanb · February 29, 2020, 12:17pm

So, does the TTIG UART log for an uplink show (almost) the same time as the meta data?

(Again, I’d also assume that the TTIG includes that time in its Basic Station uplink message, though I now see that this is actually not too clear from the specification. It might be easy to validate by comparing the UART log to the metadata, if you want to know.)