TTIG problems: no location data, wrong date/time, wrong channel and stability issues

Curious: why is that gateway wall time so important to some of you?

Missing data is one thing, but _wrong_ data should be fixed. I don’t like a product that sends wrong data to my backend. I can’t fix it myself; I have to wait until TTN fixes the product they sold. So far I have found no resources explaining what will be fixed and when. That’s the kind of after-sales support I expect when I buy a product.

Last time I checked TTN didn’t sell any gateways. RS Components and several dealers do sell gateways. Your supplier should be the first port of call when something does not work as described.
However, TTN’s name is associated with the product, so some improvements or comments would be useful. @wienkegiezeman?

OK, that’s not so easy for a beginner. The TTIG is not provided by The Things Network (TTN). Got it. It is advertised by “The Things Industries” and offered as a product on “The Things Industries” homepage (distributed via RS Components). Now I’m wondering: if this is not the right forum for the TTIG, where can I file an issue for The Things Industries …

After running my TTIG for some time now without major problems, I think @bei’s wild guess in his post on Sept 3rd is right. Since I corrected the sending interval in my Arduino node, my TTIG has forwarded all my sensors flawlessly. A break in my WiFi connection or a power failure isn’t a problem anymore.

This seems to be the problem: https://github.com/TheThingsNetwork/lorawan-stack/issues/1730 … and I think it will be fixed ASAP.

Bad news: Neither of my two TTIGs has connected for 2 days.
They blink fast green and sometimes green/red.
Disconnecting power for more than 10 minutes doesn’t help.
Debug output says:

[AIO:INFO] cups has no cert configured - running server auth and client auth with token
[AIO:ERRO] [-1] HTTP connect failed: UNKNOWN ERROR CODE (0052)
[AIO:DEBU] [-1] HTTP connection shutdown…
[CUP:ERRO] CUPS connect failed - URI: https://mh.sm.tc:7007

and here the corresponding Debug lines of an earlier successful boot:

[AIO:INFO] cups has no cert configured - running server auth and client auth with token
[CUP:VERB] Retrieving update-info from CUPS https://rjs.sm.tc:9191
[AIO:DEBU] [2] HTTP connection shutdown…
[CUP:INFO] Interaction with CUPS done (no updates) - next regular check in 1d
[TCE:INFO] Starting TC engine

Both URIs (https://mh.sm.tc:7007 and https://rjs.sm.tc:9191) can be reached from my Firefox browser
and show the following message:
{"error":"Invalid or missing input Expecting value: line 1 column 1 (char 0)"}

Both TTIGs are running:

[SYS:DEBU] Station Version 2.0.0(minihub/debug)
[SYS:DEBU] Version Commit e17c5af
[SYS:DEBU] Station Build 2018-12-06 09:30:37
[SYS:DEBU] Firmware Version 2.0.0
[SYS:DEBU] FW Flavor ID semtech0
[SYS:DEBU] Model minihub

This TTIG was reset by pressing the Reset Button in Config Mode.
The other TTIG was left in its state.
The behaviour of both seems to be the same.

Who can help ?
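
When comparing a failing boot with a successful one, it can help to split the debug lines into module, level and message and look at the errors only. A minimal sketch, assuming every line starts with a `[MOD:LEVL]` prefix as in the excerpts above; `parse_line` and `errors_only` are illustrative names, not part of any TTIG tooling:

```python
import re

# Assumed line format, taken from the debug excerpts: "[AIO:ERRO] message"
LINE_RE = re.compile(
    r"^\[(?P<module>[A-Z]{3}):(?P<level>[A-Z]{4})\]\s+(?P<message>.*)$"
)


def parse_line(line):
    """Split one debug line into module, level and message; None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def errors_only(lines):
    """Return the parsed lines whose level is ERRO."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry["level"] == "ERRO"]
```

Feeding it the four lines of the failing boot leaves the two `ERRO` entries (the failed HTTP connect and the failed CUPS connect), which are the lines worth quoting when asking for help.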

Hi @ll, the main reason for not connecting was a misconfigured DNS server at my site. But the general problem when the TTIG is disconnected by a WLAN or power interruption remains the same as before, and has even got worse since the timeout was raised from 60 to 600 seconds. I have to disconnect the TTIG for 10 minutes, then the connection is OK for sometimes half a day, sometimes only half an hour. The thing is absolutely unreliable. I’ve moved my second TTIG to another location, where it connects via a Sophos firewall to a fibre line (an absolutely reliable company network), but the problems are the same … my next try will be a connection via mobile phone tethering, which worked some months ago. It’s an epic fail; I have been trying for months to get this damned thing working.
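
A DNS problem like this one can be checked from a PC on the same network by resolving the CUPS host names from the debug output and trying a plain TCP connect. A rough sketch using only the Python standard library; `host_and_port` and `check_endpoint` are hypothetical helper names, and the PC may use a different DNS server than the one the TTIG receives via DHCP, so a success here does not prove the gateway can resolve the name:

```python
import socket
from urllib.parse import urlsplit


def host_and_port(uri):
    """Extract host and port from a URI such as https://mh.sm.tc:7007."""
    parts = urlsplit(uri)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def check_endpoint(uri, timeout=5.0):
    """Report whether the host resolves and whether the TCP port accepts a connection."""
    host, port = host_and_port(uri)
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return f"{host}: DNS lookup failed ({exc})"
    address = infos[0][4][0]
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return f"{host}: resolved to {address}, TCP port {port} reachable"
    except OSError as exc:
        return f"{host}: resolved to {address}, TCP connect to port {port} failed ({exc})"
```

Calling `check_endpoint("https://mh.sm.tc:7007")` and `check_endpoint("https://rjs.sm.tc:9191")` separates a name resolution failure from a blocked port. It does not test TLS or the CUPS protocol itself.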

As it is running on an ESP processor, I know a thing or two about those.
You may want to check the following for a very stable WiFi:

  • Use a fixed channel for the WiFi used by this gateway.
  • Test with B/G only set in the access point (or use a B/G-only one) to see if the gateway will accept it.
  • Try using a dedicated AP for ESP-based units only.

My indoor gateway has also been as unstable as a drunk on roller skates for the last few days.
For example, we had to power cycle everything here in the house, and so the gateway connected before the cable modem had a connection. This was apparently enough to keep it from connecting to the TTN backend. It is now powered off to see later this evening if it can connect again.

It would be nice to know what is wrong here and if there is something we as users can do to make it reliable again.

Sorry for my bad English … it’s hard work for an old man to write his thoughts down here 8-;))
The problem seems to be the following: An interrupted connection (caused by a WLAN or power outage, for example) doesn’t terminate correctly on the server side and persists … then the TTIG opens a new one and confuses the server (see bei’s wild guess in post #61 about 9m ago). After a timeout of 600 sec (previously 60 sec) the server side kills this dead connection and its ID. So the new connection can’t work correctly because of this missing ID. This is my interpretation of bei’s analysis; I don’t know if it is correct. I have been hoping for 9 months that someone on the server side would correct this issue. In the meantime, for reliable operation I will buy some other gateways; in our user group at the Bürgernetz Dillingen we will buy an LG308 or perhaps a Kerlink … no more TTIG, I’m very frustrated. My first plan was to build a RasPi + 880 GW, but I decided to use a TTIG. I thought: built by TTI, it must be reliable and we could concentrate on node development … hahaha … the damned thing should be able to handle broken connections, but it can’t. And closed source software doesn’t make things easier …
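
The interpretation in the post above can be illustrated with a toy model. This is only a sketch of that guess, not of how the TTN backend is actually implemented; `ToyServer`, its methods and the idea that the gateway ID is dropped together with the dead connection are all assumptions made for illustration:

```python
class ToyServer:
    """Toy model: a dead connection is only cleaned up after idle_timeout seconds."""

    def __init__(self, idle_timeout=600):
        self.idle_timeout = idle_timeout
        self.connections = {}    # conn_id -> (gateway_id, last_seen)
        self.registered = set()  # gateway IDs the server still knows about

    def expire(self, now):
        for conn_id, (gateway_id, last_seen) in list(self.connections.items()):
            if now - last_seen >= self.idle_timeout:
                del self.connections[conn_id]
                # Suspected flaw: the ID is dropped even if a newer connection uses it.
                self.registered.discard(gateway_id)

    def connect(self, conn_id, gateway_id, now):
        self.expire(now)
        self.connections[conn_id] = (gateway_id, now)
        self.registered.add(gateway_id)

    def uplink(self, conn_id, now):
        """Return True if traffic on this connection would still be handled."""
        self.expire(now)
        if conn_id not in self.connections:
            return False
        gateway_id, _ = self.connections[conn_id]
        self.connections[conn_id] = (gateway_id, now)
        return gateway_id in self.registered


# Quick reconnect: the old connection dies silently at t=0, the new one opens at t=30.
server = ToyServer()
server.connect("old", "ttig-1", now=0)
server.connect("new", "ttig-1", now=30)
works_at_first = server.uplink("new", now=100)
works_after_cleanup = server.uplink("new", now=600)

# Waiting longer than the timeout before reconnecting avoids the conflict.
patient = ToyServer()
patient.connect("old", "ttig-1", now=0)
patient.connect("new", "ttig-1", now=700)
works_after_waiting = patient.uplink("new", now=800)
```

In this model the quick reconnect works until the stale connection is cleaned up and then breaks, while waiting out the timeout gives a clean start. That matches the observation that powering the TTIG off for 10 minutes helps, but it says nothing about the real server code.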

Well, with the WiFi tips I gave, you could minimize the connection interruptions.
My indoor gateway did appear to be working fine for a while, but like I said the last few days were really horrible regarding stability.
Or maybe I just notice almost every hiccup now because I’m testing a lot with it at the moment.

The broken connection problem has persisted for months on my side; the only connection that worked for more than one day was tethering via a mobile phone, very funny …
@TD-er: Many thanks for your good tips, maybe they help to make it a little bit more reliable, but the problem itself persists. Using a dedicated AP for the TTIG indeed seemed to solve the problem for some hours, but not forever … if the TTIG didn’t receive and forward any packet in the timeout period, the connection also terminated properly and the problem disappeared.

By forcing the access point to B/G only, you may also increase the stability of the WiFi connection, as it increases the sensitivity of the WiFi radio. (It also allows stable operation in a very noisy environment.)

Full ack, using a dedicated AP on B/G only could be an improvement, thanks … I’ll give it a chance.
The better solution, I think, is the LG308, connected via wire. We will try it Thursday evening …
we will see, which new bugs come with the new gateway 8-;))

Hello Everyone,

As you may have noticed over the last few days, we have been updating some of our backend components in preparation for V3. As a part of this, we’ve deployed a fix to handle unclean disconnection of TCP connections. This should improve the stability of the gateways during ISP resets or power cycles.

We’re testing server-side WebSocket ping, which will prevent the server from disconnecting gateways every x seconds, and we hope to deploy that soon.

Please comment on this thread with some debug information on issues if any.

Thanks,
Krishna


Hello Krishna,

many thanks for your information about the unclean disconnection,
my critical TTIG, which used to fail to reconnect after a power failure or WLAN break, has been online since yesterday 13:00. After an outage today (I think WLAN) at 12:15 it reconnected at 13:00 without my intervention. I think your fix may have been successful. I’ll watch this behaviour over the next days and report here.
2020/02/21 7:30 … by chance I just saw my TTIG blinking red/green … which means it had lost the WLAN connection and was establishing it again. After briefly blinking fast green the LED turned solid green … which means it had established the connection to CUPS and was configured correctly. Looking at the TTN Console there was an outage of about 10 min. Perfect … I think and hope the issue I described above is solved. Many thanks … the timeout of 600 instead of the initial 60 seconds isn’t a big problem for me and my apps.
My problem seems to be, as TD-er mentioned above, WLAN disconnects, which I’ll try to minimize now. I’ll set up an AP of its own for the TTIG, operating B/G only.
My second TTIG, connected to a corporate WLAN that is switched off every night, has been running stably for the last two days … also a big improvement. This device was set up for testing purposes only and will be switched off within the next 2 weeks.


Hello Everyone,

Our latest update seems to have gone well and we got some positive feedback on Forum/Slack. So we’re going to deploy another update where the server supports WebSocket Pings. This will keep TTIG connections alive even when there’s no upstream data. The ping interval is set to 30s.
This will be deployed within the next hour in the US-West cluster and, based on its performance, soon in the EU cluster.

Regards,
Krishna
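
The effect of a 30 s ping on an otherwise quiet connection can be pictured with a small timeline simulation. This is a sketch under assumptions: it supposes that some idle timer closes connections without traffic, the 600 s value is only borrowed from the timeout mentioned earlier in the thread, and `connection_lifetime` is a made-up helper, not server code:

```python
def connection_lifetime(uplink_times, idle_timeout, ping_interval=None, horizon=3600):
    """Return the second at which an idle timer would close the connection,
    or None if it survives until `horizon`. All times are whole seconds."""
    events = set(uplink_times)
    if ping_interval:
        # Pings count as activity, so they reset the idle timer as well.
        events.update(range(ping_interval, horizon + 1, ping_interval))
    last_activity = 0
    for t in sorted(e for e in events if e <= horizon):
        if t - last_activity >= idle_timeout:
            return last_activity + idle_timeout
        last_activity = t
    if horizon - last_activity >= idle_timeout:
        return last_activity + idle_timeout
    return None


# Two uplinks, then silence: closed 600 s after the last uplink.
without_ping = connection_lifetime([100, 200], idle_timeout=600)

# Same traffic with a ping every 30 s: survives the whole hour.
with_ping = connection_lifetime([100, 200], idle_timeout=600, ping_interval=30)
```

With these numbers the connection without pings is closed at second 800, while the pinged one is never idle for more than 30 s and stays open.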


Hey thanks for the confirmation. With the newer release, even this 10 min wait time will not be necessary. Let’s see how the update goes.


Hello Krishna,

Within the last 6 days there was only one outage on my “critical TTIG” (maybe it lost the WLAN connection), lasting some hours. Your latest update seems to be a big improvement. Is the update in the US-West cluster OK? When will it be rolled out in the EU cluster?

regards
Franz


Hey Franz,

The updates are now rolled out to the EU cluster as well.

Regards,
Krishna


Hello Krishna,

my TTIGs work like a charm … I’m really impressed
many thanks for this solution 8-;))

cu
Franz
