Console: All of my gateways except one are "gone" - multiple different error conditions

I have a GW from MatchX.io eui-40d63cfffe1f4351 set to TTN that worked until Saturday. I had to change my internet provider, so the subnet changed but I can see that the device got a new IP via DHCP because it still communicates with eux.matchx.io

On TTN console, I see the GW “last seen: 4 days ago”. Can’t test if packets are forwarded right now.

I have a TTIG gateway (the power plug indoor version) eui-58a0cbfffe800cf6 that still works (sends packets) but Console shows “last seen: yesterday”.

Same with another TTIG GW that should receive packets every few minutes: eui-58a0cbfffe8013ca “last seen: 3 hours”

I have a RAK7258 on the same network as the first TTIG gateway above, where I updated the firmware: eui-60c5a8fffe760fb6. Everything seems fine, I can see packets in the RAK but they are NOT forwarded to TTN. “Last seen: yesterday”.

I have a LoriX One running at a customer site that worked as far as I can tell, but “Last seen: yesterday”.

Are we having some sort of meltdown right now? I see nothing in the Forum and there seems to be no Forum category “Console”.

This is a known and regular issue with the V2 console - use forum search and you will see this comes up regularly. In the last 3 or 4 days I have seen wild swings in the number of connected GW’s on the various TTN maps as the issue comes and goes, and you are likely a victim of this. As it is V2 related and we are moving to V3 through this year, it will not be fixed; likely they will start to show again shortly.

That is the key point - you will know if the GWs are still working if you have nodes serviced by them and the data is still appearing in your application(s). If not, you may have other problems, though it is unlikely that previously working GW’s would all suddenly stop at the same time - esp. if they are of varying types.

If it gives any comfort, a late-night check before turning in last night told me >1/3rd of my (>30) GW’s were showing offline in console (data still coming in), most with a consistent 15hrs last seen at that time, with a few others over 3-4 days. Checking this morning I see several already starting to show live again…

Dear Jeff,

thanks for answering - and so fast.

OK I see there is a problem in the NOC, right? Would there be a “status page” where we could see these outages? It’s not really problematic as long as we know what’s up I guess…

Andreas

P.S.: I’ll answer the other post about the RAK specific issues and I drop the other ones for now.

That is the ONLY test that has any meaning. So right now it’s not clear that there’s an actual network problem at all.

OK I see there is a problem in the NOC, right?

No, the idea of having status independent of packets is entirely deprecated.

Either your actual node packets get through, or they do not. That is the only thing that matters.

Ok thanks, but can we get a status information on some website if the NOC / console does not reflect the actual situation for the gateway? I don’t have my own nodes near all my about 8 gateways to do this kind of testing and I still would like to keep the gateways up.

Also, I do not have access to all my gateways from outside because it had been difficult enough to convince people to accept a gateway in the first place… so without the Console telling me, I’m pretty blind…

Andreas

No, you cannot.

V3 may provide such things, but it’s been decided long ago not to try to do so for V2, and with the plan being to turn it off sometime this year there’s no point in putting any further work into it.

I don’t have my own nodes near all my about 8 gateways to do this kind of testing

That’s exactly what people who want this information do. If other people have nodes near your gateway and you do see it handling their traffic, that would be a positive indication, but the only way to know for sure is to have your own canary node, either a reasonable distance away or with a resistor instead of an antenna and/or the power dialed down.

Also, I do not have access to all my gateways from outside

I personally would not put a gateway in the field without it having a remote access solution and independent health telemetry; those are manufacturer/user features, not TTN ones.

As above, but worth repeating, these are NOT going to be resolved in the v2 stack.

OK thanks!

As far as remote access is concerned: I do have remote access for devices and networks I own. However, if a customer pays and allows me to install a TTN gateway for the world to profit from it, it will forever be the decision of the owner of the premises to grant me that or not. It is also true that one should not leave rarely needed services open. A VPN is not an option with those customers.

I install gateways for third parties, I may manage them in the TTN console but if customers don’t allow access to their device, there is nothing I can do.

I see it as part of reality that needs to be accepted.

Now if TTN could deliver a “canary system” even under these circumstances that would be helpful. It would make life in these circumstances easier. I would know if I’ll have to drive around or to beg the customer to restart the GW.

So, by not providing such an option, we have fewer options, that’s all. I will be more reluctant to call the customer because maybe it’s running and all is fine?

Andreas

That’s more up to us - a €10 node transmitting temperature, light levels and number of Gummy Bears + battery level, placed about 20m away from the gateway and transmitting every 15 minutes, should last a couple or three years.

Then you rig up a data flow that checks that the uplink has arrived and that it contains your gateway id in the meta data. If not, raise an alert.
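A minimal sketch of that check, in Python, might look like the following. It assumes the canary uplinks have already been collected somewhere (for example by an integration or MQTT subscriber of your choice) and reduced to dicts with two hypothetical fields, `received_at` and `gateway_ids`; those names, the 45-minute threshold and the function names are illustrative assumptions, not part of any TTN API.

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold: the canary sends every 15 minutes, so tolerate two misses.
MAX_SILENCE = timedelta(minutes=45)


def last_seen_by_gateway(uplinks):
    """Map each gateway ID to the newest time it relayed a canary uplink."""
    last_seen = {}
    for uplink in uplinks:
        received_at = uplink["received_at"]    # hypothetical field name
        for gateway_id in uplink["gateway_ids"]:  # hypothetical field name
            previous = last_seen.get(gateway_id)
            if previous is None or received_at > previous:
                last_seen[gateway_id] = received_at
    return last_seen


def silent_gateways(uplinks, expected_gateways, now, max_silence=MAX_SILENCE):
    """Return the expected gateways that have not relayed a canary recently."""
    last_seen = last_seen_by_gateway(uplinks)
    silent = []
    for gateway_id in expected_gateways:
        seen = last_seen.get(gateway_id)
        if seen is None or now - seen > max_silence:
            silent.append(gateway_id)
    return sorted(silent)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    recent = [
        {"received_at": now - timedelta(minutes=10),
         "gateway_ids": ["eui-58a0cbfffe800cf6"]},
    ]
    expected = ["eui-58a0cbfffe800cf6", "eui-60c5a8fffe760fb6"]
    for gateway_id in silent_gateways(recent, expected, now):
        print(f"ALERT: no canary uplink via {gateway_id}")
```

Run on a schedule, anything this returns is a gateway worth a phone call or a site visit. How the alert is delivered (mail, chat, SMS) is left open here.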

Indeed, if your customers won’t allow someone (either you or them) to have a remote access channel, the boxes in the field are unmaintainable short of a very expensive physical site visit by qualified staff. It only takes one incident for a customer to understand why this is important.

It is also true that one should not leave rarely needed services open.

You don’t leave things open; most connectivity solutions for a gateway wouldn’t support inbound connections anyway. What you do do is either have a reverse tunnel or a VPN.

If the customer says neither is an option, and firewalling the gateway off away from all of their other things still won’t make it one, then that customer has said loudly and clearly that they do not care about reliability.

Regardless, remote access and custom telemetry, while both present in any sane deployment, aren’t the same thing. You could have either without the other… but you really should think twice about field deploying without both.

You could have telemetry this week if you do it.

If you want to make feature requests of TTN, you first have to switch to the version of TTN that will see future development.

We want to roll out as many as possible, in ideal places. That is why we can’t always have “ideal customers”. Very often, they don’t see much use in the gateways at first, as long as they have no application. If we want to leave the market to Sigfox or other LoRa providers, there is no hurry and no need for such customers, you are right.

But do we want to?

I visit Gateways on a regular basis but that might be yearly.

You can trust me that I am able to maintain them. I have 20 years of experience in the field and I have earned money with programming since 1984. You can assume that I know what I am doing, just as you do.

Andreas

Then nothing is stopping you from implementing your own monitoring telemetry today.

And do keep in mind that your gateways pointed at TTN V2 are going to need reconfiguration sometime this year; you can follow industry best practices by giving them remote management; or you can insist on doing that the slow and expensive way with site visits.

It’s up to you to take action to make sure that your deployments meet the needs of yourself and your customers. If you choose not to follow standard best practices, then you own the consequences of that.