TTN packet broker outage

The Packet Broker went out again yesterday. That’s 3 times in 10 days.

It seems to be on a schedule but when I check the overall systems status record it says All systems Operational. Does anyone know why this keeps happening?

Check it here. Go to the bottom of the page to see the outage.

https://status.thethings.network/

Yes, I also opened a topic yesterday ([End devices must be reset after problems with v2/v3 broker]) which is related to your question.
I was also wondering why nobody is complaining about this, and why the TTN status page reports that everything is working smoothly while the broker has been down for such a long time:
ttn-outage-4

Who is this mysterious “nobody” that should do the “complaining”?

In this case, “nobody” was called Nick and he raised a ticket at 01:08 - you are very welcome.

thank you Nick

Any response to the ticket, Nick?

Yeah, it got fixed about 3am

There is no SLA on TTN so that’s the most we can expect.

Hi Nick,

Does Packet Broker operate independently of TTN?

Do you know if there is a subscription model that includes an SLA?

Sort of but not really - TTI do own & run it after all. It’s not just for TTN.

Yes, a paid for TTI instance comes with an SLA - bearing in mind you’d still have gateways & devices on v2 that would be vulnerable to any foibles of TTN v2 but at least the PB would be covered.

You would still need to move your estate from v2 by end of Nov.

Thanks Nick, I don’t have any sensors on V2. Is it possible that the gateways I am hitting need to be upgraded?

Dunno, if you look at your uplink data it makes it clear if it’s a v3 gateway or arrived via PB, so only you can tell. If they are on PB via v2, then yes, they need upgrading. Good luck alerting & supporting the gateway owners with that.
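For anyone wanting to check this in bulk rather than by eye, here is a minimal sketch of how the uplink metadata could be classified. It assumes the v3 uplink JSON carries an `rx_metadata` list under `uplink_message`, and that entries arriving via Packet Broker carry a `packet_broker` object and/or a gateway ID of `packetbroker`. Those field names are my assumption from what I have seen in uplink payloads, so verify them against your own data; the sample message and the function name are made up for illustration.

```python
def classify_uplink(uplink: dict) -> list:
    """Return one label per receiving gateway in a v3 uplink message.

    ASSUMPTION: entries that arrived via Packet Broker carry a
    "packet_broker" object and/or a gateway_id of "packetbroker".
    Check this against your own uplink data before relying on it.
    """
    labels = []
    rx_metadata = uplink.get("uplink_message", {}).get("rx_metadata", [])
    for rx in rx_metadata:
        gateway_id = rx.get("gateway_ids", {}).get("gateway_id", "")
        if "packet_broker" in rx or gateway_id == "packetbroker":
            labels.append("packet-broker")
        else:
            labels.append("v3-gateway")
    return labels


# Hypothetical sample message, trimmed to the fields used above.
sample = {
    "uplink_message": {
        "rx_metadata": [
            {"gateway_ids": {"gateway_id": "packetbroker"}, "packet_broker": {}},
            {"gateway_ids": {"gateway_id": "my-v3-gateway"}},
        ]
    }
}

print(classify_uplink(sample))
```

If every entry for a device comes back as `packet-broker`, the device is relying entirely on gateways reached via PB, which is the situation that was exposed during the outage.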

Thank you very much for your time on this issue Nick.
I agree that there is no SLA, but why does the status page say that “everything is operational” while a few inches lower on the same page you can see that Packet Broker has already been failing to forward v3 packets for many hours?
It is run by TTN, so somebody must know that PB is down for forwarding.
Maybe it is scheduled downtime. As far as I remember, the duration in all 3 cases I noticed was about 6 hours, and there were still ~10 packets/sec forwarded (while normally this is around 60 packets/sec).

anton
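To put numbers on that drop, here is a rough sketch of the kind of check a monitoring script could run. The ~10 and ~60 packets/sec figures are the ones quoted above; the 50% threshold and the function names are assumptions for illustration only, not anything TTI actually uses.

```python
def drop_percent(current_rate: float, baseline_rate: float) -> float:
    """Percentage by which the current rate is below the baseline."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (1.0 - current_rate / baseline_rate) * 100.0


def is_degraded(current_rate: float, baseline_rate: float,
                threshold: float = 0.5) -> bool:
    """Flag forwarding as degraded when the rate falls below a fraction
    of the baseline. ASSUMPTION: 0.5 is an arbitrary illustrative cut-off."""
    return current_rate < baseline_rate * threshold


# Figures from the post: ~10 packets/sec during the outage, ~60 normally.
print(round(drop_percent(10, 60)), is_degraded(10, 60))
```

With those figures the forwarding rate is roughly 83% below normal, so even a fairly relaxed threshold would have flagged it.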

That’s not a question I can answer on behalf of TTI, but I’d suggest it’s all about how many other things are going on at any one moment in time - if you are fixing things for a paying customer and don’t have the time to post a message for something that is relatively self-evident, what would you do?

Now if we all paid a small subscription, say a third of forum members paid €10 a year, that could fund someone to look after the shop for us!

It wasn’t. Planned maintenance and upgrades follow a well-structured course, including announcements. And, as you may have noticed, TTI do these on a Monday morning.

Nope - we are TTN! (The community, the volunteers, the users, the stack provider and the core TTI team support staff.) It’s ultimately owned and run by TTI, but I suspect that, as they are promoting it across other LoRaWAN service providers and private network users (as demonstrated with the Chirpstack integrations/interop support), they are trying to treat it as a 3rd party service wrt TTN. Not official, just my observation/suspicion.

And indeed PB has its own status page, where this incident is called out! Bookmark https://status.packetbroker.net/

It’s ultimately owned and run by TTI, but I suspect that, as they are promoting it across other LoRaWAN service providers and private network users (as demonstrated with the Chirpstack integrations/interop support), they are trying to treat it as a 3rd party service wrt TTN. Not official, just my observation/suspicion.

This is indeed correct. Packet Broker is a separate entity built and operated independently from The Things Stack (community/commercial) clusters, though there is an overlap of the team members who work on them.

The previous incident wasn’t logged since it occurred at 1 AM CEST and our engineer(s) focused solely on fixing the problem.

We have discussed this item internally and will do a better job of posting future Packet Broker incidents at https://status.packetbroker.net/ as well. We also have fixes in place to prevent failures or detect them earlier, and we hope that there won’t be huge outages in the future.

Hi All,
Was the broker working yesterday? I tried migrating a couple of devices and testing a new V3 device deployment but using an existing V2 gateway not yet migrated.
I could see from my Balena console that the join requests were being carried by my V2 gateway, and that join accepts were being sent to the end device, but I couldn’t see any connection on the V3 console.

For my own curiosity I migrated a spare gateway to V3 and immediately the V3 end device appeared. There was no traffic on the V2 gateway but curiously some V2 data appeared to be carried by my V3 gateway if the metadata was correct?

I am curious to know if there is any logic to any of this?

Thanks
Garry

Around 5/6pm UK time this evening I happened to notice that TTNMapper - which I think shares some feeds and ultimately data with PacketBroker - was showing a significant number of GWs in the area as offline, with last contact in roughly that time window. This evening @Johan_Scheepers contacted me to ask if I was seeing PB unresponsive at mapper.packetbroker.net. I checked a few GW IDs for details and got no response…

I have just checked the status page at https://status.thethings.network/ and can see that there has been a steady decline in traffic since then, so there is clearly a problem.

If anyone is missing data or not seeing the levels of traffic they expect, I guess this is the most likely reason. Hopefully someone from TTI Core will kick the servers shortly and all will be restored?! :pray:

Messages posted to Slack #Support. UPDATE: and now also on #Ops :wink:

However, the TTNMapper feeds still seem not to have recovered: Community GWs (red) are still showing offline, and peering networks through PacketBroker (purple) are still showing offline, with last seen times in yesterday’s drop-off window.

Paging @jpmeijers !!! :wink:

Don’t quarry too hard @Johan_Scheepers :pick: :gem:

The TTN Mapper database server ran out of disk space. For now I increased the storage size, but that does mean I personally need to fork out more money to pay for this every month.

Please consider supporting TTN Mapper on a monthly basis via Patreon to help pay for the hosting costs: https://www.patreon.com/ttnmapper
