[IMST IC880A + RPi] Gateway "hangs" every few weeks

I’ve a software fix for the unintended reset issue ready to be tested. Anyone up for some guinea pig role playing? :slight_smile: Let me know…!

Here.

@kgbvax: cool! It’s pretty easy. You have the ttn-zh setup, right? If so, just SSH into your gateway, and run:

$ cd ~/ic880a-gateway
$ sudo ./install.sh spi-watchdog

The fix is pretty trivial: it is a watchdog service that monitors the sx1301 status. If an unintended reset is detected, it exits with an error, which causes the main ttn-gateway.service to restart (thanks to systemd).
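To make the idea concrete, here is a minimal sketch of such a watchdog loop. It is not the actual spi-watchdog code: `watch`, `read_status`, `expected`, `max_checks` and `interval_s` are hypothetical names, and the way the sx1301 status is really read is not shown in this thread, so it is passed in as a callable.

```python
import time
from typing import Callable, Optional


def watch(read_status: Callable[[], int],
          expected: int,
          max_checks: Optional[int] = None,
          interval_s: float = 0.0) -> int:
    """Poll a status value and return a process exit code.

    Returns 1 as soon as the status differs from `expected` (taken here to
    mean "the concentrator was reset"), or 0 if `max_checks` polls all pass.
    With max_checks=None the loop only ends when a mismatch is seen.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        if read_status() != expected:
            return 1  # non-zero exit code: the service manager sees a failure
        checks += 1
        time.sleep(interval_s)
    return 0
```

A real service would end with something like `sys.exit(watch(read_status, expected))`. How that failure is tied to a restart of ttn-gateway.service depends on the systemd unit configuration, which is not shown here.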

On it… :smile:
By the way, it would help if the gateway-remote-config URL were not hardcoded…

You mean making the repo configurable? Or just being able to enter an arbitrary remote URL as the source of the config?

Either way would be great for us (as we maintain everything in GH).
A URL would be the more generic solution.

Alright! I’ll try to build something more generic. The central github was a very pragmatic and convenient solution. It is open for anyone that wants to put their config files there, of course, but I guess some people would rather keep it more private.

The watchdog is running on my gateway now. I’ll keep an eye on the logs over the next days/weeks.

I am running the watchdog version now too.

However, I must say that problems with the gateway have been very infrequent lately. This means it will take a long time to know whether this watchdog makes the gateway service better.

I’ll see if I can find the restarts in one of the system log files in /var/log.
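One way to do that search, as a rough sketch: filter log lines that mention the gateway service together with a start/stop keyword. The function name `find_restarts`, the `marker` string and the `keywords` are assumptions; the exact wording of the log messages depends on the systemd version and on the unit's description, so they will likely need adapting.

```python
from typing import Iterable, List, Tuple


def find_restarts(lines: Iterable[str],
                  marker: str = "ttn-gateway",
                  keywords: Tuple[str, ...] = ("Started", "Stopped")) -> List[str]:
    """Return the log lines that mention `marker` and any of `keywords`."""
    hits = []
    for line in lines:
        # Both conditions are plain substring checks (assumed message format).
        if marker in line and any(k in line for k in keywords):
            hits.append(line.rstrip("\n"))
    return hits
```

It can be fed an open file object, for example a log file under /var/log (which file holds the service messages depends on the system's logging setup).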

For a week now, I have had my gateway in a metal enclosure, with shorter cables and better power to the concentrator. I haven’t experienced any hangs since then. But I am glad there is something in place in case it happens again!

The watchdog is watching here too now! :smile:

I think the watchdog fix is not working. Can anyone confirm, or has anyone seen anything else? I am going to change the main spi branch to just do an automatic restart every day, so that the problem is ‘mostly’ resolved for most people until we make sure the watchdog service works and we merge it into that branch.

Does anyone strongly oppose that quick fix? :slight_smile:

I was about to post that the fix was not working for me. I had a ‘hang’ and no fix :frowning:
Instead of a daily reboot, I will make a LoRa node equipped with WiFi that sends a packet every 5 minutes. I’ll check that a ping packet was sent (reported through WiFi) and that it arrives at the packet forwarder. If not: restart or reboot the gateway and send a notification by email or to my mobile phone. Maybe I’ll improve it by requesting the node to send a second packet if no packet arrived. If that one does not arrive either -> reboot. And I’ll add some double checks that the rest of the network is working OK, to prevent an unnecessary reboot.
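The decision logic of that plan could be sketched roughly as follows. This is only an illustration of the steps described above: `Action`, `decide` and the three inputs are hypothetical names, and how the packets are sent, detected at the packet forwarder, or how the network is checked is left out entirely.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    OK = "ok"                        # packet arrived, nothing to do
    REQUEST_RETRY = "request_retry"  # ask the node for a second packet
    SKIP = "skip"                    # network problem elsewhere, do not reboot
    REBOOT = "reboot"                # restart/reboot the gateway and notify


def decide(first_arrived: bool,
           retry_arrived: Optional[bool],
           network_ok: bool) -> Action:
    """Pick the next step for one 5-minute heartbeat cycle.

    retry_arrived is None while no second packet has been requested yet.
    """
    if first_arrived:
        return Action.OK
    if retry_arrived is None:
        return Action.REQUEST_RETRY
    if retry_arrived:
        return Action.OK
    if not network_ok:
        # The rest of the network is down, so rebooting the gateway
        # would be unnecessary.
        return Action.SKIP
    return Action.REBOOT
```

Keeping the decision separate from the I/O makes it easy to check every branch before wiring it to a real reboot command and notification.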

Hardware fixes are also an option, but I want to have a software fix first.
@joris: could you send me the logfiles of both services just to check if anything there can help?

@joris @ernestopace @Dagmar_forum @kgbvax: I pushed a new attempt at fixing the hanging problem. If you wanna give it a spin and see if it does work this time around, same process as last time: just SSH into your gateway, and run:

$ cd ~/ic880a-gateway
$ sudo ./install.sh spi-watchdog

Hope it works better this time!
Cheers

Thanks @gonzalocasas! I will give it a try, although my gateway has not experienced a hang in the past 12 days… hopefully it stays that way :slight_smile:

Some pictures of my ic880a / Pi hardware. I placed the connectors close to each other and the antenna as far away as possible from the “noisy” Pi. The interconnect is very short, with 100E series resistors in every signal wire. The scope picture shows the SPI clock, running at 8 MHz. Please note that any wire will act as an antenna: a strong signal, e.g. from a GSM telephone, can transfer energy to those “long” wires and disturb the signals…

Measured on the ic880a connector between the closest GND and pin 14 (SPI-CLK).

As a side note, someone here in Zurich noticed that removing WiFi from the Pi and connecting over Ethernet makes a huge difference: the gateway runs for weeks without issues.

Yes, that’s the (SPI-based!) gateway at ZH-Affoltern. :grinning: B827EBFFFED390B0

We only restarted it once for fun :wink:

I have turned WiFi and Bluetooth off now:

# /etc/modprobe.d/raspi-blacklist.conf
# wifi
blacklist brcmfmac
blacklist brcmutil
# bluetooth
blacklist btbcm
blacklist hci_uart
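After a reboot, one way to confirm the blacklist took effect is to look at the list of loaded kernel modules. A small sketch, assuming the text comes from /proc/modules (where the first field of each line is the module name); the function name `still_loaded` is hypothetical:

```python
from typing import Iterable, List


def still_loaded(proc_modules_text: str,
                 blacklisted: Iterable[str] = ("brcmfmac", "brcmutil",
                                               "btbcm", "hci_uart")) -> List[str]:
    """Return the blacklisted modules that still appear as loaded."""
    # First whitespace-separated field of each non-empty line is the name.
    loaded = {line.split()[0]
              for line in proc_modules_text.splitlines() if line.strip()}
    return [name for name in blacklisted if name in loaded]
```

An empty result suggests the modules are not loaded. Note that a `blacklist` entry only stops automatic loading, so a module could in principle still be pulled in explicitly or as a dependency of another module.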

Thanks for the tip!
