Gateway connects/disconnects all the time after update to V3

Yesterday I migrated my gateway (RAK7258) to V3 (in Basic Station Mode).
Everything worked OK, and the gateway came online after some trial and error with the settings for authentication mode, trust and token. The first new packets from devices came in shortly afterwards…

…but today it’s another story. After a power cycle of the gateway no more packets are sent and the gateway keeps connecting/disconnecting all the time with the following debug output:

Wed Dec  8 17:54:49 2021 user.info basicstation[19057]: [TCE:INFO] INFOS reconnect backoff 10s (retry 1)
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [any:INFO] /var/etc/station/tc.trust: 
cert. version     : 3
serial number     : 82:10:CF:B0:D2:40:E3:59:44:63:E0:BB:63:82:8B:00
issuer name       : C=US, O=Internet Security Research Group, CN=ISRG Root X1
subject name      : C=US, O=Internet Security Research Group, CN=ISRG Root X1
issued  on        : 2015-06-04 11:04:38
expires on        : 2035-06-04 11:04:38
signed using      : RSA with SHA-256
RSA key size      : 4096 bits
basic constraints : CA=
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [AIO:INFO] tc has no cert configured - running server auth and client auth with token
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [TCE:INFO] Connecting to INFOS: wss://eu1.cloud.thethings.network:8887
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [TCE:INFO] Infos: 60c5:a8ff:fe76:64aa muxs-::0 wss://eu1.cloud.thethings.network:8887/traffic/eui-60C5A8FFFE7664AA
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [any:INFO] /var/etc/station/tc.trust: 
cert. version     : 3
serial number     : 82:10:CF:B0:D2:40:E3:59:44:63:E0:BB:63:82:8B:00
issuer name       : C=US, O=Internet Security Research Group, CN=ISRG Root X1
subject name      : C=US, O=Internet Security Research Group, CN=ISRG Root X1
issued  on        : 2015-06-04 11:04:38
expires on        : 2035-06-04 11:04:38
signed using      : RSA with SHA-256
RSA key size      : 4096 bits
basic constraints : CA=
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [AIO:INFO] tc has no cert configured - running server auth and client auth with token
Wed Dec  8 17:54:59 2021 user.warn basicstation[19057]: [RAL:WARN] Ignoring unsupported/unknown field: antenna_gain
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [SYS:INFO] Process /etc/station/radio_init.sh (pid=20133) completed
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [RAL:INFO] Lora gateway library version: Version: 5.0.1;
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [RAL:INFO] SX1301 rxrfchain 0: enable=1 freq=867.5MHz rssi_offset=-158.000000 type=2 tx_enable=1 tx_notch_freq=0
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [RAL:INFO] SX1301 rxrfchain 1: enable=1 freq=868.5MHz rssi_offset=-158.000000 type=2 tx_enable=0 tx_notch_freq=0
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [RAL:INFO] SX1301 ifchain  0: enable=1 rf_chain=1 freq=-400000 bandwidth=0 datarate=0 sync_word=0/0
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [RAL:INFO] SX1301 ifchain  1: enable=1 rf_chain=1 freq=-200000 bandwidth=0 datarate=0 sync_word=0/0
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [RAL:INFO] SX1301 ifchain  2: enable=1 rf_chain=1 freq=0 bandwidth=0 datarate=0 sync_word=0/0
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [RAL:INFO] SX1301 ifchain  3: enable=1 rf_chain=0 freq=-400000 bandwidth=0 datarate=0 sync_word=0/0
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [RAL:INFO] SX1301 ifchain  4: enable=1 rf_chain=0 freq=-200000 bandwidth=0 datarate=0 sync_word=0/0
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [RAL:INFO] SX1301 ifchain  5: enable=1 rf_chain=0 freq=0 bandwidth=0 datarate=0 sync_word=0/0
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [RAL:INFO] SX1301 ifchain  6: enable=1 rf_chain=0 freq=200000 bandwidth=0 datarate=0 sync_word=0/0
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [RAL:INFO] SX1301 ifchain  7: enable=1 rf_chain=0 freq=400000 bandwidth=0 datarate=0 sync_word=0/0
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [RAL:INFO] SX1301 ifchain  8: enable=1 rf_chain=1 freq=-200000 bandwidth=2 datarate=2 sync_word=0/0
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [RAL:INFO] SX1301 ifchain  9: enable=1 rf_chain=1 freq=300000 bandwidth=3 datarate=50000 sync_word=0/0
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [RAL:INFO] SX130x LBT not enabled
Wed Dec  8 17:54:59 2021 user.err basicstation[19057]: [RAL:ERRO] Concentrator start failed: lgw_start
Wed Dec  8 17:54:59 2021 user.err basicstation[19057]: [RAL:ERRO] ral_config failed with status 0x08
Wed Dec  8 17:54:59 2021 user.err basicstation[19057]: [any:ERRO] Closing connection to muxs - error in s2e_onMsg
Wed Dec  8 17:54:59 2021 user.info basicstation[19057]: [TCE:INFO] INFOS reconnect backoff 10s (retry 1)

and the TTN live data console shows this:

.
.
19:14:34 Connect gateway
19:14:23 Disconnect gateway
19:14:23 Receive gateway status  (Versions firmware""package""platform"linux - Firmware - Protocol 2"station"2.0.4-9-g3d5c686(linux/std)"...
19:14:23 Connect gateway
.
.

Any ideas what’s wrong here?
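In a log this long, the error-level entries are the ones worth reading first; here they are the `lgw_start` failure and the `ral_config` status just before the connection to muxs is closed. A minimal sketch for pulling such lines out of a saved log, assuming the syslog line format shown above (the function name `error_lines` is made up for illustration):

```python
import re

# Matches the "[MODULE:LEVEL] message" part of a basicstation syslog line,
# e.g. "... basicstation[19057]: [RAL:ERRO] Concentrator start failed: lgw_start"
ENTRY = re.compile(r"basicstation\[\d+\]: \[(\w+):(\w+)\] (.*)")


def error_lines(lines):
    """Return (module, message) pairs for every ERRO-level entry."""
    found = []
    for line in lines:
        match = ENTRY.search(line)
        if match and match.group(2) == "ERRO":
            found.append((match.group(1), match.group(3).strip()))
    return found
```

Feeding it the lines of the log above would return only the RAL and `any` error entries, which makes it easier to see that the reconnect loop starts with the concentrator failing to start rather than with the TLS or token setup.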

Lots of people seem to see this issue with RAK’s basicstation build, and as far as I know they haven’t published the source of their version of it, so it’s not really something others can debug. It’s possible that the degree to which basicstation tries to configure the gateway based on information held in the servers is part of the problem.

You can probably get things to work with the classic packet forwarder.

In theory it’s possible to build one’s own system image for the RAK gateways. I’ve done it, but only incidentally, while building images for architecturally similar custom hardware in a custom setting that wasn’t trying to connect to TTN and so didn’t include basicstation.

It should also be possible to build basicstation using a published cross toolchain and install it in the overlay as a modification to RAK’s system image, without actually recompiling the underlying Linux from sources (or, at an extreme, to generate a modified squashfs).

I had the same issue with an IMST iC880A board.

Did you implement an SPI reset after power-up (assuming that the RAK uses SPI)?

https://www.beyondlogic.org/lorawan-upgrading-to-basic-station-and-the-things-network-v3-stack/
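For reference, a reset of this kind usually means pulsing the concentrator's reset line before the station starts. The following is only a rough sketch, assuming a Linux system with the legacy sysfs GPIO interface and an active-high reset line as on SX1301 boards like the iC880A. The pin number is a placeholder, I don't know which GPIO (if any) the RAK7258 uses, and the function name and timings are made up for illustration:

```python
import time
from pathlib import Path


def pulse_reset(pin: int, gpio_root: Path = Path("/sys/class/gpio"),
                high_s: float = 0.1, settle_s: float = 0.1) -> None:
    """Drive the reset line high, then low, through sysfs GPIO.

    Assumes an active-high reset; the pin number and the timings are
    board-specific and must be checked against the hardware in use.
    """
    gpio_dir = gpio_root / f"gpio{pin}"
    if not gpio_dir.exists():
        # Ask the kernel to expose the pin as gpio_root/gpio<pin>.
        (gpio_root / "export").write_text(str(pin))
    (gpio_dir / "direction").write_text("out")
    (gpio_dir / "value").write_text("1")  # assert reset
    time.sleep(high_s)
    (gpio_dir / "value").write_text("0")  # release reset
    time.sleep(settle_s)
```

On a board where the reset line really is on, say, GPIO 17, this would be called as `pulse_reset(17)` before launching the station. On the stock RAK firmware the equivalent step presumably belongs in whatever `/etc/station/radio_init.sh` does, but that is a guess.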

That’s a likely suspect. The thing is that the stock RAK firmware setup is sold as a turnkey solution, without source or documentation of how it works internally.

Wading through the various scripts that control things may indeed be worthwhile, but it amounts to second-guessing what they may have gotten wrong.

Thanks for your input!
Martin from RAK Wireless staff confirmed the issue and they are working on it:

Hello Michael,
This is a known issue and the team is working on to find the problem and a solution for it. Can you share the following information, so I can send it to the team?..