LIG16 troubleshooting

I tried the same as you this morning. The connection works well. I also get “Time sync rejected” messages after some time, but I think time sync still succeeds overall. At least I still have incoming messages…

2021-09-11 08:38:46.613 [SYN:INFO] MCU/SX130X drift stats: min: +0.0ppm  q50: -2.4ppm  q80: -9.8ppm  max: -13.8ppm - threshold q90: -11.9ppm
2021-09-11 08:38:46.613 [SYN:INFO] Mean MCU drift vs SX130X#0: -1.0ppm
2021-09-11 08:39:21.272 [SYN:INFO] Time sync qualities: min=306 q90=340 max=341 (previous q90=343)
2021-09-11 08:39:23.373 [SYN:INFO] MCU/SX130X drift stats: min: +0.0ppm  q50: +6.2ppm  q80: +13.3ppm  max: +17.6ppm - threshold q90: +17.1ppm
2021-09-11 08:39:23.373 [SYN:INFO] Mean MCU drift vs SX130X#0: 4.0ppm
2021-09-11 08:39:35.976 [SYN:VERB] Time sync rejected: quality=371 threshold=340
2021-09-11 08:39:40.177 [SYN:VERB] Time sync rejected: quality=341 threshold=340
2021-09-11 08:40:03.282 [SYN:VERB] Time sync rejected: quality=341 threshold=340
2021-09-11 08:40:09.584 [SYN:INFO] MCU/SX130X drift stats: min: +1.0ppm  q50: +3.8ppm  q80: +16.7ppm  max: -24.4ppm - threshold q90: -24.3ppm
2021-09-11 08:40:09.584 [SYN:INFO] Mean MCU drift vs SX130X#0: -1.5ppm
2021-09-11 08:40:22.187 [SYN:INFO] Time sync qualities: min=294 q90=341 max=371 (previous q90=340)
2021-09-11 08:40:34.790 [SYN:VERB] Time sync rejected: quality=354 threshold=341
2021-09-11 08:40:53.695 [SYN:INFO] MCU/SX130X drift stats: min: +1.0ppm  q50: -3.3ppm  q80: +11.4ppm  max: -16.2ppm - threshold q90: +14.8ppm
2021-09-11 08:40:53.696 [SYN:INFO] Mean MCU drift vs SX130X#0: 0.7ppm
2021-09-11 08:41:02.098 [SYN:VERB] Time sync rejected: quality=347 threshold=341
2021-09-11 08:41:03.902 [S2E:VERB] RX 867.9MHz DR5 SF7/BW125 snr=-3.8 rssi=-118 xtime=0x4700000D2F54B3 - updf mhdr=40 DevAddr=26088ABD FCtrl=80 FCnt=2541 FOpts=[] 036D0C54..1E47 mic=-1989087438 (59 bytes)
2021-09-11 08:41:10.500 [SYN:VERB] Time sync rejected: quality=342 threshold=341
2021-09-11 08:41:12.600 [SYN:VERB] Time sync rejected: quality=350 threshold=341
2021-09-11 08:41:24.517 [S2E:VERB] RX 867.7MHz DR3 SF9/BW125 snr=-15.5 rssi=-125 xtime=0x4700000E69ED7E - updf mhdr=40 DevAddr=2601280E FCtrl=C2 FCnt=17965 FOpts=[0307] 0D3865CC..5189 mic=-495581456 (27 bytes)
2021-09-11 08:41:25.203 [SYN:INFO] Time sync qualities: min=295 q90=347 max=354 (previous q90=341)
2021-09-11 08:41:33.606 [SYN:VERB] Time sync rejected: quality=355 threshold=347
2021-09-11 08:41:42.008 [SYN:INFO] MCU/SX130X drift stats: min: -3.3ppm  q50: -8.6ppm  q80: -9.5ppm  max: -20.9ppm - threshold q90: -16.8ppm
2021-09-11 08:41:42.008 [SYN:INFO] Mean MCU drift vs SX130X#0: -7.8ppm
2021-09-11 08:41:52.511 [SYN:VERB] Time sync rejected: quality=503 threshold=347
2021-09-11 08:42:13.516 [SYN:VERB] Time sync rejected: quality=353 threshold=347
2021-09-11 08:42:26.119 [SYN:INFO] Time sync qualities: min=311 q90=353 max=503 (previous q90=347)
2021-09-11 08:42:28.220 [SYN:INFO] MCU/SX130X drift stats: min: -2.4ppm  q50: +7.1ppm  q80: -10.5ppm  max: -16.7ppm - threshold q90: -15.2ppm
2021-09-11 08:42:28.220 [SYN:INFO] Mean MCU drift vs SX130X#0: -4.1ppm
2021-09-11 08:42:55.526 [SYN:VERB] Time sync rejected: quality=380 threshold=353
2021-09-11 08:43:11.378 [S2E:VERB] RX 867.3MHz DR0 SF12/BW125 snr=8.8 rssi=-73 xtime=0x47000014C85984 - updf mhdr=40 DevAddr=260B1790 FCtrl=80 FCnt=39 FOpts=[] 02B67C6A..9DE9 mic=-1490846239 (31 bytes)
2021-09-11 08:43:11.821 [S2E:DEBU] ::1 diid=50352 [ant#0] - next TX start ahead by 4s537ms
2021-09-11 08:43:12.330 [SYN:INFO] MCU/SX130X drift stats: min: -1.0ppm  q50: -2.9ppm  q80: +9.5ppm  max: -13.8ppm - threshold q90: -13.8ppm
2021-09-11 08:43:12.330 [SYN:INFO] Mean MCU drift vs SX130X#0: -2.0ppm
2021-09-11 08:43:16.338 [S2E:VERB] ::1 diid=50352 [ant#0] - starting TX in 19ms915us
2021-09-11 08:43:16.364 [S2E:INFO] TX ::1 diid=50352 [ant#0] - dntxed: 867.3MHz 16.0dBm ant#0(0) DR0 SF12/BW125 frame=6090170B268509000340FF00..B80695A4
2021-09-11 08:43:17.513 [S2E:DEBU] Tx done diid=50352
2021-09-11 08:43:29.135 [SYN:INFO] Time sync qualities: min=273 q90=341 max=380 (previous q90=353)
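
To judge whether the rejections matter, it can help to count them and see how far each rejected quality exceeds its threshold. A minimal Python sketch, assuming the log format shown above; the function name `summarize_rejections` is made up for illustration:

```python
import re

# Matches lines such as:
# 2021-09-11 08:39:35.976 [SYN:VERB] Time sync rejected: quality=371 threshold=340
REJECTED = re.compile(r"Time sync rejected: quality=(\d+) threshold=(\d+)")


def summarize_rejections(lines):
    """Return (count, worst_excess) for 'Time sync rejected' lines.

    worst_excess is the largest (quality - threshold) seen, or 0 if none.
    """
    count = 0
    worst_excess = 0
    for line in lines:
        m = REJECTED.search(line)
        if m is None:
            continue  # not a rejection line
        quality, threshold = int(m.group(1)), int(m.group(2))
        count += 1
        worst_excess = max(worst_excess, quality - threshold)
    return count, worst_excess


sample = [
    "2021-09-11 08:39:35.976 [SYN:VERB] Time sync rejected: quality=371 threshold=340",
    "2021-09-11 08:39:40.177 [SYN:VERB] Time sync rejected: quality=341 threshold=340",
    "2021-09-11 08:39:21.272 [SYN:INFO] Time sync qualities: min=306 q90=340 max=341 (previous q90=343)",
]
print(summarize_rejections(sample))
```

Most rejections in the log above miss the threshold by only a few units, which fits the impression that time sync is still working.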

Also I got some nice logs about the temperature sensor after a while:

2021-09-11 08:46:33.984 [SYN:INFO] Time sync qualities: min=273 q90=341 max=376 (previous q90=341)
2021-09-11 08:46:42.387 [SYN:VERB] Time sync rejected: quality=629 threshold=341
2021-09-11 08:46:46.035 [S2E:VERB] RX 867.7MHz DR5 SF7/BW125 snr=14.0 rssi=-78 xtime=0x4700002193DE7D - updf mhdr=40 DevAddr=260B1790 FCtrl=80 FCnt=40 FOpts=[] 0293D7C6..4093 mic=449092140 (31 bytes)
2021-09-11 08:46:46.475 [S2E:DEBU] ::1 diid=50614 [ant#0] - next TX start ahead by 4s547ms
2021-09-11 08:46:50.789 [SYN:VERB] Time sync rejected: quality=342 threshold=341
2021-09-11 08:46:51.003 [S2E:VERB] ::1 diid=50614 [ant#0] - starting TX in 19ms838us
2021-09-11 08:46:51.017 [___:INFO] lgw_start:850:  --- IN
Opening SPI communication interface
Note: chip version is 0x10 (v1.0)
INFO: using legacy timestamp
INFO: LoRa Service modem: configuring preamble size to 8 symbols
Loading AGC fw for sx1250
Loading ARB fw
ARB: dual demodulation disabled for all SF
INFO: no temeprature sensor found on port 0x39
INFO: found temperature sensor on port 0x3B
lgw_start:1179:  --- OUT
lgw_receive:1310: INFO: RSSI temperature offset applied: 1.574 dB (current temperature 35.7 C)
lgw_receive:1313: INFO: nb pkt found:1 left:0
lgw_receive:1310: INFO: RSSI temperature offset applied: 1.564 dB (current temperature 35.5 C)
lgw_receive:1313: INFO: nb pkt found:1 left:0
lgw_receive:1310: INFO: RSSI temperature offset applied: 1.567 dB (current temperature 35.6 C)
lgw_receive:1313: INFO: nb pkt found:1 left:0
lgw_send:1340:  --- IN
lgw_send:1474:  --- OUT
lgw_receive:1310: INFO: RSSI temperature offset applied: 1.574 dB (current temperature 35.7 C)
lgw_receive:1313: INFO: nb pkt found:1 left:0
lgw_send:1340:  --- IN
2021-09-11 08:46:51.028 [S2E:INFO] TX ::1 diid=50614 [ant#0] - dntxed: 867.7MHz 16.0dBm ant#0(0) DR5 SF7/BW125 frame=6090170B26850A000352FF00..4698E443
2021-09-11 08:46:51.069 [S2E:DEBU] Tx done diid=50614
2021-09-11 08:46:52.889 [SYN:INFO] MCU/SX130X drift stats: min: +0.0ppm  q50: +2.6ppm  q80: -11.4ppm  max: +12.9ppm - threshold q90: -11.9ppm
2021-09-11 08:46:52.890 [SYN:INFO] Mean MCU drift vs SX130X#0: 0.2ppm

I reported this behaviour to Dragino in their Google Group. Maybe someone from Dragino will read it and answer.

That is just a GUI thing in the web interface. When I refresh the page I get to see new logs. Edit: the gateway traffic graph does indeed stay empty. But in The Things Stack I see normal traffic coming in.

I have tested Basic Station for about an hour now, and I seem to get all uplink messages and am also able to send downlink messages. I will keep the LIG16 configured as it is for now and see how it behaves in the long term.


I am testing a LIG16. I flashed the newest firmware:
OpenWRT 18.06, Version: Dragino-v2 lgw-5.4.1636008262, Build Thu Nov 4 14:44:22 CST 2021

I can connect to TTS-CE in both modes: Semtech UDP forwarder (port 1700) and Basic Station (port 443).

Now I want to connect to a private TTS server, but I can’t do it the… easy way.
This server already has some LORIX One gateways running fine, and they connect via Basic Station on port 8887.

So, has anyone used a LIG16 with a private TTS server?

So, has no one tried connecting a Dragino LIG16 to an LNS via Basic Station on port 8887?

There is a new firmware available from Dragino for the LIG16. This one allows users to provide their own server certificate, so it’s usable for any TTS scenario.

https://www.dragino.com/downloads/index.php?dir=LoRa_Gateway/LIG16/Firmware/Release/

Latest is 2021-Dec-24

ChangeLog
===========
lgw–build-v5.4.1640315898-20211224-1120
*Add Iot_keep_alive Script interval setting
*Update the display of the helium page on GUI
*Reduce the gateway data traffic
*Fixed the loss of the gateway downlink queue
*Update the display of the Basic-Station page for TTN on GUI


The new firmware seems stable. But even with this firmware the gateway does not show traffic statistics when you use it in Basic Station mode. If I remember correctly, someone already contacted Dragino about this months ago (@wolfp, was that you?)

For now I have reverted to Semtech UDP again.

I think you’re going to find that’s pretty common with Basic Station on most platforms. Whatever log scanning was used to capture it is expecting log messages from the UDP forwarder, while Basic Station would produce that information in a different format if at all.
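
How the Dragino GUI collects its statistics is not shown in this thread, but the difference in log format is easy to see. As a rough sketch, this is how uplink counts could be pulled out of Basic Station output instead, assuming the `[S2E:VERB] RX …` line format from the logs earlier in the thread; `count_uplinks` is a hypothetical helper, not part of any firmware:

```python
import re
from collections import Counter

# Matches Basic Station uplink lines such as:
# ... [S2E:VERB] RX 867.9MHz DR5 SF7/BW125 snr=-3.8 rssi=-118 xtime=...
RX_LINE = re.compile(
    r"\[S2E:VERB\] RX (?P<freq>[\d.]+)MHz DR(?P<dr>\d+) \S+ "
    r"snr=(?P<snr>-?[\d.]+) rssi=(?P<rssi>-?\d+)"
)


def count_uplinks(lines):
    """Count received uplinks per frequency (MHz, kept as a string)."""
    per_freq = Counter()
    for line in lines:
        m = RX_LINE.search(line)
        if m:
            per_freq[m.group("freq")] += 1
    return per_freq


sample = [
    "2021-09-11 08:41:03.902 [S2E:VERB] RX 867.9MHz DR5 SF7/BW125 snr=-3.8 rssi=-118 xtime=0x4700000D2F54B3 - updf",
    "2021-09-11 08:41:24.517 [S2E:VERB] RX 867.7MHz DR3 SF9/BW125 snr=-15.5 rssi=-125 xtime=0x4700000E69ED7E - updf",
    "2021-09-11 08:46:46.035 [S2E:VERB] RX 867.7MHz DR5 SF7/BW125 snr=14.0 rssi=-78 xtime=0x4700002193DE7D - updf",
    "2021-09-11 08:41:02.098 [SYN:VERB] Time sync rejected: quality=347 threshold=341",
]
print(count_uplinks(sample))
```

These lines only appear when the station logs at verbose level, which is one more reason a scanner written for the UDP forwarder’s output would find nothing.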

New firmware, tested OK on the LIG16.
The gateway traffic GUI does not show anything; don’t worry about that.

lgw–build-v5.4.1644990565-20220216-1352
*Add LPS8-N support
*Add AT&T disconnection detection
*Add Secondary LoRaWAN Server for Semtech UDP
*LPS8/LG308/DLOS8’s software of pkt_fwd iterates to fwd
*LPS8/LG308/DLOS8 supports the GUI of Gateway traffic
*Fixd fallback address lost after save WiFi setting

For me the new firmware version is not ok. There is still nothing shown in “Gateway Traffic” when you use BasicStation. I need to know what’s going on with my gateway without using the console.
Therefore I installed the new firmware using Semtech UDP.

Due to the differences between the UDP based packet forwarder and BasicStation that is unlikely to change anytime soon. So no need to report this every update.


I see a lot of reboots on my LPS8, on average every 15 hours, and Basic Station freezes after a few days/weeks.
I updated it 2 weeks ago with the new firmware (16 Feb 2022 lgw-5.4.1644990565) and configured it for Basic Station, connected over WiFi. It ran fine for 2 weeks (lots of reboots) but froze yesterday. Completely unresponsive, no ping.
I had noticed the same issue on the previous firmware (24 Dec 2021), Basic Station freezing, so I switched back to UDP and that ran without freezing; lots of reboots but always coming back online. I wanted to give Basic Station a chance with the new firmware but it seems to reboot more often, and now it froze.
The only modifications I made to the firmware are installing nano and a cron job that sends the uptime over MQTT SSL every 5 minutes. Earlier I had tried to avoid the freezes with watchcat (did not work), and I directed the syslog to permanent storage to investigate after a freeze, but I did not find any errors.
Does anyone have similar reboot/freeze issues with Dragino gateways over WiFi?
My WiFi signal is not strong, but acceptable?
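
If the uptime readings from the cron job are stored somewhere, the reboots can be counted from them: every time a reading is lower than the one before, the gateway must have restarted in between. A small Python sketch of that idea; the function name and the sample values are illustrative only:

```python
def count_reboots(uptimes):
    """Count reboots in a chronological series of uptime readings (seconds).

    A reboot is assumed whenever a reading is lower than the previous one.
    """
    reboots = 0
    for previous, current in zip(uptimes, uptimes[1:]):
        if current < previous:
            reboots += 1
    return reboots


# Hypothetical readings; the values are made up for illustration.
samples = [54000, 54300, 120, 420, 720, 60]
print(count_reboots(samples))
```

Two reboots that both fall between two consecutive readings would be counted as one, so this gives a lower bound.
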

IMHO the LIG16 and the LPS8 have different hardware but seem to use (nearly) the same firmware. I own both gateways; the LPS8 is still on the “old” firmware and has been running Semtech UDP without any problems for a few months.

The LIG16 has been running the newest firmware with Semtech UDP without any reboots or freezes for a few days.

Your WLAN signal seems to be strong enough; it’s nearly the same signal/noise as I have.

Maybe this is more of an LPS8-related problem when using Basic Station.

Did it freeze before you put this on?

One LIG16 has been up and running Basic Station for 2 weeks now; it hasn’t bothered me since.

Thanks for the feedback. I’ve switched my LPS8 back to the UDP forwarder and upgraded to a beefy 2.5A power supply, but I’m still seeing several reboots over the past days. I’m just wondering if the same thing is happening to other people without them realising it. The uptime is reported in the dashboard, bottom right.
Sorry, I didn’t want to hijack this LIG16 thread; I was just wondering if anyone had a similar problem or if it’s related to the firmware. The conclusion seems to be that the LIG16 (both UDP and Basic Station) does not have this problem.

My recommendation would be to get a logread -f running, say from /etc/rc.local, and then run the serial console of the gateway into some other logging system, e.g. a Raspberry Pi or old PC that can sit there and just log everything for days to weeks.

You may also want to determine if the SoC’s hardware watchdog, or the Linux soft watchdog that can wrap it, are in use.

Basically what you’ll then want to do is search through the external log for Linux boot messages, and see what was happening just before.

That assumes of course that these are Linux reboots; if it’s just the packet forwarder restarting, you’d look for those instances instead.
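
The search through the external log can be sketched in a few lines of Python. This assumes the capture contains the kernel’s “Linux version” banner, which is printed early in every Linux boot; the function name, the default of 5 context lines and the sample log lines are illustrative choices:

```python
from collections import deque


def lines_before_boot(lines, marker="Linux version", context=5):
    """Yield (boot_line, preceding_lines) for every line containing marker.

    preceding_lines holds up to `context` lines seen just before the marker.
    """
    recent = deque(maxlen=context)
    for line in lines:
        if marker in line:
            yield line, list(recent)
        recent.append(line)


# Made-up capture: two ordinary log lines, then a boot banner.
log = [
    "fwd: INFO~ [server-down] PULL_ACK received in 148 ms",
    "iot_keep_alive: Internet Access OK: via wlan0-2",
    "[    0.000000] Linux version (example)",
    "[    0.000000] further boot output",
]
for boot, before in lines_before_boot(log, context=2):
    print(boot)
    for line in before:
        print("   before:", line)
```

To look for packet forwarder restarts instead, pass a different `marker`, for example a string the forwarder prints on startup.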

Thanks @cslorabox, I ran logread -f in an SSH session and the LPS8 rebooted after 32 hours with these last lines before the pipe broke; I don’t see anything unusual. The iot_keep_alive is a Dragino service for Ethernet/WiFi/3G failover.

Mon Mar  7 13:20:02 2022 user.notice iot_keep_alive: Internet Access OK: via wlan0-2
Mon Mar  7 13:20:02 2022 user.notice iot_keep_alive: use WAN or WiFi for internet access now
Mon Mar  7 13:20:03 2022 daemon.info fwd[2562]:
Mon Mar  7 13:20:03 2022 daemon.info fwd[2562]: ################[PKT_SERV] no report of this service ###############
Mon Mar  7 13:20:03 2022 daemon.info fwd[2562]: PKTUP~ [server] JSON: {"stat":{"time":"2022-03-07 05:20:03 UTC","rxnb":1,"rxok":1,"rxfw":1,"ackr":0.0,"dwnb":111,"txnb":0,"pfrm":"SX1308","mail":"tomtobback@gmail.com","desc":"Dragino LoRaWAN Gateway"}}
Mon Mar  7 13:20:03 2022 daemon.info fwd[2562]: PKTUP~ [server2] JSON: {"stat":{"time":"2022-03-07 05:20:03 UTC","rxnb":1,"rxok":1,"rxfw":1,"ackr":0.0,"dwnb":0,"txnb":0,"pfrm":"SX1308","mail":"tomtobback@gmail.com","desc":"Dragino LoRaWAN Gateway"}}
Mon Mar  7 13:20:05 2022 daemon.info fwd[2562]: INFO~ [server-up] received packages from mote: 260D1D33 (fcnt=17)
Mon Mar  7 13:20:05 2022 daemon.info fwd[2562]: PKTUP~ [server] JSON: {"rxpk":[{"jver":1,"tmst":2803997651,"chan":4,"rfch":0,"freq":924.000000,"mid":0,"stat":1,"modu":"LORA","datr":"SF7BW125","codr":"4/5","rssis":-41,"lsnr":9.0,"foff":0,"rssi":-41,"size":21,"data":"QDMdDSaAEQABanzd5cTGYm5Ja30x"}]}
Mon Mar  7 13:20:05 2022 daemon.info fwd[2562]: INFO~ [server2-up] received packages from mote: 260D1D33 (fcnt=17)
Mon Mar  7 13:20:05 2022 daemon.info fwd[2562]: PKTUP~ [server2] JSON: {"rxpk":[{"jver":1,"tmst":2803997651,"chan":4,"rfch":0,"freq":924.000000,"mid":0,"stat":1,"modu":"LORA","datr":"SF7BW125","codr":"4/5","rssis":-41,"lsnr":9.0,"foff":0,"rssi":-41,"size":21,"data":"QDMdDSaAEQABanzd5cTGYm5Ja30x"}]}
Mon Mar  7 13:20:05 2022 kern.info kernel: wlan0-2: AP 24:f5:a2:05:e3:ab changed bandwidth, new config is 2437 MHz, width 1 (2437/0 MHz)
Mon Mar  7 13:20:05 2022 daemon.info fwd[2562]: UNCONF_UP:{"ADDR":"260D1D33", "Size":21, "Rssi":-41, "snr":9, "FCtrl":["ADR":1,"ACK":0, "FPending":0, "FOptsLen":0], "FCnt":17, "FPort":1, "MIC":"317D6B49"}
Mon Mar  7 13:20:07 2022 daemon.info fwd[2562]: INFO~ [server-down] PULL_ACK received in 143 ms
Mon Mar  7 13:20:09 2022 kern.info kernel: wlan0-2: AP 24:f5:a2:05:e3:ab changed bandwidth, new config is 2437 MHz, width 2 (2447/0 MHz)
Mon Mar  7 13:20:12 2022 daemon.info fwd[2562]: INFO~ [server-down] PULL_ACK received in 143 ms
Mon Mar  7 13:20:17 2022 daemon.info fwd[2562]: INFO~ [server-down] PULL_ACK received in 145 ms
Mon Mar  7 13:20:18 2022 user.notice iot_keep_alive: Internet Access OK: via wlan0-2
Mon Mar  7 13:20:18 2022 user.notice iot_keep_alive: use WAN or WiFi for internet access now

and another reboot after these final lines:

Tue Mar  8 05:31:13 2022 user.notice iot_keep_alive: Internet Access OK: via wlan0-2
Tue Mar  8 05:31:13 2022 user.notice iot_keep_alive: use WAN or WiFi for internet access now
Tue Mar  8 05:31:17 2022 daemon.info fwd[2329]: INFO~ [server-down] PULL_ACK received in 148 ms
Tue Mar  8 05:31:23 2022 daemon.info fwd[2329]: INFO~ [server-down] PULL_ACK received in 148 ms
Tue Mar  8 05:31:28 2022 daemon.info fwd[2329]: INFO~ [server-down] PULL_ACK received in 148 ms
Tue Mar  8 05:31:28 2022 user.notice iot_keep_alive: Internet Access OK: via wlan0-2
Tue Mar  8 05:31:28 2022 user.notice iot_keep_alive: use WAN or WiFi for internet access now
Tue Mar  8 05:31:33 2022 daemon.info fwd[2329]: INFO~ [server-down] PULL_ACK received in 152 ms
Tue Mar  8 05:31:38 2022 daemon.info fwd[2329]: INFO~ [server-down] PULL_ACK received in 148 ms
Tue Mar  8 05:31:42 2022 daemon.info fwd[2329]:
Tue Mar  8 05:31:42 2022 daemon.info fwd[2329]: ################[PKT_SERV] no report of this service ###############
Tue Mar  8 05:31:42 2022 daemon.info fwd[2329]: PKTUP~ [server2] JSON: {"stat":{"time":"2022-03-07 21:31:42 UTC","rxnb":0,"rxok":0,"rxfw":0,"ackr":0.0,"dwnb":0,"txnb":0,"pfrm":"SX1308","mail":"tomtobback@gmail.com","desc":"Dragino LoRaWAN Gateway"}}
Tue Mar  8 05:31:42 2022 daemon.info fwd[2329]: PKTUP~ [server] JSON: {"stat":{"time":"2022-03-07 21:31:42 UTC","rxnb":0,"rxok":0,"rxfw":0,"ackr":0.0,"dwnb":46,"txnb":0,"pfrm":"SX1308","mail":"tomtobback@gmail.com","desc":"Dragino LoRaWAN Gateway"}}
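
The periodic `stat` reports in these logs are plain JSON and can be extracted for trending, for example to see whether the counters change before a reboot. A minimal Python sketch, assuming the `PKTUP~ [server] JSON: {...}` line layout shown above; `extract_stat` is a made-up helper and the sample line is shortened from the log:

```python
import json

MARKER = "JSON: "


def extract_stat(line):
    """Return the 'stat' dict from a forwarder log line, or None."""
    _, sep, payload = line.partition(MARKER)
    if not sep:
        return None  # no JSON payload on this line
    try:
        doc = json.loads(payload)
    except json.JSONDecodeError:
        return None
    if not isinstance(doc, dict):
        return None
    return doc.get("stat")  # None for rxpk lines


line = (
    'Tue Mar  8 05:31:42 2022 daemon.info fwd[2329]: PKTUP~ [server] JSON: '
    '{"stat":{"time":"2022-03-07 21:31:42 UTC","rxnb":0,"rxok":0,"rxfw":0,'
    '"ackr":0.0,"dwnb":46,"txnb":0,"pfrm":"SX1308"}}'
)
stat = extract_stat(line)
print(stat["dwnb"], stat["txnb"])
```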

I don’t see any clues, but as long as it doesn’t freeze I don’t mind the reboots.

I would check the watchdog configuration.

Also consider capturing the logs via a Python script or something that can timestamp each line, and see if there’s an unusual time gap between the final log messages and the U-Boot startup messages.
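
A sketch of such a timestamping script in Python, using only the standard library. It stamps each incoming line with the logging machine’s own clock and inserts a marker when nothing arrived for a while; the function name and the 60-second gap threshold are arbitrary assumptions:

```python
import time


def stamp_lines(stream, out, gap_seconds=60.0, clock=time.time):
    """Copy lines from stream to out, prefixing each with the local time.

    A marker line is written whenever more than gap_seconds passed since
    the previous line, so silent periods stand out in the capture.
    """
    last = None
    for line in stream:
        now = clock()
        if last is not None and now - last > gap_seconds:
            out.write(f"--- gap of {now - last:.0f}s ---\n")
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
        out.write(f"{stamp} {line.rstrip()}\n")
        out.flush()  # keep the capture current if the logger itself dies
        last = now
```

Calling `stamp_lines(sys.stdin, sys.stdout)` (after `import sys`) at the end of the script lets it sit at the end of a pipe fed by whatever reads the serial console. The `clock` parameter exists only so the function can be tried out without waiting.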

Thanks for all the feedback, I think I’ve figured out the cause: overheating. The LPS8 seems very poorly designed, overheating at around 25 degC ambient air temperature. Maybe not a big issue in Europe, but in sub-tropical Hong Kong that is the case for more than half of the year, especially under the roof. The Dragino HQ is less than 50 km away from here in Shenzhen; I wonder how they test their devices. I’ve added a temperature sensor inside my gateway box, a small 5V brushless fan that switches on when the inside temperature goes above 30 degC (at 40 and 35 degC I was still seeing frequent reboots), and a few ventilation holes in the top panel. Detail here. Almost 4 days without a reboot.
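
The fan logic can be sketched as a simple thermostat. The 30 degC switch-on point is from the post above; the lower switch-off point (hysteresis, so the fan does not chatter around 30 degC) is an added assumption, as are the function name and the sample readings:

```python
def fan_should_run(temp_c, fan_on, on_above=30.0, off_below=28.0):
    """Decide the fan state from the current temperature and previous state.

    on_above: switch on above this temperature (30 degC, as in the post).
    off_below: switch off below this temperature (assumed value).
    Between the two limits the fan keeps its previous state.
    """
    if temp_c > on_above:
        return True
    if temp_c < off_below:
        return False
    return fan_on


# Made-up temperature readings in degC.
readings = [27.0, 29.5, 30.5, 29.0, 27.5]
state = False
states = []
for t in readings:
    state = fan_should_run(t, state)
    states.append(state)
print(states)
```
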
