Downlink error in v3 UI - message not arriving at gateway nor device

FWIW, I’ve now seen this whilst testing a new device - data was passed on with no problems.

Good morning all,

@descartes: My problem is not just a UI issue; the data is arriving neither at the client nor at the gateway at the client location.

@philmcm Thanks for linking to the UI github issue.

@MarkusKa Do you see the messages arrive at your device, or do you have the same issue, where they arrive at neither the device nor the local gateway?

Morning,

Do you mean that the downlink messages do not arrive at your device?
I only get a confirmation as a downlink that the uplink was successful and that works. I do not send any data as downlink.

The uplinks work and so does my MQTT integration.

@MarkusKa Yes, correct. I am sending commands to the device to do something (e.g. restart), and I do get the downlink-scheduled confirmation message in the TTN console, but nothing arrives on the device itself. The downlink message does not arrive at the local gateway either.

Hi @descartes,

Can I ask if you are seeing uplink data or downlink data on the device? It seems everyone on this thread is talking about uplinks, but my issue is related to downlink messages.

That is, from TTN to the device.

Hi @sqippa-online and everyone, I am having a similar issue with downlinks after migrating a device from V2 to V3. When the gateway is connected via a GSM network, the downlink sometimes reaches the device, but most of the time it doesn’t.

When the gateway is connected via LAN, a confirmed downlink sometimes keeps being resent even after the device has received it. I checked with the CLI that the downlink queue is empty. At first I thought it might be queued on the gateway, so I restarted the gateway, but the same downlink kept repeating. The device logs show that it did receive a downlink, even though the CLI shows nothing in the queue and the gateway had been restarted.

To rule out queueing or caching by the Internet provider (the gateway is on home broadband), I connected the gateway to a mini-router on the Vodafone GSM cellular network. The repeating downlink stopped for a few uplink cycles and then … it reappeared. As I mentioned earlier, with the GSM connection the downlink only reaches the device from time to time.

When I switch the gateway back to the home broadband LAN connection, the repeating downlink returns. When I check with the CLI, there is nothing in the downlink queue.

The weirdest behaviour is … after a different downlink was sent successfully to the device, the previous repeating downlink continued. I even used the CLI to clear the queue, but the downlink was still being sent to the device.

The device had not been touched during any of the above tests, so the final test was to restart it and see whether the problem lies with the device. I restarted the device and the downlink is still repeating.

Apologies for the long list of observations; I am trying to understand TTN V3 downlinks, since downlinks worked nicely on TTN V2. It looks like some kind of synchronisation issue, and maybe a bug?

Any suggestion/help would be much appreciated. Thank you very much.

Downlinks aren’t good. Confirmed downlinks are very very bad.

If your device doesn’t confirm the downlink the network server will retry forever, using up computer resources thereby contributing to global warming which means you will be taken off Greta’s Christmas Card list.

The console will not show a downlink awaiting confirmation.

What is your device?
Is it on OTAA or ABP?

No such thing - but if the GSM network is running slow, the downlink may not reach the gateway in time. Is the gateway on v2 or v3?

I have a test outstanding on clearing blocked confirmations which I may be able to look at later today.

Hi Nick (@descartes) ,

Thank you for getting back to me quickly, really appreciate it.

noooo … don’t want to be taken off the Christmas list :sweat_smile:

Could you point me in the right direction on implementing a downlink response, please? Are you referring to setting the u1_t confirmed parameter to 1 when sending the next uplink? We updated the device code to send the next uplink with u1_t confirmed = 1, but we are still getting the previous downlink.

So far we have been using the LMIC library with V2, where a confirmed downlink would normally be replaced by a non-confirmed downlink, and once that was sent it stopped repeating.

The device is a Heltec ESP32 LoRa unit. We are using ABP. Downlinks are only used during initial installation, to check signal strength and to confirm that uplink messages are being received. The transmission interval will then be set to 30 minutes via downlink, and we will most probably never use downlinks again. However, they are vital during initial installation and setup.

The gateway is a Laird RG186 on V3. I tried setting the RX1 delay to different values (1 to 5), but it doesn’t seem to have any effect and downlinks still cannot reach the device when using the GSM network. Any suggestions on how to diagnose this, and a good way to overcome the problem? Thanks.

Thank you and looking forward to getting downlink working reliably :blush:

Out of curiosity I wondered … “What if I send a different confirmed downlink … would this new confirmed downlink repeat, or would the previous confirmed downlink return afterwards …”?

With LAN (broadband connection to the gateway) I scheduled a different downlink via the TTN V3 console. After the device received this new downlink once, the previous downlink came back again … something very strange is happening … mmm

Maybe I should try with a new device next?

That is saying “send uplink as confirmed”. The acknowledgement of a downlink is done automatically from within LMIC, and as such there are notes about ensuring you don’t sleep before LMIC has finished. EV_TXDONE is only an event telling us that it has transmitted an uplink; it doesn’t tell us whether the MAC still has other work to do. That work could take some time if there is a series of MAC commands to process: for example, it may have a confirmed downlink to acknowledge, and the uplink it transmits for that may receive a further downlink that needs processing and some sort of response.

LMIC isn’t hugely helpful about indicating that it has work to do - you have to call os_queryTimeCriticalJobs() to find out if there are any jobs pending, which is a bit obtuse. See “EV_TXDONE and transmission complete”, Discussion #640 in mcci-catena/arduino-lmic on GitHub.
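A minimal sketch of the "check before you sleep" idea, written so it can be tried off-device. The names safe_to_sleep, stack_busy_fn, stack_idle and stack_has_jobs are hypothetical and not part of LMIC. On a real device the callback would be expected to wrap os_queryTimeCriticalJobs(ms2osticks(sleep_ms)), and txrx_pending would come from (LMIC.opmode & OP_TXRXPEND); treat that mapping as an assumption to check against your LMIC version.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: returns true if the LoRaWAN stack has work
 * due within the next `window_ms` milliseconds. On a device this would
 * wrap os_queryTimeCriticalJobs(ms2osticks(window_ms)). */
typedef bool (*stack_busy_fn)(uint32_t window_ms);

/* Decide whether it is safe to sleep for `sleep_ms`.
 * `txrx_pending` mirrors (LMIC.opmode & OP_TXRXPEND) != 0. */
bool safe_to_sleep(bool txrx_pending, stack_busy_fn stack_busy,
                   uint32_t sleep_ms)
{
    assert(stack_busy != NULL);
    if (txrx_pending) {
        return false; /* a TX or an RX window is still outstanding */
    }
    if (stack_busy(sleep_ms)) {
        return false; /* the stack has a job due before we would wake */
    }
    return true;
}

/* Stand-ins for the stack, so the logic can be exercised on a PC. */
bool stack_idle(uint32_t window_ms)     { (void)window_ms; return false; }
bool stack_has_jobs(uint32_t window_ms) { (void)window_ms; return true; }
```

The point of the design is that the sleep decision depends on both conditions, not on the transmit-done event alone.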

It’s all cool, downlinks have their time & place, just not as a routine thing IMO.

It won’t do / that’s just advanced guessing.

To make downlinks work with ABP on a 5 s RX1 delay, which is preferable, you will have to change DELAY_DNW1 in lorabase.h. If you leave it as it is in the code base, you’ll need to set the RX1 delay to 1 s in the device settings. I’m not sure if I should be proud or depressed that my brain retains this level of detail.
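The mismatch described above can be illustrated with a small sketch. It assumes the usual LoRaWAN Class A timing, where RX1 opens the RX1 delay after the end of the uplink and RX2 opens one second after RX1. With ABP there is no join-accept to carry the RX1 delay to the device, so the firmware value and the value in the device settings have to agree. The names rx_windows_after_tx and rx1_aligned are hypothetical helpers, not part of LMIC.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t rx1_open_ms; /* when the RX1 window opens */
    uint32_t rx2_open_ms; /* when the RX2 window opens */
} rx_windows_t;

/* Compute when the two Class A receive windows open, given the time the
 * uplink finished and the RX1 delay in seconds. */
rx_windows_t rx_windows_after_tx(uint32_t tx_end_ms, uint32_t rx1_delay_s)
{
    rx_windows_t w;
    w.rx1_open_ms = tx_end_ms + rx1_delay_s * 1000u;
    w.rx2_open_ms = w.rx1_open_ms + 1000u; /* RX2 follows RX1 by 1 s */
    return w;
}

/* True if the device listens for RX1 at the moment the network aims for.
 * firmware_delay_s is the value compiled in (DELAY_DNW1 in this thread);
 * network_delay_s is the RX1 delay configured in the device settings. */
bool rx1_aligned(uint32_t firmware_delay_s, uint32_t network_delay_s)
{
    rx_windows_t device  = rx_windows_after_tx(0u, firmware_delay_s);
    rx_windows_t network = rx_windows_after_tx(0u, network_delay_s);
    return device.rx1_open_ms == network.rx1_open_ms;
}
```

For example, rx1_aligned(1, 5) is false: the firmware opens RX1 after 1 s while the network transmits for a window at 5 s.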

MCCI LMIC 3.2.0 onwards is highly compliant - as in it processes MAC commands which is an expectation of v3. You should use LoRaWAN version 1.0.3. I’m using 3.3.0 for real but have test devices running nicely on 4.0.0. I don’t deploy client devices using LMIC, but if further testing is satisfactory and given the world-wide shortage of modules I may well do.

Anyway, here’s the exciting command I’ve just tested.

I sent my device a confirmed reboot downlink, which means it never confirms the command and just keeps getting the downlink again.

Using:

ttn-lw-cli dev set your-application-id the-offending-device-id --unset mac-state.pending-application-downlink

it clears the downlinks and all is well.

Not strange at all, you asked for a downlink to be confirmed, it hasn’t yet, so TTS NS is resending.

Hopefully the CLI command above will reset the state of affairs.

If you need to know that a device has received a downlink it is preferable to put some logic in to the firmware & your backend systems.

To emphasise how non-PC confirmed downlinks are, TTI have only just put a message on the NS to tell us if a device has confirmed a downlink. Up until now we had no indication at all that one had been confirmed.

Hi Nick,

Thank you. I executed the command you provided and the repeating downlink stopped … phew … :pray:

I will check my current LMIC version and upgrade it to 3.2.0 and test again.

Thank you very much.

Please use 3.3.0

I was observing that it has only been since 3.2.0 that LMIC has been largely MAC command compliant.

Is your device sleeping between uplinks? If so, does it go to ESP low power mode which forgets about the state of the LMIC stack with the result of the device not acknowledging the downlink?

That’s certainly a consideration - but as LMIC should ack a confirmed downlink in the same Tx request package, it’s likely that it just wasn’t compliant.

IIRC any ack should be in the next uplink. I don’t think the specification states an immediate uplink is required but I might be wrong. So if LMIC state is lost due to sleep before the next uplink it will not send the required ack.

Anyway, as we once again saw, confirmed downlinks are a source of issues, so it is better to avoid them. As you stated, use application-level logic instead to ‘confirm’ receipt of new settings. That way you can also indicate that the settings are valid, or signal that the device couldn’t apply them due to invalid values.
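As a rough illustration of application-level confirmation, the device can validate a new setting received in an unconfirmed downlink and report the outcome in its next regular uplink. Everything specific here is an assumption made up for the example: the payload layout (a 2-byte big-endian interval in minutes on the downlink, a 1-byte status followed by the interval in force on the uplink), the limits, and all of the names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes reported back to the application. */
enum { STATUS_APPLIED = 0x00, STATUS_BAD_LENGTH = 0x01,
       STATUS_OUT_OF_RANGE = 0x02 };

/* Hypothetical limits for the transmit interval, in minutes. */
enum { MIN_INTERVAL_MIN = 1, MAX_INTERVAL_MIN = 1440 };

static uint16_t tx_interval_min = 30; /* interval currently in force */

/* Validate and apply a downlink carrying a new interval.
 * Returns one of the STATUS_* codes; the setting only changes on success. */
uint8_t apply_interval_downlink(const uint8_t *payload, size_t len)
{
    if (payload == NULL || len != 2) {
        return STATUS_BAD_LENGTH;
    }
    uint16_t requested = (uint16_t)((payload[0] << 8) | payload[1]);
    if (requested < MIN_INTERVAL_MIN || requested > MAX_INTERVAL_MIN) {
        return STATUS_OUT_OF_RANGE;
    }
    tx_interval_min = requested;
    return STATUS_APPLIED;
}

/* Build the payload for the next (unconfirmed) uplink: status byte plus
 * the interval actually in force. Returns bytes written, 0 on error. */
size_t build_status_uplink(uint8_t status, uint8_t *out, size_t out_len)
{
    if (out == NULL || out_len < 3) {
        return 0;
    }
    out[0] = status;
    out[1] = (uint8_t)(tx_interval_min >> 8);
    out[2] = (uint8_t)(tx_interval_min & 0xFF);
    return 3;
}
```

The backend can then resend the setting if the reported interval does not match what it asked for, without the network server having to retry anything.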

Very probably - so many moving parts - just admiring the Beelan source code which suggested it had added MAC command processing whereas in fact it’s still only a PR and only partially implemented - whilst I’m on a roll I can have a peek at LMIC to see what it thinks it’s doing & when.

@kersing, transpires that LMIC does trigger an uplink to confirm a downlink in v4. Took a while to trace so I’ll not do that with older versions.

I have created an ATmega4808 + RFM95 device specifically to trace MAC commands from TTS as they do appear to be numerous and I can put plenty of debug in to LMIC to see what’s going on, but printf only works (easily) on AVR, not SAMD.

Don’t know how the OP got into a loop, but my client found the loop-da-loop by sending a confirmed reboot downlink. With the MKR WAN, which you know I love, you can’t be sure you’ve done a Tx. So if you get a confirmed downlink requesting it, you’d then have to send a confirmed uplink to clear the confirmed downlink, with the uplink being confirmed so you know that the downlink was confirmed (except that the MKR WAN doesn’t tell you an uplink was confirmed in current firmware). We’ll end up with two Tx’s from the gateway in short order. What a mess!

Hi @kersing, for this particular type of device there is no sleep time. I will be careful with devices that do go into deep sleep. Thank you.

@descartes Thanks Nick. I upgraded my LMIC lib to 3.3.0. I am impressed you remembered the DELAY_DNW1 :grinning:
enum { DELAY_DNW1 = 1 };

Tested downlink with the LAN home-broadband connection and it responds as expected.

DELAY_DNW1=1, RX1 Delay=1: tested downlink with the V3 gateway on the GSM cellular network connection and it responds as expected (downlinks are sometimes missed).

DELAY_DNW1=5, RX1 Delay=5: tested downlink with the V3 gateway on the GSM cellular network connection and it missed most of the downlinks.

Nick, just to double confirm:

  1. Should the u1_t confirmed parameter be set to 1 for the uplink transmission that follows a confirmed downlink?

  2. Should I stick with LMIC lib version 3.3.0 and not 4.0.0?

Thank you very much.