LMiC + lora_pkt_fwd: unable to receive downlinks

Hello everyone!
I’m using an STM32F411 MCU with an SX1272 radio as a LoRaWAN node with this LMiC library. I had to port it to the F411 (changing some registers, pins, ISR names, etc.), but it works: I can see the packets sent by my node in the TTN console.

But after a few days of debugging I’m still not able to receive any downlink messages. Here is the piece of code responsible for printing out downlinks (in the onEvent function):

case EV_TXCOMPLETE:
    // Check if we have a downlink on either Rx1 or Rx2 windows
    if( ( LMIC.txrxFlags & ( TXRX_DNW1 | TXRX_DNW2 ) ) != 0 )
    {
        debug_str( "Received downlink\r\n");
        if( LMIC.dataLen != 0 )
        { // data received in rx slot after tx
            debug_buf( LMIC.frame + LMIC.dataBeg, LMIC.dataLen );
        }
    }
    if( ( LMIC.txrxFlags & TXRX_ACK ) != 0 )
        debug_str( "Received ACK\r\n" );
    break;

The above code is based on downlink examples I was able to find, so I think there’s nothing wrong with it.

Here is a piece of the node’s logs. The STM32 LMiC library I used didn’t implement any debug output, so I added it myself based on the Arduino LMiC library; to keep it simple, all numeric values are printed in hexadecimal:

02D9BA84: engineUpdate, opmode = 0x00
02E13A85: engineUpdate, opmode = 0x08
02E13AD7: Uplink data pending
02E13B20: Airtime available at 02D9C352 (channel duty limit)
02E13BA3: Ready for uplink
02E13C1B: TXMODE, freq = 33BE27A0, len = 19, SF = 07, BW = 0000007D, CR = 4/05, IH 00000000
02E1B59B: RXMODE_SINGLE, freq = 33BE27A0, SF = 07, BW = 0000007D, CR = 4/05, IH = 00000000
02E1C960: RADIO RX TIMEOUT 
02E1C9AF: RXMODE_SINGLE, freq = 33D3E608, SF = 0C, BW = 0000007D, CR = 4/05, IH = 00000000
02E1ECFB: RADIO RX TIMEOUT 
TXCOMPLETE
02E43B98: engineUpdate, opmode = 0x00
02EBBB99: engineUpdate, opmode = 0x08
02EBBBEA: Uplink data pending
02EBBC34: Airtime available at 02E450AE (channel duty limit)
02EBBCB7: Ready for uplink
02EBBD2F: TXMODE, freq = 33BE27A0, len = 19, SF = 07, BW = 0000007D, CR = 4/05, IH 00000000
02EC36AF: RXMODE_SINGLE, freq = 33BE27A0, SF = 07, BW = 0000007D, CR = 4/05, IH = 00000000
02EC4A75: RADIO RX TIMEOUT 
02EC4AC5: RXMODE_SINGLE, freq = 33D3E608, SF = 0C, BW = 0000007D, CR = 4/05, IH = 00000000
02EC6E11: RADIO RX TIMEOUT 
TXCOMPLETE
02EE40E3: engineUpdate, opmode = 0x00
02F5C0E4: engineUpdate, opmode = 0x08
02F5C135: Uplink data pending
02F5C17F: Airtime available at 02EED1C2 (channel duty limit)
02F5C202: Ready for uplink
02F5C27A: TXMODE, freq = 33BE27A0, len = 19, SF = 07, BW = 0000007D, CR = 4/05, IH 00000000
02F63BFA: RXMODE_SINGLE, freq = 33BE27A0, SF = 07, BW = 0000007D, CR = 4/05, IH = 00000000
02F64FC0: RADIO RX TIMEOUT 
02F65010: RXMODE_SINGLE, freq = 33D3E608, SF = 0C, BW = 0000007D, CR = 4/05, IH = 00000000
02F6735F: RADIO RX TIMEOUT 
TXCOMPLETE

We can see that the node sends uplink messages on 868.1 MHz (I’ve disabled all the other channels for debugging purposes; enabling them doesn’t solve the problem anyway) with SF7BW125, CR 4/5, then listens in the RX1 window with the same settings and in the RX2 window on 869.525 MHz, SF12BW125, CR 4/5. The RX1 window timing is correct (my OSTICKS_PER_SEC is 32768): it opens about 1 second after TX. There’s something wrong with the second one: it should open after about 2 seconds, but it seems to open directly after RX1. I don’t think that’s the cause, since I’m aiming for the RX1 window anyway.
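For anyone who wants to check these timings against the hexadecimal log, here is a minimal sketch of the tick arithmetic. It assumes the logged timestamps are os_getTime() values in ticks and that the timer really runs at 32768 ticks per second; the helper name ticks_to_ms is made up for this example.

```c
#include <stdint.h>

/* Assumed tick rate, taken from the post (OSTICKS_PER_SEC = 32768). */
#define TICKS_PER_SEC 32768u

/* Hypothetical helper: elapsed whole milliseconds between two tick
 * timestamps. Unsigned subtraction keeps the result correct even if the
 * 32-bit tick counter wraps between the two samples. */
static uint32_t ticks_to_ms(uint32_t start, uint32_t end)
{
    uint32_t delta = end - start;
    return (uint32_t)(((uint64_t)delta * 1000u) / TICKS_PER_SEC);
}
```

Applied to the first cycle in the log, ticks_to_ms(0x02E13C1B, 0x02E1B59B) gives 949 ms from the TXMODE line to the first RXMODE_SINGLE line, and ticks_to_ms(0x02E13C1B, 0x02E1C9AF) gives 1106 ms to the second one. Both figures are only as good as the assumed tick rate.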

Let’s take a look at the gateway’s logs then (lora_pkt_fwd by Semtech):

INFO: Received pkt from mote: 26011D57 (fcnt=101)
JSON up: {"rxpk":[{"tmst":4165619523,"chan":0,"rfch":1,"freq":868.100000,"stat":1,"modu":"LORA","datr":"SF7BW125",
"codr":"4/5","lsnr":10.2,"rssi":-62,"size":25,"data":"QFcdASYAZQABbA12gx8NtS3LBjDWdvABLQ=="}]}
INFO: [down] PULL_RESP received  - tokem[169:243] :)
JSON down: {"txpk":{"imme":false,"tmst":4166619523,"freq":868.1,"rfch":0,"powe":14,"modu":"LORA","datr":"SF7BW125",
"codr":"4/5","ipol":true,"size":14,"ncrc":true,"data":"YFcdASYABAABmg5Gbr0="}} 
INFO: tx_start_delay=1495 (1495.500000) - (1497, bw_delay=1.500000, notch_delay=0.000000)
INFO: [down] PULL_ACK received in 101 ms
##### 2018-08-10 17:08:06 GMT #####
### [UPSTREAM] ###
# RF packets received by concentrator: 2
# CRC_OK: 100.00%, CRC_FAIL: 0.00%, NO_CRC: 0.00%
# RF packets forwarded: 2 (50 bytes)
# PUSH_DATA datagrams sent: 3 (533 bytes)
# PUSH_DATA acknowledged: 0.00%
### [DOWNSTREAM] ###
# PULL_DATA sent: 3 (100.00% acknowledged)
# PULL_RESP(onse) datagrams received: 2 (380 bytes)
# RF packets sent to concentrator: 2 (33 bytes)
# TX errors: 0
# TX rejected (collision packet): 0.00% (req:5, rej:0)
# TX rejected (collision beacon): 0.00% (req:5, rej:0)
# TX rejected (too late): 0.00% (req:5, rej:0)
# TX rejected (too early): 0.00% (req:5, rej:0)
# BEACON queued: 0
# BEACON sent so far: 0
# BEACON rejected: 0
### [JIT] ###
# SX1301 time (PPS): 4108975322
src/jitqueue.c:471:jit_print_queue(): INFO: [jit] queue is empty
### [GPS] ###
# GPS sync is disabled
##### END #####  

As we can see, the gateway receives the packet sent by the node and attempts to send the downlink with the same channel settings one second after receiving the uplink packet. I don’t see any errors here either: the downlink should fit into the node’s first RX window.

Things I’ve looked into so far:

LMIC_setClockError - the LMiC library I use didn’t have this function, so I implemented it myself based on the Arduino LMiC library. I was able to confirm that:

  • without setting clock error RX1 starts ~1sec after TX and lasts for ~6ms
  • with setting clock error at 1% RX1 starts ~1sec after TX and lasts for ~20ms
  • with setting clock error at 10% RX1 starts ~1sec after TX and lasts for ~160ms

but even with 10% the downlinks are still not being received.
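As a rough illustration of what a clock error setting is meant to achieve, here is a minimal sketch, not the library’s actual code. It assumes the arduino-lmic convention of expressing the error as a fraction of 65536 (MAX_CLOCK_ERROR there, redefined below so the example stands alone); both helper names are hypothetical.

```c
#include <stdint.h>

/* Full-scale clock error value. This mirrors MAX_CLOCK_ERROR in
 * arduino-lmic and is redefined here as an assumption. */
#define SKETCH_MAX_CLOCK_ERROR 65536u

/* Hypothetical helper: convert a percentage to the fixed-point error. */
static uint32_t clock_error_from_percent(uint32_t percent)
{
    return SKETCH_MAX_CLOCK_ERROR * percent / 100u;
}

/* Hypothetical helper: worst-case drift, in ticks, that can accumulate
 * over `elapsed` ticks at the given clock error. */
static uint32_t drift_ticks(uint32_t elapsed, uint32_t clock_error)
{
    return (uint32_t)(((uint64_t)elapsed * clock_error) / SKETCH_MAX_CLOCK_ERROR);
}
```

With 32768 ticks per second and a 1 second RX1 delay, 1% yields a drift of 327 ticks, about 10 ms. A window that tolerates this drift would open that much earlier and stay open about twice that much longer, because the clock may be fast or slow.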

IQ inversion - I’ve confirmed that the node is listening correctly with inverted IQ.

Does anyone here have any suggestions about where else I can look for possible solutions?

If this is EU868, then the RX2 settings are incorrect for TTN; they should use SF 9 (0x09), not SF 12 (0x0C). But indeed, the gateway log shows the downlink is intended for RX1 (the downlink tmst is 1,000,000 microseconds after the uplink tmst), so the above shouldn’t be your current problem.
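The tmst comparison can be sketched as follows. This assumes the default data delays of 1 second for RX1 and 2 seconds for RX2 (join accepts use different delays) and that tmst is the gateway’s 32-bit microsecond counter; the enum and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical result type for this sketch. */
typedef enum { SLOT_RX1, SLOT_RX2, SLOT_OTHER } rx_slot_t;

/* Classify a scheduled downlink by its offset from the uplink.
 * Unsigned subtraction handles a wrap of the 32-bit counter. */
static rx_slot_t classify_downlink(uint32_t up_tmst, uint32_t down_tmst)
{
    uint32_t delta_us = down_tmst - up_tmst;

    if (delta_us == 1000000u) {
        return SLOT_RX1;   /* assumed RX1 delay: 1 s */
    }
    if (delta_us == 2000000u) {
        return SLOT_RX2;   /* assumed RX2 delay: 2 s */
    }
    return SLOT_OTHER;
}
```

For the log above, classify_downlink(4165619523u, 4166619523u) returns SLOT_RX1.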

Maybe your LMIC_setClockError should also ensure the RX window is started a bit earlier?

Thanks for your answer!

I’ve now implemented printing the timestamps and settings in decimal. I’ve also changed the RX2 SF to 9 as you suggested and enabled the other channels.

Here is the typical debug output for different clock error values:

  • no clock error (0%)
23239: Ready for uplink
23242: TXMODE, freq = 867500000, len = 19, SF = 7, BW = 125, CR = 4/5, IH = 0
24317: RXMODE_SINGLE, freq = 867500000, SF = 7, BW = 125, CR = 4/5, IH = 0
24322: RADIO RX TIMEOUT 
24324: RXMODE_SINGLE, freq = 869525000, SF = 9, BW = 125, CR = 4/5, IH = 0
24341: RADIO RX TIMEOUT 
  • 1% clock error
63380: Ready for uplink
63384: TXMODE, freq = 867100000, len = 19, SF = 7, BW = 125, CR = 4/5, IH = 0
64448: RXMODE_SINGLE, freq = 867100000, SF = 7, BW = 125, CR = 4/5, IH = 0
64467: RADIO RX TIMEOUT 
64470: RXMODE_SINGLE, freq = 869525000, SF = 9, BW = 125, CR = 4/5, IH = 0
64499: RADIO RX TIMEOUT 
  • 10% clock error
124393: Ready for uplink
124396: TXMODE, freq = 867700000, len = 19, SF = 7, BW = 125, CR = 4/5, IH = 0
125368: RXMODE_SINGLE, freq = 867700000, SF = 7, BW = 125, CR = 4/5, IH = 0
125527: RADIO RX TIMEOUT 
125529: RXMODE_SINGLE, freq = 869525000, SF = 9, BW = 125, CR = 4/5, IH = 0
125700: RADIO RX TIMEOUT 
  • 30% clock error
86412: Ready for uplink
86415: TXMODE, freq = 867300000, len = 19, SF = 7, BW = 125, CR = 4/5, IH = 0
87356: RXMODE_SINGLE, freq = 867300000, SF = 7, BW = 125, CR = 4/5, IH = 0
87561: RADIO RX TIMEOUT 
87564: RXMODE_SINGLE, freq = 869525000, SF = 9, BW = 125, CR = 4/5, IH = 0
88049: RADIO RX TIMEOUT 
  • 50% clock error
65649: Ready for uplink
65653: TXMODE, freq = 867100000, len = 19, SF = 7, BW = 125, CR = 4/5, IH = 0
66594: RXMODE_SINGLE, freq = 867100000, SF = 7, BW = 125, CR = 4/5, IH = 0
66799: RADIO RX TIMEOUT 
66801: RXMODE_SINGLE, freq = 869525000, SF = 9, BW = 125, CR = 4/5, IH = 0
67601: RADIO RX TIMEOUT 

We can see that for 0% and 1%, the RX1 window starts 1 second and some milliseconds after the TXMODE message. For 10%, 30% and 50% it starts less than 1 second after TXMODE and lasts for about 200ms. I think these values should provide enough timing “flexibility” for the node to correctly catch the downlink, but there’s still no success :frowning:

What’s actually printing the RADIO RX TIMEOUT message? Is this related to EV_SCAN_TIMEOUT?

I cannot find the message in the library you linked to. Could it indicate it failed to switch the radio to reception mode (rather than not receiving anything)? (The wiring might need to be validated if that’s the case.)

Yeah, this version of LMiC didn’t have any debugging output, so I added most of it myself.
The RADIO RX TIMEOUT is my own message (it isn’t present in the “standard” arduino_lmic library). I added it to know the exact os_time at which the RX window ends.
It is printed in radio_irq_handler inside radio.c:

        } else if( flags & IRQ_LORA_RXTOUT_MASK ) {
            // indicate timeout
            LMIC.dataLen = 0;
#if LMIC_DEBUG_LEVEL > 0
            debug_time(os_getTime());
            debug_str(": RADIO RX TIMEOUT \r\n");
#endif
        }

EDIT: I was also able to confirm that the gateway does in fact send downlink messages correctly. I did this by writing simple LoRa receiver firmware (pure physical LoRa without LoRaWAN) that prints out received packets. It managed to receive a packet with inverted IQ from the gateway, but the LMiC node still didn’t receive anything.

I’m sure you have considered all of these, but just in case, I guess there are a few possible causes of failure:

  1. Gateway transmitting using the wrong settings. (Log seems okay.)
  2. Gateway not actually transmitting at all. (Any other node to test with? Like an OTAA node.)
  3. Gateway too close to the node.
  4. Node not toggling between TX and RX mode. (Wiring?)
  5. Node not receiving. (Antenna connected to RX path? RX simply broken?)
  6. Node using the wrong settings. (Log seems okay.)
  7. Timing. (Seems you tested all.)

In case you missed it: https://github.com/matthijskooijman/arduino-lmic offers a great README that may help.

  1. I edited my previous post just before you answered, so you might not have noticed it:

    This receiver’s reception settings are set to what the gateway log says - freq 868.1, SF7BW125, codr 4/5 and it receives the downlink packet successfully so we can confirm that the gateway’s log doesn’t lie in this matter.

  2. As above. I’ve also written OTAA firmware for this node (using the very same LMiC) but, as could be predicted, it didn’t work because the node was unable to receive downlinks with session settings.

  3. The gateway is in fact only a few meters away from the node, so I’ll try to move it some distance away in the next few days.

  4. Wiring should be okay because I’m using Multitech’s mDot, which is an STM32F411 hardwired to an SX1272. I have all pins inside the HAL set according to the mDot’s datasheet:

    // output lines
    #define NSS_PORT           1 // NSS: PB12, sx1272
    #define NSS_PIN            12
    
    #define TX_PORT            2 // TX:  PC3
    #define TX_PIN             3
    #define RX_PORT            2 // RX:  PC2
    #define RX_PIN             2
    #define RST_PORT           2 // RST: PC0
    #define RST_PIN            0
    
    // input lines
    #define DIO0_PORT          1 // DIO0: PB5  (EXTI line 5, shared EXTI9_5 irq handler)
    #define DIO0_PIN           5
    #define DIO1_PORT          1 // DIO1: PB6  (EXTI line 6, shared EXTI9_5 irq handler)
    #define DIO1_PIN           6
    #define DIO2_PORT          1 // DIO2: PB7  (EXTI line 7, shared EXTI9_5 irq handler)
    #define DIO2_PIN           7
    
  5. I’ll try with another mDot. Maybe this one is broken in some way.

  6. Yeah, I’ll spend some time debugging to make sure that the correct parameters are written to the SX1272’s registers.

  7. I could also try with the second downlink window. According to the LMiC readme you have linked (thanks!):

    But is there any way to “force” TTN to schedule downlinks in the RX2 window? I’ve been working on this issue for over a week now, and only a few of the downlinks I sent through the TTN console were scheduled for RX2.

I think, but have not verified, that TTN might use RX2 when the uplink uses SF12. In that case, TTN’s SF9 in RX2 needs less airtime than RX1, because RX1 uses the same SF as the uplink and hence would need SF12. (And SF12 needs twice the airtime of SF11, which needs twice the airtime of SF10, etc.) If true, then it might also prefer RX2 for SF11 and maybe SF10. See also:
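To put numbers on the airtime argument, here is a minimal sketch of the time-on-air formula from the Semtech SX1272 datasheet, written with integer arithmetic. The function name and parameter layout are made up for this example; it assumes an 8-symbol preamble and enables low data rate optimisation whenever a symbol lasts longer than 16 ms.

```c
#include <stdint.h>

/* Hypothetical helper: LoRa time on air in microseconds.
 * cr is 1..4 for coding rates 4/5..4/8. */
static uint32_t lora_airtime_us(unsigned sf, uint32_t bw_hz,
                                unsigned payload_len, unsigned cr,
                                int crc_on, int implicit_header)
{
    /* Symbol time: 2^SF / BW, in microseconds. */
    uint32_t tsym_us = (uint32_t)(((uint64_t)1000000u << sf) / bw_hz);
    /* Low data rate optimisation (SF11 and SF12 at 125 kHz). */
    int de = (tsym_us > 16000u) ? 1 : 0;

    int32_t num = 8 * (int32_t)payload_len - 4 * (int32_t)sf + 28
                + (crc_on ? 16 : 0) - (implicit_header ? 20 : 0);
    int32_t den = 4 * ((int32_t)sf - 2 * de);
    int32_t extra = 0;
    if (num > 0) {
        /* Integer ceiling of num/den, times (CR + 4). */
        extra = ((num + den - 1) / den) * (int32_t)(cr + 4u);
    }
    uint32_t payload_syms = 8u + (uint32_t)extra;

    /* Preamble: 8 programmed symbols + 4.25 sync symbols = 49/4 symbols. */
    return (49u * tsym_us) / 4u + payload_syms * tsym_us;
}
```

For a 14-byte downlink without CRC at 125 kHz and CR 4/5 (the size shown in the gateway log), this gives about 41 ms at SF7, about 144 ms at SF9 and about 1155 ms at SF12, so an SF9 downlink in RX2 occupies the channel for roughly an eighth of the time an SF12 downlink in RX1 would.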

(Full discussion in the outdated Why is RX_SLOT 1 not used?)

Finally some success!

I changed the uplink datarate settings to SF12:
LMIC_setDrTxpow( DR_SF12, 14 );
You were right - this setting caused the TTN to schedule downlinks to RX2. Then I did tests with different setClockError values and voila! - at 30% I am consistently able to receive downlinks on RX2.

Thanks to that we are able to confirm that the wiring, hardware settings and the radio are all good.
Unfortunately, RX1 still doesn’t work. With the hardware and software confirmed OK, this suggests that the cause is purely a timing issue (maybe due to the gateway being too close, as you suggested).
I think I should dig into the LMiC functions that handle reception timing to make them a little more tolerant, since setting the clock error even to 100% doesn’t help.

EDIT (since I don’t want to double post)
Okay, RX1 windows are now working without problems at 1% clock error. It turns out I had overlooked the OSTICKS_PER_SEC setting in oslmic.h: it was set to a slightly too large value. The wrong setting was almost unnoticeable to a human during debugging (1 second lasted around 1.1 s) but was big enough to cause RX window timing issues.
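For anyone hitting the same thing, the effect of a wrong tick rate on the RX delays can be sketched like this. The helper name and the example tick rates are hypothetical (the actual wrong value was not posted); the idea is only that the firmware counts ticks at the configured rate while the timer runs at its real rate.

```c
#include <stdint.h>

/* Hypothetical helper: how many milliseconds late a delay of `delay_ms`
 * fires when the firmware assumes `configured_hz` ticks per second but the
 * timer really runs at `actual_hz`. A negative result means early. */
static int32_t delay_error_ms(uint32_t delay_ms, uint32_t configured_hz,
                              uint32_t actual_hz)
{
    /* Ticks the firmware waits for, based on the configured rate. */
    uint64_t ticks = (uint64_t)delay_ms * configured_hz / 1000u;
    /* Wall-clock time those ticks really take. */
    uint64_t real_ms = ticks * 1000u / actual_hz;
    return (int32_t)((int64_t)real_ms - (int64_t)delay_ms);
}
```

With a configured rate about 10% above a real 32768 Hz (say 36045, an invented figure), a 1 second RX1 delay fires about 100 ms late and a 2 second RX2 delay about 200 ms late, which is longer than an entire short SF7 downlink.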

@arjanvanb Thank you very much for your insight. I don’t think I would notice this error without your suggestions :smiley:

Nice find!

…but would not be a huge problem for SF12, where everything takes much, much longer anyhow, so despite starting a tad too late the LoRa chip still was able to detect the preamble it was looking for when waiting for a downlink. (It doesn’t need the full preamble; this fact is also used for some single-channel gateways that support multiple spreading factors; see Switch bandwidth and SF of node while gateway doesn’t?)
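A rough way to see why lateness hurts low spreading factors more: count how many preamble symbols are still to come when the receiver starts listening late. This is a simplified sketch with a made-up helper name; it ignores the sync symbols after the preamble and says nothing about how many symbols the chip actually needs for detection.

```c
#include <stdint.h>

/* Hypothetical helper: whole preamble symbols still to come when the
 * receiver starts listening `late_us` microseconds after the transmitter
 * began sending a preamble of `n_preamble` symbols. */
static uint32_t preamble_symbols_left(unsigned sf, uint32_t bw_hz,
                                      uint32_t n_preamble, uint32_t late_us)
{
    /* Symbol time: 2^SF / BW, in microseconds. */
    uint32_t tsym_us = (uint32_t)(((uint64_t)1000000u << sf) / bw_hz);
    /* Round up: a partly missed symbol counts as lost. */
    uint32_t missed = (late_us + tsym_us - 1u) / tsym_us;
    return (missed >= n_preamble) ? 0u : n_preamble - missed;
}
```

Assuming an 8-symbol preamble at 125 kHz and a receiver that is 100 ms late, SF12 still has 4 whole preamble symbols left, while at SF7 the preamble is long gone.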

I have exactly the same problem. Can you help with that?

Thanks for your time.

Regards.

A post was split to a new topic: Raspberry Pi node with LMIC - Unable to receive downlinks