First uplink fine, second fails with "Drop uplink message - Device not found"

Hi there! I need some help with my current setup.
Information:
Using The Things Stack v3.8.6
My node is using the LMIC library: arduino-lmic
I have a private iC880a-based gateway with a RPi.

From the console, the gateway is receiving an uplink message:
(The problem persists with a confirmed or unconfirmed uplink.)

`{
  "@type": "type.googleapis.com/ttn.lorawan.v3.UplinkMessage",
  "raw_payload": "QEyAgQAAAAAKaat2QnElaPq6C7R63kB7p1YmpXlRSQ==",
  "payload": {
    "m_hdr": {
      "m_type": "UNCONFIRMED_UP"
    },
    "mic": "pXlRSQ==",
    "mac_payload": {
      "f_hdr": {
        "dev_addr": "0081804C",
        "f_ctrl": {}
      },
      "f_port": 10,
      "frm_payload": "aat2QnElaPq6C7R63kB7p1Ym"
    }
  },
  "settings": {
    "data_rate": {
      "lora": {
        "bandwidth": 125000,
        "spreading_factor": 11
      }
    },
    "coding_rate": "4/5",
    "frequency": "868100000",
    "timestamp": 1222956060,
    "time": "2020-07-31T08:20:58.120264Z"
  },
  "rx_metadata": [
    {
      "gateway_ids": {
        "gateway_id": "ic880a",
        "eui": "B827EBFFFE09CD2D"
      },
      "time": "2020-07-31T08:20:58.120264Z",
      "timestamp": 1222956060,
      "rssi": -66,
      "channel_rssi": -66,
      "snr": 13.8,
      "uplink_token": "ChQKEgoGaWM4ODBhEgi4J+v//gnNLRCcqJPHBA=="
    }
  ],
  "received_at": "2020-07-31T08:20:58.141421195Z",
  "correlation_ids": [
    "gs:conn:01EEDJAZSV3E333PM7ZX209EGZ",
    "gs:uplink:01EEHZBFMXCJVKJJNFJ10KR737"
  ]
}
`

Immediately after, the message is dropped and is never forwarded to the application:

{
  "@type": "type.googleapis.com/ttn.lorawan.v3.ErrorDetails",
  "namespace": "pkg/gatewayserver",
  "name": "host_handle",
  "message_format": "host `{host}` failed to handle message",
  "attributes": {
    "host": "cluster"
  },
  "cause": {
    "namespace": "pkg/networkserver",
    "name": "device_not_found",
    "message_format": "device not found",
    "correlation_id": "63ea98f2348d4f97b103fb25b42a7ce1",
    "code": 5
  },
  "code": 5
}


Regarding the node, I joined the network with OTAA.
Whenever I force the node to join again and get new keys and a new address, the first message after the join is not dropped. The node is an ESP32 and goes into deep sleep after every message sent. The OTAA session info is saved in non-volatile memory. I thought that might be the problem, but I checked, and the keys and address are being saved and read back correctly.
Changing the spreading factor does not change this behaviour.
ADR is disabled.

I’m not sure how to interpret the error message or how to go about fixing this. Any help is appreciated.

You’ll need to save much more. Your raw payload shows the uplink is using frame counter 0 for its 2nd uplink. That smells. Of course, if that’s what is causing this, then the error message is not very helpful…
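
For anyone who wants to check the frame counter themselves: the frame header of a data uplink is not encrypted, so it can be read straight from the raw_payload shown in the console. Here is a minimal sketch in Python, assuming a LoRaWAN 1.0.x data uplink; the helper name parse_uplink_header is made up for illustration and is not part of any library:

```python
import base64


def parse_uplink_header(raw_payload_b64):
    """Decode the unencrypted header fields of a LoRaWAN 1.0.x data uplink.

    Layout assumed: MHDR (1) | DevAddr (4, little-endian) | FCtrl (1) |
    FCnt (2, little-endian) | FOpts (0..15) | FPort (1) | FRMPayload | MIC (4)
    """
    phy = base64.b64decode(raw_payload_b64)
    mhdr = phy[0]
    f_ctrl = phy[5]
    f_opts_len = f_ctrl & 0x0F  # number of piggybacked MAC command bytes
    return {
        "m_type": mhdr >> 5,  # 2 = unconfirmed up, 4 = confirmed up
        "dev_addr": phy[1:5][::-1].hex().upper(),
        "f_cnt": int.from_bytes(phy[6:8], "little"),  # low 16 bits only
        "f_port": phy[8 + f_opts_len],
    }


# raw_payload copied from the dropped uplink in the first post
header = parse_uplink_header("QEyAgQAAAAAKaat2QnElaPq6C7R63kB7p1YmpXlRSQ==")
print(header)
# {'m_type': 2, 'dev_addr': '0081804C', 'f_cnt': 0, 'f_port': 10}
```

The console omits zero-valued fields, which is why "f_cnt" does not show up in the decoded JSON of that uplink at all.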

For a debugging strategy, have you tried:

  • ABP?
  • Not going to sleep (and not saving / reloading)?

These should help you focus on fixing the correct thing.

Thank you for the input. I did follow the link you shared and now it is working as intended with unconfirmed uplinks :slight_smile:
And then I am using this logic as well for restoring from RTC:

However, confirmed uplinks are another story. It seems to be a problem in the LMIC library itself.
When a packet is sent and the downlink ack is not received, the packet is sent over and over again, and these subsequent packets are getting dropped. I am using the function LMIC_setTxData2 to send the packet.
First confirmed message:

{
  "@type": "type.googleapis.com/ttn.lorawan.v3.UplinkMessage",
  "raw_payload": "gEHg6AEAAwAKjpcC5tjHw31PqNSRwsQWEnTGIVGo3w==",
  "payload": {
    "m_hdr": {
      "m_type": "CONFIRMED_UP"
    },
    "mic": "IVGo3w==",
    "mac_payload": {
      "f_hdr": {
        "dev_addr": "01E8E041",
        "f_ctrl": {},
        "f_cnt": 3
      },
      "f_port": 10,
      "frm_payload": "jpcC5tjHw31PqNSRwsQWEnTG"
    }
  },
  "settings": {
    "data_rate": {
      "lora": {
        "bandwidth": 125000,
        "spreading_factor": 7
      }
    },
    "coding_rate": "4/5",
    "frequency": "867100000",
    "timestamp": 4070255403,
    "time": "2020-07-31T13:54:45.274122Z"
  },
  "rx_metadata": [
    {
      "gateway_ids": {
        "gateway_id": "ic880a",
        "eui": "B827EBFFFE09CD2D"
      },
      "time": "2020-07-31T13:54:45.274122Z",
      "timestamp": 4070255403,
      "rssi": -58,
      "channel_rssi": -58,
      "snr": 9.2,
      "uplink_token": "ChQKEgoGaWM4ODBhEgi4J+v//gnNLRCr1uyUDw==",
      "channel_index": 3
    }
  ],
  "received_at": "2020-07-31T13:54:45.294234522Z",
  "correlation_ids": [
    "gs:conn:01EEDJAZSV3E333PM7ZX209EGZ",
    "gs:uplink:01EEJJENDEKDGYJ36JVCHN706V"
  ]
}

The ack was sent by the gateway but the node did not receive it.
The second message is then sent after 8 seconds:

{
  "@type": "type.googleapis.com/ttn.lorawan.v3.UplinkMessage",
  "raw_payload": "gEHg6AEAAwAKjpcC5tjHw31PqNSRwsQWEnTGIVGo3w==",
  "payload": {
    "m_hdr": {
      "m_type": "CONFIRMED_UP"
    },
    "mic": "IVGo3w==",
    "mac_payload": {
      "f_hdr": {
        "dev_addr": "01E8E041",
        "f_ctrl": {},
        "f_cnt": 3
      },
      "f_port": 10,
      "frm_payload": "jpcC5tjHw31PqNSRwsQWEnTG"
    }
  },
  "settings": {
    "data_rate": {
      "lora": {
        "bandwidth": 125000,
        "spreading_factor": 7
      }
    },
    "coding_rate": "4/5",
    "frequency": "867300000",
    "timestamp": 4078402107,
    "time": "2020-07-31T13:54:53.422871Z"
  },
  "rx_metadata": [
    {
      "gateway_ids": {
        "gateway_id": "ic880a",
        "eui": "B827EBFFFE09CD2D"
      },
      "time": "2020-07-31T13:54:53.422871Z",
      "timestamp": 4078402107,
      "rssi": -54,
      "channel_rssi": -54,
      "snr": 9.5,
      "uplink_token": "ChQKEgoGaWM4ODBhEgi4J+v//gnNLRC79N2YDw==",
      "channel_index": 4
    }
  ],
  "received_at": "2020-07-31T13:54:53.446731572Z",
  "correlation_ids": [
    "gs:conn:01EEDJAZSV3E333PM7ZX209EGZ",
    "gs:uplink:01EEJJEXC70575KTASA1QQ3Q85"
  ]
}

And then the drop occurs:

{
  "@type": "type.googleapis.com/ttn.lorawan.v3.ErrorDetails",
  "namespace": "pkg/gatewayserver",
  "name": "host_handle",
  "message_format": "host `{host}` failed to handle message",
  "attributes": {
    "host": "cluster"
  },
  "cause": {
    "namespace": "pkg/networkserver",
    "name": "device_not_found",
    "message_format": "device not found",
    "correlation_id": "29ffa4cc9bdc446aba7323cb8c231693",
    "code": 5
  },
  "code": 5
}

The raw payload is the same, same frame counter… So it gives the same error.
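
This can be checked directly from the two uplinks above. A small sketch, assuming the console's "timestamp" field is the gateway's 32-bit microsecond counter; the variable and function names are made up for illustration:

```python
import base64

# Values copied from the two confirmed uplinks shown above
first = {
    "raw_payload": "gEHg6AEAAwAKjpcC5tjHw31PqNSRwsQWEnTGIVGo3w==",
    "timestamp": 4070255403,
}
second = {
    "raw_payload": "gEHg6AEAAwAKjpcC5tjHw31PqNSRwsQWEnTGIVGo3w==",
    "timestamp": 4078402107,
}


def is_retransmission(a, b):
    """A resent frame is byte-for-byte identical: same FCnt, payload and MIC."""
    return base64.b64decode(a["raw_payload"]) == base64.b64decode(b["raw_payload"])


def f_cnt(uplink):
    """Low 16 bits of the frame counter (bytes 6..7 of the frame, little-endian)."""
    return int.from_bytes(base64.b64decode(uplink["raw_payload"])[6:8], "little")


# Modulo 2**32 so the difference stays correct if the counter wraps around
gap_us = (second["timestamp"] - first["timestamp"]) % 2**32

print(is_retransmission(first, second), f_cnt(first), f_cnt(second), gap_us / 1e6)
```

Both frames decode to frame counter 3 and arrive roughly 8.1 seconds apart, on different channels.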
After 8 tries, the library fires an event that TX is complete, without having received any ack. My deep sleep function runs after that event.

Since the library itself is re-sending these packets after the call to LMIC_setTxData2, shouldn’t it modify these parameters for each subsequent message?
I guess in this case I need to find a way to either stop these subsequent messages or modify the parameters myself. Any experience with this issue?

PS: According to the guidelines, should I make my posts a bit shorter? :sweat_smile:


If I understand correctly, then that’s a fault in the LoRaWAN protocol:

I like details! :+1:
