Connection with the console – The Things Network V3 – not stable

I noticed that sometimes a message gets lost (LPS8 gateway, two sensors linked, The Things Network V3).
The console stream connection is lost for about 20% of the uplinks.

example
08:03:09 Forward uplink data message Payload BAT2.5 COUNT0 STATUS{…}TEMP-32….
08:03:08 Stream reconnected The stream connection has been re-established….
08:03:02 Forward uplink data message Payload BAT3.7 COUNT43 STATUS0 TEMP27….
08:03:02 Stream connection closed The connection was closed by the stream provider….

It looks as if a resend took place in this case. That does not always happen.

Is this a known problem? Can it be solved somehow?

rgds,

Piet De Bruyn

Not in the sense that an uplink re-occurred, that’s all fine.

It’s down to the stability of your internet connection to the TTS servers. This same issue occurred with v2 as well.

Is there something I should/could do about it?
As far as I understand, this phenomenon should not be the cause of eventually losing uplinks.
As far as I understand, any piece in the chain from my gateway to the TTS infrastructure, or a combination of them, can be the reason.

Mine goes for hours without a glitch, then hiccups and reconnects. Sometimes it glitches several times in the space of a few minutes. I have 250Mb down and 30Mb up.

I have also written my own console, it’s a lighter load so tends not to glitch as much but it does eventually (I haven’t put restart code in).

The console and the back end processing are separate - just because you don’t see it on the console doesn’t mean you haven’t got it in your integration / backend / data storage.

Obviously this could be a huge issue - do you have a console log with an uplink and the data store without that uplink?

Or a big bag of water (person) or metal skip passing close to your device at the point it sends the uplink, or multiple transmissions in the area, so it never gets to your gateway. Your internet glitches so the UDP packet never gets as far as TTS. But once it’s in TTS I have confidence that it will be processed, TTI use clusters to cope with a server falling over, but then your integration may have a problem…

As a BACKUP we do have Data Storage - it only holds 36 - 48 hours but it’s entirely internal to TTS so if your integration goes wrong, you still have a way to get to the missing data.

Hi,

Still struggling with the ‘missing records’…


From time to time I miss a message from my sensor in my application. I have set up some logging (see extract below). In the log of my gateway I see the following, which I am trying to understand:

My sensor sends a message exactly every 5 minutes: 16:32:21 - 16:37:21 - 16:42:21 - 16:47:21

How can it be explained that “FCnt” increments correctly (6062 - 6063 - 6064 - 6065), but the “DevAddr” of 6064 is different (“4C0BBD9F” instead of “260BBD9F”)? There is also no RXTX record after FCnt 6064.

Is this due to TTN, my gateway (Dragino LPS8) or my sensor (Dragino LDDS75)? Any tips on where I should take action?

---------------------------------------------------------------------------------------------------------------------- extract of my log

Thu Jul 1 16:32:21 2021 daemon.info lora_pkt_fwd[31129]: PKT_FWD~ DATA_UNCONF_UP-> {"DevAddr": "260BBD9F", "FCtrl": ["ADR": 1, "ADRACKReq": 0, "ACK": 0, "RFU" : "RFU", "FOptsLen": 0], "FCnt": 6062, "FPort": 2, "MIC": "3E19F1C0"}
Thu Jul 1 16:32:21 2021 daemon.info lora_pkt_fwd[31129]: RXTX~ {"rxpk":[{"tmst":2203323947,"time":"2021-07-01T15:32:21.855716Z","chan":2,"rfch":1,"freq":868.500000,"stat":1,"modu":"LORA","datr":"SF7BW125","codr":"4/5","lsnr":3.8,"rssi":-90,"size":18,"data":"QJ+9CyaArhcC6zd6asjA8Rk+"}]}

Thu Jul 1 16:37:21 2021 daemon.info lora_pkt_fwd[31129]: PKT_FWD~ DATA_UNCONF_UP-> {"DevAddr": "260BBD9F", "FCtrl": ["ADR": 1, "ADRACKReq": 0, "ACK": 0, "RFU" : "RFU", "FOptsLen": 0], "FCnt": 6063, "FPort": 2, "MIC": "F522C341"}
Thu Jul 1 16:37:21 2021 daemon.info lora_pkt_fwd[31129]: RXTX~ {"rxpk":[{"tmst":2503329467,"time":"2021-07-01T15:37:21.869432Z","chan":3,"rfch":0,"freq":867.100000,"stat":1,"modu":"LORA","datr":"SF7BW125","codr":"4/5","lsnr":6.8,"rssi":-89,"size":18,"data":"QJ+9CyaArxcCNMkWzvVBwyL1"}]}
Thu Jul 1 16:37:21 2021 daemon.info lora_pkt_fwd[31129]: INFO~ [up] PUSH_ACK received in 31 ms
Thu Jul 1 16:37:24 2021 daemon.info lora_pkt_fwd[31129]:

Thu Jul 1 16:42:21 2021 daemon.info lora_pkt_fwd[31129]: PKT_FWD~ DATA_UNCONF_UP-> {"DevAddr": "4C0BBD9F", "FCtrl": ["ADR": 1, "ADRACKReq": 0, "ACK": 0, "RFU" : "RFU", "FOptsLen": 0], "FCnt": 6064, "FPort": 2, "MIC": "BEE51382"}
Thu Jul 1 16:42:25 2021 daemon.info lora_pkt_fwd[31129]:
Thu Jul 1 16:42:25 2021 daemon.info lora_pkt_fwd[31129]: REPORT~ ################## Report at: 2021-07-01 15:42:25 UTC ##################

Thu Jul 1 16:47:21 2021 daemon.info lora_pkt_fwd[31129]: PKT_FWD~ DATA_UNCONF_UP-> {"DevAddr": "260BBD9F", "FCtrl": ["ADR": 1, "ADRACKReq": 0, "ACK": 0, "RFU" : "RFU", "FOptsLen": 0], "FCnt": 6065, "FPort": 2, "MIC": "0437526B"}
Thu Jul 1 16:47:21 2021 daemon.info lora_pkt_fwd[31129]: RXTX~ {"rxpk":[{"tmst":3103343451,"time":"2021-07-01T15:47:21.880536Z","chan":2,"rfch":1,"freq":868.500000,"stat":1,"modu":"LORA","datr":"SF7BW125","codr":"4/5","lsnr":4.2,"rssi":-90,"size":18,"data":"QJ+9CyaAsRcCJgyo30prUjcE"}]}
Thu Jul 1 16:47:21 2021 daemon.info lora_pkt_fwd[31129]: INFO~ [up] PUSH_ACK received in 31 ms
Thu Jul 1 16:47:25 2021 daemon.info lora_pkt_fwd[31129]:


Thanks in advance,

Rgds

Piet De Bruyn

To me it looks like the message at 16:42:21 was disturbed, e.g. by another signal on the same frequency. This is detected by the CRC check, so the message is not forwarded to the server.
It seems that the disturbance multiplied the first byte of the DevAddr by 2 (0x26 became 0x4C).
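
To see how small that difference is at bit level, here is a minimal sketch that compares the two DevAddr values from the gateway log. It only illustrates the arithmetic; it cannot tell where in the chain the bits changed.

```python
good = 0x260BBD9F  # DevAddr logged for FCnt 6062, 6063 and 6065
bad = 0x4C0BBD9F   # DevAddr logged for FCnt 6064

diff = good ^ bad              # bits that differ between the two addresses
first_good = good >> 24        # most significant byte, 0x26
first_bad = bad >> 24          # most significant byte, 0x4C

print(f"xor = {diff:08X}")
print(f"bits flipped = {bin(diff).count('1')}")
print(f"lower three bytes identical: {(good & 0xFFFFFF) == (bad & 0xFFFFFF)}")
print(f"first byte shifted left by one bit: {(first_good << 1) & 0xFF == first_bad}")
```

Only the first byte differs, and 0x4C is 0x26 shifted left by one bit, which is the same as multiplying it by 2.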

As far as I understand, FCnt is assigned by TTS upon receiving messages from the gateway. Is there any explanation why I see the number nicely increasing in the gateway log, but sometimes miss a number when retrieving the uplink messages from TTS via MQTT?

IMHO the FCnt for uplinks is built by the node. If an uplink gets lost anywhere, its number is missing. But as long as the numbers are increasing, the FCnt value is accepted by the server.
Where are your FCnt values missing? In the log of your gateway, in the TTS CE gateway log (Live Data) or in MQTT?

I see the correct sequence in the gateway log but miss them in MQTT (I have a program that reports the missing cases to me: a few every 24 hours), and I miss at least some in the Live Data (somewhat more difficult to track; hiccups are not unusual, as I understood from earlier posts).

That is right.

Are the CRC values for the entries missing in the application data correct on the gateway? TTN will not process uplink packets with CRC errors.

LoRaWAN is not a reliable protocol, there will always be packets missing due to radio interference and other causes. There might even be packet loss between the gateway and the backend when using UDP connectivity.
If you need all data you will need to implement some logic at application level; there have been extensive discussions on the subject on the forum in the past. (Or switch to another protocol, as LoRaWAN is inherently lossy.)
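
As a starting point for such application-level logic, here is a minimal sketch that reports which frame counters were skipped between consecutive uplinks. The function name and the sample values are illustrative, not taken from any TTS tool, and the sketch only detects loss; it does not recover anything.

```python
def missing_counters(f_cnts):
    """Return the frame counters skipped between consecutive received uplinks."""
    missing = []
    for prev, cur in zip(f_cnts, f_cnts[1:]):
        if cur > prev + 1:
            missing.extend(range(prev + 1, cur))
        # cur <= prev means a duplicate or a counter reset (e.g. after a
        # rejoin); nothing can be inferred about loss, so it is ignored here.
    return missing


received = [100, 101, 103, 104, 107]   # illustrative values, not a real capture
print(missing_counters(received))
```

Running this over the f_cnt column of whatever the application stores gives a list of gaps that can then be looked up in the gateway log.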

I have added one example; if you need more I will gather additional examples. DevAddr_260BC4C3_missing773.txt (7.3 KB)

Ideally we need a gateway log just like you’ve created (good detail) AND the TTS gateway console log for the exact same point of time.

The gateway does appear to be pushing the uplink up to the network server but what we don’t know is how or if the NS handled it.

I know this will be a bit tricky to co-ordinate, but if you leave both your Putty and web open and keep an eye out for a dropped uplink then capture the data - screen shot for the web page is fine.

PS, lines 8 to 34 are a report on status (as implied by line 8) and have nothing to do with the uplink because it hasn’t happened yet. Line 35 is a status message from the gateway to the server to tell it that it’s alive and line 36 is the acknowledgement back.

Line 37 is the uplink with more info on 38. Line 39 says the network server acknowledged it.

Sadly the timestamps aren’t that much use, as many embedded systems deliver their serial output in bursts.
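
For the comparison between the gateway log and what arrived further up, a rough sketch along these lines can pull the FCnt values out of the packet forwarder lines and list the ones the application never saw. The regular expression is written against the `PKT_FWD~ DATA_UNCONF_UP` lines posted earlier in this thread; the function names, the shortened sample lines and the application list are made up for illustration.

```python
import re

# One packet-forwarder uplink line: capture DevAddr and FCnt.
# Straight and curly quotes are both accepted, because forum software
# may have converted the quotes in a pasted log.
Q = '["“”]'
PKT_RE = re.compile(
    r'PKT_FWD~ DATA_\w+_UP.*?' + Q + 'DevAddr' + Q + r':\s*' + Q +
    r'([0-9A-Fa-f]{8})' + Q + r'.*?' + Q + 'FCnt' + Q + r':\s*(\d+)'
)


def gateway_fcnts(log_text, dev_addr):
    """FCnt values the gateway logged for one DevAddr, in log order."""
    return [int(fcnt) for addr, fcnt in PKT_RE.findall(log_text)
            if addr.upper() == dev_addr.upper()]


def lost_after_gateway(log_text, dev_addr, app_fcnts):
    """FCnt values seen by the gateway but absent from the application data."""
    return sorted(set(gateway_fcnts(log_text, dev_addr)) - set(app_fcnts))


# Shortened sample lines in the shape of the gateway log (illustrative only).
sample_log = '''
PKT_FWD~ DATA_UNCONF_UP-> {"DevAddr": "260BBD9F", "FCnt": 6062, "FPort": 2}
PKT_FWD~ DATA_UNCONF_UP-> {"DevAddr": "260BBD9F", "FCnt": 6063, "FPort": 2}
PKT_FWD~ DATA_UNCONF_UP-> {"DevAddr": "260BBD9F", "FCnt": 6065, "FPort": 2}
'''
app_fcnts = [6062, 6065]   # hypothetical list of what the application received

print(lost_after_gateway(sample_log, "260BBD9F", app_fcnts))
```

Filtering on the DevAddr also drops lines where the address itself was corrupted, so those do not show up as false losses.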

From the comments in your text file (why not post everything readable in the message, with the log entries in code blocks?) I gather you think I have access to the TTS (CE) infrastructure.
That is wrong. I am a volunteer spending a lot of time trying to help other TTN users. We, moderators, do not have special privileges on the TTN infrastructure, just the additional rights required for moderation on the forum.
We try to help people by suggesting possible causes of issues but do not have the time to do all the analysis and debugging for all users.

So if I ask specific questions and you want my help, you need to answer them directly rather than requiring me to go and find the information in your logs. Posting the logs as well might help, but answers are the main priority.

Maybe an additional hint to figure out where the problem is:
If the connection between a gateway and the TTS server is lost (for more than 30 seconds?), the Up/Down counter to the right of “Last seen” on the TTS console is reset.
This is how I found out that the WiFi connection to my gateway was unreliable.

Sorry if I created the perception that I expected analysis and debugging from your side; TTS is a kind of black box for me. As your colleague suggested, I went looking for the console log at exactly the same point in time. (And I feel bad for having previously looked only at the application console.)

With what I have noticed, I can rephrase my problem:
I have missing packets on the Application data interface. I now see them in the web gateway interface, but some do not reach the application. I also tried MQTT and a webhook to ThingSpeak. In both cases the packets were also missing. I searched the forum and found the topic ‘Missing Packets on Application Data’, which seems to describe exactly what I am experiencing now. That discussion suggested using the storage integration, but the packet was missing there as well.

Below is an example: two packets were retrieved from Data Storage with f_cnt = 1769 and 1771. The one in between, with f_cnt = 1770, appeared in the gateway console. I have added the packet.
I cannot see anything weird. Any suggestions on where to look further for the cause?

{
  "name": "gs.up.receive",
  "time": "2021-07-12T19:23:46.565597526Z",
  "identifiers": [
    {
      "gateway_ids": {
        "gateway_id": "dragino-lps8-gateway-pdbr"
      }
    },
    {
      "gateway_ids": {
        "gateway_id": "dragino-lps8-gateway-pdbr",
        "eui": "A840411EAA044150"
      }
    }
  ],
  "data": {
    "@type": "type.googleapis.com/ttn.lorawan.v3.UplinkMessage",
    "raw_payload": "QMPECyaA6gYCS24pZOrWLbpL",
    "payload": {
      "m_hdr": {
        "m_type": "UNCONFIRMED_UP"
      },
      "mic": "1i26Sw==",
      "mac_payload": {
        "f_hdr": {
          "dev_addr": "260BC4C3",
          "f_ctrl": {
            "adr": true
          },
          "f_cnt": 1770
        },
        "f_port": 2,
        "frm_payload": "S24pZOo="
      }
    },
    "settings": {
      "data_rate": {
        "lora": {
          "bandwidth": 125000,
          "spreading_factor": 7
        }
      },
      "coding_rate": "4/5",
      "frequency": "867300000",
      "timestamp": 1209086659,
      "time": "2021-07-12T19:23:46.541561Z"
    },
    "rx_metadata": [
      {
        "gateway_ids": {
          "gateway_id": "dragino-lps8-gateway-pdbr",
          "eui": "A840411EAA044150"
        },
        "time": "2021-07-12T19:23:46.541561Z",
        "timestamp": 1209086659,
        "rssi": -35,
        "channel_rssi": -35,
        "snr": 10.2,
        "location": {
          "latitude": 51.21161367789357,
          "longitude": 4.452853202819825,
          "source": "SOURCE_REGISTRY"
        },
        "uplink_token": "CicKJQoZZHJhZ2luby1scHM4LWdhdGV3YXktcGRichIIqEBBHqoEQVAQw+XEwAQaDAjCrbKHBhCszdONAiC4k7mZmJoD",
        "channel_index": 4
      }
    ],
    "received_at": "2021-07-12T19:23:46.565503660Z",
    "correlation_ids": [
      "gs:conn:01FADPRV0H5CK5CVN3M4GRN6FC",
      "gs:uplink:01FAE2ST051540D8553TQC3MNE"
    ]
  },
  "correlation_ids": [
    "gs:conn:01FADPRV0H5CK5CVN3M4GRN6FC",
    "gs:uplink:01FAE2ST051540D8553TQC3MNE"
  ],
  "origin": "ip-10-100-5-46.eu-west-1.compute.internal",
  "context": {
    "tenant-id": "CgN0dG4="
  },
  "visibility": {
    "rights": [
      "RIGHT_GATEWAY_TRAFFIC_READ",
      "RIGHT_GATEWAY_TRAFFIC_READ"
    ]
  },
  "unique_id": "01FAE2ST05W6VQZVJN60D91CV9"
}

They may well be, as most rack servers are painted black. But that doesn’t stop you reading all the documentation just like the rest of us have.

Can you put in a webhook that points to: http://datacache.co.uk/pietrodelamancha/UplinkToTab.php which is based on this: GitHub - descartes/TheThingsStack-Integration-Starters: Starter / Template code for various The Things Stack (v3) integrations. Please just turn on uplinks. It will provide another capture source that I find rather reliable.

PS, this would help too: How do I format my forum post? [HowTo]

Can you paste the packet payload and keys into https://lorawan-packet-decoder-0ta6puiniaut.runkit.sh/ to have it validate the MIC? If that check fails we know it is due to the received data being invalid.

I have used TTS.MQTT.Tab.py. Is that what you wanted me to run?
Or should it have been the webhook-to-tab?

As far as I understand, the webhook integration using PHP cannot run directly on an internal network. I will need more time to figure this out,
as I cannot reconfigure the router I use, nor do I have any knowledge of PHP.
The following were my results using MQTT: f_cnt 2542 is missing.

received_at f_port f_cnt frm_payload rssi snr data_rate_index consumed_airtime
2021-07-15T11:38:51.795159456Z 2 2541 DOUA+gA= -36 9 5 0.051456s
2021-07-15T11:48:52.091033149Z 2 2543 DNgA+gA= -37 9 5 0.051456s

received_at application_id device_id f_port f_cnt frm_payload rssi snr data_rate_index consumed_airtime
2021-07-15T11:38:51.795159456Z garage afstandsmeter 2 2541 DOUA+gA= -36 9 5 0.051456s
2021-07-15T11:48:52.091033149Z garage afstandsmeter 2 2543 DNgA+gA= -37 9 5 0.051456s

I also used the LoRaWAN packet decoder for the message I saw in the gateway console:


Assuming base64-encoded packet
QMPECyaA7gkCpLSqusMsTdyv
Message Type = Data
PHYPayload = 40C3C40B2680EE0902A4B4AABAC32C4DDCAF

( PHYPayload = MHDR[1] | MACPayload[…] | MIC[4] )
MHDR = 40
MACPayload = C3C40B2680EE0902A4B4AABAC3
MIC = 2C4DDCAF (from packet)
= 2C4DDCAF (expected, assuming 32 bits frame counter with MSB 0000)

( MACPayload = FHDR | FPort | FRMPayload )
FHDR = C3C40B2680EE09
FPort = 02
FRMPayload = A4B4AABAC3 (from packet, encrypted)
= 0CDF00FA00 (decrypted)

  ( FHDR = DevAddr[4] | FCtrl[1] | FCnt[2] | FOpts[0..15] )
 DevAddr = 260BC4C3 (Big Endian)
   FCtrl = 80
    FCnt = 09EE (Big Endian)
   FOpts = 

Message Type = Unconfirmed Data Up
Direction = up
FCnt = 2542 (from packet, 16 bits)
= 2542 (32 bits, assuming MSB 0x0000)
FCtrl.ACK = false
FCtrl.ADR = true


So the MIC seems OK.
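
The unencrypted header fields in the decoder output can also be read offline with a few lines of Python. This is only a sketch of the frame layout shown above (MHDR, DevAddr, FCtrl, FCnt, FPort, MIC); it does not verify the MIC or decrypt the payload, since both need the session keys. The function name is made up for this example.

```python
import base64


def parse_uplink_header(phy_b64):
    """Split a base64 LoRaWAN data frame into its unencrypted header fields."""
    phy = base64.b64decode(phy_b64)
    fctrl = phy[5]
    fopts_len = fctrl & 0x0F          # FOpts length sits in the low nibble of FCtrl
    fport_at = 8 + fopts_len          # FPort follows MHDR(1) + FHDR(7 + FOpts)
    return {
        "mtype": phy[0] >> 5,                          # 2 = unconfirmed data up
        "dev_addr": phy[1:5][::-1].hex().upper(),      # little-endian in the frame
        "adr": bool(fctrl & 0x80),
        "f_cnt": int.from_bytes(phy[6:8], "little"),   # lower 16 bits only
        "f_port": phy[fport_at] if fport_at < len(phy) - 4 else None,
        "mic": phy[-4:].hex().upper(),                 # last 4 bytes, packet order
    }


print(parse_uplink_header("QMPECyaA7gkCpLSqusMsTdyv"))
```

For the packet above this gives DevAddr 260BC4C3, FCnt 2542, FPort 2 and MIC 2C4DDCAF, matching the decoder output.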

Well you can run that on your computer if you want, but no, it’s not a web hook, different integration (Webhook vs MQTT)

The idea was that the webhook would capture your uplinks on a system independent from you so the problem could have a third party eye on the situation.
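
For anyone who would rather not run PHP, the core of such an independent capture is small. The sketch below is not the UplinkToTab.php script; it only shows, under assumptions, how one uplink JSON body could be flattened into a tab-separated row. The field names (`end_device_ids`, `uplink_message`, `rx_metadata`) follow my understanding of the TTS v3 uplink message format and should be checked against a real message, and the sample body is hand-built to mirror the first MQTT row above.

```python
import json

COLUMNS = ["received_at", "device_id", "f_cnt", "f_port", "frm_payload", "rssi", "snr"]


def uplink_to_row(body):
    """Flatten one uplink JSON body into a tab-separated row."""
    msg = json.loads(body)
    up = msg.get("uplink_message", {})
    rx = (up.get("rx_metadata") or [{}])[0]   # first gateway that heard the frame
    row = {
        "received_at": msg.get("received_at", ""),
        "device_id": msg.get("end_device_ids", {}).get("device_id", ""),
        "f_cnt": up.get("f_cnt", 0),          # assumption: a zero value may be omitted
        "f_port": up.get("f_port", 0),
        "frm_payload": up.get("frm_payload", ""),
        "rssi": rx.get("rssi", ""),
        "snr": rx.get("snr", ""),
    }
    return "\t".join(str(row[c]) for c in COLUMNS)


# Hand-built sample body, not a captured message.
sample_body = json.dumps({
    "end_device_ids": {"device_id": "afstandsmeter"},
    "received_at": "2021-07-15T11:38:51.795159456Z",
    "uplink_message": {
        "f_port": 2,
        "f_cnt": 2541,
        "frm_payload": "DOUA+gA=",
        "rx_metadata": [{"rssi": -36, "snr": 9}],
    },
})
print(uplink_to_row(sample_body))
```

Appending each row to a file on a machine outside your own network gives the third-party record described above.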

(Trying to post in the correct place now.)
After figuring out how to run your PHP script / webhook, here are the results:

I identified “FCnt”: 4963 as the missing packet.

1. Logread on the LPS8 gateway (f_cnt = 4963 available)

Fri Jul 23 22:29:02 2021 daemon.info lora_pkt_fwd[3048]: INFO~ [down] PULL_ACK received in 31 ms
Fri Jul 23 22:29:06 2021 daemon.info lora_pkt_fwd[3048]: PKT_FWD~ DATA_UNCONF_UP-> {"DevAddr": "260BC4C3", "FCtrl": ["ADR": 1, "ADRACKReq": 0, "ACK": 0, "RFU" : "RFU", "FOptsLen": 0], "FCnt": 4963, "FPort": 2, "MIC": "47049B7C"}
Fri Jul 23 22:29:06 2021 daemon.info lora_pkt_fwd[3048]: RXTX~ {"rxpk":[{"tmst":2106010483,"time":"2021-07-23T21:29:06.932847Z","chan":3,"rfch":0,"freq":867.100000,"stat":1,"modu":"LORA","datr":"SF7BW125","codr":"4/5","lsnr":9.8,"rssi":-37,"size":18,"data":"QMPECyaAYxMCqin6CFt8mwRH"}]}
Fri Jul 23 22:29:06 2021 daemon.info lora_pkt_fwd[3048]: INFO~ [up] PUSH_ACK received in 30 ms
Fri Jul 23 22:29:09 2021 user.notice iot_keep_alive: Internet Access OK: via eth1
Fri Jul 23 22:29:09 2021 user.notice iot_keep_alive: use WAN or WiFi for internet access now

2. TTS gateway (f_cnt = 4963 available)

{
  "name": "gs.up.receive",
  "time": "2021-07-23T21:29:06.952248082Z",
  "identifiers": [
    {
      "gateway_ids": {
        "gateway_id": "dragino-lps8-gateway-pdbr"
      }
    },
    {
      "gateway_ids": {
        "gateway_id": "dragino-lps8-gateway-pdbr",
        "eui": "A840411EAA044150"
      }
    }
  ],
  "data": {
    "@type": "type.googleapis.com/ttn.lorawan.v3.UplinkMessage",
    "raw_payload": "QMPECyaAYxMCqin6CFt8mwRH",
    "payload": {
      "m_hdr": {
        "m_type": "UNCONFIRMED_UP"
      },
      "mic": "fJsERw==",
      "mac_payload": {
        "f_hdr": {
          "dev_addr": "260BC4C3",
          "f_ctrl": {
            "adr": true
          },
          "f_cnt": 4963
        },
        "f_port": 2,
        "frm_payload": "qin6CFs="
      }
    },
    "settings": {
      "data_rate": {
        "lora": {
          "bandwidth": 125000,
          "spreading_factor": 7
        }
      },
      "coding_rate": "4/5",
      "frequency": "867100000",
      "timestamp": 2106010483,
      "time": "2021-07-23T21:29:06.932847Z"
    },
    "rx_metadata": [
      {
        "gateway_ids": {
          "gateway_id": "dragino-lps8-gateway-pdbr",
          "eui": "A840411EAA044150"
        },
        "time": "2021-07-23T21:29:06.932847Z",
        "timestamp": 2106010483,
        "rssi": -37,
        "channel_rssi": -37,
        "snr": 9.8,
        "location": {
          "latitude": 51.21161367789357,
          "longitude": 4.452853202819825,
          "source": "SOURCE_REGISTRY"
        },
        "uplink_token": "CicKJQoZZHJhZ2luby1scHM4LWdhdGV3YXktcGRichIIqEBBHqoEQVAQ89ac7AcaDAii6eyHBhDNnYLGAyC48obApasG",
        "channel_index": 3
      }
    ],
    "received_at": "2021-07-23T21:29:06.952143565Z",
    "correlation_ids": [
      "gs:conn:01FB9X1XCYAW9CZ3YWZMC2VYFS",
      "gs:uplink:01FBAMB748G3RXGX177JF93Z2B"
    ]
  },
  "correlation_ids": [
    "gs:conn:01FB9X1XCYAW9CZ3YWZMC2VYFS",
    "gs:uplink:01FBAMB748G3RXGX177JF93Z2B"
  ],
  "origin": "ip-10-100-5-46.eu-west-1.compute.internal",
  "context": {
    "tenant-id": "CgN0dG4="
  },
  "visibility": {
    "rights": [
      "RIGHT_GATEWAY_TRAFFIC_READ",
      "RIGHT_GATEWAY_TRAFFIC_READ"
    ]
  },
  "unique_id": "01FBAMB748ZY61HZYMSM4GAGH0"
}

3. Extract of the tab files from the PHP script/webhook

20210723 (f_cnt = 4963 missing)
received_at application_id device_id f_cnt f_port frm_payload data_rate_index consumed_airtime rssi snr
2021-07-23T21:24:07.165550572Z garage afstandsmeter 4962 2 DOUHVgA= 5 0.051456s -35 6.8
2021-07-23T21:34:07.181296805Z garage afstandsmeter 4964 2 DOkHVgA= 5 0.051456s -35 11.2

garage (f_cnt = 4963 missing)
received_at device_id f_cnt f_port frm_payload data_rate_index consumed_airtime rssi snr
2021-07-23T21:24:07.165550572Z afstandsmeter 4962 2 DOUHVgA= 5 0.051456s -35 6.8
2021-07-23T21:34:07.181296805Z afstandsmeter 4964 2 DOkHVgA= 5 0.051456s -35 11.2

garage_afstandsmeter (f_cnt = 4963 missing)
received_at f_cnt f_port frm_payload data_rate_index consumed_airtime rssi snr
2021-07-23T21:24:07.165550572Z 4962 2 DOUHVgA= 5 0.051456s -35 6.8
2021-07-23T21:34:07.181296805Z 4964 2 DOkHVgA= 5 0.051456s -35 11.2

Do you have any other suggestions on what to check? I can try to live with the fact that I lose packets, but it is strange that my packet reaches the TTS gateway and does not appear on the TTS device.