Connection with the console – The Things Network V3 – not stable

That is right.

Are the CRC values for the entries missing in the application data correct on the gateway? TTN will not process uplink packets with CRC errors.

LoRaWAN is not a reliable protocol, there will always be packets missing due to radio interference and other causes. There might even be packet loss between the gateway and the backend when using UDP connectivity.
If you need all data you will need to implement some logic at application level; there have been extensive discussions on the subject in the past on the forum. (Or switch to another protocol, as LoRaWAN is inherently lossy.)
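As a starting point for that application-level logic, here is a minimal sketch that spots gaps in the frame counter of the uplinks an application has received. The function name and the sample counters are illustrative assumptions, not part of any TTS API; it only assumes you have the `f_cnt` values in arrival order.

```python
def missing_frame_counters(received):
    """Return the f_cnt values skipped between consecutive received uplinks.

    `received` is an iterable of f_cnt values in arrival order.
    A counter that goes down (device reset or rejoin) restarts the
    comparison instead of being reported as a gap.
    """
    missing = []
    previous = None
    for f_cnt in received:
        if previous is not None and f_cnt > previous + 1:
            # Everything strictly between the two counters never arrived.
            missing.extend(range(previous + 1, f_cnt))
        previous = f_cnt
    return missing


if __name__ == "__main__":
    # Illustrative counters only.
    print(missing_frame_counters([1768, 1769, 1771, 1772]))
```

Knowing which counters are missing tells you how much is lost, but not where; for that you still need the gateway log and the gateway console side by side.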

I have added one example; if you need more I will gather additional examples.
Attachment: DevAddr_260BC4C3_missing773.txt (7.3 KB)

Ideally we need a gateway log just like you’ve created (good detail) AND the TTS gateway console log for the exact same point of time.

The gateway does appear to be pushing the uplink up to the network server but what we don’t know is how or if the NS handled it.

I know this will be a bit tricky to co-ordinate, but if you leave both your Putty and web open and keep an eye out for a dropped uplink then capture the data - screen shot for the web page is fine.

PS, lines 8 to 34 are a report on status (as implied by line 8) and have nothing to do with the uplink because it hasn’t happened yet. Line 35 is a status message from the gateway to the server to tell it that it’s alive and line 36 is the acknowledgement back.

Line 37 is the uplink with more info on 38. Line 39 says the network server acknowledged it.

Sadly the timestamps aren’t that much use, as many embedded systems deliver their serial output in bursts.

From the comments in your text file (why not post everything readable in the message with the log entries in code blocks) I gather you think I have access to the TTS(CE) infrastructure.
That is wrong. I am a volunteer spending a lot of time trying to help other TTN users. We, moderators, do not have special privileges on the TTN infrastructure, just the additional right required for moderation on the forum.
We try to help people by suggesting possible causes of issues but do not have the time to do all the analysis and debugging for all users.

So if I ask specific questions and you want my help, you need to answer them directly rather than requiring me to dig the information out of your logs. If you post the logs as well that might help, but answers are the main priority.

Maybe an additional hint to figure out where the problem is:
If the connection between a gateway and the TTS-Server is lost (for more than 30 seconds?), the Up/Down-counter on the right side of “Last seen” on the TTS-console is reset.
This is how I found out that the WiFi connection to my gateway was unreliable.

Sorry if I created the perception that I expected analysis and debugging from your side; TTS is a kind of black box for me. As your colleague suggested, I went looking for the console log at exactly the same point in time. (And I feel bad for previously having looked only at the application console.)

With what I noticed I rephrase my problem :
I have missing packets on the Application data interface. I now see them in the web gateway interface, but some do not reach the application. I also tried MQTT and a webhook to ThingSpeak. In both cases the packets were also missing. I searched the forum and found the topic ‘Missing Packets on Application Data’, which seems to describe exactly what I am experiencing now. That discussion suggested using the storage integration, but the packet was missing there as well.

Below an example : two packets were retrieved from Data Storage with f_cnt = 1769 and 1771. The one in between with f_cnt = 1770 appeared in the gateway console. I have added the packet.
I cannot see anything weird. Any suggestion where to look further for the cause?

{
  "name": "gs.up.receive",
  "time": "2021-07-12T19:23:46.565597526Z",
  "identifiers": [
    {
      "gateway_ids": {
        "gateway_id": "dragino-lps8-gateway-pdbr"
      }
    },
    {
      "gateway_ids": {
        "gateway_id": "dragino-lps8-gateway-pdbr",
        "eui": "A840411EAA044150"
      }
    }
  ],
  "data": {
    "@type": "type.googleapis.com/ttn.lorawan.v3.UplinkMessage",
    "raw_payload": "QMPECyaA6gYCS24pZOrWLbpL",
    "payload": {
      "m_hdr": {
        "m_type": "UNCONFIRMED_UP"
      },
      "mic": "1i26Sw==",
      "mac_payload": {
        "f_hdr": {
          "dev_addr": "260BC4C3",
          "f_ctrl": {
            "adr": true
          },
          "f_cnt": 1770
        },
        "f_port": 2,
        "frm_payload": "S24pZOo="
      }
    },
    "settings": {
      "data_rate": {
        "lora": {
          "bandwidth": 125000,
          "spreading_factor": 7
        }
      },
      "coding_rate": "4/5",
      "frequency": "867300000",
      "timestamp": 1209086659,
      "time": "2021-07-12T19:23:46.541561Z"
    },
    "rx_metadata": [
      {
        "gateway_ids": {
          "gateway_id": "dragino-lps8-gateway-pdbr",
          "eui": "A840411EAA044150"
        },
        "time": "2021-07-12T19:23:46.541561Z",
        "timestamp": 1209086659,
        "rssi": -35,
        "channel_rssi": -35,
        "snr": 10.2,
        "location": {
          "latitude": 51.21161367789357,
          "longitude": 4.452853202819825,
          "source": "SOURCE_REGISTRY"
        },
        "uplink_token": "CicKJQoZZHJhZ2luby1scHM4LWdhdGV3YXktcGRichIIqEBBHqoEQVAQw+XEwAQaDAjCrbKHBhCszdONAiC4k7mZmJoD",
        "channel_index": 4
      }
    ],
    "received_at": "2021-07-12T19:23:46.565503660Z",
    "correlation_ids": [
      "gs:conn:01FADPRV0H5CK5CVN3M4GRN6FC",
      "gs:uplink:01FAE2ST051540D8553TQC3MNE"
    ]
  },
  "correlation_ids": [
    "gs:conn:01FADPRV0H5CK5CVN3M4GRN6FC",
    "gs:uplink:01FAE2ST051540D8553TQC3MNE"
  ],
  "origin": "ip-10-100-5-46.eu-west-1.compute.internal",
  "context": {
    "tenant-id": "CgN0dG4="
  },
  "visibility": {
    "rights": [
      "RIGHT_GATEWAY_TRAFFIC_READ",
      "RIGHT_GATEWAY_TRAFFIC_READ"
    ]
  },
  "unique_id": "01FAE2ST05W6VQZVJN60D91CV9"
}

They may well be, as most rack servers are painted black. But that doesn’t stop you reading all the documentation just like the rest of us have.

Can you put in a webhook that points to: http://datacache.co.uk/pietrodelamancha/UplinkToTab.php which is based on this: GitHub - descartes/TheThingsStack-Integration-Starters: Starter / Template code for various The Things Stack (v3) integrations. Please just turn on uplinks. It will provide another capture source that I find rather reliable.

PS, this would help too: How do I format my forum post? [HowTo]

Can you paste the packet payload and keys into https://lorawan-packet-decoder-0ta6puiniaut.runkit.sh/ to have it validate the MIC? If that check fails we know it is due to the received data being invalid.

I have used the TTS.MQTT.Tab.py - is that what you wanted me to perform?
Or should it have been the webhook-to-tab?

As far as I understand, the webhook integration using PHP cannot run directly on an internal network. I will need more time to figure this out, as I cannot change the configuration of the router I use, nor do I have any knowledge of PHP.
These were my results using MQTT: f_cnt 2542 is missing.

received_at f_port f_cnt frm_payload rssi snr data_rate_index consumed_airtime
2021-07-15T11:38:51.795159456Z 2 2541 DOUA+gA= -36 9 5 0.051456s
2021-07-15T11:48:52.091033149Z 2 2543 DNgA+gA= -37 9 5 0.051456s

received_at application_id device_id f_port f_cnt frm_payload rssi snr data_rate_index consumed_airtime
2021-07-15T11:38:51.795159456Z garage afstandsmeter 2 2541 DOUA+gA= -36 9 5 0.051456s
2021-07-15T11:48:52.091033149Z garage afstandsmeter 2 2543 DNgA+gA= -37 9 5 0.051456s

I also used the LoRaWAN packet decoder on the message I saw in the gateway console.

Click to see the full logs

Assuming base64-encoded packet
QMPECyaA7gkCpLSqusMsTdyv
Message Type = Data
PHYPayload = 40C3C40B2680EE0902A4B4AABAC32C4DDCAF

( PHYPayload = MHDR[1] | MACPayload[…] | MIC[4] )
MHDR = 40
MACPayload = C3C40B2680EE0902A4B4AABAC3
MIC = 2C4DDCAF (from packet)
= 2C4DDCAF (expected, assuming 32 bits frame counter with MSB 0000)

( MACPayload = FHDR | FPort | FRMPayload )
FHDR = C3C40B2680EE09
FPort = 02
FRMPayload = A4B4AABAC3 (from packet, encrypted)
= 0CDF00FA00 (decrypted)

  ( FHDR = DevAddr[4] | FCtrl[1] | FCnt[2] | FOpts[0..15] )
 DevAddr = 260BC4C3 (Big Endian)
   FCtrl = 80
    FCnt = 09EE (Big Endian)
   FOpts = 

Message Type = Unconfirmed Data Up
Direction = up
FCnt = 2542 (from packet, 16 bits)
= 2542 (32 bits, assuming MSB 0x0000)
FCtrl.ACK = false
FCtrl.ADR = true


So the MIC seems ok.
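For anyone who wants to pull the header fields out of a `raw_payload` without the online decoder, here is a minimal sketch using only the Python standard library. It splits the PHYPayload the same way the decoder output above does, but it does not verify the MIC (that needs AES-CMAC and the NwkSKey, which are deliberately left out). The function name and the returned dictionary keys are my own choices.

```python
import base64


def parse_uplink(phy_payload_b64):
    """Split a base64 LoRaWAN data frame into its header fields.

    Layout: MHDR[1] | DevAddr[4] | FCtrl[1] | FCnt[2] | FOpts[0..15]
            | FPort[1] | FRMPayload | MIC[4]
    DevAddr and FCnt are little-endian on air.
    """
    raw = base64.b64decode(phy_payload_b64)
    mhdr, mac_payload, mic = raw[0], raw[1:-4], raw[-4:]

    f_ctrl = mac_payload[4]
    f_opts_len = f_ctrl & 0x0F            # low nibble of FCtrl
    rest = mac_payload[7 + f_opts_len:]   # FPort + FRMPayload, if present

    return {
        "m_type": mhdr >> 5,              # 2 = unconfirmed data up
        "dev_addr": mac_payload[0:4][::-1].hex().upper(),
        "adr": bool(f_ctrl & 0x80),
        "f_cnt": int.from_bytes(mac_payload[5:7], "little"),
        "f_port": rest[0] if rest else None,
        "frm_payload": rest[1:].hex().upper(),  # still encrypted
        "mic": mic.hex().upper(),
    }


if __name__ == "__main__":
    # The packet pasted into the decoder above.
    for key, value in parse_uplink("QMPECyaA7gkCpLSqusMsTdyv").items():
        print(f"{key:12} {value}")
```

This makes it easy to match a `data` field in the packet forwarder log against an `f_cnt` in the console, since both carry the same base64 string.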

Well you can run that on your computer if you want, but no, it’s not a web hook, different integration (Webhook vs MQTT)

The idea was that the webhook would capture your uplinks on a system independent from you so the problem could have a third party eye on the situation.

(trying to post on correct place now)
After figuring out how to run your PHP script / webhook, here are the results:

I identified “FCnt”: 4963 as missing packet

  1. Logread gateway LPS8 (f_cnt = 4963 available)
Click to see the full logs

Fri Jul 23 22:29:02 2021 daemon.info lora_pkt_fwd[3048]: INFO~ [down] PULL_ACK received in 31 ms
Fri Jul 23 22:29:06 2021 daemon.info lora_pkt_fwd[3048]: PKT_FWD~ DATA_UNCONF_UP-> {"DevAddr": "260BC4C3", "FCtrl": ["ADR": 1, "ADRACKReq": 0, "ACK": 0, "RFU" : "RFU", "FOptsLen": 0], "FCnt": 4963, "FPort": 2, "MIC": "47049B7C"}
Fri Jul 23 22:29:06 2021 daemon.info lora_pkt_fwd[3048]: RXTX~ {"rxpk":[{"tmst":2106010483,"time":"2021-07-23T21:29:06.932847Z","chan":3,"rfch":0,"freq":867.100000,"stat":1,"modu":"LORA","datr":"SF7BW125","codr":"4/5","lsnr":9.8,"rssi":-37,"size":18,"data":"QMPECyaAYxMCqin6CFt8mwRH"}]}
Fri Jul 23 22:29:06 2021 daemon.info lora_pkt_fwd[3048]: INFO~ [up] PUSH_ACK received in 30 ms
Fri Jul 23 22:29:09 2021 user.notice iot_keep_alive: Internet Access OK: via eth1
Fri Jul 23 22:29:09 2021 user.notice iot_keep_alive: use WAN or WiFi for internet access now

  2. TTS gateway console (f_cnt = 4963 available)

Click to see the full message

{
  "name": "gs.up.receive",
  "time": "2021-07-23T21:29:06.952248082Z",
  "identifiers": [
    {
      "gateway_ids": {
        "gateway_id": "dragino-lps8-gateway-pdbr"
      }
    },
    {
      "gateway_ids": {
        "gateway_id": "dragino-lps8-gateway-pdbr",
        "eui": "A840411EAA044150"
      }
    }
  ],
  "data": {
    "@type": "type.googleapis.com/ttn.lorawan.v3.UplinkMessage",
    "raw_payload": "QMPECyaAYxMCqin6CFt8mwRH",
    "payload": {
      "m_hdr": {
        "m_type": "UNCONFIRMED_UP"
      },
      "mic": "fJsERw==",
      "mac_payload": {
        "f_hdr": {
          "dev_addr": "260BC4C3",
          "f_ctrl": {
            "adr": true
          },
          "f_cnt": 4963
        },
        "f_port": 2,
        "frm_payload": "qin6CFs="
      }
    },
    "settings": {
      "data_rate": {
        "lora": {
          "bandwidth": 125000,
          "spreading_factor": 7
        }
      },
      "coding_rate": "4/5",
      "frequency": "867100000",
      "timestamp": 2106010483,
      "time": "2021-07-23T21:29:06.932847Z"
    },
    "rx_metadata": [
      {
        "gateway_ids": {
          "gateway_id": "dragino-lps8-gateway-pdbr",
          "eui": "A840411EAA044150"
        },
        "time": "2021-07-23T21:29:06.932847Z",
        "timestamp": 2106010483,
        "rssi": -37,
        "channel_rssi": -37,
        "snr": 9.8,
        "location": {
          "latitude": 51.21161367789357,
          "longitude": 4.452853202819825,
          "source": "SOURCE_REGISTRY"
        },
        "uplink_token": "CicKJQoZZHJhZ2luby1scHM4LWdhdGV3YXktcGRichIIqEBBHqoEQVAQ89ac7AcaDAii6eyHBhDNnYLGAyC48obApasG",
        "channel_index": 3
      }
    ],
    "received_at": "2021-07-23T21:29:06.952143565Z",
    "correlation_ids": [
      "gs:conn:01FB9X1XCYAW9CZ3YWZMC2VYFS",
      "gs:uplink:01FBAMB748G3RXGX177JF93Z2B"
    ]
  },
  "correlation_ids": [
    "gs:conn:01FB9X1XCYAW9CZ3YWZMC2VYFS",
    "gs:uplink:01FBAMB748G3RXGX177JF93Z2B"
  ],
  "origin": "ip-10-100-5-46.eu-west-1.compute.internal",
  "context": {
    "tenant-id": "CgN0dG4="
  },
  "visibility": {
    "rights": [
      "RIGHT_GATEWAY_TRAFFIC_READ",
      "RIGHT_GATEWAY_TRAFFIC_READ"
    ]
  },
  "unique_id": "01FBAMB748ZY61HZYMSM4GAGH0"
}

  3. Extraction of the tab files from the PHP script/webhook
Click to see the full message

20210723 (f_cnt = 4963 missing)
received_at application_id device_id f_cnt f_port frm_payload data_rate_index consumed_airtime rssi snr
2021-07-23T21:24:07.165550572Z garage afstandsmeter 4962 2 DOUHVgA= 5 0.051456s -35 6.8
2021-07-23T21:34:07.181296805Z garage afstandsmeter 4964 2 DOkHVgA= 5 0.051456s -35 11.2

garage (f_cnt = 4963 missing)
received_at device_id f_cnt f_port frm_payload data_rate_index consumed_airtime rssi snr
2021-07-23T21:24:07.165550572Z afstandsmeter 4962 2 DOUHVgA= 5 0.051456s -35 6.8
2021-07-23T21:34:07.181296805Z afstandsmeter 4964 2 DOkHVgA= 5 0.051456s -35 11.2

garage_afstandsmeter (f_cnt = 4963 missing)
received_at f_cnt f_port frm_payload data_rate_index consumed_airtime rssi snr
2021-07-23T21:24:07.165550572Z 4962 2 DOUHVgA= 5 0.051456s -35 6.8
2021-07-23T21:34:07.181296805Z 4964 2 DOkHVgA= 5 0.051456s -35 11.2

Do you have any other suggestions to check? I can try to live with the fact that I lose packets, but it is strange that my packet reaches the TTS gateway console and yet does not appear on the TTS device.

Seems you figured out how to do that on 13th July, as that’s when my data backup server started saving your uplinks, but it appears that stopped yesterday at f_cnt 4862.

The whole point of me setting it up & providing you access for free was so we could audit this independently.

I’m not going to get in to this as I’ve not got the information I wanted to carry on with this. If you turn it back on and then find an instance, feel free to post the details again.

+1 for using the Hide details but -1 for not using the </> for code/logs.

Thank you again for your support.

-1 again for me as I stopped your backup saving my uplinks (I assume I did that by setting up my own capture).
I restored the webhook pointing again to http://datacache.co.uk/pietrodelamancha/UplinkToTab.php.

I will keep watching for missing packets, but if you only get the same information as I saw coming in through the tab files, I do not see what extra we might learn.

You can have several, so you can have one too.

Proof.

Last 24 hours : missing 17 packets (out of 288 as my sending frequency is every 5 minutes = 6%);
In the log of my LPS8 gateway: all of them were found as uploaded, but for 4 of them I did not find a PUSH_ACK; these 4 packets were also missing in the TTS gateway console, so I assume they were lost ‘on the internet’ (4 out of 288 = 1.4%).
Besides that a few out of these 17 packets appeared twice in the TTS gateway console (resends ?)
None of these 17 packets can be found in the TAB files created by my webhook integration (your php script).
Hereby a few of the (17) missing FCnt : 5359, 5380, 5514, 5634
I restored your webhook link last Saturday. If I did it correctly, you can probably check whether these FCnt are indeed missing.
Desperate, I am looking to adapt my application to take into account the loss of packets, but 6% is too high in my opinion.
Thanks again for your support and patience
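The manual cross-check described above can be scripted. Here is a minimal sketch that assumes you have already extracted four collections of `f_cnt` values by hand or with your own parsing: the counters the device should have sent, those in the gateway log, those in the TTS gateway console, and those that reached the application. The function names, stage labels and the sample numbers in the demo are hypothetical.

```python
def loss_percent(lost, expected):
    """Packet loss as a percentage, rounded to one decimal."""
    return round(100.0 * lost / expected, 1)


def classify_losses(expected, gateway_log, gateway_console, application):
    """Group every f_cnt missing from the application by where it was last seen."""
    stages = {
        "not_heard_by_gateway": [],   # never in the packet forwarder log
        "lost_before_server": [],     # forwarded, but not in the TTS gateway console
        "lost_inside_network": [],    # in the console, never delivered to the application
    }
    for f_cnt in sorted(expected):
        if f_cnt in application:
            continue
        if f_cnt not in gateway_log:
            stages["not_heard_by_gateway"].append(f_cnt)
        elif f_cnt not in gateway_console:
            stages["lost_before_server"].append(f_cnt)
        else:
            stages["lost_inside_network"].append(f_cnt)
    return stages


if __name__ == "__main__":
    # Hypothetical counters, only to show the shape of the result.
    print(classify_losses(
        expected=range(100, 106),
        gateway_log={100, 101, 102, 103, 104},
        gateway_console={100, 101, 102, 104},
        application={100, 102},
    ))
    # The two ratios quoted in the post above.
    print(loss_percent(17, 288), loss_percent(4, 288))
```

Separating the stages matters because the remedies differ: loss before the server points at backhaul or UDP, while loss after the console points at the network side.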

I assume your LPS8 is set to use Semtech legacy packet forwarder? UDP based… an inherently lossy and somewhat insecure Internet Protocol so depending on back haul connectivity and routing to TTN back end some loss to be expected… but 6% does seem a bit high…

Depending on your actual payload size and content, and given you don’t seem to lose sequential data, you might consider sending a moving window with not just the current data but possibly the last 2 or even 3 readings… which you then compare and extract at application level. Careful use of bytes (search working with bytes) may even allow some level of data compression. The nature of LoRaWAN is that on air time doesn’t extend in direct relationship to payload length but rather follows a staircase effect… whose step tread length varies with SF. So it may be that you can adapt without actually (significantly) impacting your on air time whilst adding resilience to your data stream…
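To make the moving-window idea concrete, here is a minimal sketch of both ends. It assumes each reading fits in an unsigned 16-bit value and that `0xFFFF` can be reserved as a "no reading yet" marker; the payload layout, `WINDOW` size and function names are all assumptions for illustration, not the format of the device in this thread.

```python
import struct

WINDOW = 3          # readings carried per uplink (assumed)
EMPTY = 0xFFFF      # padding marker, so real readings must stay below this


def encode_window(readings):
    """Device side: pack the newest WINDOW readings, newest first, big-endian."""
    recent = list(readings)[-WINDOW:][::-1]
    recent += [EMPTY] * (WINDOW - len(recent))
    return struct.pack(f">{WINDOW}H", *recent)


def merge_uplink(store, f_cnt, payload):
    """Application side: fill `store` (f_cnt -> reading) from one uplink.

    Slot 0 belongs to this f_cnt, slot 1 to f_cnt - 1, and so on, so a
    later uplink back-fills readings whose own uplink was lost.
    """
    for age, value in enumerate(struct.unpack(f">{WINDOW}H", payload)):
        if value != EMPTY and f_cnt - age >= 0:
            store.setdefault(f_cnt - age, value)
    return store


if __name__ == "__main__":
    store = {}
    merge_uplink(store, 10, encode_window([3301]))
    # Uplink 11 (reading 3305) is lost on the way.
    merge_uplink(store, 12, encode_window([3301, 3305, 3295]))
    print(sorted(store.items()))
```

With a window of 3 the data survives up to two consecutive lost uplinks; the mapping from slot to `f_cnt` only holds while the device sends exactly one reading per uplink.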

I can confirm that they are missing from my data cache.

The documentation from TTI says 10% is to be expected: https://www.thethingsindustries.com/docs/devices/concepts/best-practices/

Picking out a quick to get to dataset that’s running within the confines of my Bat Cave with a v2 & a v3 gateway on TTN/TTS, I’ve got 5% loss but I don’t have the stats from the gateways to correlate with nor does it have a complete download of Data Storage to cross reference.

I do have a gateway packet forwarder modification that would make it easier to gather stats. And I have a TTS OS instance that I could run a gateway & device against to see what the results are. And I could start gathering Data Storage but only on TTS.

Transmitting windowed data sets as @Jeff-UK suggests would be a simple solution. Storing uplinks on device and expiring them after a few hours (giving time for the server to request resends) is another.

But fundamentally, given your packet loss is less than TTI’s expectation and similar to mine I’m not sure there is much to be concerned about.

What is the data you are sending and why do you aspire to 100%?

Well, I am monitoring an open/close situation with a distance sensor. I have set my interval to 5 minutes, as more frequent measuring would have too much impact on my ‘fair use policy’. Is 5 minutes really important? Maybe not, I can probably adapt my application. Having more than 30 years of experience in administrative computing, physical computing was/is rather new to me and maybe my expectations were too high. But reading about IoT and its applications, I am still wondering how you manage it all with an accepted loss of 10%. When one uses a water sensor and loses just that packet that would tell me that there is water in the basement, one needs a different approach than an application relying on one alarm message. Thanks again for your support. As far as I understand, my configuration should be fine and a different approach is needed.

Because we don’t expect the use of the ISM unlicensed band to be for our own exclusive use so we expect a level of packet loss due to anything from a dodgy microwave being used for 20 minutes at lunch time or a big bag of water with meat wrapper having a fag whilst leaning up against the sensor housing five times a day. It’s transmitted in the blind, anything else could be in the airwaves at the time and there’s no handshaking or acknowledgement.

If the door is within 2 or 300m, then the NRF24L01 is pretty good for a ‘reliable’ transmission. If you want something shockingly bad, try ASK 433MHz! If you have power, maybe WiFi on a narrow beam antenna. Or a line of sight laser. Or maroon flares.

Before I type the following, if you really really need to know if a door is open, closed or in the process thereof, LoRaWAN is not a good fit but can work.

For the more important uplinks, my devices use a smaller payload, more frequently whilst awaiting a downlink that is initiated by my application server to acknowledge that “the system” is on the case.

For the rarest of circumstances where only LoRaWAN is an appropriate radio system, i.e. lots of very small battery driven sensors over a distributed area that are mostly ticking over, phoning home every few hours until something very bad happens that at least two or three sensors will be triggered on, then they use a confirmed uplink and keep going until they get that confirmation from the gateway, then revert to a pattern of uplinks with one in X being confirmed, until it gets the back-end acknowledgement.

However, the backend database usually predicts the forthcoming moment of doom based on readings thus far. It can also spot trends and send a downlink to increase the uplink frequency so that data arrives more frequently, thus ironing out some of the dropped packets.

The only time I’ve seen a device go in to full Blue Light Defcon 1 mode was a water trough when on a hot day a pile of horses decided to drink it dry in the morning. They then went on to the next trough, but the device doesn’t have any knowledge of its peers, so whilst no horses went thirsty, it just knew it had gone from enough to empty. As I said, usually the backend system can see the trend and give a prediction of how long before refill time. At some point I may link in weather forecasts to see if that helps plus some mandatory Machine Learning once I have time to play.

We (the boss & I) came to all of this via security systems - security, fire, access control, CCTV etc - so I have a pretty good idea of ways of monitoring a door remotely and it’s all about levels of certitude. I can bypass most security systems or sensors but for 99% of situations, a reed sensor + 120dB siren will do the job, but even then, a can of pepper spray will always persuade you to turn off the alarm for me. So you need to fit the monitoring system to what you are protecting.

I guess the TL;DR version is, send status “all good, I haven’t been stolen, battery level is X” reports every 60 minutes and then if the sensor is triggered, two byte packets on a different port in quick order - for EU on SF7 that’s one every 5 seconds on the 1% duty cycle (bit over the top) or one every two minutes on FUP but that’s spread over an hour so you could do once every 15 seconds. But reality is you just need to send with confirmed and then a normal uplink to get the backend acknowledgement about 10 seconds later and that’s it, alerted.
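The intervals quoted above can be checked with the standard Semtech time-on-air formula. A minimal sketch, assuming 125 kHz bandwidth, coding rate 4/5, an 8-symbol preamble, explicit header, CRC on, 13 bytes of LoRaWAN overhead with no FOpts, and the TTN Fair Use Policy figure of 30 seconds of uplink airtime per day; the function and variable names are my own.

```python
import math


def lora_airtime_s(phy_size, sf, bw_hz=125_000, cr=1, preamble=8,
                   explicit_header=True, crc=True):
    """Time on air in seconds for a LoRa frame of `phy_size` bytes.

    `cr` is 1..4 for coding rates 4/5..4/8.
    """
    t_sym = (2 ** sf) / bw_hz
    de = 1 if t_sym > 0.016 else 0          # low data rate optimisation (SF11/12 at 125 kHz)
    h = 0 if explicit_header else 1
    numerator = 8 * phy_size - 4 * sf + 28 + (16 if crc else 0) - 20 * h
    n_payload = 8 + max(math.ceil(numerator / (4 * (sf - 2 * de))) * (cr + 4), 0)
    return (preamble + 4.25 + n_payload) * t_sym


LORAWAN_OVERHEAD = 13      # MHDR + DevAddr + FCtrl + FCnt + FPort + MIC, no FOpts
FUP_SECONDS_PER_DAY = 30   # TTN Fair Use Policy uplink airtime budget

if __name__ == "__main__":
    # The 5-byte payloads in this thread: 18 bytes on air at SF7.
    print(f"{lora_airtime_s(5 + LORAWAN_OVERHEAD, 7):.6f} s")

    # A 2-byte alert payload at SF7.
    alert = lora_airtime_s(2 + LORAWAN_OVERHEAD, 7)
    print(f"airtime            {alert * 1000:.1f} ms")
    print(f"1% duty cycle      one uplink per {alert / 0.01:.1f} s")
    per_day = math.floor(FUP_SECONDS_PER_DAY / alert)
    print(f"fair use policy    {per_day} uplinks/day, one per {86400 / per_day:.0f} s")
```

The first line reproduces the `consumed_airtime` of 0.051456s reported for the uplinks earlier in the thread. It also shows the staircase effect: at SF7 an 18-byte and a 19-byte frame take the same time on air, so one extra payload byte here costs nothing.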

Now your next problem is, is your phone in service for that all important SMS? For a small fee I can arrange a satellite pager and modify it with a big toe fitted taser module to ensure you wake up.