Packets are lost between the gateway and the application in v3

kadam1265 · September 9, 2021, 10:07am

Hi everyone,

I have been using TTN v2 for over a year, and it worked without problems. Now I am trying to migrate everything to the v3 stack, and I have a big problem with it. I have a gateway and 3 devices in one application. The devices use ABP. The gateway receives all uplinks from the devices correctly, but not all uplinks show up in the application. Two devices have a low frame counter (under 1000) because I restarted them not long ago, and one device has a high frame counter (around 20,000) because it has been operating outside for months. The third device, the one with the frame counter around 20k, does not show up in the v3 application at all, although it works on the v2 stack through another gateway, and the v3 gateway receives all of its messages.

The third device in the v3 application (never seen):
[image]
The third device in the v2 application (it is working):
[image]
The third device in the v3 gateway traffic (the gateway receives the packet):
[image]

I have a similar problem with the two other devices. The gateway receives all of their messages, but not all of them show up in the application; messages are randomly dropped between the gateway and the application in v3.

Has anybody had a similar problem, or does anyone know the solution? I have to find a solution because I need it for my school project.

descartes · September 9, 2021, 12:12pm

Do you have any statistics to put this in context, like the percentage of missing messages?

Which integrations are you using? Do you back it up with Data Storage as well?

Oh, well, in that case we’ll get right on to it

cslorabox · September 10, 2021, 3:26am

LoRaWan requires either end to reject packets from the other when the frame count skips forward by more than 16384 from the last seen value.

If the V3 network has never seen your node, presumably it’s expecting an uplink frame count of 0, and a value of 20000 is further ahead of that than the permitted 16384.

You may need to change the expected uplink frame count for the device in the network, eg, set it to the value sent recently, so that the next packet will be only a short skip forward.
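The rule described above can be sketched in a few lines. This is only an illustration of the gap check, not the network server's actual code: the function name is hypothetical, the limit of 16384 is the value quoted in this post, and 16-bit counter rollover is deliberately ignored.

```python
# Illustrative sketch only; accepts_uplink is a hypothetical helper name.
MAX_FCNT_GAP = 16384  # limit quoted in the post above


def accepts_uplink(last_seen_fcnt: int, received_fcnt: int,
                   max_gap: int = MAX_FCNT_GAP) -> bool:
    """Return True if the received counter is ahead of the last seen
    value by no more than max_gap (rollover is not handled here)."""
    gap = received_fcnt - last_seen_fcnt
    return 0 < gap <= max_gap


# A network that still expects a counter near 0 rejects a node at 20000.
print(accepts_uplink(0, 20000))      # False
# After the expected counter is set close to the node's value, it passes.
print(accepts_uplink(19990, 20000))  # True
```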

You also, incidentally, need to know the frame count of the last downlink sent to the device, so that the V3 network can send downlinks with legally higher frame counts. If it starts downlinks from 0, then the node will ignore them, and that’s entirely unacceptable.

Remember that proper support of downlink and MAC command parsing is absolutely required for any device that’s going to be used on V3. If your field deployed device won’t support that, you must not try to use it with v3.

kadam1265 · September 11, 2021, 5:44am

The rate of missing packets is around 57%. I use a custom webhook to save the data to my database.
I don’t use the Data Storage integration because I don’t know how to use it, but I have activated it.

This is the database. The node sends a message every 10 minutes, but a lot of messages are missing.

I have an interesting observation.
The gateway receives all the messages, but there is a pattern in the messages that appear in the application. The application only receives messages sent on the last 3 channels (868.1 MHz, 868.3 MHz, 868.5 MHz).

I don’t understand why this is happening. I added the frequencies of all 8 channels:

kadam1265 · September 11, 2021, 5:52am

I changed the frame counter to around 20k with the CLI, and the application started receiving the messages, but with a 57% error rate.

Sadly, my devices are only capable of sending messages, because I wrote the whole program without libraries and low power consumption is very important: each device has to work for more than 1 year on two AA batteries. In the future I plan to develop the device code to use OTAA instead of ABP.

For the CLI install this page was very helpful: TTN/TTS V3 command line cheat sheet

Jeff-UK · September 11, 2021, 7:55am

Do you have corresponding GW logs? If not can you grab some GW logs and arrange to extract the matching period of data from the database for review?

A simple test with the Data Storage integration is to look at the instructions and where it gives you the GET Https:xxxxxxxxxxxxxxx… just ignore the GET and paste everything from https… into your browser address and amend the {type} to suit your needs. It will then pull the current storage accumulation down - which is typically 24-36hrs worth (so don’t do that too often!). Copy the whole text shown in the browser (works in Firefox ESR) into a new .txt file, then copy/import that into Excel - as preformatted text it will then present as very long lines (corresponding to each “Result” - the length determined by the number of receiving GWs and hence the length of the associated metadata). Scroll along the lines and near the front you will see an ‘f_cnt’ value, which allows you to eyeball and quickly identify any missing messages. You can compare those with your own DB and see if there is a mismatch with your own Integration, and then also compare with the GW log to see if there is a real ‘loss’ between the GW and either Integration.
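The eyeballing step can also be done with a short script once the ‘f_cnt’ values have been copied out by hand. This is a minimal sketch: the function names and the sample counter values are made up for illustration, and it assumes the counters were not reset and did not roll over during the period examined.

```python
# Illustrative sketch; function names and sample values are hypothetical.
def missing_frame_counts(seen):
    """Return the frame counts absent between the lowest and highest
    counter value in `seen` (duplicates and ordering do not matter)."""
    present = set(seen)
    if not present:
        return []
    return [n for n in range(min(present), max(present) + 1)
            if n not in present]


def loss_rate(seen):
    """Fraction of frames missing within the observed counter range."""
    present = set(seen)
    if not present:
        return 0.0
    expected = max(present) - min(present) + 1
    return (expected - len(present)) / expected


sample = [20001, 20002, 20005, 20004, 20008]
print(missing_frame_counts(sample))  # [20003, 20006, 20007]
print(loss_rate(sample))             # 0.375
```

Running the same function over the values from the gateway log, the Data Storage export and your own database shows at which hop the frames disappear.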

kadam1265 · September 11, 2021, 8:33am

I don’t have GW logs, but no messages are missing between the application and my database. In the TTN V3 console, I open the gateway’s live data and I see all the packets (on all 8 channels). But when the gateway receives a packet on one of the 5 additional channels (867100000, 867300000, 867500000, 867700000, 867900000), the message does not show up in the application and, of course, not in the database either. When the gateway receives a packet on one of the 3 base channels (868100000, 868300000, 868500000), the message shows up in the application and in the database.

Is there any option for getting the GW log? I use an STM32 gateway board with a Nucleo. I know the gateway sends the received messages over USB, but this is an outdoor gateway, so I can’t use the USB.
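As a rough sanity check of this observation, the loss that would be expected if only the 3 base channels got through can be computed. This is a sketch under two assumptions: the node picks uniformly at random among the 8 frequencies, and every uplink on the 5 additional channels is dropped. The function name is hypothetical.

```python
from fractions import Fraction

# Frequencies in Hz, as listed in the post above.
BASE_CHANNELS = [868100000, 868300000, 868500000]
EXTRA_CHANNELS = [867100000, 867300000, 867500000, 867700000, 867900000]


def expected_loss(tx_channels, accepted_channels):
    """Fraction of uplinks lost if the node picks uniformly among
    tx_channels and only accepted_channels reach the application."""
    accepted = set(accepted_channels)
    lost = sum(1 for freq in tx_channels if freq not in accepted)
    return Fraction(lost, len(tx_channels))


loss = expected_loss(BASE_CHANNELS + EXTRA_CHANNELS, BASE_CHANNELS)
print(f"expected loss: {float(loss):.1%}")  # expected loss: 62.5%
```

Under these assumptions the expected loss is 5/8, which is in the same range as the roughly 57% reported earlier in the thread. That is consistent with the extra channels being dropped, but it does not prove it.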

kadam1265 · September 11, 2021, 9:09am

I use this command in CLI: ttn-lw-cli end-devices get kadam1265-weathernet-v3 atmega4809-0002 --mac-state.current-parameters.channels

And I see there are only three channels:

Is there any way to add the other 5 channels?

Jeff-UK · September 11, 2021, 9:44am

This is good input but a little confusing. So GW and app comms work on the 3 base channels, and you say using the 5 extended channels does not result in messages getting to the application… but then your devices do not appear to be set to use these channels, so I’m not sure how your 1st assessment is valid. Yes, you need to extend the node setup to include the full 8-channel suite. Also, the application and device config/registration need to be correctly set to reflect the full 8-channel node capability.

On the GW itself or in the V3 console?

Is all the live traffic related to your device in the V3 console reflected in the application?

Also am very concerned by

Ok what GW are you using, which concentrator board, which RF/baseband solution (SX130?), which packet forwarder? What is source of any libraries/firmware.

Also given

and

That suggests your device is mis-configured, and if prompted by the NS via a MAC command to add the extra channels (which appear to have been set up in the registration - were these in when 1st registered or have you added them through the console after the fact? IIRC there may be an issue from that not ‘taking’ properly), your node may not be capable of actioning the request depending on firmware etc. so

Full details on Node hardware and firmware/libraries please…inc sources where acquired and any changes you have made.

It helps us understand what is happening and likely causes if we have a full set of information wrt end to end configuration vs playing 20 questions to get to potential source of error.

kadam1265 · September 11, 2021, 10:11am

On the GW itself or in the V3 console?

In the v3 console.

Ok what GW are you using, which concentrator board, which RF/baseband solution (SX130?), which packet forwarder? What is source of any libraries/firmware.

I use the ST LRWAN_GS_HF1(SX1301/SX1257) gateway extension board, with the original firmware.

That suggests your device is mis-configured

The gateway and the nodes were working properly with v2, and with another gateway (Tracknet TabsHub) v2 still works perfectly with the same nodes.
The gateway receives on all 8 channels. For example, on 867.500 MHz:

Full details on Node hardware and firmware/libraries please…inc sources where acquired and any changes you have made.

The nodes: the MCU is an ATmega4809, and the LoRa module is an RFM92W (SX1272). I don’t use public libraries, because I wrote my own. This was working perfectly on v2 with all 8 channels.

This is the part of the node’s code which chooses the TX channel:

This was working with the TTN V2!

cslorabox · September 11, 2021, 1:48pm

Incomplete implementations are unfortunately not something that can be permitted, as they cause the network to run around in circles trying to get proper behavior from your nodes.

It’s absolutely required that you not only receive downlinks, but also properly process all of the MAC commands, some of which have mandatory responses you must make. If you look for example at the history of what MCCI has gone through getting their branch of LMiC to pass certification tests, it’s no small job.

Note however that a LoRaWan stack ultimately only involves the wakeful behavior of your device. With care to interaction with it, using one does not prevent you from implementing maximum power saving strategies to the extent possible while achieving spec compliance. You cannot skip the receive windows, but a LoRaWan stack will have logic for running the receiver for as little time as possible around the starting time of any rx reply, which you can further tune.

kadam1265 · September 11, 2021, 2:53pm

I understand, but I can’t use LMiC in MPLAB X, because I can’t find any instructions for it. My nodes send a packet every 10 minutes, and they randomly choose 1 channel from the 8 predefined frequencies. They do nothing more.

My question is: why does the application not see the messages that arrive on the other 5 channels, even though the gateway sees them? Of course I will try to develop the node’s code in the future, but right now I have to resolve this problem.

cslorabox · September 11, 2021, 2:55pm

No, you must not even ATTEMPT to operate nodes that do not correctly implement LoRaWan, including receiving and processing MAC commands.

There’s no point in wasting any more time debugging until you are using firmware actually designed to work correctly.

Non compliant nodes are detrimental to the network as a whole, and must not be used.

The issue may not even actually be entirely unrelated: in the spec, allowed channels are part of what those MAC commands you’re not prepared to handle control.

kadam1265 · September 11, 2021, 3:04pm

OK, please tell me what the solution is. I hope your solution is not to buy a device in a shop… Can nodes not work in ABP? Because I have LoRa certified modules too, and they also use all 8 channels…

cslorabox · September 11, 2021, 3:04pm

Incidentally, that gateway is lacking a TX power amp, which means that in an environment where there are any other gateways present, connecting it to TTN is a denial of service attack, because it may be selected as geometrically closest to someone else’s node, but lack the actual transmit power to reach it with downlinks - creating the exact same difficulty as when a node doesn’t implement receive, but where the node actually is correct, and it’s the gateway that’s not compliant.

That board is for private network bench tests only.

cslorabox · September 11, 2021, 3:08pm

Use an open source LoRaWan stack known to correctly implement the spec. The main candidates are:

LoRaMAC-node (for a variety of ARM targets, ST has their own fork)
LMiC, usually in the Arduino-ified MCCI version or at least something as up to date, not the incomplete older versions which don’t correctly handle MAC commands.

A correct LoRaWan stack is probably most of an engineer-year to understand the requirements and develop if you try to do it from scratch, maybe more.

There is no mode of operation where TTN permits nodes that do not correctly implement downlink and handling of MAC commands in accordance with the LoRaWan spec.

If they are certified, they correctly implement the channel control MAC commands, so if used correctly they do what the network server expects that specific node to, and not what it does not expect of that node at that point in time.

descartes · September 11, 2021, 3:15pm

As @cslorabox has outlined, using non-compliant firmware on a device and using a gateway that has pre-v3 firmware and may well have its own issues, or may actually be the element that is causing issues, is detrimental to the running of the community network.

There is no apparent reason why you can’t use MCCI LMIC v3.1 or better on the 4809, I do and it works fine. You could try this as an experiment to see if it’s your MAC or the gateway that is the problem.

If you have another known good device, you could try that with your gateway.

But if you cannot quickly & efficiently perform such tests, given that you have a non-compliant MAC when you could have a compliant one, and a gateway that’s not known to be up to spec, please disconnect your gateway from the TTN community network.

You are totally free to try experiments with your own copy of TTS if you wish, that would be in scope for the forum. But not anything causing issues to the community at large.

Once you have decided on a course of action and perhaps run some additional tests, feel free to open a new topic.