Network and application session keys the same among all my ABP devices

verdiagriculture · May 26, 2020, 4:30pm

Hey everyone,

I am using the ABP method to connect around 20 nodes to a gateway. The devices usually uplink for a while, but as time passes, fewer and fewer of them do. After 10 days, none of the devices were uplinking.

I was trying to find out why, and I read that the NWKSKEY and the APPSKEY should be different on all the devices. I have them the same on all my devices because it made it faster to program.

Has anyone else made this mistake? If so, are the devices not uplinking a symptom of this issue?

Thanks!

descartes · May 26, 2020, 5:00pm

The combination of NwkSKey and AppSKey is meant to be unique per session per device. Someone with a better understanding can unpick the details, but I’d suspect the uplink counters slowly get so badly out of sync with each other that TTN just rejects the messages.

So you could temporarily try turning off the Frame Counter Checks but medium term some reprogramming would be in order.

arjanvanb · May 26, 2020, 5:19pm

And what about the DevAddr?

(Typically the DevAddr is not unique for all existing devices, but that needs a unique NwkSKey indeed.)

And just to be sure: you’re not using the Semtech demo keys, right? (Both keys being 2B7E151628AED2A6ABF7158809CF4F3C for that.)

verdiagriculture · May 26, 2020, 5:31pm

I auto-generated both keys once, and used them for all the devices. I autogenerated a new DevAddr for every device.

When I go back into the field next week, I will reprogram all of them so that they have a unique NwkSKey and AppSKey, but I want to nail down the bug before then.

arjanvanb · May 26, 2020, 5:46pm

As you can read in the explanation about the DevAddr I linked-to above, TTN will use the combination of DevAddr and NwkSKey. So I don’t think the non-unique NwkSKey is the cause of your problem.

However: any chance the current values for the uplink counter (for each device) are lower than the last known values, caused by devices resetting somehow? (If they use solar power then maybe their batteries might go too low at night?) If that’s the cause, then your gateway traffic should still see the uplinks just fine (and also show the current counter), but they would not be routed to your application. (Unless you disable the frame counter security.)

Beware that changing any setting for an ABP device will reset their counters. (When using TTN Console.)
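To illustrate the frame counter check described above, here is a minimal Python sketch. It is not TTN's implementation: the `Session` class, the `accept_uplink` function, and the `-1` start value are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Network-side state for one ABP device (hypothetical, simplified)."""
    last_fcnt_up: int = -1            # -1: no uplink seen yet (convention of this sketch)
    fcnt_check_enabled: bool = True


def accept_uplink(session: Session, fcnt_up: int) -> bool:
    """Return True if the uplink would be routed to the application."""
    if session.fcnt_check_enabled and fcnt_up <= session.last_fcnt_up:
        return False                  # counter not higher than last known: dropped
    session.last_fcnt_up = fcnt_up
    return True


# A device sends frames 0..2, power cycles (its counter restarts at 0),
# and then keeps sending.
session = Session()
results = [accept_uplink(session, n) for n in [0, 1, 2, 0, 1, 2, 3]]
print(results)
```

In this sketch the three uplinks sent after the reset are dropped, and traffic only resumes once the device's counter passes the last value the network remembers. With `fcnt_check_enabled=False` every uplink is accepted.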

verdiagriculture · May 26, 2020, 5:46pm

Since I am using ABP and I power cycle the end node, does this mean that it is a new session, and that I should have programmed new keys?

verdiagriculture · May 26, 2020, 5:51pm

Okay, thanks.
I had hoped that this was the cause of my issues, but I didn’t really believe that it was either.

I’ll keep combing through my code.

arjanvanb · May 26, 2020, 5:52pm

For ABP you don’t need to program new keys every time you reset them. But you should reset the counters in TTN Console (or using the command line ttnctl) at the same time you reset the ABP device. (Or disable the security, but like linked-to in an earlier answer: that might need handling in the device as well, for the downlink counters.)

I think the key question is: do you see the uplinks in the gateway traffic?

TD-er · May 27, 2020, 7:06am

As far as I know, the DevAddr is only used with OTAA, to generate a session key.
When using ABP, you essentially replace that step by generating that session key for that single node.
N.B. you can even do it “manually” by first performing an OTAA and then somehow copying those session keys into an ABP configuration. But that is even more cumbersome than generating an ABP session key in the first place. (And it will probably be void if the original node does a new OTAA with the same DevAddr.)

If all nodes use the same session key, but not at the same time, then you would still run into the problem where the frame counters will be already used.

So if you disable the frame counter check, in the console, you will be able to see those “duplicate” messages again.
I don’t know what the TTN network (or maybe a gateway) does when it detects a session key is used on several nodes. Nor if it can be detected anyway. I guess it can be detected when two nodes send using the same session key at the same time, received by different gateways. (or on different channels at the same time)

Anyway, I would suggest to generate unique ABP keys per node.

arjanvanb · May 27, 2020, 7:35am

Nope. It’s also present in every uplink and downlink message, for both ABP and OTAA devices. And it’s used when calculating the MIC, along with other details such as the frame counter.

The session keys are not part of the message, but the MIC is. However, validation of the MIC starts with a list of all devices of a given DevAddr, and that’s not the same for all devices in this case. See also How does a network know a received packet is for them? - #2 by htdvisser
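As a rough illustration of where these fields sit in a frame, the Python sketch below reads the header of a LoRaWAN 1.0.x data uplink. The function name and the example frame are made up for illustration; the MIC bytes in the example are dummy values, not a valid MIC.

```python
import struct

# Only the two uplink data message types are named in this sketch.
MTYPES = {2: "unconfirmed data up", 4: "confirmed data up"}


def parse_uplink_header(phy_payload: bytes) -> dict:
    """Extract the plain-text header fields of a LoRaWAN 1.0.x data frame."""
    # MHDR (1) + DevAddr (4) + FCtrl (1) + FCnt (2) + MIC (4) = 12 bytes minimum
    if len(phy_payload) < 12:
        raise ValueError("frame too short")
    mhdr = phy_payload[0]
    # DevAddr and FCnt are sent little-endian on the air.
    dev_addr, fctrl, fcnt = struct.unpack_from("<IBH", phy_payload, 1)
    return {
        "mtype": MTYPES.get(mhdr >> 5, "other"),
        "dev_addr": f"{dev_addr:08X}",
        "fcnt": fcnt,                       # 16 least significant bits only
        "fopts_len": fctrl & 0x0F,
        "mic": phy_payload[-4:].hex().upper(),
    }


# Hypothetical frame: MHDR, DevAddr 26011F3A, FCtrl, FCnt 6, FPort, payload, dummy MIC
frame = bytes.fromhex("40" "3A1F0126" "00" "0600" "01" "AABB" "11223344")
print(parse_uplink_header(frame))
```

The DevAddr and the frame counter are readable without any key, while NwkSKey and AppSKey never appear in the frame; only the 4-byte MIC derived from NwkSKey does.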

For security: yes, absolutely. But I doubt this problem is caused by using the same secrets. And if it is: the messages should be visible in the gateway’s log then, if one has access to that. (For many gateways the Traffic page in TTN Console is not working right now.)

Aside: in an earlier, meanwhile deleted, topic @verdiagriculture wrote that the frame counter checks were disabled. Also the number of messages per day is low, so no problems to be expected with, e.g., 16-bit counters.
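For a sense of scale, a quick back-of-the-envelope calculation in Python. The rate of 24 uplinks per day is a made-up example; the thread does not state the actual rate.

```python
def days_until_rollover(uplinks_per_day: float, counter_bits: int = 16) -> float:
    """Days before a frame counter of the given width wraps back to 0."""
    return (2 ** counter_bits) / uplinks_per_day


# Hypothetical rate: one uplink per hour.
print(round(days_until_rollover(24)))  # 65536 / 24, about 2731 days
```

At a rate like that, a 16-bit counter would take years to wrap, so rollover cannot explain devices going silent after 10 days.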

arjanvanb · May 27, 2020, 8:09am

Ah, maybe you meant the DevEUI, not the DevAddr? Indeed, an ABP device does not know that, and an OTAA device only uses that during its join (along with the AppEUI and the secret AppKey). For uplinks, the DevEUI as known in TTN Console will still be added by the network and passed along in every MQTT API message and integrations (labeled hardware_serial), but I don’t know if they’re used otherwise.

TD-er · May 27, 2020, 9:10am

Yep, the HW address of the node. That’s the one I meant.

verdiagriculture · May 28, 2020, 9:34pm

Thanks for all the replies.
I don’t have immediate access to gateway traffic. But I am getting a new gateway soon, and with the 8 or so nodes that I have, I hope to recreate the bug and solve it.

verdiagriculture · May 29, 2020, 2:03am

I finally caught some gateway traffic on the console. It looks horrendously long. I see deduplicate commands, and it got 3 devices from the network server and performed 2 MIC checks. Maybe this is the issue that is stopping my devices from uplinking properly. A log like this isn’t normal, is it?

arjanvanb · May 29, 2020, 8:13am

Wow, that’s a messy post. I see duplicate details in the screenshots, and what are we even looking at? Most of it looks like a downlink trace?

Also, is this about traffic that, though shown in the gateway Traffic, did not arrive in your actual application? If so: how is that application getting its data from TTN? If that is not using MQTT: what does a command line MQTT client receive? How does a trace of dropped uplinks compare to uplinks that are handled correctly?

(Just to be sure: if you’re only looking at the Data page in TTN Console, then that often stops showing data without any clear reason. So: look at what your actual application is receiving. Or enable the Data Storage integration for debugging.)

What are the other details of the uplink that triggered that downlink? The one shown in the screenshot shows FCntUp 6. But the downlink says reason: initial, so seems to be a US915 initial ADR command, which should only be sent once in the lifetime of an ABP device, emphasis mine:

There are several moments when an ADR request is scheduled or sent:

The initial ADR Request (for US915 and AU915). This is sent immediately after join and is mainly used to set the channel mask of the device. This one is a bit tricky, because we don’t have enough measurements for setting an accurate data rate. To avoid silencing the device, we use an extra “buffer” of a few dB here. This request is only needed with pre-LoRaWAN 1.1 on our v2 stack. With LoRaWAN 1.1 devices on our v3 stack, we can set the channel mask in the JoinAccept message. ABP devices pre-LoRaWAN 1.1 will only get this message once, if they reset after that, they won’t get the message again; this issue is also solved by LoRaWAN 1.1.

However, now that gateway traffic is (partly) routed through V3 components, maybe that’s no longer only sent once.

Do you see downlinks for every uplink?

Yes, all as expected. (The log entry duplicates: 1 actually indicates there is only a single occurrence, not two.)

So, the same DevAddr is used by 3 ABP devices in your region. After 2 MIC checks it found your device, hence does not need to check the 3rd device as well. All just as expected, as explained in one of the links in my earlier answers. (This is not a problem for you, as you wrote that all your devices have a different DevAddr along with your single NwkSKey. So, the other 2 devices are someone else’s ABP devices, using a different NwkSKey.)
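The lookup described above can be sketched roughly as follows in Python. This is not TTN's code, and `toy_mic` is a stand-in: real LoRaWAN computes the MIC with AES-128 CMAC keyed by NwkSKey, while this sketch uses a truncated HMAC-SHA256 so that it runs with the standard library alone. The registry layout, device IDs, and keys are all hypothetical.

```python
import hashlib
import hmac


def toy_mic(nwk_skey: bytes, dev_addr: str, fcnt: int, payload: bytes) -> bytes:
    """Stand-in for the LoRaWAN MIC (the real one is AES-128 CMAC based)."""
    msg = dev_addr.encode() + fcnt.to_bytes(4, "little") + payload
    return hmac.new(nwk_skey, msg, hashlib.sha256).digest()[:4]


def find_device(registry, dev_addr, fcnt, payload, mic):
    """Try each device registered under dev_addr until one MIC matches.

    Returns (device_id or None, number of MIC checks performed).
    """
    checks = 0
    for device_id, nwk_skey in registry.get(dev_addr, []):
        checks += 1
        if hmac.compare_digest(toy_mic(nwk_skey, dev_addr, fcnt, payload), mic):
            return device_id, checks
    return None, checks


# Three ABP devices in the region happen to share one DevAddr (hypothetical data).
registry = {
    "26011F3A": [
        ("someone-elses-node-a", bytes([1] * 16)),
        ("my-node", bytes([2] * 16)),
        ("someone-elses-node-b", bytes([3] * 16)),
    ]
}
mic = toy_mic(bytes([2] * 16), "26011F3A", 6, b"\xaa\xbb")
print(find_device(registry, "26011F3A", 6, b"\xaa\xbb", mic))
```

Here the second candidate matches, so the third is never checked, which mirrors the "3 devices, 2 MIC checks" seen in the trace. Because the DevAddr selects the candidate list, devices that share a NwkSKey but have different DevAddr values are still told apart.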

I still think that the key question is: do you see the missing uplinks in the gateway’s Traffic? So, do you still see a device’s traffic in the gateway after a device stopped working after 10 days?

descartes · May 29, 2020, 8:19am

Whilst figuring out what goes on with duplicated keys may be interesting, is this not like figuring out the exact dynamics of dropping lit matches into the petrol filler of a car? If you don’t try that and drive normally (aka configure the nodes correctly), does it work?

verdiagriculture · May 29, 2020, 5:18pm

Yes, I do apologize for the messy post; I will put more work into my posts. I just got really excited because I hadn’t seen a downlink (yes, it’s a downlink trace) in a long time. The uplink associated with this downlink was received by TTN, so one device broke through a 5-day period of silence.

There is a lot of speculation as to what the issue is right now, and I don’t like making these kinds of wishy-washy posts. In one to two weeks’ time, I will have more concrete data. I will hold off on posting until I have that data to solve the problem with.

Your comments about what the trace log meant were very helpful! I’ll keep the ADR initial problem in mind as I debug.

verdiagriculture · May 29, 2020, 5:26pm

Hey Descartes,
First, I love your username.
I meant recreate the uplink bug. My main goal is to make my devices uplink consistently, and since I don’t understand things very well, the only way I can make sure the problem is fixed is if I:

Recreate uplink bug -> implement fix -> uplink bug gone -> remove fix -> uplink bug reappears

Only then can I be confident that the fix addresses the core issue.

descartes · May 29, 2020, 5:59pm

As per other thread, I should be able to solve this for you with my tested Arduino + RFM95W code - I’ve four nodes based on this, office, at home, one outside with four DS18B20’s and the original low power test - apart from one cat related incident, they just run, three are on 50,000+ uplinks, the original is on 122,341.

verdiagriculture · May 30, 2020, 5:54am

That’s pretty impressive. After scouring the internet, I found a new candidate that may be the cause of my issues: LMIC Timing. It seems the Arduino’s millis counter is causing timing issues, as previously mentioned by CongducPham and matthijskooijman.
Does your Arduino implementation use a similar syncing scheme?