LoRaWAN simulation issues

kersing · October 27, 2020, 5:51pm

As the sources are available, there probably is. On the other hand, I hope not, as everyone always has special use cases where the legal/fair use/whatever limits do not need observing…

cslorabox · October 27, 2020, 5:54pm

Then you really need to include the airtime limits in the simulation!

Is there any way to avoid the LoRaWAN limitations for a specific case just like this one?

Simulate a setup in a region that doesn’t have downlink airtime limits - but then your result only applies to that region, and not one which does.

But really, if you’re hitting downlink airtime limits, you’re already shooting yourself in the foot, because it means your gateway is spending too much time transmitting and thus unable to hear uplinks from nodes…

Did you accidentally “power on” a bunch of nodes that need to register all at the same time? That’s not very realistic, simulate instead a person putting batteries in them one at a time, or pulling out a battery saver tape or whatever.

cultsdotelecomatgmai · October 27, 2020, 5:58pm

Hi @dpergarc, please do not do any form of stress testing on a network that is being used by other people for production work.

descartes · October 27, 2020, 6:11pm

None of which particularly needs to replicate the protocol. You can use real life to figure out metrics like transmission time (which can actually be calculated to a reasonable level), typical gateway processing, backhaul responses and network processing times. Then write some algorithms, run simulations and unleash some R to evaluate the results.
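As an illustration of calculating the transmission time, here is a minimal sketch based on the time-on-air formula from the Semtech SX127x datasheet. The function name and the defaults (125 kHz bandwidth, coding rate 4/5, 8-symbol preamble, explicit header, CRC on) are assumptions picked for the example, not something taken from this thread.

```python
import math

def lora_airtime_s(payload_len, sf, bw_hz=125_000, cr=1,
                   preamble_syms=8, explicit_header=True, crc=True):
    """Approximate LoRa time on air in seconds (Semtech SX127x formula)."""
    t_sym = (2 ** sf) / bw_hz                      # symbol duration
    # Assumption: low data rate optimisation is on when a symbol exceeds 16 ms
    de = 1 if t_sym > 0.016 else 0
    ih = 0 if explicit_header else 1
    numerator = 8 * payload_len - 4 * sf + 28 + (16 if crc else 0) - 20 * ih
    n_payload = 8 + max(math.ceil(numerator / (4 * (sf - 2 * de))) * (cr + 4), 0)
    t_preamble = (preamble_syms + 4.25) * t_sym
    return t_preamble + n_payload * t_sym

# A 23-byte join request at SF7BW125
print(f"{lora_airtime_s(23, 7) * 1000:.1f} ms")
```

With these defaults a 23-byte frame at SF7BW125 comes out at roughly 61.7 ms; slower spreading factors grow quickly from there.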

It would help if you could tell us why you need to figure this out.

kersing · October 27, 2020, 6:20pm

That is an issue we regularly experienced at workshops. Last few workshops I ran I made sure to have multiple gateways on site.

cslorabox · October 27, 2020, 6:25pm

Seems like it could also be handled with better timing randomization at startup and between join attempts.

Also possibly code optimized for such a setting by using only fast SF’s for the first five minutes or so before starting to include slower ones.
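A rough sketch of what such randomization could look like in a simulated node. Everything here (function names, the 10 second base, the one hour cap, the five minute "fast SF" period) is a hypothetical choice for illustration, not a value from the LoRaWAN specification.

```python
import random

def join_attempt_offsets(attempts, base_s=10.0, cap_s=3600.0, rng=None):
    """Return start times (seconds after power-on) for successive join attempts."""
    rng = rng or random.Random()
    t = rng.uniform(0.0, base_s)          # random delay before the first attempt
    offsets = []
    for n in range(attempts):
        offsets.append(t)
        # The back-off window doubles each attempt, up to the cap
        window = min(cap_s, base_s * 2 ** (n + 1))
        t += rng.uniform(window / 2, window)
    return offsets

def allowed_sfs(elapsed_s, fast_period_s=300.0):
    """Only fast spreading factors at first, slower ones added later."""
    return [7, 8, 9] if elapsed_s < fast_period_s else [7, 8, 9, 10, 11, 12]

for t in join_attempt_offsets(4, rng=random.Random(42)):
    print(f"{t:8.1f} s  SFs: {allowed_sfs(t)}")
```

Giving each simulated node its own random generator keeps a batch of nodes that "power on" together from joining in lockstep.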

dpergarc · October 28, 2020, 5:10pm

Hi guys!

Thank you very much again for answering so quickly!!!

Do not worry, I am not trying to execute this in a real scenario; I am using the TTN stack with Docker on my computer.

That might be (it makes sense actually), but this is not the case I am afraid, since I am doing the following sequence:

Join-accept procedure → OK.
Wait 15 minutes.
Join-accept procedure → OK.
Wait 15 minutes.
Join-accept procedure → NOT OK.

That is weird, as I am not sending anything else to the gateway, from either the emulated node or any other node.

I will look for one, just to run a small test and rule out the airtime limitations. Would you mind telling me one? It might be quicker, I guess.

Thanks beforehand, looking forward to hearing from you soon.

Kind regards.
Daniel.

cslorabox · October 28, 2020, 9:10pm

How are you implementing time and timestamps in your simulation?

dpergarc · October 29, 2020, 8:45am

Hi there!

I am attaching a real example of the Semtech UDP packets used this morning (both the rxpk and the stat, which belong to the first communication steps):

/*******************************************************************************************/

Attempt 1 - OK

Stat
{"time":"2020-10-29 08:29:44 GMT","lati":0,"long":0,"alti":0,"rxnb":0,"rxok":0,"rxfw":0,"ackr":100,"dwnb":0,"txnb":0}
Rxpk
{"time":"2020-10-29T08:29:47.678004Z","tmst":1603956587,"powe":14,"chan":0,"rfch":0,"freq":868.1,"stat":1,"modu":"LORA","datr":"SF7BW125","codr":"4/5","rssi":-35,"lsnr":5,"size":46,"data":"AO/Nq5B4VjQSAJmqu8zd7v/Me0zhInc="}

Attempt 2 - OK

Stat
{"time":"2020-10-29 08:32:06 GMT","lati":0,"long":0,"alti":0,"rxnb":0,"rxok":0,"rxfw":0,"ackr":100,"dwnb":0,"txnb":0}
Rxpk
{"time":"2020-10-29T08:32:08.878004Z","tmst":1603956728,"powe":14,"chan":0,"rfch":0,"freq":868.1,"stat":1,"modu":"LORA","datr":"SF7BW125","codr":"4/5","rssi":-35,"lsnr":5,"size":46,"data":"AO/Nq5B4VjQSAJmqu8zd7v9wb/g86W0="}

Attempt 3 - FAIL

Stat
{"time":"2020-10-29 08:32:38 GMT","lati":0,"long":0,"alti":0,"rxnb":0,"rxok":0,"rxfw":0,"ackr":100,"dwnb":0,"txnb":0}
Rxpk
{"time":"2020-10-29T08:32:41.766004Z","tmst":1603956761,"powe":14,"chan":0,"rfch":0,"freq":868.1,"stat":1,"modu":"LORA","datr":"SF7BW125","codr":"4/5","rssi":-35,"lsnr":5,"size":46,"data":"AO/Nq5B4VjQSAJmqu8zd7v/d1L8pvIA="}

Note

Most of the values are made up, but I also tried random values and got the same wrong result.

/*******************************************************************************************/
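For anyone who wants to look inside the "data" field of the rxpk entries above, here is a small sketch that base64-decodes the first one and splits it according to the LoRaWAN 1.0.x join-request layout (MHDR, AppEUI, DevEUI, DevNonce, MIC, with multi-byte fields sent little-endian). The function and key names are made up for the example, and the MIC is not verified.

```python
import base64

def parse_join_request(data_b64):
    """Split a base64 join-request PHYPayload into its fields (no MIC check)."""
    raw = base64.b64decode(data_b64)
    if len(raw) != 23:
        raise ValueError(f"expected 23 bytes for a join request, got {len(raw)}")
    return {
        "mhdr": raw[0],                                   # 0x00 = join request
        "app_eui": raw[1:9][::-1].hex().upper(),          # little-endian on the wire
        "dev_eui": raw[9:17][::-1].hex().upper(),         # little-endian on the wire
        "dev_nonce": int.from_bytes(raw[17:19], "little"),
        "mic": raw[19:23].hex().upper(),                  # shown in wire order
    }

fields = parse_join_request("AO/Nq5B4VjQSAJmqu8zd7v/Me0zhInc=")
for name, value in fields.items():
    print(name, value)
```

The decoded payload is 23 bytes long, whereas the rxpk reports "size":46. The Semtech UDP protocol describes "size" as the payload size in bytes, so that field may be worth a second look.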

I hope this helps. If you need further information, please do not hesitate to ask for it, I will be glad to share it!

Thanks beforehand, looking forward to hearing from you.

Kind regards.
Daniel.

cslorabox · October 29, 2020, 4:59pm

As I suspected you have not correctly simulated the hardware timestamp.

Your packets purport to be a couple of minutes apart, but their timestamps are only 141 microseconds apart, which is simply impossible - those packets would overlap!

As a guess, you incremented the hardware timestamp counter only by the air time and not the elapsed time between packets. And you mistakenly applied a value in milliseconds as microseconds.

Or at least something like that.

Your fake packets need to have hardware timestamps which show a progression of time roughly matching that actual progression of time in between when they are submitted.
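One way to check this on captured packets is to compare the wall-clock difference between two rxpk "time" fields with the difference between their "tmst" values, which the Semtech UDP protocol defines as a microsecond counter. The helper names below are hypothetical; the input values are the ones from attempts 1 and 2 above.

```python
from datetime import datetime, timedelta

def wall_delta_us(time_a, time_b):
    """Microseconds between two rxpk "time" strings."""
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ"
    delta = datetime.strptime(time_b, fmt) - datetime.strptime(time_a, fmt)
    return delta // timedelta(microseconds=1)

def tmst_delta_us(tmst_a, tmst_b):
    """Microseconds between two tmst values, allowing for 32-bit rollover."""
    return (tmst_b - tmst_a) & 0xFFFFFFFF

wall = wall_delta_us("2020-10-29T08:29:47.678004Z", "2020-10-29T08:32:08.878004Z")
tmst = tmst_delta_us(1603956587, 1603956728)
print(f"wall clock: {wall} us, tmst: {tmst} us")
```

In a consistent capture the two numbers should be close to each other; here they differ by a factor of about a million.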

Assuming you are running your simulation in real time, what you need to do is convert the time since program start to microseconds, mask it at 32 bits so it rolls over, and use that.

Modeling the uplink airtime is relatively unimportant, unless you’re also trying to do your own accounting of when frequencies are occupied (though in that case you have to model real-world radio behavior too, like nearby nodes blanking distant ones on other channels, and decide to what degree you rely on the theoretical orthogonality of distinct spreading factors on the same channel at the same time).

dpergarc · October 30, 2020, 1:13pm

Hi there!

Thank you very much @cslorabox, that seems to be the key!!! I was hard-coding it with random values, as shown below:

7911574
8449704
8892065
9220230
9947056
404184
2161830

So, according to what you say, the real operation here should be:

Calendar.getInstance().timeInMillis * 1000 // microseconds value → For instance right now: 1604063010987524.
Delete the first six digits → in order to have a 32-bit value.

Is that correct? Could it be a random value emulating steps of two minutes?

Thanks beforehand, looking forward to hearing from you.

Kind regards.
Daniel.

cslorabox · October 30, 2020, 3:39pm

No, you need to mask to a 32 bit value. That is not an operation which has any direct equivalence in decimal, but it is rather elementary computer programming.
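To make the difference concrete, here is a short sketch that applies both operations to the example value quoted in the question. The variable and function names are made up; masking itself is just a bitwise AND with 0xFFFFFFFF.

```python
def mask32(value):
    """Keep only the low 32 bits of an integer."""
    return value & 0xFFFFFFFF

example_us = 1604063010987524           # microsecond value quoted in the question

masked = mask32(example_us)             # same as example_us % 2**32
dropped = int(str(example_us)[6:])      # "delete the first six digits"

print("masked :", masked)
print("dropped:", dropped)
print("equal? :", masked == dropped)
```

Dropping decimal digits is a modulo by a power of ten, while masking is a modulo by a power of two, so the two results only coincide by accident.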

I’m sorry to say that your approach to all of this is a bit haphazard - if you want to end up with a result that has any real meaning, you’re going to need to first take time to better understand how such a network works.

dpergarc · November 3, 2020, 4:49pm

Hi there!

Copy, understood!

I am aware that a real deployment has to follow the LoRaWAN rules, but I insist, this is a simulation sending data over UDP to a TTN stack launched on my computer, so do not worry, I am not breaking the rules (and of course, I do not intend to do so).

Once this issue has been solved, I will keep working on the uplink procedure, which is the next step. Thank you very much for all your support! All of your suggestions have been so helpful!

Kind regards.
Daniel

cslorabox · November 4, 2020, 1:54pm

You really, really missed the point of what you were responding to.

It doesn’t sound like your simulation is modeling the actual behavior of a network.

As a result, your results will be meaningless at best, but it may be far worse if you actually believe or represent them to others as having any sort of meaning.

dpergarc · November 4, 2020, 12:53pm

Hi there!

I will try to summarize the problem I have as much as possible. I am developing an application to simulate some LoRaWAN devices, and I need them to communicate with the TTN platform, using the TTN stack with Docker. Even though both the “stat” and the “join-accept” procedures seem to be all right (the second one thanks to you guys), when I try the “uplink”, an error is returned by the gateway (I am using the Semtech UDP Packet Forwarder protocol). The documentation I followed is the LoRaWAN v1.0.3 specification.

Let’s see the error on the gateway side:
gateway

It might not be very relevant, but I am also including a picture of the device side:
device

In case it may help, I am including the JSON returned by that error:
error.json (1013 Bytes)

More clues:
The “join-accept” seems to be working well, judging by what my program prints:
devAddr -> B4634D01
nwkSKey -> 1910482B5841434E6A4FEF4D8999741A
appSKey -> 5E968317B4A01D19C80CDBA349F0732A

Has anyone seen this kind of error? If so, what could it be and how can it be solved? If not, what is the best way to find out what is going on?

Thanks beforehand, looking forward to hearing from you soon.

Kind regards.
Daniel.

cslorabox · November 4, 2020, 4:48pm

You have an endian mixup error in your simulated node: notice how when the network is issuing the join accept the bytes of the device address are in the opposite order from what is received in the attempted uplink.

Given that device addresses are not globally unique, you could also get a device not found issue if the MIC of the message does not match because of network session key problems: a matching MIC is what validates that the traffic comes from the correct instance of a non-unique device address.

But it’s definitely going to be not found when you reverse the byte order of the device address itself as you have in this attempt, so fix that first.
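For reference, LoRaWAN sends multi-byte fields such as DevAddr least significant byte first. The sketch below shows the conversion in both directions, using the bytes printed earlier in the thread purely as sample data; it cannot be told from the thread which of the two orders that printout is in, and the function names are hypothetical.

```python
def swap_byte_order(hex_str):
    """Reverse the byte order of a hex string, e.g. display order <-> wire order."""
    return bytes.fromhex(hex_str)[::-1].hex().upper()

def devaddr_to_wire(devaddr):
    """DevAddr as a 32-bit integer -> the 4 bytes placed in the FHDR (little-endian)."""
    return devaddr.to_bytes(4, "little")

print(swap_byte_order("B4634D01"))
# Assumption for illustration: if the address were 0x014D63B4,
# these are the bytes that would go into the frame header.
print(devaddr_to_wire(0x014D63B4).hex().upper())
```

Keeping the address as an integer inside the program and converting only when building or parsing a frame tends to avoid this kind of mixup.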

dpergarc · November 5, 2020, 4:55pm

Hi there!

First of all, I am truly sorry for creating a new topic for this new issue, I really thought that they could be treated as separate discussions.

Regarding the current issue:

Fixed! My bad! Now the only variables that are LSB-first are appEUI and devEUI… but I am afraid the same error is returned. Take a look at the following pictures:

Gateway:
Gateway

Device:
device

I will try sending this to the TTN console V2, in case it gives me more clues.

Thanks beforehand, looking forward to hearing from you soon.

Kind regards.
Daniel.

cslorabox · November 5, 2020, 4:59pm

See if your traffic validates in the online lorawan packet decoder.

You’ve set yourself a huge project here where re-implementing LoRaWAN from scratch is being treated as a trivially implicit part, rather than the large project it actually is.

You’re going to need to improve your own debugging game quite a bit to get this to work.

Maybe you should spend some time with known good implementations first…

Beware that even once you start generating valid uplinks, you’re still going to have to deal with MAC configuration downlinks correctly, or you’ll get stuck in a loop on those, distorting any picture of network performance.

dpergarc · November 10, 2020, 1:21pm

Hi there!

After digging into what was going on, I finally managed to send data to the device registered on the TTN stack, as seen in the picture below:
data

Data in hexadecimal -> 543A32372E30383B483A32362E3436
Data in ascii -> T:27.08;H:26.46
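The hexadecimal-to-text conversion above can be reproduced with a couple of lines (a quick check, nothing specific to LoRaWAN):

```python
payload = bytes.fromhex("543A32372E30383B483A32362E3436")
text = payload.decode("ascii")

print(len(payload), "bytes ->", text)
```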

The change was in the FCtrl field, setting the bit that indicates adaptive data rate (ADR) is used. It is weird, since another value (e.g. 0x00) gives me the usual error Host cluster failed to handle message.

Sorry if this is something trivial for all of you guys, but… can anyone give me an idea of why this is happening? Am I missing something that might be relevant?
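For context, this is how the uplink FCtrl byte is laid out in LoRaWAN 1.0.3: ADR in bit 7, ADRACKReq in bit 6, ACK in bit 5, ClassB in bit 4 and FOptsLen in the low four bits. The small decoder below is only a sketch with invented names; it does not explain why the stack rejected 0x00.

```python
def decode_fctrl_uplink(fctrl):
    """Decode the uplink FCtrl byte (LoRaWAN 1.0.3 bit layout)."""
    return {
        "adr": bool(fctrl & 0x80),
        "adr_ack_req": bool(fctrl & 0x40),
        "ack": bool(fctrl & 0x20),
        "class_b": bool(fctrl & 0x10),
        "fopts_len": fctrl & 0x0F,     # number of MAC command bytes in FOpts
    }

print(decode_fctrl_uplink(0x80))   # ADR bit set, nothing else
print(decode_fctrl_uplink(0x00))   # all flags clear, no FOpts
```

If FOptsLen is non-zero, the frame header must carry that many FOpts bytes, so a mismatch there shifts every field that follows.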

Thanks beforehand, looking forward to hearing from you.

Kind regards.
Daniel.

kersing · November 10, 2020, 2:11pm

Even in simulation I think sending ASCII text is not the way to go. It sets a wrong example and won’t match the real world, where binary encoded data will be transmitted. (Or should be transmitted)
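As a sketch of what a binary encoding could look like for the temperature and humidity reading posted above: two 16-bit integers holding hundredths of a unit. The scaling factor, the field order and the big-endian byte order are assumptions for this example, and the function names are made up.

```python
import struct

def encode_reading(temp_c, humidity_pct):
    """Pack temperature (signed) and humidity (unsigned) as hundredths, 2 bytes each."""
    return struct.pack(">hH", round(temp_c * 100), round(humidity_pct * 100))

def decode_reading(payload):
    """Inverse of encode_reading."""
    temp_raw, hum_raw = struct.unpack(">hH", payload)
    return temp_raw / 100, hum_raw / 100

payload = encode_reading(27.08, 26.46)
print(len(payload), "bytes:", payload.hex().upper())
print(decode_reading(payload))
```

That is 4 bytes instead of 15 for the same reading, which translates directly into less airtime per uplink.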

You are not showing the message so it is hard to comment. Fctrl can be zero as far as I know but other header fields may have impact on what should be there.

For most forum users this will be something they’ll never encounter, as few work on the internals of a LoRaWAN stack. However, if you are working on it, as you are, you should read the LoRaWAN specification front to back and back to front and make sure you understand it. Have you done so?