Open source LoRaWAN traffic simulator

Some of you may have seen this on Twitter a few days ago but I’m sharing here too just in case! I couldn’t find an easy-to-use LoRaWAN traffic simulator so I ended up building one. Of course, TTN traffic can be simulated by using the Gateway server MQTT endpoint but I wanted something a bit more generic, and closer to “real” traffic.

For now, only ABP nodes (with FCnt starting at 0) and uplink traffic can be simulated, but OTAA support shouldn’t be too difficult to implement (happy to take pull requests!).
Also, the payload is fixed right now (again, something that would be reasonably straightforward to improve), and the generated radio metadata (RSSI, SNR, etc.) is completely random – this could be improved for folks more interested in testing network optimization than in pure server stress testing, which was my goal.
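For anyone curious what "completely random radio metadata" might look like, here is a minimal sketch (the function and field names are illustrative and not taken from the actual project; the field names loosely follow the Semtech UDP packet-forwarder JSON conventions):

```javascript
// Hypothetical generator of randomized radio metadata for a simulated uplink.
// Ranges are plausible but arbitrary; a real tool might model them properly.
function randomRadioMetadata() {
  return {
    rssi: -Math.floor(Math.random() * 90) - 30,            // dBm, roughly -30 to -119
    lsnr: Math.round((Math.random() * 20 - 10) * 10) / 10, // dB, -10.0 to +10.0
    chan: Math.floor(Math.random() * 8),                   // channel index 0..7
    freq: 868.1 + Math.floor(Math.random() * 3) * 0.2,     // EU868 default channels
    datr: "SF7BW125",                                      // fixed data rate for simplicity
  };
}
```

Since the values are uncorrelated and uniformly distributed, this is fine for stress testing but, as noted above, useless for network-optimization studies.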

Please check it out and let me know if you have comments/questions!


A quick look at the readme suggests you are not simulating a node, as the name of your repository implies, but rather a gateway that forwards packets from a node.

For anyone interested: run this against your own LoRaWAN backend only. Do not run it against the public TTN infrastructure, as that will interfere with operations for the community.

Hi @kersing!

I guess the “node” in the name of the repo has as much to do with Node.js as with “node” as in “end node”, but that’s a good point and maybe I should clarify the name :slight_smile:. That said, I’m not sure how one could simulate a node without an actual gateway to forward its packets to the NS?

That being said, you are very right that I should have advised folks against using this tool against the TTN servers, as it is indeed rather intended for stress testing of one’s own infrastructure.

Indeed, one would need a way to get the simulated packets to the server. Really, the bulk of any simulator is going to be the node simulation, which has to implement the LoRaWAN protocol, the crypto, etc. Much like a real one, a simulated gateway does very little – pretty much just puts the rolling microsecond timestamp on uplinks and, ideally, enforces a refusal to “send” any downlinks it receives after their requested transmission time. Unless a simulation is going to include a model for radio propagation, it’s about 90% node model and barely 10% gateway model.
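Those two gateway duties can be sketched in a few lines – this is an illustrative toy (class and field names are assumptions, not the project's actual code), with the counter wrapping at 2^32 like the "tmst" field of the Semtech UDP protocol:

```javascript
// Illustrative sketch of the two jobs a simulated gateway really has:
// stamping uplinks with a rolling 32-bit microsecond counter, and
// refusing downlinks whose requested transmission time has passed.
class SimulatedGateway {
  constructor() {
    this.startNs = process.hrtime.bigint(); // Node.js monotonic clock
  }
  // Rolling microsecond counter, wrapping at 2^32.
  now() {
    const us = (process.hrtime.bigint() - this.startNs) / 1000n;
    return Number(us & 0xffffffffn);
  }
  stampUplink(packet) {
    return { ...packet, tmst: this.now() };
  }
  // A real gateway cannot transmit in the past; neither should a simulated one.
  acceptDownlink(downlink) {
    return downlink.tmst > this.now();
  }
}
```

Everything else a simulator needs – frame construction, MIC computation, counters – lives on the node side, which is where the real work is.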

What was the problem this software solves? If you have a network & application server that’s under too much load, surely you also have a business model that allows for a modest upgrade to relieve the pressure?

No idea if the asker is actually “going there” or just making an offhand comment.

But being able to afford more infrastructure and knowing when you’d need it are distinct things; ideally you know you need it before you start dropping traffic on the floor or missing downlink windows.

There’s a whole art and variety of strategies to modern infrastructure computing. In fact, some methods are supposed to be able to auto-scale by bringing in more resources dynamically as needed (and, ironically, are astoundingly inefficient given all the overhead they devote to parcelling and de-parcelling things to allow for potential work sharing). If you had a scheme that was supposed to do so, how would you know that it actually does?

…you’d hit it with a lot of simulated traffic…

Old school way - have some KPIs, review them.

The marketing blurb of every third startup at present - usually taking a sledgehammer to crack a nut, but if you can scare the management into getting you some new toys to play with whilst someone keeps an eye on the KPIs, go for it.

Sorry, but that’s really not how infrastructure development works.

Dynamic load scaling is, and that needs testing.

Sometimes it’s informative to see how other parts of the development profession actually operate.

It’s manually allocated resources which are “old school”. That’s not to say they aren’t ever a defensible choice, but infrastructure people will laugh you right out of the meeting, as it’s not the way they operate.

That’s what I’m saying - but it works, and costs less than some overly complicated dynamic scaling system that turns out not to work very well when called upon, and was never really needed anyway because of over-enthusiastic projections of business growth.

Mostly I get to be in the exit interviews for the infrastructure people. Your picture of my working life & experience has a long way to go but do keep guessing.

That’s exactly the mistake you make when telling the poster that the organizational requirements they face in their business are fictitious.

I have no doubt that the requirements you face can be satisfied with manually allocated resources assigned well in advance of need. But what you are not considering is that the requirements (actually technical or merely organizational but still valid) faced by others are not the same as those you encounter.

If the asker’s boss believes they need load testing, neither you nor I are going to talk them out of it by posting here. And they might even be correct in that assessment, by knowing things about the nature or scale of the requirement which we cannot.

Actually sitting down and writing a network server, then having the infrastructure team refactor it into a modern architecture, has been an informative experience. I don’t agree with all of their choices, but it’s been quite educational to understand the thinking that dominates that field today.

The lack of OTAA, the fixed payload, and the random data elements don’t make for “real” traffic, so there’s still a lot to be built before it can be a tool that provides enough information to be useful. Whereas going with an AWS instance and then migrating to a larger one (and back down again if appropriate) is much simpler and less time-consuming.

Very subtle but even single-handed ageing developers can do big stuff from the comfort of their own home. For context, 32TB of local storage for test data for one client alone. But yeah, some client work would fit in a SQLite database on an Arduino Uno, so on the average, you may be about right.

It happens that the time-critical path in a LoRaWAN server is being able to generate RX1 downlinks in less than a second, in the face of a flood of other traffic you have to decode before you can figure out which are the occasional ones that might get a downlink response; even worse if you allow the live loop out to and back from an application client.

It doesn’t matter if anyone is actually pretending to transmit or listen to those downlinks, just that you can fire the transmit orders off towards what are theoretically gateways in under a second, while those things are throwing a flood of uplink traffic at you. You could even take one single real node and gateway, but have the base load come from the simple simulator throwing random traffic that the server does little more than minimally validate (or, for more fun, traffic whose FCnt skips across the 16-bit boundary, as that can require a search…)
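The 16-bit boundary issue alluded to above: LoRaWAN uplinks carry only the low 16 bits of the 32-bit frame counter, so when the received value is below the last-seen low bits, the server has to consider that the counter rolled over into the next 16-bit "page". A hedged sketch of that reconstruction (function names are mine, not from any particular server implementation):

```javascript
// Rebuild the full 32-bit FCnt from the 16 bits carried on air.
// A real server would verify the MIC against the candidate value (and
// possibly search further pages, e.g. after missing many uplinks).
function reconstructFCnt(lastFullFCnt, received16) {
  const lastLow = lastFullFCnt & 0xffff;  // low 16 bits last seen
  const page = lastFullFCnt >>> 16;       // high 16 bits ("page")
  if (received16 > lastLow) {
    // Normal case: still within the same 16-bit page.
    return ((page << 16) | received16) >>> 0;
  }
  // Counter appears to have wrapped: assume the next page.
  return (((page + 1) << 16) | received16) >>> 0;
}
```

For example, if the last accepted full FCnt was 65530 and an uplink arrives with 3 on the air, the candidate is 65539, not a replayed 3 – and confirming that guess costs a MIC check, which is exactly the extra work under load that the stress test should exercise.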

On the surface, meeting RX1 timing is easy if you have just a few gateways interacting, or can manually divide your task into such realms with no worry of nodes migrating between gateway collections that might be assigned to different server instances. It’s much harder for something like TTN.

However, even in a small network, you also have the challenge of making sure you don’t lose critical protocol state if the hosting instance fails. Sure, there are mitigation strategies: skip the downlink FCnt forward a bit from the last checkpoint and hope… but modern infrastructure design is all about having solutions for such problems.
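That "skip forward and hope" mitigation is trivial to express – a sketch under the assumption that the downlink counter is checkpointed periodically (the margin constant and function name are invented for illustration):

```javascript
// After restoring from a checkpoint, skip the downlink frame counter
// forward by a safety margin so we never reuse a value the failed
// instance may already have transmitted since its last checkpoint.
const CHECKPOINT_SAFETY_MARGIN = 64; // assumed; tune to checkpoint interval

function restoreDownlinkFCnt(checkpointedFCnt) {
  return (checkpointedFCnt + CHECKPOINT_SAFETY_MARGIN) >>> 0;
}
```

The cost is burning a few counter values per failover; the alternative of replicating every counter increment is what the "modern infrastructure" approaches aim to make cheap.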

Small solutions are possible; often defensible. But a lot of what the infrastructure people have learned by building websites that scale is applicable, too. And even where I think infrastructure people have made the wrong choice, it’s educational to understand their solution, as I can see encountering problems in other domains where I really would need to do things that way. Working on the code of all of nodes, gateways, and server infrastructure has been a privilege, and each has been heavily aided by the knowledge of what is actually happening deep inside the others.

(FWIW, one of the first customizations I made was to dial the RX1 delay up to more than a second – not because it couldn’t be met, but because there’s not really any need to keep it that small.)


@kartben, can you please walk me through these variables? What are they, and how do I get them in order to get started? Also, where exactly in the code do I set them?
NETWORK_SERVER_URI
NETWORK_SESSION_KEY
APPLICATION_SESSION_KEY