How reliable are downlinks?

ame · June 13, 2020, 9:07am

I’m planning something similar, but instead of turning the output on there will be some local smarts in the node that accepts commands to turn on and off, but starts a timer when an “on” command is received. Then, if the network dies the timer will time out and the output will turn off automatically. But, if everything is good and I want the device to be on longer I’ll re-send the “on” command before the timer expires which will re-set it.
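A minimal sketch of the timer idea described above, in Python for readability. The class name, the 900-second hold time, and the explicit `now` argument are assumptions for illustration, not part of any node firmware.

```python
class TimedOutput:
    """Output that switches off by itself unless "on" is re-sent in time."""

    def __init__(self, hold_seconds):
        self.hold_seconds = hold_seconds
        self.off_at = None  # None means the output is off

    def command(self, cmd, now):
        if cmd == "on":
            # Every "on" (re)starts the timer from the current time.
            self.off_at = now + self.hold_seconds
        elif cmd == "off":
            self.off_at = None

    def is_on(self, now):
        # Poll this from the main loop; the output drops once the timer expires.
        if self.off_at is not None and now >= self.off_at:
            self.off_at = None
        return self.off_at is not None


out = TimedOutput(hold_seconds=900)   # hypothetical 15 minute hold
out.command("on", now=0)
state_at_600 = out.is_on(now=600)     # still inside the first hold period
out.command("on", now=600)            # re-sent before expiry, timer restarts
state_at_1400 = out.is_on(now=1400)   # 800 s after the refresh
state_at_1600 = out.is_on(now=1600)   # 1000 s after the refresh, nothing re-sent
```

If the network dies after the last "on", nothing refreshes `off_at` and the output falls back to off without needing any downlink.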

arjanvanb · June 13, 2020, 10:13am

Ah, the following was actually confirmed to work:

So, I’m quite sure confirmed downlinks can be replaced with a different confirmed downlink just fine.

dajt · June 13, 2020, 11:35am

Replacing unack’d confirmed downlinks is an interesting case I had not considered! Glad it works.

I agree taking 4-6 attempts to get a downlink is terrible, but given I only managed to get about 3 in a month when this project started my expectations are pretty low at this point and I’m very pleased anything happens at all.

I have written the code to work with either a wifi or lorawan feather because testing with wifi is a lot easier, and in case this just doesn’t work with lorawan I can suggest we use wifi. But I think the project sponsor wants to prove lorawan can either do this or can’t for cases where farms have tanks further away from the house than wifi can reach. GSM is another option but there are already solutions using that. The wifi and lorawan code paths do not co-exist: you get one or the other, so there is no time wasted on a stack not being used. Class C would require us either to write the class C code or to move to a different device that does support it; both are way outside the spec for this project.

At the moment there is no “on” automation for the pump so even if the gateway or the feather dies they’re no worse off. The tank level readings are taken from a separate sensor - nothing to do with the feather. There is already at least one automated “off” mechanism, we’re going to allow for timeouts to be sent with the “on” messages, and we also check a couple of input signals that will make the feather switch the pump off without a downlink message. We’re pretty good for switching it off I think. The annoyance will be wondering why it hasn’t switched on when it should have, and sending too many downlinks if the performance isn’t any better with the gateway on-site.

Right now, it’s 10x better than it was 3 days ago and I can demo it without being too embarrassed. We have next semester to hopefully get on-site and see how it goes in place.

cslorabox · June 13, 2020, 2:49pm

It still doesn’t make sense why you are putting the radio in the turn-on/turn-off path at all instead of having that locally automatic and using LoRaWAN only to report status and possibly change control-loop settings.

Even if your water level sensor is remote from the pump, you probably want to connect them via a shorter range point-to-point link (LoRa or otherwise) and not through the LoRaWAN gateway.

That’s wholly apart from how your LoRaWAN node-gateway interactions don’t yet seem to be working as they should.

ame · June 14, 2020, 12:29am

I am not the OP. Sorry for the confusion. The system I am preparing will be locally controlled, with LoRaWAN used for reporting the state. I am hoping to use downlink commands to control a relay to modify the state, but even if the downlink fails, or is delayed, or the relay fails, the system will still operate safely.

Besides, what’s the point of a downlink if it can’t be used?

My installation is slightly different to the OP: I have a tank on a hill. There is no LoRaWAN coverage there, but there is cellphone coverage. I have a pump in a valley, 1km away. There is no LoRaWAN or cellphone coverage there.

Phase one is to install a water level sensor in the tank connected to a LoRaWAN analogue node. Next to the tank I will install a LoRaWAN gateway with a cellular modem. The sensor, node, gateway, and modem will be powered by a small solar array and battery bank. The node will be only a few metres from the gateway, so it’s a bit pointless, but it allows me to start getting data in a consistent way.

Phase two is to install a sensor on the pump in the valley and connect it to a LoRaWAN digital node. It will report the status of the pump (running/not running) by connecting to the gateway on the hill that was installed in phase one. 1km is still not that far for LoRaWAN. A digital output on the node will do something with the pump (but I don’t recall what it is just now…), but it’s not critical as the pump operates automatically based on a pressure switch. In other words, my use of the downlink to control an output is not part of a control loop.

arjanvanb · June 14, 2020, 8:45am

Downlinks are very useful: for OTAA, for ADR, for remote configuration, maybe even for a remote reboot to allow for joining a different network. (Unfortunately, downlinks for confirmed uplinks apparently have a design or implementation flaw.) Downlinks should work properly, and if not then one should fix that. But controlling things is just not a good use case for Class A LoRaWAN.

Just for the sake of completeness, though already mentioned in many other places: even if downlinks work fine, there’s also the limit of 10 downlinks per day. I’d assume that retries for confirmed downlinks count against the Fair Access Policy as well, as the network cannot be blamed for that.
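To make the arithmetic concrete, here is a rough sketch of a per-day downlink counter. It assumes, as suggested above, that every retry of a confirmed downlink counts against the allowance; the class and method names are made up for illustration and do not correspond to any TTN API.

```python
class DownlinkBudget:
    """Counts downlink attempts per day against a fixed allowance."""

    def __init__(self, per_day=10):
        self.per_day = per_day
        self.day = None
        self.used = 0

    def try_send(self, day):
        """Return True and count the attempt if today's allowance is not used up."""
        if day != self.day:
            # New day: start counting from zero again.
            self.day, self.used = day, 0
        if self.used >= self.per_day:
            return False
        self.used += 1
        return True


budget = DownlinkBudget()
# Two commands that each need 5 attempts consume the whole allowance,
# so the 11th attempt on the same day is refused.
results = [budget.try_send(day=1) for _ in range(11)]
```

With the 4-6 attempts per downlink reported earlier in the thread, roughly two commands a day would be enough to use up the allowance.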

ame · June 14, 2020, 9:34am

Yes, I have borne that in mind. The node is a class C device, and we plan to turn it on (or off) at most once a day, but probably not very often at all.

arjanvanb · June 14, 2020, 10:20am

Class C devices are just Class A on the TTN Community network:

cslorabox · June 14, 2020, 2:25pm

Beware that LoRaWAN gateways are quite power hungry due to the multichannel DSP baseband receiver chip (some sellers will claim the 8-channel cards are 49-channel due to the number of distinct combinations that could be demodulated; ironically, their power consumption isn’t far off from 49x that of a node radio). You may need a somewhat larger solar setup than you’d expect to keep this up across cloudy days.

Personally I’d look into a custom point-to-point LoRa link using lower power node-class radios. Your box on the hill can wake up the mobile data modem periodically and report in, along with water level. And then it can command the pump via LoRa, probably something like “this repeatable message means run for 15 minutes”.

The one possibly tricky part is having the two node-class radios “find each other” if you conclude you need to use multiple channels. But you presumably have a fair amount of power at the pump and can keep that radio receiving during searches, and once you establish communication you can keep a schedule of windows which grow wider if a transmission is missed.
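One way to picture the "windows which grow wider" idea is the sketch below. The doubling factor, the 1 second base width, and the 60 second cap are all assumed values chosen for illustration; a real link would pick them from its clock drift and power budget.

```python
def next_window(width, received, base=1.0, factor=2.0, max_width=60.0):
    """Return the receive-window width (seconds) for the next scheduled slot."""
    if received:
        # Back in sync: return to the narrow, low-power window.
        return base
    # Missed a transmission: listen longer next time, up to a cap.
    return min(width * factor, max_width)


width = 1.0
history = []
for received in [True, False, False, False, True]:
    width = next_window(width, received)
    history.append(width)
# history is now [1.0, 2.0, 4.0, 8.0, 1.0]
```

The receiver spends little energy while the schedule holds and only pays for long listening periods after something has gone wrong.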

bluejedi · June 14, 2020, 9:01pm

FYI: I have updated the topic title.

Downlinks themselves are not unreliable. It is what you want to use them for and whether LoRaWAN as a technology is suitable for what you want to apply it to. “How unreliable are downlinks?” implies that downlinks are unreliable by default, which is not a correct statement.

ame · June 14, 2020, 9:51pm

This is all great information. I am using TTN for test/development, and it might be appropriate for deployment. If I need something “better” I can pay for TTI, or use our incumbent telecom operator’s offerings.

The MikroTik gateway I am using consumes 7W maximum. The Teltonika modem uses <5W. The Ursalink node uses <2.5W and the sensor about 1W. The solar power system will be sized accordingly.
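As a back-of-envelope check on those figures, the sketch below adds up the quoted maximum loads and turns them into a daily energy figure. The three days of autonomy, 4 peak sun hours, and 70% system efficiency are assumptions picked only to show the calculation; real values depend on the site and the hardware.

```python
# Worst-case loads in watts, taken from the figures quoted above.
loads_w = {"gateway": 7.0, "modem": 5.0, "node": 2.5, "sensor": 1.0}

total_w = sum(loads_w.values())      # continuous worst-case draw
daily_wh = total_w * 24              # energy needed per day

# Assumed sizing parameters (not from the thread).
autonomy_days = 3                    # cloudy days to ride through
sun_hours = 4.0                      # equivalent full-sun hours per day
efficiency = 0.7                     # charge controller, battery, wiring losses

battery_wh = daily_wh * autonomy_days
panel_w = daily_wh / (sun_hours * efficiency)

print(f"load {total_w} W, {daily_wh} Wh/day")
print(f"usable battery {battery_wh} Wh, panel about {panel_w:.0f} W")
```

Since the quoted loads are maxima, the real average draw should be lower, but sizing against the worst case leaves margin for the cloudy days mentioned earlier.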

ame · June 15, 2020, 10:26pm

Ok. I have clarified what we want the digital output for. It is to prevent the pump from running when we are not expecting to use water.

If the tank is full the pump will automatically shut-off (because of a pressure switch). But, it will attempt to restart every 15 minutes. To reduce wear and tear we want to have an override switch on the pump. This will be turned on or off at most once a day, probably with a few days between each switching event, i.e. when we turn it off it’s because we don’t want water for a while, and when we turn it on, we’ll leave it on for a while. Basically this output is permitting the system to run (with its own automatic controls) or not, and is not part of the control loop.

Is this an acceptable use case?

descartes · June 16, 2020, 9:59pm

But it is if it overrides the automatic filling of the tank.

Only you can decide if this is an acceptable use case - if you send an OFF command and then someone uses the water but the override is still OFF and for whatever reason downlinks aren’t getting through, is this OK?

ame · June 17, 2020, 3:40am

Well, it depends on how reliable downlinks are.

To answer your specific example, yes. It is acceptable.

If we are not getting any telemetry, or we can’t send downlinks, then we need to investigate. This is why we need to know how reliable downlinks (and uplinks) are.

dajt · June 17, 2020, 6:34am

From my brief experience, it’s variable and probably mostly to do with the gateways. But the sponsor in my case won’t be any worse off than they are now and the farm manager will have to go and switch the pump on manually if too many downlinks go missing.

The GSM backed gateway I have access to most of the time is pretty hopeless as described above.

I’ve parked out near the customer site a few times over the last couple of months (social distancing!) to see if their on-site gateway was any better.

The first two times I was just trying to see if I could get a downlink at all, and did seem to be getting them but it was only with a hello world program. After that, assuming downlinks were ok, I spent most of my time developing with a wifi feather because it was so much easier, and I wouldn’t be abusing the fair use policy.

The 3rd time I was trying to record a demo. I parked in a different place, closer to their building, and the performance was terrible, like the GSM gateway. This visit is what sparked this thread as I then doubted my impression from the first two visits.

The last time I was across the road again to see if that made a difference and the performance was fine. Joined in 6 or 8 seconds, every downlink came through at the first opportunity, and I did quite a few of them. I was able to record the demo.

The gateways seem to be the deciding factor here. I don’t know what the variables are but could they be things like:

Do you have line of sight?
Are you too close?
Is your signal better to a dud gateway than a good one, so only the dud is sending your downlinks? I think I read only one gateway is asked to send a downlink. I heard the sponsor may have more than one gateway on site so perhaps I was talking to one of the not so good ones?

bluejedi · June 17, 2020, 6:49am

It all boils down to whether the worst case scenario (e.g. downlinks and/or uplinks don’t come through for any period of time, for whatever reason) can cause any dangerous / unacceptable situation.

“Then we need to investigate” is a very fuzzy statement and may not prevent a dangerous / unacceptable situation.

Whether that is acceptable for your case is up to yourself to determine/decide.
There is however no (simple) formula to determine the ‘reliability of downlinks’ (or in other words determine the probability that an unacceptable situation can occur due to downlinks not arriving in time for whatever reason).

ame · June 17, 2020, 9:01am

Yup. It is entirely acceptable. If the command to prevent the pump running is not successful then the pump will still be controlled by its own pressure switch, so the system is not compromised. If the command to allow the pump to run is not successful then water will still be available for a while, but the telemetry for the water level will show it is dropping (and not being refilled). If there is no telemetry we will generate a warning that there is no telemetry, and we need to investigate.
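The monitoring side of that reasoning could look something like the sketch below. The function name, the three-hour silence limit, and the "three falling readings" rule are assumptions for illustration only.

```python
def assess(readings, now, max_silence_s=3 * 3600):
    """readings: list of (timestamp_s, level) tuples, oldest first."""
    # No data at all, or the newest reading is too old: raise the telemetry warning.
    if not readings or now - readings[-1][0] > max_silence_s:
        return "warn: no telemetry"
    levels = [level for _, level in readings]
    # A strictly falling level suggests the "allow pump" command did not arrive.
    if len(levels) >= 3 and all(b < a for a, b in zip(levels, levels[1:])):
        return "warn: level falling, pump may not be enabled"
    return "ok"


falling = [(0, 80), (3600, 75), (7200, 70)]
rising = [(0, 70), (3600, 75), (7200, 80)]

status_falling = assess(falling, now=7300)
status_rising = assess(rising, now=7300)
status_stale = assess(rising, now=7200 + 4 * 3600)
```

Either warning is a prompt for a person to investigate, which matches the point that a lost downlink degrades convenience rather than safety.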

dajt · June 17, 2020, 10:36am

Wow, we may as well be writing the same system.

ame · June 17, 2020, 10:39am

I have a pump, a tank, and a switch. There are very few combinations we can make.

cslorabox · June 17, 2020, 1:53pm

That’s not necessarily a sound conclusion at all.

By far the most common cause of downlink failure in new projects is timing errors in the node, causing it to not be receiving at the correct time.

However, unreliable or slow Internet connection between the gateway and the servers can also cause a problem, unpredictably over time. There’s only one second available to complete the roundtrip between the node, gateway, server, gateway and node. And unfortunately because the original packet forwarder code could only have one transmit request outstanding at a time (no queue) even in a case where the rx window is later, the server has to hold onto the transmit request until just before it needs to be sent anyway, because it can’t risk collisions between transmit requests sent down to the gateway out of order. The packet forwarder code has long since gained an internal queue to order things, but the server can’t count on a gateway running a version with that capability.
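A rough way to reason about that one-second budget is sketched below. All latency numbers are invented for illustration, and the 50 ms gateway margin is an assumption; only the one-second figure comes from the post above.

```python
RX1_DELAY_S = 1.0  # the one-second roundtrip figure quoted above


def downlink_in_time(uplink_backhaul_s, server_s, downlink_backhaul_s,
                     gateway_margin_s=0.05):
    """Return (fits, total): whether the roundtrip fits before the receive window."""
    total = uplink_backhaul_s + server_s + downlink_backhaul_s + gateway_margin_s
    return total < RX1_DELAY_S, total


# Hypothetical wired backhaul: tens of milliseconds each way.
wired_ok, wired_total = downlink_in_time(0.03, 0.10, 0.03)

# Hypothetical congested cellular backhaul: hundreds of milliseconds each way.
cellular_ok, cellular_total = downlink_in_time(0.45, 0.10, 0.45)
```

Under these made-up numbers the wired gateway fits comfortably while the cellular one misses the window, which is consistent with a backhaul that works on some days and fails on others.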

The biggest thing you need to do for testing is gain actual access to the gateway - what do its logs say, and did the transmit light blink? Ideally, get one scope probe on a gpio on the node driven for the duration of the receive window, and another scope probe on the gateway transmit LED, and see both if the gateway transmitted at all, and if the timing lined up.