How reliable are downlinks?

I wasn’t being very clear. At my house I have the GSM GW for testing while I’m writing the firmware. I am usually a bit too close to this GW while testing at home, diagonally opposite corners of a double garage. But I sometimes put the GW outside, and sometimes test from inside the house, making the distance much greater, and neither of those things has any effect. Downlinks have never been reliable via this GW, and even joining can take forever, although moving to MCCI LMIC seemed to improve that a bit for some reason. I cannot think why. It now takes 6-10 minutes to join rather than 40-60 minutes, and I can usually get a downlink within 4 to 6 attempts rather than what felt more like 10 to 20.

The tests I talk about doing on-site use a different GW. The site is about a 10 minute drive from my house, so there is no chance I’m using the GSM GW then. I agree the LoS must have been affected when I was in the car park; I can’t think what else it could be. I have no idea where this GW is, other than somewhere on the site. The first test I want to do next semester, when hopefully the COVID lockdown has eased a bit, is to test the node at the pump controller location to see how it performs in place.

The node is a Feather with a piece of wire soldered into a via as the antenna. If this is put into a weather-shielded metal control box with a bunch of contactors etc., is that likely to act as some sort of RF shield, or be electrically noisy, and stop the node from working?

Your advice works for people with both the skills and the equipment. 90% of TTN users do not own a gateway and are ‘reduced’ to other ways of debugging (given that there are > 111,000 developers and just under 11,000 gateways according to the TTN home page).

Right, but after running with 3G-connected gateways for over 12 months I can confidently state that a 3G backhaul is entirely workable, even with downlinks. Of course there are issues at times, but 99% of the time everything works as expected for my gateways using that telco.
Other sites and other telcos may perform differently. If you search the forum you should find messages about Vodafone in Italy, where gateway performance degraded over time due to traffic prioritization in their network. However, those cases are not the norm, in my experience and from what I’ve heard from other gateway owners using 3G.

An antenna within a metal box? Maybe google ‘Faraday cage’ and reconsider?

Plenty. I’ve been using one of those gateways at workshops where we use OTAA nodes that are started within the coverage area of that gateway. That wouldn’t work without downlinks.

LOL, exactly what I was thinking.

The Adafruit product page says “Simple wire antenna or spot for uFL connector”, which I assume means we could solder on whatever a uFL connector is and attach an antenna that can be mounted outside the metal box.

Or maybe we sit it on top of the box with a little umbrella!

I wouldn’t mind taking the 3G GW back to the sponsor and having them test it. I would also like to test the node in place ASAP, and not just from across the road. But it’s now between semesters, and AU is worried about a 2nd wave of COVID given what’s happening in Victoria, so I’m just going to have to sit on my hands for a few more weeks before I can think about going to see them in person rather than just sitting in my car across the road like I have been.

But that improvement may tell you that it’s not (all) to blame on the GSM gateway?

A gateway will just transmit at the time the network tells it to. If the downlink arrives at the gateway too late to be transmitted (due to network latency), then it will not transmit it at all; it will not transmit it with some delay, but discard it. So, if changes in the device make downlinks work better, then at least part of the problem is not in the gateway.
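In code terms, the decision is roughly this (a simplified sketch of the idea, not the actual Semtech packet-forwarder source; MIN_TX_MARGIN_US and queueForTransmission are made-up placeholder names):

    #include <stdint.h>

    // Simplified sketch of how a packet forwarder handles a class-A downlink.
    // The count_us values are the concentrator's internal 1 MHz counter.
    bool scheduleDownlink(uint32_t tx_count_us, uint32_t now_count_us) {
        int32_t margin_us = (int32_t)(tx_count_us - now_count_us); // unsigned subtraction handles counter wrap
        if (margin_us < MIN_TX_MARGIN_US) {
            // Arrived too late (e.g. because of backhaul latency): report an
            // error ("TOO_LATE" in the UDP forwarder) and drop it. It is never
            // transmitted late.
            return false;
        }
        queueForTransmission(tx_count_us); // transmit exactly at the requested time
        return true;
    }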

I’d investigate whether the uplinks from the two versions of LMIC are different, such as whether there are any differences in the join procedure. Maybe they start OTAA with a different SF, perhaps also making TTN make a different choice between RX1 and RX2? Or, if you are in US915 or AU915, then maybe MCCI supports the initial ADR Request with some network configuration? (I don’t know if those settings affect downlinks. For EU868, LoRaWAN 1.0.2 includes some details in the Join Accept, most importantly also configuring RX2.)
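One low-effort way to compare the two forks is to dump what each one ends up with right after the join (a sketch assuming the arduino-lmic field names datarate, dn2Dr, dn2Freq and rxDelay, which I believe exist in both the classic and the MCCI fork):

    #include <lmic.h>

    // Call this from the EV_JOINED branch of onEvent() in both sketches and
    // compare the output of the two LMIC versions.
    void dumpSessionParams() {
        Serial.print(F("uplink DR:  ")); Serial.println(LMIC.datarate);
        Serial.print(F("RX2 DR:     ")); Serial.println(LMIC.dn2Dr);
        Serial.print(F("RX2 freq:   ")); Serial.println(LMIC.dn2Freq);
        Serial.print(F("RX1 delay:  ")); Serial.println(LMIC.rxDelay);
    }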

If you cannot find differences in the uplink SF or downlinks, then I feel the improvement you saw indeed tells you that failing downlinks are not (all) to blame on the gateway. So even though the ethernet gateway seems to give you better results, you’d still have some investigation to do.

Aside: for Basic Station, does anyone know if TTN leaves the choice for RX1 or RX2 to the gateway?

Station will choose either RX1 or RX2 transmission parameters, whereby RX1 is preferred. If the frame arrives too late or the transmission spot is blocked, it will try RX2. To force a decision, the LNS can omit either the RX1 or the RX2 parameters.

Given one example message I found, it seems TTN does indeed force a single choice:

As we’re seeing in this thread, trying to debug a node without physical and administrative access to a gateway is an exercise in banging one’s head against a wall.

Apart from community groups that can share resources, buying a concentrator card is pretty much the “price of playing” if one wants to do much beyond build a copy of known good software/hardware.

It would be nice if that were not the case, but realistically, it is. The asker of this thread has now spent far more time being frustrated and driving to where another gateway is than a concentrator would cost, even valuing that time at the most minimal wage.

I have to wonder if those numbers really capture the situation; they may include people who registered out of a tentative interest. Actual active accounts versus gateways would be more interesting, but even that doesn’t show who is doing actual development.

However, for the node RX timing problem specifically, one actually can blip a GPIO on both transmit and receive, and then use either a digital scope or a cheap USB logic analyzer to measure the time in between. It’s more complete and illustrative when the node is compared to the gateway, but one could compare the node’s RX to its own TX and the spec. The catch is that you need to find something that indicates that the packet has actually finished transmitting, or else externally calculate the packet duration and include that in the expected delay measured.
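For example, with the MCCI fork (which, as I understand it, added EV_TXSTART and EV_RXSTART events; with the classic fork you’d have to instrument its radio code instead), something like this in the sketch’s existing event handler gives the analyzer clean edges to measure. The pin numbers are arbitrary spare pins:

    // TX_MARK_PIN goes high when the uplink starts and low again once both RX
    // windows are done; RX_MARK_PIN gives a short pulse when a receive window
    // opens. Set both to OUTPUT in setup().
    #define TX_MARK_PIN 11
    #define RX_MARK_PIN 12

    void onEvent (ev_t ev) {
        switch (ev) {
            case EV_TXSTART:
                digitalWrite(TX_MARK_PIN, HIGH);
                break;
            case EV_RXSTART:
                // Timing-critical moment: just a pulse, no Serial prints here.
                digitalWrite(RX_MARK_PIN, HIGH);
                digitalWrite(RX_MARK_PIN, LOW);
                break;
            case EV_TXCOMPLETE:
                digitalWrite(TX_MARK_PIN, LOW);
                break;
            // ... the usual handling of the other events ...
            default:
                break;
        }
    }

There is still no marker for the exact end of the transmission, so the packet’s air time has to be calculated and accounted for, as described above.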

Sure, if you have them and can use them. But there are many, many modules with marketing materials designed to seduce, which only require a USB cable and the ability to download software.

I talked to Wienke Giezeman about this at the Reading conference last autumn. His feeling is that there are sufficient materials to inform; I disagree, and I think it leaves some people who could champion LoRaWAN somewhat disillusioned and misinformed.

A USB-based logic analyzer suitable for measuring node timing costs around $10 and works with the open-source sigrok/PulseView software.

Many toys on Amazon / BangGood / AliExpress / Farnell / Digikey / etc etc

It’s the learning curve that’s the issue.

I think the only solution is to direct people towards known good solutions for them to cut their teeth on before diving in at the deep end.

True when things work as expected.
But when they don’t, you inevitably have to dive in.

In such cases a ‘What to check if’ type of checklist would be useful, especially for the less experienced.
(But I haven’t come across one yet.)

Such tools usually require some background in electronics, which many TTN users do not have.

Like getting people started on a tricycle and then making them use a motorbike!

Which is where I’m heading - hence my recent prolific appearance on the forum.

This was actually more a reference to abstractions. You can make things much simpler by adding abstraction layers, but when something doesn’t work the abstractions most often won’t help and one needs to dive into the underlying complexities anyway.

“Known good solutions” are somewhat like (complexity hiding) abstraction layers.
If they don’t work as expected, finding the cause(s) often requires knowledge of the whole chain.

I was thinking of devices that don’t require any rocket-surgery to just get it started.

So, at best, something AT-command based that they can type the commands into to understand setting the IDs etc., getting a join, an uplink, a downlink and so forth, and see what happens at every point when they do things. And then link in a simple MCU to run the commands automatically.
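Even something as dumb as a serial pass-through sketch covers the ‘type through the commands’ stage (a minimal sketch assuming a board with a spare hardware UART, Serial1, wired to the module):

    // USB serial monitor <-> LoRaWAN module bridge, so a beginner can type the
    // module's own commands (set the IDs, join, send, watch a downlink arrive)
    // by hand before any of it is automated.
    void setup() {
        Serial.begin(115200);   // USB serial monitor
        Serial1.begin(115200);  // module UART; check the module's default baud rate
    }

    void loop() {
        if (Serial.available())  Serial1.write(Serial.read());
        if (Serial1.available()) Serial.write(Serial1.read());
    }

Automating it later is then just a matter of replacing loop() with code that sends the same commands and parses the responses.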

Certainly not a low-cost bodge of a me-too product that throws you straight to the community.

And definitely not a build-your-own, or learning to compile for a feature-rich MCU. And I’d class a small Arduino (i.e. Uno/Nano/Pro Mini) with a radio chip on SPI as a low-cost bodge for a beginner.

Yes, clear.

When using SPI LoRa, I wouldn’t advise anyone to start with 8-bit ‘Arduino’ (AVR) hardware these days (and especially not the 5V types) though. With 8-bit AVR + LMIC users will run into resource limitations very quickly (almost from the start).

IMO there still isn’t a best-of-breed generic LoRa development board suitable for beginners as well as advanced users. For less advanced (non-professional) users this should usually include Arduino framework support.

By generic I mean that it has sufficient available IO ports, is breadboard friendly, runs natively at 3.3V, supports LiPo/Li-ion or LiFePO4 batteries, optionally supports solar charging, is (very importantly) low-power friendly (unlike many currently popular LoRa boards), and is 32-bit, preferably ARM based.

The new Heltec CubeCell dev boards match many of the above requirements and are also based on the newer SX1262 LoRa chip, but they do not have much of a track record (yet). I also wonder whether their architecture and implementation are open enough, or whether too many details have been abstracted away (made inaccessible) to keep things simple for less advanced users. They also don’t support debugging via SWD or JTAG.

To be fair, that does sound a lot like the M0 Feather I was asked to use. The Adafruit docs are pretty annoying, using a lot of exclamation marks and “see how easy this is!” language, and are probably a bit out of date with regard to which version of LMIC to use, and they ignore the rest of the world. But the board itself seems ok.

LMIC itself should be easy to use - you only need about 4 functions. But there are so many examples around with unnecessary code in them, and old messages saying what might fix your problems, etc., that until you’ve spent a month with it, found all the cruft and figured out what you really need, you do feel a bit adrift. I’ve not found any easy-to-understand explanation of things like the set-link-check-mode call (what does “TTN does not support that” actually mean? do I break something if I leave it enabled?) or setting the downlink window to RX2 for TTN - do I have to do those things, or does the network tell the node about them when the node joins? I see those functions in examples but don’t know whether I need to call them or not. And if so, when? After LMIC reset or after join?
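For what it’s worth, the recipe I keep seeing in the TTN-flavoured examples looks roughly like this - and this is just what the examples do, not necessarily what is required:

    // Typically placed in setup(), after os_init(), in the examples I've seen.
    LMIC_reset();
    LMIC_setClockError(MAX_CLOCK_ERROR * 1 / 100); // widen RX windows for inaccurate clocks
    LMIC_setLinkCheckMode(0);                      // "disable link check mode" for TTN
    LMIC.dn2Dr = DR_SF9;                           // EU868 examples force RX2 to SF9
    LMIC_selectSubBand(1);                         // US915/AU915 examples pick the TTN sub-band
    LMIC_setDrTxpow(DR_SF7, 14);                   // ABP examples set an initial data rate / TX power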

The internet never forgetting is a big hassle here.

I feel for the MCCI maintainer. He’s (they’re?) doing a good job dragging the code into better shape, but maintaining backwards compatibility is a stone around their neck. E.g., their downlink callback would be good to use in examples and application code, but I’ll bet no-one has yet. Making the various byte arrays default to the byte ordering given in the TTN console would be a QoL improvement, and would break the code against every other fork. Adding functions to get/set all the state necessary to sleep and wake up using a byte array would be nice, and probably wouldn’t even break compatibility for once.
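The byte-ordering thing in one concrete line, using a made-up EUI: the console shows the DevEUI MSB-first (at least by default), while the classic LMIC examples want the array LSB-first, i.e. reversed:

    // If the TTN console shows DevEUI 01 02 03 04 05 06 07 08 (MSB first),
    // the stock LMIC example expects it reversed (LSB first):
    static const u1_t PROGMEM DEVEUI[8] = { 0x08, 0x07, 0x06, 0x05, 0x04, 0x03, 0x02, 0x01 };
    void os_getDevEui (u1_t* buf) { memcpy_P(buf, DEVEUI, 8); }

(And the AppKey is not reversed - it stays MSB-first - which is exactly the sort of thing that trips people up.)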

But you’re still at the mercy of the gateway. And if you need your own gateway and a logic analyzer to have a fighting chance at diagnosing problems then the protocol is not ready for mainstream adoption. I do have a logic analyzer, I just didn’t expect I’d need it for something as trivial as this project.

That’s not to say it isn’t perfectly good for a company that controls the entire stack from node to server and deploys its own gateways. Which my sponsor is (mostly) doing. They’re using TTN but do control their gateways. I’m just sure I have a dodgy one :)

I suspect that’s an incorrect analysis of the situation.

There’s very little for a gateway owner to “control” that matters in a protocol sense. Mostly what ownership gets you is the ability to monitor what is happening for debug understanding.

And no, it’s not particularly likely that you have a flaky gateway. Possible, but much less likely than node-side issues.

This is the classic example of the abstraction problem - everything is abstracted, which can be great if it works, but if it doesn’t work it’s pretty much impossible to understand why or fix it.

And most of the products that did that had serious flaws in their initial versions - power issues, issues with regional modes the developers didn’t really understand, etc. Some of them have fixed some of these problems, others haven’t.

In the better cases the AT command set firmware is open, debuggable, and repairable.

The project sponsor brought me a different GW to test with today and it’s going much better. This is a Laird Sentrius RG-191.

The new GW is connected to my home office WiFi AP, which is wired to my home router/vDSL modem so the path is:

GW --> WiFi --> AP --> ethernet cable --> router/modem --> copper wires to pillar --> vDSL mux…

The occasional downlink is still being missed, but the missed ones come through after far fewer attempts than was usual with the other GW. When I have more time I’ll connect the GW to the AP with an ethernet cable rather than WiFi and see if that improves matters. I didn’t do that today because I’d have to take my laptop/Feather further away to keep some distance between the GW and the Feather, and I didn’t have time to work around that this morning.

This is all still within a double garage, so I’ll also try it outside with a lot more distance between the GW and the Feather. That didn’t make any difference with the old GW; I’d tried it a couple of times.

The sponsor also had an idea for why I had 3 good tests and 1 bad on-site. When I told him where I’d parked for the bad test he thought it made sense because I would have had a much worse LoS to the GW than for the 3 good ones.

Hey guys, I have a question regarding the use of downlinks. Is there any way for my RAK11721 module to read a downlink from TTN and pass it back to my microcontroller? To be more specific, I have code in MicroPython and I’m using AT commands to connect to TTN. I want to know if anyone knows an AT command that can read a downlink when it shows up in TTN and send it back to my microcontroller, because I’ve already read the RAK11721 datasheet and couldn’t find one.