OTAA accept not received when in reach of multiple gateways

Hi Everybody,

Perhaps someone knows if I interpret the following situation correctly or how it might be resolved.

When I’m in the vicinity of my brand-new TTIG, my test device joins successfully and the messages get to the application without any problems, for both SF7 and SF9. The device is a TTGO T-Beam with a tracker SW for TTN Mapper.

When it has already joined and I go further away, messages are received by my TTIG and by another GW about 1.5 km away for SF9 (and sometimes also SF7). RSSI and esp. SNR are usually higher on the distant GW. ADR is switched off.

The problem is that the device cannot join the TTN when in reach of both GWs and not close enough to my own TTIG. When an automatic re-join is attempted, the device loses connection.

The console shows that the join request is received by both GWs and that the SNR/RSSI is better for the distant GW. I can also see that my own GW does not send a join accept, so I assume that it has been sent by the distant GW and my device was not able to receive it. Either the distant GW can receive better than transmit, or my device can transmit better than receive.

Console with two failed join attempts

Both attempts look successful on the console.

Application:
[Screenshot: app-traffic-joins-anon]

My gateway (join requests visible, but not join accepts):
[Screenshot: gw-traffic-joins-anon]

Details from the console for the first join attempt

Application (the first GW in the list is the distant one, the second eui-58… is mine):

	{
	  "time": "2019-11-24T15:15:20.596248692Z",
	  "frequency": 868.1,
	  "modulation": "LORA",
	  "data_rate": "SF9BW125",
	  "coding_rate": "4/5",
	  "gateways": [
		{
		  "gtw_id": "eui-b827ebff...",
		  "timestamp": 178502364,
		  "time": "2019-11-24T15:15:20.5656Z",
		  "channel": 0,
		  "rssi": -103,
		  "snr": 10,
		  "rf_chain": 1
		},
		{
		  "gtw_id": "eui-58a0cbff...",
		  "timestamp": 39487572,
		  "time": "1970-01-01T00:00:01.57460852Z",
		  "channel": 0,
		  "rssi": -106,
		  "snr": 3
		}
	  ]
	}
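
To put the SNR figures above in perspective, here is a rough sketch that compares each gateway's reported SNR with the typical LoRa demodulation floor for the spreading factor. The floor values are the commonly quoted SX127x datasheet figures and should be treated as approximate; the function name is made up for illustration, and the gateway list is copied from the JSON above.

```python
# Typical LoRa demodulator SNR limits in dB per spreading factor
# (commonly quoted Semtech SX127x datasheet values; treat as approximate).
SNR_FLOOR_DB = {7: -7.5, 8: -10.0, 9: -12.5, 10: -15.0, 11: -17.5, 12: -20.0}

def uplink_snr_margin(gateways, sf):
    """Return {gtw_id: margin in dB above the demodulation floor}."""
    floor = SNR_FLOOR_DB[sf]
    return {gw["gtw_id"]: gw["snr"] - floor for gw in gateways}

# Values copied from the application metadata above
gateways = [
    {"gtw_id": "eui-b827ebff...", "rssi": -103, "snr": 10},
    {"gtw_id": "eui-58a0cbff...", "rssi": -106, "snr": 3},
]
margins = uplink_snr_margin(gateways, sf=9)
for gtw_id, margin in margins.items():
    print(f"{gtw_id}: {margin:.1f} dB above the SF9 floor")
```

Both uplinks sit well above the floor, so the uplink direction looks healthy at both gateways. This says nothing about the downlink direction, which is the part in question.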

My Gateway eui-58a0…:

	{
	  "gw_id": "eui-58a0cbff...",
	  "payload": "...",
	  "dev_eui": "003BA610...",
	  "lora": {
		"spreading_factor": 9,
		"bandwidth": 125,
		"air_time": 205824000
	  },
	  "coding_rate": "4/5",
	  "timestamp": "2019-11-24T15:15:20.646Z",
	  "rssi": -106,
	  "snr": 3,
	  "app_eui": "70B3D57E...",
	  "frequency": 868100000
	}
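
As a sanity check on the `air_time` value above, the following sketch applies the time-on-air formula from the Semtech SX127x datasheet to a 23-byte join request (1 byte MHDR, 8 bytes AppEUI, 8 bytes DevEUI, 2 bytes DevNonce, 4 bytes MIC). The function name and defaults are my own; I am assuming an 8-symbol preamble, explicit header and CRC enabled, which is the usual LoRaWAN uplink configuration.

```python
import math

def lora_airtime_ms(payload_bytes, sf, bw_hz=125_000, cr=1,
                    preamble_symbols=8, explicit_header=True, crc=True):
    """Time on air in ms, following the Semtech SX127x datasheet formula.

    cr is the coding-rate index: 1 means 4/5, 4 means 4/8.
    """
    t_sym_ms = (2 ** sf) / bw_hz * 1000.0
    # Low data rate optimisation is normally enabled when a symbol exceeds
    # 16 ms (SF11 and SF12 at 125 kHz).
    de = 1 if t_sym_ms > 16.0 else 0
    ih = 0 if explicit_header else 1
    numerator = 8 * payload_bytes - 4 * sf + 28 + (16 if crc else 0) - 20 * ih
    payload_symbols = 8 + max(
        math.ceil(numerator / (4 * (sf - 2 * de))) * (cr + 4), 0)
    return (preamble_symbols + 4.25 + payload_symbols) * t_sym_ms

airtime = lora_airtime_ms(payload_bytes=23, sf=9)
print(f"{airtime:.3f} ms")
```

For SF9 at 125 kHz this gives 205.824 ms, which matches the 205824000 reported by the gateway if that field is in nanoseconds.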

What can be done to fix the situation? Going back from OTAA to ABP does not look like a good solution…

I hope the category is correct for this topic, but it’s about two gateways, so :wink:

Regards,
Holger


There are many different things which could be wrong here.

First, you have an ESP32-based device. That, especially in combination with Arduino code (if that is indeed how you are using it), has historically had assorted challenges in meeting the precise timing requirements of LoRaWAN downlinks. As you’ve not given any details of the software in use, this is hard to evaluate.

At first one would think that if the timing works at one range, it should work at another. However, that is not entirely the case if the spreading factor may also change, because there are aspects of the actual implementation of timing which require converting between time and number of symbols in a spreading-factor-specific way.
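
As a rough illustration of that conversion (my own sketch, not code taken from any LMIC version; the function names and the 2 ms error are made up), the duration of one LoRa symbol is 2^SF / BW, so the same absolute timing error corresponds to a different number of symbols at each spreading factor:

```python
def symbol_time_ms(sf, bw_hz=125_000):
    """Duration of one LoRa symbol in milliseconds: 2**SF / BW."""
    return (2 ** sf) / bw_hz * 1000.0

def error_in_symbols(timing_error_ms, sf, bw_hz=125_000):
    """Express a fixed timing error as a number of symbols at the given SF."""
    return timing_error_ms / symbol_time_ms(sf, bw_hz)

for sf in (7, 9, 12):
    print(f"SF{sf}: symbol = {symbol_time_ms(sf):.3f} ms, "
          f"2 ms error = {error_in_symbols(2.0, sf):.2f} symbols")
```

A hypothetical 2 ms error is almost two symbols at SF7 but under half a symbol at SF9, so code that rounds a receive window to whole symbols can behave differently from one SF to the next.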

ADR is switched off.
Going back from OTAA to ABP does not look like a good solution…

Using OTAA without ADR is probably inadvisable. And even if it might work, if you are not going to use ADR, then manually managing keys to be able to use ABP could perhaps entirely remove the need to support downlinks at all, saving a lot of complexity and battery usage.

When an automatic re-join is attempted, the device loses connection.

Beware that attempting to rejoin means abandoning the current session. It’s 100% guaranteed to break connectivity until the join succeeds.

I can also see that my own GW does not send a join accept, so I assume that it has been sent by the distant GW and my device was not able to receive it. Either the distant GW can receive better than transmit, or my device can transmit better than receive.

In theory, a public LoRaWAN network is vulnerable to (presumably accidental) denial-of-service on the downlink path by a gateway which reports strong signals and then fails to actually transmit when commanded to - for example as a result of misconfiguration, of an attempt to share the gateway with a private network server in addition to TTN - or even of an Internet connection with substantial latency in the downlink direction causing the packets to reach the gateway too late to actually transmit.

None of those is terribly likely but they are all possible.

Unfortunately the fact that your own gateway is a TTIG substantially limits debugging steps you might otherwise be able to take, such as temporarily running on a private server while testing at a distance from your gateway. Perhaps you can take your gateway and node to another location to try?


Or your device is exceeding the allowed/expected transmission power, which also makes “connections” with a gateway asymmetric? Assuming your TTGO T-Beam uses some LMIC code, see LMIC_setDrTxpow.

While changing code: to mitigate any downlink timing problems, see LMIC_setClockError. (Though I’d guess that usually gateways do okay with their side of the timing. So if the TTIG Join Accept is received, then the node’s timing might be just fine, and the other gateway’s Join Accept should be received too, when using the same SF? I’m not sure though, and @cslorabox already explained that timing might be an issue if the SF changes. So, you might want to use a fixed SF as well? TTN Mapper recommends SF7.)

Without even making changes to the code: what if you move closer to the other gateway? Getting one proper Join Accept from the other gateway will tell you if it is actually transmitting any. You’re mapping anyway. :wink: Or maybe you can even become a collaborator of the other gateway, to see its traffic; see also How to contact a gateway owner?

Oh, and I guess that powering down the TTIG does not make any changes? Then it’s not really related to multiple gateways, but just the combination of your node’s position and the other gateway.

Some asides: for moving (mapping) nodes, ADR is not recommended; see Why is ADR only relevant for nodes at fixed positions? Also, a DevAddr is not a secret (and not even unique, and will change for every new OTAA Join anyway). Same goes for the AppEUI and DevEUI.


I have observed the same issue before; see Downlink issue (observed at OTAA).
It is something that happens in the core infrastructure of TTN.


Off-topic, but… how did you get your TTIG to send the time in the metadata? Mine doesn’t :frowning:

No idea.

Wow, the amount and depth of all your answers is awesome! :smiley:

Let me try to summarize the topics:

  • The tracker is my first toy project to learn about LoRaWAN/TTN and to test my TTIG. The SW is based on https://github.com/kizniche/ttgo-tbeam-ttn-tracker, but forked to https://github.com/grillbaer/ttgo-tbeam-ttn-tracker to make it compilable with PlatformIO and to make experimenting easier. Now it’s “real” C++, but the libs are still the Arduino ones, running on an ESP32.

  • I tried SF9 after the first tests with the default (and recommended) SF7 to get a feeling for the range.

  • LoRa lib: MCCI LoRaWAN LMIC library@2.3.2 = https://github.com/mcci-catena/arduino-lmic
    BTW, does anybody know a better one for HPD13A or RFM95W? There seem to be many of them out there… difficult to choose from.

  • Timing indeed was an issue in my first OTAA tests with SF7 near my own GW. LMIC_setClockError(MAX_CLOCK_ERROR * 5 / 100) seems to solve it. Of course, I’m not sure if it is really solved in all situations…

  • The SW uses LMIC_setDrTxpow(sf, 14) to set the TX power to 14 dBm = 25 mW. That’s what I would have expected.

  • OK, I understand that OTAA is better used with ADR on, and that ADR is not well suited for moving devices. Going back to ABP is no problem for me, but I have read many times that ABP should only be used for development and that OTAA is preferred. However, this should not matter for this test. Besides, the LMIC lib seems to reset my ADR=off after joining with OTAA, so I have to set ADR=off and SF=x again. Maybe this hacky re-setting before every send() is part of the problem… Going to ABP would also remove the rejoins, which seem to be triggered automatically by the LMIC lib.

  • As suggested, I’m going to take the tracker into the vicinity of the other GW and see what happens… but probably not before the weekend. Great tip!
    I could also try to log RSSI and SNR on the device’s side to see whether the values are close to those on the GW side.

  • I was wondering if posting EUIs and DevAddrs in the forum is considered a problem, because I saw it blacked out in several posts. Thanks for the info. Now my first post is another bad example which will lead more people to anonymize it :sunglasses:

  • Is there an easy way to retain the Console App/GW traffic logs without a permanently open browser? Sending app data to integration “Data Storage” drops the meta-data. The CSV from TTN Mapper keeps at least some meta-data.
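
As a back-of-the-envelope check on the clock error setting mentioned in the list above, this sketch estimates how much timing uncertainty a 5 % setting represents by the time the join accept windows open. It assumes `MAX_CLOCK_ERROR` is 65536 as in LMIC's `lmic.h`, and uses the LoRaWAN join accept delays of 5 s and 6 s. The helper names are made up, and the library's real window calculation differs in detail (some versions may also clamp the value), so this is only an order-of-magnitude estimate.

```python
MAX_CLOCK_ERROR = 65536  # assumed to match the constant in LMIC's lmic.h

def clock_error_fraction(lmic_value):
    """Fraction of elapsed time that the given clock error value represents."""
    return lmic_value / MAX_CLOCK_ERROR

def window_uncertainty_ms(delay_s, lmic_value):
    """Worst-case drift (+/-) accumulated over the delay before a receive window."""
    return delay_s * 1000.0 * clock_error_fraction(lmic_value)

# Same integer arithmetic as the C expression MAX_CLOCK_ERROR * 5 / 100
setting = MAX_CLOCK_ERROR * 5 // 100
for name, delay_s in (("JOIN_ACCEPT_DELAY1", 5), ("JOIN_ACCEPT_DELAY2", 6)):
    print(f"{name}: +/- {window_uncertainty_ms(delay_s, setting):.0f} ms")
```

A quarter of a second either way is a lot compared to an SF9 symbol of about 4 ms, which is probably why such a generous setting hides timing problems rather than fixing them.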

So, the next steps for me are a) to test OTAA near the other GW, b) to compare RSSI/SNR between node and GW and c) to drop OTAA for this node and use ABP.

Thanks again for all your support!


For the application you could use MQTT, which includes all metadata. There is no such alternative for the gateway. However, if you search the forum you might find a message where someone saves the data from the browser.


If I understood the docs correctly from a quick first look, there is already an MQTT server from TTN up and running and I only have to subscribe to the relevant topics as a client. That’s great!

Yes, for node data you just subscribe. No need to run your own MQTT server.
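
Once subscribed, keeping the per-gateway metadata could look roughly like the sketch below. It only covers the message handling, so it can be plugged into the message callback of whatever MQTT client you use. I am assuming the uplink payload is JSON with a `metadata.gateways` list shaped like the console output earlier in this thread; the function names, the CSV columns and the sample device ID are made up.

```python
import csv
import io
import json

FIELDS = ["time", "data_rate", "frequency", "gtw_id", "rssi", "snr"]

def uplink_to_rows(payload):
    """Flatten one uplink message (JSON as str or bytes) into one row per gateway."""
    msg = json.loads(payload)
    meta = msg.get("metadata", {})
    rows = []
    for gw in meta.get("gateways", []):
        rows.append({
            "time": meta.get("time"),
            "data_rate": meta.get("data_rate"),
            "frequency": meta.get("frequency"),
            "gtw_id": gw.get("gtw_id"),
            "rssi": gw.get("rssi"),
            "snr": gw.get("snr"),
        })
    return rows

def append_rows(csv_file, rows):
    """Write the rows to an open text file object in CSV format."""
    writer = csv.DictWriter(csv_file, fieldnames=FIELDS)
    writer.writerows(rows)

# Example message, shaped like the metadata shown in the console
sample = json.dumps({
    "dev_id": "example-device",  # hypothetical device ID
    "metadata": {
        "time": "2019-11-24T15:15:20.596248692Z",
        "frequency": 868.1,
        "data_rate": "SF9BW125",
        "gateways": [
            {"gtw_id": "eui-b827ebff...", "rssi": -103, "snr": 10},
            {"gtw_id": "eui-58a0cbff...", "rssi": -106, "snr": 3},
        ],
    },
})
rows = uplink_to_rows(sample)
buffer = io.StringIO()
append_rows(buffer, rows)
print(buffer.getvalue())
```

In real use you would pass a file opened in append mode instead of the in-memory buffer, and call `uplink_to_rows` with the payload of each received message.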

Not my cup of tea, but it might depend on the antenna gain as well? Are you using any special antenna? (Also, I recall that some LMIC-based libraries actually stated setting the power did not even have any effect, so you might not even be able to change it…)
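
To make the arithmetic concrete, here is a small sketch of how the configured power and the antenna gain add up. The gains are hypothetical, since I don't know what the T-Beam antenna actually delivers, and the function names are made up.

```python
def dbm_to_mw(dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (dbm / 10.0)

def radiated_dbm(tx_dbm, antenna_gain_db, cable_loss_db=0.0):
    """Radiated power in dBm: transmitter output plus antenna gain minus losses."""
    return tx_dbm + antenna_gain_db - cable_loss_db

print(f"14 dBm = {dbm_to_mw(14):.1f} mW")
for gain in (-3.0, 0.0, 2.0):  # hypothetical antenna gains in dB
    total = radiated_dbm(14, gain)
    print(f"gain {gain:+.1f} dB -> {total:.1f} dBm = {dbm_to_mw(total):.1f} mW")
```

The same gain figure applies when receiving, so the antenna alone should not make the link asymmetric; a wrong power setting on either side would.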

And maybe test with the TTIG disabled, to ensure it’s not related to, e.g., the following?

(You always seem to get a value for gtw_id, so I’d assume the above does not apply to you. But it might help to be sure. Maybe you can even reach the other gateway from home when using a much higher SF.)

This is the current LMIC reference library for the Arduino framework and SX1276 based modules like HPD13A and RFM95W. But version 2.3.2 is not the latest version.


No. I expect a small helix inside the <40 mm plastic cover:
image
Shouldn’t antenna gain have a quite symmetric effect on TX and RX? But I don’t know too much about HF…

That would be a simple way to test it. I’m going to do some more tests at the weekend; perhaps they will give some more insight.

Gain, yes. But something like digital or power supply noise from the local system is more likely to be an issue for receiving than transmitting.

That said, LoRa has a LOT more immunity to such things than other sub-GHz schemes (OOK remote controls are very susceptible, for example).


Oh, thank you :smiley: I only looked at the GitHub releases, not the tags. Then I’ll switch to MCCI LoRaWAN LMIC library@3.0.99.

Is this lib a good choice for ESP32? For me, it does not necessarily need to be based on the Arduino libs.

Sounds possible. I could try to have a look at the LoRa power supply with the oscilloscope. This revision of the T-Beam board has a mighty multi-purpose AXP192 power management chip on it: the big chip between display and antenna, with nearly as many pins as the µC, which is covered by the display. Who knows what its output looks like…

Another fun fact that might affect/help your testing of downlinks transmitted by remote gateways, after the device has joined: (at least) in the EU868 frequency plan, a downlink in RX2 uses SF9, regardless of what SF was used in the uplink. And that uses an increased power:

This does not apply to the OTAA Join Accept (as at that point the device does not yet know about the TTN-specific settings, which it actually gets in the data of that very Join Accept). But after joining, or when using ABP, you might turn off the close-by gateway, schedule a downlink in TTN Console, and transmit an SF11 or SF12 uplink to increase the chance that TTN will tell the remote gateway to use RX2 with SF9 and a higher transmission power for a downlink.

I’ve no idea if any of the above is useful for debugging the topic at hand. :slight_smile: However, testing for RX2 is good anyway!


So, this is the noise on the 3.3 V supply voltage of the LoRa module:
image
Lots of bursts of spikes with at least 30 mV peak-peak.
One spike zoomed in:
image
And this is with a 100 nF capacitor soldered directly to the power supply pins:
image
Much better :slightly_smiling_face:

I have to stop the modifications before doing some more testing, now. Otherwise I won’t know what was the cause.


Bypass capacitors are of course important.

I have to say my first guess is not that local digital noise is your problem - I mentioned that with regard to the antenna reciprocity question, but don’t mean to encourage getting too sidetracked.

My actual suspicion is that some software issue, either in the node or on the infrastructure side, means that the transmitting and the receiving LoRa radio are not using compatible settings at exactly the same time.

Yes it is, but if you prefer to use Espressif ESP-IDF instead of the Arduino framework, the same library has been ported to ESP-IDF.

See ttn-esp32 here:
Overview of LoRaWAN libraries - LMiC, LoRaMAC-node and their variations [HowTo]
