No Join Response from TTN


(Vanthome) #1

Hi, I have a node device in development which can obviously send join requests:

However, I do not get any response from TTN. I also traced the traffic with tcpdump to
make sure the packet forwarder is not “eating” the response, but there is simply no response.
The firmware uses LMIC version 1.6 and I have set up the frequency plan as defined in the Wiki.
I wonder why TTN does not send a join response and does not even show an error in the console?

The device is connected in the EU via an iC880A-based gateway that otherwise works.

This is a similar problem from a guy using another device:
https://www.thethingsnetwork.org/forum/t/mdot-join-req-visible-in-ttn-console-no-response-how-debug/8105/7

BG, Thomas


(Drust) #2

Hi Vanthome,

Have you checked the keys on your node?


#3

Hi Vanthome,

If AppEUI, DevEUI and AppKey are correct, you should see an “activation message” in your application console -> data, shortly after the join request in gateway traffic.

If you do not see anything in the application console, you most probably configured the keys in the wrong format.

From the “activation message” you can then see which gateways received the join request. If you only see your gateway there, the corresponding “Join Accept” in gateway traffic should follow.
If multiple gateways are in reach, you might miss the “Join Accept” in the gateway log, as it might be sent via a different gateway.
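
One format mistake that commonly produces this symptom with LMIC-based firmware is byte order. To my knowledge LMIC expects the AppEUI and DevEUI least-significant byte first, while the AppKey is entered most-significant byte first, as the console displays it. A minimal sketch, assuming the EUI was copied from the console in MSB order; `reverse_eui` is a hypothetical helper, not part of LMIC:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EUI_LEN 8

/* Hypothetical helper (not part of LMIC): copy an 8-byte EUI while
 * reversing its byte order, e.g. from MSB-first (as copied from the
 * console) to LSB-first (as LMIC is assumed to expect for AppEUI and
 * DevEUI). `in` and `out` must not point to the same buffer. */
void reverse_eui(const uint8_t in[EUI_LEN], uint8_t out[EUI_LEN])
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < EUI_LEN; i++) {
        out[i] = in[EUI_LEN - 1 - i];
    }
}
```

The AppKey would be left untouched under this assumption; only the two EUIs are reversed.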


#4

You should see this yellow icon if your node is in range of a TTN gateway, the node is working and the keys are 100% correct.


(Vanthome) #5

YES! that was indeed the issue, thx!


(Vanthome) #6

I want to follow up on this because I have discovered that there most likely still is a problem, and that the solution back then was probably not correcting the keys, as I thought.
The problem has meanwhile also been described here.
My device is continuously sending join requests (on purpose, due to some testing I’m doing). I can describe the problem better now:
When I start the device and it makes n join requests, I get n responses. When I restart the device, I only get responses from join request n + 1 onward. I have not dug deep yet to find out what actually differs between the requests. All I know is that it’s not the frame counter, because it is not incremented for join requests.

The only cure that I can reproducibly apply is to delete and re-create the device or use a newly created device.

It would be really helpful if someone from the TTN would comment on this.


#7

8 months later … huh ? :roll_eyes:

* and what type is your gateway ?


(Jac Kersing) #8

Is your node using unique nonces for the OTAA requests? Any OTAA request with a duplicate nonce will be ignored as per spec.
If there is a pseudo random generator in the device chances are the random nonces are in fact not random and always use the same sequence.
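
The effect of a repeating nonce sequence can be illustrated with a toy model. This is only a sketch of the idea, not TTN's backend or LMIC's generator: `nonce_history`, `accept_dev_nonce`, `next_nonce` and `simulate_boot` are hypothetical names, and the generator is a plain LCG with a fixed seed standing in for a device PRNG that restarts the same way after every reboot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SEEN 64

/* Toy stand-in for the network's per-device DevNonce history. */
typedef struct {
    uint16_t seen[MAX_SEEN];
    size_t count;
} nonce_history;

/* Returns true and records the nonce if it is new (join accepted);
 * returns false if it was used before (join request ignored). */
bool accept_dev_nonce(nonce_history *h, uint16_t nonce)
{
    assert(h != NULL);
    for (size_t i = 0; i < h->count; i++) {
        if (h->seen[i] == nonce) {
            return false;
        }
    }
    if (h->count < MAX_SEEN) {
        h->seen[h->count++] = nonce;
    }
    return true;
}

/* Toy LCG standing in for the device PRNG (assumption: fixed seed,
 * so the same nonce sequence is produced after every reboot). */
uint16_t next_nonce(uint32_t *state)
{
    *state = *state * 1103515245u + 12345u;
    return (uint16_t)(*state >> 16);
}

/* Simulates one boot sending `joins` join requests and returns how
 * many of them the toy network accepted. */
size_t simulate_boot(nonce_history *h, uint32_t seed, size_t joins)
{
    uint32_t state = seed;
    size_t accepted = 0;
    for (size_t i = 0; i < joins; i++) {
        if (accept_dev_nonce(h, next_nonce(&state))) {
            accepted++;
        }
    }
    return accepted;
}
```

In this model a first boot with n join requests gets n accepts, a reboot with the same seed gets none for the first n requests, and only request n + 1 is accepted again, which matches the behaviour described earlier in the thread.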


(Arjan) #9

Indeed. So, look for error messages in TTN Console: OTAA shows "Activation DevNonce not valid: already used". (Note that things are different in LoRaWAN 1.1; see the same topic.)


(Vanthome) #10

First of all, thanks for your answers. But no, I never saw the message “Activation DevNonce not valid: already used”. As you can see in the screenshot, I can see only join requests and that’s it. That alone I consider sub-optimal; it should be enhanced so that you can see the reason. And there must be a reason, because otherwise the backend would send a join response. The only reason I can imagine is that the backend selects another GW through which it sends the join response, but why would it, if it receives the join requests through that GW? Also, as far as I can see, I have no other GW in my area.


(Arjan) #11

Well, the error message would be in the Join Request, as TTN is not going to create a Join Accept if it reports that error. Did you click the Join Request to expand it, to see its details?

And did you try the workaround to temporarily change the AppKey, and if so: did that not solve your “N+1” problem?


(Jac Kersing) #12

Open the join requests in your application and you should see which gateways received it.


(Vanthome) #13

@kersing Yes, the join requests go through the GW I’m looking at; I’m usually looking at the traffic view of the GW (I also checked the device itself).
@arjanvanb I cannot see any error when I open any of the many join requests I have inspected.
No, I did not try that, but if it’s a workaround, is there a known issue?


(Jac Kersing) #14

Did you open them at gateway level? Then there won’t be any error related to the node. Only on application level that kind of information is available.
While debugging it helps to have both gateway and application/node traffic/data tabs open.

Yes, it is an issue in the LoRaWAN spec < 1.1. It is mentioned in the previous messages which you dismissed as not being the cause of your issues.

Can you show a screen shot of the application/node data join entries expanded for failing join requests and the first one that works?


(Vanthome) #15

“Only on application level that kind of information is available.”

Checked it on application and device level and there is no error.
On application level:

image

On application level, one message expanded:

image

“Yes, it is an issue in the LoRaWAN spec < 1.1”

But if it was this issue, there should be an error, right?

I will also post screenshots of the case where I get the join response but for this, I need to delete and recreate it first.


(Jac Kersing) #16

The changing dev addr values indicate the back-end accepts the join request and assigns a new address. Is there one or are there two gateways listed in the metadata?


(Vanthome) #17

Right, good catch, I didn’t notice that. I checked, although I was already sure, and no: there is only one GW (my own) and no error. And this is exactly the behaviour we have no explanation for.


(Vanthome) #18

Ok, I can confirm that if I change the App Key of the device via ttnctl, it seems to reset something (probably the nonces as you propose) and I get join responses again. But, then I should get a proper error message in the console which I don’t, so this smells like a bug to me.


(Wietse 0803) #21

Hi everyone,

I have the same problem with my node (ATxmega32E5 with an RFM95W module).
Only ‘activation’ messages are visible in my console, even though I get a proper joined message in the callback.

join_succeed_1

I already tried the following things:

  • Compared the differences between the lmic.c and oslmic.c from this library (Matthijs Kooijman) with the lmic.c of the original library from IBM and applied the changes/improvements (?) from Matthijs.

  • Added
    LMIC_setClockError(MAX_CLOCK_ERROR * 1 / 100); after LMIC_reset();
    This required the changes in lmic.c from Matthijs Kooijman.

  • Changed
    setDrJoin(DRCHG_SET, DR_SF7);
    to
    setDrJoin(DRCHG_SET, DR_SF9);
    and I also tried
    setDrJoin(DRCHG_SET, DR_SF12);
    on lines 686 and 879 of lmic.c
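
Both the clock-error setting and the data-rate changes above are aimed at the receive window timing. The following sketch estimates how far a given clock error can shift the window that opens 5 s after a join request, and compares it with the LoRa symbol duration. It assumes that LMIC's `MAX_CLOCK_ERROR` equals 65536 (representing 100 %); `worst_case_drift_us` and `symbol_time_us` are hypothetical helpers, not LMIC functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: mirrors LMIC's MAX_CLOCK_ERROR, where 65536 means 100 %. */
#define SKETCH_MAX_CLOCK_ERROR 65536u

/* Worst-case timing drift in microseconds accumulated over `delay_us`
 * for a clock error given as a fraction of SKETCH_MAX_CLOCK_ERROR. */
uint32_t worst_case_drift_us(uint32_t delay_us, uint32_t clock_error)
{
    assert(clock_error <= SKETCH_MAX_CLOCK_ERROR);
    return (uint32_t)(((uint64_t)delay_us * clock_error)
                      / SKETCH_MAX_CLOCK_ERROR);
}

/* LoRa symbol duration in microseconds: 2^SF / bandwidth. */
uint32_t symbol_time_us(unsigned sf, uint32_t bandwidth_hz)
{
    assert(sf >= 7 && sf <= 12 && bandwidth_hz > 0);
    return (uint32_t)(((uint64_t)1000000u << sf) / bandwidth_hz);
}
```

With a 1 % error (`SKETCH_MAX_CLOCK_ERROR * 1 / 100`) the drift over 5 s comes to roughly 50 ms, while one symbol at SF7 and 125 kHz lasts about 1 ms, so an inaccurate clock can easily miss a window that is only a few symbols wide.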

Screenshot console:
console

I hope someone has another suggestion, because I’m out of ideas right now.
Thanks in advance :blush:.


(Wietse 0803) #22

I finally found a solution for the problem described in my previous post. It was definitely a timing issue.

I decided to use an external clock (16 MHz crystal) instead of the built-in oscillator of my microcontroller.
After changing the clock input I had to change the timer prescaler settings, of course.
LMIC requires ticks to be between 15.5 μs and 100 μs long; in my current setup a tick is 32 μs.

Previous config (snippet hal.cpp):
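static void hal_time_init () {
    // TCC4 counts the prescaled clock in normal mode.
    TCC4.CTRLB = TC_WGMODE_NORMAL_gc;
    // 32 MHz / 1024 = 31.25 kHz -> 32 μs per tick
    TCC4.CTRLA = TC_CLKSEL_DIV1024_gc;
    // Route the TCC4 overflow to event channel 0 ...
    EVSYS_CH0MUX = EVSYS_CHMUX_TCC4_OVF_gc;
    // ... and let TCC5 count those overflows.
    TCC5.CTRLB = TC_WGMODE_NORMAL_gc;
    TCC5.CTRLA = TC_CLKSEL_EVCH0_gc;
}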

New config (snippet hal.cpp):
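static void hal_time_init () {
    // TCC4 counts the prescaled clock in normal mode.
    TCC4.CTRLB = TC_WGMODE_NORMAL_gc;
    // 16 MHz / 256 = 62.5 kHz -> 16 μs per tick
    TCC4.CTRLA = TC_CLKSEL_DIV256_gc;
    // Route the TCC4 overflow to event channel 0 ...
    EVSYS_CH0MUX = EVSYS_CHMUX_TCC4_OVF_gc;
    // ... and let TCC5 count those overflows.
    TCC5.CTRLB = TC_WGMODE_NORMAL_gc;
    TCC5.CTRLA = TC_CLKSEL_EVCH0_gc;
}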

Now it’s running fine. After this success I changed the prescaler settings of my built-in oscillator so that it runs at 16 MHz instead of 32 MHz, and it’s still running fine :slight_smile: .
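
The prescaler arithmetic for such timer setups can be checked with a few lines of C. This is a minimal sketch, assuming the timer tick is simply the prescaler divided by the clock frequency; `tick_ns` and `tick_in_lmic_range` are hypothetical helpers, not part of LMIC or the XMEGA headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Timer tick length in nanoseconds for a given clock and prescaler. */
uint32_t tick_ns(uint32_t clock_hz, uint32_t prescaler)
{
    assert(clock_hz > 0 && prescaler > 0);
    return (uint32_t)(((uint64_t)prescaler * 1000000000u) / clock_hz);
}

/* True if the tick length lies inside the 15.5 us to 100 us range
 * that LMIC is said to require. */
bool tick_in_lmic_range(uint32_t ns)
{
    return ns >= 15500u && ns <= 100000u;
}
```

With these helpers, 32 MHz with a prescaler of 1024 gives 32 μs per tick and 16 MHz with a prescaler of 256 gives 16 μs; both are nominally inside the range. The check only covers the nominal tick length, not how accurate the oscillator actually is.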

ttn_tx_complete