Frame Option bytes sent by server are too long and are rejected. Work-around please

A device using AS923 region rules and DwellTime = 1 can receive only 11 bytes at DR2. I observe the TTN server sending 14 bytes of Frame Options. This packet is rejected by my device as violating the packet length rule, and consequently the device fails to operate.

Details follow: When my device has operated without a gateway, it has tracked the data rate/spreading factor down to DR2 and stored this in flash memory. AS923 region, DwellTime=1, DR2 has an 11 byte application payload limit. When the device boots again with a gateway connected to TTN V3, it joins properly (using DR2). However, its next uplink message receives a downlink response from the server with 26 bytes: 12 bytes of header/MIC and 14 bytes of Frame Options (Fopts) MAC commands.

The device firmware detects the 14 bytes as being illegal (greater than 11 bytes) and so rejects the downlink message. The device cannot function.

The 14-byte Fopts field is as follows (raw data and parsed):
Raw data: 06 07 07 a0 af 8c 50 03 51 ff 00 01 09 35
06 = DevStatusReq (no payload)
07 = NewChannelReq (5 byte payload follows)
03 = LinkADRReq (4 byte payload follows)
09 = TxParamSetupReq (1 byte payload follows - including DwellTime = 1)
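For anyone who wants to check the breakdown above, here is a minimal C sketch that walks an Fopts buffer and counts the MAC commands in it. The function names are hypothetical (they are not from LoRaMac-node), the length table only covers the four commands seen in this downlink, and the bit layout of the TxParamSetupReq byte (bit 5 = DownlinkDwellTime, bit 4 = UplinkDwellTime, bits 3..0 = MaxEIRP index) is my reading of the LoRaWAN 1.0.3 specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Payload length (bytes after the CID) of a downlink MAC command.
 * Deliberately partial: only the four commands seen in this thread.
 * Returns -1 for any other CID. */
int mac_cmd_payload_len(uint8_t cid)
{
    switch (cid) {
    case 0x03: return 4; /* LinkADRReq */
    case 0x06: return 0; /* DevStatusReq */
    case 0x07: return 5; /* NewChannelReq */
    case 0x09: return 1; /* TxParamSetupReq */
    default:   return -1;
    }
}

/* Hypothetical helper: counts the complete MAC commands in fopts.
 * Returns -1 on an unknown CID or a truncated payload. */
int count_mac_cmds(const uint8_t *fopts, size_t len)
{
    assert(fopts != NULL || len == 0);
    int count = 0;
    size_t i = 0;
    while (i < len) {
        int plen = mac_cmd_payload_len(fopts[i]);
        if (plen < 0 || i + 1 + (size_t)plen > len)
            return -1;
        i += 1 + (size_t)plen; /* skip the CID byte and its payload */
        count++;
    }
    return count;
}

/* Field extraction for the single TxParamSetupReq payload byte. */
int downlink_dwell(uint8_t b) { return (b >> 5) & 1; }
int uplink_dwell(uint8_t b)   { return (b >> 4) & 1; }
int max_eirp_index(uint8_t b) { return b & 0x0F; }
```

Fed the 14 raw bytes above, this finds four commands, and the final byte 0x35 decodes to both dwell time bits set with a MaxEIRP index of 5.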

I suspect this is a bug. Am I right?

(a) The server should be aware of the 11 byte limit and not attempt to send 14 bytes.
(b) In any event, there is no need for the server to insist on DwellTime = 1 as there is no need for this restriction according to New Zealand regulations. (I posted on this here, but received no replies: Dwell Time and TxParamSetupReq for AS923 in New Zealand)

Is this a known problem? Is there a workaround I can apply at the server level or device application level, without changing the LoRaWAN firmware?

Related post: How to set dwelltime to 0 at AS923 in lopy4?


The people who have access to the servers, develop the code, and configure the settings are the TTI core team, not the Community forum contributors, moderators or volunteers. Some TTI staff do visit the forum, but not every thread gets read or has the needed staff cover. If you suspect a bug, the best option is to raise it on GitHub, or to post in the #support channel on the Things Slack.

Nice detailed description of the problem, thanks for that. Could you also let us know which frequency plan you configured for your end device (in the Console, go to your end device, then General Settings and expand the Network layer section)?

By the way: the LoRaWAN specification does not require end devices to reject such downlinks:

The end-device SHALL only enforce the maximum Downlink MAC Payload Size defined for
DownlinkDwellTime = 0 (no dwell time enforced) regardless of the actual setting. This
prevents the end-device from discarding valid downlink messages which comply with the
regulatory requirements which may be unknown to the device (for example, when the device
is joining the network).

Thanks for the report!

In addition to the frequency plan used, could you also tell us the MAC version and Regional Parameters version used during testing?

Both older and newer RP versions are ambiguous with respect to the settings that the end device will boot up with. To make things worse, there is no right answer to begin with: the DownlinkDwellTime also changes the offset used for the RX1 data rate, so even being conservative with respect to the payload size can leave the RX1 window data rate wrong (if the device is expecting DownlinkDwellTime=0).
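To make the RX1 point concrete, here is a small C sketch of the AS923 RX1 data rate rule as I read it in the Regional Parameters: RX1DR = MIN(5, MAX(MinDR, UplinkDR - EffectiveOffset)), where MinDR is 0 for DownlinkDwellTime=0 and 2 for DownlinkDwellTime=1, and RX1DROffset values 6 and 7 map to effective offsets of -1 and -2. The function name is hypothetical, and the rule should be checked against the RP version you actually use.

```c
#include <stdint.h>

/* AS923 RX1 data rate, assuming the rule
 *   RX1DR = MIN(5, MAX(MinDR, UplinkDR - EffectiveOffset))
 * with MinDR = 0 when DownlinkDwellTime = 0 and MinDR = 2 when it is 1.
 * rx1_dr_offset is the raw 0..7 value; 6 and 7 mean -1 and -2. */
int as923_rx1_dr(int uplink_dr, int rx1_dr_offset, int downlink_dwell_time)
{
    static const int effective_offset[8] = {0, 1, 2, 3, 4, 5, -1, -2};
    int min_dr = downlink_dwell_time ? 2 : 0;
    int dr = uplink_dr - effective_offset[rx1_dr_offset & 7];

    if (dr < min_dr)
        dr = min_dr; /* the dwell time setting raises the floor */
    if (dr > 5)
        dr = 5;      /* RX1 never goes above DR5 */
    return dr;
}
```

For an uplink at DR2 with RX1DROffset = 2, a side assuming DownlinkDwellTime=0 computes DR0, while a side assuming DownlinkDwellTime=1 computes DR2. If the device and the server disagree on the setting, the device listens at the wrong data rate in RX1. With RX1DROffset = 0 and an uplink at DR2, both sides compute DR2.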

The server insists on doing a TxParamSetup due to this matter - if the server does not ‘clarify’ on the DownlinkDwellTime, it is possible that RX1 transmissions won’t be possible.

If the boot-time settings expected by the stack do not match the ones used by the end device (remember that the standard is ambiguous here, and for AS923 the boot settings are not provided), the correct values may be provided using the --mac-settings.downlink-dwell-time and --mac-settings.uplink-dwell-time CLI options.

Thanks all for your prompt replies. Additional info follows.

The console says settings for the end device are: “Asia 920-923MHz”, “LoRaWAN spec 1.0.3”, “RP001 regional Parameters 1.0.3 Rev A”

I am using the STM32WLE5 chip and an IDE provided by ST. A file st_readme.txt says:
“Implements LoRa Mac from Semtech/StackForce develop branch (26-May-2020 commits, version 4.4.4)”

I think the problematic code is in LoRaMac.c, ProcessRadioRxDone():

        case FRAME_TYPE_DATA_UNCONFIRMED_DOWN:
            // Check if the received payload size is valid
            getPhy.UplinkDwellTime = MacCtx.NvmCtx->MacParams.DownlinkDwellTime;
            getPhy.Datarate = MacCtx.McpsIndication.RxDatarate;
            getPhy.Attribute = PHY_MAX_PAYLOAD;

            // Get the maximum payload length
            if( MacCtx.NvmCtx->RepeaterSupport == true )
            {
                getPhy.Attribute = PHY_MAX_PAYLOAD_REPEATER;
            }

            phyParam = RegionGetPhyParam( MacCtx.NvmCtx->Region, &getPhy );

            if( ( MAX( 0, ( int16_t )( ( int16_t ) size - ( int16_t ) LORAMAC_FRAME_PAYLOAD_OVERHEAD_SIZE ) ) > ( int16_t )phyParam.Value ) ||
                ( size < LORAMAC_FRAME_PAYLOAD_MIN_SIZE ) )
            {
                MacCtx.McpsIndication.Status = LORAMAC_EVENT_INFO_STATUS_ERROR;

                PrepareRxDoneAbort( );
                return;
            }

(I have checked the latest LoRaMac-node on GitHub and it seems unchanged.)

I think that at this time MacCtx.NvmCtx->MacParams.DownlinkDwellTime has been set to the default AS923_DEFAULT_DOWNLINK_DWELL_TIME which is 1. The RegionGetPhyParam() call returns a value based on both UplinkDwellTime and RxDatarate, which is 11 in this case.
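A stripped-down C sketch of that size check shows why the same frame passes or fails depending only on the dwell time used for the lookup. The function is hypothetical, not the LoRaMac-node code: it omits the minimum-size test, and the overhead is a parameter because I am using the 12 bytes of header/MIC counted above rather than the library's LORAMAC_FRAME_PAYLOAD_OVERHEAD_SIZE, which may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same shape as the quoted check: subtract a fixed overhead from the
 * received frame size, clamp at zero, and compare with the limit that
 * the region lookup returned for this data rate and dwell time. */
bool downlink_size_ok(uint16_t frame_size, uint16_t overhead,
                      uint16_t max_payload)
{
    int payload = (int)frame_size - (int)overhead;

    if (payload < 0)
        payload = 0;
    return payload <= (int)max_payload;
}
```

With the 26-byte downlink and a 12-byte overhead, a limit of 11 (DR2, DwellTime=1) rejects the frame. A limit of 51, which I am assuming is the DR2 value for DwellTime=0, accepts it.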

If I understand Hylker, the code should be changed to:
getPhy.UplinkDwellTime = 0;

If so, then do you agree this is a bug in the LoRaMac-node code base? If so, it is surprising that others have not found this before.

I may be able to fix this in my devices, but it would be nicer if there was a server-side fix, say split the MAC commands into two sets so there is never > 11 bytes.

Also - the server’s TxParamSetupReq MAC parameters are setting UplinkDwellTime and DownlinkDwellTime to 1. Are there any circumstances in which the server will set these to 0? The spec says “Used by the network server to set the maximum allowed dwell time and Max EIRP of end-device, based on local regulations” so it would seem that it is incumbent on the server code to determine my region and send 0.

Further confusion: the console has many options for the “Frequency Plan” setting, but these are not described using the spec’s terminology (e.g. AS923), nor can I see documentation that describes the differences. Is there a better AS923 setting for me than “Asia 920-923MHz”?

It seems like this issue in LoRaMAC-node has been subtly reported before, and then Semtech maintainers went around and did an unhelpful closing of issues based solely on time and not status…

and a second time here, again carelessly closed:

The code has since been refactored a bit so the details of how the (unnecessary? mistaken?) enforcement is being performed have changed.

Interestingly, the problem was previously recognized and fixed in the radio chip receive setup code, but not in subsequent rx done processing:

This is quite a complex subject and several discussions have been held concerning it.

The RP1 versions of the Regional Parameters specification did not specify the default dwell time values for the AU915 and AS923 regions.

Starting with the RP2 versions, the default dwell time for these regions is specified in order to avoid potential issues as much as possible.

In the LoRaMac-node project, when we added support for the AS923 region, we decided that the end-device should enforce the most restrictive limitations in order to ensure that it would always comply with national regulations. It was maybe not the best decision, but we had to make one as it was not specified.

An end-device is only responsible for ensuring that the uplink dwell time restrictions are respected. For downlinks, it is the Network Server's responsibility to respect the downlink dwell time.
This is the reason why, starting with the RP2 specifications, uplink-dwell-time=1 and downlink-dwell-time=0 have been specified.

Note that potential issues may still occur if a network server changes the Rx1DrOffset sent in the JoinAccept message. For the AU915 and AS923 regions a network server shouldn’t do this, as the first Rx window could become unusable.

In order to solve this issue I would recommend updating the LoRaMac stack to the latest version, 4.6.0, which implements the LoRaWAN 1.0.4 + RP2-1.0.1 specifications.
Version v4.4.4 is now quite old (May 26, 2020) and a lot of fixes have been made since. Please refer to the CHANGELOG.md file for further details.

With version v4.4.4, the best way to solve the issue is to change the AS923_DEFAULT_DOWNLINK_DWELL_TIME definition from 1 to 0.

There were some recent discussions on this subject between myself and TTN, and the outcome can be seen in the following TheThingsNetwork/lorawan-stack issue: Maximum downlink payload size exceeded when dwell time is activated · Issue #4971 · TheThingsNetwork/lorawan-stack · GitHub

The following issues/PR are also related to the same subject:


That is correct thinking.

But having the end device enforce downlink payload length limits based on dwell time is mistaken and counterproductive - it does nothing to ensure compliance, because the end device does not control what the network transmits.

The reason the downlink dwell time needs to be accurately known is to determine the minimum downlink data rate, so that the radio is correctly set to receive what the network might transmit.

But the mistaken code enforcing downlink packet size limits needs to be removed.

Thanks all.

As I understand it, mluis suggests two fixes, both involving changing the device code. This is reasonable for new devices but difficult to implement for devices in the field.

mluis references this Maximum downlink payload size exceeded when dwell time is activated · Issue #4971 · TheThingsNetwork/lorawan-stack · GitHub in which the summary is “Maximum downlink payload size is exceeded when the end device has dwell time enabled, but the Network Server’s frequency plan does not.” This kind of implies the problem does not exist if the Network Server’s frequency plan does have dwell time enabled. However, I think the problem exists for both settings “Asia 920-923 MHz” and “Asia 920-923 MHz (used by TTN Australia)” - in both cases the server sends 14 bytes of Fopts, which the device rejects. (The difference is that the TxParamSetupReq MAC clears both DwellTime bits for the Australian frequency plan).

It should be possible for the NS to foresee this problem and so split the Fopts in two - it would be benign to defer the NewChannelReq MAC for a later downlink, for example.

Or am I missing something?
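To illustrate the kind of split I have in mind, here is a minimal C sketch of a greedy, order-preserving packer. It is purely hypothetical and says nothing about how the Network Server actually schedules MAC commands. Each entry in cmd_len is the full size of one command (CID byte plus payload), and limit is the maximum number of MAC command bytes allowed per downlink.

```c
#include <assert.h>
#include <stddef.h>

/* Assigns each MAC command to a downlink, keeping the original order and
 * never exceeding `limit` bytes per downlink. group[i] receives the
 * zero-based downlink index for command i.
 * Returns the number of downlinks needed, or -1 if a single command is
 * larger than the limit and so can never be sent. */
int split_mac_cmds(const size_t *cmd_len, size_t n, size_t limit, int *group)
{
    assert(n == 0 || (cmd_len != NULL && group != NULL));
    int downlinks = 0;
    size_t used = 0;

    for (size_t i = 0; i < n; i++) {
        if (cmd_len[i] > limit)
            return -1;
        if (downlinks == 0 || used + cmd_len[i] > limit) {
            downlinks++; /* start a new downlink */
            used = 0;
        }
        used += cmd_len[i];
        group[i] = downlinks - 1;
    }
    return downlinks;
}
```

For the four commands in my capture (1, 6, 5 and 2 bytes) and a limit of 11, this gives two downlinks: DevStatusReq and NewChannelReq in the first (7 bytes), LinkADRReq and TxParamSetupReq in the second (7 bytes). A real scheduler would probably want to prioritise TxParamSetupReq and LinkADRReq instead of simply preserving order.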

And it is somewhat perverse: the “damage” is done, and rejecting the MAC commands is likely to result in the NS rescheduling the very same downlink, thereby exacerbating the situation.

Hopefully the TTI team can implement something to split the sequence over a number of downlinks to accommodate existing devices, perhaps by the crude method of not allowing NewChannelReq to be sent with LinkADRReq.

I will now go and dig through some recent firmware development to see if I have devices that may trip up over this corner case!

Hi Nick

The NS does indeed send the same MAC commands in every downlink, and the device keeps on rejecting them.

I would have thought that it should be pretty easy to replicate, as most devices using the LoRaMac-node code base should behave the same way. I think what you need to do is to boot a device in the absence of a gateway. That way the device will retry using successively lower data rates, until it reaches DR2. I think this is saved in NVM. The next time it boots with a gateway present it will join the network OK (using DR2) but then you hit the problem reported here. Ironically, the LinkADRReq MAC gives the device permission to use a higher data rate and so longer packets, but…

In a directly cabled off-air compliance test, yes.

But to experience it in the real world, one would have to be in a region where a dwell limit applies in some settings, such as various AS923 countries, rather than one where it never applies, like EU868, or applies all of the time, like US915 (which compensates with 500 kHz downlink BW anyway).

Just to confirm: I made this change and the problem goes away:

I am still of the opinion that a server-side fix (NS does not try to send > 11 bytes of MAC commands) would be useful also.

Please raise this as an issue on GitHub - it won’t be picked up from the forum.

Issue submitted here: Network Server should be aware of maximum downlink payload sizes and split MAC messages if appropriate · Issue #5370 · TheThingsNetwork/lorawan-stack · GitHub
