Invalid MHDR

Hello,
Over the last few days, my gateway (Knot LR8) has been disconnecting frequently after receiving an
invalid MHDR m_type:PROPRIETARY message.

Any hints welcome !

What message do you see that tells you it disconnects?

Have you checked the uptime of the gateway? Is it rebooting?

In the console it shows up as a Disconnect gateway event.
The message may also read Invalid MHDR m_type:PROPRIETARY major:1.
The gateway reconnects about one minute later.

From the Knot log :

  • [LNS] host: eu1.cloud.thethings.network error: socket error [2001]

Are you forwarding packets with CRC errors? (Is/was default in MikroTik devices)
If so, can you disable that?

What RouterOS version are you running?

Many thanks for your support !
The gateway forwards only “valid” messages (CRC OK on the log)
Router OS is version 7.14.1, stable branch

Did you check if the router is restarting at the time the connection is lost? (Uptime should be a good indicator)

I’ve got the same problem, Mikrotik LtAP LR8 LTE Kit.

About every 30-40 seconds (sometimes only after a couple of minutes) I get this Disconnect gateway | invalid MHDR m_type:PROPRIETARY received message in the TTN console; 5-6 seconds later I get a Connect gateway message.
In the MikroTik log I see: [LNS] host: eu1.cloud.thethings.network error: socket error [2001], just like for Pierre, but there is no sign of anything restarting on the MikroTik; only the connection to TTN drops.

Also, when I tried using CUPS (hoping that maybe only LNS is broken somehow) I’m getting:
[CUPS] connecting to https://eu1.cloud.thethings.network:8887/update-info
and directly after that:
[CUPS] server response code: 404

So not great either :slight_smile: However I don’t know the protocols so I am not sure whether TTN or Mikrotik is doing weird things :slight_smile:

Did the gateway run without this issue a week ago with the same software? If so, there might have been a software update at TTN that triggered this. I am not seeing this with older MikroTik software versions (my gateways are still on RouterOS 6), so this might be an issue with the combination of software versions.
If it is easily possible, you could try downgrading to the RouterOS 6 branch. (I’ve never downgraded RouterOS major versions, so I have no clue if that’s feasible.)

Brief update:

  • My Knot, recently acquired, worked perfectly from the Lora point of view, in temporary installation in the garage:

    • 7.14 packages
    • power supply via the mains adapter
    • whip antenna connected directly to the knot
  • Disconnections/msg invalid MHDR appears after :

    • installation of the Knot in the attic
    • packages 7.14.1
    • power supply via PoE from a secondary switch
    • omni antenna with a coax.

That’s a lot of changes, and a lot of avenues to explore

  • I downgraded to 7.14, same symptoms. So back to 7.14.1. The factory FW of my Knot is 7.6, and I think you can’t downgrade below the factory FW.
  • I tested the AC adapter power supply instead of PoE, same symptoms.
  • Change of antenna: it seems that the frequency of invalid MHDR msg is slowing down, but it’s difficult to be objective, I haven’t tested for very long.

=> Suspecting possible interference, I put the Knot back in the garage: works fine, no disconnection for several hours. Returned to the attic: disconnections/msg invalid MHDR resumed, about every 2 mins. I then moved the Knot away from an electrical box in the attic, and placed the antenna as far away from it as possible: works OK.

=> To confirm/deny this electromagnetic interference concern, I’m going to try to determine the sensitive part (knot and/or antenna), and identify the source of interference using an SDR dongle and spectrum analysis software.


Years ago I found that one of my GWs, deployed in a loft/attic space but close to an old, failing fluorescent strip light, suffered poor sensitivity/reception. Moving it 1.5-2 m further away significantly reduced the issues (I didn’t suspect the light at the time and only realised the root cause after the move). Moving it again, at 90° and about 0.5 m further away from the power stub cable that fed the light, improved things a bit more, such that statistically it doesn’t see any reduction compared with 3 others at the same install/test site. It may be that the original position was too close to the power cable and just moving away from that was what helped; I didn’t go back to evaluate. Note this was not a MikroTik and was running the classic Semtech UDP-based packet forwarder (before Basics Station), with the option to suppress/not send CRC-errored messages, of course! I have never seen an Invalid MHDR message on any of my GWs. I also checked in on one of my MikroTik GWs last night/this morning (an old install from maybe 2.5-3 years back, so like Jac running older firmware) and see no issues…

Maybe, because of the better position of your antenna, your gateway now receives the signal of a node transmitting messages with a valid CRC but an invalid MHDR. Such a message is sent to TTS by your gateway, but TTS “doesn’t like” it.
My gateway sees many messages with valid CRC like this :
{"rxpk":[{"jver":1,"tmst":1487375576,"time":"2024-03-18T07:15:16.926067Z","chan":7,"rfch":0,"freq":867.900000,"mid":0,"stat":1,"modu":"LORA","datr":"SF12BW125","codr":"4/5","rssis":-132,"lsnr":-11.0,"foff":-1472,"rssi":-121,"size":52,"data":"4ETuDLi0w4zxg2GDv3gk3TWxMGj2mx/7H+tCIulSaSYjY82afNSmnaC6hOLuR9zDDNV+bA=="}]}

It cannot identify Message Type, DevAddr and Size, but the CRC seems to be valid.
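
To see what the network server might be objecting to, the first byte of the base64 "data" field can be decoded by hand. Below is a minimal Python sketch, assuming the LoRaWAN 1.0.x MHDR layout (MType in bits 7..5, RFU in bits 4..2, Major in bits 1..0). The names decode_mhdr and MTYPES are made up for this example and are not part of any gateway or TTS tooling.

```python
import base64

# MType values as laid out in the LoRaWAN 1.0.x MHDR (bits 7..5).
# Value 6 is RFU in 1.0.x and Rejoin-request in 1.1.
MTYPES = [
    "JOIN_REQUEST",
    "JOIN_ACCEPT",
    "UNCONFIRMED_UP",
    "UNCONFIRMED_DOWN",
    "CONFIRMED_UP",
    "CONFIRMED_DOWN",
    "REJOIN_REQUEST_OR_RFU",
    "PROPRIETARY",
]


def decode_mhdr(phy_payload_b64):
    """Decode the MHDR (first byte) of a base64-encoded PHYPayload."""
    raw = base64.b64decode(phy_payload_b64)
    if not raw:
        raise ValueError("empty payload")
    mhdr = raw[0]
    return {
        "mhdr": mhdr,
        "mtype": MTYPES[mhdr >> 5],   # top three bits
        "rfu": (mhdr >> 2) & 0x07,    # middle three bits
        "major": mhdr & 0x03,         # low two bits, 0 = LoRaWAN R1
        "size": len(raw),
    }


# "data" field copied from the rxpk frame quoted above.
SAMPLE = "4ETuDLi0w4zxg2GDv3gk3TWxMGj2mx/7H+tCIulSaSYjY82afNSmnaC6hOLuR9zDDNV+bA=="

if __name__ == "__main__":
    print(decode_mhdr(SAMPLE))
```

For the frame above the first byte is 0xE0, i.e. MType 7 (Proprietary) with Major 0, and the decoded length is 52 bytes, matching the "size" field. That would be consistent with a frame whose CRC passes but which is not an ordinary LoRaWAN uplink.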

From my side, I didn’t have a gateway before, so I can’t compare :slight_smile: My gateway is on RouterOS 7.14 as well and I’d rather not downgrade it, since I’ve worked out a pretty complex setup for other parts of the LtAP, and since I am not that proficient with MikroTik yet, I’d rather not have to rediscover it all :grimacing:

As for the changes in antenna location: maybe you just moved it far enough away, or behind some obstruction, so that you no longer receive that weird device. However, I believe it’s not only (nor mostly) about the weird node. I believe it’s mostly about either MikroTik or TTN, because when I changed the protocol to UDP, nothing went wrong. Of course a disconnection can’t happen there, because UDP is connectionless, but there is also no invalid MHDR message to be found anywhere. All messages I receive are passed successfully to TTN with no errors or warnings, so I believe it must be something in the LNS implementation on either end.

Not sure if it’s related, but we have started getting Invalid MHDR : m_type: CONFIRMED_DOWN received messages on one of our gateways in the past few days. This is on a private instance of TTS. Screenshot below:
[Image: Screenshot 2024-04-23 at 10.07.27]