The mac_tx_ok response from the RN modules when using unconfirmed transmits simply states that the module was able to transmit the data. It does not in any way check that the data was received by any receiver.
To check if data is being received you need to use confirmed transmissions. However, each confirmation packet counts towards the 10 downlinks you are allowed to use each day. So you can only use 10 confirmed uplinks (and no additional downlinks) a day.
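To make the difference concrete, here is a rough sketch of how node-side code could interpret the module's reply. It assumes the second response line after a `mac tx` command is either `mac_tx_ok` or `mac_err`, which is my reading of the RN2483 command reference. The function name `delivery_status` and the status labels are made up for this example.

```python
def delivery_status(confirmed: bool, response: str) -> str:
    """Classify the module's final reply to a transmit request.

    `confirmed` says whether the uplink was sent as a confirmed uplink.
    The status labels returned here are hypothetical, not module output.
    """
    reply = response.strip()
    if reply == "mac_tx_ok":
        # Unconfirmed: the radio transmitted, nothing more is known.
        # Confirmed: the network acknowledged the uplink.
        return "acknowledged" if confirmed else "sent_unverified"
    if reply == "mac_err":
        # Confirmed: no acknowledgement came back.
        return "not_acknowledged" if confirmed else "send_failed"
    # Any other reply is left to the caller to handle.
    return "other"
```

So the same `mac_tx_ok` string only tells you something about reception when the uplink was a confirmed one.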
You could divide the maximum number of uplinks per day for your device by ten and make every (uplinks/10)th uplink a confirmed uplink. Rejoining on the first missed acknowledgement would make you rejoin too soon, as missing a single acknowledgement is to be expected in busy regions. Instead, the node could rejoin after missing 3, 4, or an entire day's worth of acknowledgements. Keep in mind each node has a limited rejoin capacity. The limit is 64K, but as there is a ‘random’ number involved you will experience random number collisions after just a few hundred rejoins. A join attempt with a previously used random value will be ignored by the back-end and requires a new join attempt to be made.
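A minimal sketch of that scheme, plus a quick estimate of the collision risk, could look like the following. The function names, the default threshold of 4 missed acknowledgements and the example of 144 uplinks a day are my own assumptions for illustration. The collision estimate is the standard birthday-problem calculation, assuming each join draws its random value uniformly from 65536 possibilities.

```python
import math

DAILY_CONFIRMED_BUDGET = 10   # downlinks (acknowledgements) allowed per day
JOIN_VALUE_SPACE = 65536      # the "64K" limit on join random values


def confirm_interval(uplinks_per_day: int,
                     budget: int = DAILY_CONFIRMED_BUDGET) -> int:
    """Send one confirmed uplink every this-many uplinks."""
    return max(1, math.ceil(uplinks_per_day / budget))


def is_confirmed(uplink_index: int, uplinks_per_day: int) -> bool:
    """True if the uplink with this index (0-based, per day) is confirmed."""
    return uplink_index % confirm_interval(uplinks_per_day) == 0


def should_rejoin(consecutive_missed_acks: int, threshold: int = 4) -> bool:
    """Rejoin only after several acknowledgements in a row went missing."""
    return consecutive_missed_acks >= threshold


def collision_probability(joins: int, space: int = JOIN_VALUE_SPACE) -> float:
    """Chance that at least two of `joins` random values are equal."""
    p_all_unique = 1.0
    for k in range(joins):
        p_all_unique *= 1.0 - k / space
    return 1.0 - p_all_unique


if __name__ == "__main__":
    # Hypothetical node sending every 10 minutes: 144 uplinks a day.
    print(confirm_interval(144))
    for joins in (100, 300, 500):
        print(joins, round(collision_probability(joins), 2))
```

With a uniform 16-bit value the chance of at least one repeat is already close to even at around 300 joins, which is why it pays to keep the rejoin threshold conservative.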
What intrigues me is that the nodes did not resume operation after the gateway came back online. How often does a node send data?