Risk management for TTN

fortean · October 24, 2016, 6:50pm

Folks,

Most of us are very busy “getting things to work”. Development of hardware and software, installing more gateways, testing nodes, tinkering with software. Very important - but IMHO more should be done. I believe we should introduce a more formal system to manage risk. And perhaps some certification system. I have some thoughts on this (of course), but before I go into details, I’m curious about your preliminary thoughts on this. If any work has been done in this field already, please point me in the right direction - I was unable to find it.

(I apologize for posting this in the “Uncategorized” category, but strangely enough we don’t seem to have a category similar to “Risk Management”)

TwaveTech · October 24, 2016, 7:32pm

Before products can be sold they should indeed be CE certified, which includes EMC and safety.
For nodes: safety regulations can be found in IEC 61010
(did these certifications for over 2 years)

Hope this helps

fortean · October 24, 2016, 7:57pm

Indeed, but - that’s not exactly what I was referring to. Though CE certification may be part of it, I believe we should have a formal risk analysis system in place, e.g. ISO31000:2009. This involves a number of steps to be taken / processes to be in place, as shown here (source: ISO31000:2009, © BSI 2010, image shown here under the implied condition of fair use):

TwaveTech · October 24, 2016, 8:10pm

Thanks for sharing. This is indeed a broader concept, but not obligatory of course.

fortean · October 24, 2016, 8:43pm

Not yet, but it may well become obligatory in the future. But that’s not the main reason I advocate setting up a more formal RA system within TTN. My main reason is that a good RA system helps us build a better, safer network.

Apart from the ISO31K standards, there are, for example, the ISO27K series. ISO27001 explains how to set up an information security management system (comparable with a quality control system, but focused on information security). The corresponding ISO27002 standard lists a number of control objectives and corresponding controls that can be used as a starting point to see what we can do to reduce our risks to acceptable levels. This, of course, requires that we have an entity that decides what “acceptable levels” are.

Just to give you an idea of what such controls look like, I partially quote one from the ISO27002 standard (again under the implied condition of fair use):

Objective

14.3.1 Protection of test data

Control

Implementation guidance

The following guidelines should be applied to protect operational data …

… etcetera.

There are 114 controls listed (but to be fair: many more are present in the standard, since the implementation guidance often recommends numerous actual controls).

arjanvanb · October 24, 2016, 8:47pm

If TTN needs to do something like this, then I wish them something more fun than a boring ISO-thingy dating back to 2009…

fortean · October 24, 2016, 10:02pm

Indeed, not exciting - but necessary to do IMO. It is essential to have some kind of management system in place that allows us to anticipate threats. If we don’t have such a system, sooner or later a threat will find itself a vulnerability, there will be no or insufficient controls in place and so harm will be done. Excitement I can do without.

For completeness’ sake: ISO27001 and ISO27002 were revised in 2013. But even the older versions (the standard and its precursors have been around for 30+ years) already contained much of what is in the newest versions. They are proven best practices, mostly. Formalized common sense.

We can do with a little formalized common sense. For example: if we install a gateway somewhere, have we consciously and formally decided on the level of protection that is needed to protect that gateway - or was it just what somebody thought was the right thing to do, often probably overlooking even the simplest things? And how about experience: if something went awry with our infrastructure, is it analysed, are controls being considered until the risk is acceptable - and who decides it is - or do we simply NOT learn from our mistakes?

I say we need to protect our network against threats. We need to find its vulnerabilities. And we need to learn from incidents. That requires a more formal approach. The ISO standards offer a very good set of best practices, I’m all for using them. Dull or not!

But by all means, if you have a more sexy, exciting but still valid alternative to offer, I’m willing to learn.

tkerby · October 25, 2016, 3:56pm

As a systems engineer, I’d be very interested to see if any risk analysis had been performed on LoRaWAN in general.

It would be very easy to spoof traffic at the gateway level (although more traceable) and fairly easy to run a receiver to sniff packets and resend them with bad data. I’ve seen this with MQTT alone on a network when run in an open configuration - people send false temperature data to turn heaters on, for instance.

In a city, there is higher risk. For instance spoofing flood data, or locally we have IoT on street bin sensors. People could be out with sandbags or the refuse collectors chasing empty bins (or worse still the fire brigade called out, as they also detect bin fires).

In general, I don’t think we can trust the network, so the intelligence needs to be put in the devices. Rolling key generation is one way for a device to authenticate that it sent the next message, for instance. An occasional downlink to the device can resynchronise the keys in case of packet loss.
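To make the rolling key idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the names `Device`, `Verifier`, `next_key` and `tag` are made up, HMAC-SHA256 is just a convenient primitive from the standard library, and none of this is part of LoRaWAN or TTN. Packet loss is handled here by a small look-ahead window on the receiving side rather than by a downlink; that choice is mine, not something proposed in the post.

```python
import hashlib
import hmac
from typing import Tuple


def next_key(key: bytes) -> bytes:
    """Derive the next rolling key from the current one (one-way step)."""
    return hmac.new(key, b"ratchet", hashlib.sha256).digest()


def tag(key: bytes, payload: bytes) -> bytes:
    """Short authentication tag over the payload under the current key."""
    return hmac.new(key, payload, hashlib.sha256).digest()[:8]


class Device:
    """Hypothetical end device that rolls its key after every uplink."""

    def __init__(self, seed: bytes) -> None:
        self.key = seed

    def send(self, payload: bytes) -> Tuple[bytes, bytes]:
        message = (payload, tag(self.key, payload))
        self.key = next_key(self.key)
        return message


class Verifier:
    """Hypothetical application-side verifier sharing the same seed."""

    def __init__(self, seed: bytes, window: int = 5) -> None:
        self.key = seed
        self.window = window  # how many lost uplinks we tolerate

    def accept(self, payload: bytes, received_tag: bytes) -> bool:
        candidate = self.key
        for _ in range(self.window):
            if hmac.compare_digest(tag(candidate, payload), received_tag):
                # Move past the matched key so older tags stop working.
                self.key = next_key(candidate)
                return True
            candidate = next_key(candidate)
        return False
```

With a shared seed, a verifier built this way accepts a message even if the one before it was lost, and rejects an old message that is sent again, because its key has already rolled past it. If more than `window` uplinks are lost in a row, the two sides need some other way to get back in step.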

Would be fascinating to see how some of these risks are considered.

fortean · October 25, 2016, 8:43pm

Thank you for your support of the idea to do at least some risk analysis. Perhaps we should give it a try.

For starters, I’d like to find out how the network is governed. Who decides what should (not) be done? Who decides if one is allowed to say “this gateway is part of the TTN network”? How are reviews done, and is there anything like quality control, for example?

Because if the “governance” of TTN, in whatever form it exists, decides that it is a Good Idea™ to do risk analysis, we have achieved the first step of the RA process: acquiring management support. This holds true regardless of whether the management is a person or a group. The first thing that should be done is to acquire the sponsorship of TTN Governance.

Forgive me my ignorance - but how is governance of TTN done?

drouetd · October 25, 2016, 8:45pm

Hi tkerby,

Perhaps, I’ve not correctly understood what you meant by “spoof traffic at the gateway level”. My understanding was that the NwkSKey prevented spoofing as any tampering with the datagram between the Node and the Broker would result in the message integrity check (MIC) failing and the Broker dropping the datagram. So gaining physical access to a gateway to tamper with datagrams wouldn’t help because only the end Node and the Broker/Network Server have access to the NwkSKey. You could do a replay attack however, but the Frame Counter is there to protect against that.

Running a packet sniffer doesn’t get you much as the payload is encrypted between the Node and the Handler using the AppSKey.

With a NwkSKey providing protection against a man-in-the-middle attack, a frame counter protecting against replay attacks and end-to-end encryption of the payload, the security model seems rather robust. That being said, it’s been quite a few years since I’ve worked in network security and, more importantly, I’m only just starting to get up to speed with LoRaWAN and the TTN’s architecture so perhaps I’ve missed something.
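For readers who want to see the two checks side by side, here is a rough Python sketch of "verify the MIC, then check the frame counter". It is a simplification under stated assumptions: LoRaWAN computes its 4-byte MIC with AES-128 CMAC over a specific block layout, whereas this sketch swaps in truncated HMAC-SHA256 so that it runs with the standard library only. The `NetworkServer` class, `compute_mic` and the way the counter is packed are hypothetical names and choices, not TTN code.

```python
import hashlib
import hmac
import struct

MIC_LEN = 4  # LoRaWAN MICs are 4 bytes long


def compute_mic(nwk_s_key: bytes, fcnt: int, payload: bytes) -> bytes:
    """Stand-in MIC: truncated HMAC-SHA256 (real LoRaWAN uses AES-128 CMAC)."""
    message = struct.pack("<I", fcnt) + payload
    return hmac.new(nwk_s_key, message, hashlib.sha256).digest()[:MIC_LEN]


class NetworkServer:
    """Hypothetical server-side state for one device session."""

    def __init__(self, nwk_s_key: bytes) -> None:
        self.nwk_s_key = nwk_s_key
        self.last_fcnt = -1  # nothing received yet

    def accept(self, fcnt: int, payload: bytes, mic: bytes) -> bool:
        expected = compute_mic(self.nwk_s_key, fcnt, payload)
        if not hmac.compare_digest(expected, mic):
            return False  # tampered with, or sent without the key
        if fcnt <= self.last_fcnt:
            return False  # counter did not increase: treat as a replay
        self.last_fcnt = fcnt
        return True
```

The payload would already be encrypted with the AppSKey at this point, so this layer never needs to read it. Because the counter is covered by the MIC, an attacker cannot simply bump the counter on a captured frame without invalidating the check.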

Obviously, my comments focus on network security and don’t address the larger question of risk management.

fortean · October 25, 2016, 9:14pm

But your posting DOES list a number of controls that are in place, and they are there to prevent or reduce weaknesses. Network security (especially in our case…) is an essential part of security. Who has decided that these controls need to be in place, who has done the risk analysis, and should we address them if we feel that a broader view might be necessary?

Who, in other words, has the final say when it comes to accepting risk in our network?

fortean · October 26, 2016, 7:38am

@daniel already provides a number of threats (e.g. somebody spoofing traffic, gaining physical access to a gateway, replay attacks) and controls in place to prevent them from causing harm (MIC, encryption, frame counter).

This is of great interest to me: apart from finding out about the ‘management side of things’ of TTN, I’m also very interested in compiling a list of threats and possible controls (and control objectives) for the TTN infrastructure (and other similar infrastructures), which, together with the controls to be found in ISO27001 annex A (or ISO27002), may provide a starting point for a more formal RA system.
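Such a list does not need heavy tooling to get started. As a rough sketch (Python, with a made-up `Threat` structure and an `uncovered` helper that are purely illustrative), the threats and controls mentioned so far in this thread could be written down like this. The last entry is a hypothetical example of mine, added only to show what a threat without a recorded control looks like.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Threat:
    """One threat and the controls currently recorded against it."""
    name: str
    controls: List[str] = field(default_factory=list)


def uncovered(threats: List[Threat]) -> List[str]:
    """Names of threats that have no control recorded yet."""
    return [t.name for t in threats if not t.controls]


catalogue = [
    Threat("spoofed or tampered traffic", ["message integrity check (MIC)"]),
    Threat("packet sniffing", ["payload encryption (AppSKey)"]),
    Threat("replay attack", ["frame counter"]),
    # Hypothetical entry, not taken from the posts above:
    Threat("gateway taken offline"),
]

print(uncovered(catalogue))
```

Anything that `uncovered` returns is a candidate for discussion: either a control exists and was never written down, or there really is a gap.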

Many moons ago DTI started similarly when they compiled a list of controls and control objectives, later to become BS7799, then ISO17799, then ISO27002 and ISO27001, so perhaps we’re on the brink of creating a new standard to be used for risk analysis and risk treatment within volunteer-driven IoT infrastructures (aka volioti).

An example of ‘objectives’ and ‘controls’ (examples taken from ISO27001, parts in italics are not part of the standard but used by me to clarify things):

Control #1: Information security requirements for mitigating the risks associated with supplier’s access to the organization’s assets should be agreed with the supplier and documented.

Control #2: All relevant information security requirements should be established and agreed with each supplier that may access, process, store, communicate, or provide IT infrastructure components for, the organization’s information.

Control #3: … etc.

(Though just an example, it is relevant: does the ‘formal owner’ of TTN require this from suppliers? Are volunteers that operate the gateways to be seen as ‘suppliers’? Anyway…)

So, by all means, post your threats, control objectives and controls here - and if you know who formally governs the TTN network, let me know!

fortean · October 27, 2016, 7:14pm

I believe I now know who our BOG (Body of Governance) are; they are listed on the TTN front page https://www.thethingsnetwork.org/.

They are:

Wienke Giezeman, Initiator
Johan Stokking, Tech Lead
Martijn van der Veen, Web Developer
Hylke Visser, Backend Developer
Laurens Slats, Community Manager
Rishabh Chauhan, Community Manager
Ludo Teirlinck, Hardware Developer
Thomas Telkamp, Network Architect
Wessel Versluis, Designer
Dorian Amouroux, Web Developer
Romeo Van Snick, Front-end Developer
Alexander Overtoom, Business Lead
Antoine Rondelet, Backend Developer
Roman Volosatovs, Backend Developer
Fokke Zandbergen, Developer Advocate
Daniel Gómez Jurado, Web Developer
Thibault Labarre, Web Developer
Nicolas Dejean, Developer

I wonder if there is a way to address all of them at once, or should I simply mail Wienke to discuss my concerns? I would very much like to contribute to TTN and feel that the best thing I can do for TTN is to introduce RA to it.

arjanvanb · October 28, 2016, 5:15pm

I’ve moved some (older) posts here from another topic, Elderly care LoRaWAN products; see below. You may see a strange order of the posting dates (which may be older than the posts above), and some references may be weird.

It started with a reply to the following:

fortean · October 26, 2016, 7:40pm

You may well be right to doubt whether LoRa/TTN is the best fit for ‘critical’ applications. The big issue here is whether we can really offer service levels equivalent to those of, say, big telcos that offer similar services. Oh, and should we? It’s a hobby, right?

I don’t think so. Building and deploying “stuff” is, of course, why most of us are here. It gives us a great experience, additional skills and knowledge, it’s a real Feel Good Thing.

But if real people are going to rely on our “hobby horse”, much more is needed. Yesterday I started a thread about the idea to establish a proper risk analysis framework. I firmly believe that’s the way to go: find out who our “body of Governance” are - so, who is held responsible if something goes awry - and if there is no such thing, we should create one. That body of Governance should embrace not just technology, but enthusiastically promote and facilitate frequent risk analysis, preferably within an RA framework, e.g. as defined in ISO27001 and ISO31000.

It’s dull, somebody said. I don’t think so - but even if it were, it needs to be done. I can really do without the excitement of reading in the papers that a user of TTN lost his / her life because our network was hacked, down or otherwise compromised.

Ugh

TijnOnlijn · October 27, 2016, 5:50am

Or, the other way around… if your life depends on it, don’t use a service without any SLA.

fortean · October 27, 2016, 11:14am

Well, the topic of this thread was about IoT and elderly care, and this implies that people WILL use our network for such rather important things.

When old man Jones dies because he fell down and was not found in time we tend to say “that’s life”. But if the same happens and he wore an IoT / LoRaWAN alert button that did not work, somebody will investigate (I hope). And if it turns out the root cause was that TTN was hacked or not robust enough, or the local gateway was in maintenance, we really have an issue, I say. Even if we can legally squirm away from our liability, I would feel bad. Especially if a very simple control could have prevented this.

Such controls are typically not considered until life was lost or other major damage was done. I say: let’s prevent this, let’s set up a risk management system / ISMS.

BTW: this post itself is an example of how simple reasoning about “what might happen” can result in finding controls. For example, if a maintainer of a gateway plans replacement of that gateway, or maintenance, there could be a procedure that ensures that in those situations a second, perhaps temporary gateway is set up, so that ole’ Jones can live.

People tend to use services because they are available, and this especially is true for free services. We, as maintainers of a network, have a moral / ethical obligation to ensure that our network is sufficiently safe and robust.

TijnOnlijn · October 27, 2016, 11:45am

Could you give an example of such a ‘very simple control’?

fortean · October 27, 2016, 12:17pm

I just did, back in my previous post, when I wrote: "For example, if a maintainer of a gateway plans replacement of that gateway, or maintenance, there could be a procedure that ensures that in those situations a second, perhaps temporary gateway is set up."

I think that some of the readers here are not aware of how risk management works, or even what it is. Allow me to broadly paint a picture.

Risk management is fairly straightforward. First thing we need is a body of Governance (BOG). What’s that? Well, just a team that is responsible for the governance of our network. In practice: what they say goes and anybody that wants to be part of the TTN network MUST adhere to their decisions.

The BOG then stimulates and motivates the installation of what is called “an information security management system”. There are good standards available on what needs to be done, e.g. ISO27001. I know quite a bit about these standards, as you may have guessed by now, so if you have any questions about them, don’t hesitate, shoot.

The ISMS is an implementation of a well known approach called “the Deming cycle”, which also is known as the PDCA (Plan-do-check-act) cycle. Actually, it’s nothing else than common sense: you plan to do something, then do it, check if results were as you hoped, if not you act upon it to correct the situation - and you use what you have learned to start a new iteration of the PDCA cycle (which involves a new context, as the times, they are a-changing…), ad infinitum.

Generally speaking,

an ISMS starts with getting consent and sponsoring of the BOG, then
doing an inventory of what assets there are (which is more than just technology, people matter for example),
think about what risk levels are acceptable (and the BOG needs to agree here as they are responsible), then
construct (and/or steal..) a list of threats,
see if these threats work on vulnerabilities in your assets, then
see what can be done about that (controls) to reduce the risk below acceptable levels, then
implement these controls (mostly done in project form), then
see if it all worked out as planned, if not: learn and correct.

And then it starts all over again.
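To show how little machinery one pass through these steps needs, here is a minimal Python sketch of a risk register. All of it is assumed for illustration: the `Risk` structure, the 1 to 5 scales, scoring risk as likelihood times impact, the acceptable level of 6 and the two example risks are invented here and do not come from ISO27001 or from TTN.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Risk:
    asset: str
    threat: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (frequent)
    impact: int      # assumed scale: 1 (minor) to 5 (severe)
    controls: List[str] = field(default_factory=list)

    @property
    def score(self) -> int:
        # One common convention; other scoring schemes exist.
        return self.likelihood * self.impact


def needs_treatment(register: List[Risk], acceptable: int) -> List[Risk]:
    """Risks whose score is above the level the BOG has accepted."""
    return [r for r in register if r.score > acceptable]


def apply_control(risk: Risk, control: str, new_likelihood: int) -> None:
    """Record a control and the likelihood we expect after it is in place."""
    risk.controls.append(control)
    risk.likelihood = new_likelihood


ACCEPTABLE = 6  # invented value; deciding it is the BOG's job

register = [
    Risk("gateway", "offline during planned maintenance", 4, 3),
    Risk("node", "battery runs flat", 2, 2),
]

todo = needs_treatment(register, ACCEPTABLE)            # plan
apply_control(todo[0], "temporary second gateway", 1)   # do
remaining = needs_treatment(register, ACCEPTABLE)       # check
print([r.threat for r in remaining])                    # act on what is left
```

In a real ISMS the "check" step would look at what actually happened (incidents, audits) rather than at the estimate entered when the control was chosen, and the register would be revisited on every iteration.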

It’s not something that can be done as a standalone activity - if TTN were to set up some kind of RA system (e.g. an ISMS), that would require that we ALL participate in it. We all need to be aware of the rules and adhere to them, we all need to consider risk and work with the committee or group that is in charge of the ISMS.

TijnOnlijn · October 27, 2016, 12:35pm

I think there may have been a misunderstanding about the word ‘simple’. Just putting another gateway in place is, in practice, not that simple in most situations. Apart from that, I doubt very much that you or anyone else could persuade a significant part of the TTN members / gateway owners to accept such a policy. Although that may belong more in the risk management topic.

There are enough less critical applications for elderly people to consider, and personally I think that LoRa/TTN applications should be aimed at those. For the simple reason that anything that looks like an SLA isn’t feasible at all in the current setup by any means. So perhaps it would be nice to keep this discussion in the other topic.

Still it is important to be clear about what is and what isn’t a sensible use of the network.
Think, for instance, about low-voltage speaker cable. If someone used it for 220 V and the house burned down, would you hold the cable manufacturer responsible in any way? Or the fool who used that cable for that application?

So let’s use this topic to focus on sending music through the speaker wires.