LMIC-1.51 fitting in Arduino Atmega 328


#1

Hi,

Looking for a simple and cheap way to make a node that would run the complete LoRaWAN stack, I have been porting the stack to the ESP8266 (see other topic).

At the same time, I wanted to run that code on the Arduino Pro Mini, which has an Atmega 328 CPU. However, the code would not fit in the MCU, mainly because the large static AES arrays did not fit in memory.

And, after playing several tricks on the code I had to come to the conclusion that even if the code would fit in the Arduino I would not have enough room left to run any application that would do meaningful work.

So I replaced the standard LMIC 1.5 AES code with another AES library. As I had the AES code of the Nexus (Gerben den Hartog/Ideetron) at hand, I ported those AES functions to the LMIC-1.5 code.

The result is a new library (I called it LMIC-1.51 for lack of a better name) with a considerably smaller codebase. Just the vanilla hello world example code uses 25,716 bytes (83%) of program space and leaves 230 bytes for the stack/local variables.

I know, you still cannot run ALL sketches just like that, but the Dallas DS18B20 temperature sensor and the I2C HTU21D temperature/humidity sensor do fit. The latter runs fine at the moment on my mini LoRa sensor.

Suggestions for reducing the codebase even further are welcome.

The library is here: Things4U github site. I will publish some more documentation shortly, but the main pin connections can be found in the example sketch.

Maarten

Thanks to Niels for giving me moral support :slight_smile: and advice and helping to test the new library.


(niels) #2

already have this free and WORKING:

Sketch uses 25,428 bytes (82%) of program storage space. Maximum is 30,720 bytes.
Global variables use 1,570 bytes (76%) of dynamic memory, leaving 478 bytes for local variables. Maximum is 2,048 bytes.

by just optimizing your hello world part :slight_smile:

good work!


(Michael) #3

Good work on that.
Came to the same conclusion: the IBM AES implementation is too big to fit a useful sketch in the 328.

Is there a reason for not using PROGMEM in libraries/lmic-v1.51/src/lmic/AES-128_V10.cpp? You commented that one out…

//static const unsigned char S_Table [16][16] PROGMEM = {
unsigned char S_Table [16][16] = {


(Jac Kersing) #4

Maarten,

What is the license for the resulting code? Some files list eclipse license, others no license and for some you have the copyright and no license is specified. As the source is on github I assume you allow others to use it, however it would be nice to know the license conditions.

Best regards,

Jac


(niels) #5

because JUST using progmem at declaration of the array does not work … more functionality needs to be implemented. That github snippet will never work without also implementing pgm_read_word() / pgm_read_byte() at the places the array is accessed.

Using PROGMEM is a two-step procedure. After getting the data into Flash memory, it requires special methods (functions), also defined in the pgmspace.h library, to read the data from program memory back into SRAM, so we can do something useful with it.

https://www.arduino.cc/en/Reference/PROGMEM
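
The two steps can be sketched as follows. This is only an illustration, not the code from the library: the names demo_table and demo_lookup are made up, the table holds just the first two rows of the AES S-box, and the non-AVR fallback macros are an assumption added so the same file also builds on targets without avr/pgmspace.h.

```c
#include <stdint.h>

#if defined(__AVR__)
#include <avr/pgmspace.h>   /* real PROGMEM and pgm_read_byte() */
#else
/* Assumed fallback for non-AVR targets: the data stays in normal memory. */
#ifndef PROGMEM
#define PROGMEM
#endif
#ifndef pgm_read_byte
#define pgm_read_byte(addr) (*(const uint8_t *)(addr))
#endif
#endif

/* Step 1: place the table in flash. Only two S-box rows, to keep it short. */
static const uint8_t demo_table[2][16] PROGMEM = {
    {0x63, 0x7C, 0x77, 0x7B, 0xF2, 0x6B, 0x6F, 0xC5,
     0x30, 0x01, 0x67, 0x2B, 0xFE, 0xD7, 0xAB, 0x76},
    {0xCA, 0x82, 0xC9, 0x7D, 0xFA, 0x59, 0x47, 0xF0,
     0xAD, 0xD4, 0xA2, 0xAF, 0x9C, 0xA4, 0x72, 0xC0},
};

/* Step 2: every read goes through pgm_read_byte() with the element's
 * address, never through a plain demo_table[row][col] access. */
static uint8_t demo_lookup(uint8_t row, uint8_t col)
{
    return pgm_read_byte(&demo_table[row][col]);
}
```

On AVR, reading demo_table[row][col] directly would compile but fetch from a RAM address, which is why step 2 cannot be skipped.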


#6

Good point Jac. Indeed I do allow others to use it, but the original copyright holders should too :slight_smile: . I’m not a lawyer (do not want to be either) but found out that it is less simple than I thought. If anybody has experience with this I would like his/her input.

The original copyrights are Eclipse for IBM code and GPL v3 for the encryption files distributed by Ideetron. I therefore have for the moment put a reference to GPL3 in those 4 files. Both GPL3 and Eclipse could be used for distribution of code and re-use.

So multi-licensing may work, but it is not nice for such a small project, and Eclipse and GPL v3 do not always go together well. For my own code I will use Eclipse for the changed parts in the various IBM files, most importantly the changed aes.cpp file, which is the easiest.

I have contacted Ideetron to hear their view on this.

Maarten

PS. Maybe I should have selected an AES library with Eclipse licensing, or maybe I should move the encryption part to a different directory.


#7

I am testing that still at the moment. It seems to work, but you HAVE to use another statement further on in the code where you access that array:

S_Byte = pgm_read_byte_near( & S_Table[S_Row][S_Collum]);  // XXX When using PROGMEM

This seems to work (it does not without that pgm_read_byte_near statement), but I would like to test whether using this has any downsides apart from taking program space.
On the positive side: it would free another 256 bytes of memory.

Maarten


#8

I made a new commit on GitHub this morning. For the AVR (Atmega) architecture the encryption module now uses PROGMEM to free another 256 bytes of memory. Of course, for other architectures (ESP8266 and Teensy) the normal definitions are used.

At my last compile of the mini sensor with I2C temperature/humidity sensor I was left with 439 bytes of memory.

Hope it will enable our favourite sketches to run on the Atmega 328 along with the LMIC stack…Let me know if it does.

Maarten


(Matthijs Kooijman) #9

AFAICS all the libraries referenced are based on my initial port of the LMIC library. I’ve been (finally) cleaning up my port last week (integrating all code from @tftelkamp’s version). As part of that cleanup, I changed it to use PROGMEM as well, which makes the code fit in a 328p. Using the “ttn” example from my repo (which just sends messages with some ttn-specific settings): Sketch uses 22,762 bytes (70%) of program storage space. Global variables use 1,001 bytes (48%) of dynamic memory.

Looking back at the commits linked and talked about in this thread, I think they only add the PROGMEM keyword, without also updating the references to these arrays. This would mean the code compiles (and gets smaller), but doesn’t actually work. I suspect that these commits haven’t actually been tested on AVR hardware, then? My version does update the references to these arrays properly and has been tested on AVR hardware already (on an atmega256rfr2, I’m planning to test on a 328p / Arduino Uno today).

The code is here: https://github.com/matthijskooijman/arduino-lmic/tree/experimental (still in the experimental branch, I wanted to do a bit more testing before merging to the master branch). In the past, this library has been used on ESP8266 as well with some modifications. I think my experimental branch would work on the ESP8266 now as-is, though I don’t have one set up to test (@popcorn, you tested this in the past, perhaps you could do so now?).

Using PROGMEM means that data is not copied from flash to RAM on startup, but instead is read from flash directly, which avoids duplicating the data in RAM. The downside is that reading from flash is slightly slower, but the difference is minimal: 2 cycles to read from RAM versus 3 cycles to read from flash on a 328p. A bit more overhead could result from the reduced flexibility of reading from flash, though.


(Matthijs Kooijman) #10

I was cheating a bit there: Those figures are with class B support, OTAA / joining and some MAC commands disabled. While keeping them enabled, I get: Sketch uses 28,516 bytes (88%) of program storage space. Global variables use 1,049 bytes (51%) of dynamic memory. In both cases, I modified the ttn example sketch to remove some verbose debug printing in onEvent(), since that also takes up significant flash space.


(niels) #11

Hi @matthijs how do I disable Class B support in LMIC? (same question for OTAA?)

thx!

OTAA looks quite simple (DISABLE_JOIN), but class-B is unclear to me

// Uncomment this to disable all code related to joining
//#define DISABLE_JOIN
// Uncomment this to disable all code related to ping
//#define DISABLE_PING
// Uncomment this to disable all code related to beacon tracking.
// Requires ping to be disabled too
//#define DISABLE_BEACONS

// Uncomment these to disable the corresponding MAC commands.
// Class A
//#define DISABLE_MCMD_DCAP_REQ // duty cycle cap
//#define DISABLE_MCMD_DN2P_SET // 2nd DN window param
//#define DISABLE_MCMD_SNCH_REQ // set new channel
// Class B
//#define DISABLE_MCMD_PING_SET // set ping freq, automatically disabled by DISABLE_PING
//#define DISABLE_MCMD_BCNI_ANS // next beacon start, automatical disabled by DISABLE_BEACON


(Matthijs Kooijman) #12

Class-B is basically just beacon tracking and pinging, so that’s what I meant when I said “disable Class B”.
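
In terms of the options quoted in the previous post, that would presumably come down to the fragment below. Treat it as a sketch: the two macro names are taken from that quote, but the file they live in (assumed here to be the library's config.h) is not stated in this thread.

```c
/* Assumed location: the library's config.h, with these two lines uncommented. */
#define DISABLE_PING      /* drops all code related to ping */
#define DISABLE_BEACONS   /* drops beacon tracking; requires DISABLE_PING too */
```

According to the comments in the quote, the two Class B MAC commands (DISABLE_MCMD_PING_SET and DISABLE_MCMD_BCNI_ANS) are then disabled automatically.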


(Matthijs Kooijman) #13

I did a bunch of experiments with different AES implementations today. There seems to be a lot of opportunity to save on flash space by switching to a different AES implementation. I tried these:

Ideetron’s AES
This is the code I nicked from the things4u git repository linked above. The files say it is code from Ideetron. The things4u repository used the high-level functions, which take care of CTR and CMAC, but also take care of building the initialization vectors (Block A and Block B in the LoRaWAN spec). The latter was superfluous, since lmic.c already built these blocks. Also, since building these blocks needs access to the various internal details (devaddress, keys, frame counter), the code in the things4u repo was a bit convoluted (bypassing existing LMIC datastructures and functions).

I ended up only using Ideetron’s low-level AES cipher, and wrote the higher level CTR and CMAC code from scratch, to better fit with the LMIC structure (and it ended up a bit more efficient too).
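
For readers unfamiliar with the Block A and Block B mentioned above, here is a minimal sketch of how such blocks are laid out according to the LoRaWAN 1.0 specification. It is not the code from lmic.c or from the things4u repository, and the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Write a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v);
    dst[1] = (uint8_t)(v >> 8);
    dst[2] = (uint8_t)(v >> 16);
    dst[3] = (uint8_t)(v >> 24);
}

/* Shared layout: first | 4 x 0x00 | dir | DevAddr | FCnt | 0x00 | last */
static void build_block(uint8_t out[16], uint8_t first, uint8_t dir,
                        uint32_t devaddr, uint32_t fcnt, uint8_t last)
{
    memset(out, 0, 16);
    out[0] = first;
    out[5] = dir;                /* 0 = uplink, 1 = downlink */
    put_le32(&out[6], devaddr);
    put_le32(&out[10], fcnt);
    out[15] = last;
}

/* Block A_i: input to AES for payload encryption; i counts from 1. */
static void build_block_a(uint8_t out[16], uint8_t dir,
                          uint32_t devaddr, uint32_t fcnt, uint8_t i)
{
    build_block(out, 0x01, dir, devaddr, fcnt, i);
}

/* Block B0: prepended to the message before the CMAC (MIC) calculation. */
static void build_block_b0(uint8_t out[16], uint8_t dir,
                           uint32_t devaddr, uint32_t fcnt, uint8_t msg_len)
{
    build_block(out, 0x49, dir, devaddr, fcnt, msg_len);
}
```

Both blocks need the device address and frame counter, which is why a library that builds them internally has to reach into the stack's state.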

This code uses a problematic GPL license (see below).

Tiny-AES128-C
This is an AES128 implementation taken from https://github.com/kokke/tiny-AES128-C. This is just an AES cipher, so it uses the same CTR and CMAC code as above. This code was modified to use PROGMEM to reduce the memory usage. The structure of this code looks very similar to the Ideetron code, so I suspect they share a common ancestor (perhaps some pseudocode in the AES specification?).

This library is stated to be in the public domain, but I’m not sure if the casual mention in the README suffices as a proper public domain dedication.

Avr-crypto-lib / AESLib
AESLib (https://github.com/DavyLandman/AESLib) is an Arduino library that includes a few selected parts of the more complete AVR-Crypto-lib. It uses heavily optimized hand-coded assembly. The AESLib version only includes assembly versions, so it only works on AVR (the original AVR-crypto-lib also has more portable C versions). Again, this is just an AES cipher, so this uses my CTR and CMAC code again.

This code uses a problematic GPL license (see below).

aes-min

This is an AES library intended to be small, licensed under an MIT license. It can be found here:

It allows using a normal sbox lookup table, or a “small” version that calculates things instead of using a lookup table. This is slower, but should save a 256-byte lookup table. In practice, the flash size gain is only 64 bytes on an Atmega328p, presumably because the calculations use a lot of bitshifting, which AVR isn’t really good at (so it needs a lot of instructions to implement the calculations).
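
To give an idea of what calculating an S-box entry involves, here is a minimal sketch following the definition in the AES specification (multiplicative inverse in GF(2^8), then an affine transform). It is not taken from aes-min, whose actual method has not been checked here, and the function names are made up.

```c
#include <stdint.h>

/* Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1. */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t p = 0;
    for (uint8_t bit = 0; bit < 8; bit++) {
        if (b & 1)
            p ^= a;
        uint8_t carry = a & 0x80;
        a = (uint8_t)(a << 1);
        if (carry)
            a ^= 0x1B;           /* reduce by the AES polynomial */
        b >>= 1;
    }
    return p;
}

/* Multiplicative inverse computed as x^254 (so 0 maps to 0). */
static uint8_t gf_inv(uint8_t x)
{
    uint8_t result = 1;
    uint8_t base = x;
    for (uint8_t e = 254; e != 0; e >>= 1) {
        if (e & 1)
            result = gf_mul(result, base);
        base = gf_mul(base, base);
    }
    return result;
}

/* Rotate an 8-bit value left by n bits (1 <= n <= 7). */
static uint8_t rotl8(uint8_t v, uint8_t n)
{
    return (uint8_t)((v << n) | (v >> (8 - n)));
}

/* S-box entry: inverse followed by the AES affine transform. */
static uint8_t sbox_calc(uint8_t x)
{
    uint8_t b = gf_inv(x);
    return (uint8_t)(b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3)
                       ^ rotl8(b, 4) ^ 0x63);
}
```

Every call runs dozens of shift and XOR steps, which fits the observation that trading the 256-byte table for calculation costs a lot of time and saves little flash on AVR.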

Furthermore, this library allows doing “on the fly” key schedule calculation, or precalculating the key schedule. Both have been tested. With precalculation, the key schedule calculation was redone on every AES request, even when the key didn’t change, to make the implementation simpler. The time used for this precalculation was not included in the table, since it is expected that, in a final implementation, it would only happen rarely.

AVR-AES
This is a library specifically meant for the AVR architecture, using (presumably) hand-coded assembly. It comes in five different flavours, all of which were tested. There are some additional toggles, which have just been left at the defaults.

The FAST and FURIOUS versions use precalculated keyschedules, and the same remarks and caveats about this as with aes-min apply.

Again, this library is GPL-licensed, making it incompatible.

LMIC
Of course I also looked at the original AES implementation in LMIC. This implementation has the AES cipher and CTR and CMAC implementations heavily integrated. I modified the code to use PROGMEM for reduced memory usage, and applied some fixes to make it work on 8/16-bit hardware too.

None
I also tried without any AES support, to have a baseline for comparison.

Licensing
Some of these libraries use the GPL license, which is unfortunately not compatible with the EPL license used by the LMIC library. I think that this means you can use them for personal use, but cannot distribute the result (so my github repository might actually be violating the license right now, I’m not entirely sure). This is a pity, since especially AESLib / AVR-Crypto-Lib is fast and small… Given the licenses, the only plausible alternative of the above is really Tiny-AES128-C, which is small, but also fairly slow.

One alternative I haven’t tried yet is https://github.com/spaniakos/AES (which looks like it’s RAM-hungry).

Test results
Here are some benchmarks with the above libraries:

                        | License | Flash | RAM  | AES (16 bytes) | CTR (16 bytes) | MIC (25 bytes) |
 No AES                 |         | 20944 | 1103 |                |                |                |
 LMIC                   | EPL     | 30040 | 1103 |    744μs       |    692μs       |  2 356μs       |
 Ideetron               | GPL     | 22678 | 1119 |  1 544μs       |  1 704μs       |  6 652μs       |
 Tiny-AES128-C          | PD      | 22912 | 1283 |  1 368μs       |  1 508μs       |  5 940μs       |
 AESLib                 | GPL     | 22748 | 1109 |    308μs       |    356μs       |  1 360μs       |
 aes-min                | MIT     | 22462 | 1103 |  1 292μs       |  1 444μs       |  5 664μs       |
 aes-min small          | MIT     | 22398 | 1113 | 23 304μs       | 23 772μs       | 27 948μs       |
 aes-min precalc        | MIT     | 22336 | 1279 |  1 060μs       |  1 196μs       |  4 736μs       |
 aes-min small + precalc| MIT     | 22272 | 1289 | 18 684μs       | 18 712μs       |  9 260μs       |
 avr-aes SMALL          | GPL     | 22328 | 1104 |    664μs       |    744μs       |  2 916μs       |
 avr-aes FANTASTIC      | GPL     | 23214 | 1103 |    304μs       |    344μs       |  1 336μs       |
 avr-aes FURIOUS        | GPL     | 23232 | 1279 |    188μs       |    232μs       |    860μs       |
 avr-aes FAST           | GPL     | 23032 | 1279 |    184μs       |    216μs       |    832μs       |
 avr-aes MINI           | GPL     | 21950 | 1124 |    912μs       |  1 020μs       |  4 016μs       |

The Flash and RAM columns give the sizes in bytes as reported by the Arduino IDE. The AES column gives the time for a plain AES encryption of a single block. CTR is the time for encrypting a regular payload using CTR, and MIC is the time for a MIC calculation using CMAC. All times are in microseconds.

Code
The code used for these tests can be found here: https://github.com/matthijskooijman/arduino-lmic/tree/aes-experiment (pull carefully, branch is prone to rebases / force pushes).


(niels) #14

@matthijs were these benchmarks run on an 8 MHz AVR?


(Matthijs Kooijman) #15

16 MHz Arduino Uno.
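
For anyone wanting a rough 8 MHz estimate from the table, a small sketch of the arithmetic follows. It assumes the cycle count stays the same and only the clock changes, which ignores any clock-independent overhead; the helper names are made up.

```c
#include <stdint.h>

/* Convert a measured time to CPU cycles: 1 us at N MHz is N cycles. */
static uint32_t us_to_cycles(uint32_t micros, uint32_t clock_mhz)
{
    return micros * clock_mhz;
}

/* Estimate the time at another clock, assuming an unchanged cycle count. */
static uint32_t rescale_us(uint32_t micros, uint32_t from_mhz, uint32_t to_mhz)
{
    return micros * from_mhz / to_mhz;
}
```

For example, the 184 μs measured for avr-aes FAST at 16 MHz corresponds to 184 × 16 = 2,944 cycles per block, or an estimated 368 μs at 8 MHz.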


(Matthijs Kooijman) #16

I’ve updated my post above to include aes-min, which seems like a good alternative implementation with a compatible license, as well as AVR-AES, which is very small and fast on the AVR architecture, but unfortunately also has an incompatible license. I have contacted the AVR-AES author to see if he is willing to work out some alternative licensing arrangement to allow it to be used with LMIC.


(Matthijs Kooijman) #17

Quick update: The AVR-AES authors are willing to expand the license so we can use it with LMIC, so we’ll probably end up using AVR-AES. We’re still working out the details, I’ll keep you posted.


(Tarak Chaari) #18

Very interesting. Big thanks for bringing LoRaWAN to the Atmega 328 :wink:
Is this class A or C please?


(Arjan) #20

From the LoRaWAN specifications:

Note: The network server uses an AES decrypt operation in ECB mode to encrypt the join-accept message so that the end-device can use an AES encrypt operation to decrypt the message. This way an end-device only has to implement AES encrypt but not AES decrypt.

That sounds nice, but I think it only applies to Join-Accept messages, allowing for cheap Class A nodes, if they don’t expect any response in the downlink receive window?


(niels) #21

I’m pretty sure the answer is no. I checked IBM LMIC 1.5 code and also the downlink seems to use the “encrypt” method on the node to “decrypt” a downlink packet… so this allows for full class A nodes.
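
The reason this can work for payloads is that LoRaWAN payload encryption is a counter-style scheme: the payload is XORed with a keystream made by encrypting counter blocks, so the same operation both encrypts and decrypts. Below is a minimal sketch of that idea, not the LMIC code. The names are hypothetical, and toy_encrypt is an insecure stand-in for AES that is only there so the example runs without an AES library.

```c
#include <stddef.h>
#include <stdint.h>

/* Any 16-byte block "encrypt" primitive: out = E(key, in). */
typedef void (*block_encrypt_fn)(const uint8_t key[16],
                                 const uint8_t in[16], uint8_t out[16]);

/* XOR buf with the keystream E(key, counter block i) for i = 1, 2, ...
 * Only the last byte of the counter block changes between blocks. */
static void ctr_crypt(block_encrypt_fn encrypt, const uint8_t key[16],
                      uint8_t counter_block[16], uint8_t *buf, size_t len)
{
    uint8_t keystream[16];
    uint8_t i = 1;
    for (size_t off = 0; off < len; off += 16, i++) {
        counter_block[15] = i;
        encrypt(key, counter_block, keystream);
        for (size_t j = 0; j < 16 && off + j < len; j++)
            buf[off + j] ^= keystream[j];
    }
}

/* Toy stand-in for AES. NOT secure, for illustration only. */
static void toy_encrypt(const uint8_t key[16], const uint8_t in[16],
                        uint8_t out[16])
{
    for (int j = 0; j < 16; j++)
        out[j] = (uint8_t)((key[j] ^ in[j]) * 31u + (unsigned)j);
}
```

Calling ctr_crypt twice with the same key and counter block restores the original buffer, and at no point is a block "decrypt" primitive needed.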