|
|
@@ -0,0 +1,297 @@ |
|
|
|
---
title: Nearly Complete Guide to RNG on a microcontroller
description: >
  How to initialize and run an RNG on an STM32L151CC microcontroller.
created: !!timestamp '2021-05-18'
listable: false
time: 12:00 PM
tags:
  - security
  - rng
  - microcontroller
---
|
|
|
|
|
|
|
Security depends upon cryptography, which in turn depends upon a
Random Number Generator (RNG). An RNG is used for key generation (both
symmetric and asymmetric) and key negotiation (session establishment).
The latter is an absolute requirement to ensure that communications can
be secured. The former (key generation) can be done at first boot for
personalization, but doing it on the device isn't necessary, as keys
could instead be loaded when personalizing the device at programming or
first deployment.
|
|
|
|
|
|
|
There are two types of RNGs. The first is a True Random Number Generator
(TRNG). This is one that takes some non-deterministic process, often
physical, and measures it. Often, these are slow and are not uniform,
requiring a post-processing step before they are useful.
|
|
|
|
|
|
|
The second type is a Pseudo Random Number Generator (PRNG)<label
for="sn-drbg" class="margin-toggle sidenote-number"></label><input
type="checkbox" id="sn-drbg" class="margin-toggle"/><span
class="sidenote">[NIST](https://www.nist.gov/) also refers to a
PRNG as a Deterministic Random Bit Generator (DRBG).</span>. A PRNG
takes a seed and, when seeded properly, can generate a large
(effectively unlimited) amount of random-looking data from it. The
issue is that if someone is able to obtain the seed, they will be able
to predict the subsequent values, breaking security.
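
To make the predictability concrete, here is a minimal sketch in C. It
uses xorshift32, a tiny non-cryptographic generator picked only because
it fits in a few lines; it is not suitable for real use, and the names
`toy_prng`, `toy_prng_seed` and `toy_prng_next` are made up for this
illustration. Two generators started from the same seed produce the
same stream, which is what an attacker who learns the seed can exploit.

```c
#include <stdint.h>

/* Toy PRNG state (hypothetical); xorshift32 is NOT cryptographically secure. */
struct toy_prng {
	uint32_t state;
};

/* Seed the toy generator; xorshift32 must not start from zero. */
void toy_prng_seed(struct toy_prng *p, uint32_t seed)
{
	p->state = seed ? seed : 1;
}

/* Advance the state and return the next 32-bit value. */
uint32_t toy_prng_next(struct toy_prng *p)
{
	uint32_t x = p->state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	p->state = x;

	return x;
}
```

A cryptographic PRNG has the same property with respect to its seed; it
differs in that its output does not reveal the internal state.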
|
|
|
|
|
|
|
The standard practice is to gather data from a TRNG, and use it to seed
a PRNG. It used to be common that the PRNG would be reseeded, but I
agree with djb (D. J. Bernstein) that once seeded, no additional seeding
is needed<label for="sn-entropy" class="margin-toggle sidenote-number"></label>
<input type="checkbox" id="sn-entropy" class="margin-toggle"/>
<span class="sidenote">See his blog post
[Entropy Attacks!](https://blog.cr.yp.to/20140205-entropy.html)</span>
as modern PRNGs are secure enough and can generate enough randomness
that their state will not leak.
|
|
|
|
|
|
|
There are lots of libraries and papers that talk about how to solve the
problem for RNGs on a microcontroller that may not have an integrated
[T]RNG block, but I have not been able to find a complete guide for
integrating their work into a project, one that even a relative
beginner could follow to get it functional.
|
|
|
|
|
|
|
This article was written as I developed the
[lora-irrigation](https://www.funkthat.com/gitea/jmg/lora-irrigation)
project. This project will be used as an example, and the code referenced
is mostly licensed under the 2-clause BSD license, and so is freely
usable for your own projects.
|
|
|
|
|
|
|
|
|
|
|
Sources of Randomness
---------------------
|
|
|
|
|
|
|
As mentioned, most microcontrollers do not have a dedicated hardware
block like modern AMD64 (aka x86-64) processors do with the RDRAND
instruction. Even so, there are other sources that are available.
|
|
|
|
|
|
|
The first, and easiest, one is the Analog-to-Digital Converter (ADC). Even
if the ADC pin is tied to ground, the process of digital conversion is
not 100% deterministic as there are errors in the converter or noise
introduced on the pin.<label for="sn-adcnoise"
class="margin-toggle sidenote-number"></label><input type="checkbox"
id="sn-adcnoise" class="margin-toggle"/><span class="sidenote">The article
[ADC Input Noise: The Good, The Bad, and The Ugly. Is No Noise Good
Noise?](https://www.analog.com/en/analog-dialogue/articles/adc-input-noise.html)
talks about this.</span>
|
|
|
|
|
|
|
The data sheet for the microcontroller will help determine the expected
randomness from the part. In the case of the
[STM32L151CC](https://www.st.com/content/st_com/en/products/microcontrollers-microprocessors/stm32-32-bit-arm-cortex-mcus/stm32-ultra-low-power-mcus/stm32l1-series/stm32l151-152/stm32l151cc.html)
that I'm using, Table 57 of the data sheet lists the effective number
of bits (ENOB) as typically 10 bits, which is a couple of bits short of
the 12-bit resolution of the ADC. This means that the 2 least
significant bits are likely to have some noise in them. I did a run,
and collected 114200 samples from the ADC. The [Shannon
entropy](https://en.wikipedia.org/wiki/R%C3%A9nyi_entropy#Shannon_entropy)
calculated using the empirical probabilities was 2.48 bits.<label
for="sn-shannonenropy" class="margin-toggle sidenote-number"></label>
<input type="checkbox" id="sn-shannonenropy" class="margin-toggle"/>
<span class="sidenote">Now this is not strictly Shannon entropy, as the
values were calculated from the experiment, and Shannon entropy should
be calculated from the a priori probabilities.</span> Discarding the
0's (which make up over half the results) improves the entropy
calculation to 3.29 bits. The
[min-entropy](https://en.wikipedia.org/wiki/R%C3%A9nyi_entropy#Min-entropy)<label for="sn-min-entropy-fwdref" class="margin-toggle sidenote-number"></label>,
<input type="checkbox" id="sn-min-entropy-fwdref" class="margin-toggle"/>
<span class="sidenote">Forward reference:
<a href="#min-entropy-awk">min-entropy awk script</a></span>
a better indicator of entropy, works out to 1.2 bits, and if all the
0's are dropped, it improves to 2.943 bits. Dropping the zeros does
help, but in the end, subtracting the data sheet's ENOB from the ADC
resolution gives a reasonable first estimate of the entropy per sample.
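
For reference, the two measures can be written out as follows. This is
a sketch under stated assumptions: $N$ is the number of samples, $c_x$
the number of times value $x$ was seen, and $c_{j,0}$ and $c_{j,1}$ the
number of zeros and ones seen in bit position $j$. These symbols are
introduced here for illustration, and the Shannon form assumes the
empirical probabilities were taken over whole sample values. The
min-entropy form follows what the awk script later in this article
computes.

```latex
% Shannon entropy from empirical probabilities of whole samples
H = -\sum_{x} \frac{c_x}{N} \log_2 \frac{c_x}{N}

% Min-entropy as computed per bit position and then summed
H_{\min} = \sum_{j} -\log_2 \frac{\max(c_{j,0},\, c_{j,1})}{N}
```

Summing per-bit values treats the bit positions as independent, so the
total can overstate the min-entropy of whole samples when the bits are
correlated.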
|
|
|
|
|
|
|
It is possible that a correlation analysis between samples would
further reduce the estimate of the entropy gathered via the ADC, but
collecting a sufficient number of samples should compensate for this.
|
|
|
|
|
|
|
The second is using uninitialized SRAM. It turns out that this has
been studied in [Software Only, Extremely Compact, Keccak-based Secure
PRNG on ARM Cortex-M](https://dl.acm.org/doi/10.1145/2593069.2593218)
and [Secure PRNG Seeding on Commercial Off-the-Shelf
Microcontrollers](https://www.intrinsic-id.com/wp-content/uploads/2017/05/prng_seeding.pdf).
Depending upon how the SRAM is designed in the chip, it can create a
situation where each bit of SRAM will be indeterminate at boot up.
Both of these papers studied an STM32F100R8, a microcontroller similar
to the STM32L151CC that I am using.
|
|
|
|
|
|
|
I ran my own experiments where I powered on an STM32L151CC and dumped
the SRAM 8 times and analyzed the results. I limited my analysis to
26863 bytes of the 32 KiB of RAM (the remainder was data/bss or stack,
so would not change, or was zeros). I then calculated the min-entropy for
each bit across power cycles and the resulting sum was 11188 bits, or
approximately 0.416 bits per byte. This is 5.2% of the raw bits, and in
line with what the latter paper observed for a similar device.
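
The arithmetic behind those two figures, written out using the sum of
the per-bit min-entropy and the number of bytes analyzed:

```latex
\frac{11188\ \text{bits}}{26863\ \text{bytes}} \approx 0.416\ \text{bits per byte},
\qquad
\frac{0.416}{8} \approx 0.052 = 5.2\%
```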
|
|
|
|
|
|
|
Part of using a source of randomness is making sure that it is usable.
In the case of the ADC, each reading can be evaluated against previous
reads to ensure that the data being obtained is possibly random. In
the case of SRAM, this is trickier, as the state of SRAM is static,
and short of a reset, will not change. This means that to use SRAM,
the device, or family of devices, needs to be properly analyzed
for suitability. There are cases where a device's SRAM does not provide
adequate entropy, as discussed in the papers, and so this method should
not be used in those cases, or not solely relied upon.
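
One way to evaluate ADC readings against previous reads is to flag a
source that keeps returning the same value. The following is a minimal
sketch, loosely modeled on the repetition count test idea described in
NIST SP 800-90B; it is not taken from the lora-irrigation code. The
names `rct_state`, `rct_init` and `rct_update` are hypothetical, and
the cutoff is a placeholder that would have to be derived from the
measured entropy of the actual part.

```c
#include <stdbool.h>
#include <stdint.h>

/* State for a simple "stuck source" check (hypothetical names). */
struct rct_state {
	uint16_t last;		/* most recent sample seen */
	unsigned run;		/* how many times in a row it has been seen */
	unsigned cutoff;	/* run length at which the source is flagged */
	bool primed;		/* false until the first sample arrives */
};

/* Reset the check; cutoff is an assumption the caller must choose. */
void rct_init(struct rct_state *s, unsigned cutoff)
{
	s->last = 0;
	s->run = 0;
	s->cutoff = cutoff;
	s->primed = false;
}

/*
 * Feed one ADC sample.  Returns true while the source looks alive and
 * false once the same value has repeated cutoff times in a row.
 */
bool rct_update(struct rct_state *s, uint16_t sample)
{
	if (s->primed && sample == s->last) {
		s->run += 1;
	} else {
		s->last = sample;
		s->run = 1;
		s->primed = true;
	}

	return s->run < s->cutoff;
}
```

Since zero readings were common in the ADC run described earlier, a
cutoff that is too small would raise false alarms, so the value needs
to be chosen with the expected distribution in mind.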
|
|
|
|
|
|
|
The following is an `awk` script for calculating the min-entropy of the
provided data. Each sample must be the first item on a line, and each sample
must be a hexadecimal value without any leading `0x` or other leading
identifier:
|
|
|
<pre id="min-entropy-awk" class="language-awk fullwidth"><code># Copyright 2021 John-Mark Gurney
# This script is licensed under the 2-clause BSD license

# Note: converting "0x..." strings to numbers depends on the awk in use;
# gawk, for example, needs --non-decimal-data for this to work.

# Return the larger of a and b.
function max(a, b)
{
	if (a > b)
		return a;
	else
		return b;
}

# Record each sample (a hex value in the first field) and track the largest.
{
	v = ("0x" $1) + 0; a[NR] = v;
	maxv = max(maxv, v);
}

# For each bit position, compute the min-entropy from the more common bit
# value, then print the per-bit values and their sum.
END {
	tcnt = length(a);
	me = 0;
	for (bit = 0; 2^bit <= maxv; bit += 1) {
		cnt0 = 0;
		cnt1 = 0;
		for (i in a) {
			tbit = int((a[i] / 2 ^ bit) % 2);
			if (tbit)
				cnt1 += 1;
			else
				cnt0 += 1;
		}
		v = -log(max(cnt0, cnt1) / tcnt) / log(2);
		print "bit " bit ":\t" v;
		me += v;
	}
	printf "total:\t%0.3f\n", me;
}
</code></pre>
|
|
|
|
|
|
|
It is also possible that there are other parts of the board/design
that could be a source of randomness. The project that started this
journey is using [LoRa](https://en.wikipedia.org/wiki/LoRa) for
communication. It turns out that the sample code for the radio chip
([LoRaMac‑node](https://github.com/Lora-net/LoRaMac-node)) implements
a [random interface](https://github.com/Lora-net/LoRaMac-node/blob/7f12997754ad8e38a84daa85f62e7e6c0e5dbe59/src/radio/radio.h#L154-L163).
The function just waits one millisecond, reads the RSSI value, takes
the low bit and repeats this 32 times to return a 32-bit word. There
are issues with this, as I cannot find any description of the expected
randomness in the data sheet, nor in the code. It also does not do
any conditioning, so just because it returns 32 bits does not guarantee
32 bits of usable entropy. I have briefly looked at the output, and
there do appear to be longer runs than expected. Another
issue is that its collection takes a while, as the fastest is 1 bit
per ms. So, assuming the need to collect 8 bits for 1 bit of entropy
(pure speculation), that means at minimum 2 seconds to collect the
2048 bits necessary for 256 bits of entropy.
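
The collection loop described above can be sketched roughly as follows.
This is not the LoRaMac-node code: `read_rssi` and `delay_ms` are
hypothetical callbacks standing in for the radio driver and a timer,
and `demo_rssi` and `demo_delay` are host-side stand-ins so the sketch
can be exercised without hardware.

```c
#include <stdint.h>

/* Hypothetical hooks for the radio driver and a millisecond delay. */
typedef int16_t (*rssi_read_fn)(void);
typedef void (*delay_ms_fn)(uint32_t ms);

/*
 * Build a 32-bit word from the low bit of 32 RSSI readings taken 1 ms
 * apart.  No conditioning is done, so the word holds at most as much
 * entropy as those low bits actually contain.
 */
uint32_t rssi_random_word(rssi_read_fn read_rssi, delay_ms_fn delay_ms)
{
	uint32_t word = 0;

	for (int i = 0; i < 32; i++) {
		delay_ms(1);
		word |= ((uint32_t)read_rssi() & 1u) << i;
	}

	return word;
}

/* Host-side stand-ins (assumptions): a counting "RSSI" and a fake delay. */
static int16_t demo_counter;
static uint32_t demo_elapsed_ms;

int16_t demo_rssi(void)
{
	return ++demo_counter;
}

void demo_delay(uint32_t ms)
{
	demo_elapsed_ms += ms;
}
```

At 1 ms per bit, each 32-bit word costs at least 32 ms, so the 2048 raw
bits in the estimate above take 64 calls and a little over 2 seconds.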
|
|
|
|
|
|
|
|
|
|
|
Uniquifying
-----------
|
|
|
|
|
|
|
One of the other ways to help ensure that a microcontroller's random
output is unique is to integrate per-device values into the PRNG. This
does not guarantee uniqueness between boots, but it does make it harder
to attack if an attacker is able to control the other sources of
randomness.
|
|
|
|
|
|
|
In the case of the STM32L151 chip I am using, there is a unique
device id register. The device register is programmed at the
factory. Because it is unknown if this unique id is recorded by the
manufacturer, and possibly traced through the supply chain, and no
guarantees are made about either its uniqueness or its privacy, it has
limited use in providing any serious additional randomization.
|
|
|
|
|
|
|
Another method is to write entropy at provisioning time. This can be
done in either flash memory or EEPROM, which may have a more granular
write access.
|
|
|
|
|
|
|
|
|
|
|
Using SRAM
----------
|
|
|
|
|
|
|
The tricky part of using SRAM is figuring out how to access the
uninitialized memory. Despite having full access to the environment,
modifying the startup code, which is often written in assembly, to do
the harvesting makes an implementation less portable. Using standard
C, or another high-level language, makes this easier, *but* we need to
know where the data and bss segments end. This is where
looking at the linker script comes in.
|
|
|
|
|
|
|
A linker script is used to allocate and map the program's data to the
correct locations. This includes allocating memory so that all the
code and data fit in flash, but also allocating RAM for variables and
the stack. Often there will be a symbol provided that marks where the data
and bss sections in RAM end, and the heap should begin. For example,
[`STM32L151CCUX_FLASH.ld` at lines 185 &
186](https://www.funkthat.com/gitea/jmg/lora-irrigation/src/commit/91a6fb590b68af1bcd34f776d4a58c89ac581c7d/stm32/l151ccux/STM32L151CCUX_FLASH.ld#L185-L186)
defines the symbols `end` and `_end`, the latter of which is often
used by `sbrk` (or `_sbrk` in my project's case in
libnosys<label for="sn-sbrk-sample" class="margin-toggle sidenote-number"></label><input type="checkbox" id="sn-sbrk-sample" class="margin-toggle"/>
<span class="sidenote">A sample `_sbrk` is in [utils_syscalls.c](https://www.funkthat.com/gitea/jmg/lora-irrigation/src/commit/91a6fb590b68af1bcd34f776d4a58c89ac581c7d/loramac/src/boards/mcu/saml21/hal/utils/src/utils_syscalls.c#L67-L83),
though this particular implementation is not used by my project.</span>)
to allocate memory for the heap. Using `sbrk` is the easiest method to
access uninitialized SRAM, but modifying or adding a symbol in the
linker script can be used if your microcontroller's framework does not
support `sbrk`.
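
A rough sketch of the harvesting step, assuming the PRNG exposes some
way to absorb bytes. The names `absorb_fn`, `harvest_region` and
`count_absorb` are invented for this example, and `count_absorb` only
counts bytes so that the sketch can run on a host. On a target, the
callback would feed the PRNG's seeding function, `start` would come
from `sbrk(0)` (or the linker's `_end` symbol), and `stop` would be
some safe margin below the current stack pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook: mixes len bytes at data into the PRNG behind ctx. */
typedef void (*absorb_fn)(void *ctx, const uint8_t *data, size_t len);

/*
 * Feed the memory in [start, stop) to absorb() in small chunks and
 * return the number of bytes processed.  The region is only read,
 * never written.
 */
size_t harvest_region(const uint8_t *start, const uint8_t *stop,
    absorb_fn absorb, void *ctx)
{
	size_t total = 0;

	while (start < stop) {
		size_t chunk = (size_t)(stop - start);

		if (chunk > 64)
			chunk = 64;

		absorb(ctx, start, chunk);
		start += chunk;
		total += chunk;
	}

	return total;
}

/* Host-side stand-in (assumption): counts bytes instead of seeding. */
void count_absorb(void *ctx, const uint8_t *data, size_t len)
{
	(void)data;
	*(size_t *)ctx += len;
}
```

The harvesting has to run before anything else allocates from the heap,
as memory handed out by `sbrk` and then used no longer holds its
power-on contents.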
|
|
|
|
|
|
|
|
|
|
|
Putting it together
-------------------
|
|
|
|
|
|
|
It is accepted that integrating as many different sources of entropy
(TRNGs) as possible is best. This ensures that the seeded PRNG will be
secure and unpredictable as long as any single source is good, or, if
no single source is great, as long as the sources combined provide
enough entropy (preferably at least 128 bits).
|
|
|
|
|
|
|
As some sources are only available at first boot, e.g. SRAM, it is
best to save a fork of the PRNG to stable storage. In my
implementation, I decided to use EEPROM for this. I added an
additional EEPROM section in the linker script, and then added a symbol
[rng_save](https://www.funkthat.com/gitea/jmg/lora-irrigation/src/branch/main/strobe_rng_init.c#L39)
that is put in this section. This should be 256 bits (32 bytes), as
the savings from a smaller seed are not worth it, and any proper PRNG
seeded with 256 bits will provide enough randomness. Writing to EEPROM
does require a little more work to have the code save to this region,
rather than RAM, but the STM32 HAL layer has functions that make this
easy.
|
|
|
|
|
|
|
It would be great if the PRNG seed could be stored in read-once,
write-once memory, to ensure that it can be read, mixed in with any
additional entropy, and then written out, but I do not know of any
microcontroller that supports this feature.
|
|
|
|
|
|
|
Part of this is to ensure that the saved seed and the PRNG state used
for this boot are disjoint, and that if either is compromised, it
cannot be backtracked to obtain the other. In the case of
[strobe](https://strobe.sourceforge.io/papers/strobe-latest.pdf),
the function [strobe_randomize](https://www.funkthat.com/gitea/jmg/lora-irrigation/src/branch/main/strobe/strobe.c#L319-L331)
does a RATCHET operation at the end, which ensures the state cannot be rolled
back to figure out what was generated, and as the generated bytes do
not contain the entire state of the PRNG, they cannot be used to
reconstruct the future seed.
|
|
|
|
|
|
|
Another advantage of using EEPROM is the ability to provide an initial
set of entropy bytes at firmware flashing time. I did attempt to add
this, but OpenOCD, which I use for programming the Node151 device,
does not support programming EEPROM, so in my case, this was not
possible<label for="sn-eeprom-flash" class="margin-toggle sidenote-number"></label><input type="checkbox" id="sn-eeprom-flash" class="margin-toggle"/><span class="sidenote">Despite not using it, the infrastructure to generate perso entropy is still present in the [Makefile](https://www.funkthat.com/gitea/jmg/lora-irrigation/src/branch/main/Makefile#L152-L157).</span>.
I could have added an additional source data file to the flash, but
figured that the other sources of entropy were adequate for my
project.
|
|
|
|
|
|
|
|
|
|
|
{#
Conclusion
----------

Modern microcontrollers do have a number of sources of entropy that can
be used. With a little bit of work, a PRNG seed can be saved between
resets, allowing for more secure operation, and even preloading of
entropy. #}