Badly implementing encryptionPart V–nonce reuse

time to read 3 min | 554 words

In the previous post, I moved to using HMAC for the key stream generation. Together with a random nonce, we ensure that each time that we encrypt a value, we’ll get a different encrypted value. However, I want to stop for a while and talk about what will happen if we reuse a nonce.

The short answer, a single nonce reuse results in catastrophic results for our security scheme. Let’s take a look at how that works, shall we?

We are going to use:

  • Key: jMnNRO9K7DEGmrhPS6awT3w4AAjCMgaNNqPSiwTL//s=
  • Nonce: 3ilsaRYYOls4SA6XHd70jA==

Here is the encryption results:

attack at dawn! 414CE53F71D47A36FF099792858F58
defend at dusk! 445DF73B7CDB7A36FF099786818A58

Do you notice anything? At this point, we can see that we have some similarities between the texts, which is interesting. If we XOR those two texts, what will we get?

Well, if you think about it, we have:

“attack at dawn!” XOR keystream = 414CE53F71D47A36FF099792858F58
”defend at dusk!” XOR keystream = 445DF73B7CDB7A36FF099786818A58

The XOR operation is cumulative, so if we XOR those two values, we have:

445DF73B7CDB7A36FF099786818A58 XOR 445DF73B7CDB7A36FF099786818A58 =
    ( “attack at dawn!” XOR keystream ) XOR (”defend at dusk!” XOR keystream) =
    “attack at dawn!” XOR ”defend at dusk!” =
   051112040D0F000000000014040500

Sorry for the math here, but the basic idea should be clear. Note that in this case, we don’t know either of those messages, but what we have been able to do is to get the XOR of the plain texts. At this point, an actual cryptographer will go look at frequency tables for symbols in English and start making guesses about those values. I’m certain that there are better techniques for that, but given the computing power that I have, I decided to just break it using the following code:

When running this code on the XOR of the encrypted texts (which is the equivalent of the XOR of the plain texts), we get the following outputs:

info: xor: 051112040D0F000000000014040500
info: word: defends, match: attacks
info: word: attacker's, match: defender's
info: word: attacker, match: defender
info: word: defenders, match: attackers
info: word: adapt, match: dusty
info: word: attack, match: defend
info: word: defending, match: attacking
info: word: attacks, match: defends
info: word: dusty, match: adapt
info: word: defender's, match: attacker's
info: word: defender, match: attacker
info: word: defend, match: attack
info: word: attackers, match: defenders
info: word: attacking, match: defending
info: word: attacked, match: defended
info: word: defended, match: attacked

As you can see, we have a bunch of options for the plain text. Removing redundancies, we have the following pairs:

attack defend
adapt dusty

That was easy, I have to say. Now I can move on to the next word, eventually cracking the whole message. For the values that are the same, I’ll get zero bytes, of course. At that point, I can simply try guessing based on context, but the process shouldn’t take long at all.

The question is how to solve this? Currently, the commonly accepted practice is to shout at you very loudly in code reviews and to put big notices in the documentation: Thou shall not reuse a nonce.

There is something called SIV mode, which aims to help in this, but I want to keep that for a future post.

More posts in "Badly implementing encryption" series:

  1. (24 Feb 2022) Part X-Additional data
  2. (23 Feb 2022) Part IX–SIV
  3. (22 Feb 2022) Part VIII–timings attacks and side channels
  4. (21 Feb 2022) Part VII–implementing authenticated encryption
  5. (18 Feb 2022) Part VI–malleable encryption
  6. (17 Feb 2022) Part V–nonce reuse
  7. (16 Feb 2022) Part IV–keyed hash function
  8. (15 Feb 2022) Part III–breaking your encryption apart
  9. (14 Feb 2022) Part II–breaking the code
  10. (11 Feb 2022) Part I