# Key Derivation with Chacha20

The NaCl family doesn’t have protocols for key exchange. There’s the
`crypto_box()` API, but it doesn’t provide forward secrecy or
key compromise impersonation resistance. There’s Noise, but the specs
are fairly unapproachable, complex, and give too many options. So I
figured I would design my own protocol framework. Something simple,
tailored for my crypto library, Monocypher.

Monocypher uses 3 primitives to establish and use a secure channel:

- **X25519** for key exchanges.
- **Chacha20** for secrecy.
- **Poly1305** for authentication.

This is enough to imitate NaCl’s `crypto_box()`: hash the
X25519 shared secret with HChacha20 (NaCl uses HSalsa20, which is
essentially the same), and voilà, we have our session key.

Alas, this does not have satisfying security properties. No forward secrecy, no key compromise impersonation
resistance… For these, we need to imitate Noise and Signal, and perform
*several* key exchanges, using temporary key pairs. Then we need
to derive actual session keys from the resulting shared secrets. Signal
and Noise perform this derivation with HKDF.

Signal’s key derivation looks like this:

```
SK = HKDF(DH1 || DH2 || DH3 || DH4)

DH1  DH2  DH3  DH4
 │    └─┐┌─┘    │
 └─────┐││┌─────┘
     ┌─┴┴┴┴─┐
     │ HKDF │
     └──┬───┘
        │
        SK
```

Noise uses a more iterative process (I’m oversimplifying):

```
CK1, SK1 = HKDF(DH1)
CK2, SK2 = HKDF(DH2, CK1)
CK3, SK3 = HKDF(DH3, CK2)
CK4, SK4 = HKDF(DH4, CK3)

  DH1        DH2        DH3        DH4
   │          │          │          │
┌──┴───┐   ┌──┴───┐   ┌──┴───┐   ┌──┴───┐
│ HKDF ├───┤ HKDF ├───┤ HKDF ├───┤ HKDF ├──CK4
└──┬───┘   └──┬───┘   └──┬───┘   └──┬───┘
   │          │          │          │
  SK1        SK2        SK3        SK4
```
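To make the two shapes concrete, here is a rough Python sketch using HKDF-SHA256 built from the standard `hmac` and `hashlib` modules. The shared secrets are placeholder bytes, and the all-zero initial chaining key and empty `info` string are simplifying assumptions: the real Signal and Noise specifications add details this leaves out.

```python
import hashlib
import hmac

def hkdf(ikm: bytes, salt: bytes = bytes(32), length: int = 64) -> bytes:
    """HKDF-SHA256 (extract, then expand) with an empty info string."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()          # extract
    out, block, counter = b"", b"", 1
    while len(out) < length:                                    # expand
        block = hmac.new(prk, block + bytes([counter]), hashlib.sha256).digest()
        out += block
        counter += 1
    return out[:length]

# Placeholder shared secrets (real ones would come from X25519).
dh = [bytes([i]) * 32 for i in range(1, 5)]

# Signal-like: concatenate everything, derive one session key.
sk_signal = hkdf(b"".join(dh), length=32)

# Noise-like: feed each shared secret in turn, carrying a chaining key.
ck = bytes(32)          # assumed initial chaining key
noise_keys = []
for secret in dh:
    okm = hkdf(secret, salt=ck, length=64)
    ck, sk = okm[:32], okm[32:]
    noise_keys.append(sk)

print(sk_signal.hex())
for sk in noise_keys:
    print(sk.hex())
```

The point of the iterative form is that every new session key depends on all the exchanges seen so far, through the chaining key.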

HKDF is excellent, but it is based on HMAC, which needs a hash function to run. I don’t want a hash: I want to cash in on DJB’s conjecture that HSalsa20 is enough to hash X25519 shared secrets. I want to avoid using a fourth primitive if I can help it.

Problem is, HChacha20 can only hash up to 48 bytes. And it’s not even
a real hash, it’s derived from a *stream cipher*. There’s no compression function, so we have to construct one
somehow.

## Stumbling in the dark

My first attempt tried to do the compression with XOR. It failed miserably:

```
CK1 = HChacha20(DH1)
CK2 = HChacha20(DH2) XOR CK1
CK3 = HChacha20(DH3) XOR CK2
CK4 = HChacha20(DH4) XOR CK3

DH1    DH2    DH3    DH4
 │┌───┐ │┌───┐ │┌───┐ │┌───┐
 └┤ H │ └┤ H │ └┤ H │ └┤ H │
  └─┬─┘  └─┬─┘  └─┬─┘  └─┬─┘
    │      │      │      │
    ├─────╴+╶────╴+╶────╴+
    │      │      │      │
   CK1    CK2    CK3    CK4
```

*(The “H” blocks mean HChacha20, and “+” means XOR)*

If two key exchanges happen to be the same (either because of user
error or attacker interference), they *cancel each other out*. We
need to separate the different outputs somehow. Lucky for us, HChacha20
has 16 more bytes to spare (the nonce and counter). We could use it to
our advantage and increment it:

```
CK1 = HChacha20(DH1, 1)
CK2 = HChacha20(DH2, 2) XOR CK1
CK3 = HChacha20(DH3, 3) XOR CK2
CK4 = HChacha20(DH4, 4) XOR CK3

DH1     DH2     DH3     DH4
 │  1    │  2    │  3    │  4
 │┌─┴─┐  │┌─┴─┐  │┌─┴─┐  │┌─┴─┐
 └┤ H │  └┤ H │  └┤ H │  └┤ H │
  └─┬─┘   └─┬─┘   └─┬─┘   └─┬─┘
    │       │       │       │
    ├──────╴+╶─────╴+╶─────╴+
    │       │       │       │
   CK1     CK2     CK3     CK4
```
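The cancellation, and the effect of the counter, are easy to see in a small Python sketch. The `hchacha20` below is a from-scratch illustration, not Monocypher's code. Two things are assumptions of mine: the counter is encoded as a 16-byte little-endian integer, and the first attempt uses an all-zero auxiliary input. The "shared secrets" are placeholder bytes.

```python
import struct

MASK = 0xffffffff

def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK

def quarter_round(s, a, b, c, d):
    s[a] = (s[a] + s[b]) & MASK
    s[d] = rotl(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK
    s[b] = rotl(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK
    s[d] = rotl(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK
    s[b] = rotl(s[b] ^ s[c], 7)

def hchacha20(key, aux):
    """32-byte key and 16-byte auxiliary input -> 32-byte output."""
    s = list(struct.unpack("<16I", b"expand 32-byte k" + key + aux))
    for _ in range(10):  # 10 double rounds = 20 rounds
        for a, b, c, d in ((0, 4, 8, 12), (1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15),
                           (0, 5, 10, 15), (1, 6, 11, 12), (2, 7, 8, 13), (3, 4, 9, 14)):
            quarter_round(s, a, b, c, d)
    return struct.pack("<8I", *(s[0:4] + s[12:16]))

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def aux(n):
    return n.to_bytes(16, "little")  # assumed encoding of the counter

dh1 = bytes(range(32))  # placeholder, not a real X25519 output
dh2 = dh1               # the same exchange happens twice

# First attempt: same auxiliary input everywhere, the two hashes cancel out.
ck1 = hchacha20(dh1, aux(0))
ck2_plain = xor(hchacha20(dh2, aux(0)), ck1)

# Second attempt: a different counter per exchange keeps them apart.
ck1 = hchacha20(dh1, aux(1))
ck2_counted = xor(hchacha20(dh2, aux(2)), ck1)

print(ck2_plain.hex())
print(ck2_counted.hex())
```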

There, no more problem. Unless…

See, DJB’s conjecture did not mention such a hack. Granted, Chacha20
is a stream cipher, and when using it with a uniform random key, the
streams produced with different nonces are independent. Except X25519
shared secrets are *not* random. They’re on the curve, which means
*at most* 252 bits of entropy. There are more effective attacks
than brute force, and the security goal of X25519 is only 128 bits
anyway.

Note that if I could reasonably assume this independence, then X25519 shared secrets could just as well be used directly as Chacha20 keys, and even NaCl does not go that far.

So, nope.

I then tried to take inspiration from the sponge construction, and have HChacha20 absorb keys 16 bytes at a time:

```
CK1 = HChacha20(HChacha20(zero, DH1[0:15]), DH1[16:31])
CK2 = HChacha20(HChacha20(CK1 , DH2[0:15]), DH2[16:31])
CK3 = HChacha20(HChacha20(CK2 , DH3[0:15]), DH3[16:31])
CK4 = HChacha20(HChacha20(CK3 , DH4[0:15]), DH4[16:31])

      DH1           DH2           DH3           DH4
       │             │             │             │
 ┌─────┼─────┐ ┌─────┼─────┐ ┌─────┼─────┐ ┌─────┼─────┐
 │ LSB │ MSB │ │ LSB │ MSB │ │ LSB │ MSB │ │ LSB │ MSB │
 └──┬──┴──┬──┘ └──┬──┴──┬──┘ └──┬──┴──┬──┘ └──┬──┴──┬──┘
    │     │       │     │       │     │       │     │
  ┌─┴─┐ ┌─┴─┐   ┌─┴─┐ ┌─┴─┐   ┌─┴─┐ ┌─┴─┐   ┌─┴─┐ ┌─┴─┐
┌─┤ H ├─┤ H ├─┬─┤ H ├─┤ H ├─┬─┤ H ├─┤ H ├─┬─┤ H ├─┤ H ├─┐
│ └───┘ └───┘ │ └───┘ └───┘ │ └───┘ └───┘ │ └───┘ └───┘ │
0            CK1           CK2           CK3           CK4
```

This is actually very similar to DJB’s XSalsa20 cascade, where the chaining key would be the key, and the Diffie-Hellman key exchanges would be the extended nonce and counter. We even have a security reduction, so we should be good, right?

Wrong: the zeroth chaining key is not random. It’s *zero*.
DJB’s security reduction does not apply at all. We could still
*conjecture* that **CK1** is random as long as
**DH1** is secure, but I’ve never seen such a conjecture
anywhere, and I’m not pulling conjectures out of thin air.

So, nope. Again.

## Calming down and stepping back

Clearly, throwing stuff up to see what sticks doesn’t work. I needed to look at the requirements more rigorously, and make sure I understand the security properties of Chacha20 (and HChacha20).

### KDF requirements

My protocols have the outer structure of Noise, with the same kind of
state machine. Each time there’s a new Diffie-Hellman key exchange, we
inject it into the state machine, and get new symmetric keys to work
with. My protocol specifically needs an authentication key
**AK**, and an encryption key **EK**, both of
which will be compromised after the first use (because I use XOR and a
polynomial hash). Here’s what I need:

```
CK1, EK1, AK1 = KDF(DH1     )
CK2, EK2, AK2 = KDF(DH2, CK1)
CK3, EK3, AK3 = KDF(DH3, CK2)
CK4, EK4, AK4 = KDF(DH4, CK3)

  DH1       DH2       DH3       DH4
   │         │         │         │
┌──┴──┐   ┌──┴──┐   ┌──┴──┐   ┌──┴──┐
│ KDF ├───┤ KDF ├───┤ KDF ├───┤ KDF ├──CK4
└┬───┬┘   └┬───┬┘   └┬───┬┘   └┬───┬┘
 │   │     │   │     │   │     │   │
AK1 EK1   AK2 EK2   AK3 EK3   AK4 EK4
```

This KDF needs to have the following security property: If
**DH(n)** is secure (meaning, the involved private keys are
suitably random, and unknown to the attacker), then
**AK(i)** and **EK(i)**, for all
**i** such that **i≥n**, are random, unknown
to the attacker, and independent from each other and any other keys.

Especially important is that revealing **AK(i)** or
**EK(i)**, for all **i**, doesn’t give any
further information away.

### Properties of X25519

This is your regular Diffie-Hellman key exchange over an elliptic curve. Each party has a key pair, whose private half is a random secret, and the public half is revealed to the world (and any potential attacker). An exchange between two key pairs computes a shared secret from the private half of one key and the public half of the other key.

Diffie-Hellman operations are conjectured to be intractably hard to reverse. One cannot feasibly learn the private key of someone simply by knowing their public key, or even shared secrets.

An attacker who performs a key exchange with you does have some control, though. They know your public key, and they can choose their private key. They could perform several key exchanges and bias the resulting shared secret, or even force the shared secret to be zero (by choosing a low order public key).

### Properties of Chacha20

Chacha20 is a stream cipher. Its permutation has 3 inputs: a 32-byte key, an 8-byte nonce, and an 8-byte counter. The output is a 64-byte block. The security goal of Chacha20 is that given an independent random key, a counter, and a nonce, the output block is random and independent. Most notably, it is independent from blocks generated with the same key but a different counter or nonce. (Counter and nonce put together are like a salt.)

HChacha20 is almost the same as Chacha20: it has the same input key, the same counter and nonce (except those are fused into a single auxiliary “input” for simplicity). The main difference is that it outputs only 32 bytes. Those bytes can be computed from the corresponding Chacha20 output block. The main reason for HChacha20’s existence is that it is slightly cheaper to compute than a full Chacha20 block. Its security properties are the same as Chacha20.
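The relationship between the two can be checked with a short Python sketch. This is an illustrative from-scratch implementation, not Monocypher's code. It assumes the original DJB layout, where the 8-byte counter comes before the 8-byte nonce in the state, so the HChacha20 auxiliary input is simply `counter + nonce`. The helper `hchacha20_from_block` is a hypothetical name for the "compute it from the Chacha20 block" step: it subtracts the known input words from the first and last rows of the block.

```python
import struct

MASK = 0xffffffff
CONSTANTS = b"expand 32-byte k"

def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK

def quarter_round(s, a, b, c, d):
    s[a] = (s[a] + s[b]) & MASK
    s[d] = rotl(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK
    s[b] = rotl(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK
    s[d] = rotl(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK
    s[b] = rotl(s[b] ^ s[c], 7)

def permute(words):
    """The 20 Chacha rounds, applied to 16 words."""
    s = list(words)
    for _ in range(10):
        for a, b, c, d in ((0, 4, 8, 12), (1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15),
                           (0, 5, 10, 15), (1, 6, 11, 12), (2, 7, 8, 13), (3, 4, 9, 14)):
            quarter_round(s, a, b, c, d)
    return s

def chacha20_block(key, counter, nonce):
    """64-byte block: permuted state plus the initial state, word by word."""
    init = struct.unpack("<16I", CONSTANTS + key + counter + nonce)
    s = permute(init)
    return struct.pack("<16I", *((x + y) & MASK for x, y in zip(s, init)))

def hchacha20(key, aux):
    """32-byte output: first and last rows of the permuted state."""
    s = permute(struct.unpack("<16I", CONSTANTS + key + aux))
    return struct.pack("<8I", *(s[0:4] + s[12:16]))

def hchacha20_from_block(block, aux):
    """Recover the HChacha20 output from a full block by undoing the final addition."""
    words = struct.unpack("<16I", block)
    first = [(w - c) & MASK for w, c in zip(words[0:4], struct.unpack("<4I", CONSTANTS))]
    last = [(w - a) & MASK for w, a in zip(words[12:16], struct.unpack("<4I", aux))]
    return struct.pack("<8I", *(first + last))

key = bytes(range(32))                 # placeholder key
counter = (1).to_bytes(8, "little")
nonce = b"8 bytes!"
block = chacha20_block(key, counter, nonce)
direct = hchacha20(key, counter + nonce)
derived = hchacha20_from_block(block, counter + nonce)
print(direct == derived)
```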

There’s an additional conjecture about HChacha20: if we give it as an
input an X25519 shared secret whose private keys are secure, then the
output is random. This assumption comes from the Curve25519 paper, where
DJB uses a Salsa20 based hash, and from NaCl’s use of HSalsa20
underneath its `crypto_box()` API.

I conjecture that if it works with Salsa20, it works with Chacha20.
They are similar enough that this conjecture should raise no eyebrow.
This is why I use HChacha20 in Monocypher’s
`crypto_key_exchange()` API.

Note that we can *not* use this conjecture to deduce that
hashing the same shared secret with a different counter and nonce would
yield *independent* random outputs. If we want independent
outputs, we must use different shared secrets, computed with independent
private keys.

### Putting it back together

The starting point is to get random numbers from secure shared secrets. Getting those intermediate “keys” is the easy part:

```
IK1 = HChacha20(DH1, 0)
IK2 = HChacha20(DH2, 0)
IK3 = HChacha20(DH3, 0)
IK4 = HChacha20(DH4, 0)

DH1       DH2       DH3       DH4
 │  0      │  0      │  0      │  0
 │┌─┴─┐    │┌─┴─┐    │┌─┴─┐    │┌─┴─┐
 └┤ H │    └┤ H │    └┤ H │    └┤ H │
  └─┬─┘     └─┬─┘     └─┬─┘     └─┬─┘
    │         │         │         │
   IK1       IK2       IK3       IK4
```

If a shared secret is secure, the corresponding intermediate key will be as good as random. If it is not, that key will be known to the attacker. We must also keep in mind that an active attacker can have some control over the value of an intermediate key in some cases.

Now from those intermediate keys, we can try and generate chaining keys. The following works pretty well:

```
CK1 = IK1
CK2 = IK2 XOR HChacha20(CK1, 1)
CK3 = IK3 XOR HChacha20(CK2, 1)
CK4 = IK4 XOR HChacha20(CK3, 1)

IK1     IK2     IK3     IK4
 │   1   │   1   │   1   │
 │ ┌─┴─┐ │ ┌─┴─┐ │ ┌─┴─┐ │
 ├─┤ H ├╴+╶┤ H ├╴+╶┤ H ├╴+
 │ └───┘ │ └───┘ │ └───┘ │
 │       │       │       │
CK1     CK2     CK3     CK4
```

If IK1 is random, so is its hash. Moreover, the hash will be
independent from everything else, and IK2 in particular. So we can XOR
them together without fearing a bad interaction (the XOR of two
independent random sources is random). Thus, CK2 is random if IK1
*or* IK2 is random. CK3 is random if IK1 *or* IK2
*or* IK3 is random. And so on.

Note that we changed the auxiliary input. If we used zero everywhere, this could lead to the following:

```
CK2 = IK2               XOR HChacha20(CK1              , 0)
CK2 = HChacha20(DH2, 0) XOR HChacha20(CK1              , 0)
CK2 = HChacha20(DH2, 0) XOR HChacha20(IK1              , 0)
CK2 = HChacha20(DH2, 0) XOR HChacha20(HChacha20(DH1, 0), 0)
```

Now what if the attacker has control over DH2, *and* knows the
value of DH1? Then they may attempt to force DH2 to the following
value:

`DH2 = HChacha20(DH1, 0)`

Which would mean:

```
CK2 = HChacha20(DH2, 0) XOR
      HChacha20(HChacha20(DH1, 0), 0)
CK2 = HChacha20(HChacha20(DH1, 0), 0) XOR
      HChacha20(HChacha20(DH1, 0), 0)
CK2 = BIG FAT ZERO
```

Oops. If we’re using a different auxiliary input however, things are much better:

```
CK2 = HChacha20(DH2, 0) XOR
      HChacha20(HChacha20(DH1, 0), 1)
```

Simple control over DH2 won’t make the two terms of this XOR equal. The attacker must find a value such that:

```
HChacha20(Attacker_value, 0) = HChacha20(known_value, 1)
HChacha20(Attacker_value, 0) = arbitrary_value
```

This basically requires a preimage attack on HChacha20, and such an attack would mean the ability to break Chacha20 itself. We’re thus safe. (Note that we weren’t that unsafe to begin with: the control over an X25519 shared secret is far from perfect. It’s just simpler and safer to work around the assumption that it is.)
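Here is a small Python sketch of that scenario, under the pessimistic assumption that the attacker knows DH1 and has perfect control over DH2. The `hchacha20` is an illustrative from-scratch implementation, the auxiliary inputs 0 and 1 are encoded as 16-byte little-endian integers (my choice), and `dh1` is a placeholder value.

```python
import struct

MASK = 0xffffffff

def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK

def quarter_round(s, a, b, c, d):
    s[a] = (s[a] + s[b]) & MASK
    s[d] = rotl(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK
    s[b] = rotl(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK
    s[d] = rotl(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK
    s[b] = rotl(s[b] ^ s[c], 7)

def hchacha20(key, aux):
    """32-byte key and 16-byte auxiliary input -> 32-byte output."""
    s = list(struct.unpack("<16I", b"expand 32-byte k" + key + aux))
    for _ in range(10):
        for a, b, c, d in ((0, 4, 8, 12), (1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15),
                           (0, 5, 10, 15), (1, 6, 11, 12), (2, 7, 8, 13), (3, 4, 9, 14)):
            quarter_round(s, a, b, c, d)
    return struct.pack("<8I", *(s[0:4] + s[12:16]))

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

ZERO = (0).to_bytes(16, "little")
ONE = (1).to_bytes(16, "little")

def second_chaining_key(dh1, dh2, chain_aux):
    """CK2 = IK2 XOR HChacha20(CK1, chain_aux), with CK1 = IK1."""
    ik1 = hchacha20(dh1, ZERO)
    ik2 = hchacha20(dh2, ZERO)
    return xor(ik2, hchacha20(ik1, chain_aux))

dh1 = bytes(range(32))          # known to the attacker (placeholder value)
dh2 = hchacha20(dh1, ZERO)      # what the attacker forces DH2 to be

ck2_same_aux = second_chaining_key(dh1, dh2, ZERO)   # zero everywhere
ck2_diff_aux = second_chaining_key(dh1, dh2, ONE)    # different auxiliary input

print(ck2_same_aux.hex())   # all zeroes: the two terms cancel
print(ck2_diff_aux.hex())
```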

Now there’s still a problem: protocol separation. I am not designing
a single protocol, I’m designing a protocol *framework*, with at
least 3 different protocols. This is not a huge problem, but sometimes,
the same shared secret could be involved in two different protocols.
This can happen with exchanges between two long term keys, or because
some clever fool tried to save computation during a protocol
negotiation, and ended up reusing an ephemeral key.

So instead of using 1 as auxiliary input, we could use a protocol specific constant, like its name. We have 16 bytes to do it, this should be enough:

```
p   = protocol_name
CK1 = IK1
CK2 = IK2 XOR HChacha20(CK1, p)
CK3 = IK3 XOR HChacha20(CK2, p)
CK4 = IK4 XOR HChacha20(CK3, p)

IK1     IK2     IK3     IK4
 │   p   │   p   │   p   │
 │ ┌─┴─┐ │ ┌─┴─┐ │ ┌─┴─┐ │
 ├─┤ H ├╴+╶┤ H ├╴+╶┤ H ├╴+
 │ └───┘ │ └───┘ │ └───┘ │
 │       │       │       │
CK1     CK2     CK3     CK4
```

That’s not quite enough, though: CK1 is not affected by the protocol name. We need to tweak things a little:

```
p   = protocol_name
CK1 = HChacha20(IK1        , p)
CK2 = HChacha20(IK2 XOR CK1, p)
CK3 = HChacha20(IK3 XOR CK2, p)
CK4 = HChacha20(IK4 XOR CK3, p)

IK1       IK2       IK3       IK4
 │   p     │   p     │   p     │   p
 │ ┌─┴─┐   │ ┌─┴─┐   │ ┌─┴─┐   │ ┌─┴─┐
 └─┤ H ├─┬╴+╶┤ H ├─┬╴+╶┤ H ├─┬╴+╶┤ H ├─┐
   └───┘ │   └───┘ │   └───┘ │   └───┘ │
         │         │         │         │
        CK1       CK2       CK3       CK4
```

Now we’re all set.
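As a rough Python sketch, the chaining keys could be computed like this. The `hchacha20` is an illustrative from-scratch implementation. The protocol names `"proto-A"` and `"proto-B"` are made up, and padding the name with zero bytes to 16 bytes is an assumption of mine, not something the scheme mandates. The empty name is rejected because it would make `p` equal to the zero auxiliary input used for the intermediate keys.

```python
import struct

MASK = 0xffffffff

def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK

def quarter_round(s, a, b, c, d):
    s[a] = (s[a] + s[b]) & MASK
    s[d] = rotl(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK
    s[b] = rotl(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK
    s[d] = rotl(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK
    s[b] = rotl(s[b] ^ s[c], 7)

def hchacha20(key, aux):
    """32-byte key and 16-byte auxiliary input -> 32-byte output."""
    s = list(struct.unpack("<16I", b"expand 32-byte k" + key + aux))
    for _ in range(10):
        for a, b, c, d in ((0, 4, 8, 12), (1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15),
                           (0, 5, 10, 15), (1, 6, 11, 12), (2, 7, 8, 13), (3, 4, 9, 14)):
            quarter_round(s, a, b, c, d)
    return struct.pack("<8I", *(s[0:4] + s[12:16]))

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def protocol_aux(name):
    """Turn a short ASCII protocol name into a 16-byte auxiliary input."""
    raw = name.encode("ascii")
    if not raw or len(raw) > 16:
        raise ValueError("protocol name must be 1 to 16 ASCII characters")
    return raw.ljust(16, b"\0")  # assumed padding

def chaining_keys(shared_secrets, protocol_name):
    p = protocol_aux(protocol_name)
    ck = bytes(32)  # XORing with zero leaves IK1 unchanged for the first step
    result = []
    for dh in shared_secrets:
        ik = hchacha20(dh, bytes(16))     # IKn = HChacha20(DHn, 0)
        ck = hchacha20(xor(ik, ck), p)    # CKn = HChacha20(IKn XOR CKn-1, p)
        result.append(ck)
    return result

dh = [bytes([i]) * 32 for i in range(1, 5)]  # placeholder shared secrets
keys_a = chaining_keys(dh, "proto-A")
keys_b = chaining_keys(dh, "proto-B")
for a, b in zip(keys_a, keys_b):
    print(a.hex(), b.hex())
```

With the same shared secrets, the two protocols get different chaining keys from the very first step.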

### The complete key derivation scheme

Chaining keys are nice, but they’re just a stepping stone towards the keys I’m actually interested in: the authentication keys and the encryption keys. For those, we only need a single Chacha20 invocation (which unlike HChacha20 outputs 64 bytes, so we’re saving a bit of computation there). Care must be taken to use a different counter or nonce than we used for the chaining key’s HChacha20 invocations, though, or the keys won’t be truly independent. So:

```
AK1, EK1 = Chacha20(CK1, 1)
AK2, EK2 = Chacha20(CK2, 1)
AK3, EK3 = Chacha20(CK3, 1)
AK4, EK4 = Chacha20(CK4, 1)

CK1       CK2       CK3       CK4
 │   1     │   1     │   1     │   1
 │ ┌─┴─┐   │ ┌─┴─┐   │ ┌─┴─┐   │ ┌─┴─┐
 └─┤ C │   └─┤ C │   └─┤ C │   └─┤ C │
   └─┬─┘     └─┬─┘     └─┬─┘     └─┬─┘
     │         │         │         │
   ┌─┴─┐     ┌─┴─┐     ┌─┴─┐     ┌─┴─┐
  AK1  │    AK2  │    AK3  │    AK4  │
      EK1       EK2       EK3       EK4
```

And now the whole shebang:

```
p   = protocol_name
CK1 = HChacha20(HChacha20(DH1, 0)        , p)
CK2 = HChacha20(HChacha20(DH2, 0) XOR CK1, p)
CK3 = HChacha20(HChacha20(DH3, 0) XOR CK2, p)
CK4 = HChacha20(HChacha20(DH4, 0) XOR CK3, p)
AK1, EK1 = Chacha20(CK1, 1)
AK2, EK2 = Chacha20(CK2, 1)
AK3, EK3 = Chacha20(CK3, 1)
AK4, EK4 = Chacha20(CK4, 1)

DH1       DH2       DH3       DH4
 │  0      │  0      │  0      │  0
 │┌─┴─┐    │┌─┴─┐    │┌─┴─┐    │┌─┴─┐
 └┤ H │    └┤ H │    └┤ H │    └┤ H │
  └─┬─┘     └─┬─┘     └─┬─┘     └─┬─┘
    │   p     │   p     │   p     │   p
    │ ┌─┴─┐   │ ┌─┴─┐   │ ┌─┴─┐   │ ┌─┴─┐
    └─┤ H ├─┬╴+╶┤ H ├─┬╴+╶┤ H ├─┬╴+╶┤ H ├─┐
      └───┘ │   └───┘ │   └───┘ │   └───┘ │
            │   1     │   1     │   1     │   1
            │ ┌─┴─┐   │ ┌─┴─┐   │ ┌─┴─┐   │ ┌─┴─┐
            └─┤ C │   └─┤ C │   └─┤ C │   └─┤ C │
              └─┬─┘     └─┬─┘     └─┬─┘     └─┬─┘
                │         │         │         │
              ┌─┴─┐     ┌─┴─┐     ┌─┴─┐     ┌─┴─┐
             AK1  │    AK2  │    AK3  │    AK4  │
                 EK1       EK2       EK3       EK4
```
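Put together, the whole derivation fits in a short Python sketch. Again, this is an illustration under assumptions, not Monocypher's implementation: the Chacha20 state uses the original 8-byte counter followed by an 8-byte nonce, the nonce is all zeroes, the protocol name is zero-padded to 16 bytes, and the first half of the block is taken as AK and the second half as EK. The shared secrets and the name `"proto-A"` are placeholders.

```python
import struct

MASK = 0xffffffff
CONSTANTS = b"expand 32-byte k"

def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK

def quarter_round(s, a, b, c, d):
    s[a] = (s[a] + s[b]) & MASK
    s[d] = rotl(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK
    s[b] = rotl(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK
    s[d] = rotl(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK
    s[b] = rotl(s[b] ^ s[c], 7)

def permute(words):
    s = list(words)
    for _ in range(10):
        for a, b, c, d in ((0, 4, 8, 12), (1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15),
                           (0, 5, 10, 15), (1, 6, 11, 12), (2, 7, 8, 13), (3, 4, 9, 14)):
            quarter_round(s, a, b, c, d)
    return s

def hchacha20(key, aux):
    s = permute(struct.unpack("<16I", CONSTANTS + key + aux))
    return struct.pack("<8I", *(s[0:4] + s[12:16]))

def chacha20_block(key, counter, nonce):
    init = struct.unpack("<16I", CONSTANTS + key + counter.to_bytes(8, "little") + nonce)
    s = permute(init)
    return struct.pack("<16I", *((x + y) & MASK for x, y in zip(s, init)))

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def derive_keys(shared_secrets, protocol_name):
    """Return one (AK, EK) pair per shared secret."""
    raw = protocol_name.encode("ascii")
    if not raw or len(raw) > 16:
        raise ValueError("protocol name must be 1 to 16 ASCII characters")
    p = raw.ljust(16, b"\0")                       # assumed padding
    ck = bytes(32)
    pairs = []
    for dh in shared_secrets:
        ik = hchacha20(dh, bytes(16))              # IKn = HChacha20(DHn, 0)
        ck = hchacha20(xor(ik, ck), p)             # CKn = HChacha20(IKn XOR CKn-1, p)
        block = chacha20_block(ck, 1, bytes(8))    # AKn, EKn = Chacha20(CKn, 1)
        pairs.append((block[:32], block[32:]))
    return pairs

dh = [bytes([i]) * 32 for i in range(1, 5)]        # placeholder shared secrets
pairs = derive_keys(dh, "proto-A")
for n, (ak, ek) in enumerate(pairs, start=1):
    print(f"AK{n} = {ak.hex()}")
    print(f"EK{n} = {ek.hex()}")
```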