## Combining cryptographic hashes, part 3

In previous posts I discussed how Joux’s multicollision attack significantly decreases the efficiency of concatenating hash functions and how it can be used to break the birthday limit of md5. Now the obvious remaining question is: if concatenation is not effective, are there better ways to combine two hash functions ? As often in cryptography, it is very easy to come up with a solution which is neat, simple, and wrong: for instance, would you have said that $F(M) || G(F(M) \oplus M)$ is insecure ? Luckily the smart people at the Emmy Noether Research Group give us two examples of good combiners.

Hash-tree based concatenation
In a paper from 2007, Fischlin and Lehmann introduce a security-amplifying combiner based on the concatenation combiner. Their idea is quite simple : to build a $2^n$ multicollision on an n-bits iterative hash function, Joux’s attack requires n consecutive calls to the compression function. In other words, the colliding messages must have a size equal to or larger than $n * B$ bits, where B is the message block size of the hash function.

Conversely, limiting the size of the input messages would also decrease the number of multicollisions found by Joux’s attack, and therefore the probability of finding a pair of messages which collide in both hash functions. For instance, for messages of size $n/4$ blocks, Joux’s attack can only build a $2^{\frac{n}{4}}$ multicollision, and the probability of finding a pair of colliding messages drops to $2^{-\frac{n}{2}}$.

Input messages obviously have an arbitrary size, so to ensure that the concatenation combiner $H = F || G$ only process a small number of message blocks, a hash-tree structure can be used. For instance, if we decide to break the message in groups of 4 blocks the hash-tree would look like this:

The size of the groups is a tradeoff between security and speed, but you should be safe as long as you keep the groups smaller than $n/4$ blocks.

The $C_4p$ combiner
In 2009 the same Fischlin and Lehmann introduced a new family of secure combiners. The basic combiner of this family is called $C_4p$, and it has the following construction:

where:
$H_0$ and $H_1$ are the two hash functions to combine
$\oplus$ is the XOR operator
$H_\oplus = H_0 \oplus H_1$
$H_k^i(M) = H_k(i || M)$, in other words the binary representation of $i$
is preprended to the message $M$ before hashing with $H_k$ (k=0 or 1)
$H_\oplus^i(M) = H_0^i(M) \oplus H_1^i(M) = H_0(i||M) \oplus H_1(i||M)$

With a combination of XOR operations, concatenations and permutations, the $C_4p$ combiner outputs $2n$bits hash values (same length as the classical concatenation combiner) and is robust against collisions. In their article the authors describe several modifications of $C_4p$ which provide additional properties, such as pre-image robustness and indifferentiability from random oracles.

Last words
In march Fischlin and Lehmann will present at the CT-RSA 2010 conference a paper titled “Hash Function Combiners in TLS and SSL”, it should be fun 🙂