Tag Archives: hashing

Blake Tree Hashing

One feature Skein has, but Blake is lacking is tree hashing. The kind of tree I’ll describe here has continuous leaves, so it is suitable for incremental updates/verification. For parallel hashing other constructions, such as interleaving, might be superior.

I’m using the Blake compression function without any changes, but I change the counter scheme. The counter scheme combines ideas from my earlier Alternative Padding suggestion and the UTF-8 encoding. I’ll only describe the Blake256 variant here, but generalization to Blake512 is straightforward.


The tree hash takes three parameters:

  • The LeafSize. A power of two, at least 4.
  • The FanOut. A power of two, at least 2
  • The maximum tree high MaxHeight, at least 2.

As with my earlier proposal, bit 63 marks the final block of a layer, and bit 62 indicates the presence of bit padding. The remaining 62 bits serve as a combined layer indication and position.

The number of leading 1 bits indicates the layer. No leading 1 bit corresponds to leaves etc. The remaining bits count the length of the message in bytes up to and including the current block.

For the leaf layer divide the message into LeafSize * HashSize pieces. For the higher layers divide the message into FanOut * HashSize pieces.
Hash each piece, producing one hash that’ll become part of the next layer. Don’t reset the counter when going to the next piece, only reset the chaining value to the IV.

If you reach MaxHeight, hash the whole lower layer in one go, instead of splitting it into pieces.

Repeat until the length of a layer is equal to the HashSize. That layer is the root hash.


TreeHash(Message, LeafSize, FanOut, MaxHeight)
    requires (LeafSize >= 4) and IsPowerOfTwo(LeafSize)
    requires (FanOut >= 2) and IsPowerOfTwo(FanOut)
    requires MaxHeight >= 2

    const HashSizeInBytes = 32
    const BlockSizeInBytes = HashSizeInBytes * 2
    const CounterSizeInBits = 64

    CurrentLayer = Message
    if Message.LengthInBits==0
        return BlakeCompress(IV, BlockSizeInBytes zero-bytes, 2**(CounterSizeInBits-1))
    LayerIndex = 0
        NextLayer = ""
        if LayerIndex == 0
            PieceSizeInBytes = LeafSize * HashSizeInBytes
            PieceSizeInBytes = FanOut * HashSizeInBytes
        if (PieceSizeInBytes<CurrentLayer.LenghInBytes) or (LayerIndex == MaxHeight - 1)
            PieceSizeInBytes = CurrentLayer.LengthInBytes
        for(pieceIndex=0; pieceIndex < Ceiling(CurrentLayer.LengthInBits/(PieceSizeInBytes*8)); pieceIndex++)
            State = IV
            Piece = CurrentLayer[pieceIndex*PieceSize, PieceSize]
            if CurrentLayer.LengthInBits % 8 != 0 //Add bit padding
            while Piece.LengthInBits % (BlockSizeInBytes * 8) != 0 //Pad to block size
            for(blockIndex=0; blockIndex<Piece.LengthInBytes/BlockSizeInBytes; blockIndex++)
                Counter = Min(
                    pieceIndex*PieceSize + (blockIndex + 1) * BlockSizeInBytes,
                if Counter == Ceiling(CurrentLayer.LengthInBits/8)//Last block of layer
                    Counter.SetBit(CounterSizeInBits - 1)
                    if CurrentLayer.LengthInBits % 8 != 0 //Bit padding was used
                        Counter.SetBit(CounterSizeInBits - 2)
                for(i=0; i< LayerIndex; i++)//Set the `LayerIndex` most significant bits in Counter
                    Counter.SetBit(CounterSizeInBits - (3 + i));
                State = BlakeCompress(State, Piece[blockIndex * HashSizeInBytes, HashSizeInBytes], Counter)
        CurrentLayer = NextLayer
    until CurrentLayer.LengthInBytes == HashSizeInBytes)
    return CurrentLayer

(The division sign / denotes exact division, not c style truncating division, = is assignment, and == equality test)


HashSizeInBytes = 32
FanOut = 2
LeafSize = 4
MaxHeight = 255
Message.LengthInBits = 2500

Each block corresponds to a compression function call, the number in it is the counter as a hexadecimal number.

| B000 ... 0040 |
        |                                                                               |
+---------------+                                                               +---------------+
| 2000 ... 0040 |                                                               | A000 ... 0060 |
+---------------+                                                               +---------------+
        |                                                                               |
        +-------------------+-------------------+-------------------+                   +
        |                   |                   |                   |                   |
+---------------+   +---------------+   +---------------+   +---------------+   +---------------+
| 0000 ... 0040 |   | 0000 ... 0080 |   | 0000 ... 00C0 |   | 0000 ... 0100 |   | C000 ... 0139 |
+---------------+   +---------------+   +---------------+   +---------------+   +---------------+

Properties/Open Questions

  • The Blake256 based variant can hash 2^64-8 bits or 2^61-1 bytes
  • I didn’t describe a configuration block yet. Without such a block, trees with different parameters might interact in undesirable ways.
  • Should we reduce the minimum LeafSize to 2 * HashSize? That’d reduce the maximum input size or adds additional complications.
  • Can it be extended to include interleaved hashing in an elegant way?

Alternative Blake Padding

BLAKE is one of the SHA-3 finalists, and IMO the one most likely to win. So I’ve been looking into it, to see how well it fits my usecases.

I like the design of its compression function, but I don’t like how it handles padding. Its padding adds at least 9 bytes to each message, which can cause an additional compression function call. Most annoyingly for me, message sizes that are powers of two always cause this additional call. This is particularly annoying in tree-hashing, a binary tree with Blake256 in my case, where inner nodes have a size of 64 bytes, and leaves have a slightly bigger power-of-two size, such as 1024. Now without padding the overhead of tree-hashing is one compression function call per leaf. With the original padding it’s three calls, tripling the overhead from around 6% to around 19%.

Definition of the alternative Padding

I’ll suggest an alternative padding, that’s more efficient for certain message lengths, and IMO simpler. I’m not a good cryptographer, so it’s very well possible that this padding is totally insecure. If so, please tell me ;)

Blake256 uses a 64 bit counter variable, that counts the bits of the message up to the current block. Now my suggestion is turning this into a 62 bit counter that counts the bytes of the message and using the other two bits as flags. Bit 63 marks the final block and bit 62 on the final block indicates that bit padding was used.

For each block except the final block, set the counter to the number of bytes up to this point, i.e. to (i+1)*64. For the final block there are three different cases:

  1. If the message length is zero, set the block data to all zero, and the counter to 2^63. This is similar to case 2), except that the padding consists of 512 bits, instead of 0 bits.
  2. If the message length is an integral number of bytes, set the counter to 2^63 | MessageLengthInBytes and pad the message to the next multiple of 512 bits. No padding required if it’s already a multiple of 512 bits.
  3. If the message length isn’t an integral number of bytes, append a single 1 bit, set the counter to 2^63 | 2^62 | MessageLengthInBytes and pad the message to the next multiple of 512 bits. The partial byte counts as a full byte in MessageLengthInBytes.

Then hash the final block.

Consequences of the alternative Padding

  1. The counter values are still unique
  2. The final block is clearly tagged, preventing length extension
  3. The only case where more than Ceiling(MessageLength/BlockSize) compression function calls are needed is the empty message.
  4. The padding is simpler for the common case, where the message consists of bytes. Simply pad with zeros, and set a bit in the counter. It isn’t much more complicated in the bitstream case.
  5. The maximal message length is slightly larger, 2^62-1 bytes = 2^65-8 bits instead of 2^64-1 bits. That difference is irrelevant in practice, but it obviously still fulfills the NIST length requirement.

It’s trivial to transfer this padding to the 128 bit counter Blake512 uses. Just use a 126 bit byte-counter, and use bits 126 and 127 as the flags.