Merkle Trees In Go

Merkle trees are a pretty exotic sounding data structure but once you get the hang of them they’re actually deceptively simple and pretty useful.

I thought I’d share my home-grown implementation of a Merkle tree which is by no means super-efficient, but hopefully it’s educational!

A what tree?

First, let me provide some context. The motivation behind Merkle trees is pretty interesting.

Imagine you have a lot of files on your computer - say they’re black and white Mickey Mouse cartoons your grandfather recorded back in the day. You’ve digitized them and now you want to upload them to some cloud storage services, like Google Drive or Dropbox.

However, you’re a careful person. You want to make sure the storage service doesn’t screw you over and so that you can potentially claim damages in court for losing such valuable Mickey Mouse videos (note: I am not a lawyer, I don’t know if this would work :-) ). So you calculate a cryptographic hash (say you’re a SHA-256 fan) of each video file and you keep it handy so that the next time you download a file from the storage service, you hash it and compare it to the checksum you calculated prior.

Since good cryptographic hashes like SHA-256 are computationally collision-free, this is an effective way to check if the file changed under you.

But now you have to keep a lot of hashes around. You need to keep one for each file.

Is there a way where you don’t have to keep one hash per file?

We’re about to find out :-).

Short Summaries of Knowledge

Suppose you wanted to do something simpler than what is described above. Suppose you want to convince to somebody that you have a lot of data. But you don’t want to have to send them all that data.

Let’s start with the simple case. Suppose you have two files, A and B. How can you convince them that you know the contents of both?

The simple approach is you can hash both files separately, and send over both hashes to whoever wants to know. Then if they demand to verify the hashes, you can send over the files and have them verify it that way. This protocol is inefficient, but it is merely to showcase the usefulness of hashes as constantly sized (say, 256 bits in the case of SHA-256) summaries of a potentially large amount of information.

But can we prove the same idea - that you know the contents of two files A and B - with less information than two separate hashes? As you can probably guess, the answer is yes!

What if you do something like this:

a_hash = hash(A)
b_hash = hash(B)
# suppose || is the concatenation operator
ab_hash = hash(a_hash || b_hash)

Now you send over ab_hash - the hash of the bytewise-concatenated hashes of both A and B. Is ab_hash a proof that you have the contents of A and B?

The answer is yes. The reason for this solely rests on the properties of cryptographic hashes. Because collisions are very, very unlikely, the chance that you sent over a hash that exactly matches the hash of the concatenated hashes of A and B are vanishingly small.

Of course, there is a chance that cryptographic hashes will have collisions. However, this chance is so incredibly small, that it can be discarded in practice.

How would one verify this “proof”? In the case of the two files, I could just reveal these two files, and the validator can simply check that ab_hash == hash(a_hash || b_hash). Again, not efficient, but might be okay for just two files.

How can we generalize this to much more data?

It’s Hashes All The Way Up

We’ve seen short summaries in the case where there are only two files (or really, any two bytestrings) that we care about. What if we have 32 files now? Is there a way to still have a constant sized summary of all that data?

As you might expect, the answer is yes!

But how?

This is where the trees come in.

The trick is, pair-wise hash each of these 32 files, so that you get 16 hashes. Then pair-wise hash those 16 hashes, so that you get 8 hashes. Then 4 hashes. Then two hashes. Then one hash. That final hash is the root of a hash tree, and is a constant-sized summary of all 32 files that you have. Neat, right?

Here’s how this could look like in Go:

// Bytes32 is a convenience type to represent a 32 byte slice.
type Bytes32 [sha256.Size]byte

// level represents a level in the complete binary tree that
// is the merkle tree.
type level []Bytes32

// Tree represents a merkle tree.
type Tree struct {
	levels []level // starting from bottom, going to the top
}

// New constructs a new merkle tree given some data.
func New(data [][]byte) (*Tree, error) {
	if len(data)&(len(data)-1) != 0 {
		return nil, fmt.Errorf("data length must be exact power of two, got: %d", len(data))
	}

	// build the bottom-most level of the tree by hashing the passed in data
	var bottom level
	for i := 0; i < len(data); i += 2 {
		left := sha256.Sum256(data[i])
		right := sha256.Sum256(data[i+1])
		bottom = append(bottom, left)
		bottom = append(bottom, right)
	}

	var (
		allLevels = []level{bottom}
		exponent  = math.Log2(float64(len(data)))
	)
	// build the tree in a bottom up fashion, starting
	// from the deepest level.
	// level i + 1 is constructing by pairwise hashing the nodes
	// on level i.
	for l := 1; l <= int(exponent); l++ {
		var (
			prevLevel = allLevels[l-1]
			currLevel level
		)
		for n := 0; n < len(prevLevel); n += 2 {
			currLevel = append(currLevel, sha256.Sum256(common.Concat(prevLevel[n][:], prevLevel[n+1][:])))
		}
		allLevels = append(allLevels, currLevel)
	}
	return &Tree{
		levels: allLevels,
	}, nil
}

Not the cleanest piece of code, but it works! It only handles clean powers of two, but you can probably easily generalize it to work with any amount of data.

But this is just the beginning - now what? We have this big tree, but what can we do with it?

Below is a diagram of a Merkle tree, courtesy of Investopedia.

Merkle tree example

Merkle Proofs

Now we get to the case where you want to download your Mickey Mouse videos. It just so happens that the storage service you’re using has a feature called “Merkling” where they claim that they can prove to you that your files have not been tampered with and that you only need constant space on your end to verify it!

You can’t believe your eyes and ears when you hear the salesman telling you this. He must be screwing with me, you think. But no! This time, the salesman is in fact saying something plausible!

Suppose that right after you uploaded your files to the storage service, they went ahead and created the Merkle tree for those files (i.e, using some variant of the algorithm above). In their UI, they also give you the root hash of this tree - this is the root node.

Then, when you download episode 18 from Mickey Mouse’s adventures in Orlando, you’ll get something interesting alongside it: a Merkle proof.

A Merkle proof is also quite simple - it’s a selection of nodes from the Merkle tree, which, alongside the hash of the data you are requesting, will let you verify in O(D) time (where D is the depth of the tree) that the data you fetched is in fact correct.

A Merkle proof consists of all of the sister nodes going up the tree from this leaf node. This is best understood from a diagram, here, courtesy of the tutorial on ethereum.org:

Merkle proof example

In the image above, we’re trying to generate the Merkle proof of the node marked “C”. What we do first is we grab the sibling of “C”, which is “D”. Then, we grab the sibling of the parent of “C”, which is H(A -> B). Then we grab the sibling of the parent of H(A -> B), which is H(E -> H). At this point, we’re done.

Here’s what it could look like in Go:

type Proof struct {
	Hashes    []Bytes32
	LeafIndex int
}

func (t *Tree) ProofFor(i int) (p Proof, err error) {
	if i > len(t.levels[0])-1 || i < 0 {
		return Proof{}, errors.New("leaf node index out of bounds")
	}

	var (
		siblingIndex = getSiblingIndex(i, 0)
		sibling      = t.levels[0][siblingIndex]
	)
	p.Hashes = append(p.Hashes, sibling)

	// now that we have the element and it's sibling, we walk up the levels
	// of the tree to get the nodes that are needed to complete the proof.
	var (
		currIndex = i
		currLevel = 1
	)
	for currLevel < len(t.levels)-1 {
		parentIndex := currIndex / 2
		parentSiblingIndex := getSiblingIndex(parentIndex, currLevel)
		parentSibling := t.levels[currLevel][parentSiblingIndex]
		p.Hashes = append(p.Hashes, parentSibling)
		currIndex = parentSiblingIndex
		currLevel++
	}
	p.LeafIndex = i
	return p, nil
}

func getSiblingIndex(i, level int) int {
	if i%2 == 0 {
		return i + 1
	}
	return i - 1
}

The reason the proof needs to contain the siblings of the parents is because that is the information that goes into the root that you would not have the ability to compute on your end (since it comes from other files that you may not have available).

Verifying The Proof

Alright, so we have the Merkle root, and we have a so-called “proof” - how do we verify it?

The verification procedure needs three things: the hash of the leaf node we care about. This is easy - you just hash the Mickey Mouse video you downloaded. Then, you need the Merkle root. Again, easy - you kept it around when the storage service returned it to you. And of course, you need the proof.

To verify it, we need to hash our way up the tree and get the Merkle root as it is from the proof. Then, we compare it to the Merkle root we have.

Here’s what it would look like in Go:

func Verify(proof Proof, leaf, root Bytes32) bool {
	var (
		hash      = leaf
		leafIndex = proof.LeafIndex
	)
	for _, p := range proof.Hashes {
		if leafIndex%2 == 0 {
			// sibling is a left node, so concat that hash first then the proof node
			hash = sha256.Sum256(common.Concat(hash[:], p[:]))
		} else {
			// sibling is a right node, so concat proof node first then the sibling
			hash = sha256.Sum256(common.Concat(p[:], hash[:]))
		}
		leafIndex /= 2
	}

	return bytes.Equal(hash[:], root[:])
}

Updating Leaves

Another advantage of Merkle trees are the fast updates to the Merkle root. Suppose you had to modify one of the leaf nodes and recalculate the Merkle root. What would you do?

It’s very similar to building the proof - start from the leaf you want to update, and recalculate the parent hashes all the way until you get to the root. The complexity of the operation is again O(D) where D is the depth of the tree.

Here’s what it would look like in Go:

func (t *Tree) Update(index int, data []byte) error {
	if index < 0 || index >= len(t.levels[0]) {
		return errors.New("index out of range")
	}

	// update the hash of the leaf and iteratively update parents
	dataHash := sha256.Sum256(data)
	t.levels[0][index] = dataHash

	for lev := 0; lev < len(t.levels)-1; lev++ {
		var (
			newParent   Bytes32
			current     = t.levels[lev][index]
			sibling     = t.levels[lev][getSiblingIndex(index, lev)]
			parentIndex = index / 2
		)
		if index%2 == 0 {
			newParent = sha256.Sum256(common.Concat(current[:], sibling[:]))
		} else {
			newParent = sha256.Sum256(common.Concat(sibling[:], current[:]))
		}
		t.levels[lev+1][parentIndex] = newParent
	}

	return nil
}

How Are Merkle Trees Used?

Merkle trees are really useful in blockchain settings.

Take for example Bitcoin. Each block could have a bunch of transactions. Alice wants to check if her transaction made it into a particular block.

Merkle roots for a Bitcoin block are constructed with the transaction data as the leaves of the tree. They are then pairwise-hashed (similar to what we did above - though slightly adapted to support a non-power-of-two number) in a tree fashion until the root is determined.

All a Bitcoin node would have to do is to send Alice a small proof - size O(D) where D is the depth of the complete binary Merkle tree - and Alice can then verify that the Merkle root she calculates is what is stored for that Bitcoin block.

This is great because the bandwidth required is very small. Alice doesn’t need to have ALL the transactions in the block to verify that her’s is in there.

Conclusion

And that’s it! Hopefully it was easy enough to follow.

Written on February 11, 2023