What is a Merkle tree?
A Merkle tree, also known as a “binary hash tree”, is a hash-based data structure widely used in computer science applications.
It’s a tree-like structure where each leaf node is a hash of a block of data, and each non-leaf node is a hash of its children. Typically, each non-leaf node in a Merkle tree has 2 children.
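To make the structure concrete, here is a minimal Python sketch that builds a Merkle tree bottom-up and returns its root. The names `sha256` and `merkle_root` are made up for this example, and duplicating the last hash when a level has an odd number of nodes is just one common convention that I am assuming here; real systems differ in the details.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Return the 32-byte SHA-256 digest of data."""
    return hashlib.sha256(data).digest()


def merkle_root(blocks: list[bytes]) -> bytes:
    """Compute the root hash of a binary Merkle tree over the given data blocks."""
    if not blocks:
        raise ValueError("need at least one data block")
    # Leaf level: each leaf is the hash of one data block.
    level = [sha256(block) for block in blocks]
    # Hash pairs of nodes together until a single root remains.
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption for this sketch: pad an odd level by repeating its last hash.
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


blocks = [b"block-0", b"block-1", b"block-2", b"block-3"]
root = merkle_root(blocks)
print(root.hex())
```

Changing even a single byte in any block changes the leaf hash, which changes its parent, and so on all the way up to the root.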
Merkle trees enable efficient data verification in distributed systems. They are efficient because computers exchange and compare hashes instead of entire files. A hash is a short, fixed-size fingerprint computed from data of any size by a one-way hash function; it is not an encrypted copy of the data, and the original file cannot be recovered from it. Currently, Merkle trees are widely used in distributed and peer-to-peer systems such as Bitcoin and other blockchains, Git, Tor, etc.
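As a quick illustration of what a hash gives you, the sketch below uses SHA-256 from Python's standard `hashlib` module (my choice for the example, not something a Merkle tree requires). The digest has the same size no matter how large the input is, and a one-character change produces a completely different digest.

```python
import hashlib

small = b"hello"
large = b"x" * 1_000_000  # roughly 1 MB of data

small_digest = hashlib.sha256(small).digest()
large_digest = hashlib.sha256(large).digest()

# Both digests are 32 bytes long, regardless of the input size.
print(len(small_digest))  # 32
print(len(large_digest))  # 32

# Two inputs that differ in a single character hash to different values.
digest_a = hashlib.sha256(b"hello world").hexdigest()
digest_b = hashlib.sha256(b"hello worle").hexdigest()
print(digest_a == digest_b)  # False
```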
The importance of Merkle tree in blockchain
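Merkle trees are used to efficiently verify data. For example, if Bitcoin didn’t have Merkle trees, every participant who wanted to verify a transaction would have to keep a complete copy of all the data and each transaction that ever occurred on Bitcoin. With Merkle trees, full nodes still store the complete chain, but a lightweight client can keep only the block headers and check a transaction with a short proof.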
Can you imagine how tedious that would have been?
Any verification request would have required an extremely large amount of data to be sent over the network. Without hashes to summarize the data, each computer would also have to use a lot of computing power to process all of it and verify it.
Merkle trees solve this issue. They summarize large data records in small hashes, so only a small amount of data (a handful of hashes) has to travel across the network to prove that a record is valid.
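Here is a rough sketch of how such a small proof can work. The helper names (`build_levels`, `make_proof`, `verify`) are hypothetical, and the padding rule for odd levels is an assumption of this example. The verifier only needs the data block, one sibling hash per tree level, and the trusted root.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(blocks):
    """Return all tree levels, from the leaves (index 0) up to the root."""
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: pad an odd level by repeating its last hash.
            level = level + [level[-1]]
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_proof(levels, index):
    """Collect the sibling hash at every level on the path from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1  # the other node in the same pair
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return proof


def verify(block, proof, root):
    """Recompute the path to the root using only the block and the proof."""
    node = h(block)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


blocks = [f"tx-{i}".encode() for i in range(8)]
levels = build_levels(blocks)
root = levels[-1][0]
proof = make_proof(levels, 5)

print(len(proof))                         # 3
print(verify(blocks[5], proof, root))     # True
print(verify(b"forged tx", proof, root))  # False
```

For 8 blocks the proof is only 3 hashes long, and it grows with the logarithm of the number of blocks rather than with the total amount of data.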
After the crash of FTX (a centralized crypto exchange), Merkle tree based proof-of-reserves verification has become the need of the hour.
Merkle tree chart explanation
In distributed and peer-to-peer systems, data verification is very important. The same data exists in many locations, so if a piece of data is modified in one location, it must be changed everywhere. This is why it’s important to be able to check that every location holds the same data.
In a Merkle tree, the top-most node is called the “root” node. Each non-leaf node has 2 child branches, and the nodes at the very bottom are called “leaf” nodes, as shown in the image above. The leaves hold hashes of the data blocks, and every node above them holds a hash computed from its children with a secure hash function.
The intention is to limit the amount of data being sent over the network: instead of sending an entire file, we just send the hash of the file to check if it matches.
The protocol of a Merkle tree
1. Computer A sends the root hash of its Merkle tree for the file to computer B.
2. Computer B compares that hash against the root of its own Merkle tree.
3. If there is no difference, the job is done! Otherwise, go to step 4.
4. If there is a difference in a hash, computer B requests the hashes of the two children of the node that didn’t match.
5. Computer A finds the requested hashes and sends them back to computer B.
6. The computers repeat steps 4 and 5 until they have found the data block(s) that are inconsistent. It’s possible to discover more than 1 incorrect data block.
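The steps above can be sketched in Python as a top-down walk over two trees. This is a simplified simulation, not a network protocol: both trees live in the same process, and reading from `tree_a` stands in for computer B requesting hashes from computer A. The function names are hypothetical, and I assume both sides hold the same number of blocks and that the number is a power of two.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(blocks):
    """Return the tree as a list of levels, root first and leaves last.

    Assumes len(blocks) is a power of two, to keep the sketch short.
    """
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    levels.reverse()
    return levels


def find_mismatches(tree_a, tree_b):
    """Return the indices of the data blocks whose hashes differ."""
    # Steps 1 to 3: compare the roots; identical roots mean identical data.
    if tree_a[0][0] == tree_b[0][0]:
        return []
    suspects = [0]  # nodes at the current depth whose hashes differ
    # Steps 4 to 6: descend only into children that do not match.
    for depth in range(1, len(tree_a)):
        next_suspects = []
        for idx in suspects:
            for child in (2 * idx, 2 * idx + 1):
                if tree_a[depth][child] != tree_b[depth][child]:
                    next_suspects.append(child)
        suspects = next_suspects
    return suspects


blocks_a = [f"chunk-{i}".encode() for i in range(8)]
blocks_b = list(blocks_a)
blocks_b[2] = b"corrupted"
blocks_b[7] = b"also corrupted"

bad = find_mismatches(build_tree(blocks_a), build_tree(blocks_b))
print(bad)  # [2, 7]
```

Matching subtrees are never explored, so the number of hashes exchanged stays small when only a few blocks differ.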
Merkle trees are useful in a peer-to-peer network system to verify information, as some information comes from untrusted sources (which is a concern in peer-to-peer systems).
Merkle tree use cases
As mentioned earlier, a Merkle tree is especially useful in distributed systems where the same data should exist in multiple locations.
Git is a version control system popularly used by programmers. Every user keeps a full copy of the repository on their own computer. Hence, it’s imperative to check that changes are consistent and applied across everyone’s computers. Git does this by identifying every file, directory, and commit by a hash of its contents.
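To give a feel for how Git ties identity to content, the sketch below computes the object ID Git assigns to a file's contents (a “blob”) in a repository using the default SHA-1 object format: the hash covers a short header (`blob`, the content length, and a NUL byte) followed by the content. The function name `git_blob_id` is made up for this example. Directories and commits are hashed in a similar spirit over the IDs of the objects they reference, which is what gives Git its Merkle-tree-like structure.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git would assign to a blob with this content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


empty_id = git_blob_id(b"")
print(empty_id)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

# Identical content always gets the same ID; any change yields a new one.
id_v1 = git_blob_id(b"print('hello')\n")
id_v2 = git_blob_id(b"print('hello!')\n")
print(id_v1 == git_blob_id(b"print('hello')\n"))  # True
print(id_v1 == id_v2)                             # False
```

If you have Git installed, `git hash-object <file>` should print the same ID for a file with the same bytes.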
Bitcoin is a blockchain-based pseudonymous currency. All Bitcoin transactions are stored in blocks, and a copy of this blockchain exists on every full node in the Bitcoin network.
The leaves of a Bitcoin Merkle tree are hashes of the individual transactions in a block, and the resulting Merkle root is stored in the block header. Every time someone wants to alter the blockchain, for example to add a transaction to the chain, this change needs to be reflected everywhere.
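As a rough sketch of the Bitcoin flavour of this idea: Bitcoin hashes with double SHA-256 and, when a level has an odd number of hashes, pairs the last one with itself. The transaction bytes below are placeholders rather than real serialized transactions, the function names are hypothetical, and byte-order details (block explorers display transaction IDs byte-reversed) are left out.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin uses for transaction IDs and Merkle nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def bitcoin_style_root(txids: list[bytes]) -> bytes:
    """Combine transaction hashes pairwise until one Merkle root remains."""
    if not txids:
        raise ValueError("a block contains at least one transaction")
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Placeholder transaction data, for illustration only.
raw_txs = [b"hypothetical tx 0", b"hypothetical tx 1", b"hypothetical tx 2"]
txids = [dsha256(tx) for tx in raw_txs]

root = bitcoin_style_root(txids)
print(root.hex())

# Tampering with any transaction changes the root stored in the block header.
tampered = [dsha256(b"hypothetical tx 0 (edited)")] + txids[1:]
print(bitcoin_style_root(tampered) == root)  # False
```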
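Tampering with existing data is particularly difficult in Bitcoin’s blockchain. Let’s say an intruder wants to modify the chain for their personal benefit. Changing a transaction changes the Merkle root of its block, so the attacker would have to redo the proof-of-work for that block and every block after it. As Satoshi Nakamoto argues in the Bitcoin whitepaper, this is not practical unless the attacker controls a majority (more than 50%) of the network’s computing power.
Merkle trees are significant for blockchain technology and are a concept used by many developers in the web3 ecosystem.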
I hope you benefited from this explanation. In case of any doubts, feel free to drop your questions in the comment section below.
Bharvi out 🤙🏼