By last blog, I have presented how to locally simulate a simple Proof-of-Work(PoW) Blockchain in Java with multithreading. In this blog, I will make proper modifications to implement a more complex Blockchain inspired by the idea of sharding.
Basic Idea of Sharding in Blockchain
Sharding is originally a concept popular in database area. It is the idea of dividing a huge chunk of data into several disjoint subsets, and each of the subset is called a shard. The reason recent researchers have been interested in applying sharding into Blockchain is that this concept seems to be helpful to solve the scalability problem of Blockchain technology. The mining congestion happens when the transactions producing rate is higher than the transactions processing rate of the Blockchain system, which causes longer and longer waiting time for the transactions confirmation as the market keeps growing. With the ideas of sharding and parallel mining, we can divide the overloaded transactions into smaller disjoint shards and allow multiple concurrently maintained sub-chains to record them. Ideally, in this way, we should be able to improve the transactions processing rate dramatically.
Sharding Data Structure in my Simulation
As there is a variety of different ways of implementing sharding in Blockchain, I will apply a relatively simple approach here. Let’s treat the Block data structure from previous code as Sub-Block, and redefine the Block as a collection of Sub-Blocks. The entire Blockchain data structure should look like the figure below:
Now every mining iteration, based on the transactions load, the Blockchain system would decide to divide next Block into a proper number of shards(sub-blocks), and evenly distribute compute powers to each of the shards. Miners of different shards will concurrently compute the satisfying nounce for the shard they belong to. For N shards, there will be N winners. Each of them will be able to generate a new sub-block and all the sub-blocks will be gathered together as a whole new Block to be attached to the Blockchain.
Implementation Modifications
First, the blockchain variable in Beecoin class should now be implemented as a ArrayList of Block Array:
private static ArrayList<Block[]> blockchain = new ArrayList<>();
Every mining iteration, we will first decide the number of shards we want, which is the length of the Block array, according to two things:
1. the transactions load
2. the number of miners
The reason we want to also consider the miners quantity is that we don’t want to have too many shards that there is not enough miners in charge of each shard. When there is only little miners for each shard, the safety of the shard is harmed. In my simulation, I simply assume that each shard will have at least two miners in charge of it.
We can compute the number of shards needed for each Block in this way:
desiredShards = (# of unconfirmed transactiosn) / (Capacity of each sub-block)
maxShards = (# of miners) / (minimum Miners per Shard)
IF desiredShards >= maxShards:
desiredShards = maxShards
Now, when we create a Miner thread for mining, we also want to tell it which shard it belongs to. Additionally, we need to modify the Miner class to let it support concurrent multi-blocks mining. The only things I need to change, is the class variables of Miner, which are used to support communication between Miner threads. They were designed to make sure all threads terminating when a solution is found by any thread. Now, they should make sure a thread only terminate when a solution is found by itself or a thread of the same shard. Thus, for N shards, there will be N final_nounce to be found. In other words, each of those class variables now should become an collection of what it was. Now the Miner class data members should look like this:
public class Miner extends Thread {
public static boolean solutionClaimed[]; // modified
public static int claimerID[]; // modified
public static int candidate[]; // modified
public static ArrayList consensusList = new ArrayList(); // modified
public static int minerNum = 0;
public static String final_nounce[]; // modified
public static int shardsNum; // newly added
public static int minersPerShard; // newly added
private int difficulty;
private long prevInfo;
private int shardID; // newly added
public int index;
public String solution;
}
Results and Discussion
After all code modifications needed for the upgrade from a single-chain to a sharded-chain being done, we can finally test and compare the performance.
First, I set up the PoW difficulty level and the number of miners to be 4 for both single chain and sharded chain, which made the only difference to be the Blockchain data structure. I let both the single chain and the sharded chain systems to run for 15, 20, and 25 mining iterations, and I observed the transactions confirming rate of them. I got Figure-1 as followed:
In Figure-1, we can compare the transactions confirming rate between the classic single chain and the chain with sharded blocks. It not only shows the general better performance of the sharded chain but also shows that as more and more mining iterations pass, the single chain easily stucks at a confriming rate bottleneck, while the sharded chain clearly has a bigger growing space of transaction processing rate.
If we focus on the number of backlog transactions after 15, 20, and 25 mining iterations for both single chain and sharded chain system, we can see a pattern as followed:
In Figure-2, the advantage of sharded chain is more obvious. When the single chain has more and more serious mining congestion problem, the sharded chain shows a perfect transactions processing capability that eliminates the mining congestion problem when mining iterations being less than 25.
These two figures indicate a huge improvement of transactions processing throughput when update the single chain to a sharded chain.
What’s more, the sharded chain can have even better performance as it showed in Figures above. When the number of miners in the network grows, the transaction processing capability of the system grows with it.
I let the sharded chain system run for 15, 20, 25, 35, and 50 iterations with 4 miners and 6 miners. The difficulty level remained 4. It was possible to have at most 2 shards for the 4-miners system and 3 shards for the 6-miners system. Notice that as more and more iterations passed, the transactions load in my simulation tended to grow exponentially.
In Figure-3 and Figure-4, we can see clearly that the transactions confirming rate grows quicker with 6 miners. Also the number of backlog transactions grows much slower with 6 miners. This shows how dramatically the increase of miner numbers can improve the transactions processing capability of the system.
Conclusion
In this simulation, some assumptions have been made in order to focus on my main concern and avoid unnecessary complexity of the model. As the simulation successfully shows how sharding and parallel mining can dramatically improve the transactions processing capability, it gives us hope that we can apply sharding to the real world Blockchain to promote the transactions processing throughput to make it more realistic to apply Blockchain technology to more areas.
In the real world Blockchain systems with distributed network envolved, to be able to upgrade a single chain to a sharded chain, extra works are required on managing the communications among different shards. In the future, I will explore how to maintain a safe and efficient cross-shards communications so that not only the transactions processing throughput are enhanced but also the double-spending problem can be eliminated and the safety of the network is kept.