This article was published as a part of the Data Science Blogathon.
Web3 is the latest buzzword in the world of technology. It revolves around the concept of a decentralized web, primarily built on blockchain. Blockchain has been around for a while now and came into the limelight because of Bitcoin. Many people confuse the terms Bitcoin and blockchain and consider them to be the same, but Bitcoin is only one application of blockchain technology, albeit the best-known one. A blockchain is a type of decentralized database that is persistent and tamper-proof: it stores data in blocks, each of which is fingerprinted using a cryptographic hashing algorithm. Hashing is not encryption; the data in a block stays readable, but any change to it changes the hash. In principle, any type of data can be stored in a blockchain, but it is mostly used to store transaction details, acting as a digital ledger. The decentralized architecture of both Web3 and blockchain is meant to ensure that the data is not owned by a single person or entity, which has been a downside of Web2.
Each block in a blockchain contains a hash value, which identifies the block. Blocks are linked by fingerprinting: when a new block is added to the end of the blockchain, the hash of the last existing block is included in the data that is hashed to produce the new block’s hash. Changing any earlier block would therefore change its hash and break the link to every block after it, which is what makes the blocks in a blockchain tamper-proof.
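To make the fingerprinting idea concrete, here is a minimal sketch. The `fingerprint` helper and the sample strings are hypothetical and are not part of the blockchain built later in this article; the sketch only illustrates how feeding the previous hash into the next one chains the blocks together.

```python
import hashlib


def fingerprint(data: str, previous_hash: str) -> str:
    """Hash a block's data together with the previous block's hash (illustrative helper)."""
    return hashlib.sha256((previous_hash + data).encode("utf-8")).hexdigest()


# Build a tiny chain of three fingerprints; the starting value is arbitrary.
start = "0" * 64
h1 = fingerprint("first block", start)
h2 = fingerprint("second block", h1)
h3 = fingerprint("third block", h2)

# Editing the first block changes its hash, and therefore every hash after it,
# even though the second block's own data is untouched.
h1_tampered = fingerprint("first block (edited)", start)
h2_tampered = fingerprint("second block", h1_tampered)

print(h1 == h1_tampered)  # False
print(h2 == h2_tampered)  # False
```

Because each hash depends on the one before it, a single edit early in the chain shows up in every later fingerprint.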
In this article, we will be building a simple blockchain in Python that will store some text information from users. The blockchain technology used in the industry is far more complex than the blockchain that we will be building here, but this is enough to understand blockchain technology from an implementation perspective.
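Each block on our blockchain will have the following properties: index (the position of the block in the blockchain), sender (the name of the user who created the block), timestamp (the time at which the block was created), info (the text stored in the block), previous_hash (the hash of the preceding block), nonce (a number found by trial and error so that the hash meets the difficulty requirement), and hash (the SHA256 hash of the block).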
The first block of a blockchain is called the genesis block. Since the blockchain is empty when the genesis block is created, there is no preceding block to take the preceding hash from. In this case, the preceding hash is generated from a secret specified by the creator of the blockchain. This ensures that all the blocks in a blockchain have the same structure.
Each blockchain has a difficulty level associated with it. In our blockchain, it specifies the number of leading characters of the hash that must be 0. To satisfy this condition, we have the nonce property, a whole number that is varied until the block’s hash starts with the required number of zeros. The hashing algorithm used here, as in Bitcoin, is SHA256, and there is no known shortcut for working out in advance which nonce produces a suitable hash. Trial and error is the only way to find the nonce, so we need to run a loop that tries one value after another. This process of guessing a nonce that produces a hash meeting the requirement is called mining; it is computationally expensive and time-consuming, but it is necessary in order to add a block to the blockchain.
We will set the difficulty level to 4 for our blockchain, so the first 4 characters of each hexadecimal hash must be ‘0000’. Bitcoin’s difficulty is far higher and is adjusted periodically so that mining a block takes roughly 10 minutes on average.
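The nonce search described above can be sketched on its own before we build the full class. The `mine` function and the sample payload below are assumptions made for illustration; they are not the article’s blockchain code. Each additional leading zero in a hexadecimal hash makes a matching hash 16 times less likely, so the number of attempts grows quickly with the difficulty.

```python
import hashlib
from typing import Tuple


def mine(payload: str, difficulty: int) -> Tuple[int, str]:
    """Try nonces 0, 1, 2, ... until the hash starts with `difficulty` zeros."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{payload}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(target):
            return nonce, digest
        nonce += 1


# Print the difficulty, the first nonce that works, and the start of the hash.
for difficulty in range(1, 5):
    nonce, digest = mine("hello blockchain", difficulty)
    print(difficulty, nonce, digest[:12])
```

Running this shows the winning nonce never getting smaller as the difficulty rises, since any hash with four leading zeros also has three.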
Now that we have a basic understanding of the blockchain that we will be building in this article, let’s get our hands dirty and start building.
There is nothing fancy here; all the modules used in building the blockchain are part of Python’s standard library and can be imported directly without having to install them using pip. We will be using the hashlib module for SHA256 hashing, the time module to fetch the block generation time, and the pprint module to display the blocks in a readable form.
import hashlib
from time import time
from pprint import pprint
We will define a class called blockchain with three properties: blocks, secret, and difficulty. The blocks property will store all the blocks on the blockchain, the secret property will be used as the previous hash for the genesis block, and the difficulty property holds the difficulty level of 4. We will define three functions, namely create_block, validate_blockchain, and show_blockchain.
The create_block function creates a new block and appends it to the blocks property of the blockchain. The properties of each block explained earlier are implemented here, and the nonce that makes the hash start with four zeros is calculated.
The validate_blockchain function checks the integrity of the blockchain. It checks the fingerprinting between each pair of adjacent blocks, since each block should contain the correct hash of the previous block, and prints whether the blockchain is valid. If there are any discrepancies, it is safe to assume that someone has meddled with the blocks on the blockchain. This property is what makes blockchains tamper-proof. Note that this simple check only compares the stored hashes of neighboring blocks; it does not recompute each block’s hash from its contents.
Finally, the show_blockchain function displays all the blocks on the blockchain.
class blockchain():
    def __init__(self):
        self.blocks = []
        self.__secret = ''
        self.__difficulty = 4
        # guessing the nonce
        i = 0
        secret_string = '/*SECRET*/'
        while True:
            _hash = hashlib.sha256(str(secret_string+str(i)).encode('utf-8')).hexdigest()
            if(_hash[:self.__difficulty] == '0'*self.__difficulty):
                self.__secret = _hash
                break
            i+=1

    def create_block(self, sender:str, information:str):
        block = {
            'index': len(self.blocks),
            'sender': sender,
            'timestamp': time(),
            'info': information
        }
        if(block['index'] == 0): block['previous_hash'] = self.__secret # for genesis block
        else: block['previous_hash'] = self.blocks[-1]['hash']
        # guessing the nonce
        i = 0
        while True:
            block['nonce'] = i
            _hash = hashlib.sha256(str(block).encode('utf-8')).hexdigest()
            if(_hash[:self.__difficulty] == '0'*self.__difficulty):
                block['hash'] = _hash
                break
            i+=1
        self.blocks.append(block)

    def validate_blockchain(self):
        valid = True
        n = len(self.blocks)-1
        i = 0
        while(i<n):
            if(self.blocks[i]['hash'] != self.blocks[i+1]['previous_hash']):
                valid = False
                break
            i+=1
        if valid: print('The blockchain is valid...')
        else: print('The blockchain is not valid...')

    def show_blockchain(self):
        for block in self.blocks:
            pprint(block)
            print()
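The validate_blockchain function above only compares each block’s stored hash with the previous_hash of the next block. As a possible extension, the sketch below also recomputes every hash from the block’s contents, so that an edit to a block’s info is caught as well. The names `block_hash`, `make_block`, and `is_chain_valid` are hypothetical helpers written for this example; they mirror the way create_block hashes `str(block)` and rely on dictionaries keeping their insertion order (Python 3.7 and later).

```python
import hashlib
from time import time

DIFFICULTY = 4  # same difficulty level as the class above


def block_hash(block: dict) -> str:
    """Recompute a block's hash from every field except 'hash' itself."""
    content = {key: value for key, value in block.items() if key != "hash"}
    return hashlib.sha256(str(content).encode("utf-8")).hexdigest()


def make_block(index: int, sender: str, info: str, previous_hash: str) -> dict:
    """Mine a block with the same field order that create_block uses."""
    block = {
        "index": index,
        "sender": sender,
        "timestamp": time(),
        "info": info,
        "previous_hash": previous_hash,
        "nonce": 0,
    }
    while not block_hash(block).startswith("0" * DIFFICULTY):
        block["nonce"] += 1
    block["hash"] = block_hash(block)
    return block


def is_chain_valid(blocks: list) -> bool:
    """Check stored hashes, the difficulty target, and the links between blocks."""
    for position, block in enumerate(blocks):
        if block["hash"] != block_hash(block):
            return False  # contents were changed after mining
        if not block["hash"].startswith("0" * DIFFICULTY):
            return False  # hash does not meet the difficulty target
        if position > 0 and block["previous_hash"] != blocks[position - 1]["hash"]:
            return False  # link to the preceding block is broken
    return True


genesis = make_block(0, "Ram", "first message", "0" * 64)
second = make_block(1, "Vishnu", "second message", genesis["hash"])
chain = [genesis, second]
print(is_chain_valid(chain))  # True

second["info"] = "edited after mining"
print(is_chain_valid(chain))  # False
```

Returning a boolean instead of printing also makes the check easier to reuse, for example in tests.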
Now that we have built the blockchain class, let’s use it to create our blockchain and add some blocks to it. I will add 3 blocks to the blockchain and will validate the blockchain and finally print the blocks and look at the output.
Python Code:
b = blockchain()
b.create_block('Ram', 'Python is the best programming language!!')
b.create_block('Vishnu', 'I love cybersecurity')
b.create_block('Sanjay', 'AI is the future')
b.show_blockchain()
b.validate_blockchain()
We can see the blocks present on the blockchain, and the validate_blockchain function reports that the blockchain is valid. Now let’s meddle with our blockchain by inserting a new block between the existing blocks, and then run the validate_blockchain function again to see what it reports.
block = {
    'index': 2,
    'sender': 'Arjun',
    'timestamp': time(),
    'info': 'I am trying to tamper with the blockchain...'
}
block['previous_hash'] = b.blocks[1]['hash']
i = 0
while True:
    block['nonce'] = i
    _hash = hashlib.sha256(str(block).encode('utf-8')).hexdigest()
    if(_hash[:4] == '0'*4):
        block['hash'] = _hash
        break
    i+=1
b.blocks.insert(2, block)
b.show_blockchain()
b.validate_blockchain()
This is the output we get.
{'hash': '0000bfffcda53dc1c98a1fbaeab9b8da4e410bbcc24690fbe648027e3dadbee4',
 'index': 0,
 'info': 'Python is the best programming language!!',
 'nonce': 91976,
 'previous_hash': '000023ae8bc9821a09c780aaec9ac20714cbc4a829506ff765f4c82a302ef439',
 'sender': 'Ram',
 'timestamp': 1654930841.4248617}

{'hash': '00006929e45271c2ac38fb99780388709fa0ef9822c7f84568c22fa90683c15f',
 'index': 1,
 'info': 'I love cybersecurity',
 'nonce': 171415,
 'previous_hash': '0000bfffcda53dc1c98a1fbaeab9b8da4e410bbcc24690fbe648027e3dadbee4',
 'sender': 'Vishnu',
 'timestamp': 1654930842.8172457}

{'hash': '000078a974ba08d2351ec103a5ddb2d66499a639f90f9ae98462b9644d140ca9',
 'index': 2,
 'info': 'I am trying to tamper with the blockchain...',
 'nonce': 24231,
 'previous_hash': '00006929e45271c2ac38fb99780388709fa0ef9822c7f84568c22fa90683c15f',
 'sender': 'Arjun',
 'timestamp': 1654930848.2898204}

{'hash': '0000fe124dad744f17dd9095d61887881b2cbef6809ffd97f9fca1d0db055f2a',
 'index': 2,
 'info': 'AI is the future',
 'nonce': 173881,
 'previous_hash': '00006929e45271c2ac38fb99780388709fa0ef9822c7f84568c22fa90683c15f',
 'sender': 'Sanjay',
 'timestamp': 1654930845.594902}

The blockchain is not valid...
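We can see that the validate_blockchain function reports that the blockchain is not valid. The ‘AI is the future’ block still points to the hash of block 1 rather than to the hash of the inserted block that now precedes it, and two blocks now claim index 2. The fingerprinting no longer matches, and hence the integrity of our blockchain has been compromised.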
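To see exactly where the chain breaks, a small helper can report the position of the first mismatched link. The `first_broken_link` function is a hypothetical addition for illustration and is not part of the blockchain class; the hashes are copied from the output above, and the remaining fields are left out for brevity.

```python
from typing import Optional


def first_broken_link(blocks: list) -> Optional[int]:
    """Return the position of the first block whose previous_hash does not
    match the hash of the block before it, or None if every link holds."""
    for position in range(1, len(blocks)):
        if blocks[position]["previous_hash"] != blocks[position - 1]["hash"]:
            return position
    return None


# Only the fields needed for the link check are reproduced here.
chain = [
    {"sender": "Ram",
     "hash": "0000bfffcda53dc1c98a1fbaeab9b8da4e410bbcc24690fbe648027e3dadbee4",
     "previous_hash": "000023ae8bc9821a09c780aaec9ac20714cbc4a829506ff765f4c82a302ef439"},
    {"sender": "Vishnu",
     "hash": "00006929e45271c2ac38fb99780388709fa0ef9822c7f84568c22fa90683c15f",
     "previous_hash": "0000bfffcda53dc1c98a1fbaeab9b8da4e410bbcc24690fbe648027e3dadbee4"},
    {"sender": "Arjun",
     "hash": "000078a974ba08d2351ec103a5ddb2d66499a639f90f9ae98462b9644d140ca9",
     "previous_hash": "00006929e45271c2ac38fb99780388709fa0ef9822c7f84568c22fa90683c15f"},
    {"sender": "Sanjay",
     "hash": "0000fe124dad744f17dd9095d61887881b2cbef6809ffd97f9fca1d0db055f2a",
     "previous_hash": "00006929e45271c2ac38fb99780388709fa0ef9822c7f84568c22fa90683c15f"},
]

print(first_broken_link(chain))  # 3
```

Position 3 is the block from Sanjay: the inserted block links correctly to block 1, but the block after it was mined against block 1 as well, so its previous_hash no longer matches its new neighbor.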
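In this article, we discussed what a blockchain is and how its blocks are linked through their hashes, how mining finds a nonce that satisfies the difficulty level, and how to build, validate, and tamper with a simple blockchain in Python.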
To continue this project further, the blockchain can be deployed as a REST API server on the cloud so that users can store information on it. For the sake of simplicity, our blockchain is not distributed. If you are interested in using blockchain technology for your database, feel free to look at BigchainDB, which is a decentralized blockchain database. It provides support for both Python and Node.js. Alternatively, GunDB is a popular graph-based decentralized database engine that has recently been used in Web3 applications.
That’s it for this article (building blockchain using Python). Hope you enjoyed reading this article and learned something new. Thanks for reading and happy learning!
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.