Analytics Vidhya has long been at the forefront of imparting data science knowledge to its community. With the intent to make learning data science more engaging, we began a new initiative: “DataHour”.
DataHour is a series of webinars by top industry experts where they teach and democratize data science knowledge. On 31st May 2022, we were joined by Valerii Babushkin for a DataHour session on “Introduction to Blockchain.”
Valerii Babushkin works at Blockchain.com as Head of Data Science. Before that, he worked at Facebook on WhatsApp User Data Privacy as a Staff Engineer, at Alibaba Russia as the VP of Machine Learning, and at X5 Retail Group as a Senior Director of Data Science. Valerii is also a Kaggle Competition Grand Master, ranked globally in the top 30.
Are you excited to dive deeper into the world of AI and Blockchain? We’ve got you covered. Let’s get started with the major highlights of this session: Introduction to Blockchain.
Introduction
Every one of you must have heard of blockchain. It’s a revolutionary technology with the ability to reduce risk and fraud in a scalable manner. Through this session we’ll learn:
why blockchain is a computer,
how this computer is layered, and
what different blockchain environments we have at the moment.
Blockchain as a Technology
Blockchain –> A virtual computer that runs on top of a network of physical computers and provides strong, auditable, game-theoretic guarantees that the code it runs will continue to operate as designed. The design of this technology is displayed in the image Architecture of Blockchain.
The Architecture of Blockchain Computers
The nodes can number in the thousands or even millions. Then come the consensus mechanism, the memory (which is the blocks), and the processors (which are virtual machines). This is a computer with a special feature: provable trust guarantees. It is composed of nodes, which are physical computers joined together, with a consensus mechanism that is always at the center. But it’s not a new phenomenon, because all of these things (many nodes, memory, processors, and consensus mechanisms) existed before.
Difference Between State Machine Replication (1980) and the New Aspect: Open Consensus (2020)
Things like many nodes, memory, processors, and consensus mechanisms all started in the 1980s. Google, Amazon, and Bank of America all have lots of servers. What they need is to ensure that the state is consistent across all servers, and they have a known number of servers that are authorized.
What’s open consensus? It is the new aspect: in open consensus, anyone can write new data to the blockchain with no authorization, and with an unknown number of servers. In state machine replication you had authorization and you knew how many servers you would have; here you have neither, and reaching consensus under those conditions was shown to be impossible. But Nakamoto in 2008 found a way to bypass the lower bound using proof of work.
What does that mean, and why is it impossible? Let’s imagine we have a lottery in which the probability of winning the prize is uniformly distributed. If we have three people, each person has a one-third chance to win. But now imagine that one of these three people tells you: I’m not one person, I’m a thousand people. Now we have a thousand and two people, and this person’s chance to win is 1000 divided by 1002, which is pretty high: they have almost won. Or this person could claim to be a million people, and now the chance is even closer to one. The person who wrote the original blockchain paper, Satoshi Nakamoto, solved this with proof of work.
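The lottery arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch; the function name `win_probability` is made up for this example and does not come from the session.

```python
from fractions import Fraction


def win_probability(claimed_identities: int, honest_participants: int) -> Fraction:
    """Chance that one party wins a lottery that gives one uniform ticket per identity."""
    total = claimed_identities + honest_participants
    return Fraction(claimed_identities, total)


# Three honest people: each one has a 1/3 chance.
print(win_probability(1, 2))

# One of the three claims to be a thousand people: 1000 / 1002.
print(float(win_probability(1000, 2)))

# Claiming to be a million people pushes the chance even closer to one.
print(float(win_probability(1_000_000, 2)))
```

Because identities cost nothing to claim, the attacker can push the probability as close to one as they like, which is why a lottery over identities cannot work in an open setting.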
Open Consensus: How?
How can we solve open consensus? Let’s give the prize not according to the number of people but according to the actual work these people, entities, or machines have done. You could try to fake it and say, look, I’m a million people, but in proof of work you have to do the work of a million people yourself. That’s what both Bitcoin and Ethereum are doing right now: they are mining. This means they are solving a puzzle to create the next block, so each party has a chance to receive its winnings in the lottery, that is, the block, according to its capability to solve the puzzle. The problem here is obvious: it is slow and it wastes energy. So what else is there? Proof of stake: fast block creation and no energy waste, but it is more complex. Moreover, it’s not that easy to get right.
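To make the puzzle idea concrete, here is a toy proof-of-work sketch in Python. It is an assumption-laden simplification: the puzzle here is "find a nonce so that the SHA-256 hash of the block data plus the nonce starts with a given number of hex zeros", and the names `mine` and `verify` are hypothetical. Real networks use their own block formats and difficulty rules.

```python
import hashlib


def _digest(block_data: str, nonce: int) -> str:
    """SHA-256 of the block data followed by the nonce, as a hex string."""
    return hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()


def mine(block_data: str, difficulty: int) -> int:
    """Try nonces 0, 1, 2, ... until the hash starts with `difficulty` hex zeros."""
    target = "0" * difficulty
    nonce = 0
    while not _digest(block_data, nonce).startswith(target):
        nonce += 1
    return nonce


def verify(block_data: str, nonce: int, difficulty: int) -> bool:
    """Checking a solution takes a single hash, however long the search took."""
    return _digest(block_data, nonce).startswith("0" * difficulty)


nonce = mine("example block", difficulty=4)
print(nonce, verify("example block", nonce, difficulty=4))
```

Finding the nonce takes many hash attempts, while verifying it takes one. Claiming to be a million miners does not help unless the claimant actually performs a million miners' worth of hashing.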
Example: Five years ago, Ethereum announced in a newspaper that it would switch to proof of stake in the next month or two. Two months ago, it announced again that it would switch to proof of stake in a month or two. What does this show? It shows that this is not an easy task. But probably by the end of the year Ethereum will have proof of stake, where you prove that you are as many entities as you claim to be by holding stake. Another approach is proof of space: instead of solving a puzzle, you prove that you have enough storage space, meaning that you have enough machines and entities. There are many more ideas, and a number of technologies are trying to incorporate them.
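The core idea of proof of stake, weighting the lottery by stake rather than by claimed identities, can be sketched as follows. This is a simplified illustration and not how Ethereum or any specific chain selects validators; the names and stake amounts are invented for the example.

```python
import math


def selection_probabilities(stakes: dict) -> dict:
    """Probability of being picked to create the next block, proportional to stake."""
    total = sum(stakes.values())
    return {name: stake / total for name, stake in stakes.items()}


# Hypothetical stakes: an attacker holds 1,000 units out of 10,000.
single = selection_probabilities({"honest": 9_000, "attacker": 1_000})
print(single["attacker"])

# The attacker now pretends to be 1,000 separate entities with 1 unit each.
split_stakes = {"honest": 9_000}
split_stakes.update({f"attacker_{i}": 1 for i in range(1_000)})
split = selection_probabilities(split_stakes)
attacker_total = sum(p for name, p in split.items() if name.startswith("attacker_"))

# Splitting the stake across fake identities does not change the overall chance.
print(math.isclose(attacker_total, single["attacker"]))
```

Unlike the uniform lottery over identities, creating extra identities here gains nothing, because the weight comes from the stake itself.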
The Result: Trust Guarantees
So what is the result of that? The result is trust guarantees.
The game theory of nodes plus the consensus mechanism provides trust guarantees to anyone using it (users, developers, creators, businesses, other computers, or services) that no previous computer architecture could provide.
Remember that such guarantees were previously shown to be impossible. This trust guarantee means that the rules of the system cannot change without due process as defined by the system’s governance protocols. “Don’t be evil” becomes “can’t be evil”.
These trust guarantees also enable the credible creation of computing primitives such as digital money, digital goods, smart contracts, decentralized organizations, etc.
What is a Blockchain?
It is made up of the following layers:
You have a consensus layer, which is the heart of any blockchain. You have a compute layer, which you might call a blockchain computer. Then there is an application layer, written in a language such as Motoko or whatever you prefer, and a user interface, which some people call Web 3.0.
Layer 1 – Consensus Layer
A public data structure (ledger) that provides a number of qualities:
Persistence – once added, data can never be removed
Consensus – all honest participants have the same data
Liveness – honest participants can add new transactions
Openness – anyone can be a participant, with no authentication
So this is the consensus layer. Remember how we can achieve it: proof of work, proof of stake, proof of space, and some more ideas.
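The persistence property can be illustrated with a minimal hash-linked ledger. The sketch below is an assumption for illustration only: it shows why changing old data is detectable, but it does not implement consensus, liveness, or openness, and the block layout is invented for this example.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Deterministic SHA-256 hash of a block's contents."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, transactions: list) -> None:
    """Add a block that commits to the hash of the previous block."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "prev_hash": prev_hash, "transactions": transactions})


def is_valid(chain: list) -> bool:
    """Every block must reference the hash of the block before it."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1]) for i in range(1, len(chain))
    )


chain = []
append_block(chain, ["alice pays bob 1"])
append_block(chain, ["bob pays carol 1"])
print(is_valid(chain))  # True

# Rewriting history in an old block breaks the link stored in the next block.
chain[0]["transactions"] = ["alice pays mallory 100"]
print(is_valid(chain))  # False
```

Any participant holding a copy of the chain can run the same check, which is what makes silent removal or modification of data detectable.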
Layer 1.5 – Compute Layer/Blockchain Computer
Here, the app logic is encoded in a program that runs on the blockchain:
Rules are enforced by a public program (public source code). This means that there is transparency – no single trusted third party
The app program is executed by the parties who create new blocks, so there is public verifiability – anyone can verify state transitions
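Public verifiability follows from the rules being a deterministic public program: anyone can replay the transactions and compare the result with the claimed state. The sketch below assumes a toy balance-transfer rule and a made-up transaction format; it is not the rule set of any real chain.

```python
def apply_transaction(state: dict, tx: dict) -> dict:
    """Public rule: move `amount` from sender to receiver if the sender can afford it."""
    sender, receiver, amount = tx["from"], tx["to"], tx["amount"]
    if amount <= 0 or state.get(sender, 0) < amount:
        return state  # invalid transactions leave the state unchanged
    new_state = dict(state)
    new_state[sender] -= amount
    new_state[receiver] = new_state.get(receiver, 0) + amount
    return new_state


def replay(genesis: dict, transactions: list) -> dict:
    """Recompute the state from scratch by applying every transaction in order."""
    state = dict(genesis)
    for tx in transactions:
        state = apply_transaction(state, tx)
    return state


genesis = {"alice": 10}
transactions = [
    {"from": "alice", "to": "bob", "amount": 4},
    {"from": "bob", "to": "carol", "amount": 5},  # rejected: bob only has 4
    {"from": "bob", "to": "carol", "amount": 1},
]

# A block producer claims this is the resulting state; anyone can check it.
claimed_state = {"alice": 6, "bob": 3, "carol": 1}
print(replay(genesis, transactions) == claimed_state)  # True
```

Because the function has no hidden inputs, two honest parties replaying the same transactions always reach the same state, so no trusted third party is needed to confirm it.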
Execution Environment (Bitcoin Blockchain and Ethereum Blockchain)
Bitcoin – the first public blockchain. It’s a limited computing environment:
limited instruction set (no loops)
sufficient for some tasks – atomic swaps, payment channels
Ethereum – a general programming environment:
The EVM is a general-purpose computing environment
App code updates the internal state in response to transactions
Calling an app costs fees (gas), which prevents denial-of-service attacks on miners; storing on-chain state costs fees as well
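Why gas prevents denial of service in an environment that allows loops can be shown with a toy metered interpreter. Everything here is hypothetical: the three instructions, their prices, and the `run` function are invented for illustration and do not reflect the real EVM instruction set or its gas schedule.

```python
class OutOfGas(Exception):
    """Raised when a program needs more gas than the caller paid for."""


# Hypothetical prices: writing to storage is far more expensive than arithmetic.
GAS_COST = {"add": 1, "store": 20, "jump": 1}


def run(program: list, gas_limit: int):
    """Execute (op, arg) pairs, charging gas before every step."""
    gas_left, pc, acc, storage = gas_limit, 0, 0, {}
    while pc < len(program):
        op, arg = program[pc]
        cost = GAS_COST[op]
        if cost > gas_left:
            raise OutOfGas(f"step {pc} needs {cost} gas, only {gas_left} left")
        gas_left -= cost
        if op == "add":
            acc += arg
            pc += 1
        elif op == "store":
            storage[arg] = acc
            pc += 1
        elif op == "jump":
            pc = arg  # loops are allowed, but every iteration is paid for
    return storage, gas_limit - gas_left


print(run([("add", 5), ("store", "x")], gas_limit=50))  # ({'x': 5}, 21)

try:
    run([("add", 1), ("jump", 0)], gas_limit=100)  # an infinite loop
except OutOfGas as error:
    print("stopped:", error)
```

An infinite loop cannot tie up the machines that execute it: it simply burns the gas the caller paid for and then halts. The higher price on `store` mirrors the point that keeping state on-chain is the costly part.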
Layer 3 – Applications that Run on Blockchain Computer
Any time you’re building an application on the blockchain, you try to write to the blockchain as rarely as possible, because writing to a blockchain is a very costly operation. You don’t want to do it if you can avoid it, but sometimes you just have to. The blockchain is just an additional way of providing consensus and distribution, so use it only when you need it. Then you have the applications, and there are many of them.
For example, Ethereum’s DeFi covers many things: credit and lending, insurance, stablecoins, prediction markets, marketplaces, derivatives, KYC, infrastructure, exchanges, investment, custodial services, payments, and so on.
Blockchain as a Company
In early 2012, Reeves and Brian Armstrong, the co-founder of cryptocurrency exchange Coinbase, applied to Y Combinator’s summer class. They proposed a payment platform for bitcoin where users could keep a digital wallet, exchange other currencies for bitcoin for a percentage fee, and make payments in bitcoin.
Due to differing opinions, they parted ways before attending Y Combinator. Reeves wanted to create a platform where users control access to their bitcoin information, while Armstrong felt that the platform should retain custody of the user’s wallet. After parting ways with Armstrong, Reeves continued to work on blockchain.info.
The company began its life as the first bitcoin blockchain explorer in 2011. So think about it: a company that has been working with crypto for 11 years. That is something. Its wallet started out as a non-custodial one.
The Difference between a Custodial and Non-custodial Wallet
Here’s the difference. Imagine you have some money in the bank. The bank can actually freeze or suspend your money at any time, and whenever it decides to do that, you have no access to your money. You could also try to send your money to someone and the bank could refuse to process the transaction. This means that the bank has custody over your money. You could say that the money you control is the money you have in your pocket.
But let’s say you have some banknotes, and the government says: we now have a new design and no longer accept the old notes, as happened during demonetization. Effectively, the government said it doesn’t accept this money anymore, even though it was money you earned. If the government can say “not anymore”, this too is an example of a custodial wallet.
Non-custodial means that nobody, no entity, can block or withhold the money in the wallet.
A non-custodial wallet is a wallet on an actual blockchain that only you have access to. This means that if you forget the password or key to your wallet, no one can help you, because no one except you has access to it.
That’s how we started, with the non-custodial wallet. Then we opened an exchange; we have the explorer we started with, and we also have a crypto trading venture.
Applied Machine Learning in Blockchain
In terms of the business model, there is no huge revolution. What is uncommon, or was uncommon, especially in 2011, is the type of asset, or commodity, or whatever we call it. But what we do, being a wallet, an exchange, trading, or asset management, is nothing new.
Payments and payment fraud – as soon as you run an exchange, you no longer live purely in the heaven of cryptocurrency: you have the customer, the blockchain, and the customer’s money in a bank. You have intermediaries taking a cut, such as the payment system; you have the bank, and you have yourself. So as soon as fiat money is exchanged, you have to account for the possibility that a transaction is fraudulent: you have to estimate that probability and take the risk into account.
KYC fraud – the same applies. If you work with fiat money, use an exchange, or pay for something in fiat, we need to do KYC (which stands for “know your customer”), meaning that you have to know who your customer is.
What is a fiat currency? It is the ordinary currency we are used to, like dollars, rupees, pounds, euros, rubles, and so on.
Marketing – you have customers, you probably spend some money on marketing, and you’d probably like to make it efficient. This means you have some kind of uplift modeling, rate prediction, or a next-best-action recommender system.
Exchanges – the same. You want to forecast the volume of exchange: how many trades you will have tomorrow, in a week, and so on.
Trading – you would like to predict the price of an asset, which is what people in hedge funds are doing.
Pricing – let’s say you go to blockchain.com, deposit USDT, a stablecoin pegged to USD, and receive an interest rate of nine percent. Why do we do that? To attract people, the same thing banks are doing with deposits. But what if I could attract the same people with eight percent? Then I’d pay less for them, so I would love to do that.
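The pricing trade-off comes down to simple arithmetic. The sketch below uses a hypothetical deposit base of 1,000,000 USDT, which is not a figure from the session, and expresses rates in basis points to keep the calculation in integers.

```python
def annual_interest_cost(deposits: int, rate_bps: int) -> int:
    """Yearly interest paid on deposits; the rate is in basis points (900 = 9%)."""
    return deposits * rate_bps // 10_000


deposits = 1_000_000  # hypothetical total deposits in USDT

cost_at_9 = annual_interest_cost(deposits, 900)
cost_at_8 = annual_interest_cost(deposits, 800)

# What the platform saves per year if the same customers stay at the lower rate.
print(cost_at_9, cost_at_8, cost_at_9 - cost_at_8)  # 90000 80000 10000
```

The modeling question is whether the same customers really would stay at eight percent. That is what a pricing model has to estimate before the saving can be counted on.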
Lending – if you’d like to lend some money, you need to understand whom you want to lend to, what, and why, and what the outcome is likely to be.
Blockchain is another layer that provides some new qualities: decentralization, non-custodial accounts, speed, and trust guarantees. Some of these will be adopted for sure, because they are too good. On the other hand, if it’s decentralized, you know that governments don’t want to give up power over money. At least they don’t want to, so there will be some regulation and there will be some taxation.
Real-world Use Cases of Blockchain
We can use blockchain in healthcare, as a way to collect and transmit data.
Insurance, for example, is closely connected to healthcare, and sometimes people fake examinations or certificates to receive benefits. With blockchain, you could avoid that.