How to Predict the Election Result With a Coin?

jalil Last Updated : 13 Oct, 2022

10 min read

This article was published as a part of the Data Science Blogathon.

_{Photo by Clay Banks on Unsplash}

Introduction

Predicting events and happenings has been an interesting skill for humans since the dawn of civilization and remains so today. This ability has been so attractive to many that they have selected the career of “prophecy” for themselves, attracting many followers and revenue in the process. Some of those people’s names, such as “Nostradamus,” are still heard, and some people are still looking for the realization or non-realization of his predictions after centuries.

Photo by Jen Theodore on Unsplash

Forecasting what will happen in the future is considered a critical business skill. In most cases, a company’s level of insight and understanding about the future is directly linked to its level of validity, success, and the rate at which it moves through the stages of growth. So, a company makes plans based on what will happen in the future; due to this, most businesses today pay more attention to data science and spend a lot of money to make their businesses more data-driven. Many companies also employ many data scientists and set up departments for data science.

It is evident that today’s forecasts, developed with data science, are significantly different from those made centuries ago. Forecasting and getting insights with the help of data science is a scientific process based on the analysis and a close look at a data set.

Let’s leave the business world and move to the more yellow world of elections. Guess who will win an election is one of the most interesting things people do. Every year, before an election, several statistical and research agencies start collecting and analyzing data to determine who will win before voting. As a result, Politicians usually pay statistics organizations large sums of cash to produce real-time reports on how the public perceives them. This way, the candidate can use intelligence to lead his campaign.

Main Problem with Collecting Data

Most enterprises and institutions that collect data have a big problem when they try to predict elections: they have a lot of wrong information. This problem will lead to wrong conclusions in the end. This problem was one of the most obvious during the 2016 presidential election in the US. In that year, Donald Trump and Hillary Clinton were in the race. Most polls and reputable news sites at the time said Hillary would win the election easily.

Source: NYTimes

Most people believed this prediction because, on the one hand, there was Trump, who had no political experience and had campaigned for election by making very harsh and aggressive statements. On the other hand, Clinton has been involved in politics for more than 30 years and has served as Secretary of State. With all of these explanations, it didn’t seem so unusual that Clinton beat Trump. Many reliable news sources, like the New York Times, said that Clinton’s chances of beating Trump were 85 to 15.

The Whole World is Impressed!

The whole world was amazed as the votes were being counted. Even the most optimistic statistical groups didn’t think Trump had a chance, but he won the electoral votes of key states one by one and quickly got the 270 electoral votes he needed to become president.

At the end of the vote, Trump won with 306 votes to Clinton’s 232. This was one of the most surprising results in US electoral history. This failure shocked a lot of data scientists. Many papers have been written up to this point, and many different parts of this situation have been examined.

Photo by Library of Congress on Unsplash

One of the main reasons why many pre-election polls give different results is that people don’t say who their favorite candidate is for various reasons. So, they either didn’t vote or said they would vote for someone else. However, on election day, when they went to the polls, they voted for their ideal candidate, who wouldn’t say his name. If most people do this, it’s clear that the poll results will change, and a statistical disaster like what happened in 2016 will happen again.

A Summary of What the Problem is!

In this article, we are not going to find out why the predictions for the 2016 elections were completely mistaken. In the future, we might publish a more in-depth article about the statistical reasons and factors that led to this. In this article, we’ll learn about a game based on math and statistics.

As was already said, one of the most important factors in predicting mistakes is the amount of wrong information that gets into the survey. Now, if we can change how we collect data to make it much more accurate, our prediction of how the election will turn out will probably be right. We’ll look at examples of this type of data tracking in the following sections.

Imaginary Forest

In a faraway forest where 200 animals live, there is an election to choose the forest leader. “Ms. Tree” and “Mr. Pig” are the two candidates in this election. “Ms. Tree” has been teaching forest residents for a long time. It thinks everyone in the forest and neighboring forests should live peacefully and work together.

On the other hand, “Mr. Pig” is a violent person who loves fighting. He thinks that a lot of the resources should be put into the military and, if possible, used to attack neighboring forests and steal their resources. A secret group that works in the forest has given us the job of finding out who will win the election before it happens. Our lives are at risk if we are wrong about what will happen!

Photo by Sergei A on Unsplash

As a result, we decided to survey every forest member to predict the election’s true winner. When we ask people who their favorite candidate is, Based on the atmosphere in the forest, the following things will happen:

If a person wants to vote for “Ms. Tree,” he selects her as his preferred candidate.
If someone wants to vote for “Mr. Pig”, they probably will give us the name of the “Ms. Tree” (We don’t know how often this happens, but we do know that it does happen sometimes.)

As a result, the most important thing for us is to get accurate data from forest people. So, we should use a strategy that lets us find out people’s real favorite candidate without making them say it out loud. In other words, we need to find a middle ground between our language and theirs so that they can answer our questions in that middle-ground language and we can understand them. While we get accurate information from a person, his or her personal opinion is kept private and does not become public. But what should be done to solve this problem?

A Coin and an Endless Amount of Possibilities

All you have to do to solve this problem is use a coin as a guide. Each coin sample space can be either “heads” or “tails.”

_{Photo by Steve Smith on Unsplash}

As a consequence, it’s enough to go to each of the forest’s residents and, by giving him a coin, ask him to play the following game in his home (without our presence) and tell us of the outcome; the rules of the game are as follows:

If the coin lands on “heads,” mention the name of the candidate you want to vote for.
If the coin lands on “tails,” the coin is thrown, call out the name of the “Mr. Pig.”

As was already said, the biggest problem with tracking data about the election was that many people who were willing to vote for the “Mr. Pig” didn’t come out and say so for various reasons.

This strategy solves the problem since it is unknown if a person voted for “Mr. Pig” because of desire or because the coin “tails” appeared. People may safely tell us the outcome of the game since we don’t know the outcome of the tap or the line. The following was the outcome of the data collection:

Creating a language between the person asking the question and the person answering it so that the person answering can confidently choose the right answer.
Minimizing the area for judgment and providing a secure atmosphere in which the responder may choose the preferred alternative.

By taking the above steps, the quality of data tracking will greatly improve, making it more likely that conclusions and assessments based on this data are proper.

Please feel free to share your thoughts or reading tips in the comments.

Follow me on LinkedIn, and Twitter.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

jalil

Free Courses

4.7

Generative AI - A Way of Life

Explore Generative AI for beginners: create text and images, use top AI tools, learn practical skills, and ethics.

4.5

Getting Started with Large Language Models

Master Large Language Models (LLMs) with this course, offering clear guidance in NLP and model training made simple.

4.6

Building LLM Applications using Prompt Engineering

This free course guides you on building LLM apps, mastering prompt engineering, and developing chatbots with enterprise data.

4.8

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Explore practical solutions, advanced retrieval strategies, and agentic RAG systems to improve context, relevance, and accuracy in AI-driven applications.

4.7

Microsoft Excel: Formulas & Functions

Master MS Excel for data analysis with key formulas, functions, and LookUp tools in this comprehensive course.

MUID

Used by Microsoft Clarity, to store and track visits across websites.

Expiry: 1 Year

Type: HTTP

_clck

Used by Microsoft Clarity, Persists the Clarity User ID and preferences, unique to that site, on the browser. This ensures that behavior in subsequent visits to the same site will be attributed to the same user ID.

Expiry: 1 Year

Type: HTTP

_clsk

Used by Microsoft Clarity, Connects multiple page views by a user into a single Clarity session recording.

Expiry: 1 Day

Type: HTTP

SRM_I

Collects user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Years

Type: HTTP

SM

Use to measure the use of the website for internal analytics

Expiry: 1 Years

Type: HTTP

CLID

The cookie is set by embedded Microsoft Clarity scripts. The purpose of this cookie is for heatmap and session recording.

Expiry: 1 Year

Type: HTTP

SRM_B

Collected user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Months

Type: HTTP

_gid

This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the website is doing. The data collected includes the number of visitors, the source where they have come from, and the pages visited in an anonymous form.

Expiry: 399 Days

Type: HTTP

_ga_#

Used by Google Analytics, to store and count pageviews.

Expiry: 399 Days

Type: HTTP

_gat_#

Used by Google Analytics to collect data on the number of times a user has visited the website as well as dates for the first and most recent visit.

Expiry: 1 Day

Type: HTTP

collect

Used to send data to Google Analytics about the visitor's device and behavior. Tracks the visitor across devices and marketing channels.

Expiry: Session

Type: PIXEL

AEC

cookies ensure that requests within a browsing session are made by the user, and not by other sites.

Expiry: 6 Months

Type: HTTP

G_ENABLED_IDPS

use the cookie when customers want to make a referral from their gmail contacts; it helps auth the gmail account.

Expiry: 2 Years

Type: HTTP

test_cookie

This cookie is set by DoubleClick (which is owned by Google) to determine if the website visitor's browser supports cookies.

Expiry: 1 Year

Type: HTTP

_we_us

this is used to send push notification using webengage.

Expiry: 1 Year

Type: HTTP

WebKlipperAuth

used by webenage to track auth of webenagage.

Expiry: Session

Type: HTTP

ln_or

Linkedin sets this cookie to registers statistical data on users' behavior on the website for internal analytics.

Expiry: 1 Day

Type: HTTP

JSESSIONID

Use to maintain an anonymous user session by the server.

Expiry: 1 Year

Type: HTTP

li_rm

Used as part of the LinkedIn Remember Me feature and is set when a user clicks Remember Me on the device to make it easier for him or her to sign in to that device.

Expiry: 1 Year

Type: HTTP

AnalyticsSyncHistory

Used to store information about the time a sync with the lms_analytics cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

lms_analytics

Used to store information about the time a sync with the AnalyticsSyncHistory cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

liap

Cookie used for Sign-in with Linkedin and/or to allow for the Linkedin follow feature.

Expiry: 6 Months

Type: HTTP

visit

allow for the Linkedin follow feature.

Expiry: 1 Year

Type: HTTP

li_at

often used to identify you, including your name, interests, and previous activity.

Expiry: 2 Months

Type: HTTP

s_plt

Tracks the time that the previous page took to load

Expiry: Session

Type: HTTP

lang

Used to remember a user's language setting to ensure LinkedIn.com displays in the language selected by the user in their settings

Expiry: Session

Type: HTTP

s_tp

Tracks percent of page viewed

Expiry: Session

Type: HTTP

AMCV_14215E3D5995C57C0A495C55%40AdobeOrg

Indicates the start of a session for Adobe Experience Cloud

Expiry: Session

Type: HTTP

s_pltp

Provides page name value (URL) for use by Adobe Analytics

Expiry: Session

Type: HTTP

s_tslv

Used to retain and fetch time since last visit in Adobe Analytics

Expiry: 6 Months

Type: HTTP

li_theme

Remembers a user's display preference/theme setting

Expiry: 6 Months

Type: HTTP

li_theme_set

Remembers which users have updated their display / theme preferences

Expiry: 6 Months

Type: HTTP

How to Predict the Election Result With a Coin?

Introduction

Main Problem with Collecting Data

The Whole World is Impressed!

A Summary of What the Problem is!

Imaginary Forest

A Coin and an Endless Amount of Possibilities

Product Launches and Surveys

Conclusion

Login to continue reading and enjoy expert-curated content.

Free Courses

Generative AI - A Way of Life

Getting Started with Large Language Models

Building LLM Applications using Prompt Engineering

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Microsoft Excel: Formulas & Functions

Recommended Articles

Responses From Readers

Write for us

Analytics Vidhya (4)

brahmaid

csrftoken

Identityid

sessionid

Google (1)

g_state

Microsoft (7)

MUID

_clck

_clsk

SRM_I

SM

CLID

SRM_B

Google (7)

_gid

_ga_#

_gat_#

collect

AEC

G_ENABLED_IDPS

test_cookie

Webengage (2)

_we_us

WebKlipperAuth

LinkedIn (16)

ln_or

JSESSIONID

li_rm

AnalyticsSyncHistory

lms_analytics

liap

visit

li_at

s_plt

lang

s_tp

AMCV_14215E3D5995C57C0A495C55%40AdobeOrg

s_pltp

s_tslv

li_theme

li_theme_set

Google (11)

_gcl_au

SID

SAPISID

__Secure-#

APISID

SSID

HSID

DV

NID

1P_JAR

OTZ

Facebook (2)

_fbp

fr

LinkedIn (6)

bscookie

lidc