You have undoubtedly heard stories about gold seekers: people who, with the help of a gold detector, stumble upon enormous wealth and become overnight millionaires.
Your friend owns a gold detector, and you have decided to join the gold seekers and lend him a hand. The two of you head to a mine containing around 1,000 stones, and you estimate that about 1% of those stones are gold.
Your friend’s gold detector beeps whenever it thinks it has found gold. Its accuracy is roughly 90%: it wrongly beeps at about 10% of ordinary stones (a 10% false-positive rate), but it never misses real gold (a 0% false-negative rate).
As the two of you explore the mine, the detector beeps in front of one of the rocks. If that stone is gold, its market value is about $1,000. Your friend offers you a deal: pay him $250 and the stone is yours. The offer seems appealing, because if the stone really is gold you earn three times what you paid, and since the detector’s accuracy is high, the stone is probably gold. With that reasoning, you pay your friend $250 and pick up the stone for yourself.
It is not a bad idea to take a step back from the world of gold seekers and return to the beautiful world of mathematics to examine the problem more closely:
Given the figures above, if we ran the detector over every stone in the mine, we would expect it to beep about 109 times: 10 beeps for the 10 genuinely gold stones, plus roughly 99 false beeps from the 990 ordinary stones (10% of 990). In other words, there is only about a 9% chance (10 out of 109) that the stone we paid $250 for is actually gold. We probably did not get a good deal and most likely wasted $250 on a worthless rock. If we want to summarize all of this mathematically, we have:
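A minimal sketch of that calculation in Python, using only the counts from the story (1,000 stones, 10 of them gold, a detector that never misses gold and beeps wrongly at 10% of ordinary stones):

```python
# Expected beep counts for the gold-seeker example.
total_stones = 1_000
gold_stones = 10                      # 1% of the stones are gold
ordinary_stones = total_stones - gold_stones

false_positive_rate = 0.10            # detector beeps at 10% of ordinary stones
false_negative_rate = 0.00            # detector never misses real gold

true_beeps = gold_stones * (1 - false_negative_rate)   # 10 beeps on real gold
false_beeps = ordinary_stones * false_positive_rate    # ~99 false beeps

p_gold_given_beep = true_beeps / (true_beeps + false_beeps)
print(f"P(gold | beep) = {p_gold_given_beep:.1%}")      # ~9.2%
```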
Working through the problem mathematically shows that the instrument’s “measurement accuracy” alone is not enough to guarantee a reliable result; other factors must be considered. In statistics and data science, this phenomenon is known as the “false positive paradox.”
The paradox typically appears when the probability of the event being measured is lower than the error rate of the instrument used to measure it. In the gold-seeker example, we used a device with 90% accuracy (a 10% error rate) to look for an event with a 1% probability, so the results were not very reliable.
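The same argument can be written as Bayes’ rule. Below is a hedged sketch of a general helper; the function name and the second example prevalence (20%) are my own choices, made only to contrast a prior below the error rate with one above it:

```python
def positive_predictive_value(prior, sensitivity, false_positive_rate):
    """Probability that a positive reading is genuine, by Bayes' rule."""
    true_pos = prior * sensitivity
    false_pos = (1 - prior) * false_positive_rate
    return true_pos / (true_pos + false_pos)

# The same 90%-accurate detector applied to two different prevalences:
print(positive_predictive_value(0.01, 1.0, 0.10))  # prior below the error rate -> ~0.09
print(positive_predictive_value(0.20, 1.0, 0.10))  # prior above the error rate -> ~0.71
```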
Before delving further into the “false positive paradox,” it is worth brushing up on a few statistical terms. To make the idea concrete, think of a corona (COVID-19) test. Any such test can produce four possible outcomes:
True positive: the person is infected, and the test comes back positive.
False positive: the person is not infected, but the test comes back positive.
True negative: the person is not infected, and the test comes back negative.
False negative: the person is infected, but the test comes back negative.
It should be mentioned that the corona test, and medical tests in general, are only used as examples here; these four outcomes apply to any measurement in which there is a risk of error.
In the gold-seeker example, the device’s false-positive error rate, i.e. the chance that it beeps at a stone that is not gold, was 10%, and its false-negative error rate, i.e. the chance that it stays silent at a stone that really is gold, was 0%. In the next sections, we will look at the “false positive paradox” from a few different angles.
A mysterious virus has swept through a city of 10,000 people, infecting roughly 40% of the population. As a product manager, your job is to get a virus-detection kit built as quickly as possible so that infected people can be distinguished from healthy ones.
Your kit has a 5% false-positive error rate and a 0% false-negative error rate. The kit is now being used to screen the city’s population, and the expected outcomes can be worked out as follows.
As stated above, the kit’s false-negative rate is 0%, so all 4,000 infected people test positive. Of the 6,000 healthy people, about 5%, roughly 300 people, receive an incorrect (false-positive) result. In total, the test comes back positive for about 4,300 people, of whom 4,000 actually have the disease. So about 93 per cent of the positive results are genuine, a respectable figure that can reasonably be relied on.
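A quick check of those numbers, using only the figures stated above (10,000 residents, 40% infected, 5% false-positive rate, 0% false-negative rate):

```python
population = 10_000
infected = int(population * 0.40)     # 4,000 genuinely sick people
healthy = population - infected       # 6,000 healthy people

true_positives = infected             # 0% false negatives: every sick person tests positive
false_positives = healthy * 0.05      # ~300 healthy people wrongly flagged

precision = true_positives / (true_positives + false_positives)
print(f"Positive results that are genuine: {precision:.0%}")   # ~93%
```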
In one of the largest shopping complexes of a city with a population of one million, an anti-terrorist surveillance camera and alarm have been installed. The alarm has a 1% false-positive rate and a 1% false-negative rate. In other words:
False negative (1%): if a terrorist passes in front of the camera, the alarm goes off 99 per cent of the time and fails to ring 1 per cent of the time.
False positive (1%): if an ordinary person passes in front of the camera, the alarm stays silent 99 per cent of the time and rings 1 per cent of the time.
The question now is: if the alarm goes off one day, what is the likelihood that there really is a terrorist in the complex?
Suppose there are about 500 terrorists in this city of one million people; the exact figure is a rough assumption made only so that we can work through the numbers. Returning to the question, what is the likelihood that a terrorist is in the complex when the alarm goes off? The following calculations give the answer.
Given the camera’s 99 per cent accuracy, if all 500 terrorists pass in front of it, the siren sounds about 495 times (500 × 0.99). But the remaining 999,500 ordinary people trigger roughly 9,995 false alarms (999,500 × 0.01), so when the alarm rings, the chance that it was set off by a terrorist is only about 495 / (495 + 9,995), roughly 4.7 per cent.
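The same arithmetic as a short sketch, using the figures assumed above (one million people, 500 terrorists, 1% false-positive and 1% false-negative rates):

```python
population = 1_000_000
terrorists = 500
ordinary_people = population - terrorists

true_alarms = terrorists * 0.99          # ~495 alarms triggered by actual terrorists
false_alarms = ordinary_people * 0.01    # ~9,995 alarms triggered by ordinary passers-by

p_terrorist_given_alarm = true_alarms / (true_alarms + false_alarms)
print(f"P(terrorist | alarm) = {p_terrorist_given_alarm:.1%}")   # ~4.7%
```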
You have been entrusted with product management of an alert device that police will use to detect drivers who have consumed alcohol or drugs. The specs of the product your team has built are a false-positive error rate of about 5% and a false-negative error rate of 0%.
While thinking about the product release, and because you are adept at data science, having worked as a data scientist before moving into product management, you ask the police for a report on the incidence of alcohol and drug use among drivers.
Examining the data, you discover that, on average, only 5 out of every 1,000 drivers have consumed alcohol or drugs. This is a problem: if the police test drivers at random with your current product, it could lead to a disaster. The following calculations show why.
Five drivers out of every thousand have consumed alcohol or drugs, and because the device’s false-negative error rate is 0 per cent, all five of them will test positive.
As previously stated, the device’s false-positive error rate is around 5%, which means that roughly 50 of the 995 drivers who have not consumed anything will also test positive (995 × 5% ≈ 50). So out of roughly 55 positive tests, only 5 belong to drivers who actually consumed anything, a hit rate of about 9%.
Now suppose the police use the device only on drivers who first fail a simple awareness (sobriety) check, a group in which the chance of consumption is about 60%. In a group of 100 such drivers, approximately 60 have consumed and will test positive, while only about 2 of the remaining 40 sober drivers (5%) produce a false positive. In that setting, roughly 60 of the 62 positive results, about 97 per cent, are genuine.
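A sketch comparing the two ways of deploying the device; the 5-in-1,000 and 60% prevalences are the figures used above, and the 5% false-positive / 0% false-negative rates are the product specs:

```python
# Scenario 1: police stop drivers at random.
drivers = 1_000
consumed = 5                              # 5 in 1,000 drivers have consumed
sober = drivers - consumed

true_pos_random = consumed                # 0% false negatives: all 5 test positive
false_pos_random = sober * 0.05           # ~50 sober drivers wrongly test positive
share_random = true_pos_random / (true_pos_random + false_pos_random)
print(f"Random stops:   {share_random:.0%}")    # ~9% of positives are genuine

# Scenario 2: police test only drivers who fail a preliminary awareness check,
# a group in which roughly 60 out of 100 have actually consumed.
suspected = 100
consumed_s = 60
sober_s = suspected - consumed_s

true_pos_targeted = consumed_s            # all 60 test positive
false_pos_targeted = sober_s * 0.05       # ~2 false positives among the 40 sober drivers
share_targeted = true_pos_targeted / (true_pos_targeted + false_pos_targeted)
print(f"Targeted stops: {share_targeted:.0%}")  # ~97% of positives are genuine
```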
The takeaway is that an instrument’s measurement accuracy alone cannot guarantee reliable output; the population being tested can matter even more than the instrument’s accuracy. To avoid the “false positive paradox,” conditions should be arranged so that the probability of the event exceeds the device’s error rate. In the driver example, screening with the awareness check first raised the share of genuine positives dramatically.
The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.