I was driving on the Mumbai-Pune Expressway some time back with my family. My 8-year-old had a bunch of curious questions, as usual. While giving one long-drawn-out answer, I caught myself using the phrase “false positive”. After a while, I was the only person left awake in the car (btw, no prizes for guessing how they all fell asleep). I started wondering how I could explain the concept of a “false positive” in a way that my daughter would understand. Hmm.
Then I had a Eureka moment. The Boy Who Cried Wolf! It turns out it is a classic Machine Learning story. Allow me to retell it through an ML lens.
The “lying” shepherd boy crying “Wolf” multiple times created a bunch of False Positives. The villagers kept running to help the boy through these false positives, and eventually they changed their behaviour towards the same trigger. Reinforcement Learning?
Why did the boy engage in deception? Maybe initially he wanted to test the response time or behaviour of the villagers? Or maybe he was bored with the sheep and wanted to be “rewarded” with the villagers’ attention?
Why did the villagers behave the way they did? Why did they fall for the boy’s trick multiple times? Their initial behaviour was optimized to minimize the loss of sheep. Even though they had encountered a few false positives, they knew that it would take only one true positive to cause a catastrophic loss. They didn’t care about Precision. Their sole concern was Recall.
Recall = (Count of True Positives) / (Count of True Positives + False Negatives)
Basically it boils down to “When a wolf IS REAAALLLY present, what % of those instances do we respond to?” They wanted 100% Recall.
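Just to put numbers on it (the counts below are made up for illustration, not taken from the fable), a minimal sketch in Python:

```python
# A toy Recall calculation with made-up numbers:
# suppose a wolf was really present on 4 occasions,
# and the villagers rushed out on 3 of them.
true_positives = 3    # real wolf, villagers responded
false_negatives = 1   # real wolf, villagers stayed home

recall = true_positives / (true_positives + false_negatives)
print(f"Recall: {recall:.0%}")  # 75% -- one missed wolf could still cost the whole flock
```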
But eventually, the villagers optimized for Precision.
Precision = (Count of True Positives) / (Count of True Positives + False Positives)
Simply put, this is “Of all the times we rushed to help, what percentage were real emergencies?”
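And the same kind of back-of-the-envelope sketch for Precision (again with invented counts):

```python
# A toy Precision calculation with made-up numbers:
# the villagers rushed out 5 times in all -- 3 times a wolf was really there,
# the other 2 were the boy's pranks.
true_positives = 3    # rushed out, wolf was real
false_positives = 2   # rushed out, it was a trick

precision = true_positives / (true_positives + false_positives)
print(f"Precision: {precision:.0%}")  # 60% -- 2 out of 5 trips were wasted
```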
The villagers worked like a Binary Classifier. When they heard the boy shout, they had to decide whether to classify it as a Trick or a Threat. They kept classifying the initial instances as a Threat. That ended up being a sequence of False Positives. They had low Precision. They then tried to optimize for Precision and ended up “misclassifying” a True Positive event as a Trick. This is why a balance between the two primary metrics is necessary. Hence, the F1 score, which is the harmonic mean of Precision and Recall. The sketch below puts numbers on how the story ends.
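Here is a small sketch of the villagers as a binary classifier, scored with scikit-learn. The sequence of cries is invented to mirror the story: a few pranks that the villagers answered, then the real wolf that they ignored.

```python
# Scoring the villagers' decisions as a binary classifier (toy data, not from the fable).
from sklearn.metrics import precision_score, recall_score, f1_score

# 1 = wolf really present, 0 = no wolf
actual_wolf = [0, 0, 0, 1]
# 1 = villagers classified the cry as a Threat and ran, 0 = classified it as a Trick
villagers_response = [1, 1, 1, 0]

p = precision_score(actual_wolf, villagers_response, zero_division=0)
r = recall_score(actual_wolf, villagers_response, zero_division=0)
f1 = f1_score(actual_wolf, villagers_response, zero_division=0)

print(f"Precision: {p:.0%}, Recall: {r:.0%}, F1: {f1:.0%}")
# All three come out to 0% -- every trip was a false alarm,
# and the one real wolf was missed. The worst of both worlds.
```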
In this case, Reinforcement Learning with incorrect input led to a failure of the overall system. You could even say the villagers were Overfitting to the training data.
What could we have done to fix this, since hindsight is 20/20?
- The villagers could have applied a “penalty” to the boy for introducing False Positives to discourage that behaviour.
- A trustworthy shepherd could have been added, or brought in to replace the existing one. (Similar to getting a better training dataset?)
- CCTV with object recognition tuned for identifying Wolves? Crazy but possible now!
Any other links that this story has with ML?