No matter what your training or what you do for a living, you’re bound at some point to have a few fundamental questions about how things came to be the way they are, or how they should be.
Just how did you miss that flight connection when you were about to leave on holiday? Would you have made it if the circumstances had been different? How did you manage to get the last table at the restaurant? What strategies could you employ next time to ensure your luck holds?
When you set about answering questions like these, it’s all about dealing with cause and effect – that is, how various factors lead to different results.
When you move from personal and anecdotal experience and look for more substantial rules to follow, you can use math and statistics to help you. But take care; for years it’s been axiomatic that you must distinguish between the factors that truly affect a result and those that merely appear to – in other words, that you must separate causation from correlation. Causation indicates that a relationship is one of cause and effect, while correlation implies a connection of sorts between two variables, but nothing more.
Join authors Pearl and Mackenzie as they debunk some of the most basic “truths” in statistics. You might even learn how to game your life a little better along the way.
In this summary, you’ll learn
why people once thought the smallpox vaccine was worse than catching the disease itself;
why computers aren’t as advanced as we like to think; and
which Biblical figure ran one of the earliest controlled experiments.
The notion of causation has been disparaged by some statisticians.
If you’ve spent any time near an institute of higher learning or, frankly, if you’ve ever heard a brainiac dismissing government reports on the news, you’ll likely have heard the phrase “correlation does not imply causation” repeated ad nauseam. It has virtually been accepted as fact for the last few decades.
In part, this is down to the fact that causation has been downplayed as an idea by the scientific community. At the start of the twentieth century, English mathematician Karl Pearson epitomized this view.
Pearson’s biometrics lab was the world’s leading authority in statistics, and he liked to claim that science was nothing more than pure data. The idea was that because causation could not be proven, it could not be represented as data. Therefore, he saw causation as scientifically invalid.
Pearson liked to prove his point by singling out correlations that he considered spurious. A favorite was the observation that if a nation consumes more chocolate per capita, it produces more Nobel Prize winners. To him, it was a meaningless correlation, so looking for causation was unnecessary.
But this attempt at ridicule actually hides a causative factor; it is likelier that wealthier nations consume more chocolate, just as it’s likelier that they’ll produce scientific advances noticeable to the Nobel committee! On top of that, it later turned out that causation could be represented mathematically. This is what geneticist Sewall Wright showed while researching at Harvard University in 1912.
Wright was studying the markings on guinea pigs’ coats to determine the extent to which they were hereditary. He found the answer to this causal question by using data. It began with a mathematical diagram. Wright drew arrows connecting causes and outcomes, linking the colors of the animals’ coats to contributing factors in their immediate environment and development.
Wright also developed a path diagram to represent these relationships, in which a “greater-than” sign (>) signifies “has an effect on.” For instance: developmental factors > gestation period > coat pattern. Wright then turned this diagram into an algebraic equation, using the collected data. It demonstrated that 42 percent of a given coat pattern was caused by heredity, while 58 percent was the result of developmental factors.
Given the scientific climate, Wright came in for some stick: he was so vehemently attacked that his methods for establishing causation from correlation were buried for decades. But times have changed, and his work is finally being revived. Research fields from medicine to climate science are beginning to welcome causation as a principle. The Causal Revolution, it seems, has begun.
Data alone can mislead when causality is neglected.
It’s a generally acknowledged truth that if you really want to understand the root cause of something, you’re going to have to collect data about it. However, a note of caution must be sounded: unless data is properly analyzed, it can be wildly misinterpreted.
Just such a thing happened with the smallpox vaccine. When the vaccine was introduced in the eighteenth century, the data seemed to show that the vaccine was actually causing more deaths than smallpox itself.
Let’s use some hypothetical numbers to demonstrate the case. Imagine that out of 1 million children, 99 percent – 990,000 – receive the smallpox vaccine. There is a 1 percent chance that the vaccine will cause a reaction, and a 1 percent chance of that reaction being fatal. That’s 9,900 reactions, of which 99 prove fatal.
In contrast, 1 percent of the million children go unvaccinated. These 10,000 children have a 2 percent chance of developing smallpox. And of these 200, 20 percent will die. That’s 40 children.
When you compare 99 vaccine-related fatalities to 40 deaths caused by disease, you can see why people might think that the vaccination is more deadly. But here’s the rub. If we want to truly understand data, we have to look at more than just the bare bones. So in the case of the smallpox vaccine data, we really need to be asking the question, “How many would have died if no one had been inoculated?”
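That question can be answered by running the summary’s own hypothetical numbers, sketched here in a few lines:

```python
# Hypothetical smallpox figures from the text, for 1 million children.
TOTAL = 1_000_000
VACCINATION_RATE = 0.99
P_REACTION = 0.01        # chance the vaccine causes a reaction
P_REACTION_FATAL = 0.01  # chance that reaction is fatal
P_SMALLPOX = 0.02        # chance an unvaccinated child gets smallpox
P_SMALLPOX_FATAL = 0.20  # chance a smallpox case is fatal

vaccinated = TOTAL * VACCINATION_RATE
unvaccinated = TOTAL - vaccinated

# What the raw comparison shows:
vaccine_deaths = vaccinated * P_REACTION * P_REACTION_FATAL
smallpox_deaths = unvaccinated * P_SMALLPOX * P_SMALLPOX_FATAL

# The counterfactual: deaths if no one had been inoculated.
deaths_no_vaccine = TOTAL * P_SMALLPOX * P_SMALLPOX_FATAL

print(round(vaccine_deaths), round(smallpox_deaths), round(deaths_no_vaccine))
# 99 40 4000
```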
Run the sums and you’ll see that 4,000 children would have died: 2 percent of all 1 million children – 20,000 – would have contracted smallpox, and 20 percent of those cases would have been fatal. The data as it stands obscures that fact and the undoubted benefits of the vaccine. All this goes to show that data can be used to find connections between almost anything. You may be surprised to learn that data shows a relationship between a child’s shoe size and his reading ability.
It may seem nonsensical that the two are related, but they are through a common cause: age. Older children will have bigger feet on average than younger children, and will be better readers.
It’s just such a neglect of common causes that led Pearson to be so dismissive of the relationship between chocolate consumption and Nobel prize winners. To get around this problem, the authors have developed a process to look beyond the initial observation of data. They call it the Ladder of Causation, and we’ll start climbing it now.
The first rung of the Ladder of Causation is concerned with association and probability.
By nature, we are inclined to look at the world around us and start making connections. It’s that sort of thinking that stands on the first rung of the Ladder of Causation. Interestingly, though we’re programmed to do it almost from birth, the machines we’ve created to help us in our daily lives still can’t get close. Stuck on this first rung are most animals, as well as Artificial Intelligence programs.
An owl, for instance, tracks its prey by monitoring its movements. It tries to predict where the prey will be in the next moment. Why the prey is moving is of no interest to the owl.
Self-driving cars may seem very futuristic, but their AI can’t get past the ladder’s first rung. Since they’re programmed only to react to observations, a car can’t work out, for instance, the various ways a pedestrian drunkenly crossing the road might react to a car horn. All possible and potential scenarios would have to be programmed into the car for it to react appropriately to each one. Data collection can also be thought of as existing on that first rung, because it involves projections based on passive observation.
Imagine that a marketing director is asked to find out the likelihood of a toothpaste-buying customer also buying dental floss. She would probably collect data on the numbers of toothpaste-buying customers and floss-buying ones. Symbolically, statistics represents this query as P(floss|toothpaste), or “What is the probability of floss, given that you see toothpaste?”
These sorts of questions form the basic foundation of statistics. But they don’t tell us anything about cause and effect. How can the marketing manager calculate whether toothpaste or floss is the cause? When examining sales of dental hygiene products, it may not be that important in the end. But, on most other occasions, it’s clear that observing basic probability alone is not nearly informative enough.
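As a sketch of the first-rung query P(floss|toothpaste), the marketing director’s calculation is just a filtered count over purchase records – the handful of baskets below is invented, and note that nothing in it says anything about cause:

```python
# Hypothetical purchase records: each set is one customer's basket.
baskets = [
    {"toothpaste", "floss"},
    {"toothpaste"},
    {"toothpaste", "floss", "mouthwash"},
    {"floss"},
    {"toothpaste"},
]

toothpaste_buyers = [b for b in baskets if "toothpaste" in b]
both = [b for b in toothpaste_buyers if "floss" in b]

# P(floss | toothpaste): among toothpaste buyers, the share who also buy floss.
p_floss_given_toothpaste = len(both) / len(toothpaste_buyers)
print(p_floss_given_toothpaste)  # 0.5
```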
The second rung of the ladder is intervention, which we use both day to day and in research.
Progressing up the Ladder of Causation requires not just watching the world but changing it. It’s only humans that do this on a regular basis. The second rung of the ladder is typified by the question “What if we do . . . ?”
It’s the “do” part that’s important. Unlike the passive first rung, the second rung is characterized by actively influencing outcomes. Imagine you have a headache and take a painkiller. That’s an active intervention intended to relieve the pain you’re experiencing.
Let’s return to our dental hygiene marketing manager. She might ask, “Will floss sales be affected if we change the price of toothpaste?” In contrast, you may be surprised to learn that computers cannot currently be programmed to accurately ask these questions. And that’s why they can’t get beyond the first rung of the ladder. One of the best ways to test the effect of something is to conduct a controlled experiment.
A controlled experiment involves taking groups as similar to each other as possible and applying a test to one but not the other. As a result, the variable and its effect can be measured objectively and in isolation.
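The logic of such an experiment can be sketched in a few lines. Everything below – the group sizes, the baseline, the treatment effect of 2.0 – is invented for illustration; the point is that random assignment lets the difference in group means stand in for the causal effect:

```python
import random

random.seed(42)

# Hypothetical experiment: 100 participants, randomly assigned so that
# 50 receive the treatment and 50 serve as controls.
def outcome(treated):
    # Stand-in outcome model: the treatment adds a fixed effect of 2.0
    # on top of a baseline of 10, plus individual noise.
    return 10 + (2 if treated else 0) + random.gauss(0, 1)

treated_outcomes = [outcome(True) for _ in range(50)]
control_outcomes = [outcome(False) for _ in range(50)]

treated_mean = sum(treated_outcomes) / 50
control_mean = sum(control_outcomes) / 50

# Because assignment is random, the difference in means isolates the
# effect of the intervention itself (here, roughly 2.0).
print(round(treated_mean - control_mean, 1))
```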
These kinds of controlled experiments are hardly new – they’re actually reported in the Bible. In the story of Daniel, the Babylonian King Nebuchadnezzar sought out some of the captured nobles of Jerusalem for his court, as was customary. This involved educating them in the elite Babylonian diet of rich meats and wine. However, in accordance with Jewish dietary laws, some of the Jewish boys would not eat the meat.
Daniel was one of them. He suggested that he and three friends be given a vegetarian diet while another group of boys ate the king’s food – what we’d now call a control group – and that the results be compared after ten days. Needless to say, Daniel’s group flourished, and Nebuchadnezzar gave them high court positions. A more modern example would be Facebook. The company loves experimenting with arrangements of items on web pages and comparing the different groups who see different configurations against each other.
The third and final rung of the ladder involves getting to grips with counterfactuals.
The third rung on the ladder is unique to humans: it’s the ability to imagine how different interventions can lead to different outcomes. One common way of putting this imagination into practice is to use counterfactual models. In other words, to picture what would happen if another action were taken.
Climate scientists, to name one group, do this all the time by asking questions like, “Would we see intense heat waves if carbon dioxide in the atmosphere were at preindustrial levels?”
Counterfactuals can also be applied to past events. They are common in legal proceedings, where they take the form of “but-for causation” questions. When someone has been shot and killed, a trial aims to answer the question “But for the defendant pulling the trigger, would the victim have died?”
These sorts of counterfactual questions are alien to machines.
If a house burns down after somebody strikes a match, most people would be happy to claim that the house would still be standing were it not for the lit match. However, logically speaking, it’s also true that it would still be standing had oxygen not been present. Because oxygen is normal and expected while the lighting of the match is not, we single out the match and disregard the oxygen’s causal role in the fire.
A computer doesn’t think that way. For it, both lit match and oxygen would be considered equal factors. In mathematical language, both are “necessary causes.” Therefore, the computer may be just as likely to conclude that the oxygen was to blame for the fire.
A computer might also calculate whether the match was a “sufficient cause” of the fire. This means that even though other factors may have been necessary for the fire to start, the computer works out whether the match was sufficiently responsible to be considered the cause. If the computer had been programmed to recognize that oxygen was necessary for the fire, it might equally conclude that the oxygen was the cause.
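The match-and-oxygen example can be made concrete as a tiny structural model, in which the “but-for” (necessity) test amounts to flipping one input while holding the other fixed:

```python
# Minimal structural model of the fire example: the fire starts only
# when both a lit match and oxygen are present (an AND gate).
def fire(match, oxygen):
    return match and oxygen

# Actual world: both factors present, fire occurred.
assert fire(match=True, oxygen=True)

# "But-for" test: remove one factor while holding the other fixed.
match_necessary = not fire(match=False, oxygen=True)   # True
oxygen_necessary = not fire(match=True, oxygen=False)  # True

# Both pass the necessity test, which is why a purely logical analysis
# treats the match and the oxygen as equally good explanations.
print(match_necessary, oxygen_necessary)  # True True
```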
Understanding the three rungs of the Ladder of Causation is crucial in helping us understand causal questions. But this raises the question: in scientific studies, what complicating factors should be identified on the different rungs of the ladder? Let’s look at that next.
Controlling for confounders is important in establishing causality.
We’ve already seen the case for controlled trials. But even in these scenarios, we have to be careful; the results can still be misleading if influencing factors known as confounders are not identified.
Here it would be best to backtrack for a moment to establish a definition. Confounders influence both the participants and the outcome of the experiment. They are generally associated with the second rung of the Ladder of Causation, since adjusting an experiment to take them into account requires intervention. For example, when a test group is much younger than a control group on average, age becomes a confounder. To control for it, only people of similar ages should be compared across the groups.
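Controlling for age by stratification can be sketched as follows. The trial records are invented; the idea is simply that treated and untreated participants are compared only within the same age group:

```python
from collections import defaultdict

# Invented trial records: (age_group, treated, recovered)
records = [
    ("young", True, True), ("young", True, True), ("young", True, False),
    ("young", False, True), ("young", False, False), ("young", False, False),
    ("old", True, True), ("old", True, False), ("old", True, False),
    ("old", False, False), ("old", False, False), ("old", False, False),
]

# Control for the confounder by comparing treated and untreated
# participants only within the same age group.
rates = defaultdict(dict)
for age in ("young", "old"):
    for treated in (True, False):
        outcomes = [r for a, t, r in records if a == age and t == treated]
        rates[age][treated] = sum(outcomes) / len(outcomes)

for age in ("young", "old"):
    print(age, round(rates[age][True], 2), round(rates[age][False], 2))
# young 0.67 0.33
# old 0.33 0.0
```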
But confounders are tricky things, since it’s often so difficult to eliminate them. In fact, that’s exactly why there was such a lively debate around the link between smoking and lung cancer in the 1950s and 60s. It was impossible for skeptics to discount that a third variable – such as genetics – could be responsible. However, one way to control for confounders is to introduce randomization.
For instance, the biases of researchers are a confounder. These can be controlled by randomly assigning participants to control and treatment groups. That way, neither participants nor researchers know who’s in what group, which is precisely why placebos are given to the control group in medical trials.
But randomization is not always practical or ethical. For instance, researchers could not ethically tell a random group of people to smoke for 30 years in order to test the link to cancer. It could kill them! Equally, there’s little point in forgoing a randomized controlled trial in favor of collecting data from people who are, say, taking prescription drugs of their own volition.
The data would just yield results that were completely misleading. People’s decision to take the drug or not may be based on all kinds of reasons, such as affordability. In that case, only people within a certain income bracket would provide data in the trial. One way to control for this would be for researchers to intervene by conducting a controlled experiment – the kind of action the authors capture with their “do” operator.
The identification of a mediator can be vital in establishing correct causality.
Knowing that causation exists is only half the battle. What really matters is establishing why one thing causes another. If you can work out why a disease is caused by a certain thing, that will make prevention and finding a cure that much easier. A mediator is a variable that tells us why one factor leads to a particular result.
This is best illustrated with an example. Houses are equipped with alarms to warn us if a fire breaks out. But, actually, they are really there to detect smoke. Smoke is the mechanism – the mediator – by which we know a fire has started. A causal diagram for these relationships would express them as fire > smoke > alarm.
Mediators sit on the third rung of the Ladder of Causation because they go hand in hand with counterfactuals. We could ask, for instance, “Would the fire have triggered the alarm if not for the smoke?” Mediators are useful things, then, but we can run into trouble if we start misidentifying them.
The classic example is scurvy, the disease that ravaged sailors for centuries. We know now that it can be prevented by taking vitamin C. However, when it was noticed in 1747 that citrus fruits counteracted scurvy, people assumed that it was the fruits’ acidity that was doing the work. After all, vitamins weren’t discovered until 1912.
The causal path is citrus fruit > vitamin C levels in the body > scurvy. Even though sailors had incorrectly identified the mediator, scurvy was still all but stamped out in the ranks of the British Navy through the judicious doling out of citrus fruits. But the very same mistake resulted in the disaster that befell the British Arctic Expedition of 1875.
On that voyage, the sailors’ lime juice was certainly acidic, but was lacking in vitamin C. Before too long, the onset of scurvy was apparent. However, some of the sailors were also eating fresh reindeer meat, which contains vitamin C. Consequently, when sailors who were eating tinned meat came down with scurvy, doctors concluded that bad meat was the cause.
It was a deduction that proved deadly; the mediator had been misidentified. As a result, Robert Falcon Scott’s South Pole expedition disembarked without citrus fruit. Just one scurvy-ridden crew member made it back alive, while two of the five who perished most likely died of scurvy. To put all of that into counterfactual terms, if doctors had known about vitamins when scurvy was widespread, the fate of Scott’s crew may well have been different.
Factors and their relationships can be expressed with mathematical formulae, which could be turned into algorithms.
We’ve reflected a fair amount on causation so far. But do these musings help us work out if correlation implies causation? Equally, what potential would an answer have for AI? The first thing we can do is draw causal diagrams. After that, it’s possible to create a mathematical formula that demonstrates the likelihood of a relationship existing between correlation and causation.
A causal diagram presents all known factors in one place. The factors that directly affect one another are then linked together with arrows. It’s then possible to see clearly which are mediators and which are confounders. Health care specialists might well try this when testing the effectiveness of a drug that claims to lower blood pressure. They might draw a diagram with arrows linking the drug and blood pressure, lifespan and blood pressure, and the drug and lifespan.
Since age affects both blood pressure and lifespan – quite independently of the drug – arrows are drawn from age to each of them, which identifies age as a confounder. Symbolically: age > blood pressure and age > lifespan.
Thanks to the diagram, the probability of a lifespan of any given length – assuming the individual has taken the drug – can then be expressed in a formula. The clever bit is the methodology: because it progresses logically, step by step, machines can be its ultimate beneficiaries.
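The authors’ own formulas aren’t reproduced in this summary, but the standard adjustment for a confounder like age can be sketched with invented numbers: average the age-specific outcome rates, weighting each by how common that age group is in the whole population, not just among drug takers.

```python
# Invented population breakdown by age group.
p_age = {"young": 0.6, "old": 0.4}

# Invented outcome rates: P(long life | drug status, age).
p_long_life = {
    ("young", True): 0.9, ("young", False): 0.8,
    ("old", True): 0.6, ("old", False): 0.4,
}

def p_do(drug):
    # Adjustment formula: sum over age groups of
    # P(long life | drug, age) * P(age), weighting by the
    # population-wide age distribution.
    return sum(p_long_life[(age, drug)] * p for age, p in p_age.items())

effect = p_do(True) - p_do(False)
print(round(p_do(True), 2), round(p_do(False), 2), round(effect, 2))
# 0.78 0.64 0.14
```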
This cause-and-effect process could be programmed into a computer. We’d use it just like we use path diagrams: assumptions and data would be fed in and we’d then pose a question.
If the computer determines that the question can be answered by using the causal model, it would then engineer a mathematical formula. This formula could then be used to calculate, not only an answer, but the statistical uncertainty in that answer. This uncertainty is a reflection of the limited data set as well as possible measurement errors. This would mean that, for the first time, computers would be able to ask “why?”
We don’t have to think hard to see the benefits that could emerge if we could ask computers causal questions: What types of planets could sustain life? Is there a cancer-causing gene? Surely, enormous advances in science and medicine are there for the taking.
The key message in this book:
The long-standing refusal to engage with causation has held back all areas of research and stymied scientific advances. It is perfectly possible, in contradiction to received wisdom, to establish a logical process for determining when correlation implies causation. Furthermore, this method could be programmed into computers so they can answer causal questions, ensuring rigorous scientific advances for decades to come.