Correlation Is Not Causation !!
LESSON OBJECTIVES...
SUMMARY
WHAT DO WE THINK?
This activity is intended to get students thinking about why this topic is important. Students to discuss in pairs to produce written answers.
experience, but also to think about how to gather relevant data, perhaps by asking a large number of people to provide information about personal income and self-reported happiness
Students may think that a strong performance is based on skill alone, so may be surprised by this. Encourage them to think about other factors behind a strong performance (e.g. good luck)
Some superstitious sports fans (and players) have bizarre rituals that they always perform before a game. Why? How could we check to see if they really work?
People who perform such rituals usually believe that there is a relationship between their behavior and the team's performance. Could be checked by stopping/changing ritual
Why might it be helpful to work out if different variables are related to each other, or if one variable is even driving the change in the other variable?
Encourage students to think more broadly about the importance of establishing which variables are driving outcomes that they may care about e.g. health, education etc.
The more ice-creams sold on any given day, the hotter the temperature. Is this proof that ice cream sellers control the weather?
Encourage students to see that relationships can work in different directions. Ask them to consider if it's more likely that the weather is determining the ice cream sales
If you could investigate the relationship between any two variables, which would you pick? (e.g. meditation vs stress, family size vs self confidence, screen time vs academic performance…?)
Encourage students to think of an outcome that they genuinely care about, so that they might reflect on the importance of establishing which variables may influence it
PART 1: CORRELATION - Looking for Links
Are these two variables related?
Check that the students can plot coordinates. Read through steps as a class, demonstrating how the scatter graph is constructed
Absences   Test score
4          84
11         70
13         68
8          75
15         62
3          87
17         55
5          73
14         62
9          76
What does the scatter plot tell us about the relationship between the number of absences and test scores?
Students should observe that as the number of absences increases, the test scores decrease
Can you give an explanation for why the student who was absent 5 times falls quite far below your line?
This student was rarely absent but still performed poorly. His weak performance is likely due to general difficulties with
VERY STRONG: The dots are closely clustered along the line
MODERATE: The dots fall a little further from the line
NEGATIVE: The variables move in opposite directions. As one increases, the other decreases
NO CORRELATION: There is no relationship between how the variables change
Students to discuss in pairs, or work individually, to produce written answers, then review answers as a class
DESCRIBING CORRELATIONS
Using the language on the previous page, describe the
correlations shown in the following scatter plots.
Remember to describe both the STRENGTH and TYPE of
correlation
Students should note that, as the number of training hours increases, the average 100m time decreases. Students should describe the correlation as STRONG and NEGATIVE
CORRELATION COEFFICIENT
Statisticians describe these kinds of relationships using a value, r, which can vary from -1 to 1.
Values between 0 and 1 indicate a POSITIVE correlation.
Values between 0 and -1 indicate a NEGATIVE correlation.
Values further from 0 indicate a STRONGER correlation
Values closer to 0 indicate a WEAKER correlation
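For teachers who want to show where a value of r comes from, the sketch below computes it for the absence and test score pairs listed earlier in Part 1. It assumes that r here means the Pearson correlation coefficient, which the worksheet does not state explicitly; the function name pearson_r is our own.

```python
from math import sqrt

# Absences and test scores as listed in Part 1 of this lesson.
absences = [4, 11, 13, 8, 15, 3, 17, 5, 14, 9]
scores = [84, 70, 68, 75, 62, 87, 55, 73, 62, 76]


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (numerator of r).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (they form the denominator of r).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = pearson_r(absences, scores)
print(f"r = {r:.2f}")
```

For this data the result is close to -1 (roughly -0.95), which matches the description STRONG and NEGATIVE: more absences go with lower scores.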
PART 2: CAUSATION - Looking for CAUSES
Does one variable drive the change in the other?
Review this first section with the students as a class
Does the rotation of windmill blades
this, but also why it's incorrect. The windmill blades don't drive the wind, the wind drives the windmill blades.
Students discuss in pairs to produce written answers, then review as a class
Many possible answers. Encourage students to consider
Gather initial ideas from students here, but this topic is fully explored on the next two pages.
Review this page with the students as a class
REVERSE CAUSATION
Rather than A causing B, would B causing A make more sense?
Example: Winter coat usage (A) correlates with cold weather (B), but cold weather actually causes winter coat usage
CONFOUNDERS
Might some other variable, C, actually be causing both A and B?
Example: Basketball performance (A) and shoe size (B) are correlated, but 'height' (C) drives both basketball performance and shoe size
COINCIDENCE
If there’s no reasonable connection, could it just be a coincidence?
Example: From 2000-2009, the amount of cheese eaten per person (A) correlated with the number of people who died by becoming tangled in their own bedsheets (B) ....!
MULTIPLE CAUSES
Might A be only one of many causes of B?
Example: Having good friends (A) correlates with well-being (B), but many other factors (C, D...etc.) also contribute to well-being
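The confounder idea can be demonstrated with a small simulation. The sketch below invents data in which height drives both shoe size and basketball performance, while the two never influence each other. All of the numbers (average height, slopes, noise levels) are made up for illustration, and the helper names are our own.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def residuals(ys, xs):
    """What is left of ys after subtracting a least-squares line fitted on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# Invented numbers: height (C) feeds into both variables,
# but shoe size (B) and performance (A) never affect each other.
height = [rng.gauss(175, 10) for _ in range(n)]
shoe_size = [0.2 * h + rng.gauss(0, 1) for h in height]
performance = [0.5 * h + rng.gauss(0, 5) for h in height]

# Correlation between A and B when height is ignored.
r_raw = pearson_r(shoe_size, performance)

# Correlation between A and B once the part explained by height is removed.
r_adjusted = pearson_r(residuals(shoe_size, height),
                       residuals(performance, height))

print(f"ignoring height:       r = {r_raw:.2f}")
print(f"after removing height: r = {r_adjusted:.2f}")
```

In this setup the first correlation is clearly positive, while the second is close to zero. That is what a confounder looks like: the link between A and B disappears once C is taken into account.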
Students to discuss in pairs, or work individually, to produce written answers, then review answers as a class
More students who use a tutor have poor academic grades ... so tutors damage academic performance
Most likely explanation is REVERSE CAUSATION. Students who have poor grades are more likely to seek out a tutor
In the 1990s, the stork population of Germany increased and the at-home birth rate also increased ... so storks really do deliver babies
Most likely explanation is COINCIDENCE. There is no reasonable mechanism by which the stork population could
Most likely explanation is CONFOUNDERS. Illness increases chance of being in a hospital bed, and also increases chance
Most likely explanation is CONFOUNDERS. Parents who have many books in the home typically place great value on learning,
HOMER: NOT A BEAR IN SIGHT. THE BEAR PATROL MUST BE WORKING LIKE A CHARM.
LISA: THAT’S SPECIOUS REASONING, DAD.
HOMER: THANK YOU, DEAR.
LISA: BY YOUR LOGIC I COULD CLAIM THAT THIS ROCK KEEPS TIGERS AWAY.
HOMER: OH, HOW DOES IT WORK?
LISA: IT DOESN’T WORK.
HOMER: UH-HUH.
LISA: IT’S JUST A STUPID ROCK.
HOMER: UH-HUH.
LISA: BUT I DON’T SEE ANY TIGERS AROUND, DO YOU?
[HOMER THINKS OF THIS, THEN PULLS OUT SOME MONEY]
HOMER: LISA, I WANT TO BUY YOUR ROCK.
[LISA REFUSES AT FIRST, THEN TAKES THE EXCHANGE]
PART 3: REGRESSION TO THE MEAN - THE HIDDEN DRIVER
Was that change going to happen anyway?
Review this first section with the students as a class
Use COUNTERS to represent SPEED CAMERAS
How to play:
(4) If you score 10, 11 or 12, you have a DANGER-STREET. Your teacher will now give you a speed camera, and ask you to repeat Step 2 to find the number of accidents on your street in the following year.
Discuss the following questions as a class
Were the speed cameras really responsible for the change in accident numbers?
No. Most likely result in Year 2 is a mid-range number, which will
What does this activity tell us about the effectiveness of speed cameras?
It tells us accident numbers would probably fall just by chance, so we actually don’t know if speed cameras are effective or not
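The classroom game can also be run many times on a computer to show the pattern more clearly. The sketch below is only an approximation of the activity: the earlier steps of the game are not reproduced here, so it assumes that a street's yearly accident count is the total of two dice, which fits the DANGER-STREET scores of 10, 11 and 12. The speed cameras do nothing at all in this simulation.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable


def accidents_in_a_year():
    """Assumed rule: yearly accidents on a street = total of two dice."""
    return rng.randint(1, 6) + rng.randint(1, 6)


n_streets = 10_000
year1 = [accidents_in_a_year() for _ in range(n_streets)]
# Year 2 is rolled in exactly the same way, so cameras have no effect here.
year2 = [accidents_in_a_year() for _ in range(n_streets)]

# Streets that scored 10, 11 or 12 in Year 1 get a speed camera.
danger = [i for i, a in enumerate(year1) if a >= 10]

danger_year1 = sum(year1[i] for i in danger) / len(danger)
danger_year2 = sum(year2[i] for i in danger) / len(danger)
fell = sum(1 for i in danger if year2[i] < year1[i]) / len(danger)

print(f"danger streets, Year 1 average: {danger_year1:.1f}")
print(f"danger streets, Year 2 average: {danger_year2:.1f}")
print(f"share of danger streets where accidents fell: {fell:.0%}")
```

Even though the cameras have no effect in this model, most danger streets see fewer accidents in Year 2, because a very high dice total is usually followed by a more typical one.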
Review the first two sections with the students as a class, reading the story aloud. For the questions, work individually, or in pairs, to produce
written answers, then discuss as a class
PERFORMANCE, SKILL & LUCK
The speed trap test was based purely on random chance, but do we see the same effect with elite performance, where skill is involved?
In his book 'Thinking, Fast and Slow' psychologist Daniel Kahneman tells a story about performance. He was explaining to instructors who teach pilots that praise works better than punishment. However, one of the most experienced instructors tells him that he's wrong. The instructor explains:
"On many occasions I have praised flight cadets for clean execution of some aerobatic maneuver. The next time they try the same maneuver, they usually do worse. On the other hand, I have often screamed into a cadet's earphone for bad execution, and in general he does better on his next try. So please don't tell us that reward works and punishment does not, because the opposite is the case."
THE MATHEMATICS OF HIGH PERFORMANCE
Most outcomes are a result of two main factors - skill and luck
PERFORMANCE = SKILL + LUCK
SKILL is consistent, but LUCK is not
GREAT PERFORMANCE: GOOD SKILL + GOOD LUCK. Luck is likely to change, so performance will probably dip to average
POOR PERFORMANCE: POOR SKILL + BAD LUCK. Luck is likely to change, so performance will probably improve to average
How had the aircraft instructor misunderstood the impact of his teaching methods?
Students should understand that he believed his words were responsible for the following changes in performance
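The PERFORMANCE = SKILL + LUCK idea can be tried out in a short simulation. In the sketch below each cadet has a fixed skill level and fresh luck on every attempt. The cadet numbers, the scale of skill and luck, and the top and bottom 10% cut-offs are all invented for illustration. Nobody is praised or punished in this model.

```python
import random

rng = random.Random(2)  # fixed seed so the run is repeatable
n_cadets = 5_000

# Skill stays the same from one attempt to the next (invented scale).
skill = [rng.gauss(0, 1) for _ in range(n_cadets)]


def attempt():
    """One maneuver per cadet: performance = skill + fresh luck."""
    return [s + rng.gauss(0, 1) for s in skill]


first = attempt()
second = attempt()

# Rank cadets by their first attempt and pick out the extremes.
ranked = sorted(range(n_cadets), key=lambda i: first[i])
tenth = n_cadets // 10
worst, best = ranked[:tenth], ranked[-tenth:]


def mean_change(group):
    """Average change from first to second attempt for a group of cadets."""
    return sum(second[i] - first[i] for i in group) / len(group)


best_change = mean_change(best)
worst_change = mean_change(worst)

print(f"best 10% on first try:  average change {best_change:+.2f}")
print(f"worst 10% on first try: average change {worst_change:+.2f}")
```

The best first-time performers get worse on average and the worst get better, with no praise or punishment involved. This is the pattern the instructor mistook for the effect of his words.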
In terms of performance, what do you think athletes invited to appear on the magazine cover have in common?
Students should suggest that all sports stars invited to be on the cover will have been performing at the very highest level in their respective sport in the period before receiving their invitations
Can you think of another explanation for 'The Sports Illustrated Curse'?
Since good performance entails good skill and good luck, and luck is likely to change, their performance is likely to dip anyway. The curse is most probably an example of regression to the mean
Can the principal justifiably claim that the new uniform policy worked?
The policy was introduced when test scores were unusually low, so they would most likely have increased in the following year anyway, again due to regression to the mean
What could the principal have done differently to really test the effectiveness of his uniform policy?
He could have introduced the policy after an average set of results. Alternatively, he could have made only one of two classes adopt the policy, thus 'controlling' for regression to the mean
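The 'control' idea in the uniform question can be sketched as a simulation. This is a scaled-up, hypothetical version of the two-class design: many classes with unusually low scores are split at random, and only half adopt the policy. All score levels and noise sizes are invented, and the policy's true effect is set to zero on purpose.

```python
import random

rng = random.Random(3)  # fixed seed so the run is repeatable
n_classes = 20_000
TRUE_POLICY_EFFECT = 0.0  # assumption: the uniform policy does nothing

# Invented numbers: each class has a stable ability plus yearly luck.
ability = [rng.gauss(70, 5) for _ in range(n_classes)]
year1 = [a + rng.gauss(0, 5) for a in ability]

# Only classes with unusually low Year 1 scores are considered for the policy.
low = [i for i in range(n_classes) if year1[i] < 62]
rng.shuffle(low)
half = len(low) // 2
uniform, control = low[:half], low[half:]

year2 = [a + rng.gauss(0, 5) for a in ability]
for i in uniform:
    year2[i] += TRUE_POLICY_EFFECT


def mean_gain(group):
    """Average Year 1 to Year 2 score change for a group of classes."""
    return sum(year2[i] - year1[i] for i in group) / len(group)


gain_uniform = mean_gain(uniform)
gain_control = mean_gain(control)
estimated_effect = gain_uniform - gain_control

print(f"uniform classes improved by {gain_uniform:.1f} points")
print(f"control classes improved by {gain_control:.1f} points")
print(f"estimated policy effect:    {estimated_effect:+.1f} points")
```

Looking only at the uniform classes, the policy seems to work, because their scores rise. The control classes rise by about the same amount, so the difference between the two groups (close to zero here) is the fairer estimate of what the policy did.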
PART 4: DECISION SCOPE - APPLYING WHAT WE KNOW
How can this help us make better decisions?
Read each of the three scenarios aloud. Students to discuss in pairs, or work individually, to produce written answers. Review answers as a class
Students should note that CONFOUNDERS may well explain this correlation. Responsible and attentive parents are probably more likely to organize regular family meals, and are also probably more likely to take an active interest in their children's academic
the hidden driver in this scenario. The protagonist sought a new remedy when his health was much worse than average. In most cases, he would soon return to average health anyway, but the timing of the intervention makes it appear that the apricot seeds
Extremely high numbers usually come down and extremely low numbers go up.
MULTIPLE CAUSES
Might A be only one of many causes of B?
Answer the following questions to check you have understood all the important ideas from this lesson:
Why is it helpful to know if two variables are related?
If we want to influence a certain outcome B, it is useful to know which other variables it is related to (A, C, D etc.). These correlations can be further investigated to establish causation,
List 3 alternative explanations to consider, before making a claim of causation:
1 - Reverse Causation
2 - Confounders
3 - Coincidence
When a claim is made that a certain intervention has made improvements, what trap should you check for?
Regression to the mean. It is also important to know if the
extreme value
each other
Good luck and bad luck will soon be followed by average luck,
change in the other variable.
claim is an example of reverse causation. In reality, an increase