MSc Statistics Booster
BECKY MCNEILL AND CHLOE BATES
What we will look at
today
This morning:
The research process
Variables, measurement, and research design
The basics of understanding data
An introduction to R Studio software
This afternoon:
Hypotheses
Significance
T-tests & R Studio practice
Why do we need
psychological
science?
Grand goals of psychology
1. Describe: What a mental process or behaviour
entails, when and how it occurs.
2. Explain: Knowing the causes of a mental state or
behaviour.
3. Predict: Predicting changes in mind/behaviour
when certain factors are changed.
4. Control: Manipulating factors to cause mental
processes or behaviour to occur.
The hypothetico-deductive method
1. Based on existing knowledge, generate a hypothesis, e.g. that
national identity plays an important role in watching a coronation.
2. Deduce predictions from the hypothesis
3. Test whether observed data are
in line with prediction
4. Use the outcome of the test as evidence to
support or reject the hypothesis
Sampling in
psychological science
Psychology aims to generalise about differences or
associations in a population, based on a sample of
observations.
◦ Always some chance we are wrong – thus results from a
single study are not conclusive!
◦ Since we cannot logically confirm a hypothesis based on a
sample alone, we try to reject the null hypothesis
◦ If there were no relationship between national identity
and watching the coronation, it would be unlikely that we
would get supporting data by chance alone
Any questions so far?
Reliability
Reliability: The extent to which an event/behaviour
produces the same score by our measurements
each time.
Depends on degree of measurement (or
manipulation) error.
◦ Proportion of score accounted for by construct…
Person’s score on the measured variable = Hypothetical construct being measured + Error
Test-retest reliability
Reliable – we can trust this IQ test:
IQ in the morning: 88. IQ in the evening: 87.
Unreliable – we cannot trust this IQ test:
IQ in the morning: 88. IQ in the evening: 125.
Internal consistency of
questionnaires
Example: Extraversion – do people who score high on Item 1 also score
high on Items 2 and 3?
Validity
Validity: the extent to which a measure or observation
indicates what it is intended to measure.
Three broad types of validity:
◦ Measurement validity
◦ Internal validity
◦ External validity
What exactly are we
measuring?
Intangible hypothetical construct: Intelligence
Component variables: Mathematical skills; Verbal skills
Operationalisation: number of words that can be kept in mind during a delay; identifying words that have similar meaning
Measurement validity
Some types* of measurement validity:
Content validity: Degree to which our measure
includes all aspects of a hypothetical construct
Construct validity: Degree to which our measure
actually and only reflects the hypothetical construct
See also: “face” validity, “criterion” validity, etc.
*However some argue that these are only different aspects of the general concept
measurement validity
Example threats to
measurement validity
Construct validity:
◦ Verbal skills may reflect quality of education rather than intelligence?
◦ Bias against some groups?
◦ Does IQ really represent real-life intelligence?
Content validity:
◦ Do sub-tests measure all aspects of intelligence?
◦ Specific questions may not tap all aspects of verbal/mathematical ability?
Internal Validity
Internal validity: The degree to which the relationship we observe in
our sample exists actually and only between the variables of interest.
◦ Is the research design successful in eliminating confounds?
External validity
External validity: Degree to which the observed
relationship generalises to other participants in
other situations.
◦ Ecological validity: How much the procedures of the
experiment are similar to real-life events.
◦ Temporal validity: How much the results apply across
times.
Internal vs. external
validity
Internal Validity – first impression study:
• judging faces
• well-lit and silent lab
• model always stays 2 minutes
External Validity – first impression study:
• judging faces + clothing, gestures, smell
• in natural environment
• natural light (changeable)
• no control over noise
• various times
Increasing one often decreases the other – researchers make tough choices!
Summary: Reliability
and Validity
Reliability: Error-free measurements, regardless of what they
actually measure.
Validity: Whether measurements reflect variables, constructs and
relationships they are meant to reflect.
Any questions so far?
15-minute break
Variables and
Measurement
In this next section we will discuss:
Different types of variables
◦ IVs & DVs
How variables are measured
◦ Nominal, ordinal, interval, & ratio
Basic research designs
◦ Correlational designs
◦ Experimental designs
◦ Between and within-participant designs
Two different types of
variables
Independent Variables (IV)
◦ The variables that you assume to “come first”
Dependent Variables (DV)
◦ The variables that you assume to “come second”
Example: Alcohol Consumption (Independent Variable) → Cognitive Function (Dependent Variable)
Usually, this difference implies causality. Exception:
Predicting from incomplete data.
Two different types of
variables
Even if some variables cannot be manipulated,
you can still have a sense of direction, or relationship,
between variables, and use the IV and DV terms.
This is true if A can cause B but B cannot cause A.
Examples: Gender → Salary; Smoking → Lung Cancer
Levels of Measurement
Nominal Variables
◦ Measured using scales for the sole purpose of categorization.
◦ The numerical values are nothing more than a category.
◦ Similar to a label
◦ No implication of hierarchy
Examples: 1 = Male, 2 = Female, 3 = Non-binary; or 1 = Omnivore, 2 = Vegetarian, 3 = Vegan
Levels of Measurement
Ordinal Variables
◦ A scale that implies order or hierarchy
◦ For example, winners in a contest or a race
◦ 1 = First place
◦ 2 = Second place
◦ 3 = Third place
◦ More informative than nominal
variables
◦ They do not, however, tell us
about differences between the
categories – the winner could be 100m
ahead of the others
Levels of Measurement
Ordinal Variables – An indication of direction, but no clear
grounds of comparison.
To what extent does this make you feel disgusted?
1 2 3 4 5 6 7
Not at all Somewhat Entirely
Does going from 4 to 6 mean the same increase in disgust
as going from 1 to 3?
Regularly used in questionnaire-based research
Levels of Measurement
Interval Variables
◦ Offers clear information about order and offers
equal intervals between scale points.
What time of day do you
feel most alert?
12:00 1:00 2:00 3:00 4:00 5:00 6:00 7:00 8:00 9:00 10:00
11:00 12:00 …
Equal intervals
Levels of Measurement
Ratio Scales
◦ Has all the properties of an interval variable, but also,
zero is a meaningful value.
A good example of this is height, weight, reaction time, or
the amount of sweets eaten in a behavioural control task.
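To make these four levels concrete in the software we use later today, here is a minimal R sketch; the variable names and values are invented for illustration:

# Nominal: the numbers are only category labels
diet <- factor(c(1, 3, 2), labels = c("Omnivore", "Vegetarian", "Vegan"))
# Ordinal: categories have an order, but gaps between them are not equal
place <- factor(c("First", "Second", "Third"),
                levels = c("First", "Second", "Third"), ordered = TRUE)
# Interval: equal spacing between points, but no true zero (e.g. clock time)
alert_hour <- c(9, 13, 17)
# Ratio: equal spacing and a meaningful zero (e.g. reaction time in ms)
reaction_ms <- c(420, 515, 388)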
Basic Research Design
So far we’ve looked at what sorts of measures we can use to
help us answer scientific questions.
Next, we will move on to examine how data are collected.
Put simply, there are two main ways to collect data:
◦ Observe what happens naturally
◦ Manipulate some aspect of the environment to see how it affects the
variable we are interested in.
Correlational Research
Observing and measuring the world as we find it. No
manipulation is involved.
◦ Processed food consumption and diabetes
◦ Job satisfaction and productivity
Researchers gather information about different
variables and then observe how they are related to
each other.
◦ Lack of manipulation. Researchers are just measuring aspects
of the world as they find it.
Minimal impact can mean high ecological validity.
Correlational Research
Example: Self-esteem and dating anxiety
◦ Researchers can administer a survey to assess self-esteem and a
survey to assess dating anxiety.
◦ It is found that there is a correlation between low self-esteem and
high dating anxiety.
It is important to remember that correlation does not equal
causation.
◦ Low self-esteem may cause high dating anxiety, or…
◦ High dating anxiety may cause low self-esteem, or…
◦ Both may be caused by a third variable, or…
◦ Low self-esteem may be caused by X, and high dating anxiety may be
caused by Y, or…
◦ And so on…
Correlation ≠ Causation
http://www.tylervigen.com/spurious-correlations
Experimental Research
In an experimental study, the researchers
systematically manipulate the independent
variable.
◦ The key goal of manipulation is to determine
cause and effect.
Experiments also must control for the effects
of confounding variables.
◦ This can be done by randomly assigning
participants to experimental groups.
Experimental Research
For instance, researchers may be interested in the
effects of time pressure on face matching.
Participants can be randomly assigned to three
experimental conditions:
◦ Condition 1: High time pressure.
◦ Condition 2: Low time pressure.
◦ Condition 3: No time pressure (i.e. control condition)
Quasi-experimental
Research
Oftentimes we want to look at variables that we
cannot directly manipulate.
◦ E.g. gender, smoking status, developmental disorders
Trying to make the groups as equal as possible on
other factors, but still…
No random assignment = not experimental
◦ Thus, quasi-experimental
Two Methods of Data
Collection
Between-participants design
Manipulates the IV using different entities.
◦ Different groups take part in different experimental conditions
◦ Comparing cognitive functioning of participants that drank 1, 3, or no
alcoholic drinks.
Within-participants design
Manipulates the IV using the same entities.
◦ The same group takes part in different experimental conditions.
◦ As you can imagine, this method may not be ideal for the example of
alcohol on cognitive functioning.
Any questions so far?
The Basics of Data
Descriptive statistics
◦ Mean
◦ Median
◦ Mode
The normal curve
Descriptive Statistics
Let’s say that we took demographic information
from 100 participants.
◦ Age, Gender, Race, Marital Status, and Highest Level of
Education.
Five measures from 100 pps = 500 pieces of data.
We certainly can’t show a spreadsheet containing
such a large amount of information.
How do we most effectively deliver this
information?
Descriptive Statistics
Loads of measurements can result in an overwhelming amount of data.
It can be more cleanly presented as a figure.
It would also be nice to have a numerical summary of the data.
[Figure: bar chart of number of responses by no. of books read]
Descriptive Statistics
Estimates of central tendency
◦ Used to provide a summary of where scores are located on a scale
◦ Mean, Median, Mode
Measures of dispersion/variability
◦ A quantification of how much the scores vary
◦ Range, Variance, Standard Deviation, Inter-Quartile Range
[Figure: bar chart of number of responses by no. of books read]
The Mean
The central tendency most commonly used and what people may colloquially refer to as the average.
To calculate the mean, we add up all the scores and then divide by the total number of scores.
[Figure: bar chart of number of responses by no. of books read]
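In R this is a one-liner; a minimal sketch (the book counts below are made up for illustration):

books <- c(1, 3, 3, 5, 5, 5, 7, 9, 11, 13)  # hypothetical no. of books read
sum(books) / length(books)                   # add up all scores, divide by the count
mean(books)                                  # R's built-in function gives the same: 6.2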
The Mean
Not always at the middle of a set of numbers
◦ Sensitive to outliers (extreme values)
The data’s mean will gravitate towards outliers, so do not simply accept it as representative of where scores are clustered.
[Figure: skewed distribution where outliers pull the mean away from where scores cluster]
The Median
Another way of quantifying the centre of a distribution is to
look for the middle score(s) when the scores are ranked in
order of magnitude.
35, 55, 55, 60, 80
With an even number of scores, then you take the mean of
the two centralmost scores.
35, 55, 55, 57, 60, 80
(55 + 57)/2 = 56
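The same examples in R (a minimal sketch using the scores above):

scores_odd <- c(35, 55, 55, 60, 80)
median(scores_odd)     # 55: the middle score once ranked
scores_even <- c(35, 55, 55, 57, 60, 80)
median(scores_even)    # (55 + 57) / 2 = 56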
The Median
The median bisects your data and it is less
affected by outliers than the mean is.
[Figure: skewed distribution in which the median sits closer than the mean to where scores cluster]
The Mode
Simply the score that occurs most frequently in a
distribution.
Let’s look at the following data:
35, 80, 55, 55, 60
Simply determine which occurs most frequently
◦Easy to spot in a graph because it will always be the
tallest bar.
There can be multiple modes (bimodal, trimodal..)
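Base R has no statistical mode function (its mode() reports an object’s storage type), so one common workaround counts frequencies with table(); a minimal sketch with the scores above:

scores <- c(35, 80, 55, 55, 60)
freq <- table(scores)             # how often each value occurs
names(freq)[freq == max(freq)]    # "55": the most frequent value(s)
# as.numeric() converts the result back to a number if needed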
Visualizations of Central Tendency
• A normal distribution and
skewed distributions are
shown here.
• Across these distributions,
the measures of central
tendency fall at different
points on the curve.
• This is why you should not
refer to the average, but to a
specific measure of central
tendency.
The Dispersion of a
Distribution
In addition to central tendency, it is also useful to
quantify the spread, or the dispersion, of data.
This allows us to capture the variation across scores.
[Figure: two distributions illustrating low dispersion vs. high dispersion]
Range
Pro tip: the range can be very useful in identifying coding errors.
Range relates to the difference between the lowest and the
highest value in the sample.
A sample of 8 people are asked ‘How old were you when
you passed your driving test?’
◦ 17, 18, 19, 21, 22, 25, 35, and 41 years
As the lowest value is 17 years and the highest value is 41
years:
◦ Then the range is 24 years
◦ Does not tell us anything about how scores are distributed.
Range
17, 18, 19, 21, 22, 25, 35, 41
You may have noticed from our example that the range is
very much influenced by the extreme scores.
◦ Most numbers are tightly between 17-25 but outliers 35 and 41
make it seem much more variable
As such, some researchers prefer to use the inter-quartile
range.
◦ The highest and lowest scores of the middle 50% of all cases
◦ The IQR in these 8 scores is 25 − 19 = 6 (look at the middle 4 scores)
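In R (a minimal sketch with the driving-test ages; note that R’s IQR() interpolates quantiles, so on a small sample it will not equal the simple middle-50% value taught here):

ages <- c(17, 18, 19, 21, 22, 25, 35, 41)
max(ages) - min(ages)   # range = 41 - 17 = 24
sort(ages)[3:6]         # the middle 4 scores: 19 21 22 25
25 - 19                 # the slide's middle-50% IQR = 6
IQR(ages)               # 8.75: R's default quantile interpolation differs on small samples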
Variance
This focuses on the degree to which the scores on a
variable are different from one another.
It is an indication of the variability of your data
Imagine a range of ages:
18, 21, 23, 18, 19, 19, 19, 33, 18, 19, 19, 20
Mean age = 246/12 = 20.5
For each age, we calculate its deviation from the mean (age − mean).
Summing these deviations gives us a total of zero. Not helpful!
Instead, we square each deviation; the sum of these squared deviations (the sum of squares) is the basis of the variance.
Variance
Variance = 193/12 = 16.08
The average squared deviation from the mean.
Note the different unit of measurement: squaring the deviations puts the variance in squared units!
Ages: 18, 21, 23, 18, 19, 19, 19, 33, 18, 19, 19, 20
Variance -> Standard
Deviation (SD)
To get a measure of deviation that is in the original units of measurement, we need to take the square root of the variance. This gives us the standard deviation.
Standard Deviation: the degree to which scores in a dataset deviate around the mean.
Variance = 16.08
SD = √16.08 = 4.01
It is important to understand the concept of the SD because it is a key component of many statistical analyses and is often reported.
Calculating Standard
Deviation (SD)
The ages from before: 18, 21, 23, 18, 19, 19, 19, 33, 18, 19, 19, 20
1. Calculate mean = 20.50
2. Calculate each score’s deviation from the mean.
◦ The sum will be zero!
3. We resolve this by squaring each deviation.
4. We then calculate the mean of these squared deviations to
get our variance.
◦ But it won’t be in the units of measurement!
5. Finally get the SD in units of measurement by taking
the square root of the variance (“unsquaring” the
scores).
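The whole recipe in R (a minimal sketch; note that R’s built-in var() and sd() divide by n − 1 rather than n, so they give slightly larger values than the hand calculation above):

ages <- c(18, 21, 23, 18, 19, 19, 19, 33, 18, 19, 19, 20)
m   <- mean(ages)              # step 1: 20.5
dev <- ages - m                # step 2: deviations from the mean
sum(dev)                       # 0 -- not helpful
sq  <- dev^2                   # step 3: square each deviation
sum(sq) / length(ages)         # step 4: variance = 193/12 = 16.08
sqrt(sum(sq) / length(ages))   # step 5: SD = 4.01
var(ages); sd(ages)            # built-ins divide by n - 1: 17.55 and 4.19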
Standard Deviation – Visual
Example
[Figure: bar chart with the mean marked in the centre and the SD shown as the typical distance from it]
Standard Deviation gives us the average amount of deviation from the mean
Introduction to R Studio
We are going to do a brief overview of R, a software environment for data analysis.
R Studio is an interface that lets you use R through four main windows:
◦ Script: an organizer for the commands that define your analyses
◦ Console: the output window, showing commands and results
◦ Environment: a list of objects available in the programming environment, such as dataframes
◦ Files: a list of files, with tabs for other useful outputs such as plots
Open up R Studio now.
Also open up Microsoft Excel as you will be using it to enter data from
the exercise.
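If you want something to type while you explore the windows, here is a minimal sketch (the numbers are arbitrary):

x <- c(2, 4, 6, 8)   # creating an object makes it appear in the Environment window
mean(x)              # commands and results show up in the console
summary(x)           # min, quartiles, mean, and max in one line
hist(x)              # the plot appears under the Plots tab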
R Practice!
Descriptive stats
Lunch
What we will look at
today
This morning:
The research process
Variables, measurement, and research design
The basics of understanding data
An introduction to R Studio
This afternoon:
Hypotheses
Significance
T-tests & R Studio practice
Correlations
Hypotheses
Null hypothesis: No difference or association in the
population: any difference is due to chance
Alternative hypothesis: A difference of a certain size
or larger exists in the population
An alternative hypothesis is a testable statement
But using inferential statistics, we address the null
hypothesis
Hypotheses
We cannot:
◦ Prove or disprove either the null (H0) or the experimental
(H1) hypothesis
We can:
◦ Reject the null hypothesis (a significant difference was
found in the sample, which is unlikely if the difference is
null in the population)
◦ Fail to reject the null hypothesis (no significant difference
was found in the sample, so we cannot dismiss the
possibility that the difference is null in the population) – we
never accept the null
Hypothesis examples
NULL: There is no link between Pill X and depressive symptoms.
ALTERNATIVE: People taking Pill X experience a reduction in their symptoms.

NULL: There is no effect of sleep deprivation on cognitive performance.
ALTERNATIVE: Cognitive performance deteriorates in sleep-deprived participants.

NULL: There is no relationship between grades in year 1 and degree classification.
ALTERNATIVE: Students who perform better in year 1 receive a higher degree classification.
Hypothesis examples
DIRECTIONAL: 7 year olds will recall more than 5 year olds.
NON-DIRECTIONAL: There will be a difference in word recall ability between 7 and 5 year olds.

DIRECTIONAL: As studying increases, so will exam grades.
NON-DIRECTIONAL: There will be a relationship between studying and exam grades.
One- vs. Two-tailed
Hypotheses
ONE-TAILED HYPOTHESIS: Specifies the direction of findings (one tail). E.g. as hours of study increase, exam grades also increase.
TWO-TAILED HYPOTHESIS: Predicts a relationship between variables, without specifying a direction (two tails). E.g. amount of time studied will have an effect on exam grades.
Type I and Type II Errors
Type I: rejecting the null hypothesis when we should fail to
reject it (False positive)
Type II: failing to reject the null hypothesis when we should
reject it (False negative)
How do we test
hypotheses?
Scenario: Two groups of people sign up to be
involved in an experimental treatment for
improving intelligence. One group receives the
treatment pill, the other receives a placebo.
H1 = There will be a difference between the
treatment group and the control group
H0 = There will not be a difference between the two
groups
Statistical Significance
A significant difference/association = rejection of the null
hypothesis
What is the probability that an observed difference
between two means arose due to chance, if the null
hypothesis is correct?
Logic of null
hypothesis testing
“What are the odds of this happening by
chance in a world where the phenomenon
is just a chance observation (nothing
special going on: null hypothesis)?”
Example: “If there is really no height
difference between basketball players and
their personal trainers in the population of
athletes, then what are the odds I would
pick 10 pairs of players + trainers and in
each one the player is taller than their
trainer?”
P-values 1
The “odds” that chance could have produced such a score in a null world are less likely, the larger the effect (mean difference / variation).
Significance test: “cuts off” a point on the x-axis of a distribution that is more extreme the larger the observed difference in your sample.
P-value: the % of the distribution that is more extreme than the cut-off point. More extreme effects = lower p-values.
Sample size also affects significance.
P-values 2
Most p-value statistics are produced using a distribution whose tails are “thinner” the more observations they’re based on (t, F, chi-square, etc.). So, p-values are lower the more observations in the sample.
A skinnier tail = a smaller % of the distribution to the right of any given point. Here the degrees of freedom = number of observations − 1.
[Figure: distributions for N = 2, N = 6, and N = infinite, with the observed effect marked on the x-axis]
Yellow curve (2 observations): the effect cuts off only 15% of the distribution, p = .15.
Black curve (many observations): the same effect cuts off 5% of the distribution, p = .05.
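You can see this thinning of the tails directly in R with pt(), which gives tail areas of the t distribution; a minimal sketch (the effect size t = 2 is arbitrary):

2 * pt(-2, df = 1)     # ~0.30: very fat tails with only 2 observations
2 * pt(-2, df = 5)     # ~0.10
2 * pt(-2, df = 100)   # ~0.048: same effect, many observations, smaller p
2 * pnorm(-2)          # ~0.046: the limit as observations go to infinity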
P-value facts:
1. P-values are not the chance the observations are
valid or true. They are simply how unlikely the
observations are if we assume they are only a
random fluke of the sample.
2. When you observe a low p-value the null
hypothesis could still be more plausible if the
alternative hypothesis is highly implausible (1 in a
million chance of ESP existing vs. 1 in 1000 chance,
p < .001, that randomness could have produced the
effects)
P-value facts:
3. P-values are less useful if there are many
uncontrolled ways to interpret the data
(interpreting a dart in the “13” wedge as evidence
of skill, rather than calling it beforehand), or 5 out
of 50 experiments gave us a nice p-value
4. The conventional cut-off for rejecting the null
hypothesis is p < .05. “What are the odds of that?”
Below 1 in 20 = we suspect something is going on.
Probability benchmarks
Would you go bungee jumping if…
◦ The probability of an accident was 1/10 (i.e. p = 0.10)?
◦ The probability of an accident was 1/20 (i.e. p = 0.05)?
◦ The probability of an accident was 1/100 (i.e. p = 0.01)?
◦ The probability of an accident was 1/1000 (i.e. p =
0.001)?
Problems with p-values
All-or-nothing mentality
◦ Experiment 1: p = 0.0499
◦ Experiment 2: p = 0.0511
Interpret with caution: can this tell us anything meaningful about our comparison?
Problems with p-values
A p-value conveys nothing about the magnitude of
a difference between two means – only that the
difference between them is unlikely to be due to
chance if the null hypothesis is true.
[Figure: two very different mean differences, each with p = 0.001]
Getting those p-values
Comparing two groups: t-tests – with exercise
Comparing trends and relationships: Correlations
and tests of significance – no exercise
Comparing three or more groups: Analysis of
Variance (ANOVA) – later!
T-test Overview
It is common for researchers to compare two groups
The t-test compares the means of two different sets of
scores. The DV has to be measured on a continuous scale.
Statistical significance is assessed by using the variability in
the available data
We assess how likely the differences between the two
means are, if there is no true difference between the two
samples (if the H0 is true).
Degrees of Freedom
These will first show up when you conduct
your t-test analysis (we have two in F tests,
one in t-tests and chi-square, etc.)
A mathematical term that is used in many
statistical tests (including t-tests).
A difficult concept to truly understand
Degrees of Freedom
A better working definition:
◦ The number of individual scores in a sample that
can vary without changing the sample mean.
◦ Because we have a sample that imperfectly
represents the population, we can only say how
certain we are based on the number of scores
that can vary, instead of using the number of
observations
Degrees of
Freedom
Two examples
I need two numbers that will add up to 10…
◦7
◦3
For our 2nd number, the df = 1
◦ There was only 1 score that we could give without
changing our final answer
Degrees of Freedom
I am responsible for hosting a dinner party that will have 10
guests.
Knowing where 9 people will sit, will automatically
determine the location of the 10th person.
I would, however, have flexibility when placing the first 9
guests. (df = 10 -1 = 9)
Mathematically expressed as n – 1
◦ This may vary slightly depending on your test,
but the logic remains the same.
Two types of t-tests
Paired t-test
Used when participants perform in both conditions.
◦ May also be called a related or within-participants t-test.
Independent t-test
Used when participants perform in only one of two
conditions.
◦ May be called an independent, between-participants, or unrelated t-
test.
Paired t-test: Examples
Test individuals at two different time points (e.g.,
1st day of the month & last day of the month) in
order to see if any improvement in performance
has occurred.
Test individuals’ memory at different times of the
day to see if performance is affected by the time of
day.
Test individuals following good rest (e.g., 8 hours)
or poor rest (e.g., 4 hours) to observe impact of
sleep on performance.
Counterbalancing
If we test the same population twice, there is a risk that
they will display improved performance as a consequence
of learning or test familiarity
Whenever possible, it is important that we counterbalance
conditions
For example, if we have an experimental condition and a
control condition:
Participant   Experimental   Control
Jenny         1st            2nd
Michael       1st            2nd
Emma          2nd            1st
Daniel        2nd            1st
…             …              …
Data Requirements
◦ Two sets of data from continuous variables.
◦ From the same people.
◦ Data should be correlated.
◦ The differences between the paired scores should form a normal distribution.
Variables
Paired-sample t-tests require one independent variable
and one dependent variable
◦ Independent Variable – This could be time (Time 1 and Time 2) or
group (experimental and control with matched cases).
◦ What is DIFFERENT about these two variables? (time, what test
you took)
◦ Dependent Variable – This is what is similar in the different
conditions of the independent variable (e.g. test scores, on a similar
scale).
◦ What is SIMILAR about these two variables? (same kind of
response, same kind of scale)
◦ If the two measurements are not constructed in a similar way (1-5
vs. 1-9) the paired t-test will not be valid.
Paired t-test Summary
The paired t-test explores whether the means of
two sets of scores are significantly different.
◦ Generally this is used in a repeated design
◦ You need a categorical independent variable and a
continuous dependent variable
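A minimal sketch of a paired t-test in R (the before/after scores are made up for illustration):

before <- c(10, 12, 9, 14, 11, 13)    # e.g. test scores at Time 1
after  <- c(12, 14, 9, 17, 12, 15)    # the same 6 people at Time 2
t.test(after, before, paired = TRUE)  # reports t, df (n - 1 = 5), and the p-value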
Independent t-test Data
Requirements
The following are required:
◦ Two sets of values from continuous variables.
◦ These are taken from two different samples (e.g.
vegetarians vs. vegans).
◦ The scores in each sample should ideally form a normal
distribution (but the t-test is fairly robust to violations of this).
◦ Sample sizes do not have to be equal.
Variables
This test requires one independent variable and one
dependent variable. The default procedure in R is Welch’s
t-test, which is accurate even if the two groups do not have
equal variances. (Student’s t-test assumes equal variances.)
Independent variable – This will be the group to which the
participants belong.
◦ For example nationality (UK citizen or not) or condition
(experimental or control)
Dependent variable – This is what you think may change
across the different conditions of the independent variable
(e.g. test scores)
Examples of
Independent t-test
Test different groups’ memory at different times of
the day to see if performance will be affected by
the time of day.
Test different groups following good rest (e.g., 8
hours) or poor rest (e.g., 4 hours) to observe
impact of sleep on performance.
Test an experimental group and a control group to
look at the effect of a drug versus a placebo.
Independent t-test
formula
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\left[\dfrac{\left(\sum x_1^2 - \frac{(\sum x_1)^2}{N_1}\right) + \left(\sum x_2^2 - \frac{(\sum x_2)^2}{N_2}\right)}{N_1 + N_2 - 2}\right]\left(\dfrac{1}{N_1} + \dfrac{1}{N_2}\right)}}$$
The formula for the independent test looks daunting, but
it’s actually straightforward.
Independent t-test
formula
The formula is more complicated because we are using
data from two different groups
We need a formula that treats each group separately. It
needs to:
◦ Account for standard error in both samples
◦ Account for differences in sample size
◦ Calculate the differences between sample means
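A minimal sketch in R that follows this recipe by hand and checks it against R’s own Student’s t-test (the group scores are invented for illustration):

g1 <- c(12, 15, 11, 14, 13)   # e.g. high time pressure group
g2 <- c(9, 10, 12, 8, 11)     # e.g. no time pressure group
n1 <- length(g1); n2 <- length(g2)
# Pooled variance: both groups' sums of squared deviations over the combined df
pooled <- (sum((g1 - mean(g1))^2) + sum((g2 - mean(g2))^2)) / (n1 + n2 - 2)
(mean(g1) - mean(g2)) / sqrt(pooled * (1/n1 + 1/n2))  # t by hand: 3 with these numbers
t.test(g1, g2, var.equal = TRUE)  # Student's t-test matches the hand calculation

Leaving var.equal at its default (FALSE) gives Welch’s t-test, R’s default mentioned earlier.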
Conclusion
The independent t-test explores whether the means of two
groups of scores are significantly different from each other.
This is used when data on the same variable are collected
from two different samples.
You need one categorical independent and one continuous
dependent variable.
The underlying computation is similar to the paired t-test,
but more complex to account for the fact that samples
come from different populations.
Conclusion – Critical
Components of the t-
test
t-value
◦ The larger it is, the greater the difference between means.
◦ Because t-values reflect the difference between Mean 1 and
Mean 2, t will be negative if Mean 2 is larger than Mean 1. This is
no cause for concern!
df
◦ The number of individual scores in a sample that can vary
without changing the sample mean.
p-value
◦ The probability that the difference between means (as reflected
by your t value) is due to random variation between/within
groups, if the null hypothesis is true.
R Practice!
t-tests
Correlations
• Correlations have two characteristics
◦ Magnitude:
◦ Between 0 and 1.
◦ A value of ‘0’ means no relationship.
◦ A value of ‘1’ means a perfect relationship.
◦ Direction:
◦ Negative Correlation vs. Positive Correlation.
◦ For example, -.67 or .56.
Positive Correlation
If there is a positive correlation between your variables then there
will be a general trend from the bottom left to the top right.
Correlation coefficient value will be positive
Negative Correlation
If there is a negative correlation between your variables then there
will be a general trend from the top left to the bottom right.
Negative value of the correlation coefficient
Correlations
Which is the bigger correlation?
-.67 or .56?
The size of a correlation ignores whether it is positive or negative so…
-.67!
Correlations
A correlation coefficient (r):
◦ Quantifies how well the line of best fit explains the data
◦ gives direction of relationship.
A correlation has two features
◦ Magnitude ( 0 -1)
◦ Direction ( + , - )
A correlation of zero shows no relationship.
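A minimal sketch in R (the hours and grades are invented; cor() gives r, and cor.test() adds a significance test):

hours  <- c(2, 4, 5, 7, 8, 10)       # hypothetical hours studied
grades <- c(52, 58, 60, 65, 70, 74)  # hypothetical exam grades
cor(hours, grades)                   # Pearson's r: close to +1 for these data
cor.test(hours, grades)              # r plus a p-value for the relationship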
What we’ve looked at
today
This morning:
The research process
Variables, measurement, and research design
The basics of understanding data
An introduction to R
This afternoon:
Hypotheses
Significance
R practice
Correlations
T-tests