Stat 7

The document outlines the analysis of associations between categorical variables in a sociology course, focusing on hypothesis testing and the use of contingency tables. It explains the concept of association, statistical independence, and the Chi-squared test for assessing relationships between categorical data. Limitations of the Chi-squared test are also discussed, emphasizing the need for further analysis to understand the strength and direction of associations.

Uploaded by

beliz tuzel

WEEK 8.

ANALYZING THE ASSOCIATION BETWEEN CATEGORICAL VARIABLES

STATISTICAL METHODS IN SOCIOLOGY II

SOC 242
Spring 2024-2025

Tuesday, 14:40-16:30 G204

Thursday, 12:40-14:30 G204

FACULTY OF ARTS AND SCIENCES

DEPARTMENT OF SOCIOLOGY

SESSION PLAN
o Independence and Dependence (Association)

o Testing Categorical Variables For Independence

Statistical Methods in Sociology II, Week 10 2

WHAT IS “ASSOCIATION”?

“two variables have an association if a particular value for one variable is more likely to occur with
certain values of the other variable—for example, if
being very happy is more likely to happen if a
person has an above average income”.

THE ASSOCIATION BETWEEN CATEGORICAL VARIABLES

▪ Suppose both the response and the explanatory variables are categorical, with any number of categories for each.

▪ There is an association between the variables if the population conditional distribution for the response variable differs among the categories of the explanatory variable.

LOGIC OF HYPOTHESIS TESTING AND ‘ASSOCIATION’

▪ We discussed how to test a hypothesis about differences in means (suitable for continuous variables) and differences in proportions (suitable for categorical variables).

▪ One way of re-phrasing such a hypothesis test is to ask: ‘is there an association between categorical variable X and continuous variable Y?’
  ▪ E.g. the association between gender and height

LOGIC OF HYPOTHESIS TESTING AND ‘ASSOCIATION’
▪ Today: how to test hypotheses about the relationship between two categorical variables (nominal/ordinal).

▪ The logic of the hypothesis test is exactly the same.

▪ The main difference is that, instead of using the sampling distribution of the mean to create z-scores (or t-scores), we use the sampling distribution of another statistic, called chi-squared, which is appropriate for categorical data.

CONTINGENCY TABLES
▪ ‘Contingent on’ means ‘depends on’.

▪ We display categorical data for analysis in contingency tables.

▪ Definition: A contingency table displays the number of observations for each combination of outcomes over the categories of each variable.

▪ We’ll only look at two variables at a time.

Let’s look at an example…

EXAMPLE
To see how subjective health status depends on gender, convert to % within columns (within the independent variable).

                        Gender
Subjective health    Male     Female    Total
Very good              170       205      375
Good                   730       643    1,373
Fair                   264       306      570
Poor                    34        42       76
Very poor                5         7       12
Total                1,203     1,203    2,406
Data source: WVS Turkey, 2018

Response: subjective health status
Explanatory: gender
Row marginals: the Total column (375; 1,373; 570; 76; 12)
Column marginals: the Total row (1,203; 1,203; 2,406)

TO SEE HOW SUBJECTIVE HEALTH DEPENDS ON GENDER, CONVERT TO % WITHIN COLUMNS!

▪ E.g. men who report very good health = (170/1,203) x 100 = 14.1%

▪ The two columns form the conditional distributions of subjective health status given gender:
  14.1% of men report very good health
  17.0% of women report very good health

                        Gender
Subjective health    Male     Female    Total
status
Very good            14.1%     17.0%    15.6%
Good                 60.7%     53.4%    57.1%
Fair                 21.9%     25.4%    23.7%
Poor                  2.8%      3.5%     3.2%
Very poor             0.4%      0.6%     0.5%
Total                 100%      100%     100%
Data source: WVS Turkey, 2018
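
The column percentages can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original slides: the counts are copied from the example table, while the names `observed`, `column_percentages` and `conditional` are illustrative choices.

```python
# Observed counts from the example table (WVS Turkey, 2018).
# Each list follows the order of the health categories below.
categories = ["Very good", "Good", "Fair", "Poor", "Very poor"]
observed = {
    "Male":   [170, 730, 264, 34, 5],
    "Female": [205, 643, 306, 42, 7],
}

def column_percentages(counts):
    """Return each count as a percentage of its column total."""
    total = sum(counts)
    return [100 * c / total for c in counts]

# Conditional distribution of health status within each gender.
conditional = {g: column_percentages(c) for g, c in observed.items()}

for i, cat in enumerate(categories):
    row = "  ".join(f"{conditional[g][i]:5.1f}%" for g in observed)
    print(f"{cat:<10} {row}")
```

Each column sums to 100 because the percentages are taken within the explanatory variable (gender), which is what makes the two columns comparable.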

GUIDELINES FOR CONTINGENCY TABLES
▪ Show sample conditional distributions: percentages for the response variable within the categories of the explanatory variable.
  (Find these by dividing the cell counts by the explanatory category total and multiplying by 100. The percentages across the response categories will add to 100.)
▪ Clearly define variables and categories.
▪ If you display percentages but not the cell counts, include the explanatory total sample sizes, so the reader can (if desired) recover all the cell count data.
▪ Use rows for the response variable and columns for the explanatory variable.

STATISTICAL INDEPENDENCE
▪ Whether these variables are associated depends on whether the conditional distribution of subjective health status differs between men and women.

▪ We use the concept of ‘statistical independence’.

▪ Two categorical variables are statistically independent if the population conditional distributions on one of them are identical at each category of the other.
  ▪ Remember, the distribution of a random variable will differ across samples, even where the population is invariant.
  ▪ The task is to decide whether two variables are independent or not in the population, not the sample. In other words, could our result have occurred due to chance alone?

PERFECT DEPENDENCE

                        Gender
Subjective health    Male     Female
Good                   100         0
Poor                     0       100
Total                  100       100

▪ Gender perfectly predicts the subjective health status.

▪ Conditional distributions are different.
▪ There is perfect dependence.

PERFECT INDEPENDENCE

                        Gender
Subjective health    Male     Female
Good                    50        50
Poor                    50        50
Total                  100       100

▪ Gender is no help at all in predicting subjective health status.
▪ The conditional distributions are the same.
▪ There is perfect independence.
▪ In reality, there is never perfect independence...
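
The two extreme tables can be compared programmatically. The sketch below is illustrative only (the function names are hypothetical, not from the slides): it computes the conditional distribution of health status within each gender and checks whether the distributions coincide in the table itself.

```python
from fractions import Fraction

def conditional_distributions(table):
    """Map each explanatory category to its response proportions.

    `table` maps a category (e.g. a gender) to its list of response counts.
    Fractions keep the comparison exact.
    """
    return {
        group: [Fraction(count, sum(counts)) for count in counts]
        for group, counts in table.items()
    }

def same_conditional_distributions(table):
    """True if every explanatory category has the same response distribution."""
    dists = list(conditional_distributions(table).values())
    return all(d == dists[0] for d in dists)

# Counts are [Good, Poor] for each gender, copied from the two toy tables.
perfect_dependence = {"Male": [100, 0], "Female": [0, 100]}
perfect_independence = {"Male": [50, 50], "Female": [50, 50]}

print(same_conditional_distributions(perfect_dependence))    # False
print(same_conditional_distributions(perfect_independence))  # True
```

With real sample data an exact match is essentially never observed, which is why a significance test is needed rather than a direct comparison like this one.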
TESTING FOR STATISTICAL INDEPENDENCE
We want to know if the population conditional distributions are identical.

We don’t expect the sample conditional distributions to be identical. Why?

▪ Answer: Sampling variation

▪ Question: is it plausible that the observed difference in sample conditional distributions would be this great if the population conditional distributions were identical?
▪ We use a statistical test, with similar logic as for comparing means:
  ▪ H0: the variables are statistically independent
  ▪ H1: the variables are statistically dependent

EXPECTED CELL FREQUENCIES

▪ We test for a statistically significant association with the same logic as for our t-test: by comparing what we get with what we would get if the null hypothesis were true. This means comparing observed with expected cell frequencies.

▪ Crucially, we will get a p-value for our significance test, which is called a chi-squared test.

EXPECTED CELL FREQUENCIES

                        Gender
Subjective health    Male     Female    Total
status
Very good            187.5     187.5      375
Good                 686.5     686.5    1,373
Fair                 285       285        570
Poor                  38        38         76
Very poor              6         6         12
Total                1,203     1,203    2,406
Data source: WVS Turkey, 2018

fe = (row total x column total) / total sample size

e.g.
proportion of ‘good’ = 1,373/2,406 = 0.5707
number of men = 1,203
1,203 x 0.5707 = 686.5 (≈ 687)
expected frequency of men reporting ‘good’ health, assuming H0 is true = 686.5 (≈ 687)
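
The expected frequencies follow directly from the marginals. The following is a minimal sketch assuming only the row and column totals shown in the table; the variable names are illustrative.

```python
# Marginal totals from the observed table (WVS Turkey, 2018).
categories = ["Very good", "Good", "Fair", "Poor", "Very poor"]
row_totals = [375, 1373, 570, 76, 12]
col_totals = {"Male": 1203, "Female": 1203}
n = sum(row_totals)  # total sample size

# Under H0 (independence): f_e = row total * column total / n.
expected = {
    gender: [r * c / n for r in row_totals]
    for gender, c in col_totals.items()
}

for i, cat in enumerate(categories):
    print(f"{cat:<10} {expected['Male'][i]:7.1f} {expected['Female'][i]:7.1f}")
```

Because both genders have the same column total here, each row's expected count is simply half of the row total.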

CHI-SQUARED (χ2) (KARL PEARSON, 1900)

Expected cell frequencies (rounded):

                        Gender
Subjective health    Male     Female    Total
status
Very good              188       188      375
Good                   687       687    1,373
Fair                   285       285      570
Poor                    38        38       76
Very poor                6         6       12
Total                1,203     1,203    2,406
Data source: WVS Turkey, 2018

Test statistic = chi-squared (χ2)

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

Definition: for each cell, square the deviation of the observed from the expected frequency and divide it by that cell's expected frequency; then sum over all cells.
Quantifies how much difference there is between what we would expect if H0 is true and what we actually see.
H0 is true: fo and fe are close, giving a small value of χ2

H0 is false: some fo and fe are far apart, giving a large value of χ2

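CHI-SQUARED (χ2)

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

Cell                   O       E       O-E     (O-E)2    (O-E)2/E
Male, very good       170    187.5    -17.5     306.25     1.6333
Male, good            730    686.5     43.5    1892.25     2.7564
Male, fair            264    285      -21       441        1.5474
Male, poor             34     38       -4        16        0.4211
Male, very poor         5      6       -1         1        0.1667
Female, very good     205    187.5     17.5     306.25     1.6333
Female, good          643    686.5    -43.5    1892.25     2.7564
Female, fair          306    285       21       441        1.5474
Female, poor           42     38        4        16        0.4211
Female, very poor       7      6        1         1        0.1667
Total:                                                    13.0496
χ2 = 13.0496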
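
The whole calculation can be sketched in Python using only the observed counts. This is an illustrative implementation of the formula above, not code from the course; `chi_squared` is a hypothetical helper name.

```python
# Observed counts: one row per health category, columns are [Male, Female].
observed = [
    [170, 205],  # Very good
    [730, 643],  # Good
    [264, 306],  # Fair
    [34, 42],    # Poor
    [5, 7],      # Very poor
]

def chi_squared(table):
    """Pearson chi-squared statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, f_o in enumerate(row):
            # Expected count under independence, kept unrounded.
            f_e = row_totals[i] * col_totals[j] / n
            statistic += (f_o - f_e) ** 2 / f_e
    return statistic

chi2 = chi_squared(observed)
print(round(chi2, 4))
```

Keeping the expected counts unrounded (187.5 rather than 188) matters: the squared deviations in the worked table, such as 306.25, are the squares of the unrounded differences.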

DISTRIBUTION OF χ2 AND DEGREES OF FREEDOM
If we took repeated samples, the sampling distribution of χ2 would not be normal; it follows its own χ2 distribution.

What happens to χ2 when there are more cells in a table?

▪ It tends to get bigger, so we need to take the number of cells into account to get the critical value for the test statistic.

The χ2 distribution is based on the degrees of freedom in the contingency table.
▪ For a table with r rows and c columns, df = (r-1)(c-1). For the 5 x 2 example table, df = (5-1)(2-1) = 4.
▪ df refers to the number of cells in a contingency table that can vary, given the marginals.
▪ Statistical software (R, Excel, IBM SPSS, Stata, etc.) works this out for you, but you should know how to work it out for yourself too.
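
To make "cells that can vary, given the marginals" concrete, the sketch below fixes the marginal totals of the example table, picks df cells freely, and recovers every other cell by subtraction. The choice of which cells count as "free" (the first four male counts) is an assumption made for illustration.

```python
# Marginals of the example table (WVS Turkey, 2018).
row_totals = [375, 1373, 570, 76, 12]
col_totals = [1203, 1203]  # [Male, Female]

def degrees_of_freedom(n_rows, n_cols):
    """df for a two-way contingency table: (r - 1)(c - 1)."""
    return (n_rows - 1) * (n_cols - 1)

df = degrees_of_freedom(len(row_totals), len(col_totals))

# Choose df = 4 cells freely (here: the first four male counts)...
free_male = [170, 730, 264, 34]

# ...and the marginals determine every remaining cell.
male = free_male + [col_totals[0] - sum(free_male)]
female = [r - m for r, m in zip(row_totals, male)]

print(df)      # 4
print(male)    # [170, 730, 264, 34, 5]
print(female)  # [205, 643, 306, 42, 7]
```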

PROPERTIES OF CHI-SQUARE DISTRIBUTION

• No negative values
• Mean = df
• The standard deviation increases as the df increase, so the chi-square curve spreads out more as the df increase
• As the df becomes very large, the shape becomes more like the normal distribution
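
These properties can be checked with a small simulation. The sketch relies on a standard fact that the slides do not state: a chi-square variable with df degrees of freedom is distributed as the sum of df squared standard normal variables, with standard deviation sqrt(2 x df). The seed and number of draws are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

def simulate_chi_square(df, draws=20000):
    """Draw chi-square values as sums of df squared standard normals."""
    return [
        sum(random.gauss(0, 1) ** 2 for _ in range(df))
        for _ in range(draws)
    ]

results = {}
for df in (1, 4, 10):
    values = simulate_chi_square(df)
    results[df] = {
        "min": min(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
    }
    print(df, round(results[df]["mean"], 2), round(results[df]["sd"], 2))
```

The simulated means should land close to df, no draw is negative, and the spread grows with df.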

PROPERTIES OF CHI-SQUARE DISTRIBUTION

The probability that we would get a value of the χ2 statistic this big or bigger (χ2 = 13.05, df = 4) if gender and subjective health status are independent in the population is about .011 (that is, less than 0.05).
There is strong evidence to reject H0.
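
The p-value can be checked without statistical software for this particular table. The sketch uses a closed form that is not on the slides: for 4 degrees of freedom, the upper-tail probability of the chi-square distribution is exp(-x/2)(1 + x/2). The function name is hypothetical, and the formula is valid only for df = 4.

```python
import math

def chi_square_upper_tail_df4(x):
    """P(X >= x) for a chi-square variable with exactly 4 degrees of freedom."""
    return math.exp(-x / 2) * (1 + x / 2)

chi2 = 13.0496  # test statistic from the worked example
p_value = chi_square_upper_tail_df4(chi2)

print(f"p = {p_value:.4f}")
print("reject H0 at the 5% level:", p_value < 0.05)
```

The printed value is below 0.05, consistent with rejecting H0. For other degrees of freedom, use a statistics package or a χ2 table.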

LIMITATIONS OF CHI-SQUARE TEST

▪ It doesn’t tell us anything about the strength or direction of the association.

▪ It doesn’t tell us which cells deviate from the expected frequencies.

▪ Next week: Residual analysis and Odds Ratios

Questions? Ideas?

Thank you for your attention!
