Association of Attributes

Introduction:

In the social sciences we come across certain phenomena which are incapable of quantitative
measurement. Blindness, deafness, religion, juvenile delinquency, marital status, etc. are some
phenomena which are not measurable. Such characteristics are called attributes. In these
cases one can only count the individuals who possess or do not possess the attribute; in other
words, all one can state is that so many individuals are blind and so many are not blind. While
dealing with one attribute, the classification of data is done on the basis of the presence or
absence of that attribute. It is also absolutely essential that a clear-cut definition of the
attribute under study is given, because only such a definition paves the way for counting the
individuals possessing or not possessing it. Two attributes are said to be positively associated
if they appear together in a greater number of cases than is to be expected if they were
independent. On the other hand, if the number of observed cases is less than the number
expected under the assumption of independence, the attributes are said to be negatively
associated (dissociated). In order to ascertain whether the attributes are associated or not,
the following methods can be used (methods 1 and 3 are illustrated in the sketch after the list).

1. Comparison of observed and expected frequencies
2. Proportion method
3. Yule's coefficient of association
4. Coefficient of colligation
5. Coefficient of contingency
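
To make the first and third methods concrete, here is a minimal sketch in Python (not part of the original notes) that compares an observed cell frequency with its expectation under independence and computes Yule's coefficient of association Q for a 2 x 2 table of attributes; the counts are taken from the sex-and-handedness example given below.

# A minimal sketch of methods 1 and 3: expected frequency under independence
# and Yule's coefficient of association Q for a 2 x 2 table of attributes.
# Cell names: a = (AB), b = (Ab), c = (aB), d = (ab).

def association_2x2(a, b, c, d):
    n = a + b + c + d
    # Method 1: expected frequency of (AB) if A and B were independent
    expected_ab = (a + b) * (a + c) / n
    # Method 3: Yule's coefficient of association
    q = (a * d - b * c) / (a * d + b * c)
    return expected_ab, q

# Counts from the sex-and-handedness table given below
expected_ab, q = association_2x2(a=43, b=9, c=44, d=4)
print("observed (AB) = 43, expected (AB) =", round(expected_ab, 2))  # 45.24
print("Yule's Q =", round(q, 3))                                     # -0.394
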

Example of Attributes and Contingency Table

Suppose that we have two variables, sex (male or female) and handedness (right- or left-
handed). Further suppose that 100 individuals are randomly sampled from a very large
population as part of a study of sex differences in handedness. A contingency table can be
created to display the numbers of individuals who are male and right-handed, male and left-
handed, female and right-handed, and female and left-handed. Such a contingency table is
shown below.

              Right-handed   Left-handed   TOTALS
Males              43              9           52
Females            44              4           48
TOTALS             87             13          100

The numbers of males, females, and right- and left-handed individuals are called marginal
totals. The grand total, i.e., the total number of individuals represented in the contingency
table, is the number in the bottom right corner.

The table allows us to see at a glance that the proportion of men who are right-handed is about
the same as the proportion of women who are right-handed, although the proportions are not
identical. The significance of the difference between the two proportions can be assessed with
a variety of statistical tests including Pearson's chi-square test, the G-test, Fisher's exact test,
and Barnard's test, provided the entries in the table represent individuals randomly sampled
from the population about which we want to draw a conclusion. If the proportions of individuals
in the different columns vary significantly between rows (or vice versa), we say that there is a
contingency between the two variables. In other words, the two variables are not independent.
If there is no contingency, we say that the two variables are independent.
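
As an illustration of such a test, the following sketch (assuming SciPy is installed) runs Pearson's chi-square test of independence on the table above; chi2_contingency is a standard SciPy routine, and correction=False keeps the plain Pearson statistic rather than applying Yates' continuity correction.

# A minimal sketch, assuming SciPy, of Pearson's chi-square test of
# independence on the sex-and-handedness table above.
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[43, 9],    # males:   right-handed, left-handed
                  [44, 4]])   # females: right-handed, left-handed

chi2, p, dof, expected = chi2_contingency(table, correction=False)
print("chi-square =", round(chi2, 3), "p =", round(p, 3), "df =", dof)
print("expected frequencies under independence:")
print(expected)
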

The example above is the simplest kind of contingency table, a table in which each variable
has only two levels; this is called a 2 x 2 contingency table. In principle, any number of rows
and columns may be used. There may also be more than two variables, but higher order
contingency tables are difficult to represent on paper. The relation between ordinal variables, or
between ordinal and categorical variables, may also be represented in contingency tables,
although such a practice is rare.

Measurement of Association

Measures of association

The degree of association between the two variables can be assessed by a number of
coefficients; the simplest is the phi coefficient, defined by

φ = √(χ² / N)

where χ² is derived from Pearson's chi-square test and N is the grand total of observations
(when a signed value is wanted, the sign is taken from the cross-product difference of the
table's cells). φ varies from 0 (corresponding to no association between the variables) to 1 or
−1 (complete association or complete inverse association). This coefficient can only be
calculated for frequency data represented in 2 x 2 tables. φ can reach a minimum value of
−1.00 and a maximum value of 1.00 only when every marginal proportion is equal to .50 (and
two diagonal cells are empty); otherwise, the phi coefficient cannot reach those minimal and
maximal values.[1]
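
The following sketch (an illustrative helper, not from the text) computes the signed phi coefficient for a 2 x 2 table directly from its cells, using the equivalent form (ad − bc)/√((a+b)(c+d)(a+c)(b+d)); its absolute value equals √(χ²/N).

# A minimal sketch of the signed phi coefficient for a 2 x 2 table with
# cells a, b (first row) and c, d (second row).
import math

def phi_coefficient(a, b, c, d):
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

print(round(phi_coefficient(43, 9, 44, 4), 3))   # -0.133 for the example table
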

Alternatives include the tetrachoric correlation coefficient (also only applicable to 2 x 2 tables),
the contingency coefficient C, and Cramér's V.

C suffers from the disadvantage that it does not reach a maximum of 1 or a minimum of −1;
the highest value it can reach in a 2 x 2 table is 0.707, and the maximum it can reach in a
4 x 4 table is 0.870. It can reach values closer to 1 in contingency tables with more categories.
It should, therefore, not be used to compare associations among tables with different numbers
of categories.[2] Moreover, it does not apply to asymmetrical tables (those where the numbers
of rows and columns are not equal).

The formulae for the C and V coefficients are:

C = √(χ² / (N + χ²))

and

V = √(χ² / (N(k − 1))),

k being the number of rows or the number of columns, whichever is less.

C can be adjusted so it reaches a maximum of 1 when there is complete association in a table
of any number of rows and columns by dividing C by √((k − 1)/k) (recall that this adjustment
only applies to tables in which the number of rows is equal to the number of columns and
therefore equal to k).
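
A minimal sketch of both coefficients follows, assuming SciPy is available for the chi-square statistic; the table and variable names are illustrative.

# A minimal sketch of the contingency coefficient C and Cramér's V,
# computed from the chi-square statistic of an r x c frequency table.
import numpy as np
from scipy.stats import chi2_contingency

def contingency_C_and_V(table):
    chi2 = chi2_contingency(table, correction=False)[0]
    n = table.sum()
    k = min(table.shape)               # lesser of the number of rows and columns
    C = np.sqrt(chi2 / (n + chi2))     # contingency coefficient
    V = np.sqrt(chi2 / (n * (k - 1)))  # Cramér's V
    return C, V

C, V = contingency_C_and_V(np.array([[43, 9], [44, 4]]))
print("C =", round(float(C), 3), "V =", round(float(V), 3))
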

The tetrachoric correlation coefficient assumes that the variable underlying each dichotomous
measure is normally distributed. The tetrachoric correlation coefficient provides "a convenient
measure of [the Pearson product-moment] correlation when graduated measurements have
been reduced to two categories." The tetrachoric correlation should not be confused with the
Pearson product-moment correlation coefficient computed by assigning, say, values 0 and 1 to
represent the two levels of each variable (which is mathematically equivalent to the phi
coefficient). An extension of the tetrachoric correlation to tables involving variables with more
than two levels is the polychoric correlation coefficient.
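
The equivalence just mentioned can be checked numerically; the sketch below (illustrative, with the example table expanded into raw 0/1 codes) shows that the Pearson correlation of the 0/1-coded variables reproduces the signed phi coefficient.

# A minimal check: the Pearson correlation of 0/1-coded dichotomous
# variables equals the signed phi coefficient for the same 2 x 2 table.
import numpy as np

# 0 = male, 1 = female; 0 = right-handed, 1 = left-handed
sex  = np.array([0] * 52 + [1] * 48)
hand = np.array([0] * 43 + [1] * 9 + [0] * 44 + [1] * 4)

r = np.corrcoef(sex, hand)[0, 1]
print(round(r, 3))   # -0.133, the same value as phi for this table
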

The Lambda coefficient is a measure of the strength of association of cross tabulations
when the variables are measured at the nominal level. Values range from 0 (no association) to
1 (the theoretical maximum possible association). Asymmetric lambda measures the
percentage improvement in predicting the dependent variable. Symmetric lambda measures
the percentage improvement when prediction is done in both directions.
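
As a sketch of the asymmetric form, the helper below (a hypothetical function, not from the text) computes Goodman and Kruskal's lambda for predicting the column variable from the row variable of a nominal cross tabulation.

# A hypothetical helper for asymmetric lambda: the proportional reduction
# in prediction error for the column variable once the row is known.
import numpy as np

def lambda_column_given_row(table):
    table = np.asarray(table)
    n = table.sum()
    best_without = table.sum(axis=0).max()   # modal column total (no row information)
    best_with = table.max(axis=1).sum()      # sum of the within-row modes
    return (best_with - best_without) / (n - best_without)

print(lambda_column_given_row([[43, 9], [44, 4]]))   # 0.0: rows do not help here
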

The uncertainty coefficient is another measure for variables at the nominal level.

All of the following measures are used for variables at the ordinal level. The values range from
-1 (100% negative association, or perfect inversion) to +1 (100% positive association, or
perfect agreement). A value of zero indicates the absence of association.

 Gamma test: No adjustment for either table size or ties.

 Kendall's tau: Adjustment for ties (see the sketch after this list).
o Tau-b: For square tables.
o Tau-c: For rectangular tables.
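
Both tau variants are available in SciPy; the sketch below (with illustrative ordinal data, assuming a recent SciPy in which kendalltau accepts a variant argument) computes tau-b and tau-c.

# A minimal sketch of Kendall's tau-b and tau-c for paired ordinal data.
from scipy.stats import kendalltau

x = [1, 1, 2, 2, 3, 3, 4, 4]   # e.g. ordered satisfaction ratings
y = [1, 2, 2, 3, 3, 3, 4, 4]   # e.g. ordered income bands

tau_b, p_b = kendalltau(x, y, variant='b')   # suited to square tables
tau_c, p_c = kendalltau(x, y, variant='c')   # suited to rectangular tables
print("tau-b =", round(tau_b, 3), "tau-c =", round(tau_c, 3))
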
Calculating the test-statistic

The value of the test statistic is

χ² = Σ (Oi − Ei)² / Ei, summed over the n cells,

where

χ² = Pearson's cumulative test statistic, which asymptotically approaches a χ² distribution;

Oi = an observed frequency;

Ei = an expected (theoretical) frequency, asserted by the null hypothesis;

n = the number of cells in the table.

The chi-square statistic can then be used to calculate a p-value by comparing the value of the
statistic to a chi-squared distribution. The number of degrees of freedom is equal to the
number of cells n, minus the reduction in degrees of freedom, p.

The result about the number of degrees of freedom is valid when the original data was
multinomial and hence the estimated parameters are efficient for minimizing the chi-square
statistic. More generally, however, when maximum likelihood estimation does not coincide with
minimum chi-square estimation, the distribution will lie somewhere between a chi-square
distribution with n − 1 − p and n − 1 degrees of freedom.
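
A minimal worked sketch of the statistic for the 2 x 2 handedness table: expected cell frequencies are formed as (row total)(column total)/N, the statistic is summed over the cells, and the p-value is read from the chi-square distribution (SciPy is assumed for the survival function).

# A minimal worked sketch of the chi-square statistic and its p-value.
import numpy as np
from scipy.stats import chi2

observed = np.array([43.0, 9.0, 44.0, 4.0])
row_totals, col_totals = np.array([52, 48]), np.array([87, 13])
expected = np.outer(row_totals, col_totals).ravel() / 100.0   # (row)(column)/N

stat = ((observed - expected) ** 2 / expected).sum()
dof = (2 - 1) * (2 - 1)          # (rows - 1)(columns - 1) for a contingency table
p_value = chi2.sf(stat, dof)
print("chi-square =", round(stat, 3), "p =", round(p_value, 3))
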
