
The Median Test

The Median Test is used to compare the medians of two or more independent populations. It has fewer assumptions than alternatives like the Wilcoxon-Mann-Whitney test but also less power. The document discusses the assumptions, test statistic, and procedure of the Median Test. It then compares it to alternative tests like the Wilcoxon and Kruskal-Wallis tests, finding the alternatives generally more powerful and efficient. While the Median Test has disadvantages, the document concludes it still has uses like when population spreads differ.

Uploaded by

Lukman Nul Hakim
Copyright
© Attribution Non-Commercial (BY-NC)

The Median Test

Intro

“Should the Median Test be Retired from General Use?”

My goal is to combine information I gathered from Freidlin and Gastwirth, as well as other publications,
to find out how well the Median Test stands up to comparable location tests, and under what
circumstances. I also plan to use a statistical software package to compare these location
tests.

Basic Idea

• The Median Test is used to test for location differences between two or more independent
populations.

• The Median Test does not take into account the distance from the median. It, like the Sign Test,
only takes into account which side of the median the observations lie on.

Assumptions

• Two/K independent samples. We need independence both within and among samples.

• Two/K populations have the same shape, but not necessarily the same location.

• Random Samples from each population.

Hypothesis

Two Samples

• H0: θ1 = θ2, the medians are the same

• H1: θ1 ≠ θ2, the medians are not the same

K Samples

• H0: θ1 = θ2 = θ3 = … = θk, the medians are the same

• H1: At least one of the medians is different

Procedure

• Find common median (M) by combining all the samples, (m+n=N).

• Under H0, half the observations from each sample should be above M.

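• Create a contingency table, where the rows are the populations and the columns are whether the
observation is above or below M.

• Use the idea of permutation to find the probability. Under H0 each observation falls above M with
probability .5 (Binomial with p=.5), so E(a under H0)=m/2 and E(b under H0)=n/2, where a and b are the
above-median counts in the two samples.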

Ties:

– Observations that equal the median.

– Correcting ties:

• Omit the observation and proceed as normal.

• If lots of ties, then split up the observations in a way that makes rejecting H0
less likely.

– Contingency Tables

– Two Samples:

          Above median (M=6)    Below median (M=6)    Total
Type 1    a = 3                 c = 7                 m = 10
Type 4    b = 7                 d = 4                 n = 11
Total     a+b = 10              c+d = 11              N = 21

– Four Samples:

          Above median (M=5.5)  Below median (M=5.5)  Total
Type 1    a1 = 5                b1 = 7                n1 = 12
Type 2    a2 = 3                b2 = 9                n2 = 12
Type 3    a3 = 8                b3 = 4                n3 = 12
Type 4    a4 = 8                b4 = 4                n4 = 12
Total     A = 24                B = 24                N = 48
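The procedure above can be sketched in Python. This is a minimal illustration, not the author's software: the function name `median_test_table` and the two small samples are hypothetical (they are not the data behind the tables above), and ties are handled by the first option listed, omitting observations equal to the common median.

```python
from statistics import median

def median_test_table(*samples):
    """Return the common median M and one (above, below) pair per sample.

    Observations equal to M are omitted, which is the simplest way of
    handling ties described in the text.
    """
    pooled = [x for sample in samples for x in sample]
    m_common = median(pooled)
    rows = [
        (sum(x > m_common for x in sample), sum(x < m_common for x in sample))
        for sample in samples
    ]
    return m_common, rows

# Hypothetical data, chosen only to show the mechanics.
type_1 = [2, 3, 4, 5, 9]
type_2 = [6, 7, 8, 10, 11, 6]

grand, rows = median_test_table(type_1, type_2)
print("common median:", grand)
for label, (above, below) in zip(("Type 1", "Type 2"), rows):
    print(label, "above:", above, "below:", below)
```

Each row of the result is one row of the contingency table; the two values of 6 in the second sample are tied with the common median and drop out.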

Test Statistic

Two Samples:
Under H0, the probability of a table like ours is given by the following hypergeometric distribution;
the p-value adds up this probability over all tables as extreme as or more extreme than ours:

$$P^* = \frac{\binom{m}{a}\binom{n}{b}}{\binom{N}{a+b}}$$

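As a rough check of the two-sample probability, the sketch below evaluates the hypergeometric formula for the two-sample table (a=3, b=7, m=10, n=11). The function names are illustrative, and the definition of "more extreme" used here (any table with the same margins that is no more probable than the observed one) is one common convention, assumed rather than taken from the source.

```python
from math import comb

def table_probability(a, b, m, n):
    """P* for a table with a of m and b of n observations above the median."""
    return comb(m, a) * comb(n, b) / comb(m + n, a + b)

def two_sided_p_value(a, b, m, n):
    """Sum P* over all tables with the same margins that are no more
    probable than the observed one (assumed convention for 'extreme')."""
    total_above = a + b
    p_observed = table_probability(a, b, m, n)
    p_value = 0.0
    for a_alt in range(max(0, total_above - n), min(m, total_above) + 1):
        p_alt = table_probability(a_alt, total_above - a_alt, m, n)
        if p_alt <= p_observed * (1 + 1e-9):  # tolerance for float rounding
            p_value += p_alt
    return p_value

p_observed = table_probability(3, 7, 10, 11)
p_value = two_sided_p_value(3, 7, 10, 11)
print(f"P* for the observed table: {p_observed:.4f}")
print(f"two-sided p-value:         {p_value:.4f}")
```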
Four Samples:

The probability distribution for the k sample Median Test is found by just slightly modifying the
previous test statistic. We simply now account for all the populations.

$$P^* = \frac{\binom{n_1}{a_1}\binom{n_2}{a_2}\cdots\binom{n_k}{a_k}}{\binom{N}{A}} = \frac{A!\,B!\,\prod_i n_i!}{N!\,\prod_i a_i!\,\prod_i b_i!}$$

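The two forms of the k-sample probability can be compared numerically. The sketch below uses the counts from the four-sample table (above-median counts 5, 3, 8, 8 and below-median counts 7, 9, 4, 4); the function names are illustrative, and exact fractions are used so the two forms can be compared without rounding.

```python
from fractions import Fraction
from math import comb, factorial, prod

def p_star_binomial(above, below):
    """Product of C(n_i, a_i) divided by C(N, A)."""
    sizes = [a + b for a, b in zip(above, below)]
    return Fraction(prod(comb(n, a) for n, a in zip(sizes, above)),
                    comb(sum(sizes), sum(above)))

def p_star_factorial(above, below):
    """A! B! prod(n_i!) divided by N! prod(a_i!) prod(b_i!)."""
    sizes = [a + b for a, b in zip(above, below)]
    A, B = sum(above), sum(below)
    numerator = factorial(A) * factorial(B) * prod(factorial(n) for n in sizes)
    denominator = (factorial(A + B)
                   * prod(factorial(a) for a in above)
                   * prod(factorial(b) for b in below))
    return Fraction(numerator, denominator)

above = [5, 3, 8, 8]
below = [7, 9, 4, 4]
p_binomial = p_star_binomial(above, below)
p_factorial = p_star_factorial(above, below)
print(float(p_binomial), p_binomial == p_factorial)
```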
Test Comparison

• What tests could be substituted for the Median Test?

• The following tables compare the power and ARE of some nonparametric location tests.

POWER

                      n=m=5    n=m=13    n=m=25    n
Median Test           .543     .582      .597      .6
Kolmogorov-Smirnov    .677     .700      .683      .6
Normal Scores         .811     .801      .803      .8
Wilcoxon              .812     .795      .790      .7

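For Two Samples

The power of the Median Test rises with sample size (.543 at n=m=5, .597 at n=m=25), but it stays below
the power of the other three tests at every sample size shown.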

Power:

– (1-β), where β = P(accept H0 when H1 is true)

– P(reject H0 when H1 is true)

• How likely the test is to detect a difference that really exists.

– ARE

Procedure                           ARE
Median Test                         2/π = .637
Sign Test                           2/π = .637
Wilcoxon Signed-Rank Test           3/π = .955
Wilcoxon-Mann-Whitney U Test        3/π = .955
Spearman Correlation Coefficient    .91

ARE = Asymptotic Relative Efficiency:

– The limiting ratio of the sample size the parametric procedure needs to the sample size the
nonparametric procedure needs to reach the same power.

– The parametric procedure therefore needs only a fraction, equal to the ARE, of the observations the
nonparametric procedure needs.

– So the Median Test needs a larger sample than a rank test to reach the same power.
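To make the ARE values concrete, the sketch below converts them into sample sizes. It assumes the usual reading of these figures (efficiency relative to a parametric test when the data are normal) and a hypothetical parametric sample of 100; because the ARE is an asymptotic quantity, the results are indicative only.

```python
from math import ceil, pi

# ARE values from the table above.
are = {
    "Median Test": 2 / pi,
    "Wilcoxon-Mann-Whitney U Test": 3 / pi,
}

def nonparametric_sample_size(n_parametric, are_value):
    """Observations the nonparametric test needs to match a parametric
    test run on n_parametric observations (asymptotic approximation)."""
    return ceil(n_parametric / are_value)

n_parametric = 100  # hypothetical sample size for the parametric test
sizes = {name: nonparametric_sample_size(n_parametric, value)
         for name, value in are.items()}
for name, size in sizes.items():
    print(f"{name}: about {size} observations")
```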

Two Sample Alternatives

• Wilcoxon-Mann-Whitney Test:

– Ranks the combined observations from smallest to largest and compares the rank sums of the two
samples to determine if there is a difference between the population medians.

• Normal Scores:

– Is based on the ranks, as in WMW. Each rank r is replaced by the r/(m+n+1)th quantile of
the standard normal.

• Kolmogorov-Smirnov Test:

– A test for a common distribution that compares the empirical distribution functions of the two
samples. It uses the largest gap between them rather than ranks or the median.
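The normal scores transformation can be sketched as follows. The sketch assumes the common van der Waerden form, in which rank r among N = m+n pooled observations becomes the r/(N+1) quantile of the standard normal (the +1 keeps the score of the largest rank finite). The function name and data are hypothetical, and ties are not handled.

```python
from statistics import NormalDist

def normal_scores(x, y):
    """Replace each observation by the standard normal quantile of
    rank / (N + 1), where ranks come from the pooled sample (no ties)."""
    pooled = sorted(x + y)
    n_total = len(pooled)
    rank = {value: r for r, value in enumerate(pooled, start=1)}
    std_normal = NormalDist()  # mean 0, standard deviation 1

    def score(value):
        return std_normal.inv_cdf(rank[value] / (n_total + 1))

    return [score(v) for v in x], [score(v) for v in y]

# Hypothetical samples without ties.
x = [1.2, 3.4, 2.2]
y = [4.1, 5.0, 2.9, 6.3]
scores_x, scores_y = normal_scores(x, y)
print([round(s, 3) for s in scores_x])
print([round(s, 3) for s in scores_y])
```

A normal scores test then compares the scores of the two samples in the same way WMW compares their ranks.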

Assumptions
• The Wilcoxon-Mann-Whitney Test assumes that the two populations differ only by an additive
constant in location; all other assumptions are the same as for the Median Test.

E(X) = μ

E(Y) = μ + Δ

• Normal Scores has the same assumptions as the WMW Test.

• Kolmogorov-Smirnov Test assumes only that there are two independent random samples.

Hypothesis

WMW and Normal Scores:

• H0: θ1 = θ2, the medians are the same

• H1: θ1 ≠ θ2, the medians are not the same

Kolmogorov-Smirnov Test:

• H0: F(x)=G(x), the distributions are the same

• H1: F(x)≠G(x), the distributions are not the same

K Sample Alternatives

• Kruskal-Wallis Test:

– Ranks the combined observations, sums the ranks within each sample, and then finds how far each
sample's average rank is from the overall average rank. It is an extension of the WMW Test.

• Normal Scores:

– Uses normalized transformed ranks from the Kruskal-Wallis Test.

• Pitman Permutation (ANOVA with General Scores):

– Uses the population means and tests their distance from the overall mean.

Assumptions

• Kruskal-Wallis assumes that we have identical populations except for possible difference in
location, on top of the assumptions made in the Median Test.

• Normal Scores has the same assumptions as the Kruskal-Wallis Test.

• ANOVA with General Scores assumes equal distributions and equal variances, in addition to
the same basic assumptions.

Hypothesis

Kruskal-Wallis and Normal Scores:

• H0: θ1 = θ2 = θ3 = … = θk, the medians are the same

• H1: At least one of the medians is different

ANOVA with General Scores:

• H0: μ1 = μ2 = μ3 = … = μk, the means are the same

• H1: At least one of the means is different

Conclusion

• As shown, the Median Test is much less effective than the similar location tests (for
this data set), since the alternative tests gave the best results.

• I therefore agree with Freidlin and Gastwirth in saying the Median Test should not be used when
there are better alternative tests.

• Still, the Median Test is useful for some types of data.

Disadvantages

• H1 states that the population medians are different; however, which ones, their direction, and
the total number that are different are not stated.

• The Median Test is said to not be as powerful or efficient as the alternatives for small sample
sizes.

• We do not use all the data that is available. We use only which side of the median the
observations fall on, not their actual distance from the median.

• In cases of ties, we must figure out a way to handle them, without influencing the results.

Advantages

• Use the Median Test when the spreads differ, because the assumptions of the other tests are then
violated. This can happen in medical settings, where patients and times differ greatly and we cannot
assume that they come from the same population.

• The test statistic is easy to compute.


• It is easy to understand and to use.

• Two-staged Two-Sample Median Test is best when using survival data (time until failure), with a
generalize

THE MEDIAN TEST

Used to determine whether two random samples come from populations with the same
medians.

Assessment Center Rating by Two Teams:


Officer Candidates Randomly Assigned to Assessment Teams

Team A                   Team B
Subject      Rating      Subject      Rating
Goldin       72          Olsen        97
Jesani       67          Smither      76
Pritchard    87          Trantham     83
Birdwell     46          Gordon       69
Chavez       58          Graham       56
O’Neal       63          Andel        68
Johnson      84          Hutton       92
Tate         53          Paul         88
Bird         62          McGuire      74
Zuni         77          Costo        73
Compton      82          Raines       65
Lewis        89          Battan       54
                         Litzmann     43

Step 1 Determine the median rating for the two assessment groups combined

Midpoint = (N + 1) / 2 = (25 + 1) / 2 = 13

Ranking of the combined ratings of the two groups

97  92  89  88  87  84  83  82  77  76  74  73
72  (median of combined groups)
69  68  67  65  63  62  58  56  54  53  46  43

Step 2 Determine the number of officers in each group whose ratings were
equal to or above the median, and the number below the median.


Team A Team B

Above Mdn 6 7

Below Mdn 6 6
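Steps 1 and 2 can be checked with a short Python sketch. The variable and function names are illustrative; the ratings are those listed for Team A and Team B, and a rating counts as "above" when it is equal to or above the median, as in Step 2.

```python
from statistics import median

team_a = [72, 67, 87, 46, 58, 63, 84, 53, 62, 77, 82, 89]
team_b = [97, 76, 83, 69, 56, 68, 92, 88, 74, 73, 65, 54, 43]

# Step 1: median of the two groups combined (the 13th of 25 ordered ratings).
combined_median = median(team_a + team_b)

# Step 2: count ratings at or above the median, and ratings below it.
def split_counts(ratings, cutoff):
    at_or_above = sum(r >= cutoff for r in ratings)
    return at_or_above, len(ratings) - at_or_above

counts_a = split_counts(team_a, combined_median)
counts_b = split_counts(team_b, combined_median)
print("median:", combined_median)
print("Team A (at or above, below):", counts_a)
print("Team B (at or above, below):", counts_b)
```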

Step 3 Run a two-way chi-square to determine whether there is an association
between assessment team and the ratings.

(expected frequencies are in parentheses)

Position     Team A      Team B      Total
Above Mdn    6 (6.24)    7 (6.76)    13
Below Mdn    6 (5.76)    6 (6.24)    12
Total        12          13          25

2 = (6 – 6.24)2 / 6.24 + (7 – 6.76)2 / 6.76 +


(6 – 5.76)2 / 5.76 + (6 – 6.24)2 / 6.24

2 = 0.0367

The median test (cont.)

Interpretation

The critical value of 2 for 1 df at  = 0.05 is 3.841.

Since 0.0367  3.841, the null hypothesis is accepted.

It is concluded that there is no significant difference in the median ratings given by the
two assessment teams.
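The chi-square calculation in Step 3 can be reproduced with a few lines of Python. This is a minimal sketch with illustrative names; the observed counts and the critical value of 3.841 are taken from the text, and the statistic it prints is about 0.037.

```python
# Observed counts: rows are above/below the median, columns are Team A/Team B.
observed = [[6, 7],
            [6, 6]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected frequency for each cell: row total * column total / grand total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

critical_value = 3.841  # chi-square critical value for 1 df at alpha = 0.05
reject_h0 = chi_square > critical_value
print(f"chi-square = {chi_square:.4f}, reject H0: {reject_h0}")
```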
Focus: Median tests

• What median tests are
• Why they are used
• When they are used
• How they are used

What are median tests?

• They are tests similar to the mean tests covered in a college introduction to statistics.
• They include confidence intervals and significance tests.

When to use a median test (as opposed to a mean test):

• When the data or population does not fulfill the conditions for mean tests.
• The ONLY condition is a simple random sample!

Why do we use median tests?

Because they are more robust!

Medians are more robust than means

SRS of salaries of Company A:

#    Salary       #     Salary
1    $18,000      8     $35,000
2    $20,000      9     $36,000
3    $23,000      10    $50,000
4    $23,000      11    $50,000
5    $23,000      12    $60,000
6    $28,000      13    $130,000
7    $30,000      14    $1,000,000

• The mean of these salaries is $109,000

• The median of these salaries is midway between #7 and #8, or $32,500

Just from looking at the list of salaries, the median seems to describe the middle of the
distribution much more accurately, since salary #14 pulls the mean so far up.
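The effect of salary #14 can be seen by computing both statistics with and without it. This is a small sketch using the 14 salaries listed for Company A; the variable names are illustrative.

```python
from statistics import mean, median

salaries = [18_000, 20_000, 23_000, 23_000, 23_000, 28_000, 30_000,
            35_000, 36_000, 50_000, 50_000, 60_000, 130_000, 1_000_000]

mean_all = mean(salaries)
median_all = median(salaries)

# Drop salary #14 (the $1,000,000 outlier) and recompute both statistics.
without_outlier = salaries[:-1]
mean_trimmed = mean(without_outlier)
median_trimmed = median(without_outlier)

print(f"all 14 salaries:   mean = {mean_all:,.0f}, median = {median_all:,.0f}")
print(f"without salary 14: mean = {mean_trimmed:,.0f}, median = {median_trimmed:,.0f}")
```

Removing one observation cuts the mean by more than half, while the median moves only from $32,500 to $30,000.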

More robustness

The rest of the median test procedure is also more robust than procedures based on the t-distribution.

This combination of a robust statistic and robust procedure allows for statistical inference on very
skewed

Confidence Intervals for Medians


The two main types:

• Exact: needs tables and/or computer software

• Approximate: simpler tables, appropriate for larger samples

We will concentrate on the approximations

Introduction to the Confidence Intervals

It is necessary to understand “rank”

The rank of a value in a distribution is simply its numbered place in the list of ordered values

Example: in the distribution of letters

{a, b, c, d, e, f}

“b” has a rank of 2 from the left, and a rank of 5 from the right.

Steps for Approximate Confidence Intervals

1. Order the distribution from smallest to largest values

2. Find the median of the distribution.

3. Find the rank* of each limit depending on the sample size from a table like the one shown on
the next slide.

4. Take the rank number and count in that many data points from each side of the ordered data.

When data is skewed, a median test can be much more useful than a mean test in estimating the true
parameter
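The counting-in step can be sketched in Python. The rank table the steps refer to is not reproduced here, so this sketch makes an assumption: it derives the rank from the Binomial(n, 0.5) distribution, which corresponds to the exact type of interval rather than the table-based approximation. The function names are hypothetical, and the Company A salaries are reused purely as an illustration.

```python
from math import comb

def ci_rank(n, confidence=0.95):
    """Largest rank k with P(Binomial(n, 0.5) <= k - 1) <= (1 - confidence) / 2.

    Returns 0 when the sample is too small for the requested confidence.
    """
    alpha_half = (1 - confidence) / 2
    cumulative = 0.0
    k = 0
    for j in range(n + 1):
        cumulative += comb(n, j) / 2 ** n
        if cumulative > alpha_half:
            break
        k = j + 1
    return k

def median_confidence_interval(data, confidence=0.95):
    """Count k data points in from each end of the ordered data."""
    ordered = sorted(data)              # step 1: order the values
    k = ci_rank(len(ordered), confidence)
    if k == 0:
        return None
    return ordered[k - 1], ordered[-k]  # step 4: count in from each side

salaries = [18_000, 20_000, 23_000, 23_000, 23_000, 28_000, 30_000,
            35_000, 36_000, 50_000, 50_000, 60_000, 130_000, 1_000_000]
rank = ci_rank(len(salaries))
interval = median_confidence_interval(salaries)
print("rank:", rank, "interval:", interval)
```

Because the rank must be a whole number, the interval's actual confidence level is at least, and usually somewhat above, the level requested.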
