HONEY GUPTA 1
LECTURE OUTLINE
Hypothesis testing
Steps involved in Hypothesis testing
LEARNING OUTCOME
Students will be able to understand Hypothesis testing
Students will be able to understand steps involved in
Hypothesis testing
NORMAL DISTRIBUTION
• Bell shaped
• Symmetrical
• Area under the curve is 1
• Most observations lie near the central value
• Mean=Median=Mode
• Data is distributed equally on both sides of the central value
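The properties listed above can be checked numerically. The following is a minimal sketch, assuming a standard normal distribution (mean 0, standard deviation 1) and Python's standard-library `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1 (an assumed example).
dist = NormalDist(mu=0.0, sigma=1.0)

# Symmetry: the density is the same at equal distances on either side of the mean.
symmetric = abs(dist.pdf(-1.5) - dist.pdf(1.5)) < 1e-12

# Mean = median: half of the total area (which is 1) lies below the mean.
area_below_mean = dist.cdf(dist.mean)

# Most observations lie near the centre: area within 1, 2 and 3 SDs of the mean.
within = {k: dist.cdf(k) - dist.cdf(-k) for k in (1, 2, 3)}

print(symmetric, area_below_mean)
for k, area in within.items():
    print(f"within {k} SD: {area:.4f}")
```

Roughly 68% of the area lies within one SD of the mean and roughly 95% within two, which is what "most observations lie near the central value" means in practice.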
HTTPS://WWW.YOUTUBE.COM/WATCH?V=XSSRRVMOQLQ
HYPOTHESIS TESTING
Hypothesis testing refers to testing an assumption
about a population using sample data drawn from that
population, with the objective of making an inference
or estimate about the population.
The methodology employed depends on the nature
of the data used (metric or non-metric) and the
objective of the study.
In other words, hypothesis testing is a
systematic way of inferring from a sample
with the intent of making a
determination about the expected behaviour of
the entire population.
Drawing a random sample from the population
is based on the assumption that the sample
will resemble the population. On this basis,
the known sample statistic is used for
estimating the unknown population
parameter. When a researcher sets a
hypothesis or assumption, he / she assumes
that the sample statistic will be close to the
hypothesized population parameter.
In real life, one cannot expect the sample statistic
to always be a good estimator of the population
parameter. Differences are likely to occur due
to sampling or non-sampling errors. A
large difference between the sample statistic
and the hypothesized population parameter
raises questions about the hypothesis or the accuracy of the
sampling technique. In statistical analysis, the
researcher addresses this issue through
the concept of probability. The researcher
specifies the probability level (the level of significance) used to decide whether
the observed difference (between the sample
statistic and the hypothesized population parameter) is due to
chance only.
PROCESS OF HYPOTHESIS TESTING
Formulate H0 and H1
Select Appropriate Test
Choose Level of Significance
Calculate Test Statistic (TS_cal)
Reject/Do not Reject H0
Draw Marketing Research Conclusion
1. FORMULATE NULL AND ALTERNATE
HYPOTHESIS
The null hypothesis is represented as H0 (H sub-zero). Theoretically, a null
hypothesis is set as no difference or status quo and is considered true
until and unless it is contradicted by the collected sample data. The
null hypothesis is always expressed in the form of an equation, which
makes a claim regarding a specific value of the population parameter.
The alternate hypothesis, represented as H1 (H one) or Ha, is
what we hope to support. It is the logical opposite of the null
hypothesis. In other words, when the null hypothesis is true,
the alternate hypothesis must be false, and when the null hypothesis is
false, the alternate hypothesis must be true.
In other words, the null hypothesis (H0) is the hypothesis that is tested against its
complement, the alternate hypothesis (H1 or Ha). For example, to see whether the
population mean is different from 50, the null hypothesis can be set as "population
mean is not different from 50." Symbolically,
H0: µ = 50
And the alternate hypothesis can be set as "population mean is different from
50." Symbolically,
H1: µ ≠ 50
This results in two more possible alternate hypotheses: H1: µ < 50, which indicates that the
population mean is less than 50, and H1: µ > 50, which indicates that the population
mean is greater than 50.
Examples of null hypotheses:
There is no significant difference in the level of
awareness between males and females.
There is no significant difference in the marks obtained
before and after the course.
In another example, suppose a researcher believes that there
is a relationship between income and standard of living. He collects
sample data of 10,000 households randomly selected. The null and the
alternate hypothesis formulated will be as follows :
Null hypothesis, H0: There is no relation between income and
standard of living of households.
Alternate hypothesis, Ha: There is a relation between income and
standard of living of households.
It should be noted that the null hypothesis and the alternate
hypothesis must be stated in such a way that both cannot be true. If null
is true, the alternate cannot be true and vice versa.
2. SELECT APPROPRIATE STATISTICAL TEST
The next step is the selection of an appropriate test statistic and its
distribution. The test statistic measures how close the sample is to the
null hypothesis. Every test statistic follows a statistical distribution.
Selection of the appropriate statistical test depends on the type of data
collected in the sample. If the data are metric (interval or ratio scale),
parametric tests (z, t, F) are used; if the data are non-metric
(nominal or ordinal), non-parametric tests (chi-square test,
etc.) are used.
3. CHOOSE LEVEL OF SIGNIFICANCE
Significance level is the probability of rejecting the null hypothesis when it is
true (Type I error).
The researcher needs to decide the level of significance, which
is generally 5 percent or 1 percent. This level of significance is the
complement of the confidence level.
Simply, level of significance = 1 - confidence level. In other words, if the level of
significance is 5%, then the confidence level is 95%.
In a two-tailed test, 5% is split into 2.5% in each tail.
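The split of the significance level between the two tails can be illustrated with a short sketch. It assumes a z (standard normal) test statistic and uses Python's standard-library `statistics.NormalDist`; the helper name `two_tailed_critical_z` is hypothetical.

```python
from statistics import NormalDist

def two_tailed_critical_z(alpha: float) -> float:
    """Return the positive critical z value for a two-tailed test at level alpha."""
    # alpha is split equally between the two tails, so each tail holds alpha / 2.
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.05, 0.01):
    confidence = 1 - alpha  # confidence level is the complement of alpha
    z_crit = two_tailed_critical_z(alpha)
    print(f"alpha={alpha:.2f}  confidence={confidence:.0%}  z_crit={z_crit:.3f}")
```

At the 5% level this gives a critical value of about 1.96, and at the 1% level about 2.576.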
4. CALCULATE TEST STATISTIC AND DRAW
STATISTICAL CONCLUSIONS.
The test statistic and the level of significance have already been
decided in steps 2 and 3. In this step, the researcher has to compute
the chosen test statistic and draw a
statistical conclusion from it. That means the researcher
will either reject the null hypothesis or fail to reject
(accept) the null hypothesis.
First, decide whether the test statistic falls in the acceptance region
(non-rejection region) or the critical region (rejection region).
The region of acceptance refers to the range of values that leads
the researcher to accept the null hypothesis. Next, draw a
statistical conclusion. A statistical conclusion is a decision to
accept or reject a null hypothesis. This depends on whether the
computed test statistic falls in the acceptance region or the
rejection region. If the statistic falls within the specified range
of values (confidence interval), the researcher will not
reject (accept) the null hypothesis; otherwise the null hypothesis
is rejected.
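The decision rule described above can be written as a small function. This is only a sketch for a two-tailed test; the function name `statistical_conclusion` and the example values are hypothetical.

```python
def statistical_conclusion(test_statistic: float, critical_value: float) -> str:
    """Two-tailed decision rule: compare |test statistic| with the critical value."""
    # Acceptance (non-rejection) region: -critical_value <= statistic <= +critical_value.
    if abs(test_statistic) <= critical_value:
        return "do not reject H0"
    # Otherwise the statistic lies in the critical (rejection) region.
    return "reject H0"

# Illustrative values only: 1.96 is the two-tailed 5% critical value of z.
print(statistical_conclusion(1.20, 1.96))
print(statistical_conclusion(-2.50, 1.96))
```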
PARAMETRIC AND NON-PARAMETRIC TESTS
Parametric tests
Parametric tests make certain assumptions about a data set, namely
that the data are drawn from a population with a specific (usually normal)
distribution. It is further assumed in parametric tests that the variables
in the population are measured on an interval or ratio scale.
When parametric tests are used
• When the data has a normal distribution
• When the measurement scale is interval or ratio
• When there is knowledge about population
Types of parametric tests
• Two-sample t-test
• Independent t test
• Paired t-test
• Z test
• Analysis of variance (ANOVA)
• Pearson coefficient of correlation
Non-parametric tests
Non-parametric tests are also known as distribution-free tests because they
do not assume that the data follow a specific distribution. They are
considered less powerful, as they use less information in their calculation
and make fewer assumptions about the data set.
When non-parametric tests are used
• When the study is better represented by the median
• When the data are not necessarily normally distributed
• When there is ordinal data, ranked data
• When the measurement scale is nominal or ordinal
• When there is no previous knowledge of the population
Types of non-parametric tests
• Kruskal-Wallis test
• Chi-square test
• Spearman’s rank correlation
• Wilcoxon signed rank test
• Wilcoxon rank sum test
The key difference between parametric and non-parametric tests is that
parametric tests rely on an assumed statistical distribution of the data, whereas
non-parametric tests do not depend on any particular distribution. Non-parametric tests
make fewer assumptions and typically measure central tendency with the median
value. Some examples of non-parametric tests include Mann-Whitney,
Kruskal-Wallis, etc.
A parametric test is a statistical test which assumes that the parameters and the distribution
of the population are known. It uses the mean value to measure central
tendency. These tests are common, which makes them straightforward to apply
in research.
1. TO TEST SIGNIFICANCE OF MEAN: T-TEST
Conditions:
1. The sample size is small (n < 30).
2. The population SD is unknown and is estimated from the sample SD.
3. The population from which the sample is taken is normally distributed.
4. The sample should be random.
5. The population mean is assumed to be known, i.e. it is hypothesised.
The test checks whether a sample of size n is drawn from a population with mean µ0, that is, whether
the hypothesised mean is true.
In other words, it tests whether there is a difference between the true population mean and the
hypothesized mean.
H0: µ = µ0 ("there is no difference between the true population mean and the
hypothesized population mean")
H1: µ ≠ µ0 ("there is a difference between the true population mean and the hypothesized
population mean")
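For reference, the one-sample t statistic used under the conditions above is usually written as follows. The symbols for the sample mean, the sample SD s, the sample size n and the degrees of freedom v are assumed notation, chosen to match µ and µ0 in the hypotheses.

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad
s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}, \qquad
v = n - 1
```

The computed t is compared with the table value of t for v degrees of freedom at the chosen level of significance.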
P-VALUE
A p-value, or probability value, is used in hypothesis testing to help you decide whether to reject the
null hypothesis. The p-value is the probability, assuming the null hypothesis is true, of obtaining a
result at least as extreme as the one observed in the sample. The
smaller the p-value, the stronger the evidence that you should reject the null
hypothesis.
The most common threshold is p < 0.05. If the p-value is 0.025, then there is
only a 2.5% probability of observing a result this extreme when the null is true. The lower the p-value, the stronger the evidence
against the null hypothesis and in favour of the alternate hypothesis.
When the p-value falls below the chosen alpha value, we say the result of the test is
statistically significant.
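As an illustration, a two-tailed p-value for a z statistic can be computed as sketched below. The sketch assumes a standard normal test statistic and uses Python's standard-library `statistics.NormalDist`; the function name and the observed value 2.24 are hypothetical.

```python
from statistics import NormalDist

def two_tailed_p_value(z: float) -> float:
    """Probability, under H0, of a z statistic at least as extreme as the one observed."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05          # chosen level of significance
z_observed = 2.24     # hypothetical test statistic, for illustration only
p = two_tailed_p_value(z_observed)
print(f"p = {p:.4f}, statistically significant at alpha={alpha}: {p < alpha}")
```

Here p is about 0.025, which is below alpha = 0.05, so the result would be called statistically significant.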
A random sample of size 16 has a mean of 53. The sum of the squares of the deviations taken
from the mean is 135. Can this sample be regarded as taken
from a population having 56 as its mean? Obtain the 95% confidence limits
of the population mean. (For v = 15, the two-tailed 5% table value of t is 2.131.)
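One way to work through this problem is sketched below, using only the figures given in the question. The critical value 2.131 is the standard two-tailed 5% table value of t for 15 degrees of freedom, supplied here as an assumption, and the variable names are illustrative.

```python
import math

n = 16                 # sample size
sample_mean = 53.0     # sample mean
sum_sq_dev = 135.0     # sum of squared deviations from the sample mean
mu_0 = 56.0            # hypothesised population mean
t_critical = 2.131     # assumed table value: two-tailed 5%, v = n - 1 = 15

# Sample SD uses n - 1 in the denominator because the population SD is unknown.
s = math.sqrt(sum_sq_dev / (n - 1))          # sqrt(135 / 15) = 3
standard_error = s / math.sqrt(n)            # 3 / 4 = 0.75
t_stat = (sample_mean - mu_0) / standard_error

reject_h0 = abs(t_stat) > t_critical
lower = sample_mean - t_critical * standard_error
upper = sample_mean + t_critical * standard_error

print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
print(f"95% confidence limits: {lower:.2f} to {upper:.2f}")
```

Under these assumptions |t| = 4 exceeds 2.131, so H0 is rejected: the sample cannot be regarded as drawn from a population with mean 56. The 95% confidence limits come out at roughly 51.40 and 54.60, which do not include 56.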
ESTIMATION OF MEAN DIFFERENCE BETWEEN
TWO UNPAIRED SAMPLES : UNPAIRED T-TEST
Small samples (from two normal populations): unpaired t-test
1. Done for samples coming from two normal populations.
2. The samples are independent.
3. Values of one sample do not influence values of the other sample.
H0: µ1 = µ2 (there is no difference between the two population means).
The samples are taken from populations with means µ1 and µ2.
The test statistic subtracts the hypothesized difference between the population means, which is 0 if testing for equal means.
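A minimal sketch of the unpaired t statistic follows. It assumes the common pooled-variance form (equal population variances), which may differ from the exact formula on the original slides; the function name `unpaired_t` and the data are hypothetical.

```python
import math
from statistics import mean, variance

def unpaired_t(sample_1, sample_2, hypothesized_diff=0.0):
    """Pooled-variance t statistic and degrees of freedom for two independent samples."""
    n1, n2 = len(sample_1), len(sample_2)
    # Pooled variance: weighted average of the two sample variances (n - 1 denominators).
    pooled_var = ((n1 - 1) * variance(sample_1)
                  + (n2 - 1) * variance(sample_2)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean(sample_1) - mean(sample_2) - hypothesized_diff) / standard_error
    return t_stat, n1 + n2 - 2

# Hypothetical data, for illustration only.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
t_stat, df = unpaired_t(group_a, group_b)
print(f"t = {t_stat:.3f}, degrees of freedom = {df}")
```

The computed t is then compared with the table value of t for n1 + n2 - 2 degrees of freedom.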
2. TO TEST SIGNIFICANCE OF MEAN : Z-TEST
1. Based on the normal distribution.
2. Used to test the significance of a mean.
3. The population SD is assumed to be known.
4. The sample size is large (n > 30).
The test checks whether a sample of size n is drawn from a population with mean µ0, that is, whether
the hypothesised mean is true.
In other words, it tests whether there is a difference between the true population mean and the hypothesized
mean.
H0: µ = µ0 ("there is no difference between the true population mean and the hypothesized
population mean")
H1: µ ≠ µ0 ("there is a difference between the true population mean and the hypothesized
population mean")
A random sample of 80 bank employees is taken to test the claim that the mean salary of bank
executives in a particular state is 48,400 per month. Further, from a related study undertaken
recently, the SD of the distribution of salaries of bank executives in that state
is known to be 5,870. The sample has yielded an average monthly salary of
47,456. Is the claim that µ = 48,400 tenable at the 1% level of significance?
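One way to work through this problem is sketched below, using the figures given in the question. A two-tailed test is assumed, and the critical value is taken from Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 80                  # sample size
sample_mean = 47456.0   # sample average monthly salary
mu_0 = 48400.0          # claimed population mean
sigma = 5870.0          # population SD, taken as known
alpha = 0.01            # level of significance (two-tailed test assumed)

standard_error = sigma / math.sqrt(n)
z_stat = (sample_mean - mu_0) / standard_error
z_critical = NormalDist().inv_cdf(1 - alpha / 2)   # about 2.576

claim_tenable = abs(z_stat) <= z_critical
print(f"z = {z_stat:.3f}, critical value = {z_critical:.3f}, claim tenable: {claim_tenable}")
```

Under these assumptions z is about -1.44, which lies inside the acceptance region of ±2.576, so H0 is not rejected and the claim that µ = 48,400 is tenable at the 1% level.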
ESTIMATION OF MEAN DIFFERENCE BETWEEN
TWO UNPAIRED SAMPLES : Z-TEST FOR
INDEPENDENT SAMPLES
Large samples (from two normal populations): z-test
1. Done for samples coming from two normal populations.
2. The samples are independent.
3. Values of one sample do not influence values of the other sample.
H0: µ1 = µ2 (there is no difference between the two population means).
The samples are taken from populations with means µ1 and µ2.
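A minimal sketch of the z statistic for two independent samples follows. It assumes the usual form in which the standard error combines the two variances divided by their sample sizes; the function name `two_sample_z` and the summary figures are hypothetical.

```python
import math

def two_sample_z(mean_1, mean_2, sd_1, sd_2, n_1, n_2, hypothesized_diff=0.0):
    """z statistic for the difference between two independent sample means."""
    # Standard error of the difference between the two sample means.
    standard_error = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2 - hypothesized_diff) / standard_error

# Hypothetical summary statistics, for illustration only.
z_stat = two_sample_z(mean_1=52.0, mean_2=50.0, sd_1=6.0, sd_2=8.0, n_1=36, n_2=64)
print(f"z = {z_stat:.3f}")
```

With these illustrative figures z is about 1.414, which is below the two-tailed 5% critical value of 1.96, so H0 would not be rejected.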
PRACTICE QUESTIONS
Business Research Methods by H.K Dangi, Shruti Dewan
TEXT BOOKS REFERRED
Research Methodology: Deepak Chawla
Business Research Methods: Cooper and Schindler
Research Methodology (a step-by-step guide for beginners): Ranjit Kumar
Research Methodology: C.R. Kothari
Business Research Methods: H.K. Dangi, Shruti Dewan
THANK YOU!