Unit 4

Unit IV covers inferential statistics, including concepts such as populations, samples, random sampling, hypothesis testing, and the z-test. It emphasizes the importance of understanding sampling distributions, standard error, and the central limit theorem in making statistical inferences. The document also discusses the application of these concepts through a case study on the Iris data set and an example involving SAT scores.

UNIT IV INFERENTIAL STATISTICS

Populations – samples – random sampling – standard error of the mean - Hypothesis testing – z-test – z-
test procedure –decision rule – calculations – decisions – interpretations - statistical significance testing
– Estimation – point estimate – confidence interval – level of confidence – effect of sample size. Case
study - apply inferential statistics on Iris data set

POPULATIONS
Any complete set of observations (or potential observations) may be characterized as a population.
Accurate descriptions of populations specify the nature of the observations to be taken.
Real population
A real population is one in which all potential observations are accessible at the time of sampling.
Hypothetical Populations
Hypothetical Populations Insofar as research workers concern themselves with populations, they often
invoke the notion of a hypothetical population.
A hypothetical population is one in which all potential observations are not accessible at the time of
sampling.
According to the rules of inferential statistics, generalizations should be made only to real populations
that, in fact, have been sampled.
Generalizations to hypothetical populations should be viewed, therefore, as provisional conclusions
based on the wisdom of the researcher rather than on any logical or statistical necessity.
In effect, it’s an open question—often answered only by additional experimentation— whether or not a
given experimental finding merits the generality assigned to it by the researcher.

SAMPLES
Any subset of observations from a population may be characterized as a sample.
In typical applications of inferential statistics, the sample size is small relative to the population size.
For example, less than 1 percent of all U.S. worksites are included in the Bureau of Labor Statistics’
monthly survey to estimate the rate of unemployment.
Optimal Sample Size
There is no simple rule of thumb for determining the best or optimal sample size for any particular
situation.
Often sample sizes are in the hundreds or even the thousands for surveys, but they are less than 100 for
most experiments.
Optimal sample size depends on the answers to a number of questions, including
“What is the estimated variability among observations?” and
“What is an acceptable amount of error in our conclusion?”
Once these types of questions have been answered, specific procedures can be followed to determine the
optimal sample size for any situation

Population vs. Sample
- Population: includes all members of a specified group. Sample: a subset of the population.
- Population: collecting data from an entire population can be time-consuming, expensive, and sometimes impractical or impossible. Sample: offers a more feasible approach to studying populations, allowing researchers to draw conclusions based on smaller, manageable datasets.
- Example: the population includes all residents in the city, while the sample consists of 1,000 households, a subset of the entire population.

RANDOM SAMPLING
Random sampling occurs if, at each stage of sampling, the selection process guarantees that all potential
observations in the population have an equal chance of being included in the sample.
It’s important to note that randomness describes the selection process—that is, the conditions under
which the sample is taken—and not the particular pattern of observations in the sample. Having
established that sampling is random, you still can’t predict anything about the unique pattern of
observations in that sample.
The observations in the sample should be representative of those in the population, but there is no
guarantee that they actually will be.
Casual or Haphazard, Not Random
A casual or haphazard sample doesn’t qualify as a random sample.
When to Use Random Sampling
1. When the population is relatively homogenous: Simple random sampling works well when the
population shares similar characteristics, as each individual has an equal chance of being selected.
2. When the population size is known: If the total population size is known, simple random sampling
ensures that every individual has a known and non-zero chance of being included in the sample.
3. When there is no need for specialized knowledge: Simple random sampling is straightforward to
implement and does not require extensive prior information about the population.
4. When statistical inference is the primary goal: Simple random sampling is often used when
researchers need to make generalizations about the population based on the sample data, as it
provides an unbiased estimate of population parameters.
5. When resources are limited: Simple random sampling can be more cost-effective and less time-
consuming compared to other sampling methods, making it suitable when resources are limited.

TABLES OF RANDOM NUMBERS


Tables of random numbers can be used to obtain a random sample.
These tables are generated by a computer designed to equalize the occurrence of any one of the 10
digits: 0, 1, 2, . . . , 8, 9.
For convenience, many random number tables are spaced in columns of five-digit numbers. Table H in
Appendix C shows a specimen page of random numbers from a book devoted entirely to random digits.
How Many Digits?
The size of the population determines whether you deal with numbers having one, two, three, or more
digits
Using Tables
Enter the random number table at some arbitrarily determined place. Ordinarily this should be
determined haphazardly.
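The table-lookup procedure above can be sketched in software, where a seeded pseudorandom number generator plays the role of the random number table. The roster of 90 students and the seed are illustrative assumptions of this sketch:

```python
import random

# Hypothetical "real" population: a roster of 90 students, numbered 1-90.
population = list(range(1, 91))

random.seed(10)  # fixed seed so the sketch is reproducible

# random.sample gives every member an equal chance of selection and draws
# without replacement -- the software analogue of a random number table.
sample = random.sample(population, k=10)

print(sorted(sample))
print(len(set(sample)) == 10)  # no duplicates: sampling without replacement
```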

Hypothetical Populations
As has been noted, the researcher, unlike the pollster, usually deals with
hypothetical populations. Unfortunately, it is impossible to take random samples from hypothetical
populations. All potential observations cannot have an equal chance of being included in the sample if,
in fact, some observations are not accessible at the time of sampling.
Random Assignment
A procedure designed to ensure that each subject has an equal chance of being assigned to any group in
an experiment.
random sampling—that all subjects in the population have an equal opportunity of being sampled.
random assignment—that all subjects have an equal opportunity of being assigned to each of the various
groups
Probability
Probability refers to the proportion or fraction of times that a particular event is likely to occur.
Probabilities of Complex Events
Often you can find the probabilities of more complex events by using two rules—the addition and
multiplication rules—for combining the probabilities of various simple events.
Addition Rule
Mutually Exclusive Events
Events that cannot occur together.
The addition rule tells us to add together the separate probabilities of several mutually exclusive
events in order to find the probability that any one of these events will occur.

MULTIPLICATION RULE
Independent Events
The occurrence of one event has no effect on the probability that the other event will occur.
Multiplication Rule
Multiply together the separate probabilities of several independent events to find the
probability that these events will occur together.
Dependent Events
When the occurrence of one event affects the probability of the other event, these events are dependent.
Conditional Probabilities
Before multiplying to obtain the probability that two dependent events occur together, the probability of
the second event must be adjusted to reflect its dependency on the prior occurrence of the first event.
This new probability is the conditional probability of the second event, given the first event.
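A minimal sketch of the addition, multiplication, and conditional-probability rules, using an assumed fair six-sided die and a standard 52-card deck:

```python
# Addition rule (mutually exclusive events): P(roll a 1 OR a 2).
p_one, p_two = 1/6, 1/6
p_one_or_two = p_one + p_two          # = 1/3

# Multiplication rule (independent events): P(two 6s on two separate rolls).
p_six = 1/6
p_double_six = p_six * p_six          # = 1/36

# Dependent events need a conditional probability. Drawing two aces from a
# 52-card deck without replacement: the second probability is adjusted to
# reflect that one ace and one card are already gone.
p_first_ace = 4/52
p_second_ace_given_first = 3/51
p_two_aces = p_first_ace * p_second_ace_given_first   # = 1/221

print(round(p_one_or_two, 4), round(p_double_six, 4), round(p_two_aces, 4))
```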

PROBABILITY AND STATISTICS
Probability assumes a key role in inferential statistics including, for instance, the important area known
as hypothesis testing.
Because of the inevitable variability that accompanies any observed result, such as a mean difference
between two groups, its value must be viewed within the context of the many possible results that could
have occurred just by chance.
With the aid of some theoretical curve, such as the normal curve, and a provisional assumption, known
as the null hypothesis, that chance can reasonably account for the result, probabilities are assigned to the
one observed mean difference.
If this probability is very small, the result is viewed as a rare outcome, and we conclude that something
real—that is, something that can’t reasonably be attributed to chance—has occurred.
On the other hand, if this probability isn’t very small, the result is viewed as a common outcome, and
we conclude that something transitory—that is, something that can reasonably be attributed to chance—
has occurred

SAMPLING DISTRIBUTION
The single most important concept in inferential statistics is the sampling distribution. A
sampling distribution serves as a frame of reference for every outcome, among all possible outcomes,
that could occur just by chance.
SAMPLING DISTRIBUTION OF THE MEAN
The probability distribution of means for all possible random samples of a given size from some
population.

MEAN OF ALL SAMPLE MEANS (μX̄)
The mean of the sampling distribution of the mean always equals the mean of the population: μX̄ = μ.

STANDARD ERROR OF THE MEAN (σX̄)
A rough measure of the average amount by which sample means deviate from the mean of the sampling
distribution or from the population mean.
The standard error of the mean equals the standard deviation of the population divided by the square
root of the sample size (Formula 9.2):
σX̄ = σ / √n

Special Type of Standard Deviation


The standard error of the mean serves as a special type of standard deviation that measures variability in
the sampling distribution. It supplies us with a standard, much like a yardstick, that describes the amount
by which sample means deviate from the mean of the sampling distribution or from the population
mean.
Effect of Sample Size

According to Formula 9.2, any increase in sample size translates into a smaller standard error and,
therefore, into a new sampling distribution with less variability.
With a larger sample size, sample means cluster more closely about the mean of the sampling
distribution and about the mean of the population and, therefore, allow more precise generalizations
from samples to populations.
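A small sketch of Formula 9.2, using the SAT population standard deviation of 110 from the example below; the particular sample sizes compared are illustrative assumptions:

```python
import math

def standard_error(sigma, n):
    """Formula 9.2: standard error of the mean = sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 110  # population standard deviation from the SAT example
for n in (25, 100, 400):
    # Quadrupling the sample size halves the standard error.
    print(n, standard_error(sigma, n))
```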
SHAPE OF THE SAMPLING DISTRIBUTION
The central limit theorem states that, regardless of the shape of the population, the shape of the sampling
distribution of the mean approximates a normal curve if the sample size is sufficiently large.

Why the Central Limit Theorem Works


In a normal curve, intermediate values are the most prevalent, and extreme values, either larger or
smaller, occupy the tapered flanks. Why, when the sample size is large, does the sampling distribution
approximate a normal curve, even though the parent population might be non-normal?
Many Sample Means with Intermediate Values
When the sample size is large, it is most likely that any single sample will contain the full spectrum of
small, intermediate, and large scores from the parent population, whatever its shape. The calculation of a
mean for this type of sample tends to neutralize or dilute the effects of any extreme scores, and the
sample mean emerges with some intermediate value.
Few Sample Means with Extreme Values
To account for the rarer sample mean values in the tails of the sampling distribution, focus on those
relatively infrequent samples that, just by chance, contain less than the full spectrum of scores from the
parent population. Sometimes, because of the relatively large number of extreme scores in a particular
direction, the calculation of a mean only slightly dilutes their effect, and the sample mean emerges with
some more extreme value.
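The clustering of sample means described above can be checked with a small simulation. The uniform (decidedly non-normal) parent population, the sample size of 50, and the 2,000 replications are all assumptions of this sketch:

```python
import random
import statistics

random.seed(42)
n = 50           # size of each sample
trials = 2000    # number of sample means to draw

# Parent population: scores uniformly distributed between 0 and 100.
means = []
for _ in range(trials):
    sample = [random.uniform(0, 100) for _ in range(n)]
    means.append(statistics.mean(sample))

# The sample means cluster near the population mean (50), and roughly the
# normal-curve share of them falls within two standard errors of the center.
center = statistics.mean(means)
spread = statistics.stdev(means)
inside_2sd = sum(abs(m - center) <= 2 * spread for m in means) / trials

print(round(center, 1), round(spread, 2), round(inside_2sd, 3))
```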
OTHER SAMPLING DISTRIBUTIONS
For the Mean
There are many different sampling distributions of means. A new sampling distribution is created by a
switch to another population. Furthermore, for any single population, there are as many different
sampling distributions as there are possible sample sizes. Although each of these sampling distributions
has the same mean, the value of the standard error always differs and depends upon the size of the
sample.
For Other Measures
There are sampling distributions for measures other than a single mean. For instance, there are sampling
distributions for medians, proportions, standard deviations, variances, and correlations, as well as for
differences between pairs of means, pairs of proportions, and so forth.

Introduction to Hypothesis Testing: The z Test
Using the sampling distribution as our frame of reference, the one observed outcome is characterized as
either a common outcome or a rare outcome.
A common outcome is readily attributable to chance, and therefore, the hypothesis that nothing special
is happening—the null hypothesis—is retained.
On the other hand, a rare outcome isn’t readily attributable to chance, and therefore, the null hypothesis
is rejected
SAT Score case study
Assume that the SAT math scores for all college-bound students during a recent year were distributed
around a mean of 500 with a standard deviation of 110.
An investigator at a university wishes to test the claim that, on the average, the SAT math scores for
local freshmen equals the national average of 500. His task would be straightforward if, in fact, the math
scores for all local freshmen were readily available.
Then, after calculating the mean score for all local freshmen, a direct comparison would indicate
whether, on the average, local freshmen score below, at, or above the national average.
Assume that it is not possible to obtain scores for the entire freshman class. Instead, SAT math scores
are obtained for a random sample of 100 students from the local population of freshmen, and the mean
score for this sample equals 533. If each sample were an exact replica of the population, generalizations
from the sample to the population would be most straightforward.
Having observed a mean score of 533 for a sample of 100 freshmen, we could have concluded, without
even a pause, that the mean math score for the entire freshman class also equals 533 and, therefore,
exceeds the national average.

TESTING A HYPOTHESIS
Consider a test of the hypothesis that the mean SAT math score for all local freshmen equals the national
average of 500. Given a mean math score of 533 for a random sample of 100 freshmen,
let’s test the hypothesis that, with respect to the national average, nothing special is happening in the
local population.
Insofar as an investigator usually suspects just the opposite—namely, that something special is
happening in the local population—he or she hopes to reject the hypothesis that nothing special is
happening, henceforth referred to as the null hypothesis.
Hypothesized Sampling Distribution
If the null hypothesis is true, then the distribution of sample means—that is, the sampling distribution of
the mean for all possible random samples, each of size 100, from the local population of freshmen—will
be centered about the national average of 500. (Remember, the mean of the sampling distribution always
equals the population mean.)
In Figure 10.1, this sampling distribution is referred to as the hypothesized sampling distribution, since
its mean equals 500, the hypothesized mean reading score for the local population of freshmen

Anticipating the key role of the hypothesized sampling distribution in our hypothesis test, let’s
focus on two more properties of this distribution:
1. In Figure 10.1, vertical lines appear, at intervals of size 11, on either side of the hypothesized
population mean of 500. These intervals reflect the size of the standard error of the mean, σX̄. To
verify this fact, originally demonstrated in Chapter 9, substitute 110 for the population standard
deviation, σ, and 100 for the sample size, n, in Formula 9.2 to obtain σX̄ = 110/√100 = 11.
2. Notice that the shape of the hypothesized sampling distribution in Figure 10.1 approximates a
normal curve, since the sample size of 100 is large enough to satisfy the requirements of the
central limit theorem. Eventually, with the aid of normal curve tables, we will be able to
construct boundaries for common and rare outcomes under the null hypothesis.
Common Outcomes
An observed sample mean qualifies as a common outcome if the difference between its value and that of
the hypothesized population mean is small enough to be viewed as a probable outcome under the null
hypothesis.
Rare Outcomes
An observed sample mean qualifies as a rare outcome if the difference between its value and the
hypothesized population mean is too large to be reasonably viewed as a probable outcome under the null
hypothesis

z TEST FOR A POPULATION MEAN
Sampling Distribution of z
The distribution of z values that would be obtained if a value of z were calculated for each sample mean
for all possible random samples of a given size from some population.
The conversion from X̄ to z yields a distribution that approximates the standard normal curve in Table A
of Appendix C, since, as indicated in Figure 10.3, the original hypothesized population mean (500)
emerges as a z score of 0 and the original standard error of the mean (11) emerges as a z score of 1. The
shift from X̄ to z eliminates the original units of measurement and standardizes the hypothesis test
across all situations without, however, affecting the test results.

Converting a Sample Mean to z
z = (X̄ − μhyp) / σX̄   (Formula 10.1)
where z indicates the deviation of the observed sample mean in standard error units, above or below the
hypothesized population mean.

Assumptions of z Test
When a hypothesis test evaluates how far the observed sample mean deviates, in standard error units,
from the hypothesized population mean, as in the present example, it is referred to as a z test or, more
accurately, as a z test for a population mean
This z test is accurate only when
(1) the population is normally distributed or the sample size is large enough to satisfy the requirements
of the central limit theorem and
(2) the population standard deviation is known

STATEMENT OF THE RESEARCH PROBLEM


The formulation of a research problem often represents the most crucial and exciting phase of an
investigation. Indeed, the mark of a skillful investigator is to focus on an important research problem
that can be answered.
Do children from broken families score lower on tests of personal adjustment?
Do aggressive TV cartoons incite more disruptive behavior in preschool children?
Does profit sharing increase the productivity of employees?
NULL HYPOTHESIS (H0 )
Once the problem has been described, it must be translated into a statistical hypothesis regarding some
population characteristic.
Abbreviated as H0 , the null hypothesis becomes the focal point for the entire test procedure (even
though we usually hope to reject it).
In the test with SAT scores, the null hypothesis asserts that, with respect to the national average of 500,
nothing special is happening to the mean score for the local population of freshmen.
An equivalent statement, in symbols, reads: H0: μ = 500, where H0 represents the null hypothesis and μ is
the population mean for the local freshman class.
Generally speaking, the null hypothesis (H0) is a statistical hypothesis that usually asserts that nothing
special is happening with respect to some characteristic of the underlying population. Because the
hypothesis testing procedure requires that the hypothesized sampling distribution of the mean be
centered about a single number (500), the null hypothesis equals a single number (H0 : μ = 500).
Furthermore, the null hypothesis always makes a precise statement about a characteristic of the
population, never about a sample.
Remember, the purpose of a hypothesis test is to determine whether a particular outcome, such as an
observed sample mean, could have reasonably originated from a population with the hypothesized
characteristic
Finding the Single Number for H0
The single number actually used in H0 varies from problem to problem.

ALTERNATIVE HYPOTHESIS (H1)


In the present example, the alternative hypothesis asserts that, with respect to the national average of
500, something special is happening to the mean math score for the local population of freshmen
(because the mean for the local population doesn’t equal the national average of 500).
An equivalent statement, in symbols, reads: H1: μ ≠ 500, where H1 represents the alternative hypothesis, μ is
the population mean for the local freshman class, and ≠ signifies “is not equal to.” The alternative
hypothesis (H1) asserts the opposite of the null hypothesis.
A decision to retain the null hypothesis implies a lack of support for the alternative hypothesis, and a
decision to reject the null hypothesis implies support for the alternative hypothesis.
Research Hypothesis
Usually identified with the alternative hypothesis, this is the informal hypothesis or hunch that inspires
the entire investigation

DECISION RULE
Specifies precisely when H0 should be rejected (because the observed z qualifies as a rare outcome)
A very common one, already introduced in Figure 10.3, specifies that H0 should be rejected if the
observed z equals or is more positive than 1.96 or if the observed z equals or is more negative than –
1.96. Conversely, H0 should be retained if the observed z falls between ± 1.96.

Critical z Score
A z score that separates common from rare outcomes and hence dictates whether H0 should be retained
or rejected.
Level of Significance (α)
Figure 10.4 also indicates the proportion (.025 + .025 = .05) of the total area that is identified with rare
outcomes.
Often referred to as the level of significance of the statistical test, this proportion is symbolized by the
Greek letter α (alpha). In the present example, the level of significance, α, equals .05.

The level of significance (α) indicates the degree of rarity required of an observed outcome in order to
reject the null hypothesis (H0 ).
For instance, the .05 level of significance indicates that H0 should be rejected if the observed z could
have occurred just by chance with a probability of only .05 (one chance out of twenty) or less.

CALCULATIONS
We can use information from the sample to calculate a value for z. As has been noted previously, use
Formula 10.1 to convert the observed sample mean of 533 into a z of 3: z = (533 − 500)/11 = 33/11 = 3.
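Assuming Formula 10.1 takes the form z = (sample mean − hypothesized mean) / standard error, the calculation can be sketched as:

```python
import math

def z_for_sample_mean(sample_mean, hyp_mean, sigma, n):
    """Formula 10.1: z = (sample mean - hypothesized mean) / standard error."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - hyp_mean) / standard_error

# SAT example: sample mean 533, hypothesized mean 500, sigma 110, n = 100.
z = z_for_sample_mean(533, 500, 110, 100)
print(z)  # 3.0
```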

DECISION
Either retain or reject H0 , depending on the location of the observed z value relative to the critical z
values specified in the decision rule.
According to the present rule, H0 should be rejected at the .05 level of significance because the observed
z of 3 exceeds the critical z of 1.96 and, therefore, qualifies as a rare outcome, that is, an unlikely
outcome from a population centered about the null hypothesis.
Retain or Reject H0?
If you are ever confused about whether to retain or reject H0 , recall the logic behind the hypothesis test.
You want to reject H0 only if the observed value of z qualifies as a rare outcome because it deviates too
far into the tails of the sampling distribution.
Therefore, you want to reject H0 only if the observed value of z equals or is more positive than the
upper critical z (1.96) or if it equals or is more negative than the lower critical z (–1.96).

INTERPRETATION
Finally, interpret the decision in terms of the original research problem. In the present example, it can be
concluded that, since the null hypothesis was rejected, the mean SAT math score for the local freshman
class probably differs from the national average of 500.

Although not a strict consequence of the present test, a more specific conclusion is possible. Since the
sample mean of 533 (or its equivalent z of 3) falls in the upper rejection region of the hypothesized
sampling distribution, it can be concluded that the population mean SAT math score for all local
freshmen probably exceeds the national average of 500.
By the same token, if the observed sample mean or its equivalent z had fallen in the lower rejection
region of the hypothesized sampling distribution, it could have been concluded that the population mean
for all local freshmen probably is below the national average

Estimation (Confidence Intervals)
A hypothesis test merely indicates whether an effect is present.
A confidence interval is more informative since it indicates, with a known degree of confidence, the
range of possible effects.
A confidence interval can appear either in isolation or in the outcome of a test that has rejected the null
hypothesis.
As a research area matures, the use of confidence intervals becomes more prevalent.
Recall that an investigator was concerned about detecting any difference between the mean SAT math score
for all local freshmen and the national average.
This concern led to a z test and the conclusion that the mean for the local population exceeds the
national average.
Given a concern about the national average, this conclusion is most informative; it might even create
some joy among local university officials.
However, the same SAT investigation could have been prompted by a wish merely to estimate the value
of the local population mean rather than to test a hypothesis based on the national average. This new
concern translates into an estimation problem, and with the aid of point estimates and confidence
intervals, information in a sample can be used to estimate the unknown population mean for all local
freshmen.
Point Estimate
A single value that represents some unknown population characteristic, such as the population mean.
A point estimate for μ uses a single value to represent the unknown population mean. This is the most
straightforward type of estimate.
If a random sample of 100 local freshmen reveals a sample mean SAT score of 533, then 533 will be the
point estimate of the unknown population mean for all local freshmen.
The best single point estimate for the unknown population mean is simply the observed value of the
sample mean.

CONFIDENCE INTERVAL (CI) FOR μ


A confidence interval for μ uses a range of values that, with a known degree of certainty, includes the
unknown population mean.
For instance, the SAT investigator might use a confidence interval to claim, with 95 percent confidence,
that the interval between 511.44 and 554.56 includes the population mean math score for all local
freshmen.
To be 95 percent confident signifies that if many of these intervals were constructed for a long series of
samples, approximately 95 percent would include the population mean for all local freshmen.
In the long run, 95 percent of these confidence intervals are true because they include the unknown
population mean. The remaining 5 percent are false because they fail to include the unknown population
mean.
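The interval quoted above can be reproduced with a short sketch of the expression X̄ ± 1.96σX̄:

```python
import math

def confidence_interval(sample_mean, sigma, n, z_conf=1.96):
    """95 percent CI: sample mean plus or minus z_conf standard errors."""
    se = sigma / math.sqrt(n)
    return sample_mean - z_conf * se, sample_mean + z_conf * se

# SAT example: sample mean 533, sigma 110, n = 100.
lower, upper = confidence_interval(533, 110, 100)
print(round(lower, 2), round(upper, 2))  # 511.44 554.56
```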

Why Confidence Intervals Work


The three important properties of this sampling distribution are as follows:
■ The mean of the sampling distribution equals the unknown population mean for all local freshmen,
whatever its value, because the mean of this sampling distribution always equals the population mean.
■ The standard error of the sampling distribution equals the value (11) obtained from dividing the
population standard deviation (110) by the square root of the sample size (√100 = 10).
■ The shape of the sampling distribution approximates a normal distribution because the sample size of
100 satisfies the requirements of the central limit theorem.
A Series of Confidence Intervals
In practice, only one sample mean is actually taken from this sampling distribution and used to construct
a single 95 percent confidence interval.
However, imagine taking not just one but a series of randomly selected sample means from this
sampling distribution. Because of sampling variability, these sample means tend to differ among
themselves.
For each sample mean, construct a 95 percent confidence interval by adding 1.96 standard errors to the
sample mean and subtracting 1.96 standard errors from the sample mean; that is, use the expression
X̄ ± 1.96σX̄
to obtain a 95 percent confidence interval for each sample mean.


True Confidence Intervals
Why, according to statistical theory, do 95 percent of these confidence intervals include the unknown
population mean? As indicated in Figure 12.2, because the sampling distribution is normal, 95 percent

of all sample means are within 1.96 standard errors of the unknown population mean, that is, 95 percent
of all sample means deviate less than 1.96 standard errors from the unknown population mean.
Therefore, and this is the key point, when sample means are expanded into confidence intervals—by
adding and subtracting 1.96 standard errors—95 percent of all possible confidence intervals are true
because they include the unknown population mean.
False Confidence Intervals
Five percent of all confidence intervals fail to include the unknown population mean. As indicated in
Figure 12.2, 5 percent of all sample means (2.5 percent in each tail) deviate more than 1.96 standard
errors from the unknown population mean.
Therefore, when sample means are expanded into confidence intervals—by adding and subtracting 1.96
standard errors—5 percent of all possible confidence intervals are false because they fail to include the
unknown population mean.
To illustrate this point, only 1 of the 16 sample means shown in Figure 12.2 is not within 1.96 standard
errors of the unknown population mean.
The resulting confidence interval, shown as shaded, has a range that does not span the broken line for
the population mean, thereby being designated as a false interval because it fails to include the value of
the unknown population mean.
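The long-run claim that roughly 95 percent of such intervals are true can be checked with a simulation; the normal population, the seed, and the 2,000 replications are assumptions of this sketch:

```python
import math
import random
import statistics

# Population parameters mirror the SAT example (mu = 500, sigma = 110).
random.seed(0)
mu, sigma, n, trials = 500, 110, 100, 2000
se = sigma / math.sqrt(n)

covered = 0
for _ in range(trials):
    # Draw one random sample and expand its mean into a 95 percent interval.
    sample_mean = statistics.mean(random.gauss(mu, sigma) for _ in range(n))
    if sample_mean - 1.96 * se <= mu <= sample_mean + 1.96 * se:
        covered += 1  # a "true" interval: it includes the population mean

print(covered / trials)  # close to 0.95
```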

Problem
Reading achievement scores are obtained for a group of fourth graders. A score of 4.0 indicates a level
of achievement appropriate for fourth grade, a score below 4.0 indicates underachievement, and a score
above 4.0 indicates overachievement. Assume that the population standard deviation equals 0.4. A
random sample of 64 fourth graders reveals a mean achievement score of 3.82.
(a) Construct a 95 percent confidence interval for the unknown population mean. (Remember to convert
the standard deviation to a standard error.)
(b) Interpret this confidence interval; that is, do you find any consistent evidence either of
overachievement or of underachievement?
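One possible worked solution (not part of the original text) simply applies the X̄ ± 1.96σX̄ expression from the preceding section:

```python
import math

# Reading-achievement problem: sigma = 0.4, n = 64, sample mean = 3.82.
sigma, n, sample_mean = 0.4, 64, 3.82
se = sigma / math.sqrt(n)                 # 0.4 / 8 = 0.05 (the standard error)
lower = sample_mean - 1.96 * se           # 3.722
upper = sample_mean + 1.96 * se           # 3.918

print(round(lower, 3), round(upper, 3))
# (b) The entire interval falls below 4.0, which is consistent evidence of
# underachievement.
```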

INTERPRETATION OF A CONFIDENCE INTERVAL


A 95 percent confidence claim reflects a long-term performance rating for an extended series of
confidence intervals. If a series of confidence intervals is constructed to estimate the same population
mean, as in Figure 12.2, approximately 95 percent of these intervals should include the population
mean.
In practice, only one confidence interval, not a series of intervals, is constructed, and that one interval is
either true or false, because it either includes the population mean or fails to include the population
mean.
Of course, we never really know whether a particular confidence interval is true or false unless the entire
population is surveyed. However, when the level of confidence equals 95 percent or more, we can be
reasonably confident that the one observed confidence interval includes the true population mean.
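The long-run claim can be illustrated with a short simulation; the population mean and standard deviation below are arbitrary values chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma, n = 100, 15, 25           # illustrative population parameters
se = sigma / np.sqrt(n)

# Construct 10,000 confidence intervals from independent random samples
trials = 10_000
means = rng.normal(mu, sigma, size=(trials, n)).mean(axis=1)
lower = means - 1.96 * se
upper = means + 1.96 * se

# Fraction of intervals that include the true population mean
coverage = np.mean((lower <= mu) & (mu <= upper))
print(f'coverage = {coverage:.3f}')  # approximately 0.95
```

Any single interval in the simulation is simply true or false; the 95 percent figure describes the performance of the whole series.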
Problem
Before taking the GRE, a random sample of college seniors received special training on how to take the
test. After analyzing their scores on the GRE, the investigator reported a dramatic gain, relative to the
national average of 500, as indicated by a 95 percent confidence interval of 507 to 527. Are the
following interpretations true or false?
(a) About 95 percent of all subjects scored between 507 and 527.
(b) The interval from 507 to 527 refers to possible values of the population mean for all students who
undergo special training.
(c) The true population mean definitely is between 507 and 527.
(d) This particular interval describes the population mean about 95 percent of the time.
(e) In practice, we never really know whether the interval from 507 to 527 is true or false.

(f) We can be reasonably confident that the population mean is between 507 and 527.

LEVEL OF CONFIDENCE
The level of confidence indicates the percent of time that a series of confidence intervals includes the
unknown population mean.

Effect on Width of Interval


Notice that the 99 percent confidence interval of 504.62 to 561.38 is wider and, therefore, less precise
than the corresponding 95 percent confidence interval of 511.44 to 554.56.
The shift from a 95 percent to a 99 percent level of confidence requires an increase in the value of zconf
from 1.96 to 2.58. This increase, in turn, causes a wider, less precise confidence interval.
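The two intervals above can be reproduced from the sample mean and standard error implied by the quoted endpoints (a mean of 533 and a standard error of 11, recovered by averaging and differencing the 95 percent limits):

```python
# Sample mean and standard error recovered from the 95% interval quoted above
mean, se = 533, 11

# Compare interval widths at the two most common levels of confidence
for label, z_conf in [('95%', 1.96), ('99%', 2.58)]:
    lower = mean - z_conf * se
    upper = mean + z_conf * se
    print(f'{label}: ({lower:.2f}, {upper:.2f}), width = {upper - lower:.2f}')
```

The jump from 1.96 to 2.58 standard errors widens the interval from 43.12 to 56.76 points: greater confidence is bought at the price of less precision.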
Choosing a Level of Confidence
Although many different levels of confidence have been used, 95 percent and 99 percent are the most
prevalent.
Generally, a larger level of confidence, such as 99 percent, should be reserved for situations in which a
false interval might have particularly serious consequences, such as the failure of a national opinion
pollster to predict the winner of a presidential election.

EFFECT OF SAMPLE SIZE


The larger the sample size, the smaller the standard error and, hence, the more precise (narrower) the
confidence interval will be. Indeed, as the sample size grows larger, the standard error will approach
zero and the confidence interval will shrink to a point estimate. Given this perspective, the sample
size for a confidence interval, unlike that for a hypothesis test, never can be too large.
Selection of Sample Size
As with hypothesis tests, sample size can be selected according to specifications established before the
investigation. To generate a confidence interval that possesses the desired precision (width), yet
complies with the desired level of confidence, refer to formulas for sample size in other statistics books.
Valid use of these formulas requires that, before the investigation, the population standard deviation be
either known or estimated.
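The notes defer these formulas to other books. One common textbook version, stated here as an assumption rather than a quotation, solves z_conf * sigma / sqrt(n) <= margin for n:

```python
import math

def sample_size(sigma, margin, z_conf=1.96):
    """Smallest n whose confidence-interval half-width is at most `margin`,
    assuming the population standard deviation `sigma` is known or estimated."""
    return math.ceil((z_conf * sigma / margin) ** 2)

# Hypothetical example: estimate a mean GRE score (sigma assumed to be 110)
# to within plus or minus 10 points with 95 percent confidence
print(sample_size(110, 10))   # 465
```

Note how the required sample size grows with the square of the desired precision: halving the margin of error roughly quadruples n.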

HYPOTHESIS TESTS OR CONFIDENCE INTERVALS?
Ordinarily, data are used either to test a hypothesis or to construct a confidence interval, but not both.
Hypothesis tests usually have been preferred to confidence intervals in the behavioral sciences, and that
emphasis is reflected in this book.
As a matter of fact, however, confidence intervals tend to be more informative than hypothesis tests.
Hypothesis tests merely indicate whether or not an effect is present, whereas confidence intervals
indicate the possible size of the effect.
When to Use Confidence Intervals
If the primary concern is whether or not an effect is present—as is often the case in relatively new
research areas—use a hypothesis test.
For example, given that a social psychologist is uncertain whether the consumption of alcohol by
witnesses increases the number of inaccuracies in their recall of a simulated robbery, it would be
appropriate to use a hypothesis test.
Otherwise, given that previous research clearly demonstrates alcohol-induced inaccuracies in witnesses’
testimonies, a new investigator might use a confidence interval to estimate the possible mean number of
these inaccuracies.
Indeed, you should consider using a confidence interval whenever a hypothesis test results in the
rejection of the null hypothesis.

Applying Inferential Statistics on the Iris Dataset


The Iris dataset is a well-known dataset in machine learning and statistics, containing measurements of
iris flowers from three different species: Iris-setosa, Iris-versicolor, and Iris-virginica. Each sample has
four features: sepal length, sepal width, petal length, and petal width.
Inferential statistics allows us to make inferences and generalizations about a population based on a
sample of data. In this context, we will perform inferential statistical analysis on the Iris dataset to draw
conclusions about the relationships between the different features and species.
Steps to Apply Inferential Statistics
1. Load the Iris Dataset: The Iris dataset can be easily loaded using libraries like pandas and seaborn
in Python.
2. Descriptive Statistics: Begin with a summary of the dataset to understand the basic properties.
3. Hypothesis Testing: Perform hypothesis tests to compare the means of different groups and to
assess the relationships between variables.
4. Confidence Intervals: Calculate confidence intervals for the means of the different features for
each species.
5. ANOVA (Analysis of Variance): Conduct ANOVA to determine if there are statistically
significant differences between the means of the features across the species.
Implementation
# Import necessary libraries
import pandas as pd
import seaborn as sns
import numpy as np
from scipy import stats
import statsmodels.api as sm
from statsmodels.formula.api import ols
# Load the Iris dataset
iris = sns.load_dataset('iris')

# Display the first few rows of the dataset
print(iris.head())

# Descriptive statistics
print(iris.describe())

# Hypothesis testing: comparing group means with two-sample t-tests
setosa = iris[iris['species'] == 'setosa']
versicolor = iris[iris['species'] == 'versicolor']
virginica = iris[iris['species'] == 'virginica']

# T-test between Setosa and Versicolor for sepal length
t_stat, p_value = stats.ttest_ind(setosa['sepal_length'], versicolor['sepal_length'])
print(f'T-test Setosa vs Versicolor (Sepal Length): t-statistic = {t_stat}, p-value = {p_value}')

# 95% confidence intervals for sepal length of each species.
# These are normal-based (z) intervals; with n = 50 per species, a t interval
# (stats.t.interval with df = n - 1) would be slightly more exact.
conf_int_setosa = stats.norm.interval(0.95, loc=np.mean(setosa['sepal_length']),
                                      scale=stats.sem(setosa['sepal_length']))
conf_int_versicolor = stats.norm.interval(0.95, loc=np.mean(versicolor['sepal_length']),
                                          scale=stats.sem(versicolor['sepal_length']))
conf_int_virginica = stats.norm.interval(0.95, loc=np.mean(virginica['sepal_length']),
                                         scale=stats.sem(virginica['sepal_length']))

print(f'95% Confidence Interval for Setosa Sepal Length: {conf_int_setosa}')
print(f'95% Confidence Interval for Versicolor Sepal Length: {conf_int_versicolor}')
print(f'95% Confidence Interval for Virginica Sepal Length: {conf_int_virginica}')

# ANOVA (analysis of variance): does mean sepal length differ across species?
model = ols('sepal_length ~ species', data=iris).fit()
anova_table = sm.stats.anova_lm(model, typ=2)
print(anova_table)
Interpretation of Results
1. Descriptive Statistics: The descriptive statistics provide a summary of the central tendency,
dispersion, and shape of the dataset's distribution. For example, mean, standard deviation, and
quartiles of sepal length, sepal width, petal length, and petal width for each species.
2. Hypothesis Testing: The T-test compares the means of sepal length between Iris-setosa and Iris-
versicolor. The null hypothesis is that there is no difference in means. A low p-value (typically <
0.05) indicates that we reject the null hypothesis, suggesting a significant difference in means.
3. Confidence Intervals: Confidence intervals provide a range of values within which we expect the
true population mean to lie, with a certain level of confidence (e.g., 95 percent). Heavily overlapping
intervals suggest that the group means may not differ significantly, although overlap by itself is not a
formal test of significance.
4. ANOVA: ANOVA tests whether there are statistically significant differences between the means of
sepal length across the three species. The null hypothesis states that all group means are equal. A
significant F-statistic (with p-value < 0.05) indicates that at least one group mean is different.
By applying inferential statistics to the Iris dataset, we can draw conclusions about the relationships
between species and their characteristics. This process includes performing hypothesis tests, calculating
confidence intervals, and conducting ANOVA to determine significant differences between groups.
These statistical methods provide a deeper understanding of the dataset and help in making informed
decisions based on the data.
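One common follow-up, not covered in the notes above, is a post-hoc test to identify which species pairs differ once ANOVA rejects the null hypothesis. Tukey's HSD from statsmodels is a standard choice:

```python
import seaborn as sns
from statsmodels.stats.multicomp import pairwise_tukeyhsd

iris = sns.load_dataset('iris')

# Pairwise comparisons of mean sepal length between species,
# controlling the family-wise error rate at 0.05
tukey = pairwise_tukeyhsd(endog=iris['sepal_length'],
                          groups=iris['species'], alpha=0.05)
print(tukey.summary())
```

For sepal length, all three pairwise differences turn out to be significant, which is consistent with the significant ANOVA F-statistic reported above.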
