
Statistics & Sampling Insights

This document discusses combinations and sampling. It provides examples of situations where sampling would be required instead of collecting data from the entire population due to feasibility issues. It also contains examples calculating sample means, standard deviations, and confidence intervals. Central limit theorem concepts are demonstrated including how sample size impacts standard error.

Uploaded by

Anonymous 1997
Copyright
© © All Rights Reserved

Combinations

In which of the following situations would you be constrained to collect data
for a sample instead of the entire population? (Please note that multiple
options can be selected.)

A brand manager of a large FMCG company is looking to get a sense
of customer response towards a new campaign.

✓ Correct
Feedback:
In this case, collecting data for the entire population (all customers who have
been exposed to the campaign) would be expensive and infeasible. Hence,
sampling would be the preferred approach.

An HR executive is setting up the operations for a cab pick-up
facility for the employees of an organisation.

✕ Incorrect
Feedback:
In this case, the population (all employees) can be surveyed feasibly and
inexpensively.

The principal of a school wants to evaluate the academic
performance of Class X students.

The Government of India wants to measure the impact of the
Swachh Bharat Mission in increasing awareness about cleanliness
across the country.

✓ Correct (You missed this!)


Feedback:
In this case, collecting data for the entire population would be both
expensive and infeasible. Hence, sampling would be the preferred approach.

Your answer is Wrong.


Sampling
What is the size of the sample that you have taken?

5
✓ Correct
Feedback:
In this sample, you have data for five students. Hence, the sample size is
equal to 5.

40334

698.44

✕ Incorrect
Feedback:
In this sample, you have data for five students. So, what would the sample
size be equal to?

10
Your answer is Wrong.
Sampling
What is the mean of the sample?

133.21

139.69

✓ Correct
Feedback:
The sample mean is (121.92 + 133.21 + 141.34 + 126.23 + 175.74)/5 = 698.44/5 = 139.69.

146.76

141.34

Sampling
What is the standard deviation of the sample?

22.84
23.67

19.19

21.45

✓ Correct
Feedback:
In the previous question, we found that the sample mean (X̄) is 139.69. Now, we can
calculate ∑(Xi − X̄)², which comes out to be 1,841.26. Dividing this by n − 1, i.e., 4, we get the
variance as 460.315. Thus, the standard deviation (S) will be equal to √460.315 = 21.45.
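
The two sampling answers above can be checked with a short sketch. It assumes the five values quoted in the sample-mean feedback are the whole sample; the variable names are illustrative only.

```python
import statistics

# Five observations quoted in the sample-mean feedback above.
sample = [121.92, 133.21, 141.34, 126.23, 175.74]

n = len(sample)                        # sample size
sample_mean = statistics.mean(sample)  # sum of values divided by n
# statistics.stdev divides the squared deviations by n - 1,
# which matches the "n - 1" step in the feedback.
sample_sd = statistics.stdev(sample)

print(n, round(sample_mean, 2), round(sample_sd, 2))
```

Using `statistics.pstdev` instead would divide by n and give a smaller value, so it would not reproduce the answer marked correct here.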

Central Limit Theorem


For a given discrete random variable X, the population parameters are μ = 5
and σ = 12. Now, if you draw 10,000 samples of size 100 from this
population and compute the mean of each, what would the parameters of the
distribution of sample means be?

Mean = 5 and standard error = 0.12

Mean = 5 and standard error = 1.2

✓ Correct
Feedback:
According to the central limit theorem, the mean of the distribution of
sample means will be equal to the mean of the population (μ) and the
standard error will be σ/√n, where n is the sample size. Hence, the mean is 5 and
the standard error is 12/√100 = 1.2 in this case.

Mean = 0.5 and standard error = 1.2

Mean = 0.05 and standard error = 0.12


Your answer is Correct.
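
A rough simulation can make the result for μ = 5 and σ = 12 concrete. This is only a sketch: the question does not say how X is distributed, so a normal population is assumed here purely as a stand-in, and the seed and variable names are arbitrary.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA = 5, 12          # population parameters from the question
N, REPEATS = 100, 10_000   # sample size and number of samples drawn

# Assumption: a normal population stands in for X. The question only
# fixes the mean and standard deviation, not the shape.
sample_means = [
    statistics.fmean(random.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(REPEATS)
]

theoretical_se = SIGMA / N ** 0.5  # sigma / sqrt(n)
print("theoretical standard error:", theoretical_se)
print("mean of sample means:", round(statistics.fmean(sample_means), 2))
print("sd of sample means:", round(statistics.stdev(sample_means), 2))
```

The simulated mean and spread should land close to 5 and 1.2, though the exact figures depend on the seed and on the assumed population.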
Central Limit Theorem
For a given continuous random variable Y, the population parameters are μ =
55 and σ = 1.8. What is the probability that a randomly chosen sample of
size 36 will have a mean that lies between 54.7 and 55.3?
47.5%

68%

✓ Correct
Feedback:
According to the central limit theorem, the mean of the distribution of
sample means (X̄) will be equal to the mean of the population (μ), and the
standard error (SE) will be σ/√n, where n is the sample size. Hence, the mean
is 55 and the standard error is 0.3 in this case. Now, 54.7 lies one standard
error to the left and 55.3 lies one standard error to the right of the mean.
Hence, P(54.7 < X̄ < 55.3) = 0.68, which is equal to 68%.

81.5%

95%
Your answer is Correct.
Central Limit Theorem
For a given discrete random variable Y, the population parameters are μ =
30 and σ = 10.5. What is the probability that a randomly chosen sample of
size 49 will have a mean that lies between 27 and 31.5?

47.5%

68%

81.5%

✓ Correct
Feedback:
According to the central limit theorem, the mean of the distribution of
sample means (X̄) will be equal to the mean of the population (μ), and the
standard error (SE) will be σ/√n, where n is the sample size. Hence, mean is
30 and standard error is 10.5/√49 = 10.5/7 = 1.5 in this case. Now, 27 lies
two standard errors to the left of the mean and 31.5 lies one standard error
to the right of the mean. Hence, P(27 < X̄ < 31.5) = 0.95/2 + 0.68/2 = 0.475
+ 0.34 = 0.815, which is equal to 81.5%.
95%
Your answer is Correct.
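
The last two probabilities rely on the 68% and 95% rules of thumb. A sketch using the normal CDF from Python's standard library gives the unrounded figures; the helper name `prob_mean_between` is made up for this illustration, and the sample mean is assumed to be normally distributed as the central limit theorem suggests.

```python
from statistics import NormalDist

def prob_mean_between(mu, sigma, n, low, high):
    """P(low < sample mean < high), treating the sample mean as normal."""
    se = sigma / n ** 0.5          # standard error = sigma / sqrt(n)
    dist = NormalDist(mu, se)      # sampling distribution of the mean
    return dist.cdf(high) - dist.cdf(low)

# mu = 55, sigma = 1.8, n = 36: one standard error either side of the mean.
p1 = prob_mean_between(55, 1.8, 36, 54.7, 55.3)
# mu = 30, sigma = 10.5, n = 49: two SEs below to one SE above the mean.
p2 = prob_mean_between(30, 10.5, 49, 27, 31.5)

print(round(p1, 4), round(p2, 4))
```

The results are slightly above 68% and 81.5% because the rule-of-thumb percentages are rounded.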
Confidence Intervals
What effect will decreasing the sample size and keeping everything else the
same have on the length of your confidence interval?

It will increase.

✓ Correct
Feedback:
The formula for calculating the confidence interval is X̄ ± (z-score * σ/√n). Hence,
the length of the confidence interval is 2 * (z-score * σ/√n). Therefore, when
the sample size decreases, the length of the confidence interval increases.

It will decrease.

It will remain the same.

✕ Incorrect
Feedback:
The formula for calculating the confidence interval is X̄ ± (z-score * σ/√n).

It cannot be determined from the given information.


Your answer is Wrong.
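
The effect of sample size on interval length can be seen numerically. In this sketch σ = 12 and the 95% level are hypothetical values chosen only for illustration, and `ci_length` is not a name from the source.

```python
from statistics import NormalDist

def ci_length(sigma, n, confidence):
    """Length of a z-based confidence interval: 2 * z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided z-score
    return 2 * z * sigma / n ** 0.5

# Hypothetical inputs: sigma = 12 at a 95% confidence level.
for n in (400, 100, 25):
    print(n, round(ci_length(12, n, 0.95), 2))
```

Each time n is cut to a quarter, the length doubles, because n enters the formula only through √n.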

Confidence Intervals
The scores of 10-year-old children on an IQ test have a standard deviation of 12. Suppose that, for
a sample of 36 children, the mean IQ comes out to be 75. In what interval will the
population mean lie if you want to be 90% confident of your estimate?

(74.45, 75.55)

(71.63, 78.37)
(70.87, 79.14)

(71.71, 78.29)

✓ Correct
Feedback:
The z-score corresponding to the 90% confidence interval value is ±1.645. This means that the
population mean will belong with 90% certainty to the interval 75 ± 1.645 * (12/√36). Hence, the
90% confidence interval is 75 ± 1.645 * (12/√36) = 75 ± 1.645 * 2 = 75 ± 3.29 = (71.71, 78.29).
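
The same interval can be reproduced in a few lines. This is a minimal sketch that assumes a known population standard deviation, as the question does; `z_interval` is an illustrative helper, not something from the source.

```python
from statistics import NormalDist

def z_interval(mean, sigma, n, confidence):
    """Two-sided z confidence interval for a mean with known sigma."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.645 for 90%
    margin = z * sigma / n ** 0.5                   # z * standard error
    return mean - margin, mean + margin

low, high = z_interval(75, 12, 36, 0.90)
print(round(low, 2), round(high, 2))
```

Computing the z-score with `inv_cdf` avoids hard-coding table values, so the helper also works for other confidence levels.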

Confidence Intervals
An automobile manufacturer has collected data on NOx emissions from 100 of its cars. The
mean of the sample comes out to be 0.048 PPM. The manufacturer will have to recall all its cars
if the NOx emissions are higher than 0.05 PPM on average. Can the manufacturer decide that it
does not have to recall its cars at a 95% confidence level if the population standard deviation is
known to be 0.01 PPM?

Yes

✓ Correct
Feedback:
The z-score corresponding to the 95% confidence interval value is ±1.960. This means that the
population mean will belong with 95% certainty to the interval 0.048 ± 1.96 * (0.01/√100) =
0.048 ± 1.96 * 0.001 = 0.048 ± 0.00196 = (0.04604, 0.04996). As the 95% confidence interval is
entirely to the left of 0.05 PPM, the manufacturer can decide that it does not have to recall its
cars at this confidence level.

No
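
The recall decision reduces to comparing the upper end of the interval with the 0.05 PPM limit. A sketch of that comparison, with variable names chosen here for illustration:

```python
import math
from statistics import NormalDist

LIMIT = 0.05                          # recall threshold in PPM
sample_mean, sigma, n = 0.048, 0.01, 100

z = NormalDist().inv_cdf(0.975)       # two-sided 95% z-score, about 1.96
margin = z * sigma / math.sqrt(n)     # z * standard error
upper = sample_mean + margin

# No recall is needed only if the whole interval sits below the limit.
no_recall = upper < LIMIT
print(round(upper, 5), no_recall)
```

The upper bound clears the limit by a very small margin, so a slightly higher sample mean or a smaller sample would reverse the conclusion.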

Sampling and Estimation


Find the mean and standard deviation of X.

0.532; 0.346

0.532; 0.249

0.532; 0.499

✓ Correct
Feedback:
The mean of X is [(0 + 0 + 0 + ... 1,170 times) + (1 + 1 + 1 + ... 1,330 times)]
/ 2,500 = [(0 * 1,170) + (1 * 1,330)] / 2,500 = 1,330/2,500 = 0.532.

The variance of X is [(0 − 0.532)² * 1,170 + (1 − 0.532)² * 1,330] / (2,500 − 1)
= [(0.532)² * 1,170 + (0.468)² * 1,330] / 2,499 = 0.249.

Hence, the standard deviation of X is √0.249 = 0.499.

None of the above

✕ Incorrect
Feedback:
One of the options is correct.

Your answer is Wrong.
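
The mean and standard deviation of a 0/1 variable can be computed directly from its frequency counts. The sketch below uses the counts quoted in the feedback (1,170 zeros and 1,330 ones); the function name `freq_mean_sd` is hypothetical.

```python
import math

def freq_mean_sd(freq):
    """Mean and sample standard deviation from a {value: count} table."""
    n = sum(freq.values())
    mean = sum(x * count for x, count in freq.items()) / n
    # Sample variance: weighted squared deviations divided by n - 1.
    variance = sum(count * (x - mean) ** 2 for x, count in freq.items()) / (n - 1)
    return mean, math.sqrt(variance)

mean_x, sd_x = freq_mean_sd({0: 1170, 1: 1330})
print(round(mean_x, 3), round(sd_x, 3))
```

Working from counts avoids building a list of 2,500 individual values.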


Sampling and Estimation
Suppose the population standard deviation is known to be 0.6. At a 99% confidence level, in what
interval will the mean of the sampling distribution (for samples of size 2,500) lie?

(0.456, 0.608)

(0.572, 0.588)

(0.493, 0.571)

(0.501, 0.563)

✓ Correct
Feedback:
The z-score corresponding to the 99% confidence interval value is ±2.576. Hence, the 99%
confidence interval for the population mean is 0.532 ± 2.576 * (0.6/√2,500) = 0.532 ± 2.576 *
0.012 = (0.501, 0.563).

Sampling and Estimation


Can Facebook be 99% confident that a majority of its users will find the feature useful?

Yes
✓ Correct
Feedback:
The 99% confidence interval for the population mean is (0.501, 0.563). As all the values in the
interval are greater than 0.5, Facebook can be 99% confident that a majority of its users will find
the feature useful.

No

✕ Incorrect
Feedback:
If the 99% confidence interval for the population mean includes values less than 0.5, then
Facebook cannot be 99% confident that a majority of its users will find the feature useful. If that
is not the case, then Facebook can be 99% confident that a majority of its users will find the
feature useful.

Depends on the distribution of the population

Insufficient data

Graded Assessments
Note that the following questions are graded.

Comprehension: Delhi Elections


Suppose you work for a news agency, which is conducting an exit poll for the MCD
(Municipal Corporation of Delhi) elections. You have been tasked with predicting the
winner for ward 75N (Ashok Vihar).

You ask 100 randomly selected voters from this ward to name the party they had
voted for. Of the 100 voters, 58 voted for AAP and 42 voted for BJP. So, you define
X as the proportion of people that voted for AAP. Then, the frequency distribution for
X would be as shown in the table given below.
X    Frequency
1    58
0    42

Comprehension
Find the mean and standard deviation of X.

0.58; 0.246

0.58; 0.496

✓ Correct
Feedback:
The mean of X is [(0 + 0 + 0 + ... 42 times) + (1 + 1 + 1 + ... 58 times)] / 100
= [(0 * 42) + (1 * 58)] / 100 = 58/100 = 0.58. The variance of X is [(0 − 0.58)²
* 42 + (1 − 0.58)² * 58] / (100 − 1) = [(0.58)² * 42 + (0.42)² * 58] / 99 =
0.2461. Hence, the standard deviation of X is √0.2461 = 0.496.

0.58; 0.494

None of the above


Your answer is Correct.
Comprehension
Let’s say you wish to construct a sampling distribution of sample size 100 for
the proportion of people that voted for AAP. Suppose the population standard
deviation is known to be 0.7. At a 90% confidence level, in what interval will
the mean of the sampling distribution lie?

(0.485, 0.675)

(0.572, 0.588)
(0.465, 0.695)

✓ Correct
Feedback:
As we are assuming that the population is normally distributed, the sample
standard deviation can be approximated as the population standard
deviation. The z-score corresponding to the 90% confidence interval value is
±1.645. Hence, the 90% confidence interval for the mean of the sampling
distribution is 0.58 ± 1.645 * (0.7/√100) = 0.58 ± 1.645 * 0.07 = (0.465,
0.695).

(0.503, 0.657)
Your answer is Correct.
Comprehension
Can you be 90% confident that AAP will win the majority vote in ward 75N?

Yes

No

✓ Correct
Feedback:
According to the central limit theorem, the mean of the sampling distribution
is equal to the mean of the population. Hence, the 90% confidence interval
for the population mean is (0.465, 0.695). However, as the 90% confidence
interval includes values that are less than 0.5, the population mean can
come out to be, say, 0.481. Hence, you cannot be 90% confident of AAP
winning the majority vote in ward 75N.

Depends on the distribution of the population

Insufficient data
Your answer is Correct.
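
Both majority questions in this document (the Facebook feature and ward 75N) come down to whether the lower end of the confidence interval exceeds 0.5. A sketch of that check, using the figures quoted in the feedback and a helper name invented for illustration:

```python
from statistics import NormalDist

def majority_supported(p_hat, sigma, n, confidence):
    """Return (decision, lower bound) for the 'lower bound > 0.5' rule."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided z-score
    lower = p_hat - z * sigma / n ** 0.5
    return lower > 0.5, lower

# Ward 75N exit poll: mean 0.58, sigma 0.7, n = 100, 90% confidence.
delhi_ok, delhi_lower = majority_supported(0.58, 0.7, 100, 0.90)
# Facebook feature: mean 0.532, sigma 0.6, n = 2,500, 99% confidence.
fb_ok, fb_lower = majority_supported(0.532, 0.6, 2500, 0.99)

print(delhi_ok, round(delhi_lower, 3))
print(fb_ok, round(fb_lower, 3))
```

The exit poll has the larger observed share, yet its small sample makes the interval wide enough to cross 0.5, while the larger Facebook sample keeps the interval above it.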
