
Stats 1 Notes:

Lecture 1:
Definitions:
Quantitative variable:
- Continuous: height, weight at birth, yield
- Discrete: number of children in a household, number of diseased plants in a field
Qualitative variable:
- Nominal: hair colour, bachelor program, province
- Ordinal: grade of eggs (AA/A/B), highest level of education completed, annual salary (in ordered categories)

Research question: question that we want to answer
Variable: measured property of an element of the sample
Units: the elements of a sample from which we collect the information
Population: every member of a group (persons, objects, etc.) for which we would like to collect information
Sample: part of the population that we will study and collect information from. We want to draw conclusions about the population, so the sample should be representative of the population

How to draw a sample from a population? :


Sampling bias: certain parts of the population might be overrepresented as compared to other parts. To avoid this, we must use good/recommended ways of sampling:
Simple Random Sampling (SRS): units are drawn at random from a population. Every sample (of a certain size) has an equal chance to be selected (and every unit from the population has the same chance to be selected into the sample). Ex: blindly draw 4 business cards from a box of 20. Assign numbers to all units and draw your desired sample size at random.
Other errors in sampling:
Undersampling: certain groups are excluded from the sample
Non-response: units did not participate or were not successfully contacted
Voluntary participation (in a survey): might result in particularly positive or negative answers
Response bias: e.g. social desirability bias (self-reported personal traits, questions about income)

Observational research: observe the unit/process without influencing it.
Ex: a study of the consequences of smoking during pregnancy.
Experimental research: apply a treatment to the unit in order to observe a reaction.
Ex: Assume we have 20 (similar) experimental plots in a field. After randomization, 10 plots are assigned to wheat variety A and the other 10 are assigned wheat variety B. Determine the difference in yield between variety A and B.
A cause-effect relationship can only be concluded from an experimental study.
Frequency Table : For qualitative variables. Also applicable to discrete variables with a limited number of
outcomes. Below is an example of a frequency table.

Level of education Frequency Relative Frequency (fraction)


Primary education 172 172/945 = 0.18
Secondary Education 373 373/945 = 0.39
Vocational/professional training 125 125/945 = 0.13
University education 174 174/945 = 0.18
Other 101 101/945 = 0.11
Total 945 1

Lecture 2: Part 1 – Numerical summaries of data
Assume we are investigating salt in bread
Population: all loaves of bread sold in the Netherlands on one day
Units: loaves of bread
Variable: amount of salt (g/100 g) in bread
Sampling design: SRS from all supermarkets and bakeries, subsequently draw one loaf at random from each of the selected supermarkets/bakeries (two-stage cluster sample, NOT SRS!)

Product    n     Mean  Minimum  Maximum  Median  s.d.   (all in g/100 g)
Bread      4880  1.4   0.1      2.7      1.4     0.3
Fast food  20    3.3   1.2      8.9      2.4     2.5

Range = max − min
Mean = average; ȳ
Median = 50th percentile, the midpoint value where 50% of the observations are smaller and 50% are larger; M
Bread has a symmetric distribution, as the mean and the median are the same.
Fast food has an asymmetric distribution, as the mean and the median are not the same.
The median is not sensitive to outliers, whereas the mean is very sensitive to outliers.

Effect of outliers:
Data: 7 4 9 6 5 → in increasing order: 4 5 6 7 9. M = 6, mean = 6.2. This is a symmetric distribution.
Data with an outlier: 7 4 110 6 5 → in increasing order: 4 5 6 7 110. M = 6 (not sensitive to the outlier 110), mean = 26.4. This is an asymmetric distribution.

Measures of variability:
Standard deviation: s = √variance
Variance: s² = [(y1 − ȳ)² + ⋯ + (yn − ȳ)²]/(n − 1) = Σ(yi − ȳ)²/(n − 1)
Example: 4 7 3 9 6 → ȳ = 5.8
Variance = s² = [(4 − 5.8)² + ⋯ + (6 − 5.8)²]/(5 − 1) = 5.7
Standard deviation: s = √5.7 = 2.39
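As a quick check of this example, the same three quantities can be computed with Python's standard library (a minimal sketch; the data are the five values from the example above):

import statistics

data = [4, 7, 3, 9, 6]
mean = statistics.mean(data)          # ȳ = 5.8
variance = statistics.variance(data)  # sample variance (divides by n - 1) = 5.7
sd = statistics.stdev(data)           # √5.7 ≈ 2.39
print(mean, variance, round(sd, 2))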

Interquartile range: IQR = Q3 − Q1
Q1 = 1st quartile = 25th percentile = lower quartile
Q3 = 3rd quartile = 75th percentile = upper quartile
The interquartile range is not sensitive to outliers, in contrast to the variance (and therefore also in contrast to the s.d.).
[Sketch: Minimum – Q1 (25%) – Q2 (50%) – Q3 (75%) – Maximum (100%). Outliers appear in the extremely low and high areas, which explains why the IQR is not sensitive to them.]

Percentiles: The pth percentile of a set of n ordered observations (smallest to biggest) is the value where at
most p% of the observations are smaller than it and at most (100-p)% of the observations are larger

Example: for the 10th percentile, you are looking for the value below which at most 10% of the observations fall and above which at most 90% fall. For the IQR, you find the values of the 25th (Q1) and 75th (Q3) percentiles; then the value at Q3 minus the value at Q1 is your IQR.

Five number summaries:


Sample minimum Q1 Median / Q2 Q3 Sample Maximum
Lecture 2: Part 2 - Frequency  probability
Law of large numbers: relative frequencies stabilize if an experiment is repeated very often.
Probability: relative frequency “in the long run”
Ex: Toss a coin 4 times and you are likely to land on heads 1-3 times, i.e. 25-75% of the time. Repeat the experiment 100 times and you will most likely land on heads 45-55 times, i.e. 45-55% of the time, which is closer to what you would expect with a 50/50 probability experiment.
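A small simulation illustrates this stabilizing behaviour (a sketch in Python; the sample sizes 4, 100 and 10 000 are arbitrary choices):

import random

random.seed(1)
for n in (4, 100, 10_000):
    heads = sum(random.random() < 0.5 for _ in range(n))  # simulate n fair coin tosses
    print(n, heads / n)  # relative frequency of heads; approaches 0.5 as n grows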
n = sample size = number of persons in the sample
y = number of persons who consume on average more than 6 g of salt per day
p = probability that a randomly chosen person consumes on average more than 6 g of salt per day
Estimator for p: sampling proportion y/n
y/n is a consistent estimator for p: the larger the sample size, the closer y/n gets to the unknown value of p
Random phenomenon: a phenomenon that is determined (partially) by chance
Random variable: variable whose numeric result originates from a random phenomenon (discrete or
continuous)
Probability distribution: a probability distribution for a discrete random variable consists of:
- the set of all possible outcomes S (sample space)
- a list of probabilities P(outcome) for all possible outcomes
Ex: Probability distribution for the result of rolling a fair die.
Sample Space: 1, 2, 3, 4, 5, 6 P(outcomes): 1/6
Sum = P(1) + P(2) + … P(6) = 1
Definition of probability by Laplace:
If a random phenomenon has k possible outcomes that are all equally likely:
P(event A) = (number of outcomes in A)/k
Ex: Assume that A = "even outcome" → A = {2, 4, 6}. Then P(A) = 3/6 = 0.5 (for rolling a fair die).
Statistical Events:
The event A∪B (union) consists of all outcomes that occur either only in A, only in B, or in A and B simultaneously.
The complement Ā of an event A consists of all outcomes that do not occur in A.
The event A∩B (intersection) consists of all outcomes that occur in A and B simultaneously.
If A and B do not have joint outcomes, we call them mutually exclusive or disjoint (events).

Probability properties:
0 ≤ P(A) ≤ 1 for any event A
P(S) = 1; S is the sample space (set of all possible outcomes)
Complement rule: P(Ā) = 1 − P(A) for each event A

Lecture 3: Laws of probability theory, Tree diagrams


Genetics example: p = 0.50
In this example, we are crossing a yellow and a green pea plant, represented by a tree diagram. In each branch there is a 50% probability for each allele, either male or female, and as they are crossed the probabilities are multiplied, giving a 25% probability that a particular pair of alleles is chosen together. There is only one outcome with green peas, as that is the only outcome that does not involve the dominant allele (Y), and thus there is only a 25% probability of having a green pea plant.
Laws of probability theory:
Multiplication law: for independent events, P(A∩B) = P(A) × P(B) *if A and B are independent events.
Two events A and B are independent if the fact that A occurs does not affect the probability of B occurring. The multiplication law for independent events can also be used to define that events A and B are independent.
Example: A= Male parent passing on Y B = Female parent passing on Y
P(A∩B) = P(A) x P(B) = 0.5 x 0.5 = 0.25

Addition law for disjoint events: Events never occurring at the same time are called mutually exclusive
or disjoint events. P(A∪B) = P(A) + P(B), *if A and B are disjoint events.
Example: Calculate the probability that the offspring is also heterozygous (Yy or yY)
A = Male parent passing on Y, female parent passing on y
B = Male parent passing on y, female parent passing on Y
P(A∪B) = P(A) + P(B) = 0.25 + 0.25 = 0.50

General addition law: For events that are not disjoint. P(A∪B) = P(A) + P(B) - P(A∩B) *if A and B are
disjoint (A∩B = Ø), P(A∩B) = 0. Addition law for disjoint events follows. You are essentially just
removing the double counts from the overlap of both events.
Example: A= Male parent passing on Y B = Female parent passing on Y
P(A∪B) = P(A) + P(B) - P(A∩B) = 0.5 + 0.5 – 0.25 = 0.75
Example: Rolling a die - Event A: uneven number = {1 ,3, 5} Event B = number ≥ 3 = {3, 4, 5, 6}
A∪B = {1, 3, 4, 5, 6} A∩B = {3, 5}
P(A∪B) = P(A) + P(B) - P(A∩B) = 3/6 + 4/6 – 2/6 = 5/6

Complement Rule: P(𝐴) = 1 – P(A) *for each event A


Example: A = at least one Y allele is passed on; Ā = no Y allele is passed on
P(A) = 1 − P(Ā) = 1 − 0.25 = 0.75

Expectation of a variable: µ or E(y): Expectation/ expected value of a random variable y is the mean
outcome of y “in the long run”
Expectation and variance of a discrete random variable y: a random variable y with possible outcomes y1, y2, …, yn and associated probabilities p1, p2, …, pn has the following properties:

The expected value of y is: µ = E(y) = Σ yi·pi = Σ (outcome × probability)
The variance of y is: σ² = Var(y) = Σ (yi − µ)²·pi, the expected value of the squared differences between y and its expectation µ.
Example: expected value of rolling a die:
µ = E(y) = Σ yi·pi = 1·1/6 + 2·1/6 + 3·1/6 + 4·1/6 + 5·1/6 + 6·1/6 = 3.5
Example: variance of y when rolling a die:
σ² = Var(y) = Σ (yi − µ)²·pi = (1 − 3.5)²·1/6 + (2 − 3.5)²·1/6 + (3 − 3.5)²·1/6 + (4 − 3.5)²·1/6 + (5 − 3.5)²·1/6 + (6 − 3.5)²·1/6 = 2.9
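The same two sums can be checked numerically (a short sketch using the die outcomes and probabilities above):

outcomes = [1, 2, 3, 4, 5, 6]
probs = [1/6] * 6
mu = sum(y * p for y, p in zip(outcomes, probs))               # E(y) = 3.5
var = sum((y - mu) ** 2 * p for y, p in zip(outcomes, probs))  # Var(y) ≈ 2.92
print(mu, round(var, 2))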

Lecture 4: Binomial distribution, research question and hypothesis

Binomial distribution:
Binomial coefficient:
Let us say we want to see all the ways we can arrange the letters ABC.
We can write this out as ABC ACB BCA BAC CAB CBA → this gives us 6 possible ways.
First, we choose from the first 3 letters: either A, B or C; then there are two remaining letters to choose from for the 2nd letter. After this, one letter remains, which is automatically the 3rd letter. This gives us the following calculation: 3 × 2 × 1, because first there were 3 options, then 2, then 1. This is called a factorial and can be written as 3! = 6. We can only use this approach if all the letters are unique.
Next, let us use 2 Successes and 1 Failure: S1 S2 F. If we number each S (1 and 2) we have 6 possible outcomes: S1 S2 F, S2 S1 F, S1 F S2, S2 F S1, F S1 S2, F S2 S1. However, if we remove the numbers 1 and 2, the two S's are identical, and half of the rearrangements are doubles, leaving only three unique orders: SSF, SFS, FSS. So there are only three unique orders for 2 successes and 1 failure. We can calculate this number of unique orders using the binomial coefficient, written as C(n, k) ("n choose k"), where n is the total number of letters and k is the number of successes. In this example it is C(3, 2), because we have 3 letters but only 2 successes. The number of unique orders equals the total number of arrangements of all letters divided by the number of arrangements within each group of identical letters.
C(3, 2) = 3!/(2!(3 − 2)!) = 6/(2 × 1) = 3

Example: SSFF: C(4, 2) = 4!/(2!(4 − 2)!) = 24/(2 × 2) = 6
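These counts can be verified with Python's built-in binomial coefficient (a minimal check of the two examples above):

from math import comb

print(comb(3, 2))  # unique orders of SSF  -> 3
print(comb(4, 2))  # unique orders of SSFF -> 6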

Let S-disease be a disease where the vaccine has p = probability of sufficient protection and p = 0.80

Ex: Calculate the probability of the combination PNP (protected, not protected, protected):
P(PNP) = 0.8 × 0.2 × 0.8 = 0.128
Ex: Calculate the probability of exactly 1 not-protected pig:
P(NPP) + P(PNP) + P(PPN) = 0.128 + 0.128 + 0.128 = 0.384
*the events are disjoint and thus we can add them
We can use the multiplication law to determine the probability of each outcome for 2 or 3 randomly selected pigs. For three pigs this is doable; however, if we add more pigs, it becomes too complex for this method. So we use the binomial distribution.
We can use the second example again, where we calculate the probability of exactly 1 not-protected pig:
P(2 of 3 pigs are protected) = P(1 of 3 pigs is not protected) = C(3, 2) × 0.80^2 × 0.20^1 = 0.384
C(3, 2) is the number of ways to choose 2 protected pigs out of 3, 0.80^2 is the probability of protection and 0.20^1 is the probability of insufficient protection. The exponents represent how many pigs have that probability.
Important Note: 0! = 1 and, for example, C(3, 2) = 3!/(2! 1!) = 3
Binomial situation in general – assumptions (when you should use a binomial distribution):
- n trials
- The trials are independent
- Each trial results in either success or failure
- The probability of success π (notation in the textbook) remains the same in each trial

Binomial distribution – formula: the number of successes y in n trials in the binomial situation is distributed according to a binomial distribution (~ means "is distributed as"):

y ~ B(n, π) or Bin(n, π)
Formula: P(y = k) = C(n, k) · π^k · (1 − π)^(n−k), for k = 0, 1, …, n
*where C(n, k) = n!/(k!(n − k)!) is the number of ways to choose k successes in n trials*

Using a graphing calculator:
P(y = k) → Binompdf(n, π, k) or BINM/Bpd…
P(y ≤ k) → Binomcdf(n, π, k) or BINM/Bcd… (see also Table 1 in L.N.)
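If you work in Python instead of on a graphing calculator, scipy provides the same two functions (a sketch; scipy is assumed to be installed, and the values 10 and 1/6 match the die examples below):

from scipy.stats import binom

n, p = 10, 1/6
print(binom.pmf(5, n, p))  # P(y = 5), analogous to Binompdf(10, 1/6, 5) ≈ 0.013
print(binom.cdf(2, n, p))  # P(y ≤ 2), analogous to Binomcdf(10, 1/6, 2) ≈ 0.775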

Example: for P(y = k)
A fair die is thrown 10 times, y = number of '6'. What is the probability of exactly 5 times outcome '6'?
y ~ B(10, 1/6)
On TI-84: go to Binompdf and enter (10, 1/6, 5) = 0.013
Manually: P(y = k) = C(n, k) · π^k · (1 − π)^(n−k)
P(y = 5) = C(10, 5) · (1/6)^5 · (5/6)^5 = 0.013, with C(10, 5) = 10!/(5!(10 − 5)!) = 252

Example: for P(y ≤ k)
A fair die is thrown 10 times, y = number of '6'. What is the probability of at most 2 times outcome '6', i.e. P(y ≤ 2)?
y ~ B(10, 1/6)
On TI-84: go to Binomcdf and enter (10, 1/6, 2) = 0.7752
Manually: P(y = k) = C(n, k) · π^k · (1 − π)^(n−k)
1. P(y = 0) = C(10, 0) · (1/6)^0 · (5/6)^10 = 0.1615
2. P(y = 1) = C(10, 1) · (1/6)^1 · (5/6)^9 = 0.3230
3. P(y = 2) = C(10, 2) · (1/6)^2 · (5/6)^8 = 0.2907
Now simply add all three probabilities together: 0.1615 + 0.3230 + 0.2907 = 0.7752

Expected value and variance of the binomial distribution
Suppose 100 pigs get a vaccine against a certain disease. Let π be the probability of sufficient protection. Extensive research has shown that this probability is 0.80.
Then the number of sufficiently protected pigs y follows a binomial distribution: y ~ Bin(100, 0.80)
What would be the expected value µy? 100 × 0.80 = 80
Could it occur that, just by chance, 72 out of the 100 are sufficiently protected? From the graph of the distribution we can see that yes: there is variability, so we expect the outcome to be 80, but it could be a higher or lower value. It depends on chance.
If the number of successes y is distributed as y ~ Bin(n, π), the expected value and variance of y are:
 µy = nπ      σ²y = nπ(1 − π)      σy = √(nπ(1 − π))
For the sampling proportion of successes π̂ = y/n:
 µπ̂ = π      σ²π̂ = π(1 − π)/n      σπ̂ = √(π(1 − π)/n)
Please note: π̂ = y/n is a consistent estimator for π: the larger the sample size n, the closer π̂ = y/n gets to the true π.

Example:
Suppose the number of sufficiently protected pigs y is binomially distributed: y ~ Bin(75, 0.85)
a. Calculate the expected values for the number of successes (success: pig is sufficiently protected) and for the sampling proportion of successes.
b. Calculate the variances for the number of successes and for the sampling proportion of successes.
a. µy = nπ = (75)(0.85) = 63.75      µπ̂ = π = 0.85
b. σ²y = nπ(1 − π) = (75)(0.85)(0.15) = 9.56      σ²π̂ = π(1 − π)/n = (0.85)(0.15)/75 = 0.0017
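A quick numerical check of these values (a sketch, assuming scipy is available):

from scipy.stats import binom

n, p = 75, 0.85
mean_y, var_y = binom.stats(n, p, moments='mv')  # mean and variance of the count y
print(mean_y, var_y)                             # 63.75, 9.5625
print(p, p * (1 - p) / n)                        # mean and variance of the sampling proportion y/n: 0.85, 0.0017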

Statistical test: hypothesis


Suppose that farmers are in doubt about the protection rate of a certain vaccine. The supplier of the
vaccine states that 80% of the pigs are sufficiently protected against a certain disease. Farmers ask an
independent organisation to investigate their concerns.
Research question: Null and alternative hypothesis in words:
“Can it be proven that the ● Null hypothesis (H0): “the protection rate is equal to 80%”.
protection rate is significantly ● Alternative hypothesis (Ha): “the protection rate is lower than 80%
below 80%?”

With a statistical test, we can decide whether there is enough evidence (by the data) against the null
hypothesis.
● Null hypothesis (H0): statement to be tested, usually formulated as "no effect" or "no difference".
● Alternative hypothesis (Ha): statement that you are hoping or suspecting to be true instead of H0.
Statistical notation:
 π = the probability that a pig is sufficiently protected.
 H0: π = 0.8 or π ≥ 0.8 (null hypothesis)
 Ha: π < 0.8 (alternative hypothesis) → "Vaccine does not protect enough against a certain disease"

Lecture 5: Exact binomial test for a population proportion/probability π
Hypothesis Testing Steps: *Given (or chosen) level of significance α (e.g., 0.05)
a. Give H0 and Ha.                                  Example: H0: π = 0.5 and Ha: π > 0.5
b. Describe the test statistic.                     Example: number of mites that chose the odour
c. Give the distribution of the test statistic,    Example: y ~ Bin(100, 0.5)
   assuming H0 is true.
d. Sketch the distribution of the test statistic under H0.
e. Outcome of the test statistic.                   Example: y = 60
f. P-value of the test statistic and anything more extreme.   Example: P(y ≥ 60) = 0.0284
g. Conclusion: 1 – compare the P-value to α; 2 – decide whether to reject H0; 3 – formulate the conclusion in words.
   Example: P-value ≤ α = 0.05 → Reject H0, Ha has been shown. In words: it has been shown (with α = 0.05) that the probability of a predatory mite choosing the odour is larger than 0.5.

Let’s begin with an example:


Research on fighting spider mites with biological pest control:
 Plant attacked by spider mites → odour → attracts predatory mites
Question: which odours (odorous substances) are suitable to attract predatory mites?
Aim: produce such odours to fight spider mites with biological pest control.
Y-tube olfactometer: closed Y-tube where 2 different air samples are placed at the ends of the two branches. The predatory mite must choose. If an insect does not move, it is not counted in this experiment. If we repeat this test with many predatory mites, we will be able to see whether they prefer one or the other odour. Example:
●Clean air (control) ●Odour of a plant
Points of attention during the design of this research: We really want to end with a strong claim at the end
of this experiment and for that we must control as many other factors as possible. Ex: Alternate the
position of the odour to eliminate the possibility that there is another factor for why the insects are
choosing that side. For this, we will use randomization: assign at random whether the odour will be placed
at the left or right branch of the olfactometer. Equally we must only use each predatory mite one time
(independent observations)

Application of the binomial distribution:


 n = number of predatory mites in the experiment (sample size).
 y = number of predatory mites choosing the odour (instead of the control sample)
 π = probability that the predatory mite chooses the odour
 y ~ Bin(n, π)
Expected number of predatory mites choosing the odour, assuming y ~ Bin(n, π) : 𝜇y = nπ
 If n =100, y ~ Bin(100, 0.5) 𝜇y = 100*0.5 = 50
We don't know what proportion of the population of predatory mites would choose the odour. However, in our experiment we are contrasting 2 different air samples, so every mite will choose one or the other. If the mites have no preference, the probability of choosing the odour is 50/50, like a coin toss. Therefore we set π = 0.5, basing the null hypothesis on the predatory mites having no preference.

Binomial test: hypothesis
With a statistical test we can decide whether there is enough evidence (by the data) against the null
hypothesis
 Null hypothesis (H0): statement to be tested, usually formulated as “no effect” or “no difference”.
 Alternative hypothesis (Ha): statement that you are hoping or suspecting to be true instead of H0
Statistical notation: ● Note: Ha shows that this is a one-sided hypothesis test
 π = the probability that a predatory mite chooses the odour
 H0: π = 0.5 or π ≤ 0.5 (null hypothesis)
 Ha: π > 0.5 (alternative hypothesis) → "predatory mites prefer the odour"
How can we show that predatory mites prefer the odour?
 Count how many predatory mites choose the odour!
 Test Statistic = y = number of predatory mites in the sample choosing for the odour

Binomial Test: Test Statistic (T.S)


Assuming n = 100 predatory mites:
- y = 40 enough to “show” π> 0.5 (Ha) → NO - y = 60 enough to “show” π> 0.5 (Ha) → ???
- y = 50 enough to “show” π> 0.5 (Ha) → NO - y = 100 enough to “show” π> 0.5 (Ha) → YES
This is just an eyeball of whether there is enough evidence to show if the mites prefer the odour or not
Binomial test: P-value (I)
Assume n = 100 predatory mites
→ y ~ Bin(100, 0.5)
The reason the scale in the sketch does not run from 1 to 100 but rather from about 30 to 70 is that the probability outside that range is extremely low.

Outcome of the test statistic:


y = 60 mites choose the odour

P-Value:
P(y = 60) = 0.0108 & P(y > 60) = 0.0176
P(y ≥ 60) = P(y = 60) + P(y > 60) = 0.0284
In graphing calculator:
(1 − Binomcdf(100, 0.5, 60)) + Binompdf(100, 0.5, 60)
= (1 − 0.9824) + 0.0108 = 0.0176 + 0.0108 = 0.0284
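The same right-sided P-value in Python (a sketch using scipy's survival function, which gives P(y > k)):

from scipy.stats import binom

n, p0, y_obs = 100, 0.5, 60
p_value = binom.sf(y_obs - 1, n, p0)  # P(y >= 60) = P(y > 59)
print(round(p_value, 4))              # 0.0284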

Statistical test: P-value (II)
- The probability P(y ≥ 60) is called a right-sided P-value.
- The P-value is the probability of the test statistic obtaining the observed value or anything more extreme, assuming that the null hypothesis is true!!!
- The smaller the P-value, the stronger the evidence, provided by the data, against the null hypothesis H0.

Level of significance α
How do we make a decision about the P-value we find?
Before we do the study, we decide on a particular probability that we are going to use as a cutoff. This is the level of significance (α). If the P-value that we find is smaller than α, then we decide not to believe H0 (we reject H0). If the P-value is larger than or equal to α, we do not reject H0 and Ha has not been shown, as we lack the evidence to reject H0.
 Choose a level of significance α before the analysis, e.g., α = 0.05
Using a level of significance α means we accept a probability of at most α to reject H0 when, in reality, H0 is true.
 Often used: α = 0.05
However, the choice is often determined by the field of research. Medicine: small(er) values of α (incorrect conclusions can have large consequences!), e.g., α = 0.01 or less.

Statistical test: conclusion using level of significance α


Assume n = 100 and y = 60 , under H0: P-value = 0.02844
 P-value ≤ α= 0.05 → Reject H0, Ha has been shown*.
In words: It has been shown (with α= 0.05) that the probability of a predatory mite choosing the odour is
larger than 0.5.
*Here the alternative hypothesis has not been proven, but we have shown that the predatory mites have a
preference for the odour, we just are not sure how large that preference is.

Let’s now assume that n = 100 and y = 55 , under H 0: P-value = 0.1841


 P-value > α= 0.05 → Do not reject H0, Ha has not been shown.
In words: It has not been shown (with α = 0.05) that the probability of a predatory mite choosing the
odour is larger than 0.5.

Drawing a conclusion based on a predetermined level of significance

P-value ≤ α → reject H0; Ha has been shown. It has been shown (when α = ...) that ... (in words of Ha).
P-value > α → do not reject H0; Ha has not been shown. It has not been shown that ... (in words of Ha).
!Pay attention:
 For the conclusion in non-statistical terms use the words of Ha, not of H0.
 Not rejecting H0 does not imply that H0 has been shown.

Statistical test: Conclusion


Drawing conclusions about the hypotheses: Examples
Assume n = 100, y = 70 , under the assumption of H0: P-value = P(y ≥ 70) =0.00004
 There is a chance that the experiment results in y = 70 or more assuming y ~ Bin(100, 0.5), but this
probability is very small → significant! (Shown that the odour works!)

Assume n = 100, y = 60 , under the assumption of H0: P-value = 0.02844


 There is a chance that the experiment results in y = 60 or more assuming y ~ Bin(100, 0.5), but this
probability is still small → still significant!

Assume n = 100 and y = 55, under the assumption of H0: P-value = 0.18410


There is a chance that the experiment results in y = 55 or more assuming y ~ Bin(100, 0.5), and this
probability is quite large → not significant! (not shown that the odour works!)

Example:
Assume the sample size n is equal to 8 in an experiment with predatory mites. Six out of eight predatory
mites choose the odour. Perform a statistical test with α = 0.05 whether the odour attracts more mites than
the clean air sample.

a. H0: π = 0.5    Ha: π > 0.5
b. Test statistic: number of mites that choose the odour
c. y ~ Bin(8, 0.5)
d. (sketch of the distribution of the test statistic under H0)
e. y = 6
f. P-value: P(y ≥ 6) for y ~ B(8, 0.5) = (1 − Binomcdf(8, 0.5, 6)) + Binompdf(8, 0.5, 6) = (1 − 0.9648) + 0.1094 = 0.1445
g. P(y ≥ 6) = 0.1445 > α = 0.05. This means that we do not reject H0 and Ha has not been shown. It has not been shown (with α = 0.05) that predatory mites prefer the odour.
Calculate the P-value and draw the conclusion when 7 out of 8 mites choose the odour (e. y = 7):
f. P-value: P(y ≥ 7) for y ~ B(8, 0.5) = (1 − Binomcdf(8, 0.5, 7)) + Binompdf(8, 0.5, 7) = 0.0039 + 0.0313 = 0.0352
g. P(y ≥ 7) = 0.0352 < α = 0.05. This means we do reject H0, and Ha has been shown. It has been shown (with α = 0.05) that predatory mites prefer the odour.
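Both one-sided tests can be reproduced in Python (a sketch; scipy.stats.binomtest is assumed to be available in your scipy version):

from scipy.stats import binomtest

# H0: π = 0.5 versus Ha: π > 0.5
print(binomtest(6, n=8, p=0.5, alternative='greater').pvalue)  # 0.1445 -> do not reject H0
print(binomtest(7, n=8, p=0.5, alternative='greater').pvalue)  # 0.0352 -> reject H0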

Example:
Problem:
One week before the election date an agency investigates the voter turnout for the municipal elections.
This research was commissioned by the local newspaper and conducted in the Northern Dutch provinces
of Groningen, Friesland and Drenthe based on a random sample of 500 eligible voters.
 In the last municipal elections, the voter turnout was 59% in these provinces. It is assumed that the
attention of the general public to the municipal elections is diminishing rapidly and therefore the voter
turnout is decreasing.
 Formulate the research question in the form of a statistical test. Provide the null hypothesis (including
a self-defined parameter) as well as the alternative hypothesis and the test statistic.
Answer:
n = 500 Previous election turnout = 59%
 “Suspected that less people vote” therefore research question of interest: will fewer than 59% of
the population vote?
 If nothing has changed, our hypothesis is that in the sample 59% will vote!
 Test Statistic: the number of people in the sample who will vote
 H0: π = 0.59 Ha: π < 0.59

Alternative hypothesis
In a statistical test for a probability π with H0: π = π0, there are three possible alternative hypotheses:
 Ha: π > π0 → Right-sided P-value: P(T.S. ≥ outcome)
 Ha: π < π0 → Left-sided P-value: P(T.S. ≤ outcome)
 Ha: π ≠ π0 → Two-sided P-value
For a two-sided alternative hypothesis Ha: π ≠ π0, both large as well as small outcomes of the T.S. provide evidence against the null hypothesis H0 and in favour of Ha. We therefore need a two-sided P-value.
The P-value is the probability of the test statistic obtaining the observed value or anything more extreme (supporting Ha), assuming that the null hypothesis is true. In this case an extreme can be situated on both sides (tails) of the distribution!
Examples:
 Ha: π > 0.59 → Right-sided P-value: P(T.S. ≥ outcome) (if voter turnout had increased)
 Ha: π < 0.59 → Left-sided P-value: P(T.S. ≤ outcome) (if voter turnout had decreased)
 Ha: π ≠ 0.59 → Two-sided P-value (if we don't know whether to look for an increase or a decrease)
Suppose the distribution of the T.S. y when the null hypothesis is true equals: y ~ Bin(n = 9, π0 = 0.5)
 Ha: π ≠ π0 → Two-sided P-value
 The expected value is µy = nπ0 = 4.5 but, in practice, we will focus on either 4 or 5 mites choosing the odour. This means we expect either 4 or 5 mites to choose the odour (π0 = 0.5).
 If 0 or 9 mites choose the odour, then we have evidence to support the alternative hypothesis. So both sides support Ha, making this a two-sided (2-tailed) P-value.

General rules (symmetric distributions):
 Two-sided P-value = 2* one-sided P-value.
 The one-sided P-value is always the smallest P-value:
 If the outcome of the test statistic y0 > μy = nπ0, the two-sided P-value is 2*P(y ≥ y0)
 If the outcome of the test statistic y0 < μy = nπ0 , the two-sided P-value is 2*P(y ≤ y0).
Please note: We assume (in this course) in case of π ≠ 0.5 (for example π = 0.3) that the two-sided P-value
is equal to 2* one-sided P-value.
Example:
Assume the sample size n is now equal to 7 in an experiment with predatory mites with two different
odours. We have no prior idea whether the mites have a preference for one or the other odour. Six out of
seven predatory mites choose the odour no.1. Perform a statistical test with α = 0.05 to investigate
whether there is a difference in attraction between the two odours.
Parameter: π = probability that a predatory mite chooses odour 1
a. H0: π = 0.5    Ha: π ≠ 0.5
b. y = number of predatory mites choosing odour source no. 1
c. Under H0: y ~ Bin(n = 7, π = 0.5)
d. P(y ≤ 5) = 0.9375, P(y = 6) = 0.0547, P(y ≥ 7) = 0.0078, P(y ≥ 6) = 0.0625
e. y = 6 (> nπ0 = 3.5)
f. P-value: 2·P(y ≥ 6) for y ~ B(7, 0.5) = 0.1250
g. Conclusion: 0.1250 > α = 0.05, therefore do not reject H0, and Ha has not been shown. It has not been shown that predatory mites prefer one or the other odour.
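The two-sided P-value can be computed the same way (a sketch; note the course convention of doubling the one-sided P-value):

from scipy.stats import binom

n, p0, y_obs = 7, 0.5, 6
one_sided = binom.sf(y_obs - 1, n, p0)  # P(y >= 6) = 0.0625
print(one_sided, 2 * one_sided)         # two-sided P-value = 0.125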

Lecture 6: Normal Distributions: for continuous variables (which take all values in intervals of real numbers). The probability distribution therefore cannot be described by individual outcomes yi with associated probabilities pi.
Histogram: used to display observations. Uses bins to organize the observations. A small number of observations forces larger bin sizes. As you add more observations, the bins can be made narrower and the histogram approaches a smooth curve. This curve is called a probability density function (a visualization of the probability distribution for continuous random variables).
Continuous random variable: a probability is a relative frequency in the long run, therefore the probability of A, P(A), is given as the area under the curve in interval A. The total area under the curve is equal to 1.
To compare with discrete random variables, let's say y = number of blue-eyed persons in a randomly selected group of 10. If we are looking for the probability that at most 5 of the 10 people have blue eyes, then P(y ≤ 5) ≠ P(y < 5), so you must include the probability that exactly 5 people have blue eyes in the sum!
For a continuous random variable where y = birth weight, and we are looking at which babies weigh less than 3150 g: P(y ≤ 3150) = P(y < 3150); both reflect the same area under the curve. This is because it is impossible to associate a specific probability with a specific outcome: P(y = 3150) = 0.
Normal Distribution: Symmetrical, Unimodal (one mode = one peak) and Bell-shaped
Notation y ~ N(µ, σ) where µ = population mean and σ = population standard deviation (Greek letters are used to
represent the population) ( ~ means “is distributed as”)
For example, if we are looking at the distribution of height within a population:
It is relatively rare to see someone who is very short, so the bell-shaped curve is relatively low in that area. The same holds for very tall people. As we near the average height, the curve rises, as it is quite common to see someone with an average height.
The shape of the curve is determined by the population mean and the population standard deviation.

So, to draw a normal distribution, you need to know:
1. The average measurement; this tells you where the centre of the curve goes.
2. The standard deviation of the measurements; this tells you how wide the curve should be. The width of the curve determines how tall the peak is: the wider the curve, the shorter the peak; the narrower the curve, the taller the peak.

Standard normal distribution (also known as the Z distribution)

Notation: Z ~ N(0, 1)
(Table 1 inside O&L)
You can use this Z distribution to find probabilities related to areas under the curve of a normal distribution using Table 1. Because the total area under the curve is always equal to 1, the shape of the particular normal curve does not matter: once you convert to a z-value, the area up to that z-value is the same for every normal distribution.
The table (not shown in full here) gives the area under the curve up to a particular value of z (the area for anything less than or equal to that z value, shaded grey). The shaded area is the probability that Z is smaller than or equal to a particular z value: P(Z ≤ z).
In the table, the top row and first column correspond to z values. All the numbers in the middle correspond to areas. So, for example, the z value −3.43 has an area of 0.0003 (shown shaded blue in the table).
Equally, you can work backwards: if you are given an area, you can find the corresponding z value. This does not always give the most accurate z value, because near the outer edges of the standard normal distribution many z values share the same low probability (e.g. multiple z values have an area of 0.0001).
Transformation: y ~ N(µ, σ) → Z ~ N(0, 1)
Every normal distribution can be transformed into a Z distribution using the formula y = µ + z·σ, which rearranged gives z = (y − µ)/σ, the "z-score of y".
This is a probability density for a continuous random variable y, and we know it has a particular mean in the middle (µ) and standard deviation (σ). With this transformation we are essentially creating a new z axis, so we are transforming y into z.
Ex: For height in the population, we have a mean of 175 cm and a standard deviation of 7.5 cm. What is the probability that the height of an individual is smaller than or equal to 165 cm?
y ~ N(175, 7.5)
P(y ≤ 165) = P(175 + Z·7.5 ≤ 165)
We can solve for Z using the equation z = (y − µ)/σ:
z = (165 − 175)/7.5 = −1.33. Now look in the table; the answer is 0.0918.
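The same probability without the table (a sketch, assuming scipy is available):

from scipy.stats import norm

mu, sigma = 175, 7.5
print(norm.cdf(165, loc=mu, scale=sigma))  # P(y <= 165) ≈ 0.091 (using the exact z = -1.333...)
z = round((165 - mu) / sigma, 2)           # -1.33, the rounded z-score used with the table
print(norm.cdf(z))                         # ≈ 0.0918, the table value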

Problem exercise that I struggled with:
Given: µ = 250, σ = 50, and the grey (middle) area in sketch A equals P = 0.60. Sketch A is the complement of sketch B (the two tails); sketch C shows the left tail P(y < 250 − k) only.
If we know that the grey area in the first sketch is 0.60, we know that the second sketch has a grey area of 1 − 0.60 = 0.40.
Since the curve is symmetrical, we also know that the grey area in C is half of B, and thus 0.20.
That means we can solve 0.20 = P(y < 250 − k).
Now we look in our table for an area of 0.20 and find the corresponding z, which is −0.84.
Now we can solve for k using the formula z = (y − µ)/σ:
−0.84 = (250 − k − 250)/50 = −k/50, so k = 42.
Example:
Given: n = 35 (number of units in the sample), mean = 294 grams/mile, s.d. = 26 grams/mile
Thus, y ~ N(294, 26)
N(µ, σ): µ and σ are unknown, but we have a sample mean and sample s.d. that we use as estimators for µ and σ. The formulas for the mean and s.d. are called estimators.
Calculate the probability that a randomly selected car has a CO2
emission larger than 320 grams/mile
P(y > 320) = P(Z > (320 − 294)/26) = P(Z > 1.00) = 0.1587
(Remember that the table gives the area up to y = 320, so from the table we get 0.8413; since the whole area under Z is 1, we use the complement rule: 1 − 0.8413 = 0.1587.)
Calculate the probability that a randomly selected car has a CO2
emission between 270 and 320 grams/mile
P(270 < y < 320) = P ((270 – 294)/26 < Z < (320 – 294)/26)
= P (-0.92 < Z < 1.00) = P(Z < 1.00) – P(Z < -0.92)
= 0.8413 – 0.1788
= 0.6625

Calculate the 10th percentile of this distribution: P(y < k) = 0.10  P (Z < (k -294)/26) =0.10
(k – 294)/26 = -1.28 k = 260.7
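All three answers can be reproduced with scipy (a sketch):

from scipy.stats import norm

mu, sigma = 294, 26
print(norm.sf(320, mu, sigma))                              # P(y > 320) ≈ 0.1587
print(norm.cdf(320, mu, sigma) - norm.cdf(270, mu, sigma))  # P(270 < y < 320) ≈ 0.66
print(norm.ppf(0.10, mu, sigma))                            # 10th percentile ≈ 260.7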

Calculation of a desired expected value:
Ex: Using the example from above, but now we want to find a mean such that only 5% of cars exceed 300 grams/mile. σ = 26 grams/mile, µ = ??? Notation: y ~ N(µ, 26), P(y > 300) = 0.05
P(y > 300) = P(Z > (300 − µ)/σ) = 0.05
P(Z < (300 − µ)/σ) = 0.95
(300 − µ)/σ = 1.645
µ = 300 − 1.645 × 26 = 257.2
*REMEMBER THAT THE Z TABLE GIVES AREAS FROM THE LEFT! THIS MEANS THAT WHEN YOU ARE TRYING TO FIND AN AREA GREATER THAN (>), YOU HAVE TO CHANGE IT TO LESS THAN (<).

Lecture 7: Laws of calculation and the probability distributions of a sample mean and a sum

Overview of the laws:

            Mean µ                        Standard deviation σ             Variance σ²
Law 1       µ(a+by) = a + b·µy            σ(a+by) = |b|·σy                 σ²(a+by) = b²·σ²y
Law 2       µ(x+y) = µx + µy              σ(x±y) = √(σ²x + σ²y)            σ²(x±y) = σ²x + σ²y
Combined    µ(ax+by+c) = a·µx + b·µy + c  σ(ax+by+c) = √(a²σ²x + b²σ²y)    σ²(ax+by+c) = a²σ²x + b²σ²y
Σy          µ(Σy) = n·µy                  σ(Σy) = √n·σy                    σ²(Σy) = n·σ²y
ȳ           µ(ȳ) = n·µy/n = µy            σ(ȳ) = σy/√n                     σ²(ȳ) = σ²y/n

Let’s start with an example:


Assume that we observed that within loaves of bread there is a salt concentration that has an expected
value of 1.8 and standard deviation of 0.1g/100g. Now what would be the expected value of salt in a 35g
slice of bread?
Well, we know that in 100g there is an expected value of 1.8g so we take 35% of that which would be
(35/100)*1.8 = 0.63g. We can do the same calculation for the standard deviation (35/100)*0.1 = 0.035g
So, that means our intuitive expectation is 0.63g and our intuitive s.d. is 0.035g

Next, let's assume that someone can add exactly 0.2 g of salt (per 100 g) to a loaf of bread. What would be the expected value and s.d. of the salt concentration (per 100 g)?
For the expected value, we can simply add the 0.2 g: 1.8 + 0.2 = 2 g.
*The standard deviation would not change, however, because the same amount of salt would be added to every loaf of bread.

Rules of calculation (laws)

Law 1 (calculating expected values µ): If y is a random variable and a and b are constants, then µ(a+by) = a + b·µy.
*Note that an expectation is (a special case of) a mean.
Example: If µy = 7 and x = 2 + 3y, then µx = µ(2+3y) = 2 + 3µy = 2 + 3·7 = 23.

Law 2 (calculating expected values µ): If x and y are random variables, then µ(x+y) = µx + µy.
Example: If µx = 7, µy = 23 and z = x + y, then µz = µ(x+y) = µx + µy = 7 + 23 = 30.

Law 1 (calculating variances σ²): If y is a random variable and a and b are constants, then σ²(a+by) = b²·σ²y and therefore σ(a+by) = |b|·σy.
Example: If σ²y = 4 and x = 2 + 3y, then σ²x = σ²(2+3y) = 3²·σ²y = 9·4 = 36.

Law 2 (calculating variances σ²): If x and y are independent random variables, then σ²(x+y) = σ²x + σ²y and σ²(x−y) = σ²x + σ²y.
**The − becomes a + in the second formula because in variance calculations everything is squared.
Example: If σ²x = 4, σ²y = 5 and z = x − y, then σ²z = σ²(x−y) = σ²x + σ²y = 4 + 5 = 9.
*If you are interested in the s.d., then σz = 3.

Laws 1 & 2 combined:

For calculating expected values µ: if x and y are random variables and a, b and c are constants, then µ(ax+by+c) = a·µx + b·µy + c.
Example: µx = 4, µy = 5 and z = 2x + 3y + 1 → µz = µ(2x+3y+1) = 2µx + 3µy + 1 = 2·4 + 3·5 + 1 = 24
For calculating variances σ²: if x and y are independent random variables, and a, b and c are constants, then σ²(ax+by+c) = a²·σ²x + b²·σ²y.
Example: σ²x = 4, σ²y = 5 and z = 2x + 3y + 1 → σ²z = σ²(2x+3y+1) = 2²·σ²x + 3²·σ²y = 4·4 + 9·5 = 61
**Every linear combination of independent normally distributed random variables is also normally distributed!**
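These two rules can be checked with a small simulation (a sketch; the parameter values are the ones from the examples above, and x and y are taken to be normal and independent):

import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.normal(4, np.sqrt(4), n)  # µx = 4, σ²x = 4
y = rng.normal(5, np.sqrt(5), n)  # µy = 5, σ²y = 5
z = 2 * x + 3 * y + 1
print(z.mean(), z.var())          # ≈ 24 and ≈ 61, matching the formulas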

Example: We use a combination when we want to add different normal distributions, so we need a new σ and µ.
Problem: Assume we are interested in the distribution of the amount of salt in a slice of bread (35 g) with a slice of smoked meat (10 g).
Given:
Bread: x = salt concentration, x ~ N(1.8, 0.1) [g/100 g]
Smoked meat: y = salt concentration, y ~ N(3.6, 0.5) [g/100 g]
Solution: Calculate the distribution of z = 0.35·x + 0.10·y
To create the new normal distribution, you need to know your new expected value and your new standard deviation.
Calculation of the expected value and the standard deviation of z:
µz = 0.35µx + 0.10µy = 0.35·1.8 + 0.10·3.6 = 0.99
σz = √(σ²(0.35x) + σ²(0.10y)) = √(0.35²·0.1² + 0.10²·0.5²) = 0.061
Now we know the distribution of the amount of salt in a randomly selected slice of bread with a randomly selected slice of smoked meat: z ~ N(µz, σz) → N(0.99, 0.061)
Example 2:
Problem: Assume we are interested in the distribution of the amount of salt in a pita bread with 200 g of spiced shawarma.
Given:
Pita: x = salt concentration, x ~ N(0.6, 0.2) [g/pita]
Shawarma meat: y = salt concentration, y ~ N(0.5, 0.1) [g/100 g]
Solution: Calculate the distribution of z = x + 2y
To create the new normal distribution, you need to know your new expected value and your new standard deviation.
Calculation of the expected value and the standard deviation of z:
x ~ N(µx = 0.6, σx = 0.2) and y ~ N(µy = 0.5, σy = 0.1)
µz = µx + 2µy = 0.6 + 2·0.5 = 1.6
σ²z = σ²(x+2y) = σ²x + σ²(2y) = σ²x + 2²·σ²y = 0.2² + 4·0.1² = 0.08
σz = √0.08 = 0.283
Now we know the distribution of the amount of salt in a randomly selected pita with a randomly selected 200 g of shawarma meat: z ~ N(µz, σz) → N(1.6, 0.283)

µ(Σy) and σ(Σy) for independent drawings from a single distribution: used if you want to combine the observations of a sample into one sum (all from the same population).
Take a simple random sample of size n from a population in which a random variable y has expected value (population mean) µy and standard deviation σy: observations y1, y2, …, yn. Applying the laws for expected values and variances to the sum of these observations, Σy = y1 + y2 + … + yn, gives:
For µ: µ(Σy) = µ(y1) + µ(y2) + … + µ(yn) = n·µy
For σ: σ²(Σy) = σ²(y1) + σ²(y2) + … + σ²(yn) = n·σ²y, so σ(Σy) = √n·σy
In general:
y ~ N(µy, σy); random sample of size n from this normal distribution: y1, y2, …, yn
Σy = y1 + y2 + … + yn
Distribution of the sum: Σy ~ N(n·µy, √n·σy)

Example:
Problem:
The Dutch Bakery Association is the independent quality and information institution for the baking industry. Assume that they would like to check the salt concentration in bread on a certain day. They therefore draw a simple random sample of loaves of bread and check these loaves for their salt concentration.
The maximum limit for salt in bread is 2.1 g/100 g.
Assume the NBC advises to bake bread with a (mean) salt concentration of 1.7 g/100 g. The NBC would like to check whether their advice is followed.
The NBC randomly selects 2 loaves of bread and determines the mean salt concentration of these 2 loaves. Is that accurate? And what do you think if 10 loaves of bread were randomly chosen? 10 randomly chosen loaves of bread would be better than 2.
The following graph shows two simulations (1000×) of the mean salt concentration of 2 and of 10 loaves. Both are normally distributed, but the means based on 2 loaves are less accurate (show a wider spread) than the means based on 10 loaves. More observations per sample lead to a narrower distribution of the sample mean.
Variability of ȳ can be described by the spread of outcomes of the simulation. Larger sample sizes show a smaller spread in ȳ.

µ(ȳ) and σ(ȳ) of a sample mean
Take a simple random sample of size n from a population in which random variable y has an expected value (population mean) µy and a standard deviation σy: observations y1, y2, …, yn.
Aim: estimate µy. Estimator (formula): sample mean ȳ = (y1 + y2 + … + yn)/n
µ(ȳ) = n·µy/n = µy
σ²(ȳ) = n·σ²y/n² = σ²y/n, so σ(ȳ) = σy/√n
ȳ is a consistent estimator for µy, because the larger the sample, the closer ȳ tends to the unknown true value of µy.
Implementation: y1, y2, …, yn → the outcome of ȳ is an estimate of µy.
In general:
 y ~ N(µy, σy)
 Random sample of size n from this normal distribution: y1, y2, …, yn
 ȳ = (1/n)·Σy
 Distribution of the sample mean: ȳ ~ N(µy, σy/√n)

Example:
Let the salt concentration in bread (y, gram/100 gram) be distributed as y ~ N(1.75, 0.1).
Calculate the probability that the mean salt concentration in a simple random sample of 4 loaves of bread is higher than 1.7 g/100 g. Repeat the same calculation for a simple random sample of size 16.

For 4 loaves:
P(ȳ > 1.7) = ???   y = salt concentration in bread, µy = 1.75, σy = 0.1, n = 4
µ(ȳ) = µy = 1.75, σ(ȳ) = 0.1/√4 = 0.05, so ȳ ~ N(1.75, 0.05)
P(ȳ > 1.7) = 1 − P(ȳ < 1.7) = 1 − P(z < (1.7 − 1.75)/0.05) = 1 − P(z < −1) = 0.8413

For 16 loaves:
P(ȳ > 1.7) = ???   y = salt concentration in bread, µy = 1.75, σy = 0.1, n = 16
µ(ȳ) = µy = 1.75, σ(ȳ) = 0.1/√16 = 0.025, so ȳ ~ N(1.75, 0.025)
P(ȳ > 1.7) = 1 − P(ȳ < 1.7) = 1 − P(z < (1.7 − 1.75)/0.025) = 1 − P(z < −2) = 0.9772
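The same two probabilities in Python (a sketch, assuming scipy):

from scipy.stats import norm

mu, sigma = 1.75, 0.1
for n in (4, 16):
    se = sigma / n ** 0.5            # standard deviation of the sample mean
    print(n, norm.sf(1.7, mu, se))   # P(ȳ > 1.7): 0.8413 for n = 4, 0.9772 for n = 16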
