MATH1700 Probability and Statistics

Amanda Turner

University of Leeds, 2024–25


Contents

Overview
    Weekly schedule
    About these notes

Problem Sheet 1
    A: Short questions
    B: Long questions
    Solutions to short questions

Lecture notes
1 Sample spaces and events
    1.1 What is probability?
    1.2 Sample spaces and events
    1.3 Set theory
    Summary

2 The rules of probability
    2.1 Probability axioms
    2.2 Properties of probability
    2.3 Addition rules for unions
    Summary

3 Classical probability I
    3.1 Probability with equally likely outcomes
    3.2 Multiplication principle
    3.3 Sampling with and without replacement
    Summary

4 Classical probability II
    4.1 Ordering
    4.2 Sampling without replacement in any order
    4.3 Birthday problem
    Summary

Problem Sheet 2
    A: Short questions
    B: Long questions
    Solutions to short questions

R Worksheets
Introduction to R
    What are R and RStudio?
    How to use R and RStudio

R Worksheet 1: R Basics
    Using R as a calculator
    Functions in R
    Objects in R
    Saving your work

R Worksheet 1: R Basics (Solutions)

Solutions
Solutions and group feedback
Overview

This lecture-note pack will be added to during the module, to eventually include
all the lecture notes, exercise sheets, practical tasks, and solutions that make
up semester 1 of this course.

Weekly schedule
For now, the course materials for weeks 1 and 2 are provided. The schedule for
week 1 is given below.
Week 1 (30 September – 4 October):
• Problem Sheet 1: Work through in preparation for your tutorial in
Week 1.
• Lecture 1: Sample spaces and events (Monday 30 September)
• Lecture 2: The rules of probability (Wednesday 2 October)
• Tutorial: to discuss Problem Sheet 1; check your timetable for details.

About these notes


These notes have evolved over many years of teaching at the University of Leeds,
and have been written and contributed to by several different people. The
current version is due to Amanda Turner. However, this has been closely based
on notes written by Matthew Aldridge who, in turn, adapted notes by Robert
Aykroyd and Wally Gilks. Jason Anquandah and Robert Aykroyd advised on
the R worksheets. Generative AI (ChatGPT and Microsoft Copilot) was used
in the development of some of the problem sheets.
These notes (in the web format) should be accessible by screenreaders. If you
have accessibility difficulties with these notes, contact me.

Problem Sheet 1

This is Problem Sheet 1, which covers revision of A-level probability. You should
work through the questions on this problem sheet in advance of your tutorial
in Week 1. I’m aware that some students have qualifications other than A-level
mathematics. If any of the material on this sheet is unfamiliar, don’t worry too
much at this stage as we will recap everything as we go along during the course.

A: Short questions
The first seven questions are short questions, which are mostly intended to
be straightforward. You can check your answers with the solutions-without-
working at the bottom of this sheet. It is good practice to write up complete
solutions showing all your working and solutions-with-working will be available
in Week 2. If you get stuck on any of these questions, you should ask for
guidance in your tutorial.
A1. Suppose you toss a fair coin twice. Find the probability that:
(a) You get a head and then a tail.
(b) You get one head and one tail.
A2. A jar contains 7 red marbles, 5 blue marbles, and 8 green marbles. A
marble is drawn at random. Find the probability that:
(a) The marble is red.
(b) The marble is not green.
(c) The marble is either blue or green.
A3. If ℙ(𝐴) = 0.35, find the probability of the complement of event 𝐴, ℙ(𝐴𝑐 ).
A4. Two events 𝐴 and 𝐵 are mutually exclusive, and ℙ(𝐴) = 0.4, ℙ(𝐵) = 0.3.
What is the probability that either 𝐴 or 𝐵 occurs?
A5. A bag contains 5 red balls and 3 blue balls. Two balls are drawn at random.
Find the probability that both balls are red if:
(a) The balls are chosen without replacement.
(b) The balls are chosen with replacement.
A6. Two events 𝐴 and 𝐵 are independent, and ℙ(𝐴) = 0.5, ℙ(𝐵) = 0.4. Find:


(a) ℙ(𝐴 ∩ 𝐵)
(b) ℙ(𝐴 ∪ 𝐵)
A7. In a class of 30 students, 18 study mathematics, 12 study physics, and 6
study both subjects. Find:
(a) The probability that a randomly selected student studies mathematics or
physics.
(b) The probability that a student studies mathematics given that they study
physics.

B: Long questions
The next three questions are long questions, which are intended to be harder.
Long questions often require you to think originally for yourself, not just di-
rectly follow procedures from the notes. You may not be able to solve all of
these questions, although you should make multiple attempts to do so. Here,
your answers should be written in complete sentences, and you should carefully
explain in words each step of your working. Your answers to these questions –
not only their mathematical content, but also how to write good, clear solutions
– are likely to be the main topic for discussion in your tutorial. Solutions will
be available in Week 2.
B1. In a survey of 100 people, 40 liked tea, 35 liked coffee, and 15 liked both
tea and coffee. By drawing a Venn diagram, or otherwise, find the probability
that a person chosen at random:
(a) Likes only tea.
(b) Likes neither tea nor coffee.
(c) Likes tea given that they like coffee.
B2. A company produces 80% of its goods in Factory A and 20% in Factory B.
The probability of a defective item from Factory A is 0.05, and from Factory B,
it is 0.1.
(a) Draw a tree diagram representing this situation.
(b) Calculate the probability that a randomly selected item is defective.
B3. A magician has prepared two piles of cards taken from a standard deck of
cards. In the first pile he has put the 3, 6 and 7 of hearts; in the second pile
he has the 2 of spades and a second numbered spade, which we denote as 𝑥. A
member of the audience is invited to pick one card at random from each pile
and multiply the two numbers together.
(a) Draw a sample space diagram giving all possible products (in terms of 𝑥).
(b) Calculate the probability that the product is even in the two cases that (i)
𝑥 is odd and (ii) 𝑥 is even.
(c) The magician knows that the probability that the product is even is equal
to the probability that the product is greater than 10. Find the value of 𝑥.

Solutions to short questions


A1. (a) 0.25, (b) 0.5. A2. (a) 0.35, (b) 0.6, (c) 0.65. A3. 0.65. A4. 0.7. A5.
(a) 0.3571, (b) 0.3906. A6. (a) 0.2, (b) 0.7. A7. (a) 0.8, (b) 0.5.
Lecture notes

Chapter 1

Sample spaces and events

1.1 What is probability?


Probability theory is the study of randomness. Probability, as an area of math-
ematics, is a fascinating subject in its own right. However, probability is partic-
ularly important due to its usefulness in applications – especially in statistics
(the study of data), in finance, and in actuarial science (the study of insurance).
Probability is well suited to modelling situations that involve randomness, un-
certainty, or unpredictability. If you want to predict the time of the next solar
eclipse, a deterministic (that is, non-random) model based on physical laws will
tell you when the sun, the moon, and the earth will be in the correct positions;
but if you want to predict the weather tomorrow, or the price of a share of
Apple stock next month, or the results of an election next year, you will need
a probabilistic model that takes into account the uncertainty in the outcome.
A probabilistic model could tell you the most likely outcome, or a range of the
most probable outcomes.
So what do we mean when we talk about the “probability” of an event occurring?
You might say that the probability of an event is a measure of “how likely” it
is to occur, or what the “chance” of it occurring is.
More concretely, here are some interpretations of probability:
• Subjective (or Bayesian) probability: The probability of an event is
the way someone expresses their degree of belief that the event will occur,
based on their own judgement, and given the evidence they have seen.
Their belief is measured on a scale from 0 to 1, from probabilities near 0
meaning they believe the event is very unlikely to occur to probabilities
near 1 meaning they believe the event is very likely to occur.
– This interpretation is philosophically sound, but a bit vague to be
the basis for a mathematics module.
• Classical (or enumerative) probability: Suppose there are a finite
number of equally likely outcomes. Then the probability of an event is
the proportion of those outcomes that correspond to the event occurring.
So when we say that a randomly dealt card has a probability 1/13 of being
an ace, this is because there are 52 cards of which 4 are aces, so the
proportion of favourable outcomes is 4/52 = 1/13.
– This interpretation is good for simple procedures like flipping a fair
coin, rolling a dice, or dealing cards, where the “finite number of
equally likely outcomes” assumption holds. But we want to be able
to study more complicated situations, where some outcomes are more
likely than others, or where infinitely many different outcomes are
possible.
• Frequentist probability: In a repeated experiment, the probability of
an event is its long-run frequency. That is, if we repeat an experiment a
very large number of times, the probability of the event is (approximately)
the proportion of the experiments in which the event occurs. So when we
say a biased coin has probability 0.9 of landing heads, we mean that were
we to toss it 1000 times, we would expect to see very close to 0.9 × 1000 = 900
heads.
– There are two problems with this. First, this doesn’t deal with events
that can’t be repeated over and over again (like “What’s the proba-
bility that Trump wins the 2024 US election?”). Second, to answer
the question, “Yes, but how close to the probability should the pro-
portion of occurrences be?”, you end up having to answer, “Well, it
depends on the probability,” and you’ve got a circular definition.
• Mathematical probability: We have a function that assigns to each
event a number between 0 and 1, called its probability, and that function
has to obey certain mathematical rules, called “axioms”.
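
The frequentist interpretation above can be explored with a small simulation. The following is only a sketch in Python (the module's own worksheets use R): the function name `simulate_heads_proportion`, the fixed seed, and the chosen numbers of tosses are illustrative assumptions, not part of the course material. It tosses a simulated coin with probability 0.9 of heads and reports the proportion of heads, which should settle near 0.9 as the number of tosses grows.

```python
import random


def simulate_heads_proportion(p_heads, n_tosses, seed=0):
    """Toss a simulated coin n_tosses times and return the proportion of heads.

    p_heads is the probability of heads on each toss. The seed is fixed
    only so that repeated runs give the same result.
    """
    rng = random.Random(seed)
    # rng.random() is uniform on [0, 1), so it is below p_heads
    # with probability p_heads.
    heads = sum(1 for _ in range(n_tosses) if rng.random() < p_heads)
    return heads / n_tosses


# The proportion of heads for increasingly long runs of tosses.
for n in (10, 1000, 100000):
    print(n, simulate_heads_proportion(0.9, n))
```

Note that the simulation does not escape the circularity mentioned above: how close the proportion gets to 0.9 is itself a probabilistic statement.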

It will not surprise you to learn that, in this mathematics course, we will take
the “mathematical probability” approach. However, we will also learn useful
things about the other approaches: we will see that classical probability is one
special case of mathematical probability; we will see a result called the “law
of large numbers” that says that the long-run frequency does indeed get closer
and closer to the mathematical probability; and a result called “Bayes’ theorem”
will advise a subjectivist on how to update their subjective beliefs when they
see new evidence.

1.2 Sample spaces and events


Taking the “mathematical probability” approach, we will want to give a formal
mathematical definition of the probability of an event. But even before that, we
need to give a formal mathematical definition of an event itself. Our setup will
be this:

• There is a set called the sample space, normally given the letter Ω (upper-
case Omega), which is the set of all possible outcomes.
• An element of the sample space Ω is a sample outcome, sometimes given
the letter 𝜔 (lower-case omega), and represents one of the possible outcomes.
• An event is a set of sample outcomes; that is, a subset of the sample space
Ω. Events are often given letters like 𝐴, 𝐵, 𝐶. We write 𝐴 ⊂ Ω to mean
that 𝐴 is an event in (or, equivalently, is a subset of) the sample space Ω.
Note that every set is a subset of itself, so Ω ⊂ Ω.

This will be easier to understand with some concrete examples. We write a set
(such as a sample space or an event) by writing all the elements of that set inside
curly brackets { }, separated by commas.
Example 1.1. Suppose we toss a (possibly biased) coin, and record whether it
lands heads or tails. Then our sample space is Ω = {H, T}, where the sample
outcome H denotes heads and the sample outcome T denotes tails.
The event that the coin lands heads is {H}.
Example 1.2. Suppose we roll a dice, and record the number rolled. Then our
sample space is Ω = {1, 2, 3, 4, 5, 6}, where the sample outcome 1 corresponds
to rolling a one, and so on.
The event “we roll an even number” is {2, 4, 6}. The event “we roll at least a
five” is {5, 6}.
Example 1.3. Suppose we wish to count how many claims are made to an
insurance company in a year. We could model this by taking the sample space
Ω to be ℤ+ = {0, 1, 2, … }, the set of all non-negative integers.
The event “the company receives fewer than 1000 claims” is {0, 1, 2, … , 998, 999}.
Example 1.4. Suppose we want a computer to pick a random number between
0 and 1. We could model this by taking the sample space Ω to be the interval
[0, 1] of all real numbers between 0 and 1.
The event “the number is bigger than 1/2” is the sub-interval (1/2, 1] of all real
numbers greater than 1/2 but no bigger than 1. The event “the first digit after
the decimal point is a 7” is the sub-interval [0.7, 0.8). The event “the random
number is exactly 1/√2” is {1/√2}.
In the first two examples, the sample space Ω was finite. In the third example, the
sample space was infinite but “countably infinite”, in that it could be counted
using the discrete values of the positive integers. Both of these were for counting
discrete observations. In the fourth example, the sample space was infinite but
“uncountably infinite”, in that it had a sliding scale or “continuum” of gradually
varying measurements. This was for measuring continuous observations. This
distinction will be important later in the course.
For any sample space Ω, there are two special events that always exist. There’s
Ω itself, the event containing all of the sample outcomes, which represents “some-
thing happens”. There’s also the empty set ∅, which contains none of the sample
outcomes, which represents “nothing happens”. Common sense suggests that Ω
should have probability 1, because something is bound to happen – this will
later be one of our probability “axioms”. Common sense also suggests that ∅
should have probability 0, because it can’t be that nothing happens – this will
not be one of our probability axioms, but we’ll show that it follows logically from the
axioms we do choose.

1.3 Set theory


Since events are defined as sets – specifically, subsets of the sample space Ω –
the theory of probability uses the language of set theory. In this section, we

recap the notation and some basic results from set theory.

• 𝜔 ∈ 𝐴 means “𝜔 is in 𝐴” or “𝜔 is an element of 𝐴”, while 𝜔 ∉ 𝐴 means
the opposite, that 𝜔 is not in 𝐴;
• a colon ∶ in the middle of set notation should be read as “such that”;
• so {𝜔 ∈ Ω ∶ fact about 𝜔} should be read as “the set of sample outcomes
𝜔 in the sample space Ω such that the fact is true”.
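
The set-builder notation {𝜔 ∈ Ω ∶ fact about 𝜔} has a close analogue in many programming languages. As an illustration only (the variable names below are assumptions made for this sketch, and Python is used rather than the module's R), here are the dice events of Example 1.2 written as Python set comprehensions.

```python
# Sample space for rolling a dice once.
omega = {1, 2, 3, 4, 5, 6}

# {w in omega : w is even}, read as "the outcomes w in omega such that w is even".
even = {w for w in omega if w % 2 == 0}

# {w in omega : w >= 5}, the event "we roll at least a five".
at_least_five = {w for w in omega if w >= 5}

print(even)            # the event {2, 4, 6}
print(at_least_five)   # the event {5, 6}
print(even <= omega)   # "<=" tests whether even is a subset of omega
print(3 in even)       # "in" plays the role of the element-of symbol
```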

Definition 1.1. Consider a sample space Ω, and let 𝐴 and 𝐵 be events in that
sample space.

• NOT: The complement of 𝐴, written 𝐴c (and said “𝐴 complement” or
“not 𝐴”), is the set of sample outcomes not in 𝐴; that is

𝐴c = {𝜔 ∈ Ω ∶ 𝜔 ∉ 𝐴}.

This represents the event that 𝐴 does not occur.


• AND: The intersection of 𝐴 and 𝐵, written 𝐴 ∩ 𝐵 (and said “𝐴
intersect 𝐵” or “𝐴 and 𝐵”) is the set of sample outcomes in both 𝐴 and
𝐵; that is,

𝐴 ∩ 𝐵 = {𝜔 ∈ Ω ∶ 𝜔 ∈ 𝐴 and 𝜔 ∈ 𝐵}.

This represents the event that both 𝐴 and 𝐵 occur.


• OR: The union of 𝐴 and 𝐵, written 𝐴 ∪ 𝐵 (and said “𝐴 union 𝐵” or “𝐴
or 𝐵”) is the set of sample outcomes in 𝐴 or in 𝐵; that is,

𝐴 ∪ 𝐵 = {𝜔 ∈ Ω ∶ 𝜔 ∈ 𝐴 or 𝜔 ∈ 𝐵}.

This represents the event that 𝐴 occurs or 𝐵 occurs. (In mathematics,


“or” includes “both”, so a sample outcome in both 𝐴 and 𝐵 is in 𝐴 ∪ 𝐵
too.)


[Figure: Venn diagrams showing the complement 𝐴c, the intersection 𝐴 ∩ 𝐵, and the union 𝐴 ∪ 𝐵.]

Example 1.5. Suppose we are rolling a dice, so our sample space is Ω =
{1, 2, 3, 4, 5, 6}. Let 𝐴 = {2, 4, 6} be the event that we roll an even number,
and let 𝐵 = {5, 6} be the event that we roll at least a 5. Then

𝐴c = {1, 3, 5} = {roll an odd number},


𝐴 ∩ 𝐵 = {6} = {roll a 6},
𝐴 ∪ 𝐵 = {2, 4, 5, 6}.
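
The three operations in Example 1.5 correspond directly to built-in set operators in Python. The short sketch below is purely illustrative (the variable names are assumptions), with the complement computed as a set difference from the sample space.

```python
omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # roll an even number
B = {5, 6}      # roll at least a 5

A_complement = omega - A   # NOT: outcomes in omega that are not in A
A_and_B = A & B            # AND: intersection
A_or_B = A | B             # OR: union (includes outcomes in both)

print(A_complement)  # the odd numbers 1, 3, 5
print(A_and_B)       # just the outcome 6
print(A_or_B)        # the outcomes 2, 4, 5, 6
```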

An important case is when two events 𝐴, 𝐵 cannot happen at the same time;
that is, 𝐴 ∩ 𝐵 = ∅ (“𝐴 intersect 𝐵 is the empty set”). In this case, we say that
𝐴 and 𝐵 are disjoint or mutually exclusive. For example, when Ω is a deck
of cards, then 𝐴 = {the card is a spade} and 𝐵 = {the card is red} are disjoint,
because a card cannot be both a spade (a black suit) and red.
You might think that if two events are disjoint, then to find the probability
of their union – that is, the probability that one (and, by necessity, only
one) of them happens – you can just add the two separate
probabilities together. This will be another of our “axioms” of probability.
There are a few rules about ways you can combine the complement, intersection
and union operations. These are ways of building new events from old.
• The double complement law tells us that not-not-𝐴 is the same as 𝐴:

(𝐴c )c = 𝐴.

This says that if it’s not “not-raining”, then it’s raining!


• The distributive laws tell us we can “multiply out the brackets” with
sets:

𝐴 ∩ (𝐵 ∪ 𝐶) = (𝐴 ∩ 𝐵) ∪ (𝐴 ∩ 𝐶),
𝐴 ∪ (𝐵 ∩ 𝐶) = (𝐴 ∪ 𝐵) ∩ (𝐴 ∪ 𝐶).

The first says that if you are eating a burger with fries or salad, then you’re
eating a burger with fries or eating a burger with salad. The second is
a bit less intuitive, I find, but it’s clear that if 𝐴 is true then the first of
each of the terms on the right is true, while if both 𝐵 and 𝐶 are true then
the second of each of the terms on the right is true.
• De Morgan’s laws tell us how complements interact with intersec-
tion/unions:

(𝐴 ∩ 𝐵)c = 𝐴c ∪ 𝐵c
(𝐴 ∪ 𝐵)c = 𝐴c ∩ 𝐵c

The first of these says that if it’s not a Monday in October, then either
it’s not Monday or it’s not October (or both). The second says that if a
maths lecture is not “useful or fun”, then it’s not useful and it’s not fun.
(Augustus De Morgan was a British mathematician of the 19th century
who did important work in logic.)
For this module, these mostly count as “common sense” – but if you ever do
need to prove one of these statements (or a similar one), one way is to use a
Venn diagram.
Let’s prove the second distributive law,

𝐴 ∪ (𝐵 ∩ 𝐶) = (𝐴 ∪ 𝐵) ∩ (𝐴 ∪ 𝐶),

with a Venn diagram as an example.


We can build the left-hand side of the law as:

[Figure: three Venn diagrams of the events 𝐴, 𝐵 and 𝐶, with the regions described next shaded.]

The left-hand figure is 𝐴, the middle figure is 𝐵 ∩ 𝐶, and the right-hand figure
is the union of these, 𝐴 ∪ (𝐵 ∩ 𝐶).
Then for the right-hand side of the law, we have:

[Figure: three Venn diagrams of the events 𝐴, 𝐵 and 𝐶, with the regions described next shaded.]

The left-hand figure is 𝐴 ∪ 𝐵, the middle figure is 𝐴 ∪ 𝐶, and the right-hand
figure is the intersection of these, (𝐴 ∪ 𝐵) ∩ (𝐴 ∪ 𝐶).
We see that the areas shaded in the two right-hand figures are the same, so it is
indeed the case that 𝐴 ∪ (𝐵 ∩ 𝐶) = (𝐴 ∪ 𝐵) ∩ (𝐴 ∪ 𝐶).
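
As a complement to the Venn diagram argument, the set laws of this section can be checked by brute force on a small sample space. The sketch below is an illustration rather than a proof: it only tests one four-element sample space, and the helper names `all_events` and `complement` are assumptions made for this example. It runs through every choice of events 𝐴, 𝐵, 𝐶 and confirms the double complement law, both distributive laws, and both of De Morgan’s laws.

```python
from itertools import chain, combinations


def all_events(omega):
    """Return every subset of omega (every event) as a frozenset."""
    items = sorted(omega)
    subsets = chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )
    return [frozenset(subset) for subset in subsets]


omega = frozenset({1, 2, 3, 4})
events = all_events(omega)


def complement(event):
    """Complement of an event relative to the sample space omega."""
    return omega - event


laws_hold = all(
    complement(complement(A)) == A                           # double complement
    and A & (B | C) == (A & B) | (A & C)                     # first distributive law
    and A | (B & C) == (A | B) & (A | C)                     # second distributive law
    and complement(A & B) == complement(A) | complement(B)   # De Morgan (1)
    and complement(A | B) == complement(A) & complement(B)   # De Morgan (2)
    for A in events
    for B in events
    for C in events
)

print(len(events), laws_hold)  # prints "16 True"
```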

Summary
• A sample space Ω is a set representing all possible sample outcomes.
• An event is a subset of Ω.

• For events 𝐴 and 𝐵, we also have the complement “not 𝐴” 𝐴c , the inter-
section “𝐴 and 𝐵” 𝐴 ∩ 𝐵, and the union “𝐴 or 𝐵” 𝐴 ∪ 𝐵.
Recommended reading:
• Stirzaker, Elementary Probability, Sections 1.1 and 1.2 (plus optionally
Chapter 0).
• Grimmett and Welsh, Probability, Sections 1.1 and 1.2.
Chapter 2

The rules of probability

2.1 Probability axioms


Recall that, in this mathematics course, the probability of an event will be a
real number that satisfies certain properties, which we call axioms.

Definition 2.1. Let Ω be a sample space. A probability measure on Ω is
a function ℙ that assigns to each event 𝐴 ⊂ Ω a real number ℙ(𝐴), called the
probability of 𝐴, and that satisfies the following three axioms:

1. ℙ(𝐴) ≥ 0 for all events 𝐴 ⊂ Ω;
2. ℙ(Ω) = 1;
3. if 𝐴1 , 𝐴2 , … is a finite or infinite sequence of disjoint events, then

ℙ(𝐴1 ∪ 𝐴2 ∪ ⋯) = ℙ(𝐴1 ) + ℙ(𝐴2 ) + ⋯ .

The sample space Ω together with the probability measure ℙ are called a prob-
ability space.

Axiom 1 says that all probabilities are non-negative numbers. Axiom 2 says the
probability that something happens is 1. Axiom 3 is about disjoint events – re-
call that these are events where no two can happen at the same time, because the
intersection of any pair of them is empty. Axiom 3 says that for disjoint events
the probability that one of them happens is the sum of the individual probabil-
ities. (Those who like their mathematical statements very precise should note
that an infinite sequence in Axiom 3 must be “countable”; that is, indexed by
the natural numbers 1, 2, 3, ….)

These axioms of probability (and our later results that follow from them)
were first written down by the Russian mathematician Andrey Nikolaevich
Kolmogorov in 1933. This marked the point from which probability theory
could be considered a proper branch of mathematics – just as legitimate
as geometry or number theory – and not just a pastime that can be useful to
help gamblers calculate their odds. I always find it surprising that the axioms
of probability are only 90 years old!


There are other properties that it seems natural that a probability measure
should have aside from the axioms – for example, that ℙ(𝐴) ≤ 1 for all events
𝐴. But we will show shortly that other properties can be proven just by starting
from the three axioms.
But first, let’s see some examples.
Example 2.1. Suppose we wish to model tossing a biased coin that lands heads
with probability 𝑝, where 0 ≤ 𝑝 ≤ 1.
Our sample space is Ω = {H, T}. The probability measure is given by

ℙ(∅) = 0, ℙ({H}) = 𝑝, ℙ({T}) = 1 − 𝑝, ℙ({H, T}) = 1.

Let’s check that the axioms hold:


1. Since 0 ≤ 𝑝 ≤ 1, all the probabilities are greater than or equal to 0.
2. It is indeed the case that ℙ(Ω) = ℙ({H, T}) = 1.
3. The only nontrivial disjoint union to check is {H} ∪ {T} = {H, T}, where
we see that

ℙ({H}) + ℙ({T}) = 𝑝 + (1 − 𝑝) = 1 = ℙ({H, T}),

as required.
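
The same check can be carried out mechanically. The following Python sketch is an illustration only: the names `coin_measure` and `satisfies_axioms` are assumptions made for this example, and because the sample space is finite it tests Axiom 3 just for pairs of disjoint events. Exact fractions are used so that the equality checks are not affected by rounding.

```python
from fractions import Fraction


def coin_measure(p):
    """The probability measure of Example 2.1, as a dict from events to probabilities."""
    return {
        frozenset(): Fraction(0),
        frozenset({"H"}): p,
        frozenset({"T"}): 1 - p,
        frozenset({"H", "T"}): Fraction(1),
    }


def satisfies_axioms(measure, omega):
    """Check Axioms 1 and 2, and Axiom 3 for every pair of disjoint events."""
    non_negative = all(prob >= 0 for prob in measure.values())   # Axiom 1
    total_is_one = measure[omega] == 1                           # Axiom 2
    additive = all(                                              # Axiom 3 (pairs)
        measure[A | B] == measure[A] + measure[B]
        for A in measure
        for B in measure
        if not (A & B)   # only pairs with empty intersection
    )
    return non_negative and total_is_one and additive


omega = frozenset({"H", "T"})
print(satisfies_axioms(coin_measure(Fraction(3, 10)), omega))
```

Changing one of the four values without adjusting the others, for example setting ℙ({T}) to 1/4 when 𝑝 = 1/2, makes the check fail at Axiom 3.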
Example 2.2. Suppose we wish to model rolling a dice.
Our sample space is {1, 2, 3, 4, 5, 6}. The probability measure is given by

ℙ(𝐴) = |𝐴|/6,

where |𝐴| is the number of sample outcomes in 𝐴.
So, for example, the probability of rolling an even number is

ℙ({2, 4, 6}) = 3/6 = 1/2.

The dice rolling is a particular case of the “classical probability” of equally
likely outcomes. We’ll look at this more in the next lecture, and prove that the
classical probability measure does indeed satisfy the axioms.

2.2 Properties of probability


The axioms of Definition 2.1 only gave us some of the properties that we would
like a probability measure to have. Our task now (in this subsection and the
next) is to carefully prove how these other properties follow from just those
axioms. In particular, we’re not allowed to make claims that merely “seem
likely to be true” or “are common sense” – we can only use the three axioms
together with strict logical deductions and nothing else.
Theorem 2.1. Let Ω be a sample space with a probability measure ℙ. Then we
have the following:

1. ℙ(∅) = 0.
2. ℙ(𝐴c ) = 1 − ℙ(𝐴) for all events 𝐴 ⊂ Ω.
3. For events 𝐴 and 𝐵 with 𝐵 ⊂ 𝐴, we have ℙ(𝐵) ≤ ℙ(𝐴).
4. 0 ≤ ℙ(𝐴) ≤ 1 for all events 𝐴 ⊂ Ω.
Importantly, the second result here tells us how to deal with complements or
“not” events: the probability of 𝐴 not happening is 1 minus the probability it
does happen. This is often very useful.

Proof. The key with most of these “prove from the axioms” problems is to think
of a way to write the relevant events as part of a disjoint union, then use Axiom
3. Statements 1 and 2 are exercises for you on Problem Sheet 2. We’ll start
with the third statement.
Here, 𝐵 is a subset of 𝐴, meaning that 𝐵 is entirely inside 𝐴.

It would be useful to write 𝐴 as a disjoint union of 𝐵 and “the bit of 𝐴 that
isn’t in 𝐵”. That is, we have the disjoint union

𝐴 = 𝐵 ∪ (𝐴 ∩ 𝐵c ).

Applying Axiom 3 to this disjoint union gives

ℙ(𝐴) = ℙ(𝐵) + ℙ(𝐴 ∩ 𝐵c ).

We’re happy to see the term on the left-hand side and the first term on the
right-hand side. But what about the awkward ℙ(𝐴 ∩ 𝐵c )? Well, by Axiom 1,
we know that the probability of any event is greater than or equal to 0, so in
particular ℙ(𝐴 ∩ 𝐵c ) ≥ 0. Hence

ℙ(𝐴) ≥ ℙ(𝐵) + 0 = ℙ(𝐵),



and we are done with the third statement.


For the fourth statement, we have ℙ(𝐴) ≥ 0 directly from Axiom 1, so we only
need to show that ℙ(𝐴) ≤ 1. We can do this using the third statement of this
theorem. For any event 𝐴 we have 𝐴 ⊂ Ω, so the third statement tells us that
ℙ(𝐴) ≤ ℙ(Ω). But Axiom 2 tells us that ℙ(Ω) = 1, so ℙ(𝐴) ≤ 1 and we are
done.
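
A numerical check is no substitute for the proof, but it can be reassuring to see the four statements of Theorem 2.1 hold in a concrete case where the outcomes are not equally likely. The sketch below is an illustration under assumed values: the three-outcome sample space, the weights 1/2, 1/3 and 1/6, and the helper name `prob` are all made up for this example. It also checks the decomposition ℙ(𝐴) = ℙ(𝐵) + ℙ(𝐴 ∩ 𝐵c) used in the proof.

```python
from fractions import Fraction
from itertools import combinations

# A hypothetical non-uniform probability space: each outcome has its own weight.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
omega = frozenset(weights)


def prob(event):
    """Probability of an event: the sum of the weights of its outcomes."""
    return sum((weights[w] for w in event), Fraction(0))


# Every subset of omega, i.e. every event.
events = [
    frozenset(subset)
    for size in range(len(omega) + 1)
    for subset in combinations(sorted(omega), size)
]

empty_has_prob_zero = prob(frozenset()) == 0                             # statement 1
complement_rule = all(prob(omega - A) == 1 - prob(A) for A in events)   # statement 2
monotone = all(                                                          # statement 3
    prob(B) <= prob(A) for A in events for B in events if B <= A
)
bounded = all(0 <= prob(A) <= 1 for A in events)                         # statement 4

# The disjoint decomposition from the proof; A - B is the set "A and not B".
decomposition = all(
    prob(A) == prob(B) + prob(A - B) for A in events for B in events if B <= A
)

print(empty_has_prob_zero, complement_rule, monotone, bounded, decomposition)
```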

2.3 Addition rules for unions


If we have two or more events, we’d like to work out the probability of their
union; that is, the probability that at least one of them occurs.
We already have an addition rule for disjoint unions.
Theorem 2.2. Let 𝐴, 𝐵 ⊂ Ω be two disjoint events. Then

ℙ(𝐴 ∪ 𝐵) = ℙ(𝐴) + ℙ(𝐵).

Proof. In Axiom 3, take the sequence 𝐴1 = 𝐴, 𝐴2 = 𝐵 and 𝐴3 = 𝐴4 = ⋯ = ∅,
and use ℙ(∅) = 0 from Theorem 2.1.

But what about if 𝐴 and 𝐵 are not disjoint? Then we have the following.
Theorem 2.3. Let 𝐴, 𝐵 ⊂ Ω be two events. Then

ℙ(𝐴 ∪ 𝐵) = ℙ(𝐴) + ℙ(𝐵) − ℙ(𝐴 ∩ 𝐵).

You may have seen this result before. You’ve perhaps justified it by saying
something like this: “We can add the two probabilities together, except now
we’ve double-counted the overlap, so we have to take the probability of that
away.” Maybe you drew a Venn diagram. That’s OK as a way to remember
the result – but this is a proper university mathematics course, so we have to
carefully prove it starting from just the axioms and nothing else.

Proof. The problem here is that 𝐴 and 𝐵 are not (in general) disjoint, so we
can’t apply Axiom 3.

[Figure: Venn diagram of two overlapping events 𝐴 and 𝐵.]

Instead, let’s split this up into the three disjoint bits: “𝐴 but not 𝐵” 𝐴 ∩ 𝐵c ,
“𝐵 but not 𝐴” 𝐵 ∩ 𝐴c , and “both” 𝐴 ∩ 𝐵.

[Figure: Venn diagram of 𝐴 and 𝐵 split into the three disjoint regions 𝐴 ∩ 𝐵c, 𝐴 ∩ 𝐵 and 𝐵 ∩ 𝐴c.]

Now we can write 𝐴, 𝐵 and 𝐴 ∪ 𝐵 in terms of these disjoint bits.

𝐴 = (𝐴 ∩ 𝐵c ) ∪ (𝐴 ∩ 𝐵) (2.1)
𝐵 = (𝐵 ∩ 𝐴c ) ∪ (𝐴 ∩ 𝐵) (2.2)
𝐴 ∪ 𝐵 = (𝐴 ∩ 𝐵c ) ∪ (𝐵 ∩ 𝐴c ) ∪ (𝐴 ∩ 𝐵), (2.3)

with all the unions on the right-hand side being disjoint. Applying Axiom 3 to
them all gives

ℙ(𝐴) = ℙ(𝐴 ∩ 𝐵c ) + ℙ(𝐴 ∩ 𝐵) (2.4)
ℙ(𝐵) = ℙ(𝐵 ∩ 𝐴c ) + ℙ(𝐴 ∩ 𝐵) (2.5)
ℙ(𝐴 ∪ 𝐵) = ℙ(𝐴 ∩ 𝐵c ) + ℙ(𝐵 ∩ 𝐴c ) + ℙ(𝐴 ∩ 𝐵). (2.6)

Here, (2.6) is looking good, but we need to get rid of the awkward ℙ(𝐴 ∩ 𝐵c )
and ℙ(𝐵 ∩ 𝐴c ) terms. We can do that by rearranging (2.4) and (2.5) to get

ℙ(𝐴 ∩ 𝐵c ) = ℙ(𝐴) − ℙ(𝐴 ∩ 𝐵) (2.7)
ℙ(𝐵 ∩ 𝐴c ) = ℙ(𝐵) − ℙ(𝐴 ∩ 𝐵). (2.8)

Substituting these into (2.6) gives

ℙ(𝐴 ∪ 𝐵) = ℙ(𝐴) − ℙ(𝐴 ∩ 𝐵) + ℙ(𝐵) − ℙ(𝐴 ∩ 𝐵) + ℙ(𝐴 ∩ 𝐵) (2.9)


= ℙ(𝐴) + ℙ(𝐵) − ℙ(𝐴 ∩ 𝐵), (2.10)

as required.

Example 2.3. Consider picking a card from a standard 52-card deck at random,
with ℙ(𝐴) = |𝐴|/52. What’s the probability the card is a spade or an ace?
It is possible just to work this out directly. But let’s use our addition law for
unions.
We have ℙ(spade) = 13/52 and ℙ(ace) = 4/52. So we have

ℙ(spade or ace) = 13/52 + 4/52 − ℙ(spade and ace).

But ℙ(spade and ace) is the probability of picking the ace of spades, which is
1/52. Therefore

ℙ(spade or ace) = 13/52 + 4/52 − 1/52 = 16/52 = 4/13.
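
Both routes to this answer, counting the union directly and using the addition rule, can be compared in a few lines of code. The sketch below is illustrative only; the representation of a card as a (rank, suit) pair and the names `deck`, `prob`, `spade` and `ace` are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]

# The sample space: all 52 (rank, suit) pairs.
deck = set(product(ranks, suits))


def prob(event):
    """Probability of an event when each card is equally likely: |A| / 52."""
    return Fraction(len(event), len(deck))


spade = {card for card in deck if card[1] == "spades"}
ace = {card for card in deck if card[0] == "A"}

direct = prob(spade | ace)                                        # count the union
by_addition_rule = prob(spade) + prob(ace) - prob(spade & ace)    # Theorem 2.3

print(direct, by_addition_rule)  # prints "4/13 4/13"
```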

Summary
• The axioms of probability are (1) ℙ(𝐴) ≥ 0; (2) ℙ(Ω) = 1; and (3) that for
disjoint events 𝐴1 , 𝐴2 , …, we have ℙ(𝐴1 ∪ 𝐴2 ∪ ⋯) = ℙ(𝐴1 ) + ℙ(𝐴2 ) + ⋯.
• Other properties can be proven from these axioms, like the complement
rule ℙ(𝐴c ) = 1 − ℙ(𝐴), and the addition rule for unions ℙ(𝐴 ∪ 𝐵) =
ℙ(𝐴) + ℙ(𝐵) − ℙ(𝐴 ∩ 𝐵).
Recommended reading:
• Stirzaker, Elementary Probability, Sections 1.3 and 1.4.
• Grimmett and Welsh, Probability, Sections 1.3 and 1.4.
• Matthew Aldridge’s blogpost “How to prove the addition rule for unions”
On Problem Sheet 2, you should now be able to complete Question A1 and
Questions B1, B2 and B3.
Chapter 3

Classical probability I

3.1 Probability with equally likely outcomes


Classical probability is the name we give to probability where there are a
finite number of equally likely outcomes.
Classical probability was the first type of probability to be formally studied –
partly because it is the simplest, and partly because it was useful for working out
how to win at gambling. Tossing fair coins, rolling dice, and dealing cards are
all common gambling situations that can be studied using classical probability
– in a deck of cards, for example, there are 52 cards that are equally likely to
be drawn. Among the first works to seriously study classical probability were
“Book on Games of Chance” by Girolamo Cardano (written in 1564, but not
published until 1663, almost one hundred years later), and a famous series of letters
between Blaise Pascal and Pierre de Fermat in 1654.
Definition 3.1. Let Ω be a finite sample space. Then the classical probability
measure on Ω is given by

ℙ(𝐴) = |𝐴| / |Ω|.

So to work out a classical probability ℙ(𝐴), crucially we need to be able to count
how many outcomes |𝐴| are in the event 𝐴 and count how many outcomes |Ω|
are in the whole sample space Ω. (This is why classical probability is also called
“enumerative probability” – “enumeration” is another word for counting.) In
this lecture and the next, we’ll look at some different ways in which we can
count the number of outcomes in common events and sample spaces.
Example 3.1. We roll a dice. What is the probability we get at least 5?
The sample space is Ω = {1, 2, 3, 4, 5, 6}, with |Ω| = 6. The event that we roll
at least 5 is 𝐴 = {5, 6}, with |𝐴| = 2. Hence

ℙ(𝐴) = |𝐴|/|Ω| = 2/6 = 1/3.
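
The counting in Example 3.1 can be mirrored in R. This is only a minimal sketch: the object names omega, A and prob_A are chosen here for illustration and are not part of the notes.

```r
# Sample space for one roll of a fair six-sided dice
omega <- 1:6

# Event A: the roll is at least 5
A <- omega[omega >= 5]

# Classical probability: number of outcomes in A over number in Omega
prob_A <- length(A) / length(omega)
prob_A
```

The same pattern (list the sample space, pick out the event, divide the two counts) works for any small classical probability problem.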


There’s something we ought to check before going any further!


Theorem 3.1. Let Ω be a finite nonempty sample space. Then the classical
probability measure on Ω,

ℙ(𝐴) = |𝐴| / |Ω|,

is indeed a probability measure, in that it satisfies the three axioms in Definition
2.1.

Proof. We’ll take the axioms one by one.


1. Since |Ω| ≥ 1 and |𝐴| ≥ 0, it is indeed the case that ℙ(𝐴) = |𝐴|/|Ω| ≥ 0.
2. We have ℙ(Ω) = |Ω|/|Ω| = 1, as required.
3. Since we have a finite sample space, we only need to show Axiom 3 for a
sequence of two disjoint events; the argument can be repeated to get any
finite number of events. Let 𝐴 = {𝑎1 , 𝑎2 , … , 𝑎𝑘 } and 𝐵 = {𝑏1 , 𝑏2 , … , 𝑏𝑙 } be
two disjoint events with |𝐴| = 𝑘 and |𝐵| = 𝑙. Note that we can enumerate
the elements of the disjoint union 𝐶 = 𝐴 ∪ 𝐵 as

𝑐1 = 𝑎1 , 𝑐2 = 𝑎2 , … , 𝑐𝑘 = 𝑎𝑘 , 𝑐𝑘+1 = 𝑏1 , 𝑐𝑘+2 = 𝑏2 , … , 𝑐𝑘+𝑙 = 𝑏𝑙 .

Since 𝐴 and 𝐵 are disjoint, this list has no repeats, and we see that
|𝐶| = |𝐴 ∪ 𝐵| = 𝑘 + 𝑙. Hence

ℙ(𝐴 ∪ 𝐵) = (𝑘 + 𝑙)/|Ω| = 𝑘/|Ω| + 𝑙/|Ω| = ℙ(𝐴) + ℙ(𝐵),

and Axiom 3 is fulfilled.

3.2 Multiplication principle


In classical probability, to find the probability of an event 𝐴, we need to count
the number of outcomes in 𝐴 and the total number of possible outcomes in Ω.
This can be easy when we’re just looking at one choice – like the 2 outcomes
from tossing a single coin, the 6 outcomes of rolling a single dice, or the 52
outcomes from dealing a single card. Now we’re going to look at what happens
if there are a number of choices one after another – like tossing multiple coins,
rolling more than one dice, or dealing a hand of cards.
Here, an important principle is the multiplication principle. The multiplication
principle says that if you have 𝑛 choices followed by 𝑚 choices, then
altogether you have 𝑛 × 𝑚 total choices. You can see this by imagining the choices
in an 𝑛 × 𝑚 grid, with the 𝑛 columns representing the first choice and 𝑚 rows
representing the second choice. For example, suppose you go to a burger restaurant
where there are 3 choices of burger (beefburger, chicken burger, veggie burger)
and 2 choices of sides (fries, salad). Then altogether there are 3 × 2 = 6 choices
of meal.

         Beefburger                  Chicken burger                  Veggie burger
Fries    1: Beefburger with fries    2: Chicken burger with fries    3: Veggie burger with fries
Salad    4: Beefburger with salad    5: Chicken burger with salad    6: Veggie burger with salad

More generally, if you have 𝑚 stages of choosing, with 𝑛1 choices in the first
stage, then 𝑛2 choices in the second stage, all the way to 𝑛𝑚 choices in the final
stage, you have 𝑛1 × 𝑛2 × ⋯ × 𝑛𝑚 total choices altogether.
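
As a rough illustration of the multiplication principle, the burger-and-sides grid can be built in R with expand.grid(), which lists every combination of its arguments. The object names burgers, sides and meals are just illustrative choices.

```r
# The two stages of choosing a meal
burgers <- c("Beefburger", "Chicken burger", "Veggie burger")
sides <- c("Fries", "Salad")

# Every (burger, side) combination, one per row
meals <- expand.grid(burger = burgers, side = sides)
meals

# Number of meals by listing them, and by the multiplication principle
nrow(meals)                      # 6
length(burgers) * length(sides)  # 6
```

Listing the combinations is only practical for small examples; the multiplication principle gives the count without building the list.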
Example 3.2. Five fair coins are tossed. What is the probability they all show
the same face?
Here, the sample space Ω is the set of all sequences of 5 coin outcomes. How
many sample outcomes are in Ω? Well, the first coin can be heads or tails (2
choices); the second coin can be heads or tails (2 choices) and so on, until the
fifth and final coin. So, by the multiplication principle, |Ω| = 2 × 2 × 2 × 2 × 2 =
2^5 = 32.
The event we’re interested in is 𝐴 = {HHHHH, TTTTT}, the event that the
faces are all the same – either all heads or all tails. This clearly has |𝐴| = 2
outcomes.
So the probability all five coins show the same face is

ℙ(𝐴) = |𝐴|/|Ω| = 2/32 = 1/16 ≈ 0.06.

Example 3.3. Four dice are rolled. What is the probability we get at least one
6?
Here, Ω is the set of all possible sequences of four dice rolls. Clearly |Ω| = 6^4 =
1296.
The event 𝐴 is the set of all dice roll sequences with at least one 6. Whenever
you see a question with the phrase “at least one” in it, it’s very often a good
idea to look at the complementary event 𝐴c instead. We know from the last
lecture that ℙ(𝐴) = 1 − ℙ(𝐴c ), but in “at least one” questions, it’s often easier
to count |𝐴c | than to count |𝐴|.
Here, since 𝐴 is the set of all dice roll sequences with at least one 6, then 𝐴c is
the set of dice roll sequences without any 6s at all. This means all four dice must
have rolled a 1, 2, 3, 4 or 5. Since each of the four dice rolls has five possibilities,
this means that |𝐴c| = 5^4 = 625.
Putting this together, we see that

ℙ(𝐴) = 1 − ℙ(𝐴c) = 1 − |𝐴c|/|Ω| = 1 − 625/1296 = 671/1296 ≈ 0.518.

So there’s about a 52% chance we get at least one 6.
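
One way to double-check Example 3.3 is to list all 1296 sequences in R and count directly. This is a sketch only, with rolls and has_six as illustrative names.

```r
# All 6^4 = 1296 sequences of four dice rolls, one per row
rolls <- expand.grid(rep(list(1:6), 4))

# For each sequence, is there at least one 6?
has_six <- apply(rolls == 6, 1, any)

nrow(rolls)    # 1296 sequences in the sample space
sum(has_six)   # 671 sequences with at least one 6
mean(has_six)  # the proportion 671/1296, about 0.518

# The complement rule gives the same answer without listing anything
1 - 5^4 / 6^4
```

The direct count agrees with the complement argument, but the complement argument needs far less work.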



3.3 Sampling with and without replacement


Probabilists love problems where they pick coloured balls out of a bag!
Example 3.4. A bag contains 15 balls: 10 black balls and 5 white balls. We
draw 3 balls out of the bag. What is the probability all 3 balls are black (a) if
we put each ball back into the bag after it is chosen; (b) if we do not put each
ball back into the bag after it is chosen?
Let’s start with (a). The number of ways to choose a ball out of 15 on three
occasions is |Ω| = 15^3. The number of ways to choose a black ball out of 10 on
three occasions is |𝐴| = 10^3. Hence

ℙ(𝐴) = |𝐴|/|Ω| = 10^3/15^3 = 1000/3375 = 8/27 ≈ 0.30.

What about (b)? Here we don’t put the ball back in the bag once it has been
chosen. There are 15 ways to pick the first ball. But then there are only 14
balls left in the bag for the second choice, and only 13 balls for the third choice.
So |Ω| = 15 × 14 × 13. Similarly, there are 10 ways the first ball can be black.
But once that black ball is removed, only 9 choices for the second black ball,
and only 8 for the third. So |𝐴| = 10 × 9 × 8. So this time we have

ℙ(𝐴) = |𝐴|/|Ω| = (10 × 9 × 8)/(15 × 14 × 13) = 720/2730 = 24/91 ≈ 0.26,

which is slightly smaller than the answer in part (a).
This example illustrated the difference between sampling with replacement
(when the balls were put back into the bag) and sampling without replacement
(when the balls were not put back). If we want to sample 𝑘 items from a
set of 𝑛 items, then:
• the number of ways to sample with replacement is
𝑛^𝑘 = 𝑛 × 𝑛 × ⋯ × 𝑛;
• the number of ways to sample without replacement is
𝑛^{\underline{𝑘}} = 𝑛 × (𝑛 − 1) × ⋯ × (𝑛 − 𝑘 + 1).
Here, we’ve defined the notation 𝑛^{\underline{𝑘}} for the number of ways to sample without
replacement; this is called the falling factorial or permutation number.
This is still 𝑘 numbers multiplied together, but decreasing by 1 each time down
from 𝑛. The final number in the product is the number of choices in the 𝑘th
and final round: this is the original 𝑛 items minus the 𝑘 − 1 items sampled in
the previous 𝑘 − 1 rounds; so the final number is 𝑛 − (𝑘 − 1) = 𝑛 − 𝑘 + 1. A
notation point: Notice that the superscript is underlined in the falling factorial;
other notation sometimes used includes (𝑛)ₖ, 𝑃(𝑛, 𝑘), or ₙ𝑃ₖ.
In our balls-in-a-bag problem, the answer for sampling with replacement was
10^3/15^3, while the answer for sampling without replacement was 10^{\underline{3}}/15^{\underline{3}}.
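
The two counts can be compared in R. Base R has no built-in falling factorial that we rely on here, so the sketch below defines a small helper; the name falling_factorial is hypothetical and not something the notes use.

```r
# Hypothetical helper: n * (n - 1) * ... * (n - k + 1), with k factors
falling_factorial <- function(n, k) {
  if (k == 0) return(1)  # an empty product is 1
  prod(n:(n - k + 1))
}

# Example 3.4(a): sampling with replacement
with_replacement <- 10^3 / 15^3

# Example 3.4(b): sampling without replacement
without_replacement <- falling_factorial(10, 3) / falling_factorial(15, 3)

with_replacement     # 8/27, about 0.30
without_replacement  # 24/91, about 0.26
```

The k == 0 check matters because in R the expression n:(n + 1) counts upwards, which would give the wrong product.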
Next time, we will look at more classical probability problems. Do two of your
friends share a birthday? Can you shuffle a deck of cards in an order that has
never happened before in the history of the universe? And can you win the
National Lottery?

Summary
• “Classical probability” describes the situation where there are finitely
many equally likely outcomes. The classical probability ℙ(𝐴) = |𝐴|/|Ω|
requires us to count how many outcomes there are in events or sample
spaces.
• The multiplication principle says that 𝑛 choices followed by 𝑚 choices
makes 𝑛 × 𝑚 choices in total.
• Sampling 𝑘 objects out of 𝑛 with replacement gives 𝑛^𝑘 choices.
• Sampling 𝑘 objects out of 𝑛 without replacement gives
𝑛^{\underline{𝑘}} = 𝑛(𝑛 − 1) ⋯ (𝑛 − 𝑘 + 1) choices.
Recommended reading:
• Stirzaker, Elementary Probability, Sections 3.1 and 3.2.
Chapter 4

Classical probability II

We continue looking at the classical probability ℙ(𝐴) = |𝐴|/|Ω|, by looking at
ways to enumerate Ω and 𝐴. Last time we saw:
• The multiplication principle: 𝑛1 choices followed by 𝑛2 choices, …, up to
𝑛𝑘 choices gives 𝑛1 × 𝑛2 × ⋯ × 𝑛𝑘 choices in total.
• Sampling 𝑘 objects out of 𝑛 with replacement gives 𝑛^𝑘 choices.
• Sampling 𝑘 objects out of 𝑛 without replacement gives
𝑛^{\underline{𝑘}} = 𝑛(𝑛 − 1) ⋯ (𝑛 − 𝑘 + 1) choices.

4.1 Ordering
Example 4.1. Suppose a lecturer marks a pile of 𝑛 exam papers, all of which
receive a different mark. What is the probability she ends up marking them in
order from lowest scoring first in the pile to highest scoring last in the pile?
Here, the sample space Ω is the set of all orderings of the 𝑛 exam papers by
mark, and 𝐴 is the event that the papers are in order from lowest to highest
scoring. It’s clear that |𝐴| = 1: since the exams scored different marks, there’s
only one way of putting the exams in the correct lowest-to-highest order. But
what’s |Ω|?
There are 𝑛 choices for the first exam paper to be marked. Then, for the second
exam paper, there are 𝑛 − 1 choices left, because the lecturer is not going to
mark the same paper twice. There are 𝑛 − 2 choices for the third exam paper.
And so on, until she has marked 𝑛 − 1 papers, and there is only 1 choice left for
the final paper. So we have

|Ω| = 𝑛^{\underline{𝑛}} = 𝑛(𝑛 − 1)(𝑛 − 2) ⋯ 3 ⋅ 2 ⋅ 1 = 𝑛!

ways to order the exam papers.


Hence, the probability the papers are marked in order is

ℙ(𝐴) = |𝐴|/|Ω| = 1/(𝑛(𝑛 − 1) ⋯ 2 ⋅ 1) = 1/𝑛!.


This number

𝑛! = 𝑛^{\underline{𝑛}} = 𝑛(𝑛 − 1)(𝑛 − 2) ⋯ 3 ⋅ 2 ⋅ 1

is called 𝑛 factorial and denoted 𝑛!. It is the number of ways that 𝑛 different
objects can be ordered.
The factorial 𝑛! gets very large very quickly. Stirling’s formula gives the
approximation 𝑛! ≈ √(2𝜋𝑛) e^{−𝑛} 𝑛^𝑛.
Example 4.2. Suppose you shuffle a pack of cards. The resulting ordering of
the deck has 52! possibilities. This is an unimaginably huge number – the exact
value to 3 significant figures is

52! = 8.07 × 10^67,

while Stirling’s formula gives the approximation

52! ≈ √(2𝜋 × 52) × e^{−52} × 52^{52} = 8.05 × 10^67.

This is an 8 followed by 67 zeroes.
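
Both numbers in Example 4.2 can be reproduced in R with the built-in factorial() function. This is a minimal sketch; the names n, exact and stirling are illustrative.

```r
n <- 52

# 52! as computed by R (stored as a floating-point number, so only
# about 15 or 16 significant figures are reliable)
exact <- factorial(n)

# Stirling's approximation sqrt(2 * pi * n) * e^(-n) * n^n
stirling <- sqrt(2 * pi * n) * exp(-n) * n^n

signif(exact, 3)     # 8.07e+67
signif(stirling, 3)  # 8.05e+67

# Ratio of the approximation to the true value: just below 1
stirling / exact
```

The ratio shows that Stirling's formula is already accurate to within a fraction of a percent at 𝑛 = 52.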


If every person on the planet (very roughly 10^10) had shuffled a deck of cards
one million (10^6) times a second for the entire lifetime of the universe (roughly
10^17 seconds), they could only expect to have got through about 10^33 shuffles.
This is only the most tiny, microscopic fraction of 52!. So every time you have
ever shuffled a deck of cards, it is essentially certain that you have created an
ordering of the deck that has never existed before.
If we take the ratio of a bigger factorial 𝑛! over a smaller factorial 𝑗!, we get a lot
of cancellation,

𝑛!/𝑗! = (𝑛(𝑛 − 1) ⋯ (𝑗 + 1)𝑗(𝑗 − 1) ⋯ 1)/(𝑗(𝑗 − 1) ⋯ 1)
= 𝑛(𝑛 − 1) ⋯ (𝑗 + 1),

because the last part of the product in the numerator cancels with the whole of
the denominator. Replacing 𝑗 with 𝑛 − 𝑘, this gives

𝑛!/(𝑛 − 𝑘)! = 𝑛(𝑛 − 1) ⋯ (𝑛 − 𝑘 + 1) = 𝑛^{\underline{𝑘}}.

This gives a way of writing the falling factorial as the ratio of two (normal)
factorials, which can sometimes be useful.

4.2 Sampling without replacement in any order


Example 4.3. In the Lotto, the UK national lottery, you can buy a ticket for
£2 and choose 6 numbers between 1 and 59. If your 6 numbers match the 6
numbers on the balls chosen by the lottery machine, you win the jackpot (usually
between £2 million and £20 million, shared between the tickets that get all 6
numbers). If you buy a ticket, what is the probability you win the jackpot?
Here, Ω is the set of all possible sets of 6 winning numbers, and 𝐴 is the set of
numbers on your ticket. Clearly |𝐴| = 1, but what is |Ω|?

Well, the first ball out of the machine has 59 possibilities, the second ball has
58 possibilities, and so on, making

59 × 58 × 57 × 56 × 55 × 54 = 59^{\underline{6}}.

But this isn’t the correct answer, because the same set of balls could be
drawn from the machine in any order! The sets of balls {1, 2, 3, 4, 5, 6} and
{6, 5, 4, 3, 2, 1} and {1, 3, 5, 6, 4, 2} are all the same set of numbers. How many
ways can we see the same list of numbers? This is precisely the number of
orderings of 6 balls, which we know is 6!. So the number of possible sets of 6
balls to come out of the machine is actually

\binom{59}{6} = 59^{\underline{6}}/6! = (59 × 58 × 57 × 56 × 55 × 54)/(6 × 5 × 4 × 3 × 2 × 1) ≈ 45 million.

Thus the probability that your ticket wins the jackpot is


ℙ(𝐴) = |𝐴|/|Ω| = 1/\binom{59}{6} ≈ 1/(45 million) ≈ 0.000 000 02.
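
In R, the number of ways to choose 6 numbers from 59 when order doesn’t matter can be computed with the built-in choose() function. The sketch below also recomputes it from the ordered count divided by 6!; the object names are illustrative.

```r
# Number of possible sets of 6 winning numbers out of 59
n_sets <- choose(59, 6)
n_sets  # 45057474

# The same count: ordered draws divided by the 6! orderings of each set
prod(59:54) / factorial(6)

# Probability that one ticket wins the jackpot
1 / n_sets
```

So “about 45 million” is more precisely 45,057,474 possible sets.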

Here, we have introduced the notation

\binom{𝑛}{𝑘} = 𝑛^{\underline{𝑘}}/𝑘! = (𝑛(𝑛 − 1) ⋯ (𝑛 − 𝑘 + 1))/(𝑘(𝑘 − 1) ⋯ 2 ⋅ 1)

for the number of ways to choose 𝑘 objects out of 𝑛 without replacement and
where the order they were chosen in doesn’t matter. This is called the binomial
coefficient, although when we say it out loud we normally just say “𝑛 choose
𝑘”. (Another notation for the binomial coefficient is ₙ𝐶ₖ.)
It can sometimes be useful to remember that 𝑛^{\underline{𝑘}} = 𝑛!/(𝑛 − 𝑘)! allows us to write
the binomial coefficient in terms of the factorial function as

\binom{𝑛}{𝑘} = 𝑛^{\underline{𝑘}}/𝑘! = 𝑛!/(𝑘!(𝑛 − 𝑘)!).

Example 4.4. You are dealt a “hand” of 13 cards from a deck of 52 cards.
What is the probability that you have the Ace, King, Queen, and Jack of Spades?
Here, Ω is the set of all 13-card hands from the deck, and 𝐴 is the subset of
those that contain the AKQJ of Spades.
Using the binomial coefficient notation, it’s clear that

|Ω| = \binom{52}{13} = (52 × 51 × ⋯ × 41 × 40)/(13 × 12 × ⋯ × 2 × 1).

What about |𝐴|? If we fix the fact that the hand contains the 4 cards AKQJ
of Spades, then it also contains 13 − 4 = 9 cards out of the other 52 − 4 = 48
remaining cards in the deck. This makes

|𝐴| = \binom{48}{9} = (48 × 47 × ⋯ × 41 × 40)/(9 × 8 × ⋯ × 2 × 1)

hands.
Thus the probability that the hand contains AKQJ of Spades is

ℙ(𝐴) = |𝐴|/|Ω| = \binom{48}{9} / \binom{52}{13}.

Conveniently, we can simplify the expression quite a lot, because plenty of can-
cellation will occur. We have

ℙ(𝐴) = \binom{48}{9} / \binom{52}{13}
= [(48 × 47 × ⋯ × 41 × 40)/(9 × 8 × ⋯ × 2 × 1)] / [(52 × 51 × ⋯ × 41 × 40)/(13 × 12 × ⋯ × 2 × 1)]
= (48 × 47 × ⋯ × 41 × 40)/(52 × 51 × ⋯ × 41 × 40) × (13 × 12 × ⋯ × 2 × 1)/(9 × 8 × ⋯ × 2 × 1)
= (13 × 12 × 11 × 10)/(52 × 51 × 50 × 49)
≈ 0.0026,

or about 1 in every 380 hands.
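
As a quick check of Example 4.4, the sketch below computes the probability in R both from the binomial coefficients and from the simplified fraction. The names p_hand and p_simplified are illustrative.

```r
# Probability from the two binomial coefficients
p_hand <- choose(48, 9) / choose(52, 13)

# Probability from the simplified expression after cancellation
p_simplified <- (13 * 12 * 11 * 10) / (52 * 51 * 50 * 49)

signif(p_hand, 2)  # 0.0026
all.equal(p_hand, p_simplified)

# Roughly one hand in this many contains AKQJ of Spades
1 / p_hand
```

Both routes agree, which is a useful sanity check on the cancellation.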

4.3 Birthday problem


Example 4.5. There are 𝑘 = 23 students in a class. What is the probability
that at least two of the students share a birthday?
This is a famous problem, known as the “birthday problem”. You may have seen
this problem before – but let’s try to solve it using the techniques from this
section of notes. If you haven’t seen it before, you might like to guess what
you think the answer might be. (We’ll assume all days are equally likely for
birthdays, and ignore the leap day 29 February.)
The sample space Ω is the set of possible birthdays for all 𝑘 students. Clearly
|Ω| = 365^𝑘.
Let 𝐴 be the event that at least one pair of students share a birthday. Since this
is an “at least” event, it seems like it might be a good idea to look instead at
the complementary event 𝐴c. If 𝐴 is the event that there’s at least one shared
birthday, then 𝐴c is the event that there are no shared birthdays; that is, 𝐴c is
the event that all 𝑘 students have different birthdays.
So what is |𝐴c |, the number of ways the 𝑘 students can have different birthdays?
Well, the first student can have any of the 365 days for their birthday. For them
to have different birthdays, the second student only has 364 days available.
Then the third student must avoid the birthday of students 1 and 2, so has 363
available days, and so on. We see that

|𝐴c| = 365 × 364 × ⋯ × (365 − 𝑘 + 1) = 365^{\underline{𝑘}}.

Hence, the probability at least two students share a birthday is

ℙ(𝐴) = 1 − ℙ(𝐴c) = 1 − 365^{\underline{𝑘}}/365^𝑘 = 1 − (365/365) ⋅ (364/365) ⋯ ((365 − 𝑘 + 1)/365).

Setting 𝑘 = 23, we can calculate the required answer in R:


k <- 23
1 - prod((365:(365 - k + 1)) / 365)

[1] 0.5072972
The probability is 50.7%. So it’s more likely than not that at least two students
share a birthday.
Some people find it surprising that only 23 students have such a high probability
of sharing a birthday, since 23 is so small compared to 365. But remember there
are \binom{23}{2} = 253 pairs of birthdays, and each of those 253 pairs is a potential
match.
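
To see how the probability changes with the class size, the calculation above can be wrapped in a function. This is a sketch only; the function name birthday_prob and the object probs are hypothetical, and the function uses the same formula as the R command above.

```r
# Probability that at least two of k students share a birthday
# (365 equally likely days, as assumed in Example 4.5)
birthday_prob <- function(k) {
  1 - prod((365:(365 - k + 1)) / 365)
}

# Probabilities for class sizes 1 to 60
probs <- sapply(1:60, birthday_prob)

birthday_prob(23)        # the answer from Example 4.5
min(which(probs > 0.5))  # smallest class size with probability over 1/2: 23
choose(23, 2)            # number of pairs among 23 students: 253
```

So 23 is the smallest class size for which a shared birthday is more likely than not.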

Summary
• Ordering 𝑛 objects can be done in 𝑛! = 𝑛^{\underline{𝑛}} = 𝑛(𝑛 − 1) ⋯ 2 ⋅ 1 ways.
• The number of ways to sample 𝑘 objects out of 𝑛 when the order doesn’t
matter is given by the binomial coefficient \binom{𝑛}{𝑘} = 𝑛^{\underline{𝑘}}/𝑘!.
Recommended reading:
• Stirzaker, Elementary Probability, Sections 3.2 and 3.3.
On Problem Sheet 2, you should now be able to complete all questions.
Problem Sheet 2

This is Problem Sheet 2. This problem sheet covers material from Lectures
1 to 4. You should work through all the questions on this problem sheet in
preparation for your tutorial in Week 3.

A: Short questions
A1. Let 𝐴, 𝐵 and 𝐶 be events in a sample space Ω. Write expressions for
the following events using only 𝐴, 𝐵, 𝐶 and the complement, intersection, and
union operations.
(a) 𝐶 happens but 𝐴 doesn’t.
(b) At least one of 𝐴, 𝐵 and 𝐶 happens.
(c) Exactly one of 𝐵 or 𝐶 happens.
A2. An urn contains 4 red balls and 6 blue balls. Two balls are drawn from
the urn. What is the probability that both balls are red, if the balls are drawn
(a) with replacement; (b) without replacement?
A3. Suppose we pick a number at random from the set {1, 2, … , 2000}.
(a) What is the probability that the number is divisible by 5?
(b) What is the probability the number is divisible by 5 or by 7?
A4. Suppose your tutorial group contains 12 students – you and 11 others. The
tutor wishes to choose 4 members of the group to present their work.
(a) How many ways can the tutor choose the presentation group?
(b) How many ways can the tutor choose the presentation group if you are one
of the presenters?
(c) How many ways can the tutor choose the presentation group if you are not
one of the presenters?
A5. An urn contains 15 balls: 4 red balls, 5 blue balls, and 6 green balls.
(a) If three balls are drawn with replacement, what is the probability that all
three balls are the same colour?
(b) If three balls are drawn without replacement, what is the probability that
all three balls are different colours?


A6. A “random digit” is a number chosen at random from {0, 1, … , 9}, each with
equal probability. A statistician chooses 𝑛 random digits (with replacement).
(a) For 𝑘 = 0, 1, … , 9, let 𝐴𝑘 be the event that all the digits are 𝑘 or smaller.
What is the probability of 𝐴𝑘 , as a function of 𝑘 and 𝑛?
(b) Let 𝐵𝑘 be the event that the largest digit chosen is equal to 𝑘. By finding
a relationship between 𝐵𝑘 , 𝐴𝑘−1 and 𝐴𝑘 , or otherwise, show that

ℙ(𝐵𝑘) = ((𝑘 + 1)^𝑛 − 𝑘^𝑛)/10^𝑛.

B: Long questions
In all of the following questions, let Ω be a sample space with a probability
measure ℙ, and let 𝐴, 𝐵 ⊂ Ω be events.
B1.
(a) Starting from just the three probability axioms, prove that
ℙ(∅) = 0.

(b) Let 𝐴 and 𝐵 be two events with ℙ(𝐴) = 0.8 and ℙ(𝐵) = 0.4. Prove the
upper bound ℙ(𝐴 ∩ 𝐵) ≤ 0.4. You may use any of the properties of probability
stated in the lecture notes.
(c) Prove that the upper bound in (b) can be achieved, by giving an example
of a sample space Ω, a probability measure ℙ and events 𝐴, 𝐵 ⊂ Ω such that
ℙ(𝐴) = 0.8, ℙ(𝐵) = 0.4 and ℙ(𝐴 ∩ 𝐵) = 0.4.
B2.
(a) Starting from just the three probability axioms, prove that
ℙ(𝐴c ) = 1 − ℙ(𝐴).

(b) Let 𝐴 and 𝐵 be two events with ℙ(𝐴) = 0.8 and ℙ(𝐵) = 0.4. Prove the
lower bound ℙ(𝐴 ∩ 𝐵) ≥ 0.2. You may use any of the properties of probability
stated in the lecture notes.
(c) Prove that the lower bound in (b) can be achieved, by giving an example
of a sample space Ω, a probability measure ℙ and events 𝐴, 𝐵 ⊂ Ω such that
ℙ(𝐴) = 0.8, ℙ(𝐵) = 0.4 and ℙ(𝐴 ∩ 𝐵) = 0.2.
B3.
(a) Starting from just the three probability axioms, prove that
ℙ(𝐴 ∩ 𝐵) + ℙ(𝐴 ∩ 𝐵c ) = ℙ(𝐴).

(b) Let 𝐴 and 𝐵 be two events with ℙ(𝐴) = 0.8 and ℙ(𝐵) = 0.4. Prove the
bounds 0.8 ≤ ℙ(𝐴 ∪ 𝐵) ≤ 1. You may use any of the properties of probability
stated in the lecture notes.
(c) Prove that the upper bound in (b) can be achieved, by giving an example
of a sample space Ω, a probability measure ℙ and events 𝐴, 𝐵 ⊂ Ω such that
ℙ(𝐴) = 0.8, ℙ(𝐵) = 0.4 and ℙ(𝐴 ∪ 𝐵) = 1.

Solutions to short questions


A1. (a) 𝐶 ∩ 𝐴c (b) 𝐴 ∪ 𝐵 ∪ 𝐶 (c) (𝐵 ∪ 𝐶) ∩ (𝐵 ∩ 𝐶)c or (𝐵 ∩ 𝐶c) ∪ (𝐵c ∩ 𝐶)
A2. (a) 4/25 = 0.16 (b) 2/15 = 0.133
A3. (a) 0.2 (b) 0.314
A4. (a) 495 (b) 165 (c) 330
A5. (a) 0.12 (b) 0.264
A6. (a) (𝑘 + 1)^𝑛/10^𝑛 (b) ((𝑘 + 1)^𝑛 − 𝑘^𝑛)/10^𝑛.
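
Some of these answers can be checked by direct counting in R once you have attempted the questions yourself. The sketch below checks A3 and A4; the object name numbers is an illustrative choice, and %% is R’s remainder operator.

```r
# A3: pick a number at random from 1, 2, ..., 2000
numbers <- 1:2000
mean(numbers %% 5 == 0)                      # (a) divisible by 5: 0.2
mean(numbers %% 5 == 0 | numbers %% 7 == 0)  # (b) divisible by 5 or 7: 0.314

# A4: choosing 4 presenters from 12 students
choose(12, 4)  # (a) any group: 495
choose(11, 3)  # (b) you plus 3 of the other 11: 165
choose(11, 4)  # (c) 4 of the other 11: 330
```

Note that the answers to A4(b) and A4(c) add up to the answer to A4(a), as they should.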
R Worksheets

Introduction to R

Each week of semester 1 (starting in Week 2) there will be an R worksheet to
work through in your own time. I recommend spending about one hour on each
worksheet. In week 4 you will be issued with a mini-project brief. This needs
to be completed and submitted before the deadline in week 10. To help you
with R, there will be two practical sessions, the first in week 2 and the second
in week 8.

What are R and RStudio?


• R is a programming language that is particularly useful for working with
probability and statistics. The R language is very widely used in univer-
sities and increasingly widely used in industry. Learning to use R is a
mandatory part of this module, and exercises requiring use of R make up
at least 15% of your module mark. Many other statistics-related modules
at the University also use R.
• RStudio is a computer program (or app) that gives a convenient way to
work with the language R. The RStudio program is made by the company
Posit. The program RStudio is the most common way to use the language
R, and learning to use RStudio is strongly recommended.
R and RStudio are free/open-source software.

How to use R and RStudio

There are a number of ways you can use R and RStudio:


1. On University computers. You will learn how to use R and RStudio on
University computers in your first practical session, in Week 2.
2. On your own computer. R and RStudio can easily be installed on Windows
and Mac laptops. Bring your laptop along to the first practical session to
learn how to install R and RStudio.
3. Using the Posit Cloud. The Posit Cloud is a way to use R and RStudio
online – sort of like a “Google Docs for R”. You can use it free for 25
hours a month, which should be plenty for this module, or pay for more.
I recommend the Posit Cloud for using R/RStudio with Chromebooks,
tablet computers, or when borrowing someone else’s device.


Accessing R and RStudio on University computers


R and RStudio can be used on University computers via the AppsAnywhere por-
tal. AppsAnywhere is the University of Leeds system for loading “unusual” pro-
grams (common programs like Microsoft Office and web browsers are preloaded).
There are three steps to using R and RStudio on University computers:
1. Open the AppsAnywhere portal.
2. Load the language R onto your computer.
3. Open the program RStudio.
First, open the AppsAnywhere portal by double-clicking on the desktop icon.
This will open a web browser, and invite you to “Open AppsAnywhere Launcher”
– you should accept and open. AppsAnywhere has loaded properly when the
blue “Validation in progress…” box turns into a green “Validation Successful”
box.
Second, launch R from AppsAnywhere. R is called “Cran R 4.2.0 x64” on
AppsAnywhere, so searching for “Cran” is an easy way to find it. Click “Launch”.
This will do two things. First, it will silently load the language R in the back-
ground. Second, it will open a program called “RGui”. RGui is basically like an
older and less good version of RStudio; we do not recommend using the RGui
program, so you can close it. (The R language will remain loaded.)
Third, launch RStudio from AppsAnywhere. The most recent version on
AppsAnywhere is “RStudio 2023 (03.0.386)”. Click “Launch”. After a few seconds,
RStudio will launch. (If invited to choose a version of the language R, pick
“64-bit”. If invited to update R or RStudio, decline.)
You need to repeat all three steps each time you log onto a University computer.

Installing R and RStudio


When you install R and RStudio, it’s important that you install the language
R first, and only install the program RStudio after the language R has already
been installed. This ensures that RStudio can “find” R on your computer.
1. First, install R. Go to the Comprehensive R Archive Network (CRAN)
and follow the instructions:
• Windows: Click “Download R for Windows”, then “Install R for the
first time”. The main link at the top should be to download the most
recent version of R.
• Mac: Click “Download R for macOS”, and then download the relevant
PKG file. (Most modern MacBooks are based on Apple’s M1 or M2
processors, so you can choose “Apple silicon arm64 build”. Some
older MacBooks, mostly 2020 or earlier, have Intel processors; for
these you should use the “Intel 64-bit build”.)
2. After R is installed, then install RStudio. Go to the “Download RStudio”
page at posit.co and follow the instructions. You want “RStudio Desktop”
and you want the free version, if given a choice.

Now, whenever you want to use R and RStudio, simply open the program RStudio.
(The language R will automatically be loaded on your computer.)
For Chromebooks, we recommend using the Posit Cloud, as mentioned above.
However, if you have an Intel-based Chromebook and are feeling brave, then we
have had success installing R and RStudio using these instructions, which are
long and complicated.
R Worksheet 1: R Basics

If you have difficulty with this Worksheet, you can get help at your R Practical
in Week 2.

This worksheet assumes you are using a computer with the programming lan-
guage R and the program RStudio installed.

Open RStudio. (On University computers, remember to load the language R
first.)

To enter commands, you should type them into the “Console” – this normally
takes up either the left half of the screen, or the bottom-left quarter. When you
open RStudio, there will be some information about R in the console, in black
writing, then an arrow > in blue. You can type commands next to this blue
arrow.

Exercise 1.1. Type 2 + 3 into the Console next to the arrow >,
and then press Enter. What happens?

In these worksheets, R commands for you to enter will look like this
2 + 3

We don’t write down the > in the worksheets, because you don’t have to type
the > either. But remember that you always type your commands next to the
blue arrow.

In these worksheets, the information returned by R will look like this:

[1] 5

The [1] 5 is what you should have seen as R’s response in Exercise 1.1. First,
the [1] just tells you that what follows is the first part of the answer – since
our answer here only has one part, we can ignore this. (It is useful when the
answer is a long list of values going over many lines.) Then the 5 is the answer we
want.

Whenever you see commands in this worksheet, you are strongly encouraged to
try them out yourself.


Using R as a calculator
The simplest way to use R is to use it to perform simple calculations for us
– like the 2 + 3 we saw earlier. R can perform addition +, subtraction -,
multiplication *, division / and powers ^. So, for example, we can find
(4 + 5) × 6 by entering into the Console:
(4 + 5) * 6

[1] 54
Spaces (with a few exceptions) are ignored by R, so 2+3 and 2 + 3 work equally
well – you can use whichever you find easier to read.
Exercise 1.2. Calculate:
(a) 943 − 242,
(b) 29 × 31,
(c) 2^{8+5},
(d) (19 + 21)/(5 × 3).
R also allows you to put linebreaks in any command that is obviously not yet
finished. So if you enter 943 - and press Enter, R will know that command
is unfinished, because you haven’t said what to subtract from 943 yet. The
next line will begin with a blue +, where you can continue the command. The
same thing happens if you have opened a pair of brackets without closing them.
But if you press Enter after just the 943 without the -, R will read that as a
complete (but rather boring) command to output simply the number 943.

Functions in R
R has an enormous number of useful functions. Let’s take as an example the
square-root function sqrt(). To use a function we need the function name,
then open brackets, then the thing we want to apply the function to, then close
brackets. For example, to find the square-root of 1000, we use
sqrt(1000)

[1] 31.62278
Other similar functions include the exponential exp(), the natural loga-
rithm log(), and the absolute value abs().
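
These functions are used in exactly the same way as sqrt(). The short sketch below shows one call to each; the particular input values are arbitrary illustrations.

```r
# Exponential function: exp(1) is the number e
exp(1)

# Natural logarithm: log() undoes exp(), so this gives 2
log(exp(2))

# Absolute value: drops the minus sign, giving 3.5
abs(-3.5)
```

Note that log() in R is the natural logarithm (base e), not the base-10 logarithm.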
A useful function is signif(), which rounds a number to a given number of
significant figures. This function takes two arguments, which should be separated
by a comma. The first argument is the number to be rounded, and the
second is the number of significant figures to round it to. To give √1000 to four
significant figures, we can put the sqrt() function into the signif() function
with either of the following:
signif(sqrt(1000), digits = 4)

[1] 31.62

signif(sqrt(1000), 4)

[1] 31.62
Putting digits = before the 4 is optional. I find it useful when reading through
my code, because it reminds me what the second argument does; but it is extra
typing, so you might not want to bother.
A similar function is round(), which rounds a number to a given number of
decimal places. But round() can be used either with one argument or with
two arguments. The first argument is always the number to be rounded. The
optional second argument (with or without digits =) is the number of decimal
places to round to. If no second argument is given, R will assume you wanted it
to be 0; that is, to round to 0 decimal places, or the nearest whole number. So
all of these commands do exactly the same thing – give √1000 to the nearest
whole number:
round(sqrt(1000), digits = 0)

[1] 32
round(sqrt(1000), 0)

[1] 32
round(sqrt(1000))

[1] 32
Exercise 1.3. Use R to find:
(a) 71 to 4 decimal places;
(b) √
log(10) to 3 significant figures;
(c) 712 + 34 to the nearest integer.

Objects in R
So far we haven’t done anything with R that you couldn’t have done with a
calculator. The real strength of R is using objects. An object allows us to save
a number (or anything else) calculated with R, and come back to use it later.

Let’s say we’ll want to come back to our special number √1000 again later.
Then instead of writing sqrt(1000) every time, we can save it as an R object
like this:
special <- sqrt(1000)

This tells us that we’re making an object with the name special, that we’re
assigning it a value – the assignment operator is a less-than sign < and a hyphen
- put together to look like an arrow <- pointing towards the object – and the
value we’re assigning it is sqrt(1000). (You can also use an equals sign =
instead of the assignment arrow <-; I prefer the <-, because it reminds me that
this isn’t the same as a mathematical equation, and because = is also used for
many other things in R too.)

Names of objects are case-sensitive, so if you want the object special, you can’t
write SPECIAL or Special. When doing mathematics, we usually give variables
a single letter like 𝑥 or 𝑦; but when we’re doing computer programming, we
usually find it helpful to use longer descriptive names, like temperature_data
or current_total.
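Here is a small sketch of case-sensitivity in action. It assumes a fresh R session, and the value 15 given to current_total is made up; exists() simply reports whether R knows an object of the given name.

```r
current_total <- 15        # hypothetical value, just for illustration
exists("current_total")    # TRUE: this is the name we used
exists("Current_Total")    # FALSE: different capitalisation means a different name
exists("CURRENT_TOTAL")    # FALSE
```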
So now R knows we have an object called special. To find the value of an
object, we simply type the name of that object into the Console and press Enter:
special

[1] 31.62278
Exercise 1.4. Create an object called john, and assign it the value
7. Then create an object called paul and assign it the value 12².
Then get R to tell you the value of paul multiplied by the value of
john.
When assigning objects you can use other objects. So we can create a
very_special number that is double our special number like this:
very_special <- 2 * special

We can also update an object in terms of itself. So to increase our very special
number by 10, we can write
very_special <- very_special + 10

This might seem bizarre – it would be nonsensical to write 𝑥 = 𝑥 + 10 as a
mathematical equation! But the key is to remember that <- means “assignment”; so
this command means “Assign to the new version of the object very_special
the old value of very_special plus 10.”
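The same idea can be traced with round numbers. This is only a sketch, and the object name counter and its starting value are made up for the illustration.

```r
counter <- 5               # hypothetical object with a made-up starting value
counter <- counter + 10    # the right-hand side uses the old value, 5
counter                    # the object now holds the new value, 15
```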
Exercise 1.5. This exercise continues with the objects assigned in
Exercise 1.4.
(a) Assign the value of paul multiplied by john to a new object called
ringo.
(b) Check the value of ringo.
(c) Double the value of ringo, keeping it still stored as ringo.
(d) Add 8 to the value of ringo.
(e) Check the new value of ringo. (It should be 2024.)

Saving your work


If you can’t complete an R worksheet in a single sitting (and maybe even if you
can), you’ll want to save your work until later.
The recommended thing to do is to save your commands, so that you can quickly
run them again if you need to. You can save your commands in an R Script. To
open a new R Script in RStudio, click File -> New File -> R Script from
the main menu. This will open a “notepad” style area, probably in the top-left
quarter of your screen. You can write R commands into this area, then click on
the Save button to save your work. (R Scripts are conventionally saved with the
suffix .R, like math1700-r1-solutions.R.)

To run commands from your R Script, you can just copy-and-paste them into
the Console. It can be even more convenient, though, to highlight the commands
then click Run in the top-right corner of the script area.
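Another option, not covered in this worksheet, is R's source() function, which runs every command in a saved file in one go. The sketch below is only an illustration: it writes two commands to a temporary file (standing in for an R Script you have saved yourself) and then runs that file.

```r
# Write two commands to a temporary file, standing in for a saved R Script
script_path <- tempfile(fileext = ".R")
writeLines(c("john <- 7", "paul <- 12^2"), script_path)

# Run every command in the file, as if each had been typed into the Console
source(script_path)
paul * john    # the objects created by the script are now available
```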
Saving the commands you used is better than just writing down your answers.
This way, you can quickly change your old commands, if you notice any small
mistakes later. Later, when we work with data, you can also use the same old
saved commands on new data.
Exercise 1.6. Write down the commands you used to solve Exercises
1.4 and 1.5 in a new R Script. Save your work with an explanatory
filename that will allow you to find it again later.
If you have an R Script that is just a bunch of commands copy-pasted into a
file, it might not be clear what they do. For example, which commands were
the answer to which exercise? It’s therefore helpful to annotate your work with
comments, to remind yourself what’s what. We normally prefix comments with
a hash-sign # like this:
# R Worksheet 1
# Exercise 1.4
john <- 7
paul <- 12^2
paul * john # When I ran this, I got the answer 1008

The reason we do this is that R ignores anything on a line that comes after
a #. This means that if you accidentally copy-and-paste the comment along
with the command, nothing will go wrong.
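A short sketch shows both kinds of comment at work. The object name total and the numbers are made up for the illustration.

```r
total <- 1 + 2    # R runs the command and ignores this comment
# total <- 100    # this whole line is a comment, so it is never run
total             # still 3
```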
Exercise 1.7. Continuing with your R Script from Exercise 1.6,
add comments to make it clear which commands are doing what,
then re-save your R Script.

You have now finished R Worksheet 1. If you had difficulty with any of the
exercises, remember that there are R Practicals in Week 2 where you can ask for
help.
You can now close RStudio. When you close RStudio, R might ask you if you
want to save unsaved work on your R Script (probably you do!). It may also ask
you if you want to save your workspace – that is, save all the objects you’ve
created while working on the worksheet. It’s considered best practice to not
save the workspace – it will end up getting cluttered up with all the objects you
create throughout the whole semester, which is unhelpful. Plus, since you’ve
already saved the commands you need in your R Script, you can afford to get rid
of the clutter and start RStudio afresh the next time – you can always quickly
re-run all the saved commands with one click of “Run” to get back to where you
were.
Solutions to R Worksheet 1 are available here.
R Worksheet 1: R Basics
(Solutions)

Here are my solutions to R Worksheet 1. These aren’t necessarily the only ways
to solve these problems.
Remember that you can get R help at the R Practicals in Week 2.

Exercise 1.1. Type 2 + 3 into the console next to the arrow, and
then press Enter. What happens?
Let’s try!
2 + 3

We see the answer [1] 5. First, the [1] just tells us that what follows is the
first part of the answer – since our answer here only has one part, we can ignore
this. (It is useful when the answer is a very long vector or a table of values.)
Then the 5 is the answer we want.
Exercise 1.2. Calculate:
(a) 943 − 242,
(b) 29 × 31,
(c) 2⁸⁺⁵,
(d) (19 + 21)/(5 × 3).
We calculate these as follows:
943 - 242

[1] 701
29 * 31

[1] 899
2^(8 + 5)

[1] 8192
(19 + 21) / (5 * 3)

[1] 2.666667


Exercise 1.3. Use R to find:
(a) 1/7 to 4 decimal places;
(b) log(10) to 3 significant figures;
(c) √(712 + 34) to the nearest integer.

We can find these as follows:


round(1 / 7, digits = 4)

[1] 0.1429
signif(log(10), digits = 3)

[1] 2.3
round(sqrt(712 + 34))

[1] 27

Exercise 1.4. Create an object called john, and assign it the value
7. Then create an object called paul and assign it the value 12².
Then get R to tell you the value of paul multiplied by the value of
john.

We do this as follows:
john <- 7
paul <- 12^2
paul * john

[1] 1008

Exercise 1.5. This exercise continues with the objects assigned in
Exercise 1.4.
(a) Assign the value of paul multiplied by john to a new object called
ringo.
(b) Check the value of ringo.
(c) Double the value of ringo, keeping it still stored as ringo.
(d) Add 8 to the value of ringo.
(e) Check the new value of ringo. (It should be 2024.)

We do this as follows:
ringo <- paul * john
ringo

[1] 1008
ringo <- 2 * ringo
ringo <- ringo + 8
ringo

[1] 2024

The answer is indeed 2024, as it should be.



Exercise 1.6. Write down the commands you used to solve Exercises
1.4 and 1.5 in a new R Script. Save your work with an explanatory
filename that will allow you to find it again later.
Exercise 1.7. Continuing with your R Script from Exercise 1.6,
add comments to make it clear which commands are doing what,
then re-save your R Script.
My R Script R1-solutions.R looked like this:
# MATH1700: R WORKSHEET 1
# MY SOLUTIONS
# Last updated: 7 October 2024

# Exercise 1.1
2 + 3 # Gets output "[1] 5", meaning the answer is 5

# Exercise 1.2
943 - 242
29 * 31
2^(8 + 5)
(19 + 21) / (5 * 3)

# Exercise 1.3
round(1 / 7, digits = 4)
signif(log(10), digits = 3)
round(sqrt(712 + 34))

# Exercise 1.4
john <- 7
paul <- 12^2
paul * john

# Exercise 1.5
ringo <- paul * john
ringo
ringo <- 2 * ringo
ringo <- ringo + 8
ringo # Answer is 2024, as it should be

# Exercise 1.6 created this R Script

# Exercise 1.7 added comments to this R Script


Solutions

Solutions and group
feedback

This page will be used to publish the solutions to all the Problem Sheet questions.
Solutions are added after all tutorials on a Problem Sheet have finished.
There are many ways you get feedback on this module, both group feedback
(feedback that is generally relevant to many people) and individual feedback
(feedback based specifically on your own approach to the work).
• You will have received both individual and group spoken feedback in your
tutorial (the more you speak up in your tutorial, the more individualised
the feedback you get in return).
• These solutions include group written feedback on common issues for the
class.
• Most importantly, when your work on class quiz questions is marked, indi-
vidual written feedback will be given. It is very important that you read
that feedback.