Stats Lec01

Uploaded by

zhaoyixue116

HUDM 4122- Statistics

Lecture 01 – Background Information

HUDM4122: Probability and Statistical Inference

The purpose of this course is to give you some theoretical background and to deepen your
understanding of more applied (i.e., directly useful) material.

This course assumes that you have already had a previous statistics course as an undergraduate.
Specifically, you need to have reasonable knowledge of algebra, calculating means (averages), and
reading and interpreting graphs. Here is a brief introductory lecture on the materials that we presume
you already know.

Notation

Scientific Notation: Because your calculator has a limited amount of space to report numbers, you may
find that it returns a value that looks something like this: 2.35E4 or this: 2.35E-4. This is scientific
notation, which tells you to move the decimal point a certain number of places to the right (2.35E4 is
really $2.35 \times 10^{4}$, or 23,500) or to the left (2.35E-4 is really $2.35 \times 10^{-4}$, or 0.000235).
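As a quick check of the two conversions above, here is a minimal Python sketch. It assumes only that you have a Python interpreter available; the variable names `large` and `small` are made up for illustration.

```python
# Python's float() reads the same E-notation that a calculator displays.
large = float("2.35E4")   # 2.35 x 10^4: decimal point moves 4 places right
small = float("2.35E-4")  # 2.35 x 10^-4: decimal point moves 4 places left

print(large)            # 23500.0
print(f"{small:.6f}")   # 0.000235

# Going the other way: format an ordinary number in E-notation.
print(f"{23500:.2E}")   # 2.35E+04
```

The `E+04` in the last line is the same thing as the calculator's `E4`; Python just pads the exponent to two digits.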

Factorial Notation: n! is called “n factorial” and requires you to multiply n by all of the lesser
positive integers down to 1. For example, 5! = 5*4*3*2*1 = 120, and 4! = 4*3*2*1 = 24.
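The multiplication described above can be sketched in a few lines of Python. The helper name `factorial` is hypothetical and written out only to mirror the definition; the standard library already provides `math.factorial`, which is used here as a cross-check.

```python
import math


def factorial(n: int) -> int:
    """Multiply n by every smaller positive integer down to 1."""
    product = 1
    for k in range(n, 0, -1):  # n, n-1, ..., 2, 1
        product *= k
    return product


print(factorial(5))       # 120
print(factorial(4))       # 24
print(math.factorial(5))  # 120, the built-in version agrees
```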

Subscripts: $X_i$ = i-th entry (observation) of a dataset, X.

The “i” is an index of the position of a point in a data set.
If X is a dataset that has 3 values, 6, 3, and 7, they are denoted as:
$X_1 = 6$
$X_2 = 3$
$X_3 = 7$
6 is the first (i = 1) value in the list, 3 is the second (i = 2), and 7 is the third (i = 3). Subscript notation is most
useful when combined with:
Summation Notation: $\sum_{i=1}^{n} X_i$

The Greek letter ∑ tells us to add up a bunch of numbers. Sometimes all we have to do is add up
a set of numbers. Other times, we have to do something to each individual value in the set before
adding it up. For example, here we are expected to square each value of X before adding it up:

$\sum_{i=1}^{n} X_i^2$

Remember “Please Excuse My Dear Aunt Sally”? It is a mnemonic for the order of operations: Parentheses,
Exponents, Multiplication, Division, Addition, Subtraction. You follow the order of operations in these
summations, so each value is squared (an exponent) before the values are added.

Example:
i (index)   $X_i$   $X_i^2$
1           6       36
2           3       9
3           7       49
4           0       0
5           6       36
6           9       81
7           0       0
8           1       1
Sum         $\sum X_i = 32$   $\sum X_i^2 = 212$
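The two column totals in the example can be reproduced with a short Python sketch. The variable names are assumptions chosen for readability. The last line is included to show why the order of operations matters: squaring each value and then adding is not the same as adding and then squaring.

```python
x = [6, 3, 7, 0, 6, 9, 0, 1]  # the eight X_i values from the example

sum_x = sum(x)                            # add the values as they are
sum_x_squared = sum(xi ** 2 for xi in x)  # square each value, then add
square_of_sum = sum(x) ** 2               # add first, then square (a different quantity)

print(sum_x)          # 32
print(sum_x_squared)  # 212
print(square_of_sum)  # 1024
```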

Descriptive Statistics - Concerned with the presentation, summarization and description of data.

Ex: In Spring semester of 2008, 138 adult males from an urban college in the northeast were asked to
list their height in inches. The responses are summarized below:
1. Graphical Summary - height bar graph

2. Tabular (Table) Summary -

Height in Inches Frequency

60 - 63.5 20
64 - 67.5 46
68 - 71.5 45
72 - 75.5 22
76 - 79.5 5
Total: 138

3. Numerical Summary: Measure of Center

a) Mean: the arithmetic average of a set of measurements - the sum of the
measurements divided by the total number of measurements (not the true
center)

$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}$

(Refer to the summation notation section. You add up a set of numbers and
then divide by n, which is the total number of data points you have.)

b) Median: the median of a set of n measurements is the middle value when
the measurements are arranged in ascending order.

Order the data. If n is odd, the median is the unique middle value. If n is
even, the median is the average of the two middle numbers.

c) Mode: the value in the data set that occurs with the greatest frequency.
Note: If no value occurs more often than the others, the data set has no
mode. If two values are tied for the highest frequency, the data are
called bimodal. If more than two values are tied for the highest
frequency, the data are called multimodal.

4. Numerical Summary: Measure of Dispersion (Spread)

a) Range - the difference between the largest and smallest values in a dataset.
Depends on only 2 values. (extremes). It is a poor measure because the
extremities are not typical of the total variability in your dataset.

b) Variance - the measure of variability of the scores from a dataset

Sample Variance - sometimes called the sample estimate of the variance.

$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}$

Follow the summation notation order of operations rules.

c) Standard deviation – positive square root of variance

$s = \sqrt{s^2}$

Example 1: The ages of a sample of students in an American literature class are as follows:
{20, 19, 65, 20, 21, 18, 17, 20, 19}

Q1: Compute the Mean, Median, and Mode of these ages:

Mean = $\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} = 219/9 = 24.33$

Median = (order the data first)
{17, 18, 19, 19, 20, 20, 20, 21, 65} Then find the middle number = 5th position = 20 years

Mode = most common age, 20 years
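The three answers in Example 1 can be checked with Python's standard `statistics` module. This is only a sketch for verification, not part of the original lecture; `multimode` requires Python 3.8 or later and returns a list so that ties are reported too.

```python
import statistics

ages = [20, 19, 65, 20, 21, 18, 17, 20, 19]

mean_age = statistics.mean(ages)      # sum of the ages divided by n = 9
median_age = statistics.median(ages)  # middle (5th) value after sorting
modes = statistics.multimode(ages)    # every value tied for most frequent

print(sorted(ages))        # [17, 18, 19, 19, 20, 20, 20, 21, 65]
print(round(mean_age, 2))  # 24.33
print(median_age)          # 20
print(modes)               # [20]
```

Notice that the single 65-year-old pulls the mean up to 24.33 while the median and mode stay at 20.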

Example 2: Diners were asked to rate a meal in a restaurant on a scale from 0 to 10.
Here are the results:
{6, 3, 7, 0, 6, 9, 0, 1} Order the data: {0, 0, 1, 3, 6, 6, 7, 9}

Q1: Find the Mean, Median, and Mode for the sample: Mean = 4, Median = 4.5, Modes = 0 and 6 (This
distribution is bimodal because two values are tied for most common.)
Q2: Find the sample variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1} = \frac{84}{7} = 12$

i (index)   $X_i$   $X_i - \bar{X}$   $(X_i - \bar{X})^2$
1           0       -4                16
2           0       -4                16
3           1       -3                9
4           3       -1                1
5           6       2                 4
6           6       2                 4
7           7       3                 9
8           9       5                 25

$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i = 4$        $\sum_{i=1}^{n} (X_i - \bar{X})^2 = 84$


Q3: Find the sample estimate of the standard deviation. $s = \sqrt{12} \approx 3.464$
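The variance and standard deviation in Example 2 can be reproduced step by step in Python. The sketch below follows the sample variance formula directly (divide by n - 1) and then compares the result with the standard library's `statistics.variance`, which uses the same n - 1 divisor. The variable names are assumptions made for illustration.

```python
import math
import statistics

ratings = [6, 3, 7, 0, 6, 9, 0, 1]
n = len(ratings)

mean = sum(ratings) / n                                  # 32 / 8
squared_deviations = [(x - mean) ** 2 for x in ratings]  # (X_i - mean)^2 for each value
sample_variance = sum(squared_deviations) / (n - 1)      # divide by n - 1, not n
sample_sd = math.sqrt(sample_variance)                   # positive square root

print(mean)                     # 4.0
print(sum(squared_deviations))  # 84.0
print(sample_variance)          # 12.0
print(round(sample_sd, 3))      # 3.464

# Cross-check against the standard library.
assert statistics.variance(ratings) == sample_variance
```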

Inferential Statistics - Concerned with using sample data to make an inference about a population of
data
A. Population - complete collection of all elements of interest in a particular study.
B. Sample - part of the population that is assumed to be representative of the population

Types of Statistical Studies

A. Experimental Study - one or more factors in the study are controlled for so data can be
obtained on how the factors influence the variables.

B. Non-experimental Study (Observational Study) - Do not attempt to control for the

influences of factors on the variables of interest. Observing events as they are in "nature."

Some vocabulary

A. Data - values that are collected, analyzed, and summarized

B. Data set - collection of data for a particular study
C. Subjects/Participants - entities on which data are collected (students)
D. Variable - Characteristic of interest (column)
E. Observation - set of measures collected on a particular subject or participant. (rows)

Types of Data

A. Quantitative Data - Observations measured on a numerical scale. These types of data

indicate how much or how many of something

B. Qualitative Data - Non-numerical data that can be classified into groups or categories.
These types of data are labels or names used to identify an attribute for each subject.

Scales of Measurement - Assignment of numerals to objects or events

A. Nominal Scale: Distinguish one object or event from another on the basis of a name. The
observations for the variable are labels that identify an attribute (gender, occupation, major,
etc.)

B. Ordinal Scale: Based on the relative amounts of some characteristic. This data can be rank
ordered. (taste preferences: bad, good, excellent)

C. Interval Scale: When objects or events can be distinguished from one another and ranked,
and when the differences between measurements have meaning. There is a fixed unit of
measurement (temperature)

D. Ratio Scale: When measurements have the properties of the previous three scales plus the
additional property that their ratios are meaningful. The zero point here is inherently defined
and must mean "nothing." (weight, price, area)

Methods of Describing Qualitative Data

Tabular Methods (Table)
Frequency - the frequency for a category is the number of observations that fall in that
particular category

Frequency distribution - a tabular summary showing the frequency of items in each of

the non-overlapping classes or categories

Relative Frequency - the relative frequency for a category is the frequency divided by
the total number of observations, n.

Relative Frequency distribution - tabular summary showing the relative frequencies in

each of the non-overlapping categories.

Example: A survey was taken in an intro to psych class from University College asking students what their
major is. Here is a tabular summary of the results:

Major Frequency Relative Frequency

1 = Psychology 18 18/50 = 0.36
2 = Biology 5 0.10
3 = Economics 9 0.18
4 = Communications 15 0.30
5 = Chemistry 3 0.06
Total n = 50 1.00

Q1: What is the most common major for the students in this class? (Psychology)
Q2: What percentage of students are chemistry majors? (6%)
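The relative frequency column and the two answers above can be computed from the raw counts. This is a minimal Python sketch; the dictionary `counts` simply restates the frequency column of the table, and the other names are assumptions.

```python
counts = {
    "Psychology": 18,
    "Biology": 5,
    "Economics": 9,
    "Communications": 15,
    "Chemistry": 3,
}

n = sum(counts.values())  # total number of observations
relative = {major: freq / n for major, freq in counts.items()}

# Print a small frequency / relative frequency table.
for major, freq in counts.items():
    print(f"{major:<15}{freq:>4}{relative[major]:>7.2f}")

most_common = max(counts, key=counts.get)  # category with the largest frequency
print(most_common)                         # Psychology
print(f"{relative['Chemistry']:.0%}")      # 6%
```

The relative frequencies always add up to 1 (apart from rounding), which is a useful check on a table you build by hand.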

Graphical Methods (Graph)

Bar Graph - The categories or classes are on the horizontal axis and the frequency for each class is on the
vertical axis. The bars should be of equal width and the spaces between them should also be of equal
width. The height of the bar is the frequency of the class. You should always start with the vertical axis
at 0.

Pie Chart - Circular chart used to represent the relative frequency distribution. The relative
frequency distribution is used to subdivide the circle into sectors whose size corresponds to the
relative frequency of each category.

[Pie chart of majors: Psychology 36%, Communications 30%, Economics 18%, Biology 10%, Chemistry 6%]

Methods of Describing Quantitative Data

Tabular Methods (Table)

Frequency – the frequency for a class (range of data) is the number of observations that
fall in that particular range

Relative Frequency – the relative frequency for a class is the frequency divided by the
total number of observations, n.

Cumulative Frequency - the cumulative frequency for a class is the total number of
values less than or equal to the upper class limit of that class.

Cumulative Relative Frequency Distributions - the cumulative frequency divided by the

total number of observations, n.

Example: A frequency distribution table for the scores on a midterm exam from the students in an intro
psych class.

Test Score Frequency Relative Freq Cumulative Freq Cumulative Rel. Freq.
10 - 29 11 0.22 11 0.22
30 - 49 2 0.04 13 0.26
50 - 69 13 0.26 26 0.52
70 - 89 15 0.30 41 0.82
90 - 109 6 0.12 47 0.94
110 - 130 3 0.06 50 1.00
Total 50 1.00

Q1: How many students scored less than 70? 26

Q2: How many students scored at least 70? 24
Q3: Proportion at most 49? 26%
Q4: Proportion at least 90? 18%
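The cumulative columns and the four answers can be derived from the frequency column alone. The Python sketch below uses `itertools.accumulate` for the running totals; the list and variable names are assumptions for illustration.

```python
from itertools import accumulate

classes = ["10 - 29", "30 - 49", "50 - 69", "70 - 89", "90 - 109", "110 - 130"]
freqs = [11, 2, 13, 15, 6, 3]

n = sum(freqs)                         # 50 students in total
cumulative = list(accumulate(freqs))   # running total, e.g. 26 + 15 = 41 for 70 - 89
cum_rel = [c / n for c in cumulative]  # cumulative relative frequency

for label, f, c, cr in zip(classes, freqs, cumulative, cum_rel):
    print(f"{label:<10}{f:>4}{f / n:>7.2f}{c:>5}{cr:>7.2f}")

below_70 = cumulative[2]                 # Q1: classes up to 50 - 69
at_least_70 = n - below_70               # Q2: everyone else
at_most_49 = cum_rel[1]                  # Q3: classes up to 30 - 49
at_least_90 = (freqs[4] + freqs[5]) / n  # Q4: the top two classes

print(below_70, at_least_70)                   # 26 24
print(f"{at_most_49:.0%} {at_least_90:.0%}")   # 26% 18%
```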

Graphical Methods

Histogram - bar graph of the frequency distribution. The vertical axis identifies the
frequencies for each class, and the horizontal axis is used for identifying the lower class
limits. The bars must touch.

[Histogram of the midterm scores. Horizontal axis: Test Scores, in classes 10 - 29, 30 - 49, 50 - 69, 70 - 89, 90 - 109, 110 - 130. Vertical axis: frequency.]
