Elementary Statistics and Probability
Unit 1
Basic Concepts and Terminologies in Statistics
What is Statistics?
Statistics
Statistics is a science that deals with the collection, presentation,
analysis and interpretation of data.
Statistics is a collection of methods for planning experiments,
obtaining data and then organizing, summarizing, presenting,
analyzing, interpreting and drawing conclusions based on the
data.
Identify which of the following questions are answerable using a
statistical process.
What is the ratio of teachers to students in secondary public schools in
Cagayan Province?
(Requires statistical process)
What is the smallest bone in a human body?
(Does not require statistical process)
What seminars and trainings do teachers of CSU need for the next five (5)
years?
(Requires statistical process)
Is planet Mars bigger than planet Earth?
(Does not require statistical process)
Who have better study habits, teacher education students or medical
technology students?
(Requires statistical process)
Objectives of Statistics
Practitioners need to understand statistics:
• To know how to properly present and describe information,
• To know how to draw conclusions about large populations
  based only on information obtained from samples,
• To know how to solve problems and make sensible,
  valid, and reliable decisions on the basis of the statistical
  analysis conducted.
Main Objective of Statistics
To help us make wise decisions.
Decision-making is an important part of our lives. Everybody
makes decisions almost every day.
For instance,
• Students decide on what course they would take in college
  that could give them a high salary and a better future.
• Mothers decide on what brand of milk to buy.
• Business-minded people think about whether to put their money in
  the bank or to open a business or a factory.
Descriptive and Inferential Statistics
Descriptive Statistics summarizes or describes the important
characteristics of a known set of data.
Inferential Statistics uses sample data to make inferences about a
population. It consists of generalizing from samples to populations,
performing hypothesis testing, determining relationships among
variables, and making predictions.
Descriptive Statistics are statistical procedures used for
summarizing, organizing, graphing and describing univariate data.
Examples: frequencies, percentages, measures of central tendency,
measures of variation, and cross tabulations.
Inferential Statistics are statistical procedures that allow one to
draw inferences to the population on the basis of sample data. In
particular, these statistics test for statistical significance of
results, i.e. statistically significant relationships between
variables, or statistically significant differences between two or
more groups. In quantitative data analysis, there are several ...
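The descriptive measures listed above (frequencies, percentages, central tendency, variation) can be sketched in a few lines of Python. This is only an illustration: the quiz scores are hypothetical, and the choice of the range and the population standard deviation as measures of variation is an assumption, not something the slides prescribe.

```python
from collections import Counter
from statistics import mean, median, mode, pstdev

# Hypothetical quiz scores for ten students (illustrative data only).
scores = [7, 8, 8, 9, 10, 6, 8, 7, 9, 8]

# Frequencies and percentages of each distinct score.
frequencies = Counter(scores)
percentages = {score: 100 * count / len(scores)
               for score, count in frequencies.items()}

# Measures of central tendency.
center = {"mean": mean(scores), "median": median(scores), "mode": mode(scores)}

# Measures of variation: range and population standard deviation.
variation = {"range": max(scores) - min(scores), "std_dev": pstdev(scores)}

for score in sorted(frequencies):
    print(f"score {score}: n = {frequencies[score]}, {percentages[score]:.0f}%")
print(center)
print(variation)
```

Everything computed here only describes the ten scores at hand; no claim is made about any larger group, which is what separates it from inferential statistics.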
Population Vs. Sample
A POPULATION is a complete collection of all elements (scores,
people, measurements) to be studied.
A SAMPLE is a portion or sub-collection of elements drawn from a
population.
Determining Adequate Sample Size
To determine the sample size from a given population size,
Slovin’s formula is used:

\[ n = \frac{N}{1 + N e^{2}} \]

where n = sample size
      N = population size
      e = margin of error
Solve!
 1. A group of researchers will conduct a survey to find out the
     opinion of residents of a particular community regarding the
     oil price hike. If there are 10,000 residents in the community
     and the researchers plan to use a sample using a 10%
     margin of error, what would be the sample size?
 Solution: Here, N = 10 000 and e = 10% or 0.10. Substituting the given
 values in the formula, we have

\[ n = \frac{10\,000}{1 + (0.10)^{2}(10\,000)}
     = \frac{10\,000}{1 + 0.01(10\,000)}
     = \frac{10\,000}{101} \]

\[ n = 99.01 \text{ or } 99 \]
Solve!
 2. Suppose that in Example 1, the researchers would like to
 use a 5% margin of error. What should be the size of the
 sample?
 Solution: Here, N = 10 000 and e = 5% or 0.05.
 Substituting the given values in the formula, we have

\[ n = \frac{10\,000}{1 + (0.05)^{2}(10\,000)}
     = \frac{10\,000}{1 + 0.0025(10\,000)} \]

\[ n = 384.62 \text{ or } 385 \]
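The two worked examples can be checked with a short Python sketch of Slovin's formula. The function name `slovin_sample_size` and the input checks are assumptions made for illustration. The final step uses ordinary rounding to the nearest whole number, which is what the worked answers (99 and 385) appear to use.

```python
def slovin_sample_size(population_size: int, margin_of_error: float) -> float:
    """Return the unrounded sample size n = N / (1 + N * e**2)."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must be a proportion between 0 and 1")
    return population_size / (1 + population_size * margin_of_error ** 2)


# Examples 1 and 2: N = 10 000 with e = 10% and e = 5%.
for e in (0.10, 0.05):
    n = slovin_sample_size(10_000, e)
    print(f"e = {e:.2f}: n = {n:.2f}, rounded to {round(n)}")
```

Halving the margin of error from 10% to 5% roughly quadruples the required sample, because e enters the formula squared.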
Parameter Vs. Statistic
A PARAMETER is a numerical measurement describing some
characteristic of a population. It is summary data from a population.
A STATISTIC is a numerical measurement describing a
characteristic of a sample. It is summary data from a sample.
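To make the distinction concrete, the sketch below computes the same measurement (a mean) once on a whole population and once on a sample drawn from it. The population of ages is simulated and entirely hypothetical, and the sample size of 50 is an arbitrary choice for illustration.

```python
import random
from statistics import mean

# Hypothetical population: simulated ages of 1,000 residents.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.randint(18, 80) for _ in range(1000)]

# Parameter: a summary of the entire population.
parameter = mean(population)

# Statistic: the same summary computed on a simple random sample.
sample = rng.sample(population, 50)
statistic = mean(sample)

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

The statistic will usually be close to the parameter without being equal to it, and it changes from sample to sample while the parameter stays fixed.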
Data Vs. Variable
DATA
• measurements or observations of a variable
• the word “data” is plural, datum is singular.
• a collection of data is often called a data set (singular)
VARIABLE
• a characteristic that is observed or manipulated
• can take on different values
Classifications of Variables
Qualitative Data Vs. Quantitative Data
QUALITATIVE DATA (categorical) can be separated into different
categories that are distinguished by some nonnumeric characteristics.
QUANTITATIVE DATA (numerical) consist of numbers representing
counts or measurements.
Examples of QUALITATIVE DATA
• smoking status
• physical activity at home
• cause of death
• nationality
• race
• gender
• severity of pain
Examples of QUANTITATIVE DATA
• weight
• body mass index
• blood glucose
• survival time
• systolic blood pressure
• number of children in a family
Classifications of Variables
Discrete Data Vs. Continuous Data
DISCRETE DATA result from either a finite number of possible values
or a countable number of possible values, such as 0, 1, 2, and so on.
CONTINUOUS DATA result from infinitely many possible values that
can be associated with points on a continuous scale in such a way
that there are no gaps or interruptions.
Examples of DISCRETE DATA
• the number of eggs that hens lay
• number of pregnancies
• number of missing teeth
Examples of CONTINUOUS DATA
• the amounts of milk that cows produce
• duration of a seizure
• body mass index
• height
Dependent Variable Vs. Independent Variable
DEPENDENT VARIABLE
• the variable that is being affected or explained
• what is measured as an outcome in a study
• values depend on the independent variable
INDEPENDENT VARIABLE
• the variable that affects or explains
• precedes the dependent variable in time
• is often manipulated by the researcher
• the treatment or intervention that is used in a study
Example
Research Title: Effect of crime rates on tourist arrivals
Crime rates represent the independent variable, and tourist
arrivals represent the dependent variable.
Example
Research Title: Enhancing Students’ Performance in
Analytic Geometry Through GeoGebra Software
Independent variable – Use of GeoGebra software in teaching
Analytic Geometry (AG)
Dependent variable – Students’ performance in AG (grades or test
scores)
Levels of Measurement
• The NOMINAL LEVEL of measurement is characterized by
  data that consist of names, labels, or categories only.
• A nominal scale is the 1st level of measurement scale,
  in which the numbers serve as “tags” or “labels” to
  classify or identify the objects.
• A nominal scale usually deals with non-numeric
  variables or with numbers that carry no quantitative
  value.
Characteristics of Nominal Scale
• A nominal scale variable is classified into two or more
  categories. In this measurement mechanism, each answer
  should fall into one of the classes.
• It is qualitative. The numbers are used here to identify the
  objects.
• The numbers don’t define the object characteristics. The
  only permissible aspect of numbers in the nominal scale
  is “counting.”
Nominal Level
Example:
• Gender: Male, female
• Eye color: Blue, green, brown
• Hair color: Blonde, black, brown, grey, other
• Blood type: O-, O+, A-, A+, B-, B+, AB-, AB+
• Political Preference: Republican, Democrat, Independent
• Place you live: City, suburbs, rural
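Since counting is the only permissible operation on nominal data, the natural summary is a frequency table and the mode. A minimal sketch, assuming a hypothetical list of blood types:

```python
from collections import Counter

# Hypothetical blood types of eight people (illustrative data only).
# The labels can be counted, but not ordered or averaged.
blood_types = ["O+", "A+", "O+", "B+", "AB-", "O+", "A+", "O-"]

# Frequency of each category.
counts = Counter(blood_types)

# The mode is the most frequent category.
mode_label, mode_count = counts.most_common(1)[0]

print(dict(counts))
print(f"mode: {mode_label} ({mode_count} of {len(blood_types)})")
```

Asking for the mean or median of these labels would be meaningless, because the categories have no numeric value and no order.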
Levels of Measurement
• The ORDINAL LEVEL of measurement involves data that
  may be arranged in some order but differences between
  data values either cannot be determined or are
  meaningless.
• The ordinal scale is the 2nd level of measurement that
  reports the ordering and ranking of data without
  establishing the degree of variation between them.
• Ordinal represents the “order.”
• Ordinal data is known as qualitative data or categorical
  data. It can be grouped, named and also ranked.
Characteristics of the Ordinal Scale
• The ordinal scale shows the relative ranking of the
  variables
• It identifies and describes the magnitude of a variable
• Along with the information provided by the nominal scale,
  ordinal scales give the rankings of those variables
• The interval properties are not known
• The surveyors can quickly analyse the degree of
  agreement concerning the identified order of variables
Ordinal Level
Example:
• Ranking of school students – 1st, 2nd, 3rd, etc.
• Satisfaction ratings in restaurants: very unsatisfied, unsatisfied,
  satisfied, very satisfied
• Evaluating the frequency of occurrences: very often, often,
  not often, not at all
• Assessing the degree of agreement: totally agree, agree
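Because ordinal categories can be ranked but not meaningfully subtracted, the median category is a sensible summary. The sketch below assumes hypothetical restaurant ratings on the four-level satisfaction scale from the example. The integer ranks exist only to encode order, and `median_low` is used so that the result is always one of the observed categories.

```python
from statistics import median_low

# Ordered categories, lowest to highest.
LEVELS = ["very unsatisfied", "unsatisfied", "satisfied", "very satisfied"]
rank = {label: position for position, label in enumerate(LEVELS)}

# Hypothetical responses from seven customers (illustrative data only).
responses = ["satisfied", "very satisfied", "unsatisfied", "satisfied",
             "very unsatisfied", "satisfied", "very satisfied"]

# Order comparisons are meaningful on an ordinal scale.
print(rank["unsatisfied"] < rank["satisfied"])

# The median is found by rank, then mapped back to its label.
median_label = LEVELS[median_low(rank[r] for r in responses)]
print(f"median rating: {median_label}")
```

The gap between rank 0 and rank 1 is not assumed to equal the gap between rank 2 and rank 3, which is why the ranks are never averaged here.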
Levels of Measurement
• The INTERVAL LEVEL of measurement is like the ordinal
  level, but meaningful amounts of differences can be
  determined. It has no inherent (natural) zero starting
  point.
• The interval scale is the 3rd level of measurement scale.
• It is defined as a quantitative measurement scale in
  which the difference between two values is
  meaningful.
• In other words, the variables are measured in an exact
  manner rather than a relative one, and the position
  of zero is arbitrary.
Characteristics of Interval Scale:
• The interval scale is quantitative, as it can quantify the
  difference between the values
• It allows calculating the mean and median of the variables
• To understand the difference between the variables, you
  can subtract one value from another
• The interval scale is the preferred scale in Statistics as
  it helps to assign numerical values to arbitrary
  assessments such as feelings, calendar types, etc.
Interval Level
Example:
• Temperature: Measured in Fahrenheit or Celsius
• Credit Scores: Measured from 300 to 850
• SAT Scores: Measured from 400 to 1,600
• IQ
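The temperature example shows why an arbitrary zero matters. In the sketch below (the two temperatures are hypothetical), the difference between readings converts consistently between Celsius and Fahrenheit, but the ratio of the readings depends on which unit is used, so a statement such as "twice as hot" carries no meaning on an interval scale.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a Celsius reading to Fahrenheit."""
    return celsius * 9 / 5 + 32


# Two hypothetical temperature readings.
cool_c, warm_c = 10.0, 20.0
cool_f, warm_f = celsius_to_fahrenheit(cool_c), celsius_to_fahrenheit(warm_c)

# Differences are meaningful: they scale by the same factor (9/5) in any unit.
diff_c = warm_c - cool_c
diff_f = warm_f - cool_f
print(f"difference: {diff_c} degrees C = {diff_f} degrees F")

# Ratios are not meaningful: they change with the unit's arbitrary zero.
ratio_c = warm_c / cool_c
ratio_f = warm_f / cool_f
print(f"ratio in Celsius: {ratio_c}, ratio in Fahrenheit: {ratio_f:.2f}")
```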
Levels of Measurement
• The RATIO LEVEL of measurement is the interval level
  modified to include the inherent zero starting point.
• The ratio scale is the 4th level of measurement scale,
  which is quantitative.
• It is a type of variable measurement scale.
• It allows researchers to compare the differences or
  intervals.
• The ratio scale has a unique feature: it possesses a
  true origin, or zero point.
Characteristics of Ratio Scale:
• Ratio scale has a feature of absolute zero
• It doesn’t have negative numbers, because of its zero-point
  feature
• It affords unique opportunities for statistical analysis.
  The variables can be added, subtracted, multiplied, and
  divided. Mean, median, and mode can be calculated using the
  ratio scale.
• Ratio scale has unique and useful properties. One such
  feature is that it allows unit conversions like kilogram –
Ratio Level
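As a small illustration of the ratio level, the sketch below shows that a ratio of two weights survives a unit conversion, because zero means "none" in every unit. The weights and the choice of kilograms and grams are assumptions made for this example.

```python
GRAMS_PER_KILOGRAM = 1000

# Two hypothetical weights in kilograms.
light_kg, heavy_kg = 40.0, 80.0

# The same weights expressed in grams.
light_g = light_kg * GRAMS_PER_KILOGRAM
heavy_g = heavy_kg * GRAMS_PER_KILOGRAM

# On a ratio scale, "twice as heavy" holds in any unit.
ratio_kg = heavy_kg / light_kg
ratio_g = heavy_g / light_g
print(f"ratio in kilograms: {ratio_kg}, ratio in grams: {ratio_g}")
```

This is the property that interval scales such as Celsius temperature lack: their zero is a convention, so ratios change when the unit changes.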
Levels of Measurement

Measurement scale   Permissible mathematical operations   Best measure of central tendency
Nominal             Counting                              Mode
Ordinal             Greater or less than operations       Median
Interval            Addition and subtraction              Symmetrical – Mean; Skewed – Median
Ratio               Addition, ...                         Symmetrical – ...