Lesson 1
Introduction
Meaning of Statistics
Two Divisions of Statistics
Uses of Statistics
Population and Sample
Classification of Variables
STATISTICS – is a branch of mathematics that deals with the collection,
organization, analysis, and interpretation of quantitative data and such problems as
experiment design and decision making.
Processes Involving Statistics:
Collection of Data. This is the process of gathering information through the direct
(interview), indirect (questionnaire), observation, registration, and experiment
methods.
Tabulation or presentation of data. This is the process of organizing data into
texts, tables, charts or graphs.
Analysis of data. This involves the process of extracting relevant information
from the organized collected data. Statistical techniques are needed in this
process.
Interpretation of data. This is the process of drawing conclusions from the
analyzed data. It involves the formulation of conclusions about a large group
based on the data gathered from a small group.
Steps in Statistical Inquiry
1. Choosing the Problem and Stating the Hypothesis
2. Formulating the Research Design
3. Data Collection
4. Coding the Data
5. Processing and Analyzing Data
6. Interpreting Results
Two Divisions of Statistics
Descriptive Statistics
This is concerned with summarizing and describing key features of
numerical data without attempting to infer. This method can either be
graphical or computational. Topics included in this study are measures of
central tendency, variability of scores, skewness and kurtosis.
Inferential Statistics
This seeks to draw inferences about a population by studying representative
samples taken from it. It aims to give information about a large group of data
without dealing with each element of the group. Topics included in this study
are hypothesis testing using the t-test and z-test, simple linear correlation,
analysis of variance, the chi-square test, regression analysis, and time series
analysis.
Uses of Statistics
It aids in decision making
It summarizes data for public use
It can give a precise description of data
It can predict the behavior of an individual
It can be used to test hypotheses
It is an essential tool in:
Education
Government
Office of justice programs
Business and economics
Medicine
Experimental psychology
Sociology
Sports
Criminology
Public opinion polling
Census and many others
Definition of Terms:
- Population is the complete set of individuals, objects, places, events, and
reactions having some characteristics in common.
- Sample is a representative cross-section of elements drawn from a
population.
- Parameter is defined as a numerical characteristic of the population.
- Statistic is defined as a numerical characteristic of the sample.
- Variable is defined as a characteristic or attribute of a person or object,
which can assume different values for different persons or objects.
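The difference between a parameter and a statistic can be illustrated with a short Python sketch. The population below is made up purely for illustration (200 hypothetical ages), and the sample size of 30 is an arbitrary assumption.

```python
import random
import statistics

# Hypothetical population: ages of 200 people (made-up values for illustration).
random.seed(1)  # fixed seed so the run is repeatable
population = [random.randint(18, 60) for _ in range(200)]

# A parameter is a numerical characteristic of the whole population.
population_mean = statistics.mean(population)

# A sample is a subset of elements drawn from the population (30 at random here).
sample = random.sample(population, 30)

# A statistic is a numerical characteristic of the sample.
sample_mean = statistics.mean(sample)

print("Parameter (population mean):", population_mean)
print("Statistic (sample mean):", sample_mean)
```

The two values are usually close but not identical, which is why a statistic is treated as an estimate of the corresponding parameter.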
Classification of Variables
1. According to Functional Relationship
a. Independent variable. This is called the predictor variable.
b. Dependent variable. This is called the criterion variable.
2. According to Continuity of Values
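a. Continuous variable. These are variables that can take any value within a range,
including fractions or decimals (for example, height or weight).
b. Discrete or Discontinuous variable. These are variables that can take only
whole-number values obtained by counting (for example, the number of students).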
Classification of Data
1. Qualitative data are categorical data taking the form of attributes or categories.
They have labels or names assigned to their respective categories.
Examples:
Judicial cases handled – criminal, civil, appealed
Year Level - 1st year, 2nd year, 3rd year, 4th year, 5th year
2. Quantitative Data/Numerical data are data that consist of numbers obtained
from counts or measurements like weights, heights, ages, temperatures, and
other measurable quantities.
3. According to the Levels of Measurements
a. Nominal Scale – when numerical values or symbols are used to classify
objects, persons, or characteristics, or to identify the groups to which they
belong. They are used merely for classification or identification purposes.
Examples:
1. Civil status, gender, nationality, color of the skin, religion
2. Type of courts – Supreme Court, Court of Appeals, Sandiganbayan, Court
of Tax Appeals, Regional Trial Courts, Metropolitan Trial Courts, Municipal
Trial Courts, etc
b. Ordinal Scale – this is a level of measurement which contains the properties
of the nominal level, but also arranges the members of a group in order or
rank in some low-to-high manner. The sizes of the differences between data
values cannot be determined.
Example:
1. Educational attainment
o Illiterate
o Elementary level
o Elementary graduate
o High school level
o High school graduate
o College level
o College graduate
c. Interval scale – contains the properties of the ordinal level, but the distances
between any two numbers on the scale are of known sizes. It is characterized by
a common and constant unit of measurement.
Example:
1. Temperature in degrees (Celsius/ Fahrenheit)
2. Intelligence quotient (75, 100, 120, 150, and so on)
d. Ratio scale – contains the properties of the interval level but it has a true zero
point; that is, the number zero indicates the absence of the characteristic
under consideration.
Lesson 2
Collection of Data
Collection of Data
Types of Data
Methods in the Collection of Data
Planning the Study
Types of Questionnaires
Features of Good Questionnaires
Sampling Techniques
COLLECTION OF DATA – refers to the process of obtaining numerical
measurements.
TWO SOURCES OF DATA
1. Documentary Sources – the information contained in published or
unpublished reports, statistics, Internet, letters, magazines, diaries, and so
on.
a. Primary data – data that are original, gathered firsthand
b. Secondary data – data previously gathered from an original source and
then computed and compiled.
2. Field Sources – this would include individuals who have sufficient knowledge
and experience regarding the study under investigation
METHODS USED IN THE COLLECTION OF DATA
1. The Direct Method – often referred to as interview method
2. The Indirect Method – popularly known as the questionnaire method.
Questionnaire is a list of questions, which are intended to elicit answers to the
problems under investigation.
3. The Registration Method – a method that uses existing data, facts, or
information kept in a systematic way by the office concerned, such as
registrations of births, deaths, motor vehicles, marriages, and licenses,
which are required by certain laws.
4. The Observation Method – is used to collect data pertaining to the attitudes,
behavior, values, and cultural patterns of the samples under investigation.
5. The Experiment Method – is used if the researcher would like to determine
the cause-and-effect relationship of certain phenomena under investigation.
PLANNING THE STUDY
1. Estimate the number of items in the population
2. Assess resources such as time and money factors, which are available to pursue
the research.
3. Determine the sample size needed in the study using Slovin’s formula below.
$$n = \frac{N}{1 + Ne^2}$$
Where:
n = sample size
N = population size
e = margin of error
Example 1. A study of the CAFA students of TSU is to be conducted by a student
taking up a Master’s degree in Guidance and Behavior. If there are a total of 800
students from first year to fifth year, determine the sample size needed for a
margin of error of 10%.
Solution:
N = 800
e = 10% = 0.10
$$n = \frac{N}{1 + Ne^2} = \frac{800}{1 + 800(0.10)^2} = \frac{800}{9} \approx 88.9$$
n ≈ 88.9, or say 90 students
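The computation in Example 1 can be checked with a short Python sketch. The function name `slovin_sample_size` is a hypothetical helper introduced here, and rounding up to the next whole number is an assumption (a fraction of a respondent cannot be sampled).

```python
import math

def slovin_sample_size(population_size: int, margin_of_error: float) -> int:
    """Return the sample size n = N / (1 + N * e**2), rounded up."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must be between 0 and 1")
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(n)  # round up so the sample is never too small

# Example 1 from the text: N = 800, e = 10%
print(slovin_sample_size(800, 0.10))  # 89
```

The function returns 89, the smallest whole number that satisfies the formula; the worked solution rounds further to a convenient 90 students.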
TYPES OF QUESTIONS
Structured Question – is a type of question that leaves only one way or few
alternative ways of answering it.
Unstructured or Open-ended Question – a question that can be answered in
many ways, such as a probing question or a question meant to elicit reasons.
FEATURES OF A GOOD QUESTIONNAIRE
1. Make the question short and clear.
2. Avoid leading questions.
3. Always state the precise units in which you require the answers, in order to
facilitate tabulation later on.
4. As much as possible, ask questions that can be answered simply by checking
slots or stating simple names or brands.
5. Arrangement of questions should be carefully planned.
6. Limit questions to essential information.
SAMPLING TECHNIQUES: 2 MAJOR WAYS
1. Probability Sampling
a. Popularly known as simple random sampling
b. It is a sampling procedure for selecting a sample of size n from a universe of
size N such that each member of the population is given a non-zero chance of
being included in the sample and all possible combinations of size n have an
equal chance of being selected as the sample
c. Picking things at random means picking things without bias or any
predetermined choice
Ways of Drawing Sample Units at Random
a. Lottery Sampling – is carried out by assigning numbers to each member of
the population
b. Table of Random Numbers – in this technique the selection of each member
of the population is left adequately to chance, and every member of the
population has an equal chance of being chosen
c. Restricted Random Sampling – may be used when the population is too large
to handle
Types of Restricted Random Sampling
a. Systematic Sampling – the process of selecting the sample in which units are
obtained by drawing every nth element of the population.
$$\text{nth} = \frac{\text{Total number of elements in the population}}{\text{Desired sample size}} = \frac{N}{n}$$
Other Types of Restricted Random Sampling
Stratified Sampling – the population is divided into groups based on
homogeneity to avoid the possibility of drawing samples whose
members come from one stratum
Cluster Sampling – an area sample because it is frequently applied on
a geographical basis. This is useful in selecting the sample when
heterogeneous groups occupy blocks in a community or city
Multi-Stage Sampling – this technique uses several stages or phases
in getting the samples from the general population. It is useful in
conducting a nationwide survey or any survey involving a large universe
2. Non-Random Sampling – a sampling procedure wherein not all members of
the population are given an equal chance of being selected as part of the sample.
Types of Non-Random Sampling
Purposive Sampling – a non-random sampling technique in which samples
are chosen based on certain criteria and rules laid out by the researcher
Quota Sampling – a non-random sampling technique in which the researcher
limits the number of samples based on the required number of the
subject under investigation
Convenience Sampling – the researcher conducts the study at a
convenient time and preferred place or venue, specifying the place and
time where the data can be gathered.
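The random sampling techniques described in this lesson can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the population of 800 student ID numbers and the year-level strata are made up, the systematic sample starts from a random position within the first interval, and the stratified sample draws the same fraction from every stratum.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Lottery-style draw: every member has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=None):
    """Take every k-th element, where k = N // n, from a random start."""
    k = len(population) // n          # sampling interval (the "nth" in the text)
    if k < 1:
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)
    start = rng.randrange(k)          # random starting point in the first interval
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, seed=None):
    """Draw the same fraction at random from each stratum (group)."""
    rng = random.Random(seed)
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample

# Hypothetical population of 800 student ID numbers.
students = list(range(1, 801))
print(simple_random_sample(students, 5, seed=42))
print(systematic_sample(students, 80, seed=42)[:5])   # interval k = 800 // 80 = 10

# Hypothetical strata: students grouped by year level.
by_year = {
    "1st year": students[:300],
    "2nd year": students[300:550],
    "3rd year": students[550:],
}
print(len(stratified_sample(by_year, 0.10, seed=42)))
```

Passing a `seed` makes a draw repeatable, which helps when a sample has to be documented or audited.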
Lesson 3
Presentation of Data and Frequency Distribution
Presentation of Data
Ways to Present Data
Frequency Distribution
Cumulative Frequency Distribution
PRESENTATION OF DATA – refers to the organization of data into tables, graphs,
or charts, so that logical and statistical conclusions can be derived from the collected
measurements.
THREE METHODS OF PRESENTING DATA
1. Textual Presentation
a. The data gathered are presented in paragraph form
b. Data are written and read
c. It is a combination of texts and figures
2. Tabular Presentation
a. Data collected are presented in the form of rows and columns
b. Data presented this way can be easily understood, easily compared, and used
to analyze relationships between and among the variables presented
c. Major parts of a statistical table:
1. Table Heading – consists of table number and title
2. Stubs – classifications or categories found at the left side of the body of
the table
3. Box Head – the top of the columns
- It identifies what is contained in each column
- Included here are the stub head, master caption, and column
captions
4. Body – the main part of the table
- This contains the substance or the figures of one’s data
Table
Title
Stub Head      | Master Caption
Row Caption    | Column Caption | Column Caption | Column Caption
3. Graphical Presentation
a. Data are presented in visual form. Graphs may appear in many forms like line,
circle, map, or picture graphs.
b. Kinds of Graphs or Diagrams
1. Line Graph
2. Bar Graph
3. Circle Graph or Pie Chart
4. Pictograph
FREQUENCY DISTRIBUTION – is the tabular arrangement of the gathered data by
categories plus their corresponding frequencies and class marks or midpoints.
DEFINITION OF TERMS
Range (R) – the difference between the highest score and the lowest score
Class Interval (CI) – a grouping or category defined by a lower limit and an upper
limit
Class Boundaries (CB) – the true limit, which is situated between the upper limit
of one interval and the lower limit of the next interval
Class Mark (x) – the middle value or midpoint of a class interval
Class Size (i) – the difference between the upper class boundary and the lower
class boundary of a class interval
Relative Frequency (RF) – the percentage of all observations that fall in each
class interval
Class Frequency (CF) – refers to the number of observations belonging to a
class interval, or the number of items within a category
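The terms defined above can be tied together with a small Python sketch that builds a frequency distribution. The scores are made up for illustration, the class size of 10 is an arbitrary choice, and the class boundaries assume whole-number scores (so each boundary lies 0.5 beyond the class limit).

```python
# Hypothetical scores (made up for illustration only).
scores = [35, 42, 45, 48, 49, 50, 51, 52, 53, 55,
          57, 60, 61, 64, 65, 68, 70, 71, 73, 75]

lowest, highest = min(scores), max(scores)
score_range = highest - lowest      # Range (R) = highest score - lowest score
class_size = 10                     # Class size (i), chosen arbitrarily here
n = len(scores)

print("Range:", score_range)
print("Interval | Boundaries | Class Mark | Frequency | Relative Frequency (%)")

rows = []
lower = lowest
while lower <= highest:
    upper = lower + class_size - 1                  # upper limit of the interval
    lower_cb, upper_cb = lower - 0.5, upper + 0.5   # class boundaries (true limits)
    class_mark = (lower + upper) / 2                # midpoint of the interval
    frequency = sum(lower <= s <= upper for s in scores)
    relative = 100 * frequency / n                  # percentage of all observations
    rows.append((lower, upper, lower_cb, upper_cb, class_mark, frequency, relative))
    print(f"{lower}-{upper} | {lower_cb}-{upper_cb} | {class_mark} | "
          f"{frequency} | {relative:.1f}")
    lower += class_size
```

Each row shows that the class size equals the upper class boundary minus the lower class boundary, and that the class frequencies add up to the total number of scores.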
Lesson 4
Measures of Central Tendency
The Mean
a. Ungrouped Data
b. Weighted Mean
c. Grouped Data
The Median
a. Ungrouped Data
b. Grouped Data
The Mode
a. Ungrouped Data
b. Grouped Data
The Quartiles
The MEAN
1. Arithmetic mean or simply mean (commonly called the average) is determined by
adding the scores together and dividing the sum by the number of scores.
a. Ungrouped Data
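$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \dots + X_N}{N}$$
$$\bar{x} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \dots + X_n}{n}$$
Where:
$\mu$ = population mean
$\bar{x}$ = sample mean
$X_i$ = the value of the ith observation or score
$N$ = population size
$n$ = sample size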
Example 1: Given is an array of students’ scores in a quiz. Compute the mean
score.
35, 42, 45, 48, 49, 50, 51, 52, 53, 55
55, 56, 57, 57, 57, 60, 61, 62, 64, 64
65, 65, 68, 69, 70, 71, 71, 72, 73, 75
Solution:
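$$\bar{x} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \dots + X_n}{n}$$
$$\bar{x} = \frac{\sum_{i=1}^{30} X_i}{30} = \frac{35 + 42 + \dots + 75}{30} = \frac{1772}{30}$$
$$\bar{x} \approx 59$$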
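The result of Example 1 can be verified with a few lines of Python. This is a minimal check using the 30 scores listed in the example; the variable names are chosen here for illustration.

```python
# Scores from Example 1.
scores = [
    35, 42, 45, 48, 49, 50, 51, 52, 53, 55,
    55, 56, 57, 57, 57, 60, 61, 62, 64, 64,
    65, 65, 68, 69, 70, 71, 71, 72, 73, 75,
]

total = sum(scores)     # sum of all scores
n = len(scores)         # number of scores
mean = total / n        # arithmetic mean

print(total, n)         # 1772 30
print(round(mean, 2))   # 59.07
```

The unrounded mean is about 59.07, which the solution reports as 59.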
The WEIGHTED MEAN
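2. The weighted mean is calculated by multiplying each score by its corresponding
frequency, adding the products, and dividing the sum by the total of the frequencies.
$$\bar{x} = \frac{X_1 f_1 + X_2 f_2 + \dots + X_k f_k}{f_1 + f_2 + \dots + f_k} = \frac{\sum X_i f_i}{\sum f_i}$$
Where:
$X_i$ = the different values of X in the set
$f_i$ = frequency of the corresponding score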
Example 2: Given the following frequency distribution of test scores in a quiz.
Determine the mean score.
Xi      fi      Xifi
60       4       240
58       8       464
65      12       780
63       5       315
52      10       520
55      13       715
50      15       750
70       8       560
56      11       616
67       9       603
      ∑fi = 95    ∑fiXi = 5563
Solution:
$$\bar{x} = \frac{X_1 f_1 + X_2 f_2 + \dots + X_k f_k}{f_1 + f_2 + \dots + f_k} = \frac{\sum X_i f_i}{\sum f_i}$$
$$\bar{x} = \frac{5563}{95} = 58.56$$
Note: For other types of data, weights may be denoted by symbols such as
$W_1, W_2, W_3, \dots, W_k$, which may mean the importance attached to variates (or scores)
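The weighted mean in Example 2 can be reproduced with a short Python sketch. The (score, frequency) pairs are taken from the frequency distribution in the example; the variable names are chosen here for illustration.

```python
# (score, frequency) pairs from Example 2.
distribution = [
    (60, 4), (58, 8), (65, 12), (63, 5), (52, 10),
    (55, 13), (50, 15), (70, 8), (56, 11), (67, 9),
]

sum_f = sum(f for _, f in distribution)         # sum of the frequencies
sum_xf = sum(x * f for x, f in distribution)    # sum of score times frequency
weighted_mean = sum_xf / sum_f

print(sum_f, sum_xf)             # 95 5563
print(round(weighted_mean, 2))   # 58.56
```

The same code works for any other kind of weight: replace each frequency with the weight attached to the corresponding score.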
b. Grouped Data
$$M = AM + \left(\frac{\sum fd}{n}\right) ci$$
Where:
M = mean
AM = assumed mean
∑fd = algebraic sum of the products of the frequencies and the
corresponding deviations from the assumed mean
n = number of cases
ci = class interval (class size)
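The assumed-mean formula can be tried out with a small Python sketch. The grouped data below are made up for illustration, and the sketch rests on two assumptions: n is the total number of cases (the sum of the frequencies), and each deviation d is measured from the assumed mean in units of the class size ci. The result is cross-checked against the direct computation of the sum of (frequency × class mark) divided by n.

```python
# Hypothetical grouped data: (class mark, frequency) pairs; class size ci = 10.
classes = [(39.5, 2), (49.5, 7), (59.5, 5), (69.5, 5), (79.5, 1)]
ci = 10
assumed_mean = 59.5   # AM: here the class mark of the middle class

n = sum(f for _, f in classes)   # total number of cases

# d = deviation of each class mark from AM, expressed in class-size units
sum_fd = sum(f * (x - assumed_mean) / ci for x, f in classes)

mean = assumed_mean + (sum_fd / n) * ci

# Cross-check: direct computation from class marks and frequencies
direct = sum(f * x for x, f in classes) / n

print(mean, direct)   # both give 57.5
```

Any class mark can serve as the assumed mean; choosing one near the middle of the distribution simply keeps the deviations small.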