Basic Statistical Concepts

The document provides an overview of basic statistical concepts, including definitions of population, sample, and types of measurement scales (nominal, ordinal, interval, and ratio). It explains the differences between primary and secondary data, methods of data collection, and the importance of data presentation through tables and frequency distributions. Additionally, it discusses variables, their types, and the significance of statistical terms in research.

Uploaded by

imranzahidhasan
Copyright
© All Rights Reserved
Basic Statistical Concepts

Population and Sample


Statistics is a branch of scientific methodology. It deals with the collection, classification,
description and interpretation of data through scientific procedures. Its essential purpose is
to describe the numerical properties of populations and to draw inferences about a
population from samples.

Population: A population is the collection of all items of interest in a particular study. For
example, all the farmers, students, domestic animals, birds, the total forest area, or the total
agricultural land may each constitute a population.

A population may be finite or infinite.

Finite population: A population consisting of a finite number of individuals or items is
called a finite population. The students of an institution, the farmers in a country and the
livestock of a region are examples of finite populations; these have specific numbers that
can be enumerated.

Infinite population: A population consisting of an infinite number of individuals, which
cannot be enumerated, is called an infinite population. For example, the fish in a river or
the stars in the sky are treated as infinite populations because they cannot be counted in
practice.

Sample: A small but representative part of a population, consisting of a finite number of
individuals or items, is called a sample. For example, a group of students chosen to represent
all first-year honors students (the population) is a sample. Similarly, when a small quantity
of blood, not the whole, is collected for testing, that quantity is the sample and the total
quantity of blood of the person is the population.
• A sample is a subset or portion of the population selected to represent the population.

Sample size: The number of elements selected for a sample is known as the sample size. By a
common rule of thumb, a sample of fewer than 30 elements is termed a small sample and one
with 30 or more elements is termed a large sample.
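
The ideas of population, sample and sample size can be sketched in a few lines of Python. This is only an illustration: the population of 200 roll numbers, the helper names and the use of simple random sampling are assumptions, since the text does not say how a representative sample is selected.

```python
import random

# Hypothetical finite population: roll numbers of 200 first-year students.
population = list(range(1, 201))

def draw_sample(items, size, seed=0):
    """Draw a simple random sample of `size` items without replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return rng.sample(items, size)

def classify_sample_size(n):
    """Apply the rule of thumb from the text: fewer than 30 is 'small'."""
    return "small" if n < 30 else "large"

sample = draw_sample(population, 25)
print(len(sample), classify_sample_size(len(sample)))
print(classify_sample_size(30))
```

Every element of the sample comes from the population, and a sample of exactly 30 elements already counts as large under this rule.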

Scales of Measurement

This lesson describes the four scales of measurement that are commonly used in statistical
analysis: nominal, ordinal, interval, and ratio scales.

Properties of Measurement Scales

Each scale of measurement satisfies one or more of the following properties of
measurement.

• Identity. Each value on the measurement scale has a unique meaning.
• Magnitude. Values on the measurement scale have an ordered relationship to one
another. That is, some values are larger and some are smaller.
• Equal intervals. Scale units along the scale are equal to one another. This means,
for example, that the difference between 1 and 2 would be equal to the difference
between 19 and 20.
• A minimum value of zero. The scale has a true zero point, below which no values
exist.

Nominal Scale of Measurement

Nominal variables can be placed into categories. The categories have no numeric value, so they
cannot be added, subtracted, multiplied or divided. They also have no order: the nominal
scale of measurement satisfies only the identity property of measurement.

Gender is an example of a variable that is measured on a nominal scale. Individuals may be
classified as "male" or "female", but neither value represents more or less "gender" than the
other. Religion and political affiliation are other examples of variables that are normally
measured on a nominal scale.

Examples:
• Gender: Male, Female, Other.
• Hair Color: Brown, Black, Blonde, Red, Other.
• Type of living accommodation: House, Apartment, Other.
• Religious preference: Buddhist, Mormon, Muslim, Jewish, Christian, Other.

Ordinal Scale of Measurement

The ordinal scale measures a variable in terms of magnitude, or rank. Ordinal scales tell us
the relative order of categories, but give us no information about the size of the differences
between them. The ordinal scale has the properties of both identity and magnitude.

Examples:
• High school class ranking: 1st, 9th, 87th…
• Socioeconomic status: poor, middle class, rich.
• The Likert Scale: strongly disagree, disagree, neutral, agree, strongly agree.
• Level of Agreement: yes, maybe, no.
• Time of Day: dawn, morning, noon, afternoon, evening, night.
• Political Orientation: left, center, right.

Interval Scale of Measurement

An interval scale has ordered numbers with meaningful divisions: the differences between
consecutive points on the scale are equal. Interval scales do not have a true zero; zero is
simply another point of measurement. For example, 0 degrees Celsius does not mean the
absence of heat.
The interval scale of measurement has the properties of identity, magnitude, and equal
intervals.
For example, on a Fahrenheit or Celsius thermometer, 90° is hotter than 45°, and the
difference between 10° and 30° is the same as the difference between 60° and 80°.

Examples:
• Celsius Temperature.
• Fahrenheit Temperature.
• IQ (intelligence scale).
• Time on a clock with hands.
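
A short Python sketch may help show why the missing true zero matters. It uses the standard Celsius-to-Fahrenheit conversion (the function name is chosen here for illustration) to show that equal differences stay equal under a change of unit, while ratios of interval-scale readings do not.

```python
def celsius_to_fahrenheit(c):
    """Convert a Celsius reading to Fahrenheit."""
    return c * 9 / 5 + 32

# Equal intervals survive the change of unit: the gaps 10 -> 30 and
# 60 -> 80 are equal in Celsius and remain equal in Fahrenheit.
gap_low_c, gap_high_c = 30 - 10, 80 - 60
gap_low_f = celsius_to_fahrenheit(30) - celsius_to_fahrenheit(10)
gap_high_f = celsius_to_fahrenheit(80) - celsius_to_fahrenheit(60)
print(gap_low_c == gap_high_c, gap_low_f == gap_high_f)

# Ratios do not survive: 20 C looks like "twice" 10 C, but the same two
# temperatures expressed in Fahrenheit (68 F and 50 F) have a different ratio.
ratio_c = 20 / 10
ratio_f = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)
print(ratio_c, ratio_f)
```

Because the ratio depends on the unit chosen, statements such as "twice as hot" are not meaningful on an interval scale.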

Ratio Scale of Measurement


The ratio scale of measurement is similar to the interval scale in that it also represents
quantity and has equality of units. However, this scale also has an absolute zero (no values
exist below zero).

The ratio scale of measurement satisfies all four of the properties of measurement: identity,
magnitude, equal intervals, and a minimum value of zero.

The weight of an object would be an example of a ratio scale. Each value on the weight
scale has a unique meaning, weights can be rank ordered, units along the weight scale are
equal to one another, and the scale has a minimum value of zero.

Weight scales have a minimum value of zero because objects at rest can be weightless, but
they cannot have negative weight.

Examples:
• Weight.
• Height.
• Sales Figures.
• Ruler measurements.
• Income earned in a week.
• Years of education.
• Number of children.
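
The four scales can be summarized as a lookup from each scale to the properties it satisfies, as stated in the sections above. The Python sketch below is only one way to write that summary; the names PROPERTIES, SCALES and scales_with are chosen here for illustration.

```python
# The four properties of measurement, in the order used in the text.
PROPERTIES = ("identity", "magnitude", "equal intervals", "minimum value of zero")

# Each scale satisfies all the properties of the scale before it, plus one more.
SCALES = {
    "nominal": PROPERTIES[:1],
    "ordinal": PROPERTIES[:2],
    "interval": PROPERTIES[:3],
    "ratio": PROPERTIES[:4],
}

def scales_with(prop):
    """Return the names of the scales that satisfy the given property."""
    return [name for name, props in SCALES.items() if prop in props]

print(scales_with("magnitude"))              # ['ordinal', 'interval', 'ratio']
print(scales_with("minimum value of zero"))  # ['ratio']
```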

Variables and Attributes


Measurable characteristics of a population that may vary from element to element, either
in magnitude or in quality, are called variables.
Suppose we have a set of numbers representing the marks obtained by five students in a group.
The numbers may be represented as X: 4, 5, 7, 8 and 6. Here X is a variable, since
it takes different values.

Variables are of two types: quantitative and qualitative. Variables whose values are
expressed numerically are known as quantitative variables.
Examples of quantitative variables are height, weight, age, yield of crops, length or breadth
of fish, weight of a tomato, number of grains per panicle, income and family size.

Quantitative variables may be further classified as discrete or continuous.

A variable that can take only integral values within a given range is called a discrete
variable. Examples are the number of children in a family, the number of students per class
and the number of grains per panicle.

A variable is said to be continuous if it can assume any value, integral or fractional, within
a given range.
For example, the height or weight of students, the weight of a tomato, the length of a fish,
the height of trees and the price of a commodity are continuous variables.

Some variables express a quality of the population elements; they cannot be measured
numerically on a scale but can be classified or categorized. These are called qualitative
variables. A qualitative variable shows variation in objects not in terms of magnitude but
in quality or kind. These qualities are called attributes.

Examples of qualitative variables are type of farmer (big, medium, small), type of fish (sea
fish, river fish), hair color (brown, black, white, etc.), religion (Muslim, Hindu, Christian,
etc.), sex, nationality, type of crime, marital status and literacy. These cannot be measured
numerically but can be grouped into classes or categories.

People vary according to sex as male and female, and according to nationality as American,
French, Italian or Indian. Students in a college may be classified as belonging to the Science,
Arts or Commerce faculty.

Some Statistical terms:
Population: all the items or individuals about which we want to draw a conclusion.
Sample: a subset of a population.
Variable: a characteristic which may take on different values.
Parameter: a characteristic of a population.
Statistic: a characteristic of a sample.
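
The difference between a parameter and a statistic can be illustrated with a mean. In the sketch below, the ten population marks are made-up values, and taking the first five marks as the sample is an assumption made purely for illustration (it echoes the marks X: 4, 5, 7, 8 and 6 used earlier and is not a recommended sampling method).

```python
from statistics import mean

# Hypothetical population: marks of ten students (illustrative values only).
population_marks = [4, 5, 7, 8, 6, 9, 3, 5, 6, 2]

# Hypothetical sample: the first five marks of the population.
sample_marks = population_marks[:5]

parameter = mean(population_marks)  # population mean: a parameter
statistic = mean(sample_marks)      # sample mean: a statistic

print(parameter, statistic)
```

The two values differ, which is why a statistic is treated as an estimate of the corresponding parameter rather than as the parameter itself.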

Types of Data: Primary and Secondary data

Data

The facts and figures which can be numerically measured are studied in statistics. A numerical
measure of a characteristic is known as an observation, and a collection of observations is
termed data. Data are collected by individual research workers or by organizations through
sample surveys or experiments, keeping in view the objectives of the study. The data
collected may be:

1. Primary Data
2. Secondary Data

Primary and Secondary Data in Statistics

The difference between primary and secondary data in statistics is that primary data are
collected firsthand by a researcher (an organization, person, authority, agency or party)
through experiments, surveys, questionnaires, focus groups, interviews and direct
measurements, while secondary data have already been collected by someone else and are
readily available to the public through publications, journals and newspapers.

Primary Data

Primary data means raw data (data that have not been fabricated or tailored) which have
just been collected from the source and have not undergone any kind of statistical treatment
such as sorting and tabulation. The term primary data may sometimes be used to refer to
firsthand information.

Sources of Primary Data

The sources of primary data are primary units such as basic experimental units,
individuals and households. The following methods are usually used to collect data from
primary units; the choice of method depends on the nature of the primary unit. Published
data and data collected in the past are called secondary data.

• Personal Investigation

The researcher conducts the experiment or survey personally and collects data
from it. The collected data are generally accurate and reliable. This method of
collecting primary data is feasible only for small-scale laboratory or field
experiments or pilot surveys; it is not practicable for large-scale experiments and
surveys because it takes too much time.

• Through Investigators

Trained (experienced) investigators are employed to collect the required data. In
the case of surveys, they contact the individuals and fill in the questionnaires after
asking for the required information; a questionnaire is an inquiry form containing a
number of questions designed to obtain information from the respondents. This method
of collecting data is employed by most organizations. It gives reasonably accurate
information, but it is very costly and may be time-consuming too.

• Through Questionnaire

The required information (data) is obtained by sending a questionnaire (printed or
in soft form) to the selected individuals (respondents), for example by mail, who fill in
the questionnaire and return it to the investigator. This method is relatively cheap
compared with the "through investigators" method, but the non-response rate is very high
because most respondents do not bother to fill in the questionnaire and send it back to
the investigator.

• Through Local Sources

Local representatives or agents are asked to send the requisite information, which
they provide based upon their own experience. This method is quick, but
it gives rough estimates only.

• Through Telephone

The information may be obtained by contacting the individuals by telephone. It is a
quick method and provides the required information accurately.

• Through Internet

With the introduction of information technology, people may be contacted through the
internet and asked to provide the pertinent information. Google surveys are now widely
used as an online method of data collection. There are many paid online survey services
too.

It is important to go through the primary data and locate any inconsistent observations
before it is given a statistical treatment.

Secondary Data

Secondary data are data which have already been collected by someone else and which may
have been sorted, tabulated and given a statistical treatment. In this sense they are
fabricated or tailored (processed) data.

Sources of Secondary Data

The secondary data may be available from the following sources:

• Government Organizations
Federal and Provincial Bureau of Statistics, Crop Reporting Service-Agriculture
Department, Census and Registration Organization, etc.
• Semi-Government Organizations
Municipal committees, District Councils, commercial and financial institutions
such as banks, etc.
• Teaching and Research Organizations
• Research Journals and Newspapers
• Internet

Presentation of Data

Introduction

Given a large mass of data, it is very hard for a researcher to comprehend all the information
in the collected data and its implications. A large mass of collected data must normally be
organized in order to show its significant characteristics.

Methods of Presenting Data:

1. Tabular form – where the data are presented in rows and columns.
2. Graphical form – where the data are presented in pictorial or visual form.

Tabular Method. This method of data presentation makes use of a table in which data are
arranged systematically into rows and columns. This systematic arrangement of data is
called a statistical table. Through this process, data can be readily understood and
comparisons can be made more easily.

A good statistical table has four essential parts:

1. Table heading – includes the table number and table title. The title should briefly
explain the contents of the table.
2. Stub – the items or classifications written in the first column, which identify what
is written in the rows.
3. Caption or box head – the items or classifications written in the first row, which
identify what is contained in the columns.
4. Body – the main part of the table, which contains the substance or the figures of
one’s data.

In the construction of a table, the following guidelines should prove helpful.

1. Every table must be self-explanatory.
2. The title should be clear and descriptive.
3. The title should give information about what, where, how, and when the data were taken.

Example of a statistical table:

Table 1.1

Population of the Philippines 1877 - 1980

Year    Population    Average Annual Rate of Increase (%)
1877     5,567,685    2.41
1887     5,984,727    0.72
1896     6,261,339    0.50
1903     7,635,426    2.87
1918    10,314,310    1.89
1939    16,000,303    2.22
1948    19,234,182    1.91
1960    27,087,685    3.06
1970    36,684,486    3.01
1975    41,831,045    2.66
1980    48,098,000    2.40
Source of Data: National Statistics Office.

Frequency Distribution
The number of times a particular observation occurs in a data set is the frequency of that
observation. By the word frequency we mean the repetition of an item or observation.
Frequency is usually denoted by ‘f’.

Definition of Frequency Distribution:


A frequency distribution is a tabular summary of data showing the frequency of items in
each of several non-overlapping classes.

• If the data are presented as observations with their corresponding frequencies, the
presentation is called a frequency distribution.

How to construct a Frequency Distribution Table

In constructing a quantitative frequency distribution, the following steps are considered:


Step-1: Determine the range R.
R = highest value – lowest value
Step-2: Decide the number of classes.
The appropriate number of classes (K) may be decided by the following formula:
Sturges’ Formula

K = 1 + 3.322 log10 N
where N is the number of observations to be included in the distribution.
Step-3: Determine the class interval (C.I.). It is obtained by using the relationship
C.I. = R / K
Step-4: The table will have three columns, named classes of the distribution, tally
marks and frequency. In the first column, write down all the classes of the distribution.
Step-5: Tick off each value in the original table of raw data and, in the second column,
put a tally mark against the appropriate class.
Step-6: Count the tally marks against each class. These counts are called the
frequencies of the classes. They are written in the third column and, at the end, the
total of the third column is checked against the total number of individuals or scores. The
whole table is known as a frequency distribution table.

The frequency table can be made by two methods:


a) Exclusive method
b) Inclusive method

a) Exclusive method: In this method, the upper limit of any class interval is the same
as the lower limit of the next higher class, so there is no gap between the upper limit of one
class and the lower limit of the next. This gives a continuous distribution.
Example:
C.I.     Tally marks    Frequency (f)
0-10
10-20
20-30

b) Inclusive method: There is a gap between the upper limit of any class and the lower
limit of the next higher class. This gives a discontinuous distribution.
Example:
C.I.     Tally marks    Frequency (f)
0-9
10-19
20-29

To convert a discontinuous distribution to a continuous distribution, subtract 0.5 from each
lower limit and add 0.5 to each upper limit (0.5 is half of the gap of 1 between consecutive
classes).
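
The conversion described above can be sketched in Python. The function name to_class_boundaries is hypothetical, and the sketch assumes at least two equally spaced classes; it takes the correction as half the gap between consecutive classes, which is 0.5 for the inclusive example 0-9, 10-19, 20-29.

```python
def to_class_boundaries(inclusive_limits):
    """Turn inclusive class limits into continuous class boundaries.

    Assumes at least two equally spaced classes. The correction is half
    the gap between one class's upper limit and the next class's lower
    limit (0.5 when the gap is 1, as in the text).
    """
    gap = inclusive_limits[1][0] - inclusive_limits[0][1]
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in inclusive_limits]

print(to_class_boundaries([(0, 9), (10, 19), (20, 29)]))
# [(-0.5, 9.5), (9.5, 19.5), (19.5, 29.5)]
```

After the conversion, the upper boundary of each class equals the lower boundary of the next, so the distribution is continuous.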
Note:
• The data are arranged into groups such that each group contains some of the
observations. These groups are called classes, and the numbers of observations in these
groups are called frequencies.
• Each class interval has two limits: (1) the lower limit and (2) the upper limit.
• The difference between the upper limit and the lower limit is called the length of the
class interval.
• The length of the class interval should be the same for all the classes.
• The average of the two limits is called the mid value of the class.

Example:
The profits (in lakhs taka) of 30 companies for the year 2001-2002 are given below:
25, 32, 45, 8, 24, 42, 22, 12, 9, 15, 26, 35, 23, 41, 47, 18, 44, 37, 27, 46, 38, 24, 43,
46, 10, 21, 36, 45, 22, 18.
Construct a frequency distribution table taking a suitable class interval.

Solution:
Range (R) = 47 - 8 = 39
Number of observations (N) = 30
Number of classes, K = 1 + 3.322 log10 N
= 1 + 3.322 × 1.4771 = 5.91 ≈ 6
C.I. = 39/6 = 6.5 ≈ 7
Inclusive method:

C.I.     Tally marks    Frequency (f)
8-14     ////           4
15-21    ////           4
22-28    ///// ///      8
29-35    //             2
36-42    /////          5
43-49    ///// //       7
Total                   30

Exclusive method:

C.I.     Tally marks    Frequency (f)
8-15     ////           4
15-22    ////           4
22-29    ///// ///      8
29-36    //             2
36-43    /////          5
43-50    ///// //       7
Total                   30
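
The solution can be checked with a short Python sketch that follows the six steps for the inclusive method. The variable names are chosen here for illustration, and the rounding rules are assumptions: the number of classes is rounded to the nearest integer, while the class interval is rounded up with math.ceil, because Python's round(6.5) gives 6 rather than the 7 used in the solution.

```python
import math

profits = [25, 32, 45, 8, 24, 42, 22, 12, 9, 15, 26, 35, 23, 41, 47,
           18, 44, 37, 27, 46, 38, 24, 43, 46, 10, 21, 36, 45, 22, 18]

n = len(profits)
data_range = max(profits) - min(profits)   # Step-1: R = 47 - 8 = 39
k = round(1 + 3.322 * math.log10(n))       # Step-2: Sturges' formula
width = math.ceil(data_range / k)          # Step-3: class interval, rounded up

# Steps 4 to 6: build inclusive classes from the lowest value and count.
start = min(profits)
table = []
for i in range(k):
    lower = start + i * width
    upper = lower + width - 1              # inclusive upper limit
    freq = sum(lower <= x <= upper for x in profits)
    table.append((lower, upper, freq))

for lower, upper, freq in table:
    print(f"{lower}-{upper}\t{freq}")
print("Total\t", sum(freq for _, _, freq in table))
```

The printed classes and frequencies agree with the inclusive table above, and the total equals the 30 observations.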

The relative frequency of a class is the fraction or proportion of the total number of data
items belonging to the class.
A relative frequency distribution is a tabular summary of a set of data showing the relative
frequency for each class.

The percent frequency of a class is the relative frequency multiplied by 100.

A percent frequency distribution is a tabular summary of a set of data showing the percent
frequency for each class.

Relative Frequency and Percent Frequency Distributions

Rating           Relative frequency    Percent frequency
Poor             0.10                  10
Below Average    0.15                  15
Average          0.25                  25
Above Average    0.45                  45
Excellent        0.05                   5
Total            1.00                  100
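
As a minimal sketch of the two definitions, the Python below computes relative and percent frequencies for the class frequencies of the company profits example (inclusive classes). The dictionary names are chosen here for illustration.

```python
# Class frequencies from the company profits example (inclusive classes).
frequencies = {"8-14": 4, "15-21": 4, "22-28": 8,
               "29-35": 2, "36-42": 5, "43-49": 7}

total = sum(frequencies.values())

# Relative frequency: the fraction of all observations in each class.
relative = {label: f / total for label, f in frequencies.items()}
# Percent frequency: the relative frequency multiplied by 100.
percent = {label: r * 100 for label, r in relative.items()}

for label in frequencies:
    print(f"{label}\t{relative[label]:.3f}\t{percent[label]:.1f}")
```

The relative frequencies add up to 1 and the percent frequencies to 100, apart from floating-point rounding.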
