Political Evaluation
Political evaluation is the analysis of social activity and behaviour relating to systems of political governance. The essence of political evaluation is to help strengthen governance. In order to strengthen governance, it is necessary to evaluate the status of governance as a basis for intervention. Such an evaluation has to provide objective, meaningful and comparative judgments on a nation-state’s governance effectiveness and quality. This assumes that governance is measurable. A series of studies have provided measures for various
aspects of governance. These include measures of corruption, competitiveness, political risk,
independence of the judiciary, human rights, political inclusiveness, bureaucratic effectiveness
and general measures of human development. There are many evaluation research reports that
provide information on the status of governance across the world. Some of these reports include
a ranking of countries according to their performances in various indicators of governance.
Many government leaders, international organisations, investors and donors make use of these rankings in decision-making.
Political evaluation focuses on the various political goods that citizens expect their governments
to provide for them. Political goods could be tangible or intangible. Tangible goods include
schools, clinics and roads. Intangible political goods include
security, rule of law, essential freedoms such as the right to participate in politics and to compete
for office, tolerance of dissent and difference and human rights. All citizens desire to be
governed well. They expect the delivery of essential political goods that are tangible as well.
These include sustainable economic opportunity, medical and health care, schools and
educational instruction, roads, railways, communication networks, and an effective banking
system presided over by a central bank.
The provision of each of these goods can be measured by creating proxy indicators and
sub-indicators. For instance, in a typical measure of the rule of law, the effectiveness and
predictability of the judiciary, the number of judges per 1000 people (the more judges the less
judicial delay), the number of political prisoners, the level of corruption, the extent of
demonstrated respect for property rights, and the ability to enforce contracts are used as
proxies. This engagement with measuring and assessing performance of the political system
falls within the realm of political evaluation. The goal is to determine the level of government
accomplishment in terms of the provision of the above public goods. Evaluation is very
important in measuring performance. It is evaluation that enables us to know if a policy or
government has achieved its stated objectives and intended effects.
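The logic of such composite measures can be illustrated with a short Python sketch. The indicator names, values and equal weighting below are hypothetical illustrations, not an established index:

# A minimal sketch of combining proxy indicators into a composite
# rule-of-law score. All names, values and weights are hypothetical.
indicators = {
    "judges_per_1000_people": 0.40,
    "political_prisoners": 0.85,      # inverted: fewer prisoners, higher score
    "corruption_control": 0.55,
    "property_rights_respect": 0.60,
    "contract_enforcement": 0.50,
}
# Equal weighting is the simplest choice; real indices often weight
# sub-indicators differently.
composite = sum(indicators.values()) / len(indicators)
print(round(composite, 2))  # 0.58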
Political evaluations aim to assess whether policies and programs are achieving their intended
goals and objectives. This involves examining their impacts, identifying strengths and
weaknesses, and providing recommendations for improvements.
Evaluations hold policymakers and government agencies accountable for their actions and
decisions. They provide evidence to demonstrate whether resources are being used efficiently
and effectively, and whether public funds are being spent responsibly.
Evaluations provide opportunities for learning and adaptation. They help identify what works,
what doesn't, and why. This knowledge can be used to improve future policies and programs.
When conducted rigorously and transparently, evaluations can build public trust in government
by demonstrating that policies and programs are being assessed objectively and that efforts are
being made to improve their effectiveness.
Political evaluations can assess the impact of political processes, such as elections, legislative
processes, and public consultations, on policy outcomes and citizen well-being.
Evaluations help identify unintended consequences of policies and programs. This allows
policymakers to address negative side effects and mitigate potential harm.
Political evaluations can analyze the effects of policies on different groups within society,
assessing whether they benefit or harm specific populations. This includes analysis of equity
and social justice concerns.
By providing information about policy outcomes, evaluations can help citizens participate more
effectively in the political process. They can inform public debates, empower civil society
organizations, and help citizens hold their representatives accountable.
Political evaluation contributes to the broader field of political science by developing and testing
theories about the relationship between policies, political processes, and outcomes.
SELF-ASSESSMENT EXERCISE
1) Describe how political evaluation aids decision-making.
2) State three goals of political evaluation.
4.0 SUMMARY
In this unit, we have provided a brief description of the meaning and context of political
evaluation. We have explained political evaluation as an important form of assessment of
government performance. Political evaluation is an assessment of government performance in
providing both the tangible and intangible goods that governments are expected to provide for
their citizens. The aim is to have a clear idea of the performance of government or to help
improve policy implementation.
5.0 CONCLUSION
In conclusion, political evaluation is very important in any political process that wants to
continue to improve. Indeed, all government programmes should be subject to one form of
evaluation or the other. In real life, political evaluation is a continuous exercise that helps public
accountability and performance of government. It may also help the government in determining
the preferences of citizens. Indeed, it can be used to aid the responsiveness of government to the preferences of its citizens.
STATISTICS
What is Statistics?
As we have seen from the discussion of quantitative and qualitative research strategies,
statistical data and techniques are employed in the process of political evaluation. What does an
individual think of when he or she thinks of statistics? Statistics has two meanings depending on
how it is being used at a given time. When statistics is used in singular form, it means the
discipline or science, which deals with the collection, classification, analysis and use of
numerical data. Brase and Brase (2007:4) define statistics as the “study of how to collect, organise, analyse, and interpret numerical information from data”. But when used in a plural form, it refers to the numerical facts or data themselves, e.g. GNP, divorce, population, salary, age, crime, voter turnout, etc.
Descriptive statistics are quantitative tools of analysis that enable social scientists to describe and summarise data. They involve organising, picturing and summarising a large amount of information from samples or populations. Any average, for example, is a descriptive statistic. So, average daily rainfall or average daily temperature are good examples of descriptive statistics. Examples of descriptive statistics include measures of central tendency (the mean, mode and median of a dataset); measures of variability in the dataset (such as the standard deviation, range and interquartile range); measures of relative position in the dataset, such as percentiles and standard scores; and measures of correlation between two or more variables, such as correlation tests that show how strongly and in what direction two variables are related.
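As an illustration, the descriptive statistics named above can be computed with Python's standard statistics module; the figures are invented for the example:

import statistics

# Invented voter-turnout percentages for ten local government areas.
turnout = [52, 61, 58, 47, 63, 58, 55, 49, 60, 58]
print(statistics.mean(turnout))     # central tendency: 56.1
print(statistics.median(turnout))   # 58.0
print(statistics.mode(turnout))     # 58
print(statistics.stdev(turnout))    # variability (standard deviation)
print(statistics.quantiles(turnout, n=4))  # relative position (quartiles)

# Correlation between two variables (requires Python 3.10+).
confidence = [60, 65, 64, 50, 70, 66, 58, 52, 67, 61]
print(statistics.correlation(turnout, confidence))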
The second type of statistics is inferential statistics. This involves using information from a
sample to draw conclusions regarding the population (Brase and Brase 2007). For instance, we
may want to know what the people of Lagos State think about the performance of Babatunde
Fashola’s administration. Because it would be impractical to ask all of the over 9,000,000 citizens of Lagos State about their assessment of the performance of the administration, we could randomly select
200 citizens and ask them about the governor’s performance. From this sample we will then
infer the views of the citizens of Lagos State. Some basic types of inferential statistics
commonly used in Political Science are:
1. The t-test for significant differences between the means of independent (uncorrelated) groups.
2. The t-test for significant differences between the means of paired or correlated groups.
3. Simple regression analysis for measuring the strength and the direction of relationships
between variables.
4. Analysis of variance (ANOVA) tests for difference on one variable for two or more groups.
5. Analysis of variance (ANOVA) tests for differences on two or more variables between two or more groups, and for any interaction that might result from the two variables.
6. Analysis of variance (ANOVA) as used in pre- and post-test experimental applications.
Numerical data and their analyses help us to make inferences from a small collection of people or items about a larger collection of people or items. Statistics is therefore a central element of
political evaluation. It is particularly useful when we deal with issues that have value or
numerical measurement. Indeed, statistics helps us to understand complex situations and make
decisions.
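As an illustration of the first of these tests, the following Python sketch compares the means of two independent groups; it assumes the third-party SciPy library is available, and the ratings are invented:

from scipy import stats  # SciPy assumed installed

# Invented performance ratings from two independent groups of respondents.
north = [6.1, 5.8, 6.4, 5.9, 6.2, 6.0]
south = [5.2, 5.6, 5.1, 5.4, 5.3, 5.7]

t_stat, p_value = stats.ttest_ind(north, south)
print(t_stat, p_value)
# A small p-value (e.g. below 0.05) suggests the difference between the
# group means is unlikely to be due to sampling error alone.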
SELF-ASSESSMENT EXERCISE
1. State the significance of statistics for evaluation research.
4.0 SUMMARY
The word statistics is used to mean one or more specific measures or values describing
something, a population or a sample of a population. They may be used to summarise a larger
set of numbers, called a dataset or used to measure a smaller group, the sample, and then
used for making assumptions about a larger group, the population. Statistics is a central
element of political evaluation, particularly useful when we deal with issues that have value or
numerical measurement. Statistics helps us to understand complex situations and make
decisions.
5.0 CONCLUSION
In this unit, we have seen that statistics is very important for political evaluation research. Statistics are used to describe data and to make inferences. There are computer programmes that are designed to aid our analysis of statistical datasets. Examples of such programmes include EViews and the Statistical Package for the Social Sciences (SPSS).
It is a reflection of the variable we wish to study. For instance, the guarantee of tenure of
judicial officers may be used as an indicator of the independence of the judiciary.
Sometimes a single indicator may not be sufficient to measure a concept. Hence,
multiple or composite indicators may be used. This is often the case because a concept
may have more than one dimension. Hence, the rule of law may be more adequately measured by multiple indicators like the status of constitutional checks and balances, the leadership’s respect for court orders, police respect for human rights, citizens’ confidence in law enforcement organs, monitoring of violations by police and prisons, civil society monitoring of violations by police and prisons, penalties for violations of human rights by police, and the independence of watch-dog organisations from the executive.
There are several reasons why we engage in measurement in the process of research.
Alan Bryman (2004:66) has identified three of the reasons: First, measurement allows
us to delineate fine differences between people in terms of the characteristic in
question. This is very useful, since, although we can often distinguish between people in
terms of extreme categories, finer distinctions are much more difficult to recognise.
The same point is made by Obasi (1999), who stated that the classificatory function of measurement enables researchers to differentiate between the objects being studied according to the properties they possess. This means that individuals or objects being
studied can be classified because of their possession of certain characteristics. Second,
measurement gives us a consistent device or yardstick for making such distinctions. A
measurement device provides a consistent instrument for gauging differences.
This consistency relates to two things: our ability to be consistent over time in its result
and our ability to be consistent with other researchers - reliability. Third, measurement
provides the basis for more precise estimates of the degree of relationship between
concepts. An example is correlation analysis. For example, we can relate voter turnout
in an election with expressed confidence of citizens in the electoral process.
Measurement helps researchers to provide accurate descriptions of hypotheses about the phenomena. It makes it possible for data to be quantified, thereby becoming amenable to statistical manipulation and treatment. Lastly, measurement makes it possible for hypotheses and theories to be subjected to empirical verification much more easily.
SELF-ASSESSMENT EXERCISE
4.0 SUMMARY
In this unit we have examined the concept of measurement as the assignment of
numbers to things we are interested in analysing. We cannot measure most political
phenomena directly; we therefore measure indirectly by using indicators specified in the
operational definition. An indicator is a set of observations that results from applying the
operational definition. It is a reflection of the variable we wish to study. The aim of
measurement is to achieve maximum clarity concerning the use of a concept in the
context of a particular study.
5.0 CONCLUSION
LEVELS OF MEASUREMENT
This enables researchers to rank or order their subjects, or the responses of their subjects, in such ways as 'greater than' and 'less than'. As a higher level of measurement in relation to nominal measurement, the ordinal measurement enables the researcher not only to differentiate groups but also to express them in 'greater than' or 'less than' relationships, with numbers used to represent the groups. The numbers used show the relative positions of the differentiated groups. Examples of variables conforming to the ordinal scale are social class, ratings of universities, organisations, etc. Agbaje and
Alarape (2005) enumerated some characteristics of the ordinal measurement to include:
Each category used to measure the values of a variable has a unique place relative to
other categories. It is either less than or more than others. However, it conveys no
information as to the extent of difference between or among the categories. In other
words, there is no information or indication of the distance separating the categories.
According to Black and Champion (1976), this level of measurement has all the
properties of the nominal and ordinal levels of measurement in addition to showing
equal spacing between the intervals. It not only tells us the order of things, it also tells
us the interval or distance between them. The interval measurement takes care of the
inadequacy of the ordinal scale. This is because any one unit difference in score
represents the same amount of difference in the variable being quantified as any other
unit difference in score (Obasi 1999). One limitation of interval measurement is that it does not have an absolute zero point to enable one to make ratio statements. Examples of interval measurements are IQ or temperature. If, for example, a person has an IQ of 150 and another has an IQ of 75, one can only say that there is a 75-point difference between their IQ levels. One cannot say that the one person is twice as intelligent as
the other. This illustrates the fact that distances are assumed equal but, because there
is no absolute zero level, ratio comparisons cannot be made.
The ratio measurement has the properties of the highest level of measurement in terms
of precision. Measures of extent (in inches), of weight (in pounds), of time (in seconds),
are illustrations of ratio scales (Garret, 1962). According to Black and Champion (1976),
it is important to note that these different levels of measurement require 'particular set of
statistical procedures and techniques of analysis that are permissible under certain
scientific and mathematical rules'.
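Black and Champion's point can be summarised in a small Python sketch of the comparisons each level of measurement permits (an illustrative summary, not a formal rule set):

levels = {
    #            classify  rank   difference  ratio
    "nominal":  (True,  False, False, False),
    "ordinal":  (True,  True,  False, False),
    "interval": (True,  True,  True,  False),
    "ratio":    (True,  True,  True,  True),
}
# e.g. IQ is interval: differences are meaningful, ratios are not.
classify, rank, difference, ratio = levels["interval"]
print(difference, ratio)  # True False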
SELF-ASSESSMENT EXERCISE
4.0 SUMMARY
In this unit, we have seen that there are various levels of measurement. These are
nominal, ordinal, interval and ratio measurements. Nominal level of measurement
describes a variable that has attributes that are merely different. The ordinal level of measurement describes a variable with attributes that enable us to rank-order them along some dimension. In addition to specifying that two or more people have different attributes, we can also rank one above the other; for instance, one may be more religious or more conservative than the other. At the interval level of measurement, we
describe variables whose attributes are rank - ordered and have equal distances
between adjacent attributes. The Fahrenheit scale is at the level of interval
measurement. This is because the distance between 14 and 15 is the same as that
between 80 and 81. The actual distances are meaningful standard intervals. The ratio
level of measurement describes a variable with attributes that have all the qualities of
nominal, ordinal and interval measures and, in addition, are based on a “true zero”
point. Age and income are examples of attributes that can be subjected to the ratio level of
measurement.
5.0 CONCLUSION
They express varying levels of specificity and precision in measurement. While the nominal measure is the lowest level of measurement, the ratio level is the highest and most precise level of measurement. This therefore calls for a proper understanding of the nature of one's data, so as to know which statistical technique of analysis can appropriately be applied.
SCALES OF MEASUREMENT
What is a Scale?
Scales are devices constructed or employed by researchers to quantify the responses of a subject on a particular variable. Scales are tools used to show a broader aspect of measurement. They can be used to obtain interval data concerning attitudes, judgments or perceptions about almost any subject or object. A scale is a type of composite measure composed of several items that have a logical or empirical structure among them. Scaling involves assigning scores to patterns of responses, recognising that some items reflect a relatively weak degree of the variable while others reflect a stronger degree. Scales offer more assurance of ordinality by tapping the intensity of structures among indicators. Although there are various types of scales, namely the Likert scale, Thurstone scale, Guttman scale, Bogardus social distance scale and semantic differential scale, the most commonly used, due principally to the ease or convenience of its application as well as the simplicity of interpreting its measures, is the Likert scale. Based on this, we shall discuss only the Likert scale here.
There are various types of scale corresponding to specific levels of measurement. The presence or absence of three properties or attributes in a variable determines the type of scale:
1. The existence of magnitude, which is the possibility of comparing different amounts or intensities so as to assess whether two values or levels of a variable are the same, or one is lesser or greater than the other.
2. The existence of equal intervals, which means that the difference between any two adjacent values on the scale is of the same size throughout the scale.
3. The existence of an absolute zero, which is a value indicating that the measurement of a variable is meaningless in circumstances in which the variable is non-existent.
Types of Scales
Based on the three properties of a variable listed above, four measurement scales may be identified. These are nominal, ordinal, interval and ratio scales.
Nominal Scale: This is the most elementary form of scale. It does not measure in the strict sense of the word. What it does is to label or name a variable. A nominal scale is used to classify information into categories or groups. The essence is not to compare the categories or groups, because the classification does not make such comparison possible. The categories are qualitatively different. The variable gender may be classified into ‘male’ and ‘female’. This does not in any way provide for a comparison of the two attributes; it merely classifies cases according to the attribute. The nominal scale is used for such attributes as marital status, or state of origin of respondents.
Ordinal Scale:
This scale provides more information than the nominal scale. It allows some measure of comparison and rank-order between different attributes of a variable. In the ordinal scale, we can assess the magnitude of attributes to the point where we can say that one attribute is more or less than another. For instance, while at the nominal level we can only classify a respondent as happy or unhappy, on the ordinal scale we can say a respondent is very happy, happy, indifferent, unhappy or very unhappy. But this does not suggest that the very happy respondent is twice as happy as the happy respondent. This is the case because there is no unit of measurement to be used in comparing them. That is, only the property of magnitude is present in this scale. Similar variables possessing only magnitude are the grade a student gets in an essay, the tax bracket into which a person falls, and so on.
Interval Scale:
Interval scales are more precise than the ordinal scale in the sense that a comparison between the different occurrences of attributes can be made because of the existence of equal intervals or units of measurement. Interval scales have both order and distance, but they do not have an origin point unless one is assumed for them (Asika 1991: 56). Bless et al (2004) illustrate the point in this way: after the concept of employment has been operationalised, a person is classified as employed or unemployed on a nominal scale; employed full-time, employed part-time or unemployed on an ordinal scale; and employed for a certain number of hours per week on a scale possessing equal intervals (one hour). In the last case, a person employed for forty hours a week is employed twice as long as a person working only twenty hours a week. The unit underlying the scale is the hour. All the scales are based on a set of real numbers. However, the number does not have an absolute zero point. For instance, if the money owned by somebody is used as a unit of measure, a person who owns N1,500.00 but is in debt to the tune of N2,000.00 may end up being classified as -N500.00. This scale has magnitude and equal units of measurement. Although its values of 0 and negative values are meaningful, it has no absolute zero point.
Ratio Scales:
The variables that are subject to the ratio scale have the three properties listed above: magnitude, equal intervals and an absolute zero. A very clear example of a variable with an absolute zero is age. A person can be 50 years old, but no one is minus three years old. This means there is an absolute zero: before birth is the absolute zero, that is, non-existence. The unit or interval is the year. A comparison can be made between the age of a father and his child. If the child is 20 years old and the father is 60, it means that the father is three times as old as the child. Both the interval and ratio scales have ceiling and floor effects. A ceiling effect occurs when a scale does not permit sufficiently high scores, resulting in all responses clustering at the top of the scale. A floor effect occurs when a scale does not permit sufficiently low scores, and all responses cluster at the bottom of the scale. In the case of examinations, for instance, a ceiling effect may make it difficult to distinguish between excellent students and average students.
For a positively phrased statement, the responses from 'strongly agree' to 'strongly disagree' carry the weights:
5 4 3 2 1
On the other hand, if the question is negatively phrased, the weights will change direction.
Example
'Goodluck Jonathan will not be a good president if he wins the 2011 elections'. The weights will now be:
1 2 3 4 5
The Likert scale makes it possible to transform feelings into an interval scale that can be subjected to statistical analysis. Using the Likert scale, we can compare the responses among individuals, and between groups, using chi-square analysis. The Likert scale is also flexible and can be used to measure in minute detail the degree of intensity of feelings or attitudes towards a phenomenon.
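The reverse-coding of negatively phrased items can be illustrated with a short Python sketch; the items and responses are hypothetical:

def score(response, negatively_phrased):
    # Map a 1-5 response to its weight, reversing negatively phrased items
    # so that a higher score always means a more favourable attitude.
    return 6 - response if negatively_phrased else response

responses = [
    (5, False),  # positively phrased item: "strongly agree"
    (1, True),   # negatively phrased item: "strongly disagree"
]
print(sum(score(r, neg) for r, neg in responses))  # 10: same attitude twice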
SELF-ASSESSMENT EXERCISE
1) Describe scaling as a tool of political evaluation research.
2) List at least four different scales that can be used for measurement.
4.0 SUMMARY
Scaling is a tool for measuring the amount of a property possessed by a class of objects
or events. There are various types of scale corresponding to specific levels of
measurement. The absence or presence of three properties or attributes in a variable
determines the type of scale: magnitude, equal interval and absolute zero. Attitude scales, like the Likert scale, consist of a number of attitude statements with which the respondent is asked to agree or disagree along some continuum.
5.0 CONCLUSION
A political researcher needs to conduct research through systematic, careful and deliberate observation in order to understand the real world and to describe objects and events in terms of the attributes composing variables. He further presents these data in a more understandable manner for other readers and researchers by adopting measurement at the appropriate levels and with the appropriate scales of measurement.
Population
There is no way to design research without defining its population and sample. The population is the total number of subjects of interest; the sample is drawn from the population. The population in a study is the group of people or objects the researcher is studying. A population could consist of people, schools, establishments, animals, specimens or even countries. The defined population must have at least one characteristic that differentiates it from other groups. If a researcher is carrying out a study on the attitude of Nigerian women to party politics, the researcher will realise that it is impracticable to reach every woman in Nigeria. He will therefore need to select a sample from a more narrowly defined target population, e.g. women aged between 18 and 45, in order to save time and resources. In defining a research population, the researcher
establishes a boundary of the conditions which stipulate who is to be included or
excluded from the population. Seldom, if ever, will researchers find themselves measuring entire populations. Rather, they are far more likely to draw a sample from the population and measure the elements in that sample. These results are then assumed to apply to the entire population: researchers believe that similar results would be found if every element in the population were measured. Measures taken on a sample are called statistics; measures of a population are called parameters. The process is known as inference,
and the statistical tests that are used for this purpose are called inferential statistics. If
all possible information needed to solve a problem could be collected, there would be
no need to sample. Political science researchers seldom have this luxury; they are
typically limited in time and money.
Therefore, people making decisions based on research use data gathered from
samples, and make their decisions based on probabilities that the sample data mirror
what could be expected if it were possible to survey an entire population. A population is
made up of all conceivable elements, subjects or observations relating to a particular
phenomenon of interest to the researcher (Asika, 1991:39). A population may be finite
or infinite. Several factors have been given as the raison d'être for sampling. Olayinka and Gbadegesin (2005:110) posit that we sample because sampling saves time, resources and energy, and it preserves the items under study if they are fragile. Fadeyi and Adeokun (2004) and Asika (1991) agreed with the reasons above, but added that sampling helps to estimate the population characteristics and, in the circumstances, affords better supervision than a complete coverage of the entire population would. It also helps to obtain quicker results than does a complete coverage of the population.
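The logic of estimating a population parameter from a sample statistic can be illustrated with a short Python sketch using a simulated population; all figures are invented:

import random

random.seed(1)
# A simulated population of 9,000 ages, standing in for a real sampling frame.
population = [random.randint(18, 80) for _ in range(9000)]

sample = random.sample(population, 200)  # simple random sample of 200 units

parameter = sum(population) / len(population)  # population measure
statistic = sum(sample) / len(sample)          # sample measure
print(round(parameter, 1), round(statistic, 1))  # typically close together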
3.2 Sample
The concept of a sample is derived from the idea that it is not desirable, even if it is possible, to study the entire population. This is the major reason a sample has to be taken from the population. It is a way of reducing the data to a manageable size or proportion. A well-drawn sample is a good representation of the entire population, so it should not be assumed that a sample is inferior to the population unless the sample drawn is too small. In order to select a sample to be studied, the researcher has to follow three basic steps:
1. Identify the population.
2. Determine the required sample size.
3. Select the sample using a definite sampling technique.
Sampling Technique/Method
The skill in sampling determines to a considerable extent the degree to which accurate statements about a total population can confidently be made. Sample selection is an important step in any research work. Sampling method refers to the way sample units are selected from a parent population (Bryman, 2004). Sampling techniques or designs can be divided into two types: probability and non-probability sampling designs. Authors have referred to these two types of samples with different concepts: some have called them random and non-random samples, probability and purposive samples, or strategic and non-strategic sample designs. One consideration in choosing between them is:
1. Cost vs Value: The sample should produce the greatest value for the least investment. If the cost of a probability sample is too high in relation to the type and quality of information collected, a non-probability sample is a possible alternative.
4.0 SUMMARY
The entire set of relevant units of analysis is the population. A sample is a subset of the population. To accurately estimate the parameters of a population from a sample, the researcher must ensure that the sample is representative of the population. This means that the researcher must be meticulous in the way sample units are selected from the parent population. Available sampling techniques include probability and non-probability sampling methods.
5.0 CONCLUSION
Samples are used in place of a census of a population because sampling saves cost and time, and sample results can be inferred to the population with measurable accuracy. Researchers must, however, decide on what type of sampling method is appropriate for the research project at hand. While a perfect match between sample and population is impossible, the researcher can achieve a high probability that the sample reflects the population.
Defining a Variable
A variable is something that can change, such as 'gender', and variables are typically the focus of a study.
Asika (1991:6) defines variable as "a construct or concept to which numerical values can be
assigned". Neuman (2000) notes that variables take two or more values. Bryman (2004:29)
summarises variable thus: A variable is simply an attribute on which cases vary. Cases can
obviously be people, but they can also include things such as households, cities, organisations,
schools, and nations. If an attribute does not vary, it is a constant... Constants are rarely of interest to social researchers. Asika (1991) argues that numerical values cannot be assigned to some concepts because they just do not vary. Such invariable concepts can be referred to as constants or parameters.
In scientific research, variables refer to factors or conditions that can change during the course
of an experiment. For example, in political evaluation a political scientist may study factors that
affect change in electoral behaviour. For experimental purposes, communication through the
mass media may be used to manipulate public opinion to change the voting pattern. In a natural
science experiment, certain variables may be manipulated to see how different conditions affect
the temperature at which water boils. The size of the burner and pot used, amount of water,
temperature at which the water is heated and any other item may be manipulated.
These items are all variables. Scientists attempt to change only one of these variables at a time
so that there is no confusion about what caused a change. Variables have attributes, which are
sub-values of a variable, such as 'male' and 'female'. Causal research examines the world in
terms of variables (those things that reveal variation within a population). In computer science
and mathematics, a variable is a symbol denoting a quantity or symbolic representation. In
mathematics, a variable often represents an unknown quantity; in computer science, it
represents a place where a quantity can be stored. Variables are often contrasted with
constants, which are known and unchanging. In other scientific fields such as biology, chemistry
and physics, the word variable is used to refer to a measurable factor, characteristic or attribute of an individual or a system. In a scientific experiment, so-called "independent variables" are factors that can be altered by the scientist. For example, temperature is a
common environmental factor that can be controlled in laboratory experiments. "Dependent
variables" or "response variables" are those that are measured and collected as data. Variables
can be used in open sentences. For instance, in the formula: x + 1 = 5, x is a variable which
represents an "unknown" number. In mathematics, variables are usually represented by letters
of the Roman alphabet, but are also represented by letters of other alphabets; as well as
various other symbols. In computer programming, variables are usually represented by either
single letters or alphanumeric strings.
Defining each variable is often a technical exercise based on specific jargon and uses of words that are characteristic of the area of research. One place to begin is to consider how other researchers defined your variables in their research. The researcher may adjust or adapt other definitions to meet his own needs, or borrow the definitions that other researchers have used. The researcher must also indicate how the variable is to be measured in the research. Some variables, like age or height, are pretty straightforward, but the researcher must still state the method of measurement: "age will be measured in years," or "height will be measured in inches." For more complex variables, like political persuasion or racism, you must indicate how you will quantify these concepts so they can be used as numbers for statistical analysis. Political persuasion might be a nominal variable determined by something as simple as a single question on a survey that asks:
Liberal _________
Conservative __________
Or, the researcher may use a method which produces a ratio variable by asking several questions that would be combined into a score, so that the researcher could determine quantitatively how much of a characteristic (such as liberalism) each subject possesses. However it is done, the researcher must describe his method for doing so in the operationalisation section (describe it in a way that other researchers would make the same decisions about each subject if they followed your method).
Operationalisation therefore means putting a concept into a form that permits some kind of measurement. It is a process through which we turn concepts into usable variables. According to Meier et al (2006), an operational definition is a statement that tells us how a concept will be measured by the analyst. An indicator is a variable or set of observations that results from applying the operational definition. Operational definitions are often not stated explicitly but implied from the research report or briefing. In some cases multiple indicators are used to measure a single concept. This happens when the concept being measured has more than one dimension. Examples of operational definitions provided by Meier et al (2006) include the following:
1. Educational attainment for Head Start participants is defined by the achievement scores on the Iowa Tests of Basic Skills.
2. A convict is considered a recidivist if, within 1 year of release from jail, the convict is rearrested and found guilty.
3. An active volunteer in the Environmental Justice Association is defined as a person who donates her or his time to the association at least 5 hours per week on average.
One very interesting study of social capital, done by Robert Putnam (1993), operationalized
social capital as social group membership.
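An operational definition of this kind can be stated precisely enough to be written as a small function. The following Python sketch restates Meier et al's second example; the dates are illustrative assumptions:

from datetime import date

def is_recidivist(released, reconvicted):
    # Operational definition: guilty rearrest within 1 year of release.
    # reconvicted is None when no guilty rearrest has occurred.
    if reconvicted is None:
        return False
    return (reconvicted - released).days <= 365

print(is_recidivist(date(2020, 1, 15), date(2020, 11, 3)))  # True
print(is_recidivist(date(2020, 1, 15), None))               # False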
CLASSIFICATION OF VARIABLES
There are some ambiguities and debates about how to classify variables. While the meaning of
the concept is clear in scientific communities, there are various classifications and
representations of variables. Various fields classify and apply them differently. While an attempt is made in this unit to provide varied classifications, a researcher should be careful to understand the classification in his own research community and employ it accordingly, in order to reduce ambiguity and enhance readers' understanding of his work. However, it is pertinent to distinguish between two broad types of variables under which other classifications are made: quantitative (or numeric) and qualitative (non-numeric). Each is broken down into two sub-types: qualitative data can be ordinal or nominal, and quantitative data can be discrete (often, integer) or continuous. Bryman (2004:29) states that "it is common to distinguish between different types of variables. The most basic distinction is between independent and dependent variables. The former are deemed to have causal influence on the latter". Silberstein (2010) observes that in any science experiment, there are three types of variables: independent, dependent and controlled variables.
Dependent Variables
The dependent variable is a variable that changes as a result of the independent variable. If candidate A wins an election a few days after an opinion poll indicated that candidate B was leading with a very high margin, because the police declared candidate B wanted a day before the election, then the change in the voting pattern is the dependent variable. The dependent variable is
the outcome of the independent variable being changed or manipulated. In graphical
representation, the dependent variable is put on the Y-axis. The Holy Grail for researchers is to
be able to determine the relationship between the independent and dependent variables, such
that if the independent variable is changed, then the researcher will be able to accurately predict
how the dependent variable will change.
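This predictive use of the independent-dependent relationship can be illustrated with a simple regression sketch in Python; it assumes the third-party NumPy library, and all figures are invented:

import numpy as np  # NumPy assumed installed

spend = np.array([1.0, 2.0, 3.0, 4.0, 5.0])            # X: spending (millions)
vote_share = np.array([32.0, 35.5, 41.0, 44.0, 48.5])  # Y: vote share (%)

slope, intercept = np.polyfit(spend, vote_share, deg=1)  # fit Y = aX + b
print(slope * 6.0 + intercept)  # predicted vote share for a spend of 6.0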
Controlled variables are variables that the scientist does not want to change. They nevertheless help the scientist in determining the effects of the stimulus applied to the experimental group. For instance, if a political scientist wants to determine the effects of appeals to ethnicity at a campaign rally on voting behaviour, he could set up an experimental group that is then exposed to a campaign rally with a dose of appeal to ethnic identification. He would administer a questionnaire regarding their voting preferences before and after the campaign rally. He would also set up another group, the control group, that is not exposed to the campaign rally. The questionnaire would be administered to the control group in a similar manner. The comparison of the responses of the control and experimental groups at the end of the experiment points to the effect of the experimental stimulus of exposure to a campaign rally with a heavy appeal to ethnic identity.
Whenever an explanatory variable is endogenous, and thus E(X′ε) does not equal zero in theory, this means that one of the independent variables is not fixed and that it is potentially correlated with the errors. This can happen in a number of different ways. In political science, the classic
example of instrumental variable use can be thought of as necessitated by an omitted variable.
Suppose we want to estimate the effects of campaign contributions received by a candidate
upon that candidate's vote share. A fundamental challenge for scholars estimating the impact of
money on votes is that both variables may be influenced by perceptions about the threat posed
by a challenger, a factor that is notoriously difficult to measure (leaving us with an omitted
variable).
Even when one accounts for past performance of the challenger's party and candidate quality,
a link between the unexplained variance in naira and votes remains. This shared error most
likely reflects the unmeasured perceptions of the challenger's chances. Researchers have
attempted to deal with this simultaneity by using two-stage least squares estimations (Jacobson,
1978; Green and Krasno, 1988; Gerber, 1998). These procedures first predict candidate finances from factors that are (in theory) not directly related to election outcomes, and then use these systematic figures, purged of their candidate-specific information, to explain vote totals.
In social science research, certain concepts such as belief, joy, peace, justice, etc. cannot be measured directly. This is because they are value-based and highly normative in nature. Such concepts are often philosophically contested. Researchers have tended to study such concepts using other closely related concepts that can serve as pointers to, and indicate the presence of, the concepts being examined. These closely related concepts that are used to stand in for and indicate others are referred to as proxy variables.
We can distinguish between two types of variables according to the level of measurement:
1. Continuous or Quantitative Variables.
2. Discrete or Qualitative Variables.
A quantitative variable is one whose values differ in magnitude, e.g. income, age, GNP, etc. A qualitative variable is one whose values differ in kind rather than in magnitude, e.g. marital status, gender, nationality, etc.
Interval scale data have order and equal intervals. Interval scale variables are measured on a linear scale and can take on positive or negative values. It is assumed that the intervals keep the same importance throughout the scale. They allow us not only to rank order the items that are measured but also to quantify and compare the magnitudes of differences between them. We can say that a temperature of 40°C is higher than 30°C, and an increase from 20°C to 40°C is twice as much as the increase from 30°C to 40°C. Counts, such as counts of publications or citations, or years of education, are interval scale measurements.
Continuous ordinal scale data occur when the measurements are continuous, but one is not certain whether they are on a linear scale, the only trustworthy information being the rank order of the observations. For example, if a scale is transformed by an exponential, logarithmic or any other nonlinear monotonic transformation, it loses its interval-scale property. Here, it would be expedient to replace the observations by their ranks.
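Replacing observations by their ranks can be illustrated in a few lines of Python (ties here would be broken arbitrarily; averaging tied ranks is a common refinement):

values = [3.2, 150.0, 9.7, 41.5, 0.8]

# Indices of the observations, ordered from smallest value to largest.
order = sorted(range(len(values)), key=lambda i: values[i])
ranks = [0] * len(values)
for rank, i in enumerate(order, start=1):
    ranks[i] = rank

print(ranks)  # [2, 5, 3, 4, 1]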
Ratio scale data are continuous positive measurements on a nonlinear scale. A typical example is the growth of a bacterial population (say, with an exponential growth function). In this model, equal time intervals multiply the population by the same ratio (hence the name ratio scale). Ratio data are also interval data, but they are not measured on a linear scale. With interval data, one can perform logical operations, add, and subtract, but one cannot multiply or divide. For instance, if a liquid is at 40 degrees and we add 10 degrees, it will be 50 degrees. However, a liquid at 40 degrees does not have twice the temperature of a liquid at 20 degrees, because 0 degrees Celsius does not represent "no temperature"; to multiply or divide in this way we would have to use the Kelvin temperature scale, with a true zero point (0 degrees Kelvin = -273.15 degrees Celsius). In the social sciences, the issue of "true zero" rarely arises, but one should be aware of the statistical issues involved. There are three different ways to handle the ratio-scaled variables:
a. Simply as interval scale variables. However, this procedure should be avoided as it can
distort the results.
b. As continuous ordinal scale data.
c. By transforming the data (for example, a logarithmic transformation) and then treating the results as interval scale variables.
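Option (c) can be illustrated with a short Python sketch: a logarithmic transformation turns equal ratios into equal intervals. The population figures are invented:

import math

population = [1000, 2000, 4000, 8000]  # doubles each period (ratio scale)
logged = [math.log(p) for p in population]

# Differences between successive logged values are now equal intervals.
print([round(b - a, 3) for a, b in zip(logged, logged[1:])])  # all 0.693 (ln 2)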
3.2.2 Qualitative or Discrete Variables
Discrete variables are also called categorical variables. A discrete variable, X, can take on a
finite number of numerical values, categories or codes. Discrete variables can be classified into
the following categories:
1. Nominal variables.
2. Ordinal variables.
3. Dummy variables from quantitative variables.
4. Preference variables.
5. Multiple response variables.
1. Nominal Variables
Nominal variables allow for only qualitative classification. That is, they can be measured only in terms of whether the individual items belong to certain distinct categories, but we cannot quantify or even rank order the categories. Nominal data have no order, and the assignment of numbers to categories is purely arbitrary. Because of this lack of order or equal intervals, one cannot perform arithmetic (+, -, /, *) or logical operations (>, <, =) on nominal data. Typical examples of such variables are gender, marital status and state of origin.
2. Ordinal Variables
A discrete ordinal variable is a nominal variable, but its different states are ordered in a
meaningful sequence. Ordinal data has order, but the intervals between scale points may be
uneven. Because of lack of equal distances, arithmetic operations are impossible, but logical
operations can be performed on the ordinal data. A typical example of an ordinal variable is the
socio-economic status of families. We know 'upper middle' is higher than 'middle' but we cannot
say 'how much higher'. Ordinal variables are quite useful for subjective assessments of quality, importance or relevance. Ordinal scale data are very frequently used in social and behavioural
research. Almost all opinion surveys today request answers on three-, five-, or seven-point
scales. Such data are not appropriate for analysis by classical techniques, because the
numbers are comparable only in terms of relative magnitude, not actual magnitude. Consider for
example a questionnaire item on the time involvement of scientists in the 'perception and
identification of research problems'. The respondents were asked to indicate their involvement
by selecting one of the following codes:
2 = low
3 = medium
4 = great
5 = very great
Here, the variable ’Time Involvement' is an ordinal variable with 5 states. Ordinal variables often
cause confusion in data analysis. Some statisticians treat them as nominal variables. Other
statisticians treat them as interval scale variables, assuming that the underlying scale is
continuous, but because of the lack of a sophisticated instrument, they could not be measured
on an interval scale.
3. Dummy Variables from Quantitative Variables
Dummy or coded variables can be created from a quantitative variable by grouping its values into categories. For example, the quantitative variable age may be recoded as follows:
Up to 25 = 1
25-40 = 2
40-50 = 3
50-60 = 4
Above 60 = 5
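Such a recoding is easy to express as a small function. The following Python sketch follows the table above; note that the treatment of boundary ages (e.g. exactly 25) is an assumption, since the table does not specify it:

def age_code(age):
    # Return the code for an age, following the table above.
    if age <= 25:
        return 1
    elif age <= 40:
        return 2
    elif age <= 50:
        return 3
    elif age <= 60:
        return 4
    return 5

print([age_code(a) for a in (19, 33, 47, 55, 72)])  # [1, 2, 3, 4, 5]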
4. Preference Variables
Preference variables are specific discrete variables, whose values are either in a decreasing or an increasing order. For example, in a survey, a respondent may be asked to indicate the importance of the following factors in a voting decision by ranking them from the most important factor to the least important factor:
1. Party manifesto
2. Party candidate
3. Ethnic group of candidates
4. Status of the party relative to other parties
5. Religious affiliation of the candidate
Note that preference data are also ordinal. The interval distance from the first preference to the
second preference is not the same as, for example, from the sixth to the seventh preference.
5. Multiple Response Variables
Multiple response variables are those, which can assume more than one value. A typical
example is a survey questionnaire about the use of computers in research. The respondents
were asked to indicate the purpose(s) for which they use computers in their research work. The
respondents could score more than one category:
1. Statistical analysis
2. Lab automation/ process control
3. Data base management, storage and retrieval
4. Modelling and simulation
5. Scientific and engineering calculations
6. Computer aided design (CAD)
7. Communication and networking
8. Graphics
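One simple way to store and tally a multiple response variable is sketched below in Python; the selections are invented:

from collections import Counter

# Each respondent's selections, stored as a set of the codes listed above.
respondents = [
    {1, 3, 7},
    {1, 4},
    {1, 3, 8},
]

counts = Counter(code for selection in respondents for code in selection)
print(counts[1])  # 3 respondents selected "statistical analysis"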
SELF-ASSESSMENT EXERCISES
1) Explain the basis on which variables are classified.
2) Distinguish between dependent and independent variables.
Descriptive Statistics
Descriptive statistics is a term used to refer to numbers that summarise a group of data.
This data may be about the number of votes won by a political party in several elections
or the number of children immunised monthly in Ibadan metropolis or the number of
students admitted into a University annually in five years.
Presented in their raw form, such data do not tell us anything significant about the relative performance of, say, each political party; nor do they tell us how close the parties were in each election. They need to be organised and interpreted using a variety of statistical techniques in order to tell us more about the performance of the parties in the elections. A measure of central tendency is a number, score or data value that represents the average in a group of data. There are three measures of central tendency: the mean, the median and the mode. However, before these can be estimated for a particular dataset, there is usually a need to organise the data into a frequency distribution.
When summarising large masses of raw data, it is often useful to classify the data. The
classification of data is usually based on a particular trait, characteristic or variable. A
class is one of the group categories of the variable. The class frequency is the number
of observations or occurrences of the variable within the class. A tabular arrangement of
data by classes together with the corresponding class frequencies is called a frequency
distribution. The tabular summary of data for a single variable is a frequency
distribution. Therefore, a frequency distribution shows the number of data values in each of several non-overlapping classes. It can also be summarised in a graphical structure. A frequency distribution is, in most cases, a tabular summary of a set of data showing the frequency (or number) of items in each of several non-overlapping classes (Anderson, et al., 1981:97). It represents data in a relatively compact form, gives a good overall picture, and contains adequate information for many purposes.
Data organised and summarised in a frequency distribution are often called grouped data. Although the grouping process generally destroys much of the original detail of the data, an important advantage is gained in the clear overall picture that is obtained and in the vital relationships that are made evident.
The class interval is the distance between the upper limit of one class and the upper limit of the next higher class, while the class mid-point is the point halfway between the upper and lower class boundaries. Consider, for example, a class of heights written as 60-62. The end numbers, 60 and 62, are called class limits; the smaller number (60) is the lower class limit, and the larger number (62) is the upper class limit. A symbolic definition of a class, such as 60-62, is called a class interval. The terms class interval and class are often used interchangeably, although the class interval is actually a symbol for the class. If heights are recorded to the nearest inch, the class interval 60-62 theoretically includes all measurements from 59.5 to 62.5 inches. These numbers, indicated briefly by the exact numbers 59.5 and 62.5, are called class boundaries or true class limits; the smaller number (59.5) is the lower class boundary, and the larger number (62.5) is the upper class boundary. In other words, subtract 0.5 from the lower class limit and add 0.5 to the upper class limit:
e.g. 60 – 0.5 = 59.5 = lower class boundary
62 + 0.5 = 62.5 = upper class boundary
The size or width of a class interval is the difference between the lower and upper class boundaries and is also referred to as the class width, class size or class length. If all class intervals of a frequency distribution have equal width, this common width is denoted by C. In such a case, C is equal to the difference between two successive lower class limits or two successive upper class limits, e.g. C = 65.5 – 62.5 = 3.
General rules for forming frequency distributions
1. Review the data to find the lowest and highest values, the largest and smallest numbers in the raw data, and thus find the range (the difference between the largest and smallest numbers).
2. Divide the range into a convenient number of class intervals having the same size. If this is not feasible, use class intervals of different sizes or open class intervals. The number of class intervals is usually taken between 5 and 20, depending on the data. Class intervals are also chosen so that the class marks (mid-points) coincide with actually observed data. This tends to lessen the so-called grouping error involved in further mathematical analysis. However, the class boundaries should not coincide with actually observed data.
3. Determine the number of observations falling into each class interval: that is, find the class frequencies. This is best done by using a tally, or score sheet.
4. Avoid classes so narrow that some intervals have zero observations.
5. Make all the class intervals equal unless the top or bottom class is open ended.
6. Use open-ended intervals only when closed intervals would result in class frequencies of zero. This usually happens when some values are extremely high or extremely low.
7. Try to construct the intervals so that the mid-points are whole numbers.
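These rules can be illustrated with a short Python sketch that groups invented raw scores into equal-width classes and tallies the class frequencies:

raw = [61, 67, 72, 58, 64, 66, 70, 59, 63, 65, 74, 62]  # invented scores

low, high = min(raw), max(raw)  # rule 1: find the range
width = 5                       # chosen class size (rule 2)
for lower in range(low, high + 1, width):
    upper = lower + width - 1
    freq = sum(lower <= x <= upper for x in raw)  # rule 3: tally
    print(lower, "-", upper, ":", freq)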
3.3 Mean
This is a measure of central tendency which is central to probability statistics. It is commonly referred to as the “average”. It is the arithmetic average of a set of numbers. It is useful both in indicating the characteristic value of a distribution and as a simple index of a variable.
The ungrouped frequency distribution becomes inappropriate where a long list of scores
also has repeated scores. The table becomes long, unwieldy and meaningless. To solve
this problem, scores can be grouped into separate classes with frequencies of
occurrence of scores in each class matched against them. This is a very useful method
for compressing very large data into desirable number of classes. The following should
be noted in grouped frequency distribution:
i. Class width: the number of score units spanned by a class; commonly denoted by i.
ii. Class: a range of scores defined by a lower limit and an upper limit.
iii. Number of groups: the number of classes in the distribution.
iv. Mid-point value: the average of the lower and upper limits of a given class.
Thus, in calculating the mean for grouped data we apply the logic for ungrouped data. The mean for grouped data refers to the sum of all the values divided by the number of values. Whenever grouped data are used for calculation, it is assumed that all values are spread evenly throughout the interval. Thus, the mean of the first class, or any class, is taken as the mid-point of the class. In the example below, the mid-point for the first class is 54.
Example:
Given the weights of 34 students in the Department of Political Science, University of Ibadan:
Weight f
52-56 3
57-61 6
62-66 10
67-71 4
72-76 8
77-81 3
To calculate the mean for these data, we find the mid-point value (x) of each class, multiply it by the class frequency (f), and sum the products:
Weight x f fx
52-56 54 3 162
57-61 59 6 354
62-66 64 10 640
67-71 69 4 276
72-76 74 8 592
77-81 79 3 237
Mean = Σfx / Σf = 2261 / 34 ≈ 66.5
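The same calculation can be expressed as a short Python sketch:

classes = [(52, 56, 3), (57, 61, 6), (62, 66, 10),
           (67, 71, 4), (72, 76, 8), (77, 81, 3)]  # (lower, upper, f)

total_fx = sum((lo + hi) / 2 * f for lo, hi, f in classes)  # mid-point * f
total_f = sum(f for _, _, f in classes)
print(round(total_fx / total_f, 1))  # 66.5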
Advantages
1. It is easy to calculate.
2. The principle of arithmetic mean is easy to understand.
3. Its calculation is clear and precise.
4. It provides a good measure of comparison.
Disadvantages
1. For qualitatively classified or nominal data, the mean is meaningless.
2. Computational complications arise when there are unbounded classes.
3. There are situations when, used as a summary measure, it is not particularly meaningful.
4. It cannot be obtained graphically.
Median
According to Meier et al (2006: 77), the median is “the middle observation in a set of numbers when the observations are ranked in order of magnitude”. It is obtained by rank-ordering the values of the observations by magnitude and then choosing the value that is in the middle of the rank order. When the number n of observations is odd, and the observations are arranged in ascending order, the median is simply the middle value.
Example: Find the median of the following marks: 44, 40, 79, 42, 51, 59, 71, 44, 45, 51, 59, 65, 71.
Ranked in order, the middle (7th) value gives the median = 51.
For an even number of observations, the median is taken as the mean of the two middle values.
Example: The numbers in attendance at twelve lectures in POS 702 are as shown; find the median: 40, 32, 30, 24, 40, 38, 35, 40, 28, 32 and 37.
Median = (35 + 37) / 2 = 36
For grouped data, the median is obtained as the (n/2)th observation whether n is even or odd. In making the calculation, the first step is to identify the class in which the median falls. Suppose it falls in a class which begins at Bi and ends at Bii; that is, the median lies between Bi and Bii. If the cumulative frequency preceding the class containing the median is cfp, the frequency of the interval Bi-Bii is Fm, and the width of the interval containing the median is i, then the median is:

Median = Bi + ((n/2 - cfp) / Fm) × i
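As a minimal Python sketch of this formula, assuming class boundaries of 51.5, 56.5, 61.5 and so on for the weight table above:

# Grouped median: Median = Bi + ((n/2 - cfp) / Fm) * i
boundaries = [51.5, 56.5, 61.5, 66.5, 71.5, 76.5, 81.5]  # assumed class boundaries
frequencies = [3, 6, 10, 4, 8, 3]

n = sum(frequencies)  # 34, so n/2 = 17
half = n / 2

# Walk through the classes until the cumulative frequency reaches n/2
cum = 0
for k, f in enumerate(frequencies):
    if cum + f >= half:
        Bi = boundaries[k]                     # lower boundary of the median class
        cfp = cum                              # cumulative frequency before that class
        Fm = f                                 # frequency of the median class
        i = boundaries[k + 1] - boundaries[k]  # width of the median class
        break
    cum += f

print(f"Median = {Bi + ((half - cfp) / Fm) * i:.2f}")  # 65.50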
Mode
The mean and the median are not always appropriate tools, particularly for distributions that are strongly asymmetrical or for nominal data. The mode is the value that occurs most frequently in a distribution. The mode offers two main advantages: first, it requires no calculation, only counting; second, it can be determined even for qualitative or nominal data.
Example:
The 20 meetings of the Faculty of the Social Sciences Board of Examiners were attended by 25, 25, 28, 23, 25, 24, 24, 21, 23, 28, 26, 24, 32, 25, 27, 24, 23, 24 and 22 of its members. Find the mode.
From the result, 24 occurs five times and is therefore the modal attendance.
From the grouped weight table above, the modal class is 62-66 (class boundaries 61.5 to 66.5), since it carries the highest frequency (10). To obtain the mode we use the formula:

Mode = L + (Δ1 / (Δ1 + Δ2)) × h

where L is the lower boundary of the modal class (61.5), Δ1 is the excess of the modal frequency over the frequency of the preceding class (10 - 6 = 4), Δ2 is the excess of the modal frequency over the frequency of the following class (10 - 4 = 6), and h is the class width (5). Thus:

Mode = 61.5 + (4 / (4 + 6)) × 5 = 61.5 + (0.4)(5) = 61.5 + 2 = 63.5
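For ungrouped data the mode really does require only counting. A minimal Python sketch using the Board of Examiners attendance figures above:

from collections import Counter

# Attendance figures from the Board of Examiners example above
attendance = [25, 25, 28, 23, 25, 24, 24, 21, 23, 28,
              26, 24, 32, 25, 27, 24, 23, 24, 22]

# The mode is simply the most frequent value; no arithmetic is required
value, count = Counter(attendance).most_common(1)[0]
print(f"Mode = {value} (occurs {count} times)")  # Mode = 24 (occurs 5 times)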
ASSESSMENT
INFERENTIAL STATISTICS: ESTIMATION AND HYPOTHESIS TESTING
A statistical inference is the process by which conclusions are drawn about some measure or attribute of a population based upon analysis of sample data. Inferential statistics involves the use of mathematical methods that employ probability theory for deducing (inferring) the properties of a population from the analysis of the properties of a set of data (a sample) drawn from it. Inferential statistics is also concerned with the precision and reliability of the inferences it helps to draw. Thus, inferential statistics are quantitative tools of analysis in the social sciences that enable us to analyse data, test hypotheses, and draw conclusions, especially about populations from studies of samples. Good examples are correlation coefficient analysis, multiple correlation and multiple regression, factor analysis, the F-test, and others. When these statistical tools are applied to social data, the result is called social statistics. The heart of statistics is inferential statistics. Inferential statistics are used when we want to draw conclusions about a population from a sample drawn from that population.
For example, we may want to determine whether one candidate is more popular than another over a period of time, or whether two political parties differ in how they perform in presidential elections. Inferential statistics are often complex and may have several different interpretations. However, the goal of inferential statistics is to establish some property or general pattern about a large group by studying a smaller group, in the hope that the results will generalise to the larger group.
For example, we may ask residents of Abuja their opinion about senatorial elections in Abuja. We would probably poll a few thousand individuals in the city in an attempt to find out how the city as a whole views the issue. This leads us to apply basic inferential statistics. Inferential statistics is normally used to compare two or more groups, and attempts are normally made to figure out whether the groups differ from one another. For example, suppose a drug company has developed a pill intended to shorten recovery time from the common cold. How do we find out whether the pill works or not? Using inferential statistics, we can take two groups of people from the same population. For example, we select two groups of people from a small part of Abuja who have just caught a cold, administer the pill to one group (group N), and give the other group a placebo (group K). We then measure how many days each group takes to recover.
When we measure something in a population, the measure is called a parameter. When we measure something in a sample, it is called a statistic. For instance, if I obtained the average age of all parents in single-family homes, the measure would be a parameter; if I measured the average age of a sample of those individuals, it would be a statistic. As another example, we may wish to test whether Extra Panadol is better than Septrin at relieving pain. To do this, we cannot give these drugs to everyone in the population; that is not practical, since the general population is too large. Instead, we would give them to a couple of hundred people and see which one works better for them. With inferential statistics, we can infer that what was true for a few hundred people is also true for a very large population of hundreds of thousands of people. In our case of the drug company that administered the pill to two different groups, the company can now measure how many days each group took to recover.
What is to be done at this juncture is to calculate the mean of each group. Let us assume that the mean recovery time for the group given the drug was 4.4 days, and the mean recovery time for the group given the placebo was 4.8 days. The question then is: is the difference due to random chance, or does taking the pill actually help one recover from the cold faster? The means of the two groups alone do not answer this question; we need additional information, namely the sample size and the variability of the data.
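As a minimal sketch of how such a comparison might be run: the recovery times below are hypothetical, chosen so that the group means match the 4.4 and 4.8 days in the text, and a two-sample t-test (here via SciPy) is one common way of weighing the difference in means against sample size and variability:

from scipy import stats

# Hypothetical recovery times (days) for the drug and placebo groups
drug = [4, 5, 4, 4, 5, 4, 5, 4, 4, 5]          # mean = 4.4
placebo = [5, 5, 4, 5, 5, 4, 5, 5, 5, 5]       # mean = 4.8

# The t-test weighs the difference in means against the sample size
# and the variability within each group
t_stat, p_value = stats.ttest_ind(drug, placebo)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")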
There are basically two types of statistical inference: estimation and hypothesis testing. Estimation is concerned with estimating population characteristics, such as the mean and the standard deviation, from sample characteristics. Hypothesis testing is the process of setting up a theory or hypothesis about some characteristic of the population and then sampling to see whether the hypothesis is supported or not.
As we noted earlier, a population is the total set of items we are concerned about. Any set of people or objects with something in common can be referred to as a population, and anything could be one: we could have a population of university students, or we might be interested in the population of the elderly. Other examples include, among others, single-parent families, people with depression, or burn victims. For anything we might be interested in studying, we could define a population. In most cases, we would like to test something about that population. A population is the entire group of people you would like to know something about. For example, we might wish to test whether a new drug is effective for a specific group.
If a sample is not a random sample, then the rules of statistical inference under consideration do not necessarily hold. A large enough sample, drawn at random, ensures a fair and representative picture of the population. A statistic is a measure that is used to summarise a sample; the mean, the standard deviation, and the median of a sample are all statistics. A measure that is used to summarise a population is called a parameter.
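The distinction can be made concrete with a small simulation; a minimal sketch using a hypothetical population of parents' ages:

import random
import statistics

random.seed(1)

# Hypothetical population: ages of 10,000 parents
population = [random.gauss(42, 8) for _ in range(10_000)]

# Parameter: a summary computed on the whole population
parameter = statistics.mean(population)

# Statistic: the same summary computed on a random sample
sample = random.sample(population, 100)
statistic = statistics.mean(sample)

print(f"Population mean (parameter): {parameter:.2f}")
print(f"Sample mean (statistic): {statistic:.2f}")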
Among the desirable properties of an estimator is sufficiency: a sufficient estimator uses all the information in the sample in estimating the required population parameter.
It must, however, be noted that all the population values are usually not available, so we have to sample in order to estimate the population parameters. Hence, it is necessary to distinguish between the symbols used for sample statistics and population parameters, as shown in the table below.

Measure               Population parameter    Sample statistic
Mean                  μ                       x̄
Standard deviation    σ                       s
Number of items       N                       n
Suppose the mean number of arrests by the police officers in a station is 15.0. In a situation where the population parameter is not known, and therefore cannot be calculated, or the population is too large and therefore unwieldy to work with, the mean of a sample can be used to estimate the population mean. We might, for instance, take a random sample of five police officers from Sango police station, using a random number table, and calculate the mean for those five.
It is immediately obvious that the sample mean is an estimate of the population mean. It is not exact, but close. The discrepancy is the result of sampling error, which arises because the sample is not a perfect representation of the population. However, if we took numerous samples of five, the average sample mean would approach the population mean.
The standard deviation of the mean estimates is called the standard error. To find the standard error of the mean, we calculate a mean for each sample and then calculate the standard deviation of those mean estimates.
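A short simulation makes this concrete; the population of arrest counts below is hypothetical:

import random
import statistics

random.seed(42)

# Hypothetical population: monthly arrest counts for 100 police officers
population = [random.randint(5, 25) for _ in range(100)]
print(f"Population mean: {statistics.mean(population):.2f}")

# Draw many random samples of five officers and record each sample mean
sample_means = [statistics.mean(random.sample(population, 5)) for _ in range(1000)]

# The average of the sample means approaches the population mean, and the
# standard deviation of the sample means is the (empirical) standard error
print(f"Average of sample means: {statistics.mean(sample_means):.2f}")
print(f"Standard error (empirical): {statistics.stdev(sample_means):.2f}")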
A hypothesis is a statement about the world that may be tested to determine whether it is true or false; it is a testable belief or opinion. Hypothesis testing is the process whereby such beliefs or statements are tested by statistical means. Examples of testable hypotheses include the following:
1. Following the mounting of roadblocks by the police, the number of highway robberies dropped.
2. After implementing the civil service reform, the productivity of the service has increased.
A hypothesis expressed in the negative is called the null hypothesis; the research hypothesis is expressed in the positive. In order to test a null hypothesis, you must first state the research hypothesis. A research hypothesis is always paired with a null hypothesis, and the hypothesis actually tested is the null hypothesis, designated H0. If the research hypothesis states that an outcome has been realised, the null hypothesis states that the outcome has not been realised. If the research hypothesis states that a change has occurred, the null hypothesis states that no change has occurred.
Significance Levels
When a sample is taken to test a hypothesis, there is no guarantee that the information from the sample data completely supports the hypothesis. This may be due either to the original hypothesis being wrong or to the sample being somewhat unrepresentative; indeed, all samples are to a greater or lesser extent unrepresentative. It is therefore important to test which of the two is the case. The test shows whether any difference can be attributed to ordinary random factors or not. If the difference is probably not due to chance factors, the difference is said to be statistically significant. Since we can never say with 100% certainty that a difference is significant, because we are dealing with samples and random factors, various levels of significance are chosen, most commonly 10%, 5% or 1%.
Statistics is normally used to compare two or more groups, and attempts are normally made to figure out whether the groups differ from one another. But if our sample consisted of only two people, that is, one from the drug group and one from the placebo group in the drug case we alluded to earlier, there would be so few participants that we would have little confidence that there is a difference between the two groups. That is to say, there is a high probability that chance explains our results, since many alternative explanations could account for them; for example, one person might be younger and thus have a better immune system. However, if our sample consisted of 1,000 people in each group, the results become much more robust: while it might be easy to say that one person is younger than another, it is hard to say that 1,000 random people are younger than another 1,000 random people. If the sample is drawn at random from the population, then these 'random' variations in participants should be approximately equal in the two groups, given that the two groups are large. This is why inferential statistics works best when there are lots of people involved. Even with a large enough sample size, we still need more than the group means to reach a conclusion: we need some measure of variability.
Variability
It takes about 5-6 days to recover completely from a cold. But it is pertinent to ask whether everyone takes 5-6 days, or whether some people recover in 1 day and others in 10 days. Understanding the spread of the data tells us how effective the pill is. If everyone in the placebo group takes exactly 4.8 days to recover, then it is clear that the pill has a positive effect. But if people vary widely in their length of recovery (and they probably do), then the picture becomes a little fuzzy. It is only when the means, the sample sizes and the variability have been calculated that a proper conclusion can be drawn. In the case under review, if the sample size is large and the variability is small, then we would obtain a small p-value (probability value). A small p-value is good, and the term is prominent enough to warrant further discussion.
P-Values
In classic inferential statistics, two hypotheses are made before the commencement of a study: the null hypothesis and the alternative hypothesis. The null hypothesis states that the two groups we are studying are the same, while the alternative hypothesis states that the two groups are different. The goal of classic inferential statistics is to prove the null hypothesis wrong; the logic is that if the two groups are not the same, then they must be different. A low p-value indicates a low probability that the null hypothesis is correct, thus providing support for the alternative hypothesis.
What a p-value actually means is the following: it is the probability that you would obtain these or more extreme results, assuming that the null hypothesis is true. For example, if we obtained a p-value of 0.01, or 1%, for our drug experiment, it would mean that the probability of obtaining a difference between the two groups at least this large is 1%, assuming that the two groups are in fact not different. When interpreting p-values, it is important to understand that a p-value does not tell you the probability that the null hypothesis is wrong.
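This definition can be illustrated by simulation. The sketch below (hypothetical recovery-time data, matching the 4.4- and 4.8-day means used earlier) estimates a one-sided p-value with a permutation test: it repeatedly shuffles the group labels, which enforces the null hypothesis that the groups are the same, and counts how often a difference at least as large as the observed one arises by chance:

import random
import statistics

random.seed(0)

# Hypothetical recovery times (days) for the drug and placebo groups
drug = [4, 5, 4, 4, 5, 4, 5, 4, 4, 5]          # mean = 4.4
placebo = [5, 5, 4, 5, 5, 4, 5, 5, 5, 5]       # mean = 4.8

observed = statistics.mean(placebo) - statistics.mean(drug)

# Shuffle the pooled data many times; under the null hypothesis the group
# labels are arbitrary, so differences as extreme as the observed one
# should be rare if the pill really works
pooled = drug + placebo
n = len(drug)
trials = 10_000
count = 0
for _ in range(trials):
    random.shuffle(pooled)
    diff = statistics.mean(pooled[n:]) - statistics.mean(pooled[:n])
    if diff >= observed:
        count += 1

print(f"Observed difference: {observed:.2f} days")
print(f"Estimated one-sided p-value: {count / trials:.3f}")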
SUMMARY
A statistical inference involves drawing conclusions about a population from samples. Estimation is concerned with estimating population parameters from sample statistics. The central limit theorem states that the means of samples tend to be normally distributed almost regardless of the shape of the original population. The standard deviation of the distribution of means is called the standard error of the mean. Hypothesis or significance testing is the testing of a belief or statement by statistical methods. Relative to the null hypothesis H0, a type I error is rejecting a true hypothesis and a type II error is accepting a false hypothesis. Significance levels are complementary concepts to confidence limits; 1%, 5% and 10% are the most usual levels.
5.0 CONCLUSION
Quantitative Methods
1. Surveys: online or offline questionnaires to collect data from a large sample.
2. Experiments: controlled studies to test cause-and-effect relationships.
3. Observational studies: collecting data by observing people, behaviours or phenomena.
Qualitative Methods
1. Interviews: in-depth, open-ended conversations to gather detailed insights.
2. Focus groups: group discussions to collect data on opinions, attitudes and behaviours.
3. Content analysis: analysing texts, images or videos to identify patterns and themes.
Mixed Methods
1. Combining surveys and interviews: using both quantitative and qualitative approaches to collect comprehensive data.
Other Methods
1. Secondary research: analysing existing data from academic papers, reports or databases.
2. Social media listening: collecting data from social media platforms to understand public opinions and trends.
3. Sensors and IoT devices: collecting data from sensors and Internet of Things (IoT) devices to monitor and analyse physical environments.