Epidemiology
MODULE – 2
Epidemiological Study Designs
Module Description
This module introduces candidates to the various study designs used in
epidemiological studies. It will help candidates recognise the conditions under
which a particular study design should be used, and understand the role of error,
bias and confounding in epidemiological research.
Unit 2.1
Descriptive Epidemiology
Unit 2.2
Basic Designs of Research Studies
Unit 2.3
Variation: Role of Error, Bias and Confounding
Unit Table of Contents
Unit 2.1 Descriptive Epidemiology
Topics
Learning Objectives
Learning Outcome
2.1.1 Definition of Descriptive Epidemiology
2.1.2 Uses of Descriptive Epidemiology
2.1.3 Study Approach in Descriptive Epidemiology
2.1.4 Distribution by Time, Place and Person
2.1.5 Conclusion
Summary
References
Multiple Choice Question
Multiple Choice Question – Key Answer
Introduction
Learning Objectives
• State the definition of Descriptive Epidemiology and explain its uses
• List the 5Ws of Descriptive Epidemiology
• Name the study approach to Descriptive Epidemiology
• Characterise diseases by time, person and place
Learning Outcome
• Define Descriptive Epidemiology and describe its uses
• Mention the 5Ws of Descriptive Epidemiology
• Summarise the study approach to Descriptive Epidemiology
• Explain disease occurrence by time, person and place
Introduction
Outbreaks of disease, for the most part, are not uniformly distributed in human
populations. In other words, the distribution of a disease or health condition follows
specific patterns in a community. Understanding these patterns may lead to the
generation of hypotheses about causative (or risk) factors associated with the health
condition. An important role of Descriptive Epidemiology is to study these
distribution patterns in the various subgroups of the population by time, place and
person: whether the disease has increased or decreased over a span of time;
whether there is a higher concentration of disease in one geographic area than
in others; whether the disease occurs more often in men or in a particular age group;
and whether the characteristics or behaviour of those affected differ from
those of people not affected. An important outcome of this study is the formulation of
an aetiological hypothesis, which may suggest or lead to measures to control or
prevent the disease.
One of the functions of Descriptive Epidemiology is to provide data regarding the
type of disease problems and their magnitude in terms of incidence, prevalence and
mortality rates. Such information is needed to demarcate the affected areas and for
providing appropriate health care services.
2.1.1 Descriptive Epidemiology
2.1.1.1 Definition of Descriptive Epidemiology
Descriptive Epidemiology is the study of the occurrence and distribution of a
disease or health condition in a population by time, place and person.
The 5Ws of Descriptive Epidemiology:
What = Health Issue of Concern
Who = Person
Where = Place
When = Time
Why/How = Causes, Risk Factors, Modes of Transmission
Fig 2.1.1: 5Ws of Descriptive Epidemiology
2.1.2 Uses of Descriptive Epidemiology
1. Create a detailed description of the health of a population that can be easily
communicated with tables, graphs, and maps.
2. Provide data regarding the magnitude of the disease load and types of disease
problems in the community in terms of morbidity and mortality rates and
ratios.
3. Provide clues to disease aetiology, and help in the formulation of an
aetiological hypothesis.
4. Provide background data for planning, organising and evaluating preventive
and curative services.
5. Contribute to research by describing variations in disease occurrence by time,
place and person.
2.1.3 Study Approach in Descriptive Epidemiology
Descriptive Epidemiology may use a cross-sectional or a longitudinal design to obtain
estimates of the magnitude of health and disease problems in human populations.
Fig 2.1.2: Study Approach in Descriptive Epidemiology
1. Cross-sectional Studies:
A cross-sectional study is the simplest form of observational study. It is based on a
single examination of a cross-section of the population at one point in time, the results
of which can be projected onto the whole population, provided the sampling has been
done correctly. A cross-sectional study is also known as a "prevalence study".
Cross-sectional studies are more useful for chronic than for short-lived diseases. For
example, in a study of hypertension, one can also collect data during the survey
about age, sex, physical exercise, body weight, salt intake and other variables of
interest, and then determine how the prevalence of hypertension is related to
the variables measured at the same time. Such a study describes the
distribution of a disease in the population rather than its aetiology. The most common
reason for examining the inter-relationships between a disease, or one of its
precursors, and other variables is to attempt to establish a causal chain and thereby
point to possible ways of preventing the disease. It must be
stressed that the time sequence, which is essential to the concept of causation,
cannot be deduced from cross-sectional data. Frequently, however, there is evidence
that permits ranking of events to form such a sequence. That is, the distribution
patterns may suggest causal hypotheses which can be tested by analytical studies.
Although a cross-sectional study provides information about disease prevalence, it
provides very little information about the natural history of disease or about the rate
of occurrence of new cases (incidence).
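As a minimal worked sketch of how a cross-sectional survey yields a prevalence estimate, the Python snippet below divides the number of existing cases found at the survey by the number of people examined. The function name `point_prevalence` and the survey counts are hypothetical and are used only for illustration.

```python
def point_prevalence(existing_cases: int, persons_examined: int) -> float:
    """Proportion of the examined sample that has the condition at the survey date."""
    if persons_examined <= 0:
        raise ValueError("persons_examined must be positive")
    if not 0 <= existing_cases <= persons_examined:
        raise ValueError("existing_cases must lie between 0 and persons_examined")
    return existing_cases / persons_examined


# Hypothetical survey: 240 of 1,200 adults examined were found to be hypertensive.
prevalence = point_prevalence(240, 1200)
print(f"Prevalence: {prevalence:.1%} ({prevalence * 1000:.0f} per 1,000)")
```

The estimate describes the sample at one point in time; it can be projected onto the whole population only if the sampling was done correctly, and it says nothing about how quickly new cases arise.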
2. Longitudinal Studies:
There is an increasing emphasis on the value of longitudinal studies in which
observations are repeated in the same population over a prolonged period of time by
means of follow-up examinations. Cross-sectional studies have been likened to a
photograph, and longitudinal studies to a cine film. Longitudinal studies are useful
(i) to study the natural history of disease and its future outcome, (ii) to identify
risk factors of disease, and (iii) to find the incidence rate, that is, the rate of
occurrence of new cases of disease in the community. Longitudinal studies provide
valuable information which cross-sectional studies may not provide, but they are
more difficult to organise and more time-consuming than cross-sectional
studies.
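The incidence rate obtained from a longitudinal study can be sketched in the same way. The snippet below assumes that follow-up is summarised as person-years at risk, which is one common convention and is not spelled out in this unit; the function name `incidence_rate` and the cohort figures are hypothetical.

```python
def incidence_rate(new_cases: int, person_years_at_risk: float, per: int = 1000) -> float:
    """New cases per `per` person-years of follow-up among people initially free of disease."""
    if person_years_at_risk <= 0:
        raise ValueError("person_years_at_risk must be positive")
    return new_cases / person_years_at_risk * per


# Hypothetical cohort: 2,000 disease-free people followed for 5 years
# (about 10,000 person-years, ignoring losses to follow-up), with 150 new cases.
rate = incidence_rate(new_cases=150, person_years_at_risk=10_000)
print(f"Incidence rate: {rate:.1f} per 1,000 person-years")
```

Unlike prevalence, this measure needs repeated observation of the same population, which is why it cannot be obtained from a single cross-sectional survey.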
2.1.4 Distribution by Time, Place and Person
2.1.4.1 Time
The occurrence of disease changes over time. Some of these changes occur regularly,
while others are unpredictable. Examples of diseases that occur during the same
season each year include influenza (winter) and West Nile virus infection (August–
September). In contrast, diseases such as hepatitis B and salmonellosis can occur at
any time. For diseases that occur seasonally, health officials can anticipate their
occurrence and implement control and prevention measures, such as an influenza
vaccination campaign or mosquito spraying. For diseases
that occur sporadically, investigators can conduct studies to identify the causes and
modes of spread, and then develop appropriately targeted actions to control or
prevent further occurrence of the disease. In either situation, displaying the patterns
of disease occurrence by time is critical for monitoring disease occurrence in the
community and for assessing whether public health interventions made a
difference. Such a display shows whether the disease is seasonal in
occurrence, whether it shows periodic increases or decreases, or whether it follows a
consistent time trend.
Time data is usually displayed with a two-dimensional graph. The vertical or y-axis
usually shows the number or rate of cases; the horizontal or x-axis shows the time
periods such as years, months, or days. The number or rate of cases is plotted over
time. Graphs of disease occurrence over time are usually plotted as line graphs or
histograms. Sometimes a graph shows the timing of events that are related to disease
trends being displayed. For example, the graph may indicate the period of exposure
or the date control measures were implemented. Studying a graph that notes the
period of exposure may lead to insights into what may have caused illness. Studying
a graph that notes the timing of control measures shows what impact, if any, the
measures may have had on disease occurrence.
Depending on the disease, the time scale may be as broad as years or decades, or as
brief as days or even hours of the day. For many chronic diseases, the focus is on
long-term trends or patterns in the number of cases or the rate. For other conditions,
such as foodborne outbreaks, the relevant time scale is likely to be days or hours.
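Before a time graph can be drawn, individual case records have to be counted by time period. The sketch below shows one way to do this in Python, grouping by month; the function name `cases_by_month` and the onset dates are hypothetical, and a shorter or longer period could be substituted depending on the disease.

```python
from collections import Counter
from datetime import date


def cases_by_month(onset_dates):
    """Count cases per (year, month), sorted in time order for the x-axis."""
    counts = Counter((d.year, d.month) for d in onset_dates)
    return sorted(counts.items())


# Hypothetical onset dates, for illustration only.
onsets = [date(2023, 1, 5), date(2023, 1, 20), date(2023, 2, 3),
          date(2023, 2, 14), date(2023, 2, 27), date(2023, 3, 9)]

# Print a simple text graph: time period on the left, one mark per case.
for (year, month), n in cases_by_month(onsets):
    print(f"{year}-{month:02d}  {'#' * n}  ({n})")
```

Note that this sketch lists only months in which at least one case occurred; for a graph, periods with no cases would normally be shown as zero.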
Some of the common types of time-related graphs are further described below.
Fig 2.1.3: Types of Time-related Graphs
1. Secular (long-term) Trends:
The term "secular trend" implies changes in the occurrence of disease (i.e., a
progressive increase or decrease) over a long period of time, generally several years
or decades. Although it may have short-term fluctuations imposed on it, a secular
trend implies a consistent tendency to change in a particular direction or a definite
movement in one direction. Examples include coronary heart disease, lung cancer
and diabetes, which have shown a consistent upward trend in the developed
countries during the past 50 years or so, accompanied by a decline of infectious
diseases such as tuberculosis, typhoid fever, diphtheria and polio.
A secular trend graph plots the annual number of cases or rate of a disease over a
period of years. It is useful for
assessing the prevailing direction of disease occurrence (increasing, decreasing, or
essentially flat), evaluating programmes or making policy decisions, inferring what
caused an increase or decrease in the occurrence of a disease (particularly if the graph
indicates when related events took place), and using past trends as a predictor of
future incidence of disease.
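One simple way to summarise the prevailing direction of a secular trend is the least-squares slope of the annual rate against the year. The sketch below is an illustration under stated assumptions rather than a prescribed method: the function names, the `flat_tolerance` cut-off and the annual rates are all hypothetical.

```python
def trend_slope(years, rates):
    """Least-squares slope of rate against year (change in rate per year)."""
    n = len(years)
    if n < 2 or n != len(rates):
        raise ValueError("need at least two paired observations")
    mean_x = sum(years) / n
    mean_y = sum(rates) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, rates))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx


def trend_direction(slope, flat_tolerance=0.1):
    """Label the prevailing direction; `flat_tolerance` is an arbitrary cut-off."""
    if abs(slope) <= flat_tolerance:
        return "essentially flat"
    return "increasing" if slope > 0 else "decreasing"


# Hypothetical annual rates per 100,000 population.
years = [2015, 2016, 2017, 2018, 2019, 2020]
rates = [42.0, 44.5, 43.8, 47.1, 49.0, 51.2]
slope = trend_slope(years, rates)
print(f"{slope:+.2f} per 100,000 per year: {trend_direction(slope)}")
```

A single slope smooths over the short-term fluctuations mentioned above, so it should be read alongside the graph itself and not in place of it.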
2. Seasonality:
Here, disease occurrence is plotted against week or month over the course of a year
or more to show its seasonal pattern, if any. Some diseases such as influenza, rubella,
rotavirus and West Nile infection are known to have characteristic seasonal
distributions. The first three diseases display consistent seasonal distributions, but
each disease peaks in different months – rubella from March to June, influenza from
November to March, and rotavirus from February to April. Seasonal variations
in disease occurrence may be related to environmental conditions (e.g., temperature,
humidity, rainfall, overcrowding, life cycle of vectors) which directly or
indirectly favour disease transmission. However, for many infectious diseases (e.g.,
polio), the basis for seasonal variation is unknown. Non-infectious diseases and
health conditions may also exhibit seasonal variation, e.g., sunstroke, hay
fever, snakebite. Seasonal patterns may suggest hypotheses about how the infection
is transmitted, what behavioural factors increase risk, and other possible
contributors to the disease or condition.
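To look for a seasonal pattern, counts from several years can be pooled by calendar month. The following Python sketch assumes the data are already available as counts keyed by (year, month); the function name `seasonal_profile` and the figures are hypothetical.

```python
from collections import Counter
import calendar


def seasonal_profile(year_month_counts):
    """Pool counts by calendar month across years; returns {month: total cases}."""
    totals = Counter()
    for (_year, month), n in year_month_counts.items():
        totals[month] += n
    # Include every month so that months without cases appear as zero.
    return {m: totals.get(m, 0) for m in range(1, 13)}


# Hypothetical counts keyed by (year, month).
counts = {(2021, 1): 30, (2021, 2): 22, (2021, 7): 3, (2021, 12): 25,
          (2022, 1): 34, (2022, 2): 19, (2022, 7): 2, (2022, 12): 28}

profile = seasonal_profile(counts)
peak = max(profile, key=profile.get)
print("Peak month:", calendar.month_name[peak])
```

Pooling hides year-to-year differences, so a peak found this way is a starting point for a hypothesis and should be checked against each year separately.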
3. Day of Week and Time of Day:
For some conditions, displaying data by days of the week or time of day may be
informative. Analysis at these shorter time periods is particularly appropriate for
conditions related to occupational or environmental exposures that tend to occur at
regularly scheduled intervals. For example, in one series of farm tractor fatalities, the
number of fatalities on Sundays was about half the number on the other days. The
pattern of farm tractor injuries by hour peaked at 11:00 a.m., dipped at noon, and
peaked again at 4:00 p.m. These patterns may suggest hypotheses and possible
explanations that could be evaluated with further study.
Fig 2.1.4: Day of Week and Time of Day
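Tabulating events by day of week and hour of day can be sketched as follows. The function name `by_weekday_and_hour` and the event times are hypothetical; the weekday names are taken from a fixed list so that the result does not depend on the computer's language settings.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def by_weekday_and_hour(event_times):
    """Return two Counters: events per weekday name and events per hour (0-23)."""
    weekdays = Counter(WEEKDAYS[t.weekday()] for t in event_times)
    hours = Counter(t.hour for t in event_times)
    return weekdays, hours


# Hypothetical injury times, for illustration only.
events = [datetime(2024, 3, 4, 11, 15), datetime(2024, 3, 4, 16, 5),
          datetime(2024, 3, 6, 11, 40), datetime(2024, 3, 8, 16, 30),
          datetime(2024, 3, 10, 11, 5)]

weekdays, hours = by_weekday_and_hour(events)
for name in WEEKDAYS:
    print(f"{name:<9} {weekdays[name]}")
for hour in sorted(hours):
    print(f"{hour:02d}:00     {hours[hour]}")
```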
4. Epidemic Period:
An epidemic is defined as "the occurrence in a community or region of cases of an illness
or other health-related events clearly in excess of normal expectancy". The community or
region, and the time period in which the cases occur, are specified precisely.
Epidemicity is thus relative to usual frequency of the disease in the same area,
among the specified population, at the same season of the year. It is the best known
example of short-term fluctuation in the occurrence of a disease.
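The definition above does not say how far above the usual frequency counts as "clearly in excess". As an illustration only, the sketch below assumes a rule of the baseline mean plus two standard deviations, computed from counts for the same area, population and season in earlier years; this threshold is an assumption made for the example and is not part of the definition. The function name and the counts are hypothetical.

```python
from statistics import mean, stdev


def exceeds_expectancy(observed, baseline_counts, n_sd=2.0):
    """True if `observed` is above the baseline mean plus `n_sd` standard deviations."""
    if len(baseline_counts) < 2:
        raise ValueError("need at least two baseline counts")
    threshold = mean(baseline_counts) + n_sd * stdev(baseline_counts)
    return observed > threshold


# Hypothetical counts for the same area, population and month in five earlier years.
baseline = [12, 15, 11, 14, 13]
print(exceeds_expectancy(31, baseline))  # well above the usual range
print(exceeds_expectancy(15, baseline))  # within the usual range
```

Because epidemicity is relative, the same observed count could be an epidemic in one area or season and unremarkable in another, which is why the baseline must match on area, population and time of year.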
Three major types of epidemics have been identified:
a) Common-source epidemics
• Single exposure or "point-source" epidemics
• Continuous or multiple exposure epidemics
b) Propagated epidemics
• Person-to-person
• Arthropod vector
• Animal reservoir
c) Slow (modern) epidemics
To show the time course of a disease outbreak or epidemic, a graph called an
epidemic curve is often used. In an epidemic curve, the y-axis shows the number of
cases, while the x-axis shows time as either date of symptom onset or date of
diagnosis. Depending on the incubation period (the length of time between exposure
and onset of symptoms) and routes of transmission, the scale on the x-axis can be as
broad as weeks (for a very prolonged epidemic) or as narrow as minutes (e.g., for
food poisoning by chemicals that cause symptoms within minutes).
Fig 2.1.5: Epidemic Curve
Fig 2.1.6: Types of Epidemic Curves
Conventionally, the gathered data is represented in a histogram (which is similar to
a bar chart but has no gaps between adjacent columns). Sometimes each case is
displayed as a square. The shape and other features of an epidemic curve can
suggest hypotheses about the time and source of exposure, the mode of
transmission, and the causative agent.
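The data behind an epidemic curve can be prepared with a few lines of Python. The sketch below counts cases by date of symptom onset and keeps days with no cases, so that the columns stay adjacent as in a histogram. The function name `epidemic_curve` and the onset dates are hypothetical, and a daily interval is assumed; a different interval would suit diseases with longer or shorter incubation periods.

```python
from collections import Counter
from datetime import date, timedelta


def epidemic_curve(onset_dates):
    """Return [(day, case_count)] for every day from first to last onset, zeros included."""
    counts = Counter(onset_dates)
    day, last = min(onset_dates), max(onset_dates)
    curve = []
    while day <= last:
        curve.append((day, counts.get(day, 0)))
        day += timedelta(days=1)
    return curve


# Hypothetical dates of symptom onset in an outbreak.
onsets = ([date(2024, 6, 1)] + [date(2024, 6, 2)] * 4 + [date(2024, 6, 3)] * 7
          + [date(2024, 6, 4)] * 3 + [date(2024, 6, 6)])

# Text version of the curve: one mark per case, turned on its side.
for day, n in epidemic_curve(onsets):
    print(day.isoformat(), "#" * n)
```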
2.1.4.2 Place
Geographical pathology, defined as the study of the geography of disease, is one of the
important dimensions of Descriptive Epidemiology. Characterisation of the
occurrence of disease by place gives information about the geographic extent of the
problem and its geographic variation. Describing by place refers not only to place of
residence but to any geographic location relevant to disease occurrence. Such
locations include place of diagnosis or report, birthplace, site of employment, school
district, hospital unit, or recent travel destinations. The location may be as large as a
continent or country or as small as a street address, hospital wing, or operating
room. Sometimes place refers not to a specific location at all but to a place category
such as urban or rural, domestic or foreign, and institutional or non-institutional.
Place data can be shown in a table or on a map; a map provides a more striking visual
display of place data. On a map, different numbers or rates of disease can be
depicted using different shadings, colours, or line patterns. One type of map for place
data is a spot map. Spot maps generally are used for clusters or outbreaks with a
limited number of cases. A dot or X is placed on the location that is most relevant to
the disease of interest, usually where each victim lived or worked. If known, relevant
sites, such as probable locations of exposure, are usually noted on the
map.
Fig 2.1.7: Geography of Disease
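When areas differ in population size, a shaded map is usually more informative if it shows rates and not just case counts. The sketch below illustrates the calculation; the function name `rates_by_area`, the district names and all figures are hypothetical.

```python
def rates_by_area(cases, population, per=100_000):
    """Rate per `per` population for each area with a known, non-zero population."""
    return {area: cases[area] / population[area] * per
            for area in cases if population.get(area)}


# Hypothetical case counts and populations.
cases = {"District A": 18, "District B": 45, "District C": 9}
population = {"District A": 60_000, "District B": 450_000, "District C": 12_000}

ranked = sorted(rates_by_area(cases, population).items(), key=lambda kv: -kv[1])
for area, rate in ranked:
    print(f"{area}: {rate:.0f} per 100,000")
```

In this made-up example the district with the most cases has the lowest rate, which shows why counts alone can mislead when identifying communities at increased risk.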
Analysis of data by place helps to identify communities at increased risk of
disease and gives perspective on the differences (or variations) in disease
patterns, not only between countries but also within countries. It also provides
information on the relative importance of genes versus environment, changes with
migration, and the possible roles of diet and other aetiological factors. The clinician
also benefits from knowing that a patient comes from a
geographic area that is endemic for certain infrequent diseases such as yaws or
leishmaniasis, as this helps to focus attention on diseases to which the
patient may have been exposed. Even if the data cannot reveal why people in a
community have an increased risk, it can help generate hypotheses to test with
additional studies. For example, is a community at increased risk because of
characteristics of the people in the community such as genetic susceptibility, lack of
immunity, risky behaviours, or exposure to local toxins or contaminated food? Can the
increased risk, particularly of a communicable disease, be attributed to characteristics
of the causative agent such as a particularly virulent strain, hospitable breeding sites, or
availability of the vector that transmits the organism to humans? Or can the
increased risk be attributed to the environment that brings the agent and the host
together, such as crowding in urban areas that increases the risk of disease
transmission from person to person, or more houses being built in wooded areas
close to deer that carry ticks infected with the organism that causes Lyme disease?
Geographical variations in disease occurrence are of the following types.
1. International Variations:
An international study of breast cancer showed that rates differ widely from country
to country with the lowest prevalence in Japan and the highest in the western
countries. Similarly, there are marked international differences in the occurrence of
cardiovascular diseases. These variations have stimulated a search for cause-effect
relationships between the environmental factors and disease. The aim is to identify
factors which are crucial in the cause and prevention of disease.
2. National Variations:
Variations in disease occurrence also exist within countries or national boundaries.
For example, endemic goitre, lathyrism, fluorosis, leprosy, malaria and
nutritional deficiency diseases have all shown variations in their distribution within
many developing countries in Asia and Africa, with some parts of a country more
affected and others less affected or not affected at all.
3. Rural-urban Variations:
Rural/urban variations in disease distribution are well known. Chronic bronchitis,
accidents, lung cancer, cardiovascular diseases, mental illness and drug dependence
are usually more frequent in urban than in rural areas. On the other hand, skin and
zoonotic diseases and soil-transmitted helminths may be more frequent in rural
areas than in urban areas. Death rates, especially infant and maternal mortality rates,
are higher for rural than urban areas. These variations may be due to differences in
population density, social class, deficiencies in medical care, levels of sanitation,
education and environmental factors.
4. Local Distributions:
Inner and outer city variations in disease frequency are well known. These variations
are best studied with the aid of 'spot maps' or 'shaded maps'. These maps show at a
glance areas of high or low frequency, the boundaries and patterns of disease
distribution. For example, if the map shows "clustering" of cases, it may suggest a
common source of infection or a common risk factor shared by all the cases.
2.1.4.3 Person
It is a well-known fact that personal characteristics affect illness. Presentation and
analysis of health data by “person” use inherent characteristics of people (e.g. age,
sex, race), biologic characteristics (immune status), acquired characteristics (marital
status), activities (occupation, leisure activities, use of medications/tobacco/drugs), or
the conditions under which they live (socioeconomic status, access to medical care).
Age and sex are included in almost all data sets and are the two most commonly
analysed “person” characteristics. However, depending on the disease and the data
available, analyses of other person variables are usually necessary. Usually the
analysis of person data starts by looking at each variable separately. But sometimes,
two variables such as age and sex can be examined simultaneously. Person data is
usually displayed in tables or graphs.
a) Age:
Age is probably the single most important “person” attribute, because almost every
health-related event varies with age. A number of factors that also vary with age
include: susceptibility, opportunity for exposure, latency or incubation period of the
disease, and physiologic response (which affects, among other things, disease
development).
When analysing data by age, grouping is preferred, but the age groups should be
narrow enough to detect any age-related patterns that may be present in the data.
For some diseases, particularly chronic diseases, 10-year age groups may be
adequate. For other diseases, 10-year and even 5-year age groups may conceal
important variations in disease occurrence by age. For example, when data on
pertussis occurrence are analysed by standard 5-year age groups, the highest rate may
be among children 4 years old and younger; but when the same data are presented with
narrower age groups (<1 yr, 1 yr), the rate of pertussis is seen to be highest among
children under 1 year of age. Public
health efforts should thus be focused on children less than 1 year of age, rather than
on the entire 5-year age group.
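The effect of age-group width can be seen in a small worked example. The Python sketch below uses made-up counts and populations by single year of age (they are not real pertussis data) and compares the pooled rate for ages 0 to 4 with the rates for narrower groups; the function name `rate_per_100k` is hypothetical.

```python
def rate_per_100k(cases, population):
    """Cases per 100,000 population."""
    return cases / population * 100_000


# Hypothetical counts and populations by single year of age (0 means under 1 year).
cases = {0: 120, 1: 20, 2: 15, 3: 12, 4: 10}
population = {0: 20_000, 1: 20_000, 2: 20_000, 3: 20_000, 4: 20_000}

# Broad group: ages 0-4 pooled into one 5-year group.
pooled = rate_per_100k(sum(cases.values()), sum(population.values()))

# Narrow groups: under 1 year versus 1-4 years.
under_1 = rate_per_100k(cases[0], population[0])
one_to_four = rate_per_100k(sum(cases[a] for a in range(1, 5)),
                            sum(population[a] for a in range(1, 5)))

print(f"0-4 years pooled: {pooled:.1f} per 100,000")
print(f"Under 1 year:     {under_1:.1f} per 100,000")
print(f"1-4 years:        {one_to_four:.1f} per 100,000")
```

With these figures the pooled rate sits between the two narrower rates, so the concentration of cases among infants is visible only when the under-1 group is shown separately.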
b) Sex:
Males have higher rates of illness and death than females for many diseases. For some
diseases, this sex-related difference is due to genetic, hormonal, anatomic, or other
inherent differences between the sexes. These inherent differences affect susceptibility or
physiologic responses. For example, premenopausal women have a lower risk of heart
disease than men of the same age, a difference attributed to higher estrogen levels in
women. On the other hand, the sex-related differences in the occurrence of many diseases
reflect differences in opportunity or levels of exposure. For example, men have had higher
lung cancer rates than women. The difference noted in earlier years has been attributed to the
higher prevalence of smoking among men in the past. Unfortunately, prevalence of smoking
among women now equals that among men, and lung cancer rates in women have been
climbing as a result.
c) Ethnic and Racial Groups:
Sometimes person data are analysed by biological, cultural or social groupings such as race,
nationality, religion, or social groups such as tribes and other geographically or socially
isolated groups. Differences in racial, ethnic, or other group variables may reflect differences
in susceptibility or exposure, or differences in other factors that influence the risk of disease,
such as socioeconomic status and access to health care.
d) Socioeconomic Status:
Socioeconomic status, though difficult to quantify, consists of many variables such as
occupation, family income, educational achievement or census tract, living conditions, and
social standing. The variables that are easiest to measure and are commonly in use are
occupation, family income, and educational achievement, although these may not measure
socioeconomic status precisely.
The frequency of many adverse health conditions increases with decreasing socioeconomic
status. For example, tuberculosis is commoner among persons in lower socioeconomic
strata. Infant mortality and time lost from work due to disability are both associated with
lower income. These patterns may reflect more harmful exposures, lower resistance, and less
access to health care. Or they may in part reflect an interdependent relationship that is
impossible to untangle: Does low socioeconomic status contribute to disability, or does
disability contribute to lower socioeconomic status, or both? What accounts for the
disproportionate prevalence of diabetes and asthma in lower socioeconomic areas?
A few other adverse health conditions occur more frequently among persons of higher
socioeconomic status. Gout, for example, was known as the “disease of kings” because of its
association with consumption of rich foods. Other conditions associated with higher
socioeconomic status include breast cancer, Kawasaki syndrome, chronic fatigue
syndrome, and tennis elbow. Differences in exposure among the various socioeconomic
strata account for at least some, if not most, of the differences in the frequency of these
conditions.
2.1.5 Conclusion
Descriptive Epidemiology is the first and most important step in the analysis of
health events or conditions in human populations. It
plays an important role in the study of distribution patterns in the
different subgroups of the population by time, place and person. It is the precursor
to any more detailed study of the health condition and the foundation on which
control interventions are based. An understanding of the roles of error, bias
and confounding, covered later in this module, helps in interpreting epidemiological
research and in knowing the conditions under which a particular study
design is to be used.
Summary
o Descriptive Epidemiology analyses health events or conditions by
characterising the cases collectively according to time, place, and person.
o It plays a major role in identifying public health problems and suggesting
priorities for public health research.
o The 5Ws of Descriptive Epidemiology correspond to five
questions: What is the case definition? Who are the affected persons? Where is
the place of occurrence? When (time) did it happen? Why did it
happen (causes/risk factors/modes of transmission)?
o Using this approach, diseases can be classified as seasonal or sporadic and
unpredictable, while human groups at increased risk of exposure or
developing the health condition are also identified.
o This is vital for monitoring disease occurrence in the community and for
assessing whether the public health interventions have made a difference.
Bibliography
E-References
1. Basic Epidemiology, by R. Bonita, R. Beaglehole and T. Kjellstrom
2. Epidemiology: An Introduction, by Kenneth J. Rothman
3. Concepts of Epidemiology: An Integrated Introduction to the Ideas, Theories, Principles
and Methods of Epidemiology, by Raj Bhopal
4. Principles of Epidemiology - Self-Study, U.S. Department of Health and Human Services
5. A Short Introduction to Epidemiology, by Neil Pearce