Owusu 2019
Bright Owusu
Department of Mathematical Sciences
Montana State University, Bozeman
This writing project has been read by the writing project advisor and has been found to
be satisfactory regarding content, English usage, format, citations, bibliographic style,
and consistency, and is ready for submission to the Statistics Faculty.
Abstract
1. Introduction
1.1 Background
1.2 Line Transect Sampling
1.3 Selecting Line Transect
1.4 Detection
1.5 Assumptions
2. Model Selection
2.1 Models
2.2 Model Selection
2.21 Akaike Information Criterion (AIC)
2.22 Likelihood ratio test
2.23 Goodness of fit
2.3 Abundance estimation
3. Simulation Study
3.1 Data
3.2 Research Questions
3.3 Detection function
3.4 Model fitting
3.5 Abundance Estimation
3.51 Using an exponential detection function
3.52 Using a half-normal detection function
4. Results and Discussion
References
Bright Owusu Montana State University
Abstract
Estimating the abundance and density of animal and plant populations is essential for
conservation and management in ecology. This study introduces line transect sampling,
a type of distance sampling method used in ecology. Line transect sampling was first
developed by R. T. King in the 1930s to estimate the abundance of animals and plants.
Perpendicular distances of detected objects from the transect line are used to estimate
the density and abundance of objects. The study describes randomly placing single and
multiple transect lines on the study region and introduces a detection function that
determines which objects of interest are detected. Model selection using AIC and
goodness-of-fit tests is used to select the best model for the detection function. We
describe how researchers can obtain estimates of abundance and density from the selected
model using the "Distance" package in R. A simulation study of longleaf pine trees is used
to demonstrate the method of line transect sampling. A single transect line is first
placed at 120m on a 200m by 200m study region, and three models (Half-normal,
Hazard-rate and Uniform with cosine adjustment) are fitted to the detection function.
A total of 1000 simulated samples were used, and estimates of the abundance of longleaf
pine trees were compared across the models. In addition, multiple transect lines are set
on the study region and the simulation study is repeated.
1. Introduction
1.1 Background
Several methods of estimating the density of plants and animals have been proposed.
Although no method is bias free, the most accurate density estimates are obtained from
complete counts [Davenport et al., 2007; McNeilage et al., 2001]. However, complete counts
require sampling effort that is often impractical, especially over large areas. In most
cases, line transect sampling is the most practical method [Plumptre, 2000; Struhsaker,
1997]. It provides a convenient way of estimating the number of objects in a study
region. The objects may be any species of animal or plant that is easily visible, at
least at close range.
There are two basic methods of distance sampling: line transect sampling and point
transect sampling. In line transect sampling, a series of lines is distributed according to
some design (usually a systematic grid of parallel lines), and an observer travels along
each line, searching for animals or animal clusters. For each animal or cluster detected,
the observer measures or estimates the perpendicular distance x of the animal or cluster
centre from the nearest part of the line. In many surveys, it is easier to measure or
estimate the observer-to-animal (radial) distance r at the time the detection is made.
If the sighting angle θ is also measured, then the perpendicular distance may be found
as x = r sin θ.
1.4 Detection
In line transect sampling, the detection function is a key concept in estimating the
abundance of species. The detection function g(x) is the probability of detecting an
object, given that it is at perpendicular distance x from the transect line. The detection
function is non-increasing in distance and lies between 0 and 1; thus, 0 ≤ g(x) ≤ 1.
When an object is located on the line, we assume perfect detection, so that g(0) = 1;
that is, an object on the line is detected with certainty. Figure 2 displays graphs of
the detection function g(x) against distance x.
Figure 2 shows half-normal (top row) and hazard-rate (bottom row) detection functions
without adjustments, for varying scale (σ) and, for the hazard-rate, shape (b) parameters
(values are given above the plots). On the top row, from left to right, the study species
becomes more detectable as the parameter σ increases. The bottom row shows the
hazard-rate model with different parameter values. In practice, only a small percentage
of the objects of interest in the study region are detected. Even so, analysis of the
associated distances allows reliable estimates of true density to be made.
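As a rough illustration of the two key functions shown in Figure 2, the sketch below
evaluates a half-normal and a hazard-rate detection function in Python. The half-normal
form matches the one given in Section 3. The hazard-rate form g(x) = 1 − exp(−(x/σ)^(−b))
is the usual textbook parameterization and is an assumption here, since the text does not
write it out; the parameter values are illustrative only.

```python
import math

def half_normal(x, sigma):
    """Half-normal detection probability at perpendicular distance x >= 0."""
    return math.exp(-x**2 / (2.0 * sigma**2))

def hazard_rate(x, sigma, b):
    """Hazard-rate detection probability (assumed form); equals 1 on the line."""
    if x == 0:
        return 1.0
    return 1.0 - math.exp(-((x / sigma) ** (-b)))

# Illustrative values only: scale sigma = 10 and shape b = 3.
for x in (0.0, 5.0, 10.0, 20.0):
    hn = half_normal(x, sigma=10.0)
    hr = hazard_rate(x, sigma=10.0, b=3.0)
    print(f"x = {x:5.1f}  half-normal = {hn:.3f}  hazard-rate = {hr:.3f}")
```

Both functions equal 1 on the line and decrease with distance; the hazard-rate curve
stays close to 1 for longer, which gives it a wider shoulder.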
1.5 Assumptions
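In the line transect sampling method, the design is associated with four main assumptions:
• Objects are distributed independently of the transect line.
• Objects on the line are detected with certainty, that is, g(0) = 1.
• Objects are detected at their initial locations, before any movement in response to
the observer.
• Distance measurements are exact.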
The assumption that animals are distributed independently of the line is one of the key
design assumptions. We assume that animals are distributed uniformly with respect to
distance from the line. This assumption holds under a suitable randomized design
(Strindberg et al., 2004); thus, lines should be positioned according to a random design.
However, we cannot ensure that the assumptions related to the model hold simply by
adopting a suitable design. Instead, we need to consider whether field methods can be
adopted that will ensure low bias when they fail. For example, if animals show responsive
movement, then field methods should, if possible, ensure that animals are detected and
their locations recorded before they respond. This makes the assumption that objects
are detected at their initial locations reasonable. Generally, movement independent
of the observer causes no problems, unless the object is counted more than once on the
same unit of transect line or is moving at roughly half the speed of the observer or
faster. Animal movement after detection is not a problem, as long as the original location
can be established accurately and the appropriate distance measured. It is problematic
when animals move to the vicinity of the next transect in response to disturbance by
the observer. However, if movement is random, or at least not systematically in a single
direction, then animals moving in one direction will tend to be compensated by animals
moving in the other direction.
It is also assumed that objects on the line are detected with certainty, so that all
objects at zero distance are detected and g(0) = 1. In practice, detection on or near the
line should be nearly certain.
Distance measurements are assumed to be exact in line transect sampling; that is, there
are no measurement or recording errors in the distances and angles. Rounding errors in
measuring angles near zero are problematic, most often in the analysis of ungrouped data.
If errors in distance measurements are random and not too large, then reliable density
estimates are still likely, provided that the sample size is large (Gates et al., 1985).
2. Model Selection
2.1 Models
In line transect sampling, several models may be considered for the detection function.
However, models for the detection function are expected to have the following
properties (Buckland et al., 2015):
• Shoulder: We expect observers to be able to see objects near them, not just those
directly in front of them. For this reason, we would expect the detection function
to be flat near the line.
• Non-increasing: Observers are less likely to see distant objects than those nearer
the transect.
• Model robust: The model should be flexible enough to fit many different shapes.
In modeling the detection function, we consider three model types as key functions in
this study: the Uniform, Half-normal and Hazard-rate models. Adjustment terms may be
included in the models to improve model fit. The adjustment terms considered for each
model are based on the distribution of the distance data. If the survey results in good
distance data, which exhibit a shoulder and have an adequate sample size, the choice of
model is unlikely to affect the abundance estimate much. However, if the survey design
results in poor distance data, different models fitted to the data might yield different
estimates. Generally, a histogram of the distance data is plotted and inspected to
identify any outliers, heaping, measurement errors, and movement prior to detection.
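2.2 Model Selection
2.21 Akaike Information Criterion (AIC)
For model selection, this study considers the AIC method. The Akaike Information
Criterion (AIC) provides a quantitative method for model selection. The criterion is
given by
\[ \mathrm{AIC} = -2\log_e(L) + 2q \]
where L is the maximized likelihood and q is the number of estimated parameters in the
model. The model with the smallest AIC is preferred.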
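The criterion AIC = −2 log_e(L) + 2q can be computed directly once each candidate model
has been fitted. The sketch below ranks three candidate models by AIC. The
log-likelihood values and parameter counts are hypothetical and serve only to show the
arithmetic; they are not results from this study.

```python
def aic(log_likelihood, n_params):
    """AIC = -2 * maximized log-likelihood + 2 * number of estimated parameters."""
    return -2.0 * log_likelihood + 2.0 * n_params

# Hypothetical (log-likelihood, number of parameters) pairs, for illustration only.
candidates = {
    "half-normal": (-310.4, 1),
    "hazard-rate": (-309.9, 2),
    "uniform + cos(1)": (-311.2, 1),
}

scores = {name: aic(loglik, q) for name, (loglik, q) in candidates.items()}
best = min(scores, key=scores.get)

# Report each model with its AIC and its difference from the best model.
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: AIC = {score:.1f}, delta AIC = {score - scores[best]:.1f}")
```

Here the hazard-rate model has the highest likelihood, but its extra parameter is
penalized, so the model with the smallest AIC is a different one.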
2.22 Likelihood ratio test
In testing between hierarchical (nested) models, the likelihood ratio test is used to
choose the number of adjustment terms to include in the model. Suppose that a fitted
model (Model 1) has k1 adjustment terms. The likelihood ratio test allows an assessment
of whether the addition of k2 further terms improves the adequacy of the model. The null
hypothesis states that Model 1 is the true model, and the alternative states that Model 2,
with all k1 + k2 adjustment terms, is the true model. The test statistic is given by
\[ \chi^2 = -2\log_e\!\left(\frac{L_1}{L_2}\right) = -2\left[\log_e(L_1) - \log_e(L_2)\right] \]
where L1 and L2 are the maximum values of the likelihood functions for Models 1 and 2,
respectively. If Model 1 is the true model, the test statistic follows a χ2 distribution
with k2 degrees of freedom. In line transect sampling, the key function is fitted first
and low-order adjustment terms are then added. If an adjustment term improves the fit,
the next term is added and the likelihood ratio test is carried out again. The process is
repeated until the test is no longer significant.
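A minimal sketch of the likelihood ratio test is given below, using the statistic
−2[log_e(L1) − log_e(L2)] referred to a χ2 distribution with k2 degrees of freedom. To
keep the example self-contained, the χ2 tail probability is computed with the standard
closed-form series for integer degrees of freedom rather than a statistics library. The
log-likelihood values are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= (x / 2.0) / j
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series.
    root = math.sqrt(x)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    tail = math.erfc(root / math.sqrt(2.0))
    return tail + math.sqrt(2.0 / math.pi) * math.exp(-x / 2.0) * total

def likelihood_ratio_test(loglik_1, loglik_2, k2):
    """Return (statistic, p-value) for Model 1 nested in Model 2 with k2 extra terms."""
    statistic = -2.0 * (loglik_1 - loglik_2)
    return statistic, chi2_sf(statistic, k2)

# Hypothetical maximized log-likelihoods: key function alone vs. one added adjustment term.
stat, p_value = likelihood_ratio_test(loglik_1=-310.4, loglik_2=-307.9, k2=1)
print(f"chi-square = {stat:.2f}, p-value = {p_value:.4f}")
```

In the forward procedure described above, a small p-value would lead to keeping the added
adjustment term and testing the next one.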
2.23 Goodness of fit
In model selection for line transect sampling, goodness of fit can be useful in assessing
the quality of the distance data and in understanding the general shape of the detection
function. Models fitted to the distance data may show a significantly poor fit. This need
not be of great concern, as it provides a warning of a problem in the data or in the
selected detection model structure. This can be investigated by closer examination of the
data or by exploring other models and fitting options. Often, even a good model will give
a significant goodness-of-fit statistic. Figure 3 below shows example plots of the fit of
the Half-normal, Hazard-rate and Uniform with cosine adjustment models to distance data.
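The text does not name a particular goodness-of-fit statistic. As one possible
illustration, the sketch below computes a Kolmogorov-Smirnov statistic, the largest gap
between the empirical distribution of the detected distances and the distribution implied
by a fitted model. The choice of statistic, the untruncated half-normal model, the value
of σ and the distances are all assumptions made for this example.

```python
import math

def half_normal_cdf(x, sigma):
    """CDF of detected distances under an untruncated half-normal detection function."""
    return math.erf(x / (sigma * math.sqrt(2.0)))

def ks_statistic(distances, cdf):
    """Largest gap between the empirical CDF and the fitted CDF."""
    xs = sorted(distances)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        fitted = cdf(x)
        # Compare the fitted CDF with the empirical CDF just after and just before x.
        gap = max(gap, (i + 1) / n - fitted, fitted - i / n)
    return gap

# Hypothetical perpendicular distances in metres, for illustration only.
observed = [0.4, 1.1, 1.9, 2.5, 3.8, 5.2, 6.0, 9.7]
d = ks_statistic(observed, lambda x: half_normal_cdf(x, sigma=5.0))
print(f"Kolmogorov-Smirnov statistic D = {d:.3f}")
```

A large value of D indicates that the fitted detection function does not reproduce the
observed distribution of distances well.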
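2.3 Abundance estimation
The density of objects in the study region is estimated from the n detected objects as
\[ \hat{D} = \frac{n}{2La} \]
where L is the total length of the transect line(s), w is the truncation distance, and
\[ a = \int_0^w g(x)\,dx \]
The probability density function f(x) of the detected distances is obtained as
\[ f(x) = \frac{g(x)}{a} \]
At zero distance of objects from the transect line, the detection function is g(0) = 1.
Hence,
\[ f(0) = \frac{1}{a}, \quad \text{so that} \quad a = \frac{1}{f(0)} \]
and density now becomes
\[ \hat{D} = \frac{n f(0)}{2L} \]
The estimate of abundance or population size is obtained as the product of the area
of the study region (A) and the estimated density:
\[ \hat{\tau} = A\hat{D} = \frac{A\,n f(0)}{2L} \]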
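The estimator can be traced step by step in code: integrate the detection function to
obtain a, take f(0) = 1/a, compute the density as n f(0) / (2L), and multiply by the area
A. The sketch below does this with a simple trapezoidal rule. The number of detections,
σ and the truncation distance w are hypothetical values chosen for illustration.

```python
import math

def effective_half_width(g, w, steps=10_000):
    """Approximate a = integral of g(x) from 0 to w with the trapezoidal rule."""
    h = w / steps
    total = 0.5 * (g(0.0) + g(w))
    for i in range(1, steps):
        total += g(i * h)
    return total * h

def estimate_abundance(n, line_length, area, g, w):
    """Return (density, abundance) using density = n * f(0) / (2L) with f(0) = 1/a."""
    a = effective_half_width(g, w)
    f0 = 1.0 / a
    density = n * f0 / (2.0 * line_length)
    return density, area * density

# Hypothetical inputs: 60 detections from one 200 m line on a 200 m by 200 m region,
# half-normal detection with sigma = 10 m, truncated at w = 40 m.
sigma = 10.0
g = lambda x: math.exp(-x**2 / (2.0 * sigma**2))
density, abundance = estimate_abundance(n=60, line_length=200.0, area=200.0 * 200.0, g=g, w=40.0)
print(f"density = {density:.5f} per square metre, abundance = {abundance:.1f}")
```

As a sanity check, if every object within w = 100 m of a 200 m line is detected
(g(x) = 1), the covered strip is the whole 200 m by 200 m region and the estimated
abundance equals the number of detections.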
3. Simulation Study
3.1 Data
The data contain the locations (x, y) of 584 longleaf pine trees in a 200m by 200m forest
study region in Thomas County, Georgia. The perpendicular distance of each tree from the
transect line is the variable of interest.
3.2 Research Questions
• When multiple transect lines are set at 60m, 90m, 120m and 150m, what is the average
estimated abundance of pine trees in the study region?
• How does the estimated abundance of longleaf pine trees change for different detection
functions?
3.3 Detection function
The exponential detection function is given by
\[ g(x) = \exp(-\alpha x) \]
where x is the distance of each pine tree from the transect line.
We applied this detection function to obtain the detected pine trees when a single
transect line and multiple transect lines were placed on the study region (the 200m by
200m forest in Thomas County, Georgia).
For further analysis, a half-normal detection function with an estimated σ of 0.019
was also considered to obtain the detected pine trees, and the distance of each detected
pine tree from the transect line was recorded. Mathematically, the detection function is
given by
\[ g(x) = \exp\!\left(-\frac{x^2}{2\sigma^2}\right) \]
where x is the distance of each pine tree from the transect line.
This detection function was likewise applied to obtain the detected pine trees for both
the single and the multiple transect line placements on the study region.
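The detection step can be sketched as follows: each tree at perpendicular distance x from
the line is detected independently with probability g(x), and the distances of the
detected trees are kept for model fitting. This is an illustrative reconstruction, not
the code used in the study. It assumes the transect line runs parallel to the y-axis at
x = 120m, uses uniformly scattered points as a stand-in for the real tree locations, and
picks the parameter values α and σ arbitrarily.

```python
import math
import random

def simulate_detection(tree_x, line_x, g, rng):
    """Return perpendicular distances of the trees detected from a line at x = line_x."""
    detected = []
    for x in tree_x:
        distance = abs(x - line_x)
        # Each tree is detected independently with probability g(distance).
        if rng.random() < g(distance):
            detected.append(distance)
    return detected

rng = random.Random(2019)
# Hypothetical stand-in for the 584 tree locations: uniform x-coordinates on [0, 200].
trees = [rng.uniform(0.0, 200.0) for _ in range(584)]

alpha, sigma = 0.05, 20.0  # illustrative parameter values, not those of the study
exponential = lambda d: math.exp(-alpha * d)
half_normal = lambda d: math.exp(-d**2 / (2.0 * sigma**2))

for name, g in (("exponential", exponential), ("half-normal", half_normal)):
    distances = simulate_detection(trees, line_x=120.0, g=g, rng=rng)
    print(name, len(distances), "trees detected")
```

Repeating the call with a fresh random draw gives a new simulated sample from the same
transect line; multiple lines can be handled by calling the function once per line.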
3.4 Model fitting
Figure 4: Model fit to detected pine trees for the first simulated case
Figure 5: Model fit to detected pine trees for the tenth simulated case
Figure 6: Model fit to detected pine trees for first simulated case
Figure 7: Model fit to detected pine trees for the tenth simulated case
3.5 Abundance Estimation
3.51 Using an exponential detection function
For 1000 simulated cases of re-sampling the same transect line set at 120m on the
study region, the average abundance of the pine trees was estimated for the three
models. Table 1 below displays the average abundance, average bias, average standard
deviation and a 95% confidence interval for each model.
Table 1
Model              Avg. abundance   Avg. bias   Avg. SD   95% confidence interval
Half-normal                569.80      -14.20     45.76   (480.10, 659.49)
Hazard-rate                556.56      -27.44     47.14   (464.16, 648.97)
Uniform + Cos(1)           577.41       -6.59     40.74   (496.79, 658.04)
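The summary columns of Table 1 can be illustrated with a short sketch. The bias column is
the average abundance minus the true count of 584 trees. How the standard deviation and
the interval were computed is not stated in the text, so the sample standard deviation
and the normal-approximation interval (mean ± 1.96 SD) used below are assumptions, and
the five estimates passed to the function are hypothetical.

```python
import statistics

TRUE_ABUNDANCE = 584  # number of trees in the study region

def summarize(estimates, truth=TRUE_ABUNDANCE):
    """Mean, bias, standard deviation and an assumed normal-approximation 95% interval."""
    mean = statistics.fmean(estimates)
    sd = statistics.stdev(estimates)
    return {
        "mean": mean,
        "bias": mean - truth,
        "sd": sd,
        "ci": (mean - 1.96 * sd, mean + 1.96 * sd),
    }

# Hypothetical abundance estimates from five simulated cases (illustrative only).
print(summarize([561.2, 590.4, 548.9, 603.1, 575.6]))

# The bias column of Table 1 is the average abundance minus 584.
table_1 = (("Half-normal", 569.80), ("Hazard-rate", 556.56), ("Uniform + Cos(1)", 577.41))
for model, average in table_1:
    print(model, round(average - TRUE_ABUNDANCE, 2))
```

A negative bias means that, on average, the model underestimates the number of trees in
the study region.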
Figure 8 below displays the distribution of the estimated abundance from the 1000
simulated cases for the Half-normal, Hazard-rate and Uniform with cosine adjustment models.
For 1000 simulated cases of re-sampling multiple transect lines set at 60m, 90m, 120m
and 150m on the study region, the average abundance of the pine trees was estimated
for the three models. Table 2 below displays the average abundance, average standard
deviation, average bias and a 95% confidence interval for each model.
Table 2
The average abundance of pine trees is underestimated by the Half-normal, Hazard-rate
and Uniform with cosine adjustment models.
3.52 Using a half-normal detection function
Table 3
Figure 9 below displays the distribution of the estimated abundance from the 1000
simulated cases for the Half-normal, Hazard-rate and Uniform with cosine adjustment models.
(B) Multiple transect lines set at 60m, 90m, 120m and 150m.
For 1000 simulated cases of re-sampling multiple transect lines set at 60m, 90m, 120m
and 150m on the study region, the average abundance of the pine trees was estimated
for the three models. Table 4 below displays the average abundance, average standard
deviation, average bias and a 95% confidence interval for each model.
Table 4
The average abundance of pine trees is overestimated by the Half-normal, Hazard-rate
and Uniform with cosine adjustment models.
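4. Results and Discussion
With the exponential detection function and a single transect line set at 120m, the
uniform with cosine(1) adjustment model had the largest estimate of abundance with the
smallest bias. Again, when multiple transect lines were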
set at 60m, 90m, 120m and 150m, the uniform with cosine(1) adjustment model had
the largest estimate of abundance with the smallest bias. Comparatively, we found that
the average bias was higher for the multiple transect lines than for the single transect
line. We cannot conclude that this will always be the case, as the difference could be
attributed to lower estimates of abundance from transect lines closer to the edge of the
study region.
We also assessed the abundance of longleaf pine trees using a half-normal detection
function. The simulation was repeated with a single transect line set at 120m on the
study region, and estimates of abundance from all models (Half-normal, Hazard-rate and
Uniform with cosine(1) adjustment) were considered. The abundance of longleaf pine
trees was overestimated by all models. However, on average, the hazard-rate model
provided the best estimate of abundance, with the smallest bias. The hazard-rate model
also provided the best estimate of abundance, with the smallest bias, when multiple
transect lines were set at 60m, 90m, 120m and 150m. The uniform with cosine(1) model
provided the poorest estimate of abundance, with the largest bias among all the models,
when the half-normal detection function was used.
From the simulation results, we observed that the model which provides the best
estimate of abundance when a single transect line is used also provides the best estimate
of abundance when multiple transect lines are used on the study region. However, the
model which provides the best estimate of abundance differs between detection functions.
These results could change if additional models were used in estimating the abundance
of trees. It is therefore not possible to draw a general conclusion about which model
would be best for a given detection function when using line transect sampling.
In summary, line transect sampling appeared to provide a reasonable estimate of
abundance in a given study region. The estimate of abundance depends on which
detection function is considered. The choice of the detection function and its
parameter(s) may lead to underestimation or overestimation of abundance in a given study
region. In addition, all assumptions need to be satisfied to ensure efficient use of the
information and to yield a better estimate of abundance. Again, several models are worth
considering in modeling the detection function for a better estimate of abundance. Future
studies may consider more advanced methods of line transect sampling in which truncation
of the data is useful in estimating abundance.
References
Anderson, D.R., Burnham, K.P. and Crain, B.R. (1985a). Estimating population size and
density using line transect sampling. Biometrical Journal, 27, 723-31.
Anderson, D.R., Laake, J.L., Crain, B.R. and Burnham, K.P. (1979). Guidelines for
line transect sampling of biological populations. Journal of Wildlife Management, 43, 70-8.
Brockelman, W.Y. (1980). The use of the line transect sampling method for forest
primates, in Tropical Ecology and Development (ed. J.I. Furtado), The International
Society of Tropical Ecology, Kuala Lumpur, Malaysia, pp. 367-71.
DeVries, P.G. (1979a). Line transect sampling - statistical theory, applications, and
suggestions for extended use in ecological inventory, in Sampling Biological Populations
(eds R.M. Cormack, G.P. Patil and D.S. Robson), International Co-operative Publishing
House, Fairland, MD, USA, pp. 1-70.
Eberhardt, L.L. (1978a). Transect methods for population studies. Journal of Wildlife
Management, 42, 1-31.
Rao, P.V., Portier, K.M. and Ondrasik, J.A. (1981). Density estimation using line
transect sampling, in Estimating Numbers of Terrestrial Birds. Studies in Avian Biology
No. 6 (eds C.J. Ralph and J.M. Scott), Cooper Ornithological Society, pp. 441-4.