International Biometric Society

JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content. Unless you have obtained prior permission, you may not download an entire issue of a journal or multiple copies of articles. Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

A Nonparametric Approach to Size-Biased Line Transect Sampling

Author(s): Pham Xuan Quang


Source: Biometrics, Vol. 47, No. 1 (Mar., 1991), pp. 269-279
Published by: International Biometric Society
Stable URL: http://www.jstor.org/stable/2532511
Accessed: 03/02/2011 11:19

Your use of the JSTOR archive indicates your acceptance of JSTOR's Terms and Conditions of Use, available at
http://www.jstor.org/page/info/about/policies/terms.jsp. JSTOR's Terms and Conditions of Use provides, in part, that unless
you have obtained prior permission, you may not download an entire issue of a journal or multiple copies of articles, and you
may use content in the JSTOR archive only for your personal, non-commercial use.

Please contact the publisher regarding any further use of this work. Publisher contact information may be obtained at
http://www.jstor.org/action/showPublisher?publisherCode=ibs.

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed
page of such transmission.

JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of
content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms
of scholarship. For more information about JSTOR, please contact support@jstor.org.

International Biometric Society is collaborating with JSTOR to digitize, preserve and extend access to
Biometrics.

http://www.jstor.org
BIOMETRICS 47, 269-279
March 1991

A Nonparametric Approach to Size-Biased Line Transect Sampling

Pham Xuan Quang

Department of Mathematical Sciences, University of Alaska,
Fairbanks, Alaska 99775, U.S.A.

SUMMARY

Point and interval estimators of the density of objects that tend to aggregate are proposed for data from line transect surveys. The probability of detection of a cluster is assumed to depend both on the perpendicular distance of the cluster to the transect line, and on its size. No particular form for the bivariate detection function is assumed, except that clusters located directly on the transect line are surely detected. An estimator of mean cluster size is also proposed. The standard errors of the estimators can be estimated either by calculating sample variances, or by bootstrap resampling. Applications of the method to a field test by Mark Otto, to a shipboard survey of minke whales, and to a survey of the bobwhite are provided. Results of some Monte Carlo simulations are presented and discussed. This method is based on the theory of trigonometric Fourier series; it appears to have the same advantages and weaknesses as the well-known Fourier series density estimator of Burnham, Anderson, and Laake (1980, Wildlife Monograph 72, Supplement to Journal of Wildlife Management 44).

1. Introduction

Line transect sampling is a practical and cost-effective procedure for estimating the density of certain objects in a given region. These objects may be mammals, birds, trees, etc. It is usually assumed that the probability of detecting an object is a function of its perpendicular distance to the transect line, and that detections of different objects are independent events. See Burnham, Anderson, and Laake (1980) for a comprehensive review of "distance-only" line transect procedures.

If the objects are plants or animals that tend to aggregate into clusters, then the independence assumption may be violated since the size of a cluster may influence its own probability of detection. Then the observed mean size may be a biased estimate of the true mean size.

The problem of detection distorted by size has been considered by several authors. The object being sampled is now a cluster (or group, flock, school, etc.). The data recorded are the perpendicular distance X from the center of the detected cluster to the transect line and the size S of that cluster. Quinn (1985) utilizes the distance-only line transect formula to estimate the number of clusters, which he then multiplies by a weighted average cluster size to get an estimate of the population size. Quinn also proposes a post-stratification scheme. Drummer and McDonald (1987) transform the distance-only detection function into a distance-size detection function by inserting size as a covariate. The method of covariate adjustment to effective area of Ramsey, Wildman, and Engbring (1987) can also be used.
In this paper, it is assumed that the detection probability $g(x, s)$ is a function of both the distance $x$ and the size $s$ of the cluster, that both quantities can be measured accurately, and that clusters located directly on the transect line are surely detected. No particular form of the function $g(x, s)$ is postulated. The proposed distance-size procedure is based on the theory of Fourier series.

Key words: Bootstrap resampling; Clustered population; Fourier series; Line transect sampling; Monte Carlo simulation; Size-biased sampling.

The main results are formula (2), giving a point estimate of the population density $D$, and formula (5), giving an interval estimate of $D$. A specialization of the procedure (by setting all sizes $S_i$ to 1) produces point and interval estimates of the cluster density $A$ [formula (3)]. The point estimate for $A$ turns out to be identical to the one derived by Crain et al. (1979).

The proposed procedure is simple to apply, as it requires only straightforward calculations. It can easily be implemented on a microcomputer. Moreover, it has application to problems other than that of estimating the density of a clustered population. Indeed, size may refer to some numerical attribute of the object being surveyed. Suppose, for example, that it is desired to estimate the mean volume per unit area of a species of tree in a certain region, and that a voluminous tree is more likely to be detected than a smaller one. Then distance-size line transect sampling may be contemplated, where size is the volume of the tree.

Data sets are used to illustrate the method. Otto and Pollock (1990) describe a field test using painted beer cans, in which the population parameters are known. Quinn (1985) and Drummer and McDonald (1987) apply their respective procedures to estimate the number of minke whales from a survey in the Antarctic Ocean. Gates et al. (1985) estimate the bobwhite population from a survey in Texas. By reworking these data sets, we have an opportunity to compare the different techniques.

Finally, the results of some Monte Carlo simulations utilizing bivariate detection functions modeled after Drummer and McDonald (1987) are presented. The unsurprising conclusion is that the present procedure carries the same general advantages and weaknesses as distance-only Fourier series estimates. On this last subject there is a rich literature. See in particular Burnham et al. (1980), Buckland (1982, 1985), and Alldredge and Gates (1985).

2. Notation and Assumptions

2.1 Notation
We adopt the following notation:
$L$, $w$: Length, half-width of the sampled transect strip;
$C$, $n$: Actual and detected numbers of clusters in the strip;
$A = E(C)/(2Lw)$: Cluster density (mean number of clusters per unit area);
$X_j$: Perpendicular distance from the transect line to the geometrical center of cluster $j$, for $j = 1, \ldots, C$;
$S_j$: Size (i.e., number of objects) of cluster $j$, for $j = 1, \ldots, C$. Without loss of generality, assume that $S_j < M$ for all $j$;
$N = \sum_{j=1}^{C} S_j$: Total number of objects in the strip;
$D = E(N)/(2Lw)$: Population density (mean number of objects per unit area);
$\varphi(s)$: Common probability density function (pdf) of the $S_j$'s;
$\mu = E(S_j)$: Common mean size of the clusters;
$g(x, s)$: Bivariate detection function, i.e., probability of detecting a cluster, given that it is at distance $x$ from the transect line, and that it has size $s$;
$f(x, s)$: Joint pdf of distance and size of a detected cluster;
$p$: Average probability of detection of a cluster.

2.2 Assumptions
The following assumptions are made:
(A1) A $2w$-by-$L$ transect strip is chosen at random in the region under study. As a result, $C$, $n$, and $N$ are random quantities.
(A2) Before sampling, $C, X_1, \ldots, X_C, S_1, \ldots, S_C$ are mutually independent; moreover, each $X_j$ is uniformly distributed over $[0, w]$.
(A3) For each detected cluster $i$, the distance $X_i$ and size $S_i$ can be measured accurately.
(A4) Detections of clusters are mutually independent events.
(A5) Clusters directly on the transect line are surely detected whatever their sizes, i.e., $g(0, s) = 1$ for all $s > 0$; $g(x, s)$ is moreover continuous with respect to $x$.

2.3 Remarks

1. No assumptions are made about the shapes of $\varphi(s)$ and $g(x, s)$. In particular $\varphi(s)$ can be of the discrete or continuous type.
2. The assumption $g(0, s) = 1$ for all $s > 0$ is crucial for subsequent derivations; it is a natural extension of a similar assumption found in the distance-only nonparametric line transect literature (see, e.g., Crain et al., 1979, p. 732). It seems to have been formulated first by Drummer and McDonald (1987), who assumed further that $g(x, s)$ is nonincreasing in $x$ and nondecreasing in $s$.
3. Assumption (A2) can be roughly interpreted as follows. The locations and sizes of the different clusters must have no mutual influence. Given that a cluster is within a strip, it must have equal chance of being located anywhere within that strip. The sizes and locations of the clusters found in a randomly chosen transect strip must have no influence on their number.

3. Point Estimation Procedures

If $S_i$ is of the discrete type, define $\beta(x) = \sum_s s f(x, s)$, where the summation extends over all possible values $s$ of $S_i$. If $S_i$ is of the continuous type, and satisfies $0 < S_i < M$, say, define $\beta(x) = \int_0^M s f(x, s)\,ds$. It is proved in the Appendix that

$$D = \frac{E(n)\,\beta(0)}{2L}. \tag{1}$$

Notice that (1) is similar to the formula $D = E(n) f_X(0)/(2L)$ derived by Crain et al. (1979) for nonclustered populations, where $f_X$ is the pdf of the detected distances. A referee points out that $\beta(0) = E[S f(0 \mid S)]$ is the expected cluster size weighted by the conditional probability of detection given cluster size, at distance 0. Also in the Appendix, the theory of trigonometric Fourier series is used to estimate $\beta(0)$, from which the following estimator of the population density $D$ is derived:

$$\hat{D} = \frac{1}{2Lw} \sum_{i=1}^{n} S_i \left[ 1 + 2 \sum_{r=1}^{t} \cos\frac{\pi r X_i}{w} \right], \tag{2}$$

where $t$ is an integer chosen according to Rule 2 below.

A consequence of (2), using $S_i = 1$ for all $i$, is the following estimator of the cluster density $A$:

$$\hat{A} = \frac{1}{2Lw} \sum_{i=1}^{n} \left[ 1 + 2 \sum_{r=1}^{m} \cos\frac{\pi r X_i}{w} \right], \tag{3}$$

where $m$ is an integer chosen according to Rule 1 below. Notice that (3) coincides with formula (1) of Crain et al. for estimating the density of a nonclustered population.

Formulas (2) and (3) can be heuristically explained as follows. Consider all the clusters of size $S_j = 3$. Denote the corresponding detected distances $X'_1, X'_2, \ldots, X'_{n_3}$. These clusters having the same size, Crain et al. would estimate their density $A_3$ by

$$\hat{A}_3 = \frac{1}{2Lw} \sum_{i=1}^{n_3} \left[ 1 + 2 \sum_{r=1}^{m} \cos\frac{\pi r X'_i}{w} \right].$$

Now (3) is the sum of $\hat{A}_1, \hat{A}_2, \hat{A}_3, \ldots$. The density $D_3$ of objects belonging to all size-3 clusters can be estimated by

$$\hat{D}_3 = 3\hat{A}_3 = \frac{3}{2Lw} \sum_{i=1}^{n_3} \left[ 1 + 2 \sum_{r=1}^{m} \cos\frac{\pi r X'_i}{w} \right].$$

Now (2) is the sum of $\hat{D}_1, \hat{D}_2, \hat{D}_3, \ldots$ if $m = t$.

The relation $\mu = D/A$ is proved in the Appendix, and motivates the following estimator of the mean cluster size:

$$\hat{\mu} = \hat{D}/\hat{A}. \tag{4}$$
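
As an illustration of how formulas (2), (3), and (4) turn into arithmetic, here is a minimal Python sketch. It is not the author's SIZEBIAS program; the function names `fourier_weight` and `density_estimates` are hypothetical, and the numbers of terms `t` and `m` are simply passed in rather than chosen by Rules 1 and 2.

```python
import math


def fourier_weight(x, w, terms):
    """Return 1 + 2 * sum_{r=1}^{terms} cos(pi * r * x / w) for one distance x."""
    return 1.0 + 2.0 * sum(
        math.cos(math.pi * r * x / w) for r in range(1, terms + 1)
    )


def density_estimates(distances, sizes, L, w, t, m):
    """Point estimates from formulas (2), (3), and (4).

    distances, sizes: detected perpendicular distances X_i and cluster sizes S_i.
    L, w: length and half-width of the transect strip.
    t, m: numbers of cosine terms used in (2) and (3), respectively.
    """
    area = 2.0 * L * w
    # Formula (2): each detected cluster contributes its size times the weight.
    d_hat = sum(s * fourier_weight(x, w, t) for x, s in zip(distances, sizes)) / area
    # Formula (3): same expression with every size set to 1.
    a_hat = sum(fourier_weight(x, w, m) for x in distances) / area
    # Formula (4): estimated mean cluster size.
    mu_hat = d_hat / a_hat
    return d_hat, a_hat, mu_hat


if __name__ == "__main__":
    # Made-up data, for illustration only.
    d_hat, a_hat, mu_hat = density_estimates(
        distances=[0.0, 5.0], sizes=[2, 4], L=10.0, w=10.0, t=1, m=1
    )
    print(d_hat, a_hat, mu_hat)
```

With `t = m = 0` the cosine sums vanish, so the sketch reduces to the naive strip counts (total detected size and number of detections divided by the strip area), which is a convenient sanity check.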
Crain et al. (1979) determine the number $m$ of terms in (3) by minimizing a mean integrated square error (MISE), conditional on $n$. See also Kronmal and Tarter (1968). Their rule has been simplified because some terms are negligible and can be dropped. It has been shown to perform well, and is cited here as Rule 1.

Rule 1. Choose $m$ to be the first $r$ such that

$$\frac{1}{w} \left( \frac{2}{n+1} \right)^{1/2} \ge |\hat{a}_{r+1}|,$$

where

$$\hat{a}_r = \frac{2}{wn} \sum_{i=1}^{n} \cos\frac{\pi r X_i}{w}, \qquad r = 0, 1, \ldots.$$

To determine $t$, we minimize the MISE $E_n\{\int_0^w [\hat{\beta}_t(x) - \beta(x)]^2\,dx\}$, where the subscript denotes expectation conditional on $n$. Since we are estimating $\beta(0)$, it would be more appealing to minimize the mean squared error $E_n\{[\hat{\beta}_t(0) - \beta(0)]^2\}$, but the MISE is more tractable in the Fourier series context. It follows that $t$ must be chosen to be the first $r$ such that $\mathrm{var}_n(\hat{b}_{r+1}) \ge (b_{r+1})^2$, where

$$\hat{b}_r = \frac{2}{wn} \sum_{i=1}^{n} S_i \cos\frac{\pi r X_i}{w}, \qquad r = 0, 1, \ldots.$$

The simplifications leading to Rule 1 are not available in the present case, but we can find a simple, conditionally unbiased, estimator of $\mathrm{var}_n(\hat{b}_{r+1})$. Put

$$T_{ri} = \frac{2}{w} S_i \cos\frac{\pi r X_i}{w}, \qquad i = 1, \ldots, n, \quad r = 1, \ldots;$$

then $\hat{b}_r$ is the sample mean $\bar{T}_r$ of the $T_{ri}$'s. Let $s_r^2$ be their sample variance. Since the $T_{ri}$'s are independent and identically distributed (i.i.d.) for each $r$, $\mathrm{var}_n(\hat{b}_r) = \mathrm{var}_n(T_{r1})/n = E_n(s_r^2/n)$. Thus $s_r^2/n$ is a conditionally unbiased estimate of $\mathrm{var}_n(\hat{b}_r)$, and we get Rule 2.

Rule 2. Choose $t$ to be the first $r$ such that

$$\frac{s_{r+1}^2}{n} \ge \hat{b}_{r+1}^2$$

or, equivalently, such that

$$\sum_{i=1}^{n} T_{(r+1)i}^2 \ge (n^2 - n + 1)\,\hat{b}_{r+1}^2.$$

Calculations based on Otto's beer cans and simulations reported in Section 6 suggest that Rule 2 is adequate.
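
The two stopping rules can be sketched in a few lines of Python. This is a hedged illustration rather than the author's code: the helper names are hypothetical, the search is assumed to start at r = 0 (Table 3 reports average values of t below 1, so zero terms must be possible), the cap `max_terms` is an assumption loosely modeled on the "Maxm" setting shown in Figure 1, and Rule 2 is implemented in its first form, comparing the sample variance divided by n with the squared coefficient.

```python
import math


def cosine_terms(distances, weights, w, r):
    """Return T_{ri} = (2 / w) * weight_i * cos(pi * r * X_i / w) for all i."""
    return [
        (2.0 / w) * s * math.cos(math.pi * r * x / w)
        for x, s in zip(distances, weights)
    ]


def choose_m(distances, w, max_terms=50):
    """Rule 1: first r with (1/w) * sqrt(2 / (n + 1)) >= |a_hat_{r+1}|."""
    n = len(distances)
    threshold = (1.0 / w) * math.sqrt(2.0 / (n + 1))
    ones = [1.0] * n
    for r in range(max_terms + 1):
        a_next = sum(cosine_terms(distances, ones, w, r + 1)) / n
        if threshold >= abs(a_next):
            return r
    return max_terms  # assumed fallback when the rule never triggers


def choose_t(distances, sizes, w, max_terms=50):
    """Rule 2 (first form): first r with s_{r+1}^2 / n >= b_hat_{r+1}^2.

    Requires at least two detections so that a sample variance exists.
    """
    n = len(distances)
    for r in range(max_terms + 1):
        terms = cosine_terms(distances, sizes, w, r + 1)
        b_next = sum(terms) / n  # sample mean of the T_{(r+1)i}
        s2 = sum((v - b_next) ** 2 for v in terms) / (n - 1)  # sample variance
        if s2 / n >= b_next ** 2:
            return r
    return max_terms  # assumed fallback when the rule never triggers


if __name__ == "__main__":
    # Made-up distances and sizes, for illustration only.
    xs = [0.5, 1.2, 3.3, 7.9, 2.4, 5.1]
    ss = [2, 3, 1, 6, 2, 4]
    print(choose_m(xs, w=10.0), choose_t(xs, ss, w=10.0))
```

The returned values can be passed as `m` and `t` to any implementation of formulas (2) and (3).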
4. Confidence Intervals

We argue in the Appendix that, conditional on $t$, confidence intervals for $D$ have the form

$$\hat{D} - z\,\mathrm{Sd}_n(\hat{D}) < D < \hat{D} + z\,\mathrm{Sd}_n(\hat{D}), \tag{5}$$

where $z$ is a quantile of the standard normal distribution appropriate for a given confidence coefficient; e.g., $z = 1.96$ corresponds to a 95% confidence. Put

$$Q_i = \frac{S_i}{2Lw} \left[ 1 + 2 \sum_{r=1}^{t} \cos\frac{\pi r X_i}{w} \right];$$

then the $Q_i$'s are i.i.d. and $\hat{D}$ is their sum. This fact allows us to estimate the conditional standard deviation $\mathrm{Sd}_n(\hat{D})$ either by bootstrap resampling [see Efron (1982) for a description, and Bickel and Freedman (1981) for a theoretical justification in the i.i.d. case], or by calculating the sample standard deviation of the $Q_i$'s. A confidence interval for $A$ can be set up likewise. The bootstrap does not seem to give a more precise estimate of the standard deviation than the sample standard deviation approach. But $\hat{\mu} = \hat{D}/\hat{A}$ is a ratio of two dependent variables, hence its standard deviation is difficult to estimate analytically (see Cochran, 1977, p. 33), whereas the bootstrap is handy and seems to work well. The computer program SIZEBIAS, available on request from the author, uses both approaches in estimating standard deviations.

Two features are worth noting: (i) expression (5) has the familiar form of a "normal interval," and (ii) the standard deviation $\mathrm{Sd}_n(\hat{D})$ is calculated conditional on $n$, i.e., as if $n$ were nonrandom.
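
The two ways of estimating the conditional standard deviation can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of SIZEBIAS: the function names are hypothetical; the sample-variance estimate is taken to be the square root of n times the sample standard deviation of the Q_i's (which follows from treating the estimate as a sum of n i.i.d. terms with n fixed); and the bootstrap is assumed to resample the Q_i's with replacement while holding t fixed.

```python
import math
import random
import statistics


def q_values(distances, sizes, L, w, t):
    """Q_i = S_i / (2 L w) * [1 + 2 * sum_{r=1}^{t} cos(pi r X_i / w)]."""
    area = 2.0 * L * w
    return [
        s / area
        * (1.0 + 2.0 * sum(math.cos(math.pi * r * x / w) for r in range(1, t + 1)))
        for x, s in zip(distances, sizes)
    ]


def se_sample(q):
    """Standard deviation of sum(Q_i), conditional on n (needs n >= 2)."""
    return math.sqrt(len(q)) * statistics.stdev(q)


def se_bootstrap(q, replications=100, seed=0):
    """Bootstrap standard deviation of sum(Q_i), resampling Q_i with replacement."""
    rng = random.Random(seed)
    n = len(q)
    totals = [sum(rng.choices(q, k=n)) for _ in range(replications)]
    return statistics.stdev(totals)


def normal_interval(q, z=1.96):
    """Interval of the form (5), using the sample-variance standard error."""
    d_hat = sum(q)
    se = se_sample(q)
    return d_hat - z * se, d_hat + z * se


if __name__ == "__main__":
    # Made-up data, for illustration only.
    q = q_values([0.5, 1.2, 3.3, 7.9], [2, 3, 1, 6], L=200.0, w=20.0, t=1)
    print(sum(q), se_sample(q), se_bootstrap(q), normal_interval(q))
```

The same functions applied with all sizes set to 1 and `t` replaced by `m` would give the corresponding quantities for the cluster density.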
In the following, the relative merits of conditional and unconditional variances are discussed. Common practice in surveys of resource abundance calls for assessing the variability of an estimator, say $\hat{D}$, by estimating its unconditional variance $\mathrm{var}(\hat{D})$ or, equivalently, its coefficient of variation $\mathrm{Sd}(\hat{D})/D$. Then further assumptions often need to be made about the moments of $n$, whose distribution is not known. Quinn and Gallucci (1980) present several ways of estimating such unconditional variances. Gates (1981) discusses various methods of optimizing variance estimation, among them sampling designs where several transect strips serve as sampling units.

Conditional variances are easier to estimate, but unlike their unconditional counterparts, do not convey the full variability of the estimator. For the purpose of setting a confidence interval for $D$, however, use of a conditional variance as in (5) is permissible. Conditional variances will often produce shorter confidence intervals than the unconditional ones, since (see Cochran, 1977, p. 276) $\mathrm{var}(\hat{D}) = E[\mathrm{var}_n(\hat{D})] + \mathrm{var}[E_n(\hat{D})]$.

The interval (5) is not without weaknesses, however: (i) it is asymptotic ($n$ must be large); (ii) it is conditional on $t$, i.e., it is rigorously valid only if the number $t$ of terms in (2) is fixed in advance, rather than being determined by Rule 2; and (iii) it carries a coverage deficiency because it is built around $\hat{\beta}_t(0)$, which is a biased estimator of $\beta(0)$. Because of these weaknesses, this paper attempts to assess the performance of (5) by carrying out some Monte Carlo simulations (see Section 6). The results parallel those of the distance-only case (see Crain et al., 1979): when the bivariate detection function $g(x, s)$ is modeled by a half-normal density, then (5) seems to be satisfactory; when $g(x, s)$ is modeled by an exponential density, then coverage deficiency is serious.

5. Examples

The computer program SIZEBIAS is used to process the data sets described below. The results are summarized in Table 1. Output is shown in Figure 1.

Table 1
Summary of analyses of data sets. The entries are the estimates (standard error, coefficient of variation). All standard errors are obtained by a bootstrap with 100 replications.

Estimate                     Beer cans            Bobwhite                 Whales
Population $\hat{N}$         448 (56, 12.4%)      12,727 (1,604, 12.6%)    9,305 (3,037, 32.6%)
Clusters $\hat{C}$           125 (13, 10.4%)      3,632 (299, 8.2%)        1,876a (367, 19.6%)
Cluster size $\hat{\mu}$     3.60 (.48, 13.33%)   3.50 (.34, 9.7%)         4.96 (1.81, 36.6%)
Obs. mean size $\bar{S}$     4.21                 3.745                    5.33
Area $2Lw$                   8,000 sq m           148.52 sq km             7,800 sq nm
a Obtained with bias reduction technique.

SIZE-BIASED SAMPLING, File: 20

n = 48, Maxm = 50, w = 20.0, L = 200.0, Area = 8000.0, m = 2, t = 1

Bootstrap Replications: 100, Seed = 7651234

Observed mean size: 4.2083
SAMPLE VAR.       Estimate    S.e.       C.V.      Density
Population:       448.1961    52.4161    11.69%    0.056025
Clusters:         124.5206    14.2567    11.45%    0.015565
Cluster size:       3.5994
BOOTSTRAP VAR.    Estimate    S.e.       C.V.      Density
Population:       448.1961    55.6224    12.41%    0.056025
Clusters:         124.5206    12.9096    10.37%    0.015565
Cluster size:       3.5994     0.4794    13.32%

Figure 1. Output of the computer program SIZEBIAS with Otto and Pollock's (1990) beer can data.

5.1 Otto's Beer Can Data

In 1982, Mark Otto carried out two experiments in a field near Raleigh, North Carolina, using brown painted beer cans, and nine observers. The entire data sets are presented in Otto and Pollock (1990); we will use only the first one. All parameters are known: there were $N = 462$ beer cans in $C = 132$ clusters. The mean cluster size is $\mu = 3.50$. We will pretend that only one (fictitious) observer walks the transect line. That observer "detects" a cluster of cans when at least one of the real observers does: $n = 48$ clusters were thus detected. Notice that the estimated population size is $\hat{N} = 448$, a 3% relative error. A comparison of the average size of the detected clusters, $\bar{S} = 4.21$, and the estimated mean size, $\hat{\mu} = 3.60$, suggests that larger clusters are more easily detected. This last remark applies also to the next two data sets.

5.2 Bobwhite Data

A survey of several species of animals, among them the bobwhite (Colinus virginianus), was carried out during January-July 1975 and 1976 in Zavala County, Texas. Gates et al. (1985) use this large data set (and several others) to compare the sensitivities of several parametric and nonparametric methods of estimating densities with respect to various ways of grouping of distances, and to observations of individuals by groups.

There were $n = 3317$ detected clusters. For each detected cluster $i$, the size $S_i$, the radial [...] m. The estimated density is $\hat{D} = 12{,}727/148.52 = 85.69$ birds per sq km. Gates et al. find estimates ranging from 82.9 to 88.0.

5.3 Whale Data

A survey of the minke whale was conducted during the 1979-1980 season by the International Whaling Commission. There were $n = 165$ detected pods of whales in a strip of dimensions $L = 975$, $2w = 8$ nautical miles. The estimated population size, $\hat{N} = 9{,}305$ whales, is not precise: c.v. = 32.6%, and the Fourier expansion (2) requires a large number of terms ($t = 13$). The estimated number of clusters, $\hat{C} = 2{,}405$ (with s.e. = 219 by the bootstrap method), is dubious because (3) requires $m \ge 20$, which is excessive. Indeed, a frequency histogram of the detected distances indicates that the shape requirement for an efficient estimate of a density (Burnham et al., 1980, p. 47) is not met. When the bias reduction technique (Quang, 1990) is applied, $\hat{C} = 1{,}876$. Bias-reduced values are also used in the subsequent bootstrap procedure, yielding s.e.($\hat{C}$) = 367.

Results from alternative methods (Quinn, 1985; Drummer and McDonald, 1987) are reproduced in Table 2 for comparison. The reader is referred to these authors for a detailed discussion of the procedures used. Both Quinn and Drummer and McDonald report the population estimates over a region of 39,100 sq nm, while the transect strip measures 7,800 sq nm. The extrapolated population estimate, from $\hat{N} = 9{,}305$, is 46,644 whales.
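
The extrapolation from the strip to the region is a simple rescaling by area, which the following short Python check makes explicit. The variable names are illustrative only; the three input figures are those quoted in the text.

```python
# Figures quoted in the text for the whale survey.
strip_area_sq_nm = 7800.0      # 2Lw = 975 * 8
region_area_sq_nm = 39100.0    # region used by Quinn and by Drummer and McDonald
n_hat_strip = 9305.0           # estimated number of whales in the strip

# Density in the strip, then scaled up to the whole region.
density_per_sq_nm = n_hat_strip / strip_area_sq_nm
n_hat_region = density_per_sq_nm * region_area_sq_nm

print(round(n_hat_region))  # 46644
```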

Table 2
Comparison of analyses of the whale data

                                Estimated            Estimated           Estimated
                                mean pod size        # of pods           # of individuals
Method                          $\hat{\mu}$ (s.e.)   $\hat{C}$ (s.e.)    $\hat{N}$ (s.e.)
Quinn (1985)
  Post-stratification           3.27  (0)            10,708 (3,427)      35,054 (7,940)
  Pooled data                   4.30  (.40)          10,747 (2,520)      46,212 (11,658)
Drummer and McDonald (1987)
  Negative exponential          4.155 (.502)         10,400 (1,615)      43,209 (8,242)
  Half-normal                   5.175 (.874)          4,845 (610)        25,071 (5,263)
  GEM                           4.150 (.502)         10,801              44,825
This paper
  Nonparametric                 4.96  (1.81)          9,402a (1,840)     46,644 (13,871)
a Obtained with a bias-reduction technique.

6. Monte Carlo Simulations

Monte Carlo simulations were performed to investigate the small-sample properties of the estimators $\hat{D}$, $\hat{A}$, $\hat{\mu}$, and the confidence intervals (5). Set $w = 4$, $N = 100$ and $200$; set $L$ such that $D = 1$, $A = .5$, $\mu = 2$ always. $S_j$ is normal with mean $\mu$, unit variance, truncated between 0 and $w$. $X_j$ is uniform over $[0, w]$ (the same $w$ is reused for simplicity).

The detection function $g$ is modeled after Drummer and McDonald (1987) as follows. Start with a univariate detection function $g^*(z)$, either half-normal [$g^*(z) = \exp(-z^2/2)$] or exponential [$g^*(z) = \exp(-z)$], with $0 < z < \infty$. Transform it into a bivariate detection function by setting $g(x, s) = g^*(x/s^{\alpha})$. The larger the "size-bias parameter" $\alpha$, the more influence the size $s$ exerts on the probability of detection $g(x, s)$. Choose $\alpha = .2, .6, 1.0, 1.4,$ and $1.8$.

It is interesting to note that the methods of covariate adjustment developed by Ramsey et al. (1987) can be used directly on the present simulation models. Indeed, the effective area surveyed, $A(s) = \int_0^{\infty} g(x, s)\,dx$, is proportional to $s^{\alpha}$, and hence is log-linear in $\log(s)$.
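
The proportionality of the effective area to $s^{\alpha}$ follows from a change of variable. The short derivation below is a sketch added for the reader; the constant $c^*$ is a symbol introduced here for the integral of $g^*$ and does not appear in the original.

```latex
\begin{aligned}
A(s) &= \int_0^{\infty} g(x, s)\,dx
      = \int_0^{\infty} g^*\!\left(\frac{x}{s^{\alpha}}\right) dx
      = s^{\alpha} \int_0^{\infty} g^*(z)\,dz
      = c^*\, s^{\alpha},
      \qquad z = \frac{x}{s^{\alpha}}, \\[4pt]
c^* &= \int_0^{\infty} e^{-z^2/2}\,dz = \sqrt{\pi/2} \quad \text{(half-normal)},
\qquad
c^* = \int_0^{\infty} e^{-z}\,dz = 1 \quad \text{(exponential)}, \\[4pt]
\log A(s) &= \log c^* + \alpha \log s .
\end{aligned}
```

The last line shows that $\log A(s)$ is linear in $\log s$ with slope $\alpha$, which is the log-linear form referred to above.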
Table 3
Results of the simulations. The true values are: $D = 1$, $A = .5$, $\mu = 2$.

                                  N = 100                                N = 200
              $\alpha$     .2     .6    1.0    1.4    1.8       .2     .6    1.0    1.4    1.8
Half-normal detection
$\hat{D}$    Ave. $\hat{D}$    1.000   .998  1.002  1.036  1.051    1.002   .987   .987   .997  1.020
             % cover.     67.0   76.4   86.6   92.6   94.6     74.2   74.6   86.6   91.0   94.0
             % RMSE       26.7   24.5   23.0   20.2   19.5     21.8   19.3   16.6   15.7   14.2
             Ave. t       2.00   1.48   1.37   1.16    .92     2.38   1.63   1.42   1.36   1.12
$\hat{A}$    Ave. $\hat{A}$     .504   .491   .487   .518   .597     .506   .491   .479   .475   .498
             % cover.     68.7   66.1   78.6   89.7   94.2     75.4   63.9   71.4   78.2   84.7
             % RMSE       27.1   22.7   22.5   31.3   48.0     20.0   18.7   16.6   17.6   25.5
             Ave. m       2.34   1.66   1.51   1.38   1.18     2.61   1.98   1.55   1.46   1.46
$\hat{\mu}$  Ave. $\hat{\mu}$   1.97   2.05   2.08   2.09   1.94     2.00   2.02   2.07   2.12   2.11
             Ave. $\bar{S}$     2.09   2.23   2.31   2.33   2.33     2.09   2.24   2.31   2.33   2.32
             Ave. n       35.4   45.4   55.9   64.3   70.0     70.6   90.7  111.5  128.5  140.2
Exponential detection
$\hat{D}$    Ave. $\hat{D}$     .818   .843   .881   .925   .967     .860   .852   .868   .895   .919
             % cover.     54.8   48.4   58.0   77.0   88.4     53.0   51.6   50.6   55.8   68.8
             % RMSE       35.0   31.0   26.9   22.8   19.5     30.5   26.8   24.2   21.6   18.9
             Ave. t       2.34   1.84   1.57   1.41   1.11     3.09   2.28   1.86   1.67   1.45
$\hat{A}$    Ave. $\hat{A}$     .418   .420   .426   .439   .484     .431   .425   .424   .426   .434
             % cover.     53.2   50.0   47.8   51.0   61.9     52.8   52.2   48.6   45.8   43.9
             % RMSE       33.6   32.1   31.8   31.5   38.4     27.7   26.5   26.1   24.6   25.9
             Ave. m       2.70   2.26   1.92   1.63   1.40     3.37   2.72   2.32   1.98   1.70
$\hat{\mu}$  Ave. $\hat{\mu}$   2.00   2.05   2.14   2.20   2.16     2.00   2.03   2.07   2.12   2.16
             Ave. $\bar{S}$     2.09   2.21   2.29   2.34   2.36     2.09   2.21   2.29   2.34   2.36
             Ave. n       27.4   33.9   41.0   47.8   53.9     54.6   67.4   81.8   95.4  107.6

A run of the simulation mimics a complete survey. In each run, $N$ pairs $(X_j, S_j)$ of pseudo-random variates are generated. The function $g$ is utilized to "detect" or "miss" a cluster. Thus $n$ clusters are detected; their measurements $(X_i, S_i)$ constitute a sample to which the formulas (2)-(5) are applied (via the program SIZEBIAS). The method is inefficient in the sense that about one-half the generated variates are discarded. But this may reflect better the random nature of the sample size $n$.
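
A single run of such a simulation might be generated along the following lines. This is a minimal sketch, not the author's code: the function names, the rejection sampler used for the truncated normal, and the seeding scheme are all assumptions made for illustration; only the distributions and the form $g(x, s) = g^*(x/s^{\alpha})$ come from the description above.

```python
import math
import random


def truncated_normal(rng, mean, sd, low, high):
    """Draw from a normal(mean, sd) restricted to (low, high) by rejection."""
    while True:
        value = rng.gauss(mean, sd)
        if low < value < high:
            return value


def simulate_survey(n_clusters, alpha, shape, w=4.0, mu=2.0, seed=0):
    """Generate clusters and return the (distance, size) pairs that are detected.

    shape: "half-normal" for g*(z) = exp(-z^2 / 2), "exponential" for exp(-z).
    """
    if shape == "half-normal":
        def g_star(z):
            return math.exp(-z * z / 2.0)
    elif shape == "exponential":
        def g_star(z):
            return math.exp(-z)
    else:
        raise ValueError("shape must be 'half-normal' or 'exponential'")

    rng = random.Random(seed)
    detected = []
    for _ in range(n_clusters):
        x = rng.uniform(0.0, w)                        # X_j uniform over [0, w]
        s = truncated_normal(rng, mu, 1.0, 0.0, w)     # S_j truncated to (0, w)
        # Detect the cluster with probability g(x, s) = g*(x / s**alpha).
        if rng.random() < g_star(x / s ** alpha):
            detected.append((x, s))
    return detected


if __name__ == "__main__":
    sample = simulate_survey(n_clusters=100, alpha=1.0, shape="half-normal", seed=1)
    print(len(sample), "clusters detected out of 100")
```

The detected pairs can then be fed to any implementation of formulas (2)-(5); the clusters that fail the detection test are simply discarded, which is the inefficiency mentioned in the text.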
For each combination of values of $N$, $\alpha$, and of a shape of $g^*$ (normal or exponential), 500 runs are made. The summary statistics are presented in Table 3. Notice that from one run to the next $n$, $t$, and $m$ vary. Table 3 gives the average of an estimate over the 500 runs, e.g.,

$$\mathrm{ave}(\hat{D}) = \sum_{k=1}^{500} \hat{D}_k / 500,$$

where $\hat{D}_k$ is $\hat{D}$ obtained on the $k$th run, the observed coverage rate of the confidence intervals (the nominal coverage rate is 95%), and the relative root-mean-square error, e.g.,

$$\mathrm{RMSE}(\hat{D}) = \frac{100}{D} \left\{ \frac{1}{500} \sum_{k=1}^{500} [\hat{D}_k - D]^2 \right\}^{1/2}.$$
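
The three summary statistics reported in Table 3 can be computed from the per-run results as in the sketch below. The function name `summarize` and its argument layout are hypothetical; the coverage test uses an interval of the form (5) with z = 1.96, matching the nominal 95% rate.

```python
import math


def summarize(estimates, std_errors, true_value, z=1.96):
    """Return (average, % coverage, % relative RMSE) over a set of runs.

    estimates, std_errors: one estimate and one standard error per run.
    true_value: the known parameter value used in the simulation.
    """
    runs = len(estimates)
    average = sum(estimates) / runs
    # A run "covers" when the true value lies inside estimate +/- z * s.e.
    covered = sum(
        1
        for est, se in zip(estimates, std_errors)
        if est - z * se < true_value < est + z * se
    )
    coverage_pct = 100.0 * covered / runs
    mean_sq_error = sum((est - true_value) ** 2 for est in estimates) / runs
    rel_rmse_pct = 100.0 / true_value * math.sqrt(mean_sq_error)
    return average, coverage_pct, rel_rmse_pct


if __name__ == "__main__":
    # Two made-up runs, for illustration only.
    print(summarize([0.9, 1.1], [0.1, 0.01], true_value=1.0))
```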

6.1 Discussion

The half-normal model seems to provide an estimate of $D$ that is overall unbiased, while the exponential model underestimates it. The same remark applies concerning $A$.

Coverage deficiency is a constant feature. Coverage of $D$ is more or less satisfactory (67.0%-94.6%) under the half-normal model, while deficiency is severe under the exponential model. The bad behavior of the exponential model under Fourier series techniques is well known. In the distance-only situation Burnham et al. (1980) showed that Fourier series techniques work well in cases where the detection curve exhibits a shoulder near the origin (e.g., the half-normal detection function), while Quang (1990) proposes a method to reduce the bias.

Overall, the estimate $\hat{\mu} = \hat{D}/\hat{A}$ does well, while the mean detected cluster size $\bar{S}$ always overestimates $\mu$ (as expected), often by a large amount. Finally, more clusters are detected as the influence of size on detectability (the magnitude of $\alpha$) increases.

ACKNOWLEDGEMENTS

I wish to thank Steve Amstrup, Tom Drummer, Charles Gates, Terry Quinn II, Steve Thompson, the associate editor, and two referees for careful reviews of the manuscript, and for many criticisms that have helped improve the presentation of the paper. Terry Quinn, Fred Guthery, and Mark Otto have kindly made available the edited whale data, the bobwhite data, and the beer can data, respectively.

RÉSUMÉ

Point and interval estimators are proposed for the density of objects with an aggregated distribution, observed by line transect sampling. The probability of detecting a group is assumed to depend both on its size and on its perpendicular distance to the transect. No particular assumption is made about the form of the bivariate function associated with the probability of detecting a group, except that groups located exactly on the transect are surely detected. An estimator of the mean group size is also proposed. The standard errors of the estimators can be obtained either from sample variances or by bootstrap resampling. The method is applied to a field test by Mark Otto, to a shipboard survey of minke whales, and to a survey of the bobwhite. The results of some Monte Carlo simulations are presented and discussed. The method is based on the theory of trigonometric Fourier series. It appears to have the same advantages and weaknesses as the well-known method of Burnham, Anderson, and Laake (1980, Wildlife Monograph 72, Supplement to Journal of Wildlife Management 44).

REFERENCES

Alldredge, J. R. and Gates, C. E. (1985). Line transect estimators for left-truncated distributions. Biometrics 41, 273-280.
Anscombe, F. J. (1952). Large-sample theory of sequential estimation. Proceedings of the Cambridge Philosophical Society 48, 600-607.
Bickel, P. J. and Freedman, D. A. (1981). Some asymptotic theory for the bootstrap. The Annals of Statistics 9, 1196-1217.
Buckland, S. T. (1982). A note on the Fourier series model for analysing line transect sampling. Biometrics 38, 469-477.
Buckland, S. T. (1985). Perpendicular distance models for line transect sampling. Biometrics 41, 177-195.
Burnham, K. P., Anderson, D. R., and Laake, J. L. (1980). Estimation of density from line transect sampling of biological populations. Wildlife Monograph 72, supplement to Journal of Wildlife Management 44.
Cochran, W. G. (1977). Sampling Techniques, 3rd edition. New York: Wiley.
Crain, B. R., Burnham, K. P., Anderson, D. R., and Laake, J. L. (1979). Nonparametric estimation of population density for line transect sampling using Fourier series. Biometrics 35, 731-748.
Drummer, T. D. and McDonald, L. L. (1987). Size bias in line transect sampling. Biometrics 43, 13-21.
Efron, B. (1982). The jackknife, the bootstrap and other resampling plans. In CBMS-NSF Regional Conference Series in Applied Mathematics 38. Philadelphia: Society for Industrial and Applied Mathematics.
Gates, C. E. (1981). Optimizing sampling frequency and numbers of transects and stations. Studies in Avian Biology 6, 399-404.
Gates, C. E., Evans, W., Gober, D. R., Guthery, F. S., and Grant, W. E. (1985). Line transect estimation of animal densities from large data sets. In Game Harvest Management, S. L. Beasom and S. F. Roberson (eds). Kingsville, Texas: Caesar Kleberg Research Institute, Texas A&I University.
Kronmal, R. and Tarter, M. (1968). The estimation of probability densities and cumulatives by Fourier series methods. Journal of the American Statistical Association 63, 925-952.
Otto, M. C. and Pollock, K. H. (1990). Size bias in line transect sampling: A field test. Biometrics 46, 239-245.
Quang, P. X. (1990). Confidence intervals for densities in line transect sampling. Biometrics 46, 459-472.
Quinn, T. J. II and Gallucci, V. F. (1980). Parametric models for line transect estimators of abundance. Ecology 61, 293-302.
Quinn, T. J. II (1985). Line transect estimators for schooling populations. Fisheries Research 3, 183-199.
Ramsey, F. L., Wildman, V., and Engbring, J. (1987). Covariate adjustments to effective area in variable-area wildlife surveys. Biometrics 43, 1-11.
Siegmund, D. O. (1985). Sequential Analysis. New York: Springer-Verlag.
Tolstov, G. P. (1962). Fourier Series (translated from the Russian by R. A. Silverman). Englewood Cliffs, New Jersey: Prentice-Hall.

Received March 1989; revised November 1989 and April 1990; accepted April 1990.

APPENDIX

A.1 The Relation $\mu = D/A$

This relation is an immediate consequence of the following:

$$E(N) = E(S_1 + \cdots + S_C) = E(C)E(S_1) = \mu E(C),$$

where the middle equality is justified by Wald's theorem (Siegmund, 1985, p. 12).

A.2 The Relation (1)

The following arguments will establish relation (1) when the $S_i$'s are of the discrete type. For the continuous type, the arguments are similar.

The joint pdf $f(x, s)$ is related to the detection function $g(x, s)$ by the following relation, which is a direct consequence of eq. (8) of Drummer and McDonald (1987, p. 16):

$$f(x, s) = \frac{1}{pw}\, g(x, s)\,\varphi(s),$$

where $0 \le x \le w$, $s$ is a possible value of $S_i$, and $\varphi(s) = \Pr(S_i = s)$. Assumption (A5) implies

$$f(0, s) = \frac{\varphi(s)}{pw}.$$

Conditional on $C$, $n$ is binomially distributed with parameters $C$, $p$; hence, $E(n) = E[E_C(n)] = E(Cp) = pE(C)$. It results that

$$E(n) = pE(C).$$

Eliminate $p$ between the last displayed relations:

$$w E(n) f(0, s) = E(C)\,\varphi(s).$$

Multiply both sides by $s$, sum over all values of $s$, observe that $\mu = \sum_s s\,\varphi(s)$ and that $\mu = D/A$. We get

$$w E(n) \sum_s s f(0, s) = E(C) \sum_s s\,\varphi(s) = E(C)\mu = E(N) = 2LwD.$$

Now define $\beta(x) = \sum_s s f(x, s)$. Substituting $\beta(0)$ into the previous equation results in (1).
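A.3 A Fourier Estimator of $\beta(0)$

We expand $\beta(x)$ into a trigonometric Fourier series (Tolstov, 1962, p. 35):

$$\beta(x) = \frac{1}{2} b_0 + \sum_{r=1}^{\infty} b_r \cos\frac{\pi r x}{w},$$

where, for $r = 0, 1, \ldots$,

$$b_r = \frac{2}{w} \int_0^w \beta(x) \cos\frac{\pi r x}{w}\,dx = \frac{2}{w} \sum_s \int_0^w s f(x, s) \cos\frac{\pi r x}{w}\,dx = \frac{2}{w} E\left[ S_1 \cos\frac{\pi r X_1}{w} \right],$$

yielding the following unbiased estimate of $b_r$:

$$\hat{b}_r = \frac{2}{wn} \sum_{i=1}^{n} S_i \cos\frac{\pi r X_i}{w}.$$

Choose an integer $t$, and define the Fourier segment

$$\beta_t(0) = \frac{1}{2} b_0 + \sum_{r=1}^{t} b_r;$$

then $\beta_t(0)$ can be unbiasedly estimated by

$$\hat{\beta}_t(0) = \frac{1}{2} \hat{b}_0 + \sum_{r=1}^{t} \hat{b}_r = \frac{1}{wn} \sum_{i=1}^{n} S_i \left[ 1 + 2 \sum_{r=1}^{t} \cos\frac{\pi r X_i}{w} \right] = \frac{2L}{n} \sum_{i=1}^{n} Q_i.$$

Moreover, $\hat{\beta}_t(0)$ is approximately unbiased for $\beta(0)$ for large $t$. Indeed the assumed continuity of $g$ implies that $\beta_t(0)$ tends to $\beta(0)$ as $t \to \infty$. Adopt $\hat{\beta}_t(0)$ as an estimator of $\beta(0)$; then (1) suggests the estimate $\hat{D} = n\hat{\beta}_t(0)/(2L)$. Now (2) is immediate.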

A.4 The Confidence Interval (5)

In the following, assume that $t$ is a fixed integer. The main idea in deriving (5) is that $\hat{\beta}_t(0)$ is an average of independent, identically distributed variables (albeit with a random number of summands), hence extensive results about the Central Limit Theorem can be brought to bear. A version of that theorem developed for random $n$ (see Anscombe, 1952; or Siegmund, 1985, p. 23) can now be applied to the $Q_i$'s: the ratio $Z$ is close to the standard normal when $n$ is large, where

$$Z = \frac{\hat{\beta}_t(0) - \beta_t(0)}{\mathrm{Sd}_n[\hat{\beta}_t(0)]}.$$

We deduce that the following statement holds with probability approaching $1 - \gamma$:

$$\hat{\beta}_t(0) - z\,\mathrm{Sd}_n[\hat{\beta}_t(0)] < \beta_t(0) < \hat{\beta}_t(0) + z\,\mathrm{Sd}_n[\hat{\beta}_t(0)],$$

where $z$ is the $1 - \gamma/2$ quantile of the standard normal distribution. If $\widehat{\mathrm{Sd}}_n[\hat{\beta}_t(0)]$ is a consistent estimate of $\mathrm{Sd}_n[\hat{\beta}_t(0)]$, and if the bias $\beta_t(0) - \beta(0)$ is small relative to that standard deviation, then substitutions can be made, yielding the approximate interval

$$\hat{\beta}_t(0) - z\,\widehat{\mathrm{Sd}}_n[\hat{\beta}_t(0)] < \beta(0) < \hat{\beta}_t(0) + z\,\widehat{\mathrm{Sd}}_n[\hat{\beta}_t(0)].$$

It suffices now to multiply through by $n/(2L)$ to get (5).
