Health Technology Assessment 2006; Vol. 10: No. 6

Systematic review and evaluation of methods of assessing urinary incontinence

JL Martin, KS Williams, KR Abrams, DA Turner, AJ Sutton, C Chapple, RP Assassa, C Shaw and F Cheater
Feedback
The HTA Programme and the authors would like to know
your views about this report.
The Correspondence Page on the HTA website
(http://www.ncchta.org) is a convenient way to publish
your comments. If you prefer, you can send your comments
to the address below, telling us whether you would like
us to transfer them to the website.
We look forward to hearing from you.

February 2006

The National Coordinating Centre for Health Technology Assessment,
Mailpoint 728, Boldrewood,
University of Southampton,
Southampton, SO16 7PX, UK.
Fax: +44 (0) 23 8059 5639 Email: hta@soton.ac.uk
Health Technology Assessment
NHS R&D HTA Programme
http://www.ncchta.org ISSN 1366-5278

How to obtain copies of this and other HTA Programme reports.


An electronic version of this publication, in Adobe Acrobat format, is available for downloading free of
charge for personal use from the HTA website (http://www.hta.ac.uk). A fully searchable CD-ROM is
also available (see below).
Printed copies of HTA monographs cost £20 each (post and packing free in the UK) to both public and
private sector purchasers from our Despatch Agents.
Non-UK purchasers will have to pay a small fee for post and packing. For European countries the cost is
£2 per monograph and for the rest of the world £3 per monograph.
You can order HTA monographs from our Despatch Agents:
– fax (with credit card or official purchase order)
– post (with credit card or official purchase order or cheque)
– phone during office hours (credit card only).
Additionally the HTA website allows you either to pay securely by credit card or to print out your
order and then post or fax it.

Contact details are as follows:


HTA Despatch
c/o Direct Mail Works Ltd
4 Oakwood Business Centre
Downley, HAVANT PO9 2NP, UK
Email: orders@hta.ac.uk
Tel: 02392 492 000
Fax: 02392 478 555
Fax from outside the UK: +44 2392 478 555
NHS libraries can subscribe free of charge. Public libraries can subscribe at a very reduced cost of
£100 for each volume (normally comprising 30–40 titles). The commercial subscription rate is £300
per volume. Please see our website for details. Subscriptions can only be purchased for the current or
forthcoming volume.

Payment methods
Paying by cheque
If you pay by cheque, the cheque must be in pounds sterling, made payable to Direct Mail Works Ltd
and drawn on a bank with a UK address.
Paying by credit card
The following cards are accepted by phone, fax, post or via the website ordering pages: Delta, Eurocard,
Mastercard, Solo, Switch and Visa. We advise against sending credit card details in a plain email.
Paying by official purchase order
You can post or fax these, but they must be from public bodies (i.e. NHS or universities) within the UK.
We cannot at present accept purchase orders from commercial companies or from outside the UK.

How do I get a copy of HTA on CD?


Please use the form on the HTA website (www.hta.ac.uk/htacd.htm). Or contact Direct Mail Works (see
contact details above) by email, post, fax or phone. HTA on CD is currently free of charge worldwide.

The website also provides information about the HTA Programme and lists the membership of the various
committees.
Systematic review and evaluation of methods of assessing urinary incontinence

JL Martin,1* KS Williams,1 KR Abrams,1 DA Turner,1 AJ Sutton,1 C Chapple,2 RP Assassa,3 C Shaw4 and F Cheater5

1 Department of Health Sciences, University of Leicester, UK
2 Urology Research, Royal Hallamshire Hospital, Sheffield, UK
3 Pinderfields and Pontefract General Infirmary, UK
4 Department of General Practice, University of Wales, College of Medicine, UK
5 School of Healthcare Studies, University of Leeds, UK

* Corresponding author

Declared competing interests of authors: none

Published February 2006

This report should be referenced as follows:

Martin JL, Williams KS, Abrams KR, Turner DA, Sutton AJ, Chapple C, et al. Systematic
review and evaluation of methods of assessing urinary incontinence. Health Technol Assess
2006;10(6).

Health Technology Assessment is indexed and abstracted in Index Medicus/MEDLINE, Excerpta Medica/EMBASE and Science Citation Index Expanded (SciSearch®) and Current Contents®/Clinical Medicine.
NHS R&D HTA Programme

The research findings from the NHS R&D Health Technology Assessment (HTA) Programme directly
influence key decision-making bodies such as the National Institute for Health and Clinical
Excellence (NICE) and the National Screening Committee (NSC) who rely on HTA outputs to help raise
standards of care. HTA findings also help to improve the quality of the service in the NHS indirectly in
that they form a key component of the ‘National Knowledge Service’ that is being developed to improve
the evidence of clinical practice throughout the NHS.
The HTA Programme was set up in 1993. Its role is to ensure that high-quality research information on
the costs, effectiveness and broader impact of health technologies is produced in the most efficient way
for those who use, manage and provide care in the NHS. ‘Health technologies’ are broadly defined to
include all interventions used to promote health, prevent and treat disease, and improve rehabilitation
and long-term care, rather than settings of care.
The HTA Programme commissions research only on topics where it has identified key gaps in the
evidence needed by the NHS. Suggestions for topics are actively sought from people working in the
NHS, the public, service-users groups and professional bodies such as Royal Colleges and NHS Trusts.
Research suggestions are carefully considered by panels of independent experts (including service users)
whose advice results in a ranked list of recommended research priorities. The HTA Programme then
commissions the research team best suited to undertake the work, in the manner most appropriate to find
the relevant answers. Some projects may take only months, others need several years to answer the
research questions adequately. They may involve synthesising existing evidence or conducting a trial to
produce new evidence where none currently exists.
Additionally, through its Technology Assessment Report (TAR) call-off contract, the HTA Programme is
able to commission bespoke reports, principally for NICE, but also for other policy customers, such as a
National Clinical Director. TARs bring together evidence on key aspects of the use of specific
technologies and usually have to be completed within a short time period.

Criteria for inclusion in the HTA monograph series


Reports are published in the HTA monograph series if (1) they have resulted from work commissioned
for the HTA Programme, and (2) they are of a sufficiently high scientific quality as assessed by the referees
and editors.
Reviews in Health Technology Assessment are termed ‘systematic’ when the account of the search,
appraisal and synthesis methods (to minimise biases and random errors) would, in theory, permit the
replication of the review by others.

The research reported in this monograph was commissioned by the HTA Programme as project number
99/29/02. The contractual start date was in April 2002. The draft report began editorial review in
December 2003 and was accepted for publication in April 2005. As the funder, by devising a
commissioning brief, the HTA Programme specified the research question and study design. The authors
have been wholly responsible for all data collection, analysis and interpretation, and for writing up their
work. The HTA editors and publisher have tried to ensure the accuracy of the authors’ report and would
like to thank the referees for their constructive comments on the draft document. However, they do not
accept liability for damages or losses arising from material published in this report.
The views expressed in this publication are those of the authors and not necessarily those of the
HTA Programme or the Department of Health.
Editor-in-Chief: Professor Tom Walley
Series Editors: Dr Peter Davidson, Dr Chris Hyde, Dr Ruairidh Milne,
Dr Rob Riemsma and Dr Ken Stein
Managing Editors: Sally Bailey and Sarah Llewellyn Lloyd
ISSN 1366-5278
© Queen’s Printer and Controller of HMSO 2006
This monograph may be freely reproduced for the purposes of private research and study and may be included in professional journals provided
that suitable acknowledgement is made and the reproduction is not associated with any form of advertising.
Applications for commercial reproduction should be addressed to NCCHTA, Mailpoint 728, Boldrewood, University of Southampton,
Southampton, SO16 7PX, UK.
Published by Gray Publishing, Tunbridge Wells, Kent, on behalf of NCCHTA.
Printed on acid-free paper in the UK by St Edmundsbury Press Ltd, Bury St Edmunds, Suffolk.

Abstract

Objectives: To identify and synthesise studies of diagnostic processes of urinary incontinence and to construct an economic model to examine the cost-effectiveness of simple, commonly used primary care tests.

Data sources: The electronic databases MEDLINE (1966–2002), CINAHL (1982–2002) and EMBASE (1980–2002).

Review methods: Studies were selected and assessed using the Quality Assessment of Diagnostic Studies (QUADAS) tool. Studies that reported the results of applying the same diagnostic procedure using the same threshold value (cut-off) were pooled using a random effects meta-analysis model to produce pooled estimates of sensitivity, specificity and diagnostic odds ratio together with 95% confidence intervals.

Results: In total, 6009 papers were identified from the literature search, of which 129 were deemed relevant for inclusion in the review, and these papers compared two or more diagnostic techniques. The gold-standard diagnostic test for urinary incontinence with which each reference test was compared was multichannel urodynamics. In general, reporting in the primary studies was poor; there was a lack of literature in the key clinical areas and minimal literature dealing with diagnosis in men. Only a limited number of studies could be combined or synthesised, providing the following results when compared with multichannel urodynamics. A clinical history for diagnosing urodynamic stress incontinence (USI) in women was found to have a sensitivity of 0.92 and specificity of 0.56 and for detrusor overactivity (DO) a sensitivity of 0.61 and specificity of 0.87. For validated scales, question 3 of the Urogenital Distress Inventory was found to have a sensitivity of 0.88 and specificity of 0.60. Seven studies compared a pad test with multichannel urodynamics; however, four different pad tests were studied and therefore it was difficult to draw any conclusions about diagnostic accuracy. Of the four studies comparing urinary diary with multichannel urodynamics, only one presented data in a format that allowed sensitivity and specificity to be calculated. Their reported values of 0.88 and 0.83 suggest that a urinary diary may be effective in the diagnosis of DO in women. Examination of the incremental cost-effectiveness of three primary care tests used in addition to history found that the diary had the lowest cost-effectiveness ratio of between £35 and £77 per extra unit of effectiveness (or case diagnosed). Imaging by ultrasound to determine leakage was found to be effective in the diagnosis of USI in women, with a sensitivity of 0.94 and specificity of 0.83.

Conclusions: This is the first systematic review of methods for diagnosing urinary incontinence. As reporting of the primary studies was poor, clinical interpretation was often difficult because few studies could be synthesised and conclusions made. The report found that a large proportion of women with USI can be correctly diagnosed in primary care from clinical history alone. On the basis of diagnosis the diary appears to be the most cost-effective of the three primary care tests (diary, pad test and validated scales) used in addition to clinical history. Ultrasound imaging may offer a valuable alternative to urodynamic investigation. The clinical stress test is effective in the diagnosis of USI. Adaptation of such a test so that it could be performed in primary care with a naturally filled bladder may prove clinically useful. If a patient is to undergo an invasive urodynamic procedure, multichannel urodynamics is likely to give the most accurate result in a secondary care setting. There is a dearth of literature on the diagnosis of urinary incontinence in men, with no studies meeting the study criteria for data extraction in the diagnosis of bladder outlet obstruction. There is a need for large-scale, high-quality primary studies evaluating the use of a number of diagnostic methods in a primary care setting to be undertaken so that the results of this systematic review can be verified or not. Such studies should include not only an assessment of clinical effectiveness, in this case diagnostic accuracy, but also an assessment of costs and quality of life/satisfaction to inform future health policy decisions. Studies carried out should be reported to a better standard. The recommendations of the Standards for Reporting Diagnostic Accuracy (STARD) initiative should be followed to ensure the accuracy and completeness of reporting design and results.
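
The sensitivity and specificity pairs quoted in this abstract can be converted into likelihood ratios and a diagnostic odds ratio, the third summary measure named under Review methods. The following is a minimal Python sketch, not the authors' analysis: the function name `accuracy_summary` is hypothetical, and because the inputs are the rounded point estimates from the abstract, the derived values are approximate and carry no confidence intervals.

```python
def accuracy_summary(sensitivity, specificity):
    """Derive LR+, LR- and the diagnostic odds ratio (DOR) from one
    sensitivity/specificity pair (both strictly between 0 and 1)."""
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1 - specificity)   # positive likelihood ratio
    lr_neg = (1 - sensitivity) / specificity   # negative likelihood ratio
    dor = lr_pos / lr_neg                      # diagnostic odds ratio
    return {"lr_pos": lr_pos, "lr_neg": lr_neg, "dor": dor}


# Rounded point estimates quoted in the abstract; the comparator in each
# case is multichannel urodynamics.
reported = {
    "clinical history, USI": (0.92, 0.56),
    "clinical history, DO": (0.61, 0.87),
    "UDI question 3": (0.88, 0.60),
    "urinary diary, DO": (0.88, 0.83),
    "ultrasound, USI": (0.94, 0.83),
}

for test, (sens, spec) in reported.items():
    s = accuracy_summary(sens, spec)
    print(f"{test}: LR+ {s['lr_pos']:.2f}, LR- {s['lr_neg']:.2f}, DOR {s['dor']:.1f}")
```

On these rounded inputs, clinical history for USI gives a positive likelihood ratio of about 2.1, which fits the abstract's picture of a test that is sensitive but not very specific.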


Contents
List of abbreviations

Executive summary

1 Introduction and background
    Background
    Aetiology
    Cost and social problems
    Assessment and diagnosis
    Aims and objectives

2 Methods
    General methodology
    Search strategy
    First exclusion process
    Second exclusion process
    Categorisation of studies
    Quality assessment
    Data extraction
    Data synthesis

3 Results
    Studies identified
    Results of contacting authors
    Categorisation of papers
    Quality assessment
    Studies identified: key characteristics

4 Economic modelling
    Introduction
    Methods
    Results

5 Discussion
    Appraisal of the systematic review
    Implications of the findings

6 Conclusions, implications and recommendations
    Conclusions
    Implications
    Future research recommendations
    Dissemination and timescale for updating

Acknowledgements

References

Appendix 1 Search strategy
Appendix 2 Quality assessment tool
Appendix 3 Instructions for quality assessment
Appendix 4 Letter to authors requesting additional data
Appendix 5 Blank forms sent to contacted authors
Appendix 6 Website created for contacted authors
Appendix 7 Additional study information sheet
Appendix 8 STARD flowchart and checklist

Health Technology Assessment reports published to date

Health Technology Assessment Programme


List of abbreviations

AUA      American Urological Association
AUC      area under the curve
BIDI     Bladder Instability Discriminant Index
BMI      body mass index
BND      bladder neck descent
BOO      bladder outlet obstruction
BPH      benign prostatic hyperplasia
CI       confidence interval
CRD      Centre for Reviews and Dissemination
df       degrees of freedom
DIS      Detrusor Instability Score
DO       detrusor overactivity
DOR      diagnostic odds ratio
DUEC     distal urethral conductance
F        female
ICS      International Continence Society
IIQ      Incontinence Impact Questionnaire
ISQ      Incontinence Screening Questionnaire
LUTS     lower urinary tract symptoms
M        male
MCU      multichannel urodynamics
MSSU     midstream specimen of urine
MUI      mixed urinary incontinence
PVRV     postvoid residual volume (of urine)
QALY     quality-adjusted life-year
QUADAS   Quality Assessment of Diagnostic Studies
RAP      Resident Assessment Protocol
ROC      receiver operating characteristic
SCU      single-channel urodynamics
SE       standard error
sEMG     surface electromyography
sROC     summary receiver operating characteristic
STARD    Standards for Reporting of Diagnostic Accuracy
SUI      stress urinary incontinence
UDI      Urogenital Distress Inventory
UI       urinary incontinence
UPP      urethral pressure profile
USI      urodynamic stress incontinence
UUI      urge urinary incontinence

All abbreviations that have been used in this report are listed here unless the abbreviation is well known (e.g. NHS), or
it has been used only once, or it is a non-standard abbreviation used only in figures/tables/appendices in which case
the abbreviation is defined in the figure legend or at the end of the table.


Executive summary
Background

Although urinary incontinence is not life threatening, it can have enormous costs to individuals and the health service in terms of expenditure and impact on quality of life. Epidemiological studies have demonstrated that urinary incontinence is a very common symptom, with a reported prevalence of any urinary incontinence (in those aged 40 and over) of 34% for women and 14% for men.

Pathways to diagnostic assessment are inconsistent, with some individuals being assessed and treated in primary care settings by GPs and nurses, and others being referred directly to a variety of specialists in secondary care (e.g. physiotherapists, gynaecologists and urologists) without any assessment or treatment. Assessment can be undertaken at a number of levels using different combinations of tests.

It is particularly important when implementing certain treatment interventions (e.g. medication that may have side-effects) that a diagnosis is made to determine the most effective treatment intervention, and it is imperative before surgical intervention. If a diagnosis is not made, then inappropriate and unnecessary interventions may be implemented. Two types of diagnosis can be made: symptomatic diagnosis and condition-specific diagnosis. In general, symptomatic diagnoses are made in primary care using clinical history-taking, urinary diaries, pad tests and validated symptom scales. Condition-specific diagnoses are made in secondary care using urodynamic techniques. The use of diagnostic assessment methods is influenced by the clinical setting and the expertise of the individual undertaking the assessment. The evidence available on the accuracy and acceptability of these diagnostic processes is inconsistent and variable.

Objectives

This systematic review aimed to:

● identify, appraise and summarise the published evidence relating to different methods of diagnostic assessment of male and female urinary incontinence: specifically urodynamic stress incontinence (USI) and detrusor overactivity (DO)
● quantitatively synthesise the extracted evidence using meta-analysis methods (where possible) or pooling of individual sensitivity and specificity data
● construct an economic model to examine the cost-effectiveness of simple, commonly used primary care tests
● identify gaps in the literature
● prioritise future clinical and research questions.

Methods

Data sources
The online bibliographic databases MEDLINE (1966–2002), CINAHL (1982–2002) and EMBASE (1980–2002) were used to obtain the literature. The search strategy was based on the Cochrane and NHS Centre for Reviews and Dissemination strategies for identifying studies of diagnostic performance.

Study selection
Study selection comprised a three-stage process using defined inclusion and exclusion criteria. All records were assessed for relevance by the first investigator on the basis of the abstract, or if the abstract was not available then title only. Papers were considered relevant to the systematic review if they considered the evaluation, appropriateness and/or cost of diagnostic assessment in the following categories:

● clinical history-taking
● simple investigations including validated scales, diaries and pad tests
● advanced (invasive) investigations (e.g. urodynamics).

To be included, a paper had to provide a quantitative comparison between two or more different methods of diagnosing urinary incontinence.

Data extraction
A panel consisting of at least three members of the review team, including at least one statistician, discussed all papers identified as of potential relevance. The panel determined whether study data were presented in a suitable format to calculate sensitivity and specificity.

Quality assessment
All relevant papers were assessed for quality using Quality Assessment of Diagnostic Studies (QUADAS), a tool designed specifically for studies on diagnostic accuracy. An initial pilot study on four papers resulted in a number of clarifications being added to the instructions of the QUADAS tool to ensure consistency between assessors. Seven of the authors performed the full quality assessment process, with 10% of the papers being assessed by two authors to test for inter-reader agreement.

Data synthesis
Studies that reported the results of applying the same diagnostic procedure using the same threshold value (cut-off) were pooled using a random effects meta-analysis model to produce pooled estimates of sensitivity, specificity and diagnostic odds ratio together with 95% confidence intervals.

Results

In total, 6009 papers were identified from the literature search, of which 129 were deemed relevant for inclusion in the review, and these papers compared two or more diagnostic techniques. The gold-standard diagnostic test for urinary incontinence with which each reference test was compared was multichannel urodynamics.

In general, reporting in the primary studies was poor; there was a lack of literature in the key clinical areas and minimal literature dealing with diagnosis in men. Only a limited number of studies could be combined or synthesised, providing the following results when compared with multichannel urodynamics. A clinical history for diagnosing USI in women was found to have a sensitivity of 0.92 and specificity of 0.56 and for DO a sensitivity of 0.61 and specificity of 0.87. For validated scales, question 3 of the Urogenital Distress Inventory was found to have a sensitivity of 0.88 and specificity of 0.60. Seven studies compared a pad test with multichannel urodynamics; however, four different pad tests were studied and therefore it was difficult to draw any conclusions about diagnostic accuracy. Of the four studies comparing urinary diary with multichannel urodynamics, only one presented data in a format that allowed sensitivity and specificity to be calculated. Their reported values of 0.88 and 0.83 suggest that a urinary diary may be effective in the diagnosis of DO in women. Examination of the incremental cost-effectiveness of three primary care tests used in addition to history found that the diary had the lowest cost-effectiveness ratio of between £35 and £77 per extra unit of effectiveness (or case diagnosed). Imaging by ultrasound to determine leakage was found to be effective in the diagnosis of USI in women, with a sensitivity of 0.94 and specificity of 0.83.

Conclusions

This is the first systematic review of methods for diagnosing urinary incontinence. As reporting of the primary studies was poor, clinical interpretation was often difficult because few studies could be synthesised and conclusions made. The following information could be deduced from the available data.

● A large proportion of women with USI can be correctly diagnosed in primary care from clinical history alone.
● On the basis of diagnosis the diary appears to be the most cost-effective of the three primary care tests (diary, pad test and validated scales) used in addition to clinical history.
● Ultrasound imaging may offer a valuable alternative to urodynamic investigation.
● The clinical stress test is effective in the diagnosis of USI. Adaptation of such a test so that it could be performed in primary care with a naturally filled bladder may prove clinically useful.
● If a patient is to undergo an invasive urodynamic procedure, multichannel urodynamics is likely to give the most accurate result in a secondary care setting.
● There is a dearth of literature on the diagnosis of urinary incontinence in men, with no studies meeting the study criteria for data extraction in the diagnosis of bladder outlet obstruction.

Implications for healthcare
● There is currently a lack of high-quality research in clinically relevant areas to inform clinical practice.
● Most diagnostic methods can be undertaken in primary or secondary care.
● Simple investigations (e.g. pad test and diary) may offer useful information on severity which, when combined with history, may provide sufficient information to commence primary care interventions (which are low cost and low risk).

Recommendations for research

Given the demographics of the UK population and the reported high prevalence of any urinary incontinence in the community-dwelling population, there will be an increasing burden placed on primary (and secondary) care services in terms of the diagnostic assessment and appropriate treatment of incontinence. Therefore, identifying which are the most clinically accurate and cost-effective diagnostic methods is of crucial importance.

There is a need for large-scale, high-quality primary studies evaluating the use of a number of diagnostic methods in a primary care setting to be undertaken so that the results of this systematic review can be verified or not. Such studies should include not only an assessment of clinical effectiveness, in this case diagnostic accuracy, but also an assessment of costs and quality of life/satisfaction to inform future health policy decisions.

Studies carried out should be reported to a better standard. The recommendations of the Standards for Reporting of Diagnostic Accuracy (STARD) initiative should be followed to ensure the accuracy and completeness of reporting design and results.
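
The pooling step described under Data synthesis in this summary can be illustrated as follows. This is a minimal Python sketch that assumes the DerSimonian and Laird estimator applied on the logit scale with a 0.5 continuity correction; the summary says only that a random effects model was used, so the estimator, the correction, the function name `pool_proportions` and the example counts are all illustrative assumptions rather than the review's method or data.

```python
import math


def pool_proportions(events, totals, z=1.96):
    """Random-effects pooling of proportions (e.g. per-study sensitivities).

    events[i] is the number of test-positive cases among totals[i] diseased
    patients in study i. Returns (pooled, lower, upper, tau2).
    Assumption: DerSimonian-Laird on logits, 0.5 added to every cell.
    """
    logits, variances = [], []
    for e, n in zip(events, totals):
        a, b = e + 0.5, (n - e) + 0.5          # continuity-corrected cells
        logits.append(math.log(a / b))
        variances.append(1 / a + 1 / b)

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, logits)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, logits))

    # Between-study variance tau^2, truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(logits) - 1)) / c) if c > 0 else 0.0

    # Random-effects weights and pooled logit with its standard error.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, logits)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def expit(x):
        return 1 / (1 + math.exp(-x))

    return expit(pooled), expit(pooled - z * se), expit(pooled + z * se), tau2


# Hypothetical counts for three studies of the same test at the same cut-off.
sens, low, high, tau2 = pool_proportions(events=[46, 30, 55], totals=[50, 40, 60])
print(f"pooled sensitivity {sens:.2f} (95% CI {low:.2f} to {high:.2f}), tau^2 = {tau2:.3f}")
```

The same function could be applied to true negatives among non-diseased patients to pool specificity; pooling a diagnostic odds ratio would instead work on log odds ratios from each study's full 2x2 table.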


Chapter 1
Introduction and background
Background

Urinary incontinence has been defined by the International Continence Society (ICS) as “the complaint of any involuntary leakage of urine”.1 They suggest that such leakage should be further described by specifying type (distinguishing between stress, urge and mixed urinary incontinence), frequency, severity, precipitating factors, social impact, effect on hygiene and quality of life, measures used to contain leakage and whether the individual seeks or desires help for incontinence. Although urinary incontinence is not life threatening, it can have enormous costs to individuals and the health service in terms of expenditure and impact on quality of life. Epidemiological studies have demonstrated that urinary incontinence is a very common symptom; McGrother and colleagues report a prevalence of any urinary incontinence (in those aged 40 years and over) of 34% for women and 14% for men. The proportion of people finding that these symptoms impact on their lives is estimated to be around 29% for women and 14% for men.2

Aetiology

Three types of incontinence can be identified, depending on the symptoms of the presenting patient. These terms are commonly used in scientific studies and the definitions are taken from the current ICS Standardisation Report1 to describe symptomatic diagnoses.

● Stress urinary incontinence (SUI) is the complaint of involuntary leakage on effort or exertion, or on sneezing or coughing.
● Urge urinary incontinence (UUI) is the involuntary leakage of urine accompanied or immediately preceded by urgency.
● Mixed urinary incontinence (MUI) is the complaint of involuntary leakage associated with urgency and also with exertion, effort, sneezing or coughing.

When the symptoms of incontinence are confirmed by urodynamic investigation then two types of incontinence can be diagnosed:

● Urodynamic stress incontinence (USI) is the involuntary leakage of urine during increased abdominal pressure in the absence of a detrusor contraction. This replaces the commonly used term genuine stress incontinence.1
● Detrusor overactivity (DO) is involuntary detrusor contractions during the filling phase, which may be spontaneous or provoked. This term replaces detrusor instability.

Cost and social problems

Urinary incontinence has an enormous cost to individuals and health services in terms of expenditure and impact on quality of life. A study investigating the cost of urinary storage disorders to the UK estimated that the total cost of treating urinary storage disorders in community-dwelling adults over the age of 40 was £536 million in 1999/2000 prices. In addition, there is an estimated cost of £207 million that is borne by the individual for managing their symptoms (£29 million and £178 million for men and women, respectively).3

In addition to the economic costs, urinary incontinence has a serious impact on the quality of life of sufferers. Effects have been shown to include depression,4 anxiety5 and poor life satisfaction.6 All types of leakage have a detrimental effect on daily activities and overactive bladder symptoms in particular have been shown to be distressing for young women.7

Assessment and diagnosis

Diagnosis of urinary incontinence usually begins with an assessment of the symptoms in a clinical history. There are several different symptoms of urinary incontinence, depending on the circumstances under which people leak urine. Diagnosis may involve methods of assessing the severity and pattern of leakage, using methods such as pad tests and urinary diaries. Pad tests largely measure the severity of leakage, while the diary assesses the severity of frequency and leakage. Increased frequency and incontinence

© Queen’s Printer and Controller of HMSO 2006. All rights reserved.



recorded in a diary may be indicative of UUI and a positive pad test may indicate SUI.

Assessment procedures tend to be sequential, beginning with the recording of symptoms in a patient history, which may be indicative of a particular underlying condition. Linked in with these sequential assessment procedures are often clinical treatment interventions; these may be implemented and then further assessment processes undertaken depending on the success of the intervention.

Methods of diagnostic assessment can be broadly divided and sequentially ordered into five groups:

● clinical history-taking, including nature, duration and reported severity of symptoms, functional and mental status, relevant medical, surgical and gynaecological history, impact of symptoms on quality of life and exacerbating factors including diet, fluid and medications
● validated scales, which measure the severity of symptoms and impact of symptoms on quality of life
● physical examination, including abdominal, perineal, rectal, neurological and measurement of body mass index (BMI)
● simple investigations, including urinalysis, midstream specimen of urine (MSSU), measurement of postvoid residual volume (PVRV), provocation stress test, frequency–volume charts and pad tests
● advanced investigations, including urodynamics.

Pathways to diagnostic assessment are inconsistent, with some individuals being assessed and treated in primary-care settings by GPs and nurses, and others being referred directly to a variety of specialists in secondary care (e.g. physiotherapists, gynaecologists, urologists, geriatricians or specialist nurses based in secondary care) without any assessment or treatment. Although algorithms for the assessment and treatment of urinary incontinence have been recommended, the most appropriate healthcare worker to conduct such assessments has not been identified, nor has their ideal location.8 For example, a symptomatic diagnosis conducted by a nurse in a health centre will have a different service cost to a condition-specific diagnosis conducted by a specialist in hospital using urodynamic equipment.

Assessment can be undertaken at a number of levels using different combinations of screening tests. Figure 1 illustrates assessment processes in clinical practice and how they are interrelated with initiation of treatment. There are also likely overlaps of investigative methods being used at different points in the care pathway.

It is particularly important when implementing certain treatment interventions (e.g. medication that may have side-effects) that a diagnosis is made to determine the most effective treatment intervention, and of course it is imperative before surgical intervention. If a diagnosis is not made, then inappropriate and unnecessary interventions may be implemented. As has already been mentioned, there are two levels of diagnosis: symptomatic diagnosis and condition-specific diagnosis. In general, symptomatic diagnoses take place in primary care and condition-specific in secondary care, where urodynamic investigations are available. In primary care the diagnosis of urinary incontinence is dependent on history-taking, physical examination and simple investigations including frequency–volume charts, pad tests, urinalysis and estimation of PVRV. The choice of diagnostic assessment method is influenced by the clinical setting (primary/secondary care) and by the expertise of the professional conducting the diagnostic test. To date, research has focused on the clinical effectiveness of condition-specific diagnosis. Little attention has been paid to the effectiveness of symptomatic diagnosis, despite this being the basis of all treatment in primary care.

The term urodynamics relates to the study of pressure–flow relationships in the urinary tract and provides a functional assessment of the lower urinary tract to provide objective explanations for urinary symptoms or dysfunction.9 Urodynamic tests include such minimally invasive tests as frequency–volume charts, but more commonly refer to cystometry, urethral pressure measurement, pressure–flow studies, videourodynamics and ambulatory monitoring.9 The aim of clinical urodynamics is to reproduce symptoms while making precise measurements to identify the underlying cause for the symptoms and to quantify the pathophysiological processes.10 Urodynamic tests are invasive, usually involving catheterisation of the bladder and the measurement of pressure in the urethra, bladder and abdomen. A significant number of people who undergo urodynamics find it embarrassing, painful or distressing.11

Full descriptions of urodynamic techniques can be found in a number of recent publications.9,12
[Figure 1 is a flowchart. Recoverable box labels: patient with urinary symptoms contacts nurse/GP (history); nurse management, GP management or referral to specialist assessment; general assessment with history, MSSU ± bladder scan, diary and pad test (the specialist assessment also lists screening tests); mild or moderate/severe symptoms; instigate first-line treatment, refer back to GP or initiate treatment (e.g. drug therapy); undertake urodynamics (test positive: USI/DO/mixed; test negative); improvement or no improvement; continue therapy and reassessment; discharge; further treatment.]

FIGURE 1 Flowchart of assessment processes in clinical practice
Although there are concerns about accuracy and reproducibility, urodynamics is still regarded as the gold-standard method for diagnosing urinary incontinence and is usually a necessary procedure before surgery is performed.13

Aims and objectives

This systematic review aims to:

● identify, appraise and summarise the published evidence relating to different methods of diagnostic assessment of male and female urinary incontinence: specifically USI and DO
● quantitatively synthesise the extracted evidence using meta-analysis methods (where possible) or pooling of individual sensitivity and specificity data
● develop an illustrative flowchart of diagnostic processes for urinary incontinence in current clinical practice, and construct an economic model to examine the cost-effectiveness of simple, commonly used primary-care tests
● identify gaps in the literature; and prioritise future clinical and research questions.

Chapter 2
Methods
General methodology

The systematic review followed the guidelines contained in NHS Centre for Reviews and Dissemination (CRD) Report 4 (reference 14) and aimed to appraise and summarise the published evidence relating to the different methods of diagnostic assessment in male and female urinary incontinence within the subgroups of diagnostic tests described in Chapter 1:

● clinical history-taking
● validated scales
● physical examination
● simple investigations
● advanced investigations.

The review examined the evidence of these subgroups of tests in relation to:

● clinical use, including sensitivity, specificity and positive predictive values of different diagnostic assessment methods when compared with the gold standard of multichannel urodynamics
● economic modelling.

The overall philosophy of the systematic review was to maintain breadth, synthesising the evidence where appropriate using quantitative techniques and providing economic modelling of costs of diagnostic methods.

Search strategy

The online bibliographic databases MEDLINE (January 1966 to December 2002), CINAHL (January 1982 to December 2002) and EMBASE (January 1980 to December 2002) were used to obtain the literature. The search strategy was based on the Cochrane and NHS CRD strategies for identifying studies of diagnostic performance, and the information officers at these centres were consulted during this process. A number of keywords were identified based on possible diagnostic tests and possible permutations of their names (Table 1). A paper was included if a word from {Diagnostic filter} OR {Diagnostic test} AND {Incontinence term} was found anywhere in the title or abstract or used as a MeSH heading. The search results were limited to humans, reports in the English language and adults (>19 years) only. The full search strategies can be seen in Appendix 1.

TABLE 1 Keywords used in literature search of MEDLINE, EMBASE and CINAHL

{Urinary incontinence}
Urinary incontinence
Urge incontinence
Stress incontinence
Leakage AND urin* (keyword)

{Diagnostic test}
Urodynamics
Provocation stress test
Frequency volume chart
Urinalysis
Post-void residual volume
Mid-stream specimen
MSSU
Pad tests OR pad testing OR pad test
Midstream sample of urine

{Diagnostic filter}
Sensitivity
Specificity
Predictive value of tests
Reference values
Reference standard
‘Gold standard’
Summary receiver operating characteristic/curve
Diagnostic errors
Likelihood ratio
Likelihood function
False positives
False negatives
Observer bias/variation
Predictive value
Predictive standards
Predictive models
Criteria test
Validated standard
Work-up bias
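
The inclusion rule described in the search strategy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the term lists are abbreviated from Table 1, the grouping is assumed to be ({Diagnostic filter} OR {Diagnostic test}) AND {Incontinence term}, and simple substring matching stands in for the database search syntax given in Appendix 1. The names `matches_search` and `contains_any` are hypothetical.

```python
# Abbreviated, illustrative term lists taken from Table 1 (not the full strategy).
DIAGNOSTIC_FILTER = ["sensitivity", "specificity", "predictive value",
                     "likelihood ratio", "gold standard"]
DIAGNOSTIC_TEST = ["urodynamics", "pad test", "urinalysis",
                   "frequency volume chart"]
INCONTINENCE_TERM = ["urinary incontinence", "urge incontinence",
                     "stress incontinence"]


def contains_any(text, terms):
    """Return True if any term occurs in the text (case-insensitive substring match)."""
    lowered = text.lower()
    return any(term in lowered for term in terms)


def matches_search(title_and_abstract):
    """Assumed grouping: (filter OR test) AND incontinence term."""
    diagnostic = (contains_any(title_and_abstract, DIAGNOSTIC_FILTER)
                  or contains_any(title_and_abstract, DIAGNOSTIC_TEST))
    return diagnostic and contains_any(title_and_abstract, INCONTINENCE_TERM)


if __name__ == "__main__":
    print(matches_search("Sensitivity of the pad test in stress incontinence"))
```

A record mentioning a diagnostic term but no incontinence term, or the reverse, would not be retrieved under this reading of the rule.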

First exclusion process

All records were entered into a bibliographic referencing software program (Procite). Duplicate papers were identified and deleted. The remaining papers were assessed for relevance by the first investigator on the basis of the abstract, or if the abstract was not available then the title only. A sample (10%) was also assessed for potential relevance by the second investigator; agreement between the two readers was 99%.

Inclusion
Papers were considered relevant to the systematic review if they considered the evaluation, appropriateness and/or cost of diagnostic assessment in the five categories identified:
● clinical history-taking
● validated scales
● physical examination
● simple investigations
● advanced investigations.
To be included, a paper had to provide a quantitative comparison between two different methods of diagnosing urinary incontinence.

Exclusion
Any papers that fell into the following categories were excluded from the review:
● diagnosis of children
● reports in a non-English language
● case reports
● letters
● reviews (non-primary research)
● papers investigating interventional procedures where diagnostic tests were used as outcome measures.
All of the abstracts were read by the first investigator and classified as relevant, not relevant or unclear. A second investigator who was blinded to the initial classifications then read 20% of the relevant records, 10% of the not relevant records and 100% of the unclear records. Any discrepancies were discussed. Agreement between the two investigators was 98%. Full copies of those papers identified as either relevant or unclear were obtained.

Second exclusion process

Once obtained, full copies of papers identified as of potential relevance were read by the first investigator and classified as relevant, not relevant or unclear on the basis of the same inclusion and exclusion criteria. The same second investigator, again blinded, read 20% of the relevant, 20% of the not relevant and all of the unclear papers, and any discrepancies were discussed. Agreement between the two investigators was 96%.

Categorisation of studies

Owing to the large number of tests used for the diagnosis of urinary incontinence and, hence, the number of possible comparisons, a matrix was constructed to organise the literature (see Table 2 in Chapter 3). Each relevant paper was assigned to a box in the matrix according to the two diagnostic tests compared (or boxes if more than two tests were compared).

Quality assessment

The recent growth in systematic reviews of diagnostic tests has resulted in the need for methods to assess the quality of diagnostic studies. In response to this, a project was funded by the HTA programme to develop a quality tool specifically for these types of studies, the Quality Assessment of Diagnostic Studies (QUADAS) tool,15 which was used for the quality assessment component of the review. The tool consists of 14 questions regarding the quality of the study and quality of reporting (Appendix 2).

Pilot study
As the QUADAS tool was a recently developed instrument, a pilot quality assessment exercise was undertaken to ascertain whether it required amending or extending for the specific remit of the review. Four papers16–19 identified as potentially relevant for inclusion in the review were assessed for quality by five of the project investigators using the original QUADAS tool. The investigators were asked to report any questions that they felt required clarification or expanding, or that were not relevant.

Several clarifications were added to the instructions based on the recommendations from the pilot study (Appendix 3). These included directives that no assumptions should be made, for example when judging the period between the two tests. This is rarely explicitly stated and it is tempting (and probably correct) to assume that the period between tests is short. Following advice from a clinical member of the project team, further information was provided for assessing the quality of papers that investigated urodynamic procedures, including the minimum amount of detail required

for replication of urodynamics to be possible. Information was also added to clarify the quality assessment of other questions on validity of the sample and appropriate reference standards.

Full quality assessment process
Seven members of the investigation team took part in this process, each assessing approximately 30 papers. Ten per cent of the papers were assessed by two different investigators to check the inter-rater reliability of the tool; the remaining 90% were assessed only by one investigator. This procedure also served as a final filter for relevance and investigators were asked to highlight any studies that they felt were not relevant to the review. These studies were discussed by two investigators and if not relevant were excluded from the review.

Data extraction

All papers identified as of potential relevance were discussed by a panel consisting of at least three members of the review team, including at least one statistician. The panel determined whether study data were presented in a suitable way to allow a cross-tabulation of the results or sensitivity and specificity to be calculated. The authors of studies that did not present sufficient data for inclusion in any meta-analysis were contacted by letter and asked to provide further details (Appendix 4). In order to aid this procedure and maximise the response, forms were sent with template data tables to aid the authors in providing data in either a cross-tabulation form or individual patient data (Appendix 5). A website was also set up to give authors further information about the project and examples of the data required (http://www.prw.le.ac.uk/research/hta/) (Appendix 6).

While members of the project team were assessing the quality of papers they also recorded other details. This included the size, gender and age of the sample, the care setting where the study was performed and the country (Appendix 7).

Data synthesis

Studies that reported the results of applying the same diagnostic procedure using the same threshold value (cut-off) were pooled using a random effects meta-analysis model (which reduced to a fixed effect model when the between-study variability was estimated to be 0) to produce pooled estimates of the sensitivity, specificity and diagnostic odds ratio (DOR), together with associated 95% confidence intervals (CIs). Tests for heterogeneity were carried out for each outcome and are reported. On the basis of the pooled sensitivity and specificity the positive likelihood ratio was calculated, together with associated 95% CI. A positive likelihood ratio can be used to assess the impact on diagnosis of a positive test result for an individual, although values greater than 10 are usually considered necessary for a test to provide convincing diagnostic evidence.18 Pooling sensitivity and specificity separately assumes that the diagnostic threshold is the same in each study. Pooling DORs relaxes this assumption by assuming that the studies relate to the same symmetrical receiver operating characteristic (ROC) curve. The DOR has been put forward as a useful single indicator of test performance, which indicates the strength of the association between test results and disease (in much the same way as the odds ratio is used in epidemiology to express the association between exposure and disease). For a thorough explanation of the use of odds ratios in diagnostic applications, including their application to meta-analysis, see Glas and colleagues.20

The empirical study sensitivities and specificities and corresponding pooled estimates are plotted in ROC space to aid the simultaneous interpretation. The ROC curve corresponding to the pooled DORs is also presented together with the area under the curve (AUC) for the ROC curve and associated 95% CI.21 The symmetric ROC curve determined by the pooled DOR is given by

\[
\text{Sensitivity} = \frac{1}{1 + \dfrac{1}{\text{DOR} \times \dfrac{1 - \text{Specificity}}{\text{Specificity}}}}
\]

The intention was, where between-study heterogeneity existed, to explore it using meta-regression investigating potential associations between study characteristics (such as population under study, country of study and quality of study) on the DOR scale, but this proved to be infeasible owing to the low numbers of studies identified for each separate outcome of interest.

All analyses were performed using Stata version 7.0 (Stata Corporation, College Station, TX, USA, 2002) and MetaDiSc (www.hrc.es/investigation/metadisc.html – a new freely available program for carrying out meta-analysis of diagnostic test performance studies).

Whether studies reported sufficient data for meta-analysis or not, an attempt was made to undertake a narrative synthesis of all relevant papers identified.
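
The quantities used in the data synthesis can be illustrated with a short Python sketch. It is not the Stata or MetaDiSc code used in the review: the function names are hypothetical, the 2 × 2 counts are invented so that they give a sensitivity of 0.61 and a specificity of 0.87, and the area under the symmetric ROC curve is approximated numerically with the trapezoidal rule rather than by whatever method the software applies. No confidence intervals or random effects pooling are attempted.

```python
def accuracy_from_table(tp, fp, fn, tn):
    """Sensitivity, specificity, positive likelihood ratio and DOR from a 2x2 table.

    Assumes no zero cells; a continuity correction would be needed otherwise.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_positive = sensitivity / (1 - specificity)
    dor = (tp * tn) / (fp * fn)
    return sensitivity, specificity, lr_positive, dor


def sroc_sensitivity(dor, specificity):
    """Symmetric ROC curve: Sensitivity = 1 / (1 + 1 / (DOR * (1 - Spec) / Spec)).

    Written in an algebraically equivalent form that stays defined when the
    specificity is exactly 0 or 1.
    """
    fpr = 1 - specificity
    return dor * fpr / (specificity + dor * fpr)


def sroc_auc(dor, steps=2000):
    """Approximate area under the symmetric ROC curve (trapezoidal rule)."""
    area = 0.0
    previous = sroc_sensitivity(dor, 1.0)  # false-positive rate of 0
    for i in range(1, steps + 1):
        fpr = i / steps
        current = sroc_sensitivity(dor, 1 - fpr)
        area += (previous + current) / (2 * steps)
        previous = current
    return area


if __name__ == "__main__":
    # Hypothetical counts: 100 people with the condition, 100 without.
    sens, spec, lr_pos, dor = accuracy_from_table(tp=61, fp=13, fn=39, tn=87)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f} "
          f"LR+={lr_pos:.2f} DOR={dor:.2f} AUC={sroc_auc(dor):.2f}")
```

With these invented counts the positive likelihood ratio is 0.61/(1 − 0.87), or about 4.7, which is below the value of 10 described above as usually necessary for convincing diagnostic evidence. A DOR of 1 gives the chance line, with an area of 0.5.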


Chapter 3
Results
Studies identified

A flowchart of the studies is shown in Figure 2. In total, 6099 papers were identified from MEDLINE (2913), CINAHL (411) and EMBASE (2775). Of these, 1479 duplicate papers were identified and deleted: 11 from MEDLINE, 111 from CINAHL and 1357 from EMBASE. There was a large amount of overlap between the studies identified by MEDLINE and EMBASE, with 49% of the studies identified by EMBASE also being identified by MEDLINE.

The deletion of duplicate papers left 4620 individual papers. After the first exclusion process 490 records were identified as of potential relevance and full copies of the papers were obtained. After the second exclusion process 197 different, original papers appeared to meet the inclusion criteria. These potentially reported the quantitative comparison of two or more diagnostic tests used for the detection of urinary incontinence. After more detailed reading of each paper during the data extraction and quality assessment processes a further 76 papers were found not to meet the inclusion criteria of the review and were excluded.22–96

Results of contacting authors

Twenty-four studies were identified as being of potential interest but with insufficient data presented in the written paper to enable any summary measures of diagnostic accuracy to be calculated.19,32,43,52,80,97–116 The lead authors of these studies were contacted by letter and asked to provide further information. Four authors responded with all of the requested data.97,98,104,107 The data from the other 19 studies were included in the review in the form presented in the paper.

Categorisation of papers

The completed matrix showing the distribution of the literature can be seen in Table 2. The majority of the published studies deal with the most commonly used diagnostic tests: urodynamics, pad test, urinary diary, clinical history and ultrasound imaging, although a small number of studies investigated less common tests.

A separate matrix was constructed to organise the literature that compared the different urodynamic tests (Table 3).

Quality assessment

Pilot study
Agreement between investigators for the various questions ranged from 0.65 to 1.00 (Table 4). A common problem encountered was in the lack of clarity in reporting. This led to investigators making, probably correct, assumptions about factors such as blinding of experimenters and periods between diagnostic tests.

Full quality assessment
The results of the full quality assessment procedure are displayed in Table 5. There was a lot of variation in terms of the quality of the studies and also the quality of the reporting. The items that resulted in the most favourable ratings were questions 3, 5, 6 and 7, which were all concerned with the quality of study design: specifically, whether an appropriate reference standard test was used (question 3: 84% of papers rated as ‘yes’), whether all patients underwent identical diagnostic procedures (questions 5 and 6: 91% and 86% of papers rated as ‘yes’) and whether the two diagnostic tests were independent of each other (question 7: 77% of papers rated as ‘yes’).

Several items were poorly described in the papers: 39% of the papers did not clearly describe the selection criteria used in the study, therefore it is not possible to judge how appropriate the sample was. Questions 9a and 9b dealt with the issue of blinding; for the majority of the studies it was unclear whether the reference (79%) or index tests (83%) were interpreted without knowledge of the other test. Sixty-one per cent of the papers did not report whether there were any uninterpretable or intermediate results (question 11) and 67% of the papers did not report whether there were any withdrawals from the study (question 12). Question 4 dealt with the period between the two


Search terms defined
MEDLINE, CINAHL and EMBASE searched
6099 papers identified
Duplicate papers deleted
4620 individual papers
Classification of papers by 1st investigator through reading abstract: 294 relevant (R), 211 uncertain (U), 4115 not relevant (NR)
Classification of papers by 2nd investigator through reading of abstracts:
  of the 294 R: 287 R, 0 U, 7 NR
  of the 211 U: 0 R, 194 U, 17 NR
  of the 4115 NR: 4 R, 5 U, 4106 NR
Full papers obtained and read: 163 R and 124 NR; 29 R and 165 NR; 5 R and 4 NR
197 R
Data extraction and quality assessment
121

FIGURE 2 Flowchart of literature
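
The counts in Figure 2 and the accompanying text can be cross-checked with a few lines of arithmetic. The Python sketch below is illustrative only; the variable names are hypothetical, and the way the second investigator's counts are grouped under the first investigator's three categories is a reading of the flowchart rather than something the report states explicitly.

```python
# Counts taken from the text and Figure 2.
identified = {"MEDLINE": 2913, "CINAHL": 411, "EMBASE": 2775}
duplicates = {"MEDLINE": 11, "CINAHL": 111, "EMBASE": 1357}

total_identified = sum(identified.values())
individual_papers = total_identified - sum(duplicates.values())

# First-pass classification of abstracts.
first_pass = {"R": 294, "U": 211, "NR": 4115}

# Second-pass classification (R, U, NR) within each first-pass group;
# this grouping is an assumed reading of the flowchart.
second_pass = {"R": (287, 0, 7), "U": (0, 194, 17), "NR": (4, 5, 4106)}

# Full papers were obtained for records still rated relevant or uncertain.
full_papers_obtained = sum(r + u for r, u, _ in second_pass.values())

# Relevant papers after reading the full text, then after data extraction.
relevant_after_full_read = 163 + 29 + 5
included_in_review = relevant_after_full_read - 76

# Share of EMBASE records that were duplicates.
embase_overlap = duplicates["EMBASE"] / identified["EMBASE"]

if __name__ == "__main__":
    print(total_identified, individual_papers, full_papers_obtained,
          relevant_after_full_read, included_in_review,
          round(100 * embase_overlap))
```

Under this reading every stage of the flowchart balances against the figures quoted in the text.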

diagnostic tests; 64% of the papers did not report this and although it is likely that in a lot of cases tests were performed either on the same day or within a few days this could not be assumed.

The responses to the other questions (1, 8a, 8b and 10) showed that for these items quality of both study design and reporting were good: 64% of the studies included a representative spectrum of patients, 64% and 59%, respectively, described the index and reference test in sufficient detail for replication and 79% provided the same clinical data as would be available in practice.

To check for inter-rater agreement 16 of the 121 papers were quality assessed by two separate investigators. The results of this did not allow a kappa test to be performed and therefore the proportion of agreement between the assessors was calculated for each question (Table 6). The proportion of agreement between raters ranged from 0.50 (identical ratings were given half of the

TABLE 2 Matrix showing the distribution of literature that met the inclusion criteria

Urodynamics History Scales Pads Diary Battery sEMG

History 42 1 6 3 1
Scales 8 1
Pads 7 4
Diary 4 2
Paper towel test 1
Physical examination 1 1 2
Q-tip test 4
Algorithm 3
Battery 2 1
Conductance 1
Ultrasound 9
Urodynamics 37

TABLE 3 Matrix showing comparison of urodynamic tests

Multichannel urodynamics Clinical stress test Single-channel cystometry

Imaging 5
Stress tests 6
Single-channel urodynamics 8
Ambulatory 6
UPP 5 1
Flow measurement 1
Cystometry by
foetal monitor 2
Ice-water test 1
Fluid-bridge test 1
Stop test 1

TABLE 4 Quality assessment pilot study: agreement between investigators

Item    Proportion of agreement

1. Was the spectrum of patients representative of the patients who will receive the test in practice?    0.65
2. Were selection criteria clearly described?    0.90
3. Is the reference standard likely to correctly classify the target condition?    1.00
4. Is the time period between reference standard and index test short enough to be reasonably sure that the target condition did not change between the two tests?    0.70
5. Did the whole sample or a random selection of the sample receive verification using a reference standard of diagnosis?    1.00
6. Did patients receive the same reference standard regardless of the index test result?    1.00
7. Was the reference standard independent of the index test (i.e. the index test did not form part of the reference standard)?    0.85
8a. Was the execution of the index test described in sufficient detail to permit replication of the test?    0.69
8b. Was the execution of the reference standard described in sufficient detail to permit its replication?    0.69
9a. Were the index test results interpreted without knowledge of the results of the reference standard?    0.65
9b. Were the reference standard results interpreted without knowledge of the results of the index test?    0.65
10. Were the same clinical data available when test results were interpreted as would be available when the test is used in practice?    0.74
11. Were uninterpretable/intermediate test results reported?    0.50
12. Were withdrawals from the study explained?    0.75


TABLE 5 Summary of quality assessment

Item    Yes    No    Unclear

1. Was the spectrum of patients representative of the patients who will receive the test in practice?    78    12    31
2. Were selection criteria clearly described?    65    47    9
3. Is the reference standard likely to correctly classify the target condition?    102    1    18
4. Is the time period between reference standard and index test short enough to be reasonably sure that the target condition did not change between the two tests?    43    0    78
5. Did the whole sample or a random selection of the sample receive verification using a reference standard of diagnosis?    110    6    5
6. Did patients receive the same reference standard regardless of the index test result?    104    2    15
7. Was the reference standard independent of the index test (i.e. the index test did not form part of the reference standard)?    93    2    26
8a. Was the execution of the index test described in sufficient detail to permit replication of the test?    78    25    18
8b. Was the execution of the reference standard described in sufficient detail to permit its replication?    71    33    17
9a. Were the index test results interpreted without knowledge of the results of the reference standard?    23    3    95
9b. Were the reference standard results interpreted without knowledge of the results of the index test?    10    11    100
10. Were the same clinical data available when test results were interpreted as would be available when the test is used in practice?    95    26    0
11. Were uninterpretable/intermediate test results reported?    22    25    74
12. Were withdrawals from the study explained?    24    15    82
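
The percentages quoted in the text can be recovered from the counts in Table 5. The Python sketch below is a minimal illustration with a hypothetical function name; it assumes each row is a (yes, no, unclear) triple covering all 121 papers and rounds to the nearest whole per cent, which may differ by one point from a figure that was truncated instead.

```python
# (yes, no, unclear) counts for a few items from Table 5.
TABLE_5 = {
    "2": (65, 47, 9),
    "3": (102, 1, 18),
    "5": (110, 6, 5),
    "6": (104, 2, 15),
    "7": (93, 2, 26),
}


def row_percentages(yes, no, unclear):
    """Return the (yes, no, unclear) shares of a row as whole percentages."""
    total = yes + no + unclear
    return tuple(round(100 * count / total) for count in (yes, no, unclear))


if __name__ == "__main__":
    for item, counts in TABLE_5.items():
        print(item, sum(counts), row_percentages(*counts))
```

For example, 102 of 121 papers rated ‘yes’ on question 3 corresponds to 84%, and 47 of 121 rated ‘no’ on question 2 corresponds to 39%.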

TABLE 6 Full quality assessment: agreement between investigators

Item    Proportion of agreement

1. Was the spectrum of patients representative of the patients who will receive the test in practice?    0.62
2. Were selection criteria clearly described?    0.75
3. Is the reference standard likely to correctly classify the target condition?    0.75
4. Is the time period between reference standard and index test short enough to be reasonably sure that the target condition did not change between the two tests?    1.00
5. Did the whole sample or a random selection of the sample receive verification using a reference standard of diagnosis?    0.87
6. Did patients receive the same reference standard regardless of the index test result?    0.75
7. Was the reference standard independent of the index test (i.e. the index test did not form part of the reference standard)?    0.87
8a. Was the execution of the index test described in sufficient detail to permit replication of the test?    0.75
8b. Was the execution of the reference standard described in sufficient detail to permit its replication?    0.62
9a. Were the index test results interpreted without knowledge of the results of the reference standard?    0.87
9b. Were the reference standard results interpreted without knowledge of the results of the index test?    0.75
10. Were the same clinical data available when test results were interpreted as would be available when the test is used in practice?    0.75
11. Were uninterpretable/intermediate test results reported?    0.50
12. Were withdrawals from the study explained?    0.75
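
The proportion of agreement reported in Table 6 is the share of double-assessed papers on which both investigators gave the same rating for a question. A minimal Python sketch follows; the function name and the example ratings are hypothetical and are not taken from the review's data. Unlike kappa, this measure makes no correction for agreement expected by chance.

```python
def proportion_of_agreement(ratings_a, ratings_b):
    """Share of papers given identical ratings by two assessors."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both assessors must rate the same non-empty set of papers")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


if __name__ == "__main__":
    # Invented ratings for one question across 16 double-assessed papers.
    assessor_1 = ["yes"] * 16
    assessor_2 = ["yes"] * 8 + ["unclear"] * 8
    print(proportion_of_agreement(assessor_1, assessor_2))
```

In this invented example the two assessors agree on 8 of 16 papers, giving a proportion of agreement of 0.5.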

time) to 1.00 (perfect agreement). Questions 4, 5, 7 and 9a resulted in a high level of agreement of 0.87 or above. Questions 1, 8b and 11 resulted in low levels of agreement of 0.62 or below. Disagreements were resolved by a third person reading the paper.

Studies identified: key characteristics

Where it is possible to undertake a meta-analysis or pool the results from a group of papers this will be reported in the text and tables. For those studies that could not be combined individual study results are reported. The shaded text in the tables illustrates the studies that presented data in a form that did not allow summary measures of diagnostic accuracy to be calculated. Table 7 presents a summary of data and results of diagnostic accuracy for index tests compared with multichannel urodynamics.

Clinical history compared with urodynamics

USI in women
Twenty-one studies compared the diagnosis of USI in women by clinical history-taking and urodynamics (Table 8). Nineteen were performed in secondary care, one in primary care125 and one did not specify where the study was performed.119 All of these studies used the presence or absence of stress incontinence symptoms as their index test compared with the reference test of multichannel urodynamics, except for one study that used single-channel urodynamics as the reference standard.165

Fifteen studies provided a full cross-tabulation of results and the data from these studies were combined to provide a pooled sensitivity of 0.92 (95% CI 0.91 to 0.93) and specificity of 0.56 (95% CI 0.53 to 0.60) for the diagnosis of USI in women using a clinical history (Figure 3). Although all of these studies used symptoms of stress incontinence, it is probable that different amounts and types of questions were used and different cut-offs applied. Care should be taken, therefore, when interpreting the results. The positive likelihood ratio associated with the pooled sensitivity and specificity is 2.09 (95% CI 1.83 to 2.35) and the AUC for the ROC curve corresponding to the pooled DOR is 0.83 (95% CI 0.71 to 0.95) (Figure 3).

In addition to the pooled studies, three studies report sensitivity and specificity only.161–163 These studies report sensitivities of 0.66, 0.96 and 0.56 and specificities of 0.63, 0.23 and 0.70, respectively. One study reported significantly higher stress symptoms in the USI-confirmed group than in the non-USI group16 and one study reported that multichannel urodynamics confirmed USI in 89% of patients with stress incontinence symptoms.164

In addition, one study compared stress incontinence symptoms with single-channel urodynamics; a sensitivity of 0.92 and specificity of 0.39 is reported.165

DO in women
Fourteen studies compared the diagnosis of DO by clinical history and urodynamics (Table 9). Thirteen studies were performed in secondary care and one in primary care.125

Eight studies provided a full cross-tabulation of results and the data from these studies were combined to provide a pooled sensitivity of 0.61 (95% CI 0.57 to 0.65) and specificity of 0.87 (95% CI 0.85 to 0.89) for the diagnosis of DO in women by clinical history (Figure 4). Although all of these studies used symptoms of urge incontinence it is probable that different amounts and types of questions were used and different cut-offs applied. Again, care should be taken therefore when interpreting the results. The positive likelihood ratio associated with the pooled sensitivity and specificity is 4.69 (95% CI 4.05 to 5.33) and the AUC for the ROC curve corresponding to the pooled DOR is 0.83 (95% CI 0.69 to 0.97) (Figure 4).

In addition, two studies compared diagnosis by history with multichannel urodynamics in elderly women (Figure 5), resulting in a pooled sensitivity of 0.27 (95% CI 0.16 to 0.42) and specificity of 0.94 (95% CI 0.91 to 0.97).

Four papers presented only sensitivity and specificity from their studies.162,163,166,167 The reported sensitivities were 0.70, 0.56, 0.40 and 0.53 and specificities 0.35, 0.70, 0.74 and 0.94, respectively.

Diagnosis of DO and USI in men
Three studies compared diagnosis made by clinical history and urodynamics in men (Table 10). In a post-prostatectomy population one study reports clinical history to be 1.00 sensitive and 0.50 specific for diagnosing USI and 0.50 sensitive and 0.77 specific for diagnosing DO.168 One study reports a sensitivity of 0.73 and specificity of 0.60 for the diagnosis of DO by clinical history169 and

© Queen’s Printer and Controller of HMSO 2006. All rights reserved.


TABLE 7 Summary of data and results of diagnostic accuracy for index tests compared with multichannel urodynamics
Results

Reference TP FP FN TN Sensitivity 95% CI Specificity 95% CI DOR 95% CI

Clinical history for USI in women


Cundiff117 416 60 17 42 0.96 0.94 to 0.98 0.41 0.32 to 0.51 17.13 9.2 to 32.0
De Muylder118 228 58 14 108 0.94 0.91 to 0.97 0.65 0.57 to 0.72 30.33 16.20 to 56.76
Diokno119 65 14 30 52 0.68 0.58 to 0.78 0.79 0.67 to 0.88 8.05 3.87 to 16.73
Diokno120 145 40 9 6 0.94 0.89 to 0.97 0.13 0.05 to 0.26 2.42 0.81 to 7.19
FitzGerald121 187 51 22 33 0.89 0.85 to 0.93 0.39 0.29 to 0.51 5.5 2.95 to 10.25
Ishiko122 152 4 14 28 0.92 0.86 to 0.95 0.88 0.71 to 0.97 76 23.31 to 247.8
Korda123 451 39 52 24 0.90 0.87 to 0.92 0.38 0.26 to 0.51 5.34 2.98 to 9.57
Kujansuu124 46 20 11 43 0.81 0.68 to 0.90 0.68 0.55 to 0.79 8.99 3.86 to 20.93
Lagro-Janssen125 76 9 3 15 0.96 0.89 to 0.99 0.63 0.41 to 0.81 42.22 10.21 to 174.5
Niecestro126 13 17 3 32 0.81 0.54 to 0.96 0.65 0.50 to 0.78 8.16 2.04 to 32.63
Ouslander127 82 31 5 17 0.94 0.87 to 0.98 0.35 0.22 to 0.51 8.99 3.06 to 26.47
Ramsay128 72 28 19 81 0.79 0.69 to 0.87 0.74 0.65 to 0.82 10.96 5.65 to 21.28
Sand129 114 43 0 66 1.00 0.97 to 1.00 0.61 0.51 to 0.70 350.1 21.20 to 5780.2
Sandvik130 179 26 4 27 0.98 0.95 to 0.99 0.51 0.37 to 0.65 46.5 15.05 to 143.5
Sunshine131 73 14 0 15 1.00 0.95 to 1.00 0.52 0.33 to 0.71 157.1 8.89 to 2776.8
Pooled (RE) 0.92 0.91 to 0.93 0.56 0.53 to 0.60 14.34 8.68 to 23.68
LR+ 2.09 (95% CI 1.83 to 2.35)

Clinical history for DO in women


Ishiko122 25 6 4 154 0.86 0.68 to 0.96 0.96 0.92 to 0.99 160.42 42.26 to 608.9
Cundiff117 42 17 60 416 0.41 0.32 to 0.51 0.96 0.94 to 0.98 17.129 9.17 to 32.0
Sandvik130 23 8 18 187 0.56 0.40 to 0.72 0.96 0.92 to 0.98 29.868 11.68 to 76.36
De Muylder118 147 91 89 81 0.62 0.56 to 0.69 0.47 0.40 to 0.55 1.47 0.99 to 2.19
Lagro-Janssen125 11 4 7 81 0.61 0.36 to 0.83 0.95 0.88 to 0.99 31.821 8.00 to 126.6
Sand129 10 3 20 185 0.33 0.17 to 0.53 0.98 0.95 to 1.00 30.833 7.83 to 121.4
FitzGerald121 10 21 27 235 0.27 0.14 to 0.44 0.92 0.88 to 0.95 4.145 1.77 to 9.72
Cantor132 107 53 11 43 0.91 0.84 to 0.95 0.45 0.35 to 0.55 7.892 3.77 to 16.53
Pooled (RE) 0.61 0.57 to 0.65 0.87 0.85 to 0.89 14.72 4.87 to 44.5
LR+ 4.69 (95% CI 4.05 to 5.33)

Clinical history for DO in elderly women (a)


Diokno120 2 6 12 180 0.14 0.02 to 0.43 0.97 0.93 to 0.99 5.00 0.91 to 27.47
Ouslander127 12 10 25 88 0.32 0.18 to 0.50 0.90 0.82 to 0.95 4.22 1.63 to 10.92

Validated scale:
UDI-6 scale for USI in women
Lemack97 39 30 7 52 0.85 0.71 to 0.94 0.63 0.52 to 0.74 9.66 3.84 to 24.27
FitzGerald121 135 22 18 27 0.88 0.82 to 0.93 0.55 0.40 to 0.69 9.21 4.36 to 19.44
Pooled (RE) 0.87 0.82 to 0.92 0.60 0.51 to 0.69 9.38 5.25 to 16.77
LR+ 2.18 (95% CI 1.49 to 2.86)
DIS scale for USI in women
Klovning133 92 20 61 68 0.60 0.52 to 0.68 0.77 0.67 to 0.85 5.13 2.83 to 9.29
Pad test
ICS 1-hour pad test for any leakage in women
Jorgensen134 16 18 1 14 0.94 0.73 to 0.99 0.44 0.28 to 0.61 12.44 1.47 to 105.5
48 hour pad test for USI in women
Versi135 57 12 5 31 0.92 0.82 to 0.97 0.72 0.57 to 0.83 29.45 9.50 to 91.3
Urinary diary for DO in women
Contreras Ortiz136 23 33 3 158 0.88 0.71 to 0.96 0.83 0.77 to 0.87 36.71 10.41 to 129.4
Q-tip test for USI in women (a)

Bergman137 38 27 13 37 0.75 0.60 to 0.86 0.58 0.45 to 0.70 4.01 1.80 to 8.93
Montz138 35 16 31 18 0.53 0.40 to 0.65 0.53 0.35 to 0.70 1.27 0.55 to 2.91
Pooled (RE) 2.27 0.74 to 6.99
Ultrasound: observed leakage for USI in women
Dietz139 10 3 6 18 0.63 0.35 to 0.85 0.86 0.64 to 0.97 10.00 2.05 to 48.89
Dietz140 66 9 13 29 0.84 0.74 to 0.91 0.76 0.60 to 0.89 16.36 6.29 to 42.5
Dietz141 33 2 2 15 0.94 0.81 to 0.99 0.88 0.64 to 0.99 123.8 15.89 to 964.0
Quinn142 87 6 3 28 0.97 0.91 to 0.99 0.82 0.66 to 0.93 135.3 31.8 to 576
Pooled (RE) 0.89 0.84 to 0.93 0.82 0.73 to 0.89 36.784 10.19 to 132.8
LR+ 4.94 (95% CI 3.88 to 6.01)
Ultrasound: bladder neck descent for USI in women
Chen98 27 15 10 50 0.73 0.56 to 0.86 0.77 0.65 to 0.87 9.00 3.56 to 22.74
Bergman143 38 2 6 45 0.86 0.73 to 0.95 0.96 0.86 to 1.00 142 27.16 to 747
Bergman144 30 3 2 26 0.94 0.79 to 0.99 0.90 0.73 to 0.98 130 20.14 to 839
Pooled (RE) 0.84 0.76 to 0.90 0.86 0.79 to 0.91 49.24 6.27 to 386
LR+ 6.00 (95% CI 4.72 to 7.28)

X-ray: observed leakage for USI in women


Pelsang145 37 29 24 69 0.61 0.47 to 0.73 0.70 0.60 to 0.79 3.67 1.87 to 7.19
Scotti146 53 18 35 68 0.60 0.49 to 0.71 0.79 0.69 to 0.87 5.72 2.92 to 11.21
Pooled (RE) 0.60 0.52 to 0.68 0.74 0.68 to 0.81 2.85 to 7.37
LR+ 2.31 (95% CI 1.62 to 3.00)
X-ray: bladder neck descent for USI in women
Grischke147 20 20 14 30 0.59 0.41 to 0.75 0.60 0.45 to 0.74 2.14 0.88 to 5.20
Bergman148 32 15 0 12 1.00 0.89 to 1.00 0.44 0.26 to 0.65 52.4 2.91 to 943
Pooled (RE) 0.79 0.67 to 0.88 0.55 0.43 to 0.66 8.11 0.30 to 222
LR+ 1.76 (95% CI 0.90 to 2.61)
Full bladder clinical stress test for USI in women
Hsu149 29 1 2 9 0.94 0.79 to 0.99 0.90 0.56 to 1.00 130.5 10.56 to 1612
Kadar150 14 5 4 14 0.78 0.52 to 0.94 0.74 0.49 to 0.91 9.8 2.17 to 44.32
Scotti17 68 10 13 54 0.84 0.74 to 0.91 0.84 0.73 to 0.92 28.25 11.50 to 69.37
Pooled (RE) 0.85 0.78 to 0.91 0.83 0.74 to 0.90 25.42 8.66 to 74.6
LR+ 5.00 (95% CI 3.79 to 6.21)
Single-channel urodynamics
Standing SCU for DO in women
Sand151 43 15 8 34 0.84 0.71 to 0.93 0.69 0.55 to 0.82 12.18 4.62 to 32.10
Sand152 15 8 46 134 0.25 0.15 to 0.37 0.94 0.89 to 0.98 5.46 2.17 to 13.72
Sutherst153 35 7 0 58 1.00 0.90 to 1.00 0.89 0.79 to 0.96 553 30.69 to 9993
Pooled (RE) 0.63 0.55 to 0.71 0.88 0.84 to 0.92 19.03 3.34 to 108.5
LR+ 12.00 (95% CI 10.58 to 13.42)
Supine SCU for DO in elderly women
Fonda154 17 7 4 15 0.81 0.58 to 0.95 0.68 0.45 to 0.86 9.11 2.22 to 37.34
Ouslander155 61 10 23 43 0.73 0.62 to 0.82 0.81 0.68 to 0.91 11.40 4.93 to 26.38
Pooled (RE) 0.74 0.65 to 0.82 0.77 0.66 to 0.86 10.75 5.23 to 22.12
LR+ 3.57 (95% CI 2.41 to 4.73)
Supine SCU for DO in elderly men
Fonda154 20 0 1 6 0.95 0.76 to 1.00 1.00 0.54 to 1.00 177.7 6.42 to 4914
Ouslander155 21 1 4 5 0.84 0.64 to 0.96 0.833 0.36 to 1.00 26.25 2.39 to 288
Pooled (RE) 0.89 0.76 to 0.96 0.92 0.62 to 1.00 50.58 7.24 to 353
LR+ 18.20 (95% CI 12.62 to 23.78)

Supine SCU for USI in women


Resnick156 19 10 3 64 0.86 0.67 to 0.95 0.86 0.77 to 0.92 40.53 10.11 to 162.4
Ambulatory urodynamics for USI in women
Davis157 7 38 2 3 0.78 0.45 to 0.94 0.07 0.03 to 0.19 0.28 0.039 to 1.97
Sitting UPP for USI in women
Swift158 32 1 33 42 0.49 0.37 to 0.62 0.98 0.88 to 1.00 40.73 5.29 to 313
Richardson159 30 43 5 59 0.86 0.70 to 0.95 0.58 0.48 to 0.68 8.23 2.95 to 22.95
Pooled (RE) 0.62 0.52 to 0.72 0.70 0.61 to 0.77 14.46 3.06 to 68.2
LR+ 9.38 (95% CI 6.91 to 11.84)
Supine UPP for USI in women
Versi160 54 21 16 81 0.77 0.66 to 0.85 0.79 0.71 to 0.86 13.02 6.24 to 27.17
(a) Not pooled owing to excessive clinical heterogeneity.
DIS, Detrusor Instability Score; FN, false negative; FP, false positive; RE, random effects; TN, true negative; TP, true positive; UDI, Urogenital Distress Inventory; UPP, urethral
pressure profile.
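
The per-study rows of Table 7 can be recomputed from their TP, FP, FN and TN counts. The following is a minimal sketch in Python, not the authors' software. It assumes the DOR confidence interval is taken on the log scale with the usual standard error, and that 0.5 is added to every cell when any cell is zero. The function name `accuracy_from_2x2` is hypothetical. Under these assumptions the sketch gives values close to the DORs and intervals tabulated for Cundiff117 and Sand129.

```python
import math


def accuracy_from_2x2(tp, fp, fn, tn):
    """Return sensitivity, specificity, DOR and an approximate 95% CI for the DOR."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Assumption: a 0.5 continuity correction is applied to all four cells
    # whenever one of them is zero, so the odds ratio stays finite.
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    dor = (tp * tn) / (fp * fn)
    # Standard error of log(DOR) from the four cell counts.
    se_log_dor = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    lower = math.exp(math.log(dor) - 1.96 * se_log_dor)
    upper = math.exp(math.log(dor) + 1.96 * se_log_dor)
    return sensitivity, specificity, dor, (lower, upper)


# Cundiff (ref. 117), clinical history for USI in women: TP 416, FP 60, FN 17, TN 42
sens, spec, dor, (dor_low, dor_high) = accuracy_from_2x2(416, 60, 17, 42)
print(f"sensitivity {sens:.2f}, specificity {spec:.2f}, "
      f"DOR {dor:.2f} ({dor_low:.2f} to {dor_high:.2f})")

# Sand (ref. 129) has FN = 0, so the continuity correction is used for the DOR
_, _, dor_zero_cell, _ = accuracy_from_2x2(114, 43, 0, 66)
print(f"DOR with a zero cell {dor_zero_cell:.1f}")
```

Sensitivity and specificity are computed before any correction, so a study with no false negatives still reports a sensitivity of 1.00.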

TABLE 8 Clinical history compared with urodynamics for USI in women

Reference n Gender Care setting Population Type of Gold standard Index test Statistical tests Main findings
incontinence

Cundiff117 535 F Secondary Symptoms of UI USI Multichannel History Full contingency table Sensitivity = 0.96
urodynamics (stress symptoms) Specificity = 0.41
Niecestro126 66 F Secondary Referred for UD USI Multichannel History Full contingency table Sensitivity = 0.81
urodynamics (stress symptoms) Specificity = 0.65
Diokno119 456 F Not specified Symptoms of UI USI Multichannel History Full contingency table Sensitivity = 0.68
+ controls urodynamics (stress symptoms) Specificity = 0.79
Ishiko122 198 F Secondary Symptoms of UI USI Multichannel History Full contingency table Sensitivity = 0.92
urodynamics (stress symptoms) Specificity = 0.88
Sandvik130 250 F Secondary Symptoms of UI USI Multichannel History Full contingency table Sensitivity = 0.98
urodynamics (stress symptoms) Specificity = 0.51
De Muylder118 408 F Secondary Symptoms of UI USI Multichannel History Full contingency table Sensitivity = 0.94
urodynamics (stress symptoms) Specificity = 0.65
Lagro-Janssen125 103 F Primary Symptoms of UI USI Multichannel History Full contingency table Sensitivity = 0.96
urodynamics (stress symptoms) Specificity = 0.63
Sand129 188 F Secondary LUTS USI Multichannel History Full contingency table Sensitivity = 1.00
urodynamics (stress symptoms) Specificity = 0.61
Diokno120 200 F Secondary Symptoms of UI USI Multichannel History Full contingency table Sensitivity = 0.94
urodynamics (stress symptoms) Specificity = 0.13
Ouslander127 135 F Secondary Symptoms of UI USI Multichannel History Full contingency table Sensitivity = 0.94
urodynamics (stress symptoms) Specificity = 0.35
Kujansuu124 121 F Secondary Symptoms of UI USI Multichannel History Full contingency table Sensitivity = 0.81
urodynamics (stress symptoms) Specificity = 0.68
FitzGerald121 293 F Secondary Symptoms of UI USI Multichannel History Full contingency table Sensitivity = 0.89
urodynamics (stress symptoms) Specificity = 0.39
Sunshine131 109 F Secondary Symptoms of UI USI Multichannel History Full contingency table Sensitivity = 1.00
urodynamics (stress symptoms) Specificity = 0.52
Korda123 566 F Secondary Symptoms of UI USI Multichannel History Full contingency table Sensitivity = 0.90
urodynamics (stress symptoms) Specificity = 0.38
Ramsay128 200 F Secondary Positive USI/DO Multichannel History Full contingency table Sensitivity = 0.79
urodynamics urodynamics (stress symptoms) Specificity = 0.74

Weidner161 950 F Secondary Symptoms of UI USI Multichannel History Sensitivity and Specificity Sensitivity = 0.66
urodynamics (stress symptoms) Specificity = 0.63
Clarke162 100 F Secondary LUTS USI Multichannel History Sensitivity and specificity Sensitivity = 0.96
urodynamics (stress symptoms) Specificity = 0.23
Bergman163 154 F Secondary Symptoms of UI USI/DO Multichannel History Sensitivity and specificity Sensitivity = 0.56
+ controls urodynamics (stress symptoms) Specificity = 0.70
Amundsen16 115 F Secondary Symptoms of UI USI Multichannel History Difference between USI Significantly higher
urodynamics (stress symptoms) and non-USI groups SI symptoms in
USI group
Ng164 28 F Secondary Symptoms of UI USI Multichannel History Agreement between two USI confirmed in
urodynamics (stress symptoms) methods 89% of those with
stress symptoms
Le Coutour165 154 F Secondary Symptoms of UI USI Single-channel History Full contingency table Sensitivity = 0.92
urodynamics (stress symptoms) Specificity = 0.39

F, female; LUTS, lower urinary tract symptoms; UI, urinary incontinence.
The shaded area in Tables 8–34 indicates studies that presented data in a form that did not allow summary measures of diagnostic accuracy to be calculated.
(a) Sensitivity (95% CI) by study reference number
131 1.00 (0.95 to 1.00)
130 0.98 (0.94 to 0.99)
129 1.00 (0.97 to 1.00)
128 0.79 (0.69 to 0.87)
127 0.94 (0.87 to 0.98)
126 0.81 (0.54 to 0.96)
125 0.96 (0.89 to 0.99)
124 0.81 (0.68 to 0.90)
123 0.90 (0.87 to 0.92)
122 0.92 (0.86 to 0.95)
121 0.89 (0.84 to 0.93)
120 0.94 (0.89 to 0.97)
119 0.68 (0.58 to 0.78)
118 0.94 (0.90 to 0.97)
117 0.96 (0.94 to 0.98)
Pooled sensitivity = 0.92 (0.91 to 0.93)
χ² = 133.34; df = 14 (p = 0.0000)

(b) Specificity (95% CI) by study reference number
131 0.52 (0.33 to 0.71)
130 0.51 (0.37 to 0.65)
129 0.61 (0.51 to 0.70)
128 0.74 (0.65 to 0.82)
127 0.35 (0.22 to 0.51)
126 0.65 (0.50 to 0.78)
125 0.63 (0.41 to 0.81)
124 0.68 (0.55 to 0.79)
123 0.38 (0.26 to 0.51)
122 0.88 (0.71 to 0.96)
121 0.39 (0.29 to 0.51)
120 0.13 (0.05 to 0.26)
119 0.79 (0.67 to 0.88)
118 0.65 (0.57 to 0.72)
117 0.41 (0.32 to 0.51)
Pooled specificity = 0.56 (0.53 to 0.60)
χ² = 130.90; df = 14 (p = 0.0000)

(c) ROC plane: sensitivity plotted against 1 – specificity for each study, with the pooled sensitivity and specificity given in (a) and (b).

(d) Symmetric SROC curve: sensitivity plotted against 1 – specificity
AUC = 0.8274; SE (AUC) = 0.0609
Q* = 0.7603; SE (Q*) = 0.0553
Pooled (random effect) DOR 14.339 (95% CI 8.682 to 23.681)
Heterogeneity χ² = 62.07 (df = 14) p = 0.000

FIGURE 3 Pooled random effect results: clinical history versus multichannel urodynamics (MCU) for diagnosis of USI in women. (a) Independently pooled sensitivity; (b) independently pooled specificity; (c) sensitivity and specificity for each study and pooled estimates plotted in ROC space; (d) pooled DOR (random effect) plotted in ROC space. SROC, summary receiver operating characteristic.

TABLE 9 Clinical history compared with urodynamics for DO in women

Reference n Gender Care setting Population Type of Gold standard Index test Statistical tests Main findings
incontinence

Ishiko122 198 F Secondary Symptoms of UI DO Multichannel History Full contingency table Sensitivity = 0.86
urodynamics (urge symptoms) Specificity = 0.96
Cundiff117 535 F Secondary Symptoms of UI DO Multichannel History Full contingency table Sensitivity = 0.41
urodynamics (urge symptoms) Specificity = 0.96
Sandvik130 250 F Secondary Symptoms of UI DO Multichannel History Full contingency table Sensitivity = 0.56
urodynamics (urge symptoms) Specificity = 0.96
De Muylder118 408 F Secondary Symptoms of UI DO Multichannel History Full contingency table Sensitivity = 0.62
urodynamics (urge symptoms) Specificity = 0.47
Lagro-Janssen125 103 F Primary Symptoms of UI DO Multichannel History Full contingency table Sensitivity = 0.61
urodynamics (urge symptoms) Specificity = 0.95
Sand129 188 F Secondary LUTS DO Multichannel History Full contingency table Sensitivity = 0.33
urodynamics (urge symptoms) Specificity = 0.98
FitzGerald121 293 F Secondary Symptoms of UI DO Multichannel History Full contingency table Sensitivity = 0.27
urodynamics (urge symptoms) Specificity = 0.92
Cantor132 214 F Secondary Symptoms of UI DO Multichannel History Full contingency table Sensitivity = 0.91
urodynamics (urge symptoms) Specificity = 0.45
Diokno120 200 F Secondary Symptoms of UI DO Multichannel History Full contingency table Sensitivity = 0.14
(elderly) urodynamics (urge symptoms) Specificity = 0.97
Ouslander127 135 F Secondary Symptoms of UI DO Multichannel History Full contingency table Sensitivity = 0.32
(elderly) urodynamics (urge symptoms) Specificity = 0.90
Clarke162 100 F Secondary LUTS DO Multichannel History Sensitivity and specificity Sensitivity = 0.70
urodynamics Specificity = 0.35
Bergman163 154 F Secondary Symptoms of UI USI/DO Multichannel History (range Sensitivities and Mean
+ controls urodynamics of symptoms) specificities Sensitivities = 56 ± 17
Specificities = 70 ± 23
Petros166 113 F Secondary Symptoms of UI DO History Multichannel Sensitivity and Specificity Sensitivity = 0.40
urodynamics Specificity = 0.74
Van Doorn167 228 F Secondary Symptoms of UI DO Ambulatory History Full contingency table Sensitivity = 0.53
urodynamics (urge symptoms) Specificity = 0.94
(a) Sensitivity (95% CI) by study reference number
132 0.91 (0.84 to 0.95)
121 0.27 (0.14 to 0.44)
129 0.33 (0.17 to 0.53)
125 0.61 (0.36 to 0.83)
118 0.62 (0.56 to 0.68)
130 0.56 (0.40 to 0.72)
117 0.41 (0.32 to 0.51)
122 0.86 (0.68 to 0.96)
Pooled sensitivity = 0.61 (0.57 to 0.65)
χ² = 106.09; df = 7 (p = 0.0000)

(b) Specificity (95% CI) by study reference number
132 0.45 (0.35 to 0.55)
121 0.92 (0.88 to 0.95)
129 0.98 (0.95 to 1.00)
125 0.95 (0.88 to 0.99)
118 0.47 (0.39 to 0.55)
130 0.96 (0.92 to 0.98)
117 0.96 (0.94 to 0.98)
122 0.96 (0.92 to 0.99)
Pooled specificity = 0.87 (0.85 to 0.89)
χ² = 373.67; df = 7 (p = 0.0000)

(c) ROC plane: sensitivity plotted against 1 – specificity for each study, with the pooled sensitivity and specificity given in (a) and (b).

(d) Symmetric SROC curve: sensitivity plotted against 1 – specificity
AUC = 0.8293; SE (AUC) = 0.0673
Q* = 0.7620; SE (Q*) = 0.0614
Pooled (random effect) DOR 14.72 (95% CI 4.87 to 44.51)
Heterogeneity χ² = 105.65 (df = 7) p = 0.000

FIGURE 4 Pooled random effect results: clinical history versus MCU for diagnosis of DO in women. (a) Independently pooled sensitivity; (b) independently pooled specificity; (c) sensitivity and specificity for each study and pooled estimates plotted in ROC space; (d) pooled DOR (random effect) plotted in ROC space.
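
The positive likelihood ratios quoted for clinical history are consistent with the standard relation LR+ = sensitivity / (1 – specificity) applied to the pooled estimates. The sketch below is illustrative only and the function name is hypothetical. It reproduces the point estimates of 2.09 (USI) and 4.69 (DO) from the rounded pooled values. It does not attempt the reported confidence intervals, because the method used for them is not stated here.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1 - specificity)


# Pooled estimates for clinical history versus multichannel urodynamics
lr_usi = positive_likelihood_ratio(0.92, 0.56)  # USI in women
lr_do = positive_likelihood_ratio(0.61, 0.87)   # DO in women
print(f"LR+ for USI {lr_usi:.2f}, LR+ for DO {lr_do:.2f}")
```

A ratio of about 2 means that stress symptoms on history roughly double the odds of urodynamically confirmed USI, whereas urge symptoms raise the odds of DO more than fourfold.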

(a) Sensitivity (95% CI) by study reference number
127 0.32 (0.18 to 0.50)
120 0.14 (0.02 to 0.43)
Pooled sensitivity = 0.27 (0.16 to 0.42)
χ² = 1.84; df = 1 (p = 0.1755)

(b) Specificity (95% CI) by study reference number
127 0.90 (0.82 to 0.95)
120 0.97 (0.93 to 0.99)
Pooled specificity = 0.94 (0.91 to 0.97)
χ² = 5.52; df = 1 (p = 0.0188)

(c) ROC plane: sensitivity plotted against 1 – specificity for each study, with the pooled sensitivity and specificity given in (a) and (b).

FIGURE 5 Pooled random effect results: clinical history versus MCU for diagnosis of DO in elderly women. (a) Independently pooled sensitivity; (b) independently pooled specificity; (c) sensitivity and specificity for each study and pooled estimates plotted in ROC space.
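
The pooled sensitivity and specificity in Figure 5 can be recovered by summing the Table 7 counts across the two studies. The heterogeneity statistics are numerically consistent with a likelihood-ratio chi-square test of equal proportions. That the original analysis used exactly this statistic is an assumption, and the function names below are hypothetical. A minimal sketch:

```python
import math


def pooled_proportion(events, totals):
    """Pool by summing events and totals across studies."""
    return sum(events) / sum(totals)


def lr_chi_square(events, totals):
    """Likelihood-ratio chi-square for equal proportions across studies.

    Returns the statistic and its degrees of freedom (studies - 1).
    """
    p = pooled_proportion(events, totals)
    g = 0.0
    for e, n in zip(events, totals):
        # Observed and expected counts for events and non-events in this study.
        for observed, expected in ((e, n * p), (n - e, n * (1 - p))):
            if observed > 0:  # a zero cell contributes nothing to the sum
                g += observed * math.log(observed / expected)
    return 2 * g, len(events) - 1


# Diokno (ref. 120) and Ouslander (ref. 127), DO in elderly women
tp, with_do = [2, 12], [2 + 12, 12 + 25]         # TP and TP + FN
tn, without_do = [180, 88], [180 + 6, 88 + 10]   # TN and TN + FP

pooled_sens = pooled_proportion(tp, with_do)
pooled_spec = pooled_proportion(tn, without_do)
chi_sens, df_sens = lr_chi_square(tp, with_do)
chi_spec, df_spec = lr_chi_square(tn, without_do)
print(f"sensitivity {pooled_sens:.2f}, chi-square {chi_sens:.2f} (df = {df_sens})")
print(f"specificity {pooled_spec:.2f}, chi-square {chi_spec:.2f} (df = {df_spec})")
```

With only two studies there is a single degree of freedom, so these statistics have little power to detect heterogeneity.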

one study reports a higher incidence of urge symptoms in a urodynamically confirmed DO group compared with a urodynamically normal group.170

Diagnosis of USI and DO in a mixed population
Three studies compared diagnosis by clinical history and multichannel urodynamics in a mixed population (Table 11). One study reports a sensitivity of 1.00 and specificity of 0.95 for the diagnosis of USI by history taking of stress incontinence symptoms.171 One study reports an agreement of 93% (USI) and 63% (DO) between the two methods,172 and one reports an agreement of 60% for the diagnosis of USI.173

Validated scale compared with clinical history
One study compared the association of the Incontinence Impact Questionnaire (IIQ-7) and the Urogenital Distress Inventory (UDI-6) with various incontinence symptoms (Table 12). Correlation coefficients of between 0.24 and 0.69 were found.

Validated scale compared with validated scale
One study compared the association between the long and short forms of the IIQ and the UDI (Table 13). These scales measure the life impact and symptom distress of urinary incontinence in women. Correlations of r = 0.93 (UDI) and 0.97 (IIQ) were found between the two forms of the questionnaires, indicating that the shortened versions are as valid as the long forms for the measurement of these quality of life symptoms.

Validated scale compared with pad test
Four papers reported a comparison of a validated scale with a pad test (Table 14). All four studied only female patients.

Three papers did not present data in a way that allowed sensitivity and specificity to be calculated.107,113,114 Attempts to contact the authors resulted in one response with the full data requested.107

One study investigated the association between the UDI and IIQ long form with a 1-hour pad test.113 These scales were developed to assess the impact of urinary incontinence on activity and emotions and the degree to which symptoms of incontinence are distressing. Data were not presented in a way that allowed sensitivity and specificity to be calculated. However, the authors present an ROC analysis that shows that there was a 54% and 51% probability of correctly classifying incontinence as measured by the pad test for the IIQ and UDI, respectively.

One paper aimed to validate further the Sandvik severity index, this time with the association with a 48-hour pad test.114 Insufficient data were presented to allow sensitivity or specificity to be calculated. The correlation between the severity index and leakage of the pad test was r = 0.36 (p < 0.001).

One study investigated a new screening questionnaire designed for women in primary care, the Incontinence Screening Questionnaire (ISQ), and compared it against the 48-hour pad test.176 This resulted in a sensitivity of 0.65 and a specificity of 0.80 (cut-off for pad test = 7 g, positive ISQ = responded positively to at least one of the eight items).

One study evaluated the Sandvik scale, a three- or four-level severity scale, against the 24-hour pad test.107 When contacted, the author sent individual patient data for 315 cases, allowing numerous cut-off points to be used. Based on a positive cut-off point of above 1 for the severity scale and 7 g for the pad test, the scale was found to be 0.74 sensitive and 0.76 specific.

Validated scales compared with urodynamics
Eight studies compared the use of validated scales with standard multichannel urodynamics for the diagnosis of urinary incontinence (Table 15). Six studies investigated female patients97,106,121,133,177,178 and two studied male patients.101,179 Six separate scales were studied by the eight studies in this group.

Three papers studied the UDI.97,121,178 Two papers used the response on question 3 of the short form of the scale (Are you bothered by urinary leakage caused by physical exercise?) to predict urodynamic diagnosis of USI.97,121 These studies report sensitivities of 0.85 and 0.88 and specificities of 0.63 and 0.55. Owing to the homogeneity of these papers it was possible to combine the data to produce a pooled sensitivity of 0.87 (95% CI 0.82 to 0.92) and specificity of 0.60 (95% CI 0.51 to 0.69) for the diagnosis of USI from question 3 of the UDI-6 (Figure 6). One other paper reported a correlation of r = 0.54 between diagnosis using multichannel urodynamics and score on the UDI.178

One paper investigated the use of a Detrusor Instability Score (DIS),133 a ten-question scale

TABLE 10 Clinical history compared with urodynamics in men

Reference n Gender Care setting Population Type of Gold standard Index test Statistical tests Main findings
incontinence

Ficazzola168 60 M Secondary Post-prostatectomy USI/DO Multichannel History Full contingency table USI:
incontinence urodynamics Sensitivity = 1.00
Specificity = 0.50
DO:
Sensitivity = 0.50
Specificity = 0.77
Ding169 126 M Secondary LUTS DO Multichannel History Sensitivity and specificity Sensitivity = 0.73
urodynamics Specificity = 0.60
Hyman170 160 M Not specified LUTS DO Multichannel History Differences between Higher incidence of
urodynamics diagnostic groups urge symptoms
associated with DO
group

M, male.

TABLE 11 Clinical history compared with urodynamics in a mixed population

Reference n Gender Care setting Population Type of Gold standard Index test Statistical tests Main findings
incontinence

Porru171 46 Mixed Secondary Symptoms of UI USI Multichannel History Full contingency table Sensitivity = 1.00
urodynamics (symptoms of USI) Specificity = 0.95
Gray172 148 Mixed Secondary Symptoms of UI USI/DO Multichannel History Agreement between USI: 93%
urodynamics methods DO: 63%
De Bolla173 82 Mixed Secondary Symptoms of UI USI/DO Multichannel History Agreement between Agreement = 60%
urodynamics methods
TABLE 12 Validated scale compared with clinical history

Reference n Gender Care setting Population Type of Gold standard Index test Data available Main findings
incontinence

Robinson174 384 F Primary Symptoms of UI Any leakage History (various IIQ-7 Correlation Range
(elderly) measures of UDI-6 r = 0.24 to 0.69
incontinence)

TABLE 13 Validated scale compared with validated scale

Reference n Gender Care setting Population Type of Gold standard Index test Statistical tests Main findings
incontinence

Uebersax175 162 F Clinical trial Diagnosis of UI USI/DO UDI (long form) UDI (short form) Correlation UDI: 0.93
for UI IIQ (long form) IIQ (short form) IIQ: 0.97

TABLE 14 Validated scale compared with pad test

Reference n Gender Care setting Population Type of Gold standard Index test Data available Main findings
incontinence

Gunthorpe176 89 F Primary Attending GP’s surgery All 48-hour pad test ISQ Full contingency table Sensitivity = 0.65
(cut-off = 7 g) Specificity = 0.80
Sandvik107 315 F Secondary Symptoms of UI All 24-hour pad test Sandvik severity Individual patient data Sensitivity = 0.74
index Specificity = 0.76
(optimum)
Harvey113 150 F Clinical trial USI or DO positive USI/DO 1-hour pad test IIQ long form Correlation r = 0.18
(cut-off = 2 g)
Hanley114 237 F Primary + Symptoms of UI All 48-hour pad test Sandvik severity Differences between Significant
secondary index groups difference
between severity
groups
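
Where individual patient data are available, as for the Sandvik severity index, sensitivity and specificity can be tabulated at each candidate cut-off. The sketch below shows the idea on a small invented data set. The scores and pad weights are hypothetical and are not the study data. Treating a pad weight gain of 7 g or more as reference-positive is also an assumption, since the text gives only "7 g" as the cut-off.

```python
def sens_spec_at_cutoff(scores, reference_positive, cutoff):
    """Sensitivity and specificity when the index test is positive for score > cutoff."""
    tp = fp = fn = tn = 0
    for score, is_positive in zip(scores, reference_positive):
        test_positive = score > cutoff
        if test_positive and is_positive:
            tp += 1
        elif test_positive:
            fp += 1
        elif is_positive:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical patients: severity index level and pad weight gain in grams
severity = [1, 1, 2, 2, 3, 3, 4, 1, 2, 4]
pad_gain_g = [2, 9, 5, 12, 30, 8, 55, 1, 3, 20]
# Assumption: 7 g or more counts as a positive pad test
reference = [gain >= 7 for gain in pad_gain_g]

for cutoff in (1, 2, 3):
    sensitivity, specificity = sens_spec_at_cutoff(severity, reference, cutoff)
    print(f"cut-off above {cutoff}: sensitivity {sensitivity:.2f}, "
          f"specificity {specificity:.2f}")
```

Raising the cut-off trades sensitivity for specificity, which is why a reported "optimum" pair such as 0.74 and 0.76 depends on the thresholds chosen for both the scale and the pad test.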
TABLE 15 Validated scale compared with urodynamics

Reference n Gender Care setting Population Type of Gold standard Index test Data available Main findings
incontinence

Lemack97 128 F Secondary LUTS USI Multichannel UDI-6 (question 3) Full contingency table Sensitivity = 0.85
urodynamics Specificity = 0.63
FitzGerald121 293 F Secondary Symptoms of UI USI Multichannel IIQ-7 Full contingency table Sensitivity = 0.88
urodynamics UDI-6 (question 3) Specificity = 0.55
Klovning133 250 F Secondary Referred for USI Multichannel DIS Full contingency table Sensitivity = 0.60
‘urogenital urodynamics ROC curve Specificity = 0.77
dysfunction’ (optimum)
Haeusler177 1938 F Secondary Referred for USI/DO Multichannel Gaudenz Sensitivity and specificity USI:
urodynamics urodynamics incontinence Sensitivity = 0.56
questionnaire Specificity = 0.45
(women) DO:
Sensitivity = 0.62
Specificity = 0.56
Shumaker178 162 F Clinical trial Symptoms of UI USI/DO Multichannel UDI Correlation r = 0.54
urodynamics IIQ
Nitti179 50 M Secondary Symptoms of DO Multichannel AUA symptom Difference in score No significant
voiding dysfunction urodynamics index (men) between diagnostic difference between DO
groups group and non-DO
group
Nitti101 83 M Secondary Symptoms of BPH DO Multichannel AUA men Difference in score Higher irritative
urodynamics between diagnostic symptoms scores
groups associated with DO
Nager106 52 F Secondary USI USI Multichannel Quality of life Correlation No statistically
urodynamics questionnaire significant correlation

BPH, benign prostatic hyperplasia.


Study | Sensitivity (95% CI) | Specificity (95% CI)
121 | 0.88 (0.82 to 0.93) | 0.55 (0.40 to 0.69)
97 | 0.85 (0.71 to 0.94) | 0.63 (0.52 to 0.74)
(a) Pooled sensitivity = 0.87 (0.82 to 0.92); χ² = 0.37; df = 1 (p = 0.5433)
(b) Pooled specificity = 0.60 (0.51 to 0.69); χ² = 0.88; df = 1 (p = 0.3478)
(c) ROC plane: sensitivity plotted against 1 – specificity

© Queen’s Printer and Controller of HMSO 2006. All rights reserved.



FIGURE 6 Pooled random effect results: validated scale versus MCU for diagnosis of USI in women. (a) Independently pooled sensitivity; (b) independently pooled specificity; (c) sensitivity and specificity for each study and pooled estimates plotted in ROC space.
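
The heterogeneity p-values quoted alongside the pooled estimates in Figure 6 can be checked from the χ² statistic and its degrees of freedom. The following is an illustrative sketch only, not the software used for the review: the function name `chi2_upper_tail` is hypothetical, and the closed-form expressions it relies on hold only for 1 and 3 degrees of freedom.

```python
import math


def chi2_upper_tail(statistic: float, df: int) -> float:
    """Upper-tail probability P(X >= statistic) for a chi-square variate.

    Hypothetical helper: only df = 1 and df = 3 are handled, because those
    cases have simple closed forms in terms of the complementary error function.
    """
    if statistic < 0:
        raise ValueError("chi-square statistic must be non-negative")
    root = math.sqrt(statistic / 2.0)
    if df == 1:
        return math.erfc(root)
    if df == 3:
        # Extra term comes from integrating the df = 3 density by parts.
        extra = math.sqrt(2.0 * statistic / math.pi) * math.exp(-statistic / 2.0)
        return math.erfc(root) + extra
    raise ValueError("this sketch only supports df = 1 or df = 3")


if __name__ == "__main__":
    # Heterogeneity statistics reported for pooled sensitivity and specificity.
    for label, stat in (("sensitivity", 0.37), ("specificity", 0.88)):
        print(label, round(chi2_upper_tail(stat, df=1), 4))
```

A large p-value here indicates no evidence of heterogeneity between the two studies, which is consistent with the values reported in the figure.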

designed to highlight either USI or DO. This study reports an optimum sensitivity of 0.60 and specificity of 0.77 for the diagnosis of USI. One paper177 studied the ability of the Gaudenz incontinence questionnaire to diagnose USI and DO; this consists of 26 questions and also allows grading of severity of the type of incontinence. The paper reports sensitivities of 0.56 and 0.62 and specificities of 0.45 and 0.56 for the diagnosis of USI and DO, respectively.

The ability of the American Urological Association (AUA) symptom index to diagnose DO in male patients was studied by two papers.101,179 Both papers compared the scores on the seven-question AUA symptom index with diagnosis using multichannel urodynamics. Neither paper presented data in a format that allowed summary statistics of diagnostic accuracy to be calculated. One paper179 found no difference in AUA symptom score between DO and non-DO groups; however, the other found that those patients with DO had significantly higher irritative scores on the AUA.101

One paper studied the correlation between urodynamic diagnosis and score on a quality of life questionnaire (SEAPI QMM incontinence classification system) in women with confirmed USI.106 This study found no statistically significant correlation between the two methods.

Pad test compared with clinical history
Six studies compared a pad test with clinical history for the assessment of urinary incontinence (Table 16). One study included both male and female patients,180 the other five only females. Four studies were performed in secondary care, one in primary care and one did not specify where it was performed.

Three types of pad test were studied. Two studies investigated the use of the 48-hour pad test; one reported a sensitivity of 0.73 and specificity of 1.00 for the prediction of patient-reported incontinence status.180 One paper assigned patients to three severity groups according to their self-reported urine loss and found significant differences in mean urinary loss between the three groups as measured by the 48-hour pad test.112

Two papers studied the 24-hour pad test: one study115 comparing the mean pad weight gain between self-reported incontinent and continent patient groups found no significant differences between the two groups. The other,112 however, found significant differences in mean pad weight gain between three groups of patients grouped according to the self-perceived severity of their symptoms.

Four papers compared a short-term pad test with patient history. Presenting individual patient data, one study reported an optimum sensitivity of 0.87 and specificity of 0.64 for the rapid exercise pad test for predicting self-reported incontinence status.111 For the same test a second study181 reported a sensitivity of 0.90 and specificity of 1.00; however, as the raw data were not presented in this paper it was not possible to pool these results. A third study reported correlations of between r = 0.31 and 0.67 between the 1-hour pad test and various history questions, with the largest correlation being between the pad test and self-reported number of incontinent episodes.99 In the fourth study, when the ICS 1-hour pad test was compared with self-reported grade of incontinence severity, significant differences between mean pad weight gain were found across the three groups.112

Pad test compared with urodynamics
Seven studies were identified that compared the use of a pad test with urodynamics (Table 17). All studies used only female patients and were performed in a secondary care setting, apart from one study that was conducted in mixed care settings.105

Two studies presented data in a cross-tabulated format that allowed sensitivity and specificity to be calculated. One study found the ICS 1-hour pad test to be 0.94 sensitive and 0.45 specific for diagnosing any leakage compared with multichannel urodynamics;134 the other found the 48-hour pad test to be 0.92 sensitive and 0.72 specific for diagnosing USI.135

Four other papers studied the use of short-term pad tests for diagnosing USI compared with multichannel urodynamics. One study found a rapid exercise pad test to be 0.86 sensitive in diagnosing patients with a urodynamic diagnosis of USI.105 A second study182 compared three different pad tests: unknown volume, 250 ml and 1 hour, also in urodynamically positive patients, and found sensitivities ranging from 0.79 to 0.94. A third study183 reported a correlation between the rapid exercise pad test and multichannel urodynamics of 0.59. Finally, in a fourth study significantly higher results of the ICS 1-hour test, 24-hour and 48-hour pad tests were found in urodynamically confirmed USI compared with asymptomatic controls.
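
Where a study presents a full cross-tabulation against the reference standard, sensitivity and specificity follow directly from the four cell counts. The following is a minimal sketch assuming a 2 × 2 table; the function name `sensitivity_specificity` and the counts in the usage example are hypothetical and are not taken from any of the studies reviewed.

```python
def sensitivity_specificity(tp: int, fn: int, fp: int, tn: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from 2 x 2 contingency counts.

    tp, fn: reference-standard positives who test positive / negative.
    fp, tn: reference-standard negatives who test positive / negative.
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("both reference-standard groups must be non-empty")
    sensitivity = tp / (tp + fn)  # proportion of diseased correctly identified
    specificity = tn / (tn + fp)  # proportion of non-diseased correctly identified
    return sensitivity, specificity


if __name__ == "__main__":
    # Illustrative counts only (hypothetical, not study data).
    sens, spec = sensitivity_specificity(tp=45, fn=5, fp=20, tn=30)
    print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

The same calculation shows why studies that recruited only urodynamically positive patients can report sensitivity but not specificity: with no reference-standard negatives, the specificity denominator is zero.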
TABLE 16 Pad test compared with clinical history

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Data available | Main findings
Hellstrom180 | 37 | Mixed | Primary | Symptoms of UI (all 85 years old) | USI | History (severity of UI) | 48-hour pad test | Individual patient data | Sensitivity = 0.73; Specificity = 1.00 (optimum)
Hahn111 | 50 | F | Secondary | Symptoms of USI | USI | History (severity of UI) | Exercise pad test | Individual patient data | Sensitivity = 0.87; Specificity = 0.64 (optimum)
Papa Petros181 | 113 | F | Not specified | Symptoms of UI + controls | USI/DO/mixed | History | Rapid exercise pad test | Sensitivity and specificity | Sensitivity = 0.90; Specificity = 1.00
Jackson99 | 85 | F | Secondary | Symptoms of UI | USI/DO/mixed | History (various measures of leakage) | 1-hour pad test | Correlation | r = 0.31–0.67
Mouritsen112 | 72 | F | Secondary | Symptoms of UI | USI/DO/mixed | History (three severity groups) | ICS: 1-hour; 24 hour; 48 hour | Difference in pad gain between groups | Significant differences for all pad tests
Ryhammer115 | 144 | F | Secondary | Clinical trial for UI | Any leakage | History (incontinent/continent) | 24-hour pad test | Difference in pad gain between groups | No significant difference

TABLE 17 Pad test compared with urodynamics

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Data available | Main findings
Jorgensen134 | 49 | F | Secondary | Symptoms of UI | Any leakage | Multichannel urodynamics | ICS: 1-hour pad test | Full contingency table | Sensitivity = 0.94; Specificity = 0.44
Versi135 | 105 | F | Secondary | LUTS | USI | Multichannel urodynamics | 48-hour home pad test | Full contingency table | Sensitivity = 0.92; Specificity = 0.72
Versi105 | 99 | F | Mixed | USI (all positive urodynamics) | USI | Multichannel urodynamics | Rapid exercise pad test | Sensitivity | Sensitivity = 0.86
Mayne182 | 33 | F | Secondary | USI (all positive urodynamics) | USI | Multichannel urodynamics | Three pad tests: unknown bladder; 250 ml; 1 hour | Sensitivity | Unknown volume = 0.79; 250 ml = 0.91; 1 hour = 0.94
Mouritsen112 | 97 | F | Secondary | Symptoms of UI + asymptomatic controls (25) | USI/DO/mixed | Multichannel urodynamics | ICS: 1 hour; 24 hour; 48 hour | Mean values | Significant differences between patients and controls
Siltberg183 | 15 | F | Secondary | USI (all positive urodynamics) | USI | Multichannel urodynamics | Rapid exercise pad test | Correlation | r = 0.59
Berglund100 | 45 | F | Secondary | USI (all positive urodynamics) | USI | 2 × 2-hour pad test (1 minimal exercise, 1 maximal exercise) | Multichannel urodynamics | Sensitivity | Sensitivity = 0.86

TABLE 18 Urinary diary compared with clinical history

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Data available | Main findings
Miller184 | 51 | F | Clinical trial for UI | Symptoms of UI (all >60 years) | USI | History (frequency of leakage) | 6-day diary (frequency of leakage) | Correlation | r = 0.33
Jackson99 | 85 | F | Secondary | Symptoms of UI | USI/DO/mixed | History (frequency of leakage) | Diary (unspecified) (frequency of leakage) | Kappa | κ = 0.62
Elser19 | 265 | F | Clinical trial for UI | Symptoms of UI | USI/DO/mixed | History (frequency of leakage) | 7-day diary (frequency of leakage) | Correlation | r = 0.63
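
One of the studies above summarises agreement between diary and history with a kappa statistic rather than a correlation. As a rough illustration of what that statistic measures, the sketch below computes Cohen's kappa from a square agreement table. The function name `cohens_kappa` and the example counts are hypothetical; the study's own cross-tabulation is not reproduced in this report.

```python
def cohens_kappa(table: list[list[int]]) -> float:
    """Cohen's kappa for a square k x k agreement table.

    table[i][j] = number of patients placed in category i by method A
    (e.g. history) and category j by method B (e.g. diary).
    """
    k = len(table)
    if k == 0 or any(len(row) != k for row in table):
        raise ValueError("agreement table must be square and non-empty")
    total = sum(sum(row) for row in table)
    if total == 0:
        raise ValueError("agreement table must contain at least one patient")
    # Observed agreement: proportion of patients on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / total
    # Agreement expected by chance from the row and column margins.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical 2 x 2 agreement counts, for illustration only.
    print(round(cohens_kappa([[20, 5], [10, 15]]), 2))
```

Because kappa corrects for chance agreement and a correlation coefficient does not, the three results in Table 18 are not directly comparable with one another.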

One paper compared multichannel urodynamics with an exercise pad test, with the pad test result taken to be the gold standard.100 Multichannel urodynamics were reported to be 0.86 sensitive in diagnosing patients with a positive pad-test result.

Urinary diary compared with clinical history
Three studies compared clinical history with a urinary diary for the measurement of frequency of leakage (Table 18). None of these papers presented data in a form that enabled sensitivity or specificity to be calculated, with either a correlation coefficient or kappa statistic being used. There was a high level of variance between the levels of agreement demonstrated by the three papers. One paper184 reported a correlation of 0.33, one19 a correlation of 0.63 and one99 a kappa of 0.62.

Urinary diary compared with urinary diary
Two studies performed a comparison of two different urinary diaries (Table 19). One study compared a 7-day diary with different types of instructions: extensive and minimal for different symptoms of incontinence in women with a urodynamic diagnosis.185 The correlation between the two methods ranged from 0.67 to 0.78.

One study compared the first 3 days of a 7-day diary with the last 4 days in elderly male patients.186 The correlation between the mean number of incontinent episodes for this period was r = 0.84.

Urinary diary compared with urodynamics
Four papers studied the use of a urinary diary compared with urodynamics (Table 20). However, the data from three studies were not presented in a form suitable for inclusion in any analysis and attempts to contact the authors were unsuccessful.108,109,187

One study compared the 24-hour diary with multichannel urodynamics for the diagnosis of USI, DO and mixed incontinence in female patients.187 This paper reported significant differences between diagnostic groups for various diary parameters. Mean voided volume showed the highest differentiating power between the three diagnostic groups, but statistically significant differences were also found for total voided volume, mean voided volume, largest single voided volume and smallest single voided volume.

One study compared the use of a 7-day diary with multichannel urodynamics for the diagnosis of USI in women with symptoms of pure stress leakage.108 Data from patients with a normal urinary diary only were presented and therefore neither sensitivity nor specificity could be calculated. However, out of 555 women with a negative diary, incontinence (USI, DO or mixed incontinence) was confirmed in 81%.

One study investigated the ability of a urinary diary differentially to diagnose USI and DO in a female population with urodynamically confirmed urinary incontinence.109 Data were not presented in a format that would allow sensitivity or specificity to be calculated. Based on logistic regression analysis, the parameters of a urinary diary that resulted in the best differentiation between USI and DO were frequency of micturition and mean voided volume.

One study aimed to validate the Bladder Instability Discriminant Index (BIDI), a score derived from a 7-day urinary diary for the non-invasive diagnosis of DO.136 A score was developed based on parameters including weekly averages of diurnal micturition, nocturnal micturition, and mean, lowest and highest daily micturition volume. By using a cut-off point of below –0.554 to identify a positive result when compared with urodynamic diagnosis, a sensitivity of 0.88 and specificity of 0.83 were obtained.

Paper towel test compared with clinical history
One study compared a simple paper towel test with patient history of incontinence (Table 21). No significant correlation was found between patient perception of amount of leakage and the results of the paper towel test.

Physical examination compared with clinical history
One paper studied the relationship between the pelvic muscle rating scale and patient history (Table 22).188 The scale was found to have a sensitivity of 0.68 and specificity of 0.71.

Physical examination compared with electromyography
Two studies compared the use of a pelvic muscle rating scale for the measurement of pelvic muscle strength compared with surface electromyography (sEMG) measurements (Table 23). Although these papers do not deal specifically with the diagnosis of urinary incontinence, pelvic muscle strength is a crucial part of any evaluation of urinary symptoms.

TABLE 19 Urinary diary compared with urinary diary

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Data available | Main findings
Robinson185 | 278 | F | Clinical trial for UI | All positive urodynamics | Any leakage | 7-day diary (detailed instructions) | 7-day diary (minimal instructions) | Correlation (various symptoms of incontinence) | r = 0.67–0.78 (range)
Robb186 | 44 | M | Not stated | Symptoms of UI (elderly) | Any leakage | 7-day diary (first 3 days) | 7-day diary (last 4 days) | Correlation (incontinent episodes per day) | r = 0.84

TABLE 20 Urinary diary compared with urodynamics

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Data available | Main findings
Contreras Ortiz136 | 271 | F | Secondary | Symptoms of UI | DO | Multichannel urodynamics | BIDI derived from urinary diary | Full contingency table | Sensitivity = 88%; Specificity = 83%
Fink187 | 132 | F | Secondary | Symptoms of UI | USI/DO/mixed | Multichannel urodynamics | 24-hour FVC (various parameters) | Difference between diagnostic groups | Significant differences found between diagnostic groups
James108 | 555 | F | Secondary | Symptoms of UI | USI (all positive urodynamics) | Multichannel urodynamics | 7-day diary | Partial contingency table | Agreement between diary and urodynamics = 108/555
Larsson109 | 142 | F | Secondary | LUTS | USI/DO | Multichannel urodynamics | 48-hour diary | Difference between diagnostic groups | Significant differences found between diagnostic groups

FVC, frequency volume chart.


TABLE 21 Paper towel test compared with clinical history

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Data available | Main findings
Miller184 | 51 | F | Clinical trial for UI | Symptoms of UI (all ≥ 60 years) | USI | History | Paper towel test | Correlation | No statistically significant correlation

TABLE 22 Physical examination compared with clinical history

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Data available | Main findings
Romanzi188 | 57 | F | Clinical trial | Volunteers (incontinent/continent) | Pelvic muscle strength | History | Pelvic muscle rating scale | Full contingency table | Sensitivity = 0.68; Specificity = 0.71

TABLE 23 Physical examination compared with sEMG

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Data available | Main findings
Romanzi188 | 57 | F | Clinical trial | Volunteers (incontinent/continent) | Pelvic muscle strength | Surface EMG | Pelvic muscle rating scale | Correlation | r = 0.46–0.57
Brink189 | 208 | F | Clinical trial | Symptoms of UI | Pelvic muscle strength | Surface EMG | Pelvic muscle rating scale | Correlation | r = 0.37–0.63

Both papers report a moderate association between the two measures, with correlations of r = 0.46–0.57 in one188 and r = 0.37–0.63 in the other.189

Physical examination compared with battery of tests
One paper compared the diagnosis of USI by a battery of tests with that by physical examination, specifically genital prolapse (Table 24). A sensitivity of 0.72 and specificity of 0.46 are reported.

Q-tip test compared with urodynamics
Four papers investigated the use of the Q-tip test compared with urodynamics (Table 25). Two papers studied the ability of the Q-tip test, measuring straining angle, to diagnose USI compared with multichannel urodynamics.137,138 Both papers presented data in a form that allowed sensitivity and specificity to be calculated; however, different cut-off points were used to classify a positive result and therefore the data cannot be combined. In one paper,137 a cut-off point of 35 degrees or greater resulted in a sensitivity of 0.75 and specificity of 0.58; in the other,138 a cut-off of 30 degrees or greater resulted in a sensitivity of 0.53 and specificity of 0.53.

A further two studies also compared the Q-tip with multichannel urodynamics.191,192 These studies did not present data in a form suitable for calculating summary measures of diagnostic accuracy; however, they both report significantly higher mean straining angles in the USI-confirmed group than in asymptomatic controls.

Algorithm compared with urodynamics
Three studies researched the accuracy of algorithm diagnostic tools compared with multichannel urodynamics in elderly women (Table 26). One study investigated the Resident Assessment Protocol (RAP), a non-urodynamic algorithm.193 They report the RAP to have a sensitivity of 0.76 and specificity of 0.97 for the diagnosis of USI, and a sensitivity of 0.76 and specificity of 0.71 for the diagnosis of DO.

Two studies investigated the ability, retrospectively, of an algorithm method to predict diagnosis of USI, DO and mixed incontinence by multichannel urodynamics.194,195 They reported that treatment based on the algorithm method would have been correct in 85% of cases in one study194 and 95% of cases in the other.195

Battery of tests compared with clinical history
One paper190 studied the association between diagnosis of USI using a battery of tests compared with a clinical history (Table 27). The battery of tests consisted of a physical examination, cystometry and a stress test. A patient’s history of their symptoms was found to be 0.52 sensitive and 0.85 specific in predicting diagnosis based on the battery.

Battery of tests compared with urodynamics
Two papers studied the use of a battery of tests compared with multichannel urodynamics (Table 28). One study compared a diagnosis based on a Q-tip test, cough test and patients’ symptoms with multichannel urodynamics.196 Good agreement was found between the two methods, with a sensitivity and specificity of 0.94 and 0.84 for the diagnosis of USI or mixed incontinence and 0.71 and 0.96 for the diagnosis of DO.

One study compared the combination of a pad test and patient history for the diagnosis of DO only;197 this reports a sensitivity of 0.88 compared with diagnosis made by multichannel urodynamics.

Conductance measurement compared with multichannel urodynamics
One paper198 studied the measurement of distal urethral conductance (DUEC) for the diagnosis of USI compared with multichannel urodynamics (Table 29) and reported a sensitivity of 0.64 and specificity of 0.86.

Urodynamics compared with ultrasound
Nine studies compared the use of ultrasound imaging with urodynamic investigations (Table 30). Unfortunately, data from two papers were not presented in a form suitable for inclusion in the formal analysis.98,199 Attempts to contact the authors for further information resulted in one reply with the full, individual patient data requested.98

All nine studies included only female patients and all were conducted in a secondary care setting. Two papers investigated the use of translabial colour Doppler ultrasound.139,141 This was compared as an alternative to fluoroscopy for the detection of urinary leakage during urodynamic investigation for the diagnosis of USI, DO and mixed incontinence.

Two papers studied the use of transrectal ultrasound for the evaluation of the bladder base and urethrovesical junction compared with the ICS-defined diagnosis of USI by urodynamic investigation.143,144 A urethrovesical junction drop during stress of at least 1 cm was defined as the cut-off for USI.
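
Several of the tests above (the Q-tip straining angle, the drop of the urethrovesical junction) turn a continuous measurement into a positive or negative result by applying a cut-off, and the Q-tip studies could not be pooled because they used different cut-offs. The sketch below illustrates why: the same measurements give different sensitivity and specificity at each threshold. The function name `accuracy_at_cutoff` and all angle values are hypothetical, chosen only to show the trade-off.

```python
def accuracy_at_cutoff(
    values: list[float], diseased: list[bool], cutoff: float
) -> tuple[float, float]:
    """Sensitivity and specificity when 'value >= cutoff' counts as a positive test."""
    if len(values) != len(diseased):
        raise ValueError("values and diseased flags must have the same length")
    tp = sum(1 for v, d in zip(values, diseased) if d and v >= cutoff)
    fn = sum(1 for v, d in zip(values, diseased) if d and v < cutoff)
    tn = sum(1 for v, d in zip(values, diseased) if not d and v < cutoff)
    fp = sum(1 for v, d in zip(values, diseased) if not d and v >= cutoff)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one non-diseased patient")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical straining angles in degrees (not study data).
    angles = [32, 36, 40, 45, 28, 20, 25, 31, 33, 38]
    has_usi = [True, True, True, True, True, False, False, False, False, False]
    for cutoff in (30, 35):
        sens, spec = accuracy_at_cutoff(angles, has_usi, cutoff)
        print(f"cut-off >= {cutoff}: sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In this made-up sample, raising the cut-off lowers sensitivity and raises specificity, so estimates obtained at different thresholds describe different points on the same ROC curve rather than repeated measurements of one quantity.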
TABLE 24 Physical examination compared with battery of tests

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Statistical tests | Main findings
Fischer-Rasmussen190 | 212 | F | Secondary | Symptoms of UI | USI | History/pelvic floor examination/cystometry/stress test | Physical examination (genital prolapse) | Full contingency table | Sensitivity = 0.72; Specificity = 0.46

TABLE 25 Q-tip test compared with urodynamics

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Data available | Main findings
Bergman137 | 115 | F | Secondary | Symptoms of UI | USI | Multichannel urodynamics | Q-tip | Full contingency table | Sensitivity = 0.75; Specificity = 0.58
Montz138 | 100 | F | Secondary | Symptoms of UI | USI | Multichannel urodynamics | Q-tip | Full contingency table | Sensitivity = 0.53; Specificity = 0.53
Karram191 | 63 | F | Secondary | Symptoms of UI + controls | USI | Multichannel urodynamics | Q-tip | Difference between diagnostic groups | Mean straining angle significantly higher in USI group
Walters192 | 48 | F | Secondary | Symptoms of UI + controls | USI | Multichannel urodynamics | Q-tip | Difference between diagnostic groups | Mean straining angle significantly higher in USI group

TABLE 26 Algorithm compared with urodynamics

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Data available | Main findings
Resnick193 | 102 | F | Secondary | Symptoms of UI | USI/DO | Multichannel urodynamics | Algorithm (RAP) | Sensitivity and specificity | USI: Sensitivity = 0.76, Specificity = 0.71; DO: Sensitivity = 0.76, Specificity = 0.97
Eastwood194 | 65 | F | Secondary | Referred for urodynamics | USI/DO/mixed | Multichannel urodynamics | Algorithm | Retrospective comparison between diagnostic pathways | Algorithm would have resulted in correct treatment in 85% of cases
Hilton195 | 100 | F | Secondary | Referred for urodynamics | USI/DO/mixed | Multichannel urodynamics | Algorithm | Retrospective comparison between diagnostic pathways | Algorithm would have resulted in correct treatment in 95% of cases

TABLE 27 Battery of tests compared with clinical history

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Statistical tests | Main findings
Fischer-Rasmussen190 | 212 | F | Secondary | Symptoms of UI | USI | History/pelvic floor examination/cystometry/stress test | History | Full contingency table | Sensitivity = 0.52; Specificity = 0.85

TABLE 28 Battery of tests compared with urodynamics

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Data available | Main findings
Summitt196 | 90 | F | Secondary | Symptoms of UI | USI/DO/mixed | Multichannel urodynamics | Q-tip/symptoms/cough test | Full contingency table | USI/Mixed: Sensitivity = 0.94, Specificity = 0.84; DO: Sensitivity = 0.71, Specificity = 0.96
Griffiths197 | 100 | F | Secondary | Symptoms of UI | DO | Multichannel urodynamics | Pad/history | Sensitivity | Sensitivity = 0.88

TABLE 29 Conductance measurement compared with urodynamics

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Statistical tests | Main findings
Creighton198 | F | F | Secondary | Symptoms of UI | USI/mixed | Multichannel urodynamics | DUEC | Full contingency table | Sensitivity = 0.64; Specificity = 0.86

TABLE 30 Urodynamics compared with ultrasound

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Data available | Main findings
Dietz139 | 37 | F | Secondary | Referred for urodynamics | USI/DO/mixed | Multichannel urodynamics | Translabial ultrasound (visible leakage) | Full contingency table | Sensitivity = 77%; Specificity = 75%
Dietz140 | 117 | F | Secondary | Symptoms of UI | USI/DO/mixed | Multichannel urodynamics | Transperineal (opening of bladder neck and mid-urethra) | Full contingency table | Sensitivity = 83%; Specificity = 76%
Chen98 | 102 | F | Secondary | USI (positive urodynamics) continent controls | USI | Multichannel urodynamics | Perineal (rotational angle and BND) | Full contingency table | Sensitivity = 73%; Specificity = 77%
Kiilholma102 | 38 | F | Secondary | USI (positive urodynamics) | USI | Multichannel urodynamics | Perineal (BND) | Sensitivity | Sensitivity = 72%
Bergman143 | 91 | F | Secondary | Symptoms of UI | USI | Multichannel urodynamics | Transrectal (drop of the urethrovesical junction) | Full contingency table | Sensitivity = 86%; Specificity = 92%
Bergman144 | 32 | F | Secondary | USI controls (DO = control) | USI | Multichannel urodynamics | Transrectal (drop of the urethrovesical junction) | Full contingency table | Sensitivity = 94%; Specificity = 89%
Dietz141 | 52 | F | Secondary | Referred for urodynamics | USI/DO | Multichannel urodynamics | Translabial (visible leakage) | Full contingency table | Sensitivity = 94%; Specificity = 93%
Quinn142 | 124 | F | Secondary | Not specified | USI | Multichannel urodynamics | Vaginal (opening of bladder neck/proximal urethral with leakage during cough) | Full contingency table | Sensitivity = 96%; Specificity = 82%
Kolbi199 | 32 | F | Secondary | USI | USI | Multichannel urodynamics | Perineal (urethrovesical angle) | Difference between diagnostic methods | No significant difference

Three studies compared ultrasound (vaginal142 and transperineal98,140) with fluoroscopy during videourodynamics. The imaging of bladder neck descent (BND) and rotation of the proximal urethra were recorded using both methods. Simple funnelling or opening of the proximal urethra during valsalva was taken to be the measure of USI.

Imaging techniques compared with multichannel urodynamics
When imaging the lower urinary tract during investigation of urinary incontinence two anatomical features are commonly used: observation of leakage from the bladder and descent of the bladder neck. Two methods for directly observing leakage from the bladder are reported: X-ray imaging performed during urodynamics (Table 31) and ultrasound (as described in the previous section).

Four studies report the accuracy of observed leakage using ultrasound for the diagnosis of USI compared with multichannel urodynamics (Figure 7). The data from these studies were combined to provide a pooled sensitivity of 0.89 (95% CI 0.84 to 0.93) and specificity of 0.82 (95% CI 0.73 to 0.89). The positive likelihood ratio associated with the pooled sensitivity and specificity is 4.94 (95% CI 3.88 to 6.01), and the AUC for the ROC curve corresponding to the pooled DOR is 0.90 (95% CI 0.84 to 0.96) (Figure 7). Two studies used X-ray imaging for the detection of leakage;145,146 when combined, these studies provide a sensitivity of 0.60 (95% CI 0.52 to 0.68) and specificity of 0.74 (95% CI 0.68 to 0.81) for the diagnosis of USI compared with multichannel urodynamics. The positive likelihood ratio associated with the pooled sensitivity and specificity is 2.31 (95% CI 1.62 to 3.00) (Figure 8).

Three studies used ultrasound imaging of BND during stress for the diagnosis of USI in women compared with multichannel urodynamics.98,143,144 The data from these studies were combined to provide a pooled sensitivity of 0.84 (95% CI 0.76 to 0.90) and specificity of 0.86 (95% CI 0.79 to 0.91). The positive likelihood ratio associated with the pooled sensitivity and specificity is 6.00 (95% CI 4.72 to 7.28) and the AUC for the ROC curve corresponding to the pooled DOR is 0.94 (95% CI 0.84 to 1.00) (Figure 9).

Two studies used X-ray to image BND.147,148 The data from these studies resulted in a pooled sensitivity of 0.79 (95% CI 0.67 to 0.88) and specificity of 0.55 (95% CI 0.43 to 0.66). The positive likelihood ratio associated with the pooled sensitivity and specificity is 1.76 (95% CI 0.90 to 2.61) (Figure 10).

Stress test compared with multichannel urodynamics
Six studies were identified that compared the use of a stress test with multichannel urodynamics (Table 32).

All of the studies included only female patients and were performed in a secondary care setting. One study included only nursing home residents, meaning that their sample consisted entirely of elderly women.156

Each of the six papers dealt with the diagnosis of USI. In all cases a positive stress test was defined as leakage coinciding with cough or valsalva.

Two papers used the supine stress test, one with the bladder filled with 200 ml saline,149 the other with an empty bladder.201 Two papers used a standing stress test, both with a full bladder (>200 ml).156,158

One paper performed the stress test in both the supine and standing position with a full bladder.150 One paper performed a cough stress test with the patient sitting in the erect position; however, the diagnosis was also dependent on the result of single-channel urodynamics.17

The quality of reporting of the studies in this group was high. All six papers presented full contingency tables. One paper only provided data for patients who were positive on multichannel urodynamics; therefore, for this study only sensitivity could be calculated.158

Based on advice from the clinical members of the investigation team, data from three papers were combined to provide a pooled sensitivity of 0.85 (95% CI 0.78 to 0.91) and specificity of 0.83 (95% CI 0.74 to 0.90) for the diagnosis of USI in women using the supine clinical stress test compared with multichannel urodynamics (Figure 11). The positive likelihood ratio associated with the pooled sensitivity and specificity is 5.00 (95% CI 3.79 to 6.21) and the AUC for the ROC curve corresponding to the pooled DOR is 0.87 (95% CI 0.69 to 1.00).

Single-channel cystometry compared with multichannel urodynamics
Eight studies were identified that compared the use of single-channel urodynamics with
TABLE 31 X-ray imaging compared with multichannel urodynamics

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Statistical tests | Main findings
Pelsang145 | 159 | F | Secondary | LUTS | USI | Multichannel urodynamics | Bead chain urethrocystography (observed leakage) | Full contingency table | Sensitivity = 0.61; Specificity = 0.70
Grischke147 | 84 | F | Secondary | Symptoms of UI | USI | Multichannel urodynamics | Bead chain urethrocystography (BND) | Full contingency table | Sensitivity = 0.59; Specificity = 0.60
Bergman148 | 59 | F | Secondary | Symptoms of SUI + controls | USI | Multichannel urodynamics | Bead chain urethrocystography (urethrovesical angle) | Full contingency table | Sensitivity = 0.94; Specificity = 0.56 (optimum)
Scotti146 | 204 | F | Secondary | Symptoms of UI | USI | Multichannel urodynamics | Urethroscopy (opening of the urethrovesical junction) | Full contingency table | Sensitivity = 0.60; Specificity = 0.79
Rose200 | 1584 | M | Secondary | LUTS | BOO | Multichannel urodynamics | Micturating cystourethrography (trabeculated bladder) | Full contingency table | Sensitivity = 0.91; Specificity = 0.91

BOO, bladder outlet obstruction.



TABLE 32 Stress test compared with multichannel urodynamics

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Data available | Main findings
Hsu149 | 41 | F | Secondary | Symptoms of UI | USI | Multichannel urodynamics | Supine cough stress test | Full contingency table | Sensitivity = 0.94; Specificity = 0.90
Resnick156 | 97 | F | Secondary | Symptoms of UI (nursing home) | USI | Multichannel urodynamics | Clinical stress test | Full contingency table | Sensitivity = 0.67; Specificity = 0.98
Lobel201 | 304 | F | Secondary | Symptoms of UI | USI | Multichannel urodynamics | Empty supine stress test | Full contingency table | Sensitivity = 0.49; Specificity = 0.95
Kadar150 | 37 | F | Secondary | Symptoms of UI | USI | Multichannel urodynamics | Cough stress test | Full contingency table | Sensitivity = 0.78; Specificity = 0.74
Scotti17 | 145 | F | Secondary | Symptoms of UI | USI | Multichannel urodynamics | Cough stress test + single-channel urodynamics | Full contingency table | Sensitivity = 0.49; Specificity = 0.95
Swift158 | 108 | F | Secondary | LUTS | USI | Multichannel urodynamics | Cough stress test | Sensitivity only | Sensitivity = 0.91
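
The tables in this section report sensitivity and specificity derived from each study's contingency table against the gold standard. As a minimal sketch of how these measures, together with likelihood ratios and the diagnostic odds ratio (DOR), follow from a 2 × 2 table, the following may be helpful. The counts are hypothetical and are not taken from any included study; the class and method names are illustrative only, and no continuity correction is applied for zero cells.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Index test result cross-classified against the gold standard."""
    tp: int  # index test positive, gold standard positive
    fp: int  # index test positive, gold standard negative
    fn: int  # index test negative, gold standard positive
    tn: int  # index test negative, gold standard negative

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def positive_lr(self) -> float:
        # Undefined when specificity is exactly 1 (no false positives).
        return self.sensitivity() / (1.0 - self.specificity())

    def negative_lr(self) -> float:
        return (1.0 - self.sensitivity()) / self.specificity()

    def diagnostic_odds_ratio(self) -> float:
        # Undefined when fp or fn is zero; a 0.5 correction is often added then.
        return (self.tp * self.tn) / (self.fp * self.fn)


# Hypothetical counts, for illustration only.
example = TwoByTwo(tp=45, fp=10, fn=5, tn=40)
print(round(example.sensitivity(), 2), round(example.specificity(), 2))
print(round(example.positive_lr(), 2), round(example.diagnostic_odds_ratio(), 1))
```

A study that recruits only gold-standard-positive patients has no fp or tn cells, which is why only sensitivity can be reported for it.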

[Plot data for the figure captioned below: forest plots, ROC plane and symmetric SROC curve; plot axes are sensitivity against 1 – specificity]

Reference | Sensitivity (95% CI) | Specificity (95% CI)
142 | 0.97 (0.91 to 0.99) | 0.82 (0.65 to 0.93)
141 | 0.94 (0.81 to 0.99) | 0.88 (0.64 to 0.99)
140 | 0.84 (0.74 to 0.91) | 0.76 (0.60 to 0.89)
139 | 0.63 (0.35 to 0.85) | 0.86 (0.64 to 0.97)

(a) Pooled sensitivity = 0.89 (0.84 to 0.93); χ² = 18.17; df = 3 (p = 0.0004)
(b) Pooled specificity = 0.82 (0.73 to 0.89); χ² = 1.48; df = 3 (p = 0.6871)
(d) Symmetric SROC: AUC = 0.9076; SE (AUC) = 0.0288; Q* = 0.8394; SE (Q*) = 0.0316. Pooled (random effect) DOR 36.78 (95% CI 10.19 to 132.75); heterogeneity χ² = 9.36 (df = 3), p = 0.025

FIGURE 7 Pooled random effect results: imaging of observed leakage using ultrasound versus MCU for diagnosis of USI in women. (a) Independently pooled sensitivity; (b) independently pooled
specificity; (c) sensitivity and specificity for each study and pooled estimates plotted in ROC space; (d) pooled DOR (random effect) plotted in ROC space.
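
The AUC and Q* values quoted for the symmetric SROC curves in these figures are linked through the diagnostic odds ratio that defines a symmetric curve. The sketch below, which assumes the standard symmetric SROC form sensitivity = DOR·x / (1 + (DOR – 1)·x) with x = 1 – specificity, shows that relationship; the function names are illustrative and the software used to produce the figures is not assumed here.

```python
import math


def dor_from_q_star(q_star: float) -> float:
    """DOR of the symmetric SROC curve on which sensitivity = specificity = Q*."""
    root = q_star / (1.0 - q_star)
    return root * root


def q_star_from_dor(dor: float) -> float:
    """Point on a symmetric SROC curve where sensitivity equals specificity."""
    root = math.sqrt(dor)
    return root / (1.0 + root)


def symmetric_sroc_auc(dor: float) -> float:
    """Area under sensitivity = DOR*x / (1 + (DOR - 1)*x) for x in [0, 1]."""
    if dor == 1.0:
        return 0.5  # chance diagonal
    return dor / (dor - 1.0) ** 2 * ((dor - 1.0) - math.log(dor))


# Q* values as quoted in Figures 7, 9, 11 and 14.
for q_star in (0.8394, 0.8777, 0.8045, 0.8628):
    print(q_star, round(symmetric_sroc_auc(dor_from_q_star(q_star)), 3))
```

Under this assumption, the quoted Q* of 0.8394 corresponds to an AUC of about 0.908, in line with the AUC reported alongside it.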
[Figure 8 plot data: forest plots and ROC plane; plot axes are sensitivity against 1 – specificity]

Reference | Sensitivity (95% CI) | Specificity (95% CI)
146 | 0.60 (0.49 to 0.71) | 0.79 (0.69 to 0.87)
145 | 0.61 (0.47 to 0.73) | 0.70 (0.60 to 0.79)

(a) Pooled sensitivity = 0.60 (0.52 to 0.68); χ² = 0.00; df = 1 (p = 0.9581)
(b) Pooled specificity = 0.74 (0.68 to 0.81); χ² = 1.82; df = 1 (p = 0.1770)

FIGURE 8 Pooled random effect results: imaging of observed leakage using X-ray versus MCU for diagnosis of USI in women. (a) Independently pooled sensitivity; (b) independently pooled specificity; (c) sensitivity and specificity for each study and pooled estimates plotted in ROC space.

[Plot data for the figure captioned below: forest plots, ROC plane and symmetric SROC curve; plot axes are sensitivity against 1 – specificity]

Reference | Sensitivity (95% CI) | Specificity (95% CI)
144 | 0.94 (0.79 to 0.99) | 0.90 (0.73 to 0.98)
143 | 0.86 (0.73 to 0.95) | 0.96 (0.85 to 0.99)
98 | 0.73 (0.56 to 0.86) | 0.77 (0.65 to 0.86)

(a) Pooled sensitivity = 0.84 (0.76 to 0.90); χ² = 5.90; df = 2 (p = 0.0522)
(b) Pooled specificity = 0.86 (0.79 to 0.91); χ² = 9.08; df = 2 (p = 0.0107)
(d) Symmetric SROC: AUC = 0.9402; SE (AUC) = 0.0583; Q* = 0.8777; SE (Q*) = 0.0739. Pooled (random effect) DOR 49.24 (95% CI 6.27 to 386.46); heterogeneity χ² = 11.96 (df = 2), p = 0.003

FIGURE 9 Pooled random effect results: imaging of bladder neck descent using ultrasound versus MCU for diagnosis of USI in women. (a) Independently pooled sensitivity; (b) independently
pooled specificity; (c) sensitivity and specificity for each study and pooled estimates plotted in ROC space; (d) pooled DOR (random effect) plotted in ROC space.
[Figure 10 plot data: forest plots and ROC plane; plot axes are sensitivity against 1 – specificity]

Reference | Sensitivity (95% CI) | Specificity (95% CI)
148 | 1.00 (0.89 to 1.00) | 0.44 (0.25 to 0.65)
147 | 0.59 (0.41 to 0.75) | 0.60 (0.45 to 0.74)

(a) Pooled sensitivity = 0.79 (0.67 to 0.88); χ² = 22.14; df = 1 (p = 0.0000)
(b) Pooled specificity = 0.55 (0.43 to 0.66); χ² = 1.71; df = 1 (p = 0.1909)

FIGURE 10 Pooled random effect results: imaging of bladder neck descent using X-ray versus MCU for diagnosis of USI in women. (a) Independently pooled sensitivity; (b) independently pooled specificity; (c) sensitivity and specificity for each study and pooled estimates plotted in ROC space.

[Plot data for the figure captioned below: forest plots, ROC plane and symmetric SROC curve; plot axes are sensitivity against 1 – specificity]

Reference | Sensitivity (95% CI) | Specificity (95% CI)
17 | 0.84 (0.74 to 0.91) | 0.84 (0.73 to 0.92)
150 | 0.78 (0.52 to 0.94) | 0.74 (0.49 to 0.91)
149 | 0.94 (0.79 to 0.99) | 0.90 (0.55 to 1.00)

(a) Pooled sensitivity = 0.85 (0.78 to 0.91); χ² = 2.89; df = 2 (p = 0.2352)
(b) Pooled specificity = 0.83 (0.74 to 0.90); χ² = 1.52; df = 2 (p = 0.4683)
(d) Symmetric SROC: AUC = 0.8741; SE (AUC) = 0.0997; Q* = 0.8045; SE (Q*) = 0.0992. Pooled (random effect) DOR 25.42 (95% CI 8.66 to 74.59); heterogeneity χ² = 3.21; df = 2 (p = 0.201)

FIGURE 11 Pooled random effect results: clinical stress test versus MCU for diagnosis of USI in women. (a) Independently pooled sensitivity; (b) independently pooled specificity; (c) sensitivity
and specificity for each study and pooled estimates plotted in ROC space; (d) pooled DOR (random effect) plotted in ROC space.

multichannel urodynamics (Table 33). Six of the studies used only female patients,17,151–153,156,202 whereas two studies used both male and female patients.154,155 All studies were conducted in a secondary care setting. Three studies investigated elderly populations (older than 70, 60 and 65 years, respectively).154–156

Six studies were concerned only with the diagnosis of DO151–155,202 and two studies with USI.17,156 The criterion standard used in each of the eight studies was standard multichannel urodynamics. In addition, one study used videoimaging as part of the multichannel urodynamic procedure.156

Full contingency tables were provided for all papers, allowing pooling of data. One study used only urodynamically confirmed patients and therefore only sensitivity could be calculated.202

After clinical advice, data from two papers were combined to provide a pooled sensitivity of 0.74 (95% CI 0.65 to 0.82) and specificity of 0.77 (95% CI 0.66 to 0.86) for the diagnosis of DO in elderly women using supine single-channel cystometry (Figure 12). The positive likelihood ratio associated with the pooled sensitivity and specificity is 12 (95% CI 10.58 to 13.42) and the AUC for the ROC curve corresponding to the pooled DOR is 0.92 (95% CI 0.80 to 1.00). Data from the same two papers were combined to provide a pooled sensitivity of 0.89 (95% CI 0.76 to 0.96) and specificity of 0.92 (95% CI 0.62 to 1.00) for the diagnosis of DO in elderly men using supine single-channel cystometry. The positive likelihood ratio associated with the pooled sensitivity and specificity is 18.2 (95% CI 12.62 to 23.78) (Figure 13).

Data from three papers were combined to provide a pooled sensitivity of 0.63 (95% CI 0.55 to 0.71) and specificity of 0.88 (95% CI 0.84 to 0.92) for the diagnosis of DO in women using standing single-channel cystometry (Figure 14).

Ambulatory urodynamics compared with multichannel urodynamics
Ambulatory urodynamic monitoring is the monitoring of leakage, flow recordings and pressure in the bladder and abdomen, with or without pressure in the urethra, in an ambulatory setting.113

Six studies compared the use of ambulatory urodynamics with standard multichannel urodynamics (Table 34). One paper was concerned with the diagnosis of USI in women,157 one with the diagnosis of BOO in males.203 The other four were concerned with the diagnosis of DO: one in female patients,110 one in male patients204 and two in mixed populations.103,205

Owing to the variability in this group it is not possible to combine the data from any of these studies. The sensitivities and specificities demonstrated by these studies are heterogeneous. It is not possible, therefore, to draw any conclusions about the efficacy of ambulatory urodynamics.

There is an issue with ambulatory urodynamics, in that it is thought by some experts to be more sensitive than standard multichannel urodynamics and should be the true gold standard for the diagnosis of urinary incontinence. However, the view of the ICS is that ambulatory urodynamics is overly sensitive but not very specific in detecting urinary leakage. Ambulatory urodynamics has not been standardised by the ICS and therefore cannot be recommended for routine clinical practice. The International Consultation on Incontinence group on urodynamics in 2002 concluded that further study of the place and advantages of ambulatory monitoring was necessary.9

Urethral pressure profile compared with multichannel urodynamics
Five studies investigated the use of the urethral pressure profile (UPP) for the diagnosis of USI (Table 35). Each study included female patients and was carried out in a secondary care setting.

The data from two studies were combined to provide a pooled sensitivity of 0.62 (95% CI 0.52 to 0.72) and specificity of 0.70 (95% CI 0.61 to 0.77) for the diagnosis of USI in women by UPP in the sitting position (Figure 15).

Flow-rate acceleration compared with multichannel urodynamics
One paper compared the use of flow-rate acceleration for the diagnosis of DO with multichannel urodynamics (Table 36). Forty female patients with symptoms of urinary incontinence were studied. Flow-rate acceleration was found to be 0.75 sensitive and specific for the diagnosis of DO.

Cystometry by foetal monitoring compared with multichannel urodynamics
Two studies investigated the accuracy of cystometry using the intrauterine pressure channel of a foetal monitor compared with multichannel

TABLE 33 Single-channel urodynamics compared with multichannel urodynamics

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Data available | Main findings
Scotti17 | 145 | F | Secondary | Symptoms of UI | USI | Multichannel urodynamics | Single-channel cystometry (sitting) + cough stress test | Full contingency table | Sensitivity = 0.49; Specificity = 0.95
Resnick156 | 97 | F | Secondary | Symptoms of UI (nursing home) | USI | Multichannel urodynamics | Single-channel cystometry (supine) | Full contingency table | Sensitivity = 0.86; Specificity = 0.86
Sand151 | 100 | F | Secondary | Symptoms of UI | DO | Multichannel urodynamics | Single-channel cystometry (standing) | Full contingency table | Sensitivity = 0.84; Specificity = 0.69
Sand152 | 218 | F | Secondary | LUTS | DO | Multichannel urodynamics | Single-channel cystometry (supine and standing) | Full contingency table | Supine: Sensitivity = 0.25; Specificity = 0.94. Standing: Sensitivity = 0.59; Specificity = 0.82
Sutherst153 | 100 | F | Secondary | Symptoms of UI | DO | Multichannel urodynamics | Single-channel cystometry (supine and standing) | Full contingency table | Sensitivity = 1.00; Specificity = 0.83
Fonda154 | 70 | Mixed | Secondary | Symptoms of UI (>60 years) | DO | Multichannel urodynamics | Single-channel cystometry (supine) | Full contingency table | M: Sensitivity = 0.95; Specificity = 1.00. F: Sensitivity = 0.81; Specificity = 0.68
Ouslander155 | 264 | Mixed | Secondary | Symptoms of UI (>65 years) | DO | Multichannel urodynamics | Single-channel cystometry (supine) | Full contingency table | M: Sensitivity = 0.84; Specificity = 0.83. F: Sensitivity = 0.73; Specificity = 0.81
Hebert202 | 47 | F | Secondary | DO (positive urodynamics) | DO | Multichannel urodynamics | Single-channel cystometry (supine) | Sensitivity only | Sensitivity = 0.74
[Figure 12 plot data: forest plots and ROC plane; plot axes are sensitivity against 1 – specificity]

Reference | Sensitivity (95% CI) | Specificity (95% CI)
155 | 0.73 (0.62 to 0.82) | 0.81 (0.68 to 0.91)
154 | 0.81 (0.58 to 0.95) | 0.68 (0.45 to 0.86)

(a) Pooled sensitivity = 0.74 (0.65 to 0.82); χ² = 0.64; df = 1 (p = 0.4231)
(b) Pooled specificity = 0.77 (0.66 to 0.86); χ² = 1.42; df = 1 (p = 0.2326)

FIGURE 12 Pooled random effect results: supine single-channel urodynamics (SCU) versus MCU for diagnosis of DO in women over 60. (a) Independently pooled sensitivity; (b) independently pooled specificity; (c) sensitivity and specificity for each study and pooled estimates plotted in ROC space.

[Plot data for the figure captioned below: forest plots and ROC plane; plot axes are sensitivity against 1 – specificity]

Reference | Sensitivity (95% CI) | Specificity (95% CI)
155 | 0.84 (0.64 to 0.95) | 0.83 (0.36 to 1.00)
154 | 0.95 (0.76 to 1.00) | 1.00 (0.54 to 1.00)

(a) Pooled sensitivity = 0.89 (0.76 to 0.96); χ² = 1.60; df = 1 (p = 0.2054)
(b) Pooled specificity = 0.92 (0.62 to 1.00); χ² = 1.48; df = 1 (p = 0.2242)

FIGURE 13 Pooled random effect results: supine SCU versus MCU for diagnosis of DO in men over 60. (a) Independently pooled sensitivity; (b) independently pooled specificity; (c) sensitivity and
specificity for each study and pooled estimates plotted in ROC space.
[Figure 14 plot data: forest plots, ROC plane and symmetric SROC curve; plot axes are sensitivity against 1 – specificity]

Reference | Sensitivity (95% CI) | Specificity (95% CI)
153 | 1.00 (0.90 to 1.00) | 0.89 (0.79 to 0.96)
152 | 0.25 (0.14 to 0.37) | 0.94 (0.89 to 0.98)
151 | 0.84 (0.71 to 0.93) | 0.69 (0.55 to 0.82)

(a) Pooled sensitivity = 0.63 (0.55 to 0.71); χ² = 80.95; df = 2 (p = 0.0000)
(b) Pooled specificity = 0.88 (0.84 to 0.92); χ² = 18.63; df = 2 (p = 0.0001)
(d) Symmetric SROC: AUC = 0.9281; SE (AUC) = 0.0629; Q* = 0.8628; SE (Q*) = 0.0749. Pooled (random effect) DOR 19.03 (95% CI 3.34 to 108.5); heterogeneity χ² = 10.64 (df = 2), p = 0.005

FIGURE 14 Pooled random effect results: standing SCU versus MCU for diagnosis of DO in women. (a) Independently pooled sensitivity; (b) independently pooled specificity; (c) sensitivity and specificity for each study and pooled estimates plotted in ROC space; (d) pooled DOR (random effect) plotted in ROC space.
TABLE 34 Ambulatory urodynamics compared with multichannel urodynamics

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Data available | Main findings
Davis157 | 60 | F | Military | Symptoms of UI + controls | USI | Multichannel urodynamics | Ambulatory urodynamics | Full contingency table | Sensitivity = 0.78; Specificity = 0.07
Rosario203 | 63 | M | Secondary | Borderline obstruction | BOO | Multichannel urodynamics | Ambulatory urodynamics | Full contingency table | Sensitivity = 0.25; Specificity = 0.88
McInerney103 | 20 | Mixed | Secondary | DO symptoms | DO | Multichannel urodynamics | Ambulatory urodynamics | Full contingency table | Sensitivity = 1.00; Specificity = 0.58
Bhatia204 | 26 | M | Secondary | LUTS | DO | Multichannel urodynamics | Ambulatory urodynamics | Full contingency table | Sensitivity = 0.43; Specificity = 0.58
Webb205 | 52 | Mixed | Secondary | LUTS (all negative urodynamics) | DO | Multichannel urodynamics | Ambulatory urodynamics | Specificity only | Specificity = 0.60
Davila110 | 27 | F | Secondary | Symptoms of DO | DO | Multichannel urodynamics | Ambulatory urodynamics | Agreement between two methods | More cases identified by ambulatory urodynamics
TABLE 35 Urethral pressure profile compared with multichannel urodynamics

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Data available | Main findings
Swift158 | 108 | F | Secondary | LUTS | USI | Multichannel urodynamics | UPP (sitting) | Full contingency table | Sensitivity = 0.49; Specificity = 0.98
Richardson159 | 144 | F | Secondary | LUTS | USI | Multichannel urodynamics | UPP (sitting) | Full contingency table | Supine: Sensitivity = 0.32; Specificity = 0.93. Standing: Sensitivity = 0.41; Specificity = 0.92
Versi160 | 172 | F | Secondary | Symptoms of USI | USI | Multichannel urodynamics | UPP (supine) | Full contingency table | Sensitivity = 0.77; Specificity = 0.79
Pajoncini206 | 119 | F | Secondary | USI (positive urodynamics) | USI | Multichannel urodynamics | VLPP; MUCP | Sensitivity and specificity only | VLPP: Sensitivity = 0.84; Specificity = 0.60. MUCP: Sensitivity = 0.63; Specificity = 0.52
Versi207 | 303 | F | Secondary | Symptoms of UI | USI | Multichannel urodynamics | UPP (supine) | Sensitivity and specificity only | Sensitivity = 0.48; Specificity = 0.84

VLPP, vesical leak point pressure; MUCP, maximum urethral closure pressure.

TABLE 36 Flow-rate acceleration measurement compared with multichannel urodynamics

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Data available | Main findings
Cucchi104 | 40 | F | Secondary | Symptoms of UI | DO | Multichannel urodynamics | Flow-rate acceleration | Individual patient data | Sensitivity = 0.75; Specificity = 0.75

[Plot data for the figure captioned below: forest plots and ROC plane; plot axes are sensitivity against 1 – specificity]

Reference | Sensitivity (95% CI) | Specificity (95% CI)
159 | 0.86 (0.70 to 0.95) | 0.58 (0.48 to 0.68)
158 | 0.49 (0.37 to 0.62) | 0.98 (0.88 to 1.00)

(a) Pooled sensitivity = 0.62 (0.52 to 0.72); χ² = 14.01; df = 1 (p = 0.0002)
(b) Pooled specificity = 0.70 (0.61 to 0.77); χ² = 29.61; df = 1 (p = 0.0000)

FIGURE 15 Pooled random effect results: UPP versus MCU for diagnosis of USI in women. (a) Independently pooled sensitivity; (b) independently pooled specificity; (c) sensitivity and specificity
for each study and pooled estimates plotted in ROC space.

urodynamics (Table 37). Both studies were concerned with the diagnosis of DO in women in secondary care. Because of the form in which the data were presented in these studies and the homogeneous nature, the results were combined to provide a pooled sensitivity of 0.92 (95% CI 0.76 to 0.98) and specificity of 0.89 (95% CI 0.78 to 0.94).

Ice-water test compared with multichannel urodynamics
One paper studied the use of the ice-water test for the diagnosis of detrusor overactivity, specifically with regard to distinguishing this condition from detrusor hyperreflexia (Table 38). The ice-water test was found to have a sensitivity of 0.85 and a specificity of 0.65 when diagnosing DO. This study was performed in a very specific population where 82% of the sample had a neurological disease; therefore, the applicability of the results may be restricted.

Fluid-bridge test compared with standard cystometry
One study compared the fluid-bridge test with standard cystometry for the diagnosis of USI in women (Table 39). A sensitivity of 0.86 and specificity of 0.42 were demonstrated when the test was performed in the supine position, and 1.00 (sensitivity) and 0.24 (specificity) when in the erect position. The fact that there is only one paper studying this test and that this was published in 1981 indicates that this is not a test of great relevance to clinicians.

Urethral closure pressure profile compared with the clinical stress test
One paper studied the ability of a UPP to diagnose USI in women compared with the clinical stress test (Table 40). Measurement of UPP was found to have a sensitivity of 0.93 and specificity of 0.83; however, this test was not compared with the recognised gold standard of multichannel urodynamics.

Stop test compared with single-channel cystometry
One study compared the use of the stop test with single-channel cystometry for the diagnosis of DO in women (Table 41). This test was found to be 0.95 sensitive and 0.66 specific.

TABLE 37 Cystometry by foetal monitor compared with multichannel urodynamics

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Data available | Main findings
Swift208 | 66 | F | Secondary | Symptoms of UI | DO | Multichannel urodynamics | Cystometry by foetal monitor | Full contingency table | Sensitivity = 0.91; Specificity = 0.86
Bergman209 | 35 | F | Secondary | Symptoms of UI | DO | Multichannel urodynamics | Cystometry by foetal monitor | Full contingency table | Sensitivity = 1.00; Specificity = 0.96

TABLE 38 Ice-water test compared with multichannel urodynamics

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Statistical tests | Main findings
Petersen210 | 130 | Mixed | Secondary | Symptoms of UI | DO | Multichannel urodynamics | Ice-water test | Full contingency table | Sensitivity = 0.85; Specificity = 0.65

TABLE 39 The fluid-bridge test compared with standard cystometry

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Statistical tests | Main findings
Sutherst211 | 127 | F | Secondary | Symptoms of UI | USI | Cystometry | Fluid-bridge test (supine and erect) | Full contingency table | Supine: Sensitivity = 0.86; Specificity = 0.42. Erect: Sensitivity = 1.00; Specificity = 0.24

TABLE 40 Urethral closure pressure profile compared with the clinical stress test

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Statistical tests | Main findings
Hanzal212 | 981 | F | Secondary | Symptoms of UI | USI | Clinical stress test | UPP | Full contingency table | Sensitivity = 0.93; Specificity = 0.83

TABLE 41 Stop test compared with single-channel cystometry

Reference | n | Gender | Care setting | Population | Type of incontinence | Gold standard | Index test | Statistical tests | Main findings
Frigerio213 | 112 | F | Secondary | Symptoms of UI | DO | Single-channel cystometry | Stop test | Full contingency table | Sensitivity = 0.95; Specificity = 0.66


Chapter 4
Economic modelling
Introduction

Diagnostic techniques for urinary symptoms, like the majority of healthcare interventions, have a potential to consume healthcare resources. These resources would otherwise be available for alternative forms of healthcare. If the use of a diagnostic test is to be justified then the benefits received need to exceed the costs incurred in carrying out this test. This study aimed to examine the cost-effectiveness of diagnostic techniques for urinary symptoms from a primary care perspective, as this is where most clinical/nursing assessments are undertaken. These tests are likely to have resource implications, as there are costs, such as primary care practitioner time, in carrying them out. In addition, the results of these tests, both positive and negative, are likely to have consequences in terms of other care received.

The framework within which any primary care-based diagnostic test would be used is outlined in Figure 1. As can be seen from this diagram, there is no simple relationship where individuals receive diagnostic tests and actions are taken on the basis of the results of these tests. Treatment and testing are linked in this framework as individuals under primary care management may receive treatment from their primary care practitioner and may only be referred to specialist assessment and care if there is no improvement with primary treatment. Ideally, an economic model in this framework would consider all tests and treatments received as a common part of the process of improving health. All resources used would be costed and the outcome measure would be health related, for example quality-adjusted life-years (QALYs). This would enable comparison with a wide range of other healthcare situations. However, a number of problems precluded this approach. These all related to the availability of data and the original remit of the project (which did not consider the results of treatment, only of diagnostic tests). No sufficiently reliable data were found to enable evaluation of the effectiveness of all treatments that could potentially be received by individuals on a common framework. In addition, information was not available on the QALY gains obtained from successful treatment of urinary symptoms or of the QALY changes due to changes in urinary symptoms caused by successful treatment. One possible solution would have been to use expert opinion as to the QALY change caused by successful treatment. However, it was felt this would be unlikely to generate feasible values because of the uncertainty involved. The expert would need an opinion on the type of treatments likely to be carried out in primary care. They would then need to form an opinion of the effectiveness of these treatments in reducing symptoms and the QALY change caused by these symptoms. The final level of uncertainty would be that they are giving a proxy value of the QALY change, that is, what they believe would be the value that an individual would put on a change in their urinary symptoms. Because of these factors it was not felt that this approach would be appropriate or credible. Finally, there were insufficient data to estimate the proportions of individuals who would have any particular test or treatment and who would be referred to and from primary care and GP/specialist care. For these reasons this type of model was always considered outside the scope of the current project.

Therefore, a limited approach to the economic evaluation was taken. A cost-effectiveness study was conducted where the measure of effectiveness was limited to how well the test detected any of the underlying urinary conditions that an individual may have. It was also assumed that positives from these diagnostic techniques could then be referred to secondary specialist assessment. By this means, an attempt was made to isolate the diagnosis of urinary conditions from the rest of the treatment pathway. This enabled judgements to be made about the accuracy and cost-effectiveness of different diagnostic techniques in diagnosing urinary conditions.

Methods

Population groups considered
Although these tests can be used in the diagnosis of USI and DO in men and women, the evidence from the systematic review related to their use in women. Therefore, the models constructed were specific for women and not men. An inclusion criterion for the review was adults only; in


addition, studies were excluded if they studied a purely elderly population.

Alternative diagnostic test strategies
Four alternative diagnostic test strategies on which some data were available were considered. These were history-taking, history and a 48-hour pad test, history and validated scales, and history and urinary diary. As all individuals were assumed to have a history taken, the additional costs and accuracy of the 48-hour pad test, validated scale and urinary diary compared with history alone were evaluated. Evidence from the systematic review showed that history could be used to diagnose both USI and DO. There was also evidence that both a 48-hour pad test and validated scales could be used to diagnose USI. As there was no evidence on the effectiveness of these tests in detecting DO, the assumption was made that they would only be used to diagnose USI. Similarly, there was evidence that a urinary diary was useful for diagnosing DO, but no evidence regarding its use for diagnosing USI. Therefore, the urinary diary was only considered as a diagnostic tool for DO.

The economic model
The model of the cost-effectiveness of primary care diagnostic tests is set in the context of primary care management, where much of the assessment, diagnosis and treatment of urinary conditions is undertaken. The viewpoint of the model is from the perspective of a healthcare provider. The diagnostic strategies evaluated are outlined in Figure 16. Diagnosis can be made by the primary care practitioner using history only. In addition to this, other diagnostic strategies are available. These include any of the following: 48-hour pad test, validated scales and urinary diary. For each of these strategies the model has the structure outlined in Figure 17. The individual has a true condition, which is unknown by the primary care practitioner who carries out the diagnostic tests; in this model the condition may be USI, DO or both (here referred to as mixed). In addition, an individual may have neither of these conditions. In all cases the model structure is the same; only the probabilities of entering any branch, and the payoffs at the end nodes, will change. Therefore, only the model structure when the individual's true condition is USI is shown in Figure 17. Regardless of an individual’s true condition, primary care tests can declare they have USI, DO, mixed or no condition. If an individual has any of these diagnoses they may then be referred for a specialist secondary assessment.

Parameters used in the model
Cost variables
The cost variables used are given in Table 42. In addition to the mean values, the distribution parameters assigned to each cost variable are given; these represent the uncertainty involved in estimation for each parameter. For cost variables log-normal distributions were used as these only return positive values. Furthermore, these distributions are often used for cost data as they have a skewed distribution; this reflects the fact that cost data often have a positive skewed distribution, with a small number of high cost estimates giving distributions a long tail.214 Table 42 also provides details of the derivation of each parameter. The cost of carrying out pad tests, validated scales and diaries included consumables costs and any extra time required from the practitioner. This information was obtained from two experts in providing these forms of nursing services. The experts were asked to provide lists of all consumables required to carry out tests. They were also asked for estimates of any extra time that would be required to perform tests. Cost estimates for the tests were

Diagnostic strategy options
    Nurse evaluates with history only
    Nurse evaluates with history and with 48-hour pad test
    Nurse evaluates with history and validated scales
    Nurse evaluates with history and urinary diary

FIGURE 16 Treatment options considered



FIGURE 17 Model structure when history only is used as a diagnostic tool. Decision tree: diagnosis is made on the basis of history alone; the individual has USI, DO, mixed or no condition. For an individual with USI, the primary care tests declare that the individual has USI, DO, both USI and DO, or no condition, and after each declared result the individual is either referred or not referred to specialist secondary assessment. For individuals with DO, mixed or no condition, the structure is the same as for USI only.

then derived from these estimates; further information is given in Table 42. All cost variables are in UK pounds for 2002. For all strategies an individual is assumed to have a nurse consultation and the time taken to take a history and evaluate it is included as part of the duration of this consultation.

Outcome variables
The aim was to compare how well primary care tests performed in detecting the underlying conditions causing urinary symptoms. The measure of effectiveness was therefore the number of individuals who had at least one of their conditions successfully detected by a primary care test. The outcomes considered are outlined in Table 43.

Prevalence
The measure of the prevalence of urinary conditions was taken from an investigation of the relationship between symptoms reported in a urinary questionnaire and urodynamic diagnosis by Matharu and colleagues.13 In this study, individuals who reported symptoms in a postal questionnaire were invited to attend a randomised clinical trial comparing a nurse-led continence service with GP management. At the end of this trial individuals who had not responded to treatment were invited to attend urodynamics. The individuals who had urodynamics were therefore either the more severe cases or those whose condition was least responsive to treatment. This may mean that the numbers reported by Matharu are not representative of the proportions of individuals with each condition that would be found in a primary care setting. However, urodynamics would not be routinely used on this type of population, so these data are unavailable. As Matharu and colleagues considered individuals who were appropriate for primary care treatment, these were felt to be the best available data. The following prevalences were reported: USI 0.336

© Queen’s Printer and Controller of HMSO 2006. All rights reserved.



TABLE 42 Cost variables used in analysis (2002 UK pounds sterling)

Variable   Mean cost (SE)   Distribution   Derivation

Cost of validated scales   3.75 (0.34)   Log normal   Time taken from expert opinion. Cost of time taken from midpoint of grade F salary scale with on-costs (a). Distribution taken from assumption that high and low point of estimates given approximate to 95% CI
Cost of diary   3.75 (0.34)   Log normal   Time taken from expert opinion. Cost of time taken from midpoint of grade F salary scale with on-costs (a). Distribution taken from assumption that high and low point of estimates given approximate to 95% CI
Costs of nurse consultation   18.17 (0.09)   Log normal   Average length of time taken from a database obtained from a locally conducted trial of a continence nurse practitioner-led service.116 Nurse pay was taken as midpoint of grade F (a). Overhead rate of 37% applied
Costs of pad test   4.06 (0.56)   Log normal   Time taken from expert opinion. Cost of time taken from midpoint of grade F salary scale with on-costs (a). Distribution taken from assumption that high and low point of estimates given approximate to 95% CI. Cost of consumables obtained from local service providers
Cost of first referral to urology department   56.22 (6.89)   Log normal   Obtained from NHS reference cost215
Cost of urodynamics   125.10 (16.71)   Log normal   Obtained from NHS reference cost.215 This value used for sensitivity analysis

(a) Costs of nursing time are increased to take into account face-to-face contact only. This involved calculating the proportion of all time that involved face-to-face contact by means of nurse-completed diaries. This information was then used to generate a multiplier and the average cost per minute for all nurse time was increased by this multiplier.
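
As an illustration of how a log-normal cost distribution could be set up from the means and standard errors in Table 42, the sketch below matches the first two moments. This moment-matching step is an assumption made for the example; the report states only that log-normal distributions were used, and the function name `lognormal_params` is hypothetical.

```python
import math
import random


def lognormal_params(mean, se):
    """Return (mu, sigma) of the underlying normal so that the log-normal
    distribution has the given arithmetic mean and standard deviation."""
    sigma_sq = math.log(1.0 + (se / mean) ** 2)
    mu = math.log(mean) - sigma_sq / 2.0
    return mu, math.sqrt(sigma_sq)


# Cost of first referral to urology department (Table 42): mean 56.22, SE 6.89.
mu, sigma = lognormal_params(56.22, 6.89)

rng = random.Random(2006)  # arbitrary fixed seed so the draws are repeatable
draws = [rng.lognormvariate(mu, sigma) for _ in range(10_000)]
sample_mean = sum(draws) / len(draws)
```

Every draw is positive and the sample is right-skewed, which is the behaviour the text gives as the reason for choosing this distribution for costs.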

TABLE 43 Payoffs from diagnosis

Underlying condition   Diagnosis from primary care tests (history and any additional tests carried out)   Outcome

USI   USI   1
USI   DO   0
USI   Mixed   1
USI   None   0
DO   USI   0
DO   DO   1
DO   Mixed   1
DO   None   0
Mixed   USI   1
Mixed   DO   1
Mixed   Mixed   1
Mixed   None   0
None   USI   0
None   DO   0
None   Mixed   0
None   None (a)   0

(a) In the case of no condition, where the diagnosis of no condition is made, a payoff of zero is recorded even though the correct diagnosis is made. This is because the measure of effectiveness is individuals with any condition correctly diagnosed. An individual cannot have a condition correctly diagnosed if they have no condition to be diagnosed.
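
The payoff rule in Table 43 can be written compactly: an outcome of 1 is recorded when the diagnosis names at least one condition that the individual actually has. The sketch below encodes that reading; the function and dictionary names are illustrative only.

```python
# Conditions implied by each label; "mixed" means both USI and DO.
COMPONENTS = {
    "USI": {"USI"},
    "DO": {"DO"},
    "mixed": {"USI", "DO"},
    "none": set(),
}


def payoff(underlying, diagnosis):
    """Return 1 if the diagnosis detects at least one condition present."""
    return 1 if COMPONENTS[underlying] & COMPONENTS[diagnosis] else 0


# Reproduce the 16 rows of Table 43 as a nested dictionary.
table_43 = {
    u: {d: payoff(u, d) for d in COMPONENTS}
    for u in COMPONENTS
}
```

Because an individual with no condition has nothing to detect, every entry in the `none` row is zero, in line with the table footnote.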

TABLE 44 Performance of primary care tests

Variable   Mean value (95% CI)   Distribution (α and β parameters)   Derivation

Sensitivity of history for USI   0.92 (0.91 to 0.93)   Beta (2600 and 226)   Systematic review
Specificity of history for USI   0.56 (0.53 to 0.60)   Beta (432 and 340)   Systematic review
Sensitivity of history for DO   0.61 (0.57 to 0.65)   Beta (348 and 222)   Systematic review
Specificity of history for DO   0.87 (0.85 to 0.89)   Beta (944 and 141)   Systematic review
Sensitivity of pad test for USI   0.92 (0.82 to 0.97)   Beta (45.3 and 3.9)   Systematic review
Specificity of pad test for USI   0.72 (0.57 to 0.83)   Beta (32.3 and 12.6)   Systematic review
Sensitivity of diary for DO   0.88 (0.71 to 0.96)   Beta (22.0 and 3.0)   Systematic review
Specificity of diary for DO   0.83 (0.77 to 0.87)   Beta (179.1 and 36.7)   Systematic review
Sensitivity of scales for USI   0.87 (0.82 to 0.92)   Beta (150 and 22.5)   Systematic review
Specificity of scales for USI   0.60 (0.51 to 0.69)   Beta (68 and 45.1)   Systematic review
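
The parameter pairs in Table 44 can be checked against the reported means, since a beta distribution with parameters α and β has mean α/(α + β). The sketch below does this and draws a percentile interval by simulation; the helper names, seed and sample size are choices made for the example, not details from the report.

```python
import random

# (alpha, beta, reported mean) copied from Table 44.
TABLE_44 = {
    "sens_history_USI": (2600, 226, 0.92),
    "spec_history_USI": (432, 340, 0.56),
    "sens_history_DO": (348, 222, 0.61),
    "spec_history_DO": (944, 141, 0.87),
    "sens_pad_USI": (45.3, 3.9, 0.92),
    "spec_pad_USI": (32.3, 12.6, 0.72),
    "sens_diary_DO": (22.0, 3.0, 0.88),
    "spec_diary_DO": (179.1, 36.7, 0.83),
    "sens_scales_USI": (150, 22.5, 0.87),
    "spec_scales_USI": (68, 45.1, 0.60),
}


def beta_mean(alpha, beta):
    """Mean of a beta distribution."""
    return alpha / (alpha + beta)


def percentile_interval(alpha, beta, n=10_000, seed=1):
    """Approximate 2.5th and 97.5th percentiles from n random draws."""
    rng = random.Random(seed)
    draws = sorted(rng.betavariate(alpha, beta) for _ in range(n))
    return draws[int(0.025 * n)], draws[int(0.975 * n) - 1]


means = {name: beta_mean(a, b) for name, (a, b, _) in TABLE_44.items()}
low, high = percentile_interval(22.0, 3.0)  # sensitivity of diary for DO
```

Larger parameter values, such as those for history, give narrower intervals than the small values used for the diary and pad test.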

(95% CI 0.294 to 0.378), DO 0.291 (95% CI 0.251 to 0.331), mixed 0.207 (95% CI 0.171 to 0.243) and no condition in 0.166 (95% CI 0.133 to 0.199).13 Mean values were reported in Matharu13 and the confidence intervals were obtained from one of the authors (Matthews R, University of Leicester: personal communication, 2003). For each sample the probability of these four parameters had to add up to 1, as they were mutually exclusive events, one of which always had to occur. For this reason the four probabilities were varied randomly around their means (using beta-distributions as these distributions are bounded between 0 and 1), but the sum of these four variables was re-based always to equal 1.

Effectiveness of primary care diagnostic tests
The estimates of performance of primary care tests in detecting USI and DO were obtained from the current systematic review. These are detailed in Table 44. Again, these variables were assumed to have beta-distributions. Where two tests are used together, for example history and pad test, the results are assumed to be independent, that is, the probability of 48-hour pad tests correctly diagnosing an individual is unrelated to the probability that history detected this condition.

Interpretation of test results
Where more than one test is used there may be situations where different tests give contradictory results. For example, history may declare an individual positive for USI while an individual tests negative using a 48-hour pad test. For these situations a decision rule was needed as to the results of the combination of the two tests. In this analysis the assumption was that if either test was positive then that individual was considered to have tested positive for that condition. For example, if an individual tested positive for USI using one test but negative using another they would still be considered positive for USI. If they tested positive for USI using one test and positive for DO on a different test they would be considered as positive for mixed.

Referral to specialist care
One important consequence of diagnostic tests in primary care is likely to be referrals to specialist secondary care assessment. The authors had no access to data that indicated the proportions of individuals in primary care services who would have referrals after positive results from a primary care test. Estimating these proportions by means of expert opinion proved problematic, as referral to specialist secondary assessment would depend on individual characteristics in each case. Two specialists in clinical care were asked their opinion on the proportions referred to secondary care. However, neither expert felt able to give referral rates because they felt that this would be so dependent on individual circumstances. There is a further complication here as the ability for primary care practitioners to refer individuals to secondary care may be influenced by supply constraints in secondary care, for example a limited capacity to carry out urodynamics. Therefore, what constitutes sufficient grounds to refer may differ from one location to another depending on local capacity.

Because of these factors no data were available that indicated the probability of an individual being referred to specialist secondary assessment given different primary care diagnostic test results. For this reason analyses were carried out using two extreme cases. In the first case none of the individuals with a positive diagnosis for USI or DO or both would be referred to specialist


secondary assessment; in the second case all individuals with a positive diagnosis for any of these conditions would be referred. A midpoint analysis was also evaluated; in this case 50% of all positive diagnoses would be referred to specialist secondary assessment. It was assumed that individuals who tested negative on all tests used would not be referred to specialist secondary assessment.

Model evaluation
The primary analysis was carried out using second-order Monte Carlo simulation. This approach assigns a distribution to model parameters. Random values from those distributions are taken for each sample of the Monte Carlo simulation and a cost-effectiveness result is generated based on these values. The model was evaluated using 10,000 samples for each simulation. The model was constructed and evaluated using Microsoft Excel. This probabilistic analysis allowed confidence intervals around costs and effects to be generated. As history should be taken in all cases the research question was the additional costs and effects of further tests in addition to history. Therefore, an incremental analysis was calculated. This gives the extra costs generated by strategies involving additional tests compared with the costs of history-taking alone. The extra proportion of individuals who have any of their conditions correctly diagnosed was also calculated. Finally, the extra costs were calculated per extra individual with any condition correctly diagnosed for strategies involving history and another test compared with history alone. The results of these analyses are presented in cost-effectiveness acceptability curves that track the changing percentage of samples that are cost-effective given different values for detecting cases of urinary disorder.

Probabilistic sensitivity analysis
In addition to the above analysis, an additional probabilistic analysis was performed where it was assumed that individuals who were referred to specialist secondary assessment would also receive urodynamics. This would have two effects. First, it would increase the costs associated with individual diagnoses; and second, referral to urodynamics would also result in more cases being correctly diagnosed, because urodynamics is assumed to be a reference standard. It was therefore assumed that any individual referred to urodynamics, even on the basis of an incorrect diagnosis in primary care, would then be correctly diagnosed.

Deterministic sensitivity analysis
In addition to the probabilistic sensitivity analysis, a series of analyses are presented that involve single parameters being varied through a range of values to estimate the effect that different parameters have on the results of the model. These analyses are evaluated deterministically, that is, only mean values are used to parameterise the model. These will be referred to here as one-way sensitivity analyses.

Results

The probability of individuals being detected as having each diagnosis and their underlying condition is given in Tables 45–48. As can be seen from Table 45, history accurately detects USI, with 80% of individuals correctly diagnosed as having USI. It performs much less effectively in detecting DO and in identifying individuals who are mixed compared with USI. History also diagnoses correctly only about 50% of individuals who have no condition. It can be seen from Tables 46 and 48 that history, in combination with pad tests or validated scales, performs better than history alone in terms of diagnosing USI; less than 1% of individuals with USI are diagnosed as having DO or no condition. However, because there are two tests working in combination fewer individuals with no condition are now diagnosed as such. The performance of diary in addition to history can be seen in Table 47. This performs less well in terms of USI diagnosed. However, it performs much better in diagnosing DO, with 95% of individuals diagnosed as either DO or mixed. Again, fewer individuals with no condition are diagnosed as such. In general, using additional tests generates more positive and fewer negative results (using a decision rule of a positive from any test being taken as a positive diagnosis).

As stated earlier, the outcome used in this analysis was cost per individual who has at least one condition correctly diagnosed. The costs and units of effectiveness from the probabilistic model are given in Table 49. In all cases the values given are incremental compared with history, that is, they are the extra costs and extra units of effectiveness generated by carrying out history and an additional test when compared with history alone. The results are presented in this way as it was assumed that history would always be performed. Table 49 gives the result of the two extreme analyses where 0% and 100% of individuals who have a positive diagnosis are referred to secondary specialist assessment. Table 49 also presents a midpoint analysis where 50% of all individuals declared positive are referred to specialist secondary assessment. Table 49 shows that all costs

TABLE 45 Results of history

Condition and diagnosis   Total for all individuals (95% percentile)   Mean value as a percentage of the total for each condition

Individual has USI, history declares USI 0.269 (0.24 to 0.298) 80.0%
Individual has USI, history declares DO 0.003 (0.003 to 0.004) 1.0%
Individual has USI, history declares mixed 0.04 (0.033 to 0.048) 12.0%
Individual has USI, history declares no condition 0.023 (0.02 to 0.027) 7.0%
Total USI 0.336 (0.301 to 0.371) 100.0%
Individual has DO, history declares USI 0.05 (0.042 to 0.059) 17.2%
Individual has DO, history declares DO 0.099 (0.085 to 0.115) 34.2%
Individual has DO, history declares mixed 0.078 (0.066 to 0.091) 26.8%
Individual has DO, history declares no condition 0.064 (0.053 to 0.075) 21.8%
Total DO 0.291 (0.257 to 0.325) 100.0%
Individual has mixed, history declares USI 0.074 (0.061 to 0.089) 35.9%
Individual has mixed, history declares DO 0.01 (0.008 to 0.012) 4.9%
Individual has mixed, history declares mixed 0.116 (0.097 to 0.137) 56.1%
Individual has mixed, history declares no condition 0.006 (0.005 to 0.008) 3.1%
Total mixed 0.207 (0.175 to 0.240) 100.0%
Individual has no condition, history declares USI 0.064 (0.052 to 0.077) 38.3%
Individual has no condition, history declares DO 0.012 (0.009 to 0.015) 7.3%
Individual has no condition, history declares mixed 0.009 (0.007 to 0.012) 5.7%
Individual has no condition, history declares no condition 0.081 (0.066 to 0.097) 48.7%
Total for no condition 0.166 (0.137 to 0.197) 100.0%
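
The percentages in the right-hand column of Table 45 can be traced to the mean sensitivities and specificities of history in Table 44. The sketch below assumes that the USI finding and the DO finding from history are independent of each other; that assumption is made here for illustration and appears consistent with the published figures, but it is not a description of the authors' spreadsheet.

```python
# Mean accuracy of history, from Table 44.
SENS_USI, SPEC_USI = 0.92, 0.56
SENS_DO, SPEC_DO = 0.61, 0.87


def diagnosis_distribution(has_usi, has_do):
    """Probability of each history-based diagnosis for a given true state."""
    p_usi = SENS_USI if has_usi else 1 - SPEC_USI  # history flags USI
    p_do = SENS_DO if has_do else 1 - SPEC_DO      # history flags DO
    return {
        "USI": p_usi * (1 - p_do),
        "DO": (1 - p_usi) * p_do,
        "mixed": p_usi * p_do,
        "none": (1 - p_usi) * (1 - p_do),
    }


usi_only = diagnosis_distribution(has_usi=True, has_do=False)
no_condition = diagnosis_distribution(has_usi=False, has_do=False)
```

For an individual with USI only, the sketch gives about 0.80 for a USI diagnosis and 0.12 for a mixed diagnosis, in line with the 80.0% and 12.0% reported in the table.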

TABLE 46 Results of history and pad test

Condition and diagnosis   Total for all individuals (95% percentile)   Mean value as a percentage of the total for each condition

Individual has USI, combination of tests declares USI 0.290 (0.26 to 0.322) 86.4%
Individual has USI, combination of tests declares DO 0.0003 (0.0001 to 0.0006) 0.1%
Individual has USI, combination of tests declares mixed 0.043 (0.036 to 0.052) 12.9%
Individual has USI, combination of tests declares no condition 0.002 (0.001 to 0.004) 0.6%
Total USI 0.336 (0.301 to 0.371) 100.0%
Individual has DO, combination of tests declares USI 0.068 (0.055 to 0.082) 23.3%
Individual has DO, combination of tests declares DO 0.072 (0.055 to 0.089) 24.6%
Individual has DO, combination of tests declares mixed 0.106 (0.088 to 0.127) 36.4%
Individual has DO, combination of tests declares no condition 0.046 (0.035 to 0.058) 15.7%
Total DO 0.291 (0.257 to 0.325) 100.0%
Individual has mixed, combination of tests declares USI 0.080 (0.066 to 0.096) 38.8%
Individual has mixed, combination of tests declares DO 0.001 (0.000 to 0.002) 0.4%
Individual has mixed, combination of tests declares mixed 0.125 (0.105 to 0.147) 60.6%
Individual has mixed, combination of tests declares no condition 0.0005 (0.0001 to 0.0011) 0.2%
Total mixed 0.207 (0.175 to 0.240) 100.0%
Individual has no condition, combination of tests declares USI 0.086 (0.068 to 0.107) 51.9%
Individual has no condition, combination of tests declares DO 0.009 (0.006 to 0.012) 5.2%
Individual has no condition, combination of tests declares mixed 0.013 (0.010 to 0.017) 7.8%
Individual has no condition, combination of tests declares no condition 0.058 (0.043 to 0.074) 35.1%
Total for no condition 0.166 (0.137 to 0.197) 100.0%
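
The effect of the either-test-positive decision rule can be illustrated with the history and pad test figures. Under the stated independence assumption, combined sensitivity rises and combined specificity falls. The function name below is hypothetical and the calculation uses mean values only.

```python
def combine_either_positive(sens_a, spec_a, sens_b, spec_b):
    """Accuracy of two independent tests when a positive on either counts."""
    sensitivity = 1 - (1 - sens_a) * (1 - sens_b)  # missed only if both miss
    specificity = spec_a * spec_b                  # negative only if both negative
    return sensitivity, specificity


# History (0.92, 0.56) combined with pad test (0.92, 0.72) for USI, Table 44.
sens_usi, spec_usi = combine_either_positive(0.92, 0.56, 0.92, 0.72)

# History alone is the only source of a DO finding in this strategy.
SPEC_DO_HISTORY = 0.87

# An individual with USI only is declared USI when the combined tests
# flag USI and history does not flag DO.
p_usi_declared_usi = sens_usi * SPEC_DO_HISTORY

# An individual with no condition is declared clear only if nothing flags.
p_none_declared_none = spec_usi * SPEC_DO_HISTORY
```

The two results, roughly 0.864 and 0.351, correspond to the 86.4% and 35.1% entries in Table 46, which shows numerically why adding a test detects more USI but clears fewer unaffected individuals.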


TABLE 47 Results of history and diary

Condition and diagnosis   Total for all individuals (95% percentile)   Mean value as a percentage of the total for each condition

Individual has USI, combination of tests declares USI 0.223 (0.197 to 0.251) 66.4%
Individual has USI, combination of tests declares DO 0.007 (0.006 to 0.009) 2.2%
Individual has USI, combination of tests declares mixed 0.086 (0.07 to 0.104) 25.6%
Individual has USI, combination of tests declares no condition 0.019 (0.016 to 0.023) 5.8%
Total USI 0.336 (0.301 to 0.371) 100.0%
Individual has DO, combination of tests declares USI 0.006 (0.001 to 0.013) 2.1%
Individual has DO, combination of tests declares DO 0.155 (0.133 to 0.179) 53.4%
Individual has DO, combination of tests declares mixed 0.122 (0.104 to 0.141) 41.9%
Individual has DO, combination of tests declares no condition 0.008 (0.002 to 0.017) 2.6%
Total DO 0.291 (0.257 to 0.325) 100.0%
Individual has mixed, combination of tests declares USI 0.009 (0.002 to 0.020) 4.3%
Individual has mixed, combination of tests declares DO 0.016 (0.013 to 0.019) 7.6%
Individual has mixed, combination of tests declares mixed 0.182 (0.153 to 0.212) 87.7%
Individual has mixed, combination of tests declares no condition 0.001 (0.000 to 0.002) 0.4%
Total mixed 0.207 (0.175 to 0.240) 100.0%
Individual has no condition, combination of tests declares USI 0.053 (0.042 to 0.064) 31.8%
Individual has no condition, combination of tests declares DO 0.026 (0.020 to 0.033) 15.6%
Individual has no condition, combination of tests declares mixed 0.020 (0.015 to 0.026) 12.2%
Individual has no condition, combination of tests declares no condition 0.067 (0.054 to 0.081) 40.4%
Total for no condition 0.166 (0.137 to 0.197) 100.0%

TABLE 48 Results of history and validated scales

Condition and diagnosis   Total for all individuals (95% percentile)   Mean value as a percentage of the total for each condition

Individual has USI, combination of tests declares USI 0.289 (0.259 to 0.32) 86.1%
Individual has USI, combination of tests declares DO 0.0005 (0.0003 to 0.0007) 0.1%
Individual has USI, combination of tests declares mixed 0.043 (0.036 to 0.052) 12.9%
Individual has USI, combination of tests declares no condition 0.003 (0.002 to 0.004) 0.9%
Total USI 0.336 (0.301 to 0.371) 100.0%
Individual has DO, combination of tests declares USI 0.075 (0.063 to 0.089) 25.9%
Individual has DO, combination of tests declares DO 0.060 (0.048 to 0.073) 20.5%
Individual has DO, combination of tests declares mixed 0.118 (0.100 to 0.137) 40.5%
Individual has DO, combination of tests declares no condition 0.038 (0.030 to 0.047) 13.1%
Total DO 0.291 (0.257 to 0.325) 100.0%
Individual has mixed, combination of tests declares USI 0.080 (0.066 to 0.095) 38.6%
Individual has mixed, combination of tests declares DO 0.0013 (0.0008 to 0.0019) 0.6%
Individual has mixed, combination of tests declares mixed 0.125 (0.105 to 0.147) 60.4%
Individual has mixed, combination of tests declares no condition 0.0008 (0.0005 to 0.0013) 0.4%
Total mixed 0.207 (0.175 to 0.240) 100.0%
Individual has no condition, combination of tests declares USI 0.096 (0.078 to 0.116) 57.8%
Individual has no condition, combination of tests declares DO 0.007 (0.005 to 0.009) 4.4%
Individual has no condition, combination of tests declares mixed 0.014 (0.011 to 0.018) 8.6%
Individual has no condition, combination of tests declares no condition 0.049 (0.038 to 0.061) 29.2%
Total for no condition 0.167 (0.137 to 0.197) 100.0%


TABLE 49 Results of cost-effectiveness analyses (probabilistic values)

Referral to specialist secondary assessment   Test   Incremental costs, £ (95% percentile)   Incremental effectiveness (95% percentile)   Incremental cost-effectiveness (£)

0% referred   Pad test   4.06 (3.07 to 5.25)   0.0307 (0.0255 to 0.0361)   132
0% referred   Diary   3.75 (3.12 to 4.46)   0.1057 (0.0830 to 0.1276)   35
0% referred   Scale   3.74 (3.14 to 4.45)   0.0290 (0.0246 to 0.0339)   129
50% referred   Pad test   5.97 (4.44 to 8.09)   0.0307 (0.0256 to 0.0361)   195
50% referred   Diary   5.98 (4.64 to 8.58)   0.1055 (0.0782 to 0.1640)   57
50% referred   Scale   6.09 (4.64 to 8.32)   0.0290 (0.0246 to 0.0337)   210
100% referred   Pad test   7.82 (5.43 to 11.48)   0.0307 (0.0255 to 0.036)   255
100% referred   Diary   8.16 (5.67 to 12.06)   0.1054 (0.0837 to 0.1266)   77
100% referred   Scale   8.42 (5.81 to 12.63)   0.029 (0.0245 to 0.0339)   290

are positive since all strategies that involve an additional diagnostic test involve greater cost than history alone. They are also more effective than history alone in detecting cases. The incremental cost-effectiveness shows the additional costs incurred per additional case detected. The incremental cost-effectiveness ratio indicates differences between the tests. The additional cost per extra case detected was generally highest for scales, varying between £129 and £290. Next highest was the pad test, which varied from £132 to £255. Diary had the most favourable cost-effectiveness ratios, varying between £35 and £77 per extra unit of effectiveness.

These results are also presented in Figure 18 as a cost-effectiveness acceptability curve. Shown here are the curves for the two extreme cases, 0% referred and 100% referred. These curves show the probability that each strategy is cost-effective, given different values placed on a case detected. The higher the value of a case detected, the more likely it is that a strategy detecting additional cases will be considered worthwhile. The curves are incremental; history alone is compared to the other three strategies and each of these strategies is compared to history. For very low values given to a case detected, history alone is the preferred strategy as it has the lowest cost. However, as the value given to a case detected rises so does the probability that any of the other strategies are cost-effective. Increasing the proportion referred increases the value of a case detected that is needed for the joint test strategies to be preferred to history alone.

Probabilistic sensitivity analysis
Table 50 and Figure 19 show the results of a probabilistic model where individuals are referred for urodynamics as well as specialist secondary assessment. It can be seen from Table 50 that referral to urodynamics dramatically increases the incremental cost per individual with any condition diagnosed compared with history alone. This is because more individuals are being referred in the joint test strategies and referral is more expensive because it includes urodynamics. However, including urodynamics also increases the number of individuals with any condition diagnosed, as urodynamics is effective in detecting cases. Although there are extra cases detected, the incremental costs per additional unit of effect increase as there are large additional costs (the urodynamic referral) but only small extra numbers of individuals with any condition diagnosed. For example, with 100% referral, diary in addition to history costs an extra £275 per person with any condition diagnosed, compared with history alone.

One-way sensitivity analysis
Table 51 presents a series of one-way sensitivity analyses that were carried out on a deterministic model. The sensitivity analysis was carried out on a model where 50% of individuals were referred to specialist secondary assessment. In all cases the


FIGURE 18 Cost-effectiveness acceptability curves for referral to specialist secondary assessment. Two panels: 0% referred to specialist secondary assessment and 100% referred to specialist secondary assessment. Horizontal axis: willingness to pay for a case detected (0 to 500); vertical axis: % cost-effective (0 to 1.0). Curves: history, pad test, diary, scales.

values given are the incremental cost per extra unit of effect generated compared with history alone. In the first part of Table 51 the proportion of individuals who had no condition was varied from 0 to 1. The more individuals have USI, DO or mixed, the lower the cost-effectiveness ratios. If 80% of the sample have no condition the cost-effectiveness ratios for pad tests and scales are £1010 and £1218 per unit of effect, respectively. In the second part of Table 51 the performance of the various tests is varied between the upper and lower points of their 95% confidence intervals. Of particular importance is the sensitivity of history for USI, as the higher the sensitivity of history, the fewer cases remain for additional tests to detect. Also given are the effects of varying the sensitivity and specificity of pad test, diary and scales. As expected, as sensitivity and specificity increase, the cost-effectiveness ratios become more favourable. The final part of Table 51 shows the effect of varying cost estimates. As the cost of carrying out the tests and referrals increases so does the incremental cost-effectiveness ratio. However, the model seems less sensitive within the range of the confidence intervals for costs than for other variables such as sensitivity and specificity.

TABLE 50 Results of model with positives referred to specialist secondary assessment and urodynamics

Referral to specialist secondary assessment   Test   Incremental costs, £ (95% percentile)   Incremental effectiveness (95% percentile)   Incremental cost-effectiveness (£)

50% referred   Pad test   10.23 (6.99 to 15.16)   0.038 (0.0317 to 0.0448)   269
50% referred   Diary   10.93 (7.55 to 16.15)   0.0855 (0.0682 to 0.1023)   128
50% referred   Scale   11.34 (7.84 to 16.83)   0.0402 (0.0350 to 0.0459)   282
100% referred   Pad test   16.28 (10.27 to 25.94)   0.0452 (0.0359 to 0.0556)   360
100% referred   Diary   18.05 (11.47 to 28.54)   0.0655 (0.0522 to 0.0785)   275
100% referred   Scale   18.84 (11.90 to 29.71)   0.0513 (0.0436 to 0.0597)   367

The case where referral is 0% is not shown as this is equivalent to values for 0% in Table 49.
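
The final column of Table 50 is the ratio of the two preceding columns. A short sketch of that division follows, using the mean values from the table; small differences from the published ratios are expected because the table reports rounded results from a probabilistic model. The dictionary layout is illustrative.

```python
# (incremental cost in £, incremental effectiveness) from Table 50.
TABLE_50 = {
    ("50%", "pad test"): (10.23, 0.038),
    ("50%", "diary"): (10.93, 0.0855),
    ("50%", "scale"): (11.34, 0.0402),
    ("100%", "pad test"): (16.28, 0.0452),
    ("100%", "diary"): (18.05, 0.0655),
    ("100%", "scale"): (18.84, 0.0513),
}


def icer(extra_cost, extra_effect):
    """Incremental cost per extra individual with any condition diagnosed."""
    return extra_cost / extra_effect


ratios = {key: icer(cost, effect) for key, (cost, effect) in TABLE_50.items()}
```

Diary keeps the lowest ratio at both referral rates because its extra effect is roughly twice that of the other two tests while its extra cost is similar.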

FIGURE 19 Cost-effectiveness acceptability curves for sensitivity analysis on the effect of referral to urodynamics. Two panels, both titled "50% referred to urodynamics". Horizontal axis: willingness to pay for a case detected (0 to 1000); vertical axis: % cost-effective (0 to 1.0). Curves: history, pad, diary, scales.
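
A cost-effectiveness acceptability curve of the kind shown in Figures 18 and 19 can be built by counting, at each willingness-to-pay value, the share of simulated samples in which a strategy has positive net benefit. The sketch below is a simplified pairwise version (one added test against history alone) with hypothetical input distributions loosely matched to the pad test row of Table 50; the report's curves compare all four strategies at once, so this is an illustration of the technique and not a reproduction of the figures.

```python
import math
import random


def acceptability_curve(samples, thresholds):
    """Share of (extra cost, extra effect) samples with positive net
    benefit, threshold * effect - cost, at each willingness to pay."""
    curve = []
    for wtp in thresholds:
        favourable = sum(
            1 for d_cost, d_eff in samples if wtp * d_eff - d_cost > 0
        )
        curve.append(favourable / len(samples))
    return curve


rng = random.Random(19)  # arbitrary fixed seed

# Assumed distributions: log-normal extra cost with mean 10.23 and a
# normal extra effect centred on 0.038; cost and effect drawn independently.
sigma = 0.2
mu = math.log(10.23) - sigma ** 2 / 2
samples = [
    (rng.lognormvariate(mu, sigma), rng.gauss(0.038, 0.0033))
    for _ in range(10_000)
]

thresholds = [0, 200, 400, 600, 800, 1000]
curve = acceptability_curve(samples, thresholds)
```

At a willingness to pay of zero the added test is never preferred, because it always costs more; the share rises towards 1 as the value placed on a detected case increases.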


TABLE 51 One-way sensitivity analyses on the probabilities used in the model

Probability (0 to 1)   Test   0   0.2   0.4   0.6   0.8   1

Proportion of individuals who have no condition (base case 0.166)   Pad test   152   205   295   474   1010   NA
Proportion of individuals who have no condition (base case 0.166)   Diary   47   59   79   119   239   NA
Proportion of individuals who have no condition (base case 0.166)   Scales   157   224   334   555   1218   NA

Probability (range of 95% CI)   Test   Lower 95% CI   Upper 95% CI

Sensitivity of history for USI   Pad test   176   219
Sensitivity of history for USI   Diary   57   57
Sensitivity of history for USI   Scales   190   237
Specificity of history for USI   Pad test   193   197
Specificity of history for USI   Diary   56   58
Specificity of history for USI   Scales   207   214
Sensitivity of history for DO   Pad test   193   196
Sensitivity of history for DO   Diary   53   61
Sensitivity of history for DO   Scales   209   211
Specificity of history for DO   Pad test   194   195
Specificity of history for DO   Diary   56   57
Specificity of history for DO   Scales   209   211
Sensitivity of pad test for USI   Pad test   215   186
Specificity of pad test for USI   Pad test   214   180
Sensitivity of diary test   Diary   66   53
Specificity of diary test for DO   Diary   58   55
Sensitivity of scales   Scales   222   200
Specificity of scales   Scales   223   198

Cost variables   Test   Lower 95% CI   Upper 95% CI

Pad test cost   Pad test   159   230
Diary cost   Diary   50   63
Scales cost   Scales   187   233

NA, not applicable.
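
The first block of Table 51 can be approximated with a small deterministic model built from the mean values in Tables 42 and 44. The sketch below is a reconstruction under stated assumptions, not the authors' Excel model: test results are treated as independent, a positive on either test counts as positive, 50% of individuals with any positive result are referred at the first-referral cost, and the three condition prevalences are rescaled in proportion when the share with no condition is varied. All names are illustrative.

```python
HISTORY = {"USI": (0.92, 0.56), "DO": (0.61, 0.87)}  # (sens, spec), Table 44
EXTRA_TESTS = {  # target condition, sensitivity, specificity (Table 44)
    "pad": ("USI", 0.92, 0.72),
    "diary": ("DO", 0.88, 0.83),
    "scales": ("USI", 0.87, 0.60),
}
TEST_COST = {"pad": 4.06, "diary": 3.75, "scales": 3.75}  # Table 42
REFERRAL_COST = 56.22                                     # Table 42
BASE_PREV = {"USI": 0.336, "DO": 0.291, "mixed": 0.207}
HAS = {"USI": (True, False), "DO": (False, True),
       "mixed": (True, True), "none": (False, False)}


def p_flag(condition, present, extra=None):
    """Probability that the strategy declares `condition` positive."""
    sens, spec = HISTORY[condition]
    p_negative = (1 - sens) if present else spec
    if extra is not None:
        target, e_sens, e_spec = EXTRA_TESTS[extra]
        if target == condition:  # either-positive rule, independent tests
            p_negative *= (1 - e_sens) if present else e_spec
    return 1 - p_negative


def outcomes(state, extra=None):
    """Return (P(payoff of 1), P(any positive result)) for a true state."""
    has_usi, has_do = HAS[state]
    p_usi = p_flag("USI", has_usi, extra)
    p_do = p_flag("DO", has_do, extra)
    any_positive = 1 - (1 - p_usi) * (1 - p_do)
    missed = 1.0
    if has_usi:
        missed *= 1 - p_usi
    if has_do:
        missed *= 1 - p_do
    detected = 1 - missed if (has_usi or has_do) else 0.0
    return detected, any_positive


def icer(extra, p_none, referral_rate=0.5):
    """Incremental cost per extra case detected versus history alone."""
    scale = (1 - p_none) / sum(BASE_PREV.values())
    prevalence = {k: v * scale for k, v in BASE_PREV.items()}
    prevalence["none"] = p_none
    d_effect = d_positive = 0.0
    for state, p in prevalence.items():
        det0, pos0 = outcomes(state)
        det1, pos1 = outcomes(state, extra)
        d_effect += p * (det1 - det0)
        d_positive += p * (pos1 - pos0)
    d_cost = TEST_COST[extra] + referral_rate * REFERRAL_COST * d_positive
    return d_cost / d_effect


row_pad = [icer("pad", p) for p in (0.0, 0.2, 0.4, 0.6, 0.8)]
```

Under these assumptions the pad test ratios come out close to the published 152, 205, 295, 474 and 1010. A proportion of 1 is left out because no cases remain to be detected, which matches the NA entries in the table.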


Chapter 5
Discussion
This is the first systematic review of methods for diagnosing urinary incontinence, meta-analysing the data, where possible, from different studies to generate conclusions about the diagnostic performance of commonly used diagnostic methods in both primary and secondary care. The objectives of the review were to identify, appraise and summarise the published evidence, quantitatively synthesise the extracted evidence (where possible) and construct an economic model to examine the cost-effectiveness of simple, commonly used primary care tests.

Appraisal of the systematic review

Research methodology
Search strategy
A systematic literature search was undertaken using three databases. There was an overlap between the databases, particularly MEDLINE and EMBASE: 45% of the studies identified by EMBASE were also identified by MEDLINE. CINAHL contributed the fewest papers to the review (seven). The search strategy was based on the Cochrane and NHS CRD strategies for identifying studies of diagnostic performance, which are well validated. It is important, for consistency and accuracy, for systematic reviews of diagnostic methods to use these strategies.

Keywords were added to the generic search strategies for identifying diagnostic studies to identify all possible tests used for the diagnosis of urinary incontinence, including terms for potential permutations of their names. However, it is possible that relevant studies that use unusual or obscure diagnostic tests may have been missed.

The development of online bibliographic databases in recent years means that handsearching of journals has become less important.216 As urinary incontinence and diagnostic performance are well-established medical subheadings it was felt that using a detailed search strategy would identify a high proportion of relevant studies and that handsearching would not identify a significant number of additional studies.

A large number of papers was identified from the search (6009), of which 121 were deemed relevant for inclusion in the review. All papers compared two or more assessment/diagnostic techniques. A two-stage exclusion process was applied and decisions on relevance were checked in a random selection of 20% of cases; it is acknowledged that some papers of relevance may have been excluded unintentionally. There was diversity across the papers in diagnostic methods studied, methodology, analysis of the data and quality of reporting.

Inclusion/exclusion criteria
The extent to which the questions within a systematic review can be answered depends on the nature and quality of primary studies available. The inclusion criteria in this study were broad: studies that presented any quantitative comparison between two or more methods of assessing urinary incontinence. The study excluded case reports, letters, non-primary research and research involving only children. All studies presented in a non-English language were also excluded, as time and financial constraints did not allow for the translation of such papers. However, it is possible that this may have excluded important studies.217,218

Assessment of relevance
A critical part of classifying the papers included in the systematic review was to determine what tests were compared. The development of the cross-tabulation table enabled this to be clearly recorded and all similar studies to be grouped together, aiding the quality assessment and data extraction processes.

Quality assessment
It is important to assess the quality of studies included in any systematic review in terms of internal validity, external validity, and the quality of data analysis and reporting. The QUADAS tool13,219 that was devised for this purpose is an important development. However, the relatively low levels of agreement between the investigators assessing the same papers using the tool suggest that it has limitations and that additional instructions need to be added according to the topic area of the individual review.
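Agreement between investigators applying a quality checklist can be quantified in several ways, and this section does not state which statistic was used. As an illustration only, the following minimal sketch computes Cohen's kappa for two assessors rating the same papers on a single quality item. The function name and the ratings are hypothetical and are not taken from the review.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical judgements on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n**2
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical judgements ("yes" / "no" / "unclear") on one checklist item
# for ten papers; invented purely for illustration.
assessor_1 = ["yes", "yes", "no", "unclear", "yes", "no", "yes", "no", "yes", "unclear"]
assessor_2 = ["yes", "no", "no", "yes", "yes", "no", "unclear", "no", "yes", "unclear"]

kappa = cohens_kappa(assessor_1, assessor_2)
print(f"Cohen's kappa = {kappa:.2f}")
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which is why it is often preferred when some categories are used far more than others.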

© Queen’s Printer and Controller of HMSO 2006. All rights reserved.


Discussion

The most significant problem associated with study quality was in the reporting of results, with only a small proportion of relevant studies presenting data in a way that allowed inclusion in a meta-analysis. It was noticed that the quality of reporting was significantly higher in the more recent studies, indicating that standards are improving and this will be furthered by developments such as the Standards for Reporting of Diagnostic Accuracy (STARD) initiative.220

Data extraction
This is a potential source of error in any systematic review. The method of extracting data within a meeting of at least two study investigators was designed to minimise this, as studies could be discussed at length, reducing the chance of data being missed or incorrectly interpreted. A predefined form was used to record all relevant data during the data extraction process.

Data synthesis
The number of studies suitable for data synthesis was small. Another major problem was that studies that appeared to be comparing the same diagnostic tests were in fact comparing very different variations of the same test. For example, within pad tests, there were three different types: 1 hour, 24 hour and 48 hour. Both the paucity of evidence and the heterogeneous reporting of those studies that were identified severely limited the ability to undertake meaningful meta-analyses.

In addition, the heterogeneous nature of the studies identified, in terms of the precise diagnostic methods used or the patient population to which they were applied, meant that those meta-analyses that could be performed only included a small number of studies.

Specific methodological issues that were identified during the systematic review included the issue of indirect comparisons, classification of patients into more than two diagnostic categories, e.g. USI, DO or mixed, and the reporting of both raw data in terms of an ROC curve/table and summary data, for example a single estimate of sensitivity and specificity. This parallels the situation found in other areas in which some studies report individual patient data, while others report only summary data.221

Economic modelling
It was assumed that it would always be good practice to take a history. The relevant question is, therefore, is it worth carrying out other tests in primary care in addition to taking a history? Therefore the extra costs and numbers of individuals with any condition diagnosed compared with history alone were examined. On this basis, the urinary diary performs well as it generates extra cases detected for the lowest extra cost. This is because the diary has been taken as a test for DO and the sensitivity of history for detecting DO is much lower than for detecting USI. In other words, far more cases of DO are not detected by history and therefore there is more scope for an additional test to detect additional ‘missing’ cases. However, a number of things should be considered when evaluating these results. It is important to consider that these tests are only evaluated in terms of their ability to diagnose urinary conditions and do not consider any other benefits that the information they generate has in treating individuals, for example if considerations of severity of leakage had an impact on the likelihood of receiving surgery. It should also be noted that the unit of effectiveness considers the value of a case of DO, USI and mixed found to be of equal importance. If it was considered more important to diagnose USI than DO then the relative values of tests for DO and tests for USI may change. Finally, the measures of the performance of these tests are generally based on single studies, so there is likely to be considerable uncertainty over the values of these estimates.

The estimates of prevalence used in the model come from urodynamics carried out on a group of individuals referred from a primary care setting. These are likely to be the more serious or intractable cases. The prevalence of these conditions in the more general group, who present to primary care, may be lower. Sensitivity analysis shows that the cost-effectiveness of these tests is sensitive to the prevalence; the likely occurrence of these conditions is therefore an important consideration in their implementation.

It is clear from this analysis that the decisions taken after the use of these tests have implications for their cost-effectiveness. There is likely to be wide variation in referral patterns among primary care practitioners. It is important to consider that in this simple model the analysis ends at secondary care referral, when in reality there may be a series of secondary care services received, and benefits obtained, from these services.

An important consideration in the interpretation of this work is the value placed on an individual with any condition detected. It is clear from the cost-effectiveness acceptability curves (Figures 18 and 19) that as the values of this outcome change, then so do the conclusions for optimum management. If detecting an individual’s condition is not highly valued then strategies where only history and no further tests are carried out would be the optimum ones. As the value placed on this outcome increases then strategies that involve extra costs but generate extra benefits will be optimum. The value of detecting an individual’s urinary condition would depend on a number of factors not explicitly tested here. This would be expected to include the burden of a condition on an individual, and the cost and effectiveness of available treatments and therapies.

Implications of the findings

The literature dealing with the diagnosis of urinary incontinence is highly fragmented. Within primary care there are so many types of each test that it is almost impossible to find two studies that compare the same tests. There is no real agreement among clinical experts on what the ‘gold standard’ is for diagnosing urinary incontinence, whether it is urodynamics and, if so, what methods should be used. This review used the ICS-defined criterion that multichannel urodynamics is the gold standard test for diagnosing USI or DO. Owing to the large number of comparisons between a lot of different diagnostic tests, only the areas of high clinical interest will be discussed; namely, the most popular, simple and advanced investigations compared with multichannel urodynamics. Within each group there is a lack of literature dealing with the diagnosis of urinary incontinence or BOO in men, and for this reason the discussion of results will concentrate on diagnosis in women.

It is critical to make a distinction between tests and assessment methods that can be undertaken in primary care and those that can only be undertaken in secondary care. The majority of diagnostic and assessment processes can be undertaken in primary care and comprise clinical history-taking, the use of scales, physical examination, and simple tests such as diaries and pad tests. These tests are simple, are low in cost and carry low risks. The results of assessments and tests are used to identify a presumed diagnosis on which an appropriate management/treatment plan can be instigated.

Clinical history
The recording of a clinical history is critical in determining a symptomatic diagnosis in primary care. A large number of studies comparing the use of clinical history and urodynamics in female patients was identified. Pooled sensitivity and specificity values for diagnosis of USI in women suggest that a clinical history is highly sensitive (0.92, 95% CI 0.91 to 0.93), but less specific (0.56, 95% CI 0.53 to 0.60) in diagnosing USI. These findings suggest that a large proportion of women with USI can be correctly diagnosed in primary care and that initiating low-risk, low-cost behavioural treatment at this stage may be appropriate. The lower specificity suggests that women without USI may be incorrectly diagnosed; however, behavioural therapy should not have any detrimental effects and may result in some alleviation of symptoms.

With regard to the diagnosis of DO by clinical history-taking, sensitivity was found to be lower than for USI (0.61, 95% CI 0.57 to 0.65), but specificity was found to be high (0.87, 95% CI 0.85 to 0.89). This indicates that history-taking may correctly exclude those women who do not have DO, but that further investigations may be required for those who present with DO symptoms to confirm their status before any treatment is initiated. The next stage for those whose history suggests DO may be a further simple, non-invasive test, such as a urinary diary.

Simple investigations
Validated scales
The studies in this group highlight the fragmented nature of the overall literature. Seven different scales are compared and there is currently no consensus on the most effective scales to use in clinical practice. The most commonly researched scale was the UDI. Combining data from two studies resulted in a sensitivity of 0.87 (95% CI 0.82 to 0.92) and specificity of 0.60 (95% CI 0.51 to 0.69) for the diagnosis of USI in women based on one question from the UDI. The diagnostic value of this scale is comparable to taking just a clinical history, indicating that this scale may not add anything to the diagnostic procedure.

Little evidence was found on scales that seek to diagnose DO. One study reported the Gaudenz incontinence questionnaire to be 0.45 sensitive and 0.56 specific, less accurate than clinical history-taking.

There needs to be consensus about the most appropriate scale for the diagnosis of urinary incontinence. Efforts should be concentrated on developing and amending one or two scales, rather than continually developing new scales, unless based on specific clinical need.
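The sensitivity and specificity estimates quoted in this chapter are proportions from 2×2 tables of index test against reference standard. The following is a minimal sketch of how such estimates and approximate 95% confidence intervals can be obtained, and of the simplest way of combining two studies by summing their cells. It assumes a Wilson score interval and naive pooling of counts; the review's own pooling method is not described in this section and may well differ. All counts and function names are hypothetical.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


def diagnostic_accuracy(tp, fp, fn, tn):
    """Sensitivity and specificity, each with a Wilson interval, from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_interval(tn, tn + fp),
    }


def pool_counts(studies):
    """Sum (tp, fp, fn, tn) cells across studies; ignores between-study heterogeneity."""
    return tuple(sum(column) for column in zip(*studies))


# Hypothetical 2x2 counts (tp, fp, fn, tn) for two studies of the same test.
hypothetical_studies = [(45, 20, 5, 30), (40, 10, 10, 40)]
pooled = diagnostic_accuracy(*pool_counts(hypothetical_studies))

for measure in ("sensitivity", "specificity"):
    low, high = pooled[measure + "_ci"]
    print(f"{measure}: {pooled[measure]:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Summing cells treats the studies as one large sample, which is only defensible when they use the same test variant in similar populations. The heterogeneity described above is exactly the situation in which this shortcut becomes unreliable.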




Pad tests
Because of the many different pad tests used to investigate urinary incontinence it is difficult to draw any firm conclusions about diagnostic accuracy. The majority of literature in this area was concerned with the diagnosis of USI. Although high sensitivity and specificity values were reported in some studies, there were insufficient studies that compared the same pad tests and presented the data appropriately, and therefore no formal pooling of data could be carried out.

Urinary diary
A number of different urinary diaries was studied. Four studies compared a urinary diary with urodynamics and each study used a different type of diary. The only study to present data in a format that allowed sensitivity and specificity to be calculated reported values of 0.88 (95% CI 0.71 to 0.96) and 0.83 (95% CI 0.77 to 0.87), respectively. This indicates that this type, an index derived from various variables of a urinary diary, may be effective for the diagnosis of DO. The economic modelling suggests that the urinary diary performs well in combination with a clinical history for the diagnosis of DO. As the review has shown a clinical history to have a relatively low sensitivity for diagnosing DO (0.56) there is more scope for an additional test to detect additional cases. These conclusions should be treated with some caution as they were drawn from the results of a single study.

A recent symposium at the International Continence Society 2003 Annual Conference found that 59% of clinicians prefer to use a urinary diary for the initial evaluation of patients, suggesting that this is the non-invasive test of choice.222 This opinion contrasts with the amount of literature available on the urinary diary.

Other simple investigations
A small number of studies investigated the diagnosis of urinary incontinence by an algorithm method or a battery of tests. This appears to be a sensible approach, particularly in primary care, and arguably the most similar to real-life clinical practice. Although the number of studies in these groups was small and pooling of the data was not possible, the agreement between the results of these tests and multichannel urodynamics indicates that future research may be of significant interest.

Advanced investigations
Imaging by ultrasound and X-ray
A large amount of literature was identified that dealt with imaging the lower urinary tract for the diagnosis of USI through ultrasound and X-ray methods. Ultrasound was found to be the most effective method of imaging the two anatomical features used for the diagnosis of USI: the observance of leakage from the bladder and movement of the bladder neck during provocation. This method resulted in higher sensitivities (0.89, 95% CI 0.84 to 0.93, and 0.84, 95% CI 0.76 to 0.90) and specificities (0.82, 95% CI 0.73 to 0.89, and 0.86, 95% CI 0.79 to 0.91) for these landmarks than X-ray imaging. This suggests that ultrasound is a valuable diagnostic tool that could be used in secondary care as an alternative to multichannel urodynamics, owing to likely lower risks, costs and discomfort for the patient, although few studies reported these patient-based outcomes.

Urodynamics
The review identified literature on a number of different urodynamic tests compared with the gold standard of multichannel urodynamics. It is arguable, however, whether such tests are less unpleasant, expensive or of less risk to perform, and whether it would be better just to perform the gold-standard test.

A number of papers compared the clinical stress test with multichannel urodynamics for the diagnosis of USI, resulting in a high sensitivity of 0.85 (95% CI 0.78 to 0.91) and specificity of 0.83 (95% CI 0.74 to 0.90). These studies performed the clinical stress test with an artificially filled bladder, which increases the invasiveness of the test. If the test could be performed with a naturally full bladder, with no significant detriment to diagnostic accuracy, then this would be a very useful non-invasive diagnostic test that could be used in primary and secondary care. Research into such a test would be of great clinical interest.

Within the review, far fewer studies were undertaken in primary care than in secondary care settings. This has important implications for interpretation of the findings. The studies undertaken in secondary care are mainly undertaken on referred patients attending as outpatients. They are very different to undifferentiated patients presenting in primary care. It is likely that referred patients have already undergone some form of diagnostic process and, therefore, using various diagnostic assessment tools with this population may produce greater levels of sensitivity and specificity than in a mainly unreferred, undifferentiated population.
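The difference between referred and unreferred populations also matters for a related reason that can be worked through numerically: even if sensitivity and specificity were identical in both settings, the probability that a positive result is correct depends on how common the condition is. The sketch below applies Bayes' theorem to the pooled estimates for clinical history in the diagnosis of USI quoted earlier in this chapter (sensitivity 0.92, specificity 0.56). The prevalence values are illustrative assumptions only and are not estimates from the review.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values via Bayes' theorem."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= specificity <= 1.0):
        raise ValueError("sensitivity and specificity must lie in [0, 1]")
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    # Expected proportions of the tested population in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos) if (true_pos + false_pos) > 0 else float("nan")
    npv = true_neg / (true_neg + false_neg) if (true_neg + false_neg) > 0 else float("nan")
    return ppv, npv


# Pooled values for clinical history in USI, as quoted in this chapter.
SENSITIVITY, SPECIFICITY = 0.92, 0.56

# Hypothetical prevalences, chosen only to show the direction of the effect.
for prevalence in (0.2, 0.4, 0.6):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

As prevalence falls, the positive predictive value falls and the negative predictive value rises, so a test evaluated in referred patients may produce proportionally more false positives when used in an unselected primary care population.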
Health Technology Assessment 2006; Vol. 10: No. 6

Chapter 6
Conclusions, implications and recommendations
Conclusions

● This is the first systematic review of methods of assessing urinary incontinence.
● In total, 6009 papers were identified from the search, of which a final 121 were deemed relevant for inclusion in the review. These papers compared two or more assessment/diagnostic techniques.
● A large number of different tests is used in the diagnosis of urinary incontinence, generating a great number of possible comparisons. The extent of heterogeneity between studies meant that few papers actually compared the same assessment/diagnostic tests. A matrix was constructed so that each relevant paper could be assigned to a cell in the matrix. However, even when a cell contained ten papers comparing, for example, scales with urodynamics, within the cell seven different scales had been used, making actual comparison impossible.
● Reporting in the primary studies was generally poor. Both the clinical heterogeneity and poor reporting meant that it was often impossible to synthesise results, although studies reported in recent years generally reported better than older studies.
● Clinical interpretation was often difficult because few studies could actually be synthesised and conclusions drawn. The following information could be deduced from the available data:
– A large proportion of women with USI can be correctly diagnosed in primary care from clinical history alone.
– The value of validated scales or pad tests could not be determined from the available data owing to the wide range of different types of instrument used.
– On the basis of diagnosis the diary appears to be the most cost-effective of the three primary care tests (diary, pad test and validated scales) when used in addition to clinical history.
– Ultrasound imaging may offer a valuable alternative to urodynamic investigation.
– The clinical stress test is effective in the diagnosis of USI. Adaptation of such a test so that it could be performed in primary care with a naturally filled bladder may prove clinically useful.
– If a patient is to undergo an invasive urodynamic procedure, multichannel urodynamics is likely to give the most accurate result in a secondary care setting.
● There is a dearth of literature on the diagnosis of urinary incontinence in men, with no studies meeting the criteria for data extraction in the diagnosis of BOO.

Implications

● Most simple diagnostic methods can be undertaken in primary or secondary care.
● A thorough and accurate clinical history is crucial.
● The use of simple investigations (e.g. pad test and diary) may offer useful information on severity which, when combined with history, may provide sufficient information to commence primary care interventions (which are low cost and low risk).
● From the data available the urinary diary is the most cost-effective simple investigation to use in combination with the clinical history.
● If urodynamic investigations are deemed necessary, multichannel urodynamics will offer the most accurate result.
● There is a lack of research in certain areas of clinical interest and a general lack of high-quality work, particularly economic studies.

Future research recommendations

● There is a need for large-scale, high-quality, primary studies evaluating the systematic use of a number of diagnostic methods in a primary care setting, so that the results of this systematic review can be verified or not. Such studies should include not only an assessment of clinical effectiveness, in this case diagnostic accuracy, but also an assessment of costs and quality of life/patient acceptance/satisfaction to inform future health policy decisions.
● There is a need for the development and standardisation of scales, pad tests and diaries for use in the diagnosis and measurement of severity of urinary incontinence.
● Only a small number of studies investigated the diagnosis of urinary incontinence using an algorithm or a battery of tests. Such a common-sense approach, which mirrors clinical practice, warrants further investigation.
● Research on the accuracy of a stress test using a naturally filled bladder would be of clinical interest.
● In terms of economic modelling, the literature has only begun to address the cost-effectiveness of the use of diagnostic tools in urinary incontinence. There has been some work published examining the use of urodynamics before surgery.223,224 However, there is a lack of studies that consider the use of low-cost tests such as diaries in primary care. Since these are widely used techniques and they have the potential to impact on other services in terms of referrals to secondary care and treatment received, it would be important to consider explicitly the cost-effectiveness of their use. In terms of the use of simple diagnostic tests there would be a potential for their results to be used in primary care to inform treatment options. This could lead to improvements in health.
● A full economic model, which incorporates both diagnosis and treatment, and evaluates outcomes in terms of cost per QALY, would enable more rational decisions to be made; this would represent an important focus for future work.
● Studies should be carried out and reported to a better standard. The recommendations of the STARD initiative should be followed to ensure the accuracy and completeness of reporting design and results. The flowchart for the suggested design and checklist for the reporting of a study of diagnostic accuracy developed by STARD are presented in Appendix 8.
● Given the demographics of the UK population and the recently reported prevalence of any urinary incontinence (in those aged 40 and over) of 34% for women and 14% for men,2 there will be an increasing burden placed on primary (and secondary) care services in terms of the diagnostic assessment and appropriate treatment of incontinence. Therefore, identifying which are the most clinically and cost-effective methods is of crucial importance.

Dissemination and timescale for updating

The target audience for dissemination of these results is clinicians. It may also prove interesting to those involved in systematic reviews of diagnostic methods. Realistically, in light of the broad nature of the literature and the improvements in reporting, the updating of this review should be considered within 4–6 years.


Acknowledgements
The authors would like to thank Mary Edmunds-Otter for help in developing the search strategies and Ariadna Juarez-Garcia for assisting with the economic modelling. We are grateful to Lesley Harris for administrative support and proof-reading of the final report, Ruth Matthews and Clare Gillies for additional statistical help, and Dr Helen Dallosso for reading the final report. We would also like to thank NHS CRD at York and particularly Penny Whiting, who allowed us to pilot the QUADAS tool. David Turner is funded by the Trent Institute for Health Services Research.

We would also like to thank those authors who responded to our requests for additional information: Antonio Cucchi, Hogne Sandvik, Gary Lemack and Gin-Den Chen.

Contribution of authors
All the authors were involved in the conception and design of the study, or analysis and interpretation of the data; drafting and revising the report; and final approval of the version to be published.

Individual contributions were as follows: JL Martin (Research Fellow) undertook the day-to-day activity on the project, quality assessment, interpretation and presentation of data, drafting of the full report and making revisions to the report. KS Williams (Senior Research Fellow in Nursing) was principal investigator and was involved in the design of the study, day-to-day supervision of the study, quality assessment, interpretation of data, drafting parts of the report and revising the report. KR Abrams (Professor of Medical Statistics) was involved in the conception and design of the study, was fully involved in supervising the interpretation and presentation of data, drafted sections of the report and commented on it. D Turner (Research Fellow in Health Economics) was involved in the conception and design of the study, with particular emphasis on the health economics component, undertook analysis, drafting and revision of the economics chapter and commented on the full report. A Sutton (Senior Lecturer in Medical Statistics) was involved in the conception and design of the study, was involved in interpretation and presentation of data, drafted sections of the report and commented on it. C Chapple (Consultant Urologist) was involved in the conception and design of the study, quality assessment, clinical interpretation and commenting on the report. RP Assassa (Consultant Gynaecologist) was involved in the conception and design of the study, quality assessment, clinical interpretation and commenting on the report. C Shaw (Senior Research Fellow) was involved in the conception and design of the study, quality assessment, interpretation and commenting on the report. F Cheater (Professor of Public Health Nursing) was involved in the conception and design of the study, quality assessment, interpretation and commenting on the report.


References
1. Abrams P, Cardozo L, Fall M, Griffiths D. The standardisation of terminology in lower urinary tract function: report from the standardisation sub-committee of the International Continence Society. Urology 2003;61:37–49.
2. McGrother CM, Donaldson MMK, Wagg A, Matharu G, Williams KS, Warsame J, et al. Continence. In Stevens A RJMJ, editor. Health care needs assessment: the epidemiologically based needs assessment reviews. Abingdon: Radcliffe Medical Press; 2003. http://www.hcna.radcliffe-oxford.com/contframe.htm
3. Turner DA, McGrother CW, Dallosso HM, Shaw C, Cooper NJ, MRC Incontinence Study Team. The cost of urinary storage disorders in the UK. 2003 (submitted).
4. Valvanne J, Juva K, Erkinjuntti T, Tilvis R. Major depression in the elderly: a population study in Helsinki. Int Psychogeriatr 1996;8:437–43.
5. Berglund AL, Eisemann M, Lalos O. Personality characteristics of stress incontinent women: a pilot study. Journal of Psychosomatic Obstetrics and Gynecology 1994;15:165–70.
6. Herzog AR, Fultz N, Brock BM, Brown MB, Diokno AC. Urinary incontinence and psychological distress among older adults. Psychol Aging 1988;3:115–21.
7. Van Der Vaart CH, De Leeuw JR, Roovers JP, Heintz AP. The effect of urinary incontinence and overactive bladder symptoms on quality of life in young women. British Journal of Urology International 2002;90:544–9.
8. Abrams P, Lowry SK, Wein AJ, Bump R, Denis L, Kalache A. Assessment and treatment of urinary incontinence. Lancet 2000;355:2153–8.
9. Homma Y, Batista J, Bauer D, Griffiths P, Hilton G, Kramer G, Lose G, Rosier P. Urodynamics. In Abrams P, Cardozo L, Khoury S, Wein A, editors. Incontinence: 2nd International Consultation on Incontinence. 2nd ed. Plymouth: Plymbridge Distributors; 2002. pp. 317–72.
10. Schafer W, Abrams P, Liao L, Mattiasson A, Pesce F, Spangberg A, et al. Good urodynamics practice: uroflowmetry, filling cystometry and pressure–flow studies. Neurourol Urodyn 2002;21:261–74.
11. Gorton E, Stanton S. Women’s attitudes to urodynamics: a questionnaire survey. Br J Obstet Gynaecol 1999;106:851–6.
12. Chapple CR, MacDiarmid SA. Urodynamics made easy. 2nd ed. London: Harcourt; 2000.
13. Matharu G, Donaldson MMK, McGrother CW, Matthews RJ. Relationship between urinary symptoms reported in a postal questionnaire and urodynamic diagnosis. Neurourol Urodyn 2005;24:100–5.
14. NHS Centre for Reviews and Dissemination. Undertaking systematic reviews of research on effectiveness. CRD Guidelines for those carrying out commissioning reviews. CRD Report No. 4. York: NHS Centre for Reviews and Dissemination; 2001. pp. 1–50.
15. Whiting P, Rutjes AWS, Dinnes J, Reitsma JB, Bossuyt PMM, Kleijnen J. The development and validation of methods for assessing the quality and reporting of diagnostic studies. Health Technol Assess 2004;8(25).
16. Amundsen C, Lau M, English SF, McGuire EJ. Do urinary symptoms correlate with urodynamic findings? J Urol 1999;161:1871–4.
17. Scotti RJ, Myers DL. A comparison of the cough stress test and single-channel cystometry with multichannel urodynamic evaluation in genuine stress incontinence. Obstet Gynecol 1993;81:430–3.
18. Ishiko O, Sumi T, Hirai K, Ogita S. Classification of female urinary incontinence by the scored incontinence questionnaire. Int J Gynaecol Obstet 2000;69:255–60.
19. Elser DM, Fantl JA, McClish DK. Comparison of ‘subjective’ and ‘objective’ measures of severity of urinary incontinence in women. Program for Women Research Group. Neurourol Urodyn 1995;14:311–16.
20. Glas AS, Lijmer JG, Prins MH, Bonsel GJ, Bossuyt PMM. The diagnostic odds ratio: a single indicator of test performance. J Clin Epidemiol 2003;56:1129–35.
21. Deeks JJ. Systematic reviews of evaluations of diagnostic and screening tests. In Egger M, Smith G, Altman D, editors. Systematic reviews in health care, meta-analysis in context. 2nd ed. London: BMJ Books; 2001. pp. 248–82.
22. Bergman J, Elia G. Effects of the menstrual cycle on urodynamic work-up: should we change our practice? Int Urogynecol J 1999;10:375–7.
23. Chaikin DC, Blaivas JG, Rosenthal JE, Weiss JP. Results of pubovaginal sling for stress

© Queen’s Printer and Controller of HMSO 2006. All rights reserved.


References

incontinence: a prospective comparison of 4 36. Peters S. Don’t ask, don’t tell. Breaking the silence
instruments for outcome analysis. J Urol 1999; surrounding female urinary incontinence [review].
162:1670–3. Advance for Nurse Practitioners 1997;5(5):41–4.
24. Glazer HI, Romanzi L, Polaneczky M. Pelvic floor 37. Siltberg H, Larsson G, Victor A. Frequency/volume
muscle surface electromyography. Reliability and chart: the basic tool for investigating urinary
clinical predictive validity. J Reprod Med 1999; symptoms. Acta Obstet Gynecol Scand Suppl 1997;
44:779–82. 166:24–7.

25. Madersbacher S, Pycha A, Klingler CH, Schatzl G, 38. Donnellan SM, Duncan HJ, MacGregor RJ,
Marberger M. The International Prostate Russell JM. Prospective assessment of incontinence
Symptom score in both sexes: a urodynamics- after radical retropubic prostatectomy: objective
based comparison. Neurourol Urodyn 1999; and subjective analysis. Urology 1997;49:225–30.
18:173–82. 39. van Waalwijk van Doorn ES, Ambergen AW,
26. Aanestad O, Flink R. Urinary stress incontinence. Janknegt RA. Detrusor activity index:
A urodynamic and quantitative electromyographic quantification of detrusor overactivity by
study of the perineal muscles. Acta Obstet Gynecol ambulatory monitoring. J Urol 1997;157:596–9.
Scand 1999;78:245–53. 40. Managing acute and chronic urinary incontinence.
27. Siltberg H, Larsson G, Victor A. Cough-induced US Department of Health and human services.
leak-point pressure – a valid measure for assessing Journal of the American Academy of Nurse Practitioners
treatment in women with stress incontinence. Acta 1996;8:390–403.
Obstet Gynecol Scand 1998;77:1000–7. 41. Theofrastous JP, Cundiff GW, Harris RL,
Bump RC. The effect of vesical volume on Valsalva
28. Watson AJ, Currie I, Curran S, Jarvis GJ.
leak-point pressures in women with genuine stress
A prospective study examining the association
urinary incontinence. Obstet Gynecol 1996;
between the symptoms of anxiety and depression
87:711–14.
and severity of urinary incontinence. Eur J Obstet
Gynecol Reprod Biol 2000;88:7–9. 42. Moore AA, Siu AL. Screening for common
problems in ambulatory elderly: clinical
29. Yoon E, Swift S. A comparison of maximum
confirmation of a screening instrument. Am J Med
cystometric bladder capacity with maximum
1996;100:438–43.
environmental voided volumes. Int Urogynecol J
Pelvic Floor Dysfunct 1998;9:78–82. 43. Woodtli A. Stress incontinence: clinical
identification and validation of defining
30. Kirschner-Hermanns R, Scherr PA, Branch LG, characteristics. Nursing Diagnosis 1995;6:115–22.
Wetle T, Resnick NM. Accuracy of survey questions
for geriatric urinary incontinence. J Urol 1998; 44. Theofrastous JP, Bump RC, Elser DM, Wyman JF,
159:1903–8. McClish DK. Correlation of urodynamic measures
of urethral resistance with clinical measures of
31. McLennan MT, Bent AE. Supine empty stress test incontinence severity in women with pure genuine
as a predictor of low Valsalva leak point pressure. stress incontinence. The Continence Program for
Neurourol Urodyn 1998;17:121–7. Women Research Group. Am J Obstet Gynecol 1995;
32. Frauscher F, Helweg G, Strasser H, Enna B, 173:407–12; Discussion 412–14.
Klauser A, Knapp R, et al. Intraurethral 45. Cetinel B, Turan T, Talat Z, Yalcin V, Alici B,
ultrasound: diagnostic evaluation of the striated Solok V. Update evaluation of benign prostatic
urethral sphincter in incontinent females. hyperplasia: when should we offer prostatectomy?
Eur Radiol 1998;8:50–3. Br J Urol 1994;74:566–71.
33. Elbadawi A, Hailemariam S, Yalla SV, Resnick NM. 46. Ouslander JG, Simmons S, Tuico E, Nigam JG,
Structural basis of geriatric voiding dysfunction. Fingold S, Bates-Jensen B, et al. Use of a portable
VII. Prospective ultrastructural/urodynamic ultrasound device to measure post-void residual
evaluation of its natural evolution. J Urol 1997; volume among incontinent nursing home
157:1814–22. residents. J Am Geriatr Soc 1994;42:1189–92.
34. Elbadawi A, Hailemariam S, Yalla SV, Resnick NM. 47. Mayer R, Wells TJ, Brink CA, Clark P. Correlations
Structural basis of geriatric voiding dysfunction. between dynamic urethral profilometry and
VI. Validation and update of diagnostic criteria in perivaginal pelvic muscle activity. Neurourol Urodyn
71 detrusor biopsies. J Urol 1997;157:1802–13. 1994;13:227–35.
35. Moore KH, Foote A, Siva S, King J, Burton G. 48. Petros PE, Ulmsten U. Natural volume
The use of the bladder neck support prosthesis in handwashing urethrocystometry: a physiological
combined genuine stress incontinence and technique for the objective diagnosis of the
detrusor instability. Aust N Z J Obstet Gynaecol 1997; unstable detrusor. Gynecol Obstet Invest 1993;
37:440–5. 36:42–6.

49. Lalos O, Berglund AL, Bjerle P. Urodynamics in 63. Goldwasser B, Rife CC, Benson RC, Furlow WL,
women with stress incontinence before and after Barrett DM. Urodynamic evaluation of patients
surgery. Eur J Obstet Gynecol Reprod Biol 1993; after the Camey operation. J Urol 1987;138:832–5.
48:197–205.
64. Glezerman M, Glasner M, Rikover M, Tauber E,
50. Walter JS, Wheeler JS, Morgan C, Zaszczurynski P, Bar-Ziv J, Insler V. Evaluation of reliability of
Plishka M. Measurement of total urethral history in women complaining of urinary stress
compliance in females with stress incontinence. incontinence. Eur J Obstet Gynecol Reprod Biol 1986;
Neurourol Urodyn 1993;12:273–6. 21:159–64.

51. Vinsnes AG, Hunskaar S. Distress associated with 65. Vehkalahti I, Kivela SL, Seppanen J. Are
urinary incontinence, as measured by a visual cystometric and cystoscopic examinations of any
analogue scale. Scandinavian Journal of Caring value for disabled incontinent elderly?
Sciences 1991;5:57–61. Scandinavian Journal of Primary Health Care 1986;
4:243–7.
52. van Waalwijk van Doorn ES, Remmers A,
Janknegt RA. Extramural ambulatory urodynamic 66. Kauppila A, Alavaikko P, Kujansuu E. Detrusor
monitoring during natural filling and normal daily instability score in the evaluation of stress urinary
activities: evaluation of 100 patients. J Urol 1991; incontinence. Acta Obstet Gynecol Scand 1982;
146:124–31. 61:137–41.

53. Griffiths DJ, McCracken PN, Harrison GM. 67. Walter S, Olesen KP. Urinary incontinence and
Incontinence in the elderly: objective genital prolapse in the female: clinical,
demonstration and quantitative assessment. Br J urodynamic and radiological examinations.
Urol 1991;67:467–71. British Journal of Obstetrics and Gynaecology 1982;
89:393–401.
54. Resnick NM. Noninvasive diagnosis of the patient
68. Robinson H, Stanton SL. Detection of urinary
with complex incontinence. Gerontology 1990;
incontinence. British Journal of Obstetrics and
36 Suppl 2:8–18.
Gynaecology 1981;88:59–61.
55. Brocklehurst JC. Urinary incontinence in old age:
69. Fantl JA, Hurt WG, Beachley MC, Bosch HA,
helping the general practitioner to make a
Konerding KF, Smith PJ. Bead-chain
diagnosis. Gerontology 1990;36 Suppl 2:3–7.
cystourethrogram: an evaluation. Obstet Gynecol
56. Kong TK, Morris JA, Robinson JM, 1981;58:237–40.
Brocklehurst JC. Predicting urodynamic
70. Thuroff JW, Jonas U, Petri E, Frohneberg D.
dysfunction from clinical features in incontinent
Telemetric urodynamic investigations in female
elderly women. Age Ageing 1990;19:257–63.
incontinence. Prog Clin Biol Res 1981;78:211–22.
57. Frazer MI, Haylen BT. Trigonal sensitivity testing 71. Drutz HP, Mandel F. Urodynamic analysis of
in women. J Urol 1989;141:356–8. urinary incontinence symptoms in women. Am J
58. Lose G, Jorgensen L, Thunedborg P. 24-hour Obstet Gynecol 1979;134:789–92.
home pad weighing test versus 1-hour ward test in 72. Awad SA, Bryniak SR, Lowe PJ, Bruce AW, Twiddy
the assessment of mild stress incontinence. Acta DA. Urethral pressure profile in female stress
Obstet Gynecol Scand 1989;68:211–15. incontinence. J Urol 1978;120:475–9.
59. Varpula M, Makinen J, Kiilholma P. Cough 73. Drutz HP, Shapiro BJ, Mandel F. Do static
urethrocystography: the best radiological cystourethrograms have a role in the investigation
evaluation of female stress urinary incontinence? of female incontinence? Am J Obstet Gynecol 1978;
Eur J Radiol 1989;9:191–4. 130:516–20.
60. Walters MD, Shields LE. The diagnostic value of 74. Susset JG, Shoukry I, Schlaeder G, Cloutier D,
history, physical examination, and the Q-tip cotton Dutartre D. Stress incontinence and urethral
swab test in women with urinary incontinence. Am obstruction in women: value of uroflowmetry and
J Obstet Gynecol 1988;159:145–9. voiding urethrography. J Urol 1974;111:504–13.
61. Khan Z, Mieza M, Bhola A. Relative usefulness of 75. Diokno AC, Wells TJ, Brink CA. Comparison of
physical examination, urodynamics and self-reported voided volume with cystometric
roentgenography in the diagnosis of urinary stress bladder capacity. J Urol 1987;137:698–700.
incontinence. Surgery, Gynecology and Obstetrics
76. Van Venrooij GEPM, Eckhardt MD, Gisolf KWH,
1988;167:39–44.
Boon TA. Data from frequency–volume charts
62. Victor A, Larsson G, Asbrink AS. A simple patient- versus filling cystometric estimated capacities and
administered test for objective quantitation of the prevalence of instability in men with lower urinary
symptom of urinary incontinence. Scand J Urol tract symptoms suggestive of benign prostatic
Nephrol 1987;21:277–9. hyperplasia. Neurourol Urodyn 2002;21:106–11.

77. Wolters M, Methfessel HD, Goepel C, Koelbl H. 90. Mayer R, Wells T, Brink C, Diokno A, Cockett A.
Computer-assisted virtual urethral pressure profile Handwashing in the cystometric evaluation of
in the assessment of female genuine stress detrusor instability. Neurourol Urodyn 1991;
incontinence. Obstet Gynecol 2002;99:67–74. 10:563–9.
78. Kerschan-Schindl K, Uher E, Wiesinger G, 91. Thind P, Gerstenberg TC. One-hour ward test vs.
Kaider A, Ebenbichler G, Nicolakis P, et al. 24-hour home pad weighing test in the diagnosis
Reliability of pelvic floor muscle strength of urinary incontinence. Neurourol Urodyn 1991;
measurement in elderly incontinent women. 10:241–5.
Neurourol Urodyn 2002;21:42–7.
92. Wilson PD, Mason MV, Herbison GP, Sutherst JR.
79. FitzGerald MP, Butler N, Shott S, Brubaker L. Evaluation of the home pad test for quantifying
Bother arising from urinary frequency in women. incontinence. British Journal of Urology 1989;
Neurourol Urodyn 2002;21:36–40. 64:155–7.
80. Radley SC, Rosario DJ, Chapple CR, Farkas AG. 93. Yu HJ, Kuo HC, Chen SC, Law HS, Tsai TC.
Conventional and ambulatory urodynamic Evaluation of urodynamics for the diagnosis of
findings in women with symptoms suggestive of female stress urinary incontinence. Journal of
bladder overactivity. J Urol 2001;166:2253–8. Surgical Association Republic of China 1988;
21:326–33.
81. Groutz A, Samandarov A, Gold R, Pauzner D,
Lessing JB, Gordon D. Role of urethrocystoscopy 94. Chou TP, Gorton E, Stanton SL, Atherton M,
in the evaluation of refractory idiopathic detrusor Baessler K, Rienhardt G. Can uroflowmetry
instability. Urology 2001;58:544–6. patterns in women be reliably interpreted?
International Urogynecology Journal 2000;11:142–7.
82. Romanzi LJ, Groutz A, Heritz DM, Blaivas JG.
Involuntary detrusor contractions: correlation of 95. Yossepowitch O, Gillon G, Baniel J, Engelstein D,
urodynamic data to clinical categories. Neurourol Livne PM. The effect of cholinergic enhancement
Urodyn 2001;20:249–57. during filling cystometry: can edrophonium
chloride be used as a provocative test for
83. Araki I, Kitahara M, Oida T, Kuno S. Voiding overactive bladder? J Urol 2001;165:1441–5.
dysfunction and Parkinson’s disease: urodynamic
abnormalities and urinary symptoms. J Urol 2000; 96. Goode PS, Locher JL, Bryant RL, Roth DL,
164:1640–3. Burgio KL. Measurement of postvoid residual
urine with portable transabdominal bladder
84. Moretti M, Varaldo M, Malcangi B, Cichero A, ultrasound scanner and urethral catheterization.
Pittaluga P, Riva D. Introital sonography and International Urogynecology Journal 2000;
urodynamic examination in stress urinary 11:296–300.
incontinence: anatomic and functional
relationships. Urodinamica 1998;8:226–31. 97. Lemack GE, Zimmern PE. Predictability of
urodynamic findings based on the Urogenital
85. Nitahara KS, Aboseif S, Tanagho EA. Long-term Distress Inventory-6 questionnaire. Urology 1999;
results of colpocystourethropexy for persistent or 54:461–6.
recurrent stress urinary incontinence. J Urol 1999;
162:138–41. 98. Chen GD, Su TH, Lin LY. Applicability of perineal
sonography in anatomical evaluation of bladder
86. Kirkemo A, Peabody M, Diokno AC, Afanasyev A, neck in women with and without genuine stress
Nyberg LM Jr, Landis JR, et al. Associations incontinence. J Clin Ultrasound 1997;25:189–94.
among urodynamic findings and symptoms in
women enrolled in the Interstitial Cystitis Data 99. Jackson S, Donovan J, Brookes S, Eckford S,
Base (ICDB) study. Urology 1997;49(5S):76–80. Swithinbank L, Abrams P. The Bristol Female
Lower Urinary Tract Symptoms questionnaire:
87. Yeh NH, Chen GD, Lin LY, Wu GS, Su TH. development and psychometric testing. British
Comparison of bladder neck mobility in patients Journal of Urology 1996;77:805–12.
with genuine stress incontinence and continent
women by perineal sonography. Journal of Medical 100. Berglund AL, Lalos O. The pre- and postsurgical
Ultrasound 1996;4:129–33. nursing of women with stress incontinence. J Adv
Nurs 1996;23:502–11.
88. Rivas DA, Chancellor MB. Utility of the American
Urological Association symptom index in the 101. Nitti VW, Kim Y, Combs AJ. Correlation of the
diagnosis of women with voiding dysfunction. AUA symptom index with urodynamics in patients
International Urogynecology Journal 1994;5:202–7. with suspected benign prostatic hyperplasia.
Neurourol Urodyn 1994;13:521–9.
89. Zollner-Nielsen M, Samuelsson SM. Maximal
electrical stimulation of patients with frequency, 102. Kiilholma PJ, Makinen JI, Pitkanen YA,
urgency and urge incontinence: report of 38 cases. Varpula MJ. Perineal ultrasound: an alternative for
Acta Obstet Gynecol Scand 1992;71:629–31. radiography for evaluating stress urinary

incontinence in females. Ann Chir Gynaecol Suppl 116. Williams KS, Assassa RP, Smith NKG, Jagger C,
1994;208:43–5. Perry S, Shaw C, et al. Development,
implementation and evaluation of a new nurse-led
103. McInerney PD, Vanner TF, Harris SA, continence service: a pilot study. Journal of Clinical
Stephenson TP. Ambulatory urodynamics. British Nursing 2000;9:566–73.
Journal of Urology 1991;67:272–4.
117. Cundiff GW, Harris RL, Coates KW, Bump RC.
104. Cucchi A. Acceleration of flow rate as a screening Clinical predictors of urinary incontinence in
test for detrusor instability in women with stress women. Am J Obstet Gynecol 1997;177:262–7.
incontinence. British Journal of Urology 1990;
65:17–19. 118. De Muylder X, Claes H, Neven P, De Jaegher K.
Usefulness of urodynamic investigations in female
105. Versi E, Cardozo LD. Perineal pad weighing versus incontinence. Eur J Obstet Gynecol Reprod Biol 1992;
videographic analysis in genuine stress 44:205–8.
incontinence. British Journal of Obstetrics and
Gynaecology 1986;93:364–6. 119. Diokno AC, Normolle DP, Brown MB, Herzog AR.
Urodynamic tests for female geriatric urinary
106. Nager CW, Schulz JA, Stanton SL, Monga A. incontinence. Urology 1990;36:431–9.
Correlation of urethral closure pressure, leak-point
120. Diokno AC, Wells TJ, Brink CA. Urinary
pressure and incontinence severity measures. Int
incontinence in elderly women: urodynamic
Urogynecol J Pelvic Floor Dysfunct 2001;12:395–400.
evaluation. J Am Geriatr Soc 1987;35:940–6.
107. Sandvik H, Seim A, Vanvik A, Hunskaar S. 121. FitzGerald MP, Brubaker L. Urinary incontinence
A severity index for epidemiological surveys of symptom scores and urodynamic diagnoses.
female urinary incontinence: comparison with Neurourol Urodyn 2002;21:30–5.
48-hour pad-weighing tests. Neurourol Urodyn
2000;19:137–45. 122. Ishiko O, Hirai K, Sumi T, Nishimura S, Ogita S.
The urinary incontinence score in the diagnosis of
108. James M, Jackson S, Shepherd A, Abrams P. female urinary incontinence. Int J Gynaecol Obstet
Pure stress leakage symptomatology: is it safe to 2000;68:131–7.
discount detrusor instability? British Journal of
Obstetrics and Gynaecology 1999;106:1255–8. 123. Korda A, Krieger M, Hunter P, Parkin G.
The value of clinical symptoms in the diagnosis of
109. Larsson G, Blixt C, Janson G, Victor A. urinary incontinence in the female. Aust N Z J
The frequency/volume chart as a differential Obstet Gynaecol 1987;27:149–51.
diagnostic tool in female urinary incontinence.
International Urogynecology Journal 1994;5:273–7. 124. Kujansuu E, Kauppila A. Scored urological history
and urethrocystometry in the differential diagnosis
110. Davila GW. Ambulatory urodynamics in urge of female urinary incontinence. Ann Chir Gynaecol
incontinence evaluation. International Urogynecology 1982;71:197–202.
Journal 1994;5:25–30.
125. Lagro-Janssen AL, Debruyne FM, van Weel C.
111. Hahn I, Fall M. Objective quantification of stress Value of the patient’s case history in diagnosing
urinary incontinence: a short, reproducible, urinary incontinence in general practice. British
provocative pad-test. Neurourol Urodyn 1991; Journal of Urology 1991;67:569–72.
10:475–81.
126. Niecestro RM, Wheeler JS, Nanninga J, Einhorn
112. Mouritsen L, Berlid G, Hertz J. Comparison of C, Goggin C. Use of stresscath for diagnosing
different methods for quantification of urinary stress incontinence. Urology 1992;39:266–9.
leakage in incontinent women. Neurourol Urodyn 127. Ouslander J, Staskin D, Raz S, Su HL, Hepps K.
1989;8:579–87. Clinical versus urodynamic diagnosis in an
113. Harvey MA, Kristjansson B, Griffith D, Versi E. incontinent geriatric female population. J Urol
The Incontinence Impact Questionnaire and the 1987;137:68–71.
Urogenital Distress Inventory: a revisit of their 128. Ramsay IN, Hilton P, Rice N. The symptomatic
validity in women without a urodynamic diagnosis. characterization of patients with detrusor
Am J Obstet Gynecol 2001;185:25–31. instability and those with genuine stress
incontinence. International Urogynecology Journal
114. Hanley J, Capewell A, Hagen S. Validity study of
1993;4:23–6.
the severity index, a simple measure of urinary
incontinence in women. BMJ 2001;322:1096–7. 129. Sand PK, Hill RC, Ostergard DR. Incontinence
history as a predictor of detrusor stability. Obstet
115. Ryhammer AM, Laurberg S, Djurhuus JC,
Gynecol 1988;71:257–60.
Hermann AP. No relationship between subjective
assessment of urinary incontinence and pad test 130. Sandvik H, Hunskaar S, Vanvik A, Bratt H,
weight gain in a random population sample of Seim A, Hermstad R. Diagnostic classification of
menopausal women. J Urol 1998;159:800–3. female urinary incontinence: an epidemiological

survey corrected for validity. J Clin Epidemiol 1995; stress urinary incontinence. British Journal of
48:339–43. Urology 1988;62:228–34.
131. Sunshine TJ, Glowacki GA. Clinical correlation of 145. Pelsang RE, Bonney WW. Voiding
urodynamic testing in patients with urinary cystourethrography in female stress incontinence
incontinence. Journal of Gynecologic Surgery 1989; [see comments]. AJR Am J Roentgenol 1996;
5:93–8. 166:561–5.
132. Cantor TJ, Bates CP. A comparative study of 146. Scotti RJ, Ostergard DR, Guillaume AA, Kohatsu
symptoms and objective urodynamic findings in KE. Predictive value of urethroscopy as compared
214 incontinent women. British Journal of Obstetrics to urodynamics in the diagnosis of genuine stress
and Gynaecology 1980;87:889–92. incontinence. J Reprod Med 1990;35:772–6.
133. Klovning A, Hunskaar S, Eriksen BC. Validity of a 147. Grischke EM, Anton H, Stolz W, von Fournier D,
scored urological history in detecting detrusor Bastert G. Urodynamic assessment and lateral
instability in female urinary incontinence. Acta urethrocystography. A comparison of two
Obstet Gynecol Scand 1996;75:941–5. diagnostic procedures for female urinary
134. Jorgensen L, Lose G, Andersen JT. One-hour incontinence. Acta Obstet Gynecol Scand 1991;
pad-weighing test for objective assessment of 70:225–9.
female urinary incontinence. Obstet Gynecol 1987; 148. Bergman A, McKenzie C, Ballard CA, Richmond J.
69:39–42. Role of cystourethrography in the preoperative
135. Versi E, Orrego G, Hardy E, Seddon G, Smith P, evaluation of stress urinary incontinence in
Anand D. Evaluation of the home pad test in the women. J Reprod Med 1988;33:372–6.
investigation of female urinary incontinence. 149. Hsu TH, Rackley RR, Appell RA. The supine
British Journal of Obstetrics and Gynaecology 1996; stress test: a simple method to detect intrinsic
103:162–7. urethral sphincter dysfunction. J Urol 1999;
136. Contreras Ortiz O, Lombardo RJ, Pellicari A. 162:460–3.
Non-invasive diagnosis of bladder instability using 150. Kadar N. The value of bladder filling in the
the Bladder Instability Discriminant Index (BIDI). clinical detection of urine loss and selection of
Zentralbl Gynakol 1993;115:446–9. patients for urodynamic testing. British Journal of
137. Bergman A, McCarthy TA, Ballard CA, Yanai J. Obstetrics and Gynaecology 1988;95:698–704.
Role of the Q-tip test in evaluating stress urinary 151. Sand PK, Brubaker LT, Novak T. Simple standing
incontinence. J Reprod Med 1987;32:273–5. incremental cystometry as a screening method for
138. Montz FJ, Stanton SL. Q-tip test in female urinary detrusor instability. Obstet Gynecol 1991;77:453–7.
incontinence. Obstet Gynecol 1986;67:258–60. 152. Sand PK, Hill RC, Ostergard DR. Supine
139. Dietz HP, McKnoulty L, Clarke B. Translabial urethroscopic and standing cystometry as
color Doppler for imaging in urogynecology: screening methods for the detection of detrusor
a preliminary report. Ultrasound Obstet Gynecol instability. Obstet Gynecol 1987;70:57–60.
1999;14:144–7. 153. Sutherst JR, Brown MC. Comparison of single and
140. Dietz HP, Wilson PD. Anatomical assessment of the multichannel cystometry in diagnosing bladder
bladder outlet and proximal urethra using instability. BMJ 1984;288:1720–2.
ultrasound and videocystourethrography. Int 154. Fonda D, Brimage PJ, D’Astoli M. Simple
Urogynecol J Pelvic Floor Dysfunct 1998;9:365–9. screening for urinary incontinence in the elderly:
141. Dietz HP, Clarke B. Translabial color Doppler comparison of simple and multichannel
urodynamics. Int Urogynecol J Pelvic Floor Dysfunct cystometry. Urology 1993;42:536–40.
2001;12:304–7. 155. Ouslander J, Leach G, Abelson S, Staskin D,
142. Quinn MJ, Farnsworth BA, Pollard WJ, Smith PJB, Blaustein J, Raz S. Simple versus multichannel
Stott MA. Vaginal ultrasound in the diagnosis of cystometry in the evaluation of bladder function in
stress incontinence: a prospective comparison to an incontinent geriatric population. J Urol 1988;
urodynamic investigations. Neurourol Urodyn 1989; 140:1482–6.
8:302–3.
156. Resnick NM, Brandeis GH, Baumann MM,
143. Bergman A, Ballard CA, Platt LD. Ultrasonic DuBeau CE, Yalla SV. Misdiagnosis of urinary
evaluation of urethrovesical junction in women incontinence in nursing home women: prevalence
with stress urinary incontinence. J Clin Ultrasound and a proposed solution. Neurourol Urodyn 1996;
1988;16:295–300. 15:599–618.
144. Bergman A, McKenzie CJ, Richmond J, 157. Davis G, McClure G, Sherman R, Hibbert M,
Ballard CA, Platt LD. Transrectal ultrasound Wong M, Perez R. Ambulatory urodynamics of
versus cystography in the evaluation of anatomical female soldiers. Mil Med 1998;163:808–12.

158. Swift SE, Ostergard DR. Evaluation of current 173. de Bolla AR, Arkell DG. Urodynamic investigation
urodynamic testing methods in the diagnosis of in a district general hospital. Ann R Coll Surg Engl
genuine stress incontinence. Obstet Gynecol 1995; 1983;65:173–5.
86:85–91.
174. Robinson D, Pearce KF, Preisser JS, Dugan E,
159. Richardson DA. Value of the cough pressure Suggs PK, Cohen SJ. Relationship between patient
profile in the evaluation of patients with stress reports of urinary incontinence symptoms and
incontinence. Am J Obstet Gynecol 1986;155:808–11. quality of life measures. Obstet Gynecol 1998;
91:224–8.
160. Versi E. Discriminant analysis of urethral pressure
profilometry data for the diagnosis of genuine 175. Uebersax JS, Wyman JF, Shumaker SA,
stress incontinence. British Journal of Obstetrics and McClish DK, Fantl JA. Short forms to assess life
Gynaecology 1990;97:251–9. quality and symptom distress for urinary
incontinence in women: The incontinence impact
161. Weidner AC, Myers ER, Visco AG, Cundiff GW,
questionnaire and the urogenital distress
Bump RC. Which women with stress incontinence
inventory. Neurourol Urodyn 1995;14:131–9.
require urodynamic evaluation? Am J Obstet Gynecol
2001;184(2):20–7. 176. Gunthorpe W, Brown W, Redman S. The
162. Clarke B. The role of urodynamic assessment in development and evaluation of an incontinence
the diagnosis of lower urinary tract disorders. screening questionnaire for female primary care.
Int Urogynecol J Pelvic Floor Dysfunct 1997;8:196–9. Neurourol Urodyn 2000;19:595–607.

163. Bergman A, Bader K. Reliability of the patient’s 177. Haeusler G, Hanzal E, Joura E, Sam C, Koelbl H.
history in the diagnosis of urinary incontinence. Differential diagnosis of detrusor instability and
Int J Gynaecol Obstet 1990;32:255–9. stress-incontinence by patient history: the
Gaudenz Incontinence Questionnaire revisited.
164. Ng RK, Murray A. Can we afford to take short cuts Acta Obstet Gynecol Scand 1995;74:635–7.
in the management of stress urinary incontinence?
Singapore Med J 1993;34:121–4. 178. Shumaker SA, Wyman JF, Uebersax JS, McClish D,
Fantl JA. Health-related quality of life measures
165. Le Coutour X, Jung-Faerber S, Klein P, Renaud R. for women with urinary incontinence: the
Female urinary incontinence: comparative value of Incontinence Impact Questionnaire and the
history and urodynamic investigations. Eur J Obstet Urogenital Distress Inventory. Continence
Gynecol Reprod Biol 1990;37:279–86. Program in Women (CPW) research group. Qual
166. Petros PP, Ulmsten U. Urge incontinence history is Life Res 1994;3:291–306.
an accurate predictor of urge incontinence. Acta 179. Nitti VW, Kim Y, Combs AJ. Voiding dysfunction
Obstet Gynecol Scand 1992;71:537–9. following transurethral resection of the prostate:
167. Van Doorn ESCVW, Ambergen AW, Janknegt RA. symptoms and urodynamic findings. J Urol 1997;
Detrusor activity index: quantification of detrusor 157:600–3.
overactivity by ambulatory monitoring. J Urol 180. Hellstrom L, Ekelund P, Larsson M, Milsom I.
1997;157:596–9. A comparison between experienced and
168. Ficazzola MA, Nitti VW. The etiology of post- objectively demonstrated urinary leakage in 85-
radical prostatectomy incontinence and correlation year old men and women. Scandinavian Journal of
of symptoms with urodynamic findings. J Urol Caring Sciences 1991;5:17–21.
1998;160:1317–20. 181. Papa Petros PE, Ulmsten U. An analysis of rapid
169. Ding YY, Lieu PK, Choo PW. Is the bladder ‘an pad testing and the history for the diagnosis of
unreliable witness’ in elderly males with persistent stress incontinence. Acta Obstet Gynecol Scand 1992;
lower urinary tract symptoms? Geriatric Nephrology 71:529–36.
and Urology 1997;7:17–21. 182. Mayne CJ, Hilton P. Short pad test:
170. Hyman MJ, Groutz A, Blaivas JG. Detrusor standardisation of method and comparison with
instability in men: correlation of lower urinary 1-hour test. Neurourol Urodyn 1988;7:443–5.
tract symptoms with urodynamic findings. J Urol
183. Siltberg H, Larsson G, Hallen B, Johansson C,
2001;166:550–3.
Ulmsten U. Validation of cough-induced leak
171. Porru D, Usai E. Standard and extramural point pressure measurement in the evaluation of
ambulatory urodynamic investigation for the pharmacological treatment of stress incontinence.
diagnosis of detrusor instability-correlated Neurourol Urodyn 1999;18:591–602.
incontinence and micturition disorders. Neurourol
184. Miller JM, Ashton-Miller JA, Carchidi LT,
Urodyn 1994;13:237–42.
DeLancey JO. On the lack of correlation between
172. Gray M, McClain R, Peruggia M, Patrie J, self-report and urine loss measured with standing
Steers WD. A model for predicting motor urge provocation test in older stress-incontinent
urinary incontinence. Nurs Res 2001;50:116–22. women. J Womens Health 1999;8:157–62.

185. Robinson D, McClish DK, Wyman JF, Bump RC, preliminary assessment of its role as a quick
Fantl JA. Comparison between urinary diaries screening test for incontinent women. British
completed with and without intensive patient Journal of Obstetrics and Gynaecology 1991;98:69–72.
instructions. Neurourol Urodyn 1996;15:143–8.
199. Kolbl H, Bernaschek G, Wolf G. A comparative
186. Robb SS. Urinary incontinence verification in study of perineal ultrasound scanning and
elderly men. Nurs Res 1985;34:278–82. urethrocystography in patients with genuine stress
187. Fink D, Perucchini D, Schaer GN, Haller U. incontinence. Arch Gynecol Obstet 1988;244:39–45.
The role of the frequency–volume chart in the 200. Rose DH, Eaton AC. Observations in micturating
differential diagnostic of female urinary cystourethrography. J R Soc Med 1983;76:121–5.
incontinence. Acta Obstet Gynecol Scand 1999;
78:254–7. 201. Lobel RW, Sand PK. The empty supine stress test
as a predictor of intrinsic urethral sphincter
188. Romanzi LJ, Polaneczky M, Glazer HI. dysfunction. Obstet Gynecol 1996;88:128–32.
Simple test of pelvic muscle contraction during
pelvic examination: correlation to surface 202. Hebert DB, Ostergard DR. Vesical instability:
electromyography. Neurourol Urodyn 1999; urodynamic parameters by microtip transducer
18:603–12. catheters. Obstet Gynecol 1982;60:331–7.
189. Brink CA, Wells TJ, Sampselle CM, Taillie ER, 203. Rosario DJ, MacDiarmid SA, Radley SC, Chapple
Mayer R. A digital test for pelvic muscle strength CR. A comparison of ambulatory and conventional
in women with urinary incontinence. Nurs Res urodynamic studies in men with borderline outlet
1994;43:352–6. obstruction. British Journal of Urology International
1999;83:400–9.
190. Fischer-Rasmussen W, Hansen RI, Stage P.
Predictive values of diagnostic tests in the 204. Bhatia NN, Bradley WE, Haldeman S.
evaluation of female urinary stress incontinence. Urodynamics: continuous monitoring. J Urol 1982;
Acta Obstet Gynecol Scand 1986;65:291–4. 128:963–8.
191. Karram MM, Bhatia NN. The Q-tip test: 205. Webb RJ, Ramsden PD, Neal DE. Ambulatory
standardization of the technique and its monitoring and electronic measurement of urinary
interpretation in women with urinary leakage in the diagnosis of detrusor instability and
incontinence. Obstet Gynecol 1988;71:807–11. incontinence. British Journal of Urology 1991;
192. Walters MD, Diaz K. Q-tip test: a study of 68:148–52.
continent and incontinent women. Obstet Gynecol 206. Pajoncini C, Costantini E, Rociola W, Porena M.
1987;70:208–11. The maximum urethral closure pressure and the
193. Resnick NM, Brandeis GH, Baumann MM, Valsalva leak point pressure in the diagnosis of
Morris JN. Evaluating a national assessment intrinsic sphincter deficiency: preliminary results.
strategy for urinary incontinence in nursing home Acta Urologica Italica 1999;13:231–5.
residents: reliability of the minimum data set and
207. Versi E, Cardozo L, Cooper DJ. Urethral
validity of the resident assessment protocol.
pressures: analysis of transmission pressure ratios.
Neurourol Urodyn 1996;15:583–98.
British Journal of Urology 1991;68:266–70.
194. Eastwood HD, Warrell R. Urinary incontinence in
208. Swift SE. The reliability of performing a screening
the elderly female: prediction in diagnosis and
cystometrogram using a fetal monitoring device
outcome of management. Age Ageing 1984;
for the detection of detrusor instability. Obstet
13:230–4.
Gynecol 1997;89:708–12.
195. Hilton P, Stanton SL. Algorithmic method for
assessing urinary incontinence in elderly women. 209. Bergman A, Nguyen H, Koonings PP, Ballard CA.
BMJ 1981;282:940–2. Use of fetal cardiotocographic monitor in the
evaluation of urinary incontinence. Israel Journal of
196. Summitt RL, Stovall TG, Bent AE, Ostergard DR. Medical Sciences 1988;24:291–4.
Urinary incontinence: correlation of history and
brief office evaluation with multichannel 210. Petersen T, Chandiramani V, Fowler CJ. The ice-
urodynamic testing. Am J Obstet Gynecol 1992; water test in detrusor hyper-reflexia and bladder
166:1835–44. instability. British Journal of Urology 1997;79:163–7.

197. Griffiths DJ, McCracken PN, Harrison GM, 211. Sutherst JR, Brown M. Detection of urethral
Gormley EA. Characteristics of urinary incompetence. Erect studies using the fluid-bridge
incontinence in elderly patients studied by 24- test. British Journal of Urology 1981;53:360–3.
hour monitoring and urodynamic testing. Age 212. Hanzal E, Berger E, Koelbl H. Reliability of the
Ageing 1992;21:195–201. urethral closure pressure profile during stress in
198. Creighton SM, Plevnik S, Stanton SL. Distal the diagnosis of genuine stress incontinence.
86 urethral electrical conductance (DUEC) – a British Journal of Urology 1991;68:369–71.
213. Frigerio L, Ferrari A, Candiani GB. The significance of the stop test in female urinary incontinence. Diagnostic Gynecology and Obstetrics 1981;3:301–4.
214. Briggs AH, Gray AM. Handling uncertainty when performing economic evaluation of healthcare interventions. Health Technol Assess 1999;3(2).
215. NHS Reference Costs. Department of Health, 2002. http://www.dh.gov.uk/PublicationsAndStatistics/Publications/PublicationsPolicyAndGuidance/PublicationsPolicyAndGuidanceArticle/fs/en?CONTENT_ID=4069646&chk=vzKK5z. Accessed March 2004.
216. Hopewell S, Clarke M, Lefebvre C, Scherer R. Is handsearching still worthwhile? Results of a Cochrane methodology review. Proceedings of the XI Cochrane Colloquium, Barcelona, Spain, 26–31 October 2003.
217. Juni P, Holenstein F, Sterne J, Bartlett C, Egger M. Direction and impact of language bias in meta-analyses of controlled trials: empirical study. Int J Epidemiol 2002;31:115–23.
218. Egger M, Zellweger-Zahner T, Schneider M, Junker C, Lengeler C, Antes G. Language bias in randomised controlled trials published in English and German. Lancet 1997;350:326–9.
219. Whiting P, Rutjes AW, Dinnes J, Reitsma J, Bossuyt PM, Kleijnen J. Development and validation of methods for assessing the quality of diagnostic studies. Health Technol Assess 2004;8(25).
220. Bossuyt P, Reitsma J, Bruns D, Gatsonis CA, Glasziou PP, Irwig LM, et al. Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. BMJ 2003;326:41–4.
221. Riley R, Burchill S, Abrams K, Heney D, Lambert PC, Jones DR, et al. A systematic review and evaluation of the use of tumour markers in paediatric oncology: Ewing’s sarcoma and neuroblastoma. Health Technol Assess 2003;7(5).
222. Chapple C, Pesce F. An interactive symposium to determine criteria for evaluating new minimally invasive treatments for SUI. International Continence Society 33rd Annual Meeting, Florence, Italy, 5–9 October 2003. p. 34.
223. Weber AM, Taylor RJ, Wei JT, Lemack G, Piedmonte MR, Walters MD. The cost-effectiveness of preoperative testing (basic office assessment vs urodynamics) for stress urinary incontinence in women. British Journal of Urology International 2002;89:356–63.
224. Weber AM, Walters MD. Cost-effectiveness of urodynamic testing before surgery for women with pelvic organ prolapse and stress urinary incontinence. Am J Obstet Gynecol 2000;183:1338–46.


© Queen’s Printer and Controller of HMSO 2006. All rights reserved.



Appendix 1
Search strategy
MEDLINE

1 exp URODYNAMICS/ or urodynamics.mp.
2 provocation stress test$.mp.
3 frequency volume chart$.mp.
4 urinanalysis.mp.
5 post-void residual volume.mp. [mp=title, abstract, registry number word, mesh subject heading]
6 (mid-stream specimen adj2 urine).mp. [mp=title, abstract, registry number word, mesh subject heading]
7 mssu.mp. [mp=title, abstract, registry number word, mesh subject heading]
8 (pad tests or pad testing or pad test).ti,ab.
9 exp URINALYSIS/ or urinalysis.mp.
10 (mid-stream sampl$ adj2 urine).ti,ab.
11 or/1-10
12 exp "Sensitivity and Specificity"/
13 sensitivity.tw.
14 specificity.tw.
15 DO.xs.
16 ri.fs.
17 du.fs.
18 or/12-17
19 exp Predictive Value of Tests/
20 Reference Values/
21 Reference Standards/
22 ROC Curve/
23 exp Diagnostic Errors/
24 ((sensitivity or specificity) adj25 (test or tests)).ti,ab.
25 (predictive value$ or predictive standard$ or predictive model$).ti,ab.
26 (roc or receiver operat$ characteristic or receiver operat$ curve$).ti,ab.
27 (likelihood ratio$ or likelihood function$).ti,ab.
28 (diagnostic error$ or (errors adj2 diagnosis) or (false adj2 reaction$)).ti,ab.
29 (false positive or false positives or false negative or false negatives).ti,ab.
30 (‘gold standard’$ or reference test$ or ‘gold standard’$).ti,ab.
31 (criter$ standard$ or criter$ bias or criteria test or criteria tests).ti,ab.
32 (validat$ standard or validat$ test or validat$ tests or validat$ bias).ti,ab.
33 (work-up bias or workup bias or expectation bias or verification bias).ti,ab.
34 ((observer adj2 bias) or inderminate result$).ti,ab.
35 ((observer adj25 different) or observer variation$).ti,ab.
36 Observer Variation/
37 ((interrater or intrarater or observer) adj25 reliability).ti,ab.
38 (intra adj4 reliability).ti,ab.
39 ((accuracy or reliability) adj2 (test or tests or testing or standard or standards)).ti,ab.
40 (performance adj2 (test or tests or testing or standard or standards)).ti,ab.
41 (reference value or reference values or sroc).ti,ab.
42 exp Urinary Incontinence/ or urinary incontinence.mp.
43 urge incontinence.mp.
44 stress incontinence.mp.
45 (leakage and urin$).mp. [mp=title, abstract, registry number word, mesh subject heading]
46 detrusor instability.mp.
47 or/42-46
48 or/19-41
49 48 or 11 or 18
50 47 and 49
51 limit 50 to (human and english language and all adult <19 plus years>)

EMBASE

1 exp "Sensitivity and Specificity"/
2 exp Predictive Value of Tests/
3 Reference Values/
4 Reference Standards/
5 ROC Curve/
6 exp Diagnostic Errors/
7 ((sensitivity or specificity) adj25 (test or tests)).ti,ab.
8 (predictive value$ or predictive standard$ or predictive model$).ti,ab.
9 (roc or receiver operat$ characteristic or receiver operat$ curve$).ti,ab.
10 (likelihood ratio$ or likelihood function$).ti,ab.
11 (diagnostic error$ or (errors adj2 diagnosis) or (false adj2 reaction$)).ti,ab.
12 (false positive or false positives or false negative or false negatives).ti,ab.
13 (‘gold standard’$ or reference test$ or ‘gold standard’$).ti,ab.
14 (criter$ standard$ or criter$ bias or criteria test or criteria tests).ti,ab.
15 (validat$ standard or validat$ test or validat$ tests or validat$ bias).ti,ab.
16 (work-up bias or workup bias or expectation bias or verification bias).ti,ab.
17 ((observer adj2 bias) or inderminate result$).ti,ab.
18 ((observer adj25 different) or observer variation$).ti,ab.
19 Observer Variation/
20 ((interrater or intrarater or observer) adj25 reliability).ti,ab.
21 (intra adj4 reliability).ti,ab.
22 ((accuracy or reliability) adj2 (test or tests or testing or standard or standards)).ti,ab.
23 (performance adj2 (test or tests or testing or standard or standards)).ti,ab.
24 (reference value or reference values or sroc).ti,ab.
25 or/1-24
26 DO.fs.
27 exp URODYNAMICS/ or urodynamics.mp.
28 exp URINALYSIS/ or urinalysis.mp.
29 (mid stream specimen adj2 urine).mp.
30 (mid stream sampl$ adj2 urine).mp.
31 pad test$.mp.
32 (validat$ adj25 scal$).mp.
33 (stress and provocation and test$).mp. [mp=title, abstract, subject headings, drug trade name, original title, device manufacturer, drug manufacturer name]
34 exp Physical Examination/ or physical examination.mp.
35 or/26-34
36 25 or 35
37 exp Urine Incontinence/ or urinary incontinence.mp.
38 exp Urge Incontinence/ or urge incontinence.mp.
39 exp Stress Incontinence/ or stress incontinence.mp.
40 exp Detrusor Dyssynergia/ or detrusor instability.mp.
41 (leak$ and urin$).mp.
42 or/37-41
43 36 and 42
44 limit 43 to (human and english language)
45 limit 44 to (adult <18 to 64 years> or aged <65+ years>)

CINAHL

1 pa.fs.
2 us.fs.
3 ra.fs.
4 DO.fs.
5 du.fs.
6 exp "Sensitivity and Specificity"/
7 sensitivity.tw.
8 specificity.tw.
9 or/1-8
10 exp Urinary Incontinence/ or urinary incontinence.mp.
11 Stress Incontinence/ or stress incontinence.mp.
12 exp Urge Incontinence/ or urge incontinence.mp.
13 detrusor instability.mp.
14 (leak$ and urin$).mp. [mp=title, cinahl subject heading, abstract, instrumentation]
15 or/10-14
16 9 and 15
17 exp Predictive Value of Tests/
18 Reference Values/ or reference values.mp.
19 roc curve.mp.
20 exp Diagnostic Errors/ or diagnostic errors.mp.
21 (predictive value$ or predictive standard$ or predictive model$).ti,ab.
22 (roc or receiver operat$ characteristic or receiver operat$ curve$).ti,ab.
23 (likelihood ratio$ or likelihood function$).ti,ab.
24 (diagnostic error$ or (errors adj2 diagnosis) or (false adj2 reaction$)).ti,ab.
25 (false positive or false positives or false negative or false negatives).ti,ab.
26 (gold standard$ or reference test$ or gold standard$).ti,ab.
27 (criter$ standard$ or criter$ bias or criteria test or criteria tests).ti,ab.
28 (validat$ standard or validat$ test or validat$ tests or validat$ bias).ti,ab.
29 (work-up bias or workup bias or expectation bias or verification bias).ti,ab.
30 ((observer adj2 bias) or inderminate result$).ti,ab.
31 ((observer adj25 different) or observer variation$).ti,ab.
32 observer variation.mp.
33 ((interrater or intrarater or observer) adj25 reliability).ti,ab.
34 (intra adj4 reliability).ti,ab.
35 ((accuracy or reliability) adj2 (test or tests or testing or standard or standards)).ti,ab.
36 (performance adj2 (test or tests or testing or standard or standards)).ti,ab.
37 (reference value or reference values or sroc).ti,ab.
38 or/17-37
39 15 and 38
40 16 or 39
41 exp URODYNAMICS/ or urodynamics.mp.
42 urinalysis.mp. [mp=title, cinahl subject heading, abstract, instrumentation]
43 (mid stream specimen adj2 urine).mp. [mp=title, cinahl subject heading, abstract, instrumentation]
44 (mid stream sampl$ adj2 urine).mp. [mp=title, cinahl subject heading, abstract, instrumentation]
45 pad test$.mp. [mp=title, cinahl subject heading, abstract, instrumentation]
46 (validat$ adj25 scale$).mp. [mp=title, cinahl subject heading, abstract, instrumentation]
47 stress provocation test$.mp. [mp=title, cinahl subject heading, abstract, instrumentation]
48 provocation stress test$.mp. [mp=title, cinahl subject heading, abstract, instrumentation]
49 physical examination.mp. [mp=title, cinahl subject heading, abstract, instrumentation]
50 or/41-49
52 40 or 51
53 limit 52 to english
54 from 53 keep 1-165
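
The way the numbered lines of the MEDLINE strategy combine can be illustrated with plain set operations. The following is a minimal sketch only: the record identifiers (`r1` to `r6`) and the helper `or_range` are hypothetical and are not part of the search software, and only a handful of the numbered lines are populated.

```python
from typing import Dict, Set


def or_range(results: Dict[int, Set[str]], first: int, last: int) -> Set[str]:
    """Union of the record sets for lines first..last, mirroring 'or/first-last'."""
    combined: Set[str] = set()
    for line in range(first, last + 1):
        combined |= results.get(line, set())
    return combined


# Hypothetical record IDs retrieved by a few of the numbered MEDLINE lines.
results: Dict[int, Set[str]] = {
    1: {"r1", "r2"},                # line 1: urodynamics
    8: {"r3"},                      # line 8: pad test
    13: {"r2", "r4"},               # line 13: sensitivity.tw.
    22: {"r5"},                     # line 22: ROC Curve/
    42: {"r1", "r2", "r5", "r6"},   # line 42: urinary incontinence
    44: {"r3"},                     # line 44: stress incontinence
}

results[11] = or_range(results, 1, 10)     # line 11: or/1-10 (assessment methods)
results[18] = or_range(results, 12, 17)    # line 18: or/12-17
results[47] = or_range(results, 42, 46)    # line 47: or/42-46 (incontinence terms)
results[48] = or_range(results, 19, 41)    # line 48: or/19-41 (accuracy terms)
results[49] = results[48] | results[11] | results[18]   # line 49: 48 or 11 or 18
results[50] = results[47] & results[49]                 # line 50: 47 and 49

print(sorted(results[50]))
```

In this toy data, `r4` is dropped because it matches no incontinence term and `r6` is dropped because it matches neither an assessment method nor an accuracy term. The limits applied in line 51 are not modelled.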


Appendix 2
Quality assessment tool
Item Yes No Unclear

1. Was the spectrum of patients representative of the patients who will receive the test in practice?
2. Were selection criteria clearly described?
3. Is the reference standard likely to correctly classify the target condition?
4. Is the time period between reference standard and index test short enough to be reasonably sure that the target condition did not change between the two tests?
5. Did the whole sample or a random selection of the sample receive verification using a reference standard of diagnosis?
6. Did patients receive the same reference standard regardless of the index test result?
7. Was the reference standard independent of the index test (i.e. the index test did not form part of the reference standard)?
8a. Was the execution of the index test described in sufficient detail to permit replication of the test?
8b. Was the execution of the reference standard described in sufficient detail to permit its replication?
9a. Were the index test results interpreted without knowledge of the results of the reference standard?
9b. Were the reference standard results interpreted without knowledge of the results of the index test?
10. Were the same clinical data available when test results were interpreted as would be available when the test is used in practice?
11. Were uninterpretable/intermediate test results reported?
12. Were withdrawals from the study explained?
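
For reviewers who record the form electronically, one possible representation is sketched below. It is an illustration under stated assumptions, not the procedure used in this review: the names `Response`, `ITEM_IDS` and `tally` are hypothetical. The sketch only checks that every item has one of the three permitted responses and counts them; it does not compute a quality score.

```python
from collections import Counter
from enum import Enum
from typing import Dict


class Response(Enum):
    """The three permitted responses on the quality assessment form."""
    YES = "yes"
    NO = "no"
    UNCLEAR = "unclear"


# Item identifiers as they appear on the form (hypothetical constant name).
ITEM_IDS = ["1", "2", "3", "4", "5", "6", "7",
            "8a", "8b", "9a", "9b", "10", "11", "12"]


def tally(form: Dict[str, Response]) -> Counter:
    """Check that every item is answered, then count responses by type."""
    missing = [item for item in ITEM_IDS if item not in form]
    if missing:
        raise ValueError(f"Unanswered items: {missing}")
    return Counter(form[item] for item in ITEM_IDS)


# Example: a paper where only the reference standard item can be coded 'yes'.
example = {item: Response.UNCLEAR for item in ITEM_IDS}
example["3"] = Response.YES
print(tally(example))
```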


Appendix 3
Instructions for quality assessment219
Explanation of items included in the quality assessment tool and guide to scoring items

In addition to the quality assessment sheet, please fill in the attached sheet with the requested information on the study sample. In addition to providing useful information for categorising the paper this will also assist you with the quality assessment.

Following the pilot quality assessment some further instructions have been added to assist with the scoring of some of the items. These are included in the blue boxes after the original instructions.

General Note: In the pilot study there appeared to be a reluctance to code items as ‘unclear’. This is an equally valid response and should be used when appropriate. No papers will be excluded from the review on the basis of quality assessment: the coding of items as unclear is not necessarily a sign of poor quality, only a reflection of a lack of clarity in reporting. This may provide useful recommendations for the reporting of diagnostic studies in the future.

1. Was the spectrum of patients representative of the patients who will receive the test in practice?

a. What is meant by this item
Differences in demographic and clinical features between populations may produce measures of diagnostic accuracy that vary considerably; this is known as spectrum bias. Reported estimates of diagnostic accuracy may have limited clinical applicability (generalisability) if the spectrum of tested patients is not similar to the patients in whom the test will be used in practice. The spectrum of patients refers not only to the severity of the underlying target condition, but also to demographic features and to the presence of differential diagnosis and/or co-morbidity. It is therefore important that diagnostic test evaluations include an appropriate spectrum of patients for the test under investigation and that a clear definition of the characteristics of the included patients is provided.

b. Situations in which this item does not apply
This item is relevant to all studies of diagnostic accuracy and should always be included in the quality assessment tool.

c. How to score this item
Studies should score ‘yes’ for this item if you believe, based on the information reported or obtained from the study’s authors, that the spectrum of patients included in the study was representative of those in whom the test will be used in practice. The judgement should be based on both the method of recruitment and the characteristics of those recruited. Studies which recruit a group of healthy controls and a group known to have the target disorder will be coded as ‘no’ on this item in nearly all circumstances. Reviewers should prespecify in the protocol of the review what spectrum of patients would be acceptable taking factors such as disease prevalence and severity, age and sex into account. If you think that the population studied does not fit into what you specified as acceptable, the study should be scored as ‘no’. If there is insufficient information available to make a judgement then it should be scored as ‘unclear’.

Additional instructions for Question 1:
It is not necessary for the study sample to be statistically representative of all the patients who may receive the test in practice. The study should include a sample that meets the broad remits of the study:
A sample of community-dwelling adults not exclusively consisting of patients with a related chronic disease.
Therefore, the sample does not have to consist of both men and women, to include a wide range of age groups or include a primary and secondary care population to be coded as ‘yes’.

2. Were selection criteria clearly described?

a. What is meant by this item
This refers to whether studies have provided a clear definition of the criteria used as selection criteria for entry into the study.

b. Situations in which this item does not apply
This item is relevant to all studies of diagnostic accuracy and should always be included in the quality assessment tool.

c. How to score this item
If you think that all relevant information regarding how participants were selected for inclusion in the study has been provided then this item should be scored as ‘yes’. If study selection criteria are not clearly reported then this item should be scored as ‘no’. In situations where selection criteria are partially reported and you feel that you do not have enough information to score this item as ‘yes’, then it should be scored as ‘unclear’.

In order for this to be coded as ‘yes’ the description of the sample needs to fulfil all of these criteria:
Age: either an age range or a measure of central tendency (with SD) should be presented. If a statement such as ‘women over the age of 50’ is the only description then this item should be coded as ‘unclear’.
Gender: the proportion of male and female patients must be stated.
Location of recruitment and test: the paper should state where recruitment of patients took place and whether the tests were performed in primary or secondary care.
Sample size

3. Is the reference standard likely to correctly classify the target condition?

a. What is meant by this item
The reference standard is the method used to determine the presence or absence of the target condition. To assess the diagnostic accuracy of the index test its results are compared with the results of the reference standard; subsequently indicators of diagnostic accuracy can be calculated. The reference standard is therefore an important determinant of the diagnostic accuracy of a test. The reference standard may be obtained in many ways, including laboratory tests, imaging tests, function tests and pathology, but also clinical follow-up of participants. The decision of which reference standard to use depends on the definition of the target condition and the purpose of the study. If no single reference test is available, then careful clinical follow-up, a consensus between observers or results of two or more combined tests may be used to determine the presence or absence of the target condition. Estimates of test performance are based on the assumption that the index test is being compared to a reference standard which is 100% sensitive and specific. If there are any disagreements between the reference standard and the index test then it is assumed that the index test is incorrect. Thus, from a theoretical point of view the choice of an appropriate reference standard is very important.

b. Situations in which this item does not apply
This item is relevant to all studies of diagnostic accuracy and should always be included in the quality assessment tool. The only exception would be if a particular reference standard is specified in the inclusion criteria, i.e. to be included in the review a study may have to compare the index test to a specified reference standard.

c. How to score this item
If you believe that the reference standard is likely to correctly classify the target condition then this item should be scored ‘yes’. Making a judgement as to the accuracy of the reference standard may not be straightforward. You may need experience of the topic area to know whether a test is an appropriate reference standard, or if a combination of tests is used you may have to consider carefully whether these were appropriate. If you do not think that the reference standard was likely to have correctly classified the target condition then this item should be scored as ‘no’. If there is insufficient information to make a judgement then this should be scored as ‘unclear’.

If urodynamics or an ICS approved pad test is used as the reference standard then this item should be coded as ‘yes’.

4. Is the time period between reference standard and index test short enough to be reasonably sure that the target condition did not change between the two tests?
a. What is meant by this item
Ideally the results of the index test and the reference standard are collected on the same patients at the same time. If this is not possible and a delay occurs, misclassification due to spontaneous recovery or a more advanced stage of disease may occur. This is known as disease progression bias. The size of the time period which may cause such bias will vary between conditions. For example, a delay of a few days is unlikely to be a problem for chronic conditions; however, for many infectious diseases a delay between performance of index and reference standard of only a few days may be important. This type of bias may occur in chronic conditions in which the reference standard involves clinical follow-up of several years.

b. Situations in which this item does not apply
This item is likely to apply in most situations.

c. How to score this item
When to score this item as ‘yes’ is related to the target condition. For conditions that progress rapidly even a delay of several days may be important. For such conditions this item should be scored ‘yes’ if the delay between the performance of the index and reference standard is very short, a matter of hours or days. However, for chronic conditions disease status is unlikely to change in a week, or a month, or even longer. In such conditions longer delays between performance of the index and reference standard may be scored as ‘yes’. You will have to make judgements regarding what is considered ‘short enough’. You should think about this before starting work on a review, and define what you consider to be ‘short enough’ for the specific topic area that you are reviewing. If you think the time period between the performance of the index test and the reference standard was sufficiently long that disease status may have changed between the performance of the two tests then this item should be scored as ‘no’. If insufficient information is provided this should be scored as ‘unclear’.

Some disagreement resulted from variations in the strictness of coding for this item. It is rare that time periods are explicitly presented and in some cases people made the (probably correct) assumption that the two tests were carried out at around the same time. It has been decided that no assumptions should be made when performing the quality assessment. Therefore, if there is no mention of the time period between tests then this item should always be coded as ‘unclear’.

5. Did the whole sample or a random selection of the sample receive verification using a reference standard?

a. What is meant by this item
Partial verification bias (also known as work-up bias, (primary) selection bias or sequential ordering bias) occurs when not all of the study group receive confirmation of the diagnosis by a reference standard. If the results of the index test influence the decision to perform the reference standard then biased estimates of test performance may arise. If patients are randomly selected to receive the reference standard the overall diagnostic performance of the test is, in theory, unchanged. In most cases, however, this selection is not random, possibly leading to biased estimates of the overall diagnostic accuracy.

b. Situations in which this item does not apply
Partial verification bias generally only occurs in diagnostic cohort studies in which patients are tested by the index test prior to the reference standard. If the test sequence is reversed, as it is in case–control designs, partial verification bias is generally not applicable. However, there may be exceptions to this. For example, in radiologic re-reading studies, scans are read at a later date by one or more radiologists, but the scans will usually have been obtained in regular clinical practice. If the study is limited to those with, for example, biopsy verification the index test (radiological interpretations) could be influenced by the decision to biopsy or not, and verification bias may apply. In situations where the reference standard is assessed before the index test, you should first decide whether there is a possibility that verification bias could occur, and if not how to score this item. This may depend on how quality will be incorporated in the review. There are two options: either to score this item as ‘yes’, or to remove it from the quality assessment tool.

c. How to score this item
If it is clear from the study that all patients who received the index test went on to receive verification of their disease status using a reference standard, even if this reference standard was not the same for all patients, then this item should be scored as ‘yes’. If some of the patients who received the index test did not receive verification of their true disease state then this item should be scored as ‘no’. If this information is not reported by the study then it should be scored as ‘unclear’.
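
The effect of partial verification on accuracy estimates can be illustrated with a small worked example. The sketch below is hypothetical throughout: the 2 × 2 counts, the verification fractions and the function names `sens_spec` and `verify` are invented for illustration and are not data from any study in this review. It assumes 100 patients with and 100 without the target condition, and an index test with true sensitivity 0.80 and specificity 0.90.

```python
from typing import Dict, Tuple


def sens_spec(tp: float, fn: float, fp: float, tn: float) -> Tuple[float, float]:
    """Sensitivity and specificity from the four cells of a 2 x 2 table."""
    return tp / (tp + fn), tn / (tn + fp)


def verify(cells: Dict[str, float], frac_pos: float, frac_neg: float) -> Dict[str, float]:
    """Expected verified counts when a fraction of index-positive and a
    fraction of index-negative patients receive the reference standard."""
    return {
        "tp": cells["tp"] * frac_pos,   # index positive, condition present
        "fp": cells["fp"] * frac_pos,   # index positive, condition absent
        "fn": cells["fn"] * frac_neg,   # index negative, condition present
        "tn": cells["tn"] * frac_neg,   # index negative, condition absent
    }


# Hypothetical full cohort: sensitivity 80/100, specificity 90/100.
full = {"tp": 80, "fn": 20, "fp": 10, "tn": 90}

# Random verification: half of all patients, regardless of index test result.
random_sens, random_spec = sens_spec(**verify(full, 0.5, 0.5))

# Non-random verification: all index positives but only 10% of index negatives.
biased_sens, biased_spec = sens_spec(**verify(full, 1.0, 0.1))

print(f"random:  sensitivity {random_sens:.2f}, specificity {random_spec:.2f}")
print(f"partial: sensitivity {biased_sens:.2f}, specificity {biased_spec:.2f}")
```

Under these assumed numbers, random verification leaves the estimates at 0.80 and 0.90, whereas verifying mainly the index-positive patients raises the apparent sensitivity to about 0.98 (80/82) and lowers the apparent specificity to about 0.47 (9/19).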


6. Did patients receive the same reference standard regardless of the index test result?

a. What is meant by this item
Differential verification bias occurs when some of the index test results are verified by a different reference standard. This is especially a problem if these reference standards differ in their definition of the target condition, for example histopathology of the appendix and natural history for the detection of appendicitis. This usually occurs when patients testing positive on the index test receive a more accurate, often invasive, reference standard than those with negative test results. The link (correlation) between a particular (negative) test result and being verified by a less accurate reference standard will affect measures of test accuracy in a similar way as in partial verification, but less seriously.

b. Situations in which this item does not apply
Differential verification bias generally only occurs in diagnostic cohort studies in which all patients are tested by the index test prior to the reference standard. However, there may be situations in which this does not apply (see Item 3). If the test sequence is reversed, as it is in case–control designs, partial verification bias is not applicable. In situations where the reference standard is assessed before the index test, you should decide how to score this item. This may depend on how quality will be incorporated in the review. There are two options: either to score this item as ‘yes’, or to remove it from the quality assessment tool.

c. How to score this item
If it is clear that patients received verification of their true disease status using the same reference standard then this item should be scored as ‘yes’. If some patients received verification using a different reference standard this item should be scored as ‘no’. If this information is not reported by the study then it should be scored as ‘unclear’.

7. Was the reference standard independent of the index test (i.e. the index test did not form part of the reference standard)?

a. What is meant by this item
When the result of the index test is used in establishing the final diagnosis, incorporation bias may occur. This incorporation will probably increase the amount of agreement between index test results and the outcome of the reference standard, and hence overestimate the various measures of diagnostic accuracy. It is important to note that knowledge of the results of the index test alone does not automatically mean that these results are incorporated in the reference standard. For example, a study investigating MRI for the diagnosis of multiple sclerosis could have a reference standard composed of clinical follow-up, cerebrospinal fluid analysis and MRI. In this case the index test forms part of the reference standard. If the same study used a reference standard of clinical follow-up and the results of the MRI were known when the clinical diagnosis was made but were not specifically included as part of the reference then the index test does not form part of the reference standard.

b. Situations in which this item does not apply
This item will only apply when a composite reference standard is used to verify disease status. In such cases it is essential that a full definition of how disease status is verified and which tests form part of the reference standard is provided. For studies in which a single reference standard is used this item will not be relevant and should either be scored as ‘yes’ or be removed from the quality assessment tool.

c. How to score this item
If it is clear from the study that the index test did not form part of the reference standard then this item should be scored as ‘yes’. If it appears that the index test formed part of the reference standard then this item should be scored as ‘no’. If this information is not reported by the study then it should be scored as ‘unclear’.

8a. Was the execution of the index test described in sufficient detail to permit replication of the test?
8b. Was the execution of the reference standard described in sufficient detail to permit its replication?

a. What is meant by these items
A sufficient description of the execution of index test and reference standards is important for two reasons. First, variation in measures of diagnostic accuracy can sometimes be traced back to differences in the execution of index/reference standards. Second, a clear and detailed description (or references) is needed to implement a certain test in another setting. If tests are executed in different ways then this would be expected to impact on test performance. The
extent to which this would be expected to affect results would depend on the type of test being investigated.

b. Situations in which these items do not apply
These items are likely to apply in most situations.

c. How to score these items
If the study reports sufficient details to permit replication of the index test and reference standard then these items should be scored as ‘yes’. In other cases these items should be scored as ‘no’. In situations where details of test performance are partially reported and you feel that you do not have enough information to score this item as ‘yes’, then it should be scored as ‘unclear’.

If the paper cites a reference for a full description of the methodology then this item should be coded as ‘yes’.

For a description of urodynamics to be coded as ‘yes’ the following information needs to be given:
what type of catheter is used
filling speed
volume and type of medium (fluid, gas, etc.).

9a. Were the index test results interpreted without knowledge of the results of the reference standard?
9b. Were the reference standard results interpreted without knowledge of the results of the index test?

a. What is meant by these items
This item is similar to ‘blinding’ in intervention studies. Interpretation of the results of the index test may be influenced by knowledge of the results of the reference standard, and vice versa. This is known as review bias, and may lead to inflated measures of diagnostic accuracy. The extent to which this may affect test results will be related to the degree of subjectiveness in the interpretation of the test result. The more subjective the interpretation the more likely that the interpreter can be influenced by the results of the index test in interpreting the reference standard, and vice versa. It is therefore important to consider the topic area that you are reviewing and to determine whether the interpretation of the index test or reference standard could be influenced by knowledge of the results of the other test.

b. Situations in which these items do not apply
If, in the topic area that you are reviewing, the index test is always performed first then interpretation of the results of the index test will usually be without knowledge of the results of the reference standard. Similarly, if the reference standard is always performed first (for example, in a diagnostic case–control study) then the results of the reference standard will be interpreted without knowledge of the index test. However, in certain situations the results of both the index test and reference standard are blinded in both directions before being interpreted. In situations where one form of review bias does not apply there are two possibilities: either score the relevant item as ‘yes’ or remove this item from the list. If tests are entirely objective in their interpretation then test interpretation is not susceptible to review bias. In such situations review bias may not be a problem and these items can be omitted from the quality assessment tool. Another situation in which this form of bias may not apply is when test results are interpreted in an independent laboratory. In such situations it is unlikely that the person interpreting the test results will have knowledge of the results of the other test (either index test or reference standard).

c. How to score these items
If the study clearly states that the test results (index or reference standard) were interpreted blind to the results of the other test then these items should be scored as ‘yes’. If this does not appear to be the case they should be scored as ‘no’. If this information is not reported by the study then it should be scored as ‘unclear’.

This is also rarely explicitly mentioned although it could be assumed that when performing urodynamics some history of the patient will be known. However, no assumptions should be made and therefore the item should be coded thus:
If there is mention of blinding or independent interpretation – ‘yes’
If it is mentioned that the tests are not blinded – ‘no’
If blinding is not mentioned at all – ‘unclear’

10. Were the same clinical data available when test results were interpreted as would be available when the test is used in practice?

© Queen’s Printer and Controller of HMSO 2006. All rights reserved.



a. What is meant by this item
The availability of information on clinical data during interpretation of test results may affect
estimates of test performance. In this context clinical data is defined broadly to include any
information relating to the patient obtained by direct observation such as age, sex and symptoms.
The knowledge of such factors can influence the diagnostic test result if the test involves an
interpretative component. If clinical data will be available when the test is interpreted in practice
then this should also be available when the test is evaluated. If, however, the index test is intended
to replace other clinical tests then clinical data should not be available. It is therefore important
to determine what information will be available when test results are interpreted in practice before
assessing studies for this item.

b. Situations in which this item does not apply
If the interpretation of the index test is fully automated and involves no interpretation then
this item may not be relevant and can be omitted from the quality assessment tool.

c. How to score this item
If clinical data would normally be available when the test is interpreted in practice and similar data
were available when interpreting the index test in the study then this item should be scored as ‘yes’.
Similarly, if clinical data would not be available in practice and these data were not available when
the index test results were interpreted then this item should be scored as ‘yes’. If this is not the
case then this item should be scored as ‘no’. If this information is not reported by the study then it
should be scored as ‘unclear’.

11. Were uninterpretable/intermediate test results reported?

a. What is meant by this item
A diagnostic test can produce an uninterpretable/indeterminate/intermediate result with varying
frequency depending on the test. These problems are often not reported in diagnostic accuracy
studies, with the uninterpretable results simply removed from the analysis. This may lead to the
biased assessment of the test characteristics. Whether bias will arise depends on the possible
correlation between uninterpretable test results and the true disease status. If uninterpretable
results occur randomly and are not related to the true disease status of the individual then, in
theory, these should not have any effect on test performance. Whatever the cause of
uninterpretable results it is important that these are reported so that the impact of these results on
test performance can be determined.

b. Situations in which this item does not apply
This item is relevant to all studies of diagnostic accuracy and should always be included in the
quality assessment tool.

c. How to score this item
If it is clear that all test results, including uninterpretable/indeterminate/intermediate, are
reported then this item should be scored as ‘yes’. If you think that such results occurred but have
not been reported then this item should be scored as ‘no’. If it is not clear whether all study results
have been reported then this item should be scored as ‘unclear’.

A strict approach should be used when coding this item. If there is no mention of any
uninterpretable results then this should be coded as ‘unclear’.

12. Were withdrawals from the study explained?

a. What is meant by this item
This occurs when patients withdraw from the study before the results of both the index test and
reference standard are known. If patients lost to follow-up differ systematically from those who
remain, for whatever reason, then estimates of test performance may be biased.

b. Situations in which this item does not apply
This item is relevant to all studies of diagnostic accuracy and should always be included in the
quality assessment tool.

c. How to score this item
If it is clear what happened to all patients who entered the study, for example if a flow diagram of
study participants is reported, then this item should be scored as ‘yes’. If it appears that some of
the participants who entered the study did not complete the study, i.e. did not receive both the
index test and reference standard, and these patients were not accounted for then this item
should be scored as ‘no’. If it is not clear whether all patients who entered the study were accounted
for then this item should be scored as ‘unclear’.

Again a strict approach should be used when coding this item. If there is no mention of any
withdrawals then this should be coded as ‘unclear’.
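
The three-way coding used for the reporting items above (uninterpretable results and withdrawals) can be written down as a small lookup. The sketch below is illustrative only: the `Evidence` categories and the function name `code_item` are assumptions introduced here, not part of the quality assessment tool, and the reviewer still has to judge which category a paper falls into.

```python
from enum import Enum


class Evidence(Enum):
    """What the reviewer found in the paper (hypothetical categories)."""
    ALL_ACCOUNTED_FOR = "all results or patients clearly accounted for"
    SOME_UNACCOUNTED_FOR = "some results or patients appear unaccounted for"
    NOT_MENTIONED = "no mention at all"


def code_item(evidence):
    """Apply the strict approach: silence is coded 'unclear', never 'yes'."""
    if evidence is Evidence.ALL_ACCOUNTED_FOR:
        return "yes"
    if evidence is Evidence.SOME_UNACCOUNTED_FOR:
        return "no"
    return "unclear"


# A paper that never mentions withdrawals is coded 'unclear'.
print(code_item(Evidence.NOT_MENTIONED))
```

The point of the strict approach is visible in the final branch: absence of any mention falls through to ‘unclear’ rather than being treated as evidence that nothing was missing.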


Appendix 4
Letter to authors requesting additional data
Dear

We are currently undertaking a systematic review on the methods of diagnosing urinary incontinence. This
work is funded by the Department of Health in the United Kingdom (http://www.hta.nhsweb.nhs.uk/).
The results will be used to advise health care professionals on the most appropriate assessment methods
when dealing with this highly prevalent condition.

We have identified your paper {InsertReference} as relevant for inclusion in the review as it quantitatively
compares the diagnostic methods: {insert diagnostic test 1 and diagnostic 2}.

However, in order to be able to fully include your paper in the review and any meta-analysis we need a
little further information from you. Combining data from different studies in a meta-analysis requires
data in a very specific format. In order that we can include the results from all possible studies in the
meta-analysis we are writing to authors for this extra information. As I am sure you are aware the very
nature of systematic reviews requires as many relevant papers as possible to be included in order to
provide representative results (1).

We need to know the number of patients (both with and without urinary incontinence) classified correctly
and incorrectly by the index test (e.g. a 2×2 or 3×3 contingency table). This would allow us to calculate
sensitivity, specificity and positive predictive value for the index test. The cut-off points used to determine
a positive result for each of the diagnostic tests are also required. If your study did not define cut-off
points for a positive test then it would be most useful if you could provide us with the raw data; we only
require two columns of data (please see our website http://www.prw.le.ac.uk/research/hta/ for an example
of what we require). To minimise the effort on your part we have attached a ‘fax-back’ form that you can
complete by hand with the required data (which is potentially just six numbers), and our website will
hopefully answer any additional queries relating to this request.

We do hope that you will be able to assist us with this request; your help will greatly improve the validity
of the review and maximise its impact. You will of course be acknowledged for your assistance and sent a
copy of the final report. As I am sure you can appreciate we are on a very tight timetable, therefore a
response within two weeks would be greatly appreciated. However, if you are going to find this difficult
please contact us.

If you require any further information about what data is required or have any questions about any aspect
of the project please do not hesitate to contact us by any of the contact methods given above.

Yours sincerely

Jennifer Martin
On behalf of the study team

1. Deeks JJ. Systematic reviews in health care: Systematic reviews of evaluations of diagnostic and
screening tests. BMJ 2001;323(7305):157–62. www.bmj.com

Appendix 5
Blank forms sent to contacted authors

Fax

F.A.O: Dr Jennifer Martin From:

Fax No. +44 (0)116 252 5423 Phone No. +44 (0)116 252 5451

Re: Systematic Review Data

Author Paper ID

Data Required:

                         Gold Standard/Reference Test
                         + ve            – ve
Index Test     + ve
               – ve

Cut-off for a positive result on the gold standard test

Cut-off for a positive result on the index standard test


Fax

F.A.O: Dr Jennifer Martin From:

Fax No. +44 (0)116 252 5423 Phone No. +44 (0)116 252 5451

Re: Systematic Review Data

Author Paper ID

Data Required:

Gold Standard/Reference Test

Index Test

Cut-off for a positive result on the gold standard test

Cut-off for a positive result on the index standard test


Fax

F.A.O: Dr Jennifer Martin From:

Fax No. +44 (0)116 252 5423 Phone No. +44 (0)116 252 5451

Re: Systematic Review Data

Paper ID

Data Required:

Patient No.     Gold Standard Diagnostic Test     Reference Diagnostic Test
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20

If you would rather send data in electronic form by email or send an existing data sheet by fax then
please do so.


Appendix 6
Website created for contacted authors
Systematic review: methods of diagnosing urinary incontinence
Data examples
This website has been created to provide assistance to those authors that have been contacted for extra
data to be included in our systematic review on methods of diagnosing urinary incontinence. This work is
funded by the Department of Health. http://www.hta.nhsweb.nhs.uk

We have provided a number of examples to illustrate the form in which we require the data. These
illustrate what to do in situations where cut-off points have been used to classify patients as either positive
or negative on a particular test (i.e. categorical data). Examples are also given for studies where no
cut-offs have been used and the data are therefore in continuous form.

In order to comply with the Data Protection Act 1998 please do not send us any unique patient identifier
numbers, initials or any other information that could be used to identify individuals.

We hope that these examples will enable you to provide the data that we have asked for. However, if you
have any questions whatsoever about what is required or indeed any queries about the project in general
please do not hesitate to get in contact with us.

Contact email: jlm26@le.ac.uk

Example 1
This study was undertaken to determine the diagnostic accuracy of the 48-hour pad-test against the gold
standard test of multichannel video urodynamics. 38 patients performed both tests. A clear cut-off point
was defined for a positive result for each of the diagnostic tests. Each of the 38 patients can be assigned
to one of the 4 boxes within the contingency table.

                                Multichannel videourodynamics
                                (Gold standard/reference test)
                                + ve (leakage)     – ve (no leakage)
48 h pad test       + ve        10                 6
(index test)        – ve        4                  18

Cut-off for a positive result for multichannel videourodynamics: Visualisation of leakage in absence of a
detrusor contraction
Cut-off for a positive result on 48-hour pad-test: Leakage greater than 15 g
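
To show how a completed table of this kind is used, the following sketch computes sensitivity, specificity and positive predictive value from the four counts in Example 1. The function name `accuracy_from_2x2` and its argument names are illustrative assumptions, not part of the review's methods; the sketch also assumes that no row or column total is zero.

```python
from fractions import Fraction


def accuracy_from_2x2(tp, fp, fn, tn):
    """Return sensitivity, specificity and PPV as exact fractions.

    tp: index test positive, reference standard positive
    fp: index test positive, reference standard negative
    fn: index test negative, reference standard positive
    tn: index test negative, reference standard negative
    """
    return {
        "sensitivity": Fraction(tp, tp + fn),
        "specificity": Fraction(tn, tn + fp),
        "ppv": Fraction(tp, tp + fp),
    }


# Counts from Example 1 (48-hour pad test against videourodynamics).
example_1 = accuracy_from_2x2(tp=10, fp=6, fn=4, tn=18)
for name, value in example_1.items():
    print(f"{name}: {value} = {float(value):.3f}")
```

For Example 1 this gives a sensitivity of 10/14 (about 0.71), a specificity of 18/24 (0.75) and a positive predictive value of 10/16 (about 0.63).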


Example 2
This study was undertaken to determine the accuracy of a clinical stress test in diagnosing different types
of incontinence compared with the gold standard of multichannel videourodynamics. A total of 34
patients performed both tests. Cut-off points were defined for each diagnosis. Each of the 34 patients can
be assigned to one of the 9 boxes within the contingency table.

                                   Multichannel videourodynamics
                                   (Gold standard/reference test)
                                   USI       DO        Normal
Clinical stress test    USI        17        2         1
(index test)            DO         1         8         0
                        Normal     1         2         2

Cut-off for a positive result on multichannel videourodynamics
USI: Involuntary leakage during increased abdominal pressure in the absence of a detrusor contraction
DO: Spontaneous contraction whilst the patient attempts to inhibit micturition

Cut-off for a positive result on clinical stress test
USI: Observed leakage coincidentally with coughing or straining
DO: Uncontrollable leakage during examination
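
One way a 3×3 table like this might be analysed is to collapse it into a separate 2×2 table for each diagnosis, treating that diagnosis as positive and the other two categories as negative. The sketch below does this for the Example 2 counts. The one-versus-rest approach and the names `collapse_to_2x2`, `CATEGORIES` and `TABLE` are assumptions made for illustration; the source does not state how multi-category tables were handled in the meta-analysis.

```python
# Rows: clinical stress test (index test); columns: videourodynamics (reference).
CATEGORIES = ["USI", "DO", "Normal"]
TABLE = [
    [17, 2, 1],  # index test result: USI
    [1, 8, 0],   # index test result: DO
    [1, 2, 2],   # index test result: Normal
]


def collapse_to_2x2(table, k):
    """Collapse a square table to (tp, fp, fn, tn) for category k versus the rest."""
    total = sum(sum(row) for row in table)
    tp = table[k][k]
    fp = sum(table[k]) - tp                  # index says k, reference says other
    fn = sum(row[k] for row in table) - tp   # reference says k, index says other
    tn = total - tp - fp - fn
    return tp, fp, fn, tn


results = {}
for k, name in enumerate(CATEGORIES):
    tp, fp, fn, tn = collapse_to_2x2(TABLE, k)
    results[name] = {
        "counts": (tp, fp, fn, tn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
    print(name, results[name])
```

For USI, for example, the collapsed table has 17 true positives, 3 false positives, 2 false negatives and 12 true negatives, which sum to the 34 patients in the study.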


Example 3
A total of 20 patients were studied to investigate the accuracy of using a severity index to diagnose
urinary incontinence. The scale was compared with a 48-hour pad-test, which had a clear cut-off point for
a positive or negative result. No cut-off point was used for the severity score; therefore the raw data
are given.

Patient no. Pad test result Severity score


1 Positive 14
2 Negative 3
3 Positive 18
4 Positive 16
5 Positive 11
6 Negative 5
7 Negative 7
8 Positive 9
9 Positive 11
10 Negative 2
11 Negative 0
12 Positive 7
13 Positive 9
14 Positive 12
15 Positive 15
16 Positive 13
17 Negative 11
18 Positive 16
19 Positive 18
20 Negative 13

Cut-off point for a positive 48-hour pad test = 15 g.
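
Raw data in this two-column form allow a 2×2 table to be built at any candidate cut-off on the continuous score. The sketch below tabulates the Example 3 data at every observed severity score. It assumes that higher scores indicate incontinence and that a score at or above the cut-off counts as a positive index test; neither convention is stated in the source, and the names `PATIENTS` and `table_at_cutoff` are illustrative.

```python
# (pad test positive?, severity score) for the 20 patients in Example 3.
PATIENTS = [
    (True, 14), (False, 3), (True, 18), (True, 16), (True, 11),
    (False, 5), (False, 7), (True, 9), (True, 11), (False, 2),
    (False, 0), (True, 7), (True, 9), (True, 12), (True, 15),
    (True, 13), (False, 11), (True, 16), (True, 18), (False, 13),
]


def table_at_cutoff(patients, cutoff):
    """Count (tp, fp, fn, tn), calling a score >= cutoff a positive index test."""
    tp = sum(1 for ref, score in patients if ref and score >= cutoff)
    fp = sum(1 for ref, score in patients if not ref and score >= cutoff)
    fn = sum(1 for ref, score in patients if ref and score < cutoff)
    tn = sum(1 for ref, score in patients if not ref and score < cutoff)
    return tp, fp, fn, tn


# Try every observed score as a candidate cut-off.
for cutoff in sorted({score for _, score in PATIENTS}):
    tp, fp, fn, tn = table_at_cutoff(PATIENTS, cutoff)
    print(f"cut-off >= {cutoff:2d}: sensitivity {tp}/{tp + fn}, specificity {tn}/{tn + fp}")
```

At a cut-off of 9, for instance, 12 of the 13 pad-test-positive patients and 2 of the 7 pad-test-negative patients score at or above the threshold.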


Appendix 7
Additional study information sheet

Description of Study Sample Paper No.

1. Age of patients* Range/measure of central tendency

2. Gender* % Female

3. Where sample was recruited* Primary/2ndary/Mixed

4. Where tests were performed* Primary/2ndary/Mixed

5. Community dwelling? %

6. Proportion of patients with related chronic disease %

7. Year of publication

8. Sample size*

9. Country of study

* This information is required for Q2 to be coded as ‘Yes’.

Currently this paper is classified as comparing the following tests.

Do you agree with this classification? (please circle) YES NO

If not, how would you classify the paper?


Appendix 8
STARD flowchart and checklist

Eligible patients (n= )
    Excluded patients, with reasons (n= )
Index test (n= )
    Abnormal result (n= )
        No reference standard (n= )
        Reference standard (n= )
            Inconclusive (n= )
            Target condition present (n= )
            Target condition absent (n= )
    Normal result (n= )
        No reference standard (n= )
        Reference standard (n= )
            Inconclusive (n= )
            Target condition present (n= )
            Target condition absent (n= )
    Inconclusive result (n= )
        No reference standard (n= )
        Reference standard (n= )
            Inconclusive (n= )
            Target condition present (n= )
            Target condition absent (n= )


STARD checklist for reporting diagnostic accuracy studies

Section and topic Item Description

Title, abstract, and keywords 1 Identify the article as a study of diagnostic accuracy (recommend MeSH
heading "sensitivity and specificity")
Introduction 2 State the research questions or aims, such as estimating diagnostic accuracy
or comparing accuracy between tests or across participant groups
Methods:
Participants 3 Describe the study population: the inclusion and exclusion criteria and the
settings and locations where the data were collected
4 Describe participant recruitment: was this based on presenting symptoms,
results from previous tests, or the fact that the participants had received
the index tests or the reference standard?
5 Describe participant sampling: was this a consecutive series of participants
defined by selection criteria in items 3 and 4? If not, specify how
participants were further selected
6 Describe data collection: was data collection planned before the index tests
and reference standard were performed (prospective study) or after
(retrospective study)?
Test methods 7 Describe the reference standard and its rationale
8 Describe technical specifications of material and methods involved,
including how and when measurements were taken, or cite references for
index tests or reference standard, or both
9 Describe definition of and rationale for the units, cut-off points, or
categories of the results of the index tests and the reference standard
10 Describe the number, training, and expertise of the persons executing and
reading the index tests and the reference standard
11 Were the readers of the index tests and the reference standard blind
(masked) to the results of the other test? Describe any other clinical
information available to the readers.
Statistical methods 12 Describe methods for calculating or comparing measures of diagnostic
accuracy and the statistical methods used to quantify uncertainty (e.g. 95%
confidence intervals)
13 Describe methods for calculating test reproducibility, if done
Results:
Participants 14 Report when study was done, including beginning and ending dates of
recruitment
15 Report clinical and demographic characteristics (e.g. age, sex, spectrum of
presenting symptoms, comorbidity, current treatments, and recruitment
centre)
16 Report how many participants satisfying the criteria for inclusion did or did
not undergo the index tests or the reference standard, or both; describe
why participants failed to receive either test (a flow diagram is strongly
recommended)
Test results 17 Report time interval from index tests to reference standard, and any
treatment administered between
18 Report distribution of severity of disease (define criteria) in those with the
target condition and other diagnoses in participants without the target
condition
19 Report a cross-tabulation of the results of the index tests (including
indeterminate and missing results) by the results of the reference standard;
for continuous results, report the distribution of the test results by the
results of the reference standard
20 Report any adverse events from performing the index test or the reference
standard
Estimates 21 Report estimates of diagnostic accuracy and measures of statistical
uncertainty (e.g. 95% confidence intervals)
22 Report how indeterminate results, missing responses, and outliers of index
tests were handled
23 Report estimates of variability of diagnostic accuracy between readers,
centres, or subgroups of participants, if done
24 Report estimates of test reproducibility, if done
Discussion 25 Discuss the clinical applicability of the study findings

Health Technology Assessment


Programme
Director, Deputy Director,
Professor Tom Walley, Professor Jon Nicholl,
Director, NHS HTA Programme, Director, Medical Care Research
Department of Pharmacology & Unit, University of Sheffield,
Therapeutics, School of Health and Related
University of Liverpool Research

Prioritisation Strategy Group


Members

Chair, Professor Bruce Campbell, Professor Jon Nicholl, Director, Dr Ron Zimmern, Director,
Professor Tom Walley, Consultant Vascular & General Medical Care Research Unit, Public Health Genetics Unit,
Director, NHS HTA Programme, Surgeon, Royal Devon & Exeter University of Sheffield, School Strangeways Research
Department of Pharmacology & Hospital of Health and Related Research Laboratories, Cambridge
Therapeutics,
Dr Edmund Jessop, Medical Dr John Reynolds, Clinical
University of Liverpool
Advisor, National Specialist, Director, Acute General
Commissioning Advisory Group Medicine SDU, Radcliffe
(NSCAG), Department of Hospital, Oxford
Health, London

HTA Commissioning Board


Members

Programme Director, Professor Ann Bowling, Professor Fiona J Gilbert, Dr Linda Patterson,
Professor Tom Walley, Professor of Health Services Professor of Radiology, Consultant Physician,
Director, NHS HTA Programme, Research, Primary Care and Department of Radiology, Department of Medicine,
Department of Pharmacology & Population Studies, University of Aberdeen Burnley General Hospital
Therapeutics, University College London
University of Liverpool Professor Adrian Grant, Professor Ian Roberts, Professor
Dr Andrew Briggs, Public Director, Health Services of Epidemiology & Public
Chair,
Health Career Scientist, Health Research Unit, University of Health, Intervention Research
Professor Jon Nicholl,
Economics Research Centre, Aberdeen Unit, London School of
Director, Medical Care Research
University of Oxford Hygiene and Tropical Medicine
Unit, University of Sheffield,
Professor F D Richard Hobbs,
School of Health and Related
Professor John Cairns, Professor Professor of Primary Care & Professor Mark Sculpher,
Research
of Health Economics, Public General Practice, Department of Professor of Health Economics,
Deputy Chair, Health Policy, London School of Primary Care & General Centre for Health Economics,
Professor Jenny Hewison, Hygiene and Tropical Medicine, Practice, University of Institute for Research in the
Professor of Health Care London Birmingham Social Services, University of York
Psychology, Academic Unit of
Psychiatry and Behavioural Professor Nicky Cullum, Professor Peter Jones, Head of
Sciences, University of Leeds Dr Jonathan Shapiro, Senior
Director of Centre for Evidence Department, University
School of Medicine Fellow, Health Services
Based Nursing, Department of Department of Psychiatry,
Management Centre,
Health Sciences, University of University of Cambridge
Birmingham
York
Dr Jeffrey Aronson
Professor Sallie Lamb,
Reader in Clinical Ms Kate Thomas,
Mr Jonathan Deeks, Professor of Rehabilitation,
Pharmacology, Department of Deputy Director,
Senior Medical Statistician, Centre for Primary Health Care,
Clinical Pharmacology, Medical Care Research Unit,
Centre for Statistics in University of Warwick
Radcliffe Infirmary, Oxford University of Sheffield
Medicine, University of Oxford
Professor Deborah Ashby, Professor Stuart Logan,
Professor of Medical Statistics, Dr Andrew Farmer, Senior Director of Health & Social Ms Sue Ziebland,
Department of Environmental Lecturer in General Practice, Care Research, The Research Director, DIPEx,
and Preventative Medicine, Department of Primary Peninsula Medical School, Department of Primary Health
Queen Mary University of Health Care, Universities of Exeter & Care, University of Oxford,
London University of Oxford Plymouth Institute of Health Sciences

129
Current and past membership details of all HTA ‘committees’ are available from the HTA website (www.ncchta.org)

© Queen’s Printer and Controller of HMSO 2006. All rights reserved.


Health Technology Assessment Programme

Diagnostic Technologies & Screening Panel


Members

Chair, Professor Adrian K Dixon, Dr Susanne M Ludgate, Medical Professor Lindsay Wilson
Dr Ron Zimmern, Director of Professor of Radiology, Director, Medicines & Turnbull, Scientific Director,
the Public Health Genetics Unit, University Department of Healthcare Products Regulatory Centre for MR Investigations &
Strangeways Research Radiology, University of Agency, London YCR Professor of Radiology,
Laboratories, Cambridge Cambridge Clinical School University of Hull
Professor William Rosenberg,
Professor of Hepatology, Liver Professor Martin J Whittle,
Dr David Elliman, Research Group, University of
Ms Norma Armston, Consultant Paediatrician/ Associate Dean for Education,
Southampton Head of Department of
Lay Member, Bolton Hon. Senior Lecturer,
Population Health Unit, Obstetrics and Gynaecology,
Dr Susan Schonfield, Consultant
Professor Max Bachmann Great Ormond St. Hospital, University of Birmingham
in Public Health, Specialised
Professor of Health London Services Commissioning North
Care Interfaces, Dr Dennis Wright,
West London, Hillingdon
Department of Health Consultant Biochemist &
Professor Glyn Elwyn, Primary Care Trust
Policy and Practice, Clinical Director,
University of East Anglia Primary Medical Care Dr Phil Shackley, Senior Pathology & The Kennedy
Research Group, Lecturer in Health Economics, Galton Centre,
Professor Rudy Bilous Swansea Clinical School, School of Population and Northwick Park & St Mark’s
Professor of Clinical Medicine & University of Wales Swansea Health Sciences, University of Hospitals, Harrow
Consultant Physician, Newcastle upon Tyne
The Academic Centre, Mr Tam Fry, Honorary
South Tees Hospitals NHS Trust Chairman, Child Growth Dr Margaret Somerville, PMS
Foundation, London Public Health Lead, Peninsula
Dr Paul Cockcroft, Medical School, University of
Consultant Medical Plymouth
Microbiologist and Clinical Dr Jennifer J Kurinczuk,
Director of Pathology, Consultant Clinical Dr Graham Taylor, Scientific
Department of Clinical Epidemiologist, Director & Senior Lecturer,
Microbiology, St Mary's National Perinatal Regional DNA Laboratory, The
Hospital, Portsmouth Epidemiology Unit, Oxford Leeds Teaching Hospitals

Pharmaceuticals Panel
Members

Chair, Mr Peter Cardy, Chief Dr Christine Hine, Consultant in Professor Jan Scott, Professor
Dr John Reynolds, Chair Executive, Macmillan Cancer Public Health Medicine, South of Psychological Treatments,
Division A, The John Radcliffe Relief, London Gloucestershire Primary Care Institute of Psychiatry,
Hospital, Oxford Radcliffe Trust University of London
Hospitals NHS Trust Professor Imti Choonara,
Professor in Child Health, Professor Stan Kaye, Mrs Katrina Simister, Assistant
Academic Division of Child Cancer Research UK Director New Medicines,
Health, University of Professor of Medical Oncology, National Prescribing Centre,
Professor Tony Avery,
Nottingham Section of Medicine, Liverpool
Head of Division of Primary
The Royal Marsden Hospital,
Care, School of Community
Dr Robin Ferner, Consultant Sutton Dr Richard Tiner, Medical
Health Services, Division of
General Practice, University of Physician and Director, West Director, Medical Department,
Ms Barbara Meredith,
Nottingham Midlands Centre for Adverse Association of the British
Lay Member, Epsom
Drug Reactions, City Hospital Pharmaceutical Industry,
Ms Anne Baileff, Consultant NHS Trust, Birmingham Dr Andrew Prentice, Senior London
Nurse in First Contact Care, Lecturer and Consultant
Southampton City Primary Care Dr Karen A Fitzgerald, Obstetrician & Gynaecologist, Dr Helen Williams,
Trust, University of Consultant in Pharmaceutical Department of Obstetrics & Consultant Microbiologist,
Southampton Public Health, National Public Gynaecology, University of Norfolk & Norwich University
Health Service for Wales, Cambridge Hospital NHS Trust
Professor Stirling Bryan, Cardiff
Professor of Health Economics, Dr Frances Rotblat, CPMP
Health Services Mrs Sharon Hart, Head of Delegate, Medicines &
Management Centre, DTB Publications, Drug & Healthcare Products Regulatory
University of Birmingham Therapeutics Bulletin, London Agency, London

130
Current and past membership details of all HTA ‘committees’ are available from the HTA website (www.ncchta.org)
Health Technology Assessment 2006; Vol. 10: No. 6

Therapeutic Procedures Panel


Members
Chair, Dr Carl E Counsell, Clinical Ms Maryann L Hardy, Professor James Neilson,
Professor Bruce Campbell, Senior Lecturer in Neurology, Lecturer, Division of Professor of Obstetrics and
Consultant Vascular and Department of Medicine and Radiography, University of Gynaecology, Department of
General Surgeon, Department Therapeutics, University of Bradford Obstetrics and Gynaecology,
of Surgery, Royal Devon & Aberdeen University of Liverpool
Exeter Hospital Professor Alan Horwich,
Ms Amelia Curwen, Executive Dr John C Pounsford,
Director of Clinical R&D, Consultant Physician,
Director of Policy, Services and
Academic Department of Directorate of Medical Services,
Research, Asthma UK, London
Radiology, The Institute of North Bristol NHS Trust
Professor Gene Feder, Professor Cancer Research,
of Primary Care R&D, London Karen Roberts, Nurse
Department of General Practice Consultant, Queen Elizabeth
and Primary Care, Barts & the Hospital, Gateshead
Dr Simon de Lusignan,
London, Queen Mary’s School Senior Lecturer, Dr Vimal Sharma, Consultant
Dr Aileen Clarke, of Medicine and Dentistry, Primary Care Informatics, Psychiatrist/Hon. Senior Lecturer,
Reader in Health Services London Department of Community Mental Health Resource Centre,
Research, Public Health & Health Sciences, Cheshire and Wirral Partnership
Policy Research Unit, Barts & Professor Paul Gregg, St George’s Hospital Medical NHS Trust, Wallasey
the London School of Medicine Professor of Orthopaedic School, London
& Dentistry, London Surgical Science, Department of Dr L David Smith, Consultant
General Practice and Primary Cardiologist, Royal Devon &
Dr Matthew Cooke, Reader in Care, South Tees Hospital NHS Professor Neil McIntosh, Exeter Hospital
A&E/Department of Health Trust, Middlesbrough Edward Clark Professor of
Advisor in A&E, Warwick Child Life & Health, Professor Norman Waugh,
Emergency Care and Ms Bec Hanley, Co-Director, Department of Child Life & Professor of Public Health,
Rehabilitation, University of TwoCan Associates, Health, University of Department of Public Health,
Warwick Hurstpierpoint Edinburgh University of Aberdeen

131
Current and past membership details of all HTA ‘committees’ are available from the HTA website (www.ncchta.org)
Health Technology Assessment Programme

Expert Advisory Network


Members

Professor Douglas Altman, Director of CSM & Cancer Research UK Med Stat Gp, Centre for Statistics in Medicine, University of Oxford, Institute of Health Sciences, Headington, Oxford

Professor John Bond, Director, Centre for Health Services Research, University of Newcastle upon Tyne, School of Population & Health Sciences, Newcastle upon Tyne

Mr Shaun Brogan, Chief Executive, Ridgeway Primary Care Group, Aylesbury

Mrs Stella Burnside OBE, Chief Executive, Office of the Chief Executive. Trust Headquarters, Altnagelvin Hospitals Health & Social Services Trust, Altnagelvin Area Hospital, Londonderry

Ms Tracy Bury, Project Manager, World Confederation for Physical Therapy, London

Professor Iain T Cameron, Professor of Obstetrics and Gynaecology and Head of the School of Medicine, University of Southampton

Dr Christine Clark, Medical Writer & Consultant Pharmacist, Rossendale

Professor Collette Clifford, Professor of Nursing & Head of Research, School of Health Sciences, University of Birmingham, Edgbaston, Birmingham

Professor Barry Cookson, Director, Laboratory of Healthcare Associated Infection, Health Protection Agency, London

Professor Howard Cuckle, Professor of Reproductive Epidemiology, Department of Paediatrics, Obstetrics & Gynaecology, University of Leeds

Dr Katherine Darton, Information Unit, MIND – The Mental Health Charity, London

Professor Carol Dezateux, Professor of Paediatric Epidemiology, London

Mr John Dunning, Consultant Cardiothoracic Surgeon, Cardiothoracic Surgical Unit, Papworth Hospital NHS Trust, Cambridge

Mr Jonothan Earnshaw, Consultant Vascular Surgeon, Gloucestershire Royal Hospital, Gloucester

Professor Martin Eccles, Professor of Clinical Effectiveness, Centre for Health Services Research, University of Newcastle upon Tyne

Professor Pam Enderby, Professor of Community Rehabilitation, Institute of General Practice and Primary Care, University of Sheffield

Mr Leonard R Fenwick, Chief Executive, Newcastle upon Tyne Hospitals NHS Trust

Professor David Field, Professor of Neonatal Medicine, Child Health, The Leicester Royal Infirmary NHS Trust

Mrs Gillian Fletcher, Antenatal Teacher & Tutor and President, National Childbirth Trust, Henfield

Professor Jayne Franklyn, Professor of Medicine, Department of Medicine, University of Birmingham, Queen Elizabeth Hospital, Edgbaston, Birmingham

Ms Grace Gibbs, Deputy Chief Executive, Director for Nursing, Midwifery & Clinical Support Services, West Middlesex University Hospital, Isleworth

Dr Neville Goodman, Consultant Anaesthetist, Southmead Hospital, Bristol

Professor Alastair Gray, Professor of Health Economics, Department of Public Health, University of Oxford

Professor Robert E Hawkins, CRC Professor and Director of Medical Oncology, Christie CRC Research Centre, Christie Hospital NHS Trust, Manchester

Professor Allen Hutchinson, Director of Public Health & Deputy Dean of ScHARR, Department of Public Health, University of Sheffield

Dr Duncan Keeley, General Practitioner (Dr Burch & Ptnrs), The Health Centre, Thame

Dr Donna Lamping, Research Degrees Programme Director & Reader in Psychology, Health Services Research Unit, London School of Hygiene and Tropical Medicine, London

Mr George Levvy, Chief Executive, Motor Neurone Disease Association, Northampton

Professor James Lindesay, Professor of Psychiatry for the Elderly, University of Leicester, Leicester General Hospital

Professor Julian Little, Professor of Human Genome Epidemiology, Department of Epidemiology & Community Medicine, University of Ottawa

Professor Rajan Madhok, Medical Director & Director of Public Health, Directorate of Clinical Strategy & Public Health, North & East Yorkshire & Northern Lincolnshire Health Authority, York

Professor David Mant, Professor of General Practice, Department of Primary Care, University of Oxford

Professor Alexander Markham, Director, Molecular Medicine Unit, St James’s University Hospital, Leeds

Dr Chris McCall, General Practitioner, The Hadleigh Practice, Castle Mullen

Professor Alistair McGuire, Professor of Health Economics, London School of Economics

Dr Peter Moore, Freelance Science Writer, Ashtead

Dr Sue Moss, Associate Director, Cancer Screening Evaluation Unit, Institute of Cancer Research, Sutton

Mrs Julietta Patnick, Director, NHS Cancer Screening Programmes, Sheffield

Professor Tim Peters, Professor of Primary Care Health Services Research, Academic Unit of Primary Health Care, University of Bristol

Professor Chris Price, Visiting Chair – Oxford, Clinical Research, Bayer Diagnostics Europe, Cirencester

Professor Peter Sandercock, Professor of Medical Neurology, Department of Clinical Neurosciences, University of Edinburgh

Dr Eamonn Sheridan, Consultant in Clinical Genetics, Genetics Department, St James’s University Hospital, Leeds

Dr Ken Stein, Senior Clinical Lecturer in Public Health, Director, Peninsula Technology Assessment Group, University of Exeter

Professor Sarah Stewart-Brown, Professor of Public Health, University of Warwick, Division of Health in the Community Warwick Medical School, LWMS, Coventry

Professor Ala Szczepura, Professor of Health Service Research, Centre for Health Services Studies, University of Warwick

Dr Ross Taylor, Senior Lecturer, Department of General Practice and Primary Care, University of Aberdeen

Mrs Joan Webster, Consumer member, HTA – Expert Advisory Network


How to obtain copies of this and other HTA Programme reports.


An electronic version of this publication, in Adobe Acrobat format, is available for downloading free of
charge for personal use from the HTA website (http://www.ncchta.org). A fully searchable CD-ROM is
also available (see below).
Printed copies of HTA monographs cost £20 each (post and packing free in the UK) to both public and
private sector purchasers from our Despatch Agents, York Publishing Services.
Non-UK purchasers will have to pay a small fee for post and packing. For European countries the cost is
£2 per monograph and for the rest of the world £3 per monograph.
You can order HTA monographs from our Despatch Agents, York Publishing Services by:
– fax (with credit card or official purchase order)
– post (with credit card or official purchase order or cheque)
– phone during office hours (credit card only).
Additionally the HTA website allows you either to pay securely by credit card or to print out your
order and then post or fax it.
Contact details are as follows:
York Publishing Services
PO Box 642
YORK YO31 7WX
UK
Email: ncchta@yps-publishing.co.uk
Tel: 0870 1616662
Fax: 0870 1616663
Fax from outside the UK: +44 1904 430868
NHS libraries can subscribe free of charge. Public libraries can subscribe at a very reduced cost of
£100 for each volume (normally comprising 30–40 titles). The commercial subscription rate is £300
per volume. Please contact York Publishing Services at the address above. Subscriptions can only be
purchased for the current or forthcoming volume.

Payment methods
Paying by cheque
If you pay by cheque, the cheque must be in pounds sterling, made payable to York Publishing
Distribution and drawn on a bank with a UK address.
Paying by credit card
The following cards are accepted by phone, fax, post or via the website ordering pages: Delta, Eurocard,
Mastercard, Solo, Switch and Visa. We advise against sending credit card details in a plain email.
Paying by official purchase order
You can post or fax these, but they must be from public bodies (i.e. NHS or universities) within the UK.
We cannot at present accept purchase orders from commercial companies or from outside the UK.
How do I get a copy of HTA on CD?
Please use the form on the HTA website (www.ncchta.org/htacd.htm), or contact York Publishing
Services (see contact details above) by email, post, fax or phone. HTA on CD is currently free of charge
worldwide.

The website also provides information about the HTA Programme and lists the membership of the various
committees.
