
Conducting Equivalence Testing in Laboratory Applications: Standard Practice For

The document outlines the American National Standard E2935-16, which provides a statistical methodology for conducting equivalence testing in laboratory applications. It covers the scope, terminology, and guidance for determining data requirements and controlling risks associated with equivalence decisions. The standard emphasizes the importance of demonstrating that modifications to testing processes do not adversely affect test results within predetermined limits.


This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.

Designation: E2935 − 16 An American National Standard

Standard Practice for

Conducting Equivalence Testing in Laboratory Applications1
This standard is issued under the fixed designation E2935; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

1. Scope

1.1 This practice provides statistical methodology for conducting equivalence testing on numerical data from two sources to determine if their true means or variances differ by no more than predetermined limits.

1.2 Applications include (1) equivalence testing for bias against an accepted reference value, (2) determining means equivalence of two test methods, test apparatus, instruments, reagent sources, or operators within a laboratory or equivalence of two laboratories in a method transfer, and (3) determining non-inferiority of a modified test procedure versus a current test procedure with respect to a performance characteristic.

1.3 The guidance in this standard applies only to experiments conducted on a single material at a given level of the test result.

1.4 Guidance is given for determining the amount of data required for an equivalence trial. The control of risks associated with the equivalence decision is discussed.

1.5 The values stated in SI units are to be regarded as standard. No other units of measurement are included in this standard.

1.6 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety and health practices and determine the applicability of regulatory limitations prior to use.

2. Referenced Documents

2.1 ASTM Standards:2
E177 Practice for Use of the Terms Precision and Bias in ASTM Test Methods
E456 Terminology Relating to Quality and Statistics
E2282 Guide for Defining the Test Result of a Test Method
E2586 Practice for Calculating and Using Basic Statistics

2.2 USP Standard:3
USP <1223> Validation of Alternative Microbiological Methods

3. Terminology

3.1 Definitions—See Terminology E456 for a more extensive listing of statistical terms.

3.1.1 accepted reference value, n—a value that serves as an agreed-upon reference for comparison, and which is derived as: (1) a theoretical or established value, based on scientific principles, (2) an assigned or certified value, based on experimental work of some national or international organization, or (3) a consensus or certified value, based on collaborative experimental work under the auspices of a scientific or engineering group. E177

3.1.2 bias, n—the difference between the expectation of the test results and an accepted reference value. E177

3.1.3 confidence interval, n—an interval estimate [L, U] with the statistics L and U as limits for the parameter θ and with confidence level 1 – α, where Pr(L ≤ θ ≤ U) ≥ 1 – α. E2586

3.1.3.1 Discussion—The confidence level, 1 – α, reflects the proportion of cases that the confidence interval [L, U] would contain or cover the true parameter value in a series of repeated random samples under identical conditions. Once L and U are given values, the resulting confidence interval either does or does not contain it. In this sense “confidence” applies not to the particular interval but only to the long run proportion of cases when repeating the procedure many times.

3.1.4 confidence level, n—the value, 1 – α, of the probability associated with a confidence interval, often expressed as a percentage. E2586

3.1.4.1 Discussion—α is generally a small number. Confidence level is often 95 % or 99 %.

3.1.5 confidence limit, n—each of the limits, L and U, of a confidence interval, or the limit of a one-sided confidence interval. E2586

1 This test method is under the jurisdiction of ASTM Committee E11 on Quality and Statistics and is the direct responsibility of Subcommittee E11.20 on Test Method Evaluation and Quality Control. Current edition approved Nov. 15, 2016. Published January 2017. Originally approved in 2013. Last previous edition approved in 2015 as E2935 – 15. DOI: 10.1520/E2935-16.

2 For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard’s Document Summary page on the ASTM website.

3 Available from U.S. Pharmacopeial Convention (USP), 12601 Twinbrook Pkwy., Rockville, MD 20852-1790, http://www.usp.org.

3.1.6 degrees of freedom, n—the number of independent data points minus the number of parameters that have to be estimated before calculating the variance. E2586

3.1.7 equivalence, n—condition that two population parameters differ by no more than predetermined limits.

3.1.8 intermediate precision conditions, n—conditions under which test results are obtained with the same test method using test units or test specimens taken at random from a single quantity of material that is as nearly homogeneous as possible, and with changing conditions such as operator, measuring equipment, location within the laboratory, and time. E177

3.1.9 mean, n—of a population, µ, average or expected value of a characteristic in a population; of a sample, X̄, sum of the observed values in the sample divided by the sample size. E2586

3.1.10 percentile, n—quantile of a sample or a population, for which the fraction less than or equal to the value is expressed as a percentage. E2586

3.1.11 population, n—the totality of items or units of material under consideration. E2586

3.1.12 population parameter, n—summary measure of the values of some characteristic of a population. E2586

3.1.13 precision, n—the closeness of agreement between independent test results obtained under stipulated conditions. E177

3.1.14 quantile, n—value such that a fraction f of the sample or population is less than or equal to that value. E2586

3.1.15 repeatability, n—precision under repeatability conditions. E177

3.1.16 repeatability conditions, n—conditions where independent test results are obtained with the same method on identical test items in the same laboratory by the same operator using the same equipment within short intervals of time. E177

3.1.17 repeatability standard deviation (sr), n—the standard deviation of test results obtained under repeatability conditions. E177

3.1.18 sample, n—a group of observations or test results, taken from a larger collection of observations or test results, which serves to provide information that may be used as a basis for making a decision concerning the larger collection. E2586

3.1.19 sample size, n, n—number of observed values in the sample. E2586

3.1.20 sample statistic, n—summary measure of the observed values of a sample. E2586

3.1.21 standard deviation—of a population, σ, the square root of the average or expected value of the squared deviation of a variable from its mean; of a sample, s, the square root of the sum of the squared deviations of the observed values in the sample from their mean divided by the sample size minus 1. E2586

3.1.22 test result, n—the value of a characteristic obtained by carrying out a specified test method. E2282

3.1.23 test unit, n—the total quantity of material (containing one or more test specimens) needed to obtain a test result as specified in the test method. See test result. E2282

3.1.24 variance, σ², s², n—square of the standard deviation of the population or sample. E2586

3.2 Definitions of Terms Specific to This Standard:

3.2.1 bias equivalence, n—equivalence of a population mean with an accepted reference value.

3.2.2 equivalence limit, E, n—in equivalence testing, a limit on the difference between two population parameters.

3.2.2.1 Discussion—In certain applications, this may be termed practical limit or practical difference.

3.2.3 equivalence test, n—a statistical test conducted within predetermined risks to confirm equivalence of two population parameters.

3.2.4 means equivalence, n—equivalence of two population means.

3.2.5 non-inferiority, n—condition that the difference in means or variances of test results between a modified testing process and a current testing process with respect to a performance characteristic is no greater than a predetermined limit in the direction of inferiority of the modified process to the current process.

3.2.5.1 Discussion—Other terms used for non-inferior are “equivalent or better” or “at least equivalent as.”

3.2.6 paired samples design, n—in means equivalence testing, single samples are taken from the two populations at a number of sampling points.

3.2.6.1 Discussion—This design is termed a randomized block design for a general number of populations sampled, and each group of data within a sampling point is termed a block.

3.2.7 power, n—in equivalence testing, the probability of accepting equivalence, given the true difference between two population means.

3.2.7.1 Discussion—In the case of testing for bias equivalence the power is the probability of accepting equivalence, given the true difference between a population mean and an accepted reference value.

3.2.8 two independent samples design, n—in means equivalence testing, replicate test results are determined independently from two populations at a single sampling time for each population.

3.2.8.1 Discussion—This design is termed a completely randomized design for a general number of populations sampled.

3.2.9 two one-sided tests (TOST) procedure, n—a statistical procedure used for testing the equivalence of the parameters from two distributions (see equivalence).

3.3 Symbols:

B = bias (7.1.1)
dj = difference between a pair of test results at sampling point j (7.1.1)
d̄ = average difference (7.1.1)
D = difference in sample means (6.1.2) (X1.1.2)


E = equivalence limit (5.2)
E1 = lower equivalence limit (5.2.1)
E2 = upper equivalence limit (5.2.1)
f = degrees of freedom for s (8.1.1)
F1–α = (1 – α)th percentile of the F distribution (9.3.1)
fi = degrees of freedom for si (6.1.1)
fp = degrees of freedom for sp (6.1.2)
ℱ(•) = the cumulative F distribution function (X1.6.3)
H0: = null hypothesis (X1.1.1)
HA: = alternate hypothesis (X1.1.1)
n = sample size (number of test results) from a population (5.4) (6.1.3) (7.1.1) (8.1.1)
ni = sample size from ith population (6.1.1)
n1 = sample size from population 1 (6.1.2)
n2 = sample size from population 2 (6.1.2)
R = ratio of two sample variances (5.5.3)
ℛ = ratio of two population variances (X1.6.3)
s = sample standard deviation (8.1.1)
sB = sample standard deviation for bias (8.1.2)
sd = standard deviation of the difference between two test results (7.1.1)
sD = sample standard deviation for mean difference (6.1.3) (X1.1.2)
si = sample standard deviation for ith population (6.1.1)
si² = sample variance for ith population (6.1.1)
s1² = sample variance for population 1 (6.1.2)
s1² = variance of test results from the current process (5.5.3)
s2² = sample variance for population 2 (6.1.2)
s2² = variance of test results from the modified process (5.5.3)
sp = pooled sample standard deviation (6.1.2)
sr = repeatability sample standard deviation (6.2)
t = Student’s t statistic (6.1.4) (7.1.3) (8.1.3)
t1–α,f = (1 – α)th percentile of the Student’s t distribution with f degrees of freedom (X1.1.2)
Xij = jth test result from the ith population (6.1)
UCLR = upper confidence limit for ℛ (9.3.1)
X̄ = test result average (8.1.1)
X̄i = test result average for the ith population (6.1.1)
X̄1 = test result average for population 1 (6.1.3)
X̄2 = test result average for population 2 (6.1.3)
Z1–α = (1 – α)th percentile of the standard normal distribution (X1.6.1)
α = consumer’s risk (5.2.3) (6.2) (7.2)
β = producer’s risk (5.4.1)
∆ = true mean difference between populations (5.4.1)
µ = population mean (X1.4.1)
µi = ith population mean (X1.1.1)
ν = approximate degrees of freedom for sD (X1.1.4)
σ = standard deviation of the test method (5.2)
σd = standard deviation of the true difference between two populations (7.2)
Φ(•) = standard normal cumulative distribution function (X1.6.1)

3.4 Acronyms:

3.4.1 ARV, n—accepted reference value (5.3.3) (8.1) (X1.4)

3.4.2 CRM, n—certified reference material (5.3.3) (8.1)

3.4.3 ILS, n—interlaboratory study (6.2)

3.4.4 LCL, n—lower confidence limit (6.2.5) (7.2.3)

3.4.5 TOST, n—two one-sided tests (5.5.1) (Section 6) (Section 7) (Section 8) (Appendix X1)

3.4.6 UCL, n—upper confidence limit (6.2.5) (7.2.3) (X1.1.2)

4. Significance and Use

4.1 Laboratories conducting routine testing have a continuing need to make improvements in their testing processes. In these situations it must be demonstrated that any changes will not cause an undesirable shift in the test results from the current testing process nor substantially affect a performance characteristic of the test method. This standard provides guidance on experiments and statistical methods needed to demonstrate that the test results from a modified testing process are equivalent to those from the current testing process, where equivalence is defined as agreement within a prescribed limit, termed an equivalence limit.

4.1.1 Examples of modifications to the testing process include, but are not limited to, the following:
(1) Changes to operating levels in the steps of the test method procedure,
(2) Installation of new instruments, apparatus, or sources of reagents and test materials,
(3) Evaluation of new personnel performing the testing, and
(4) Transfer of testing to a new location.

4.1.2 The equivalence limit, which represents a worst-case difference, is determined prior to the equivalence test and its value is usually set by consensus among subject-matter experts.

4.2 Two principal types of equivalence are covered in the practice, means equivalence and non-inferiority. Means equivalence concerns an absolute difference, that is, a sustained shift in test results between the modified and current testing processes in either direction from zero. Non-inferiority is concerned with a difference only in the direction of an inferior outcome in a performance characteristic of the modified testing procedure versus the current testing procedure.

4.2.1 Equivalence testing is performed by an experiment that generates test results from the modified and current testing procedures on the same materials that are routinely tested. An exception is bias equivalence where the experiment consists of conducting multiple testing on a certified reference material (CRM) having an accepted reference value (ARV) to evaluate the test method bias.

4.2.2 Examples of performance characteristics directly applicable to the test method are bias, precision, sensitivity, specificity, linearity, and range. Additional characteristics are test cost and elapsed time to conduct the test procedure.

4.2.3 Non-inferiority may involve trade-offs in performance characteristics between the modified and current procedures. For example, the modified process may be slightly inferior to the established process with respect to assay sensitivity or precision but may have off-setting advantages such as faster delivery of results or lower testing costs.

4.3 Risk Management—Guidance is also provided for determining the amount of data required to control the risks of
making the wrong decision in accepting or rejecting equivalence (see Section X1.2).

4.3.1 The consumer’s risk is the risk of falsely declaring equivalence. The probability associated with this risk is directly controlled to a low level so that accepting equivalence gives a high degree of assurance that the true difference is less than the equivalence limit.

4.3.2 The producer’s risk is the risk of falsely rejecting equivalence. The probability associated with this risk is controlled by the amount of data generated by the experiment. If valid improvements are rejected by equivalence testing, this can lead to opportunity losses to the company and its laboratories (the producers) or cause unnecessary additional effort in improving the testing process.

5. Planning and Executing the Equivalence Study

5.1 This section discusses the stages of conducting an equivalence test: (1) determining the information needed, (2) setting up and conducting the study design, and (3) performing the statistical analysis of the resulting data. The study is usually conducted either in a single laboratory or, in the case of a method transfer, in both the originating and receiving laboratories. Using multiple laboratories will almost always increase the inherent variability of the data in the study, which will increase the cost of performing the study due to the need for more data.

5.2 Prior information required for the study design includes the equivalence limit E, the consumer’s risk α, and an estimate of the test method precision σ.

5.2.1 For means equivalence tests there are two equivalence limits, –E and E, that are tested. Limits may be nonsymmetrical around zero, such as –E1 and E2, but this is not usual and would require advice from a qualified statistician for a proper design setup. For non-inferiority tests only one of these limits is tested.

5.2.2 A prior estimate of the test method precision is essential for determining the number of test results required in the study design for adequate producer’s risk control. This estimate can be available from method development work, from an interlaboratory study, or from other sources. The precision estimate should take into account the test conditions of the study, such as repeatability, intermediate, or reproducibility conditions.

5.2.3 The consumer’s risk may be determined by an industry norm or a regulatory requirement. A probability value often used is α = 0.05, which is a 5 % risk to the consumer that the study falsely declares equivalence.

5.3 The design type determines how the data are collected and how much data are needed to control the risk of a wrong decision. A sufficient quantity of a homogeneous material for the required number of tests is necessary. For comparing data from the modified and current testing processes, two basic designs are discussed in this practice, the Two Independent Samples Design, and the Paired Samples Design. These designs are suitable for determining either means equivalence or non-inferiority.

5.3.1 The Two Independent Samples Design for means equivalence is discussed in Section 6. In this design sets of independent test results are usually generated in a single laboratory by both testing procedures under repeatability conditions. For method transfer each laboratory generates independent test results using the same testing procedure, preferably under repeatability conditions. If this is not possible due to constraints on time or facilities, then the tests can be conducted under intermediate precision conditions, but consulting a statistician is recommended for the design and analysis of the test.

5.3.2 The Paired Samples Design for means equivalence is discussed in Section 7. In this design, multiple pairs of single test results from each testing procedure are generated under different conditions of a second variable, such as time of process sampling. This design is most useful when there are constraints on conducting the two independent samples design.

5.3.3 The design for bias equivalence is discussed in Section 8. In this design test results are generated by the current testing process on a certified reference material (CRM) having an accepted reference value (ARV) for the material characteristic of interest.

5.3.4 The statistical analysis for non-inferiority is discussed in Section 9 for evaluating two testing procedures with respect to a performance characteristic. The data can be generated by either of the designs discussed in Sections 6 and 7.

5.4 Sample size in the design context refers to the number n of test results required by each testing process to manage the producer’s risk. It is possible to use different sample sizes for the modified and current test processes, but this can lead to poor control of the consumer’s risk (see X1.1.4).

5.4.1 The number of test results, symbol n, from each testing process controls the producer’s risk β of falsely rejecting means equivalence at a given true mean difference, Δ. The producer’s risk may be alternatively stated in terms of the power, the probability 1–β of correctly accepting equivalence at a given value of Δ.

5.4.1.1 For symmetric equivalence limits in means equivalence tests the power profile plots the probability 1–β against the absolute value of Δ, due to the symmetry of the equivalence limits. This calculation can be performed using a spreadsheet computer package (see X1.6.1 and Appendix X2).

5.4.1.2 An example of a set of power profiles in means equivalence tests is shown in Fig. 1. The probability scale for power on the vertical axis varies from 0 to 1. The horizontal axis is the true absolute difference Δ. The power profile, a reversed S-shaped curve, should be close to a power probability of 1 at zero absolute difference and will decline to the consumer risk probability at an absolute difference of E. Power for absolute differences greater than E is less than the consumer risk and declines asymptotically to zero as the absolute difference increases.

5.4.1.3 In Fig. 1 power profiles are shown for three different sample sizes for testing means equivalence. Increasing the sample size moves the power curve to the right, giving a greater chance of accepting equivalence for a given true difference Δ. Equations for power profiles are shown in Section X1.5 and a spreadsheet example in Appendix X2.

5.4.2 Power curves for bias equivalence and non-inferiority are constructed by different formulas but have the same shape and interpretation as those for means equivalence.
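The shape of a means-equivalence power profile can be explored with a short script. The sketch below is not the standard's own calculation (its equations are in Appendix X1, which is not reproduced here); it assumes a normal approximation with a known standard deviation σ, equal sample sizes n, and symmetric limits ±E. The function name `tost_power` and its parameters are hypothetical, and the inputs E = 2, σ = 0.5, and n = 3, 6, 20 are the values used in the Section 6.2 example.

```python
from math import sqrt
from statistics import NormalDist


def tost_power(delta, n, sigma, E, alpha=0.05):
    """Approximate probability of accepting means equivalence.

    Assumption: normal approximation, known sigma, equal n per process,
    symmetric equivalence limits -E and +E.
    """
    z = NormalDist()
    se = sigma * sqrt(2.0 / n)        # standard error of the difference in means
    z_crit = z.inv_cdf(1.0 - alpha)   # one-sided critical value for risk alpha
    power = (z.cdf((E - delta) / se - z_crit)
             + z.cdf((E + delta) / se - z_crit) - 1.0)
    return max(0.0, power)            # clamp: a probability cannot be negative


if __name__ == "__main__":
    # Tabulate profiles for true differences 0.0, 0.2, ..., 2.4
    for n in (3, 6, 20):
        profile = [round(tost_power(d / 5, n, sigma=0.5, E=2.0), 3)
                   for d in range(13)]
        print(n, profile)
```

Under this approximation every profile passes through the consumer's risk (0.05) at a true difference equal to E, and larger n keeps the power near 1 over a wider range of differences.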


FIG. 1 Multiple Power Curves for Lab Transfer Example

5.4.2.1 For non-inferiority testing the power profile plots the probability 1–β against the true difference Δ for means (see X1.6.2) or against the true variance ratio ℛ for variances (see X1.6.3).

5.4.3 Power curves are evaluated by entering different values of n and evaluating the curve shape. A practical solution is to choose n such that the power is above a 0.9 probability out to about one-half to two-thirds of the distance to E, thus giving a high probability that equivalence will be demonstrated for a range of true absolute differences that are deemed of little or no scientific import in the test result.

5.5 The statistical analysis for accepting or rejecting equivalence is similar for all cases and depends on the outcome of one-sided statistical hypothesis tests for means and variances. The calculations are given in detail with examples in Sections 6 – 9. The statistical theory is given in an appendix (see Section X1.1).

5.5.1 The data analysis for means equivalence testing in this practice uses a statistical methodology termed the two one-sided tests (TOST) procedure. This is based on calculating confidence limits for the true mean difference as $D \pm t\,s_D$, where D is the difference between the two test result averages, $s_D$ is the standard error of that difference, and t is a tabulated multiplier based on the number of data and a preselected confidence level. The calculation for $s_D$ is based on the standard deviations of the two sets of data and the type of study design. Then equivalence is supported if both of the following two conditions are met:
(1) The lower confidence limit, $LCL = D - t\,s_D$, is greater than the lower equivalence limit, –E, and
(2) The upper confidence limit, $UCL = D + t\,s_D$, is less than the upper equivalence limit, E.

NOTE 1—Historically, this procedure originated in the pharmaceutical industry for use in bioequivalence trials (1, 2),4 denoted as the Two One-Sided Tests Procedure, which has since been adopted for use in testing and measurement applications (3, 4).

4 The boldface numbers in parentheses refer to a list of references at the end of this standard.

5.5.1.1 The conventional Student’s t test based on the null hypothesis of a zero difference is not recommended for means equivalence testing as it does not properly control the consumer’s and producer’s risks for this application (see Section X1.3). This test is suitable for supporting superiority of the modified process versus the established process instead of equivalence.

5.5.1.2 For bias equivalence the calculation for $s_D$ is based on only a single set of data because the ARV is considered as a known mean with zero variability for the purpose of the equivalence study.

5.5.2 The data analysis for non-inferiority testing of population means uses a single one-sided test in the direction of an inferior outcome with respect to a performance characteristic determined by the test results. When the performance characteristic is defined as “higher is better”, such as method sensitivity, the statistical test supports noninferiority when LCL > –E. Conversely, when the performance characteristic is defined as “lower is better”, such as incidence of misclassifications, the statistical test supports noninferiority when UCL < E. Note that the means equivalence procedure comprises two one-sided statistical tests while the non-inferiority procedure performs only a single one-sided statistical test. For statistical details see Section X1.5.

5.5.3 For the equivalence testing of precision the variance is used, and “lower is better” for this parameter, so the test for non-inferiority applies. Because variances are a scale parameter, the non-inferiority test is based on the ratio R of the two sample variances instead of their difference; thus $R = s_2^2 / s_1^2$, where $s_1^2$ and $s_2^2$ are the calculated variances of the test results from the current and modified test processes, respectively. An upper confidence limit for the true variance ratio $\sigma_2^2 / \sigma_1^2$, denoted UCLR, for the given confidence level and sample sizes, can be found from the tabulated F distribution. The non-inferiority limit E is also in the form of a ratio. For example, if E = 2, the noninferiority limit would allow the modified process to have up to twice the variance of the
established process or up to about 1.4 times the standard deviation in the worst case. The statistical test supports noninferiority if UCLR < E.

TABLE 1 Data for Equivalence Test Between Two Laboratories

                Test Results
Laboratory 1    96.9  97.9  98.5  97.5  97.7  97.2
Laboratory 2    97.8  97.6  98.1  98.6  98.6  98.9
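The variance-ratio check described in 5.5.3 can be sketched in a few lines. The construction below is an assumption based on the usual F-distribution result (the standard's own calculation is in Section 9, which is not reproduced here): the upper confidence limit is taken as R multiplied by the tabulated 100(1 – α)th percentile of F, with numerator degrees of freedom from the current process and denominator degrees of freedom from the modified process. The function name `variance_ratio_ucl` is hypothetical, the F percentile is passed in from a table rather than computed, and the Table 1 results are used only as convenient sample input, not as an analysis given in the standard.

```python
from statistics import variance


def variance_ratio_ucl(current, modified, f_crit):
    """Return (R, UCL_R) for a non-inferiority check on precision.

    R = s2^2 / s1^2 (modified over current). f_crit is assumed to be the
    tabulated 100(1 - alpha)th percentile of F with (n1 - 1, n2 - 1)
    degrees of freedom.
    """
    r = variance(modified) / variance(current)   # sample variances, divisor n - 1
    return r, r * f_crit


# Illustrative input only: Table 1 results, Laboratory 1 treated as "current".
lab1 = [96.9, 97.9, 98.5, 97.5, 97.7, 97.2]
lab2 = [97.8, 97.6, 98.1, 98.6, 98.6, 98.9]

# 5.05 is the tabulated 95th percentile of F with 5 and 5 degrees of freedom.
R, ucl_r = variance_ratio_ucl(lab1, lab2, f_crit=5.05)
print(round(R, 3), round(ucl_r, 2))
```

With only six results per group the upper limit is wide (roughly five times R), which illustrates why the sample size has to be planned against the chosen limit E before a precision non-inferiority study is run.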

6. The TOST Procedure for Statistical Analysis of Means

Equivalence — Two Independent Samples Design
6.1 Statistical Analysis—Let the sample data be denoted as
Xij = the jth test result from the ith population. The equivalence study (ILS) on this test method had given an estimate of sr =
limit E, consumer’s risk α, and sample sizes have been 0.5 units for the repeatability standard deviation. Thus E = 2
previously determined. units, α = 0.05, and estimated σ = 0.5 units are inputs for this
6.1.1 Calculate averages, variances, and standard study (the actual units are unspecified for this example).
deviations, and degrees of freedom for each sample: 6.2.1 Sample Size Determination—Power profiles for n = 3,
ni 6, and 20 were generated for a set of absolute difference values
(X
j51
ij ranging 0.00 (0.20) 2.40 units as shown in Fig. 1. All three
X̄ i 5 , i 5 1, 2 (1) curves intersect at the point (2, 0.05) as determined by the
ni
consumer’s risk at the equivalence limit.
ni
2 6.2.1.1 A sample size of n = 6 replicate assays per labora-
( ~X
j51
ij 2 X̄ i !
tory yielded a satisfactory power curve, in that the probability
s i2 5 , i 5 1, 2 (2)
~ n i 2 1! of accepting equivalence (power) was greater than a 0.9
probability (or a 90 % power) for a difference of about 1.2 units
s i 5 =s i2 , i 5 1, 2 (3)
or less. Therefore, there would be less than an estimated 10 %
f i 5 n i 2 1, i 5 1, 2 (4) risk to the producer that such a difference would fail to support
6.1.2 Calculate the pooled standard deviation and degrees of equivalence in the actual trial.
freedom: 6.2.1.2 A comparison of the three power curves indicates
that the n = 3 design would be underpowered, as the power
sp 5 Œ ~ n 1 2 1 ! s 21 1 ~ n 2 2 1 ! s 22
~n1 1 n2 2 2!
(5)
falls below 0.9 at 0.8 units. The n = 20 design gives somewhat
more power than the n = 6 design but is more costly to conduct and may not be worth the extra expenditure.

If $n_1 = n_2 = n$, then:

$$s_p^2 = \frac{s_1^2 + s_2^2}{2}$$

$$f_p = (n_1 + n_2 - 2) \tag{6}$$

6.1.3 Calculate the difference between means and its standard error:

$$D = \bar{X}_2 - \bar{X}_1 \tag{7}$$

$$s_D = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \tag{8}$$

If $n_1 = n_2 = n$, then:

$$s_D = s_p \sqrt{\frac{2}{n}}$$

6.1.4 Test for Equivalence—Compute the upper (UCL) and lower (LCL) confidence limits for the 100(1–2α) % two-sided confidence interval on the true difference. If the confidence interval is completely contained within the equivalence limits (0 ± E), equivalently if LCL > –E and UCL < E, then accept equivalence. Otherwise, reject equivalence.

$$\mathrm{UCL} = D + t\, s_D \tag{9}$$

$$\mathrm{LCL} = D - t\, s_D \tag{10}$$

where t is the upper 100(1–α) % percentile of the Student's t distribution with ($n_1 + n_2 - 2$) degrees of freedom.

6.2 Example for Means Equivalence—The example shown is data from a transfer of an ASTM test method from R&D Lab 1 to Plant Lab 2 (Table 1). An equivalence limit of 2 units was proposed with a consumer risk of 5 %. An interlaboratory

6.2.2 Averages, variances, standard deviations, and degrees of freedom for the two laboratories are:

$$\bar{X}_1 = (96.9 + 97.9 + 98.5 + 97.5 + 97.7 + 97.2)/6 = 97.62 \text{ mg/g}$$

$$\bar{X}_2 = (97.8 + 97.6 + 98.1 + 98.6 + 98.6 + 98.9)/6 = 98.27 \text{ mg/g}$$

$$s_1^2 = [(96.9 - 97.62)^2 + \dots + (97.2 - 97.62)^2]/(6 - 1) = 0.31367$$

$$s_2^2 = [(97.8 - 98.27)^2 + \dots + (98.9 - 98.27)^2]/(6 - 1) = 0.26267$$

$$s_1 = \sqrt{0.31367} = 0.560$$

$$s_2 = \sqrt{0.26267} = 0.513$$

$$f_i = n_i - 1 = 6 - 1 = 5$$

The estimates of standard deviation are in good agreement with the ILS estimate of 0.5 mg/g.

6.2.3 The pooled standard deviation is:

$$s_p = \sqrt{\frac{(6 - 1)\,0.31367 + (6 - 1)\,0.26267}{(6 + 6 - 2)}} = \sqrt{\frac{2.8817}{10}} = 0.537 \text{ mg/g}$$

with 10 degrees of freedom.

6.2.4 The difference of means is D = 98.27 – 97.62 = 0.65 mg/g. The plant laboratory average is 0.65 mg/g higher than the development laboratory average. The standard error of the difference of means is $s_D = 0.537\sqrt{2/6} = 0.310$ mg/g with 10 degrees of freedom (same as that for $s_p$).

6.2.5 The 95th percentile of Student's t with 10 degrees of freedom is 1.812. Upper and lower confidence limits for the difference of means are:

UCL = 0.65 + (1.812)(0.310) = 1.21
LCL = 0.65 – (1.812)(0.310) = 0.09
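The arithmetic of 6.2.2 through 6.2.5 can be checked with a short script. The following is a minimal sketch, not part of the practice: the function name `tost_two_sample` and its return layout are illustrative assumptions, and the critical value 1.812 is taken from 6.2.5 rather than computed.

```python
from math import sqrt
from statistics import mean, variance


def tost_two_sample(x1, x2, limit, t_crit):
    """Two independent samples design (hypothetical helper, Eq 7 to Eq 10).

    t_crit is the upper 100(1 - alpha) % point of Student's t with
    n1 + n2 - 2 degrees of freedom, supplied by the caller.
    """
    n1, n2 = len(x1), len(x2)
    f = n1 + n2 - 2
    # Pooled standard deviation from the two sample variances (as in 6.2.3)
    s_p = sqrt(((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / f)
    d = mean(x2) - mean(x1)            # Eq 7
    s_d = s_p * sqrt(1 / n1 + 1 / n2)  # Eq 8
    ucl = d + t_crit * s_d             # Eq 9
    lcl = d - t_crit * s_d             # Eq 10
    return {
        "D": d, "s_p": s_p, "s_D": s_d, "f": f, "LCL": lcl, "UCL": ucl,
        "equivalent": lcl > -limit and ucl < limit,
    }


# Test results quoted in 6.2.2 (mg/g)
lab1 = [96.9, 97.9, 98.5, 97.5, 97.7, 97.2]
lab2 = [97.8, 97.6, 98.1, 98.6, 98.6, 98.9]

result = tost_two_sample(lab1, lab2, limit=2.0, t_crit=1.812)
print({k: round(v, 3) if isinstance(v, float) else v for k, v in result.items()})
```

The script carries unrounded intermediate values, so the last digit of an intermediate result may differ slightly from the hand calculation, which rounds at each step.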

The 90 % two-sided confidence interval on the true difference is 0.09 to 1.21 mg/g and is completely contained within the equivalence interval of –2 to 2 mg/g. Since 0.09 > –2 and 1.21 < 2, equivalence is accepted.

7. The TOST Procedure for Statistical Analysis of Means Equivalence — Paired Samples Design

7.1 Statistical Analysis—Let the sample data be denoted as $X_{ij}$ = the test result from the ith population and the jth block, where i = 1 or 2. Each block represents a pair of single test results from each population. For example, the blocking factor may be time of sampling from a process. The equivalence limit E, consumer's risk α, and sample size (number of blocks, symbol n) have been previously determined (see Section 5).

7.1.1 Calculate the n differences, symbol $d_j$, between the two test results within each block, the average of the differences, symbol $\bar{d}$, and the standard deviation of the differences, symbol $s_d$, with its degrees of freedom, symbol f.

$$d_j = X_{1j} - X_{2j}, \quad j = 1, \dots, n \tag{11}$$

$$\bar{d} = \frac{\sum_{j=1}^{n} d_j}{n} = D \tag{12}$$

$$s_d = \sqrt{\frac{\sum_{j=1}^{n} (d_j - \bar{d})^2}{(n - 1)}} \tag{13}$$

$$f = n - 1 \tag{14}$$

7.1.2 Calculate the standard error of the mean difference, symbol $s_D$.

$$s_D = \frac{s_d}{\sqrt{n}} \tag{15}$$

7.1.3 Test for Equivalence—Compute the upper (UCL) and lower (LCL) confidence limits for the 100(1–2α) % two-sided confidence interval on the true difference. If the confidence interval is completely contained within the equivalence limits (0 ± E), or equivalently if LCL > –E and UCL < E, then accept equivalence. Otherwise, reject equivalence.

$$\mathrm{UCL} = D + t\, s_D \tag{16}$$

$$\mathrm{LCL} = D - t\, s_D \tag{17}$$

where t is the upper 100(1–α) % percentile of the Student's t distribution with (n − 1) degrees of freedom.

7.2 Example for Means Equivalence—Total organic carbon (TOC) in purified water was measured by an on-line analyzer, wherein a water sample was taken directly into the analyzer from the pipeline through a sampling port and the test result was determined by a series of operations within the instrument. A new analyzer was to be qualified by running a TOC analysis at the same time as the current analyzer utilizing a parallel sampling port on the pipeline. The sampling time was the blocking factor, and the data from the two instruments constituted a pair of single test results measured at a particular sampling time. Sampling was to be conducted at intervals of four hours between sampling periods.

An equivalence limit of 2 parts per billion (ppb), or 4 % of the nominal process average of 50 ppb, was proposed with a consumer risk of 5 %. A repeatability estimate of $s_r$ = 0.7 ppb, based on previous validation work, gave an estimate for $\sigma_d = 0.7\sqrt{2}$, or approximately 1 ppb. Thus E = 2 ppb, α = 0.05, and $\sigma_d$ = 1 ppb were inputs for this study.

7.2.1 Sample Size Determination—Because the paired samples design uses the differences of the test results within sampling periods for data analysis, the sample size equals the number of pairs for purposes of calculating the power curve. In this example, the cost of obtaining test results was not a major consideration once the new analyzer was installed in the system. Comparative power profiles for n = 10, 20, and 50 sample pairs are shown in Fig. 2. The sample size of 20 pairs yielded a satisfactory power curve, in that the probability of accepting equivalence was greater than 0.9 (a 90 % power) for a true difference of about 1.25 ppb. Therefore, there would be less than an estimated 10 % risk to the producer that such a difference would fail to support equivalence in the actual trial.
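The power statement in 7.2.1 can be checked numerically. The sketch below assumes the normal-distribution approximation of Eq X1.3 in Appendix X1, with the standard error of the mean difference taken as $\sigma_d/\sqrt{n}$ for n pairs; the function name `tost_power_normal` is a hypothetical helper, and the curves in Fig. 2 may have been produced by a different calculation.

```python
from math import sqrt
from statistics import NormalDist


def tost_power_normal(true_diff, limit, sigma_d, n, alpha=0.05):
    """Approximate probability of accepting equivalence (form of Eq X1.3).

    sigma_d is the standard deviation of the paired differences; the
    standard error sigma_d / sqrt(n) is an assumption for the paired design.
    """
    norm = NormalDist()
    z = norm.inv_cdf(1 - alpha)
    se = sigma_d / sqrt(n)
    power = (norm.cdf((limit - true_diff) / se - z)
             - norm.cdf((-limit - true_diff) / se + z))
    # With very small n the expression can go negative: zero chance of acceptance
    return max(power, 0.0)


# Inputs from 7.2: E = 2 ppb, alpha = 0.05, sigma_d = 1 ppb
powers = {n: tost_power_normal(1.25, limit=2.0, sigma_d=1.0, n=n)
          for n in (10, 20, 50)}
for n, p in powers.items():
    print(f"n = {n:2d}: power at a true difference of 1.25 ppb = {p:.3f}")
```

Under these assumptions, 20 pairs give a power above 0.9 at a true difference of 1.25 ppb while 10 pairs do not, which agrees with the choice made in 7.2.1.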

FIG. 2 Power Curves for Total Organic Carbon Analyzers Comparison

7.2.2 Test results for the two instruments at each of the 20 sampling times are listed in Table 2. The current analyzer was designated as Instrument A, and the new analyzer was designated as Instrument B. The differences $d_j$ at each sampling time period were calculated and listed in Table 2 as differences in the test results of Instrument B minus Instrument A. The averages and standard deviations of the test results for each analyzer and their differences are also listed in Table 2.

7.2.2.1 The average difference $\bar{d}$ was 0.46 ppb and the standard deviation of the differences $s_d$ was 1.05 ppb with f = 19 degrees of freedom. The standard error of the average difference was:

$$s_D = \frac{1.05}{\sqrt{20}} = 0.235 \text{ ppb}$$

7.2.2.2 Note that the standard deviations of test results for each analyzer over time were about 6 ppb due to process fluctuations in a range of 37–59 ppb. The source of variation due to blocks (sampling times from the process) is eliminated in the variation of the differences by pairing the test results.

7.2.3 The 95th percentile of Student's t with 19 degrees of freedom was 1.729. Upper and lower confidence limits for the difference of means were:

$$\mathrm{UCL} = D + t\, s_D = 0.46 + (1.729)(0.235) = 0.87 \text{ ppb}$$

$$\mathrm{LCL} = D - t\, s_D = 0.46 - (1.729)(0.235) = 0.05 \text{ ppb}$$

The 90 % two-sided confidence interval on the true difference is 0.05 to 0.87 ppb and is completely contained within the equivalence interval of –2 to 2 ppb. Since 0.05 > –2 and 0.87 < 2, equivalence of the two analyzers is accepted.

TABLE 2 Data for Paired Samples Equivalence Test

                     TOC in Water, ppb
Sampling Time    Inst A    Inst B    Diff
 1               46.4      48.8       2.4
 2               44.2      43.5      –0.7
 3               52.4      53.0       0.6
 4               37.6      37.3      –0.3
 5               49.3      49.1      –0.2
 6               45.0      44.5      –0.5
 7               51.4      51.3      –0.1
 8               57.6      56.8      –0.8
 9               43.4      44.9       1.5
10               45.2      44.1      –1.1
11               59.0      58.5      –0.5
12               43.1      44.1       1.0
13               39.3      40.9       1.6
14               48.2      48.4       0.2
15               48.7      49.0       0.3
16               44.4      46.1       1.7
17               52.7      53.2       0.5
18               43.3      44.6       1.3
19               54.4      56.7       2.3
20               58.4      58.4       0.0
Average          48.20     48.66      0.46
Std Dev           6.13      5.99      1.05

8. The TOST Procedure for Statistical Analysis of Bias Equivalence

8.1 Statistical Analysis—A number of tests are conducted on a certified reference material (CRM) in a laboratory. The average of the test results is compared with the accepted reference value (ARV) for that material. Let the data be denoted as $X_i$ = the ith test result. The format is similar to that for the means equivalence example in Section 6, but the CRM substitutes for the first population, and its ARV is treated as a known constant. This assignment gives the correct sign for the test method bias.

8.1.1 The equivalence limit E, consumer's risk α, and sample sizes have been previously determined. Calculate the average, estimated bias, standard deviation, and degrees of freedom:

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \tag{18}$$

$$B = \bar{X} - \mathrm{ARV} \tag{19}$$

$$s = \sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 / (n - 1)} \tag{20}$$

$$f = (n - 1) \text{ degrees of freedom (df)} \tag{21}$$

8.1.2 Calculate the standard error of the bias:

$$s_B = s / \sqrt{n} \tag{22}$$

8.1.3 Test for Equivalence—Calculate upper and lower confidence limits:

$$\mathrm{UCL} = B + t\, s_B \tag{23}$$

$$\mathrm{LCL} = B - t\, s_B \tag{24}$$

where t is the upper 100(1–α) % percentile of the Student's t distribution with (n – 1) degrees of freedom.

If the 100(1–2α) % two-sided confidence interval on the true difference is completely contained within the equivalence limits (0 ± E), equivalently if LCL > –E and UCL < E, equivalence is accepted. Otherwise, reject equivalence.

8.2 Example for Bias Equivalence—The accepted reference value for the test material was given as 49.50 % by weight (wt%). An estimate of the repeatability precision from the method development validation was 1.5 wt%. An equivalence limit of 3.0 wt% was selected, based on the specification range for that material, at 5 % consumer risk. Thus E = 3 wt%, α = 0.05, and estimated σ = 1.5 wt% are inputs for this study.

8.2.1 Sample Size Determination—Power profiles for n = 5, 12, and 30 were generated for a set of absolute difference values ranging from 0.00 to 4.00 wt% in steps of 0.25 wt%, as shown in Fig. 3. All three curves intersect at the point (3, 0.05) as determined by the consumer's risk at the equivalence limit.

8.2.1.1 A sample size of 12 replicate assays yields a satisfactory power curve, in that the probability of accepting equivalence (power) was greater than 0.9 (a 90 % power) for a difference of 1.75 wt% or less. Therefore, there would be less than an estimated 10 % risk to the producer that such a difference would fail to support equivalence in the actual trial.

8.2.1.2 A comparison of the three power curves indicates that the n = 5 design would be underpowered, as the power falls below 0.9 at 1.0 wt%. The n = 30 design gives somewhat more power than the n = 12 design but is more costly to conduct and may not be worth the extra expenditure.
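The paired-samples calculation of 7.1 and the bias calculation of 8.1 share one form: a single column of numbers, its mean relative to a reference value, and a confidence interval built from its standard error. The sketch below applies that form to the Table 2 differences; the function name `tost_one_sample` and its `reference` argument are illustrative assumptions, and the critical value 1.729 is taken from 7.2.3.

```python
from math import sqrt
from statistics import mean, stdev


def tost_one_sample(values, limit, t_crit, reference=0.0):
    """Single-column TOST (hypothetical helper).

    Paired design: values are the differences d_j and reference is 0.
    Bias design:   values are the test results and reference is the ARV.
    t_crit is the upper 100(1 - alpha) % point of Student's t with n - 1 df.
    """
    n = len(values)
    estimate = mean(values) - reference  # D (Eq 12) or B (Eq 19)
    se = stdev(values) / sqrt(n)         # Eq 15 or Eq 22
    lcl = estimate - t_crit * se
    ucl = estimate + t_crit * se
    return {
        "estimate": estimate, "se": se, "f": n - 1, "LCL": lcl, "UCL": ucl,
        "equivalent": lcl > -limit and ucl < limit,
    }


# Differences (Instrument B minus Instrument A) from Table 2, ppb
diffs = [2.4, -0.7, 0.6, -0.3, -0.2, -0.5, -0.1, -0.8, 1.5, -1.1,
         -0.5, 1.0, 1.6, 0.2, 0.3, 1.7, 0.5, 1.3, 2.3, 0.0]

paired = tost_one_sample(diffs, limit=2.0, t_crit=1.729)
print({k: round(v, 3) if isinstance(v, float) else v for k, v in paired.items()})
```

For a bias study the same function would be called with the replicate results, `reference` set to the ARV, and the t value for n – 1 degrees of freedom.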


FIG. 3 Multiple Power Curves for Bias Example

8.2.2 Results for the twelve replicate assays are given in Table 3. The laboratory mean, the bias, the laboratory standard deviation, its degrees of freedom, and the standard error of the bias are:

$$\bar{X} = (48.5 + 51.0 + \dots + 48.9)/12 = 50.49 \text{ wt\%}$$

$$B = 50.49 - 49.50 = 0.99 \text{ wt\%}$$

$$s = \sqrt{[(48.5 - 50.49)^2 + \dots + (48.9 - 50.49)^2]/(12 - 1)} = 1.935 \text{ wt\%}$$

$$f = 12 - 1 = 11$$

$$s_B = 1.935/\sqrt{12} = 0.559 \text{ wt\%}$$

8.2.3 The 95th percentile of Student's t with 11 degrees of freedom is 1.796. Upper and lower confidence limits are:

UCL = 0.99 + (1.796)(0.559) = 1.99 wt%
LCL = 0.99 – (1.796)(0.559) = –0.01 wt%

8.2.4 Since –0.01 > –3 and 1.99 < 3, equivalence is accepted.

TABLE 3 Data for Bias Equivalence Test

Test Results
48.5    51.0    54.0    53.2    47.6    49.4
50.2    49.5    52.1    51.6    49.9    48.9

9. Procedure for Statistical Analysis of Non-Inferiority Tests Involving Means and Variances

9.1 Statistical Analysis Involving Means—The calculations for non-inferiority tests are essentially the same as for means equivalence with the following exceptions.
(1) The means being compared are from values of a performance characteristic, not necessarily the test result means.
(2) The scale for a performance characteristic is directional, one direction denoting inferiority of the modified test procedure. Thus only a single one-sided test is conducted.

9.1.1 Depending on the experimental design that was used, calculate the upper (UCL) and lower (LCL) confidence limits on the difference between means. For the Two Independent Samples Design, use the calculations in 6.1. For the Paired Samples Design, use the calculations in 7.1.

9.1.2 For a performance characteristic where "higher is better", accept noninferiority for the modified test procedure with respect to the current test procedure when LCL > –E; otherwise denote inferiority for the modified test procedure.

9.1.3 For a performance characteristic where "lower is better", accept noninferiority for the modified test procedure with respect to the current test procedure when UCL < E; otherwise denote inferiority for the modified test procedure.

9.2 Example—Non-Inferiority Test for Sensitivity of Detection—Environmental testing for microbial contamination by the current (compendial) test method involves counting microbial colony-forming units (CFU) after plating and incubating the sample for a period of days. Newer rapid test methods give a result in shorter time and so have benefits in timeliness even though they might have slightly lower detection sensitivity than the compendial method. Therefore, the performance characteristic, sensitivity, is "higher is better" and the non-inferiority test is based on LCL > –E.

9.2.1 In this example the acceptance criterion is based on a ratio rather than on a difference. The industry standard USP <1223> stipulates that "The alternate method should provide an estimate of viable microorganisms not less than 70 % of the estimate provided by the traditional method …", thus the noninferiority limit for the ratio of the CFU counts (rapid/compendial) would be –E = 0.7. For this situation, a logarithmic transformation gives a natural scale for this acceptance criterion in terms of a mean difference. Let $\bar{X}_1$ = the average count by the rapid method and let $\bar{X}_2$ = the average count by the compendial method. In the log metric, the log of the ratio is equal to the difference in the log means, thus $\log_{10}(\bar{X}_1/\bar{X}_2) = \log_{10}(\bar{X}_1) - \log_{10}(\bar{X}_2) = D$. Therefore, the equivalence limit –E is equal to $\log_{10}(0.7) = -0.1549$ in the log metric.
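The conversion in 9.2.1 and the one-sided decision rules of 9.1.2 and 9.1.3 can be sketched as follows. The helper names are hypothetical, and the two confidence limits passed to the rule at the end are made-up values used only to show both outcomes.

```python
from math import log10


def log_limit_from_ratio(min_ratio):
    """Lower limit (-E) in the log10 metric for a minimum acceptable ratio."""
    return log10(min_ratio)


def accept_noninferiority(conf_limit, limit, higher_is_better):
    """9.1.2: accept when LCL > -E ("higher is better").
    9.1.3: accept when UCL < E ("lower is better").
    limit is the positive quantity E; conf_limit is the LCL or UCL."""
    if higher_is_better:
        return conf_limit > -limit
    return conf_limit < limit


lower_limit = log_limit_from_ratio(0.7)  # -E in the log metric
E = -lower_limit
print(f"-E = {lower_limit:.4f}")

# Hypothetical lower confidence limits on the log difference
print(accept_noninferiority(-0.10, E, higher_is_better=True))
print(accept_noninferiority(-0.20, E, higher_is_better=True))
```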

9.2.2 Eighteen independent bioassays were conducted at the same time, nine each by the compendial and rapid test methods, sampling from a single microorganism suspension having approximately 50 CFU. This was a Two Independent Samples Design. The 6.1 calculations were made on the log-transformed count data using an equivalence limit of –E = –0.1549, and these calculations are summarized in Table 4. The average recovery by the rapid method was lower than the compendial method (50.4 CFU versus 54.3 CFU) with a ratio of 0.928, or a 7.2 % reduction. The lower confidence limit (LCL) on the log difference D was –0.0828, which was higher than the equivalence limit –0.1549, and thus non-inferiority was supported.

9.2.3 Note that the use of a normal distribution for log counts was justified in this situation because the count range is small. This was confirmed by a normality test on each source of nine log-transformed counts (not shown here).

9.2.4 Fig. 4 shows a post-facto power curve based on n = 9, α = 0.05, and σ = 0.06 log CFU. The curve intersects the point (–0.1549, 0.05) confirming that the power is 5 % at the given equivalence limit. Power is above 90 % at a log CFU Ratio down to near –0.1 (about a 20 % reduction in sensitivity) for this design. This supports the sample size that was used for this non-inferiority test.

9.3 Statistical Analysis Involving Variances—The test for non-inferiority of precision is conducted using variances as the test statistic. Non-inferiority tests are used for variances because precision is a performance characteristic in which "smaller is better". The statistical procedure is based on a one-sided F test. The proper design is the Two Independent Samples Design, so use the calculations in 6.1, Eq 2-4, for the variances.

9.3.1 Calculate the ratio R of the variances (modified/current) and its upper confidence limit, where $s_1^2$ is the variance estimate of the current procedure with $f_1$ degrees of freedom and $s_2^2$ is the variance estimate of the modified procedure with $f_2$ degrees of freedom:

$$R = s_2^2 / s_1^2 \tag{25}$$

The upper confidence limit for the true variance ratio $\mathcal{R} = \sigma_2^2/\sigma_1^2$ is:

$$\mathrm{UCL}_R = R\, F_{1-\alpha} \tag{26}$$

where $F_{1-\alpha}$ is the upper 100(1–α)th percentile of the F distribution with $f_1$ and $f_2$ degrees of freedom (see X1.5.3).

9.3.2 Test for Non-Inferiority of Population 2 Precision—Because precision stated inversely as variance is a performance characteristic where "lower is better", accept noninferiority for the modified test procedure with respect to the current test procedure when $\mathrm{UCL}_R < E$; otherwise denote inferiority for the modified test procedure.

9.3.3 The needed sample sizes for variance tests will be much larger than those for means. It will usually be difficult, if not impossible, to generate 30 or more test results at the same time by each test method under repeatability conditions. This means that the tests will be conducted under intermediate precision conditions using a set of control samples that are homogeneous and stable. Fig. 5 shows power curves for equal sample sizes n = 31, 51, and 101 (degrees of freedom, f = 30, 50, and 100) with α = 0.05 and E = 4. A power of 0.8 can be attained at $\mathcal{R}$ = 1.6, 2.0, and 2.4, respectively. Note that the three power profile curves intersect at the point (4, 0.05).

9.3.4 If a control sample is unavailable, an alternate design would be to run duplicate tests by each test method on a series of routine samples. Each duplicate will provide a one degree of freedom estimate of test variance under repeatability conditions. These variances can then be pooled to obtain a repeatability estimate for each test method.

TABLE 4 Data and Calculations for Non-inferiority Test on Microbial Recovery

                 Counts, CFU         Log Counts
                 Comp     Rapid      Comp      Rapid      Equation Number
Data             53       59         1.7243    1.7709
                 62       53         1.7924    1.7243
                 61       41         1.7853    1.6128
                 43       60         1.6335    1.7782
                 54       58         1.7324    1.7634
                 47       47         1.6721    1.6721
                 66       46         1.8195    1.6628
                 54       43         1.7324    1.6335
                 49       47         1.6902    1.6721

n                9        9          9         9
Average          54.3     50.4       1.7313    1.6989     Eq 1
Std Dev          7.52     7.21       0.0605    0.0620     Eq 3

Calculation                               Value       Equation Number
Degrees of Freedom, f                     8, 8        Eq 4
Pooled Standard Deviation                 0.0612      Eq 5
Degrees of Freedom                        16          Eq 6
Difference (Rapid-Comp)                   –0.0325     Eq 7
Standard Error of Difference              0.0289      Eq 8
95 % Confidence Limit:
  Student's t, f = 16, 95th Percentile    1.746
  Lower Confidence Limit                  –0.0828     Eq 10
Equivalence Limit                         –0.1549
Non-Inferiority Test                      Pass
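The Table 4 calculations can be retraced from the raw counts. The sketch below is an illustration under stated assumptions: the variable names are not from the practice, the t value 1.746 is copied from Table 4, and unrounded intermediate values are carried, so the last printed digit may differ slightly from the tabulated figures.

```python
from math import log10, sqrt
from statistics import mean, variance

# Counts (CFU) from Table 4
comp = [53, 62, 61, 43, 54, 47, 66, 54, 49]
rapid = [59, 53, 41, 60, 58, 47, 46, 43, 47]

T_CRIT = 1.746            # 95th percentile of Student's t, f = 16 (Table 4)
LOWER_LIMIT = log10(0.7)  # -E in the log metric (9.2.1)

log_comp = [log10(c) for c in comp]
log_rapid = [log10(r) for r in rapid]

n1, n2 = len(log_comp), len(log_rapid)
f = n1 + n2 - 2
# Pooled standard deviation of the log counts
s_p = sqrt(((n1 - 1) * variance(log_comp)
            + (n2 - 1) * variance(log_rapid)) / f)
d = mean(log_rapid) - mean(log_comp)  # difference (Rapid - Comp) of log means
s_d = s_p * sqrt(1 / n1 + 1 / n2)     # standard error of the difference
lcl = d - T_CRIT * s_d                # only the lower limit is needed

noninferior = lcl > LOWER_LIMIT       # "higher is better" rule of 9.1.2
print(f"D = {d:.4f}, s_p = {s_p:.4f}, s_D = {s_d:.4f}, LCL = {lcl:.4f}")
print("Non-inferiority test:", "Pass" if noninferior else "Fail")
```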


FIG. 4 Power Curve for Microbial Detection Example

FIG. 5 Power Curves for Variance Non-Inferiority Tests where E = 4

10. Keywords
10.1 bias equivalence; confidence interval; equivalence;
equivalence limit; means equivalence; non-inferiority; two
one-sided tests (TOST) procedure


APPENDIXES

(Nonmandatory Information)

X1. STATISTICAL HYPOTHESIS TESTS FOR EQUIVALENCE

X1.1 Two One-Sided Tests (TOST) Procedure (1)

X1.1.1 Data from two populations (sources) are assumed to arise independently from normally distributed populations having distinct means, denoted as µ1, µ2, and a common standard deviation, denoted as σ. The TOST procedure sets up two null hypotheses (H0) and corresponding alternate hypotheses (Ha) on the difference between the two population means as follows:

                          Hypothesis 1            Hypothesis 2
Null hypothesis           H01: µ2 − µ1 ≥ E        H02: µ2 − µ1 ≤ −E
Alternative hypothesis    Ha1: µ2 − µ1 < E        Ha2: µ2 − µ1 > −E

The value E is termed the equivalence limit, representing the worst case difference between the two means.

X1.1.2 The TOST procedure is carried out using the data sampled from the two populations, as illustrated in 6.1 with an example. A one-sided t test at the α significance level tests each of the two null hypotheses.

Let $D = \bar{X}_2 - \bar{X}_1$ and $s_D = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$

where $s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{(n_1 + n_2 - 2)}}$, with $f = (n_1 + n_2 - 2)$ degrees of freedom.

The t statistics are $t_1 = (E - D)/s_D$ and $t_2 = (E + D)/s_D$ for hypotheses 1 and 2, respectively. Both null hypotheses are rejected when $t_1 > t_{1-\alpha,f}$ and $t_2 > t_{1-\alpha,f}$, where $t_{1-\alpha,f}$ is the upper (1–α)th quantile of the Student's t distribution with f degrees of freedom. If both hypotheses are rejected, then it is asserted that $-E < \mu_1 - \mu_2 < E$ and the two sources are said to be equivalent; otherwise, the two data sources are deemed non-equivalent.

X1.1.3 The TOST procedure is operationally identical to constructing a two-sided 100(1–2α) % confidence interval on the difference between two means (2). If the confidence interval is completely contained within the interval (–E, E) then equivalence is accepted. The interval (–E, E) is termed the equivalence interval.

X1.1.4 It is strongly recommended (5) that the sample sizes from each population be equal to minimize the effect of a departure from equal population variances. If the variances differ greatly the standard error of the difference may be calculated as:

$$s_D = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \tag{X1.1}$$

With approximate degrees of freedom:

$$v = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{\left[\dfrac{(s_1^2/n_1)^2}{(n_1 - 1)} + \dfrac{(s_2^2/n_2)^2}{(n_2 - 1)}\right]} \tag{X1.2}$$

In many statistical software packages this calculation is used in the option "assume unequal variances" for a t test. The resulting degrees of freedom are bounded between MIN($n_1$ – 1, $n_2$ – 1) and $n_1 + n_2$ – 2.

X1.2 Decision Errors and Risks

X1.2.1 In any statistical hypothesis testing situation a decision is made to either accept or reject the null hypothesis based on outcome of the procedure. Since the data are subject to variation, this will create uncertainty in the final decision. There are two kinds of errors associated with the final decision:
(1) Rejecting the null hypothesis when it is true (Type I error), and
(2) Not rejecting the null hypothesis when it is false (Type II error).

X1.2.2 For the equivalence application of a hypothesis test, the null hypothesis is that the two populations are not equivalent, so the Type I error is declaring equivalence when the two populations are truly not equivalent. The Type I error is considered a consumer's risk, since acceptance of a non-equivalent testing process will affect customers (patients, regulators, etc.) by creating erroneous test results in release of product and other quality management activities. This risk is set by choosing the significance level of the two hypothesis tests in the TOST procedure, so that the consumer's risk is directly controlled.

X1.2.3 The Type II error is failing to declare equivalence when the two populations are truly equivalent. The Type II error is considered a producer's risk, since this will create additional investigational work to make a desired improvement. This risk is controlled by choosing an adequate sample size to be taken from each population by consideration of power profiles from various sample sizes.

X1.2.4 The table below summarizes the four situations that may occur for a given TOST procedure.

                                  Populations are truly:
TOST declares that:               Equivalent             Not Equivalent
Populations are equivalent        Decision is correct    Type I Error
Populations are not equivalent    Type II Error          Decision is correct

X1.3 Criticism of the Use of the Conventional t Test for Equivalence Testing

X1.3.1 In the conventional two sample t test a single hypothesis test is set up as follows:

Null hypothesis           H0: µ1 − µ2 = 0
Alternative hypothesis    Ha: µ1 − µ2 ≠ 0

The null hypothesis is rejected if the two-sided confidence interval on the difference between the population means excludes zero and is not rejected if the confidence interval includes zero. If used for equivalence testing, equivalence would be rejected if the null hypothesis was rejected. This is

operationally the same as rejecting the null hypothesis if the two-sided confidence interval on the mean does not include zero.

X1.3.2 The Type I Error for the t test is the error of falsely declaring a non-zero difference, or the error of falsely declaring non-equivalence, which is the producer's risk. As hypothesis tests are set up to directly control the Type I error (often at the 0.05 significance level) the conventional t test is not directly protecting the customer in the equivalence application. The consumer's risk is indirectly controlled by the sample sizes selected.

X1.3.3 If the variances of the population means are small, either reflecting a precise test method, large sample sizes, or both, the confidence interval on the difference may not include zero, thus rejecting equivalence, even for small differences that are not of scientific importance. On the other hand, if the variances of the population means are large, the confidence interval on the difference may include zero, but may be extremely wide, thus masking critical differences. For these reasons, the conventional t test is not recommended for equivalence testing.

X1.4 Equivalence Testing for Bias

X1.4.1 The TOST procedure may also be used for bias equivalence testing. In this situation population mean µ1 is the accepted reference value (ARV) with zero variance. The experiment consists of comparing µ2 with the ARV. The population mean is re-designated as µ and the sample mean and variance calculated for the single data set are used for estimating the bias, µ – ARV, and its confidence limits for testing against the equivalence limit, or worst-case bias. The only change from the two population case is the calculation of the standard error and its degrees of freedom.

X1.5 Equivalence Testing for Non-Inferiority

X1.5.1 Non-inferiority in this practice compares a modified testing process to the current process with respect to a performance characteristic, where the acceptance criterion is stated in terms of a difference in means or a ratio of variances. The statistical procedure for non-inferiority testing uses a single one-sided hypothesis test where the null hypothesis states that the modified testing process is inferior to the current process. If the null hypothesis is rejected, the modified process is declared non-inferior to the current process for that performance characteristic.

X1.5.2 For performance characteristics comparing means, the hypothesis sets in X1.1.1 are used with µ1 defined as the mean of the current process and µ2 defined as the mean of the modified process. For an acceptance criterion where "lower is better" use Hypothesis 1, and for an acceptance criterion where "higher is better" use Hypothesis 2. The TOST procedure will supply the necessary one-sided hypothesis test calculations.

X1.5.3 For performance characteristics comparing variances, with $\sigma_1^2$ as the variance of the current process and $\sigma_2^2$ as the variance of the modified process, the hypothesis set is $H_0: \sigma_2^2 \ge E\,\sigma_1^2$ and $H_A: \sigma_2^2 < E\,\sigma_1^2$. The equivalence limit E is set in the form of the ratio $\mathcal{R} = \sigma_2^2/\sigma_1^2$ that represents the worst case increase of variance.

X1.5.3.1 The ratio $R = s_2^2/s_1^2$, using $s_1^2$ as the variance estimate of the current procedure with $f_1$ degrees of freedom and $s_2^2$ as the variance estimate of the modified procedure with $f_2$ degrees of freedom, is the test statistic for the one-sided F test. The acceptance criterion for non-inferiority is $R\,F_{1-\alpha} < E$, where $F_{1-\alpha}$ is the upper 100(1–α)th percentile of the F distribution with $f_1$ and $f_2$ degrees of freedom.

X1.5.4 A reference for non-inferiority procedures is Rothmann et al. (6). Although their context is directed to clinical trials for pharmaceuticals, many numerical examples are included, and these are easily translatable to test method evaluation.

X1.6 Power Profiles

X1.6.1 The power function of the means equivalence test has been examined (7, 8) where the emphasis is on finding a sample size n for a given value of the true difference in means. Power functions involving the non-central and central Student's t distributions were considered, along with incorporating an upper confidence limit on the variance estimate with a normal distribution power function. The normal distribution approximation should be adequate when a strong estimate of σ is used (or an upper confidence limit on σ, if a more conservative estimate is desired). The normal approximation of power, given a true difference ∆, for equal sample sizes n is:

$$\text{Power} = \Phi\left(\frac{E - \Delta}{\sigma_D} - z_{1-\alpha}\right) - \Phi\left(\frac{-E - \Delta}{\sigma_D} + z_{1-\alpha}\right) \tag{X1.3}$$

where:
Φ(•) = the standard normal cumulative distribution function,
∆ = µ1 – µ2, the true difference parameter,
$\sigma_D = \sigma\sqrt{2/n}$, the standard error of the test statistic D, and
$z_{1-\alpha}$ = the (1–α)th percentile of the standard normal distribution.

If the sample sizes are too small, the upper confidence limit on –E may exceed the lower confidence limit on E, and there will be a zero chance of accepting equivalence.

X1.6.2 The power function for the non-inferiority test for means depends on the direction of inferiority and uses the appropriate part of Eq X1.3.

For a performance characteristic where "higher is better" use:

$$\text{Power} = 1 - \Phi\left(\frac{-E - \Delta}{\sigma_D} + z_{1-\alpha}\right) \tag{X1.4}$$

For a performance characteristic where "lower is better" use:

$$\text{Power} = \Phi\left(\frac{E - \Delta}{\sigma_D} - z_{1-\alpha}\right) \tag{X1.5}$$

X1.6.3 The power function, plotted against values of $\mathcal{R} = \sigma_2^2/\sigma_1^2$, of the non-inferiority test for variances uses the F distribution:

$$\text{Power} = 1 - \mathcal{F}\left(\mathcal{R}\,F_{1-\alpha}/E\right) \tag{X1.6}$$

where:
$\mathcal{F}$(•) = the cumulative F distribution function with $f_1$ and $f_2$ degrees of freedom,
E = the equivalence limit expressed as the hypothesized ratio $\sigma_2^2/\sigma_1^2$, and
$F_{1-\alpha}$ = the upper 100(1–α)th percentile of the F distribution with $f_1$ and $f_2$ degrees of freedom.

X1.7 Alternative Designs

X1.7.1 Designs conducted using intermediate precision conditions may involve other sources of variation, thus making the analysis more complicated and possibly raising side issues, such as differences among operators or instruments within laboratories (9, 10).

X2. SPREADSHEET FOR POWER PROFILE CURVES

X2.1 Power Profile for Means or Bias Equivalence Using a Single Sample or Two Independent Samples of Equal Sample Size

X2.1.1 Data Entry—A spreadsheet example for generating power profiles is shown in Fig. X2.1. See Section X1.6 for background information. Five input variables are entered into cells B3–B7 as follows:
• In B3, enter the estimate of the standard deviation of the test results, σ
• In B4, enter the consumer risk, α
• In B5, enter 1 for a single sample design or 2 for a two independent samples design
• In B6, enter the equivalence limit, E
• In B7, enter the sample size, n
• In A10, downward enter a range of true differences starting with zero and exceeding the equivalence limit, E, and adjust the horizontal axis of the graph accordingly.

X2.1.2 Calculation—Cells E3 and E4 list results for intermediate calculations of Zα and σD. The power for a given true difference is calculated from E, ∆, Zα, and σD, and the function equation for this appears in Row 24. The calculated power curve values will appear in B10 downward.

X2.1.3 Graph—The graph plots the power on the vertical axis versus the absolute true difference on the horizontal axis. The curve is anchored at the point (E, α). For different ranges for the true difference the axes may have to be altered by the user.

X2.2 Disclaimer—This spreadsheet example is not supported by ASTM, and the user of this standard is responsible for its use. For questions pertaining to use of this spreadsheet example please contact Subcommittee E11.20.
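The spreadsheet calculation described in X2.1 can also be sketched in code. The version below is an assumption-laden illustration, not the ASTM spreadsheet: it uses the normal approximation of Eq X1.3, and it assumes the intermediate standard error is σ√(design/n), where design is 1 for a single sample and 2 for two independent samples of equal size. The function name `power_profile` and its argument names are hypothetical; the inputs shown are those of the bias example in 8.2.

```python
from math import sqrt
from statistics import NormalDist


def power_profile(sigma, alpha, design, limit, n, true_diffs):
    """Return (true difference, power) rows, mirroring columns A and B.

    design: 1 = single sample, 2 = two independent samples of equal size n.
    The standard error sigma * sqrt(design / n) is an assumed reading of
    the spreadsheet's intermediate cell for sigma_D.
    """
    norm = NormalDist()
    z = norm.inv_cdf(1 - alpha)
    sigma_d = sigma * sqrt(design / n)
    rows = []
    for delta in true_diffs:
        power = (norm.cdf((limit - delta) / sigma_d - z)
                 - norm.cdf((-limit - delta) / sigma_d + z))
        rows.append((delta, max(power, 0.0)))  # negative values mean zero power
    return rows


# Bias example inputs from 8.2: sigma = 1.5 wt%, alpha = 0.05, E = 3 wt%, n = 12
grid = [0.25 * k for k in range(17)]  # 0.00 to 4.00 wt% in steps of 0.25
profile = power_profile(sigma=1.5, alpha=0.05, design=1, limit=3.0, n=12,
                        true_diffs=grid)
for delta, power in profile:
    print(f"{delta:4.2f}  {power:.3f}")
```

The computed curve passes through the point (E, α), as stated in X2.1.3. Values elsewhere on the curve depend on the approximation used and may not match figures read from a plotted curve.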


FIG. X2.1 Power Profile Spreadsheet Example for Means

REFERENCES

(1) Schuirmann, D. J., "A Comparison of the Two One-sided Tests Procedure and the Power Approach for Assessing the Equivalence of Average Bioavailability," Journal of Pharmacokinetics and Biopharmaceutics, Vol 15, 1987, pp. 657–680.
(2) Westlake, W. J., "Response to T. B. L. Kirkwood: Bioequivalence Testing – A Need to Rethink," Biometrics, Vol 37, 1981, pp. 589–594.
(3) Limentani, G. B., Ringo, M. C., Ye, F., Bergquist, M. L., and McSorley, E. O., "Beyond the t-Test: Statistical Equivalence Testing," Analytical Chemistry, June 1, 2005, pp. 221A–226A.
(4) Chambers, D., Kelly, G., Limentani, G., Lister, A., Lung, K. R., and Warner, E., "Analytical Method Equivalency – An Acceptable Analytical Practice," Pharmaceutical Technology, September 2005, pp. 64–80.
(5) Welch, B. L., "The Significance of the Difference Between Two Means When the Population Variances are Unequal," Biometrika, Vol 29, 1938, pp. 350–362.
(6) Rothmann, M. D., Wiens, B. L., and Chan, S. F. I., Design and Analysis of Non-Inferiority Trials, Chapman & Hall/CRC, Taylor & Francis Group, Boca Raton, FL, 2012.
(7) Bristol, D. R., "Probabilities and Sample Sizes for the Two One-sided Tests Procedure," Communications in Statistics – Theory Methods, Vol 22, 1993, pp. 1953–1961.
(8) Stein, J., and Doganaksoy, N., "Sample Size Considerations for Assessing the Equivalence of Two Process Means," Quality Engineering, Vol 12, No. 1, 1999, pp. 105–110.
(9) Kringle, R., Khan-Malek, R., Snikeris, F., Munden, P., Agut, C., and Bauer, M., "A Unified Approach for Design and Analysis of Transfer Studies for Analytical Methods," Drug Information Journal, Vol 35, 2001, pp. 1271–1288.
(10) Schwenke, J., and O'Connor, D., "Design and Analysis of Analytical Method Transfer Studies," Journal of Pharmaceutical and BioSciences, Vol 18, No. 5, 2008, pp. 1013–1033.

ASTM International takes no position respecting the validity of any patent rights asserted in connection with any item mentioned
in this standard. Users of this standard are expressly advised that determination of the validity of any such patent rights, and the risk
of infringement of such rights, are entirely their own responsibility.

This standard is subject to revision at any time by the responsible technical committee and must be reviewed every five years and
if not revised, either reapproved or withdrawn. Your comments are invited either for revision of this standard or for additional standards
and should be addressed to ASTM International Headquarters. Your comments will receive careful consideration at a meeting of the
responsible technical committee, which you may attend. If you feel that your comments have not received a fair hearing you should
make your views known to the ASTM Committee on Standards, at the address shown below.

This standard is copyrighted by ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959,
United States. Individual reprints (single or multiple copies) of this standard may be obtained by contacting ASTM at the above
address or at 610-832-9585 (phone), 610-832-9555 (fax), or service@astm.org (e-mail); or through the ASTM website
(www.astm.org). Permission rights to photocopy the standard may also be secured from the Copyright Clearance Center, 222
Rosewood Drive, Danvers, MA 01923, Tel: (978) 646-2600; http://www.copyright.com/
