Response Surface Methodology
2014-013
By Jack P.C. Kleijnen
14 February, 2014
1 Introduction
see the classic article by Box and Wilson (1951), the overview of RSM during the period 1966-1988 by Myers et al. (1989), the recent third edition of the popular handbook by Myers et al. (2009), and recent RSM software on the Web. RSM in simulation was discussed in the monograph by Kleijnen (1975). One of the first case studies on RSM in simulation is Van den Bogaard and Kleijnen (1977), who simulated a computer center with two servers and three priority classes, namely small, medium, and large jobs; they estimated the 90% quantiles of the waiting times per class for different class limits, and applied RSM to find the optimal class limits. RSM in simulation is also discussed by Barton and Meckesheimer (2006), Bartz-Beielstein (2006), Law (2014), and Rosen et al. (2008). Google gave more than two million results for the term "Response Surface Methodology" on February 4, 2014. Unfortunately, RSM (unlike some other search heuristics) has not yet been implemented as an add-on to commercial simulation software.
RSM treats the simulation model as a black box; i.e., RSM observes the input/output (I/O) of the simulation model, but not the internal variables and specific functions implied by the simulation's computer modules. RSM is a sequential heuristic; i.e., it uses a sequence of local experiments that may lead to the optimum input combination. Note that an input combination is also called a point or a scenario. RSM uses design of experiments (DOE) and the concomitant linear regression analysis. Though RSM is only a heuristic, it has gained a good track record, as we shall indicate in the next sections. Moreover, practitioners may not be interested in convergence proofs, because realistic experiments take so much time that large sample sizes are impossible; practitioners may be more interested in finding solutions that are better than the current one (a French expression claims that "the best is the enemy of the better").
More specifically, RSM uses a sequence of local metamodels (approximations) that are first-order polynomials in the inputs; once the optimum seems close, RSM augments the latest first-order polynomial to a second-order polynomial. Notice that terminology may be misleading, because many authors use the term "response surface" instead of "metamodel"; an example is Rikards and Auzins (2002). Note that RSM may be combined with white-box methods that give estimated gradients, which may improve RSM; see Qu and Fu (2012).
We focus on discrete-event simulation, though RSM has also been applied to deterministic simulation and the simulation of sets of random nonlinear difference equations. By definition, random simulation (including discrete-event simulation) uses pseudorandom numbers (PRN). We do not discuss deterministic simulation with numeric noise.

Furthermore, we assume that the inputs are continuous variables. We focus on simulation that has a mean (expected value) as "the" response (output). Nevertheless, we have also applied RSM to outputs that are quantiles; see again Van den Bogaard and Kleijnen (1977). The response may also be a probability (e.g., the overflow probability in a queueing system).
2 RSM: Basics
As we have already said, the basics of RSM were developed in real-life experiments for the simplest type of optimization problems; namely, minimize the expected value of a single output, with continuous inputs and without any constraints:

$$\min_{\mathbf{z}} E(w_0 \mid \mathbf{z}) \qquad (1)$$

Locally, RSM fits a first-order polynomial in the $k$ inputs, assuming white noise, so least squares (LS) gives the best linear unbiased estimator (BLUE) of $\boldsymbol{\beta}$:

$$\hat{\boldsymbol{\beta}} = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{w} \qquad (3)$$

where $\mathbf{Z}$ denotes the $N \times (k+1)$ matrix that is determined by the R-III design and the $m_i$ replications of point $i$, and $\mathbf{w}$ denotes the vector with $N$ outputs; obviously, $N = \sum_{i=1}^{n} m_i$, so $\mathbf{Z}$ has $m_i$ identical rows where the first element of each row is 1 and corresponds with the intercept $\beta_0$.
To select the $n$ input combinations needed to estimate these $k+1$ effects, RSM uses a R-III design with $n \ge k + 1$. In classic R-III designs this $n$ is a multiple of four; e.g., if $k = 7$, then Table 1 gives a $2^{7-4}$ fractional factorial design in the standardized inputs $x_j$, so - means -1 and + means 1; the last column has the so-called generator 7 = 1.2.3, which means $x_{i;7} = x_{i;1} x_{i;2} x_{i;3}$; the generators in the columns 4, 5, and 6 are defined analogously. The design is "balanced"; i.e., each column has $n/2$ values equal to -1. Moreover, all pairs of columns are orthogonal. If $n$ is not a power of 2, then the R-III design is called a Plackett-Burman design, and is tabulated in many publications and provided by DOE software; e.g., Kleijnen (2014) gives the design for $k = 11$ inputs and $n = 12$ combinations.
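The following sketch illustrates how such a R-III design can be generated; it assumes the standard generators 4 = 1.2, 5 = 1.3, 6 = 2.3, and 7 = 1.2.3 for the $2^{7-4}$ design (only the last generator is stated explicitly above), and it verifies the balance and orthogonality properties just mentioned.

```python
# Sketch of a 2^(7-4) resolution-III design, assuming the standard generators
# 4 = 1.2, 5 = 1.3, 6 = 2.3, and 7 = 1.2.3 (only the last one is stated in the text).
import numpy as np
from itertools import product

# Full 2^3 design in the standardized inputs x1, x2, x3 (levels -1 and +1).
base = np.array(list(product([-1, 1], repeat=3)))          # shape (8, 3)
x1, x2, x3 = base[:, 0], base[:, 1], base[:, 2]

# Generated columns for x4, ..., x7.
design = np.column_stack([x1, x2, x3,
                          x1 * x2, x1 * x3, x2 * x3, x1 * x2 * x3])

# Balance: each column has n/2 = 4 values equal to -1.
assert all((design == -1).sum(axis=0) == 4)
# Orthogonality: all pairs of columns have zero inner product.
assert np.allclose(design.T @ design, 8 * np.eye(7))
print(design)
```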
Unfortunately, there are no general guidelines for determining the appropriate size of the local area in each step; again, intuition and prior knowledge are important. Chang et al. (2013), however, decide on the size of the local area using a so-called trust region; also see Section 3.
To decide on the next subarea, RSM follows the steepest descent path; e.g., if the estimated first-order effects are such that $\hat\beta_1 \gg \hat\beta_2 > 0$, then RSM decreases input $z_1$ much more than input $z_2$. Unfortunately, the steepest-descent method is scale dependent; i.e., linear transformations of the inputs affect the search direction; see Myers et al. (2009). In Section 4 we shall present a scale-independent variant.

The steepest descent technique does not quantify the step size along its path. The analysts may therefore try some intuitively selected value for the step size. If that value yields an output that is significantly higher instead of lower, then they may reduce the step size. Otherwise, they take one more step in the current steepest descent direction. In Section 5 we shall present a more sophisticated mathematical procedure for selecting the step size.
After a number of steps along the steepest descent path, the output will increase instead of decrease, because the first-order polynomial is only a local approximation of the true I/O function. When such deterioration occurs, RSM observes the $n > k$ combinations specified by a R-III design centered around the best point found so far; i.e., RSM may use the same design as in step 2 (see Table 1), but translate the standardized inputs $x_j$ into different values for the original inputs $z_j$. The best combination found so far may be one of the corner points of the design; see again Figure 1. Next, RSM estimates the first-order effects in the new local polynomial approximation, using LS; see (3). And so the search continues.
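A minimal sketch of one such local iteration may clarify these steps; the function `simulate` and the toy response below are assumptions for illustration only, not the simulation models discussed in this chapter. The sketch fits the first-order polynomial by LS as in (3) and then takes one step along the estimated steepest descent path with an intuitively chosen step size.

```python
# Minimal sketch of one steepest-descent step of RSM (notation as in (3));
# `simulate` is a hypothetical user-supplied function returning one noisy output.
import numpy as np

def fit_first_order(Z, w):
    """LS estimate (3): Z is the N x (k+1) matrix (first column all ones)."""
    beta_hat, *_ = np.linalg.lstsq(Z, w, rcond=None)
    return beta_hat                  # beta_hat[0] = intercept, beta_hat[1:] = gradient

def steepest_descent_step(z_best, beta_hat, step_size):
    """Move from the best point so far against the estimated gradient."""
    return z_best - step_size * beta_hat[1:]

# Example usage with a toy "simulation" (an assumption, not the paper's model):
rng = np.random.default_rng(1)
simulate = lambda z: (z[0] - 3) ** 2 + (z[1] - 1) ** 2 + rng.normal(scale=0.1)

local_design = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1]])  # 2^2 design (standardized)
center, half_width = np.array([0.0, 0.0]), 1.0
points = center + half_width * local_design
m = 5                                                           # replications per point
Z = np.column_stack([np.ones(len(points) * m), np.repeat(points, m, axis=0)])
w = np.array([simulate(z) for z in Z[:, 1:]])
beta_hat = fit_first_order(Z, w)
z_new = steepest_descent_step(center, beta_hat, step_size=0.5)
```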
It is intuitively clear that the plane implied by the most recent local first-order polynomial cannot adequately represent a hill top when searching to maximize a function or, equivalently, minimize (1). So in the neighborhood of the optimum, a first-order polynomial shows serious lack of fit. A popular and simple diagnostic measure is the coefficient of determination $R^2$. A related diagnostic uses an F statistic to test whether all estimated first-order effects, and hence the gradient, are zero. These diagnostics are given by all modern regression software packages. Instead of these diagnostics, RSM might use "cross-validation". More details including references are given by Kleijnen (2014).
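As an illustration of these diagnostics, the following sketch computes $R^2$ and the F statistic for the hypothesis that all first-order effects (and hence the gradient) are zero, from the local regression data Z and w used in (3); this is a generic regression computation, not code from any RSM package.

```python
# Sketch of the F test for H0: all first-order effects (the gradient) are zero,
# computed from the local regression of the N outputs w on Z (as in (3)).
import numpy as np
from scipy import stats

def gradient_f_test(Z, w):
    N, k_plus_1 = Z.shape
    k = k_plus_1 - 1
    beta_hat = np.linalg.lstsq(Z, w, rcond=None)[0]
    fitted = Z @ beta_hat
    sse = np.sum((w - fitted) ** 2)                  # residual sum of squares
    ssr = np.sum((fitted - w.mean()) ** 2)           # regression sum of squares
    r_squared = ssr / np.sum((w - w.mean()) ** 2)
    f_stat = (ssr / k) / (sse / (N - k - 1))
    p_value = stats.f.sf(f_stat, k, N - k - 1)
    return r_squared, f_stat, p_value
```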
If the most recently fitted first-order polynomial turns out to be inadequate, then RSM fits a second-order polynomial:

$$y = \beta_0 + \sum_{j=1}^{k} \beta_j z_j + \sum_{j=1}^{k} \sum_{j'=j}^{k} \beta_{j;j'} z_j z_{j'} + e. \qquad (4)$$
To estimate the $q = 1 + 2k + k(k-1)/2$ coefficients of this polynomial, RSM uses a CCD with $n > q$ combinations. More precisely, a CCD starts with a resolution-V (R-V) design, which by definition gives unbiased estimators of the intercept, the $k$ first-order effects, and the $k(k-1)/2$ cross-products or two-factor interactions in (4). For example, if $k = 7$, then the $2^{7-4}$ design in Table 1 is replaced by a $2^{7-1}$ design with the generator 7 = 1.2.3.4.5.6, so 64 combinations are used to estimate 29 effects. A CCD augments this R-V design such that the purely quadratic effects $\beta_{j;j}$ can also be estimated. More specifically, a CCD adds the central point and $2k$ axial points that form a star design, where, for the standardized factors, the central point is $(0, \ldots, 0)'$ and the "positive" axial point for factor $j$ is $x_j = c$ with all other $(k-1)$ factors fixed at the center, so $x_{j'} = 0$ with $j' = 1, \ldots, k$ and $j' \ne j$; the "negative" axial point for factor $j$ is $x_j = -c$ and $x_{j'} = 0$. Kleijnen (2014) gives a detailed discussion of CCDs, including more efficient so-called "saturated" designs; by definition, the latter designs have $n = q$. Using this fitted second-order polynomial, RSM estimates the optimal values of the inputs by straightforward differentiation or by more sophisticated canonical analysis; see Myers et al. (2009).
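A small sketch may clarify the CCD and the optimization by differentiation; it uses $k = 2$ for readability, an axial distance $c = \sqrt{2}$ as one common design choice, and a hypothetical toy response. Writing the fitted polynomial (4) as $\hat y = \hat\beta_0 + \hat{\mathbf{b}}'\mathbf{z} + \mathbf{z}'\hat{\mathbf{B}}\mathbf{z}$, the estimated stationary point is $-\hat{\mathbf{B}}^{-1}\hat{\mathbf{b}}/2$.

```python
# Sketch of a CCD for k = 2 and the estimated optimum obtained by
# differentiating the fitted second-order polynomial (4); `simulate`
# is again a hypothetical placeholder for the simulation model.
import numpy as np

k, c = 2, np.sqrt(2)                                          # axial distance: design choice
factorial = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1]])    # R-V part (full 2^2)
center = np.zeros((1, k))
axial = np.vstack([c * np.eye(k), -c * np.eye(k)])            # 2k star points
ccd = np.vstack([factorial, center, axial])                    # n = 9 > q = 6

def regressors(z):
    z1, z2 = z
    return [1.0, z1, z2, z1 * z2, z1 ** 2, z2 ** 2]            # q = 1 + 2k + k(k-1)/2 terms

rng = np.random.default_rng(2)
simulate = lambda z: (z[0] - 0.4) ** 2 + 2 * (z[1] + 0.3) ** 2 + rng.normal(scale=0.05)

X = np.array([regressors(z) for z in ccd])
w = np.array([simulate(z) for z in ccd])
beta = np.linalg.lstsq(X, w, rcond=None)[0]

# Write the fit as y = b0 + b'z + z'Bz; the stationary point is -B^{-1} b / 2.
b = beta[1:3]
B = np.array([[beta[4], beta[3] / 2], [beta[3] / 2, beta[5]]])
z_opt = -0.5 * np.linalg.solve(B, b)
print("estimated stationary point:", z_opt)
```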
If time permits, then RSM may try to escape from a possible local minimum and restart the search from a different initial local area, which brings RSM back to its initial step.
We recommend not to eliminate inputs that have nonsignificant effects in a first-order polynomial fitted within a local experimental area: these inputs may have significant effects in a next experimental area. A related issue is the determination of the number of replications. Indeed, determining whether the signal-to-noise ratio is big enough is a moot issue in metamodeling. Recently, this issue was examined in Kriging metamodeling by Ankenman et al. (2010). For the time being, we recommend estimating the true mean response for a given input combination such that a relative precision of (say) 10% has a 90% probability, using the method detailed in Law (2014).
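The following is a rough sketch of such a sequential replication rule in the spirit of Law (2014); the adjusted relative-precision bound and the placeholder `simulate_once` are assumptions for illustration, and the exact procedure in Law (2014) should be consulted for details.

```python
# Sketch of a sequential replication rule in the spirit of Law (2014):
# keep replicating one input combination until the 90% confidence half-width
# is within 10% of the estimated mean; `simulate_once` is a hypothetical
# placeholder returning one replication of the output.
import numpy as np
from scipy import stats

def replicate_until_precise(simulate_once, gamma=0.10, alpha=0.10, n0=10, n_max=1000):
    gamma_adj = gamma / (1.0 + gamma)        # adjusted relative precision
    w = [simulate_once() for _ in range(n0)]
    while True:
        m = len(w)
        mean = np.mean(w)
        half_width = stats.t.ppf(1 - alpha / 2, df=m - 1) * np.std(w, ddof=1) / np.sqrt(m)
        if half_width <= gamma_adj * abs(mean) or m >= n_max:
            return mean, half_width, m
        w.append(simulate_once())

rng = np.random.default_rng(3)
mean_hat, hw, m_used = replicate_until_precise(lambda: rng.normal(5.0, 1.0))
```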
The Taylor series argument suggests that a higher-order polynomial is more accurate than a lower-order polynomial. A statistical counterargument, however, is that overfitting gives less accurate estimators of the polynomial coefficients. Consequently, the higher-order polynomial may give a predictor $\hat{y}$ with lower bias but higher variance, such that its mean squared error (MSE) increases. Moreover, a higher-order polynomial requires the simulation of more input combinations.
3 RSM in simulation
So-called adapted steepest descent (ASD) accounts for the covariances between the elements of the estimated gradient $\hat{\boldsymbol\beta}_{-0} = (\hat\beta_1, \ldots, \hat\beta_k)'$, where the subscript $-0$ means that the intercept $\hat\beta_0$ of the estimated first-order polynomial vanishes in the estimated gradient; i.e., $\hat{\boldsymbol\beta} = (\hat\beta_0, \hat{\boldsymbol\beta}_{-0}')'$ with $\hat{\boldsymbol\beta}$ defined in (3). Obviously, the white-noise assumption implies

$$\mathrm{cov}(\hat{\boldsymbol\beta}) = \sigma_w^2 (\mathbf{Z}'\mathbf{Z})^{-1} = \sigma_w^2 \begin{pmatrix} a & \mathbf{b}' \\ \mathbf{b} & \mathbf{C} \end{pmatrix} \qquad (5)$$

where
$\sigma_w^2$ denotes the variance of the simulation output $w$;
$\mathbf{Z}$ is the $N \times (1+k)$ matrix of explanatory regression variables, including the column with $N$ one's;
$N = \sum_{i=1}^{n} m_i$ is the total number of simulation runs;
$n$ is the number of different simulated input combinations;
$m_i$ is the number of IID replications for combination $i$;
$a$ is a scalar;
$\mathbf{b}$ is a $k$-dimensional vector;
$\mathbf{C}$ is a $k \times k$ matrix such that $\mathrm{cov}(\hat{\boldsymbol\beta}_{-0}) = \sigma_w^2 \mathbf{C}$.
We estimate the variance $\sigma_w^2$ in (5) through the mean squared residuals (MSR):

$$\hat\sigma_w^2 = \frac{\sum_{i=1}^{n} \sum_{r=1}^{m_i} (w_{i;r} - \hat{y}_i)^2}{N - (k+1)} \qquad (6)$$

where $\hat{y}_i = \mathbf{z}_i'\hat{\boldsymbol\beta}$; also see Kleijnen (2015).
We can easily prove that the predictor variance $\mathrm{var}(\hat{y} \mid \mathbf{z})$ increases as $\mathbf{z}$, the point to be predicted, moves further away from the local area where the gradient is estimated. The point with the minimum predictor variance is $-\mathbf{C}^{-1}\mathbf{b}$, where $\mathbf{C}$ and $\mathbf{b}$ were defined below (5). ASD means that the new point to be simulated is

$$\mathbf{d} = -\mathbf{C}^{-1}\mathbf{b} - \lambda\, \mathbf{C}^{-1}\hat{\boldsymbol\beta}_{-0} \qquad (7)$$

where
$-\mathbf{C}^{-1}\mathbf{b}$ is the point where the local search starts; namely, the point with minimum local variance;
$\lambda$ is the step size;
$\mathbf{C}^{-1}\hat{\boldsymbol\beta}_{-0}$ is the classic steepest descent direction $\hat{\boldsymbol\beta}_{-0}$ adapted for $\mathrm{cov}(\hat{\boldsymbol\beta}_{-0})$.
Accounting for $\mathrm{cov}(\hat{\boldsymbol\beta}_{-0})$ gives a scale-independent search direction. Kleijnen et al. (2004, 2006) present experimental results showing that ASD performs "better" than classic steepest descent; i.e., the angle between the search direction based on the true $\boldsymbol\beta_{-0}$ and the search direction estimated in ASD is smaller; e.g., in one example this angle reduces from 89.87 degrees for classic steepest descent to 1.83 degrees for ASD.
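A minimal sketch of the ASD computations in (5) through (7) follows; Z and w are the local regression matrix and outputs as before, and the step size is user-chosen.

```python
# Sketch of adapted steepest descent (ASD), following (5)-(7);
# Z is the local N x (1+k) regression matrix and w the N outputs.
import numpy as np

def asd_point(Z, w, step_size):
    N, k_plus_1 = Z.shape
    beta_hat = np.linalg.lstsq(Z, w, rcond=None)[0]
    residuals = w - Z @ beta_hat
    sigma2_hat = residuals @ residuals / (N - k_plus_1)   # MSR, see (6); scales cov in (5)
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    # Partition (Z'Z)^{-1} as [[a, b'], [b, C]]; see (5).
    b = ZtZ_inv[1:, 0]
    C = ZtZ_inv[1:, 1:]
    grad_hat = beta_hat[1:]                                # estimated gradient
    # New point (7): minimum-variance point minus a step along C^{-1} times the gradient;
    # sigma2_hat cancels in this direction but is reported for diagnostic use.
    d = -np.linalg.solve(C, b) - step_size * np.linalg.solve(C, grad_hat)
    return d, sigma2_hat
```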
GRSM generalizes RSM to constrained problems in which one random output, the goal output $w_0$, is minimized such that the other (say) $(r-1)$ random outputs satisfy the constraints

$$E(w_{h'} \mid \mathbf{z}) \ge a_{h'} \quad \text{with } h' = 1, \ldots, r-1, \qquad (9)$$

and the deterministic inputs satisfy the box constraints

$$l_j \le z_j \le u_j \quad \text{with } j = 1, \ldots, k. \qquad (10)$$

Like RSM, GRSM assumes that locally the white noise assumption holds for (11), so the BLUEs are the following LS estimators:

$$\hat{\boldsymbol\beta}_h = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{w}_h \quad \text{with } h = 0, \ldots, r-1 \qquad (12)$$

where $\hat{\boldsymbol\beta}_{0;-0} = (\hat\beta_{0;1}, \ldots, \hat\beta_{0;k})'$ is the LS estimate of the local gradient of the goal function. The $(r-1)$ estimates $\hat{\boldsymbol\beta}_{h'}$ in (12) combined with the original output constraints (9) give

$$\hat{\boldsymbol\beta}_{h';-0}'\, \mathbf{z} \ge c_{h'} \quad \text{with } h' = 1, \ldots, r-1. \qquad (14)$$
$$\begin{aligned}
\text{minimize }\;& \hat{\boldsymbol\beta}_{0;-0}'\, \mathbf{z} \\
\text{subject to }\;& \mathbf{B}\mathbf{z} - \mathbf{s} = \mathbf{c} \\
& \mathbf{z} + \mathbf{r} = \mathbf{u} \\
& \mathbf{z} - \mathbf{v} = \mathbf{l}.
\end{aligned} \qquad (15)$$

This optimization problem is linear in the inputs $\mathbf{z}$. GRSM uses this problem to derive the following search direction:

$$\mathbf{d} = -\left(\mathbf{B}'\mathbf{S}^{-2}\mathbf{B} + \mathbf{R}^{-2} + \mathbf{V}^{-2}\right)^{-1} \hat{\boldsymbol\beta}_{0;-0} \qquad (16)$$

where the factor 0.8 is chosen to decrease the probability that the local metamodel (14) is misleading when applied globally; $\mathbf{z}_c$ denotes the current input combination, so the new combination becomes $\mathbf{z}_c + \lambda\mathbf{d}$. The box constraints (10) for the deterministic inputs hold globally, so it is easy to check whether the new combination $\mathbf{z}_c + \lambda\mathbf{d}$ violates these constraints.
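The following sketch computes the search direction (16); it assumes that B stacks the estimated constraint gradients as its rows and that S, R, and V are the diagonal matrices formed from the (strictly positive) slack vectors s, r, and v of (15).

```python
# Sketch of the GRSM search direction (16); B stacks the (r-1) estimated
# constraint gradients (rows), and s, r_slack, v are the current slack
# vectors of the constraints in (15), all assumed strictly positive.
import numpy as np

def grsm_direction(beta0_grad, B, s, r_slack, v):
    S_inv2 = np.diag(1.0 / s ** 2)
    R_inv2 = np.diag(1.0 / r_slack ** 2)
    V_inv2 = np.diag(1.0 / v ** 2)
    M = B.T @ S_inv2 @ B + R_inv2 + V_inv2
    return -np.linalg.solve(M, beta0_grad)
```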
Analogously to classic RSM, GRSM proceeds stepwise. After each step along the search path, GRSM tests the following two hypotheses, denoted by the superscripts (1) and (2):

GRSM uses binary search; i.e., it simulates a combination that lies halfway between these two combinations and is still on the search path. This halving of the step size may be applied several times.

Next, GRSM proceeds analogously to classic RSM; i.e., around the best combination found so far, GRSM selects a new local area. Again a R-III design specifies the new simulation input combinations, and $r$ first-order polynomials are fitted, which gives a new search direction, etc. Note that we might use the $m$ replications to estimate the accuracy of the search direction; to test the accuracy of the estimated optimum, we present a test in the next section.
Figure 1 illustrates GRSM. This figure shows two inputs; see the two axes labeled $z_1$ and $z_2$. The goal function is to be minimized; the figure also shows two contour plots or iso-cost functions: $E(w_0) = a_{0;1}$ and $E(w_0) = a_{0;2}$ with $a_{0;2} < a_{0;1}$. There are two constrained random outputs; see $E(w_1) = a_1$ and $E(w_2) = a_2$. The search starts in the lower-right local area with a $2^2$ design; see the four elongated points. Together with the replications that are not shown, the I/O data give a search direction; see the arrow leaving from point (0). The maximum step size along this path takes the search from point (0) to point (1). The binary search takes the search back to point (2), and next to point (3). Because the best point so far turns out to be point (1), the $2^2$ design is used to select four points to be simulated in this local area; this point is one of the four points. This design gives a new search direction, which indeed avoids the boundary. The maximum step size now takes the search to point (4). The binary search takes the search back to point (5), and next to point (6). Because the best point so far is now point (4), the $2^2$ design is simulated in the local area with this point as one of its points. A new search direction is estimated, etc.

Angün (2004) applies GRSM to two more examples; namely, Bashyam and Fu (1998)'s inventory simulation with a service-level constraint, so the solution is unknown, and a test function with known solution. Her results for these examples are encouraging, as GRSM found solutions that were both feasible and gave much lower values for the goal functions.
[Figure 1 near here: GRSM search in the $(z_1, z_2)$ plane, with goal contours $E(w_0) = a_{0;1}$ and $E(w_0) = a_{0;2}$, constrained outputs $E(w_1) = a_1$ and $E(w_2) = a_2$, and search points (0) through (6).]
Fig. 2 A constrained nonlinear random optimization problem: three contour plots with goal values 66, 76, and 96; two other outputs with lower bounds 4 and 9; optimal point A; points B and C on bound 9; point D on bound 4; local gradients at A through D for goal function and binding constraint, perpendicular to local tangent lines for binding constraint
9; also see (9). The figure displays the boundaries of the feasible area, which is determined by $E(w_1) = 4$ and $E(w_2) = 9$. The optimum combination is point A. The points B and C lie on the boundary $E(w_2) = 9$; point D lies on the boundary $E(w_1) = 4$. Obviously, the points A and D lie far away from each other. The figure also displays the local gradients at these four points for the goal function and for the binding constraint; i.e., the constraint with a zero slack value in (9). These gradients are perpendicular to the local tangent lines; those lines are shown only for the binding constraint, not for the goal function.
Let $\mathbf{z}^o$ denote a local minimizer for the deterministic variant of our problem. The KKT conditions for $\mathbf{z}^o$ are then

$$\boldsymbol\beta_{0;-0} = \sum_{h \in A(\mathbf{z}^o)} \lambda_h^o\, \boldsymbol\beta_{h;-0}, \qquad \lambda_h^o \ge 0, \qquad h \in A(\mathbf{z}^o) \qquad (20)$$

where $\boldsymbol\beta_{0;-0}$ denotes the $k$-dimensional vector with the gradient of the goal function, as we saw in (13); $A(\mathbf{z}^o)$ is the index set with the indices of the constraints that are binding at $\mathbf{z}^o$; $\lambda_h^o$ is the Lagrangian multiplier for binding constraint $h$; $\boldsymbol\beta_{h;-0}$ is the gradient of the output in that binding constraint; Figure 2 has only one binding constraint at the points A through D. Altogether, (20) implies that the gradient of the objective is a nonnegative linear combination of the gradients of the binding constraints at $\mathbf{z}^o$. Figure 2 shows that A satisfies (20); B has two gradients that point in different but similar directions, and so does C, whereas D has two gradients that point in completely different directions.
Note: If the optimum occurs inside the feasible area, then there are no binding constraints, so the KKT conditions reduce to the condition that the goal gradient is zero. Classic RSM includes tests for a zero gradient estimated from a second-order polynomial; see Section 2.

In simulation we must estimate the gradients; moreover, to check which constraints are binding, we must estimate the slacks of the constraints. This estimation changes the KKT conditions into a problem of nonlinear statistics. Angün (2004) presents an asymptotic test, but Bettonvil et al. (2009) derive a small-sample bootstrap test. We present the latter test, because it is simpler and it allows expensive simulation. Readers not interested in the technical details of this bootstrapping should skip the remainder of this section.
As in classic RSM, we assume locally constant (co)variances for each of the $r$ simulation outputs (when moving to a new local area, the (co)variances may change; e.g., the points A through D in Figure 2 do not have the same variance for the goal output). LS per univariate simulation output gives $\hat{\boldsymbol\beta}_h$ ($h = 0, 1, \ldots, r-1$) defined in (12). These estimators have the following estimated covariance matrix:

$$\widehat{\mathrm{cov}}(\hat{\boldsymbol\beta}_h, \hat{\boldsymbol\beta}_{h'}) = \widehat{\mathrm{cov}}(w_h, w_{h'})\, (\mathbf{Z}'\mathbf{Z})^{-1} \qquad (h, h' = 0, \ldots, r-1) \qquad (21)$$

$$\widehat{\mathrm{cov}}(w_h, w_{h'}) = (\hat\sigma_{h;h'}) = \frac{1}{m-1} \sum_{l=1}^{m} (w_{h;l} - \bar{w}_h)(w_{h';l} - \bar{w}_{h'}). \qquad (22)$$
The KKT test also involves the hypothesis that the Lagrangian multipliers are nonnegative:

$$H_0^{(3)}: \boldsymbol{\lambda} \ge \mathbf{0}. \qquad (25)$$

Sub 1: To test whether the constraint for output $h'$ is binding, its estimated slack is tested through a t statistic, where both the numerator and the denominator use the $m$ replications at the local center point; see (22). This t statistic may give the following three results:
(i) The statistic is significantly positive; i.e., the constraint for output $h'$ is not binding. If none of the $(r-1)$ constraints is binding, then the optimal solution is not yet found, assuming that at the optimum at least one constraint is binding; otherwise, classic RSM applies. The search for better solutions continues (see again Section 5).

(ii) The statistic is significantly negative; i.e., the current local area does not give feasible solutions, so the optimal solution is not yet found. The search should back up into the feasible area.

(iii) The statistic is non-significant; i.e., the current local area gives feasible solutions, and the constraint for output $h'$ is binding. The index of this gradient is then included in $A(\mathbf{z}^o)$; see (24). And the KKT test proceeds as follows.
Sub 2 and 3: To estimate the linear combination in (24), we apply LS with as explanatory variables the estimated gradients of the (say) $J$ binding constraints; these explanatory variables are now random. We collect these gradients in the $k \times J$ matrix $\hat{\mathbf{B}}_{J;-0}$. The parameters estimated through LS are $\hat{\boldsymbol\lambda}$. Let $\hat{\hat{\boldsymbol\beta}}_{0;-0}$ denote the LS estimator of the goal gradient:

$$\hat{\hat{\boldsymbol\beta}}_{0;-0} = \hat{\mathbf{B}}_{J;-0}\left(\hat{\mathbf{B}}_{J;-0}'\hat{\mathbf{B}}_{J;-0}\right)^{-1}\hat{\mathbf{B}}_{J;-0}'\,\hat{\boldsymbol\beta}_{0;-0} = \hat{\mathbf{B}}_{J;-0}\hat{\boldsymbol\lambda} \qquad (27)$$

with $\hat{\boldsymbol\lambda} = (\hat{\mathbf{B}}_{J;-0}'\hat{\mathbf{B}}_{J;-0})^{-1}\hat{\mathbf{B}}_{J;-0}'\,\hat{\boldsymbol\beta}_{0;-0}$. To quantify the validity of this linear combination, we compute the residual

$$\hat{e}(\hat{\boldsymbol\beta}_{0;-0}) = \hat{\hat{\boldsymbol\beta}}_{0;-0} - \hat{\boldsymbol\beta}_{0;-0}. \qquad (28)$$
Hypothesis (24) implies $E(\hat{e}(\hat{\boldsymbol\beta}_{0;-0})) = \mathbf{0}$. This hypothesis involves a product of multivariates, so standard tests do not apply and we use bootstrapping. We do not apply distribution-free bootstrapping, because in expensive simulation only the center point is replicated a few times. Instead, we apply parametric bootstrapping; i.e., we assume a Gaussian distribution (as in classic RSM), and we estimate its parameters from the simulation's I/O data. This bootstrapping consists of the following four steps, where the superscript * denotes a bootstrapped value:
(i) Sample the bootstrapped gradients from

$$\mathrm{vec}(\hat{\boldsymbol\beta}^*_{0;-0}, \hat{\mathbf{B}}^*_{J;-0}) \sim N\!\left(\mathrm{vec}(\hat{\boldsymbol\beta}_{0;-0}, \hat{\mathbf{B}}_{J;-0}),\; \widehat{\mathrm{cov}}(\mathrm{vec}(\hat{\boldsymbol\beta}_{0;-0}, \hat{\mathbf{B}}_{J;-0}))\right) \qquad (29)$$

where $\widehat{\mathrm{cov}}(\mathrm{vec}(\hat{\boldsymbol\beta}_{0;-0}, \hat{\mathbf{B}}_{J;-0}))$ is the $(k + kJ) \times (k + kJ)$ matrix computed through (21).
(ii) Use the bootstrap values resulting from Step (i) to compute the LS estimate of the bootstrapped goal gradient, using the bootstrapped gradients of the binding constraints as explanatory variables; i.e., use (27), adding the superscript * to all random variables, which results in $\hat{\hat{\boldsymbol\beta}}^*_{0;-0}$ and $\hat{\boldsymbol\lambda}^*$.

(iii) Use $\hat{\hat{\boldsymbol\beta}}^*_{0;-0}$ from Step (ii) and $\hat{\boldsymbol\beta}^*_{0;-0}$ from Step (i) to compute the bootstrap residual $\hat{e}^*(\hat{\boldsymbol\beta}^*_{0;-0}) = \hat{\hat{\boldsymbol\beta}}^*_{0;-0} - \hat{\boldsymbol\beta}^*_{0;-0}$, analogous to (28). Determine whether any of the bootstrapped Lagrangian multipliers $\hat{\boldsymbol\lambda}^*$ found in Step (ii) is negative; i.e., augment a counter (say) $c^*$ with the value 1 if this event occurs.
(iv) Repeat the preceding three steps (say) 1000 times. This bootstrap sample gives the EDF of $\hat{e}^*(\hat\beta^*_{0;-0;j})$, which denotes the bootstrapped residuals per input $j$ ($j = 1, \ldots, k$), and the final value of the counter $c^*$. Reject the hypothesis in (24) if this EDF implies a two-sided $(1 - \alpha/(2k))$ confidence interval that does not cover the value 0, where the factor $k$ is explained by Bonferroni's inequality. Reject (25) if the fraction $c^*/1000$ is significantly higher than 50%; if the true Lagrangian multiplier is only "slightly" larger than zero, then "nearly" 50% of the bootstrapped values is negative. To test the latter fraction, we approximate the binomial distribution through the normal distribution with mean 0.50 and variance $(0.50 \times 0.50)/1000 = 0.00025$.
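A compact sketch of this parametric bootstrap follows; the estimated goal gradient, the matrix $\hat{\mathbf{B}}_{J;-0}$ of binding-constraint gradients, and their joint covariance matrix from (21) are assumed to be given, and the function illustrates steps (i) through (iv) rather than being the authors' own code.

```python
# Sketch of the parametric bootstrap of steps (i)-(iv); the estimated gradients
# and their covariance matrix (from (21)) are assumed given.
import numpy as np
from scipy import stats

def kkt_bootstrap(beta0_grad, B_J, cov_vec, n_boot=1000, alpha=0.10, rng=None):
    """beta0_grad: k-vector (goal gradient); B_J: k x J matrix of binding-constraint
    gradients; cov_vec: (k + kJ) x (k + kJ) covariance of vec(beta0_grad, B_J)."""
    rng = np.random.default_rng() if rng is None else rng
    k, J = B_J.shape
    mean_vec = np.concatenate([beta0_grad, B_J.ravel(order="F")])
    residuals = np.empty((n_boot, k))
    c_neg = 0
    for b in range(n_boot):
        draw = rng.multivariate_normal(mean_vec, cov_vec)           # step (i), see (29)
        g_star = draw[:k]
        B_star = draw[k:].reshape(k, J, order="F")
        lam_star, *_ = np.linalg.lstsq(B_star, g_star, rcond=None)  # step (ii), see (27)
        residuals[b] = B_star @ lam_star - g_star                   # step (iii), see (28)
        c_neg += int(np.any(lam_star < 0))
    # Step (iv): per-input confidence intervals with Bonferroni's correction.
    lo = np.quantile(residuals, alpha / (2 * k), axis=0)
    hi = np.quantile(residuals, 1 - alpha / (2 * k), axis=0)
    reject_24 = np.any((lo > 0) | (hi < 0))            # some interval does not cover 0
    frac_neg = c_neg / n_boot
    z = (frac_neg - 0.5) / np.sqrt(0.25 / n_boot)      # normal approximation of binomial
    reject_25 = z > stats.norm.ppf(1 - alpha)          # fraction significantly above 50%
    return reject_24, reject_25, frac_neg
```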
The numerical examples in Bettonvil et al. (2009) give encouraging results; i.e., the classic t test for zero slacks performs as expected, and the new bootstrap tests give observed type-I error rates close to the prespecified (nominal) rates, while the type-II error rate decreases as the tested input combination is farther away from the true optimum (see A through D in Figure 2).
7 Robust optimization
where $\mathbf{x}$ is defined in the obvious way; e.g., the element corresponding with $\zeta_{1;2}$ (the interaction between $d_1$ and $d_2$) is $d_1 d_2$. Note that (34) is linear in $\boldsymbol\zeta$, but (31) is not linear in $\mathbf{d}$. Then (34) gives the LS estimator

$$\hat{\boldsymbol\zeta} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{w} \qquad (35)$$

where $\mathbf{X}$ is the $N \times q$ matrix of explanatory variables with $N = \sum_{i=1}^{n} m_i$ and $n$ denoting the number of different simulated combinations of $\mathbf{d}$ and $\mathbf{e}$; $\mathbf{w}$ consists of the $N$ simulation outputs. Obviously, the covariance matrix of $\hat{\boldsymbol\zeta}$ is

$$\mathrm{cov}(\hat{\boldsymbol\zeta}) = (\mathbf{X}'\mathbf{X})^{-1}\sigma_w^2. \qquad (36)$$

The RSM metamodel (31) implies that $\sigma_w^2$ equals $\sigma_e^2$. This variance is estimated by

$$MSR = \frac{(\hat{\mathbf{y}} - \mathbf{w})'(\hat{\mathbf{y}} - \mathbf{w})}{N - q} \qquad (37)$$

where $\hat{y} = \hat{\boldsymbol\zeta}'\mathbf{x}$; also see (6).
To estimate $\mu_y$, we simply plug $\hat{\boldsymbol\zeta}$ defined by (35) into the right-hand side of (32), where $\mathbf{d}$ and $\boldsymbol\mu_e$ are known. To estimate $\sigma_y^2$, we also plug $\hat{\boldsymbol\zeta}$ into (33), where $\boldsymbol\Omega_e$ is known. Note that (33) involves products of unknown parameters, so it implies a nonlinear estimator $\hat\sigma_y^2$; plugged-in estimators certainly create bias, but we ignore this bias.

Our final goal is to solve (30). We solve this constrained minimization problem through a mathematical programming solver; e.g., Matlab's "fmincon", but a different solver might be used; see Gill et al. (2000). This gives estimates of the robust decision variables and the corresponding mean and variance.
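As an illustration, the following sketch uses scipy's SLSQP solver as a stand-in for Matlab's "fmincon", assuming that (30) minimizes the estimated mean subject to an upper bound T on the estimated standard deviation and to box constraints on d; the estimators `est_mean` and `est_std` are hypothetical placeholders for the plugged-in versions of (32) and (33).

```python
# Sketch of solving the constrained problem with a nonlinear-programming solver
# (scipy's SLSQP as a stand-in for Matlab's fmincon). est_mean and est_std are
# hypothetical placeholders for the plugged-in estimators of (32) and (33);
# T bounds the estimated standard deviation, and `bounds` is the box on d.
import numpy as np
from scipy.optimize import minimize

def robust_solution(est_mean, est_std, d0, T, bounds):
    constraints = [{"type": "ineq", "fun": lambda d: T - est_std(d)}]   # est_std(d) <= T
    result = minimize(est_mean, d0, method="SLSQP",
                      bounds=bounds, constraints=constraints)
    return result.x, est_mean(result.x), est_std(result.x)

# Hypothetical quadratic estimators (assumptions, for illustration only):
est_mean = lambda d: (d[0] - 2) ** 2 + 0.5 * d[1] ** 2 + 3
est_std = lambda d: 0.2 + 0.1 * abs(d[0] * d[1])
d_opt, mean_opt, std_opt = robust_solution(est_mean, est_std,
                                           d0=np.array([1.0, 1.0]), T=0.4,
                                           bounds=[(0, 5), (0, 5)])
```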
Dellino et al. (2010) give the example of the economic order quantity (EOQ) for an environment with a demand rate that is uncertain but has a known distribution. This example demonstrates that if management prefers low variability of inventory costs, then they must pay a price; i.e., the expected costs increase. Furthermore, the classic EOQ assuming a known fixed demand rate and the robust EOQ do differ. More examples are referenced by Yanıkoğlu et al. (2013).
Ben-Tal emphasizes that the nominal solution, which ignores the uncertainty in $\mathbf{e}$, may easily violate the constraints in the given mathematical programming problem, so he derives a robust solution that gives a slightly worse value for the goal variable but increases the probability of satisfying the constraints. The mathematical challenge is to develop a computationally tractable "robust counterpart" of the original mathematical programming problem. Therefore he proposes a robust solution that is "immune" to variations within the uncertainty set. Yanıkoğlu et al. (2013) derive a specific uncertainty set for $p$, where $p$ denotes the unknown density function of $\mathbf{e}$, which is compatible with the historical data on $\mathbf{e}$. Technically, $p$ belongs to this set with confidence $1 - \alpha$, assuming some phi-divergence measure such as the well-known chi-square distance. In this chapter we do not present the mathematical details of the derivation of tractable robust counterparts, but refer to Yanıkoğlu et al.
Note that Taguchians assume a specific distribution for $\mathbf{e}$, which implies $\boldsymbol\mu_e$ and $\boldsymbol\Omega_e$ in (32) and (33). This distribution may be estimated from historical data; Yanıkoğlu et al., however, develop an approach that uses only the original observed data on $\mathbf{e}$. Several numerical examples demonstrate the effectiveness of this novel combination of Taguchi's and Ben-Tal's approaches.

Yanıkoğlu et al. also present Myers et al. (2009, p. 512)'s real-life television example, and Shi (2011)'s simulated distribution-center example. In this chapter we focus on simulation, so we discuss only Shi's example. He has five decision variables (e.g., number of forklifts) and two environmental variables (e.g., delay probabilities of suppliers); the response is the total throughput. Shi fits an incomplete second-order polynomial; see (31).
Yanıkoğlu et al. replace (30) by the following related problem:

$$\min \sigma_w^2 \quad \text{such that} \quad \mu_w \ge T. \qquad (38)$$

These two examples again demonstrate that in general robust solutions have better worst-case performance and also better average performance.
8 Conclusions
In this chapter, we started with the basics of classic RSM, which minimizes
the expected value of a single response variable in real-life experiments. Next
we considered simulation experiments. We then added the adapted steepest
descent search direction, which improves classic steepest descent. We also
summarized GRSM for simulation with multivariate responses, assuming
that one response is to be minimized while all the other responses and de-
terministic inputs must meet given constraints. Furthermore, we presented
a bootstrap procedure for testing whether the KKT conditions hold for the
estimated optimum. Finally, we considered robust optimization.
Future research may study the selection of the required number of repli-
cations, and the use of replications to estimate the accuracy of the resulting
estimated search direction or optimum. Bootstrapping may solve this prob-
lem, but it needs more research. Numerical evaluation of the adapted steep-
est descent method would bene…t from more applications in practice. There
is also a need for more research on the KKT testing procedure when all local
points (not only the center) are replicated and CRN are used; more practi-
cal applications are also needed. In Taguchian robust optimization through
RSM we may vary the threshold values, which gives di¤erent optimal solu-
tions and a corresponding Pareto frontier; bootstrapping this frontier might
enable management to make the …nal compromise decision— but more re-
search and applications are needed.
References
Ankenman, B., Nelson, B., and Staum, J., (2010), Stochastic Kriging for
Simulation Metamodeling, Operations Research, 58, No. 2, pp. 371-382
Barnes, E. R. (1986), A variation on Karmarkar’s algorithm for solving
linear programming problems. Mathematical Programming, 36, pp. 174-182
Barton, R.R. and M. Meckesheimer (2006), Metamodel-based simulation
optimization. Handbooks in Operations Research and Management Science,
Volume 13, Elsevier/North Holland, pp. 535-574
Bartz-Beielstein, T. (2006), Experimental research in evolutionary com-
putation; the new experimentalism. Springer, Berlin
Bashyam, S. and M. C. Fu (1998), Optimization of (s, S) inventory
systems with random lead times and a service level constraint. Management
Science, 44, pp. 243-256
Ben-Tal, A. and A. Nemirovski (1998), Robust convex optimization. Mathematics of Operations Research, 23, no. 4, pp. 769-805
Ben-Tal, A. and A. Nemirovski (2008), Selected topics in robust convex
optimization, Mathematical Programming, 112, no. 1, pp. 125–158
Bettonvil, B.W.M., E. del Castillo, and J.P.C. Kleijnen (2009). Statisti-
cal testing of optimality conditions in multiresponse simulation-based opti-
mization. European Journal of Operational Research, 199(2), 448-458.
Beyer, H. and B. Sendhoff (2007), Robust optimization: a comprehensive survey. Computer Methods in Applied Mechanics and Engineering, 196, nos. 33-34, pp. 3190-3218
Box, G.E.P. and K.B. Wilson (1951), On the experimental attainment of optimum conditions. Journal of the Royal Statistical Society, Series B, 13, no. 1, pp. 1-38
Chang, K-H., J. Hong, and H. Wan (2013), Stochastic Trust-Region
Response-Surface Method (STRONG) – A New Response-Surface Frame-
work for Simulation Optimization, INFORMS Journal on Computing, 25,
no. 2, pp. 230-243
Chang, K.-H., M.-K. Li, and H. Wan (2014), Combining STRONG with screening designs for large-scale simulation optimization. IIE Transactions, accepted
Chih, M. (2013), A more accurate second-order polynomial metamodel
using a pseudo-random number assignment strategy. Journal of the Opera-
tional Research Society, 64, pp. 198-207
Conn, A.R., N.I.M. Gould, and P.L. Toint (2000), Trust-Region Methods. SIAM
Dellino, G., Kleijnen, J.P.C., and Meloni, C. (2010), Robust optimization in simulation: Taguchi and Response Surface Methodology. International Journal of Production Economics, 125(1), pp. 52-59
Dellino, G., Kleijnen, J.P.C., and Meloni, C. (2012), Robust optimization in simulation: Taguchi and Krige combined. INFORMS Journal on Computing, 24(3), pp. 471-484
Dykstra, R.L. (1970), Establishing the positive definiteness of the sample covariance matrix. The Annals of Mathematical Statistics, 41, no. 6, pp. 2153-2154