0% found this document useful (0 votes)
134 views138 pages

Purdue University: Aggregate - Flow Scheduling: Theory and Practice

gfder jhjh

Uploaded by

Bandi Sumanth
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
134 views138 pages

Purdue University: Aggregate - Flow Scheduling: Theory and Practice

gfder jhjh

Uploaded by

Bandi Sumanth
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 138

G raduate S chool F orm 9

(R evised 7/94)

PURDUE UNIVERSITY
GRADUATE SCHOOL
Thesis Acceptance
This is to certify that the thesis prepared
By

_________________________________ H uan

R en___________________________________

Entitled

Aggregate - Flow Scheduling: Theory and Practice


Complies with University regulations and meets the standards o f the Graduate School for originality
and quality

For the degree o f _____________ D octor

of P h ilo so p h y _____________________________

Signed by the finaFexamining committee:


chair

Approved by:

Department Head

This thesis ^

Date

is not to be regarded as confidential. __


Major Professor

Format Approved by:

or
Chair, Final Examining Committee

'hesis Format Adviser

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

AGGREGATE-FLOW SCHEDULING: THEORY AND PRACTICE

A Thesis
Submitted to the Faculty
of
Purdue University
by
Huan Ren

In Partial Fulfillment of the


Requirements for the Degree
of
Doctor of Philosophy

December 2002

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

UMI Number: 3105011

Copyright 2004 by
Ren, Huan

All rights reserved.

INFORMATION TO USERS

The quality of this reproduction is dependent upon the quality of the copy
submitted. Broken or indistinct print, colored or poor quality illustrations and
photographs, print bleed-through, substandard margins, and improper
alignment can adversely affect reproduction.
In the unlikely event that the author did not send a complete manuscript
and there are missing pages, these will be noted. Also, if unauthorized
copyright material had to be removed, a note will indicate the deletion.

UMI
UMI Microform 3105011
Copyright 2004 by ProQuest Information and Learning Company.
All rights reserved. This microform edition is protected against
unauthorized copying under Title 17, United States Code.

ProQuest Information and Learning Company


300 North Zeeb Road
P.O. Box 1346
Ann Arbor, Ml 48106-1346

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

ii

To Xin and Eric

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

ACKNOWLEDGMENTS

I would like to show my sincere gratitude to my advisor Professor Kihong Park for
his guidance during my Ph.D. study. The hundreds of hours th at we spent together in
the past four years are still vivid in my memory. His devotion to science, enthusiasm
for perfection, and creative thinking provide an example of researcher th at I shall
follow in my life. I thank him not only for giving me copious amounts of insightful
criticism on my research but also for his far-reaching advices on my professionalism
and future career.
I would like to thank Professors Sonia Fahmy, Ness Shroff, and Wojciech Szpankowski for being on my advisory committee. The computer networking seminar
course I took from Professor Fahmy helped me start working on the topic of this
dissertation. Prof. Shroff is a true mentor to his students; I was fortunate to get
many wise advices from him on various subjects throughout my Ph.D. study. Prof.
Szpankowski always encourages students to explore unknown world, and I benefited
greatly from the algorithm analysis course he taught. I also thank Professor Steven
Low at California Institute of Technology for serving on my final examining commit
tee. He made a long trip to attend my final defense and provided insightful comments
on this dissertation. I thank Dr. Gorman for all his help in the department.
Special thanks should be given to my wife, Xin Liu, for her love, support, encour
agement, and for being an intelligent colleague as well. The past four and a half years
th at we studied together at Purdue are the happiest and most memorable time in my
life. Both of us get our Ph.D. degrees from Purdue with Eric Rucheng Ren as our
another accomplishment. It is her th at makes this long journey of pursuing Ph.D. so
enjoyable.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

Finally, I would like to thank my parents for their selfless love and support. I also
thank my brother Kun for sharing my experiences, his sense of humor, and being my
soul mate.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

TABLE OF CONTENTS
Page
LIST OF F I G U R E S .....................................................................................................
ABSTRACT

viii

..................................................................................................................

xi

1 In tro d u c tio n ...............................................................................................................

1.1

M otivation........................................................................................................

1.2

Key Is s u e s ........................................................................................................

1.3

Theoretical C o n tr ib u tio n s ..........................................................................

1.4

Im p lem en tatio n ..............................................................................................

1.5

Related W o r k .................................................................................................

2 Network A rch itectu re..............................................................................................

10

2.1

Overall System S tru c tu re ..............................................................................

10

2.2

Basic Definitions

...........................................................................................

12

2.3

Per-hop C o n tro l..............................................................................................

12

2.3.1

Per-hop Control C o m p o n e n ts ........................................................

12

2.3.2

Per-hop Control P ro p erties..............................................................

13

Edge C o n t r o l .................................................................................................

14

2.4.1

Access Control

.................................................................................

14

2.4.2

End-to-end C o n tro l...........................................................................

15

User C o n tro l....................................................................................................

16

2.5.1

User Utility and Selfishness

...........................................................

16

2 .5 .2

N o n c o o p e r a tiv e G a m e ...................................................................................

17

Service Provider C o n tr o l..............................................................................

18

3 Optimal Aggregate-flow S c h e d u lin g .....................................................................

20

2.4

2.5

2.6

3.1

Optimal Classifiers and Per-hop C o n t r o l .................................................

20

3.1.1

20

Optimal Per-flow C la ssific a tio n ....................................................

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

vi
Page

3.1.2

Optimal Aggregate-flow Classification...........................................

26

3.1.3

Properties of Optimal Aggregate-flow C lassifiers.......................

31

3.1.4

System Optimality and Structural P roperties..............................

34

3.2 Game Theoretic S tr u c tu r e ............................................................................

39

3.2.1

Basic D e fin itio n s ...............................................................................

39

3.2.2

Nash Equilibria and Stability P r o p e r tie s ........................................

40

3.3 C onclusion.........................................................................................................

42

........................................................................................

44

4.1 QoS Provisioning Architecture D esign.........................................................

44

4 Performance Evaluation

4.1.1

Optimal Aggregate-flow Per-hop Control D e s ig n ........................

44

4.1.2

End-to-end QoS Control D e s ig n .....................................................

46

4.1.3

Scaling Function

...............................................................................

47

4.1.4

Load Imbalance and LocalQoSR esponsibility ..............................

48

Performance R e s u l t s .....................................................................................

50

4.2.1

Simulation S e t-u p ...............................................................................

50

4.2.2

Service D iffe re n tia tio n .....................................................................

51

4.2.3

Structural Properties of Optimal Aggregate-flow Per-hop Control 55

4.2.4

The Role of Scaling Function

........................................................

59

4.2.5

Impact of B u r s tin e s s ........................................................................

61

4.2.6

Dynamics and C o n v e rg e n c e ...........................................................

62

4.3 C onclusion.........................................................................................................

69

5 System Building and B enchm arking....................................................................

70

5.1 System D esign...................................................................................................

70

4.2

5.1.1

Key is s u e s ............................................................................................

70

5.1.2

Overall S tr u c tu r e ...............................................................................

71

5.1.3

Dynamic Weight C om putation ........................................................

74

5.1.4

User Configuration In te rfa c e ...........................................................

75

5.2 System Building P rocedure............................................................................

75

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

v ii

Page
5.3 Benchmarking R e s u l t s ..................................................................................

76

5.3.1

QoS D ifferentiation............................................................................

76

5.3.2

Dynamic E n v iro n m e n t.....................................................................

79

5.4 C onclusion.........................................................................................................

80

6 Stochastic Modeling and O p tim iz a tio n ..............................................................

82

6.1 In tro d u c tio n ......................................................................................................

82

6.1.1

Features of Optimal Aggregate-flow Scheduling

........................

83

6.1.2

New C ontribution ...............................................................................

84

6.2 Related W o r k ..................................................................................................

86

6.3 System M o d e l..................................................................................................

89

6.3.1

Multi-class Queueing M o d e l............................................................

89

6.3.2

Optimal Aggregate-flow S ch ed u lin g ...............................................

91

6.3.3

Conservation L a w ...............................................................................

92

6.3.4

Aggregate-flow Performance Space and Open Ball Containment

6.4 Structure of Optimal Aggregate-flow S ch ed u lin g .....................................

93
94

6.4.1

Main R e s u lt.........................................................................................

94

6.4.2

Decomposition

..................................................................................

95

6.4.3

C lu s te rin g ............................................................................................

97

6.4.4

Open Ball Scaling...............................................................................

98

6.5 Complexity of Optimal C lu sterin g...............................................................

100

6.5.1

Unconstrained Optimization and Subspace P ro je c tio n ................

101

6.5.2

Hardness of One-dimensional C lu s te r in g .....................................

104

6.6 Conclusion and D iscu ssio n ............................................................................

112

7 Conclusion and Future W o rk .................................................................................

114

7.1 Thesis Summary

............................................................................................

114

7.2 Future W o rk ......................................................................................................

115

LIST OF R E F E R E N C E S .............................................................................................

117

VITA

123

.................................................................................................................................

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

viii

LIST OF FIGURES
Figure

Page
Overall QoS provisioning architecture. Network exports per-hop and
edge control, user exercises scalar QoS control (77 -control), and service
provider exports QoS cost to user.................................................................

11

Left: Aggregate-flow QoS control affected by two stages of information


loss via many-to-one coarsificationat edge and per-hop. Right: 77
value in DS field of IP datagram is used by the classifier to select service
class in GPS packet scheduler........................................................................

13

Structure of forward QoS control path. Lower path comprised of ad


mission control, policing/shaping, per-hop controlopen-loop control.
Upper control path comprised of dynamic 77 control, pricing, receiver
QoS monitoring, QoS feedbackclosed-loop control................................

15

3.1

Behavior of reduction classifier....................................................................

28

4.1

Structure of reduction classifier for m = L\ a,k is the service weight


allocated to service class k G { 1 ,..., L } ......................................................

45

4.2 Left: Equal spacing QoS separation achieved by optim al aggregateflow classifier when L = 16. Right: Scaling function a affecting nonuni
form stretching and contraction....................................................................

48

Structure of reduction classifier with scaling function for m = L\ oik is


the service weight allocated to service class k { 1 ,..., L } ....................

49

Uniform vs. nonuniform local QoS responsibility distribution to satisfy


30ms end-to-end delay requirement for a given load imbalance. . . .

49

Benchmark network topology. Left: 2-switch single bottleneck link


shared by n flows. Right: 4-switch multiple bottleneck link caterpillar
topology. ........................................................................................................

51

QoS separation achieved by optimal aggregate-flow classifier when L =


16. Left: Packet loss rate. Right: End-to-end delay.
..........................

52

4.7 Manifestation of properties (Al) and (A2). End-to-end QoS shaping


as a function of label value rju of singular user flow. Left: 16 users
(originally each group has one user). Right: 48 users (average group
population size of 3)........................................................................................

53

2 .1

2 .2

2.3

4.3
4.4
4.5

4.6

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

ix
Figure

Page

Impact of v on QoS separation for L = 16. Left: QoS exported by


service classes as a function of bottleneck bandwidth when v 0.9.
Right: Corresponding plots when v = 0.1...................................................

54

Structure of A*. Left: The change in Nash equilibria as we increase


bottleneck bandwidth for user population with QoS requirement pro
file (0.01,0.05,0.1,0.15,0.2,0.3,0.4,1.0). At 10 Mbps, Nash equilibria
become corner points of A*. Right: Corresponding landscape for more
stringent user population QoS profile shown in the legend.....................

56

4.10 Impact of bounded label set size L on QoS exported by the service
classes as a function of bottleneck bandwidth for L = 1, 4, 8 , 32. . . .

57

4.11 The combined impact of L and v on QoS shaping. Left: QoS exported
in L service classes as a function of L for v = 0.5. Right: Corresponding
plot for v 0.1.................................................................................................

58

4.12 Impact of L on existence of A*: minimum bottleneck bandwidth re


quired to achieve A 7^ 0 as a function of L ...............................................

59

4.13 QoS differentiation achieved by optimal aggregate-flow classifier with


scaling function. <7 (77 ) for 77 G [0,15]: 1.0, 1.1, 1.2, 10, 11, 12, 100,
110, 120, 500, 550, 600, 1000, 1100, 1200, 2000. Left: Packet loss rate.
Right: End-to-end d e la y ...............................................................................

60

4.14 Impact of scaling function on system efficiency: minimum bottleneck


bandwidth required to achieve A* ^ 0 as a function ofL ......................

60

4.15 QoS separation achieved by optimal aggregate-flow classifier when L =


16 under VBR traffic. Left: packet loss rate. Right: end-to-end delay.

61

4.16 Impact of finite label set size L on QoS for VBR traffic. Left: L=4;
Right: L=16......................................................................................................

62

4.17 QoS separation achieved by optimal aggregate-flow classifier with scal


ing function when L = 16 under VBR traffic. <7 (77 ), 77 G [0,15]: 1.0, 1.1,
1.2, 10, 11, 12, 100, 110, 120, 500, 550, 600, 1000, 1100, 1200, 2000.
Left: packet loss rate. Right: end-to-end delay. ....................................

63

4.18 Time evolution of adaptive label control and end-to-end QoS. Top row:
Evolution of label values shown for three user groups with common QoS
requirements (0.1, 0.3, and 0.5). Bottom row: Corresponding trace of
measured end-to-end QoS for user flows belonging to the three QoS
groups.................................................................................................................

64

4.8

4.9

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

Page

Figure

4.19 End-to-end QoS achieved under adaptive label control as a function of


bottleneck bandwidth for successively more stringent QoS requirement
profiles (shown in the legends)

65

4.20 Time evolution of adaptive label control and end-to-end QoS in manyswitch topology (Figure 4.5 (right)) with many flows. Top row: Evo
lution of label values shown for three user groups with common QoS
requirements (0.1, 0.3, and 0.5). Bottom row: Corresponding trace of
measured end-to-end QoS for user flows belonging to the three QoS
groups.......................................................................................................
66
4.21 QoS distribution on multi-hop path with three medium-loaded
switches. Left: Switch loads. Middle: Static QoS distribution. Right:
67
Dynamical QoS distribution................................................................
4.22 QoS distribution on multi-hop path with one heavy-loaded switch and
two light-loaded switches. Left: switch loads. Middle: static QoS
distribution. Right: dynamical QoS distribution............................
68
4.23 To satisfy given QoS target, the actual QoS achieved at each switch
with respect to different load imbalance patterns...........................
69
5.1 Structure of optimal aggregate-flow per-hop control module imple
mented in Cisco routers
72
5.2 Network topology of Q-bahn t e s t b e d ..............................................

76

5.3 QoS separation among classes under different load distribution . . . .

77

5.4 QoS separation among classes under different congestion level

78

5.5 Q-Bahn experiment r e p o r t .................................................................

....
81

6.1 Depiction of conservation law hyperplane, per-flow performance space,


and aggregate-flow subspace for an n = 3 and m 2 optimum schedul
85
ing example.............................................................................................

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

ABSTRACT
Ren, Huan. Ph.D., Purdue University, December, 2002. Aggregate-flow Scheduling:
Theory and Practice. Major Professor: Kihong Park.

This dissertation studies providing QoS to individual flows using aggregate-flow


scheduling. Our work is carried out on both theoretical and practical sides.
We present a theoretical framework for reasoning about scalable QoS provisioning
using aggregate-flow scheduling, constrained to be implementable in IP networks.
Our control frameworkScalar QoS Controlgeneralizes per-hop and edge control
achievable by setting a scalar value in packet headers, e.g., the TOS field of IP.
We study optimal aggregate-flow scheduling problem under the framework and the
properties the optimal solution exhibit which facilitate end-to-end QoS via the joint
action of aggregate-flow scheduling per-hop and per-flow provisioning at the edge.
We design an optimal aggregate-flow per-hop control algorithm th at achieves the
induced optimal aggregate-flow scheduling solution and implement the algorithm in
Cisco routers. We conduct a comprehensive performance evaluation by both simu
lation and experiments over Q-bahn testbed comprised of Cisco routers running the
implemented optimal aggregate-flow per-hop control. The benchmarking results con
firm our theoretical framework and analysis, and reveal further quantitative features
of both structural and dynamical properties of the system. Our results, collectively,
show th at user-specified services can be efficiently and effectively achieved over net
works with optimal aggregate-flow per-hop control substrate when coupled with either
open-loop or closed-loop (adaptive label control) edge control.
We generalize our optimal aggregate-flow per-hop control analysis by consider
ing stochastic input and study optimal aggregate-flow scheduling problem in gen
eral multi-class queueing systems. We introduce a stochastic framework of optimal

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

aggregate-flow scheduling, which extends the optimal per-flow scheduling framework


pioneered by Coffman and Mitrani. We show th at optimal aggregate-flow schedul
ing in multi-class G /G /l systems with work-conserving, non-preemptive and nonanticipative scheduling disciplinesfor which Kleinrocks conservation law holdsis
NP-hard. This stands in contrast with the quadratic time complexity of optimal
per-flow scheduling and cubic time complexity of optimal aggregate-flow scheduling
in static input environments subject to relative service differentiation. We show th at
computational hardness results from the combination of optim al aggregation and
Kleinrocks conservation law.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

1
1.1

INTRODUCTION

M otivation
Architecting networks capable of providing scalable, efficient, and fair services to

users with diverse QoS requirements is a challenging problem. The traditional ap


proach uses resource reservation and admission control to provide both guarantees
and graded services to application traffic flows. Analytical tools for computing and
provisioning QoS guarantees [15, 16, 59, 60] rely on overprovisioning coupled with
traffic shaping/policing to preserve well-behavedness properties across switches th at
implement a form of GPS packet scheduling [17]. Scale-invariant burstiness associ
ated with self-similar network traffic [46, 64] limits the shapability of input traffic and
restricts reserving bandwidth th at is significantly smaller than the peak transmission
rate. This imposes a trade-off between QoS and resource utilization which limits
the degree of utilization achievable while guaranteeing stringent QoS. For applica
tions needing guaranteed services, the unconditional protection afforded by per-flow
resource reservation and admission control is a necessity. However, for elastic appli
cations th a t require QoS-sensitive services but not guarantees, it would be overkill to
provision QoS using the mechanisms of per-flow reservation and admission control. In
addition to the service mismatch , overhead associated with administering resource
reservation and admission control would impede scalability.
Efforts have been directed at designing network architectures with the aim of
delivering QoS-sensitive services by introducing weaker forms of protection or as
surance to achieve scalability [8, 12, 18, 50, 55]. The differentiated services frame
work [3, 12, 37, 55] has advanced a set of building blocks comprised of per-hop and
access point behaviors with the aim of facilitating scalable services through aggregateflow resource control inside the network and per-flow traffic control at the edge. By

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

performing a many-to-one mapping from the large space of individual flows to the
much smaller space of aggregate-flow labels, scalability of per-hop control is achieved
at the expense of introducing uncertainty and possible volatility by flow-aggregation
and aggregate-flow packet scheduling.

1.2

K ey Issues
A number of works have studied the behavioral characteristics of specific in

stances of differentiated services networks. In previous work [8, 9, 63], the authors
introduced aggregate-flow per-hop control mechanisms motivated by game theoretic
considerationsa router performs class-based label switching which emulates user
optimal service class selection with respect to selfish userswithout considering the
space of all aggregate-flow per-hop controls which is carried out in this dissertation. In
[52] simplified models of Assured Service [37] and Premium (or Expedited) Service [38]
are presented and analyzed with respect to their performance when compared with
simulations. In [24], an adaptive 1-bit marking scheme is described, and the resulting
bandwidth sharing behavior is demonstrated via simulations when the priority level
is controlled end-to-end. In [18], the authors describe the proportional differentiation
model which seeks to achieve robust, configurable service class separationi.e., QoS
differentiationwith the support of two candidate packet schedulers. They use simu
lation to study the behavioral properties. Other related works include [8, 12, 50, 56].
In spite of these efforts, a comprehensive understanding of the power and limita
tion of aggregate-flow QoS provisioning is still in its infancy. Little is known about
how to select good aggregate-flow per-hop controlsincluding optimal onesand
per-flow end-to-end (or edge) controls, and what criteria to apply when designing
these components. Following the divide-and-conquer approach to network design, we
would like to reduce the scalable QoS provisioning problem to subproblems and solve
them individually without worrying about the details of other subsystems except
through well-defined interfaces and black box function definitions. Although the

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

same approach is undertaken in this work, we find th at there are intim ate relationships
between the selection of per-hop and end-to-end controls. Thus the key problem of our
study is the formulation and solution of optimal aggregate-flow scheduling for scalable
QoS provisioning with regard to individual user requirement and performance.

1.3

Theoretical Contributions
Our theoretical study has three parts.

First, we give a general framework of

scalable QoS provisioning using aggregate-flow scheduling where packet labels can
be set from a finite label set and routers provide differentiated treatm ent of packets
based on the labels enscribed. We define the meaning of optimal per-hop control
within this context and find the optimal solution for aggregate-flow control. We show
th a t the optimal per-hop control satisfies certain propertiesdenoted (A l), (A2),
and (B), and defined in Section 2.3which relate how label values impact the service
a flow receives at a router. We augment the general result by presenting optimal
solutions when restricting the packet scheduling disciplines to variants of GPS, and
the consequences on the core properties.
Second, we expand the framework by introducing selfish users who can influence
QoS provisioning behavior by regulating the label values assigned to their traffic
streams. Based on the properties exported by the network control (A l), (A2), and
(B)we show how a population of selfish users with diverse QoS requirements setting
their packet labels can arrive at a global allocation of resources th a t is stable (Nash
equilibrium) and efficient (system optimal). We show th at even in situations when
network resources are scarce such th at no resource allocationaggregate-flow differ
entiation, per-flow reservation, or otherwisecan satisfy all users QoS requirements,
the system is stable and reaches a Nash equilibrium. We show th at the optimal perhop control is also optimal in the noncooperative game context in the sense that
when network resources are configurable such th at all users QoS requirements can
be satisfied, then there exists a Nash equilibrium th at is system optimal.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

Third, we generalize our optimal aggregate-flow per-hop control analysis by con


sidering the impact of stochasticity of the input: arrival processes with general in
terarrival and service times. We study the stochastic modeling and optimization of
aggregate-flow scheduling in a general multi-class queueing systems. We introduce a
stochastic framework of optimal aggregate-flow scheduling, which extends the optimal
per-flow scheduling framework pioneered by Coffman and M itrani [19, 30, 23, 22, 76].
We show th a t optimum scheduling in multi-class G /G /l systems with work con
serving, non-preemptive and non-anticipative scheduling disciplinesfor which Klein
rocks conservation law holdsis NP-hard. This stands in contrast with the quadratic
time complexity of optimal per-flow scheduling, and cubic time complexity in static
input environments subject to relative service differentiation. We show th a t computa
tional hardness results from the combination of optimal aggregation and Kleinrocks
conservation law.
The chapters related to theoretical analysis are organized as follows: In Chapter 2
we present a general QoS provisioning architecture using aggregate-flow scheduling.
Chapter 3 studies optimal aggregate-flow pre-hop control, and system stability and ef
ficiency under non-cooperative game context. In Chapter 6, we discuss the stochastic
framework of optimal aggregate-flow scheduling.

1.4

Im plem entation
In addition to the theoretical analysis, We design efficient algorithm to achieve

optimal aggregate-flow per-hop control and implement the algorithm in Cisco routers.
We carry out a comprehensive performance evaluation study of QoS provisioning
using optimal aggregate-flow per-hop control by both simulation and benchmarking
over real network testbed.
In chapter 4, we investigate implementation issues of QoS provisioning using opti
mal aggregate-flow per-hop control derived in Chapter 3, and carry out a performance
evaluation study by simulation. We design a system th at implements the optimal per-

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

hop control and end-to-end control, and propose a practical enhancement by introduc
ing a scaling function which is applied to the TOS field label value in the IP header at
each router. The scaling function allows the service provider to configure the per-hop
control so as to export customized QoS separationessential when shaping end-toend absolute QoS over per-hop relative QoScommensurate with the QoS profiles of
the service providers user base. Using simulation, we show th a t the scaling function
enhances system efficiency in the sense th at less bandwidth is needed to achieve the
same QoS requirements. We also complement the theoretical framework introduced
by carrying out a performance evaluation study of QoS provisioning using optimal
aggregate-flow per-hop control. We use simulation to study both the dynamical and
structural properties of aggregate-flow QoS provisioning as they relate to stability and
optimality. We provide comprehensive and detailed quantitative performance results
and evaluations under aggregate-flow interactions. We dem onstrate the QoS shaping
prowess of optimal aggregate-flow per-hop control, QoS separation and efficiency at
matching diverse QoS requirements, the impactwith respect to service resolution
and loss of QoS shaping powerstemming from discrete and bounded label values in
TOS field, the effectiveness of end-to-end adaptive label control over optimal per-hop
control, dynamical properties of adaptive label control with respect to convergence
and stability, and QoS distribution across multiple routers on an end-to-end path in
a wide area network.

In Chapter 5, we describe the design and implementation of optimal aggregate-flow


per-hop control in Cisco routers. The system building experience shows th at the op
timal per-hop control scheme purposed is practical and implementable. The overhead
brought by optimal per-hop control is small. We also conducted benchmarking over
Q-Bahn testbed which is comprised of Cisco 7206 VXR routers running developed
optimal per-hop control. Our benchmarking output confirms the previous results
from theoretical analysis and simulation study, and demonstrates the scalability of
the QOS provisioning architecture.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

Collectively, our results show th at optimal aggregate-flow per-hop control is imple


mentable, and user-specified, diverse QoS requirements can be effectively facilitated
over the network with optimal aggregate-flow scheduling substrate.

1.5

R elated Work
The early approach to QoS provisioning uses resource reservation and admis

sion control to provision guaranteed servicesdeterministic or statisticalwith the


help of leaky bucket regulators. Although research abounds [14, 15, 20, 21, 31, 45,
54, 59, 60, 16], analytic tools for computing QoS guarantees rely on shaping of in
put traffic to preserve well-behavedness across switches which implement a form of
generalized processor sharing (GPS), also known as weighted fair queuing [17, 59].
Real-time constraints of multimedia traffic and the scale-invariant burstiness associ
ated with self-similar network traffic [46, 61, 67, 85] limit the shapability of input
traffic while at the same time reserving bandwidth th at is significantly smaller than
the peak transmission rate. Statistical multiplexingwhen employing a small buffer
capacity/large bandwidth resource provisioning policy [64]can, by an application
of the central limit theorem, yield improved efficiency while achieving predictable
performance. However, the statistical guarantee is only approximate, some efficiency
is lost when fitting leaky bucket parameters to self-similar input [73], and secondorder propertieseven under bufferless queuingadversely affect jitte r and related
second-order performance measures. Thus QoS and utilization stand in a trade-off
relationship with each other [62, 61] and transporting application traffic over reserved
channels, in general, incurs a high cost.
Aggregate-flow scheduling has been investigated in the noncooperative QoS provi
sioning contextexplicitly for multi-class QoS and implicitly for competitive routing.
In [8, 9, 63] the authors introduced aggregate-flow per-hop packet scheduling mecha
nisms and end-to-end controls motivated by game theoretic considerationsa router
performs class-based label switching which emulates user optim al service class selec

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

tion with respect to selfishnesswithout considering the space of all aggregate-flow


per-hop controls. In [51], the aggregate-flow scheduling scheme is modeled as a multi
class priority queue, where users are given the freedom to choose the priorities of their
traffic, but are charged accordingly. In [42, 57] the authors study stability and ef
ficiency properties of a parallel link routing system where selfish users are allowed
to choose which link to send their traffic commensurate with their performance re
quirements. The QoS experienced in each link is a function of how many users have
selected th at linkflow-aggregationand interpreting each link as a service class
(albeit without work conservation coupling) yields a aggregate-flow QoS provision
ing system. Several other papers have also addressed the issue of multi-class QoS
provisioning in high-speed networks [13, 49, 77, 74, 66]. Some of the works employ
a cooperative framework or place significant computing responsibilities on the part
of the user [49, 74], some investigate the effect of pricing incentives [13], and others
represent flow/congestion control and routing models th at only partially address the
quality of service problem [77].

In the differentiated services area, efforts are directed at designing network ar


chitectures with the aim of delivering service levels with soft or weak guaran
tees using aggregate-flowas opposed to per-flowtraffic control [12, 55], with pri
mary emphasis on scalability. Assured Service [12, 37] and Premium (or Expedited)
Service [55, 38]two principal proposals adopted by the IETF Diff-Serv Working
Groupaffect weak protection through traffic shaping/m arking at the edge and dif
ferentiated packet treatm ent support from routers. In both cases, it is assumed that
service level (i.e., QoS) is computed using admission control, and the core task revolves
around providing protection from ill-behaving flows th at exceed their contract speci
fications. This is done through 2-state (in/out) marking using RIO gateways [12], or
through leaky bucket traffic shaping with routers implementing priority queuing [55].
The most challenging problemhow to efficiently compute service level agreements
(SLA)is not addressed within its scope. It is this problem, phrased in the context

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

of shaping end-to-end absolute QoS over per-hop relative QoS, th a t is studied in this
dissertation.
Several works have addressed the performance evaluation side of differentiated
services, both quantitatively and qualitatively. In [52], the authors advanced simpli
fied models of Assured Service and Premium Service and analyzed them with respect
to their performance when compared with simulation-based evaluations. In [47] the
authors investigated enhanced mechanisms th at improve the throughput and fairness
properties exhibited by Assured Service. Feng et al. [24] describe an adaptive 1-bit
marking scheme and demonstrate the resulting bandwidth sharing behavior using
simulations when the priority level is controlled end-to-end. In [18], the authors de
scribe a proportional differentiation model which seeks to achieve robust, configurable
service class separationi.e., QoS differentiationwith the support of two candidate
packet schedulers. They use simulation to study the behavioral properties.
Some other works has been carried out in formulating resource allocation problems
spanning a number of different domains in the context of microeconomics and game
theory [13, 25, 26, 36, 41, 43, 44, 49, 57, 66, 74, 77, 81, 83]. In Smart Market [50],
pricingin the form of packets carrying bidsis used to resolve scheduling conflicts
of packets at switches inside a network implementing priority queues. Paris Metro
Pricing [56] provides a framework and argument for discussing the role of service
differentiation through pricing, even if network resources are plentiful. In [34], the
authors study the efficiency properties of RIO which is used as the scheduler in
Assured Service. Although the concerns raised with respect to RIO being an efficient
scheduler are well-founded, the authors contribution is restricted on performance
analysis of a packet scheduler and does not relate in an essential way to aggregateflow scheduling which is at the heart of differentiated services.
Prior to stochastic optimal aggregate-flow scheduling framework, optimal per-flow
scheduling with stochastic input has been intensively studied, with focus on charac
terizing the per-flow performance space satisfying strong conservation laws [76]a
weaker form being Kleinrocks conservation law [39, 40, 75]under different assump

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

tions on the input and scheduler space. Several other aspects of optimal schedul
ing, in contexts relevant to network control, have also been investigated.

These

include computation complexity of optimal control, optimal clustering under mini


mum mean-square error (MMSE) criterion, and job grouping. Chapter 6 Section 6.2
provides a more detailed review on previous works related to stochastic optimization
of aggregate-flow scheduling.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

10

NETWORK ARCHITECTURE

In this chapter, we present a scalable QoS provisioning architecture using aggregateflow scheduling. The architecture, Scalar QoS Control, generalizes per-hop and edge
control achievable by setting a scalar value in packet headers, e.g., the TOS field
of IP. Our control framework incorporates assumptions, albeit weak, about selfish
user behavior and service provider behavior. This is necessitated by the essential
role they play in influencing end-to-end QoS, without which an effective evaluation
of aggregate-flow QoS provisioning remains incomplete.

2.1

Overall System Structure


The network system is comprised of four principal componentsper-hop control,

edge control, user control, and service provider controlwhere the first two make
up the network system proper, and the latter two are incorporated to evaluate the
goodness of the first two components. Figure 2.1 depicts the overall system struc
ture. A users traffic flow, upon entering the network, is assigned a label from a set of
L values, e.g., enscribed in the TOS field of IPv4. The routers provide differentiated
treatm ent of packets based on their enscribed labels, and end-to-end QoS is deter
mined by the treatm ent of an users flow on all hops along a given path. The label
values are set at the edge on a per-flow basiseither once-and-for-all (open-loop),
or dynamically as a function of network state (closed-loop)facilitating end-to-end
control as part of edge control. A second component of edge control is access control
which prevents users from arbitrarily assigning labels to their packet flows without
consequences. Access control may be achieved by policing, traffic shaping, and pric
ing. We assume th a t the network (in general, service provider) exports a cost to each
user which increases with service quality, or equivalently, with the resources received.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

11

The system is completed by incorporating selfish users who can regulate the label
values on their packet streams to satisfy their QoS requirements at least cost, and
a selfish service provider who sets priceswhich determines user costto maximize
profit.

Control

A ccess
Control

P r ic in g

Per-Hop
Control

Figure 2.1. Overall QoS provisioning architecture. Network exports per-hop and
edge control, user exercises scalar QoS control (tj-control), and service provider
exports QoS cost to user.

The job of the network system properper-hop control and edge-controlis to


provide sufficient and efficient network mechanisms such th at for a set of users or
traffic flows with diverse QoS requirements, by suitable setting of the packet labels,
user-specified services in the form of target end-to-end QoS can be provided. The
setting of the label value, whether it is done by access control on behalf of a user or
by a user directly, should be powerful enough so th at the users QoS requirements
can be satisfied without necessitating the engagement of other traffic controls to the
extent possible1. The network control substrate should also promote stability in a
noncooperative network habited by selfish users and service providers, and facilitate
efficient allocation of network resources as an outcome of selfish interactions.

xIf an end-to-end delay of 30ms is desired but the route assigned has a propagation latency of 50ms,
then clearly no amount of class-based label switching can achieve the target QoS.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

12

2.2

Basic Definitions
Assume there are n flows or users. A user i e [1, n] sends a traffic stream at

average rate A; > 0 (bps). In the following, we will assume A; is given and fixed
( fixed bandwidth demand ). The case when Aj is variable ( variable bandwidth
demand) is considered separately. Let x* = (x\, x l2, . . . , x la) denote the vector of
end-to-end QoS rendered to user i. For example, x\ may represent mean delay, x \
packet loss rate, x \ delay jitter (e.g., as measured by some second-order statistic),
and so forth. We assume th at all QoS measures are represented such th at a smaller
magnitude means better QoS. A packet belonging to user i is enscribed with a scalar
T}i <E ( 1 , 2 , . . . , }
taking on L distinct values. Unless otherwise specified, we will use [a, b], for a < b,
to denote the set of integers between a and b. Typically, the number of users is very
large vis-a-vis the range of rji, i.e., n

L, and per-flow identityas conveyed by

Tjiis lost as soon as a packet enters the network. Thus by the many-to-one mapping
implied by n > L, aggregate-flow QoS control is imposed on per-hop behavior and
executed per-hop at routers on an end-to-end path. In our implementation design
(Chapter 4, 5) we use a number of bits in the DS field of IPv4 (and IPv6) to carry
the r) value (i.e., DSCP).

2.3

Per-hop Control

2.3.1

Per-hop Control Components

Per-hop control consists of a classifier and a packet scheduler. We assume a GPS


packet scheduler with m service classes and service weights a* > 0, 53fcLi a k = U
for an output port whose link bandwidth p is shared in accordance with the ser
vice weights. It is not necessary to have GPS as the underlying packet scheduling
disciplinee.g., priority queues, multiple copies of RED with different thresholds
are alternativesbut we will show th at GPS has certain desirable properties when

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

13

considering the problem of selecting an optimal aggregate-flow per-hop control for


differentiated services. An im portant component is the classifier which is given by
a map : [1, n] > [l,ra]. T hat is, n flowseffectively L (or less) flows from the
routers perspective since packets are scheduled by their label values onlyrouted to
the same output port on a switch are mapped to m service classes. For aggregate-flow
control, n > L and L > m. Thus
n > L > m,
and if L > m, this leads to a further aggregation per-hop in addition to the manyto-one mapping exercised at the edge due to n > L. For some choice of classifier
and packet scheduler, the QoS received by flow i G [l,n] at a switch is determined
explicitly or implicitlyby a performance function xl, x* = xl(rf, A), where rf =
(7 7 1 , . . . ,

T)n )

and A = (Ai,. . . , An). Flow is performance x%(r), A) is induced by the

performance function of the service class th at flow i is mapped to by . When the


traffic rate A is fixed, we will omit it from the argument list.
1 I

| PS |

2 [

| PS |

Per-H op

Figure 2.2. Left: Aggregate-flow QoS control affected by two stages of information
loss via many-to-one coarsificationat edge and per-hop. Right: r) value in DS
field of IP datagram is used by the classifier to select service class in GPS packet
scheduler.

2.3.2

Per-hop Control Properties

There are three properties of the per-hop control, listed below, which are of interest
and deemed desirable from a QoS control perspective. Let e* = ( 0 , . . . , 0 , 1 , 0 , . . . , 0)

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

14

denote the unit vector whose ath (i [1, n]) component is 1, and 0, otherwise. In
the following, i [l,n] refers to the end user, and x l(-) denotes the individual users
performance function. The properties are:
(A l) for each flow i and configuration 77, x l (rf +

e i)

< x*(rj) and x l (rf e*) > x l(i 7);

(A2) for any two flows i ^ j and configuration rj, x^ (v+ ei) > x j (rj) and

<

a;-7 f a - e * )

xi{ri);
(B) for two flows i ^ j and configuration rj, rji > r]j implies x %(rj) < x J(r]).
In the definitions, the range of rj is such th at the perturbations remain in the ndimensional lattice, i.e.,

77

+ ei,t) e* e [1,L]n. Property (A l) states that, other

things being equal, increasing the label value of flow %improves the QoS received
by flow i (recall th at small means better QoS in our representation). Property
(A 2 ) states th a t increasing r]i will not increase the QoS received by any other flow j.
Property (B) states th a t if flow i has a higher

77

value than flow j , then the QoS it

receives is superior to th at of flow j. We call property (B) the differentiated service


property. Note th at (B) has the immediate consequence x l(rf) =

x^ tj)

rji

rjj.

Thus there is no absolute, a priori QoS level attached to the rii values. It is the
magnitude of rji relative to other flows label valuesth a t will determine the QoS
received by a flow i. We will show th at the three properties, collectively, facilitate
effective QoS differentiation and control via

77

controli.e., scalar QoS controland

furthermore, allow selfish users to share resources efficiently when setting their

77

values commensurate with their QoS requirements.

2.4

Edge Control

2.4.1

Access Control

The properties exported by per-hop controlif satisfiedare not sufficient by


themselves to render end-to-end QoS commensurate with user requirements. End-toend (or edge) control complements per-hop control by setting the value of 77 per-flow

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

15

in accordance with user needs. We assume th at the network exercises access control
at the edge such th at users are not perm itted to assign rj values to their packets at
willif every user assigns the maximum rj value L to their flows, then QoS control
via rj loses its meaning (degenerates to FIFO-based best-effort service by property
(B)). This can be done by performing per-flow policing, traffic shaping, or assigning
costs via pricing. Open-loop control is used in the Assured Service and Expedited
Service instantiations of differentiated servicesalso called absolute differentiated
services [18]and is generally suited for short-lived flows for which feedback control,
when subject to long round-trip times (RTT), is ineffective. Figure 2.3 depicts the
overall structure of the end-to-end control framework.
T) Feedback Loop (User Control)
Accounting Loop

Receiver

Network Control

Per-flow Control

Aggregate-flow Control

Per-flow Control

Figure 2.3. Structure of forward QoS control path. Lower path comprised of
admission control, policing/shaping, per-hop controlopen-loop control. Upper
control path comprised of dynamic rj control, pricing, receiver QoS monitoring, QoS
feedbackclosed-loop control.

2.4.2

End-to-end Control

Our framework (also referred to as relative differentiated services in [18]) allows


end-to-end control to dynamically adjust the rj value in accordance with a users QoS
needs. Properties (A l), (A2), and (B) admit to composability in a WAN environment
where a users traffic flow goes through several hops along an end-to-end path. T hat
is, if a property holds for any single per-hop control, it also holds for a sequence of

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

16

per-hop controls in a network of switches when viewed as implementing a composite


performance function2. An end-to-end control of the form
rji(t) + 1 ,
Vi(t + r ) =

<

if x* > 0 \
if x l <

(2.4.1)

otherwise,
where 0%represents user zs QoS requirement vectori.e., expressed as a threshold
with delay less than 9\, packet loss rate less than 6\ and r > 0 represents the next
update, is asymptotically stable with respect to a single user3. Properties (A2) and
(B) reflect the resource-boundedness property of a router, and come into play when
considering a collection of selfish users engaged in end-to-end scalar QoS control, and
the dynamics this induces as a result of interaction.

2.5

U se r C o n tro l

2.5.1

U se r U tility a n d Selfishness

User i 's QoS requirement can be represented by a utility function Ui which has
the form /j(Aj, x \p j) where Aj is the traffic rate, x* the end-to-end QoS received, and
Pi the unit price charged by the service provider. The total cost to user i is given by
Pi A;. We assume th at Ui satisfies the monotonicity properties 4
dUi/dXi > 0, DUi/dx* < 0, and dUi/dpi < 0.

(2.5.1)

Other things being equal, an increase in the traffic rate is favorably received by a
user, so is an improvement in QoS, but an increase in the price charged by the
service provider has a detrimental effect on user satisfaction. These are minimal,
2In general, under flow conservation for (A l) and (A2), or certain packet loss dominance conditions.
3This assumes a total order on the union of reachable and required QoS vectors. See [10] for a
discussion of QoS ordering.
i Ui need not be differentiable, nor even be continuous. We use continuous notation here for notational clarity; monotonicity is the only property required.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

17

weak requirements on the qualitative form of user utility. If rj control is allowed


to be exercised by the user, then a selfish user i can be defined as performing the
self-optimization
max Ui(\i,yLl,Pi)

(2.5.2)

me[i,L]

where rjiinfluences

user zs utility Ui via its effect on the QoS received x \

We

assumePi(xl) is a monotone (nonincreasing) function of x*whichcorresponds to the


price function exported by the service provider. A slightly different formulation of
selfish, cost-conscious user behavior is obtained by the constrained optimization
formulation

min XiPi(xl)
m
subject to

(2.5.3)

x* < 0l

where dl is user i's QoS requirement vector.

Thus the user wants to minimize

costi.e., achieve efficient resource allocationwhile satisfying his QoS requirements.


Threshold utilities expressed as bounds on the QoS received is a useful means of rep
resenting and conveying a users QoS requirementdelay less than 33ms, packet loss
rate less 10-4 , jitter less than 3ms, and so forth. The user is asked to convey her QoS
preference as a quantifiable threshold when interacting with the network system (e.g.,
through a Web browser interface) which is employed in some practical systems [53].

2.5.2

N oncooperative Game

User i's QoS is influenced by the actions (rjj values) of other users j

%via x* =

x l {rj) as captured by properties (A2) and (B). If all users engage in self-optimization,
this leads to a noncooperative game. The first point-of-interest is stability. In a
noncooperative game, a configuration rj = (rji,. . . ,rjn) which determines the global
QoS allocation is stable if no user, under (unilateral) selfish actions, can improve her

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

18

utility from th a t achieved at rj. More precisely, 77 is a stable configuration or Nash


equilibrium if for all users i

G [1, n ],

+ cei),pi(ri + cei)) < U i(\i,x l(r}),pi(rj))


for all c G Z such th at rji + ce,
to

(2.5.4)

[1, L]. Since all users are stuck at rjwith respect

selfishmoves, the system finds itself at an impasse, i.e., rest point.

A similar

characterization holds for (2.5.3). Existence of Nash equilibria and their efficiency
properties are of im port since they characterize the behavioral aspect of a differenti
ated services network when put into action in a noncooperative environment such as
the Internet. We will show th at the global resource allocation properties in a nonco
operative network environment are intimately tied with the properties exported by
the per-hop control.

2.6

S erv ice P ro v id e r C o n tro l


For a single router shared by flows i and j , the only pricing constraint we impose

is
x* < xJ => P

i > Pj.

(2.6.1)

T hat is, the better the QoS received at a shared resource (i.e., router), the higher the
per unit flow cost charged to the user receiving superior QoS. Since x* < x J if, and
only if, the relative resources (in the present framework, bandwidth) allocated to flow
i is greater than th at of flow j , relation (2 .6 . 1 ) just says th at the more resources a flow
consumesthus receiving superior QoSthe higher the cost it incurs vis-a-vis a flow
th at consumes comparatively less resources. Relation (2.6.1), due to its generality,
leaves open the degree of freedom of setting the magnitude of the prices which we
assume is under the control of a service provider. The service provider can be treated
as yet another player in the gameassigned the index zeroand, if selfish, will try to
maximize his individual utility Uq. Uo is assumed to have the form of revenue minus
cost (i.e., profit) given by Uo(rj,X) =

^iPi(x *) Cost0 where Costa is the total

cost incurred by the service provider in delivering the services. The service provider

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

19

exports a price function p = p(x) where p(-) is monotone decreasing in x. Thus a


selfish service provider performs the self-optimization
n

max

AiPi(xl)

(2 .6 .2 )

p(,)

assuming fixed Costo. Closing the system by incorporating the actions of a selfish
ISP leads to a (n + l)-player noncooperative game.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

20

OPTIMAL AGGREGATE-FLOW SCHEDULING

In this chapter, we present a theoretical framework for reasoning about aggregateflow QoS provisioning, constrained to be implementable in IP networks. We develop
a theory of optimal aggregate-flow per-hop control and the properties they exhibit
which facilitate end-to-end QoS via the joint action of aggregate-flow control perhop and per-flow control at the edge.

We also study the stability and efficiency

properties of the overall network system when users are allowed to influence the
choice of scalar values in the DS field at the edge, and service providers export costs
to users commensurate with the QoS received. The results in this chapter are also
published in [69,

3.1

68

].

Optim al Classifiers and Per-hop Control


We take a reductionist approach to optimal aggregate-flow per-hop control by first

defining what optimal per-flow control is when packets are enscribed with a value from
L possible choices. Aggregate-flow control can then be viewed as an approximation to
the QoS achieved by per-flow control in a well-defined sense. Comparability between
aggregate-flow and per-flow control is facilitated by the fact that, even in aggregateflow control, an end users QoS remains well-defined, and the loss in power due to
coarsification affected by flow aggregation can be exactly quantified.

3.1.1

Optim al Per-flow Classification

Consider the per-flow control or classifier problem for n users who choose packet
labels from integer set [1, L]. Technically, per-flow classification means n = m (each
flows service can be individually configured), and L is either greater or smaller than

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

21

n. In general, the range L may be finite or unbounded, and the variable rji discrete
or continuous. The influence of boundedness and discreteness can be subtle, and its
effect is shown in Section 3.1.4 with respect to system optim ality where we quantify
the negative performance impact of boundedness and discreteness affected by loss
of resolution. When n users mark their flows with a value rji [1 , L] drawn from
the metric space [1 ,L] with property (Al) satisfiedlarger rji values, other things
being equal, result in a greater apportionment of resources and thus better QoS rji
can be viewed as codifying a users QoS or resource demand with respect to some
measurement unit. For example, rji may represent bandwidth demand in units of
Mbps. If network resources are infinite, then a flows request can be satisfied based
on the rji value specified, without consideration of the needs specified by other flows
(except, possibly, for pricing issues). That is, independence or decoupling holds. If,
on the other hand, resources are finitean OC-12 link is shared among bandwidth
insensitive usersthen, in general, the users collective resource demand may exceed
the available bandwidth.

In the presence of such resource contention, a conflict

resolution scheme is needed, including the criteria by which resource allocation is


decided.
Assume available bandwidth is normalized such th at total available bandwidth
is fj, = 1. First, assume rji R+ is a continuous variable over the suitable real
interval [0,1], expressing user is normalized bandwidth demand per unit flow. Let
a = (a \ , . . . , ocn) with a* >

, ]T]/t=i

, represent the fraction of resources

apportioned by the per-flow classifier to i [1 , n], and let u>j = ctj/Aj denote the
fraction of resources allocated to i per unit flow. Under the above semantics, given
rj (and A), the optimization
n

min
Y2(Vi
~ Ui)2
at
/-*

(3.1.1)

i= 1

measures the goodness of a resource allocation u; with respect to users codified


needs rj in the mean-square sense1. Since (3.1.1) penalizes by the difference error, the
relative importance of higher rji values is preserved, and resources are apportioned
1The generalization to other norms is treated separately.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

22

accordingly. For general % K+, including the discrete and bounded case rji e
which is of special interest, define the normalization

{1

Vi =

- Z

i n

l f

^ m ax

^ m in >

2 ^

otherwise,

where 77mjn, ?7 max are the minimum and maximum values of {vi, V2 , , Vn}, respec
tively. Note th at rji

[0,1], and unless all rji values are equal, rjmin = 0 and fjmax = 1 .

Let Qi denote the normalization of Ui via (3.1.2). Given rj, the optimization corre
sponding to (3.1.1) is

n
min
rv
a

7f

- u;*)2.

(3.1.3)

i= 1
(3.1.3) realizes the same semantics as (3.1.1), however, generalized by the function or
code (it is not 1-1) given by (3.1.2) to Vi values not restricted to the real unit interval
[0,1]. If L is bounded, then the 1-1 function rji = iji/L achieves a similar purpose.
(3.1.3) possesses the same desirable properties as (3.1.1), which are characterized by
the following two results.

P roposition 3.1.4 (O ptim al Per-flow Classifier)

Given rj, A 6 R , the com

plete solution set to (3.1.3) is


OH = (1 - v)
where
Proof,

< v <

+ V
x ,
2-jj=i AjVj
2^,j=i Aj

* e [ l ,n ] ,

(3.1.5)

is a parameter which defines a continuous family of solutions.

(i) First we show th at any a given by (3.1.5) is an optimal solution to (3.1.3).

To achieve this, we just need to show th at such a satisfies rji = i2>i for i [1, n}.
(a) If vmax = Vmin, then rji = 1, i <E [l,n\. From (3.1.5), we have
oii =

(1

u)

+ v
l~,j=i Ai

l ^ j =i Ai

nA*
2 ^j=i

* e [ l,n ] ,

Aj

and
1

t -

Ai

v -^ n

r - i

r-l

i G [1 , n j .

l ~ / j = 1 Aj

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

23

Since uj\ = = ojn, we get Cji =


(b) If Tjmax

, i [1, n], which means rji

[1, n].

Vmin, then fjmin = 0, and fjmax = 1. From (3.1.5), we have


_

_
\
Ai

( 1 - u) rji
v-'n \ "
2~/j=i ^jVj

v
V 'n \ !
z^j=i Aj

^ ^ [1j

>

and
l i/
wmm

\r^ n

w m ax

2 ^ j = 1 Aj

+1

2 w j= l AjTfj

2 w j= l A?

Because u w / w , from (3.1.2),


Wj COmin
Wi = ---------------- ,
^max ^min

. ri ]
z G [ l,n j.

/o i >\
(3.1.6)

Substitute tu*, tumjn and ojmax in (3.1.6), we will get cu* = rji, i G [l,n].
(ii)

Next we show th at (3.1.5) represents a complete solution set for (3.1.3). Sup

pose a ' = ( (, ,a4 ) is an optimal solution to (3.1.3).


0

< v' <

We will find some u',

such that
I
^i

(1

l\
XiVi i /
Ai
^ ) v='n
I ^
L - / j = 1 A j Vj

F mSr~ )

_ ri

ZG

[1, Zl] .

A-ij=l Aj

From the first part of the proof, we have seen th at some a i , , a n can achieve
rji = u){. Since a i, ,a'n is an optimal solution to (3.1.3), rji =
(a) If r}max = rjmin, then rji = 1, and u- = 1, i G [1, n]. We have
ai =
Ai

= <
A

and
ai H

1-

a'n =

1 .

By solving the above two equations, we can get


A,;
Oi'i

E ]U V

In (3.1.5), let v' = c, 0 < c < 1, then we get the same a i, , a'n

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

24

(b) If r]max Vmin, then u'min = fjmin = 0, and u max = fimax = 1. Thus

>

min*
mm
max

= rjt,

(3.1.7)

[ l,n ] ,

mm

and
Ai + ----- 1- u'nXn 1

(3.1.8)

From (3.1.7) and (3.1.8), we can represent cuj, i [l,n], as


mm
mm

Therefore,
"F ( 1
In (3.1.5), let u' = u>'min Y^j=i

mm

then we get the same a'x, , a'n.

We shall show 0 < v' < 1. v' =

X)j=i 'V? 0 because

> 0. On the other

hand, since u'rmn < uj'n


n
mm

y
j= i

< y > ; . A , i.
j =i

The param eter v, which stems from the dimension reduction associated with (3.1.2),
has an appealing interpretation. The second term in (3.1.5) corresponds to the pro
portional share achieved by FIFO scheduling, whereas the first term corresponds to
proportional share of the corresponding virtual flows Aifji, which are the original flow
rates weighted by their relevancy variable rji derived from rji. Thus, if v 1, then
the per-hop control effectively ignores the label values and behaves as a FIFO queue.
If v = 0, then the router acts like a GPS scheduler with service weights given by the
first term. For any other value of v, (3.1.5) represents a convex combination of the
two behavioral modes.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

25

The optimal per-flow clas

P ro p o s itio n 3.1.9 (P er-flow C lassifier P r o p e rtie s )

sifier given in (3.1.5) satisfies properties (A l), (A 2 ), and (B).

Proof.

Denote the normalization of rj by fjfirj), i [l,n], and the normalization of

rj + e* by fji(r) + e*), i [1, n\. Similarly, we will use notations 0Ji(rj) and ufirj + e*),
i [l,n].
(i) Property A l .

Consider uji = yC Rewrite (3.1.5) as


(1

- v)

Wi ------

E"=IAy(|)

Because

+ e,) > i7 (t/) and %(ij + e,) < fij(r)) for j # i,

j [l,n]. Hence, wfirj + e*) > cofirj). Furthermore, x %is a monotone function of cu*,
so x>(ri + ei) < x {(rj). Similarly, we will get x %{r\ e{) > x^rj).
(ii) Property A2.

Consider Uj =

j ^ i.

(a) If rji rjmax, rewrite (3.1.5) as


(l-W i

.,

3 *iVi + Y,k?ixkfik ELiV


Because -rji rjmax, fn(r} + a ) > fjfir]), and fjk(ri + a ) = fjk(rj) for k i. Hence,
ufirj +

< ufirj)-,

(b) If rji = rjmax, then rewrite (3.1.5) as


v

(I-*')

Ai(|-) +

M f^ )

E L iA fc

Because rji rjmax for configuration rj, rji + 1 = rjmax for configuration rj + e^. By
(3.1.2),

> | M , and

= $$, M

i- Hence,

+ e() < u , ( , ) .

Therefore, in both cases, Uj(rj + e*) < ojfirj). From the monotonicity of x \ we get
^ (* 1

+ ei) > x (rj). Similarly, we can also get x^{rj d ) < xj (rj).

(iii)

Property B.

For i / j , rji > rjj implies rji > fjj. By (3.1.5), uii > ojj if rji > rjj.

Hence x l (rj) < xi in)-

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

26

3.1.2

Optim al Aggregate-flow Classification

W ith the semantic set-up of optimal per-flow classification, let us consider the
aggregate-flow classifier problem where n > m. The original aggregate-flow classifier
problem, n > L = m, is subsumed by the more general set-up where L can take on
any value. From a QoS provisioning perspective, the ultim ate goal of a differenti
ated services network comprised of aggregate-flow per-hop controls is the provision
ing of end-to-end QoS commensurate with each users needs. Aggregate-flow control,
whether it has many or few labels, must service n flows using m < n service classes
which results in a reduced ability to effectively shape end-to-end QoS with respect
to the performance criterion (3.1.3) when compared to per-flow control. T hat is, the
minimum value of (3.1.3) achieved by optimal per-hop control is smaller than th at of
optimal aggregate-flow control. This is a consequence of a more general result given
by Proposition 3.1.12.
We give a formal definition of aggregate-flow per-hop control. An aggregate-flow
per-hop control with parameter (m, n) is a function

$ m ,n

where : [l,n]

[1

: (?, A )

(3.1.10)

(, a ) ,

, ra] is the classifier and a = ( a 1, . . . , a m) is the vector of

service weights assigned to the m service classes. W ith respect to end users, $ TO>n
inducesexplicitly or implicitlya performance function (plm n for each user i [1 , n]

: (*? A)

where a* = (p%

>

(3.1.11)

ah

is user zs share of the bandwidth allocated by

Since the traffic rate A is fixed, we will omit it from the argument list. The two-stage
interpretation of aggregate-flow per-hop control is depicted in Figure 2.2.
The resource share received by individual flows under <hmi can be computed as
follows. Let
Uk = {i G [1 , n) : (t) = k},

h e [1 , m],

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

27

be the partition of [l,n] induced by . For i 4, set a* such th a t ^2ieUk ai = a k,


and ai/Xi = constant. This is the share received by user i [1 , n].

Proposition 3.1.12 (Service Class M onotonicity)

Let $ m,n be an aggregate-

flow per-hop control, and let S m ( a : <pmtn(r}) = at for some rj}.

Then (3.1.3)

achieves a smaller value with more service classes, i.e., for all m' satisfying m +

<

m' < n,
n

{ min

< { m m ^ W , - * ) 2}.
i= 1

Proof.

Let

i 1

be an arbitrary but fixed m-class aggregate-flow per-hop control.

Because m + 1 < n, under

there exists a service class k th at receives more

than one flows. Construct an (m + l)-class aggregate-flow per-hop control $ m+i,n


as follows: <&m+i,n takes same class mapping and weight assignment as

except

dividing class k into two classes k i , k2 with weight a kl, a k2 satisfying


a ki

a k2

a kl + a k 2 = a k,
iieukj ^

z-,ieuk2

We have
<P%
m ,nin) = V m + l M ,

* G [l, n].

Thus any a G S m can be achieved by some <3>m+i)n. Therefore, S m C S m + 1 C C S n,


and the proposition follows.

Consider a special type of aggregate-flow per-hop control $ m,ncalled Reduction


Classifierwhose behavior is completely determined by its classifier : [l,n] >
[1, m], in the following sense. On input (rf, A), $ m>n behaves as shown in Figure 3.1: It
reduces the n user problem to an m user per-flow classification problem by aggregation
of component flows and centroid computation, then solves the reduced problem by
applying the optimal per-flow classification solution.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

28

. Compute Afe = ^ZieUk X, for each A; [1, m].

2. Compute rjk for k G [1, m] as follows,


,

if 3i e Uk, fji = 0 ;

1,

if 3i E Uk, 7ji = 1;

'EieUkVi/lUkl

otherwise.

vk -

3. Use per-flow optimal solution (Proposition 3.1.4) with new input


rj = (fj1, . . . , fjm), A = (A1, . . . , Am), to solve the reduced per-flow
classifier problem consisting of m superusers.

Figure 3.1. Behavior of reduction classifier

Theorem 3.1.13 (R eduction Classifier)

Let 4m) be a reduction classifier rep

resented by its classifier . Then $ m,n is an optimal aggregate-flow per-hop control,


i.e., satisfies (3.1.3) if, and only if, is a solution to

m
finfce[i,m]
E iet/fc
E fi-w

t3-1-14)

where the minimum ranges over all reduction classifiers '.


Proof.

Let

be an aggregate-flow per-hop control. Let u: (a;1, , ojm) where

ojk = a k/X k, k e [1, m}. We have Ui =

and

Thus $ m>n is the optimal

solution to (3.1.3) if and only if 4>m>n satisfies

!? E Ew*-^)2For a reduction classifier, d = ( a 1, . . . , a m) is computed by optimal per-flow


classification with input r) = (fj1, . . . , f}m), thus fjk = u k, k [l,m ].

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

Hence the

29

theorem holds if we can show th at in an optimal aggregate-flow per-hop control, The


weight a k of service class k, k G [1, m] is computed as

ojk

0,

if 3i GUk, rji = 0;

1,

if 3 i e 4 , ^

'Eiuk Vi/\u k l

otherwise.

= 1;

(3-1.15)

We will prove the above by contradiction.


(i) For class k, if 3i* G Uk, rji* = 0, then (3.1.15) indicates th at tjk = 0. Suppose
u)k ^ 0. Then 3 j / k, il)j = 0. We will construct another aggregate-flow per-hop
control $'m n with same (w1, , cum) but different ' as follows: For flow i*, '(i*) = j ;
For any other flow i / i*, '( i) = ( i) . We have (rji* u k)2 > 0, and (rji* u j )2 = 0.
Therefore,
TO

TO

]C(* j =i ieu'j

^ ) 2

< X!
k=i ieuk

_ ^ fe)2>

which is contradictory to the optimality of


(ii) For class k, if 3i* G Uk, fji* = 1 , then (3.1.15) indicates th a t u k = 1. This case
can be proved by the similar argument as in (i).
(iii) For class k, if Vi G Uk,rji 7 ^ 0 and rji 7 ^ 1, then(3.1.15) indicates th at u k =
H ic u ^ i/W k l- Suppose u k j- Y,iuk Vi/WA- By (i), 3kx ^ k , u kl = 0 since Vi G
Uk,rji 7 ^ 0; similarly, we know 3/c2

= k,

u k2 1. Therefore, we can construct another

n with same but different Cj ' (u/1), , o /m))

aggregate-flow per-hop control


as follows: For j

7 ^ k,

= uV;For j = k, cu^ = ^2ieuk Vi/Wk\- By the property of

mean square, we have


^2(rji - u (k))2 <
ieuh
i<=Uk

- Vk)2

and hence
TO
J2

k= 1 ieuk

TO

< X! X !^
k 1 iUk

^ )2>

which is contradictory to the optimality of <&TO,.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

30

Theorem 3.1.13 shows th at an optimal aggregate-flow classifier must be a reduction


classifier, and furthermore, it must efficiently coverin the mean-square sensethe
set of label values {7 7 1 , 7 7 2 , >fjn} using m centroids {fj1, . . . , rjm}.

Thus optimal

aggregate-flow per-hop control is a clustering or classification problem in the statistical


classification sense.
A classifier is well-formed (also called a grouping) if the three conditions rji < rjj,
(*) = 0 ); and rji < r)k < rjj jointly imply (&) f(i).

Thus if two different

label values are mapped to the same service class, then all r] values sandwiched
in-between must be mapped to the same service class. can be represented by wellformed parentheses on the totally ordered set rji < ij2 < < Vm where adjacent
values are grouped into the same partition except, possibly, at boundaries.
T h e o re m 3.1.16 (G ro u p in g )
Proof.

An optimal aggregate-flow classifier is well-formed.

Let 4>m>n be an optimal aggregate-flow per-hop control with classifier and

weight vector a = ( a 1 , , a rn) . We will first prove th at is well-formed in 77-space


by contradiction.
Suppose fjh < f)iQ, (/i) = (/2) = k and 3/, fjh < rji < % and (/) = k', k' k.
For class k, let f j and fjknax be the minimum and maximum value of fj* for all i Uk,
and let iknin and ikmax be the indices of fjknin and fjknax- We also use similar notations
for class k'. Clearly

< rji < iji2 < Vmax- We then construct another aggregate-

flow per-hop control $'m n with classifier ' and corresponding Gf

, u /m))

as follows:
(i) Suppose cbk = u k'. Because fj^ < fji2, either 77^ ^ Cjk or fji2 ^ u k. Suppose
77 (l

/ Cjk. Let U'k Uk U Uk>- {/1 },

= u k and U'k, = {h }, Cj ^

= %.

(ii) Suppose Cok < u>k'. For class k and k1, do the following:
(a) If Vmin ^

then U'k = Uk u { & } , W{k) = Uk and U'k, = Uk>- {& },

Cj(k'} = u)k' .
(b) If u k' < fj^axl then U'k = Uk - { i ^ ax}, u (k) - u k and U'k, = Uk>U {ikmax},
oj(k') = Cjk' .

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

31

(c) If U)k

< f}*in, 'tjkmax < uik' , and C m < v L x , then U'k = Uk U {ikmin} - {ikmax},

and U'k, = Uk<U {ikmax} - { C in}, u {k>) = u k'.


(d) If ujk< rjkmin, iikmax < u k', and ry^in = rjkmax, then compare
ItfLn ~ u k'\-

\ f ^ ax - u k\ and

If \v L x ~ u k\ < | C - wfc'l. U'k = Uk U {**<n},

when w* = 0,

&{h) = 'E ieu'k Vi/\U'k\ when u k 0, and U'k, = Uk>u {k,) = E ie/', f)i/\U'kl\ when u)k> #

> | ^ - w*'|, U'k = Uk - {ikmax},

; If

= ieC/, Vi/\Uk\ when u k 0, and U'k, = Uk>U {**,},

,(*) = w* when w* = 0,
uj(k') = cDfc/ when Cjk' =

= u k' when Cjk' = 1,

= Ylieu', Hi/Wwl when Cjk' ^

(iii) Qk > u k' . Similar to (ii).


In each of the above cases, C/j = Uj and

= d;J for all class j with j ^ k and

j / k '. We have
Y ^ iV i ~ w(/ )2 + Y ^ (fc,))2 <

k*)2 + S ^
%.Uy

~ ^ )2-

(3-1-17)

Hence
m

Y Y (^ ~ ^0))2< Ufa~ 2,j)2


j=i ieu'j

j=i ieUj

(3 . 1 .1 8 )

which is contradictory to the hypothesis th at 4>m>n is optimal. Therefore is wellformed in ?)-space. Because rji < rjj O- rji < rjj, is also well-formed in ?7 -space.

Thus aggregate-flow per-hop control is, mathematically, an optimal clustering prob


lem. Unlike its many brethren in higher dimensions th at are, with few exceptions,
NP-complete [28], the clustering problem given by (3.1.14) in Theorem 3.1.13 has a
poly-time algorithm; e.g., it can be solved by dynamic programming. When L = m
the practically relevant case where there are as many labels as service classesoptimal
aggregate-flow classification has a linear time algorithm.

3.1.3

Properties of Optimal Aggregate-flow Classifiers

Although optimal per-flow classifiers satisfy properties (A l), (A2), and (B), the
same is not necessarily true of optimal aggregate-flow classifiers.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

32

An

T h e o re m 3.1.19 (A g g reg ate-flo w C lassifier P r o p e rtie s )

optimal

aggregate-flow per-hop control satisfies property (B), but need not satisfy properties
(Al) and (A2).
Proof.

Let 4>m>n be an optimal aggregate-flow per-hop control.

(i) Property B.

Consider two flows i / j and rji > rjj. Let k\ = (rji) and

k2 = (rjj). Since an optimal aggregate-flow classifier is well-formed, rji > rjj implies
fjkl > fjk2. Thus u ki > ujk\ Therefore, uji > ujj and property B follows.
(ii) Property A l .

Property

A l does not necessarily hold. Given rj' = rj

+ e*, let

' be the classifier and a ' = (ct^\ , a ^ ) be the weight vector computed by $ m,.
Property A l requires u;' > cjj. If = ', then

77 ^ W

> f j ^ . Thus

oj[

> uii. However,

i-e- th e change of rji leads to regrouping , then it is possible th at a;' < u>i.

if 7 ^

Here is an example.
Let n 5, m 3. Let A = (1,10,1,1,1), rj = (0 ,2 ,4 ,6 ,9 ), and rj' = rj + e 4 =
(0 ,2 ,4 ,7 ,9 ).

The results of $ m,n for (A, rj) are: fj = (0, | , | , , 1), Ui = {1,2},

U2 = {3,4}, U3 {5}, and Cb = ( 0 , ^ , ^ ) ; The results of $ m,n for (A, 77') are:
= (0, , f , 5 , 1 ), U[ = {1}, U'2 = {2,3}, UI = {4,5}, and u ' = (0,
4, (4) = 2, '(4) 3, and W4 = o>2 =
(iii) Property A2.

Thus

<

^ ) . For user
oj4 .

Property A2 does not necessarily hold. Given rj' = rj + e;, let

be the classifier and a ' = (a^1), , of1^ ) be the weight vector computed by $ m,n.
Property A2 requires Vj 7 ^ i, ui'j <
Suppose (i) = k and Vj
Theorem 3.1.13, u k =

ujj.

Following is a counterexample.

Uk, rjj / 0 and rjj 7 ^ 1 . As shown in the

^ If

= > an(l Vi 7^ 1> fhen ;(*) = ^ and

proof of
=

*/

Cjk +

Because $ > rji,

> u k. Therefore Vj E Uk, oJj > Uj.

Property (A2) is more subtle than (Al) and (B), but of im port in influencing the sta
bility and dynamical structure of noncooperative networks built on top of aggregateflow scheduling.
P ro p o s itio n 3.1.20 (C lassifier P r o p e rtie s w ith m L)

A n optimal aggregate-

flow per-hop control with parameters m = L satisfies properties (A l), (A2), and (B).

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

33

Proof. Let

be an optimal aggregate-flow per-hop control with param eter m = L.

Given A and rj, because rji, i G [1, n], ranges over [1, L], is computed as:
(*) = Vi,

*7C(,) = Vi,

* e [1, n],

(3.1.21)

(3.1.22)

and d is computed as:


Akfik

* = ( ! - ')

\k

Ei=i AV

Ei=iA

* [ 1 ,L ] ,

From (3.1.21) and (3.1.22), we can derive users share of bandwidth allocated by $L,n'Q = aiW T
M = t1 ~
A?w

22 j=i

y E j= i A;

xj Vj

i l1 n l'

<3 1 '23>

Compare (3.1.23) with (3.1.5), we can see th at $L,n is equal to an optimal per-flow
classifier. According to proposition (3.1.9),

satisfies properties (A l), (A 2 ), and

(B).

The next result shows th at the optimal aggregate-flow per-hop control with m = L
not only differentiates services among classes, i.e. satisfying property (B), but also
preserves an even, uniform QoS separation: the performance difference (in terms of
bandwidth per unit flow) between flows is proportional to their r] value difference.

Proposition 3.1.24 (QoS Separation w ith m = L)


aggregate-flow per-hop control with parameters m = L.

Let

be an optimal

Given input rj and A, for

all i, j G [1, n],


Ui Uj = c (rji - rjj)

(3.1.25)

where c is a constant only depending on rj and A.

Proof

(i) If rjmax - Vmin, then V*, j G [l,n], rji = rjj and Ui = Uj. Thus (3.1.25) is

satisfied, (ii) If rjmax # rjmin, we have


fj
OJi (1 v) ^\n

T ;

2-jk= 1 XkVk

1/
b

T- ,

2^=1 Ak

* ^ [1)

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

34

Hence,
Ui-Uj

V i ~ Vj

( 1 - 1 / ) = ; ----

l^,k= 1 AlcVk
^

V i~ 'n m in
Vj ~V rnin
Vm ax Umin_____V m ax ~~Vmin
Vk V m in
' ^mca: V m in

E L i x k-,

Sfe=1 ^k{Vk

iVi-Vj),

i,je[l,n}.

Vmin)

The L = m constraint advanced by Proposition 3.1.20 and Proposition 3.1.24 co


incides with practical considerations th at derive from an implementation perspective.
For example, assuming four bits from the TOS field in IPv4 are used to encode the
label set {a, a + 1 , . . . , a + 15} for some a > 0, then we may configure 16 service
classes at routers, one for each of the 16 possible label values. The classifier results
and properties for fixed service weights are treated separately.

3.1.4

System O ptim ality and Structural Properties

To satisfy user i s QoS requirement 0l, the per-hop controlwhatever its specific
formmust apportion a fraction a* > 0 of the available bandwidth. Let a* denote the
minimal such bandwidth. We will find it more convenient to work in the service weight
space ( a : a > 0 and Y^i=\a i < !} We will use </?*() to denote the performance
function corresponding to *() which allocatesexplicitly or implicitlya service
weight to user i for a given input rj.
Given per-hop control 4>mi, let A represents the set of configurations where all
users QoS requirements are satisfied, i.e.
A* = {rj: ip'fa) > a*,i G [l,n]}.
A configuration rj is system optimal if rj ^4*, and for all rj' rj, <p(r}') > <p(rj)
does not hold. In a system optimal configuration, the users QoS requirements are
met while expending the minimal amount of resources. In an overloaded system, i.e.,

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

35

E "= i a i >

) by definition, there cannot exist a way of allocating network resources

such th at all users QoS requirements are satisfied.


We turn our focus to characterizing when A* is nonempty under optimal
aggregate-flow per-hop control. The next result is the only general result th a t holds
from property (B) without exploiting further features of optimal aggregate-flow clas
sifier solutions.

Proposition 3.1.26 (Diagonal Inclusion)


[l,n ]} .

Let V = {rj : r]i = r]j for a l l i , j E

For all per-hop control $ m>n satisfying (B), if a* < Aj/]T ) " = 1 Aj for all users

i E [1, n], then V C A*.


Proof.

For any r) E V , rji = rjj, for all i, j [1, n\. By property (B), rji = rjj implies

uji = Uj. We have ^

^ = =

and

= 1- Thus a* =

which

means o* > of, i E [1, n]. Therefore, rj E A and V C A.

Note th a t a* < A,/ E j= i Aj for all i E [1, n) implies th at E ie[i,n] a i 1- Next, we


find weaker conditions for A* ^ 0, and characterize the loss of power resulting from
having a bounded, discrete label set { 1 , 2 , . . . , L}. To achieve this, we utilize the
properties of the optimal aggregate-flow classifier solution for L = m . First, consider
the case when rji E M+ for all i E [1, n ], and n m . The case of interest, rj E [1, L]n
in the aggregate-flow case can be analyzed by relating it to the unrestricted case.

Theorem 3.1.27 (Unrestricted Intersection)

Assume rji E R+ for alii E [l,n].

Let n = m, and let be the optimal per-flow classifier. Then A* ^ 0 if, and only if,
(a) 3 i E [ l , n ] such that a* < ^Aj/

= 1

Aj, and

(b) E"=imaxK*> uXi/YTj=i Xj} < T


Proof.

(i) First we show th at A ^ 0 when both (a) and (b) are satisfied. Given

ot* = (a*, , a*) satisfying (a) and (b), we will construct an rj such th at ^ ( r j ) > a*,
i E [l,n]. Let a[ = max{a*,
construct rj as rji = ^

\j }> * e [1, n]. Because of (b),


e [1, n]. rji> 0 since a[ > u

We
A. , i E [l,n].

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

36

We need to check whether ipl(rj) > a*, i [1, n\. Because of (a), we get t]min = 0.
Hence /h =

*f m a x

. Using (3.1.5),
we have:
'
li

v)

(1

(1

him.
EU

+ v

" E"=i ^

u)

- E J .^ A .-(ti - m "
Wmax

^7

j= 1

E?=fJ-" + u E ^ A
' x.
f 1 7/^ ------- Izzl-_1_ 7 / ----- Al__
11
v>
1-v
E?=i Aj
^ _ ^

o'.
(ii) Second we show th at .4 = 0 when either (a) or (b) are violated.
(a) Suppose Vi [1 , n], a* >

A.. Pick any rj G M", we have fjmin = 0.

Suppose fjk = fjmin, then according to (3.1.5), a*, = v ^ k , and a*, < a. Therefore
2 _,j=i Aj
.4 = 0.
(b) Suppose ]C = 1 niaxjaj*, ^ n A* A, } > 1. We can show A = 0 by contradiction.
Suppose > 1 / 0 , then

3 7 ,

a:* = </(r/) > cc* for i G [l,n]. By (3.1.5), a* > ^ vnAi .


E,=i Aj
for i G [l,n]. Therefore, a* > max{a*,
a} Because
= 1> we et
^j=l 3
Y^!i=\ max{a,
A } < 1 , which is contradictory to our hypothesis.

Here v > 0 is the solution parameter of the optimal per-flow classifier which de
termines how much proportional sharing to inject in the service weight allocation
{y = 1 degenerates per-hop control to FIFO). Theorem 3.1.27 is a tight characteriza
tion of .4*s nonemptiness in the unrestricted case where properties (a) and (b) stem
from the particular form of the optimal per-flow classifier solution given by Proposi
tion 3.1.4. Note th a t as v > 0, (b) becomes

a t ^ 1 which is the weakest possible

condition for nonemptiness of A*. The next result is an immediate consequence of


Theorem 3.1.27.

Corollary 3.1.28 (Em pty R estricted Intersection)

I f A* = 0 in the unre

stricted case, then A* = 0 in the restricted case where T]i (1,2, . . . , L } fo r all
i [1, n], and L < oo.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

37

The aggregate-flow and per-flow cases with respect to nonemptiness of A* can be


related by the next result which is a consequence of Theorem 3.1.20.

Proposition 3.1.29 (Per-flow and Aggregate-flow R elation)


{1,2, . . . , L } for all i G [l,ra], and L < oo. A*
control (i.e., n = m) if, and only if, A*

7^

Let

0 for optimal per-flow per-hop

0 for optimal aggregate-flow per-hop

7^

control with m L.
Proof.

By the proof of Proposition 3.1.9, the optimal aggregate-flow per-hop control

with m = L is equivalent to the optimal per-flow per-hop control.

Given the relationship of nonemptiness of A* between the per-flow and aggregateflow case under 77, G {1, 2, . . . , L}, what remains is a quantitative characterization of
the loss of power due to discreteness and boundedness of the label set [1, L] in the
aggregate-flow case.

Theorem 3.1.30 (Loss of Power due to R estriction)


optimal aggregate-flow per-hop control

if there exists

Let L m < n.
7

For

= (7 1 , 7 2 , , 7 n) with

7min 0, Ifmax 1 , 0 ^ 7 i T Such that

( v) E U h i i

"

- i E U V r ,

for all i [1 , n], then A* 7 ^ 0.


Proof.

Given users resource requirement a* = (o;*, , a*), Let $i,n be an m

L optimal aggregate-flow per-hop control with classifier f and weight vector & =
( a 1, , a L). If there exists

= (7 1 , , 7 n) with

min = 0,

max = 1, 0 <

th at satisfies (3.1.31), then we have


(1

.. ^*(7* l - 1 ) ,
v)
v
V
l^j=l
2^j=l *j

W ithout loss of generality, suppose


D =

{7 7

: fji = 0,

71

i -^ -7 - < 7)i <

n 1
Z G [l, Tlj.

^ Qft* 5

= 0, 7 =
7

max = 1. Define

i for i G [2, n - 1], fjn = 1}.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

*< 1

38

Then for all fj E D, we have


Xi^
i
v ) y^n
\
f
.
2~ij =i jV j

(1
(1

x*
^ n
vZ^j=i
1 \A? (

\ ^*(7* i - l ) .
V 'n TjTj
\

xi
V*
\
2-q=i

*
0ii'

Construct rj* = (r]l,- , 77 *) as follows:


V* =

l( L

i) 7<+

lj,

t [ l , n ] .

Then f f E D. Hence <pl (ri*) > a*, i E [l,rc]. Therefore, .4 7^ 0.

The left-hand-side of inequality (3.1.31) just denotes a valid service weight vector with
respect to the optimal aggregate-flow classifier. The second term in the right-handside of (3.1.31) of Theorem 3.1.30 quantifies the loss of power due to coarsification.
If L > 0 0 , then the loss-of-power term drops out. In practice, L is a small finite
value (e.g., using 4 bits in the precedence field of IP, L = 16). The next result shows
th at n

L the raison d etre of aggregate-flow controlfacilitates tightness of the

bound.

Corollary 3.1.32 (N onem pty D iscrete Intersection)

Under the same condi

tions as Theorem 3.1.30, let di = [(L 1)7*J, i E [l,n].

Then, A* ^ 0 if for all

i E [1, n],
(1

Proof.

_ y)

X*7l
+ v
x > a* + (1 2^=1 Af7j
L , j = 1 Aj

^ --------- .
E fc=1 k T,j:dj=k Xi

(3.1.33)

We have
L 1

kj:dj
E k =jE
k
1

= 1

S (i - i) E
j

= 1

Hence,
/1

X i

(! - y) ^ L - i
E E *

1 -

r ^
_ L_

1/

Ai

E "= l AJ7,

Therefore, if (3.1.33) holds, then (3.1.31) holds and 4*


For n

L, we can expect

L_t

0.

------ C 1, and (3.1.31) gives a tight bound on

Ejfe = 1 * Z ~ i j : d j = k 7?

the existence condition of system optimal configurations.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

39

3.2

Game Theoretic Structure


The roadmap of the game theoretic results is as follows. First, we derive stability

propertiesexistence of Nash equilibria and their structureand dynamics of the


noncooperative QoS provision game when users are allowed to set their rj values endto-end. Second, we show efficiency properties with respect to system optimality, in
particular, when Nash equilibria are system optimal.

3.2.1

Basic Definitions

To satisfy user s QoS requirement 0 \ the per-hop controlwhatever its specific


formmust apportion a fraction a* > 0 of the available bandwidth. Let a* denote the
minimal such bandwidth. We will find it more convenient to work in the service weight
space { a : a >

and JZlLi a * -*} We will use <*() to denote the performance

function corresponding to x l (-) which allocatesexplicitly or implicitlya service


weight to user i for a given input

77.

We will call the pair (7 7 , rf') of control vectors a selfish move of user i E [1, n\ with
respect to a* if 7 7 ' = rf e, and the following two conditions are satisfied:
< a* implies

77'

77

+ e* and <^*(7 7 ') > <*(7 7 );

(ii) v?*(t7 ) > a \ implies

77'

77

e and a* < <*(7 7 ') < <p%(rj).

(i)

Thus an unhappy user tries to improve his happiness by increasing rji, while an
overly satisfied user tries to reduce the satisfaction level to match his actual needs.
We will call a pair of control vectors (7 7 , 7 7 ') a concurrent selfish move (in the
negative direction) if for some J C [1, n],

77

77

e*, and (7 7 , rf e*) is a selfish

move for all i J. An analogous definitions holds for concurrent selfish moves in
the positive direction. We will sometimes refer to selfish moves as sequential selfish
moves to distinguish from concurrent ones. The definition of selfish move describes
an efficient or cost conscious user who only consumes just enough resources to satisfy
her QoS needs.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

40

For user i, let A {f) (pl(rj) > a*}. Thus Ai represents the set of configuration
where user i's QoS requirement is satisfied. Let
n

A* = f ] A i .
i= 1

Thus all users QoS requirements are satisfied at rj A*.


system optimal if rj A*, and for all rj' / rj, <p(rj') >

A configuration rj is
does not hold. In a

system optimal configuration, the users QoS requirements are met while expending
the minimal amount of resources. In an overloaded system, i.e., Yhi=ia i > 1, by
definition, there cannot exist a way of allocating network resources such th at all
users QoS requirements are satisfied, rj A* is a comer point of A* if the set of
selfish moves from rj is empty.

3.2.2

Nash Equilibria and Stability Properties

We present the dynamical properties of the noncooperative QoS provision game


when A* exists (i.e., is nonempty) and r] A*.

Proposition 3.2.1 (Projection)

For user i and configuration rj

A>

let

M i( r j) = {rj' : 77- = rjU and rf^ < rjj for j ^ i } . Then M i ( r j ) C A -

Proof.

Let rj' M iirj). By property (A2),

771

= (r}[,

7n) A - By a

second application of (A2) to rj1, rj2 = (ij[, rf2, 7 7 3 , . . . , rji,. . . , ijn) A By a repeated
application of (A 2 ), the procedure eventually yields rj' where membership in A is
preserved at every step.

Proposition 3.2.1 is a consequence of property (A2) of the per-hop control. We can


use Proposition 3.2.1 and property (AI) to show a closure property of A*.

Lemma 3.2.2 (Closure)

A* is closed under selfish moves, sequential and concur

rent. That is, for rj A* and any subset of users J C [l,n] such that (7 7 , 7 7 e*) is
a selfish move for all i J ,
r j - J 2 e i e A *'
ieJ

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

41

Proof.

First, note th a t closure with respect to sequential selfish moves is a special

case of closure with respect to concurrent selfish moves. We will show the lemma for
the sequential case and then use it to prove the general case.
(i) Sequential case.

Let i be such th at rj e* or rj -I- e* ^ A*. By property (AI),


- ei) < ^ ( v ) ,

hence this contracts (7 7 , rj e,) being a selfish move. A similar argument holds for
77+

e*.
(ii) Concurrent case.

Let rj' = rj Y liej e* G .A*. For any i G [1, n], we will show

th at rj' G Ai. First, assume i J. By rj G Ai and Proposition 3.2.1, Mi{rj) C Ai.


But by the definition of concurrent selfish move rj' G M-firj). which proves the first
case. Assume %G J. By Proposition 3.2.1, M i(rj e$) C Ai. Since rj' G Ai, this
completes the proof.

Thus selfish users, even when making simultaneous selfish changes to their

77

values,

cannot escape from the set A* where their QoS requirements are all satisfied, some
more than necessary.
A concurrent selfish move, with respect to users in J C [l,n] and intersection set
D iej Ai, can be represented by a subset of J' C J th at shows the users making a
move since selfish moves within

j A can only occur in the downward direction.

Theorem 3.2.3 (M onotone Convergence)

Any initial configuration rj G A*

converges to a corner point of A* under selfish moves, sequential or concurrent.

Proof

Consider rj' rj YlieJ e* where J C [l,n] represents a concurrent selfish

move. By definition of selfish move, we have 77^ <

77* for

all i G J. By closure of A*

(Lemma 3.2.2) and boundedness of 77 - by o;*, after a finite number of applications of


concurrent selfish moves, no selfish move will be possible. The configuration reached,
by definition, is a corner point of A*.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

42

Thus a corner point of A* is a fixed point under the dynamics of selfish moves within
A*, from which users cannot escape by selfish actions due to closure. Theorem 3.2.3
also shows th a t A* always possesses a corner point, not necessarily unique.
A corner point rj represents an efficient allocation of resources for all users in the
sense th a t each user Ps QoS requirement is satisfied by rj, i.e., a* = cpl(tj) > a*.
Furthermore, any incremental action by i will either violate his QoS requirement or
increase the apportioned resources beyond what is needed to satisfy the users QoS
requirement. We will show th at a nonincremental action by user i will have the same
consequences (Theorem 3.2.4). If <pl(rj) = a* then rj is efficient in an absolute sense.

Theorem 3.2.4 (Corner Point and Nash)

Let rj be a corner point of A*. Then

rj is a Nash equilibrium.
Proof.

Pick any user i [l,n]. Since

77

is a corner point, (pl (rj e*) < a*. By

property (AI), (pl {rj ce*) < (p%(rj e*) for c > 1, and part two of condition (i) is
satisfied. P art one of the same condition is satisfied by (A l).

We remark th at a corner point of A* must be Nash equilibrium, but the converse


need not be true. Indeed, there are Nash equilibria th at need not be in A*, even
when it is nonempty.

Theorem 3.2.5 (Nash and System O ptim ality)

A configuration rj is Nash and

system optimal if, and only if, rj is a corner point of A*.

3.3

Conclusion
In this chapter, we provide a general theoretical framework of scalable QoS pro

visioning using aggregate-flow scheduling where packet labels can be set from a finite
label set and routers provide differentiated treatm ent of packets based on the labels
enscribed. We define the meaning of optimal per-hop control within this context and
find the optimal solution for aggregate-flow control. We show th at the optimal perhop control satisfies certain propertiesdenoted (A l), (A2), and (B)which relate

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

43

how label values impact the service a flow receives at a router. We augment the
general result by presenting optimal solutions when restricting the packet scheduling
disciplines to variants of GPS, and the consequences on the core properties.
Next, in Section 3.2, we expand the framework by considering selfish users who
can influence QoS provisioning behavior by regulating the label values assigned to
their traffic streams. Based on the properties exported by the network control
(A l), (A2), and (B)we show how a population of selfish users with diverse QoS
requirements setting their packet labels can arrive at a global allocation of resources
th a t is stable (Nash equilibrium) and efficient (system optimal). We show th at even
in situations when network resources are scarce such that no resource allocation
aggregate-flow differentiation, per-flow reservation, or otherwisecan satisfy all users
QoS requirements, the system reaches a Nash equilibrium. We show th at the optimal
per-hop control is also optimal in the noncooperative game context in the sense
th a t when network resources are configurable such th at all users QoS requirements
can be satisfied, then there exists a Nash equilibrium th at is system optimal.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

44

PERFORMANCE EVALUATION

In this chapter, we extend the theoretical work in Chapter 3 by investigating im


plementation and performance evaluation issues associated with the induced optimal
aggregate-flow scheduling. We design a system th at realizes the optimal per-hop
control coupled with end-to-end adaptive QoS control, and implement a practical en
hancement by introducing a scaling function which is applied to the TOS field label
value in the IP header at each router. The scaling function allows the service provider
to configure the per-hop control so as to export customized QoS separation essential
when shaping end-to-end absolute QoS over per-hop relative QoScommensurate
with the QoS profiles of the service providers user base. We use simulation to carry
out a comprehensive performance evaluation study of QoS provisioning using optimal
aggregate-flow scheduling. The results in this chapter are also published in [70, 72].

4.1

QoS Provisioning Architecture Design

4.1.1

Optim al Aggregate-flow Per-hop Control D esign

In our design, the optimal per-hop control consists of a classifier and a packet
GPS scheduler which is configured to have m service classes. We set m = L (cf.
Section 2.3) where L is the total number of label values. For each incoming packet,
the optimal aggregate-flow per-hop control does the following:
The classifier inspects the i) value carried in the TOS field of the packets IP
header, and puts the packet with value rj into service class rj { 1 , 2 , . . . , L } .
Based on a fixed time interval Ts (Ts is set at 100ms in the results reported in
Section 4.2):

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

45

- The classifier periodically measures the average arrival rates Ai, , Am of


all the classes over Ts (a simple counting operation).
Based on the measured instantaneous arrival rate of each service class,
the classifier adjusts the service weights of all classes according to the
procedure in Figure 4.1.

The packet GPS scheduler then schedules the packets based on the new service
weights. The scheduler uses the current weights to compute the finish time of
backlogged packets.

. Set r f = fji for k e [1, L], if for some i

[1, n], rji = k,

Set r f = 0 for all other k (no packet carries label value k ).


2. For k G [1,L \, set
ak =

(1

.
- v)

\ kTjk
E L

Xk
+ v I ----A V
E t i A*

Figure 4.1. Structure of reduction classifier for m = L; a k is the service weight


allocated to service class k G { 1 , . . . , L ).

In Section 4.2.2, we verify the above design by confirming the QoS separation and
adaptive label control properties of optimal aggregate-flow per-hop control. Further
fine-tuning of its components and parameters can yield additional performance gains.
xWe remark that the original weighted fair queue implementation of ns is a rough approximation of
ideal GPS: it uses real-time, instead of virtual time, which can dilate the fairness bound of PGPS.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

46

4.1.2

End-to-end QoS Control Design

The properties exported by per-hop controlif satisfiedare not sufficient by


themselves to render end-to-end QoS commensurate with user requirements. Endto-end (or edge) control complements per-hop control by setting the value of rj on a
per-flow basis in accordance with user needs. We assume th at the network exercises
access control at the edge such th at users are not perm itted to assign r) values to their
packets arbitrarily: if every user assigns the maximum rj value L to their flows, then
QoS control via rj loses its meaning (degenerates to FIFO-based best-effort service by
property (B)).
As described in Chapter 2, user i s QoS requirement, in general, can be represented
by a utility function Ui which has the form UfiXi, x l, p*) where A* is the traffic rate,
x* the end-to-end QoS received, and pi the unit price charged by the service provider.
The total cost to user i is given by

We assume th at Ui satisfies the following

monotonicity properties: increasing with A;, decreasing with x*, and decreasing with
Pi. If jy-control is allowed to be exercised by the user, then a selfish user i can be
defined as performing the self-optimization
max U i{\i,x.l,pi).
Vi[l,L]
rji influences user Vs utility Ui via its effect on the QoS received x \ We assume Pi(x*)
is a monotone (non-increasing) function of x*better QoS incurs higher costwhich
corresponds to the price function exported by the service provider. Consider threshold
utilities where user preferences are specified as bounds on QoS measurese.g., bounds
on delay, packet loss rate, jitter, and so forthand user zs QoS requirement is given
by the vector of QoS upper bounds 0 \ The control law
^

= ( - l ) V ||x ( ) - 0 ||

(4.1.1)

is a specific instance of adaptive label control and can be shown to be asymptoti


cally stable. Here, a sgn(||x*(y)11 ||0 l ||) where sgn(-) = 1 if the argument is
nonnegative, and

, otherwise.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

47

Thus, the end-to-end adaptive label control performs the following two tasks:

Measurement and feedback: the receiver measures the QoS received over a fixed
window of length Tu (Tu = Is in our simulation results reported in Section 4.2),
and sends feedback control packets using UDP with label value L.

(control

packets are given high priority).

Adaptive label control: after the /cth feedback is received, the sender compares
measured end-to-end performance with the QoS target and sets a new rj value
based on (4.1.1).

4.1.3

Scaling Function

From an absolute QoS shaping perspective, the optimal relative QoS exported
by routers according to the procedure in Figure

4.1 has the following potential

weakness. Assume a service provider, over time, observes th at its customer base and
corresponding QoS requirement profile varies, but includes as stable subpopulations
user groups requiring 0.0001, 0.001, 0.005, 0.05, 0.1, 0.3, and 0.6 packet loss rates.
Consider Figure 4.2 (left) which shows the packet loss rates exported by the optimal
aggregate-flow scheduler specified in Section 4.1.1 with m = 16 service classes as link
bandwidth is varied2. We observe th at optimal aggregate-flow scheduling exports
relative QoS with equal spacing . The QoS spacing between service classes has a
finite resolution Ac 0.06, or 0 (degenerate case), as affected by m. For 0.0001-,
0.001-, 0.005-packet loss rate user flows, the consequent coarseness forces them to
choose label values th at map them essentially to the same service class whose QoS
for the three user groups to be satisfiedis dominated by the most stringent QoS
requirement 0.0001. Thus 0.001- and 0.005-packet loss flows become oversatisfied
or overprovisioned which, in turn, implies th at resources are inefficiently utilized.
2The performance results are taken from Section 4.2.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

48
Label Value

Scaled Label Value

0(T1)

O
.OOOl
0.001

0.8

0.005

0.6
0.4

0.2

0
10

11

12

13

14

15

16

Scaling Function

bandwidth (Mbps)

Performance
Function

Figure 4.2. Left: Equal spacing QoS separation achieved by optimal


aggregate-flow classifier when L = 16. Right: Scaling function o affecting
nonuniform stretching and contraction.

To solve the potential inefficiency and structural lim itation pointed out above, we
introduce a scaling function
<7

: [l, Lj\ ^ R+

where a is monotone increasing in its argument. This scaling function can be config
ured by the service providerwe envision this to be done at larger time scalesto
apply nonuniform scaling to the TOS field label values commensurate with its ob
served customer QoS profile before the IP packets are passed to the classifier proper.
This is illustrated in Figure 4.2 (right).
The modified classifier with scaling function is depicted in Figure 4.3.
The performance gain brought by the scaling function is demonstrated and further
discussed in Section 4.2.3.

4.1.4

Load Imbalance and Local QoS R esponsibility

Another practical concern for aggregate-flow per-hop control is the efficiency and
QoS shaping properties across multiple routers on an end-to-end path in a wide area
network. To motivate the problem, consider a flow i with end-to-end delay require
ment 30ms whose traffic is routed through a 6 -hop path shown in Figure 4.4. Assume

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

49

A):
1. Set r f = a(r)i) for k G [1, L} if for some i G [1, n], rji = k.
Set r f =

for all other k (no packet carries label value k ).

2. For k G [1, L], set


ak =

(1

- u)

Xkfjk
E L

Xk

a^

' "

L V

Figure 4.3. Structure of reduction classifier with scaling function for m = L; a k is


the service weight allocated to service class k G { ! , . . . , L}.

th at the load distribution of the routers is nonuniformtypically, only some routers


are bottlenecks or hot spots along an end-to-end pathwith routers 1 and 5 highly
loaded and routers 2 , 3, 4, and

being lightly loaded. A uniform QoS assignmentas


30m
sEnd-to-EndQoSRequirem
ent

,____________________________ A____________________________ ,

Load Distribution

*0

Uniform QoS
Responsibility

5ms

5ms

5ms

5ms

Sms

5ms

Nonmiform QoS
Responsibility

13ms

1ms

1ms

1ms

13ms

1ms

Figure 4.4. Uniform vs. nonuniform local QoS responsibility distribution to satisfy
30ms end-to-end delay requirement for a given load imbalance.

induced by flow is choice of label value rjiwould impose the same 5ms (ignoring, for
simplicity of exposition, link latencies) as the local QoS responsibility at each router
irrespective of its load. A more desirableand, in general, efficientallocation is one

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

50

where heavily loaded routers are given less responsibility than lightly loaded ones3.
This is illustrated in the nonuniform QoS distribution in Figure 4.4 where the highly
loaded routers export 13ms local queueing delay whereas the lightly loaded hops
achieve 1ms delays for a total of 30ms. The aforementioned property is not difficult
to satisfy. A more subtleand, from a control perspectiveim portant property for
end-to-end QoS shaping is the change in local QoS responsibility as a function of rj.
For example, if flow i increases its label value rji because the current delivered QoS
is deficient vis-a-vis its requirement, other things being equal, it is desirable th at the
lightly loaded routers take on a greater share of the burden for improved QoS than
heavily loaded ones. In Section 4.2.6, we show th at both properties are satisfied by
the optimal aggregate-flow scheduler.

4.2

Performance Results

4.2.1

Sim ulation Set-up

We use the LBNL Network Simulator ns (version 2) as our simulation platform.


We have extended ns into a wide area network QoS benchmarking environment
called QSim [11]which was used in our earlier work on WAN QoS performance
evaluation [8 , 9]. QSim exports a selection of per-hop controls and end-to-end controls,
along with a QoS interface through which user QoS requirements can be specified.
The control parameters and network configurations are configurable via a GUI. The
latest version incorporates the optimal per-hop control and adaptive label control
specified in Section 4.1.

3One can formally attribute this property by using congestion pricing and utilization as cost mea
sures [9].

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

51

Network Configuration
We use two benchmark network topologies, a 2-switch dumbbell topology shown
in Figure 4.5 (left) which isolates a bottleneck link, and a 4-switch caterpillar topology
shown in Figure 4.5 (right) which is comprised of multiple bottleneck links.

receiver2
switcl.

swilcl.

S w it c h !

Figure 4.5. Benchmark network topology. Left: 2-switch single bottleneck link
shared by n flows. Right: 4-switch multiple bottleneck link caterpillar topology.

Most of the performance results reported in this chapter are from the bottleneck
topology shown in Figure 4.5 (left) whose results are extended by those carried out
for the topology shown in Figure 4.5 (right). The number of service classes varied
from 1 to 32, and the number of user flows varied from 16 to 450. For the convenience
of (3.1.2), the r/ value we used in simulation ranges from 0 to L 1.

4.2.2

Service Differentiation

In this section, we investigate the service differentiation properties of optimal


aggregate-flow per-hop control with respect to its QoS separation (property (B))
and shaping (properties (Al) and (A2)) performance, the impact of the weighting
param eter v (cf. Step 2 in Figure 4.1), and the impact of input traffic burstiness.

QoS Separation and Bandwidth


First, we show the QoS separation performance of optimal aggregate per-hop con
trol as a function of bottleneck bandwidth when L = 16 and there are two flows per

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

52

class (the number of flows, in this instance, is irrelevant since their impact is sub
sumed by volume). Each flow is CBR. Figure 4.6 shows the QoS separation achieved

0.8

eta = 15
e t a s 13
e ta 11

0.6

eta s 3

e ta = 1 5 -----

I
2
%

</>
0.4

0.2

20
11

10

12

14
13
bandwidth (Mbps)

15

16

10

11

12

14
13
bandwidth (Mbps)

15

16

Figure 4.6. QoS separation achieved by optimal aggregate-flow classifier when


L = 16. Left: Packet loss rate. Right: End-to-end delay.

in the L = 16 service classes when the scheduler executes the optimal aggregateflow service weight assignment. The left figure shows the shape of the performance
function for packet loss rate and its QoS separation across the 16 service classes (all
classes are nonempty) as bottleneck bandwidth is increased from

Mbps to about

16 Mbps. We observe th at even, uniform separation is preserved (property (B) but


in a stronger form), with packet loss rate degenerating to zero as resources become
plentiful. Only the odd

77

values are plotted to reduce cluttering. Figure 4.6 (right)

shows the corresponding performance results for end-to-end delay (in m s). We ob
serve th a t the performance function for delay is less uniformthe performance gap
decreases as

77

approaches L 1while satisfying property (B). This can be explained,

in part, by the fact th at the waiting time Wk in service class k is its queue length
divided by its service rate

Since the optimal aggregate-flow classifier satisfies

> fJ'j the shape of the end-to-end delay curve is consistent with the
waiting time dependence on service rate.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

53

Label Control: Properties (A l) and (A2)


Next we show how properties (Al) and (A2) are satisfied by the optimal aggregateflow per-hop control. The number of service classes is configured at L = 16 (bottleneck
bandwidth is fixed at 5 Mbps), and we let a single flow occupy service class 0 initially.
Subsequently, this single user flow increases its label value ?y16 from 0 up to the
maximum value 15, and the resulting QoS exported by the 16 services classes is
plotted in Figure 4.7 (left). We observe th at property (B) remains satisfied, and more
importantly, properties (A l) and (A2) are satisfied by the optimal aggregate-flow
classifier. Property (A l) is implied by service class separation which shows th at the

0.4 1fc

B ...Q ...Q ...

+... +.

--Q-

Q -E- Q .Q ...|

-+ - 4 -

.-*-3

0.2 r * i

10

12

14

Figure 4.7. Manifestation of properties (Al) and (A2). End-to-end QoS shaping as
a function of label value 7716 of singular user flow. Left: 16 users (originally each
group has one user). Right: 48 users (average group population size of 3).

QoS of the user who is increasing his rj value is monotonically improving, and property
(A2) can be confirmed by the slight positive slope of the service class performance
curves as a function of

7716-

Also, note th at for rft6 > 1, there are effectively only 15

service classes since class 0 becomes empty. Figure 4.7 (right) shows the corresponding
performance curves when the initially service class 0 is populated by three user flows,
and one of the flows increases its label value from 0 to 15 as before. Bottleneck
bandwidth is set at 15 Mbps. We observe th at properties (A l) and (A2) are satisfied.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

54

Impact o f W eighting Parameter v


Recall th at the service weight assignment, given by equation (3.1.5), has a weight
ing param eter

< v <

which can be controlled to determine the relative importance

of service differentiation at the router. As discussed in Section 3.1, v 1 diminishes


the contribution of the differentiation component (the first term of (3.1.5)) thus turn
ing the classifier into a FIFO scheduler where label values are effectively ignored.
Conversely, as v >0, the contribution from the FIFO component (the second term)
is eliminated thus turning the classifier into a maximal differentiator. This is shown
in Figure 4.8.
1

e ta = 1 5 ----e ta = 1 3 .......
eta=11
eta=9 .......
eta=7 ......
e ta = 5 ----e ta = 3 ......
eta-1

0.8

0.6

M
0.4

0.2

11

12
13
14
bandwidth (Mbps)

15

10

11

12

14
13
bandwidth (Mbps)

15

16

Figure 4.8. Impact of v on QoS separation for L = 16. Left: QoS exported by
service classes as a function of bottleneck bandwidth when v = 0.9. Right:
Corresponding plots when v = 0.1.

The left figure shows QoS separation as a function of bottleneck bandwidth for
L = 16 when u 0.9 which makes the system resemble a FIFO scheduler. Fig
ure 4.8 (right) shows the corresponding plot when the weighting param eter is set
to v = 0.1, thus amplifying the contribution of the differentiation component. We
observe th at the range of QoS differentiation is significantly larger for v = 0.1 than
v = 0.9the QoS spacing between rj = 0 and t] = 15 is about 0.1 for the latter,
whereas it is about 0.8 for the former. However, this occurs at the cost of reduced
resolution with respect to fine-granular QoS spacing. Thus, when viewing L as a re
source, v influences how QoS is to be shaped at the switchto achieve fine-granular
QoS spacing or coarse-granular QoS separation. From a network management per

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

55

spective, v is a control param eter th at the service provider may engage to reflect the
overall needs of its customer base and their QoS profile.

4.2.3

Structural Properties of Optimal Aggregate-flow Per-hop Control

The major part of our performance evaluation study focuses on the structural and
dynamical properties of differentiated services as affected by optimal aggregate-flow
per-hop control. We first study the structural properties of optimal aggregate-flow
per-hop control with respect to its efficiency and optimality indicated by the existence
and Size of A* (cf. Section 3.2), the impact of bounded and discrete label values
{ 1 , 2 , . . . , L }, and the benefit of scaling function. Then we show the dynamical and
convergence properties of adaptive label control running over the optimal aggregateflow per-hop substrate.

Efficiency and O ptim ality


In this section, we study the structure of A*. Recall in Section 3.2, we define
A* as the set of rj configurations where all user requirements are satisfied. Given
a configuration rf, we call a change in label value rfi by user i a selfish move if the
consequent configuration gives higher utility to user i. A configuration rf is a Nash
equilibrium if no user can improve her utility by selfish move. If, in addition to being
a Nash equilibrium, t] belongs to A*, then we call rf a com er point. Thus a corner
point is a maximally efficient configuration where all users QoS requirements are
satisfied.
Figure 4.9 (left) shows the Nash equilibria and corner point structure of A* when
25 user flows are grouped into

QoS requirement groups given by the packet loss

rate bound profile (0.01,0.05,0.1,0.15,0.2,0.3,0.4,1.0). Thus the population group


with bound 0.01 has the most stringent QoS requirement, and the last group cor
responds to a best-effort user group in the worst-case sense. The Nash equilibria
and corner points are found by brute-force search. For bottleneck bandwidths below

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

56

Q-'-Ejv
"Jfc-'-A '"X X\'

14

0.01
0.05
0.10
0.15

0 0 7 -Q

12

0.20
0.30
0.40

10
8
6
4

2
10
10.2 10.4 10.6
bandwidth (Mbps)

10.8

11

10

1 0 .2

1 0 .4

1 0 .6

1 0 .8

11

bandw idth (M bps)

1 1 .2

1 1 .4

Figure 4.9. Structure of A*. Left: The change in Nash equilibria as we increase
bottleneck bandwidth for user population with QoS requirement profile
(0.01,0.05,0.1,0.15,0.2,0.3,0.4,1.0). At 10 Mbps, Nash equilibria become corner
points of A*. Right: Corresponding landscape for more stringent user population
QoS profile shown in the legend.

10 Mbps, A* is empty but Nash equilibria nonetheless exist (cf. Section 3.2). User
flows with packet loss requirements 0.01, 0.04, and 0.07 are not satisfied (their actual
packet loss rates received exceed their respective bounds), and their r) values have
saturated at the maximum value 15. At 10 Mbps bottleneck bandwidth, however,
A* 7 ^ 0, and the distribution of seven label values correspond to the corner config
uration. We observe th at the range of individual label values is large (from 5 to 13
for a distance of 8 ), and then subsequently shrinks as bandwidth is increased. The
optimal aggregate-flow classifier, is also efficient in the sense th at if an aggregate-flow
resource assignment exists th at satisfies all user requirements, then it is apt to find
it. Figure 4.9 (right) shows the plots for a more stringent user population profile
(0.01,0.04,0.07,0.1,0.15,0.2,0.3,1.0). Nash equilibria turn into corner points at a
higher bottleneck bandwidth 10.7 Mbps, but the reduction in the distance between
label values at corner point configurations persists.

Im pact of Bounded and D iscrete Label Values (L)


In this section, we show the performance impact of bounded and discrete label
values given by L. If L = 1, then we know th at the optimal aggregate-flow control

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

57

degenerates to FIFO scheduling. Thus small L diminishes the QoS resolution achiev
able by the classifier and increasing L amplifies it. Figure 4.10 shows this dependence.
For n = 32 user flows, we show the QoS exported by L service classes as a function
of bandwidth as L is increased L =

, 4,

, 32. Thus at L = 32, the system degen

erates to per-flow control. The weighting parameter v of the optimal classifier (cf.
equation (3.1.5)) is set to 0.5.
eta=31
eta=27
eta=23
eta=19
eta=15
eta=11
eta7

0.6

0.8

0.6

w
0.4

0.4

0.2

0.2

10

11

12

13
14
bandwidth (Mbps)

15

eta=31
eta=27 - - eta=23 .......
eta=19
e t a - 1 5 ......
eta=11 .......
e ta = 7 .......
eta=3 -----

0.8

0.6

10

16

11

12

14
13
bandwidth (Mbps)

15

16

eta=31 ----eta=27 eta=23 .......

08

0.6

v>

a.
0.4

0.4

0.2

0.2
10

11

12

14
13
bandwidth (Mbps)

15

16

10

11

12

13
14
bandwidth (Mbps)

15

16

Figure 4.10. Impact of bounded label set size L on QoS exported by the service
classes as a function of bottleneck bandwidth for L = 1, 4, 8 , 32.

Figure 4.10 (top row, left plot) shows QoS shaping when L =

, i.e., FIFO with

no QoS separation. As L is increased, however, we observe th at increased L endows


increased QoS resolution with finer levels of QoS spacing. In fact, for L 2,4,8,16,32
(i.e., excluding L = 1), we observe th at the performance curves are strict supersets
of each other as indexed by L. This can be shown to follow from the form of label
value normalization r) defined by the map (3.1.2) (see Section 3.1). Since for all finite
L > 0, rj { 1 ,2 ,... ,L } is mapped by (3.1.2) to a subset of the closed unit real
interval fj G [0 , 1 ], this also explicates the effect of the label set being discrete: a

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

58

closed continuous interval [a,b], a < b, maps onto [0,1] by the action of (3.1.2), and
Z+ is mapped to a dense subset of [0,1]. Thus L translates to QoS resolution in a
straightforward fashion.

o.e

0.8
B

1
0.4

Ia

0.2

0.4

0.2
0

2
3
range of labels (log L)

2
3
range of labels (log L)

Figure 4.11. The combined impact of L and v on QoS shaping. Left: QoS exported
in L service classes as a function of L for v = 0.5. Right: Corresponding plot for
i/ = 0 . 1 .

Figure 4.11 shows the joint influence of L and u on QoS shaping. The left figure
shows the QoS exported by L service classes as a function of L (actually its logarithm)
for L = 1, 2, 4,

, 16, 32 when the weighting param eter is set to v = 0.5. As a

function of L, this generates a binary QoS tree whose leaves at level logL show
the range and values of QoS covered by the L service classes. Figure 4.11 (right)
shows the corresponding plot when v 0.1.

We observe th a t the width of the

covering has significantly widened, however, at the cost of a more coarse-granular


QoS spacing. Thus, as indicated earlier, the L service classes can be used to cover the
QoS space (here, the packet loss rate space [0 , 1 ]) sparsely but uniformly, or densely
but concentrated in a subspace of the entire QoS space. For L = 2e ( > 1) the
subset relation C2 C C\ C C Cl C of the coverings Cl C [0,1] as a function
of L which is implied by the form of the normalization (3.1.2)can be observed in
Figure 4.11.
The size of the label set L also impacts the system efficiency, the existence or
nonemptiness of A* which is shown in Figure 4.12. The plot shows the minimum
bottleneck bandwidth needed to achieve A* ^ 0 as a function of L. We observe that,

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

59

12.5

I
I

11

10.5

9.5
5

10

15

20

25

30

r a n g e of lab e ls (L)

Figure 4.12. Impact of L on existence of .4*: minimum bottleneck bandwidth


required to achieve A* / 0 as a function of L.

other things being equal, too small an L value incurs a high QoS shaping penalty
with respect to the network system being able to satisfy the QoS requirements of all
users, even if sufficient bandwidth were available. On the other hand, the marginal
or incremental benefit of increasing L diminishes for given user QoS requirements,
which points toward the possibility th at with not-too-large L (perhaps a subset of
the TOS bits) the QoS requirements of a multitude of users can be met.

4.2.4

The R ole of Scaling Function

In this section, we study the performance impact of the scaling function. Fig
ure 4.13 shows how the scaling function can affect the QoS differentiation among
service classes. Comparing it with Figure 4.6, we see th at the scaling function can
achieve non-uniform QoS differentiation.
Next we show th at the scaling function can improve system efficiency. As in Sec
tion 4.2.3, system efficiency is represented by the minimum bottleneck bandwidth
needed to achieve A* / 0. Configure users into

groups with different QoS require

ments. Figure 4.14 shows the minimum bottleneck bandwidth needed to achieve all
users QoS targets as a function of L. The top curve is obtained without the scaling
function. This is essentially the same curve described in Figure 4.12. The bottom

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

60

e t a s 15

e t a s 15
e ta = 1 3 .......

eta = 1 3 ......
eta 11 ----eta = 9 .......

0.8
0.6

0>

=3

E,

eta
eta
eta
eta

40

=9
=7
*5
=3

.......
.......

.......

>.
0.4

0.2

25

10

11
12
13
14
bandwidth (Mbps)

15

16

10

11
12
13
14
bandwidth {Mbps)

15

16

Figure 4.13. QoS differentiation achieved by optimal aggregate-flow classifier with


scaling function. a(rj) for rj 6 [0,15]: 1.0, 1.1, 1.2, 10, 11, 12, 100, 110, 120, 500, 550,
600, 1000, 1100, 1200, 2000. Left: Packet loss rate. Right: End-to-end delay.

curve is obtained with an optimal scaling function, i.e., for a given L, if we are
allowed to choose a, what is the minimal bandwidth needed.

n o scalin g
with sc alin g n .

12.5

11.5

IE

11

10.5

9.5
5

10

15

20

25

30

ran g e oi lab els (L)

Figure 4.14. Impact of scaling function on system efficiency: minimum bottleneck


bandwidth required to achieve A* ^ 0 as a function of L.

We observe the following: First, with a scaling function, the system achieves its
maximum efficiency when L > 8 , i.e., for the bottom curve, the minimum bottleneck
bandwidth required to satisfy all the users QoS targets does not further decrease when
L >

. Second, the scaling function does not help when L 1,2. This is because the

system is FIFO when L =

, and we always get a(rj) = 0,1 after the transformation

(3.1.2) when L = 2. Third, the increase in efficiency, which is represented by the

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

61

difference of the minimal bottleneck bandwidths in the two curves, is large when
L = 4 , 8 . However, it becomes smaller when L > 8 .

4.2.5

Impact of Burstiness

The previous sections have shown the structural QoS provisioning properties of
the optimal aggregate-flow per-hop control when input traffic is CBR. In this section,
we inject burstinessan ever-present orthogonal dimension to traffic control and QoS
provisioningand show th at the qualitative behavior of the aggregate-flow QoS pro
visioning remains the same. Figure 4.15 shows QoS separation results across m = 16
aggregate-flow service classes for the same network configuration as Figure 4.6 (see
Section 4.2.2) except for the difference th at the traffic sources are VBRPoisson
with the same average d ata rates.
60
e ta = 15
e ta = 13
e ta * 11
eta = 9
eta = 7
eta = 5
eta = 3

0.8

0.6

----.......

.......
......
.......
......

e ta = 15 ----e ta 13 .......

55
50
45
40
35

0.4

30
25

0.2

20
10

11

12

14
13
bandwidth (Mbps)

15
15

16

10

11

12

14
13
bandwidth (Mbps)

15

16

Figure 4.15. QoS separation achieved by optimal aggregate-flow classifier when


L = 16 under VBR traffic. Left: packet loss rate. Right: end-to-end delay.

Both the packet loss and delay plots show th at QoS separation is achieved, and
moreover, burstiness im parts a more gradual (or smooth) change in the overall QoS
as a function of bottleneck bandwidth due to the absence of threshold effect charac
teristic of CBR traffic.
Figure 4.16 shows the impact of bounded, discrete label set size L under VBR
traffic which corresponds to the set-up in Figure 4.10 (Section 4.2.3) for CBR traffic.
As with the CBR traffic case, we observe an increased power of QoS resolution as L is

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

62

increased (only shown for L 4 and 16 due to space constraints). The principal qual
itative difference, as with Figure 4.15, is the more gradual change of the performance
curve which is affected by burstiness. In essence, burstiness does not add additional
complications to QoS provisioning above and beyond its usual impact with respect
to queuing, multiplexing gain, and so forth, which can be immensely subtlein their
own rightdepending on the traffic properties.
eta = 3 ----eta * 2 ----eta * 1 .......

0.8

eta = 0

e ta * 16 ----e ta * 1 3 .......
e ta 11 -----

0.8

0.6

(0
S.
0.4

0.4 >

0.2

0.2

10

12

14

16

bandwidth (Mbps)

10

11

12

13
14
bandwidth (Mbps)

15

16

Figure 4.16. Impact of finite label set size L on QoS for VBR traffic. Left: L=4;
Right: L=16.

Figure 4.17 shows QoS separation results across m 16 aggregate-flow service


classes with scaling function under VBR traffic. The network configuration in Fig
ure 4.17 is same as in Figure 4.13 except for the difference th a t the traffic sources
are VBRPoissonwith the same average data rates. Both the packet loss (left)
and delay (right) plots show th at QoS separation is achieved, and moreover, bursti
ness im parts a more gradual (or smooth) change in the overall QoS as a function of
bottleneck bandwidth due to the absence of threshold effect characteristic of CBR
traffic.

4.2.6

Dynam ics and Convergence

Tim e Evolution
In this section, we show th at end-to-end adaptive label controlin conjunction
with the optimal aggregate-flow per-hop control network substrateleads to stable

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

63
1
eta * 15

56

6 ta 13
eta = 11

0.6

e ta 13
e ta 11

e ta = 5

0.6
0.4

0.2
0

10

11
12
13
14
bandwidth (Mbps)

15

16

10

11
12
13
14
bandwidth (Mbps)

15

16

Figure 4.17. QoS separation achieved by optimal aggregate-flow classifier with


scaling function when L = 16 under VBR traffic. cr(r)), rj G [0,15]: 1.0, 1.1, 1.2, 10,
11, 12, 100, 110, 120, 500, 550, 600, 1000, 1100, 1200, 2000. Left: packet loss rate.
Right: end-to-end delay.

time evolutions. The QoS exported by the service classes and the end-to-end QoS
received by individual user flows is commensurately stable.
Figure 4.18 (top row) shows the label value dynamics of adaptive label control for
an L = 16 system with n = 25 users grouped into

groups with common packet loss

rate requirements 0.1 (users 0 and 1), 0.15 (users 2-4), 0.2 (users 5-7), 0.3 (users 8-10),
0.4 (users 11-14), 0.5 (users 15-18), 0.6 (users 19-21), and best-effort (users 22-24).
We show the rj value traces for three of the eight user groups. First, we observe th at
in spite of each individual user flow executing its end-to-end adaptive label control
independently of all other user flowsincluding those in the same group to which each
flow is oblivioususer flows with the same QoS requirement aggregate to the same
service class. Second, the

77

values converge to equilibrium values after a transient

period. The length of the transient period is a function of feedback control parameters
th a t are not specific to optimal aggregate-flow scheduling. Third, the brief label value
perturbations in the groups comprised of users 0 , 1 , and users 15-18 show th at QoS is
not guaranteed. Figure 4.18 (bottom row) shows the corresponding measured packet
loss rate traces observed at the receiver. We observe th at the rendered end-to-end QoS
is stable, with the label perturbations reflecting the corresponding QoS perturbations
in the packet loss trace.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

64

1
U8f0 ----- *

14

14

ueerlS -

userl ......

user 16

userlO

12

user17 -

12

user 18

10

8
6

j
a la AW

~ r ........

tb m

W lm

/ \ r

20

40

60
80
(see)

100

120

J __ LL
ml

'

i
40

time

'

60
lime (eeo)

80

100

120

60

60

100

120

uaerO ---userl ---usenO

0.6

0.6

0 .4

0.2

1 M

i L

time (sec)

0
time (sec)

20

40

lime (sec)

Figure 4.18. Time evolution of adaptive label control and end-to-end QoS. Top row:
Evolution of label values shown for three user groups with common QoS
requirements (0.1, 0.3, and 0.5). Bottom row: Corresponding trace of measured
end-to-end QoS for user flows belonging to the three QoS groups.

Stringency of QoS Requirement

Figure 4.19 shows the impact of increased stringency in the QoS requirement
profile on QoS provisioning performance.

For the first QoS requirement profile

(0.1,0.15,0,2,0.3,0.4,0.5,0.6), when the bottleneck bandwidth is less than 8.3 Mbps,


A* is emptyi.e., there does not exist an aggregate-flow (L 16) service weight
assignment and configuration th at is able to satisfy the QoS requirements of all user
flows. This shows up as clustering of the stringent QoS flows which stems from
saturation of label values at the maximum L. At bottleneck bandwidth 8.3 Mbps
or higher, however, A* 7 ^ 0 and adaptive label control finds a label assignment (and
corresponding service weight assignment at routers controlled by the classifier) where
all user flows are satisfied. A similar behavior is observed for the three successively
more stringent QoS profiles whose initial clustering is more pronounced, and the QoS
exported lies in a narrower QoS rangeas required to satisfy the more stringent QoS

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

65

requirementswith fine-granular QoS spacing (note th at the ordinate has been scaled
accordingly).
0.8
0.10
0 .1 5

0 .7

0 .0 5

0 .2 0 -B--

0 .3 0 -*
0 .4 0
0 .5 0 -*0 .6 0

0.6

0 .4

0.5
0 .4
0 .3
0.2

^ Q *...
*--- ::
bandw idth (M bps)

0 .4

0 .3
0.01

0 .3 5
0 .2 5
0 .3
0.2

0 .2 5

&

0.2

0 .1 5
0.1
0.1

0 .0 5
0 .0 5

10.1

1 0 .2 1 0 .3

1 0 .4

1 0 .5

1 0 .6

1 0 .7

1 0 .6

1 0 .9

11

01 0 .5

1 0 .6

1 0 .7

b a n d w id th

1 0 .8

1 0 .9

11

bandw idth

11 .1

1 1 .2

1 1 .3

1 1 .4

Figure 4.19. End-to-end QoS achieved under adaptive label control as a function of
bottleneck bandwidth for successively more stringent QoS requirement profiles
(shown in the legends).

Tim e Evolution in M any-switch Topology


We also have performance results for the many-switch system (shown in Figure
4.5) when adaptive label control is used. Figure 4.20 shows the QoS provisioning
dynamics of the 4-router benchmark internetwork shown in Figure 4.5 (right). The
number of service classes is set at L = 16, and there are a total of 175 user flows
possessing

sets of QoS requirements including one best-effort application type (the

trivial QoS upper bound). The QoS requirements are:

0 .1

, 0.15, 0 .2 , 0.3, 0.4, 0.5, 0 .6 ,

and best-effort. Figure 4.20 (top) shows the label value dynamics and convergence
for a subset of three user groups, and Figure 4.20 (bottom) shows the corresponding
end-to-end QoS achieved. The qualitative dynamics are analogous to the previous

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

66

benchmark results with one noticeable quantitative performance difference being the
longer transient period required for convergence. This is, in part, due to the larger
number of flowsmore inter-flow interactions which can impede convergenceand
the increased round-trip time (RTT) of the feedback loop.

14

14

12

12

10

10

50

100

150
lim e

200

260

0
300

50

100

160
tim e

200

260

300

u s e rS
u se rlO u se rll -

jsen

*****
ISO
tim e

200

250

300

SO

100

150
tim e

200

260

300

Figure 4.20. Time evolution of adaptive label control and end-to-end QoS in
many-switch topology (Figure 4.5 (right)) with many flows. Top row: Evolution of
label values shown for three user groups with common QoS requirements (0.1, 0.3,
and 0.5). Bottom row: Corresponding trace of measured end-to-end QoS for user
flows belonging to the three QoS groups.

In addition to the increased noise factor and time lag influence (in general, func
tional or delay-differential equations used in the analysis of feedback congestion con
trol can cause oscillatory behavior [65]), another pertinent factor is the inter-flow
interaction due to property (A2). The latter can make one user flows label control
action adversely affect another flows end-to-end QoS thus increasing the time needed
to quiescence.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

67

Load Imbalance
In Section 4.1.4, we indicated th at an efficient way to provide QoS across multiple
switches on an end-to-end path is to give more responsibility to the lightly loaded
switches than to the heavily loaded ones. In this section, we dem onstrate th at the
optimal aggregate-flow per-hop control exports such properties both statically and
dynamically.
The simulation set-up is as follows: let user flows traverse multiple hops where
they share the bandwidth with cross traffic. The cross traffic sources are configured
such th at each class of the switches receives some amount of cross traffic. Therefore,
by varying the volumes of cross traffic injected into all the classes, we can effectively
control the load of the switches.
w tto h 1

o R

'0

1/ CrossTraffic y

00

I
/"\ "0 "
Cross Trafficy

V'

00

0.8

0.8

0.6

.6

0 .4

0 .4

0.2

.2

w itc h 3 - a -

10

12

14

Figure 4.21. QoS distribution on multi-hop path with three medium-loaded


switches. Left: Switch loads. Middle: Static QoS distribution. Right: Dynamical
QoS distribution.

We first consider the scenario when all the switches are evenly loaded. Figure 4.21
shows the QoS distribution pattern of the system. The left figure depicts the topology
and the load on each switch. The middle figure shows the QoS achieved by each class
at each switch. Since all the switches have similar load, the QoS experienced by the
same class at different switches are comparable. The right figure shows the change
in end-to-end QoS and the QoS achieved at each switch when one user changes its
r] value from 0 to 15. As expected, since all switches are offered similar loads, the
switches take the same responsibilities to improve the QoS of the user.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

68

08

i
0.4

sw itc h ID

10

12

14

e ta

Figure 4.22. QoS distribution on multi-hop path with one heavy-loaded switch and
two light-loaded switches. Left: switch loads. Middle: static QoS distribution.
Right: dynamical QoS distribution.

Next we study the QoS distribution among switches when they are under different
load conditions as depicted in Figure 4.22 (left). Figure 4.22 (middle) shows the static
QoS distribution among the switches by plotting the QoS of all the classes at each
switch. As stated in Section 4.1.4, the classes in the lightly-loaded switches experience
better QoS than the corresponding classes in the heavily-loaded switch. Figure 4.22
(right) shows the more subtle feature, the dynamic QoS distribution among switches.
It depicts the change of the end-to-end QoS and the QoS achieved at each switch when
one user changes its r\ value from 0 to 15. In Section 4.1.4, we stated th at when one
user increases rj value to improve its QoS, an efficient way is to put more responsibility
on the lightly-loaded switch. Figure 4.22 (right) verifies th at our optimal aggregateflow per-hop control exhibits this property. As rj increases, we observe the packet loss
rate at the lighted-loaded switch decreases faster than it does at the heavily-loaded
switch.
The QoS distribution over load imbalance system is further dem onstrated by Fig
ure 4.23 where the switches along the path are configured with four different patterns
of loads: (High, low, low), (low, high, low), (low, low, high), and (medium, medium,
medium) respectively. The Figure describes the actual QoS achieved at each switches
under these four different load imbalance patterns to satisfy a given QoS target (packet
loss rate = 0.15). We see th at the system adaptively distributes QoS responsibility
among the switches to meet the end-to-end QoS requirement.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

69

0 .2
high-low-low
low-high-low
low-low-high
m ed ium -m edium -m edlum

0.18
0.16

_
-----....... .
.......

0.14
0.12
a.

0.1

0.08
0.06
0.04
0.0 2

olj
1

L.ii.1
2

it
3

sw itch ID

Figure 4.23. To satisfy given QoS target, the actual QoS achieved at each switch
with respect to different load imbalance patterns.

4.3

Conclusion
In this chapter, we advance previous theoretical work on optimal aggregate-flow

scheduling. We have designed a system th at implements the optimal aggregate-flow


per-hop control and end-to-end QoS control, and proposed practical enhancements
by introducing scaling function and discussing load imbalance issues. We have given
a comprehensive simulation-based performance evaluation of the system. We have
shown quantitative performance evaluations of both structural and dynamical prop
erties, thus extending, in addition to complementing, the theoretical results in Chap
ter 3. The performance results, overall, show th at user-specified services can be effi
ciently and effectively achieved over the network with optimal aggregate-flow per-hop
control substrate when coupled with adaptive label control.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

70

SYSTEM BUILDING AND BENCHMARKING

Collaborating with Cisco Systems, we have implemented the optimal aggregate-flow


per-hop control in commercial routers. The system building experience shows that
the optimal per-hop control scheme purposed is practical and implementable. The
overhead brought by optimal per-hop control is small. We also conducted bench
marking over Q-Bahn testbed which is comprised of Cisco 7206 VXR routers running
developed optimal per-hop control. Our benchmarking output confirms the previous
results from theoretical analysis and simulation study, and demonstrates the scala
bility of the QOS provisioning architecture.
In Section 5.1, we discuss the overall design and the key components in optimal
per-hop control implementation. Section 5.2 describes system building procedure. In
Section 5.3, we present our benchmarking results.

5.1

System Design

5.1.1

K ey issues

In Section 4.1.1, we described an algorithm for optimal aggregate-flow per-hop


control based on weighted fair queue. The basic idea is to online measure the traf
fic to each class and dynamically adjust the weights of classes commensurate to the
current load distribution among them (cf. Section 4.1.1 for more details). Our im
plementation of optimal per-hop control in Cisco routers follows the same algorithm
with focus on system efficiency. Two key issues related to efficiency are:
How to keep the operation in packet forwarding path as minimal as possible,
including packet classification, class load measurement, and enqueueing?
How to reduce the overhead of periodical weight recalculation?

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

71

We also expect the optimal per-hop control may coexist with other QoS solutions
in practice. Thus our design reuses many existing QoS mechanisms provided in Cisco
routers and provides interface for flexible user configuration.
The platforms we use are Cisco 7200 series routers with Cisco Internetwork Op
erating System (IOS) version 12.2. The optimal per-hop control implementation can
also be transplanted to other platforms with minor change.

5.1.2

Overall Structure

In our design, the optimal per-hop control is built upon class based weight fair
queue (CBWFQ) mechanism in Cisco routers (cf. [80] for an overview) and its func
tionality can take effects on individual output interfaces. If enabled, the optimal
per-hop control intercepts and processes packets in Cisco express forwarding (cf. [78]
for more information) which is the default packet process mode in Cisco 7200 series
routers.
In express forwarding mode, packets are handled by interrupt when received.
Following is a general description of the packet process procedure during express
forwarding when CBWFQ is configured on the corresponding output interface [80, 79].
1. The interface processor first detects there is a packet on the network media,
and transfers this packet to the input/output memory on the router.
2. The interface processor generates a receive interrupt. During this interrupt, the
central processor determines what the type of the packet (IP), and then begins
to switch the packet.
3. The processor searches the route cache to determine if the packets destination
is reachable, what the output interface should be, what the next hop towards
this destination is, and finally, what MAC header the packet should have to
successfully reach the next hop. The processor uses this information to rewrite
the packets MAC header.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

72

4. When CBWFQ is enable on the outbound interface, the processor decides which
class to put the packet and copies the packet to the queue of the class. The
receive interrupt then returns, and the process th at was running on the processor
before the interrupt occurred continues running.
5. After the transmission of the current outgoing packet is completed, the output
interface processor generates an interrupt. During the interrupt, the central pro
cessor determines which packet to send next based on the CBWFQ algorithm,
copies the packet to the transm it queue of the interface, then returns.
6

. The output interface processor detects the packet on its transm it queue, and
transfers the packet onto the network media.
In the above procedure, after determining the output interface of the packet and

before putting the packet into the output queue of the interface, the IOS detects
th at the optimal per-hop control is enabled, and then passes control to the optimal
per-hop control module (referred as OPC module from now on).
Figure 5.1 depicts the overall structure of optimal per-hop control module.
LTimer,

W e ig h t'
C o n tro lle r

C la s s
M a p p in g

C o u n te r

C la s s
C o n tro l

C la s s
W e ig h ts

B u ffe r

cm

W FQ
S c h e d u le r

C la s s ifie r

mu
Figure 5.1. Structure of optimal aggregate-flow per-hop control module
implemented in Cisco routers.

The OPC module has two components: classifier, which is packet-driven, and
weight controller, which is timer-driven. The data abstraction used and/or updated
by OPC module are:

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

73

Class mapping table: determines the mapping from DSCP values to class in
dices.
Counters: records the traffic characteristics of classes during the last time pe
riod, including number of packets received and number of packets dropped.
Class weights: the weight assignment (bandwidth sharing percentage) of classes
in CBWFQ.
Class control table: the most im portant data structure in OPC module. It
contains the control parameters used to calculate the class weights (cf. Fig
ure 4.1.1). Our design supports the scaling function enhancement (cf. Sec
tion 4.1.3), thus the scaling function results of DSCP values are contained. The
parameters in class control table can be flexiblely configured through Command
Line Interface (CLI) of Cisco routers. For this purpose, link list is a preferred
implementation.
When a packet is passed to OPC module, the classifier first looks up the DSCP
value at the packet header and decides which class this packet should go to based on
the information in the class mapping table, then updates the corresponding counters
and puts the packet into the buffer of the target class. In the simplified casethe
number of DSCP used is equal to the number of class configured at CBWFQthe
mapping from DSCP to class index is one to one (cf. Section 4.1.1). To reduce over
head, we use simple counters, together with existing traffic measurement mechanism
provided by IOS, to record the traffic characteristics of classes.
The weight controller is a timer-driven module whose time interval can be set
through CLI. When triggered, the weight controller first computes the weights of the
classes based on the counter values and the information stored in class control table
following the algorithm described in Figure 4.1.1 (or Figure 4.1.3 if scaling function
is used), then updates the weights for CBWFQ, and finally resets the counters. To
reduce overhead, we use a back-end process to act as weight controller. The timer
inaccuracy is handled accordingly. Section 5.1.3 provides further details.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

74

5.1.3

D ynam ic W eight Com putation

We further simplified the algorithm described in Figure 4.1.1 (or Figure 4.1.3
when scaling function is used) to improve system efficiency. First, rewrite the weight
computation equation in Figure 4.1.1 as follows:
Oik =

,
(1 - V)

(1 -

Akfjk
: +
i

t f

T .U x

u) i

+ v -

Yr

\j
1

Xk
V

Yt

~~7iin
Y )m ax~ t)m in

\j
1

A*(* - mi)
+y_ ^ _
E f= ,
- )
E i= ,v

* e [j.z,].
1,1

In practice, best-effort traffic exists and rjmin 0. Thus, we dont need to keep
track of 77mjn which takes 0 ( n ) time and the above equation can be simplified as
\k

\fc~Jc

* 611,4

(511)

Next, we need to relate Xk to the counter values. Note th at


(k

number -o f jpacket-arrived averagejpacketJength


last-timer-interval

A valid assumption is th at the average lengths of packets in different classes within


the last tim er period are same. Let A k denote the number of packet arrived at class
k, k [1, , L\. Then equation (5.1.1) becomes
Akrik
q = ( 1 ' ! /) e L ^

Ak
+ '/

L ^

|1 ,4

(5' 1'2)

Equation (5.1.2) is used by weight controller module. Note th at the tim er inaccuracy
due to weight controller being back-end process does not influence the final weight
calculation results. When scaling function is used, the corresponding equation will
be
Oik = (1 V)

+ u ^~ ,
A3a(rfi)
Z U AJ

k e [1,L\.

(5.1.3)

In our implementation, we use integer operation when calculating equation (5.1.2)


and (5.1.3) to improve efficiency. To represent weight ak, 0 < ak < 1 by an integer

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

75

Pk, let B be the integer representing the maximum range of (3k, k [1,L], then
ftk = oikB. Our simulation results show th at B > 512 suffices in terms of accuracy
it achieves the same performance results as float weights are used in CBWFQ. The
order of operation in Equation (5.1.3) and (5.1.2) is arranged in the way such that
round-off errors are minimized.
When updating CBWFQ, ftk, k [1 ,L] needs to be converted into bandwidth
share, the d ata format used by CBWFQ.

5.1.4

User Configuration Interface

The parameters of OPC module can be configured through CLI using existing
Cisco QoS configuration framework class-map and policy-map. Class-map defines
the recognized DSCP values or DSCP sets. Policy-map defines the behavior of perhop control including number of classes used, the mapping from DSCP values (or sets)
to classes, and the scaling function values for classes. Policy-map is set on interface
basis, thus an ISP can enforce different per-hop control policies on different interfaces.
There are also some parameters global to all interfaces, including the tim er inter
val of weight controller. This simplifies the weight controller designaddition data
structure (delta list) is not needed to keep track of timer and the recalculation of the
class weights can be done uniformly for all interfaces.

5.2

System Building Procedure


Two independent organizations, Cisco IOS developing team and Network System

Lab at Purdue University collectively built optimal per-hop control in Cisco routers.
The collaboration protocol is as follows: First both parties jointly design the overall
structure of optimal per-hop control (OPC module). Next, Network System Lab
provides the pseudo code of OPC module, and Cisco team incorporates the module
with the rest of IOS and compiles a prototype IOS image. The prototype images is
then downloaded and tested by Network System Lab over Q-bahn testbed at Purdue.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

76

The testing results are sent back to Cisco team for debugging. The modified images
will be tested and debugged in the same way repeatedly until a final version is reached.
The network configuration for testing follows the ones used in simulation (Figure
5.2)to verify testing results. Some software tools are developed to autom ate the tests.
The whole system development process took

months with most of the time devoted

to testing and debugging.


The benchmarking and performance evaluation was carried out over Q-bahn
testbed by Network System Lab independently at Purdue.

5.3

Benchmarking Results
This section presents the benchmarking and performance evaluation results ob

tained from Q-bahn testbedan IP-over-SONET QoS backbone comprised of Cisco


7206 VXR routerswhich is used as a QoS testbed at the Network Systems Lab at
Purdue University. Figure 5.2 shows the 11-switch Abilene-like topology of Q-bahn
testbed.

Figure 5.2. Network topology of Q-bahn testbed

5.3.1

QoS Differentiation

In this section, we examine the QoS differentiation behavior, the most basic prop
erty, of optimal per-hop control.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

77

The first benchmarking setup is as follows: we configure the router with

classes,

and send different traffic volume to different classes while keeping the total arriving
traffic constant. We measure the average QoS experienced at classes 1-8 and examine
the QoS ordering among these classes under different load distribution.
i

i-----------1

i
i i---------1
i
lo ad s h a r e ------------------

p i s -------------

class index

olass Index

Figure 5.3. QoS separation among classes under different load distribution

Figure 5.3 shows load distribution and the average QoS achieved at different
classes. The load distribution is depicted by the percentage of traffic received at
each classes out of the total traffic arrived at the router. The QoS is measured as the
average packet loss rate at each class during the period. Each figure of 5.3 represents
a different load distribution. Class indices are marked on the X-axis. For each class,
the left box represents the load percentage and the right box represents the packet
loss rate. In top left figure, all classes receive same traffic volume thus the load per
centage is 1/8 for all classes. In top right figure, higher classes get more traffic. In
particular, the ratio of load at class 1-8 are 1 : 2 : 3 : 4 : 5 : 6 : 7 : 8 .
left figure, higher classes get less traffic where ratio of load at class

1 -8

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

In bottom

are in reverse

78

order: 8 : 7 : 6 : 5 : 4 : 3 : 2 : 1 .

Bottom right figure shows another pattern of load

distribution where ratio of load at class 1-8 are 2 : 4 :

: 8 : 7 : 5 : 3 : 1. In all the

figures corresponding to different load distributions, we observe the semantics higher


class gets better QoS is preserved. Note th at the average packet loss rate over all
classes are same for all the four figures since the total arriving traffic at the router is
constant.

e ta = 0
e ta = 2
0.8

0.8

0.6

0.6

0 .4

0 .4

0.2

0.2

0
30

28

26
24
22
20
bandw idth (M bps)

18

16

e ta = 6
e ta = 8
e ta = 1 0
e t a s 12
e ta = 1 4

15

14

13
12
11
10
bandw idth (M bps)

Figure 5.4. QoS separation among classes under different congestion level

Next, as a complement to the above setup, we vary the total amount of arriving
traffic at the router, which determines the congestion level and overall QoS at the
router, while keeping the load distribution among the classes constant. Figure 5.4
shows the QoS achieved at different classes under different congestion level. The traffic
load is evenly distributed among the classes, i.e. all classes receive same percentages of
total traffic, and the congestion level is increased/decreased by increasing/decreasing
the traffic sending to individual classes simultaneously. Figure 5.4 (left) demonstrates
the result when

classes are used and Figure 5.4 (right) demonstrates the result when

16 classes are used. The X-axis represents the traffic rate (M b/s) received per class
and the Y-positions of the points mark the QoS (packet loss rate) achieved at the
classes. We observe the following: although the overall QoS is getting better/w orse
as the congestion level (traffic rate per class) is decreased/increased, the semantics
higher class gets better QoS is preserved.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

79

5.3.2

Dynamic Environment

We also conducted more realistic experiments over Q-bahn testbed. In each ex


periment, we generate a large number of processes on the hosts connected to Q-bahn
testbed (Figure 5.2). These processes setup connections (sessions) between each other
and send traffic over Q-bahn testbed. For each session, the source and destination
hosts, interarrival time, lifetime, sending rate, and QoS (throughput) requirement are
randomly selected following pre-defined distribution. Our goal is to study the behav
ior of the optimal per-hop control in large-scale dynamic environment.
Figure 5.5 shows the result of a two-hour experiment. In the experiment, we
generated 1378 sessions with average interarrival time 10 seconds and average session
lifetime 10 minutes. The average sending rate of sessions is 6 M b/s. The traffic are
sent to 5 different classes. The number of sessions selecting different classes has the
ratio 9 : 7 : 5 : 3 : 1, i.e. the higher classes are selected by less sessions.
The left figures in 5.5 shows the workload configuration.

The left top figure

depicts the total number of active sessions over the testbed as time evolves. The left
middle figure depicts the percentage of sessions generated on each end host. The left
bottom figure depicts the sessions distribution among the end host as time evolves. At
any given time, the average number of sessions over all hosts and the corresponding
standard deviation are displayed.
The right figures in 5.5 demonstrates the class assignment and QoS achieved. The
right top figure depicts the percentage of sessions selecting different classes. The
right middle figure depicts the percentage of sessions th at achieve different levels
of QoS satisfaction. The right bottom figure, which is the most im portant figure,
demonstrates the relation between QoS satisfaction level and the class selected. The
average QoS satisfaction ratios for sessions selecting different classes are shown in the
right bottom figure.
The experiment results confirms th at the optimal per-hop control achieve higher
classes get better QoS semantics in any scenarios including dynamic environments.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

80

5.4

Conclusion
This chapter describes our experience on building the optimal per-hop control in

Cisco routers and benchmarking over Q-bahn testbed. Our results shows th at the
optimal per-hop control is a practical and implementable scheme which can provide
predictable services in realistic and dynamic network environments.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

81

N um ber of Total S e s s io n s on Q -B ahn

140

I n j o s histogram

120
100
80

60

40

20

0 1
11/12

10:00:00

11/12

10:20:00

11/12

10:40:00

11/12

11:00:00

11/12

11:20:00

11/12

11:40:00

11/12

12:00:00

11/12

12:20:00

S e ssio n Distribution p e r E n d sy stem

Q o S Satisfaction Ratio Distribution


ciot ratio histogram

0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 0.55 0.6 0.65 0.7 0.75 0.8 0.65 0.9 0.95 1
Q o S Satisfaction Ratio

E n d S y stem ID

E n d sy ste m Activity D eviation

T o S v s. Q o S Satisfaction Ratio

std. dev.
a vg se ssio n s

a v e ra g e q o e ratio

0.9
06
0.7
0 .6

97^7

0.4

i
02

0.1

1 0 :00:00

11/12
11:0 0 :0 0

11/12
11:20:00

00 1

10 11 12 13 14 15 16 17

M e asu rem en t Tim e

Figure 5.5. Q-Bahn experiment report

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

82

6
6.1

STOCHASTIC MODELING AND OPTIMIZATION


Introduction

In this chapter, we study the stochastic modeling and optimization of aggregateflow scheduling in multi-class G / G / l queueing systems, where each user flow has
a QoS requirement and schedulers allocate resources such th at user satisfaction for
the whole system is maximized. This work is motivated by our results in Chapter 3
on optimal per-hop control design which did not consider the impact of stochasticity of the input: arrival processes with general interarrival and service times. For
certain scheduler spacesin particular, work conserving, non-preemptive and nonanticipative schedulersvarious forms of Kleinrocks conservation law [40] can be
shown to hold. The optimum scheduling problem then asks: Given a target per
formance or quality of service (QoS) requirement associated with the input, find a
scheduler th a t comes closest, in a suitable sense, to achieving the desired performance.
Kleinrocks conservation law has the effect of making the optimal scheduling problem
constrained, as the performance space of the input flows must lie on a hyperplane
defined by the conservation law.
The optimal per-flow scheduling problem in multi-class queueing systems with n
flows or types was pioneered by Coffman and M itrani [19], with a string of works
(cf. Section 6.2 for related work) th at have focused on characterizing the structure of
the performance space. Our work can be viewed as extending Coffman and M itranis
per-flow scheduling framework by introducing aggregate-flow schedulers th at capture
the constraint th at a router may only have mn, m < n, service classes at its disposal.
In Chapter

, we specify aggregate-flow schedulers by assuming a many-to-one

function, : { l , . . . , n } > { l , . . . , m } , called a classifier, and requiring th at the


scheduler maps each flow i G {1, . . . ,n} to one of m service classes as specified by

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

83

(i). The resultant m (or less) aggregated super-flows are then treated as input to
an m-flow per-flow scheduler, which, in conjunction with the classifier , defines the
aggregate-flow scheduler. The goodness of an aggregate-flow scheduler is evalu
ated with respect to the QoS received by individual flows, hence the criterion of
optimality remains unchanged. Since m = n and = identity map corresponds to
per-flow schedulers, the optimal aggregate-flow scheduling framework can be viewed
as a generalization of Coffman and M itranis per-flow scheduling framework.

6.1.1

Features of Optim al Aggregate-flow Scheduling

Aggregate-flow schedulingand per-flow scheduling as a special casehas the


following features th at are relevant to optimal scheduling:
Kleinrocks conservation law. The stochastic input along with properties of the sched
uler space induce conservation lawsKleinrocks conservation law [39, 40, 75] and
more restricted forms called strong conservation laws [76, 23]which capture the
trade-off relation th a t to improve service to one flow, the performance of one or more
flows must be sacrificed. Kleinrocks conservation law can be viewed as an applica
tion of Littles conservation law [48, 84]. Both hold for general G / G / l systems and
scheduler spaces, but details of the underlying stochastic framework must be specified
as there is no unique form.
Performance space. Previous works in per-flow scheduling have focused on properties
of the performance space. For a given input and scheduler space, the performance
space is the set of n-dimensional performance vectors th at are achievable by some
schedulers in the scheduler space. Kleinrocks conservation law constrains the per
formance space to lie on a (n 1 )-dimensional hyperplane. Strong conservation laws
additionally impose a polymatroidal structure. For example, for M / G / l systems
with work conserving, non-preemptive and non-anticipative scheduling disciplines,
the performance space is convex and spanned by random weighted combinations of
static priority schedulers [30].

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

84

Optimum performance. The process of finding an optimum scheduler can be divided


into two steps: (i) find a performance vector in the performance space closest to the
desired target QoS, and (ii) identify a scheduler in the scheduler space th at achieves
the performance of the first step. When the notion of closest is captured by the
minimum mean-square error (MMSE) under the L 2 norma commonly used criterion
in estimation, control and optimizationthen step (i) can be further subdivided
into two steps: first, find the orthogonal projection of the target QoS vector on the
conservation law hyperplane, then find a vector in the performance space closest to
the projection point. When the performance space is suitably nicee.g., convexthe
orthogonal projection may lie inside the performance space; in general, this needs not
be the case.
Complexity of optimum scheduling. The 3-step decomposition property of finding
an optimum scheduler facilitates studying the algorithmic aspect of computing an
optimum scheduler for a given stochastic input and target performance, when Klein
rocks conservation law holds, without necessitating a detailed characterization of the
structure of the performance space. The complexity of optimal per-flow scheduling
for a given input and target performance vector involves finding an optimal point
in the performance space with respect to a given performance criterion. Optimal
aggregate-flow scheduling has the added effect of limiting the feasible region to sub
spaces induced by the scheduling constraint th at only m < n service classes are
available to an aggregate-flow scheduler.
Figure 6.1 illustrates of the conservation law hyperplane, performance space, and
aggregate-flow subspace for n 3 and m 2.

6.1.2

N ew Contribution

A conceptual contribution of the work in this chapter is the introduction of


stochastic framework of optimal aggregate-flow scheduling, which extends Coffman
and M itranis optimal per-flow scheduling framework [19]. The generalized framework

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

85
conservation law hyperpJane

performance space

aggregate-flow subspace

Figure 6.1. Depiction of conservation law hyperplane, per-flow performance space,


and aggregate-flow subspace for an n 3 and m 2 optimum scheduling example.

provides, in part, a theoretical foundation for QoS provisioning using aggregate-flow


scheduling under stochastic input. The technical contribution is a computational
complexity characterization of aggregate-flow scheduling where we prove th at opti
mal aggregate-flow scheduling for multi-class G / G / l systems is NP-hard. This stands
in contrast with optimal per-flow scheduling studied in previous contexts, includ
ing certain polymatroid optimization under strong conservation laws and optimal
aggregate-flow scheduling without conservation laws, which are poly-time solvable.
The latter arises in relative service differentiation under static input, and has cubic
time complexity (Chapter 3). Kleinrocks conservation law in combination with opti
mal flow aggregation imparts computational hardness. In settings where only one of
the conditions holds, optimal aggregate-flow scheduling is polynomially solvable.
The complexity result applies to a broad range of multi-class G / G / l queueing
systems under work conserving, non-preemptive and non-anticipative scheduling dis
ciplines, when Kleinrocks conservation law holds, without requiring detailed knowl
edge of the associated performance space. This is achieved by a sufficiency condition
which states th a t (i) as long as the performance space contains an open ballhowever
smallcentered around the performance vector achieved by the FIFO scheduler,
and (ii) within the ball performance inside a service class is the same, the optimal
aggregate-flow scheduling problem is NP-hard. While the aforementioned conditions

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

86

point toward the applicability of the hardness result to general queueing systems,
even for very simple queueing systems such as M /M /1 whose performance space sat
isfies (i) and (ii)i.e., they are not assumptionsoptimal aggregate-flow scheduling
remains NP-hard.
The rest of the chapter is organized as follows. In the next section we discuss re
lated works. In Section 6.3, we introduce a multi-class queueing framework including
assumptions on the stochastic input and scheduler space. We then define the optimal
aggregate-flow scheduling problem for multi-class G / G / l systems. In Section 6.4, we
describe the structure of optimal aggregate-flow scheduling and state the main result.
Section 6.5 proves hardness of one-dimensional clustering with linear constraints, a
key component in the proof of the main result. The results in this chapter are also
presented in [71].

6.2

R elated Work
The problem of finding an optimal scheduler for a given performance target under

stochastic input in per-flow multi-class queueing systems was introduced by Coff


man and M itrani [19]. Since then, the area has been intensively investigated, with
focus on characterizing the per-flow performance space satisfying strong conserva
tion laws [76]a weaker form being Kleinrocks conservation lawunder different
assumptions on the input and scheduler space.
Coffman and M itranis seminal paper [19] studied the per-flow multi-class M /M /1
system for preemptive schedulers. They showed th at the performance space is con
vex and spanned by static priority schedulers (i.e., priority queues). Gelenbe and
Mitrani [30] studied the M / G / l queue for non-preemptive schedulers and proved
analogous results. In [23], Federgruen and Groenevelt extended [30] to M / G / c , and
they established a corresponding result for G / M / l under preemptive schedulers in
[22]. Federgruen and Groenevelt [23, 22] used the polymatroidal structure of the per
formance space to show th at convex separable objective functions can be optimized

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

87

in polynomial time. Georgiadis and Viniotis [32] considered G I / G / 1 systems under


general work conserving schedulers, including time-varying adaptive policies, without
a priori assuming ergodicity. Shanthikumar and Yao [76] established the equivalence
between strong conservation laws and the polymatroidal structure of the performance
space, showing th at the latter is not only sufficient but also necessary. Bertsimas [2]
provides a survey of optimal control in multi-class queueing systems following Coffman
and M itranis framework [19], where relaxation is used to find approximations of the
performance space and corresponding performance bounds. Chen and Yao [7] (Chap
ter

11

) provide a comprehensive exposition of related works, including generalized

conservation laws and their relationship to extended polymatroids with application


to closed queueing systems.

The aforementioned works in optimal per-flow scheduling considered computa


tional complexity. Polymatroid optimization, enabled by strong conversation laws, af
fected poly-time solvability of convex separable functions for queueing systems where
the

2 inequalities of the strong conservation law can be represented in space

polynomial in n. For linear objective functions where the optimal solution must be a
vertex of the base polytope of the polymatroid, this dependence of the input size on
a compact representation of the constraints is removed. This is due to the fact that
an optimal solution can be recovered from the n coefficients of the objective function.
Poly-time solvability of linear objective functions extends to generalized conservation
laws. An im portant negative result was proved by Papadimitriou and Tsitsiklis [58]
for optimal control of closed multi-class queueing networks which they showed to be
EXP-complete. The queueing system considered, however, is significantly different in
th at it is a closed queueing network where stochasticity only arises from the (expo
nential) service time distribution, and combinatorial hardness is built in at multiple
levels including routing and scheduling. Thus, it is not surprising th at the stochas
tic control problem considered is computationally hard, although the degree of its
hardness (EXP-complete) is interesting.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

88

In Chapter 3, we studied optimal aggregate-flow multi-class scheduling in a shared


resource environment where bandwidth is directly provisioned. T hat is, the semantics
of resource sharing is with respect to the underlying resourcei.e., bandwidthand
not the performance induced by the apportioned resources.

Under the minimum

mean-square error (MMSE) criterion and relativization of user bandwidth require


ments, optimal aggregate-flow scheduling is shown to be poly-time solvable, in partic
ular, has cubic time complexity. This chapter generalizes the optimal aggregate-flow
multi-class scheduling framework by considering stochastic input for which Kleinrocks conservation law holds. The study of optimal per-flow multi-class scheduling
showed th at per-flow scheduling combined with strong conservation laws leads to
poly-time complexity of convex objective functions. In Chapter 3, it was shown that
aggregate-flow scheduling without conservation laws is also computationally tractable.
This chapter shows th a t aggregate-flow scheduling combined with conservation laws
is NP-hard.
A key component to proving NP-hardness of optimal aggregate-flow schedul
ing involves reduction from a one-dimensional constrained clustering problem with
MMSE objective function. Surprisingly, little is known about the complexity of one
dimensional optimal clustering with linear constraints and MMSE objective function.
We prove it is NP-complete. Brucker [6 ] showed th at one-dimensional unconstrained
clustering under MMSE is poly-time solvable.

The complexity of fc-dimensional,

k > 2, unconstrained clustering under MMSE remains an open problem [6 , 35, 28].
Gal and Klots [27] solved a form of one-dimensional unconstrained clustering where
the objective is to maximize the sum of average group weights over all groups.
Statistical multiplexinga form of flow aggregationhas been studied with the
aim of characterizing the multiplexing gain via envelop curves and effective band
width [4, 21, 33,

86

]. From a scheduling perspective, these works fall under the

category of per-flow multi-class scheduling. They showed th at bandwidth is more


efficiently utilized the more flows are multiplexed or aggregated. In contrast with
Coffman and M itranis per-flow scheduling framework where the performance space

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

89

given by strong conservation lawscaptures the trade-off across different schedulers,


in statistical multiplexing an effective characterization of achievable performance for
specific schedulers (e.g., FIFO, EDF, SP, GPS) is sought with the aid of the law of
large numbers assuming independence among flows.
A notion similar to flow aggregation, known as job grouping [82], has been studied
in the scheduling literature. The job grouping model studies optimal scheduling for
deterministic sequences of jobs. Aggregation is defined at the individual job level, and
optimal scheduling studies the grouping of jobs so as to minimize processing time.
In essence, optimal scheduling in the job grouping context involves unconstrained
clustering which, in the absence of conservation laws, is polynomially solvable by
dynamic programming.
Although optimal aggregate-flow scheduling was originally motivated by the prac
tical context of designing efficient routers for achieving scalable QoS provisioning, it
has led to interesting linkages between stochastic scheduling, clustering, job grouping,
and scalable network design.

6.3

System M odel
The system model is comprised of four parts. The first part describes the stochas

tic framework of multi-class queueing which, in addition to notational set-up, is


needed to define aggregate-flow scheduling.

The second part defines the optimal

aggregate-flow scheduling problem. The third part discusses Kleinrocks conservation


law with respect to the underlying input and scheduler space, and how it impacts
scheduling. The fourth part specifies an interface between the stochastic and algo
rithmic components of optimal aggregate-flow scheduling which is used in Section 6.4.

6.3.1

M ulti-class Queueing M odel

Consider an n-flow (or type) G / G / l system with general interarrival and service
times. Let U = (14), 14 = [Tk, *4, 14], k G Z, be a sequence of random variables rep

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

90

resenting the input, where Tk R+ (the positive reals excluding 0) is the interarrival
time between the A;-th and (k + l)-th arriving packets, Sk K+ is the service time
of the k-th packet, and Ck

{1, , n} is the flow index of the k-th arriving packet.

Following [5], the evolution of the system at arrival instants may be described by a
recursive stochastic equation
X k+1 = f ( X k, Uk) = f ( X k, [Tk, Sk, Ck}),

ke Z

(6.3.1)

where X k denotes the state of the system seen by the A;-th arriving packet, and / is a
measurable function on the joint probability space and captures the scheduling disci
pline. For stochastic schedulers, an additional random input is needed to represent the
behavior of / . For strictly stationary input, (6.3.1) can be shown to have well-defined
solutions [5]. For our purposes, it suffices to have a stochastic framework wherein
Kleinrocks conservation law holds, which need not be restricted to strictly stationary
input. For example, this applies to second-order self-similar processes which are used
to model long-range dependence in Internet traffic.
Recall in Chapter 2, we specify aggregate-flow schedulers by assuming a many-toone function, : { 1 , . . . , n} > { 1 , . . . , m}, m < n, called a classifier, and requiring
th at a scheduler maps each flow i G { 1 , . . . , n} to one of m service classes as specified
by (i). The resultant m (or less) aggregated super-flows are then treated as input to
an m-flow per-flow scheduler, which, in conjunction with the classifier , defines the
aggregate-flow scheduler.
D e fin itio n 6.3.2

An n-to-m aggregate-flow scheduler g is a pair ( , / ) , where :

{ 1 , . . . , n} > { 1 , . . . , m } is a classifier, / is a state transition function of an m-flow


multi-class queueing system, and the action of g is given by
X k+i = g (X k,Uk)
=

f ( X k,[Tk, S k, t ( C k) ]),

kez.

Since and / are measurable, g is measurable. An n-to-m aggregate-flow scheduler


is just a special case of an n-flow per-flow scheduler, hence the stochastic framework

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

91

of per-flow scheduling can be inherited. On the other hand, since m = n and =


identity map corresponds to per-flow schedulers, aggregate-flow schedulers can be
viewed as a generalization of per-flow schedulers, where the classifier allows various
forms of information loss due to m < n to be imposed on the system, including the
special case of no loss.
An alternative definition of aggregate-flow scheduler is: g is an n-to-m aggregateflow scheduler if there exists a classifier such th at given an arbitrary system
state Xk seen by the k-th packet and two instances of the (k + l)-th packet
Wfc-t- 1 [^fc+i)Sfc+i, Cfe+i] and u^+i [^fc+i>fc+i>^fc+i], Qi^ki^k+i) s(*fc>^hfc+i) ^
(cfc+1) = (c4 +1). The second definition is more behavioral and does not make use
of a state transition function / of an m-flow multi-class queueing system. The two
definitions can be shown to be equivalent.

6.3.2

Optim al Aggregate-flow Scheduling

Let w = (u>i, , wn) denote the performance vector of an n-flow multi-class


queueing system where u>i > 0 is a performance measure associated with flow i. In
general, w may depend on (Xk) which is determined by the state transition function /
and input U = (Uk). This is denoted by w 4>f(U) where >/ is assumed measurable.
Given input U and a set of schedulers <S, the (per-flow) performance space with respect
to U and S is defined as

n = { w : 3f e S , w = M U ) } -

Thus, % is the set of performance vectors th at are achievable by some schedulers in the
scheduler space. We use Ha to denote the corresponding aggregate-flow performance
space when / is restricted to be an n-to-m aggregate-flow scheduler.
Let 0 (0i, , 9n) be a desired target performance vector where 0* > 0 repre
sents the performance or QoS requirement of flow i. Under the minimum mean-square

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

92

error criterion of goodness, the per-flow optimal scheduling problem in n-flow multi
class G / G / l systems given input U and scheduler space S is defined as
inf ||w - 0 | | 2
weH

(6.3.3)

where || || is the L 2 norm. Note th at finding a scheduler / G S th at achieves an


optimal w is treated as a separate problem. This is justified by the negative nature of
the main result. The corresponding optimal aggregate-flow scheduling problem when
the scheduler is restricted to belong to the aggregate-flow scheduler space S a is given
by
inf | | t u - 0 | | 2
(6.3.4)
t0/Ha
where H a is the aggregate-flow performance space under input U and scheduler space
S a. For m = n, H a = H, i.e., per-flow scheduling is a special case. Hence, the case of

interest in aggregate-flow scheduling is m < n for which H a Q H .

6.3.3

Conservation Law

Kleinrocks conservation law takes on the form of an inner product functional.


The following shows an instance under strictly stationary input.
1

Let 1/A = E {T 0},

/ p - E { S 0}, p = X/p, and A* = P { C 0 = *}A, 1/pi = E { S 0 \ C0 = *}, Pi = \ / p u for

i=

, , n. Let Xk be the waiting time W*, and let Wi = E{Wo \ Co = i}.

Proposition 6.3.5 Consider an n-flow G / G / l


i.e., astationary solution

queueing system

in steady state,

of (6.3.1) exists. Given input U, for any non-preemptive,

work-conserving and non-anticipative scheduler,


n

where c >

^ P iW i = c
i~ l
is a constant that depends on the input.

Proposition 6.3.5 is a straightforward consequence of Theorem 6.3.2 in [5] (Chap


ter

, pp. 202).

In general, Kleinrocks conservation law specifies an (n 1)-

dimensional hyperplane
H* { w : p o w c}

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

(6.3.6)

93

where p = ( p i , . . . , pn). We have H a C % C %*. Strict stationarity in not necessary


for Kleinrocks conservation law to hold. For example, for second-order stationary
processes which have been used to model long-range dependent traffic [64], Kleinrocks
conservation law can be shown to hold under certain steady state assumptions.
Coffman and M itranis per-flow scheduling framework [19] uses strong conservation
laws [76] where, in addition to (6.3.6), 2n 2 inequalities over the subsets of J C
{ 1 , . . . , n} are imposed
Y^PiWi > CJieJ

Jointly with (6.3.6), the inequalities lead to a polytope whose vertices are static
priority schedulers. In our work, the weaker form (6.3.6) suffices.

6.3.4

Aggregate-flow Performance Space and Open Ball Containment

Given : {1 , . . . , n} > { 1 , . . . , m}, let


Qm = [J^

= {u G W1 : Vi = Vj if (*) = (j)}. Let

where the union is over all n-to-m classifiers . Thus, Qm denotes the set

of all vectors th at have at most m distinct elements. Under the assumption


Wi

= wt;(i),

* = 1, , n

(6.3.7)

where flows belonging to the same service class experience the same performance, it
holds th at

nac n n gm.

(6.3.8)

Let q > 0 denote the performance vector achieved by the FIFO scheduler. Note
th at in the definition of classifier, need not be surjective. Hence, for 1 < m < n,
FIFO is an n-to-m aggregate-flow scheduler and q E'Ha- By work conservation and
assumption (6.3.7), we have qx q2 = = qn = c/p. The following assumption
( open ball containment property) provides an interface between the stochastic and
algorithmic components of optimal aggregate-flow scheduling:
3r >

such th at Br{q) C\'Ha = Br(q) DH* C\Qm

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

(6.3.9)

94

where Br(q) C Kn is the open ball of radius r centered around q. We will show that
assumption (6.3.9) suffices to make the main complexity result hold for any multi-class
G / G / l system when Kleinrocks conservation law holds.

6.4

Structure of Optimal Aggregate-flow Scheduling


In this section, we state the main result and show the structure of optimal

aggregate-flow scheduling which can be related to one-dimensional optimal clustering


with linear constraint. The latter is shown to be NP-complete in Section 6.5.

6.4.1

M ain Result

Theorem 6.4.1
vector

Given an n-flow m-class G / G / l system, with user QoS requirement

and work-conserving, non-preemptive and non-anticipative schedulers,

assume Kleinrocks conservation law (6.3.6) holds and the open ball containment prop
erty (6.3.9) on 'Ha is satisfied.

Then optimal aggregate-flow scheduling under the

MMSE criterion (6.3.4)


inf llto 0\\2
we.Ua
is NP-hard.

Theorem 6.4.1 says th at for multi-class queueing systems with general input where
Kleinrocks conservation law holds and containment of a well-structured ball in Ha
is satisfied, the problem of optimally grouping n input flows into m service classes
such th a t the realized performance comes closest to users target performance in the
mean-square sense is computationally hard. In fact, our proof shows th at a special
case of the main result with m 2 is already NP-hard. For strong conservation laws,
H is convex and (6.3.9) is more easily satisfied.
The decision problem corresponding to Theorem 6.4.1 is given by the following.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

95

P ro b le m 6.4.2 [OPT-AGG]

Under the assumptions of Theorem 6.4.1, for given

K K+ and 9 R" decide if there exists w ( z Ha such th at


11u> - O f < K.
We remark th at the parameters of the input instance are actually defined over the
rationals Q, consistent with convention of computational complexity for capturing
input length. For notational simplicity, we will continue to use reals assuming this is
understood.

6.4.2

D e c o m p o sitio n

In the definition of OPT-AGG, it suffices for the performance requirement vector


to lie on the conservation law hyperplane, i.e., 9 %*.

The restricted problem

is called OPT-AGG'. Both problems are computationally equivalent: a poly-time


algorithm for OPT-AGG' gives a poly-time algorithm for OPT-AGG, and vice versa.
This is enabled by the next proposition.
P ro p o s itio n 6.4.3

Given 9 Kn, let 9 be the orthogonal projection of 9 on %*;

i.e., 9 is the unique solution to m in^e^. ||i 0 O f . For any A C %*, w* is a solution
to
inf I|i0 O f

weA

if and only if w* is a solution to


inf \ \ w - 9 f .

wA

Proof. The orthogonal projection is given by


9 = 9 + --ir4 ? -p
where p and c are the parameters of W*. Since A C %*, for all w

(6-4-4)
6

A we have

\\w -0 f = \\w -0 f + \\9-9f.

\\e-er is a constant from which the proposition follows.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

96

Proposition 6.4.3 states th at finding a closest point in a subset A of the conserva


tion law hyperplane with respect to an arbitrary performance requirement 0 can be
decomposed into two steps: first, finding the orthogonal projection of 0 on %*, then
finding a point in A closest to the projection point. When A = H and the per-flow
performance space % is suitably nicee.g., convex polytope under strong conser
vation lawthen the optimal per-flow solution is either the projection point itself, or
it lies on the boundary of the polytope %. In either case, the optim al solution can be
found in polynomial time. The next proposition relates the optimal per-flow solution
to a fairness property.

P ro p o s itio n 6.4.5

w E Ti* is the solution to m in ^ ^ . ||tu 0\\2, if and only if w

satisfies
wi - 0\ _
Pi

_ wn - 6n

(6.4.6)

Pn

Proof, w %* if and only if p o w = c, and w satisfies (6.4.6) if and only if 3 c G K


such th a t w = 0 + cp. Jointly we get w = 0, where 0 is the orthogonal projection of
6 on V.*, as specified by (6.4.4).

(6.4.6) can be viewed as a fairness criterion in the following sense: pi = \ / Pi repre


sents the traffic intensity of flow i; w Oi measures the deviation from the performance
target. By setting Wi Oi proportional to pi, we enforce the trade-off: The excess
happiness (or misery) is inversely proportional to how much one sends.
Let OPT-AGG" be a further restriction of OPT-AGG 7 where the QoS requirement
vector 0 is restricted to lie in an open ball on the hyperplane

0 Br/2(q)nn*
where r >

is the radius of the open ball in Theorem 6.4.1. Since OPT-AGG" is a

special case of OPT-AGG7, OPT-AGG 7 is computationally as hard as OPT-AGG".


Thus, to show hardness of OPT-AGG, it suffices to show th at OPT-AGG" is NP-hard.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

97

6.4.3

Clustering

We show th at optimal aggregate-flow scheduling is equivalent to an optimal clus


tering problem. Given : {1, , n} > {1, , m } and d [d\, , dm] G Km,
define 7t(, d) G Kn
^*(5 d )

~ ^(*)>

* =

!)

i^-

Thus 7r is an n-dimensional vector whose components, as dictated by , take on values


from the m-dimensional vector d. Following is a definition of a clustering problem
which we show is equivalent to optimal aggregate-flow scheduling.

Problem 6.4.7 [OPT-CLUST]

Under the assumptions of Theorem 6.4.1, for given

K G K+ and 9 G Br/ 2 (q) D Pi* decide if there exist and d G K+ such th at


M t,d)-e\\2< K
subject to 7 r(, d) G %a-

Proposition 6.4.8

OPT-AGG" has a solution if and only if OPT-CLUST has a

solution.
Proof. We can consider two cases: (i) K > ( r / 2 ) 2 and (ii) K < ( r j 2)2. In case (i),
w q, and (i) = 1 for i = 1 , . . . , n and d\ = q\ are solutions to OPT-AGG,; and
OPT-CLUST, respectively.
Consider case (ii). Assume OPT-AGG" has a solution w . The radius of the open
ball in the definition of OPT-AGG" implies th at w G Br(q). Since w G Tta, by (6.3.9),
w G Qm. Thus there exist and d such th at w = 1r(, d), and , d is a solution to
OPT-CLUST. In the reverse direction, w = 1r(, d) is a solution to OPT-AGG".

Proposition 6.4.8 implies th at OPT-CLUST and OPT-AGG" are computationally


equivalent. Thus to prove Theorem 6.4.1, it suffices to prove th at OPT-CLUST is
NP-complete.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

98

6.4.4

Open Ball Scaling

To prove NP-completeness of OPT-CLUST, we first put OPT-CLUST into a polynomially equivalent form OPT-CLUST'.

Problem 6.4.9 [OPT-CLUST']

Under the assumptions of Theorem 6.4.1, for given

K G R+ and 0 E H * decide if there exist and d Rm such th at


IM ? .d )-0 ||2 < K
subject to 7 r(, d)

%*.

OPT-CLUST' is a relaxation of OPT-CLUSTas opposed to specializations car


ried out in previous transformationswhere 0 and

7t ( ,

d) are allowed to be in %*.

Therefor the equivalence reduction between OPT-CLUST and OPT-CLUST' is more


complex.

Lemma 6.4.10

O PT-CLU ST and O P T-C LU ST are computationally equivalent.

Before proving Lemma 6.4.10, we give Proposition 6.4.11 which is needed in the
proof of the lemma. The proposition says: performing optimal clustering, given a
target performance vector on li* th at is far away from q, is equivalent to performing
optimal clustering with respect to a target performance vectoractually, a continuum
of vectors on the line connecting the original target vector and qarbitrarily close
to q. The scaling is determined by the parameter c > 0 in the proposition.

Proposition 6.4.11

Given c e R+, K e R+, 0 R , and

R , there exist '

and dl Rm such that


||tt(', d!) - 0||2 < K ,
if and only if there exist " and d"
I K f , d") - (9 + c

where qi y>n0p

tt(', d!) o p = 0 o p

Rm such that

) I I 2 < 4cL'

*(f"> d") o f> = (9 +

for * = ! , , n.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

99

Proof. (=>) Let " = ' and d" = q +

for i = 1, ,m where q =

. Then

id ) q

_ i

n? , a ) = q + ------------
We have

IWc",d")-(9 + ^

? )ll2 =

~(g + ^ p ) l l 2

iW

H ,,

<
~

( q + - - - ** ^ - ) o p = ( g +

c2

and
tt(", d ) o p =

) P

(<=) Let ' = " and d\ = q + c(d" <?) for * = 1, , m. Then


Tr(t',d') = q + c ( 7 r ( e ,d " ) - q ) .
Analogous to the if-part of the proof, it is straightforward to verify th at ' and d!
satisfy conditions ||7r(', d') 0 f < K and 7r(', d') o p = 0 o p.

Proof of Lemma 6.4-10.

(i) Given an instance of OPT-CLUST' specified by K ' and

O' , construct an instance of OPT-CLUST as follows: K and 6 are defined as


K'
K = ,
&
where c = \/3 ||0 ' q f / r .

We have 0

0' - q
0 = q + -------2
c
o

p= q

p and

||0 qf| | 2 = r/3 .

Thus

0 GBr/ 2 {q) fl %*. We will show: 3 ' and d! such th at


||t t ( ',O - 0 '||2 < K \

tt(', <) O p =

O' O p

(6.4.12)

if and only if there exist and d such th at


||7 r ( ,

d) - O f < K ,

tt( , d) G

H a.

(6.4.13)

Suppose ' and dl satisfy (6.4.12). By Proposition 6.4.11, 3" and d" such th at
||tt(", d") - 0 | | 2 < K ,

tt(", d") o p = 0 o p .

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

(6.4.14)

100

Of those solutions satisfying (6.4.14), choose *, d* such th at || 7r(*,cT) 0\\2 <


\\q 0\\2. To show th at 7r(*, d*) e Ha, it suffices to establish th at 7r(*, d*) is within

distance r of q. We have
M C , d*)~ q f

<

h (e,d * )-O \\2+ \\0 - q

<

2 ||9 -

9 ||2

\\2

|r ,

from which (6.4.13) follows. Conversely, if and d satisfy (6.4.13), by Proposition


6.4.11, there exist ' and d' satisfying (6.4.12).
Given an instance K and 0 of OPT-CLUST, use the same instance as input

(ii)

to OPT-CLUST'. Suppose OPT-CLUST has a solution , d. Since %a C TL*, , d


is also a solution to OPT-CLUST'. On the other hand, suppose OPT-CLUST' has a
solution ', d'. By the same argument as in (i), we can find a solution *, d* which
satisfies (6.4.13).

The preceding results have shown that


OPT-AGG

OPT-AGG' <^= OPT-AGG"

OPT-CLUST

OPT-CLUST'

where A < = B denotes polynomial reduction from problem B to problem A. Since


OPT-AGG is the decision problem corresponding to the optimization problem in
Theorem 6.4.1, the main result is proved by showing OPT-CLUST' is NP-complete.

Theorem 6.4.15

OPT-CLUST1 is NP-complete.

Section 6.5 is devoted to proving Theorem 6.4.15.

6.5

Complexity of Optimal Clustering


We formulate an unconstrained optimization version of OPT-CLUST', called

OPT-CLUST", and show th at they are equivalent. OPT-CLUST" is then proved


NP-complete in Section 6.5.2.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

101

6.5.1

Unconstrained Optimization and Subspace Projection

Given 0, p, and , let >| = {* : (i) = j} , j = 1, , m denote the set of flow


indices mapped to j under classifier . Define
for j =

, , m , and

= [r|, , r] as r | =

Problem 6.5.1 [OPT-CLUST"]


G

R f and 6

P* ^or 3 ~

>m Let

= n ( t,r () .

Bf = 7 r(^ e?) and

, e] as

Under the assumptions of Theorem 6.4.1, for given

H* decide if there exists such th at


12 , { E j p p - d o p f
lg ~ ^ l l 2 +
II\R
Df 112
^ < K-

Lemma 6.5.2

O P T-C LU ST and O P T-C LU ST1 are computationally equivalent.

Lemma 6.5.2 is proved with the help of the next proposition which states th at for
a given classifier , the optimal performance vector

which resides in the subspace

TL* n<? can be calculated in 0 ( n 2) time.

Given , 0

P ro p o s itio n 6.5.3

K" , p

K ", the solution to

min ||7 r(, d) 0 ||2,


a
subject to

7 t( ,

d) o p

(6.5.4)

o p is given by
E . i s | K r | - E UOiPi

d( = e ( - r *

Furthermore, let

- . I W

'

= 7r(, d^). Then


w ^ E s - Rs

o p 0 o p
* *1
,1-Rf" 2

(6.5.6)

and
\B -

||0

- B { | | 2 + (E i

P)

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

(6-5.7)

102

Proof.
7 t(,

Given , we can use Lagrange multipliers to get d^. Rewrite (6.5.4) and

d )o p = 0 o p a s
m
mj n X

m
(6i ~

pi ~ 0 p = 0

X dJ X

j = 1 f(*)=j

i= l

(t)=j

respectively. The Lagrangian is given by


m
L(d, A) = X

m
X (

0 * - ^ ) 2

+ a (X ^

j = l (i)=j

j=1

Y ,P * - 0

o p )-

(*)=7

We have
A T-

- ^ 2 ( 0 i - d i ) + A ^ ft = O,
f(0 =J
C(*)=J

^
7-

aT

j = 1, , m,

(6.5.8)

XI
i=i

X I pi ~ 0 p = CW=i

(6-5,9)

From (6.5.8),

^ = ^

- * ^

=4-54.

'

<6-5-10>

Substitute (6.5.10) into (6.5.9),


2

E " ,e |E WWf t - 2 g P
E = 1 >1 , ) = , f t

E " , I ^ K - j -

E " =1^

E J ., |s |l ( i ) J

Thus
(6-5.il)

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

103

Note th at (6.5.5) is a vector version of (6.5.11). By the definition of 7r, (6.5.6) follows
from (6.5.5). For (6.5.7),

lie - E { | | 2 + l|.{ll2( ^ f n ^ |p P ) 2
+ 2(9 - E ^ o R f

Proof of Lemma 6.5.2.

Et 0 0 -0 0 0
^

Consider an instance specified by 9 and K for both

OPT-CLUST' and OPT-CLUST" (i) Suppose OPT-CLUST' has a solution , d. Ac


cording to the optimality of

described in Proposition 6.5.3,

0|| 2 < lk ( ,< 9 -

O f < K . Thus is a solution to OPT-CLUST". (ii) Suppose OPT-CLUST" has a


solution . We have \\w% 0\\2 < K and w ^ o p = 0 o p. Thus , d$ is a solution of
OPT-CLUST'.

From a geometric perspective, the optimal performance vector

for a given

classifier is the orthogonal projection of 0 on the subspace %* fl Q$. This orthogonal


relationship is captured by the next proposition which will be used in the proof of
Theorem 6.5.13 in the next section.
P ro p o s itio n 6.5.12

Given 0 W f, p W f, and , fo r all w satisfying w G

and w o p = 0 o p,
\\0 w\\2 = \\0 w^\\2 + ||iO tn | | 2
where

is defined by (6.5.6).

Proof. Since w G G^, 3 d G Mm such th at w = 7 r(, d). We have


m
(0

Es)

(w t -

w) = ^ 2 ^ 2
j = 1 (i)=j

~ e\ ) i d\ ~ dj) = 0

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

104

since

- e|) =
m

and
m
r l ( d l ~ d j) = Y j J 2

R s(w s - w ) = J 2
j= 1 (i)= j

p i (d l - d j ) = p w ( i - p w

Q.

j = l (i)= j

Thus,
(0 - w s ) o (ti> - w )

- .E* +
(0

|)

E ^o p 0 o p
g 0| | ^ | | 2 P )
(lU

w)

E ^o p 0 o p
iu) H-----^ H p [|2------ R$ O (w

w)

R>
0

We have

6.5.2

\0 w \\2 =

||0

||0

in^ll2 + ||u> w \\2 + 2(0 W{) o ( w ( w )

w\\2

Hardness of One-dimensional Clustering

Theorem 6.5.13

OPT-CLUST" is NP-complete.

To prove NP-completeness of OPT-CLUST",

we will reduce the follow

ing problema variant of PARTITION known to be NP-complete [29]to


OPT-CLUST".

Problem 6.5.14 [EVEN-PARTITION]

Given X \, , X n, X i G N, i 1, ,n , n

is even, is there a subset S C {1, , n ) with cardinality n /2 such th at

5 >
ies

= sX >
i= i

The next proposition is used in the reduction from EVEN-PARTITION to


OPT-CLUST" in the proof of Theorem 6.5.13.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

105

P ro p o s itio n 6.5.15

Given real numbers w[, , w'k, pi, , p'k and w", , w",

p'/, , p with mean ui', p', w", p", respectively, and k, >

, define wi, - ,w n,

Pi) ' ) Pn> where n = k + , as follows:

w'i

< i < k,

Wi

w"_k ,

k+

p'i

< i < n,

<i<k,

Pi

Pi-k > k + 1 < * < n Let w, p be the mean of the Wi and p is, respectively. Then
k

j^(Wi - w)2 =
izz 1

kl
k+

<sS
>.
1

^ 2 ( Wi ~ ) Pi
i1

i1

ii

Y l w - ')2 + E

')p'i

k
k + {

+ Y lW
i-1

(6.5.17)

Proof.
J 2 (w i - w ) 2
i= 1

^ 2 ( wi - w)2 +
i =1
i =1

~ w ^2

^ 2 ,(W'i ~
i=l

+ k ^ )>~

^ ) 2

^ Ifa i ~

*i-= 1

k
k2

= 5 2 ( w'i - ')2 + k {^ {w ' - w")Y


i
1
+E W
i =1

'

+ RTh " - ') )


kl

5 > - - ') 2 + e

- ")2 +

')2

<=1

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

^ ) 2

106

] P ( W i - w )pi
i =1

_ w )p'i + i , ( wi ~
i =1

i
1
k

w ) p '1

- ')p'i + J 2 ( wi - w")pi
i =1

i =1

+ 5 3 ^ ~ ^P i +
i =1
i=l

= EM - >' + X>? i=l

i=l

* ,

i=l

i =1

kt

e m
i =1

- * v .+ E w
i=l

- ft

Proof of Theorem 6.5.13. It is easy to check th at the problem is in NP. We will reduce
EVEN-PARTITION to OPT-CLUST". Given an instance of EVEN-PARTITION
specified by Ai , - - - ,X i, let X max, X min denote the maximum and minimum, k =
| { 1 ) +

, and let

2(fc + | )

C = max

( ufU + (fc+i!)~(fc+i2) ^ A 4 ) ) A maa; ~ fc+!T^mn + (fc+k)(fc+k)^

^(44))

(fc+^)(fe+^)^ _ ^ (4 > 4 ) )

+ 1.
Construct an instance of OPT-CLUST" specified by n, m, K , p, 0 as follows:
n

2k + ,

AT =

kt
*; +

1,
0i =

Pi

= P i/A

1 < i < A;,

3,

k + + 1 < i < n,

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

107

where
I

1 < i < k,

C + -Xj-fc ,

/c + l < i < A ; + T ,

A; + + l < * < n .

We shall show: 3 S' C { 1 , , f), \S\ = | and

Yies X i = \ Yl=i X u

if and oniy if 3

such th at
lie - w 5 | | 2 = ye - y i 2 + ( Ef '

<

where w f is defined by (6.5.6).


(=) Suppose 3 S C { 1 , , }, l^l = , and Y ie s X i = \ Y U i
( 1,
(*) = <
y 2 ,

x u

Define as

%e { l , - - - ,k } U{i : i - k e S},
i {k + + 1,

n} U { i : i k 51}.

Using (6.5.16) from Proposition 6.5.15, we have

A; +
Let y =

, then
2

0 oP - Et op

| +| |Ae: - B ||2
7 = k + l - e)2 +

ies\
= U- Using (6.5.17), we have
A^l
r( 2 - l ) 2 + - ^r ( 3 - 2 ) 2 -

- e)2
ies|

.o

I A:

( * , - e\)Pi + ^ 2 ( 0i ~ e\)pi
ie,S'l
ies|

Thus | | 0 - m d | 2 = - & < K .


K+2
(< = ) Suppose there is no 5 C
2

1,

*- *

,1}

Y\=\ Xi- We will show for all , ||9 tn ^ | | 2


((i) = 1 U
2,

th at satisfies |5 | = | and

Y ie s X i

> K . can be expressed as

i S - u S uSi S ' a US't U S:C>

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

(6 .5.18)

108

where S aUS'a = {1, , k}, S bl)S'b = { k + 1, , k+}, and SCUS'C= {k+ + 1 , , n}.
Thus, all must fall into one of the following cases:

(a) S 0 =

{1, , k}, Sc = 0, \Sb\ = l;

(b) Sa =

{ l , - . . , f c } , S e = 0, \Sb\ ^ f;

(c) 0 C S a C {1, , A;}, 0 C S c C {k + I + 1, , n};

(d) 0 C 5 a C { l , . . . , A } , 5 c = 0;

(e) Sa =

Note th a t (d) follows from (c), and (e) follows from (a), (b) and (c). We will show
th a t ||0 u>f| | 2 > K holds for (a), (b), and (c).
Case (a).

Let yl =

y2 =

j f f l - - , and rb =

P' =

rb = ^ | s '| Pl = C^ 2 By the contrapositive hypothesis, y\ ^ y2, thus rb 7 ^ r'h. Using


(6.5.16) and (6.5.17), we have
ii0 - j S f i i 2

X > - e)2 + E w

- e(>2

ie s |

and
O o p - E f : o p

^ (0 * -

e\)pi + ^ ( 0 * -

ies\

e\)pi

ies%

r f i < 2 - i><rs -

T - r r ( n - K) /
fc +

1) +

- 2 ) d - >i)

Thus ||0 - iUf | | 2 > ||0 - JE7C | | 2 = -Mt = K .


A
C+ 2

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

109

Let |S,| = t u

Case (b).
P' =

/>, =

|SJ| = l 2, Vl = s + y t * , y2 =

and

. We have fi / i 2. W ithout loss of

rj, =

generality,assume l\ > l 2. Using (6.5.16),

- 2= i r k + ^

By the definition of C,
C >

+fcy2 +fcyi +

Since k |( 1) +

(F+i1)(fc4-k)^^i , ^2) ^ mox + (fc+ii)(fc+I)+ (fc+l)1(fc+i)^(^1^2)


k2^ i - l2)_____ k(t i-e2) Atp p \
(k+ei)(k+e2) (fe+^i)(fe+<2 )yiUi> i 2 )

, k A ( li , l 2) > 0. Hence,

* * (4 -4 0

v\

V ( t + < ! > ( : + 2>


(* + < i)(* + 4 )
'
*,-<,)
,, . .
*2 i - fe)
+ (k + h H k + i,)- ( l, !)
(* + <,)(* + 4 )
Noting
f k ii
\k + h

* + < j"

kh
fc + f i
,,

. >

(Jfe-Mi)(k + / a) 1

2,1

= j ^ t ~ M b "'e have
k l 2 \ (n
k +lj(

k + L Vl
m

fc 2
k + lo V2

- l 2)

> (* + <,)(* + 4 )

H tjjJ k lM (r , y
, .V
V
2(* + I)
lt + A + <:''

Divide both sides by C + X ma;r + A: > 0 and square them. Since l\ l 2 > 0,
((g fe - F fe H C - 1 ) + g U
(C + X ^ + k ) 2
>

k2( h - l 2)2
( ^ + ^)fc + W
(k + h Y i k + l t f
2(k + I)

k2(l1 - 2)2

2(k + e1)(k + l 2)(k + ^ ) \ k + l 1

P2

-).

k + l 2t /

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

(6.5.21)

110

Thus
{ O o p - E ^ o p )2

m *

( D i e s i (fii - e\ )Pi + E i6s | & ~ e \ ) P i f

E te s i( r } ) 2 + E eS|( r | )

( ^ ( 2 - 1)(r * - 1) + ^ & ( 3 - 2)(1 - ^ ) ) S


(k + h

r b)2

(* + < a r a

fc+ ^ l

fe+ < 2

(^ r(g + i - D - ^ ( g + ift-i))3
l(C + l)+ fla

(^ 2 (C + !!/2 )+ fc )2

fc+l

fc+f2

( ( & ~ jfifeXC ~ 1 ) + F f e y ~ g f c ^

>

< ? (C + !+ fc )2

k+ ii

k+t2

((^ -F & )(g -i) + f ty i

>

) 2

^ (C + 3 /2 + fc )2

(fc^T + fef^) (C + X max + k )2


.

k 2( i - 2) 2

2(A: + i)(Ar + 2 )(A: + | )

The second equality follows from Proposition 6.5.15. The last inequality follows from
(6.5.21).
Prom (6.5.20) and (6.5.22), we get
||* - * ||

||-EeII2 + p - ' B{0p^


k + i

k + 2

2(k + h ) ( k + 2)(k + I)

(k2 + 2k12){2k + ) + k2( h - 2f


2{k + \ ){k + 2){k + | )
k{k2 + k + x2)
2{k + \){k + 2)(k + | )
k
k + l'
2

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

Ill

Let \Sa\ = k u | ^ | - k2, |S| = u |SJ| = 2, and |5 C| = j u |S '| = j 2.

Case (c).
By (6.5.16),

1 1 0 -jy 2 =

' ( 0 i- 4 ) 2+ Y , & - 4 ) 2
iesi
*esf

k iji 2 2
(fcl+Jl)^l a
2ji y
k\ + j i
k\ + ji + l\ \
k\ + j i )
^2 j 2 ^ 2
( ^ 2 + J2 K 2 A
2 fc2
\ 2
k2 + j 2
k2 + j 2 + 2 '
k2 + j 2 )
1 feiJi
|
h
(ki ~ j i ) 2 | 1 k2j 2 |
k\ + i i
&i + j i -M i ki + ji
k2 + ,72
I

W ithout loss of generality,assume k \+ ji > k2+ j2 and k\ > j\.

k2 +

O 2 k2)2
+ ^ 2 ^2 + J 2

Noting k \ j\ j 2k2

since ki + k2 = ji + j 2 = k, we have
ii9 _ B t ii. >
-

i P ~ + u l \ } kr h ? +
ki + ji
k\ + ji + k\ + ji

i ^ - + u
k2 + j 2 ki + j\ + ki + ji

. * '* +
1 .....+ i - M L r .
k\ + j\
k\ + ji + k \+ j 1

k2 + j 2

(6.5.23)

Furthermore, ki > j i implies k2 < j 2, thus


a

k2j 2 > _____ k2( + 2 ji)2


k2 + j 2 ~ (k + j l + )(kl + j l + Y

.
>

(6.5.23) and (6.5.24) imply


2

\ \ 0 - E e \\2

> a
(ki~ji)2
k2{ + 2 jl )2
4 , kljl. + t,
T - t t tt
rr +
ki+ji
(k\ + j\ + ){k\ + j\)
(k + j i + )(ki + j i + )
4 k \ji + (ki + j\)
k2( + 2 j i ) 2
k\ + j\ +
(k + ji + ){ki + j i + )
_ (k + ji) + 4kji
k + ji +
>

Since k > f ( 1), for all ji, 1 < ji < k, we have


|| 0 - ^ | |

> j!fi- Therefore,

kt

> l l ^ - ^ d l 2 > A)+o

The proof of Theorem 6.5.13 actually shows a stronger result: OPT-CLUST"


with constant m, in particular, m = 2, is already NP-complete. Thus the optimal
aggregate-flow scheduling problem in Theorem 6.4.1, with OPT-AGG being its deci
sion form, is NP-hard even for m = 2.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

112

6.6

Conclusion and Discussion

In this chapter, we have extended Coffman and M itranis optimal per-flow schedul
ing framework [19] by considering aggregate-flow schedulers for general stochas
tic input under work-conserving, non-preemptive and non-anticipative schedulers.
Whereas optimal per-flow scheduling subject to conservation laws is poly-time
solvableas is optimal aggregate-flow scheduling without conservation lawswe show
th a t optimal aggregate-flow scheduling in multi-class G /G /l systems subject to Kleinrocks conservation law is NP-hard.
There remain a number of open problems. They include:
Approximation. A natural question to explore is the approximability property
of optimal aggregate-flow scheduling. The optimization version of PARTITION
more precisely, EVEN-PARTITION, which was used to show th at OPT-CLUST" is
NP-completeis known to have a polynomial time approximation scheme [1 ]. A
heuristic approach is to solve the unconstrained optimization problem which can be
done in cubic time using dynamic programming. We conjecture th at OPT-CLUST"
is much harder to approximate than PARTITION, perhaps even not constant factor
approximable.
Objective function. This chapter considered an MMSE objective function for both
its practical relevance and its common use in optimization and control. Other objec
tive functions of interest include the Li-norm ||m 0 ||, signed optimization problems
corresponding to w < 0 including one th at counts how many user requirements are
satisfied, proportional fairness, and linear objective functions. For a subset of these
objective functions, we can show equivalence of optimal solutions in the per-flow per
formance space. In the aggregate-flow case, the optimal solution for linear objective
functions depends on the right-hand-side of the inequalities of the strong conserva
tion law. This stands in contrast with per-flow optimization of linear functions via
linear programming which only depends on the n normalized coefficients of the objec

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

113

tive function [23, 76]. We conjecture th at optimal aggregate-flow scheduling remains


NP-hard for the above functions.
Performance space relaxation.

The open ball containment assumption (6.3.9)

requires th at all flows within a service class receive the same performance. It may
be interesting to explore if the equality constraint can be meaningfully relaxed, for
example, by requiring th at any two flows belonging to the same service class receive
performance th at is at most e apart. The hardness result would not be affected by an
e-relaxation as we have already proved NP-completeness for the more difficult special
case e = 0. However, for approximability relaxation may play a role.
Two-dimensional clustering. When transforming one-dimensional clustering with
linear constraint into an unconstrained form enabled by Proposition 6.5.3, the con
sequent two-dimensional clustering problem bears resemblance to a simpler twodimensional MMSE clustering problem: Given 0 Rn x Rn , find : {1 , . . . , n} >
{1, . . . , m} and d Rm x Rm that minimizes ||7t(, d) 0\\2 where 7 r is the correspond
ing two-dimensional extension. The complexity of unconstrained two-dimensional
MMSE clustering remains open. We believe th at the methods used in this chapter
may be applied to show th at the problem is NP-complete.
Aggregation and efficiency. In addition to optimality, another im portant aspect of
aggregate-flow scheduling involves characterizing the loss of efficiency stemming from
flow aggregation. In previous chapters where stochasticity of the input and conser
vation laws were not considered, a quantitative result on the loss of performance as
a function of the degree of flow aggregation was given (cf. Chapter 3, 4). A similar
exploration may be of interest for the generalized framework. The computational
hardness of optimal aggregate-flow scheduling may have implications on finding effec
tive characterizations of the performance loss, however, this needs not be necessarily
the case.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

114

7
7.1

CONCLUSION AND FUTURE WORK

Thesis Summary
This dissertation studies providing QoS to individual flows using aggregate-flow

scheduling. Our work is carried out on both theoretical and practical sides.
In theory part, we present a theoretical framework to analyze network architec
tures based on aggregate-flow scheduling and study optimal aggregate-flow schedul
ing problem. We show th at the optimal aggregate-flow scheduling problem in general
queueing systems with stochastic input is NP-hard under MMSE QoS provisioning
criterion. The comlexity comes from the combination of conservation law and aggre
gation. Under relative QoS differentiation objective function, we show the optimal
aggregate-flow scheduling problem is poly-time solvable by dynamic programming and
efficent linear algorithm exists in practical scenario where user QoS requirements are
coded in the ToS field of IP header. We study quantitative properties of the induced
optimal aggregate-flow scheduling solution th at are related to class differentiation and
end-to-end QoS provisioning. The optimal solution and its properties jointly build
theoretical fundation for PHB design in Diff-serv networks.
In practice part, we design an optimal aggregate-flow per-hop control th at achieves
the induced optimal aggregate-flow scheduling solution and implement the optimal
per-hop control in Cisco routers. We conduct a comprehensive performance evaluation
by both simulation and experiments over Q-bahn testbed comprised of Cisco routers
running the implemented optimal per-hop control. The benchmarking results confirm
our theoretical framework and analysis, and reveal further quantitative features of
both structural and dynamical properties of the system. Our results, collectively, show
th at user-specified services can be efficiently and effectively achieved over networks

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

115

with optimal aggregate-flow per-hop control substrate when coupled with either openloop or closed-loop (adaptive label control) edge control.

7.2

Future Work
Our study focuses on the core of networks with aggregate-flow scheduling as build

ing blocks. There remain a number of open problems related to other components
and system-wise properties of our QOS provisioning architecture. They include:
Many switch system. The properties A l, A2, and B imposed on per-hop control
play an essential role in our QoS provisioning framework. As discussed in Chapter 2 ,
these three properties are extensible to WAN environment since if a property holds for
any single per-hop control, it also holds for a sequence of hops as a composite function
of individual hops. Thus our results based on the properties apply to both single
switch and many-switch systems. On the other hand, the quantitative features related
to the three properties have only been studied in single switch case. The extension of
quantitative analysis to many switch case will further charecterise system behaviour
and help design more effective and efficient end-to-end control schemes.
End-to-end control. The close loop end-to-end QoS control dynamically adjusts
label values carried in the ToS fields of IP headers to meet users QoS requirements.
Our current scheme increases/decreases label values in accordance with congestion
level without changing d ata rate. Another direction of end-to-end control is to adjust
data rate during congestion. This can be achieved either by using TC P congestion
control or by adding this functionality into close loop end-to-end QoS control. Setting
label value and data rate are both necessary in real networks. Thus, in addition to
studying individual control schemes, we need to analyze interaction between these
two control schemes and design coordinating mechanism between them.
Game theory analysis. We study global resource allocation properties of the sys
tem in the non-cooperative game environment where each user is changing its lable
value independently to maximized its utility. Our stability and optim ality results are

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

116

based on the nonemptyness of A*, the set of configurations where all users QoS re
quirements are satisfied (cf. Section 3.2). The dynamical properties inside A* is also
characterized. The remaining issues on game theory analysis include system stability
when A* is empty, system dynamics outside A*, and the extended game theoretic
structure with selfish service provider as an additional player.
Edge control implementation. In system building part, the most im port task is to
implement a QoS agent th at performs edge control functionality (end-to-end control
and access control). The QoS agent may reside at the edge of the network or in
the end host on behalf of network. The desire feature of QoS agent is application
transparent, i.e. legacy applications can run without knowing QoS agent is setting
the ToS fields of its outgoing packets. The development of QoS agent is the ongoing
work carried out in the network system lab at Purdue University.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

117

LIST OF REFERENCES
[1] G. Ausiello, P. Crescenzi, G. Gambosi, V. Kann, A. Marchetti-Spaccamela, and
M. Protasi. Complexity and Approximation Combinatorial Optimization Prob
lems and Their Approximability Properties. Springer, 1999.
[2] D. Bertsimas. The achievable region method in the optimal control of queueing
systems; formulations, bounds and policies. Queueing Systems, 21(3-4):337-389,
1995.
[3] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, and W. Weiss. An archi
tecture for differentiated service. RFC 2475, 1998.
[4] R. Boorstyn, A. Burchard, J. Liebeherr, and C. Oottamakorn. Effective en
velopes: Statistical bounds on multiplexed traffic in packet networks. In Proc.
IEE E INFOCOM 00, 2000.
[5] A. Brandt, P. Franken, and B. Lisek. Stationary Stochastic Models. John Wiley
& Sons, 1990.
[6 ] P. Brucker. On the complexity of clustering problems. In R. Henn, B. Korte,
and W. Oletti, editors, Optimierung und Operations Research, Lecture Notes in
Economics and M athematical Systems. Springer, Berlin, 1978.
[7] H. Chen and D. Yao. Fundamentals of Queueing Networks. Springer, 2001.
[8 ] S. Chen and K. Park. A distributed protocol for multi-class QoS provision in
noncooperative many-switch systems. In Proc. IEEE International Conference
on Network Protocols, pages 98-107, 1998.
[9] S. Chen and K. Park. An architecture for noncooperative QoS provision in
many-switch systems. In Proc. IEEE INFOCOM 99, pages 864-872, 1999.
[10] S. Chen, K. Park, and M. Sitharam. On the ordering properties of GPS routers
for multi-class QoS provision. In Proc. SPIE International Conference on Per
formance and Control of Network Systems, pages 252-265, 1998.
[11] Shaogang Chen. Stratified Best-effort QoS Provisioning in Noncooperative Net
works. PhD thesis, Purdue University, 2000.
[12] D. Clark and W. Fang. Explicit allocation of best-effort packet delivery service.
IE E E /A C M Trans. Networking, 6(4):362-373, 1998.
[13] R. Cocchi, S. Shenker, D. Estrin, and L. Zhang. Pricing in computer networks:
motivation, formulation, and example. IE E E /A C M Trans. Networking, 1(6):614627, 1993.
[14] R. L. Cruz. A calculus for network delay, part I: network elements in isolation.
IEE E Trans. Inform. Theory, 37(1):114131, 1991.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

118

[15] R. L. Cruz. Quality of service guarantees in virtual circuit switched networks.


IEE E J. Select. Areas Commun., 13(6): 1048-1056, 1995.
[16] G. de Veciana, G. Kesidis, and J. Walrand. Resource management in wide-area
ATM networks using effective bandwidths. IEEE J. Select. Areas Commun.,
13(6):1081-1090, 1995.
[17] A. Demers, S. Keshav, and S. Shenker. Analysis and simulation of a fair queueing
algorithm. Journal of Internetworking: Res. Exper., 1:3-26, 1990.
[18] C. Dovrolis, D. Stiliadis, and P. Ramanathan. Proportional differentiated ser
vices: Delay differentiation and packet scheduling. In Proc. AC M SIGCOMM
99, 1999.
[19] E. Coffman, Jr. and I. Mitrani. A characterization of waiting time performance
realizable by single-server queues. Operations Research, 28(3):810-821, 1980.
[20] A. Elwalid and D. Mitra. Effective bandwidth of general Markovian traffic sources
and admission control in high speed networks. IE E E /A C M Trans. Networking,
1(3), 1993.
[21] A. Elwalid and D. Mitra. Analysis, approximations and admission control of a
multi-service multiplexing system with priorities. In Proc. IE E E INFOCOM 95,
pages 463-472, 1995.
[22] A. Federgruen and H. Groenevelt. Characterization and optimization of achiev
able performance in general queueing systems. Operations Research, 36(5):733741, 1988.
[23] A. Federgruen and H. Groenevelt. M /G /c queueing systems with multiple cus
tomer classes: Characterization and control of achievable performance under
nonpreemptive priority rules. Management Science, 34(9): 1121-1138, 1988.
[24] W. Feng, D. Kandlur, D. Saha, and K. Shin. Adaptive packet marking for
providing differentiated services in the Internet. In Proc. IE E E International
Conference on Network Protocols, pages 108-117, 1998.
[25] D. Ferguson, C. Nikolaou, and Y. Yemini. An economy for flow control in com
puter networks. In Proc. IEEE INFOCOM 89, pages 110-118, 1989.
[26] D. Ferguson, Y. Yemini, and C. Nikolaou. Microeconomic algorithms for load
balancing in distributed computer systems. In Proc. 8th International Conference
on Distributed Computing Systems, pages 491-499, 1988.
[27] S. Gal and B. Klots. Optimal partitioning which maximizes the sum of the
weighted averages. Operations Research, 43(3):500-508, 1995.
[28] M. Garey and D. Johnson. Computers and Intractability. W. H. Freeman and
Company, 1979.
[29] M. Garey and D. Johnson. Computers and Intractability, page 223. W. H.
Freeman and Company, 1979.
[30] E. Gelenbe and I. Mitrani. Analysis and synthesis of computer systems. Academic
Press, New York, 1980.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

119

[31] L. Georgiadis, R. Guerin, V. Peris, and K. Sivarajan. Efficient network QoS


provisioning based on per node traffic shaping. IE E E /A C M Transactions on
Networking, 4(4):482-501, 1996.
[32] L. Georgiadis and I. Viniotis. On the conservation law and the performance
space of single server systems. Operations Research, 42(2):372-379, 1994.
[33] R. Gibbens. Traffic characterization and effective bandwidths for broadband net
work traces. In F. Kelly, S. Zachary, and I. Ziedins, editors, Stochastic Networks:
Theory and Applications, pages 169-179. Clarendon Press, Oxford, 1996.
[34] R. J. Gibbens, S. K. Sargood, F .P. Kelly, H. Azmoodeh, R. Macfadyen, and
N. Macfadyen. An approach to service level agreement for ip networks with
differentiated services. In Phil. Trans. Royal. Soc. Lond. A, pages 2165-2182,
2000 .

[35] T. Gonzalez. Clustering to minimize the maximum intercluster distance. Theo


retical Computer Science, 38:293-306, 1985.
[36] A. Heddaya and K. Park. Parallel computing on high-speed wide-area networks:
a pricing policy for its communication needs. In Proc. 3rd IEE E Workshop
on the Architecture and Implementation of High Performance Communication
Subsystems, pages 188-191, 1995.
[37] J. Heinanen, F. Baker, W. Weiss, and J. Wroclawski. Assured Forwarding PHB
Group. RFC 2597, 1999.
[38] V. Jacobson, K. Nichols, and K. Poduri. An Expedited Forwarding PHB. RFC
2598, 1999.
[39] L. Kleinrock. A conservation law for a wide class of queueing disciplines. Naval
Research Logistics Quarterly, 12:181-192, 1965.
[40] L. Kleinrock. Queueing Systems, Volume 2: Computer Applications, chapter 3,
pages 113-118. Wiley-Interscience, New York, 1976.
[41] Y. Korilis and A. Lazar. Why is flow control hard: Optimality, fairness, partial
and delayed information. In Proc. 2nd ORSA Telecommunications Conference,
March 1992.
[42] Y. Korilis, A. Lazar, and A. Orda. Architecting noncooperative networks. IEEE
J. Select. Areas Commun., 13(7):12411251, 1995.
[43] Y. Korilis, A. Lazar, and A. Orda. Achieving network optim a using Stackelberg
routing strategies. IE E E /A C M Trans. Networking, 5(1):161173, 1997.
[44] J. Kurose and R. Simha. A microeconomic approach to optimal resource allo
cation in distributed computer systems. IEE E Trans, on Computers, 38(5):705717, 1989.
[45] A. Lazar and G. Pacifici. Control of resources in broadband networks with quality
of service guarantees. IEEE Network Magazine, pages 66-73, October 1991.
[46] W.E. Leland, M.S. Taqqu, W. Willinger, and D.V. Wilson. On the self-similar
nature of Ethernet traffic (extended version). IE E E /A C M Transactions on Net
working, 2:1-15, 1994.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

120

[47] W. Lin, R. Zheng, and J. Hou. How to make assured services more assured?
In Proc. IEE E International Conference on Network Protocols, pages 182-191,
1999.
[48] J. Little. A proof for the queueing formula L XW .
9:383-387, 1961.

Operations Research,

[49] S. Low and P. Varaiya. An algorithm for optimal service provisioning using
resource pricing. In Proc. IEEE INFOCOM 94, pages 368-373, 1994.
[50] J. MacKie-Mason and H. Varian. Economic FAQs about the Internet. In L. McKnight and J. Bailey, editors, Internet Economics, pages 27-63. MIT Press, 1996.
[51] Peter Marbach. Pricing differentiated services networks: Bursty traffic. In Proc.
IEEE INFOCOM 2001, pages 650-658, 2001.
[52] M. May, J. Bolot, A. Jean-Marie, and C. Diot. Simple performance models of
differentiated services schemes for the Internet. In Proc. IE E E INFOCOM 99,
pages 1385-1394, 1999.
[53] John-Francis Mergen. Personal communication.
[54] R. Nagarajan and J. Kurose. On defining, computing and guaranteeing qualityof-service in high-speed networks. In Proc. IEE E INFOCOM 90, pages 20162025, 1992.
[55] K. Nichols, V. Jacobson, and L. Zhang. A two-bit differentiated services archi
tecture for the Internet. Internet Draft, 1997.
[56] Andrew Odlyzko. Paris Metro Pricing: The minimalist differentiated services
solution. In Proc. IE E E /IF IP International Workshop on Quality of Service,
1999.
[57] A. Orda, R. Rom, and N. Shimkin. Competitive routing in multiuser communi
cation networks. IE E E /A C M Trans. Networking, 1(5):510521, 1993.
[58] C. Papadim itriou and J. Tsitsiklis. The complexity of optimal queueing network
control. Mathematics of Operations Research, 24(2):293-305, 1999.
[59] A. Parekh and R. Gallager. A generalized processor sharing approach to flow
control in integrated services networks: the single-node case. IE E E /A C M Trans.
Networking, l(3):344-357, 1993.
[60] A. Parekh and R. Gallager. A generalized processor sharing approach to flow
control in integrated services networks: the multiple node case. IE E E /A C M
Trans. Networking, 2(2) :137150, 1994.
[61] K. Park, G. Kim, and M. Crovella. On the relationship between file sizes, trans
port protocols, and self-similar network traffic. In Proc. IEE E International
Conference on Network Protocols, pages 171-180, 1996.
[62] K. Park, G. Kim, and M. Crovella. On the effect of traffic self-similarity on
network performance. In Proc. SPIE International Conference on Performance
and Control of Network Systems, pages 296-310, 1997.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

121

[63] K. Park, M. Sitharam, and S. Chen. Quality of service provision in noncooper


ative networks: heterogeneous preferences, multi-dimensional QoS vectors, and
burstiness. In Proc. 1st International Conference on Information and Computa
tion Economies, pages 111-127, 1998.
[64 K. Park and W. Willinger, editors. Self-Similar Network Traffic and Performance
Evaluation. Wiley-Interscience, 2000.
[65 Kihong Park. Warp control: a dynamically stable congestion protocol and its
analysis. In Proc. AC M SIGCOMM 93, pages 137-147, 1993.
[66 Kihong Park. Self-organized multi-class QoS provision for ABR traffic in ATM
networks. In Proc. 15th IEEE International Phoenix Conference on Computers
and Communications, pages 446-453, 1996.
[67 V. Paxson and S. Floyd. Wide-area traffic: the failure of Poisson modeling. In
Proc. AC M SIGCOMM 94, pages 257-268, 1994.
[68 H. Ren and K. Park. Efficient shaping of user-specified QoS using aggregateflow control. In Proc. IE E E /IF IP International Workshop on Quality of Future
Internet Services, pages 259-271, 2000.
[69 H. Ren and K. Park. Toward a theory of differentiated services. In Proc.
IE E E /IF IP International Workshop on Quality o f Service, pages 211-220, 2000.
[70 H. Ren and K. Park. On the QoS provisioning power of optim al aggregate-flow
scheduling. In Proc. SPIE International Conference on Scalability and Traffic
Control in IP Networks, 2001.
[71 H. Ren and K. Park. On the complexity of optimal aggregate-flow scheduling.
Technical report, CSD-TR 02-021, Dept, of Computer Sciences, Purdue Univer
sity, 2 0 0 2 .
[72 H. Ren and K. Park. Performance evaluation of optimal aggregate-flow schedul
ing: A simulation study. Computer Communications, 2 0 0 2 .
[73 J. W. Roberts. Engineering for quality of service. In K. Park and W. Will
inger, editors, Self-Similar Network Traffic and Performance Evaluation. WileyInterscience, 2000.
[74 J. Sairamesh, D. Ferguson, and Y. Yemini. An approach to pricing, optimal
allocation and quality of service provisioning in high-speed networks. In Proc.
IEE E INFOCOM 95, pages 1111-1119, 1995.
[75 L. Schrage. An alternative proof of a conservation law for the queue G / G / l .
Operations Research, 18(1):185187, 1970.
[76 J. Shanthikumar and D. Yao. Multiclass queueing systems: Polymatroidal struc
ture and optimal scheduling control. Operations Research, 40(2):293-299, 1992.
[77 Scott Shenker. Making greed work in networks: a game-theoretic analysis of
switch service disciplines. In Proc. AC M SIGCOMM 94, pages 47-57, 1994.
[78 Cisco Systems. Cisco ios software. http://w w w .cisco.com /en/U S/products/sw /
ioss wrel / index, ht ml.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

122

[79] Cisco Systems. How to choose the best router switching path for your network.
http://w w w .cisco.com /w arp/public/105/20.htm l.
[80] Cisco Systems. Qc: Cisco ios quality of service solutions configuration guide,
release 12.2. http://w w w .cisco.com /en/U S/products/sw /iossw rel/index.htm l.
[81] C. Waldspurger, T. Hogg, B. Huberman, J. Kephart, and W. Stornetta. Spawn:
a distributed computational economy. IEE E Trans. Software Engineering,
18(2):103-117, 1992.
[82] S. Webster and K. Baker. Scheduling groups of jobs on a single machine. Oper
ations Research, 43(4):692-703, 1995.
[83] Michael P. Wellman. A market-oriented programming environment and its ap
plication to distributed multicommodity flow problems. Journal of Artificial
Intelligence Research, 1:1-23, 1993.
[84] W. W hitt. A review of L AW and extensions. Queueing Systems, 9(3):235-268,
1991.
[85] W. Willinger, M. Taqqu, R. Sherman, and D. Wilson. Self-similarity through
high-variability: statistical analysis of Ethernet LAN traffic at the source level.
In Proc. AC M SIGCOMM 95, pages 100-113, 1995.
[8 6 ] Z. Zhang, D. Towsley, and J. Kurose. Statistical analysis of the general
ized processor sharing scheduling discipline. IEEE J. Select. Areas Commun.,
13(6):1071-1080, 1995.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

123

VITA
Huan Ren received his B.S. and M.S. degrees ip electrical engineering from Xian
Jiaotong Univerisity, Peoples Republic of China. Since 1998 he has been working
toward a Ph.D. degree in the Department of Computer Sciences at Purdue Univer
sity. His research interest include scalable quality of service provisioning, network
modeling, network protocols and algorithms design, and scheduling.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

You might also like