Faster IP lookups using controlled prefix expansion
Internet (IP) address lookup is a major bottleneck in high performance routers. IP address lookup is challenging because it requires a longest matching prefix lookup. It is compounded by increasing routing table sizes, increased traffic, higher speed ...
On calibrating measurements of packet transit times
We discuss the problem of detecting errors in measurements of the total delay experienced by packets transmitted through a wide-area network. We assume that we have measurements of the transmission times of a group of packets sent from an originating ...
Modeling communication pipeline latency
In this paper, we study how to minimize the latency of a message through a network that consists of a number of store-and-forward stages. This research is especially relevant for today's low overhead communication systems that employ dedicated ...
Implementing cooperative prefetching and caching in a globally-managed memory system
- Geoffrey M. Voelker
- Eric J. Anderson
- Tracy Kimbrel
- Michael J. Feeley
- Jeffrey S. Chase
- Anna R. Karlin
- Henry M. Levy
This paper presents cooperative prefetching and caching --- the use of network-wide global resources (memories, CPUs, and disks) to support prefetching and caching in the presence of hints of future demands. Cooperative prefetching and caching ...
Cello: a disk scheduling framework for next generation operating systems
In this paper, we present the Cello disk scheduling framework for meeting the diverse service requirements of applications. Cello employs a two-level disk scheduling architecture, consisting of a class-independent scheduler and a set of class-specific ...
The impact of I/O on program behavior and parallel scheduling
In this paper we systematically examine various performance issues involved in the coordinated allocation of processor and disk resources in large-scale parallel computer systems. Models are formulated to investigate the I/O and computation behavior of ...
Is service priority useful in networks?
A key question in the definition of new services for the Internet is whether to provide a single class of relaxed real-time service or multiple levels differentiated by their delay characteristics. In that context we pose the question: is service ...
Improving TCP throughput over two-way asymmetric links: analysis and solutions
The sharing of a common buffer by TCP data segments and acknowledgments in a network or internet has been known to produce the effect of ack compression, often causing dramatic reductions in throughput. We study several schemes for improving the ...
Asymptotic behavior of global recovery in SRM
The development and deployment of a large-scale, wide-area multicast infrastructure in the Internet has enabled a new family of multi-party, collaborative applications. Several of these applications, such as multimedia slide shows, shared whiteboards, ...
The busy period in the fluid queue
Consider a fluid queue fed by N on/off sources. It is assumed that the silence periods of the sources are exponentially distributed, whereas the activity periods are generally distributed. The inflow rate of each source, when active, is at least as ...
Transient loss performance of a class of finite buffer queueing systems
Performance-oriented studies typically rely on the assumption that the stochastic process modeling the phenomenon of interest is already in steady state. This assumption is, however, not valid if the life cycle of the phenomenon under study is not large ...
Queueing-based analysis of broadcast optical networks
We consider broadcast WDM networks operating with schedules that mask the transceiver tuning latency. We develop and analyze a queueing model of the network in order to obtain the queue-length distribution and the packet loss probability at the ...
Predicting MPEG execution times
This paper reports on a set of experiments that measure the amount of CPU processing needed to decode MPEG-compressed video in software. These experiments were designed to discover indicators that could be used to predict how many cycles are required to ...
Self-similarity in file systems
We demonstrate that high-level file system events exhibit self-similar behaviour, but only for short-term time scales of approximately under a day. We do so through the analysis of four sets of traces that span time scales of milliseconds through months, ...
Generating representative Web workloads for network and server performance evaluation
One role for workload generation is as a means for understanding how servers and networks respond to variation in load. This enables management and capacity planning based on current and projected usage. This paper applies a number of observations of ...
Performance measurements for multithreaded programs
Multithreaded programming is an effective way to exploit concurrency, but it is difficult to debug and tune a highly threaded program. This paper describes a performance tool called Tmon for monitoring, analyzing and tuning the performance of ...
A methodology and an evaluation of the SGI Origin2000
As hardware-coherent, distributed shared memory (DSM) multiprocessing becomes popular commercially, it is important to evaluate modern realizations to understand how they perform and scale for a range of interesting applications and to identify the ...
An analytic behavior model for disk drives with readahead caches and request reordering
Modern disk drives read-ahead data and reorder incoming requests in a workload-dependent fashion. This improves their performance, but makes simple analytical models of them inadequate for performance prediction, capacity planning, workload balancing, ...
Modeling set associative caches behavior for irregular computations
While much work has been devoted to the study of cache behavior during the execution of codes with regular access patterns, little attention has been paid to irregular codes. An important portion of these codes are scientific applications that handle ...
Inter-receiver fairness: a novel performance measure for multicast ABR sessions
In a multicast ABR service, a connection is typically restricted to the rate allowed on the bottleneck link in the distribution tree from the source to the set of receivers. Because of this, receivers in the connection can experience inter-receiver ...
Application and evaluation of large deviation techniques for traffic engineering in broadband networks
Accurate yet simple methods for traffic engineering are important for efficient dimensioning of broadband networks. The goal of this paper is to apply and evaluate large deviation techniques for traffic engineering. In particular, we employ the recently ...
The concept of relevant time scales and its application to queuing analysis of self-similar traffic (or is Hurst naughty or nice?)
Recent traffic analyses from various packet networks have shown the existence of long-range dependence in bursty traffic. In evaluating its impact on queuing performance, earlier investigations have noted how the presence of long-range dependence, or a ...
Scheduling with implicit information in distributed systems
Implicit coscheduling is a distributed algorithm for time-sharing communicating processes in a cluster of workstations. By observing and reacting to implicit information, local schedulers in the system make independent decisions that dynamically ...
Scheduling policies to support distributed 3D multimedia applications
We consider the problem of scheduling the rendering component of 3D multimedia applications on a cluster of workstations connected via a local area network. Our goal is to meet a periodic real-time constraint. In abstract terms, the problem we address is ...
LoGPC: modeling network contention in message-passing programs
In many real applications, for example those with frequent and irregular communication patterns or those using large messages, network contention and contention for message processing resources can be a significant part of the total execution time. This ...
Modeling and optimizing I/O throughput of multiple disks on a bus (summary)
- Rakesh Barve
- Elizabeth Shriver
- Phillip B. Gibbons
- Bruce K. Hillyer
- Yossi Matias
- Jeffrey Scott Vitter
For a wide variety of computational tasks, disk I/O continues to be a serious obstacle to high performance. The focus of the present paper is on systems that use multiple disks per SCSI bus. We measured the performance of concurrent random I/Os, and ...
Task assignment in a distributed system (extended abstract): improving performance by unbalancing load
We consider the problem of task assignment in a distributed system (such as a distributed Web server) in which task sizes are drawn from a heavy-tailed distribution. Many task assignment algorithms are based on the heuristic that balancing the load at ...
A self-scaling and self-configuring benchmark for Web servers (extended abstract)
World Wide Web clients and servers have become some of the most important applications in our computing base, and we need realistic and meaningful ways of measuring their performance. Current server benchmarks do not capture the wide variation that we ...