Li-Cloud-Performance-Yad1 2
Introduction
In this fast-paced world of technological advancements, rising customer demands, and rapidly
increasing competition, using a public cloud is a popular option for deploying business-critical
workloads. In a recent Qualtrics survey, improved customer experience and reduced total cost of
ownership were among the top outcomes that businesses expect when moving workloads
to the cloud.1 However, the convenience of using public cloud infrastructure introduces some unique
challenges. Customers expect excellent performance, regardless of their chosen deployment model.
Linux® is a critical component of the cloud infrastructure as it is often chosen as the foundation for
modern cloud services and emerging use cases.
With capabilities that facilitate uninterrupted workload migration and more efficient management,
Red Hat® Enterprise Linux delivers the consistency you need to streamline how you manage
hardware and workload performance across your entire hybrid cloud infrastructure. You can detect
performance lag or anomalies to determine the reason behind application performance issues.
Intelligent tooling helps you build a comprehensive view of overall system performance and provides
user-friendly tuning of the kernel for optimum function. Use best practices for performance tuning
with common tuning profiles that optimize hardware and workload performance.
Testing parameters
In 2021, Red Hat performance experts conducted extensive internal testing on public cloud
environments. This paper details the results of that testing, highlighting important factors that
affect Red Hat Enterprise Linux performance in the cloud.
Our testing measured Red Hat Enterprise Linux performance on two popular public cloud platforms
using central processing unit (CPU) and memory-bound workloads. We selected three well-known
benchmarks—LINPACK, STREAM, and SPECjbb2005—and used them to create a CPU aggregate
score that we use for price-performance comparisons. Details on performance characteristics, test
suites, benchmarks, and cloud instance types are outlined in the sections that follow. Keep in mind
that performance on shared cloud infrastructure often varies because of other workloads running on
the same systems. Your price-performance ratio could fluctuate based on other users’ activity.
facebook.com/redhatinc
@RedHat
linkedin.com/company/red-hat
1 Dan Juengst, “Insights into hybrid cloud: Here’s what to consider.” Red Hat blog, 28 May 2020.
1. Peak load is the maximum number of concurrent operations running on a server within a certain
time period. Peak load measurements are important because they help enterprises properly size
their systems before the busy period hits.
2. Memory bandwidth is the rate at which the CPU can move data to and from a given memory
destination. This metric is important because it demonstrates how quickly the operating
system (OS) can get data into and out of memory for processing. If memory bandwidth is low,
the processor wastes cycles waiting for memory to respond. If memory bandwidth
is high, processor cycles are not wasted.
3. Compute throughput is the number of concurrent compute operations performed per second.
Higher compute throughput means more responsive applications and a better user experience.
4. Price-performance ratio balances the price of the solution against its effectiveness. The
lower the price-performance ratio, the better, because you get more performance value at
a lower cost. Workloads that are able to scale out, such as containerized applications, will benefit
more than workloads that can only scale up, such as monolithic single-node applications.
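To make the price-performance ratio concrete, here is a minimal Python sketch. It assumes the ratio is the hourly instance price divided by a benchmark score, which is one common convention and not necessarily the exact formula used in this paper. The instance names, prices, and scores are made-up values for illustration only.

```python
def price_performance(price_per_hour, score):
    """Return cost per unit of benchmark score (lower is better)."""
    if score <= 0:
        raise ValueError("score must be positive")
    return price_per_hour / score


# Hypothetical instances: (hourly price in USD, benchmark score).
# None of these numbers come from the paper's test results.
instances = {
    "small-8cpu": (0.40, 100.0),
    "large-64cpu": (3.20, 640.0),
    "large-64cpu-poor-scaling": (3.20, 400.0),
}

ratios = {name: price_performance(price, score)
          for name, (price, score) in instances.items()}

# The instance with the lowest ratio delivers the most performance per dollar.
best = min(ratios, key=ratios.get)

for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:.4f} USD per score point")
```

In this toy data set, the large instance costs eight times as much as the small one but scores less than eight times higher, so its ratio is worse. That is the pattern to watch for when a workload does not scale linearly with CPU count.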
There are hundreds, if not thousands, of programs that stress the CPU and memory components of a
system in different ways. When selecting which benchmarks to try, there are several considerations.
The benchmarks should:
Benchmarks
LINPACK
The LINPACK benchmark solves a dense system of linear equations, with a focus on the floating-point
compute capabilities of the CPU. As all of the instance types in this document are based on Intel
CPUs, we will use the version of LINPACK that Intel ships as part of its Intel Math Kernel Library.
Other CPU types in future work will use a version of LINPACK optimized for their architecture.
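As a rough illustration of the kind of work LINPACK measures, the sketch below solves a random dense system with Gaussian elimination and partial pivoting, then converts the elapsed time into a floating-point rate using the conventional LINPACK operation count of 2/3·n³ + 2·n². This is not the Intel Math Kernel Library version used in our testing. It is pure Python, so the reported rate will be orders of magnitude below what an optimized build achieves, and the function names are our own.

```python
import random
import time


def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on copies so the caller's data is left untouched.
    a = [row[:] for row in a]
    b = b[:]
    for k in range(n):
        # Partial pivoting: bring the largest entry in column k to the diagonal.
        pivot = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[pivot][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[pivot] = a[pivot], a[k]
        b[k], b[pivot] = b[pivot], b[k]
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
            b[i] -= factor * b[k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def linpack_like_flops(n, seed=0):
    """Return (estimated FLOPS, max residual) for one random n x n solve."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    start = time.perf_counter()
    x = solve_dense(a, b)
    elapsed = max(time.perf_counter() - start, 1e-9)
    flop_count = (2.0 / 3.0) * n**3 + 2.0 * n**2
    # Check the answer: the largest entry of |a x - b| should be near zero.
    residual = max(abs(sum(a[i][j] * x[j] for j in range(n)) - b[i])
                   for i in range(n))
    return flop_count / elapsed, residual


flops, residual = linpack_like_flops(100)
print(f"{flops / 1e6:.1f} MFLOPS, residual {residual:.2e}")
```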
STREAM
STREAM is a simple synthetic benchmark program that measures sustainable memory bandwidth (in
MB/s). This benchmark is run by increasing the load starting at one thread until there are two threads
per virtual central processing unit (vCPU) in the system. We test four separate sets of operations
and we include all four in our aggregate score because each highlights a slightly different aspect
of performance.
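The four STREAM operations are Copy, Scale, Add, and Triad. The sketch below runs each one once on Python lists and reports a rate using STREAM's byte-counting convention (two arrays touched for Copy and Scale, three for Add and Triad, 8 bytes per element). It is only an illustration of the kernels: Python lists are not contiguous arrays of doubles, so the numbers it prints are not real memory bandwidth. The `thread_sweep` helper assumes the load increases one thread at a time, which the text above does not specify.

```python
import time

BYTES_PER_ELEMENT = 8  # STREAM counts 8-byte double-precision values


def stream_kernels(n=200_000, scalar=3.0):
    """Run the four STREAM-style kernels once; return (MB/s per kernel, arrays)."""
    a = [1.0] * n
    b = [2.0] * n
    c = [0.0] * n
    results = {}

    def record(name, arrays_touched, start):
        elapsed = max(time.perf_counter() - start, 1e-9)
        results[name] = arrays_touched * BYTES_PER_ELEMENT * n / elapsed / 1e6

    start = time.perf_counter()
    c = a[:]                                      # Copy:  c = a
    record("Copy", 2, start)

    start = time.perf_counter()
    b = [scalar * x for x in c]                   # Scale: b = q * c
    record("Scale", 2, start)

    start = time.perf_counter()
    c = [x + y for x, y in zip(a, b)]             # Add:   c = a + b
    record("Add", 3, start)

    start = time.perf_counter()
    a = [x + scalar * y for x, y in zip(b, c)]    # Triad: a = b + q * c
    record("Triad", 3, start)

    return results, (a, b, c)


def thread_sweep(vcpus):
    """Thread counts from one thread up to two threads per vCPU (step of 1 assumed)."""
    return list(range(1, 2 * vcpus + 1))


rates, _ = stream_kernels()
for kernel, mb_per_s in rates.items():
    print(f"{kernel}: {mb_per_s:.0f} MB/s (illustrative only)")
print("Thread counts for an 8 vCPU instance:", thread_sweep(8))
```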
CPU aggregate
The CPU aggregate score is calculated as the geometric mean of the above benchmarks. This score
allows us to calculate a single metric per system from the results of multiple benchmarks that are
measured using different metrics without any one benchmark overwhelming the rest of the results.
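A short Python sketch shows why a geometric mean suits this purpose. The benchmark names and scores below are invented for illustration and are not results from our testing. The point is that the ratio between two systems' aggregate scores does not change when one benchmark is reported in different units, so a benchmark with large raw numbers cannot dominate the comparison.

```python
from statistics import geometric_mean


def cpu_aggregate(scores):
    """Geometric mean of a mapping of benchmark name -> score."""
    return geometric_mean(list(scores.values()))


# Hypothetical scores in very different units and magnitudes.
system_a = {"linpack_gflops": 400.0, "stream_triad_mb_s": 80_000.0, "specjbb_bops": 250_000.0}
system_b = {"linpack_gflops": 500.0, "stream_triad_mb_s": 60_000.0, "specjbb_bops": 260_000.0}

ratio = cpu_aggregate(system_a) / cpu_aggregate(system_b)


def in_gb_per_s(scores):
    """Re-express the STREAM result in GB/s instead of MB/s."""
    converted = dict(scores)
    converted["stream_triad_mb_s"] = scores["stream_triad_mb_s"] / 1000.0
    return converted


# Changing units rescales both aggregates by the same factor, so the ratio holds.
ratio_gb = cpu_aggregate(in_gb_per_s(system_a)) / cpu_aggregate(in_gb_per_s(system_b))

print(f"A/B aggregate ratio: {ratio:.4f} (MB/s) vs {ratio_gb:.4f} (GB/s)")
```

An arithmetic mean would not have this property: with the values above, the SPECjbb2005-style score would swamp the other two.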
The right level of performance testing ensures that your workloads meet expectations and deliver
a superior user experience. The testing highlights potential problems before your system is put
into production. Our study is limited to just a few benchmarks. We recommend that you adequately
test your workload prior to deploying it into production.
In our tests, we used these instance types that exist on the public cloud:
Storage-optimized instances are for workloads that require low latency and a high rate of random
input/output operations per second (IOPS).
Across each of these instance types, we have selected a representative instance in the small, medium,
and large size categories. We chose instances with 8, 32, and 64 CPUs. Because some instance types
also scale beyond that, we also chose the largest CPU count supported if it is higher than 64. In the
case where some instance types do not have sizes available at 32 and 64 CPUs, we have picked the
next-largest available size.
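The selection rule above can be sketched in a few lines of Python. This is our reading of the rule, not a tool used in the testing: we interpret "next-largest available size" as the smallest offered size that is at least the target, and the list of available sizes is hypothetical.

```python
def pick_sizes(available, targets=(8, 32, 64)):
    """Pick representative CPU counts from the sizes an instance type offers."""
    sizes = sorted(set(available))
    if not sizes:
        return []
    chosen = []
    for target in targets:
        # Assumption: "next-largest" means the smallest size >= the target.
        candidates = [size for size in sizes if size >= target]
        if candidates and candidates[0] not in chosen:
            chosen.append(candidates[0])
    # Also include the largest size when the type scales beyond the top target.
    largest = sizes[-1]
    if largest > max(targets) and largest not in chosen:
        chosen.append(largest)
    return chosen


# Hypothetical instance type with no 32 or 64 CPU sizes.
print(pick_sizes([2, 4, 8, 16, 36, 72, 96]))
```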
CVX_{C|G|S}YYCPU where:
Performance results
The results below are based on internal benchmark tests run using public cloud infrastructures.
General-purpose class instances are described by cloud vendor 1 and cloud vendor 2 as “balanced”
configurations with 4GB of memory per vCPU. They are intended as a good solution for most workloads,
although they are not optimized for any specific case. An immediate difference between the cloud vendor 1
and cloud vendor 2 general-purpose instances is that the cloud vendor 1 instances are running on
the older Intel Broadwell CPU while the general-purpose instances in cloud vendor 2 are running
Intel Skylake or Intel Cascade Lake. This gives the cloud vendor 2 instances much improved
per-thread performance due to improved microarchitecture, faster memory support, Intel UltraPath
Interconnect (UPI), Intel Advanced Vector Extensions 512 (AVX-512), and other features.
LINPACK
Figure 1 explanation
STREAM
Figure 3 explanation
Figure 4 explanation
Figure 5 explanation
Figure 6 explanation
Figure 7 explanation
Figure 8 explanation
The compute-optimized classes of instances in cloud vendor 1 and cloud vendor 2 are much closer
in configuration than those in the general-purpose category. Both support 2GB of
memory per vCPU. The cloud vendor 1 class comes with either Skylake or Cascade Lake processors,
and the cloud vendor 2 class comes with Cascade Lake processors. With the more limited
memory per vCPU, customers should select this instance type when their workload’s working set
is relatively small or the improvement in price/performance is sufficient to justify the reduced
memory size.
Note for the below results that cloud vendor 2 has chosen non-power-of-two CPU counts for their
medium and large instance sizes. Instead of 32 vCPUs, their closest is 36, and instead of 64, their
closest is 72. For workloads that scale best with powers of two (common in some high-performance
computing (HPC)-style workloads), this is a factor to keep in mind.
LINPACK
Figure 9 explanation
Figure 12 explanation
Figure 13 explanation
Figure 14 explanation
Figure 15 explanation
Figure 16 explanation
These price/performance scores should not be any surprise after looking at the previous results.
Worth noting: The degree to which the 8 vCPU instances outscore the larger instances is significant.
The storage-optimized instance classes in cloud vendor 1 and cloud vendor 2 differ even more
than the general-purpose classes. In cloud vendor 1, the instances are based on the AMD Naples CPU,
and the cloud vendor 2 instances are based on Intel Skylake CPUs. These CPU architectures are
drastically different from each other, as seen in the following benchmark results. While most customers
who select these instance classes are looking for the advantages gained by the instance-local
storage, the applications that run on them are still dependent on CPU horsepower.
LINPACK
Figure 17 explanation
A further improvement for the AMD-based results could have been made using the AMD Optimizing
C/C++ Compiler (AOCC), but at this time we have not studied its impact on high-performance LINPACK (HPL) or
LINPACK results. Customers may find that applications that have been highly optimized for Intel
CPUs might need work to perform well on AMD CPUs.
STREAM
Figure 19 explanation
SPECjbb2005
Figure 21 explanation
Figure 23 explanation
Figure 24 explanation
In addition, Red Hat Enterprise Linux comes with recommended out-of-the-box best practices for
performance tuning, empowering customers to optimize workload performance. With TuneD, you
can manage and select from a variety of performance profiles to meet your use cases.
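As a small illustration, the Python sketch below wraps the `tuned-adm` command-line tool that ships with TuneD. The wrapper functions are our own and only cover the `list`, `active`, `recommend`, and `profile` subcommands. The sketch does nothing if `tuned-adm` is not installed, and switching profiles normally requires root privileges. The profile name shown, `throughput-performance`, is one of the stock TuneD profiles; whether it suits your workload is something to verify with your own testing.

```python
import shutil
import subprocess


def tuned_adm_command(action, profile=None):
    """Build the argument list for a tuned-adm invocation."""
    if action in ("list", "active", "recommend"):
        return ["tuned-adm", action]
    if action == "profile":
        if not profile:
            raise ValueError("a profile name is required")
        return ["tuned-adm", "profile", profile]
    raise ValueError(f"unsupported action: {action}")


def run_tuned_adm(action, profile=None):
    """Run tuned-adm and return its standard output, or None if it is not installed."""
    if shutil.which("tuned-adm") is None:
        return None
    completed = subprocess.run(
        tuned_adm_command(action, profile),
        capture_output=True,
        text=True,
        check=False,
    )
    return completed.stdout


# Show the currently active profile and the one TuneD recommends for this host.
print(run_tuned_adm("active"))
print(run_tuned_adm("recommend"))

# To switch profiles (requires root), you could call:
# run_tuned_adm("profile", "throughput-performance")
```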
See how well your systems are performing in the cloud using the tools we outlined above. Read
our performance blog series on how you can measure and tune your Red Hat Enterprise Linux
performance.
NORTH AMERICA: 1 888 REDHAT1
EUROPE, MIDDLE EAST, AND AFRICA: 00800 7334 2835, europe@redhat.com
ASIA PACIFIC: +65 6490 4200, apac@redhat.com
LATIN AMERICA: +54 11 4329 7300, info-latam@redhat.com
Copyright © 2021 Red Hat, Inc. Red Hat and the Red Hat logo are trademarks or registered trademarks of Red Hat, Inc. or its subsidiaries
in the United States and other countries. Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries. Java is
the registered trademark of Oracle America, Inc. in the United States and other countries. All other trademarks are the property of their
respective owners.
redhat.com
#F30497_1121