Storage Patterns-Kubernetes
These materials are © 2020 John Wiley & Sons, Inc. Any dissemination, distribution, or unauthorized use is strictly prohibited.
  Storage Patterns for Kubernetes For Dummies®, Red Hat
  Special Edition
  Published by
  John Wiley & Sons, Inc.
  111 River St.
  Hoboken, NJ 07030-5774
  www.wiley.com
  Copyright © 2020 by John Wiley & Sons, Inc.
  No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or
  by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as
  permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without the prior written
  permission of the Publisher. Requests to the Publisher for permission should be addressed to the
  Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax
  (201) 748-6008, or online at http://www.wiley.com/go/permissions.
  Trademarks: Wiley, For Dummies, the Dummies Man logo, The Dummies Way, Dummies.com, Making
  Everything Easier, and related trade dress are trademarks or registered trademarks of John Wiley & Sons,
  Inc. and/or its affiliates in the United States and other countries, and may not be used without written
  permission. Red Hat, Red Hat Enterprise Linux, the Shadowman logo, and JBoss are trademarks or
  registered trademarks of Red Hat, Inc. or its subsidiaries in the United States and other countries. The
  OpenStack Word Mark and OpenStack Logo are either registered trademarks/service marks or
  trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are
  used with the OpenStack Foundation’s permission. Linux is the registered trademark of Linus Torvalds in
  the U.S. and other countries. Java is the registered trademark of Oracle America, Inc. in the United States
  and other countries. All other trademarks are the property of their respective owners. John Wiley & Sons,
  Inc., is not associated with any product or vendor mentioned in this book.
  For general information on our other products and services, or how to create a custom For Dummies book
  for your business or organization, please contact our Business Development Department in the U.S. at
  877-409-4177, contact info@dummies.biz, or visit www.wiley.com/go/custompub. For information about
  licensing the For Dummies brand for products or services, contact BrandedRights&Licenses@Wiley.com.
Publisher’s Acknowledgments
   Some of the people who helped bring this book to market include the following:
   Project Editor: Carrie Burchfield-Leighton
   Acquisitions Editor: Ashley Coffey
   Editorial Manager: Rev Mengle
   Business Development Representative: Molly Daugherty
   Production Editor: Mohammed Zafar Ali
  Introduction
                    Have you run Docker containers and discovered that storage
                    isn’t as simple as mounting a directory? Perhaps you have
                    exposure to Kubernetes and have discovered volumes but
                    need more? The vast and flexible world of hyper-converged
                    infrastructure and how that can be implemented with
                    Kubernetes can help. Beyond understanding the pieces and
                    parts, Kubernetes brings it all together with concrete
                    examples of how your applications can benefit from enhanced
                    storage capabilities.
                    The Tip icon gives you helpful hints or pointers to something that
                    may assist you in understanding or implementing the technology
                    being discussed.
                    The warning icon calls your attention to information that may be
                    a stumbling block or pitfall. As you look to discover or use bits of
                    info from warning sections, pay extra attention to the detail.
                    These icons give you a heads-up that the topic immediately fol-
                    lowing is likely to be deeper in the technical “weeds” than other
                    passages. You may find this information interesting but not com-
                    pletely necessary for a higher level understanding of Kubernetes
                    storage patterns.
                                                    IN THIS CHAPTER
                                                    »» Discussing persistent storage for
                                                       containerized applications
  Chapter 1
  Introducing Storage in Kubernetes
                    Kubernetes has emerged as the most mature platform for
                    managing, configuring, and deploying containers at scale. It
                    provides you a declarative, API-centric way to describe
                    container-based applications that can self-manage, self-heal,
                    self-scale, and more. In this chapter, you explore the impact of
                    expanding the automation and orchestration from applications
                    and networking to also include storage, and you look at how you
                    can apply these technologies and patterns to your applications.
                    We also discuss where storage is heading in the Kubernetes
                    ecosystem.
  Persistent Storage in an
  Ephemeral World
                    Containers have changed the way we build, ship, and run appli-
                    cations. This transformative technology affords greater velocity
                    and application portability but has also brought new challenges
                    to technology teams.
                    Challenges of persisting data in a
                    transient system
                    Kubernetes is the leading orchestrator for container workloads.
                    When a container stops for one reason or another, Kubernetes is
                    able to start a new instance almost immediately, making avail-
                    ability and recoverability something that happens in seconds as
                    opposed to minutes or even hours. But this new instance of the
                    application may not have access to the file system from the old
                    instance, especially in the case of local file systems, so additional
                    orchestration is required to ensure that the stateful component of
                    the application is bound to the instance at restart.
                    So, how do you provide a stable storage solution for any type of
                    content, with a convention as composable and portable as
                    the ELK pattern? I’m glad you asked. You don’t want to inhibit
                    your IT team’s ability to back up, recover, or secure the data
                    either. Fortunately, that’s where Kubernetes steps in. Its features
                    and capabilities address this challenge, and they’re rapidly grow-
                    ing and maturing. Kubernetes began with storage features as part
                    of the Pod API and soon expanded these into standalone inter-
                    faces. These include
                       »» Volumes
                       »» Persistent volumes
                       »» Storage classes
                       »» Container Storage Interface (CSI)
                    A Kubernetes pod is the smallest schedulable resource in Kuber-
                    netes. It’s a logical collection of containers, which ensures the
                    encapsulated containers share resources and are hosted on the
                    same node in your cluster. Typically, a pod consists of a single
                    container and is an instance of an application.
  Volumes in Kubernetes
                    The Kubernetes volume in its simplest form is a directory that’s
                    attached to a pod and mounted to one or more containers in the
                    pod. Such coupling is tight and is a foundational construct for how
                    data can be presented to a container.
                    Kubernetes volumes are typed, which is how you define the stor-
                    age mechanism backing the volume. Volume characteristics may
                    be defined via volume type-specific parameters. This may be a
                    physical disk or path on the Kubernetes node, a SAN, network
                    attached storage (NAS), or even a storage service from a cloud
                    provider — to name a few.
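                    To make the idea of a typed volume concrete, here is a minimal
                    sketch that builds a pod manifest as a Python dictionary and
                    prints it as JSON, which kubectl accepts alongside YAML. The pod
                    name, image, NFS server, and export path are placeholders chosen
                    for illustration, not values from this book.

```python
import json

# Hypothetical names: "web", "content", and the NFS server and path are
# placeholders for illustration only.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web"},
    "spec": {
        "containers": [
            {
                "name": "web",
                "image": "nginx",
                # Mount the pod-level volume into this container's file system.
                "volumeMounts": [
                    {"name": "content", "mountPath": "/usr/share/nginx/html"}
                ],
            }
        ],
        # The key next to "name" ("nfs" here) is the volume type, and the
        # parameters under it are specific to that type.
        "volumes": [
            {
                "name": "content",
                "nfs": {"server": "nfs.example.com", "path": "/exports/web"},
            }
        ],
    },
}


def volume_type(volume):
    """Return the type key of a volume entry (the key that isn't "name")."""
    return [key for key in volume if key != "name"][0]


print(json.dumps(pod, indent=2))
```

                    Swapping the nfs key for another volume type changes the storage
                    behind the volume without touching the container's volumeMounts
                    entry.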
                    layer of abstraction helps but on its own doesn’t solve all persist-
                    ent storage concerns.
                    Special volumes
                    Kubernetes makes the following special volumes available for use:
                       »» emptyDir
                       »» configMap
                       »» secret
                    When a pod is scheduled to run on a node, a volume is created
                    on that node and will remain there until the pod is deleted or
                    moved to another node. This volume is an emptyDir volume that
                    begins empty and is backed by the storage of the host itself. Given
                    the nature of this volume type, it’s not regularly backed up and
                    should be treated as ephemeral or temporary storage in most
                    cases, such as caching or working objects. Every pod can make
                    use of an emptyDir.
                    The other two types, configMap and secret, are volumes whose
                    content is actually stored by Kubernetes. This mechanism is often
                    used to pass configuration information, such as credentials, into
                    a container and ensure it’s available wherever the pod is
                    scheduled to run.
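                    The three special volume types can sit side by side in one pod.
                    The sketch below shows one way that might look. The ConfigMap
                    name app-settings, the Secret name app-credentials, the image,
                    and the mount paths are all hypothetical, and the ConfigMap and
                    Secret are assumed to already exist in the pod's namespace.

```python
import json

# Hypothetical volume names and referenced objects, for illustration only.
volumes = [
    # Starts empty; backed by the node's storage for the life of the pod.
    {"name": "scratch", "emptyDir": {}},
    # Content comes from a ConfigMap stored by Kubernetes.
    {"name": "settings", "configMap": {"name": "app-settings"}},
    # Content comes from a Secret stored by Kubernetes.
    {"name": "credentials", "secret": {"secretName": "app-credentials"}},
]

mounts = [
    {"name": "scratch", "mountPath": "/tmp/work"},
    {"name": "settings", "mountPath": "/etc/app", "readOnly": True},
    {"name": "credentials", "mountPath": "/etc/app-secrets", "readOnly": True},
]

pod_spec = {
    "containers": [
        {"name": "app", "image": "example/app:1.0", "volumeMounts": mounts}
    ],
    "volumes": volumes,
}

print(json.dumps(pod_spec, indent=2))
```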
                    Volumes on nodes
                    Much like the emptyDir (see the preceding section), you can also
                    define a hostPath volume or a local volume. Both volume types
                    provide a way for you to specify a local path or storage device on a
                    node and mount it in a path defined in a container.
                    hostPath volumes
                    hostPath volumes allow your pod to access the file system of the
                    Kubernetes node it’s scheduled on, but that also means your pod
                    can get access to other containers running on the host,
                    certificates of the kubelet, and other sensitive files. Some
                    critical limitations of hostPath volumes to keep in mind include
                    the following:
                       »» The same path may be absent or contain a different version
                            of the referenced file from node to node, causing irregular
                            behavior in an application.
                       »» Kubernetes can’t account for the storage available for
                            hostPath volumes, which means your available local storage
                            is in a blind spot of your orchestrator.
                       »» Typically, only the root user can write to files and
                            directories in your hostPath, which forces you to run your
                            container as a privileged container or your processes as root.
                            This is a bad practice because it’s at odds with the principle
                            of least privilege (PoLP).
                       »» Each node that can be targeted by a hostPath pod must be
                            able to support these pods.
                    Local volumes
                    A local volume varies slightly from the hostPath in two ways:
                       »» Selectors
                       »» Storage class
                    Find details about the PVC elements at https://kubernetes.io/
                    docs/concepts/storage/persistent-volumes/#persistent
                    volumeclaims.
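                    As a rough illustration of where selectors and storage classes
                    appear in a PVC, the following sketch builds a claim and the
                    volume entry a pod would use to reference it. The claim name, the
                    storage class name fast, and the label tier=database are
                    assumptions made for this example. Keep in mind that, per the
                    Kubernetes documentation, a claim with a non-empty selector is
                    matched only against existing persistent volumes and isn't
                    dynamically provisioned.

```python
import json

# Hypothetical claim: the name, class, label, and size are placeholders.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "data-claim"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "10Gi"}},
        # Storage class: which class of storage should satisfy the claim.
        "storageClassName": "fast",
        # Selector: only persistent volumes carrying this label can be bound.
        "selector": {"matchLabels": {"tier": "database"}},
    },
}

# A pod refers to the claim by name rather than to the underlying storage.
pod_volume = {
    "name": "data",
    "persistentVolumeClaim": {"claimName": pvc["metadata"]["name"]},
}

print(json.dumps(pvc, indent=2))
print(json.dumps(pod_volume, indent=2))
```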
                    FlexVolume is included for completeness; however, the conven-
                    tion is being deprecated. Only use it if you must.
                                                    IN THIS CHAPTER
                                                    »» Discovering real-world Kubernetes
                                                       storage examples
  Chapter 2
  Looking at the Convergence of Storage and Applications
                    In Chapter 1, you discover the numerous capabilities in
                    Kubernetes to address both ephemeral and persistent storage.
                    In this chapter, you find out about systems and patterns that
                    can be added to Kubernetes to solve even more complex storage
                    needs and architectural challenges.
                    Kubernetes and the real world collide, and storage is a big part of
                    this transformation.
                    Software-defined systems
                    include storage
                    Orchestrating stateless, containerized applications requires adopt-
                    ing a declarative model of describing your application deployment.
                    The desired state of your deployment is described as YAML code,
                    and Kubernetes works to make the deployed system match that
                    desired state.
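                    The declarative model is easier to picture with a toy example.
                    The sketch below is not how Kubernetes controllers are
                    implemented. It's a simplified stand-in, with made-up workload
                    names, that compares a desired state to an observed state and
                    lists the actions needed to close the gap.

```python
def reconcile(desired, actual):
    """Return the actions that would move `actual` toward `desired`.

    Both arguments map a workload name to a replica count. This is a toy
    model of reconciliation, not Kubernetes controller code.
    """
    actions = []
    for name, replicas in desired.items():
        running = actual.get(name, 0)
        if running < replicas:
            actions.append(("start", name, replicas - running))
        elif running > replicas:
            actions.append(("stop", name, running - replicas))
    # Anything running that is no longer described should be removed.
    for name, running in actual.items():
        if name not in desired:
            actions.append(("stop", name, running))
    return actions


# Hypothetical workloads for illustration.
desired = {"web": 3, "worker": 2}
actual = {"web": 1, "worker": 2, "old-job": 1}

for action in reconcile(desired, actual):
    print(action)
```

                    You describe only the end state. Working out the steps to get
                    there, and repeating that work whenever reality drifts, is left
                    to the platform.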
                    GCP, AWS, and Azure all have storage drivers in Kubernetes that
                    help offload the concerns of managing and provisioning the plat-
                    form storage services and provide a common interface in PVCs
                    and Storage Classes. But how do you achieve parity for clusters
                    running inside corporate data centers, or for storage services you
                    wish to use when a consistent experience in Kubernetes doesn’t
                    exist yet? One solution, which is to host your own storage system,
                    is covered in the next section.
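                    One way to picture that common interface is to give every cluster
                    a storage class with the same name but a cloud-specific
                    provisioner, so the same PVC can be applied anywhere. The sketch
                    below assumes the CSI drivers for each cloud are installed in the
                    respective clusters. The class name standard-block and the claim
                    name are placeholders.

```python
import json

# CSI provisioner names for each cloud's block storage driver. Whether a
# given cluster has these drivers installed is an assumption of this sketch.
PROVISIONERS = {
    "aws": "ebs.csi.aws.com",
    "gcp": "pd.csi.storage.gke.io",
    "azure": "disk.csi.azure.com",
}


def storage_class(name, provisioner):
    """Build a minimal StorageClass manifest."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
    }


# Same class name in every cluster; only the provisioner differs.
classes = {
    cloud: storage_class("standard-block", provisioner)
    for cloud, provisioner in PROVISIONERS.items()
}

# The claim never mentions a cloud, so it is portable across the clusters.
claim = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "app-data"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "20Gi"}},
        "storageClassName": "standard-block",
    },
}

print(json.dumps(classes["aws"], indent=2))
print(json.dumps(claim, indent=2))
```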
                    where you orchestrate your compute and networking in one plat-
                    form. In this section, you look at bringing storage into the mix as
                    well and how doing so with Kubernetes can help you run the same
                    platform on premise and across multiple cloud providers.
                    Rook
                    Rook is a storage orchestrator for Kubernetes. It uses the declara-
                    tive nature of Kubernetes to make storage services self-healing,
                    self-scaling, and managed with similar conventions as your con-
                    tainerized applications. Rook uses a plugin-extensible architec-
                    ture to help you automate storage administrative tasks such as
                    deployment, bootstrapping, configuring, scaling, provisioning,
                    and monitoring.
                    What does this mean to you? Rook allows you to use a declara-
                    tive style of management that empowers you to achieve “as code”
                    capabilities for a multitude of storage types. It implements the
                    new necessary Kubernetes objects so you have a common inter-
                    face to provision storage software that provides object, block, and
                    file system storage types.
                    Rook goes beyond primitive storage types and also enables you to
                    orchestrate databases and other storage types in a claims-based
                    fashion. Handling volumes and these other storage types in the
                    same way is one route by which Kubernetes is growing to
                    orchestrate all storage types.
                       »» Cassandra
                       »» Ceph
                       »» CockroachDB
                       »» EdgeFS
                       »» MinIO
                       »» NFS
                       »» YugabyteDB
                    Ceph and EdgeFS are the most mature and are actively supported
                    by the Rook community. The other operators are mentioned for
                    completeness and are in various stages of development.
                    Ceph
                    Ceph is a high-performance, distributed storage platform designed
                    to bring capabilities similar to those found in GCP and AWS to
                    any cloud platform, such as Kubernetes. It provides object
                    storage, block storage, and distributed file systems backed by a
                    single, reliable storage cluster running on commodity server
                    hardware.
                    Longhorn
                    Longhorn is another software-defined storage platform that
                    provides Kubernetes with a storage subsystem, rounding out
                    the HCI capabilities. Unlike Rook-Ceph, it is a single solution
                    encompassing features of both of those projects, but it
                    implements only a distributed block storage solution, which means
                    a volume can be consumed by only a single pod at a time.
                    Architectural patterns
                    If you’ve been reading this chapter up to this point, it may be
                    helpful to see how Rook-Ceph fits into your Kubernetes cluster.
                    Take a look at Figure 2-1. You see the relationship of Rook, Ceph,
                    and Kubernetes components.
  Practical Application of Converged
  Storage Patterns
                    Armed with an understanding of Kubernetes storage and how
                    Rook-Ceph extends those capabilities, this section helps you
                    understand how you can make good use of the combined tech-
                    nologies. Putting HCI to work for you and your applications can
                    make you more nimble and resilient.
                    In Figure 2-3, the application can use storage provided by Rook-
                    Ceph (or another software-defined storage solution), which is in
                    turn consuming compute and storage resources from any cloud
                    provider.
                    FIGURE 2-4: Application and storage deployment that is portable between
                    HCI platforms on-premises and different cloud environments.
                                                    IN THIS CHAPTER
                                                    »» Discovering persistent storage for your
                                                       applications
  Chapter 3
  Multi-Cloud and Hybrid Cloud Considerations
                    In this chapter, you examine different patterns for dealing with
                    the complexity of running stateless apps that produce and use
                    stateful data.
                            volume. This has a direct impact on your application start-up
                            and fail over times.
                       »» Performance of EBS volumes is directly tied to the size of the
                            volume. Such a relationship can lead to over-provisioning of
                            EBS, and cost rises in proportion to size (and speed).
                       »» EBS is built in a way that it’s scoped to exist in a specific
                            availability zone (AZ), inhibiting your ability to automatically
                            fail over to another zone.
                    For these reasons, it’s often smart to provision your own
                    software-defined file solution by using a system like Rook-Ceph
                     and CephFS.
  Object Storage
                    If your application stores a lot of mostly static content, it may be
                    best suited to use object storage services as opposed to block or
                    file storage.
                    Today, Google, Azure, and AWS all provide object storage as a ser-
                    vice. However, these services all implement slightly different APIs
                    and security models, and provide slightly different management
                    capabilities.
                    is, a bucket is provisioned, and access credentials and URLs are
                    passed to the pod so it knows how to connect to the storage.
                    lib-bucket-provisioner
                    lib-bucket-provisioner is a library for Kubernetes that
                    introduces the ObjectBucket (OB) and ObjectBucketClaim (OBC)
                    objects, along with the associated workflows, interfaces, and so
                    on. Discover the details about this project at
                    github.com/kube-object-storage/lib-bucket-provisioner.
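                    Here is a sketch of what an ObjectBucketClaim and a container
                    that consumes it might look like. It assumes a provisioner built
                    on lib-bucket-provisioner is installed and that, as the project
                    describes, a ConfigMap and a Secret named after the claim are
                    created in the same namespace. The claim name, bucket prefix,
                    storage class name, and image are hypothetical.

```python
import json

# Hypothetical claim; "object-store" stands in for whatever bucket storage
# class your provisioner exposes.
obc = {
    "apiVersion": "objectbucket.io/v1alpha1",
    "kind": "ObjectBucketClaim",
    "metadata": {"name": "media-bucket"},
    "spec": {
        # Prefix for a generated, unique bucket name.
        "generateBucketName": "media",
        "storageClassName": "object-store",
    },
}

claim_name = obc["metadata"]["name"]

# The container pulls the endpoint details and the access credentials into
# environment variables from the ConfigMap and Secret named after the claim.
container = {
    "name": "uploader",
    "image": "example/uploader:1.0",
    "envFrom": [
        {"configMapRef": {"name": claim_name}},
        {"secretRef": {"name": claim_name}},
    ],
}

print(json.dumps(obc, indent=2))
print(json.dumps(container, indent=2))
```

                    The application code then reads its bucket endpoint and
                    credentials from the environment, regardless of which object
                    store sits behind the storage class.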
                    at, you should recognize that providing a native way to deal with
                    object storage rounds out the most common storage scenarios
                    directly in the platform. Being able to describe and dynamically
                    orchestrate, in an API-first fashion, block devices, file systems,
                    and object stores solves a great number of challenges for
                    developers and administrators alike.
                    Coupling of application stack and
                    storage state
                    Traditionally, disaster recovery (DR) is implemented in one of the
                    following patterns:
                    your Kubernetes cluster. Velero triggers snapshots of your vol-
                    umes, captures the objects from the Kubernetes API, and stores
                    them all together in its own space in the storage provider.
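                    A Velero backup is itself described as a Kubernetes object. The
                    sketch below shows roughly what a Backup resource might contain,
                    assuming Velero is installed in the velero namespace with a
                    backup storage location named default. The backup name, the
                    application namespace shop, and the retention period are
                    placeholders.

```python
import json

# Hypothetical backup definition for illustration.
backup = {
    "apiVersion": "velero.io/v1",
    "kind": "Backup",
    "metadata": {"name": "shop-nightly", "namespace": "velero"},
    "spec": {
        # Capture the API objects that live in this namespace.
        "includedNamespaces": ["shop"],
        # Also trigger snapshots of the persistent volumes those objects use.
        "snapshotVolumes": True,
        # Where Velero stores the captured objects.
        "storageLocation": "default",
        # Keep the backup for 30 days.
        "ttl": "720h0m0s",
    },
}

print(json.dumps(backup, indent=2))
```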
  Managing Storage in a Multi-Cluster/
  Hybrid Cloud Environment
                    Many of the patterns we discuss in this book center on methods
                    for managing your deployments for a relatively small number of
                    clusters. As you scale, the complexity increases not only for
                    ephemeral applications but even more so for applications
                    that create and utilize storage.
                                                    IN THIS CHAPTER
                                                    »» Getting involved in the Kubernetes
                                                       community
  Chapter 4
  Ten Items for Your Kubernetes Storage Checklist
                            aspects of their volumes such as read/write and size.
                            Expecting each cluster to be able to satisfy your PVC via a
                            storage class isn’t an unreasonable assumption.
                       »» Embrace the ObjectBucketClaim model for provisioning
                            object storage. Using the OB/OBC claim-based model
                            insulates you from the different object storage provider
                            backends much like PVCs do.
                       »» Understand the limits of emptyDir volumes in your
                            applications. An emptyDir persists beyond the lifetime of a
                            container but not beyond the lifetime of its pod on a node,
                            and it is useful when you need ramdisks and the like.
                       »» When possible, be API driven. With the use of both the
                            Kubernetes API and Custom Resource Definitions (CRDs),
                            users can create a consistent model for defining their
                            resources that makes their deployments more portable in
                            the future.
                       »» Keep an eye on the storage drivers you use in your
                            environments. As more storage drivers convert to the
                            Container Storage Interface (CSI), you want to embrace the
                            go-forward pattern as soon as you’re able to.
                       »» Mind your Kubernetes versions. As you look at branching
                            out into multi-cloud deployments, mixed versions will have
                            varying levels of support for the same features. For example,
                            volume snapshotting was introduced as an Alpha feature in
                            Kubernetes version 1.12, moved to Beta in 1.17, and became
                            generally available in 1.20. Alpha features aren’t enabled
                            by default, so not all 1.12+ clusters will have this
                            available to you.
                       »» Get running quickly with operators and Helm charts for
                            Rook-Ceph, NooBaa, and so on. You can gain a lot of
                            capability for less effort than deploying these solutions
                            manually. Critical tools are available at the following
                            websites:
                            •	 operatorhub.io/operator/noobaa-operator
                            •	 operatorhub.io/operator/rook-ceph
                            •	 github.com/helm/charts/tree/master/stable/velero
                       »» Never stop learning. Kubernetes is one of the best
                            technologies we’ve ever worked with. It’s rapidly evolving,
                            with quarterly releases. As the ecosystem continues to become
                            more plugin-centric, the components and tools will continue
                            to change and grow. Keep reading and learning to stay on top
                            of the rate of change and remain current.
WILEY END USER LICENSE AGREEMENT
Go to www.wiley.com/go/eula to access Wiley’s ebook EULA.