09 Maintenance and Monitoring

This document outlines the final module of a course on application maintenance and monitoring, focusing on managing service versions, cost planning, and monitoring dashboards. It covers strategies for deploying updates such as rolling updates, blue/green deployments, and canary releases, as well as methods for optimizing service costs using Google Cloud tools. Additionally, it emphasizes the importance of monitoring service availability and performance through Cloud Monitoring and alerts.

Uploaded by

ayushman292140

Maintenance and Monitoring

Philipp Maier
Course Developer, Google Cloud

In this final module of this course, we cover application maintenance and monitoring.
Learning objectives
● Manage new service versions using rolling updates, blue/green deployments,
and canary releases.

● Forecast, monitor and optimize service cost using the Google Cloud pricing
calculator and billing reports, and by analyzing billing data.

● Observe whether your services are meeting their SLOs using Cloud Monitoring
and Dashboards.
● Use Uptime Checks to determine service availability.
● Respond to service outages using Cloud Monitoring Alerts.

Maintenance is primarily concerned with how updates are made to running
applications, the different strategies available, and how different deployment platforms
support them. For monitoring, I discuss this vital area for cloud-native applications
from two perspectives:

1. First, I will talk about the cost perspective to make sure that resources are
provisioned to match demand. After all, why should you pay for
resources that you don’t need?
2. Second, I will discuss how to implement monitoring and observability to
determine and alert on the health of services and applications using Cloud
Monitoring and dashboards.

This will also allow us to define uptime checks and use Cloud Monitoring alerts to
identify service outages. Let’s get started!
Agenda
Managing Versions

Cost Planning

Monitoring Dashboards

Let’s begin by taking a look at version management.

In a microservice architecture, be careful not to break
clients when services are updated
● Include version in URI:
○ If you deploy a breaking change, you need to change the version.
● Need to deploy new versions with zero downtime.
● Need to effectively test versions prior to going live.

A key benefit of a microservice architecture is the ability to independently deploy
microservices. This means that the service API has to be protected. Versioning is
required, and when new versions are deployed, care must be taken to ensure
backward compatibility with the previous version. Some simple design rules can help,
such as indicating the version in the URI and making sure you change the version
when you make a backward-incompatible change. Deploying new versions of
software always carries risk. We want to make sure we test new versions effectively
before going live, and when ready to deploy a new version, we do so with zero
downtime.
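
As a minimal sketch of the version-in-URI rule, the Python below dispatches a request by the version prefix in its path. The paths, handler names, and response fields are hypothetical; they only illustrate keeping `/v1` available while a breaking change ships as `/v2`.

```python
from typing import Callable, Dict

# Hypothetical handlers: v2 renames fields, which is a breaking change,
# so it is published under a new version instead of replacing v1.
def get_order_v1(order_id: str) -> dict:
    return {"id": order_id, "total": 100}

def get_order_v2(order_id: str) -> dict:
    return {"order_id": order_id, "total_cents": 10000}

ROUTES: Dict[str, Callable[[str], dict]] = {
    "v1": get_order_v1,
    "v2": get_order_v2,
}

def dispatch(path: str) -> dict:
    """Route '/<version>/orders/<id>' to the handler for that version."""
    parts = path.strip("/").split("/")
    if len(parts) != 3 or parts[1] != "orders":
        raise ValueError(f"unsupported path: {path}")
    version, _, order_id = parts
    if version not in ROUTES:
        raise ValueError(f"unknown API version: {version}")
    return ROUTES[version](order_id)

# Existing clients keep calling v1; new clients opt in to v2.
old_client_response = dispatch("/v1/orders/42")
new_client_response = dispatch("/v2/orders/42")
```

Because both versions stay routable, existing clients are not broken when the new version goes live.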

Let me discuss some strategies that can help achieve these objectives.
Rolling updates allow you to deploy new versions
with no downtime
● Typically, you have multiple instances of a service behind a load balancer.
● Update each instance one at a time.
● Rolling updates work when it is ok to have 2 different versions running
simultaneously during the update.
● Rolling updates are a feature of instance groups; just change the
instance template.
● Rolling updates are the default in Kubernetes; just change the Docker
image.
● Completely automated in App Engine.

Rolling updates allow you to deploy new versions with no downtime. The typical
configuration is to have multiple instances of a service behind a load balancer. A
rolling update will then update one instance at a time. This strategy works fine if the
API is not changed or is backward compatible, or if it is ok to have two versions of the
same service running during the update.

If you are using instance groups, rolling updates are a built-in feature. You just define
the rolling update strategy when you perform the update.
For Kubernetes, rolling updates are there by default; you just need to specify the
replacement Docker image.
Finally, for App Engine, rolling updates are completely automated.
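
To make the "one instance at a time" behavior concrete, here is a small simulation in Python. It does not call any Google Cloud or Kubernetes API; the function name and the version strings are assumptions for illustration only.

```python
from typing import List

def rolling_update(instances: List[str], new_version: str) -> List[List[str]]:
    """Replace one instance at a time and record the fleet after each step."""
    fleet = list(instances)
    history = [list(fleet)]
    for i in range(len(fleet)):
        # A real platform would drain, replace, and health-check the instance here.
        fleet[i] = new_version
        history.append(list(fleet))
    return history

steps = rolling_update(["v1", "v1", "v1"], "v2")

# Steps during which two versions serve traffic at the same time.
mixed_steps = [s for s in steps if len(set(s)) > 1]
```

The intermediate steps are the reason a rolling update only suits changes where two versions can safely run side by side.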
Use a blue/green deployment when you don’t want
multiple versions of a service running simultaneously
● The blue deployment is the current version.
● Create an entirely new environment (the green).
● Once the green deployment is tested, migrate client requests to it.
● If failures occur, switch it back.
● In Compute Engine, you can use DNS to migrate requests from one load
balancer to another.
● In Kubernetes, configure your service to route to the new pods using labels.
○ Simple configuration change
● In App Engine, use the Traffic Splitting feature.

Use a blue/green deployment when you don’t want multiple versions of a service to
run simultaneously.

Blue/green deployments use two full deployment environments. The blue deployment
runs the currently deployed production software, while the green deployment
environment is available for deploying updated versions of the software.

When you want to test a new software version, you deploy it to the green
environment. Once testing is complete, the workload is shifted from the current (blue)
to the new (green) environment. This strategy mitigates the risk of a bad deployment
by allowing the switch back to a previous deployment if something goes wrong.

For Compute Engine, you can use DNS to migrate requests, while in Kubernetes you
can configure your service to route to new pods using labels, which is just a simple
configuration change. App Engine allows you to split traffic, which you explored in the
previous lab of this course.
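
The label-based switch described for Kubernetes can be modeled in a few lines of Python. This is only a model of selector matching, not Kubernetes client code; the `app` and `version` label names and their values are assumptions.

```python
from typing import Dict, List

Pod = Dict[str, str]  # a pod is represented only by its labels here

def select_pods(pods: List[Pod], selector: Dict[str, str]) -> List[Pod]:
    """Return the pods whose labels contain every key/value in the selector."""
    return [p for p in pods if all(p.get(k) == v for k, v in selector.items())]

pods = [
    {"app": "orders", "version": "blue"},
    {"app": "orders", "version": "blue"},
    {"app": "orders", "version": "green"},
    {"app": "orders", "version": "green"},
]

service_selector = {"app": "orders", "version": "blue"}
before_cutover = select_pods(pods, service_selector)

# Cut over: a single configuration change moves all traffic to green.
service_selector["version"] = "green"
after_cutover = select_pods(pods, service_selector)

# Rolling back is the same change in reverse: set "version" back to "blue".
```

At no point does the selector match both colors, which is what distinguishes this strategy from a rolling update.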
Canary releases can be used prior to a rolling update
to reduce the risk
● The current service version continues to run.
● Deploy an instance of the new version and give it a portion of requests.
● Monitor for errors.
● In Compute Engine, you can create a new instance group and add it as an
additional backend in your load balancer.
● In Kubernetes, create a new pod with the same labels as the existing pods;
the service will automatically route a portion of requests to it.
● In App Engine, use the Traffic Splitting feature.

Now, you can use canary releases prior to a rolling update to reduce risk. With a
canary release, you make a new deployment with the current deployment still running.
Then you send a small percentage of traffic to the new deployment and monitor it.

Once you have confidence in your new deployment, you can route more traffic to the
new deployment until 100% is routed this way.

In Compute Engine, you can create a new instance group and add it to the load
balancer as an additional backend.
In Kubernetes, you can create a new pod with the same labels as the existing pods.
The service will automatically divert a portion of the requests to the new pod.
In App Engine, you can again use the traffic splitting feature to drive a portion of traffic
to the new version.
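
How large is "a portion of the requests"? As a rough sketch, assuming the service spreads requests evenly across all matching pods, the canary share is simply the canary pod count divided by the total. The pod counts and the random routing below are illustrative assumptions, not measured behavior.

```python
import random
from collections import Counter

def canary_share(stable_pods: int, canary_pods: int) -> float:
    """Expected fraction of requests reaching the canary under even spreading."""
    return canary_pods / (stable_pods + canary_pods)

def route(stable_pods: int, canary_pods: int, rng: random.Random) -> str:
    """Pick a destination for one request with the expected probability."""
    if rng.random() < canary_share(stable_pods, canary_pods):
        return "canary"
    return "stable"

rng = random.Random(0)  # fixed seed so the simulation is repeatable
counts = Counter(route(9, 1, rng) for _ in range(10_000))
observed_share = counts["canary"] / 10_000
```

With nine stable pods and one canary pod, about a tenth of the traffic exercises the new version; adding canary pods raises that share step by step toward 100%.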
Proprietary + Confidential

02
Cost Planning

Cost planning is an important phase in your design that starts with capacity planning.

Capacity planning is a continuous, iterative cycle

[Diagram: a repeating cycle of four stages; center labels: Monitor, Repeat, Continuous Integration]
● Forecast: Estimate capacity needed.
● Allocate: Determine resources required to meet forecasted capacity.
● Approve: Cost estimation versus risks and rewards.
● Deploy: Monitor to see how accurate your forecasts were.

I recommend that you treat capacity planning not as a one-off task, but as a
continuous, iterative cycle, as illustrated on this slide.

Start with a forecast that estimates the capacity needed. Monitor and review this
forecast. Then allocate by determining the resources required to meet the forecasted
capacity. This allows you to estimate costs and balance them against risks and
rewards. Once the design and cost are approved, deploy your design and monitor it to
see how accurate your forecasts were. This feeds into the next forecast as the
process repeats.
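
One way to close the loop is to measure how far off the last forecast was and feed the observed value into the next one. The sketch below is an assumption about how that could be done; the request rates, growth rate, and headroom are made-up numbers.

```python
def forecast_error(forecast: float, actual: float) -> float:
    """Relative error of a forecast; positive means capacity was over-forecast."""
    if actual <= 0:
        raise ValueError("actual must be positive")
    return round((forecast - actual) / actual, 4)

def next_forecast(actual: float, growth: float, headroom: float) -> float:
    """Base the next forecast on observed demand plus expected growth and headroom."""
    return round(actual * (1 + growth) * (1 + headroom), 2)

# Hypothetical cycle: 1200 requests/s were forecast, 1000 requests/s were observed.
error = forecast_error(forecast=1200, actual=1000)
upcoming = next_forecast(actual=1000, growth=0.10, headroom=0.20)
```

A persistently positive error suggests resources are being paid for but not used.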

Optimizing cost of compute

● Start with small VMs, and test to see whether they work.

● Consider more small machines with auto scaling turned on.

● Consider committed use discounts.

● Consider at least some preemptible instances:

○ 80% discount
○ Use auto healing to recreate VMs when they are
preempted.

● Google Cloud rightsizing recommendations will alert you
when VMs are underutilized.

A good starting point for anybody working on cost optimization is to become familiar
with the VM instance pricing. It is often beneficial to start with a couple of small
machines that can scale out through auto scaling as demand grows.

To optimize the cost of your virtual machines, consider using committed use
discounts, as these can be significant. Also, if your workloads allow for preemptible
instances, you can save up to 80% and use auto healing to recover when instances
are preempted.
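
To see what a partial move to preemptible instances could be worth, here is a back-of-the-envelope calculation. The hourly price and the 730-hour month are assumptions, and the 80% figure is treated as the best case, since the text says "up to".

```python
def monthly_compute_cost(
    vm_count: int,
    hourly_price: float,
    preemptible_fraction: float,
    preemptible_discount: float = 0.80,  # upper bound quoted in the text
    hours_per_month: float = 730.0,      # assumed average month length
) -> float:
    """Estimate the monthly cost of a fleet that mixes on-demand and preemptible VMs."""
    if not 0.0 <= preemptible_fraction <= 1.0:
        raise ValueError("preemptible_fraction must be between 0 and 1")
    preemptible_vms = vm_count * preemptible_fraction
    on_demand_vms = vm_count - preemptible_vms
    on_demand = on_demand_vms * hourly_price * hours_per_month
    preemptible = (
        preemptible_vms * hourly_price * (1 - preemptible_discount) * hours_per_month
    )
    return round(on_demand + preemptible, 2)

# Hypothetical fleet: 10 VMs at $0.10 per hour.
all_on_demand = monthly_compute_cost(10, 0.10, preemptible_fraction=0.0)
half_preemptible = monthly_compute_cost(10, 0.10, preemptible_fraction=0.5)
```

Actual savings depend on the machine type and on how often instances are preempted and recreated.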

Compute Engine also provides sizing recommendations for your VM instances, as
shown on the right. This is a really useful feature that can help you select the right
size of VM for your workloads and optimize costs.

Optimizing disk cost

● Don’t over-allocate disk space.

● Determine what performance characteristics your applications require:

○ I/O Pattern: small reads and writes or large reads and writes
○ Configure your instances to optimize storage performance.

● Depending on I/O requirements, consider Standard over SSD disks.

Monthly cost by capacity
Capacity    Standard PD    SSD PD
10 GB       $0.40          $1.70
1 TB        $40            $170
16 TB       $655.36        $5,570.56

A common mistake is to over-allocate disk space. This is not cost-efficient, but
selecting a disk is not just about size. It is important to determine the performance
characteristics your applications require, in particular the I/O pattern: do you have
large reads and small writes, the reverse, or mainly read-only data? This type of
information will help you select the correct type of disk. As the table shows, SSD
persistent disks are significantly more expensive than standard persistent disks.
Understanding your I/O patterns can help provide significant savings.
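
The cost of over-allocation is easy to estimate. The per-GB monthly rates below are inferred from the 10 GB and 1 TB rows of the table ($0.04 for standard, $0.17 for SSD) and are assumptions for illustration; actual prices vary by region and change over time.

```python
# Assumed monthly rates per GB, inferred from the 10 GB and 1 TB table rows.
RATE_PER_GB = {"standard": 0.04, "ssd": 0.17}

def monthly_disk_cost(size_gb: float, disk_type: str) -> float:
    """Monthly cost of a persistent disk of the given size and type."""
    return round(size_gb * RATE_PER_GB[disk_type], 2)

def over_allocation_cost(allocated_gb: float, needed_gb: float, disk_type: str) -> float:
    """Monthly amount paid for capacity that the application does not need."""
    return round(
        monthly_disk_cost(allocated_gb, disk_type)
        - monthly_disk_cost(needed_gb, disk_type),
        2,
    )

# Hypothetical case: 500 GB of SSD allocated where 100 GB would do.
wasted = over_allocation_cost(500, 100, "ssd")
```

The same function shows how much of the bill disappears if the I/O pattern allows a standard disk instead.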

To optimize network costs, keep machines close to your data

[Diagram: types of network egress across continents, regions, and zones]
● Egress within the same zone: free
● Egress between zones in the same region
● Egress between regions
● Intercontinental egress
● Internet egress

To optimize network costs, it is best practice to keep machines as close as possible to
the data they need to access. This graphic shows the different types of egress: within
the same zone, between zones in the same region, intercontinental egress, and
internet egress. It is important to be aware of the egress charges. These are not all
straightforward. Egress in the same zone is free. Egress to a different Google Cloud
service within the same region using an external IP address or an internal IP address
is free, except for some services such as Memorystore for Redis. Egress between
zones in the same region is charged and all internet egress is charged.
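
The egress categories can be expressed as a small classifier. This sketch only encodes the categories named on the slide and the one free case stated in the notes; it does not model the same-region service exception or any prices, and the `Location` type is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Location:
    continent: str
    region: str
    zone: str

def egress_category(src: Location, dst: Optional[Location]) -> str:
    """Classify traffic leaving src; dst=None stands for the public internet."""
    if dst is None:
        return "internet"
    if src.continent != dst.continent:
        return "intercontinental"
    if src.region != dst.region:
        return "between regions"
    if src.zone != dst.zone:
        return "between zones in the same region"
    return "same zone"

def is_free(category: str) -> bool:
    """Only same-zone egress is treated as free in this simplified model."""
    return category == "same zone"

vm = Location("north-america", "us-central1", "us-central1-a")
db_same_zone = Location("north-america", "us-central1", "us-central1-a")
db_other_zone = Location("north-america", "us-central1", "us-central1-b")
db_other_continent = Location("europe", "europe-west1", "europe-west1-b")
```

Running each VM-to-data pair through the classifier is a quick way to spot which placements will generate egress charges.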

One way to optimize your network costs is to keep your machines close to your data.

GKE usage metering can prevent over-provisioning

[Diagram: in a Kubernetes cluster, the control plane (API Server, Metrics Server) and
the Kubelet and cAdvisor on each node feed a Usage Metering Agent. The agent exports
request-based metrics and consumption-based metrics, which are combined with Billing
Export in a Data Studio dashboard.]

Compares requested resources with consumed resources.

Requested vs. consumption
                         CPU requested (cpu hour)    CPU consumed (cpu hour)
Namespace                Cost       Amount           Cost      Amount
Namespace-1              507.21     16041            42.45     1343
Namespace-2              101.87     3208             81.95     2460
Kube-system              49.64      1548             24.5      762
Kube: system-overhead    61.24      1908             50.36     1675

Another way to optimize cost is to leverage GKE usage metering, which can prevent
over-provisioning your Kubernetes clusters.

With GKE usage metering, an agent collects consumption metrics in addition to the
resource requests by polling PodMetrics objects from the metrics server. The
resource request records and resource consumption records are exported to two
separate tables in a BigQuery dataset that you specify. Comparing requested with
consumed resources makes it easy to spot waste and take corrective measures.

This graphic shows a typical configuration where the request-based and consumption-based
metrics collected by the usage metering agent are stored in BigQuery and, together with
data obtained from billing export, analyzed in a Data Studio dashboard.
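
Spotting waste comes down to dividing consumed by requested. The sketch below uses the CPU-hour amounts from the slide's table; the 25% threshold is an arbitrary assumption, and the data is hard-coded rather than read from BigQuery.

```python
from typing import Dict, List, Tuple

# (CPU hours requested, CPU hours consumed) per namespace, from the slide's table.
usage: Dict[str, Tuple[int, int]] = {
    "Namespace-1": (16041, 1343),
    "Namespace-2": (3208, 2460),
    "Kube-system": (1548, 762),
}

def utilization(requested: float, consumed: float) -> float:
    """Fraction of the requested resources that was actually consumed."""
    return consumed / requested

def over_provisioned(data: Dict[str, Tuple[int, int]], threshold: float = 0.25) -> List[str]:
    """Namespaces that consume less than the threshold share of what they request."""
    return sorted(
        name for name, (req, used) in data.items() if utilization(req, used) < threshold
    )

candidates = over_provisioned(usage)
```

Namespace-1 consumes well under a tenth of the CPU it requests, so its resource requests are the first place to look for savings.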

Compare the costs of different storage alternatives before
deciding which one to use
Choose a storage service that meets your
capacity requirements at a reasonable cost:

● Storing 1 GB in Firestore is free.

● Storing 1 GB in Cloud Bigtable would be around $500/month.

Earlier in the course, we talked about all of the different storage services. It’s
important to compare the costs of the different options as well as their characteristics.

In other words, your storage and database service choice can make a significant
difference to your bill.

Consider alternative services to save cost rather than
allocating more resources
● CDN
● Caching
● Messaging
● Queueing
● Etc.

Your architectural design can also help you optimize your costs.

For example, if you use Cloud CDN for static content or Memorystore as a cache, you
can save money instead of allocating more resources. Similarly, instead of using a
datastore between two applications, consider messaging/queuing with Pub/Sub to decouple
communicating services and reduce storage needs.

Use the Google Cloud Pricing Calculator to estimate costs
● Base your cost estimates on your
forecasting and capacity planning.

● Compare the costs of different compute and storage services.

https://cloud.google.com/products/calculator

The pricing calculator should be your go-to resource for estimating costs. Your
estimates should be based on your forecasting and capacity planning. The tool is
great for comparing costs of different compute and storage services, and you will use
it in the upcoming design activity.

[Google Cloud Pricing Calculator: https://cloud.google.com/products/calculator]


Billing reports provide detailed cost breakdowns

To monitor the costs of your existing service, leverage the Cloud Billing Reports page
as shown here. This report shows the changes in costs compared to the previous
month, and you can use the filters to search for particular projects, products, and
regions, as shown on the right.

The sizing recommendations for your Compute Engine instances will also be in this
report.

For advanced cost analysis, export billing data to BigQuery

For advanced cost analysis I recommend exporting your billing data to BigQuery, as
shown in this screenshot. You can then analyze the billing data to identify large
expenses and optimize your Google Cloud spend.

For example, let’s assume you label VM instances that are spread across different
regions. Maybe these instances are sending most of their traffic to a different
continent, which could incur higher costs. In that case, you might consider relocating
some of those instances or using a caching service like Cloud CDN to cache content
closer to your users, which reduces your networking spend.
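
The kind of analysis described here is a group-and-sum over exported rows. The sketch below runs on a handful of made-up, simplified records in plain Python; the field names are hypothetical and do not reflect the real export schema, and in practice the aggregation would be a query against the BigQuery table.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Made-up, simplified billing rows; not the real export schema.
rows: List[dict] = [
    {"region": "us-central1", "sku": "Compute", "cost": 120.0},
    {"region": "us-central1", "sku": "Intercontinental egress", "cost": 310.0},
    {"region": "europe-west1", "sku": "Compute", "cost": 95.0},
    {"region": "europe-west1", "sku": "Intercontinental egress", "cost": 12.5},
]

def cost_by(records: List[dict], key: str) -> Dict[str, float]:
    """Sum cost grouped by one field, such as region or SKU."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        totals[record[key]] += record["cost"]
    return dict(totals)

def top_expense(records: List[dict]) -> Tuple[str, str, float]:
    """Return the (region, sku, cost) of the single largest line item."""
    biggest = max(records, key=lambda r: r["cost"])
    return biggest["region"], biggest["sku"], biggest["cost"]

by_region = cost_by(rows, "region")
largest = top_expense(rows)
```

In this invented data set, egress from one region dominates the bill, which is the signal that would prompt relocating instances or adding a cache.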

Visualize spend with Google Data Studio

[Screenshot: Billing Dashboard in Google Data Studio with three panels]
● Daily View: Today’s Spend by Service
● Monthly View: Month-to-Date Spend by Service
● Overall: Month-to-Date Spend by Project

You can even visualize spend over time with Google Data Studio, which turns your
data into informative dashboards and reports that are easy to read, easy to share,
and fully customizable.

The service data is displayed in a daily and monthly view, providing at-a-glance
summaries that you can also drill down into for greater insight.

Set budgets and alerts to keep your team aware of how
much they are spending

Programmatic Budgets: Pub/Sub → Cloud Functions

To help with project planning and controlling costs, you can set a budget. Setting a
budget lets you track how your spend is growing toward that amount. This screenshot
shows the budget creation interface:
1. Set a budget name and specify which project this budget applies to.
2. Set the budget at a specific amount or match it to the previous month's spend.
3. Set the budget alerts. These alerts send emails to Billing Admins after spend
exceeds a percent of the budget or a specified amount.

In our case, it would send an email when spending reaches 50%, 90%, and 100% of
the budget amount. You can even choose to send an alert when the spend is
forecasted to exceed a percent of the budget amount by the end of the budget
period.

In addition to receiving an email, you can use Pub/Sub notifications to
programmatically receive spend updates about this budget. You could even create a
Cloud Function that listens to the Pub/Sub topic to automate cost management.
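
A Pub/Sub-triggered function for this could look roughly like the sketch below. It assumes a background-function style signature with a base64-encoded JSON payload, and it assumes the payload carries `costAmount` and `budgetAmount` fields; check the current Cloud Billing documentation for the exact notification format. The returned strings are placeholders for whatever action you want to automate.

```python
import base64
import json
from typing import List, Optional, Sequence

THRESHOLDS = (0.5, 0.9, 1.0)  # the 50%, 90%, and 100% rules from the example

def crossed_thresholds(
    cost: float, budget: float, thresholds: Sequence[float] = THRESHOLDS
) -> List[float]:
    """Return every threshold that current spend has reached."""
    return [t for t in thresholds if cost >= t * budget]

def handle_budget_message(event: dict, context: Optional[object] = None) -> str:
    """Decode a budget notification and decide what to do about it."""
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    crossed = crossed_thresholds(payload["costAmount"], payload["budgetAmount"])
    if not crossed:
        return "ok"
    if crossed[-1] >= 1.0:
        return "over budget"  # placeholder: for example, scale down test resources
    return f"warning: {round(crossed[-1] * 100)}% of budget reached"

# Simulated message, shaped like the assumed notification payload.
sample_event = {
    "data": base64.b64encode(
        json.dumps({"costAmount": 95.0, "budgetAmount": 100.0}).encode("utf-8")
    )
}
result = handle_budget_message(sample_event)
```

Keeping the threshold logic in its own function makes it easy to test without a Pub/Sub topic.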
Agenda
Managing Versions

Cost Planning

Monitoring Dashboards

Let’s get into monitoring and visualizing information with dashboards.

Google Cloud unifies the tools you need to monitor
your service SLOs and SLAs

Monitoring, Logging, Trace, Debugger, Error Reporting, Profiler

Google Cloud unifies the tools you need to monitor your service SLOs and SLAs in
real time.

These tools include Monitoring, Logging, Trace, Debugger, Error Reporting, and
Profiler. All of these enable you to gain the insights you need to achieve your SLOs
and determine the root cause in those rare cases that you do not achieve your SLOs.
Monitoring dashboards monitor your services

● Monitor the things you pay for:

○ CPU use
○ Storage capacity
○ Reads and writes
○ Network egress
○ Etc.
● Monitor your SLIs to determine
whether you are meeting your SLOs.

Dashboards are one way for you to view and analyze metric data that is important to
you. This includes your SLIs to ensure that you are meeting your SLOs. The
Monitoring page of the Cloud Console automatically provides predefined dashboards
for the resources and services that you use. It is important that you monitor the things
you pay for to determine trends, bottlenecks, and potential cost savings.
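
To illustrate checking an SLI against an SLO, here is a minimal availability calculation. The request counts and the 99.9% objective are made-up values, and the error-budget helper is one common way to express the result rather than something taken from the slides.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """Availability SLI: the share of requests that were served successfully."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return good_requests / total_requests

def error_budget_remaining(good_requests: int, total_requests: int, slo: float) -> float:
    """Fraction of the allowed failures that is still unused (negative if overspent)."""
    failed = total_requests - good_requests
    allowed = (1 - slo) * total_requests
    return 1 - failed / allowed

# Hypothetical month: 100,000 requests, 50 failures, and a 99.9% availability SLO.
sli = availability_sli(99_950, 100_000)
meets_slo = sli >= 0.999
budget_left = error_budget_remaining(99_950, 100_000, slo=0.999)
```

Charting the SLI next to the objective on a dashboard shows at a glance how much margin is left.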
Example charts in a Monitoring dashboard

Here is an example of some charts in a Monitoring dashboard. On the left you can
see the CPU usage for different Compute Engine instances, and on the right is the
ingress traffic for those instances.

Charts like these provide valuable insights into usage patterns.

To help you get started, Cloud Monitoring creates
default dashboards for your project resources

To help you get started, Cloud Monitoring creates default dashboards for your project
resources, as shown in this screenshot. You can also create custom dashboards,
which you can explore in the upcoming lab.
Create uptime checks to monitor availability and
latency

Now, it’s a good idea to monitor latency, because it can quickly highlight when
problems are about to occur. As shown on this slide, you can easily create uptime
checks to monitor the availability and latency of your services. So far, this check
shows 100% uptime with no outages.

Latency is actually one of the four golden signals called out in Google’s site reliability
engineering, or SRE, book. SRE is a discipline that applies aspects of software
engineering to operations, with the goal of creating ultra-scalable and highly reliable
software systems. This discipline has enabled Google to build, deploy, monitor, and
maintain some of the largest software systems in the world.

I’ve linked the SRE book in the slides of this module
[https://landing.google.com/sre/books/].
Create alerts when your service fails to meet your
SLOs

Your SLO will be stricter than your SLA, so it is important to be alerted when you
are not meeting an SLO because it’s an early warning that the SLA is under threat.

Here is an example of what creating an alerting policy looks like. On the left, you can
see an HTTP check condition on the summer01 instance. This will send an email that
is customized with the content of the documentation section on the right.
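
The early-warning idea can be written as a simple classification. The level names and the threshold values below are assumptions chosen for illustration; a real alerting policy is configured in Cloud Monitoring rather than in application code.

```python
def alert_level(sli: float, slo: float, sla: float) -> str:
    """Classify a measured SLI against a strict SLO and a looser SLA."""
    if slo < sla:
        raise ValueError("the SLO should be at least as strict as the SLA")
    if sli < sla:
        return "sla_breach"   # the contractual commitment is already missed
    if sli < slo:
        return "slo_warning"  # early warning: the SLA is under threat
    return "ok"

# Hypothetical targets: a 99.95% internal SLO and a 99.9% SLA.
status = alert_level(sli=0.9992, slo=0.9995, sla=0.999)
```

The gap between the two thresholds is the time window in which the team can react before the SLA is affected.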
Activity 13: Cost estimating
and planning
Refer to your Design and Process
Workbook.
● Use the price calculator to create an
initial estimate for deploying your
case study application.

In this design activity, use Google Cloud’s pricing calculator to create an initial
estimate for deploying your case study application.
The pricing calculator gives you a form for each service, which you fill out to estimate
the cost of using that service. For example, in this screenshot I calculated the cost of
one custom SQL instance with 4 cores, 16 GB of RAM, and 500 GB of SSD storage.
This could represent the orders database of my online travel application.

Some of these estimates aren’t easy to generate because you might not know how
much data your storage and database services need and how much compute your
deployment platforms require. It can be even more challenging to estimate things
like network egress or the number of reads and writes. Start with a rough estimate
and refine it as your capacity plans improve.

Refer to activity 13 in your workbook for similar cost estimates for your case study.
Review Activity 13: Cost
estimating and planning
● Use the price calculator to create an
initial estimate for deploying your
case study application.

In this activity, you were asked to use the Google Cloud pricing calculator to estimate
the cost of your case study application.
Service name    Google Cloud Resource    Monthly cost
Orders          Cloud SQL                $1264.44
Inventory       Firestore                $215.41
Inventory       Cloud Storage            $1801.00
Analytics       BigQuery                 $214.72

Here’s a rough estimate for the database applications of my online travel portal,
ClickTravel.

I adjusted my orders database to include a failover replica for high availability and
came up with some high-level estimates for my other services. My inventory service
uses Cloud Storage to store JSON data in text files. Because this is my most
expensive service, I might want to reconsider the storage class or configure object
lifecycle management.
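
Totaling the estimate and finding the largest line item takes only a few lines. The figures are copied from the table above; the variable names are illustrative.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (service, Google Cloud resource, monthly cost in USD) from the estimate table.
estimates: List[Tuple[str, str, float]] = [
    ("Orders", "Cloud SQL", 1264.44),
    ("Inventory", "Firestore", 215.41),
    ("Inventory", "Cloud Storage", 1801.00),
    ("Analytics", "BigQuery", 214.72),
]

total = round(sum(cost for _, _, cost in estimates), 2)
most_expensive = max(estimates, key=lambda row: row[2])

per_service: Dict[str, float] = defaultdict(float)
for service, _, cost in estimates:
    per_service[service] += cost
per_service = {name: round(cost, 2) for name, cost in per_service.items()}
```

Grouping by service shows that the inventory service accounts for more than half of the estimate, which is why it is the first candidate for optimization.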

Again, this is just an example, and your costs would depend on your case study.
Monitoring Applications in Google Cloud
45 minutes

Lab Objectives
● Examine the Cloud Logs.
● View Profiler Information.
● Explore Cloud Trace.
● Monitor Resources using Dashboards.
● Create Uptime Checks and Alerts.

Monitoring, Logging, Trace, Profiler

We started this course with a discussion on defining SLOs and SLIs for your services.
This helps with the detailed design and architecture and helps developers know when
they are done implementing a service.

However, the SLIs and SLOs aren’t very useful if you don’t monitor your applications
to see whether you are meeting them. That’s where the monitoring tools come in. In
this lab you will see how to use some of these tools.

Specifically, you will examine logs, view Profiler information, explore tracing, monitor
your resources using Dashboards, and create Uptime Checks and Alerts.
Lab review
Monitoring Applications
in Google Cloud

In this lab, you saw how to monitor your applications using built-in Google Cloud
tools. First, you deployed an application to App Engine and examined Cloud logs.
Then, you viewed Profiler information and explored Cloud Trace. Last but not least, you
monitored your application with dashboards and created uptime checks and alerts.

You can stay for a lab walkthrough, but remember that Google Cloud's user interface
can change, so your environment might look slightly different.
Review
Maintenance and
Monitoring

In this module you learned about managing new versions of your microservices using
rolling updates, canary deployments, and blue/green deployments. It’s important
when deploying microservices that you deploy new versions with no downtime, but
also that the new versions don’t break the clients that use your services.

You also learned about cost planning and optimization, and you estimated the cost of
running your case study application.

You finished the module by learning how to leverage the monitoring tools provided by
Google Cloud. These tools can be invaluable for managing your services and
monitoring your SLIs and SLOs.
[P] Thank you for taking the “Reliable Cloud Infrastructure: Design and Process”
course! We hope you have a better understanding of how to design applications and
services that make best use of the platform services provided by Google Cloud.

[S] We also hope that the design activities and labs made you feel more comfortable
with design and process in Google Cloud.

[P] Now it’s your turn. Go ahead and apply what you have learned by designing your
own applications, deployments, and monitoring.

[S] See you next time!

09 Maintenance and Monitoring
No ratings yet
09 Maintenance and Monitoring
35 pages
Google Cloud Architect Exams Questions
No ratings yet
Google Cloud Architect Exams Questions
40 pages
CCS336 CSM
No ratings yet
CCS336 CSM
36 pages
CSM Mannual 2025
No ratings yet
CSM Mannual 2025
33 pages
Ccs336 CSM Lab Manual
No ratings yet
Ccs336 CSM Lab Manual
30 pages
CSM Lab Manual for Students
No ratings yet
CSM Lab Manual for Students
36 pages
Share It Cloud Lab
No ratings yet
Share It Cloud Lab
31 pages
CCC336-CSM Manual
No ratings yet
CCC336-CSM Manual
32 pages
Cloud Setup & Cost Modeling Guide
No ratings yet
Cloud Setup & Cost Modeling Guide
30 pages
Microservices PDF
No ratings yet
Microservices PDF
6 pages
12 Factor Applications With Docker and Go (Tit Petric) (Z-Library)
100% (2)
12 Factor Applications With Docker and Go (Tit Petric) (Z-Library)
148 pages
Module 3 - Planning and Configuring A Cloud Solution
No ratings yet
Module 3 - Planning and Configuring A Cloud Solution
29 pages
GCP Intro 20160721 160720074302
No ratings yet
GCP Intro 20160721 160720074302
84 pages
Week 3 GCP Lec Notes
No ratings yet
Week 3 GCP Lec Notes
14 pages
Robust:: Serving SSCA App
No ratings yet
Robust:: Serving SSCA App
13 pages
GCP App Engine Certification Guide
No ratings yet
GCP App Engine Certification Guide
8 pages
Week 4 GCP Lec Notes
100% (1)
Week 4 GCP Lec Notes
23 pages
GCP Associate Cloud Engineer Guide
No ratings yet
GCP Associate Cloud Engineer Guide
339 pages
Module 5 - Ensuring Successful Operation of A Cloud Solution
No ratings yet
Module 5 - Ensuring Successful Operation of A Cloud Solution
27 pages
Google Certified Associate Cloud Engineer
No ratings yet
Google Certified Associate Cloud Engineer
346 pages
ACE Prep - Google
100% (1)
ACE Prep - Google
104 pages
12 Factor Principles
No ratings yet
12 Factor Principles
5 pages
Dzone2018 Researchguide Microservice PDF
No ratings yet
Dzone2018 Researchguide Microservice PDF
46 pages
ACE Exam 8
No ratings yet
ACE Exam 8
43 pages
Presentation 4
No ratings yet
Presentation 4
13 pages
CSM Full Record
No ratings yet
CSM Full Record
37 pages
Workbook - Design & Process
No ratings yet
Workbook - Design & Process
34 pages
GCP Associate Guide
No ratings yet
GCP Associate Guide
14 pages
04 Modernize Infrastructure and Applications With Google Cloud
No ratings yet
04 Modernize Infrastructure and Applications With Google Cloud
3 pages
Cracking Microservices Interviews v1.3 PDF
40% (5)
Cracking Microservices Interviews v1.3 PDF
157 pages
Google Cloud App Eng Notes
No ratings yet
Google Cloud App Eng Notes
3 pages
Google Cloud Compute Engine Guide
No ratings yet
Google Cloud Compute Engine Guide
39 pages
Mod-2 CCII
No ratings yet
Mod-2 CCII
27 pages
Azure Fundamentals Course Book
100% (1)
Azure Fundamentals Course Book
324 pages
Week 4 GCP Notes
No ratings yet
Week 4 GCP Notes
7 pages
M6 - T-GCPFCI-B - Core Infrastructure 5.0 - ILT PDF
No ratings yet
M6 - T-GCPFCI-B - Core Infrastructure 5.0 - ILT PDF
37 pages
File Module Slides 6 Deploying Applications To Google Cloud en - en
No ratings yet
File Module Slides 6 Deploying Applications To Google Cloud en - en
16 pages
Cloud App Design for Developers
No ratings yet
Cloud App Design for Developers
4 pages
Case Study On Google Cloud Platform
75% (4)
Case Study On Google Cloud Platform
4 pages
Course Presentation GoogleCloudDigitalLeader
No ratings yet
Course Presentation GoogleCloudDigitalLeader
254 pages
Module 5 - Ensuring Successful Operation of A Cloud Solution
No ratings yet
Module 5 - Ensuring Successful Operation of A Cloud Solution
41 pages
Professional Cloud Architect
No ratings yet
Professional Cloud Architect
42 pages
GCP Cloud Developer Exam Guide
100% (1)
GCP Cloud Developer Exam Guide
443 pages
Overview of GCP Compute Engine
No ratings yet
Overview of GCP Compute Engine
28 pages
GCP Fund Module 9 Summary and Review
No ratings yet
GCP Fund Module 9 Summary and Review
13 pages
Google Certified Associate Cloud Engineer
No ratings yet
Google Certified Associate Cloud Engineer
333 pages
(Git) (Non HTTP WSDL) : I. Codebase VII. Port Binding
No ratings yet
(Git) (Non HTTP WSDL) : I. Codebase VII. Port Binding
6 pages
Google Certified Professional Cloud Architect
100% (1)
Google Certified Professional Cloud Architect
446 pages
Module 2
No ratings yet
Module 2
6 pages
Service Deployment Concepts Done
No ratings yet
Service Deployment Concepts Done
14 pages


