Course: Monitoring
Module Name: Monitoring Tools
Instructor: Harshwardhan Singh
Today’s Agenda
● VDI & Citrix
Monitoring
Monitoring in the context of IT or systems management typically refers to the
practice of observing and tracking the performance and health of various
components in a system or network. Here are a few key pointers about
monitoring:
1. Purpose: Monitoring is essential for ensuring the reliability, availability, and
performance of systems. It helps in identifying issues, troubleshooting
problems, and optimizing performance.
2. Tools: There are various monitoring tools available, such as Nagios, Zabbix,
and Prometheus for collecting and analyzing monitoring data, and Grafana for
visualizing it in dashboards.
3. Types of Monitoring: There are several types of monitoring, including:
- Performance Monitoring: Tracking metrics like CPU usage, memory
usage, disk I/O, etc., to ensure optimal performance.
- Availability Monitoring: Checking if services are up and running and
accessible to users.
- Security Monitoring: Monitoring for suspicious activity or breaches in
security.
- Network Monitoring: Observing network traffic and devices to ensure
smooth communication.
- Application Monitoring: Monitoring the performance and behavior of
applications.
4. Alerting: Monitoring systems often include alerting mechanisms to notify
administrators or users when certain thresholds are breached or issues are
detected.
5. Logging: Monitoring often goes hand-in-hand with logging, where logs are
collected and analyzed to provide a detailed record of events and activities in
a system.
6. Trending and Analysis: Monitoring data can be used for trend analysis and
capacity planning, helping to predict future resource needs and optimize
system performance.
7. Automation: With the rise of DevOps practices, monitoring is often
8. Cloud Monitoring: In cloud environments, monitoring is crucial for
managing resources, optimizing costs, and ensuring the performance and
availability of cloud services.
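The alerting idea in point 4 can be sketched as a simple threshold check. This is a minimal illustration, not the behaviour of any particular tool; the metric names, the threshold values, and the `Threshold` and `evaluate` names are assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str      # hypothetical metric name, e.g. "cpu_percent"
    warning: float   # value at or above which a warning is raised
    critical: float  # value at or above which a critical alert is raised


def evaluate(sample: dict[str, float], thresholds: list[Threshold]) -> list[str]:
    """Return one alert message per metric that breaches a threshold."""
    alerts = []
    for t in thresholds:
        value = sample.get(t.metric)
        if value is None:
            continue  # this metric was not reported in the sample
        if value >= t.critical:
            alerts.append(f"CRITICAL {t.metric}={value}")
        elif value >= t.warning:
            alerts.append(f"WARNING {t.metric}={value}")
    return alerts


# Illustrative limits only; real values depend on the system being monitored.
THRESHOLDS = [
    Threshold("cpu_percent", warning=80.0, critical=95.0),
    Threshold("disk_used_percent", warning=85.0, critical=95.0),
]

if __name__ == "__main__":
    sample = {"cpu_percent": 97.5, "disk_used_percent": 60.0}
    for alert in evaluate(sample, THRESHOLDS):
        print(alert)
```

In practice the returned messages would be handed to a notification channel such as email rather than printed.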
Why do we need “Monitoring”
● Performance Optimization: Monitoring helps identify bottlenecks and
areas of improvement in systems, networks, and applications, allowing
for performance optimization and better resource utilization.
● Issue Detection: Monitoring alerts IT teams to potential issues such as
system failures, performance degradation, or security breaches, enabling
quick resolution before they impact users or business operations.
● Capacity Planning: Monitoring helps in understanding current resource
usage trends and predicting future requirements, facilitating effective
capacity planning and resource allocation.
● Security: Monitoring helps detect and mitigate security threats, ensuring
the integrity and confidentiality of data and systems.
● Compliance: Monitoring helps organizations comply with regulatory
requirements by ensuring that systems meet specific performance,
availability, and security standards.
● Business Continuity: Monitoring contributes to the overall business
continuity strategy by reducing downtime and ensuring that critical
systems and services are available when needed.
● Cost Optimization: Monitoring can lead to cost savings by identifying
inefficient resource usage and enabling better decision-making regarding
resource provisioning and utilization.
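The capacity planning point above can be made concrete with a small trend calculation. The sketch below fits a straight line to daily disk usage and estimates how many days remain before the disk is full. It assumes roughly linear growth and one sample per day; the function names and the figures are made up for illustration.

```python
def linear_fit(xs, ys):
    """Least-squares slope and intercept for the points (xs[i], ys[i])."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_full(daily_used_gb, capacity_gb):
    """Estimate days from the last sample until usage reaches capacity.

    Returns None when usage is flat or shrinking, since no date can be projected.
    """
    if len(daily_used_gb) < 2:
        raise ValueError("need at least two samples to fit a trend")
    days = list(range(len(daily_used_gb)))
    slope, intercept = linear_fit(days, daily_used_gb)
    if slope <= 0:
        return None
    day_full = (capacity_gb - intercept) / slope
    return max(0.0, day_full - days[-1])


if __name__ == "__main__":
    usage = [100, 110, 120, 130, 140]  # GB used on five consecutive days
    print(days_until_full(usage, capacity_gb=200))
```

A straight-line fit is the simplest possible model; seasonal or bursty workloads usually need something more careful.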
Tools
● Zenoss
● SolarWinds
● SCOM
● Dynatrace
● AppDynamics
Zenoss
● Purpose: Zenoss is a monitoring platform, available as the open-source Zenoss
Core and as commercial editions, that provides unified monitoring of IT
infrastructure and applications.
● Features: It offers real-time monitoring, event management, performance
analytics, and automated remediation.
● Deployment: It can be deployed on-premises or in the cloud.
Here's how Zenoss works:
1. Data Collection: Zenoss collects data from different sources, largely without
installing agents, using protocols such as SNMP (Simple Network Management
Protocol) for network devices. It can also collect data from logs, APIs, and
other sources.
2. Data Processing: Once the data is collected, Zenoss processes it to extract
relevant information about the performance and health of the monitored
devices and applications.
3. Monitoring: Zenoss monitors various aspects of the IT environment, such as CPU
usage, memory usage, disk space, network traffic, and application performance, to
ensure they are operating within acceptable levels.
4. Alerting: Zenoss can be configured to raise alerts when it detects issues or
anomalies in the monitored devices or applications. These alerts can be sent via
email, SMS, or other channels.
5. Reporting: Zenoss provides reporting capabilities to help IT teams track and
analyze the performance of the IT environment over time. Reports can be
customized to show specific metrics and trends.
6. Automation: Zenoss supports automation through its integration with other tools
and systems. It can trigger automated actions based on predefined conditions,
helping to streamline IT operations and improve efficiency.
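The steps above (collect, process, monitor, alert) can be pictured as a small pipeline. The sketch below is a generic illustration of that flow and does not use or imitate the Zenoss API; every function name, device name, and limit in it is hypothetical, and the collector returns hard-coded samples instead of polling real devices.

```python
from __future__ import annotations


def collect() -> list[dict]:
    """Stand-in for data collection (SNMP polls, logs, APIs)."""
    return [
        {"device": "router-1", "metric": "cpu_percent", "value": "42"},
        {"device": "db-1", "metric": "disk_used_percent", "value": "93"},
    ]


def process(raw: list[dict]) -> list[dict]:
    """Normalise raw samples by converting string values to floats."""
    return [{**sample, "value": float(sample["value"])} for sample in raw]


# Acceptable upper limits per metric (illustrative values).
LIMITS = {"cpu_percent": 90.0, "disk_used_percent": 90.0}


def monitor(samples: list[dict], limits: dict[str, float]) -> list[dict]:
    """Return the samples whose value exceeds the limit for their metric."""
    return [
        s for s in samples
        if s["value"] > limits.get(s["metric"], float("inf"))
    ]


def alert(events: list[dict], notify) -> None:
    """Send one message per event through the given notify callable."""
    for e in events:
        notify(f"{e['device']}: {e['metric']} is {e['value']}")


def run_cycle(notify=print) -> list[dict]:
    """One pass through the pipeline: collect, process, monitor, alert."""
    events = monitor(process(collect()), LIMITS)
    alert(events, notify)
    return events


if __name__ == "__main__":
    run_cycle()
```

Passing a different `notify` callable (for example one that sends email, or one that triggers a remediation script) is how the alerting and automation steps would plug into the same loop.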
SolarWinds
● Purpose: SolarWinds provides a range of IT management and monitoring
tools, including network management, system management, database
management, and security management.
● Features: SolarWinds products offer monitoring, alerting, reporting, and
troubleshooting capabilities for various aspects of IT infrastructure.
● Deployment: SolarWinds tools can be deployed on-premises or in the cloud.
Here's how SolarWinds Network Performance Monitor (NPM), its network monitoring product, works:
● Discovery: The first step is to discover the network devices and infrastructure that
you want to monitor. SolarWinds NPM uses various discovery methods to
automatically identify devices on the network.
● Polling: Once the devices are discovered, SolarWinds NPM polls these devices at
regular intervals to collect data about their performance and status. This data
includes metrics such as CPU usage, memory usage, bandwidth utilization, and
packet loss.
● Alerting: SolarWinds NPM includes a powerful alerting system that can notify you
when certain conditions are met, such as high CPU usage or a device going offline.
You can configure alerts based on thresholds and conditions that are relevant to
your network.
● Reporting: SolarWinds NPM provides reporting capabilities that allow you to
generate reports on various aspects of your network, such as performance trends,
uptime reports, and capacity planning.
● Integration: SolarWinds NPM can integrate with other SolarWinds products as well
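The polling step mentions bandwidth utilization and packet loss. As a rough sketch of how a poller can derive such metrics from two successive readings, the code below uses interface octet counters and probe counts. It is not SolarWinds code; the function names are hypothetical, the counter is assumed to be 32 bits wide, and at most one counter wrap is assumed between two polls.

```python
COUNTER32_MAX = 2 ** 32  # a 32-bit counter wraps back to 0 at this value


def counter_delta(previous: int, current: int, modulus: int = COUNTER32_MAX) -> int:
    """Difference between two readings of a wrapping counter."""
    return (current - previous) % modulus


def bandwidth_utilization(prev_octets: int, curr_octets: int,
                          interval_s: float, link_speed_bps: float) -> float:
    """Percentage of link capacity used during one polling interval."""
    bits = counter_delta(prev_octets, curr_octets) * 8
    return 100.0 * bits / (interval_s * link_speed_bps)


def packet_loss(sent: int, received: int) -> float:
    """Percentage of probes that received no reply."""
    if sent <= 0:
        raise ValueError("at least one probe must be sent")
    return 100.0 * (sent - received) / sent


if __name__ == "__main__":
    # 375,000,000 octets in 60 seconds on a 100 Mbit/s link
    print(bandwidth_utilization(0, 375_000_000, 60, 100_000_000))
    print(packet_loss(sent=10, received=9))
```

Shorter polling intervals give finer detail and make counter wraps less likely, at the cost of more load on the devices and the poller.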
SCOM (System Center Operations Manager)
● Purpose: SCOM is a Microsoft product that provides infrastructure monitoring
and application performance monitoring.
● Features: It offers monitoring for servers, applications, and network devices,
as well as reporting and alerting capabilities.
● Deployment: SCOM is typically deployed on-premises in Windows
environments.
Here's how SCOM works:
● Agent Deployment: SCOM uses agents to collect monitoring data from
servers, applications, and other devices in the IT environment. Agents are
installed on the systems that need to be monitored.
● Data Collection: The agents collect data such as performance metrics, event
logs, and other relevant information from the monitored systems. This data is
then sent to the SCOM management server for processing.
● Management Server: The SCOM management server is the central component
of the SCOM infrastructure. It receives data from the agents, processes it, and
stores it in the SCOM database.
● Management Packs: SCOM uses management packs to define the monitoring
requirements for different types of systems and applications. These
management packs include rules, monitors, and alerts that specify how SCOM
should monitor and respond to events in the IT environment.
● Alerts and Notifications: SCOM generates alerts based on the monitoring data
it collects. These alerts are used to notify IT administrators about potential
issues in the IT environment. SCOM can also be configured to send
notifications via email or other channels.
● Reporting: SCOM provides reporting capabilities that allow IT administrators to
generate reports on the performance and health of their IT environment.
These reports can help identify trends, analyze performance, and plan for
future capacity needs.
● Integration: SCOM integrates with other Microsoft System Center products,
such as System Center Configuration Manager (SCCM) and System Center
Virtual Machine Manager (SCVMM), to provide comprehensive management
capabilities for the entire IT infrastructure.
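The idea of a pack that bundles monitors and alert definitions for one type of system can be modelled in a few lines. The sketch below is a conceptual illustration only and has nothing to do with the real SCOM pack format or SDK; the `Monitor` and `Pack` classes, the health states, and the disk thresholds are assumptions made for this example. Each monitor keeps a health state, and an alert is raised only when that state changes to something unhealthy.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Monitor:
    name: str
    check: Callable[[dict], str]  # returns "healthy", "warning" or "critical"
    state: str = "healthy"        # last known health state


@dataclass
class Pack:
    target: str                   # kind of system this pack applies to
    monitors: list[Monitor] = field(default_factory=list)

    def evaluate(self, data: dict) -> list[str]:
        """Run every monitor and return alerts for new unhealthy states."""
        alerts = []
        for m in self.monitors:
            new_state = m.check(data)
            if new_state != m.state:
                if new_state != "healthy":
                    alerts.append(f"{self.target}/{m.name}: {m.state} -> {new_state}")
                m.state = new_state
        return alerts


def free_disk_check(data: dict) -> str:
    """Classify free disk space (illustrative thresholds)."""
    free = data["free_disk_percent"]
    if free < 5:
        return "critical"
    if free < 15:
        return "warning"
    return "healthy"


if __name__ == "__main__":
    pack = Pack("windows-server", [Monitor("free-disk", free_disk_check)])
    for reading in (50, 10, 10, 3):
        print(reading, pack.evaluate({"free_disk_percent": reading}))
```

Alerting on state changes instead of on every sample keeps a long-running problem from producing a new alert at every collection interval.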
Dynatrace
● Purpose: Dynatrace is an application performance monitoring (APM) tool that
provides full-stack monitoring for applications, including user experience
monitoring, application performance monitoring, and infrastructure
monitoring.
● Features: It offers real-time monitoring, automatic discovery of dependencies,
AI-powered root cause analysis, and automated remediation.
● Deployment: Dynatrace can be deployed on-premises or in the cloud.
Here's how Dynatrace works:
● Deployment: Dynatrace can be deployed in different environments, including on-
premises, hybrid, or cloud. Once deployed, it starts monitoring the application and
its infrastructure.
● Automatic Discovery: Dynatrace automatically discovers all components of your
application, including servers, databases, services, and dependencies. It builds a
dynamic topology map of your application environment.
● Monitoring: Dynatrace continuously monitors the performance of your application
and its components in real-time. It collects metrics such as response times, error
rates, and resource utilization.
● User Monitoring: Dynatrace provides insights into user experience by monitoring
real user interactions with your application. It captures user actions, load times,
and error rates.
● Root Cause Analysis: When performance issues occur, Dynatrace uses artificial
intelligence (AI) to perform root cause analysis. It correlates data from different
components to identify the underlying cause of the problem.
● Alerting: Dynatrace provides alerting capabilities to notify you of performance
issues or anomalies. Alerts can be configured based on predefined thresholds or AI-
based anomaly detection.
● Reporting and Analytics: Dynatrace offers reporting and analytics features to help
you understand the performance of your application over time. It provides insights
into trends, patterns, and areas for improvement.
● Integration: Dynatrace can integrate with other tools and platforms, such as
DevOps and CI/CD tools, to provide end-to-end visibility and automation in the
application lifecycle.
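To see why a topology map helps with root cause analysis, consider a deliberately simplified rule: among the components that are currently unhealthy, the likely root causes are those that do not depend on any other unhealthy component. The sketch below applies that rule to a hand-written dependency map. It is not how Dynatrace's AI engine works internally; the `root_causes` function and the component names are hypothetical.

```python
from __future__ import annotations


def root_causes(depends_on: dict[str, set[str]], unhealthy: set[str]) -> set[str]:
    """Unhealthy components none of whose direct dependencies are unhealthy."""
    return {
        component
        for component in unhealthy
        if not (depends_on.get(component, set()) & unhealthy)
    }


# Hypothetical topology: each component maps to the components it calls.
TOPOLOGY = {
    "frontend": {"checkout-service", "auth-service"},
    "checkout-service": {"database"},
    "auth-service": set(),
    "database": set(),
}

if __name__ == "__main__":
    failing = {"frontend", "checkout-service", "database"}
    print(root_causes(TOPOLOGY, failing))
```

Here the frontend and the checkout service are slow only because the database is, so a single problem is reported instead of three separate alerts.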
AppDynamics
● Purpose: AppDynamics is an APM tool that provides monitoring and analytics
for applications and business transactions.
● Features: It offers real-time visibility into application performance, code-level
diagnostics, business transaction monitoring, and end-user monitoring.
● Deployment: AppDynamics can be deployed on-premises or in the cloud.
Here's how AppDynamics works:
● Agent Installation: AppDynamics requires agents to be installed on the
application servers and infrastructure components that you want to monitor.
These agents collect performance data and send it to the AppDynamics
Controller.
● AppDynamics Controller: The Controller is the central component of
AppDynamics. It receives data from the agents, processes it, and stores it in a
database. The Controller provides a web-based interface for users to view and
analyze the performance data.
● Monitoring Application Performance: AppDynamics monitors various aspects
of application performance, including response times, errors, and resource
usage (CPU, memory, etc.). It also provides visibility into the performance of
individual transactions within the application.
● Business Transaction Correlation: One of the key features of AppDynamics is
its ability to correlate performance data with specific business transactions.
This allows users to see how application performance is impacting business
metrics such as revenue or customer satisfaction.
● Alerting and Reporting: AppDynamics can be configured to send alerts when
performance metrics exceed predefined thresholds. It also provides reporting
features that allow users to analyze performance trends over time.
● Automatic Root Cause Analysis: AppDynamics uses machine learning
algorithms to analyze performance data and automatically identify the root
cause of performance issues. This can help IT teams quickly diagnose and
resolve issues before they impact users.
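The business transaction idea can be illustrated by grouping individual requests under the transaction they belong to and computing per-transaction metrics. The sketch below is a generic example and does not use the AppDynamics agent or Controller APIs; the record layout, the function names, and the 500 ms limit are assumptions.

```python
from collections import defaultdict


def summarize(requests):
    """Aggregate (transaction, response_ms, is_error) records per transaction."""
    grouped = defaultdict(list)
    for name, response_ms, is_error in requests:
        grouped[name].append((response_ms, is_error))
    summary = {}
    for name, rows in grouped.items():
        summary[name] = {
            "calls": len(rows),
            "avg_ms": sum(ms for ms, _ in rows) / len(rows),
            "error_rate": sum(1 for _, err in rows if err) / len(rows),
        }
    return summary


def slow_transactions(summary, max_avg_ms):
    """Names of transactions whose average response time exceeds the limit."""
    return sorted(name for name, s in summary.items() if s["avg_ms"] > max_avg_ms)


if __name__ == "__main__":
    requests = [
        ("checkout", 1200, False),
        ("checkout", 800, True),
        ("login", 100, False),
        ("login", 140, False),
    ]
    summary = summarize(requests)
    print(summary)
    print(slow_transactions(summary, max_avg_ms=500))
```

Reporting per transaction ("checkout is slow and half its calls fail") ties a technical symptom to the business activity it affects, which is the point of the correlation feature described above.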