SNAMP

From Wikipedia, the free encyclopedia
SNAMP
Developer(s): Bytex Solutions OÜ
Initial release: July 7, 2016
Stable release: 2.0.0 / July 24, 2017
Written in: Java
Operating system: Cross-platform
Type: Cluster management software, system monitoring
License: Apache License 2.0
Website: snamp.io

SNAMP is an open-source, cross-platform software platform for telemetry, tracing and elasticity management[1] of distributed applications.

Overview

The main purpose of SNAMP is to simplify the management of microservices running inside containers or software-defined data centers. It provides telemetry data gathering (metrics and events), end-to-end tracing of requests between software components that reflects the topology of communication paths, automatic scaling of cluster nodes based on workload, and unification of telemetry data sources.

Telemetry

Metrics, events and health checks are used to control the state and health of services in the IT landscape. Data is gathered in real time. The collected data can be used for visualization in the form of charts, for alarming and for executing maintenance actions. It is possible to set up a watcher for an important metric. The watcher defines the limits and conditions applied to the value of the metric. If the value is out of range, the watcher can execute a trigger. The trigger can be a handwritten script in one of the supported scripting languages, e.g. Groovy. It can perform a maintenance action (such as restarting a server node) or send a notification (such as an e-mail alert).
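
The watcher and trigger relationship described above can be illustrated with a minimal Python sketch. It is not SNAMP's API: the `Watcher` class, the `notify` function and the metric name `heap_used_percent` are hypothetical names chosen for illustration, and the trigger only records a message instead of sending an e-mail.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Watcher:
    """Checks a metric value against an allowed range (hypothetical helper)."""
    metric: str
    low: float
    high: float
    trigger: Callable[[str, float], None]

    def observe(self, value: float) -> bool:
        """Run the trigger if the value is out of range; return True if it fired."""
        if self.low <= value <= self.high:
            return False
        self.trigger(self.metric, value)
        return True


alerts: List[str] = []


def notify(metric: str, value: float) -> None:
    # Stand-in for a notification such as an e-mail alert.
    alerts.append(f"{metric} out of range: {value}")


watcher = Watcher(metric="heap_used_percent", low=0.0, high=85.0, trigger=notify)
watcher.observe(42.0)  # within range, the trigger is not called
watcher.observe(93.5)  # out of range, the trigger is called once
```

Passing the trigger as a callable keeps the range check separate from the action, which mirrors the way the text separates the watcher from the script it executes.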

Additionally, SNAMP can extend the functionality of a monitoring tool already used in the enterprise. It can gather telemetry data and expose that data to the outside using any combination of supported protocols. For example, data collected over JMX can be exposed through SNMP and acquired by other network management software such as Nagios.

Tracing of requests

Tracing of requests makes it possible to identify communication paths between services and to collect important metrics such as response time, requests per second, availability and scalability. This data helps to troubleshoot latency and scalability issues and to find bottlenecks. Additionally, communication paths can be visualized as a graph in the Web Console, which makes it possible to observe the entire IT landscape in real time.
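
As a rough illustration of how communication paths and per-path metrics can be derived from trace data, the following Python sketch aggregates a list of spans into graph edges. The `Span` record and the `build_paths` function are hypothetical and do not reflect SNAMP's internal data model; the service names and durations are made up.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Span:
    """One traced request from a caller service to a callee service."""
    caller: str
    callee: str
    duration_ms: float


def build_paths(spans: List[Span]) -> Dict[Tuple[str, str], Dict[str, float]]:
    """Group spans by (caller, callee) edge and compute simple statistics."""
    durations = defaultdict(list)
    for span in spans:
        durations[(span.caller, span.callee)].append(span.duration_ms)
    return {
        edge: {"requests": len(values), "avg_ms": sum(values) / len(values)}
        for edge, values in durations.items()
    }


spans = [
    Span("gateway", "orders", 40.0),
    Span("gateway", "orders", 60.0),
    Span("orders", "billing", 120.0),
]
paths = build_paths(spans)

# The edge with the highest average response time is a bottleneck candidate.
slowest = max(paths, key=lambda edge: paths[edge]["avg_ms"])
```

Each key of `paths` is one edge of the communication graph, so the same structure can feed both a graph visualization and a latency report.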

Applications must be instrumented to report the necessary information to SNAMP. Instrumentation libraries can be found on Maven Central[2] under the groupId io.snamp.instrumentation. Third-party instrumentation libraries are also supported:

  • OpenZipkin[3]
  • Apache HTrace[4]

Elasticity management

The elasticity manager is the SNAMP component responsible for automatic provisioning and decommissioning of cluster nodes. Its behavior is driven by scaling policies; one or more scaling policies can be associated with a cluster. The decision process is based on fuzzy logic. Each participating policy has its own vote weight, and the elasticity manager runs the voting process periodically. The result of a vote is one of three possible decisions: enlarge the cluster, shrink the cluster or do nothing. A scaling policy can be based on a health check, a handwritten script or a range of values associated with a metric. Because the decision process is flexible, several scaling strategies can be defined:

  • All-of strategy means that all scaling policies must vote for changing the capacity of the cluster
  • Any-of strategy means that at least one of the scaling policies must vote for changing the capacity of the cluster
  • Majority strategy means that a majority of scaling policies must vote for changing the capacity of the cluster

Moreover, a custom weight can be assigned to each scaling policy.
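
The three strategies and the per-policy weights can be sketched in Python as follows. This is a simplified illustration under stated assumptions: it uses plain weighted vote counting rather than the fuzzy logic mentioned above, and the `Vote` enum and the `passes` and `decide` functions are hypothetical names, not part of SNAMP. When both directions satisfy the strategy (possible with any-of), the sketch chooses to do nothing, which is also an assumption.

```python
from enum import Enum
from typing import List, Tuple


class Vote(Enum):
    ENLARGE = "enlarge"
    SHRINK = "shrink"
    NOTHING = "nothing"


def passes(weight: float, total: float, strategy: str) -> bool:
    """Check whether the weight behind one direction satisfies the strategy."""
    if strategy == "all-of":
        return weight == total
    if strategy == "any-of":
        return weight > 0
    if strategy == "majority":
        return weight > total / 2
    raise ValueError(f"unknown strategy: {strategy}")


def decide(votes: List[Tuple[Vote, float]], strategy: str) -> Vote:
    """Combine weighted policy votes into one of three decisions."""
    total = sum(weight for _, weight in votes)
    enlarge = sum(weight for vote, weight in votes if vote is Vote.ENLARGE)
    shrink = sum(weight for vote, weight in votes if vote is Vote.SHRINK)
    up = passes(enlarge, total, strategy)
    down = passes(shrink, total, strategy)
    if up and not down:
        return Vote.ENLARGE
    if down and not up:
        return Vote.SHRINK
    # Neither direction qualifies, or both do: leave the cluster unchanged.
    return Vote.NOTHING


# Three policies: two vote to enlarge (weights 2 and 1), one abstains (weight 1).
votes = [(Vote.ENLARGE, 2.0), (Vote.ENLARGE, 1.0), (Vote.NOTHING, 1.0)]
all_of = decide(votes, "all-of")
any_of = decide(votes, "any-of")
majority = decide(votes, "majority")
```

With these votes, the all-of strategy leaves the cluster unchanged because one policy abstained, while any-of and majority both decide to enlarge it.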

The elasticity manager sends provisioning and decommissioning commands to the underlying cluster or cloud management platform, which can be OpenStack, Kubernetes or VMware ESXi.

Web Console

The Web Console is used to visualize metrics in the form of charts, to visualize communication paths between services in the form of a graph, and to monitor the cluster. Using the Web Console for visualization is optional, because SNAMP provides integration with other tools such as Grafana.[5]

Architecture

The SNAMP platform consists of the following components:

  • Resource Connector[6] is responsible for communication between SNAMP and a service in the IT landscape. It encapsulates the communication protocol and exposes telemetry data to SNAMP in a unified way. For example, the JMX Connector can be used to control Java applications using the JMX protocol.
  • Gateway[7] exposes information collected from all resource connectors to the outside using the selected protocol. For example, the SNMP Gateway can expose telemetry data obtained from all resource connectors using the SNMP protocol.
  • Supervisor[8] controls a group of resources. It provides health monitoring, elasticity management and automatic discovery of resources.

A combination of different gateways and resource connectors can transform telemetry data from one protocol to another. Each component can be customized using Groovy-based scripts. It is also possible to write a custom component in any JVM-compatible language.
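
The connector and gateway split can be pictured with a small Python sketch. It only mirrors the idea of reading data through one interface and exposing it through another: `ResourceConnector`, `FakeJmxConnector` and `TextGateway` are hypothetical names, the connector returns canned values instead of talking to JMX, and the gateway renders plain text instead of speaking SNMP.

```python
from typing import Dict, Protocol


class ResourceConnector(Protocol):
    """Unified view of a managed resource, independent of its protocol."""

    def read_metrics(self) -> Dict[str, float]: ...


class FakeJmxConnector:
    """Stand-in connector; returns canned values instead of calling JMX."""

    def read_metrics(self) -> Dict[str, float]:
        return {"heap_used_mb": 512.0, "thread_count": 48.0}


class TextGateway:
    """Stand-in gateway; exposes metrics from all connectors as text lines."""

    def __init__(self, connectors: Dict[str, ResourceConnector]) -> None:
        self.connectors = connectors

    def render(self) -> str:
        lines = []
        for resource, connector in sorted(self.connectors.items()):
            for metric, value in sorted(connector.read_metrics().items()):
                lines.append(f"{resource}.{metric} = {value}")
        return "\n".join(lines)


gateway = TextGateway({"app-server": FakeJmxConnector()})
output = gateway.render()
```

Because the gateway depends only on the `read_metrics` interface, replacing either side changes the input or output protocol without touching the other side.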

Features

  • Integration with third-party visualization and monitoring tools: Grafana,[5] Nagios, SSH
  • Collecting telemetry data using the following protocols and technologies: Spring Actuator, OpenZipkin spans (from Kafka and HTTP), HTTP, JMX, Modbus, rsh, stdout from command-line tools, SSH
  • Exposing telemetry data using the following protocols: XMPP (chat bot), SNMPv2/SNMPv3, HTTP, NRDP, NSCA,[9] syslog, data streaming to InfluxDB
  • Elasticity management with support for OpenStack Senlin[10]
  • Groovy scripting

Alternatives

An alternative solution might be constructed using a combination of software components:

Jolokia[12] offers a JMX-to-HTTP bridge that can be hosted inside a standalone Java program, a Java EE application server or an OSGi environment.

References

  1. "Elasticity Manager". cloudcomputingpatterns.github.io. Retrieved 4 January 2018.
  2. "The Central Repository Search Engine". search.maven.org. Retrieved 4 January 2018.
  3. "OpenZipkin · A distributed tracing system". zipkin.io. Retrieved 4 January 2018.
  4. "Apache HTrace – About". htrace.incubator.apache.org. Retrieved 4 January 2018.
  5. "Grafana - The open platform for analytics and monitoring". Grafana Labs. Retrieved 4 January 2018.
  6. "Resource Connector". bytex.solutions. Archived from the original on 5 September 2017. Retrieved 4 January 2018.
  7. "Gateway". bytex.solutions. Archived from the original on 5 September 2017. Retrieved 4 January 2018.
  8. "Supervisor". snamp.io. Archived from the original on 5 September 2017. Retrieved 4 January 2018.
  9. Galstad, Ethan. "NSCA - Nagios Service Check Acceptor - Nagios Exchange". exchange.nagios.org. Retrieved 4 January 2018.
  10. "OpenStack Docs: Welcome to the Senlin documentation!". docs.openstack.org. Retrieved 4 January 2018.
  11. "Auto Scaling Documentation". Amazon Web Services, Inc. Retrieved 4 January 2018.
  12. Huss, Roland. "Jolokia – Overview". jolokia.org. Retrieved 4 January 2018.