GCC Unit 2
UNIT II
GRID SERVICES
Introduction to Open Grid Services Architecture (OGSA) – Motivation – Functionality
Requirements – Practical & Detailed view of OGSA/OGSI – Data intensive grid service
models – OGSA services.
2. Introduction to Open Grid Services Architecture (OGSA)
• The first grid standard published by the Global Grid Forum was the Open Grid Services
Infrastructure (OGSI). OGSI defines grid services and the basic mechanisms for creating,
managing, and exchanging information between them.
• A second standard appeared a year later (and was still in initial draft form at press time):
the Open Grid Services Architecture (OGSA).
• This unit covers the Open Grid Services Architecture (OGSA), the technology driving it, and
the problems addressed by the Grid Service Specification proposed by the Global Grid Forum (GGF).
Department of CSE 1
Grid & Cloud Computing Unit II Notes
• It has long been recognized that middleware requires a set of functionality, such as
logging, security, failover, clustering, heartbeat monitoring, first failure data capture,
trace, and so on across a variety of platforms and technologies.
• Application servers, databases, and messaging engines all require similar functionality, but
it is often implemented in different ways by different middleware. For this reason, IBM
embarked on this project to provide a common set of core infrastructure functions across
all platforms.
Emerging Web Services standards
• Through a significant cooperative venture by some of the biggest players in the industry
(including IBM and Microsoft), working through a number of standards organizations, a
common set of technologies has emerged. These technologies allow for the registration,
discovery, and use of distributed services.
Basis for OGSA
• Web Services
• The Web Services model
• Operations in a Web Service architecture
• Standards
• Grid security
Web Services
• A Web Service is an interface that describes a collection of operations that are
network-accessible through standard XML messaging.
• Web Services are intended to facilitate the conversation or communication between
computer programs.
• Web Services will do for program-to-program communication
• Web Services use a program-to-program communications model built on existing
standards, such as Hypertext Transfer Protocol (HTTP), Extensible Markup
Language (XML), Simple Object Access Protocol (SOAP), Web Services Description
Language (WSDL), and Universal Description, Discovery, and Integration (UDDI).
• Find:
The entity looking for a specific type of resource must find the resource with the necessary
characteristics.
• Bind:
Once found, the service is bound so that it can be invoked and used.
• The Web Services architecture defines two important concepts that are called artifacts.
They are:
• Service
– This is the implementation of the Web Service interface.
• Service Description
– These are the details of the interface and the implementation of the service, including
its data types, operations, binding information, and network location. The
service description might be published directly to a service requestor or to a
service registry.
Standards
• OGSA proposes the heavy use of three standards:
– Simple Object Access Protocol (SOAP),
– Web Services Description Language (WSDL), and
– Web Services Inspection (WSI).
SOAP
• This is a means for providing messaging between a Service Requestor and a Service
Provider.
• It is a simple enveloping mechanism for XML payloads and defines a Remote Procedure
Call (RPC) mechanism and conventions.
• SOAP payloads can be carried on HTTP, FTP, Java Message Service (JMS), and so
on.
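The enveloping idea above can be illustrated with a minimal sketch that wraps an RPC-style call in a SOAP 1.1 envelope using Python's standard library. The operation name `getJobStatus`, the parameter `jobId`, and the namespace `urn:example:grid` are hypothetical and chosen only for illustration; the transport (HTTP, JMS, and so on) is deliberately left out.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_soap_request(operation, params, service_ns):
    """Wrap an RPC-style call in a SOAP envelope and return it as text."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The payload element is named after the operation (RPC convention).
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation, parameter, and namespace.
request = build_soap_request("getJobStatus", {"jobId": 42}, "urn:example:grid")
print(request)
```

The same text could then be posted over HTTP or placed on a message queue; the envelope itself does not depend on the transport.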
WSDL
This is an XML document format for describing Web Services as a set of endpoints operating on
messages containing either document-oriented (messaging) or RPC payloads.
WSI
• This is a simple XML language, with related conventions, for locating service
descriptions published by a service provider.
• A WSI language (WSIL) document can contain a collection of service descriptions
and links to other sources of service descriptions. A service description reference is,
normally, a URL to a WSDL document, but occasionally it can be a URL to another WSIL
document.
• With WS-Inspection, a service provider creates a WSIL document and makes the
document network accessible.
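As a rough sketch of how a requestor might walk WSIL documents, the following code collects WSDL locations and follows links to further WSIL documents. The namespace and the element names (`inspection`, `service`, `description`, `link`) are assumed to follow WS-Inspection 1.0 and should be checked against the specification; the URLs and the in-memory `DOCUMENTS` table are hypothetical stand-ins for network-accessible files.

```python
import xml.etree.ElementTree as ET

# Assumed WS-Inspection 1.0 namespace; verify against the specification.
WSIL = "{http://schemas.xmlsoap.org/ws/2001/10/inspection/}"

# Hypothetical documents keyed by URL, standing in for files fetched over the network.
DOCUMENTS = {
    "http://provider.example/inspection.wsil": """
<inspection xmlns="http://schemas.xmlsoap.org/ws/2001/10/inspection/">
  <service><description location="http://provider.example/jobs.wsdl"/></service>
  <link location="http://provider.example/more.wsil"/>
</inspection>""",
    "http://provider.example/more.wsil": """
<inspection xmlns="http://schemas.xmlsoap.org/ws/2001/10/inspection/">
  <service><description location="http://provider.example/storage.wsdl"/></service>
</inspection>""",
}


def collect_wsdl_locations(url, seen=None):
    """Return WSDL URLs reachable from a WSIL document, following links."""
    seen = set() if seen is None else seen
    if url in seen:  # guard against link cycles
        return []
    seen.add(url)
    root = ET.fromstring(DOCUMENTS[url].strip())
    found = [d.get("location") for d in root.iter(f"{WSIL}description")]
    for link in root.findall(f"{WSIL}link"):
        found.extend(collect_wsdl_locations(link.get("location"), seen))
    return found


locations = collect_wsdl_locations("http://provider.example/inspection.wsil")
print(locations)
```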
The Web Services stack
OGSA in detail
• The Web Services Description Language (WSDL) provides a mechanism to define
service interfaces in XML.
– These descriptions set out the structure and sequence of exchanges between the
invoker and the service.
• WSDL allows the same service to support multiple protocol bindings for a single
interface.
– This capability contributes greatly to support of heterogeneous distributed
systems.
– Not only can the binding be, for example, SOAP over JMS or HTTP, but a service
can support multiple bindings for each offering, for example, different qualities of
service or authentication mechanisms.
• OGSA brings the grid and Web service communities together to address the problem of
services across a distributed, heterogeneous, dynamic, and virtual organization.
• The core of the OGSA architecture is the grid service.
• Grid services may be computational resources, storage resources, programs, or
databases.
• Taking the Web services model as the example, grid services map very well to the
concepts of registration, discovery, and use.
• The two critical aspects for users in such a Service-Oriented Architecture (SOA) are:
– definition of the service interfaces, and
– identification of the protocol(s) that can be used to invoke a given service.
• Grid Services extend the Web Services concept by laying out a set of well-defined
interfaces that address discovery, dynamic service creation, lifetime management,
notification, and manageability, and a set of conventions for naming and upgradeability.
• These interfaces and conventions are vital for allowing reliable interoperability between
services and with invoking applications.
• WSDL refers to these interfaces as portTypes.
These portTypes include:
• GridService portType
– To allow the discovery of data relating to the service and enable lifetime
management of the service
• NotificationSource portType
– To allow the sending of notification messages
• NotificationSink portType
– To allow the receiving of notification messages
• NotificationSubscription portType
– To allow a NotificationSource portType to subscribe to a set of notifications for
a period of time
• Registration portType
– To allow a service instance to register/unregister to enable/disable discovery of the
service instance
• Factory portType
– To allow the creation of service instances
• HandleResolver portType
– To allow a grid service handle to be converted into a grid service
reference, necessary for binding to the service
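The roles played by some of these portTypes can be sketched in a few lines of Python. This is only an in-process analogy, not the OGSI interfaces themselves: the class names, method names, and the `gsh://example/...` handle format are all hypothetical.

```python
import itertools


class Registry:
    """Registration role: instances register so that others can discover them."""

    def __init__(self):
        self.handles = set()

    def register(self, handle):
        self.handles.add(handle)

    def unregister(self, handle):
        self.handles.discard(handle)


class CounterService:
    """A toy service instance that also acts as a notification source."""

    def __init__(self, handle):
        self.handle = handle
        self.value = 0
        self.sinks = []  # subscribed notification sinks (plain callables here)

    def subscribe(self, sink):
        self.sinks.append(sink)

    def increment(self):
        self.value += 1
        for sink in self.sinks:
            sink(self.handle, self.value)  # deliver a notification message


class CounterFactory:
    """Factory role: creates new service instances and registers them."""

    def __init__(self, registry):
        self.registry = registry
        self._ids = itertools.count(1)

    def create_service(self):
        instance = CounterService(f"gsh://example/counter/{next(self._ids)}")
        self.registry.register(instance.handle)
        return instance


registry = Registry()
factory = CounterFactory(registry)
service = factory.create_service()
received = []  # plays the notification sink
service.subscribe(lambda handle, value: received.append((handle, value)))
service.increment()
```

In a real grid each role would be a remotely invocable interface described in WSDL; the sketch only shows how creation, registration, and notification fit together.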
FUNCTIONALITY REQUIREMENTS
• Basic Functionality Requirements
• Security Requirements
• Resource Management Requirements
• System Properties Requirements
• Other Functionality Requirements
• Discovery and brokering
• Metering and accounting
• Data sharing
• Deployment
• Virtual organizations
• Monitoring
• Policy
• A global, cross-organizational view of resources and assets for project and fiscal
planning, troubleshooting, and other purposes.
• The users want to monitor their applications running on the grid.
• Also, the resource or service owners need to surface certain states so that the user of
those resources or services may manage the usage using the state information.
Policy:
• An error and event policy guides self-controlling management, including failover and
provisioning.
• It is important to be able to represent policy at multiple stages in hierarchical
systems, with the goal of automating the enforcement of policies that might otherwise be
implemented as organizational processes or managed manually.
• There may be policies at every level of the infrastructure: from low-level policies
that govern how the resources are monitored and managed, to high-level policies that
govern how business processes such as billing are managed.
• High-level policies are sometimes decomposable into lower-level policies.
Security Requirements
• Multiple security infrastructures
• Perimeter security solutions
• Authentication, Authorization, and Accounting
• Encryption
• Application and Network-Level Firewalls
• Certification.
Multiple security infrastructures:
• Distributed operation implies a need to interoperate with and manage multiple security
infrastructures.
• For example, for a commercial data center application, isolation of customers in the same
commercial data center is a crucial requirement; the grid should provide not only access
control but also performance isolation.
• For another example, for an online media and entertainment use case, proper isolation
between content offerings must be ensured; this level of isolation has to be ensured by the
security of the infrastructure.
• Advanced reservation.
• Notification and messaging
• Logging
• Workflow management.
• Pricing.
Provisioning:
• Computer processors, applications, licenses, storage, networks, and instruments are all
grid resources that require provisioning.
• OGSA needs a framework that allows resource provisioning to be done in a uniform,
consistent manner.
Resource virtualization:
• Dynamic provisioning implies a need for resource virtualization mechanisms that allow
resources to be transitioned flexibly to different tasks as required; for example, when
bringing more Web servers on line as demand exceeds a threshold.
Optimization of resource usage:
• While meeting cost targets (i.e., dealing with finite resources), mechanisms are needed to
manage conflicting demands from various organizations, groups, projects, and users and to
implement a fair sharing of resources and access to the grid.
Transport management:
• For applications that require some form of real-time scheduling, it can be important to be
able to schedule or provision bandwidth dynamically for data transfers or in support of
the other data sharing applications.
• In many (if not all) commercial applications, reliable transport management is essential to
obtain the end-to-end QoS required by the application.
Access:
• Usage models must provide for both batch and interactive access to resources.
Management and monitoring:
• Support for the management and monitoring of resource usage and the detection of SLA
or contract violations by all relevant parties.
• Conflict management is also necessary; it resolves conflicts between management
disciplines that may differ in their optimization objectives.
Processor scavenging:
• It is an important tool that allows an enterprise or VO to aggregate computing
power that would otherwise go to waste.
• For example, consider a collection of desktop computers running software that supports
integration into processing and/or storage pools managed via systems such as Condor,
Entropia, and United Devices. Issues here include maximizing security in the absence of
strong trust.
Scheduling of service tasks:
• Long recognized as an important capability for any information processing system,
scheduling becomes extremely important and difficult for distributed grid systems.
• In general, dynamic scheduling is an essential component. Computer resources must be
provisioned on demand to satisfy the need to complete a forecast on time.
Load balancing:
• In many applications, it is necessary to make sure deadlines are met or
resources are used uniformly.
• For example, for the commercial data center use case, monitoring the job performance
and adjusting allocated resources to match the load and fairly distributing end users’
requests to all the resources are necessary.
• For the online media and entertainment use case, the amount of workload is a
direct result of how many concurrent online game players are being hosted on a game
server.
Advanced reservation:
• This functionality may be required in order to execute the application on reserved
resources.
• For example, for the commercial data center use case, the grid decides when to start the
request processing based on the customer’s request.
• It interprets the job specification description language in which the request is written and
it checks to see if the customer has the right to perform the request.
Notification and messaging
• Notification and messaging are critical in most dynamic scientific problems. Notification
and messaging are event driven.
Logging:
• It may be desirable to log processes such as obtaining/deploying application
programs because, for example, the information might be used for accounting. This
functionality is represented as "metering and accounting."
Workflow management:
• Many applications can be wrapped in scripts or processes that require licenses and other
resources from multiple sources. Applications coordinate using the file system based on
events.
Pricing:
• Mechanisms are needed for determining how to render appropriate bills to users of a grid.
• In case of commercial data center applications if the data center becomes unavailable due
to a disaster such as an earthquake or fire, the remote backup data center needs to take
over the application systems.
Self-healing capabilities of resources, services and systems are required.
• Significant manual effort should not be required to monitor, diagnose, and repair faults.
• There is a need for the ability to integrate intelligent self-aware hardware such as disks,
networking devices, and so on.
• Strong monitoring for defects, intrusions, and other problems. Ability to migrate
attacks away from critical areas.
Legacy application management. Legacy applications are those that cannot be changed, but they
are too valuable to give up or too complex to rewrite. Grid infrastructure has to be built around
them so that they can continue to be used.
Administration. Be able to "codify" and "automate" the normal practices used to administer the
environment. The goal is that systems should be able to self-organize and self-describe to
manage low-level configuration details based on higher-level configurations and management
policies specified by administrators.
Agreement-based interaction. Some initiatives require agreement-based interactions capable of
specifying and enacting agreements between clients and servers (not necessarily human) and
then composing those agreements into higher-level end-user structures.
Grouping/aggregation of services. The ability to instantiate (compose) services using some set
of existing services is a key requirement. There are two main types of composition techniques:
selection and aggregation.
• Selection involves choosing to use a particular service among many services with the
same operational interface. Aggregation involves orchestrating a functional flow
(workflow) between services. For example, the output of an accounting service is fed into
the rating service to produce billing records.
• One other basic function required for aggregation services is to transform the syntax
and/or semantics of data or interfaces.
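The two composition techniques can be sketched as follows. Selection picks one of several rating services that share the same interface, and aggregation feeds the output of an accounting service into the chosen rating service to produce billing records. The service names, the prices, and the "lowest price wins" selection criterion are assumptions made for the example only.

```python
def accounting_service(usage_log):
    """Summarise raw usage entries into total CPU hours per user."""
    totals = {}
    for user, cpu_hours in usage_log:
        totals[user] = totals.get(user, 0) + cpu_hours
    return totals


def make_rating_service(price_per_hour):
    """Build a rating service; every instance exposes the same rate(totals) interface."""
    def rate(totals):
        return [{"user": user, "amount": hours * price_per_hour}
                for user, hours in sorted(totals.items())]
    return rate


# Selection: several services with the same operational interface (hypothetical offers).
candidates = [
    {"name": "rating-a", "price": 3, "rate": make_rating_service(3)},
    {"name": "rating-b", "price": 2, "rate": make_rating_service(2)},
]
chosen = min(candidates, key=lambda service: service["price"])

# Aggregation: a functional flow from accounting output into the rating service.
usage_log = [("alice", 4), ("bob", 1), ("alice", 2)]
billing_records = chosen["rate"](accounting_service(usage_log))
print(chosen["name"], billing_records)
```

If the two services disagreed on data layout, a small transformation step between them would play the role of the syntax/semantics conversion mentioned above.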
Other Functionality Requirements
• Platforms. The platforms themselves are heterogeneous, including a variety of operating
systems (Unixes, Linux, Windows, and, presumably, embedded systems), hosting
• from which Grid systems can be built, based on open standards such as WSDL
Objectives of OGSA
In contrast, it specifies
• Service state data that augment the constraint capabilities of XML schema
definition
The Web Services Invocation Framework (WSIF) and the Java API for XML-based RPC (JAX-RPC) can be
used to invoke the services.
It is possible, but not recommended, to build customized code that directly couples
client applications to fixed bindings of a particular grid service instance.
The GSH does not provide sufficient information to allow a client to access the service
instance.
The client needs to "resolve" a GSH into a grid service reference (GSR).
The GSR contains all the necessary information to access the service instance.
• The client resolves a GSH into a GSR by invoking a HandleResolver grid service
instance identified by some out-of-band mechanism.
• The HandleResolver may have the GSR stored in a local cache. The HandleResolver may
need to invoke another HandleResolver to resolve the GSH.
• The HandleResolver may use a handle resolution protocol, specified by the particular
kind (or scheme) of the GSH, to resolve to a GSR. The HandleResolver protocol is
specific to the kind of GSH being resolved.
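The resolution behaviour described in these bullets (answer from a local cache, otherwise delegate to another resolver) can be sketched as below. This is a simplified in-process model, not the OGSI HandleResolver interface: the class layout, the dictionary form of a GSR, and the example handle and endpoint are hypothetical.

```python
class HandleResolver:
    """Resolve grid service handles (GSHs) into grid service references (GSRs)."""

    def __init__(self, known=None, parent=None):
        self.known = dict(known or {})  # mappings this resolver is authoritative for
        self.cache = {}                 # GSRs remembered from earlier resolutions
        self.parent = parent            # another resolver to ask when we do not know

    def resolve(self, gsh):
        if gsh in self.cache:           # 1. local cache
            return self.cache[gsh]
        if gsh in self.known:           # 2. own mappings
            gsr = self.known[gsh]
        elif self.parent is not None:   # 3. delegate to another HandleResolver
            gsr = self.parent.resolve(gsh)
        else:
            raise KeyError(f"cannot resolve {gsh}")
        self.cache[gsh] = gsr
        return gsr


# Hypothetical handle and reference.
GSH = "gsh://example.org/counter/1"
GSR = {"endpoint": "http://host-7.example.org:8080/counter", "binding": "SOAP/HTTP"}

home = HandleResolver(known={GSH: GSR})
local = HandleResolver(parent=home)
reference = local.resolve(GSH)
print(reference["endpoint"])
```

Because a GSR may be shorter lived than its GSH, a fuller model would also expire cache entries; that is omitted here.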
The purpose of the OGSI document is to specify the (standardized) interfaces and behaviors that
define a grid service.
Every grid service is a Web service, but the converse is not true.
Defining service data that provide a standard way for representing and querying
metadata and state data from a service instance
Service Data
The approach to stateful Web services introduced in OGSI identified the need for a
common mechanism to expose a service instance’s state data to service requestors
Service data can be exposed for read, update, or subscription purposes. Since WSDL
defines operations and messages for portTypes, the declared state of a service must be
externally accessed only through service operations defined as part of the service
interface.
To avoid the need to define serviceData-specific operations for each serviceData element,
the grid service portType provides base operations for manipulating serviceData elements
by name.
Consider an example. Interface alpha introduces operations op1, op2, and op3.
Also assume that the alpha interface consists of the publicly accessible data elements de1,
de2, and de3.
One uses WSDL to describe alpha and its operations. The OGSI serviceData construct
extends WSDL so that the designer can further define the interface to alpha by declaring
the public accessibility of certain parts of its state: de1, de2, and de3.
This declaration then facilitates the execution of operations on the service data of a
stateful service instance implementing the alpha interface.
The serviceData declaration is the mechanism used to express the elements of the publicly
available state exposed by the service’s interface. ServiceData elements are accessible
through operations of the service interfaces such as those defined in this specification.
The private internal state of the service instance is not part of the service interface and is
therefore not represented through a serviceData declaration.
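The alpha example can be modelled with a short sketch in which declared serviceData elements are read and written by name through two generic operations, while private state stays out of reach. The Python method names `find_service_data` and `set_service_data` are illustrative analogues of the by-name base operations, and the initial values and the `_calls` field are invented for the example.

```python
class AlphaService:
    """Toy stateful service implementing the alpha interface."""

    # Declared, publicly accessible state (the serviceData declarations).
    SERVICE_DATA = ("de1", "de2", "de3")

    def __init__(self):
        self._state = {"de1": 1, "de2": "ready", "de3": [0.5]}
        self._calls = 0  # private internal state, not declared as serviceData

    def op1(self):
        """An ordinary operation that happens to change public state."""
        self._calls += 1
        self._state["de1"] += 1

    def find_service_data(self, name):
        """Generic read access to any declared serviceData element."""
        if name not in self.SERVICE_DATA:
            raise KeyError(f"{name} is not a declared serviceData element")
        return self._state[name]

    def set_service_data(self, name, value):
        """Generic write access to any declared serviceData element."""
        if name not in self.SERVICE_DATA:
            raise KeyError(f"{name} is not a declared serviceData element")
        self._state[name] = value


alpha = AlphaService()
alpha.op1()
alpha.set_service_data("de2", "busy")
print(alpha.find_service_data("de1"), alpha.find_service_data("de2"))
```

One pair of generic operations covers every declared element, which is why no per-element get and set operations are needed.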
• The OGSI model uses the serviceData elements and XML schema types to achieve a
similar result.
• The OGSI specification has chosen not to require getXXX and setXXX WSDL
operations for each serviceData element, although service implementers may choose to
define such safe get and set operations themselves.
Extending portType with serviceData
• ServiceData defines a new portType child element named serviceData, used to define
serviceData elements, or SDEs, associated with that portType.
• These serviceData element definitions are referred to as serviceData declarations,
or SDDs.
• Initial values for those serviceData elements (marked as "static" serviceData elements)
may be specified using the staticServiceDataValues element within portType.
• The values of any serviceData element, whether declared statically in the portType
or assigned during the life of the Web service instance, are called serviceData element
values, or SDE values.
serviceDataValues
• Each service instance is associated with a collection of serviceData elements: those
serviceData elements defined within the various portTypes that form the service’s
interface, and also, potentially, additional serviceData elements added at runtime.
• Each service instance must have a "logical" XML document, with a root element of
serviceDataValues, that contains the serviceData element values.
• A service implementation is free to choose how the SDE values are stored; for example,
it may store the SDE values not as XML but as instance variables that are converted into
XML or other encodings as necessary.
• The wsdl:binding associated with various operations manipulating serviceData elements
will indicate the encoding of that data between service requestor and service provider.
For example, a binding might indicate that the serviceData element values are encoded as
serialized Java objects.
A grid service description describes how a client interacts with service instances.
The need arises at various points to represent time that is meaningful to multiple parties
in the distributed Grid.
The GMT global time standard is assumed for grid services, allowing operations to refer
unambiguously to absolute times.
Grid service hosting environments and clients should use the Network Time Protocol (NTP)
or an equivalent function to synchronize their clocks to the global GMT standard time.
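A small sketch of why a single global time reference matters: a client in a local time zone computes an absolute deadline that every party reads the same way. The function name `termination_time`, the one-hour lifetime, and the +05:30 offset are assumptions for illustration; Python's UTC is used as the practical equivalent of GMT.

```python
from datetime import datetime, timedelta, timezone


def termination_time(now, lifetime_seconds):
    """Return an absolute deadline expressed in UTC."""
    if now.tzinfo is None:
        # A naive timestamp is ambiguous between parties in different zones.
        raise ValueError("timestamp must carry a time zone")
    return now.astimezone(timezone.utc) + timedelta(seconds=lifetime_seconds)


# A client whose clock is set to a local zone five and a half hours ahead of GMT.
local_zone = timezone(timedelta(hours=5, minutes=30))
local_now = datetime(2024, 1, 1, 9, 30, tzinfo=local_zone)

deadline = termination_time(local_now, 3600)
print(deadline.isoformat())
```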
<gsdl:serviceData
    name="nmtoken"?
    globalName="qname"?
    type="qname"
    goodFrom="xsd:dateTime"?
    goodUntil="xsd:dateTime"?
    availableUntil="xsd:dateTime"?>
    <!-- ... -->
</gsdl:serviceData>
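The lifetime attributes in this declaration can be put to use with a sketch like the one below, which checks whether a serviceData value should still be trusted at a given instant. It assumes that goodFrom and goodUntil bound the period in which the value is valid and that a missing attribute means "no bound"; the helper name and the sample timestamps are hypothetical.

```python
from datetime import datetime, timezone


def is_value_current(attrs, now):
    """Return True if `now` lies inside the goodFrom..goodUntil window."""
    good_from = attrs.get("goodFrom")
    good_until = attrs.get("goodUntil")
    if good_from is not None and now < datetime.fromisoformat(good_from):
        return False  # value is not valid yet
    if good_until is not None and now > datetime.fromisoformat(good_until):
        return False  # value has gone stale
    return True


# Attribute values written with explicit UTC offsets so comparisons are unambiguous.
sde_attrs = {
    "goodFrom": "2024-01-01T00:00:00+00:00",
    "goodUntil": "2024-01-01T00:05:00+00:00",
}

inside = datetime(2024, 1, 1, 0, 2, tzinfo=timezone.utc)
after = datetime(2024, 1, 1, 0, 6, tzinfo=timezone.utc)
print(is_value_current(sde_attrs, inside), is_value_current(sde_attrs, after))
```

A client that finds the value stale would typically query the service again rather than reuse a cached copy.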
• Replication strategies determine when and where to create a replica of the data.
• The factors to consider include data demand, network conditions, and transfer cost.
• Replication strategies can be classified into two types: dynamic and static.
• For the static method, the locations and number of replicas are determined in advance and
will not be modified.
• Although replication operations require little overhead, static strategies cannot adapt to
changes in demand, bandwidth, and storage availability.
• Dynamic strategies can adjust the locations and number of data replicas according to
changes in conditions.
• The replication strategy must be optimized with respect to the status of data replicas.
• For static replication, optimization is required to determine the location and number of
data replicas.
• For dynamic replication, optimization may be determined based on whether the data
replica is being created, deleted, or moved.
• The most common replication strategies include preserving locality, minimizing update
costs, and maximizing profits.
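The difference between the two strategy types can be made concrete with a minimal sketch of one dynamic adjustment step. A static strategy would keep `current_sites` fixed; the dynamic step below creates replicas where demand is high and deletes them where demand has fallen away. The thresholds, site names, and request counts are invented for the example, and real strategies would also weigh network conditions and transfer cost.

```python
def plan_replicas(current_sites, demand_by_site, create_at=10, delete_below=2):
    """Return (to_create, to_delete) site lists for one data set."""
    # Create a replica wherever demand is high and no replica exists yet.
    to_create = sorted(
        site for site, demand in demand_by_site.items()
        if demand >= create_at and site not in current_sites
    )
    # Delete replicas at sites where almost nobody asks for the data.
    to_delete = sorted(
        site for site in current_sites
        if demand_by_site.get(site, 0) < delete_below
    )
    return to_create, to_delete


# Replica placement decided in advance (what a static strategy would keep forever).
current_sites = {"site-a", "site-b"}
# Requests observed per site during the last monitoring interval (hypothetical numbers).
demand = {"site-a": 25, "site-b": 0, "site-c": 14, "site-d": 3}

to_create, to_delete = plan_replicas(current_sites, demand)
print(to_create, to_delete)
```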
Federation model:
This data access model shown in the Figure is better suited for designing a data grid with
multiple sources of data supplies.
• Sometimes this model is also known as a mesh model.
• The data sources are distributed to many different locations.
• Although the data is shared, the data items are still owned and controlled by their original
owners.
• According to predefined access policies, only authenticated users are authorized to
request data from any data source.
• This mesh model may cost the most when the number of grid institutions becomes very
large.
Hybrid model:
This data access model is shown in the Figure. The model combines the best features of the
hierarchical and mesh models.
• Traditional data transfer technology, such as FTP, applies for networks with lower
bandwidth.
• Network links in a data grid often have fairly high bandwidth, and other data transfer
models are exploited by high-speed data transfer tools such as GridFTP developed with
the Globus library.
• The cost of the hybrid model can be traded off between the two extreme models for
hierarchical and mesh-connected grids
Parallel versus Striped Data Transfers
• Compared with traditional FTP data transfer, parallel data transfer opens multiple data
streams for passing subdivided segments of a file simultaneously.
• Although the speed of each stream is the same as in sequential streaming, the total time to
move data in all streams can be significantly reduced compared to FTP transfer.
• In striped data transfer, a data object is partitioned into a number of sections, and each
section is placed in an individual site in a data grid.
• When a user requests this piece of data, a data stream is created for each site, and all the
sections of data objects are transferred simultaneously.
Striped data transfer can utilize the bandwidths of multiple sites more efficiently to speed up data
transfer.
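The time saving from multiple streams can be shown with a little arithmetic. The sketch below splits a file into byte ranges, one per stream, and estimates the transfer time as the time taken by the longest segment. It assumes every stream runs at the same rate and that the streams do not compete for a shared bottleneck; the file size and rate are arbitrary example numbers.

```python
def split_into_segments(size, streams):
    """Divide `size` bytes into `streams` contiguous (start, end) ranges."""
    base, extra = divmod(size, streams)
    segments, start = [], 0
    for index in range(streams):
        length = base + (1 if index < extra else 0)  # spread the remainder
        segments.append((start, start + length))
        start += length
    return segments


def transfer_time(size, stream_rate, streams=1):
    """Estimated seconds to move `size` bytes when all streams run concurrently."""
    longest = max(end - start for start, end in split_into_segments(size, streams))
    return longest / stream_rate


segments = split_into_segments(1000, 4)
sequential = transfer_time(1000, stream_rate=10)         # one stream
parallel = transfer_time(1000, stream_rate=10, streams=4)  # four concurrent streams
print(segments, sequential, parallel)
```

In a striped transfer the same calculation applies, except that each segment comes from a different site, so the per-stream rates are less likely to share one bottleneck link.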
OGSA SERVICES
• Handle Resolution
• Virtual Organization Creation and Management
• Service Groups and Discovery Services
• Choreography, Orchestrations and Workflow
• Transactions
• Metering Service
• Rating Service
• Accounting Service
• Billing and Payment Service
• Installation, Deployment, and Provisioning
• Distributed Logging
• Messaging and Queuing
• Event
• Policy and Agreements
• Base Data Services
• Other Data Services
• Discovery Services
• Job Agreement Service
• Reservation Agreement Service
• Data Access Agreement Service
• Queuing Service
• Open Grid Services Infrastructure
• Common Management Model
Handle Resolution
• OGSI defines a two-level naming scheme for grid service instances based on abstract,
long-lived grid service handles (GSHs) that can be mapped by HandleMapper services to
concrete, but potentially less long lived, grid service references (GSRs).
Accounting Service
• Once the rated financial information is available, an accounting service can manage
subscription users and account information, calculate the relevant monthly charges, and
maintain the invoice information. This service can also generate and present invoices to
the user. Account-specific information is also applied at this time.
Billing and Payment Service
• Billing and payment service refers to the financial service that actually carries out the
transfer of money; for example, a credit card authorization service.
Installation, Deployment, and Provisioning
• Computer processors, applications, licenses, storage, networks, and instruments are all
grid resources that require installation, deployment, and provisioning (other new resource
types will be invented and added to this list). OGSA affords a framework that allows
resource provisioning to be done in a uniform, consistent manner.
Distributed Logging
• Distributed logging can be viewed as a typical messaging application in which message
producers generate log artifacts (atomic expressions of diagnostic information) that may
or may not be used at a later time by other independent message consumers.
• OGSA-based logging can leverage the notification mechanism available in OGSI as the
transport for messages.
• However, it is desirable to move logging-specific functionality to intermediaries, or
logging services.
• Furthermore, the secure logging of events is required for the audit trails needed to fulfill
judiciary and organizational policy requirements, to reconcile security-related
inconsistencies, and to provide forensic evidence both after the fact and in real time.
• It is expected that standards and implementations for secure logging will be able to
build considerably on the efforts associated with distributed logging.
Messaging and Queuing
• OGSA extends the scope of the base OGSI Notification Interface to allow grid services
to produce a range of event messages, not just notifications that a serviceData element
has changed. Several terms related to this work are:
Model.