Table of Contents
Legal Notice
Acknowledgments
Terminology and Provenance
1.0 Executive Summary
2.0 Purpose
3.0 Taxonomy
    Table 3.1 Terms and Definitions
    Figure 3.1 Overview of the CIaaS Environment
4.0 Introduction
5.0 Reference Framework
    Figure 5.1 High-Level Grouping of Service Scenarios
6.0 Usage Scenarios
    6.1 Usage Scenario 1: Compose Service
    6.2 Usage Scenario 2: Submit Provisioning Request
    6.3 Usage Scenario 3: Reserve Resources for Service
    6.4 Usage Scenario 4: Deploy Service
    6.5 Usage Scenario 5: Track Status and Manage Deployment
    6.6 Usage Scenario 6: Reopen Expired Request
    6.7 Usage Scenario 7: Start Dependencies
    6.8 Usage Scenario 8: Stop Dependencies
    6.9 Usage Scenario 9: Suspend Cloud Service
    6.10 Usage Scenario 10: Resume Cloud Service
    6.11 Usage Scenario 11: Systems Monitoring, Alerting, and Data Collection
    6.12 Usage Scenario 12: Systems Administration and Remediation
    6.13 Usage Scenario 13: Reporting
    6.14 Usage Scenario 14: Capacity Planning
    6.15 Usage Scenario 15: Auditing
    6.16 Usage Scenario 16: Change a Deployed Service Instance
    6.17 Usage Scenario 17: Auto Scale
    6.18 Usage Scenario 18: Comply to Regulatory Requirements
    6.19 Usage Scenario 19: Service and Data Termination and Deletion
Copyright 2012 Open Data Center Alliance, Inc. ALL RIGHTS RESERVED.
7.0 Service Orchestration Overview
    Figure 7.1 Generic Cloud Ecosystem Architecture
8.0 Service Catalog Overview
    8.1 Background
    8.2 Description
    8.3 Service Catalog Lifecycle
        Figure 8.3.1 Service Catalog Lifecycle
    8.4 Service Catalog Usage
    8.5 Service Catalog XML Structure
        Example 1
        Example 2
        Example 3
9.0 Services Interface Overview
    Table 9.1 Requested Collections and Associated Usage Models
    Table 9.2 Requested Resources and Associated Usage Scenarios
10.0 Key Performance Indicators Overview
    Figure 10.1 KPI Visualization
    Table 10.1 Measures in KPI Calculation
11.0 Orchestration Lifecycle Overview
    Figure 11.1 Orchestration Lifecycle Flowchart
    Table 11.1 High Level Process for a Service Requiring Orchestration
    Table 11.2 Lifecycle Events and Phases
    Figure 11.2 Service Catalog Life Cycle
    Table 11.3 Defined Cloud Consumer Responsibilities
    Table 11.4 Activities Within the Phases of Orchestration
12.0 Usage Requirements
    Table 12.1 Additional Requirements
13.0 RFP Requirements: Service Providers
14.0 Summary of Industry Actions Required
15.0 References
Legal Notice
Copyright 2012 Open Data Center Alliance, Inc. ALL RIGHTS RESERVED.

This "Open Data Center Alliance℠ Master Usage Model: Service Orchestration" is proprietary to the Open Data Center Alliance, Inc.

NOTICE TO USERS WHO ARE NOT OPEN DATA CENTER ALLIANCE PARTICIPANTS: Non-Open Data Center Alliance Participants only have the right to review, and make reference or cite this document. Any such references or citations to this document must give the Open Data Center Alliance, Inc. full attribution and must acknowledge the Open Data Center Alliance, Inc.'s copyright in this document. Such users are not permitted to revise, alter, modify, make any derivatives of, or otherwise amend this document in any way.

NOTICE TO USERS WHO ARE OPEN DATA CENTER ALLIANCE PARTICIPANTS: Use of this document by Open Data Center Alliance Participants is subject to the Open Data Center Alliance's bylaws and its other policies and procedures.

OPEN DATA CENTER ALLIANCE℠, ODCA℠, and the OPEN DATA CENTER ALLIANCE logo are trade names, trademarks, service marks and logotypes (collectively "Marks") owned by Open Data Center Alliance, Inc. and all rights are reserved therein. Unauthorized use is strictly prohibited.

This document and its contents are provided "AS IS" and are to be used subject to all of the limitations set forth herein. Users of this document should not reference any initial or recommended methodology, metric, requirements, or other criteria that may be contained in this document or in any other document distributed by the Alliance ("Initial Models") in any way that implies the user and/or its products or services are in compliance with, or have undergone any testing or certification to demonstrate compliance with, any of these Initial Models.

Any proposals or recommendations contained in this document including, without limitation, the scope and content of any proposed methodology, metric, requirements, or other criteria do not mean the Alliance will necessarily be required in the future to develop any certification or compliance or testing programs to verify any future implementation or compliance with such proposals or recommendations.

This document does not grant any user of this document any rights to use any of the Alliance's Marks. All other service marks, trademarks and trade names referenced herein are those of their respective owners.

Published November, 2012
Acknowledgements
ODCA would like to acknowledge the substantial contributions of content and prior art from Deutsche Bank, UBS, NAB, Deutsche Telekom, Bank Leumi, Atos, CapGemini, BMW, and from Wikipedia.
OPEN DATA CENTER ALLIANCE Master USAGE MODEL: Service Orchestration REV 1.0
2.0 Purpose
This document focuses on service discovery and orchestration for Compute IaaS. It defines the automation required for orchestration, which includes programmatic interfaces, interaction patterns, control interfaces, and lifecycle management.

The goals of this usage model are as follows:

- Define the basic elements of service orchestration for Compute IaaS.
- Apply the baseline work on the ODCA Usage Model: Service Catalog [1] and ODCA Usage Model: Standard Units of Measure for IaaS [2].
- Identify the ODCA Usage Model: Service Catalog [1], API interface, key performance indicators (KPIs), and process for Service Orchestration.
- Align with the ODCA Master Usage Models: Commercial Framework [3] and Compute IaaS (CIaaS) [4] to underpin cloud services provisioning.
- Define a standard orchestration process for service integration which can be used as a reference model for improving interoperability between cloud providers and subscribers.
3.0 Taxonomy
Table 3.1 Terms and Definitions

Term | Definition
Auto Scaling | The ability to automatically provision a new service instance or remove an existing service instance in a collection as defined by a set of one or more policies. It also includes the ability to alter a configuration to align more directly with usage requirements.
Bursting | The ability for a cloud consumer to scale out its infrastructure from one cloud provider into another, typically from a private cloud provider to a public cloud provider.
Collection | Describes a set of one or more service instances.
Migration | The ability for a cloud consumer to move a service instance from one cloud provider to another.
Post Deployment Task | Represents a task to be executed after the deployment of a service instance, typically in the form of a script that the cloud consumer provides.
Reservation | A confirmation from the cloud provider that the specific set of resources specified in a service request has been reserved for the cloud consumer.
Service Composition | The action of defining the specification for a set of interrelated components that make up a service.
Service Instance | An occurrence or instantiation of the service as listed in the catalog that is deployed into the environment for a cloud consumer.
Service Orchestration | The ongoing ability to arrange, coordinate, and manage the automated deployment and configuration of one or more interrelated components required for service delivery at a point in time.
Service Request | An order submitted by the cloud consumer that details the service and its composition that the cloud provider is expected to deliver.
The following diagram provides an overview of a CIaaS environment, and the variety of components and systems that are addressed by the orchestration task. Note: Orchestration is indicated on the right side of the diagram.
[Figure 3.1 Overview of the CIaaS Environment. Block labels recoverable from the figure: Services; Actionable Service Catalog (UI and API); Remediation; SaaS (Web & Data Service Interoperability, Cloud Aware Apps); Resource Allocation; PaaS (Web, Data, Message); Application Development; Configuration; IaaS (Compute, Storage, Network); Integration; Service Desk; Event; Facility; Security.]
The top layer represents the bundling of services by the service provider into offerings a consumer can use, accessible through a service catalog; for example, availability, administration, and data recovery (DR) as represented in the service catalog. Access to this catalog is provided through Graphical User Interfaces (GUIs), Command Line Interfaces (CLIs), and Application Programming Interfaces (APIs).

The multiple layers, front to back, represent the contributions, interactions, and responsibilities of vendors, service providers, and service consumers. All have some responsibility that must be triggered in the orchestration process, and these responsibilities have to be clearly mapped and integrated within that process. For example, the hardware vendor must support the SLA-based break/fix requirement; the service provider must take care of SLA monitoring, incidents, and events; and the consumer must define their data classification, availability and performance requirements, and SLA requirements.

The definitions for each block in Figure 3.1 are as follows:

1. Solution Provider Plane
   a. Monitoring: Monitoring of the constituent elements of the overall cloud service for which the solution provider is contracted.
   b. Resource Allocation: Allocation of contracted resources to the overall federation that comprises the cloud service; can include people, technology, and processes.
   c. Remediation: Address and resolve incidents and problems.
   d. Tuning & Optimization: Monitor performance, resolve any bottlenecks or IO contention, and maximize the capacity utilization and efficiency of provided resources.
2. Service Receiver Plane
   a. End User: Overall end consumer of the service, funding the use of the cloud services.
   b. AppDev: Internal corporate application development team, developing and operating business applications on the cloud infrastructure, and also providing data classification, business processes, business rules, and regulatory requirements.
   c. Business Processes: Provision of the overall business processes into which the cloud-based applications integrate as part of an overall system, in order to convert defined inputs to defined outputs.

3. Service Provider Plane
   a. Remediation: Address and resolve incidents and problems.
   b. Application Operations: Monitor, run, and perform scheduled and defined tasks relating to the smooth running and operation of an application on the cloud infrastructure.
   c. Service Management: Monitor and ensure the achievement of defined SLAs and KPIs for the cloud service, including all cloud elements and their financial, technical, and legal dimensions.
   d. Integration: Connect cloud-based applications and processes to any other required or defined systems according to business processes, rules, and interface definitions, so that the defined inputs and outputs of the business system based on the cloud infrastructure can interact internally and with any systems external to the cloud.
   e. Service Desk: Monitor events and queues for cloud-based services, and respond and alert support resources in the event of an incident or problem. Also acts as an interface for cloud consumers to log calls and find status updates. May use web, telephone, and other interfaces.
   f. Capacity & Service: Monitor and report overall service capacity consumption, available resources, and trends, and create relevant reports to enable consumer, operational, and service delivery teams to operate and optimize the cloud infrastructure for maximum efficiency and performance.
   g. SW Delivery (OS & Apps): Define cloud service elements and deliver the software components according to defined catalog definitions, installed on the cloud infrastructure. Special focus is on enabling the applications to be cloud aware so that they can operate on virtualized infrastructure. Additional packaging and bundling of software elements will constitute PaaS and SaaS offerings, depending on the level of functionality and on the elements combined into useful pre-determined groups as systems, with defined processes and functions.
   h. Configuration: Define scripts and pre-determined implementation formats, in addition to the binaries of which the software elements are fundamentally composed. This potentially extends to the integration of software elements into PaaS and SaaS bundles.
   i. Event: Any data generated as a result of a cloud element operation. Produces data which contributes towards reporting and the identification of incidents and problems, as well as financial and compliance reporting.
   j. Security: Provides rules and guidelines for compliant delivery of services, and ensures the security of the cloud service according to defined requirements and catalog entries.
   k. SaaS: The capability provided to the consumer is to use the provider's applications running on a cloud infrastructure and accessible from various client devices through a thin client interface such as a Web browser (e.g., web-based email). The consumer does not manage or control the underlying cloud infrastructure, network, servers, operating systems, storage, or even individual application capabilities, with the possible exception of limited user-specific application configuration settings. [5]
   l. PaaS: The capability provided to the consumer is to deploy onto the cloud infrastructure consumer-created applications using programming languages and tools supported by the provider (e.g., Java, Python, .NET). The consumer does not manage or control the underlying cloud infrastructure, network, servers, operating systems, or storage, but the consumer has control over the deployed applications and possibly application hosting environment configurations. [5]
   m. IaaS: The capability provided to the consumer is to rent processing, storage, networks, and other fundamental computing resources where the consumer is able to deploy and run arbitrary software, which can include operating systems and applications. The consumer does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, deployed applications, and possibly select networking components (e.g., firewalls, load balancers). [5]
   n. Facility: The building facility where the cloud system is located; location is typically identifiable by means of the network termination on the compute infrastructure from which the cloud services are provided.

For further details about the elements of the diagram, refer to the ODCA Master Usage Model: Compute Infrastructure as a Service (CIaaS) [4].
4.0 Introduction
Service discovery and orchestration is a paradigm that supports cloud providers in arranging, coordinating, and managing computing resources as a system of components and automated workflows that can be delivered as cloud services to cloud consumers. At the most basic level a service orchestrator can be a human, but for the purpose of this document, an orchestrator is an automation service that provides orchestration across various technical and business workflow domains. Because the cloud is all about scale, automated workflows are essential to the delivery of cloud services, which includes fulfilment, assurance, and billing in addition to other domains.

The main difference between workflow automation and service orchestration is that automated workflows represent the entities and their execution relationships; automation is the non-human service that drives the workflow. These workflows are often processed and completed within a single domain. Service orchestration, on the other hand, includes a workflow but implies directed action towards larger goals and objectives. In a nutshell, it differs from a typical workflow automation process because it ties together a variety of disparate automated processes and IT resources that use workflows, through a portal from which those workflows can be managed. [6]

Service orchestration for any type of cloud service involves specific considerations:

- Functional, non-functional, and constraint descriptions must be clearly defined. Introspection of a service may be useful in determining these details.
- A well-defined service catalog is a must-have. It is where services are looked up to determine which of them are needed for the required functionality, and which interfaces support them.
- Discovery is the process of assessing the capabilities of the services and contracts, as well as the commercial parameters that allow efficient, highly secure, and elastic transactions. In the future, with established marketplaces for services, it will be possible to have pre-qualified service providers and services, and discovery time should be reduced.
- Service reservation, or a declaration of intent that states when orders will be filled (that is, when the service and its resources will be made available), is important for the fulfilment of a service.
- Identification of the methods for service orchestration of enhanced services, customization, or mashups is an important part of orchestration. This also includes requirements related to business processes, segments of the business process outsourced to external cloud services, and orchestration for managing contracted services.
- Defining the order of services may be necessary to identify other dependencies, such as lifecycle management, interoperability of environments, bootstrapping, and so on.
- Security considerations are a necessary requirement for delivering a robust cloud service. For example, authentication, authorization, audit, and accounting (AAAA) are essential to service orchestration as consumers evaluate such topics as federated identity.
- Service Level Agreements (SLAs) are also essential to the fulfilment of cloud services. This might take place in-band, out-of-band, or in advance of a specific engagement.

Using existing work from the ODCA workgroups, this paper focuses on how service components are combined based on the CIaaS framework. It initially takes the viewpoint of the service consumer, and will later be updated to include the viewpoints of the service providers and vendors.
This usage model contains 19 service orchestration usage scenarios between a cloud subscriber and a public cloud provider for CIaaS. It lays the foundation for the next phase of usage development and for other areas of the cloud. It identifies what needs to be configured for the CIaaS cloud service, what must appear in the service catalog, how the client orders the service, orders changes, and orders decommissioning, and how the client can verify that it is receiving the service.
[Figure 5.1 High-Level Grouping of Service Scenarios. Scenario labels recoverable from the figure:]
1. Compose Service
2. Submit Provisioning Request
3. Reserve Resources for Service
4. Deploy Service
5. Track Status and Manage Deployment
6. Reopen Expired Request
11. System Monitoring & Data Collection
12. System Administration & Remediation
13. Reporting
14. Capacity Planning
15. Auditing
18. Comply Regulatory Requirements
19. Service and Data Termination & Deletion
Failure Condition 1: Data is incomplete.
Failure Handling 1: The subscriber is notified and prompted to correct the composition.
Failure Condition 2: Cyclic dependencies are identified and flagged as errors.
Failure Handling 2: The subscriber is notified and prompted to correct the composition.
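
The two failure conditions above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a composition is represented as a mapping from each component name to the set of components it depends on, and that "incomplete data" means a dependency on a component that was never defined. The function name `validate_composition` and the message formats are hypothetical; the usage model does not prescribe a representation. Cycle detection relies on Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter, CycleError


def validate_composition(components):
    """Return a list of problems found in a service composition.

    `components` maps a component name to the set of component names it
    depends on (an assumed representation, not an ODCA schema).
    """
    problems = []
    # Failure Condition 1 (as interpreted here): a dependency refers to
    # a component that the subscriber never defined.
    for name, deps in components.items():
        for dep in sorted(deps):
            if dep not in components:
                problems.append(f"incomplete: '{name}' depends on undefined '{dep}'")
    # Failure Condition 2: the dependencies form a cycle.
    try:
        TopologicalSorter(components).prepare()
    except CycleError as err:
        # The second element of args holds the nodes of the detected cycle.
        cycle = " -> ".join(err.args[1])
        problems.append(f"cyclic dependency: {cycle}")
    return problems
```

An empty result means the composition passed both checks; otherwise each entry is something the subscriber would be prompted to correct.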
Success Scenario 1: Cloud provider checks for resource availability and reserves resources.
Steps:
1. Cloud provider reserves the resources for the cloud subscriber. The reservation has an expiration date.

Failure Condition 1: Cloud provider does not have capacity to reserve resources for the cloud subscriber.
Failure Handling 1: The subscriber is notified and can resubmit the order at a later time. Ideally the provider notifies the subscriber when capacity is available.
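
As a rough sketch of the reservation behaviour described above, the following models a resource pool whose reservations carry an expiration time. All names (`ResourcePool`, `Reservation`, `InsufficientCapacity`) and the 24-hour default lifetime are assumptions made for illustration; the usage model only states that a reservation expires and that a lack of capacity is a failure condition.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


class InsufficientCapacity(Exception):
    """Raised when the pool cannot cover a request (Failure Condition 1)."""


@dataclass
class Reservation:
    subscriber: str
    units: int
    expires_at: datetime

    def is_expired(self, now):
        return now >= self.expires_at


class ResourcePool:
    def __init__(self, capacity):
        self.free = capacity
        self.reservations = []

    def reserve(self, subscriber, units, now, ttl=timedelta(hours=24)):
        """Reserve `units` for `subscriber`; the 24-hour ttl is an assumed default."""
        self.release_expired(now)
        if units > self.free:
            raise InsufficientCapacity(f"{units} requested, {self.free} free")
        self.free -= units
        reservation = Reservation(subscriber, units, now + ttl)
        self.reservations.append(reservation)
        return reservation

    def release_expired(self, now):
        """Return the units of expired reservations to the pool."""
        still_valid = []
        for reservation in self.reservations:
            if reservation.is_expired(now):
                self.free += reservation.units
            else:
                still_valid.append(reservation)
        self.reservations = still_valid
```

Passing `now` explicitly, rather than reading the clock inside the class, keeps the expiry logic easy to exercise in tests.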
Failure Condition 1: Resource reservation has expired.
Failure Handling 1: The subscriber is notified and has to reopen the service request or start a new one.
Failure Condition 2: Cloud subscriber cancels the order.
Failure Handling 2: Any reserved resources are released back into the resource pool.
Failure Condition 3: Cloud subscriber cancels the deployment only.
Failure Handling 3: The service request remains open and the resources remain reserved.
- Name of request
- Type of service and cloud provider
- List of resources specified in the request

2. Cloud provider returns deployment status. The status report should include, but is not limited to, the following information:
   - Percentage to completion
   - Estimated time of delivery
   - Status of different resources in the request
   - Error messages, if any
   - Activity log
3. Depending on the current state of deployment, the cloud subscriber may choose one of the following actions:
   - Cancel the deployment, roll back any changes, and put resources back into the pool.
   - Pause the deployment at the next checkpoint.
   - Resume a deployment that was paused.
   - Fix the error and retry or resume deployment from the last checkpoint.

Failure Condition 1: Provisioning request is not progressing and there is no error message.
Failure Handling 1: Subscriber has a self-service means to check progress and escalate within the provider's support organization.
Failure Condition 2: Deployment erred and was aborted by the cloud provider.
Failure Handling 2: Subscriber is notified and can either resubmit or cancel the request.
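
The status report fields and subscriber actions listed above could be modelled along the following lines. This is only a sketch: the state names (`in_progress`, `paused`, `error`, and so on), the `DeploymentStatus` class, and the `apply_action` helper are hypothetical, and the pause is applied immediately rather than at the next checkpoint.

```python
from dataclasses import dataclass, field

# Hypothetical state names; the usage model does not define a state vocabulary.
ALLOWED_ACTIONS = {
    "in_progress": {"cancel", "pause"},
    "paused": {"cancel", "resume"},
    "error": {"cancel", "retry"},
    "completed": set(),
    "cancelled": set(),
}

NEXT_STATE = {
    "cancel": "cancelled",
    "pause": "paused",
    "resume": "in_progress",
    "retry": "in_progress",
}


@dataclass
class DeploymentStatus:
    """Status report with the fields listed in the scenario."""
    state: str
    percent_complete: int
    estimated_delivery: str
    resource_status: dict = field(default_factory=dict)
    errors: list = field(default_factory=list)
    activity_log: list = field(default_factory=list)

    def allowed_actions(self):
        return sorted(ALLOWED_ACTIONS[self.state])


def apply_action(status, action):
    """Apply a subscriber action, rejecting it if the current state forbids it."""
    if action not in ALLOWED_ACTIONS[status.state]:
        raise ValueError(f"'{action}' is not allowed while {status.state}")
    new_state = NEXT_STATE[action]
    status.activity_log.append(f"{action}: {status.state} -> {new_state}")
    status.state = new_state
    return status
```

Keeping the permitted actions in a table makes it straightforward for a portal or API to show the subscriber only the actions that apply to the current state.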
3. On submit, completeness of data is verified by the cloud provider.
4. If data is incomplete, the cloud provider highlights the missing data and prompts the cloud subscriber to review.
5. On successful submission, the cloud provider checks if the requested resources in the order can be met.
6. If the order can be met, the cloud provider reserves the resources for the cloud subscriber (triggering Usage Scenario 3: Reserve Resources for Service). The reservation has an expiration date.
7. If the order cannot be met, the cloud provider will provide an estimated date in the future when the cloud subscriber can expect resources to become available.
8. Cloud subscriber reviews the order one last time prior to release for automatic deployment.

Failure Condition 1: Data is incomplete.
Failure Handling 1: The subscriber is notified and prompted to complete the request.
Failure Condition 2: Cloud provider does not have capacity to reserve resources for the cloud subscriber.
Failure Handling 2: The subscriber is notified and can resubmit the order at a later time. Ideally the provider notifies the subscriber when capacity is available.
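
The decision flow in the steps above (verify completeness, check capacity, then reserve or give an estimated date) can be sketched as a single function. The field names in `REQUIRED_FIELDS`, the outcome labels, and the simplification that capacity is one pooled count of instances are all assumptions for illustration; the estimated availability date is supplied by the caller because the usage model does not say how a provider derives it.

```python
from datetime import date

# Assumed field names for a provisioning request.
REQUIRED_FIELDS = ("name", "service_type", "provider", "resources")


def submit_request(order, free_units, next_capacity_date):
    """Walk a provisioning request through the submission checks.

    Returns an (outcome, detail) tuple where outcome is one of
    "incomplete", "reserved", or "deferred".
    """
    # Completeness check: report every missing or empty field.
    missing = [f for f in REQUIRED_FIELDS if not order.get(f)]
    if missing:
        return "incomplete", missing
    # Capacity check: simplified to a single pooled count of instances.
    needed = sum(order["resources"].values())
    if needed <= free_units:
        return "reserved", needed
    # Order cannot be met now: hand back the provider's estimated date.
    return "deferred", next_capacity_date
```

For example, a complete order for six instances against a pool with ten free units yields `("reserved", 6)`, while the same order against three free units yields `("deferred", <estimated date>)`.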
Failure Condition 1: Cloud service not started or execution of test scenario unsuccessful. Failure Handling 1: Escalation through the cloud provider's support path.
Success Scenario 1: Cloud service is stopped, resources are freed up. ID and configuration data are kept for billing and future usage.
Steps:
1. Cloud service is stopped. Correct order of stopping dependencies is ensured.
2. Resources used by the cloud service are freed up and made available to the resource pool again.
3. ID, configuration, and billing record are kept for future re-initialization (see start).
Failure Condition 1: Cloud service is not stopped, not stopped at the designated time, or only partially stopped. Failure Handling 1: Tier 2 support of cloud provider.
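One way to ensure the correct order of stopping dependencies is to derive it from the start order: components are started after the things they depend on and stopped in the reverse order. The sketch below assumes a hypothetical `depends_on` mapping and uses Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def start_order(depends_on: dict) -> list:
    """Order in which components can start: dependencies come first."""
    return list(TopologicalSorter(depends_on).static_order())


def stop_order(depends_on: dict) -> list:
    """Order in which components should stop: dependents go first."""
    return list(reversed(start_order(depends_on)))


# Hypothetical three-tier service: web needs app, app needs db.
tiers = {"web": {"app"}, "app": {"db"}, "db": set()}
```

For the `tiers` example, the database starts first and stops last.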
Failure Condition 1: Cloud service is not suspended. Failure Handling 1: Tier 2 support of cloud provider.
Failure Condition 1: Cloud service is not resumed, not resumed at the designated time, or just partially resumed. Failure Handling 1: Tier 2 support of cloud provider.
Assumption 1: CIaaS has a set of KPIs which are configured based on the published SLA and contract agreement between the subscriber and provider.
Assumption 2: Defined monitoring points for data collection accommodate both KPI requirements and compliance.
Assumption 3: Monitoring is based on a proactive, real-time architecture for all relevant KPI elements, according to a process which is compliant with applicable country and corporate governance.
Assumption 4: A database has been created to store trend data, including ongoing maintenance.
Success Scenario 1: Cloud subscriber receives an alert message when an established threshold is triggered.
Steps:
1. Thresholds and associated messaging are configured.
2. The system detects that an established threshold is reached in the running system, such as health, performance, or billing conditions.
3. The trigger action is automatically executed to send an alert message in the proper format (email, SMS, or message queue).
4. Identified data for trend analysis is posted to a database.
5. Data is retained for a sufficient period to enable analysis and compliance.
Failure Condition 1: Trigger condition is realized but alert is not generated. Failure Handling 1: Error in alerting is logged and escalated. Cloud provider delivers a manual alert. Failure Condition 2: Correct data is not retained in a database to enable sufficient trend analysis. Failure Handling 2: Update monitors to collect the correct data and to retain it in a suitable location. Failure Condition 3: An agreed threshold was not established/captured by the cloud provider system. Failure Handling 3: Technical business remediation, according to the agreement.
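The alerting steps and Failure Handling 1 above can be illustrated with a small sketch: samples are compared against configured thresholds, alerts are passed to a delivery function, and delivery errors are logged for escalation. The metric names, the message format, and the `send` callable are assumptions for illustration only.

```python
def check_thresholds(samples: dict, thresholds: dict) -> list:
    """Return one alert message per metric that has reached its threshold."""
    alerts = []
    for metric, limit in thresholds.items():
        value = samples.get(metric)
        if value is not None and value >= limit:
            alerts.append(f"ALERT {metric}: {value} >= {limit}")
    return alerts


def dispatch(alerts: list, send, error_log: list) -> None:
    """Send each alert; log any delivery error so that it can be escalated."""
    for alert in alerts:
        try:
            send(alert)  # e.g. an email, SMS, or message-queue sender
        except Exception as exc:
            error_log.append(f"alert not delivered: {alert} ({exc})")
```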
Failure Condition 1: Failure to trigger remediation routine or script. Failure Handling 1: Identify the trigger and the correct remediation routine or script, and add them to the automation mechanism. Failure Condition 2: Remediation routine or script fails. Failure Handling 2: Update the routine or script if applicable, or correct the pre-conditions expected or required by the routine or script.
Failure Condition 1: Reports are not defined. Failure Handling 1: Identify the relevant reports that must be produced, and the KPI related data to support them, and produce the report. Failure Condition 2: Reports are inaccurate. Failure Handling 2: Identify if the incorrect data is being used, or if inadequate data is collected, and correct the gap. Failure Condition 3: Report data is not retained. Failure Handling 3: Identify data and report retention requirements. Correct for future report generation. Failure Condition 4: Reports are inadequate to determine key service factors. Failure Handling 4: Identify the correct source data needed to represent the service factor, collect it, and update the report. Failure Condition 5: Compliance requirements are not met. Failure Handling 5: Review the process required to collect the data and produce the report, and identify what gaps exist to achieve compliance, and correct the process.
Steps:
1. All capacity reporting dimensions are defined.
2. Supporting data collection is scheduled or automated.
3. Capacity data is analyzed automatically from the available source.
4. Forecasts of events and changes are automatically produced.
5. The information is sent to the cloud subscriber for approval, if required by the contract.
Failure Condition 1: Service quality is impacted. Failure Handling 1: Identify the relevant data to analyze and develop a trend line with thresholds to forecast decision or change points. Failure Condition 2: Unplanned costs arise. Failure Handling 2: Identify the relevant data to analyze and develop a trend line with thresholds to forecast decision / change points.
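The failure handling above calls for a trend line with thresholds to forecast decision or change points. A minimal sketch of that idea, assuming equally spaced capacity samples (at least two) and a simple least-squares line, could look like this; the function names are illustrative.

```python
def fit_line(ys: list) -> tuple:
    """Least-squares slope and intercept for samples at periods 0, 1, 2, ..."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until(ys: list, threshold: float):
    """Periods after the last sample until the trend reaches the threshold.

    Returns None when the trend is flat or falling.
    """
    slope, intercept = fit_line(ys)
    if slope <= 0:
        return None
    crossing = (threshold - intercept) / slope
    return max(0.0, crossing - (len(ys) - 1))
```

For utilization samples of 50, 60, 70, and 80 percent, the fitted line rises by 10 points per period, so a 100 percent threshold is reached two periods after the last sample.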
Failure Condition 1: Required data to support KPI reporting is not available. Failure Handling 1: Identify the needed data, its source, and any other factors, such as retention requirements, and update the monitoring system to collect it. Failure Condition 2: Unacceptable methods are used to collect the data. Failure Handling 2: Identify the correct method to be used for collecting the relevant data, and correct the method or process. Failure Condition 3: Unacceptable methods are used to convert the data into reports. Failure Handling 3: Identify the correct method for interpreting the data into compliant reports, and then correct the process.
Failure Condition 1: Change failure. Failure Handling 1: Run back-out script. Failure Condition 2: Capacity constraint: no landing zone for change. Failure Handling 2: Delay change, notify consumer of constraint, and find an alternative resource.
Steps:
1. Front-end Site Traffic: Scale based on the number of incoming requests, such as web pages, objects, and data transfer.
2. The cloud consumer completes the necessary steps to request a specific service and requests a confirmation for the service provision or expansion.
3. The cloud provider verifies possible constraints of the provision request, such as terms of service and contract, SLAs, etc.
4. Back-end Batch Processing (Scale Horizontally): Load-based scaling: Scale based on the number of jobs in the queue; Time-based scaling: Scale based on how long jobs have been in the queue.
5. The cloud provider updates the contract for a specific service, and provisions service-specific cloud infrastructure within the terms of service requirements.
6. The cloud provider returns a confirmation message indicating the successful provision of the service.
7. The cloud provider returns evidence that the data has been deleted in accordance with the terms of service definitions.
Failure Condition 1: Scale-out request cannot be completed by the cloud subscriber. Failure Handling 1: Consumer is notified of the failed request and alternatives are suggested. Failure Condition 2: Scale-back request cannot be completed by the cloud subscriber. Failure Handling 2: Consumer is notified of the failed request and alternatives are suggested.
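The two back-end scaling rules named in the steps (load-based and time-based) can be combined into a single sizing decision. The sketch below is one possible interpretation; the parameter names, default limits, and the "add one worker" time-based rule are assumptions rather than anything the usage model prescribes.

```python
import math


def desired_workers(queue_len: int, oldest_wait_s: float, current: int = 1,
                    jobs_per_worker: int = 10, max_wait_s: float = 300,
                    max_workers: int = 20) -> int:
    """Pick a worker count from queue length and job waiting time."""
    # Load-based scaling: enough workers for the jobs in the queue.
    load_target = math.ceil(queue_len / jobs_per_worker)
    # Time-based scaling: add a worker if the oldest job waited too long.
    time_target = current + 1 if oldest_wait_s > max_wait_s else 0
    # Never drop below one worker or exceed the agreed maximum.
    return min(max_workers, max(1, load_target, time_target))
```

The cap on `max_workers` stands in for the contract and SLA constraints that the provider verifies before scaling out.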
5. For assurance levels silver, gold, and platinum, all files and databases of the operational storage, such as disks and SSDs, are deleted securely, that is, they are not recoverable.11 Additionally, for assurance levels gold and platinum, the file systems allocated to the service are reformatted, including overwriting all physical disk blocks.
6. For assurance levels gold and platinum, the memory of all machines (physical and virtual) and in the various network components, such as routers, switches, and firewalls, is securely deleted. The technical deletion process must be defined as being technically feasible and approved.
7. All archives on disks or tapes of the service are deleted securely so that they are not recoverable, or are destroyed.12
8. The provider hands over a report evidencing all cleanup actions that were executed. The report content is defined and is part of the contract between subscriber and provider. The report will take the assurance levels into account.
Failure Condition 1: Some transactions cannot be closed, or some operations cannot be resolved, when the service is shutting down gracefully. Failure Handling 1: The service is forced to shut down; unresolved operations or open transactions are logged and provided to the subscriber. Failure Condition 2: The subscriber's database or some files are corrupted after the service's shutdown. Failure Handling 2: Application log and database transaction log are provided to the subscriber with the database files. Application log and corrupted files are provided to the subscriber as well. Failure Condition 3: The transfer of the operational data to the subscriber failed. Failure Handling 3: The transfer is repeated, possibly over a different channel. Failure Condition 4: The provider cannot provide an evidence report that fits the contract between subscriber and provider. Failure Handling 4: Provider has to redo some clean-up actions and reproduce an evidence report. For assurance levels gold and platinum, the provider has to allow an external auditor to verify the completion of the clean-up process.
5. For assurance levels silver, gold, and platinum, all files and databases from the operational storage unit are deleted securely, and are not recoverable.6 Additionally, for assurance levels gold and platinum, the file systems allocated to the service are reformatted, including overwriting all physical disk blocks.
6. For assurance levels gold and platinum, the memory of all machines (physical and virtual) and in the various network components, such as routers, switches, and firewalls, is securely deleted as well. The technical deletion process must be defined as being technically feasible and approved.
7. All archives on disks or tapes of the service are deleted securely so that they are not recoverable, or are destroyed.12
8. The provider hands over a report evidencing all cleanup actions that were executed. The report's content is defined and is part of the contract between subscriber and provider. The report will take the assurance levels into account.
9. The resources are returned to the available pools for re-allocation within the cloud.
Failure Condition 1: Some transactions cannot be closed, or some operations cannot be resolved, when the service is shutting down gracefully. Failure Handling 1: The service is forced to shut down; unresolved operations or open transactions are logged and provided to the subscriber. Failure Condition 2: The subscriber's database or some files are corrupted after the service's shutdown. Failure Handling 2: Application log and database transaction log are provided to the subscriber with the database files. Application log and corrupted files are provided to the subscriber as well. Failure Condition 3: The transfer of the operational data to the subscriber failed. Failure Handling 3: The transfer is repeated, possibly over a different channel. Failure Condition 4: The provider cannot provide an evidence report that fits the contract between subscriber and provider. Failure Handling 4: Provider has to redo some clean-up actions and reproduce an evidence report. For assurance levels gold and platinum, the provider has to allow an external auditor to verify the completion of the clean-up process.
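The cleanup steps above vary by assurance level. As a rough sketch, the level-dependent actions can be expressed as a lookup on the level's rank. The action strings paraphrase the steps, the function name is hypothetical, and any bronze-specific handling defined outside this excerpt is not modelled.

```python
# Assurance levels in ascending order, as named in the usage model.
LEVELS = ("bronze", "silver", "gold", "platinum")


def cleanup_actions(level: str) -> list:
    """Return the cleanup actions that apply to the given assurance level."""
    rank = LEVELS.index(level.lower())
    actions = []
    if rank >= LEVELS.index("silver"):
        actions.append("securely delete files and databases on operational storage")
    if rank >= LEVELS.index("gold"):
        actions.append("reformat file systems, overwriting all physical disk blocks")
        actions.append("securely delete memory of machines and network components")
    # These two steps carry no level qualifier in the text.
    actions.append("securely delete or destroy archives on disks or tapes")
    actions.append("hand over cleanup evidence report")
    return actions
```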
[Figure: labels recovered from the diagram: Catalog, Capacity Planning, Cloud Consumers, Allocation, ISO, Front-end, Federation, Statistics]
A provision exists for an organization or organizations to take on a number of roles as middlemen, usually described as brokerage, integration, aggregation, or orchestration. Large parts of these functions might be automated. The interactions to and from cloud consumers and to and from cloud providers are similar, with central functionality sometimes acting primarily as a switching center. They may be invoked through a portal, but can be used in a more sophisticated manner when invoked through APIs. One issue that arises in such a multi-party arrangement is how visible, in terms of levels, prices, and volumes, the different suppliers' services will be to each other and to non-customers.
The components within the interactions are as follows:
- Master Services Agreement: As defined by the ODCA Usage Model: Regulatory Framework 13, a work group within the commercial framework provides a contractual framework in which services can be commissioned according to pre-agreed terms and conditions.
- Services Catalog: As defined by the ODCA Usage Model: Service Catalog 1, a work group provides a standard framework for the description of cloud services, especially IaaS, such that they can be described and discovered in a standard way and then compared between suppliers.
- Configuration: Complements the services catalog by providing an indication of the available capacities and prices of the supplier's environment at any one time. This can be used to support the implementation of conditional pricing, volume discounting, or a spot market, that is, a market where services are traded for immediate delivery.
- Provisioning: Determines the best fit for any particular request and assigns it to the relevant suppliers. This can be established by pre-defined criteria as described in TREC from the EC-sponsored OPTIMIS 14 project.
- Image Library: Used to hold and allocate re-configured machine images for deployment across any supplier. Images can be deposited by cloud consumers or providers, and may or may not be made available to others.
- Identity Management: Deployed on a federated basis, using standard protocols such as SAML, such that cloud consumer organizations maintain their own directory of their users, and cloud providers can establish trust relationships to permit access to their environments. These functions can cover both users who are entitled to commission and configure systems and the end users themselves. It includes aspects such as authorization levels to order further facilities.
- Delivery and Operations: Takes place with the cloud provider who delivers services to the relevant cloud consumers.
- Metering: Takes place during delivery, such that the cloud provider tracks the allocation or actual usage, depending on the nature of the service contract or the relevant resources. This makes data available to the cloud consumer for real-time tracking, and the data is used for subsequent billing.
- Monitoring: Allows normal or exceptional events, utilization levels, or component failures to be tracked in such a way that cloud consumers can be alerted to potential failures of their services, and that they, or some agent acting on their behalf, can potentially take remedial actions.
- Service Level Recording: Takes place when tracking the actual services delivered against service levels, such as availability, that have been agreed upon, for possible remediation and reporting to the cloud consumer. This may include utilization and performance levels that trigger the deployment or release of further resources.
- Support: An optional facility. This can be passive, such as the provision of frequently asked questions; or it can be more active, whereby some agent can detect anomalies or errors in the running environment, fix the relevant failing component, or recommend a course of action to the cloud consumer.
- Billing: Takes place on a periodic basis (typically monthly), involving the cloud consumer being notified of their charges, prompted by the contract terms, levels of actual usage, and service levels delivered.
- Payments: Take place from the cloud consumers directly to the cloud providers or through the broker organization.
- Termination: Takes place when the cloud consumer has finished with part or all of the services.
In addition, some other possible services are shown that surround various operational services:
- Advice and Guidance: Helping a potential cloud consumer determine what systems are most suitable for cloud, how they should be configured, and on which platforms they can be deployed.
- Implementation and Transition: Actually converting applications for cloud platforms, identifying and transferring the necessary data, setting up the environment, etc.; these are all things which the cloud consumer would otherwise have to do for themselves.
Successful CIaaS service orchestration requires a number of elements, including a service catalog, a services interface, and key performance indicators (KPIs). The following sections cover each of these elements in turn. The service catalog section includes background information and definitions as well as a detailed lifecycle process for users. Information about a service catalog's structure is also included.
8.1 Background
Service catalogs are designed to help organizations in their assessments and selections of cloud services. A detailed and up-to-date service catalog is an essential component of any provider's offerings. It is the mechanism by which a consumer can determine the capabilities and characteristics of one or more services in an industry-accepted, standard, and consistent manner. Moreover, a standard approach to content and classification within a typical service catalog will allow consumers to compare services between providers. This will help drive competitiveness in the marketplace, allow meaningful service benchmarking, and enable consumers to cross-check service details, costs, and service levels on a like-for-like basis. Ultimately, in a market where services become more and more intangible as products, a service catalog will enable the consumer to give the products more context and ensure they do not feel as if they are missing out on elements of a service which may be more obvious in a traditional outsourced IT agreement.
The concept and design for the service catalog should be applicable to all layers of cloud services, from Infrastructure as a Service (IaaS) to Platform as a Service (PaaS) and Software as a Service (SaaS), and to other kinds of services such as a Virtual Private Data Center as a Service (VPDCaaS). The desire is that there should be a core set of parameters that are applicable to all offerings, plus specific characteristics for each of the individual services. Again, consistency across providers will allow comparisons to be made between specific parts of an offering, such as availability or transactional throughput. Additionally, interoperability is improved, and it becomes easier to span clouds consistently for enriched business purposes, such as project interoperability, service integration, and capacity bursting.
Capacity bursting refers to a cloud consumer scaling out their infrastructure from one cloud provider to another, typically from a private cloud provider to a public cloud provider.
8.2 Description
The catalog must include and describe offered standard or custom services and their function and feature parameters, such as:
- Specific infrastructure services to host the application
- Additional services such as database and messaging
- The available technical characteristics, including: SLA Related KPIs for the technology layer; Dependencies (i.e., relationship with other services inside/outside the cloud); Capacity Usage display; Performance parameters; Automation; Availability; Supported assurance levels (bronze, silver, gold, platinum); Service Continuity parameters (RTO, RPO, RCO); Data classification and retention metrics; Security controls; Defined reports and auditing metrics
- Technology related characteristics: OS, patch and release levels; Bandwidth, Connectivity, firewall rules; Network capacity to access the service; Storage, backup and recovery options, including other management options; Disaster recovery options and facilities; Data retention facilities and archiving; APIs for service integration
- The offered environmental constraints, such as: Geographic location (jurisdiction); Regulatory and legal requirements or pre-approvals; Validated third-party assessments, such as ISOx
- The offered SLAs and qualities of the packaged service, including: Operationally by product; Provisioning and de-provisioning times; Service tiers
- The offered commercial terms, including: Cost per unit of capacity or consumption; Cost uplift and rebate for each service assurance parameter; Cost for any custom extensions; Discount structure for scale, user banding, minimum commitments, etc.; Necessary surcharges for pre-reservation of capacity; Applicable taxes and levies (if cross-border, regional, and so on); Any government credit schemes, such as carbon off-set, skills development, and so on
- General service rules, standards, and (pre)requirements
- Status (i.e., the service is obsolete / available / suspended / in development (Beta))
- Availability for delivery of service in the required timeframe, including: Amount of capacity immediately in stock; Amount of expandability in a defined amount of time to support growth
To some degree, the service catalog acts as a sales tool for the cloud service provider as it offers a list of the key services and what differentiates them from competitors.
[Figure: service catalog lifecycle between the Cloud Subscriber, the Service Catalog, and the Cloud Provider; recovered arrow labels: Consume Catalog, Specify Service, Implement, Update Service Catalog]
5. Cloud subscriber submits requirements to cloud provider.
6. Cloud provider commits to the changes and/or new services, or rejects them.
7. Cloud provider and cloud subscriber sign contract.
8. Cloud provider implements changes.
9. Cloud provider adds or changes entries to the service catalog. See Step 2.
10. Cloud provider orchestrates the cloud subscriber-specific set of services.
11. Cloud provider hands infrastructure over to cloud subscriber.
Example 1
To illustrate, let's look at provisioning a basic VMware virtual machine running Red Hat Enterprise Linux Version 6.1. A customer searches the service provider for a service that matches the description and receives an SLA object associated with that service. The SLA object includes information that uniquely identifies and describes the service provider, the service being offered, options on compute, network, and storage resources, available locations, pricing, and estimated time for provisioning.
The following is an XML example of a service object. The service named Red Hat Enterprise Linux Version 6.1 offers several options for the number of CPU cores, memory, and storage size. Additionally, it offers instances in Singapore, Germany, and the United States. The service is a CIaaS type infrastructure service (OS) available for the bronze and silver assurance levels. Finally, three types of SLAs are available in order to match the client's needs.

    <Service>
      <Name>Red Hat Enterprise Linux</Name>
      <Provider>ACME Cloud Corp</Provider>
      <Resource>
        <Type>Infrastructure</Type>
        <Name>OS</Name>
        <Version>6.1</Version>
      </Resource>
      <Resource>
        <Type>Hardware</Type>
        <Name>vCPU</Name>
        <Size>1, 2, 4, 8</Size>
      </Resource>
      <Resource>
        <Type>Hardware</Type>
        <Name>Memory</Name>
        <Size>1, 2, 4, 8</Size>
        <Unit>GB</Unit>
      </Resource>
      <Resource>
        <Type>Hardware</Type>
        <Name>Storage</Name>
        <Size>10, 50, 100, 500, 1000</Size>
        <Unit>GB</Unit>
      </Resource>
      <Location>SGP, GER, USA</Location>
      <Service_Type>CIaaS</Service_Type>
      <Assurance_Level>Bronze, Silver</Assurance_Level>
      <Service_Level>Budget, Professional, Premium</Service_Level>
    </Service>

The customer then places the order, specifying desired options, and customizing the VM instance. Customization includes giving a network name for the VM instance and specifying post-deployment tasks that should be run after the VM is online.
The service provider accepts that request and sends back a confirmation and a unique reference number to the customer for tracking. At any time, the customer can query the service provider with the reference number for status and any messages.
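A customer-side tool could read the options offered in such a service object and check an order against them before submitting it. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened copy of the example; the helper names and the shape of the order dictionary are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example service object.
SERVICE_XML = """
<Service>
  <Name>Red Hat Enterprise Linux</Name>
  <Resource><Type>Hardware</Type><Name>vCPU</Name><Size>1, 2, 4, 8</Size></Resource>
  <Resource><Type>Hardware</Type><Name>Memory</Name><Size>1, 2, 4, 8</Size><Unit>GB</Unit></Resource>
  <Location>SGP, GER, USA</Location>
</Service>
"""


def service_options(service_xml: str) -> dict:
    """Collect the selectable sizes per resource and the offered locations."""
    root = ET.fromstring(service_xml)
    options = {}
    for resource in root.findall("Resource"):
        sizes = resource.findtext("Size")
        if sizes:
            options[resource.findtext("Name")] = [int(s) for s in sizes.split(",")]
    options["Location"] = [loc.strip() for loc in root.findtext("Location").split(",")]
    return options


def invalid_choices(order: dict, options: dict) -> list:
    """Return the order keys whose value is not among the offered options."""
    return [key for key, value in order.items() if value not in options.get(key, [])]
```

An order such as `{"vCPU": 2, "Location": "GER"}` passes the check, while a request for 3 vCPUs is flagged because that size is not offered.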
Example 2
A more complex example of a CIaaS request may include a set of one or more instances of services from the service catalog, such as a request that describes the resources for multiple virtual machines that are required to land a multi-tier application. Such a request can carry more advanced customizations, such as priority order and dependencies for deployment, firewall rules for communications between VMs, cluster configurations, load balancers, parameterized post-deployment tasks, or auto scaling for a subset of VMs. In this more complex example, service orchestration is necessary for repeatable, consistent, and successful fulfilment of these complex types of service requests.
Example 3
Another example is the ability to burst into a public cloud. A consumer has a business application that is experiencing high demand. They wish to burst to an external provider and need additional capacity within 10 minutes. They may have contractual relationships set up with five cloud providers to burst in this manner. By querying the SLA response times for each cloud provider, they can choose the provider whose response time is most appropriate to their needs, rather than risk bursting from one overloaded data center to another which is suffering the same temporary capacity issues.
The approach to defining the use cases in this document is to break up the overall process into a number of sub-elements and to define each of those as a usage scenario. The following usage scenarios assume user self-service from a shared multi-tenant service provider against an existing commercial relationship. The term "order" in the usage scenarios represents a request for a service from an existing client to a service provider, and is not intended to focus on the commercial dimension of corporate-level ordering with order numbers, contracts, etc. The context is to enable a user with an existing account with the service provider to order a new service from the IaaS service. In this context, the term is used to package the technical provisioning of a virtual system on an IaaS platform, the technical administration, and the request for changes to be effected to provision this service in the following usage scenarios.
This section builds upon several ODCA Usage Models: Compute IaaS 4, Standard Units of Measure for IaaS 2, and Guide to Interoperability Across Clouds 16. It takes the perspective of the cloud subscriber in that the APIs are derived from the usage scenarios defined earlier in this document. APIs are expressed in terms of the Representational State Transfer (REST) architectural model. The intent is not to provide a complete specification for cloud providers. Rather, it is intended to convey the set of capabilities required in an API.
With REST, there is the concept of resources and collections. A resource is a typed object with properties, methods, and associations with other objects. A collection is a directory or group of resources. Since resources are addressed through Uniform Resource Identifiers (URIs), the expectation is for the cloud provider to use conventional URI design to publish APIs. For example, for a provider named myprovider and a subscriber named mysubscriber:
http://api.iaas.myprovider.com
http://api.iaas.myprovider.com/services/V1
http://api.iaas.myprovider.com/mysubscriber
http://api.iaas.myprovider.com/mysubscriber/servicerequests
In addition, techniques for searching and filtering collections should be supported. For instance, the selection of a certain subset or number of items within a collection can be accomplished by applying a $filter predicate expression as part of the URI.
Another important REST concept is the use of standard methods. The basic four methods are as follows:
- Post: Create a resource
- Get: Read information about a resource
- Put: Update a resource
- Delete: Remove a resource
The following table describes the requested service orchestration collections and associated usage scenarios:
Columns: Post, Get, Put, Delete, Usage Scenarios. Post, Put, and Delete are NA for every collection in this table.
Get: Get list of all available services. Usage Scenarios: 1. Compose Service
Get: Get list of all possible user roles. Usage Scenarios: 1. Compose Service
Get: Return the list of all compute objects that have been created through the provisioning process. Usage Scenarios: 7. Start, 8. Stop, 9. Suspend, 10. Resume
Get: Get the list of possible events that can be generated by the system. Usage Scenarios: 12. System Administration and Remediation
Get: List the KPIs that are supported in the IaaS service. Usage Scenarios: 13. Reporting
Get: Returns the list of all reservations for the account or user. Usage Scenarios: 3. Reserve Resources for Service
Get: Returns a list of service requests that have been created for the user or account. Usage Scenarios: 2. Submit Provisioning Request
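The URI conventions and the $filter technique described in this section can be sketched as a small helper that builds collection URIs. The base URI and the servicerequests collection come from the examples above; the helper name and the filter expression syntax (`status eq 'deployed'`) are assumptions, since the usage model does not define a predicate grammar.

```python
from urllib.parse import quote

BASE = "http://api.iaas.myprovider.com"

# The four basic REST methods and the operation each one performs.
METHODS = {"POST": "create", "GET": "read", "PUT": "update", "DELETE": "remove"}


def collection_uri(subscriber: str, collection: str, filter_expr: str = "") -> str:
    """Build the URI of a subscriber collection, optionally with a $filter."""
    uri = f"{BASE}/{quote(subscriber)}/{quote(collection)}"
    if filter_expr:
        # Percent-encode the predicate so it is safe inside the query string.
        uri += "?$filter=" + quote(filter_expr)
    return uri
```

A client would issue a Get request against the resulting URI to read the matching service requests.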
The following table describes the requested resources and maps them to the associated usage scenarios:
Post
NA - have to contact the provider to create the account
Get
Get current subscriber settings
Put
Update subscriber settings
Delete
NA - have to contact the provider to delete the account
Usage Scenarios
1. Compose Service 16. Change Deployed Service Instance 1. Compose Service 12. System Administration and Remediation
User
Provision a new user for the subscriber account of a particular role. Get current user settings.
Reservation
Update resource reservation, for example, the expiration date. Change state of object to trigger reboot, suspend, resume. Configuration determines whether prior state is preserved for suspend/resume.
3. Reserve Resources for Service 6. Reopen Expired Request 7. Start 8. Stop 9. Suspend 10. Resume
Server
Eventlog
Returns trend data of monitors for a particular time period; apply filter.
Forecast
Create a forecast object for a particular time period based on history, KPIs and forecast properties. Generate reservation settings based on the forecast object. NA
KPI
Get the details for a particular KPI (current value, not just the metadata). NA. View monitoring attributes and status. Change the attributes of the monitor and apply the changes to the thread.
NA
13. Reporting
Monitor
Create a new monitoring thread which specifies the format (email, SMS, or message queue) for each event as it occurs
11. System Monitoring and Data Collection 14. Capacity Planning 17. Auto Scaling
Resource
Provisionqueue
Post
Submit a service request to the provisioning queue
Get
See a list of service requests that have been submitted for provisioning
Put
Remove request from the provisioning queue
Delete
Delete a service request that is not fully deployed
Usage Scenarios
2. Submit Provisioning Request 5. Track Status and Manage Deployment
Remediation
Attributes: 1. The Monitor with which the remediation is associated. 2. The event for which remediation will take place 3. The remediation script
Report
Get: Get the data associated with a particular report that has been created, including the location where the report is published
Put: Change attribute in order to trigger the creation of a report; specify the report date/time or recurring frequency and the report delivery mechanism
Delete: Remove a report
Reservation
Get: View current resource reservation for a service request
Put: Revise a reservation
Service Request
Post: Set up an order for a new specific service
Get: View detail of a service request
Usage Scenarios
3. Reserve Resources for Service 1. Compose Service 5. Track Status and Manage Deployment All
Health
NA
Retrieve general information describing whether the service is up and functioning normally
NA
NA
Response status codes are expected to conform to the standard HTTP format as maintained by the Internet Assigned Numbers Authority (IANA). Response status codes are expressed as three-digit numbers where the first digit indicates one of five standard classes. For this usage model there are two relevant classes:
4XX is a client error, where the request contains bad syntax or cannot be fulfilled. For example, 400 is a bad syntax error and 401 is an unauthorized (authentication required) error.
5XX is a server error, where the server failed to fulfill a seemingly valid request. For example, 500 is an internal server error and 503 is a service unavailable error.
A complete list of status codes can be found in the IANA HTTP Status Code Registry at http://www.iana.org/assignments/http-status-codes/http-status-codes.xml [17]
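As a small illustration of the status classes described above, the following sketch derives the class from the leading digit of a code and looks up the registered reason phrase using Python's standard `http.HTTPStatus` enumeration. The helper names `status_class` and `describe` are hypothetical and are not part of any ODCA-defined API.

```python
from http import HTTPStatus

# The five standard classes, keyed by the leading digit of the status code.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Return the standard class name for an HTTP status code."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


def describe(code: int) -> str:
    """Combine the code, its registered phrase (if any), and its class."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unassigned"  # in range, but not known to this Python version
    return f"{code} {phrase} ({status_class(code)})"


if __name__ == "__main__":
    for code in (400, 401, 500, 503):
        print(describe(code))
```

An orchestration client could use such a check to decide whether to correct and resubmit a request (client error) or to retry later (server error).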
KPI Principles:
  Should define a specific measure title.
  The parameters of the measure constitute the aggregated SLA.
  Has a high and low water mark (the aggregated SLA is set against one of these).
  Can have multiple dimensions, some of which are shared:
    Service Consumer view, to gauge quality of service.
    Service Provider view, to manage overall services.
    Shared view on some items.
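The principles above can be sketched as a small data structure: a KPI with a title, a low and a high water mark, and an indication of which mark the aggregated SLA is set against. The class and field names are hypothetical, the example values are invented for illustration, and the comparison rule (at or above the low mark, or at or below the high mark) is an assumption about how such an SLA would typically be evaluated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kpi:
    """A KPI with water marks; the SLA is set against one of them."""

    title: str
    low_water_mark: float
    high_water_mark: float
    sla_against: str = "low"  # "low" or "high" (hypothetical convention)

    def __post_init__(self) -> None:
        if self.sla_against not in ("low", "high"):
            raise ValueError("sla_against must be 'low' or 'high'")
        if self.low_water_mark > self.high_water_mark:
            raise ValueError("low water mark must not exceed high water mark")

    def meets_sla(self, value: float) -> bool:
        """Assumed rule: stay at/above the low mark or at/below the high mark."""
        if self.sla_against == "low":
            return value >= self.low_water_mark
        return value <= self.high_water_mark


# Illustrative values only; real marks are contract- and SLA-specific.
availability = Kpi("Availability (%)", low_water_mark=99.9, high_water_mark=100.0)
response_time = Kpi("Service Response Time (s)", 0.2, 2.0, sla_against="high")
```

Availability is a "higher is better" measure, so its SLA sits on the low water mark; response time is "lower is better", so its SLA sits on the high water mark.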
The KPIs recommended as a base for service orchestration follow:
Agility, which includes:
  Capacity: The maximum quantity of a service that a service provider can deliver while meeting defined SLAs.
  Elasticity: The ability of a service to adjust available resources to meet demands.
  Scalability: The ability of the service provider to increase or decrease the quantity of resources available to meet service consumer requirements.
Service Assurance, which includes:
  Availability: The amount of time that a client can make use of a service.
  Maintainability: The ability of the service provider to make modifications to the service to keep the service in good repair, without disrupting running services.
  Recoverability: The degree to which a service is able to resume a normal state of operation after an unplanned disruption.
  Reliability: Measures of how a service operates without failure under given conditions during a given time period.
  Fault Tolerance: The ability of a service to continue to operate properly in the event of a failure in one or more of its components.
  Auditability: The ability to provide evidence of service usage, including completed and aborted requests.
  Security patching: Reporting of patch procedures, including success or failure, and delay after patch availability.
Financial, which includes:
  Onboarding Costs: The cost of migrating into the cloud service.
  Operating Costs: The client cost to consume the service. This includes recurring flat costs, such as monthly access fees, and usage-based costs.
Performance, which includes:
  Accuracy: The extent to which a service adheres to its defined parameters.
  Functionality: The range of defined features provided by a service.
  Interoperability: The ability of a service to interact with other services, usually associated with the use of open standards.
  Service Response Time: The time between when a transaction is triggered and when the response is returned.
Security, which includes:
  Access Control, Privilege Management: The policies and processes in use by the service provider to ensure that only the authorized representatives with appropriate status make use of the provided services.
  Country Legislative Compliance: The ability of the service provider to demonstrate compliance of services with applicable country legislation.
  Corporate Legislative Compliance: The ability of the service provider to demonstrate compliance of services with applicable corporate requirements.
  Data Integrity: The ability to keep the data that is created, used, and stored in its correct original form, so that service consumers may be confident that it is valid.
  Physical and Environmental Security: The policies and processes in use by the service provider to protect the service facilities from unauthorized access, damage, or interference.
For the purposes of Service Orchestration, the KPIs should align with any definitions provided by the ODCA Master Usage Models: Compute Infrastructure as a Service [4] and Commercial Framework [3]. The following table presents some of the more important measures to be considered in the KPI calculation:
Description of Measure: Ability of a cloud provider to perform requested actions correctly. An appropriate measure would be per cent done right first time.
Relevant Processes: Order/provision; start/stop
Description of Measure: As defined by CIaaS, the configurability and expandability of the solution. An appropriate measure would be cycle time to perform such a change.
Relevant Processes: Start/stop; run
Description of Measure: Ability for cloud subscriber or regulatory agency to request that the provider make all relevant data available to an auditing agency, and for provider to periodically audit its information security program. An appropriate measure would be last audit date, or days since last audit.
Relevant Processes: Compliance
Availability
Description of Measure: As defined in the ODCA Master Usage Model: CIaaS [4], the degree of uptime for the end-to-end solution (e.g., taking into account contention probabilities). Appropriate measures would include overall CIaaS service availability, availability of individual service components (e.g., service management portal or API), and aggregate availability of a cloud subscriber's active VMs.
Relevant Processes: Discover/negotiate; run
Description of Measure: Cloud subscriber receives an accurate accounting of cost, based on a consistent per-unit price. Trend of per-unit price over time should be made available for each consumer.
Relevant Processes: Order/provision; start/stop
Description of Measure: A system's exposure to risks or vulnerabilities. Appropriate measures would include:
  Weighted security risk score that expresses risk to business over time
  Time to Repair or Time of Exposure for the business to vulnerabilities
  Weighted number of security breaches that incorporates detected breaches and the severity level of those breaches
  Weighted time to recover for instances when breaches actually occur
  FIRST.org's Common Vulnerability Scoring System (CVSS) provides an open framework for quantifying these characteristics.
Relevant Processes: All
In the above examples we have suggested a measure but have not specified the high- or low-water marks, as those will be contract- and SLA-specific. The same measures should be used regardless of service tier, however, to provide a cloud subscriber with a means of comparing the cost and benefit of a different tier.
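Two of the measures suggested above, "per cent done right first time" and "days since last audit", reduce to simple arithmetic. The sketch below shows one way to compute them. The function names and the input values are hypothetical; how a provider counts a request as "right first time" is contract-specific and is not defined here.

```python
from datetime import date


def percent_right_first_time(total_requests: int, first_time_successes: int) -> float:
    """Share of requested actions completed correctly on the first attempt."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= first_time_successes <= total_requests:
        raise ValueError("first_time_successes must be between 0 and total_requests")
    return 100.0 * first_time_successes / total_requests


def days_since_last_audit(last_audit: date, today: date) -> int:
    """Number of whole days elapsed since the last audit date."""
    if last_audit > today:
        raise ValueError("last_audit cannot be in the future")
    return (today - last_audit).days


if __name__ == "__main__":
    # Illustrative inputs only.
    print(percent_right_first_time(200, 190))
    print(days_since_last_audit(date(2012, 1, 1), date(2012, 3, 1)))
```

Whatever the tier, the same functions would be applied; only the water marks against which the results are compared would differ.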
[Figure: process diagram; recoverable labels: Stop, Report, Monitor, Forecast, Terminate, Comply]
The high level process for a typical service which requires orchestration is as follows:
Step, process, description, and related usage scenarios:
1. The cloud subscriber establishes access to the cloud provider services. This includes service discovery through the service catalog, establishment of a service contract with SLAs and pricing, and the creation and management of user accounts with defined security access. This process also includes the initial creation of service requests.
2. Reserve: The subscriber places a reservation for compute capacity with the provider. In effect, this process notifies the provider that a service request order is pending and guarantees that the resources needed by the subscriber will be available within the reservation timeframe.
3. Order: The cloud subscriber submits a provisioning request with the provider, associated with a reservation.
   Usage scenarios: 2. Submit Provisioning Request
4. Deploy: The cloud provider fulfils the request and notifies the subscriber. The servers become available. The cloud subscriber can make any needed configuration changes and load any applications and data.
   Usage scenarios: 4. Deploy Service; 5. Track Status and Manage Deployment
5. Start: The cloud subscriber boots the servers and starts consuming compute resources.
   Usage scenarios: 7. Start
6. Monitor: The subscriber monitors the running service for event conditions.
   Usage scenarios: 11. System Monitoring and Data Collection; 12. System Administration and Remediation
7. Report: Scheduled and ad-hoc reports are generated and published.
8. Forecast: The cloud provider produces a capacity forecast for the subscriber, based on historical usage trends and expected new demand defined by the subscriber.
9. Update: Make changes to the running service based on new requirements such as the maintenance of applications and data, scaling demand or other business needs.
   Usage scenarios: 16. Change Deployed Service Instance; 17. Auto Scale; 9. Suspend; 10. Resume
10. Stop: The subscriber stops the service to apply changes from an update or to simply discontinue the use of the services. The servers are no longer available to subscriber end users. Billing stops.
   Usage scenarios: 8. Stop
11. Terminate: The subscriber or provider effects a service termination, which results in the return of compute resources to the pool of available capacity.
   Usage scenarios: 19. Service and Data Termination and Deletion
12. Comply: The subscriber and provider ensure regulatory requirements associated with service termination are met.
   Usage scenarios: 18. Comply Regulatory Requirements
13. Audit: An audit may be initiated at any time during the service lifecycle by the cloud subscriber or provider in response to legal or corporate requirements.
   Usage scenarios: 15. Auditing
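The lifecycle steps listed above can be read as a simple state machine. The sketch below encodes one plausible set of transitions and checks a proposed sequence against it. The transition table is an assumption made for illustration: the usage model names the steps but does not prescribe this exact graph. Monitor, Report, Forecast, and Audit are treated here as activities that run alongside a started service rather than as states, which is also an assumption.

```python
from enum import Enum


class Step(Enum):
    RESERVE = "Reserve"
    ORDER = "Order"
    DEPLOY = "Deploy"
    START = "Start"
    UPDATE = "Update"
    STOP = "Stop"
    TERMINATE = "Terminate"
    COMPLY = "Comply"


# Hypothetical transition table: which steps may directly follow each step.
ALLOWED_NEXT = {
    Step.RESERVE: {Step.ORDER},
    Step.ORDER: {Step.DEPLOY},
    Step.DEPLOY: {Step.START},
    Step.START: {Step.UPDATE, Step.STOP},
    Step.UPDATE: {Step.UPDATE, Step.STOP},
    Step.STOP: {Step.START, Step.TERMINATE},
    Step.TERMINATE: {Step.COMPLY},
    Step.COMPLY: set(),
}


def validate_sequence(steps: list) -> bool:
    """Raise ValueError on the first transition not in ALLOWED_NEXT."""
    for current, following in zip(steps, steps[1:]):
        if following not in ALLOWED_NEXT[current]:
            raise ValueError(
                f"{current.value} cannot be followed by {following.value}"
            )
    return True
```

An orchestration engine could run a check of this kind before triggering a workflow, so that, for example, a Start request is rejected for a service that has never been deployed.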
Please refer to the ODCA usage scenarios in section 6.0 for detailed process steps for each core activity.
Service orchestration enables a cloud consumer to invoke services, and changes to those services, by initiating a number of workflows based on the lifecycle event or phase being triggered. Orchestration and the associated workflows must comply with corporate and service-related rules, and must support corporate processes and policies. Rules center on services and service compilations (e.g., an HA configuration with redundant storage must be deployed to enable 99.9+% availability). Corporate policies may dictate that a contract must exist between the service receiver and the service provider before a service catalog may be published, or before an actual service can be ordered by a service receiver. This is sensible, since the contract lays the foundation for setting expectations and for the relationship (whether internal IT-to-business or business-to-business).
There are therefore three key phases in a service lifecycle, and orchestration must recognize which phase is relevant to a request, and hence which rules, processes, workflows, and systems must be triggered. The lifecycle events and phases that trigger slightly different workflow routes are as follows:
Activity
Compose, order, start cloud services (plan and build), and establish a supporting or backing contractual relationship.
Operate and change cloud services (run and change) based on existing services and contracts. GUI and API.
Stop, end and delete cloud services (end and delete) based on contractual agreements, compliance requirements, or an abnormal condition. GUI or API, unless there is an abnormal contractual situation which requires manual steps, such as early termination with penalty.
These workflows (which are triggered by the orchestration process) progress through a number of layers and sections within the cloud ecosystem, roughly described as follows:
1. Composing sub-services according to business requirement from a catalog of options, typically via a GUI or API.
2. Contracting and establishing a commercial relationship including both contractual and financial frameworks, typically through corporate procurement functions.
3. Deploying the technology related elements, automated and manual workflows.
4. Deploying the service and support related elements, automated and manual workflows.
5. Monitoring, reporting and auditing the compiled services, automated data collection, and report generation.
These functional areas are depicted by means of the following graphic, with the probable workflows for the service lifecycle phases overlaid onto the functional areas.
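[Figure: functional areas of the cloud ecosystem with lifecycle workflows overlaid. Functional areas: Client Interface, Commercial, Technology, Services. Other labels: GUI, API, Contractual, Financial, SLAs, Software, Compute, Location, Operations, One-Off Events, Initiate, End/Delete.]
Note: Lines indicate different sequences for the three major phases of a service lifecycle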
Service orchestration recognizes a series of pre-planned and wired processes and triggers associated with catalog-based services, with the activities available for those services, and with the rules and processes associated with those activities. As a request is compiled by the service consumer, a package is built up of these activities, workflows, rules, and processes. This determines the specific process flow for that orchestration package, as represented in the previous graphic.
Standard processes are defined by the cloud service provider, and the rules behind the service catalog must align to those processes. These processes must enable compliance as defined by the relevant ODCA Usage Model: Service Catalog [1] applicable to that scenario, the security and technology deployment aspects described in the ODCA Master Usage Model: Compute Infrastructure as a Service (CIaaS) [4], and service compilation based on the defined SLAs and services, together with the applicable commercial framework behind those services.
The table below describes the functions, and later the sub-systems, that must be invoked or triggered for each lifecycle activity, based on triggers from the orchestration process and the phases through which an orchestration request must flow. Some triggers are compiled as a service request by the cloud consumer, and some depend on the rules behind the service catalog associated with that particular lifecycle activity.
The service provider must understand and define, in advance, the process and associated workflow interaction for each of the lifecycle activities in order to support its standard service catalog. The service consumer must recognize the service lifecycle phase being addressed, and the expected interactions and triggers associated with the consumer's responsibilities in each phase. These pre-defined processes are defined per the service provider's corporation, and are usually based on:
Compliance requirements at both country and corporate level, such as applicable legislation, privacy, and rules.
Commercial requirements, such as contract structure and wording.
Security structure, such as internet or corporate network-based security systems and tools.
Operational environment structure, such as infrastructure location and service team structures.
Examples of the defined cloud consumer responsibilities could be as follows:
Reporting
1. Triggers the workflow for the creation of standard reporting for the composed service, with associated data collection and monitoring queues.
2. Reference 3 and 4.
3. Initiates reports for the service provider: reports on clients, usage overview, SLA/KPI compliance, revenue, resource sprawl, needed capacity, efficiency, license use, risk, quality, etc. For this purpose, collects data from Service Level Management and Business Service Management.
4. Initiates customizable reports for the user (service overview, usage, SLA and KPI compliance, pricing).
Operations
1. Initiates and triggers the workflow to communicate to the involved teams for the composed service that the service is being instantiated; triggers packaging and deployment of the service.
1. Initiates and triggers the deployment of the technology resources to support the composed services.
Service Orchestration Process Section | Functional Description: Run Phase | Functional Description: End Phase
Compose Services
Technology Layer
1. Triggers the financial system to start its workflow for monthly batch or ETL usage upload and cost calculation, and invoice generation
1. Triggers the workflow to release the resources and return them to the available resource pool
1. Triggers the financial system to poll for final usage values and tally them up
1. Starts workflow to release resources, scrub them, and return them to the available resource pool. Reclaim of unused services
Lifecycle activities: Order Cloud Services, Change Cloud Services. Sub-systems: cloud portal, identity management, service catalog, configuration management system, Master Services Agreement.
Lifecycle activities: Order Cloud Services, Change Cloud Services. Sub-systems: cloud shop, provisioning, image library, identity management.
Lifecycle activities: Order Cloud Services, Change Cloud Services, Run Cloud Services. Sub-systems: service catalog, billing system, metering system, Master Services Agreement, payments.
Lifecycle activities: Order Cloud Services, Change Cloud Services, Run Cloud Services. Sub-systems: CRM system.
Lifecycle activities: Run Cloud Services, Change Cloud Services. Sub-systems: cloud portal, management and monitoring system, metering, service level recording, identity management.
Lifecycle activities: Run Cloud Services, Change Cloud Services. Sub-systems: management and monitoring system, CRM system, service level recording.
Lifecycle activities: Run Cloud Services, Change Cloud Services. Sub-systems: management and monitoring, service desk, administration portal, identity management, delivery and operations, provisioning.
Lifecycle activities: Order Cloud Services, Change Cloud Services. Sub-systems: administration system, deployment system, management and monitoring, license management system, identity management.
As per ODCA Usage Model: Provider Assurance [9]
Bronze: Up to 4 hours nonavailability; up to 4 hours transaction response time; monthly; updated monthly; no automation.
Silver (Enterprise equivalent): Up to 1 hour nonavailability; up to 1 hour transaction response time; weekly; updated monthly, and on request; partial automation; standardized and reproducible; programmatic web services in cloud provider's choice of standard.
Gold: Up to 10 minutes nonavailability; up to 10 minutes transaction response time; realtime; updated daily; full automation; compliant with selected corporate requirements; programmatic web services in recognized industry standard, consistent with ODCA concepts.
Platinum: Always on; immediate response; realtime; updated immediately; full automation, in real time; compliant with selected corporate and country requirements; programmatic web services in recognized industry standard, consistent with ODCA concepts.
ODCA Service Orchestration Usage Model 1.0: Describe your experience in weaving together multiple different cloud computing services offered by you, if any, or by other vendors.
ODCA Service Orchestration Usage Model 1.0: Define the logging and auditing facilities available for the Orchestration API, CLI and GUI, for purposes of auditing and tracing of transactions, and for what period the data is retained.
ODCA Service Orchestration Usage Model 1.0: Describe the security and encryption applicable within the Orchestration API.
ODCA Service Orchestration Usage Model 1.0: Service Orchestration programming examples must be made available in commonly used languages. Outline what client libraries (also known as bindings or proxies) are available for various commonly used languages to accelerate adoption.
ODCA Service Orchestration Usage Model 1.0: Web services provide mechanisms to support asynchronous callback (e.g., publishing the result to a message queue where the consumer is listening, email notification, etc.) for long-running tasks. The task should be identified by a globally unique identifier. The subscriber can query for status/progress of the task using this ID.
ODCA Service Orchestration Usage Model 1.0: Web services are compartmentalized and capable of scaling out and back, supporting elasticity at massive scale.
ODCA Service Orchestration Usage Model 1.0: Web services are designed to be accessed via a meaningful and easily remembered vanity name, in combination with global load balancers with appropriate routing policies (e.g., proximity, round robin, etc.).
ODCA Service Orchestration Usage Model 1.0: Web services are designed for failure, anticipating infrastructure failures and maintaining uptime across multiple availability zones.
ODCA Service Orchestration Usage Model 1.0: Maintenance and on-going development of the service interfaces are non-intrusive (backward compatible), enabling sustained operations without impact or requirement for downtime. Incompatible changes are addressed through API versioning so that the old and new versions are available simultaneously.
The ODCA Proposal Engine Assistant Tool (PEAT) [20] is an online assistant that can help you detail your RFP requirements.
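The asynchronous callback requirement above (a long-running task identified by a globally unique identifier, with status queries and a completion notification) can be sketched as follows. The class names, status values, and the `notify` callback are hypothetical; in a real service the callback would publish to a message queue or send an email, and the registry would sit behind a web service endpoint.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Task:
    task_id: str
    status: str = "pending"  # pending -> running -> done (hypothetical values)
    progress: int = 0        # percent complete


class TaskRegistry:
    """Tracks long-running tasks by a globally unique identifier."""

    def __init__(self, notify: Optional[Callable[[Task], None]] = None) -> None:
        self._tasks: Dict[str, Task] = {}
        self._notify = notify  # stand-in for a message queue or email hook

    def submit(self) -> str:
        """Register a new task and return its identifier to the subscriber."""
        task_id = str(uuid.uuid4())
        self._tasks[task_id] = Task(task_id)
        return task_id

    def update(self, task_id: str, progress: int) -> None:
        """Record progress; fire the callback once when the task completes."""
        task = self._tasks[task_id]
        already_done = task.status == "done"
        task.progress = min(progress, 100)
        task.status = "done" if task.progress >= 100 else "running"
        if task.status == "done" and not already_done and self._notify:
            self._notify(task)

    def status(self, task_id: str) -> Task:
        """Let the subscriber query status and progress using the task ID."""
        return self._tasks[task_id]
```

The subscriber keeps only the identifier returned by `submit` and can either poll `status` or wait for the notification, which covers both access patterns named in the requirement.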
15.0 References
ODCA USAGE MODELS
ODCA Usage Model: Cloud Based Identity Governance and Auditing http://www.opendatacenteralliance.org/document-sections/category/71-docs?download=677:HW_ODCA_Identity_Gov_Auditing_Rev1.0_final
ODCA Usage Model: Cloud Based Identity Provisioning http://www.opendatacenteralliance.org/document-sections/category/71-docs?download=679:ODCA_Identity_Provisioning_Rev1%200_final
ODCA Master Usage Model: Compute Infrastructure as a Service (CIaaS) http://www.opendatacenteralliance.org/docs/ODCA_Compute_IaaS_MasterUM_v1.0_Nov2012.pdf
ODCA Usage Models: Conceptual Overview & Document Map http://www.opendatacenteralliance.org/index2.php?option=com_productsearch&view=lightbox&proid=16&ie=UTF-8&oe=UTF-8&q=prettyphoto&iframe=true&width=60%&height=90%
ODCA Usage Model: Guide to Interoperability Across Clouds http://www.opendatacenteralliance.org/docs/ODCA_Interop_Across_Clouds_Guide_Rev1.0.pdf
ODCA Usage Model: Identity Management Interoperability Guide http://www.opendatacenteralliance.org/document-sections/category/71-docs?download=676:HODCA_%20IdM_%20InteropGuide_Rev1%200_final
ODCA Usage Model: Long Distance Workload Migration http://www.opendatacenteralliance.org/docs/Long_Distance_Workload_Migration_Rev1.0_b.pdf
ODCA Usage Model: Provider Assurance http://www.opendatacenteralliance.org/docs/Security_Provider_Assurance_Rev%201.1_b.pdf
ODCA Usage Model: Regulatory Framework http://www.opendatacenteralliance.org/document-sections/category/71-docs?download=455:regulatory_framework
ODCA Usage Model: Single Sign On Authentication http://www.opendatacenteralliance.org/document-sections/category/71-docs?download=680:ODCA_idM_SingleSign_Rev1.0_final
ODCA Usage Model: Standard Units of Measure for IaaS http://www.opendatacenteralliance.org/document-sections/category/71-docs?download=458:standard_units_of_measure
ODCA Usage Model: VM Interoperability In A Hybrid Cloud Environment http://www.opendatacenteralliance.org/docs/VM_Interoperability_Rev_1.1_b.pdf
ODCA White Paper: Developing Cloud-Capable Applications http://www.opendatacenteralliance.org/index2.php?option=com_productsearch&view=lightbox&proid=17&ie=UTF-8&oe=UTF-8&q=prettyphoto&iframe=true&width=60%&height=90%
OTHER SOURCES
Cloud Computing: Principles and Paradigms. Buyya, Broberg, Goscinski; John Wiley & Sons; 2011
DMTF's Cloud Management for Communications Service Providers http://www.dmtf.org/sites/default/files/standards/documents/DSP2029%20_1.0.0a.pdf
NIST Cloud Computing Reference Architecture http://www.nist.gov/customcf/get_pdf.cfm?pub_id=909505
The NIST Definition of Cloud Computing http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
ODCA Proposal Engine Assistant Tool (PEAT) http://www.opendatacenteralliance.org/ourwork/proposalengineassistant
Wikipedia http://en.wikipedia.org
Endnotes
1. ODCA Usage Model: Service Catalog http://www.opendatacenteralliance.org/document-sections/category/71-docs?download=445:service-catalog
2. ODCA Usage Model: Standard Units of Measure for IaaS http://www.opendatacenteralliance.org/document-sections/category/71-docs?download=458:standard_units_of_measure
3. ODCA Master Usage Model: Commercial Framework http://www.opendatacenteralliance.org/docs/ODCA_Commercial_Framework_MasterUM_v1.0_Nov2012.pdf
4. ODCA Master Usage Model: Compute Infrastructure as a Service (CIaaS) http://www.opendatacenteralliance.org/docs/ODCA_Compute_IaaS_MasterUM_v1.0_Nov2012.pdf
5. National Institute of Standards and Technology; Draft NIST Working Definition of Cloud Computing, May 14, 2009, l, m, n
6. Private Cloud Automation, Orchestration, and Measured Service by Joe Onisick http://www.networkcomputing.com/privatecloud-tech-center/231000293
7. ODCA Usage Model: Identity Management Interoperability Guide http://www.opendatacenteralliance.org/document-sections/category/71-docs?download=676:HODCA_%20IdM_%20InteropGuide_Rev1%200_final
8. ODCA Usage Model: Security Monitoring http://www.opendatacenteralliance.org/docs/Security_Monitoring_Rev%201.1_b.pdf
9. ODCA Usage Model: Provider Assurance http://www.opendatacenteralliance.org/docs/Security_Provider_Assurance_Rev%201.1_b.pdf
10. All data directly linked to a service is considered operational, i.e., business data, logs and audit trails. Configuration data and user-related data (identity, entitlements) are not considered operational data.
11. Crypto shredding might be considered as a possible option for non-recoverable deletion.
12. Not recoverable means possible only with far too high effort, or not recoverable at all (this might depend on the specific case).
13. ODCA Usage Model: Regulatory Framework http://www.opendatacenteralliance.org/document-sections/category/71-docs?download=455:regulatory_framework
14. TREC from the EC-sponsored OPTIMIS project http://www.optimis-project.eu/content/ec-optimis-project-release-first-open-sourcecloud-toolkit-service-providers
15. ODCA White Paper: Developing Cloud-Capable Applications http://www.opendatacenteralliance.org/index2.php?option=com_productsearch&view=lightbox&proid=17&ie=UTF-8&oe=UTF-8&q=prettyphoto&iframe=true&width=60%&height=90%
16. ODCA Usage Model: Guide to Interoperability Across Clouds http://www.opendatacenteralliance.org/docs/ODCA_Interop_Across_Clouds_Guide_Rev1.0.pdf
17. IANA HTTP Status Code Registry http://www.iana.org/assignments/http-status-codes/http-status-codes.xml
18. Balanced Scorecard (BSC) is a strategic performance management tool http://en.wikipedia.org/wiki/Balanced_scorecard
19. Performance Indicators http://en.wikipedia.org/wiki/Performance_indicator
20. ODCA Proposal Engine Assistant Tool (PEAT) http://www.opendatacenteralliance.org/ourwork/proposalengineassistant