
Overview

Presentation

Anbazhagan Mani
Grid Computing Competency Center
IBM India Software Lab

© 2003 IBM Corporation


Contents

• Becoming an On Demand Business

• Grid Overview

• Detailed Grid Architecture

• Demo

Appendix
Grid Links



Computing Evolution

On Demand

Network-Centric

Client-Server

Mainframe



An On Demand Business
“An enterprise whose business processes -- integrated end-to-end
across the company and with key partners, suppliers and customers
-- can respond with speed to any customer demand, market
opportunity or external threat”

Responsive, Variable, Focused, Resilient = Profit

On Demand Operating Environment Attributes

Open, Integrated, Virtualized, Autonomic

…an approachable, adaptive, integrated and
reliable infrastructure delivering on demand
services for on demand business operations


Virtualized

Storage, Operating System, I/O, Processing, Applications, Data

Grid Computing
Distributed Computing Over a Network,
Using Open Standards to Enable
Heterogeneous Operations

What would it mean if your business could…

• Analyze the value of an investment portfolio in minutes, rather than hours?

• Significantly accelerate the drug discovery process?

• Cut the design time of products in half, while reducing the instances of
defects?

• Efficiently expand and contract to meet cyclical demand?

• Unite research teams around the world to take advantage of the most
up-to-date learnings?



Current Environment: Distributed, Heterogeneous
and Complex

[Figure: Typical Financial Subsystem Configuration. The front end for a financial services Web presence (presentation, business logic and gateway tiers, including Netscape Enterprise Server, WebSphere Application Server, profile, security and capture servers, MQ hubs and logging) connects over JDBC, SNA and MQ to back-end systems on zSeries (IMS Sysplex, DB2, CICS, TPF).]


Total Cost of Ownership Rising

[Figure: the Typical Financial Subsystem Configuration diagram, overlaid with two labels: “Complexity, Total Cost of Ownership” and “Technology Component Costs, IT Utilization Rates”.]


Low Infrastructure Utilization

               Peak-hour      Prime-shift    24-hour Period
               Utilization    Utilization    Utilization
Mainframes     85-100%        70%            60%
UNIX           50-70%         10-15%         <10%
Intel-based    30%            5-10%          2-5%
Storage        N/A            N/A            52%

Source: IBM Scorpion White Paper: Simplifying the Corporate IT Infrastructure, 2000
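
To make the utilization figures concrete, the idle share of capacity over a 24-hour period is 100% minus the utilization. The sketch below is illustrative only: it takes the upper bound of each quoted 24-hour range and treats "<10%" as 10%, which are assumptions made here, so the results are lower bounds on idle capacity.

```python
# 24-hour utilization, using the upper bound of each quoted range.
# Treating "<10%" as 10% is an illustrative assumption, not a source figure.
UTILIZATION_24H = {
    "Mainframes": 60,
    "UNIX": 10,
    "Intel-based": 5,
    "Storage": 52,
}


def idle_percent(utilization_percent: int) -> int:
    """Share of capacity left unused, in percent."""
    return 100 - utilization_percent


# Lower bound on idle capacity per platform.
idle = {platform: idle_percent(u) for platform, u in UTILIZATION_24H.items()}

for platform, pct in idle.items():
    print(f"{platform}: at least {pct}% idle over 24 hours")
```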



Grid Addresses These Needs

• Infrastructure Optimization
– Workload Management and Consolidation
– Reduced Cycle Times
• Increased Access to Data and Collaboration
– Federation of Data
– Global Distribution
• Resilient / Highly Available Infrastructure
– Business Continuity
– Recovery and Failover


Untapped Potential

Value is expressed relative to the ASCI White Supercomputer (12.3 Teraflops) *
* Cost of one ASCI White Supercomputer = $110M

Businesses are leveraging Grid technologies today


Infrastructure Optimization

• Provide capacity for high-demand applications
– Applications that cannot be run effectively on a
single processor
– New large scale applications that provide
strategic business advantage
• Reduce infrastructure costs
– Balance workload based on business rules
– Optimize for cost or throughput
• Reduce resource management costs
– Fewer resources to manage for the same
workload



Increased Access to Data and Collaboration

• Enable collaboration across organizations for better results
– Leverage Distributed Data and Resources
• Support large multi-disciplinary collaboration
– Link Business Processes
– Federation of Data
• Both within a single organization and between partners
– Exploit Replication Services Across Enterprises

[Figure labels: Design, Analytics, Pricing, Simulation]


Resilient / Highly Available Infrastructure

• Leverage distributed resources to balance workload
– Scheduler manages job distribution
– Failover and recovery leverage distributed resources
– Scheduler uses policies and priorities to determine how to meet goals

[Figure: a Job Scheduler distributing JOB 1, JOB 2 and JOB 3 across resources, with Recovery / Restart]
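
The scheduling and failover behaviour described on this slide can be sketched in a few lines. This is a toy model under stated assumptions, not the behaviour of any particular grid scheduler: the `Node` class, the round-robin placement and the "highest priority first" policy are all hypothetical choices made for illustration.

```python
class Node:
    """Hypothetical compute node that can be marked as failed."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.jobs = []


def schedule(jobs, nodes):
    """Place (job, priority) pairs round-robin on healthy nodes, highest priority first."""
    healthy = [n for n in nodes if n.healthy]
    if not healthy:
        raise RuntimeError("no healthy nodes available")
    ordered = sorted(jobs, key=lambda j: j[1], reverse=True)
    for i, (job, _priority) in enumerate(ordered):
        healthy[i % len(healthy)].jobs.append(job)


def fail_over(failed, nodes):
    """Mark a node as failed and restart its jobs on the remaining healthy nodes."""
    failed.healthy = False
    orphaned, failed.jobs = failed.jobs, []
    schedule([(job, 0) for job in orphaned], nodes)


nodes = [Node("a"), Node("b"), Node("c")]
schedule([("JOB 1", 3), ("JOB 2", 2), ("JOB 3", 1)], nodes)
fail_over(nodes[1], nodes)  # node "b" fails; JOB 2 is restarted elsewhere
placement = {n.name: n.jobs for n in nodes}
print(placement)
```

A real scheduler would also weigh current load and business policy when choosing the restart target; the sketch only shows that recovery is possible because other resources are available.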


The Value of Open Standards

Distributed Computing:
Grid
(Globus -> OGSA)

Applications:
Web Services
(SOAP, WSDL, UDDI)

Operating System:
Linux

Information:
World-wide Web
(HTML, HTTP, J2EE, XML)

Communications:
e-mail
(POP3, SMTP, MIME)

Networking:
The Internet
(TCP/IP)


Open Grid Services Architecture
(OGSA)

[Figure: a set of resources, each labeled “OGSA Enabled”]


Architecture Framework
OGSA Structure

Applications

System Management Services | Grid Services
Open Grid Services Architecture (OGSA)
OGSI – Open Grid Services Infrastructure
Web Services

OGSA Enabled: Security, Workflow, Database, File Systems, Directory, Messaging
OGSA Enabled: Servers, Storage, Network

Side labels: Professional Services, Autonomic Capabilities


Architecture Framework
Products and Services for Grids

System Management Services | Grid Services

IBM Global Services

OGSA
OGSI – Open Grid Services Infrastructure

OGSA Enabled: Security, Workflow, Database, File Systems, Directory, Messaging
OGSA Enabled (three further components, labels not shown)

Side label: Autonomic Capabilities


Grid - a part of competitive strategy

• Higher Quality of Service
• Increased Efficiency
• Increased Productivity
• Reduced Complexity
• Improved Resiliency


An Example - Butterfly.net
Enterprise Optimization

• Needed a scalable, resilient infrastructure for running
massively multiplayer games
• Using 2 clusters of 50 IBM xSeries servers, WebSphere Application
Server, DB2 Universal Database, and Globus Toolkit,
running in IGS hosting facilities
• Improved end-user experience
• Developers avoid huge upfront costs
• Demonstrated 8x increase in profitability over centralized
server model


Grid Technical Overview



The Grid Problem
• Flexible, secure, coordinated resource sharing
among dynamic collections of individuals,
institutions, and resources
– From “The Anatomy of the Grid: Enabling Scalable Virtual
Organizations”
• Enable communities (“virtual organizations”) to
share geographically distributed resources as they
pursue common goals -- assuming the absence
of…
– central location,
– central control,
– omniscience,
– existing trust relationships.



Elements of the Problem

• Resource sharing
– Computers, storage, sensors, networks, …
– Sharing always conditional: issues of trust, policy, negotiation,
payment, …
• Coordinated problem solving
– Beyond client-server: distributed data analysis, computation,
collaboration, …
• Dynamic, multi-institutional virtual orgs
– Community overlays on classic org structures
– Large or small, static or dynamic



Broader Context

• “Grid Computing” has much in common with major industrial thrusts
– Business-to-business, Peer-to-peer, Application Service
Providers, Storage Service Providers, Distributed Computing,
Internet Computing…
• Sharing issues not adequately addressed by existing
technologies
– Complicated requirements: “run program X at site Y subject to
community policy P, providing access to data at Z according to
policy Q”
– High performance: unique demands of advanced & high-
performance systems



The Globus Project

• Close collaboration with real Grid projects in science and industry
• Development and promotion of standard Grid
protocols to enable interoperability and shared
infrastructure
• Development and promotion of standard Grid
software APIs and SDKs to enable portability and
code sharing
• The Globus Toolkit™: Open source, reference
software base for building grid infrastructure and
applications
• Global Grid Forum: Development of standard
protocols and APIs for Grid computing



Some Important Definitions

• Resource
• Network protocol
• Network enabled service
• Application Programmer Interface (API)
• Software Development Kit (SDK)
• Syntax
• Policies



Globus Architecture

• Descriptive
– Provide a common vocabulary for use when describing Grid
systems
• Guidance
– Identify key areas in which services are required
• Prescriptive
– Define standard “Intergrid” protocols and APIs to facilitate
creation of interoperable Grid systems and portable
applications



One View of Requirements
• Identity & authentication
• Authorization & policy
• Resource discovery
• Resource characterization
• Resource allocation
• (Co-)reservation, workflow
• Distributed algorithms
• Remote data access
• High-speed data transfer
• Performance guarantees
• Monitoring
• Adaptation
• Intrusion detection
• Resource management
• Accounting & payment
• Fault management
• System evolution
• Etc.


A Protocol-oriented View of Grid Architecture

• Development of Grid protocols & services
– Protocol-mediated access to remote resources
– New services: e.g., resource brokering
– “On the Grid” = speak Intergrid protocols
– Mostly (extensions to) existing protocols
• Development of Grid APIs & SDKs
– Interfaces to Grid protocols & services
– Facilitate application development by supplying higher-level
abstractions
• The (hugely successful) model is the Internet



Layered Grid Architecture

Grid layers, top to bottom:
• Application
• Collective: “Coordinating multiple resources” (ubiquitous infrastructure services, app-specific distributed services)
• Resource: “Sharing single resources” (negotiating access, controlling use)
• Connectivity: “Talking to things” (communication (Internet protocols) & security)
• Fabric: “Controlling things locally” (access to, & control of, resources)

Internet Protocol Architecture, shown alongside for comparison:
• Application
• Transport
• Internet
• Link


Protocol, Services & API occur at each level

Applications

Languages/Frameworks

Collective Service APIs and SDKs


Collective Service Protocols
Collective Services

Resource APIs and SDKs


Resource Service Protocols
Resource Services

Connectivity APIs
Local Access APIs and Protocols
Fabric Layer



Fabric Layer

• Just what you would expect: the diverse mix of resources that may be shared
– Individual computers, Condor pools, file systems, archives,
metadata catalogs, networks, sensors, etc., etc.
• Few constraints on low-level technology:
connectivity and resource level protocols form
the “neck in the hourglass”
• Defined by interfaces not physical characteristics



Connectivity Layer

• Communication
– Internet protocols: IP, DNS, routing, etc.
• Security: Grid Security Infrastructure (GSI)
– Uniform authentication, authorization, and message protection
mechanisms in multi-institutional setting
– Single sign-on, delegation, identity mapping
– Public key technology, SSL, X.509, GSS-API
– Supporting infrastructure: Certificate Authorities, certificate &
key management, …



Resource Layer Protocols & Services

• Grid Resource Allocation Management (GRAM)
– Remote allocation, reservation, monitoring, control of
compute resources
• GridFTP protocol (FTP extensions)
– High-performance data access & transport
• Grid Resource Information Service (GRIS)
– Access to structure & state information
• Network reservation, monitoring, control
• All built on connectivity layer: GSI & IP



Collective Layer Protocols & Services

• Index servers aka metadirectory services
– Custom views on dynamic resource collections assembled by a
community
• Resource brokers (e.g., Condor Matchmaker)
– Resource discovery and allocation
• Replica catalogs
• Replication services
• Co-reservation and co-allocation services
• Workflow management services
• Etc.



Security Services
• Resources being used may be valuable & the
problems being solved sensitive
• Resources are often located in distinct administrative
domains
– Each resource has own policies & procedures
• Set of resources used by a single computation may
be large, dynamic, and unpredictable
– Not just client/server, requires delegation
• Security must be broadly available & applicable
– Standard, well-tested, well-understood protocols; integrated with
wide variety of tools



Security Services (GSI)
[Figure: GSI single sign-on and delegation across three sites]
• User: single sign-on via “grid-id” & generation of a proxy credential, or retrieval of a proxy credential from an online repository
• User Proxy: sends remote process creation requests* to the GSI-enabled GRAM servers at Site A (Kerberos) and Site B (Unix)
• Each GRAM server: authorize, map to local id, create process, generate credentials
• Processes: run under a local id with a restricted proxy (plus a Kerberos ticket at Site A) and communicate* with each other
• Remote file access request*: goes to the GSI-enabled FTP server at Site C (Kerberos), which authorizes, maps to local id, and accesses the file
* With mutual authentication
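
The “Authorize” and “Map to local id” steps performed at each site can be sketched as a simple lookup from an authenticated certificate subject to a local account. This is a minimal illustration that assumes authentication has already succeeded; the subject name, the account name and the function name are hypothetical, and the dictionary only stands in for whatever mapping a site actually maintains (Globus Toolkit sites typically keep such a mapping in a grid-mapfile).

```python
# Hypothetical mapping from certificate subject names to local accounts.
GRID_MAP = {
    "/O=Grid/OU=Example/CN=Alice Researcher": "alice",
}


def authorize_and_map(subject: str, grid_map: dict) -> str:
    """Return the local id for an already-authenticated subject.

    Raises PermissionError when the subject has no local mapping,
    which this sketch treats as "not authorized at this site".
    """
    try:
        return grid_map[subject]
    except KeyError:
        raise PermissionError(f"subject not authorized: {subject}") from None


local_id = authorize_and_map("/O=Grid/OU=Example/CN=Alice Researcher", GRID_MAP)
print(local_id)
```

Because each site holds its own mapping, the same grid identity can resolve to a Kerberos principal at one site and a Unix account at another without the user signing on again.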


Grid Security Infrastructure (GSI)

• Extensions to standard protocols & APIs
– Standards: SSL/TLS, X.509 & CA, GSS-API
– Extensions for single sign-on and delegation
• Globus Toolkit reference implementation of GSI
– SSLeay/OpenSSL + GSS-API + SSO/delegation
– Tools and services to interface to local security
• Simple ACLs; SSLK5/PKINIT for access to K5, AFS; …
– Tools for credential management
• Login, logout, etc.
• Smartcards
• MyProxy: Web portal login and delegation
• K5cert: Automatic X.509 certificate creation



Resource Management Services

• Enabling secure, controlled remote access to
heterogeneous computational resources and
management of remote computation
– Authentication and authorization
– Resource discovery & characterization
– Reservation and allocation
– Computation monitoring and control
• Addressed by new protocols & services
– GRAM protocol as a basic building block
– Resource brokering & co-allocation services
– GSI for security, MDS for discovery



Resource Management Services

• The Grid Resource Allocation Management
(GRAM) protocol and client API allow programs
to be started on remote resources, despite local
heterogeneity
• Resource Specification Language (RSL) is used
to communicate requirements
• A layered architecture allows application-specific
resource brokers and co-allocators to be defined
in terms of GRAM services
– Integrated with Condor, PBS …
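
As an illustration of how requirements might be expressed in RSL, the sketch below renders a flat set of attribute/value pairs as a conjunctive request of the form &(attribute="value")(attribute="value"). The helper name `to_rsl` is hypothetical, and the quoting rule used here (doubling an embedded quote character) is an assumption; consult the RSL specification for the exact grammar before relying on it.

```python
def to_rsl(requirements: dict) -> str:
    """Render a flat dict as a conjunctive RSL-style request string."""

    def quote(value) -> str:
        # Assumed quoting rule: double any embedded quote, then wrap in quotes.
        return '"' + str(value).replace('"', '""') + '"'

    relations = "".join(f"({key}={quote(value)})" for key, value in requirements.items())
    return "&" + relations


# Ask for two instances of /bin/hostname.
request = to_rsl({"executable": "/bin/hostname", "count": 2})
print(request)
```

A broker or co-allocator in the layered model would refine such a request (for example by adding a concrete resource) before handing it to GRAM.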



Resource Management Architecture

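[Figure: resource management architecture]
• Application: sends RSL to a Broker
• Broker: performs RSL specialization, using queries & info from the Information Service
• Co-allocator: receives ground RSL and sends simple ground RSL to individual GRAM instances
• GRAM: one per local resource manager (LSF, Condor, NQE)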


GRAM Protocol

• GRAM-1: Simple HTTP-based RPC
– Job request
• Returns a “job contact”: Opaque string that can be passed between
clients, for access to job
– Job cancel, status, signal
– Event notification (callbacks) for state changes
• Pending, active, done, failed, suspended

• GRAM-1.5 (U Wisconsin contribution)
– Add reliability improvements
• Once-and-only-once submission
• Recoverable job manager service
• Reliable termination detection

• GRAM-2: Moving to Web Services (SOAP)…
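
The job states and state-change callbacks listed for GRAM-1 can be modelled as a small state machine. The sketch below is not the GRAM client API: the class and method names are hypothetical, and the set of allowed transitions is an assumption, since the slide lists only the states themselves.

```python
from enum import Enum


class JobState(Enum):
    PENDING = "pending"
    ACTIVE = "active"
    SUSPENDED = "suspended"
    DONE = "done"
    FAILED = "failed"


# Allowed transitions are an illustrative assumption, not taken from the source.
TRANSITIONS = {
    JobState.PENDING: {JobState.ACTIVE, JobState.FAILED},
    JobState.ACTIVE: {JobState.SUSPENDED, JobState.DONE, JobState.FAILED},
    JobState.SUSPENDED: {JobState.ACTIVE, JobState.FAILED},
    JobState.DONE: set(),
    JobState.FAILED: set(),
}


class Job:
    def __init__(self, contact: str):
        self.contact = contact  # opaque string that identifies the job
        self.state = JobState.PENDING
        self._callbacks = []

    def on_state_change(self, callback):
        """Register a callback invoked as callback(contact, old_state, new_state)."""
        self._callbacks.append(callback)

    def move_to(self, new_state: JobState):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state.value} -> {new_state.value}")
        old, self.state = self.state, new_state
        for callback in self._callbacks:
            callback(self.contact, old, new_state)


events = []
job = Job("job-contact-123")
job.on_state_change(lambda contact, old, new: events.append((old.value, new.value)))
job.move_to(JobState.ACTIVE)
job.move_to(JobState.DONE)
print(events)
```

Because the job contact is just a string, any client holding it could register for the same notifications, which is what allows it to be passed between clients.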



GRAM Components
[Figure: GRAM components]
• Client: MDS client API calls to locate resources (MDS: Grid Index Info Server) and to get resource info (MDS: Grid Resource Info Server)
• Client: GRAM client API sends the Request and receives state change callbacks, through the Grid Security Infrastructure and across the site boundary
• Gatekeeper: creates the Job Manager
• Job Manager: parses the request with the RSL Library, asks the Local Resource Manager to allocate & create processes, and monitors & controls them
• Grid Resource Info Server: queries the current status of the resource


Information Services

• System information is critical to operation of the
grid and construction of applications
– What resources are available?
• Resource discovery
– What is the “state” of the grid?
• Resource selection
– How to optimize resource use?
• Application configuration and adaptation
• We need a general information infrastructure to
answer these questions



Grid Information Service

• Provide access to static and dynamic information
regarding system components
• A basis for configuration and adaptation in
heterogeneous, dynamic environments
• Requirements and characteristics
– Uniform, flexible access to information
– Scalable, efficient access to dynamic data
– Access to multiple information sources
– Decentralized maintenance



Two classes of Information Services

• Resource Description Services
– Supply information about a specific resource (e.g. Globus
1.1.3 GRIS).
• Aggregate Directory Services
– Supply a collection of information gathered from
multiple GRIS servers (e.g. Globus 1.1.3 GIIS).
– Customized naming and indexing



Information Protocols

• Grid Resource Registration Protocol
– Support information/resource discovery
– Designed to tolerate machine/network failure
• Grid Resource Inquiry Protocol
– Query resource description server for information
– Query aggregate server for information
– LDAP V3.0 in Globus 1.1.3



Meta-Computing Directory Service (MDS)

• Use LDAP as the inquiry protocol
• Access information in a distributed directory
– Directory represented by collection of LDAP servers
– Each server optimized for particular function
• Directory can be updated by:
– Information providers and tools
– Applications (i.e., users)
– Backend tools which generate info on demand
• Information dynamically available to tools and
applications



Grid Resource Information Service

• Server which runs on each resource
– Given the resource DNS name, you can find the GRIS server (well
known port = 2135)
• Provides resource specific information
– Much of this information may be dynamic
• Load, process information, storage information, etc.
• GRIS gathers this information on demand

• “White pages” lookup of resource information
– Ex: How much memory does the machine have?
• “Yellow pages” lookup of resource options
– Ex: Which queues on the machine allow large jobs?
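
Since a GRIS is reachable through LDAP at a well-known port, a lookup can be issued with any standard LDAP client. The sketch below only builds an OpenLDAP `ldapsearch` command line and does not contact a server. The host name is a placeholder, the function name is hypothetical, and the base DN "o=Grid" is an assumption that depends on the toolkit version and site configuration.

```python
def gris_query_command(host: str,
                       search_filter: str = "(objectclass=*)",
                       base_dn: str = "o=Grid",
                       port: int = 2135) -> list:
    """Build an ldapsearch argument list for querying a GRIS.

    base_dn is an assumed default; check the deployment for the real value.
    """
    return [
        "ldapsearch",
        "-x",                               # simple (anonymous) bind
        "-H", f"ldap://{host}:{port}",      # server URI with the GRIS port
        "-b", base_dn,                      # search base
        search_filter,
    ]


command = gris_query_command("gris.example.org")
print(" ".join(command))
```

Passing the list to `subprocess.run` would execute the query against a real server; narrowing `search_filter` turns the same call into a “white pages” or “yellow pages” style lookup.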



Grid Index Information Service

• GIIS describes a class of servers
– Gathers information from multiple GRIS servers
– Each GIIS is optimized for particular queries
• Ex1: Which machines are SGIs with >16 processors?
• Ex2: Which storage servers have >100Mbps bandwidth to host X?
– Akin to web search engines
• Organization GIIS
– The Globus Toolkit ships with one GIIS
– Caches GRIS info with a long update interval
• Useful for queries across an organization that rely on relatively static
information (Ex1 above)
• Can be merged into GRIS
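
The caching behaviour of an organization-level index can be sketched as an aggregate that pulls information from per-resource providers and refreshes an entry only when it is older than a configured interval. This is a toy model, not the GIIS implementation: the class name, the provider callables and the 600-second interval are hypothetical.

```python
import time


class IndexCache:
    """Toy aggregate index: caches per-resource info and refreshes it when stale."""

    def __init__(self, providers, ttl_seconds, clock=time.monotonic):
        self._providers = providers  # resource name -> callable returning a dict
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}             # resource name -> (fetched_at, info)

    def _info(self, name):
        now = self._clock()
        cached = self._cache.get(name)
        if cached is None or now - cached[0] >= self._ttl:
            cached = (now, self._providers[name]())
            self._cache[name] = cached
        return cached[1]

    def query(self, predicate):
        """Return the sorted names of resources whose info satisfies predicate."""
        return sorted(n for n in self._providers if predicate(self._info(n)))


calls = []


def make_provider(cpus):
    def provider():
        calls.append(cpus)  # record each (simulated) per-resource lookup
        return {"cpus": cpus}
    return provider


fake_now = [0.0]
index = IndexCache({"small": make_provider(8), "big": make_provider(32)},
                   ttl_seconds=600, clock=lambda: fake_now[0])

first = index.query(lambda info: info["cpus"] > 16)
second = index.query(lambda info: info["cpus"] > 16)  # served from cache
lookups_before_expiry = len(calls)
fake_now[0] = 601.0
third = index.query(lambda info: info["cpus"] > 16)   # cache expired, refetched
lookups_after_expiry = len(calls)
```

A long interval suits relatively static attributes such as processor count; rapidly changing values such as load are better fetched from the resource itself.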



Data Management Services
• Two major Data Grid components:
• 1. Data Transport and Access
– Common protocol
• Secure, efficient, flexible, extensible data movement
– Family of tools supporting this protocol
• 2. Replica Management Architecture
– Simple scheme for managing:
• multiple copies of files
• collections of files


GridFTP

• Why FTP?
– Ubiquity enables interoperation with many commodity tools
– Already supports many desired features, easily extended to
support others
– Well understood and supported
• We use the term GridFTP to refer to
– Transfer protocol which meets requirements
– Family of tools which implement the protocol
• Note GridFTP > FTP
• Note that despite name, GridFTP is not restricted to
file transfer!



Replica Management

• Maintain a mapping between logical names for files
and collections and one or more physical locations
• Important for many applications
– Example: CERN data
• Multiple petabytes of data per year
• Copy of everything at CERN (Tier 0)
• Subsets at national centers (Tier 1)
• Smaller regional centers (Tier 2)
• Individual researchers will have copies



Replica Manager Components

• Replica catalog definition
– LDAP object classes for representing logical-to-physical
mappings in an LDAP catalog
• Low-level replica catalog API
– globus_replica_catalog library
– Manipulates replica catalog: add, delete, etc.
• High-level reliable replication API
– globus_replica_manager library
– Combines calls to file transfer operations and calls to low-level
API functions: create, destroy, etc.
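
The split between a low-level catalog and a high-level replication call can be sketched as follows. This is an illustrative model only and does not use the globus_replica_catalog or globus_replica_manager libraries: the class and function names are hypothetical, an in-memory dictionary stands in for the LDAP catalog, and a local file copy stands in for the file transfer operation.

```python
import shutil
import tempfile
from pathlib import Path


class ReplicaCatalog:
    """Toy catalog: logical file name -> set of physical locations."""

    def __init__(self):
        self._locations = {}

    def add(self, logical_name, location):
        self._locations.setdefault(logical_name, set()).add(location)

    def delete(self, logical_name, location):
        self._locations.get(logical_name, set()).discard(location)

    def lookup(self, logical_name):
        return sorted(self._locations.get(logical_name, set()))


def replicate(catalog, logical_name, source, destination):
    """Copy the file first, then register the new copy.

    If the copy raises, the catalog is left untouched, so it never
    points at a replica that does not exist.
    """
    shutil.copyfile(source, destination)
    catalog.add(logical_name, str(destination))


with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "run42.dat"
    original.write_text("event data")

    catalog = ReplicaCatalog()
    catalog.add("lfn:run42", str(original))

    copy = Path(tmp) / "run42.copy.dat"
    replicate(catalog, "lfn:run42", original, copy)

    replica_count = len(catalog.lookup("lfn:run42"))
    copied_text = copy.read_text()
```

Ordering the transfer before the catalog update is the design point: the high-level call is what keeps the catalog and the physical copies consistent.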



Globus Overall View

Applications

High-Level Services & Tools: GlobusView, Testbed Status, DUROC, MPI, Condor, HPC++, Nimrod/G, globusrun

Core Services: Nexus, GRAM, Metacomputing Directory Svcs, Global Security Interface, HeartBeat Monitor, I/O, GASS

Local Services: Condor, MPI, TCP, UDP, LSF, PBS, NQE, Linux, AIX, Solaris

Other labels in the figure: resource mgmt, secure messaging, repositories, cluster services, security services, services hosting environment


What is evolving – Web Services & Grid

Web Services and Grid try to solve similar problems in different realms:

• Defining an open distributed computing platform
• Assuring interoperability
• Dealing with heterogeneous platforms, protocols and applications
• In the business and scientific computing areas


OGSA (Open Grid Services Architecture)

[Figure: Web Services and Globus Toolkit, combined in OGSA]


Open Grid Services Infrastructure (OGSI)
Anatomy of a Grid Service

Interfaces: GridService (required), Other Interfaces (optional)

• Service creation (Factory)
• Service discovery (Registry)
• Service Data Access
• Notification
• Lifetime Management
• Handle Management
• Other functions, e.g.
– Workflow
– Auditing
– Resource Management

[Figure: a Grid Service with a Service Handle and Service Data Elements, backed by an Implementation running in a Hosting Environment]


Open Grid Services Infrastructure (OGSI)
Grid Service Implementation Independence

Abstract service interface remains the same

Implementation

Hosting Environment

Other Middleware

Operating System
Hardware

Open Grid Services Infrastructure (OGSI)
Grid Service Implementation - Examples

Registry Service

Abstract service interface remains the same

Implementation
Hosting Environment - J2EE: JNDI, JDBC
Other Middleware: LDAP, Database (DB2)
Operating System
Hardware


Open Grid Services Infrastructure (OGSI)
Grid Service Implementation - Examples

File Transfer Service

Abstract service interface remains the same

Implementation
Hosting Environment - J2EE
Other Middleware: Database (DB2)
Operating System: File System
Hardware: Storage System (NAS/SAN)
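
The idea that the abstract service interface stays the same while the implementation underneath changes can be sketched with an abstract base class and two interchangeable backends. Everything here is hypothetical and for illustration only: the `Registry` interface, its method names and both backends are invented, and simple in-memory structures stand in for the directory or database a real implementation would use.

```python
from abc import ABC, abstractmethod


class Registry(ABC):
    """Abstract service interface: clients depend only on this."""

    @abstractmethod
    def register(self, name: str, handle: str) -> None:
        ...

    @abstractmethod
    def find(self, name: str) -> str:
        ...


class DictRegistry(Registry):
    """Backend 1: a dictionary (stand-in for a directory service)."""

    def __init__(self):
        self._entries = {}

    def register(self, name, handle):
        self._entries[name] = handle

    def find(self, name):
        return self._entries[name]


class RowRegistry(Registry):
    """Backend 2: a list of rows (stand-in for a database table)."""

    def __init__(self):
        self._rows = []

    def register(self, name, handle):
        self._rows.append((name, handle))

    def find(self, name):
        # Latest registration wins, so scan rows from newest to oldest.
        for row_name, handle in reversed(self._rows):
            if row_name == name:
                return handle
        raise KeyError(name)


def publish_and_find(registry: Registry) -> str:
    """Client code: identical regardless of which backend is supplied."""
    registry.register("pricing", "handle-001")
    return registry.find("pricing")


results = [publish_and_find(r) for r in (DictRegistry(), RowRegistry())]
print(results)
```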


OGSA portTypes
PortType Name              Description
GridService                encapsulates the root behavior of the service model
HandleResolver             mapping from a GSH to a GSR
NotificationSource         allows clients to subscribe to notification messages
NotificationSubscription   defines the relationship between a single NotificationSource and NotificationSink pair
NotificationSink           defines a single operation for delivering a notification message to the service instance that implements the operation
Factory                    standard operation for creation of Grid service instances
ServiceGroup               allows clients to maintain groups of services
ServiceGroupRegistration   allows Grid services to be added and removed from a ServiceGroup
ServiceGroupEntry          defines the relationship between a Grid service and its membership within a ServiceGroup
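
The relationship between the three notification portTypes can be sketched with plain classes: a source accepts subscriptions, each subscription ties that source to exactly one sink, and the sink exposes a single delivery operation. The class names mirror the portType names, but the method names and signatures are assumptions made for this sketch and are not taken from the OGSI specification.

```python
class NotificationSink:
    """Exposes a single operation for receiving a notification message."""

    def __init__(self):
        self.received = []

    def deliver_notification(self, message):
        self.received.append(message)


class NotificationSubscription:
    """Relationship between one source and one sink."""

    def __init__(self, source, sink):
        self.source = source
        self.sink = sink

    def cancel(self):
        self.source.remove(self)


class NotificationSource:
    """Lets clients subscribe sinks to its notification messages."""

    def __init__(self):
        self._subscriptions = []

    def subscribe(self, sink):
        subscription = NotificationSubscription(self, sink)
        self._subscriptions.append(subscription)
        return subscription

    def remove(self, subscription):
        self._subscriptions.remove(subscription)

    def notify(self, message):
        # Iterate over a copy so a sink may cancel during delivery.
        for subscription in list(self._subscriptions):
            subscription.sink.deliver_notification(message)


source = NotificationSource()
sink = NotificationSink()
subscription = source.subscribe(sink)
source.notify("job state: active")
subscription.cancel()
source.notify("job state: done")  # not delivered, subscription was cancelled
print(sink.received)
```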


Grid Vision

• Service Grid: supported by xSPs
• Partner Grid: across multiple orgs
• Enterprise Grid: inter-dept sharing within orgs


Grid

Grid Computing is real TODAY & very important for every company!
Grids will be an integral part of your organization’s
on-demand infrastructure.
