
Evolution Towards Cloud: Overview of Next Generation Computing Architecture

by Monowar Hasan & Sabbir Ahmed

A Thesis submitted to the Department of Computer Science and Engineering in partial fulfillment of the requirements for the degree of

Bachelor of Science (B.Sc.) in Computer Science and Engineering

Bangladesh University of Engineering and Technology
Dhaka, Bangladesh
22 March 2012

Abstract
Nowadays Cloud Computing has become a buzzword in distributed processing. Cloud Computing, which originates from the ideas of concurrent processing in Computer Clusters, enhances the established architecture and standards of the Grid (another technology for parallel processing) with the ideas of Utility Computing and Service-Oriented Computing. Cloud Computing essentially provides a business model in the form of X-as-a-Service, where X may be hardware, software, a development platform or storage media. End-users can consume any of these services from providers on a pay-as-you-go basis without knowing the details of the underlying architecture. Hence, the Cloud provides layers of abstraction to end-users and gives end-users, developers and providers scope to adapt applications on demand.


Acknowledgements
We are grateful to several people for this thesis, without whom it would not have been a successful one. Our heartfelt thanks go to our supervisor, Professor Dr. Md. Humayun Kabir, for his support and valuable guidance. His continuous feedback and assistance helped us to clarify our ideas and understanding of the topics.

Special thanks to Professor Dr. Hanan Lutfiyya of the University of Western Ontario, Canada, and Professor Dr. Ivona Brandic of the Vienna University of Technology, Vienna, Austria, for providing their research publications, which helped us to progress our thesis.

The Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, provided us with a sound working environment and helped us to obtain on-line publications.

Last but not least, we acknowledge the contribution and support of our family members for being with us and encouraging us all the way. Without their sacrifice this thesis would not have been a successful one.


Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures

1 Computing with Distributed Units: Computer Clusters
  1.1 Distributed Systems
    1.1.1 Centralized vs Distributed Systems
    1.1.2 Advantages of Distributed Systems
    1.1.3 Issues and Challenges in Distributed Systems
  1.2 Computer Clusters
  1.3 Architecture of Computer Clusters
  1.4 Cluster Interconnection
  1.5 Protocols for Cluster Communication
    1.5.1 Internet Protocols
    1.5.2 Low-latency Protocols
      1.5.2.1 Active Messages
      1.5.2.2 Fast Messages
      1.5.2.3 VMMC
      1.5.2.4 U-net
      1.5.2.5 BIP
    1.5.3 Standards for Cluster Communication
      1.5.3.1 VIA
      1.5.3.2 InfiniBand
  1.6 Single System Image (SSI)
  1.7 Cluster Middleware
    1.7.1 Message-based Middleware
    1.7.2 RPC-based Middleware
    1.7.3 Object Request Broker
  1.8 Concluding Remarks

2 Grid Computing: An Introduction
  2.1 Grid Computing: definitions and overview
    2.1.1 Virtualization and Grid
    2.1.2 Grids over Cluster Computing
  2.2 An example of Grid Computing environment
  2.3 Grid Architecture
    2.3.1 Fabric Layer: Interfaces to Local Resources
    2.3.2 Connectivity Layer: Managing Communications
    2.3.3 Resource Layer: Sharing of a Single Resource
    2.3.4 Collective Layer: Co-ordination with multiple resources
    2.3.5 Application Layer: User defined Grid Applications
  2.4 Grid Computing with Globus
  2.5 Resource Management in Grid Computing
    2.5.1 Resource Specification Language
    2.5.2 Globus Resource Allocation Manager (GRAM)
  2.6 Evolution towards Cloud Computing from Grid
  2.7 Concluding remarks

3 An overview of Cloud Architecture
  3.1 Cloud Components
  3.2 Cloud Architectures
    3.2.1 A layered model of Cloud architecture - Cloud ontology
    3.2.2 Cloud Business Model
    3.2.3 Cloud Deployment Model
  3.3 Cloud Services
    3.3.1 Infrastructure as a Service (IaaS)
    3.3.2 Platform as a Service (PaaS)
    3.3.3 Software as a Service (SaaS)
  3.4 Virtualization on Cloud
  3.5 Example of a Cloud Implementation
  3.6 Conclusion

4 Grid and Cloud Computing Comparisons: Similarities & Differences
  4.1 Major Focus
  4.2 Points of Considerations
    4.2.1 Business Model
    4.2.2 Scalability issues
    4.2.3 Multitasking and Availability
    4.2.4 Resource Management
    4.2.5 Application Model
    4.2.6 Other issues
  4.3 Case Study
    4.3.1 Comparative results
  4.4 Concluding remarks

5 Conclusion and Future works

List of Tables
3.1 Example of existing Cloud Systems w.r.t. classification into layers of Cloud Ontology
3.2 CPU utilization in Full Virtualization and Paravirtualization
4.1 Comparative analysis


List of Figures
1.1 Eras of Computing
1.2 Distributed computing
1.3 Architecture of Cluster Computing
1.4 Categories of Cluster Interconnection Hardware
1.5 Traditional Protocol Overhead and Transmission Time
1.6 The InfiniBand Architecture
2.1 Evolution of Grid Computing
2.2 Resource availability according to demand
2.3 Serving job requests in traditional environment
2.4 Serving job requests in Grid environment
2.5 Google search architecture
2.6 Grid Protocol Architecture
2.7 Collective and Resource layer protocols are combined in various ways to provide application functionality
2.8 Programmer's view of Grid Architecture. Dotted lines denote protocol interactions while solid lines represent direct calls
2.9 A resource management architecture for Grid Computing environment
2.10 Globus GRAM Architecture
2.11 Enhancement of generic Grid architecture to Service Oriented Grid
3.1 Components of a Cloud Computing Solution
3.2 Hierarchical abstraction layers of Cluster, Grid and Cloud Computing
3.3 Cloud layered architecture: consists of five layers, figure represents inter-dependency between layers
3.4 Non-cloud environment needs three servers but in the Cloud, two servers are used
3.5 Cloud computing Business model
3.6 External or Public Cloud
3.7 Internal or Private Cloud
3.8 Example of Hybrid Cloud
3.9 Correlation between Cloud Architecture and Cloud Services
3.10 Infrastructure as a Service
3.11 Platform as a Service
3.12 Software as a Service
3.13 A fully virtualized deployment where operating platform running on servers is displayed
3.14 A Paravirtualized deployment where many OS can run simultaneously
4.1 Motivation of Grid and Cloud
4.2 Comparison regarding performance, reliability and cost

Chapter 1 Computing with Distributed Units: Computer Clusters


The computing industry is one of the fastest growing industries, and its history is usually traced back to 1943. Computers [1, 2] built between 1943 and 1959 are usually regarded as first generation computers and were based on valves and wire circuits. They are [3] characterized by the use of punched cards and vacuum valves. All programming was done in machine code.

The second generation computers were built between 1959 and 1964. They were based on transistors and printed circuits, so they were much smaller. These computers were also more powerful and accepted English-like commands, which made them much more flexible in their applications.

Computers built between 1964 and 1972 are often regarded as third generation computers. They were based on the first integrated circuits, making even smaller machines possible.

Computers built after 1972 are often called fourth generation computers. These computers were based on LSI (Large Scale Integration) of circuits such as microprocessors, typically with 500 or more components on a chip. Later developments include VLSI (Very Large Scale Integration), with typically 10,000 or more components per chip.

The fifth generation computers are based on parallel processing and VLSI integration, and are still being developed. Recent advances in VLSI (Very Large Scale Integration) technology have played a major role in the development of powerful sequential and parallel computers. Software technology is developing fast as well: mature software, such as operating systems, programming languages, development methodologies, and tools, is now available. This enables the development and deployment of applications that meet scientific, engineering, and commercial needs. Moreover, several challenging applications, such as weather forecasting and earthquake analysis, have become the main driving force behind the development of powerful parallel computers.

So we can divide computing into two prominent eras: the Sequential Computing Era and the Parallel Computing Era. A graphical view of the changes in computing eras is shown in Figure 1.1. Each computing era started with the hardware architectures of the system, followed by system software (especially operating systems and compilers) and applications, and reached its saturation point with the growth of problem solving environments. Each component of a computing era had to pass through three phases: R&D (Research and Development), commercialization, and commodity. The technology behind the components of the parallel era is not yet as mature as that of the sequential era.

There are several reasons for using parallel computers. Some of them are:

Figure 1.1: Eras of Computing

Parallelism is one of the best ways to overcome the speed bottleneck of a single processor.

The price/performance ratio of a small cluster-based parallel computer, as opposed to a minicomputer, is much smaller and consequently a better value.

Developing and producing systems of moderate speed using parallel architectures is much cheaper than building a sequential system of equivalent performance.

In the 1980s it was believed that computer performance was best improved by creating faster and more efficient processors. This idea was challenged by parallel processing, which in essence means linking together two or more computers to jointly solve a computational problem. Since the early 1990s there has been an increasing trend to move away from expensive and specialized proprietary parallel supercomputers towards networks of workstations. This was the driving force behind Cluster Computing. Later, several other distributed computing systems were developed, such as Grid Computing and Cloud Computing. In this chapter we discuss Cluster Computing.

1.1 Distributed Systems

A distributed system is a computing system in which several autonomous computers are linked by a computer network and appear to the users of the system as a single computer.

The computers in the network interact with each other in order to achieve a common goal. Programs that run in a distributed system are called distributed programs. By running distributed system software, the computers are enabled to:

Coordinate their activities.
Share resources: hardware, software, data.
Achieve transparency of resources: the illusion of a single system while running on multiple systems.

Figure 1.2: Distributed computing

Distributed systems are useful for [4] breaking down an application into individual computing agents (Figure 1.2) so that the parts can be solved easily. These systems are distributed over a network and work together on a cooperative task. They can solve larger problems without larger computers, and are therefore very cheap in comparison to single-system computing. Nowadays distributed systems have thus become increasingly preferable. There is a central server and several clients connected together; various parallel devices are connected to the whole system through the distributed system, and both operators and clients can use them.

1.1.1 Centralized vs Distributed Systems

Here are some [5] differences between centralized and distributed systems.

Centralized systems:
Centralized systems have non-autonomous components.
Centralized systems are often built using homogeneous technology.
Multiple users share the resources of a centralized system at all times.
Centralized systems have a single point of control and of failure.

Distributed systems:
Distributed systems have autonomous components.
Distributed systems may be built using heterogeneous technology.
Distributed system components may be used exclusively.
Distributed systems are executed in concurrent processes.
Distributed systems have multiple points of failure.

1.1.2 Advantages of Distributed Systems

A distributed system has [6] several advantages over a single system. Some of them are:

Performance: Very often a collection of processors can provide higher performance than a centralized computer. A distributed system also has a better price/performance ratio.

Distribution: Some applications involve, by their nature, spatially separated machines (banking, commercial, automotive systems).

Reliability: Machines may crash. For a single system, if the machine crashes then all data is lost, but a distributed system can survive the crash of some of its machines.

Incremental growth: As requirements on processing power grow, new machines can be added incrementally.

Sharing of data/resources: Shared data is essential to many applications (banking, computer supported cooperative work, reservation systems); other resources can also be shared (e.g. expensive printers).

Communication: Provides the opportunity for human-to-human communication.

1.1.3 Issues and Challenges in Distributed Systems

Though there are several advantages of [7] distributed systems, there are some disadvantages as well. Some of them are:

Difficulty of developing distributed software: It is difficult to develop software for distributed systems, and it is hard to determine what operating systems, programming languages and applications for them should look like.

Networking problems: Several problems are created by the network infrastructure, which have to be dealt with: loss of messages, overloading, etc.

Security problems: Sharing generates the problem of data security.

More components to fail: As distributed systems involve larger networks, there are more possibilities of failure of the system and of data transfer.

1.2 Computer Clusters

A cluster [8] is a type of parallel or distributed processing system. It consists of a collection of interconnected stand-alone computers working together as a single, integrated computing resource. All the component subsystems of a cluster are supervised within a single administrative domain, usually reside in a single room, and are managed as a single computer system. We can use cluster computing [9] for load balancing as well as for high availability. We can also use cluster computing as a relatively low-cost form of parallel processing for scientific and other applications that lend themselves to parallel operations. Some properties of cluster computing:

The computers, also known as nodes, of a cluster are networked in a tightly-coupled fashion. They are all on the same subnet of the same domain and are often networked with very high bandwidth connections.

The nodes of a cluster are homogeneous: they all use the same hardware, run the same software, and are generally configured identically.

Each node in a cluster is a dedicated resource; generally only the cluster applications run on a cluster node.

In clusters we use the Message Passing Interface (MPI) [10], a programming interface that allows the distributed application instances to communicate with each other and share information. Dedicated hardware, high-speed interconnects, and MPI give clusters the ability to work efficiently on fine-grained parallel problems where the subtasks must communicate many times per second, including problems with short tasks, some of which may depend on the results of previous tasks.
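To make the MPI communication style concrete, here is a minimal sketch in C (our illustration, not code from the thesis) in which one node sends an integer to another:

```c
/* Minimal MPI sketch: rank 0 sends an integer to rank 1.
 * Build and run with an MPI toolchain, e.g.:
 *   mpicc hello.c -o hello && mpirun -np 2 ./hello */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        /* Blocking send to rank 1 */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Blocking receive from rank 0 */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```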

1.3 Architecture of Computer Clusters

In cluster computing a computer node can be a single or multiprocessor system [11]. The nodes can be PCs, workstations, or SMPs with memory, I/O facilities, and an operating system.

Figure 1.3: Architecture of Cluster Computing

In cluster computing two or more nodes are connected together. These nodes can exist in a single cabinet or be physically separated and connected via a LAN. Such a LAN-based interconnected cluster of computers appears as a single system to users and applications. Cluster computing can provide a cost-effective way to gain features and benefits, such as fast and reliable services, that previously could be found only on more expensive proprietary shared memory systems. The typical architecture of a cluster is shown in Figure 1.3. A cluster computing system consists of several components. The following are some prominent components of cluster computers:

Multiple high performance computers: these can be PCs, workstations, or SMPs.

A state-of-the-art operating system, which can be layered or micro-kernel based.

High performance networks/switches used to connect the nodes of the cluster; among them Gigabit Ethernet and Myrinet are the most common.

Network interface cards used for cluster interconnection.

Fast communication protocols and services used to communicate between nodes. Active Messages and Fast Messages are such protocols; later, standards such as InfiniBand emerged for communication.

A middleware which sits between the operating system and applications. The middleware provides the system with a Single System Image (SSI) and a system availability infrastructure. Middleware may consist of hardware such as Digital (DEC) Memory Channel, and of an operating system kernel or gluing layer such as Solaris MC and GLUnix.

Applications and subsystems, consisting of applications (such as system management tools), runtime systems (such as software DSM and parallel file systems), and resource management and scheduling software such as LSF (Load Sharing Facility).

Parallel programming environments and tools such as compilers and MPI (Message Passing Interface).

Both sequential and parallel or distributed applications.

1.4 Cluster Interconnection

In cluster computing the choice of interconnection technology is a key component. We can classify the interconnection technologies into four categories, depending on the internal connection and on how the nodes communicate with each other. The internal connection can be through the I/O bus or the memory bus, and the communication between the computers can be performed primarily using messages or using shared storage [12]. Figure 1.4 illustrates the four types of interconnection.

Figure 1.4: Categories of Cluster Interconnection Hardware.

Among the four interconnection categories, I/O attached message-based systems are by far the most common. This category includes all commonly-used wide-area and local-area network technologies, as well as several recent products that are specifically designed for cluster computing. I/O attached shared storage systems include computers that share a common disk sub-system. Memory attached systems are less common than I/O attached systems, since the memory bus of an individual computer generally has a design that is unique to that type of computer. However, many memory-attached systems have been implemented, most often in software or with memory-mapped I/O, such as Reflective Memory [13]. There are also several hybrid systems that combine the features of more than one category. An example of a hybrid system is the InfiniBand standard: InfiniBand [14] is an I/O attached interconnection that can be used to send data to a shared disk sub-system as well as to send messages to another computer. Many factors affect the choice of interconnect technology for a cluster, such as compatibility with the cluster hardware and operating system, price, and performance. Performance of a cluster depends on latency and bandwidth.


Latency is the time needed to send data from one computer to another; it includes the overhead for the software to construct the message as well as the time to transfer the bits from one computer to another. Bandwidth is the number of bits per second that can be transmitted over the interconnect hardware. Applications that use small messages benefit mainly from reduced latency, while applications that send large messages benefit mainly from increased bandwidth. The latency is a function of both the communication software and the network hardware.
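These two quantities combine in the standard first-order cost model for message passing; the model is our addition for clarity, not taken from the thesis. The time to transfer an n-byte message is approximately

\[ T(n) \approx \alpha + \frac{n}{\beta} \]

where \(\alpha\) is the latency and \(\beta\) is the bandwidth. For example, with \(\alpha = 50\,\mu\mathrm{s}\) and \(\beta = 100\) MB/s, a 100-byte message takes about 51 µs (latency dominated), while a 10 MB transfer takes about 0.1 s (bandwidth dominated). This is why the low-latency protocols of Section 1.5.2 attack \(\alpha\), the per-message software overhead.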

1.5 Protocols for Cluster Communication

A communication protocol defines a set [15] of rules and conventions for communication between the nodes of the cluster. Each protocol uses a different technique to exchange information. Communication protocols can be classified as:

Connection-oriented or connectionless.

Offering various levels of reliability: a protocol can be reliable, where messages are fully guaranteed to arrive in order, or unreliable, where they are not guaranteed to arrive in order.

Buffered (asynchronous) or not buffered (synchronous).

By the number of intermediate data copies between buffers, which may be zero, one, or more.


Several protocols are used in clusters. Traditionally, Internet protocols were used for clustering. Later, several protocols were designed specifically for cluster communication. Finally, two new protocol standards were designed specifically for use in cluster computing.

1.5.1 Internet Protocols

The Internet Protocol (IP) is the standard for networking worldwide. The Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP) are both transport layer protocols built over the Internet Protocol. TCP and UDP, together with the de facto standard BSD sockets Application Programmer's Interface (API) to TCP and UDP, were among the first messaging libraries used for [16] cluster computing. Internet protocols use one or more buffers in system memory with the help of operating system services. The user application constructs the message in user memory and then makes an operating system request to copy the message into a system buffer. A system interrupt is required for send and receive. With Internet protocols, operating system overhead and the overhead for copies to and from system memory make up a significant portion of the total time to send a message. As network hardware became faster during the 1990s, the overhead of the communication protocols became significantly larger than the actual hardware transmission time for messages, as shown in Figure 1.5. This created the need for new types of protocols for cluster computing.


Figure 1.5: Traditional Protocol Overhead and Transmission Time.
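The cost just described is easy to see in the BSD sockets API itself. The following minimal UDP sender (a sketch with an example address and port, not code from the thesis) crosses the user/kernel boundary on every sendto() call, incurring exactly the system-buffer copy and interrupt overhead discussed above:

```c
/* Minimal BSD-sockets UDP sender: every sendto() traps into the kernel
 * and copies the message from user memory into a system buffer. */
#include <arpa/inet.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);   /* connectionless UDP socket */

    struct sockaddr_in peer;
    memset(&peer, 0, sizeof peer);
    peer.sin_family = AF_INET;
    peer.sin_port = htons(5000);                        /* example port */
    inet_pton(AF_INET, "192.168.1.2", &peer.sin_addr);  /* example node */

    const char msg[] = "hello cluster node";
    /* System call: the kernel copies msg into a system buffer, then sends. */
    sendto(fd, msg, sizeof msg, 0, (struct sockaddr *)&peer, sizeof peer);

    close(fd);
    return 0;
}
```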

1.5.2 Low-latency Protocols

Several research projects during the 1990s aimed to avoid operating system intervention in messaging. These projects led to the development of low-latency protocols that provide user-level messaging services across high-speed networks. Low-latency protocols developed during the 1990s include Active Messages, Fast Messages, the VMMC (Virtual Memory-Mapped Communication) system, U-net and the Basic Interface for Parallelism (BIP), among others.

1.5.2.1 Active Messages

Active Messages was developed at the University of California, Berkeley. It [17] is the enabling low-latency communications library for the Berkeley Network of Workstations (NOW) project [18]. Short messages in Active Messages are synchronous and based on the concept of a request-reply protocol. The sending user-level application constructs a message in user memory. To transfer the data, the receiving process allocates a receive buffer in user memory on the receiving side and sends a request to the sender. The sender replies by copying the message from the user buffer on the sending side directly to the network. No buffering in system memory is performed. The network hardware transfers the message to the receiver, and the message is then transferred from the network to the receive buffer in user memory. Active Messages requires that user virtual memory on both the sending and receiving sides be pinned to an address in physical memory, so that it will not be paged out during the network operation. Once the pinned user memory buffers are established, no operating system intervention is required for a message to be sent. Since no copies from user memory to system memory are used, this protocol is known as a zero-copy protocol. To support multiple concurrent parallel applications in a cluster, Active Messages was extended to Generic Active Messages (GAM). In GAM, a copy sometimes occurs to a buffer in system memory on the receiving side so that user buffers can be reused more efficiently; in this case, the protocol is referred to as a one-copy protocol.

1.5.2.2 Fast Messages

Fast Messages, developed at the University of Illinois, is similar to Active Messages [19]. Fast Messages extends Active Messages by imposing stronger guarantees on the underlying communication: it guarantees that all messages arrive reliably and in order, even if the underlying network hardware does not. Fast Messages uses flow control to ensure that a fast sender cannot overrun a slow receiver and thereby cause messages to be lost. Flow control is implemented in Fast Messages with a credit system that manages pinned memory in the host computers.
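The credit scheme can be sketched as follows; this is our illustration with hypothetical helper functions, not the actual Fast Messages implementation:

```c
/* Sketch of credit-based flow control in the style of Fast Messages.
 * CREDITS mirrors the number of pinned receive buffers at the peer.
 * fm_send() and poll_for_credit_return() are hypothetical helpers. */
#include <stddef.h>

#define CREDITS 16

void fm_send(const void *msg, size_t len);  /* user-level send, no OS call */
int  poll_for_credit_return(void);          /* credits acked by the receiver */

static int credits = CREDITS;

void send_with_flow_control(const void *msg, size_t len)
{
    while (credits == 0) {
        /* The receiver has no free pinned buffers left: wait until it
         * drains some buffers and returns credits to us. */
        credits += poll_for_credit_return();
    }
    credits--;          /* consume one receive buffer at the peer */
    fm_send(msg, len);
}
```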


1.5.2.3 VMMC

The Virtual Memory-Mapped Communication (VMMC) [20] system was a low-latency protocol developed for the Princeton SHRIMP project. One goal of VMMC is to view messaging as reads and writes into the user-level virtual memory system. VMMC works by mapping a page of user virtual memory to physical memory and establishing a correspondence between pages on the sending and receiving sides. It uses specially designed hardware that allows the network interface to snoop writes to memory on the local host and have these writes automatically reflected in the remote host's memory. Various optimisations of these writes have been developed that help to minimize the total number of writes and the network traffic, and to improve overall application performance. VMMC is an example of a paradigm known as distributed shared memory (DSM). In DSM systems memory is physically distributed among the nodes in a system, but processes in an application may view shared memory locations as identical and perform reads and writes to the shared memory locations.

1.5.2.4 U-net

The U-net network interface architecture [21] was developed at Cornell University. U-net provides zero-copy messaging where possible. U-net adds the concept of a virtual network interface for each connection in a user application. Just as an application has a virtual memory address space that is mapped to real physical memory on demand, each communication endpoint of the application is viewed as a virtual network interface mapped to a real set of network buffers and queues on demand.

The advantage of this architecture is that once the mapping is defined, each active interface has direct access to the network without operating system intervention. The result is that communication can occur with very low latency.

1.5.2.5 BIP

BIP [22] (Basic Interface for Parallelism) is a low-latency protocol that was developed at the University of Lyon. BIP is designed as a low-level message layer over which a higher-level layer such as the Message Passing Interface (MPI) [10] can be built. Programmers can use MPI over BIP for parallel application programming. The initial BIP interface consisted of both blocking and non-blocking calls. Later versions (BIP-SMP) provide multiplexing between the network and shared memory under a single API for use on clusters of symmetric multiprocessors.

BIP achieves low latency and high bandwidth by using different protocols for various message sizes. It also provides zero or single memory copies of user data. To simplify the design and keep overheads low, BIP guarantees in-order delivery of messages, although some flow control issues for small messages are passed to higher software levels.
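Size-dependent protocol selection of the kind BIP uses is commonly realised as an eager path for small messages and a rendezvous path for large ones. A schematic sketch of this design choice, with a hypothetical threshold and helper functions of our own invention:

```c
/* Schematic eager/rendezvous switch in the style of size-dependent
 * protocols such as BIP; the helpers and threshold are hypothetical. */
#include <stddef.h>

#define EAGER_LIMIT 4096   /* assumed small-message threshold, in bytes */

void eager_send(const void *buf, size_t len);       /* copy into pre-pinned buffer */
void rendezvous_send(const void *buf, size_t len);  /* handshake, then zero-copy DMA */

void protocol_send(const void *buf, size_t len)
{
    if (len <= EAGER_LIMIT)
        eager_send(buf, len);       /* one copy, no handshake: lowest latency */
    else
        rendezvous_send(buf, len);  /* zero-copy transfer: highest bandwidth */
}
```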

1.5.3 Standards for Cluster Communication

By the late 1990s, research on low-latency protocols had progressed sufficiently for a new standard for low-latency messaging to be developed: the Virtual Interface Architecture (VIA). During a similar period, industrial researchers worked on standards for shared storage subsystems. The combination of the efforts of many researchers has resulted in the InfiniBand standard.

1.5.3.1 VIA

The Virtual Interface Architecture (VIA) [23] is a communications standard that combines many of the best features of various academic projects. A consortium of academic and industrial partners, including Intel, Compaq, and Microsoft, developed the standard. VIA supports heterogeneous hardware and was available as of early 2001. It is based on the concept of a virtual network interface. Before a message can be sent in VIA, send and receive buffers must be allocated and pinned to physical memory locations; no system calls are needed after the buffers and associated data structures are allocated. A send or receive operation in a user application consists of posting a descriptor to a queue. The application can choose to wait for a confirmation that the operation has completed, or it can continue host processing while the message is being processed.

Several hardware vendors and some independent developers have developed VIA implementations for various network [24, 25] products. VIA implementations can be classified as native or emulated. A native implementation of VIA off-loads a portion of the processing required to send and receive messages to special hardware on the network interface card.

When a message arrives in a native VIA implementation, the network card performs at least a portion of the work required to copy the message into user memory. In an emulated VIA implementation, the host CPU performs the processing to send and receive messages. Although the host processor is used in both cases, an emulated implementation of VIA has less overhead than TCP/IP. However, the services provided by VIA differ from those provided by TCP/IP, since communication in VIA may not be guaranteed to arrive reliably.

1.5.3.2 InfiniBand

The InfiniBand standard [26] is another standard for cluster communication, supported by a large consortium of industrial partners including Compaq, Dell, Hewlett-Packard, IBM, Intel, Microsoft and Sun Microsystems. The InfiniBand architecture replaces the standard shared I/O bus of conventional computers with a high-speed serial, channel-based, message-passing, scalable, switched fabric. There are two types of adapters: host channel adapters (HCA) and target channel adapters (TCA). All systems and devices attach to the fabric through HCAs or TCAs, as shown in Figure 1.6. In InfiniBand, data is sent in packets, and six types of transfer methods are available: reliable and unreliable connections, reliable and unreliable datagrams, multicast connections, and raw packets.


Figure 1.6: The InfiniBand Architecture

InfiniBand supports remote direct memory access (RDMA) read and write operations, which allow one processor to read or write the contents of memory at another processor, and it also directly supports IPv6 [27] messaging for the Internet. InfiniBand has several components:

Host channel adapter (HCA): An interface that resides within a server and communicates directly with the server's memory, processor, and a target channel adapter or a switch. It guarantees delivery of data and can recover from transmission errors.

Target channel adapter (TCA): Enables I/O devices to be located within the network, independent of a host computer. It includes an I/O controller that is specific to its particular device's protocol. TCAs can communicate with an HCA or a switch.

Switch: Virtually equivalent to a traffic cop, a switch allows many HCAs and TCAs to connect to it and handles network traffic. It offers higher availability, higher aggregate bandwidth, load balancing, data mirroring and much more. It looks at the local route header on each packet of data and forwards it to the appropriate location. A group of switches is referred to as a fabric. If a host computer is down, the switch still continues to operate, and it also frees up servers and other devices by handling network traffic.

Router: Forwards data packets from a local network (called a subnet) to external subnets. It reads the global route header and forwards the packet to the appropriate address, rebuilding each packet with the proper local address header as it passes it to the new subnet.

Subnet manager: An application responsible for configuring the local subnet and ensuring its continued operation. Configuration responsibilities include managing switch and router setups and reconfiguring the subnet if a link goes down or a new one is added.

The InfiniBand Architecture (IBA) comprises four primary layers that describe communication devices and methodology:

Physical layer: Defines the electrical and mechanical characteristics of the IBA, including the cables, connectors and hot-swap characteristics. IBA connectors include fiber, copper and backplane connectors. There are three link speeds, specified as 1X, 4X and 12X; a 1X link cable has four wires, two for each direction of communication (read and write).

Link layer: Covers packet layout, point-to-point link instructions, switching within a local subnet, and data integrity. There are two types of packets, management and data: management packets handle link configuration and maintenance, while data packets carry up to 4 kilobytes of transaction payload. Every device in a local subnet has a local ID (LID) used to forward data appropriately. Data integrity is handled through variant and invariant cyclic redundancy checks (CRC): the variant CRC covers fields that change from point to point, while the invariant CRC provides end-to-end data integrity.

Network layer: Responsible for routing packets from one subnet to another. The global route header located within a packet includes an IPv6 address for the source and destination of each packet. For single-subnet environments, the network layer information is not used.

Transport layer: Handles the order of packet delivery, as well as partitioning, multiplexing and the transport services that determine reliable connections.
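As a concrete illustration of an RDMA write, the fragment below uses the libibverbs API that is commonly used to program InfiniBand on Linux (a later, widely available interface; this is our sketch, not part of the thesis). It assumes a queue pair qp that is already connected, a registered memory region mr, and the peer's remote address and rkey obtained during connection setup, all of which are omitted here:

```c
/* Sketch: posting an RDMA write with libibverbs. Assumes a connected
 * queue pair `qp`, a registered memory region `mr`, and the peer's
 * remote_addr/rkey exchanged during connection setup. */
#include <infiniband/verbs.h>
#include <stdint.h>
#include <string.h>

int rdma_write(struct ibv_qp *qp, struct ibv_mr *mr,
               void *local_buf, uint32_t len,
               uint64_t remote_addr, uint32_t rkey)
{
    struct ibv_sge sge;
    memset(&sge, 0, sizeof sge);
    sge.addr   = (uintptr_t)local_buf;   /* pinned, registered user memory */
    sge.length = len;
    sge.lkey   = mr->lkey;

    struct ibv_send_wr wr, *bad_wr = NULL;
    memset(&wr, 0, sizeof wr);
    wr.opcode     = IBV_WR_RDMA_WRITE;   /* write directly into remote memory */
    wr.sg_list    = &sge;
    wr.num_sge    = 1;
    wr.send_flags = IBV_SEND_SIGNALED;   /* ask for a completion queue entry */
    wr.wr.rdma.remote_addr = remote_addr;
    wr.wr.rdma.rkey        = rkey;

    /* The HCA performs the transfer; the remote CPU is not involved. */
    return ibv_post_send(qp, &wr, &bad_wr);
}
```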

1.6 Single System Image (SSI)

Single System Image (SSI) is the property by which we can view a distributed system as a single unified computing resource. This property hides the distributed and heterogeneous nature of the available resources and presents them to users as a single, powerful, unified computing resource [28]. A system supporting SSI gives users a system-wide view of the resources available to them, without their having to know the node with which those resources are physically associated. These resources can range from access and manipulation of remote processes to the use of a global file system. SSI provides high availability, allowing the system to keep operating after some failures, and it also helps ensure that the nodes are evenly loaded. SSI cluster-based systems are mainly focused on complete transparency of resource management, scalable performance and system availability in supporting user applications [28, 29, 30, 31, 32]. There are several desirable key attributes of SSI, among them: single point of entry, single user interface, single process space, single I/O and memory space, single job-management system, and single point of management and control.

The most important benefits of SSI [28] include:

SSI allows the use of resources in a transparent way: users do not have to think about their physical location.

It offers the same command syntax as other systems and thus reduces the risk of operator errors, with the result that end-users see improved performance, reliability and higher availability of the system.

End-users do not have to know where in the cluster an application will run.

SSI greatly simplifies system management and thus reduces the cost of ownership.

It promotes the development of standard tools and utilities.

1.7 Cluster Middleware

Middleware is the layer of software sandwiched between the operating system and applications. It has re-emerged as a means of integrating software applications that run in a heterogeneous environment. There is a large overlap between the infrastructure that high-level Single System Image (SSI) services provide to a cluster and that provided by the traditional view of middleware. Middleware helps a developer overcome three potential problems of developing applications on a heterogeneous cluster:

It gives the ability to access software inside or outside the developer's site.

It helps to integrate software from different sources.

It enables rapid application development.

The services that middleware provides are not restricted to application development. Middleware also provides services for the management and administration of a heterogeneous system.

1.7.1 Message-based Middleware

Message-based middleware uses a common communications protocol to exchange data between applications. The communications protocol hides many of the low-level message passing primitives from the application developer. Message-based middleware software can pass messages directly between applications, send messages via software that queues waiting messages, or use some combination of the two. Examples of this type of middleware are the three upper layers of the OSI model [33]: the session, presentation and application layers.

1.7.2 RPC-based Middleware

There are many applications in which the interactions between processes in a distributed system are remote operations, often with a return value. For these applications Remote Procedure Call (RPC) is used. The implementation of the client/server model in terms of RPC allows the code of the application to remain the same whether the called procedures are local or remote. Inter-process communication mechanisms serve four important functions [34]:

They offer mechanisms against failure and provide the means to cross administrative boundaries.

They allow communication between separate processes over a computer network.

They enforce clean and simple interfaces, thus providing a natural aid for the modular structure of large distributed applications.

They hide the distinction between local and remote communication, thus allowing static or dynamic reconfiguration.
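The essence of RPC is a client-side stub that marshals arguments, sends them to the server, and unmarshals the reply, so that the caller sees an ordinary procedure call. A minimal hand-rolled sketch (the wire format is hypothetical and send_all()/recv_all() are assumed loop-until-complete wrappers, not a real RPC library):

```c
/* Hand-rolled RPC client stub sketch: the caller invokes remote_add()
 * like a local function; the stub marshals the arguments over a TCP
 * socket and unmarshals the return value. */
#include <arpa/inet.h>
#include <stdint.h>
#include <sys/types.h>

ssize_t send_all(int fd, const void *buf, size_t len);  /* hypothetical */
ssize_t recv_all(int fd, void *buf, size_t len);        /* hypothetical */

int32_t remote_add(int fd, int32_t a, int32_t b)
{
    /* Marshal: procedure id and arguments in network byte order. */
    uint32_t request[3] = { htonl(1u),            /* id 1 = "add" */
                            htonl((uint32_t)a),
                            htonl((uint32_t)b) };
    send_all(fd, request, sizeof request);

    /* Block for the reply and unmarshal the return value. */
    uint32_t reply;
    recv_all(fd, &reply, sizeof reply);
    return (int32_t)ntohl(reply);
}
```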

1.7.3 Object Request Broker

An Object Request Broker (ORB) is a type of middleware that supports the remote execution of objects. An international ORB standard is CORBA (Common Object Request Broker Architecture). It is supported by more than 700 groups and managed by the Object Management Group (OMG) [35]. The OMG is a non-profit organization whose objective is to define and promote standards for object orientation in order to integrate applications based on existing technologies.

The Object Management Architecture (OMA) is characterized by the following:

The Object Request Broker (ORB): The controlling element of the architecture, it supports the portability of objects and their interoperability in a network of heterogeneous systems.

Object services: Specific system services for the manipulation of objects, whose goal is to simplify the process of constructing applications.

Application services: These offer a set of facilities allowing applications to access databases and printing services, to synchronize with other applications, and so on.

Application objects: These allow the rapid development of applications. A new application can be formed from objects in a combined library of application services.


1.8 Concluding Remarks

At the beginning of this thesis, we have studied the necessity of parallel computation and the issues related to it, focusing on the architectures, protocols and standards of Computer Clusters. The motivation of distributed processing using Computer Clusters leads to a more advanced technology named Grid Computing, which we discuss in the next chapter.


Chapter 2 Grid Computing: An Introduction


Grid Computing, or more specifically a Grid Computing System, is a virtualized distributed environment. A Grid environment provides dynamic runtime selection, sharing and aggregation of geographically distributed resources based on the availability, capability, performance and cost of these computing resources. Fundamentally, Grid Computing is an advanced form of distributed processing which combines a decentralized architecture for managing computing resources with a layered hierarchical architecture for providing services to the user [36].

The rest of the chapter is organized as follows. We begin our discussion with the definition of Grid Computing and the benefits of virtualization in Grids in Section 2.1. In Sections 2.3 and 2.4 we consider the underlying layers of Grid Computing in detail. Resource management architecture is discussed in Section 2.5, and a protocol for resource management (GRAM) is discussed in Section 2.5.2. We conclude our discussion in Section 2.6 by introducing a new approach to distributed processing named Cloud Computing.


2.1 Grid Computing: definitions and overview

The concept of the Grid was introduced in the early 1990s, when high performance computers were connected by fast data communication links. The motivation of that approach was to support calculation- and data-intensive scientific applications. Figure 2.1 [37] shows the evolution of the Grid over time.

Figure 2.1: Evolution of Grid Computing

The basic idea of the Grid is the co-allocation of distributed computational resources. The most cited definition of the Grid is [38]:

A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.

Again, according to IBM's definition [39],

A grid is a collection of distributed computing resources available over a local or wide area network that appear to an end user or application as one large virtual computing system. The vision is to create virtual dynamic organizations through secure, coordinated resource-sharing among individuals, institutions, and resources.

A Grid Computing environment must include:

Coordinated resources: A Grid environment must be facilitated with the necessary infrastructure for co-ordination of resources based upon policies and service level agreements.

Open standard protocols and frameworks: Open standards can provide interoperability and integration facilities. These standards should be applied for resource discovery, resource access and resource co-ordination. The Open Grid Services Infrastructure (OGSI) [40] and the Open Grid Services Architecture (OGSA) [41] were published by the Global Grid Forum (GGF) as proposed recommendations for this approach.

Grid Computing can also be distinguished from High Performance Computing (HPC) and clustered systems in the following way: the Grid focuses on resource sharing and can result in HPC, whereas HPC does not necessarily involve sharing of resources [42].

2.1.1 Virtualization and Grid

Virtualization is the process of making resources accessible to a user as if they were a single, larger, homogeneous resource. Virtualization supports the concept of dynamically shifting resources across various platforms so that computing demands can be scaled with available resources [43]. Figure 2.2 shows the necessity of virtualization for the proper utilization of resources: although the average utilization of the resources may be relatively low, during peak cycles the server might be overtaxed and resources may not be available.

Figure 2.2: Resource availability according to demand

Grid environments support the benefits of virtualization. The Grid enables the abstraction of distributed systems and resources, such as processing, network bandwidth and data storage, to create a single system image. Such abstraction provides continuous access to a large pool of IT capabilities. Figures 2.3 and 2.4 [37] compare the Grid environment with traditional computation. Figure 2.4 shows an organization-owned computational grid in which a scheduler sets policies and priorities for placing jobs in the grid infrastructure.

2.1.2 Grids over Cluster Computing

Computer Clusters, detailed in Chapter 1, are local to their domain. Clusters are designed to resolve the problem of inadequate computing power: they provide more computational power by pooling computational resources and parallelizing the workload. As Clusters provide dedicated functionality to a local domain, they are not a suitable solution for resource sharing between users of various domains. Nodes in a Cluster are controlled centrally, and the Cluster manager monitors the state of the nodes [44]. So, in brief, Cluster units provide only a subset of Grid functionality.

Figure 2.3: Serving job requests in traditional environment

2.2 An example of Grid Computing environment

We consider searching the world wide web with Google as an example of a Grid Computing environment. Figure 2.5 shows an abstract view of the Google search architecture [45]. Google processes tens of thousands of queries per second. Each query is first received by one of the Web Servers, which then passes it to the array of Index Servers. Index Servers are responsible for keeping an index of the words and phrases found in websites. The servers are distributed over several machines, and hence the search runs concurrently. In a fraction of a second, the index servers perform a logical AND operation and return references to the websites containing the query (search phrase). The resulting references are then sent to the Store Servers. Store Servers maintain compressed copies of all the pages known to Google. These compressed copies are used to prepare page snippets, which are finally presented to the end user in readable form.
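The logical AND performed by the index servers is essentially an intersection of sorted posting lists, i.e. the lists of document IDs in which each query word occurs. A minimal sketch of that core operation (our illustration, not Google's actual code):

```c
/* Intersect two sorted posting lists of document IDs: the core of the
 * logical AND an index server performs for a two-word query. */
#include <stddef.h>

size_t intersect(const int *a, size_t na,
                 const int *b, size_t nb,
                 int *out)
{
    size_t i = 0, j = 0, k = 0;
    while (i < na && j < nb) {
        if (a[i] < b[j])
            i++;                 /* doc matches only the first word */
        else if (a[i] > b[j])
            j++;                 /* doc matches only the second word */
        else {
            out[k++] = a[i];     /* doc contains both words */
            i++; j++;
        }
    }
    return k;                    /* number of matching documents */
}
```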


Figure 2.4: Serving job requests in Grid environment

Crawler machines continuously crawl the web and update the Google database of pages stored in the Index and Store Servers. The Store Servers therefore contain relatively recent compressed copies of all the pages available on the web.

Grid Computing facilitates the above scenario of efficient searching. As stated earlier, the servers are distributed and the search must run in parallel in order to achieve efficiency. The infrastructure also needs to scale with the growth of the web as the number of pages and indexes increases. Different organizations and numerous servers share their content with Google, which is allowed to copy the content and transform it into its local resources. These local resources comprise the keyword database of the Index Servers and the cached content in the database of the Store Servers. The resources are partially shared with end-users who send queries through their browsers. Users can then directly contact the original servers to request the full content of a web page.

Google also shares its computing cycles. It shares its computing resources, such as storage and computing capabilities, with the end-user by performing data caching, ranking and searching of queries.

Figure 2.5: Google search architecture

2.3 Grid Architecture

In this section we discuss the Grid architecture, which identifies the basic components of a grid system and defines the purpose and functions of such components. This layered Grid architecture also indicates how these components interact with one another. Here we present the Grid architecture in correspondence with the Internet protocol architecture: just as the Internet protocol architecture extends from network to application, we can relate the Grid layers to the Internet layers [46]. Figure 2.6 shows the Grid layers from top to bottom.

Figure 2.6: Grid Protocol Architecture

In the Grid architecture described in [46], the Resource and Connectivity protocols are responsible for sharing individual resources. The protocols in these layers are designed to be implemented on top of the various types of resources that we identify as the Fabric layer. The raw Fabric, in turn, can be used to support application-specific requirements.

2.3.1 Fabric Layer: Interfaces to Local Resources

The Fabric layer provides the resources that can be shared in a Grid environment. Examples of such resources are computational resources, storage systems, sensors and network systems. The Grid architecture does not deal with logical resources, for example distributed file systems, whose implementation requires individual internal protocols [46].

Components of the Fabric layer implement the local, resource-specific operations on particular resources, whether physical or logical. These resource-specific operations provide the functionality for sharing operations at higher levels. In order to support sharing mechanisms we need to provide [44]:

an inquiry mechanism so that the components of the Fabric layer can discover and monitor resources;

appropriate (application-dependent, unified, or both) resource management functionality to control the QoS in the Grid environment.

2.3.2 Connectivity Layer: Managing Communications

The Connectivity layer defines the core communication and authentication protocols necessary for Grid networks. Communication protocols transfer data between Fabric layer resources. Authentication protocols build on the communication services to provide cryptographically secure mechanisms for Grid users and resources.

The communication protocol can work with any networking protocol that supports transport, routing, and naming functionalities. In computational Grids, the TCP/IP Internet protocol stack is commonly used [46].

2.3.3 Resource Layer: Sharing of a Single Resource

The Resource layer sits on top of the Connectivity layer and defines the protocols, along with APIs and SDKs, for secure negotiation, monitoring, initialization, control and payment of sharing operations on individual resources. The Resource layer uses Fabric layer interfaces and functions to access and control local resources. This layer is entirely concerned with local, individual resources and therefore ignores global resource management issues [46]. To share a single resource, we need two classes of Resource layer protocols [46]:

Information protocols: Information protocols are used to discover information about the state and structure of a resource, for example the configuration of the resource, its current load state, its usage policy or its cost.

Management protocols: Management protocols in the Resource layer are used to control access to a shared resource. The protocols specify resource requirements, including advance reservation and QoS, and the operations on resources, such as process creation and data access. Protocols are also needed to support monitoring of application status and termination of an operation.

2.3.4 Collective Layer: Co-ordination with multiple resources

The Resource layer, described in Section 2.3.3, deals with the operation and management of single resources. For global resource co-ordination, Collective layer protocols are used. This layer provides the necessary APIs and SDKs, associated not with any specific resource but with the global resources of the overall Grid environment.

Figure 2.7: Collective and Resource layer protocols are combined in various ways to provide application functionality

The implementation of Collective layer functions can be built on Resource layer or other Collective layer protocols and APIs [46]. Figure 2.7 shows a Collective co-allocation API and SDK that uses a Resource layer management protocol to control resources. On top of this, a co-reservation service protocol and the service itself are defined. Calling the co-allocation API to implement co-allocation operations provides additional functionality such as authorization and fault tolerance. An application can then use the co-reservation service protocol to request and perform end-to-end reservations.

2.3.5 Application Layer: User defined Grid Applications

The top layer of the Grid consists of user applications, which are constructed by utilizing the services defined at the lower layers. At each layer, there are well-defined protocols that provide access to useful services, for example resource management, data access and resource discovery. Figure 2.8 shows the correlation between the different layers [46]. APIs are implemented by SDKs, which use Grid protocols to provide functionality to the end user. A higher-level SDK can also provide functionality that is not directly mapped to a specific protocol; it may combine protocol operations with calls to additional APIs to implement local functionality.

Figure 2.8: Programmer's view of the Grid Architecture. Dotted lines denote protocol interactions while solid lines represent direct calls


2.4 Grid Computing with Globus

Globus [47] provides a software infrastructure so that applications can handle distributed computing resources as a single virtual machine [48]. The Globus Toolkit, the core component of the infrastructure, defines the basic services and capabilities required for a computational Grid. Globus is designed as a layered architecture in which high-level global services are built on top of low-level local services. In this section we discuss how the Globus Toolkit protocols actually interact with the Grid layers.

Fabric Layer: The Globus Toolkit is designed to use existing fabric components [46]. For example, enquiry software is provided for discovering structure and state information of various common resources, such as computer information (i.e. OS version, hardware configuration etc.) and storage systems (i.e. available space). In the higher-level protocols (particularly at the Resource layer), the implementation of resource management is normally assumed to be the domain of local resource managers.

Connectivity Layer: Globus uses the public-key based Grid Security Infrastructure (GSI) protocols [49, 50] for authentication, communication protection and authorization. GSI extends the Transport Layer Security (TLS) protocols [51] to address the issues of single sign-on, delegation, and integration with various local security solutions.

Resource Layer: The Grid Resource Information Protocol (GRIP) [52] is used to define a standard resource information protocol. The HTTP-based Grid Resource Access and Management (GRAM) protocol [53] is used for the allocation of computational resources, and also for monitoring and controlling computation on those resources. An extended version of FTP, GridFTP [54], is used for partial file access and for managing parallelism in high-speed data transfers [46].

The Globus Toolkit defines client-side C and Java APIs and SDKs for these protocols. Server-side SDKs are also provided for each protocol, to support the integration of various resources (computational, storage, network) into the Grid [46].

Collective Layer: Grid Information Index Servers (GIISs) support arbitrary views on resource subsets; the LDAP information protocol is used to access resource-specific GRISs to obtain resource state, and the Grid Resource Registration Protocol (GRRP) is used for resource registration. A couple of replica catalog and replica management services are also used to support the management of dataset replicas. There is an on-line credential repository service, known as MyProxy, that provides secure storage for proxy credentials [55]. The Dynamically-Updated Request Online Coallocator (DUROC) provides an SDK and API for resource co-allocation [56].

2.5 Resource Management in Grid Computing

In this section we discuss a resource management architecture for the Grid environment, described in [53]. A block diagram of the architecture is shown in Figure 2.9. To communicate requests for resources between components, a Resource Specification Language (RSL) is used, which is described in detail in Section 2.5.1. With the help of a process called specialization, resource brokers transform the high-level RSL specification into a concrete specification of resources, named a ground request, which is passed to a co-allocator. The co-allocator is responsible for allocating and managing resources at multiple sites. A multi-request is a request that involves resources at multiple sites; resource co-allocators can break such a multi-request into components and pass each element to the appropriate resource manager. The information service, sitting between the resource broker and the co-allocator, is responsible for giving access to the availability and capability of resources.

Figure 2.9: A resource management architecture for a Grid Computing environment

2.5.1 Resource Specification Language

The Resource Specification Language (RSL) is a combination of parameters together with the following operators:

& : conjunction of parameter specifications
| : disjunction of parameter specifications
+ : combination of two or more requests into a single compound request, or multi-request

Resource brokers, co-allocators and resource managers each define a set of parameter names. Resource managers generally recognize two types of parameter names in order to communicate with local schedulers:

MDS attribute names: to express constraints on resources, for example memory>64 or network=atm.

Scheduler parameters: used to communicate information related to the job, i.e. count (number of nodes required), max_time (maximum time required), executable, environment (environment variables) etc.

For example, the following simple specification, taken from [53],

&(executable=myprog)
 (|(&(count=5)(memory>=64))
   (&(count=10)(memory>=32)))

requests 5 nodes with at least 64 MB of memory, or 10 nodes with at least 32 MB of memory. Here executable and count are scheduler parameters.

The following is an example of a multi-request:

+(&(count=80)(memory>=64)
  (executable=my_executable)
  (resourcemanager=rm1))
 (&(count=256)(network=atm)
  (executable=my_executable)
  (resourcemanager=rm2))

Here two requests are concatenated by the + operator. This is also an example of a ground request, as every component of the request specifies a resource manager.
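Since RSL requests are plain strings, they can be composed programmatically. The following sketch (a hypothetical helper, not part of any Globus SDK) rebuilds the multi-request above from parameter triples:

def conjunction(*specs):
    # Render (name, operator, value) triples as an RSL conjunction.
    return "&" + "".join("(%s%s%s)" % s for s in specs)

req1 = conjunction(("count", "=", 80), ("memory", ">=", 64),
                   ("executable", "=", "my_executable"),
                   ("resourcemanager", "=", "rm1"))
req2 = conjunction(("count", "=", 256), ("network", "=", "atm"),
                   ("executable", "=", "my_executable"),
                   ("resourcemanager", "=", "rm2"))
multi_request = "+(%s)(%s)" % (req1, req2)   # '+' combines the two requests
print(multi_request)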

2.5.2 Globus Resource Allocation Manager (GRAM)

The Globus Resource Allocation Manager (GRAM) is designed to run jobs remotely, providing an API for submitting, monitoring and terminating jobs. GRAM is the lowest level of the Globus resource management architecture [57].

Figure 2.10: Globus GRAM Architecture

Figure 2.10 shows the basic architecture of GRAM. When a job is submitted, the request is sent to the gatekeeper of the remote computer. The gatekeeper handles the request and creates a job manager for the job. The job manager then starts and monitors the remote program, communicating state changes back to the user on the local machine. When the remote application terminates (either normally or by failing), the job manager also terminates [57].
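The gatekeeper/job-manager interaction can be summarized in a few lines of illustrative Python (hypothetical stubs that mimic only the control flow; the real GRAM protocol is HTTP-based and handled by the Globus Toolkit):

class JobManager:
    # Created by the gatekeeper for one job; it starts and monitors the
    # remote program and reports state changes back to the user.
    def __init__(self, rsl):
        self.rsl = rsl

    def run(self, report):
        for state in ("PENDING", "ACTIVE", "DONE"):
            report(self.rsl, state)
        # the job manager terminates together with the remote application

def gatekeeper(user, rsl):
    # Handle an incoming submission: authenticate the user, then create
    # one job manager per job.
    if user not in {"alice"}:                 # toy credential check
        raise PermissionError("authentication failed")
    return JobManager(rsl)

jm = gatekeeper("alice", "&(executable=myprog)(count=5)")
jm.run(lambda rsl, state: print(rsl, "->", state))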

2.6 Evolution towards Cloud Computing from Grid

The convergence of Grid Computing with Service-Oriented Computing (SOC) recasts Grid functionality in the form of services. A service-oriented Grid offers virtualization of the available resources, which increases the versatility of the Grid [58]. It also binds Grid-specific services at the hardware level with application services. With the help of Grid Computing it is possible to integrate heterogeneous physical resources into a virtualized and centrally accessible computing unit. Based on this convergence with SOC, Grid Computing is offered in the form of Grid services [42], as shown in Figure 2.11.

Figure 2.11: Enhancement of the generic Grid architecture to a Service-Oriented Grid

In order to meet market demands, providers aim to offer the following functionality [42]: a scalable, flexible, robust and reliable physical infrastructure; platform services that enable programmatic access to the physical infrastructure through abstract interfaces; and SaaS-supported (described in Chapter 3) scalable physical infrastructure. All this is emerging in new on-line platforms referred to as Cloud Computing, which provide X-as-a-Service products that we discuss in the next chapter.

2.7 Concluding remarks

In this chapter we have briefly discussed the Grid Computing environment and compared it with traditional Clusters. We have also discussed the layered architecture of the Grid. As an implementation of the Grid, we considered the Globus Toolkit and correlated the Grid layers with their Globus implementations. Later, we discussed resource management issues in the Grid, focusing on how the GRAM protocol is actually used in the Globus Toolkit to manage resource requests. We conclude the chapter by introducing Cloud Computing, a new trend in distributed systems inspired by the Grid and Service-Oriented Computing.


Chapter 3
An Overview of Cloud Architecture


In a Cloud environment, hardware and software services are hosted on web servers (the Cloud) rather than on a single computer, and are accessed over the Internet. Cloud computing is responsible for delivering IT functionality to external users, obtaining that functionality from external providers as services in a pay-per-use manner over the Internet. These Cloud services are consumed via a web browser or a defined API [59].

The rest of the chapter is organized as follows: we begin our discussion with a detailed architectural overview of the Cloud Computing environment in Section 3.2. Cloud services (PaaS, IaaS, SaaS) are discussed in detail in Section 3.3, and virtualization in the Cloud is discussed in Section 3.4. We conclude the chapter by explaining a practical Cloud implementation in Section 3.5.

3.1 Cloud Components

Cloud environments consist of the following elements: clients, data-centers and distributed servers [60]. These components are combined together to build a Cloud Computing solution, as shown in Figure 3.1. Each element has distinct functionality, described next.

Figure 3.1: Components of a Cloud Computing Solution

i. Clients:

Clients are the same as in traditional Local Area Networks (LANs). In general, clients are computers or machines used for accessing functionality; because of their mobility, these may include laptops, tablet computers, mobile or cellular phones and PDAs. Clients are generally classified into the following three categories:

Mobile clients: mobile devices like PDAs or smartphones. Examples are Blackberry, Windows Mobile smartphones, iPhone/iPad etc.

Thin clients: computers that have no internal hard drive; the server does all the work, and the client's task is to display the information. Generally used as terminals.

Thick clients: regular computers, using a web browser to connect to the Cloud.

ii. Data-centers:


A data-center is a collection of servers on which the processes or applications are hosted. Servers can be physically grouped in a room or building, or can be distributed throughout the world. On a virtualized server, an application is installed in a way that allows multiple instances, so that all virtual servers can access it. Using this principle, several virtual servers can run on one physical server. The number of virtual servers per physical server depends on the type of application, the size and speed of the server, and the service provided by the provider.

iii. Distributed Servers:

As said earlier, servers are often in geographically disparate locations, but to end-users the servers behave as if they operate right next to each other. This gives flexibility in operations and enhances security and privacy. If any of the servers goes down due to failure or maintenance, the service provided by the system can still be accessed through the other distributed server(s).

3.2 Cloud Architectures

Cloud architectures address the difficulties that arise in large-scale data processing. In the traditional approach it is difficult to allocate processing units as per application demand, and sometimes difficult to obtain CPU according to users' requirements. Job allocation is another problem: it is often difficult to distribute and maintain large-scale jobs on different machines, and recovery mechanisms on other machines are needed to survive failures. Scalability is a further issue in the traditional approach, where it is difficult to scale up and scale down automatically. Cloud architectures, in contrast to traditional approaches, concentrate on solving these problems [61].


In Cloud computing, computational resources are provided as services, generally known as XaaS (X-as-a-Service). In particular, the Cloud is a virtualization of the Grid and of traditional web services. Once Cloud services and a platform have been created, access to a virtual Grid can be offered to the companies that request it by creating Guest Virtual Organizations (GVOs) [62]. One possible distinction between the Cluster, Grid and Cloud architectures is shown in Figure 3.2.

Figure 3.2: Hierarchical abstraction layers of Cluster, Grid and Cloud Computing

In the rest of the section we discuss various approaches to Cloud architecture and give a brief overview of the underlying layers.

3.2.1 A layered model of Cloud architecture - Cloud ontology

The Cloud ontology is considered as a stack of layers. Each layer consists of one or more Cloud services. Services with the same level of abstraction (determined by their targeted users) belong to the same layer [63]. For example, Cloud software environments are mainly targeted at programmers or developers, while Cloud applications target end-users; so Cloud software environments and Cloud applications are classified into different layers.

The ordering of the Cloud stack is important, as it determines the work-flow in the Cloud.

For example, Cloud applications are composed from Cloud software environments; hence the application layer occupies the upper position in the Cloud stack. The Cloud ontology, shown in Figure 3.3, is depicted as a stack of five layers [63]: a) Cloud Application Layer, b) Cloud Software Environment Layer, c) Cloud Software Infrastructure Layer, d) Software Kernel Layer, e) Hardware Layer.

Figure 3.3: Cloud layered architecture: consists of five layers; the figure represents the interdependency between layers

(a) Cloud Application Layer:

The Application layer is the top of the Cloud stack and the layer most visible to end-users. Users access these services through the Internet by paying the necessary fees. Computational work is carried from the user's terminal (input) to the processing units (e.g. data centers) where the applications are hosted. The whole procedure is abstracted from end-users, who receive the outputs of CPU-intensive and memory-intensive large-scale tasks on their local machine.

From the provider's perspective, higher manageability can be achieved. The application is deployed on the provider's infrastructure, not on client machines; hence the provider can maintain or upgrade the system without interrupting users.

This model is generally known as Software as a Service (SaaS). Cloud applications can themselves be composed as services for other Cloud services. Cloud applications can be developed in Cloud software environments, or sometimes directly on Cloud infrastructure components.

(b) Cloud Software Environment Layer:

The layer just below the Application layer is the Software Environment layer. This layer mainly targets developers, who build and deploy software for end-users in the Cloud. Providers at this layer supply a suitable programming-language-level development environment by means of a well-defined and documented API. The API integrates developers' software and provides the necessary deployment and scalability support. The service provided by this layer is known as Platform as a Service (PaaS).

Developers benefit from developing their applications in a Cloud programming environment with support for automatic load balancing, authentication services, e-mail services etc. Developers can add the necessary services to their applications on demand, which makes application development less tedious and minimizes logic faults [63]. Hadoop [64], a Cloud software environment, provides developers with a programming environment (MapReduce, a programming model for data processing on large clusters [65]). Yahoo's Pig [66] is a high-level language that can process very large files in a Hadoop environment. In this way developers can benefit from several services as required.
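To give a flavour of the MapReduce model mentioned above, the following minimal word-count sketch in plain Python mimics only the map, shuffle and reduce phases; it is not the Hadoop API:

from collections import defaultdict

def map_phase(document):
    # map: emit one (word, 1) pair per word in the input
    return [(word, 1) for word in document.split()]

def reduce_phase(pairs):
    # shuffle: group values by key; reduce: sum each group
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sum(values) for key, values in groups.items()}

docs = ["the cloud", "the grid and the cloud"]
pairs = [p for d in docs for p in map_phase(d)]
print(reduce_phase(pairs))   # {'the': 3, 'cloud': 2, 'grid': 1, 'and': 1}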


(c) Cloud Software Infrastructure Layer:

The Software Infrastructure layer provides the necessary resources to the higher-level layers. Services offered in this layer are classified into the following subclasses: i. Computational Resources, ii. Data Storage and iii. Communications.

i. Computational Resources:

Cloud users obtain computational resources in this layer in the form of Virtual Machines (VMs). The service provided is often known as Infrastructure as a Service (IaaS). Virtualization gives the user flexibility in configuring settings, and at the same time it protects the physical infrastructure of the provider's data center [63]. Virtualization is shown in Figure 3.4, where the traditional non-cloud environment runs three different applications each on its own server, whereas the Cloud shares the servers between OS and applications, which results in fewer servers [67].

Figure 3.4: A non-cloud environment needs three servers, but in the Cloud two servers are used


IaaS benefits from two types of virtualization technology: paravirtualization and hardware-assisted virtualization. Still, the problem of performance interference between VMs sharing the same cache and TLB hierarchy remains unsolved. Modern multi-core machines in main servers sometimes create performance isolation problems; this lack of performance isolation between VMs that share the same physical node is problematic for optimal performance [63]. We cover more on virtualization in Section 3.4.

ii. Data Storage:

Data storage is another infrastructure resource in this layer, which allows users to store their data on remote storage devices and provides an access mechanism available anytime and from anywhere. The service provided by Cloud providers is known as Database as a Service (DaaS). DaaS facilitates scalability of Cloud applications for both users and developers.

At a preliminary level, a Cloud storage system needs one data server connected to the Internet. A client accesses data by interacting with the database server through a web-based interface. The server may send back files kept by the user, or provide functionality to manipulate the data on-line. Practically, however, commercial Cloud storage systems use hundreds of data servers; for server maintenance or repair it is necessary to keep multiple machines to fulfill users' demands. This creates redundancy, but without this redundancy clients could not be guaranteed access to their information at any given time. Often, providers keep data on servers running on different power supplies, which ensures that clients can still access and manipulate data even in case of power failures [68].

Some examples of data storage systems are: distributed file systems (e.g. the Google File System [69]), replicated relational databases (RDBMS) (e.g. Bayou [70]) and key-value stores (e.g. Dynamo [71]). The RDBMS model puts more focus on the consistency model [72, 73], but pays the cost of reduced data availability. Key-value stores, on the other hand, give much importance to data availability with a loosened-up consistency model [63].
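The trade-off between availability and consistency can be illustrated with a toy key-value store (pure illustration; real systems such as Dynamo use quorums and vector clocks rather than this naive scheme):

import random

class ToyKVStore:
    # Writes go to all reachable replicas and reads are answered by any
    # single replica, so the store stays available while at least one
    # replica is up, at the price of possibly stale (eventually
    # consistent) reads after a partial write.
    def __init__(self, n_replicas=3):
        self.replicas = [dict() for _ in range(n_replicas)]

    def put(self, key, value, reachable=None):
        targets = reachable if reachable is not None else range(len(self.replicas))
        for i in targets:
            self.replicas[i][key] = value

    def get(self, key):
        return random.choice(self.replicas).get(key)

store = ToyKVStore()
store.put("x", 1)                 # replicated to all three replicas
store.put("x", 2, reachable=[0])  # network partition: two replicas stale
print(store.get("x"))             # 1 or 2, depending on the replica asked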

iii. Communication:

The rate of data transfer is high in a Cloud environment, and communication plays a vital role in providing Quality of Service (QoS) in the Cloud infrastructure. To meet QoS requirements, the concept of Communication as a Service (CaaS) has been introduced, which covers network security, dynamic traffic isolation or dedicated bandwidth, guaranteed message delay, communication encryption, network monitoring etc. [63]. Though CaaS is the least discussed topic in the literature, a couple of research publications and articles [74, 75, 76] focus on the design and architecture of CaaS for providing QoS in communication systems. A practical example of CaaS is Microsoft's Connected Services Framework (CSF) [77]. VoIP telephone systems and instant messaging software in the Cloud can also use CaaS for better network utilization.

(d) Software Kernel:

The Software Kernel layer provides software management functionality for the physical servers in the Cloud. Such a software kernel can be implemented as an OS kernel, hypervisor, Virtual Machine Monitor (VMM) and/or as clustering middleware [63]. Grid applications can run in this layer over several connected clusters of machines. But due to the lack of virtualization in Grids, periodic check-pointing and load balancing are complicated, because jobs are tied to the actual hardware infrastructure rather than to the kernel. Two such middlewares for the Grid are Globus [78] and Condor [79].

(e) Hardware and Firmware:

The bottom layer of the Cloud layered architecture is the fabric layer, i.e. the actual physical hardware and switches that form the so-called backbone of the Cloud [63]. Users of this layer are organizations with massive IT requirements. Providers sometimes offer Hardware as a Service (HaaS). This model helps enterprise clients avoid building and maintaining large data centers. Services included in (but not limited to) HaaS are servers, desktops, notebooks, infrastructure components, licensing etc. [80].

Some technical challenges still exist in implementing HaaS effectively. Efficiency and speed in large-scale systems is a challenging issue. Remotely scriptable boot loaders (for example U-Boot [81]) are one solution for booting systems remotely and deploying applications hosted in distributed data centers. Other challenges in HaaS are data center management, scheduling, power consumption optimization etc. [63]. In Table 3.1 [63] we give examples of some existing Cloud systems, classified into the layers of the Cloud ontology.

Cloud Layer | Examples of existing Cloud solutions
Cloud Application Layer | Google Apps, Salesforce Customer Relationship Management (CRM)
Cloud Software Environment | Google App Engine, Salesforce Apex System
Cloud Software Infrastructure | Computational resources: Amazon EC2, Enomalism Elastic Cloud; Storage: Amazon S3, EMC Storage Managed Service; Communication: Microsoft Connected Services Framework (CSF)
Software Kernel | Grid and Cluster computing systems (for example Globus and Condor)
Firmware or Hardware | IBM-Morgan Stanley's Computing Sublease, IBM Kittyhawk Project

Table 3.1: Examples of existing Cloud systems classified into the layers of the Cloud ontology

3.2.2 Cloud Business Model

Cloud computing provides a service-driven business model [82]. In the Cloud, hardware and platform resources (which are actually provided as services) are available on demand. Each layer discussed in the layered architecture can be offered as a service to the layer above; in other words, every layer is a consumer of the layer below.

Figure 3.5: Cloud computing business model

Cloud services are generally grouped into three categories: a) Infrastructure as a Service (IaaS), b) Platform as a Service (PaaS) and c) Software as a Service (SaaS).

(a) Infrastructure as a Service (IaaS): In IaaS, the customer can deploy his own software on the infrastructure. IaaS provides infrastructural resources (for example servers, storage systems, networking devices, data center space etc. [83]) on demand, with the benefit of Virtual Machines (VMs). An organization offering IaaS is known as an IaaS provider. Common examples of IaaS providers include Amazon EC2 [84], GoGrid [85] and 3Tera [86].

(b) Platform as a Service (PaaS): PaaS provides platform-level resources, which may include support for operating systems and software development frameworks [82]. The combination of operating systems and software development frameworks (for example the LAMP platform: Linux, Apache, MySQL, PHP) ensures manageability and scalability of the Cloud environment [83]. Microsoft Windows Azure [87], Google App Engine [88] and Force.com [89] are common examples of PaaS providers.

(c) Software as a Service (SaaS): SaaS provides on-demand applications over the Internet. A single instance of the service (one or more softwares) runs on the Cloud, and multiple users connected through the Cloud can access it. Customers benefit by saving on equipment investment and software licensing costs; providers benefit because only a single instance of the software (service) needs to be hosted and maintained. SaaS is offered by Google [90], Microsoft [91] and Rackspace [92].

Figure 3.5 illustrates the typical Cloud business model. Based on the layered architecture of the Cloud, PaaS providers run on top of IaaS providers' services. But in current business markets, IaaS and PaaS providers often provide services jointly (for example Google and Salesforce) [82]. For that reason, PaaS and IaaS providers are often considered together as infrastructure providers or Cloud providers [93].

We cover the details of these services in Section 3.3.

3.2.3 Cloud Deployment Model

The Cloud deployment model describes the Cloud deployment scenarios available to a typical organization. The deployment model mainly defines [94]: a) External (or Public) Cloud, b) Internal (or Private) Cloud, c) Hybrid (or Integrated) Cloud and d) Community (or Vertical) Cloud.

Other than traditional Cloud solutions, an organization can implement the Cloud internally, commonly known as a Private Cloud. In a Private Cloud a business organization can achieve effective utilization of computing resources while ensuring security and privacy of data. Many analysts suggest that implementing Cloud systems internally inside an organization actually defeats the main objective of the Cloud [94].

The main focus of the traditional Cloud is obtaining computing resources from a network of Cloud service providers on demand, with provision for dynamic addition or subtraction of capacity. Implementing an internal Cloud means internal capacity. In a traditional (public) Cloud, end-users need not pay infrastructure costs once they purchase services from the providers, whereas a Private Cloud, like an internal data center, incurs depreciation costs. As a matter of fact, some would argue that a Private Cloud is actually the use of internal resources through a highly virtualized hardware and application wrapper [94]. Regardless of this debate, these different types of Cloud, each with its own advantages and drawbacks, are discussed here.


(a) External (Public) Cloud:

This Cloud solution is provided by independent third-party Cloud service providers, who offer their resources as services to everyone, from the general public to business organizations. Examples of the External (Public) Cloud deployment model are Amazon, Salesforce, Google and other Cloud service providers. Key attributes [94] of this deployment model are: services accessed through the web with a self-service user interface; well-documented user guides, APIs and technical support; Service Level Agreements (SLAs) between clients and providers; availability of multiple virtual machines with various configurations as per requirements (including configuration of processor, memory, operating system, application server, development environment and so on); and provision of different types of Cloud resources, for example Amazon provides different services targeting different groups of users, such as Amazon Simple Storage Service (S3) and Amazon SimpleDB for storage and Amazon Elastic Compute Cloud (EC2) for computation. Figure 3.6 shows an example of a Public Cloud. One of the major benefits of the Public Cloud is that there is no initial investment for infrastructure, but there is a controversy that Public Clouds lack control over data, network and security settings, which may hamper effectiveness in many business organizations [82].


Figure 3.6: External or Public Cloud

(b) Internal (Private) Cloud:

Internal or Private Clouds are designed mainly for a single organization. This type of Cloud can be built and managed by the organization itself or by external providers. Benefits of the Private Cloud include the highest degree of control over performance, reliability, security and privacy. But as said earlier, Private Clouds are criticized for their similarity to traditional proprietary servers or data centers, and hence do not provide the benefit of no up-front capital costs [82]. Figure 3.7 shows an example of a Private Cloud.

Figure 3.7: Internal or Private Cloud

Private vs. Public Cloud Computing: Several distinguishing characteristics [95] of a Private Cloud differentiate it from traditional distributed systems.


Firstly, a Private Cloud differs from Public Clouds in that its infrastructure is solely dedicated to a single business enterprise and is not shared with others; its users may include corporate clients, business partners, intranet vendors or other such groups. Secondly, security credentials are generally stricter in the Private Cloud deployment model. Though a Private Cloud is not inherently more secure than a Public Cloud, an organization that has security issues and risk concerns may adopt tighter security measures.

(c) Hybrid (Integrated) Cloud:

The combination of the Public and Private Cloud models is the Hybrid (or Integrated) Cloud. In this deployment model, part of the services runs in a Private Cloud while the rest runs in a Public Cloud. The Hybrid deployment model provides more adaptability, which makes it more flexible than the Public or Private models. More generally, Hybrid Clouds provide stronger security features and more control over applications and data compared to Public Clouds, while still providing scalability and serving clients' on-demand requests. But the complex part is determining the optimum partition, or splitting boundary, between the public and private components [82]. Hence, a Hybrid Cloud requires Cloud integration, so this model is often known as the Integrated deployment model. Cloud integration and interoperability are among the major research challenges in the Cloud industry [94]. Some Cloud interfaces and APIs, Cloud integration and interoperability standards, and tools for cross-cloud composition exist to meet business requirements, and these need to be improved for optimized performance and future demands. Figure 3.8 shows an example of a Hybrid Cloud.

Figure 3.8: Example of Hybrid Cloud

Major attributes [94] of Hybrid Clouds are: a combination of Private (Internal) Cloud and Public (External) Cloud enabled resources; the cost-effectiveness of external third-party Clouds combined with risk mitigation by maintaining an internal Private Cloud for critical processes (and application data); and integration of externally and internally provided capabilities, including integration of vendor-proprietary APIs with internal interfaces.

3.3 Cloud Services

In the Cloud business model in Section 3.2.2 we gave a brief overview of Cloud services. In this section we cover the services in more detail and describe how they are logically connected to each other.


3.3.1 Infrastructure as a Service (IaaS)

IaaS provides computing resources such as processing or storage that can be obtained as a service. IaaS providers typically offer virtualized infrastructure as a service, so that end-users need not buy raw hardware infrastructure. Raw hardware resources, such as compute, storage and network resources, constitute the fabric layer. Typically, through virtualization, hardware-level resources are abstracted, encapsulated and exposed to end-users through a standardized interface [59], as shown in Figure 3.9.

Figure 3.9: Correlation between Cloud architecture and Cloud services

IaaS provides resources such as server space, network equipment, memory, CPU cycles and storage space [68]. Figure 3.10 shows an example of IaaS. The infrastructure can be dynamically scaled up or down, based on the application's demand for resources.

Figure 3.10: Infrastructure as a Service

3.3.2 Platform as a Service (PaaS)

Platforms are an abstraction layer between the software applications (SaaS) and the virtualized infrastructure (IaaS). PaaS is targeted at software developers. Developers can write applications against the specifications of a particular platform without needing to go deeper into the underlying hardware infrastructure. Developers write and upload their application code to the platform, which in general scales up automatically as per application usage [96]. Figure 3.11 shows an illustration of PaaS.

Figure 3.11: Platform as a Service

PaaS may cover all phases of software development, or may be specialized in a specific area like content management. The PaaS layer is built on the standardized interface of the IaaS layer, which virtualizes access to the available resources of the Cloud. It also provides a development platform for the SaaS layer [59].

3.3.3 Software as a Service (SaaS)

SaaS, as shown in Figure 3.12, is the layer of Cloud Computing most visible to end-users, since the actual software applications are accessed by users here. SaaS is software that is owned, delivered and managed remotely by one or more providers and is offered on a pay-per-use basis. The end-user of a SaaS offering usually has neither knowledge nor control of the underlying infrastructure [59]. SaaS applications can be developed on an existing platform and run on the infrastructure of a third party.

Figure 3.12: Software as a Service

SaaS eliminates the issues related to the traditional model [95]: compatibility with hardware, other software and operating systems; licensing issues, for example unauthorized copies of the software used in the organization; and maintenance, support and patch revision processes.

3.4 Virtualization on Cloud

As stated in Section 3.2.1, virtualization provides more flexibility in configuring settings. Virtualization is central to the Cloud environment because of its role in providing access to services on the Cloud. IaaS benefits from two types of virtualization techniques, full virtualization and paravirtualization, on which we concentrate next.

Full virtualization:


In a fully virtualized environment, a complete installation of one machine is able to run on another [60]. The result of this type of virtualization is a system in which every application instance running on the server is contained within a virtual machine. Figure 3.13 shows a fully virtualized deployment.

Figure 3.13: A fully virtualized deployment where the operating platform running on the servers is displayed

Full virtualization is suitable for the following purposes [60]: sharing a computing environment among multiple users; isolating users from each other and from the main control (administrative) application; and emulating one hardware platform on another machine.

Paravirtualization:

Paravirtualization, shown in Figure 3.14, allows multiple operating systems to run on a single hardware device. It provides better and more effective utilization of system resources, such as processors and memory. In full virtualization, the entire system (for example BIOS, drivers etc.) is emulated, but in a paravirtualized deployment, management components operate with the OS and provide the necessary adjustments to communicate with the virtual machine [60]. In general, paravirtualization performs better than a fully virtualized environment, because in a fully virtualized deployment all elements must be emulated, which is not required in paravirtualization.

Figure 3.14: A paravirtualized deployment where many OSes can run simultaneously

Better scalability is one of the major benefits of paravirtualization. For example, if a fully virtualized solution requires 10% of the processor per guest just for virtualization, at most 5 guest systems can run on the host without affecting performance (i.e. 50% of the processor is spent on virtualization overhead alone). In the same scenario, paravirtualization requires only 2% of processor utilization per guest instance while still leaving 10% of the processor available to each guest OS, so 8 guests fit on the host. Table 3.2 [60] illustrates the comparison in detail.

Virtualization type | Guest instances | Virtualization overhead | System processing needs | Total
Full virtualization | 5 | 10% (50% total) | 10% (50% total) | 100%
Paravirtualization | 8 | 2% (16% total) | 10% (80% total) | 96%

Table 3.2: CPU utilization in full virtualization and paravirtualization
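The totals in Table 3.2 follow from simple arithmetic: each guest instance costs its virtualization overhead plus its own processing share. A short check (note that the 80% system-processing figure for paravirtualization is inferred here from the 96% total):

def total_cpu(guests, overhead_pct, processing_pct):
    # each guest instance costs its virtualization overhead plus its
    # own processing share, all as a percentage of the host CPU
    return guests * (overhead_pct + processing_pct)

print(total_cpu(5, 10, 10))   # full virtualization: 100 (% of host CPU)
print(total_cpu(8, 2, 10))    # paravirtualization:   96 (% of host CPU)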


The drawbacks of paravirtualization are reduced flexibility and security [60]. Flexibility is reduced because a particular OS or distribution may be unable to run in a paravirtualized environment; for example, a new Windows deployment may not be available as a guest OS for the solution. Security is a concern because the guest OS often has more control over the underlying hardware, owing to the nature of the paravirtualization technique.

Paravirtualization is suitable for the following deployments [60]: Disaster recovery: in case of catastrophic failures, guest instances can be moved to other hardware until recovery. Migration: moving to a new system is easier and faster in paravirtualization, because guest instances can be separated from the underlying hardware. Capacity management: as a result of easier migration, capacity management is simpler to implement; it is also easier to add more processing power or storage capacity in this type of virtualized environment.

3.5 Example of a Cloud Implementation

Here we illustrate a possible scenario of a Cloud-oriented infrastructure [62]. Consider a search engine running on a Grid Computing system. The search engine is managed by the software layer of the Cloud stack, on which it is able to run thousands of different programs (or processes) and services. The underlying Grid is divided into interconnected clusters distributed worldwide. This setup combines the advantage of the best possible geographic conditions for the servers with the benefits of a load-balancing, DNS-based system that sends each query to the cluster physically nearest to the user, so that Round Trip Time (RTT) is reduced. Every cluster is made up of a large number (possibly more than hundreds of thousands) of computing nodes. Such a configuration can store various copies of the web and can replicate services massively. Executing a service or serving a user's query request needs different processors, but all the required processors reside within specific clusters. The CPU clock is not considered an important factor here, as work can be divided among multiple processors or done on a single processor depending on the workload.
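A toy version of the DNS-based dispatch described above (the cluster names and RTT figures are invented for illustration): the resolver simply answers with the cluster whose measured RTT to the client is lowest:

rtt_ms = {"cluster-eu": 35, "cluster-us": 120, "cluster-asia": 210}

def nearest_cluster(rtt_table):
    # the load-balancing DNS answers with the lowest-RTT cluster
    return min(rtt_table, key=rtt_table.get)

print(nearest_cluster(rtt_ms))   # cluster-eu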

3.6 Conclusion

In this chapter we focused on the most recent trend in the distributed computing community: the Cloud. Many believe that the Cloud is going to reshape the IT industry like a revolution. We tried to understand what the Cloud actually is, its architectural overview, and what functionality a Cloud environment can provide to its users. In the next chapter we investigate how the Cloud actually differs from and relates to another distributed paradigm, the Grid.


Chapter 4
Grid and Cloud Computing Comparisons: Similarities & Differences


There is a myth [97] that Cloud Computing is just Grid Computing by a different name. In spite of their similarities, several factors make the two technologies distinct. The idea of the Cloud is not a completely new concept: based upon well-established Grid technology, Cloud Computing combines different distributed techniques with service-oriented architectures.

In the rest of this chapter we discuss the similarities and differences of these two architectures based on their implementations and other possible criteria. At the end of the chapter we present a side-by-side comparison between an existing Grid and an existing Cloud distribution, namely the EGEE Grid and the Amazon Cloud.

4.1 Major Focus

Cloud Computing may overlap with existing technologies, for example Grid Computing, Utility Computing and Service-Oriented Computing. But the trend is actually shifting from base infrastructure (like the storage or processing power provided by the Grid) to economy-based services with more virtualization and abstraction, as in the Cloud.

Figure 4.1: Motivation of Grid and Cloud

Figure 4.1 shows the relationship between the Cloud and other existing co-related architectures [98]. Supercomputers and Clusters host mostly traditional, non-service-oriented applications. The Grid actually overlaps the boundary, moving the trend towards the present concept of Service-Oriented Computing with the benefit of Web 2.0.

Figure 4.2 shows the comparison between Supercomputers, Clusters, Grids and Clouds with respect to performance, reliability and cost [99].

4.2 Points of Consideration

4.2.1 Business Model

The traditional business model consists of unlimited use of a single PC for a one-time payment. In the Cloud, by contrast, users pay per resource or service consumed; hence the concept of Utility Computing comes to light. The business model for the Grid, on the other hand, is generally project- or task-oriented and community-based, where there is a certain agreement [100] about the consumption of service units, such as CPU hours or storage amounts.

Figure 4.2: Comparison regarding performance, reliability and cost

4.2.2 Scalability issues

Both Grids and Clouds are scalable. Scalability is achieved by load balancing of different application instances, which may run on different operating systems and be managed by web services. Resources like processing power and network bandwidth can be allocated and de-allocated on demand, and resources like storage capacity scale up or down depending on the number of active users, the application instances requiring data, and the amount of data required at any instant [101]. The Grid generally provides persistent use of all available resources. The Cloud, on the other side, is very specific to the user's/consumer's requirements, and hence relieves the burden of over-provisioning resources [102]. So scalability of the Grid means scaling the computational resources, while scalability of the Cloud depends on end-users' demand for scaling application resources [103].


4.2.3 Multitasking and Availability

Grid and Cloud technology both have multitasking functionality [101], which allows a number of users to perform different tasks. Users can use single or multiple application instances at the same time. Sharing resources among a group of users reduces infrastructure costs for business organizations, and enables users to access resources even at peak demand.

Grid and Cloud providers assure users of the availability of resources and services. With the benefit of Service Level Agreements, users get guaranteed uptime availability [104, 101]; this also eliminates the problem of a single point of failure [104].

4.2.4 Resource Management

We consider resource management issues from the following points of view:

i. Compute Model: Most Grid systems use a batch-scheduled compute model [98]. A local resource manager, for example Condor or Oracle Grid Engine [105], manages the computational resources of a Grid site. Users submit batch jobs using protocols such as GRAM [106] to request resources for a certain amount of time. Most Grid systems do not support interactive applications, because of long queue times (as the resources are monitored by a queueing system) and costly scheduling decisions. Several approaches have been taken to overcome this problem using multi-level scheduling, so that applications with short-running tasks can run effectively on the Grid [107]. On the other hand, resources in the Cloud are shared by all users at the same time; thus latency-sensitive applications can run in the Cloud. Providing QoS for such applications may become a challenging research issue as the size of the applications and the number of users increase.

ii. Data Model: Data management plays an important role in Cloud systems due to the increase of data-intensive applications. Cloud applications mostly depend on distributed storage systems, which provide abstraction to users and hide complexity. Though the application interface is quite user-friendly, the actual application needs to know the location of data in the network. Several approaches, for example the Contour system [108], have been proposed to optimize computation by placement of data and to maximize throughput. Also, data transactions need to remain available to users even when Internet connectivity or the Cloud itself is down, or over slow network connections; here the improvement of hardware helps to transfer large amounts of data (e.g. data needed for multimedia applications) and process it locally. Data Grids [109] are designed specially for data-oriented applications and build on the concept of virtual data [110]. A virtual data system generally provides a group of components and services that help users organize and manipulate large-scale data and computation resources [111]. Virtual data also deals with different kinds of abstraction:

Location transparency: data can be requested regardless of its location; a separate module (for example a distributed metadata catalogue [112]) keeps track of the locations of the data along with its backup duplicates, and also ensures privacy and access control.

Materialization transparency: data can be computed remotely or transferred on request, depending mostly on how much data is available and how much it costs to re-compute it.

Representation transparency: data can be generated and used regardless of the actual physical representation and storage criteria; data is mapped into an internal abstract representation and can be computed based on that representation.

iii. Data Locality: In distributed systems like the Grid and the Cloud, data must be placed appropriately to minimize communication overhead, as moving data from a remote processing/storage unit increases overhead [113]. For example, a separate module might run on top of the file system: when data is loaded, it is partitioned into chunks and each chunk is replicated, so that data processing and data storage can occur together. Grid systems are generally based on shared file systems, where no explicit data locality is applied due to complexity. To provide data locality, schedulers need to be improved so that they preserve information about data locality at the time of scheduling computational tasks [114], as the sketch below illustrates.
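A minimal sketch of such a locality-aware scheduler (hypothetical data structures): it prefers an idle node that already stores a replica of the task's input chunk, and only falls back to moving the data:

chunk_locations = {"chunk-1": {"node-a", "node-b"},
                   "chunk-2": {"node-b", "node-c"}}

def schedule(task_chunk, idle_nodes):
    # prefer an idle node that already holds the input chunk (data
    # locality); otherwise fall back to any idle node and move the data
    local = chunk_locations.get(task_chunk, set()) & idle_nodes
    return next(iter(local or idle_nodes))

print(schedule("chunk-2", {"node-a", "node-c"}))   # node-c holds the chunk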

iv. Virtualization: Cloud Computing shares only the user interface; resource interfaces are hidden from users by abstraction [115]. The Cloud needs to run a large number of applications simultaneously, and the Cloud provider must make it possible for users to access any resource available in the Cloud. Virtualization enables different levels of abstraction, so that users can use a pool of resources as a service (say a data storage service) built on the underlying fabric (i.e. processing power, storage, network resources). All the lower-level architecture is totally hidden from end-users, and each application is encapsulated at a certain level of abstraction to provide enhanced maintenance, security and isolation [98]. Virtualization is not as prominent in the Grid as in the Cloud: in the Grid, user and resource interfaces must both be shared, which allows users to connect with providers' resources [115]. There have been several attempts, for example the Virtual Workspace Service [116], presently known as Nimbus [117, 118], to provide abstraction in the Grid. Moreover, as hardware support for virtualization has improved, the performance penalty compared with traditional operating systems has been reduced.

v. Resource Monitoring: The different levels of abstraction introduced by virtualization make it difficult to take control of resource monitoring in the Cloud. Some Grid deployments (for example TeraGrid [119]) impose restrictions on the types of modules, sensors or services that a user can launch. But in the Cloud, monitoring is not so simple, mainly because: a) the Grid has a different trust model, in which users can access resources of various Grids using their credentials, and b) Grid resources are not highly abstracted and virtualized like the Cloud's [115, 98].

In the Cloud, users are offered different services, where the lower fabric is totally abstracted from the user. A trust model might not exist, because everything is held inside the Cloud [cite security]. Users do not have access to deploy their own monitoring infrastructure, so a user might not get sufficiently detailed status of a resource. This is problematic for developers as well, because due to the different levels of abstraction and encapsulation, tracing the lower levels of the hardware and software stack may not always be possible. Besides, with the help of virtualization we can add another extra layer [98], which is responsible for failure management, monitoring of resources and providing QoS independent of application requirements. Also, if the Cloud becomes more autonomic, explicit monitoring might be less important in the future.


4.2.5 Application Model

The processing power and resources of the distributed computers in a Grid network are concentrated on solving a specific problem [book cloud security]. The types of application vary from High Performance Computing (HPC) to High Throughput Computing (HTC). HPC applications are designed to execute tightly coupled parallel jobs, which reside on low-latency interconnected machines and generally do not execute on WAN Grids. They generally use a Message Passing Interface (MPI) [120] for inter-process communication. The Grid can also run loosely coupled applications (i.e. HTC applications). These applications are collections of several tasks, which may be dependent or independent and can be scheduled individually. Individual scheduling helps pick different computing resources across various administrative boundaries, which helps to solve large-scale applications.

The Cloud can also host a similar set of applications. But in contrast to the Grid, Clouds have difficulties with HPC applications, where a fast, low-latency interconnect is necessary for scaling to many processors [98]. The application domain of the Cloud is still unclear; applications are assumed to be generally loosely coupled, transaction-oriented and interactive [98], whereas on the Grid they are batch-scheduled.

4.2.6 Other issues

It is often said that Grid computing is open-source while Cloud Computing is not [Marc-Elian Begin. Comparative Study: Grids and Clouds, Evolution or Revolution. May 2005. CERN], which creates interoperability problems between today's Clouds [98]. But at present several open-source cloud computing platforms exist (for example Eucalyptus [121] and OpenStack [121]), which may solve these problems.


The Grid has multiple research user communities, in which users can access resources from different administrative domains and are also grouped into Virtual Organizations (VOs). The Cloud generally consists of a common group of system administrators [115].

Sometimes storing small files (like a file of size 1 byte) in the Grid might not be economically beneficial [101]. But as the Cloud provides all the benefits of Utility Computing, file size does not affect cost-effectiveness.

4.3 Case Study

In the following section we compare Grid and Cloud architectures based on two existing distributions. For comparison we refer to two well-known and documented implementations: the Enabling Grids for E-sciencE (EGEE) Grid and the Amazon Cloud [122].

Grid: We consider the EGEE Grid infrastructure [123] as our Grid computing platform. The EGEE Grid is based on an open-source distribution, gLite [124].

Cloud: We consider Amazon's commercial Cloud offering, called Amazon Web Services (AWS). AWS provides two types of service: computing (Elastic Compute Cloud: EC2) and storage (Simple Storage Service: S3) [125].

4.3.1 Comparative results

Here we present the overall findings in tabular form in Table 4.1 [122].

Features | Grid (EGEE Grid) | Cloud (Amazon Web Services)
SLA | Local (between the EGEE project and the resource providers) | Global (between Amazon and users)
Data file transfer service | Yes | Yes
Data metadata | No | Yes
Workload management | Yes | No
Metering | Yes | Yes
BitTorrent storage interface | No | Yes
Based on standards | Partially | No
Resource pool | Federated institutional | Commercial access
Queue prioritisation | Yes | No
User platform support | RedHat compatible | All
Geographic distribution | International (50 countries) | Mainly USA, partially Europe
Scalable | Yes | Yes
File size limit | Not known | 5 GB
Compute scale | 55000 CPUs | Unknown but large
Storage capacity | ~20 PB | Unknown but large
Cost of data transfer | Paid by data centres | Pay-as-you-go
Secured service | Yes | Yes

Table 4.1: Comparative analysis

4.4 Concluding remarks

In this chapter we discussed the similarities and differences of two distributed paradigms, the Grid and the Cloud, from different points of view. Both systems perform parallel computations by sharing resources among distributed units, but they differ in their way of implementation in factors like scalability, virtualization etc. The Cloud is, however, more service-oriented towards end-users. In our comparison we skipped the architectural comparison between Grid and Cloud, as we discussed it in depth in Chapter 2 and Chapter 3.


Chapter 5
Conclusion and Future Works


In this thesis we investigated the evolution of computing towards the Cloud. In our research we find that the Cloud is actually built on the well-established Grid Computing technology, but is more service-oriented by nature. We tried to understand the basics of distributed processing, beginning with Computer Clusters and then the Grid environment. We went into the details of the architectural overviews of both Cluster and Grid Computing and correlated them with the relatively newer Cloud Computing technology. In this thesis we limited our scope to the basic architectures, operational flow control and resource management issues, and skipped security-related concerns. Concluding our review, we identify a couple of unresolved areas and implementation issues [93, 126, 127] in the Cloud which need more research:

Technical concerns: a) security, trust and privacy, b) lack of standardization, c) insufficient virtualization, d) data movement and management, e) programming models to provide the required elasticity, f) systems and services development methods.

Non-technical concerns: a) business or cost models for Cloud Computing, b) legal issues concerning data processing.

Bibliography
[1] http://en.wikipedia.org/wiki/History_of_computing.
[2] Michelle A. Hoyle. Computers from Past to Present. http://lecture.eingang.org/index.html.
[3] Stephen White. A Brief History of Computing - The various eras that the history of the computer has passed through. http://trillian.randomstuff.org.uk/stephen/history/.
[4] Jim Waldo, Geoff Wyant, Ann Wollrath, and Sam Kendall. A Note on Distributed Computing. http://labs.oracle.com/techrep/1994/smli_tr-94-29.pdf.
[5] www.cs.iit.edu/ren/cs447/lectures/dsIntro-2.ppt.
[6] Wanlei Zhou and Weijia Jia. Distributed Network Systems: From Concepts to Implementations. Springer, 2005.
[7] http://www.ece.rutgers.edu/parashar/Classes/ece572slides/dist-compintro.pdf.
[8] Mark Baker, Amy Apon, Rajkumar Buyya, and Hai Jin. Cluster Computing and Applications. 18 September 2000.
[9] Mark Baker. Cluster Computing White Paper. University of Portsmouth, UK.
[10] Message Passing Interface Standard. http://www-unix.mcs.anl.gov/mpi/index.html.
[11] Mark Baker and Rajkumar Buyya. Cluster Computing at a Glance.
[12] G. F. Pfister. In Search of Clusters, 2nd Edition. Prentice Hall PTR, 1998.
[13] M. G. Jacunski et al. Low-Latency Message Passing for Reflective Memory Networks. In Workshop on Communication, Architecture and Applications for Network-Based Parallel Computing (CANPC), 1999.
[14] InfiniBand Architecture. http://www.infinibandta.org.
[15] Mark Baker. Cluster Computing White Paper. University of Portsmouth, UK.
[16] L. L. Peterson and B. S. Davie. Computer Networks: A Systems Approach, Second Edition. Morgan Kaufmann Publishers, 2000.
[17] T. von Eicken, D. Culler, S. Goldstein, and K. Schauser. Active Messages: a Mechanism for Integrated Communication and Computation. In Proceedings of the International Symposium on Computer Architecture, 1992.
[18] T. E. Anderson, D. E. Culler, and D. A. Patterson. A Case for NOW. IEEE Micro, February 1995. http://now.cs.berkeley.edu/Case/now.html.
[19] S. Pakin, M. Lauria, and A. Chien. High Performance Messaging on Workstations: Illinois Fast Messages (FM) for Myrinet. Supercomputing 1995.
[20] M. A. Blumrich, C. Dubnicki, E. W. Felten, K. Li, and M. R. Mesarina. Virtual Memory Mapped Network Interfaces. IEEE Micro, February 1995.
[21] A. Basu, M. Welsh, and T. von Eicken. Incorporating Memory Management into User-Level Network Interfaces. Presented at Hot Interconnects V, August 1997, Stanford University; also Cornell University Technical Report TR97-1620. http://www2.cs.cornell.edu/U-Net/papers.html.
[22] L. Prylli and B. Tourancheau. BIP: A New Protocol Designed for High Performance Networking on Myrinet. In the PC-NOW Workshop, IPPS/SPDP 1998, Orlando, USA, 1998.
[23] Virtual Interface Architecture. http://www.viarch.org/.
[24] P. Buonadonna, A. Geweke, and D. Culler. An Implementation and Analysis of the Virtual Interface Architecture. Proceedings of SC98, Orlando, Florida, November 1998.
[25] M-VIA: A High Performance Modular VIA for Linux. http://www.nersc.gov/research/FTG/via/.
[26] InfiniBand Architecture. http://developer.intel.com/design/servers/future_server_io/index.htm.
[27] Internet Protocol Version 6 Information Page. http://www.ipv6.org/.
[28] DS-Online Grid Computing. http://computer.org/channels/ds/gc/.
[29] R. Buyya. High Performance Cluster Computing: Architectures and Systems, Vol. 1. Prentice Hall PTR, NJ, USA, 1999.
[30] G. Popek and B. J. Walker (Eds.). The Locus Distributed System Architecture. MIT Press, 1996.
[31] K. Hwang, H. Jin, E. Chow, C.-L. Wang, and Z. Xu. Designing SSI Clusters with Hierarchical Checkpointing and Single I/O Space. IEEE Concurrency, vol. 7(1), pp. 60-69, January-March 1999.
[32] B. Walker and D. Steel. Implementing a Full Single System Image UnixWare Cluster: Middleware vs. Underware. In Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA99), Las Vegas, USA, 1999.
[33] International Organization for Standardization. http://www.iso.ch/.
[34] S. Mullender (Ed.). Distributed Systems, 2nd Edition. Addison-Wesley, 1993. ISBN 0-201-62427-3.
[35] Object Management Group. http://www.omg.org/.
[36] C. S. R. Prabhu. Grid and Cluster Computing. 2008. Chapter 2.
[37] Daniel Minoli. A Networking Approach to Grid Computing. Wiley-Interscience, 1st edition, November 2004.
[38] Ian Foster and Carl Kesselman. Computational Grids, 1998.
[39] L.-J. Zhang, J.-Y. Chung, and Q. Zhou. Developing Grid Computing Applications, Part 1: Introduction of a Grid Architecture and Toolkit for Building Grid Solutions. October 2002.
[40] Open Grid Services Infrastructure, Version 1.0, Proposed Recommendation. http://www.ggf.org/documents/GFD.15.pdf.
[41] The Open Grid Services Architecture. http://www.ogf.org/documents/GFD.80.pdf.
[42] Katarina Stanoevska-Slabeva and Thomas Wozniak. Grid Basics, chapter 3.3. Grid and Cloud Computing: A Business Perspective on Technology, Springer, 2010.
[43] M. Otey. Grading Grid Computing. January 2004. Published by Windows & .NET Magazine Network, a Division of Penton Media Inc.
[44] Joshy Joseph and Craig Fellenstein. Grid Computing. IBM Press, January 2004.
[45] Pawel Plaszczak and Richard Wellner, Jr. Grid Computing: The Savvy Manager's Guide. Morgan Kaufmann, 2006. Chapter 1.
[46] Ian Foster, Carl Kesselman, and Steven Tuecke. The Anatomy of the Grid: Enabling Scalable Virtual Organizations. Int. J. High Perform. Comput. Appl., 15(3):200-222, August 2001.
[47] Ian Foster and Carl Kesselman. Globus: A Metacomputing Infrastructure Toolkit. International Journal of Supercomputer Applications, 11:115-128, 1996.
[48] David De Roure, Mark A. Baker, Nicholas R. Jennings, and Nigel R. Shadbolt. The Evolution of the Grid. In Grid Computing: Making the Global Infrastructure a Reality, pages 65-100. John Wiley & Sons.
[49] Randy Butler, Doug Engert, Ian Foster, Carl Kesselman, Steven Tuecke, Von Welch, and John Volmer. Design and Deployment of a National-Scale Authentication Infrastructure, 1999.
[50] Ian Foster, Carl Kesselman, Gene Tsudik, and Steven Tuecke. A Security Architecture for Computational Grids, 1998.
[51] T. Dierks and C. Allen. The TLS Protocol Version 1.0, RFC 2246, IETF. http://www.ietf.org/rfc/rfc2246.txt, 1999.
[52] Steven Fitzgerald. Grid Information Services for Distributed Resource Sharing. In Proceedings of the 10th IEEE International Symposium on High Performance Distributed Computing, HPDC '01, page 181, Washington, DC, USA, 2001. IEEE Computer Society.
[53] Karl Czajkowski, Ian Foster, Nick Karonis, Carl Kesselman, Stuart Martin, Warren Smith, and Steven Tuecke. A Resource Management Architecture for Metacomputing Systems, 1997.
[54] Bill Allcock, Joe Bester, John Bresnahan, Ann L. Chervenak, Ian Foster, Carl Kesselman, Sam Meder, Veronika Nefedova, Darcy Quesnel, and Steven Tuecke. Efficient Data Transport and Replica Management for High-Performance Data-Intensive Computing. In Mass Storage Conference, 2001.
[55] J. Novotny, S. Tuecke, and V. Welch. An Online Credential Repository for the Grid: MyProxy, 2001.
[56] Karl Czajkowski, Ian Foster, and Carl Kesselman. Resource Co-Allocation in Computational Grids. In Proceedings of the Eighth IEEE International Symposium on High Performance Distributed Computing (HPDC-8), pages 219-228. IEEE Computer Society, 1999.
[57] Globus GRAM Architecture. http://www-unix.globus.org/api/c-globus-2.4/globus_gram_documentation/html/.
[58] Matthew Smith, Thomas Friese, and Bernd Freisleben. Model-Driven Development of Service-Oriented Grid Applications. In Proc. of the International Conference on Internet and Web Applications & Services, pages 139-146. IEEE Press, 2006.
[59] Katarina Stanoevska-Slabeva and Thomas Wozniak. Cloud Basics: An Introduction to Cloud Computing, chapter 4. Grid and Cloud Computing: A Business Perspective on Technology, Springer, 2010.
[60] Anthony T. Velte, Toby J. Velte, and Robert Elsenpeter. Cloud Computing: A Practical Approach, chapter 1. McGraw-Hill, 2010.
[61] Jinesh Varia. Cloud Architectures. Technical report, Amazon Web Services, June 2008.
[62] F. M. Aymerich, G. Fenu, and S. Surcis. An Approach to a Cloud Computing Network. In Applications of Digital Information and Web Technologies, 2008 (ICADIWT 2008), pages 113-118, 2008.
[63] L. Youseff, M. Butrico, and D. Da Silva. Toward a Unified Ontology of Cloud Computing. In Grid Computing Environments Workshop, 2008 (GCE '08), pages 1-10, November 2008.
[64] Hadoop. http://hadoop.apache.org/.
[65] Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. In OSDI '04: Proceedings of the 6th Conference on Symposium on Operating Systems Design and Implementation. USENIX Association, 2004.
[66] Christopher Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar, and Andrew Tomkins. Pig Latin: A Not-So-Foreign Language for Data Processing.
[67] Borko Furht. Cloud Computing Fundamentals, chapter 1. Handbook of Cloud Computing, Springer, 2010.
[68] Anthony T. Velte, Toby J. Velte, and Robert Elsenpeter. Cloud Computing: A Practical Approach, chapter 7. McGraw-Hill, 2010.

86

[69] Sanjay Ghemawat, Howard Gobio, and Shun-Tak Leung. The Google le system. In Proceedings of the nineteenth ACM symposium on Operating systems principles, SOSP 03, pages 2943, New York, NY, USA, 2003. ACM. [70] Karin Petersen, Mike Spreitzer, Douglas Terry, and Marvin Theimer. Bayou: Replicated database services for world-wide applications. In In Proceedings 7th SIGOPS European Workshop, pages 275280. ACM, 1996. [71] Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. Dynamo: amazons highly available key-value store. In In Proceedings SOSP, pages 205220, 2007. [72] Consistency model: http://en.wikipedia.org/wiki/Consistency_model. [73] Chao-Tung Yang, Wen-Chi Tsai, Tsui-Ting Chen, and Ching-Hsien Hsu. A One-Way File Replica Consistency Model in Data Grids. In Asia-Pacic Service Computing Conference, The 2nd IEEE, pages 364 373, 2007. [74] J. Hofstader. Communications as a Service. http://msdn.microsoft.com/ en-us/library/bb896003.aspx. [75] William Johnston, Joe Metzger, Michael Collins, Eli Dart, Jim Gagliardi, Chin Guok, and Kevin Oberman. Network Communication as a Service-Oriented Capability. In Published in: High Performance Computing and Grids in Action, Volume 16 Advances in Parallel Computing, Editor: L. Grandinetti, March 2008, IOS Press, ISBN. [76] Andreas Hanemann, Je W. Boote, Ericl. Boyd, Jrme Dur, Loukik Kudarieo moti, Roman Lapacz, D. Martin Swany, Szymon Trocha, and Jason Zurawski.

87

Perfsonar: A service oriented architecture for multi-domain network monitoring. In In Proceedings of the Third International Conference on Service Oriented Computing (ICSOC 2005). ACM Sigsoft and Sigweb, 2005. [77] Microsoft Connected Service Framework. http://www.microsoft.com/

serviceproviders/solutions/connectedservicesframework.mspx. [78] Ian Foster and Carl Kesselman. Globus: A metacomputing infrastructure

toolkit. International Journal of Supercomputer Applications, 11:115128, 1996. [79] Douglas Thain, Todd Tannenbaum, and Miron Livny. Distributed computing in practice: The condor experience. Concurrency and Computation: Practice and Experience, 17:24, 2005. [80] Equus Computer Systems : Hardware as a Service (HaaS). equuscs.com/hardware-as-a-service. [81] Das U-Boot: The Universal Boot Loader. http://www.denx.de/wiki/U-Boot/ WebHome. [82] Qi Zhang, Lu Cheng, and Raouf Boutaba. Cloud computing: state-of-the-art and research challenges. J Internet Serv Appl (2010) 1: 718, 2010. Springer. [83] Torry Harris. CLOUD COMPUTING An Overview. [84] Amazon Elastic Cloud: http://aws.amazon.com/ec2/. [85] GoGrid Cloud Hosting: http://www.gogrid.com/cloud-hosting/. [86] 3Tera Cloud Computing: http://www.3tera.com/Cloud-computing/. [87] Windows Azure MSDN library: http://msdn.microsoft.com/en-us/ http://www.

library/windowsazure/dd163896.aspx. 88

[88] Google App Engine Overview: http://www.google.com/enterprise/cloud/ appengine/. [89] SalesForce Cloud: http://www.salesforce.com/platform/. [90] Google Apps: http://www.google.com/apps/intl/en/business/details. html. [91] Microsoft SaaS - Software as a Service - Service Providers, http://www. microsoft.com/err/serviceproviders/saas/. [92] Dedicated Server, Managed Hosting, Web Hosting by Rackspace Hosting, http: //www.rackspace.com. [93] Michael Armbrust, Armando Fox, Rean Grith, Anthony D. Joseph, Randy H. Katz, Andrew Konwinski, Gunho Lee, David A. Patterson, Ariel Rabkin, and Matei Zaharia. Above the Clouds: A Berkeley View of Cloud Computing. Technical report, 2009. [94] Eric A. Marks and Bob Lozano. Executives Guide to Cloud Computing, chapter 6 : Cloud Architecture, Modeling, and Design. John Wiley & Sons, Inc., 2010. [95] Ronald L. Krutz and Russell Dean Vines. Cloud Security - A Comprehensive Guide to Secure Cloud Computing, chapter 2 : Cloud Computing Architecture. Wiley Publishing, Inc. [96] Dene Cloud Computing. RightScale Blog. http://blog.rightscale.com/ 2008/05/26/define-cloud-computing/, Accessed: 9 Jun 2009, 26 May 2008. [97] Cloud Computing Myths Dispelled. resources/cloud-myths-dispelled#q2. http://eucalyptus.cs.ucsb.edu/

89

[98] Ian Foster, Yong Zhao, Ioan Raicu, and Shiyong Lu. Cloud computing and grid computing 360-degree compared. [99] Derrick Kondo. Volunteer Computing and Cloud Computing: Opportunities for Synergy. event.twgrid.org/isgc2009/asiaathome/wiki/images/b/b7/ Derrick_talk.pdf. [100] Ivona Brandic and Schahram Dustdar. Grid vs Cloud - A Technology Comparison. it - Information Technology, 53(4):173179, 2011. [101] Judith M. Myerson. Cloud computing versus grid computing Service types, similarities and dierences, and things to consider. IBM, developerWorks, March 2009. [102] Akshaya Bhatia. puting. Comparison of Cloud computing with Grid com-

http://it.toolbox.com/wiki/index.php/Comparison_of_Cloud_

computing_with_Grid_computing. [103] What Grid is the dierence between Cloud computing and

Computing?

http://www.ibeehosting.com/blog/

what-is-the-difference-between-cloud-computing-and-grid-computing. html, January, 2010. [104] Oracle Grid Computing, June, 2009. An Oracle White Paper. [105] http://en.wikipedia.org/wiki/Oracle_Grid_Engine. [106] Karl Czajkowski, Ian Foster, Nick Karonis, Carl Kesselman, Stuart Martin, Warren Smith, and Steven Tuecke. A Resource Management Architecture for Metacomputing Systems. Job Scheduling Strategies for Parallel Processing, 1998. 90

[107] I. Raicu, Y. Zhao, C. Dumitrescu, I. Foster, and M. Wilde. Falkon: a Fast and Light-weight tasK executiON framework. IEEE/ACM SuperComputing, 2007. [108] Birjodh Tiwana, Hitesh Ballani, Mahesh Balakrishnan, Z. Morley Mao, and Marcos K. Aguilera. Location, Location, Location! Modeling Data Proximity in the Cloud. [109] Ann Chervenak, Ian Foster, Carl Kesselman, Charles Salisbury, and Steven Tuecke. The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientic Datasets. JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 23:187200, 1999. [110] Ian Foster, Jens Vckler, Michael Wilde, and Yong Zhao. Chimera: A Virtual o Data System For Representing, Querying, and Automating Data Derivation. In In Proceedings of the 14th Conference on Scientic and Statistical Database Management, pages 3746, 2002. [111] Yong Zhao, Michael Wilde, Ian Foster, Jens Voeckler, James Dobson, Eric Glibert, Thomas Jordan, and Elizabeth Quigg. Virtual data grid middleware services for data-intensive science, concurrency and computation: Practice and experience. CCPE, 18:595608, 2006. [112] Nuno Santos Nuno. Distributed metadata with the amga metadata catalog. In In Workshop on NextGeneration Distributed Data Management - HPDC-15, 2006. [113] A. Szalay, A. Bunn, J. Gray, I. Foster, and I. Raicu. The Importance of Data Locality in Distributed Computing Applications. NSF Workow Workshop, 2006.

91

[114] I. Raicu, Y. Zhao, I. Foster, and A. Szalay. Accelerating Largescale Data Exploration through Data Diusion. International Workshop on Data-Aware Distributed Computing, 2008. [115] David Munoz Sanchez. Comparison between security solutions in Cloud and Grid Computing. 2010. Aalto University, Seminar on Network Security. [116] K. Keahey, I. Foster, and T. Freeman X. Zhang. Virtual Workspaces: Achieving Quality of Service and Quality. In of Life in the Grid. Scientic Programming Journal, pages 265276, 2005. [117] http://www.nimbusproject.org/about/. [118] Katarzyna Keahey, Ian Foster, Timothy Freeman, and Xuehai Zhang. Virtual Workspaces in the Grid. In In Proc. of Euro-Par Conf, pages 421431, 2005. [119] Maytal Dahan, Eric Roberts, and Jay Boisseau. TeraGrid User Portal v1.0: Architecture, Design, and Technologies. [120] https://computing.llnl.gov/tutorials/mpi/. [121] http://en.wikipedia.org/wiki/Eucalyptus_(computing). [122] Marc-Elian Bgin. An EGEE comparative study: Grids and clouds - evolution e or revolution. Technical report, CERN - Engineering and Equipment Data Management Service, 2008. [123] John Walsh, Brian Coghlan, and Stephen Childs. An Introduction to Grid Computing Using EGEE, 2009. [124] http://en.wikipedia.org/wiki/GLite. [125] http://aws.amazon.com/documentation/. 92

[126] Keith Jeery.

The Future of CLOUD Computing - Report from EC

CLOUD Computing Expert Group. http://ercim-news.ercim.eu/en80/es/ the-future-of-cloud-computing. [127] Tharam Dillon, Chen Wu, and Elizabeth Chang. Cloud computing: Issues and challenges. Advanced Information Networking and Applications, International Conference on, 0:2733, 2010.

93
