
SUBJECT INDEX

Sr. No.   Title
1         Introduction to cloud computing
2         Study about cloud reference model
3         Study about virtual private network
4         Case study on Google App Engine
5         Case study on Hadoop
6         Case study on Amazon Web Services
7         Case study on Microsoft Azure
8         Case study on Aneka

Experiment No. 1

Aim: To study in detail about cloud computing.


Theory: The term cloud has been used historically as a metaphor for the Internet.
This usage was originally derived from its common depiction in network diagrams
as an outline of a cloud, used to represent the transport of data across carrier
backbones (which owned the cloud) to an endpoint location on the other side of the
cloud. This concept dates back as early as 1961, when Professor John McCarthy
suggested that computer time-sharing technology might lead to a future where
computing power and even specific applications might be sold through a utility-
type business model.

This idea became very popular in the late 1960s, but by the mid-1970s the idea
faded away when it became clear that the IT-related technologies of the day were
unable to sustain such a futuristic computing model. However, since the turn of the
millennium, the concept has been revitalized. It was during this time of
revitalization that the term cloud computing began to emerge in technology circles.
Cloud computing is a model for enabling convenient, on-demand network access
to a shared pool of configurable computing resources (e.g., networks, servers,
storage, applications, and services) that can be rapidly provisioned and released
with minimal management effort or service provider interaction.

A Cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resource(s) based on service-level agreements established through negotiation between the service provider and consumers.

When you store your photos online instead of on your home computer, or use webmail or a social networking site, you are using a cloud computing service. If you are in an organization, and you want to use, for example, an online invoicing service instead of updating the in-house one you have been using for many years, that online invoicing service is a cloud computing service. Cloud computing is the delivery of computing services over the Internet. Cloud services allow individuals and businesses to use software and hardware that are managed by third parties at remote locations. Examples of cloud services include online file storage, social networking sites, webmail, and online business applications. The cloud computing model allows access to information and computer resources from anywhere. Cloud computing provides a shared pool of resources, including data storage space, networks, computer processing power, and specialized corporate and user applications.
ESSENTIAL CHARACTERISTICS:

On-demand self-service: A consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed automatically, without requiring human interaction with the service provider (a brief code sketch illustrating this follows the list).

Broad network access: Capabilities are available over the network and accessed through standard mechanisms that promote use by heterogeneous thin or thick client platforms (e.g., mobile phones, laptops, and PDAs) as well as other traditional or cloud-based software services.

Resource pooling: The provider's computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to consumer demand.

Rapid elasticity: Capabilities can be rapidly and elastically provisioned, in some cases automatically, to quickly scale out, and rapidly released to quickly scale in. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be purchased in any quantity at any time.

Measured service: Cloud systems automatically control and optimize resource usage by leveraging a metering capability at some level of abstraction appropriate to the type of service. Resource usage can be monitored, controlled, and reported, providing transparency for both the provider and consumer of the service.
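To make on-demand self-service and rapid elasticity concrete, the sketch below provisions a single virtual server through a public cloud API and then releases it. It is illustrative only, assuming the boto3 SDK and an AWS account; the AMI ID, instance type, and region are placeholders.

import boto3

# On-demand self-service: provision a server programmatically, with no
# human interaction with the provider.
ec2 = boto3.client('ec2', region_name='us-east-1')

response = ec2.run_instances(
    ImageId='ami-0123456789abcdef0',   # placeholder AMI ID
    InstanceType='t3.micro',           # placeholder instance type
    MinCount=1,
    MaxCount=1,
)
instance_id = response['Instances'][0]['InstanceId']
print('Provisioned', instance_id)

# Rapid elasticity: release the resource as soon as it is no longer needed,
# and pay only for the metered usage in between (measured service).
ec2.terminate_instances(InstanceIds=[instance_id])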

Conclusion: Successfully studied cloud computing in detail.


Experiment No. 2

Aim: To Study about cloud reference model.

Theory: The cloud computing reference model is a model showing the dependencies between all your systems. In the context of cloud computing, this covers your subscription details, along with all of the virtual components and other details that you can use to build, or rebuild, your cloud infrastructure.

Infrastructure as a Service (IaaS):

1. The capability provided to the consumer is to provision processing, storage, networks, and other fundamental computing resources.

2. The consumer is able to deploy and run arbitrary software, which can include operating systems and applications.

3. The consumer does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, and deployed applications, and possibly limited control of select networking components (e.g., host firewalls).

Platform as a Service (PaaS):

1. The capability provided to the consumer is to deploy onto the cloud infrastructure consumer-created or acquired applications created using programming languages and tools supported by the provider.

2. The consumer does not manage or control the underlying cloud infrastructure, including network, servers, operating systems, or storage, but has control over the deployed applications and possibly the application hosting environment configurations.

Software as a Service (SaaS):

1. The capability provided to the consumer is to use the provider's applications running on a cloud infrastructure.

2. The applications are accessible from various client devices through a thin client interface such as a web browser (e.g., web-based email).

3. The consumer does not manage or control the underlying cloud infrastructure, including network, servers, operating systems, storage, or even individual application capabilities, with the possible exception of limited user-specific application configuration settings.
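As an illustration of the SaaS model, the short sketch below calls a hypothetical online invoicing service over HTTPS; everything except the thin HTTP client is owned and operated by the provider. The URL, token, and request fields are placeholders, and the requests library is assumed.

import requests

# Hypothetical SaaS endpoint: the consumer only sends HTTP requests from a
# thin client; servers, storage and the application are run by the provider.
API_URL = "https://invoicing.example.com/api/v1/invoices"   # placeholder URL
API_TOKEN = "replace-with-your-token"                        # placeholder credential

resp = requests.post(
    API_URL,
    headers={"Authorization": "Bearer " + API_TOKEN},
    json={"customer": "ACME Ltd", "amount": 120.50},         # placeholder payload
    timeout=10,
)
resp.raise_for_status()
print(resp.json())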

Cloud Deployment Models:

Public

Private

Community Cloud

Hybrid Cloud

Public Cloud: The cloud infrastructure is made available to the general public or a large industry group and is owned by an organization selling cloud services.

Private Cloud: The cloud infrastructure is operated solely for a single organization. It may be managed by the organization or a third party, and may exist on-premises or off-premises.

Community Cloud: The cloud infrastructure is shared by several organizations and supports a specific community that has shared concerns (e.g., mission, security requirements, policy, or compliance considerations). It may be managed by the organizations or a third party and may exist on-premises or off-premises.

Hybrid Cloud: The cloud infrastructure is a composition of two or more clouds (private, community, or public) that remain unique entities but are bound together by standardized or proprietary technology that enables data and application portability (e.g., cloud bursting for load balancing between clouds).

Conclusion: Successfully studied the cloud reference model.


Experiment No. 3

Aim: Study about virtual private network.


Theory: One of the most common types of VPNs is a virtual private dial-up
network (VPDN). A VPDN is a user-to-LAN connection, where remote users need
to connect to the company LAN. Here the company will have a service provider
set-up a NAS (network access server) and provide the remote users with the
software needed to reach the NAS from their desktop computer or laptop. For a
VPDN, the secure and encrypted connection between the company's network and
remote users is provided by the third-party service provider.

Another type of VPN is commonly called a site-to-site VPN. Here the company would invest in dedicated hardware to connect multiple sites to their LAN through a public network, usually the Internet. Site-to-site VPNs are either intranet-based or extranet-based.

Intranet
An intranet is a network based on TCP/IP protocols belonging to an organization, usually a corporation, accessible only by the organization's members, employees, or others with authorization. Secure intranets are now the fastest-growing segment of the Internet because they are much less expensive to build and manage than private networks based on proprietary protocols.
Extranet
An extranet refers to an intranet that is partially accessible to authorized outsiders.
Whereas an intranet resides behind a firewall and is accessible only to people who
are members of the same company or organization, an extranet provides various
levels of accessibility to outsiders. You can access an extranet only if you have a
valid username and password, and your identity determines which parts of the
extranet you can view. Extranets are becoming a popular means for business
partners to exchange information.
Other options for using a VPN include such things as using dedicated private
leased lines. Due to the high cost of dedicated lines, however, VPNs have become
an attractive cost-effective solution.
Securing a VPN
If you're using a public line to connect to a private network, you might wonder what makes a virtual private network private. The answer is the manner in which the VPN is designed. A VPN is designed to provide a secure, encrypted tunnel in which to transmit the data between the remote user and the company network. The information transmitted between the two locations via the encrypted tunnel cannot be read by anyone else.
VPN security contains several elements to secure both the company's private network and the outside network, usually the Internet, through which the remote user connects. The first step to security is usually a firewall, sited between the client (the remote user's workstation) and the host server, which is the connection point to the private network. The remote user establishes an authenticated connection with the firewall.
VPN Encryption
Encryption is also an important component of a secure VPN. Encryption works by having all data sent from one computer encrypted in such a way that only the computer it is being sent to can decrypt the data. One commonly used type of encryption is public-key encryption, a system that uses two keys: a public key known to everyone and a private or secret key known only to the recipient of the message. The other commonly used system is symmetric-key encryption, in which the sender and receiver of a message share a single, common key that is used to encrypt and decrypt the message.
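A minimal sketch of the symmetric-key idea, assuming Python's cryptography package: both endpoints hold the same secret key, so either side can encrypt and the other can decrypt. Real VPNs negotiate such keys automatically (for example via ISAKMP/IKE) rather than generating and sharing them by hand.

from cryptography.fernet import Fernet

# Symmetric-key encryption: sender and receiver share one secret key.
shared_key = Fernet.generate_key()
cipher = Fernet(shared_key)

# Encrypt a payload before it crosses the public network.
token = cipher.encrypt(b"payroll data crossing the VPN tunnel")

# Only a holder of shared_key can decrypt the token.
print(cipher.decrypt(token))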
VPN Tunnelling
With a VPN you'll need to establish a network connection that is based on the idea of tunnelling. There are two main types of tunnelling used in virtual private networks. In voluntary tunnelling, the client first makes a connection to the service provider, and then the VPN client creates the tunnel to the VPN server once the connection has been made. In compulsory tunnelling, the service provider manages the VPN connection and brokers the connection between the client and a VPN server.
Network Protocols for VPN Tunnels
There are three main network protocols for use with VPN tunnels, which are
generally incompatible with each other. They include the following:
IPSec
A set of protocols developed by the IETF to support secure exchange of packets at
the IP layer. IPsec has been deployed widely to implement VPNs. IPsec supports
two encryption modes: Transport and Tunnel. Transport mode encrypts only the
data portion (payload) of each packet, but leaves the header untouched. The more
secure Tunnel mode encrypts both the header and the payload. On the receiving
side, an IPSec-compliant device decrypts each packet. For IPsec to work, the
sending and receiving devices must share a public key. This is accomplished
through a protocol known as Internet Security Association and Key Management
Protocol/Oakley (ISAKMP/Oakley), which allows the receiver to obtain a public
key and authenticate the sender using digital certificates.
PPTP
Short for Point-to-Point Tunnelling Protocol, a technology for creating VPNs developed jointly by Microsoft, U.S. Robotics and several remote access vendor companies, known collectively as the PPTP Forum. A VPN is a private network of
computers that uses the public Internet to connect some nodes. Because the
Internet is essentially an open network, PPTP is used to ensure that messages
transmitted from one VPN node to another are secure. With PPTP, users can dial in
to their corporate network via the Internet.
L2TP
Short for Layer Two (2) Tunnelling Protocol, an extension to the PPP protocol that
enables ISPs to operate Virtual Private Networks (VPNs). L2TP merges the best
features of two other tunnelling protocols: PPTP from Microsoft and L2F from
Cisco Systems. Like PPTP, L2TP requires that the ISP's routers support the
protocol.
VPN Equipment
Depending on the type of VPN you decide to implement, either remote-access or
site-to-site, you will need specific components to build your VPN. These standard
components include a software client for each remote workstation, dedicated
hardware, such as a firewall or a product like the Cisco VPN Concentrator, a VPN
server, and a Network Access Server (NAS).

Conclusion: Successfully studied virtual private networks.


Experiment No. 4

Aim: Case Study on Google App Engine

Theory:
Google App Engine:

Architecture :

The Google App Engine (GAE) is Google's answer to the ongoing trend of cloud computing offerings within the industry. In the traditional sense, GAE is a web application hosting service, allowing for the development and deployment of web-based applications within a pre-defined runtime environment. Unlike other cloud-based hosting offerings such as Amazon Web Services that operate on an IaaS level, the GAE already provides an application infrastructure at the PaaS level. This means that the GAE abstracts from the underlying hardware and operating system layers by providing the hosted application with a set of application-oriented services. While this approach is very convenient for developers of such applications, the rationale behind the GAE is its focus on scalability and usage-based infrastructure as well as payment.

Costs :

Developing and deploying applications for the GAE is generally free of charge but
restricted to a certain amount of traffic generated by the deployed application. Once
this limit is reached within a certain time period, the application stops working.
However, this limit can be waived when switching to a billable quota where the
developer can enter a maximum budget that can be spent on an application per day.
Depending on the traffic, once the free quota is reached the application will continue
to work until the maximum budget for that day is reached. Table 1 summarizes some of the most important quotas and the corresponding amount charged per unit once free resources are depleted and additional, billable quota is desired.

Features:

The GAE can be divided into three parts: the runtime environment, the datastore, and the App Engine services.
Runtime Environment

The GAE runtime environment is the place where the actual application is executed. However, the application is only invoked once an HTTP request is sent to the GAE via a web browser or some other interface, meaning that the application is not constantly running if no invocation or processing is taking place. In the case of such an HTTP request, the request handler forwards the request and the GAE selects one of many possible Google servers where the application is then instantly deployed and executed for a certain amount of time (8). The application may then do some computing and return the result back to the GAE request handler, which forwards an HTTP response to the client. It is important to understand that the application runs completely embedded in this sandbox environment, but only as long as requests are still coming in or some processing is being done within the application. The reason for this is simple: applications should only run when they are actually computing; otherwise they would allocate precious computing power and memory without need. This paradigm already shows the GAE's potential in terms of scalability. Being able to run multiple instances of one application independently on different servers guarantees a decent level of scalability. However, this highly flexible and stateless application execution paradigm has its limitations. Requests are processed for no longer than 30 seconds, after which the response has to be returned to the client and the application is removed from the runtime environment again (8). Obviously this method accepts that, for deploying and starting an application each time a request is processed, an additional lead time is needed until the application is finally up and running. The GAE tries to counter this problem by caching the application in the server memory as long as possible, optimizing for several subsequent requests to the same application. The type of runtime environment on the Google servers depends on the programming language used. For Java, or other languages that have support for Java-based compilers (such as Ruby, Rhino and Groovy), a Java Virtual Machine (JVM) is provided. The GAE also fully supports the Google Web Toolkit (GWT), a framework for rich web applications. For Python and related frameworks a Python-based environment is used.
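As a sketch of how an application lives inside this request-driven sandbox, the minimal handler below (assuming the legacy Python 2.7 runtime and its bundled webapp2 framework) only executes while the GAE is routing an HTTP request to it and returns a response immediately afterwards.

import webapp2

class MainPage(webapp2.RequestHandler):
    def get(self):
        # Invoked only while a request is being processed; the instance may be
        # discarded by the GAE once traffic stops arriving.
        self.response.headers['Content-Type'] = 'text/plain'
        self.response.write('Hello from the App Engine sandbox')

# Routing table mapping URLs to handlers; the runtime serves this WSGI app.
app = webapp2.WSGIApplication([('/', MainPage)], debug=True)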

Persistence and the datastore

As previously discussed, the stateless execution of applications creates the need for a datastore that provides a proper way of achieving persistence. Traditionally, the most popular way of persisting data in web applications has been the use of relational databases. However, with the focus set on high flexibility and scalability, the GAE uses a different approach to data persistence, called Bigtable.
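A small sketch of persisting data in the datastore from Python, assuming the App Engine ndb client library; the Greeting model and the helper functions are illustrative names, not part of the platform.

from google.appengine.ext import ndb

class Greeting(ndb.Model):
    # Entity stored in the Bigtable-backed datastore (no relational schema).
    content = ndb.StringProperty()
    created = ndb.DateTimeProperty(auto_now_add=True)

def save_greeting(text):
    # put() writes the entity and returns its datastore key.
    return Greeting(content=text).put()

def recent_greetings(limit=10):
    # Queries are served from indexes rather than relational joins.
    return Greeting.query().order(-Greeting.created).fetch(limit)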

Services

As mentioned earlier, the GAE serves as an abstraction of the underlying hardware and operating system layers. These abstractions are implemented as services that can be directly called from the actual application. In fact, the datastore itself is also a service that is controlled by the runtime environment of the application.
Memcache

The platform's built-in memory cache service serves as a short-term storage. As its name suggests, it stores data in a server's memory, allowing for faster access compared to the datastore. Memcache is a non-persistent data store that should only be used to store temporary data within a series of computations. Probably the most common use case for Memcache is to store session-specific data (15). Persisting session information in the datastore and executing queries on every page interaction is highly inefficient over the application lifetime, since session-owner instances are unique per session (16). Moreover, Memcache is well suited to speed up common datastore queries (8). To interact with Memcache, GAE supports JCache, a proposed interface standard for memory caches.
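The sketch below shows the typical cache-aside pattern with the Memcache service, assuming the Python App Engine API; load_profile_from_datastore is a hypothetical helper standing in for a datastore query.

from google.appengine.api import memcache

def get_profile(user_id):
    cache_key = 'profile:%s' % user_id
    profile = memcache.get(cache_key)                       # fast, in-memory lookup
    if profile is None:
        profile = load_profile_from_datastore(user_id)      # hypothetical datastore helper
        memcache.set(cache_key, profile, time=3600)         # cache for one hour; non-persistent
    return profile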
Experiment No. 5

Aim: Case Study of Hadoop

Theory: Hadoop is an open source, Java-based programming framework that supports the processing and storage of extremely large data sets in a distributed computing environment. It is part of the Apache project sponsored by the Apache Software Foundation.
Hadoop makes it possible to run applications on systems with thousands of
commodity hardware nodes, and to handle thousands of terabytes of data.
Its distributed file system facilitates rapid data transfer rates among nodes and
allows the system to continue operating in case of a node failure. This approach
lowers the risk of catastrophic system failure and unexpected data loss, even if a
significant number of nodes become inoperative. Consequently, Hadoop quickly
emerged as a foundation for big data processing tasks, such as scientific analytics,
business and sales planning, and processing enormous volumes of sensor data,
including from internet of things sensors.
Hadoop was created by computer scientists Doug Cutting and Mike Cafarella in 2006 to support distribution for the Nutch search engine. It was inspired by Google's MapReduce, a software framework in which an application is broken down into numerous small parts. Any of these parts, which are also called fragments or blocks, can be run on any node in the cluster. After years of development within the open source community, Hadoop 1.0 became publicly available in November 2012 as part of the Apache project sponsored by the Apache Software Foundation.
Since its initial release, Hadoop has been continuously developed and updated. The
second iteration of Hadoop (Hadoop 2) improved resource management and
scheduling. It features a high-availability file-system option and support for
Microsoft Windows and other components to expand the framework's versatility
for data processing and analytics.

What is Hadoop?
Organizations can deploy Hadoop components and supporting software packages
in their local data center. However, most big data projects depend on short-term
use of substantial computing resources. This type of usage is best-suited to highly
scalable public cloud services, such as Amazon Web Services
(AWS), Google Cloud Platform and Microsoft Azure. Public cloud providers often support Hadoop components through basic services, such as AWS Elastic Compute Cloud and Simple Storage Service instances. However, there are also services tailored specifically for Hadoop-type tasks, such as AWS Elastic MapReduce, Google Cloud Dataproc and Microsoft Azure HDInsight.
Hadoop modules and projects
As a software framework, Hadoop is composed of numerous functional modules. At a minimum, Hadoop uses Hadoop Common as a kernel to provide the framework's essential libraries. Other components include the Hadoop Distributed File System (HDFS), which is capable of storing data across thousands of commodity servers to achieve high bandwidth between nodes; Hadoop YARN (Yet Another Resource Negotiator), which provides resource management and scheduling for user applications; and Hadoop MapReduce, which provides the programming model used to tackle large distributed data processing: mapping data and reducing it to a result.
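To illustrate the MapReduce model named above, the sketch below is a classic word count written for Hadoop Streaming, which lets mappers and reducers be plain Python scripts that read stdin and write stdout. The input/output paths and the streaming jar location are placeholders that vary by installation.

# --- mapper.py: emit one (word, 1) pair per word, tab-separated ---
import sys

for line in sys.stdin:
    for word in line.strip().split():
        print('%s\t%d' % (word, 1))

# --- reducer.py: Hadoop sorts mapper output by key, so equal words arrive together ---
import sys

current_word, current_count = None, 0
for line in sys.stdin:
    word, count = line.rstrip('\n').split('\t', 1)
    if word == current_word:
        current_count += int(count)
    else:
        if current_word is not None:
            print('%s\t%d' % (current_word, current_count))
        current_word, current_count = word, int(count)
if current_word is not None:
    print('%s\t%d' % (current_word, current_count))

# Example invocation (paths are placeholders):
# hadoop jar hadoop-streaming.jar -input /data/in -output /data/out \
#   -mapper mapper.py -reducer reducer.py -file mapper.py -file reducer.py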
Hadoop also supports a range of related projects that can complement and extend Hadoop's basic capabilities. Complementary software packages include:

Apache Flume. A tool used to collect, aggregate and move huge amounts of streaming data into HDFS;

Apache HBase. An open source, nonrelational, distributed database;

Apache Hive. A data warehouse that provides data summarization, query and analysis;

Cloudera Impala. A massively parallel processing database for Hadoop, originally created by the software company Cloudera, but now released as open source software;

Apache Oozie. A server-based workflow scheduling system to manage Hadoop jobs;

Apache Phoenix. An open source, massively parallel processing, relational database engine for Hadoop that is based on Apache HBase;

Apache Pig. A high-level platform for creating programs that run on Hadoop;

Apache Sqoop. A tool to transfer bulk data between Hadoop and structured data stores, such as relational databases;

Apache Spark. A fast engine for big data processing capable of streaming and supporting SQL, machine learning and graph processing;

Apache Storm. An open source data processing system; and

Apache ZooKeeper. An open source configuration, synchronization and naming registry service for large distributed systems.

Conclusion: Successfully studied Hadoop.


Experiment No. 6

Aim: To study about Amazon Web Services.

Theory-
Why Amazon Web Services?

Amazon.com initiated the evaluation of Amazon S3 for economic and performance


improvements related to data backup. As part of that evaluation, they considered
security, availability, and performance aspects of Amazon S3 backups.
Amazon.com also executed a cost-benefit analysis to ensure that a migration to
Amazon S3 would be financially worthwhile. That cost benefit analysis included
the following elements:

Performance advantage and cost competitiveness. It was important that the overall costs of the backups did not increase. At the same time, Amazon.com required faster backup and recovery performance. The time and effort required for backup and for recovery operations proved to be a significant improvement over tape, with restoring from Amazon S3 running from two to twelve times faster than a similar restore from tape. Amazon.com required any new backup medium to provide improved performance while maintaining or reducing overall costs. Backing up to on-premises disk-based storage would have improved performance, but missed on cost competitiveness. Amazon S3 cloud-based storage met both criteria.

Greater durability and availability. Amazon S3 is designed to provide 99.999999999% durability and 99.99% availability of objects over a given year. Amazon.com compared these figures with those observed from their tape infrastructure, and determined that Amazon S3 offered significant improvement.

Less operational friction. Amazon.com DBAs had to evaluate whether Amazon S3 backups would be viable for their database backups. They determined that using Amazon S3 for backups was easy to implement because it worked seamlessly with Oracle RMAN.

Strong data security. Amazon.com found that AWS met all of their requirements for physical security, security accreditations, and security processes, protecting data in flight, data at rest, and utilizing suitable encryption standards.

The Benefits

With the migration to Amazon S3 well along the way to completion, Amazon.com has realized several benefits, including:

Elimination of complex and time-consuming tape capacity planning. Amazon.com is growing larger and more dynamic each year, both organically and as a result of acquisitions. AWS has enabled Amazon.com to keep pace with this rapid expansion, and to do so seamlessly. Historically, Amazon.com business groups have had to write annual backup plans, quantifying the amount of tape storage that they plan to use for the year and the frequency with which they will use the tape resources. These plans are then used to charge each organization for their tape usage, spreading the cost among many teams. With Amazon S3, teams simply pay for what they use, and are billed for their usage as they go. There are virtually no upper limits as to how much data can be stored in Amazon S3, and so there are no worries about running out of resources. For teams adopting Amazon S3 backups, the need for formal planning has been all but eliminated.

Reduced capital expenditures. Amazon.com no longer needs to acquire tape robots, tape drives, tape inventory, data center space, networking gear, enterprise backup software, or predict future tape consumption. This eliminates the burden of budgeting for capital equipment well in advance, as well as the capital expense.

Immediate availability of data for restoring – no need to locate or retrieve physical tapes. Whenever a DBA needs to restore data from tape, they face delays. The tape backup software needs to read the tape catalog to find the correct files to restore, locate the correct tape, mount the tape, and read the data from it. In almost all cases the data is spread across multiple tapes, resulting in further delays. This, combined with contention for tape drives resulting from multiple users' tape requests, slows the process down even more. This is especially severe during critical events such as a data center outage, when many databases must be restored simultaneously and as soon as possible. None of these problems occur with Amazon S3. Data restores can begin immediately, with no waiting or tape queuing – and that means the database can be recovered much faster.

Backing up a database to Amazon S3 can be two to twelve times faster than with tape drives. As one example, in a benchmark test a DBA was able to restore 3.8 terabytes in 2.5 hours over gigabit Ethernet. This amounts to 25 gigabytes per minute, or 422 MB per second. In addition, since Amazon.com uses RMAN data compression, the effective restore rate was 3.37 gigabytes per second. This 2.5 hours compares to, conservatively, the 10-15 hours that would be required to restore from tape.

Easy implementation of Oracle RMAN backups to Amazon S3. The DBAs found
it easy to start backing up their databases to Amazon S3. Directing Oracle RMAN
backups to Amazon S3 requires only a configuration of the Oracle Secure Backup
Cloud (SBC) module. The effort required to configure the Oracle SBC module
amounted to an hour or less per database. After this one-time setup, the database
backups were transparently redirected to Amazon S3.

Durable data storage provided by Amazon S3, which is designed for 11 nines
durability. On occasion, Amazon.com has experienced hardware failures with tape
infrastructure – tapes that break, tape drives that fail, and robotic components that
fail. Sometimes this happens when a DBA is trying to restore a database, and
dramatically increases the mean time to recover (MTTR). With the durability and
availability of Amazon S3, these issues are no
longer a concern.

Freeing up valuable human resources. With tape infrastructure, Amazon.com had to seek out engineers who were experienced with very large tape backup installations – a specialized, vendor-specific skill set that is difficult to find. They also needed to hire data center technicians and dedicate them to problem-solving and troubleshooting hardware issues – replacing drives, shuffling tapes around, shipping and tracking tapes, and so on. Amazon S3 allowed them to free up these specialists from day-to-day operations so that they can work on more valuable, business-critical engineering tasks.

Elimination of physical tape transport to off-site location. Any company that has been storing Oracle backup data offsite should take a hard look at the costs involved in transporting, securing and storing their tapes offsite – these costs can be reduced or possibly eliminated by storing the data in Amazon S3.

As the world's largest online retailer, Amazon.com continuously innovates in order to provide an improved customer experience and offer products at the lowest possible prices. One such innovation has been to replace tape with Amazon S3 storage for database backups. This innovation is one that can be easily replicated by other organizations that back up their Oracle databases to tape.
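For illustration only, the sketch below pushes a local backup piece to an S3 bucket with the boto3 SDK; the bucket and file names are placeholders, and Amazon.com's actual workflow sends RMAN backups through the Oracle Secure Backup Cloud module rather than a hand-written script.

import boto3

s3 = boto3.client('s3')

# Upload one backup piece to object storage; all names below are placeholders.
s3.upload_file(
    Filename='/backups/orcl_full.bkp',   # local RMAN backup piece (placeholder path)
    Bucket='example-db-backups',         # placeholder bucket name
    Key='oracle/orcl_full.bkp',          # object key in the bucket
)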
Amazon Relational Database Service (RDS)
Amazon Relational Database Service is a web service that makes it easy to set up, operate, and scale a relational database in the cloud.

Amazon ElastiCache
Amazon ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory cache in the cloud.

Amazon Fulfillment Web Service
Amazon Fulfillment Web Service allows merchants to deliver products using Amazon.com's worldwide fulfillment capabilities.

Deployment & Management

AWS Elastic Beanstalk
AWS Elastic Beanstalk is an even easier way to quickly deploy and manage applications in the AWS cloud. We simply upload our application, and Elastic Beanstalk automatically handles the deployment details of capacity provisioning, load balancing, auto-scaling, and application health monitoring.

AWS CloudFormation
AWS CloudFormation is a service that gives developers and businesses an easy way to create a collection of related AWS resources and provision them in an orderly and predictable fashion.

Alexa Web Information Service
The Alexa Web Information Service makes Alexa's huge repository of data about the structure and traffic patterns of the Web available to developers.

The Google File System (GFS)

The Google File System (GFS) is designed to meet the rapidly growing demands of Google's data processing needs.

GFS shares many of the same goals as previous distributed file systems, such as performance, scalability, reliability, and availability.

It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients.

While sharing many of the same goals as previous distributed file systems, GFS has successfully met Google's storage needs.

It is widely deployed within Google as the storage platform for the generation and processing of data used by Google's services, as well as for research and development efforts that require large data sets.
Conclusion:

Thus we have studied a case study on Amazon Web Services.


Experiment No. 7

Aim: To study about Microsoft Azure.


Theory:
What is Azure?

Azure is a comprehensive set of cloud services that developers and IT professionals use to build, deploy and manage applications through Microsoft's global network of data centers. Integrated tools, DevOps and a marketplace support you in efficiently building anything from simple mobile apps to internet-scale solutions.
Azure is productive for developers
Get your apps to market faster. Azure integrated tools, from mobile DevOps to serverless computing, support your productivity. Build the way you want to, using the tools and open source technologies you already know. Azure supports a range of operating systems, programming languages, frameworks, databases and devices.
Continuously innovate and deliver high-quality apps.
Provide cross-device experiences with support for all major mobile platforms.
Run any stack, Linux-based or Windows-based, and use advanced capabilities such as Kubernetes clusters in Azure Container Service.
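As a small developer-facing example of building with tools you already know, the sketch below uploads a file to Azure Blob Storage using the Python azure-storage-blob SDK (v12); the connection string, container name, and file names are placeholders for a real storage account.

from azure.storage.blob import BlobServiceClient

# Placeholder connection string for an Azure storage account.
conn_str = "DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=<key>;EndpointSuffix=core.windows.net"

service = BlobServiceClient.from_connection_string(conn_str)
container = service.get_container_client("app-assets")   # placeholder container name

with open("report.pdf", "rb") as data:
    # Creates or overwrites the blob "reports/report.pdf" in the container.
    container.upload_blob(name="reports/report.pdf", data=data, overwrite=True)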

Azure is the only consistent hybrid cloud


Build and deploy wherever you want with Azure, the only consistent hybrid
cloud on the market. Connect data and apps in the cloud and on-premises—for
maximum portability and value from your existing investments. Azure offers
hybrid consistency in application development, management and security,
identity management and across the data platform.
Extend Azure on-premises and build innovative, hybrid apps with Azure Stack.
Connect on-premises data and apps to overcome complexity and optimise your
existing assets.
Distribute and analyse data seamlessly across cloud and on-premises.

Azure is the cloud for building intelligent apps


Use Azure to create data-driven, intelligent apps. From image recognition to bot
services, take advantage of Azure data services and artificial intelligence to
create new experiences—that scale—and support deep learning, HPC
simulations and real-time analytics on any shape and size of data.
Develop breakthrough apps with built-in AI.
Build and deploy custom AI models at scale, on any data.
Combine the best of Microsoft and open source data and AI innovations.
Azure is the cloud you can trust
Ninety percent of Fortune 500 companies trust the Microsoft Cloud. Join them.
Take advantage of Microsoft security, privacy, transparency and the most
compliance coverage of any cloud provider.
Achieve global scale on a worldwide network of Microsoft-managed
datacenters across 42 announced regions.
Detect and mitigate threats with a central view of all your Azure resources
through Azure Security Center
Rely on the cloud with the most comprehensive compliance coverage (50
compliance offerings) and recognised as the most trusted cloud for U.S.
government institutions.

Azure or AWS?
With the most data center regions around the globe, unmatched consistent
hybrid cloud capabilities and comprehensive AI services—Azure is the right
choice for your business. See why organisations worldwide are choosing Azure.

Azure vs. AWS


Why Azure is the right choice
Organisations all over the world recognise Azure over AWS as the most trusted
cloud, because it offers:
More regions than any other cloud provider
Unmatched hybrid capabilities
The strongest intelligence
Trust the cloud that helps protect your work
When you compare AWS versus Azure, you will find that Azure has more
comprehensive compliance coverage with more than 60 compliance offerings
and was the first major cloud provider to contractually commit to the
requirements of the General Data Protection Regulation (GDPR). To protect
your organisation, Azure embeds security, privacy and compliance into
its development methodology and has been recognised as the most trusted cloud
for U.S. government institutions, earning a FedRAMP High authorisation that
covers 18 Azure services. In addition, Azure IP Advantage provides best-in-
industry intellectual property protection, so you can focus on innovation, instead
of worrying about baseless lawsuits.

Conclusion: We have successfully studied Microsoft Azure.


Experiment No. 8

Aim: To study about Aneka.


Theory

Aneka is a platform and a framework for developing distributed applications on the Cloud. It harnesses the spare CPU cycles of a heterogeneous network of desktop PCs and servers or data centers on demand. Aneka provides developers with a rich set of APIs for transparently exploiting such resources and expressing the business logic of applications by using the preferred programming abstractions. System administrators can leverage a collection of tools to monitor and control the deployed infrastructure. This can be a public cloud available to anyone through the Internet, or a private cloud constituted by a set of nodes with restricted access.

The Aneka-based computing cloud is a collection of physical and virtualized resources connected through a network, which can be either the Internet or a private intranet. Each of these resources hosts an instance of the Aneka Container, representing the runtime environment where the distributed applications are executed. The container provides the basic management features of the single node and delegates all the other operations to the services that it is hosting. The services are broken up into fabric, foundation, and execution services. Fabric services directly interact with the node through the Platform Abstraction Layer (PAL) and perform hardware profiling and dynamic resource provisioning. Foundation services identify the core system of the Aneka middleware, providing a set of basic features that enable Aneka containers to perform specialized and specific sets of tasks. Execution services directly deal with the scheduling and execution of applications in the Cloud.

One of the key features of Aneka is its ability to provide different ways of expressing distributed applications by offering different programming models; execution services are mostly concerned with providing the middleware with an implementation of these models. Additional services such as persistence and security are transversal to the entire stack of services that are hosted by the Container. At the application level, a set of different components and tools are provided to: 1) simplify the development of applications (SDK); 2) port existing applications to the Cloud; and 3) monitor and manage the Aneka Cloud.

A common deployment of Aneka is as follows: an Aneka-based Cloud is constituted by a set of interconnected resources that are dynamically modified according to the user's needs, by using resource virtualization or by harnessing the spare CPU cycles of desktop machines. If the deployment identifies a private Cloud, all the resources are in-house, for example within the enterprise. This deployment can be extended by adding publicly available resources on demand, or by interacting with other Aneka public clouds providing computing resources connected over the Internet.

Manjrasoft is focused on the creation of innovative software technologies for simplifying the development and deployment of applications on private or public Clouds. Its product Aneka plays the role of an Application Platform as a Service for Cloud Computing. Aneka supports various programming models, including Task Programming, Thread Programming and MapReduce Programming, and provides tools for the rapid creation of applications and their seamless deployment on private or public Clouds to distribute applications.

Aneka technology primarily consists of two key components:

1. SDK (Software Development Kit) containing application programming interfaces (APIs) and tools essential for the rapid development of applications. Aneka APIs support three popular Cloud programming models: Task, Thread, and MapReduce; and

2. A Runtime Engine and Platform for managing the deployment and execution of applications on private or public Clouds.

One of the notable characteristics of the Aneka PaaS is its support for provisioning of private cloud resources, ranging from desktops and clusters to virtual data centres using VMware and Citrix XenServer, and of public cloud resources such as Windows Azure, Amazon EC2, and GoGrid Cloud Service.

The potential of Aneka as a Platform as a Service has been successfully harnessed by its users and customers in various sectors including engineering, life science, education, and business intelligence.

Conclusion: Successfully studied Aneka.
