
Multi-Clouds Database: A New Model to Provide Security in Cloud Computing


Ion Morozan
ion.morozan@gmail.com
Vrije Universiteit
Amsterdam, The Netherlands
Abstract: The use of Cloud Computing (CC) has increased rapidly in many organizations. One of the advantages of CC is cloud data storage, where customers do not have to store their data on their own servers; instead, the data is stored on the cloud service provider's (CSP) side. Security in CC is still considered one of the most critical aspects, due to the sensitive and confidential information that users store in the cloud.
This paper proposes a model to securely store information in the cloud by splitting data into several chunks and storing them on multiple cloud providers (e.g. Amazon, Google, Microsoft) in a manner that preserves data confidentiality and integrity and ensures availability.
Our approach preserves the security and privacy of users' sensitive information by replicating data across multiple clouds, using a secret sharing approach based on Shamir's secret sharing algorithm. This model avoids the negative effects of relying on a single cloud and reduces the drawbacks of encryption techniques.

I. INTRODUCTION

During the last couple of years, Cloud Computing has become a much bigger trend in the business and networking world, with a predicted increase of 130% by 2016, as IDC [1] announced during their last event, in September 2013. Ensuring security is considered the most crucial issue that cloud service providers face, as users often share sensitive information with the cloud blindly and cannot be sure whether providers can be trusted. There are numerous security vulnerabilities in CC, as it encompasses many technologies, including networks, databases, operating systems, virtualization, resource scheduling, load balancing and memory management. Security problems faced by these systems apply to cloud computing as well; therefore cloud providers should address privacy and security issues as a matter of high and urgent priority.
Even though CC offers limitless flexibility, reliability, enhanced collaboration and unlimited storage, how secure is it after all? How can we be sure that our data is safe in the cloud? Dealing with a single cloud provider is becoming less popular due to potential problems that can affect our data, such as service availability failure (e.g. some catastrophe befalling the cloud service provider) and the possibility of malicious insiders in the single cloud (e.g. data stolen by an attacker who found a vulnerability). We therefore have to come up with a way to secure the stored files. In recent years, there has been a move towards multi-clouds, intercloud or cloud-of-clouds. As a consequence, since there is no fully secure and trustworthy service provider that can host our sensitive information, we can store our data in a clever way in more than one cloud, in a manner that preserves data confidentiality and integrity and ensures availability.
The prime focus of this paper is the data security aspect of CC. Since users share information with a third party, they want to avoid untrusted cloud computing providers by protecting private and sensitive information, such as credit card details or patients' medical records, from attackers or malicious insiders. To this end, we propose a multi-cloud database model which uses multiple cloud service providers to store data. The purpose of the model is to address security and privacy risks in the cloud computing environment. The security benefit of multi-clouds is further enhanced by our solution, which employs a secret sharing scheme to avoid storing clear data in the clouds.
The remainder of this paper is organized as follows. Section II analyzes the new generation of cloud computing, that is, multi-clouds, and recent solutions that address security in the CC paradigm. Section III presents Shamir's secret sharing algorithm, which is used in our proposed model, describing its basic principle and mathematical implementation. Section IV describes the proposed model with a thorough data flow explanation and, finally, Section V presents our conclusions.
II. FROM SINGLE TO MULTI-CLOUDS

In this section we examine the migration of cloud computing from the single-cloud to the multi-cloud perspective. What, after all, is multi-cloud? Clearly, it is a more complex system than a hybrid cloud, which is typically a paired private and public cloud. Multi-cloud adds more clouds to the mix (e.g. two or more public IaaS providers, a private PaaS, private use-based accounting), with the aim of minimizing the risk of service availability failure, corruption of data, loss of privacy, and malicious insiders in a single cloud. Also known as cloud-of-clouds, this strategy can improve an enterprise's overall performance by avoiding vendor lock-in (proprietary rights) and by using different infrastructures to meet the needs of diverse partners and customers, thereby avoiding the service unavailability and security risks that can occur inside a single cloud infrastructure. The term multi-clouds was first mentioned by Vukolic [8], who argues that the main purpose of shifting towards inter-clouds is to improve on what is offered by a single cloud by distributing reliability, trust and security among multiple cloud providers.

A. Why Move to Multi-Clouds?

The use of cloud-based platforms in the technology industry keeps growing towards more complex arrangements. Why? Since the business world now demands a combination of many complex cloud services, the concept of multi-cloud represents the optimal solution to meet the requirements of current businesses.
Shifting from single clouds to multi-clouds is extremely important for ensuring the privacy and security of users' sensitive information. Cachin et al. [9] claim that the services of single clouds are still subject to outage. Moreover, K. Bowers et al. [10] showed that over 80% of company management fear security threats and loss of control over data and systems. To this end, they suggest that cloud computing should not end with a single cloud provider; rather, users should focus on developing applications using the power of multi-clouds.
There are numerous advantages to adopting a multi-cloud approach, some of which we describe next.
Reducing risks. Embracing a multi-cloud strategy by running cloud-based applications on multiple cloud providers has a huge impact on infrastructure redundancy. By selecting different data centers to host cloud servers, we can greatly reduce the risk associated with the uptime of business services, as well as risks related to networking providers and other cloud issues, since each provider usually operates separately.
A multi-cloud strategy also reduces other risks associated with having a single provider. Assume, for example, that someone discovers a vulnerability in the storage system of the current infrastructure provider. If the service is hosted on multiple clouds, we can simply shut down the vulnerable servers with minimum impact on service availability. The same policy applies if, for instance, the provider suddenly decides to increase its prices: we shut down the servers and move the application somewhere else. Designing a flexible application from scratch therefore allows us to migrate our service with minimum impact on its performance.
Virtual power. During the early years of cloud computing, adopting a multi-cloud strategy was hard, if not impossible. Cloud providers operated on proprietary closed architectures that made migration between CSPs a serious problem. Today, however, these barriers are falling as the open source community expands fast. There are now implementations on the market, such as [12], [13], [14], that merge the interfaces of several cloud providers by offering API abstractions in Java or C, which make it easier to manage multiple cloud providers at the same time.
Flexibility. A multi-cloud model can offer not only hardware, software and infrastructure redundancy, but can also steer traffic from different customers or partners through the fastest possible parts of the network. Some clouds are better suited than others for a particular task: some platforms might serve a large number of requests per unit time involving small data transfers, whereas a different CSP might perform better for smaller numbers of requests per unit time involving large data transfers. Moreover, some organizations offer the possibility of serving applications via a public cloud, making the services available on the Internet, but also via a private cloud, hosting services tailored to a limited number of people behind a firewall.
As in any market where competition abounds, there are significant differences between cloud providers: some offer better service level agreement terms, some offer lower prices, some offer APIs for different programming languages, and so on. The best way to understand all these differences and choose the providers which best suit particular needs is to experiment with them.
III. SECRET SHARING STRATEGY

Simply storing the information on multiple clouds solves the problem of data availability, but what about security? Having multiple copies of the data in different clouds merely creates multiple gates for intruders to break in. Therefore, we need to make sure that the data shipped to multiple clouds is safer than it was on a single cloud. This is where we apply the secret sharing algorithm presented by Adi Shamir in [2].
Invented in 1979, the algorithm occupies an important place in the area of cryptography. The author addressed the problem of information distribution with the aim of showing that there is an orthogonal approach to data security, based on information distribution instead of encryption. The need for secure communication between two endpoints has driven most of the work on data security. A similar approach was discovered by George Blakley [3], but the mathematics behind his algorithm is more complicated; the strength of Shamir's secret sharing algorithm lies in its simplicity of implementation.
Shamir's secret sharing, or secret splitting, is a way of distributing a secret among a group of n participants, each of whom is allocated a part of the secret, in our case a piece of data. The strong point of this method is that the secret can be reconstructed only when a predefined number of shares are combined; individual shares are of no use on their own, so anyone with fewer than t out of n shares has no more information about the secret than someone with 0 shares. By way of contrast, consider a naive sharing scheme in which the information to be protected is the word "academia". This word is divided into four two-letter shares: "ac", "ad", "em" and "ia". A person with 0 shares knows only that the word consists of eight letters, and would have to guess the word from 26^8 = 208 billion possible combinations. With one share, the search space narrows down to 26^6 = 308 million combinations, and so on. Thus, a user with fewer than t shares is able to narrow down the search for the secret without first obtaining all the necessary shares, which is exactly what a secure secret sharing scheme must prevent.
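The arithmetic in this example can be checked with a few lines of Python. This is only an illustrative sketch: the function name `remaining_guesses` and the constants are assumptions introduced here, and the 26-letter lowercase alphabet is the one implied by the figures above.

```python
ALPHABET_SIZE = 26      # lowercase Latin letters, as implied by 26^8
WORD_LENGTH = 8         # len("academia")
LETTERS_PER_SHARE = 2   # each naive share reveals two letters


def remaining_guesses(shares_known: int) -> int:
    """Number of candidate words left once `shares_known` shares are revealed."""
    unknown_letters = WORD_LENGTH - LETTERS_PER_SHARE * shares_known
    return ALPHABET_SIZE ** unknown_letters


if __name__ == "__main__":
    for k in range(5):
        print(k, remaining_guesses(k))
```

Each revealed share shrinks the search space by a factor of 26^2 = 676, which is the kind of leakage a threshold scheme is designed to avoid.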
A. Basic Principle
This scheme consists of a special party called the dealer and n players. The dealer is in charge of dividing some data D into n parts D1, D2, ..., Dn in such a way that:
• Knowledge of any t or more pieces Di makes the value of D known.
• Complete knowledge of t - 1 shares reveals no information about D (in the sense that all possible values remain equally likely). The threshold t should be no greater than n and should be chosen so that an adversary cannot gain access to t pieces of data, which keeps the secret unconstructible.
Such a system is called a (t, n) threshold scheme. The value of t can be chosen depending on the level of security we desire. For example, if the data is of maximum importance, then we can set t = n, so that all n shares are needed to reconstruct the secret (the original data).
The use of a secret sharing scheme allows us to provide confidentiality guarantees for the stored data without a key distribution mechanism, which would require sharing a secret key. Therefore, instead of encryption, this mechanism [2] distributes data to multiple servers belonging to different cloud providers to ensure the privacy of users' queries.
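The second requirement can be illustrated for t = 2. The sketch below assumes arithmetic modulo a small prime P, as in Shamir's original formulation [2]; the prime 257, the sample share (2, 39) and the helper names are hypothetical values chosen for illustration only. It shows that a single share is consistent with every possible secret, so on its own it reveals nothing about D.

```python
P = 257  # small prime modulus (illustrative assumption)


def slope_for_candidate(x1: int, y1: int, candidate_secret: int) -> int:
    """Slope a such that q(x) = a*x + candidate_secret passes through (x1, y1) mod P."""
    inv_x1 = pow(x1, -1, P)  # modular inverse, available from Python 3.8
    return ((y1 - candidate_secret) * inv_x1) % P


def consistent_secrets(x1: int, y1: int) -> list[int]:
    """All secrets in [0, P) for which some line through (x1, y1) exists."""
    result = []
    for s in range(P):
        a = slope_for_candidate(x1, y1, s)
        if (a * x1 + s) % P == y1 % P:
            result.append(s)
    return result


if __name__ == "__main__":
    # Every one of the P candidate secrets fits the single known share.
    print(len(consistent_secrets(2, 39)))
```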
B. Mathematical Implementation
The mathematical implementation of the secret sharing algorithm can be understood with the help of a simple example based on [4]. The general idea is presented below.
If the user wants to outsource data from data source D to the cloud (CSP1, CSP2, ..., CSPn), the data is divided into n shares and each share is stored in a different CSP. Afterwards, each client query is sent to all CSPs to retrieve the relevant shares and answer the query, without any service provider learning the value behind the shares it holds.

Figure 1. Data distribution into multiple shares by applying the polynomial.

The secret information X = {x1 = 2, x2 = 4, x3 = 1} will be used by the polynomial functions for distribution to each CSP.
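As a worked illustration, take the secret value V_s = 29 (the age used in the query example of Section IV), a threshold of t = 2, and the positions listed in X. The coefficient a_1 = 5 is an arbitrary value assumed here for the sake of the arithmetic; it is not taken from Figure 1.

```latex
\begin{align*}
q(x) &= a_1 x + V_s = 5x + 29 \\
\mathrm{Shares}(29, 1) &= q(x_1) = q(2) = 39 \\
\mathrm{Shares}(29, 2) &= q(x_2) = q(4) = 49 \\
\mathrm{Shares}(29, 3) &= q(x_3) = q(1) = 34 \\
a_1 &= \frac{q(x_2) - q(x_1)}{x_2 - x_1} = \frac{49 - 39}{4 - 2} = 5 \\
V_s &= q(x_1) - a_1 x_1 = 39 - 5 \cdot 2 = 29
\end{align*}
```

The last two lines recover the secret from the shares held by CSP1 and CSP2 alone; any other pair of shares gives the same result.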
C. Properties
Shamir's secret sharing algorithm possesses some properties that make it even more powerful [2]:
• The size of each individual piece does not exceed the size of the original data.
• When t is kept fixed, pieces Di can be added or removed without affecting the other pieces.
• By changing the polynomial and constructing new shares for the players, we can enhance security without changing the secret.
• Using tuples of polynomial values, we can obtain a hierarchical scheme in which the number of pieces needed to determine D depends on their importance.
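The third property can be sketched for the case t = 2, where each polynomial is a line q(x) = a*x + secret. The values below (secret 29, positions 2, 4, 1, coefficients 5 and 11) and the helper names are assumptions chosen for illustration. The sketch shows that refreshed shares still recover the same secret, while a stale share combined with a refreshed one no longer does.

```python
from fractions import Fraction

POSITIONS = [2, 4, 1]  # hypothetical x-values, one per CSP


def make_shares(secret: int, coefficient: int) -> list[int]:
    """Evaluate q(x) = coefficient * x + secret at every position."""
    return [coefficient * x + secret for x in POSITIONS]


def recover(x_a: int, y_a: int, x_b: int, y_b: int) -> Fraction:
    """Intercept q(0) of the line through two points (threshold t = 2)."""
    slope = Fraction(y_b - y_a, x_b - x_a)
    return y_a - slope * x_a


if __name__ == "__main__":
    old = make_shares(29, coefficient=5)
    new = make_shares(29, coefficient=11)
    # Two refreshed shares agree on the secret.
    print(recover(POSITIONS[0], new[0], POSITIONS[1], new[1]))
    # One stale share plus one refreshed share yields a different value.
    print(recover(POSITIONS[0], old[0], POSITIONS[1], new[1]))
```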

Afterwards, the following steps are taken:
• We choose at random (t - 1) coefficients, i.e. $a_1, a_2, \ldots, a_{t-1}$.
• The secret value $V_s$ can be recreated at the data source from any t shares, together with some secret information X that is known only to the data source. We divide the secret value $V_s$ by picking a random polynomial q(x) of degree (t - 1) whose constant term is $V_s$. Below we illustrate how the shares can be written [5]; the symbol k in the exponents plays the role of the threshold t.

$$\mathrm{Shares}(V_s, 1) = q(x_1) = a_1 x_1^{k-1} + a_2 x_1^{k-2} + \cdots + V_s$$
$$\mathrm{Shares}(V_s, 2) = q(x_2) = a_1 x_2^{k-1} + a_2 x_2^{k-2} + \cdots + V_s$$
$$\vdots$$
$$\mathrm{Shares}(V_s, n) = q(x_n) = a_1 x_n^{k-1} + a_2 x_n^{k-2} + \cdots + V_s$$
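A minimal sketch of how shares of this form could be computed. It follows the integer form of the equations (Shamir's original scheme [2] performs the same evaluation modulo a prime); the function names, the coefficient bound and the example positions are assumptions made for illustration.

```python
import secrets


def evaluate(coeffs: list[int], secret: int, x: int) -> int:
    """q(x) = a_1*x^(t-1) + ... + a_(t-1)*x + secret, via Horner's rule."""
    acc = 0
    for a in coeffs:
        acc = (acc + a) * x
    return acc + secret


def make_shares(secret: int, positions: list[int], t: int, bound: int = 1000):
    """Pick t-1 random non-zero coefficients and return (coeffs, shares)."""
    coeffs = [secrets.randbelow(bound) + 1 for _ in range(t - 1)]
    shares = [evaluate(coeffs, secret, x) for x in positions]
    return coeffs, shares


if __name__ == "__main__":
    coeffs, shares = make_shares(secret=29, positions=[2, 4, 1], t=2)
    print(coeffs, shares)
```

The coefficients are drawn with `secrets.randbelow` rather than the `random` module because they must be unpredictable to an attacker.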

IV. PROPOSED MODEL

In this section we analyze our proposed database model (see Figure 2), which enables storing information on multiple clouds transparently to the user. Furthermore, we examine several security risks, such as data integrity, data intrusion and service availability.
A. Multi-Clouds Database Model

The main idea of Shamir's secret sharing scheme is that 2 points are sufficient to define a line, 3 points to define a parabola, 4 points to define a cubic curve, and so on. That is, it takes t points to define a polynomial of degree t - 1. Selecting any t of the available n parts therefore yields the same result. The values in these parts are meaningless on their own; it is only when t of them are brought together and processed that we get our secret back.
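The reconstruction step implied here is polynomial interpolation. The sketch below uses Lagrange interpolation evaluated at x = 0 with exact rational arithmetic; the function name and the sample points are hypothetical, and the points are assumed to have distinct x-values.

```python
from fractions import Fraction


def reconstruct(points: list[tuple[int, int]]) -> Fraction:
    """Return q(0) for the unique polynomial of degree len(points)-1 through `points`."""
    secret = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if i != j:
                # Lagrange basis factor (0 - xj) / (xi - xj)
                term *= Fraction(-xj, xi - xj)
        secret += term
    return secret


if __name__ == "__main__":
    # Two points on the line q(x) = 5x + 29
    print(reconstruct([(2, 39), (4, 49)]))
    # Three points on the parabola q(x) = x^2 + 2x + 7
    print(reconstruct([(1, 10), (2, 15), (3, 22)]))
```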
To make things clear, Figure 1 presents an example of a data source D that has a Student table with an Age attribute. The owner of data D wants to outsource the Age attribute to three different cloud service providers (CSP1, CSP2 and CSP3). In addition, the user chooses 5 random polynomials of degree one, since the threshold is t = 2 and the number of shares is n = 3.

Our multi-cloud database model does not preserve privacy by encryption; rather, privacy is ensured by using multi-cloud service providers and the secret sharing mechanism. This avoids the negative impact of a single cloud provider and of encryption on queries. These techniques allow customers to run different types of database queries (i.e. aggregation, range and exact match) while avoiding the risk of a malicious insider in the cloud and preventing the loss of data in case of critical damage to a datastore. As already mentioned, to preserve the security and privacy of users' sensitive information, data is replicated among a predefined number of cloud service providers (CSP) using Shamir's algorithm. The database management system (DBMS) is in charge of managing and controlling the operations between the client and the CSPs. The DBMS therefore transparently rewrites the user's original request, creating a separate query for each CSP. Next, it generates the polynomial values, sends the user's queries to each cloud, reconstructs the result and finally sends it back to the client.

Figure 2. General Overview of Multi-clouds.
The use of multiple clouds requires our approach to deal with the heterogeneity of the interfaces of each cloud provider. An aspect that is especially important is the format of the data accepted by each cloud. Fortunately, there are already solutions on the market which address this problem [12], [13], [14] and unify the interfaces of several cloud providers by offering API abstractions in Java or C that give easy access to different cloud services.
B. General Data Flow Scenario
Next, we conduct a thorough analysis of the proposed model by looking at how users' queries can be executed through the model in a private and secure way. We emphasize that the security of data transfer through the network is not the aim of this paper; thus networks and security layers will not be modified. Moreover, we describe how the DBMS manages the data, divides it into multiple chunks and distributes it to different CSPs.
There are two different data flows in our model:
Client Data Procedure (see Figure 2). A user sends a query through an application via an HTTP request. Once the request reaches the DBMS, it is rewritten and sent to each CSP. After the result of the query is returned by each CSP, the DBMS reconstructs it and forwards it back to the client. As can be seen from Figure 3, the DBMS component includes the process of generating the polynomial functions. Further, data is sent to and stored directly at the endpoints (CSP1, CSP2, ..., CSPn), so there is no need to store the data at the DBMS layer.

Figure 3. Multi-clouds DB Data workflow.

Cloud Data Procedure. This scenario (see Figure 3) describes the data flow from the database management system to the multi-clouds (CSP1, CSP2, ..., CSPn) in our proposed model. First, the DBMS component receives a request from a user. Next, it divides the received data that the client wants to hide into n shares, or chunks. After this step, the database management system generates a random polynomial function of the same degree for each CSP and creates the shares, which are distributed to the defined cloud providers. These polynomials are not stored at the DBMS, but are generated when the query is received and when the response is reconstructed. For data retrieval, the user's query is forwarded to the DBMS, which splits the query into n individual requests, one for each CSP. Once the data is received from the multi-cloud, the DBMS computes the secret value from the incoming results and sends it to the requester.
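The two procedures can be sketched end to end with in-memory stand-ins for the cloud providers. Everything named here (the `InMemoryCSP` and `Dbms` classes, their methods, and the positions X) is a hypothetical illustration under the assumption t = 2; it is not the interface of any real provider or of the authors' implementation.

```python
import secrets
from fractions import Fraction


class InMemoryCSP:
    """Stand-in for one cloud provider: stores shares by record key."""

    def __init__(self):
        self.rows = {}

    def put(self, key, share):
        self.rows[key] = share

    def get(self, key):
        return self.rows[key]


class Dbms:
    """Splits values into shares (t = 2) and rebuilds them from any two CSPs."""

    def __init__(self, csps, positions):
        self.csps = csps
        self.positions = positions  # secret X, known only to the DBMS

    def store(self, key, value):
        a = secrets.randbelow(1000) + 1  # fresh random slope, never stored
        for csp, x in zip(self.csps, self.positions):
            csp.put(key, a * x + value)

    def retrieve(self, key, use=(0, 1)):
        i, j = use  # indices of the two CSPs that answered
        xi, xj = self.positions[i], self.positions[j]
        yi, yj = self.csps[i].get(key), self.csps[j].get(key)
        slope = Fraction(yj - yi, xj - xi)
        return int(yi - slope * xi)


if __name__ == "__main__":
    dbms = Dbms([InMemoryCSP() for _ in range(3)], positions=[2, 4, 1])
    dbms.store("student-1", 29)
    print(dbms.retrieve("student-1", use=(1, 2)))
```

In this sketch the DBMS keeps only the positions X; neither the clear value nor the polynomial is stored anywhere, and any two providers are enough to answer a retrieval.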
For instance, the rewritten query for CSP1 retrieves all students whose age is share(29, 1), where the secret value is the age 29 and the cloud service provider is CSP1. To find share(29, 1), the DBMS first generates the polynomial for the secret value (age 29) and evaluates it at the position assigned to the provider; in general, the share for CSPi is $q_{29}(x_i)$. After receiving the results from the CSPs, the DBMS computes the secret value and sends it to the client through a secured and private network. As already mentioned, the secret sharing mechanism can be applied to execute different types of queries, such as range, aggregation and exact match.
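For an exact-match query the DBMS must be able to recompute share(29, i) at query time. The paper does not say how the polynomial is regenerated; the sketch below assumes, purely for illustration, that the coefficient is derived from the value with a keyed hash (HMAC-SHA256) under a key held by the DBMS. The key, the table and column names in the query string, and the function names are all hypothetical.

```python
import hashlib
import hmac

DBMS_KEY = b"demo-key-held-by-the-dbms"  # hypothetical secret key
POSITIONS = {1: 2, 2: 4, 3: 1}            # CSP index -> x_i (secret X)


def coefficient_for(value: int) -> int:
    """Deterministic pseudo-random slope for `value` (an assumption, see above)."""
    digest = hmac.new(DBMS_KEY, str(value).encode(), hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big") % 1000 + 1


def share(value: int, csp_index: int) -> int:
    """share(value, i) = q_value(x_i) for a degree-one polynomial."""
    return coefficient_for(value) * POSITIONS[csp_index] + value


def rewrite_exact_match(value: int) -> dict[int, str]:
    """One rewritten query per CSP, with the clear value replaced by its share."""
    return {
        i: f"SELECT * FROM Student WHERE Age = {share(value, i)}"
        for i in POSITIONS
    }


if __name__ == "__main__":
    for i, query in rewrite_exact_match(29).items():
        print(f"CSP{i}: {query}")
```

Because the coefficient depends only on the value and the key, the same value always maps to the same share at a given CSP, which is what makes an equality comparison on the provider side possible in this sketch.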
C. What Makes Our Approach Different?
Our proposed multi-cloud database model differs from a single cloud service. In this section we show how the solution addresses the three security dimensions by using the techniques described above and by increasing the reliability of the system:
Data Integrity. This is one of the most important problems in the cloud security world. Data stored in the cloud may suffer damage due to service malfunction or due to attackers acting from inside or outside. In the past, CSPs such as Amazon faced data integrity problems which led to loss of data [15], [16] and disappointed customers. Moreover, the author of [17] claims that Amazon suffers from privacy issues inside its storage system, S3, and therefore suggests that users compute a digest of their data using the hash-based message authentication code (HMAC) mechanism [18] to make sure that modifications to the data can be detected. Our model, in contrast, uses multiple cloud providers and, moreover, Shamir's secret sharing mechanism, which makes it more powerful. Since data is distributed to multiple CSPs, an attacker who wants to learn the hidden secret needs to retrieve at least t out of n shares to be able to reveal the real value, which was converted and hidden before being stored in the multi-clouds. This approach depends on the value of t. If there are 5 shares stored in 5 clouds and t = 4, then knowledge of 3 or fewer shares leaves the secret unconstructible. We therefore conclude that our model is superior to single cloud service providers (e.g. Amazon) in addressing the issue of data integrity.
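The HMAC-based check suggested in [17] can be sketched with Python's standard `hmac` module. The key and payload below are placeholders, and the helper names are assumptions introduced for this example; how the key is stored and distributed is outside the scope of the sketch.

```python
import hashlib
import hmac


def compute_digest(key: bytes, data: bytes) -> str:
    """HMAC-SHA256 digest computed by the client before uploading `data`."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def is_unmodified(key: bytes, data: bytes, expected_digest: str) -> bool:
    """Recompute the digest after download and compare in constant time."""
    return hmac.compare_digest(compute_digest(key, data), expected_digest)


if __name__ == "__main__":
    key = b"client-side-secret"   # placeholder key
    original = b"Age=29"          # placeholder payload
    tag = compute_digest(key, original)
    print(is_unmodified(key, original, tag))    # data as uploaded
    print(is_unmodified(key, b"Age=92", tag))   # data altered in storage
```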
Data Confidentiality. Password hijacking and data intrusion represent another security risk that needs careful attention. For instance, if an attacker is able to gain access to a Microsoft Azure account, he will be able to control all of its instances and resources. Our model, which uses a multi-cloud approach, is much harder to compromise, since data is replicated among n different clouds; hackers need to retrieve shares from at least t of the n clouds to be able to reconstruct the secret. Distributing data to multiple clouds therefore considerably reduces the risk of data intrusion compared to storing data with a single cloud provider.
Service Availability. Another major issue faced by the cloud computing environment is service availability. Cloud providers such as Amazon [19], Microsoft [20] and Google [21] mention in their service agreements that their services may become unavailable. Thus, the user's service may crash at any time without prior notice. Moreover, since all this information is stipulated in the Service Level Agreement (SLA), there is no compensation if any damage occurs to a customer's web service. Our solution differs from all those mentioned above with regard to the risk of service unavailability or loss of data. In our case data is replicated to different cloud service providers; therefore, if one CSP fails, the user is still able to access the data through other cloud providers. In the worst case, if all the CSPs are down at the same time, then users might not be able to access their information, but the probability of such an event is very low.
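The claim that a simultaneous outage is unlikely can be made concrete under a simplifying assumption that is not stated in the paper: each CSP is unavailable independently and with the same probability. The probability values and function names below are hypothetical.

```python
from math import comb


def prob_all_down(p_down: float, n: int) -> float:
    """Probability that all n independent CSPs are down at the same time."""
    return p_down ** n


def prob_at_least_t_up(p_down: float, n: int, t: int) -> float:
    """Probability that at least t of n CSPs are reachable (binomial tail)."""
    p_up = 1.0 - p_down
    return sum(comb(n, k) * p_up**k * p_down**(n - k) for k in range(t, n + 1))


if __name__ == "__main__":
    print(prob_all_down(0.01, 3))
    print(prob_at_least_t_up(0.01, n=3, t=2))
```

With a (t, n) threshold scheme the second quantity is the relevant one, because a value can only be reconstructed while at least t providers are reachable.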
We therefore conclude that our multi-cloud database model has sufficiently strong features for addressing the problems of data integrity, confidentiality and availability, which makes it a better approach than relying on a single cloud service provider.

V. CONCLUSION

The benefits of cloud computing are clear: minimizing the risk of physical infrastructure deployment, reducing the cost of entry, reducing the execution and response time of applications, etc. Even though CC is extensively researched, security still represents its major issue. To this end, this paper focuses on the security aspects of the cloud and proposes a new model which uses multiple cloud service providers (CSP) and Shamir's secret sharing algorithm to prevent and overcome the shortcomings of a single cloud model.
The use of multi-sharing techniques is considered novel. It is proven to be more secure, harder to compromise and superior to encryption techniques, whose biggest limitation is the long time needed for encryption and decryption. The purpose of this model is to reduce the security risks which occur in cloud computing. We also address the issues related to data integrity, confidentiality and service availability, arguing why the multi-clouds model is superior to a single cloud by giving relevant scenarios in which the single cloud approach fails. It is therefore clear that storing data over multi-clouds is efficient, and with the tools and functionality available today, we strongly believe that there is no excuse for not going the multi-cloud route.

REFERENCES
[1] International Data Corporation: IDC's Cloud Computing and Datacenter Roadshow 2013. [online] http://idc-cema.com/eng/events/52888-idc-s-cloud-computing-and-datacenter-roadshow-2013, 2013. [cited: January-2014].
[2] A. Shamir, "How to Share a Secret", Communications of the ACM, pp. 612-613, New York, USA, November 1979.
[3] I. N. Bozkurt, K. Kaya and A. A. Selcuk, "Threshold Cryptography Based on Blakley Secret Sharing", In Proc. of Information Security and Cryptology, Ankara, Turkey, Dec. 2008.
[4] M. Alam and S. Banu, "An Approach Secret Sharing Algorithm in Cloud Computing Security over Single to Multi-Clouds", International Journal of Scientific and Research Publications, vol. 3, issue 4, April 2013.
[5] D. Agrawal, A. Abbadi, F. Emekci and A. Metwally, "Database Management as a Service: Challenges and Opportunities", Proceedings of the 2009 IEEE International Conference on Data Engineering, pp. 1709-1716, April 2009.
[6] A. Bessani, M. Correia, B. Quaresma, F. André and P. Sousa, "DepSky: dependable and secure storage in a cloud-of-clouds", EuroSys, pp. 31-46, 2011.
[7] M. AlZain and E. Pardede, "Using Multi Shares for Ensuring Privacy in Database-as-a-Service", Proceedings of (HICSS), IEEE, pp. 1-9, 2011.
[8] M. Vukolic, "The Byzantine empire in the intercloud", ACM SIGACT News, pp. 105-111, New York, September 2010.
[9] C. Cachin, I. Keidar and A. Shraer, "Trusting the cloud", ACM SIGACT News, pp. 81-86, 2009.
[10] K. D. Bowers, A. Juels and A. Oprea, "HAIL: A high-availability and integrity layer for cloud storage", ACM, pp. 187-198, 2009.
[11] M. AlZain, E. Pardede, B. Soh and J. Thom, "Cloud Computing Security: From Single to Multi-clouds", Proceedings of (HICSS), IEEE, pp. 5490-5499, Hawaii, 2012.
[12] Apache: jclouds, Cloud interfaces, simplified. [online] http://jclouds.apache.org/. [cited: January-2014].
[13] Zmanda: Libzcloud, Abstraction for cloud storage services. [online] http://zmanda.github.io/libzcloud/. [cited: January-2014].
[14] SMEStorage: smestorage, Multi-cloud storage provider. [online] http://code.google.com/p/smestorage/. [cited: January-2014].
[15] Oracle Inc: Amazon S3 Silent Data Corruption. [online] https://blogs.oracle.com/gbrunett/entry/amazon_s3_silent_data_corruption, 2009. [cited: January-2014].
[16] Dotan Horovits: AWS Outage - Moving from Multi-Availability-Zone to Multi-Cloud. [online] http://www.cloudifysource.org/2012/10/24/aws-outage-multi-availability-zone-multi-cloud.html, October 2012. [cited: January-2014].
[17] S. Garfinkel, "An evaluation of Amazon's grid computing services: EC2, S3 and SQS", Technical report, 2007.
[18] H. Krawczyk, M. Bellare and R. Canetti, "HMAC: Keyed-hashing for message authentication", RFC 2104, pp. 1-11, 1997.
[19] Amazon Inc.: Amazon EC2 Service Level Agreement, http://aws.amazon.com/ec2-sla/, 2014. [cited: January-2014].
[20] Microsoft Inc.: Service Level Agreements, http://www.windowsazure.com/en-us/support/legal/sla/, 2014. [cited: January-2014].
[21] Google Inc.: Google Compute Engine Service Level Agreement (SLA), https://developers.google.com/compute/sla, 2014. [cited: January-2014].
