
2015 1st International conference on futuristic trend in computational analysis and knowledge management (ABLAZE 2015)

Privacy Preservation in Vertical Partitioned Medical Database in the Cloud Environments
Raghvendra Kumar1, Yogesh Sharma2, Prashant Kumar Pattnaik*3

Jodhpur National University
Jodhpur, Rajasthan, India
*KIIT University
Bhubaneswar, India
1raghvendraagrawal7@gmail.com, 2yogeshsharma@gmail.com, *3patnaikprasant@gmail.com

Abstract: In this paper we introduce the concept of cloud computing with the help of secure multi-party computation (SMC). Secure multi-party computation plays a crucial role in the field of distributed databases, where the number of parties or resources is greater than or equal to two. In secure multi-party computation each party calculates the common function and sends the result to the next party present in the database environment. Secure multi-party computation is used to increase performance and efficiency. Cloud computing is a newly emerging technology in the field of information and communication technology that uses the high-speed infrastructure and services provided by a cloud service provider. In this paper we implement and study the concept of a vertically partitioned resource database in the cloud environment, where every party holds its own private resources and wants to communicate with the others without disclosing its private information to the other resources present in the secure multi-party cloud-based environment.

Keywords: Distributed database; Cloud computing; SMC; Association Rule Mining; Ring topology.

I. INTRODUCTION
In the new era of computer science and communication technology, people depend fully on information sharing systems. Each user wants to share private and public information with the other users present in the environment. A user who shares private information does not want to disclose it to anyone else, and also wants a very high-speed communication medium. In this paper we consider the example of a medical database that is vertically partitioned in a cloud environment. To mine such a large database we use data mining, which extracts useful information from big databases such as a medical database. A distributed database [1] [2] [3] divides the whole database in a distributed manner, and each site or party holds a subset of the dataset. There are three main types of distributed database [5]. The first is the horizontally partitioned database, where the division is made on subsets of the transactions. The second is the vertically partitioned database, where the division is made on subsets of the attributes of the original database. The last is the hybrid or mixed partitioned database, which divides the database first horizontally and then vertically, or vice versa. Secure multi-party computation applies when the number of parties is greater than two; each party computes the common function and calculates its result without disclosing it to the other parties present in the database environment. There are two types of model in which secure multi-party computation plays a crucial role: the Real model and the Ideal model.

II. CLOUD COMPUTING
Cloud computing [6] [7] [8] is a newly developing and much-discussed topic among the individuals and business organizations that use and research the newest trends in Information and Communication Technology (ICT). Some of the leading IT companies in the world, such as IBM, Google, Yahoo and Amazon, have already developed large-scale cloud systems that provide various types of IT services through the cloud. The term cloud is analogous to the Internet, so cloud computing can be visualized as computing over the Internet. More precisely, it is a set of resources and facilities offered via the Internet. Cloud architecture consists of a large number of shared servers distributed all over the world, providing software, infrastructure, platform, devices and other required resources and hosting to subscribers on a "pay as you use it basis". The services provided over a cloud are categorized into three service models, depending on the type of resources allocated by the cloud service provider to the customers:
1. Software as a Service (SaaS)
2. Platform as a Service (PaaS)
3. Infrastructure as a Service (IaaS)

III. MOTIVATION
Even though secure multi-party computation algorithms can keep user data private even at computation time, most existing multi-party computation approaches incur a lot of communication overhead [5]. There has therefore been growing interest among researchers in outsourcing the computations to a Cloud Service Provider, which in turn reduces expenditure and operational overhead while improving efficiency and speed. However, outsourcing data to clouds for computational purposes may
978-1-4799-8433-6/15/$31.00 ©2015 IEEE


lead to privacy issues because the data is handled by an untrusted external entity. It is clear that users simply cannot send private data to the cloud as raw data. This leads to the interesting topic of "how to do the computations on encrypted data" for finding the global results. If this is achieved, users can send encrypted data to the cloud environment.

IV. PROBLEM DEFINITION
The problem is association rule mining in a distributed cloud computing environment. When the database exists across multiple resources present in the environment, it is very difficult to find the global result/relationship [9] [10] among the attributes.

V. PROPOSED WORK
In this paper we propose a conceptual framework for secure multi-party computation in cloud-based environments. The model divides the centralized database into a distributed database [3] [4] that is vertically partitioned, with the number of parties/resources greater than or equal to three. Each party holds its own vertical partition, and every resource selects its own random number to provide privacy for the partitioned database.

Algorithm
Input: Vertically partitioned medical database DB1, DB2, ..., DBn.
Process: Apply the privacy and rule mining algorithm.
Output: Global relationship between the attributes.
1. Divide the database into the vertically partitioned databases DB1, DB2, ..., DBn.
2. Each party/resource holds its own partitioned database.
3. Each party/resource selects its own random number R1, R2, ..., Rn.
4. The parties/resources are arranged in the cloud-based environment, where all the resources are connected to the cloud server.
5. Every user is able to send/store information in the database.
6. The cloud analyzer analyzes the result on the basis of the relationships between attributes.
7. To find the relationships, use the rule mining algorithm.
8. First, every party/resource calculates the support value of its attributes using the formula
   Support(X → Y) = (number of transactions containing X and Y) / (total number of transactions)
9. Each party/resource calculates the confidence of its attributes using the formula
   Confidence(X → Y) = Support(X → Y) / Support(X)
10. Every party/resource calculates its partial confidence using the formula
   Partial confidence = Support(X → Y) - Minimum Confidence × Support(X) + |DB| + Ri
11. The analyzer calculates the global confidence of the rule using the formula
   Global confidence = Partial confidence (PC) of the parties - Random numbers (RN)
12. The cloud analyzer then broadcasts the global confidence to all the parties/resources present in the cloud database environment.

In the proposed scenario we take the medical database, which is partitioned vertically so that each party/resource holds its own database. After the attributes are partitioned, every attribute of the original database is included in each partition, and the value of any attribute a party does not hold is set to zero, so the number of attributes in the original and modified medical databases is the same. Each partitioned database then finds its rules using the association rule mining algorithm [11]. In the figure there are N cloud users, each of whom wants to share or store information in the database. The users first send their information to the proxy server, the proxy server sends it to the cloud server, and the cloud server finally sends the information to the resources present in the cloud environment. Random numbers are used to increase both the speed of access to the database and the privacy of the resources. In this way all the parties are able to share their private information with the database without interruption. Figure 1 shows the cloud distributed database framework model. In this paper we consider three cloud resources/parties. Table 1 lists the attributes of the medical database and their information. Table 2 contains the original central database. Table 3 contains the vertically partitioned database for Resource1/Party1, Table 4 contains the modified vertical database after the number of attributes is made the same as in the original database, and Table 5 contains the information after the decimal values are converted into 1/0 format, which eases calculation on the vertical database. Table 6 contains the vertically partitioned medical database for Resource2/Party2, Table 7 contains the database for Resource2/Party2 after the number of attributes is made the same as in the original database, and Table 8 is the modified database for Resource2/Party2 after substituting the approximate values 1/0. Table 9 contains the vertically partitioned medical database for Resource3/Party3, Table 10 contains the database for Resource3/Party3 after the number of attributes is made the same as in the original database, and Table 11 is the modified database for Resource3/Party3 after substituting the approximate values 1/0.
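Steps 10 and 11 of the algorithm can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and the `received` argument, which passes the running value from one party to the next, is an assumption inferred from the ring topology named in the keywords. The inputs are the figures the paper later uses for its three parties (|DB| = 24, minimum confidence 0.8, random numbers 1, 2 and 3).

```python
def partial_confidence(support_xy, support_x, min_conf, db_size,
                       random_number, received=0.0):
    """Step 10: mask a party's local value with |DB| and its random number.

    `received` is the running total handed over by the previous party
    (an assumption of this sketch; the first party starts from 0).
    """
    return support_xy - min_conf * support_x + db_size + random_number + received


def global_confidence(final_partial, random_numbers):
    """Step 11: the analyzer subtracts the random numbers from the final total."""
    return final_partial - sum(random_numbers)


MIN_CONF = 0.8
DB_SIZE = 24
# (support count of X and Y, support count of X, random number) for P1, P2, P3
parties = [(11, 13, 1), (2, 2, 2), (6, 10, 3)]

running = 0.0
partials = []
for support_xy, support_x, r in parties:
    running = partial_confidence(support_xy, support_x, MIN_CONF, DB_SIZE, r, running)
    partials.append(running)

result = global_confidence(partials[-1], [r for _, _, r in parties])
print([round(p, 1) for p in partials], round(result, 1))
```

Under these assumptions the running values are 25.6, 52 and 77, and the analyzer recovers 71 after removing the random numbers.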

Figure 1: Cloud distributed database framework model.

Table 1: List of attributes for the medical database

Table 2: Main central database

The database is divided vertically, and Party 1 holds the following dataset.

Table 3: Vertically partitioned database for Party P1/Resource R1

The number of attributes is kept the same as in the original database; where the value of an attribute is not given, it is taken as zero.

Table 4: Modified vertical database after the number of attributes is made the same as in the original database

An Age value greater than or equal to 35 is considered as 1, otherwise 0; Male = 1 and Female = 0; a CP value greater than or equal to 3 is considered as 1, otherwise 0; a Blood Pressure value greater than or equal to 130 is considered as 1, otherwise 0; a Cholesterol value greater than 250 is considered as 1, otherwise 0.

Table 5: Information after converting the decimal values into 1/0 format, for ease of calculation on the vertical database
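The 1/0 coding of Party 1's attributes, together with the zero-filling of attributes a party does not hold, can be sketched as follows. The thresholds and attribute names come from the text; the function name, the dictionary layout and the sample record are hypothetical and serve only as an illustration.

```python
# All attributes of the original database, as named in the paper.
ALL_ATTRIBUTES = [
    "Age", "Sex", "CP", "Blood Pressure", "Cholesterol",
    "Fasting Blood Sugar", "Resting ECG", "Tallish", "Induced Angina",
    "Old Peak", "Slope", "CA", "Thai", "Concept Class",
]

# Coding rules for the attributes held by Party 1.
PARTY1_RULES = {
    "Age": lambda v: v >= 35,
    "Sex": lambda v: v == "Male",
    "CP": lambda v: v >= 3,
    "Blood Pressure": lambda v: v >= 130,
    "Cholesterol": lambda v: v > 250,
}


def binarize(record, rules, all_attributes):
    """Return a 1/0 row over every attribute of the original database.

    Attributes the party does not hold (or has no rule for) are set to 0.
    """
    row = {}
    for name in all_attributes:
        if name in record and name in rules:
            row[name] = int(rules[name](record[name]))
        else:
            row[name] = 0
    return row


# Hypothetical patient record for Party 1 (not taken from the paper's tables).
sample = {"Age": 63, "Sex": "Male", "CP": 1, "Blood Pressure": 145, "Cholesterol": 233}
row = binarize(sample, PARTY1_RULES, ALL_ATTRIBUTES)
print(row)
```

Every party produces rows with the same 14 columns, which is what lets the partitions be compared attribute by attribute.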

The support counts are Age = 15, Sex = 13, CP = 14, Blood Pressure = 11 and Cholesterol = 3; the remaining attributes have a support count of zero. With 24 transactions, Support(Age) = 15/24 = 0.625, Support(Sex) = 13/24 = 0.541, Support(CP) = 14/24 = 0.583, Support(Blood Pressure) = 11/24 = 0.458 and Support(Cholesterol) = 3/24 = 0.125. As percentages, the support of Age is 62.5, of Sex 54.1, of CP 58.3, of Blood Pressure 45.8 and of Cholesterol 12.5. Let the minimum support be 40% and select the attributes whose support is greater than or equal to 40%; the items selected from the first database table are {Age, Sex, CP, Blood Pressure}.

Next, find the support counts of the attribute pairs in the database: Age and Sex = 8, Age and CP = 10, Age and Blood Pressure = 8, Age and Cholesterol = 3, Sex and CP = 11, Sex and Blood Pressure = 5, Sex and Cholesterol = 2, CP and Blood Pressure = 6, CP and Cholesterol = 2, Blood Pressure and Cholesterol = 0. The confidence of a rule X → Y is Confidence(X → Y) = Support(X → Y) / Support(X), therefore:

confidence(Age → Sex) = Support(Age → Sex) / Support(Age) = 8/15 = 0.533
confidence(Age → CP) = Support(Age → CP) / Support(Age) = 10/15 = 0.66
confidence(Age → Blood Pressure) = Support(Age → Blood Pressure) / Support(Age) = 8/15 = 0.533
confidence(Age → Cholesterol) = Support(Age → Cholesterol) / Support(Age) = 3/15 = 0.200
confidence(Sex → CP) = Support(Sex → CP) / Support(Sex) = 11/13 = 0.846
confidence(Sex → Blood Pressure) = Support(Sex → Blood Pressure) / Support(Sex) = 5/13 = 0.384
confidence(Sex → Cholesterol) = Support(Sex → Cholesterol) / Support(Sex) = 2/13 = 0.153
confidence(CP → Blood Pressure) = Support(CP → Blood Pressure) / Support(CP) = 6/14 = 0.428
confidence(CP → Cholesterol) = Support(CP → Cholesterol) / Support(CP) = 2/14 = 0.142
confidence(Blood Pressure → Cholesterol) = Support(Blood Pressure → Cholesterol) / Support(Blood Pressure) = 0/11 = 0.00

Let the minimum confidence be 80% and select the rules whose confidence is greater than or equal to it. On these figures only Sex → CP (0.846) reaches the threshold. The example carries the three attributes {Sex, CP, Blood Pressure} forward and forms the rule confidence({Sex, CP} → Blood Pressure) = Support({Sex, CP} → Blood Pressure) / Support({Sex, CP}) = 4/11 = 0.363, which is below the minimum confidence. The rules taken forward for Party 1 are {Sex → CP, Sex → Blood Pressure}.

Table 6: Party P2 has the following dataset

Table 7: Modified database for Party 2

Fasting Blood Sugar value 5 is considered as 1, otherwise 0; a Resting ECG value greater than or equal to 5 is considered as 1, otherwise 0; a Tallish value greater than or equal to 150 is considered as 1, otherwise 0; an Induced Angina value greater than or equal to 5 is considered as 1, otherwise 0; an Old Peak value greater than or equal to 5 is considered as 1, otherwise 0.

Table 8: Modified database for Party 2 after putting the values

The support counts are Fasting Blood Sugar = 2, Resting ECG = 8, Tallish = 16, Induced Angina = 9 and Old Peak = 10; the remaining attributes have a support count of zero. Support(Fasting Blood Sugar) = 2/24 = 0.083, Support(Resting ECG) = 8/24 = 0.33, Support(Tallish) = 16/24 = 0.66, Support(Induced Angina) = 9/24 = 0.375 and Support(Old Peak) = 10/24 = 0.41. As percentages, the support of Fasting Blood Sugar is 8.3, of Resting ECG 33, of Tallish 66.6, of Induced Angina 37.5 and of Old Peak 41. Let the minimum support be 40%. Tallish and Old Peak meet this threshold; the example also carries Fasting Blood Sugar forward, so the items selected from this database table are {Fasting Blood Sugar, Tallish, Old Peak}.

The support counts of the attribute pairs are Fasting Blood Sugar and Tallish = 2, Fasting Blood Sugar and Old Peak = 2, Tallish and Old Peak = 0. With Confidence(X → Y) = Support(X → Y) / Support(X):

confidence(Fasting Blood Sugar → Tallish) = Support(Fasting Blood Sugar → Tallish) / Support(Fasting Blood Sugar) = 2/2 = 1.00
confidence(Fasting Blood Sugar → Old Peak) = Support(Fasting Blood Sugar → Old Peak) / Support(Fasting Blood Sugar) = 2/2 = 1.00
confidence(Tallish → Old Peak) = Support(Tallish → Old Peak) / Support(Tallish) = 0/16 = 0.00

Let the minimum confidence be 80% and select the rules whose confidence is greater than or equal to it. Combining the three attributes as {{Fasting Blood Sugar, Tallish}, Old Peak} gives confidence({Fasting Blood Sugar, Tallish} → Old Peak) = Support({Fasting Blood Sugar, Tallish} → Old Peak) / Support({Fasting Blood Sugar, Tallish}) = 0/2 = 0.00.
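The support and confidence arithmetic for a single party can be checked with a short script. The sketch below uses the Party 1 counts quoted above (24 transactions, minimum support 40%, minimum confidence 80%); the helper names and the dictionary layout are assumptions made for illustration.

```python
TOTAL = 24
MIN_SUPPORT = 0.40
MIN_CONFIDENCE = 0.80

# Support counts of single attributes and of attribute pairs for Party 1.
item_counts = {"Age": 15, "Sex": 13, "CP": 14, "Blood Pressure": 11, "Cholesterol": 3}
pair_counts = {
    ("Age", "Sex"): 8, ("Age", "CP"): 10, ("Age", "Blood Pressure"): 8,
    ("Age", "Cholesterol"): 3, ("Sex", "CP"): 11, ("Sex", "Blood Pressure"): 5,
    ("Sex", "Cholesterol"): 2, ("CP", "Blood Pressure"): 6,
    ("CP", "Cholesterol"): 2, ("Blood Pressure", "Cholesterol"): 0,
}


def support(count, total=TOTAL):
    """Fraction of transactions that contain the item (or item pair)."""
    return count / total


def confidence(pair, pair_counts, item_counts):
    """Confidence(X -> Y) = count(X and Y) / count(X)."""
    antecedent, _ = pair
    return pair_counts[pair] / item_counts[antecedent]


frequent_items = [a for a, c in item_counts.items() if support(c) >= MIN_SUPPORT]
confidences = {p: confidence(p, pair_counts, item_counts) for p in pair_counts}
strong_rules = [p for p, c in confidences.items() if c >= MIN_CONFIDENCE]

print("frequent items:", frequent_items)
for (x, y), c in confidences.items():
    print(f"{x} -> {y}: {c:.3f}")
print("rules meeting the minimum confidence:", strong_rules)
```

The same two helpers apply unchanged to the counts of Party 2 and Party 3.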

Table 9: Party P3 has the following dataset

Table 10: Database for Party 3

A Slope value greater than or equal to 5 is considered as 1, otherwise 0; a CA value greater than or equal to 5 is considered as 1, otherwise 0; a Thai value greater than or equal to 5 is considered as 1, otherwise 0; a Concept Class value greater than or equal to 5 is considered as 1, otherwise 0.

Table 11: Database for Party 3 after putting the approximate values

The support counts are Slope = 8, CA = 10, Thai = 13 and Concept Class = 11; the remaining attributes have a support count of zero. Support(Slope) = 8/24 = 0.33, Support(CA) = 10/24 = 0.416, Support(Thai) = 13/24 = 0.541 and Support(Concept Class) = 11/24 = 0.458. Let the minimum support be 40% and select the attributes whose support is greater than or equal to 40%; the items selected from this database table are {CA, Thai, Concept Class}. The support counts of the attribute pairs are CA and Thai = 6, CA and Concept Class = 4. With Confidence(X → Y) = Support(X → Y) / Support(X):

confidence(CA → Thai) = Support(CA → Thai) / Support(CA) = 6/10 = 0.60
confidence(CA → Concept Class) = Support(CA → Concept Class) / Support(CA) = 4/10 = 0.40

Let the minimum confidence be 80%. Combining the three attributes {CA, Thai, Concept Class} gives confidence({CA, Thai} → Concept Class) = Support({CA, Thai} → Concept Class) / Support({CA, Thai}) = 3/6 = 0.50, which is less than the minimum confidence used in this paper.

To calculate the partial confidence of each party, start from the requirement Confidence = Support(X → Y) / Support(X) ≥ Minimum Confidence. Rearranging gives Support(X → Y) ≥ Minimum Confidence × Support(X), and hence Support(X → Y) - Minimum Confidence × Support(X) ≥ 0. The partial confidence is then

Partial confidence = Support(X → Y) - Minimum Confidence × Support(X) + |DB| + Ri

The rules carried forward by the three parties are {Sex → CP, Sex → Blood Pressure} for P1, {Fasting Blood Sugar → Tallish, Fasting Blood Sugar → Old Peak, Tallish → Old Peak} for P2 and {CA → Thai, CA → Concept Class} for P3. Select one relation from this list and calculate the partial confidence, starting with {Sex → CP} for party P1, and let the random numbers of parties P1, P2 and P3 be 1, 2 and 3. Each party after the first adds the partial confidence received from the previous party (25.6 for P2 and 52 for P3):

Partial confidence P1 {Sex → CP} = Support(Sex → CP) - Minimum Confidence × Support(Sex) + |DB| + R1 = 11 - 0.8 × 13 + 24 + 1 = 25.6
Partial confidence P2 {Fasting Blood Sugar → Tallish} = Support(Fasting Blood Sugar → Tallish) - Minimum Confidence × Support(Fasting Blood Sugar) + |DB| + R2 + 25.6 = 2 - 0.8 × 2 + 24 + 2 + 25.6 = 52
Partial confidence P3 {CA → Thai} = Support(CA → Thai) - Minimum Confidence × Support(CA) + |DB| + R3 + 52 = 6 - 0.8 × 10 + 24 + 3 + 52 = 77

The administrator then selects a relation, here {Sex → CP}, and calculates its global confidence with the formula Global confidence = Partial confidence (PC) of the parties - Random numbers (RN), which gives 77 - (1 + 2 + 3) = 71%. This is less than the minimum confidence, so the relation is globally infrequent, although it may be frequent for individual parties.

VI. CONCLUSION
In this paper we have presented secure multi-party computation in a cloud-based environment where every user wants to share private information with the other users present in the environment but does not want to disclose that information to others. In this scenario we used secure multi-party computation, in which every resource selects the common function and calculates its result. To provide the highest privacy to the medical database we used random numbers, so the model is able to provide privacy, efficiency and speed to the medical database. In future work we will increase the privacy, efficiency and speed of the database as the number of users increases, aim for zero data leakage, and reduce the time and space complexity of the database.

REFERENCES
[1] N. Maheshwari and K. Kiyawat, "Structural Framing of Protocol for Secure Multiparty Cloud Computation," in Proceedings of 5th Asia Modelling Symposium, IEEE, Kuala Lumpur, Malaysia, May 2011.
[2] A. C. Yao, "Protocols for Secure Computations," in Proceedings of Annual Symposium on Foundations of Computer Science, vol. 0, pp. 160-164, Nov. 1982.
[3] R. Oppliger, Contemporary Cryptography, Artech House Computer Security Library, Norwood, 2005.
[4] F. Shaikh and S. Haider, "Security Threats in Cloud Computing," in Proceedings of International Conference for Internet Technology and Secured Transactions (ICITST), IEEE, Abu Dhabi, UAE, Dec. 2011.
[5] Q. Ma, L. Xiao, I.-L. Yen, M. Tu, and F. Bastani, "An Adaptive Multiparty Protocol for Secure Data Protection," in Proceedings of 11th International Conference on Parallel and Distributed Systems, IEEE, Fukuoka, Japan, July 2005.
[6] S. Chakraborty, S. Sehgal, and A. Pal, "Privacy Preserving E-negotiation Protocols based on Secure Multi-party Computation," in Proceedings of SoutheastCon., IEEE, Fort Lauderdale, USA, April 2005.
[7] S. Bleikertz, M. Schunter, C. W. Probst, D. Pendarakis, and K. Eriksson, "Security Audits of Multi-tier Virtual Infrastructures in Public Infrastructure Clouds," in Proceedings of the Workshop on Cloud Computing Security Workshop (CCSW), ACM, New York, USA, Oct. 2010.
[8] S. Pearson, Y. Shen, and M. Mowbray, "A Privacy Manager for Cloud Computing," in Proceedings of the 1st International Conference on Cloud Computing, Springer-Verlag, Berlin, Germany, 2009.
[9] M. Mowbray and S. Pearson, "A Client-based Privacy Manager for Cloud Computing," in Proceedings of the 4th International ICST Conference on Communication System Software and Middleware, ACM, New York, USA, Jun. 2009.
[10] D. Mishra and M. Chandwani, "Anonymity Enabled Secure Multi-party Computation for Indian BPO," in Proceedings of Region 10 Conference - TENCON, IEEE, Taipei, Republic of China, Nov. 2007.
[11] J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann, San Francisco, 2006.
