Secure Mining of Association Rules in Horizontally Distributed Databases

ABSTRACT
We propose a protocol for secure mining of association rules in horizontally distributed databases. The current leading protocol is that of Kantarcioglu and Clifton [18]. Our protocol, like theirs, is based on the Fast Distributed Mining (FDM) algorithm of Cheung et al. [8], which is an unsecured distributed version of the Apriori algorithm. The main ingredients in our protocol are two novel secure multi-party algorithms: one that computes the union of private subsets that each of the interacting players holds, and another that tests the inclusion of an element held by one player in a subset held by another. Our protocol offers enhanced privacy with respect to the protocol in [18]. In addition, it is simpler and significantly more efficient in terms of communication rounds, communication cost and computational cost.

Existing System
The goal of mining association rules over distributed databases without disclosing the private data of each party defines a problem of secure multi-party computation. In such problems, there are M players that hold private inputs $x_1, \ldots, x_M$, and they wish to securely compute $y = f(x_1, \ldots, x_M)$ for some public function f. If there existed a trusted third party, the players could surrender their inputs to him, and he would perform the function evaluation and send them the resulting output. In the absence of such a trusted third party, the players need to devise a protocol that they can run on their own in order to arrive at the required output y. Such a protocol is considered perfectly secure if no player can learn from his view of the protocol more than what he would have learnt in the idealized setting where the computation is carried out by a trusted third party. Yao [32] was the first to propose a generic solution for this problem in the case of two players. Other generic solutions for the multi-party case were proposed later.

Proposed System
Herein we propose an alternative protocol for the secure computation of the union of private subsets. The proposed protocol improves upon that in [18] in terms of simplicity and efficiency as well as privacy. In particular, our protocol does not depend on commutative encryption and oblivious transfer (which simplifies it significantly and contributes towards much reduced communication and computational costs). While our solution is still not perfectly secure, it leaks excess information only to a small number (three) of possible coalitions, unlike the protocol of [18], which discloses information also to some single players. In addition, we claim that the excess

Contact: 040-40274843, 9703109334 Email id: academicliveprojects@gmail.com, .logicsystems.org.in

information that our protocol may leak is less sensitive than the excess information leaked by the protocol of [18]. The protocol that we propose here computes a parameterized family of functions, which we call threshold functions, in which the two extreme cases correspond to the problems of computing the union and intersection of private subsets. Those are in fact general-purpose protocols that can be used in other contexts as well. Another problem of secure multi-party computation that we solve here as part of our discussion is the set inclusion problem; namely, the problem where Alice holds a private subset of some ground set, Bob holds an element in the ground set, and they wish to determine whether Bob's element is within Alice's subset, without revealing to either of them information about the other party's input beyond the above described inclusion.

IMPLEMENTATION
Implementation is the stage of the project when the theoretical design is turned into a working system. Thus it can be considered the most critical stage in achieving a successful new system and in giving the user confidence that the new system will work and be effective. The implementation stage involves careful planning, investigation of the existing system and its constraints on implementation, design of methods to achieve changeover, and evaluation of changeover methods.

Modules:

1 Privacy Preserving Data Mining:
Previous work in privacy preserving data mining has considered two related settings: one in which the data owner and the data miner are two different entities, and another in which the data is distributed among several parties who aim to jointly perform data mining on the unified corpus of data that they hold. In the first setting, the goal is to protect the data records from the data miner. Hence, the data owner aims at anonymizing the data prior to its release. The main approach in this context is to apply data perturbation. The idea is that the perturbed data can be used to infer general trends in the data without revealing original record information.

[Figure: Computation and communication costs versus the number of transactions N]
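The data perturbation idea from the first setting can be illustrated with a small sketch. The text does not name a specific perturbation technique, so the example below assumes randomized response (each bit is kept with probability p and flipped otherwise); the function names, the retention probability and the data are all hypothetical.

```python
import random

def perturb(bits, p, rng):
    """Keep each original bit with probability p, otherwise flip it."""
    return [b if rng.random() < p else 1 - b for b in bits]

def estimate_frequency(perturbed, p):
    """Estimate the fraction of 1s in the original data from perturbed bits.

    If f is the true fraction, the expected observed fraction is
    p*f + (1-p)*(1-f), which is solved here for f.
    """
    observed = sum(perturbed) / len(perturbed)
    return (observed - (1 - p)) / (2 * p - 1)

rng = random.Random(0)                      # fixed seed, for repeatability only
original = [1] * 3000 + [0] * 7000          # hypothetical: 30% of records contain the item
released = perturb(original, p=0.8, rng=rng)
estimate = estimate_frequency(released, p=0.8)
print(round(estimate, 3))                   # close to 0.3, while single records stay uncertain
```

The data miner sees only `released`, in which any individual bit may have been flipped, yet the aggregate frequency can still be recovered approximately.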

In the second setting, the goal is to perform data mining while protecting the data records of each of the data owners from the other data owners. This is a problem of secure multi-party computation. The usual approach here is cryptographic rather than probabilistic.

Lindell and Pinkas showed how to securely build an ID3 decision tree when the training set is distributed horizontally. Lin et al. discussed secure clustering using the EM algorithm over horizontally distributed data. The problem of distributed association rule mining was studied in the vertical setting, where each party holds a different set of attributes, and in [18] in the horizontal setting. The work of [26] also considered this problem in the horizontal setting, but in large-scale systems in which, on top of the parties that hold the data records (resources), there are also managers which

2 Distributed Computation:
We compared the performance of two secure implementations of the FDM algorithm. In the first implementation (denoted FDM-KC), we executed the unification step using Protocol UNIFI-KC, where the commutative cipher was 1024-bit RSA; in the second implementation (denoted FDM), we used our Protocol UNIFI, where the keyed-hash function was HMAC [4]. In both implementations, we implemented Step 5 of the FDM algorithm in the secure manner that is described later. We tested the two implementations with respect to three measures:
1) Total computation time of the complete protocols (FDM-KC and FDM) over all players. That measure includes the Apriori computation time and the time to identify the globally s-frequent itemsets, as described later.
2) Total computation time of the unification protocols only (UNIFI-KC and UNIFI) over all players.
3) Total message size.
We ran three experiment sets, where each set tested the dependence of the above measures on a different parameter: N, the number of transactions in the unified database.

3 Frequent Itemsets:
We describe here the solution that was proposed by Kantarcioglu and Clifton. They considered two possible settings. If the required output includes all globally s-frequent itemsets, as well as the sizes of their supports, then the values of $\Delta(x)$ can be revealed for all $x \in C_s^k$. In such a case, those values may be computed using a secure summation protocol (e.g. [6]), where the private addend of $P_m$ is $\mathrm{supp}_m(x) - sN_m$. The more interesting setting, however, is the one where the support sizes are not part of the required output. We proceed to discuss it.
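For the first setting above, where the sums may be revealed, a secure summation can be sketched with additive shares modulo q. This is a generic illustration simulated in a single process, not necessarily the protocol of [6]; the helper names `share` and `secure_sum`, the choice q = 2N + 1 for this sum, and the sample addends are assumptions made for the example.

```python
import secrets

def share(value, num_players, q):
    """Split value into num_players random additive shares modulo q."""
    shares = [secrets.randbelow(q) for _ in range(num_players - 1)]
    shares.append((value - sum(shares)) % q)
    return shares

def secure_sum(addends, q):
    """Simulate the players summing their private addends modulo q.

    Player m deals one share of its addend to every player; each player
    publishes only the sum of the shares it received, so no single
    published value reveals an individual addend.
    """
    M = len(addends)
    dealt = [share(a, M, q) for a in addends]          # dealt[m][j]: share from m to j
    partials = [sum(dealt[m][j] for m in range(M)) % q for j in range(M)]
    total = sum(partials) % q
    # Map the residue back to the signed range [-(q-1)/2, (q-1)/2].
    return total if total <= q // 2 else total - q

# Hypothetical data: N = 100 transactions split as N_m = 30, 30, 40, threshold s = 0.1,
# local supports 5, 2, 6, so the private addends supp_m(x) - s*N_m are 2, -1, 2.
N = 100
result = secure_sum([2, -1, 2], q=2 * N + 1)
print(result)  # 3, so x is globally s-frequent
```

Working modulo an odd q larger than twice the largest possible absolute value of the sum is what allows negative totals to be recovered correctly.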

4 Association Rules:

Once the set $F_s$ of all s-frequent itemsets is found, we may proceed to look for all (s, c)-association rules (rules with support at least sN and confidence at least c), as described in [18]. For $X, Y \in F_s$, where $X \cap Y = \emptyset$, the corresponding association rule $X \Rightarrow Y$ has confidence at least c if and only if $\mathrm{supp}(X \cup Y)/\mathrm{supp}(X) \geq c$, or, equivalently,

$$C_{X,Y} := \sum_{m=1}^{M} \left( \mathrm{supp}_m(X \cup Y) - c \cdot \mathrm{supp}_m(X) \right) \geq 0. \qquad (10)$$

Since $|C_{X,Y}| \leq N$, then by taking $q = 2N + 1$, the players can verify inequality (10) in parallel for all candidate association rules, as described in Section 3. In order to derive from $F_s$ all (s, c)-association rules in an efficient manner, we rely upon the following straightforward lemma.
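The equivalence behind inequality (10) can be checked with a short, non-secure sketch: each player contributes only its local term, and the sign of the sum decides the rule. In the actual protocol the sum would be evaluated securely; the function name `rule_holds` and the sample supports below are hypothetical.

```python
from fractions import Fraction

def rule_holds(supp_xy, supp_x, c):
    """Check inequality (10) for the rule X => Y.

    supp_xy[m] and supp_x[m] are player m's local supports of X ∪ Y and X.
    Exact rational arithmetic avoids rounding errors near the threshold.
    """
    c = Fraction(c)
    C = sum(sxy - c * sx for sxy, sx in zip(supp_xy, supp_x))
    return C >= 0

# Hypothetical local supports for M = 3 players.
supp_xy = [8, 3, 5]    # global supp(X ∪ Y) = 16
supp_x = [10, 6, 4]    # global supp(X) = 20, so the confidence is 16/20 = 0.8

print(rule_holds(supp_xy, supp_x, Fraction(3, 4)))   # True:  16 - 15 = 1 >= 0
print(rule_holds(supp_xy, supp_x, Fraction(9, 10)))  # False: 16 - 18 = -2 < 0
```

Note that an individual player's term may be negative (here the second player's term is 3 - 4.5 when c = 3/4) even though the rule holds globally, which is why only the sum is meaningful.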
