A PROJECT REPORT
ON
Fuzzy Keyword Search over Encrypted Data in Cloud Computing
SUBMITTED TO
UNIVERSITY OF MUMBAI
BY
VENNELA KUMARARAJA
M.Sc. (COMPUTER SCIENCE)
2013-2014
______________
Prof. In-Charge
___________________
Head of the Department
____________
Examiner
College Seal
ACKNOWLEDGEMENT
No project is ever complete without the guidance of those experts who have already trodden this
path before and hence become masters of it and, as a result, our leaders. So we would like to take
this opportunity to thank all those individuals who have helped us in visualizing this project.
We express our deep gratitude to our project guide Mrs. for providing timely assistance with
our queries and for the guidance that she gave owing to her many years of experience in this field.
She has indeed been a lighthouse for us in this journey.
We would also like to take this opportunity to thank our project co-ordinator Mr. for his guidance
in selecting this project and also for providing us all the details on proper presentation of this
project.
We extend our sincere appreciation to all our Professors from COLLEGE OF
ENGINEERING for their valuable insights and tips during the designing of the project. Their
contributions have been valuable in so many ways that we find it difficult to acknowledge each of
them individually.
We are also grateful to our HOD Mrs. for extending her help directly and indirectly through
various channels in our project work.
Thanking You,
________________
ABSTRACT
As Cloud Computing becomes prevalent, more and more
sensitive information is being centralized into the cloud. For the
protection of data privacy, sensitive data usually have to be encrypted
before outsourcing, which makes effective data utilization a very
challenging task. Although traditional searchable encryption schemes
allow a user to securely search over encrypted data through keywords
and selectively retrieve files of interest, these techniques support only
exact keyword search. That is, there is no tolerance of minor typos and
format inconsistencies which, on the other hand, are typical user
searching behavior and happen very frequently. This significant
drawback makes existing techniques unsuitable in Cloud Computing as
it greatly affects system usability, rendering user searching experiences
very frustrating and system efficacy very low. In this paper, for the first
time we formalize and solve the problem of effective fuzzy keyword
search over encrypted cloud data while maintaining keyword privacy.
Fuzzy keyword search greatly enhances system usability by returning
the matching files when users' searching inputs exactly match the
predefined keywords, or the closest possible matching files based on
keyword similarity semantics when exact match fails. In our solution,
we exploit edit distance to quantify keyword similarity and develop an
INDEX
SR. NO.   TITLE
1)        INTRODUCTION
2)        LITERATURE SURVEY
3)        PROBLEM DEFINITION
4)        REQUIREMENT ANALYSIS
5)
6)
7)
8)
9)        FUTURE MODIFICATIONS
10)       APPLICATION
11)       BIBLIOGRAPHY
12)       SCREENSHOTS
13)       SOURCE CODE
INTRODUCTION
As Cloud Computing becomes prevalent, more and more sensitive
information, such as emails, personal health records, and government
documents, is being centralized into the cloud. By storing their data in
the cloud, data owners are relieved of the burden of data storage and
maintenance and can enjoy on-demand, high-quality data storage
service. However, because data owners and the cloud server are not in
the same trusted domain, the outsourced data may be at risk, as the
cloud server may no longer be fully trusted. It follows that sensitive data
usually should be encrypted prior to outsourcing for data privacy and
to combat unsolicited access.

However, data encryption makes effective data utilization a very
challenging task, given that there could be a large number of outsourced
data files. Moreover, in Cloud Computing, data owners may share their
outsourced data with a large number of users. Individual users might
want to retrieve only the specific data files they are interested in during
a given session. One of the most popular ways to do so is to selectively
retrieve files through keyword-based search instead of retrieving all the
encrypted files, which is completely impractical in cloud computing
scenarios. Such keyword-based search allows users to selectively
retrieve files of interest and has been widely applied in plaintext search
scenarios, such as Google search [1]. Unfortunately, data encryption
restricts users' ability to perform keyword search and thus makes the
traditional plaintext search methods unsuitable for Cloud Computing.
Besides this, data encryption also demands the protection of keyword
privacy, since keywords usually contain important information related
to the data files. Although encryption of keywords can protect keyword
privacy, it further renders the traditional plaintext search techniques
useless in this scenario.

To securely search over encrypted data, searchable encryption
techniques have been developed in recent years [2]-[10]. Searchable
encryption schemes usually build an index for each keyword of
interest and associate the index with the files that contain the keyword.
By integrating the trapdoors of keywords within the index information,
effective keyword search can be realized while both file content and
keyword privacy are well preserved. Although they allow searches to be
performed securely and effectively, the existing searchable encryption
techniques are not suitable for the cloud computing scenario, since they
support only exact keyword search. That is, there is no tolerance of
minor typos and format inconsistencies. It is quite common that users'
search input does not exactly match the pre-set keywords due to
possible typos, such as Illinois and Ilinois, representation
inconsistencies, such as PO BOX and P.O. Box, and/or their lack of exact
knowledge about the data. The naive way to support fuzzy keyword
search is through simple
LITERATURE SURVEY
Motivation
Plain text fuzzy keyword search.
Recently, the importance of fuzzy search has received attention in the
context of plaintext searching in the information retrieval community
[11]-[13]. These works address the problem in the traditional
information-access paradigm by allowing the user to search, without a
try-and-see approach, for relevant information based on approximate
string matching. At first glance, it seems possible to directly apply these
string matching algorithms to the context of searchable encryption by
computing the trapdoors on a character basis within an alphabet.
However, this trivial construction suffers from dictionary and
statistics attacks and fails to achieve search privacy.
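To make the weakness concrete, the following minimal sketch assumes a hypothetical naive scheme in which every character of a keyword gets its own deterministic trapdoor (here an HMAC-SHA256 tag). The function name `char_trapdoors` and the demo key are illustrative assumptions, not part of any scheme described in this report. The point is only that equal characters yield equal tokens, so the frequency pattern of the plaintext survives and can be exploited by a statistics attack.

```python
import hashlib
import hmac
from collections import Counter


def char_trapdoors(word: str, key: bytes) -> list[str]:
    """Hypothetical naive scheme: one deterministic trapdoor per character."""
    return [
        hmac.new(key, ch.encode("utf-8"), hashlib.sha256).hexdigest()
        for ch in word
    ]


key = b"demo-secret-key"  # assumption: any secret shared by owner and users
tokens = char_trapdoors("illinois", key)

# The server never sees the letters, but it does see which tokens repeat.
token_pattern = sorted(Counter(tokens).values(), reverse=True)
plain_pattern = sorted(Counter("illinois").values(), reverse=True)

print(token_pattern)  # [3, 2, 1, 1, 1]
print(plain_pattern)  # [3, 2, 1, 1, 1]
```

Because the two frequency patterns are identical, an attacker holding a dictionary of likely keywords can match token patterns against word patterns without ever learning the key.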
Searchable encryption.
Traditional searchable encryption [2]-[8], [10] has been widely studied
in the context of cryptography. Most of these works focus on
efficiency improvements and the formalization of security definitions.
The first construction of searchable encryption was proposed by Song et
al. [3], in which each word in the document is encrypted independently
under a special two-layered encryption construction. Goh [4] proposed
using Bloom filters to construct the indexes for the data files. To
achieve more efficient search, Chang et al. [7] and Curtmola et al. [8]
PROBLEM DEFINITION
Problem statement
Scope
We propose a system in which fuzzy keyword search greatly enhances
system usability by returning the matching files when users' search
inputs exactly match the predefined keywords, or the closest possible
matching files based on keyword similarity semantics when exact match
fails.
Existing System
For the protection of data privacy, sensitive data usually have to be
encrypted before outsourcing, which makes effective data utilization a
very challenging task. Although traditional searchable encryption
schemes allow a user to securely search over encrypted data through
keywords and selectively retrieve files of interest, these techniques
support only exact keyword search. That is, there is no tolerance of
minor typos and format inconsistencies which, on the other hand, are
typical user searching behavior and happen very frequently. This
significant drawback makes existing techniques unsuitable in Cloud
Computing as it greatly affects system usability, rendering user
searching experiences very frustrating and system efficacy very low.
Proposed System
Main Modules:
1. Wildcard Based Technique
2. Gram-Based Technique
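The two modules can be illustrated with a small sketch for edit distance 1. It assumes the usual constructions from the fuzzy keyword search literature: the wildcard-based set places a `*` at each position to stand for any edit operation there, and the gram-based set keeps the keyword plus every version with one character deleted. The function names are hypothetical and the code builds plaintext sets only, with no encryption.

```python
def wildcard_fuzzy_set(word: str) -> set[str]:
    """Wildcard-based variants of `word` for edit distance 1 (assumed construction)."""
    variants = {word}
    for i in range(len(word) + 1):
        # '*' placed between characters stands for an edit at that gap.
        variants.add(word[:i] + "*" + word[i:])
    for i in range(len(word)):
        # '*' in place of a character stands for an edit on that character.
        variants.add(word[:i] + "*" + word[i + 1:])
    return variants


def gram_fuzzy_set(word: str) -> set[str]:
    """Gram-based variants of `word` for edit distance 1 (assumed construction)."""
    variants = {word}
    for i in range(len(word)):
        # Drop the character at position i.
        variants.add(word[:i] + word[i + 1:])
    return variants


print(len(wildcard_fuzzy_set("CASTLE")))  # 14
print(sorted(gram_fuzzy_set("CASTLE")))
```

For a keyword of length 6 the wildcard set has 14 entries and the gram set has 7, far fewer than enumerating every possible misspelling letter by letter, which is the storage saving these techniques aim for.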
Edit Distance:
a. Substitution
b. Deletion
c. Insertion
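The three operations define the edit distance between two words: the minimum number of substitutions, deletions and insertions needed to turn one into the other. The sketch below is the standard dynamic-programming computation, followed by a plaintext illustration of the intended search semantics (exact match first, otherwise keywords within distance d). The names `fuzzy_lookup` and `keyword_to_files` and the sample file names are assumptions for illustration; no encryption is involved here.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of substitutions, deletions and insertions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # deletion of ca
                curr[j - 1] + 1,     # insertion of cb
                prev[j - 1] + cost,  # substitution (or match when cost is 0)
            ))
        prev = curr
    return prev[-1]


def fuzzy_lookup(query: str, keyword_to_files: dict, d: int = 1) -> list:
    """Return files for an exact keyword match, else for keywords within distance d."""
    if query in keyword_to_files:
        return sorted(keyword_to_files[query])
    hits = set()
    for keyword, files in keyword_to_files.items():
        if edit_distance(query, keyword) <= d:
            hits.update(files)
    return sorted(hits)


sample_index = {"illinois": {"f1.txt"}, "castle": {"f2.txt", "f3.txt"}}
print(edit_distance("CASTLE", "CASLE"))       # 1
print(fuzzy_lookup("ilinois", sample_index))  # ['f1.txt']
```

Scanning every keyword at query time is acceptable for a plaintext demonstration, but it cannot be done over encrypted keywords, which is why the proposed system precomputes fuzzy keyword sets instead.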
HARDWARE & SOFTWARE REQUIREMENT
PLANNING AND ESTIMATION
Project Methodology
Software Development Life Cycle
The entire project spanned a duration of 6 months. In order to design
and develop a cost-effective model, the Waterfall model was followed.
TimeLine
Cost Estimation
Cost Drivers

Cost driver                                     Very Low  Low   Nominal  High  Very High  Extra High
Product attributes
  Required software reliability                 0.75      0.88  1.00     1.15  1.40       -
  Size of application database                  -         0.94  1.00     1.08  1.16       -
  Complexity of the product                     0.70      0.85  1.00     1.15  1.30       1.65
Hardware attributes
  Run-time performance constraints              -         -     1.00     1.11  1.30       1.66
  Memory constraints                            -         -     1.00     1.06  1.21       1.56
  Volatility of the virtual machine environment -         0.87  1.00     1.15  1.30       -
  Required turnabout time                       -         0.87  1.00     1.07  1.15       -
Personnel attributes
  Analyst capability                            1.46      1.19  1.00     0.86  0.71       -
  Applications experience                       1.29      1.13  1.00     0.91  0.82       -
  Software engineer capability                  1.42      1.17  1.00     0.86  0.70       -
  Virtual machine experience                    1.21      1.10  1.00     0.90  -          -
  Programming language experience               1.14      1.07  1.00     0.95  -          -
Project attributes
  Application of software engineering methods   1.24      1.10  1.00     0.91  0.82       -
  Use of software tools                         1.24      1.10  1.00     0.91  0.83       -
  Required development schedule                 1.23      1.08  1.00     1.04  1.10       -
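As a worked illustration of how such multipliers are typically used, the sketch below assumes the estimate follows Intermediate COCOMO, whose standard cost-driver values the ratings above match. The chosen ratings, the project size in KLOC and the organic-mode coefficients a = 3.2 and b = 1.05 are assumptions for the example, not figures taken from this project.

```python
from math import prod

# A few multipliers copied from the cost-driver ratings listed above.
MULTIPLIERS = {
    "Required software reliability": {
        "Very Low": 0.75, "Low": 0.88, "Nominal": 1.00, "High": 1.15, "Very High": 1.40,
    },
    "Complexity of the product": {
        "Very Low": 0.70, "Low": 0.85, "Nominal": 1.00, "High": 1.15,
        "Very High": 1.30, "Extra High": 1.65,
    },
    "Analyst capability": {
        "Very Low": 1.46, "Low": 1.19, "Nominal": 1.00, "High": 0.86, "Very High": 0.71,
    },
    "Programming language experience": {
        "Very Low": 1.14, "Low": 1.07, "Nominal": 1.00, "High": 0.95,
    },
}


def effort_adjustment_factor(ratings: dict) -> float:
    """Multiply the multipliers of the selected rating for each cost driver."""
    return prod(MULTIPLIERS[driver][rating] for driver, rating in ratings.items())


def intermediate_cocomo_effort(kloc: float, eaf: float, a: float = 3.2, b: float = 1.05) -> float:
    """Effort in person-months: a * KLOC^b * EAF (organic-mode defaults assumed)."""
    return a * kloc ** b * eaf


# Hypothetical ratings for a small academic project.
example_ratings = {
    "Required software reliability": "High",
    "Analyst capability": "High",
}
eaf = effort_adjustment_factor(example_ratings)
print(round(eaf, 3))  # 0.989
print(round(intermediate_cocomo_effort(kloc=5.0, eaf=eaf), 2))
```

Drivers left out of the ratings dictionary are treated as Nominal, since a Nominal rating contributes a multiplier of 1.00.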
FEASIBILITY STUDY
A feasibility study is made to see whether the project, on completion, will serve the purpose of the
organization for the amount of work, effort and time spent on it. It lets the developer foresee
the future of the project and its usefulness. A system proposal is assessed according to its
workability, that is, its impact on the organization, its ability to meet user needs and its
effective use of resources. Thus, when a new application is proposed, it normally goes through a
feasibility study before it is approved for development.
This document presents the feasibility of the project being designed and lists the areas that
were considered very carefully during the feasibility study of this project, namely
Technical, Economic and Operational feasibility. The following are its features:
TECHNICAL FEASIBILITY
The system must be evaluated from the technical point of view first. The assessment of this
feasibility must be based on an outline design of the system requirements in terms of input,
output, programs and procedures. Having identified an outline system, the investigation must go
on to suggest the type of equipment and the methods required for developing the system and for
running it once it has been designed.
ECONOMIC FEASIBILITY
Since the system is developed as part of project work, there is no manual cost to spend on the
proposed system. Also, all the resources are already available, which indicates that the system
is economically feasible to develop.
BEHAVIORAL FEASIBILITY
This includes the following questions:
Is there sufficient support for the users?
Will the proposed system cause harm?
The project would be beneficial because it satisfies the objectives when developed and
installed. All behavioral aspects were considered carefully, and we conclude that the project is
behaviorally feasible.
Design & Implementation
[Flowchart: Start; "If verified?" decision (No / Yes); Select file to upload; Encrypt data with key]
[Figure: 1 Level DFD]
[Figure: 2 Level DFD]
APPLICATION
1) Search semantics
2) Search ranking
3) File retrieval
Conclusion
In this paper, for the first time we formalize and solve the problem of
supporting efficient yet privacy-preserving fuzzy search for achieving
effective utilization of remotely stored encrypted data in Cloud
Computing. We design an advanced technique (i.e., wildcard-based
technique) to construct the storage-efficient fuzzy keyword sets by
exploiting a significant observation on the similarity metric of edit
distance. Based on the constructed fuzzy keyword sets, we further
propose an efficient fuzzy keyword search scheme. Through rigorous
security analysis, we show that our proposed solution is secure and
privacy-preserving, while correctly realizing the goal of fuzzy keyword
search.
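The way a search scheme can be built on top of fuzzy keyword sets is sketched below under stated assumptions: trapdoors are modelled as HMAC-SHA256 tags under a key shared by the data owner and authorized users, the index maps each trapdoor of a wildcard variant to file identifiers, and the server only compares opaque tags. All function names, the key and the file identifiers are hypothetical, and file identifiers are left unencrypted for brevity, so this is a simplification rather than the scheme analysed in this report.

```python
import hashlib
import hmac


def wildcard_fuzzy_set(word: str) -> set:
    """Wildcard-based variants for edit distance 1 (assumed construction)."""
    inserted = {word[:i] + "*" + word[i:] for i in range(len(word) + 1)}
    replaced = {word[:i] + "*" + word[i + 1:] for i in range(len(word))}
    return {word} | inserted | replaced


def trapdoor(key: bytes, variant: str) -> str:
    """Deterministic keyed tag standing in for a search trapdoor."""
    return hmac.new(key, variant.encode("utf-8"), hashlib.sha256).hexdigest()


def build_index(key: bytes, keyword_to_files: dict) -> dict:
    """Owner side: map the trapdoor of every fuzzy variant to the file identifiers."""
    index = {}
    for keyword, file_ids in keyword_to_files.items():
        for variant in wildcard_fuzzy_set(keyword):
            index.setdefault(trapdoor(key, variant), set()).update(file_ids)
    return index


def search(index: dict, query_trapdoors: set) -> list:
    """Server side: match opaque trapdoors without seeing any keyword."""
    hits = set()
    for tag in query_trapdoors:
        hits |= index.get(tag, set())
    return sorted(hits)


key = b"owner-and-user-shared-key"  # assumption: distributed out of band
index = build_index(key, {"castle": {"file1"}, "cloud": {"file2"}})

# User side: a misspelled query is expanded into its own fuzzy set first.
query = {trapdoor(key, v) for v in wildcard_fuzzy_set("cassle")}
print(search(index, query))  # ['file1']
```

The misspelled query "cassle" and the stored keyword "castle" both produce the variant "cas*le", so their trapdoors coincide and the server returns the matching file without learning either word.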
FUTURE MODIFICATION
BIBLIOGRAPHY
[1] Google, "Britney Spears spelling correction," referenced online at
http://www.google.com/jobs/britney.html, June 2009.
[2] M. Bellare, A. Boldyreva, and A. O'Neill, "Deterministic and
efficiently searchable encryption," in Proceedings of Crypto 2007,
volume 4622 of LNCS. Springer-Verlag, 2007.
[3] D. Song, D. Wagner, and A. Perrig, "Practical techniques for
searches on encrypted data," in Proc. of IEEE Symposium on Security
and Privacy'00, 2000.
[4] E.-J. Goh, "Secure indexes," Cryptology ePrint Archive, Report
2003/216, 2003, http://eprint.iacr.org/.
[5] D. Boneh, G. D. Crescenzo, R. Ostrovsky, and G. Persiano, "Public
key encryption with keyword search," in Proc. of EUROCRYPT'04, 2004.
[6] B. Waters, D. Balfanz, G. Durfee, and D. Smetters, "Building an
encrypted and searchable audit log," in Proc. of 11th Annual Network
and Distributed System, 2004.
[7] Y.-C. Chang and M. Mitzenmacher, "Privacy preserving keyword