Vous êtes sur la page 1sur 86

CLOUD DATA STORAGE WITH SECURE DEDUPLICATION

AT RESOURCE SIDE
A THESIS
Submitted in partial fulfillment of the requirements for the award of the degree of

MASTER OF TECHNOLOGY
In
COMPUTER SCIENCE
AND
ENGINEERING

Submitted by
P.DEEPTHI
(14KC1D5806)

Under The Esteemed Guidance of


Mr. P.VEMANA
M.Sc(c.s),MCA,M.Phil

Asst.Professor & Dept. of CSE

DEPARTEMENT OF COMPUTER SCIENCE & ENGINEERING

KRISHNAVENI ENGINEERING COLLEGE FOR WOMEN,


KESANUPALLI
(APROVED BY AICTE NEW DELHI & AFFILIATED TO J.N.T.UNIVERSITY, KAKINADA)

(2014-2016)

KRISHNAVENI ENGINEERING COLLEGE FOR WOMEN,


KESANUPALLI
DEPARTEMENT OF COMPUTER SCIENCE & ENGINEERING

CERTIFICATE
This is to certify that the project work entitled CLOUD DATA
STORAGE WITH SECURE DEDUPLICATION AT RESOURCE SIDE is a
Bonafide record carried out by the P.DEEPTHI(14KC1D5806). Under the guidance
and supervision, as a partial fulfillment for the award of Master of Technology in
Computer

Science

&

Engineering

of

JAWAHARLAL

NEHRU

TECHNOLOGICAL UNIVERSITY KAKINADA, KAKINADA during the


academic year 2014-2016.
The content of the thesis has not been submitted earlier for the award
of any other degree or certificate and I heartily comment the work done by
him in this connection.

INTERNAL GUIDE

Mr. P.VEMANA M.Sc(c.s),MCA,M.Phil


Asst.Professor & Dept. of CSE
KRISHNAVENI ENGINEERING COLLEGE FOR WOMEN,

KESANUPALLI, NARASARAOPET.

Internal Examiner

HEAD OF THE DEPT

Mrs.B.JYOTHI

M.Tech

Asst.Professor & Head, Dept. Of CSE


KRISHNAVENI ENGINEERING COLLEGE FOR WOMEN,

KESANUPALLI, NARASARAOPET.

External Examiner

DECLARATION
I

(PATHELLA

DEEPTHI)

the

student

of

KRISHNAVENI

ENGINEERING COLLEGE FOR WOMEN, hereby declare that this is titled


Cloud Data Storage With Secure Deduplication At Resource Side
being submitted to the Department of Computer Science and Engineering,
KRISHNAVENI ENGINEERING COLLEGE FOR WOMEN, Narasaraopet,
Guntur (Dt.).

This is a project bonafide work done by me and it has not been submitted to
any other Institution or University for the award of any other degree.

PATHELLA DEEPTHI
14KC1D5806

ACKNOWLEDGEMENT
I articulate my profound gratitude to Mr. P.VEMANA, Asst.Professor,
department of CSE for his indispensable encouragement with salient guide lines
and suggestions throughout the work. He helped me in various difficulties and
challenges that raised at various stages of project.
With a great sense of pleasure and privilege, I extend my gratitude to Mrs.
B.JYOTHI, Asst.Professor and Head of the Computer Science and Engineering
Department, for her cooperation and providing necessary help for completing this
project work.
I convey my special thanks to my beloved and honorable Principal Dr. T SITA
RAMAIAH MBA, Ph.D , Secretary and Correspondent Mr.P.SRINIVASA RAO and
Academic Director Mr.K.NASARAIAH and my College Management for
providing excellent lab facilities for the completion of project within the
campus. Finally, I express indebtedness to everyone even remotely involved in the
successful completion of my project.
I am very much gratitude to all faculty members of the Department for their kind
cooperation with out which this work could not have been to this shape. I would like
to thank the teaching and non-teaching Staff and all my friends who made my stay
here at KRISHNAVENI ENGINEEERING COLLEGE FOR WOMEN a morale one.

(P.Deepthi)
(14KC1D5806)

ABSTRACT
The cloud computing technology develops during the last decade,
Outsourcing data to cloud service for storage becomes an attractive trend, which
Benefits in sparing efforts on heavy data maintenance and management.
Nevertheless, since the outsourced cloud storage is not fully trustworthy, it
raises Security concerns on how to realize data deduplication in cloud while
achieving Integrity auditing.
On cloud data. Specifically, aiming at achieving both data integrity and
deduplication in cloud, we propose two secure systems, namely SecCloud and
SecCloud+. SecCloud introduces an auditing entity with a maintenance of a
MapReduce cloud, which helps clients generate data tags before uploading as
well as audit the integrity of data having been stored in cloud. Compared with
previous work, the computation by user in SecCloud is greatly reduced during
the file uploading and auditing phases. SecCloud+ is designed motivated by the
fact that customers always want to encrypt their data before uploading, and
enables integrity auditing and secure deduplication on encrypted data.

INDEX

CONTENT
1. INTRODUCTION

PAGE NO
1

1.1 What is cloud computing

1.2 How Cloud Computing Works

1.3 Characteristics and Services Models

2-3

1.4 Benefits of cloud computing

3-4

2. LITERATURE SURVEY

2.1 A view of cloud computing

2.2 Secure and constant cost public cloud storage

6-7

Auditing with deduplication


2.3 Proofs of ownership in remote storage systems

7-8

2.4 Dupless: Server aided encryption for deduplicated storage

2.5 Provable data possession at untrusted stores

8-9

3. SYSTEM ANALYSIS

10

3.1 Existing System

10-11

3.2 Proposed System

11

3.3 System Requirements

12

3.3.1 Hardware Requirements

12

3.3.2 Software Requirements

12

4. SYSTEM DESIGN

13

4.1 Data Flow Diagram

13-14

4.2 UML Diagrams

15

4.2.1 Goals

15

4.2.2 Use Case Diagram

16

4.2.3 Class Diagram

17

4.2.4 Sequence Diagram

18

4.2.5 Activity Diagram

19-20

5. SOFTWARE IMPLEMENTATION

21

5.1 Modules

21

5.2 Modules Description

21-22

5.3 Software Environment

22

5.3.1 Java Technology

22-23

5.3.2 The Java Platform

24

5.3.3 What can Java Technology Do

25

5.3.4 ODBC

27-28

5.3.5 JDBC

29-32

5.3.6 Networking

32-40

6. CODING

41

6.1 Admin Login

41-46

6.2 User Login

47-52

6.3 Registration

53-59

7. SYSTEM TESTING

60

7.1 Unit Testing

60

7.2 Integration Testing

60

7.3 Functional Testing

61

7.4 White box Testing

61

7.5 Black Box Testing

62

7.6 Acceptance Testing

63

8. SCREENSHOTS

64-72

9. FUTURE ENHANCEMENT

73

10. CONCLUSION

74

11. BIBLIOGRAPHY

75-76

12. APPENDICES
Published paper

LIST OF FIGURES
FIG.NO

TITLE

PG.NO

1.1

Structure Of Cloud Computing

1.3.1

Characteristics of Cloud computing

1.3.2

Structure of service models

System Design

13

4.1

Data Flow Diagram

14

4.2.1

Use Case Diagram

16

4.2.2

Class Diagram

17

4.2.3

Sequence Diagram

18

4.2.4

Activity Diagram

20

5.3.1

Java Programming

24

1.INTRODUCTION
1.1 What is cloud computing?
Cloud computing is the use of computing resources (hardware and
software) that are delivered as a service over a network (typically the Internet).
The name comes from the common use of a cloud-shaped symbol as an
abstraction for the complex infrastructure it contains in system diagrams. Cloud
computing entrusts remote services with a user's data, software and
computation. Cloud computing consists of hardware and software resources
made available on the Internet as managed third-party services. These services
typically provide access to advanced software applications and high-end
networks of server computers.

Figure 1.1 Structure of cloud computing

1.2 How Cloud Computing Works?


The goal of cloud computing is to apply traditional supercomputing, or highperformance computing power, normally used by military and research
facilities, to perform tens of trillions of computations per second, in consumeroriented applications such as financial portfolios, to deliver personalized
information, to provide data storage or to power large, immersive computer
games.
The cloud computing uses networks of large groups of servers typically running
low-cost consumer PC technology with specialized connections to spread dataprocessing chores across them. This shared IT infrastructure contains large
pools of systems that are linked together. Often, virtualization techniques are
used to maximize the power of cloud computing.
1.3 Characteristics and Services Models:
The salient characteristics of cloud computing based on the
definitions provided by the National Institute of Standards and Terminology
(NIST) are outlined below:

On-demand self-service: A consumer can unilaterally provision


computing capabilities, such as server time and network storage, as
needed automatically without requiring human interaction with each
services provider.
Broad network access: Capabilities are available over the network and
accessed through standard mechanisms that promote use by
heterogeneous thin or thick client platforms (e.g., mobile phones, laptops,
and PDAs).
Resource pooling: The providers computing resources are pooled to
serve multiple consumers using a multi-tenant model, with different
physical and virtual resources dynamically assigned and reassigned
according to consumer demand. There is a sense of locationindependence in that the customer generally has no control or knowledge
over the exact location of the provided resources but may be able to
specify location at a higher level of abstraction (e.g., country, state, or
data center). Examples of resources include storage, processing, memory,
network bandwidth, and virtual machines.

Rapid elasticity: Capabilities can be rapidly and elastically provisioned,


in some cases automatically, to quickly scale out and rapidly released to
quickly scale in. To the consumer, the capabilities available for
provisioning often appear to be unlimited and can be purchased in any
quantity at any time.
Measured service: Cloud systems automatically control and optimize
resource use by leveraging a metering capability at some level of
abstraction appropriate to the type of service (e.g., storage, processing,
bandwidth, and active user accounts). Resource usage can be managed,
controlled, and reported providing transparency for both the provider and
consumer of the utilized service.

Figure 1.3.1 Characteristics of cloud computing

Services Models:
Cloud Computing comprises three different service models, namely
Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS), and Softwareas-a-Service (SaaS). The three service models or layer are completed by an end
user layer that encapsulates the end user perspective on cloud services. The
model is shown in figure below. If a cloud user accesses services on the
infrastructure layer, for instance, she can run her own applications on the
resources of a cloud infrastructure and remain responsible for the support,
maintenance, and security of these applications herself. If she accesses a service
3

on the application layer, these tasks are normally taken care of by the cloud
service provider.

Figure 1.3.2 Structure of service models


1.4 Benefits of cloud computing:
1. Achieve economies of scale increase volume output or productivity
with fewer people. Your cost per unit, project or product plummets.
2. Reduce spending on technology infrastructure. Maintain easy access
to your information with minimal upfront spending. Pay as you go
(weekly, quarterly or yearly), based on demand.
3. Globalize your workforce on the cheap. People worldwide can access
the cloud, provided they have an Internet connection.
4. Streamline processes. Get more work done in less time with less people.
5. Reduce capital costs. Theres no need to spend big money on hardware,
software or licensing fees.
6. Improve accessibility. You have access anytime, anywhere, making your
life so much easier!
7. Monitor projects more effectively. Stay within budget and ahead of
completion cycle times.

8. Less personnel training is needed. It takes fewer people to do more


work on a cloud, with a minimal learning curve on hardware and software
issues.
9. Minimize licensing new software. Stretch and grow without the need to
buy expensive software licenses or programs.
10.Improve flexibility. You can change direction without serious people
or financial issues at stake.
Advantages:
1. Price:Pay for only the resources used.
2. Security: Cloud instances are isolated in the network from other
instances for improved security.
3. Performance: Instances can be added instantly for improved
performance. *Clients have access to the total resources of the Clouds
core hardware.
4. Scalability: Auto-deploy cloud instances when needed.
5. Uptime: Uses multiple servers for maximum redundancies. In case of
server failure, instances can be automatically created on another server.
6. Control: Able to login from any location. Server snapshot and a software
library lets you deploy custom instances.
7. Traffic: Deals with spike in traffic with quick deployment of additional
instances to handle the load.

2.LITERATURE SURVEY

2.1 A view of cloud computing


AUTHORS: M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A.
Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia
Cloud computing, the long-held dream of computing as a utility, has the
potential to transform a large part of the IT industry, making software even
more attractive as a service and shaping the way IT hardware is designed and
purchased. Developers with innovative ideas for new Internet services no longer
require the large capital outlays in hardware to deploy their service or the
human expense to operate it. They need not be concerned about
overprovisioning for a service whose popularity does not meet their predictions,
thus wasting costly resources, or underprovisioning for one that becomes wildly
popular, thus missing potential customers and revenue. Moreover, companies
with large batch-oriented tasks can get results as quickly as their programs can
scale, since using 1,000 servers for one hour costs no more than using one
server for 1,000 hours. This elasticity of resources, without paying a premium
for large scale, is unprecedented in the history of IT.

2.2 Secure and constant cost public cloud storage auditing with
deduplication
AUTHORS: J. Yuan and S. Yu
Data integrity and storage efficiency are two important requirements for cloud
storage. Proof of Retrievability (POR) and Proof of Data Possession (PDP)
techniques assure data integrity for cloud storage. Proof of Ownership (POW)
improves storage efficiency by securely removing unnecessarily duplicated data
on the storage server. However, trivial combination of the two techniques, in
order to achieve both data integrity and storage efficiency, results in non-trivial
duplication of metadata (i.e., authentication tags), which contradicts the
6

objectives of POW. Recent attempts to this problem introduce tremendous


computational and communication costs and have also been proven not secure.
It calls for a new solution to support efficient and secure data integrity auditing
with storage deduplication for cloud storage. In this paper we solve this open
problem with a novel scheme based on techniques including polynomial-based
authentication tags and homomorphic linear authenticators. Our design allows
deduplication of both files and their corresponding authentication tags. Data
integrity auditing and storage deduplication are achieved simultaneously. Our
proposed scheme is also characterized by constant realtime communication and
computational cost on the user side. Public auditing and batch auditing are both
supported. Hence, our proposed scheme outperforms existing POR and PDP
schemes while providing the additional functionality of deduplication. We
prove the security of our proposed scheme based on the Computational DiffieHellman problem, the Static Diffie-Hellman problem and the t-Strong DiffieHellman problem. Numerical analysis and experimental results on Amazon
AWS show that our scheme is efficient and scalable.

2.3 Proofs of ownership in remote storage systems


AUTHORS: S. Halevi, D. Harnik, B. Pinkas, and A. Shulman-Peleg
Cloud storage systems are becoming increasingly popular. A promising
technology that keeps their cost down is deduplication, which stores only a
single copy of repeating data. Client-side deduplication attempts to identify
deduplication opportunities already at the client and save the bandwidth of
uploading copies of existing files to the server. In this work we identify attacks
that exploit client-side deduplication, allowing an attacker to gain access to
arbitrary-size files of other users based on a very small hash signatures of these
files. More specifically, an attacker who knows the hash signature of a file can
convince the storage service that it owns that file, hence the server lets the
attacker download the entire file. (In parallel to our work, a subset of these
attacks were recently introduced in the wild with respect to the Dropbox file
synchronization service.) To overcome such attacks, we introduce the notion of
proofs-of ownership (PoWs), which lets a client efficiently prove to a server
that that the client holds a file, rather than just some short information about it.
We formalize the concept of proof-of-ownership, under rigorous security
7

definitions, and rigorous efficiency requirements of Petabyte scale storage


systems. We then present solutions based on Merkle trees and specific
encodings, and analyze their security. We implemented one variant of the
scheme. Our performance measurements indicate that the scheme incurs only a
small overhead compared to naive client-side deduplication.
2.4 Dupless: Server aided encryption for deduplicated storage
AUTHORS: S. Keelveedhi, M. Bellare, and T. Ristenpart
Cloud storage service providers such as Dropbox, Mozy, and others perform
deduplication to save space by only storing one copy of each file uploaded.
Should clients conventionally encrypt their files, however, savings are lost.
Message-locked encryption (the most prominent manifestation of which is
convergent encryption) resolves this tension. However it is inherently subject to
brute-force attacks that can recover files falling into a known set. We propose
an architecture that provides secure deduplicated storage resisting brute-force
attacks, and realize it in a system called DupLESS. In DupLESS, clients encrypt
under message-based keys obtained from a key-server via an oblivious PRF
protocol. It enables clients to store encrypted data with an existing service, have
the service perform deduplication on their behalf, and yet achieves strong
confidentiality guarantees. We show that encryption for deduplicated storage
can achieve performance and space savings close to that of using the storage
service with plaintext data.
2.5 Provable data possession at untrusted stores
AUTHORS: G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner, Z.
Peterson, and D. Song
We introduce a model for provable data possession (PDP) that allows a client
that has stored data at an untrusted server to verify that the server possesses the
original data without retrieving it. The model generates probabilistic proofs of
possession by sampling random sets of blocks from the server, which drastically
reduces I/O costs. The client maintains a constant amount of metadata to verify
the proof. The challenge/response protocol transmits a small, constant amount
of data, which minimizes network communication. Thus, the PDP model for
8

remote data checking supports large data sets in widely-distributed storage


system.
We present two provably-secure PDP schemes that are more efficient than
previous solutions, even when compared with schemes that achieve weaker
guarantees. In particular, the overhead at the server is low (or even constant), as
opposed to linear in the size of the data. Experiments using our implementation
verify the practicality of PDP and reveal that the performance of PDP is
bounded by disk I/O and not by cryptographic computation.

3.SYSTEM ANALYSIS

3.1 EXISTING SYSTEM


Ateniese et al. proposed a dynamic PDP schema but without insertion
operation.
Erway et al. improved Ateniese et al.s work and supported insertion by
introducing authenticated flip table.
Wang et al.proposed proxy PDP in public clouds.
Zhu et al. proposed the cooperative PDP in multi-cloud storage.
Wang et al. improved the POR model by manipulating the classic Merkle
hash tree construction for block tag authentication.
Xu and Chang proposed to improve the POR schema with polynomial
commitment for reducing communication cost.
Stefanov et al. proposed a POR protocol over authenticated file system
subject to frequent changes.
Azraoui et al. combined the privacy-preserving word search algorithm
with the insertion in data segments of randomly generated short bit
sequences, and developed a new POR protocol.
Li et al. considered a new cloud storage architecture with two
independent cloud servers for integrity auditing to reduce the
computation load at client side.

DISADVANTAGES OF EXISTING SYSTEM


The first problem is integrity auditing. The cloud server is able to relieve
clients from the heavy burden of storage management and maintenance.
The most difference of cloud storage from traditional in-house storage is
that the data is transferred via Internet and stored in an uncertain domain,
not under control of the clients at all, which inevitably raises clients great
concerns on the integrity of their data.
The second problem is secure deduplication. The rapid adoption of cloud
services is accompanied by increasing volumes of data stored at remote
cloud servers. Among these remote stored files, most of them are
10

duplicated: according to a recent survey by EMC, 75% of recent digital


data is duplicated copies.
Unfortunately, this action of deduplication would lead to a number of
threats potentially affecting the storage system, for example, a server
telling a client that it (i.e., the client) does not need to send the file reveals
that some other client has the exact same file, which could be sensitive
sometimes. These attacks originate from the reason that the proof that the
client owns a given file (or block of data) is solely based on a static, short
value (in most cases the hash of the file).

3.2 PROPOSED SYSTEM


In this paper, aiming at achieving data integrity and deduplication in
cloud, we propose two secure systems namely SecCloud and SecCloud+.
SecCloud introduces an auditing entity with maintenance of a
MapReduce cloud, which helps clients generate data tags before
uploading as well as audit the integrity of data having been stored in
cloud.
Besides supporting integrity auditing and secure deduplication,
SecCloud+ enables the guarantee of file confidentiality.
We propose a method of directly auditing integrity on encrypted data.

ADVANTAGES OF PROPOSED SYSTEM


This design fixes the issue of previous work that the computational load
at user or auditor is too huge for tag generation. For completeness of finegrained, the functionality of auditing designed in SecCloud is supported
on both block level and sector level. In addition, SecCloud also enables
secure deduplication.
The challenge of deduplication on encrypted is the prevention of
dictionary attack.
Our proposed SecCloud system has achieved both integrity auditing and
file deduplication.

11

3. SYSTEM REQUIREMENTS

3.1 HARDWARE REQUIREMENTS

System
Hard Disk
Floppy Drive
Monitor
Mouse
Ram

:
:
:
:
:
:

Pentium IV 2.4 GHz.


40 GB.
1.44 Mb.
15 VGA Colour.
Logitech.
512 Mb

3.2 SOFTWARE REQUIREMENTS

Operating system
Coding Language
IDE
Database

:
:
:
:

Windows XP/7.
JAVA/J2EE
Netbeans 7.4
MYSQL

12

4.SYSTEM DESIGN

SYSTEM ARCHITECTURE

Figure 4 System Architecture


4.1 DATA FLOW DIAGRAM
1. The DFD is also called as bubble chart. It is a simple graphical formalism
that can be used to represent a system in terms of input data to the system,
various processing carried out on this data, and the output data is
generated by this system.
2. The data flow diagram (DFD) is one of the most important modeling
tools. It is used to model the system components. These components are
the system process, the data used by the process, an external entity that
interacts with the system and the information flows in the system.
3. DFD shows how the information moves through the system and how it is
modified by a series of transformations. It is a graphical technique that
depicts information flow and the transformations that are applied as data
moves from input to output.

13

4. DFD is also known as bubble chart. A DFD may be used to represent a


system at any level of abstraction. DFD may be partitioned into levels
that represent increasing information flow and functional detail.

Cloud

User

Owner

Account Activation

Register

Yes

Login

Upload

Auditor

Duplication Check

Token Request
Accept
File Token

Ignore

Download
Encrpyted

Accept
Public & private
keycheck

Upload to cloud

Ignore

Cloud Database

End

Figure 4.1 Data Flow Diagram


14

4.2 UML DIAGRAMS

UML stands for Unified Modeling Language. UML is a standardized


general-purpose modeling language in the field of object-oriented software
engineering. The standard is managed, and was created by, the Object
Management Group.
The goal is for UML to become a common language for creating models
of object oriented computer software. In its current form UML is comprised of
two major components: a Meta-model and a notation. In the future, some form
of method or process may also be added to; or associated with, UML.
The Unified Modeling Language is a standard language for specifying,
Visualization, Constructing and documenting the artifacts of software system,
as well as for business modeling and other non-software systems.
The UML represents a collection of best engineering practices that have
proven successful in the modeling of large and complex systems.
The UML is a very important part of developing objects oriented
software and the software development process. The UML uses mostly
graphical notations to express the design of software projects.
GOALS
The Primary goals in the design of the UML are as follows:
1. Provide users a ready-to-use, expressive visual modeling Language so
that they can develop and exchange meaningful models.
2. Provide extendibility and specialization mechanisms to extend the core
concepts.
3. Be independent of particular programming languages and development
process.
4. Provide a formal basis for understanding the modeling language.
5. Encourage the growth of OO tools market.
6. Support higher level development concepts such as collaborations,
frameworks, patterns and components.
7. Integrate best practices.

15

4.2.1 USE CASE DIAGRAM


A use case diagram in the Unified Modeling Language (UML) is a type
of behavioral diagram defined by and created from a Use-case analysis. Its
purpose is to present a graphical overview of the functionality provided by a
system in terms of actors, their goals (represented as use cases), and any
dependencies between those use cases. The main purpose of a use case diagram
is to show what system functions are performed for which actor. Roles of the
actors in the system can be depicted.

Register

Account Activation

Login

Token Request
Cloud
Accept Request

Previlege Token
User

Log details

Encrypt

Check for Deduplicate


Auditor

Logout

Figure 4.2.1 Use Case Diagram


16

4.2.2 CLASS DIAGRAM


In software engineering, a class diagram in
(UML) is a type of static structure diagram
system by showing the system's classes,
methods), and the relationships among the
contains information.

the Unified Modeling Language


that describes the structure of a
their attributes, operations (or
classes. It explains which class

Cloud
Request details
Response details
View User Details
Store Files
User
Account Activation()
Provide Rights()

File Request
Token Request
File Upload/Download

Auditor

Request()

Check for Depulication


Upload files to cloud
Log Details
Encrypt()
Upload()

Figure 4.2.2 Class Diagram

17

4.2.3 SEQUENCE DIAGRAM


A sequence diagram in Unified Modeling Language (UML) is a kind of
interaction diagram that shows how processes operate with one another and in
what order. It is a construct of a Message Sequence Chart. Sequence diagrams
are sometimes called event diagrams, event scenarios, and timing diagrams.

Cloud

User

Auditor

upload request
Duplicate check
Deduplication
File Upload
User Rquest

Create Account
Token Request

Provide Token

Database

Figure 4.2.3 Sequence Diagram

18

4.2.4 ACTIVITY DIAGRAM


Activity diagrams are graphical representations of workflows of stepwise
activities and actions with support for choice, iteration and concurrency. In the
Unified Modeling Language, activity diagrams can be used to describe the
business and operational step-by-step workflows of components in a system. An
activity diagram shows the overall flow of control.

19

User

Register

NO

Login

Cloud

Auditor

User Activation

Duplication check

Accept

User Update

File uploadto cloud

Token Request

Ignore
View Details

File Upload/Download

REQUEST

Save File in Encrypted Format

Figure 4.2.4 Activity Diagram


20

Log details

5. SOFTWARE IMPLEMENTATION

5.1 MODULES

Cloud Servers
Data Users Module
Auditor
Secure De-duplication System

5.2 MODULES DESCRIPTON


Cloud Service Provider
In this module, we develop Cloud Service Provider module. This is an
entity that provides a data storage service in public cloud.
The CS provides the data outsourcing service and stores data on behalf of
the users.
To reduce the storage cost, the CS eliminates the storage of redundant
data via deduplication and keeps only unique data.
In this paper, we assume that CS is always online and has abundant
storage capacity and computation power.
Data Users Module
A user is an entity that wants to outsource data storage to the S-CSP and
access the data later.
In a storage system supporting deduplication, the user only uploads
unique data but does not upload any duplicate data to save the upload
bandwidth, which may be owned by the same user or different users.
In the authorized deduplication system, each user is issued a set of
privileges in the setup of the system. Each file is protected with the
convergent encryption key and privilege keys to realize the authorized
deduplication with differential privileges.

21

Auditor
Auditor which helps clients upload and audit their outsourced data maintains a
MapReduce cloud and acts like a certificate authority. This assumption
presumes that the auditor is associated with a pair of public and private keys. Its
public key is made available to the other entities in the system. The first design
goal of this work is to provide the capability of verifying correctness of the
remotely stored data. public verification, which allows anyone, not just the
clients originally stored the file, to perform verification.
Secure De-duplication System
We consider several types of privacy we need protect, that is, i)
unforgeability of duplicate-check token: There are two types of
adversaries, that is, external adversary and internal adversary.
As shown below, the external adversary can be viewed as an internal
adversary without any privilege.
If a user has privilege p, it requires that the adversary cannot forge and
output a valid duplicate token with any other privilege p on any file F,
where p does not match p. Furthermore, it also requires that if the
adversary does not make a request of token with its own privilege from
private cloud server, it cannot forge and output a valid duplicate token
with p on any F that has been queried.

5.3 Software Environment


5.3.1 Java Technology
Java technology is both a programming language and a platform.
The Java Programming Language
The Java programming language is a high-level language that can be
characterized by all of the following buzzwords:

Simple
Architecture neutral
Object oriented
Portable
22

Distributed
High performance
Interpreted
Multithreaded
Robust
Dynamic
Secure

With most programming languages, you either compile or interpret a


program so that you can run it on your computer. The Java programming
language is unusual in that a program is both compiled and interpreted. With the
compiler, first you translate a program into an intermediate language called
Java byte codes the platform-independent codes interpreted by the interpreter
on the Java platform. The interpreter parses and runs each Java byte code
instruction on the computer. Compilation happens just once; interpretation
occurs each time the program is executed. The following figure illustrates how
this works.

You can think of Java byte codes as the machine code instructions for the
Java Virtual Machine (Java VM). Every Java interpreter, whether its a
development tool or a Web browser that can run applets, is an implementation
of the Java VM. Java byte codes help make write once, run anywhere
possible. You can compile your program into byte codes on any platform that
has a Java compiler. The byte codes can then be run on any implementation of
the Java VM. That means that as long as a computer has a Java VM, the same
program written in the Java programming language can run on Windows 2000,
a Solaris workstation, or on an iMac.

23

Figure 5.3.1 Java Programming


5.3.2 The Java Platform
A platform is the hardware or software environment in which a
program runs. Weve already mentioned some of the most popular
platforms like Windows 2000, Linux, Solaris, and MacOS. Most
platforms can be described as a combination of the operating system and
hardware. The Java platform differs from most other platforms in that its
a software-only platform that runs on top of other hardware-based
platforms.
The Java platform has two components:
The Java Virtual Machine (Java VM)
The Java Application Programming Interface (Java API)
Youve already been introduced to the Java VM. Its the base for the
Java platform and is ported onto various hardware-based platforms.
The Java API is a large collection of ready-made software components
that provide many useful capabilities, such as graphical user interface
(GUI) widgets. The Java API is grouped into libraries of related classes
and interfaces; these libraries are known as packages. The next section,
What Can Java Technology Do? Highlights what functionality some of
the packages in the Java API provide.
The following figure depicts a program thats running on the Java
platform. As the figure shows, the Java API and the virtual machine
insulate the program from the hardware.

24

Native code is code that after you compile it, the compiled code
runs on a specific hardware platform. As a platform-independent
environment, the Java platform can be a bit slower than native code.
However, smart compilers, well-tuned interpreters, and just-in-time byte
code compilers can bring performance close to that of native code
without threatening portability.
5.3.3 What Can Java Technology Do?
The most common types of programs written in the Java programming
language are applets and applications. If youve surfed the Web, youre
probably already familiar with applets. An applet is a program that
adheres to certain conventions that allow it to run within a Java-enabled
browser.
However, the Java programming language is not just for writing cute,
entertaining applets for the Web. The general-purpose, high-level Java
programming language is also a powerful software platform. Using the
generous API, you can write many types of programs.
An application is a standalone program that runs directly on the Java
platform. A special kind of application known as a server serves and
supports clients on a network. Examples of servers are Web servers,
proxy servers, mail servers, and print servers. Another specialized
program is a servlet. A servlet can almost be thought of as an applet that
runs on the server side. Java Servlets are a popular choice for building
interactive web applications, replacing the use of CGI scripts. Servlets are
similar to applets in that they are runtime extensions of applications.
Instead of working in browsers, though, servlets run within Java Web
servers, configuring or tailoring the server.
How does the API support all these kinds of programs? It does so with
packages of software components that provides a wide range of
functionality. Every full implementation of the Java platform gives you
the following features:
The essentials: Objects, strings, threads, numbers, input and
output, data structures, system properties, date and time, and so on.
25

Applets: The set of conventions used by applets.


Networking: URLs, TCP (Transmission Control Protocol), UDP
(User Data gram Protocol) sockets, and IP (Internet Protocol)
addresses.
Internationalization: Help for writing programs that can be
localized for users worldwide. Programs can automatically adapt to
specific locales and be displayed in the appropriate language.
Security: Both low level and high level, including electronic
signatures, public and private key management, access control, and
certificates.
TM
Software components: Known as JavaBeans , can plug into
existing component architectures.
Object
serialization: Allows lightweight persistence and
communication via Remote Method Invocation (RMI).
TM
Java Database Connectivity (JDBC ): Provides uniform access
to a wide range of relational databases.
The Java platform also has APIs for 2D and 3D graphics, accessibility,
servers, collaboration, telephony, speech, animation, and more. The
following figure depicts what is included in the Java 2 SDK.

How Will Java Technology Change My Life?


We cant promise you fame, fortune, or even a job if you learn the
Java programming language. Still, it is likely to make your programs
better and requires less effort than other languages. We believe that Java
technology will help you do the following:
26

Get started quickly: Although the Java programming language is


a powerful object-oriented language, its easy to learn, especially
for programmers already familiar with C or C++.
Write less code: Comparisons of program metrics (class counts,
method counts, and so on) suggest that a program written in the
Java programming language can be four times smaller than the
same program in C++.
Write better code: The Java programming language encourages
good coding practices, and its garbage collection helps you avoid
memory leaks. Its object orientation, its JavaBeans component
architecture, and its wide-ranging, easily extendible API let you
reuse other peoples tested code and introduce fewer bugs.
Develop programs more quickly: Your development time may be
as much as twice as fast versus writing the same program in C++.
Why? You write fewer lines of code and it is a simpler
programming language than C++.
Avoid platform dependencies with 100% Pure Java: You can
keep your program portable by avoiding the use of libraries written
in other languages. The 100% Pure JavaTM Product Certification
Program has a repository of historical process manuals, white
papers, brochures, and similar materials online.
Write once, run anywhere: Because 100% Pure Java programs
are compiled into machine-independent byte codes, they run
consistently on any Java platform.
Distribute software more easily: You can upgrade applets easily
from a central server. Applets take advantage of the feature of
allowing new classes to be loaded on the fly, without
recompiling the entire program.
5.3.4 ODBC
Microsoft Open Database Connectivity (ODBC) is a standard
programming interface for application developers and database systems
providers. Before ODBC became a de facto standard for Windows programs to
interface with database systems, programmers had to use proprietary languages
for each database they wanted to connect to. Now, ODBC has made the choice
of the database system almost irrelevant from a coding perspective, which is as
it should be. Application developers have much more important things to worry

27

about than the syntax that is needed to port their program from one database to
another when business needs suddenly change.
Through the ODBC Administrator in Control Panel, you can specify the
particular database that is associated with a data source that an ODBC
application program is written to use. Think of an ODBC data source as a door
with a name on it. Each door will lead you to a particular database. For
example, the data source named Sales Figures might be a SQL Server database,
whereas the Accounts Payable data source could refer to an Access database.
The physical database referred to by a data source can reside anywhere on the
LAN.
The ODBC system files are not installed on your system by Windows 95.
Rather, they are installed when you setup a separate database application, such
as SQL Server Client or Visual Basic 4.0. When the ODBC icon is installed in
Control Panel, it uses a file called ODBCINST.DLL. It is also possible to
administer your ODBC data sources through a stand-alone program called
ODBCADM.EXE. There is a 16-bit and a 32-bit version of this program and
each maintains a separate list of ODBC data sources.
From a programming perspective, the beauty of ODBC is that the
application can be written to use the same set of function calls to interface with
any data source, regardless of the database vendor. The source code of the
application doesnt change whether it talks to Oracle or SQL Server. We only
mention these two as an example. There are ODBC drivers available for several
dozen popular database systems. Even Excel spreadsheets and plain text files
can be turned into data sources. The operating system uses the Registry
information written by ODBC Administrator to determine which low-level
ODBC drivers are needed to talk to the data source (such as the interface to
Oracle or SQL Server). The loading of the ODBC drivers is transparent to the
ODBC application program. In a client/server environment, the ODBC API
even handles many of the network issues for the application programmer.
The advantages of this scheme are so numerous that you are probably
thinking there must be some catch. The only disadvantage of ODBC is that it
isnt as efficient as talking directly to the native database interface. ODBC has
had many detractors make the charge that it is too slow. Microsoft has always
claimed that the critical factor in performance is the quality of the driver
software that is used. In our humble opinion, this is true. The availability of
good ODBC drivers has improved a great deal recently. And anyway, the
criticism about performance is somewhat analogous to those who said that
28

compilers would never match the speed of pure assembly language. Maybe not,
but the compiler (or ODBC) gives you the opportunity to write cleaner
programs, which means you finish sooner. Meanwhile, computers get faster
every year.
5.3.5 JDBC
In an effort to set an independent database standard API for Java; Sun
Microsystems developed Java Database Connectivity, or JDBC. JDBC offers a
generic SQL database access mechanism that provides a consistent interface to
a variety of RDBMSs. This consistent interface is achieved through the use of
plug-in database connectivity modules, or drivers. If a database vendor wishes
to have JDBC support, he or she must provide the driver for each platform that
the database and Java run on.
To gain a wider acceptance of JDBC, Sun based JDBCs framework on
ODBC. As you discovered earlier in this chapter, ODBC has widespread
support on a variety of platforms. Basing JDBC on ODBC will allow vendors to
bring JDBC drivers to market much faster than developing a completely new
connectivity solution.
JDBC was announced in March of 1996. It was released for a 90 day
public review that ended June 8, 1996. Because of user input, the final JDBC
v1.0 specification was released soon after.
The remainder of this section will cover enough information about JDBC for
you to know what it is about and how to use it effectively. This is by no means a
complete overview of JDBC. That would fill an entire book.
JDBC Goals
Few software packages are designed without goals in mind. JDBC is one
that, because of its many goals, drove the development of the API. These goals,
in conjunction with early reviewer feedback, have finalized the JDBC class
library into a solid framework for building database applications in Java.
The goals that were set for JDBC are important. They will give you some
insight as to why certain classes and functionalities behave the way they do. The
eight design goals for JDBC are as follows:
1. SQL Level API
The designers felt that their main goal was to define a SQL interface for
Java. Although not the lowest database interface level possible, it is at a low
29

enough level for higher-level tools and APIs to be created. Conversely, it is


at a high enough level for application programmers to use it confidently.
Attaining this goal allows for future tool vendors to generate JDBC code
and to hide many of JDBCs complexities from the end user.
2. SQL Conformance
SQL syntax varies as you move from database vendor to database vendor.
In an effort to support a wide variety of vendors, JDBC will allow any query
statement to be passed through it to the underlying database driver. This
allows the connectivity module to handle non-standard functionality in a
manner that is suitable for its users.
3. JDBC must be implemental on top of common database interfaces
The JDBC SQL API must sit on top of other common SQL level
APIs. This goal allows JDBC to use existing ODBC level drivers by the
use of a software interface. This interface would translate JDBC calls to
ODBC and vice versa.
4. Provide a Java interface that is consistent with the rest of the Java
system
Because of Javas acceptance in the user community thus far, the
designers feel that they should not stray from the current design of the core
Java system.
5. Keep it simple
This goal probably appears in all software design goal listings. JDBC is
no exception. Sun felt that the design of JDBC should be very simple,
allowing for only one method of completing a task per mechanism. Allowing
duplicate functionality only serves to confuse the users of the API.
6. Use strong, static typing wherever possible
Strong typing allows for more error checking to be done at compile time;
also, less error appear at runtime.
7. Keep the common cases simple
Because more often than not, the usual SQL calls used by the
programmer are simple SELECTs, INSERTs, DELETEs and UPDATEs,
these queries should be simple to perform with JDBC. However, more
complex SQL statements should also be possible.
30

Finally we decided to proceed the implementation using Java


Networking And for dynamically updating the cache table we go for MS
Access database.
Java has two things: a programming language and a platform.
Java is a high-level programming language that is all of the following
Simple

Architecture-neutral

Object-oriented

Portable

Distributed

High-performance

Interpreted

multithreaded

Robust

Dynamic

Secure

Java is also unusual in that each Java program is both compiled and
interpreted. With a compile you translate a Java program into an
intermediate language called Java byte codes the platform-independent
code instruction is passed and run on the computer.

Compilation happens just once; interpretation occurs each time the


program is executed. The figure illustrates how this works.

Interpreter

Java Program

Compilers

My Program

31

You can think of Java byte codes as the machine code instructions
for the Java Virtual Machine (Java VM). Every Java interpreter,
whether its a Java development tool or a Web browser that can run
Java applets, is an implementation of the Java VM. The Java VM can
also be implemented in hardware.
Java byte codes help make write once, run anywhere possible. You
can compile your Java program into byte codes on my platform that
has a Java compiler. The byte codes can then be run any
implementation of the Java VM. For example, the same Java program
can run Windows NT, Solaris, and Macintosh.

5.3.6 Networking

TCP/IP stack
The TCP/IP stack is shorter than the OSI one:

TCP is a connection-oriented protocol; UDP (User Datagram


Protocol) is a connectionless protocol.

32

IP datagrams
The IP layer provides a connectionless and unreliable delivery
system. It considers each datagram independently of the others. Any
association between datagram must be supplied by the higher layers.
The IP layer supplies a checksum that includes its own header. The
header includes the source and destination addresses. The IP layer
handles routing through an Internet. It is also responsible for breaking
up large datagram into smaller ones for transmission and reassembling
them at the other end.
UDP
UDP is also connectionless and unreliable. What it adds to IP is a
checksum for the contents of the datagram and port numbers. These are
used to give a client/server model - see later.

TCP
TCP supplies logic to give a reliable connection-oriented protocol
above IP. It provides a virtual circuit that two processes can use to
communicate.
Internet addresses
In order to use a service, you must be able to find it. The Internet
uses an address scheme for machines so that they can be located. The
address is a 32 bit integer which gives the IP address. This encodes a
network ID and more addressing. The network ID falls into various
classes according to the size of the network address.
Network address
Class A uses 8 bits for the network address with 24 bits left over for
other addressing. Class B uses 16 bit network addressing. Class C uses
24 bit network addressing and class D uses all 32.

33

Subnet address
Internally, the UNIX network is divided into sub networks. Building
11 is currently on one sub network and uses 10-bit addressing, allowing
1024 different hosts.
Host address
8 bits are finally used for host addresses within our subnet. This
places a limit of 256 machines that can be on the subnet.
Total address

The 32 bit address is usually written as 4 integers separated by dots.

Port addresses
A service exists on a host, and is identified by its port. This is a 16
bit number. To send a message to a server, you send it to the port for
that service of the host that it is running on. This is not location
transparency! Certain of these ports are "well known".
Sockets
A socket is a data structure maintained by the system to handle
network connections. A socket is created using the call socket. It returns
an integer that is like a file descriptor. In fact, under Windows, this
handle can be used with Read File and Write File functions.
34

#include <sys/types.h>
#include <sys/socket.h>
int socket(int family, int type, int protocol);
Here "family" will be AF_INET for IP communications, protocol
will be zero, and type will depend on whether TCP or UDP is used.
Two processes wishing to communicate over a network create a socket
each. These are similar to two ends of a pipe - but the actual pipe does
not yet exist.
JFree Chart
JFreeChart is a free 100% Java chart library that makes it easy for
developers to display professional quality charts in their applications.
JFreeChart's extensive feature set includes:
A consistent and well-documented API, supporting a wide range of
chart types;
A flexible design that is easy to extend, and targets both server-side
and client-side applications;
Support for many output types, including Swing components,
image files (including PNG and JPEG), and vector graphics file formats
(including PDF, EPS and SVG);
JFreeChart is "open source" or, more specifically, free software. It
is distributed under the terms of the GNU Lesser General Public Licence
(LGPL), which permits use in proprietary applications.
1. Map Visualizations
Charts showing values that relate to geographical areas. Some
examples include: (a) population density in each state of the United
States, (b) income per capita for each country in Europe, (c) life
expectancy in each country of the world. The tasks in this project include:
Sourcing freely redistributable vector outlines for the countries of
the world, states/provinces in particular countries (USA in particular, but
also other areas);

35

Creating an appropriate dataset interface (plus default


implementation), a rendered, and integrating this with the existing
XYPlot class in JFreeChart;
Testing, documenting, testing some more, documenting some
more.
2. Time Series Chart Interactivity

Implement a new (to JFreeChart) feature for interactive time series charts --to display a separate control that shows a small version of ALL the time
series data, with a sliding "view" rectangle that allows you to select the
subset of the time series data to display in the main chart.
3. Dashboards
There is currently a lot of interest in dashboard displays. Create a flexible
dashboard mechanism that supports a subset of JFreeChart chart types (dials,
pies, thermometers, bars, and lines/time series) that can be delivered easily
via both Java Web Start and an applet.
4. Property Editors
The property editor mechanism in JFreeChart only handles a small subset of
the properties that can be set for charts. Extend (or reimplement) this
mechanism to provide greater end-user control over the appearance of the
charts.
J2ME (Java 2 Micro edition):Sun Microsystems defines J2ME as "a highly optimized Java run-time
environment targeting a wide range of consumer products, including pagers,
cellular phones, screen-phones, digital set-top boxes and car navigation
systems." Announced in June 1999 at the JavaOne Developer Conference,
J2ME brings the cross-platform functionality of the Java language to smaller
devices, allowing mobile wireless devices to share applications. With J2ME,
Sun has adapted the Java platform for consumer products that incorporate or are
based on small computing devices.

36

1. General J2ME architecture

J2ME uses configurations and profiles to customize the Java Runtime


Environment (JRE). As a complete JRE, J2ME is comprised of a configuration,
which determines the JVM used, and a profile, which defines the application by
adding domain-specific classes. The configuration defines the basic run-time
environment as a set of core classes and a specific JVM that run on specific
types of devices. We'll discuss configurations in detail in the The profile defines
the application; specifically, it adds domain-specific classes to the J2ME
configuration to define certain uses for devices. We'll cover profiles in depth in
the The following graphic depicts the relationship between the different virtual
machines, configurations, and profiles. It also draws a parallel with the J2SE
API and its Java virtual machine. While the J2SE virtual machine is generally
referred to as a JVM, the J2ME virtual machines, KVM and CVM, are subsets
of JVM. Both KVM and CVM can be thought of as a kind of Java virtual
machine -- it's just that they are shrunken versions of the J2SE JVM and are
specific to J2ME.
2. Developing J2ME applications
Introduction In this section, we will go over some considerations you need to
keep in mind when developing applications for smaller devices. We'll take a
look at the way the compiler is invoked when using J2SE to compile J2ME
applications. Finally, we'll explore packaging and deployment and the role
preverification plays in this process.
37

3. Design considerations for small devices


Developing applications for small devices requires you to keep certain strategies
in mind during the design phase. It is best to strategically design an application
for a small device before you begin coding. Correcting the code because you
failed to consider all of the "gotchas" before developing the application can be a
painful process. Here are some design strategies to consider:
* Keep it simple. Remove unnecessary features, possibly making those features
a separate, secondary application.
* Smaller is better. This consideration should be a "no brainer" for all
developers. Smaller applications use less memory on the device and require
shorter installation times. Consider packaging your Java applications as
compressed Java Archive (jar) files.
* Minimize run-time memory use. To minimize the amount of memory used at
run time, use scalar types in place of object types. Also, do not depend on the
garbage collector. You should manage the memory efficiently yourself by
setting object references to null when you are finished with them. Another way
to reduce run-time memory is to use lazy instantiation, only allocating objects
on an as-needed basis. Other ways of reducing overall and peak memory use on
small devices are to release resources quickly, reuse objects, and avoid
exceptions.
4. Configurations overview
The configuration defines the basic run-time environment as a set of core
classes and a specific JVM that run on specific types of devices. Currently, two
configurations exist for J2ME, though others may be defined in the future:
* Connected Limited Device Configuration (CLDC) is used specifically with
the KVM for 16-bit or 32-bit devices with limited amounts of memory. This is
the configuration (and the virtual machine) used for developing small J2ME
applications. Its size limitations make CLDC more interesting and challenging
(from a development point of view) than CDC. CLDC is also the configuration
that we will use for developing our drawing tool application. An example of a
small wireless device running small applications is a Palm hand-held computer.

38

* Connected Device Configuration (CDC) is used with the C virtual machine


(CVM) and is used for 32-bit architectures requiring more than 2 MB of
memory. An example of such a device is a Net TV box.
5. J2ME profiles
What is a J2ME profile?
As we mentioned earlier in this tutorial, a profile defines the type of device
supported. The Mobile Information Device Profile (MIDP), for example,
defines classes for cellular phones. It adds domain-specific classes to the J2ME
configuration to define uses for similar devices. Two profiles have been defined
for J2ME and are built upon CLDC: KJava and MIDP. Both KJava and MIDP
are associated with CLDC and smaller devices. Profiles are built on top of
configurations. Because profiles are specific to the size of the device (amount of
memory) on which an application runs, certain profiles are associated with
certain configurations.
A skeleton profile upon which you can create your own profile, the Foundation
Profile, is available for CDC.
Profile 1: KJava
KJava is Sun's proprietary profile and contains the KJava API. The KJava
profile is built on top of the CLDC configuration. The KJava virtual machine,
KVM, accepts the same byte codes and class file format as the classic J2SE
virtual machine. KJava contains a Sun-specific API that runs on the Palm OS.
The KJava API has a great deal in common with the J2SE Abstract Windowing
Toolkit (AWT). However, because it is not a standard J2ME package, its main
package is com.sun.kjava. We'll learn more about the KJava API later in this
tutorial when we develop some sample applications.
Profile 2: MIDP
MIDP is geared toward mobile devices such as cellular phones and pagers. The
MIDP, like KJava, is built upon CLDC and provides a standard run-time
environment that allows new applications and services to be deployed
dynamically on end user devices. MIDP is a common, industry-standard profile
for mobile devices that is not dependent on a specific vendor. It is a complete
and supported foundation for mobile application

39

development. MIDP contains the following packages, the first three of which
are core CLDC packages, plus three MIDP-specific packages.

* java.lang
* java.io
* java.util
* javax.microedition.io
* javax.microedition.lcdui
* javax.microedition.midlet
* javax.microedition.rms

40

6. CODING

6.1 ADMIN LOGIN


<!DOCTYPE HTML>
<html>

<head>
<title>Secure Auditing</title>
<meta name="description" content="website description" />
<meta
name="keywords"
keywords"/>

content="website

keywords,

website

<meta http-equiv="content-type" content="text/html; charset=UTF-8" />


<link
rel="shortcut
icon"
href="images/brainstorming_alternative.png"/>

type="image/x-icon"

<link rel="stylesheet" type="text/css" href="css/style.css" />


<!-- modernizr enables HTML5 elements and feature detects -->
<script type="text/javascript" src="js/modernizr-1.5.min.js"></script>
<style>
#id{
width: 200px;
height: 25px;
background-color: #D3F2F7;
}
#but{
width: 60px;
41

height: 25px;
}
</style>
<script>
function validation() {
var uname = document.ulogin.username.value;
var pass = document.ulogin.password.value;

if (uname == 0) {
alert("Enter ADMIN ID");
document.ulogin.username.focus();
return false;
}
if (pass == 0) {
alert("Enter password");
document.ulogin.password.focus();
return false;
}
}
</script>
</head>

<body>
<div id="main">

42

<header>
<div id="logo">
<div id="logo_text">
<!-- class="logo_colour", allows you to change the colour of the
text -->
<pre> <h1><a href="#"> Secure Auditing and De-duplicating
Data in Cloud</a></h1></pre>
</div>
</div>
<nav>
<ul class="sf-menu" id="nav">
<li><a href="index.html">Home</a></li>
<li class="selected"><a href="admin.jsp">Auditor</a></li>
<li><a href="p_cloud.jsp">Cloud Server</a></li>
<li><a href="#">User</a>
<ul>
<li><a href="user.jsp">Login</a></li>
<li><a href="register.jsp">Register</a></li>
</ul>
</li>
<li><a href="">Contact Us</a></li>
</ul>
</nav>
</header>
<div id="site_content">
43

<div id="sidebar_container">
<div class="gallery">
<ul class="images">
<li
class="show"><img
width="450"
src="images/model.png" alt="photo_one" /></li>

height="450"

<li><img width="450" height="450" src="images/c2.jpg"


alt="photo_two" /></li>
<li><img width="450" height="450" src="images/c3.jpg"
alt="photo_three" /></li>
<li><img width="450" height="450" src="images/c4.jpg"
alt="photo_four" /></li>
<li><img width="450" height="450" src="images/c5.jpg"
alt="photo_five" /></li>
</ul>
</div>
</div>
<div id="content">
<fieldset

style="background-color:
9px;height:
350px;margin-top:
url('images/audit.jpg')">

lightblue;border-radius:
40px;background-image:

<center>
<h1 style="font: monospace;font-size:
bold;color: darkslateblue">Auditor Login</h1><br>

35px;font-weight:

<form action="admin_login" name="ulogin" method="post"


onsubmit="return validation()">
<!--

USERNAME:<br>-->

<input
type="text"
id="id"
placeholder="Enter Auditor ID"/><br></br><br>
44

name="username"

<!--

PASSWORD:<br>-->

<input type="password"
placeholder="Enter password"/><br></br><br>

id="id"name="password"

<input type="submit" id="but" value=""style="borderradius:


12px;height:
38px;width:
93px;background-image:
url('images/submit.png')"/>
&nbsp;&nbsp;
<input type="reset"id="but" value="" style="border-radius:
12px;height:
38px;width:
83px;background-image:
url('images/reset.png')"></input><br></br>
</form>
</center>
</fieldset>
</div>
</div>
<footer>
<p>Copyright &copy; Jpinfotech. All Rights Reserved.</p>
</footer>
</div>
<p>&nbsp;</p>
<!-- javascript at the bottom for fast page loading -->
<script type="text/javascript" src="js/jquery.js"></script>
<script type="text/javascript" src="js/jquery.easing-sooper.js"></script>
<script type="text/javascript" src="js/jquery.sooperfish.js"></script>
<script type="text/javascript" src="js/image_fade.js"></script>
<script type="text/javascript">
45

$(document).ready(function() {
$('ul.sf-menu').sooperfish();
});
</script>
</body>
</html>

46

6.2 USER LOGIN


<!DOCTYPE HTML>
<html>

<head>
<title>Secure Auditing</title>
<meta name="description" content="website description" />
<meta name="keywords" content="website keywords, website keywords" />
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
<link
rel="shortcut
icon"
href="images/brainstorming_alternative.png"/>

type="image/x-icon"

<link rel="stylesheet" type="text/css" href="css/style.css" />


<!-- modernizr enables HTML5 elements and feature detects -->
<script type="text/javascript" src="js/modernizr-1.5.min.js"></script>
<script>
function validation(){
var uname=document.ulogin.username.value;
var pass=document.ulogin.password.value;

if(uname==0){
alert("Enter username");
document.ulogin.username.focus();
return false;
}
if(pass==0){
47

alert("Enter password");
document.ulogin.password.focus();
return false;
}
}
</script>
<style>
#id{
width: 200px;
height: 25px;
background-color: #D5D5D5;
}
#but{
width: 60px;
height: 25px;
}
</style>
</head>

<body>
<%
if(request.getParameter("status")!=null){
out.println("<script>alert('Registered')</script>");
}

48

%>
<div id="main">
<header>
<div id="logo">
<div id="logo_text">
<!-- class="logo_colour", allows you to change the colour of the text -->
<<pre> <h1><a href="#"> Secure Auditing and De-duplicating Data in
Cloud</a></h1></pre>
</div>
</div>
<nav>
<ul class="sf-menu" id="nav">
<li><a href="index.html">Home</a></li>
<li><a href="admin.jsp">Auditor</a></li>
<li><a href="p_cloud.jsp">Cloud Server</a></li>

<li class="selected"><a href="#">User</a>


<ul>
<li class="selected"><a href="user.jsp">Login</a></li>
<li><a href="register.jsp">Register</a></li>
</ul>
</li>
</ul>
</nav>
49

</header>
<div id="site_content">
<div id="sidebar_container">
<div class="gallery">
<ul class="images">
<li class="show"><img width="450" height="450" src="images/c1.jpg"
alt="photo_one" /></li>
<li><img
width="450"
alt="photo_two" /></li>

height="450"

src="images/c2.jpg"

<li><img
width="450"
alt="photo_three" /></li>

height="450"

src="images/c3.jpg"

<li><img
width="450"
alt="photo_four" /></li>

height="450"

src="images/c4.jpg"

<li><img
width="450"
alt="photo_five" /></li>

height="450"

src="images/c5.jpg"

</ul>
</div>
</div>
<div id="content">
<fieldset style="background-color: lightblue;border-radius: 9px;height:
430px;margin-top: 10px;background-image: url('images/user.jpg')">
<center>
<h1 style="font: monospace;font-size: 35px;font-weight: bold;color:
darkslateblue">Cloud User Login</h1><br>
<form
action="user_login"
onsubmit="return validation()">
<!--

USERNAME:<br>-->
50

name="ulogin"

method="post"

<input
type="text"
id="id"
username"/><br></br><br>
<!--

name="username"

placeholder="Enter

id="id"name="password"

placeholder="Enter

PASSWORD:<br>-->

<input
type="password"
password"/><br></br><br>

<input type="submit" id="but" value=""style="border-radius: 12px;height:


38px;width: 93px;background-image: url('images/submit.png')"/>
&nbsp;&nbsp;
<input type="reset"id="but" value="" style="border-radius:
12px;height:
38px;width:
83px;background-image:
url('images/reset.png')"></input><br></br>
<font style="font-size: 17px;">
<label style="font: monospace;color:
darkslateblue">New User</label>
<a href="register.jsp" style="font:
monospace;color: darkslateblue">Register here</a> </font>

</form>
</center>
</fieldset>
</div>
</div>
<footer>
<p>Copyright &copy; Jpinfotech. All Rights Reserved.</p>
</footer>
</div>
<p>&nbsp;</p>
<!-- javascript at the bottom for fast page loading -->
<script type="text/javascript" src="js/jquery.js"></script>
51

<script type="text/javascript" src="js/jquery.easing-sooper.js"></script>


<script type="text/javascript" src="js/jquery.sooperfish.js"></script>
<script type="text/javascript" src="js/image_fade.js"></script>
<script type="text/javascript">
$(document).ready(function() {
$('ul.sf-menu').sooperfish();
});
</script>
</body>
</html>

52

6.3 REGISTRATION

<!DOCTYPE HTML>
<html>

<head>
<title>Secure Auditing</title>
<meta name="description" content="website description" />
<meta name="keywords" content="website keywords, website keywords" />
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
<link
rel="shortcut
icon"
href="images/brainstorming_alternative.png"/>

type="image/x-icon"

<link rel="stylesheet" type="text/css" href="css/style.css" />


<!-- modernizr enables HTML5 elements and feature detects -->
<script type="text/javascript" src="js/modernizr-1.5.min.js"></script>
<script>

function validation(){
var uname=document.ureg.username.value;
var pass=document.ureg.password.value;
var cpass=document.ureg.cpassword.value;
var mail=document.ureg.mail.value;
var name=document.ureg.name.value;
var ph=document.ureg.mobile.value;

53

if(name==0){
alert("Enter name");
document.ureg.name.focus();
return false;
}
if(uname==0){
alert("Enter username");
document.ureg.username.focus();
return false;
}
if(pass==0){
alert("Enter password");
document.ureg.password.focus();
return false;
}
if(cpass!=pass){
alert("Incorrect password");
document.ureg.cpassword.focus();
return false;
}
if(mail==0){
alert("Enter mailid");
document.ureg.mail.focus();
return false;

54

}
if(ph==0){
alert("Enter your mobile no");
document.ureg.mobile.focus();
return false;
}
if(isNaN(ph)){
alert("Invalid phoneno");
document.ureg.mobile.focus();
return false;
}
}
</script>
<style>
input{
width: 200px;
height: 25px;
background-color: #D5D5D5;
}
#but{
width: 75px;
height: 25px;
}
</style>

55

</head>
<body>
<div id="main">
<header>
<div id="logo">
<div id="logo_text">
<!-- class="logo_colour", allows you to change the colour of the text -->
<pre> <h1><a href="#"> Secure Auditing and De-duplicating Data in
Cloud</a></h1></pre>
</div>
</div>
<nav>
<ul class="sf-menu" id="nav">
<li><a href="index.html">Home</a></li>
<li><a href="admin.jsp">Auditor</a></li>
<li><a href="p_cloud.jsp">Cloud Server</a></li>
<li class="selected"><a href="#">User</a>
<ul>
<li><a href="user.jsp">Login</a></li>
<li class="selected"><a href="register.jsp">Register</a></li>
</ul>
</li>
</ul>
</nav>
</header>
56

<div id="site_content">
<div id="sidebar_container">
<div class="gallery">
<ul class="images">
<li class="show"><img width="450" height="450" src="images/c1.jpg"
alt="photo_one" /></li>
<li><img
width="450"
alt="photo_two" /></li>

height="450"

src="images/c2.jpg"

<li><img
width="450"
alt="photo_three" /></li>

height="450"

src="images/c3.jpg"

<li><img
width="450"
alt="photo_four" /></li>

height="450"

src="images/c4.jpg"

<li><img
width="450"
alt="photo_five" /></li>

height="450"

src="images/c5.jpg"

</ul>
</div>
</div>
<div id="content">
<fieldset style="background-color: lightblue;border-radius: 9px;height:
460px;margin-top: 00px;background-image: url('images/user.jpg')">
<center>
<h1 style="font: monospace;font-size: 26px;font-weight:
bold;color: darkslateblue">Registration</h1>
</center>

<form action="registration" name="ureg" style="position:


relative;left: 110px;" method="get" onsubmit="return validation()">
57

<!--

USERNAME:<br>-->

<label style="font: monospace;color: darkslategrey;font-weight: bold">ENTER


NAME:</label><BR>
<input type="text" name="name" placeholder="Enter your name" ><br></br>
<label style="font: monospace;color: darkslategrey;font-weight: bold">ENTER
USERNAME:</label><BR>
<input
type="text"
username"><br></br>
<!--

name="username"

placeholder="Enter

PASSWORD:<br>-->

<label style="font: monospace;color: darkslategrey;font-weight: bold">ENTER


PASSWORD:</label><BR>
<input
type="password"
password"><br></br>

name="password"

<label
style="font:
monospace;color:
bold">CONFIRM PASSWORD:</label><BR>
<input type="password"
password"><br></br>

name="cpassword"

placeholder="Enter

darkslategrey;font-weight:
placeholder="confirm

your

<label style="font: monospace;color: darkslategrey;font-weight: bold">ENTER


EMAIL:</label><BR>
<input type="email" name="mail" placeholder="Enter your email"><br></br>
<label style="font: monospace;color: darkslategrey;font-weight: bold">ENTER
PHONE NO:</label><BR>
<input type="text" name="mobile" placeholder="Enter your mobile no"
maxlength="10"><br></br>
&nbsp;<input
type="submit"
id="but"
value=""style="border-radius:
12px;height: 38px;width: 93px;background-image: url('images/submit.png')"/>

58

&nbsp;&nbsp; <input type="reset" id="but" value="" style="border-radius:


12px;height:
38px;width:
83px;background-image:
url('images/reset.png')"></input><br></br>

</form>
</fieldset>
</div>
</div>
<footer>
<p>Copyright &copy; Jpinfotech. All Rights Reserved.</p>
</footer>
</div>
<p>&nbsp;</p>
<!-- javascript at the bottom for fast page loading -->
<script type="text/javascript" src="js/jquery.js"></script>
<script type="text/javascript" src="js/jquery.easing-sooper.js"></script>
<script type="text/javascript" src="js/jquery.sooperfish.js"></script>
<script type="text/javascript" src="js/image_fade.js"></script>
<script type="text/javascript">
$(document).ready(function() {
$('ul.sf-menu').sooperfish();
});
</script>
</body>
</html>
59

7.SYSTEM TESTING
The purpose of testing is to discover errors. Testing is the process of
trying to discover every conceivable fault or weakness in a work product. It
provides a way to check the functionality of components, sub-assemblies,
assemblies and/or a finished product It is the process of exercising software
with the intent of ensuring that the
Software system meets its requirements and user expectations and does not fail
in an unacceptable manner. There are various types of test. Each test type
addresses a specific testing requirement.
TYPES OF TESTS
7.1 Unit testing
Unit testing involves the design of test cases that validate that the internal
program logic is functioning properly, and that program inputs produce valid
outputs. All decision branches and internal code flow should be validated. It is
the testing of individual software units of the application .it is done after the
completion of an individual unit before integration. This is a structural testing,
that relies on knowledge of its construction and is invasive. Unit tests perform
basic tests at component level and test a specific business process, application,
and/or system configuration. Unit tests ensure that each unique path of a
business process performs accurately to the documented specifications and
contains clearly defined inputs and expected results.
7.2 Integration testing
Integration tests are designed to test integrated software components to
determine if they actually run as one program. Testing is event driven and is
more concerned with the basic outcome of screens or fields. Integration tests
demonstrate that although the components were individually satisfaction, as
shown by successfully unit testing, the combination of components is correct
and consistent. Integration testing is specifically aimed at exposing the
problems that arise from the combination of components.

60

7.3 Functional test


Functional tests provide systematic demonstrations that functions tested are
available as specified by the business and technical requirements, system
documentation, and user manuals.
Functional testing is centered on the following items:
Valid Input

: identified classes of valid input must be accepted.

Invalid Input

: identified classes of invalid input must be rejected.

Functions

: identified functions must be exercised.

Output
exercised.

: identified classes of application outputs must be

Systems/Procedures: interfacing systems or procedures must be invoked.


Organization and preparation of functional tests is focused on requirements,
key functions, or special test cases. In addition, systematic coverage pertaining
to identify Business process flows; data fields, predefined processes, and
successive processes must be considered for testing. Before functional testing is
complete, additional tests are identified and the effective value of current tests is
determined.
System Test
System testing ensures that the entire integrated software system meets
requirements. It tests a configuration to ensure known and predictable results.
An example of system testing is the configuration oriented system integration
test. System testing is based on process descriptions and flows, emphasizing
pre-driven process links and integration points.
7.4 White Box Testing
White Box Testing is a testing in which in which the software tester has
knowledge of the inner workings, structure and language of the software, or at
least its purpose. It is purpose. It is used to test areas that cannot be reached
from a black box level.

61

7.5 Black Box Testing


Black Box Testing is testing the software without any knowledge of the
inner workings, structure or language of the module being tested. Black box
tests, as most other kinds of tests, must be written from a definitive source
document, such as specification or requirements document, such as
specification or requirements document. It is a testing in which the software
under test is treated, as a black box .you cannot see into it. The test provides
inputs and responds to outputs without considering how the software works.
Unit Testing:
Unit testing is usually conducted as part of a combined code and unit test
phase of the software lifecycle, although it is not uncommon for coding and unit
testing to be conducted as two distinct phases.
Test strategy and approach
Field testing will be performed manually and functional tests will be
written in detail.
Test objectives
All field entries must work properly.
Pages must be activated from the identified link.
The entry screen, messages and responses must not be delayed.
Features to be tested
Verify that the entries are of the correct format
No duplicate entries should be allowed
All links should take the user to the correct page.

Integration Testing
Software integration testing is the incremental integration testing of two
or more integrated software components on a single platform to produce failures
caused by interface defects.
The task of the integration test is to check that components or software
applications, e.g. components in a software system or one step up software
applications at the company level interact without error.

62

Test Results: All the test cases mentioned above passed successfully. No
defects encountered.
Acceptance Testing
User Acceptance Testing is a critical phase of any project and requires
significant participation by the end user. It also ensures that the system meets
the functional requirements.
Test Results: All the test cases mentioned above passed successfully. No
defects encountered.

63

8.SCREENSHOTS

Register

64

65

USER LOGIN

66

67

68

ADMIN LOGIN

69

70

CLOUD SERVER

71

72

9. FUTURE ENHANCEMENTS
This project is an attempt to enhance the Security Model of SecCloud+
with multi-layered cryptosystem based Secure and Authorized Auditing
deduplication model. In this project, we present a scheme that permits a more
fine-grained trade-off. The intuition is that outsourced data may require
different levels of protection, depending on how popular it is: content shared by
many users, such as a popular song or video, arguably requires less protection
than a personal document, the copy of a payslip or the draft of an unsubmitted
scientific paper. As more corporate and private users outsource their data to
cloud storage providers, recent data breach incidents make end-to-end
encryption an increasingly prominent requirement. Unfortunately, semantically
secure encryption schemes render various cost-effective storage optimization
techniques, such as data deduplication, ineffective. We present a novel idea that
differentiates data according to their popularity. Based on this idea, we design
an encryption scheme that guarantees semantic security for unpopular data and
provides weaker security and better storage and bandwidth benefits for popular
data. This way, data deduplication can be effective for popular data, whilst
semantically secure encryption protects unpopular content. We show that our
scheme is secure under the Symmetric External Decisional Diffe-Hellman
Assumption in the random oracle model.

73

10. CONCLUSION
Aiming at achieving both data integrity and deduplication in cloud, we
propose SecCloud and SecCloud+. SecCloud introduces an auditing entity with
maintenance of a MapReduce cloud, which helps clients generate data tags
before uploading as well as audit the integrity of data having been stored in
cloud. In addition, SecCoud enables secure deduplication through introducing a
Proof of Ownership protocol and preventing the leakage of side channel
information in data deduplication. Compared with previous work, the
computation by user in SecCloud is greatly reduced during the file uploading
and auditing phases. SecCloud+ is an advanced construction motivated by the
fact that customers always want to encrypt their data before uploading, and
allows for integrity auditing and secure deduplication directly on encrypted
data.

74

11. BIBILOGRAPHY
[1] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G.
Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia, A view of cloud
computing, Communication of the ACM, vol. 53, no. 4, pp. 5058, 2010.
[2] J. Yuan and S. Yu, Secure and constant cost public cloud storage auditing
with deduplication, in IEEE Conference on Communications and Network
Security (CNS), 2013, pp. 145153.
[3] S. Halevi, D. Harnik, B. Pinkas, and A. Shulman-Peleg, Proofs of
ownership in remote storage systems, in Proceedings of the 18th ACM
Conference on Computer and Communications Security. ACM, 2011, pp. 491
500.
[4] S. Keelveedhi, M. Bellare, and T. Ristenpart, Dupless: Serveraided
encryption for deduplicated storage, in Proceedings of the 22Nd USENIX
Conference on Security, ser. SEC13. Washington, D.C.: USENIX Association,
2013, pp. 179194. [Online]. Available:
https://www.usenix.org/conference/usenixsecurity13/technicalsessions/presentat
ion/bellare

[5] G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner, Z. Peterson, and


D. Song, Provable data possession at untrusted stores, in Proceedings of the
14th ACM Conference on Computer and Communications Security, ser. CCS
07. New York, NY, USA: ACM, 2007, pp. 598609.

75

[6] G. Ateniese, R. Burns, R. Curtmola, J. Herring, O. Khan, L. Kissner, Z.


Peterson, and D. Song, Remote data checking using provable data possession,
ACM Trans. Inf. Syst. Secur., vol. 14, no. 1, pp. 12:112:34, 2011.
[7] G. Ateniese, R. Di Pietro, L. V. Mancini, and G. Tsudik, Scalable and
efficient provable data possession, in Proceedings of the 4th International
Conference on Security and Privacy in Communication Netowrks, ser.
SecureComm 08. New York, NY, USA: ACM, 2008, pp. 9:19:10.
[8] C. Erway, A. Kupc u, C. Papamanthou, and R. Tamassia, Dynamic
provable data possession, in Proceedings of the 16th ACM Conference on
Computer and Communications Security, ser. CCS 09. New York, NY, USA:
ACM, 2009, pp. 213222.

[9] F. Sebe, J. Domingo-Ferrer, A. Martinez-Balleste, Y. Deswarte, and J.-J.


Quisquater, Efficient remote data possession checking in critical information
infrastructures, IEEE Trans. on Knowl. and Data Eng., vol. 20, no. 8, pp.
10341038, 2008.
[10] H. Wang, Proxy provable data possession in public clouds, IEEE
Transactions on Services Computing, vol. 6, no. 4, pp. 551559, 2013.

76

12. APPENDICES

77