Académique Documents
Professionnel Documents
Culture Documents
Volume: 3 Issue: 7
ISSN: 2321-8169
4382 - 4387
_______________________________________________________________________________________________
Securing Hadoop using OAuth 2.0 and Real Time Encryption Algorithm
Mr. Vijaykumar N. Patil
Prof. Venkatesan N.
AbstractHadoop is most popularly used distributed programming framework for processing large amount of data with Hadoop distributed file
system (HDFS), but processing personal or sensitive data on distributed environment demands secure computing. Originally Hadoop was
designed without any security model. Hadoop projects deals with security of data as a top agenda, which in turn to represents classification of a
critical data item. The data from various applications such as financial deemed to be sensitive which need to be secured. With the growing
acceptance of Hadoop, there is an increasing trend to incorporate more and more enterprise security features. The encryption and decryption
technique is used before writing or reading data from HDFS respectively. Advanced Encryption Standard (AES) enables protection of data at
each cluster which performs encryption or decryption before read or writes occurs at HDFS. The earlier methods do not provide Data privacy
due to the similar mechanism used to provide data security to all users at HDFS and also it increases the file size; so these are not suitable for
real-time application. Hadoop require additional terminology to provide unique data security to all users and encrypt data with the compatible
speed. We have implemented method in which OAuth does the authentication and provide unique authorization token for each user which is
used in encryption technique that provide data privacy for all users of Hadoop. The Real Time encryption algorithms used for securing data in
HDFS uses the key that is generated by using authorization token.
Keywords- Hadoop,DataNode,NameNode,TaskTracker,ASE,HDFS,OAuth.
__________________________________________________*****_________________________________________________
I.
INTRODUCTION
4382
IJRITCC | July 2015, Available @ http://www.ijritcc.org
_______________________________________________________________________________________
ISSN: 2321-8169
4382 - 4387
_______________________________________________________________________________________________
RELATED WORK
PROPOSED SYSTEM
4383
IJRITCC | July 2015, Available @ http://www.ijritcc.org
_______________________________________________________________________________________
ISSN: 2321-8169
4382 - 4387
_______________________________________________________________________________________________
3.
4.
5.
6.
1.
Start
2.
3.
4.
5.
6.
7.
Stop
Decryption Steps
A. Algorithms for OAuth Authorization Server
Input: User Credentials
Output: Authentication token & authorization token
The following steps explain the server-side flow:
1.
Start
2.
3.
4.
5.
1.
Start
2.
3.
6.
4.
7.
Stop
5.
6.
7.
Stop
2.
Start
2.
_______________________________________________________________________________________
ISSN: 2321-8169
4382 - 4387
_______________________________________________________________________________________________
4.
5.
F= {f1, f2 ...fn}
Where f1is a text file
S= {AT, OT, F}
5.
Authorization token
6.
7.
Authorization token
J = {j1_dn,j2_dn,........,jn_dn}
Where j1_dn is a MapReduce program with
decryption process
S = {AT, OT, F, En, Dn, J}
8.
Initialize Tokens
IV.
A) At = {}
B) Ot = {}
2.
3.
4.
4385
IJRITCC | July 2015, Available @ http://www.ijritcc.org
_______________________________________________________________________________________
ISSN: 2321-8169
4382 - 4387
_______________________________________________________________________________________________
The NameNode is center piece of Hadoop in light of the
fact that it controls the entire DataNodes exhibit in bunch. It is
a Single-Point-of-Failure yet late forms (0.21+) accompany
Backup NameNode [2] to make it exceptionally accessible. The
DataNodes contain all the information in bunch on which we
will work our MapReduce projects and perspective the activity
information from different points of view. JobTracker controls
all the tasks which are running on TaskTrackers shown in
following figure 4.
Encryption
Type
Encrypted
Data(MB)
AES
RTEA
AES
RTEA
1.8819
1.0659
20.1015
10.7252
26.0420
22.0510
83.2780
54.2360
10
Figure 4: TaskTracker
We have developed two different encryption techniques
first does encryption using AES and second new algorithm
perform encryption using OAuth token we called as Real-time
encryption algorithm. The MapReduce programs (Hadoop job)
which take the input as encrypted data and execute job, we can
observe that 23.0490 seconds was taken for running a
WordCount MapReduce job for unencrypted HDFS for size of
10MB test file while 83.2780 seconds for the encrypted HDFS
with AES and 54.2360 seconds for encrypted HDFS with Realtime encryption algorithm(RTEA).
Data
(MB)
Encryption
Type
Encrypted
Data(MB)
Time Consume
for Encryption
(sec)
Time Consume
to upload to
HDFS(sec)
AES
RTEA
AES
RTEA
1.8819
1.0659
20.1015
10.7252
26.2190
12.1510
298.0950
131.5510
1.7660
1.6370
10
2.0110
1.8120
_______________________________________________________________________________________
ISSN: 2321-8169
4382 - 4387
_______________________________________________________________________________________________
In Future work our topic leads to generate Hadoop with all
kinds of security mechanism for securing data as well as secure
job execution.
VI.
ACKNOWLEDGMENT
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
V.
[15]
[16]
[17]
[18]
4387
IJRITCC | July 2015, Available @ http://www.ijritcc.org
_______________________________________________________________________________________