MapReduce.
HDFS.
What is Hadoop?
Why HDFS?
HDFS is a distributed file system, while NTFS and FAT are not.
HDFS stores data reliably: it has built-in redundancy and failover. NTFS and FAT have no built-in redundancy or failover.
NTFS and FAT support block sizes of roughly 4-8 KB. HDFS supports much larger block sizes, 64 MB by default (the sketch below shows how to check this from a client).
NTFS and FAT are optimized for random-access reads, while HDFS is optimized for sequential reads.
HDFS does no local caching because file sizes are huge; a typical file in HDFS can be 1 TB or even larger.
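As a small illustrative sketch (not part of the original slides), a client can ask the cluster for these defaults through the classic FileSystem API; it assumes a Hadoop 1.x-era installation whose config files are on the classpath:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class HdfsDefaults {
    public static void main(String[] args) throws Exception {
        // Reads core-site.xml / hdfs-site.xml from the classpath to locate the NameNode
        FileSystem fs = FileSystem.get(new Configuration());
        System.out.println("Default block size : " + fs.getDefaultBlockSize() + " bytes"); // 64 MB on a stock 1.x cluster
        System.out.println("Default replication: " + fs.getDefaultReplication());
    }
}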
Blocks?
Advantages of blocks:
Components of HDFS
NameNode
Secondary NameNode
DataNode
NameNode
Secondary NameNode
Is it a backup for the NameNode? No.
Secondary NameNode
The Secondary NameNode connects to the NameNode (every hour, by default) and grabs a copy of the NameNode's in-memory metadata. It combines this information into a fresh set of files and delivers them back to the NameNode, while keeping a copy for itself.
Configuration : core-site.xml
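A small illustrative sketch (not from the slides) of reading the checkpoint settings; it assumes the Hadoop 1.x keys fs.checkpoint.period (seconds, default 3600) and fs.checkpoint.size, which live in core-site.xml:

import org.apache.hadoop.conf.Configuration;

public class CheckpointSettings {
    public static void main(String[] args) {
        // Configuration loads core-site.xml from the classpath, if present
        Configuration conf = new Configuration();
        long periodSecs = conf.getLong("fs.checkpoint.period", 3600);    // how often the secondary checkpoints
        long sizeBytes  = conf.getLong("fs.checkpoint.size", 67108864);  // or once edits grows past this size
        System.out.println("Checkpoint every " + periodSecs + " s, or every " + sizeBytes + " bytes of edits");
    }
}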
DataNode
DataNodes are the storage servers. These are nodes where the actual
data resides.
These are the slaves in the Hadoop master-slave architecture.
DataNodes store the blocks of data, but without the NameNode they cannot make any sense of these blocks.
Data Nodes send heartbeats to the Name Node every 3 seconds via a
TCP handshake.
Every tenth heartbeat is a Block Report, where the Data Node tells
the Name Node about all the blocks it has.
The block reports allow the NameNode to build its metadata and to ensure that the minimum required replicas of each block exist on different nodes, in different racks (see the client-side sketch below).
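An illustrative client-side sketch (not from the slides) of querying the metadata that those block reports build; the file path is an assumption, and getTopologyPaths() previews the rack placement covered next:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.util.Arrays;

public class ShowBlocks {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FileStatus st = fs.getFileStatus(new Path("/user/hduser/File.txt")); // hypothetical path
        // One BlockLocation per block, built from the DataNodes' block reports
        for (BlockLocation b : fs.getFileBlockLocations(st, 0, st.getLen())) {
            System.out.println("offset " + b.getOffset()
                    + " hosts " + Arrays.toString(b.getHosts())
                    + " racks " + Arrays.toString(b.getTopologyPaths()));
        }
    }
}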
Rack Awareness
[Diagram: the NameNode sits above a core switch connected to three rack switches, each with five DataNodes (DN1-DN15); blocks A and B are marked on the DataNodes that hold them.
NameNode metadata: File.txt = Block A on DN 1, 6, 7; Block B on DN 7, 14, 15.
Rack Awareness: Rack 1 = DN 1-5, Rack 2 = DN 6-10, Rack 3 = DN 11-15.]
HDFS Web Interface
HDFS exposes a web server which is capable of performing basic status monitoring
and file browsing operations.
By default this is exposed on port 50070 on the NameNode: http://namenode:50070/
Contains overview information about the health, capacity, and usage of the cluster
(similar to the information returned by bin/hadoop dfsadmin -report).
The address and port where the web interface listens can be changed by setting
dfs.http.address in conf/hdfs-site.xml.
It must be of the form address:port. To accept requests on all addresses, use 0.0.0.0.
From this interface, you can browse HDFS itself with a basic file-browser interface.
Each DataNode exposes its file browser interface on port 50075. You can override this
by setting the dfs.datanode.http.address configuration key to a setting other than
0.0.0.0:50075.
Log files generated by the Hadoop daemons can be accessed through this interface,
which is useful for distributed debugging and troubleshooting.
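Purely as an illustration (not from the slides), the same keys can be read from a client-side Configuration; the defaults below are the Hadoop 1.x values mentioned above:

import org.apache.hadoop.conf.Configuration;

public class ShowWebUi {
    public static void main(String[] args) {
        // Loads core-site.xml / hdfs-site.xml from the classpath, if present
        Configuration conf = new Configuration();
        String nnUi = conf.get("dfs.http.address", "0.0.0.0:50070");           // NameNode web UI
        String dnUi = conf.get("dfs.datanode.http.address", "0.0.0.0:50075");  // DataNode file browser
        System.out.println("NameNode UI: http://" + nnUi + "/");
        System.out.println("DataNode UI: http://" + dnUi + "/");
    }
}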
copyFromLocal : Similar to the put command, except that the source is restricted to a local file reference.
put : Copy a single src, or multiple srcs, from the local file system to the destination file system.
copyToLocal : Similar to the get command, except that the destination is restricted to a local file reference.
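The same copies can be done programmatically; an illustrative sketch with assumed paths:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CopyLocal {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Equivalent of: hadoop fs -copyFromLocal /tmp/File.txt /user/hduser/File.txt
        fs.copyFromLocalFile(new Path("/tmp/File.txt"), new Path("/user/hduser/File.txt"));
        // Equivalent of: hadoop fs -copyToLocal /user/hduser/File.txt /tmp/File.copy.txt
        fs.copyToLocalFile(new Path("/user/hduser/File.txt"), new Path("/tmp/File.copy.txt"));
    }
}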
Write Operation.
[Diagram: the Hadoop client asks the NameNode to write Result.txt as blocks C and D; the NameNode replies with target DataNodes C => DN 3, 11, 12 and D => DN 9, 4, 5.
Existing NameNode metadata: File.txt = Block A on DN 1, 6, 7; Block B on DN 7, 14, 15.
Rack Awareness: Rack 1 = DN 1-5, Rack 2 = DN 6-10, Rack 3 = DN 11-15.]
Write Operation.
[Diagram: to write block C, the client contacts the first DataNode on its list (DN 3); DN 3 sets up a replication pipeline with DN 11 and DN 12, and a "Ready : 11, 12" acknowledgement comes back to the client before the data is streamed. NameNode metadata and Rack Awareness table as above.]
Write Operation.
[Diagram: as each DataNode stores its copy it sends a "Block Received" report to the NameNode, and a "Success" acknowledgement returns to the client.
The NameNode metadata now reads: File.txt = Block A on DN 1, 6, 7; Block B on DN 7, 14, 15; Result.txt = Block C on DN 3, 11, 12; Block D on DN 9, 4, 5.]
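An illustrative sketch (not from the slides) of the client side of this write path using the classic FileSystem API; the path, replication factor, and block size are assumptions chosen to mirror the example:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriteResult {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // The NameNode picks the target DataNodes; the client only streams bytes,
        // which the DataNodes replicate down the pipeline.
        FSDataOutputStream out = fs.create(
                new Path("/user/hduser/Result.txt"), // hypothetical path
                true,                                // overwrite
                4096,                                // buffer size
                (short) 3,                           // replication factor
                64L * 1024 * 1024);                  // block size (64 MB)
        out.write("hello hdfs\n".getBytes("UTF-8"));
        out.close();
    }
}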
Read Operation.
[Diagram: the client asks the NameNode for File.txt; from its metadata the NameNode returns the block locations A => DN 1, 6, 7 and B => DN 7, 14, 15.
Rack Awareness: Rack 1 = DN 1-5, Rack 2 = DN 6-10, Rack 3 = DN 11-15.]
Read Operation.
[Diagram: with the lists A => DN 1, 6, 7 and B => DN 7, 14, 15 in hand, the client reads each block directly from one of the listed DataNodes.]
The NameNode intelligently orders the list of DataNodes, taking into account the network traffic load on each DataNode containing the block.
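A matching client-side read sketch (again illustrative, with an assumed path); fs.open asks the NameNode for the block locations and then streams the data from the DataNodes:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class ReadFile {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FSDataInputStream in = fs.open(new Path("/user/hduser/File.txt")); // hypothetical path
        try {
            // Copies the whole file to stdout; blocks are fetched from the
            // NameNode-ordered DataNode lists behind the scenes.
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }
}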
Under /home/hduser/tmp/dfs/name
current
image
in_use.lock
previous.checkpoint
current and previous.checkpoint have the same directory structure; both contain files with the same names.
/home/hduser/tmp/dfs/ also contains a namesecondary directory, which has the same directory structure as name, except for the previous.checkpoint directory.
Directory structure for /home/hduser/tmp/dfs/name/current/
edits
fsimage
fstime
VERSION
namespaceID=1443825132
cTime=0
storageType=NAME_NODE
layoutVersion=-19
namespaceID :
Namenode uses it to identify new datanodes, since they will not know the namespaceID
until they have registered with the namenode.
cTime :
Marks the creation time of the namenode's storage. For newly formatted storage the value is zero; it is updated to a timestamp whenever the filesystem is upgraded.
layoutVersion :
A negative integer that defines the version of HDFS's persistent data structures; it is decremented whenever the layout changes.
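Since VERSION is stored as plain key=value text (Java properties format), it can be inspected with the standard library; a minimal sketch using the path from these slides:

import java.io.FileInputStream;
import java.util.Properties;

public class ReadVersion {
    public static void main(String[] args) throws Exception {
        Properties p = new Properties();
        // Path used in these slides; adjust to your dfs.name.dir
        p.load(new FileInputStream("/home/hduser/tmp/dfs/name/current/VERSION"));
        System.out.println("namespaceID   = " + p.getProperty("namespaceID"));
        System.out.println("cTime         = " + p.getProperty("cTime"));
        System.out.println("storageType   = " + p.getProperty("storageType"));
        System.out.println("layoutVersion = " + p.getProperty("layoutVersion"));
    }
}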
1. The secondary asks the primary to roll its edits file, so new edits go to a new file.
2. The secondary retrieves fsimage and edits from the primary (using HTTP GET).
3. The secondary loads fsimage into memory, applies each operation from edits, then creates a new consolidated fsimage file.
4. The secondary sends the new fsimage back to the primary (using HTTP POST).
5. The primary replaces the old fsimage with the new one from the secondary, and the old edits file with the new one it started in step 1. It also updates the fstime file to record the time that the checkpoint was taken.
At the end of the process, the primary has an up-to-date fsimage file and a shorter edits file.
A block file just consists of the raw bytes of a portion of the file being stored;
the metadata file is made up of a header with version and type information, followed by a series of
checksums for sections of the block.
When the number of blocks in a directory grows to a certain size, the datanode creates a new subdirectory in which to place new blocks and their accompanying metadata. It creates a new subdirectory every time the number of blocks in a directory reaches 64 (controlled by the dfs.datanode.numblocks property).
This ensures that there is a manageable number of files per directory, which avoids the problems that most operating systems encounter when there are a large number of files in a single directory.
Cluster configuration : Add a key named dfs.hosts.exclude to your conf/hadoop-site.xml file. The value associated with this key provides the full path to a file on the NameNode's local file system which contains a list of machines which are not permitted to connect to HDFS.
Determine hosts to decommission. Each machine to be decommissioned should be added to the file identified by
dfs.hosts.exclude, one per line. This will prevent them from connecting to the NameNode.
Force configuration reload. Run the command bin/hadoop dfsadmin -refreshNodes. This will force the NameNode to reread its
configuration, including the newly-updated excludes file.
It will decommission the nodes over a period of time, allowing time for each node's blocks to be replicated onto machines which are
scheduled to remain active.
Shutdown nodes. After the decommission process has completed, the decommissioned hardware can be safely shut down for maintenance, etc. The bin/hadoop dfsadmin -report command will describe which nodes are connected to the cluster.
Edit excludes file again. Once the machines have been decommissioned, they can be removed from the excludes file. Running
bin/hadoop dfsadmin -refreshNodes again will read the excludes file back into the NameNode, allowing the DataNodes to rejoin the
cluster after maintenance has been completed, or additional capacity is needed in the cluster again, etc.
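For completeness, the refresh in the "Force configuration reload" step can also be triggered from Java through the class behind the dfsadmin command; a minimal sketch, assuming a Hadoop 1.x org.apache.hadoop.hdfs.tools.DFSAdmin and that dfs.hosts.exclude is already configured on the NameNode:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.tools.DFSAdmin;

public class RefreshNodes {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();   // picks up core-site.xml / hdfs-site.xml
        DFSAdmin admin = new DFSAdmin(conf);
        // Equivalent to: bin/hadoop dfsadmin -refreshNodes
        int rc = admin.run(new String[] { "-refreshNodes" });
        System.exit(rc);
    }
}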
Conclusion.