Creating a Multinode Hadoop Sandbox | TECHtonka
TECHtonka
Hadoop, it's what's for dinner.
Creating a Multinode Hadoop Sandbox
Posted on 27 Aug 14
One of the great things about all the Hadoop vendors is that they have made it very easy
for people to obtain and start using their technology rapidly. I will say that I think Cloudera
has done the best job by providing cloud-based access to their distribution via Cloudera
Live. All vendors seem to have a VMware and VirtualBox based sandbox/trial image.
Having worked with Hortonworks, I have the most experience with their sandbox and
thought a quick initial blog post would be helpful.
While one could simply do this installation from scratch following the package
instructions, it's also possible to short-circuit much of the setup as well as take advantage
of the scaled-down configuration work already put into the virtual machine provided by
Hortonworks. In short, the idea is to use the VM as a single master node and
simply add data nodes to this master. Running this way provides an easy way to install
and expand an initial Hadoop system up to about 10 nodes. As the system grows you will
need to add RAM not only to the virtual host but also to the Hadoop daemons. A full
script is available here. Below is a description of the process.
The general steps include:
1. The Sandbox
Download and install the Hortonworks Sandbox as your head node in your virtualization
system of choice. The sandbox tends to be produced prior to the latest major release
(compare the output of yum list hadoop\*). Make sure you have first enabled Ambari by
running the script in root's home directory, then reboot.
In order to make sure you are using the very latest stable release and that the Ambari
server and agent daemons have matching versions, upgrading is easiest. This includes
the following:
HWXREPO="http://s3.amazonaws.com/public-repo-1.hortonworks.com"
# HWXREPO already carries the http:// scheme, so it is not repeated here
export AMBARIREPO="$HWXREPO/ambari/centos6/1.x/updates/1.6.1/ambari.repo"
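The two variables above only name the repository. As a rough sketch of what an upgrade on the head node could look like, assuming the usual Ambari sequence (swap the repo file, upgrade the packages with yum, run ambari-server upgrade, restart both daemons), something like the following might do. The ambari_repo_url and run helpers and the DRY_RUN switch are illustrative names, not part of the original script, and nothing is executed unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch: bring the sandbox's Ambari server and agent to the same version.
# Assumption: these are the usual Ambari upgrade steps, not necessarily
# the exact commands from the original script.

HWXREPO="http://s3.amazonaws.com/public-repo-1.hortonworks.com"
AMBARI_VERSION="${AMBARI_VERSION:-1.6.1}"   # version named in the post

# Build the repo file URL for a given Ambari version (hypothetical helper).
ambari_repo_url() {
    echo "$HWXREPO/ambari/centos6/1.x/updates/$1/ambari.repo"
}

# Print each command; execute it only when DRY_RUN=0 (hypothetical helper).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

upgrade_ambari() {
    run wget "$(ambari_repo_url "$AMBARI_VERSION")" -O /etc/yum.repos.d/ambari.repo
    run ambari-server stop
    run ambari-agent stop
    run yum clean all
    run yum -y upgrade ambari-server ambari-agent
    run ambari-server upgrade
    run ambari-server start
    run ambari-agent start
}

upgrade_ambari
```

Run it once as is to review the printed commands, then again with DRY_RUN=0 as root to apply them.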
2. The Nodes
Install 1-N CentOS 6.5 nodes as slaves and prep them as worker nodes. These can be
default installs of the OS but need to be on the same network as the Ambari server. This
can also be facilitated via pdsh (but this requires passwordless SSH) or, better yet, by simply
creating one data node image via a PXE boot environment or a snapshot of the virtual
machine to quickly replicate 1-N nodes with these changes.
If you want to use SSH you can do this from the head node to quickly enable
passwordless SSH:
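A minimal sketch of one way to do it, assuming the workers are named host1 through hostN (matching the host[1-5] range used with pdsh later) and that root logins over SSH are allowed. The worker_hosts and run helpers, the NODES and KEY variables, and the DRY_RUN switch are illustrative names, not taken from the original script. By default the commands are only printed.

```shell
#!/usr/bin/env bash
# Sketch: enable passwordless SSH from the head node to each worker.
# Assumptions: workers are named host1..hostN and accept root logins.

NODES="${NODES:-5}"                 # hypothetical number of worker nodes
KEY="${KEY:-$HOME/.ssh/id_rsa}"     # key pair used by the head node

# Expand a count into host names, one per line (hypothetical helper).
worker_hosts() {
    i=1
    while [ "$i" -le "$1" ]; do
        echo "host$i"
        i=$((i + 1))
    done
}

# Print each command; execute it only when DRY_RUN=0 (hypothetical helper).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

setup_passwordless_ssh() {
    # Create a key pair without a passphrase if none exists yet.
    if [ ! -f "$KEY" ]; then
        run ssh-keygen -t rsa -N "" -f "$KEY"
    fi
    # Append the public key to authorized_keys on every worker;
    # ssh-copy-id asks for the root password once per host.
    for h in $(worker_hosts "$NODES"); do
        run ssh-copy-id -i "$KEY.pub" "root@$h"
    done
}

setup_passwordless_ssh
```

Set DRY_RUN=0 to actually create the key and copy it out. After that, ssh root@host1 from the head node should no longer prompt for a password.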
You then want to make sure you make the following changes to your slave nodes. Again,
this could easily be done via pdsh by copying a script with the following content to each
node with pdcp and executing it.
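# Hostname of the sandbox/head node that runs the Ambari server
AMBSRV="HEADNODENAME"
# or insert local repo name here
HWXREPO="http://s3.amazonaws.com/public-repo-1.hortonworks.com"
# HWXREPO already carries the http:// scheme, so it is not repeated here
export AMBARIREPO="$HWXREPO/ambari/centos6/1.x/updates/1.6.1/ambari.repo"
wget "$AMBARIREPO" -O /etc/yum.repos.d/ambari.repo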
sed -i 's/SELINUX=permissive/SELINUX=disabled/g;s/SELINUX=enforcing/SELINUX=disabled
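# Turn the firewall off now and at boot, then list the remaining rules
chkconfig --del iptables
iptables -F
service iptables stop
iptables -vnL
# Remove packages that may clash with the Ambari-managed stack
yum -y erase mysql-libs postgresql nagios ganglia ganglia-gmetad libganglia
wait
yum -y install net-snmp net-snmp-utils ntp wget
wait
# Keep clocks in sync across the cluster
service ntpd start
chkconfig --add ntpd
chkconfig --levels 35 ntpd on
# Fetch the JDK; HWXREPO already includes the http:// scheme
JDKLOC="$HWXREPO/artifacts/jdk-7u45-linux-x64.tar.gz"
wget "$JDKLOC" -O /tmp/jdk-7u45-linux-x64.tar.gz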
mkdir -p /usr/java
tar -C /usr/java -zxvf /tmp/jdk-7u45-linux-x64.tar.gz
wait
echo "export JAVA_HOME=/usr/java/jdk1.7.0_45" > /etc/profile.d/java.sh
echo "export PATH=/usr/java/jdk1.7.0_45/bin:$PATH" >> /etc/profile.d/
echo "export PDSH_SSH_ARGS_APPEND=\"-o StrictHostKeyChecking=no\"" >
source /etc/profile.d/java.sh
source /etc/profile.d/login.sh
wait
sed -i 's/gpgcheck=1/gpgcheck=0/' /etc/yum.conf /etc/yum.repos.d/*
yum clean all
wait
service iptables stop
yum -y install ambari-agent
wait
sed -i "s/^hostname=.*/hostname=$AMBSRV/" /etc/ambari-agent/conf/ambari
ambari-agent start
Push this file to the slave nodes and run it. This does NOT need to be done on the
sandbox/head node.
pdcp -l root -w 'host[1-5]' ./scriptfile.sh /root/
pdsh -l root -w 'host[1-5]' "chmod 755 /root/scriptfile.sh; /root/scriptfile.sh"
3. Configure Services
Run the Ambari add nodes GUI installer to add data nodes. Be
sure to select manual registration and follow the on-screen prompts to install
components. I recommend installing everything on all nodes and simply turning the
services off and on as needed. Also, installing the client binaries on all nodes helps to
make sure you can do debugging from any node in the cluster.
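Before or after the GUI step it can be handy to confirm which agents have actually registered with the server. A small sketch using the Ambari REST API hosts endpoint, assuming the server listens on port 8080 and still uses the default admin/admin login (both are assumptions about your setup). The host_names and registered_hosts helpers are illustrative names.

```shell
#!/usr/bin/env bash
# Sketch: list the hosts registered with the Ambari server.
# Assumptions: server on port 8080, default admin/admin credentials.

AMBSRV="${AMBSRV:-HEADNODENAME}"   # same placeholder as in the node script

# Pull every "host_name" value out of the JSON read from stdin.
host_names() {
    grep -o '"host_name" *: *"[^"]*"' | sed 's/.*: *"\([^"]*\)"/\1/'
}

# Query the hosts endpoint and print one registered host per line.
registered_hosts() {
    curl -s -u admin:admin "http://$AMBSRV:8080/api/v1/hosts" | host_names
}
```

Calling registered_hosts on the head node should print one line per node whose agent has checked in. A node missing from the list usually means its agent is not running or points at the wrong server name.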
4. Turn off select services as required.
There should now be 1-N data nodes/slaves attached to your Ambari/Sandbox head node.
Here are some suggested changes.
1. Turn off large services you aren't using, like HBase, Storm, and Falcon. This will help save
RAM.
2. Decommission the DataNode on this machine! No! A head node is not a data node. If
you run jobs here you will have problems.
3. HDFS replication factor: This is set to 1 in the sandbox because there is only one
data node. If you only have 1-3 data nodes then triple replication doesn't make sense. I
suggest you use 1 until you get over 3 data nodes at a bare minimum. If you have the
resources, just start with 10 data nodes (that's why it's called Big Data). If not, stick with a
replication factor of 1, but be aware this will function as a prototype system and won't
provide the natural safeguards or parallelism of normal HDFS.
4. Increase RAM to the head node: At a bare minimum Ambari requires 4096 MB. If you plan
to run the sandbox as a head node, consider increasing from this minimum. Also consider
giving running services room to breathe by increasing the RAM allocated in Ambari for each
service. Here is a great review and script for guesstimating how to scale services for
MapReduce and YARN.
5. NFS: To make your life easier you might want to enable NFS on a data node or two.
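The replication rule of thumb from item 3 can be written down as a tiny helper. This is only a sketch of that suggestion: suggest_replication and DATANODES are illustrative names, and the script just prints the dfs.replication value to set in Ambari's HDFS configuration plus the hdfs dfs -setrep command that would re-replicate files already in HDFS. It does not change anything itself.

```shell
#!/usr/bin/env bash
# Sketch: pick an HDFS replication factor from the number of data nodes.

# Prints 1 for up to three data nodes and 3 for more (hypothetical helper).
suggest_replication() {
    if [ "$1" -gt 3 ]; then
        echo 3
    else
        echo 1
    fi
}

DATANODES="${DATANODES:-5}"   # hypothetical data node count
REP="$(suggest_replication "$DATANODES")"

# Value to enter for dfs.replication in the HDFS configuration.
echo "dfs.replication=$REP"
# Existing files keep their old factor; this command would update them.
echo "hdfs dfs -setrep -w $REP /"
```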
This entry was posted in Hadoop by oneadem12. Bookmark the permalink
[http://techtonka.com/?p=223].