
Programming Map-Reduce (Hadoop) with Eclipse

Wei-Yu Chen, NCHC, 2008/05/27

See more: http://trac.nchc.org.tw/cloud/

1. Prepare :

System:
Ubuntu 7.10
Hadoop 0.16

Requirements:
Eclipse (3.2.2)

$ apt-get install eclipse

Java 6

$ apt-get install sun-java6-bin sun-java6-jdk sun-java6-jre sun-java6-plugin

We suggest removing the default Java compiler, gcj:

$ apt-get purge java-gcj-compat

Append the following lines to /etc/bash.bashrc to set up the Java classpath:

export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HADOOP_HOME=/home/waue/workspace/hadoop/
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

Build paths:

Name          Path
Hadoop Home   /home/waue/workspace/hadoop/
Java Home     /usr/lib/jvm/java-6-sun
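Before moving on, it can help to confirm that the variables above are actually visible in a new shell. The following is a minimal sketch, not part of the original guide; the function name check_dir_var is hypothetical.

```shell
# Report whether the variable named by $1 is set and points at an
# existing directory. check_dir_var is an illustrative helper name.
check_dir_var() {
  name="$1"
  # Look up the value of the variable whose name is stored in $name.
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    echo "$name is not set"
    return 1
  fi
  if [ ! -d "$value" ]; then
    echo "$name=$value does not exist"
    return 1
  fi
  echo "$name=$value ok"
}
```

After opening a new terminal, "check_dir_var JAVA_HOME" and "check_dir_var HADOOP_HOME" should both print a line ending in "ok". HADOOP_HOME will report "does not exist" until Hadoop has been unpacked in the next section.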

2. Hadoop Setup
1. Generate an SSH key for the user.

$ ssh-keygen -t rsa -P ""
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ ssh localhost
$ exit

2. Install Hadoop

$ cd /home/waue/workspace
$ sudo tar xzf hadoop-0.16.0.tar.gz
$ sudo mv hadoop-0.16.0 hadoop
$ sudo chown -R waue:waue hadoop
$ cd hadoop

3. Configuration

1. hadoop-env.sh ($HADOOP_HOME/conf/)

Change

# The java implementation to use.  Required.
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun

to

# The java implementation to use.  Required.
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HADOOP_HOME=/home/waue/workspace/hadoop
export HADOOP_LOG_DIR=$HADOOP_HOME/logs
export HADOOP_SLAVES=$HADOOP_HOME/conf/slaves
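The JAVA_HOME change in hadoop-env.sh can also be scripted. This is a rough sketch under assumptions: the helper name set_java_home is hypothetical, and it only rewrites the JAVA_HOME line; the HADOOP_HOME, HADOOP_LOG_DIR and HADOOP_SLAVES lines would still need to be added by hand.

```shell
# Rewrite the JAVA_HOME line of a hadoop-env.sh style file.
# $1 = path to the file, $2 = JDK directory to use.
# set_java_home is an illustrative helper, not a Hadoop command.
set_java_home() {
  file="$1"
  jdk="$2"
  tmp="$file.tmp.$$"
  # Match the line whether or not it is commented out with "#".
  # "|" is used as the sed delimiter because the path contains "/";
  # a JDK path containing "|" or "&" would need escaping first.
  sed "s|^#* *export JAVA_HOME=.*|export JAVA_HOME=$jdk|" "$file" > "$tmp" &&
    mv "$tmp" "$file"
}
```

A possible call, using the paths from this guide, is "set_java_home $HADOOP_HOME/conf/hadoop-env.sh /usr/lib/jvm/java-6-sun". Writing to a temporary file and then moving it avoids relying on a particular sed's in-place option.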

2. hadoop-site.xml ($HADOOP_HOME/conf/)

Modify the contents of conf/hadoop-site.xml as below:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>localhost:9000</value>
    <description></description>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
    <description></description>
  </property>
  <property>
    <name>mapred.map.tasks</name>
    <value>1</value>
    <description>define mapred.map tasks to be number of slave hosts</description>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>1</value>
    <description>define mapred.reduce tasks to be number of slave hosts</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
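To double-check what was written to hadoop-site.xml, a property can be read back with awk. This is only a sketch: get_property is a hypothetical helper, and it is a line-oriented lookup rather than a real XML parser, so it assumes each name and value element sits on a single line as in the listing above.

```shell
# Print the <value> that follows a given <name> in a Hadoop config file.
# $1 = config file, $2 = property name.
# get_property is an illustrative helper; it does not parse XML in general.
get_property() {
  awk -v want="$2" '
    /<name>/ {
      n = $0
      sub(/.*<name>[ \t]*/, "", n)      # drop everything up to <name>
      sub(/[ \t]*<\/name>.*/, "", n)    # drop </name> and what follows
      name = n
    }
    /<value>/ {
      v = $0
      sub(/.*<value>[ \t]*/, "", v)
      sub(/[ \t]*<\/value>.*/, "", v)
      if (name == want) { print v; exit }
    }
  ' "$1"
}
```

For example, "get_property $HADOOP_HOME/conf/hadoop-site.xml fs.default.name" should print the namenode address configured above. An unknown property name prints nothing.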

4. Start Up Hadoop

$ cd $HADOOP_HOME
$ bin/hadoop namenode -format

08/05/23 14:52:16 INFO dfs.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = Dx7200/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.16.4
STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.16 -r 652614; compiled by 'hadoopqa' on Fri May 2 00:18:12 UTC 2008
************************************************************/
08/05/23 14:52:17 INFO fs.FSNamesystem: fsOwner=waue,waue,adm,dialout,cdrom,floppy,audio,dip,video,plugdev,staff,scanner,lpadmin,admin,netdev,powerdev,vboxusers
08/05/23 14:52:17 INFO fs.FSNamesystem: supergroup=supergroup
08/05/23 14:52:17 INFO fs.FSNamesystem: isPermissionEnabled=true
08/05/23 14:52:17 INFO dfs.Storage: Storage directory /tmp/hadoop-waue/dfs/name has been successfully formatted.
08/05/23 14:52:17 INFO dfs.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at Dx7200/127.0.1.1
************************************************************/

$ bin/start-all.sh
starting namenode, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-namenode-Dx7200.out
localhost: starting datanode, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-datanode-Dx7200.out
localhost: starting secondarynamenode, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-secondarynamenode-Dx7200.out
starting jobtracker, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-jobtracker-Dx7200.out
localhost: starting tasktracker, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-tasktracker-Dx7200.out
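The start-up output lists five daemons. As a quick sanity check, the output can be scanned for any daemon that has no "starting" line. This is a sketch only; missing_daemons is a hypothetical helper, and a "starting" line shows that a launch was attempted, not that the daemon stayed up, so the log files named in the output remain the real source of truth.

```shell
# Read start-all.sh output on stdin and print each expected daemon
# that has no "starting <daemon>," line. Prints nothing if all appear.
# missing_daemons is an illustrative helper, not part of Hadoop.
missing_daemons() {
  out=$(cat)
  for d in namenode datanode secondarynamenode jobtracker tasktracker; do
    printf '%s\n' "$out" | grep -q "starting $d," || echo "$d"
  done
}
```

One way to use it is "bin/start-all.sh | missing_daemons": an empty result means every daemon from the listing above was launched.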

Then open http://localhost:50030/ in your browser to confirm that Hadoop is running.

PS: If your system shows errors after a restart, you can reset Hadoop as follows. Note that this removes everything under /tmp, including the HDFS data in /tmp/hadoop-waue.

$ cd $HADOOP_HOME
$ bin/stop-all.sh
$ rm -rf /tmp/*
$ rm -rf logs/*

Then repeat step 4, Start Up Hadoop.
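Because "rm -rf /tmp/*" also deletes files belonging to other programs, a narrower clean-up may be preferable. The sketch below swaps in a different, more limited approach than the original commands: it assumes Hadoop keeps its data under /tmp/hadoop-<user>, as the format log above shows for user waue. The helper name clean_hadoop_tmp is hypothetical.

```shell
# Remove only Hadoop's per-user data directory instead of all of /tmp.
# $1 = base temp directory (e.g. /tmp), $2 = user name (e.g. waue).
# clean_hadoop_tmp is an illustrative helper, not a Hadoop command.
clean_hadoop_tmp() {
  base="$1"
  user="$2"
  # Refuse empty arguments so the path can never collapse to "/hadoop-".
  [ -n "$base" ] && [ -n "$user" ] || return 1
  rm -rf "$base/hadoop-$user"
}
```

With the setup in this guide the call would be "clean_hadoop_tmp /tmp waue", run after bin/stop-all.sh. The namenode still has to be formatted again afterwards.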

3. Eclipse Setup
3.1 Install IBM MapReduce Tools
1. Download the IBM MapReduce Tools zip file and extract it to /tmp/.
2. Make sure Eclipse is closed and ...

$ cd /tmp/
$ unzip mapreduce_tools.zip
$ mv plugins/com.ibm.hipods.mapreduce* /usr/lib/eclipse/plugins/

3. Restart Eclipse.

Check that the IBM MapReduce Tools plugin installed correctly:
Eclipse: File > New > Project
You should see a MapReduce category.

3.2 Configure Eclipse

Eclipse: Window > Preferences > Java > Compiler
Set the compiler compliance level to 5.0.

Some Eclipse plugins use a lot of resources, so you may run into out-of-memory errors. We suggest starting Eclipse with a larger heap:

$ eclipse -vmargs -Xmx512m

4. Run on Eclipse
4.1 Map-Reduce sample code
Eclipse: File > New > Project > Map-Reduce Project > Next >
  Project name: sample
  Use default location: V
  Use default Hadoop: V
> Finish

In the Project Explorer you will see the sample project tree. Now create the sample code.

Eclipse: right-click sample > New > File > File name: WordCount.java

The sample code is available at http://trac.nchc.org.tw/cloud/attachment/wiki/hadoop-sample-code/WordCount.java
Paste its contents into the new file WordCount.java.
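To get a feel for what a word-count job computes before running it on Hadoop, the same idea can be imitated locally with a shell pipeline. This is an analogy, not the WordCount.java from the link above: it assumes the sample is a conventional word count that splits text on whitespace, and the function name wordcount is hypothetical.

```shell
# Local word count shaped like a map-reduce job:
#   map:     tr emits one word per line (split on any whitespace)
#   shuffle: sort brings identical words together
#   reduce:  uniq -c counts each group; awk prints "word<TAB>count"
# wordcount is an illustrative helper, not the Hadoop sample itself.
wordcount() {
  tr -s '[:space:]' '\n' | grep -v '^$' | LC_ALL=C sort | uniq -c |
    awk '{ print $2 "\t" $1 }'
}
```

For example, "wordcount < 132.txt | head" shows the first few words and their counts for a text file on the local disk. LC_ALL=C makes the sort order byte-based, so results do not depend on the system locale.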

4.2. Connect to Hadoop File System


Enable the MapReduce Servers window:
Eclipse: Window > Show View > Other... > MapReduce Tools > MapReduce Servers

At the bottom of your window you should now have a "MapReduce Servers" tab. If not, repeat the Show View step above. Switch to that tab. At the top right edge of the tab you should see a little blue elephant icon.

Eclipse: click the blue elephant to add a new MapReduce server location.
  Server name: any_you_want
  Hostname: localhost
  Installation directory: /home/waue/workspace/hadoop/ (the Hadoop home from section 2)
  Username: waue

If a password prompt appears, enter the password you use to log in to the local machine. The server should then show up under a little elephant icon in the Project Explorer (on the left side of Eclipse).

PS: Please make sure Hadoop is working on the local system. If not, refer to section 2, Hadoop Setup, for debugging; otherwise you cannot proceed.

Then download a sample text and upload it to HDFS as the job input:

$ cd /home/waue/workspace/hadoop/
$ wget http://www.gutenberg.org/etext/132/132.txt
$ bin/hadoop dfs -mkdir input
$ bin/hadoop dfs -ls
Found 1 items
/user/waue/input  <dir>  2008-05-23 15:15  rwxr-xr-x  waue  supergroup
$ bin/hadoop dfs -put 132.txt input

4.3 Run
Eclipse: sample > right-click WordCount.java > Run As... > Run on Hadoop > choose an existing server from the list below > Finish

A console tab will appear beside the MapReduce Servers tab.

While MapReduce is running, you can visit http://localhost:50030/ to watch Hadoop dispatching the MapReduce jobs. After the job finishes, you can go to http://localhost:50060/ to see the result.

5. Reference

NCHC Cloud Technique Develop Group: http://trac.nchc.org.tw/cloud/
IBM Map-Reduce: http://www.alphaworks.ibm.com/tech/mapreducetools
Cloud 9: http://www.umiacs.umd.edu/~jimmylin/cloud9/umd-hadoop-dist/cloud9-docs/howto/start.html
Running Hadoop: http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29

Related Files:

Hadoop: http://apache.ntu.edu.tw/hadoop/core/

IBM mapreduce tool: http://www.alphaworks.ibm.com/tech/mapreducetools

Word sample 1: The Art of War by Sunzi (6th cent. B.C.) http://www.gutenberg.org/etext/132

Word sample 2: The Adventures of Sherlock Holmes by Sir Arthur Conan Doyle http://www.gutenberg.org/etext/1661
