APPROACHES

A. In case of product join scenarios, check for
- Proper usage of aliases
- Joining on matching columns
- Usage of join keywords, such as specifying the type of join (e.g. inner or outer)
- Use of UNION in case of "OR" scenarios

Ensure statistics are collected on join columns. This is especially important if the columns you are joining on are not unique.
B. Collect stats
- Run the command "DIAGNOSTIC HELPSTATS ON FOR SESSION"
- Gather information on the columns on which stats have to be collected
- Collect stats on the suggested columns
- Also check for stats missing on PI, SI or columns used in joins: "HELP STATS <databasename>.<tablename>"
- Make sure stats are re-collected when at least 10% of the data changes
- Remove unwanted stats, or stats which hardly improve the performance of the queries
- Collect stats on columns instead of indexes, since dropping an index will drop its stats as well!
- Collect stats on indexes having multiple columns; this might be helpful when these columns are used in join conditions
- Check that stats are re-created for tables whose structures have changed

C. Full table scan scenarios
- Try to avoid FTS scenarios, as it might take a very long time to access all the data in every AMP in the system
- Make sure an SI is defined on the columns which are used as part of joins or as an alternate access path
- Collect stats on SI columns, or else there are chances that the optimizer might go for an FTS even when an SI is defined on that particular column

2. If intermediate tables are used to store results, make sure that
- They have the same PI as the source and destination tables

3. Tune to get the optimizer to join on the Primary Index of the largest table, when possible, to ensure that the large table is not redistributed across the AMPs.

4. For a large list of values, avoid using IN / NOT IN in SQL. Write the large list of values to a temporary table and use this table in the query.

5. Be clear about when to use EXISTS / NOT EXISTS conditions, since they ignore unknown comparisons (for example, a NULL value in the column results in unknown). This can lead to inconsistent results.
6. Inner vs Outer Joins
Check which join works efficiently in the given scenario. Some examples are
- Outer joins can be used when a large table is joined with small tables (like a fact table joining with a dimension table based on a reference column)
- Inner joins can be used when we want only the actual matching data, so that no extra data is loaded into spool for processing

Please note for outer join conditions:
1. The filter condition for the inner table should be present in the "ON" condition.
2. The filter condition for the outer table should be present in the "WHERE" condition.
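To make the note on outer join filters concrete, here is a minimal sketch. The database, table and column names (sales_db.order_fact, sales_db.region_dim, active_flag and so on) are made up for illustration and are not from any real schema.

```sql
-- Hypothetical fact/dimension pair; all names are assumptions for illustration.
SELECT f.order_id,
       f.amount,
       d.region_name
FROM sales_db.order_fact f               -- outer (preserved) table
LEFT OUTER JOIN sales_db.region_dim d    -- inner table
   ON  f.region_id = d.region_id
   AND d.active_flag = 'Y'               -- inner-table filter stays in ON
WHERE f.order_date >= DATE '2020-01-01'; -- outer-table filter goes in WHERE
```

If the condition on d.active_flag were moved to the WHERE clause, the rows of the fact table with no matching region (where d.active_flag is NULL) would be thrown away, and the outer join would behave like an inner join.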
2. If

SELECT HASHAMP (HASHBUCKET (HASHROW (<YOUR_COLUMN>))), COUNT (*)
FROM <YOUR_DB>.<YOUR_TB>
GROUP BY 1;

Check whether the data is equally distributed among all the AMPs (a variance of +/-5% is acceptable). If there is a large amount of data skew on one AMP, then sampling is not a good option.

3. If you don't find data skew on any particular AMP, then run sample statistics on the column of the particular table as follows.

COLLECT STATISTICS ON <YOUR_DB>.<YOUR_TB> COLUMN (<YOUR_COLUMN>) USING SAMPLE;

4. Check the performance of the query after running sample stats; also note the time taken for collecting the sample stats.

5. If not satisfied with the performance, try to run full statistics on the columns and measure the performance and the time taken to collect full stats.

6. Decide which is the best option, "FULL STATS" or "SAMPLE", considering factors like
- Performance
- Time taken for statistics collection
- Table size
- Data skew
- Frequency of the table being loaded
- How many times this table would be used in your environment
2. Join indexes
If you are writing queries, working on performance, or helping to improve performance, you will have to take some time to go through this topic. It is all about joins, which are the most important concern in Teradata. If some thought is given to the following suggestions, most join-related issues can be taken care of.
Tip 1: Joining on PI / NUPI / non-PI columns

We should make sure the join happens on columns that make up a UPI or NUPI. But why?

Whenever we join two tables on common columns, the smart optimizer will try to bring data from both tables into a common spool space and join them there to get the results. But getting data from both tables into a common spool has overhead.

What if I join a very large table with a small table? Should the small table be redistributed, or the large table? Should the small table be duplicated across all the AMPs? Should both tables be redistributed across all the AMPs?

Here are some basic rules of thumb on joining index columns, so that the join happens faster.

Case 1 - PI = PI joins
There is no redistribution of data across AMPs. AMP-local joins happen because the data is already present on the same AMP and need not be redistributed. These types of joins on a unique primary index are very fast.

Case 2 - PI = non-PI column joins
- Data from the second table will be redistributed across all AMPs, since the join is on a PI vs. a non-PI column. The ideal scenario is when the small table is redistributed to be joined with the large table's records on the same AMP.
- Alternatively, data in the small table is duplicated to every AMP, where it is joined locally with the large table.

Case 3 - No index = non-PI column joins
Data from both tables is redistributed across all AMPs. These are among the longest-running queries. Care should be taken to see that stats are collected on these columns.

Tip 2: The columns that are part of a join must be of the same data type (CHAR, INTEGER, ...). But why?
When joining columns from two tables, the optimizer makes sure that the data type is the same, or else it will translate the column in the driving table to match that of the derived table.

Say for example
TABLE employee deptno (char)
TABLE dept deptno (integer)

If I am joining the employee table with dept on employee.deptno (char) = dept.deptno (integer), the optimizer will convert the character column to integer, resulting in translation. What would happen if the employee table had 100 million records, and deptno had to undergo translation every time? We have to avoid such scenarios, since translation is a cost factor and needs time and system resources.

Make sure you are joining columns that have the same data types to avoid translation!

Tip 3: Do not use functions like SUBSTR, COALESCE, CASE ... on the indices used as part of a join. Why?

It is not recommended to use functions such as SUBSTR, COALESCE, CASE and others on join columns, since they add to the cost factor and result in performance issues. The optimizer will not be able to use the stats on columns that are wrapped in functions. This might result in a product join or spool-out issues, and the optimizer will not be able to make good decisions since no stats/demographics are available on the column. It might assume the column has 100 values instead of 1 million values and might redistribute on a wrong assumption, directly impacting performance.

Tip 4: Use NOT NULL wherever possible!

What? Did someone say NOT NULL? Yes, we have to make sure to use NOT NULL for columns which are declared as nullable in the table definition. The reason is that all the NULL values might get sorted to one poor AMP, resulting in the infamous "NO SPOOL SPACE" error, as that AMP cannot accommodate any more NULL values. So remember to use NOT NULL in joins so that table skew can be avoided.

Since V2R5, Teradata automatically adds the condition "IS NOT NULL" to the query. It is still better to make sure that NULL values are not included as part of the join.

It is always suggested to use "locking table for access", since it will not block other users from applying read/write locks on the table.
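The following sketch pulls Tips 2 to 4 and the access-lock suggestion together. It reuses the employee/dept example from Tip 2, but assumes deptno has been defined as INTEGER in both tables; the other columns (emp_id, emp_name, dept_name) and the index choices are assumptions made only for illustration.

```sql
-- Hypothetical definitions: deptno has the same data type in both tables (Tip 2).
CREATE TABLE employee (
    emp_id   INTEGER NOT NULL,
    emp_name VARCHAR(50),
    deptno   INTEGER              -- nullable, same type as dept.deptno
) PRIMARY INDEX (emp_id);

CREATE TABLE dept (
    deptno    INTEGER NOT NULL,
    dept_name VARCHAR(50)
) UNIQUE PRIMARY INDEX (deptno);

-- Access locks so the read does not block other users of these tables.
LOCKING TABLE employee FOR ACCESS
LOCKING TABLE dept FOR ACCESS
SELECT e.emp_id,
       e.emp_name,
       d.dept_name
FROM employee e
INNER JOIN dept d
   ON e.deptno = d.deptno         -- plain column comparison, no SUBSTR/COALESCE/CASE (Tip 3)
WHERE e.deptno IS NOT NULL;       -- NULLs explicitly kept out of the join (Tip 4)
```

Keep in mind that an access lock allows reading rows while they are being changed, so it suits reporting queries that can tolerate slightly inconsistent data.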
If LIKE is used in a WHERE clause, it is better to use one or more leading characters in the pattern, if at all possible.

E.g. LIKE '%STRING%' will be processed differently compared to LIKE 'STRING%'.

If leading characters are given at the beginning of the LIKE pattern, as in 'STRING%', then the optimizer can make use of an index to run the query, thereby increasing performance. But if the leading character, as in '%STRING%', is a wildcard (say '%'), then the optimizer will not be able to use an index, and a full table scan (FTS) must be run, which reduces performance and takes more time.

Hence it is suggested to go for '%STRING%' only if STRING is part of a longer pattern, say 'SUBSTRING'.
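As a small illustration of the two pattern styles, assuming a hypothetical table sales_db.customer with an index on last_name (both names are made up for this sketch):

```sql
-- Leading characters known: the pattern is anchored at the start of the value,
-- so an index on last_name may be usable.
SELECT customer_id, last_name
FROM sales_db.customer
WHERE last_name LIKE 'SMI%';

-- Leading wildcard: every row has to be inspected, so expect a full table scan.
SELECT customer_id, last_name
FROM sales_db.customer
WHERE last_name LIKE '%SMI%';
```

Whether the index is actually chosen for the first query still depends on the statistics and on the optimizer's cost estimate, so it is worth confirming with EXPLAIN.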
1. Utilizing Teradata's Parallel Architecture:
If you understand what happens in the background, you will be able to make your query work its best. So, try to run the explain plan on your query before executing it and see how the PE (Parsing Engine) has planned to execute it. Understand the keywords in the explain plan. I will have to write a more detailed post on this topic. But for now, let us go on with the highlights.

2. Understanding Resource Consumption:
The resources that you consume can be directly related to dollars. Be aware of and frugal about the resources you use. Following are the factors you need to know and check from time to time:
a. CPU consumption
b. Parallel efficiency / hot AMP percentage
c. Spool usage

3. Help the Parser:
Since the architecture has been made to be intelligent, we have to give it some respect. You can help the parser understand the data you are dealing with by collecting statistics.
But you need to be careful when you do so, for 2 reasons:
- Incorrect stats are worse than not collecting stats, so make sure your stats are not stale (old).
- If the dataset in your table changes rapidly, and you are dealing with a lot of data, then collecting stats itself might be resource consuming. So, based on how frequently your table will be accessed, you will have to make the call.
4. Since the same SQL can be written in different ways, you will have to know which method is better than which. For example, creating a volatile table vs a global temporary table vs a working table. You cannot directly point out which is the best, but I can touch base on the pros, cons and comparison for them.

5. Take a step back and look at the whole process. Consider how much data you need to keep, how critical it is for your business to get the data soon, and how frequently you need to run your SQL. Most of the time, the 'big picture' will give you a lot of answers.
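Point 1 suggests looking at the explain plan before running a query. A minimal sketch of how that looks, using hypothetical employee and dept tables (names and columns are assumptions for illustration only):

```sql
-- Prefix the query with EXPLAIN: the plan is returned as text and the query itself is not executed.
EXPLAIN
SELECT e.emp_id,
       d.dept_name
FROM employee e
INNER JOIN dept d
   ON e.deptno = d.deptno;

-- Phrases worth looking for in the returned plan text include:
--   "redistributed by hash code"  (rows are moved between AMPs before the join)
--   "duplicated on all AMPs"      (a table is copied to every AMP)
--   "product join"                (every row is compared with every row)
--   "no confidence" / "high confidence"  (how far the row estimates are backed by statistics)
```

A plan that redistributes a very large table, or that reports no confidence on a join step, is usually the first hint that a primary index choice or missing statistics should be reviewed.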
It is recommended to refresh the stats after every 10% of data change. We can collect statistics at column level or at index level.

Syntax:
COLLECT STATISTICS ON <table_name> COLUMN (column_name 1, .., column_name n);
OR
COLLECT STATISTICS ON <table_name> INDEX (column_name 1, .., column_name n);

2). Pack Disk:
Pack disk is a utility that frees up cylinder space on the database. This utility must be run periodically, because in a warehouse environment a large number of inserts and updates take place, and this frequent data manipulation leaves the physical storage disordered. The pack disk utility allows us to restructure and physically reorder the data and free up space, much like defragmentation. Teradata also runs mini CYLPACKs automatically if cylinder space goes below the prescribed limit. Cylinder space is required for the merge operations during inserts, deletes, updates etc.

To run pack disk we use the Ferret utility provided by Teradata, which can be run through the Teradata Manager tool or through a telnet session on the node. The set of commands that starts the packdisk utility is given below. One can create a cron job to schedule it and run it periodically.

Commands to run the defrag and packdisk utilities:
~ferret
defrag
Y
packdisk fsp=8
Y
3). Skew Analysis:
The primary index of a table in Teradata is responsible for the data distribution across all the AMPs. Proper data distribution is required for parallel processing in the system. As Teradata follows a shared-nothing architecture, all the AMPs work in parallel. If data is evenly distributed amongst the AMPs, then the amount of work done by every AMP is equal and the time required for a particular job is lower. In contrast, if only one or two AMPs are flooded with the data (i.e. data skew), then while running that job those AMPs would be working and the others would be idle. In this case we won't be utilizing the parallel processing power of the system.
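Skew of this kind can be put into a single number by counting rows per AMP with the built-in hash functions and comparing the busiest AMP with the average. This is only a sketch: sales_db.orders and customer_id are hypothetical names standing in for the table and its (actual or candidate) primary index column.

```sql
-- Rows per AMP for the chosen column(s), then a simple skew summary.
-- Note: AMPs that hold no rows at all do not appear in the inner result,
-- so the average is taken over the AMPs that own at least one row.
SELECT MAX(row_cnt)                                           AS max_rows_on_amp,
       AVG(row_cnt)                                           AS avg_rows_per_amp,
       (MAX(row_cnt) - AVG(row_cnt)) * 100.0 / AVG(row_cnt)   AS pct_above_avg
FROM (
    SELECT HASHAMP (HASHBUCKET (HASHROW (customer_id))) AS amp_no,
           COUNT(*)                                     AS row_cnt
    FROM sales_db.orders
    GROUP BY 1
) AS amp_dist;
```

A pct_above_avg close to zero means the busiest AMP holds about as many rows as the others; a large value means one AMP is doing far more work than the rest.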
To avoid such data skew we need to analyze the primary index of the tables in the Teradata database. Over a period of time it might happen that data accumulates on a few AMPs, which can have an adverse effect on the ETL as well as on system performance.

To analyze the data distribution for a table we can use the inbuilt HASH functions provided by Teradata. To check the data distribution for a table one can use a query:

SELECT HASHAMP (HASHBUCKET (HASHROW (Column 1, .., column n))) AS AMP_NUM, COUNT(*)
FROM Table_Name
GROUP BY 1;

This query provides the distribution of records on each AMP. We can also analyze probable PIs with this query, which will predict the data distribution on the AMPs.

4). Lock Monitoring:
Locking Logger is a utility that enables us to monitor the locking on tables. Using this utility we can create a table that has entries for the locks which have been applied to the tables during processing. This utility allows us to analyze the regular ETL process and the jobs being blocked at particular times when there is no one to monitor the locking. By analyzing such locking situations we can modify the jobs and avoid the waiting period they cause.

To use the locking logger, we first need to enable it via the DBS console window or the cnsterm subsystem. The setting does not take effect until the database is restarted.

LockLogger - This field defines the system default for the locking logger. It allows the DBA to log the delays caused by database locks to help in identifying lock conflicts. To enable this feature set the field to TRUE. To disable the feature set the field to FALSE.

After a database restart with the LockLogger flag set to TRUE, the Locking Logger will begin to accumulate lock information into a circular memory buffer of 64KB. Depending on how frequently the system encounters lock contention, this buffer will wrap, but it will usually span a period of several days.
Following a period of lock contention, to analyze the lock activity you need to run the dumplocklog utility, which moves the data from the memory buffer to a database table where it can be accessed.

5). Session Tuning:
Session tuning is done to run the load utilities in parallel. This requires analyzing some DBS Control parameters and tuning them to provide the best parallel processing of the load utilities. Two parameters, MaxLoadAWT and MaxLoadTasks, enable parallel job management. A short note on them:

The MaxLoadAWT internal field serves two purposes:
1) Enabling a higher limit for the MaxLoadTasks field beyond the default limit of 15.
2) Specifying the AMP Worker Task (AWT) limit for concurrent FastLoad and MultiLoad jobs when a higher limit is enabled.

In effect, this field allows more FastLoad, MultiLoad, and FastExport utilities to run concurrently while controlling AWT usage and preventing excessive consumption and possible AWT exhaustion.

The default value is zero. When MaxLoadAWT is zero, the concurrency limit operates in the same manner as prior to V2R6.1:
- MaxLoadTasks specifies the concurrency limit for all three utilities: FastLoad, MultiLoad, and FastExport.
- The valid range for MaxLoadTasks is from 0 to 15.

When MaxLoadAWT is non-zero (higher limit enabled):
- It specifies the maximum number of AWTs that can be used by FastLoads and MultiLoads. The maximum allowable value is 60% of the total AWTs.
- The valid range for MaxLoadTasks is from 0 to 30.
- A new FastLoad/MultiLoad job is allowed to start only if BOTH the MaxLoadTasks AND MaxLoadAWT limits are not reached. Therefore, jobs may be rejected before the MaxLoadTasks limit is exceeded.
- MaxLoadTasks specifies the concurrency limit for the combination of only two utilities: FastLoad and MultiLoad.
- FastExport is managed differently. FastExport is no longer controlled by the MaxLoadTasks field. A FastExport job is only rejected if the total number of active utility jobs is 60. At least 30 FastExport jobs can run at any time. A FastExport job may be able to run even when FastLoad and MultiLoad jobs are rejected.

When a Teradata Dynamic Workload Manager (TDWM) utility throttle rule is enabled, the MaxLoadAWT field is overridden.
TDWM will use the highest allowable value, which is 60% of the total AWTs.

An update to MaxLoadAWT becomes effective after the DBS Control record has been written. No DBS restart is required.

Note that when the total number of AWTs (specified by the internal field MaxAMPWorkerTasks) has been modified but a DBS restart has not occurred, there may be a discrepancy between the actual number of AWTs and the DBS Control record. The system may internally reduce the effective value of MaxLoadAWT to prevent AWT exhaustion.

AWT Usage of Load Utilities:
All load/unload utilities require and consume AWTs at different rates depending on the execution phase:

FastLoad:
- Phase 1 (Loading): 3 AWTs
- Phase 2 (End Loading): 1 AWT

MultiLoad*:
- Acquisition Phase (and before): 2 AWTs
- Application Phase (and after): 1 AWT

FastExport:
- All.

* This description is for the single target table case, which is the most common.

The parameters explained above can be analyzed and tuned accordingly to achieve the expected performance on the Teradata system. We also need to have some maintenance / housekeeping activities in place to avoid the performance implications of physical data conditions like data skew, low cylinder space etc.
1) Primary indexes: Use primary indexes for joins whenever possible, and specify in the where clause all the columns of the primary indexes.

2) Secondary indexes (10% rule rumor): The optimizer does not actually use a 10% rule to determine if a secondary index will be used. But this is a good estimation: if less than 10% of a table will be accessed when the secondary index is used, then assume the SQL will use the secondary index. Otherwise, the SQL execution will do a full table scan. The optimizer actually uses a "least cost" method: it determines whether the cost of using a secondary index is cheaper than the cost of doing a full table scan. The cost involves the CPU usage and disk I/O counts.

3) Constants: Use constants to specify index column contents whenever possible, instead of specifying the constant once and joining the tables. This may provide a small saving on performance.

4) Mathematical operations: Mathematical operations are faster than string operations (i.e. concatenation), if both can achieve the same result.

5) Variable length columns: The use of variable length columns should be minimized, and should be by exception. Fixed length columns should always be used to define tables.

6) Union: The "union" command can be used to break up a large SQL process or statement into several smaller SQL processes or statements, which would run in parallel. But these could then cause spool space limit problems. "Union all" executes the SQL statements single threaded.
7) Where in / where not in (subquery): The SQL "where in" is more efficient than the SQL "where not in". It is more efficient to specify constants in these, but if a subquery is specified, then the subquery has a direct impact on the SQL time. If there is a SQL time problem with the subquery, then the subquery could be separated from the original query. This would require 2 SQL statements and an intermediate table. The 2 SQL statements would be: 1) a new SQL statement, which does the work of the previous subquery and inserts into the temporary table, and 2) the modified original SQL statement, which no longer has the subquery and reads the temporary table.

8) Strategic semicolon: At the end of every SQL statement there is a semicolon. In some cases, the strategic placement of this semicolon can improve the SQL time of a group of SQL statements. But this will not improve an individual SQL statement's time. These are a couple of cases: 1) the group's SQL time could be improved if a group of SQL statements share the same tables (or spool files), 2) the group's SQL time could be improved if several SQL statements use the same unix input file.
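In a BTEQ script, the strategic semicolon of tip 8 usually means starting the next statement on the same line as the semicolon that ends the previous one, so that the statements are sent together as a single multi-statement request. A minimal sketch with made-up table names (sales_db.orders_stage, sales_db.orders_2019, sales_db.orders_2020):

```sql
-- BTEQ script fragment; all table and column names are hypothetical.
-- A semicolon at the START of a line keeps the request open, so both
-- inserts below travel to the database as one multi-statement request.
INSERT INTO sales_db.orders_2019
SELECT * FROM sales_db.orders_stage WHERE order_year = 2019
;INSERT INTO sales_db.orders_2020
SELECT * FROM sales_db.orders_stage WHERE order_year = 2020
;
```

Both statements read the same staging table, which is the first case described in the tip. Whether this is actually faster than two separate requests depends on the plan the optimizer builds, so it is worth timing both variants.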
6) Trigger tables: A group of tables, each of which contains a subset of the keys of the index of an original table. The tables could be created based on some value in the index of the original table. This provides the ability to break up a large SQL statement into multiple smaller SQL statements, but creating the trigger tables requires more update time.

7) Sorts (order by): Although sorts take time, they are always done at the end of the query, and the sort time is directly dependent on the size of the result. Unnecessary sorts should be eliminated.

8) Export/Load: Table data could be exported (BTEQ, FastExport) to a unix file, updated there, and then reloaded into the table (BTEQ, FastLoad, MultiLoad).

9) C programs / unix scripts: Some data manipulation is very difficult and time consuming in SQL. These steps could be replaced with C programs or unix scripts. See the "C/Embedded SQL" tip.
Conclusion:
Teradata is a system which can process complex queries very fast. The Teradata database is linearly scalable: we can expand the database capacity just by adding more nodes to the existing database. If the data volume grows, we can add more hardware and expand the database capacity.

Teradata has extensive parallel processing capacity. It can handle multiple ad hoc requests and many concurrent users. The Teradata database has a shared-nothing architecture. It has high fault tolerance and data protection. Another advantage is the uniform distribution of data through unique primary indexes without any overhead. The performance is just amazing for huge data. Teradata is excellent at handling huge data.