
DATABASE MANAGEMENT SYSTEMS

CHAPTER III
STORAGE AND FILE STRUCTURE

PURPOSE
This chapter covers the issues involved in storing data on external storage, chiefly hard disks. Storage must be organized so that it can hold a large, possibly very large, amount of data; more importantly, it must allow the required data to be retrieved quickly. The structures presented for fast data access are indices, B+-trees, and hashing. Storage devices (disks) can fail unexpectedly, and RAID techniques offer an effective solution to this problem.

REQUIREMENTS
- Understand the characteristics of storage devices, how storage is organized, and how disks are accessed.
- Understand the principles and techniques of RAID disk organization.
- Understand the techniques for organizing records within a file.
- Understand the techniques for organizing files.
- Understand and apply the techniques that support fast retrieval of information: indices (ordered, B+-tree, hashing).


OVERVIEW OF PHYSICAL STORAGE MEDIA

Several types of data storage exist in computer systems. Storage media are classified by access speed, by cost, and by reliability. The media currently available are:

- Cache: the fastest and most costly form of storage. Cache memory is small; its use is managed by the operating system.
- Main memory: the storage medium for data that are available to be operated on. General-purpose machine instructions operate on main memory. Although main memory may hold many megabytes of data, it is still too small (and too expensive) to store an entire database. The contents of main memory are usually lost when power fails.
- Flash memory: also known as electrically erasable programmable read-only memory (EEPROM). Flash memory differs from main memory in that data survive a power failure. Reading from flash memory takes less than 100 ns, roughly as fast as reading from main memory. Writing is much more complicated: data can be written once (a write takes about 4 to 10 µs) but cannot be overwritten directly. To overwrite memory that has already been written, the whole memory bank must first be erased; only then can it be written again.
- Magnetic-disk storage (here, the hard disk): the primary medium for long-term, on-line storage of data. Usually the entire database is stored on magnetic disk. Data must be moved from disk into main memory before they can be accessed, and after the data in main memory are modified they must be written back to disk. Disk storage is called direct-access storage because data on disk can be read in any order. Disk storage survives power failures. Disks can fail, although infrequently.
- Optical storage: the most familiar form of optical disk is the CD-ROM (compact-disk read-only memory). Data stored on optical disks are read by a laser. CD-ROM disks can only be read. Other kinds of optical disk are write-once, read-many (WORM) disks, which allow data to be written once but not erased or rewritten, and rewritable disks.
- Tape storage: used mainly for backup. Tape is cheaper than disk, but access is slower because it must be sequential. Tapes usually have very large capacity.

Storage media can be organized into a hierarchy according to access speed and cost. The highest level is the fastest but also the most expensive, and both decrease toward the lower levels. The fast media (cache, main memory) are referred to as primary storage. The next level, such as magnetic disks, is referred to as secondary storage or on-line storage. The lowest and next-to-lowest levels, such as optical disks and magnetic tapes (and floppy disks), are referred to as tertiary storage or off-line storage. Besides speed and cost, the durability of each storage medium must also be considered.

Figure: Storage-device hierarchy (from top to bottom: cache, main memory, flash memory, magnetic disk, optical disk, magnetic tapes)

MAGNETIC DISKS
PHYSICAL CHARACTERISTICS OF DISKS
Each disk platter is circular, and both of its surfaces are covered with magnetic material on which information is recorded. A disk contains many platters. We use the term disk to refer to hard disks. While a disk is in use, a drive motor spins it at a constant speed. A read-write head is positioned just above the surface of the platter. The platter surface is logically divided into tracks, and each track is subdivided into sectors; a sector is the smallest unit of information that can be read from or written to the disk. Depending on the type of disk, sector sizes range from 32 bytes to 4096 bytes; 512 bytes is typical. There are 4 to 32 sectors per track and 20 to 1500 tracks per surface.

Each surface of a platter has its own read-write head, which moves along the radius of the disk to reach different tracks. The read-write heads of all the platters are mounted on a single assembly called the disk arm and move together. The platters are mounted on a spindle. Because the heads move together, when the head on one platter is on the i-th track, the heads on all the other platters are on the i-th track as well; the i-th tracks of all the platters together are therefore called the i-th cylinder.

A disk controller interfaces between the computer system and the actual disk hardware. It accepts high-level commands to read or write a sector, and initiates actions such as moving the disk arm to the right track and actually reading or writing the data. The disk controller also attaches a checksum to each sector that is written. The checksum is computed from the data written to the sector. When the sector is read back, the checksum is recomputed from the retrieved data and compared with the stored checksum. If the data are corrupted, the computed checksum will not match the stored one. If such an error occurs, the controller retries the read several times; if the error persists, the controller reports a read failure. The disk controller also remaps bad sectors, that is, it maps each bad sector to a different physical location. The figure below shows disks connected to a computer system:

Figure: Disks attached to a disk controller, which is connected to the system bus

Disks are connected to a computer system or to a disk controller through a high-speed interconnection. The small computer-system interconnect (SCSI) is commonly used to connect disks to personal computers and workstations. Mainframes and server systems usually have faster and more expensive buses for connecting to disks. The read-write heads are kept as close as possible to the disk surface in order to increase the recording density. A fixed-head disk has a separate head for each track. This arrangement lets the computer switch from one track to another quickly, without moving the head assembly. It requires a very large number of heads, however, which makes the device expensive.

PERFORMANCE MEASURES OF DISKS

The main measures of the quality of a disk are capacity, access time, data-transfer rate, and reliability.

Access time: the time from when a read or write request is issued to when data transfer begins. To access the data on a given sector of a disk, the arm must first move to the correct track and then wait for the sector to appear under the head. The time needed to position the arm is called the seek time, and it grows with the distance the arm must move. Seek times range from 2 to 30 ms, depending on how far the track is from the current arm position.

Average seek time: the average of the seek times, measured over a sequence of (uniformly distributed) random requests. It is about one third of the worst-case seek time.

Rotational latency time: the time spent waiting for the sector to be accessed to appear under the read-write head. Disks rotate at 60 to 120 rotations per second. On average, half a rotation is needed for the desired sector to come under the head, so the average latency time is half the time of one full rotation.

The access time is the sum of the seek time and the latency, and ranges from 10 to 40 ms.

Data-transfer rate: the rate at which data can be retrieved from or stored to the disk. Current rates are about 1 to 5 Mbps.
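The relations above between rotation speed, latency, seek time and access time can be checked with a few lines of arithmetic. This is only a sketch: the function names are hypothetical, and the sample seek time of 10 ms is an assumed value inside the 2 to 30 ms range quoted in the text.

```python
def average_rotational_latency_ms(rotations_per_second):
    """Average latency is half the time of one full rotation."""
    rotation_time_ms = 1000.0 / rotations_per_second
    return rotation_time_ms / 2.0


def average_seek_ms(worst_case_seek_ms):
    """Rule of thumb from the text: about one third of the worst case."""
    return worst_case_seek_ms / 3.0


def access_time_ms(seek_ms, rotations_per_second):
    """Access time = seek time + rotational latency."""
    return seek_ms + average_rotational_latency_ms(rotations_per_second)


if __name__ == "__main__":
    for rps in (60, 120):
        print(rps, "rotations/s ->",
              round(average_rotational_latency_ms(rps), 2), "ms average latency")
    # Assumed example: 10 ms seek on a disk spinning at 120 rotations/s.
    print(round(access_time_ms(10, 120), 2), "ms access time")
```

At 60 and 120 rotations per second the average latency comes out at roughly 8.3 ms and 4.2 ms, which is consistent with the 10 to 40 ms access-time range once seek time is added.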

Mean time to failure (MTTF): the average amount of time the system is expected to run continuously without any failure. Current disks have a mean time to failure of about 30,000 to 800,000 hours, that is, about 3.4 to 91 years.
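The conversion from hours to years quoted above is easy to reproduce. The sketch below assumes a 365-day year; the function name is hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: 365-day year


def mttf_years(mttf_hours):
    """Convert a mean time to failure from hours to years."""
    return mttf_hours / HOURS_PER_YEAR


if __name__ == "__main__":
    for hours in (30_000, 800_000):
        print(hours, "hours is about", round(mttf_years(hours), 1), "years")
```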

OPTIMIZATION OF DISK-BLOCK ACCESS

Disk I/O requests are generated both by the file system and by the virtual-memory manager found in most operating systems. Each request specifies the disk address to be referenced, in the form of a block number. A block is a sequence of contiguous sectors on a single track. Block sizes range from 512 bytes to a few kilobytes. Data are transferred between disk and main memory in units of blocks. The lower levels of the file-system manager convert block addresses into the hardware-level cylinder, surface, and sector numbers. Access to data on disk is much slower than access to data in main memory, so a strategy is needed to speed up disk-block access. Several techniques for this purpose are discussed below.

Scheduling: If several blocks of a cylinder need to be transferred from disk to main memory, access time can be saved by requesting the blocks in the order in which they pass under the heads. If the desired blocks are on different cylinders, they should be requested in an order that minimizes disk-arm movement. Disk-arm-scheduling algorithms order the accesses to tracks so as to increase the number of accesses that can be processed. A commonly used algorithm is the elevator algorithm. Suppose that the arm is initially moving from the innermost track toward the outside of the disk. At each track for which there is an access request, the arm stops, services the requests for that track, and then continues outward until there are no requests waiting for tracks farther out. At that point the arm changes direction and moves inward, again stopping at each track that has requests, until no track farther in has a request; then it changes direction again, and so on. The disk controller usually takes care of reordering read requests to improve performance.

File organization: To reduce block-access time, the blocks on disk can be organized in a way that corresponds closely to the way the data will be accessed. For example, if a file is expected to be accessed sequentially, its blocks should be laid out sequentially on adjacent cylinders. This contiguous allocation breaks down as the file grows: the file can no longer be allocated on contiguous blocks, a phenomenon called fragmentation. Many operating systems provide defragmentation utilities that reduce fragmentation and so improve file-access performance.

Nonvolatile write buffers: Because the contents of main memory are lost when power fails, information about database updates must be recorded on disk to guard against failures. The performance of update-intensive applications depends heavily on the speed of disk writes. Nonvolatile random-access memory (nonvolatile RAM) can be used to speed up disk writes. The contents of nonvolatile RAM are not lost on power failure. A common way to implement nonvolatile RAM is battery-backed-up RAM. When the database requests that a block be written to disk, the disk controller writes the block to a nonvolatile RAM buffer and immediately notifies the operating system that the write completed successfully. The controller writes the data to their destination on disk whenever the disk is idle or the nonvolatile RAM buffer becomes full. When the database system requests a block write, it experiences a delay only if the nonvolatile RAM buffer is full.

Log disk: Another approach to reducing write latency is to use a log disk, that is, a disk devoted to writing a sequential log. All access to the log disk is sequential, which essentially eliminates seek time, and several consecutive blocks can be written at once, making writes to the log disk several times faster than random writes. As with nonvolatile RAM, the data must still be written to their actual location on disk, but this write can be carried out without the database system having to wait for it to complete. The log disk can also be used to recover data. A log-based file system is an extreme version of the log-disk approach: data are not written back to their original destination on disk; instead, the file system keeps track of where on the log disk each block was most recently written and retrieves it from that location. The log disk is compacted periodically. This approach improves write performance but causes fragmentation for files that are updated often.
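The elevator algorithm from the scheduling discussion above can be sketched as follows. This is a simplified illustration, not a controller implementation: it assumes a fixed set of pending requests (no new arrivals while the arm moves), tracks numbered from the innermost track outward, and hypothetical function names.

```python
def elevator_order(requests, start, moving_out=True):
    """Return the order in which the requested tracks are serviced.

    The arm starts at track `start`. It keeps moving in its current
    direction, servicing every requested track it passes, and reverses
    only when no request remains farther along in that direction.
    """
    pending = sorted(set(requests))
    if moving_out:
        first = [t for t in pending if t >= start]                  # outward sweep
        second = [t for t in pending if t < start][::-1]            # then inward
    else:
        first = [t for t in pending if t <= start][::-1]            # inward sweep
        second = [t for t in pending if t > start]                  # then outward
    return first + second


def arm_travel(order, start):
    """Total number of tracks the arm crosses while servicing `order`."""
    travel, position = 0, start
    for track in order:
        travel += abs(track - position)
        position = track
    return travel


if __name__ == "__main__":
    queue = [98, 183, 37, 122, 14, 124, 65, 67]   # made-up request queue
    order = elevator_order(queue, start=53)
    print(order, arm_travel(order, 53))
```

Servicing the same queue in arrival order would make the arm cross far more tracks, which is the reason the controller reorders requests.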

RAID
In a system with many disks, the rate at which data are read and written can be improved by operating the disks in parallel. A multi-disk system can also improve the reliability of storage by storing redundant information on different disks, so that the failure of one disk does not lose data. A variety of disk-organization techniques, collectively called RAID (Redundant Arrays of Inexpensive Disks), have been proposed to improve both performance and reliability.

IMPROVEMENT OF RELIABILITY VIA REDUNDANCY

The solution to the reliability problem is to introduce redundancy: extra information is stored that is not normally needed, but that can be used to rebuild the lost information when a disk fails. The mean time to failure, considered over the disk system as a whole, therefore increases. The simplest approach is to duplicate every disk. This technique is called mirroring or shadowing. A logical disk then consists of two physical disks, and every write is carried out on both. If one disk fails, the data can be read from the other. The mean time to failure of a mirrored disk depends on the mean time to failure of the individual disks and on the mean time to repair, that is, the average time needed to replace a failed disk and restore the data on it.
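The text does not give a formula for the mirrored case. A commonly used approximation, stated here as an assumption, treats the failures of the two disks as independent and counts data as lost only when the second disk fails while the first is still being repaired; the mean time to data loss is then MTTF^2 / (2 * MTTR). The sketch below uses hypothetical names and made-up sample values.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: 365-day year


def mirrored_mean_time_to_data_loss(mttf_hours, mttr_hours):
    """Approximate mean time to data loss of a mirrored pair.

    Assumes independent disk failures; data are lost only if the second
    disk fails during the repair window of the first.
    """
    return mttf_hours ** 2 / (2 * mttr_hours)


if __name__ == "__main__":
    # Made-up example: each disk has MTTF 100,000 hours, repair takes 10 hours.
    hours = mirrored_mean_time_to_data_loss(100_000, 10)
    print(hours, "hours, about", round(hours / HOURS_PER_YEAR), "years")
```

The independence assumption is optimistic in practice (power failures and disks of the same age tend to fail together), so the result is best read as an upper bound.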

IMPROVEMENT IN PERFORMANCE VIA PARALLELISM

With disk mirroring, the rate at which read requests are handled can double, because a read request can be sent to either disk. With multiple disks, the transfer rate can also be improved by striping data across the disks. In its simplest form, data striping splits the bits of each byte across multiple disks; this is called bit-level striping. For example, with an array of 8 disks, bit i of each byte is written to disk i. The array of 8 disks can then be treated as a single disk whose sectors are 8 times the normal size and, more importantly, whose transfer rate is 8 times higher. In such an organization every disk takes part in every access (read or write), so the number of accesses that can be processed per second is about the same as on a single disk, but each access can read or write 8 times as much data. Bit-level striping generalizes to a number of disks that is a multiple or a factor of 8. For example, with an array of 4 disks, bits i and 4+i of each byte go to disk i.

Striping need not be at the level of the bits of a byte. In block-level striping, for example, the blocks of a file are striped across the disks: with n disks, block i goes to disk (i mod n) + 1. Striping can also be done at the level of bytes, sectors, or the sectors of a block.

Parallelism in a disk system has two main goals:
1. Load-balance many small accesses (page accesses) so that the throughput of such accesses increases.
2. Parallelize large accesses so that the response time of large accesses is reduced.
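The two striping rules above can be made concrete with a short sketch. The function names are hypothetical, bit 1 is taken to be the most significant bit of the byte (an assumption; the text does not say), and only disk counts that are factors of 8 are handled.

```python
def bit_level_stripe(data, n_disks=8):
    """Distribute the bits of every byte over n_disks disks.

    Bit i (1-based) of each byte goes to disk ((i - 1) mod n_disks) + 1,
    so with 4 disks, bits i and 4 + i both land on disk i.
    Returns one list of bits per disk (index 0 is disk 1).
    """
    if 8 % n_disks != 0:
        raise ValueError("this sketch only handles disk counts that divide 8")
    disks = [[] for _ in range(n_disks)]
    for byte in data:
        for bit_no in range(1, 9):
            bit = (byte >> (8 - bit_no)) & 1      # bit 1 = most significant
            disks[(bit_no - 1) % n_disks].append(bit)
    return disks


def disk_for_block(i, n_disks):
    """Block-level striping: block i is stored on disk (i mod n) + 1."""
    return (i % n_disks) + 1


if __name__ == "__main__":
    print(bit_level_stripe(b"\xa5", 8))   # 0xA5 = 10100101
    print(bit_level_stripe(b"\xa5", 4))
    print([disk_for_block(i, 4) for i in range(1, 9)])
```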

RAID LEVELS
Mirroring provides high reliability, but it is expensive. Striping provides high data-transfer rates, but it does not improve reliability. Numerous schemes provide redundancy at lower cost by combining the idea of striping with parity bits. These schemes make different cost-performance trade-offs and are classified into levels called RAID levels.

RAID level 0: disk arrays with block-level striping but without any redundancy.

RAID level 1: disk mirroring.

RAID level 2: also known as memory-style error-correcting-code (ECC) organization. Memory systems detect errors by means of parity bits: each byte in a memory system may have a parity bit associated with it. Error-correcting schemes store two or more extra bits and can reconstruct the data if a single bit is damaged. The idea of error-correcting codes can be applied directly to disk arrays by striping bytes across the disks. For example, the first bit of each byte could be stored on disk 1, the second bit on disk 2, and so on up to the eighth bit on disk 8, with the error-correction bits stored on additional disks. If one of the disks fails, the remaining bits of the byte and the associated error-correction bits can be read from the other disks and used to rebuild the lost bit, so the data can be reconstructed. For an array of 4 data disks, RAID level 2 needs only 3 additional disks to hold the error-correction bits (these additional disks are called overhead disks), compared with 4 overhead disks for RAID level 1.

RAID level 3: also called bit-interleaved parity organization. Disk controllers can detect whether a sector has been read correctly, so a single parity bit is enough for error correction. If one of the sectors is damaged, we know exactly which sector it is, and for each bit in that sector we can work out whether it is a 1 or a 0 by computing the parity of the corresponding bits from the sectors on the other disks. If the parity of the remaining bits equals the stored parity, the missing bit is 0; otherwise it is 1. RAID level 3 is as good as level 2 but less expensive (it needs only one overhead disk).

RAID level 4: also called block-interleaved parity organization. Blocks are stored exactly as on regular disks, without striping them across the disks, but a parity block is kept on a separate disk for the corresponding blocks of the N other disks. If one of the disks fails, the parity block can be used together with the corresponding blocks from the other disks to restore the blocks of the failed disk. A block read accesses only one disk, which allows other requests to be processed by the other disks. The data-transfer rate of each individual access is therefore lower, but many read accesses can proceed in parallel, which gives a higher overall I/O rate. The transfer rate for large reads (many blocks) is high because all the disks can be read in parallel; large writes (many blocks) also have a high transfer rate because the data and the parity can be written in parallel. A write of a single block, however, must access both the disk on which the block is stored and the parity disk, because the parity block has to be updated too. A single-block write thus requires 4 accesses: two to read the two old blocks, and two to write the two new blocks.

RAID level 5: also called block-interleaved distributed parity. It improves on level 4 by partitioning the data and the parity among all N+1 disks, instead of storing data on N disks and parity on one separate disk as in RAID 4. In RAID 5 all the disks can take part in satisfying read requests, which increases the total number of requests that can be served per unit of time. For each block, one disk stores the parity and the other disks store data. For example, with an array of five disks, the parity for the n-th block is stored on disk (n mod 5) + 1, and the n-th blocks of the other 4 disks store the actual data for that block.

RAID level 6: also called the P+Q redundancy scheme. It is much like RAID 5 but stores extra redundant information to guard against multiple disk failures. Instead of parity, it uses error-correcting codes.
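The parity mechanism behind RAID levels 3 to 5 is a bytewise XOR. The sketch below, with hypothetical function names and made-up block contents, shows how a lost block is rebuilt, why a single-block write needs the old data and the old parity (the four accesses mentioned above), and where RAID 5 places the parity for the n-th block in a five-disk array.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_lost_block(surviving_data_blocks, parity_block):
    """XOR of the surviving blocks and the parity yields the lost block."""
    return xor_blocks(list(surviving_data_blocks) + [parity_block])


def parity_after_small_write(old_data, old_parity, new_data):
    """New parity for a single-block write.

    The caller has read old_data and old_parity (2 reads) and will write
    new_data and the returned parity (2 writes).
    """
    return xor_blocks([old_parity, old_data, new_data])


def raid5_parity_disk(n, n_disks=5):
    """RAID 5 placement from the text: parity of block n is on disk (n mod 5) + 1."""
    return (n % n_disks) + 1


if __name__ == "__main__":
    data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]     # blocks on three data disks
    parity = xor_blocks(data)
    print(rebuild_lost_block([data[0], data[2]], parity) == data[1])
    print([raid5_parity_disk(n) for n in range(6)])
```

The small-write shortcut avoids reading the other data disks: XOR-ing the old data out of the parity and the new data in gives the same result as recomputing the parity from scratch.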

CHOICE OF RAID LEVEL
If a disk fails, the time needed to rebuild its data is significant and varies with the RAID level in use. Rebuilding is easiest for RAID level 1. For the other levels, all the other disks in the array must be accessed to rebuild the data of the failed disk. The rebuild performance of a RAID system can be an important factor when continuous availability of data is required, as is often the case in high-performance or interactive database systems. Rebuild performance also affects the mean time to failure.

Because RAID levels 2 and 4 are subsumed by levels 3 and 5, the choice of RAID level is restricted to the remaining levels. RAID level 0 is used in high-performance applications where data loss is not critical. RAID level 1 is popular for applications such as storing log files in a database system. Because level 1 has a high overhead, levels 3 and 5 are often preferred for storing large volumes of data. The difference between level 3 and level 5 is one of data-transfer rate versus overall I/O rate: level 3 is preferred if a high data-transfer rate is required, and level 5 is preferred if random reads are important. Level 6 is rarely applied at present, but it offers better reliability than level 5.

EXTENSIONS
The concepts of RAID have been generalized to other storage devices, including arrays of tapes, and even to the broadcast of data over wireless systems. Applied to arrays of tapes, a RAID structure can recover the data even if one of the tapes is damaged. Applied to data broadcast, a block of data is split into short units and broadcast together with a parity unit; if one of the units is not received, it can be reconstructed from the others.

TERTIARY STORAGE


OPTICAL DISKS
The CD-ROM has the advantages of large storage capacity, easy portability (a disk can be inserted into and removed from a drive like a floppy disk), and low price. Compared with a hard disk, however, the seek time of a CD-ROM is much longer (about 250 ms) and the rotational speed is lower (about 400 rpm), which leads to higher latency; the data-transfer rate is also lower (about 150 KB/s). Recently a new optical-disk format, the digital video disk (DVD), has been standardized; these disks hold between 4.7 GB and 17 GB. WORM and rewritable disks are also becoming popular. WORM jukeboxes are devices that can store a large number of WORM disks and load them automatically, on demand, into one or a few WORM drives.

MAGNETIC TAPES
Tapes can hold large volumes of data, but they are slow compared with magnetic and optical disks. Access to tape is necessarily sequential, so tape is unsuitable for most secondary-storage requirements. Tapes are used mainly for backup, for storing infrequently used information, and as an off-line medium for transferring information from one system to another. Positioning the tape at the required data can take minutes. Tape jukeboxes hold a large number of tapes with a few drives, and can store many terabytes (10^12 bytes).

STORAGE ACCESS
A database is mapped into a number of different files, which are maintained by the underlying operating system. These files reside permanently on disks, with backups on tapes. Each file is partitioned into fixed-length storage units called blocks, which are the units of both storage allocation and data transfer. A block may contain several data items. We assume that no data item spans two blocks.

A major goal of the database system is to minimize the number of block transfers between disk and memory. One way to reduce the number of disk accesses is to keep as many blocks as possible in main memory, so that when a block is accessed it is already in main memory and no disk access is needed. Because it is not possible to keep all blocks in main memory, the allocation of the space available in main memory for storing blocks has to be managed. The buffer is the part of main memory available for storing copies of disk blocks. There is always a copy of every block on disk, but the copy on disk may be an older version than the one in the buffer. The subsystem responsible for allocating buffer space is called the buffer manager.

BUFFER MANAGER
Programs in a database system make requests to the buffer manager when they need a block from disk. If the block is already in the buffer, the address of the block in main memory is passed to the requester. If the block is not in the buffer, the buffer manager first allocates space in the buffer for it, throwing out some other block if necessary to make room for the new one. The block that is thrown out is written back to disk only if it has been modified since the last time it was written to disk. The buffer manager then reads the requested block from disk into the buffer and passes the address of the block in main memory to the requester.

The buffer manager is not very different from a virtual-memory manager. One difference is that the database may be far too large to fit in main memory, so the buffer manager has to use techniques more sophisticated than typical virtual-memory management schemes:

- Replacement strategy. When there is no room left in the buffer, a block must be removed from the buffer before a new one can be read in. Operating systems typically use the least recently used (LRU) scheme, in which the block that was referenced least recently is written back to disk and removed from the buffer. This approach can be improved for database applications.
- Pinned blocks. For the database system to be able to recover from crashes, the times at which a block may be written back to disk must be restricted. A block that is not allowed to be written back to disk is said to be pinned.
- Forced output of blocks. There are situations in which a block must be written back to disk even though the buffer space it occupies is not needed. Such a write is called the forced output of a block. In short, the reason for forced output is that the contents of main memory are lost in a crash, whereas the data on disk survive it.
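The behaviour described above (lookup, LRU replacement, write-back of modified blocks only, pinning, forced output) can be sketched in a few dozen lines. Everything here is a hypothetical illustration: the class and method names are invented, a Python dict stands in for the disk, and a pinned block is simply neither evicted nor written out.

```python
from collections import OrderedDict


class BufferManager:
    """Toy buffer manager with LRU replacement, pinning and forced output."""

    def __init__(self, capacity, disk):
        self.capacity = capacity        # number of blocks the buffer can hold
        self.disk = disk                # dict block_no -> bytes, stands in for the disk
        self.frames = OrderedDict()     # block_no -> frame, least recently used first
        self.disk_reads = 0
        self.disk_writes = 0

    def get(self, block_no):
        """Return the buffered frame, reading the block from disk if needed."""
        if block_no in self.frames:
            self.frames.move_to_end(block_no)       # now the most recently used
            return self.frames[block_no]
        if len(self.frames) >= self.capacity:
            self._evict()
        self.disk_reads += 1
        frame = {"data": self.disk[block_no], "dirty": False, "pins": 0}
        self.frames[block_no] = frame
        return frame

    def update(self, block_no, data):
        """Modify a block in the buffer only; the disk copy becomes stale."""
        frame = self.get(block_no)
        frame["data"] = data
        frame["dirty"] = True

    def pin(self, block_no):
        self.frames[block_no]["pins"] += 1

    def unpin(self, block_no):
        self.frames[block_no]["pins"] -= 1

    def force_output(self, block_no):
        """Write a buffered block to disk even though its space is not needed."""
        frame = self.frames[block_no]
        if frame["pins"] > 0:
            raise RuntimeError("a pinned block may not be written back")
        self.disk[block_no] = frame["data"]
        frame["dirty"] = False
        self.disk_writes += 1

    def _evict(self):
        """Throw out the least recently used block that is not pinned."""
        victim = next((b for b, f in self.frames.items() if f["pins"] == 0), None)
        if victim is None:
            raise RuntimeError("every buffered block is pinned")
        if self.frames[victim]["dirty"]:
            self.force_output(victim)   # written back only if it was modified
        del self.frames[victim]
```

A short usage example: with a two-block buffer, `get(1)`, `update(2, ...)`, `get(1)`, `get(3)` evicts block 2 (the least recently used) and writes it back because it was modified, while an unmodified victim is dropped without any disk write.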

BUFFER-REPLACEMENT POLICIES


The goal of a replacement strategy for blocks in the buffer is to minimize disk accesses. Operating systems usually use the LRU strategy to replace blocks. A database system, however, can predict the pattern of future references. A user request to the database system involves several steps, and the database system can often determine in advance which blocks will be needed by examining each of the steps required to carry out the operation the user requested. Unlike an operating system, then, a database system can have information about the future, at least the near future. In many cases the optimal block-replacement strategy for a database system turns out to be MRU (most recently used): the block to be replaced is the one that was used most recently.

The buffer manager can use statistical information about the probability that a request will reference a particular relation. The data dictionary is one of the most frequently accessed parts of the database, so the buffer manager should not remove data-dictionary blocks from main memory unless other factors force it to do so. An index for a file is accessed more frequently than the file itself, so the buffer manager should also avoid removing index blocks from main memory if it has a choice.

The ideal database block-replacement strategy requires knowledge of the database operations being performed. No single strategy is known that handles all possible scenarios well. Surprisingly, a large number of database systems use LRU despite its shortcomings.

The strategy that the buffer manager uses for block replacement is also influenced by factors other than the time at which a block will be referenced again. If the system is processing requests from several users concurrently, the concurrency-control subsystem may need to delay certain requests in order to ensure the consistency of the database. If the buffer manager is given information by the concurrency-control subsystem about which requests are being delayed, it can use this information to alter its block-replacement strategy. Specifically, the blocks needed by active requests can be kept in the buffer at the expense of the blocks needed by delayed requests.

The crash-recovery subsystem imposes the most stringent constraints on block replacement. If a block has been modified, the buffer manager is not allowed to write the new version of the block in the buffer back to disk, because doing so would destroy the old version. Instead, the buffer manager must seek permission from the crash-recovery subsystem before writing out a block. The crash-recovery subsystem may demand that certain other blocks be force-output before it grants the buffer manager permission to output the requested block.
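One access pattern in which MRU beats LRU is a repeated sequential scan over a set of blocks that is slightly larger than the buffer. The text does not name a specific pattern, so the scan below is an assumption chosen for illustration, and the function name is hypothetical.

```python
def count_misses(references, capacity, policy):
    """Count buffer misses for a reference string under "LRU" or "MRU".

    `buffer` is kept ordered from least to most recently used.
    """
    buffer = []
    misses = 0
    for block in references:
        if block in buffer:
            buffer.remove(block)            # hit: will be re-appended as most recent
        else:
            misses += 1
            if len(buffer) >= capacity:
                if policy == "LRU":
                    buffer.pop(0)           # drop the least recently used block
                else:
                    buffer.pop()            # MRU: drop the most recently used block
        buffer.append(block)
    return misses


if __name__ == "__main__":
    scan = [1, 2, 3, 4] * 3                 # scan four blocks three times
    print("LRU misses:", count_misses(scan, 3, "LRU"))
    print("MRU misses:", count_misses(scan, 3, "MRU"))
```

With a three-block buffer, LRU always throws out exactly the block that the scan will need next, so every reference misses, whereas MRU keeps most of the working set in the buffer.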

FILE ORGANIZATION
A file is organized logically as a sequence of records. These records are mapped onto disk blocks. Files are provided as a basic construct by operating systems, so we assume the existence of an underlying file system. We need to consider ways of representing logical data models in terms of files.


Blocks have a fixed size, determined by the physical properties of the disk and by the operating system, but record sizes vary. In a relational database, the tuples of different relations generally have different sizes. One approach to mapping a database to files is to use several files and to store records of only one fixed length in any given file. An alternative is to structure the files so that they can accommodate records of several lengths. Files of fixed-length records are easier to implement than files of variable-length records.

FIXED-LENGTH RECORDS


Consider a file of account records for a bank database. Each record of this file is defined as follows:

    type depositor = record
        branch_name: char(22);
        account_number: char(10);
        balance: real;
    end

Assume that each character occupies 1 byte and each real occupies 8 bytes, so that an account record is 40 bytes long. A simple approach is to use the first 40 bytes for the first record, the next 40 bytes for the second record, and so on. This simple approach raises the following problems:

Figure 1. File F containing account records

    record 0   Perryridge   A-102   400
    record 1   Round Hill   A-305   350
    record 2   Mianus       A-215   700
    record 3   Downtown     A-101   500
    record 4   Redwood      A-222   700
    record 5   Perryridge   A-201   900
    record 6   Brighton     A-217   750
    record 7   Downtown     A-110   600
    record 8   Perryridge   A-218   700

Figure 2. File F after record 2 is deleted and the records after it are moved

    record 0   Perryridge   A-102   400
    record 1   Round Hill   A-305   350
    record 3   Downtown     A-101   500
    record 4   Redwood      A-222   700
    record 5   Perryridge   A-201   900
    record 6   Brighton     A-217   750
    record 7   Downtown     A-110   600
    record 8   Perryridge   A-218   700

Figure 3. File F after record 2 is deleted and the last record is moved into its place

    record 0   Perryridge   A-102   400
    record 1   Round Hill   A-305   350
    record 8   Perryridge   A-218   700
    record 3   Downtown     A-101   500
    record 4   Redwood      A-222   700
    record 5   Perryridge   A-201   900
    record 6   Brighton     A-217   750
    record 7   Downtown     A-110   600

Figure 4. File F with a header and a free list, after records 1, 4 and 6 are deleted

    header
    record 0   Perryridge   A-102   400
    record 1   (deleted)
    record 2   Mianus       A-215   700
    record 3   Downtown     A-101   500
    record 4   (deleted)
    record 5   Perryridge   A-201   900
    record 6   (deleted)
    record 7   Downtown     A-110   600
    record 8   Perryridge   A-218   700

1. It is difficult to delete a record from this structure. The space occupied by the deleted record must be filled with some other record of the file, or the deleted record must be marked.
2. Unless the block size is a multiple of 40, some records will cross block boundaries; that is, part of the record is stored in one block and part in another. Reading or writing such a record requires two block accesses.

When a record is deleted, the record that follows it could be moved into the space formerly occupied by the deleted record, the next record into the space just vacated, and so on, until every record after the deleted one has been moved toward the front. This approach requires moving a large number of records. A simpler approach is to move the last record of the file into the space occupied by the deleted record, but this requires additional block accesses. Because insertions tend to be more frequent than deletions, it is acceptable to leave the space occupied by the deleted record open and to wait for a later insertion to reuse it. A simple marker on the deleted record is not sufficient, because it would make it hard to find the free space when inserting. An additional structure is therefore needed.

At the beginning of the file, a certain number of bytes are allocated as a file header. The header contains information about the file. In particular, it contains the address of the first deleted record; that record in turn contains the address of the second deleted record, and so on. The deleted records thus form a linked list, called the free list. To insert a new record, we use the head-of-list pointer stored in the header: if the free list is not empty, the new record is placed in the slot that the pointer refers to, and the header pointer is advanced to the next free record; otherwise the new record is added at the end of the file.

Insertion and deletion are simple for files of fixed-length records, because the space freed by a deleted record is exactly the space needed to insert a record. For files of variable-length records the matter becomes much more complicated.
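The free-list scheme described above can be sketched as follows. The sketch assumes field widths of 22, 10 and 8 bytes so that each packed record is exactly 40 bytes; the class and method names are hypothetical, the file is modelled as an in-memory list of slots, and a freed slot is pushed onto the front of the free list.

```python
import struct

# Assumed layout: 22-byte branch name, 10-byte account number, 8-byte real.
RECORD = struct.Struct("<22s10sd")      # 22 + 10 + 8 = 40 bytes, no padding


class FixedLengthFile:
    """File of 40-byte records with a header that heads a free list."""

    def __init__(self):
        self.free_head = None   # header field: slot number of the first deleted record
        self.slots = []         # a slot holds 40 packed bytes or ("free", next_free_slot)

    def insert(self, branch_name, account_number, balance):
        packed = RECORD.pack(branch_name.encode(), account_number.encode(), balance)
        if self.free_head is None:              # free list empty: append at end of file
            self.slots.append(packed)
            return len(self.slots) - 1
        slot = self.free_head                   # reuse the first free slot
        self.free_head = self.slots[slot][1]    # header now points to the next free slot
        self.slots[slot] = packed
        return slot

    def delete(self, slot):
        # The freed slot stores the address of the previously first free slot.
        self.slots[slot] = ("free", self.free_head)
        self.free_head = slot

    def read(self, slot):
        branch, account, balance = RECORD.unpack(self.slots[slot])
        return branch.rstrip(b"\0").decode(), account.rstrip(b"\0").decode(), balance


if __name__ == "__main__":
    f = FixedLengthFile()
    f.insert("Perryridge", "A-102", 400.0)
    f.insert("Round Hill", "A-305", 350.0)
    f.insert("Mianus", "A-215", 700.0)
    f.delete(1)
    print(f.insert("Brighton", "A-217", 750.0))   # reuses slot 1
```

Because every record has the same size, any free slot fits any new record, which is exactly why this scheme does not carry over directly to variable-length records.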

VARIABLE-LENGTH RECORDS


Variable-length records arise in database systems for several reasons:
o Storage of multiple record types in one file
o Record types that allow variable-length fields
o Record types that allow repeating fields


There are several techniques for implementing variable-length records. To illustrate them, we consider different representations of variable-length records with the following format:

    type account_list = record
        branch_name: char(20);
        account_info: array[1..∞] of record
            account_number: char(10);
            balance: real;
        end;
    end

Byte-String Representation


A simple way to implement variable-length records is to attach a special end-of-record symbol (⊥) to the end of each record. Each record can then be stored as a string of consecutive bytes. Instead of using a special symbol at the end of each record, a variant of the byte-string representation stores the record length at the beginning of each record.
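Figure: Byte-string representation of variable-length records

    record 0   Perryridge   A-102  400   A-201  900   A-210  700
    record 1   Round Hill   A-301  350
    record 2   Mianus       A-101  800
    record 3   Downtown     A-211  500
    record 4   Redwood      A-300  650
    record 5   Brighton     A-111  750

    Further account entries shown in the figure: A-222 600, A-200 1200, A-255 950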

The byte-string representation has the following disadvantages:
- It is not easy to reuse the space formerly occupied by a deleted record, which leads to a large number of small fragments of wasted disk storage.
- There is no space for records to grow. If a variable-length record becomes longer, it must be moved, and moving it is costly if the record is pinned.
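The length-prefixed variant of the byte-string representation mentioned above can be sketched as follows. The 2-byte little-endian length field and the function names are assumptions made for illustration; the text does not fix the width of the length field.

```python
import struct

LENGTH = struct.Struct("<H")    # assumed: 2-byte unsigned length prefix


def encode_records(records):
    """Concatenate records, each preceded by its length in bytes."""
    out = bytearray()
    for record in records:
        out += LENGTH.pack(len(record)) + record
    return bytes(out)


def decode_records(blob):
    """Split a byte string produced by encode_records back into records."""
    records, position = [], 0
    while position < len(blob):
        (length,) = LENGTH.unpack_from(blob, position)
        position += LENGTH.size
        records.append(blob[position:position + length])
        position += length
    return records


if __name__ == "__main__":
    rows = [b"Perryridge|A-102|400|A-201|900", b"Mianus|A-101|800"]
    blob = encode_records(rows)
    print(len(blob), decode_records(blob) == rows)
```

The sketch also makes the drawbacks visible: a record can only be located by walking the string from the start, and a record that grows no longer fits in its place.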

The byte-string representation is not commonly used for implementing variable-length records, but a modified form of it, called the slotted-page structure, is commonly used for organizing records within a single block. In the slotted-page structure there is a header at the beginning of each block, containing the following information:
- The number of record entries in the header
- The end of free space in the block
- An array whose entries contain the location and size of each record

The actual records are allocated contiguously in the block, starting from the end of the block. The free space in the block is contiguous and lies between the final entry in the header array and the first record. When a record is inserted, space is allocated for it at the end of the free space, and an entry for it is added to the header.
Figure: Slotted-page structure (the block header holds #entries, the end-of-free-space pointer, and an array of size/location entries; the free space lies between the header and the records)

If a record is deleted, the space it occupies is freed and its entry is marked as deleted (for example, its size is set to -1). The records in the block that lie before the deleted record are then moved, so that the free space of the block is again the region between the final entry in the header array and the first record. The end-of-free-space pointer and the pointers of the moved records are updated. Records can grow or shrink using similar techniques, as long as there is space in the block. The cost of moving records is not too high, because blocks are not large (typically 4 KB).
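Insertion and deletion in a slotted page can be sketched as follows. This is a simplified model with hypothetical names: the header array is kept as a Python list rather than being stored inside the block, so the space taken by the header itself is not accounted for.

```python
class SlottedPage:
    """Simplified slotted page: records grow downward from the end of the block."""

    def __init__(self, size=4096):
        self.data = bytearray(size)
        self.entries = []           # header array: [location, size], size -1 = deleted
        self.free_end = size        # end-of-free-space pointer

    def insert(self, record):
        if len(record) > self.free_end:
            raise ValueError("not enough free space in the block")
        self.free_end -= len(record)                    # allocate at end of free space
        self.data[self.free_end:self.free_end + len(record)] = record
        self.entries.append([self.free_end, len(record)])
        return len(self.entries) - 1                    # slot number

    def read(self, slot):
        location, size = self.entries[slot]
        if size < 0:
            raise KeyError("record was deleted")
        return bytes(self.data[location:location + size])

    def delete(self, slot):
        location, size = self.entries[slot]
        # Records stored before the deleted one (at lower addresses) slide toward
        # the end of the block, so the free space stays one contiguous region.
        start = self.free_end
        self.data[start + size:location + size] = self.data[start:location]
        for entry in self.entries:
            if entry[1] >= 0 and entry[0] < location:
                entry[0] += size                        # update moved records' pointers
        self.entries[slot] = [0, -1]                    # mark the entry as deleted
        self.free_end += size


if __name__ == "__main__":
    page = SlottedPage(64)
    a = page.insert(b"alpha")
    b = page.insert(b"bravo!")
    c = page.insert(b"charlie")
    page.delete(b)
    print(page.read(a), page.read(c), page.free_end)
```

Because other structures refer to a record by its slot number rather than by its byte offset, records can be moved inside the block without invalidating those references.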

Fixed-Length Representation
Another way to implement variable-length records efficiently in a file system is to use one or more fixed-length records to represent one variable-length record. There are two techniques for implementing files of variable-length records by means of fixed-length records:

1. Reserved space. Assume that record lengths never exceed some threshold (the maximum length). We can then use fixed-length records of that maximum length. The unused space is filled with a special symbol: null or end-of-record.
2. Pointers. A variable-length record is represented by a list of fixed-length records, chained together by pointers.

The disadvantage of the pointer structure is that space is wasted in every record except the first one in the list (the first record needs the branch_name field, but the later records in the list do not). To deal with this problem, the blocks of the file are divided into two kinds:
- Anchor block: contains only the first records of the lists.
- Overflow block: contains the remaining records of the lists.

All records within a block therefore have the same length, even though the file may contain records of different lengths.
Figure: Using the reserved-space method

    record 0   Perryridge   A-102  400   A-201  900   A-210  700
    record 1   Round Hill   A-301  350
    record 2   Mianus       A-101  800
    record 3   Downtown     A-211  500
    record 4   Redwood      A-300  650
    record 5   Brighton     A-111  750

    Further account entries shown in the figure: A-222 600, A-200 1200, A-255 950

Figure: Using the pointer method

    record 0   Perryridge   A-102   400
    record 1                A-201   900
    record 2                A-210   700
    record 3   Round Hill   A-301   350
    record 4   Mianus       A-101   800
    record 5   Downtown     A-211   500
    record 6   Redwood      A-300   650
    ...

Figure: Anchor-block and overflow-block structure

    Anchor block
        Perryridge   A-102   400
        Round Hill   A-301   350
        Mianus       A-101   800
        Downtown     A-211   500
        Redwood      A-300   650
        Brighton     A-111   750

    Overflow block
        A-201   900
        A-210   700
        A-222   600
        A-200   1200
        A-255   950

ORGANIZATION OF RECORDS IN FILES


We now consider how to represent records in a file structure. An instance of a relation is a set of records. Given a set of records, the question is how to organize them in a file. There are several possible organizations:

- Heap file organization. Any record can be placed anywhere in the file where there is space for it. There is no ordering of records. There is one file per relation.
- Sequential file organization. Records are stored in sequential order, based on the value of the search key of each record.
- Hashed file organization. A hash function is computed on some attribute of each record. The result of the hash function specifies in which block of the file the record is to be placed. This organization is closely related to index structures.
- Clustering file organization. Records of several different relations can be stored in the same file. Related records of the different relations are stored on the same block, so that one I/O operation fetches related records from all the relations.

SEQUENTIAL FILE ORGANIZATION

A sequential file is designed for efficient processing of records in sorted order based on some search key. To permit fast retrieval of records in search-key order, we chain the records together by pointers. The pointer in each record points to the next record in search-key order. Furthermore, to minimize the number of block accesses in sequential file processing, we store the records physically in search-key order, or as close to search-key order as possible. A sequential file allows records to be read in sorted order, which is useful for display purposes as well as for query-processing algorithms.

Brighton      A-217    750
Downtown      A-101    500
Downtown      A-110    600
Mianus        A-215    700
Perryridge    A-102    400
Perryridge    A-201    900
Perryridge    A-218    700
Redwood       A-222    850
Round Hill    A-301    550

The difficulty with this organization is maintaining the physical sequential order as records are inserted and deleted, because moving records on every insertion or deletion is costly. Deletion can be managed with pointer chains, as described earlier. For insertion, we apply the following rules:
1. Locate the record in the file that comes before the record to be inserted in search-key order.
2. If there is a free record (space left by a deleted record) in the same block, insert the new record there. Otherwise, insert the new record in an overflow block. In either case, adjust the pointers so that they chain the records together in search-key order.

Brighton      A-217    750
Downtown      A-101    500
Downtown      A-110    600
Mianus        A-215    700
Perryridge    A-102    400
Perryridge    A-201    900
Perryridge    A-218    700
Redwood       A-222    850
Round Hill    A-301    550

Overflow block:
North Town    A-777    1100
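The two insertion rules can be sketched in a few lines of Python. This is a minimal illustration, not a storage-engine implementation: the names `Record`, `load`, `insert` and `scan` are hypothetical, slots in a list stand in for disk addresses, and the sketch assumes that no free slot exists in the target block, so the new record always goes to the overflow area at the end of the list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    branch_name: str
    account_number: str
    balance: int
    next: Optional[int] = None  # slot of the next record in search-key order


def load(rows):
    """Store rows (already sorted on branch_name) in consecutive slots and chain them."""
    slots = [Record(*row) for row in rows]
    for n in range(len(slots) - 1):
        slots[n].next = n + 1
    return slots


def insert(slots, head, record):
    """Place record in the overflow area (end of slots) and relink the chain.

    Returns the slot of the first record in search-key order.
    """
    slots.append(record)
    new = len(slots) - 1
    prev, cur = None, head
    # Rule 1: find the record that precedes the new one in search-key order.
    while cur is not None and slots[cur].branch_name <= record.branch_name:
        prev, cur = cur, slots[cur].next
    # Rule 2 (overflow case): adjust the pointers around the new record.
    record.next = cur
    if prev is None:
        return new
    slots[prev].next = new
    return head


def scan(slots, head):
    """Read the file in search-key order by following the pointer chain."""
    order = []
    while head is not None:
        order.append(slots[head].account_number)
        head = slots[head].next
    return order


ROWS = [
    ("Brighton", "A-217", 750), ("Downtown", "A-101", 500),
    ("Downtown", "A-110", 600), ("Mianus", "A-215", 700),
    ("Perryridge", "A-102", 400), ("Perryridge", "A-201", 900),
    ("Perryridge", "A-218", 700), ("Redwood", "A-222", 850),
    ("Round Hill", "A-301", 550),
]
slots = load(ROWS)
head = insert(slots, 0, Record("North Town", "A-777", 1100))
```

The new record is physically last, but a scan along the pointers still visits it between Mianus and Perryridge. As more records land in overflow blocks, physical order and search-key order drift apart, which is why such a file eventually has to be reorganized.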

CLUSTERING FILE ORGANIZATION
Many relational database systems store each relation in a separate file, so that they can take full advantage of the file system provided by the operating system. Usually, the tuples of a relation are represented as fixed-length records, so relations can be mapped onto a simple file structure. This simple implementation of a relational database system is well suited to database systems designed for personal computers. In such systems the database is small. Furthermore, on some personal computers the overall size of the object code of the database system must itself be small. A simple file structure reduces the amount of code needed to implement the system.
This simple approach to implementing a relational database becomes less satisfactory as the database grows. There are performance advantages to be gained from careful assignment of records to blocks, and from careful organization of the blocks themselves. Thus a more complicated file structure may be beneficial, even if we keep the strategy of storing each relation in a separate file.
However, many large-scale database systems do not rely directly on the underlying operating system for file management. Instead, one large operating-system file is allocated to the database system. All relations are stored in this one file, and the management of the file is left to the database system. To see the advantage of storing several relations in one file, consider the following SQL query:
SELECT account_number, customer.customer_name, customer_street, customer_city
FROM depositor, customer
WHERE depositor.customer_name = customer.customer_name;

This query computes a join of the depositor and customer relations. Thus, for each tuple of depositor, the system must locate the customer tuples with the same value of customer_name. Ideally, these records are located with the help of indices. Regardless of how the records are located, they must be transferred from disk into main memory. In the worst case, each record resides on a different block, forcing us to read one block for every record required by the query.
We now present a file structure designed for efficient execution of queries involving the join of depositor and customer. The depositor tuples for each customer_name are stored near the customer tuple with the same customer_name. This structure mixes together the tuples of two relations, but allows efficient processing of the join. When a tuple of the customer relation is read, the entire block containing that tuple is copied from disk into main memory. Since the corresponding depositor tuples are stored on disk near the customer tuple, the block containing the customer tuple also contains the depositor tuples needed to process the query. If a customer has so many accounts that the depositor records do not fit in one block, the remaining records appear on nearby blocks. This file structure, called clustering, allows us to read many of the required records with a single block read, so this particular query can be processed more efficiently.
The customer relation:
customer_name    customer_street    customer_city
Hays             Main               Brooklyn
Turner           Putnam             Stamford

The depositor relation:
customer_name    account_number
Hays             A-102
Hays             A-220
Hays             A-503
Turner           A-305

Clustering file structure:
Hays      Main      Brooklyn
Hays      A-102
Hays      A-220
Hays      A-503
Turner    Putnam    Stamford
Turner    A-305

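However, for some queries the clustering structure above is less efficient than storing each relation in a separate file, for example:
SELECT * FROM customer
Because the customer records are interleaved with depositor records, the same customer tuples are spread over more blocks than they would occupy in a file of their own. Determining when to use clustering depends on the types of query that the database designer believes to be most frequent. Careful use of clustering can produce significant performance gains in query processing.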

DATA-DICTIONARY STORAGE
A database system needs to maintain data about the relations, such as their schemas. This information is called the data dictionary, or system catalog. Among the types of information the system must store are:
- Names of the relations
- Names of the attributes of each relation
- Domains and lengths of the attributes
- Names of the views defined on the database, and the definitions of those views
- Integrity constraints
Many systems also keep information about the users of the system:
- Names of authorized users
- Accounting information about users
Statistical and descriptive data about the relations may also be stored:
- Number of tuples in each relation
- Storage method used for each relation (clustered or not)
Information about each index on each relation must be stored as well:
- Name of the index
- Name of the relation being indexed
- Attributes on which the index is defined
- Type of index formed
All this information in effect constitutes a miniature database. Some database systems store it using special-purpose data structures and code. In general, storing the data about the database in the database itself is preferred. By using the database to store system data, we simplify the overall structure of the system and allow the full power of the database to be used for fast access to system data. The exact choice of how to represent system data using relations is left to the system designer. As an example, consider the following representation:
System_catalog_schema = (relation_name, number_of_attributes)
Attribute_schema = (attribute_name, relation_name, domain_type, position, length)
User_schema = (user_name, encrypted_password, group)
Index_schema = (index_name, relation_name, index_type, index_attributes)
View_schema = (view_name, definition)

INDEXING
Consider searching for a book in a library. Suppose we want to find a book by a particular author. We first look in the author catalog, where a card tells us where the book can be found. The cards in the catalog are kept in alphabetical order, so we can find the card we want quickly without going through all the cards. An index for a file in a database system works in much the same way as a library catalog. However, an index built exactly like the catalog just described would in practice be too large to handle efficiently. Instead, more sophisticated indexing techniques are used. There are two basic kinds of indices:
Ordered indices. Based on a sorted ordering of the values.
Hash indices. Based on a uniform distribution of values across a range of buckets. The bucket to which a value is assigned is determined by a function called a hash function.

For both kinds we present several techniques. No one technique is the best; each is suited to particular database applications. Each technique must be evaluated on the basis of the following factors:
Access types: the types of access that are supported efficiently. These include finding records with a specified attribute value and finding records whose attribute values fall in a specified range.
Access time: the time it takes to find a particular data item or set of items.
Insertion time: the time it takes to insert a new data item. This includes the time to find the correct place for the item and the time to update the index structure.
Deletion time: the time it takes to delete a data item. This includes the time to find the item to be deleted and the time to update the index structure.
Space overhead: the additional space occupied by an index structure.

A file often has several indices. An attribute or set of attributes used to look up records in a file is called a search key. Note that this definition differs from the definitions of primary key, candidate key, and superkey. Thus, if there are several indices on a file, there are several corresponding search keys.

ORDERED INDICES
An ordered index stores the search-key values in sorted order and associates with each search key the records that contain it. The records in the indexed file may themselves be stored in some sorted order. A file may have several indices on different search keys. If the file containing the records is sequentially ordered, the index whose search key specifies the sequential order of the file is called the primary index. Primary indices are also called clustering indices. The search key of a primary index is usually, but not necessarily, the primary key. Indices whose search key specifies an order different from the sequential order of the file are called secondary indices, or nonclustering indices.

Brighton      A-217    750
Downtown      A-101    500
Downtown      A-110    600
Mianus        A-215    700
Perryridge    A-102    400
Perryridge    A-201    900
Perryridge    A-218    700
Redwood       A-222    850
Round Hill    A-301    550
Sequential file of account records

Primary index

In this section, we assume that all files are ordered sequentially on some search key. Such files, with a primary index on the search key, are called index-sequential files. They represent one of the oldest index schemes used in database systems. They are designed for applications that require both sequential processing of the entire file and random access to individual records.

Dense and sparse indices

[Figure: dense index on the account file. One index entry for each branch name (Brighton, Downtown, Mianus, Perryridge, Redwood, Round Hill), each pointing to the first account record with that branch name.]

[Figure: sparse index on the account file. Index entries for Brighton, Mianus and Redwood only, each pointing to the first account record with that branch name.]

There are two types of ordered indices:
Dense index. An index record (index entry) appears for every search-key value in the file. The index record contains the search-key value and a pointer to the first data record with that search-key value.
Sparse index. An index record is created for only some of the values. As with dense indices, each index record contains a search-key value and a pointer to the first data record with that search-key value. To locate a record, we find the index entry with the largest search-key value that is less than or equal to the search-key value we are looking for. We start at the record pointed to by that index entry and follow the pointers in the (data) file until we find the desired record.

Example: Suppose we are looking up the records for the Perryridge branch using the dense index. We first find Perryridge in the index (by binary search) and follow the corresponding pointer to the first data record with branch_name = Perryridge. We process that record, then follow the pointer in that record to locate the next record in search-key order, process it, and continue until we reach a record whose branch_name differs from Perryridge. With the sparse index, we first find the index entry with the largest branch_name that is less than or equal to Perryridge, which is the Mianus entry. We follow its pointer to the data record and then follow the pointers in the file, starting from the Mianus record, until we reach the first Perryridge data record; processing begins at that point.
A dense index lets us locate a record faster than a sparse index does, but a sparse index requires less space and imposes less maintenance overhead on insertions and deletions. The system designer must weigh access time against space overhead. A good compromise is a sparse index with one index entry per block, because the dominant cost in processing a database request is the time it takes to bring a block from disk into main memory. Once a block is in memory, the time to scan the entire block is negligible. Using this sparse index, we locate the block containing the record we are looking for. Thus, unless the record is on an overflow block, we minimize block accesses while keeping the size of the index as small as possible.
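The two lookups in the example can be traced with a short Python sketch. It is only an illustration under simplifying assumptions: list positions stand in for record pointers, the file is stored physically in search-key order, and the names `build_dense`, `lookup_dense`, `lookup_sparse` and `SPARSE` are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Sequential account file: (branch_name, account_number, balance).
ACCOUNT_FILE = [
    ("Brighton", "A-217", 750), ("Downtown", "A-101", 500),
    ("Downtown", "A-110", 600), ("Mianus", "A-215", 700),
    ("Perryridge", "A-102", 400), ("Perryridge", "A-201", 900),
    ("Perryridge", "A-218", 700), ("Redwood", "A-222", 850),
    ("Round Hill", "A-301", 550),
]


def build_dense(file):
    """One entry per distinct search-key value, pointing to its first record."""
    keys, ptrs = [], []
    for pos, record in enumerate(file):
        if not keys or keys[-1] != record[0]:
            keys.append(record[0])
            ptrs.append(pos)
    return keys, ptrs


def _collect(file, pos, key):
    """Follow the file from pos: skip smaller keys, then gather matching records."""
    while pos < len(file) and file[pos][0] < key:
        pos += 1
    matches = []
    while pos < len(file) and file[pos][0] == key:
        matches.append(file[pos][1])
        pos += 1
    return matches


def lookup_dense(file, index, key):
    keys, ptrs = index
    n = bisect_left(keys, key)            # binary search for the exact key
    if n == len(keys) or keys[n] != key:
        return []
    return _collect(file, ptrs[n], key)


def lookup_sparse(file, index, key):
    keys, ptrs = index
    n = bisect_right(keys, key) - 1       # largest index key <= search key
    if n < 0:
        return []
    return _collect(file, ptrs[n], key)


DENSE = build_dense(ACCOUNT_FILE)
SPARSE = (["Brighton", "Mianus", "Redwood"], [0, 3, 7])  # entries as in the sparse-index figure
```

For Perryridge, the dense lookup jumps straight to position 4, while the sparse lookup starts at the Mianus record (position 3) and scans forward one record before it finds the first match. Both return the same three accounts.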

Multilevel indices
Even a sparse index may be too large to fit in main memory. Searching for an index entry in such an index requires reading several disk blocks. Binary search can be used to locate an entry in the index file, but it still requires about log2(B) block accesses, where B is the number of disk blocks occupied by the index. If B is large, this access time is significant. Moreover, if overflow blocks are used, binary search is not possible and the search must be sequential, which may require up to B block accesses.
To deal with this problem, we treat the index file like a sequential file and build a sparse index on it. To locate an index entry, we binary-search the outer index for the record with the largest search-key value that is less than or equal to the one we want. The corresponding pointer leads to a block of the inner index. Within this block, we again find the record with the largest search-key value that is less than or equal to the one we want; the pointer field of this record points to the block of the file that contains the record we are looking for. Since the outer index is small and can stay in main memory, a lookup requires reading only one index block. We can repeat this construction as many times as necessary. Indices with two or more levels are called multilevel indices. Searching for a record with a multilevel index requires significantly fewer block accesses than binary search does.

[Figure: two-level sparse index. The outer index points to the blocks of the inner index (index block 0, index block 1, ...), whose entries point to the data blocks.]
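A two-level lookup can be sketched as follows. The block sizes, the integer keys and the function names (`chunk`, `find_le`, `build`, `lookup`) are assumptions made for illustration; lists of lists stand in for disk blocks, and the outer index is assumed to stay in main memory.

```python
from bisect import bisect_right


def chunk(items, size):
    """Split a list into consecutive blocks of at most `size` items."""
    return [items[n:n + size] for n in range(0, len(items), size)]


def find_le(entries, key):
    """Pointer of the entry with the largest key <= key; entries are sorted (key, ptr) pairs."""
    n = bisect_right([k for k, _ in entries], key) - 1
    return entries[max(n, 0)][1]


def build(keys, recs_per_block, entries_per_block):
    data_blocks = chunk(sorted(keys), recs_per_block)
    # Inner index: sparse, one entry (first key, block number) per data block.
    inner_blocks = chunk([(blk[0], n) for n, blk in enumerate(data_blocks)],
                         entries_per_block)
    # Outer index: sparse, one entry per block of the inner index.
    outer = [(blk[0][0], n) for n, blk in enumerate(inner_blocks)]
    return data_blocks, inner_blocks, outer


def lookup(key, data_blocks, inner_blocks, outer):
    """Return (found, inner index block read, data block read)."""
    inner_no = find_le(outer, key)                    # in memory, no disk access
    data_no = find_le(inner_blocks[inner_no], key)    # one index-block read
    return key in data_blocks[data_no], inner_no, data_no   # one data-block read


DATA, INNER, OUTER = build(range(0, 1000, 2), recs_per_block=10, entries_per_block=10)
```

With these numbers the inner index occupies 5 blocks. A lookup reads one inner-index block and one data block, whereas binary search over the 5 index blocks could read up to 3 of them before reaching the data.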

Index update
Whenever a record is inserted or deleted, every index on the file containing the record must be updated. Below we describe the update algorithms for single-level indices.
Deletion. To delete a record, we first look up the record to be deleted. If the deleted record is the first record in the chain of records identified by the pointer of the index entry found during the search, there are two cases. If the deleted record was the only record in the chain, we delete the corresponding entry from the index. Otherwise, we replace the search key in the index entry with the search key of the record that follows the deleted record in the chain, and the pointer with the address of that next record. In all other cases, deleting the record requires no change to the index.
Insertion. First we perform a lookup using the search-key value of the record to be inserted. If the index is dense and the search-key value does not appear in the index, we insert the value and a pointer to the record into the index. If the index is sparse and stores one entry per block, no change is needed unless a new block is created. In that case, the first search-key value in the new block is inserted into the index.
Insertion and deletion algorithms for multilevel indices are a simple extension of the algorithms just described.

Secondary indices
A secondary index on a candidate key looks just like a dense primary index, except that the records pointed to by successive values in the index are not stored sequentially. In general, secondary indices may be structured differently from primary indices. If the search key of a primary index is not a candidate key, it suffices for the index to point to the first record with a particular search-key value (the other records with the same value can be retrieved by a sequential scan of the file). If the search key of a secondary index is not a candidate key, pointing to the first record with each search-key value is not enough: the records in the file are no longer ordered by the search key of the secondary index and may be anywhere in the file. Therefore, a secondary index must contain pointers to all the records. We can use an extra level of indirection to implement secondary indices on search keys that are not candidate keys. The pointers in such a secondary index do not point directly to the records; each points to a bucket that contains pointers to the file.

Index entries (balance): 350, 400, 500, 600, 700, 750, 900

Brighton      A-217    750
Downtown      A-101    500
Downtown      A-110    600
Mianus        A-215    700
Perryridge    A-102    400
Perryridge    A-201    900
Perryridge    A-218    700
Redwood       A-222    700
Round Hill    A-305    350
Secondary index on a noncandidate key (balance) of the account file

A secondary index must be dense, with an index entry for every search-key value and a pointer to every record in the file. Secondary indices improve the performance of queries that use search keys other than the search key of the primary index, but they impose a significant overhead on database modification. The database designer decides which secondary indices are desirable on the basis of an estimate of the relative frequency of queries and modifications.
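The bucket indirection can be sketched as below, using the account file of the figure and balance as the (noncandidate) search key. The function names and the use of list positions as record pointers are assumptions for illustration only.

```python
from collections import defaultdict

# Account file ordered on branch_name, so it is NOT ordered on balance.
ACCOUNT_FILE = [
    ("Brighton", "A-217", 750), ("Downtown", "A-101", 500),
    ("Downtown", "A-110", 600), ("Mianus", "A-215", 700),
    ("Perryridge", "A-102", 400), ("Perryridge", "A-201", 900),
    ("Perryridge", "A-218", 700), ("Redwood", "A-222", 700),
    ("Round Hill", "A-305", 350),
]


def build_secondary(file, field):
    """Dense index: one entry per distinct value, each leading to a bucket of record pointers."""
    buckets = defaultdict(list)
    for pos, record in enumerate(file):
        buckets[record[field]].append(pos)
    return dict(sorted(buckets.items()))


def lookup(file, index, value):
    """Follow the index entry to its bucket, then each pointer in the bucket to the file."""
    return [file[pos][1] for pos in index.get(value, [])]


BALANCE_INDEX = build_secondary(ACCOUNT_FILE, field=2)
```

The entry for balance 700 leads to a bucket with three pointers, and the records they reach are scattered through the file (positions 3, 6 and 7). This is why a pointer to the first matching record alone would not be enough here.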

B+-TREE INDEX FILES

The main disadvantage of the index-sequential file organization is that performance degrades as the file grows; overcoming this requires reorganizing the file. The B+-tree index structure is the most widely used of the index structures that maintain their efficiency despite insertions and deletions. A B+-tree index is a balanced tree: every path from the root to a leaf has the same length. Each nonleaf node has between ⌈m/2⌉ and m children, where m is a fixed number called the order of the B+-tree. The B+-tree structure imposes a performance overhead on insertion and deletion, as well as a space overhead. However, this overhead is acceptable even for files with a high frequency of modification.

Structure of a B+-tree
A B+-tree index is a multilevel index, but its structure differs from that of a multilevel index-sequential file. A typical node of a B+-tree of order m contains up to m-1 search-key values K1, K2, ..., Km-1 and m pointers P1, P2, ..., Pm. The search-key values within a node are kept in sorted order: if i < j, then Ki < Kj.

P1 | K1 | P2 | K2 | ... | Pm-1 | Km-1 | Pm

We first consider the structure of the leaf nodes. For i = 1, 2, ..., m-1, pointer Pi points either to a file record with search-key value Ki or to a bucket of pointers, each of which points to a file record with search-key value Ki. The bucket structure is used only if the search key is not a primary key, or if the file is not sorted in search-key order. Pointer Pm serves a special purpose: it chains the leaf nodes together in search-key order, which allows efficient sequential processing of the file.
Now consider how search-key values are assigned to leaf nodes. Each leaf can hold up to m-1 values. The ranges of values in the leaves do not overlap. Thus, if Li and Lj are leaf nodes and i < j, every search-key value in Li is less than every search-key value in Lj. If the B+-tree index is dense, every search-key value must appear in some leaf node.

[Figure: B+-tree for the account file. Root: Perryridge. Second level: (Mianus) and (Redwood). Leaves, from left to right: (Brighton, Downtown), (Mianus), (Perryridge), (Redwood, Round Hill). The leaf pointers lead to the account records.]

The nonleaf nodes of a B+-tree form a multilevel index on the leaf nodes. The structure of a nonleaf node is the same as that of a leaf node, except that all pointers point to tree nodes. A nonleaf node may hold up to m pointers and must hold at least ⌈m/2⌉ pointers, except for the root, which must hold at least 2 pointers. The number of pointers in a node is called the fanout of the node. Consider a nonleaf node containing p pointers. For 1 < i < p, pointer Pi points to the subtree containing the search-key values less than Ki and greater than or equal to Ki-1. Pointer P1 points to the subtree containing the search-key values less than K1. Pointer Pp points to the subtree containing the search-key values greater than or equal to Kp-1.

Queries on B+-trees

We now consider how queries are processed using a B+-tree. Suppose we want to find all records with search-key value k. First we examine the root, looking for the smallest search-key value greater than k; suppose it is Ki. We follow pointer Pi to another node. If the node has p pointers and k >= Kp-1, no such Ki exists and we follow pointer Pp. At the node reached, we repeat the process: we look for the smallest search-key value greater than k and follow the corresponding pointer to another node, and so on until we reach a leaf node. In the leaf, if some Ki equals k, pointer Pi leads to the desired record or bucket; otherwise no record with search-key value k exists. The number of blocks accessed does not exceed ⌈log_⌈m/2⌉(K)⌉, where K is the number of search-key values in the B+-tree and m is the order of the tree.
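The descent can be sketched in Python on a hand-built tree with the same shape as the account-file example (root Perryridge; second level Mianus and Redwood). The `Node` class and `find` function are illustrative assumptions, and each leaf entry holds a bucket of account numbers in place of record pointers.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    keys: List[str]
    children: List["Node"] = field(default_factory=list)    # nonleaf nodes only
    buckets: List[List[str]] = field(default_factory=list)  # leaf nodes only
    next: Optional["Node"] = None                            # leaf chain (last pointer)

    @property
    def is_leaf(self):
        return not self.children


def find(root, k):
    """Return (bucket for key k or None, number of nodes visited)."""
    node, visited = root, 1
    while not node.is_leaf:
        # bisect_right gives the position of the smallest key greater than k,
        # or the last pointer when k >= every key in the node.
        node = node.children[bisect_right(node.keys, k)]
        visited += 1
    if k in node.keys:
        return node.buckets[node.keys.index(k)], visited
    return None, visited


l1 = Node(["Brighton", "Downtown"], buckets=[["A-217"], ["A-101", "A-110"]])
l2 = Node(["Mianus"], buckets=[["A-215"]])
l3 = Node(["Perryridge"], buckets=[["A-102", "A-201", "A-218"]])
l4 = Node(["Redwood", "Round Hill"], buckets=[["A-222"], ["A-301"]])
l1.next, l2.next, l3.next = l2, l3, l4
ROOT = Node(["Perryridge"], children=[Node(["Mianus"], children=[l1, l2]),
                                      Node(["Redwood"], children=[l3, l4])])
```

Every lookup in this tree visits exactly three nodes, one per level, whether or not the key exists, because all leaves are at the same depth.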

Updates on B+-trees

Insertion. Using the same technique as for lookup, we find the leaf node in which the search-key value would appear. If the search-key value already appears in the leaf, we add the new record to the file and add a pointer to the record to the corresponding bucket. If the search-key value is not present in the leaf, we insert the record into the file, then insert the search-key value into the leaf at the correct position (preserving the order), and create a new bucket with the appropriate pointer. If the leaf has no room for the new key value, a new block is requested from the operating system, half of the key values in the leaf are moved to the new node, and the new key value is inserted at its proper position in one of the two blocks. This in turn requires inserting the first key value of the new block, together with a pointer to the new block, into the parent node. Inserting this key and pointer into the parent may itself cause the parent to split. The process can propagate all the way up to the root. If the root is split, a new root is created whose two children are the two nodes resulting from the split of the old root, and the height of the tree increases by one.
Procedure Insert(value V, pointer P)
    Find the leaf node L that should contain value V
    Insert_entry(L, V, P)
end procedure

Procedure Insert_entry(node L, value V, pointer P)
    If (L has space for (V, P)) then
        insert (V, P) into L
    else begin /* split L */
        Create node L'
        If (L is a leaf) then begin
            Let V' be the value such that ⌈m/2⌉ of the values L.K1, L.K2, ..., L.Km-1, V are less than V'
            Let n be the lowest index such that L.Kn >= V'
            Move L.Pn, L.Kn, ..., L.Pm-1, L.Km-1 to L'
            If (V < V') then insert (V, P) into L
            else insert (V, P) into L'
        end
        else begin
            Let V' be the value such that ⌈m/2⌉ of the values L.K1, L.K2, ..., L.Km-1, V are greater than or equal to V'
            Let n be the lowest index such that L.Kn >= V'
            Add Nil, L.Kn, L.Pn+1, L.Kn+1, ..., L.Pm-1, L.Km-1, L.Pm to L'
            Delete L.Kn, L.Pn+1, L.Kn+1, ..., L.Pm-1, L.Km-1, L.Pm from L
            If (V < V') then insert (V, P) into L
            else insert (V, P) into L'
            Delete (Nil, V') from L'
        end
        If (L is not the root) then
            Insert_entry(parent(L), V', L')
        else begin
            Create a new node R with child nodes L and L' and the single value V'
            Make R the root of the tree
        end
        If (L is a leaf) then begin /* repair the leaf chain */
            Set L'.Pm = L.Pm
            Set L.Pm = L'
        end
    end
end procedure
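The split-and-propagate behaviour can be tried out with the following Python sketch. It is a simplified variant of the procedure, not a transcription of it: nodes store keys only (no record pointers or buckets), the recursion returns the separator key to the caller in place of a parent pointer, and the names `Node`, `insert`, `_insert` and `leaf_keys` are hypothetical. A node of order m holds at most m-1 keys.

```python
from bisect import bisect_right, insort


class Node:
    def __init__(self, leaf):
        self.leaf = leaf
        self.keys = []
        self.children = []   # nonleaf nodes: child pointers
        self.next = None     # leaf nodes: next leaf in search-key order


def insert(root, key, order):
    """Insert key and return the root, which is new if the old root was split."""
    split = _insert(root, key, order)
    if split is None:
        return root
    sep, right = split
    new_root = Node(leaf=False)
    new_root.keys = [sep]
    new_root.children = [root, right]
    return new_root


def _insert(node, key, order):
    """Insert below node; return (separator, new right sibling) if node was split."""
    if node.leaf:
        if key in node.keys:
            return None
        insort(node.keys, key)
        if len(node.keys) < order:
            return None
        mid = (order + 1) // 2                  # left leaf keeps ceil(order/2) keys
        right = Node(leaf=True)
        right.keys, node.keys = node.keys[mid:], node.keys[:mid]
        right.next, node.next = node.next, right
        return right.keys[0], right             # first key of the new leaf is copied up
    n = bisect_right(node.keys, key)
    split = _insert(node.children[n], key, order)
    if split is None:
        return None
    sep, right = split
    node.keys.insert(n, sep)
    node.children.insert(n + 1, right)
    if len(node.children) <= order:
        return None
    mid = len(node.keys) // 2
    up = node.keys[mid]                          # middle key moves up (it is not copied)
    new = Node(leaf=False)
    new.keys, new.children = node.keys[mid + 1:], node.children[mid + 1:]
    node.keys, node.children = node.keys[:mid], node.children[:mid + 1]
    return up, new


def leaf_keys(root):
    """All keys in order, read by walking the leaf chain."""
    node = root
    while not node.leaf:
        node = node.children[0]
    keys = []
    while node is not None:
        keys.extend(node.keys)
        node = node.next
    return keys


root = Node(leaf=True)
for name in ["Brighton", "Downtown", "Mianus", "Perryridge",
             "Redwood", "Round Hill", "North Town"]:
    root = insert(root, name, order=3)
```

With order 3, the last insertion (North Town) overflows a leaf, and the resulting separator overflows the old root, so the tree grows from two levels to three. The exact shape of the tree depends on the split rule and on the insertion order chosen here.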

Deletion. Using the lookup technique, we find the record to be deleted and remove it from the file. We remove the search-key value from the leaf node of the B+-tree if there is no bucket associated with that value, or if the bucket becomes empty after the corresponding pointer is removed from it. Removing a key value from a node may leave a leaf with too few entries, in which case the leaf is merged with a sibling and returned to the system; its parent may then have fewer children than the permitted minimum. In that case, a child is moved into the parent from a sibling of the parent, provided that the sibling still has at least ⌈m/2⌉ children after giving one up. If that is not possible, the parent is merged (coalesced) with one of its siblings, which removes a node from the tree and then removes an entry from its own parent, and so on. The process can propagate all the way up to the root. If the root is left with only one child after a deletion, the root is replaced by its child, the old root is returned to the system, and the height of the tree decreases by one.
Procedure delete(value V, pointer P)
    Find the leaf node L that contains (V, P)
    delete_entry(L, V, P)
end procedure

Procedure delete_entry(node L, value V, pointer P)
    Delete (V, P) from L
    If (L is the root and L has only one remaining child) then
        make the child of L the new root of the tree and delete L
    else If (L has too few values/pointers) then begin
        Let L' be the left or right adjacent sibling of L
        Let V' be the value between the pointers to L and L' in parent(L)
        If (the entries of L and L' fit in a single block) then begin /* coalesce */
            If (L is a predecessor of L') then swap_variables(L, L')
            If (L is not a leaf) then
                append V' and all pointers and values in L to L'
            else begin
                append all (K, P) pairs in L to L'
                set L'.Pm = L.Pm /* next-leaf pointer */
            end
            delete_entry(parent(L), V', L)
            delete node L
        end
        else begin /* redistribute: borrow an entry from L' */
            If (L' is a predecessor of L) then begin
                If (L is not a leaf) then begin
                    Let p be such that L'.Pp is the last pointer in L'
                    Remove (L'.Kp-1, L'.Pp) from L'
                    Insert (L'.Pp, V') as the first pointer and value in L (shifting all other entries of L right)
                    Replace V' in parent(L) by L'.Kp-1
                end
                else begin
                    Let p be such that (L'.Pp, L'.Kp) is the last pointer/value pair in L'
                    Remove (L'.Pp, L'.Kp) from L'
                    Insert (L'.Pp, L'.Kp) as the first pointer and value in L (shifting all other entries of L right)
                    Replace V' in parent(L) by L'.Kp
                end
            end
            else < symmetric to the then case >
        end
    end
end procedure

B+-tree file organization

In a B+-tree file organization, the leaf nodes of the tree store records instead of pointers to the file. Since records are usually larger than pointers, the maximum number of records that can be stored in a leaf is smaller than the number of pointers in a nonleaf node. Leaf nodes are still required to be at least half full. Insertion and deletion in a B+-tree file organization are handled in the same way as in a B+-tree index.
When a B+-tree is used for file organization, space utilization is particularly important, because the space occupied by the records is much greater than the space occupied by (key, pointer) pairs. We can improve space utilization in a B+-tree by involving more sibling nodes in redistribution during splits and merges. On insertion, if a leaf node is full, we try to redistribute some of its entries to one of the adjacent nodes to make room for the new entry. If that attempt fails, we split the node and divide the entries among one of the adjacent nodes and the two nodes obtained by the split. On deletion, if a node holds fewer than ⌊2m/3⌋ entries, we try to borrow an entry from one of the two adjacent siblings. If both siblings hold exactly ⌊2m/3⌋ entries, we redistribute the entries of the node between the two siblings and delete the third node. If k nodes are used in redistribution (k-1 siblings), each node is guaranteed to hold at least ⌊(k-1)m/k⌋ entries. However, the cost of updates becomes higher with this approach.

B-TREE INDEX FILES

B-tree indices are similar to B+-tree indices. The difference is that a B-tree eliminates the redundant storage of search-key values. In a B-tree, each search-key value appears only once. Since the search keys that appear in nonleaf nodes appear nowhere else in the B-tree, an additional pointer field must be included for each search key in a nonleaf node. These additional pointers point either to file records or to the buckets for the associated search key. A generalized B-tree leaf node has the form:

P1 | K1 | P2 | K2 | ... | Pm-1 | Km-1 | Pm

A nonleaf node has the form:

P1 | B1 | K1 | P2 | B2 | K2 | ... | Pq-1 | Bq-1 | Kq-1 | Pq

where q < m, because the extra pointers Bi leave room for fewer keys in a node of the same size.

The pointers Pi are tree pointers and are used as in a B+-tree. The pointers Bi in the nonleaf nodes are record pointers or bucket pointers. Clearly, a nonleaf node holds fewer search-key values than a leaf node does. The number of nodes accessed in a lookup in a B-tree depends on where the search key is located: a key found in a nonleaf node ends the search before a leaf is reached. Deletion in a B-tree is more complicated than in a B+-tree. Deleting an entry that appears in a nonleaf node requires selecting a suitable replacement value from the subtree of the node containing the deleted entry. Specifically, if search key Ki is deleted, the smallest search key in the subtree pointed to by Pi+1 must be moved into the position of Ki. If the leaf node is then left with too few entries, further actions are needed.
Index definition in SQL
An index is created with the CREATE INDEX command, which has the syntax
CREATE INDEX <index-name> ON <relation-name> (<attribute-list>)
The attribute-list is the list of attributes of the relation that form the search key of the index. To declare that the search key is a candidate key, add the keyword UNIQUE:
CREATE UNIQUE INDEX <index-name> ON <relation-name> (<attribute-list>)
The attribute-list must then form a candidate key; otherwise an error message is issued. An index is removed with the DROP command:
DROP INDEX <index-name>
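These commands can be tried with SQLite through Python's standard `sqlite3` module. The table layout and the index names (`branch_index`, `account_index`, `bad_index`) are illustrative assumptions, and other database systems may report the uniqueness violation differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (branch_name TEXT, account_number TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [("Brighton", "A-217", 750), ("Downtown", "A-101", 500),
     ("Downtown", "A-110", 600), ("Perryridge", "A-102", 400)],
)

# Ordinary index: branch_name is not a candidate key (Downtown appears twice).
conn.execute("CREATE INDEX branch_index ON account (branch_name)")

# UNIQUE index: account_number is a candidate key, so this succeeds.
conn.execute("CREATE UNIQUE INDEX account_index ON account (account_number)")


def unique_index_on_branch_is_accepted():
    """Try to declare branch_name a candidate key; the duplicates make this fail."""
    try:
        conn.execute("CREATE UNIQUE INDEX bad_index ON account (branch_name)")
    except sqlite3.Error:
        return False
    return True


accepted = unique_index_on_branch_is_accepted()

conn.execute("DROP INDEX branch_index")
remaining = [row[0] for row in
             conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")]
```

After the script runs, only `account_index` remains: the attempt to build a UNIQUE index on branch_name is rejected, and `branch_index` has been dropped.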

HASHING
STATIC HASHING
One disadvantage of sequential file organization is that we must access an index structure to locate data, or must use binary search, and that results in more I/O operations. File organizations based on hashing allow us to avoid accessing an index structure. Hashing also provides a way of constructing indices.

Hash file organization
In a hash file organization, we obtain the address of the disk block containing a desired record directly by computing a function on the search-key value of the record. The term bucket denotes a unit of storage. A bucket is typically a disk block, but may be chosen to be smaller or larger than a disk block.

Let K denote the set of all search-key values and B the set of all bucket addresses. A hash function h is a function from K to B:

h : K → B

To insert a record with search-key value Ki into the file, we compute h(Ki), which gives the address of the bucket for that record. If there is space in the bucket, the record is stored in it.
To look up a record with search-key value Ki, we first compute h(Ki) to find the corresponding bucket, and then search that bucket for the record with the desired key value. Different key values may hash to the same bucket, so the key of each record in the bucket must be checked.
To delete a record with search-key value Ki, we compute h(Ki), find the desired record in the corresponding bucket, and delete it from the bucket.

Hash functions
The worst possible hash function maps all search-key values to the same bucket. The ideal hash function distributes the key values uniformly across the buckets, so that every bucket holds the same number of records. We want to choose a hash function that satisfies the following criteria:
o Uniform distribution: each bucket is assigned the same number of search-key values from the set of all possible search-key values.
o Random distribution: in the average case, each bucket is assigned nearly the same number of search-key values.
Hash functions require careful design. A bad hash function may result in lookups that take time proportional to the number of search keys in the file.

Handling of bucket overflows

When a record is inserted and the corresponding bucket has room, the record is stored in that bucket; otherwise a bucket overflow occurs. Bucket overflow can occur for the following reasons:
Insufficient buckets. The number of buckets nB must satisfy nB > nr / fr, where nr is the total number of records to be stored and fr is the number of records that fit in a bucket.
Skew. Some buckets are assigned more records than others, so a bucket may overflow while other buckets still have space. This situation is called bucket skew. Skew can occur for two reasons:
1. Multiple records have the same search key.
2. The chosen hash function distributes the key values nonuniformly.
We handle bucket overflow by using overflow buckets. If a record must be inserted into bucket B and B is full, an overflow bucket is provided for B and the record is inserted into the overflow bucket. If the overflow bucket is also full, another overflow bucket is provided, and so on. All the overflow buckets of a given bucket are chained together in a linked list. Overflow handling with such a linked list is called overflow chaining. With overflow chaining, the lookup algorithm changes slightly: we compute the hash function on the search key as before to obtain bucket B, and then examine the records in bucket B and in all of its overflow buckets for a key value that matches the one we are looking for.
Another way to handle bucket overflow is the following. When a record must be inserted into a full bucket, instead of allocating an overflow bucket we use the next hash function in a chosen sequence of hash functions to find another bucket for the record. If that bucket is also full, we use the next hash function, and so on. A commonly used sequence is { hi(K) = (hi-1(K) + 1) mod nB, for 1 <= i <= nB-1, where h0 is the base hash function }. The form of hash structure that uses bucket chaining is called open hashing here, and the form that uses a sequence of hash functions is called closed hashing; note that some textbooks use these two names the other way around. In database systems, the structure with overflow chaining is usually preferred, because deletion is troublesome when records may have been displaced into other buckets.
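Overflow chaining and the condition nB > nr / fr can be sketched as follows. The classes `Bucket` and `HashFile`, the helper `min_buckets` and the character-sum hash function are assumptions chosen for illustration; they do not come from any particular database system.

```python
class Bucket:
    def __init__(self, capacity):
        self.capacity = capacity   # fr: records that fit in one bucket
        self.records = []
        self.overflow = None       # next overflow bucket in the chain


class HashFile:
    def __init__(self, n_buckets, capacity, h):
        self.h = h
        self.capacity = capacity
        self.buckets = [Bucket(capacity) for _ in range(n_buckets)]

    def _home(self, key):
        return self.buckets[self.h(key) % len(self.buckets)]

    def insert(self, key, record):
        b = self._home(key)
        while len(b.records) >= b.capacity:      # bucket full: walk or extend the chain
            if b.overflow is None:
                b.overflow = Bucket(self.capacity)
            b = b.overflow
        b.records.append((key, record))

    def lookup(self, key):
        """Check the home bucket and every overflow bucket chained to it."""
        found, b = [], self._home(key)
        while b is not None:
            found += [r for k, r in b.records if k == key]
            b = b.overflow
        return found

    def chain_length(self, key):
        n, b = 0, self._home(key)
        while b is not None:
            n, b = n + 1, b.overflow
        return n


def min_buckets(nr, fr):
    """Smallest integer nB that satisfies nB > nr / fr."""
    return nr // fr + 1


def char_sum(key):
    """Hypothetical hash function: sum of the character codes of the key."""
    return sum(map(ord, key))


ACCOUNTS = [
    ("Brighton", "A-217"), ("Downtown", "A-101"), ("Downtown", "A-110"),
    ("Mianus", "A-215"), ("Perryridge", "A-102"), ("Perryridge", "A-201"),
    ("Perryridge", "A-218"), ("Redwood", "A-222"), ("Round Hill", "A-305"),
]
hash_file = HashFile(min_buckets(len(ACCOUNTS), 2), capacity=2, h=char_sum)
for branch, account in ACCOUNTS:
    hash_file.insert(branch, account)
```

Even though there are enough buckets for nine records, the three Perryridge records cannot fit in one bucket of capacity 2. This is skew caused by duplicate search keys, and the chain for Perryridge therefore contains at least one overflow bucket.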


Hash indices
A hash index organizes the search keys, together with their associated pointers, into a hash file structure as follows: we apply a hash function to a search key to identify a bucket, and store the key and its associated pointers in that bucket (or in overflow buckets). Hash indices are usually secondary indices. In the example, the hash function on the account number is computed as: h(account_number) = (sum of the digits of the account number) mod 7
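The hash function of the example is easy to check by hand or in code. The sketch below applies it to the account numbers of the example file; the function names `h` and `build_hash_index` are illustrative, and account numbers stand in for record pointers.

```python
def h(account_number):
    """(sum of the digits of the account number) mod 7"""
    return sum(int(c) for c in account_number if c.isdigit()) % 7


def build_hash_index(account_numbers, n_buckets=7):
    """Place each key in the bucket chosen by h, keeping file order inside a bucket."""
    buckets = [[] for _ in range(n_buckets)]
    for number in account_numbers:
        buckets[h(number)].append(number)
    return buckets


ACCOUNT_NUMBERS = ["A-217", "A-101", "A-110", "A-215", "A-102",
                   "A-203", "A-218", "A-222", "A-305"]
INDEX = build_hash_index(ACCOUNT_NUMBERS)
```

For instance, A-215 has digit sum 8, and 8 mod 7 = 1, so it lands in bucket 1 together with A-305. No account number in this file hashes to bucket 0, which stays empty.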

DYNAMIC HASHING
In static hashing, the set of bucket addresses must be fixed. Databases, however, grow over time. If we use static hashing for such a database, we have three classes of options:
1. Choose a hash function based on the current file size. This option results in performance degradation as the database grows.
2. Choose a hash function based on the anticipated size of the file at some point in the future. Although performance degradation is avoided, a significant amount of space may be wasted initially.
3. Periodically reorganize the hash structure in response to file growth. Such a reorganization involves choosing a new hash function, recomputing the hash function on every record in the file, and generating new bucket assignments. Reorganization is a time-consuming operation. Furthermore, access to the file must be forbidden while the reorganization is in progress.

bucket 0: (empty)
bucket 1: A-215, A-305
bucket 2: A-101, A-110
bucket 3: A-217, A-102
bucket 4: A-218
bucket 5: A-203
bucket 6: A-222

Brighton      A-217    750
Downtown      A-101    500
Downtown      A-110    600
Mianus        A-215    700
Perryridge    A-102    400
Perryridge    A-203    900
Perryridge    A-218    700
Redwood       A-222    850
Round Hill    A-305    550

Hash index on the search key account_number of the account file

Dynamic hashing techniques allow the hash function to be modified to accommodate the growth or shrinkage of the database. One form of dynamic hashing, called extendable hashing, works as follows. We choose a hash function h with the desirable properties of uniformity and randomness and with a relatively large range, namely b-bit integers (b is typically 32). Initially we do not use all b bits of the hash value. At any point we use only i bits, where 0 <= i <= b. These i bits are used as an offset into an additional table of bucket addresses. The value of i grows and shrinks with the size of the database. The number i that appears above the bucket address table indicates that i bits of the hash value h(K) are required to determine the correct bucket for K; this number changes as the file size changes. Although i bits are required to find the correct entry in the bucket address table, several consecutive table entries may point to the same bucket. All such entries have a common hash prefix, but the length of this prefix may be less than i. We therefore associate with each bucket an integer giving the length of the common hash prefix; we denote the integer associated with bucket j by ij. The number of bucket-address-table entries that point to bucket j is 2^(i - ij).
[Figure: general extendable hash structure. The bucket address table has 2^i entries, indexed by the i-bit hash prefix (..00, ..01, ..10, ..11, ...). Each entry points to a bucket (bucket 1, bucket 2, bucket 3, ...), and each bucket j is labelled with its prefix length ij (i1, i2, i3, ...).]

General extendable hash structure

To locate the bucket containing search-key value K, we take the first i high-order bits of h(K), look at the table entry corresponding to this bit string, and follow the pointer in that table entry.
To insert a record with search-key value K, we follow the same lookup procedure and end up in some bucket, say bucket j. If there is room in the bucket, we insert the record into it. Otherwise, we must split the bucket and redistribute the current records together with the new one. To split the bucket, we first determine from the hash value whether the number of bits in use must be increased.
If i = ij, only one entry in the bucket address table points to bucket j. We therefore need to increase the size of the bucket address table so that it can hold pointers to the two buckets that result from splitting bucket j; we do so by considering one more bit of the hash value. We increase i by one, which doubles the size of the bucket address table. Each entry is replaced by two entries, both containing the pointer of the original entry. Now two entries in the bucket address table point to bucket j. We allocate a new bucket (bucket z) and set the second entry to point to the new bucket. We set ij and iz to i. Next, each record in bucket j is rehashed and, depending on its first i bits, either stays in bucket j or is allocated to the newly created bucket.
If i > ij, more than one entry in the bucket address table points to bucket j. Thus we can split bucket j without increasing the size of the bucket address table. We allocate a new bucket (bucket z) and set ij and iz to the value obtained by adding 1 to the original value of ij. Next, we adjust the entries in the bucket address table that previously pointed to bucket j. We leave the first half of these entries as they were and set all the remaining entries to point to the newly created bucket z. Then each record in bucket j is rehashed and allocated either to bucket j or to bucket z.

To delete a record with search-key value K, we first follow the lookup procedure to find the corresponding bucket, say bucket j. We remove both the search key from the bucket and the record from the file. The bucket is also removed if it becomes empty. Note that at this point some buckets can be coalesced, and the size of the bucket address table can be halved.

The main advantage of extendable hashing is that performance does not degrade as the file grows. Moreover, the space overhead is minimal, even though additional space is needed for the bucket address table. A disadvantage of extendable hashing is that lookup involves an additional level of indirection: we must access the bucket address table before accessing the bucket itself. Even so, extendable hashing is a very attractive technique.

CHOOSING BETWEEN INDEXING AND HASHING
We have examined several schemes: ordered indices and hashing. We can organize a file of records as an index-sequential file, as a B+-tree, by hashing, and so on. Each scheme has advantages in certain situations. A database-system implementor could provide many schemes and leave the decision of which one to use to the database designer. To make a wise choice, the implementor or the database designer must consider the following issues:

Is the cost of periodic reorganization of the index or hash organization acceptable?
What is the relative frequency of insertions and deletions?
Is it desirable to optimize average access time at the expense of a longer worst-case access time?
What types of queries are users likely to pose?

STORAGE STRUCTURES FOR OBJECT-ORIENTED DATABASES

MAPPING OBJECTS TO FILES
The data part of an object can be stored using the file structures described earlier, with some changes, because objects are not of uniform size and may be very large. Set-valued fields with few elements can be implemented with linked lists; set-valued fields with many elements can be implemented as B-trees or as separate relations in the database. Set-valued fields can also be eliminated at the storage level by normalization. Extremely large objects that are hard to decompose into smaller components can each be stored in a separate file.

IMPLEMENTING OBJECT IDENTIFIERS


Since objects are identified by object identifiers (OID = object identifier), an object storage system needs a mechanism to locate an object given its OID. If OIDs are logical, that is, they do not specify the location of the object, the storage system must maintain an index that maps each OID to the current location of the object. If OIDs are physical, that is, they encode the location of the object, the object can be found directly. A physical OID typically has the following three fields:
1. A volume or file identifier
2. A page identifier within the volume or file
3. An offset within the page

In addition, a physical OID may contain a unique identifier, an integer that distinguishes the OID from the identifiers of other objects that were previously stored at the same location and have since been deleted or moved. The unique identifier is also stored with the object, and the identifiers in an OID and in the corresponding object must match. If the unique identifier in a physical OID does not match the unique identifier in the object it points to, the system detects that the pointer is dangling and reports an error. Such pointer errors occur when the physical OID refers to an old object that was deleted by accident. If the space occupied by the old object is reallocated, a new object may be stored at that location and may be incorrectly addressed by the identifier of the old object. If undetected, use of the dangling pointer could corrupt the new object stored at the same location. The unique identifier helps detect such errors.

Suppose an object must be moved to a new page because it has grown and the old page has no spare space. The physical OID then points to the old page, which no longer contains the object. Rather than changing the OID of the object (which would require changing every object that points to it), we leave a forwarding address at the old location. When the database looks up the object, it finds the forwarding address instead of the object and uses it to locate the object.
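The unique-identifier check and the forwarding address described above can be sketched as follows. This is a hedged illustration only: `PhysicalOID`, `ObjectStore`, `Forward` and `DanglingPointer` are hypothetical names, a Python dictionary stands in for the pages on disk, and a location is assumed to be a (volume, page, offset) tuple.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalOID:
    volume: int      # volume or file identifier
    page: int        # page identifier within the volume or file
    offset: int      # offset within the page
    unique_id: int   # distinguishes the object from earlier occupants


@dataclass
class Stored:
    unique_id: int   # copy of the unique identifier kept with the object
    data: object


@dataclass
class Forward:
    location: tuple  # forwarding address left at the old location


class DanglingPointer(Exception):
    pass


class ObjectStore:
    def __init__(self):
        self.slots = {}              # (volume, page, offset) -> Stored | Forward
        self.next_unique_id = 1

    def allocate(self, location, data):
        uid = self.next_unique_id
        self.next_unique_id += 1
        self.slots[location] = Stored(uid, data)
        return PhysicalOID(*location, uid)

    def _resolve(self, oid):
        location = (oid.volume, oid.page, oid.offset)
        entry = self.slots.get(location)
        while isinstance(entry, Forward):        # follow forwarding addresses
            location = entry.location
            entry = self.slots.get(location)
        # A missing object or a mismatched unique identifier means the
        # pointer is dangling.
        if entry is None or entry.unique_id != oid.unique_id:
            raise DanglingPointer(oid)
        return location, entry

    def deref(self, oid):
        return self._resolve(oid)[1].data

    def move(self, oid, new_location):
        old_location, entry = self._resolve(oid)
        self.slots[new_location] = entry
        self.slots[old_location] = Forward(new_location)   # OID is unchanged

    def delete(self, oid):
        location, _ = self._resolve(oid)
        del self.slots[location]
```

Because the OID itself never changes when an object moves, no object that refers to it has to be updated; the price is one extra lookup through the forwarding address.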

MANAGING PERSISTENT POINTERS


We implement persistent pointers in a persistent programming language by using OIDs. Persistent pointers may be physical or logical OIDs. An important difference between persistent pointers and in-memory pointers is the size of the pointer. An in-memory pointer only needs to be large enough to address all of virtual memory; it is typically 4 bytes long (on a 32-bit architecture). A persistent pointer must address all the data in a database, so it is at least 8 bytes long.

Pointer Swizzling
The action of looking up an object given its identifier is called dereferencing. Given an in-memory pointer, looking up the object is merely a memory reference. Given a persistent pointer, dereferencing involves an extra step: the current location of the object in memory must be found by looking up the persistent pointer in a table. If the object is not yet in memory, it must be loaded from disk. This table lookup can be implemented fairly efficiently with hashing, but it is still slow compared with a memory reference.

Pointer swizzling is a way to reduce the cost of locating persistent objects that are already in memory. The idea is that when a persistent pointer is dereferenced, the object is located and brought into memory (if it is not already there). An extra step is now carried out: an in-memory pointer to the object is stored in place of the persistent pointer. The next time the same persistent pointer is dereferenced, the in-memory location can be read out directly.

If persistent objects may be moved out to disk to make room for other persistent objects, an extra step must also be carried out to ensure that the object is still in memory. When an object is written out, any persistent pointers it contains that were swizzled must be unswizzled, that is, converted back to their persistent representation. Pointer swizzling on pointer dereference, as described here, is called software swizzling. Buffer management is more complicated if pointer swizzling is used.
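A minimal sketch of software swizzling, under stated assumptions: `PersistentPtr`, `Obj` and `ObjectCache` are hypothetical names, the "disk" is a dictionary from OID to a dictionary of fields, and an on-disk pointer field is assumed to be stored as the tuple `("ptr", oid)` (so ordinary field values must not be tuples). The counter `table_lookups` only makes the saving visible.

```python
class PersistentPtr:
    def __init__(self, oid):
        self.oid = oid


class Obj:
    def __init__(self, oid, fields):
        self.oid = oid
        self.fields = fields    # name -> plain value, PersistentPtr, or Obj


class ObjectCache:
    def __init__(self, disk):
        self.disk = disk            # oid -> {field name: value or ("ptr", oid)}
        self.resident = {}          # table of objects currently in memory
        self.table_lookups = 0

    def fetch(self, oid):
        # The extra step of persistent dereferencing: a table lookup,
        # followed by a load from disk if the object is not resident.
        self.table_lookups += 1
        obj = self.resident.get(oid)
        if obj is None:
            fields = {name: PersistentPtr(v[1]) if isinstance(v, tuple) else v
                      for name, v in self.disk[oid].items()}
            obj = self.resident[oid] = Obj(oid, fields)
        return obj

    def deref(self, holder, field):
        value = holder.fields[field]
        if isinstance(value, PersistentPtr):
            value = self.fetch(value.oid)
            holder.fields[field] = value    # swizzle: keep the in-memory pointer
        return value

    def write_out(self, obj):
        # Unswizzle: convert in-memory pointers back to persistent form.
        self.disk[obj.oid] = {
            name: ("ptr", v.oid) if isinstance(v, (Obj, PersistentPtr)) else v
            for name, v in obj.fields.items()}
```

The first `deref` of a field costs one table lookup; later calls return the stored in-memory reference directly. The sketch does not evict objects, which is where the extra buffer-management work mentioned above would arise.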

Hardware swizzling
Having two kinds of pointers, persistent pointers and transient (in-memory) pointers, is inconvenient. Programmers must remember the kind of each pointer and may have to write code twice, once for persistent pointers and once for transient pointers. It would be more convenient if both kinds of pointers had the same type. A simple way to merge the two is to extend the length of in-memory pointers to the size of persistent pointers and to use one bit of the identifier to distinguish them. This approach increases the storage cost of transient pointers. We describe a technique called hardware swizzling, which uses the memory-management hardware to solve this problem.

Hardware swizzling has two advantages over software swizzling. First, it allows persistent pointers to be stored in objects in the same amount of space that in-memory pointers require. Second, it converts between persistent and transient pointers transparently and efficiently. Software written to deal with in-memory pointers can deal with persistent pointers without change.

Hardware swizzling uses the following representation for the persistent pointers contained in objects on disk. A persistent pointer is split into two parts: a page identifier and an offset within the page. The page identifier is in fact a short indirect pointer: each page has a translation table that provides a mapping from short page identifiers to full database page identifiers. The system must look up the short page identifier of a persistent pointer in the translation table to find the full page identifier.

In the worst case, the translation table is only as large as the maximum number of pointers that can be contained in the objects of a page. With a page size of 4096 bytes and a pointer size of 4 bytes, the maximum number of pointers is 1024. In practice the number is much smaller. The short page identifier needs only enough bits to identify a row in the table; with at most 1024 rows, 10 bits suffice for the short page identifier. The translation table thus allows an entire persistent pointer to fit into the same space as an in-memory pointer.
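The arithmetic in the last paragraph can be checked with a short calculation. The function name `translation_table_bounds` is a hypothetical helper introduced only for this illustration.

```python
def translation_table_bounds(page_size, pointer_size):
    """Return (maximum pointers per page, bits needed for a short page id)."""
    max_pointers = page_size // pointer_size
    # Number of bits needed to number the rows 0 .. max_pointers - 1.
    bits = (max_pointers - 1).bit_length()
    return max_pointers, bits


max_pointers, bits = translation_table_bounds(4096, 4)
```

With a 4096-byte page and 4-byte pointers this gives 1024 rows and 10 bits, which leaves the remaining bits of a 4-byte pointer for the offset within the page.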

Figure 1. Page image before swizzling

Object     PageID   Offset
Object 1   2395     255
Object 2   4867     020
Object 3   2395     170

Translation table
PageID   FullPageID
2395     679.34.28000
4867     519.56.84000

Figure 2. Page image after swizzling

Object     PageID   Offset
Object 1   5001     255
Object 2   4867     020
Object 3   5001     170

Translation table
PageID   FullPageID
5001     679.34.28000
4867     519.56.84000

Figure 1 shows the representation of persistent pointers. There are three objects in the page, each containing one persistent pointer. The translation table gives the mapping between short page identifiers and full database page identifiers for each short page identifier in these persistent pointers. Database page identifiers are shown in the form volume.file.offset. Extra information is maintained with each page so that all persistent pointers in the page can be found. This information is updated when an object is created in or deleted from the page.

When an in-memory pointer is dereferenced, if the operating system detects that the page in the virtual address space pointed to has no storage allocated for it or is access protected, a segmentation violation is said to occur. Many operating systems provide a mechanism to specify a function to be called when a segmentation violation occurs, a mechanism to allocate storage for pages in the virtual address space, and a mechanism to set access permissions on pages.

Consider first an in-memory pointer to a page v that is dereferenced when storage has not yet been allocated for the page. A segmentation violation occurs and results in a call to a function in the database system. The database system first determines which database page was allocated to virtual-memory page v; let the full page identifier of that database page be P. If no database page has been allocated to v, an error is reported. Otherwise, the database system allocates storage for page v and loads database page P into v.

Pointer swizzling is now done for page P as follows. The system finds all persistent pointers contained in objects in the page, using the extra information stored with the page. Consider one such pointer and call it (p_i, o_i), where p_i is the short page identifier and o_i is the offset within the page. Let P_i be the full page identifier of p_i, found in the translation table of page P. If page P_i does not already have a virtual-memory page allocated to it, a free page in the virtual address space is now allocated to it. Page P_i will reside at this virtual address if and when it is brought in. At this point the page in the virtual address space has no storage allocated for it, either in memory or on disk; it is merely a range of addresses reserved for the database page. Now suppose the virtual-memory page allocated to P_i is v_i. We update the pointer (p_i, o_i) by replacing p_i with v_i. Finally, after all persistent pointers in P have been swizzled, the dereference that caused the segmentation violation is allowed to continue and finds the object it was looking for in memory.

Figure 2 shows the state of the page from Figure 1 after it has been brought into memory and the pointers in it have been swizzled. Here we assume that the page whose database page identifier is 679.34.28000 has been mapped to page 5001 in memory, while the page whose identifier is 519.56.84000 has been mapped to page 4867. All pointers in the objects have been updated to reflect the new correspondence and can now be used as in-memory pointers. At the end of the translation phase for a page, the objects in the page satisfy an important property: all persistent pointers contained in objects in the page have been converted to in-memory pointers.
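The translation phase for one page can be simulated with the data of Figures 1 and 2. This is only a sketch: `swizzle_page` is a hypothetical function, real hardware swizzling relies on the operating system's virtual-memory and segmentation-violation mechanisms, and the sequence of free virtual pages (5001, 4867) is an assumption chosen so that the result matches Figure 2.

```python
def swizzle_page(pointers, translation, vm_pages, free_pages):
    """Swizzle the persistent pointers of one page that was just loaded.

    pointers    -- list of (short page id p_i, offset o_i) found in the page
    translation -- the page's translation table: short id -> full page id P_i
    vm_pages    -- full page id -> virtual-memory page already reserved for it
    free_pages  -- iterator over free virtual-memory page numbers
    """
    remap = {}
    for short_id, full_id in translation.items():
        if full_id not in vm_pages:
            # Reserve an address range only; no storage is allocated yet.
            vm_pages[full_id] = next(free_pages)
        remap[short_id] = vm_pages[full_id]          # p_i -> v_i
    swizzled = [(remap[p], o) for p, o in pointers]
    new_translation = {remap[s]: full for s, full in translation.items()}
    return swizzled, new_translation


# Data of Figure 1; offset 020 is written as the integer 20.
pointers = [(2395, 255), (4867, 20), (2395, 170)]
translation = {2395: "679.34.28000", 4867: "519.56.84000"}
vm_pages = {}
swizzled, table = swizzle_page(pointers, translation, vm_pages,
                               iter([5001, 4867]))
```

After the call, `swizzled` holds the pointers of Figure 2 and `vm_pages` records which virtual page is reserved for each database page, so that other pages referring to the same database pages are swizzled consistently.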

QUESTIONS AND EXERCISES FOR CHAPTER III
III.1 Consider the following arrangement of data blocks and parity blocks on four disks:

Disk 1   Disk 2   Disk 3   Disk 4
B1       B2       B3       B4
P1       B5       B6       B7
B8       P2       B9       B10
...      ...      ...      ...

Here the B_i denote data blocks and the P_i denote parity blocks. Block P_i is the parity block for data blocks B_(4i-3), B_(4i-2), B_(4i-1), B_(4i). Describe the problems that this arrangement presents.

III.2 A power failure that occurs while a block is being written may leave the block only partially written. Assume that partially written blocks can be detected. An atomic block write is one in which either the entire block is written or nothing is written (there are no partially written blocks). Suggest schemes for achieving effective atomic block writes on the following RAID schemes:
1. Level 1 (mirroring)
2. Level 5 (block interleaved, distributed parity)
III.3 RAID systems typically allow a failed disk to be replaced without stopping access to the system. The data on the failed disk must therefore be rebuilt and written to the replacement disk while the system continues to operate. With which RAID level is the interference between the rebuild and ongoing disk accesses the least? Explain.

III.4 Consider deleting record 5 from the file:

0   Perryridge   A-102   400
1   Round Hill   A-305   350
8   Perryridge   A-218   700
3   Downtown     A-101   500
4   Redwood      A-222   700
5   Perryridge   A-201   900
6   Brighton     A-217   750
7   Downtown     A-110   600

Compare the relative merits of the following deletion techniques:
1. Move record 6 to the space occupied by record 5, then move record 7 to the space occupied by record 6.
2. Move record 7 to the space occupied by record 5.
3. Mark record 5 as deleted.
III.5 Draw the structure of the following file (a header followed by record slots 0 to 8, holding these records in order):

Perryridge   A-102   400
Mianus       A-215   700
Downtown     A-101   500
Perryridge   A-201   900
Downtown     A-110   600
Perryridge   A-218   700

after each of the following steps:
1. Insert(Brighton, A-323, 1600)
2. Delete record 2
3. Insert(Brighton, A-636, 2500)

III.6 Redraw the file structure:

0   Perryridge   A-102   400   A-201   900    A-210   700
1   Round Hill   A-301   350
2   Mianus       A-101   800
3   Downtown     A-211   500   A-222   600
4   Redwood      A-300   650
5   Brighton     A-111   750   A-200   1200   A-255   950

after each of the following steps:
1. Insert(Mianus, A-101, 2800)
2. Insert(Brighton, A-323, 1600)
3. Delete(Perryridge, A-102, 400)

III.7 What happens if the record (Perryridge, A-999, 5000) is inserted into the file in III.6?

III.8 Redraw the file structure below after each of the following steps:
1. Insert(Mianus, A-101, 2800)
2. Insert(Brighton, A-323, 1600)
3. Delete(Perryridge, A-102, 400)

0    Perryridge   A-102   400
1                 A-201   900
2                 A-210   700
3    Round Hill   A-301   350
4    Mianus       A-101   800
5    Downtown     A-211   500
6    Redwood      A-300   650
7    Brighton     A-111   750
8                 A-222   600
9                 A-200   1200
10                A-255   950

(Entries belonging to the same branch are chained by pointers; the last entry of each chain holds the nil pointer.)

III.9 Give an example in which the reserved-space method for representing variable-length records is preferable to the pointer method.

III.10 Give an example in which the pointer method for representing variable-length records is preferable to the reserved-space method.

III.11 If a block becomes empty as a result of deletions, for what purposes can the block be reused?

III.12 In the sequential file organization, why is an overflow block used even if, at the moment, there is only one overflow record?

III.13 List the advantages and disadvantages of each of the following strategies for storing a relational database:
1. Store each relation in one file
2. Store multiple relations in one file

III.14 Give an example of a relational-algebra expression and a query-processing strategy in which:
1. MRU is preferable to LRU
2. LRU is preferable to MRU

III.15 When is it preferable to use a dense index rather than a sparse index? Explain.

III.16 State the differences between a primary index and a secondary index.

III.17 Is it possible to have two primary indices on the same relation for two different search keys? Explain.

III.18 Construct a B+-tree for the following set of key values: (2, 3, 5, 7, 11, 15, 19, 25, 29, 33, 37, 41, 47). Assume that the tree is initially empty and that the values are inserted in ascending order. Consider the following cases:
1. Each node holds at most 4 pointers
2. Each node holds at most 6 pointers
3. Each node holds at most 8 pointers

III.19 For each B+-tree of exercise III.18, show the steps involved in the following queries:
1. Find the record with search-key value 11
2. Find the records with search-key values in the range [7..19]

III.20 For each B+-tree of exercise III.18, draw the tree after each operation in the following sequence:
1. Insert 9
2. Insert 11
3. Insert 11
4. Delete 25
5. Delete 19

III.21 Repeat exercise III.18 for a B-tree.

III.22 State and explain the difference between closed hashing and open hashing. Give the advantages and disadvantages of each technique.

III.23 What causes bucket overflow in a hash file organization? What can be done to reduce the occurrence of overflows?

III.24 Suppose that we are using extendable hashing on a file that contains records with the following search-key values: 2, 3, 5, 7, 11, 17, 19, 23, 37, 31, 35, 41, 49, 55. Draw the extendable hash structure for this file if the hash function is h(x) = x mod 8 and each bucket can hold at most three records.

III.25 Redraw the extendable hash structure of exercise III.24 after each of the following steps:
1. Delete 11
2. Delete 55
3. Insert 1
4. Insert 15
