
Crash Recovery

Chapter 18

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 1

Review: The ACID properties

• Atomicity: All actions in the Xact happen, or none happen.
• Consistency: If each Xact is consistent, and the DB starts consistent, it ends up consistent.
• Isolation: Execution of one Xact is isolated from that of other Xacts.
• Durability: If a Xact commits, its effects persist.

• The Recovery Manager guarantees Atomicity & Durability.


Motivation
• Atomicity:
  – Transactions may abort (“Rollback”).
• Durability:
  – What if DBMS stops running? (Causes?)
• Desired behavior after system restarts:
  – T1, T2 & T3 should be durable.
  – T4 & T5 should be aborted (effects not seen).

[Figure: timeline of transactions T1 to T5 against the crash point.]
Assumptions

• Concurrency control is in effect.
  – Strict 2PL, in particular.
• Updates are happening “in place”.
  – i.e. data is overwritten on (deleted from) the disk.

• A simple scheme to guarantee Atomicity & Durability?


Handling the Buffer Pool

• Force every write to disk?
  – Poor response time.
  – But provides durability.
• Steal buffer-pool frames from uncommitted Xacts?
  – If not, poor throughput.
  – If so, how can we ensure atomicity?

              No Steal    Steal
  Force       Trivial
  No Force                Desired


More on Steal and Force


• STEAL (why enforcing Atomicity is hard)
  – To steal frame F: Current page in F (say P) is written to disk; some Xact holds lock on P.
    • What if the Xact with the lock on P aborts?
    • Must remember the old value of P at steal time (to support UNDOing the write to page P).
• NO FORCE (why enforcing Durability is hard)
  – What if system crashes before a modified page is written to disk?
  – Write as little as possible, in a convenient place, at commit time, to support REDOing modifications.
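
The two policy choices above determine what the recovery manager must be able to do. A minimal Python sketch of that mapping follows; the function name `recovery_needs` is made up for illustration and is not part of the slides.

```python
def recovery_needs(steal: bool, force: bool) -> dict:
    """Which recovery abilities a buffer policy requires (illustrative helper)."""
    return {
        # STEAL lets uncommitted changes reach disk, so an abort needs UNDO.
        "undo": steal,
        # NO FORCE lets committed changes live only in RAM, so a crash needs REDO.
        "redo": not force,
    }


desired = recovery_needs(steal=True, force=False)   # the "Desired" cell
trivial = recovery_needs(steal=False, force=True)   # the "Trivial" cell
```

The STEAL / NO FORCE combination needs both UNDO and REDO, which is why it is the hard case the rest of the deck addresses.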



Basic Idea: Logging

• Record REDO and UNDO information, for every update, in a log.
  – Sequential writes to log (put it on a separate disk).
  – Minimal info (diff) written to log, so multiple updates fit in a single log page.
• Log: An ordered list of REDO/UNDO actions
  – Log record contains:
    <XID, pageID, offset, length, old data, new data>
  – and additional control info (which we’ll see soon).
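
To make the record layout concrete, here is a small Python sketch of an update record and how its two images could be applied to a page. The class name `UpdateRecord`, the helper names and the use of a `bytearray` as a page are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateRecord:
    xid: str
    page_id: str
    offset: int
    length: int
    old_data: bytes  # before-image, used for UNDO
    new_data: bytes  # after-image, used for REDO


def redo(page: bytearray, rec: UpdateRecord) -> None:
    """Reapply the logged change by writing the after-image."""
    page[rec.offset:rec.offset + rec.length] = rec.new_data


def undo(page: bytearray, rec: UpdateRecord) -> None:
    """Reverse the logged change by restoring the before-image."""
    page[rec.offset:rec.offset + rec.length] = rec.old_data


page = bytearray(b"hello world")
rec = UpdateRecord("T1", "P5", offset=6, length=5,
                   old_data=b"world", new_data=b"there")
redo(page, rec)
after_redo = bytes(page)
undo(page, rec)
after_undo = bytes(page)
```

Only the changed bytes are logged, which is the "diff" idea: the record is small regardless of the page size.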


Write-Ahead Logging (WAL)

• The Write-Ahead Logging Protocol:
  1. Must force the log record for an update before the corresponding data page gets to disk.
  2. Must write all log records for a Xact before commit.
• #1 guarantees Atomicity.
• #2 guarantees Durability.

• Exactly how is logging (and recovery!) done?
  – We’ll study the ARIES algorithms.


WAL & the Log

• Each log record has a unique Log Sequence Number (LSN).
  – LSNs always increasing.
• Each data page contains a pageLSN.
  – The LSN of the most recent log record for an update to that page.
• System keeps track of flushedLSN.
  – The max LSN flushed so far.
• WAL: Before a page is written,
  – pageLSN ≤ flushedLSN

[Figure: the log, split into log records already flushed to disk and the “log tail” still in RAM; data pages in the DB each carry a pageLSN; flushedLSN is kept in RAM.]
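
The WAL condition can be sketched as a check performed before any page write. This is a simplified in-memory model, not a real buffer manager: `LogManager`, `write_page` and the choice of using the list index as the LSN are all assumptions for illustration.

```python
class LogManager:
    def __init__(self):
        self.records = []      # log records in LSN order; the tail is only in RAM
        self.flushed_lsn = -1  # max LSN known to be on disk (-1: nothing flushed)

    def append(self, payload) -> int:
        lsn = len(self.records)  # simplification: LSN = position in the log
        self.records.append(payload)
        return lsn

    def flush(self, up_to_lsn: int) -> None:
        # A real system would write the log tail to disk here.
        self.flushed_lsn = max(self.flushed_lsn, up_to_lsn)


def write_page(log: LogManager, page_lsn: int) -> int:
    """Enforce WAL before a (simulated) page write; returns flushedLSN afterwards."""
    if page_lsn > log.flushed_lsn:
        log.flush(page_lsn)  # force the log first
    assert page_lsn <= log.flushed_lsn
    return log.flushed_lsn


log = LogManager()
first = log.append("update P5")
second = log.append("update P3")
flushed_after_write = write_page(log, page_lsn=second)
```

Writing a page whose pageLSN is already covered by flushedLSN triggers no extra log flush.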
Log Records
LogRecord fields:
  prevLSN
  XID
  type
  pageID         (update records only)
  length         (update records only)
  offset         (update records only)
  before-image   (update records only)
  after-image    (update records only)

Possible log record types:
• Update
• Commit
• Abort
• End (signifies end of commit or abort)
• Compensation Log Records (CLRs)
  – for UNDO actions


Other Log-Related State

• Transaction Table:
  – One entry per active Xact.
  – Contains XID, status (running/committed/aborted), and lastLSN.
• Dirty Page Table:
  – One entry per dirty page in buffer pool.
  – Contains recLSN -- the LSN of the log record which first caused the page to be dirty.
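
A short Python sketch of how the two tables might be maintained as updates are logged. The dictionaries and the helper `note_update` are hypothetical; the LSNs and page names reuse the values from the recovery example later in the deck.

```python
xact_table = {}        # XID -> {"status": ..., "lastLSN": ...}
dirty_page_table = {}  # pageID -> recLSN


def note_update(lsn: int, xid: str, page_id: str) -> None:
    """Record the bookkeeping effects of one update log record."""
    entry = xact_table.setdefault(xid, {"status": "running", "lastLSN": None})
    entry["lastLSN"] = lsn  # always the most recent record of this Xact
    # recLSN is set only when the page first becomes dirty, never overwritten.
    dirty_page_table.setdefault(page_id, lsn)


note_update(10, "T1", "P5")
note_update(20, "T2", "P3")
note_update(60, "T2", "P5")
```

Note the asymmetry: lastLSN moves forward with every record, while recLSN stays at the first record that dirtied the page.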


Normal Execution of an Xact

• Series of reads & writes, followed by commit or abort.
  – We will assume that write is atomic on disk.
    • In practice, additional details to deal with non-atomic writes.
• Strict 2PL.
• STEAL, NO-FORCE buffer management, with Write-Ahead Logging.



Checkpointing
• Periodically, the DBMS creates a checkpoint, in order to minimize the time taken to recover in the event of a system crash. Write to log:
  – begin_checkpoint record: Indicates when chkpt began.
  – end_checkpoint record: Contains current Xact table and dirty page table. This is a ‘fuzzy checkpoint’:
    • Other Xacts continue to run; so these tables accurate only as of the time of the begin_checkpoint record.
    • No attempt to force dirty pages to disk; effectiveness of checkpoint limited by oldest unwritten change to a dirty page. (So it’s a good idea to periodically flush dirty pages to disk!)
  – Store LSN of chkpt record in a safe place (master record).
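
The checkpoint steps can be sketched as follows. This is a single-threaded simplification, so the "fuzzy" aspect (other Xacts running concurrently) is not modelled. The function `take_checkpoint`, the list-based log with LSN = list index, and the choice of returning the begin_checkpoint LSN for the master record are assumptions made for this sketch.

```python
import copy


def take_checkpoint(log: list, xact_table: dict, dirty_page_table: dict) -> int:
    """Append begin/end checkpoint records; return the LSN for the master record."""
    begin_lsn = len(log)
    log.append({"type": "begin_checkpoint"})
    # Snapshot both tables into the end_checkpoint record.
    # No dirty pages are forced to disk here.
    log.append({
        "type": "end_checkpoint",
        "xact_table": copy.deepcopy(xact_table),
        "dirty_page_table": dict(dirty_page_table),
    })
    return begin_lsn


log = [{"type": "update", "xid": "T1", "page": "P5"}]
master_record = take_checkpoint(
    log,
    xact_table={"T1": {"status": "running", "lastLSN": 0}},
    dirty_page_table={"P5": 0},
)
```

Because the tables are copied rather than referenced, later changes to the live tables do not alter what the checkpoint recorded.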

The Big Picture: What’s Stored Where

LOG
  LogRecords: prevLSN, XID, type, pageID, length, offset, before-image, after-image

DB
  Data pages, each with a pageLSN
  master record

RAM
  Xact Table: lastLSN, status
  Dirty Page Table: recLSN
  flushedLSN


Simple Transaction Abort

• For now, consider an explicit abort of a Xact.
  – No crash involved.
• We want to “play back” the log in reverse order, UNDOing updates.
  – Get lastLSN of Xact from Xact table.
  – Can follow chain of log records backward via the prevLSN field.
  – Before starting UNDO, write an Abort log record.
    • For recovering from crash during UNDO!



Abort, cont.

• To perform UNDO, must have a lock on data!
  – No problem!
• Before restoring old value of a page, write a CLR:
  – You continue logging while you UNDO!!
  – CLR has one extra field: undonextLSN
    • Points to the next LSN to undo (i.e. the prevLSN of the record we’re currently undoing).
  – CLRs never Undone (but they might be Redone when repeating history: guarantees Atomicity!)
• At end of UNDO, write an “end” log record.
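
The abort procedure can be sketched in Python as a walk back along the prevLSN chain. This is an illustrative model only: the log is a list with LSN = list index, each page holds a single value instead of a byte range, and the function name `abort` and the record field names are assumptions.

```python
def abort(log: list, pages: dict, xid: str, last_lsn: int) -> None:
    """Roll back one Xact with no crash involved."""
    def append(rec) -> int:
        log.append(rec)
        return len(log) - 1

    # Write the abort record before starting UNDO.
    prev = append({"type": "abort", "xid": xid, "prevLSN": last_lsn})
    lsn = last_lsn
    while lsn is not None:
        rec = log[lsn]
        if rec["type"] == "update":
            # Write the CLR first, then restore the old value.
            prev = append({"type": "CLR", "xid": xid, "prevLSN": prev,
                           "page": rec["page"], "value": rec["old"],
                           "undonextLSN": rec["prevLSN"]})
            pages[rec["page"]] = rec["old"]
        lsn = rec["prevLSN"]  # follow the chain backward
    append({"type": "end", "xid": xid, "prevLSN": prev})


log = [
    {"type": "update", "xid": "T1", "prevLSN": None, "page": "P5", "old": "a", "new": "b"},
    {"type": "update", "xid": "T1", "prevLSN": 0, "page": "P3", "old": "x", "new": "y"},
]
pages = {"P5": "b", "P3": "y"}
abort(log, pages, "T1", last_lsn=1)
```

Each CLR's undonextLSN equals the prevLSN of the update it compensates, so a restart in the middle of the rollback would know exactly where to resume.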

Transaction Commit

• Write commit record to log.
• All log records up to Xact’s lastLSN are flushed.
  – Guarantees that flushedLSN ≥ lastLSN.
  – Note that log flushes are sequential, synchronous writes to disk.
  – Many log records per log page.
• Commit() returns.
• Write end record to log.
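
The commit sequence above, as a simplified Python sketch. The `state` dictionary (holding `flushedLSN` and a per-Xact `lastLSN` map), the list-based log with LSN = list index, and the function name `commit` are assumptions made for illustration; flushing is modelled by just advancing a counter.

```python
def commit(log: list, state: dict, xid: str) -> None:
    """Commit one Xact: log the commit, force the log, then log the end record."""
    log.append({"type": "commit", "xid": xid, "prevLSN": state["lastLSN"][xid]})
    state["lastLSN"][xid] = len(log) - 1
    # Force the log up to and including the commit record.
    state["flushedLSN"] = max(state["flushedLSN"], state["lastLSN"][xid])
    # At this point a real commit() call could return to the client.
    log.append({"type": "end", "xid": xid, "prevLSN": state["lastLSN"][xid]})
    del state["lastLSN"][xid]  # the Xact is no longer active


log = [{"type": "update", "xid": "T1", "prevLSN": None}]
state = {"flushedLSN": -1, "lastLSN": {"T1": 0}}
commit(log, state, "T1")
```

In this sketch the end record is appended but not forced: flushedLSN stops at the commit record, which is all that durability requires.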

Crash Recovery: Big Picture


• Start from a checkpoint (found via master record).
• Three phases. Need to:
  – Figure out which Xacts committed since checkpoint, which failed (Analysis).
  – REDO all actions (repeat history).
  – UNDO effects of failed Xacts.

[Figure: log timeline ending at CRASH, with three marked positions: oldest log rec. of Xact active at crash, smallest recLSN in dirty page table after Analysis, and last chkpt. Arrows A (Analysis), R (Redo) and U (Undo) show the extent of each phase.]
Recovery: The Analysis Phase

• Reconstruct state at checkpoint.
  – via end_checkpoint record.
• Scan log forward from checkpoint.
  – End record: Remove Xact from Xact table.
  – Other records: Add Xact to Xact table, set lastLSN=LSN, change Xact status on commit.
  – Update record: If P not in Dirty Page Table,
    • Add P to D.P.T., set its recLSN=LSN.
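
A minimal Python sketch of the Analysis scan, run on the log from the "Example of Recovery" slide. Several things here are assumptions: the record layout, the function name `analysis`, starting from empty checkpoint tables (the slide does not show their contents), marking a Xact "aborted" on an abort record, and treating CLRs like updates for the Dirty Page Table since they also modify pages.

```python
def analysis(log, ckpt_xact_table=None, ckpt_dpt=None):
    """log: records in LSN order, starting just after the checkpoint."""
    xact_table = {x: dict(e) for x, e in (ckpt_xact_table or {}).items()}
    dpt = dict(ckpt_dpt or {})  # pageID -> recLSN
    for rec in log:
        xid, kind = rec["xid"], rec["type"]
        if kind == "end":
            xact_table.pop(xid, None)  # Xact is finished
            continue
        entry = xact_table.setdefault(xid, {"status": "running"})
        entry["lastLSN"] = rec["lsn"]
        if kind == "commit":
            entry["status"] = "committed"
        elif kind == "abort":
            entry["status"] = "aborted"  # assumption, see lead-in
        if kind in ("update", "CLR"):
            dpt.setdefault(rec["page"], rec["lsn"])  # keep the earliest recLSN
    return xact_table, dpt


log = [
    {"lsn": 10, "type": "update", "xid": "T1", "page": "P5"},
    {"lsn": 20, "type": "update", "xid": "T2", "page": "P3"},
    {"lsn": 30, "type": "abort", "xid": "T1"},
    {"lsn": 40, "type": "CLR", "xid": "T1", "page": "P5"},
    {"lsn": 45, "type": "end", "xid": "T1"},
    {"lsn": 50, "type": "update", "xid": "T3", "page": "P1"},
    {"lsn": 60, "type": "update", "xid": "T2", "page": "P5"},
]
xact_table, dpt = analysis(log)
```

Under these assumptions T1 drops out of the table at its end record, leaving T2 and T3 as the Xacts still active at the crash.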


Recovery: The REDO Phase


• We repeat History to reconstruct state at crash:
  – Reapply all updates (even of aborted Xacts!), redo CLRs.
• Scan forward from log rec containing smallest recLSN in D.P.T. For each CLR or update log rec LSN, REDO the action unless:
  – Affected page is not in the Dirty Page Table, or
  – Affected page is in D.P.T., but has recLSN > LSN, or
  – pageLSN (in DB) ≥ LSN.
• To REDO an action:
  – Reapply logged action.
  – Set pageLSN to LSN. No additional logging!
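
The three skip conditions translate directly into a predicate. A Python sketch, with the function name `should_redo` and its parameters chosen for illustration:

```python
def should_redo(lsn: int, page_id: str, dpt: dict, page_lsn_on_disk: int) -> bool:
    """Decide whether the update or CLR with this LSN must be reapplied."""
    if page_id not in dpt:
        return False  # page is not dirty: the change already reached disk
    if dpt[page_id] > lsn:
        return False  # page became dirty only after this record
    if page_lsn_on_disk >= lsn:
        return False  # the page on disk already reflects this record
    return True


dpt = {"P5": 10, "P3": 20}
redo_needed = should_redo(10, "P5", dpt, page_lsn_on_disk=0)
```

The checks are ordered so that the pageLSN comparison, the only one that requires reading the page itself, comes last.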

Recovery: The UNDO Phase


ToUndo = { l | l is the lastLSN of a “loser” Xact }
Repeat:
  – Choose largest LSN among ToUndo.
  – If this LSN is a CLR and undonextLSN == NULL
    • Write an End record for this Xact.
  – If this LSN is a CLR, and undonextLSN != NULL
    • Add undonextLSN to ToUndo
  – Else this LSN is an update. Undo the update, write a CLR, add prevLSN to ToUndo.
Until ToUndo is empty.
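
The loop above, sketched in Python and run on the log from the recovery example with T2 and T3 as losers. Only the log bookkeeping is modelled; restoring page contents is omitted. The record layout, the prevLSN values (inferred from the example), the LSN allocator and the function name `undo_phase` are assumptions. The sketch also writes an End record when an update has no prevLSN, and follows prevLSN for any other record type.

```python
def undo_phase(log: dict, losers: dict) -> list:
    """log: LSN -> record; losers: XID -> lastLSN. Returns new records in order."""
    to_undo = set(losers.values())
    last = dict(losers)      # XID -> LSN of its latest record (prevLSN for new ones)
    next_lsn = max(log) + 1  # hypothetical LSN allocator
    written = []

    def append(rec) -> None:
        nonlocal next_lsn
        log[next_lsn] = rec
        written.append((next_lsn, rec["type"], rec["xid"]))
        last[rec["xid"]] = next_lsn
        next_lsn += 1

    while to_undo:
        lsn = max(to_undo)  # always undo the largest LSN first
        to_undo.remove(lsn)
        rec = log[lsn]
        xid = rec["xid"]
        if rec["type"] == "CLR":
            nxt = rec["undonextLSN"]  # CLRs are never undone, only followed
        elif rec["type"] == "update":
            nxt = rec["prevLSN"]
            append({"type": "CLR", "xid": xid, "prevLSN": last[xid],
                    "page": rec["page"], "undonextLSN": nxt})
        else:
            nxt = rec["prevLSN"]  # e.g. an abort record: nothing to undo
        if nxt is None:
            append({"type": "end", "xid": xid, "prevLSN": last[xid]})
        else:
            to_undo.add(nxt)
    return written


log = {
    10: {"type": "update", "xid": "T1", "prevLSN": None, "page": "P5"},
    20: {"type": "update", "xid": "T2", "prevLSN": None, "page": "P3"},
    30: {"type": "abort", "xid": "T1", "prevLSN": 10},
    40: {"type": "CLR", "xid": "T1", "prevLSN": 30, "page": "P5", "undonextLSN": None},
    45: {"type": "end", "xid": "T1", "prevLSN": 40},
    50: {"type": "update", "xid": "T3", "prevLSN": None, "page": "P1"},
    60: {"type": "update", "xid": "T2", "prevLSN": 20, "page": "P5"},
}
written = undo_phase(log, {"T2": 60, "T3": 50})
```

The LSNs allocated here are consecutive integers rather than the values used on the slides, but the order of the records written is the same: undo T2's LSN 60, undo T3's LSN 50 and end T3, then undo T2's LSN 20 and end T2.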
Example of Recovery
LSN   LOG
00    begin_checkpoint
05    end_checkpoint
10    update: T1 writes P5
20    update: T2 writes P3
30    T1 abort
40    CLR: Undo T1 LSN 10
45    T1 End
50    update: T3 writes P1
60    update: T2 writes P5
      CRASH, RESTART

[Figure: state tracked in RAM during recovery: Xact Table (lastLSN, status), Dirty Page Table (recLSN), flushedLSN, ToUndo. Arrows in the log show the prevLSN chains.]


Example: Crash During Restart!


LSN     LOG
00,05   begin_checkpoint, end_checkpoint
10      update: T1 writes P5
20      update: T2 writes P3
30      T1 abort
40,45   CLR: Undo T1 LSN 10, T1 End
50      update: T3 writes P1
60      update: T2 writes P5
        CRASH, RESTART
70      CLR: Undo T2 LSN 60
80,85   CLR: Undo T3 LSN 50, T3 end
        CRASH, RESTART
90      CLR: Undo T2 LSN 20, T2 end

[Figure: state tracked in RAM during recovery: Xact Table (lastLSN, status), Dirty Page Table (recLSN), flushedLSN, ToUndo. Arrows in the log show the undonextLSN pointers of the CLRs.]

Additional Crash Issues


• What happens if system crashes during Analysis? During REDO?
• How do you limit the amount of work in REDO?
  – Flush asynchronously in the background.
  – Watch “hot spots”!
• How do you limit the amount of work in UNDO?
  – Avoid long-running Xacts.



Summary of Logging/Recovery

• Recovery Manager guarantees Atomicity & Durability.
• Use WAL to allow STEAL/NO-FORCE w/o sacrificing correctness.
• LSNs identify log records; linked into backwards chains per transaction (via prevLSN).
• pageLSN allows comparison of data page and log records.

Summary, Cont.

• Checkpointing: A quick way to limit the amount of log to scan on recovery.
• Recovery works in 3 phases:
  – Analysis: Forward from checkpoint.
  – Redo: Forward from oldest recLSN.
  – Undo: Backward from end to first LSN of oldest Xact alive at crash.
• Upon Undo, write CLRs.
• Redo “repeats history”: Simplifies the logic!