The kernel
[Figure labels: syscall trap/return, fault/return, interrupt/return, I/O completions, timer ticks, policy.]
DFS
DBuffer dbuf = getBlock(blockID)
releaseBlock(dbuf)
read(), write()
startFetch(), startPush()
waitValid(), waitClean()
DBufferCache
DBuffer
ioComplete()
startRequest(dbuf, r/w)
VirtualDisk
Memory Allocation
How should an OS allocate its memory resources among contending demands?
Virtual address spaces: fork, exec, sbrk, page fault.
The kernel controls how many machine memory frames back the pages of each virtual address space.
The kernel can take memory away from a VAS at any time.
The kernel always gets control if a VAS (or rather a thread running within a VAS) asks for more.
The kernel controls how much machine memory to use as a cache for data blocks whose home is on slow storage.
Policy choices: which pages or blocks to keep in memory? And which ones to evict from memory to make room for others?
Memory/storage hierarchy
Terms to know
cache index/directory
cache line/entry, associativity
cache hit/miss, hit ratio
spatial locality of reference
temporal locality of reference
eviction / replacement
write-through / writeback
dirty/clean
Memory as a cache
data
Processes access external storage objects through file APIs and VM abstraction. The OS kernel manages caching of pages/blocks in main memory.
files and filesystems, databases, other storage objects page/block read/write accesses memory (frames)
Editing Ritchie/Thompson
The system maintains a buffer cache (block cache, file cache) to reduce the number of I/O operations.
Proc
Suppose a process makes a system call to access a single byte of a file. UNIX determines the affected disk block, and finds the block if it is resident in the cache. If it is not resident, UNIX allocates a cache buffer and reads the block into the buffer from the disk. Then, if the op is a write, it replaces the affected byte in the buffer. A buffer with modified data is marked dirty: an entry is made in a list of blocks to be written. The write call may then return. The actual write may not be completed until a later time. If the op is a read, it picks the requested byte out of the buffer and returns it, leaving the block in the cache.
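The single-byte access path described above can be sketched as follows. This is a minimal, single-threaded illustration, not the UNIX implementation: the class name `ByteCacheSketch`, the tiny block size, the in-memory `disk` array, and the `flush` method are all assumptions made for the example, and eviction is left out.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a buffer cache serving single-byte reads and writes.
class ByteCacheSketch {
    static final int BLOCK_SIZE = 8;                      // assumed; tiny for illustration
    final byte[] disk;                                    // stands in for the slow device
    final Map<Integer, byte[]> cache = new HashMap<>();   // block number -> cache buffer
    final Set<Integer> dirty = new LinkedHashSet<>();     // blocks waiting to be written
    int diskReads = 0;                                    // counts cache misses

    ByteCacheSketch(int blocks) {
        disk = new byte[blocks * BLOCK_SIZE];
    }

    // Determine the affected block; read it from disk only if it is not resident.
    private byte[] lookup(int offset) {
        int blockNo = offset / BLOCK_SIZE;
        byte[] buf = cache.get(blockNo);
        if (buf == null) {                                // miss: allocate and fill a buffer
            buf = new byte[BLOCK_SIZE];
            System.arraycopy(disk, blockNo * BLOCK_SIZE, buf, 0, BLOCK_SIZE);
            diskReads++;
            cache.put(blockNo, buf);
        }
        return buf;
    }

    // Read: pick the byte out of the buffer and leave the block in the cache.
    byte readByte(int offset) {
        return lookup(offset)[offset % BLOCK_SIZE];
    }

    // Write: replace the byte in the buffer and mark the block dirty, then return.
    void writeByte(int offset, byte value) {
        lookup(offset)[offset % BLOCK_SIZE] = value;
        dirty.add(offset / BLOCK_SIZE);
    }

    // The actual disk write happens later, when dirty blocks are flushed.
    void flush() {
        for (int blockNo : dirty) {
            System.arraycopy(cache.get(blockNo), 0, disk, blockNo * BLOCK_SIZE, BLOCK_SIZE);
        }
        dirty.clear();
    }
}
```

After `writeByte` returns, the new byte is visible to readers through the cache but does not reach `disk` until `flush` runs, which is the delayed-write behavior the paragraph describes.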
DBufferCache
DBuffer
Device I/O interface
Asynchronous I/O to/from buffers
Block read and write
Blocks numbered by blockIDs
Each frame/buffer of memory is described by a meta-object (header). Resident pages or blocks are accessible through a global hash table. An ordered list of eviction candidates winds through the hash chains.
Some frames/buffers are free (no valid data). These are on a free list.
DBufferCache internals
HASH(blockID) Any given block (blockID) is either resident or not. If resident, then it has exactly one copy (dbuf) in the cache. If it is resident then getBlock finds the dbuf (cache hit). This requires some kind of cache index, e.g., a hash table.
DBufferCache
I/O cache buffers Each is byte[blocksize]
DBuffer
Buffer headers DBuffer dbuf
DBufferCache internals
HASH(blockID) If the requested block is not resident, then getBlock allocates a dbuf for the block and places the correct block contents in its buffer (cache miss). If there are no free dbufs in the cache, then we must evict some other block from the cache and reuse its dbuf.
DBufferCache
I/O cache buffers Each is byte[blocksize]
DBuffer
Buffer headers DBuffer dbuf
List(s) of free buffers (bufs) or eviction candidates. These dbufs might be listed in the cache directory if they contain useful data, or not, if they are truly free.
cache directory
To replace a dbuf:
1. Remove from free/eviction list.
2. Remove from cache directory.
3. Change dbuf blockID and status.
4. Enter in directory w/ new blockID.
5. Re-register on eviction list.
Beware of concurrent accesses.
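A minimal sketch of getBlock with the replacement steps above, assuming LRU order on a single combined free/eviction list. The names `MiniBuf` and `MiniCache` are hypothetical and are not the lab's DBuffer or DBufferCache classes. The sketch guards the cache with one lock, and it ignores held, pinned, and dirty buffers, which a real cache must not evict blindly.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical buffer header: which block it holds and whether the data is valid.
class MiniBuf {
    int blockID = -1;          // -1 means truly free (not in the directory)
    boolean valid = false;
    final byte[] data;

    MiniBuf(int blockSize) {
        data = new byte[blockSize];
    }
}

// Hypothetical cache: a directory (hash table) plus one free/eviction list.
class MiniCache {
    private final Map<Integer, MiniBuf> directory = new HashMap<>();
    private final Deque<MiniBuf> evictionList = new ArrayDeque<>();  // head = next victim

    MiniCache(int nBufs, int blockSize) {
        for (int i = 0; i < nBufs; i++) {
            evictionList.addLast(new MiniBuf(blockSize));
        }
    }

    // synchronized: one lock protects the directory and the list together.
    synchronized MiniBuf getBlock(int blockID) {
        MiniBuf buf = directory.get(blockID);
        if (buf != null) {                          // cache hit: move to the tail (LRU)
            evictionList.remove(buf);
            evictionList.addLast(buf);
            return buf;
        }
        buf = evictionList.removeFirst();           // 1. remove from free/eviction list
        if (buf.blockID >= 0) {
            directory.remove(buf.blockID);          // 2. remove from cache directory
        }
        buf.blockID = blockID;                      // 3. change blockID and status
        buf.valid = false;                          //    contents must be fetched or zeroed
        directory.put(blockID, buf);                // 4. enter in directory with new blockID
        evictionList.addLast(buf);                  // 5. re-register on eviction list
        return buf;
    }

    synchronized boolean isResident(int blockID) {
        return directory.containsKey(blockID);
    }
}
```

With two buffers, requesting blocks 1, 2, 1, 3 in that order evicts block 2, because block 1 was touched more recently.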
DBuffer
A dbuf is valid iff it has the correct copy of the data. A dbuf is dirty iff it is valid and has an update (a write) that has not yet been written to disk. A valid dbuf is clean if it is not dirty. Your DeFiler should return only valid data to a client. That may require you to zero the dbuf or fetch data from the disk. Your DeFiler should ensure that all dirty data is eventually pushed to disk.
DBuffer
ioComplete()
A thread upcalls the dbuf's ioComplete() when the I/O operation is done.
startRequest(dbuf, r/w)
VirtualDisk
DFS
A dbuf is pinned if I/O is in progress, i.e., a disk request has started but not yet completed. A dbuf is held if DFS obtained a reference to the dbuf from getBlock but has not yet released the dbuf.
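The valid, dirty, pinned, and held states can be tracked with a small monitor, sketched below. The class `BufState` and the methods `hold`, `release`, `markWritten`, and `isEvictable` are hypothetical names for illustration. The sketch assumes at most one I/O in flight per buffer and does not talk to a disk: `startFetch` and `startPush` only record that a request has started.

```java
// Hypothetical sketch of dbuf state tracking with a Java monitor.
class BufState {
    private boolean valid;    // buffer has the correct copy of the data
    private boolean dirty;    // valid, with an update not yet written to disk
    private boolean pinned;   // a disk request has started but not completed
    private int holds;        // references handed out and not yet released

    synchronized void hold() { holds++; }
    synchronized void release() { holds--; notifyAll(); }

    synchronized boolean isPinned() { return pinned; }
    synchronized boolean isDirty() { return dirty; }

    // Assumption of this sketch: only unpinned, unheld buffers may be evicted.
    synchronized boolean isEvictable() { return !pinned && holds == 0; }

    // Starting either kind of I/O pins the buffer until ioComplete().
    synchronized void startFetch() { pinned = true; }
    synchronized void startPush() { pinned = true; }

    // A write to the buffer makes it dirty; the caller must have made it valid first.
    synchronized void markWritten() {
        if (!valid) throw new IllegalStateException("write to an invalid buffer");
        dirty = true;
    }

    // Upcall when the I/O is done: a finished fetch or push leaves the buffer
    // valid and clean, and wakes any thread in waitValid() or waitClean().
    synchronized void ioComplete() {
        pinned = false;
        valid = true;
        dirty = false;
        notifyAll();
    }

    // Block until valid; returns false if the waiting thread is interrupted.
    synchronized boolean waitValid() {
        try {
            while (!valid) wait();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Block until clean; returns false if the waiting thread is interrupted.
    synchronized boolean waitClean() {
        try {
            while (dirty) wait();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The wait loops re-check their condition after every wakeup, which is the standard way to use `wait` and `notifyAll` safely.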
DBufferCache VirtualDisk
DBuffer
ioComplete()
startRequest(dbuf, r/w);
Allocate blocks to files and file metadata. Allocate DFileIDs to files. Track which blockIDs and DFileIDs are free and which are in use.
inode
Maintain a block map inode for each file, as metadata stored on disk.
read(), write() startFetch(), startPush() waitValid(), waitClean()
DBufferCache
DBuffer
A Filesystem On Disk
[Figure: on-disk layout, with sector 0 and sector 1 labeled and the file data highlighted. The disk holds an allocation bitmap file (11100010 00101101 10111101), a directory file (wind: 18, snow: 62, rain: 32, hail: 48), and file contents ("once upon a time /n in a l").]
A Filesystem On Disk
[Figure: on-disk layout, with sector 0 and sector 1 labeled and the metadata highlighted. The disk holds an allocation bitmap file (11100010 00101101 10111101), a directory file (wind: 18, snow: 62, rain: 32, hail: 48), and file contents ("once upon a time /n in a l").]
Managing files
create, destroy, read, write a dfile
list dfiles
Each file has a size: it is the first byte offset in the file that has never been written. Never return data past a file's size. Fetch blocks for data and metadata (or zero fresh new ones), read and write in place, and push dirty blocks back to the disk.
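The size rule reduces to two small calculations, sketched below. The class `SizeRule` and both method names are hypothetical, and the sketch assumes that a file's size is the end offset of the furthest write so far.

```java
// Hypothetical helpers for the file-size rule.
class SizeRule {
    // How many bytes a read of `count` bytes at `offset` may return:
    // nothing at or past the file size, and never data beyond it.
    static int readableBytes(int size, int offset, int count) {
        if (offset >= size) {
            return 0;
        }
        return Math.min(count, size - offset);
    }

    // The file size after writing `count` bytes at `offset`:
    // a write can only grow the file, never shrink it.
    static int sizeAfterWrite(int size, int offset, int count) {
        return Math.max(size, offset + count);
    }
}
```

For a 100-byte file, a read of 20 bytes at offset 90 returns only 10 bytes, and a read at offset 100 returns none.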
inode
DBufferCache
DBuffer
block map
Indexing the block map by logical block number yields a blockID.
Access blocks by blockID through the block cache with getBlock, startFetch, waitValid, read, releaseBlock.
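The block map lookup can be sketched as an array indexed by logical block number. The class `BlockMapSketch` is hypothetical, and the sketch assumes zero-based logical block numbers and a block map that fits in one in-memory array.

```java
// Hypothetical sketch: translate a byte offset in a file to a blockID.
class BlockMapSketch {
    final int blockSize;
    final int[] blockMap;     // blockMap[logicalBlock] = blockID on the volume

    BlockMapSketch(int blockSize, int[] blockMap) {
        this.blockSize = blockSize;
        this.blockMap = blockMap;
    }

    // Which logical block of the file contains this byte offset.
    int logicalBlock(int offset) {
        return offset / blockSize;
    }

    // The blockID to pass to the block cache for this byte offset.
    int blockIDFor(int offset) {
        return blockMap[logicalBlock(offset)];
    }

    // Where the byte sits inside that block's buffer.
    int offsetInBlock(int offset) {
        return offset % blockSize;
    }
}
```

With 1024-byte blocks and a block map of {40, 17, 93}, byte offset 2500 falls in logical block 2, which maps to blockID 93 at offset 452 within the block.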
[Figure: an inode's block map, indexed by logical block number (logical block 1, logical block 2), pointing to the file's blocks.]
[Figure: the on-disk allocation bitmap (11100010 00101101 10111101), crossed out.]
Your DeFiler volume is small. You can keep the free block/inode maps in memory. You don't need metadata structures on disk for that. But you have to scan the disk to rebuild the in-memory structures on initialization.
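One way to rebuild the free block map at initialization is to walk every inode's block map and mark the blocks it names. The sketch below assumes the inodes have already been read from disk into an array, with null marking an unused DFileID, and that a fixed number of blocks at the start of the volume is reserved for metadata. The class `FreeMapSketch` and its parameters are hypothetical.

```java
import java.util.BitSet;

// Hypothetical sketch: rebuild the in-memory free block map from the inodes.
class FreeMapSketch {
    // inodes[i] is the block map of DFileID i, or null if that DFileID is free.
    static BitSet usedBlocks(int[][] inodes, int totalBlocks, int reservedBlocks) {
        BitSet used = new BitSet(totalBlocks);
        used.set(0, reservedBlocks);          // metadata region is always in use
        for (int[] blockMap : inodes) {
            if (blockMap == null) {
                continue;                     // unused DFileID owns no blocks
            }
            for (int blockID : blockMap) {
                used.set(blockID);            // block belongs to some file
            }
        }
        return used;
    }

    // Allocate the lowest free block, or return -1 if the volume is full.
    static int allocate(BitSet used, int totalBlocks) {
        int blockID = used.nextClearBit(0);
        if (blockID >= totalBlocks) {
            return -1;
        }
        used.set(blockID);
        return blockID;
    }
}
```

The same scan can record which DFileIDs are in use, since a non-null inode entry marks a valid file.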
[Figure: the directory file entries (rain: 32, hail: 48), crossed out; the file blocks and inode remain.]
DeFiler has no directories. You just need to keep track of which DFileIDs are currently valid, and return a list.
[Figure: file blocks and their inode.]