INTRODUCTION TO VFS
VFS SUPER BLOCK
VFS INODES
FILES & ITS OPERATIONS
DENTRY
REGISTERING & MOUNT
Different operating systems have different
file systems.
Linux supports multiple file system types in the
same way other Unix variants do: through a
concept called the Virtual File System (VFS).
$ cp /floppy/TEST /tmp/test
VFS uses a common file model built from a small set of
objects (data structures) that can represent every
supported file system.
◦ Superblock object
Stores information about a mounted file system.
◦ Inode object
Stores general information about a specific file.
◦ File object
Stores information about the interaction between an open file
and a process.
◦ Dentry object
Stores information about the linking of a directory
entry with the corresponding file.
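The way these objects refer to one another, together with the superblock covered in the next section, can be sketched with a few cut-down structures. This is only an illustration: the toy_ types and the helper toy_file_fs_name are hypothetical names, and the real definitions in the kernel headers carry many more fields.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel objects. */
struct toy_super_block { const char *fs_name; };      /* one per mounted fs */
struct toy_inode  { unsigned long ino; struct toy_super_block *sb; };
struct toy_dentry { const char *name; struct toy_dentry *parent;
                    struct toy_inode *inode; };
struct toy_file   { struct toy_dentry *dentry; long pos; unsigned int flags; };

/* Follow the chain file -> dentry -> inode -> superblock.
 * Returns NULL if any link in the chain is missing. */
static const char *toy_file_fs_name(const struct toy_file *f)
{
    if (f == NULL || f->dentry == NULL || f->dentry->inode == NULL ||
        f->dentry->inode->sb == NULL)
        return NULL;
    return f->dentry->inode->sb->fs_name;
}
```

The point of the chain is that an open file never needs to know its file system directly; it reaches it through the dentry and the inode.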
VFS SUPER BLOCK
A separate superblock structure is maintained for every
mounted file system.
The VFS keeps a list of the mounted file systems with their
VFS superblocks.
void lock_super(struct super_block *sb)
{
    if (sb->s_lock)
        wait_on_super(sb);   /* sleep until the current holder unlocks it */
    sb->s_lock = 1;          /* mark the superblock as locked */
}
Each specific file system can define its own
superblock operations.
Example: read_inode() is one such operation; it is a method in the superblock operations table, not a system call.
put_super
Releases the superblock object because the corresponding file
system is being unmounted.
write_super
Called when the VFS decides to write the superblock to disk.
It is not needed for a file system mounted read-only.
remount_fs
Called when the file system is to be remounted with new options.
Used to change mount options without unmounting
the file system.
Example: changing a read-only file system to a writable one.
umount_begin
Only NFS provides this method.
Called in the early stages of the unmounting process.
Causes any incomplete transaction on the file system to fail
quickly rather than block.
It is not expected to make the file system unmountable by itself, but it
lets processes using the file system be killed rather
than stay in an uninterruptible wait.
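The idea that each file system supplies its own table of superblock methods, which the generic layer calls through function pointers, can be sketched as follows. This is a user-space illustration under stated assumptions: the toy_ and demo_ names, the TOY_RDONLY flag value and the dirty/writes fields are invented for the example and are not the kernel's definitions.

```c
#include <stddef.h>

struct toy_sb;

/* Hypothetical, cut-down method table modelled on the superblock operations. */
struct toy_super_ops {
    void (*write_super)(struct toy_sb *sb);
    int  (*remount_fs)(struct toy_sb *sb, int *flags);
};

#define TOY_RDONLY 1   /* assumed flag value, for illustration only */

struct toy_sb {
    int flags;
    int dirty;
    int writes;                        /* counts write_super calls */
    const struct toy_super_ops *ops;
};

/* Generic layer: skip read-only or clean superblocks and tolerate a
 * file system that does not define the method. */
static void toy_sync_super(struct toy_sb *sb)
{
    if ((sb->flags & TOY_RDONLY) || !sb->dirty)
        return;
    if (sb->ops && sb->ops->write_super)
        sb->ops->write_super(sb);
}

/* One file system's implementation of the two methods. */
static void demo_write_super(struct toy_sb *sb) { sb->writes++; sb->dirty = 0; }
static int demo_remount_fs(struct toy_sb *sb, int *flags)
{
    sb->flags = *flags;                /* adopt the new mount options */
    return 0;
}

static const struct toy_super_ops demo_ops = { demo_write_super, demo_remount_fs };
```

A superblock that starts out read-only is never written; after remount_fs clears the flag, the next toy_sync_super call reaches the file system's write_super.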
VFS INODES
An inode contains the management information for a
particular file.
These inodes can be accessed in two ways.
Through dcache
Each dentry in the dcache refers to an inode, and thereby keeps that
inode in the cache.
Each inode object always appears in one of the
following circular doubly linked lists
default_file_ops:
◦ Pointer to the default table of file operations for files opened
on this inode.
◦ When a file is opened:
• The f_op field in the file structure is initialized from this pointer.
• Then the open method in the file_operations table is called.
• That method may change f_op to a different method
table (e.g. for a device special file).
create:
◦ Only meaningful on directory inodes.
◦ If successful, it:
• Gets a new empty inode from the cache with
get_empty_inode.
• Fills in the fields and inserts it into the hash table with
insert_inode_hash.
• Marks it dirty with mark_inode_dirty.
• Instantiates it in the dcache with d_instantiate.
lookup:
◦ Only meaningful on directory inodes.
◦ Checks whether the name exists in the directory.
◦ If it does, updates the dentry using d_add; this involves finding and loading the inode.
link:
◦ Only meaningful on directory inodes.
◦ Makes a hard link.
◦ On success, calls d_instantiate to link the inode of the linked file to
the new dentry.
unlink:
◦ Only meaningful on directory inodes.
◦ Removes the name from the directory.
◦ On success, calls d_delete on the dentry.
symlink:
◦ Only meaningful on directory inodes.
◦ Creates a symbolic link in the given directory with the given name
and the given value (the link target).
◦ On success, calls d_instantiate to attach the new inode to the dentry.
mkdir:
◦ Creates a directory with the given parent, name, and mode.
rmdir:
◦ Removes the named directory (if empty).
◦ Then calls d_delete on the dentry.
mknod:
◦ Creates a device special file with the given parent, name, mode, and
device number.
◦ Then calls d_instantiate to attach the new inode to the dentry.
rename:
◦ Renames the object to have the parent and name given by the second
inode and dentry.
◦ All generic checks, including that the new parent isn't a child of the old
name, have already been done.
readlink:
◦ Reads the symbolic link referred to by the dentry and copies its value
into the user buffer (with copy_to_user), up to the maximum length
given by the int argument.
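The default_file_ops step described above (copy the inode's default table into f_op, call its open method, and let that method swap in a different table) can be sketched in user-space C. The toy_ types and the chrdev_ops and tty_ops tables are hypothetical names chosen for the illustration; only the control flow mirrors the description.

```c
#include <stddef.h>

struct toy_file;

/* Hypothetical, cut-down method table. */
struct toy_file_ops {
    int (*open)(struct toy_file *f);
    const char *name;
};

struct toy_inode { const struct toy_file_ops *default_file_ops; };
struct toy_file  { const struct toy_file_ops *f_op; struct toy_inode *inode; };

/* Driver-specific table that the generic device open method switches to. */
static const struct toy_file_ops tty_ops = { NULL, "tty" };

/* Generic open for a device special file: replace f_op with the
 * table of the (assumed) driver behind the device. */
static int chrdev_open(struct toy_file *f)
{
    f->f_op = &tty_ops;
    return 0;
}

static const struct toy_file_ops chrdev_ops = { chrdev_open, "chrdev" };

/* Open path: initialize f_op from the inode, then call open if defined. */
static int toy_open(struct toy_inode *inode, struct toy_file *f)
{
    f->inode = inode;
    f->f_op = inode->default_file_ops;
    if (f->f_op && f->f_op->open)
        return f->f_op->open(f);
    return 0;
}
```

After toy_open returns, later operations on the file go through whichever table the open method left in f_op.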
FILES & ITS OPERATIONS
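#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int main(void)
{
    char buffer[512];
    int infile = open("hello.txt", O_RDONLY);
    if (infile < 0) {                 /* open() returns -1 on failure */
        perror("open");
        return 1;
    }
    /* Leave room for the terminating NUL that printf("%s") needs. */
    ssize_t n = read(infile, buffer, sizeof(buffer) - 1);
    if (n < 0) {
        perror("read");
        close(infile);
        return 1;
    }
    buffer[n] = '\0';
    printf("%s\n", buffer);
    close(infile);
    return 0;
}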
What file system does “hello.txt” live on?
you make a call to the standard C library function
Suppose a file system is being mounted
A user tries to access the file “/usr/bin/nano” on the mounted file
system.
Dentry objects are also created.
The kernel asks the file system driver for the file object that represents “/usr/bin/nano”.
The fs driver creates and initializes the VFS file struct and returns it to
the kernel.
All the tasks for which the kernel is responsible here are done by the VFS
layer.
The VFS file object is an in-memory representation of an
open file.
Each file object is always included in one of the
following circular doubly linked lists :-
Defined in linux/fs.h
f_next, f_pprev
Link files together into one of a number of lists.
There is one list for each active file system, starting at the s_files pointer
in the superblock.
f_dentry
Records the dcache entry that points to the inode for this file.
f_op
Points to a struct containing the methods to use on this file.
f_count
Number of references to this file: one for each different user-process
file descriptor.
f_flags
Stores the flags for the file, such as access type (read/write), non-blocking,
and append-only.
Flags like O_CREAT and O_TRUNC are relevant only at the time of
opening, so they are not stored in f_flags.
f_mode
Stores the read and write access as two separate bits.
f_pos
Records the current file position for the next read/write request.
f_reada, f_ramax, f_raend, f_ralen, f_rawin
These five fields keep track of sequential access patterns on
the file and determine how much read-ahead to do.
f_owner
Stores a process ID and a signal to send to that process when certain
events happen on the file.
f_uid, f_gid
Set to the owner and group of the process that opened the file.
f_error
Used by the NFS client file-system code to return write errors.
f_version
Used by the underlying file system to help cache state and to check whether
the cache is invalid.
Changes whenever the file's f_pos value changes.
private_data
Used by device drivers, and even a few file systems, to store extra
per-open-file information.
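The split between open-time flags and the flags kept for the open file can be observed from user space with fcntl(F_GETFL). The sketch below assumes a Linux system and a writable current directory; the helper name flags_after_open and the scratch file path passed to it are made up for the example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Open (creating if needed) the given scratch path with a mix of
 * open-time and status flags, then return the flags the kernel kept
 * for the open file, or -1 on error. The scratch file is removed. */
static int flags_after_open(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_APPEND, 0600);
    if (fd < 0)
        return -1;
    int fl = fcntl(fd, F_GETFL);   /* access mode plus status flags */
    close(fd);
    unlink(path);
    return fl;
}
```

On Linux the returned value still carries the access mode and O_APPEND, while O_CREAT and O_TRUNC, which only matter during the open itself, are no longer set.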
46
Defined in linux/fs.h:
struct file_operations {
llseek(file, offset, whence)
implements the lseek system call
called when the VFS needs to move the file position index
updates the f_pos field
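From user space, the effect of llseek on the file position is visible through the lseek system call. A minimal sketch, assuming a POSIX system and a writable current directory; seek_demo and the scratch path are hypothetical names for the illustration.

```c
#include <fcntl.h>
#include <unistd.h>

/* Write 10 bytes to a scratch file, check that the write advanced the
 * file position, then reposition with SEEK_SET. Returns the position
 * reported after the seek, or -1 on error. The scratch file is removed. */
static off_t seek_demo(const char *path, off_t target)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    off_t result = -1;
    if (write(fd, "0123456789", 10) == 10 &&
        lseek(fd, 0, SEEK_CUR) == 10)            /* position after the write */
        result = lseek(fd, target, SEEK_SET);    /* explicit reposition */
    close(fd);
    unlink(path);
    return result;
}
```

Calling lseek with an offset of 0 and SEEK_CUR is a common way to read the current position without changing it.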
poll(file, poll_table)
used to implement the select and poll system calls
mmap(file, vma)
implements memory mapping of files into a process address
space
open(inode, file)
opens a file by creating a new file object and linking it to the corresponding
inode object
initialises the “f_op” member with the “default_file_ops” in the inode
structure
flush(file)
called when a file descriptor is closed
f_count of the file object is decremented
The only file system that currently defines this method is the NFS client.
release(inode, file)
called when the last reference to an open file is closed
f_count field of the file object becomes 0
releases the file object
fsync(file, dentry)
implements the fsync system call
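The difference between closing one descriptor and dropping the last reference to a file object can be observed with dup(). The sketch assumes a POSIX system and a writable current directory; shared_pos_demo and the scratch path are hypothetical names for the illustration.

```c
#include <fcntl.h>
#include <unistd.h>

/* dup() yields a second descriptor for the same open file object, so
 * both descriptors share one file position. Reads nread bytes through
 * the original descriptor, closes it, and returns the position seen
 * through the duplicate, or -1 on error. */
static off_t shared_pos_demo(const char *path, size_t nread)
{
    char buf[16];
    off_t pos = -1;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    int fd2 = dup(fd);
    if (fd2 >= 0 && nread <= sizeof(buf) &&
        write(fd, "0123456789", 10) == 10 &&
        lseek(fd, 0, SEEK_SET) == 0 &&
        read(fd, buf, nread) == (ssize_t)nread) {
        close(fd);                        /* one reference dropped */
        pos = lseek(fd2, 0, SEEK_CUR);    /* file object is still alive */
    } else {
        close(fd);
    }
    if (fd2 >= 0)
        close(fd2);                       /* last reference goes away here */
    unlink(path);
    return pos;
}
```

The first close only lowers the reference count; the open file stays usable through the duplicate until the second close.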
fasync(file, on)
Enables or disables asynchronous I/O notification by means of
signals
No file-systems currently use this method
check_media_change(dev)
checks whether there has been a change of media since the last
operation on the device file
applicable to block devices that support removable media, such as
CDROM
called in read_super when a file-system is about to be mounted
If it returns true at this point, all buffers associated with the device are
invalidated
revalidate(dev)
called after buffers have been invalidated after a media change, as
reported by check_media_change
lock(file, cmd, file_lock)
allows a file service to provide extra handling of POSIX locks
useful particularly for network file-systems
When a process is trying to find what locks are present,
information returned by this method is used.
DENTRY
• Each directory or file entry on a file system is transformed by
the VFS into a dentry object.
For example, when looking up the /tmp/test pathname,
the kernel creates three dentry objects:
one for the / root directory,
a second for the tmp entry of the root directory,
a third for the test entry of the /tmp directory.
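The count in this example follows from splitting the pathname into components. A minimal sketch of that arithmetic, in which count_path_dentries is a hypothetical helper and "." entries, ".." entries and symbolic links are deliberately ignored:

```c
#include <stddef.h>
#include <string.h>

/* Count the dentries a lookup of an absolute path touches in this
 * simplified model: one for the root plus one per non-empty component.
 * Returns -1 for a NULL or relative path. */
static int count_path_dentries(const char *path)
{
    if (path == NULL || path[0] != '/')
        return -1;
    int count = 1;                        /* the "/" root dentry */
    const char *p = path;
    while (*p) {
        while (*p == '/')                 /* skip separators */
            p++;
        size_t len = strcspn(p, "/");     /* length of the next component */
        if (len == 0)
            break;
        count++;
        p += len;
    }
    return count;
}
```

For "/tmp/test" the helper returns 3, matching the three dentry objects listed above.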
• Unused
– The dentry object is not currently used by the kernel.
– The d_count usage counter of the object is 0, but
the d_inode field still points to the associated inode.
– The dentry object contains valid information, but its
contents may be discarded if necessary to reclaim
memory.
• In use
– The dentry object is currently used by the kernel.
– The d_count usage counter is positive and the
d_inode field points to the associated inode object.
– The dentry object contains valid information and
cannot be discarded.
• Negative
– The inode associated with the dentry no longer
exists, because the corresponding disk inode has
been deleted.
– The d_inode field of the dentry object is set to NULL,
but the object still remains in the dentry cache so
that further lookup operations on the same file
pathname can be quickly resolved.
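The three states listed here are distinguished only by the d_count and d_inode fields, which a short classifier makes explicit. This is an illustrative sketch: struct state_dentry, the enum and classify_dentry are hypothetical names, not kernel definitions.

```c
#include <stddef.h>

enum dentry_state { DENTRY_UNUSED, DENTRY_IN_USE, DENTRY_NEGATIVE };

/* Hypothetical cut-down dentry holding just the two relevant fields. */
struct state_dentry {
    int   d_count;   /* usage counter */
    void *d_inode;   /* associated inode, or NULL */
};

/* Apply the rules from the list: no inode means negative; otherwise
 * the usage counter decides between in use and unused. */
static enum dentry_state classify_dentry(const struct state_dentry *d)
{
    if (d->d_inode == NULL)
        return DENTRY_NEGATIVE;
    return d->d_count > 0 ? DENTRY_IN_USE : DENTRY_UNUSED;
}
```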
Reading a directory entry from disk and constructing the
corresponding dentry object takes considerable time,
and in most cases the same file is accessed repeatedly,
so the kernel keeps dentry objects in the dentry cache.
struct dentry_operations {
    int (*d_revalidate)(struct dentry *, int);
    int (*d_hash)(struct dentry *, struct qstr *);
    int (*d_compare)(struct dentry *, struct qstr *, struct qstr *);
    void (*d_delete)(struct dentry *);
    void (*d_release)(struct dentry *);
    void (*d_iput)(struct dentry *, struct inode *);
};
About the operations:
• d_revalidate(dentry):
• Called whenever a path lookup uses an entry
in the dcache, to check whether the entry is still valid.
• The default method does nothing; NFS defines its own.
• d_hash(dentry, hash):
• Called to calculate the hash value of a name.
• d_delete(dentry):
• Called when the reference count reaches zero, before
the dentry is placed on the dentry_unused list.
• d_release(dentry):
• Called just before a dentry is finally freed.
• Can be used to release d_fsdata, if any.
• d_iput(dentry, ino):
• If defined, called instead of iput to release the inode
when the dentry is being discarded.
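One reason a file system might supply its own d_hash and d_compare is case-insensitive naming: names that differ only in case must hash to the same bucket and compare as equal. The sketch below illustrates that idea in user-space C. struct toy_qstr, ci_hash and ci_compare are hypothetical, the hash function is an arbitrary choice, and the "0 means match" return convention is assumed in the style of memcmp.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for the name structure passed to the methods. */
struct toy_qstr {
    const char   *name;
    size_t        len;
    unsigned long hash;
};

/* Hash the lower-cased name so "README" and "readme" collide on purpose. */
static int ci_hash(struct toy_qstr *q)
{
    unsigned long h = 0;
    for (size_t i = 0; i < q->len; i++)
        h = h * 31 + (unsigned long)tolower((unsigned char)q->name[i]);
    q->hash = h;
    return 0;
}

/* Return 0 when the two names match ignoring case, non-zero otherwise. */
static int ci_compare(const struct toy_qstr *a, const struct toy_qstr *b)
{
    if (a->len != b->len)
        return 1;
    for (size_t i = 0; i < a->len; i++)
        if (tolower((unsigned char)a->name[i]) !=
            tolower((unsigned char)b->name[i]))
            return 1;
    return 0;
}
```

The two methods have to agree: any pair of names that ci_compare treats as equal must also receive the same hash, otherwise a cached entry would never be found.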
REGISTERING & MOUNT
REGISTERING AND MOUNTING AT BOOT TIME
boot() {
    filesystem_setup() {
        /* every file system type built into the kernel is registered */
        register_filesystem(&file_system_type) {
            name = get_file_system(file_system_name);
            if (name is found)
                return &file_system_type;
            else
                boot_error;
        }
    }
    mount_root() {
        filp->f_mode = root_mountflags;
        dummy_inode->i_rdev = ROOT_DEV;
        blkdev_open(dummy_inode, filp);      /* open the root device */
        for each file system type in the registered list {
            /* fails for every type but the right one, which creates an inode
               object and a dentry object for the root directory */
            read_super();
        }
        /* root and current working directory of the first process */
        fs_struct->root = fs_struct->pwd = dentry_obj;
        add_vfsmnt();
    }
}
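The registration step in this pseudocode amounts to keeping a linked list of file system types that can later be searched by name. A user-space sketch of that bookkeeping follows; the toy_ names are hypothetical and the error values are an assumption chosen for the example, not taken from the slides.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cut-down file system type: a name and a list link. */
struct toy_fs_type {
    const char *name;
    struct toy_fs_type *next;
};

static struct toy_fs_type *toy_file_systems;   /* head of the registered list */

/* Append fs to the list unless it is already linked in or a type with
 * the same name is registered. Returns 0 on success. */
static int toy_register_filesystem(struct toy_fs_type *fs)
{
    struct toy_fs_type **tmp;

    if (fs == NULL)
        return -EINVAL;
    if (fs->next != NULL)
        return -EBUSY;
    for (tmp = &toy_file_systems; *tmp != NULL; tmp = &(*tmp)->next)
        if (strcmp((*tmp)->name, fs->name) == 0)
            return -EBUSY;
    *tmp = fs;                                 /* link at the tail */
    return 0;
}

/* Look a registered type up by name; NULL if it was never registered. */
static struct toy_fs_type *toy_get_fs_type(const char *name)
{
    for (struct toy_fs_type *fs = toy_file_systems; fs != NULL; fs = fs->next)
        if (strcmp(fs->name, name) == 0)
            return fs;
    return NULL;
}
```

Mounting then starts with a lookup in this list, which is why a mount request for an unregistered type fails before any device is touched.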
sys_mount()
Input:
1. pathname of the device file
2. pathname of the mount point
3. file system type
4. mount flags
5. pointer to a file-system-dependent data structure
{
    if (caller lacks permission || not in kernel mode) exit;
    fs_type = get_fs_type();
    if (!fs_type) exit;              /* type is not registered */
    if (file system is disk-based)
        get the dentry object of the device file and check that it is a valid, operational block device;
    else
        get_unnamed_dev();
    do_mount() {                     /* called with the required parameters */
        dir_d = namei(dirname);
        acquire mnt_sem;
        if (dir_d is not a directory || dir_d is root)
            exit;
        sb = read_super();
        check_remount();
        add_vfsmnt(sb);
        d_mount->dir_d = s_root;     /* mount point now leads to the new root */
        mt_root->parent = dir_d;     /* new root records the directory it covers */
        release mnt_sem;
    }
}