Virtual memory
The program thinks it has a large range of contiguous addresses, but in reality the parts it is currently using are scattered around RAM, and the inactive parts are saved in a disk file.
Note that "virtual memory" is not just "using disk space to extend physical memory
size". Extending memory is a normal consequence of using virtual memory techniques,
but can be done by other means such as overlays or swapping programs and their data
completely out to disk while they are inactive. The definition of "virtual memory" is
based on tricking programs into thinking they are using large blocks of contiguous
addresses.
Advantages
• Does not load pages that are never accessed, saving memory for other programs and increasing the degree of multiprogramming.
• Lower loading latency at program startup.
• Lower initial disk overhead, because of fewer page reads.
• Does not need hardware support beyond what paging requires, since a protection fault can be used to signal a page fault.
• Pages can be shared by multiple programs until one of them modifies them; a technique called copy-on-write is used to save resources (see the sketch after this list).
• Ability to run large programs on a machine that does not have sufficient physical memory to hold them entirely. This is easier for the programmer than the old manual overlay technique.
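The copy-on-write behavior mentioned above can be observed directly with fork() on Unix-like systems. A minimal C sketch, assuming a POSIX environment: parent and child share the buffer's page frames until the child writes to one of them.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    /* One megabyte of data; after fork() the child shares these page
       frames with the parent until one of them writes to a page. */
    size_t size = 1 << 20;
    char *buf = malloc(size);
    memset(buf, 'A', size);

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: this write triggers a copy-on-write fault on the
           affected page only; all other pages remain shared. */
        buf[0] = 'B';
        printf("child sees: %c\n", buf[0]);   /* prints 'B' */
        exit(0);
    }
    wait(NULL);
    printf("parent still sees: %c\n", buf[0]); /* prints 'A' */
    free(buf);
    return 0;
}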
Disadvantages
• Individual programs face extra latency when they access a page for the first time. Prepaging, a method of remembering which pages a process used when it last executed and preloading a few of them, can be used to improve performance.
Page replacement algorithms
When the page that was selected for replacement and paged out is referenced again, it has to be paged in (read in from disk), and this involves waiting for I/O completion. This determines the quality of a page replacement algorithm: the less time spent waiting for page-ins, the better the algorithm.
When a process incurs a page fault, a local page replacement algorithm selects for
replacement some page that belongs to that same process (or a group of processes sharing
a memory partition). A global replacement algorithm is free to select any page in
memory.
In the theoretically optimal page replacement algorithm, when a page needs to be swapped in, the operating system swaps out the page whose next use will occur farthest in the future. For example, a page that is not going to be used for the next 6 seconds will be swapped out in preference to a page that is going to be used within the next 0.4 seconds.
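Because this policy requires knowing the future reference string, it cannot be implemented in a general-purpose OS and serves only as a benchmark. A minimal C simulation sketch (the reference string and frame count are illustrative):

#include <stdio.h>

/* Return the index of the frame whose page is used farthest in the
   future (or never again) after position "now" in the reference string. */
int victim(int frames[], int nframes, int refs[], int nrefs, int now) {
    int far = -1, chosen = 0;
    for (int f = 0; f < nframes; f++) {
        int next = nrefs;                  /* assume "never used again" */
        for (int t = now + 1; t < nrefs; t++)
            if (refs[t] == frames[f]) { next = t; break; }
        if (next > far) { far = next; chosen = f; }
    }
    return chosen;
}

int main(void) {
    int refs[] = {1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5};
    int nrefs = sizeof refs / sizeof refs[0];
    int frames[3] = {-1, -1, -1}, faults = 0;
    for (int t = 0; t < nrefs; t++) {
        int hit = 0;
        for (int f = 0; f < 3; f++)
            if (frames[f] == refs[t]) hit = 1;
        if (!hit) {
            int f = victim(frames, 3, refs, nrefs, t);
            frames[f] = refs[t];           /* page-in, evicting the victim */
            faults++;
        }
    }
    printf("optimal page faults: %d\n", faults);   /* 7 for this string */
    return 0;
}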
Analysis of the paging problem has also been done in the field of online
algorithms. Efficiency of randomized online algorithms for the paging problem is
measured using amortized analysis.
Not recently used
The not recently used (NRU) algorithm favours keeping pages that have been recently used. At a certain fixed time interval, the clock interrupt triggers and clears the referenced bit of all the pages, so only pages referenced within the current clock interval are marked with a referenced bit. When a page needs to be replaced, the operating system divides the pages into four categories:
• Category 0: not referenced, not modified
• Category 1: not referenced, modified
• Category 2: referenced, not modified
• Category 3: referenced, modified
Although it does not seem possible for a page to be not referenced yet modified,
this happens when a category 3 page has its referenced bit cleared by the clock interrupt.
The NRU algorithm picks a random page from the lowest category for removal. Note that
this algorithm implies that a referenced page is more important than a modified page.
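A minimal sketch of the NRU selection step in C, assuming the referenced and modified bits have already been copied out of the page-table entries (the structure and function names are illustrative):

#include <stdlib.h>

struct page { int referenced; int modified; };

/* Classify each page as category = 2*referenced + modified and evict
   a random page from the lowest non-empty category. */
int nru_victim(struct page pages[], int n) {
    for (int cat = 0; cat < 4; cat++) {
        int candidates[n], count = 0;      /* C99 variable-length array */
        for (int i = 0; i < n; i++)
            if (2 * pages[i].referenced + pages[i].modified == cat)
                candidates[count++] = i;
        if (count > 0)
            return candidates[rand() % count];   /* random page in category */
    }
    return -1;   /* unreachable when n > 0 */
}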
First-in, first-out
The first-in, first-out (FIFO) page replacement algorithm is a low-overhead algorithm that requires little bookkeeping on the part of the operating system. The idea is obvious from the name: the operating system keeps track of all the pages in memory in a queue, with the most recent arrival at the back and the earliest arrival at the front. When a page needs to be replaced, the page at the front of the queue (the oldest page) is selected. While FIFO is cheap and intuitive, it performs poorly in practice, so it is rarely used in its unmodified form. This algorithm also experiences Belady's anomaly (illustrated below).
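A minimal C sketch of FIFO replacement, treating the frames as a circular queue whose "hand" always points at the oldest page (the reference string and frame count are illustrative):

#include <stdio.h>

#define NFRAMES 3

int frames[NFRAMES] = {-1, -1, -1};
int hand = 0;                 /* index of the oldest page in memory */

int access_page(int page) {   /* returns 1 on a page fault */
    for (int i = 0; i < NFRAMES; i++)
        if (frames[i] == page)
            return 0;         /* hit: FIFO does nothing on a hit */
    frames[hand] = page;      /* fault: evict the oldest page */
    hand = (hand + 1) % NFRAMES;
    return 1;
}

int main(void) {
    int refs[] = {1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5};
    int nrefs = sizeof refs / sizeof refs[0], faults = 0;
    for (int t = 0; t < nrefs; t++)
        faults += access_page(refs[t]);
    printf("FIFO page faults: %d\n", faults);   /* 9 with three frames */
    return 0;
}

Running the same reference string with four frames instead of three produces ten faults rather than nine: more memory, more faults, which is exactly Belady's anomaly.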
Least recently used
The least recently used (LRU) algorithm replaces the page that has gone unused for the longest time. The most expensive implementation is the linked list method, which uses a linked list containing all the pages in memory. At the back of this list is the least recently used page, and at the front is the most recently used page. The cost of this implementation lies in the fact that items in the list must be moved about on every memory reference, which is very time-consuming.
Another method that requires hardware support is as follows: suppose the hardware has a 64-bit counter that is incremented at every instruction. Whenever a page is accessed, it is stamped with the value of the counter at the time of the access. Whenever a page needs to be replaced, the operating system selects the page with the lowest counter value and swaps it out. With present hardware this is not feasible, because the required hardware counters do not exist.
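A minimal sketch of this counter-based LRU bookkeeping in C, with an ordinary variable standing in for the hypothetical hardware instruction counter (all names are illustrative):

#include <stdint.h>

#define NFRAMES 4

uint64_t counter = 0;         /* stands in for the hardware counter  */
int      frames[NFRAMES];     /* page number held by each frame      */
uint64_t stamp[NFRAMES];      /* counter value at each frame's last access */

void touch(int f) { stamp[f] = ++counter; }   /* called on every access */

int lru_victim(void) {        /* frame with the smallest stamp = LRU */
    int victim = 0;
    for (int f = 1; f < NFRAMES; f++)
        if (stamp[f] < stamp[victim])
            victim = f;
    return victim;
}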
Because of implementation costs, one may consider algorithms (like those that
follow) that are similar to LRU, but which offer cheaper implementations.
On the other hand, LRU's weakness is that its performance tends to degenerate under many quite common reference patterns. For example, if there are N pages in the LRU pool, an application executing a loop over an array of N + 1 pages will cause a page fault on each and every access. As loops over large arrays are common, much effort has been put into modifying LRU to work better in such situations. Many of the proposed LRU modifications try to detect looping reference patterns and to switch to a suitable replacement algorithm, such as Most Recently Used (MRU).
Variants on LRU
• LRU-K improves greatly on LRU with regard to locality in time. The case K = 2 is known as LRU-2; LRU-1 (K = 1) is the same as ordinary LRU.
• The ARC [3] algorithm extends LRU by maintaining a history of recently evicted pages and uses it to adapt its preference between recency and frequency of access. It is particularly resistant to sequential scans.
4.4 Thrashing
• If a process does not have “enough” pages, the page-fault rate is very high.
This leads to:
o low CPU utilization.
o operating system thinks that it needs to increase the degree of
multiprogramming.
o another process added to the system.
• Thrashing: a process is busy swapping pages in and out, to the virtual exclusion of real work.
• Why does paging work?
• Locality model
o Process migrates from one locality to another.
o Localities may overlap.
• Why does thrashing occur?
o the sum of the sizes of the localities exceeds the total memory size
Working-Set Model
The working-set model approximates a program's locality: a process's working set is the set of pages it has referenced during the most recent Δ memory references. Frames can then be allocated to processes in several ways:
• Equal allocation – e.g., if there are 100 frames and 5 processes, give each process 20 frames.
• Proportional allocation – allocate according to the size of each process.
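As a worked illustration of proportional allocation: with m = 62 free frames and two processes of size s1 = 10 pages and s2 = 127 pages (so S = s1 + s2 = 137), each process i receives a_i = (s_i / S) × m frames. Process 1 gets 10/137 × 62 ≈ 4 frames and process 2 gets 127/137 × 62 ≈ 57 frames.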
Priority Allocation
• Program 2
for (i = 0; i < n; i++)
    for (j = 0; j < n; j++)
        A[i][j] = 0;
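The point of Program 2 is locality of reference. A minimal C sketch contrasting the two loop orders (the array size is illustrative; assume the array is stored in row-major order with each row occupying roughly one page):

#define N 128
int A[N][N];

void zero_by_rows(void) {        /* good locality: roughly N page faults */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            A[i][j] = 0;         /* walks memory sequentially */
}

void zero_by_columns(void) {     /* poor locality: up to N*N page faults */
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            A[i][j] = 0;         /* touches a different row (page) each time */
}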
I/O Considerations
A file system is usually associated with a storage device, but it need not make use of one at all. A file system can be used to organize and represent access to any data, whether it is stored or dynamically generated (e.g., from a network connection).
Whether the file system has an underlying storage device or not, file systems
typically have directories which associate file names with files, usually by connecting the
file name to an index into a file allocation table of some sort, such as the FAT in an MS-
DOS file system, or an inode in a Unix-like file system. Directory structures may be flat,
or allow hierarchies where directories may contain subdirectories. In some file systems,
file names are structured, with special syntax for filename extensions and version
numbers. In others, file names are simple strings, and per-file metadata is stored
elsewhere.
Other bookkeeping information is typically associated with each file within a file
system. The length of the data contained in a file may be stored as the number of blocks
allocated for the file or as an exact byte count. The time that the file was last modified
may be stored as the file's timestamp. Some file systems also store the file creation time,
the time it was last accessed, and the time that the file's meta-data was changed. (Note
that many early PC operating systems did not keep track of file times.) Other information
can include the file's device type (e.g., block, character, socket, subdirectory, etc.), its
owner user-ID and group-ID, and its access permission settings (e.g., whether the file is
read-only, executable, etc.).
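On Unix-like systems, much of this per-file bookkeeping can be read with the POSIX stat() call; a minimal C sketch:

#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char *argv[]) {
    struct stat st;
    if (argc < 2 || stat(argv[1], &st) != 0) {
        perror("stat");
        return 1;
    }
    printf("size (bytes):  %lld\n", (long long)st.st_size);
    printf("blocks:        %lld\n", (long long)st.st_blocks);
    printf("owner uid/gid: %d/%d\n", (int)st.st_uid, (int)st.st_gid);
    printf("mode bits:     %o\n",  (unsigned)st.st_mode);
    printf("last modified: %lld\n", (long long)st.st_mtime);
    return 0;
}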
The hierarchical file system was an early research interest of Dennis Ritchie of
Unix fame; previous implementations were restricted to only a few levels, notably the
IBM implementations, even of their early databases like IMS. After the success of Unix,
Ritchie extended the file system concept to every object in his later operating system
developments, such as Plan 9 and Inferno.
Traditional file systems offer facilities to create, move and delete both files and
directories. They lack facilities to create additional links to a directory (hard links in
Unix), rename parent links (".." in Unix-like OS), and create bidirectional links to files.
Traditional file systems also offer facilities to truncate, append to, create, move,
delete and in-place modify files. They do not offer facilities to prepend to or truncate
from the beginning of a file, let alone arbitrary insertion into or deletion from a file. The
operations provided are highly asymmetric and lack the generality to be useful in
unexpected contexts. For example, interprocess pipes in Unix have to be implemented outside of the file system because a pipe must shed data from the beginning as it is read, an operation traditional file systems do not offer.
Secure access to basic file system operations can be based on a scheme of access
control lists or capabilities. Research has shown access control lists to be difficult to
secure properly, which is why research operating systems tend to use capabilities.
Commercial file systems still use access control lists.
File system types can be classified into disk file systems, network file systems and
special purpose file systems.
A disk file system is a file system designed for the storage of files on a data storage
device, most commonly a disk drive, which might be directly or indirectly connected to
the computer. Examples of disk file systems include FAT, FAT32, NTFS, HFS and
HFS+, ext2, ext3, ISO 9660, ODS-5, and UDF. Some disk file systems are journaling file
systems or versioning file systems.
A flash file system is a file system designed for storing files on flash memory
devices. These are becoming more prevalent as the number of mobile devices increases and the capacity of flash memories catches up with that of hard drives.
While a block device layer can emulate a disk drive so that a disk file system can
be used on a flash device, this is suboptimal for several reasons:
• Erasing blocks: Flash memory blocks have to be explicitly erased before
they can be written to. The time taken to erase blocks can be significant,
thus it is beneficial to erase unused blocks while the device is idle.
• Random access: Disk file systems are optimized to avoid disk seeks
whenever possible, due to the high cost of seeking. Flash memory devices
impose no seek latency.
• Wear levelling: Flash memory devices tend to wear out when a single block is repeatedly overwritten; flash file systems are designed to spread writes out evenly (see the sketch below).
Log-structured file systems have all the desirable properties for a flash file system.
Such file systems include JFFS2 and YAFFS.
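As a toy illustration of the wear-levelling idea, a flash translation layer can simply pick the erased block with the lowest erase count whenever it needs a fresh block, so writes spread evenly across the device. A minimal C sketch (all structures are illustrative):

#define NBLOCKS 1024

struct flash_block {
    int erase_count;   /* how many times this block has been erased */
    int free;          /* 1 if the block is erased and writable     */
};

/* Return the least-worn free block, or -1 if none is available. */
int pick_block(struct flash_block blocks[]) {
    int best = -1;
    for (int i = 0; i < NBLOCKS; i++)
        if (blocks[i].free &&
            (best < 0 || blocks[i].erase_count < blocks[best].erase_count))
            best = i;
    return best;
}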
A new concept for file management is the database-based file system. Instead of, or in addition to, hierarchical structured management, files are identified by their characteristics, such as file type, topic, author, or similar metadata. Example: dbfs.
Each disk operation may involve changes to a number of different files and disk
structures. In many cases, these changes are related, meaning that it is important that they
all be executed at the same time. Take for example a bank sending another bank some
money electronically. The bank's computer will "send" the transfer instruction to the
other bank and also update its own records to indicate the transfer has occurred. If for
some reason the computer crashes before it has had a chance to update its own records,
then on reset, there will be no record of the transfer but the bank will be missing some
money.
A journaling (transactional) file system is designed to be fault tolerant against this kind of failure, but may incur additional overhead to do so.
A special purpose file system is basically any file system that is not a disk file
system or network file system. This includes systems where the files are arranged
dynamically by software, intended for such purposes as communication between
computer processes or temporary file space.
Special purpose file systems are most commonly used by file-centric operating
systems such as Unix. Examples include the procfs (/proc) file system used by some Unix
variants, which grants access to information about processes and other operating system
features.
Deep space exploration craft, like Voyager I and II, used digital tape-based special file systems. Most modern space exploration craft, like Cassini-Huygens, use real-time operating system file systems or RTOS-influenced file systems. The Mars rovers are one such example, important in this case because the file systems are implemented in flash memory.
eZ publish can select the siteaccess to use by one of the following access methods:
• URI
• Host
• Port
The following text gives a brief explanation of the different access methods. Please
note that the access methods can be combined. The documentation page of the
"MatchOrder" directive reveals how this can be done.
URI
This is the default setting for the "MatchOrder" directive. When the URI access
method is used, the name of the target siteaccess will be the first parameter that comes
after the "index.php" part of the requested URL. For example, the following URL will
tell eZ publish to use the "admin" siteaccess: http://www.example.com/index.php/admin.
If another siteaccess by the name of "public" exists, then it would be possible to reach it
by pointing the browser to the following address:
http://www.example.com/index.php/public. If the last part of the URL is omitted then the
default siteaccess will be used. The default siteaccess is defined by the "DefaultAccess"
setting within the [SiteSettings] block. The following example shows how to set up
"/settings/override/site.ini.append.php" in order to make eZ publish use the URI access
method and to use a siteaccess called "public" by default:
...
[SiteSettings]
DefaultAccess=public
[SiteAccessSettings]
MatchOrder=uri
...
The URI access method is typically useful for testing / demonstration purposes. In
addition it is quite handy because it doesn't require any configuration of the web server
and the DNS server.
Host
The host access method makes it possible to map different hostnames to different siteaccesses. The following example shows how to set up "/settings/override/site.ini.append.php" in order to make eZ publish use the host access method with a host-to-siteaccess map:
...
[SiteAccessSettings]
MatchOrder=host
HostMatchType=map
HostMatchMapItems[]=www.example.com;public
HostMatchMapItems[]=admin.example.com;admin
...
The example above tells eZ publish to use the "public" siteaccess if the requested
URL starts with "www.example.com". In other words, the configuration files in
"/settings/siteaccess/public" will be used. If the requested URL starts with
"admin.example.com", then the admin siteaccess will be used. The example above
demonstrates only a fragment of the host matching capabilities of eZ publish. Please refer
to the reference documentation for a full explanation of the "HostMatchType" directive.
Port
The port access method makes it possible to map different ports to different
siteaccesses. This access method requires configuration outside eZ publish. The web
server must be configured to listen to the desired ports (by default, a web server typically
listens for requests on port 80, which is the standard port for HTTP traffic). In addition,
the firewall will most likely have to be opened so that traffic on any additional ports (port 81 in the example below) actually reaches the web server. The following example shows how to set up
"/settings/override/site.ini.append.php" in order to make eZ publish use the port access
method. It also shows how to map different ports to different siteaccesses.
...
[SiteAccessSettings]
MatchOrder=port
[PortAccessSettings]
80=public
81=admin
...
The example above tells eZ publish to use the "public" siteaccess if the requested
URL is sent to the web server using port 80. In other words, the configuration files inside
"/settings/siteaccess/public" will be used. If the URL is requested on port 81 (usually by
appending a :81 to the URL, like this: http://www.example.com:81), then the admin
siteaccess will be used.
Single-Level Directory
In a single-level directory system, all the files are placed in one directory. This is
very common on single-user OS's. A single-level directory has significant limitations,
however, when the number of files increases or when there is more than one user. Since
all files are in the same directory, they must have unique names. If there are two users
who call their data file "test", then the unique-name rule is violated. Although file names are generally selected to reflect the content of the file, they are often quite limited in length. Even with a single user, as the number of files increases, it becomes difficult to remember the names of all the files in order to create only files with unique names.
Two-Level Directory
In the two-level directory system, the system maintains a master block that has one entry for each user. This master block contains the address of each user's directory.
There are still problems with two-level directory structure. This structure
effectively isolates one user from another. This is an advantage when the users are
completely independent, but a disadvantage when the users want to cooperate on some
task and access files of other users. Some systems simply do not allow local files to be
accessed by other users.
Tree-Structured Directories
In the tree-structured directory, the directories themselves are files. This leads to the possibility of having sub-directories that can contain files and further sub-directories. An interesting policy decision in a tree-structured directory is how to handle the deletion of a directory. If a directory is empty, its entry in its containing directory can simply be deleted. However, suppose the directory to be deleted is not empty, but contains several files, or possibly sub-directories. Some systems will not delete a directory unless it is empty.
Thus, to delete a directory, someone must first delete all the files in that directory. If there are any sub-directories, this procedure must be applied recursively to them, so that they can be deleted as well. This approach may result in a substantial amount of work (see the sketch below). An alternative approach is simply to assume that, when a request is made to delete a directory, all of that directory's files and sub-directories are also to be deleted.
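The recursive approach can be sketched with the POSIX directory calls; a minimal C illustration with no error recovery (note how "." and ".." must be skipped to avoid climbing back up the tree):

#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Recursively delete a directory tree: files are unlink()ed,
   sub-directories are recursed into, and each directory is
   rmdir()ed only once it is empty. */
int remove_tree(const char *path) {
    DIR *dir = opendir(path);
    if (!dir)
        return -1;
    struct dirent *e;
    while ((e = readdir(dir)) != NULL) {
        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;                   /* never follow parent links */
        char child[4096];
        snprintf(child, sizeof child, "%s/%s", path, e->d_name);
        struct stat st;
        if (lstat(child, &st) == 0 && S_ISDIR(st.st_mode))
            remove_tree(child);         /* recurse into sub-directory */
        else
            unlink(child);              /* delete a plain file */
    }
    closedir(dir);
    return rmdir(path);                 /* directory is now empty */
}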
Acyclic-Graph Directories
The acyclic-graph directory structure is an extension of the tree-structured directory structure. In the tree-structured directory, the files and directories reachable from some fixed directory are owned by one particular user. In the acyclic-graph structure, this restriction is removed, so a directory or file can be shared by several users.
Mounting a file system associates it with a directory in the existing file system
tree. Prior to mounting, the files, although present on the disk, are not accessible to users;
once mounted, the file system becomes accessible.
The directory in the existing file system where the file is attached is known as the
mount point or mount directory for the new file system, and the files in the added file
system become part of the existing file system hierarchy.
The mount point should be an empty subdirectory on the existing file system. If
you mount a file system on to a directory that already has files in it, those files will be
hidden and inaccessible until you unmount the file system. If you try to mount the file
system on to a directory whose files are in use, the mount will fail.
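On Linux, mounting is exposed to privileged programs through the mount(2) system call; a minimal sketch (the device, mount point, and file system type are illustrative, and the caller needs root privileges):

#include <stdio.h>
#include <sys/mount.h>

int main(void) {
    /* Attach the ext2 file system on /dev/sdb1 at /mnt/data. */
    if (mount("/dev/sdb1", "/mnt/data", "ext2", 0, NULL) != 0) {
        perror("mount");   /* fails, e.g., if the mount point is in use */
        return 1;
    }
    /* ... the files of /dev/sdb1 now appear under /mnt/data ... */
    if (umount("/mnt/data") != 0) {
        perror("umount");
        return 1;
    }
    return 0;
}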
Web-based sharing
Web hosting is also used for file sharing, since it makes it possible to exchange files privately. In small communities, popular files can be distributed very quickly and efficiently. Web hosts are independent of each other; therefore contents are not distributed further. Other terms for this are one-click hosting and web-based sharing.
File sharing has been a feature of mainframe and multi-user computer systems for
many years. With the advent of the Internet, a file transfer system called the File Transfer
Protocol (FTP) has become widely used. FTP can be used to access (read and possibly
write to) files shared among a particular set of users with a password to gain access to
files shared from an FTP server site. Many FTP sites offer public file sharing or at least
the ability to view or copy files by downloading them, using a public password (which
happens to be "anonymous"). Most Web site developers use FTP to upload new or
revised Web files to a Web server, and indeed the World Wide Web itself can be thought
of as large-scale file sharing in which requested pages or files are constantly being
downloaded or copied down to the Web user.
More usually, however, file sharing implies a system in which users write to as
well as read files or in which users are allotted some amount of space for personal files
on a common server, giving access to other users as they see fit. The latter kind of file
sharing is common in schools and universities. File sharing can be viewed as part of file
systems and their management.
Any multi-user operating system will provide some form of file sharing. Among
the best known network file systems is (not surprisingly) the Network File System (NFS).
Originally developed by Sun Microsystems for its Unix-based systems, it lets you read
and, assuming you have permission, write to sharable files as though they were on your
own personal computer. Files can also be shared in file systems distributed over different
points in a network. File sharing is involved in groupware and a number of other types of
applications.
An operating system's file system structure is its most basic level of organization.
Almost all of the ways an operating system interacts with its users, applications, and
security model are dependent upon the way it stores its files on a storage device. It is
crucial for a variety of reasons that users, as well as programs, be able to refer to a
common guideline to know where to read and write files.
A file system can be seen in terms of two different logical categories of files:
• Shareable vs. unsharable files
• Variable vs. static files
Shareable files are those that can be accessed by various hosts; unsharable files are not
available to any other hosts. Variable files can change at any time without any
intervention; static files, such as read-only documentation and binaries, do not change
without an action from the system administrator or an agent that the system administrator
has placed in motion to accomplish that task.
4.14 File system Implementation
A file is a meaningful set of data stored on a non-volatile medium; it must be mapped to physical storage, typically a disk.
File control block: all the information about a file, gathered from the directory information when the file is opened and stored in the "open file table" (see below).
Directory Structure (Stallings Table 12.2, p. 537, and also p. 481 Bottom):
The directory maps symbolic names to logical or physical disk addresses. When the file is "opened", the directory structure is searched for the file's information, which is copied to the "open file table". The key information is a pointer to the file itself. The open file table is faster to access than the directory, which must be searched. An index into this table, pointing to the directory information of the opened file (the file control block), is returned to the user as a file descriptor in UNIX or a handle in some other OS's. The index is faster to use than searching the original directory.
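The relationship between the directory, the file control block, and the open file table can be pictured with a couple of C structures (a simplified sketch; the field names are illustrative, not taken from any particular OS):

#define MAX_OPEN 64

struct fcb {                   /* file control block: filled in from */
    char     name[32];         /* the directory entry when the file  */
    unsigned owner, perms;     /* is opened                          */
    unsigned size_bytes;
    unsigned first_block;      /* where the file's data lives on disk */
};

struct open_file_table {
    struct fcb entries[MAX_OPEN];
    int        in_use[MAX_OPEN];
} oft;

/* open() searches the directory once, copies the file's information
   into a free slot, and returns the slot index: the UNIX file
   descriptor. Later reads and writes use the index directly instead
   of searching the directory again. */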
The layers of a file system implementation, from the application down to the device:
• Application
• Logical file system: uses the directory structure and symbolic names; its output is a logical block address.
• File organization module: translates a logical block address into a physical address; free-space management is done here.
• I/O control
o Device driver: transfers data between memory and disk. Its input is high-level commands, and its output is low-level hardware commands to the device controller.
o Device controller: the hardware interface between the device driver (OS) and the device itself, typically an adapter card on a bus on the motherboard, containing the low-level logic that controls the device.