This is a C/C++, UNIX-based project for learning about file systems and their
structure.
Submission
Due Date: Saturday, 5/11/12 (11:59:00 pm ET)*
LATE PROJECTS WILL RESULT IN a 0. NO EXCEPTIONS.
*See extra credit section for more details

This submission should be done through the MyCourses dropbox. Note that the dropbox will not
build your files upon submission, so make sure your project works: if I cannot compile it, you will not
get credit. I will simply extract your project to its own folder and run make. Project
submissions must be archived in .tgz format. If you use 'gmakemake', this can be
done with 'make archive.tgz'.
Please follow the file and program name requirements in this document and include all
the files necessary to compile your program.
Attribution and Documentation
This is an individual effort project. You must do all the work yourself without assistance
from other students.
You may find things online or in books that help you in certain areas; you need to
document your sources. This means you must acknowledge contributions from
these other sources. I expect that you will provide comments in each file containing
information about the program, including all contributing sources.
Style, Documentation and Formatting
All functions, classes, and files must have comments; all parameters must be defined in
the comments section. I am not terribly concerned with the format of your code, though
it should be readable, variable names should make sense, and the code should be
documented. If you still aren't sure if your code is properly formatted, ask me. Be sure
to include your name in ALL source files.
Grading
This project is worth 100 points.
It is possible to earn up to 20 points extra credit on this assignment; details are below.
Project Resource Materials
There are several resources on the internet you can use for this project. I suggest using
gmakemake to create your makefile. This should be self-explanatory, and will greatly
assist your efforts.
Project Description
Overview
This project will take the shell created in your first project and expand upon it. You will
be adding a file system in which you can create, delete and edit files. Files will be able
to be copied within your file system, and to/from the native file system. File system
management and performance of the system is of utmost importance.
Program I/O
A user starts a shell created by the student for project 1. Command line parameters will
be used to open an existing file system or create a new one. The file system structure
details will be given below, but users should be able to manage files on this system in
much the same way that they manage files on the native file system.
Requirements for the program are:
Users can create a file system by giving a file system name on the command line
when starting the shell
o If the file system does not exist, users will be prompted to create it, and
the user will be placed in the root of that file system
o If the file system already exists, it will be opened for the user, and the
user will be placed in the root of that file system
The maximum size for the file system is 50MB
The minimum size for the file system is 5MB
The minimum size for a cluster is 8KB
The maximum size for a cluster is 16KB
Cluster sizes should be whole multiples of 1KB (8K, 9K, 10K, etc), such that they
are divisible by 128
Users can execute the following commands on your system, and all functions
should work between the native and user file system, and any mixture of the two
o ls - display a list of files in the current directory
o touch - create a new 0 byte file
o cp - copy a file from <source> to <destination>
o mv - move a file from <source> to <destination>
o rm - remove a file
o df - show the structure of the file system
o cat - display the contents of a file
o printFAT - prints the FAT in a user readable format
o printDT - prints the directory table in a user readable format
The mount point for your file system is: /<filesys name>
o Ex: if I run os1shell myfs, I should be able to access the file system at
/myfs
o Ex: if I run os1shell disk1, I should be able to access the file system at
/disk1

Details
File System
A filesystem in a unix system can be implemented in many different ways. Linux, for
example, has the ability to read and write nearly any file system on the market, and is
capable of reading every commonly used file system. Each of these file system
implementations is usually implemented in the kernel level code (often through a
module), and the file I/O is then dealt with through an API call. In order to increase your
understanding of how file systems are implemented, you will be creating a file system
that is handled entirely within the shell, since you will not have access to implement a
kernel level file system.
There are many types of file systems supported by systems today, though we will be
focusing on a FAT like file system for this project. A FAT file system is composed of a
File Allocation Table. The FAT32 file system is a common system used on many
devices these days, including digital cameras. While the standard FAT16 and FAT32
are much more complicated than what you will need to implement, the basic structure
will be used in this particular project.
In a FAT file system, there is a structure that maps files to the system's hard disk,
indicates when clusters on the hard drive are bad, and manages the copying, deleting
and moving of files and the access of applications to the system.
The original FAT File system was FAT12, and was used on floppy disks and hard drives
up to 15MB in size. The FAT16 file system allowed for access to 2GB and FAT32 allows
up to 2TB drives. In any of these systems, the FAT table is a fixed size, and the cluster
size determines the maximum size of a disk. For example, if we have 2000 records for
files, and each cluster is 1KB, we can have a 2MB drive. If each cluster is 8KB, we could
have a 16MB drive.
When creating a file system, there is a compromise between the cluster size and
performance. A large cluster size means that performance will be faster, as fewer
lookups will need to occur (just because FAT16 supported 2GB did not mean that every
drive was 2GB; a 500MB drive would have the same number of FAT entries as a 2GB drive
with the same cluster size). A larger cluster size is not always ideal though, as only one file
can occupy a cluster at a time. If you have 8KB clusters, and are writing 1KB files,
there will be 7KB of space in each cluster that isn't used. This is a tradeoff that must be made when
selecting cluster sizes, and is something you will need to consider when you create your
file system. If a file is larger than a cluster, the FAT entry for each of its clusters records a
link to the next cluster used by the file, which may or may not be the next cluster on the
system.
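The space tradeoff described above can be put into numbers. The following is a minimal sketch; the helper names clusters_needed and slack_bytes are hypothetical and not part of the assignment, and a 0 byte file is assumed to need no data clusters.

```c
#include <assert.h>

/* Number of clusters needed to hold file_size bytes (hypothetical helper).
 * In this sketch a 0 byte file needs no data clusters. */
unsigned int clusters_needed(unsigned int file_size, unsigned int cluster_size)
{
    assert(cluster_size > 0);
    return file_size / cluster_size + (file_size % cluster_size != 0);
}

/* Bytes that are allocated to the file but not used by its data. */
unsigned int slack_bytes(unsigned int file_size, unsigned int cluster_size)
{
    return clusters_needed(file_size, cluster_size) * cluster_size - file_size;
}
```

For a 1KB file on 8KB clusters, slack_bytes(1024, 8192) gives 7168, which is the 7KB of unused space mentioned in the text.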
A final point to consider when creating file systems is how to handle directories or
folders. Directories are typically a hierarchical structure of virtual files that house files
and other directories. In the FAT file systems, directories are special entries within the
FAT table that give information about directories, including permissions, access rights,
and last modified and access times. In other words, directories are just special cases of
files, but remember, every file lives in a directory. The FAT file system has changed how
it handles root file systems throughout the years, but the basic concept has remained the
same.
For more detailed information about the FAT system, and for ideas about how to
implement your project, the Wikipedia page on FAT file systems is quite comprehensive:
http://en.wikipedia.org/wiki/File_Allocation_Table
The Program
You will be required to implement a file system for the shell you created in Project 1.
Since the file system will reside entirely inside the shell, it will be a simplified version of a
FAT16 system. You will need to support files and directories and management of those
files and directories.
The os1shell created in project 1 should now take a command line parameter, which will
be the name of the file system to open. If the file system does not exist, the user should
be prompted for several parameters before the shell is opened, in the following manner:
Are you sure you want to create a new file system [Y]?
Enter the maximum size for this file system in MB [10]:
Enter the cluster size for this file system in KB [8]:
os1shell -> _
If the user specifies all the parameters for creating a file system, it will be created on the
parent file system where it can be accessed from your shell.
Note: When creating the file system, keep in mind that the entire FAT table must fit inside
a single cluster. Since we have 1 record in our FAT for each cluster, and it is an array of
integers, it should be trivial to find out if the FAT table will fit; if it won't, you should
not allow the user to create the file system.
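The check described in the note can be sketched as follows. This is only an illustration under stated assumptions: the function name fs_params_ok is hypothetical, 1MB is taken to be 1024*1024 bytes and 1KB to be 1024 bytes, and each FAT record is one unsigned int.

```c
#include <stddef.h>

#define KB 1024u
#define MB (1024u * 1024u)

/* Returns 1 if the requested parameters satisfy the limits in this writeup,
 * 0 otherwise. Sizes are given the way the prompts ask for them:
 * the file system size in MB and the cluster size in KB. */
int fs_params_ok(unsigned int size_mb, unsigned int cluster_kb)
{
    if (size_mb < 5 || size_mb > 50)
        return 0;                       /* file system must be 5MB to 50MB */
    if (cluster_kb < 8 || cluster_kb > 16)
        return 0;                       /* cluster must be 8KB to 16KB */

    unsigned int cluster_size = cluster_kb * KB;
    unsigned int cluster_count = (size_mb * MB) / cluster_size;

    /* One unsigned int FAT record per cluster; the whole table
     * must fit inside a single cluster. */
    size_t fat_bytes = (size_t)cluster_count * sizeof(unsigned int);
    return fat_bytes <= cluster_size;
}
```

With 4 byte integers, the default 10MB file system with 8KB clusters has 1280 clusters and a 5120 byte FAT, which fits. A 50MB file system with 8KB clusters would need a 25600 byte FAT, which does not fit in one cluster, so it would be rejected.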

Once the File System has been created, it should consist of several pieces in order to be
able to access the system.
The first thing you need is a Boot Record. This should reside at address 0 in your file
system, and contain information such as the cluster size, the disc size and the location
of the root directory table entry. Based on this structure, the first cluster will always be
allocated to the structure of the file system.
Type          Name            Description
Unsigned int  Cluster Size    The size of the cluster you specified when creating the file system, in bytes
Unsigned int  Size            The size of the disc in bytes
Unsigned int  Root Directory  Index to the cluster that stores the root directory
Unsigned int  FAT             Index to the cluster containing the FAT Table
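As an illustration, the four boot record fields could be held in a struct and written to and read from address 0 of the file system. This is a sketch, not a required layout helper: the type name boot_record, its field names, and the two function names are hypothetical, and the record is stored in the machine's native byte order with no padding assumed between the four unsigned ints.

```c
#include <stdio.h>

/* Boot record fields, in the order listed in the table.
 * The struct and field names are illustrative, not mandated. */
typedef struct
{
    unsigned int cluster_size;  /* bytes per cluster */
    unsigned int size;          /* size of the disc in bytes */
    unsigned int root_dir;      /* cluster index of the root directory table */
    unsigned int fat;           /* cluster index of the FAT */
} boot_record;

/* Write the boot record at address 0 of an open file system image.
 * Returns 0 on success, -1 on failure. */
int write_boot_record(FILE *fp, const boot_record *br)
{
    if (fseek(fp, 0, SEEK_SET) != 0)
        return -1;
    return fwrite(br, sizeof *br, 1, fp) == 1 ? 0 : -1;
}

/* Read the boot record from address 0 of an open file system image.
 * Returns 0 on success, -1 on failure. */
int read_boot_record(FILE *fp, boot_record *br)
{
    if (fseek(fp, 0, SEEK_SET) != 0)
        return -1;
    return fread(br, sizeof *br, 1, fp) == 1 ? 0 : -1;
}
```

Using named fields avoids having to remember which array position holds which value.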

The second thing you will need is a directory table. The directory table stores the
information about files and directories in your file system. This should be stored in the
first available cluster, and as long as there is free space available in the file system, it
can grow in the same way as a file. You should note that each directory table entry is
128 bytes large, and that the cluster size is always a multiple of 128 bytes. For performance reasons, it
makes sense to always fill a cluster with directory entries and mark them as
available for use if they aren't used. This will make reading and writing of the
directory tables much easier, and is a requirement for the project.
The structure would then consist of multiple entries that look like the following:
Type          Name      Length  Description
char          Name      112     Filename
                                The first byte can have a special value:
                                  Value  Description
                                  0x00   This entry is available for use
                                  0xFF   This entry is deleted
Unsigned int  Index             The index of first cluster for this file or the location of a directory structure
Unsigned int  Size              The size of the file in bytes (0 for directories)
Unsigned int  Type              File attributes:
                                  Value   Description
                                  0x0000  File
                                  0xFFFF  Directory
Unsigned int  Creation          The creation date of the file (unix epoch format)

A properly defined directory table will be defined as:


directory_entry *directory_table;
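The special first-byte values can be tested as sketched below. The helper names entry_is_free and find_free_entry are hypothetical. One detail is easy to miss: plain char may be signed, so the first byte is cast to unsigned char before it is compared with 0xFF.

```c
#include <stddef.h>

typedef struct
{
    char name[112];
    unsigned int index;
    unsigned int size;
    unsigned int type;
    unsigned int creation;
} directory_entry;

/* An entry can be reused if the first byte of its name is 0x00 (available)
 * or 0xFF (deleted). The cast matters: where plain char is signed, name[0]
 * holds -1 rather than 255 and would never compare equal to 0xFF. */
int entry_is_free(const directory_entry *e)
{
    unsigned char first = (unsigned char)e->name[0];
    return first == 0x00 || first == 0xFF;
}

/* Return the index of the first reusable entry among count entries,
 * or -1 if every entry is in use. */
int find_free_entry(const directory_entry *table, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (entry_is_free(&table[i]))
            return (int)i;
    return -1;
}
```

On a platform with 4 byte unsigned ints this struct is 112 + 4*4 = 128 bytes, so an 8KB cluster holds 64 entries.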

Aside from the directory table, you will need a File Allocation Table, which consists of an
array: the index in the array corresponds to a cluster on the disk, and the value stored at that index is
a pointer to the next cluster containing data.
int FileAllocationTable[ClusterCount];
FileAllocationTable[i] is the reference to the next cluster in the series, and i is the index
of the cluster on the disk we are looking at.
Some common values for FileAllocationTable[i] are:
0x0000           a free cluster
0x0001 - 0xFFFD  the index of the next cluster in the chain
0xFFFE           a reserved cluster, not to be used by the file system (DO NOT USE)
0xFFFF           the last cluster in the chain
Given the above two items, we can read or write files with the simple code:
currentCluster = directory_table[i].index;
while (currentCluster != 0xFFFF)
{
    DataLocation = currentCluster;
    // handle reading and writing of the cluster at DataLocation here
    // (this includes the last cluster, whose FAT entry is 0xFFFF)
    currentCluster = FileAllocationTable[currentCluster];
}
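A self-contained sketch of following a cluster chain is given below. It visits every cluster of the file, including the final one whose FAT entry is 0xFFFF, and it stops with an error if the chain leaves the table or runs longer than expected (for example because of a loop in a damaged FAT). The function name collect_chain and its parameters are hypothetical.

```c
#include <stddef.h>

#define FAT_END 0xFFFFu

/* Walk the chain that starts at cluster `first` and store each visited
 * cluster index in out[] (at most max entries).
 * Returns the number of clusters in the chain, or -1 if an index is out
 * of range or the chain is longer than max. */
int collect_chain(const unsigned int *fat, size_t cluster_count,
                  unsigned int first, unsigned int *out, size_t max)
{
    size_t n = 0;
    unsigned int cur = first;

    while (cur != FAT_END) {
        if (cur >= cluster_count || n >= max)
            return -1;
        out[n++] = cur;      /* the data for this step lives in cluster cur */
        cur = fat[cur];      /* follow the link to the next cluster */
    }
    return (int)n;
}
```

The byte offset of each collected cluster within the file system is the cluster index multiplied by the cluster size from the boot record.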

You can keep track of free clusters by setting FileAllocationTable[currentCluster] =
0x0000. Doing so indicates there is no data and that it can be used for data storage. It
is critical that you keep track of where the directory table and FAT are stored (this should
be stored in your file system) and ensure that you don't overwrite them. As shown here,
the cluster count should be 0 indexed. The boot record should always be
stored in cluster 0.
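One possible way to hand out and release clusters using the 0x0000 marker is sketched below. The function names alloc_cluster and free_chain are hypothetical, and returning -1 when the disk is full is a choice made for this example. Cluster 0 is never handed out because it holds the boot record; the FAT and directory table clusters are protected simply by having non-zero FAT entries.

```c
#include <stddef.h>

#define FAT_FREE 0x0000u
#define FAT_END  0xFFFFu

/* Find a free cluster, mark it as the end of a chain, and return its index.
 * Returns -1 if no cluster is free. Cluster 0 is skipped because it always
 * holds the boot record. */
int alloc_cluster(unsigned int *fat, size_t cluster_count)
{
    for (size_t i = 1; i < cluster_count; i++) {
        if (fat[i] == FAT_FREE) {
            fat[i] = FAT_END;
            return (int)i;
        }
    }
    return -1;
}

/* Mark every cluster in the chain starting at `first` as free.
 * Stops at the end marker, at an out-of-range index, or at a cluster
 * that is already free. */
void free_chain(unsigned int *fat, size_t cluster_count, unsigned int first)
{
    unsigned int cur = first;

    while (cur != 0 && cur != FAT_END && cur < cluster_count
           && fat[cur] != FAT_FREE) {
        unsigned int next = fat[cur];
        fat[cur] = FAT_FREE;
        cur = next;
    }
}
```

To grow a file, allocate a new cluster and store its index in the FAT entry of the file's current last cluster. Remember that the changed FAT still has to be written back to disk.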
Both the FAT and the Directory_Table must be read and written to disk as files. They
should occupy clusters in the same manner as regular files, which means your file I/O
routines will have to handle periodic reading and writing of these entries.

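[Figure: example layout of clusters on disk. The labelled clusters are BR, FAT1, Root 1, Root 2, and data clusters for files F1, F2 and F3; the clusters belonging to one file or to the root directory table are not necessarily adjacent.]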

BR = Boot Record
Root - The Root File System, this can exist anywhere on the disk
F(x) Data - indicates data for a file on the system
You will be required to print the information in the FAT table, and the Directory table in a
user readable format. It should be clear how the clusters are linked, and what
information is in the directory table. The format of this output is up to you.
Data Types
And now for a note on data types: In this project, we are using unsigned ints for our
integer variables, which should, in theory, allow access to a 4TB file system. Each
integer on a 32 or 64 bit platform is 32 bits. Since there are 8 bits in a byte, an integer
takes 4 bytes.
The hexadecimal number 0xF is the same as 15 in decimal, or 1111 in binary, or 4 bits.
Since a single byte is 8 bits, this yields 1111 1111 in binary, or 255 in decimal, or 0xFF
in hex. It is for this reason we can use 0xFF to check the first byte of the file name: they
are, for all intents and purposes, the exact same thing in memory (a single character
takes up one byte of memory).
By now it should be clear that 0xFFFF is only 16 bits (4x4), and we are working with
32-bit numbers (unsigned ints). The reason for this is two-fold. First, you have a
maximum file system size of 50MB, so using 32 bits is overkill. Even 16 bits is overkill, but it's a
start. The second reason is that 0xFFFF is a lot more convenient to type and talk about than
0xFFFFFFFF, so for all of our data types, we are effectively only using the short or word
part of the number. As long as you keep your numbers consistent with this writeup, you
should be good.

Commands
When you first start your file system with the command os1shell myfs, you will have
created a file system called myfs. Your mountpoint for this file system is /myfs. When you
first start your shell, you should be in the file system, so an ls command will give a
directory listing from the file system.
The cd functionality should work, and a valid command would be cd /home/fac/jsb. This
is usually implemented by storing the current path in a local variable and passing that to
functions such as ls, cat, touch and rm when calling execvp. It can also be
implemented by using the chdir() function, but then you will need to keep track of the file
system you are in.
If I have executed a cd to change to a path out of my file system, I should be able to
return to the file system by executing cd /foo where foo is the name of the file system
specified on the command line. Commands should then be executed in the local file
system.
Commands such as ls, rm, cat and touch should work in the current path, as well as take
a path as a parameter. For example, if I give the command ls and I am in the /myfs
directory, I should get a directory listing of my file system. If I execute cd /tmp and
perform an ls, I should get a directory listing in tmp. I should also be able to do ls /etc to
get a directory listing of /etc.
Copy and move should support absolute and relative paths. The following sequence of
commands is valid:
cd /myfs
cp /home/fac/jsb/a.txt a.txt // copy from parent fs to local fs (cur dir is local fs)
cp /home/fac/jsb/b.txt /myfs/b.txt // copy from parent fs to local fs
cp /myfs/a.txt /tmp/a.txt // copy local fs to parent fs
cp /myfs/a.txt /myfs/c.txt // copy a file on the local file system
cp a.txt d.txt // copy a file on the local file system (cur dir is local fs)
cd /tmp
cp /myfs/b.txt t.txt // copy a file on the local fs to /tmp (cur dir is /tmp)
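Each command has to decide, for every path it is given, whether the path lives in your file system or in the parent file system. A minimal sketch of that decision follows. The function name path_in_local_fs is hypothetical, the current directory is assumed to be kept as an absolute path, and "." and ".." components are not handled.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if path refers to something under the mount point /<fsname>,
 * 0 if it belongs to the parent (native) file system.
 * Relative paths are resolved against cwd, which must be absolute.
 * "." and ".." components are not handled in this sketch. */
int path_in_local_fs(const char *path, const char *cwd, const char *fsname)
{
    char full[4096];
    char mount[256];

    if (path[0] == '/')
        snprintf(full, sizeof full, "%s", path);
    else
        snprintf(full, sizeof full, "%s/%s", cwd, path);

    snprintf(mount, sizeof mount, "/%s", fsname);
    size_t n = strlen(mount);

    /* Match "/myfs" and "/myfs/...", but not "/myfs2". */
    return strncmp(full, mount, n) == 0
        && (full[n] == '/' || full[n] == '\0');
}
```

For cp, calling this once for the source and once for the destination selects one of the four cases shown in the command list: parent to local, local to parent, local to local, or parent to parent.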
Sample Code
This code is given as a sample. The boot record should probably be created as a struct in
order to use names, the directory table code does not properly handle spanned clusters, and
it is hard coded for 8k clusters. That being said, this is a good starting place, as it will
allow you to read/write information from the file system.
// the boot record: cluster size, disc size, root directory index, FAT index
unsigned int bootrecord[4];
// a directory table entry
typedef struct
{
    char name[112];
    unsigned int index;
    unsigned int size;
    unsigned int type;
    unsigned int creation;
} directory_entry;
// read the directory table and boot record from 1 8k cluster
FILE *fp = fopen("myfs", "rb");   // needs <stdio.h>; check fp for NULL before use
fseek(fp, 0, SEEK_SET);
fread(bootrecord, sizeof(unsigned int)*4, 1, fp);
// seek to (root directory cluster index) * (cluster size in bytes)
fseek(fp, bootrecord[2]*bootrecord[0], SEEK_SET);
directory_entry *my_dir_table;
my_dir_table = (directory_entry *)malloc(8192);   // needs <stdlib.h>
fread(my_dir_table, 8192, 1, fp);

YOU WILL BE REQUIRED TO BE ABLE TO READ AND WRITE ANOTHER
STUDENT'S FILE SYSTEM. FAILURE TO DO SO WILL RESULT IN A LOSS OF
POINTS!
Extra Credit
Most file systems allow users to create directories to house and organize files. In the
above scenario, we only require one directory, and every file gets added to the same
directory table. While this is a solution for this project, it is not an ideal solution for a real
file system. Implementing directories is not terribly difficult and is simply done by
creating a directory table that has a link in it to another directory table (instead of a file).
This new entry will have a link back to the parent directory as well as any files that reside
in the new directory. Moving files between directories means that an entry will need to
be removed from one directory table and added to another. When removing an item
from a directory table, it is a matter of setting the first byte of the entry to 0x00 to
indicate it is free to use. New entries should check for entries with a 0x00 or 0xFF first byte
before creating a new one.
Directories, if implemented, must be done entirely within the confines of the project. No
new tables may be created to store information, and the user should be able to move, copy,
cat, etc, any file within a directory. rm must also function and provide the ability to
remove a directory, or rmdir must be implemented. If directories are to be implemented,
wildcard support should also be implemented for file management (cp /directory/*
/home/mydir/code/). This will be worth up to 10 points.
The second area to earn extra credit is by turning the project in early. As we will be
approaching the end of the term, and I will need time to grade the projects, anything that
is turned in more than a week early will earn 10 points of extra credit. If you decide that
you want to make a change after the early deadline and submit again, you will forfeit any
extra points you have earned for turning it in early.

FAQ
Q: Holy crap, where do I start?
A: That's kind of a broad question, but I'll do my best to answer. First, take your project
1 and rip out the signal handling, and history (if it doesn't work), and then see what you
can do about reading the sample file system provided on myCourses. It's not complete,
but you will be able to try basic read operations, which is pretty much the foundation.
Q: Do I need to use read() for this project?
A: No, you can use any input method you want. You don't need the 64 character limit
either, but you don't need to remove it.
Q: Do all external paths start with /home?
A: No, they all start with /. I use /home a lot because it's convenient.
Q: Do I need to be able to read and write other students' file systems?
A: Definitely! Make sure you follow the spec exactly. If you don't, chances of you being
able to read a stock file system are small, and you'll lose points. You are welcome and
encouraged to share your file system (the created file system, not your code) with other
students.
Q: Do I need to support any other commands other than the ones listed in the writeup?
A: Yes. Anything that I type that isn't in the list above should be run with fork()/exec(), the same
as project 1.
Q: What should the output of printFAT look like? There is a lot of data to show.
A: Yes, it's a lot of data, but it's more of a debug tool than anything. I recommend you
have a list of: FAT[0] = 65535 FAT[1] = 0 FAT[2] = 23, etc
Q: What should the output of printDT look like?
A: I recommend you do much the same as printFAT, just dump the data structures
dt[0].name = foo
dt[0].index = 15
dt[0].size = 5
dt[0].type = 0
dt[0].creation = 29389810

Q: What should the output of df look like?
A: I recommend you make it look like the linux df:
Filesystem  8K-blocks  Used  Available  Use%  Mounted on
Myfs        1280       15    1276       1%    /myfs
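The numbers for a df-style line can be derived from the FAT alone, as in the sketch below. The function names count_used and format_df_row are hypothetical, any cluster whose FAT entry is not 0x0000 is counted as used, and the percentage is rounded down; all of these are choices made for this example, since the exact format is up to you.

```c
#include <stdio.h>
#include <stddef.h>

/* Count the clusters whose FAT entry is not 0x0000 (free). */
size_t count_used(const unsigned int *fat, size_t cluster_count)
{
    size_t used = 0;
    for (size_t i = 0; i < cluster_count; i++)
        if (fat[i] != 0x0000u)
            used++;
    return used;
}

/* Format one df-style data row into buf: name, total clusters, used,
 * available, percentage used (rounded down) and mount point.
 * Returns the value returned by snprintf. */
int format_df_row(char *buf, size_t len, const char *fsname,
                  const unsigned int *fat, size_t cluster_count)
{
    size_t used = count_used(fat, cluster_count);
    size_t avail = cluster_count - used;
    size_t pct = cluster_count ? used * 100 / cluster_count : 0;

    return snprintf(buf, len, "%-12s %9zu %6zu %9zu %4zu%% /%s",
                    fsname, cluster_count, used, avail, pct, fsname);
}
```

The header line naming the block size (8K-blocks in the sample) can be printed from the cluster size stored in the boot record.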

Q: My program is segfaulting a lot, can you help me?
A: Probably, but before I do, a few hints. Segfaults are rarely caused where your
program crashes, but usually in some place where you called malloc on something and
screwed up the size. Double check all your memory allocations to make sure they are
correct.
Q: Pointers are confusing
A: That's not a question


____ : (12 points) Creating a file system that is divided into
clusters, with a size specified on the command line. You are able to
open an existing file system by specifying its name on the command line
____ : (4 points) creation of a 0 byte file using touch
____ : (8 points) copying/moving a file to and from the parent file
system. Copying within the current file system.
____ : (8 points) correctly setting the deleted flag in the FAT table
and not showing those entries, using rm
____ : (12 points) Correctly reading the FAT table and allowing files
to span multiple, non-adjacent clusters
____ : (4 points) df will show the parameters of the current disk,
including the cluster size and file system size
____ : (6 points) printing of the FAT table, and Directory Table in a
user readable format
____ : (4 points) using correct functions to read/write the file on the
disk
____ : (8 points) proper implementation of cat to output a given file
to standard out.
____ : (8 points) overall functionality. Does the application handle
strange conditions and work as expected?
____ : (6 points) Ability to read and write a standard file system
format
____ : (12 points) design: modular architecture, clearly documented
____ : (8 points) style: consistency and coding standards usage
total is 100 points (before extra credit).
____ : (10 extra points) allow users to create, remove, rename and put
files in sub directories.
____ : (10 extra points) submission of project before 05/04/13 at
11:59pm