
Unix Systems Programming 6th Semester

Standards and File APIs


Standards
• The ISO (International Organization for Standardization) defines standards as
“documented agreements containing technical specifications or other precise
criteria to be used consistently as rules, guidelines or definitions of
characteristics to ensure that materials, products, processes and services are
fit for their purpose”.

• Most official computer standards are set by one of the following organizations:

  • ANSI (American National Standards Institute)
  • ITU (International Telecommunication Union)
  • IEEE (Institute of Electrical and Electronics Engineers)
  • ISO (International Organization for Standardization)
  • VESA (Video Electronics Standards Association)

The ANSI C Standard

• ANSI proposed this standard in 1989, under the name X3.159-1989, to
standardize the constructs and libraries of the C programming language.

Major differences between ANSI C and K & R C

• ANSI C supports function prototyping.
• ANSI C supports the const and volatile data type qualifiers.
• ANSI C supports wide characters and internationalization, and defines the
setlocale function.
• ANSI C permits function pointers to be used without dereferencing.
• ANSI C defines a set of preprocessor symbols.
• ANSI C defines a set of standard library functions and associated headers.

The ANSI / ISO C++ Standard

o C++ is one of the object-oriented programming (OOP) languages.
o It was developed by Bjarne Stroustrup at AT&T Bell Laboratories.
o C++ is an extension of C whose major addition is the class construct,
borrowed from Simula 67.
o The three most important facilities that C++ adds to C are classes,
function overloading and operator overloading.
o In 1990, Margaret Ellis and Bjarne Stroustrup published “The Annotated C++
Reference Manual”; this manual became the base for the draft ANSI C++
standard.

RITM -1- Yogish . H. K.


o The ISO WG21 committee joined the ANSI X3J16 committee to develop a unified
ANSI/ISO C++ standard. A draft version of the ANSI/ISO standard was published
in 1994.

Major Differences between ANSI C and C++

• Function declarations (function prototypes)
• Functions that take a variable number of arguments
• Type-safe linkage and linkage directives

POSIX Standards

• POSIX is an acronym for Portable Operating System Interface.
• POSIX has three subgroups:

POSIX.1:
  • The committee proposes a standard for base operating system APIs.
  • This standard is formally known as IEEE standard 1003.1-1990.
  • It specifies the APIs for file manipulation and for processes (process
    creation and control).
POSIX.1b:
  • The committee proposes a standard for real-time operating system APIs.
  • This standard is formally known as IEEE standard 1003.4-1993.
  • It specifies the APIs for interprocess communication (semaphores, message
    passing and shared memory).
POSIX.1c:
  • The committee proposes a standard for a multithreaded programming
    interface.
  • It specifies the APIs for thread creation, control and cleanup, thread
    scheduling, thread synchronization and signal handling.

• To ensure that a user program conforms to the POSIX.1 standard, define the
manifest constant _POSIX_SOURCE at the beginning of each program (before the
inclusion of any header files):

#define _POSIX_SOURCE

or specify the -D_POSIX_SOURCE option to the C++ compiler during compilation:

$ g++ -D_POSIX_SOURCE filename.cpp

• In general, a user program that must be strictly POSIX.1 and POSIX.1b
compliant may be written as follows:

#define _POSIX_SOURCE
#define _POSIX_C_SOURCE 199309L

#include <iostream>
#include <unistd.h>

int main( )
{
....
}

POSIX Feature Test Macros

Feature Test Macro         Effect if defined on a system

_POSIX_JOB_CONTROL         The system allows multiple jobs (groups of processes)
                           to be started from a single terminal, and controls
                           which jobs can access the terminal and which run in
                           the background. That is, it supports BSD-style job
                           control.
_POSIX_SAVED_IDS           Each process keeps its saved set-UID and set-GID, so
                           that it can change its effective user ID and group ID
                           to those values via the setuid and setgid APIs
                           respectively.
_POSIX_CHOWN_RESTRICTED    If the defined value is -1, users may change
                           ownership of files owned by them. Otherwise, only
                           users with special privilege may change ownership of
                           any file on the system.
_POSIX_NO_TRUNC            If the defined value is -1, any long path name passed
                           to an API is silently truncated to NAME_MAX bytes;
                           otherwise an error is generated.
_POSIX_VDISABLE            If the defined value is -1, there is no disabling
                           character for special characters for any terminal
                           device file; otherwise the value is the disabling
                           character value.

Limits Checking at Compile Time and at Run Time

• The POSIX.1 and POSIX.1b standards specify a number of parameters that
describe capacity limitations of the system.
• Limits are defined in <limits.h>.
• Their names are prefixed with _POSIX_.

sysconf, pathconf and fpathconf


To find out the actual implemented configuration limits:

• System-wide, using sysconf at run time
• On individual objects, using pathconf and fpathconf at run time

#include <unistd.h>

long sysconf(int parameter);
long fpathconf(int fildes, int flimit_name);
long pathconf(const char *path, int flimit_name);

o For pathconf(), the path argument points to the pathname of a file or
directory.
o For fpathconf(), the fildes argument is an open file descriptor.

The POSIX.1 FIPS Standard

• FIPS stands for Federal Information Processing Standard.
• This standard was developed by the National Institute of Standards and
Technology (NIST).
• The latest version of this standard, FIPS 151-1, is based on the
POSIX.1-1988 standard. The FIPS standard is a restriction of the POSIX.1-1988
standard; thus a FIPS 151-1 conforming system is also POSIX.1-1988
conforming, but not vice versa.

• FIPS 151-1 requires the following features to be implemented in all
conforming systems:

_POSIX_JOB_CONTROL         Must be defined.
_POSIX_SAVED_IDS           Must be defined.
_POSIX_CHOWN_RESTRICTED    Must be defined and its value must not be -1; that
                           is, only users with special privilege may change
                           ownership of any file on the system.
_POSIX_NO_TRUNC            Must be defined and its value must not be -1; long
                           path names are not supported (an error is generated
                           instead of truncating).
_POSIX_VDISABLE            Must be defined and its value must not be -1.
NGROUPS_MAX                The symbol's value must be at least 8.

• The read and write APIs should return the number of bytes that have been
transferred after the APIs have been interrupted by a signal.
• The group ID of a newly created file must inherit the group ID of its
containing directory.

Context Switching

• User mode is the normal execution context of any user process; it allows
the process to access only its own data.

• Kernel mode is the protected execution environment that allows a user
process to access the kernel's data in a restricted manner.

• When an API's execution completes, the user process is switched back to
user mode. This context switching on each API call ensures that processes
access the kernel's data in a controlled manner, and minimizes the chance
that a runaway user application damages the entire system. In general,
calling an API is therefore more time-consuming than calling a user function.
Time-critical applications should call system APIs only when necessary.

An API's Common Characteristics

• Most system calls return a special value to indicate that they have failed.
The special value is typically -1, a null pointer, or a constant such as EOF
that is defined for that purpose.

• To find out what kind of error occurred, look at the error code stored in
the variable errno. This variable is declared in the header file errno.h:
    volatile int errno;
  o The variable errno contains the system error number.
    void perror(const char *message);
  o The function perror is declared in stdio.h. It prints message followed by
    a description of the current errno value.

The following table shows some error codes and their meanings:

Error      Meaning
EPERM      The API was aborted because the calling process does not have
           superuser privilege.
EINTR      The API's execution was aborted due to a signal interruption.
EIO        An input/output error occurred during the API's execution.
ENOEXEC    A process could not execute a program via one of the exec APIs.
EBADF      The API was called with an invalid file descriptor.
ECHILD     A process does not have any child process that it can wait on.
EAGAIN     The API was aborted because a system resource it requested was
           temporarily unavailable. The API should be called again later.
ENOMEM     The API was aborted because it could not allocate dynamic memory.
EACCES     The process does not have enough privilege to perform the operation.
EFAULT     A pointer points to an invalid address.
EPIPE      The API attempted to write data to a pipe that has no reader.
ENOENT     An invalid file name was specified to the API.

UNIX / POSIX file Types


The different types of files available in UNIX / POSIX are:

• Regular files. Example: executable files, C and C++ source files, PDF
  documents.
• Directory files. Example: folders in Windows.
• Device files

  o Block device files: a physical device that transmits data one block at a
    time. For example: floppy drives, CD-ROMs, hard disks.

  o Character device files: a physical device that transmits data in a
    character-based manner. For example: line printers, modems.

• FIFO files. Example: pipes.
• Link files

  1. Hard links

     A hard link is a UNIX path name for a file. By default, a file has only
     one hard link.

  2. Symbolic links

     Symbolic links are also called soft links. They are created in the same
     manner as hard links, but require the -s option to the ln command.
     Symbolic links are similar to shortcuts in Windows.

Differences between Hard Links and Symbolic Links

Hard Links                                   Symbolic (Soft) Links

1. Do not create a new inode.                1. Create a new inode.
2. Cannot link directories, except with      2. Can link directories.
   superuser privileges.
3. Cannot link files across file systems.    3. Can link files across file systems.
4. Increase the hard link count.             4. Do not change the hard link count.
5. Always refer to the same inode; the       5. Refer to a path name, so they follow
   file's data remains until the last           whatever file currently has that name
   hard link is removed.                        and dangle if that file is removed.

UNIX Kernel Support for Files / Kernel Data Structures for File Manipulation

• If an open call succeeds, the kernel establishes a path from the process's
file descriptor table to the inode table through the file table.

The steps involved in this process are:

Step 1: The kernel searches the process's file descriptor table for the first
unused entry. If an entry is found, that entry is designated to reference the
file.

Step 2: The kernel scans the file table in its kernel space to find an unused
entry that can be assigned to reference the file.

If an unused entry is found, the following events occur:

o The process's file descriptor table entry is set to point to this file
table entry.
o The file table entry is set to point to the inode table entry where the
inode record of the file is stored.
o The file table entry contains the current file pointer of the open file.
o The file table entry contains the open mode, which specifies whether the
file is open for read-only, write-only, read-write, etc.
o The reference count in the file table entry is set to 1. The reference
count keeps track of how many file descriptors from any process are
referencing the entry.
o The reference count of the in-memory inode of the file is increased by 1.
This count specifies how many file table entries are pointing to that inode.

If either step 1 or step 2 fails, the open function returns -1 (failure
status), and no file descriptor table or file table entry is allocated.

The figure shows a process's file descriptor table, the kernel file table and
the inode table after the process has opened three files: abc for read only,
xyz for read-write, and xyz again for write only.

[Figure: Data Structure of File Manipulation. Three file descriptor table
entries (process space) each point to their own file table entry (kernel
space) with modes r, rw and w and rc=1; these point to the inode table
entries for abc (rc=1) and xyz (rc=2).
Legend: rc = reference count, r = read only, w = write only, rw = read-write.]

The reference count of an allocated file table entry is usually 1, but a process may

When a process calls the close function to close an open file, the following
sequence of events occurs:
1) The kernel marks the corresponding file descriptor table entry as unused.
2) It decrements the reference count in the corresponding file table entry by
1. If the reference count is still non-zero, go to step 6.
3) The file table entry is marked as unused.
4) The reference count in the corresponding inode table entry is decremented
by 1. If the count is still non-zero, go to step 6.
5) If the hard link count of the inode is not zero, it returns to the caller
with a success status. Otherwise, it marks the inode table entry as unused
and deallocates all the physical disk storage of the file.
6) It returns to the calling process with a 0 (success) status.

Regular File APIs

1. open and creat

#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>

int open(const char *path, int access_mode);
int open(const char *path, int access_mode, mode_t permission);
int creat(const char *path, mode_t permission);

The call creat(path, permission) is equivalent to:

open(path, O_WRONLY | O_CREAT | O_TRUNC, permission);

PARAMETERS: path: [in] the path of the file to create or open.
            permission: [in] the permission mask of the new file. It is
                        masked by the umask value: permission & ~umask.
            access_mode: [in] the access mode flags, explained below.

DESCRIPTION: open opens a file. The access_mode parameter must include exactly
             one of the following:

File Access Modes

O_RDONLY : The file is opened for reading.


O_WRONLY : The file is opened for writing.
O_RDWR : The file is opened for both reading and writing.


The following values may be OR-ed together with one of the above access_mode
flags:

O_APPEND   : Appends data to the end of the file.
O_CREAT    : Creates the file if it does not exist.
O_EXCL     : Used with O_CREAT; if the file exists, the call fails. The test
             for existence and the creation of the file, if it does not
             exist, are performed atomically.
O_TRUNC    : If the file exists, discards the file contents and sets the file
             size to zero.
O_NOCTTY   : Specifies not to use the named terminal device file as the
             calling process's controlling terminal.
O_NONBLOCK : Specifies that any subsequent read or write on the file should
             be non-blocking.

The permission argument is required only if the O_CREAT flag is set in the
access_mode argument. It specifies the permissions for owner, group and others.

2. read and write

The read function reads a block of data of up to a given size from a file
referenced by a given file descriptor; write transfers a block of data from a
buffer to the file.

PROTOTYPE : #include <sys/types.h>
            #include <unistd.h>

            ssize_t read(int filedes, void *buffer, size_t size);
            ssize_t write(int filedes, const void *buffer, size_t size);

PARAMETERS : filedes: [in] the file descriptor to read from or write to.
             buffer: for read, [out] the buffer that receives the data;
                     for write, [in] the buffer holding the data to write.
             size: [in] the maximum number of bytes to transfer.
DESCRIPTION : read reads up to size bytes into buffer from filedes.
              write writes up to size bytes from buffer to the file with
              descriptor filedes. The data in buffer is not necessarily a
              character string, and a null character is output like any other
              character.

RETURN VALUE: On success, the number of bytes read or written; read returns 0
              at end of file. On error, -1.

3. close

PROTOTYPE: #include <unistd.h>

int close(int filedes);

PARAMETERS: filedes: [in] the file descriptor to close.

DESCRIPTION: Closes the file descriptor filedes. If filedes is the last file
             descriptor referring to a file, the resources associated with
             that file are deallocated. The locks held on the file by the
             current process are released.
RETURN VALUE: On success, zero is returned. On error, -1 is returned.

Closing a file has the following consequences:

• The file descriptor is deallocated.
• System resources are deallocated, for example file table entries and
  memory buffers allocated to hold read/write file data.
• Any record locks owned by the process on the file are unlocked.
• When all file descriptors associated with a pipe or FIFO have been closed,
  any unread data is discarded.

4. lseek: allows random access to a file.

PROTOTYPE: #include <sys/types.h>
           #include <unistd.h>

           off_t lseek(int filedes, off_t offset, int whence);

PARAMETERS: filedes: [in] the file descriptor to manipulate.
            offset: [in] the offset modifier.
            whence: [in] indicates how to modify the offset.
DESCRIPTION: Changes the read/write file offset of a file descriptor. The
             offset parameter is interpreted according to the value of
             whence:

             SEEK_SET : The new file offset will be offset.
             SEEK_CUR : The new file offset will be the current offset plus
                        offset.
             SEEK_END : The new file offset will be the end of the file plus
                        offset.

RETURN VALUE: On success, the call returns the new file offset where the next
              read or write operation will occur. On error, it returns -1.

5. link: creates an alternative file name for an existing file.

PROTOTYPE: #include <unistd.h>

           int link(const char *oldname, const char *newname);

PARAMETERS: oldname: [in] points to the path of the file we want to add a
                     link to.
            newname: [in] points to the path for the new link.

DESCRIPTION: Creates a new (hard) link to a file. Once created, the links
             cannot be distinguished from one another.

RETURN VALUE: On success, zero is returned. On error, -1.

6. unlink: This is the API used to delete a file.

PROTOTYPE: #include <unistd.h>


int unlink(const char *path);

PARAMETERS: path: [in] points to the path of the file to unlink.


DESCRIPTION: Deletes a link to a file. If the file is not used and it was the last
link, the file is also deleted.

RETURN VALUE: On success, returns zero. On error, returns -1.

7. fstat, stat and lstat: these functions retrieve the attributes of a given
   file.

PROTOTYPE: #include <sys/stat.h>
           #include <unistd.h>

           int fstat(int filedes, struct stat *buf);
           int stat(const char *path, struct stat *buf);
           int lstat(const char *path, struct stat *buf);
PARAMETERS: filedes: [in] the file descriptor we want to get the information from.
            path: [in] the file path we want to get the information from.
            buf: [out] points to the buffer that will contain the information.

DESCRIPTION: These calls return a stat structure in buf with the following
             format:
struct stat
{
    dev_t st_dev;             /* device */
    ino_t st_ino;             /* inode */
    mode_t st_mode;           /* file type and access mode */
    nlink_t st_nlink;         /* number of hard links */
    uid_t st_uid;             /* uid */
    gid_t st_gid;             /* gid */
    dev_t st_rdev;            /* device type */
    off_t st_size;            /* size (in bytes) */
    unsigned long st_blksize; /* block size */
    unsigned long st_blocks;  /* number of allocated blocks */
    time_t st_atime;          /* last access time */
    time_t st_mtime;          /* last modification time */
    time_t st_ctime;          /* last change time */
};

The first argument of fstat is the file descriptor of an open file.
The first argument of stat is a file path name.
The first argument of lstat is also a path name; if it names a symbolic link,
lstat returns the attributes of the link itself rather than of the file it
refers to.
RETURN VALUE: On success, zero is returned. On error, -1.

8. access: checks the existence of a named file, or the calling user's access
   permission to it.

PROTOTYPE : #include <unistd.h>

            int access(const char *path, int how);
PARAMETERS : path: [in] the path of the file whose permission we want to check.
             how: [in] one or more of the constants listed below.
DESCRIPTION : The access function checks whether the file named by path can
              be accessed in the way specified by the how argument. The how
              argument can be either the bitwise OR of the flags R_OK, W_OK
              and X_OK, or the existence test F_OK.

Bit flag   Use

F_OK       Checks whether the named file exists.
R_OK       Checks whether the calling process has read permission.
W_OK       Checks whether the calling process has write permission.
X_OK       Checks whether the calling process has execute permission.

RETURN VALUE: The return value is 0 if the access is permitted and -1
              otherwise. (In other words, treated as a predicate function,
              access returns true if the requested access is denied.)

9. chmod and fchmod : The chmod and fchmod functions change file access
permissions for owner, group and others, as well as
the set-UID, set-GID and sticky flags.

PROTOTYPE : #include <sys/types.h>


#include <unistd.h>
#include <sys/stat.h>

int chmod(const char *path, mode_t permission);


int fchmod(int filedes, mode_t permission);

PARAMETERS : path: [in] points to the path of the file to modify.


filedes: [in] the file descriptor to modify.
permission: [in] the new mode.

DESCRIPTION : chmod changes the mode of the file specified by path to
              permission. fchmod changes the mode of the file referenced by
              the descriptor filedes to permission. The value of permission
              is obtained by OR-ing the permission constants defined in
              <sys/stat.h>, such as S_IRUSR, S_IWUSR, S_IXUSR, S_IRGRP,
              S_IWGRP, S_IXGRP, S_IROTH, S_IWOTH, S_IXOTH, S_ISUID and
              S_ISGID.

RETURN VALUE: On success, zero is returned. On error, -1.


10. chown, fchown and lchown: used to change the ownership of a file.

PROTOTYPE :

#include <unistd.h>
#include <sys/types.h>

int chown(const char *path, uid_t owner, gid_t group);


int fchown(int filedes, uid_t owner, gid_t group);
int lchown(const char *path, uid_t owner, gid_t group);

PARAMETERS : path: [in] points to the path of the file to modify.


filedes: [in] the file descriptor to modify.
owner: [in] the new owner. -1 for no change.
group: [in] the new group. -1 for no change.

DESCRIPTION : chown changes the owner and group of file specified by


path to owner and group.
fchown changes the owner and group of file descriptor
filedes to owner and group.
lchown changes the ownership of the symbolic link file.
The superuser may do whatever he wishes with the owner
and group of a file. The owner of a file may only change
the group of the file to any group he belongs to.

RETURN VALUE : On success zero is returned. On error, -1.

11. utime: used to set the access and modification times of a file.

PROTOTYPE : #include <sys/types.h>
            #include <unistd.h>
            #include <utime.h>

            int utime(const char *path, const struct utimbuf *times);

PARAMETERS : path: [in] points to the path of the file to modify.
             times: [in] the new access time and modification time for the file.
DESCRIPTION : utime sets the access and modification times of the file path.
              The new times are contained in the utimbuf structure *times.
              The utimbuf structure used by utime is:

struct utimbuf {
    time_t actime;  /* access time */
    time_t modtime; /* modification time */
};

If times is NULL, the file's access and modification times are set to the
current time.

RETURN VALUE: On success, returns 0. On error, returns -1.

12. fcntl: This function is used to query or set the access control flags and
    the close-on-exec flag of any file descriptor. It can also be used to
    duplicate file descriptors and for file or record locking.

PROTOTYPE : #include <fcntl.h>

            int fcntl(int filedes, int cmd);
            int fcntl(int filedes, int cmd, long arg);

PARAMETERS : filedes: [in] the file descriptor affected by the call.
             cmd: [in] the operation to apply on the file descriptor.
             arg: [in] an optional argument to the operation.

DESCRIPTION : This call directly applies an operation on a file descriptor.
              The possible operations are:

• F_DUPFD : Duplicates filedes onto the lowest-numbered unused file
            descriptor that is greater than or equal to arg. Unlike dup2, it
            never closes a descriptor that is already open.
• F_GETFD : Returns the close-on-exec flag of filedes.
• F_SETFD : Sets the close-on-exec flag of filedes.
• F_GETFL : Returns the file descriptor access_mode (as specified by open).
• F_SETFL : Sets the file descriptor access_mode to arg. The only access_mode
            flags that can be modified are O_APPEND and O_NONBLOCK.
• F_GETLK : Determines whether the lock described by arg can be set on the
            file. If so, the l_type member is set to F_UNLCK. Otherwise, arg
            is modified to describe the lock preventing the set operation.
• F_SETLK : Sets the lock described by arg on the file, or releases an
            already existing lock.
• F_SETLKW: Same as F_SETLK, but blocks if the lock cannot be set.

When using the F_GETLK, F_SETLK or F_SETLKW commands, the argument is a
pointer to a flock structure.

This structure has the following layout:

struct flock
{
    short l_type;   /* read, write or unlock */
    short l_whence; /* how to interpret l_start */
    off_t l_start;  /* where to begin the locking area */
    off_t l_len;    /* the length of the area to lock */
    pid_t l_pid;    /* the pid of the process holding the lock:
                       returned by F_GETLK */
};

The l_whence member has the same meaning as for lseek.

l_type can take one of the following values:

F_RDLCK : for a shared read lock.
F_WRLCK : for an exclusive write lock.
F_UNLCK : to unlock the region.

The system merges adjacent locking regions of the same type that are owned by
the same process. When a sub-region inside a region is unlocked, the region
is split into two parts.

RETURN VALUE : On success, it depends on the cmd parameter:

F_DUPFD : the new file descriptor.
F_GETFD : the value of the close-on-exec flag.
F_GETFL : the value of the file descriptor access_mode.

On error, the call returns -1.

Directory File APIs

13. mkdir : This API is used for creating a new directory.

PROTOTYPE : #include <sys/stat.h>


#include <unistd.h>

int mkdir(const char *path, mode_t permission);

PARAMETERS : path: [in] points to the path of the new directory.


permission: [in] the access bits of the new directory.

DESCRIPTION : Creates a new directory. The uid of the new directory is the
same as the effective uid of the calling task. The gid of the
new directory is the same as its parent directory.

RETURN VALUE: On success zero is returned. On error, -1.

14. opendir, readdir, closedir and rewinddir

PROTOTYPE : #include <sys/types.h>
            #include <dirent.h>

            DIR *opendir(const char *path);
            struct dirent *readdir(DIR *dir_desc);
            void rewinddir(DIR *dir_desc);
            int closedir(DIR *dir_desc);

PARAMETERS : path: [in] the path name of the directory to open.
             dir_desc: [in] the directory handle returned by opendir.

DESCRIPTION : opendir opens a directory file for reading and returns a handle
              to it.
              readdir returns, in a dirent structure, the next entry of the
              directory, or NULL if the end is reached or an error occurs.
              The area that the returned pointer refers to is static space
              that is overwritten by subsequent calls to readdir.
              rewinddir resets the file pointer to the beginning of the
              directory file referenced by dir_desc. The next call to readdir
              will read the first record from the file.
              closedir closes the directory file referenced by dir_desc.
RETURN VALUE : On success, opendir returns a DIR pointer and readdir returns
               a pointer to a dirent structure; both return NULL on error.
               closedir returns 0 on success and -1 on error.
15. rmdir

PROTOTYPE : int rmdir(const char *path);


PARAMETERS : path: [in] points to path of directory to remove.
DESCRIPTION : Removes a directory. The directory to be removed must be
empty.
RETURN VALUE : On success, returns zero. On error, returns -1.

Device File APIs

16. mknod: this API is used for creating device files.

PROTOTYPE :
#include <sys/stat.h>
#include <unistd.h>

int mknod(const char *path, mode_t mode, dev_t dev);

PARAMETERS : path: [in] points to the path of the new file.


mode: [in] specifies the kind of special file to create.
dev: [in] the major and minor numbers of the new device.
DESCRIPTION : Creates a special device file node. Only processes with
              superuser privileges may use this call. The access bits of the
              new file are those given in mode, masked by the umask of the
              current process (mode & ~umask). If mode does not specify a
              special device file, then dev is ignored.
RETURN VALUE : On success zero is returned. On error, -1 is returned and
errno is set to corresponding error code.

For example, to create a block device file called SCS15 with major and minor
numbers 15 and 3 respectively, and read-write-execute access permissions for
everyone, the mknod call is:

mknod("SCS15", S_IFBLK | S_IRWXU | S_IRWXG | S_IRWXO, (15 << 8) | 3);

17. mkfifo: creates a FIFO file, also known as a named pipe.

PROTOTYPE :
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

int mkfifo(const char * path , mode_t permissions);

PARAMETERS: path: the path name of the FIFO file to be created.
            permissions: access permissions, as specified in the open API,
                         together with the S_IFIFO flag to indicate that this
                         is a FIFO file.

DESCRIPTION: A FIFO ("First In, First Out", pronounced "Fy-Foh") is sometimes
             known as a named pipe. That is, it is like a pipe, except that
             it has a name. In this case, the name is that of a file that
             multiple processes can open, read from and write to.

RETURN VALUE: Returns 0 if it succeeds, or -1 if it fails.

Example: to create a FIFO file called FIFO5 with read-write-execute access
permissions for everyone, the mkfifo call is:

mkfifo("FIFO5", S_IFIFO | S_IRWXU | S_IRWXG | S_IRWXO);

18. Symbolic links: these are similar to shortcuts in Windows.

PROTOTYPE : #include <sys/types.h>
            #include <sys/stat.h>
            #include <unistd.h>

            int symlink(const char *oldname, const char *newname);
            ssize_t readlink(const char *sym_link, char *buf, size_t size);
            int lstat(const char *sym_link, struct stat *statv);

PARAMETERS : oldname: [in] the file to point to.
             newname: [in] the new link.
             sym_link: [in] the path name of a symbolic link file.
             buf: [out] a character buffer that receives the path name
                  referenced by the link.
             size: [in] the maximum capacity, in bytes, of the buf argument.

DESCRIPTION : symlink creates a symbolic link named newname that refers to
              oldname. The destination does not need to exist at the time
              the link is created.
              readlink queries the path name to which a symbolic link refers.
              It does not follow the link, and it does not append a null
              character to buf.
              lstat returns the attributes of the symbolic link itself.

RETURN VALUE : On success, symlink and lstat return zero, and readlink
               returns the number of bytes placed in buf. On error, each
               returns -1 and sets errno to the corresponding error code.

Processes
Process

• A process is a program in execution.
• Each process has a unique PID, a non-negative integer from 0 to PID_MAX.
• Processes are created by the fork() and vfork() system calls.
• Some special PIDs:
    0: scheduler
    1: init
    2: page daemon

Process Termination

There are five ways for a process to terminate.

1. Normal Termination

   i. Returning from main
   ii. Calling exit // performs certain cleanup processing, then returns to the kernel
   iii. Calling _exit // returns to the kernel immediately

2. Abnormal Termination

   i. Calling abort
   ii. Being terminated by a signal

Environment List

• It contains an array of character pointers.
• The address of the array of pointers is contained in the global variable
environ:

extern char **environ;

The environment is represented as an array of strings. Each string is of the
format `name=value'.

Environment Access

• The value of an environment variable can be read with the getenv function
and set with the setenv and putenv functions.

char *getenv(const char *name);
int putenv(char *string);
int setenv(const char *name, const char *value, int replace);
int unsetenv(const char *name);

 The getenv function returns a string that is the value of the environment variable
name.

 The putenv function adds or removes definitions from the environment. If the
string is of the form `name=value', the definition is added to the environment.
Otherwise, the string is interpreted as the name of an environment variable, and
any definition for this variable in the environment is removed.

 The setenv function can be used to add a new definition to the environment. The
entry for name is set to `name=value'. If the
environment already contains an entry with key name, the replace parameter
controls the action. If replace is zero, nothing happens. Otherwise the old entry is
replaced by the new one.

 The unsetenv function is used to remove an entry completely from the
environment. If the environment contains an entry with the key name this whole
entry is removed. A call to this function is equivalent to a call to putenv when the
value part of the string is empty.
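
The effect of the replace parameter can be checked with a short sketch like the one below. The variable name GR_DEMO_VAR and the helper names are invented for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if the variable name currently has exactly the value expected. */
static int env_equals(const char *name, const char *expected)
{
    const char *v = getenv(name);
    return v != NULL && strcmp(v, expected) == 0;
}

/* Walks through setenv with replace == 0 and replace != 0, then unsetenv.
   Returns 0 if every step behaves as described above, -1 otherwise.
   GR_DEMO_VAR is a made-up variable name. */
int env_demo(void)
{
    if (setenv("GR_DEMO_VAR", "first", 1) != 0)
        return -1;
    if (setenv("GR_DEMO_VAR", "second", 0) != 0)  /* replace == 0: no change */
        return -1;
    if (!env_equals("GR_DEMO_VAR", "first"))
        return -1;
    if (setenv("GR_DEMO_VAR", "second", 1) != 0)  /* replace != 0: overwrite */
        return -1;
    if (!env_equals("GR_DEMO_VAR", "second"))
        return -1;
    if (unsetenv("GR_DEMO_VAR") != 0)
        return -1;
    return getenv("GR_DEMO_VAR") == NULL ? 0 : -1;
}
```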

Memory Layout of a C Program

• Text Segment:

o This segment contains the instructions


o This is the shared and read only segment

• Data Segment: Classified into two segments,

 Initialized data segment: it contains the variables with initialized data.

For example: int size=100;

 Uninitialized data segment: also called the “bss (block started by
symbol)” segment. Data in this segment is initialized by the kernel to arithmetic 0. For
example: int size;

• Stack Segment: It is storage for function arguments, automatic variables and
return address of all functions.

• Heap Segment : Dynamic memory allocation takes place in this section only.

The logical picture of how this segment is arranged is shown below.

Dynamic Memory Allocation

void *malloc (size_t size);


void *calloc (size_t count, size_t size) ;
void *realloc (void *addr, size_t size);
void free (void *addr);

1. malloc : Allocates a block of size bytes and returns a pointer to the first
byte. The initial value of the memory is indeterminate.

2. calloc : Allocates a block of count * size bytes, sets its
contents to zero, and returns a pointer to the first byte. Here
count is the number of elements and size is the size (in bytes)
of each element.

3. realloc : This function changes the size of a block of memory that was
previously allocated by malloc, calloc or realloc to larger or
smaller, possibly by copying it to a new location.

4. free : Frees a block previously allocated by malloc, calloc or realloc.
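
As a rough illustration of how these four functions fit together, the sketch below allocates a zeroed array with calloc, grows it with realloc, and leaves the final free to the caller. The function name make_sequence is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: returns an array holding 0, 1, ..., n-1.
   Half of the array is first obtained (zeroed) from calloc and is then
   grown to the full size with realloc. The caller must free the result.
   Returns NULL if n is 0 or if an allocation fails. */
int *make_sequence(size_t n)
{
    size_t half = n / 2, i;
    int *a, *bigger;

    if (n == 0)
        return NULL;
    a = calloc(half > 0 ? half : 1, sizeof *a);
    if (a == NULL)
        return NULL;

    bigger = realloc(a, n * sizeof *a);   /* may move the block */
    if (bigger == NULL) {
        free(a);                          /* old block is still valid on failure */
        return NULL;
    }
    a = bigger;
    for (i = 0; i < n; i++)
        a[i] = (int)i;
    return a;
}
```

Assigning the result of realloc to a second pointer avoids losing the original block when realloc fails.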

alloca function

 It allocates memory from stack region instead of heap region

 Deallocates automatically without using free.

 Not every system supports alloca, so it is less portable.

 Prototype:

#include <stdlib.h>

void *alloca( size_t size);

alloca returns the address of a block of size bytes of memory.

getrlimit and setrlimit Functions

 Used to get and set the limits on system resources.

#include <sys/resource.h>

int getrlimit (int resource, struct rlimit *rlp);


int setrlimit (int resource, const struct rlimit *rlp);

 struct rlimit

struct rlimit
{
rlim_t rlim_cur; // soft limit: current limit
rlim_t rlim_max; // hard limit: maximum value for rlim_cur
};

Rules for changing the resource limits:


• A soft limit can be changed by any process to a value less than or equal
to its hard limit.
• Any process can lower its hard limit to a value greater than or equal to its
soft limit value. This lowering of the hard limit is irreversible for normal
users.
• Only a super user process can raise a hard limit value.
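
The first rule can be sketched as follows for the limit on open file descriptors. The function name set_soft_nofile is an assumption made for this example; RLIMIT_NOFILE and RLIM_INFINITY come from <sys/resource.h>.

```c
#include <sys/resource.h>

/* Hypothetical helper: sets the soft limit on open file descriptors to
   wanted, clipped to the hard limit, and leaves the hard limit unchanged.
   Returns 0 on success, -1 on error. */
int set_soft_nofile(rlim_t wanted)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) == -1)
        return -1;
    /* A soft limit may never exceed the hard limit. */
    if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
        wanted = rl.rlim_max;
    rl.rlim_cur = wanted;
    return setrlimit(RLIMIT_NOFILE, &rl);
}
```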

setjmp and longjmp Functions

 These are the non-local goto statements.


#include <setjmp.h>

int setjmp(jmp_buf jmpb);


void longjmp(jmp_buf jmpb, int retval);

 setjmp must be called before longjmp.

 The setjmp function records the location to which a future longjmp
call will return.

 setjmp returns 0 when it is initially called. If the return is from a call to
longjmp, setjmp returns a non-zero value.

 The longjmp function is called to transfer program flow to the location
that was stored in the jmp_buf.
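
A minimal sketch of this non-local jump is shown below: an error detected two calls deep returns directly to the function that called setjmp. The function names and the error code 2 are chosen only for illustration.

```c
#include <setjmp.h>

static jmp_buf recovery_point;

/* Deepest level: a negative value is treated as an error here. */
static void parse_level2(int value)
{
    if (value < 0)
        longjmp(recovery_point, 2);   /* setjmp now returns 2 */
}

static void parse_level1(int value)
{
    parse_level2(value);
}

/* Returns 0 if the nested calls finished normally, or the value that was
   passed to longjmp if they bailed out. */
int process(int value)
{
    switch (setjmp(recovery_point)) {
    case 0:                           /* direct call of setjmp */
        parse_level1(value);
        return 0;
    case 2:                           /* arrived here through longjmp */
        return 2;
    default:
        return -1;
    }
}
```

Using setjmp directly as the controlling expression of the switch keeps the call within the contexts the C standard allows for it.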
The fork()

 Only way to create processes


 Parent/child relationship
 The child is a copy of the parent.
 It inherits the parent's data, heap and stack.
 COW (copy-on-write) in most current implementations.

#include <unistd.h>
#include <sys/types.h>

pid_t fork (void);

 fork() returns 0 in the child, the PID of the child in the parent, and -1 on error.
 It is never known whether the parent or the child will start executing first.
 All file descriptors that are open in the parent are duplicated in the child.
 Parent and child also share the same file offset for these descriptors (files opened after fork() are not
shared).

 Two normal cases for handling the descriptors after a fork():

 Parent waits.
 Parent and child go their own way.
 fork() may fail if the number of processes exceeds the per-user limit or the total system limit.
 Two uses (reasons) for fork():
 Each can execute different sections of the code at the same time.
 One process can execute a different program.
The vfork()
 A BSD variant of fork(), now supported by SVR4.
 Similar to fork(), but intended only for a child that will exec a new program.
 The child runs in the parent's address space until it calls exec()/exit().
 The address space of the parent is not copied into the child.
 vfork() guarantees that the child runs first until it calls exec()/exit( ).
 Deadlock is possible if the child needs information from the parent.
Zombie Process

 A zombie is a process that has terminated but is still in the operating system's process table,
waiting for its parent process to retrieve its exit status.

 A zombie is created when a child terminates before its parent and the parent has not yet
fetched the child's termination status.

 The ps command prints the state of a zombie process as Z.

Various wait() System Calls

 wait() is used to wait for a child to terminate.


 waitpid() is used to wait for a specific child to terminate, plus some options.
 wait3()/wait4() will further collect resource usage information.
 When a process terminates, the following are reported/returned to its parent:
o Exit status.
o Some timing statistics (CPU time consumed).
o Etc.
 Prototype:

#include <sys/wait.h>
#include <sys/types.h>

pid_t wait (int *status_ptr);
pid_t waitpid (pid_t pid, int *status_ptr, int options);
pid_t wait3 (int *status_ptr, int options, struct rusage *usage);
pid_t wait4 (pid_t pid, int *status_ptr, int options, struct rusage *usage);

 The resource information includes the amount of user CPU time, the amount of
system CPU time, number of page faults, number of signals received.
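
Putting fork and waitpid together, a parent can create a child and collect its exit status roughly as in this sketch. The function name run_child_and_wait is hypothetical; WIFEXITED and WEXITSTATUS are the standard macros for decoding the status word.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that exits with the given code and
   returns that code as seen by the parent, or -1 on error. */
int run_child_and_wait(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                    /* fork failed */
    if (pid == 0)
        _exit(code);                  /* child: terminate immediately */

    /* Parent: block until this particular child terminates. Fetching the
       status also removes the child's zombie entry. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```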

Orphaned Process (orphan)


 A process whose parent has exited.
 Orphaned processes are inherited by init.
 An orphaned process does not remain a zombie: init waits for it,
so its slot in the process table is released promptly when the orphan terminates.
Race Condition
 A race condition occurs when multiple processes are competing for the same
system resource(s).
 The outcome depends on the order in which the processes run.
 Problems due to race conditions are hard to debug.
 Avoiding them requires process synchronization.
 We can avoid race conditions and polling by:
1. Signals
2. Various forms of interprocess communication (IPC) mechanisms
3. TELL_WAIT, TELL_PARENT, TELL_CHILD, WAIT_PARENT
and WAIT_CHILD macros
A scenario using the above macros is as follows:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
// TELL_xxx and WAIT_xxx are assumed to be defined elsewhere
int main( )
{
pid_t pid;

TELL_WAIT(); //set things up for TELL_xxx and WAIT_xxx

if ( (pid = fork()) < 0)
{
perror("fork error");
exit(1);
}
if (pid == 0) //child
{ // Child does whatever is necessary
TELL_PARENT(getppid( )); //tell parent we are done
WAIT_PARENT(); // wait for parent
// Child continues on its way
exit(0);
}
// Parent does whatever is necessary
TELL_CHILD(pid); //tell child we are done
WAIT_CHILD( ); // and wait for child
// and the Parent continues on its way
exit(0);
}
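
The notes do not show how the five macros are built. One possible implementation, sketched here with two pipes, is given below; the lowercase function names and run_handshake are invented for this example, and a signal-based version would be equally valid. With pipes the pid arguments used in the scenario above are not needed.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int to_child[2], to_parent[2];   /* one pipe per direction */

int tell_wait(void)
{
    return (pipe(to_child) == -1 || pipe(to_parent) == -1) ? -1 : 0;
}
int tell_parent(void) { return write(to_parent[1], "c", 1) == 1 ? 0 : -1; }
int wait_parent(void) { char c; return read(to_child[0], &c, 1) == 1 ? 0 : -1; }
int tell_child(void)  { return write(to_child[1], "p", 1) == 1 ? 0 : -1; }
int wait_child(void)  { char c; return read(to_parent[0], &c, 1) == 1 ? 0 : -1; }

/* Runs the same handshake as the scenario above.
   Returns 0 if both sides completed it, -1 otherwise. */
int run_handshake(void)
{
    int status;
    pid_t pid;

    if (tell_wait() == -1)
        return -1;
    if ((pid = fork()) < 0)
        return -1;
    if (pid == 0) {                          /* child */
        if (tell_parent() == -1 || wait_parent() == -1)
            _exit(1);
        _exit(0);
    }
    if (tell_child() == -1 || wait_child() == -1)   /* parent */
        return -1;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    close(to_child[0]);  close(to_child[1]);
    close(to_parent[0]); close(to_parent[1]);
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Each wait blocks in read until the other side has written its single byte, so neither process has to poll.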

The exec() System Call

 Only way to execute a program.

 In the UNIX system, fork() creates processes and exec() runs programs in them.
These two system calls are very closely related. Without exec(), no new program can be
executed. Without fork(), no process can be created. Together they carry out
most of the UNIX system operations.

 exec() replaces the program running in the calling process with a new program and starts executing it.
 The process gets brand new text, data, heap and stack segments.
 The new program inherits most of the process attributes of the calling process, such as :
o PID and PPID.
o The real UID and GID; the effective UID and GID are also kept unless the new program file has its SUID or SGID bit set.
o Open files, except those with the close-on-exec flag set, are passed to
the new program.
o The file mode creation mask (umask) is passed to the new program.
o Controlling terminal.
o Current working directory
o Root directory.
o File locks.
o Signal mask.
o Pending signals.
o Resource limits
o CPU times.
 exec is a family name for six similar functions that do virtually the same thing and differ only
slightly in syntax:
 execl(), execv(), execle(), execve(), execlp(), and execvp().
 Only execve() is a system call.
 Meaning of different letters:
 l: needs a list of arguments.
 v: needs an argv[] vector (l and v are mutually exclusive).
 e: needs an envp[] array.
 p: uses the PATH variable to find the executable file.

1. int execl (const char *path, const char *arg0, ..., (char *) NULL);
2. int execle (const char *path, const char *arg0, ..., (char *) NULL, char *const envp[]);
3. int execlp (const char *file, const char *arg0, ..., (char *) NULL);
4. int execv (const char *path, char *const argv[]);
5. int execve (const char *path, char *const argv[], char *const envp[]);
6. int execvp (const char *file, char *const argv[]);

path : Path name of the program file to execute
file : File name of the program; it is searched for in PATH if it contains no slash
argN : Argument pointer(s) passed as separate arguments
argv[N] : Argument pointer(s) passed as an array of pointers

envp : Array of character pointers to the environment strings (`name=value').
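
The usual fork-then-exec pattern might look like the sketch below, which uses execlp (the l and p variants) to run a shell command in a child. The function name run_shell_command and the exit code 127 for a failed exec are choices made for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs "sh -c command" in a child process and
   returns the command's exit status, or -1 on error. */
int run_shell_command(const char *command)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* l: arguments given as a list ending with a null pointer.
           p: "sh" is looked up through the PATH variable. */
        execlp("sh", "sh", "-c", command, (char *)NULL);
        _exit(127);                   /* reached only if exec failed */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

On success exec never returns, which is why the line after execlp only handles the failure case.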

system()

 int system(const char * cmd);

 system() forks a child process that exec’s /bin/sh, which in turn runs the command
cmd
 As such, it has the following qualities:

 it’s easy and familiar to use
 it’s inefficient
 because it uses environment variables and executes from a shell, it can be a
security risk if the calling program is setuid or setgid
 example: system("ls -la /usr/bin");

Process Accounting
 The kernel writes an accounting record each time a process terminates
 These accounting records are 32 bytes of binary data
 containing the name of the command, amount of CPU time used, the
user ID, group ID, starting time and so on.
 The accton command enables and disables process accounting.
 The data required for the accounting record are all kept by the kernel in
the process table and initialized whenever a new process is created (e.g., in
the child after a fork).
 Each accounting record is written when the process terminates. This
means that the order of the records in the accounting file corresponds to
the termination order of the processes.
 The accounting records correspond to processes, not programs. A new
record is initialized by the kernel for the child after a fork, not when a new
program is execed.
Terminal Logins


Network Logins
In this case all the logins come through the kernel's network interface drivers
(e.g., the Ethernet driver), and we do not know ahead of time how many of these will
occur. Instead of having a process waiting for each possible login, we now have to wait
for a network connection request to arrive. A single process waits for most
network connections: the inetd process, sometimes called the Internet superserver.


Process Groups

 A process group is a collection of one or more processes


 Each process group has a unique process group ID
 Process group IDs are similar to process ID: They are positive integers
and they can be stored in a pid_t data type.
 The function getpgrp returns the process group ID of the calling process.

#include <sys/types.h>
#include <unistd.h>
pid_t getpgrp(void);

Returns: process group ID of calling process

 Each process group can have a process group leader. The leader is
identified by having its process group ID equal its process ID.

 A process group leader can create a process group, create processes in the group,
and then terminate. The process group continues to exist as long as at least one process remains in it.

 A process joins an existing process group or creates a new process group
by calling setpgid.
#include <sys/types.h>
#include <unistd.h>

int setpgid (pid_t pid, pid_t pgid);

Returns: 0 if OK, -1 on error

This sets the process group ID of the process pid to pgid.

Sessions
 A session is a collection of one or more process groups
 A process establishes a new session by calling the setsid function.

#include <sys/types.h>
#include <unistd.h>

pid_t setsid(void);

 If the calling process is not a process group leader, the setsid function creates a new session. Three things happen:

1. The process becomes the session leader of this new session. (A session
leader is the process that creates a session.) The process is the only
process in this new session.

2. The process becomes the process group leader of a new process group.
The new process group ID is the process ID of the calling process.

3. The process has no controlling terminal.
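
The first two effects can be observed from inside the process, as in this sketch. The function name child_starts_new_session is made up, and getsid (which returns a session ID) is a standard call that these notes do not otherwise cover.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that calls setsid and checks that it
   became both session leader and process group leader.
   Returns 0 if so, -1 otherwise. */
int child_starts_new_session(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* A freshly forked child is not a process group leader,
           so setsid is permitted here. */
        if (setsid() == (pid_t)-1)
            _exit(1);
        if (getsid(0) != getpid())    /* session ID equals own PID */
            _exit(2);
        if (getpgrp() != getpid())    /* process group ID equals own PID */
            _exit(3);
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```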

Figure: Arrangement of processes into process groups and sessions (one session containing three process groups, made up of the login shell and the processes Proc1 to Proc5).

Controlling Terminal

 A session can have a single controlling terminal. This is usually the terminal device (in
case of terminal login) or pseudo terminal device (in case of network
login) on which we log in.

 The session leader that establishes the connection to the controlling
terminal is called the controlling process.

 The process groups within a session can be divided into a single
foreground process group and one or more background process groups.

 If a session has a controlling terminal, then it has a single foreground
process group, and all other process groups in the session are background
process groups.

 Whenever we type our terminal's interrupt key (often DELETE or
Control-C) or quit key (often Control-backslash) this causes either the
interrupt signal or the quit signal to be sent to all processes in the
foreground process group.

 If a modem disconnect is detected by the terminal interface, the hang-up
signal is sent to the controlling process (the session leader).

These characteristics are shown in Figure.

Figure: Process groups and sessions showing controlling terminal

 The Controlling terminal is established automatically for us when we log in.

Job Control


 Job control allows us to start multiple jobs (groups of processes) from a single terminal
and control which jobs can access the terminal and which jobs are to run in the
background. Job control requires three forms of support.

1. A shell that supports job control
2. The terminal driver in the kernel must support job control
3. Support for certain job-control signals must be provided.

A job is just a collection of processes, often a pipeline of processes.

 The terminal driver in the kernel and signal must support job control

o The terminal driver looks for three special characters, which
generate signals to the foreground process group:

• The interrupt character (typically DELETE or Control-C) generates SIGINT
• The quit character (typically Control-backslash) generates SIGQUIT
• The suspend character (typically Control-Z) generates SIGTSTP

a. Only the foreground job receives terminal input. When a background job
tries to read from the terminal, the terminal driver generates a SIGTTIN signal, which is sent to
the background job. This normally stops the background job.

b. What happens when a background job outputs to the controlling terminal is an
option that we can allow or disallow. Normally we use the stty(1) command to
change this option. If output is disallowed, the terminal driver sends the SIGTTOU signal to the background job.

Pictorially we can represent the above features as follows:


Figure: Summary of job control features with foreground and background jobs and
terminal driver.

 The solid lines through the terminal driver box mean that the terminal I/O and the
terminal generated signals are always connected from the foreground process group to
the actual terminal.

 The dashed line corresponding to the SIGTTOU signal means that whether the output
from a process in the background process group appears on the terminal is an option.

Orphaned Process Groups

 The process groups that continue running even after the session leader has
terminated are marked as orphaned process groups.

 For the processes of such an orphaned process group, init becomes the parent process

Signals and Daemon Processes

Signals
o A signal is a software interrupt delivered to a process.
o These are triggered by events

Events that generate signals:

1. Process (synchronization)
2. Unix kernel (divide by 0)
3. User (Ctrl-c)

Actions a process can take on a signal:

1. Accept the default action
2. Ignore the signal
3. Invoke a user-defined function

Signals are defined as integer flags, and they are defined in the header file
<signal.h>. The table below lists the signals commonly used in POSIX and in UNIX systems.

Signal      Core file generated at default   Description
SIGABRT     Yes                              Abort process execution. Can be generated by the abort() API.
SIGALRM     No                               Alarm timer time-out. Can be generated by the alarm() API.
SIGFPE      Yes                              Illegal arithmetic operation.
SIGHUP      No                               Controlling terminal hangup.
SIGILL      Yes                              Execution of an illegal machine instruction.
SIGINT      No                               Terminal interrupt signal, generated by Ctrl-c.
SIGKILL     No                               Sure kill (cannot be caught or ignored), generated by the kill -9 PID command.
SIGPIPE     No                               Write on a pipe with no one to read it.
SIGQUIT     Yes                              Terminal quit signal, commonly generated by the Control-\ keys.
SIGSEGV     Yes                              Segmentation fault, generated by de-referencing a NULL pointer.
SIGTERM     No                               Termination signal, can be generated by the kill PID command.
SIGUSR1     No                               User-defined signal 1.
SIGUSR2     No                               User-defined signal 2.
SIGCHLD     No                               Sent to the parent process when a child process terminates or stops.
SIGCONT     No                               Continue executing, if stopped.
SIGSTOP     No                               Stop executing (cannot be caught or ignored).
SIGTSTP     No                               Terminal stop signal, generated by Ctrl-z.
SIGTTIN     No                               Stops a background process when it attempts to read from its controlling terminal.
SIGTTOU     No                               Stops a background process when it attempts to write to its controlling terminal.
SIGBUS      Yes                              Bus error.
SIGVTALRM   No                               Virtual timer expired.
SIGXCPU     Yes                              CPU time limit exceeded.
SIGXFSZ     Yes                              File size limit exceeded.

Unix Kernel Supports for Signals

o Each process table entry in the kernel has a slot containing an array of signal flags, one for each signal
o When a signal is generated, the kernel sets the corresponding signal flag in
the process table entry of the recipient process.
o The kernel then consults the array entry of the corresponding signal to find out
how the process will react to the pending signal.

o If the array entry for the signal contains a value:

0: The process will accept the default action
1: The process will ignore the signal
Others: It is used as the function pointer for the user defined signal handler
function.

o Pending: When a signal has been generated and sent to a process but not yet delivered, it is
pending. Normally it remains pending for just a short period of time.

o Delivered: A signal is delivered when the process has reacted to it, that is, when the action for the signal has been taken.

o Caught: A signal is caught when its signal handler routine is called.

Basic Signal Handling

#include <signal.h>

sighandler_t signal (int signum, sighandler_t action);

The signal function establishes action as the action for the signal signum. The
first argument, signum, identifies the signal whose behavior you want to control, and
should be a signal number. The proper way to specify a signal number is with one of the
symbolic signal names.

The second argument, action, specifies the action to use for the signal signum.

This can be one of the following:

 SIG_DFL : It specifies the default action for the particular signal. The
default actions for various kinds of signals are
stated in Standard Signals.

 SIG_IGN : It specifies that the signal should be ignored.

 handler: Supply the address of a handler function in your program, to
specify running this handler as the way to deliver the signal.

If you set the action for a signal to SIG_IGN, or if you set it to SIG_DFL
and the default action is to ignore that signal, then any pending
signals of that type are discarded (even if they are blocked).
Discarding the pending signals means that they will never be
delivered, not even if you subsequently specify another action
and unblock this kind of signal.

The signal function returns the action that was previously in effect for the specified
signum. You can save this value and restore it later by calling signal again.
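
A small sketch of installing a handler, triggering it, and restoring the previous action follows. The names on_usr1 and signal_demo are hypothetical. Because sighandler_t is a GNU type name, the sketch spells the handler type out as void (*)(int).

```c
#include <signal.h>

static volatile sig_atomic_t got_usr1 = 0;

/* Handler: only sets a flag, which is safe inside a signal handler. */
static void on_usr1(int signum)
{
    (void)signum;
    got_usr1 = 1;
}

/* Installs on_usr1 for SIGUSR1, sends the signal to this process, and
   restores the action that was previously in effect.
   Returns 1 if the handler ran, 0 if it did not, -1 on error. */
int signal_demo(void)
{
    void (*previous)(int);

    got_usr1 = 0;
    previous = signal(SIGUSR1, on_usr1);
    if (previous == SIG_ERR)
        return -1;
    raise(SIGUSR1);                   /* handler runs before raise returns */
    signal(SIGUSR1, previous);        /* restore the saved action */
    return got_usr1;
}
```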

Advanced Signal Handling

#include <signal.h>

int sigaction (int signum, const struct sigaction *restrict action,
struct sigaction *restrict old_action);
struct sigaction
{
sighandler_t sa_handler ;
sigset_t sa_mask ;
int sa_flags;
};

sighandler_t sa_handler:

This is used in the same way as the action argument to the signal function. The
value can be SIG_DFL, SIG_IGN, or a function pointer.

sigset_t sa_mask:

This specifies a set of signals to be blocked while the handler runs.

int sa_flags:
This specifies various flags which can affect the behavior of the signal. The
sa_flags member of the sigaction structure is a catch-all for special features.

The value of sa_flags is interpreted as a bit mask. Thus, you should choose the
flags you want to set, OR those flags together, and store the result in the sa_flags
member of your sigaction structure.

Signal Sets

All of the signal blocking functions use a data structure called a signal set to
specify what signals are affected.

You must always initialize the signal set with one of these two functions before
using it in any other way.

int sigemptyset (sigset_t *set);

This function clears all signal flags in the set argument and always returns 0.

int sigfillset (sigset_t *set);

This function sets all the signal flags in the set argument and always returns 0.

int sigaddset (sigset_t *set, int signum);

This function adds the signal signum to the signal set set. All sigaddset does is
modify set; it does not block or unblock any signals. The return value is 0 on success and
-1 on failure.

int sigdelset (sigset_t *set, int signum);

This function removes the signal signum from the signal set set. All sigdelset
does is modify set; it does not block or unblock any signals. The return value and error
conditions are the same as for sigaddset.

Finally, there is a function to test what signals are in a signal set:

int sigismember (const sigset_t *set, int signum);

The sigismember function tests whether the signal signum is a member of the
signal set set. It returns 1 if the signal is in the set, 0 if not, and -1 if there is an error.

Process Signal Mask

The collection of signals that are currently blocked is called the signal mask. You
can block or unblock signals with total flexibility by modifying the signal mask.


The prototype for the sigprocmask function is in signal.h.

int sigprocmask (int how, const sigset_t *restrict set, sigset_t *restrict oldset);

The sigprocmask function is used to examine or change the calling process's
signal mask. The how argument determines how the signal mask is changed, and must be
one of the following values:

SIG_BLOCK : Block the signals in set, that is, add them to the existing mask.
SIG_UNBLOCK : Unblock the signals in set, that is, remove them from the existing mask.
SIG_SETMASK : Use set for the mask; ignore the previous value of the mask.

The last argument, oldset, is used to return information about the old process
signal mask.

The sigprocmask function returns 0 if successful, and -1 to indicate an error.

You can't block the SIGKILL and SIGSTOP signals, but if the signal set includes
these, sigprocmask just ignores them instead of returning an error status.

The SIGCHLD signal and waitpid API

When a child process terminates or stops, the kernel will generate a SIGCHLD
signal to its parent process. Depending on how parent sets up the handling of the
SIGCHLD signal, different events may occur.

1. Parent accepts the default action of the SIGCHLD signal: unlike most other signals, the
SIGCHLD signal does not terminate the parent process. It affects the parent
process only if it arrives while the parent process is suspended by the waitpid
system call. If that is the case, the parent process will be awakened, the API will
return the child's exit status and process ID to the parent, and the kernel will clear
up the process table slot allocated for the child process. Thus with this setup, a
parent process can call the waitpid API repeatedly to wait for each child it
created.

2. Parent ignores the SIGCHLD signal: The SIGCHLD signal will be discarded, and
the parent will not be disturbed, even if it is executing the waitpid system call.
The effect of this setup is that if the parent calls the waitpid system call, the API
will suspend the parent until all its child processes have terminated. The kernel
will clear up the process table slots allocated for the child processes, and the API will
return -1 to the parent process.

3. Parent catches the SIGCHLD signal: the signal handler function will be called in
the parent process whenever a child process terminates. Furthermore, if the
SIGCHLD signal arrives while the parent process is executing the waitpid system
call, then after the signal handler function returns, the waitpid API may be restarted to
collect the child's exit status and clear its process table slot. On the other hand, the API
may be aborted and the child's process table slot not freed, depending on the parent's
setup of the signal action for the SIGCHLD signal.

The interaction between SIGCHLD and wait API is the same as that
between SIGCHLD and the waitpid API

The sigsetjmp and siglongjmp APIs

The sigsetjmp and siglongjmp APIs have similar functions to their corresponding
setjmp and longjmp APIs.

These are used to transfer control from one function to another, hence they are
called non-local goto statements. The function prototypes of these functions are:

#include <setjmp.h>

int sigsetjmp(sigjmp_buf jmpb, int save_sigmask );
void siglongjmp(sigjmp_buf jmpb, int retval);

The sigsetjmp API behaves similarly to the setjmp API, except that it has a second
argument, save_sigmask, which allows a user to specify whether the calling process's signal
mask should be saved in the provided jmpb argument.

Similarly, siglongjmp does all the operations of the longjmp API, but it also
restores the calling process's signal mask if the mask was saved in its jmpb argument. The
retval argument specifies the return value of the corresponding sigsetjmp API when it is
returned to by siglongjmp. Its value should be a non-zero number; if it is zero, the siglongjmp
API will reset it to 1.

The siglongjmp API is usually called from user defined signal handling functions.

kill and raise

A process can send itself a signal with the raise function. This function is
declared in signal.h.

#include <signal.h>

int raise (int signum);

The kill function can be used to send a signal to another process.

#include <signal.h>

int kill (pid_t pid, int signum);


The pid specifies the process or process group to receive the signal:

pid > 0 : The process whose identifier is pid.


pid == 0 : All processes in the same process group as the sender.
pid < -1 : The process group whose identifier is −pid.
pid == -1 : If the process is privileged, send the signal to all processes except
for some special system processes. Otherwise, send the
signal to all processes with the same effective user ID.
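
The pid > 0 case can be tried out with a sketch like this one, in which a parent signals its own child and then inspects how the child ended. The function name kill_child_with is hypothetical; WIFSIGNALED and WTERMSIG are the standard macros for a child terminated by a signal.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that just waits, sends it signum,
   and returns the number of the signal that terminated the child,
   or -1 on error. */
int kill_child_with(int signum)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();                  /* child: wait for a signal */
    }
    if (kill(pid, signum) == -1)      /* pid > 0: exactly this process */
        return -1;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```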

alarm , pause and sleep function

The pause function suspends program execution until a signal arrives whose
action is either to execute a handler function, or to terminate the process.

#include <unistd.h>

int pause( );

alarm

The alarm API can be called by a process to request the kernel to send the
SIGALRM signal after a certain number of real clock seconds. This is just like setting
an alarm clock to remind someone to do something after a specific period of time.

#include <unistd.h>

unsigned int alarm (unsigned int seconds);

sleep

The function sleep suspends a calling process for the specified number of real clock
seconds. The process will be awakened either when the elapsed time exceeds the timer
value or sooner if a signal arrives.

#include <unistd.h>
unsigned int sleep (unsigned int seconds);

Daemon Process

 Daemons are processes that live for a long time
 They do not have a controlling terminal
 They always run in the background.
 They are often started when the system is bootstrapped and terminate when the
system is shut down.

Daemon Characteristics
ps -axj


-a: Shows the status of processes owned by others
-x: Shows processes that do not have a controlling terminal
-j: Displays job-related information

The output from ps looks like:

PPID   PID   PGID   SID   TT   TPGID   UID   COMMAND
0      0     0      0     ?    -1      0     swapper
0      1     0      0     ?    -1      0     /sbin/init
0      2     0      0     ?    -1      0     pagedaemon
1      105   37     37    ?    -1      0     update
1      108   108    108   ?    -1      0     cron

 Processes 0, 1, and 2 are swapper, init and pagedaemon
respectively, and they exist for the entire lifetime of the system. They have no
parent process ID, no process group ID, and no session ID.

 update is a program that flushes the kernel's buffer cache to disk at regular
intervals (usually every 30 seconds). To do this it just calls the sync function every 30
seconds.

 The cron daemon executes commands at specified dates and times.

 Notice that all the daemons run with super user privileges (user ID of 0). None of
the daemons has a controlling terminal: The terminal name is set to a question mark.

 The parent of all these daemons is the init process.

Coding Rules

1. The first thing to do is call fork and have the parent exit.
2. Call setsid to create a new session. The process then:
a. Becomes a session leader of a new session
b. Becomes the process group leader of a new process group
c. Has no controlling terminal.

3. Change the current working directory to the root directory.

4. Set the file mode creation mask to 0.

5. Unneeded file descriptors should be closed.

 The program below is a function that can be called from a
program that wants to initialize itself as a daemon.

#include <sys/types.h>
#include <sys/stat.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>

int daemon_init(void)
{
pid_t pid;

if ( (pid = fork()) < 0)
return(-1);
else
if (pid != 0)
exit(0); /* parent goes bye-bye */
/* child continues */
setsid( ); /* become session leader */
chdir("/"); /* change working directory */
umask(0); /* clear our file mode creation mask */
return(0);
}

Error Logging

 One problem a daemon has is how to handle error messages. It can't just
write to standard error, since it shouldn't have a controlling terminal.

SVR4 Streams log Driver

 SVR4 provides a streams device driver, with an interface for streams
error logging, streams event tracing, and console logging.

 Each log message can be routed to one of three loggers: the error logger,
the trace logger, or the console logger.

There are three ways to generate log messages and three ways to read them.

 Generating log messages:

1. Routines within the kernel can call strlog to generate log messages. This
is normally used by streams modules and streams device drivers for either
error messages or trace messages.

2. A user process (such as a daemon) can call putmsg to write to /dev/log. This
message can be sent to any of the three loggers.

3. A user process (such as a daemon) can write to /dev/conslog. This
message is sent only to the console logger.


 Reading log messages:

1. The normal error logger is strerr. It appends these messages to a file in
the directory /var/adm/streams. The file's name is error.mm-dd, where
mm is the month and dd is the day of the month. This program is itself a
daemon: it runs in the background, appending the log messages to the
file.

2. The normal trace logger is strace. It can selectively write a specified set
of trace messages to its standard output.

3. The standard console logger is syslogd. This program is a daemon that
reads a configuration file and writes log messages to specified files or the
console device, or sends e-mail to certain users.

Client-Server Model

A common use for a daemon process is as a server process. Indeed, in the above
diagram we can call the syslogd process a server that has messages sent to it by user
processes (clients) using a UNIX domain datagram socket.

In general, a server is a process that waits for a client to contact it, requesting
some type of service. In the above diagram, the service being provided by the syslogd
server is the logging of an error message.

Appendix D: Quick Reference to Signals and Daemon Processes

Pipes
 A pipe is a mechanism for interprocess communication
 Data written to the pipe by one process can be read by another process
 The data is handled in a first-in, first-out (FIFO) order.
 The pipe has no name

Limitations of pipe

 They are half-duplex; data flows in only one direction.
 They can be used only between processes that have a common ancestor.

Creating a Pipe

 The primitive for creating a pipe is the pipe function.


 This creates both the reading and writing ends of the pipe.

#include <unistd.h>
int pipe (int filedes[2]);

The two file descriptors are returned in filedes: filedes[0] is open for reading (the
input end) and filedes[1] is open for writing (the output end). An easy way to
remember that the input end comes first is that file descriptor 0 is standard input, and
file descriptor 1 is standard output.

If successful, pipe returns a value of 0. On failure, -1 is returned.

The following program creates, writes to, and reads from a pipe.
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <unistd.h>

int main(void)
{
    int pfds[2];    /* pfds[0]: read end, pfds[1]: write end */
    char buf[30];

    if (pipe(pfds) == -1) {
        perror("pipe");
        exit(1);
    }

    printf("writing to file descriptor #%d\n", pfds[1]);
    write(pfds[1], "test", 5);  /* 5 bytes: "test" plus the terminating '\0' */

    printf("reading from file descriptor #%d\n", pfds[0]);
    read(pfds[0], buf, 5);
    printf("read \"%s\"\n", buf);
    return 0;
}

popen and pclose functions (pipe and subprocess)


#include <stdio.h>

FILE * popen (const char *command, const char *mode);


int pclose (FILE *stream);

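The popen function is closely related to the system function; it executes the shell
command command as a subprocess. Instead of waiting for the command to finish, it
returns a stream connected to the subprocess by a pipe: with mode "r" the caller reads
the command's standard output, and with mode "w" the caller writes to its standard
input. The pclose function closes the stream and waits for the subprocess to terminate.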

Coprocesses
 A Unix filter is a program that reads from standard input and writes to
standard output. Filters are normally connected linearly in shell
pipelines. A filter becomes a coprocess when the same program generates
its input and reads its output.
 A coprocess normally runs in the background from a shell, and its standard
input and standard output are connected to another program using pipes.
 Example: A process creates two pipes: one is the standard input of the
coprocess and the other is the standard output of the coprocess, as shown in the figure.


Figure: Parent and child (coprocess). The parent writes through pipe 1 to the
coprocess's stdin and reads the coprocess's stdout through pipe 2.

FIFOs

 A FIFO ("First In, First Out", pronounced "Fy-Foh") is also known as a named
pipe. Unlike a pipe, it has a name in the filesystem, so unrelated processes can use it.

#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

int mkfifo(const char * path , mode_t permissions);

 path: path name of the FIFO file to be created.
 permissions: access permissions, similar to the permissions specified in the open
API. The S_IFIFO flag, which marks a file as a FIFO, is required by mknod but
not by mkfifo.
 Returns 0 if it succeeds or -1 if it fails.

Client-Server Communication Using a FIFO

Figure: Client-server communication using FIFOs.

As seen from the figure, if the server opens its well-known FIFO read-only (since it
only reads from it), then each time the number of clients goes from 1 to 0 the server
will read an end of file on the FIFO. To prevent this, the server opens its well-known
FIFO for read-write.

POSIX.1b IPC Methods

 IPC methods are defined in POSIX.1b (the standard for a portable real-
time operating system).
 They are messages, shared memory, and semaphores. The System V
messages, shared memory, and semaphores use integer keys as identifiers
(names), whereas the POSIX.1b versions use textual names.

The UNIX System V IPC Methods

 Messages: allow processes on the same machine to exchange formatted
data.
 Semaphores: provide a set of system-wide variables that can be modified
and used by processes on the same machine to synchronize their
execution. Semaphores are commonly used with shared memory to
control access to the data in each shared memory region.
 Shared memory: allows multiple processes on the same machine to share a
common region of virtual memory, such that data written to a shared
memory region can be directly read and modified by other processes.
 Transport Level Interface: allows two processes on different machines to
set up a direct, two-way communication channel. This method uses
STREAMS as the underlying data transport interface.

UNIX System V Messages

UNIX Kernel Support for Messages

There is a message queue table in the kernel address space that keeps track of all
message queues created in the system. Each entry of the message table stores the
following data for one message queue:

 A name, which is an integer ID key assigned by the process that created
the queue. Other processes may specify this key to "open" the queue and
get a descriptor for future access of the queue


 The creator user ID and group ID. A process whose effective user ID
matches a message queue creator user ID may delete the queue and also
change the queue control data
 The assigned owner user ID and group ID. These are normally the same as
those of the creator, but a creator can set these values to reassign the queue
owner and group membership
 Read-write access permission of the queue for owner, group members, and
others. A process that has read permission to the queue may retrieve
messages from the queue and query the assigned user and group IDs of the
queue. A process that has write permission to a queue may send messages
to the queue
 The time and process ID of the last process that sent a message to the
queue
 The time and process ID of the last process that read a message from the
queue
 The pointer to a linked list of message records in the queue. Each message
record stores one message of data and its assigned message type

The figure shows the kernel data structure for messages: the message table, whose
entries each point to a linked list of message records.

The UNIX APIs for Messages

There are four APIs for message manipulation:

Message API   File APIs                    Uses

msgget        open                         Open, and create if needed, a message queue for access
msgsnd        write                        Send a message to a message queue
msgrcv        read                         Receive a message from a message queue
msgctl        stat, chmod, chown, unlink   Manipulate the control data of a message queue

msgget, msgsnd, msgrcv, and msgctl

The function prototypes of the message APIs are:

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
int msgget ( key_t key, int flag );
int msgsnd ( int msgfd, const void* msgPtr, size_t len, int flag );
ssize_t msgrcv ( int msgfd, void* msgPtr, size_t len, long mtype, int flag );
int msgctl ( int msgfd, int cmd, struct msqid_ds* mbufPtr );

Semaphores

 Semaphores provide a method to synchronize the execution of multiple
processes.
 A semaphore is a counter used to provide access to a shared data object for
multiple processes.

UNIX Kernel Support for Semaphores

In UNIX System V, there is a semaphore table in the kernel address space that
keeps track of all semaphore sets created in the system.

Each entry of the semaphore table stores the following data for one semaphore
set:
 A name that is an integer ID key assigned by the process that created
the set. Other processes may specify this key to open the set and get a
descriptor for future access of the set
 The creator user ID and group ID. A process whose effective user ID
matches a semaphore set creator user ID may delete the set and also
change control data of the set
 The assigned owner user ID and group ID. These are normally the same
as the creator user and group IDs, but a creator process can set these
values to assign a different owner and group membership for the set
 Read-write access permission of the set for owner, group members, and
others. A process that has read permission to the set may query the values of
the semaphores and the assigned user and group IDs of the set. A
process that has write permission to a set may change the values of
semaphores
 The number of semaphores in the set
 The time when the last process changed one or more semaphore values
 The time when the last process changed the control data of the set


 A pointer to an array of semaphores

Figure: Kernel data structure for semaphores (the semaphore table, whose entries
each point to a semaphore set)

The UNIX APIs for Semaphores

In addition to the above, the struct sem data type, as defined in the <sys/sem.h>
header, defines the data stored in a semaphore.

The APIs for semaphore manipulation are:

Semaphore API   Uses

semget          Open, and create if needed, a semaphore set
semop           Change or query semaphore values
semctl          Query or change control data of a semaphore set, or delete a set
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

int semget ( key_t key, int num_sem, int flag );
int semop ( int semfd, struct sembuf* opPtr, int len );
int semctl ( int semfd, int num, int cmd, union semun arg );

UNIX System V Shared Memory

 Shared memory allows two or more processes to share a given region of
memory. This is the fastest form of IPC because the data does not need to
be copied between the client and server. If the server is placing data into a
shared memory region, the client should not try to access the data until the
server is done. Often semaphores are used to synchronize shared memory
access.

UNIX Kernel Support for Shared Memory

In UNIX System V.3 and V.4, there is a shared memory table in the kernel
address space that keeps track of all shared memory regions created in the system. Each
entry of the table stores the following data for one shared memory region:


 A name that is an integer ID key assigned by the process that created the
shared memory. Other processes may specify this key to "open" the region
and get a descriptor for future attachment to or detachment from the region
 The creator user and group IDs. A process whose effective user ID
matches a shared memory region creator user ID may delete the region
and may change control data of the region
 The assigned owner user and group IDs. These are normally the same as
the creator user and group IDs, but a creator process can set these
values to assign a different owner and group membership for the region
 Read-write access permission of the region for owner, group members,
and others. A process that has read permission to the region may read data
from it and query the assigned user and group IDs of the region. A process
that has write permission to a region may write data to it
 The size, in number of bytes, of the shared memory region
 The time when the last process attached to the region
 The time when the last process detached from the region
 The time when the last process changed control data of the region

Figure: Kernel data structure for shared memory (the shared memory table of
struct shmid_ds entries, the process table, and the shared memory regions)

The UNIX APIs for Shared Memory

Shared memory API   Uses

shmget   Open, and create if needed, a shared memory region
shmat    Attach a shared memory region to a process virtual address space, so
         that the process can read and/or write data in the shared memory
shmdt    Detach a shared memory region from the process virtual address space
shmctl   Query or change control data of a shared memory region, or delete the
         region

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
int shmget ( key_t key, int size, int flag );
void * shmat ( int shmid, void* addr, int flag );
int shmdt ( void* addr );
int shmctl ( int shmid, int cmd, struct shmid_ds* buf );

Stream Pipes

A stream pipe is just a bidirectional (full-duplex) pipe. To obtain bidirectional
data flow between a parent and child, only a single stream pipe is required. The figure
below shows the view of a stream pipe.

s_pipe

Under SVR4, this function just calls the standard pipe function, which creates a
full-duplex pipe.

#include <unistd.h>

int s_pipe(int fd[2]) /* two file descriptors returned in fd[0] & fd[1] */
{
    return ( pipe(fd) );
}

The figure shows what a pipe looks like under SVR4: just two stream heads
connected to each other.

Passing File Descriptors

 The ability to pass an open file descriptor between processes is powerful.
It can lead to different ways of designing client-server applications. It
allows one process (typically a server) to do everything that is required to
open a file (involving details such as translation of a network name to a
network address, dialing a modem, negotiating locks for the file, etc.) and
just pass back to the calling process a descriptor that can be used with all
the I/O functions. All the details involved in opening the file or device are
hidden from the client.
 We must be more specific about what we mean by "passing an open file
descriptor" from one process to another. When we pass an open file
descriptor from one process to another, we want the passing process and
the receiving process to share the same file table entry. The figure below
shows the desired arrangement.

Figure: Passing an open file from the top process to the bottom process. Each
process table entry holds its descriptor slots (fd flags and a pointer); the passed
descriptor in both processes points to the same file table entry (file status flags,
current file offset, v-node ptr), which points to one v-node table entry (v-node
information, i-node information, current file size).

The following three user-defined functions are used to send and receive file
descriptors.

#include <stdio.h>
#include <unistd.h>

int send_fd(int spipefd, int filedes);
int send_err(int spipefd, int status, const char *errmsg);
int recv_fd(int spipefd, ssize_t (*userfunc)(int, const void *, size_t));

o send_fd sends the descriptor filedes across the stream pipe spipefd.
o send_err sends the errmsg across the stream pipe spipefd.
o recv_fd is called by the client to receive a descriptor.

Client-Server Connection Functions

Stream pipes are useful for IPC between related processes, such as a parent and
child.

#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>

int serv_listen(const char *name);
int serv_accept(int listenfd, uid_t *uidptr);
int cli_conn(const char *name);

 First, a server has to announce its willingness to listen for client
connections on a well-known name (some pathname in the filesystem) by
calling serv_listen.

 Once a server has called serv_listen, it calls serv_accept to wait for a
client connection to arrive.

 A client calls cli_conn with the same well-known name to connect to the
server.

--- THE END ---
