
UNIT-I

2/18/2013

What Software is Needed?


Operating Systems
Application Software
Software Development Tools
Web services
Database Servers and RDBMSs


What is open-source software (OSS)?


Software comes in the form of compiled code (binaries) and the human-readable source code from which these binaries are compiled.

Open-source software is software that is distributed as source code as well as binaries.

The distributor cannot restrict any party from redistributing the software, nor can any party be restricted from making modifications or derivative works based on the source code.


Open Source Definition (OSD)


1. Free Redistribution The license shall not restrict any party from selling or giving away the software as a component of an aggregate software distribution containing programs from several different sources. The license shall not require a royalty or other fee for such sale.



2. Source Code The program must include source code, and must allow distribution in source code as well as compiled form.

3. Derived Works The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software.



Integrity of the Author's Source Code
Modified versions may be required to be distinguished from the base source.

No Discrimination Against Persons or Groups

Distribution of License
The rights apply to everyone who receives the program, with no need to execute an additional license.

License Must Not Be Specific to a Product
The rights must not depend on the program being part of a particular software distribution.

License Must Not Restrict Other Software


What is open-source software (OSS)? (continued)

Open Source Software (OSS) is an example of a second-order Internet effect.
The first order was commercialization through buying and selling (e.g., Amazon and eBay).
The second order is based on collaboration and information sharing (e.g., Facebook).
Programmers throughout the world can be engaged in software development.


Open Source Vs. Closed Source Software

CSS
Developed by companies; developers work for economic purposes.
Centralized, single-site development.
Users may suggest requirements, but they may or may not be implemented.
Releases are infrequent; there may be only yearly releases.

OSS
Developed by volunteers who work for peer recognition; being recognized as a good developer is a great advantage.
Decentralized, distributed, multi-site development.
Users suggest additional features that often get implemented.
Software is released on a daily or weekly basis.

Open Source Vs. Closed Source Software (Continued)

CSS
The market believes commercial CSS is highly secure because it is developed by a group of professionals confined to one geographical area under a strict time schedule. Quite often this is not the case: hiding information does not make software secure, it only hides its weaknesses.
Users cannot enhance security by modifying the source code.

OSS
OSS is not market driven; it is quality driven. The community reacts to bug reports much faster than in CSS development, which makes it easier to fix bugs and make the component highly secure.
The ability to modify the source code can be a great advantage if you want to deploy a highly secure system.

OSI

Open Source is a certification standard issued by the Open Source Initiative (OSI) that indicates that the source code of a computer program is made available free of charge to the general public.

OSI dictates that in order to be considered "OSI Certified" a product must meet the following criteria:

The author or holder of the license of the source code cannot collect royalties on the distribution of the program.
The distributed program must make the source code accessible to the user.
The author must allow modifications and derivations of the work under the program's original name.
No person, group or field of endeavor can be denied access to the program.


Example of open source software


Programming Tools
Zope and PHP are popular engines behind the "live content" on the World Wide Web.

Languages:
Perl, Python, Ruby, Tcl/Tk

GNU compilers and tools
GCC, Make, Autoconf, Automake, etc.


Open source software sites


Free Software Foundation www.fsf.org
Open Source Initiative www.opensource.org
Freshmeat.net
SourceForge.net
OSDir.com
developer.BerliOS.de
Bioinformatics.org
See also individual project sites, e.g., www.apache.org; www.cpan.org; etc.


What open-source software is available? (continued)

Software Development
o GCC - The compiler for C, C++, Fortran and Java that comes standard with all the major OSS operating systems http://gcc.gnu.org/
o JBOSS - A popular open-source implementation of J2EE http://www.jboss.org
o Perl - A very popular language widely used in scripts to drive "live content" on the World Wide Web http://www.perl.org
o PHP - A very popular scripting language for interactive web development and applications http://www.php.net
o Python - A popular object-oriented scripting language for web and desktop development http://www.python.org


What open-source software is available?


Multi-user Networked Operating Systems
o Linux - The most popular OSS operating system on the planet http://www.linux.org

Internet/intranet Services and Applications
o Apache web server - Accounts for over 60% of the web servers on the Internet http://www.apache.org
o BIND name server - The software that provides the DNS (domain name service). Many of the root name servers as well as the Internet backbone network ISPs use BIND http://www.isc.org/products/BIND/
o Sendmail (Mail Exchange server) - The most widely used email transport software on the Internet http://www.sendmail.org

What open-source software is available? (continued)

Database Systems
o MySQL - A very popular open-source RDBMS http://www.mysql.com
o PostgreSQL - A popular open-source RDBMS with many advanced features http://www.postgresql.org

Desktop Applications
o OpenOffice.org - An integrated office suite featuring word-processing, spreadsheet, drawing and presentation software largely compatible with Microsoft Office http://www.openoffice.org
o Ximian Evolution - A GUI desktop application for personal email, calendar and diary with a look and feel similar to Microsoft Outlook http://www.ximian.org
o Mozilla - The open-source evolution of the popular Netscape web browser http://www.mozilla.org


Can We Count On OSS?


OSS is developed and/or maintained by volunteer programmers, so is any single party fully accountable for it?

Yes. For most common open-source projects there is a non-profit foundation or a regular business supporting the software.

For example, Apache is supported through the Apache Software Foundation, and Red Hat Linux is supported and maintained by Red Hat, Inc.


Open source companies


IBM
uses and develops Apache and Linux; created Secure Mailer and other software on alphaWorks

Apple
released core layers of Mac OS X Server as an open source BSD operating system called Darwin; open-sourced the QuickTime Streaming Server and the OpenPlay network gaming toolkit

HP
uses and releases products running Linux

Sun
uses Linux; supports some open source development efforts (Eclipse IDE for Java and the Mozilla web browser)



Red Hat Software
Linux vendor

ActiveState
develops and sells professional tools for Perl, Python, and Tcl/tk developers.


Can We Get Support On OSS?


The most frequently cited reason against using OSS in corporations is the lack of support. With proprietary CSS we can rely on the vendor for support. However, there are professional companies providing service and support for open-source software (e.g., Red Hat for Linux, Zend for PHP, and recently Sun Microsystems for MySQL).
The Internet is another great and efficient source of informal support (newsgroups, FAQs and HOW-TO documents).


OPEN SOURCE LICENSES


TWO ORGANIZATIONS
Free Software Foundation (FSF)
Open Source Initiative (OSI)


Open source licensing


The licence is what determines whether software is open source.
The licence must be approved by the Open Source Initiative (OSI) (www.opensource.org).
The Open Source Initiative approves open source licences after they have successfully gone through the approval process and comply with the Open Source Definition (above).
All approved licences meet the Open Source Definition (www.opensource.org/docs/definition.php).
There are more than 50 approved licences, including the GPL, LGPL, MPL and BSD.


Copyright
Software is protected by copyright law.
By default only the owner of software may copy, adapt or distribute it.
The owner of software can agree to let another person copy, adapt or distribute the code; this agreement is called a licence.


Free/Open Source Software Licenses


OSI-approved:
Academic Free License
Apache Software License
Apple Public Source License
Artistic license
BSD license
GNU GPL
GNU LGPL
IBM Public License
Intel Open Source License
MIT license
Sun Public License


Open Source Software licensing and copyright


The two most common types of OSS licensing are:
o BSD (Berkeley Software Distribution) style: this category of license allows one to take open-source software and redistribute it, with or without modifications, as proprietary software (e.g., Apache, BIND).
o GNU General Public License (GPL): a license that requires any product derived from the original open-source software to be distributed under the same licensing regime as the original. Thus it cannot be turned into a closed-source product.


Free/Open Source Software Licenses


If you distribute your software under one of these licenses, you are permitted to say that your software is "OSI Certified Open Source Software."
If you see either of the OSI Certified certification marks on a piece of software, the software is being distributed under a license that conforms to the Open Source Definition.


BSD license
No restriction on derivative works.
The only restrictions placed on users of software released under a typical BSD license are that if they redistribute such software in any form, with or without modification, they must include in the redistribution
(1) the original copyright notice,
(2) a list of two simple restrictions and
(3) a disclaimer of liability.

These restrictions can be summarized as
(1) one should not claim to have written the software if one did not write it
(2) one should not sue the developer if the software does not function as expected or as desired


Mozilla Public License


Divides software into:
1. Open Source part
2. Anything added by the user

Things added by the user can be proprietary if the user does not modify the Open Source part.
Everything is Open Source if the user modifies the Open Source part.


Apache license
BSD + condition: OK to distribute code, but derived products may not use the Apache name without permission


Taxonomy of Software by FSF


1. Proprietary: the use, redistribution or modification of the software is prohibited, or requires you to ask for permission, or is restricted so much that you effectively can't do it freely.

2. Semi-free: not free, but comes with permission for individuals to use, copy, distribute, and modify (including distribution of modified versions) for non-profit purposes, e.g., PGP.

3. Free
A. Copylefted: redistribution cannot add additional restrictions
B. Non-copylefted: redistribution may add additional restrictions


Copyleft as explained by FSF


Copyleft is a general method for making a program free software and requiring all modified and extended versions of the program to be free software as well.

To copyleft a program, we first state that it is copyrighted; then we add distribution terms, which are a legal instrument that gives everyone the rights to use, modify, and redistribute the program's code or any program derived from it, but only if the distribution terms are unchanged.
Thus, the code and the freedoms become legally inseparable.


Free, copyrighted but not copylefted


Non-copylefted free software comes from the author with permission to redistribute and modify, and also to add additional restrictions to it.

If a program is free but not copylefted, then some copies or modified versions may not be free at all.

A software company can compile the program, with or without modifications, and distribute the executable file as a proprietary software product.

Example: X11 Window System


BSD vs GPL
The biggest difference between the GPL and BSD licenses is the fact that the former is a copyleft license and the latter is not.

Copyleft is the application of copyright law to permit the free creation of derivative works while requiring that such works be redistributable under the same terms (i.e., the same license) as the original work.


Typical OSS development model


[Figure: typical OSS development model. Nodes: User, Developer, Development Community, Trusted Developer, Trusted Repository, Distributor. Flows: bug reports; improvements (as source code) and evaluation results ("user as developer").]

"Stone soup development"
OSS users typically use software without paying licensing fees
OSS users typically pay for training & support (competed)
OSS users are responsible for paying for or developing new improvements & any evaluations that they need; they often cooperate with others to do so
Goal: active development community (like a consortium)

Advantages of open source software


1. The availability of the source code and the right to modify it is very important. It enables the unlimited tuning and improvement of a software product. It also makes it possible to port the code to new hardware, to adapt it to changing conditions, and to reach a detailed understanding of how the system works.

2. The right to redistribute modifications and improvements to the code, and to reuse other open source code, permits all the advantages due to the modifiability of the software to be shared by large communities.

3. No exclusive rights to the software. Open source software is open to everyone. Because of this, no individual programmer or company can dictate the direction that the development should take.


4. The right to use the software in any way. This, combined with redistribution rights, ensures (if the software is useful enough) a large population of users, which in turn helps to build up a market for support and customization of the software, which can only attract more and more developers to work on the project. This in turn helps to improve the quality of the product and its functionality.

5. The biggest advantage of open source for users is that most projects are free to download and use.


Cons
A common drawback of open source software is that the focus is often on backend processing of information and not on user interfaces. Microsoft Windows has arguably one of the easiest interfaces with which to work. Often, open source software such as Linux requires the user to have specialized knowledge, because it cannot be configured with just clicks of a mouse. In addition, open source projects often do not have good documentation to walk the user through learning and using the technologies.


Types of Operating System


Tasks
Uni-tasking, Multi-tasking

Users
Single-user, Multi-user

Processing
Uni-processing, Multi-processing

Timesharing
is the sharing of a computing resource among many users by means of multiprogramming and multi-tasking


History
Multics 1964
Unics 1969
Minix 1987
Linux 1991


Multics
Multiplexed Information and Computing Service
Development began in 1964
Mainframe timesharing OS
Last running installation was shut down on October 30, 2000
Monolithic kernel
Disadvantages: crashes, insecure, error prone, expensive


Unics
Uniplexed Information and Computing System
Later renamed UNIX
Written in 1969
Ken Thompson and Dennis Ritchie were among the developers
Multi-user, multi-tasking and timesharing
Monolithic kernel


Minix
Minimal Unix, hence the name Minix
Developed by Tanenbaum, mainly for educational purposes
Unix-like OS, implemented with a microkernel


What is Linux?
Developed in 1991 by Linus Torvalds, a student at the University of Helsinki, Finland.
Basically a kernel, it was combined with the various software and compilers from the GNU Project to form an OS, called GNU/Linux.
Linux is a full-fledged OS available in the form of various Linux distributions.
RedHat, Fedora, SuSE, Ubuntu and Debian are examples of Linux distros.
Linux is supported by big names such as IBM, Google, Sun, Novell, Oracle, HP, Dell, and many more.


Linux
Used on a wide range of computers, from supercomputers to embedded systems
Multi-user
Multi-tasking
Time sharing
Monolithic kernel
Stable Linux kernel version 2.6.28 was released on 24-Dec-2008


History of Linux
Inspired by the UNIX OS, the Linux kernel was developed as a clone of UNIX.
GNU was started in 1984 with a mission to develop a free UNIX-like OS.
Linux was the best fit as the kernel for the GNU Project.
The Linux kernel was passed on to many interested developers throughout the Internet.
Linux today is the result of the efforts of thousands of individuals, apart from Torvalds.


Free Software Foundation & GNU


Organisation that started developing copylefted programs.
GNU Project: announced by Richard Stallman on September 27th, 1983.
The GNU Project was launched in 1984 to develop a complete Unix-like operating system which is free software: the GNU system.

GNU's kernel isn't finished, so GNU is used with the kernel Linux. The combination of GNU and Linux is the GNU/Linux operating system, now used by millions.
www.gnu.org


Linux on Servers and Supercomputers


Linux is the most used OS on servers.
5 out of 10 reliable web hosting companies use Linux.
Linux is the cornerstone of the LAMP server-software combination (Linux, Apache, MySQL, Perl/PHP/Python), which has achieved popularity among developers.
Out of the top 500 supercomputers, Linux is deployed on 426.


Linux on Embedded Systems


16.7% of smartphones worldwide use Linux as their OS.
Linux poses major competition to the most popular OS in this segment, Symbian.
Nokia and Openmoko supply Linux on select smartphones.


Why should you use Linux?


No threat of viruses
Linux systems are extremely stable
Linux is free
Linux comes with most of the required software pre-installed
Linux never gets slow
Linux can run even on the oldest hardware


FOSS
Free Open Source Software
Free: means liberty, and is not related to price or cost
Open: source code is available and anybody can contribute to the development
Organization independent


4 Freedoms with FOSS


Freedom to run the software anywhere
Freedom to study how the programs work, i.e., the source code is accessible
Freedom to redistribute copies
Freedom to improve the software

If software has all four of these freedoms, then it is FOSS.


Kernel
Core or nucleus of an operating system
Interacts with the hardware
First program to get loaded when the system starts; it runs until the session is terminated
Different from the BIOS, which is hardware dependent; the kernel is software dependent
LINUX: on the hard disk, the kernel is represented by the file /vmlinuz.


Kernel types

Monolithic
All OS-related code is placed in a single module
Available as a single file
Advantage: faster functioning

Micro
OS components are isolated and run in their own address space
Device drivers, programs and system services run outside kernel memory space. Only a few functions, such as process scheduling and interprocess communication, are included in the microkernel
Supports modularity and is smaller in size


Shell
Program that interacts with the kernel
Bridge between the kernel and the user
Command interpreter
The user types a command; the command is conveyed to the kernel and executed
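
The command-interpreter role described above can be sketched in a few lines of Python. This is only an illustration, not how any real shell is implemented: the function names run_line and repl and the prompt "mysh$ " are made up for this example, and features such as pipes, redirection and built-in commands are left out.

```python
import shlex
import subprocess


def run_line(line):
    """Parse one command line, run it, and return its exit status."""
    argv = shlex.split(line)      # split into program name and arguments
    if not argv:
        return 0                  # empty line: nothing to do
    try:
        # subprocess asks the kernel to create a new process that runs
        # the requested program, then waits for it to finish.
        return subprocess.run(argv).returncode
    except FileNotFoundError:
        print(f"{argv[0]}: command not found")
        return 127                # conventional "not found" status


def repl():
    """Read commands until 'exit' or end of input (hypothetical prompt)."""
    while True:
        try:
            line = input("mysh$ ")
        except EOFError:
            break
        if line.strip() == "exit":
            break
        run_line(line)
```

Calling repl() starts the loop; each typed line is handed to run_line, which passes the request on to the kernel.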


Types of Shell
sh: Bourne shell
BASH: Bourne Again Shell
KSH: Korn Shell
CSH: C Shell
SSH: Secure Shell (used to log in to a shell on a remote machine)
To use a particular shell, type the shell name at the command prompt. E.g., $ csh will switch the current session to the C shell.
To view your login shell, type echo $SHELL at the command prompt.


Linux Distributions
Today there are hundreds of different distributions available. Popular Linux distributions include:
SUSE Linux
Fedora Linux
Red Hat Enterprise Linux
Debian Linux
ALT Linux
TurboLinux
Mandrake Linux
Lycoris Linux
Linspire
Gentoo Linux
Ubuntu


LINUX DISTRIBUTIONS
Red Hat Linux: One of the original Linux distributions. The commercial, non-free version is Red Hat Enterprise Linux, which is aimed at big companies using Linux servers and desktops in a big way. Free version: Fedora Project.
Debian GNU/Linux: A free software distribution. Popular for use on servers. However, Debian is not what many would consider a distribution for beginners, as it's not designed with ease of use in mind.
SuSE Linux: SuSE was recently purchased by Novell. This distribution is primarily for pay because it contains many commercial programs, although there's a stripped-down free version available that you can download.
Mandrake Linux: Mandrake is perhaps strongest on the desktop. Originally based on Red Hat Linux.
Gentoo Linux: Gentoo is a specialty distribution meant for programmers.

Linux OS


[Figure: an operating system shared by several users (User 1, User 2)]


OPERATING MODES


USER MODE
User mode is the normal mode of operation for programs. Web browsers, calculators, etc. all run in user mode.

They don't interact directly with the kernel; instead, they just give instructions on what needs to be done, and the kernel takes care of the rest.

Code running in user mode must delegate to system APIs to access hardware or memory.
Due to the protection afforded by this sort of isolation, crashes in user mode are generally recoverable.
Most of the code running on your computer will execute in user mode.
When in user mode, some parts of RAM can't be addressed, some instructions can't be executed, and I/O ports can't be accessed.


KERNEL MODE
Kernel mode, on the other hand, is where programs communicate directly with the kernel.
Kernel-mode programs run in the background, making sure everything runs smoothly: things like printer drivers, display drivers, and drivers that interface with the monitor, keyboard, mouse, etc.
The executing code has complete and unrestricted access to the underlying hardware. It can execute any CPU instruction and reference any memory address.
Kernel mode is generally reserved for the lowest-level, most trusted functions of the operating system.
Crashes in kernel mode are catastrophic; they will halt the entire PC.


KERNEL MODE
A good example of this would be device drivers.
A device driver must tell the kernel exactly how to interact with a piece of hardware, so it must be run in kernel mode.

Because of this close interaction with the kernel, the kernel is also a lot more vulnerable to programs running in this mode, so it becomes highly crucial that drivers are properly debugged before being released to the public.


SWITCHING FROM USER MODE TO KERNEL MODE


The only way a user-space application can explicitly initiate a switch to kernel mode during normal operation is by making a system call such as open, read, write, etc.
Whenever a user application calls these system call APIs with appropriate parameters, a software interrupt/exception (SWI) is triggered.
As a result of this SWI, control of the code execution jumps from the user application to a predefined location in the Interrupt Vector Table (IVT) provided by the OS.
This IVT contains the address of the SWI exception handler routine, which performs all the steps required to switch the user application to kernel mode and start executing kernel instructions on behalf of the user process.
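
As a small illustration of the system calls named above, the sketch below uses Python's os module, whose low-level functions are thin wrappers around the corresponding kernel services. The helper name write_then_read is made up for this example; each commented call ends in a switch from user mode to kernel mode and back.

```python
import os
import tempfile


def write_then_read(data: bytes) -> bytes:
    """Write data to a temporary file and read it back via file descriptors."""
    fd, path = tempfile.mkstemp()         # file is created and opened (open)
    try:
        os.write(fd, data)                # write system call
        os.lseek(fd, 0, os.SEEK_SET)      # lseek: rewind to the start
        return os.read(fd, len(data))     # read system call
    finally:
        os.close(fd)                      # close system call
        os.remove(path)                   # unlink: delete the temporary file
```

On a Linux system, running a script that calls this function under a tracing tool such as strace would list the individual system calls it makes.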


Decomposition of Linux System into Major Subsystems


User Applications -- the set of applications in use on a particular Linux system will be different depending on what the computer system is used for. Examples: word processing and a web browser.

O/S Services -- these are services that are typically considered part of the operating system (a windowing system, command shell, etc.); also, the programming interface to the kernel (compiler tool and library) is included in this subsystem.

Linux Kernel -- the kernel abstracts and mediates access to the hardware resources, including the CPU.

Hardware Controllers -- this subsystem is comprised of all the possible physical devices in a Linux installation; for example, the CPU, memory hardware, hard disks, and network hardware.


The fundamental architecture of the GNU/Linux operating system


User Applications
At the top is the user, or application, space.
This is where the user applications are executed.
Below the user space is the kernel space, where the Linux kernel exists.


GNU C Library (glibc)


Provides the system call interface that connects to the kernel.
Provides the mechanism to transition between the user-space application and the kernel. This is important because the kernel and user application occupy different protected address spaces.
While each user-space process occupies its own virtual address space, the kernel occupies a single address space.
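
To make the role of the C library concrete, the following sketch calls the library's getpid wrapper directly through Python's ctypes module. It assumes a Linux (or other POSIX) system where the C library is already loaded into the running process; the helper name pid_via_libc is made up for this example.

```python
import ctypes
import os

# Passing None asks for the symbols already loaded into this process,
# which on Linux includes the C library (glibc on most distributions).
libc = ctypes.CDLL(None)


def pid_via_libc() -> int:
    """Ask the kernel for our process ID through the C library wrapper."""
    return libc.getpid()
```

The value should match os.getpid(), since both paths end in the same request to the kernel.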


Fundamental architecture of the GNU/Linux operating system


The Linux kernel can be further divided into three gross levels.
At the top is the system call interface, which implements the basic functions such as read and write.
Below the system call interface is the kernel code, which can be more accurately defined as the architecture-independent kernel code. This code is common to all of the processor architectures supported by Linux.
Below this is the architecture-dependent code, which forms what is more commonly called a BSP (Board Support Package). This code serves as the processor- and platform-specific code for the given architecture.


File management

Directory Tree
(root)

When you log on to the Linux OS using your username, you are automatically placed in your home directory.

Most important subdirectories


/bin : Important Linux commands available to the average user.
/boot : The files necessary for the system to boot. Not all Linux distributions use this one. Fedora does.
/dev : Device files. These are the files that your Linux system uses to talk to your hardware.
/etc : System configuration files.
/home : Every user except root gets her own folder in here, named for her login account. So, the user who logs in with linda has the directory /home/linda, where all of her personal files are kept.
/lib : System libraries. Libraries are just bunches of programming code that the programs on your system use to get things done.

/mnt : Mount points. When you temporarily load the contents of a CD-ROM or USB drive, you typically use a special name under /mnt.
/root : The root user's home directory.
/sbin : Essential commands that are only for the system administrator.
/tmp : Temporary files and storage space. Don't put anything in here that you want to keep. Most Linux distributions (including Fedora) are set up to delete any file that's been in this directory longer than three days.
/usr : Programs and data that can be shared across many systems and don't need to be changed.
/var : Data that changes constantly (log files that contain information about what's happening on your system, data on its way to the printer, and so on).
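
A quick way to see which of these directories exist on a given machine is sketched below. The list STANDARD_DIRS simply repeats the directories named on these slides, and the function name survey is made up for this example; the result will differ between distributions.

```python
import os

# Directories discussed on the slides (not an exhaustive list).
STANDARD_DIRS = ["/bin", "/boot", "/dev", "/etc", "/home", "/lib",
                 "/mnt", "/root", "/sbin", "/tmp", "/usr", "/var"]


def survey(dirs=STANDARD_DIRS):
    """Map each directory name to True if it exists on this system."""
    return {d: os.path.isdir(d) for d in dirs}


for name, present in survey().items():
    print(f"{name:6} {'present' if present else 'missing'}")
```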

Home directory
Right after logging in, you can see what your home directory is called by entering pwd (print current working directory).

THE KERNEL
Block diagram of Linux Kernel

Linux Kernel - System Call Interface

A system call is the mechanism used by an application program to request service from the operating system.
An API is a function definition that specifies how to obtain a given service (e.g., calloc, malloc, free), while a system call is an explicit request to the kernel made via a software interrupt.
[Figure: invoking a system call from a user-mode process]

Five main subsystems-Overview


The Process Scheduler (SCHED) is responsible for controlling process access to the CPU.
The scheduler enforces a policy that ensures that processes will have fair access to the CPU, while ensuring that necessary hardware actions are performed by the kernel on time.

The Memory Manager (MM) permits multiple processes to securely share the machine's main memory system.
In addition, the memory manager supports virtual memory, which allows Linux to support processes that use more memory than is available in the system.
Unused memory is swapped out to persistent storage using the file system, then swapped back in when it is needed.

Five main subsystems


The Virtual File System (VFS) abstracts the details of the variety of hardware devices by presenting a common file interface to all devices.
In addition, the VFS supports several file system formats that are compatible with other operating systems.

The Network Interface (NET) provides access to several networking standards and a variety of network hardware.

The Inter-Process Communication (IPC) subsystem supports several mechanisms for process-to-process communication on a single Linux system.
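
One of the simplest IPC mechanisms is the pipe. The sketch below, which assumes a Unix-like system where os.fork is available, creates a pipe, forks a child process that writes a message, and has the parent read it back. The function name send_through_pipe is made up for this example.

```python
import os


def send_through_pipe(message: bytes) -> bytes:
    """Send message from a child process to its parent through a pipe."""
    read_fd, write_fd = os.pipe()         # kernel creates the pipe
    pid = os.fork()                       # kernel creates the child process
    if pid == 0:
        # Child: write the message, then exit without returning.
        try:
            os.close(read_fd)
            os.write(write_fd, message)
        finally:
            os._exit(0)
    # Parent: close the unused write end so read() can see end-of-file.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:                     # empty read: the writer has closed
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)                    # reap the child process
    return b"".join(chunks)
```

The two processes share no memory here; the bytes travel through a buffer owned by the kernel.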

Kernel Subsystem Overview

Linux Kernel-Memory Management


Linux's physical memory-management system deals with allocating and freeing pages, groups of pages, and small blocks of memory.

It has additional mechanisms for handling virtual memory, memory mapped into the address space of running processes.

Managing Physical Memory


The page allocator allocates and frees all physical pages; it can allocate ranges of physically contiguous pages on request.

The allocator uses a buddy-heap algorithm to keep track of available physical pages.
Each allocatable memory region is paired with an adjacent partner.
Whenever two allocated partner regions are both freed up, they are combined to form a larger region.
If a small memory request cannot be satisfied by allocating an existing small free region, then a larger free region will be subdivided into two partners to satisfy the request.
Memory allocations in the Linux kernel occur either statically (drivers reserve a contiguous area of memory during system boot time) or dynamically (via the page allocator).
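
The splitting and merging of partner regions can be illustrated with a toy model. The class below is a simplified sketch written for these notes, not the kernel's actual code: regions are identified by their starting page number and an "order" (a region of order k spans 2**k pages), and the names BuddyAllocator, alloc and free_block are made up for this example.

```python
class BuddyAllocator:
    """Toy buddy allocator over a power-of-two number of pages."""

    def __init__(self, total_pages):
        if total_pages < 1 or total_pages & (total_pages - 1):
            raise ValueError("total_pages must be a power of two")
        self.max_order = total_pages.bit_length() - 1
        # free[k] holds the start pages of free regions of 2**k pages.
        self.free = {k: set() for k in range(self.max_order + 1)}
        self.free[self.max_order].add(0)

    def alloc(self, order):
        """Return the start page of a region of 2**order pages, or None."""
        for k in range(order, self.max_order + 1):
            if self.free[k]:
                start = min(self.free[k])
                self.free[k].remove(start)
                # Subdivide until the region has the requested size; the
                # upper half of each split becomes a free partner.
                while k > order:
                    k -= 1
                    self.free[k].add(start + (1 << k))
                return start
        return None

    def free_block(self, start, order):
        """Free a region and merge it with its partner while possible."""
        while order < self.max_order:
            buddy = start ^ (1 << order)  # partner differs in one bit
            if buddy not in self.free[order]:
                break
            self.free[order].remove(buddy)
            start = min(start, buddy)
            order += 1
        self.free[order].add(start)
```

For example, with eight pages, two single-page allocations split the initial region three times; freeing both pages merges the partners back into one region of eight pages.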

Virtual Memory
The VM system maintains the address space visible to each process: it creates pages of virtual memory on demand, and manages the loading of those pages from disk or their swapping back out to disk as required.
The VM manager maintains two separate views of a process's address space:
A logical view describing instructions concerning the layout of the address space. The address space consists of a set of non-overlapping regions, each representing a continuous, page-aligned subset of the address space.
A physical view of each address space, which is stored in the hardware page tables for the process.

File System
A file system is the set of methods and data structures that an operating system uses to keep track of files on a disk or partition; that is, the way the files are organized on the disk.
A file is an ordered string of bytes.
Files are organized in directories.
File information such as size, owner, and access permissions is stored in a separate data structure called an inode.
The superblock is a data structure containing information about the file system as a whole.
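The inode fields mentioned above are visible from user space through the stat() system call. The sketch below copies a few of them into a small struct; the struct and function names are hypothetical and exist only for illustration.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* A few inode fields of interest (hypothetical helper type). */
struct inode_summary {
    ino_t  inode;   /* inode number */
    off_t  size;    /* size in bytes */
    uid_t  owner;   /* owning user ID */
    mode_t perms;   /* permission bits only, e.g. 0644 */
    int    is_dir;  /* nonzero if the path names a directory */
};

/* Fill `out` from the inode of `path`; returns 0 on success, -1 on error. */
int read_inode_summary(const char *path, struct inode_summary *out)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    out->inode  = st.st_ino;
    out->size   = st.st_size;
    out->owner  = st.st_uid;
    out->perms  = st.st_mode & 07777;
    out->is_dir = S_ISDIR(st.st_mode);
    return 0;
}
```

Calling read_inode_summary("/", &s) should succeed and report a directory, while a nonexistent path returns -1.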

Filesystem
The Virtual Filesystem (also known as Virtual Filesystem Switch or VFS) is a kernel software layer that handles all system calls related to a standard Unix filesystem.
Its main strength is providing a common interface to several kinds of filesystems. For example, the same copy operation can copy a file from an MS-DOS filesystem to a Linux filesystem.

Network stack
The network stack, by design, follows a layered architecture modeled after the protocols themselves. Recall that the Internet Protocol (IP) is the core network layer protocol and sits below the transport protocols, most commonly the Transmission Control Protocol (TCP).
Above TCP is the sockets layer, which is invoked through the system call interface (SCI). The sockets layer is the standard API to the networking subsystem and provides a user interface to a variety of networking protocols.


From raw frame access to IP protocol data units and up to TCP and the User

Datagram Protocol (UDP), the sockets layer provides a standardized way to manage connections and move data between endpoints.
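As a small illustration of the sockets layer acting as one API over several protocols, the sketch below opens a TCP and a UDP endpoint with the same call, changing only the socket type. The helper names are hypothetical; socket() and getsockopt() are the standard calls.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open an IPv4 socket: SOCK_STREAM selects TCP, SOCK_DGRAM selects UDP.
 * Returns a file descriptor, or -1 on error. */
int open_inet_socket(int type)
{
    return socket(AF_INET, type, 0);
}

/* Ask the kernel which type a socket has; returns -1 on error. */
int socket_type(int fd)
{
    int type = 0;
    socklen_t len = sizeof type;

    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) != 0)
        return -1;
    return type;
}
```

The returned descriptors are ordinary file descriptors, so the caller releases them with close() when done.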

Architecture-dependent code

While much of Linux is independent of the architecture on which it runs, there are elements that must consider the architecture for normal operation and for efficiency.
The ./linux/arch subdirectory defines the architecture-dependent portion of the kernel source, contained in a number of subdirectories that are specific to the architecture.
Each architecture subdirectory contains a number of other subdirectories that focus on a particular aspect of the kernel, such as boot, kernel, memory management, and others.

TYPES OF PROCESSES & PROCESS MANAGEMENT IN LINUX

Linux Kernel_Process
A process is a program in execution.
A process is represented in the OS by a Process Control Block (PCB).

Interactive process
Interactive processes are those processes that are invoked by a user and can

interact with the user.


Examples: shells, text editors, GUI applications.
Interactive processes can be classified into foreground and background processes.
The foreground process is the process that you are currently interacting with, and is using the terminal as its stdin (standard input) and stdout (standard output).
A background process is not interacting with the user and can be in one of two states: paused or running.
There has to be someone connected to the system to start these processes; they are not started automatically as part of the system functions.

System Process
Daemon (pronounced "day-mon") is the term used to refer to processes that are running on the computer and provide services but do not interact with the console.
Most server software is implemented as a daemon. Apache and Samba are both examples of daemons.
Any process can become a daemon as long as it is run in the background and does not interact with the user.
A simple example of this is running the ls -l command in the background by typing ls -l &

Automatic or batch processes


Automatic or batch processes are not connected to a terminal. Rather, these are tasks that can be queued into a spooler area, where they wait to be executed on a FIFO (first-in, first-out) basis.
Such tasks can be executed using one of two criteria:
At a certain date and time: done using the at command.
At times when the total system load is low enough to accept extra jobs: done using the batch command.
By default, tasks are put in a queue where they wait to be executed until the system load is lower than 0.8.

Batch processes
In large environments, the system administrator may prefer batch

processing when large amounts of data have to be processed or when tasks demanding a lot of system resources have to be executed on an already loaded system.
Batch processing is also used for optimizing system performance.


Process State-FLAGS
Each process on the system is in exactly one of five different states. This value

is represented by one of five flags:


TASK_RUNNING: The process is runnable; it is either currently running or

on a runqueue waiting to run


This is the only possible state for a process executing in user-space; it can

also apply to a process in kernel-space that is actively running.


TASK_INTERRUPTIBLE: The process is sleeping (that is, it is blocked), waiting for some condition to become true or for a signal to be received.
When the condition becomes true, the kernel sets the process's state to TASK_RUNNING.
The process also wakes up and becomes runnable if it receives a signal.

FLAGS
TASK_UNINTERRUPTIBLE:
This state is identical to TASK_INTERRUPTIBLE except that the process does not wake up and become runnable if it receives a signal.
This is used in situations where the process must wait without interruption or when the event is expected to occur quite quickly.
Because the task does not respond to signals in this state, TASK_UNINTERRUPTIBLE is less often used than TASK_INTERRUPTIBLE.

PROCESS STATES-FLAGS

TASK_ZOMBIE:
The task has terminated, but its parent has not yet issued a wait() system call.
The task's task structure must remain in case the parent wants to access it. If the parent calls wait(), the task structure is deallocated.

TASK_STOPPED:
Process execution has stopped; the task is not running nor is it eligible to

run.
This occurs if the task receives the SIGSTOP, SIGTSTP, SIGTTIN, or

SIGTTOU signal or if it receives any signal while it is being debugged.

Zombie process
A zombie process or defunct process is a process that has completed

execution but still has an entry in the process table.


This entry is still needed to allow the parent process to read its child's exit

status.
When a process ends, all of the memory and resources associated with it

are deallocated so they can be used by other processes. However, the process's entry in the process table remains.
The parent can read the child's exit status by executing the wait system

call, whereupon the zombie is removed.

ZOMBIE PROCESS
When a child exits, the parent process will receive a SIGCHLD signal to

indicate that one of its children has finished executing; the parent process will typically call the wait() system call at this point.
That call will provide the parent with the child's exit status, and will cause the child to be reaped, or removed from the process table.
It's possible that the parent process is intentionally leaving the process in a zombie state to ensure that future children that it may create will not receive the same pid.

Causes of Zombie Processes


When a subprocess exits, its parent is supposed to use the "wait" system

call and collect the process's exit information.


The subprocess exists as a zombie process until this happens, which is

usually immediately.
However, if the parent process isn't programmed properly or has a bug

and never calls "wait," the zombie process remains, eternally waiting for

its information to be collected by its parent.

Process states(USING FLAGS)

Process states in Linux:


Running: the process is either running or ready to run.
Interruptible: a blocked state in which the process waits for an event or a signal from another process.
Uninterruptible: a blocked state in which the process waits for a hardware condition and cannot handle any signal.
Stopped: the process is stopped or halted and can be restarted by some other process.
Zombie: the process has terminated, but its information is still in the process table.
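On a Linux system with /proc mounted, these states can be read as one-letter codes from /proc/<pid>/stat: R (running), S (interruptible sleep), D (uninterruptible sleep), T (stopped) and Z (zombie). A minimal sketch, with a hypothetical function name:

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the one-letter state of `pid` as reported by /proc/<pid>/stat,
 * or '?' if it cannot be read. */
char proc_state(pid_t pid)
{
    char path[64], buf[512];

    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return '?';
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* Layout is "pid (command) S ..."; the command itself may contain
     * ')', so look for the last one. */
    char *close_paren = strrchr(buf, ')');
    if (!close_paren || close_paren[1] != ' ' || close_paren[2] == '\0')
        return '?';
    return close_paren[2];
}
```

A process that inspects itself, as in proc_state(getpid()), sees R, because it is necessarily running while it reads the file.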

System calls used for process management in linux

System calls used for Process management:


fork(): create a new process
exec(): execute a new program
wait(): wait until a child process finishes execution
exit(): exit from the process
getpid(): get the unique process ID of the calling process
getppid(): get the process ID of the parent process

Pid and Parentage


A process ID or pid is a positive integer that uniquely identifies a running

process, and is stored in a variable of type pid_t.


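You can get the pid of the calling process or of its parent:
getpid - getpid returns the PID of the calling process.
getppid - getppid returns the PID of the parent of the calling process.

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t pid, ppid;

    /* pid_t is a signed integer type; cast it for printf */
    printf("My PID is: %ld\n\n", (long)(pid = getpid()));
    printf("Parent PID is: %ld\n\n", (long)(ppid = getppid()));
    return 0;
}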

2. fork()
#include <sys/types.h>
#include <unistd.h>

pid_t fork(void);

Creates a child process by making a copy of the parent process: an exact duplicate.
Implicitly specifies code, registers, stack, data, and files.
Both the child and the parent continue running.

fork() as a diagram
Parent: pid = fork() returns a new PID, e.g. pid == 5
Child: pid = fork() returns 0, i.e. pid == 0
Program: shared between parent and child
Data: copied from parent to child

Process IDs (pids revisited)


When a fork is executed:
The child gets a unique pid.
The parent's file descriptor table is copied. (The implication is that all files open to the parent are also open to the child.)
The return value from fork is:
-1 on failure (e.g., the process table is full)
0 in the child
the pid of the child in the parent
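The three return values, and the fact that the child works on its own copy of the data, can be checked with a short sketch. The function name is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the parent's value of `x` after the child has modified its
 * own copy, or -1 on failure. */
int parent_value_after_fork(void)
{
    int x = 1;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                 /* fork failed */
    if (pid == 0) {                /* child: fork returned 0 */
        x = 99;                    /* changes only the child's copy */
        _exit(x == 99 ? 0 : 1);
    }

    /* parent: fork returned the child's pid */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return x;                      /* the parent's copy is unchanged */
}
```

The function returns 1: the assignment in the child never reaches the parent's variable.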

wait
wait is called by a parent process to await termination by one of its children. This allows a simple form of synchronization between parent and child.
pid_t wait(int *statloc);
When a process calls wait:
If the caller has no children, the wait call returns immediately with an error code.
If the caller has children, but none has terminated, the caller is blocked until one does.
If a child process has terminated which has not been waited for (a so-called zombie process), the child is removed from the process table, and the wait call returns with the status of the child.



Wait..
When the call returns, the return value is the pid of the terminated child

& the child's status is stored at statloc.


Two similar system calls, waitid and waitpid, are provided which

provide options to allow waiting on a specific child, and return without blocking.


wait() Actions
A process that calls wait() can:
suspend (block) if all of its children are still running, or
return immediately with the termination status of a child, or
return immediately with an error if there are no child processes.
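The three outcomes can be walked through in one function. This is a sketch under two assumptions: the caller has no other children when it starts, and a pipe is used only as a convenient way to keep the child alive until the parent is ready. In the second step, waitpid() with WNOHANG stands in for the blocking case so that the example does not actually suspend. The function name is hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the child's exit code (5) if every step behaves as described,
 * otherwise -1. */
int wait_outcomes(void)
{
    /* 1. No children: wait() fails at once with ECHILD. */
    errno = 0;
    if (wait(NULL) != -1 || errno != ECHILD)
        return -1;

    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        char c;
        close(fds[1]);
        /* Blocks until the parent closes its write end (read returns 0). */
        if (read(fds[0], &c, 1) < 0)
            _exit(1);
        _exit(5);
    }
    close(fds[0]);

    /* 2. Child still running: a plain wait() would block here, so the
     *    non-blocking variant reports "nothing to collect yet" (0). */
    int status;
    if (waitpid(pid, &status, WNOHANG) != 0)
        return -1;

    /* 3. Let the child finish; wait() returns its pid and status. */
    close(fds[1]);
    if (wait(&status) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```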

Using the exec Family


The exec functions replace the program running in a process with another program.
When a program calls an exec function, that process immediately ceases executing that program and begins executing a new program from the beginning, assuming that the exec call doesn't encounter an error.
Within the exec family, there are functions that vary slightly in their capabilities and how they are called.
Functions that contain the letter p in their names (execvp and execlp) accept a program name and search for a program by that name in the current execution path; functions that don't contain the p must be given the full path of the program to be executed.

Exec()
Functions that contain the letter v in their names (execv, execvp, and execve) accept the argument list for the new program as a NULL-terminated array of pointers to strings.
Functions that contain the letter l (execl, execlp, and execle) accept the argument list using the C language's varargs mechanism.
Functions that contain the letter e in their names (execve and execle) accept an additional argument, an array of environment variables.
The argument should be a NULL-terminated array of pointers to character strings. Each character string should be of the form VARIABLE=value.
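To make the v and e variants concrete, the sketch below runs a shell command through execve(), passing both the argument list and the environment as NULL-terminated arrays. It assumes /bin/sh exists on the system; the function name is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `script` with /bin/sh in a child whose environment is exactly
 * `envp`. Returns the shell's exit code, or -1 on failure. */
int run_sh_with_env(const char *script, char *const envp[])
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* "v": arguments as a NULL-terminated array of strings */
        char *const argv[] = { "sh", "-c", (char *)script, NULL };

        /* "e": explicit environment; no "p", so a full path is needed */
        execve("/bin/sh", argv, envp);
        _exit(127);                /* reached only if execve failed */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With an environment of { "GREETING=hello", NULL }, the script test "$GREETING" = hello exits with status 0, showing that the child sees the VARIABLE=value strings it was given.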

Simple Execlp Example


#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid;

    /* fork a child process */
    pid = fork();
    if (pid < 0) {
        fprintf(stderr, "Fork Failed\n");
        exit(EXIT_FAILURE);
    } else if (pid == 0) {
        /* child process: replace this program with ls */
        execlp("/bin/ls", "ls", (char *)NULL);
        /* reached only if execlp failed */
        perror("execlp");
        _exit(127);
    } else {
        /* parent process: wait for the child to complete */
        wait(NULL);
        printf("Child Complete\n");
        exit(0);
    }
}

SCHEDULING IN LINUX

Linux Process Scheduling Policy


A scheduling policy is the set of decisions you make regarding scheduling priorities, goals, and objectives.
A scheduling algorithm is the instructions or code that implements a given scheduling policy.
Linux has several conflicting objectives:
Fast process response time
Good throughput for background jobs
Avoidance of process starvation

Linux Process Scheduling Policy


Linux uses a timesharing technique.
Each process is assigned a small quantum, or time slice, during which it is allowed to execute.
Linux schedules processes according to a priority ranking, known as a goodness ranking.
Linux uses dynamic priorities, i.e., priorities are adjusted over time to eliminate starvation.
Processes that have not received the CPU for a long time get their priorities increased; processes that have received the CPU often get their priorities decreased.

Linux Process Scheduling Policy


We can classify processes using two schemes:
CPU-bound versus I/O-bound
I/O-bound programs perform only a small amount of computation before performing I/O. Such programs typically do not use up their entire CPU quantum.
CPU-bound programs, on the other hand, use their entire quantum without performing any blocking I/O operations.
Consequently, one could make better use of the computer's resources by giving higher priority to I/O-bound programs and allowing them to execute ahead of the CPU-bound programs.
Interactive versus batch versus real-time
These classifications are somewhat independent, e.g., a batch process can be either I/O-bound or CPU-bound.
Linux recognizes real-time programs and assigns them high priority.

Linux Process Scheduling Policy


Linux uses process preemption. A process is preempted when:
Its time quantum has expired
A new process enters the TASK_RUNNING state and its priority is greater than the priority of the currently running process
The preempted process is not suspended; it is still in the ready queue and simply no longer has the CPU.

Consider a text editor and a compiler
Since the text editor is an interactive program, its dynamic priority is higher than that of the compiler.
The text editor will block often since it is waiting for I/O.
When the I/O interrupt receives a key-press for the editor, the editor is put on the ready queue and the scheduler is called, since the editor's priority is higher than the compiler's. The editor gets the input and quickly blocks for more I/O.

Linux Process Scheduling Policy


Determining the length of the quantum
Should be neither too long nor too short
If too short, the overhead caused by process switching becomes excessively high.
If too long, processes no longer appear to be executing concurrently.
For Linux, long quanta do not necessarily degrade response time for interactive processes because their dynamic priority remains high, thus they get the CPU as soon as they need it.
The rule of thumb for Linux is to choose the longest possible quantum that does not affect responsiveness; this turns out to be about 20 clock ticks, or 210 milliseconds.

Linux Process Scheduling Algorithm


The Linux scheduling algorithm is not based on a continuous CPU time axis; instead, it divides the CPU time into epochs.
An epoch is a division of time, or a period of time.
In each epoch, every process gets a specified time quantum:
quantum = maximum CPU time assigned to the process in that epoch
the duration of the quantum is computed when the epoch begins
different processes may have different time quantum durations
when a process forks, the remainder of the parent's quantum is split between parent and child
An epoch ends when all runnable processes have exhausted their quanta.
At the end of an epoch, the scheduler algorithm recomputes the time-quantum durations of all processes. Then a new epoch starts and all processes get a new quantum.

Linux Process Scheduling Algorithm


When does an epoch end? Important!
An epoch ends when all processes in the ready queue have used their

quantum
This does not include processes that are blocking on some wait queue,

they will still have quantum remaining


The end of an epoch is only concerned with processes on the ready

queue

Linux Process Scheduling Algorithm


Selecting a process to run next: the scheduler considers the priority of each process. There are two kinds of priorities:
Static priorities - these are assigned to real-time processes and range from 1 to 99; they never change.
Dynamic priorities - these apply to conventional processes; the dynamic priority is calculated from the static priority and the average sleep time.
When a process wakes up, record how long it was sleeping, up to some maximum value.
When the process is running, decrease that value each timer tick.
The static priority of a real-time process is always higher than the dynamic priority of conventional processes.
Conventional processes will only execute when there are no real-time processes to execute.

Linux Process Scheduling Algorithm


Calculating process quanta for an epoch
Each process is initially assigned a base time quantum; as mentioned previously, it is about 20 clock ticks.
If a process uses its entire quantum in the current epoch, then in the next epoch it will get the base time quantum again.
If a process does not use its entire quantum, then the unused quantum carries over into the next epoch (the unused quantum is not directly used, but a bonus is calculated).
Why? Processes that block often will not use up their quantum; the carried-over value is used to calculate priority, which favors I/O-bound processes.
When forking a new child process, the parent process's remaining quantum is divided in half: half for the parent and half for the child.
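One way to realize the bonus rule is to give each process its base quantum plus half of whatever it left unused, which is the recalculation commonly described for the epoch-based scheduler of older kernels. The sketch below is a simplified model, not kernel source; the struct and function names are hypothetical, while priority and counter mirror the field names used in these slides.

```c
/* Simplified per-process scheduling data (hypothetical model). */
struct sched_info {
    int priority;  /* base time quantum, in clock ticks */
    int counter;   /* ticks left in the current epoch */
};

/* Start of a new epoch: half of any unused ticks is carried over as a
 * bonus on top of the base quantum. */
void start_new_epoch(struct sched_info *p)
{
    p->counter = p->counter / 2 + p->priority;
}

/* fork(): the parent's remaining ticks are split between parent and
 * child, so forking cannot be used to gain extra CPU time. */
void split_quantum_on_fork(struct sched_info *parent,
                           struct sched_info *child)
{
    child->priority = parent->priority;
    child->counter  = parent->counter / 2;
    parent->counter -= child->counter;
}
```

With a base quantum of 20 ticks, a CPU-bound process that used everything starts the next epoch with 20, while an I/O-bound process that left 16 unused starts with 28. Because the leftover is halved each time, the counter in this model never reaches twice the base quantum.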

Linux Process Scheduling Algorithm


Scheduling data in the process descriptor
The process descriptor (task_struct in Linux) holds essentially all of the information for a process, including scheduling information.
Recall that Linux keeps a list of all process task_structs and a list of all ready process task_structs.

Linux Process Scheduling Algorithm


Each process descriptor (task_struct) contains the following fields:
need_resched - this flag is checked every time an interrupt handler completes to decide if rescheduling is necessary
policy - the scheduling class of the process
For real-time processes this can have the value of
SCHED_FIFO - first-in, first-out with unlimited time quantum
SCHED_RR - round-robin with time quantum, fair CPU usage
For all other processes the value is
SCHED_OTHER
For processes that have yielded the CPU, the value is
SCHED_YIELD
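From user space, the policy of a process and the static-priority range of each policy can be queried through the POSIX scheduling calls in <sched.h>. A minimal sketch with hypothetical helper names (SCHED_YIELD is a kernel-internal flag and has no user-space counterpart here):

```c
#include <sched.h>

/* Scheduling policy of the calling process (SCHED_OTHER, SCHED_FIFO,
 * SCHED_RR, ...), or -1 on error. */
int current_policy(void)
{
    return sched_getscheduler(0);  /* pid 0 means "the caller" */
}

/* Store the valid static-priority range for `policy` in *lo and *hi;
 * returns 0 on success, -1 on error. */
int priority_range(int policy, int *lo, int *hi)
{
    *lo = sched_get_priority_min(policy);
    *hi = sched_get_priority_max(policy);
    return (*lo < 0 || *hi < 0) ? -1 : 0;
}
```

On Linux, the range reported for SCHED_FIFO and SCHED_RR is 1 to 99, and SCHED_OTHER has the single static priority 0.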

Linux Process Scheduling Algorithm


Process descriptor fields (cont)
rt_priority - the static priority of a real-time process; not used for other processes
priority - the base time quantum (or base priority) of the process
counter - the number of CPU ticks left in its quantum for the current epoch. This field is updated on every clock tick.
The priority and counter fields are used for timesharing and dynamic priorities in conventional processes.

Linux Process Scheduling Algorithm


Scheduling actually occurs in schedule()
Its objective is to find a process in the ready queue and then assign the CPU to it
It is invoked in two ways:
Direct invocation
Lazy invocation

Linux Process Scheduling Algorithm


Direct invocation of schedule()
The scheduler is invoked directly when the current process must be blocked right away because the resource it needs is not available.
A device driver can also invoke schedule() directly if it will be executing a long iterative task.
The current process is taken off the ready queue and is placed on the appropriate wait queue; its state is changed to TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE.


When a process is woken up and its priority is higher than that of the

current process

Linux Process Scheduling Algorithm


Lazy invocation of schedule()
The scheduler can also be invoked in a lazy way by setting the need_resched field to 1. This occurs when:
The current process has used up its quantum
A process is added to the ready queue and its priority is higher than that of the currently executing process
A process calls sched_yield()
sched_yield() causes the calling thread to relinquish the CPU. The thread is moved to the end of the queue for its static priority and a new thread gets to run.

Linux Process Scheduling Algorithm


Actions performed by schedule()
First it runs any kernel control paths that have not completed and other

uncompleted house-keeping tasks


Remember, the kernel is not preemptive, so it cannot switch to another

process if a process is already in the kernel or if the kernel is in the middle of doing something else
If the current process is SCHED_RR and has used all of its quantum, then it

is given a new quantum and placed at the end of the ready queue
If the process is not SCHED_RR, then it is removed from the ready queue

Linux Process Scheduling Algorithm


Actions performed by schedule() (cont)
It scans the ready queue for the highest priority process

It calculates the priority using the goodness() function


It may not find any processes that are good when all processes on the ready queue have used up their quantum (i.e., all have a zero counter field). In this case it must start a new epoch by assigning a new quantum to all processes.
If a higher priority process was found, then the scheduler performs a process switch.

Linux Process Scheduling Algorithm


How good is a runnable process?
Uses goodness() to determine priority:
goodness == -1000: do not select this process
goodness == 0: process has exhausted its quantum
0 < goodness < 1000: conventional process with quantum remaining
goodness >= 1000: real-time process
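The ranges above suggest how a goodness-style ranking can be computed and used to pick the next process. The following is a simplified sketch written for this section, not the kernel's implementation: the types and the selection loop are hypothetical, the -1000 case is omitted, and the weight of a conventional process is assumed to be its remaining ticks plus its base priority.

```c
#include <stddef.h>

enum policy { POLICY_OTHER, POLICY_FIFO, POLICY_RR };  /* hypothetical */

struct task {
    enum policy policy;
    int rt_priority;  /* 1..99, real-time tasks only */
    int priority;     /* base quantum of a conventional task */
    int counter;      /* ticks left in the current epoch */
};

/* Rank a runnable task: higher means more deserving of the CPU. */
int goodness(const struct task *t)
{
    if (t->policy != POLICY_OTHER)
        return 1000 + t->rt_priority;  /* real-time outranks all others */
    if (t->counter == 0)
        return 0;                      /* quantum exhausted */
    return t->counter + t->priority;   /* conventional, quantum left */
}

/* Index of the best task in the ready queue, or -1 when no task has a
 * positive goodness, meaning a new epoch has to be started. */
int pick_next(const struct task *rq, size_t n)
{
    int best = -1, best_weight = 0;

    for (size_t i = 0; i < n; i++) {
        int w = goodness(&rq[i]);
        if (w > best_weight) {
            best_weight = w;
            best = (int)i;
        }
    }
    return best;
}
```

Scanning the whole ready queue on every decision is also why this style of scheduler slows down as the number of runnable processes grows.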

Linux Process Scheduling Algorithm


Linux scheduler issues
Does not scale very well as the number of processes grows because it has to recompute dynamic priorities
Tries to minimize this by computing at the end of an epoch only
Large numbers of runnable processes can slow response time
Predefined quantum is too long for high system loads
I/O-bound process boosting is not optimal
Some I/O-bound processes are not interactive (e.g., database search or network transfer)
Support for real-time processes is weak

Personalities in Linux

Execution Domains_persona System call


The execution domain system allows Linux to provide limited support for binaries compiled under other UNIX-like operating systems.
A notable feature of Linux is its ability to execute files compiled for other operating systems. Of course, this is possible only if the files include machine code for the same computer architecture on which the kernel is running.
Two kinds of support are offered for these "foreign" programs:
Emulated execution: necessary to execute programs that include system calls that are not POSIX-compliant
Native execution: valid for programs whose system calls are totally POSIX-compliant
POSIX, the "Portable Operating System Interface", is a family of standards specified by the IEEE for maintaining compatibility between operating systems.

personality()

Microsoft MS-DOS and Windows programs are emulated: they cannot be natively executed, because they include APIs that are not recognized by Linux.
POSIX-compliant programs compiled on operating systems other than Linux can be executed without too much trouble, because POSIX operating systems offer similar APIs.
A process specifies its execution domain by setting the personality field of its descriptor. Each process has an associated personality identifier that can slightly modify the semantics of certain system calls. This is used primarily by emulation libraries to request that system calls be compatible with certain specific flavors of OS.
A process can change its personality by issuing a suitable system call named personality().
Programmers are not expected to directly change the personality of their programs; instead, the personality() system call should be issued by the glue code that sets up the execution context of the process.

Personalities in LINUX
Personality     Operating system
PER_LINUX       Standard execution domain
PER_BSD         BSD Unix
PER_SUNOS       SunOS
PER_RISCOS      RISC OS
PER_SOLARIS     Sun's Solaris

Linux supports different execution domains, or personalities, for each process. Among other things, execution domains tell Linux how to map signal numbers into signal actions.
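A process can inspect its own personality without changing it by passing the special value 0xffffffff to personality(), which then only reports the current setting. The sketch below assumes a Linux C library that provides <sys/personality.h>; the helper names are hypothetical.

```c
#include <sys/personality.h>

/* Query the current persona without changing it; returns -1 on error. */
int current_persona(void)
{
    return personality(0xffffffffUL);
}

/* Nonzero if the base execution domain (the low byte of the persona)
 * is the standard Linux one. */
int is_linux_domain(int persona)
{
    return (persona & PER_MASK) == PER_LINUX;
}
```

Only the low byte selects the execution domain; the remaining bits are flags that adjust individual behaviors, which is why the value is masked with PER_MASK before comparing.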
