Linux Open Source

UNIT-I
2/18/2013
What Software is Needed?

Operating Systems
Application Software
Software Development Tools
Web Services
Database Servers and RDBMSs
What is open-source software (OSS)?

Software comes in the form of compiled code (binaries), and the human-readable
source code from which these binaries are compiled.

Open-source software is software that is distributed as source code as well as
binaries.

The distributor cannot restrict any party from redistributing the software, nor can
any party be restricted from making modifications or making derivative works based on the source code.
Open Source Definition (OSD)

1. Free Redistribution: The license shall not restrict any party from selling or giving away the software as a component of an aggregate software distribution containing programs from several different sources. The license shall not require a royalty or other fee for such sale.

2. Source Code: The program must include source code, and must allow distribution in source code as well as compiled form.
3. Derived Works: The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software.

4. Integrity of the Author's Source Code: Modified versions may be required to be distinguishable from the base source.
5. No Discrimination Against Persons or Groups.
7. Distribution of License: No need for execution of an additional license.
8. License Must Not Be Specific to a Product: Rights must not depend on the program being part of a particular software distribution.
9. License Must Not Restrict Other Software.
What is open-source software (OSS)? (continued)
Open Source Software (OSS) is an example of a second-order Internet effect.
The first order was commercialization through buying and selling (e.g., Amazon and eBay).
The second order is based on collaboration and information sharing (e.g., Facebook).
Programmers throughout the world can be engaged in software development.
Open Source Vs. Closed Source Software
CSS
Developed by companies; developers work for economic purposes.
Centralized, single-site development.
Users may suggest requirements, but these may or may not be implemented.
Releases are infrequent; there may be only yearly releases.

OSS
Developed by volunteers who work for peer recognition; being recognized as a good developer is a great advantage.
Decentralized, distributed, multi-site development.
Users suggest additional features, which often get implemented.
Software is released on a daily or weekly basis.
Open Source Vs. Closed Source Software (Continued)
CSS
The market believes commercial CSS is highly secure because it is developed by a group of professionals confined to one geographical area under a strict time schedule. Quite often this is not the case: hiding information does not make software secure, it only hides its weaknesses.
Users cannot enhance security by modifying the source code.

OSS
OSS is not market driven; it is quality driven.
Community reaction to bug reports is much faster than with CSS, which makes it easier to fix bugs and make components highly secure.
The ability to modify the source code can be a great advantage if you want to deploy a highly secure system.
OSI
Open Source is a certification standard issued by the Open Source Initiative (OSI). It indicates that the source code of a computer program is made available free of charge to the general public.
OSI dictates that, in order to be considered "OSI Certified", a product must meet the following conditions:

The author or holder of the license of the source code cannot collect royalties on the distribution of the program.
The distributed program must make the source code accessible to the user.
The author must allow modifications and derivations of the work under the program's original name.
No person, group or field of endeavor can be denied access to the program.
Example of open source software

Programming Tools
Zope and PHP are popular engines behind the "live content" on the World Wide Web.
Languages: Perl, Python, Ruby, Tcl/Tk
GNU compilers and tools: GCC, Make, Autoconf, Automake, etc.
Open source software sites

Free Software Foundation www.fsf.org
Open Source Initiative www.opensource.org
Freshmeat.net
SourceForge.net
OSDir.com
developer.BerliOS.de
Bioinformatics.org
See also individual project sites, e.g., www.apache.org, www.cpan.org, etc.
What open-source software is available? (continued)
Software Development
o GCC - The compiler for C, C++, Fortran and Java that comes standard with all the major OSS operating systems http://gcc.gnu.org/
o JBOSS - A popular open-source implementation of J2EE http://www.jboss.org
o Perl - A very popular language widely used in scripts to drive "live content" on the World Wide Web http://www.perl.org
o PHP - A very popular scripting language for interactive web development and applications http://www.php.net
o Python - A popular object-oriented scripting language for web and desktop development http://www.python.org
What open-source software is available?

Multi-user Networked Operating Systems
o Linux - The most popular OSS operating system on the planet http://www.linux.org
Internet/intranet Services and Applications
o Apache web server - Accounts for over 60% of the web servers on the Internet http://www.apache.org
o BIND name server - The software that provides the DNS (Domain Name System). Many of the root name servers as well as the Internet backbone network ISPs use BIND http://www.isc.org/products/BIND/
o Sendmail (mail exchange server) - The most widely used email transport software on the Internet http://www.sendmail.org
What open-source software is available? (continued)
Database Systems
o MySQL - A very popular open-source RDBMS http://www.mysql.com
o PostgreSQL - A popular open-source RDBMS with many advanced features http://www.postgresql.org
Desktop Applications
o OpenOffice.org - An integrated office suite featuring word-processing, spreadsheet, drawing and presentation software largely compatible with Microsoft Office http://www.openoffice.org
o Ximian Evolution - A GUI desktop application for personal email, calendar and diary with a look and feel similar to Microsoft Outlook http://www.ximian.org
o Mozilla - The open-source evolution of the popular Netscape web browser http://www.mozilla.org
Can We Count On OSS?

OSS is developed and/or maintained by volunteer programmers, so is any single
party fully accountable for it?

Yes. Most common open-source projects are backed by a non-profit foundation or
a regular business that supports the software.

For example, Apache is supported through the Apache Software Foundation,
and Red Hat Linux is supported and maintained by Red Hat, Inc.
Open source companies

IBM
uses and develops Apache and Linux; created Secure Mailer and other software on AlphaWorks
Apple
released core layers of Mac OS X Server as an open source BSD operating system called Darwin; open sourced the QuickTime Streaming Server and the OpenPlay network gaming toolkit
HP
uses and releases products running Linux
Sun
uses Linux; supports some open source development efforts (Eclipse IDE for Java and the Mozilla web browser)
Open source companies

Red Hat Software
Linux vendor
ActiveState
develops and sells professional tools for Perl, Python, and Tcl/tk developers.
Can We Get Support On OSS?

The most frequently cited reason against using OSS in corporations is the lack of support.
With proprietary CSS we can rely on the vendor for support. However, there are professional companies providing service and support for open-source software (e.g. Red Hat for Linux, Zend for PHP, and recently Sun Microsystems for MySQL).
The Internet is another great source of informal support that is efficient (newsgroups, FAQs and HOW-TO documents).
OPEN SOURCE LICENSES
TWO ORGANIZATIONS
Free Software Foundation (FSF)
Open Source Initiative (OSI)
Open source licensing

The licence is what determines whether software is open source.
The licence must be approved by the Open Source Initiative (OSI) (www.opensource.org).
The OSI approves open source licences after they have successfully gone through its approval process and comply with the Open Source Definition (above).
All approved licences meet the Open Source Definition (www.opensource.org/docs/definition.php).
More than 50 licences are approved, including the GPL, LGPL, MPL and BSD.
Copyright
Software is protected by copyright law.
By default, only the owner of software may copy, adapt or distribute it.
The owner of software can agree to let another person copy, adapt or distribute the code; this agreement is called a licence.
Free/Open Source Software Licenses

OSI-approved:
Academic Free License
Apache Software License
Apple Public Source License
Artistic license
BSD license
GNU GPL
GNU LGPL
IBM Public License
Intel Open Source License
MIT license
Sun Public License
Open Source Software licensing and copyright

The two most common types of OSS licensing are:
o BSD (Berkeley Software Distribution) style: this category of license allows one to take open-source software and redistribute it, with or without modifications, as proprietary software (e.g. Apache, BIND).
o GNU General Public License (GPL): a license that requires any product derived from the original open-source software to be distributed under the same licensing regime as the original. Thus it cannot be turned into a closed-source product.
Free/Open Source Software Licenses

If you distribute your software under one of these licenses, you are permitted to say that your software is "OSI Certified Open Source Software."
If you see an OSI Certified certification mark on a piece of software, the software is being distributed under a license that conforms to the Open Source Definition.
BSD license
No restriction on derivative work.
The only restrictions placed on users of software released under a typical BSD license are that if they redistribute such software in any form, with or without modification, they must include in the redistribution:
(1) the original copyright notice,
(2) a list of two simple restrictions, and
(3) a disclaimer of liability.
These restrictions can be summarized as:
(1) one should not claim to have written the software if one did not write it;
(2) one should not sue the developer if the software does not function as expected or as desired.
Mozilla Public License

Divides software into:
1. the Open Source part
2. anything added by the user
Things added by the user can be proprietary, if the user does not modify the Open Source part.
Everything is Open Source if the user modifies the Open Source part.
Apache license
BSD-style terms plus a naming condition: products derived from the code may not use the Apache name without permission.
Taxonomy of Software by FSF

1. Proprietary: the use, redistribution or modification of the software is prohibited, or requires you to ask for permission, or is restricted so much that you effectively can't do it freely.
2. Semi-free: not free, but comes with permission for individuals to use, copy, distribute, and modify (including distribution of modified versions) for non-profit purposes, e.g. PGP.
3. Free
A. Copylefted: redistribution cannot add additional restrictions
B. Non-Copylefted:
Copyleft as explained by FSF

Copyleft is a general method for making a program free software and requiring all
modified and extended versions of the program to be free software as well.

To copyleft a program, we first state that it is copyrighted; then we add
distribution terms, which are a legal instrument that gives everyone the rights to
use, modify, and redistribute the program's code or any program derived from it but only if the distribution terms are unchanged.
Thus, the code and the freedoms become legally inseparable.
Free, copyrighted but not copylefted

Non-copylefted free software comes from the author with permission to
redistribute and modify, and also to add additional restrictions to it.

If a program is free but not copylefted, then some copies or modified versions may
not be free at all.

A software company can compile the program, with or without modifications,
and distribute the executable file as a proprietary software product.

Example: X11 Window System
BSD vs GPL
The biggest difference between the GPL and BSD licenses is the fact that the
former is a copyleft license and the latter is not.

Copyleft is the application of copyright law to permit the free creation of
derivative works but requiring that such works be redistributable under the same terms (i.e., the same license) as the original work.
Typical OSS development model

[Figure: "stone soup" development model. Nodes: User, Developer, Development Community, Trusted Developer, Trusted Repository, Distributor. Flows: bug reports; improvements (as source code) and evaluation results, with the user acting as developer.]

OSS users typically use software without paying licensing fees.
OSS users typically pay for training & support (competed).
OSS users are responsible for paying for or developing new improvements & any evaluations that they need; they often cooperate with others to do so.
Goal: an active development community (like a consortium).
Advantages of open source software
1. The availability of the source code and the right to modify it is very important.
It enables the unlimited tuning and improvement of a software product. It also makes it possible to port the code to new hardware, to adapt it to changing conditions, and to reach a detailed understanding of how the system works.
2. The right to redistribute modifications and improvements to the code, and to reuse other open source code, permits all the advantages due to the modifiability of the software to be shared by large communities.
3. No exclusive rights to the software.
Open source software is open to everyone. Because of this, no individual programmer or company can dictate the direction that development should take.
4. The right to use the software in any way.
This, combined with redistribution rights, ensures (if the software is useful enough) a large population of users, which in turn helps to build up a market for support and customization of the software, which attracts more and more developers to work on the project.
This in turn helps to improve the quality of the product and its functionality.
5. The biggest advantage of open source for users is that most projects are free to download and use.
Cons
One disadvantage is that the focus is often on back-end processing of information and not on user interfaces. Microsoft Windows has arguably one of the easiest interfaces to work with. Open source software such as Linux often requires specialized knowledge from the user, because it cannot always be configured with just a few clicks of a mouse.
In addition, open source projects often lack good documentation to walk the user through learning and using the technologies.
Types of Operating System

Tasks: uni-tasking, multi-tasking
Users: single-user, multi-user
Processing: uni-processing, multi-processing
Timesharing: the sharing of a computing resource among many users by means of multiprogramming and multi-tasking
History
Multics 1964
Unics 1969
Minix 1987
Linux 1991
Multics
Multiplexed Information and Computing Service
Development began in 1964
Mainframe timesharing OS
Last installation was shut down on October 30, 2000
Monolithic kernel
Disadvantages: crashes, insecure, error prone, expensive
Unics
Uniplexed Information and Computing System
Later renamed UNIX
Written in 1969
Ken Thompson and Dennis Ritchie were among the developers
Multi-user, multi-tasking and timesharing
Monolithic kernel
Minix
Minimal Unix, hence the name Minix
Developed by Tanenbaum, mainly for educational purposes
Unix-like OS, implemented with a microkernel
What is Linux?
Developed in 1991 by Linus Torvalds, then a student at the University of Helsinki in Finland.
Basically a kernel, it was combined with various software and compilers from the GNU Project to form an OS called GNU/Linux.
Linux is a full-fledged OS available in the form of various Linux distributions.
Red Hat, Fedora, SuSE, Ubuntu and Debian are examples of Linux distros.
Linux is supported by big names such as IBM, Google, Sun, Novell, Oracle, HP, Dell, and many more.
Linux
Used on a wide range of computers, from supercomputers to embedded systems
Multi-user
Multi-tasking
Time sharing
Monolithic kernel
Stable version of the Linux kernel: 2.6.28, released on 24-Dec-2008
History of Linux
Inspired by the UNIX OS, the Linux kernel was developed as a clone of UNIX.
GNU was started in 1984 with a mission to develop a free UNIX-like OS.
Linux was the best fit as the kernel for the GNU Project.
The Linux kernel was passed on to many interested developers throughout the Internet.
Linux today is the result of the efforts of thousands of individuals, apart from Torvalds.
Free Software Foundation & GNU

Free Software Foundation: the organisation that started developing copylefted programs.
GNU Project: announced by Richard Stallman on September 27, 1983.
The GNU Project was launched in 1984 to develop a complete Unix-like operating
system which is free software: the GNU system.

GNU's kernel isn't finished, so GNU is used with the kernel Linux. The combination
of GNU and Linux is the GNU/Linux operating system, now used by millions.
www.gnu.org
Linux on Servers and Supercomputers

Linux is the most used OS on servers.
5 out of 10 reliable web hosting companies use Linux.
Linux is the cornerstone of the LAMP server-software combination (Linux, Apache, MySQL, Perl/PHP/Python), which has achieved popularity among developers.
Out of the top 500 supercomputers, Linux is deployed on 426.
Linux on Embedded Systems

16.7% of smartphones worldwide use Linux as their OS.
Linux is a major competitor to Symbian, the most popular OS in this segment.
Nokia and Openmoko supply Linux on select smartphones.
Why should you use Linux?

Low exposure to viruses
Linux systems are extremely stable
Linux is free
Linux comes with most of the required software pre-installed
Linux tends not to slow down over time
Linux can run even on old hardware
FOSS
Free Open Source Software
Free: means liberty, and is not related to price or cost.
Open: source code is available and anybody can contribute to the development.
Organization independent.
4 Freedoms with FOSS

Freedom to run the software anywhere
Freedom to study how the programs work, i.e. the source code is accessible
Freedom to redistribute copies
Freedom to improve the software
If software has all 4 of these freedoms, then it is FOSS
Kernel
Core or nucleus of an operating system
Interacts with the hardware
First program to get loaded when the system starts; it runs until the session is terminated
Different from the BIOS, which is hardware dependent. The kernel is software dependent
Linux: on the hard disk, the kernel is represented by the file /vmlinuz
Kernel types
Monolithic
All OS-related code is placed in a single module
Available as a single file
Advantage: faster functioning
Micro
OS components are isolated and run in their own address space
Device drivers, programs and system services run outside kernel memory space. Only a few functions, such as process scheduling and interprocess communication, are included in the microkernel
Supports modularity and is smaller in size
Shell
Program that interacts with the kernel
Bridge between the kernel and the user
Command interpreter
The user types a command, the command is conveyed to the kernel, and it is executed
Types of Shell
SH: the basic (Bourne) shell
BASH: Bourne Again Shell
KSH: Korn Shell
CSH: C Shell
SSH: Secure Shell (used for remote login rather than as a local command interpreter)
To use a particular shell, type the shell name at the command prompt. E.g. $ csh will switch the current shell to the C shell.
To view the shell configured for your session (normally your login shell), type echo $SHELL at the command prompt.
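The two actions on this slide (looking up the shell and handing it a command) can be sketched in Python. This is a minimal illustration, assuming a Unix-like system with /bin/sh; the helper names login_shell and run_in_shell are hypothetical and not part of any standard tool.

```python
import os
import subprocess


def login_shell(default="/bin/sh"):
    """Return the value of the SHELL environment variable (what `echo $SHELL` prints)."""
    return os.environ.get("SHELL", default)


def run_in_shell(command, shell_path="/bin/sh"):
    """Pass one command line to a shell via -c and return its standard output as text."""
    result = subprocess.run(
        [shell_path, "-c", command],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print("Configured shell:", login_shell())
    print(run_in_shell("echo hello from the shell"), end="")
```

Passing a different shell_path (for example the path of csh, if it is installed) runs the same command under that shell instead.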
Linux Distributions
Today there are hundreds of different distributions available. Popular Linux distributions include:
SUSE Linux
Fedora Linux
Red Hat Enterprise Linux
Debian Linux
ALT Linux
TurboLinux
Mandrake Linux
Lycoris Linux
Linspire
Gentoo Linux
Ubuntu
LINUX DISTRIBUTIONS
Red Hat Linux: One of the original Linux distributions. The commercial, non-free version is Red Hat Enterprise Linux, which is aimed at big companies using Linux servers and desktops in a big way. Free version: Fedora Project.
Debian GNU/Linux: A free software distribution. Popular for use on servers. However, Debian is not what many would consider a distribution for beginners, as it's not designed with ease of use in mind.
SuSE Linux: SuSE was recently purchased by Novell. This distribution is primarily for pay because it contains many commercial programs, although there's a stripped-down free version that you can download.
Mandrake Linux: Mandrake is perhaps strongest on the desktop. Originally based on Red Hat Linux.
Gentoo Linux: Gentoo is a specialty distribution meant for programmers.
Operating System
[Figure: diagram showing the Linux OS serving User 1 and User 2]
OPERATING MODES
USER MODE
User mode is the normal mode of operating for programs. Web browsers, calculators, etc.
will all be in user mode.

They don't interact directly with the kernel, instead, they just give instructions on what
needs to be done, and the kernel takes care of the rest.

Code running in user mode must delegate to system APIs to access hardware or memory.
Due to the protection afforded by this sort of isolation, crashes in user mode are always recoverable.
Most of the code running on your computer will execute in user mode.
When in user mode, some parts of RAM can't be addressed, some instructions can't be executed, and I/O ports can't be accessed.
KERNEL MODE
Kernel mode, on the other hand, is where programs communicate directly with the kernel.
Kernel-mode programs run in the background, making sure everything runs smoothly: things like printer drivers, display drivers, and drivers that interface with the monitor, keyboard, mouse, etc.
The executing code has complete and unrestricted access to the underlying hardware.
It can execute any CPU instruction and reference any memory address.
Kernel mode is generally reserved for the lowest-level, most trusted functions of the operating system.
Crashes in kernel mode are catastrophic; they will halt the entire PC.
KERNEL MODE
A good example of this would be device drivers.
A device driver must tell the kernel exactly how to interact with a piece of
hardware, so it must be run in kernel mode

Because of this close interaction with the kernel, the kernel is also a lot more
vulnerable to programs running in this mode, so it becomes highly crucial that drivers are properly debugged before being released to the public.
SWITCHING FROM USER MODE TO KERNEL MODE

The only way a user-space application can explicitly initiate a switch to kernel mode during normal operation is by making a system call such as open, read or write.
Whenever a user application calls these system call APIs with appropriate parameters, a software interrupt/exception (SWI) is triggered.
As a result of this SWI, control of the code execution jumps from the user application to a predefined location in the Interrupt Vector Table (IVT) provided by the OS.
The IVT contains the address of the SWI exception handler routine, which performs all the steps required to switch the user application to kernel mode and start executing kernel instructions on behalf of the user process.
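As a rough illustration of the open, read and write calls named above, the following Python sketch uses the low-level functions in the os module, which are thin wrappers over the corresponding system calls. The mode switch itself happens inside the kernel and is not visible from Python; the function names write_then_read and demo are hypothetical.

```python
import os
import tempfile


def write_then_read(path, payload):
    """Write bytes to a file and read them back using open/write/read/close wrappers."""
    # Each os.* call below ends up issuing a system call, so the process
    # briefly runs kernel code and then returns to user mode.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)
    finally:
        os.close(fd)

    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, len(payload))
    finally:
        os.close(fd)


def demo():
    """Run the round trip in a temporary directory and return what was read."""
    with tempfile.TemporaryDirectory() as directory:
        return write_then_read(os.path.join(directory, "demo.bin"), b"hello kernel")


if __name__ == "__main__":
    print(demo())
```

On Linux, running such a script under a tracing tool like strace would list the underlying system calls as they are made.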
Decomposition of Linux System into Major Subsystems

User Applications -- the set of applications in use on a particular Linux system will differ depending on what the computer system is used for. Examples: word processing and a web browser.
O/S Services -- these are services that are typically considered part of the operating system (a windowing system, command shell, etc.); also, the programming interface to the kernel (compiler tool and library) is included in this subsystem.
Linux Kernel -- the kernel abstracts and mediates access to the hardware resources, including the CPU.
Hardware Controllers -- this subsystem comprises all the possible physical devices in a Linux installation; for example, the CPU, memory hardware, hard disks, and network hardware.
The fundamental architecture of the GNU/Linux operating system
User Applications
At the top is the user, or application, space.
This is where the user applications are executed.
Below the user space is the kernel space, where the Linux kernel exists.
GNU C Library (glibc)

Provides the system call interface that connects to the kernel, and provides the mechanism to transition between the user-space application and the kernel.
This is important because the kernel and user application occupy different protected address spaces.
While each user-space process occupies its own virtual address space, the kernel occupies a single address space.
Fundamental architecture of the GNU/Linux operating system

The Linux kernel can be further divided into three gross levels.
At the top is the system call interface, which implements the basic functions such as read and write.
Below the system call interface is the kernel code, which can be more accurately defined as the architecture-independent kernel code. This code is common to all of the processor architectures supported by Linux.
Below this is the architecture-dependent code, which forms what is more commonly called a BSP (Board Support Package). This code serves as the processor and platform-specific code for the given architecture.
File management
Directory Tree
(root)
When you log on to the Linux OS using your username, you are automatically placed in your home directory.
Most important subdirectories

/bin : Important Linux commands available to the average user.
/boot : The files necessary for the system to boot. Not all Linux distributions use this one. Fedora does.
/dev : All device drivers. Device drivers are the files that your Linux system uses to talk to your hardware.
/etc : System configuration files.
/home : Every user except root gets her own folder in here, named for her login account. So, the user who logs in with linda has the directory /home/linda, where all of her personal files are kept.
/lib : System libraries. Libraries are just bunches of programming code that the programs on your system use to get things done.
/mnt : Mount points. When you temporarily load the contents of a CD-ROM or USB drive, you typically use a special name under /mnt.
/root : The root user's home directory.
/sbin : Essential commands that are only for the system administrator.
/tmp : Temporary files and storage space. Don't put anything in here that you want to keep. Most Linux distributions (including Fedora) are set up to delete any file that's been in this directory longer than three days.
/usr : Programs and data that can be shared across many systems and don't need to be changed.
/var : Data that changes constantly (log files that contain information about what's happening on your system, data on its way to the printer, and so on).
Home directory
You can see what your home directory is called by entering
pwd (print current working directory)
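The same information that pwd reports can also be obtained from a program. The sketch below, with hypothetical helper names, prints the current working directory, the home directory and the entries directly under the root directory; the exact entries will vary between distributions.

```python
import os


def current_directory():
    """Return the current working directory, like the pwd command."""
    return os.getcwd()


def home_directory():
    """Return the home directory of the current user (for example /home/linda)."""
    return os.path.expanduser("~")


def top_level_entries(root="/"):
    """Return the sorted names found directly under the given root directory."""
    return sorted(os.listdir(root))


if __name__ == "__main__":
    print("Working directory:", current_directory())
    print("Home directory:   ", home_directory())
    print("Entries under /:  ", ", ".join(top_level_entries()))
```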
THE KERNEL
Block diagram of Linux Kernel
Linux Kernel - System Call Interface
A system call is the mechanism used by an application program to request service from the operating system.
An API is a function definition that specifies how to obtain a given service (e.g. calloc, malloc, free), while a system call is an explicit request to the kernel made via a software interrupt.
[Figure: invoking a system call by a user mode process]
Five main subsystems-Overview

The Process Scheduler (SCHED) is responsible for controlling process access to the CPU.
The scheduler enforces a policy that ensures that processes will have fair access to the CPU,
while ensuring that necessary hardware actions are performed by the kernel on time.
The Memory Manager (MM) permits multiple processes to securely share the machine's
main memory system.

In addition, the memory manager supports virtual memory that allows Linux to support
processes that use more memory than is available in the system.

Unused memory is swapped out to persistent storage using the file system then swapped back
in when it is needed.
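To make the idea of fair CPU access from the process scheduler description above concrete, here is a toy round-robin simulation. It is only a classroom sketch under simplifying assumptions (fixed time quantum, no priorities, no I/O) and is not the algorithm used by the actual Linux scheduler; the function name round_robin is hypothetical.

```python
from collections import deque


def round_robin(burst_times, quantum):
    """Simulate round-robin scheduling.

    burst_times maps a process name to the CPU time it needs.
    Returns a list of (name, finish_time) pairs in completion order.
    """
    if quantum <= 0:
        raise ValueError("quantum must be positive")

    queue = deque(burst_times.items())
    clock = 0
    finished = []
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)   # run for one time slice at most
        clock += run
        remaining -= run
        if remaining > 0:
            queue.append((name, remaining))  # back of the queue, wait for next turn
        else:
            finished.append((name, clock))
    return finished


if __name__ == "__main__":
    for name, finish in round_robin({"A": 3, "B": 5, "C": 2}, quantum=2):
        print(name, "finished at time", finish)
```

Because every process gets a turn before any process gets a second one, a short job such as C finishes early even though it was queued last.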
Five main subsystems

The Virtual File System (VFS) abstracts the details of the variety of hardware devices
by presenting a common file interface to all devices.

In addition, the VFS supports several file system formats that are compatible
with other operating systems.

The Network Interface (NET) provides access to several networking standards and a
variety of network hardware.

The Inter-Process Communication (IPC) subsystem supports several mechanisms
for process-to-process communication on a single Linux system.
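One of the IPC mechanisms available on a single Linux system is the pipe. The sketch below, assuming a Unix-like system where os.fork is available, creates a child process that sends a message to its parent through a pipe; the function name send_through_pipe is hypothetical.

```python
import os


def send_through_pipe(message):
    """Fork a child that writes `message` into a pipe; return what the parent reads."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()

    if pid == 0:
        # Child process: write the message and exit without returning to the caller.
        try:
            os.close(read_fd)
            os.write(write_fd, message)
            os.close(write_fd)
        finally:
            os._exit(0)

    # Parent process: close the unused write end, then read until end of file.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)  # collect the child's exit status
    return b"".join(chunks)


if __name__ == "__main__":
    print(send_through_pipe(b"ping from the child process"))
```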
Kernel Subsystem Overview
Linux Kernel-Memory Management

Linux's physical memory-management system deals with allocating
and freeing pages, groups of pages, and small blocks of memory.

It has additional mechanisms for handling virtual memory, memory
mapped into the address space of running processes
Managing Physical Memory

The page allocator allocates and frees all physical pages; it can allocate ranges of
physically-contiguous pages on request.

The allocator uses a buddy-heap algorithm to keep track of available physical pages.
Each allocatable memory region is paired with an adjacent partner.
Whenever two allocated partner regions are both freed up, they are combined to form a larger region.
If a small memory request cannot be satisfied by allocating an existing small free
region, then a larger free region will be subdivided into two partners to satisfy the request.
Memory allocations in the Linux kernel occur either statically (drivers reserve a
contiguous area of memory during system boot time) or dynamically (via the page allocator).
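The splitting and merging of partner regions described above can be sketched with a toy buddy allocator. This is a simplified model that works in units of pages and assumes the total size is a power of two; it is not the kernel's implementation, and the class and method names are hypothetical.

```python
class BuddyAllocator:
    """Toy buddy allocator over `total_pages` pages (must be a power of two)."""

    def __init__(self, total_pages):
        if total_pages <= 0 or total_pages & (total_pages - 1):
            raise ValueError("total_pages must be a power of two")
        self.total = total_pages
        self.free = {total_pages: {0}}  # block size -> set of free start offsets
        self.allocated = {}             # start offset -> block size

    def alloc(self, pages):
        """Allocate a block of at least `pages` pages; return its start or None."""
        size = 1
        while size < pages:
            size *= 2                   # round the request up to a power of two
        candidate = size
        while candidate <= self.total and not self.free.get(candidate):
            candidate *= 2              # look for the smallest free block that fits
        if candidate > self.total:
            return None
        start = min(self.free[candidate])
        self.free[candidate].remove(start)
        while candidate > size:
            candidate //= 2             # split: keep the lower half, free the upper partner
            self.free.setdefault(candidate, set()).add(start + candidate)
        self.allocated[start] = size
        return start

    def free_block(self, start):
        """Free a block and merge it with its partner for as long as the partner is free."""
        size = self.allocated.pop(start)
        while size < self.total:
            buddy = start ^ size        # partner offset differs only in the `size` bit
            if buddy not in self.free.get(size, set()):
                break
            self.free[size].remove(buddy)
            start = min(start, buddy)
            size *= 2
        self.free.setdefault(size, set()).add(start)


if __name__ == "__main__":
    allocator = BuddyAllocator(8)
    first = allocator.alloc(1)
    second = allocator.alloc(2)
    print("allocated at", first, "and", second)
    allocator.free_block(first)
    allocator.free_block(second)
    print("free blocks by size:", {s: b for s, b in allocator.free.items() if b})
```

After both blocks are freed, the partners merge step by step until the single 8-page region is available again.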
Virtual Memory
The VM system maintains the address space visible to each process: It creates pages of
virtual memory on demand, and manages the loading of those pages from disk or their swapping back out to disk as required.
The VM manager maintains two separate views of a process's address space:
A logical view describing instructions concerning the layout of the address space. The address space consists of a set of non-overlapping regions, each representing a continuous, page-aligned subset of the address space.

A physical view of each address space which is stored in the hardware page tables
for the process.
File System
A file system is the methods and data structures that an operating system uses to
keep track of files on a disk or partition; that is, the way the files are organized on the disk.
A file is an ordered string of bytes.
Files are organized in directories.
File information such as size, owner and access permissions is stored in a separate data structure called an inode.

Superblock is a data structure containing information about file system
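The per-file information kept in the inode (size, owner, access permissions) can be inspected from user space through the stat system call. The following sketch uses Python's os.stat on a temporary file; the helper names inode_summary and demo_inode_summary are hypothetical.

```python
import os
import stat
import tempfile


def inode_summary(path):
    """Return a few of the fields that the filesystem stores in the file's inode."""
    info = os.stat(path)
    return {
        "inode": info.st_ino,                          # inode number
        "size": info.st_size,                          # size in bytes
        "owner_uid": info.st_uid,                      # numeric user id of the owner
        "permissions": stat.filemode(info.st_mode),    # e.g. -rw-------
    }


def demo_inode_summary():
    """Create a 5-byte temporary file and summarize its inode."""
    with tempfile.NamedTemporaryFile() as handle:
        handle.write(b"hello")
        handle.flush()
        return inode_summary(handle.name)


if __name__ == "__main__":
    for field, value in demo_inode_summary().items():
        print(field, "=", value)
```

Note that the file name is not among these fields: names live in directory entries, which point to inodes.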
Filesystem
The Virtual Filesystem (also known as Virtual Filesystem Switch or VFS)
is a kernel software layer that handles all system calls related to a standard Unix filesystem.
Its main strength is providing a common interface to several kinds of
filesystems. For example, it lets a user copy a file from an MS-DOS filesystem to a Linux filesystem.
Network stack
The network stack, by design, follows a layered architecture modeled after the
protocols themselves. Recall that the Internet Protocol (IP) is the core network layer protocol that sits below the transport protocol (most commonly TCP).
Above TCP is the sockets layer, which is invoked through the system call interface (SCI). The sockets layer is the standard API to the networking subsystem and provides
a user interface to a variety of networking protocols.

From raw frame access to IP protocol data units and up to TCP and the User
Datagram Protocol (UDP), the sockets layer provides a standardized way to manage connections and move data between endpoints.
Architecture-dependent code
While much of Linux is independent of the architecture on which it runs, there are
elements that must consider the architecture for normal operation and for efficiency.
The ./linux/arch subdirectory holds the architecture-dependent portion of the
kernel source, in a number of subdirectories that are each specific to one architecture.
Each architecture subdirectory contains a number of other subdirectories that
focus on a particular aspect of the kernel, such as boot, kernel, memory management, and others.
TYPES OF PROCESSES & PROCESS MANAGEMENT IN LINUX
Linux Kernel_Process
A process is a program in execution.
A process is represented in the OS by a Process
Control Block.
Interactive process
Interactive processes are those processes that are invoked by a user and can
interact with the user.

Examples: shells, text editors, GUI applications.
Interactive processes can be classified into foreground and background processes.
The foreground process is the process that you are currently interacting with,
and is using the terminal as its stdin (standard input) and stdout (standard
output).
A background process is not interacting with the user and can be in one of two
states - paused or running.

There has to be someone connected to the system to start these processes;
they are not started automatically as part of the system functions.
System Process
Daemon (day-mon). Daemon is the term used to refer to processes that are
running on the computer and provide services but do not interact with the
console.
Most server software is implemented as a daemon. Apache and Samba are both
examples of daemons.
Any process can become a daemon as long as it is run in the background,
and does not interact with the user.

A simple example of this is running the ls -l command in the background by typing ls -l &
Automatic or batch processes

Automatic or batch processes are not connected to a terminal. Rather, these are tasks that can be queued into a spooler area, where
they wait to be executed on a FIFO (first-in, first-out) basis.

Such tasks can be executed using one of two criteria:
At a certain date and time
At times when the total system load is low enough to accept extra jobs:
done using the batch command.

By default, tasks are put in a queue where they wait to be executed until
the system load is lower than 0.8.
Batch processes
In large environments, the system administrator may prefer batch
processing when large amounts of data have to be processed or when tasks demanding a lot of system resources have to be executed on an already loaded system.
Batch processing is also used for optimizing system performance.
Process State-FLAGS
Each process on the system is in exactly one of five different states. This value
is represented by one of five flags:

TASK_RUNNING: The process is runnable; it is either currently running or
on a runqueue waiting to run

This is the only possible state for a process executing in user-space; it can
also apply to a process in kernel-space that is actively running.

TASK_INTERRUPTIBLE: The process is sleeping (that is, it is blocked),
waiting for some condition to become true or for a signal to be received.

When the condition becomes true, the kernel sets the process's state to
TASK_RUNNING.
The process can awake and become runnable if it receives a signal.
FLAGS
TASK_UNINTERRUPTIBLE:
This state is identical to TASK_INTERRUPTIBLE except that it does not
wake up and become runnable if it receives a signal.

This is used in situations where the process must wait without
interruption or when the event is expected to occur quite quickly.

Because the task does not respond to signals in this state,
TASK_UNINTERRUPTIBLE is less often used than
TASK_INTERRUPTIBLE.
TASK_ZOMBIE:
The task has terminated, but its parent has not yet issued a wait() system
call.
The task's task structure must remain in case the parent wants to access it. If the parent calls wait(), the task structure is deallocated.
TASK_STOPPED:
Process execution has stopped; the task is not running nor is it eligible to
run.
This occurs if the task receives the SIGSTOP, SIGTSTP, SIGTTIN, or
SIGTTOU signal or if it receives any signal while it is being debugged.
Zombie process
A zombie process or defunct process is a process that has completed
execution but still has an entry in the process table.

This entry is still needed to allow the parent process to read its child's exit
status.
When a process ends, all of the memory and resources associated with it
are deallocated so they can be used by other processes. However, the process's entry in the process table remains.
The parent can read the child's exit status by executing the wait system
call, whereupon the zombie is removed.
ZOMBIE PROCESS
When a child exits, the parent process will receive a SIGCHLD signal to
indicate that one of its children has finished executing; the parent process will typically call the wait() system call at this point.
That call will provide the parent with the child's exit status, and will
cause the child to be reaped, or removed from the process table.

It's possible that the parent process is intentionally leaving the process in
a zombie state to ensure that future children that it may create will not receive the same pid.
Causes of Zombie Processes

When a subprocess exits, its parent is supposed to use the "wait" system
call and collect the process's exit information.

The subprocess exists as a zombie process until this happens, which is
usually immediately.
However, if the parent process isn't programmed properly or has a bug
and never calls "wait," the zombie process remains, eternally waiting for
its information to be collected by its parent.
Process states(USING FLAGS)
Process states in Linux:

Running: the process is either running or ready to run

Interruptible: a blocked state; the process is waiting for an event or a
signal from another process

Uninterruptible: a blocked state; the process waits for a hardware condition
and cannot handle any signal

Stopped: the process is stopped or halted and can be restarted by some other
process

Zombie: the process has terminated, but its information is still in the process
table.
System calls used for process management in linux
System calls used for Process management:

fork() :- create a new process
exec() :- execute a new program
wait() :- wait until a child process finishes execution
exit() :- exit from the process
getpid() :- get the unique process id of the process
getppid() :- get the unique process id of the parent
Pid and Parentage

A process ID or pid is a positive integer that uniquely identifies a running
process, and is stored in a variable of type pid_t.

You can get the pid of the process or of its parent:
getpid - returns the PID of the calling process.
getppid - returns the PID of the parent of the calling process.

    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid, ppid;

        /* pid_t is cast to int to match the %d conversion */
        printf("My PID is:%d\n\n", (int)(pid = getpid()));
        printf("Par PID is:%d\n\n", (int)(ppid = getppid()));
        return 0;
    }
2. fork()
#include <sys/types.h>
#include <unistd.h>
pid_t fork( void );

Creates a child process by making a copy of the parent process --- an
exact duplicate.
Implicitly specifies code, registers, stack, data, files
Both the child and the parent continue running.
fork() as a diagram
[Diagram: the parent calls pid = fork(). In the parent, fork() returns a new PID (e.g. pid == 5); in the child, pid == 0. The program is shared and the data is copied.]
Process IDs (pids revisited)

When a fork is executed:
The child gets a unique pid.
The parent's file descriptor table is copied. (The implication is that
all files open to the parent are also open to the child.)
The return value from fork is
-1 if failure (e.g., the process table is full)
0 (in the child)
pid of the child (in the parent)
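The three return values, and the fact that the child works on its own copy of the data, can be sketched as follows. The function name fork_copies_data is made up for the example, and the child passes its value back through its exit status only to keep the sketch short.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks once. The child changes its private copy of `value` and reports it
 * through its exit status; the parent's copy is not affected.
 * Returns the parent's value and stores the child's value in *child_value. */
int fork_copies_data(int *child_value)
{
    int value = 5;
    int status = 0;
    pid_t pid = fork();

    assert(pid != -1);              /* -1 would mean that fork failed */
    if (pid == 0) {                 /* child: fork() returned 0 */
        value += 10;                /* changes only the child's copy */
        _exit(value);
    }

    /* parent: fork() returned the pid of the child (a positive number) */
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    *child_value = WEXITSTATUS(status);
    return value;
}
```

After the call, the parent still sees 5 while the child reported 15, because the data was copied rather than shared.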
wait
wait is called by a parent process to await termination of one of its
children. This allows a simple form of synchronization between parent
and child.
pid_t wait ( int * statloc );
When a process calls wait:
If the caller has no children, the wait call
returns immediately with an error code.

If the caller has children, but none has terminated, the caller is blocked
until one does.

If a child process has terminated which has not been waited for (a so-called
zombie process), the child is removed from the process table, and the wait
call returns with the status of the child.

Wait..
When the call returns, the return value is the pid of the terminated child
& the child's status is stored at statloc.

Two similar system calls, waitid and waitpid, are provided which
provide options to allow waiting on a specific child, and return without blocking.
wait() Actions
A process that calls wait() can:
suspend (block) if all of its children are still running, or
return immediately with the termination status of a child, or
return immediately with an error if there are no child processes.
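Two of the outcomes listed above can be sketched in a few lines. The function names are made up for the example, and the second function assumes the calling process has no children of its own at the time of the call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Starts a child that exits with `code` (0..255) and waits for it.
 * Returns the exit status collected by wait(), or -1 on error. */
int spawn_and_wait(int code)
{
    int status = 0;
    pid_t done;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);                /* child */

    done = wait(&status);           /* parent blocks until a child terminates */
    assert(done == pid);            /* the return value is the child's pid */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* With no children, wait() fails at once; errno is then ECHILD.
 * Returns that errno value, or 0 if wait() unexpectedly succeeded. */
int wait_without_children(void)
{
    errno = 0;
    return (wait(NULL) == -1) ? errno : 0;
}
```

The status word written to statloc encodes more than the exit code, which is why it is decoded with the WIFEXITED and WEXITSTATUS macros rather than read directly.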
Using the exec Family

The exec functions replace the program running in a process with
another program.
When a program calls an exec function, that process immediately ceases
executing that program and begins executing a new program from the beginning, assuming that the exec call doesn't encounter an error.
Within the exec family, there are functions that vary slightly in their
capabilities and how they are called.

Functions that contain the letter p in their names (execvp and execlp) accept
a program name and search for a program by that name in the current execution path; functions that don't contain the p must be given the full path of the program to be executed.
Exec()
Functions that contain the letter v in their names (execv, execvp, and execve)
accept the argument list for the new program as a NULL-terminated array of
pointers to strings.
Functions that contain the letter l (execl, execlp, and execle) accept the
argument list using the C language's varargs mechanism.

Functions that contain the letter e in their names (execve and execle) accept
an additional argument, an array of environment variables.

The argument should be a NULL-terminated array of pointers to
character strings. Each character string should be of the form
VARIABLE=value.
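The l and e variants can be sketched together with execle, which takes the arguments one by one and an explicit environment array. The helper name run_script_with_env is made up for the example, and the sketch assumes that /bin/sh exists on the system.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs `sh -c script` in a child whose environment is exactly `envp`
 * (a NULL-terminated array of "VARIABLE=value" strings).
 * Returns the script's exit status, or -1 if it could not be run. */
int run_script_with_env(const char *script, char *const envp[])
{
    int status = 0;
    pid_t pid;

    assert(script != NULL && envp != NULL);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* 'l': arguments are listed one by one, ended by a null pointer.
         * 'e': the environment is passed as the last argument.
         * There is no 'p', so the full path of the program is required. */
        execle("/bin/sh", "sh", "-c", script, (char *)NULL, envp);
        _exit(127);                 /* reached only if execle failed */
    }
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Because the new program replaces the caller, the exec call is made in a forked child so that the original process survives and can collect the result.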
Simple Execlp Example

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid;

        /* fork a child process */
        pid = fork();
        if (pid < 0) {
            printf("Fork Failed\n");
            exit(-1);
        } else if (pid == 0) {
            /* child process */
            execlp("/bin/ls", "ls", (char *)NULL);
            exit(1);    /* reached only if execlp failed */
        } else {
            /* parent process */
            /* parent will wait for child to complete */
            wait(NULL);
            printf("Child Complete\n");
            exit(0);
        }
    }
SCHEDULING IN LINUX
Linux Process Scheduling Policy

A scheduling policy is the set of decisions you make regarding scheduling
priorities, goals, and objectives

A scheduling algorithm is the instructions or code that implements a given
scheduling policy
Linux has several, conflicting objectives:
Fast process response time
Good throughput for background jobs
Avoidance of process starvation

Linux uses a timesharing technique
We know that this means that each process is assigned a small
quantum or time slice that it is allowed to execute.

Linux schedules processes according to a priority ranking, called a
goodness ranking.
Linux uses dynamic priorities, i.e., priorities are adjusted over time to
eliminate starvation.
Processes that have not received the CPU for a long time get
their priorities increased; processes that have received the CPU often get their priorities decreased.

We can classify processes using two schemes:
CPU-bound versus I/O-bound
I/O-bound programs perform only a small amount of
computation before performing I/O. Such programs typically do not use up their entire CPU quantum.
CPU-bound programs, on the other hand, use their entire quantum without performing
any blocking I/O operations.

Consequently, one could make better use of the computer's resources by giving higher
priority to I/O-bound programs and allowing them to execute ahead of the CPU-bound
programs.
Interactive versus batch versus real-time
These classifications are somewhat independent, e.g., a batch process can be either I/O-bound or CPU-bound.
Linux recognizes real-time programs and assigns them high priority.

Linux uses process preemption; a process is preempted when:
Its time quantum has expired

A new process enters TASK_RUNNING state and its priority is greater than the
priority of the currently running process

The preempted process is not suspended; it is still in the ready queue, it simply no longer has
the CPU.
Consider a text editor and a compiler

Since the text editor is an interactive program, its dynamic priority is higher than that of the
compiler.
The text editor will block often since it is waiting for I/O.
When the I/O interrupt receives a key-press for the editor, the editor is put on the
ready queue and the scheduler is called since the editor's priority is higher than the
compiler's. The editor gets the input and quickly blocks for more I/O.

Determining the length of the quantum
It should be neither too long nor too short.

If too short, the overhead caused by process switching becomes excessively high.
If too long, processes no longer appear to be executing concurrently.
For Linux, long quanta do not necessarily degrade response time for
interactive processes because their dynamic priority remains high, thus they
get the CPU as soon as they need it.

The choice for Linux is the longest possible quantum that does not affect responsiveness;
this turns out to be about 20 clock ticks or 210 milliseconds.
Linux Process Scheduling Algorithm

The Linux scheduling algorithm is not based on a continuous CPU time axis; instead, it
divides the CPU time into epochs.

An epoch is a division of time or a period of time.
In each epoch, every process gets a specified time quantum.
quantum = maximum CPU time assigned to the process in that epoch
The duration of the quantum is computed when the epoch begins.

Different processes may have different time quantum durations.
When a process forks, the remainder of the parent's quantum is split between
parent and child.

An epoch ends when all runnable processes have exhausted their quanta.
At the end of an epoch, the scheduler recomputes the time-quantum durations of all
processes; then a new epoch starts and all processes get a new quantum.

When does an epoch end? Important!
An epoch ends when all processes in the ready queue have used their
quantum
This does not include processes that are blocking on some wait queue,
they will still have quantum remaining

The end of an epoch is only concerned with processes on the ready
queue

Selecting a process to run next. The scheduler considers the priority of each
process. There are two kinds of priorities

Static priorities - these are assigned to real-time processes and range from 1 to
99; they never change

Dynamic priorities - the dynamic priority is calculated from static priority and
average sleep time.

When a process wakes up, record how long it was sleeping, up to some maximum
value.
When the process is running, decrease that value each timer tick.
The static priority of a real-time process is always higher than the dynamic
priority of conventional processes.

Conventional processes will only execute when there are no real-time
processes to execute
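The range of static priorities for real-time processes can be queried from the running kernel with the POSIX calls sched_get_priority_min() and sched_get_priority_max(). The wrapper name rt_priority_range is made up for this sketch.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Fills in the static-priority range the running kernel reports for a
 * scheduling policy such as SCHED_FIFO or SCHED_RR.
 * Returns 0 on success, -1 if the policy is not recognized. */
int rt_priority_range(int policy, int *lo, int *hi)
{
    assert(lo != NULL && hi != NULL);
    *lo = sched_get_priority_min(policy);
    *hi = sched_get_priority_max(policy);
    return (*lo == -1 || *hi == -1) ? -1 : 0;
}
```

On Linux the values reported for SCHED_FIFO and SCHED_RR are 1 and 99, which matches the range given above.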

Calculating process quanta for an epoch
Each process is initially assigned a base time quantum, as mentioned
previously it is about 20 clock ticks

If a process uses its entire quantum in the current epoch, then in the next epoch
it will get the base time quantum again

If a process does not use its entire quantum, then the unused quantum carries
over into the next epoch (the unused quantum is not directly used, but a bonus
is calculated)
Why? Processes that block often will not use their quantum; this is used to favor
I/O-bound processes because this value is used to calculate priority.

When forking a new child process, the parent process's remaining quantum is
divided in half: half for the parent and half for the child.
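The quantum bookkeeping described above can be sketched with two small functions. This is an illustration, not kernel code: struct toy_task and both function names are made up, and the bonus formula (half of the unused ticks plus the base quantum) is modeled on the recalculation used by older kernels of the 2.2 era, which is assumed here to be the scheduler these slides describe.

```c
#include <assert.h>

/* Hypothetical, pared-down process record holding only the two
 * fields used for timesharing. */
struct toy_task {
    int priority;   /* base time quantum, in clock ticks */
    int counter;    /* ticks left in the current epoch */
};

/* Start of a new epoch: half of any unused quantum is kept as a bonus
 * on top of the base quantum. */
void start_new_epoch(struct toy_task *t)
{
    assert(t->counter >= 0);
    t->counter = (t->counter >> 1) + t->priority;
}

/* fork(): the parent's remaining ticks are split between parent and child,
 * so forking does not create extra CPU time within the epoch. */
void split_quantum_on_fork(struct toy_task *parent, struct toy_task *child)
{
    child->priority = parent->priority;
    child->counter  = parent->counter >> 1;
    parent->counter -= child->counter;
}
```

A process that used its whole quantum (counter 0) starts the next epoch with exactly the base quantum of 20 ticks, while one that blocked with 12 ticks left starts with 26, which is how I/O-bound processes end up favored.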

Scheduling data in the process descriptor
The process descriptor (task_struct in Linux) holds essentially all of
the information for a process, including scheduling information.

Recall that Linux keeps a list of all process task_structs and a list of
all ready process task_structs.

Each process descriptor (task_struct) contains the following fields:
need_resched - this flag is checked every time an interrupt handler
completes to decide if rescheduling is necessary

policy - the scheduling class of the process
For real-time processes this can have the value of
SCHED_FIFO - first-in, first-out with unlimited time quantum
SCHED_RR - round-robin with time quantum, fair CPU usage
For all other processes the value is SCHED_OTHER
For processes that have yielded the CPU, the value is SCHED_YIELD

Process descriptor fields (cont)
rt_priority - the static priority of a real-time process, not used for
other processes
priority - the base time quantum (or base priority) of the process
counter - the number of CPU ticks left in its quantum for the current
epoch. This field is updated on every clock tick.

The priority and counter fields are used for timesharing
and dynamic priorities in conventional processes.

Scheduling actually occurs in schedule().
Its objective is to find a process in the ready queue and then assign the CPU to it.
It is invoked in two ways:
Direct invocation
Lazy invocation

Direct invocation of schedule()
The scheduler is invoked directly when the current process must be blocked right away because the resource it needs is not available.
The current process is taken off of the ready queue and is placed on
the appropriate wait queue; its state is changed to TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE.
A device driver can also invoke schedule() directly if it will be executing a long iterative task.

Lazy invocation of schedule()
The scheduler can also be invoked in a lazy way by setting the need_resched field to 1. This occurs when:
The current process has used up its quantum
A process is woken up or added to the ready queue and its priority is higher than that of the
currently executing process
A process calls sched_yield()
sched_yield() causes the calling thread to relinquish the CPU. The thread is moved
to the end of the queue for its static priority and a new thread gets to run.

Actions performed by schedule()
First it runs any kernel control paths that have not completed and other
uncompleted house-keeping tasks

Remember, the kernel is not preemptive, so it cannot switch to another
process if a process is already in the kernel or if the kernel is in the middle of doing something else
If the current process is SCHED_RR and has used all of its quantum, then it
is given a new quantum and placed at the end of the ready queue
If the process is not SCHED_RR, then it is removed from the ready queue

Actions performed by schedule() (cont)
It scans the ready queue for the highest priority process
It calculates the priority using the goodness() function

It may not find any processes that are good when all processes on the
ready queue have used up their quantum (i.e., all have a zero counter
field). In this case it must start a new epoch by assigning a new quantum to
all processes.
If a higher priority process was found, then the scheduler performs a
process switch

How good is a runnable process?
Uses goodness() to determine priority:
(goodness == -1000) - do not select process
(goodness == 0) - process has exhausted quantum
(0 < goodness < 1000) - conventional process with quantum
(goodness >= 1000) - real-time process
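The value ranges above can be sketched with a simplified scoring function. This is not the kernel's goodness(): struct toy_task, the policy enum and both function names are made up for the example, the -1000 case is left out, and the score for a conventional process (counter plus priority) is an assumption chosen to stay inside the 0 to 1000 range.

```c
#include <assert.h>

/* Stand-ins for the SCHED_OTHER, SCHED_FIFO and SCHED_RR policies. */
enum toy_policy { POLICY_OTHER, POLICY_FIFO, POLICY_RR };

/* Hypothetical, pared-down process record. */
struct toy_task {
    enum toy_policy policy;
    int rt_priority;    /* static priority, real-time processes only */
    int priority;       /* base time quantum */
    int counter;        /* ticks left in the current epoch */
};

/* Simplified score: real-time processes always score 1000 or more,
 * conventional ones score 0 once their quantum is used up. */
int toy_goodness(const struct toy_task *t)
{
    if (t->policy != POLICY_OTHER)
        return 1000 + t->rt_priority;
    if (t->counter == 0)
        return 0;
    return t->counter + t->priority;
}

/* Scans the ready queue for the best process. Returns its index, or -1
 * when every process scores 0, which means a new epoch must be started. */
int pick_next(const struct toy_task *ready, int n)
{
    int best = -1, best_score = 0, i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        int score = toy_goodness(&ready[i]);
        if (score > best_score) {
            best_score = score;
            best = i;
        }
    }
    return best;
}
```

With this scoring, any real-time process on the queue wins over every conventional process, and a return value of -1 corresponds to the case where all quanta are exhausted.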

Linux scheduler issues
Does not scale very well as the number of processes grows because it
has to recompute dynamic priorities

Tries to minimize this by computing at end of epoch only
Large numbers of runnable processes can slow response time

Predefined quantum is too long for high system loads
I/O-bound process boosting is not optimal
Some I/O-bound processes are not interactive (e.g., database search or
network transfer)
Support for real-time processes is weak
Personalities in Linux
Execution Domains_persona System call

The execution domain system allows Linux to provide limited support for
binaries compiled under other UNIX-like operating systems.

A notable feature of Linux is its ability to execute files compiled for other operating systems. Of course,
this is possible only if the files include machine code for the same computer architecture on which the kernel is running.
Two kinds of support are offered for these "foreign" programs:
Emulated execution: necessary to execute programs that include system calls that
are not POSIX-compliant

Native execution: valid for programs whose system calls are totally POSIX-compliant
POSIX, the "Portable Operating System Interface", is a family of standards specified by
the IEEE for maintaining compatibility between operating systems.
Personality()
Microsoft MS-DOS and Windows programs are emulated: they cannot be natively
executed, because they include APIs that are not recognized by Linux.
POSIX-compliant programs compiled on operating systems other than Linux can be
executed without too much trouble, because POSIX operating systems offer similar APIs.
A process specifies its execution domain by setting the personality field of its
descriptor. Each process has an associated personality identifier that can slightly modify the semantics of certain system calls. Used primarily by emulation libraries to request that system calls be compatible with certain specific flavors of OS.
A process can change its personality by issuing a suitable system call
named personality( ).
Programmers are not expected to directly change the personality of their programs;
instead, the personality( ) system call should be issued by the glue code that sets up the execution context of the process.
Personalities in LINUX
Personality     Operating system
PER_LINUX       Standard execution domain
PER_BSD         BSD Unix
PER_SUNOS       SunOS
PER_RISCOS      RISC OS
PER_SOLARIS     Sun's Solaris
Linux supports different execution domains, or personalities, for each process. Among other things, execution domains tell Linux how to map signal
numbers into signal actions.
