
MANAGEMENT INFORMATION SYSTEM UNIT II

Unit I
Definition of Management Information System - MIS support for planning, organizing and controlling - Structure of MIS - Information for decision making
Concept of system - Characteristics of systems - System classifications - Categories of information systems - Strategic information system and competitive advantage
Unit II
Computers and information processing - Classification of computers - Input devices - Output devices - Storage devices - Batch and online processing - Hardware, Software, Database Management System
Unit III
System analysis and design - SDLC - Role of system analysis - Functional information systems - Personnel, Production, Material and Marketing
Decision support system - Definitions - Group decision support system - Business process outsourcing - definition and functions
Unit IV
Introduction to Tally - Introduction to accounting software - Computer application through the accounting package Tally (latest version) - Preparation of the following records on Tally (with inventory): (A) Creation of company, groups, ledger accounts; feeding of accounting data: receipts, payments, purchase, sale, contra, journal, credit note and debit note. (B) Inventory information: groups, items and valuation. (C) Generation of various accounting reports - Creating accounts - Feeding opening balances - Chart of accounts - Capital, current assets, current liabilities, investments, loans, miscellaneous, sales, purchase, direct/indirect income/expenses
Unit V
Purchase and Sales - Purchase / Sales order - Receipt note - Purchase / Sales bills - Debit / Credit note - Journal, Voucher, VAT bills, Service tax, FBT applications
Reference:
1. Management Information System - Sadagopan
2. Management Information System - CSV Murthy
3. Tally Financial Accounting Program, Current Volume - Tally Press
4. Tally for Beginners - Tally Press
5. Tally 9 - Self Study Training Kit - www.swayam-education.com

COMPUTER FUNDAMENTALS
Unit II
Computers and information processing - Classification of computers - Input devices - Output devices - Storage devices - Batch and online processing - Hardware, Software, Database Management System
WHAT IS A COMPUTER?
The term Computer is derived from the term Compute, which means to reckon or to calculate something. Initially, human beings used computers just to do arithmetic calculations. That is why they
named the device which does calculations a computer. But now, the computer can do more than just calculation. It can make logical decisions, process data, etc.
DEFINITION FOR COMPUTER
A computer can be defined as an electronic device which can do both arithmetic and logical operations at a fast rate, which has memory to store data, and which operates automatically under the control of instructions stored in that memory.
So, a computer is an electronic device, which may use valves, transistors or ICs (integrated circuits), which can perform arithmetic operations like addition, subtraction, multiplication and division and can do logical operations like checking whether a condition is true or false, all at a fast rate. It has memory to store data and instructions for processing.

Computer is a device that transforms data. Data can be anything like marks obtained by you in
various subjects. It can also be name, age, sex, weight, height, etc. of all the students in your class
or income, savings, investments, etc., of a country.

A computer: i) accepts data, ii) stores data, iii) processes data as desired, iv) retrieves the stored data as and when required, and v) prints the result in the desired format.
TERMS ASSOCIATED WITH THE SPEED OF COMPUTERS
Millisecond (10^-3 second): one thousand instructions per second
Microsecond (10^-6 second): one million instructions per second
Nanosecond (10^-9 second): one thousand million instructions per second
Picosecond (10^-12 second): one million million instructions per second
(In other words, if a computer takes one such unit of time per instruction, it executes that many instructions in a second.)
Computers are very fast. The unit of measurement of a computer's speed is MIPS (Million Instructions Per Second).
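To make this concrete, here is a small Python sketch (added for illustration; it is not part of the original notes) that converts an assumed time-per-instruction into instructions per second and MIPS:

# Convert the time taken per instruction into instructions per second and MIPS.
times = {
    "millisecond": 1e-3,
    "microsecond": 1e-6,
    "nanosecond": 1e-9,
    "picosecond": 1e-12,
}
for name, seconds_per_instruction in times.items():
    per_second = 1 / seconds_per_instruction
    mips = per_second / 1_000_000
    print(f"one instruction per {name}: {per_second:,.0f} instructions/second ({mips:,.3f} MIPS)")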
CHARACTERISTICS OF COMPUTER
Speed
A computer can work very fast. It takes only a few seconds for calculations that would take us hours to complete. Suppose you have to calculate and write a report of the average monthly income of, say, 1500 persons manually; it would take at least one or two weeks even if you work hard, but for the computer it will take less than 5 minutes to calculate, plus some extra time to print depending on the printer you use. You will be surprised to know that a computer can perform millions (1,000,000) of instructions, and even more, per second. Therefore, we express the speed of a computer in terms of microseconds (10^-6 of a second) or nanoseconds (10^-9 of a second). From this you can imagine how fast your computer works.
Accuracy
The degree of accuracy of a computer is very high and every calculation is performed with the same accuracy. The accuracy level is determined by the design of the computer. Errors in a computer are due to human mistakes and inaccurate data.
Diligence or consistency
A computer is free from tiredness, lack of concentration, fatigue, etc. It can work for hours without making any error. If millions of calculations are to be performed, a computer will perform every calculation with the same accuracy. Due to this capability it surpasses human beings in routine types of work.
Versatility
It means the capacity to perform completely different types of work. You may use your computer to prepare payroll slips; the next moment you may use it for inventory management or to prepare electricity bills.
Power of Remembering
A computer has the power of storing any amount of information or data. Any information can be
stored and recalled as long as you require it, for any number of years. It depends entirely upon you how much data you want to store in a computer and when to erase or retrieve these data.
Neatness
The execution and the reports generated by the computer will be 100% neat, as designed by us through our program. The neatness will be maintained from the initial to the final stage of execution, which is not possible in the case of manual jobs.
Automation
With the help of the stored program concept, instructions are executed one after another automatically by the system. This eliminates unnecessary manual intervention.
Storage & retrieval of information
The computer has an in-built memory where it can store a large amount of data. You can also store data in secondary storage devices such as floppies, hard disks etc., which can be kept outside your computer and can be carried to other computers. Later on, if you want, you can retrieve the data from these devices.
LIMITATIONS OF THE COMPUTERS

- It cannot think on its own.
- It cannot detect logical errors (GIGO - Garbage In, Garbage Out).
- Human beings have the potential to try out various alternatives to solve unexpected problems, which computers cannot.
- Computers have no intuition.


BASIC COMPUTER OPERATIONS
A computer performs basically five major operations or functions irrespective of its size and make. These are:
1) it accepts data or instructions by way of input,
2) it stores data,
3) it can process data as required by the user,
4) it gives results in the form of output, and
5) it controls all operations inside a computer.
We discuss below each of these operations.
1. Input
This is the process of entering data and programs into the computer system. A computer is an electronic machine, like any other machine, which takes raw data as input, performs some processing and gives out processed data. The input unit therefore takes data from the user/operator into the computer in an organized manner for processing.
2. Storage:
The process of saving data and instructions is known as storage. Data has to be fed into the system before the actual processing starts, because the processing speed of the Central Processing Unit (CPU) is so fast that data has to be provided to the CPU at a matching speed. Therefore the data is first stored in the storage unit for faster access and processing. This storage unit, or the primary storage of the computer system, is designed for this purpose. It provides space for storing data and instructions.
The storage unit performs the following major functions:
1. All data and instructions are stored here before and after processing.
2. Intermediate results of processing are also stored here.
3. Processing:
The task of performing operations like arithmetic and logical operations is called processing. The
Central Processing Unit (CPU) takes data and instructions from the storage unit and makes all sorts of
calculations based on the instructions given and the type of data provided. It is then sent back to the storage
unit.
4. Output:
This is the process of producing results from the data in order to get useful information. The output produced by the computer after processing must also be kept somewhere inside the computer before being given to you in human-readable form. The output is also stored inside the computer for further processing.
5. Control:
Control refers to the manner in which instructions are executed and the above operations are performed. The controlling of all operations like input, processing and output is performed by the control unit. It takes care of the step-by-step processing of all operations inside the computer.
FUNCTIONAL UNITS
In order to carry out the operations mentioned in the previous section, the computer allocates the tasks among its various functional units. The computer system is divided into the following units for its operation:
1) Arithmetic Logical Unit,
2) Main Memory Unit,
3) Control Unit,
4) Input Units and
5) Output Units
Arithmetic Logical Unit (ALU)
After you enter data through the input device, it is stored in the primary storage unit. The actual processing of the data and instructions is performed by the Arithmetic Logical Unit. The major operations performed by the ALU are addition, subtraction, multiplication, division, logic and comparison. Data is transferred to the ALU from the storage unit when required. After processing, the output is returned to the storage unit for further processing or storage.
Main Memory Unit (MMU)
All instructions and data are temporarily stored here for further processing. The RAM serves as the MMU in a computer. It is highly volatile and cannot retain anything if the power goes off. Using the input units, the necessary data and instructions are sent to the MMU for temporary storage and are then sent to the ALU or CU according to the instructions given at that time. This is also called primary memory.
Control Unit (CU)
The next component of the computer is the Control Unit, which acts like a supervisor seeing that things are done in the proper fashion. The control unit determines the sequence in which computer programs and instructions are executed. It handles things like the processing of programs stored in the main memory, interpretation of the instructions and issuing of signals for the other units of the computer to execute them. It also acts as a switchboard operator when several users access the computer simultaneously, and thereby coordinates the activities of the computer's peripheral equipment as they perform input and output. Therefore it is the manager of all the operations mentioned in the
previous section.
Input Units
These are the units (many of them secondary storage devices) which supply instructions and data to the CPU whenever needed. There are various input devices like the Card Reader, Magnetic Tape Reader, MICR, Magnetic Disks, Optical Disks, and flash drives (pen drives, MP3 or MP4 players, wrist watches, pens or spy cameras with built-in pen drives, cell phones with flash memory, etc.).
Output Units
These are secondary storage devices and printers. Printers are exclusively output devices, on which you can take a hard copy of any report you want to print. Apart from secondary storage devices, we have line printers, inkjet printers, thermal printers, laser printers, etc.
Central Processing Unit (CPU)
The ALU, MMU and CU of a computer system are jointly known as the Central Processing Unit. You may call the CPU the brain of any computer system. Just like a brain, it takes all major decisions, makes all sorts of calculations and directs the different parts of the computer by activating and controlling their operations. It can also be called the heart of a computer. If any one part of the CPU fails, the configuration cannot be used as a computer.
CLASSIFICATION OF COMPUTER
Computers can be classified as under:

- Purpose-wise: Special Purpose or General Purpose computers
- Type-wise: Analog, Digital or Hybrid computers, and
- Size-wise: Micro Computers, Mini Computers, Mainframe Computers or Super Computers
Purpose wise classification
Computers can be classified into General Purpose and Special Purpose computers.
General Purpose Computers
These are the computers which we use for our daily work. Here, different software for different purposes can be executed one after another, for example Ms-Word, then Ms-Access, an Oracle application, a Visual BASIC application, a Payroll Application, an Inventory Control Application, a Financial Accounting System, DTP applications, etc., on the same computer. Such a computer is not specifically allotted to any special or designated purpose.
Special Purpose Computers
These are dedicated to specialized tasks, and mostly they will be used only for the purpose for which they are dedicated. CT scanners, endoscopes, equipment (using computers) for performing laser operations, and computers dedicated to launching rockets and missiles are some examples.
Type wise classification
Computers can be classified into Analog, Digital or Hybrid Computers.
Analog Computer
An analog computer works with qualitative data or a physical force; some physical force is used to operate it.
For example, in the case of a thermometer, the physical force is the heat of the human body: the mercury kept inside the thermometer expands in proportion to the heat of the body. To get a correct measurement, one must use a good-quality instrument; otherwise the result is highly unpredictable. In the case of a spring balance or a normal balance, the physical force is the gravitational force of the earth. In the case of a speedometer, the speed of rotation of the wheel is the physical force. In the case of a mechanical watch, the physical force is the tension of the spring which rotates the toothed wheels.
The results given by an analog computer are not highly dependable unless we use good-quality instruments.
The analog computer processes work electronically by analogy. It uses an analog for each variable and produces analogs as output; it thus measures continuously. It does not produce numbers but produces its results in the form of a graph. It is more efficient in continuous calculations. The analog computer accepts variable electrical signals (analog values) as inputs, and its output is also in the form of analog electrical signals.

Digital Computer
Digital computers work with quantitative data. There is an element of counting the number of low or high pulses (electronic pulses, i.e., low voltage or high voltage) emitted by electronic components, which are normally represented by 0s and 1s, and the output is also an on-off signal.
Most of the computers available today are digital computers, and we now use the term computer to refer to digital computers only.
Some everyday examples of digital devices are digital watches, digital EB meters, glucose meters, digital thermometers, etc.
Hybrid Computers
A hybrid computer is a combination of both analog and digital computers: a part of the processing is done by the analog part and a part by the digital part. A hybrid computer combines the benefits of both, providing the speed of analog computers together with the precision and greater control of digital computers. It can accept input data in both analog and digital form. It is used for simulation applications. Now, in most big concerns the entire production process is carried on by hybrid computers, wherein human intervention is very low and only a few staff are maintained just to observe what is going on; only in the case of an emergency will they stop the machine and report the matter to the management. Examples are laparoscopes, CT scanners, endoscopes, etc.
Size wise classifications
Computers can be classified into micro Computer, Mini Computer, Mainframe Computers and Super
Computers.
Micro Computers
Micro computers are general purpose computers. They are an outcome of the 4th generation, with the birth of the microprocessor, a complete processor miniaturized onto a single chip. Computer size, cost, weight etc., reduced to a great extent. The microcomputer is at the lowest end of the computer range in terms of speed and storage capacity, and its CPU is a microprocessor. It began as a single-user oriented system. When they were introduced, they cost around 1.5 lakhs to 5 lakhs. Initially they came with two 8-inch floppy drives and used the CP/M operating system, which is a Character User Interface based operating system. The first microcomputers were built with 8-bit microprocessor chips. The languages used at that time were only BASIC, COBOL, FORTRAN and lower-end text editors.
Now, micro computers are so powerful that they may be called micro computers, PCs, workstations, clients or nodes.
In Personal Computers (PCs), the PC-DOS, MS-DOS or OS/2 operating system was used. Now, we also use multi-user operating systems like UNIX and Linux on PCs. With the introduction of the Windows operating system, GUI concepts, multi-tasking and multi-programming concepts were introduced in PCs. The other most common modern microcomputer is the Apple Macintosh, which uses its own GUI operating
system (Mac OS).
Mini Computer
When a computer is used for data entry or printing purposes, the CPU is mostly idle. If that idle time is used by another user, the CPU is used more effectively. This idea gave birth to the multi-user oriented approach, where several users share the idle time of the CPU. When microcomputers were introduced, they were used for data processing, and offline data entry machines were used for data entry. Offline data entry machines cost a great deal. This gave birth to the attachment of multiple terminals and the effective sharing of the CPU's time by several users.
A mini computer is designed to support more than one user at a time. It possesses a large storage capacity and operates at a higher speed than a microcomputer. Several dumb terminals, up to a maximum of around 20, are attached to one CPU. Once the main terminal is booted, the other dumb terminals are booted, attached to the CPU, and can use the system. In this environment every user feels as if he is using an independent system. Here, all requests are sent in a queue to the CPU and are executed on a first come, first served basis, but the user may not notice this since the CPU is very fast.
Mainframe Computers
These are general purpose computers. Mainframe was a term originally referring to the cabinet containing the central processing unit, or "main frame", of a room-filling Stone Age batch machine. After the emergence of smaller "minicomputer" designs in the early 1970s, the traditional big iron machines were described as "mainframe computers" and eventually just as mainframes. Nowadays, a mainframe is a very large and expensive computer capable of supporting hundreds, or even thousands, of users simultaneously. These computers generally use 32-bit or wider processors. They operate at a very high speed, have very large storage capacity and can handle the work load of many users. They are generally used for centralized databases. They are also used as controlling nodes in Wide Area Networks (WANs).
Examples of mainframes are the DEC, ICL and IBM 3000 series.
Supercomputer:
They are the fastest and most expensive machines. They have high processing speed compared to other computers. They use multiple microprocessors, whereas other computer types have a single processor; this facilitates multiprocessing. One of the ways in which supercomputers are built is by interconnecting hundreds of microprocessors. Supercomputers are mainly used for weather forecasting, biomedical research, remote sensing, aircraft design and other areas of science and technology.
Examples of supercomputers are CRAY YMP, CRAY 2, NEC SX-3, CRAY XMP and PARAM from India.
ADVANTAGES OF COMPUTERS USING ELECTRONIC DEVICES
Highly reliable and accurate
Due to the usage of electronic circuits in place of mechanical gears and wheels, the problems of
wear and tear, backlash, hysteresis, etc. are totally eliminated. Electronic computers are therefore very
highly reliable and accurate
Very Fast
They are very fast since the computer operates at electronic speed, i.e., the speed of light, whereas manual mechanical computers are very slow.
Automatic Execution
Automatic operation is carried on due to the stored program concept, so frequent manual intervention is not needed. Multiple uses of the same program are also possible, which is not possible in the case of manual mechanical computers.
Highly Reliable and Capable of Complex Operations
Mechanical computers can perform only limited arithmetic operations and we cannot fully depend on the results given by them, whereas digital computers are more versatile and can also perform logical operations. By writing relevant programs we can execute even complex arithmetic operations.

HISTORY OF COMPUTERS
In olden days, human beings used the fingers of their hands and their toes to count. When they felt the insufficiency of fingers, they tried to use stones and pebbles, and they found these very inconvenient to carry wherever they went.
Around 3000 BC, the Chinese and Japanese used the abacus, a mechanical calculator which is still widely used in Asia. The Chinese were becoming very involved in commerce with the Japanese, Indians and Koreans, and businessmen needed a way to tally accounts and bills. Out of this need, the abacus was born. The abacus is the first true precursor to the adding machines and computers which would follow.

Napier's Bones

Napier's Bones was developed in the year 1617. John Napier, a Scotsman, invented logarithms, which use lookup tables to find the solution to otherwise tedious and error-prone mathematical calculations. A logarithm is a technique that allows multiplication to be performed via addition. The magic ingredient is the logarithm of each operand, which was originally obtained from a printed table. But Napier also invented an alternative to tables, where the logarithm values were carved on ivory sticks which are now called Napier's Bones.
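To illustrate the idea (a sketch added here, not part of the original notes), the following short Python example multiplies two numbers by adding their logarithms, which is exactly what log tables, Napier's Bones and the slide rule mechanize:

import math

def multiply_via_logs(a, b):
    # log(a * b) = log(a) + log(b), so adding the logarithms and taking
    # the antilogarithm (exponential) gives the product.
    return math.exp(math.log(a) + math.log(b))

print(multiply_via_logs(37, 52))   # approximately 1924.0
print(37 * 52)                     # 1924, for comparison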
The slide rule was first built in England in the year 1630. In the 1960s, it was used by the NASA engineers of the Mercury, Gemini and Apollo programs which landed men on the moon. The slide rule works on the basis of Napier's rules for logarithms. It was used until the 1970s.

Pascal designed the first mechanical calculator (the Pascaline), based on gears. It performed addition and subtraction. In 1642 Blaise Pascal, at the age of 19, invented the Pascaline as an aid for his father, a tax collector. Pascal built 50 of these gear-driven one-function calculators (it could only add) but couldn't sell many because of their exorbitant cost and because they really weren't that accurate (at that time it was not possible to fabricate gears with the required precision). Up until the present age, when car dashboards went digital, the odometer portion of a car's speedometer used the very same mechanism as the Pascaline to increment the next wheel after each full revolution of the prior wheel.

Pascal's Pascaline [photo © 2002 IEEE]
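The carry mechanism described above can be modelled in a few lines of Python (an illustrative sketch, not part of the original notes): each wheel holds a digit 0-9, and a full revolution of one wheel increments the next.

def increment(wheels):
    # wheels[0] is the rightmost (fastest-turning) wheel, as on an odometer.
    i = 0
    while i < len(wheels):
        wheels[i] += 1
        if wheels[i] < 10:
            break              # no full revolution, so no carry
        wheels[i] = 0          # full revolution: reset and carry to the next wheel
        i += 1
    return wheels

counter = [9, 9, 0]            # represents 099
print(increment(counter))      # [0, 0, 1], i.e. 100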


Just a few years after Pascal, the German Gottfried Wilhelm Leibniz (co-inventor with Newton of calculus) managed to build a four-function (addition, subtraction, multiplication and division) calculator that he called the Stepped Reckoner because, instead of gears, it employed fluted drums having ten flutes arranged around their circumference in a stair-step fashion. Although the Stepped Reckoner employed the decimal number system (each drum had 10 flutes), Leibniz was the first to advocate use of the binary number system, which is fundamental to the operation of modern computers. Leibniz is considered one of the greatest of the philosophers, but he died poor.

Leibniz's Stepped Reckoner (have you ever heard "calculating" referred to as "reckoning"?)
In 1801 the Frenchman Joseph Marie Jacquard invented a power loom that could base its weave
(and hence the design on the fabric) upon a pattern automatically read from punched wooden cards, held
together in a long row by rope. Descendants of these punched cards have been in use ever since
(remember the "hanging chad" from the Florida presidential ballots of the year 2000?).

Jacquard's technology was a real boon to mill owners, but put many loom operators out of work.
Angry mobs smashed Jacquard looms and once attacked Jacquard himself. History is full of examples of
labor unrest following technological innovation yet most studies show that, overall, technology has
actually increased the number of jobs.
In the year 1822, Charles Babbage developed the Difference Engine. With this machine he compiled statistical tables and saved around two years of processing time, but he was unable to develop the machine fully.
In the year 1833, he conceived the idea of the Analytical Engine. According to him, this machine would use punched cards for storage and work with steam engines. Babbage called the two main parts of his Analytical Engine the "Store" and the "Mill", as both terms are used in the weaving industry. The Store was where numbers were held and the Mill was where they were "woven" into new results. In a modern computer these same parts are called the memory unit and the central processing unit (CPU). He was the first person to give the concepts of storage, processing, and input and output. We still follow the same concept, so he is called the Father of Modern Computers.
The following diagrams illustrate this:
[Diagrams: Babbage's Analytical Engine (Store and Mill) compared with the block diagram of a modern computer (memory unit and CPU).]
If you compare the above two diagrams, one can understand that we still follow the concepts of Charles Babbage, and we have been unable to deviate from his idea even after a lot of developments in the computer field. That is why he is called the Father of Modern Computers.
Babbage is largely remembered because of the work of Augusta Ada (Countess of Lovelace), who was the first computer programmer. She was fascinated by Babbage's ideas and, through exchanging letters and meetings with Babbage, she learned enough about the design of the Analytical Engine. While Babbage refused to publish his knowledge for another 30 years, Ada wrote a series of "Notes" wherein she detailed sequences of instructions she had prepared for the Analytical Engine. The Analytical Engine remained unbuilt (the British government refused to get involved with this one), but Ada earned her spot in history as the first computer programmer. Ada invented the subroutine and was the first to recognize the importance of looping.
A step towards automated computing was the development of punched cards, which were first
successfully used with computers in 1890 by Herman Hollerith and James Powers, who worked for the US.
Census Bureau. They developed devices that could read the information that had been punched into the
cards automatically, without human help. Because of this, reading errors were reduced dramatically, work
flow increased, and, most importantly, stacks of punched cards could be used as easily accessible memory
of almost unlimited size. Furthermore, different problems could be stored on different stacks of cards and
accessed when needed. Herman Hollerith's technique was successful and the 1890 census was completed
in only 3 years at a savings of 5 million dollars. Hollerith was the first American associated with the
history of computers. He was also the first to make a bunch of money at it. His company, the Tabulating
Machine Company, became the Computer Tabulating Recording Company in 1913 after struggling in the
market and merging with another company that produced a similar product. The company hired a
gentleman named Thomas J. Watson in 1918 who was primarily instrumental in turning the company
around. In 1924, the company was renamed International Business Machines (IBM) Corporation.
A closer look at the Census Tabulating Machine

Howard Aiken thought he could create a modern and functioning model of Babbage's Analytical Engine. He succeeded in securing a grant of 1 million dollars from IBM for his proposed Automatic Sequence Controlled Calculator, the Mark I for short. In 1944, the Mark I was "switched on". Aiken's colossal machine spanned 51 feet in length and 8 feet in height, and about 500 miles of wiring were required to connect its components.
GENERATIONS OF COMPUTERS

First Generation - 1946-1958: Vacuum Tubes

First generation computers used vacuum tubes as electronic components for circuitry and magnetic drums for storage. They were very big in size and costly, power consumption was heavy, they produced heavy noise, and the speed of processing was very slow.
The first generation started with the introduction of ENIAC (Electronic Numerical Integrator And Calculator) in the year 1946 by John P. Eckert, John W. Mauchly and their associates at the Moore School of Electrical Engineering of the University of Pennsylvania. The main purpose of developing ENIAC was to assist in World War II by preparing firing tables, which decide at what velocity, in which direction, with what force and at what height bombs should be exploded so that the heaviest damage could be inflicted on the enemy.
ENIAC Features
1. It used 18,000 valves, 70,000 resistors and 5 million soldered joints.
2. It weighed around 30 tons and occupied around 300 cu. ft.
3. It computed at speeds 1,000 times faster than the Mark I was capable of only 2 years earlier.
4. It could do nuclear physics calculations (in two hours) which would have taken 100 engineers a year to do by hand.
5. The system's program could be changed by rewiring a panel.
6. It consumed about 180,000 watts of electrical power.
7. It had punched card I/O, 1 multiplier, 1 divider/square rooter, and 20 adders using decimal ring
counters, which served as adders and also as quick-access (0.0002 second) read-write register storage.
8. Used vacuum tubes for circuitry and magnetic drums for memory
9. Very Big, consumed more space.
10. Very expensive to operate & maintain.
11. Used more electricity, generated a lot of heat, which was often the cause of malfunctions.
12. Relied on machine language, the lowest-level programming language understood by computers,
to perform operations, and they could only solve one problem at a time.
13. Input was based on punched cards and paper tape, and output was displayed on printouts.
The first stored program computer was EDSAC (Electronic Delay Storage Automatic Calculator) in 1949. The first American stored program computer was EDVAC (Electronic Discrete Variable Automatic Computer), by John von Neumann, in 1950. Eckert and Mauchly produced UNIVAC I (UNIVersal Automatic Computer) in 1951. The UNIVAC was the first commercial computer delivered to a business client, the U.S. Census Bureau, in 1951. Then UNIVAC II came into existence.
Characteristics of First Generation computers:
1. Used valves for data processing and storage.
2. They had a memory size of 20 bytes and speed of 5 mbps.
3. They produced heavy noise
4. They consumed enormous power.
5. They generated a lot of heat due to the use of many valves.
6. They were very slow and unreliable.
7. They used punched cards for data storage.
8. They used binary language.
Second Generation - 1959-1964: Transistors
- Transistors replaced vacuum tubes.
- The transistor was invented in 1947 but did not see widespread use in computers until the late 1950s. The transistor was far superior to the vacuum tube.
- With the usage of transistors, computers became smaller, faster, cheaper, more energy-efficient and more reliable than their first-generation predecessors.
- Though the transistor still generated a great deal of heat that subjected the computer to damage, it was a vast improvement over the vacuum tube.
- Second-generation computers still relied on punched cards for input and printouts for output.
- Second-generation computers moved from cryptic binary machine language to symbolic or assembly languages, which allowed programmers to specify instructions in words.
- High-level programming languages were also being developed at this time, such as early versions of COBOL and FORTRAN.
- These were also the first computers that stored their instructions in their memory, which moved from a magnetic drum to magnetic core technology.

The first computers of this generation were developed for the atomic energy industry.
IBM 1620: Its size was smaller compared to first generation computers and it was mostly used for scientific purposes. IBM 1401: Its size was small to medium and it was used for business applications. CDC 3600: Its size was large and it was used for scientific purposes.
Third Generation - 1965-1970: Integrated Circuits
This is a very important generation, and a lot of activity happened in the computer field during this generation. The development of the integrated circuit was the hallmark of the third generation of computers. An IC can be thought of as a miniaturized set of valves and transistors, i.e., the functions of several valves and transistors were put into one small IC. In 1958, Jack Kilby, an engineer with Texas Instruments, developed the Integrated Circuit (IC). The integrated circuit combined three electronic components onto a small silicon disc, which was made from quartz rock. Scientists later managed to fit more components on a single chip, called a semiconductor; as a result, more components could be squeezed onto the chip and computers became ever smaller. Transistors were miniaturized and placed on silicon chips, which drastically increased the speed and efficiency of computers. Semiconductor chips conduct electronic signals only partially, hence the name. Instead of punched cards and printouts, users interacted with third generation computers through keyboards and monitors and interfaced with an operating system, which allowed the device to run many different applications at one time with a central program that monitored the memory.

Another third generation development was the use of an OS (operating system) that allowed computers to run multiple programs together, with a central program that monitored and coordinated the memory of the computer.
Computers became accessible to a mass audience for the first time because they were smaller and cheaper than their predecessors. Mini computers were developed during this period. During this period the BASIC (Beginners All-purpose Symbolic Instruction Code) language was developed by Prof. John Kemeny and Thomas Kurtz in the year 1964 for the benefit of beginners and students; till then, programming was meant only for experienced programmers and scientists. This generation also gave birth to multiprogramming and timesharing concepts.

Characteristics of Third Generation Computers

The characteristics of third generation computers, in comparison with those of previous generation computers, are:
1. Third Generation Computers were based on integrated circuit (IC) technology.
2. Third Generation Computers were able to reduce computational time from microseconds to
nanoseconds
3. Third Generation Computers devices consumed less power and generated less heat. In some cases,
air conditioning was still required.
4. The size of Third Generation Computers was smaller as compared to previous computers
5. Since hardware of the Third Generation Computers rarely failed, the maintenance cost for it was
quite low.
6. Extensive use of high-level language became possible in Third Generation Computers.
7. Manual assembling of individual components was not required for Third Generation Computers, so
it reduced the large requirement of labor & cost. However, for the manufacture of IC chips, highly
sophisticated technologies were required
8. Commercial production became easier and cheaper.
Fourth Generation - 1970-Present: Microprocessors
The microprocessor brought in the fourth generation of computers. A microprocessor is, in effect, an entire processor miniaturized onto one chip: thousands of integrated circuits are built onto a single silicon chip. What in the first generation filled an entire room could now fit in the palm of the hand. All modern-day computers are fourth generation computers, and all of us use fourth generation computers for our day-to-day activities. With the improvement in the IC
(Integrated Circuit), the size of computers started to go down. The invention of VLSI (Very Large Scale Integration) squeezed hundreds of thousands of components onto a single chip, whereas ULSI (Ultra Large Scale Integration) increased that number into the millions by the year 1980. Advancement in technology has made fourth generation computers cheaper in price and better in quality than all other generations of computers.
The Intel 4004 chip, developed in 1971, located all the components of the computer - from the central processing unit and memory to input/output controls - on a single chip. Whereas the ICs used in previous computer generations had to be manufactured to serve a special purpose, a single microprocessor could now be manufactured and then programmed to meet any number of demands. Soon everyday household items such as televisions, music systems, washing machines, microwave ovens and automobiles incorporated microprocessors.
In 1981 IBM introduced its first computer for the home user, and in 1984 Apple introduced the
Macintosh. Microprocessors also moved out of the realm of desktop computers and into many areas of life
as more and more everyday products began to use microprocessors.
As these small computers became more powerful, they could be linked together to form networks,
which eventually led to the development of the Internet. Fourth generation computers also saw the
development of GUIs, the mouse and handheld devices.
Integration Types
Small scale integration (SSI): up to 100 devices on a chip
Medium scale integration (MSI): 100 to 3,000 devices on a chip
Large scale integration (LSI): 3,000 to 100,000 devices on a chip
Very large scale integration (VLSI): 100,000 to 100,000,000 devices on a chip
Ultra large scale integration (ULSI): over 100,000,000 devices on a chip

Characteristics of Fourth Generation Computers


1. Fourth generation computers are microprocessor based systems.
2. Fourth generation computers are very small.
3. Fourth generation computers are the cheapest among all other computer generations.
4. Fourth generation computers are portable and quite reliable.
5. Fourth generation computers do not require air conditioning since they generate a negligible amount of heat.
6. Minimum maintenance is required for Fourth generation computers since hardware failure is
negligible for them.
7. The production cost of Fourth generation computers is very low
8. GUI and pointing devices enable users to learn to use the computer quickly.
9. Interconnection of computers leads to better communication and resource sharing.
10. Fourth generation computers are much more powerful than previous generations and can easily do more calculations or run more programs at a time and for longer hours.
Fifth Generation - Present and Beyond: Artificial Intelligence
Fifth generation computing devices, based on artificial intelligence, are still in development, though there are some applications, such as voice recognition, that are being used today. The idea of the fifth generation computer was introduced by Japan's Ministry of International Trade and Industry in 1982. The term fifth generation was intended to convey that the system would be a leap beyond existing computer machines. But the Fifth Generation Computer Systems (FGCS) project of Japan failed, and the Ministry of International Trade and Industry (MITI) of Japan stopped funding it. The use of parallel processing and superconductors is helping to make artificial intelligence a reality. Quantum computation and molecular and nanotechnology will radically change the face of computers in
years to come. The goal of fifth-generation computing is to develop devices that respond to natural language input and are
capable of learning and self-organization.

Advances in science behind the creation of fifth generation computers

Many advances in the science of computer design and technology are coming together to enable the creation of fifth generation computers. Two such engineering advances are given below:
1. Parallel processing, which replaces von Neumann's single central processing unit design with a system harnessing the power of many CPUs working as one.
2. The technology of superconductors, which allows the flow of electricity with very little or even no resistance, greatly improving the speed of information flow.
MEMORY OF COMPUTERS
There are two kinds of computer memory: primary and secondary. Primary memory is accessible directly
by the processing unit. RAM is an example of primary memory. As soon as the computer is switched off
the contents of the primary memory is lost. You can store and retrieve data much faster with primary
memory compared to secondary memory. Secondary memory such as floppy disks, magnetic disk, etc., is
located outside the computer. Primary memory is more expensive than secondary memory. Because of this
the size of primary memory is less than that of secondary memory. We will discuss secondary memory later on.
Computer memory is used to store two things: i) instructions to execute a program, and ii) data.
When the computer is doing any job, the data that have to be processed are stored in the primary memory.
This data may come from an input device like keyboard or from a secondary storage device like a floppy
disk.
As the program, or set of instructions, is kept in primary memory, the computer is able to follow the instructions instantly. For example, when you book a ticket at a railway reservation counter, the computer has to follow the same steps each time: take the request, check the availability of seats, calculate the fare, wait for the money to be paid, store the reservation and get the ticket printed. The program containing these steps is kept in the memory of the computer and is followed for each request.
But inside the computer, the steps followed are quite different from what we see on the monitor or screen.
In the computer's memory both programs and data are stored in binary form. You have already been introduced to the decimal number system, that is, the digits 0 to 9. The binary system has only two values, 0 and 1; these are called bits. As human beings we all understand the decimal system, but the computer can only understand the binary system. This is because the large number of integrated circuits inside the computer can be considered as switches which can be made ON or OFF. If a switch is ON it is considered 1, and if it is OFF it is 0. A number of switches in different states will give you a message like this: 110101....10. So the computer takes input in the form of 0s and 1s and gives output in the form of 0s and 1s only. Would it not be absurd if the computer gave its output only as 0s and 1s? You do not have to worry about this.
Every number in the binary system can be converted to the decimal system and vice versa; for example, binary 1010 means decimal 10. Therefore it is the computer that takes information or data in decimal form from you, converts it into binary form, processes it to produce output in binary form, and converts the output back to decimal form.
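A small Python sketch (added for illustration, not part of the original notes) shows both conversions for the example above:

n = 10
binary_text = bin(n)[2:]           # '1010' (the '0b' prefix from bin() is stripped)
print(binary_text)

back_to_decimal = int("1010", 2)   # interpret the string as a base-2 number
print(back_to_decimal)             # 10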
The primary memory of the computer, as you know, is in the form of ICs (Integrated Circuits). These circuits are called Random Access Memory (RAM). Each of RAM's locations stores one byte of information (one byte is equal to 8 bits). A bit is short for binary digit and stands for one binary piece of information, which can be either 0 or 1. You will know more about RAM later. The primary or internal storage section is made up of several small storage locations (ICs) called cells. Each of these cells
can store a fixed number of bits, called the word length.


Each cell has a unique number assigned to it, called the address of the cell, which is used to identify the cell. The addresses start at 0 and go up to (N-1). You can think of the memory as a large cabinet containing as many drawers as there are addresses; each drawer contains a word, and the address is written on the outside of the drawer.
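As a rough illustration (a sketch added here, not part of the original notes), main memory can be pictured as a list of cells indexed by addresses 0 to N-1:

N = 8
memory = [0] * N              # N cells, addressed 0 .. N-1, each holding one word

memory[3] = 0b110101          # write a word into the cell at address 3
print(memory[3])              # read it back: 53

for address, word in enumerate(memory):
    print(address, word)      # every cell is reachable directly by its address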
UNIT                      SYMBOL   POWER OF 2   NUMBER OF BYTES
Byte                      B        2^0          1
Kilobyte (1 thousand)     KB       2^10         1,024
Megabyte (1 million)      MB       2^20         1,048,576
Gigabyte (1 billion)      GB       2^30         1,073,741,824
Terabyte (1 trillion)     TB       2^40         1,099,511,627,776
Petabyte (1 quadrillion)  PB       2^50         1,125,899,906,842,624
Exabyte (1 quintillion)   EB       2^60         1,152,921,504,606,846,976
Zettabyte (1 sextillion)  ZB       2^70         1,180,591,620,717,411,303,424
Yottabyte (1 septillion)  YB       2^80         1,208,925,819,614,629,174,706,176

1024 YB = 1 Brontobyte
1024 Brontobytes = 1 Geopbyte
The Geopbyte is the highest memory measurement unit.
Capacity of Primary Memory
You know that each cell of memory contains one character or 1 byte of data. So the capacity is defined in terms of bytes or words. Thus a 64 kilobyte (KB) memory is capable of storing 64 x 1024 = 65,536 bytes (1 kilobyte is 1,024 bytes). Memory sizes range from a few kilobytes in small systems to several thousand kilobytes in large mainframes and supercomputers. In older personal computers you would find memory capacities in the range of 64 KB, 4 MB, 8 MB and even 16 MB (MB = megabytes); nowadays it is measured in GB.
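A quick Python check of the arithmetic above (illustrative only, not part of the original notes):

KB = 2 ** 10          # 1,024 bytes
MB = 2 ** 20          # 1,048,576 bytes
GB = 2 ** 30          # 1,073,741,824 bytes

print(64 * KB)        # 65,536 bytes in a 64 KB memory
print(16 * MB // KB)  # 16 MB expressed in KB: 16,384
print(8 * GB // MB)   # 8 GB expressed in MB: 8,192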
The following terms related to memory of a computer are discussed below:
Different types of memory
1. Random Access Memory (RAM): The primary storage is referred to as random access memory (RAM) because it is possible to randomly select any location of the memory and use it directly to store and retrieve data. It takes the same time to access any address of the memory as it takes to access the first address. It is also
called read/write memory. The storage of data and instructions inside the primary storage is temporary; it disappears from RAM as soon as the power to the computer is switched off. Memories which lose their content on failure of the power supply are known as volatile memories, so we can say that RAM is a volatile memory.
2. Read Only Memory (ROM): There is another memory in the computer, which is called Read Only Memory (ROM). Again, it is ICs inside the PC that form the ROM. The storage of programs and data in the ROM is permanent. The ROM stores some standard processing programs supplied by the manufacturer to operate the personal computer. The ROM can only be read by the CPU; it cannot be changed. The basic input/output program is stored in the ROM; it examines and initializes the various equipment attached to the PC when the power is switched ON. Memories which do not lose their content on failure of the power supply are known as non-volatile memories. ROM is a non-volatile memory.
3. PROM: There is another type of primary memory in the computer, which is called Programmable Read Only Memory (PROM). You know that it is not possible to modify or erase programs stored in ROM, but it is possible for you to store your own program in a PROM chip. Once the programs are written they cannot be changed, and they remain intact even if the power is switched off. Therefore programs or instructions written in PROM or ROM cannot be erased or changed.
4. EPROM: This stands for Erasable Programmable Read Only Memory, which overcomes the limitation of PROM and ROM. An EPROM chip can be programmed time and again by erasing the information stored earlier in it. Information stored in an EPROM is erased by exposing the chip to ultraviolet light for some time, and the chip is then reprogrammed using a special programming facility. When the EPROM is in use, the information can only be read.
5. Cache Memory: The speed of the CPU is extremely high compared to the access time of main memory. Therefore the performance of the CPU decreases due to the slow speed of main memory. To reduce this mismatch in operating speed, a small memory chip is attached between the CPU and main memory whose access time is very close to the processing speed of the CPU. It is called cache memory. Cache memories are accessed much faster than conventional RAM. They are used to store programs or data currently being executed, or temporary data frequently used by the CPU. In this way, cache memory makes main memory appear faster and larger than it really is. It is very expensive to have a bigger cache, so its size is normally kept small.
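The idea of checking a small fast store before the large slow one can be sketched in Python (a simplified illustration with made-up data, not part of the original notes):

# Look in the small cache first and fall back to (slow) main memory on a miss.
main_memory = {address: address * 2 for address in range(1000)}   # pretend contents
cache = {}
CACHE_CAPACITY = 4

def read(address):
    if address in cache:                  # cache hit: fast path
        return cache[address]
    value = main_memory[address]          # cache miss: slow path
    if len(cache) >= CACHE_CAPACITY:      # keep the cache small
        cache.pop(next(iter(cache)))      # evict the oldest entry
    cache[address] = value
    return value

print(read(7), read(7))   # the second read of address 7 is served from the cache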
6. Registers: The CPU processes data and instructions at high speed, and there is also movement of data between the various units of the computer. It is necessary to transfer the processed data at high speed, so the computer uses a number of special memory units called registers. They are not part of the main memory; they store data or information temporarily and pass it on as directed by the control unit.
7. Flash Memory: A solid-state, nonvolatile, rewritable memory that functions like a combination of
RAM and hard disk. Flash memory is durable, operates at low voltages, and retains data when
power is off. Flash memory cards are used in digital cameras, cell phones, printers, handheld
computers, pagers, and audio recorders.
8. Virtual Memory:
In the early years, computer memories were small and expensive. Programmers of the PDP-1, for example, had a total memory size of only 4096 18-bit words for both the user programs and the operating system, so the programmer had to fit his program into this small memory. Nowadays, computers have gigabytes of memory, but modern programs need even more. To solve this problem, operating systems use secondary memories such as disks as an extension of main memory.
In the first technique, the programmer divided the program up into a number of pieces called
overlays. At the start of the program, the first overlay was loaded into memory; when it finished, the next overlay was loaded. The programmer had to manage the overlays between memory and disk himself: he was responsible for finding each piece on the disk and loading it into memory. This was difficult for programmers.
In 1961, a group of researchers from Manchester devised an automatic overlay management system called virtual memory.
Virtual memory is organized into "pages". A page is the memory unit, typically a few kilobytes in size; it is most commonly 4 KB. (On some Unix systems you can find the page size with the pagesize command.) When a program references an address on a page not present in main memory, a page fault occurs. After a page fault, the operating system looks for the corresponding page on the disk and loads it into main memory, using a page replacement algorithm such as LRU to decide which resident page to evict. A program can even be started when none of it is in main memory: when the CPU tries to fetch the first instruction of the program, it gets a page fault, because main memory does not yet contain any piece of the program. This method is called demand paging.
If a process in main memory has low priority or is sleeping, it will not run soon. In this case the process can be backed up on disk by the operating system; such a process is said to be swapped out. The swap space is used for holding this memory data.
Processes use virtual addresses for transparency; they do not know about physical memory. The CPU has a unit called the Memory Management Unit (MMU) which is responsible for operating virtual memory. When a process makes a reference to a page that isn't in main memory, the MMU generates a page fault. The kernel catches it and decides whether the reference is valid or not. If it is invalid, the kernel sends the signal "segmentation violation" to the process. If it is valid, the kernel retrieves the page the process referenced from the disk.
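The following short Python sketch (illustrative only, with simplified assumptions such as a memory of just 3 page frames) simulates demand paging with LRU replacement as described above: every reference to a page that is not resident causes a page fault, and the least recently used page is evicted when memory is full.

from collections import OrderedDict

FRAMES = 3                       # assume main memory holds only 3 pages
resident = OrderedDict()         # resident pages, kept in least-recently-used order
faults = 0

def reference(page):
    # Touch a page; load it on a fault, evicting the LRU page if memory is full.
    global faults
    if page in resident:
        resident.move_to_end(page)       # mark as most recently used
        return
    faults += 1                          # page fault: the page must be loaded
    if len(resident) >= FRAMES:
        resident.popitem(last=False)     # evict the least recently used page
    resident[page] = True

for p in [1, 2, 3, 1, 4, 2]:
    reference(p)

print("page faults:", faults)            # 5 faults for this reference string
print("resident pages:", list(resident)) # [1, 4, 2]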
SECONDARY STORAGE
You are now clear that the operating speed of primary memory or main memory should be as fast as possible to cope with the CPU speed. These high-speed storage devices are very expensive, and hence the cost per bit of storage is also very high. Again, the storage capacity of the main memory is very limited. Often it is necessary to store hundreds of millions of bytes of data for the CPU to process. Therefore additional memory is required in all computer systems. This memory is called auxiliary memory, additional memory, attached memory or secondary memory.
In this type of memory the cost per bit of storage is low. However, the operating speed is slower than that of primary storage. Huge volumes of data are stored here on a permanent basis and transferred to primary storage as and when required. The most widely used secondary storage devices are magnetic tapes and magnetic disks.
1. Magnetic Tape: Magnetic tapes are used with large computers like mainframe computers, where large volumes of data are stored for a longer time. On a PC you can also use tapes in the form of cassettes. The cost of storing data on tape is low. Tapes consist of magnetic material that stores data permanently. A tape can be 12.5 mm to 25 mm wide and 500 to 1200 metres long, made of plastic film coated with magnetic material. The tape deck is connected to the central processor, and information is fed into or read from the tape through the processor. It is similar to a cassette tape recorder.
Figure: Magnetic Tape (a 10.5 inch computer magnetic tape reel)


Advantages of Magnetic Tape:
Compact: A 10-inch diameter reel of tape is 2400 feet long and is able to hold 800, 1600 or 6250 characters in each inch of its length. The maximum capacity of such a tape is about 180 million characters (a rough check of this figure follows this list). Thus data are stored much more compactly on tape.
Economical: The cost of storing characters is very low compared to other storage devices.
Fast: Copying of data is easy and fast.
Long term Storage and Re-usability: Magnetic tapes can be used for long term storage and a tape can be used repeatedly without loss of data.
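As a rough check of the capacity figure quoted above, assuming the highest density of 6250 characters per inch: 2400 feet x 12 = 28,800 inches of tape, and 28,800 inches x 6250 characters per inch = 180,000,000 characters, i.e. about 180 million characters.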

2. Magnetic Disk: You might have seen the gramophone record, which is circular like a disk and coated with magnetic material. The magnetic disk is a random access device: you can pick out any file or record at random, and the access time to reach any record or file you require is low. These are non-volatile storage devices. Magnetic disks used in computers are made on the same principle as the gramophone record. The disk rotates at very high speed inside the disk drive. Data is stored on both surfaces of the disk. Magnetic disks are the most popular direct access storage devices. Each disk consists of a number of invisible concentric circles called tracks. Information is recorded on the tracks of a disk surface in the form of tiny magnetic spots. The presence of a magnetic spot represents a one bit and its absence represents a zero bit. The information stored in a disk can be read many times without affecting the stored data, so the reading operation is non-destructive. But if you want to write new data, the existing data is erased from the disk and the new data is recorded.


Hard Disk
A hard disk drive (often shortened to "hard disk", "hard drive", or "HDD") is a non-volatile storage device which stores digitally encoded data on rapidly rotating platters with magnetic surfaces. Strictly speaking, "drive" refers to a device distinct from its medium, such as a tape drive and its tape, or a floppy disk drive and its floppy disk. Early HDDs had removable media; however, an HDD today is typically a sealed unit, called a Winchester disk. Here the hard disk will be
kept in an airtight box and the user cannot tamper with it normally.
HDDs record data by magnetizing ferromagnetic material directionally, to represent either a 0 or a
1 binary digit. They read the data back by detecting the magnetization of the material. A typical
HDD design consists of a spindle which holds one or more flat circular disks called platters, onto
which the data are recorded. The platters are made from a non-magnetic material, usually
aluminum alloy or glass, and are coated with a thin layer of magnetic material. Older disks used
iron(III) oxide as the magnetic material, but current disks use a cobalt-based alloy.

How files are stored in a disk
Data is stored in blocks; a block is a set of sectors.
Tracks are divided into various sectors, and the actual files or information are stored in the available sectors.
Files have names, are indefinite in size and may be updated (in part or in whole).
Directory entries record file data.
The file allocation table (FAT) keeps track of the pieces of a file and contains the file name, size, date and time of creation, the attributes of the file and the address of the starting cluster. At the end of each cluster you will have the address of the next cluster, so your file is stored as a sequence of links (see the sketch after this list).
If there is any break in the links you will come across the "lost clusters or chains" problem. The data of the file still exists on the disk, but the OS is unable to follow the lost links (usually because of an improper shutdown or closing), so not all the details of your files will be listed.
A cylinder is the set of same-numbered concentric circles (tracks) on all the recording surfaces, i.e. the 1st track of head 0 and the 1st track of head 1.
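The following is a minimal sketch in Python of how the chain of cluster links is followed to read a whole file. The directory entry, the table contents and the "EOF" end-of-chain marker are made up for illustration and do not reproduce the real on-disk FAT format.

    # Illustrative only: a toy file allocation table as a Python dictionary.
    # Each entry maps a cluster number to the next cluster of the same file;
    # "EOF" marks the last cluster of the chain.
    FAT = {5: 9, 9: 10, 10: 14, 14: "EOF"}

    # Toy directory entry: name, size, and the address of the starting cluster.
    directory_entry = {"name": "REPORT.TXT", "size": 2048, "start_cluster": 5}

    def read_file_clusters(entry, fat):
        """Follow the chain of links from the starting cluster to the end of the file."""
        clusters = []
        cluster = entry["start_cluster"]
        while cluster != "EOF":
            if cluster not in fat:            # broken link: the rest of the chain is "lost"
                raise ValueError("lost cluster chain at cluster %r" % cluster)
            clusters.append(cluster)
            cluster = fat[cluster]
        return clusters

    print(read_file_clusters(directory_entry, FAT))   # prints [5, 9, 10, 14]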
Floppy Disk: It is similar to magnetic disk discussed above. The floppies are made up of thin,
flexible polythene film on which high quality magnetic oxide is coated. Since it is flexible in
nature, it is called a floppy. It is encased in a square plastic shell to protect it from mishandling, dust and moisture. Floppies are 8 inch, 5.25 inch or 3.5 inch in diameter. They come in single or double density and are recorded on one or both surfaces of the diskette. The capacity of a 5.25-inch floppy is 1.2 megabytes, whereas for a 3.5 inch floppy it is 1.44 megabytes [High Density]. It is cheaper than other storage devices and is portable. The floppy is a low cost device particularly suitable for personal computer systems. Now, they have been largely superseded by USB flash drives, external hard drives, CDs, DVDs, and memory cards (such as Secure Digital).
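As a rough check of the 1.44 MB figure, assuming the usual geometry of a 3.5 inch high density diskette (2 sides x 80 tracks x 18 sectors per track x 512 bytes per sector): 2 x 80 x 18 x 512 = 1,474,560 bytes, which is marketed as 1.44 MB.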

Figure: Floppy Disks (8 inch, 5.25 inch and 3.5 inch)

3. Optical Disk:
CD-ROM and DVD are optically readable media, in contrast to hard disks, floppy disks and tapes, which are magnetic. Optical storage media are read with a very thin and very precisely aimed laser beam. They supplement the magnetic media.
They have clear advantages in the areas of data density and stability: data can be packed much more densely on optical media than on magnetic media, and they have a much longer life span. It is presumed that magnetic media, such as a hard disk or DAT (digital audio tape), can maintain their data for a maximum of about five years; the magnetism simply fades away in time. Conversely, the life span of optical media is counted in tens of years.
Let us take a closer look at these disks, which are becoming increasingly popular for all types of
information, education and entertainment.
With every new application and software there is a greater demand for memory capacity. It is the necessity to store large volumes of data that has led to the development of the optical disk storage medium. Optical disks can be divided into the following categories:

1. Compact Disk Read Only Memory (CD-ROM): The compact disk (CD) was introduced by Philips and Sony in 1980 to replace LP records. It is a small plastic disk with a reflecting metal coating, usually aluminium. Myriads of tiny indentations are burned into this coating. These indentations contain the music in millions of bits. The CD is organized in tracks, and each track is assigned a number. The big advantage of the CD is its high-quality music reproduction and total absence of background noise, as well as its great dynamic range. During operation, the software in the drive can correct errors caused by such things as finger marks on the disk. All in all, the CD is an excellent music storage medium.


CD-ROM disks are made of reflective metals. A CD-ROM is written during the process of manufacturing by a high power laser beam. Here the storage density is very high, the storage cost is very low and the access time is relatively fast. Each disk is approximately 4 1/2 inches in diameter and can hold over 600 MB of data. As the CD-ROM can only be read, we cannot write to it or make changes to the data contained in it.
2. Write Once, Read Many (WORM): The inconvenience that we cannot write anything onto a CD-ROM is avoided in WORM. A WORM allows the user to write data permanently on to the disk. Once the data is written it can never be erased without physically damaging the disk. Here data can be recorded from a keyboard, video scanner, OCR equipment and other devices. The advantage of WORM is that it can store vast amounts of data amounting to gigabytes (10^9 bytes). Any document in a WORM can be accessed very fast, say in less than 30 seconds.
3. Erasable Optical Disk: These are optical disks where data can be written, erased and re-written. They also apply a laser beam to write and re-write the data. These disks may be used as alternatives to traditional disks. Erasable optical disks are based on a technology known as magneto-optical (MO). To write a data bit on to the erasable optical disk, the MO drive's laser beam heats a tiny, precisely defined point on the disk's surface and magnetises it.
The Compact Disk
The CD-ROM is designed differently. It has only one track, a spiral winding its way from the
center to the outer edge:
This long spiral track holds up to 650 MB data in about 5.5 billion dots (each is one bit). The
incredibly small dimensions of the bumps make the spiral track on a CD extremely long. If you could lift
the data track off a CD and stretch it out into a straight line, it would be 0.5 microns wide and almost 3.5
miles (5 km) long!
A CD has a long, spiraled data track. If you were to unwind this track, it would extend out 3.5 miles
(5 km).
CD Basics: The Bumps
If you've read How CDs Work, you understand the basic idea of
CD technology. CDs store music and other files in digital form -- that
is, the information on the disc is represented by a series of 1s and 0s
(see How Analog and Digital Recording Works for more information).
In conventional CDs, these 1s and 0s are represented by millions of tiny
bumps and flat areas on the disc's reflective surface. The bumps and
flats are arranged in a continuous track that measures about 0.5 microns
(millionths of a meter) across and 3.5 miles (5 km) long.
To read this information, the CD player passes a laser beam over the
track. When the laser passes over a flat area in the track, the beam is reflected directly to an optical sensor
on the laser assembly. The CD player interprets this as a 1. When the beam passes over a bump (pit), the
light is bounced away from the optical sensor. The CD player recognizes this as a 0.


Courtesy : http://computer.howstuffworks.com/cd-burner1.htm
A CD player guides a small laser along the CD's data track.
In conventional CDs, the flat areas, or lands, reflect the light back to the laser assembly; the bumps deflect
the light so it does not bounce back.
What the CD Player Does: Tracking
The hardest part is keeping the laser beam centered on the data track. This centering is the job of
the tracking system. The tracking system, as it plays the CD, has to continually move the laser outward.
As the laser moves outward from the
center of the disc, the bumps move past
the laser faster -- this happens because the
linear, or tangential, speed of the bumps is
equal to the radius times the speed at
which the disc is revolving (rpm).
Therefore, as the laser moves outward, the
spindle motor must slow the speed of the
CD. That way, the bumps travel past the
laser at a constant speed, and the data
comes off the disc at a constant rate.
Data read from CD-ROM
Data are usually read from the CD-ROM at a constant speed. The principle is called CLV (Constant Linear Velocity). It implies that the data track must pass under the read head at the same rate, whether in the inner or outer parts of the track. This is accomplished by varying the disk rotation speed based on the read head's position: the closer the head is to the centre of the disk, the faster the rotation speed (see the sketch below). On a music CD, data are read sequentially, so rotation speed variation is not a problem. The CD-ROM disk, on the other hand, has to be read in a random pattern; the read head must jump frequently to different parts of the disk, and therefore it constantly has to change rotation speed. You can feel that: it causes pauses in the read function. That is a disadvantage of the CD-ROM media. Also, the faster versions can be rather noisy.
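A minimal sketch in Python of how the required rotation speed falls as the head moves outward under CLV. The linear velocity and the two radii are typical illustrative figures for a 1x audio CD, not values taken from this text.

    # Constant Linear Velocity: track speed v = 2 * pi * r * (rpm / 60),
    # so the rotation speed needed for a given linear velocity v is rpm = 60 * v / (2 * pi * r).
    import math

    LINEAR_VELOCITY = 1.3          # metres per second (roughly 1x audio CD speed, illustrative)

    def required_rpm(radius_m):
        """Rotation speed needed to keep the track moving at LINEAR_VELOCITY."""
        return 60 * LINEAR_VELOCITY / (2 * math.pi * radius_m)

    print(round(required_rpm(0.025)))   # near the centre (2.5 cm): about 500 rpm
    print(round(required_rpm(0.058)))   # near the edge  (5.8 cm): about 214 rpm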
The construction of a CD
The CD itself is made up of one continuous track about 0.5 microns wide and around 5 km in length. This track is a small groove spiralling round and round the CD from the centre to the edge. The materials used to make a CD are, from the top: the label, then a layer of acrylic, a layer of aluminium, and finally a thicker layer of plastic to protect the CD.



When manufacturing a production CD like the ones you buy in the shops, a heavy duty stamp is used with microscopic bumps arranged as a single track of data. This is stamped onto a disc of polycarbonate plastic. Then the aluminium coating is applied for its properties as a reflective surface. Acrylic is then applied for protection, and the label is then placed on top. This is obviously a large volume solution and the technique is no good for home use.
CD-R and CD-E
In 1990, the CD-ROM technique was advanced to include home burning. You could buy your own burner, that is, a drive which can write on special CD-ROM disks. These have a heat-sensitive dye layer which is changed by the writing laser. You can only write on any given part of these disks once. This CD-R disk is also called a WORM disk (Write Once Read Many). Once the CD-R is burnt, it can be read in any CD drive for sound or data.
There is also a type called CD-erasable (CD-E, now better known as CD-RW), where you can write multiple times on the same disk surface. This technique is promising. However, not all CD drives can read these CDs. The latest drives, which can adjust the laser beam to match the current media, are called multi-read. Look for that when you buy a new CD-ROM drive.
DVD
The next optical drive is the DVD (Digital Versatile Disk, originally Digital Video Disk) drive. DVDs are developed by several companies (Philips, Sony, and others) and represent a promising technology.
They are thought of as a future all-round disk, which will replace CD-ROMs and laser disks. In the future, DVD might also replace VHS tapes for videos. Certain DVD drives can both read and write the disks. There are also read-only drives, designed for playing videos.
The DVD is a flat disk, the size of a CD - 4.7 inches in diameter and .05 inches thick. Data are stored in a small indentation in a spiral track, just like on the CD. DVD disks are read by a laser beam of shorter wavelength than used by CD-ROM drives. This allows for smaller indentations and increased storage capacity.
The data layer is only half as thick as in the CD-ROM. This opens the possibility of writing data in two layers. The outer gold layer is semi-transparent, to allow reading of the underlying silver layer. The laser beam is set to two different intensities, strongest for reading the underlying silver layer.
Other DVD types
We have the following DVD versions:
DVD-ROM is read-only, like the CD-ROM. This medium is usable for distribution of software, but especially for multimedia products like movies. The outer layer can hold 4.7 GB, the underlying layer 3.8 GB. The largest version can hold a total of 17 GB.
DVD-R (recordable) is write-once only, like CD-R. This disk can hold 3.9 GB per side.
DVD-RAM can be written and read like a hard disk. Capacity is 2.6 GB per side, or whatever the manufacturers agree on. There are many problems with this format.
4. Flash Drives


Flash drives use flash memory media.
They have no moving parts, so they are more resistant to shock and vibration, require less power and make no sound.
They are solid-state storage systems, most often found in the form of:
Flash memory cards
USB flash drives
Solid-state drives
Hybrid hard drives
They are very small and so are very appropriate for use with digital cameras, digital music players, handheld PCs, notebook computers, smart phones, MP4 players, etc.
Now the most common storage medium is the pen drive, which uses the USB port. It has replaced floppy drives, and floppies have now become extinct.
INPUT OUTPUT DEVICES

A computer is only useful when it is able to communicate with the external environment. When
you work with the computer you feed your data and instructions through some devices to the
computer. These devices are called Input devices. Similarly computer after processing, gives
output through other devices called output devices.
For a particular application one form of device is more desirable compared to others. We will
discuss various types of I/O devices that are used for different types of applications. They are
also known as peripheral devices because they surround the CPU and make a communication
between computer and the outer world.
Input Devices
Input devices are necessary to convert our information or data into a form which can be understood by the computer. A good input device should provide timely, accurate and useful data to the main memory of the computer for processing. The following are the most useful input devices.



1. Card Reader :

This is a sequential access device in which information is entered in the form of holes punched on cards.

The IBM 80-column punching format, with rectangular holes, eventually won out
over the competing UNIVAC 90-character format, which used 45 columns (2
characters in each) of 12 round holes. Punch cards were widely known as just IBM
cards, even though other companies made cards and equipment to process them. The
rectangular bits of paper punched out are called chad (recently, chads) or chips (in
IBM usage).
IBM punch card format

The IBM card format held 80 columns with 12 punch locations each, representing 80
characters. The top two positions were called zone punches, 12 (top) and 11. These
often encoded plus and minus signs. The remaining ten positions represented (from
top to bottom) the digits 0 through 9.
Originally only numeric information was coded, with 1 or 2 punches per column: digits
(digit [0-9]) and signs (zone [12,11] sometimes over-punching the Least Significant
Digit). Later, codes were introduced for upper-case letters and special characters. A column
with 2 punches (zone [12,11,0] + digit [1-9]) was a letter; 3 punches (zone [12,11,0] + digit
[2-4] + 8) was a special character. The introduction of EBCDIC in 1964 allowed columns
with as many as 6 punches (zones [12,11,0,8,9] + digit [1-7]). For computer applications,
binary formats were sometimes used, where each hole represented a single binary digit (or
"bit").
Information is entered by punching holes on cards, using a code for each character (a small illustrative sketch follows below).
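A minimal sketch in Python of the idea of column coding described above. The punch combinations shown follow the common Hollerith-style coding (zone punch plus digit punch); the dictionary and helper name are made up for illustration.

    # Illustrative punch-card coding: each character maps to the rows punched in one column.
    # Digits use a single digit punch; letters combine a zone punch (12, 11 or 0) with a digit punch.
    PUNCH_CODE = {
        "5": [5],          # digit: one punch in row 5
        "A": [12, 1],      # zone 12 + digit 1
        "J": [11, 1],      # zone 11 + digit 1
        "S": [0, 2],       # zone 0  + digit 2
    }

    def encode(text):
        """Return, for each character, the list of rows to punch in its column."""
        return [PUNCH_CODE[ch] for ch in text]

    print(encode("AJS5"))   # prints [[12, 1], [11, 1], [0, 2], [5]]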


A card punching machine was used to punch information on cards. A card verifier should be used to verify the information punched on a card; without verifying the information punched on a card, we cannot use it in a card reader. A card sorter is used to arrange the cards in the desired sorting order. A card reader is used to read the information on the cards. There is a passage in the card reader for the flow of cards. At the top of that passage there are 12 light sources and at the bottom there are 12 photocells. As a card moves through the passage, light falls through the punched holes onto the corresponding photocells; a cell on which light falls is activated and this is read as 1. If a cell is not activated, it is read as 0. In this way all the information is sent to the computer in binary format.

Now these cards have become obsolete because:
1. They are very costly. A card may cost 50 paise and only 80 characters can be entered on a card. If you want to enter 80,000 characters you have to use 1000 cards, so the total cost will be Rs. 500. A floppy costing only Rs. 25 can hold 1.2 MB of information, and a 4 MB pen drive costs only about Rs. 500.
2. Cards are not reusable, whereas floppies and pen drives are reusable. If you make any mistake, the card cannot be reused; you have to take a fresh card and re-enter the information on it.
3. The cost of maintaining the cards is also prohibitive. You have to protect them from dust, moisture and insects.
4. They occupy huge storage space, whereas floppy disks, hard disks, CDs, DVDs, pen drives etc. occupy far less space.
2. Magnetic Tape (discussed earlier)
3. Magnetic Disk - Floppy Disk and Hard Disk (discussed earlier)
4. Optical Disk (discussed earlier)


5. Keyboard: - This is the standard input device attached to all computers. The layout of
keyboard is just like the traditional typewriter of the type QWERTY. It also contains some
extra command keys and function keys. It contains a total of 101 to 104 keys. A typical
keyboard used in a computer is shown in Fig. 2.6. You have to press correct combination of
keys to input data. The computer can recognise the electrical signals corresponding to the
correct key combination and processing is done accordingly.
The Maltron Keyboard
The Maltron keyboard is designed to lessen user fatigue and perhaps carpal tunnel
syndrome. Note the angles of the keys and the many keys that are operated with your
thumbs.
6. Mouse: The mouse is an input device, shown in Fig. 2.7, that is used with your personal computer. It rolls on a small ball and has two or three buttons on the top. When you roll the mouse across a flat surface, the cursor on the screen moves in the direction of the mouse movement. The cursor moves very fast with the mouse, giving you more freedom to work in any direction. It is easier and faster to move through a mouse.
7. Scanner: The keyboard can input only text through keys provided in it. If we want to
input a picture the keyboard cannot do that. Scanner is an optical device that can input any
graphical matter and display it back. The common optical scanner devices are Magnetic Ink
Character Recognition (MICR), Optical Mark Reader (OMR) and Optical Character Reader
(OCR).
8. Magnetic Ink Character Recognition (MICR): - This is widely used by banks to process large
volumes of cheques and drafts. Cheques are put inside the MICR. As they enter the reading unit the
cheques pass through the magnetic field which causes the read head to recognise the character of the
cheques.
9. Optical Mark Reader (OMR): This technique is used when students have appeared in
objective type tests and they had to mark their answer by darkening a square or circular
space by pencil. These answer sheets are directly fed to a computer for grading where
OMR is used.
10. Optical Character Recognition (OCR): This technique permits the direct reading of any printed character. Suppose you have a set of hand written characters on a piece of paper. You put it inside the scanner of the computer. This pattern is compared with a set of
patterns stored inside the computer. Whichever pattern matches is taken as the character read. Patterns that cannot be identified are rejected. OCRs are expensive, though better than MICR.
Output Devices
1. Visual Display Unit: The most popular input/output device is the Visual Display Unit (VDU). It is also called the monitor. A keyboard is used to input data and a monitor is used to display the input data and to receive messages from the computer. A monitor has its own box which is separate from the main computer system and is connected to the computer by a cable; in some systems it is compact with the system unit. It can be colour or monochrome. A monochrome monitor offers only black and white; colour monitors may be CGA (Colour Graphics Adapter) with 4 colours, EGA (Enhanced Graphics Adapter), VGA (Video Graphics Array) with more colours, or Super VGA.
Concepts and terminology: When we talk about screens, there are different types to choose from:
CRT (Cathode Ray Tube) - the common type of screen. They are found in different technologies, such as Invar and Trinitron.
LCD (Liquid Crystal Display) - flat displays. TFT is the most expensive display of this type. The TFT screen is also called a "soft" screen, since the images appear softer than on cathode ray tubes.
Common principles
The principles of these screen types are quite different, but the screen image design rests on the same concepts:
Pixels. The screen image is made of pixels (tiny dots), which are arranged in rows across the screen. A screen image consists of between 480,000 and 1,920,000 pixels.
Refresh rate. The screen image is "refreshed" many times per second. Refresh rates are measured in Hertz (Hz), which means "times per second".
Color depth. Each pixel can display a number of different colors. The number of colors which can be displayed is called the color depth. Color depth is measured in bits.
Video RAM. All video cards have some RAM. How much is needed depends on the resolution and the desired color depth (a worked example follows this list). Video cards usually have 1, 2 or 4 MB of RAM for normal usage.
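As a rough worked example (assuming one full frame is held in video RAM): a screen of 800 x 600 pixels at 16-bit color depth needs 800 x 600 x 2 bytes = 960,000 bytes, i.e. just under 1 MB, while 1024 x 768 pixels at 24-bit color needs 1024 x 768 x 3 bytes = 2,359,296 bytes, i.e. about 2.25 MB, so a 4 MB card would be used.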
These concepts are central to the understanding of the video system. Since the CRT screens
are still by far the most common, they will form the basis for this review.


3. Terminals: It is a very popular interactive input-output unit. It can be divided into two
types: hard copy terminals and soft copy terminals. A hard copy terminal provides a
printout on paper whereas soft copy terminals provide visual copy on monitor. A
terminal when connected to a CPU sends instructions directly to the computer.
Terminals are also classified as dumb terminals or intelligent terminals depending upon
the work situation.
4. Printer: It is an important output device which can be used to get a printed copy of the
processed text or result on paper. There are different types of printers that are designed
for different types of applications. Depending on their speed and approach of printing,
printers are classified as impact and non-impact printers. Impact printers use the
familiar typewriter approach of hammering a typeface against the paper and inked
ribbon. Dot-matrix printers are of this type. Non-impact printers do not hit or impact a
ribbon to print. They use electro-static chemicals and ink-jet technologies. Laser
printers and Ink-jet printers are of this type. This type of printers can produce color
printing and elaborate graphics.
In a dot-matrix printer there is a 5 x 7 (as shown below) or 7 x 9 matrix of pins in the printer head.

When any character is to be printed, the pins forming that letter are projected and the others are not. In the picture above, the pins shown in black form the letter X and the others are not projected. When the printer head hammers on the paper through the inked ribbon, the character X is formed (a small sketch of this idea follows below).
A dot-matrix printer can print at a maximum of about 300 characters per second. You can print charts, graphs and special characters using this printer.
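A minimal sketch in Python of how a 5 x 7 pin pattern forms a character. The pattern below is a made-up approximation of the letter X, purely for illustration.

    # Each row is a string of 5 positions: '#' means the pin is projected, '.' means it is not.
    # Hammering the projected pins through the inked ribbon prints the character on paper.
    LETTER_X = [
        "#...#",
        "#...#",
        ".#.#.",
        "..#..",
        ".#.#.",
        "#...#",
        "#...#",
    ]

    for row in LETTER_X:
        # replace '#' with a printed dot and '.' with a blank, as the pins would on paper
        print(row.replace("#", "*").replace(".", " "))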
5. All the input media discussed above other than cards (magnetic tape, magnetic disk, optical disk, etc.) also act as output devices.
Multimedia Technology

Multimedia technology is the computer-based integration of text, sound, still images,

animation and digitized motion video.

Merges capabilities of computers with televisions, VCRs, CD players, DVD players, video and
audio recording equipment, music and gaming technologies.

Innovations in Hardware Utilization


Server Farms: massive data centers that contain thousands of networked computer
servers.
Virtualization: using software to create partitions on a single server so that multiple
applications can run on a single server.
Grid computing involves applying the resources of many computers in a network to a
single problem at the same time.
Utility computing (also called subscription computing and on-demand computing) is
when a service provider makes computing resources and infrastructure management
available to a customer as needed for a charge based on specific usage rather than a
flat rate.
Edge Computing: process where parts of Web content and processing are located close
to the user to decrease response time and lower processing costs.
Autonomic Computing: systems that manage themselves without direct human
intervention.
Nanotechnology refers to the creation of materials, devices and systems at a scale of 1
to 100 nanometers (billionths of a meter).

SOFTWARE
WHAT IS SOFTWARE?
As you know, a computer cannot do anything without instructions from the user. In order to do any specific job you have to give a sequence of instructions to the computer. This set of instructions is called a computer program. Software refers to the set of computer programs and the procedures that describe the programs and how they are to be used. We can say that it is the collection of programs which increases the capabilities of the hardware. Software guides the computer at every step, telling it where to start and stop during a particular job. The process of software development is called programming.
You should keep in mind that software and hardware are complementary to each other; both have to work together to produce a meaningful result. Another important point you should know is that producing software is difficult and expensive.
SOFTWARE TYPES
Computer software is normally classified into three broad categories:
Utility Software
Application Software
System Software
In some books only 2 types of software are mentioned, namely Application Software (combining Application Software and Utility Software) and System Software.
Utility Software: Utility software is a set of programs to carry out common tasks or activities which do not pertain to any particular organization. It is developed by programmers or organizations with the intention of common usage by all who need that software, for example word processors, the MS-Office package, FoxPro etc.

Application Software: Application software is a set of programs to carry out operations for a specific application of an organization, developed for its own use by its own programmers or by outsiders. For example, payroll is an application software for an organization to produce pay slips as output. Other examples are billing systems, accounting, producing statistical reports, analysis of research data, weather forecasting, etc. Such software is for a particular organisation's use only; it cannot be used by others, or can be used by others only with some modifications.
System Software: You know that a set of instructions (a program) has to be fed to the computer for the operation of the computer system as a whole. When you switch on the computer, the programs written in ROM are executed, which activate the different units of your computer and make it ready for you to work on it. This set of programs can be called system software. Therefore system software may be defined as a set of one or more programs designed to control the operation of the computer system.
System software consists of general programs designed for performing tasks such as controlling all the operations required to move data into and out of the computer. It communicates with printers, card readers, disks, tapes etc. and monitors the use of various hardware resources like memory and the CPU. System software is also essential for the development of application software; it allows application packages to be run on the computer with less time and effort. Remember that it is not possible to run application software without system software.
Development of system software is a complex task and it requires extensive knowledge of computer technology. Due to its complexity it is not developed in house. Computer manufacturers build and supply this system software with the computer system. DOS, UNIX and WINDOWS are some of the widely used system software. Out of these, UNIX is a multi-user operating system whereas DOS and WINDOWS are PC-based. We will discuss DOS and WINDOWS in detail in the next module.
So without system software it is impossible to operate your computer. Fig. 3.1 shows the relation between hardware, software and you as a user of the computer system.
Figure 3.1: Relation between hardware and software.
LANGUAGE

WHAT IS A LANGUAGE?

A language is a system or style of communication between human beings. Some of the natural languages that we are familiar with are English, Hindi, Tamil, French etc. These are the languages used to communicate among various categories of persons.


How you will communicate with your computer?
Your computer will not understand any of these natural languages for the transfer of data and instructions. So programming languages have been specially developed so that you can pass your data and instructions to the computer to do a specific job.

You must have heard names like FORTRAN, BASIC, COBOL etc. These are programming
languages. So instructions or programs are written in a particular language based on the type of
job. As an example, for scientific application FORTRAN and C languages are used. On the
other hand COBOL is used for business applications.
Programming Languages
Programming languages can be divided into 3 categories:
High Level Languages
Low Level Languages
Machine Language
HIGH LEVEL LANGUAGES
Assembly language and machine level language require deep knowledge of the computer hardware, whereas in a high level language you have to know only the instructions (in English words) and the logic of the problem, irrespective of the type of computer you are using.
Characteristic features of High Level Languages:
They use words and alphabets of a natural language, so anybody with a little knowledge about computer languages and computers can easily understand them.
They are machine independent. A program written for one system can be entered on any other system and then compiled to execute the program.
They maintain a one-to-many ratio when converted into machine language: one high level statement typically produces many machine instructions.
They have their own input, output and file handling commands.
COBOL, BASIC, FORTRAN, Pascal etc. are examples of high level languages.
They are mostly procedure oriented.
In a high level language a set of instructions, called a program, is developed using that particular language. The important thing to note is that the computer cannot understand even these high level language codes; they have to be converted into a machine understandable format using translating software such as a compiler or an interpreter.

Higher level languages are problem-oriented languages because the instructions are
suitable for solving a particular problem. For example COBOL (Common Business
Oriented Language) is most suitable for business applications where there is very little processing and huge output. There are mathematically oriented languages like FORTRAN (Formula Translation) and BASIC (Beginners All-purpose Symbolic Instruction Code) where a very large amount of processing is required. Thus a problem oriented language is designed in such a way that its instructions may be written more like the language of the problem. For example, businessmen use business terms and scientists use scientific terms in their respective languages.
Note: High level languages like COBOL, BASIC and FORTRAN are also called Third Generation Languages (3GL). SQL is called a Fourth Generation Language (4GL) and C is called a Middle Level Language.
Advantages of High Level Languages
High level languages have a major advantage over machine and assembly languages: they are easy to learn and use, because they are similar to the languages used by us in our day to day life.
They do not require much hardware knowledge, so a programmer with little or no knowledge of the computer hardware can write efficient and effective programs.
LOW LEVEL LANGUAGES
The term low level means closeness to the way in which the machine has been built. Low level
languages are machine oriented and require extensive knowledge of computer hardware and its
configuration.
Characteristic features of Low Level Languages:
They use symbolic codes, and it is very difficult even for an efficient programmer to write programs.
They are machine dependent. A program written for one system cannot be run on any other system.
They maintain a one-to-one ratio when converted into machine language.
They do not have their own input, output and file handling commands; you have to fetch data directly from registers and memory.
Assembly Language is an example of a low level language.


Example program in Assembly language (Intel 8085 mnemonics):
MVI A,58H    ; load the value 58H into the accumulator
IN 03        ; read a byte from input port 03 into the accumulator
ANI 01       ; AND the accumulator with 01, keeping only the lowest bit



Advantages:

1. The symbolic programming of assembly language is easier to understand and saves a lot of time and effort for the programmer.
2. It is easier to correct errors and modify program instructions.
3. Assembly language has the same efficiency of execution as machine level language, because there is a one-to-one correspondence between an assembly language program and its corresponding machine language program.
Disadvantages:
One of the major disadvantages is that assembly language is machine dependent. A
program written for one computer might not run in other computers with different hardware
configuration.
MACHINE LANGUAGE
Machine language is the only language that is directly understood by the computer. It does not need any translator program. We also call it machine code, and it is written as strings of 1s (ones) and 0s (zeros). When this sequence of codes is fed to the computer, it recognises the codes and converts them into the electrical signals needed to run it. For example, a program instruction may look like this:
1011000111101
It is not an easy language for you to learn, because it is difficult to understand. It is efficient for the computer but very inefficient for programmers. It is considered to be the first generation language. It is also difficult to debug a program written in this language.
Advantage
The only advantage is that programs in machine language run very fast, because no translation program is required by the CPU.
Disadvantages
1. It is very difficult to program in machine language. The programmer has to know
details of hardware to write program.
2. The programmer has to remember a lot of codes to write a program which results in
program errors.
3. It is difficult to debug the program.

Source Program & Object Program



The program written by the programmer in a high level language or a low level language is called the source program. After this program is converted to machine language by the compiler it is called the object program. To execute the object program you need not store the compiler or the source program in the memory.
Debugging
Locating and correcting the errors in the source code while using a compiler or interpreter is called debugging. Errors are of 2 types:
Syntax errors are nothing but wrong usage or misspelling of the syntax (grammar) of the keywords of a computer language. The translating software will display these errors.
Logical errors are errors in the logical flow of the instructions. The translating software will treat them as correct, and it is very difficult to detect these errors. Only by giving test data can you detect logical errors (a small example follows below).
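A minimal illustration in Python of the two kinds of error; the task and the faulty lines are made up for this example.

    # Syntax error: a missing colon breaks the grammar of the language,
    # so the translator reports it before the program can run.
    #     def average(numbers)        <-- SyntaxError: missing ':' at the end
    #         return sum(numbers) / len(numbers)

    # Logical error: the grammar is correct, so the translator accepts it,
    # but the logic is wrong (it divides by a fixed 2 instead of the count).
    def average(numbers):
        return sum(numbers) / 2      # wrong logic; only test data reveals it

    print(average([10, 20, 30]))     # expected 20.0, but this prints 30.0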
TRANSLATING SOFTWARE
Compiler
A compiler is a program translator that converts the instructions of a high level language or low level language into machine language.
It is called a compiler because it compiles machine language instructions for every program instruction of the high level or low level language. Thus a compiler is a program translator that scans the entire program first and then translates it into machine code.
In the process of conversion, the compiler generates an object file which is in machine understandable format.
It requires less memory space, because only the object (executable) program needs to be loaded in the computer; you do not require the compiler and the source program to be stored in memory to execute your program.
While compiling, it will give you a list of errors, and it is your task to locate the errors and correct them, which is called debugging. Locating and correcting the errors can be a major problem for beginners, because many error messages may be given for even a small number of errors. This can be confusing, and you have to go through the entire list of errors while debugging.
Higher Level Language Program --> (Compiler) --> Machine Language (Object) Program
A compiler can translate only those source programs which have been written in the language for which the compiler is meant. For example, a FORTRAN compiler will not compile source code written in the COBOL language.
The object program generated by a compiler is machine dependent: programs compiled for one type of machine will not run on another type. Therefore every type of machine must have its own compiler for a particular language. Machine independence is achieved by using one
higher level language in different machines.
Interpreter
An interpreter is another type of program translator, used for translating each and every line of a high level language program into machine understandable format.
All languages may not have interpreters.
Each line of the source program is translated into machine understandable format and executed. Once the first line has been loaded into memory, translated and executed, the second line is loaded, and so on. When one line has been executed and the next line is loaded into memory, the previous line is erased from memory. So, unlike compilers, interpreters execute the source program line by line (a small sketch of the difference follows below).
This consumes more memory, because to execute your program you must load both the interpreter and the source program, and in the process of translation it generates the object version as well.
Execution of the source program takes a long time, because each and every line has to be translated into its object version. If you have a chain of execution or looping, each statement in the chain must be loaded, translated and executed again; if there are many cycles of execution, it takes a lot of time to execute your program.
The main benefit over a compiler is error detection. If there is any error in a line, that line is listed on the screen, and until you correct all the errors in that line, control will not go to the next line. So debugging is very easy.
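A minimal sketch in Python contrasting line-by-line interpretation with translate-everything-first compilation. The tiny two-command "language", its statements and the helper names are entirely made up for illustration.

    # A made-up toy language with two statements: "PRINT <text>" and "ADD <a> <b>".
    program = ["PRINT hello", "ADD 2 3", "PRINT done"]

    # Interpreter style: translate and execute one line at a time.
    def interpret(lines):
        for line in lines:
            op, _, rest = line.partition(" ")
            if op == "PRINT":
                print(rest)
            elif op == "ADD":
                a, b = rest.split()
                print(int(a) + int(b))

    # Compiler style: translate the whole program into Python source first ("object code"),
    # then execute the translated version in one go, without looking at the source again.
    def compile_program(lines):
        translated = []
        for line in lines:
            op, _, rest = line.partition(" ")
            if op == "PRINT":
                translated.append("print(%r)" % rest)
            elif op == "ADD":
                a, b = rest.split()
                translated.append("print(%d + %d)" % (int(a), int(b)))
        return "\n".join(translated)

    interpret(program)                      # line-by-line translation and execution
    object_code = compile_program(program)  # whole program translated first
    exec(object_code)                       # run the translated "object code"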
Advantages
The advantage of an interpreter compared to a compiler is its fast response to changes, editing or debugging in the source program. It eliminates the need for a separate compilation after each change to the program.
Interpreters are easy to write and do not require large memory in the computer.

Disadvantage
The disadvantage of an interpreter is that it is a time-consuming method, because each time a statement in the program is executed it must first be translated. Thus a compiled machine language program runs much faster than an interpreted program.

COMPUTER NETWORK


A computer network is an interconnection of various computer systems located at different


places.
In computer network two or more computers are linked together with a medium and data
communication devices for the purpose of communicating data and sharing resources.
The computer that provides resources to other computers on a network is known as server. In
the network the individual computers, which access shared network resources, are known as
workstations or nodes.
Computer networks may be classified on the basis of geographical area into three broad categories:
1. Local Area Network (LAN)
2. Wide Area Network (WAN)
3. Metropolitan Area Network (MAN)
Local Area Network

Networks used to interconnect computers in a single room, rooms within a building or


buildings or simply within a campus are called Local Area Network (LAN).

LAN transmits data at a speed of several megabits per second (10^6 bits per second).
The transmission medium is normally coaxial cable.

LAN links computers, i.e., software and hardware, in the same area for the purpose of
sharing information.

Usually LAN links computers within a limited geographical area because they must
be connected by a cable, which is quite expensive.

People working on a LAN get more capabilities in data processing, word processing and other information exchange compared to stand-alone computers. Because of this information exchange, most business and government organizations use LANs.

Main benefit of LAN is, an organization can share all resources available to it to all
computer users.
Major Characteristics of LAN
every computer has the potential to communicate with any other computers of the
network
high degree of interconnection between computers
easy physical connection of computers in a network
inexpensive medium of data transmission
high data transmission rate

Advantages
The reliability of the network is high because the failure of one computer in the network does not affect the functioning of the other computers.
Addition of new computer to network is easy.
High rate of data transmission is possible.
Peripheral devices like magnetic disk and printer can be shared by other computers.
Disadvantages
If the communication line fails, the entire network system breaks down.
Use of LAN
Followings are the major areas where LAN is normally used

File transfers and Access


Word and text processing
Electronic message handling
Remote database access
Personal computing
Digital voice transmission and storage

Wide Area Network


The term Wide Area Network (WAN) is used to describe a computer network spanning a
regional, national or global area. For example, for a large company the head quarters might be
at Delhi and regional branches at Bombay, Madras, Bangalore and Calcutta. Here regional
centers are connected to head quarters through WAN. The distance between computers
connected to a WAN is larger. Therefore the transmission media used are normally telephone lines, microwaves and satellite links.
Characteristics of WAN
Followings are the major characteristics of WAN.
1. Communication Facility: For a big company spanning over different parts of the
country the employees can save long distance phone calls and it overcomes the time lag
in overseas communications. Computer conferencing is another use of WAN where
users communicate with each other through their computer system.
2. Remote Data Entry: Remote data entry is possible in WAN. It means sitting at any
location you can enter data, update data and query other information of any computer
attached to the WAN but located in other cities. For example, suppose you are sitting at
Madras and want to see some data of a computer located at Delhi, you can do it through
WAN.

3. Centralised Information: In modern computerised environment you will find that big
organisations go for centralised data storage. This means if the organisation is spread
over many cities, they keep their important business data in a single place. As the data
are generated at different sites, WAN permits collection of this data from different sites
and save at a single site.

Examples of WAN
1. Ethernet: Ethernet, developed by Xerox Corporation, is a famous example. This network uses coaxial cables for data transmission. Special integrated circuit chips called controllers are used to connect equipment to the cable.
2. ARPANET: The ARPANET is another example of a WAN. It was developed by the Advanced Research Projects Agency of the U.S. Department of Defense. This network connected more than 40 universities and institutions throughout the USA and Europe.
Difference between LAN and WAN

LAN is restricted to limited geographical area of few kilometers. But WAN covers
great distance and operate nationwide or even worldwide.

In LAN, the computer terminals and peripheral devices are connected with wires and
coaxial cables. In WAN there is no physical connection. Communication is done
through telephone lines and satellite links.

Cost of data transmission in LAN is less because the transmission medium is owned
by a single organisation. In case of WAN the cost of data transmission is very high
because the transmission medium used are hired, either telephone lines or satellite
links.

The speed of data transmission is much higher in LAN than in WAN. The
transmission speed in LAN varies from 0.1 to 100 megabits per second. In case of
WAN the speed ranges from 1800 to 9600 bits per second (bps).

Few data transmission errors occur in LAN compared to WAN. It is because in LAN
the distance covered is negligible.

NETWORK TOPOLOGY


The term topology in the context of communication network refers to the way the computers or
workstations in the network are linked together.
According to the physical arrangements of workstations and nature of work, there are three
major types of network topology. They are star topology, bus topology and ring topology.
Star Topology

In star topology a number of workstations (or nodes) are directly linked to a central node (see the figure above). Any communication between stations on a star LAN must pass through the central
node. There is bi-directional communication between various nodes. The central node controls
all the activities of the nodes. The advantages of the star topology are:

It offers flexibility of adding or deleting of workstations from the network.


Breakdown of one station does not affect any other device on the network.

The major disadvantage of star topology is that failure of the central node disables
communication throughout the whole network.
Bus Topology

In bus topology all workstations are connected to a single communication line called bus. In
this type of network topology there is no central node as in star topology. Transmission from


any station travels the length of the bus in both directions and can be received by all
workstations. The advantages of the bus topology are that:
It is quite easy to set up.
If one station of the topology fails it does not affect the entire system.

The disadvantage of bus topology is that any break in the bus is difficult to identify.
Ring Topology

In ring topology each station is attached to nearby stations on a point-to-point basis so that the entire system is in the form of a ring. In this topology data is transmitted in one direction only; the data packets circulate along the ring in either a clockwise or anti-clockwise direction. The advantage of this topology is that any signal transmitted on the network passes through all the LAN stations. The disadvantage of a ring network is that the breakdown of any one station on the ring can disable the entire system.
Mesh Topology
In mesh topology, nodes are interconnected to one another in a mesh structure. Mesh topology
is very reliable because it can provide alternative routes in forwarding a message from a
sending node to its destination node. But it is more difficult to manage and more expensive.
Mesh topology is supported by the new home monitoring system based on ZigBee technology.
In the most recent development, community (municipal) Wi-Fi networks use mesh topology to share a broadband Internet connection. The new specification of WiMAX Broadband Wireless Access (BWA) also supports mesh topology.



Tree Topology
Cluster tree topology is a star topology that has branches. A cluster tree network derives the attributes of a
star network.

Figure: Cluster Tree Topology


Two hubs provide connection between two star networks.
Hybrid Topology
This is a combination of different topologies, like bus-star topology or tree-bus topology, used to extract the benefits of more than one topology.

Various computer network layers


Client Server (Two-tier Architecture)

A client-server network has a node (usually with a higher-end configuration) that functions as a server, which provides resources (e.g. programs, disks, printers) to other nodes (client computers) and manages the clients' access to the network resources. Corporate networks are typically client-server, with one or more servers that store corporate information and employees' computers as clients.

Figure: Client-Server Network


Client computers access programs or information provided by the servers.

Server, Client & Middle Server (3-tier Architecture)
Multi-tier Architecture
Peer-to-Peer Network
A peer-to-peer network does not have a server; each node (computer) in the network can share its own resources with other nodes and determines other nodes' access levels to its own resources. Home networks are typically peer-to-peer. A peer-to-peer network is also called a workgroup.

Figure: Peer-to-Peer Network


Each computer shares its resources with other computers.
PAN (Personal Area Network) is a network that connects computers, peripherals and other devices
within a personal operating space. PAN is typically implemented using wireless (RF) technologies i.e.
ZigBee(low rate), Bluetooth (medium rate) and WiMedia/UWB (high rate). PAN coverage is about 10 to
100 meters.

LAN (Local Area Network) is a network that connects computers, peripherals and other devices within a
building (e.g. office, home) or in a limited area. Typical LAN coverage is about 50 to 300 meters. LAN is
also known as campus network. Most LANs today are implemented using Ethernet. Wireless LAN using
Wi-Fi technology also grows in popularity as an alternative to Ethernet.

Figure: LAN. Computers and printers in an office network.


MAN (Metropolitan Area Network) is a city wide network. The coverage limitation is not strict, but real
implementation may have range of up to 50 km in urban, suburban, or rural area. MAN is built to
interconnect LANs, provide access to special contents and high speed Internet access. MAN is
implemented using technologies such as ADSL or VDSL, HFC (CATV), FTTN or FTTH/FTTP, and
WiMAX.

Figure: MAN. A network that covers a city.


WAN (Wide Area Network) is a network that spans larger geographical area. PSTN (fixed telephone
network) and PLMN (mobile cellular telephone network such as GSM, CDMA, and 3G) are examples of
WAN. The largest WAN is the Internet that has worldwide coverage.

Figure: WAN. Regional, national or international network


Residential Gateway
Residential gateway is basically a router that is configured to enable the sharing of a single Internet
connection (subscription) by multiple users in a home network. However when you buy a residential
gateway, it most likely incorporates other functions such as hub, switch, wireless access point, or bridge.
Some residential gateways also already include broadband (cable/DSL) modem.

Figure: Residential Gateway


Residential gateway hides a home network from the Internet.
By using a residential gateway to connect your home network to the Internet, you don't need to always turn
on a computer as an ICS host. With a residential gateway, you don't have to manually set an IP address for
each computer in your network because a residential gateway usually has DHCP server. Using DHCP, IP
address for each computer is assigned dynamically by the residential gateway. A residential gateway also
keeps your computers anonymous on the Internet because it translates the IP address of each computer to
an IP address assigned by the ISP. This function is called Network Address Translation (NAT). Besides, a
residential gateway protects your home network from intruders that try to gain access through certain
applications in your computers because it has built-in firewall. Residential gateway is also known as
broadband router or Internet gateway device (IGD).

Gateway
Gateway functions to connect two completely different networks. It performs protocol translation.
Although gateway is considered a Layer 7 device in many publications, it actually works across the seven
layers of the OSI Model. In Internet Telephony, a gateway connects the VoIP network to the PSTN.
Figure: Gateway
VoIP/PSTN Gateway performs protocols and signaling translation,
so a VoIP-enabled phone or PC can communicate with a regular phone.
Network Components - Summary
The following summary explains network components along with their functions and the corresponding layers in the OSI Model.

Network Adapter: converts a computer message into electrical/optical signals for transmission across a network.
Figure: Network Adapters. An internal or external network adapter attaches a computer (desktop PC or laptop) to a LAN.

Modem (Modulator-demodulator): puts a message (baseband signal) on a carrier for efficient transmission; takes the baseband signal off the carrier.
Figure: Modem in Internet access. A modem at a subscriber site communicates with the corresponding modem at the operator or ISP.

Repeater (Regenerator): receives a signal, amplifies it, then retransmits it.
Figure: Repeater. A repeater extends the reach of transceivers 1 and 2. (A transceiver is a transmitter and receiver.)

Bridge: connects networks with different Layer 2 protocols; divides a network into several segments to filter traffic.
Figure: Bridge. A network bridge enables communication between two computers on different networks.

Hub: connects computers in a network; receives a packet from a sending computer and transmits it to all other computers.
Figure: Hub. When A sends to C, the Hub receives the signal from A and retransmits it to both B and C. Only C then processes the signal.

Switch: connects computers in a network; receives a packet from a sending computer and transmits it only to its destination.
Figure: Switch. When A sends to C, the Switch receives the signal from A and retransmits it only to C. B does not receive the signal.

Access Point: connects computers in a wireless network; connects the wireless network to wired networks and to the Internet.
Figure: Wireless (Wi-Fi) Access Point. The Access Point creates a wireless network and connects it to the Internet.

Router: forwards a packet to its destination by examining the packet's destination network address.

Figure: Router in the OSI Model protocol stack. The router is an OSI Layer 3 device; a packet goes through the protocol stack from its source to its destination.

Brouter: combines the best features of bridges and routers:
o Chooses the best path like a router
o Forwards packets based on hardware address like a bridge
o Maintains both a bridging table of hardware addresses and a routing table of network addresses
Brouters are useful in hybrid networks with a mixture of routable and nonroutable protocols, and may be identified as routers with bridging capabilities.

Residential Gateway: connects a home network to the Internet; hides all computers in the home network from the Internet.
Figure: Residential Gateway. The residential gateway hides a home network from the Internet.

Gateway: connects two totally different networks; translates one signaling/protocol into another.

Figure: Gateway
VoIP/PSTN Gateway performs protocols and signaling translation,
so a VoIP-enabled phone or PC can communicate with a regular phone.

Cable Types (to connect nodes)


A network needs physical medium to connect its nodes together. The physical medium is where the
data actually flows. There are several media types often used in networking. They are described in
the following paragraphs.
Twisted Pair Cable

Figure: Twisted Pair


Twisted pair is two insulated copper wires that are twisted around each other to minimize
interference and noise from other wires. Based on the presence of individual shield and overall
(outer) shield, there are three types of twisted pair, i.e. UTP, STP, and ScTP. Individual shield
encloses a single twisted pair, while outer shield encloses all twisted pairs in a cable. A shield is a
protective sheath that is made from conductive material (metal) and functions to protect the twisted
pair from external interference. An insulator is made from non-conductive material, such as plastic.

Figure: Unshielded Twisted Pair (UTP)


UTP (Unshielded Twisted Pair) is a cable containing several twisted pairs that is only insulated
but not shielded. UTP is the most widely used cable in telephone and computer networks because it
is relatively cheaper than other cables and performs well in normal electrical environment such as
inside an office or a house.

Figure: Shielded Twisted Pair (STP)


STP (Shielded Twisted Pair) is a cable containing several twisted pairs that has individual shields,
an outer shield, and an insulator. STP is more reliable than UTP. However STP is less known
because it is used only in situation where there is complex cabling such as in factory building.
Figure: Screened Twisted Pair (ScTP)


ScTP (Screened Twisted Pair) is similar to STP, but each twisted pair has no individual shield.
Twisted pair cable is graded, based on the number of twists per inch, its cable structure and traffic-carrying capacity, into several categories such as Cat 3, Cat 5, Cat 5e and Cat 6.
Coaxial Cable

Figure: Coaxial cable (coax) structure


Coaxial cable contains a solid or stranded wire in the core that is insulated with a dielectric layer,
then protected with a solid or braided metallic shield, and covered with an outer insulator.
Electromagnetic wave propagation in a coaxial cable is confined within the space between the core
and the outer conductors. The structure of a coaxial cable makes it less susceptible to interference,
noise, and crosstalk than the twisted pair cable.
Coaxial cable is often classified based on its characteristic impedance. Most coaxial cables have
characteristic impedance of 50 or 75 Ohms. Coaxial cables in the market are usually named with
RG prefix which may stand for Radio Grade. Each RG type is related with certain characteristic
impedance and outer diameter. For example RG-6 which has impedance of 75 Ohms is used for
connecting cable modem or TV to a CATV network. RG-58 (50 Ohms) is used in earlier Ethernet
networks (10Base2). Coaxial cable is terminated with RF (BNC) connectors.
Fiber Optic

Figure: Fiber optic structure


Fiber Optic (optical fiber) is a thin glass or plastic strand in the core which is surrounded by a
cladding and a protective coat and is used to carry information in optical (light) pulses. Because in
fiber optic, information is transmitted in optical pulses instead of electrical signals, fiber optic is not
affected by EMI (electromagnetic interference) and RFI (radio frequency interference). Moreover,
fiber optic has very large bandwidth which is limited only by the equipment that lights the fiber (i.e.
SDH/SONET, ATM, DWDM). But fiber optic is more expensive than twisted pair, coax and radio.

Figure: single-mode fiber (left) and multimode fiber (right)



Fiber optic is often classified into single-mode and multimode. In a single-mode fiber, light travels in one path (mode). In a multimode fiber, light travels in multiple paths (modes). Single-mode fiber can reach longer distances than multimode fiber, so it is mostly used for MANs or WANs, while multimode fiber is suitable for implementing high-speed LANs.
Because of its reliability and wider bandwidth, fiber optic is often used in backbone networks
where cables run in ducts and in broadband networks that deliver bandwidth intensive applications,
such as HDTV, video streaming, video conferencing and Video on Demand.
Wireless
Radio frequency (RF) refers to frequencies of radio waves. RF is part of electromagnetic spectrum
that ranges from 3 Hz - 300 GHz. Radio wave is radiated by an antenna and produced by
alternating currents fed to the antenna. RF is used in many standard as well as proprietary wireless
communication systems. RF has long been used for radio and TV broadcasting, wireless local loop,
mobile communications, and amateur radio.

Figure: Radio waves radiated by a Base Station's antenna


Microwave is the upper part of RF spectrum, i.e. those frequencies above 1 GHz. Because of the
availability of larger bandwidth in microwave spectrum, microwave is used in many applications
such as wireless PAN (Bluetooth), wireless LAN (Wi-Fi), broadband wireless access or wireless
MAN (WiMAX), wireless WAN (2G/3G cellular networks), satellite communications and radar.
But it became a household name because of its use in microwave oven.

Figure: Microwave is used in satellite communication.


Infrared light is part of electromagnetic spectrum that is shorter than radio waves but longer than
visible light. Its frequency range is between 300 GHz and 400 THz, that corresponds to wavelength
from 1mm to 750 nm. Infrared has long been used in night vision equipment and TV remote
control. Infrared is also one of the physical media in the original wireless LAN standard, that's
IEEE 802.11. Infrared use in communication and networking was defined by the IrDA (Infrared
Data Association). Using IrDA specifications, infrared can be used in a wide range of applications,
e.g. file transfer, synchronization, dial-up networking, and payment. However, IrDA is limited in

range (up to about 1 meter). It also requires the communicating devices to be in line of sight (LOS) and within its 30-degree beam cone.

Figure: TV remote control uses infrared.


Network Protocols

Network protocols - a common set of rules

Define how to interpret signals, identify individual computers, initiate and end networked
communication, and manage information exchange across network medium

Include TCP/IP, NetBEUI, IPX/SPX, and NWLink

Network Software

Network software issues requests and responses

Network operating system (NOS) controls which computers and users access network
resources
o Include both client and server components
o Popular NOSs include Windows Server 2003, Windows XP, Windows 2000, Windows
NT, and Novell NetWare

Network applications access the network


o Include e-mail programs, Web browsers, and network-oriented utilities
Multitasking
Multitasking is the ability to support numerous processes simultaneously.
o True multitasking requires as many CPUs as simultaneous processes (multiprocessing)
o Time slicing simulates multitasking
There are two types of multitasking (see the sketch below):
o Preemptive multitasking - the OS controls which process gets access to the CPU and for how long
o Cooperative multitasking - relies on the process itself to relinquish control of the CPU
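The difference between the two styles can be sketched in a few lines of Python: threads are scheduled preemptively by the system, while asyncio coroutines keep the CPU until they reach an await point and voluntarily yield. The task names and step counts are arbitrary values chosen for the illustration.

```python
# Preemptive vs cooperative multitasking, sketched with the standard library.
import asyncio
import threading

def preemptive_task(name):
    # The OS/thread scheduler may interrupt this loop at any moment
    # to give the CPU to another thread.
    for i in range(3):
        print(f"[preemptive] {name} step {i}")

async def cooperative_task(name):
    for i in range(3):
        print(f"[cooperative] {name} step {i}")
        await asyncio.sleep(0)   # the task voluntarily gives up the CPU here

# Preemptive multitasking: two threads, switching decided by the system.
threads = [threading.Thread(target=preemptive_task, args=(n,)) for n in ("A", "B")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Cooperative multitasking: two coroutines that take turns at each await.
async def main():
    await asyncio.gather(cooperative_task("A"), cooperative_task("B"))

asyncio.run(main())
```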
INTERNET
The Internet is a network of networks. Millions of computers all over the world are connected
through the Internet.
Computer users on the Internet can contact one another anywhere in the world. If your computer is
connected to the Internet, you can connect to millions of computers. You can gather information
and distribute your data. It is very much similar to the telephone connection where you can talk
with any person anywhere in the world.
In Internet a huge resource of information is accessible to people across the world. Information
in every field starting from education, science, health, medicine, history, and geography to
business, news, etc. can be retrieved through Internet. You can also download programs and
software packages from anywhere in the world. Due to the tremendous information resources
the Internet can provide, it is now indispensable to every organisation.
Origin of Internet
In 1969 the Department of Defense (DOD) of the USA started a network called ARPANET (Advanced Research Projects Agency Network) with one computer in California and three in Utah. Later on, other universities and R&D institutions were allowed to connect to the network. ARPANET quickly grew to encompass the entire American continent and became a huge success. Every university in the country wanted to become a part of ARPANET. So the network was broken into two smaller parts: MILNET for managing military sites and ARPANET (smaller) for managing non-military sites. Around 1980, NSFNET (National Science Foundation Network) was created. With the advancement of modern communication facilities, other computers were also allowed to be linked up with any computer of NSFNET. By 1990 many computers were linking up to NSFNET, giving birth to the Internet.

How Internet functions


Internet is not a governmental organization. The ultimate authority of the Internet is the
Internet Society. This is a voluntary membership organization whose purpose is to promote
global information exchange. Internet has more than one million computers attached to it.
E-mail
E-mail stands for electronic mail. This is one of the most widely used features of the Internet. Ordinary mail is still used today, where with the help of a postage stamp we can send letters anywhere in the world. Electronic mail provides a similar service, but here the data are transmitted through the Internet, and therefore within minutes the message reaches its destination, wherever that may be in the world. The mailing system is therefore extremely fast and is widely used for mail transfer.
Browser
This is software which enables us to connect to any web site we want to access. Only through a browser can you connect to the Internet. Some of the well-known browsers are Internet Explorer, Firefox, Netscape Navigator etc.
World Wide Web
The Internet and the World Wide Web are not the same.
The WWW is the newest Internet service covered in this chapter.
The Web consists of millions of documents written in Hypertext Markup Language (HTML).
HTML is used to create static Web pages.

Can browse using links

Primary protocol is Hypertext Transfer Protocol (HTTP)

Front page of Web site is called home page

Use search engine, such as Yahoo! or Google, to find Web sites with specific
information

Most hardware and software vendors have Web sites


Contain product information, updated documentation, new drivers

Web is rich and useful resource


Remote Conferencing

Allows employees to telecommute

MS Messenger, CUSeeMe are common applications


o Video and voice conferencing
o Application sharing
o Whiteboard discussions
o Instant messaging

Does not always provide quality audio and video


o Some applications combine traditional phone conferencing and software for
application sharing and whiteboard discussions
Locating Internet Resources

Internet address lets users navigate Internet

Address usually represented as resource names

Name has corresponding TCP/IP numeric address


Internet Resource Names

Uniform Resource Locator (URL) is the address associated with a Web-based Internet resource:
o Includes the protocol to use to access it
o The protocol is followed by a colon, such as HTTP:
o Two forward slashes begin the address
o The domain name identifies the organization and references a server

Domain Name System (DNS)

DNS protocol resolves symbolic names to corresponding IP addresses


o Example: www.microsoft.com references IP address 207.46.250.252
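Both ideas, splitting a URL into its parts and resolving the domain name through DNS, can be tried with Python's standard library. The URL below is just an example, and the IP address printed depends on the DNS servers you query, so it may differ from the address quoted above.

```python
# Parse a URL into protocol, domain name and resource, then resolve the
# domain name to its numeric TCP/IP address with a DNS lookup.
import socket
from urllib.parse import urlparse

url = "http://www.microsoft.com/index.html"   # illustrative URL

parts = urlparse(url)
print("protocol :", parts.scheme)   # 'http'
print("domain   :", parts.netloc)   # 'www.microsoft.com'
print("resource :", parts.path)     # '/index.html'

# DNS resolves the symbolic name into an IP address (requires a network connection).
print("IP address:", socket.gethostbyname(parts.netloc))
```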

The last element of a domain name, called the top-level domain, categorizes the type of organization.
Other domain types may indicate the country of origin.

Common Domain Types

.com - Commercial organizations or businesses
.edu - Educational institutions
.gov - Government organizations (except military)
.mil - Military organizations
.net - Network service providers
.org - Other organizations, usually nonprofit

Country-Specific Domains

.au - Australia
.fr - France
.uk - United Kingdom
.in - India
For a complete, geographically organized list of country top-level domain names, visit www.norid.no/domenenavnbaser/domreg.html
Making an Internet Connection

Most users go through an Internet Service Provider (ISP) to connect to the Internet.
ISPs provide dial-up and dedicated links:
o Dial-up lines using modems are most common
o Other relatively inexpensive connections include Integrated Services Digital Network (ISDN), cable modem, and digital subscriber line (DSL)
o Large companies and government bodies may use higher-bandwidth connections such as DS-3 or ATM
Dial-Up Connections

Dial-up protocols include:
o Point-to-Point Protocol (PPP)
o Serial Line Internet Protocol (SLIP)
o CSLIP, a compressed version of SLIP
PPP is the dial-up protocol of choice for ISPs today because it supports these features:
o Compression
o Error-checking
o Dynamic IP addressing
Digital Connection Types

ISDN is a digital line for voice or data with speeds up to 128 Kbps.
o Limitations are cost and availability
Digital technologies offer higher bandwidth at lower costs:
o Cable modems with bandwidth from 150 to 900 Kbps
o DSL with bandwidth of 384 Kbps and higher.
Some useful terms you should know


Multitasking/ multiprogramming: The management of two or more tasks, or programs, running
concurrently on the computer system (one CPU).
Multithreading: A form of multitasking that runs multiple tasks within a single application
simultaneously.
Multiprocessing: simultaneous processing of more than one program by assigning them to
different processors (multiple CPUs).
Virtual Memory: A feature that simulates more main memory than actually exists in the computer
system by extending primary storage into secondary storage.
Graphical user interface (GUI): system software that allows users to have direct control of visible
objects (such as icons) and actions, which replace command syntax.
Character User Interface (CUI): system software that allows users to enter characters and command syntax. It is a very difficult type of OS interface for newcomers to the computer field, because you must memorise the OS commands along with their syntax and usage.
Social interface: A user interface that guides the user through computer applications by using
cartoonlike characters, graphics, animation, and voice commands.
First generation language: The lowest level programming language; composed of binary digits;
only programming language understood by CPU.
Second generation language: more user friendly than first generation language; uses mnemonics
for people to use.
Third generation language: requires the programmer to specify, step-by-step, exactly how the
computer must accomplish a task.
Fourth generation language: allows the user to specify the desired result without having to
specify step-by-step procedures.
Visual programming languages: use a graphical environment with mouse; icons and symbols on
the screen, or pull-down menus to make programming easier.

Object-oriented languages. Programming languages that encapsulate a small amount of data with instructions about what to do with that data.
Methods. The instructions about what to do with encapsulated data objects.
Object. The combination of a small amount of data with the instructions that operate on it.
Encapsulation. The process of creating an object.
Reusability feature. Feature of object-oriented languages that allows classes created for one
purpose to be used in a different object-oriented program if desired.
Hypertext. An approach to data management in which data are stored in a network of nodes
connected by links and are accessed through interactive browsing.
Hyperlinks. The links that connect data nodes in hypertext.
Hypertext document. The combination of nodes, links, and supporting indexes for any particular
topic in hypertext.
Hypertext markup language (HTML). The standard programming language used on the Web to
create and recognize hypertext documents.
Extensible markup language (XML). A programming language designed to improve the
functionality of web documents by providing more flexible and adaptable data identification.
Proprietary application software. Software that addresses a specific or unique business need for a
company ; may be developed in-house or may be commissioned from a software vendor.
Contract software. Specific software programs developed for a particular company by a vendor.
Off-the-shelf application software. Software purchased, leased, or rented from a vendor that develops programs and sells them to many organizations; can be standard or customizable.
Package is a commonly used term for a computer program (or group of programs) that have been
developed by a vendor and is available for purchase in a prepackaged form.
Bandwidth: the range of frequencies available in any communications channel
Narrowband: low-speed transmissions, up to 64 Kbps.
Broadband: high-speed transmissions, ranging from 256 Kbps to several terabits per second.
Integrated Services Digital Network: data transmission technology that allows users to transfer
voice, video, image, and data simultaneously over existing telephone lines.

Digital Subscriber Line: a high-speed, digital data transmission technology using existing analog
telephone lines.
Asynchronous Transfer Mode: data transmission technology that uses packet switching and
allows for almost unlimited bandwidth on demand.
Synchronous Optical Network: an interface standard for transporting digital signals over fiber
optic lines that allows users to integrate transmissions from multiple vendors.
T-Carrier System: digital transmission system that defines circuits operating at different rates, all of which are multiples of the basic 64 Kbps used to transport a single voice call.
Local area networks: connects two or more devices in a limited geographical region
Wide area network: networks that cover large geographical areas
Value-added network: a type of wide area network that is private; a data-only network managed by third parties that provides telecommunication and computing services to multiple organizations.
Enterprise network: the entire network of an organization, usually consisting of multiple local
area networks and multiple wide area networks.
Network Protocol: a set of rules and procedures that govern transmission across a network.
Ethernet: a common LAN protocol.
Transmission Control Protocol/Internet Protocol: a file transfer protocol that can send large files
of information across sometimes unreliable network with assurance that the data will arrive
uncorrupted; the protocol of the Internet.
Types of Network Processing
Client/server: links two or more computers in an arrangement in which some machines
(called servers) provide computing services for user computers (called clients).
Peer-to-Peer processing: a type of client/server distributed processing where each
computer acts as both a client and a server.
Intranet: a network designed to serve the internal informational needs of a single
organization.
Extranet: a network that connects parts of the intranets of different organizations and
allows secure communications among business partners over the Internet using virtual
private networks.
Darknet: a private network that runs on the Internet but is only open to users who belong to the network.

Addresses on the Internet

Domain names consist of multiple parts, separated by dots, which are read from right to left.
o Top-level domain
o Name of the organization
o Name of the specific computer
Top-level domain: the rightmost part of an Internet name; common top-level domains are
.com, .edu, .gov
Name of the company: the next section of the Internet name
Name of the specific computer: the next section of the Internet name
Internet Address example

The World Wide Web is a system of universally accepted standards for storing, retrieving, formatting, and displaying information via a client/server architecture.
o Not the same thing as the Internet
o Home page
o Uniform resource locator
Home page: a text and graphical screen display that usually welcomes the user and explains the
organization that has established the page.
Uniform resource locator: the set of letters that points to the address of a specific resource on the
Web

DATABASE MANAGEMENT SYSTEM


What is DBMS?

A Database Management System (DBMS) is a set of computer programs that

controls the creation, maintenance (Editing, Listing, Printing, Deleting etc.), and
the use of the database of an organization and its end users.
It allows organizations to place control of organization-wide database
development in the hands of database administrators (DBAs) and other
specialists.
DBMSes may use any of a variety of database models, such as
o Sequential file organization model
o Random file organization model
o Index sequential File Organization Model
o Hierarchical file organization model
o the network model or

o relational model.

In large systems, a DBMS allows users and other software to store and retrieve
data in a structured way. It helps to specify the logical organization for a
database and access and use the information within a database.

It provides facilities for controlling data access, enforcing data integrity, managing concurrency control, and restoring the database.
Before dealing with the DBMS we have to look at traditional file organisation; only then can we understand the DBMS and its benefits.

What is file Handling?


Initially, processing jobs were carried out by entering data and processing it without storing the data on secondary storage devices. With the development of secondary storage devices, advances in computer hardware and the increased use of computers for commercial purposes, storage of data for future use became necessary. This calls for the usage of files of various types.
Before dealing with files, let us understand some basic file terminology.
Data Item
Individual elements of data like student number, name, address, father's name etc. are called Data Items or fields. Each Data Item will be identified with a separate name. A Data Item may have sub data items; for example, Date of Birth may be further subdivided into BirthDate, BirthMonth and BirthYear.
Record
A collection of individual fields pertaining to an item or entity is called a record. A record is treated as a single unit.
Record Key
To distinguish one specific record from another, one data item from the record is selected and kept unique for identification purposes. Some examples are Student Number and Employee Number; such a key will be kept unique to maintain records of permanent and semi-permanent nature.

Entity
An entity is any person, place, thing, or event of interest to the organization and about which data are captured, stored, or processed. Patients and tests are entities of interest in hospitals, while banking entities include customers and cheques.
File
A file is the container of data. It is a set of related records. Each record in a file is included because it pertains to the same entity. A file of cheques can consist of cheques only and cannot contain invoices or inventory records.
The number of records in the file determines the file size. If each record consists of 100 characters and there are 100 records, the file size is approximately 100 x 100 = 10,000 characters.

Master file: where fixed and semi-fixed information, which never or only occasionally changes, is stored. Examples are Student Master, Product Master, Employee Master etc.
Transaction file: where frequently changing information is maintained, like marks information of each examination, salary information of each month, product movement information etc.
Table files: contain the reference data used in processing transactions, updating master files, or producing output.
Report files: temporary files which contain processed data for taking a printout; when printer time is not available, they are sent in a queue to the printer.
Work file: a temporary file in a system. It has neither the long-term character of a master file nor the input or output character of a transaction or report file. One common use of a work file is to pass data created by one program to another.
Program file: contains instructions for the processing of data which may be stored in other files or resident in main memory. The instructions may be written in a high-level language, machine language, or a job control language.
Text file: contains alphanumeric and graphic data input using a text editor, or may be stored in such a way that it can be processed by several users.
Backup files may be maintained to ensure that a duplicate copy of the data is available.
Miscellaneous files: any other files maintained other than those specified above.

Different Methods of File Organizations


There are different file organizations and they are also called traditional file
organizations. They are:

Sequential File Organization

Direct or Random File Organization

Index Sequential File Organization

Relative File Organization

Hierarchical File organization and

Network File Organization.


Sequential File Organization
Here,
It is the simplest way of storing and retrieving records in a file.
The records will be entered according to their order of entry.
The records can be accessed one by one, i.e., sequentially from the beginning.
If you want to access the 100th record, you have to read the records from the beginning, i.e., from the first record to the 99th record; only then can you access the 100th record. If it is the 10000th record, you have to read up to the 9999th record before you can access it.
Access time will therefore be greater. If you want to access the 1st record, it won't take much time, but to access the 1000th record will take more time. So accessing a higher-numbered record takes longer.
Moreover, if you are at the 1000th record and want to access the 50th record, you have to close the file, open it again and read from the beginning.
It is not possible to edit or delete any record in the middle. Only insertion, and that too in order of entry, is allowed. To do editing and deletion, you have to open another file and copy all records from the source file to the target file. If there is any modification, you bring the record into memory, make whatever modification you want, and copy it to the target file. If you want to delete a record, you simply do not copy it to the target file.
This type of file is suitable for maintaining transaction data; a small sketch follows.
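A minimal sketch of this behaviour in Python is shown below. The file name and the record layout are assumptions made up for the illustration; the point is simply that reaching the 100th record forces us to read the 99 records before it.

```python
# Sequential file organization: records are appended in order of entry and
# can only be read back from the beginning of the file.
FILENAME = "transactions.txt"   # hypothetical transaction file

# Writing: each record is appended after the previous one.
with open(FILENAME, "w") as f:
    for i in range(1, 1001):
        f.write(f"{i},transaction-{i}\n")

# Reading the 100th record: every record before it has to be read first.
with open(FILENAME) as f:
    for position, line in enumerate(f, start=1):
        if position == 100:
            print("100th record:", line.strip())
            break
```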

Direct or Random File Organization


Here,

Records of fixed and semi-fixed nature, which constitute what is called a master file, are maintained here.

Direct Access files are keyed files.

Here, you can retrieve the records sequentially as well as randomly.

Each and every record will be stored with a record number which is called
as record key and you are expected to keep that record key in memory
otherwise you cannot access the record.

This Direct Access does not require the system to start from the beginning
ie., from first record.

Since the record key here is a numeric record number, we cannot fetch a record using a primary key or any other alphanumeric key. For example, if you want to know the details of an employee or a product, you must know what record number you assigned to that record while doing data processing. In other organizations it is easy to predict the first two or more characters of a record's key and fetch the desired records; this facility is not available in direct access file organization.
This file organization provides fast random access to individual records. But it is not widely used for commercial applications because some versions of BASIC and COBOL provide Indexed Sequential File Organization.

Indexed Sequential File Organization


Here,

Here, the accessing can be both sequential and random using record key.
In this file organization, Index of records includes a record key and
storage address for a record.

Two separate files are maintained. One is indexed file where an index of
records stored in a sequential file will be maintained. By using the index
any record can be fetched easily from the sequential file with lesser
amount of time. Another file is, sequential file which is similar to our
sequential files, where in the records will be stored sequentially.

This organization is mostly used by commercial application systems and


information systems to maintain their master files.

The main benefit of indexed sequential file organization over direct files is the retrieval of a record using a key field. Here, the key field will be a primary key, because in indexed sequential files the key field should be unique. A small sketch of the idea follows.
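The sketch below keeps a sequential data file together with a separate index that maps each record key to the byte offset (storage address) of its record, so a record can be fetched without scanning from the beginning. The file name, keys and record layout are assumptions made up for this illustration.

```python
# Indexed sequential idea: sequential data file + index of key -> byte offset.
DATA_FILE = "students.dat"   # hypothetical master file

records = {
    "S001": "S001,Anand,Chennai",
    "S002": "S002,Banu,Madurai",
    "S003": "S003,Charan,Salem",
}

index = {}                                   # record key -> storage address
with open(DATA_FILE, "wb") as f:
    for key, record in records.items():     # records are stored sequentially
        index[key] = f.tell()                # remember where this record starts
        f.write((record + "\n").encode())

# Random access through the index: jump straight to the wanted record.
with open(DATA_FILE, "rb") as f:
    f.seek(index["S002"])
    print(f.readline().decode().strip())     # -> S002,Banu,Madurai

# Sequential access is still possible by reading the file from the start.
```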

Drawbacks of Traditional Files
Traditional files are very difficult to maintain. Even for small or one-time usage you have to write programs.
Since a lot of programs have to be written, it takes a longer time to finish a software project.
For any operation, you have to write programs. Without using programs, one cannot access any information from the records.
The files and programs are not independent. We have to define the data structure within the program itself. So, if there is any change in the structure, the program should be modified. If several programs are using the modified file, then all those programs have to be modified. This is a tedious process.

Database Management System

A database management system consists of a collection of interrelated data and a set of programs to access that data. It is software that is helpful in maintaining and utilizing a database.
A DBMS consists of:
A collection of interrelated and persistent data. This part of DBMS is referred
to as database (DB).
A set of application programs used to access, update, and manage data. This
part constitutes data management system (MS).
A DBMS is general-purpose software i.e., not application specific. The same
DBMS (e.g., Oracle, Sybase, etc.) can be used in railway reservation system,
library management, university, etc.
A DBMS takes care of storing and accessing data, leaving only application
specific tasks to application programs.
DBMS is a complex system that allows a user to do many things to data as shown in
Fig. below. From this figure, it is evident that DBMS allows user to input data, share
the data, edit the data, manipulate the data, and display the data in the database.
Because a DBMS allows more than one user to share the data; the complexity
extends to its design and implementation.

Features of DBMS

Database systems are designed to manage large bodies of information.


Management of data involves both defining structures for storage of

information and providing mechanisms for the manipulation of information.


The database system ensures the safety of the information stored, despite
system crashes or attempts at unauthorized access.
If data are to be shared among several users, the system must avoid possible
anomalous results.
Because information is so important in most organizations, computer
scientists have developed a large body of concepts and techniques for
managing data.

Structure of DBMS
An overview of the structure of database management system is shown in Fig.
below. A DBMS is a software package, which translates data from its logical
representation to its physical representation and back. The DBMS uses an
application specific database description to define this translation. The database
description is generated by a database designer from his or her conceptual view
of the database, which is called the Conceptual Schema. The translation from the conceptual schema to the database description is performed using a data definition language or a graphical or textual design interface.

Objectives of DBMS
The main objectives of database management system are data availability,
data integrity, data security, and data independence.
Data Availability
Data availability refers to the fact that the data are made available to wide variety
of users in a meaningful format at reasonable cost so that the users can easily
access the data.
Data Integrity
Data integrity refers to the correctness of the data in the database. In other words,
the data available in the database is a reliable data.
Data Security
Data security refers to the fact that only authorized users can access the data.Data
security can be enforced by passwords. If two separate users are accessing a
particular data at the same time, the DBMS must not allow them to make
con icting changes.
Data Independence
DBMS allows the user to store, update, and retrieve data in an efficient manner.
DBMS provides an abstract view of how the data is stored in the database.
In order to store the information efficiently, complex data structures are used to
represent the data. The system hides certain details of how the data are stored and
maintained.
Database System Applications
Databases are widely used. Here are some representative applications:
Banking: For customer information, accounts, and loans, and banking transactions.
Airlines: For reservations and schedule information. Airlines were among the first to use databases in a geographically distributed manner: terminals situated around the world accessed the central database system through phone lines and other data networks.
Universities: For student information, course registrations, and grades.
Credit card transactions: For purchases on credit cards and generation of monthly
statements.
Telecommunication: For keeping records of calls made, generating monthly bills,
maintaining balances on prepaid calling cards, and storing information about the
communication networks.
Finance: For storing information about holdings, sales, and purchases of financial
instruments such as stocks and bonds.
Sales: For customer, product, and purchase information.
Manufacturing: For management of supply chain and for tracking production of
items.
Human resources: For information about employees, salaries, payroll taxes and
benefits, and for generation of paycheques.
Evolution of Database Management Systems

File-based system was the predecessor to the database management system.


The Apollo moon-landing project was started in the year 1960. At that time, there was no system available to handle and manage large amounts of information. As a result, North American Aviation, which is now popularly known as Rockwell International, developed software known as Generalized Update Access Method (GUAM).
In the mid-1960s, IBM joined North American Aviation to develop GUAM into the Information Management System (IMS). IMS was based on the hierarchical data model.
In the mid-1960s, General Electric released IDS (Integrated Data Store), which was based on the network data model. Charles Bachman was mainly responsible for the development of IDS. The network database was developed to fulfill the need to represent more complex data relationships than could be modeled with hierarchical structures.
The Conference on Data Systems Languages formed the Data Base Task Group (DBTG) in 1967. DBTG specified three distinct languages for standardization: a Data Definition Language (DDL), which would enable the Database Administrator to define the schema; a subschema DDL, which would allow the application programs to define the parts of the database they require; and a Data Manipulation Language (DML) to manipulate the data.

The network and hierarchical data models developed during that time had the drawbacks of minimal data independence, minimal theoretical foundation, and complex data access. To overcome these drawbacks, in 1970 Codd of IBM published a paper titled "A Relational Model of Data for Large Shared Data Banks" (Communications of the ACM, Vol. 13, No. 6, pp. 377-387, June 1970).

As an impact of Codd's paper, the System R project was developed during the late 1970s at IBM's San Jose Research Laboratory in California. The project was developed to prove that the relational data model was implementable.
The outcome of the System R project was the development of Structured Query Language (SQL), which is the standard language for relational database management systems.

In the 1980s IBM released two commercial relational database management systems, SQL/DS and DB2, and Oracle Corporation released Oracle.

In 1979, Codd himself attempted to address some of the failings in his original work with an extended version of the relational model called RM/T. Later attempts to provide data models that represent the real world more closely have been loosely classified as Semantic Data Modeling.

In recent years, two approaches to DBMS are more popular, which are
Object-Oriented DBMS (OODBMS) and Object Relational DBMS (OR- DBMS).

The chronological order of the development of DBMS is as follows:
Flat files: 1960s-1980s
Hierarchical: 1970s-1990s
Network: 1970s-1990s
Relational: 1980s-present
Object-oriented: 1990s-present
Object-relational: 1990s-present
Data warehousing: 1980s-present
Web-enabled: 1990s-present


Early 1960s: Charles Bachman designed the first general-purpose DBMS, the Integrated Data Store. It created the basis for the network model, which was standardized by CODASYL (Conference on Data Systems Languages).
Late 1960s: IBM developed the Information Management System (IMS). IMS used an alternative, hierarchical data model.
1970: Edgar Codd, from IBM, created the Relational Data Model. In 1981 Codd received the Turing Award for his contributions to database theory.
1976: Peter Chen presented the Entity-Relationship model, which is widely used in database design.
1980: SQL, developed by IBM, became the standard query language for databases. SQL was standardized by ISO.
1980s and 1990s: Relational DBMS became dominant, and object-oriented and object-relational DBMS emerged.
Classification of Database Management System
The database management system can be broadly classified into
(1) Passive Database Management System and
(2) Active Database Management System
Passive Database Management System.
Passive Database Management Systems are program-driven. In passive database
management system the users query the current state of database and retrieve the
information currently available in the database. Traditional DBMS are passive in
the sense that they are explicitly and synchronously invoked by user or application
program initiated operations. Applications send requests for operations to be
performed by the DBMS and wait for the DBMS to confirm and return any possible
answers. The operations can be definitions and updates of the schema, as well as
queries and updates of the data.
Active Database Management System.
Active Database Management Systems are data-driven or event-driven systems. In
active database management system, the users specify to the DBMS the
information they need. If the information of interest is currently available, the
DBMS actively monitors the arrival of the desired information and provides it to
the relevant users. The scope of a query in a passive DBMS is limited to the past
and present data, whereas the scope of a query in an active DBMS additionally
includes future data. An active DBMS reverses the control flow between
applications and the DBMS instead of only applications calling the DBMS, the
DBMS may also call applications in an active DBMS. Active databases contain a
set of active rules that consider events that represent database state changes, look
the result of a database predicate or query, and
take an action via a data manipulation program embedded in the system. Alert is
extension architecture at the IBM Almaden Research, for experimentation with
active databases.
File-Based System
Prior to DBMS, the file system provided by the OS was used to store information. In a file-based system, we have a collection of application programs that perform services for the end users. Each program defines and manages its own data. Consider the
University database, the University database contains details about student,
faculty, lists of courses offered, and duration of course, etc. In File-based
processing for each database there is separate application program which is shown
in Fig. below.

One group of users may be interested in knowing the courses offered by the university.
One group of users may be interested in knowing the faculty information. The
information is stored in separate files and separate applications programs are written.

Drawbacks of File-Based System
The limitations of file-based approach are duplication of data, data dependence,
incompatible file formats, separation, and isolation of data.
Duplication of Data
Duplication of data means that the same data are stored in more than one file. This can also be termed as data redundancy. Data redundancy is a problem in the file-based approach due
to the decentralized approach. The main drawbacks of duplication of data are:

Duplication of data leads to wastage of storage space. If the storage space is wasted

it will have a direct impact on cost. The cost will increase.


Duplication of data can lead to loss of data integrity; the data are no longer consistent. Assume that the employee detail is stored both in the department and in the main office. Now the employee changes his contact address. The changed address is stored in the department alone and not in the main office. If some important information has to be sent to his contact address from the main office, then that information will be lost. This is due to the lack of a centralized approach.
Data Dependence
Data dependence means the application program depends on the data. If some
modifications have to be made in the data, then the application program has to be
rewritten. If the application program is independent of the storage structure of the data,
then it is termed as data independence. Data independence is generally preferred as it
is more flexible. But in file-based system there is program-data dependence.
Incompatible File Formats
As file-based system lacks program data independence, the structure of the file
depends on the application programming language. For example, the structure of the
file generated by FORTRAN program may be different from the structure of a file
generated by C program. The incompatibility of such files makes them difficult to
process jointly.
Separation and Isolation of Data
In the file-based approach, data are isolated in separate files, so it is difficult to access data. The application programmer must synchronize the processing of two files to ensure that the correct data are extracted. This difficulty is greater if data have to be retrieved from more than two files. The drawbacks of the conventional file-based approach are summarized below:
1. We have to store the information in a secondary memory such as a disk. If the


volume of information is large; it will occupy more memory space.
2. We have to depend on the addressing facilities of the system. If the data- base is
very large, then it is difficult to address the whole set of records.
3. For each query, for example the address of the student and the list of electives that
the student has chosen, we have to write separate programs.
4. While writing several programs, lot of variables will be declared and it will occupy
some space.
5. It is difficult to ensure the integrity and consistency of the data when more than one
program accesses some file and changes the data.
6. In case of a system crash, it becomes hard to bring back the data to a consistent
state.
7.
files.
8. Data distributed in various files may be in different formats hence it is difficult to
share data among different application (Data Isolation).
DBMS Approach
DBMS is software that provides a set of primitives for defining, accessing, and
manipulating data. In DBMS approach, the same data are being shared by
different application programs; as a result data redundancy is minimized. The
DBMS approach of data access is shown in Fig. below
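As a small, concrete sketch of this approach, the snippet below uses SQLite, the lightweight relational DBMS bundled with Python's standard library. The table name, columns and sample rows are assumptions made up for the illustration; the point is that different application programs work against one shared, centrally defined database rather than against their own private files.

```python
# DBMS approach sketched with SQLite: one shared database, many queries.
import sqlite3

conn = sqlite3.connect(":memory:")   # throw-away in-memory database

# Defining data: the schema is declared once, centrally.
conn.execute("CREATE TABLE student (roll_no TEXT PRIMARY KEY, name TEXT, course TEXT)")

# Manipulating data: all inserts and updates go through the DBMS.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("S001", "Anand", "MIS"), ("S002", "Banu", "Tally")],
)
conn.commit()

# Different "application programs" share the same data through queries,
# without knowing how the records are physically stored.
for row in conn.execute("SELECT roll_no, name FROM student WHERE course = ?", ("MIS",)):
    print(row)

conn.close()
```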

Advantages of DBMS
There are many advantages of database management system. Some of the advantages
are listed later:
1. Centralized data management.
2. Data Independence.
3. System Integration.
Centralized Data Management

In DBMS all files are integrated into one system thus reducing redundancies
and making data management more efficient.
Data Independence

Data independence means that programs are isolated from changes in the
way the data are structured and stored.

In a database system, the database management system provides the


interface between the application programs and the data.

Physical data independence means the applications need not worry about how the data are physically stored; applications should work with a logical data model and a declarative query language. If major changes were to be made to the data, the application programs may need to be rewritten. When changes are made to the data representation, the data maintained by the DBMS is changed, but the DBMS continues to provide data to application programs in the previously used way.
Data independence is the immunity of application programs to changes in storage structures and access techniques. For example, if we add a new attribute or change the index structure, then in a traditional file processing system the applications are affected, but in a DBMS environment these changes are reflected in the catalog and, as a result, the applications are not affected. Data independence can be physical data independence or logical data independence. Physical data independence is the ability to modify the physical schema without causing the conceptual schema or application programs to be rewritten. Logical data independence is the ability to modify the conceptual schema without having to change the external schemas or application programs.
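As a small illustration, a hedged SQL sketch of the two kinds of data independence follows; the table STUDENT and its columns are hypothetical names, not taken from the text.

  -- Existing application query; it names only the columns it needs.
  SELECT RollNo, Name FROM STUDENT;

  -- Logical change: a new attribute is added to the conceptual schema.
  -- The query above still runs unchanged, illustrating logical data independence.
  ALTER TABLE STUDENT ADD COLUMN Email VARCHAR(100);

  -- Physical change: an index (a storage/access structure) is added.
  -- Queries are unaffected; only the catalog and access paths change,
  -- illustrating physical data independence.
  CREATE INDEX idx_student_name ON STUDENT (Name);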
Data Inconsistency
Data inconsistency means different copies of the same data will have different values. For example, consider a person working in a branch of an organization. The details of the person will be stored both in the branch office as well as in the main office. If that person changes his address, then the change of address has to be maintained in the main as well as the branch office. If the change of address is maintained in the branch office but not in the main office, then the data about that person are inconsistent.
DBMS is designed to have data consistency. Some of the qualities achieved in DBMS are:
1. Data redundancy: reduced in DBMS.
2. Data independence: activated in DBMS.
3. Data inconsistency: avoided in DBMS.
4. Centralizing the data: achieved in DBMS.
5. Data integrity: necessary for efficient transactions.
6. Support for multiple views: necessary for security reasons.
Data redundancy means duplication of data. Data redundancy will occupy more space; hence it is not desirable.
Data independence means independence between the application program and the data. The advantage is that when the data representation changes, it is not necessary to change the application program.
Data inconsistency means different copies of the same data will have different values.
Centralizing the data means data can be easily shared between the users, but the main concern is data security.
The main threat to data integrity comes from several different users attempting to update the same data at the same time. For example, the number of bookings made could exceed the available capacity if two users update the same data simultaneously.
Support for multiple views means DBMS allows different users to see different views of the database, according to the perspective each one requires. This concept is used to enhance the security of the database.
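A minimal SQL sketch of using views for security is given below; the EMPLOYEE table, its columns, and the user name clerk1 are assumptions made purely for illustration.

  -- Base table holding sensitive data.
  CREATE TABLE EMPLOYEE (
      EmpID   INT PRIMARY KEY,
      Name    VARCHAR(50),
      Dept    VARCHAR(30),
      Salary  DECIMAL(10,2)
  );

  -- A view exposing only the non-sensitive columns.
  CREATE VIEW EMPLOYEE_PUBLIC AS
      SELECT EmpID, Name, Dept FROM EMPLOYEE;

  -- The user is granted access to the view, not the base table, so Salary stays hidden.
  GRANT SELECT ON EMPLOYEE_PUBLIC TO clerk1;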
American National Standards Institute/Standards Planning and Requirements Committee (ANSI/SPARC) Architecture
The distinction between the logical and physical representation of data was recognized when the ANSI/SPARC committee proposed a generalized framework for database systems. This framework provided a three-level architecture: three levels of abstraction at which the database could be viewed.
Need for Abstraction
The main objective of DBMS is to store and retrieve information efficiently; all the users should be able to access the same data. The designers use complex data structures to represent the data, so that data can be efficiently stored and retrieved, but it is not necessary for the users to know the physical database storage details. The developers hide the complexity from users through several levels of abstraction.
Data Independence
Data independence means the internal structure of the database should be unaffected by changes to physical aspects of storage. Because of data independence, the database administrator can change the database storage structures without affecting the users' view.
The different levels of data abstraction are:
1. Physical level or internal level
2. Logical level or conceptual level
3. View level or external level
Physical Level
It is concerned with the physical storage of the information. It provides the internal view of the actual physical storage of data. The physical level describes complex low-level data structures in detail.
Logical Level
Logical level describes what data are stored in the database and what relationships
exist among those data.
Logical level describes the entire database in terms of a small number of simple
structures. The implementation of simple structure of the logical level may involve
complex physical level structures; the user of the logical level does not need to be
aware of this complexity. Database administrator uses the logical level of abstraction.
View Level
View level is the highest level of abstraction. It is the view that the individual user of
the database has. There can be many view level abstractions of the same data. The
different levels of data abstraction are shown in Fig. below.
Database Instances
Databases change over time as information is inserted and deleted. The collection of information stored in the database at a particular moment is called an instance of the database.
Database Schema
The overall design of the database is called the database schema. A schema is a
collection of named objects. Schemas provide a logical classification of objects in the
database. A schema can contain tables, views, triggers, functions, packages, and other objects.
A schema is also an object in the database. It is explicitly created using the CREATE SCHEMA statement, with the current user recorded as the schema owner. It can also be implicitly created when another object is created, provided the user has IMPLICIT SCHEMA authority.
A Mapping is a translation from one schema to another.
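A small SQL sketch of explicit schema creation is given below; the schema name COLLEGE and the table inside it are assumptions for illustration, and the exact authorization rules (such as IMPLICIT SCHEMA authority) vary between DBMS products.

  -- Explicitly create a schema; the current user is recorded as its owner.
  CREATE SCHEMA COLLEGE;

  -- Objects can then be created inside (qualified by) the schema.
  CREATE TABLE COLLEGE.COURSE (
      CourseCode VARCHAR(10) PRIMARY KEY,
      Title      VARCHAR(60)
  );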
Data Models
A data model is a collection of conceptual tools for describing data, relationships between data, and consistency constraints. Data models help in describing the structure of data at the logical level; a data model describes the structure of the database. A data model is the set of conceptual constructs available for defining a schema. The data model is a language for describing the data and database; it may consist of abstract concepts, which must be translated by the designer into the constructs of the data definition interface, or it may consist of constructs which are directly supported by the data definition interface. The constructs of the data model may be defined at many levels of abstraction.
Early Data Models
Three historically important data models are the hierarchical, network, and relational models. These models are relevant for their contributions in establishing the theory of data modeling and because they were all used as the basis of working and widely used database systems. Together they are often referred to as the basic data models. The hierarchical and network models organized the primitive data structures in which the data were stored in the computer by adding connections or links between the structures. As such they were useful in presenting the user with a well-defined structure, but they were still highly coupled to the underlying physical representation of the data. Although they did much to assist in the efficient access of data, the principle of data independence was poorly supported.
Components and Interfaces of Database Management System
A database management system involves five major components: data, hardware, software, procedure, and users. These components and the interfaces between the components are shown in Fig. below.
Hardware
The hardware can range from a single personal computer, to a single mainframe, to a
network of computers. The particular hardware depends on the requirements of the
organization and the DBMS used. Some DBMSs run only on particular operating
systems, while others run on a wide variety of operating systems. A DBMS requires a
minimum amount of main memory and disk space to run, but this minimum
configuration may not necessarily give acceptable performance.

Software
The software includes the DBMS software and application programs, together with the operating system, including the network software if the DBMS is being used over a network. The application programs are written in third-generation programming languages such as COBOL, FORTRAN, Ada, Pascal, etc., or using a fourth-generation language such as SQL, embedded in a third-generation language. The target DBMS may have its own fourth-generation tools which allow rapid development of applications through the provision of nonprocedural query languages, report generators, graphics generators, and application generators. The use of fourth-generation tools can improve productivity significantly and produce programs that are easier to maintain.
Data
A database is a repository for data which, in general, is both integrated and shared.
Integration means that the database may be thought of as a unification of several
otherwise distinct files, with any redundancy among those files partially or wholly
eliminated. The sharing of a database refers to the sharing of data by different users, in
the sense that each of those users may have access to the same piece of data and may
use it for different purposes. Any given user will normally be concerned with only a
subset of the whole database. The main features of the data in the database are listed below:
1. The data in the database is well organized (structured)
2. The data in the database is related
3. The data are accessible in different orders without great difficulty
The data in the database is persistent, integrated, structured, and shared.
Integrated Data
Data can be considered to be a unification of several distinct data files, and when any redundancy among those files is eliminated, the data are said to be integrated data.
Shared Data
A database contains data that can be shared by different users for different applications simultaneously. It is important to note that in this way of sharing data, the redundancy of data is reduced; since repetitions are avoided, the possibility of inconsistencies is reduced.
Persistent Data
Persistent data are data that cannot be removed from the database as a side effect of some other process. Persistent data have a life span that is not limited to a single execution of the programs that use them.
Procedure
Procedures are the rules that govern the design and the use of the database. The procedures may contain information on how to log on to the DBMS, start and stop the DBMS, how to identify a failed component, how to recover the database, change the structure of a table, and improve performance.
People Interacting with Database
Here people refers to the people who manage the database (database administrators), the people who design the application programs (database designers), and the people who interact with the database (database users). The DBMS acts as a back-end server in a local or global network, offering services to clients directly or to Application Servers.
Database Administrator
Database Administrator is a person having central control over data and programs
accessing that data. The database administrator is a manager whose responsibilities
are focused on management of technical aspects of the database system.
The objectives of database administrator are given as follows:
1. To control the database environment
2. To standardize the use of database and associated software
3. To support the development and maintenance of database application
projects
4. To ensure all documentation related to standards and implementation is up to date
The summarized objectives of database administrator are shown in Fig. below.
The control of the database environment should exist from the planning right
through to the maintenance stage. During application development the database
administrator should carry out the tasks that ensure proper control of the
database when an application becomes operational. This includes review of each
design stage to see if it is feasible from the database point of view. The database
administrator should be responsible for developing standards to apply to
development projects. In particular these standards apply to system analysis,
design, and application programming for projects which are going to use the
database. These standards will then be used as a basis for training systems
analysts and programmers to use the database management system efficiently.
Responsibilities of Database Administrator (DBA)
The responsibility of the database administrator is to maintain the integrity,
security, and availability of data. A database must be protected from accidents,
such as input or programming errors, from malicious use of the database and
from hardware or software failures that corrupt data. Protection from accidents
that cause data inaccuracy is a part of maintaining data integrity. Protecting the
database from unauthorized or malicious use is termed as database security. The
responsibilities of the database administrator are summarized as follows:
1. Authorizing access to the database (a minimal SQL sketch of this is given after the list).
2. Coordinating and monitoring its use.
3. Acquiring hardware and software resources as needed.
4. Backup and recovery. The DBA has to ensure regular backup of the database; in case of damage, suitable recovery procedures are used to bring the database up with as little downtime as possible.
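For example, access authorization (item 1 above) is usually carried out with GRANT and REVOKE statements; the table and user names in this sketch are hypothetical.

  -- Allow a user to read and insert rows in a table.
  GRANT SELECT, INSERT ON STUDENT TO user_arts;

  -- Withdraw the insert privilege later if the user's role changes.
  REVOKE INSERT ON STUDENT FROM user_arts;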
Database Designer
Database designer can be either logical database designer or physical database
designer. The logical database designer is concerned with identifying the data, the relationships between the data, and the constraints on the data that are to be stored in the database. The logical database designer must have a thorough understanding of the organization's data and its business rules. The physical database designer takes the logical data model and decides the way in which it can be physically implemented; this designer is responsible for mapping the logical data model into a set of tables and integrity constraints, selecting specific storage structures, and designing security measures required on the data. In a nutshell, the database designer is responsible for:
1. Identifying the data to be stored in the database.
2. Choosing appropriate structure to represent and store the data.
Database Manager
Database manager is a program module which provides the interface between
the low level data stored in the database and the application programs and
queries submitted to the system:
The database manager would translate DML statements into low-level file system commands for storing, retrieving, and updating data in the database.
Integrity enforcement. The database manager enforces integrity by checking consistency constraints, for example that the bank balance of a customer must be maintained above the prescribed minimum.
Security enforcement. Unauthorized users are prohibited from viewing the information stored in the database.
Backup and recovery. Backup and recovery of the database are necessary to ensure that the database remains consistent despite failures.
Database Users
Database users are the people who need information from the database to
carry out their business responsibility. The database users can be broadly
classified into two categories like application programmers and end users.
Sophisticated End Users
Sophisticated end users interact with the system without writing programs. They form requests by writing queries in a database query language. These are submitted to the query processor. Analysts who submit queries to explore data in the database fall in this category.
Specialized End Users
Specialized end users write specialized database applications that do not fit into the data-processing framework. Such applications involve knowledge bases and expert systems, environment modeling systems, etc.
Naive End Users
Naive end users interact with the system by using permanent application programs. Example: a query made by a student, namely the number of books borrowed, in a library database.
System analysts determine the requirements of end users and develop specifications for canned transactions that meet these requirements.
Canned Transaction
Ready-made programs through which naive end users interact with the database are called canned transactions.
Data Dictionary
A data dictionary, also known as a system catalog, is a centralized store of
information about the database. It contains information about the tables, the fields the
tables contain, data types, primary keys, indexes, the joins which have been
established between those tables, referential integrity, cascades update, cascade delete,
etc. This information stored in the data dictionary is called the Metadata. Thus a data
dictionary can be considered as a file that stores Metadata. Data dictionary is a tool for
recording and processing information about the data that an organization uses. The
data dictionary is a central catalog for Metadata. The data dictionary can be integrated
within the DBMS or separate. Data dictionary may be referenced during system design,
programming, and by actively-executing programs. One of the major functions of a true
data dictionary is to enforce the constraints placed upon the database by the designer,
such as referential integrity and cascade delete.
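Many DBMS products expose the data dictionary through catalog views. Assuming a system that supports the SQL-standard INFORMATION_SCHEMA, a sketch of reading Metadata for a hypothetical STUDENT table is:

  -- List the columns and data types of the STUDENT table
  -- directly from the catalog, i.e., from the Metadata.
  SELECT COLUMN_NAME, DATA_TYPE
  FROM   INFORMATION_SCHEMA.COLUMNS
  WHERE  TABLE_NAME = 'STUDENT';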

Metadata
The information (data) about the data in a database is called Metadata. The Metadata
are available for query and manipulation, just as other data in the database.
The functional components of database system structure are:
1. Storage manager.
2. Query processor.

Storage Manager
Storage manager is responsible for storing, retrieving, and updating data in the
database. Storage manager components are:
1. Authorization and integrity manager.
2. Transaction manager.
3. File manager.
4. Buffer manager.
Transaction Management
A transaction is a collection of operations that performs a single logical function in a database application. The transaction-management component ensures that the database remains in a consistent state despite system failures and transaction failures. The concurrency-control manager controls the interaction among concurrent transactions, to ensure the consistency of the database.
Authorization and Integrity Manager
It checks the integrity constraints and the authority of users to access data.
Transaction Manager
It ensures that the database remains in a consistent state despite system failures. The
transaction manager manages the execution of database manipulation requests. The
transaction manager function is to ensure that concurrent access to data does not
result in conflict.
File Manager
File manager manages the allocation of space on disk storage. Files are used to store collections of similar data. A file management system manages independent files, helping to enter and retrieve information records. The file manager establishes and maintains the list of structures and indexes defined in the internal schema. The file manager can:
Create a file
Delete a file
Update the record in a file
Retrieve a record from a file
Buffer
The area into which a block from a file is read is termed a buffer. The management of buffers has the objective of maximizing the performance or the utilization of the secondary storage systems, while at the same time keeping the demand on CPU resources tolerably low. The use of two or more buffers for a file allows the transfer of data to be overlapped with the processing of data.
Buffer Manager
Buffer manager is responsible for fetching data from disk storage into main memory.
Programs call on the buffer manager when they need a block from disk. The requesting
program is given the address of the block in main memory, if it is already present in the
buffer. If the block is not in the buffer, the buffer manager allocates space in the buffer
for the block, replacing some other block, if required, to make space for new block.
Once space is allocated in the buffer, the buffer manager reads in the block from the
disk to the buffer, and passes the address of the block in main memory to the
requester.
Indices
Indices provide fast access to data items that hold particular values. An index is a list of numerical values which gives the order of the records when they are sorted on a particular field or column of the table.
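As a sketch, an index on a frequently searched column might be created as shown below; the table and column names are assumed for illustration.

  -- Create an index so lookups on Roll number avoid scanning the whole table.
  CREATE INDEX idx_student_rollno ON STUDENT (RollNo);

  -- A query such as this can now use the index for fast access.
  SELECT Name FROM STUDENT WHERE RollNo = 1024;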
Database Architecture
Database architecture essentially describes the location of all the pieces of information
that make up the database application. The database architecture can be broadly
classified into two-, three-, and multitier architectures.
Two-Tier Architecture
The two-tier architecture is a client-server architecture in which the client handles the presentation of data and issues SQL requests, and the database server processes the SQL statements and sends query results back to the client. The two-tier architecture is shown in Fig. 1.9. Two-tier client/server provides a basic separation of tasks. The client, or first tier, is primarily responsible for the presentation of data to the user, and the server, or second tier, is primarily responsible for supplying data services to the client.
Presentation Services
Presentation services refers to the portion of the application which presents data to
the user. In addition, it also provides for the mechanisms in which the user will interact
with the data. More simply put, presentation logic defines and interacts with the user
interface. The presentation of the data should generally not contain any validation
rules.
Business Services/objects
Business services are a category of application services. Business services are derived from the steps necessary to carry out day-to-day business in an organization. These rules can be validation rules, used to be sure that the incoming information is of a valid type and format, or they can be process rules, which ensure that the proper business process is followed in order to complete an operation.
Application services provide other functions necessary for the application.
Data Services
Data services provide access to data independent of their location. The data can come
from legacy mainframe, SQL RDBMS, or proprietary data access systems. Once again,
the data services provide a standard interface for accessing data.
Advantages of Two-tier Architecture
The two-tier architecture is a good approach for systems with stable requirements and a moderate number of clients.
The two-tier architecture is the simplest to implement, due to the number of good commercial development environments.
Drawbacks of Two-tier Architecture
Software maintenance can be difficult because the PC clients contain a mixture of presentation, validation, and business logic code.
To make a significant change in the business logic, code must be modified on many PC clients.
Moreover, the performance of two-tier architecture can be poor when a large number of clients submit requests, because the database server may be overwhelmed with managing messages.
With a large number of simultaneous clients, three-tier architecture may be necessary.
Three-tier Architecture
A multitier architecture, often referred to as three-tier or N-tier, provides greater application scalability, lower maintenance, and increased reuse of components. Three-tier architecture offers a technology-neutral method of building client/server applications with vendors who employ standard interfaces which provide services for each logical tier. The three-tier architecture is shown in Fig. below. From this figure, it is clear that in order to improve the performance a second tier is included between the client and the server.

Through standard tiered interfaces, services are made available to the application. A
single application can employ many different services which may reside on dissimilar
platforms or are developed and maintained with different tools. This approach allows a
developer to leverage investments in existing systems while creating new application
which can utilize existing resources. Although the three-tier architecture addresses
performance degradations of the two-tier architecture, it does not address division-of-processing concerns. The PC clients and the database server still contain the same
division of code although the tasks of the database server are reduced. Multiple-tier
architectures provide more flexibility on division of processing.
Multitier Architecture
A multi-tier, three-tier, or N-tier implementation employs a three-tier logical architecture superimposed on a distributed physical model. Application Servers can access other Application Servers in order to supply services to the client application as well as to other Application Servers. The multiple-tier architecture is the most general client-server architecture. It can be the most difficult to implement because of its complexity; however, a successful implementation of multiple-tier architecture can provide the most benefits in terms of scalability, interoperability, and flexibility.

For example, in the diagram shown in Fig. above, the client application looks to
Application Server #1 to supply data from a mainframe-based application. Application
Server #1 has no direct access to the mainframe application, but it does know, through
the development of application services, that Application Server #2 provides a service to
access the data from the mainframe application which satisfies the client request.
Application Server #1 then invokes the appropriate service on Application Server #2
and receives the requested data which is then passed on to the client.
Application Servers can take many forms. An Application Server may be anything from custom application services, Transaction Processing Monitors, Database Middleware, and so on.
Situations where DBMS is not Necessary
It is also necessary to specify situations where it is not necessary to use a DBMS. If the traditional file processing system is working well, and if it takes more money and time to design a database, it is better not to go for the DBMS. Moreover, if only one person maintains the data and that person is not skilled in designing a database as well as not comfortable in using the DBMS, then it is not advisable to go for DBMS.
DBMS is undesirable under the following situations:
The application is simple, well-defined, and not expected to change.
Runtime overheads are not feasible because of real-time requirements.
Multiple accesses to data are not required.
Compared with file systems, databases have some disadvantages:
1. High cost of DBMS, which includes:
Higher hardware costs
Higher programming costs
High conversion costs
2. Slower processing of some applications
3. Increased vulnerability
4. More difficult recovery

DBMS
Some of the popular DBMS vendors and their corresponding products are given in the Table below.

DBMS vendors and their products
Vendor          Product
IBM             DB2/UDB, DB2/400, Informix Dynamic Server (IDS)
Microsoft       Access, SQL Server, Desktop Edition (MSDE)
Open Source     MySQL, PostgreSQL
Oracle          Oracle DBMS, RDB
Sybase          Adaptive Server Enterprise (ASE), Adaptive Server Anywhere (ASA), Watcom

Summary
The main objective of a database management system is to store and manipulate the data in an efficient manner. A database is an organized collection of related data. All data will not give useful information; only processed data gives useful information, which helps an organization to take important decisions. Before DBMS, computer file processing systems were used to store, manipulate, and retrieve large files of data. Computer file processing systems have limitations such as data duplication, limited data sharing, and no program-data independence. In order to overcome these limitations the database approach was developed. The main advantages of the DBMS approach are program-data independence, improved data sharing, and minimal data redundancy. In this chapter we have seen the evolution of DBMS and a broad introduction to DBMS.
The responsibilities of the database administrator, the ANSI/SPARC architecture, and the two-tier and three-tier architectures were also analyzed in this chapter.

Review Questions
1. What are the drawbacks of file processing system?
The drawbacks of file processing system are:
Duplication of data, which leads to wastage of storage space and data
inconsistency.
Separation and isolation of data, because of which data cannot be used together.
No program data independence.
2. What is meant by Metadata?
Metadata are data about data but not the actual data.
3. Define the term data dictionary?
Data dictionary is a file that contains Metadata.
4. What are the responsibilities of database administrator?
The responsibilities of the database administrator are summarized as follows:
Authorizing access to the database.
Coordinating and monitoring its use.
Acquiring hardware and software resources as needed.
Backup and recovery. The DBA has to ensure regular backup of the database; in case of damage, suitable recovery procedures are used to bring the database up with as little downtime as possible.
5. Mention three situations where it is not desirable to use DBMS?
The situations where it is not desirable to use DBMS are:
The database and applications are not expected to change.
Runtime overheads are not feasible because of real-time requirements.
Data are not accessed by multiple users.
6. What is meant by data independence?
Data independence renders application programs (e.g., SQL scripts) immune to changes in the logical and physical organization of data in the system.
Logical organization refers to changes in the schema; for example, adding a column or tuples does not stop queries from working.
Physical organization refers to changes in indices, file organizations, etc.

7. What is meant by Physical and Logical data independence?
In logical data independence, the conceptual schema can be changed without
changing the external schema. In physical data independence, the internal
schema can be changed without changing the conceptual schema.
8. What are some disadvantages of using a DBMS over a flat file system?
DBMS initially costs more than a flat file system
DBMS requires skilled staff
9. What are the steps to design a good database?
First find out the requirements of the user
Design a view for each important application
Integrate the views giving the conceptual schema, which is the union of all views
Design external views
Choose physical structures (indexes, etc.)
10. What is Database? Give an example.

A Database is a collection of related data. Here, the term data means known facts that can be recorded. Examples of databases are library information systems; bus, railway, and airline reservation systems; etc.
11. Define DBMS.
DBMS is a collection of programs that enables users to create and maintain a
database.
12. Mention various types of databases?
The different types of databases are:
Multimedia database
Spatial database (Geographical Information System Database)
Real-time or Active Database
Data Warehouse or On-line Analytical Processing Database
13. Mention the advantages of using DBMS?


The advantages of using DBMS are:
Controlling Redundancy
Enforcing Integrity Constraints so as to maintain the consistency of the database
Providing Backup and recovery facilities
Restricting unauthorized access
Providing multiple user interfaces
Providing persistent storage of program objects and data structures
14. What is Snapshot or Database State?
The data in the database at a particular moment is known as Database State or
Snapshot of the Database.
15. Define Data Model.
It is a collection of concepts that can be used to describe the structure of a database.
The data model provides necessary means to achieve the abstraction i.e., hiding the
details of data storage.
16. Mention the various categories of Data Model.
The various categories of datamodel are:
High Level or Conceptual Data Model (Example: ER model)
Low Level or Physical Data Model
Representational or Implementational Data Model
Relational Data Model
Network and Hierarchical Data Model
Record-based Data Model
Object-based Data Model
17. Define the concept of database schema. Describe the types of schemas that exist in a database complying with the three levels of the ANSI/SPARC architecture.
Database schema is nothing but a description of the database. The types of schemas that exist in a database complying with the three levels of the ANSI/SPARC architecture are:
External schema
Conceptual schema
Internal schema

Entity Relationship Model
Introduction
Peter Chen first proposed modeling databases using a graphical technique that
humans can relate to easily. Humans can easily perceive entities and their
characteristics in the real world and represent any relationship with one another. The
objective of modeling graphically is even more profound than simply representing these
entities and relationship. The database designer can use tools to model these entities
and their relationships and then generate database vendor-specific schema
automatically. The Entity-Relationship (ER) model gives the conceptual model of the world to be represented in the database. The ER Model is based on a perception of a real world that consists of a collection of basic objects called entities and relationships among these objects. The main motivation for defining the ER model is to provide a high-level model for conceptual database design, which acts as an intermediate stage prior to mapping the enterprise being modeled onto a conceptual level. The ER model achieves a high degree of data independence, which means that the database designer does not have to
worry about the physical structure of the database. A database schema in the ER model can be pictorially represented by an Entity-Relationship diagram.

The Building Blocks of an Entity-Relationship Diagram
An ER diagram is a graphical modeling tool to standardize ER modeling. The modeling can be carried out with the help of pictorial representation of entities, attributes, and relationships. The basic building blocks of an Entity-Relationship diagram are Entity, Attribute, and Relationship.
Entity
An entity is an object that exists and is distinguishable from other objects. In other
words, the entity can be uniquely identified. The examples of entities are:
A particular person can be an entity.
A particular department, for example the Electronics and Communication Engineering Department.
A particular place, for example Coimbatore city, can be an entity.
Entity Type
An entity type or entity set is a collection of similar entities. Some examples of entity
types are:

All students in PSG, say STUDENT.
All courses in PSG, say COURSE.
All departments in PSG, say DEPARTMENT.
An entity may belong to more than one entity type. For example, a lecturer working in a particular department can pursue higher education part-time. Hence the same person is a LECTURER at one instance and a STUDENT at another instance.

Relationship
A relationship is an association of entities where the association includes one entity
from each participating entity type whereas relationship type is a meaningful
association between entity types. The examples of relationship types are:
Teaches is the relationship type between LECTURER and STUDENT.

OMER.
Treatment is the relationship between DOCTOR and PATIENT.
Attributes

Attributes are properties of entity types. In other words, entities are described in
a database by a set of attributes.
The following are example of attributes:
Brand, cost, and weight are the attributes of CELLPHONE.
Roll number, name, and grade are the attributes of STUDENT.
Data bus width, address bus width, and clock speed are the attributes of MICROPROCESSOR.
ER Diagram

The ER diagram is used to represent database schema. In ER diagram:


A rectangle represents an entity set.
An ellipse represents an attribute.
A diamond represents a relationship.
Lines represent linking of attributes to entity sets and of entity sets to
relationship sets.

Example of ER diagram
Let us consider a simple ER diagram as shown in Fig. 2.1. The two entities are STUDENT and CLASS. The simple attributes which are associated with STUDENT are Roll number and name. The attributes associated with the entity CLASS are Subject Name and Hall Number. The relationship between the two entities STUDENT and CLASS is Attends.
Classification of Entity Sets


Entity sets can be broadly classified into:
Strong entity.
Weak entity.
Associative entity.

Strong Entity
Strong entity is one whose existence does not depend on any other entity.
Example
Consider the example, student takes course. Here student is a strong entity.
In this example, course is considered as weak entity because, if there are no students to take a particular
course, then that course cannot be offered. The COURSE entity depends on the STUDENT entity.
Weak Entity
Weak entity is one whose existence depends on other entity. In many cases, weak entity does not
have primary key.
Example
Consider the example, customer borrows loan. Here loan is a weak entity. For every loan, there should be at least one customer. Here the entity loan depends on the entity customer; hence loan is a weak entity.
Attribute Classification
Attribute is used to describe the properties of the entity. This attribute can be broadly
classified based on value and structure. Based on value the attribute can be classified
into single value, multi-value, derived, and null value attribute. Based on structure, the

attribute can be classified as simple and composite attribute.

Symbols Used in ER Diagram

The elements in ER diagram are Entity, Attribute, and Relationship. The different types
of entities like strong, weak, and associative entity, different types of attributes like
multi-valued and derived attributes and identifying relationship and their
corresponding symbols are shown later.

Single Value Attribute


Single value attribute means, there is only one value associated with that attribute.
Example
The examples of single value attribute are age of a person, Roll number of the student,
Registration number of a car, etc.
Representation of Single Value Attribute in ER Diagram

Multi-valued Attribute
In the case of multi-value attribute, more than one value will be associated with that
attribute.
Representation of Multivalued Attribute in ER Diagram
Examples of Multi-valued Attribute
1. An employee can have more than one skill; hence the skills associated with an employee are a multi-valued attribute.
2. The number of chefs in a hotel is an example of a multi-valued attribute. Moreover, a hotel will have a variety of food items; hence the food items associated with the entity HOTEL are an example of a multi-valued attribute.
Moreover, a staff member can be an expert in more than one area; hence area of specialization is considered as a multi-valued attribute.
Derived Attribute
The value of the derived attribute can be derived from the values of other related attributes or entities. In an ER diagram, the derived attribute is represented by a dotted ellipse.
Representation of Derived Attribute in ER Diagram
Example of Derived Attribute
1. Age of a person can be derived from the date of birth of the person. In this
example, age is the derived attribute.

2. Experience of an employee in an organization can be derived from the date of joining of the employee.
3. CGPA of a student can be derived from GPA (Grade Point Average).

Null Value Attribute


In some cases, a particular entity may not have any applicable value for an attribute.
For such situation, a special value called null value is created.
Null value situations

Not applicable
Example
In application forms, there is one column called phone number. If a person does not have a phone, then a null value is entered in that column.
Composite Attribute
Composite attribute is one which can be further subdivided into simple attributes.
Example
Consider the attribute address which can be further subdivided into Street name,
City, and State.
As another example of a composite attribute, consider the degrees earned by a particular scholar, which can range from undergraduate and postgraduate to doctorate degrees. Hence degree can be considered as a composite attribute.

Relationship Degree
Relationship degree refers to the number of associated entities. The relationship
degree can be broadly classified into unary, binary, and ternary
relationship.
Unary Relationship
The unary relationship is otherwise known as a recursive relationship. In the unary relationship the number of associated entities is one. An entity related to itself is known as a recursive relationship. Example: the relationship Captain of on the entity set PLAYERS is a recursive (unary) relationship.
Roles and Recursive Relation


When an entity set appears in more than one relationship, it is useful to add labels to connecting lines. These labels are called roles.
Example
In the relationship Married to on the entity PERSON, Husband and Wife are referred to as roles.
Binary Relationship
In a binary relationship, two entities are involved. Consider the example: each staff member will be assigned to a particular department. Here the two entities are STAFF and DEPARTMENT.
Ternary Relationship
In a ternary relationship, three entities are simultaneously involved. Ternary
relationships are required when binary relationships are not sufficient to accurately
describe the semantics of an association among three entities.
Example
Consider the example of an employee assigned to a project. Here we are considering the three entities EMPLOYEE, PROJECT, and LOCATION. The relationship is assigned-to. Many employees will be assigned to one project; hence it is an example of a one-to-many relationship.

Quaternary Relationships
Quaternary relationships involve four entities. In the example of a quaternary relationship considered here, the four entities are PROFESSOR, SLIDES, COURSE, and STUDENT. The relationship between the entities is Teaches.
Relationship Classification
Relationship is an association among one or more entities. This relationship can be
broadly classified into one-to-one relation, one-to-many relation, many-to-many
relation and recursive relation.

One-to-Many Relationship Type
The relationship that associates one entity to more than one entity is called a one-to-many relationship. An example of a one-to-many relationship is a Country having states. For one country there can be more than one state; hence it is an example of a one-to-many relationship. Another example of a one-to-many relationship is the parent-child relationship. For one parent there can be more than one child; hence it is an example of a one-to-many relationship.
One-to-One Relationship Type
One-to-one relationship is a special case of one-to-many relationship. True one-to-one
relationship is rare. The relationship between the President and the country is an
example of one-to-one relationship. For a particular country there will be only one
President. In general, a country will not have more than one President hence the
relationship between the country and the President is an example of one-to-one
relationship. Another example of one-to-one relationship is House to Location. A house
is obviously in only one location.
Many-to-Many Relationship Type


The relationship between the EMPLOYEE entity and the PROJECT entity is an example of a many-to-many relationship. An employee may work on many projects and a project may have many employees; hence the relationship between employee and project is a many-to-many relationship.
Many-to-One Relationship Type
The relationship between EMPLOYEE and DEPARTMENT is an example of many-to-one
relationship. There may be many EMPLOYEES working in one DEPARTMENT. Hence
relationship between EMPLOYEE and DEPARTMENT is a many-to-one relationship. The different types of relationships are summarized in the table above.

To implement the database, it is necessary to use the relational model. There is a simple way of mapping from the ER model to the relational model; there is almost a one-to-one correspondence between ER constructs and the relational ones.
Mapping Algorithm
The mapping algorithm gives the procedure to map ER diagram to tables. The rules in
mapping algorithm are given as:

For each strong entity type, say E, create a new table. The columns of the table are the attributes of the entity type E.
For each weak entity W that is associated with only one 1:1 identifying owner relationship, identify the table T of the owner entity type. Include as columns of T, all the simple attributes and simple components of the composite attributes of W.
For each weak entity W that is associated with a 1:N or M:N identifying relationship, or participates in more than one relationship, create a new table T and include as its columns all the simple attributes and simple components of the composite attributes of W. Also form its primary key by including as a foreign key in T, the primary key of its owner entity.

Reducing ER Diagram to Tables
For each binary 1:1 relationship type R, identify the tables S and T of the participating entity types. Choose S, preferably the one with total participation. Include as foreign key in S, the primary key of T. Include as columns of S, all the simple attributes and simple components of the composite attributes of R.
For each binary 1:N relationship type R, identify the table S, which is at the N side, and T of the participating entities. Include as a foreign key in S, the primary key of T. Also include as columns of S, all the simple attributes and simple components of composite attributes of R.
For each M:N relationship type R, create a new table T and include as columns of T, all the simple attributes and simple components of composite attributes of R. Include as foreign keys, the primary keys of the participating entity types. Specify as the primary key of T, the list of foreign keys.
For each multi-valued attribute A, create a new table T and include as columns of T, the simple attribute or simple components of the attribute A. Include as foreign key, the primary key of the entity or relationship type that has A. Specify as the primary key of T, the foreign key and the columns corresponding to A.
Regular Entity
Regular entities are entities that have an independent existence and generally represent real-world objects. Regular entities are represented by rectangles with a single line.
Mapping Regular Entities

Each regular entity type in an ER diagram is transformed into a relation. The

name given to the relation is generally the same as the entity type.
Each simple attribute of the entity type becomes an attribute of the relation.
The identifier of the entity type becomes the primary key of the corresponding
relation.

Example 1
Mapping regular entity type tennis player
This diagram is converted into the corresponding table as shown below.
Here,
Entity name = Name of the relation or table. In our example, the entity name is PLAYER, which is the name of the table.
Attributes of the ER diagram = Column names of the table. In our example, Name, Nation, Position, and Number of Grand Slams won form the columns of the table.
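Expressed in SQL, the mapping of this regular entity could look like the sketch below; the column types are assumptions, and Name is taken as the primary key purely for illustration.

  -- Relation created from the regular entity type PLAYER.
  CREATE TABLE PLAYER (
      Name          VARCHAR(50) PRIMARY KEY,  -- identifier of the entity type
      Nation        VARCHAR(40),
      Position      VARCHAR(30),
      GrandSlamsWon INT                       -- "Number of Grand Slams won"
  );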
Converting Composite Attribute in an ER Diagram to Tables
When a regular entity type has a composite attribute, only the simple component
attributes of the composite attribute are included in the relation.
Example
In this example the composite attribute is the Customer address, which consists
of Street, City, State, and Zip.
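A hedged SQL sketch of this mapping is given below: only the simple components of the composite attribute Customer address become columns, and CustomerID is assumed as the identifier.

  -- The composite attribute "Customer address" is flattened into its
  -- simple components Street, City, State, and Zip.
  CREATE TABLE CUSTOMER (
      CustomerID INT PRIMARY KEY,
      Name       VARCHAR(50),
      Street     VARCHAR(60),
      City       VARCHAR(40),
      State      VARCHAR(30),
      Zip        VARCHAR(10)
  );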
When the regular entity type contains a multi-valued attribute, two new relations are
created.
The first relation contains all of the attributes of the entity type except the multi-valued
attribute.
The second relation contains two attributes that form the primary key of the second
relation. The first of these attributes is the primary key from the first relation, which
becomes a foreign key in the second relation. The second is the multi-valued attribute.
Mapping Multivalued Attributes in ER Diagram to Tables
A multivalued attribute has more than one value. One way to map a multivalued attribute is to create two tables.
Example
The skill associated with an EMPLOYEE is a multi-valued attribute, since an EMPLOYEE can have more than one skill, such as fitter, electrician, turner, etc.
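A minimal SQL sketch of the two relations is shown below; the column names and types are assumptions made for illustration.

  -- First relation: all attributes except the multi-valued Skill.
  CREATE TABLE EMPLOYEE (
      EmpID INT PRIMARY KEY,
      Name  VARCHAR(50)
  );

  -- Second relation: the primary key of EMPLOYEE plus the multi-valued attribute.
  CREATE TABLE EMPLOYEE_SKILL (
      EmpID INT REFERENCES EMPLOYEE (EmpID),
      Skill VARCHAR(40),
      PRIMARY KEY (EmpID, Skill)
  );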
Weak entity type does not have an independent existence; it exists only through an identifying relationship with another entity type called the owner. For each weak entity type, create a new relation and include all of its simple attributes as attributes of the relation. Then include the primary key of the identifying relation as a foreign key attribute to this new relation. The primary key of the new relation is the combination of the primary key of the identifying relation and the partial identifier of the weak entity type. In this example DEPENDENT is a weak entity.
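Assuming EMPLOYEE as the owner entity with primary key EmpID and Dependent name as the partial identifier, a sketch of the weak-entity mapping could be:

  CREATE TABLE DEPENDENT (
      EmpID         INT REFERENCES EMPLOYEE (EmpID),  -- foreign key to the owner
      DependentName VARCHAR(50),                      -- partial identifier
      BirthDate     DATE,
      PRIMARY KEY (EmpID, DependentName)              -- owner key + partial identifier
  );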
Converting Binary Relationship to Table


A relationship which involves two entities can be termed as binary relationship. This
binary relationship can be one-to-one, one-to-many, many-to-one, and many-to-many.
Mapping One-to-Many Relationship
For each 1:M relationship, first create a relation for each of the two entity types participating in the relationship.
Example
One customer can give many orders. Hence the relationship between the two entities CUSTOMER and ORDER is a one-to-many relationship. In a one-to-many relationship, include the primary key attribute of the entity on the one side of the relationship as a foreign key in the relation that is on the many side of the relationship.
Here, we have two entities CUSTOMER and ORDER. The relationship between
CUSTOMER and ORDER is one-to-many. For two entities CUSTOMER and ORDER, two
tables namely CUSTOMER and ORDER are created as shown later. The primary key
CUSTOMER ID in the CUSTOMER relation becomes the foreign key in the ORDER
relation.
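Assuming the CUSTOMER table sketched earlier (with primary key CustomerID), the many side of this relationship could be sketched in SQL as below; the table is named CUSTOMER_ORDER here only because ORDER is a reserved word.

  -- The relation on the many side carries the primary key of CUSTOMER
  -- as a foreign key.
  CREATE TABLE CUSTOMER_ORDER (
      OrderID    INT PRIMARY KEY,
      OrderDate  DATE,
      CustomerID INT REFERENCES CUSTOMER (CustomerID)
  );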

Binary one-to-one relationships can be viewed as a special case of one-to-many relationships. The process of mapping a one-to-one relationship requires two steps. First, two relations are created, one for each of the participating entity types. Second, the primary key of one of the relations is included as a foreign key in the other relation.
Mapping Associative Entity to Tables
Many-to-many relationship can be modeled as an associative entity in the ER
diagram.
Example 1. (Without Identifier)
Here the associative entity is ORDERLINE, which is without an identifier.
That is the associative entity ORDERLINE is without any key attribute.
The first step is to create three relations, one for each of the two participating entity types and a third for the associative entity. The relation formed from the associative entity is the associative relation.
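A hedged sketch of the associative relation follows, assuming the CUSTOMER_ORDER table from the earlier sketch and a hypothetical PRODUCT table; the associative relation takes both primary keys as foreign keys and uses their combination as its own primary key.

  CREATE TABLE PRODUCT (
      ProductID   INT PRIMARY KEY,
      ProductName VARCHAR(50)
  );

  -- Associative relation formed from the associative entity ORDERLINE.
  CREATE TABLE ORDER_LINE (
      OrderID   INT REFERENCES CUSTOMER_ORDER (OrderID),
      ProductID INT REFERENCES PRODUCT (ProductID),
      Quantity  INT,
      PRIMARY KEY (OrderID, ProductID)   -- combination of the two foreign keys
  );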

Sometimes data modelers will assign an identifier (surrogate identifier) to the associative entity type on the ER diagram. There are two reasons that motivate this:
1. The associative entity type has a natural identifier that is familiar to end user.
2. The default identifier may not uniquely identify instances of the associative entity.


Example 2. (With Identifier)
In the SHIPMENT example:
a) Shipment-No is a natural identifier familiar to the end user.
b) The default identifier, the combination of Customer-ID and Vendor-ID, does not uniquely identify the instances of SHIPMENT.
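A sketch of the SHIPMENT case: because Shipment-No identifies each shipment on its own, it becomes the primary key, while Customer-ID and Vendor-ID remain foreign keys (the non-key columns are assumed):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE CUSTOMER (Customer_ID INTEGER PRIMARY KEY);
        CREATE TABLE VENDOR   (Vendor_ID   INTEGER PRIMARY KEY);
        CREATE TABLE SHIPMENT (
            Shipment_No INTEGER PRIMARY KEY,   -- natural identifier of the associative entity
            Customer_ID INTEGER REFERENCES CUSTOMER(Customer_ID),
            Vendor_ID   INTEGER REFERENCES VENDOR(Vendor_ID),
            Ship_Date   TEXT
        );
    """)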

Converting Unary Relationship to Tables

A unary relationship is a relationship between instances of a single entity type. The two most important cases of unary relationship are one-to-many and many-to-many.

One-to-Many Unary Relationship
Each employee has exactly one manager, and a given employee may manage zero to many employees. The foreign key in the EMPLOYEE relation is named Manager-ID; this attribute has the same domain as the primary key Employee-ID.
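A sketch of the recursive foreign key, with a small self-join to show how each employee's manager is retrieved (the sample rows are illustrative only):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Manager_ID draws its values from Employee_ID in the same table.
        CREATE TABLE EMPLOYEE (
            Employee_ID INTEGER PRIMARY KEY,
            Name        TEXT,
            Manager_ID  INTEGER REFERENCES EMPLOYEE(Employee_ID)
        );
    """)
    conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
                     [(1, 'Asha', None), (2, 'Bala', 1), (3, 'Chitra', 1)])
    # Self-join: the table plays both the employee role and the manager role.
    for name, manager in conn.execute("""
            SELECT e.Name, m.Name
            FROM EMPLOYEE e LEFT JOIN EMPLOYEE m ON e.Manager_ID = m.Employee_ID"""):
        print(name, "reports to", manager)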


Converting Ternary Relationship to Tables

A ternary relationship is a relationship among three entity types. In this example the three entity types are PATIENT, PHYSICIAN, and TREATMENT, and PATIENT TREATMENT is an associative entity.

The primary key attributes Patient ID, Physician ID, and Treatment Code become foreign keys in PATIENT TREATMENT. These attributes are components of the primary key of PATIENT TREATMENT.
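A sketch of the resulting relations; a real design might also add a date or time column to the key, but the version below keeps only the three borrowed keys named in the text:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE PATIENT   (Patient_ID     INTEGER PRIMARY KEY);
        CREATE TABLE PHYSICIAN (Physician_ID   INTEGER PRIMARY KEY);
        CREATE TABLE TREATMENT (Treatment_Code TEXT    PRIMARY KEY);
        -- Each borrowed key is a foreign key, and together they form
        -- the primary key of the associative relation.
        CREATE TABLE PATIENT_TREATMENT (
            Patient_ID     INTEGER REFERENCES PATIENT(Patient_ID),
            Physician_ID   INTEGER REFERENCES PHYSICIAN(Physician_ID),
            Treatment_Code TEXT    REFERENCES TREATMENT(Treatment_Code),
            PRIMARY KEY (Patient_ID, Physician_ID, Treatment_Code)
        );
    """)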


Normalisation
What is Normalization?
Normalization is a formal process for determining which fields belong in which
tables in a relational database.
Through normalization a collection of data in a record structure is replaced by
successive record structures that are simpler and more predictable and therefore
more manageable.
Normalisation is carried out for the following reasons:
1. To structure the data so that any pertinent (relevant) relationships between entities can be represented.
2. To permit simple retrieval of data in response to query and report requests.
3. To simplify the maintenance of the data through updates, insertions, and deletions.
4. To reduce the need to restructure or reorganize data when new application requirements arise.


A normalized relational database provides several benefits:
- Elimination of redundant data storage.
- Decomposition of all data groups into two-dimensional records.
- Elimination of any relationships in which data elements do not fully depend on the primary key of the record.
- Elimination of any relationships that contain a transitive dependency.
Normalization ensures that you get the benefits relational databases offer.

Design Vs Implementation
Designing a database structure and implementing a database structure are different
tasks.

When you design a structure it should be described without reference to the


specific database tool you will use to implement the system, or what concessions
you plan to make for performance reasons. These steps come later.
After youve designed the database structure abstractly, then you implement it in a
particular environment.
Too often people new to database design combine design and implementation in
one step. Implementing a structure without designing it quickly leads to flawed
structures that are difficult and costly to modify.
Design first, implement second, and youll finish faster and cheaper.

Normalized Design: Pros and Cons

We've implied that there are various advantages to producing a properly normalized design before you implement your system. Let's look at the pros and cons.

Pros of Normalizing
- More efficient database structure.
- Better understanding of your data.
- More flexible database structure.
- Easier to maintain database structure.
- Avoids redundant fields.
- Ensures that distinct tables exist when necessary.
- Validates your common sense and intuition.

Cons of Normalizing
- You cannot start building the database before you know what the user needs.

Terminology
Primary Key

The primary key is a fundamental concept in relational database design.

It's an easy concept: each record should have something that identifies it uniquely. The primary key can be a single field or a combination of fields. A table's primary key also serves as the basis of relationships with other tables. For example, it is typical to relate invoices to a unique customer ID, and employees to a unique department ID. A primary key should be unique, mandatory, and permanent.

A classic mistake people make when learning to create relational databases is to use a volatile field as the primary key. For example, consider this table:
[Companies]
Company Name
Address

Company Name is an obvious candidate for the primary key. Yet this is a bad idea, even if the Company Name is unique, because company names change. When a name changes, not only do you have to change this record, you have to update every single related record, since the key has changed.
Another common mistake is to select a field that is usually unique and unchanging. Consider this small table:
[People]
Social Security Number
First Name
Last Name
Date of birth

In the United States all workers have a Social Security Number that uniquely identifies them for tax purposes. Or does it? As it turns out, not everyone has a Social Security Number, Social Security Numbers can change, and some people have more than one.
This is an appealing but untrustworthy key.

The correct way to build a primary key is with a unique and unchanging value.
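One common way to get such a value is a system-generated surrogate key. The sketch below is illustrative rather than prescribed by the text; renaming the company then touches exactly one row, and the key never changes:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Company_ID is unique, mandatory and permanent; the name can change
        -- later without disturbing any related record.
        CREATE TABLE COMPANIES (
            Company_ID   INTEGER PRIMARY KEY AUTOINCREMENT,
            Company_Name TEXT NOT NULL,
            Address      TEXT
        );
    """)
    conn.execute("INSERT INTO COMPANIES (Company_Name, Address) VALUES ('Acme Ltd', 'Chennai')")
    conn.execute("UPDATE COMPANIES SET Company_Name = 'Acme Industries Ltd' WHERE Company_ID = 1")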

Functional Dependency
Closely tied to the notion of a key is a special normalization concept called functional
dependence or functional dependency. The second and third normal forms verify that
your functional dependencies are correct.
So what is a functional dependency?
It describes how one field (or combination (composite) of fields) determines another
field. Consider an example:
[ZIP Codes]
ZIP Code
City
County
State Abbreviation
State Name
ZIP Code is a unique 5-digit code that determines the other fields: for each ZIP Code there is a single city, county, and state abbreviation. These fields are functionally dependent on the ZIP Code field; in other words, they belong with this key. Now look at the last two fields, State Abbreviation and State Name. State Abbreviation determines State Name; in other words, State Name is functionally dependent on State Abbreviation. State Abbreviation is acting like a key for the State Name field. Ah ha! State Abbreviation is a key, so it belongs in another table. As we'll see, the third normal form tells us to create a new States table and move State Name into it, as sketched below.
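The decomposition can be sketched as two tables, with each state name stored once in STATES rather than once per ZIP Code (column types are assumed):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- State_Name depends on State_Abbreviation, not on ZIP_Code,
        -- so it lives in its own table.
        CREATE TABLE STATES (
            State_Abbreviation TEXT PRIMARY KEY,
            State_Name         TEXT
        );
        CREATE TABLE ZIP_CODES (
            ZIP_Code           TEXT PRIMARY KEY,
            City               TEXT,
            County             TEXT,
            State_Abbreviation TEXT REFERENCES STATES(State_Abbreviation)
        );
    """)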

Rules of Data Normalization

1NF Eliminate Repeating Groups - Make a separate table for each set of related attributes, and give each table a primary key.
2NF Eliminate Redundant Data - If an attribute depends on only part of a multi-valued (composite) key, remove it to a separate table.
3NF Eliminate Columns Not Dependent On Key - If attributes do not contribute to a description of the key, remove them to a separate table.
BCNF Boyce-Codd Normal Form - If there are non-trivial dependencies between candidate key attributes, separate them out into distinct tables.
4NF Isolate Independent Multiple Relationships - No table may contain two or more 1:n or n:m relationships that are not directly related.
5NF Isolate Semantically Related Multiple Relationships - There may be practical constraints on information that justify separating logically related many-to-many relationships.
ONF Optimal Normal Form - A model limited to only simple (elemental) facts, as expressed in Object Role Model notation.
DKNF Domain-Key Normal Form - A model free from all modification anomalies.
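As a compact illustration of the first two rules (the tables and columns are assumed, not taken from the text): the repeating group of ordered items is split into its own table with a primary key of its own (1NF), and Unit_Price, which depends only on the Product_ID part of that table's key, stays in PRODUCT instead of being repeated in every item row (2NF):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE CUSTOMER_ORDER (
            Order_ID   INTEGER PRIMARY KEY,
            Order_Date TEXT
        );
        CREATE TABLE PRODUCT (
            Product_ID INTEGER PRIMARY KEY,
            Unit_Price REAL          -- depends on Product_ID alone (2NF)
        );
        -- 1NF: one row per ordered item instead of Item1, Item2, ... columns.
        CREATE TABLE ORDER_ITEM (
            Order_ID   INTEGER REFERENCES CUSTOMER_ORDER(Order_ID),
            Product_ID INTEGER REFERENCES PRODUCT(Product_ID),
            Quantity   INTEGER,
            PRIMARY KEY (Order_ID, Product_ID)
        );
    """)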
