
February 2010

Master of Computer Application (MCA) – Semester 3


MC0073 – System Programming
Assignment Set – 2

1. Explain the following with respect to Loaders:


A) Design of an Absolute Loader
B) A Simple Bootstrap Loader
Ans –

A) Design of an Absolute Loader

Because the example loader does not need to perform functions such
as linking and program relocation, its operation is very simple. All
functions are accomplished in a single pass. The Header record is
checked to verify that the correct program has been presented for
loading. As each Text record is read, the object code it contains is moved
to the indicated address in memory. When the End record is
encountered, the loader jumps to the specified address to begin
execution of the loaded program.

Program loaded in memory

Each pair of bytes from the object program record must be packed
together into one byte during loading: in the record, each printed
character represents one byte of the object program, whereas in memory
each printed character corresponds to one hexadecimal digit (i.e., a
half-byte). Most machines store object programs in a binary form, with
each byte of object code stored as a single byte in the object program.

begin
    read Header record
    verify program name and length
    read first Text record
    while record type <> 'E' do
        begin
            {if object code is in character form, convert into
             internal representation}
            move object code to specified location in memory
            read next object program record
        end
    jump to address specified in End record
end

Figure 5.0 Algorithm for an absolute loader
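
The same single-pass logic can be sketched in C. This is an illustration only: the record structure, the field sizes and the sample program below are simplifications invented for the sketch, and a real loader would parse textual Header/Text/End records as described above.

#include <stdio.h>
#include <string.h>

/* Simplified, hypothetical object-program record: the record type
   ('H', 'T' or 'E'), a load/entry address, a length and the code bytes. */
struct record {
    char type;
    unsigned addr;
    unsigned length;
    unsigned char code[8];
};

static unsigned char memory[1 << 16];   /* simulated target memory */

/* Single-pass absolute loading over an already-parsed record stream;
   returns the execution address taken from the End record. */
unsigned absolute_load(const struct record *recs, size_t n)
{
    size_t i = 1;   /* recs[0] is the Header record: a real loader
                       verifies the program name and length here */
    while (i < n && recs[i].type != 'E') {
        /* move object code to the address given in the Text record */
        memcpy(&memory[recs[i].addr], recs[i].code, recs[i].length);
        i++;
    }
    return recs[i].addr;
}

int main(void)
{
    const struct record prog[] = {
        { 'H', 0x1000, 0, { 0 } },                  /* Header */
        { 'T', 0x1000, 3, { 0x14, 0x10, 0x06 } },   /* Text   */
        { 'E', 0x1000, 0, { 0 } },                  /* End    */
    };
    printf("jump to %04X\n", absolute_load(prog, 3));
    return 0;
}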

B) A Simple Bootstrap Loader

A computer's central processor can only execute program code found in
Read-Only Memory (ROM) and Random Access Memory (RAM). Modern
operating systems and application program code and data are stored on
nonvolatile data storage devices, such as hard disk drives, CDs, DVDs,
USB flash drives, and floppy disks. When a computer is first powered on,
it does not have an operating system in ROM or RAM. The computer must
initially execute a small program stored in ROM, along with the bare
minimum of data needed to access the nonvolatile devices from which
the operating system programs and data are loaded into RAM.

The small program that starts this sequence of loading into RAM is
known as a bootstrap loader, bootstrap, or boot loader. This small
boot loader program's only job is to load other data and programs which
are then executed from RAM. Often, multiple-stage boot loaders are
used, during which several programs of increasing complexity
sequentially load one after the other in a process of chain loading.

Early computers (such as the PDP-1 through PDP-8 and early models of
the PDP-11) had a row of toggle switches on the front panel to allow the
operator to manually enter the binary boot instructions into memory
before transferring control to the CPU. The boot loader would then read
in either the second-stage boot loader (called the Binary Loader, read
from paper tape with checksum) or the operating system from an outside
storage medium such as paper tape, punched card, or a disk drive.
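
As a purely illustrative sketch of the first stage of such a sequence, the following C fragment simulates a boot loader that copies a fixed-size second-stage image from a boot device into RAM and would then transfer control to it. The device array, the image size and the final jump are all assumptions made for the sketch:

#include <stdio.h>

#define STAGE2_SIZE 16   /* assumed size of the second-stage image */

/* Stand-in for the boot device (e.g. the first sector of a disk). */
static const unsigned char boot_device[STAGE2_SIZE] = { 0 };

/* Stand-in for the RAM region at the assumed load address. */
static unsigned char ram[STAGE2_SIZE];

void bootstrap(void)
{
    /* first stage: copy the second-stage loader from the device to RAM */
    for (int i = 0; i < STAGE2_SIZE; i++)
        ram[i] = boot_device[i];

    /* chain load: on real hardware control would now jump to the
       copied code, e.g. ((void (*)(void))ram)(); here we only report */
    printf("second stage loaded (%d bytes)\n", STAGE2_SIZE);
}

int main(void)
{
    bootstrap();
    return 0;
}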

2. Write about Deterministic and Non-Deterministic Finite Automata with suitable numerical examples.
Ans –

Deterministic finite automata

Definition: Deterministic finite automata (DFA)


A deterministic finite automaton (DFA) is a 5-tuple (S, Σ, T, s, A) consisting of:

· an alphabet (Σ)

· a set of states (S)

· a transition function (T : S × Σ → S)

· a start state (s ∈ S)

· a set of accept states (A ⊆ S)

The machine starts in the start state and reads in a string of symbols
from its alphabet. It uses the transition function T to determine the next
state from the current state and the symbol just read. If, when it has
finished reading, it is in an accepting state, it is said to accept the
string; otherwise it is said to reject the string. The set of strings it
accepts forms a language, which is the language the DFA recognizes.

Non-Deterministic Finite Automaton (NFA)

A Non-Deterministic Finite Automaton (NFA) is a 5-tuple (S, Σ, T, s, A) consisting of:

· an alphabet (Σ)

· a set of states (S)

· a transition function (T : S × (Σ ∪ {ε}) → P(S))

· a start state (s ∈ S)

· a set of accept states (A ⊆ S)

Here P(S) is the power set of S and ε is the empty string. The
machine starts in the start state and reads in a string of symbols from its
alphabet. It uses the transition relation T to determine the next state(s)
from the current state and the symbol just read or the empty string. If,
when it has finished reading, it is in an accepting state, it is said to
accept the string; otherwise it is said to reject the string. The set of
strings it accepts forms a language, which is the language the NFA
recognizes.

A DFA or NFA can easily be converted into a GNFA, and the GNFA can
then be converted into a regular expression by reducing the number of
states until S = {s, a}, i.e. until only the start state and a single accept
state remain.

Deterministic Finite State Machine


The following example describes a deterministic finite state machine
M with a binary alphabet, which determines whether the input contains
an even number of 0s.

M = (S, Σ, T, s, A), where:

· Σ = {0, 1}

· S = {S1, S2}

· s = S1

· A = {S1}

The transition function T can be visualized as a directed graph over the
two states, and is defined as follows:

o T(S1, 0) = S2

o T(S1, 1) = S1

o T(S2, 0) = S1

o T(S2, 1) = S2

Simply put, the state S1 represents that there has been an even
number of 0s in the input so far, while S2 signifies an odd number. A 1 in
the input does not change the state of the automaton. When the input
ends, the state will show whether the input contained an even number of
0s or not.
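
This DFA can be simulated directly from its transition table. The following C sketch encodes T as a two-dimensional array and accepts exactly the strings over {0, 1} that contain an even number of 0s:

#include <stdio.h>

enum state { S1, S2 };

/* T[current state][input symbol]: rows are S1/S2, columns are 0/1 */
static const enum state T[2][2] = {
    { S2, S1 },   /* from S1: a 0 flips to S2, a 1 stays in S1 */
    { S1, S2 }    /* from S2: a 0 flips to S1, a 1 stays in S2 */
};

/* Returns 1 if the DFA accepts the input string, 0 otherwise. */
int accepts(const char *input)
{
    enum state s = S1;                /* start state s = S1 */
    for (; *input; input++)
        s = T[s][*input - '0'];       /* apply the transition function */
    return s == S1;                   /* accept states A = {S1} */
}

int main(void)
{
    printf("%d\n", accepts("1001"));  /* two 0s: accepted (prints 1) */
    printf("%d\n", accepts("000"));   /* three 0s: rejected (prints 0) */
    return 0;
}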

There are two main methods for deciding where to generate the
outputs of a finite state machine. They are called the Moore Machine and
the Mealy Machine, named after their respective authors.

3. Explain with suitable numerical examples the concepts of Moore Machine and Mealy Machine.
Ans –

Moore Machine and Mealy Machine


A Moore Machine is a type of finite state machine where the
outputs are generated as products of the states. In the example below,
the states define what to do, such as apply power to the light globe.

A light system example of a Moore Machine

A Mealy Machine, unlike a Moore Machine, is a type of finite state
machine where the outputs are generated as products of the transitions
between states. In the example below, the light is affected by the
process of changing states.

A light system example of a Mealy Machine
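
The difference can also be expressed in code. In the following C sketch (the states, input and light output are invented for the illustration), the Moore version derives the output from the state alone, while the Mealy version produces it as part of the transition:

#include <stdio.h>

enum state { OFF, ON };
enum input { PRESS, RELEASE };

/* Moore machine: the output is a function of the current state only. */
int moore_output(enum state s)
{
    return s == ON;   /* the light is lit exactly when the state is ON */
}

/* Mealy machine: the output is generated by the transition itself. */
enum state mealy_step(enum state s, enum input in, int *light)
{
    if (in == PRESS) {
        *light = (s == OFF);   /* output produced while changing state */
        return (s == OFF) ? ON : OFF;
    }
    return s;                  /* RELEASE: no transition, no new output */
}

int main(void)
{
    int light = 0;
    enum state s = mealy_step(OFF, PRESS, &light);
    printf("Mealy output: %d, Moore output: %d\n", light, moore_output(s));
    return 0;
}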

Finite state machines are not a new technique; they have been around
for a long time. The concept of decomposition should be familiar to
people with programming or design experience. There are a number of
abstract modeling techniques that may help or spark understanding in
the definition and design of a finite state machine; most come from the
area of design or mathematics.

· State Transition Diagram: also called a bubble diagram, shows the
relationships between states and the inputs that cause state transitions.

· State-Action-Decision Diagram: simply a flow diagram with the addition
of bubbles that show waiting for external inputs.

· Statechart Diagrams: a form of UML notation used to show the behavior
of an individual object as a number of states, and the transitions
between those states.

· Hierarchical Task Analysis (HTA): though it does not look at states, HTA
is a task decomposition technique that looks at the way a task can be
split into subtasks, and the order in which they are performed [4].

4. Write about different Phases of Compilation.


Ans –
Phases of Compiler

A compiler takes as input a source program and produces as output an
equivalent sequence of machine instructions. This process is so complex
that it is not reasonable, either from a logical point of view or from an
implementation point of view, to consider the compilation process as
occurring in one single step. For this reason, it is customary to partition
the compilation process into a series of subprocesses called phases. A
phase is a logically cohesive operation that takes as input one
representation of the source program and produces as output another
representation.

The first phase, called the lexical analyzer, or scanner, separates
characters of the source language into groups that logically belong
together; these groups are called tokens. The usual tokens are keywords,
such as DO or IF; identifiers, such as X or NUM; operator symbols, such
as <= or +; and punctuation symbols such as parentheses or commas.
The output of the lexical analyzer is a stream of tokens, which is passed
to the next phase, the syntax analyzer, or parser. The tokens in this
stream can be represented by codes which we may regard as integers.
Thus, DO might be represented by 1, + by 2, and "identifier" by 3. In the
case of a token like "identifier", a second quantity, telling which of the
identifiers used by the program is represented by this instance of the
token "identifier", is passed along with the integer code for "identifier".

The syntax analyzer groups tokens together into syntactic structures.
For example, the three tokens representing A + B might be grouped into
a syntactic structure called an expression. Expressions might further be
combined to form statements. Often the syntactic structure can be
regarded as a tree whose leaves are the tokens. The interior nodes of
the tree represent strings of tokens that logically belong together.

The intermediate code generator uses the structure produced by the
syntax analyzer to create a stream of simple instructions. Many styles of
intermediate code are possible. One common style uses instructions with
one operator and a small number of operands. These instructions can be
viewed as simple macros like the macro ADD2. The primary difference
between intermediate code and assembly code is that the intermediate
code need not specify the registers to be used for each operation.

Code Optimization is an optional phase designed to improve the
intermediate code so that the ultimate object program runs faster and/or
takes less space. Its output is another intermediate code program that
does the same job as the original, but perhaps in a way that saves time
and/or space.

The final phase, code generation, produces the object code by
deciding on the memory locations for data, selecting code to access each
datum, and selecting the registers in which each computation is to be
done. Designing a code generator that produces truly efficient object
programs is one of the most difficult parts of compiler design, both
practically and theoretically.

The Table-Management, or bookkeeping, portion of the compiler keeps
track of the names used by the program and records essential
information about each, such as its type (integer, real, etc.). The data
structure used to record this information is called a Symbol Table.

The Error Handler is invoked when a flaw in the source program is
detected. It must warn the programmer by issuing a diagnostic, and
adjust the information being passed from phase to phase so that each
phase can proceed. It is desirable that compilation be completed on
flawed programs, at least through the syntax-analysis phase, so that as
many errors as possible can be detected in one compilation. Both the
table-management and error-handling routines interact with all phases
of the compiler.

Lexical Analysis

The lexical analyzer is the interface between the source program and
the compiler. The lexical analyzer reads the source program one
character at a time, carving the source program into a sequence of
atomic units called tokens. Each token represents a sequence of
characters that can be treated as a single logical entity. Identifiers,
keywords, constants, operators, and punctuation symbols such as
commas and parentheses are typical tokens. There are two kinds of
tokens: specific strings, such as IF or a semicolon, and classes of strings,
such as identifiers, constants, or labels.
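
A scanner for these two kinds of tokens can be sketched in a few lines of C. The token set below (the keywords IF and DO, identifiers, and single-character symbols) is a deliberate simplification chosen for the illustration:

#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Reads one token from *p, classifying it as a keyword (specific
   string), an identifier (class of strings) or a one-character symbol. */
void next_token(const char **p)
{
    while (isspace((unsigned char)**p))
        (*p)++;                       /* skip whitespace between tokens */
    if (**p == '\0')
        return;

    if (isalpha((unsigned char)**p)) {
        char word[32];
        int n = 0;
        while (isalnum((unsigned char)**p) && n < 31)
            word[n++] = *(*p)++;      /* collect the whole word */
        word[n] = '\0';
        if (strcmp(word, "IF") == 0 || strcmp(word, "DO") == 0)
            printf("keyword:    %s\n", word);
        else
            printf("identifier: %s\n", word);
    } else {
        printf("symbol:     %c\n", *(*p)++);
    }
}

int main(void)
{
    const char *src = "IF X < NUM DO";
    while (*src)
        next_token(&src);
    return 0;
}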

Syntax Analysis

The parser has two functions. It checks that the tokens appearing
in its input, which is the output of the lexical analyzer, occur in patterns
that are permitted by the specification for the source language. It also
imposes on the tokens a tree-like structure that is used by the
subsequent phases of the compiler.

The second aspect of syntax analysis is to make explicit the hierarchical
structure of the incoming token stream by identifying which parts of the
token stream should be grouped together.

Intermediate Code Generation

On a logical level the output of the syntax analyzer is some
representation of a parse tree. The intermediate code generation phase
transforms this parse tree into an intermediate language representation
of the source program called Three-Address Code.

Three-Address Code

One popular type of intermediate language is what is called "three-
address code". A typical three-address code statement is

A := B op C

where A, B and C are operands and op is a binary operator.
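
For instance (an invented example in the same notation, using compiler-generated temporaries T1 and T2), the single source statement A := B * C + D might be translated into the sequence:

T1 := B * C

T2 := T1 + D

A := T2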

Code Optimization

Object programs that are frequently executed should be fast and small.
Certain compilers have within them a phase that tries to apply
transformations to the output of the intermediate code generator, in an
attempt to produce an intermediate-language version of the source
program from which a faster or smaller object-language program can
ultimately be produced. This phase is popularly called the optimization
phase.

A good optimizing compiler can improve the target program by perhaps
a factor of two in overall speed, in comparison with a compiler that
generates code carefully but without using the specialized techniques
generally referred to as code optimization. There are two types of
optimization used (a small example of the first follows the list):

· Local Optimization

· Loop Optimization
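
As a small invented illustration of local optimization in the same three-address notation, a redundant computation of B * C inside one basic block can be removed by common-subexpression elimination:

Before optimization:

T1 := B * C

T2 := B * C

A := T1 + T2

After optimization:

T1 := B * C

A := T1 + T1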

Code Generation

The code generation phase converts the intermediate code into a
sequence of machine instructions. A simple-minded code generator
might map the statement A := B + C into the machine code sequence:

LOAD B

ADD C

STORE A

However, such a straightforward macro-like expansion of intermediate
code into machine code usually produces a target program that contains
many redundant loads and stores and that utilizes the resources of the
target machine inefficiently.

To avoid these redundant loads and stores, a code generator might
keep track of the run-time contents of registers. Knowing what quantities
reside in registers, the code generator can generate loads and stores
only when necessary.

Many computers have only a few high-speed registers in which
computations can be performed particularly quickly. A good code
generator would therefore attempt to utilize these registers as efficiently
as possible. This aspect of code generation, called register allocation, is
particularly difficult to do optimally.

5. Describe the following with respect to Storage or Memory Allocations:
A) Static Memory Allocations
B) Stack Based Allocations
C) Dynamic Memory Allocations
Ans –

A) Static Memory Allocation

Static memory allocation refers to the process of allocating memory at
compile-time, before the associated program is executed, unlike dynamic
memory allocation or automatic memory allocation where memory is
allocated as required at run-time.

Static Allocation of Variables

The first type of memory allocation is known as static memory
allocation, which corresponds to file-scope variables and local static
variables. Not all variables are automatically allocated. The following
kinds of variables are statically allocated:

· all global variables, regardless of whether or not they have been
declared as static;

· local variables explicitly declared to be static.

Statically allocated variables have their storage allocated and
initialized before main starts running and are not deallocated until main
has terminated. Statically allocated local variables are not re-initialized
on every call to the function in which they are declared. A statically
allocated variable thus has the occasionally useful property of
maintaining its value even when none of the functions that access the
variable are active.

The addresses and sizes of these allocations are fixed at the time of
compilation and so they can be placed in a fixed-size data area which
then corresponds to a section within the final linked executable file. Such
memory allocations are called static because they do not vary in location
or size during the lifetime of the program.
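
A short C example of this property (the function name is arbitrary): the statically allocated local counter below keeps its value between calls, which an automatic variable would not do:

#include <stdio.h>

int count_calls(void)
{
    static int calls = 0;   /* statically allocated: initialised once
                               before main runs, not on every call */
    return ++calls;
}

int main(void)
{
    printf("%d\n", count_calls());   /* prints 1 */
    printf("%d\n", count_calls());   /* prints 2: the value persisted */
    return 0;
}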

There can be many types of data sections within an executable file;
the three most common are normal data, BSS data and read-only data.
BSS data contains variables and arrays which are to be initialized to zero
at run-time and so is treated as a special case, since the actual contents
of the section need not be stored in the executable file. Read-only data
consists of constant variables and arrays whose contents are guaranteed
not to change when a program is being run. For example, on a typical
SVR4 UNIX system the following variable definitions would result in them
being placed in the following sections:

int a; // BSS data

int b = 1; // normal data

const int c = 2; // read-only data

In C the first example would be considered a tentative declaration, and
if there was no subsequent definition of that variable in the current
translation unit then it would become a common variable in the resulting
object file. When the object file gets linked with other object files, any
common variables with the same name become one variable, or take
their definition from a non-tentative definition of that variable. In the
former case, the variable is placed in the BSS section. Note that C++ has
no support for tentative declarations.

As all static memory allocations have sizes and address offsets that
are known at compile-time and are explicitly initialised, there is very
little that can go wrong with them. Data can be read or written past the
end of such variables, but that is a common problem with all memory
allocations and is generally easy to locate in that case. On systems that
separate read-only data from normal data, writing to a read-only variable
can be quickly diagnosed at run-time.

B) Stack memory allocations

The second type of memory allocation is known as stack memory
allocation, which corresponds to non-static local variables and call-by-
value parameter variables. The sizes of these allocations are fixed at the
time of compilation but their addresses will vary depending on when the
function which defines them is called. Their contents are not immediately
initialised, and must be explicitly initialised by the programmer upon
entry to the function or when they become visible in scope.

Such memory allocations are placed in a system memory area called
the stack, which is allocated per process and generally grows down in
memory. When a function is called, the state of the calling function must
be preserved so that when the called function returns, the calling
function can resume execution. That state is stored on the stack,
including all local variables and parameters. The compiler generates
code to increase the size of the stack upon entry to a function, and
decrease the size of the stack upon exit from a function, as well as
saving and restoring the values of registers.

There are a few common problems using stack memory allocations,
and most generally involve uninitialised variables, which a good compiler
can usually diagnose at compile-time (as the sketch below shows). Some
compilers also have options to initialise all local variables with a bit
pattern so that uninitialised stack variables will cause program faults at
run-time. As with static memory allocations, there can be problems with
reading or writing past the end of stack variables, but as their sizes are
fixed these can usually be located easily.
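
The uninitialised-variable problem can be shown in a few lines of C; a good compiler will warn about the first function at compile-time:

int bad(void)
{
    int x;           /* stack-allocated: contents are indeterminate */
    return x + 1;    /* undefined behaviour: x was never initialised */
}

int good(void)
{
    int x = 0;       /* explicitly initialised on entry to the function */
    return x + 1;
}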

C) Dynamic memory allocations

The last type of memory allocation is known as dynamic memory
allocation, which corresponds to memory allocated via malloc() or
operator new[]. The sizes, addresses and contents of such memory vary
at run-time and so can cause a lot of problems when trying to diagnose a
fault in a program. These memory allocations are called dynamic
because their location and size can vary throughout the lifetime of a
program.

Such memory allocations are placed in a system memory area called
the heap, which is allocated per process on some systems, but on others
may be allocated directly from the system in scattered blocks.

Unlike memory allocated on the stack, memory allocated on the heap
is not freed when a function or scope is exited, and so must be explicitly
freed by the programmer. The pattern of allocations and deallocations is
not guaranteed to be (and is not really expected to be) linear, and so the
functions that allocate memory from the heap must be able to efficiently
reuse freed memory and resize existing allocated memory on request. In
some programming languages there is support for a garbage collector,
which attempts to automatically free memory that has had all references
to it removed, but this has traditionally not been very popular for
programming languages such as C and C++, and has been more widely
used in functional languages.
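
A minimal C example of the explicit allocate/resize/free cycle described above:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *a = malloc(10 * sizeof *a);   /* allocate 10 ints from the heap */
    if (a == NULL)
        return 1;

    int *b = realloc(a, 20 * sizeof *b);   /* resize the same allocation */
    if (b == NULL) {
        free(a);
        return 1;
    }

    b[19] = 42;
    printf("%d\n", b[19]);

    free(b);   /* heap memory must be freed explicitly by the programmer */
    return 0;
}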

Because dynamic memory allocations are performed at run-time rather
than compile-time, they are outside the domain of the compiler and must
be implemented in a run-time package, usually as a set of functions
within a linker library. Such a package manages the heap in such a way
as to abstract its underlying structure from the programmer, providing a
common interface to heap management on different systems. However,
this malloc library must decide whether to implement a fast memory
allocator, a space-conserving memory allocator, or a bit of both. It must
also try to keep its own internal tables to a minimum so as to conserve
memory, but this means that it has very little capability to diagnose
errors if any occur.

In some compiler implementations there is a built-in function called
alloca(). This is a dynamic memory allocation function that allocates
memory from the stack rather than the heap, and so the memory is
automatically freed when the function that called it returns. This is a
non-standard feature that is not guaranteed to be present in a compiler,
and indeed may not be possible to implement on some systems.
However, the mpatrol library provides a debugging version of this
function (and a few other related functions) on all systems, so that they
make use of the heap instead of the stack.

6. Describe the following with respect to Software Tools for Program Development:
A) Compilers B) Editors
C) Debuggers D) Interpreters
Ans –

Compiler

A compiler is a computer program (or set of programs) that translates
text written in a computer language (the source language) into another
computer language (the target language). The original sequence is
usually called the source code and the output called object code.
Commonly the output has a form suitable for processing by other
programs (e.g., a linker), but it may be a human-readable text file.

Compiler Backend

While there are applications where only the compiler frontend is
necessary, such as static language verification tools, a real compiler
hands the intermediate representation generated by the frontend to the
backend, which produces a functionally equivalent program in the output
language. This is done in multiple steps:

1. Optimization – the intermediate language representation is
transformed into functionally equivalent but faster (or smaller) forms.

2. Code Generation – the transformed intermediate language is
translated into the output language, usually the native machine
language of the system. This involves resource and storage decisions,
such as deciding which variables to fit into registers and memory, and
the selection and scheduling of appropriate machine instructions.

Typical compilers output so-called objects, which basically contain
machine code augmented by information about the name and location of
entry points and external calls (to functions not contained in the object).
A set of object files, which need not have all come from a single
compiler, may then be linked together to create the final executable
which can be run directly by a user.

Compiler Frontend

The compiler frontend consists of multiple phases in itself, each
informed by formal language theory:

1. Scanning – breaking the source code text into small pieces, tokens –
sometimes called 'terminals' – each representing a single piece of the
language, for instance a keyword, identifier or symbol name. The token
language is typically a regular language, so a finite state automaton
constructed from a regular expression can be used to recognize it.

2. Parsing – identifying syntactic structures – so-called 'non-terminals' –
constructed from one or more tokens and non-terminals, representing
complicated language elements, for instance assignments, conditions
and loops. This is typically done with a parser for a context-free
grammar, often an LL parser or LR parser from a parser generator. (Most
programming languages are only almost context-free, so there's often
some extra logic hacked in.)

3. Intermediate Language Generation – an equivalent to the original
program is created in a special-purpose intermediate language.

B) Editors

An Editor is a software tool for editing something, i.e. for introducing
changes into some text, graphics or program. Examples include:

· HTML editor

· text editor

· source code editor

· graphics editor

· game level editor

· game character editor

· word processor and other, more complex text-producing tools

C) Debuggers

A Debugger is a computer program that is used to test and debug other
programs. The code to be examined might alternatively be running on
an instruction set simulator (ISS), a technique that allows great power in
its ability to halt when specific conditions are encountered, but which will
typically be much slower than executing the code directly on the
appropriate processor.

When the program crashes, the debugger shows the position in the
original code if it is a source-level debugger or symbolic debugger,
commonly seen in integrated development environments. If it is a low-
level debugger or a machine-language debugger it shows the line in
the disassembly. (A "crash" happens when the program cannot continue
because of a programming bug. For example, perhaps the program tried
to use an instruction not available on the current version of the CPU or
attempted access to unavailable or protected memory.)

Typically, debuggers also offer more sophisticated functions such as
running a program step by step (single-stepping), stopping (breaking) at
some kind of event by means of a breakpoint, which pauses the program
so that its current state can be examined, and tracking the values of
some variables. Some debuggers have the ability to modify the state of
the program while it is running, rather than merely to observe it.

The importance of a good debugger cannot be overstated. Indeed, the
existence and quality of such a tool for a given language and platform
can often be the deciding factor in its use, even if another
language/platform is better suited to the task. However, it is also
important to note that software can (and often does) behave differently
when running under a debugger than normally, due to the inevitable
changes the presence of a debugger will make to a software program's
internal timing. As a result, even with a good debugging tool, it is often
very difficult to track down runtime problems in complex multi-threaded
or distributed systems.

List of Debuggers

· CodeView

· DAEDALUS

· DBG – a PHP debugger and profiler

· Xdebug – PHP debugger

· dbx

· DDD (Data Display Debugger)

· Ddbg – Win32 debugger for the D programming language

· DEBUG (DOS command)

· Dynamic Debugging Technique (DDT), and its octal counterpart Octal
Debugging Technique

· Eclipse

· gDEBugger – a commercial OpenGL and OpenGL ES debugger; a real-time
GPU debugger and analysis tool provided by Graphic Remedy, available
for Windows and Linux

· GoBug – symbolic debugger for Windows

· GNU Debugger (GDB)

· Insight

· Interactive Disassembler (IDA Pro)

· Java Platform Debugger Architecture

· JSwat – open-source Java debugger

· MacsBug

· OLIVER (CICS interactive test/debug)

· OllyDbg

· IBM Rational Purify

· sdb

· SIMMON (Simulation Monitor)

· SIMON (Batch Interactive test/debug)

D) Interpreter

In computer science, an interpreter is a computer program that
executes, or performs, instructions written in a computer programming
language. Interpretation is one of the two major ways in which a
programming language can be implemented, the other being
compilation. The term interpreter may refer to a program that executes
source code that has already been translated to some intermediate form,
or it may refer to the program that performs both the translation and
execution (e.g., many BASIC implementations).

An interpreter needs to be able to analyze, or parse, instructions
written in the source language. It also needs to represent any program
state, such as variables or data structures, that a program may create. It
needs to be able to move around in the source code when instructed to
do so by control flow constructs such as loops or conditionals. Finally, it
usually needs to interact with an environment, such as by doing
input/output with a terminal or other user interface device.
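
These roles (parsing, program state, control flow and I/O) can be seen even in a toy interpreter. The following C sketch interprets a made-up language in which '+d' adds the digit d to an accumulator and 'p' prints it; the language exists only for this illustration:

#include <stdio.h>

/* Interprets a toy language: '+d' adds digit d to the accumulator,
   'p' prints the accumulator's current value. */
void interpret(const char *src)
{
    int acc = 0;                  /* the interpreter's program state */
    while (*src) {
        switch (*src) {
        case '+':                 /* parse an operator and its operand */
            src++;
            acc += *src - '0';
            break;
        case 'p':                 /* interact with the environment */
            printf("%d\n", acc);
            break;
        }
        src++;                    /* move to the next instruction */
    }
}

int main(void)
{
    interpret("+3+4p");           /* prints 7 */
    return 0;
}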
