

What is a programming language?

A programming language is a notational system for describing computation in a machine-readable and human-readable form.

To build programs, people use languages that are similar to human language. The results are translated into machine code, which computers understand.

Classification of the languages

Programming languages fall into three broad categories:

Machine languages
Assembly languages
Higher-level languages

In the formal classification, these three categories are divided into five generations of languages.

Generations of Programming Languages

1st GL: machine codes
2nd GL: symbolic assemblers
3rd GL: (machine-independent) imperative languages (FORTRAN, Pascal, C ...)
4th GL: domain specific application generators
5th GL: AI languages

Each generation is at a higher level of abstraction: the higher the generation, the less the programmer needs to be aware of how the program works at the hardware level.

A Brief Chronology
Early 1950s: order codes (primitive assemblers)
1957 FORTRAN: the first high-level programming language
1958 ALGOL: the first modern, imperative language
1960 LISP, COBOL: interactive programming; business programming
1962 APL, SIMULA: the birth of OOP (SIMULA)
1964 BASIC, PL/I
1966 ISWIM: first modern functional language (a proposal)
1970 Prolog: logic programming is born
1972 C: the systems programming language
1975 Pascal, Scheme: two teaching languages
1978 CSP: concurrency matures
1978 FP: Backus proposal
1983 Smalltalk-80, Ada: OOP is reinvented
1984 Standard ML: FP becomes mainstream (?)
1986 C++, Eiffel: OOP is reinvented (again)
1988 CLOS, Oberon, Mathematica
1990 Haskell: FP is reinvented
1990s Perl, Python, Ruby, JavaScript: scripting languages become mainstream
1995 Java: OOP is reinvented for the internet
2000 C#

What is Machine Code ?

Machine code is the only form of program instructions that the computer hardware can understand and execute directly. All other forms of computer language must be translated into machine code in order to be executed by the hardware. Machine code consists of many strings of binary digits that are easy for the computer to interpret, but tedious for human beings to read.

Machine code is different for each type of computer. A program in machine code for an Intel x86-based PC will not run on an IBM mainframe computer, and vice versa.

Assembly Language
Assembly language is a symbolic representation of machine code, which allows programmers to write machine-level programs without having to deal with long binary strings. For example, the machine code for an instruction that adds two numbers might be 01101110, but in assembly language this can be represented by the symbol ADD. A simple program called an assembler translates this symbolic language directly into machine code.

Because machine code is specific to each type of computer hardware, assembly languages are also specific to each type of computer. Machine languages and assembly languages for different computers tend to look broadly similar, but they are not interchangeable.

Assembly code -> Assembler -> Object code

High Level Languages

A high-level language is a language that is convenient for human beings to understand. High-level programming languages must be translated into machine code for execution, and this process is called compilation. A program that carries out this translation is a compiler.

A high-level language may bear no resemblance at all to machine code. The compiler figures out how to generate machine language that does exactly what the high-level-language source program specifies. Languages like C++, Algol and COBOL are all compiled high-level languages. They usually work more or less the same across all computer types, which makes them much more portable than assembly language.

Higher-Level Languages: Third-Generation Languages

Third-generation languages (3GLs) are the first to use true English-like phrasing, making them easier to use than previous languages.

3GLs are portable, meaning the source code written for one type of system can be translated (compiled) for use on a different type of system.

The following languages are 3GLs:

C
C++
Java
ActiveX


Higher-Level Languages: Fourth-Generation Languages

Fourth-generation languages (4GLs) are even easier to use than 3GLs.

4GLs may use a text-based environment (like a 3GL) or may allow the programmer to work in a visual environment, using graphical tools.

The following languages are 4GLs:

Visual Basic (VB)
VisualAge
Authoring environments

Higher-Level Languages: Fifth-Generation Languages

Fifth-generation languages (5GLs) are a matter of debate in the programming community; some programmers do not agree that they even exist.

These high-level languages would use artificial intelligence to create software, which makes 5GLs extremely difficult to develop.

5GLs solve problems using constraints rather than algorithms, and are used in artificial intelligence.


Why do we need assembly language?

Early computer systems were literally programmed by hand. Front panel switches were used to enter instructions and data. These switches represented the address, data and control lines of the computer system.

To enter data into memory, the address switches were toggled to the correct address, the data switches were toggled next, and finally the write switch was toggled. This wrote the binary value on the front panel data switches to the address specified. Once all the data and instructions were entered, the run switch was toggled to run the program.
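The switch procedure described above can be sketched as a toy model. This is only an illustration: the class `FrontPanel`, its method names and the 16-word memory are hypothetical, and the three byte values are the bit patterns listed for CLRA, PSHA and TSTA in the mnemonic table later in this document.

```python
class FrontPanel:
    """Toy model of a switch-programmed computer (hypothetical, for illustration)."""

    def __init__(self, memory_size=16):
        self.memory = [0] * memory_size
        self.address_switches = 0
        self.data_switches = 0

    def set_address(self, bits):
        # Each switch is one bit; the operator sets them from a binary string.
        self.address_switches = int(bits, 2)

    def set_data(self, bits):
        self.data_switches = int(bits, 2)

    def toggle_write(self):
        # Deposit the value on the data switches at the selected address.
        self.memory[self.address_switches] = self.data_switches


panel = FrontPanel()
program = ["01001111", "00110110", "01001101"]  # hand-converted bit patterns

for address, word in enumerate(program):
    panel.set_address(format(address, "04b"))  # toggle the address switches
    panel.set_data(word)                       # toggle the data switches
    panel.toggle_write()                       # toggle the write switch
```

Every instruction costs three manual steps, and a single mis-set switch silently stores the wrong value, which is the weakness the following slides describe.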

The programmer also needed to know the instruction set of the processor.

Each instruction needed to be manually converted into bit patterns by the programmer so the front panel switches could be set correctly.

This led to errors in translation, as the programmer could easily misread 8 as the value B.

It became obvious that such methods were slow and error prone.

With the advent of better hardware which could address larger memory, and the increase in memory size (due to better production techniques and lower cost), programs were written to perform some of this manual entry.

Small monitor programs became popular, which allowed entry of instructions and data via hex keypads or terminals.

Additional devices such as paper tape and punched cards became popular as storage methods for programs.

Programs were still hand-coded, in that the conversion from mnemonics to instructions was still performed manually. To increase programmer productivity, the idea of writing a program to translate another program was a major breakthrough. This translator would be run by the computer, and would convert the mnemonics into actual instructions. The benefits of such a program would be:

reduced errors
faster translation times
changes could be made more easily and quickly

Assembly language programming is writing machine instructions in mnemonic form, using an assembler to convert these mnemonics into actual processor instructions and associated data.

In short, the assembler acts as a translator from the human world to the machine world.

Machine Code



Disadvantages of assembly language programming

The programmer requires knowledge of the processor architecture and instruction set.
Many instructions are required to achieve small tasks.
Source programs tend to be large and difficult to follow.
Programs are machine dependent, requiring complete rewrites if the hardware is changed.

Software development process

The Real World Problem -> Logical Solution (On paper) -> Selecting tools (programming language/method) -> Coding -> Testing/Debugging -> Release as final solution

The program translation sequence

When developing a software program to accomplish a particular task, the implementer chooses an appropriate language, develops the algorithm (a sequence of steps which, when carried out in the order prescribed, achieve the desired result), implements this algorithm in the chosen language (coding), then tests and debugs the final result.

Software execution process

Machine code or the executable code which the machine understands

Assembly language programming

Features provided by an assembler are:

It allows the programmer to use mnemonics when writing source code programs.
Variables are represented by symbolic names, not as memory locations.
Symbolic code is easier to read and follow.
Error checking is provided.
Changes can be quickly and easily incorporated with a re-assembly.
Programming aids are included for relocation and expression evaluation.

In writing assembly language programs for micro-computers, it is essential that a standardized format be followed. Most manufacturers provide assemblers, which are programs used to generate machine code instructions for the actual processor to execute.

The assembler converts the written assembly language source program into a format which runs on the processor. Each machine code instruction (the binary or hex value) is represented by a mnemonic. A mnemonic is an abbreviation which represents the actual instruction.

Binary     Hex   Mnemonic   Description
01001111   4F    CLRA       Clears the A accumulator
00110110   36    PSHA       Saves the A accumulator on the stack
01001101   4D    TSTA       Tests the A accumulator for 0
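The mnemonic table above amounts to a lookup from names to byte values. The sketch below shows that idea using only the three entries listed; the names `OPCODES`, `assemble` and `disassemble` are illustrative, and a real assembler handles operands, labels and many more instructions.

```python
# Mnemonic -> opcode byte, taken from the table above.
OPCODES = {"CLRA": 0x4F, "PSHA": 0x36, "TSTA": 0x4D}


def assemble(mnemonics):
    """Translate a list of operand-free mnemonics into machine code bytes."""
    return bytes(OPCODES[name.upper()] for name in mnemonics)


def disassemble(code):
    """Translate machine code bytes back into mnemonics."""
    reverse = {value: name for name, value in OPCODES.items()}
    return [reverse[byte] for byte in code]


machine_code = assemble(["CLRA", "PSHA", "TSTA"])

# Print each byte in binary and hex, mirroring the table columns.
for byte in machine_code:
    print(f"{byte:08b}  {byte:02X}")
```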

Mnemonics are used because they

are more meaningful than hex or binary values
reduce the chances of making an error
are easier to remember than bit values

Assemblers also accept certain characters as representing number bases and addressing modes.

Notation               Meaning                Example
$ prefix or h suffix   hexadecimal number     $24 or 24h
D suffix               decimal number         24D, 67
B suffix               binary number          0101111B
O or Q suffix          octal number           377O, 232Q
# prefix               immediate addressing   LDAA #$34
X                      indexed addressing     LDAA 01,X
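As a rough sketch of how an assembler might interpret these number notations, the function below converts a numeric token to its integer value. The function name `parse_number` is hypothetical, and treating an unmarked number as decimal is an assumption based on the example 67; actual assemblers differ in the details.

```python
def parse_number(text):
    """Convert an assembler-style numeric token to an integer."""
    token = text.strip().upper()
    if not token:
        raise ValueError("empty numeric token")
    if token.startswith("$"):
        # $ prefix: hexadecimal
        return int(token[1:], 16)
    suffix_bases = {"H": 16, "D": 10, "B": 2, "O": 8, "Q": 8}
    if token[-1] in suffix_bases:
        # The trailing letter selects the base; the rest are the digits.
        return int(token[:-1], suffix_bases[token[-1]])
    # No prefix or suffix: assumed decimal.
    return int(token, 10)


for example in ["$24", "24h", "24D", "67", "0101111B", "377O", "232Q"]:
    print(example, "=", parse_number(example))
```

Checking the `$` prefix first matters: hexadecimal digits include B and D, so a value such as $1B would otherwise be mistaken for a binary number.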

Assembly language statements are written one per line. An assembly language program thus consists of a sequence of statements, where each statement contains a mnemonic. Each line of an assembly language program is split into four fields: label, operation (mnemonic), operand and comment.
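Splitting a statement into its fields can be sketched as follows. The sketch assumes the four fields are label, mnemonic, operand and comment, that a label ends with a colon, and that a comment starts with a semicolon. The field names and the comment character are assumptions for illustration, and the names `Statement` and `parse_statement` are hypothetical.

```python
from typing import NamedTuple, Optional


class Statement(NamedTuple):
    label: Optional[str]
    mnemonic: Optional[str]
    operand: Optional[str]
    comment: Optional[str]


def parse_statement(line):
    """Split one source line into label, mnemonic, operand and comment."""
    # Everything after ';' is treated as a comment (assumed convention).
    code, _, comment = line.partition(";")
    comment = comment.strip() or None

    tokens = code.split()
    label = None
    if tokens and tokens[0].endswith(":"):
        label = tokens.pop(0)[:-1]  # drop the colon suffix

    mnemonic = tokens[0] if tokens else None
    operand = " ".join(tokens[1:]) or None
    return Statement(label, mnemonic, operand, comment)


print(parse_statement("START: LDAA #24H ; load accumulator A"))
print(parse_statement("       JMP START"))
```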

The label field is optional. A label is an identifier (or text string symbol). Labels are used extensively in programs to reduce reliance upon programmers remembering where data or code is located. A label can be used to refer to

a memory location
the value of a piece of data
the address of a program, sub-routine, code portion etc.

The maximum length of a label differs between assemblers. Some accept labels up to 32 characters long, others only four characters.

A label, when declared, is suffixed by a colon, and begins with a valid character (A..Z).

For example, in the statement START: LDAA #24H, the label START is equal to the address of the instruction LDAA.


The label is used in the program as a reference, e.g. JMP START

This would result in the processor jumping to the location (address) associated with the label START, thus executing the instruction LDAA #24H immediately after the JMP instruction. When a label is referenced later on in the program, it is done so without the colon suffix.

An advantage of using labels is that inserting or re-arranging code statements does not necessitate re-working actual machine instructions.

A simple re-assembly is all that is required. In hand-coding, such changes can take hours to perform.
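The way labels survive a re-assembly can be sketched with a minimal two-pass assembler: the first pass records the address of every label, and the second pass emits bytes and fills in label references. The text names no processor, so the opcode values and sizes below are an assumption that follows the Motorola 6800 encoding, and the function name `assemble` and the origin address 0x0100 are chosen for illustration.

```python
# Mnemonic -> (opcode byte, instruction size in bytes).
# Assumed Motorola 6800 encodings; only three instructions are supported.
OPCODES = {
    "CLRA": (0x4F, 1),  # no operand
    "LDAA": (0x86, 2),  # immediate mode: opcode + 8-bit value
    "JMP": (0x7E, 3),   # extended mode: opcode + 16-bit address
}


def assemble(lines, origin=0):
    """Assemble source lines; return (machine code bytes, symbol table)."""
    # Pass 1: walk the source and record the address of each label.
    symbols = {}
    statements = []
    address = origin
    for line in lines:
        tokens = line.split()
        if tokens and tokens[0].endswith(":"):
            symbols[tokens.pop(0)[:-1]] = address
        if not tokens:
            continue
        mnemonic = tokens[0]
        operand = tokens[1] if len(tokens) > 1 else None
        statements.append((mnemonic, operand))
        address += OPCODES[mnemonic][1]

    # Pass 2: emit bytes, replacing label references with addresses.
    code = bytearray()
    for mnemonic, operand in statements:
        opcode, _size = OPCODES[mnemonic]
        code.append(opcode)
        if mnemonic == "LDAA":
            # Operand such as #24H: strip the '#' and the hex suffix.
            code.append(int(operand.lstrip("#").rstrip("Hh"), 16))
        elif mnemonic == "JMP":
            code += symbols[operand].to_bytes(2, "big")
    return bytes(code), symbols


source = ["START: LDAA #24H", "       JMP START"]
code, symbols = assemble(source, origin=0x0100)
print(code.hex(" "), symbols)

# Insert an instruction in front and simply re-assemble:
# START moves, and the JMP operand is updated automatically.
code2, symbols2 = assemble(["       CLRA"] + source, origin=0x0100)
print(code2.hex(" "), symbols2)
```

In the second run the address of START shifts by one byte and the JMP target follows it without any edit to the JMP statement, which is the re-working that hand-coding would require.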