Levels of Programming Languages

Levels of Programming Languages
There is only one programming language that any computer can actually understand and execute: its own native binary machine code. This is the lowest possible level of language in which it is possible to write a computer program. All other languages are said to be high level or low level according to how closely they can be said to resemble machine code. In this context, a low-level language corresponds closely to machine code, so that a single low-level language instruction translates to a single machinelanguage instruction. A high-level language instruction typically translates into a series of machine-language instructions.
Low-level languages or assembly level languages have the advantage that they can be written to take advantage of any peculiarities in the architecture of the central processing unit (CPU) which is the "brain" of any computer. Thus, a program written in a low-level language can be extremely efficient, making optimum use of both computer memory and processing time. However, to write a low-level program takes a substantial amount of time, as well as a clear understanding of the inner workings of the processor itself. Therefore, lowlevel programming is typically used only for very small programs, or for segments of code that are highly critical and must run as efficiently as possible. High-level languages permit faster development of large programs. The final program as executed by the computer is not as efficient, but the savings in programmer time generally far outweigh the inefficiencies of the finished product. This is because the cost of writing a program is nearly constant for each line of code, regardless of the language. Thus, a high-level language where each line of code translates to 10 machine instructions costs only one tenth as much in program development as a low-level language where each line of code represents only a single machine instruction.
In addition to the distinction between high-level and low-level languages, there is a further distinction between compiler languages and interpreter languages. Let's take a look at the various levels. Absolute Machine Code The very lowest possible level at which you can program a computer is in its own native machine code, consisting of strings of 1's and 0's and stored as binary numbers. The main problems with using machine code directly are that it is very easy to make a mistake, and very hard to find it once you realize the mistake has been made.
Assembly Language Assembly language is nothing more than a symbolic representation of machine code, which also allows symbolic designation of memory locations. Thus, an instruction to add the contents of a memory location to an internal CPU register called the accumulator might be add a number instead of a string of binary digits (bits). No matter how close assembly language is to machine code, the computer still cannot understand it. The assembly-language program must be translated into machine code by a separate program called anassembler. The assembler program recognizes the character strings that make up the symbolic names of the various machine operations, and substitutes the required machine code for each instruction. At the same time, it also calculates the required address in memory for each symbolic name of a memory location, and substitutes those addresses for the names. The final result is a machine-language program that can run on its own at any time; the assembler and the assembly-language program are no longer needed. To help distinguish between the "before" and "after" versions of the program, the original assembly-language program is also known as the source code, while the final machinelanguage program is designated the object code. If an assembly-language program needs to be changed or corrected, it is necessary to make the changes to the source code and then reassemble it to create a new object program.
Compiler Language
Compiler languages are the high-level equivalent of assembly language. Each instruction in the compiler language can correspond to many machine instructions. Once the program has been written, it is translated to the equivalent machine code by a program called a compiler. Once the program has been compiled, the resulting machine code is saved separately, and can be run on its own at any time. As with assembly-language programs, updating or correcting a compiled program requires that the original (source) program be modified appropriately and then recompiled to form a new machinelanguage (object) program. Typically, the compiled machine code is less efficient than the code produced when using assembly language. This means that it runs a bit more slowly and uses a bit more memory than the equivalent assembled program. To offset this drawback, however, we also have the fact that it takes much less time to develop a compiler-language program, so it can be ready to go sooner than the assembly-language program.
Interpreter Language An interpreter language, like a compiler language, is considered to be high level. However, it operates in a totally different manner from a compiler language. Rather, the interpreter program resides in memory, and directly executes the high-level program without preliminary translation to machine code. This use of an interpreter program to directly execute the user's program has both advantages and disadvantages. The primary advantage is that you can run the program to test its operation, make a few changes, and run it again directly. There is no need to recompile because no new machine code is ever produced. This can enormously speed up the development and testing process. On the down side, this arrangement requires that both the interpreter and the user's program reside in memory at the same time. In addition, because the interpreter has to scan the user's program one line at a time and execute internal portions of itself in response, execution of an interpreted program is much slower than for a compiled program.
Assembly language
An assembly language is a low-level programming language for computers, microprocessors, microcontrollers, and other programmable devices. It implements a symbolic representation of the machine codes and other constants needed to program a given CPU architecture. This representation is usually defined by the hardware manufacturer, and is based on mnemonics that symbolize processing steps (instructions), processor registers, memory locations, and other language features. An assembly language is thus specific to a certain physical (or virtual) computer architecture. This is in contrast to most high-level programming languages, which, ideally, are portable.
Why use an assembly language?

A simple assembler converts each assembly language statement into the corresponding machine-language statement, so at first glance seems merely a minor convenience, substituting obscure machine instructions by easily remembered names. However, consider a machine-language program loaded into memory from location 0, which has an instruction which is the machine-code equivalent of jump to instruction 25 or jump forward skipping 5 instructions, followed by other instructions. If written in assembly language, assembled, and loaded it will produce exactly the same code, although it will have convenient mnemonics and use symbolic labels rather than absolute locations, as in jumpto label4, instead of something like hexadecimal number A13528CD, the machine code for the instruction. However, if the program is modified in a way which changes the number of instructions, the destination of the jump may no longer be at location 25 or 5 instructions ahead, meaning the entire machine-language program must be modified every time such a change is made; to correct all jump destinations. If symbols have been used in an assembly language to identify the destination of the jump, the programmer need only work on the changed parts with no regard to anything else; the assembler will assemble the modified program with all jumps and so on adjusted to remain correct. Similarly, if memory availability requires the program to be loaded starting at location 123 instead of 0, the assembler will adjust all references to suit. Assembly language has many more such advantages over machine language.
Assembly language
A program written in assembly language consists of a series of (mnemonic) processor instructions and metastatements (known variously as directives, pseudo-instructions and pseudo-ops), comments and data. Assembly language instructions usually consist of an opcode mnemonic followed by a comma-separated list of data, arguments or parameters.[4]These are translated by anassembler to a stream of executable instructions that can be loaded into memory and executed. Assemblers can also be used to produce blocks of data from formatted and commented source code, to be used by other code. Take, for example, the instruction that tells an x86/IA-32 processor to move an immediate 8-bit value into a register. The binary code for this instruction is 10110 followed by a 3-bit identifier for which register to use. The identifier for the AL register is 000, so the following machine code loads the AL register with the data 01100001.[5]
10110000 01100001 This binary computer code can be made more human-readable by expressing it in hexadecimal as follows
B0 61 Here, B0 means 'Move a copy of the following value into AL', and 61 is a hexadecimal representation of the value 01100001, which is 97 in decimal. Intel assembly language provides the mnemonicMOV (an abbreviation of move) for instructions such as this, so the machine code above can be written as follows in assembly language, complete with an explanatory comment if required, after the semicolon. This is much easier to read and to remember. MOV AL, 61h ; Load AL with 97 decimal (61 hex)
At one time many assembly language mnemonics were three letter abbreviations, such as JMP for jump, INC for increment, etc. Modern processors have a much larger instruction set and many mnemonics are now longer, for example FPATAN for "floating point partial arctangent" and BOUND for "check array index against bounds". The same mnemonic MOV refers to a family of related opcodes to do with loading, copying and moving data, whether these are immediate values, values in registers, or memory locations pointed to by values in registers. The opcode 10110000 (B0) copies an 8-bit value into the AL register, while 10110001 (B1) moves it into CL and 10110010 (B2) does so into DL. Assembly language examples for these follow.[5] MOV AL, 1h MOV CL, 2h MOV DL, 3h ; Load AL with immediate value 1 ; Load CL with immediate value 2 ; Load DL with immediate value 3
The syntax of MOV can also be more complex as the following examples show. [6] ; Move the 4 bytes in memory at the address contained in EBX into EAX MOV EAX, [EBX] MOV [ESI+EAX], CL ; Move the contents of CL into the byte at address ESI+EAX In each case, the MOV mnemonic is translated directly into an opcode in the ranges 88-8E, A0-A3, B0-B8, C6 or C7 by an assembler, and the programmer does not have to know or remember which. [5] Transforming assembly language into machine code is the job of an assembler, and the reverse can at least partially be achieved by a disassembler. Unlike high-level languages, there is usually a one-to-one correspondence between simple assembly statements and machine language instructions. However, in some cases, an assembler may provide pseudoinstructions (essentially macros) which expand into several machine language instructions to provide commonly needed functionality. For example, for a machine that lacks a "branch if greater or equal" instruction, an assembler may provide a pseudoinstruction that expands to the machine's "set if less than" and "branch if zero (on the result of the set instruction)". Most full-featured assemblers also provide a rich macro language (discussed below) which is used by vendors and programmers to generate more complex code and data sequences. Each computer architecture and processor architecture usually has its own machine language. On this level, each instruction is simple enough to be executed using a relatively small number of electronic circuits. Computers differ by the number and type of operations they support. For example, a machine with a 64-bit word length would have different circuitry from a 32-bit machine. They may also have different sizes and numbers of registers, and different representations of data types in storage. While most general-purpose computers are able
to carry out essentially the same functionality, the ways they do so differ; the corresponding assembly languages may reflect these differences. Multiple sets of mnemonics or assembly-language syntax may exist for a single instruction set, typically instantiated in different assembler programs. In these cases, the most popular one is usually that supplied by the manufacturer and used in its documentation.
Typical applications
Hard-coded assembly language is typically used in a system's boot ROM (BIOS on IBMcompatible PC systems). This low-level code is used, among other things, to initialize and test the system hardware prior to booting the OS, and is stored in ROM. Once a certain level of hardware initialization has taken place, execution transfers to other code, typically written in higher level languages; but the code running immediately after power is applied is usually written in assembly language. The same is true of most boot loaders. Many compilers render high-level languages into assembly first before fully compiling, allowing the assembly code to be viewed for debugging and optimization purposes. Relatively low-level languages, such as C, often provide special syntax to embed assembly language directly in the source code. Programs using such facilities, such as the Linux kernel, can then construct abstractions using different assembly language on each hardware platform. The system's portable code can then use these processor-specific components through a uniform interface. Assembly language is also valuable in reverse engineering, since many programs are distributed only in machine code form, and machine code is usually easy to translate into assembly language and carefully examine in this form, but very difficult to translate into a higher-level language. Tools such as the Interactive Disassembler make extensive use of disassembly for such a purpose. One niche that makes use of assembly language is the demoscene. Certain competitions require contestants to restrict their creations to a very small size (e.g. 256B, 1KB, 4KB or 64 KB), and assembly language is the language of choice to achieve this goal.[23] When resources, especially CPU processing-constrained systems, like the earlier Amiga models, and the Commodore 64, are a concern, assembler coding is a must. Optimized assembler code is written "by hand" and instructions are sequenced manually by programmers in an attempt to minimize the number of CPU cycles used. The CPU constraints are so great that every CPU cycle counts. However, using such methods has enabled systems like the Commodore 64 to produce real-time 3D graphics with advanced effects, a feat which might be considered unlikely or even impossible for a system with a 1.02MHz processor.
High-level programming language

A high-level programming language is a programming language with strong abstraction from the details of the computer. In comparison to low-level programming languages, it may use natural language elements, be easier to use, or be from the specification of the program, making the process of developing a program simpler and more understandable with respect to a low-level language. The amount of abstraction provided defines how "high-level" a programming language is.[1]
The first high-level programming language to be designed for a computer was Plankalkl, created by Konrad Zuse. However, it was not implemented in his time, and his original contributions were isolated from other developments.
Features
"High-level language" refers to the higher level of abstraction from machine language. Rather than dealing with registers, memory addresses and call stacks, high-level languages deal with variables, arrays, objects, complex arithmetic or boolean expressions, subroutines and functions, loops, threads, locks, and other abstract computer science concepts, with a focus on usability over optimal program efficiency. Unlike low-level assembly languages, high-level languages have few, if any, language elements that translate directly into a machine's native opcodes. Other features, such as string handling routines, object-oriented language features, and file input/output, may also be present.
Relative meaning
The terms high-level and low-level are inherently relative. Some decades ago, the C language, and similar languages, were most often considered "high-level", as it supported concepts such as expression evaluation, parameterised recursive functions, and data types and structures, while assembly language was considered "low-level". Many programmers today might refer to C as low-level, as it lacks a large runtimesystem (no garbage collection, etc.), basically supports only scalar operations, and provides direct memory addressing. It, therefore, readily blends with assembly language and the machine level of CPUs and microcontrollers. Assembly language may itself be regarded as a higher level (but often still one-to-one if used without macros) representation of machine code, as it supports concepts such as constants and (limited) expressions, sometimes even variables, procedures, and data structures. Machine code, in its turn, is inherently at a slightly higher level than the microcode or micro-operations used internally in many processors. [edit]Execution
models
There are three models of execution for modern high-level languages: Interpreted Interpreted languages are read and then executed directly, with no compilation stage. A program called an interpreter reads each program line following the program flow, converts it to machine code, and executes it; the machine code is then discarded, to be interpreted anew if the line is executed again. Compiled Compiled languages are transformed into an executable form before running. There are two types of compilation: Machine code generation Some compilers compile source code directly into machine code. This is the original mode of compilation, and languages that are directly and completely transformed to machine-native code in this way may be called "truly compiled" languages. Intermediate representations
When a language is compiled to an intermediate representation, that representation can be optimized or saved for later execution without the need to re-read the source file. When the intermediate representation is saved, it is often represented as byte code. The intermediate representation must then be interpreted or further compiled to execute it. Virtual machines that execute byte code directly or transform it further into machine code have blurred the once clear distinction between intermediate representations and truly compiled languages. Translated A language may be translated into a lower-level programming language for which native code compilers are already widely available. The C programming language is a common target for such translators.

Levels of Programming Languages

Transféré par

Informations du document

Description originale:

Titre original

Copyright

Formats disponibles

Partager ce document

Partager ou intégrer le document

Options de partage

Avez-vous trouvé ce document utile ?

Ce contenu est-il inapproprié ?

Droits d'auteur :

Formats disponibles

Levels of Programming Languages

Transféré par

Droits d'auteur :

Formats disponibles

Levels of Programming Languages

Why use an assembly language?

High-level programming language

Vous aimerez peut-être aussi