Stored Program concept
Main memory storing programs and data
ALU operating on binary data
Control unit interpreting instructions from memory and executing
Input and output equipment operated by control unit
Princeton Institute for Advanced Studies (IAS)
Completed 1952
von Neumann
IAS memory: 1000 words of 40 bits each
Each word holds either one binary number or two 20-bit instructions
Moore's Law
Increased density of components on chip
Gordon Moore - cofounder of Intel
Number of transistors on a chip will double every year
Since 1970s development has slowed a little
Number of transistors doubles every 18 months
Cost of a chip has remained almost unchanged
Higher packing density means shorter electrical paths, giving higher performance
Smaller size gives increased flexibility
Reduced power and cooling requirements
Fewer interconnections increases reliability
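The two doubling rates quoted on this slide can be compared with a little arithmetic. The following is a minimal sketch, assuming an idealised exponential model; the function name `projected_transistors` and the starting count of 1,000 transistors are illustrative assumptions, not figures from the slides.

```python
def projected_transistors(initial_count, years, doubling_period_years):
    """Idealised projection: the count doubles once per doubling period."""
    return initial_count * 2 ** (years / doubling_period_years)


# Hypothetical starting point of 1,000 transistors, projected over 3 years.
yearly = projected_transistors(1000, 3, 1.0)   # doubling every year
slower = projected_transistors(1000, 3, 1.5)   # doubling every 18 months
print(yearly, slower)
```

Over the same three years the yearly rate gives three doublings and the 18-month rate gives two, so the gap between the two projections itself doubles every three years.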
Speeding it up
Pipelining
On board cache
On board L1 & L2 cache
Branch prediction
Data flow analysis
Speculative execution
Performance Mismatch
Processor speed increased
Memory capacity increased
Memory speed lags behind processor speed
Solutions
Increase number of bits retrieved at one time
Make DRAM wider rather than deeper
Intel 8086
The 8086 is a 16-bit microprocessor chip designed by Intel and introduced on the market in 1978, which gave rise to the x86 architecture. Intel 8088, released in 1979, was essentially the same chip, but with an external 8-bit data bus (allowing the use of cheaper and fewer supporting logic chips), and is notable as the processor used in the original IBM PC.
Segmentation
Compilers for the 8086 commonly supported two types of pointer, "near" and "far". Near pointers were 16-bit addresses implicitly associated with the program's code or data segment (and so made sense only in programs small enough to fit in one segment). Far pointers were 32-bit segment:offset pairs. C compilers also supported "huge" pointers, which were like far pointers except that pointer arithmetic on a huge pointer treated it as a flat 20-bit pointer, while pointer arithmetic on a far pointer wrapped around within its initial 64-kilobyte segment.
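The difference between far and huge pointer arithmetic can be sketched in a few lines. This is an illustration only, assuming the standard 8086 real-mode rule that a physical address is the segment shifted left by four bits plus the offset. The function names are hypothetical, and normalising a huge pointer so that its offset stays below 16 is one common convention, assumed here rather than taken from the text.

```python
def physical_address(segment, offset):
    """8086 real-mode address: segment * 16 + offset, kept to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


def far_add(segment, offset, delta):
    """Far-pointer arithmetic: only the 16-bit offset changes, so it wraps
    around inside the original 64-kilobyte segment."""
    return segment, (offset + delta) & 0xFFFF


def huge_add(segment, offset, delta):
    """Huge-pointer arithmetic: add to the flat 20-bit address, then split
    the result back into a segment:offset pair (offset kept below 16)."""
    linear = (physical_address(segment, offset) + delta) & 0xFFFFF
    return linear >> 4, linear & 0xF


print(hex(physical_address(0x1234, 0x0010)))
print(far_add(0x1000, 0xFFFF, 1))    # offset wraps back to 0, same segment
print(huge_add(0x1000, 0xFFFF, 1))   # moves on into the next 64 KB of memory
```

Adding 1 to 1000:FFFF as a far pointer gives 1000:0000, which is 64 kilobytes lower in memory than the byte that was intended. The huge pointer reaches physical address 0x20000.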
To avoid the need to specify "near" and "far" on every pointer and every function which took or returned a pointer, compilers also supported "memory models" which specified default pointer sizes. The "small", "compact", "medium", and "large" models covered every combination of near and far pointers for code and data. The "tiny" model was like "small" except that code and data shared one segment. The "huge" model was like "large" except that all pointers were huge instead of far by default. Precompiled libraries often came in several versions compiled for different memory models.
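The memory models can be summarised as a lookup table. The sketch below is illustrative: the names `MEMORY_MODELS`, `POINTER_BYTES` and `pointer_sizes` are hypothetical. The near/far assignments follow the usual convention of DOS-era C compilers, and the "huge" row follows the description in the text.

```python
# Default (code pointer, data pointer) kinds for each memory model.
MEMORY_MODELS = {
    "tiny":    ("near", "near"),   # like small, but code and data share one segment
    "small":   ("near", "near"),
    "medium":  ("far",  "near"),
    "compact": ("near", "far"),
    "large":   ("far",  "far"),
    "huge":    ("huge", "huge"),   # as described in the text: huge instead of far
}

# Near pointers are 16-bit offsets; far and huge pointers are 32-bit segment:offset pairs.
POINTER_BYTES = {"near": 2, "far": 4, "huge": 4}


def pointer_sizes(model):
    """Return the default (code, data) pointer sizes in bytes for a model."""
    code_kind, data_kind = MEMORY_MODELS[model]
    return POINTER_BYTES[code_kind], POINTER_BYTES[data_kind]


for model in MEMORY_MODELS:
    print(model, pointer_sizes(model))
```

Because the pointer sizes differ between models, a library compiled for one model cannot simply be linked into a program compiled for another, which is why precompiled libraries shipped in several versions.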
Assembly Language
Most modern assemblers include a macro facility and are therefore called macro assemblers. A macro performs textual substitution, e.g., to generate a common short sequence of instructions that runs inline instead of in a subroutine.
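Textual substitution of this kind can be sketched as follows. This is a minimal illustration, not how any particular assembler implements macros: the macro name `SAVE_REGS`, its body, and the function `expand_macros` are hypothetical, and the sketch handles only parameterless macros that occupy a whole source line.

```python
# Hypothetical macro table: macro name -> the instruction lines it stands for.
MACROS = {
    "SAVE_REGS": ["PUSH AX", "PUSH BX"],
}


def expand_macros(lines, macros=MACROS):
    """Replace each line that names a macro with the macro's body, inline."""
    expanded = []
    for line in lines:
        name = line.strip()
        # A line that is not a macro name is copied through unchanged.
        expanded.extend(macros.get(name, [line]))
    return expanded


source = ["SAVE_REGS", "MOV AX, 1"]
print(expand_macros(source))
```

The expanded instructions appear in the program at every point where the macro is used, so there is no call and return overhead, at the cost of repeating the code.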
Assembler
Typically a modern assembler creates object code by translating assembly instruction mnemonics into opcodes, and by resolving symbolic names for memory locations and other entities. The use of symbolic references is a key feature of assemblers, saving tedious calculations and manual address updates after program modifications. Most assemblers also include macro facilities for performing textual substitution, e.g., to generate common short sequences of instructions to run inline, instead of in a subroutine. Assemblers are generally simpler to write than compilers for high-level languages, and have been available since the 1950s. Modern assemblers, especially for RISC based architectures such as MIPS, Sun SPARC and HP PA-RISC, as well as for x86(-64), optimize instruction scheduling to exploit the CPU pipeline efficiently.
There are two types of assemblers, distinguished by how many passes through the source are needed to produce the executable program. One-pass assemblers go through the source code once and assume that all symbols will be defined before any instruction that references them. Two-pass assemblers (and multi-pass assemblers) build a table of symbols in the first pass, then use the second pass to resolve the addresses that were still unknown the first time through. The advantage of one-pass assemblers is speed, which matters less than it once did given advances in computer speed and capabilities. The advantage of the two-pass assembler is that symbols can be defined anywhere in the program source.
As a result, the program can be defined in a more logical and meaningful way. This makes two-pass assembler programs easier to read and maintain.
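The two-pass idea can be sketched for a toy machine. Everything specific here is an assumption made for illustration: the opcode numbers are invented, every instruction is taken to occupy one word, and a label is any line ending in a colon. Real assemblers handle far more (directives, expressions, variable-length instructions).

```python
# Hypothetical toy machine: opcode numbers are invented for illustration.
OPCODES = {"LOAD": 1, "ADD": 2, "JMP": 3, "HALT": 0}


def assemble(source_lines):
    """Assemble toy source into a list of (opcode, operand) pairs."""
    # Pass 1: record the address of every label; collect the instructions.
    symbols, instructions = {}, []
    for line in source_lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            symbols[line[:-1]] = len(instructions)
        else:
            instructions.append(line.split())

    # Pass 2: translate mnemonics and replace symbolic operands by addresses.
    program = []
    for mnemonic, *operands in instructions:
        operand = operands[0] if operands else "0"
        value = symbols[operand] if operand in symbols else int(operand)
        program.append((OPCODES[mnemonic], value))
    return program


source = [
    "JMP end",   # forward reference: 'end' is defined two lines further down
    "LOAD 7",
    "end:",
    "HALT",
]
print(assemble(source))   # [(3, 2), (1, 7), (0, 0)]
```

The `JMP end` on the first line can be translated only because pass 1 has already recorded that `end` is address 2. A one-pass assembler would meet the reference before the definition.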
Language design
Basic elements
Any assembly language consists of 3 types of instruction statements which are used to define the program operations:
opcode mnemonics
data sections
assembly directives
Opcode mnemonics
Instructions (statements) in assembly language are generally very simple, unlike those in high-level languages. Generally, an opcode mnemonic is a symbolic name for a single executable machine language instruction, and there is at least one opcode mnemonic defined for each machine language instruction. Each instruction typically consists of an operation or opcode plus zero or more operands. Most instructions refer to a single value, or a pair of values. Operands can be either immediate (typically one byte values, coded in the instruction itself) or the addresses of data located elsewhere in storage. This is determined by the underlying processor architecture: the assembler merely reflects how this architecture works.
Data sections
There are instructions used to define data elements that hold data and variables. They define the type, length and alignment of the data. These instructions can also define whether the data is available to outside programs (programs assembled separately) or only to the program in which the data section is defined.
Assembler Errors
The previous paragraphs describe what happens if the .ASM file is correct. By correct, I mean that the file is completely comprehensible to the assembler and can be translated into machine instructions without the assembler getting confused. If the assembler encounters something it doesn't understand when it reads a line from the source code file, we call the misunderstood text an error, and the assembler displays an error message. For example, the following line of assembly language will confuse the assembler and summon an error message:

    MOV AX,VX
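The kind of check that trips on this line can be sketched as below. VX is not the name of any 8086 register, so an assembler has to treat it as a symbol, and no such symbol has been defined. The function `check_mov`, its parameter `known_symbols`, and the wording of the messages are hypothetical; real assemblers report errors in their own formats.

```python
# 16-bit general-purpose, index and pointer registers of the 8086.
REGISTERS_16 = {"AX", "BX", "CX", "DX", "SI", "DI", "BP", "SP"}


def check_mov(line, known_symbols=()):
    """Return an error message for a bad MOV line, or None if it looks fine."""
    mnemonic, _, rest = line.strip().partition(" ")
    if mnemonic.upper() != "MOV":
        return f"unknown mnemonic: {mnemonic}"
    for operand in (part.strip().upper() for part in rest.split(",")):
        # An operand is accepted if it is a register, a defined symbol,
        # or a plain decimal number.
        if operand in REGISTERS_16 or operand in known_symbols or operand.isdigit():
            continue
        return f"undefined symbol: {operand}"
    return None


print(check_mov("MOV AX,VX"))   # reports VX as undefined
print(check_mov("MOV AX,BX"))   # None: both operands are registers
```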
Linking
In traditional assembly language work, what actually happens is that the assembler writes an intermediate object code file with an .OBJ extension to disk. You can't run this .OBJ file, even though it generally contains all the machine instructions that your assembly language source code file specified. The .OBJ file needs to be processed by another translator program, the linker. The linker performs a number of operations on the .OBJ file, most of which would be meaningless to you at this point. The most obvious task the linker does is to weave several .OBJ files into a single .EXE file.
Why create multiple .OBJ files when writing a single executable program? One of two major reasons is size. A middling assembly language application might be 50,000 lines long. Cutting that single monolithic .ASM file up into multiple 8,000-line .ASM files would make the individual .ASM files smaller and much easier to understand.
The other reason is to avoid reassembling completed portions of the program every time any part of the program is assembled. One thing you'll be doing is writing assembly language procedures, which are small detours from the main run of steps and tests that can be taken from anywhere within the assembly language program. Once you write and perfect a procedure, you can tuck it away in an .ASM file with other completed procedures, assemble it, and then simply link the resulting .OBJ file with the one produced from the working .ASM file. The alternative is to waste time by reassembling perfected source code over and over again every time you assemble the main portion of the program.
This is shown in the figure above. In the upper-right corner is a row of .OBJ files. These .OBJ files were assembled earlier from correct .ASM files, yielding binary disk files containing ready-to-go machine instructions. When the linker links the .OBJ file produced from your in-progress .ASM file, it adds in the previously assembled .OBJ files, which are called modules. The single .EXE file that the linker writes to disk contains the machine instructions from all of the .OBJ files handed to the linker when the linker is invoked.
Once the in-progress .ASM file is completed and made correct, its .OBJ module can be put up on the rack with the others and added to the next in-progress .ASM source code file. Little by little you construct your application program out of the modules you build one at a time. A very important bonus is that some of the procedures in an .OBJ module may be used in a future assembly language program that hasn't even been begun yet. Creating such libraries of "toolkit" procedures can be an extraordinarily effective way to save time by reusing code over and over again, without even passing it through the assembler again!
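The weaving step can be sketched with a toy model. This is an illustration under stated assumptions, not the real .OBJ or .EXE format: each module is a hypothetical dict whose `code` is a list of instruction strings and whose `exports` maps a procedure name to its position within that module. The function name `link` is also an assumption.

```python
def link(modules):
    """Concatenate modules and resolve CALLs to exported procedure names."""
    image, addresses = [], {}
    # Lay the modules out one after another and note where each export lands.
    for module in modules:
        base = len(image)
        for name, index in module["exports"].items():
            addresses[name] = base + index
        image.extend(module["code"])

    # Replace every symbolic CALL target with its final address.
    linked = []
    for instruction in image:
        op, _, target = instruction.partition(" ")
        if op == "CALL" and target in addresses:
            instruction = f"CALL {addresses[target]}"
        linked.append(instruction)
    return linked


# The in-progress program plus one previously assembled "toolkit" module.
main = {"code": ["CALL print_hello", "HALT"], "exports": {}}
toolkit = {"code": ["NOP", "RET"], "exports": {"print_hello": 0}}
print(link([main, toolkit]))   # ['CALL 2', 'HALT', 'NOP', 'RET']
```

The toolkit module is handed to the linker already assembled. Only its position in the final image, and therefore the address of `print_hello`, is decided at link time.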