
Advanced Computer Architecture, Summer 2006
Design and Development of an Assembler for DSP Processor

Purpose

The main objective of this project is to understand assembly language, machine language and the assembly process.

Background
Typically, an assembler is a program that reads an assembly source file (usually .asm) and generates an object (.o) file containing the corresponding machine language. Object files can be generated in a number of file formats; the most widely used is the UNIX Executable and Linking Format (ELF). An ELF file consists of sections; the most common are data, bss, and text. Assemblers can be one-pass or multi-pass, depending on how many times the assembly file is scanned to generate the object file. Typically, an assembler makes two passes over the source code. During pass 1 the assembler creates a symbol table. Each binding in the table relates a label to an offset within a particular section. During pass 2 the assembler uses that symbol table to generate the sections that comprise the object code.
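The first pass can be sketched in C as follows. This is only an illustration under stated assumptions, not the required design: the Symbol and SymbolTable types, the function names, the fixed table sizes, and the 4-byte instruction size (INSN_BYTES) are all hypothetical, and the sketch assumes labels start in the first column and lines carry no comments. Pass 2 would call symtab_lookup to resolve each label operand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 64
#define MAX_LABEL   32
#define INSN_BYTES  4   /* assumed instruction size, not taken from the design document */

typedef struct {
    char name[MAX_LABEL];
    unsigned offset;            /* offset of the label within its section */
} Symbol;

typedef struct {
    Symbol entries[MAX_SYMBOLS];
    size_t count;
} SymbolTable;

/* Returns the offset bound to `name`, or -1 if the label is unknown. */
long symtab_lookup(const SymbolTable *t, const char *name)
{
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->entries[i].name, name) == 0)
            return (long)t->entries[i].offset;
    return -1;
}

/* Pass 1: bind each "label:" to the current offset.
   Returns 0 on success, -1 on a bad or duplicate label or a full table. */
int pass1(const char *const lines[], size_t n, SymbolTable *t)
{
    unsigned offset = 0;
    assert(lines != NULL && t != NULL);
    t->count = 0;
    for (size_t i = 0; i < n; i++) {
        const char *rest = lines[i];
        const char *colon = strchr(lines[i], ':');
        if (colon != NULL) {
            size_t len = (size_t)(colon - lines[i]);
            char name[MAX_LABEL];
            if (len == 0 || len >= MAX_LABEL || t->count == MAX_SYMBOLS)
                return -1;
            memcpy(name, lines[i], len);
            name[len] = '\0';
            if (symtab_lookup(t, name) >= 0)
                return -1;                  /* label defined twice */
            strcpy(t->entries[t->count].name, name);
            t->entries[t->count].offset = offset;
            t->count++;
            rest = colon + 1;
        }
        /* Any text after the optional label counts as one instruction. */
        rest += strspn(rest, " \t");
        if (*rest != '\0')
            offset += INSN_BYTES;
    }
    return 0;
}
```

Keeping the lookup separate from the scan means pass 2 can reuse the same table without rescanning the source for label definitions.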

Your Task
Your task in this assignment is to use the C programming language to create an assembler that can process the given set of assembly language instructions for the DSP processor. Your assembler will read a source code file, create an in-memory representation of the assembly language program, perform two passes over it (as described above), and thereby produce an in-memory representation of the program's data, bss, and text sections. To create the in-memory representation, you read the sequence of characters in the source code and convert it into a sequence of tokens. The tokens are used to verify that the program is syntactically valid assembly language. Finally, your assembler generates an object file in ELF format that is the machine language representation of your assembly file.
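Converting characters into tokens could look like the sketch below. The function name tokenize, the MAX_TOKEN limit, and the choice of separators (spaces, tabs, commas) are assumptions made for illustration; the comment character ';' follows the example program in this handout.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_TOKEN 32

/* Splits one source line into tokens separated by whitespace or commas.
   A ';' starts a comment that runs to the end of the line.
   Stores at most `max` tokens in `out` and returns how many were stored,
   or -1 if a token does not fit in MAX_TOKEN characters. */
int tokenize(const char *line, char out[][MAX_TOKEN], int max)
{
    int count = 0;
    const char *p = line;
    assert(line != NULL && out != NULL);
    while (*p != '\0' && *p != ';' && count < max) {
        if (isspace((unsigned char)*p) || *p == ',') {
            p++;                            /* skip separators */
            continue;
        }
        size_t len = strcspn(p, " \t\r\n,;");
        if (len >= MAX_TOKEN)
            return -1;
        memcpy(out[count], p, len);
        out[count][len] = '\0';
        count++;
        p += len;
    }
    return count;
}
```

A label keeps its trailing colon here (for example "loop:"), so a later syntax check can tell labels from mnemonics by looking at the last character of the first token.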

Set of Assembly language Instructions


Please note that all these instructions are taken from the design documents already sent for the semester project, except the MOV and MOVI instructions. MOVI moves an immediate value to a register, and MOV moves a value from one register to another. They can easily be encoded as non-AGU instructions using the unused opcodes 110 and 111 (refer to table 2 of the design document). For MOVI, the info field carries the two MSBs of the 8-bit number and bits 10:5 carry the lower 6 bits, making a total of 8 bits for the immediate value. For MOV instructions, the info field and src1 are unused.

Instruction     Function                             Usage
MOV             move data from reg-reg               MOV R1, R2 (R1 = R2)
MOVI            move immediate value to a register   MOVI R1, 8 (R1 = 8)
ADD_SAT         add with saturation                  ADD_SAT R0, R1, R2 (R0 = R1 + R2)
ADD_NSAT        add without saturation               ADD_NSAT R0, R1, R2 (R0 = R1 + R2)
MULT_US         unsigned*signed                      MULT_US R0, R2, R3 (R0 = R2 * R3)
MULT_SU         signed*unsigned                      MULT_SU R0, R2, R3 (R0 = R2 * R3)
MULT_SS         signed*signed                        MULT_SS R0, R2, R3 (R0 = R2 * R3)
MULT_UU         unsigned*unsigned                    MULT_UU R0, R2, R3 (R0 = R2 * R3)
SHIFT_L         shift right                          SHIFT_L R1, R0, 3 (R1 = R0 >> 3)
SHIFT_L         shift left                           SHIFT_L R1, R0, -1 (R1 = R0 << 1)
SHIFT_A         arithmetic shift right               SHIFT_A R2, R4, 3 (R2 = R4 >> 3)
MAC             multiply and accumulate              MAC R1, R2, R3 (R1 = R1 + R2 * R3)
XOR             XOR operation                        XOR R1, R2, R3 (R1 = R2 xor R3)
OR              OR operation                         OR R1, R2, R3 (R1 = R2 or R3)
AND             AND operation                        AND R1, R2, R3 (R1 = R2 and R3)
NOT             NOT operation                        NOT R1, R2 (R1 = !R2)
LOAD16          16 bit load using address register   LOAD16 R1, *A1 (R1 <- [A1])
LOADI16         16 bit immediate load                LOADI16 R1, 32 (R1 <- [32])
LOAD32          32 bit load using address register   LOAD32 R1, *A1 (R1 <- [A1])
LOADI32         32 bit immediate load                LOADI32 R1, 50 (R1 <- [50])
STORE16         16 bit store                         STORE16 R1, *A1 ([A1] <- R1)
STOREI16        16 bit immediate store               STOREI16 R1, *A1 ([A1] <- R1)
STORE32         32 bit store                         STORE32 R1, *A1 ([A1] <- R1)
STOREI32        32 bit immediate store               STOREI32 R1, *A1 ([A1] <- R1)
JUMP            jump instruction                     JUMP A1 (jump to address A1)
BRANCH/RETURN   same as JUMP                         BRANCH A0 / RETURN A0
LOOP            loop instruction                     LOOP A1, 4 (loop four times starting from A1)
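The MOVI immediate layout described before the instruction table (lower six bits in bits 10:5, the remaining two MSBs in the info field) can be sketched as below. The function names and the 32-bit word type are assumptions; the position of the info field is left to the design document, so the sketch returns those two bits separately instead of placing them.

```c
#include <assert.h>
#include <stdint.h>

/* Upper two bits of the 8-bit immediate; these belong in the info field,
   whose position in the instruction word is defined by the design document. */
uint8_t movi_info_bits(uint8_t imm)
{
    return (uint8_t)(imm >> 6);
}

/* Lower six bits of the immediate, shifted into bits 10:5 of the word. */
uint32_t movi_low_bits(uint8_t imm)
{
    return ((uint32_t)(imm & 0x3Fu)) << 5;
}

/* Rebuilds the immediate from the two parts, as a disassembler would. */
uint8_t movi_join(uint8_t info, uint32_t word)
{
    assert(info <= 3u);
    return (uint8_t)((uint32_t)(info << 6) | ((word >> 5) & 0x3Fu));
}
```

Round-tripping a value through the split and join functions is a quick way to check the bit positions before wiring them into the full encoder.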

Example Program
A dot product program multiplies two arrays of numbers element by element and accumulates the sum of the products, as shown in the following simple C program.

    for (i = 0; i < 10; i++)
        sum += a[i] * b[i];

An assembly version of this program will look like the one shown below.

          XOR R1, R1, R1        ; init R1 for accumulation
          MOVI R2, -1
    loop: LOAD32 R3, *A1        ; A1 contains pointer to array a[]
          LOAD32 R4, *A2        ; A2 contains pointer to array b[]
          MULT_UU R5, R3, R4
          ADD R1, R1, R5        ; sum of products in R1
          ADD R0, R0, R2
          LOOP loop, 9

Your assembler will take code like the one shown above and translate it to the appropriate machine code (refer to the machine code given in the design document).

Assembler Directives
Directives are special keywords that are not part of the original assembly language and hence are not translated to machine language. Instead, the assembler uses them during the assembly process to generate the object file. Your assembler should support some directives like the ones listed below. It is not compulsory to follow only the given directives; they are mentioned here to help you select the type of directives that you may want to have in your assembly code.

EQUATE
The EQU directive is used to make programs easier to write. It creates absolute symbols and aliases by assigning an expression or value to the declared variable name. Its format is

    name: EQU expression

Consider the following statement.

    NUMBER1: EQU 36H

The assembler will replace every occurrence of the label NUMBER1 with the value it has been equated to, i.e., 36 hexadecimal. The statement

    LDAA #NUMBER1

will be interpreted by the assembler as

    LDAA #36H

An absolute symbol represents a 16-bit value; an alias is a name that represents another symbol. The declared name must be unique, one that has not been previously declared. Redefining a previous symbol is normally not allowed.

    NUM1: EQU 20H
    ...
    ...
    NUM1: EQU 30H     ; error
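One possible way to handle EQU definitions and reject redefinition is sketched below. The Equate and EquateTable types, the function names, and the table limits are hypothetical; the sketch stores values only and leaves expression evaluation and aliases out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_EQUATES 32
#define MAX_NAME    32

typedef struct {
    char name[MAX_NAME];
    unsigned value;
} Equate;

typedef struct {
    Equate items[MAX_EQUATES];
    size_t count;
} EquateTable;

/* Finds `name`; returns a pointer to its entry, or NULL if it is not defined. */
const Equate *equ_find(const EquateTable *t, const char *name)
{
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->items[i].name, name) == 0)
            return &t->items[i];
    return NULL;
}

/* Handles "name: EQU value". Returns 0 on success, -1 if the name is
   already defined, is too long, or the table is full. */
int equ_define(EquateTable *t, const char *name, unsigned value)
{
    assert(t != NULL && name != NULL);
    if (equ_find(t, name) != NULL)
        return -1;                          /* redefinition is an error */
    if (t->count == MAX_EQUATES || strlen(name) >= MAX_NAME)
        return -1;
    strcpy(t->items[t->count].name, name);
    t->items[t->count].value = value;
    t->count++;
    return 0;
}
```

When the assembler later meets an operand such as #NUMBER1, it would call equ_find and substitute the stored value before encoding the instruction.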

ORIGIN
This directive specifies the address to be used for the generation of code. Subsequent instruction and data addresses begin at the new value. Normally it is used to set the start address of the program, but it can also set the location counter to any specified value.

    ORG 120H
    LOAD16 #FFH

The statement LOAD16 #FFH begins at byte 120H.

    ORG $ + 2
    start: LOAD32 #34H

The instruction associated with the label start is declared to start at the address 2 bytes beyond the current value of the location counter (specified by $).
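The effect of ORG on the location counter can be sketched as follows. The function names are hypothetical, and the sketch accepts only three operand forms as an assumption: a number (decimal, or hexadecimal with an H suffix), "$", and "$ + number". A full assembler would evaluate general expressions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Parses a decimal number ("32") or a hex number with an H suffix ("120H").
   Returns 0 and stores the value in *out, or -1 on malformed input. */
int parse_number(const char *s, unsigned long *out)
{
    size_t len = strlen(s);
    int base = 10;
    char buf[32];
    char *end;
    if (len == 0 || len >= sizeof buf)
        return -1;
    memcpy(buf, s, len + 1);
    if (buf[len - 1] == 'H' || buf[len - 1] == 'h') {
        buf[len - 1] = '\0';                /* drop the suffix, switch to hex */
        base = 16;
    }
    if (buf[0] == '\0')
        return -1;
    *out = strtoul(buf, &end, base);
    return (*end == '\0') ? 0 : -1;
}

/* Applies "ORG operand" to the location counter *loc.
   Returns 0 on success, -1 if the operand form is not supported. */
int org_apply(unsigned long *loc, const char *operand)
{
    unsigned long n;
    assert(loc != NULL && operand != NULL);
    if (operand[0] == '$') {
        const char *p = operand + 1;
        p += strspn(p, " \t");
        if (*p == '\0')
            return 0;                       /* "ORG $" leaves the counter as is */
        if (*p != '+')
            return -1;
        p++;
        p += strspn(p, " \t");
        if (parse_number(p, &n) != 0)
            return -1;
        *loc += n;                          /* relative to the current counter */
        return 0;
    }
    if (parse_number(operand, &n) != 0)
        return -1;
    *loc = n;                               /* absolute address */
    return 0;
}
```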

.ALIGN directive
Syntax: .align [size in bytes]
Purpose: The .align directive aligns the section program counter (SPC) on the next boundary, depending on the size in bytes parameter. Using the .align directive has two effects:
1. The assembler aligns the SPC on a byte boundary within the current section.
2. The assembler sets a flag that forces the linker to align the section so that individual alignments remain intact when a section is loaded into memory.

if/elseif/else/endif directives

    .if well-defined expression
    .elseif well-defined expression
    .else
    .endif

Purpose: The .if directive marks the beginning of a conditional block. The well-defined expression is a required parameter.

The .elseif directive identifies a block of code to be assembled when the .if expression is false (0) and the .elseif expression is true (nonzero). When the .elseif expression is false, the assembler continues to the next .elseif (if present), .else (if present), or .endif (if no .elseif or .else is present). The .elseif directive is optional in the conditional block, and more than one .elseif can be used. If an expression is false and there is no .elseif statement, the assembler continues with the code that follows a .else (if present) or a .endif.

The .else directive identifies a block of code that the assembler assembles when the .if expression and all .elseif expressions are false (0). This directive is optional in the conditional block; if an expression is false and there is no .else statement, the assembler continues with the code that follows the .endif.

The .endif directive terminates a conditional block. The .elseif and .else directives can be used in the same conditional assembly block, and the .elseif directive can be used more than once within a conditional assembly block.

Error message directives

    .emsg string
    .mmsg string
    .wmsg string

Purpose: These directives allow you to define your own error and warning messages. The assembler tracks the number of errors and warnings it encounters and prints these numbers on the last line of the listing file.

The .emsg directive sends error messages to the standard output device in the same manner as the assembler, incrementing the error count and preventing the assembler from producing an object file.

The .mmsg directive sends assembly-time messages to the standard output device in the same manner as the .emsg and .wmsg directives, but it does not set the error or warning counts, and it does not prevent the assembler from producing an object file.

The .wmsg directive sends warning messages to the standard output device in the same manner as the .emsg directive, but it increments the warning count rather than the error count, and it does not prevent the assembler from producing an object file.

.bss directive
Syntax: .bss symbol, size in bytes [, alignment flag[, bank offset]]

Parameters:
symbol: defines a label that points to the first location reserved by the directive. The symbol should correspond to the name of the variable for which you are reserving space.
size in bytes: must be an absolute expression. The assembler allocates size in bytes in the .bss section. There is no default size.
alignment flag: an optional parameter that ensures that the space allocated to the symbol occurs on the specified boundary. This boundary indicates the size of the slot in bytes and can be set to any power of 2. If the SPC is already aligned to the specified boundary, it is not incremented.
bank offset: an optional parameter. It ensures that the space allocated to the symbol occurs on a specific memory bank boundary. The bank offset value measures the number of bytes to offset from the specified alignment before assigning the symbol to that location.
Purpose: The .bss directive reserves space for variables in the .bss section. This directive is usually used to allocate variables in RAM.

END and Optional Start Address
The END directive specifies the end of the assembly language source listing. It may be followed by an optional entry address. The optional entry address is used by loaders to initialize the program counter before running the program. If no entry address is specified, execution will start at the first location allocated by the assembler.

    END

BYTE STORAGE
The directive used to allocate and initialize bytes (8 bits) of storage is

    DFB    define byte

Its format is

    name: DFB initialvalue,,,

The name portion is optional. Consider the following examples for CRS8.

    value1: DFB 16
    form:   DFB 6*2
    text:   DFB "Enter your name: "

In the first example, the label value1 is assigned a single byte of storage, which is initialized to 16 decimal. The second example allocates a single byte of storage for the label form and initializes it to 12. The last example allocates 17 bytes of storage for the label text. The first byte will be initialized to E, and the last byte will be initialized to an ASCII space.
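Emitting DFB data into a section buffer might look like the sketch below. The ByteBuffer type, its 256-byte capacity, and the function names are assumptions. String operands are stored without a terminating NUL, which matches the 17 bytes stated for the text example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    unsigned char bytes[256];
    size_t len;                 /* number of bytes emitted so far */
} ByteBuffer;

/* DFB with a numeric operand: stores the low 8 bits of `value`.
   Returns 0 on success, -1 if the buffer is full. */
int dfb_value(ByteBuffer *b, unsigned value)
{
    assert(b != NULL);
    if (b->len >= sizeof b->bytes)
        return -1;
    b->bytes[b->len++] = (unsigned char)(value & 0xFFu);
    return 0;
}

/* DFB with a string operand: stores one byte per character,
   without a terminating NUL. Returns 0 on success, -1 if it does not fit. */
int dfb_string(ByteBuffer *b, const char *text)
{
    size_t n;
    assert(b != NULL && text != NULL);
    n = strlen(text);
    if (n > sizeof b->bytes - b->len)
        return -1;
    memcpy(b->bytes + b->len, text, n);
    b->len += n;
    return 0;
}
```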

WORD STORAGE
The directives used to allocate and initialize words (two bytes) of memory storage are

    DWM    define word, most significant byte first
    DWL    define word, least significant byte first

The format is

    name: DWM initialvalue,,,

The name portion is optional.

          DWM 1687H
    mess: DWM 'ab'

The first example allocates one word of storage, holding the value 16H followed by 87H. The second example defines mess as a word initialized with the character values a followed by b. The b will be placed in the low-order byte, and the a will be placed in the high-order byte. If only one character is specified, the high-order byte will contain 0. Strings used with the DW directives must not contain more than two characters.
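The two byte orders and the character packing rule can be sketched as follows. The function names are hypothetical, and pack_chars returning -1 for operands that are empty or longer than two characters is an assumed error convention.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* DWM: writes the word with the most significant byte first. */
void dwm_emit(uint16_t word, uint8_t out[2])
{
    assert(out != NULL);
    out[0] = (uint8_t)(word >> 8);
    out[1] = (uint8_t)(word & 0xFFu);
}

/* DWL: writes the word with the least significant byte first. */
void dwl_emit(uint16_t word, uint8_t out[2])
{
    assert(out != NULL);
    out[0] = (uint8_t)(word & 0xFFu);
    out[1] = (uint8_t)(word >> 8);
}

/* Packs a one- or two-character operand into a word. With two characters
   the first goes in the high-order byte and the second in the low-order
   byte; a single character leaves the high-order byte 0.
   Returns -1 for any other length. */
long pack_chars(const char *s)
{
    size_t n = strlen(s);
    if (n == 1)
        return (long)(unsigned char)s[0];
    if (n == 2)
        return ((long)(unsigned char)s[0] << 8) | (long)(unsigned char)s[1];
    return -1;
}
```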
