If, in the course of reading this document and observing the simulated circuit in action, you find yourself confused and unable to grasp it, I recommend putting the Elementary Microprocessor aside for the time being and instead reading the document entitled A Very Simple Microprocessor by Etienne Sicard. That document describes a processor that performs the most basic functions a processor can perform and still be called a processor. If you are able to comprehend that document, then the Elementary Microprocessor will serve as the next step up in complexity. If you do not already understand the concepts delineated in A Very Simple Microprocessor, then the EM circuit will appear hopelessly complex. However, if Mr. Sicard's design is also too difficult, search the internet for introductory material on Boolean Logic, Combinational Logic, and State Machines, in that order. Don't give up; just begin at the beginning.
1) Overview
When you open up the top-level circuit you will find a collection of components consisting of a Clock, a ROM module (containing the program being run), a RAM module (for temporary storage), a terminal window, a probe that shows you the output of the processor in signed decimal form, a keyboard, a circuit for helping the keyboard and the EM communicate, and finally, the Elementary Microprocessor itself. This is the user-level view of the processor. From this highest-level point, you can run and interact with programs written for and loaded into the EM's ROM.
[The Elementary Microprocessor user-level view]
In order to use the EM it will be necessary to have a passing familiarity with the operation of Logisim and the behavior of its various components. For instance, the clock is a Logisim-supplied component that can be turned on and off, can operate at various speeds, and can be manually stepped one tick at a time (useful for testing and debugging). Another example is the keyboard element. In order for it to capture user input you must use the poke tool (it looks like a hand with the index finger extended) to select it (sometimes you have to poke it more than once). Because these different elements have their own peculiarities, I have separated each one of them from the EM processor itself and placed them in the outside user interface. That way the operation of the EM is seen for itself and the Logisim components only serve as inputs or outputs in relationship to the processor.
2) Inside the Elementary Microprocessor
On the left side of the EM circuit you can see the clock, Phase Generator, and Microinstruction Controller. These, together with the Instruction Register (near the bottom middle) are what read the next appropriate instruction in the ROM and control the other parts of the processor by allowing or preventing their access to the Bus at appropriate times.
[Top Level of the Elementary Microprocessor]
You will also see, in the top-middle and center, a series of labeled probes. These are tools of the Logisim simulator, not part of the circuitry. They serve no purpose except to let you see what is on and what is off or what the value of the current data on the bus is. The purpose of the different signals that the probes are attached to will be explained in later sections. The fact that the probes are there will allow you to look inside the processor while it is running and be able to see exactly what is occurring at any point in time.
The Bus is the common line along which the various parts of the processor can communicate with each other. It is vital to understand that the various parts of the processor do not have any method of directing a message to another specific component. It is simply arranged in such a way that no two parts are writing data to the bus at the same time. If two components, say the ALU and the instruction register, both tried to send data to the bus at the same time, they would both essentially be trying to overwrite the other one. The result is a completely unpredictable condition known as bus contention. It is a fatal flaw if it ever occurs in a circuit. A similar problem occurs if a component that is capable of both writing to the bus and reading from the bus attempts to read and write at the same time. It goes into a cycle of attempting to overwrite itself and it is impossible to predict what the end result will be when the clock ticks and the processor moves on to its next action.
The Microinstruction Controller is the device that ensures that only one component is ever speaking at one time but it also controls who is listening at any given time. Any number of components can listen at any time, but it is only useful to have them listen when you want them to have the data currently being placed on the bus. In most cases, you only want one piece of hardware to be listening at any given point, but certain instructions rely on that freedom to have multiple components read from the bus at the same time. If the processor components are an orchestra then the Microinstruction Controller is the conductor. It accomplishes this task by way of turning on and off control signals at the appropriate times. These control signals are carried by lines running to the various components which control when they are listening and when they are speaking to the bus. How does it know when to turn on and off which devices? We can get into that in more detail when we look directly at its physical structure.
On the right side you will find the brains of the processor, the ALU (Arithmetic Logic Unit), Accumulator, and Status Register. The ALU reads the data that is on the Bus, performs the operation needed (Addition, Subtraction, or any of various logical operations), and can put the product of that operation back onto the bus where other components have the opportunity to read it. At the same time the ALU automatically updates the Status Register. This occurs on its own set of lines separate from the bus.
The purpose of the Status Register is to save data about the operation that the ALU performed, not the product of that operation, and to make this data available to the Microinstruction Controller in case it is needed in a later operation. These bits of data are called flags. There are four flags in the EM circuit: the Carry Flag C, Overflow Flag O, Negative Flag N, and Zero Flag Z. The Carry Flag indicates that the product has a carry beyond the 16 bits available in the accumulator (indicating that the 17th bit was lost and the product will not be as expected). The Overflow Flag indicates a signed overflow. When dealing with signed numbers, one of the bits serves to indicate to the programmer whether the number is positive or negative, but the CPU doesn't know that, so it operates as if the sign bit is just part of the number. This can result in the sign bit being changed when it isn't appropriate. The O flag indicates that the sign bit has been altered in a way that is mathematically impossible. The Negative Flag turns on if the sign bit indicates the number is negative. The Zero Flag turns on if all bits are zero. These flags don't do anything on their own. They only make data about the last operation available to the Microinstruction Controller. You will see four lines traveling from the Status Register to the Microinstruction Controller; these are the Status Flags.
The Accumulator is the processor's short-term memory and is the only register that is available to the programmer. All other registers are special-purpose temporary storage containers that the processor uses for interacting with components outside of the processor itself. The Accumulator is able to read from the bus, write to the bus, and write directly to the ALU. This is vital as the ALU must perform its operations on two pieces of data but the bus can only hold one piece of data at a time. Therefore one piece of data must be on the bus and one must be in the Accumulator in order for the ALU to have anything to work on. Furthermore, in most cases the product of the ALU gets immediately stored into the Accumulator. This is accomplished by the Microinstruction Controller enabling the ALU to write to the bus while simultaneously instructing the Accumulator to Load data from the bus.
3) The Phase Generator
The Phase Generator is the first component we will look into in depth. But first, a note on the line that is indicated by the arrow: this is a reset line. In a real system a reset switch would be connected to every component in the processor. Because this is only a simulation, resetting all the components is handled by Logisim itself as a menu item. I included the basic hardware for resetting every component in the EM circuit, but the reset lines are not attached to anything. In other words, you can safely ignore them during simulation, but it's more accurate to have them there.
Now to the Phase Generator itself. There is one input, the clock. The clock turns on and off at a regular interval. The Phase Generator takes this on/off sequence and breaks it up into a series of four signals. Those four signals are labeled Phase 1 through Phase 4. The clock cycles and Phase 1 turns on. The clock cycles again and Phase 1 turns off, but Phase 2 turns on, etc. This goes on as long as the clock continues turning on and off: Phase 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, and so on. The purpose of this is to impose timing on the processor. Remember that only one component at a time can write to the bus. Phases 1 through 4 are the time slots in which that writing takes place.
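To make the cycling concrete, here is a minimal Python sketch of the behavior described above. It is only a model of the sequence, not of the actual flip-flop circuit, and the names phase_generator and one_hot are my own inventions for illustration.

```python
from itertools import cycle, islice

def phase_generator():
    """Yield the active phase number (1 to 4), one per clock cycle, forever."""
    return cycle([1, 2, 3, 4])

def one_hot(phase):
    """Return the four phase lines as a tuple; exactly one line is on."""
    return tuple(int(phase == p) for p in (1, 2, 3, 4))

# Take the first six clock cycles to see the wrap-around from 4 back to 1.
first_six = list(islice(phase_generator(), 6))
print(first_six)                # [1, 2, 3, 4, 1, 2]
print(one_hot(first_six[1]))    # (0, 1, 0, 0)
```

The one_hot helper reflects the fact that the Phase Generator has four separate output lines, with only one of them on at any moment.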
4) The Instruction Register
The Instruction Register accepts data from the program ROM and sends it to its appropriate places. Those two places are the bus and the Microinstruction Controller. In order to understand this, you first need to know how data is arranged in the ROM. Following is an excerpt from the Introduction to the EM.
This is a 16-bit processor, meaning that the bus between components is 16 bits wide and the ALU can operate on two different 16-bit pieces of data to produce a 16-bit output. For this reason each RAM address is 16 bits wide. The Program ROM, on the other hand, has 32-bit-wide addresses. The Program ROM must have wider addresses in order to accommodate space for the instruction itself and 16 bits for the data that is to be worked upon by the processor. The breakdown is 8 bits for the instruction/operation, 8 bits of free space for possible future changes in instruction set design, and 16 bits for the operand.
Look at the input on the left side of the circuit labeled Data / Instruct In. There are 32 wires. The 16 toward the top are the operand (data which will be operated upon). The 8 lines in the middle are the unused free space (I may do something with them in possible future designs). The 8 at the bottom are the operation (instruction). Every address in ROM must be arranged in this fashion or the processor will not know what to do. If the data in ROM is not properly formatted then it is just a matter of garbage in, garbage out.
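As a rough illustration of that 16/8/8 split, the following Python sketch pulls a 32-bit ROM word apart. Be aware of the assumption baked into it: I am treating the operand as the upper 16 bits of the word and the operation as the lowest 8 bits, which is one plausible reading of "top" and "bottom" but is not confirmed by the circuit. The function name split_rom_word is hypothetical.

```python
def split_rom_word(word):
    """Split a 32-bit ROM word into (operand, unused, opcode).

    Assumed layout (not confirmed by the circuit):
      bits 31..16  operand
      bits 15..8   unused free space
      bits 7..0    operation (instruction)
    """
    if not 0 <= word < 2**32:
        raise ValueError("ROM word must fit in 32 bits")
    operand = (word >> 16) & 0xFFFF
    unused = (word >> 8) & 0xFF
    opcode = word & 0xFF
    return operand, unused, opcode

# Build a word with operand 3 and operation 0000 0100, then split it again.
word = (0x0003 << 16) | 0b00000100
operand, unused, opcode = split_rom_word(word)
print(operand, unused, format(opcode, "08b"))
```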
Let's look at it in proper sequence. Assume that the last instruction executed is done and it is now appropriate for the next instruction in ROM to be loaded and executed. The Microinstruction Controller will turn on its Load Inst (Load Instruction) signal. You will find this at the bottom of the Instruction Register circuit above. There you will also see an input from the clock. Most of the circuits you will see in the EM have a clock input in addition to any other input. This is just to ensure that they are in sync with each other. It would be sloppy and cause errors to have components turning on and off milliseconds out of sync. When Load Inst is activated the Instruction Register reads whatever is in the current address in the Program ROM. The 16 bits representing the data go to the top set of registers and are stored there. The 8 bits representing the instruction go to the bottom set and get stored there. The 8 bits of free space are attached to nothing at all.
If the overall scheme of executing an instruction consists of Fetch, Decode, Execute, and Write-Back, then the previous paragraph describes the Fetch step. Notice that there are four steps to execute an instruction and four phases as described in the above section on the Phase Generator. Yes, there is a connection. Each phase is dedicated to one of the above steps. Fetch is always Phase 1, Decode is always Phase 2, Execute is always Phase 3, and Write-Back is always Phase 4.
Great, so now you've moved the instruction and data from the ROM to the processor. Now what? Well, the instruction portion immediately transmits over its own dedicated line to the Microinstruction Controller. The data portion remains in the Instruction Register until needed in later phases.
5) The Microinstruction Controller
Here's where the magic happens. The Microinstruction Controller accepts two main sets of inputs: the data from the Instruction Register and the signals from the Phase Generator. It uses those two inputs to determine which components of the processor to activate and deactivate and when to do so (Phases 1 through 4) in order to accomplish the purpose of that instruction. In addition to the two inputs described above, it also takes the data from the Status Register as inputs. These do not determine which instruction runs, but they do determine the behavior of instructions that need this data. The Status Flags do not affect most instructions. Despite the daunting appearance of the circuit, the procedure is fairly simple. One thing to know is that only a few parts of this circuit will ever be in use at any given time.
Along the top of the circuit you will see all of the signal outputs that are controlled by the circuitry on the left side. The left side has a special gate for each instruction and only one instruction (one gate) is ever executing at any given time. The top-left corner has all the inputs that, together, determine which one of the gates will turn on and, in the case of instructions that use Status Flags, what the behavior of the gate will be.
The first aspect we will look at is the timing. As discussed earlier, there are four distinct periods of time that cycle over and over. Different things occur at these different periods of time so as to bring about the end result of the instruction being executed. In Phase 1 (Fetch) only two things ever occur: the current instruction in ROM is loaded to the Bus with the Read ROM signal, and the Instruction Register reads that data from the Bus with the Load Inst signal. In Phase 2 (Decode) the signals used depend on the instruction. In all cases the Enable Inst and Prog Count signals occur, which cause the Instruction Register to put the data portion of its contents onto the Bus (see the Instruction Register circuit to view this input and how it causes this) and the Program Counter to increment by one (thus pointing to the next instruction in the Program ROM). In some instructions, other signals manipulate the Program Counter so that it does not increment by one but changes to some other number, pointing to a different instruction in the Program ROM when the flow of the program requires executing an instruction other than the next in line (a very common occurrence).
If you look along the left side you will see a whole series of gates that are separate from each other. Each one of those sets of gates corresponds to one instruction and one instruction only. For instance, if the instruction passed from the Program ROM to the Microinstruction Controller via the Instruction Register was NOP (No Operation) then only the set of gates corresponding to NOP will turn on (that's the topmost set). All others will remain inactive during the course of that instruction. This occurs because the binary representation for NOP is 0000 0000. If you look at the set of gates that correspond to NOP you will see this arrangement:
The leftmost AND gate has all inputs negated (indicated by the small circles). That means if a 0 comes in, it treats the input like a 1 and vice versa. It's like sticking a NOT gate in front of each input. Therefore, if the incoming instruction is 0000 0000 (NOP) then this AND gate would activate because all the inputs were inverted to 1s.
Here's another example:
Looking at the leftmost AND gate you can see that all but the third line are negated. That means the AND gate would only activate if the following instruction was passed to the Microinstruction Controller: 0000 0100. That happens to be the instruction OUT (send whatever is in the Accumulator to the Output Register). Therefore whenever the instruction in ROM is 0000 0100, the Microinstruction Controller will activate this set of gates.
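The two gates above can be modeled in a few lines of Python. This is only a sketch of the idea: each gate is described by the bit pattern it recognizes, where a 0 in the pattern stands for a negated input line and a 1 for a plain one. The function name decoder_gate is hypothetical; the two opcodes are the ones given in the text.

```python
def decoder_gate(opcode, pattern):
    """Model one 8-input AND gate of the instruction decoder.

    pattern is a string such as "00000100". A "0" marks an input with a
    negation circle (it counts as on when the incoming bit is 0); a "1"
    marks a plain input (it counts as on when the incoming bit is 1).
    """
    bits = format(opcode & 0xFF, "08b")
    inputs_on = [
        (bit == "0") if wanted == "0" else (bit == "1")
        for bit, wanted in zip(bits, pattern)
    ]
    return all(inputs_on)  # an AND gate fires only if every input is on

NOP = "00000000"  # every line negated
OUT = "00000100"  # every line negated except the third

print(decoder_gate(0b00000000, NOP))  # True
print(decoder_gate(0b00000100, NOP))  # False
print(decoder_gate(0b00000100, OUT))  # True
```

Because every instruction has its own unique pattern, at most one such gate can fire for any given instruction.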
When one of these gates is activated its output is hardwired to activate certain signals. These are the signals that turn on and off the talking and listening functions of all the other components of the microprocessor. In the example of the NOP instruction no signals are turned on as NOP is No Operation. All the later phases occur, but nothing occurs during the later phases. In the example of OUT, the signal to enable the Accumulator to write its contents to the bus and the signal to the Output Register to load into itself whatever is on the bus are both turned on at the same time during Phase 3 (Execute Phase). This results in the contents of the Accumulator being transferred to the Output Register, thus being available to components outside of the processor.
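Put together with the phases, the controller's behavior for these two instructions can be sketched as a small lookup in Python. The signal names Read ROM, Load Inst, Enable Inst, Prog Count, and Enable A come from this document; "Load Out" is a name I made up for the Output Register's load signal, and the empty Phase 4 for both instructions is my assumption.

```python
FETCH = {"Read ROM", "Load Inst"}        # Phase 1, same for every instruction
DECODE = {"Enable Inst", "Prog Count"}   # Phase 2 for NOP and OUT

# Phase 3 signals per instruction. "Load Out" is a hypothetical signal name.
EXECUTE = {
    0b00000000: set(),                     # NOP: nothing is turned on
    0b00000100: {"Enable A", "Load Out"},  # OUT: Accumulator -> Output Register
}

def control_signals(opcode, phase):
    """Return the set of control signals that are on for this opcode and phase."""
    if phase == 1:
        return set(FETCH)
    if phase == 2:
        return set(DECODE)
    if phase == 3:
        return set(EXECUTE[opcode])
    if phase == 4:
        return set()  # assumed: NOP and OUT have nothing to write back
    raise ValueError("phase must be 1, 2, 3, or 4")

for phase in (1, 2, 3, 4):
    print(phase, sorted(control_signals(0b00000100, phase)))
```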
The final set of inputs to consider is the Status Flags. As mentioned before, they are output by the Status Register automatically and are made available inside the Microinstruction Controller. There is also an additional flag not previously mentioned that does not come from the Status Register. It is the Input Flag I. This flag comes from the Input Register to indicate that the input device (the simulated keyboard) has data available to pass to the processor when the processor is ready. Currently only the Z Flag and I Flag are in use, but the C, O, and N Flags all work correctly, so future additions to the EM's instruction set would have these available if needed.
Let's look at a real-world example of how a flag would assist the function of a program. A common action of a program is to use a number to represent the size of something else. For instance, your program asks the user for their name and stores it in memory for later use. The user types in Joe. These three characters are going to have to be saved in memory somewhere (let's assume addresses 100, 101, and 102). The problem is the programmer doesn't know how long the user's name is, so when the program tries to read back the name from memory, where does it stop reading? Address 100? Or address 103? Or address 181? The solution is to establish a number that represents the length of the user's name and store that in memory as well, then refer back to it when reading the name from memory. Let's say the program stores the name_length integer in address 99 and, because the name is Joe, the integer has a value of 3 (your program would have had to work that out when the name was originally input). When reading the name back from memory the program can decrement the number each time an address is read and then stop reading when the number is 0 (remember the Z Flag turns on when the accumulator's contents are all zeros). Voila!
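Here is the same idea as a short Python sketch. It is not EM machine code; it just mimics the count-down-until-zero loop, with a plain dictionary standing in for RAM and a variable standing in for the Z Flag. The function name read_name and the variable names are mine.

```python
# A dictionary standing in for RAM: the length at 99, the characters after it.
memory = {99: 3, 100: ord("J"), 101: ord("o"), 102: ord("e")}

def read_name(memory, length_addr=99, first_char_addr=100):
    """Read back a name by counting its stored length down to zero."""
    remaining = memory[length_addr]
    zero_flag = remaining == 0        # stands in for the Z Flag
    address = first_char_addr
    chars = []
    while not zero_flag:
        chars.append(chr(memory[address]))
        address += 1
        remaining -= 1                # the decrement is what updates the flag
        zero_flag = remaining == 0
    return "".join(chars)

print(read_name(memory))  # Joe
```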
6) The Status Register
Let's now dig in and look at exactly how the Status Register determines which flags to turn on and when. As you can see, there are three inputs (not counting the clock and reset) and four outputs. The outputs are the four flags as previously mentioned. The three inputs are all directly from the ALU, not via the bus. They are the MSB (Most Significant Bit, the bit farthest to the left and indicating the highest bit position), the Carry bit (a special bit in the ALU that is not part of the 16 bits it can store, but indicates that the product of the mathematical operation is larger than 16 bits; it's the 17th bit), and finally, an input composed of all of the ALU's 16 bits.
We'll take up the Zero Flag first because it's the easiest. It is on the far right of the circuit and passes all of the ALU's bits through a NOR gate. The result is that the flip-flop only turns on if all bits from the ALU are zero.
The Carry Flag is also pretty easy. It is in the center-left position and is turned on if the last operation the ALU did resulted in a product larger than 16 bits. This occurrence results in a bit falling off the end of the ALU. That bit is the input called Carry Bit In on the left side of the circuit. The adder treats all 16 bits of each input as plain magnitude, so this happens whenever the sum of the two bit patterns needs a 17th bit, and it cannot happen when the sum fits in 16 bits.
For the two remaining flags, it is necessary to understand a bit about binary number representation in general and signed numbers in particular. It is also necessary to understand a key point on the ignorance of the processor. The processor manipulates binary digits. Those digits can represent anything a person (the programmer) wants them to represent, such as a number, a picture, a sound, video content, text, etc. The processor knows none of this because it only manipulates binary digits. It is the responsibility of the programmer to instruct the processor such that the processor manipulates the bits in a way that accomplishes the programmer's purpose. When it comes to using binary digits to represent base 10 integers the programmer is faced with a serious problem. How does one represent negative (i.e. signed) numbers? I'll skip all the mathematical technicalities and alternative methods of doing this and just give the basics of how it's done by most programmers nowadays.
Signed representation is done by using the MSB (Most Significant Bit) as the sign bit. If the sign bit is 0, the number is positive and if the sign bit is 1, the number is negative. In a 4-bit system the positive numbers count up from 0000 (zero is considered positive because the MSB is a 0) to 0111. At this point you run out of positive numbers because the next pattern up is 1000, which has the sign bit set and is therefore negative. The negative numbers count down from the top of the range: 1111 is -1, 1110 is -2, 1101 is -3, 1100 is -4, 1011 is -5, 1010 is -6, 1001 is -7 and 1000 is -8. Then you run out of space in the negative direction.
You may notice if you run this all the way out in both directions that you can count down to negative 8, but only up to positive 7. That's because one of the positive numbers is used up representing 0 (0000). You may also notice that if you weren't using one of the bits as a sign bit you could count from 0 (0000) up to 15 (1111) as all positive numbers but could not represent any negative numbers.
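If you would rather let a computer do the counting, this small Python sketch converts between 4-bit patterns and the signed values they stand for under the scheme just described. The function names to_signed and to_pattern are my own.

```python
def to_signed(pattern, bits=4):
    """Interpret a bit string such as "1011" as a signed number."""
    value = int(pattern, 2)
    if value >= 1 << (bits - 1):   # sign bit is set, so the number is negative
        value -= 1 << bits
    return value

def to_pattern(value, bits=4):
    """Return the bit string that represents a signed number."""
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in {bits} signed bits")
    return format(value & ((1 << bits) - 1), f"0{bits}b")

# Print the whole 4-bit range, from -8 up to 7.
for n in range(-8, 8):
    print(to_pattern(n), n)
```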
Now the computer has no idea whatsoever which way you are choosing to represent the numbers. It only processes the binary digits in the fashion that you tell it to and doesn't know if the numbers are positive or negative or not even numbers at all (maybe they represent text or a picture). For this reason, the Status Register always sets its flags as if you were using signed number representation and it's up to the programmer to choose to use them or not as he sees fit. If he's not using signed numbers, he ignores the two Status Flags specifically relating to signed number representation. In the processor's current incarnation, the status flags for signed representation are available for use, but no instruction actually uses them. See the section on the Microinstruction Controller: only the Z Flag and Input Flag are utilized currently, but all flags are available.
The easy one is the Negative Flag. The N flag turns on if the MSB is 1. It's that simple. The complex one is the Overflow Flag. The O flag turns on if the last operation resulted in a product that has an incorrect sign (whether negative or positive). This can occur if the product is larger than 16 bits, in which case the Carry Flag would also turn on. In such a case it's the sign bit that falls off the end of the ALU, resulting in a wrong answer. A specialized case of this is when a product is more negative than possible for the number of bits available. For instance, assume you have a 4-bit system and want to add -4 and -5. This is 1100 + 1011. If you add these together you get the binary number 10111. The leading bit is lost off the end and so your product in the ALU is 0111. You end up with the signed representation for 7 but the correct answer is -9. The method the Status Register uses to determine when to turn on the O Flag is to take the MSB and the Carry Bit and pass them through an XOR gate. When either the MSB or the Carry bit is on but the other isn't, then there was an overflow. If both are on, or both are off, then there is no overflow. In the example above the Carry bit is 1 and the MSB of 0111 is 0, so the O Flag turns on. You can do a few tests of this on paper to demonstrate it for yourself.
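If paper tests are not your thing, the following Python sketch computes all four flags for an addition, following the rules exactly as given in this section (Z from a NOR of all bits, N from the MSB, C from the carry bit, O from MSB XOR Carry). The function name add_with_flags is mine, and the bits parameter is only there so the 4-bit example can be tried. Note that the O rule here is the one this document describes; many textbook designs instead compare the carry into and out of the sign bit, which can give a different answer when the two inputs have opposite signs.

```python
def add_with_flags(a, b, bits=16):
    """Add two bit patterns and return (result, flags) per this section's rules."""
    mask = (1 << bits) - 1
    total = (a & mask) + (b & mask)
    result = total & mask              # what stays inside the ALU
    carry = (total >> bits) & 1        # the extra bit that falls off the end
    msb = (result >> (bits - 1)) & 1   # sign bit of the result
    flags = {
        "C": carry,
        "N": msb,
        "Z": int(result == 0),
        "O": msb ^ carry,              # rule as described in the text
    }
    return result, flags

# The 4-bit example: -4 (1100) plus -5 (1011).
result, flags = add_with_flags(0b1100, 0b1011, bits=4)
print(format(result, "04b"), flags)
```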
7) The Accumulator
The Accumulator is the memory of the Elementary Microprocessor. The EM is a type of processor known as an Accumulator Architecture, so clearly the Accumulator has a significant impact on the processor's design. The basic idea is that the CPU has only one place where it can store a datum for later use. It also means that every operation the ALU performs is going to use the contents of the Accumulator as one of its data. For instance, to add 2 and 3 you must load 2 into the Accumulator, and then instruct the processor to add 3 to whatever is in the Accumulator. The processor then immediately stores the result (5, naturally) back into the Accumulator.
This has its advantages in that the hardware design and the instruction set design are much easier (that's why I decided on an Accumulator Architecture for the EM circuit) but it makes writing programs more complex for the programmer because some actions require multiple instructions whereas a single instruction would do in a more modern processor with multiple places to store data.
To get down to the basics, every product computed by the ALU is immediately stored in the Accumulator. The exceptions are the Compare instructions, where the purpose of the instruction is only to check out the Status Flags and you aren't really interested in the product itself. The programmer can also load a datum into the Accumulator with a variety of Load instructions available in the instruction set that allow him or her to input a value predetermined in the code, or data stored in a memory address. These are done without going through the ALU. Status Flags would not be updated in any of these cases since no mathematical operation occurred; you just moved already existing data from one place to another.
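The rules in the last few paragraphs can be sketched as a toy Python class. It is a deliberately tiny model, tracking only the Accumulator and the Z Flag. The method names load, add, and compare are stand-ins rather than the EM's actual mnemonics, and I am assuming that a Compare works by subtracting, which this document does not state.

```python
class AccumulatorMachine:
    """Toy model: one 16-bit Accumulator plus a Z Flag."""

    def __init__(self):
        self.acc = 0
        self.zero_flag = 0

    def load(self, value):
        # A Load bypasses the ALU, so the flag is left untouched.
        self.acc = value & 0xFFFF

    def add(self, value):
        # The ALU result goes straight back into the Accumulator.
        self.acc = (self.acc + value) & 0xFFFF
        self.zero_flag = int(self.acc == 0)

    def compare(self, value):
        # Assumed to subtract; only the flag is kept, not the result.
        result = (self.acc - value) & 0xFFFF
        self.zero_flag = int(result == 0)

machine = AccumulatorMachine()
machine.load(2)
machine.add(3)
print(machine.acc)        # 5
machine.compare(5)
print(machine.acc, machine.zero_flag)
```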
The Accumulator can take data from the bus as an input when told to do so with the Load A signal and can output its contents to the bus when told to do so with the Enable A signal. Those signals come from the Microinstruction Controller when caused to do so by an instruction.
8) The Arithmetic Logic Unit (ALU)
The Arithmetic Logic Unit (ALU) is probably the most complex component of the processor. It takes two 16-bit inputs, one from the bus and one from the Accumulator (as discussed in the Accumulator section of this document). The inputs are found on the top of the circuit. It has one 16-bit output that leads back to the Bus which is found at the bottom middle of the circuit. It also takes six control signals from the Microinstruction Controller to determine which arithmetic or logical operation to perform. Those are on the right side of the circuit. Finally, it sends three outputs to the Status Register which are found on the bottom left.
Despite the apparent complexity of the circuit, the entire structure can be understood by examining the parts. You will notice that the main portion of the ALU is constructed of 16 identical pieces aligned vertically and positioned next to each other. Each one of these corresponds to one bit of output and is known as an ALU Slice. Being a 16-bit machine, the ALU has 16 slices. If the EM were a 32-bit machine then I would simply include 32 ALU Slices. Follow the input lines in and you will see that the rightmost line from each input goes into the rightmost slice and so on. Thus each slice takes two 1-bit inputs and gives a single 1-bit output. Put them all together and you have a 16-bit ALU.
So let's take a closer look at just one ALU Slice to see what's going on inside. You will see two inputs at the top. These are 1 bit each from the two inputs (Accumulator and Bus) being operated on. There is also a special Carry In input that comes from the previous ALU Slice on the line and a Carry Out output that goes to the next ALU Slice on the line. The last Slice on the line (far left) uses its Carry Out output as the Carry signal that goes to the Status Register. Finally, there are several inputs titled CO (short for control) followed by a number 0, 1, or 2. These are control signals from the Microinstruction Controller that tell the ALU which of its functions to output to the bus. The ALU is capable of Adding, Subtracting, and applying the following logical operations to two inputs: AND, OR, NOT, XOR. However, NOT only applies to a single input (you take a bit pattern and reverse it; 0010 becomes 1101). In that case the input used is the Accumulator, not the bus.
The ALU Slice is basically an adder with a few other structures placed in parallel to the adder. On the far left are five logic gates that make up the adder. I won't get into the intricacies of how an adder works because it is covered in great detail on many other websites, but it does accept two 1-bit inputs and the carry input and adds them together, producing a single 1-bit output and (if applicable) a carry output. The logical functions are handled by one gate each, which can be found to the right of the adder. They are, from right to left, XOR, AND, OR, and NOT.
You will notice that each function the ALU Slice can do produces only a 1-bit output and that those outputs all go into a single circuit before exiting from the ALU Slice. You'll also see that only one 1-bit output leaves. 5 bits enter, 1 bit leaves. This is the ALU Slice Multiplexor. A multiplexor takes multiple inputs and, based on one or more control signals, chooses only 1 of them to output. Here's a diagram of it.
As you can see there are five inputs at the top that correlate to the five functions the ALU can perform (Addition, AND, OR, NOT, and XOR; Subtraction is done by handling the Addition function in a special way which will be covered later). The series of AND gates at the bottom ensures that only one of the several inputs gets output. For example, let's say the ALU is performing an OR operation. Look again at the ALU Slice circuit and you can see that all of the functions are done at the same time and each function sends its output to the ALU multiplexor. Selecting only the OR operation's data for output requires that the Microinstruction Controller turn on the control signal CO1, but not CO0 or CO2. As long as only CO1 is turned on by the Microinstruction Controller, the output of every single ALU Slice will be the output of the OR gate in the ALU Slice. All the other ALU Slice functions will be disregarded even though they did perform the operation they are designed to do.
Looking back at the overall ALU circuit, let's take up each of the signals coming in from the Microinstruction Controller. There is a signal called Load ALU that allows the ALU to accept inputs (again, one from the bus and one from the Accumulator). The Enable ALU signal sends the output to the bus (whereupon the Accumulator can listen for it). The three control signals (CO0, CO1, CO2) control the multiplexors in each of the ALU Slices so as to determine which ALU operation gets output. Finally there is an input called Sub for Subtraction.
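To see the "compute everything, pick one" idea in action, here is a Python sketch of a single ALU Slice. The adder uses the standard five-gate full adder arrangement (two XOR, two AND, one OR). The only select code taken from this document is OR, which is chosen when the middle control signal alone is on; every other code in the table is a made-up placeholder, since the real encodings are not listed here. The function names are also mine.

```python
def full_adder(a, b, carry_in):
    """Standard five-gate full adder: returns (sum_bit, carry_out)."""
    partial = a ^ b                              # XOR gate 1
    sum_bit = partial ^ carry_in                 # XOR gate 2
    carry_out = (a & b) | (partial & carry_in)   # two AND gates and one OR gate
    return sum_bit, carry_out

def alu_slice(acc_bit, bus_bit, carry_in, co0, co1, co2):
    """One ALU Slice: every function is computed, the multiplexor picks one."""
    add_bit, carry_out = full_adder(acc_bit, bus_bit, carry_in)
    # Keys are (co0, co1, co2). Only the OR entry comes from the text;
    # the other four encodings are hypothetical placeholders.
    candidates = {
        (0, 0, 0): add_bit,
        (0, 1, 0): acc_bit | bus_bit,   # OR: only the middle control signal on
        (1, 0, 0): acc_bit & bus_bit,
        (1, 1, 0): acc_bit ^ bus_bit,
        (0, 0, 1): 1 - acc_bit,         # NOT uses the Accumulator bit only
    }
    return candidates[(co0, co1, co2)], carry_out

print(alu_slice(1, 0, 0, co0=0, co1=1, co2=0))  # OR of 1 and 0
```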
The ALU does not have a dedicated subtraction function inside the ALU Slices. One could develop a circuit to handle subtraction of binary numbers, but it is extremely complex and gets more and more complex the larger the number of bits involved. If I made a dedicated circuit that just did 16-bit subtraction directly it would be, by far, the most complicated component of the processor. To get around this, a technique was developed that arrives at the correct answer with much less circuitry. The way it is done is to 1) reverse every single bit in the number being subtracted, 2) add 1 to that number, and 3) add the two main numbers together. Lo and behold, you get the same result as if you had simply subtracted one from the other. I'm not qualified to explain exactly why this works mathematically, but it is the method used in modern microprocessors and every example problem I have ever worked out on paper has come out with a correct answer.
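One way to see why the three steps work for 16-bit numbers: reversing every bit of b gives 65535 - b, adding 1 gives 65536 - b, and adding that to a gives a - b + 65536. The extra 65536 is a carry out of the top bit, which does not fit in 16 bits and is dropped, leaving a - b. A minimal Python sketch of the same steps follows; the names MASK and subtract are illustrative and do not come from the EM design.

```python
MASK = 0xFFFF  # keep results to 16 bits, like the EM's data path


def subtract(a, b):
    """Compute a - b using only inversion and addition."""
    inverted = ~b & MASK              # step 1: reverse every bit of b
    negated = (inverted + 1) & MASK   # step 2: add 1
    return (a + negated) & MASK       # step 3: add; the carry out of bit 15 is dropped
```

For example, subtract(7, 5) gives 2. subtract(5, 7) gives 0xFFFE, which is the 16-bit pattern that represents -2 when the result is read as a signed number.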
This is implemented in the circuitry with an XOR gate just above each ALU Slice. The XOR gate has, as its inputs, the Sub signal and the input from the bus (not the Accumulator) that corresponds to that particular ALU Slice. Below is a truth table for a two-input XOR gate, labeled as you would find it in the ALU.
Input   Sub   Output
  0      0      0
  1      0      1
  0      1      1
  1      1      0
As you can see, whenever the Sub signal is off (first two rows) the Input signal passes through unaltered, as if there were no gate there at all. But if the Sub signal is on (bottom two rows) then the output is the reverse of the input. This is exactly what you want for a subtraction as described above.
The final detail that needs to be accounted for in subtraction is the addition of one to the reversed number. This is done in a very simple fashion. Earlier in this section a description was given of the carry input and carry output of the adders in each of the ALU Slices. Each adder passes its carry out signal to the carry in line of the next adder in line, and the last adder passes its carry out signal out of the ALU entirely, to the Status Register, where it is used for the Carry Status Flag. What about the carry in line of the very first adder? That is attached to the Sub signal as well. When the Sub signal is on, the carry in line of the first adder is turned on and voila, you have added one to the now reversed number. The result passes through the ALU Slices and out to the bus, where the Accumulator can pick it up as the final result of a subtraction operation.
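The arrangement just described (an XOR gate on each bus bit, with Sub also feeding the first carry in) can be modelled bit by bit. The sketch below is an illustration in Python, not a transcription of the Logisim circuit; the function names and the gate-level form of the full adder are assumptions.

```python
def full_adder(a, b, carry_in):
    """1-bit full adder: returns (sum bit, carry out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return s, carry_out


def add_or_subtract(acc, bus, sub, width=16):
    """Ripple through `width` slices; sub=1 computes acc - bus, sub=0 computes acc + bus."""
    carry = sub                        # carry in of the first adder is tied to Sub
    result = 0
    for i in range(width):
        a = (acc >> i) & 1
        b = ((bus >> i) & 1) ^ sub     # XOR gate above each slice reverses the bus bit
        s, carry = full_adder(a, b, carry)
        result |= s << i
    return result, carry               # the final carry goes to the Status Register
```

Calling add_or_subtract(7, 5, 1) returns (2, 1): the answer 2 plus a carry out of the last adder, while add_or_subtract(7, 5, 0) returns (12, 0).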
9) The Program Counter
Now that we have covered all of the components of the Elementary Microprocessor that directly relate to the correct execution of a mathematical or logical operation, and the logistics of making those components work together, it is time to look at the method the EM uses to interact with the Program ROM. To recap, the Program ROM is a read-only device that is not part of the Microprocessor. It is outside of the processor and is a device in Logisim's existing component library. I did not design it. However, to the best of my knowledge it does approximate a real ROM in its behavior, so the circuitry I designed to communicate with it should be realistic as a solution to the problem of a processor interacting with a ROM. In the EM simulation, the user/programmer (you) loads a program into the ROM and then executes it by resetting the simulation and turning on the clock or by manually ticking the clock one tick at a time.
In very early microprocessors the way a program was executed was to read the first address of a memory device containing a program and execute that instruction, then look at the next address and execute that instruction, and so on. A program therefore ran from beginning to end with no variation of any kind. The Program Counter was the device that kept track of the next address to read. It operated by keeping track of the last address read and adding one to that number each time an instruction was executed. It started at 0, read the address and then added 1. Then it read address 1 and added 1, then address 2 and added 1, etc. An important development in processor design was the introduction of branching, which this document refers to as out-of-order execution (not to be confused with the instruction-reordering technique that goes by the same name in modern processors). It works exactly the same way except that certain instructions (referred to as branch or jump instructions) cause the Program Counter to change its number to something other than the last address plus one. These instructions are usually conditional in that they jump if some condition is true or some condition is false or something equals zero, etc. This allowed the programmer to design programs that would execute one set of instructions if a particular condition held, but execute something else if the condition was not met.
The EM's Program Counter does in-order execution starting with address 0 in the Program ROM and continuing on until it runs out of ROM to read from or until it hits a Branch instruction that causes it to change its count. To accommodate both conditions it has to have two sets of inputs. One input is the last address plus one (produced by an adder that we'll look at in detail later) and the other input comes directly from the bus (for out-of-order execution). Below them a multiplexor takes the two 16-bit inputs and outputs one of them to the flip-flops based on a control signal called Enable Branch from the Microinstruction Controller. When the Enable Branch signal is off, the multiplexor passes on the input from the adder. When the Enable Branch signal is on, the multiplexor passes on the input from the bus. The multiplexor circuit follows below.
There are two 16-bit inputs. On the right side the control signal comes in from the Microinstruction Controller. Two outputs proceed from it. One is sent straight to the set of inputs that arrive from the bus, so if the control signal is on, the inputs from the bus are passed on. The other output from the control signal is passed through an inverter and then goes to the set of inputs from the Program Counter's adder. This is the default in-order execution method, so if the control signal is off then this set of inputs is passed on.
The Program Counter adder is found on the top right of the Program Counter circuit. All it does is add one to a 16-bit number. This number comes from the last output of the Program Counter, which you can see as a series of lines passing from the circuit's output and coming around back up to the top right. The 16-bit input to the adder is distributed across 16 1-bit adders just like the one you see in each of the ALU Slices. The addition of 1 is done by using a constant. In any processor there is a pin that is always on and a pin that is always off. That way you can connect to them if you need a hard-wired 0 or hard-wired 1. You can see a series of boxes along the top that have 0's in them except for the far-right box, which shows a 1. The boxes are Logisim's way of representing a 0 or 1 constant. Because the 0's and 1's in the Program Counter adder have the pattern of fifteen 0's and a 1, the end result is the addition of 1 to the number that was passed to the adder.
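Taken together, the adder and the multiplexor decide what the Program Counter's flip-flops hold next. The following Python sketch mirrors the multiplexor's AND/inverter arrangement using bit masks. The name next_pc is illustrative, and wrapping around to 0 after address 0xFFFF is an assumption about what happens to the adder's final carry.

```python
MASK = 0xFFFF  # the Program Counter is 16 bits wide


def next_pc(current, bus, enable_branch):
    """Return the value the Program Counter's flip-flops would load next."""
    incremented = (current + 1) & MASK        # the adder: last address plus the constant 1
    select = MASK if enable_branch else 0     # control signal fanned out to all 16 bits
    from_bus = bus & select                   # bus inputs gated directly by the signal
    from_adder = incremented & ~select & MASK # adder inputs gated by the inverted signal
    return from_bus | from_adder
```

With Enable Branch off, next_pc(5, 0x1234, 0) gives 6; with it on, next_pc(5, 0x1234, 1) gives 0x1234.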
10) The Memory Address Register (MAR)
Another aspect to inspect, now that we have a functional microprocessor, is how it will store data for later use and retrieve it when needed. It only has one user/programmer accessible register so the EM can only store one 16-bit piece of data internally. The answer is a separate memory that is not part of the processor itself. You can see it on the top-level simulation but we are still dealing with the processor itself for now. The processor has a specialized register that is not accessible to the programmer but which is vital to the proper storage of data to and loading of data from the external RAM. This is the Memory Address Register (MAR).
The MAR is exceedingly simple, but performs a vital task. It takes a 16-bit input from the bus and has a 16-bit output which it sends to the simulated RAM module outside of the processor. The MAR reads the data on the bus when the Microinstruction Controller passes a Select Address signal. The output is hooked up to the RAM's address selection input, which causes the RAM module to select a particular address (e.g. address 0, or address 1464) which will then be written to or read from. As with the Program Counter, this circuit interacts with Logisim's simulation of a memory device. My study indicates that the simulation is very realistic and thus the circuitry I developed to interact with it should be realistic as well, but my earlier caveat applies here too. The points where the EM must interact with the world outside of the processor itself are the points where a departure from reality is more likely. Simulations are wonderful teaching mechanisms but they are not the real world.
Why do we need a MAR at all? Why can't the processor send data from the bus to the RAM address selection input directly? The problem is that the bus can only hold one datum at a time, and "at a time" means during one phase imposed by the Phase Generator. Each instruction has four phases: Fetch, Decode, Execute, and Write-back.
Let's say the instruction being performed is ADD 110 (add the contents of memory address 110 to whatever happens to be in the Accumulator currently). In the first phase (Fetch) the instruction is fetched from the Program ROM and temporarily stored in the Instruction Register. In phase 2 (Decode) the instruction portion (ADD) is sent to the Microinstruction Register on its dedicated bus (not the main bus) and the data portion (110) is put onto the bus. At this point the Microinstruction Controller has its instruction and can start passing the appropriate signals as determined by the instruction. The ADD instruction would immediately, during phase 2, pass the Select Address signal, causing the MAR to read from the bus. What's on the bus during phase 2? The RAM address to be read from (110)! Now the MAR has taken over responsibility for remembering the address for the rest of the instruction and making sure that it is still selected in phase 3 and phase 4 while the bus is doing other things, such as moving the data that is in address 110 to the ALU and moving the result of the addition over to the Accumulator. If there were no MAR keeping track of the desired RAM address, the RAM address currently being selected would be whatever number happened to be on the bus, so in phase 1 the address being selected would be one thing, then in phase 2 it would be another, in phase 3 yet another, and in phase 4 another still.
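The ADD 110 walkthrough can be traced with a small Python model of the MAR. This is a sketch only: the class and method names are illustrative, and the bus values shown for every phase other than Decode are made-up placeholders, since the text only states that 110 is on the bus during phase 2.

```python
class MemoryAddressRegister:
    """Holds the last address captured from the bus."""

    def __init__(self):
        self.address = 0

    def clock(self, bus, select_address):
        # Capture the bus only while Select Address is on; otherwise keep holding.
        if select_address:
            self.address = bus & 0xFFFF
        return self.address  # value presented to the RAM's address input


# (phase name, value on the bus, Select Address) for ADD 110.
# Bus values other than 110 are hypothetical.
phases = [
    ("Fetch", 0, 0),
    ("Decode", 110, 1),
    ("Execute", 42, 0),
    ("Write-back", 49, 0),
]

mar = MemoryAddressRegister()
trace = [(name, mar.clock(bus, select)) for name, bus, select in phases]
```

In the resulting trace the RAM address stays at 110 through Execute and Write-back even though the bus carries different values in those phases.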
11) Input / Output registers
Finally, we can take up the last portions of the processor not yet described, the human I/O portions. These are the Input and Output registers. We will take up the Output Register first.
The Output Register is simpler than it looks. It reads from the bus when the Microinstruction Controller passes the Load Out signal. It has a 16-bit input attached to the bus and two outputs. One is a 7-bit output which passes only the 7 least significant bits. The other is a 16-bit output which passes all of the bits. The reason for this is that the 7-bit output is hooked up to a simulated TTY device. This TTY device accepts ASCII (American Standard Code for Information Interchange) characters and displays them. ASCII uses 7-bit codes to represent the various characters in the English language, so if you want to use the TTY as an output device you will only be dealing with 7 bits at a time anyway. The 16-bit output is just hooked up to a Logisim Probe so you can see what it is outputting, but it is there for you to play with if you want to make more elaborate I/O developments such as hooking up Logisim's simulated LCD display device. In that case the 7-bit output would be unnecessary and could be disconnected.
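The split into a 7-bit and a 16-bit output amounts to masking off the low bits of the stored word. A minimal Python sketch, with an illustrative function name that is not part of the EM design:

```python
def output_register(bus):
    """Return (7-bit value for the TTY, full 16-bit value for the Probe)."""
    word = bus & 0xFFFF   # what the register captured from the bus
    tty = word & 0x7F     # only the 7 least significant bits reach the TTY
    return tty, word
```

For instance, output_register(0x0141) returns (0x41, 0x0141), and chr(0x41) is the ASCII character "A", so the TTY would display "A" while the Probe shows the whole word.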
The Input Register is even simpler as a circuit, but because it deals with a human at human speeds, some special arrangements have to be made to have the processor wait for the person interacting with it to catch up. There is a 16-bit input driven by the simulated keyboard outside of the processor, a 16-bit output going to the bus, and a control signal passed on from the keyboard whenever there is any data in the keyboard's buffer waiting to be read. I will be perfectly honest at this point and tell you that I have no idea if the Logisim simulated keyboard is even remotely realistic, and my circuitry is necessarily forced into a design that interacts with it. Thus, I have no idea if the Input Register is remotely realistic, but it does work in the simulation.
You will notice that there are no flip-flops in the circuit, so it is not 100% accurate to call it a register, though I call it one to maintain consistent terminology. The key to this circuit is the Enable In signal. It is passed from the Microinstruction Controller whenever the IN instruction is executed. This signal causes two things to happen. First, it allows the passage of data from the keyboard side of the Input Register to the bus side of the Input Register by activating the controlled buffers in the middle of the circuit. Second, it passes directly out of the processor and to the keyboard's "send next character" line. The keyboard is simulated so as to pass on the next character in its buffer when the "send next character" line gets turned on. The result is that the keyboard transfers the next character in its buffer to the EM's bus via the Input Register.
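The two effects of Enable In can be sketched in Python with a queue standing in for the keyboard's buffer. Everything here is a loose illustration rather than a model of Logisim's keyboard component: the function name is hypothetical, None is used to mean "nothing is driven onto the bus", and the behaviour with an empty buffer is an assumption.

```python
from collections import deque


def input_register(keyboard_buffer, enable_in):
    """Return the character code placed on the bus, or None if the bus is not driven."""
    if not enable_in:
        return None            # controlled buffers closed: keyboard is cut off from the bus
    if not keyboard_buffer:
        return None            # assumption: nothing to transfer when the buffer is empty
    # Enable In also reaches the keyboard's "send next character" line,
    # so the character at the front of the buffer is consumed.
    return ord(keyboard_buffer.popleft())
```

With a buffer of deque("hi"), two calls with Enable In on return the codes for "h" and then "i"; calls with Enable In off leave the buffer untouched.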
12) Conclusion
That concludes the description of the Elementary Microprocessor's hardware components. I hope it is of use to you whether you are seriously pursuing an education in the field or are merely curious and interested. Feel free to alter, manipulate, tool around with, or utterly destroy the design. Have fun and see what you can do with it! Maybe somebody will get it working with Logisim's LCD display, or develop an interrupt-driven I/O scheme, or implement multiply and divide, or replace my pathetic assembler with something professional. Do what you will, but please email me with your developments, conclusions, questions, suggestions, etc. at elementarymicroprocessor@gmail.com. I am no longer actively working on the design and am currently pursuing other projects and interests, but I would like nothing better than to know that this project was helpful and interesting to you.