CPU complexity
Why is a CPU so complex? Why are so many transistors needed?
Multiple Cores.
Multiple Roles:
Running application programs, running an operating system, handling external I/O devices, starting or stopping the computer, and managing memory.
Protection And Privilege.
For example, the hardware prevents an application program from directly interacting with I/O devices.
Hardware Priorities.
A CPU uses a priority scheme in which some actions are assigned higher priority than others.
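A minimal sketch of such a priority scheme in Python; the event names and the numeric priority values are invented for illustration, not taken from any particular CPU:

```python
# Hypothetical fixed-priority scheme: a higher number means higher priority.
# The event names and values are illustrative only.
PRIORITY = {"hardware_failure": 3, "io_interrupt": 2, "application": 1}

def next_action(pending):
    """Return the pending event the CPU services first."""
    return max(pending, key=PRIORITY.__getitem__)
```

When several events are pending at once, the hardware always services the highest-priority one first.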
Generality
A CPU is designed to support a wide variety of applications.
Data Size
To speed processing, a CPU is designed to handle large data values.
High Speed
Parallelism is a fundamental technique used to create high-speed hardware. The large amount of parallel hardware needed to make a modern CPU operate at the highest rate also means that the CPU requires many transistors.
Modes of execution
At any given time, the current execution mode determines how the CPU operates.
Backward Compatibility
Backward compatibility allows a vendor to sell a CPU with new features, but also permits customers to use the CPU to run old software.
Changing Modes
Automatic Mode Change.
External hardware can change the mode of a CPU. For example, when an I/O device requests service, the hardware informs the CPU.
Manual Mode Change.
In essence, manual changes occur under control of a running program.
Most often, the program is the operating system, which changes mode before it executes an application. However, some CPUs also provide multiple modes that applications can use, and allow an application to switch among the modes.
Mechanism of Changing Modes
- the CPU includes an instruction to set the current mode.
- the CPU contains a special-purpose mode register to control the mode.
- a mode change can occur as the side effect of another instruction.
In most CPUs, for example, the instruction set includes an instruction that an application uses to make an operating system call; executing the instruction changes the mode as a side effect.
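The mechanisms above can be sketched with a hypothetical two-mode CPU; the class, mode names, and method names are invented for this sketch:

```python
class CPU:
    # Hypothetical two-mode CPU: a mode register holds the current mode.
    USER, PRIVILEGED = 0, 1

    def __init__(self):
        self.mode = CPU.PRIVILEGED   # hardware powers up in privileged mode

    def run_application(self):
        self.mode = CPU.USER         # OS drops privilege before an app runs

    def syscall(self):
        # Mode change as a side effect of the system-call instruction:
        # the CPU raises privilege and transfers control to the OS.
        self.mode = CPU.PRIVILEGED
```

The operating system lowers the mode before an application runs; the system-call instruction raises it again as a side effect.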
Multiple Levels of Protection
A CPU that runs applications needs at least two levels of protection: the operating system must run with absolute privilege, but application programs can run with limited privilege.
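The protection check can be sketched as a table lookup performed before each instruction executes; the instruction names and the two-level numbering (0 = application, 1 = operating system) are assumptions for this sketch:

```python
class ProtectionError(Exception):
    """Raised when an instruction exceeds the current privilege level."""

# Hypothetical table: the minimum privilege level each instruction requires.
REQUIRED = {"load": 0, "add": 0, "start_io": 1, "set_mode": 1}

def check(instruction, current_level):
    # Hardware check performed before each instruction executes: an
    # instruction may run only if the current level is high enough.
    if current_level < REQUIRED[instruction]:
        raise ProtectionError(instruction)
```

An application (level 0) that attempts a privileged instruction such as starting an I/O device triggers a protection violation.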
Microcoded Instructions
First, a hardware architect builds a fast, but small processor known as a microcontroller. Second, to implement the CPU instruction set (called a macro instruction set), the architect writes software for the microcontroller. The software that runs on the microcontroller is known as microcode.
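The two-level arrangement can be sketched as follows; the macro instruction, register names, and dispatch table are hypothetical:

```python
# Each macro instruction is implemented by a microcode procedure that runs
# on the underlying microcontroller.
REGS = {"r0": 0, "r1": 2, "r2": 3}

def micro_add(dst, a, b):
    # Microcode: a sequence of primitive micro-operations on the data paths.
    REGS[dst] = REGS[a] + REGS[b]

MICROCODE = {"add": micro_add}   # dispatch: macro opcode -> microcode routine

def execute(opcode, *operands):
    # The fetch-execute cycle runs the microcode for each macro instruction.
    MICROCODE[opcode](*operands)
```

A program written for the macro instruction set never sees the microcontroller underneath.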
Horizontal Microcode And Parallel Execution
Because horizontal microcode instructions contain separate fields that each control one hardware unit, horizontal microcode makes it easy to specify simultaneous, parallel operation of the hardware units.
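A sketch of a horizontal microinstruction; the functional units and field names are illustrative, not from any real microcontroller:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MicroInstruction:
    # Each field controls exactly one functional unit.
    alu_op: Optional[str] = None     # field for the ALU
    mem_read: bool = False           # field for the memory interface
    reg_write: Optional[str] = None  # field for the register file

def units_active(mi):
    # In hardware, every unit whose field is set operates in the SAME cycle;
    # this helper merely reports which units one microinstruction starts.
    active = []
    if mi.alu_op:
        active.append("alu")
    if mi.mem_read:
        active.append("memory")
    if mi.reg_write:
        active.append("registers")
    return active
```

Setting several fields in one microinstruction starts the corresponding units simultaneously, which is exactly what makes horizontal microcode suited to parallel hardware.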
Look-Ahead And High Performance Execution
The point is: if the CPU contains enough functional units, an intelligent controller can schedule a set of independent macro instructions to be executed at the same time.
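The independence test a look-ahead controller applies can be sketched as follows; the `(dst, src1, src2)` tuple format for a register-to-register instruction is an assumption of this sketch:

```python
# Two instructions can execute in the same cycle when neither one writes a
# register that the other reads or writes.

def independent(i1, i2):
    d1, *srcs1 = i1
    d2, *srcs2 = i2
    return d1 != d2 and d1 not in srcs2 and d2 not in srcs1
```

A controller that finds enough mutually independent instructions can dispatch them to separate functional units at once.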
Out-Of-Order Instruction Execution
To achieve highest speed, a modern CPU contains multiple copies of functional units that permit multiple instructions to be executed simultaneously. An intelligent controller uses a scoreboard mechanism to schedule execution in an order that preserves the appearance of sequential processing.
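A toy version of the scoreboard idea, assuming instructions are `(dst, src1, src2)` tuples in program order; write-after-write and write-after-read hazards are ignored to keep the sketch short:

```python
# Each "cycle", every waiting instruction issues unless an EARLIER waiting
# instruction writes one of its source registers (a read-after-write
# dependence recorded on the scoreboard).

def schedule(instrs):
    cycles, waiting = [], list(range(len(instrs)))
    while waiting:
        ready = [i for i in waiting
                 if not any(j < i and instrs[j][0] in instrs[i][1:]
                            for j in waiting)]
        cycles.append(ready)
        waiting = [i for i in waiting if i not in ready]
    return cycles
```

Independent instructions issue together even when a dependent instruction between them must wait, yet the results match sequential execution.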
Summary
To reduce the internal complexity, a CPU is often built with two levels of abstraction: a microcontroller is implemented with digital circuits, and a macro instruction set is created by adding microcode.
There are two broad classes of microcode. A microcontroller that uses vertical microcode resembles a conventional RISC processor. Typically, vertical microcode consists of a set of procedures that each correspond to one macro instruction; the CPU runs the appropriate microcode during the fetch-execute cycle. Horizontal microcode, which allows a programmer to schedule functional units to operate on each cycle, consists of instructions in which each bit field corresponds to a functional unit. A third alternative uses Field Programmable Gate Array (FPGA) technology to create the underlying system.
Advanced CPUs extend parallel execution by scheduling a set of instructions across multiple functional units. The CPU uses a scoreboard mechanism to handle cases where the results of one instruction are used by a successive instruction. The idea can be extended to conditional branches by allowing parallel evaluation of each path to proceed, and then, once the condition is known, discarding the values along the path that is not taken.
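The branch idea in the last sentence can be sketched as follows; the callables stand in for the instruction sequences along each path of the branch:

```python
# Evaluate both paths of a conditional branch in parallel, then discard the
# values computed along the path that is not taken.

def speculate(condition, taken_path, not_taken_path):
    taken_result = taken_path()          # both paths proceed before the
    not_taken_result = not_taken_path()  # condition is known
    return taken_result if condition() else not_taken_result
```

Only the result from the selected path is kept; the other result is simply discarded, as if that path had never executed.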