Pipelining: Architecture, Advantages & Disadvantages

The effective speed of a processor depends on how quickly it executes programs. Many techniques, in both hardware and software, have been developed to increase the speed of execution. It was observed that executing instructions concurrently reduces the time required for execution, and this led to the concept of parallelism. With parallelism, more than one instruction can be in execution during the same clock cycle. It can be implemented through various techniques such as pipelining, multiple execution units, and multiple cores. Among these methods, pipelining is the most commonly used. So how is an instruction executed in a pipeline, and how does pipelining increase the speed of execution?


What is Pipelining?

To grasp the concept of pipelining, let us look at how a program is executed at the lowest level. An instruction is the smallest execution packet of a program. Each instruction contains one or more operations. Simple scalar processors execute at most one instruction per clock cycle, with each instruction containing only one operation. Instructions are executed as a sequence of phases to produce the expected results. This sequence is given below.

Instruction Execution Sequence
  • IF: Instruction Fetch, fetches the instruction into the instruction register.
  • ID: Instruction Decode, decodes the instruction to obtain the opcode.
  • AG: Address Generator, generates the address.
  • DF: Data Fetch, fetches the operands into the data register.
  • EX: Execution, executes the specified operation.
  • WB: Write back, writes back the result to the register.

Not all instructions require all of the above steps, but most do. Each step uses different hardware. In pipelining, the phases of different operations are treated as independent of one another, so they can be overlapped and performed concurrently. Thus, multiple operations can be in progress simultaneously, each one in its own phase.

Instruction Pipelining

Let us look at the way instructions are processed in a pipeline. The diagram below illustrates this.

Instruction Pipelining

Assume that the instructions are independent. In a simple pipelined processor, there is only one operation in each phase at any given time. The initial phase is the IF phase, so in the first clock cycle one operation is fetched. When the next clock pulse arrives, the first operation moves into the ID phase, leaving the IF phase empty. This empty phase is allocated to the next operation. So, during the second clock cycle, the first operation is in the ID phase and the second operation is in the IF phase.

In the third cycle, the first operation is in the AG phase, the second operation is in the ID phase, and the third operation is in the IF phase. In this way, instructions are executed concurrently, and from the sixth cycle onward the processor completes one instruction per clock cycle.
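The cycle-by-cycle walk-through above can be reproduced with a short Python sketch. It assumes the ideal case described in the text (independent instructions, one clock cycle per phase, no stalls). The names `STAGES` and `pipeline_diagram` are illustrative only and do not come from any library.

```python
# Six phases from the instruction execution sequence above.
STAGES = ["IF", "ID", "AG", "DF", "EX", "WB"]


def pipeline_diagram(n_instructions):
    """Return one text row per instruction; each column is one clock cycle.

    Assumption: ideal pipeline, so instruction i enters IF in cycle i
    and advances one phase per cycle.
    """
    total_cycles = n_instructions + len(STAGES) - 1
    rows = []
    for i in range(n_instructions):
        cells = ["--"] * total_cycles          # "--" means not in the pipeline
        for offset, name in enumerate(STAGES):
            cells[i + offset] = name           # phase occupied in cycle i + offset
        rows.append(f"I{i + 1}: " + " ".join(cells))
    return rows


for row in pipeline_diagram(3):
    print(row)
```

Reading a column top to bottom shows which phase each instruction occupies in that cycle. The third column, for example, lists AG, ID, and IF, matching the third cycle described above.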

Had the instructions been executed sequentially, the first instruction would have to go through all the phases before the next instruction could be fetched. Each instruction would therefore require six clock cycles. In a pipelined processor, because instructions execute concurrently, only the first instruction takes six cycles to produce a result, and every remaining instruction completes one cycle after its predecessor, which reduces the execution time and increases the speed of the processor.
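The comparison can be put into numbers with a small sketch. It assumes the ideal six-phase pipeline from the text with no stalls: sequential execution takes n × k cycles for n instructions and k phases, while the pipeline takes k cycles for the first instruction plus one cycle for each of the remaining n − 1. The function names are hypothetical.

```python
def sequential_cycles(n_instructions, n_stages=6):
    """Cycles needed when each instruction finishes before the next starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages=6):
    """Cycles needed in an ideal pipeline (assumption: no stalls or flushes)."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


def speedup(n_instructions, n_stages=6):
    """Ratio of sequential to pipelined cycle counts."""
    if n_instructions <= 0:
        raise ValueError("need at least one instruction")
    return sequential_cycles(n_instructions, n_stages) / pipelined_cycles(
        n_instructions, n_stages
    )


print(sequential_cycles(100))      # 600
print(pipelined_cycles(100))       # 105
print(round(speedup(100), 2))      # 5.71
```

As the number of instructions grows, the speedup approaches the number of phases (six here) but never quite reaches it, because of the cycles spent filling the pipeline.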

Pipelining Architecture

Parallelism can be achieved with hardware, compiler, and software techniques. To exploit pipelining in computer architecture, many processor units are interconnected and operate concurrently. In a pipelined processor architecture, separate processing units are provided for integer and floating-point instructions, whereas in a sequential architecture a single functional unit is provided.

Pipelined Processor Unit

In static pipelining, the processor passes every instruction through all phases of the pipeline, whether or not the instruction needs them. In a dynamic pipeline processor, an instruction can bypass the phases it does not need, but it must move through the remaining phases in sequential order. In a complex dynamic pipeline processor, an instruction can bypass phases and can also pass through them out of order.
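The difference between the three kinds of pipeline can be sketched as the path an instruction takes through the phases. This is only an illustration: the function names, the example instruction, and the set of phases it needs are hypothetical, and the six phases are those of the instruction execution sequence described earlier.

```python
STAGES = ("IF", "ID", "AG", "DF", "EX", "WB")


def static_path(needed):
    """Static pipeline: every phase is visited, needed or not."""
    return list(STAGES)


def dynamic_path(needed):
    """Dynamic pipeline: unneeded phases are bypassed, order is preserved."""
    return [stage for stage in STAGES if stage in needed]


def complex_dynamic_path(order):
    """Complex dynamic pipeline: the instruction chooses both phases and order."""
    unknown = [stage for stage in order if stage not in STAGES]
    if unknown:
        raise ValueError(f"unknown phases: {unknown}")
    return list(order)


# Hypothetical instruction that is assumed not to need AG or DF.
needed = {"IF", "ID", "EX", "WB"}
print(static_path(needed))    # all six phases
print(dynamic_path(needed))   # ['IF', 'ID', 'EX', 'WB']
```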

Pipelining in RISC Processors

ARM, the most popular RISC architecture, uses 3-stage and 5-stage pipelining. In 3-stage pipelining the stages are Fetch, Decode, and Execute. This pipeline has a latency of 3 cycles, as an individual instruction takes 3 clock cycles to complete.

ARM 3 stage Pipelining

For pipelining to be implemented properly, the hardware architecture must also support it. The hardware for 3-stage pipelining includes a register bank, an ALU, a barrel shifter, an address generator, an incrementer, an instruction decoder, and data registers.

ARM 3 Stage Pipelining Datapath

In 5-stage pipelining the stages are Fetch, Decode, Execute, Buffer/Data, and Write-back.

Pipelining Hazards

In a typical computer program, besides simple instructions, there are branch instructions, interrupt operations, and read and write instructions. Pipelining is not suitable for all kinds of instructions. When some instructions are executed in a pipeline, they can stall the pipeline or flush it entirely. Problems of this type are called pipelining hazards.

In most computer programs, the result of one instruction is used as an operand by another instruction. When such instructions are executed in a pipeline, a breakdown occurs because the result of the first instruction is not yet available when the second instruction starts collecting its operands. The second instruction must therefore stall until the first instruction has executed and produced its result. This type of hazard is called a read-after-write (RAW) pipelining hazard.

Read After Write Pipelining Hazard
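A read-after-write dependency can be detected by comparing the registers each instruction reads with the register a recent instruction writes. The sketch below is a simplified model, not how any particular processor does it: the `Instr` type, the register names, the example program, and the default `window` of two instructions are all assumptions chosen for illustration.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    text: str                 # human-readable form, for display only
    dest: Optional[str]       # register written, or None
    srcs: Tuple[str, ...]     # registers read


def raw_hazards(program, window=2):
    """Return (producer_index, consumer_index, register) triples.

    A hazard is reported when an instruction reads a register that one of
    the `window` instructions immediately before it writes. Only the
    nearest producer of each register is reported.
    """
    hazards = []
    for j, consumer in enumerate(program):
        for reg in consumer.srcs:
            # Scan backwards so the most recent writer is found first.
            for i in range(j - 1, max(-1, j - window - 1), -1):
                if program[i].dest == reg:
                    hazards.append((i, j, reg))
                    break
    return hazards


# Hypothetical program: SUB needs r1 from ADD, OR needs r4 and r6.
program = [
    Instr("ADD r1, r2, r3", "r1", ("r2", "r3")),
    Instr("SUB r4, r1, r5", "r4", ("r1", "r5")),
    Instr("MUL r6, r7, r8", "r6", ("r7", "r8")),
    Instr("OR  r9, r4, r6", "r9", ("r4", "r6")),
]
print(raw_hazards(program))
# [(0, 1, 'r1'), (1, 3, 'r4'), (2, 3, 'r6')]
```

Each reported pair is a place where the consumer may have to stall. How many cycles it actually stalls depends on the pipeline's design, which this sketch does not model.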

The execution of branch instructions also causes a pipelining hazard. When a branch instruction is executed in a pipeline, it affects the fetch stages of the instructions that follow it.

Pipelined Branch Behaviour
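The cost of a taken branch can be estimated with a simple model. The sketch assumes no branch prediction and assumes that every instruction fetched behind a taken branch is flushed once the branch resolves. The function names and the example numbers are hypothetical.

```python
def branch_penalty(resolve_stage):
    """Instructions fetched behind a taken branch before it resolves.

    `resolve_stage` is the 1-based pipeline stage in which the branch
    outcome becomes known (an assumption of this model).
    """
    if resolve_stage < 1:
        raise ValueError("resolve_stage must be at least 1")
    return resolve_stage - 1


def cycles_with_branches(n_instructions, n_stages, taken_branches, resolve_stage):
    """Ideal pipelined cycle count plus the flush cost of taken branches."""
    if resolve_stage > n_stages:
        raise ValueError("resolve_stage cannot exceed n_stages")
    if n_instructions == 0:
        return 0
    ideal = n_stages + (n_instructions - 1)
    return ideal + taken_branches * branch_penalty(resolve_stage)


# 100 instructions, 10 of them taken branches.
print(cycles_with_branches(100, 3, 10, 3))   # 3-stage, resolved in stage 3: 122
print(cycles_with_branches(100, 6, 10, 5))   # 6-stage, resolved in stage 5: 145
```

In this model, the later in the pipeline a branch resolves, the more fetched instructions are thrown away for each taken branch.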

Advantages of Pipelining

  • Instruction throughput increases.
  • Increasing the number of pipeline stages increases the number of instructions that can be in execution simultaneously.
  • A faster ALU can be designed when pipelining is used.
  • Pipelined CPUs work at higher clock frequencies than the RAM.
  • Pipelining increases the overall performance of the CPU.

Disadvantages of Pipelining

  • Designing a pipelined processor is complex.
  • Instruction latency increases in pipelined processors.
  • The throughput of a pipelined processor is difficult to predict.
  • The longer the pipeline, the worse the hazard problem for branch instructions.
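Higher throughput and higher latency can both hold at once, as a rough timing sketch shows. It assumes that the clock period is set by the slowest stage plus a fixed overhead for the registers between stages. The stage delays and the overhead value below are made-up numbers for illustration only.

```python
def unpipelined_timing(stage_delays_ns):
    """Return (latency_ns, instructions_per_ns) without pipelining."""
    latency = sum(stage_delays_ns)
    return latency, 1.0 / latency


def pipelined_timing(stage_delays_ns, register_overhead_ns):
    """Return (cycle_ns, latency_ns, instructions_per_ns) with pipelining.

    Assumption: every stage takes one clock cycle, and the cycle must fit
    the slowest stage plus the pipeline-register overhead.
    """
    cycle = max(stage_delays_ns) + register_overhead_ns
    latency = cycle * len(stage_delays_ns)
    return cycle, latency, 1.0 / cycle


# Hypothetical delays for six stages; one stage is slower than the rest.
delays = [1.0, 1.0, 1.0, 1.0, 2.0, 1.0]
base_latency, base_rate = unpipelined_timing(delays)
cycle, latency, rate = pipelined_timing(delays, register_overhead_ns=0.2)

print(f"unpipelined: latency {base_latency:.1f} ns")
print(f"pipelined:   latency {latency:.1f} ns, cycle {cycle:.1f} ns")
print(f"throughput gain: {rate / base_rate:.2f}x")
```

With these numbers, a single instruction takes longer from start to finish in the pipelined design, yet instructions complete more often once the pipeline is full.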

Pipelining benefits all instructions that follow a similar sequence of steps for execution. Processors with complex instructions, where every instruction behaves differently from the others, are hard to pipeline. Processors are reasonably implemented with 3 or 5 pipeline stages, because the hazards increase as the depth of the pipeline increases. Can you name some pipelined processors and their pipeline stages?