Out-of-order execution

There's another refinement to the pipeline concept allowing an even higher utilization of a CPU's resources. Namely, as processor manufacturers started to add redundant processing units (Intel P6 already had two integer and two floating-point execution units), it became possible to execute two instructions in parallel.

Up until Pentium Pro (P6), instructions were fed into the pipeline in their order of appearance. But if there's a data dependency between two consecutive instructions, then they can't be processed in parallel, leaving the additional execution unit idle:

a = b + 1;  // 1
c = a + 5; // 2
d = e + 10; // 3
f = d + 15; // 4

The solution to this problem is to take the next independent instruction and execute it before the dependent one. See the next diagram for a visual explanation:

Here, on the left side, we see the traditional execution preserving the instruction order and, on the right, parallel execution with reordering, where instruction 3 will be executed before instruction 2.