Illusion of Order
In Chapter 20, we said that when the pipeline hits a slow instruction, it must wait.
That was true in 1990. It is false today.
Today, the CPU refuses to wait. It looks at your code, sees that Line 2 and Line 3 have nothing to do with Line 1, and executes them Out-of-Order.
The CPU doesn't respect your code order. It respects Dependencies only.
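To make "dependencies only" concrete, here is a minimal sketch of the check the hardware is effectively performing. The `Instr` type, the register names, and the four-instruction program are all hypothetical, invented for illustration. The check only looks at direct read-after-write dependencies, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str      # human-readable text of the instruction
    dest: str      # register the instruction writes
    srcs: tuple    # registers the instruction reads


def depends_on(later: Instr, earlier: Instr) -> bool:
    """True if `later` reads the value that `earlier` produces."""
    return earlier.dest in later.srcs


# Hypothetical program: a slow load followed by unrelated arithmetic.
program = [
    Instr("load r1, [mem]", dest="r1", srcs=()),
    Instr("add r2, r3, r4", dest="r2", srcs=("r3", "r4")),
    Instr("sub r5, r3, r6", dest="r5", srcs=("r3", "r6")),
    Instr("mul r7, r1, r2", dest="r7", srcs=("r1", "r2")),
]


def independent_of_first(prog):
    """Names of the instructions that do not read the first one's result."""
    first = prog[0]
    return [i.name for i in prog[1:] if not depends_on(i, first)]


print(independent_of_first(program))
```

The add and the sub never read `r1`, so nothing forces them to wait for the load. The mul does read it, so it must wait.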
The Big Lie
If the CPU executes instructions opportunistically (driven by data readiness), why doesn't your program break?
Because the CPU keeps a secret diary called the Reorder Buffer (ROB).
It executes in whatever order the data allows (running ahead), but it Commits the results in the exact order you wrote.
Order is an illusion preserved for your sanity, not the machine's necessity.
The Second Lie: Registers Are Not Registers
To execute out of order, the CPU must solve a deeper problem. Two instructions may use the same register name (e.g., RAX), even though they are logically unrelated.
The CPU silently lies again. It gives every value a private physical register, even if they share the same name in your code.
You think there are 16 registers. The CPU uses hundreds. This is called Register Renaming. Without it, the illusion would collapse instantly.
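A toy sketch of renaming, under loud assumptions: the `rename` function, the `p0, p1, ...` naming, and the four-instruction program are hypothetical. Real hardware draws from a finite pool of physical registers and recycles them after commit. This sketch simply hands out a fresh one on every write.

```python
from itertools import count


def rename(program, arch_regs):
    """Map architectural register names to fresh physical registers.

    program: list of (dest, srcs) tuples using architectural names.
    Returns a list of (physical_dest, physical_srcs) tuples.
    """
    fresh = count()
    # Every architectural register starts out backed by some physical register.
    mapping = {reg: f"p{next(fresh)}" for reg in arch_regs}
    renamed = []
    for dest, srcs in program:
        # Sources read whatever physical register currently holds that name.
        phys_srcs = tuple(mapping[s] for s in srcs)
        # Every write gets a brand-new physical register (assumption: no reuse).
        mapping[dest] = f"p{next(fresh)}"
        renamed.append((mapping[dest], phys_srcs))
    return renamed


arch_regs = ["rax", "rbx", "rcx", "rdx", "rsi"]
program = [
    ("rax", ("rbx",)),  # rax = f(rbx)
    ("rcx", ("rax",)),  # rcx = f(rax)   uses the first rax
    ("rax", ("rdx",)),  # rax = f(rdx)   unrelated reuse of the name
    ("rsi", ("rax",)),  # rsi = f(rax)   uses the second rax
]

for phys_dest, phys_srcs in rename(program, arch_regs):
    print(phys_dest, "<-", phys_srcs)
```

The two writes to `rax` land in different physical registers (`p5` and `p7` here), so the second pair of instructions no longer has to wait for the first pair to finish with the name.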
Picture three instructions sitting in the Reorder Buffer. Instructions #2 and #3 (ADD/SUB) are fast. They finish before the slow #1.
But they cannot leave the buffer (Commit) until #1 is done. They wait in line, finished but trapped.
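The "finished but trapped" behavior can be sketched in a few lines. This is a toy model, not a real pipeline: the `simulate` function is hypothetical, all instructions are assumed independent and started at cycle 0, and any number of instructions may commit in the same cycle. The 300-cycle latency for #1 is an assumed value for a slow instruction.

```python
def simulate(latencies):
    """Return (finish_order, commit_cycles) for instructions numbered from 1.

    latencies[i] is how many cycles instruction i+1 takes to execute.
    """
    n = len(latencies)
    # Execution finishes whenever each instruction's own work is done.
    finish_order = [i + 1 for i in sorted(range(n), key=lambda i: (latencies[i], i))]
    # Commit is in program order: nobody retires before every older instruction.
    commit_cycles = []
    oldest_ready = 0
    for latency in latencies:
        oldest_ready = max(oldest_ready, latency)
        commit_cycles.append(oldest_ready)
    return finish_order, commit_cycles


finish_order, commit_cycles = simulate([300, 1, 1])
print("finish order:", finish_order)    # finish order: [2, 3, 1]
print("commit cycles:", commit_cycles)  # commit cycles: [300, 300, 300]
```

Instructions #2 and #3 finish at cycle 1, yet none of them can commit until cycle 300, when #1 finally completes.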
Commit vs Execute
This distinction is vital for two reasons: Sanity and Safety.
- Execute: Do the work. Happens whenever data is ready (Dataflow).
- Commit (Retire): Make the change permanent and visible. Happens only in strict order.
Crucially, commit order is how the CPU preserves precise exceptions. If instruction #3 crashes, the machine must pretend #4 (which already finished) never happened.
This allows the CPU to speculate. It can guess, run 50 instructions ahead of the oldest uncommitted one, and if the guess was wrong, just throw away the uncommitted results.
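The same "throw it away" mechanism serves both precise exceptions and wrong guesses. A minimal sketch, assuming a hypothetical `retire` function and a buffer modeled as a plain list of `(name, status)` pairs in program order:

```python
def retire(rob):
    """Walk the buffer in program order and return (committed, discarded).

    status is one of "done", "fault", or "pending".
    """
    committed = []
    for idx, (name, status) in enumerate(rob):
        if status == "done":
            committed.append(name)
        elif status == "fault":
            # Everything younger than the faulting instruction is squashed,
            # even if it already finished executing.
            return committed, [n for n, _ in rob[idx + 1:]]
        else:
            # The oldest instruction is still running: stop committing and wait.
            break
    return committed, []


rob = [("#1", "done"), ("#2", "done"), ("#3", "fault"), ("#4", "done")]
print(retire(rob))
```

Here #1 and #2 commit, #3 raises its fault, and #4 is discarded as if it never ran. Because nothing becomes visible before commit, discarding #4 costs only the wasted work.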
Why doesn't the compiler just reorder the code perfectly?
The compiler (Static Scheduling) is smart, but blind. It cannot see Cache Misses.
It might schedule an instruction assuming a variable is in L1 Cache (roughly 4 cycles). If that variable is actually in RAM (roughly 300 cycles), the static schedule collapses. Only the hardware (OoO) can react to latency dynamically.
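A back-of-the-envelope model of that collapse, with every number and name assumed for illustration: one load, one instruction that uses it, and `n` independent one-cycle instructions. The compiler places `gap` of them between the load and its use, enough to hide the latency it expected. The in-order core stalls at the use. The out-of-order core is assumed to have a window large enough to pull all the independent work forward.

```python
def in_order_cycles(load_latency, gap, n):
    """Static schedule: `gap` filler instructions, then the use, then the rest."""
    reach_use = max(load_latency, gap)  # stall if the load is still in flight
    return reach_use + 1 + (n - gap)    # the use itself, then the leftover work


def out_of_order_cycles(load_latency, n):
    """Dynamic schedule: all independent work overlaps with the load."""
    return max(load_latency, n) + 1     # the use runs once both are done


for latency in (4, 300):
    print(latency,
          in_order_cycles(latency, gap=4, n=200),
          out_of_order_cycles(latency, n=200))
```

With a 4-cycle hit, both cores take 201 cycles in this model, so the static schedule was perfect. With a 300-cycle miss, the in-order core takes 497 cycles and the out-of-order one takes 301, because it hid the 196 leftover instructions behind the miss.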
Is this why my laptop gets hot?
Yes. Speculation is expensive. The CPU is constantly guessing, executing instructions that might be thrown away. It burns energy to buy time.
Is this dangerous?
Extremely. In January 2018, researchers publicly disclosed Spectre and Meltdown. They showed that an attacker could trick the CPU into "speculatively" reading secret data, then measure the side effects (cache changes) even after the CPU "threw away" the results.
The Illusion of Order leaked reality.
Out-of-Order execution lets the CPU run ahead.
But branches don't just block execution — they split reality. At every If-statement, the CPU must bet on a future that may not exist.
This is not an optimization. This is survival.