Chapter 12

The Watcher

We have seen that memory is far (Hierarchy) and chunky (Cache Lines). To hide this, the CPU does something disturbing: It stops trusting you.

Modern CPUs have a component called the Hardware Prefetcher. It sits on the memory bus, silently watching every address you request. It is not intelligent; it is a dumb, specialized machine trained to recognize repetition.

If it sees `Access(100)`, then `Access(101)`, it concludes: "They are walking linearly. They will want 102 next." It fetches 102 before you even ask for it.

The Uneasy Alliance

When you play nice (Arrays, Sequential Access), the Watcher is your best friend. It hides the latency almost completely.

But if you are unpredictable (Linked Lists, Trees, Hash Maps), the Watcher panics. It guesses wrong. It fetches things you don't need, clogging the bandwidth.

This is why Data Oriented Design works. It isn't just about Cache Lines. It is about hypnotizing the Watcher.

Physics Lens: Latency ↓ (Hidden) | Throughput ↑ (Saturation) | Energy ↑↑ (Speculation) | Waste ↑ (Bad Guesses)
Experiment:
1. Scan Array (Linear): Watch the Blue blocks appear ahead of you. You mostly hit Green (Fast).
2. Random Access: The Watcher cannot help you. You hit Red (Slow) and leave a trail of useless Blue blocks.

The Cost of Prediction

Prediction is not free. It costs energy. And if the prediction is wrong, it costs bandwidth.

A "Graph" data structure is the worst nightmare for a CPU. Every node pointer is a surprise. The prefetcher tries to guess, fails, and eventually gives up. The CPU stalls.

Hypotheses
Is prefetching guaranteed?

No. Prefetchers are heuristic-based and conservative. If you behave unpredictably, they disengage.

Why does a simple loop sometimes outperform a “faster” algorithm?

Because linear access hypnotizes the prefetcher. A theoretically worse algorithm can win if it feeds the hardware predictable patterns.

Does the prefetcher understand data structures?

Absolutely not. It sees addresses, not meaning. A tree traversal looks like chaos.

Can prefetching hurt performance?

Yes. Wrong guesses evict useful data, waste bandwidth, and burn energy. Speculation always has a cost.

What programmers usually get wrong here

They trust Big-O more than hardware behavior. The prefetcher does not care about complexity — it cares about rhythm.

This works — until we need to map the world. Linear memory is fast, but the world is complex. We need to translate our complex ideas into simple addresses.