chood.net/blog/

Sometimes doing work costs less than doing nothing at all

Friday, November 8, 2024

If you’ve ever programmed in an assembly language with multiple word sizes, you have probably noticed that moving data from a smaller register into a larger register typically overwrites the higher bits of the destination register.

For example, the x86-64 instruction mov edx, eax sets the upper 32 bits of rdx to 0, even though the instruction could have been designed so that those bits were preserved. Another move instruction, movsxd rdx, eax, performs two’s-complement sign extension instead: the sign bit of the source register is copied into every bit of the destination register that the source does not cover.
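To make the difference concrete, here is a minimal Python sketch that models the three behaviours on plain integers. The function names are made up for illustration, and mov_preserve_upper describes the hypothetical design that x86-64 did not choose, not a real instruction.

```python
MASK32 = 0xFFFF_FFFF                 # low 32 bits (edx / eax)
MASK64 = 0xFFFF_FFFF_FFFF_FFFF       # full 64-bit register (rdx / rax)
UPPER32 = MASK64 ^ MASK32            # bits 32..63 only


def mov_zero_extend(src32: int) -> int:
    """Model of `mov edx, eax`: the new rdx depends only on the source."""
    return src32 & MASK32


def mov_sign_extend(src32: int) -> int:
    """Model of `movsxd rdx, eax`: bit 31 of the source fills bits 32..63."""
    value = src32 & MASK32
    if value & 0x8000_0000:
        value |= UPPER32
    return value


def mov_preserve_upper(old_rdx: int, src32: int) -> int:
    """Hypothetical design (not real x86-64): keep the old upper half of rdx."""
    return (old_rdx & UPPER32) | (src32 & MASK32)
```

Note that only the hypothetical version needs old_rdx as an input. The two real behaviours can compute the new value of rdx from the source alone.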

When I first learned this, I thought it was strange. Why would you design the CPU to do more work than it actually needs to? If I wanted to zero out a register before a copy, it’s not like there aren’t instructions for that!

As it turns out, modern processors are much more complicated than their instruction sets would lead you to believe. Zeroing the upper bits reduces the number of data dependencies between the operations the CPU is performing: if mov edx, eax preserved the upper half of rdx, the instruction would have to wait for whatever last wrote rdx, even when the program never uses those bits again. In fact, to perform out-of-order execution efficiently, x86-64 CPUs rename the architectural registers defined in the ISA onto a much larger set of physical registers in the hardware, which allows for pipelines that can have hundreds of instructions in flight at once.
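The effect on dependencies can be sketched with a toy register renamer. This is a deliberately simplified model, not how any real CPU is implemented: it assumes every instruction takes one cycle, physical registers are unlimited, and an instruction can start as soon as its inputs are ready. The function and variable names are hypothetical.

```python
from itertools import count


def critical_path(program):
    """Return the longest dependency chain, in cycles.

    `program` is a list of (destination, sources) pairs that use
    architectural register names such as "rdx".
    """
    fresh = count()   # supply of unused physical register ids
    rename = {}       # architectural name -> physical register holding it
    ready = {}        # physical register -> cycle its value becomes available
    longest = 0
    for dest, sources in program:
        # Reads go through the current mapping; a register that has not
        # been written yet is treated as ready at cycle 0.
        start = max((ready[rename[s]] for s in sources if s in rename), default=0)
        phys = next(fresh)          # every write gets a fresh physical register
        rename[dest] = phys
        ready[phys] = start + 1
        longest = max(longest, start + 1)
    return longest


# Two updates of rdx, then a 32-bit move into edx, then one more update.
zeroing = [("rdx", ["rdx"]), ("rdx", ["rdx"]), ("rdx", ["rax"]), ("rdx", ["rdx"])]
# Same program if the move had to merge with the old upper half of rdx.
preserving = [("rdx", ["rdx"]), ("rdx", ["rdx"]), ("rdx", ["rax", "rdx"]), ("rdx", ["rdx"])]

print(critical_path(zeroing), critical_path(preserving))
```

With zeroing, the move starts a new, independent chain, so the last two instructions can overlap with the first two. With the preserving variant, all four instructions form a single chain.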

We often simplify the processes in our day-to-day lives so that we can reason about them more effectively from the top down.

When do these mental abstractions lead us to make decisions that we think save time, but that actually make more work for us?