A prime concern of compiled machine code is to minimize it's filesize. Because the better the program fits in the CPU's instruction cache the less it needs to resort to the slow process of fetching it from RAM.

Redundant instructions, whether arising from other optimization passes or because it reads nicer, are a prime offender. Which the Full Redundancy Elimination optimization pass addresses.

---

With dominators calculators & loop optimizer optionally initialized it does it's processing.
1/

If these arguments are missing, this central function `do_rpo_vn` finds the entry & exit points to analyze. It clears all the `EDGE_DFS_BACK` bitflag on all entry edges.

It depth-first iterates over the control flow graph to set some flags, gather a postorder indices array, & optionally gather cyclic edges (loops). This is seperated into loops for locating the cycles & function/loop headers, shortcircuit tidyups on trivial cases, locating exits, computing postorder, & cleanup.

2/?

If those iteration over the control flow edges yields no loops it'll skip processing PHI instructions. It creates a mapping from the codeblock indices to their index in the postorder array. It updates in-region flags for each basic block & edge. It creates a few more collections for the next loop.

Which iterates in postorder to gather unwind state (C-irrelevant exception handling?) & unset codeblocks' EXECUTABLE flags.

It optionally iterates over cycles to expose them better to later loops.
3/

If it is optimizing loops it performs a main iteration over all codeblocks checking/updating the unwind state, validating EXECUTABLE flags, process the basic block, & their PHI instructions to validate them, & inserts unwinding instructions.

Otherwise it uses a greedy worklist to iterate over the basic blocks in postorder to validate them & determine in which order to process them.

A few final checks/loops outputs debug information to the GCC devs & frees the collections.

4/?

Follow

To actually process a basic block it first checks if there's actually any PHI if the caller told us to process them.

Unless we're handling cycles, it'll consider recursively inserting the loop header into place.

It'll optionally iterate over the PHI instructions to recursively compute a "value number" (hash?) for the expressions they reference.

It updates the edges' EXECUTABLE flags. It iterates over instructions to choose other branches to examine & lookup what can replace this instruction.

`SSA_NAMES`, assignments, conditions, & calls are handled specially. There's plenty of debugging info written for the GCC devs. If it's made any changes there may be per-instruction tidyup for it to perform, possibly defered to a later pass.

Failing to find another instruction to replace it `pass_fre` will add it's parameters to the collections of instructions which may replace later instructions.

6/6 Fin, for now. Next: initial Value Range Propagation

Yes, FRP wasn't very straightforward!

I've just uploaded this thread to: adrian.geek.nz/gnu_docs/c-comp

Why DID I type it as FRP in that last toot?

If all goes well tonight, I'll be discussing Voice2JSON tomorrow morning! Did you hear how much I'm enjoying that tool?

More such writing at: adrian.geek.nz/docs

Sign in to participate in the conversation
FLOSS.social

For people who care about, support, or build Free, Libre, and Open Source Software (FLOSS).