@c Copyright (c) 2004, 2005, 2007, 2008 Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@c ---------------------------------------------------------------------
@c Tree SSA
@c ---------------------------------------------------------------------

@node Tree SSA
@chapter Analysis and Optimization of GIMPLE tuples
@cindex Tree SSA
@cindex Optimization infrastructure for GIMPLE

GCC uses three main intermediate languages to represent the program
during compilation: GENERIC, GIMPLE and RTL@.  GENERIC is a
language-independent representation generated by each front end.  It
serves as an interface between the parser and the optimizers.
GENERIC is a common representation that is able to represent programs
written in all the languages supported by GCC@.

GIMPLE and RTL are used to optimize the program.  GIMPLE is used for
target and language independent optimizations (e.g., inlining,
constant propagation, tail call elimination, redundancy elimination,
etc).  Much like GENERIC, GIMPLE is a language independent, tree based
representation.  However, it differs from GENERIC in that the GIMPLE
grammar is more restrictive: expressions contain no more than 3
operands (except function calls), there are no control flow
structures, and expressions with side-effects are only allowed on the
right hand side of assignments.  See the chapter describing GENERIC
and GIMPLE for more details.

This chapter describes the data structures and functions used in the
GIMPLE optimizers (also known as ``tree optimizers'' or ``middle
end'').  In particular, it focuses on all the macros, data structures,
functions and programming constructs needed to implement optimization
passes for GIMPLE@.

@menu
* Annotations::         Attributes for variables.
* SSA Operands::        SSA names referenced by GIMPLE statements.
* SSA::                 Static Single Assignment representation.
* Alias analysis::      Representing aliased loads and stores.
@end menu

@node Annotations
@section Annotations
@cindex annotations

The optimizers need to associate attributes with variables during the
optimization process.  For instance, we need to know whether a
variable has aliases.  All these attributes are stored in data
structures called annotations which are then linked to the field
@code{ann} in @code{struct tree_common}.

Presently, we define annotations for variables (@code{var_ann_t}).
Annotations are defined and documented in @file{tree-flow.h}.


@node SSA Operands
@section SSA Operands
@cindex operands
@cindex virtual operands
@cindex real operands
@findex update_stmt

Almost every GIMPLE statement will contain a reference to a variable
or memory location.  Since statements come in different shapes and
sizes, their operands are going to be located at various spots inside
the statement's tree.  To facilitate access to the statement's
operands, they are organized into lists associated inside each
statement's annotation.
Each element in an operand list is a pointer
to a @code{VAR_DECL}, @code{PARM_DECL} or @code{SSA_NAME} tree node.
This provides a very convenient way of examining and replacing
operands.

Data flow analysis and optimization is done on all tree nodes
representing variables.  Any node for which @code{SSA_VAR_P} returns
nonzero is considered when scanning statement operands.  However, not
all @code{SSA_VAR_P} variables are processed in the same way.  For the
purposes of optimization, we need to distinguish between references to
local scalar variables and references to globals, statics, structures,
arrays, aliased variables, etc.  The reason is simple: the compiler
can gather complete data flow information for a local scalar.  On the
other hand, a global variable may be modified by a function call, it
may not be possible to keep track of all the elements of an array or
the fields of a structure, etc.

The operand scanner gathers two kinds of operands: @dfn{real} and
@dfn{virtual}.  An operand for which @code{is_gimple_reg} returns true
is considered real, otherwise it is a virtual operand.  We also
distinguish between uses and definitions.  An operand is used if its
value is loaded by the statement (e.g., the operand at the RHS of an
assignment).  If the statement assigns a new value to the operand, the
operand is considered a definition (e.g., the operand at the LHS of
an assignment).

Virtual and real operands also have very different data flow
properties.  Real operands are unambiguous references to the
full object that they represent.  For instance, given

@smallexample
@{
  int a, b;
  a = b
@}
@end smallexample

Since @code{a} and @code{b} are non-aliased locals, the statement
@code{a = b} will have one real definition and one real use because
variable @code{a} is completely modified with the contents of
variable @code{b}.  Real definitions are also known as @dfn{killing
definitions}.  Similarly, the use of @code{b} reads all its bits.

In contrast, virtual operands are used with variables that can have
a partial or ambiguous reference.  This includes structures, arrays,
globals, and aliased variables.  In these cases, we have two types of
definitions.  For globals, structures, and arrays, we can determine from
a statement whether a variable of these types has a killing definition.
If the variable does, then the statement is marked as having a
@dfn{must definition} of that variable.  However, if a statement is only
defining a part of the variable (i.e.@: a field in a structure), or if we
know that a statement might define the variable but we cannot say for sure,
then we mark that statement as having a @dfn{may definition}.  For
instance, given

@smallexample
@{
  int a, b, *p;

  if (@dots{})
    p = &a;
  else
    p = &b;
  *p = 5;
  return *p;
@}
@end smallexample

The assignment @code{*p = 5} may be a definition of @code{a} or
@code{b}.  If we cannot determine statically where @code{p} is
pointing to at the time of the store operation, we create virtual
definitions to mark that statement as a potential definition site for
@code{a} and @code{b}.  Memory loads are similarly marked with virtual
use operands.  Virtual operands are shown in tree dumps right before
the statement that contains them.
To request a tree dump with virtual
operands, use the @option{-vops} option to @option{-fdump-tree}:

@smallexample
@{
  int a, b, *p;

  if (@dots{})
    p = &a;
  else
    p = &b;
  # a = VDEF <a>
  # b = VDEF <b>
  *p = 5;

  # VUSE <a>
  # VUSE <b>
  return *p;
@}
@end smallexample

Notice that @code{VDEF} operands have two copies of the referenced
variable.  This indicates that this is not a killing definition of
that variable.  In this case we refer to it as a @dfn{may definition}
or @dfn{aliased store}.  The presence of the second copy of the
variable in the @code{VDEF} operand will become important when the
function is converted into SSA form.  This will be used to link all
the non-killing definitions to prevent optimizations from making
incorrect assumptions about them.

Operands are updated as soon as the statement is finished via a call
to @code{update_stmt}.  If statement elements are changed via
@code{SET_USE} or @code{SET_DEF}, then no further action is required
(i.e., those macros take care of updating the statement).  If changes
are made by manipulating the statement's tree directly, then a call
must be made to @code{update_stmt} when complete.  Calling one of the
@code{bsi_insert} routines or @code{bsi_replace} performs an implicit
call to @code{update_stmt}.

@subsection Operand Iterators And Access Routines
@cindex Operand Iterators
@cindex Operand Access Routines

Operands are collected by @file{tree-ssa-operands.c}.  They are stored
inside each statement's annotation and can be accessed through either the
operand iterators or an access routine.

The following access routines are available for examining operands
(a combined example follows the list):

@enumerate
@item @code{SINGLE_SSA_@{USE,DEF,TREE@}_OPERAND}: These accessors will return
NULL unless there is exactly one operand matching the specified flags.  If
there is exactly one operand, the operand is returned as either a @code{tree},
@code{def_operand_p}, or @code{use_operand_p}.

@smallexample
tree t = SINGLE_SSA_TREE_OPERAND (stmt, flags);
use_operand_p u = SINGLE_SSA_USE_OPERAND (stmt, SSA_OP_VIRTUAL_USES);
def_operand_p d = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_ALL_DEFS);
@end smallexample

@item @code{ZERO_SSA_OPERANDS}: This macro returns true if there are no
operands matching the specified flags.

@smallexample
if (ZERO_SSA_OPERANDS (stmt, SSA_OP_ALL_VIRTUALS))
  return;
@end smallexample

@item @code{NUM_SSA_OPERANDS}: This macro returns the number of operands
matching @code{flags}.  This actually executes a loop to perform the count, so
only use this if it is really needed.

@smallexample
int count = NUM_SSA_OPERANDS (stmt, flags);
@end smallexample
@end enumerate
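
As a combined illustration of these accessors, the sketch below picks
out a statement that defines exactly one real name and touches no
virtual operands.  This is only a sketch: the helper name
@code{single_scalar_def} is hypothetical, and the operand flag macros
it uses are the ones listed with the iterators below.

@smallexample
/* If STMT has no virtual operands and exactly one real definition,
   return the defined name; otherwise return NULL_TREE.  */
static tree
single_scalar_def (tree stmt)
@{
  if (!ZERO_SSA_OPERANDS (stmt, SSA_OP_ALL_VIRTUALS))
    return NULL_TREE;

  return SINGLE_SSA_TREE_OPERAND (stmt, SSA_OP_DEF);
@}
@end smallexample
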
If you wish to iterate over some or all operands, use the
@code{FOR_EACH_SSA_@{USE,DEF,TREE@}_OPERAND} iterator.  For example, to print
all the operands for a statement:

@smallexample
void
print_ops (tree stmt)
@{
  ssa_op_iter iter;
  tree var;

  FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_ALL_OPERANDS)
    print_generic_expr (stderr, var, TDF_SLIM);
@}
@end smallexample


How to choose the appropriate iterator:

@enumerate
@item Determine whether you need to see the operand pointers, or just the
trees, and choose the appropriate macro:

@smallexample
Need               Macro:
----               -------
use_operand_p      FOR_EACH_SSA_USE_OPERAND
def_operand_p      FOR_EACH_SSA_DEF_OPERAND
tree               FOR_EACH_SSA_TREE_OPERAND
@end smallexample

@item You need to declare a variable of the type you are interested
in, and an ssa_op_iter structure which serves as the loop controlling
variable.

@item Determine which operands you wish to use, and specify the flags of
those you are interested in.  They are documented in
@file{tree-ssa-operands.h}:

@smallexample
#define SSA_OP_USE              0x01    /* @r{Real USE operands.}  */
#define SSA_OP_DEF              0x02    /* @r{Real DEF operands.}  */
#define SSA_OP_VUSE             0x04    /* @r{VUSE operands.}  */
#define SSA_OP_VMAYUSE          0x08    /* @r{USE portion of VDEFS.}  */
#define SSA_OP_VDEF             0x10    /* @r{DEF portion of VDEFS.}  */

/* @r{These are commonly grouped operand flags.}  */
#define SSA_OP_VIRTUAL_USES     (SSA_OP_VUSE | SSA_OP_VMAYUSE)
#define SSA_OP_VIRTUAL_DEFS     (SSA_OP_VDEF)
#define SSA_OP_ALL_USES         (SSA_OP_VIRTUAL_USES | SSA_OP_USE)
#define SSA_OP_ALL_DEFS         (SSA_OP_VIRTUAL_DEFS | SSA_OP_DEF)
#define SSA_OP_ALL_OPERANDS     (SSA_OP_ALL_USES | SSA_OP_ALL_DEFS)
@end smallexample
@end enumerate

So if you want to look at the use pointers for all the @code{USE} and
@code{VUSE} operands, you would do something like:

@smallexample
  use_operand_p use_p;
  ssa_op_iter iter;

  FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, (SSA_OP_USE | SSA_OP_VUSE))
    @{
      process_use_ptr (use_p);
    @}
@end smallexample

The @code{TREE} macro is basically the same as the @code{USE} and
@code{DEF} macros, only with the use or def dereferenced via
@code{USE_FROM_PTR (use_p)} and @code{DEF_FROM_PTR (def_p)}.  Since we
aren't using operand pointers, use and defs flags can be mixed.

@smallexample
  tree var;
  ssa_op_iter iter;

  FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_VUSE)
    @{
      print_generic_expr (stderr, var, TDF_SLIM);
    @}
@end smallexample

@code{VDEF}s are broken into two flags, one for the
@code{DEF} portion (@code{SSA_OP_VDEF}) and one for the USE portion
(@code{SSA_OP_VMAYUSE}).  If all you want to look at are the
@code{VDEF}s together, there is a fourth iterator macro for this,
which returns both a def_operand_p and a use_operand_p for each
@code{VDEF} in the statement.  Note that you don't need any flags for
this one.

@smallexample
  use_operand_p use_p;
  def_operand_p def_p;
  ssa_op_iter iter;

  FOR_EACH_SSA_MAYDEF_OPERAND (def_p, use_p, stmt, iter)
    @{
      my_code;
    @}
@end smallexample

There are many examples in the code, as well as the
documentation in @file{tree-ssa-operands.h}.

There are also a couple of variants on the stmt iterators regarding PHI
nodes.

@code{FOR_EACH_PHI_ARG} works exactly like
@code{FOR_EACH_SSA_USE_OPERAND}, except it works over @code{PHI} arguments
instead of statement operands.

@smallexample
/* Look at every virtual PHI use.  */
FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_VIRTUAL_USES)
  @{
    my_code;
  @}

/* Look at every real PHI use.  */
FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_USE)
  my_code;

/* Look at every PHI use.  */
FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_ALL_USES)
  my_code;
@end smallexample
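
As one small, concrete use of @code{FOR_EACH_PHI_ARG} (a sketch only;
the helper name @code{phi_uses_own_result_p} is hypothetical, and
@code{PHI_RESULT} is described in the SSA section below), the loop
below tests whether a @code{PHI} node carries its own result back in
as one of its arguments:

@smallexample
/* Return true if PHI uses the name it defines as one of its real
   arguments, e.g., because the value circulates around a loop.  */
static bool
phi_uses_own_result_p (tree phi)
@{
  use_operand_p use_p;
  ssa_op_iter iter;

  FOR_EACH_PHI_ARG (use_p, phi, iter, SSA_OP_USE)
    if (USE_FROM_PTR (use_p) == PHI_RESULT (phi))
      return true;

  return false;
@}
@end smallexample
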
@code{FOR_EACH_PHI_OR_STMT_@{USE,DEF@}} works exactly like
@code{FOR_EACH_SSA_@{USE,DEF@}_OPERAND}, except it will function on
either a statement or a @code{PHI} node.  These should be used when it is
appropriate but they are not quite as efficient as the individual
@code{FOR_EACH_PHI} and @code{FOR_EACH_SSA} routines.

@smallexample
FOR_EACH_PHI_OR_STMT_USE (use_operand_p, stmt, iter, flags)
  @{
    my_code;
  @}

FOR_EACH_PHI_OR_STMT_DEF (def_operand_p, phi, iter, flags)
  @{
    my_code;
  @}
@end smallexample

@subsection Immediate Uses
@cindex Immediate Uses

Immediate use information is now always available.  Using the immediate use
iterators, you may examine every use of any @code{SSA_NAME}.  For instance,
to change each use of @code{ssa_var} to @code{ssa_var_2} and call
@code{fold_stmt} on each statement after that is done:

@smallexample
  use_operand_p imm_use_p;
  imm_use_iterator iterator;
  tree ssa_var, ssa_var_2, stmt;


  FOR_EACH_IMM_USE_STMT (stmt, iterator, ssa_var)
    @{
      FOR_EACH_IMM_USE_ON_STMT (imm_use_p, iterator)
        SET_USE (imm_use_p, ssa_var_2);
      fold_stmt (stmt);
    @}
@end smallexample

There are 2 iterators which can be used.  @code{FOR_EACH_IMM_USE_FAST} is
used when the immediate uses are not changed, i.e., you are looking at the
uses, but not setting them.

If they do get changed, then care must be taken that things are not changed
under the iterators, so use the @code{FOR_EACH_IMM_USE_STMT} and
@code{FOR_EACH_IMM_USE_ON_STMT} iterators.  They attempt to preserve the
sanity of the use list by moving all the uses for a statement into
a controlled position, and then iterating over those uses.  Then the
optimization can manipulate the stmt when all the uses have been
processed.  This is a little slower than the FAST version since it adds a
placeholder element and must sort through the list a bit for each statement.
This placeholder element must also be removed if the loop is
terminated early.  The macro @code{BREAK_FROM_IMM_USE_STMT} is provided
to do this:

@smallexample
  FOR_EACH_IMM_USE_STMT (stmt, iterator, ssa_var)
    @{
      if (stmt == last_stmt)
        BREAK_FROM_IMM_USE_STMT (iterator);

      FOR_EACH_IMM_USE_ON_STMT (imm_use_p, iterator)
        SET_USE (imm_use_p, ssa_var_2);
      fold_stmt (stmt);
    @}
@end smallexample

There are checks in @code{verify_ssa} which verify that the immediate use list
is up to date, as well as checking that an optimization didn't break from the
loop without using this macro.  It is safe to simply @code{break} from a
@code{FOR_EACH_IMM_USE_FAST} traverse.

Some useful functions and macros:
@enumerate
@item @code{has_zero_uses (ssa_var)} : Returns true if there are no uses of
@code{ssa_var}.
@item @code{has_single_use (ssa_var)} : Returns true if there is only a
single use of @code{ssa_var}.
@item @code{single_imm_use (ssa_var, use_operand_p *ptr, tree *stmt)} :
Returns true if there is only a single use of @code{ssa_var}, and also returns
the use pointer and statement it occurs in, in the second and third parameters
(see the sketch after this list).
@item @code{num_imm_uses (ssa_var)} : Returns the number of immediate uses of
@code{ssa_var}.  It is better not to use this if possible since it simply
utilizes a loop to count the uses.
@item @code{PHI_ARG_INDEX_FROM_USE (use_p)} : Given a use within a @code{PHI}
node, return the index number for the use.  An assert is triggered if the use
isn't located in a @code{PHI} node.
@item @code{USE_STMT (use_p)} : Return the statement a use occurs in.
@end enumerate
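
As an illustration of these helpers (a sketch; the helper name
@code{replace_only_use} is hypothetical, not an existing GCC
function), @code{single_imm_use} combines naturally with
@code{SET_USE} from the operand section to redirect a name's only use:

@smallexample
/* If OLD_NAME has exactly one use, redirect that use to NEW_NAME and
   return the statement containing it; otherwise return NULL_TREE.
   SET_USE keeps the operand and immediate use information of the
   affected statement up to date.  */
static tree
replace_only_use (tree old_name, tree new_name)
@{
  use_operand_p use_p;
  tree use_stmt;

  if (!single_imm_use (old_name, &use_p, &use_stmt))
    return NULL_TREE;

  SET_USE (use_p, new_name);
  return use_stmt;
@}
@end smallexample
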
Note that uses are not put into an immediate use list until their statement is
actually inserted into the instruction stream via a @code{bsi_*} routine.

It is also still possible to utilize lazy updating of statements, but this
should be used only when absolutely required.  Both alias analysis and the
dominator optimizations currently do this.

When lazy updating is being used, the immediate use information is out of date
and cannot be used reliably.  Lazy updating is achieved by simply marking
statements modified via calls to @code{mark_stmt_modified} instead of
@code{update_stmt}.  When lazy updating is no longer required, all the
modified statements must have @code{update_stmt} called in order to bring them
up to date.  This must be done before the optimization is finished, or
@code{verify_ssa} will trigger an abort.

This is done with a simple loop over the instruction stream:
@smallexample
  block_stmt_iterator bsi;
  basic_block bb;
  FOR_EACH_BB (bb)
    @{
      for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
        update_stmt_if_modified (bsi_stmt (bsi));
    @}
@end smallexample

@node SSA
@section Static Single Assignment
@cindex SSA
@cindex static single assignment

Most of the tree optimizers rely on the data flow information provided
by the Static Single Assignment (SSA) form.  We implement the SSA form
as described in @cite{R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and
K. Zadeck.  Efficiently Computing Static Single Assignment Form and the
Control Dependence Graph.  ACM Transactions on Programming Languages
and Systems, 13(4):451-490, October 1991}.

The SSA form is based on the premise that program variables are
assigned in exactly one location in the program.  Multiple assignments
to the same variable create new versions of that variable.  Naturally,
actual programs are seldom in SSA form initially because variables
tend to be assigned multiple times.  The compiler modifies the program
representation so that every time a variable is assigned in the code,
a new version of the variable is created.  Different versions of the
same variable are distinguished by subscripting the variable name with
its version number.  Variables used in the right-hand side of
expressions are renamed so that their version number matches that of
the most recent assignment.

We represent variable versions using @code{SSA_NAME} nodes.  The
renaming process in @file{tree-ssa.c} wraps every real and
virtual operand with an @code{SSA_NAME} node which contains
the version number and the statement that created the
@code{SSA_NAME}.  Only definitions and virtual definitions may
create new @code{SSA_NAME} nodes.

@cindex PHI nodes
Sometimes, flow of control makes it impossible to determine the
most recent version of a variable.  In these cases, the compiler
inserts an artificial definition for that variable called
@dfn{PHI function} or @dfn{PHI node}.  This new definition merges
all the incoming versions of the variable to create a new name
for it.  For instance,

@smallexample
if (@dots{})
  a_1 = 5;
else if (@dots{})
  a_2 = 2;
else
  a_3 = 13;

# a_4 = PHI <a_1, a_2, a_3>
return a_4;
@end smallexample

Since it is not possible to determine which of the three branches
will be taken at runtime, we don't know which of @code{a_1},
@code{a_2} or @code{a_3} to use at the return statement.
So, the +SSA renamer creates a new version @code{a_4} which is assigned +the result of ``merging'' @code{a_1}, @code{a_2} and @code{a_3}. +Hence, PHI nodes mean ``one of these operands. I don't know +which''. + +The following macros can be used to examine PHI nodes + +@defmac PHI_RESULT (@var{phi}) +Returns the @code{SSA_NAME} created by PHI node @var{phi} (i.e., +@var{phi}'s LHS)@. +@end defmac + +@defmac PHI_NUM_ARGS (@var{phi}) +Returns the number of arguments in @var{phi}. This number is exactly +the number of incoming edges to the basic block holding @var{phi}@. +@end defmac + +@defmac PHI_ARG_ELT (@var{phi}, @var{i}) +Returns a tuple representing the @var{i}th argument of @var{phi}@. +Each element of this tuple contains an @code{SSA_NAME} @var{var} and +the incoming edge through which @var{var} flows. +@end defmac + +@defmac PHI_ARG_EDGE (@var{phi}, @var{i}) +Returns the incoming edge for the @var{i}th argument of @var{phi}. +@end defmac + +@defmac PHI_ARG_DEF (@var{phi}, @var{i}) +Returns the @code{SSA_NAME} for the @var{i}th argument of @var{phi}. +@end defmac + + +@subsection Preserving the SSA form +@findex update_ssa +@cindex preserving SSA form +Some optimization passes make changes to the function that +invalidate the SSA property. This can happen when a pass has +added new symbols or changed the program so that variables that +were previously aliased aren't anymore. Whenever something like this +happens, the affected symbols must be renamed into SSA form again. +Transformations that emit new code or replicate existing statements +will also need to update the SSA form@. + +Since GCC implements two different SSA forms for register and virtual +variables, keeping the SSA form up to date depends on whether you are +updating register or virtual names. In both cases, the general idea +behind incremental SSA updates is similar: when new SSA names are +created, they typically are meant to replace other existing names in +the program@. + +For instance, given the following code: + +@smallexample + 1 L0: + 2 x_1 = PHI (0, x_5) + 3 if (x_1 < 10) + 4 if (x_1 > 7) + 5 y_2 = 0 + 6 else + 7 y_3 = x_1 + x_7 + 8 endif + 9 x_5 = x_1 + 1 + 10 goto L0; + 11 endif +@end smallexample + +Suppose that we insert new names @code{x_10} and @code{x_11} (lines +@code{4} and @code{8})@. + +@smallexample + 1 L0: + 2 x_1 = PHI (0, x_5) + 3 if (x_1 < 10) + 4 x_10 = @dots{} + 5 if (x_1 > 7) + 6 y_2 = 0 + 7 else + 8 x_11 = @dots{} + 9 y_3 = x_1 + x_7 + 10 endif + 11 x_5 = x_1 + 1 + 12 goto L0; + 13 endif +@end smallexample + +We want to replace all the uses of @code{x_1} with the new definitions +of @code{x_10} and @code{x_11}. Note that the only uses that should +be replaced are those at lines @code{5}, @code{9} and @code{11}. +Also, the use of @code{x_7} at line @code{9} should @emph{not} be +replaced (this is why we cannot just mark symbol @code{x} for +renaming)@. + +Additionally, we may need to insert a PHI node at line @code{11} +because that is a merge point for @code{x_10} and @code{x_11}. So the +use of @code{x_1} at line @code{11} will be replaced with the new PHI +node. The insertion of PHI nodes is optional. They are not strictly +necessary to preserve the SSA form, and depending on what the caller +inserted, they may not even be useful for the optimizers@. + +Updating the SSA form is a two step process. First, the pass has to +identify which names need to be updated and/or which symbols need to +be renamed into SSA form for the first time. 
When new names are +introduced to replace existing names in the program, the mapping +between the old and the new names are registered by calling +@code{register_new_name_mapping} (note that if your pass creates new +code by duplicating basic blocks, the call to @code{tree_duplicate_bb} +will set up the necessary mappings automatically). On the other hand, +if your pass exposes a new symbol that should be put in SSA form for +the first time, the new symbol should be registered with +@code{mark_sym_for_renaming}. + +After the replacement mappings have been registered and new symbols +marked for renaming, a call to @code{update_ssa} makes the registered +changes. This can be done with an explicit call or by creating +@code{TODO} flags in the @code{tree_opt_pass} structure for your pass. +There are several @code{TODO} flags that control the behavior of +@code{update_ssa}: + +@itemize @bullet +@item @code{TODO_update_ssa}. Update the SSA form inserting PHI nodes +for newly exposed symbols and virtual names marked for updating. +When updating real names, only insert PHI nodes for a real name +@code{O_j} in blocks reached by all the new and old definitions for +@code{O_j}. If the iterated dominance frontier for @code{O_j} +is not pruned, we may end up inserting PHI nodes in blocks that +have one or more edges with no incoming definition for +@code{O_j}. This would lead to uninitialized warnings for +@code{O_j}'s symbol@. + +@item @code{TODO_update_ssa_no_phi}. Update the SSA form without +inserting any new PHI nodes at all. This is used by passes that +have either inserted all the PHI nodes themselves or passes that +need only to patch use-def and def-def chains for virtuals +(e.g., DCE)@. + + +@item @code{TODO_update_ssa_full_phi}. Insert PHI nodes everywhere +they are needed. No pruning of the IDF is done. This is used +by passes that need the PHI nodes for @code{O_j} even if it +means that some arguments will come from the default definition +of @code{O_j}'s symbol (e.g., @code{pass_linear_transform})@. + +WARNING: If you need to use this flag, chances are that your +pass may be doing something wrong. Inserting PHI nodes for an +old name where not all edges carry a new replacement may lead to +silent codegen errors or spurious uninitialized warnings@. + +@item @code{TODO_update_ssa_only_virtuals}. Passes that update the +SSA form on their own may want to delegate the updating of +virtual names to the generic updater. Since FUD chains are +easier to maintain, this simplifies the work they need to do. +NOTE: If this flag is used, any OLD->NEW mappings for real names +are explicitly destroyed and only the symbols marked for +renaming are processed@. +@end itemize + +@subsection Preserving the virtual SSA form +@cindex preserving virtual SSA form + +The virtual SSA form is harder to preserve than the non-virtual SSA form +mainly because the set of virtual operands for a statement may change at +what some would consider unexpected times. In general, statement +modifications should be bracketed between calls to +@code{push_stmt_changes} and @code{pop_stmt_changes}. For example, + +@smallexample + munge_stmt (tree stmt) + @{ + push_stmt_changes (&stmt); + @dots{} rewrite STMT @dots{} + pop_stmt_changes (&stmt); + @} +@end smallexample + +The call to @code{push_stmt_changes} saves the current state of the +statement operands and the call to @code{pop_stmt_changes} compares +the saved state with the current one and does the appropriate symbol +marking for the SSA renamer. 
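
When a transformation rewrites more than one statement, each statement
gets the same bracketing.  The following is a minimal sketch (the
statement names and the rewriting step are placeholders), with the
pops issued in the reverse order of the pushes:

@smallexample
  push_stmt_changes (&stmt1);
  push_stmt_changes (&stmt2);
  @dots{} rewrite STMT1 and STMT2 @dots{}
  pop_stmt_changes (&stmt2);
  pop_stmt_changes (&stmt1);
@end smallexample
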
+ +It is possible to modify several statements at a time, provided that +@code{push_stmt_changes} and @code{pop_stmt_changes} are called in +LIFO order, as when processing a stack of statements. + +Additionally, if the pass discovers that it did not need to make +changes to the statement after calling @code{push_stmt_changes}, it +can simply discard the topmost change buffer by calling +@code{discard_stmt_changes}. This will avoid the expensive operand +re-scan operation and the buffer comparison that determines if symbols +need to be marked for renaming. + +@subsection Examining @code{SSA_NAME} nodes +@cindex examining SSA_NAMEs + +The following macros can be used to examine @code{SSA_NAME} nodes + +@defmac SSA_NAME_DEF_STMT (@var{var}) +Returns the statement @var{s} that creates the @code{SSA_NAME} +@var{var}. If @var{s} is an empty statement (i.e., @code{IS_EMPTY_STMT +(@var{s})} returns @code{true}), it means that the first reference to +this variable is a USE or a VUSE@. +@end defmac + +@defmac SSA_NAME_VERSION (@var{var}) +Returns the version number of the @code{SSA_NAME} object @var{var}. +@end defmac + + +@subsection Walking use-def chains + +@deftypefn {Tree SSA function} void walk_use_def_chains (@var{var}, @var{fn}, @var{data}) + +Walks use-def chains starting at the @code{SSA_NAME} node @var{var}. +Calls function @var{fn} at each reaching definition found. Function +@var{FN} takes three arguments: @var{var}, its defining statement +(@var{def_stmt}) and a generic pointer to whatever state information +that @var{fn} may want to maintain (@var{data}). Function @var{fn} is +able to stop the walk by returning @code{true}, otherwise in order to +continue the walk, @var{fn} should return @code{false}. + +Note, that if @var{def_stmt} is a @code{PHI} node, the semantics are +slightly different. For each argument @var{arg} of the PHI node, this +function will: + +@enumerate +@item Walk the use-def chains for @var{arg}. +@item Call @code{FN (@var{arg}, @var{phi}, @var{data})}. +@end enumerate + +Note how the first argument to @var{fn} is no longer the original +variable @var{var}, but the PHI argument currently being examined. +If @var{fn} wants to get at @var{var}, it should call +@code{PHI_RESULT} (@var{phi}). +@end deftypefn + +@subsection Walking the dominator tree + +@deftypefn {Tree SSA function} void walk_dominator_tree (@var{walk_data}, @var{bb}) + +This function walks the dominator tree for the current CFG calling a +set of callback functions defined in @var{struct dom_walk_data} in +@file{domwalk.h}. The call back functions you need to define give you +hooks to execute custom code at various points during traversal: + +@enumerate +@item Once to initialize any local data needed while processing +@var{bb} and its children. This local data is pushed into an +internal stack which is automatically pushed and popped as the +walker traverses the dominator tree. + +@item Once before traversing all the statements in the @var{bb}. + +@item Once for every statement inside @var{bb}. + +@item Once after traversing all the statements and before recursing +into @var{bb}'s dominator children. + +@item It then recurses into all the dominator children of @var{bb}. + +@item After recursing into all the dominator children of @var{bb} it +can, optionally, traverse every statement in @var{bb} again +(i.e., repeating steps 2 and 3). + +@item Once after walking the statements in @var{bb} and @var{bb}'s +dominator children. At this stage, the block local data stack +is popped. 
@end enumerate
@end deftypefn

@node Alias analysis
@section Alias analysis
@cindex alias
@cindex flow-sensitive alias analysis
@cindex flow-insensitive alias analysis

Alias analysis proceeds in 4 main phases:

@enumerate
@item Structural alias analysis.

This phase walks the types for structure variables, and determines which
of the fields can overlap using offset and size of each field.  For each
field, a ``subvariable'' called a ``Structure field tag'' (SFT) is
created, which represents that field as a separate variable.  All
accesses that could possibly overlap with a given field will have
virtual operands for the SFT of that field.

@smallexample
struct foo
@{
  int a;
  int b;
@}
struct foo temp;
int bar (void)
@{
  int tmp1, tmp2, tmp3;
  SFT.0_2 = VDEF <SFT.0_1>
  temp.a = 5;
  SFT.1_4 = VDEF <SFT.1_3>
  temp.b = 6;

  VUSE <SFT.1_4>
  tmp1_5 = temp.b;
  VUSE <SFT.0_2>
  tmp2_6 = temp.a;

  tmp3_7 = tmp1_5 + tmp2_6;
  return tmp3_7;
@}
@end smallexample

If you copy the symbol tag for a variable for some reason, you probably
also want to copy the subvariables for that variable.

@item Points-to and escape analysis.

This phase walks the use-def chains in the SSA web looking for
three things:

@itemize @bullet
@item Assignments of the form @code{P_i = &VAR}
@item Assignments of the form @code{P_i = malloc ()}
@item Pointers and ADDR_EXPR that escape the current function.
@end itemize

The concept of `escaping' is the same one used in the Java world.
When a pointer or an ADDR_EXPR escapes, it means that it has been
exposed outside of the current function.  So, assignment to
global variables, function arguments and returning a pointer are
all escape sites.

This is where we are currently limited.  Since not everything is
renamed into SSA, we lose track of escape properties when a
pointer is stashed inside a field in a structure, for instance.
In those cases, we are assuming that the pointer does escape.

We use escape analysis to determine whether a variable is
call-clobbered.  Simply put, if an ADDR_EXPR escapes, then the
variable is call-clobbered.  If a pointer P_i escapes, then all
the variables pointed-to by P_i (and its memory tag) also escape.

@item Compute flow-sensitive aliases

We have two classes of memory tags.  Memory tags associated with
the pointed-to data type of the pointers in the program are called
``symbol memory tags'' (SMT)@.  The other class are those associated
with @code{SSA_NAME}s, called ``name memory tags'' (NMT)@.
The basic idea is that when adding operands for an INDIRECT_REF
*P_i, we will first check whether P_i has a name tag; if it does,
we use it, because that will have more precise aliasing
information.  Otherwise, we use the standard symbol tag.

In this phase, we go through all the pointers we found in
points-to analysis and create alias sets for the name memory tags
associated with each pointer P_i.  If P_i escapes, we mark
call-clobbered the variables it points to and its tag.


@item Compute flow-insensitive aliases

This pass will compare the alias set of every symbol memory tag and
every addressable variable found in the program.  Given a symbol
memory tag SMT and an addressable variable V, if the alias sets
of SMT and V conflict (as computed by may_alias_p), then V is
marked as an alias tag and added to the alias set of SMT@.

Every language that wishes to perform language-specific alias analysis
should define a function that computes, given a @code{tree}
node, an alias set for the node.  Nodes in different alias sets are not
allowed to alias.  For an example, see the C front-end function
@code{c_get_alias_set}; a minimal sketch of such a hook follows this
list.
@end enumerate
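
The following is a minimal sketch of such a language hook, not the
actual C front-end implementation.  The function name is hypothetical
and the exact type used for alias set numbers has varied across GCC
releases, so treat the signature as illustrative only.  Returning 0
places every node in alias set 0, which conflicts with every other
alias set and therefore disables language-specific disambiguation:

@smallexample
/* A deliberately conservative language-specific alias set hook.
   Alias set 0 conflicts with all other alias sets, so this hook
   never lets the language rule out any aliasing.  */
static HOST_WIDE_INT
example_lang_get_alias_set (tree t ATTRIBUTE_UNUSED)
@{
  return 0;
@}
@end smallexample
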
To see how this alias information is represented in the IL, consider
the following function:

@smallexample
foo (int i)
@{
  int *p, a, b;

  if (i > 10)
    p = &a;
  else
    p = &b;

  *p = 3;
  a = b + 2;
  return *p;
@}
@end smallexample

After aliasing analysis has finished, the symbol memory tag for
pointer @code{p} will have two aliases, namely variables @code{a} and
@code{b}.
Every time pointer @code{p} is dereferenced, we want to mark the
operation as a potential reference to @code{a} and @code{b}.

@smallexample
foo (int i)
@{
  int *p, a, b;

  if (i_2 > 10)
    p_4 = &a;
  else
    p_6 = &b;
  # p_1 = PHI <p_4, p_6>;

  # a_7 = VDEF <a_3>;
  # b_8 = VDEF <b_5>;
  *p_1 = 3;

  # a_9 = VDEF <a_7>
  # VUSE <b_8>
  a_9 = b_8 + 2;

  # VUSE <a_9>;
  # VUSE <b_8>;
  return *p_1;
@}
@end smallexample

In certain cases, the list of may aliases for a pointer may grow
too large.  This may cause an explosion in the number of virtual
operands inserted in the code, resulting in increased memory
consumption and compilation time.

When the number of virtual operands needed to represent aliased
loads and stores grows too large (configurable with @option{--param
max-aliased-vops}), alias sets are grouped to avoid severe
compile-time slow downs and memory consumption.  The alias
grouping heuristic proceeds as follows:

@enumerate
@item Sort the list of pointers in decreasing number of contributed
virtual operands.

@item Take the first pointer from the list and reverse the role
of the memory tag and its aliases.  Usually, whenever an
aliased variable Vi is found to alias with a memory tag
T, we add Vi to the may-aliases set for T@.  Meaning that
after alias analysis, we will have:

@smallexample
may-aliases(T) = @{ V1, V2, V3, @dots{}, Vn @}
@end smallexample

This means that every statement that references T, will get
@code{n} virtual operands for each of the Vi tags.  But, when
alias grouping is enabled, we make T an alias tag and add it
to the alias set of all the Vi variables:

@smallexample
may-aliases(V1) = @{ T @}
may-aliases(V2) = @{ T @}
@dots{}
may-aliases(Vn) = @{ T @}
@end smallexample

This has two effects: (a) statements referencing T will only get
a single virtual operand, and, (b) all the variables Vi will now
appear to alias each other.  So, we lose alias precision to
improve compile time.  But, in theory, a program with such a high
level of aliasing should not be very optimizable in the first
place.

@item Since variables may be in the alias set of more than one
memory tag, the grouping done in step (2) needs to be extended
to all the memory tags that have a non-empty intersection with
the may-aliases set of tag T@.  For instance, if we originally
had these may-aliases sets:

@smallexample
may-aliases(T) = @{ V1, V2, V3 @}
may-aliases(R) = @{ V2, V4 @}
@end smallexample

In step (2) we would have reverted the aliases for T as:

@smallexample
may-aliases(V1) = @{ T @}
may-aliases(V2) = @{ T @}
may-aliases(V3) = @{ T @}
@end smallexample

But note that now V2 is no longer aliased with R@.
We could
add R to may-aliases(V2), but we are in the process of
grouping aliases to reduce virtual operands, so what we do is
add V4 to the grouping to obtain:

@smallexample
may-aliases(V1) = @{ T @}
may-aliases(V2) = @{ T @}
may-aliases(V3) = @{ T @}
may-aliases(V4) = @{ T @}
@end smallexample

@item If the total number of virtual operands due to aliasing is
still above the threshold set by @option{--param max-aliased-vops},
go back to (2).
@end enumerate
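
The threshold is an ordinary @option{--param} knob; for example, a
translation unit that loses too much precision to alias grouping
might be compiled with a larger limit (the value shown is arbitrary,
not a recommendation):

@smallexample
gcc -O2 --param max-aliased-vops=1000 -c foo.c
@end smallexample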