CodeGen SSA Design Document

From Open64 Wiki

Jump to: navigation, search

Contents

Objective

The goal of the CGSSA project is to build SSA for CG. CGSSA will provide a better “DU manager” to CG optimizations, ease their implementation, and enable them to work on a global scope.

Background

Currently, the optimizations in CG are performed on extended basic blocks (EBB). Although EBB-level optimizations are more effective than those on the basic-block level, it has been conceived that CG optimizations can be even more effective if they are applied to a larger scope with SSA support.

Overview

1. CFG design

Currently, CG builds its CFG at the beginning of CG, using the data structure, struct BB. It builds the dominators (post-dominators, control dependence...) using separate data structures in a few places. On the other hand, WN_CFG (used by WSSA) is built on top of cfg_base, which is a base class that integrates all CFG related functionalities. Our approach is to build CG_CFG on top of cfg_base. We can have CFG functionalities by reusing a lot of code.

2. SSA design

Mostly, the algorithm will be similar to the existing implementations in WSSA and HSSA. Hence, the CGSSA builder will also go through the same three stages (i.e., building DF, inserting phi and renaming variables). However, some design decisions are made differently to suit the particular needs of CG IR and the optimizations.

  * Side data structures

There is no new OP introduced to represent PHI. Instead, PHIs are maintained in the side data structures connected with the basic blocks. Similarly, MUs and CHIs are maintained in the side data structures associated with OPs. Making SSA a side data structure allows an incremental evolution of CG optimizations by staying transparent to the existing SSA-oblivious ones.

  * No memory SSA

Memory accesses are not in SSA. Instead, the CG optimizations will have to rely on alias analysis when necessary. Although there is no memory access modeled in SSA, CHI and MU are needed in SSA, since they are needed for dedicated register TNs (details are shown in the next section).

  * Allowing overlapping live ranges with control flag

Allowing overlapping live ranges may give optimizations more freedom, while it may not be beneficial in terms of register pressure. There is a control flag in SSA to decide whether overlapping live range is allowed. The CG transformations need to check the flag before performing optimizations.

When overlapping live range is allowed, there is an additional pass before leaving SSA to insert copy instructions and rename the uses for each of such overlaps.

The default value is NOT allowing overlapping live ranges.

  * Modeling the dedicated register TN with control flag

When building SSA, a control flag is passed to indicate whether dedicated register TNs need to be modeled or not. If yes, CHI and MUs are needed to model may-uses and may-defs of dedicated register TN. If not, the version for a dedicated register TN will always return INVALID_VER. This control gives CGSSA the ability to model dedicated registers (with extra overhead) only when the CG optimizations can perform optimizations on dedicated register TNs.

The default value is NOT modelling dedicated register TN.

Detailed Design

1. CG_CFG design

CG_CFG shares the same idea and a lot of code with Whirl CFG.

  • CG_CFG is a derived class from cfg_base class.
  • CG_CFG will provide the CFG functionalities implemented in cfg_base class
    • Control flow information
    • dom/post-dom/dom-frontier/control-dep information

CG_CFG is ccurrently built from original CFG inside CG. There is mapping from the original BBs to the new BB_NODEs and vice verses. The update to the new CG_CFG will also change the original CFG.

2. CGSSA Core Data Structures

The Use-Def/Def-Use/Def-Def information is maintained in the VERSION entries. To represent where a version is defined or used, we introduce reference IDs (REF_ID) and maintain a mapping from REF_ID to a pair of (OP/PHI/CHI/MU, operand index/result index).

A version ID (VER_ID) is associated with each TN (register TN and constant TN). From the version table, a version entry (VERSION) can be accessed through VER_ID. A VERSION can be defined by OP, PHI, or CHI and used in OP, PHI, CHI or MU.

A VERSION includes:

  • VER_ID _ver_num // version id
  • VERSION_KIND _kind // version kind (occ def, phi def, chi def or unkown)
  • TN* _tn // tn associated with def
  • REF_ID _def // UD chain
  • set<REF_ID> _uses // DU chain
  • VER_ID _prev_ver // DD chain

PHI_NODE is derived from SSA_NODE_BASE and has 1 result and n operands (represented by VER_ID).

CHI_NODE is derived from SSA_NODE_BASE and has 1 result and 1 operand (represented by VER_ID). CHI_NODE represents the may-defs of dedicated register TNs that may be defined by the side effects of OPs (e.g., TOP_call). All the non-callee-saved dedicated registers will have a CHI on TOP_call.

MU_NODE is derived from SSA_NODE_BASE and has 1 operand (represented by VER_ID). MU_NODE represents the may-uses of register TNs. There are two cases to use MU_NODE.

  • Conditional def OP will have MU_NODE for its def TN. MU_NODE’s opnd is the previous version (i.e., top of the renaming stack) of its def-TN.
  • OPs whose def-TN is a dedicated register. Because of the alias relation among dedicated register TNs (e.g., AL, AX, EAX, RAX of the x86 architecture), MU_NODE is needed to represent that there is a use of previous version of aliased dedicated registers.

The following is an example.

  GTN1 (ver1) :- mov GTN150 (ver10)
  GTN101 (ver2) :- add GTN 151 (ver11), TN 152 (ver12), MU_NODE(ver1)
  GTN160 (ver15) :- sub GTN1 (ver2), TN 152 (ver12)

GTN101 represents the same hardware register as GTN1 but smaller size. The use of GTN1 is defined in ver2 and ver2 has a MU_NODE of ver1. In the SSA renaming phase, all the aliased dedicated register TNs share the same renaming stack. There are other ways to represent this situation (e.g., PHI). But our approach is a simple while conservative approach, since we are modelling registers as a whole, even if they can be individual addressed and used (for example, AL vs. AH or YMM registers).

CGSSA maintains several maps and tables to store PHI, CHI, MU and reference ID information, including:

  • _BB_PHI_map: hash_map<UINT32, pair<PHI_NODE*, PHI_NODE*>> // phi is with BB
  • OP_CHI_map: hash_map<INTPTR, CHI_NODE*> // chi is with OP
  • _OP_MU_map: hash_map<INTPTR, MU_NODE*> // mu is with OP
  • _ver_map: hash_map<REF_ID, VER_ID> // map ref_id to ver_id
  • _ref_id_map: hash_map<INTPTR, vector<REF_ID>> // map OP to ref_id vector
  • _op_idx_map: hash_map<REF_ID, pair<INTPTR, INT>> // map ref id to a pair
  • _ver_table: vector<VERSION*> // version table

3. CGSSA Builder

The entry function for building CGSSA is CGSSA::Build(). There is also a function CGSSA::leave() to leave CGSSA (if overlapping live range is allowed, copy instruction are inserted and variables are renamed).

Traditional three pass algorithm is used to build CGSSA.

  • Build DF (provided by CG_CFG)
  • PHI Insertion: insert PHI into DF for def TNs and for dedicated register TNs in side_effect_op (e.g., TOP_call)
  • Renaming:
    • generate CHI for side_effect_op
    • generate MU for conditional def OPs and OPs that define dedicated register.
    • constant TNs and the TN whose use is before def are defined in entry CHIs

4. CGSSA Interface for Optimizations

 Interface functions to get a version ID for an OP/PHI/CHI/MU’s operand or result
 * VER_ID Opnd_Ver(OP* op, INT opnd_idx)
 * VER_ID Result_Ver(OP* op, INT res_idx)
 * VER_ID Opnd_Ver(PHI_NODE* phi, INT opnd_idx)
 * VER_ID Result_Ver(PHI_NODE* phi)
 * VER_ID Opnd_Ver(CHI_NODE* chi)
 * VER_ID Result_Ver(CHI_NODE* chi)
 * VER_ID Opnd_Ver(MU_NODE *mu)
 Interfaces are used to get the def for a version.
 A version can be represented by a VER_ID, OP, PHI, CHI or MU’s operands or result.
 * OP* Get_Occ_Def(VER_ID ver_id) ...
 * PHI_NODE* Get_Phi_Def(VER_ID ver_id) ...
 * CHI_NODE* Get_Chi_Def(VER_ID ver_id) ...
 Interfaces are used to get the use count for a version.
 * UINT32 Use_Count(VER_ID ver_id) ...
 Interface to iterate through the uses of a version:
 * VER_USE_iterator version->Use_Begin()
 * VER_USE_iterator version->Use_End()
 Interfaces to get PHI, CHI and MU:
 * PHI_NODE* Phi_Begin(BB_NODE* bb)
 * PHI_NODE* Phi_End()  
 * PHI_NODE* Get_Phi_List()
 * CHI_NODE* Chi_Begin(OP* op)
 * CHI_NODE* Chi_End()
 * CHI_NODE* Get_Chi_List(OP* op)
 * MU_NODE* Mu_Begin(OP* op)
 * MU_NODE* Mu_End()
 * MU_NODE* Get_Mu_List(OP* op)
 Interface to get the reaching definition is provided for the CG optimization to control whether performing an optimization instance that will introduce overlapping live ranges.
 * VER_ID Reaching_Def(OP* op, TN* tn)

5. CGSSA Updater Interfaces

CGSSA_updater provides interfaces to incrementally update CGSSA. These functions are counterpart of CGIR update functions.

  • VER_ID SSA_Gen_Register_TN(ISA_REGISTER_CLASS rclass, INT size)
  • VER_ID SSA_Gen_Typed_Register_TN(TYPE_ID mtype, INT size);
  • VER_ID SSA_Gen_Unique_Literal_TN(INT64 ivalue, INT size);
  • VER_ID SSA_Gen_Symbol_TN(ST *st, INT64 offset, INT32 relocs);
  • void SSA_Set_OP_opnd(OP* op, INT opndnum, VER_ID new_ver_id);
  • void SSA_OP_Change_To_Noop(OP *op);
  • OP* SSA_Mk_OP(TOP opr, ...);
  • void SSA_Replace_OP(OP* old_op, OP* new_op);
  • void Delete_Phi (PHI_NODE* phi, CGSSA_DEL_USES);
  • void Delete_OP (OP* op, CGSSA_DEL_USES);
  • void Delete_Stmt (VER_ID);

There are some limitations with the current implementation of the CGSSA updater; the update that involves CFG change is not fully supported yet. But in CodeGen, there is restriction that CG optimizations should not change CFG.

Uses of CGSSA

In the CGSSA project, we implement a global peephole optimizer phase in CG (currently before the first invocation of EBO) to use the new CGSSA infrastructure. Currently, this global peephole optimizer includes copy propagation, constant folding, and dead code elimination.

Here is the simplified sample code for copy propagation that shows how to use CGSSA.

 VER_ID opnd_ver = Ssa()->Opnd_Ver(op, opndnum);
 OP * def = Ssa()->Get_Occ_Def(opnd_ver);
 if (def != NULL && OP_effectively_copy(def) && ...) {
   VER_ID replace_ver = Ssa()->Opnd_Ver(op, 0);
   if (Propagatable(op, opndnum, replace_ver) {
     Ssa_updater()->SSA_Set_OP_opnd(op, opndnum, replace_ver);
   }
 }

Future work

We will keep on improving the stability and usability of CGSSA as more CG optimizations begin to use it.

Personal tools