Lab 3: SSA and optimizations

    Objective

    The goal of this assignment is to build Static Single Assignment (SSA) form, and perform optimizations on SSA. This assignment has the following components:

    1. Build Static Single Assignment (SSA) form. (25%)
    2. Perform SSA based constant propagation. (30%)
    3. Perform SSA based loop invariant code motion. (30%)
    4. Translate SSA back to non-SSA 3-address code. (15%)
    5. Extra Credits: Perform global common subexpression elimination. (15%)

    Project Description

    You will add this functionality to the optimizer and the translator you have implemented for Lab 1 and 2. Your compiler will call the csc compiler to compile the subset (SC) files. Your new optimizer will be able to take the 3-address code as input, build 3-address SSA code, output 3-address SSA code, perform optimizations on it, translate SSA back to non-SSA 3-address code, generating non-SSA optimized 3-address code.

    Getting Started On Lab3

    1. Download the lab files at http://www.cs.utexas.edu/users/mckinley/380C/labs/cs380c_lab3.tgz
    2. Decompress the file on the cs unix machines using
      	"tar xvfz cs380c_lab3.tgz".
      
      This command will produce the following directory structure
      	cs380c_lab3
      		lab3
      
      There are only two script files and a README file in the lab3 directory. You should reuse all the examples files from lab2.
    3. Implement your own new optimizer in the lab3 directory. As usual, your compiler should accept 3-address code as input from stdin, and write output to stdout. The backend of your compiler needs to generate the following output:
      1. CFG
      2. 3-address code - The backend writes out the optimized or unoptimized 3-address format, as you did previously.
      3. 3-address SSA code - The backend writes out 3-address SSA code.
      4. C - Same Lab 1
      5. Optimization report

      Your compiler will be invoked by the script 'run.sh', and should accept the following command line arguments.
      1. -opt, a comma separated list of transformations. The transformations you will support are:
        • scp (simple constant propagation)
        • cse (common subexpression elimination)
        • licm (loop invariant code motion)
        • ssa (static single assignment)
        For example:
        • -opt=ssa,scp means convert to SSA form and perform constant propagation.
        • -opt=ssa,licm means convert to SSA form and then perform constant propagation.
      2. -backend determines which backend to use: c, cfg, 3addr, rep, and ssa. If ssa is not specified, generate 3-address code without SSA instructions. For example,
        -backend=ssa,3addr means to generate 3-address format code in SSA form.

    Optimizer Implementation Specification

    1. Build SSA form. Implement the SSA algorithm discussed in class. Use the opcode phi for phi functions. Name subscripted variables as variable$subscript. For example, variable i will become i$0, i$1, etc. Take a look at a simple example of a loop, and its representation in 3-address and SSA formats.
      	i = 0;
      	while (i < 10) {
      		i = i + 1;
      	}
      
      3-address format:
      	instr 4: move 0 i#-8
      	instr 5: cmplt i#-8 10
      	instr 6: blbc (5) [10]
      	instr 7: add i#-8 1
      	instr 8: move (7) i#-8
      	instr 9: br [5]
      	instr 10: ret 0
      
      Thw 3-address format with SSA: (Note that SSA does not have offsets. You have to keep track of offsets in your symbol table for memory allocation.)
      	instr 4: move 0 i$0
      	instr 23: phi i$0 i$2
      	instr 24: move (23) i$1
      	instr 5: cmplt i$1 10
      	instr 6: blbc (5) [10]
      	instr 7: add i$1 1
      	instr 8: move (7) i$2
      	instr 9: br [23]
      	instr 10: ret 0
      
      Notice the new instructions instr 23 and 24. Only variable i has a phi instruction. Since virtual registers generated by our frontend csc are all block-local (i.e., written to and read from within the same basic-block), they do not require phi functions. In other words, all of them would be dead. Therefore, do not insert phi functions for block-local virtual variables or other variables with this property. Perform an analysis (using live variables from Lab 2) that detects if the phi function is dead. For this assignment, you do not have to add support to your frontend to read in 3-address code in SSA form.
    2. Perform SSA based Constant Propagation. Perform constant propagation on the CFG in SSA form. You must report the number of constants you propagate. Produce an optimization report in the following format:
       
      	Function: 2
      	Number of constants propagated: 3
      	Function: 18
      	Number of constants propagated: 6
      
    3. Perform SSA based loop invariant code motion. For each statement in the program, if none of the operands points to a phi node or a definition inside a loop, the result of this statement is loop invariant. You should hoist that statement to the loop header. Note that you can and should apply these criteria recursively. Produce an optimization report in the following format:
      	Function: 2
      	Number of statement hoisted: 3
      	Function: 9
      	Number of statement hoisted: 2
      
    4. Translate from SSA to non-SSA 3-address code. After performing the SSA based optimizations, you need to translate from SSA format back to regular 3-address code, and then, if specified, to C code.
    5. Extra Credit: Perform global common subexpression elimination on the CFG in SSA form.

    Turning In Your Assignment

    Your assignment should contain the following.

    1. A single tgz file named lab3-groupX.tgz (X should be replaced by your group number), which, when extracted, creates the lab3 directory.
    2. The lab3 directory can contain sub-directories.
    3. The lab3 directory should contain the following files:
      1. README - Please include your name(s) and student ID(s) here.
      2. compile.sh - a script to compile your source code.
      3. run.sh - a script that runs your compiler. This script should read 3-address code as input from stdin and write output to stdout. The output is specified by the command line arguments described in the previous section.

    Turn in your assignment by emailing the final lab3-groupX.tgz to the instructor.

     
    Thanks & Acknowledgements

    These assignments are derived from Prof. Kathryn McKinley's Advanced Compiler Techniques class.