Lab 2: Dataflow and optimizations
Objective
The goal of this assignment is to develop an optimizer that
performs dataflow analysis and scalar optimizations. This assignment
has the following components:
- Construct the Control Flow Graph (CFG). (5%)
- Perform Strongly Connected Region (SCR) analysis. (5%)
- Perform reaching definitions (20%) and use it to perform
simple constant propagation. (25%)
- Perform live variable analysis (20%) and use it to perform
dead statement elimination. (25%)
- Extra credit: perform constant propagation with the Wegman &
Zadeck algorithm. (+15%)
Project Description
You will add this functionality to the translator you implemented
for the Lab 1. Your compiler will call the csc compiler to compile a C
subset file named as <filename.c>. The optimizer will take
3-address format code as input, perform the analysis and
optimizations on it, and generate optimized 3-address format code as
output. The last step of your optimizer will convert this 3-address
form to C, using your translator from Lab 1.
Getting Started On Lab2
- Download the example C files and the CFG output at http://www.cs.utexas.edu/users/mckinley/380C/labs/cs380c_lab2.tgz
- Decompress the file on the cs unix machines using
"tar xvfz cs380c_lab2.tgz".
This command will produce the following directory structure
cs380c_lab2
examples
lab2
The examples directory contains 10 example c files named as <filename.c>,
and their CFG output named <filename.ta.cfg>, and two
bash scripts (check.sh, check-one-cfg.sh). Many of these are the same as in Lab 1.
You can use the provided files to verify your implementation.
- Implement your own optimizer in the lab2 directory. Your
compiler should accept 3-address code as input from stdin and write
output to stdout. Your compiler backend should generate the following
output:
- CFG
- 3-address code - a backend that prints out your optimized
(or unoptimized) 3-address format
- C - functionality from Lab 1
- An optimization report
Your compiler will be invoked by the script run.sh' and should accept
the following command line arguments.
- -opt, a comma separated list of optimizations: scp (Simple
Constant Propagation), dse (Dead Statement Elimination) and wcp
(Wegman/Zadeck Constant Propagation)
- -backend, specifies the particular code generator: c, cfg,
3addr, and rep (the optimization report)
Here are some usage examples.
- ./run.sh -backend=c # This is lab 1
- ./run.sh -backend=cfg # Part 1 of lab 2
- ./run.sh -opt=scp -backend=3addr # Perform simple constant
propagation and generate output in 3-address format.
- ./run.sh -opt=wcp -backend=c # Perform Wegman constant
propagation and produce C code as output.
- ./run.sh -opt=scp -backend=cfg # Perform simple constant
propagation and write out the cfg (after these optimizations)
- ./run.sh -opt=scp -backend=rep # Perform simple constant
propagation and write out the optimization report (after these
optimizations)
- Verify your implementation.
Modify the scripts in example directory to check your own
implementation
- Turn in your implementation.
(See "Turning In Your Assignment" section for more details.)
Optimizer Implementation Specification
- Construct the Control Flow Graph (CFG).
Before performing iterative data-flow analysis, you will need
to generate the CFG. The nodes of the CFG are the basic blocks. The
basic blocks in turn consist of a sequence of instructions in
3-address format.
Your implementation should print out the lead instruction's number
as the basic block number. It also needs to print out the edges
between the basic blocks.The following is the expected
output for the gcd.c example:
Function: 2
Basic blocks: 2 3 5 14
CFG:
2 -> 3
3 -> 5 14
5 -> 3
14 ->
Function: 18
Basic blocks: 18 30 46
CFG:
18 -> 30
30 -> 46
46 ->
The output should be self-explanatory. The numbers are the instruction
numbers that start the functions and basic blocks (leaders from the
class lecture). The list of basic blocks and CFG successors should be
sorted numerically. No optimization report is required for CFG
generation.
For all programs in the examples directory, the CFGs expressed
in the above format are given along with the source program. Before
turning in your assignment, you should check whether your output
matches these files. We will perform additional tests when grading,
and recommend you use your test files from Lab 1 and others to check
your code.
- Perform SCR analysis on the CFG. Produce an optimization
report in the following format:
Function: 2
Basic blocks: 2 6 11 22
SCR: 2 6 11
- Perform simple constant propagation based on reaching
definitions. You need to perform reaching definitions analysis and run
a simple constant propagation analysis that determines if the
reaching definitions to a variable are all the same constant. The execution time of
the algorithm is not a criterion. Your analysis should converge and
compute the correct results. You must report the
number of constants which have been propagated.
Produce an optimization report in the following format:
Function: 2
Number of constants propagated: 3
Function: 18
Number of constants propagated: 6
- Perform dead statement elimination based on liveness analysis.
First perform liveness analysis and then use the result of the analysis to eliminate
statements that produce a result that is not live. You must report the
number of statements which have been eliminated. The statements in
strongly connected regions (SCR) and those that are not in a SCR
need to be counted separately.
Produce optimization report in the following format:
Function: 2
Number of statements eliminated in SCR: 1
Number of statements eliminated not in SCR: 2
Function: 18
Number of statements eliminated in SCR: 2
Number of statements eliminated not in SCR: 3
- Extra Credit. Perform a more sophisticated constant
propagation algorithm based on your own insights and/or Wegman &
Zadeck, Constant
Propagation with Conditional Branches, ACM Transactions on Programming
Languages and Systems, 13(2):181-210, April 1991. Report number of
constants your algorithm propagates. Produce an optimization report
with the same format as simple constant propagation.
Turning In Your Assignment
Your assignment should contain the following.
- A single tar.gz file named lab2-groupX.tgz
(X should be replaced by your group number), which, when extracted,
creates the lab2 directory.
- The lab2 directory can contain sub-directories.
- The lab2 directory should contain the following files:
- README - Please include your name(s) and student ID(s).
- example-scp.c - a synthetic program that tests constant
propagation.
- example-dse.c - a synthetic program that tests statement
elimination.
- compile.sh - a script to compile your source code.
- run.sh - a script that runs your compiler. This script
should read 3-address code as input from stdin and write the output to
stdout. The output is specified by the command line arguments
described above.
Turn in your assignment by emailing the final lab2-groupX.tgz to the
instructor.