Course Project
Descriptions
(Advanced Compiler Techniques, Fall
2011)
The purpose of the final project is to give you a chance to focus on a specific problem, and to work out the solution to this problem in detail. Your project should give you the opportunity to review the current literature in your chosen area, to implement either the ideas you read about or to implement your own improvements on those ideas, and to evaluate the effectiveness of the ideas. You could develop new analyses, implement an analysis recently proposed by other people, evaluate existing analyses or optimizing techniques, or use program analyses to solve an interesting problem (such as applications in software engineering, testing, debugging, parallelization).
Students will work on a specific project in groups of 2-3 members.
Students who are not involved in compiler-related research could choose to work on the regular programming assignments.
Requirements
We recommend the Soot compiler infrastructure from McGill University.
Soot main page: http://www.sable.mcgill.ca/soot/
Soot download page: http://www.sable.mcgill.ca/soot/soot_download.html
Soot tutorials: http://www.sable.mcgill.ca/soot/tutorial/
But you can use any of the freely available compiler infrastructures: JikesRVM, SUIF, Eclipse, Joeq, Wala, LLVM, gcc, abc, MIT-Flex, etc.
A list of possible projects are listed below. You are encouraged to propose a new project that is not listed here. The only requirement is that the project should have some element of "program'' analysis in it, although the "program'' being analyzed are not restricted to Java or C source code.
Possible areas that might be examined are:
(1) finding more efficient representations of flow facts, (2) different
work-list algorithms specialized to specific kinds of dataflow problems, and
(3) avoiding analysis or reanalysis of parts of the CFG not impacted by this
analysis. Other ideas are also welcome.
For example, the expression:
y = x+"h"+1+"l"+"jennifer"
would get generated as something like the following:
x = new StringBuffer x.init x.append(x) x.append("h") x.append(1) ... y = x.toString()
However, if a method is computing a string via many different concatenations to a variable (as in a pretty printer or code generator), then not all of the string concatenations show up in a single expression - they are spread throughout the method body. Each smaller statement will result in a new StringBuffer and extraneous work.
The purpose of this project is to
automatically detect and optimize String operations within a method body.
The purpose of this project is to analyze the costs involved in the process of going from bytecode to Jimple, and to suggest and implement improvements to this process.
Recently a Soot user reported some tuning that he felt helped speed up the process of producing Jimple, which included:
These and other changes may make a
significant improvement in the time it takes to produce good Jimple code.
Efficient shape analysis on SSA form: In many shape
analyses, each object is modeled by the set of local variables that point to
it. In the worst case, this could be any subset, so the analysis is
exponential in the number of local variables. However, only local variables
that are live are relevant to the analysis, and Static Single Assignment
form has some useful properties regarding live variables. Can such a shape
analysis be implemented more efficiently if the program being analyzed is in
SSA form?
Call graph construction: A call graph is a
pre-requisite for almost every interprocedural analysis for an
object-oriented language such as Java. Many open problems remain (you can
choose one):
(a) Integrated development environments (IDEs) are one context in which call
graphs are useful. However, precise call graph construction algorithms that
construct a call graph from scratch are too slow to be executed after every
change made to the program being edited. Design and implement an incremental
call graph construction algorithm that could execute continuously in an IDE
while the program is being modified.
(b) Most call graph construction algorithms require analysis of the whole program. This requirement is impractical since many programs rely on dynamically loaded libraries that may not be available for analysis. Design call graph construction algorithms for analyzing partial programs, and compare their precision against algorithms that have access to the whole program.
(c) Even when using precise analyses, call graphs for Java
programs constructed by static analysis tend to contain many more methods
than are actually executed when a benchmark is run. This project involves
comparing dynamic (run-time) and static call graphs, finding causes of the
differences, and suggesting improvements to either the test suite (to
increase the dynamic call graph size) or to the static analysis (to reduce
the static call graph size) to reduce the discrepancy.
Comparing Pointer Analysis
Effectiveness: Many approaches are proposed for pointer analysis,
including many alias and points-to analyses. Soot has implemented some
points-to analyses already. The project is asking you to implement a couple
of recently proposed pointer analysis approaches and compare their
performance (efficiency, accuracy, etc.) with the implementation in Soot.
[GRS00] Sanjay Ghemawat, Keith H. Randall,
and Daniel J. Scales. Field analysis: getting useful and low-cost
interprocedural information. PLDI 2000.
Projects in AspectJ: abc is the
AspectBench Compiler for AspectJ, developed by a team from Oxford, McGill
and Aarhus (
http://www.aspectbench.org). The front-end of the compiler is developed
using Polyglot and the back-end uses Soot. If you are interested in
Aspect-Oriented Programming, many existing program analyses could also be
modified to support AspectJ using abc.
Other Projects: Many other analyses/problems can be implemented/solved using Soot, such as