Course Project Descriptions
(Advanced Compiler Techniques, Fall 2011)

Introduction

The purpose of the final project is to give you a chance to focus on a specific problem, and to work out the solution to this problem in detail. Your project should give you the opportunity to review the current literature in your chosen area, to implement either the ideas you read about or to implement your own improvements on those ideas, and to evaluate the effectiveness of the ideas. You could develop new analyses, implement an analysis recently proposed by other people, evaluate existing analyses or optimizing techniques, or use program analyses to solve an interesting problem (such as applications in software engineering, testing, debugging, parallelization).

Students will work on a specific project in groups of 2-3 members.

Students who are not involved in compiler-related research could choose to work on the regular programming assignments.

Requirements

Tools

We recommend the Soot compiler infrastructure from McGill University.

But you can use any of the freely available compiler infrastructures: JikesRVM, SUIF, Eclipse, Joeq, Wala, LLVM, gcc, abc, MIT-Flex, etc.

Projects (This list is obsolete!! For reference only!)

A list of possible projects are listed below. You are encouraged to propose a new project that is not listed here. The only requirement is that the project should have some element of "program'' analysis in it, although the "program'' being analyzed are not restricted to Java or C source code.

  1. Space Optimization: Java Virtual machines are being implemented in small devices with restricted memory and caches. In this context it becomes important to reduce the size of the code (rather than the speed of the code). In this project the purpose will be to reduce the size of the bytecode, reduce the number of local variables, and reduce the maximum stack height, rename strings to reduce constant pool size, remove useless fields and useless methods, reverse inlining, and so on. Static size measurements can be used to evaluate the effectiveness of the optimizations.
     
  2. Comparing intermediate representations: The Soot infrastructure is a research infrastructre from McGill. While it has proven to be a useful research infrastructure utilized by many research groups, there are some deficiencies in the bytecode produced by Soot. This project would evaulate bytecode produced by javac, Soot, and Jikes (jikesrvm.org). What is different about the bytecode? How could the Soot bytecode be improved? Are there optimizations performed by Jikes and not by Soot?
     
  3. Faster Data Flow Analysis for Soot: The design of the Soot data flow analysis framework focuses on flexibility and genericity rather than on speed and optimal memory use. This project would focus on a speed and/or memory optimized data flow analysis framework for Soot, perhaps even at the expense of flexibility.

    Possible areas that might be examined are: (1) finding more efficient representations of flow facts, (2) different work-list algorithms specialized to specific kinds of dataflow problems, and (3) avoiding analysis or reanalysis of parts of the CFG not impacted by this analysis. Other ideas are also welcome.
     

  4. String optimizations on Java bytecode: Modern versions of javac generate efficient calls to StringBuffers for computations involving string concatenation in one expression.

    For example, the expression:

      y = x+"h"+1+"l"+"jennifer"

    would get generated as something like the following:

      x = new StringBuffer
      x.init
      x.append(x)
      x.append("h")
      x.append(1)
      ...
      y = x.toString()

    However, if a method is computing a string via many different concatenations to a variable (as in a pretty printer or code generator), then not all of the string concatenations show up in a single expression - they are spread throughout the method body. Each smaller statement will result in a new StringBuffer and extraneous work.

    The purpose of this project is to automatically detect and optimize String operations within a method body.
     

  5. Less Naive Jimple: Currently the Soot framework first produces very naive Jimple, and then applies a series of cleanup steps to produce tighter Jimple that is suitable for further optimizations. Although this is a very modular approach, it may be be case that it is more expensive to first produce very verbose Jimple than it would be to try and produce slightly better Jimple to begin with.

    The purpose of this project is to analyze the costs involved in the process of going from bytecode to Jimple, and to suggest and implement improvements to this process.

    Recently a Soot user reported some tuning that he felt helped speed up the process of producing Jimple, which included:

    These and other changes may make a significant improvement in the time it takes to produce good Jimple code.
     

  6. Efficient shape analysis on SSA form: In many shape analyses, each object is modeled by the set of local variables that point to it. In the worst case, this could be any subset, so the analysis is exponential in the number of local variables. However, only local variables that are live are relevant to the analysis, and Static Single Assignment form has some useful properties regarding live variables. Can such a shape analysis be implemented more efficiently if the program being analyzed is in SSA form?
     

  7. Call graph construction: A call graph is a pre-requisite for almost every interprocedural analysis for an object-oriented language such as Java. Many open problems remain (you can choose one):

    (a) Integrated development environments (IDEs) are one context in which call graphs are useful. However, precise call graph construction algorithms that construct a call graph from scratch are too slow to be executed after every change made to the program being edited. Design and implement an incremental call graph construction algorithm that could execute continuously in an IDE while the program is being modified.

    (b) Most call graph construction algorithms require analysis of the whole program. This requirement is impractical since many programs rely on dynamically loaded libraries that may not be available for analysis. Design call graph construction algorithms for analyzing partial programs, and compare their precision against algorithms that have access to the whole program.

    (c) Even when using precise analyses, call graphs for Java programs constructed by static analysis tend to contain many more methods than are actually executed when a benchmark is run. This project involves comparing dynamic (run-time) and static call graphs, finding causes of the differences, and suggesting improvements to either the test suite (to increase the dynamic call graph size) or to the static analysis (to reduce the static call graph size) to reduce the discrepancy.
     

  8. Comparing Pointer Analysis Effectiveness: Many approaches are proposed for pointer analysis, including many alias and points-to analyses. Soot has implemented some points-to analyses already. The project is asking you to implement a couple of recently proposed pointer analysis approaches and compare their performance (efficiency, accuracy, etc.) with the implementation in Soot.
     

  9. Refactoring Java in Soot: All methods are not equal, and all parts of a method are not equal, meaning that some portions of a method may have hot parts, when other portions are cold (executed rarely). Since it takes time to compile cold parts, it has been suggested that such methods should be refactored so that only the hot parts remain in the method body and the cold parts are moved to other methods. The purpose of this project would be to use Soot to profile frequencies of important basic blocks. Then, based on the profiling factor out the cold code to new methods.

    Reference: John Whaley. Partial Method Compilation using Dynamic Profiling, OOPSLA 2001.
     
  10. Field Analysis and related optmization: In [GRS00], simple analyses were presented based on object fields. Implementing such an analysis in Soot should be relatively simple. Once the analysis is implemented, bytecode optimizations could be performed.

    [GRS00] Sanjay Ghemawat, Keith H. Randall, and Daniel J. Scales. Field analysis: getting useful and low-cost interprocedural information. PLDI 2000.
     

  11. Projects in AspectJ:  abc is the AspectBench Compiler for AspectJ, developed by a team from Oxford, McGill and Aarhus ( http://www.aspectbench.org). The front-end of the compiler is developed using Polyglot and the back-end uses Soot.  If you are interested in Aspect-Oriented Programming, many existing program analyses could also be modified to support AspectJ using abc.
     

  12. Other Projects: Many other analyses/problems can be implemented/solved using Soot, such as

Past Project List