AST and model transformations for understandable and automatic source code modification

Source code transformations are used everywhere in modern development, from compilation chains that transform the source code of a program into other intermediate representations, to Integrated Development Environments (IDEs) that provide automatic source code modifications such as refactoring tools [1]. This technique is mainly used for refactoring, fixing bugs or applying suggestions, and automatically rewriting deprecated code, but also by dedicated tools that perform optimization, code analysis, and similar tasks. These tools rely mainly on AST rewriting techniques [2], which developers can implement through rewriting tools or through a dedicated framework for the source language. They can work at two levels, either the text level or the AST level [3], and define three main kinds of transformations: migration, which translates from one high-level representation to another; synthesis, which refines a high-level representation into a lower-level one; and rephrasing, which stays within the same high-level representation (e.g., refactoring tools) [1].
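As an illustration of a rephrasing transformation performed at the AST level, here is a minimal sketch using Python's standard `ast` module (Python 3.9+ for `ast.unparse`). The function names `old_api` and `new_api`, the class `RenameDeprecatedCall`, and the helper `rename_call` are hypothetical and chosen only for this example; they do not come from the cited works.

```python
import ast


class RenameDeprecatedCall(ast.NodeTransformer):
    """Rephrasing: a Python AST goes in, a Python AST comes out."""

    def __init__(self, old_name: str, new_name: str) -> None:
        self.old_name = old_name
        self.new_name = new_name

    def visit_Call(self, node: ast.Call) -> ast.AST:
        # Visit children first so that nested calls are rewritten too.
        self.generic_visit(node)
        if isinstance(node.func, ast.Name) and node.func.id == self.old_name:
            replacement = ast.Name(id=self.new_name, ctx=ast.Load())
            node.func = ast.copy_location(replacement, node.func)
        return node


def rename_call(source: str, old_name: str, new_name: str) -> str:
    """Parse the source, rewrite matching calls, and print the code back."""
    tree = RenameDeprecatedCall(old_name, new_name).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


# Hypothetical input: the deprecated name also appears inside a string.
SOURCE = 'label = "old_api(1)"\nresult = old_api(old_api(1), 2)'
REWRITTEN = rename_call(SOURCE, "old_api", "new_api")
print(REWRITTEN)
```

Because the rewrite is driven by the tree, only real call sites are renamed and the string literal is left alone, whereas a text-level `SOURCE.replace("old_api", "new_api")` would also alter the string.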

Problem statement and Ph.D. Proposal

Much research and many tools exist for AST rewriting, proposing languages to create and handle rewriting rules that can be applied to source code. Applying a rule involves two major activities: a unification (matching) activity that identifies all the portions of the code where the rewriting rule can apply, and the rewriting itself [1]. Currently, all the proposed solutions suffer from many drawbacks: they are often hard to read, hard to write, and sometimes impossible to debug [4].
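The split between matching and rewriting can be sketched in a few lines of Python on top of the standard `ast` module (Python 3.9+). This is only an illustrative toy, not one of the rewriting languages discussed above: the convention that pattern names starting with `_` are metavariables, and the names `match`, `Instantiate`, `ApplyRule`, and `rewrite`, are assumptions made for this example.

```python
import ast
import copy
from typing import Dict

Bindings = Dict[str, ast.AST]


def is_metavar(node: object) -> bool:
    # Assumed convention: in a pattern, a name starting with "_" is a metavariable.
    return isinstance(node, ast.Name) and node.id.startswith("_")


def match(pattern: object, node: object, env: Bindings) -> bool:
    """Unification step: does `node` have the shape of `pattern`?"""
    if isinstance(pattern, ast.AST):
        if is_metavar(pattern):
            if not isinstance(node, ast.expr):
                return False
            # A metavariable used twice must bind structurally equal subtrees.
            bound = env.setdefault(pattern.id, node)
            return ast.dump(bound) == ast.dump(node)
        if type(pattern) is not type(node):
            return False
        return all(match(getattr(pattern, f, None), getattr(node, f, None), env)
                   for f in pattern._fields)
    if isinstance(pattern, list):
        return (isinstance(node, list) and len(pattern) == len(node)
                and all(match(p, n, env) for p, n in zip(pattern, node)))
    return pattern == node


class Instantiate(ast.NodeTransformer):
    """Rewriting step: substitute bound subtrees into the rule's right-hand side."""

    def __init__(self, env: Bindings) -> None:
        self.env = env

    def visit_Name(self, node: ast.Name) -> ast.AST:
        return copy.deepcopy(self.env[node.id]) if node.id in self.env else node


class ApplyRule(ast.NodeTransformer):
    """Apply one rule `lhs -> rhs` bottom-up over a whole tree."""

    def __init__(self, lhs: str, rhs: str) -> None:
        self.lhs = ast.parse(lhs, mode="eval").body
        self.rhs = ast.parse(rhs, mode="eval").body

    def generic_visit(self, node: ast.AST) -> ast.AST:
        node = super().generic_visit(node)  # rewrite children first
        env: Bindings = {}
        if isinstance(node, ast.expr) and match(self.lhs, node, env):
            return Instantiate(env).visit(copy.deepcopy(self.rhs))
        return node


def rewrite(source: str, lhs: str, rhs: str) -> str:
    return ast.unparse(ApplyRule(lhs, rhs).visit(ast.parse(source)))


print(rewrite("y = (a + 0) * (b + 0 + 0)", "_x + 0", "_x"))
print(rewrite("z = f(n) - f(n)", "_x - _x", "0"))
```

Even in this toy, the rule itself (`_x + 0` becomes `_x`) is readable, but the machinery around it is where errors hide, which gives a flavor of why such rules can be hard to write and debug.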

Our insight is to explore the knowledge of the Model Driven Engineering (MDE) community, which has worked extensively on the problem of artifact transformations through the concepts of model transformations and model transformation languages (MTLs) [5]. We argue that MDE efforts on MTLs are a first step towards a better way of expressing AST rewriting rules, especially because AST rewriting takes place in a constrained environment (the AST of the language). AST rewriting can be seen as model transformations over the metamodel of an AST [6]. Indeed, endogenous transformations are model transformations that work within the same representation (rephrasing), and exogenous transformations are model transformations that transform one representation into another (migration and synthesis) [7, 6]. MTLs are easier to tame than AST rewriting languages, but they are not widely used [8], mainly because of poor tooling support. Moreover, model transformations can sometimes be seen as low level because they force the developer to handle every single artifact of the source or target model by hand. Finally, creating model transformations automatically and on the fly is still a hard task [8]. Raising the AST to a first-class artifact also opens interesting tooling opportunities, such as projectional editors [9] or complex refactorings [10], which are leads that we want to explore as well.
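To make the idea of an AST as a model conforming to a metamodel more concrete, the following Python sketch defines a tiny expression metamodel with dataclasses, one endogenous transformation that stays within it, and one exogenous transformation that targets a different metamodel. All class and function names here (`Num`, `Var`, `Add`, `Mul`, `Push`, `Load`, `BinaryOp`, `fold`, `lower`) are hypothetical and serve only as an illustration; real MTLs offer far richer constructs.

```python
from dataclasses import dataclass
from typing import List, Union


# Source metamodel: a tiny expression AST.
@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Var, Add, Mul]


# Target metamodel: instructions of a hypothetical stack machine.
@dataclass(frozen=True)
class Push:
    value: int


@dataclass(frozen=True)
class Load:
    name: str


@dataclass(frozen=True)
class BinaryOp:
    symbol: str


Instr = Union[Push, Load, BinaryOp]


def fold(e: Expr) -> Expr:
    """Endogenous transformation (Expr -> Expr): fold constant subexpressions."""
    if isinstance(e, (Num, Var)):
        return e
    left, right = fold(e.left), fold(e.right)
    if isinstance(left, Num) and isinstance(right, Num):
        if isinstance(e, Add):
            return Num(left.value + right.value)
        return Num(left.value * right.value)
    return type(e)(left, right)


def lower(e: Expr) -> List[Instr]:
    """Exogenous transformation (Expr -> instruction list): postfix code generation."""
    if isinstance(e, Num):
        return [Push(e.value)]
    if isinstance(e, Var):
        return [Load(e.name)]
    symbol = "+" if isinstance(e, Add) else "*"
    return lower(e.left) + lower(e.right) + [BinaryOp(symbol)]


tree = Add(Var("x"), Mul(Num(2), Num(3)))
print(fold(tree))
print(lower(fold(tree)))
```

Note how both functions must enumerate every node type by hand, which echoes the remark above that model transformations can feel low level.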

This PhD will focus on exploring model transformations for a new and understandable AST rewriting language.

Required knowledge

The goals of the PhD are:
• Explore how MDE techniques can contribute to AST transformations.
• Provide a new transformation language for AST transformation.
• Write new and innovative tools to support this new language.

Familiarity with ASTs; experience with refactoring, static analysis, and language modeling.

Target program of study

Doctorate

Research areas

Information and communications technologies

Funding

Approx. $35,000 per year (3 years)

Other information

Start date: 2020-09-01

Partners involved: Professors Vincent Aranega and Stephane Ducasse (RMoD, Inria, Lille, France)