I. Overview of XText A. What is XText? ------------------------------------------ XTEXT OVERVIEW XText is a framework for building DSLs It has built-in ways to do: - lexical analysis - parsing - AST validation - Code generation (or interpretation) Integrated with Xtend, an extension of Java ------------------------------------------ What is an AST? ------------------------------------------ ABSTRACT SYNTAX TREES = OBJECT STRUCTURES input: x := x+1 AST: [AssignS | v: * | aexp: * | label: * ] / | v [AExpression | left: * | op: * | right: * ] [EString | * ] / | | | v v v v [VarRefExpr | vname: ] "+" [NumLitExp | num: * ] [String | 1 | * ] | / v v 1 [CharBuf | 'x' ] ------------------------------------------ B. basic tool usage ------------------------------------------ TYPICAL USAGE OF THE TOOLS 0. Create an XText project choose language name, file suffix, etc. 1. Create Lang.xtext in src folder 2. Generate XText artifacts right click on the .xtext file, select Run As... select Generate XText Artifacts 3. Run as Eclipse application to test right click on the main project (edu.ucf.cs.whilelang.WhileLang) select Run As... select Eclipse Application and/or run JUnit tests right click on the ....tests project select Run As... select JUnit Test 4. Edit to enhance the language 5. Repeat the above... ------------------------------------------ ------------------------------------------ 5 PROJECTS for the language named "WhileLang" edu.ucf.cs.whilelang.WhileLang > src > edu.ucf.cs.whilelang > WhileLang.xtext > ... > edu.ucf.cs.whilelang.generator > WhileLangGenerator.xtend > ... > edu.ucf.cs.whilelang.validation > WhileLangValidator.xtend > ... > ... edu.ucf.cs.whilelang.WhileLang.ide edu.ucf.cs.whilelang.WhileLang.tests > src > edu.ucf.cs.whilelang.tests > ... edu.ucf.cs.whilelang.WhileLang.ui edu.ucf.cs.whilelang.WhileLang.ui.tests ------------------------------------------ What converts the user's input into an AST? C. installation ------------------------------------------ INSTALLATION OF XTEXT TO FOLLOW ALONG 1. Install Java (8 or 9) http://www.oracle.com/technetwork/java/index.html (download Java SE) 2. Install Eclipse http://www.eclipse.org/downloads/packages/ eclipse-ide-java-and-dsl-developers/oxygen2 (use the package for Java and DSL Developers, which includes XText and XTend) or search the Eclipse Marketplace for XText and install that and add org.eclipse.xtext to required bundles 3. Install Eclipse Plug-in Development Environment (PDE) search the Marketplace 4. Clone the WhileLang project from Github git clone https://github.com/leavens/WhileLang.git in Eclipse File > Import ... select Git, clone URI import modules as subprojects ------------------------------------------ II. Abstract Syntax Trees (ASTs) in XText A. Importance of ASTs ------------------------------------------ IMPORTANCE OF ASTS ASTs are: the key ------------------------------------------ B. Overview in XText ------------------------------------------ ABSTRACT SYNTAX TREES IN XTEXT Concrete syntax is described in .xtext file (e.g., WhileLang.xtext) ASTs are described in XText tool generates Java classes for AST nodes (e.g., in edu.ucf.cs.whilelang.whileLang) At runtime, each AST is a Java object. (constructed by the parser) ------------------------------------------ C. WhileLang.xtext File (an example) ------------------------------------------ WhileLang.xtext FILE grammar edu.ucf.cs.whilelang.WhileLang with org.eclipse.xtext.common.Terminals import "http://www.eclipse.org/emf/2002/Ecore" as ecore generate whileLang "http://www.ucf.edu/cs/whilelang/WhileLang" Program: 'proc' name=ID '(' (args=Formals)? ')' 'is' body=Stmt; Formals: names+=ID (',' names+=ID)*; ------------------------------------------ What's the syntax of the parsing rules? What does "abstract" mean? What does a program in the While Language look like? ------------------------------------------ STATEMENT GRAMMAR (in WhileLang.xtext) Stmt returns S: Assignment | Skip | Block | While | If; Block returns CompoundS: '{' stmts+=Stmt (';' stmts+=Stmt)* '}'; Assignment returns AssignS: (v=ID) ':=' (aexp=Expression) | '[' (v=ID) ':=' (aexp=Expression) ']' '^' (label=INT); Skip returns SkipS: 'skip' {SkipS} | '[' 'skip' ']' '^' (label=INT); While returns WhileS: 'while' (bexp=LabeledExp) 'do' (block=Block); If returns IfS: 'if' (bexp=LabeledExp) 'then' (s1=Block) 'else' (s2=Block); LabeledExp returns LabeledExp: (be=Expression) | '[' (be=Expression) ']' '^' (label=INT); ------------------------------------------ What does an Assignment statement look like? Why don't all statements have labels? What is a LabeledExp for? ------------------------------------------ EXPRESSION GRAMMAR in WhileLang.xtext Expression returns Expr: BDisj; BDisj returns Expr: BConj ({BDisj.left=current} op=OR right=BConj)*; BConj returns Expr: BRelExp ({BConj.left=current} op=AND right=BRelExp)*; BRelExp returns Expr: AExpression ({BRelExp.left=current} op=OP_R right=AExpression)?; AExpression returns Expr: Factor ({AExpression.left=current} op=OPPLUS right=Factor)*; Factor returns Expr: Primary ({Factor.left=current} op=OPMUL right=Primary)*; Primary returns Expr: VarRefExpr | NumLitExpr | SignedNum | BoolLitExpr | NotExpr | '(' Expression ')' ; VarRefExpr: vname=ID; SignedNum: (sign=OPPLUS) (nval=Primary); NumLitExpr: num=INT; BoolLitExpr: bval='true' | bval='false'; NotExpr: 'not' (bexp=Primary); ------------------------------------------ What operators bind more tightly than others? Do the operators associate to the left or to the right? What does the rule for Expression do? ------------------------------------------ NON-EXECUTABLE (NOT CALLED) RULE ElementaryBlock: Assignment | Skip | LabeledExp ; generates an interface package edu.ucf.cs.whilelang.whileLang; import org.eclipse.emf.ecore.EObject; public interface ElementaryBlock extends EObject { } which is implemented by AssignS, SkipS, and LabeledExp package edu.ucf.cs.whilelang.whileLang; public interface AssignS extends S, ElementaryBlock { /* ... */ } ------------------------------------------ Is S also an interface? Why is S an interface? 1. picture of runtime AST ------------------------------------------ PARSED AST EXAMPLE Parsing the program proc astEx(x,y) is { if x < 5 then y := 2 else skip } gives the AST: [Formals | names: * ] / \ [Program | body: * | args: * ] [EList | "x", "y" ] / v [IfS | bexp: * | s1: * | s2: *] / | \ v v v [LabeledExp | [AssignS | [SkipS | label: 3] label: 1 | label: 2 | be: * ] v: "y" | | aexp: *-]---> [NumLitExp | num: 2] v [BRelExp | left: * | op: * | right: *] / | | v v v [VarRefExp | "<" [NumLitExp | num: 5] vname: "x" ] ------------------------------------------ 2. lexical syntax ------------------------------------------ TOKENS (LEXICAL SYNTAX) IN XTEXT standard token rules from first line e.g., in WhileLang.xtext grammar edu.ucf.cs.whilelang.WhileLang with org.eclipse.xtext.common.Terminals ------------------------------------------ ------------------------------------------ grammar org.eclipse.xtext.common.Terminals hidden(WS, ML_COMMENT, SL_COMMENT) import "http://www.eclipse.org/emf/2002/Ecore" as ecore terminal ID: '^'?('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*; terminal INT returns ecore::EInt: ('0'..'9')+; terminal STRING: '"' ( '\\' . /* 'b'|'t'|'n'|'f'|'r'|'u'|'"'|"'"|'\\' */ | !('\\'|'"') )* '"' | "'" ( '\\' . /* 'b'|'t'|'n'|'f'|'r'|'u'|'"'|"'"|'\\' */ | !('\\'|"'") )* "'" ; terminal ML_COMMENT : '/*' -> '*/'; terminal SL_COMMENT : '//' !('\n'|'\r')* ('\r'? '\n')?; terminal WS : (' '|'\t'|'\r'|'\n')+; terminal ANY_OTHER: .; ------------------------------------------ ------------------------------------------ CUSTOMIZED LEXICAL RULES FOR WHILELANG @Override terminal SL_COMMENT: '%' !('\n'|'\r')* ('\r'? '\n')?; terminal OPPLUS: '+' | '-'; terminal OPMUL: '*' | '/'; terminal OR: 'or'; terminal AND: 'and'; terminal OP_R: '=='| '!=' | '<=' | '>=' | '<' | '>' ; ------------------------------------------ How does one write a negative number down? III. Attributes A. What are attributes? ------------------------------------------ ATTRIBUTES defn: an *attribute* is Two kinds of attributes: Synthesized: (passed up toward the root) AST's value computed from attributes of subtrees Inherited: (passed down) AST's value given to it from its parent in the AST ------------------------------------------ B. Examples of Synthesized Attributes ------------------------------------------ SYNTHESIZED ATTRIBUTE EXAMPLE // file CFG.xtend in edu.ucf.cs.whilelang.utility /** Returns the label of the initial * ElementaryBlock of a statement. */ def dispatch static int init(AssignS s) { s.label } /** Returns the label of the initial * ElementaryBlock of a statement. */ def dispatch static int init(SkipS s) { s.label } /** Returns the label of the initial * ElementaryBlock of a statement. */ def dispatch static int init(IfS s) { s.bexp.label } ------------------------------------------ C. inherited attribute example ------------------------------------------ INHERITED ATTRIBUTES No direct support (that I know of) in XText Can use static fields to hold mappings that are precomputed values of functions Compute them during "validation" in XText ------------------------------------------ ------------------------------------------ CODING INHERITED ATTRIBUTES IN XTEXT 1. Declare static maps somewhere class CFG { // file CFG.xtend /** What is the ElementaryBlock with the given label? */ public static val Map itsBlockMap = new HashMap() /** What is the set of flows within each statement? */ public static val FlowGraph cfgMap = new FlowGraph() /* ... */ } 2. Before other checks, populate the maps (a). Call the label validator first, then the CFG validator package edu.ucf.cs.whilelang.validation /* ... some imports... */ public class WhileLangValidator extends AbstractWhileLangValidator { @Check def checkStaticConstraints(Program p) { new WhileLangLabelsValidator().checkUniqueLabels(p) new WhileLangCFGValidator().constructCFG(p) } } (b). Write the CFG Validator to populate the maps package edu.ucf.cs.whilelang.validation import edu.ucf.cs.whilelang.utility.CFG /* ... */ /** * This class constructs a control flow graph (CFG) for the program. * See Section 2.1 of "Principles of Program Analysis" * by Nielsen, Nielsen, and Hankin (Springer-Verlag, 1999 and 2005). */ class WhileLangCFGValidator extends AbstractWhileLangValidator { /** What is the ElementaryBlock with the given label? */ val static Map itsBlockMap = CFG.itsBlockMap /** What is the set of flows within each statement? */ val static FlowGraph cfgMap = CFG.cfgMap /** Construct the CFG for the given program. * Assumes that the labels have already been added to the program. */ @Check def constructCFG(Program p) { resetMaps() // important because vars are static p.body.constructIBM p.body.constructFlows } /* ... */ } 3. Now the static maps are available to all ------------------------------------------ What happens if we don't reset the maps? D. other tools ------------------------------------------ OTHER TOOLS FOR ATTRIBUTES Tools to work with attributes exist JastAdd is an example tailored to compilers written in Java. These make dealing with inherited attributes easier ------------------------------------------ IV. Control Flow Graphs (2.1) What kind of information is in a CFG? A. use of (Control) flow graphs ------------------------------------------ USE OF (CONTROL) FLOW GRAPHS Mathematically (in text): flows: Stmt -> Powerset(Label x Label) CFGs are used to: summarize control flow in a program, direct information in a dataflow analysis Are an inherited attribute depends on information above an AST node ------------------------------------------ Why is the CFG an inherited attribute? B. overview in XText ------------------------------------------ CFGs IN XTEXT Not a built-in feature of XText's framework Stored as a static field in a map of type FlowGraph Flowgraph API: package edu.ucf.cs.whilelang.utility; import java.util.AbstractMap; /*...*/ /** A Map from statements (S) to sets of pairs of labels * (Set>) that can be used as a flow graph. */ public class FlowGraph extends AbstractMap>> // implements Map>> { /** The representation of this flowgraph. */ private Map>> map = new HashMap>>(); /** Initialize this object to be an empty flowgraph. */ public FlowGraph() { } /** Initialize this object to be a singleton flowgraph. */ public FlowGraph(S s, Set> flws) { map.put(s, flws); } public void clear(); public Set>>> entrySet(); public Set> get(S stmt); public Set> put(S stmt, Set> fls); public Set> putUnion(S stmt, Set> fls); /* ... */ @Override public String toString(); } ------------------------------------------ 1. construction of the cfgMap in WhileLangCFGValidator ------------------------------------------ // file WhileLangCFGValidator.xtend package edu.ucf.cs.whilelang.validation import edu.ucf.cs.whilelang.utility.CFG /*...*/ /** * This class constructs a control flow graph (CFG) for the program. * See Section 2.1 of "Principles of Program Analysis" * by Nielsen, Nielsen, and Hankin (Springer-Verlag, 1999 and 2005). */ class WhileLangCFGValidator extends AbstractWhileLangValidator { /** What is the ElementaryBlock with the given label? */ val Map itsBlockMap = CFG.itsBlockMap /** What is the set of flows within each statement? */ val FlowGraph cfgMap = CFG.cfgMap @Check def constructCFG(Program p) { p.body.constructIBM p.body.constructFlows } /* ... */ def dispatch void constructFlows(AssignS a) { cfgMap.putUnion(a, new SetRepUtility>()) } def dispatch void constructFlows(SkipS s) { cfgMap.putUnion(s, new SetRepUtility>()) } def dispatch void constructFlows(CompoundS c) { // recursively treat each sub-statement for (i : 0..c.stmts.size()-1) { c.stmts.get(i).constructFlows } // the flows of each sub-statement are in the flows of c for (i : 0..c.stmts.size()-1) { cfgMap.putUnion(c, cfgMap.get(c.stmts.get(i))) } // the flows of c also contain a flow from each statement to the next for (i : 0..>( new Pair(j, CFG.init(c.stmts.get(i+1))) )) } } } def dispatch void constructFlows(WhileS c) { // construct the flows of the body c.block.constructFlows cfgMap.putUnion(c, cfgMap.get(c.block)) cfgMap.putUnion(c, new SetRepUtility(new Pair(c.bexp.label, CFG.init(c.block)))) for (j : CFG.finals(c.block)) { cfgMap.putUnion(c, new SetRepUtility>( new Pair(j, c.bexp.label) )) } } def dispatch void constructFlows(IfS ifs) { // construct the flows of the two sub-statements ifs.s1.constructFlows ifs.s2.constructFlows // add in all those flows cfgMap.putUnion(ifs, cfgMap.get(ifs.s1)) cfgMap.putUnion(ifs, cfgMap.get(ifs.s2)) // add flows from the test to the init of each sub-statement cfgMap.putUnion(ifs, new SetRepUtility>( new Pair(ifs.bexp.label, CFG.init(ifs.s1)) )) cfgMap.putUnion(ifs, new SetRepUtility>( new Pair(ifs.bexp.label, CFG.init(ifs.s2)) )) } def resetMaps() { itsBlockMap.clear() cfgMap.clear() } } ------------------------------------------ ------------------------------------------ INIT ATTRIBUTE SYNTHESIZED // in CFG.xtend def dispatch static int init(AssignS s) { s.label } def dispatch static int init(SkipS s) { s.label } def dispatch static int init(IfS s) { s.bexp.label } def dispatch static int init(WhileS s) { s.bexp.label } def dispatch static int init(CompoundS s) { s.stmts.get(0).init } ------------------------------------------ Why do IfS and WhileS use the bexp field? ------------------------------------------ FINALS ATTRIBUTE Result is a set of labels: Set // in CFG.xtend def dispatch static Set finals(AssignS s) { new SetRepUtility(s.label) } def dispatch static Set finals(SkipS s) { new SetRepUtility(s.label) } def dispatch static Set finals(IfS s) { } def dispatch static Set finals(WhileS s) { } def dispatch static Set finals(CompoundS s) { s.stmts.get(s.stmts.size()-1).finals } ------------------------------------------ ------------------------------------------ BLOCKS ATTRIBUTE set of the elementary blocks in a statement // file CFG.xtend def dispatch static Set blocks(AssignS s) { new SetRepUtility(s as ElementaryBlock) } /** Returns the set of elementary blocks in a statement. */ def dispatch static Set blocks(SkipS s) { new SetRepUtility(s as ElementaryBlock) } /** Returns the set of elementary blocks in a statement. */ def dispatch static Set blocks(IfS s) { } /** Returns the set of elementary blocks in a statement. */ def dispatch static Set blocks(WhileS s) { val Set ret = s.block.blocks ret.add(s.bexp as ElementaryBlock) return ret } /** Returns the set of elementary blocks in a statement. */ def dispatch static Set blocks(CompoundS s) { val ret = new SetRepUtility(); for (c : s.stmts) { ret.addAll(c.blocks); } return ret } ------------------------------------------ Why are only assignment, skip, and labeled expressions blocks? ------------------------------------------ LABELS ATTRIBUTE def dispatch static Set labels(AssignS s) { } def dispatch static Set labels(SkipS s) { } def dispatch static Set labels(IfS s) { } /** Returns the set of the labels of all labels in a statement. */ def dispatch static Set labels(WhileS s) { } /** Returns the set of the labels of all elementary blocks in a statement. */ def dispatch static Set labels(CompoundS s) { } ------------------------------------------ V. An Example What do we need to do to add assert statements to the While Language?