Documentation
API
Functions
Tokenizing
Token actions
AST parsing
AST manipulation
AST checking
AST traversal
Big AST operations
Conversion
Printing
Utilities
tokenize
tokens = parser.tokenize( luaString [, pathForErrorMessages="?" ] )
tokens = parser.tokenize( luaString [, keepWhitespaceTokens=false, pathForErrorMessages="?" ] )
Convert a Lua string into an array of tokens. Returns nil and a message on error.
local tokens = parser.tokenize[[
local x = foo()
bar(x, 57)
]]
tokenizeFile
tokens = parser.tokenizeFile( path [, keepWhitespaceTokens=false ] )
Convert the contents of a file into an array of tokens.
Uses io.open().
Returns nil and a message on error.
newToken
token = parser.newToken( tokenType, tokenValue )
--
commentToken = parser.newToken( "comment", contents )
identifierToken = parser.newToken( "identifier", name )
keywordToken = parser.newToken( "keyword", name )
numberToken = parser.newToken( "number", number )
punctuationToken = parser.newToken( "punctuation", punctuationString )
stringToken = parser.newToken( "string", stringValue )
whitespaceToken = parser.newToken( "whitespace", contents )
Create a new token.
local tokens = {
    parser.newToken("identifier", "x"),
    parser.newToken("punctuation", "="),
    parser.newToken("number", 58),
    parser.newToken("punctuation", "+"),
    parser.newToken("identifier", "foo"),
    parser.newToken("punctuation", "("),
    parser.newToken("punctuation", ")"),
}
print(parser.concatTokens(tokens)) -- x=58+foo()
updateToken
parser.updateToken( token, tokenValue )
--
parser.updateToken( commentToken, contents )
parser.updateToken( identifierToken, name )
parser.updateToken( keywordToken, name )
parser.updateToken( numberToken, number )
parser.updateToken( punctuationToken, punctuationString )
parser.updateToken( stringToken, stringValue )
parser.updateToken( whitespaceToken, contents )
Update the value and representation of an existing token.
cloneToken
tokenClone = parser.cloneToken( token )
Clone an existing token.
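As a small sketch of how cloneToken() and updateToken() might be combined: the clone below is updated while the original token keeps its value. The module name "dumbParser" in the require call is an assumption, as this page does not say how the library is loaded.

```lua
local parser = require("dumbParser") -- Assumption: module name is not given on this page.

local original = parser.newToken("identifier", "x")
local clone    = parser.cloneToken(original)

-- Updating the clone changes its value and representation, not the original's.
parser.updateToken(clone, "y")

print(original.value, clone.value)
```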
concatTokens
luaString = parser.concatTokens( tokens )
Concatenate tokens. Whitespace is added between tokens when necessary.
local tokens = {
    parser.newToken("identifier", "print"),
    parser.newToken("punctuation", "("),
    parser.newToken("number", 9),
    parser.newToken("punctuation", "+"),
    parser.newToken("number", 2),
    parser.newToken("punctuation", ")"),
}
local luaString = parser.concatTokens(tokens)
-- luaString is: print(9+2)
local chunk = loadstring(luaString)
chunk() -- Prints "11"
parse
astNode = parser.parse( tokens )
astNode = parser.parse( luaString [, pathForErrorMessages="?" ] )
Convert tokens or Lua code into an AST representing a block of code. Returns nil and a message on error.
local ast = parser.parse[[
local x = foo()
bar(x, 57)
]]
parseExpression
astNode = parser.parseExpression( tokens )
astNode = parser.parseExpression( luaString [, pathForErrorMessages="?" ] )
Convert tokens or Lua code into an AST representing a value expression. Returns nil and a message on error.
local ast = parser.parseExpression("x + 2 * foo()")
parseFile
astNode = parser.parseFile( path )
Convert a Lua file into an AST.
Uses io.open().
Returns nil and a message on error.
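The parsing functions above report failure by returning nil and a message, so callers would typically check the first return value. A minimal sketch, assuming the library is loaded as "dumbParser" (not stated on this page) and using a deliberately broken snippet as input:

```lua
local parser = require("dumbParser") -- Assumption: module name is not given on this page.

-- "local = 5" is missing a variable name, so parsing is expected to fail.
local ast, message = parser.parse("local = 5", "example.lua")

if not ast then
    print("Parsing failed: " .. tostring(message))
end
```

The same pattern should apply to tokenize(), tokenizeFile(), parseExpression() and parseFile(), which are all documented as returning nil and a message on error.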
newNode
astNode = parser.newNode( nodeType, arguments... )
--
identifier = parser.newNode( "identifier", name [, attributeName="" ] )
vararg = parser.newNode( "vararg" )
literal = parser.newNode( "literal", number|string|boolean|nil )
tableNode = parser.newNode( "table" )
lookup = parser.newNode( "lookup" )
unary = parser.newNode( "unary", unaryOperator )
binary = parser.newNode( "binary", binaryOperator )
call = parser.newNode( "call" )
functionNode = parser.newNode( "function" )
breakNode = parser.newNode( "break" )
returnNode = parser.newNode( "return" )
label = parser.newNode( "label", labelName )
gotoNode = parser.newNode( "goto", labelName )
block = parser.newNode( "block" )
declaration = parser.newNode( "declaration" )
assignment = parser.newNode( "assignment" )
ifNode = parser.newNode( "if" )
whileLoop = parser.newNode( "while" )
repeatLoop = parser.newNode( "repeat" )
forLoop = parser.newNode( "for", forLoopKind )
--
attributeName = "close" | "const" | ""
forLoopKind = "numeric" | "generic"
Create a new AST node.
local call = parser.newNode("call")
call.callee = parser.newNode("identifier", "print")
call.arguments[1] = parser.newNode("literal", 981)
local luaString = parser.toLua(call, true)
-- luaString is: print(981);
newNodeFast
astNode = parser.newNodeFast( nodeType, arguments... )
Same as newNode() but without any validation.
valueToAst
astNode = parser.valueToAst( value [, sortTableKeys=false ] )
Convert a Lua value (number, string, boolean, nil or table) to an AST.
local t = {name="foo", data={x=true, 87, -3, 794}}
local ast = parser.valueToAst(t)
parser.printTree(ast)
cloneNode
astNode = parser.cloneNode( astNode )
Clone an existing AST node (but not any children).
local block = parser.parse[[
foo(x, "bar")
]]
local call = block.statements[1]
call.arguments[3] = parser.cloneNode(call.callee)
local luaString = parser.toLua(block, true)
-- luaString is: foo(x, "bar", foo);
cloneTree
astNode = parser.cloneTree( astNode )
Clone an existing AST node and its children.
local block = parser.parse[[
foo(x, "bar")
]]
local call = block.statements[1]
call.arguments[3] = parser.cloneTree(call)
local luaString = parser.toLua(block, true)
-- luaString is: foo(x, "bar", foo(x, "bar"));
getChild
childNode = parser.getChild( astNode, fieldName )
childNode = parser.getChild( astNode, fieldName, index ) -- If the node field is an array.
childNode = parser.getChild( astNode, fieldName, index, tableFieldKey ) -- If the node field is a table field array.
tableFieldKey = "key" | "value"
Get a child node. See node fields for field names. The result is the same as doing the following, but with more error checking:
childNode = astNode[fieldName]
childNode = astNode[fieldName][index]
childNode = astNode[fieldName][index][tableFieldKey]
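The three call forms can be sketched as follows, walking from a block down to a value inside a table constructor. The require name is an assumption; the field names come from the node field lists further down this page.

```lua
local parser = require("dumbParser") -- Assumption: module name is not given on this page.

local block = parser.parse("foo(x, {a=1})")

local call      = parser.getChild(block, "statements", 1)         -- Array field.
local callee    = parser.getChild(call, "callee")                 -- Plain field.
local tableNode = parser.getChild(call, "arguments", 2)           -- Array field.
local value     = parser.getChild(tableNode, "fields", 1, "value") -- Table field array.

print(callee.name, value.value)
```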
setChild
parser.setChild( astNode, fieldName, childNode )
parser.setChild( astNode, fieldName, index, childNode ) -- If the node field is an array.
parser.setChild( astNode, fieldName, index, tableFieldKey, childNode ) -- If the node field is a table field array.
tableFieldKey = "key" | "value"
Set a child node. See node fields for field names. The result is the same as doing the following, but with more error checking:
astNode[fieldName] = childNode
astNode[fieldName][index] = childNode
astNode[fieldName][index][tableFieldKey] = childNode
addChild
parser.addChild( astNode, fieldName, [ index=atEnd, ] childNode )
parser.addChild( astNode, fieldName, [ index=atEnd, ] keyNode, valueNode ) -- If the node field is a table field array.
Add a child node to an array field. See node fields for field names. The result is the same as doing the following, but with more error checking:
table.insert(astNode[fieldName], index, childNode)
table.insert(astNode[fieldName], index, {key=keyNode, value=valueNode, generatedKey=false})
removeChild
parser.removeChild( astNode, fieldName [, index=last ] )
Remove a child node from an array field. See node fields for field names. The result is the same as doing the following, but with more error checking:
table.remove(astNode[fieldName], index)
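A short sketch of addChild(), setChild() and removeChild() working together on the argument list of a call. The require name is an assumption; the comments show the intended state of the call after each step.

```lua
local parser = require("dumbParser") -- Assumption: module name is not given on this page.

local block = parser.parse("foo(x)")
local call  = block.statements[1]

parser.addChild(call, "arguments", parser.newNode("literal", 2))          -- Append:      foo(x, 2)
parser.addChild(call, "arguments", 1, parser.newNode("literal", 1))       -- Insert:      foo(1, x, 2)
parser.setChild(call, "arguments", 2, parser.newNode("identifier", "y"))  -- Replace:     foo(1, y, 2)
parser.removeChild(call, "arguments")                                     -- Remove last: foo(1, y)

print(parser.toLua(block, true))
```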
isExpression
bool = parser.isExpression( astNode )
Returns true for expression nodes and false for statements. Note that call nodes count as expressions for this function, i.e. return true.
isStatement
bool = parser.isStatement( astNode )
Returns true for statements and false for expression nodes. Note that call nodes count as statements for this function, i.e. return true.
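Since call nodes are the one case where both checks pass, a quick comparison may help. This is a sketch based only on the descriptions above, with the require name being an assumption:

```lua
local parser = require("dumbParser") -- Assumption: module name is not given on this page.

local call = parser.newNode("call")
call.callee = parser.newNode("identifier", "foo")

local literal   = parser.newNode("literal", 5)
local breakNode = parser.newNode("break")

print(parser.isExpression(call),      parser.isStatement(call))      -- Both checks pass for calls.
print(parser.isExpression(literal),   parser.isStatement(literal))   -- Expression only.
print(parser.isExpression(breakNode), parser.isStatement(breakNode)) -- Statement only.
```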
validateTree
isValid, errorMessages = parser.validateTree( astNode )
Check for errors in an AST (e.g. missing condition expressions for if statements).
errorMessages is a multi-line string if isValid is false.
local block = parser.parse[[
local x = 5
]]
local declaration = block.statements[1]
declaration.names[1] = nil -- Remove the 'x' identifier.
assert(parser.validateTree(block)) -- Will raise an error.
traverseTree
didStop = parser.traverseTree(
astNode, [ leavesFirst=false, ] callback
[, topNodeParent=nil, topNodeContainer=nil, topNodeKey=nil ]
)
action = callback( astNode, parent, container, key )
action = "stop" | "ignorechildren" | nil -- Returning nil/nothing means continue traversal.
Call a function on all nodes in an AST, going from astNode out to the leaf nodes (or from leaf nodes and inwards if leavesFirst is set).
container[key] is the position of the current node in the tree and can be used to replace the node.
local ast = parser.parse[[
local x = foo(y, "bar")
]]
parser.traverseTree(ast, function(node)
    if node.type == "identifier" then
        print(node.name) -- Prints "x", "foo" and "y"
    end
end)
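The callback's extra arguments and return value can be sketched like this: the first traversal replaces a node through container[key], and the second stops as soon as a call node is found. The require name is an assumption.

```lua
local parser = require("dumbParser") -- Assumption: module name is not given on this page.

local ast = parser.parse('local x = foo(y, "bar")')

-- Replace every identifier named "foo" with a new identifier named "baz".
parser.traverseTree(ast, function(node, parent, container, key)
    if node.type == "identifier" and node.name == "foo" then
        container[key] = parser.newNode("identifier", "baz")
    end
end)

-- Stop the traversal at the first call node.
local firstCall
local didStop = parser.traverseTree(ast, function(node)
    if node.type == "call" then
        firstCall = node
        return "stop"
    end
end)

print(didStop, firstCall.callee.name)
```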
traverseTreeReverse
didStop = parser.traverseTreeReverse(
astNode, [ leavesFirst=false, ] callback
[, topNodeParent=nil, topNodeContainer=nil, topNodeKey=nil ]
)
action = callback( astNode, parent, container, key )
action = "stop" | "ignorechildren" | nil -- Returning nil/nothing means continue traversal.
Call a function on all nodes in reverse order in an AST, going from astNode out to the leaf nodes (or from leaf nodes and inwards if leavesFirst is set).
container[key] is the position of the current node in the tree and can be used to replace the node.
local ast = parser.parse[[
local x = foo(y, "bar")
]]
parser.traverseTreeReverse(ast, function(node)
    if node.type == "identifier" then
        print(node.name) -- Prints "y", "foo" and "x"
    end
end)
updateReferences
parser.updateReferences( astNode [, updateTopNodePositionInfo=true ] )
Update references between nodes in the tree.
This function sets .parent, .container and .key for all nodes, .declaration for identifiers and vararg nodes, and .label for goto nodes.
If updateTopNodePositionInfo is false then .parent, .container and .key will remain as-is for astNode specifically.
-- Find globals.
local ast = parser.parse[[
local x = 2
local function func()
    local y = x + bar
end
]]
parser.updateReferences(ast)
parser.traverseTree(ast, function(node)
    if node.type == "identifier" and not node.declaration then
        print(node.name) -- Prints "bar"
    end
end)
simplify
stats = parser.simplify( astNode )
Simplify/fold expressions and statements involving constants (1+2 becomes 3, false and func() becomes false etc.).
See INT_SIZE for notes.
Returns stats.
local ast = parser.parse[[
local x = 2^8
local s = "[" .. 2^16-1 .. "]"
local n = -(-(-(-6)))
]]
parser.simplify(ast)
local luaString = parser.toLua(ast, true)
-- luaString is:
-- local x = 256;
-- local s = "[65535]";
-- local n = 6;
optimize
stats = parser.optimize( astNode )
Attempt to remove nodes that aren't useful, like unused variables, or variables that are essentially constants.
Calls simplify() internally.
This function can be quite slow!
Returns stats.
local ast = parser.parse[[
local constant = "Carl"
local unused = func()
if constant then
    print("Hello "..constant.."!")
end
]]
parser.optimize(ast)
local luaString = parser.toLua(ast, true)
-- luaString is:
-- func();
-- print("Hello Carl!");
-- • 'constant' always refers to the same string value, thus all instances of
-- 'constant' can be replaced with the string "Carl".
-- • 'unused' is never used, thus its declaration can be removed completely.
-- • The call to the global 'func' is kept because it might have side effects
-- (both the lookup of the global and the call itself).
-- • 'if "Carl" then' always passes, thus the body of the 'if' statement always
-- runs and the condition check is useless.
-- • '"Hello ".."Carl".."!"' can be concatenated into one string.
Note: References may be out-of-date after calling this.
minify
stats = parser.minify( astNode [, optimize=false ] )
Replace local variable names with short names.
This function can be used to obfuscate the code to some extent.
If optimize is set then optimize() is also called automatically.
Returns stats.
local ast = parser.parse[[
local function printSum(value1, value2)
    print(value1 + value2)
end
printSum(33, 77)
]]
parser.minify(ast)
local luaString = parser.toLua(ast, true)
-- luaString is:
-- local function e(e, t)
--     print(e + t);
-- end
-- e(33, 77);
Note: References may be out-of-date after calling this.
toLua
luaString = parser.toLua( astNode [, prettyOutput=false, nodeCallback ] )
nodeCallback = function( node, outputBuffer )
Convert an AST to Lua, optionally calling a function on each node before it is turned into Lua.
Any node in the tree with a .pretty attribute will override the prettyOutput flag for itself and its children.
Nodes can also have a .prefix and/or .suffix attribute with Lua code to output before/after the node (e.g. declaration.names[1].suffix="--[[foo]]").
outputBuffer is an array of Lua code that has been output so far.
Returns nil and a message on error.
local ast = parser.parse[[
local x = foo()
bar(8.500)
]]
local luaString = parser.toLua(ast)
-- luaString is: local x=foo();bar(8.5);
printTokens
parser.printTokens( tokens )
Print tokens to stdout.
printNode
parser.printNode( astNode )
Print information about an AST node to stdout.
printTree
parser.printTree( astNode )
Print the structure of a whole AST to stdout.
formatMessage
message = parser.formatMessage( [ prefix="Info", ] token, formatString, ... )
message = parser.formatMessage( [ prefix="Info", ] astNode, formatString, ... )
message = parser.formatMessage( [ prefix="Info", ] location, formatString, ... )
Format a message to contain a code preview window with an arrow pointing at the target token, node or location. This is used internally for formatting error messages.
if identifier.name ~= "good" then
    print(parser.formatMessage("Error", identifier, "This identifier is not good!"))
    print(parser.formatMessage(currentStatement, "Current statement."))
end
findDeclaredNames
identifiers = parser.findDeclaredNames( astNode )
Find all declared names in the tree (i.e. identifiers from AstDeclaration, AstFunction and AstFor nodes).
findGlobalReferences
identifiers = parser.findGlobalReferences( astNode )
Find all identifiers not referring to local variables in the tree.
Note: updateReferences() must be called at some point before you call this - otherwise all variables will be seen as globals!
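A sketch of the required call order, collecting the names of the returned identifiers. It assumes the returned array holds AstIdentifier nodes (as the return name suggests) and that the library is loaded as "dumbParser".

```lua
local parser = require("dumbParser") -- Assumption: module name is not given on this page.

local ast = parser.parse[[
local x = 2
print(x + y)
]]

parser.updateReferences(ast) -- Must happen before findGlobalReferences().

local names = {}
for _, identifier in ipairs(parser.findGlobalReferences(ast)) do
    names[#names+1] = identifier.name
end
table.sort(names)

print(table.concat(names, ", ")) -- Expected to list 'print' and 'y', but not the local 'x'.
```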
findShadows
shadowSequences = parser.findShadows( astNode )
shadowSequences = { shadowSequence1, ... }
shadowSequence = { shadowingIdentifier, shadowedIdentifier1, ... }
Find local variable shadowing in the tree. Each shadowSequence is an array of declared identifiers where each identifier shadows the next one.
Notes:
- updateReferences() must be called at some point before you call this - otherwise all variables will be seen as globals!
- Shadowing of globals cannot be detected by the function as that would require knowledge of all potential globals in your program.
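The shape of the result can be explored with a sketch like the one below, where an inner local shadows an outer one. The require name is an assumption, and the exact contents of the result are not asserted here beyond what the description above states.

```lua
local parser = require("dumbParser") -- Assumption: module name is not given on this page.

local ast = parser.parse[[
local x = 1
do
    local x = 2
end
]]

parser.updateReferences(ast) -- Must happen before findShadows().

local shadowSequences = parser.findShadows(ast)

for _, shadowSequence in ipairs(shadowSequences) do
    local shadowing = shadowSequence[1]
    for i = 2, #shadowSequence do
        local shadowed = shadowSequence[i]
        print(("'%s' on line %d shadows the declaration on line %d"):format(
            shadowing.name, shadowing.line, shadowed.line
        ))
    end
end
```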
Constants
INT_SIZE
parser.INT_SIZE = integer
How many bits integers have.
In Lua 5.3 and later this is usually 64, and in earlier versions it's 32.
The int size may affect how bitwise operations involving only constants get simplified, e.g. the expression -1>>1 becomes 2147483647 in Lua 5.2 but 9223372036854775807 in Lua 5.3.
MAX_INT
parser.MAX_INT = integer
The highest representable positive signed integer value, according to INT_SIZE.
This is the same value as math.maxinteger in Lua 5.3 and later.
This only affects simplification of some bitwise operations.
MIN_INT
parser.MIN_INT = integer
The lowest representable (most negative) signed integer value, according to INT_SIZE.
This is the same value as math.mininteger in Lua 5.3 and later.
This only affects simplification of some bitwise operations.
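To see how these constants relate to simplification on a given Lua version, a sketch like the following could be run. Per the INT_SIZE notes, the folded value is expected to be 2147483647 with 32-bit integers and 9223372036854775807 with 64-bit integers; the require name is an assumption.

```lua
local parser = require("dumbParser") -- Assumption: module name is not given on this page.

print("INT_SIZE:", parser.INT_SIZE)
print("MAX_INT: ", parser.MAX_INT)
print("MIN_INT: ", parser.MIN_INT)

-- Fold a bitwise expression that only involves constants.
local ast = parser.parse("local n = -1>>1")
parser.simplify(ast)

local luaString = parser.toLua(ast)
print(luaString)
```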
VERSION
parser.VERSION
The parser's version number (e.g. "1.0.2").
Settings
printIds
parser.printIds = bool -- Default: false
If AST node IDs should be printed.
printLocations
parser.printLocations = bool -- Default: false
If the file location (filename and line number) should be printed for each token or AST node.
indentation
parser.indentation = string -- Default: 4 spaces
The indentation used when printing ASTs.
constantNameReplacementStringMaxLength
parser.constantNameReplacementStringMaxLength = length -- Default: 200
Normally optimize() replaces variable names that are effectively constants with their value.
The exception is if the value is a string that's longer than what this setting specifies.
-- Example:
local ast = parser.parse[==[
local short = "a"
local long = "xy"
func(short, long)
]==]
parser.constantNameReplacementStringMaxLength = 1
parser.optimize(ast)
print(parser.toLua(ast)) -- local long="xy";func("a",long);
Tokens
Tokens are represented by tables.
Token fields
- type: Token type.
- value: Token value. All token types have a string value, except "number" tokens which have a number value.
- representation: The token's code representation. (Strings have surrounding quotes, comments start with "--" etc.)
- sourceString: The original source string, or "" if there is none.
- sourcePath: Path to the source file, or "?" if there is none.
- lineStart: Start line number in sourceString, or 0 by default.
- lineEnd: End line number in sourceString, or 0 by default.
- positionStart: Start byte position in sourceString, or 0 by default.
- positionEnd: End byte position in sourceString, or 0 by default.
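The fields can be inspected directly on the tables returned by tokenize(). A minimal sketch, assuming the library is loaded as "dumbParser" and relying on whitespace tokens being dropped by default (keepWhitespaceTokens=false):

```lua
local parser = require("dumbParser") -- Assumption: module name is not given on this page.

local tokens = parser.tokenize('local s = "hi"', "example.lua")

for i, token in ipairs(tokens) do
    print(i, token.type, token.representation, token.lineStart, token.positionStart)
end
```

Note how the "string" token's value is the string contents, while its representation is described above as including the surrounding quotes.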
Token types
Type | Possible values |
---|---|
"comment" | Any string value. |
"identifier" | Any word that is not a keyword. |
"keyword" | "and", "break", "do", "else", "elseif", "end", "false", "for", "function", "goto", "if", "in", "local", "nil", "not", "or", "repeat", "return", "then", "true", "until", "while" |
"number" | Any number. |
"punctuation" | "+", "-", "*", "/", "%", "^", "#", "&", "~", "|", "<<", ">>", "//", "==", "~=", "<=", ">=", "<", ">", "=", "(", ")", "{", "}", "[", "]", "::", ";", ":", ",", ".", "..", "..." |
"string" | Any string value. |
AST
AST nodes are represented by tables.
Node types
Expressions
- AstIdentifier: An identifier.
- AstVararg: Vararg expression.
- AstLiteral: Number, string, boolean or nil literal.
- AstTable: Table constructor.
- AstLookup: Field lookup on an object.
- AstUnary: Unary expression (operation with one operand).
- AstBinary: Binary expression (operation with two operands).
- AstCall: Function call. (Calls can be both expressions and statements.)
- AstFunction: Anonymous function header and body.
Statements
- AstBreak: Loop break statement.
- AstReturn: Function/chunk return statement, possibly with values.
- AstLabel: Label for goto commands.
- AstGoto: A jump to a label.
- AstBlock: List of statements. Blocks inside blocks are do...end statements.
- AstDeclaration: Declaration of one or more local variables, possibly with initial values.
- AstAssignment: Assignment of one or more values to one or more variables.
- AstIf: If statement with a condition, a body if the condition is true, and possibly another body if the condition is false.
- AstWhile: A while loop.
- AstRepeat: A repeat loop.
- AstFor: A for loop.
- AstCall: Function call. (Calls can be both expressions and statements.)
Node fields
Common fields
All nodes have these fields.
- id: Unique ID/serial number for the node.
- type: The node's type.
- sourceString: The original source string, or "" if there is none.
- sourcePath: Path to the source file, or "?" if there is none.
- line: Line number in sourceString, or 0 by default.
- position: Byte position in sourceString, or 0 by default.
- parent: Refers to the node's parent in the tree. Updated by updateReferences().
- container: Refers to the specific table that the node is in, which could be the parent itself or a field in the parent. Updated by updateReferences().
- key: Refers to the specific field in the container that the node is in (which is either a string or an integer). Updated by updateReferences().
AstIdentifier
- type: "identifier"
- name: String. The name.
- attribute: "", "close" or "const". (Only used in declarations.)
- declaration: AstIdentifier (whose parent is an AstDeclaration, AstFunction or AstFor). Updated by updateReferences(). This is nil for globals.
AstVararg
- type: "vararg"
- declaration: AstVararg (whose parent is an AstFunction). Updated by updateReferences(). This is nil in the main chunk (or in a non-vararg function, which is probably an error).
- adjustToOne: Boolean. true if parentheses surround the vararg.
AstLiteral
- type: "literal"
- value: Number, string, boolean or nil.
AstTable
- type: "table"
- fields: Array of {key=expression, value=expression, generatedKey=bool}. generatedKey is true for implicit keys (i.e. {x,y}) and false for explicit keys (i.e. {a=x,b=y}). Note that the state of generatedKey affects the output of toLua()! key may be nil if generatedKey is true.
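The difference between implicit and explicit keys can be observed by parsing a table constructor that mixes both. A sketch, with the require name being an assumption:

```lua
local parser = require("dumbParser") -- Assumption: module name is not given on this page.

local tableNode = parser.parseExpression("{x, y, a=z}")

for i, field in ipairs(tableNode.fields) do
    -- x and y have implicit keys, z has the explicit key 'a'.
    print(i, field.value.name, field.generatedKey)
end
```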
AstLookup
- type: "lookup"
- object: Expression.
- member: Expression.
AstUnary
- type: "unary"
- operator: "-", "not", "#" or "~".
- expression: Expression.
AstBinary
- type: "binary"
- operator: "+", "-", "*", "/", "//", "^", "%", "&", "~", "|", ">>", "<<", "..", "<", "<=", ">", ">=", "==", "~=", "and" or "or".
- left: Expression.
- right: Expression.
AstCall
- type: "call"
- callee: Expression.
- arguments: Array of expressions.
- method: Boolean. true if the call is a method call. Method calls must have a callee that is a lookup with a member expression that is a string literal that can pass as an identifier.
- adjustToOne: Boolean. true if parentheses surround the call.
AstFunction
- type: "function"
- parameters: Array of AstIdentifier and maybe an AstVararg at the end.
- body: AstBlock.
AstBreak
- type: "break"
AstReturn
- type: "return"
- values: Array of expressions.
AstLabel
- type: "label"
- name: String. The value must be able to pass as an identifier.
AstGoto
- type: "goto"
- name: String. The value must be able to pass as an identifier.
- label: AstLabel. Updated by updateReferences().
AstBlock
- type: "block"
- statements: Array of statements.
AstDeclaration
- type: "declaration"
- names: Non-empty array of AstIdentifier.
- values: Array of expressions.
AstAssignment
- type: "assignment"
- targets: Non-empty mixed array of AstIdentifier and AstLookup.
- values: Non-empty array of expressions.
AstIf
- type: "if"
- condition: Expression.
- bodyTrue: AstBlock.
- bodyFalse: AstBlock or nil.
AstWhile
- type: "while"
- condition: Expression.
- body: AstBlock.
AstRepeat
- type: "repeat"
- body: AstBlock.
- condition: Expression.
AstFor
- type: "for"
- kind: "numeric" or "generic".
- names: Non-empty array of AstIdentifier. "numeric" loops must have only one name.
- values: Non-empty array of expressions. "numeric" loops must have either 2 or 3 values.
- body: AstBlock.
Other objects
Stats
Some functions return a stats table which contains these fields:
- nodeReplacements: Array of locations of nodes that were replaced.
- nodeRemovals: Array of locations of nodes or tree branches that were removed.
- nodeRemoveCount: Number. How many nodes were removed, including subnodes of nodeRemovals.
- renameCount: Number. How many identifiers were renamed.
- generatedNameCount: Number. How many unique names were generated.
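As a sketch of how a stats table might be read after calling simplify(): the loop below walks the replacement locations using the location fields described in the next section. The require name is an assumption, and no particular number of replacements is implied.

```lua
local parser = require("dumbParser") -- Assumption: module name is not given on this page.

local ast   = parser.parse("local x = 1 + 2")
local stats = parser.simplify(ast)

print("Replaced nodes:", #stats.nodeReplacements)
print("Removed nodes: ", stats.nodeRemoveCount)

for _, location in ipairs(stats.nodeReplacements) do
    local oldType = location.node        and location.node.type        or "?"
    local newType = location.replacement and location.replacement.type or "?"
    print(("line %d: %s -> %s"):format(location.line, oldType, newType))
end
```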
Locations
Locations are tables with these fields:
- sourceString: The original source string, or "" if there is none.
- sourcePath: Path to the source file, or "?" if there is none.
- line: Line number in sourceString, or 0 by default.
- position: Byte position in sourceString, or 0 by default.
- node: The node the location points to, or nil if there is none.
- replacement: The node that replaced node, or nil if there is none. (This is set for stats.nodeReplacements.)
Notes
Syntactic sugar
Things are parsed into their desugared form.
t.k
-- is parsed as...
t["k"]
f""
f{}
-- is parsed as...
f("")
f({})
local function f() end
-- is parsed as...
local f
f = function() end
function t.k:m() end
-- is parsed as...
t.k.m = function(self) end
if x then
elseif y then
end
-- is parsed as...
if x then
else
    if y then
    end
end
The reverse is sometimes also true when converting ASTs to Lua - syntactic sugar may be applied in some cases.
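The desugared forms listed above can be checked on the resulting nodes. A sketch covering three of the cases, with the require name being an assumption:

```lua
local parser = require("dumbParser") -- Assumption: module name is not given on this page.

-- t.k is parsed as t["k"]: the member is a string literal.
local lookup = parser.parseExpression("t.k")
print(lookup.type, lookup.member.type, lookup.member.value)

-- f"hi" is parsed as f("hi"): the string becomes the first argument.
local call = parser.parseExpression('f"hi"')
print(call.type, call.arguments[1].value)

-- 'local function f() end' is parsed as a declaration followed by an assignment.
local block = parser.parse("local function f() end")
print(block.statements[1].type, block.statements[2].type)
```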
Special number notation rules
The expression -n is parsed as a single number literal if n is a numeral (i.e. the result is a negative number).
The expression n/0 is parsed as a single number literal if n is a numeral. If n is positive then the result is math.huge, if n is negative then the result is -math.huge, or if n is 0 then the result is NaN.
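These rules can be checked with parseExpression(). A sketch, assuming the library is loaded as "dumbParser"; NaN is detected through the fact that it never equals itself.

```lua
local parser = require("dumbParser") -- Assumption: module name is not given on this page.

local negative = parser.parseExpression("-5")  -- A literal, not a unary expression.
local infinity = parser.parseExpression("1/0") -- A literal, not a binary expression.
local nan      = parser.parseExpression("0/0")

print(negative.type, negative.value)
print(infinity.type, infinity.value == math.huge)
print(nan.type, nan.value ~= nan.value) -- NaN is the only value that differs from itself.
```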
Page updated: 2022-06-23