Dumb Lua Parser

Documentation


API

Functions

Tokenizing
Token actions
AST parsing
AST manipulation
AST checking
AST traversal
Big AST operations
Conversion
Printing
Utilities

tokenize

tokens = parser.tokenize( luaString [, pathForErrorMessages="?" ] )
tokens = parser.tokenize( luaString [, keepWhitespaceTokens=false, pathForErrorMessages="?" ] )

Convert a Lua string into an array of tokens. Returns nil and a message on error.

local tokens = parser.tokenize[[
	local x = foo()
	bar(x, 57)
]]

See also: tokenizeFile

tokenizeFile

tokens = parser.tokenizeFile( path [, keepWhitespaceTokens=false ] )

Convert the contents of a file into an array of tokens. Uses io.open(). Returns nil and a message on error.

See also: tokenize

newToken

token = parser.newToken( tokenType, tokenValue )
--
commentToken     = parser.newToken( "comment",     contents )
identifierToken  = parser.newToken( "identifier",  name )
keywordToken     = parser.newToken( "keyword",     name )
numberToken      = parser.newToken( "number",      number )
punctuationToken = parser.newToken( "punctuation", punctuationString )
stringToken      = parser.newToken( "string",      stringValue )
whitespaceToken  = parser.newToken( "whitespace",  contents )

Create a new token.

local tokens = {
	parser.newToken("identifier",  "x"),
	parser.newToken("punctuation", "="),
	parser.newToken("number",      58),
	parser.newToken("punctuation", "+"),
	parser.newToken("identifier",  "foo"),
	parser.newToken("punctuation", "("),
	parser.newToken("punctuation", ")"),
}
print(parser.concatTokens(tokens)) -- x=58+foo()

updateToken

parser.updateToken( token, tokenValue )
--
parser.updateToken( commentToken,     contents )
parser.updateToken( identifierToken,  name )
parser.updateToken( keywordToken,     name )
parser.updateToken( numberToken,      number )
parser.updateToken( punctuationToken, punctuationString )
parser.updateToken( stringToken,      stringValue )
parser.updateToken( whitespaceToken,  contents )

Update the value and representation of an existing token.

cloneToken

tokenClone = parser.cloneToken( token )

Clone an existing token.

concatTokens

luaString = parser.concatTokens( tokens )

Concatenate tokens. Whitespace is added between tokens when necessary.

local tokens = {
	parser.newToken("identifier",  "print"),
	parser.newToken("punctuation", "("),
	parser.newToken("number",      9),
	parser.newToken("punctuation", "+"),
	parser.newToken("number",      2),
	parser.newToken("punctuation", ")"),
}

local luaString = parser.concatTokens(tokens)
-- luaString is: print(9+2)

local chunk = (loadstring or load)(luaString) -- loadstring in Lua 5.1, load in 5.2+
chunk() -- Prints "11"

parse

astNode = parser.parse( tokens )
astNode = parser.parse( luaString [, pathForErrorMessages="?" ] )

Convert tokens or Lua code into an AST representing a block of code. Returns nil and a message on error.

local ast = parser.parse[[
	local x = foo()
	bar(x, 57)
]]

parseExpression

astNode = parser.parseExpression( tokens )
astNode = parser.parseExpression( luaString [, pathForErrorMessages="?" ] )

Convert tokens or Lua code into an AST representing a value expression. Returns nil and a message on error.

local ast = parser.parseExpression("x + 2 * foo()")

See also: parse, parseFile

parseFile

astNode = parser.parseFile( path )

Convert a Lua file into an AST. Uses io.open(). Returns nil and a message on error.

newNode

astNode = parser.newNode( nodeType, arguments... )
--
identifier   = parser.newNode( "identifier", name [, attributeName="" ] )
vararg       = parser.newNode( "vararg" )
literal      = parser.newNode( "literal", number|string|boolean|nil )
tableNode    = parser.newNode( "table" )
lookup       = parser.newNode( "lookup" )
unary        = parser.newNode( "unary",  unaryOperator  )
binary       = parser.newNode( "binary", binaryOperator )
call         = parser.newNode( "call" )
functionNode = parser.newNode( "function" )
breakNode    = parser.newNode( "break" )
returnNode   = parser.newNode( "return" )
label        = parser.newNode( "label", labelName )
gotoNode     = parser.newNode( "goto",  labelName )
block        = parser.newNode( "block" )
declaration  = parser.newNode( "declaration" )
assignment   = parser.newNode( "assignment" )
ifNode       = parser.newNode( "if" )
whileLoop    = parser.newNode( "while" )
repeatLoop   = parser.newNode( "repeat" )
forLoop      = parser.newNode( "for", forLoopKind )
--
attributeName = "close" | "const" | ""
forLoopKind   = "numeric" | "generic"

Create a new AST node.

local call        = parser.newNode("call")
call.callee       = parser.newNode("identifier", "print")
call.arguments[1] = parser.newNode("literal", 981)
local luaString   = parser.toLua(call, true)
-- luaString is: print(981);

See also: newNodeFast

newNodeFast

astNode = parser.newNodeFast( nodeType, arguments... )

Same as newNode() but without any validation.

valueToAst

astNode = parser.valueToAst( value [, sortTableKeys=false ] )

Convert a Lua value (number, string, boolean, nil or table) to an AST.

local t   = {name="foo", data={x=true, 87, -3, 794}}
local ast = parser.valueToAst(t)
parser.printTree(ast)

cloneNode

astNode = parser.cloneNode( astNode )

Clone an existing AST node (but not any children).

local block = parser.parse[[
	foo(x, "bar")
]]
local call        = block.statements[1]
call.arguments[3] = parser.cloneNode(call.callee)
local luaString   = parser.toLua(block, true)
-- luaString is: foo(x, "bar", foo);

See also: cloneTree

cloneTree

astNode = parser.cloneTree( astNode )

Clone an existing AST node and its children.

local block = parser.parse[[
	foo(x, "bar")
]]
local call        = block.statements[1]
call.arguments[3] = parser.cloneTree(call)
local luaString   = parser.toLua(block, true)
-- luaString is: foo(x, "bar", foo(x, "bar"));

See also: cloneNode

getChild

childNode = parser.getChild( astNode, fieldName )
childNode = parser.getChild( astNode, fieldName, index )                -- If the node field is an array.
childNode = parser.getChild( astNode, fieldName, index, tableFieldKey ) -- If the node field is a table field array.
tableFieldKey = "key" | "value"

Get a child node. See node fields for field names. The result is the same as doing the following, but with more error checking:

childNode = astNode[fieldName]
childNode = astNode[fieldName][index]
childNode = astNode[fieldName][index][tableFieldKey]

setChild

parser.setChild( astNode, fieldName, childNode )
parser.setChild( astNode, fieldName, index, childNode )                -- If the node field is an array.
parser.setChild( astNode, fieldName, index, tableFieldKey, childNode ) -- If the node field is a table field array.
tableFieldKey = "key" | "value"

Set a child node. See node fields for field names. The result is the same as doing the following, but with more error checking:

astNode[fieldName]                       = childNode
astNode[fieldName][index]                = childNode
astNode[fieldName][index][tableFieldKey] = childNode

addChild

parser.addChild( astNode, fieldName, [ index=atEnd, ] childNode )
parser.addChild( astNode, fieldName, [ index=atEnd, ] keyNode, valueNode ) -- If the node field is a table field array.

Add a child node to an array field. See node fields for field names. The result is the same as doing the following, but with more error checking:

table.insert(astNode[fieldName], index, childNode)
table.insert(astNode[fieldName], index, {key=keyNode, value=valueNode, generatedKey=false})

removeChild

parser.removeChild( astNode, fieldName [, index=last ] )

Remove a child node from an array field. See node fields for field names. The result is the same as doing the following, but with more error checking:

table.remove(astNode[fieldName], index)
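
The index defaults described for addChild() and removeChild() mirror those of table.insert() and table.remove(). A small sketch on a hand-built stand-in for a call node (not produced by the parser, and without the library's error checking) shows both the explicit and the defaulted index:

```lua
-- Hand-built stand-in for the call foo(1, 3).
local call = {
	type      = "call",
	callee    = {type="identifier", name="foo"},
	arguments = {
		{type="literal", value=1},
		{type="literal", value=3},
	},
}

-- Explicit index: insert before the current second argument.
table.insert(call.arguments, 2, {type="literal", value=2})
-- Index omitted: append at the end.
table.insert(call.arguments, {type="literal", value=4})

-- Index omitted: remove the last argument.
local removedLast = table.remove(call.arguments)
-- Explicit index: remove the first argument.
local removedFirst = table.remove(call.arguments, 1)

print(removedLast.value, removedFirst.value) -- 4  1
print(#call.arguments)                       -- 2
```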

isExpression

bool = parser.isExpression( astNode )

Returns true for expression nodes and false for statements. Note that call nodes count as expressions for this function, i.e. return true.

See also: isStatement

isStatement

bool = parser.isStatement( astNode )

Returns true for statements and false for expression nodes. Note that call nodes count as statements for this function, i.e. return true.

See also: isExpression

validateTree

isValid, errorMessages = parser.validateTree( astNode )

Check for errors in an AST (e.g. missing condition expressions for if statements). errorMessages is a multi-line string if isValid is false.

local block = parser.parse[[
	local x = 5
]]
local declaration    = block.statements[1]
declaration.names[1] = nil -- Remove the 'x' identifier.
assert(parser.validateTree(block)) -- Will raise an error.

traverseTree

didStop = parser.traverseTree(
	astNode, [ leavesFirst=false, ] callback
	[, topNodeParent=nil, topNodeContainer=nil, topNodeKey=nil ]
)
action = callback( astNode, parent, container, key )
action = "stop" | "ignorechildren" | nil  -- Returning nil/nothing means continue traversal.

Call a function on all nodes in an AST, going from astNode out to the leaf nodes (or from leaf nodes and inwards if leavesFirst is set). container[key] is the position of the current node in the tree and can be used to replace the node.

local ast = parser.parse[[
	local x = foo(y, "bar")
]]
parser.traverseTree(ast, function(node)
	if node.type == "identifier" then
		print(node.name) -- Prints "x", "foo" and "y"
	end
end)
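
The callback protocol (the parent, container and key arguments plus the "stop" and "ignorechildren" actions) can be illustrated with a toy walker. This is a sketch under stated assumptions, not the library's implementation: the hypothetical walk() function only understands the callee and arguments fields of hand-built call nodes, and it visits the callee before the arguments.

```lua
-- Toy walker, NOT parser.traverseTree(). Returns true if the callback stopped it.
local function walk(node, callback, parent, container, key)
	local action = callback(node, parent, container, key)
	if action == "stop" then  return true  end
	if action == "ignorechildren" or node.type ~= "call" then  return false  end

	-- The callee sits directly in the parent, so the parent is the container.
	if walk(node.callee, callback, node, node, "callee") then  return true  end

	-- Arguments sit in an array field, so that array is the container.
	for i, argument in ipairs(node.arguments) do
		if walk(argument, callback, node, node.arguments, i) then  return true  end
	end
	return false
end

-- Hand-built stand-in for the call foo(y, "bar").
local call = {
	type      = "call",
	callee    = {type="identifier", name="foo"},
	arguments = {
		{type="identifier", name="y"},
		{type="literal", value="bar"},
	},
}

local visited = {}
local didStop = walk(call, function(node, parent, container, key)
	visited[#visited+1] = node.type
	if node.type == "identifier" and node.name == "y" then
		-- container[key] is where this node lives, so assigning replaces it.
		container[key] = {type="literal", value=5}
	end
end)

print(didStop)                 -- false
print(#visited)                -- 4
print(call.arguments[1].value) -- 5
```

Note that the top node is visited with parent, container and key all nil in this sketch, so a callback that replaces nodes has to guard against that case.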

traverseTreeReverse

didStop = parser.traverseTreeReverse(
	astNode, [ leavesFirst=false, ] callback
	[, topNodeParent=nil, topNodeContainer=nil, topNodeKey=nil ]
)
action = callback( astNode, parent, container, key )
action = "stop" | "ignorechildren" | nil  -- Returning nil/nothing means continue traversal.

Call a function on all nodes in reverse order in an AST, going from astNode out to the leaf nodes (or from leaf nodes and inwards if leavesFirst is set). container[key] is the position of the current node in the tree and can be used to replace the node.

local ast = parser.parse[[
	local x = foo(y, "bar")
]]
parser.traverseTreeReverse(ast, function(node)
	if node.type == "identifier" then
		print(node.name) -- Prints "y", "foo" and "x"
	end
end)

See also: traverseTree

updateReferences

parser.updateReferences( astNode [, updateTopNodePositionInfo=true ] )

Update references between nodes in the tree. This function sets .parent, .container and .key for all nodes, .declaration for identifiers and vararg nodes, and .label for goto nodes. If updateTopNodePositionInfo is false then .parent, .container and .key will remain as-is for astNode specifically.

-- Find globals.
local ast = parser.parse[[
	local x = 2

	local function func()
		local y = x + bar
	end
]]

parser.updateReferences(ast)

parser.traverseTree(ast, function(node)
	if node.type == "identifier" and not node.declaration then
		print(node.name) -- Prints "bar"
	end
end)

simplify

stats = parser.simplify( astNode )

Simplify/fold expressions and statements involving constants (1+2 becomes 3, false and func() becomes false etc.). See INT_SIZE for notes. Returns stats.

local ast = parser.parse[[
	local x = 2^8
	local s = "[" .. 2^16-1 .. "]"
	local n = -(-(-(-6)))
]]
parser.simplify(ast)

local luaString = parser.toLua(ast, true)
-- luaString is:
--   local x = 256;
--   local s = "[65535]";
--   local n = 6;

optimize

stats = parser.optimize( astNode )

Attempt to remove nodes that aren't useful, like unused variables, or variables that are essentially constants. Calls simplify() internally. This function can be quite slow! Returns stats.

local ast = parser.parse[[
	local constant = "Carl"
	local unused   = func()

	if constant then
		print("Hello "..constant.."!")
	end
]]
parser.optimize(ast)

local luaString = parser.toLua(ast, true)
-- luaString is:
--   func();
--   print("Hello Carl!");

-- • 'constant' always refers to the same string value, thus all instances of
--   'constant' can be replaced with the string "Carl".
-- • 'unused' is never used, thus its declaration can be removed completely.
-- • The call to the global 'func' is kept because it might have side effects
--   (both the lookup of the global and the call itself).
-- • 'if "Carl" then' always passes, thus the body of the 'if' statement always
--   runs and the condition check is useless.
-- • '"Hello ".."Carl".."!"' can be concatenated into one string.

Note: References may be out-of-date after calling this.

minify

stats = parser.minify( astNode [, optimize=false ] )

Replace local variable names with short names. This function can be used to obfuscate the code to some extent. If optimize is set then optimize() is also called automatically. Returns stats.

local ast = parser.parse[[
	local function printSum(value1, value2)
		print(value1 + value2)
	end
	printSum(33, 77)
]]
parser.minify(ast)

local luaString = parser.toLua(ast, true)
-- luaString is:
--   local function e(e, t)
--     print(e + t);
--   end
--   e(33, 77);

Note: References may be out-of-date after calling this.

toLua

luaString    = parser.toLua( astNode [, prettyOutput=false, nodeCallback ] )
nodeCallback = function( node, outputBuffer )

Convert an AST to Lua, optionally calling a function on each node before it is turned into Lua. Any node in the tree with a .pretty attribute will override the prettyOutput flag for itself and its children. Nodes can also have a .prefix and/or .suffix attribute with Lua code to output before/after the node (e.g. declaration.names[1].suffix="--[[foo]]"). outputBuffer is an array of Lua code that has been output so far. Returns nil and a message on error.

local ast = parser.parse[[
	local x = foo()
	bar(8.500)
]]
local luaString = parser.toLua(ast)
-- luaString is: local x=foo();bar(8.5);

printTokens

parser.printTokens( tokens )

Print tokens to stdout.

printNode

parser.printNode( astNode )

Print information about an AST node to stdout.

See also: printTree

printTree

parser.printTree( astNode )

Print the structure of a whole AST to stdout.

See also: printNode

formatMessage

message = parser.formatMessage( [ prefix="Info", ] token,    formatString, ... )
message = parser.formatMessage( [ prefix="Info", ] astNode,  formatString, ... )
message = parser.formatMessage( [ prefix="Info", ] location, formatString, ... )

Format a message to contain a code preview window with an arrow pointing at the target token, node or location. This is used internally for formatting error messages.

if identifier.name ~= "good" then
	print(parser.formatMessage("Error", identifier, "This identifier is not good!"))
	print(parser.formatMessage(currentStatement, "Current statement."))
end

findDeclaredNames

identifiers = parser.findDeclaredNames( astNode )

Find all declared names in the tree (i.e. identifiers from AstDeclaration, AstFunction and AstFor nodes).

findGlobalReferences

identifiers = parser.findGlobalReferences( astNode )

Find all identifiers not referring to local variables in the tree.

Note: updateReferences() must be called at some point before you call this - otherwise all variables will be seen as globals!

findShadows

shadowSequences = parser.findShadows( astNode )
shadowSequences = { shadowSequence1, ... }
shadowSequence  = { shadowingIdentifier, shadowedIdentifier1, ... }

Find local variable shadowing in the tree. Each shadowSequence is an array of declared identifiers where each identifier shadows the next one.

Notes:

updateReferences() must be called at some point before you call this - otherwise all variables will be seen as globals!

Shadowing of globals cannot be detected by the function as that would require knowledge of all potential globals in your program.
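
For reference, this is the kind of local variable shadowing the description refers to, written as plain Lua and not run through the parser. Going by the shadowSequence layout above, the inner declaration would presumably be the shadowingIdentifier and the outer one the shadowed identifier; that mapping is an assumption, not verified output.

```lua
local x    = "outer"
local seen = {}

do
	local x = "inner" -- Shadows the outer 'x' for the rest of this block.
	seen[#seen+1] = x
end

seen[#seen+1] = x -- The outer 'x' is visible again here.

print(seen[1], seen[2]) -- inner  outer
```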

Constants

INT_SIZE

parser.INT_SIZE = integer

How many bits integers have. In Lua 5.3 and later this is usually 64, and in earlier versions it's 32. The int size may affect how bitwise operations involving only constants get simplified, e.g. the expression -1>>1 becomes 2147483647 in Lua 5.2 but 9223372036854775807 in Lua 5.3.
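
To see what the host Lua itself does with the expression mentioned above, the snippet below compiles it from a string, since the >> operator is a syntax error in versions before Lua 5.3. This is only a sketch of the host's behavior and does not call the parser; on a typical 64-bit Lua 5.3+ build the result matches math.maxinteger.

```lua
-- loadstring exists in Lua 5.1, load takes strings in 5.2+.
local compile = loadstring or load
local chunk   = compile("return -1 >> 1")

if chunk then
	-- Lua 5.3+: shifts are logical, so the sign bit is cleared.
	print(chunk())
else
	print("This Lua version has no native >> operator.")
end
```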

MAX_INT

parser.MAX_INT = integer

The highest representable positive signed integer value, according to INT_SIZE. This is the same value as math.maxinteger in Lua 5.3 and later. This only affects simplification of some bitwise operations.

MIN_INT

parser.MIN_INT = integer

The lowest representable (most negative) signed integer value, according to INT_SIZE. This is the same value as math.mininteger in Lua 5.3 and later. This only affects simplification of some bitwise operations.

VERSION

parser.VERSION

The parser's version number (e.g. "1.0.2").

Settings

printIds

parser.printIds = bool -- Default: false

If AST node IDs should be printed.

printLocations

parser.printLocations = bool -- Default: false

If the file location (filename and line number) should be printed for each token or AST node.

indentation

parser.indentation = string -- Default: 4 spaces

The indentation used when printing ASTs.

constantNameReplacementStringMaxLength

parser.constantNameReplacementStringMaxLength = length -- Default: 200

Normally optimize() replaces variable names that are effectively constants with their value. The exception is if the value is a string that's longer than what this setting specifies.

-- Example:
local ast = parser.parse[==[
	local short = "a"
	local long  = "xy"
	func(short, long)
]==]
parser.constantNameReplacementStringMaxLength = 1
parser.optimize(ast)
print(parser.toLua(ast)) -- local long="xy";func("a",long);

Tokens

Tokens are represented by tables.

Token fields

  • type: Token type.
  • value: Token value. All token types have a string value, except "number" tokens which have a number value.
  • representation: The token's code representation. (Strings have surrounding quotes, comments start with "--" etc.)
  • sourceString: The original source string, or "" if there is none.
  • sourcePath: Path to the source file, or "?" if there is none.
  • lineStart: Start line number in sourceString, or 0 by default.
  • lineEnd: End line number in sourceString, or 0 by default.
  • positionStart: Start byte position in sourceString, or 0 by default.
  • positionEnd: End byte position in sourceString, or 0 by default.

Token types

Each token type and its possible values:

  • "comment": Any string value.
  • "identifier": Any word that is not a keyword.
  • "keyword": "and", "break", "do", "else", "elseif", "end", "false", "for", "function", "goto", "if", "in", "local", "nil", "not", "or", "repeat", "return", "then", "true", "until", "while"
  • "number": Any number.
  • "punctuation": "+", "-", "*", "/", "%", "^", "#", "&", "~", "|", "<<", ">>", "//", "==", "~=", "<=", ">=", "<", ">", "=", "(", ")", "{", "}", "[", "]", "::", ";", ":", ",", ".", "..", "..."
  • "string": Any string value.

AST

AST nodes are represented by tables.

Node types

Expressions
Statements
  • AstBreak: Loop break statement.
  • AstReturn: Function/chunk return statement, possibly with values.
  • AstLabel: Label for goto commands.
  • AstGoto: A jump to a label.
  • AstBlock: List of statements. Blocks inside blocks are do...end statements.
  • AstDeclaration: Declaration of one or more local variables, possibly with initial values.
  • AstAssignment: Assignment of one or more values to one or more variables.
  • AstIf: If statement with a condition, a body if the condition is true, and possibly another body if the condition is false.
  • AstWhile: A while loop.
  • AstRepeat: A repeat loop.
  • AstFor: A for loop.
  • AstCall: Function call. (Calls can be both expressions and statements.)

Node fields

Common fields

All nodes have these fields.

  • id: Unique ID/serial number for the node.
  • type: The node's type.
  • sourceString: The original source string, or "" if there is none.
  • sourcePath: Path to the source file, or "?" if there is none.
  • line: Line number in sourceString, or 0 by default.
  • position: Byte position in sourceString, or 0 by default.
  • parent: Refers to the node's parent in the tree. Updated by updateReferences().
  • container: Refers to the specific table that the node is in, which could be the parent itself or a field in the parent. Updated by updateReferences().
  • key: Refers to the specific field in the container that the node is in (which is either a string or an integer). Updated by updateReferences().

AstIdentifier

AstVararg

  • type: "vararg"
  • declaration: AstVararg (whose parent is an AstFunction). Updated by updateReferences(). This is nil in the main chunk (or in a non-vararg function, which is probably an error).
  • adjustToOne: Boolean. true if parentheses surround the vararg.

AstLiteral

  • type: "literal"
  • value: Number, string, boolean or nil.

AstTable

  • type: "table"
  • fields: Array of {key=expression, value=expression, generatedKey=bool}. generatedKey is true for implicit keys (i.e. {x,y}) and false for explicit keys (i.e. {a=x,b=y}). Note that the state of generatedKey affects the output of toLua()! key may be nil if generatedKey is true.

AstLookup

AstUnary

  • type: "unary"
  • operator: "-", "not", "#" or "~".
  • expression: Expression.

AstBinary

  • type: "binary"
  • operator: "+", "-", "*", "/", "//", "^", "%", "&", "~", "|", ">>", "<<", "..", "<", "<=", ">", ">=", "==", "~=", "and" or "or".
  • left: Expression.
  • right: Expression.

AstCall

  • type: "call"
  • callee: Expression.
  • arguments: Array of expressions.
  • method: Boolean. true if the call is a method call. Method calls must have a callee that is a lookup with a member expression that is a string literal that can pass as an identifier.
  • adjustToOne: Boolean. true if parentheses surround the call.

AstFunction

AstBreak

  • type: "break"

AstReturn

AstLabel

  • type: "label"
  • name: String. The value must be able to pass as an identifier.

AstGoto

AstBlock

AstDeclaration

AstAssignment

AstIf

AstWhile

AstRepeat

AstFor

  • type: "for"
  • kind: "numeric" or "generic".
  • names: Non-empty array of AstIdentifier. "numeric" loops must have only one name.
  • values: Non-empty array of expressions. "numeric" loops must have either 2 or 3 values.
  • body: AstBlock.
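
The constraints listed for AstFor can be expressed as a small check. The helper below is hypothetical (checkForNode is not part of the library, and validateTree() may check more or report differently); it only encodes the rules stated in the list above and is run on a hand-built stand-in node.

```lua
-- Hypothetical helper: returns true, or false and a message.
local function checkForNode(node)
	if node.type ~= "for" then
		return false, "not a for node"
	end
	if node.kind ~= "numeric" and node.kind ~= "generic" then
		return false, "kind must be 'numeric' or 'generic'"
	end
	if #node.names == 0 then  return false, "names must be non-empty"  end
	if #node.values == 0 then  return false, "values must be non-empty"  end
	if node.kind == "numeric" then
		if #node.names ~= 1 then
			return false, "numeric loops must have only one name"
		end
		if #node.values < 2 or #node.values > 3 then
			return false, "numeric loops must have either 2 or 3 values"
		end
	end
	return true
end

-- Hand-built stand-in for: for i = 1, 10 do end
local numericLoop = {
	type   = "for",
	kind   = "numeric",
	names  = {{type="identifier", name="i"}},
	values = {{type="literal", value=1}, {type="literal", value=10}},
	body   = {type="block", statements={}},
}

local okBefore = checkForNode(numericLoop)
numericLoop.values[2] = nil -- Leaves a single value, which is too few.
local okAfter, message = checkForNode(numericLoop)

print(okBefore)         -- true
print(okAfter, message) -- false  numeric loops must have either 2 or 3 values
```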

Other objects

Stats

Some functions return a stats table which contains these fields:

  • nodeReplacements: Array of locations of nodes that were replaced.
  • nodeRemovals: Array of locations of nodes or tree branches that were removed.
  • nodeRemoveCount: Number. How many nodes were removed, including subnodes of nodeRemovals.
  • renameCount: Number. How many identifiers were renamed.
  • generatedNameCount: Number. How many unique names were generated.

Locations

Locations are tables with these fields:

  • sourceString: The original source string, or "" if there is none.
  • sourcePath: Path to the source file, or "?" if there is none.
  • line: Line number in sourceString, or 0 by default.
  • position: Byte position in sourceString, or 0 by default.
  • node: The node the location points to, or nil if there is none.
  • replacement: The node that replaced node, or nil if there is none. (This is set for stats.nodeReplacements.)

Notes

Syntactic sugar

Things are parsed into their desugared form.

t.k
-- is parsed as...
t["k"]

f""
f{}
-- is parsed as...
f("")
f({})

local function f() end
-- is parsed as...
local f
f = function() end

function t.k:m() end
-- is parsed as...
t.k.m = function(self) end

if x then
elseif y then
end
-- is parsed as...
if x then
else
	if y then
	end
end

The reverse is sometimes also true when converting ASTs to Lua - syntactic sugar may be applied in some cases.
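
These equivalences are ordinary Lua semantics, so they can be checked without the parser. The sketch below uses made-up names (t, f, countdown) purely for illustration.

```lua
local t = {k = {}}
local function f(arg)  return type(arg)  end

-- t.k is the same lookup as t["k"].
print(t.k == t["k"]) -- true

-- f"" and f{} are calls with a single argument.
print(f"", f{}) -- string  table

-- A method definition is an assignment of a function taking 'self'.
function t.k:m()  return self  end
t.k.n = function(self)  return self  end
print(t.k:m() == t.k, t.k:n() == t.k) -- true  true

-- 'local function' declares the name before assigning the function,
-- which is why the body can refer to itself.
local function countdown(n)
	if n == 0 then  return "done"  end
	return countdown(n - 1)
end
print(countdown(3)) -- done
```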

Special number notation rules

The expression -n is parsed as a single number literal if n is a numeral (i.e. the result is a negative number).

The expression n/0 is parsed as a single number literal if n is a numeral. If n is positive then the result is math.huge, if n is negative then the result is -math.huge, or if n is 0 then the result is NaN.
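
The values described for n/0 match what plain Lua produces for float division by zero, which can be confirmed directly (this snippet does not involve the parser):

```lua
local posInf = 1/0
local negInf = -1/0
local nan    = 0/0

print(posInf == math.huge)  -- true
print(negInf == -math.huge) -- true
print(nan ~= nan)           -- true (NaN is never equal to itself)
```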

Page updated: 2022-06-23