Typechecking and code generation
TODO: Add notes here TODO: Talk about the queues that exist
Instructions
Code generation involves producing instructions and consuming them (embedding them in other instructions). There are several types of instruction, but the most important base ones are listed below.
The base Instruction
Every type of instruction that is produced during the code generation phase is a kind-of Instruction. It is the base class for all instructions and contains some common methods used by all of them:
setContext(Context)
- Sets the Context object that is to be associated with this instruction.
- This is normally done as a way to transfer the context from the respective parser-node to the corresponding instruction, such that if the context is needed during further code generation (or even emit) it can then be accessed.
Context getContext()
- Returns this instruction’s associated context via its Context object
string produceToStrEnclose(string addInfo)
- Returns a string containing the additional info provided through addInfo
- The format of the returned string will be [Instruction: <className>: <addInfo>] where <className> is the name of the instruction type (kind-of) and <addInfo> is as explained previously
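The three methods above can be sketched as follows. This is a minimal illustration and not the compiler's actual source: the empty Context class and the NopInstr subclass are hypothetical stand-ins, and taking the class name from typeid(this).name (which in D yields the fully-qualified name, module included) is an assumption about how <className> is obtained.

```d
// Hypothetical stand-in for the compiler's real Context type
class Context {}

class Instruction
{
    private Context context;

    // Associate a context (normally taken from the parser-node)
    void setContext(Context context)
    {
        this.context = context;
    }

    // Return the associated context (null if none was set)
    Context getContext()
    {
        return context;
    }

    // Wrap addInfo as "[Instruction: <className>: <addInfo>]"
    string produceToStrEnclose(string addInfo)
    {
        // Assumption: the class name comes from the runtime type
        return "[Instruction: " ~ typeid(this).name ~ ": " ~ addInfo ~ "]";
    }
}

// Hypothetical instruction kind, for illustration only
class NopInstr : Instruction {}

unittest
{
    Instruction instr = new NopInstr();
    Context ctx = new Context();
    instr.setContext(ctx);
    assert(instr.getContext() is ctx);
}
```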
Value-based instructions (Value)
TODO: Talk about the Value instruction base class here
A Value instruction is a kind-of Instruction which represents code that generates some sort of value: think of literals, arithmetic operations, pointer dereferences, variable reads and so on. Every such instruction has an associated Type object, which records the intended type of the instruction. Below we show the API of the Value class:
Type getType()
- Returns the type associated with this instruction
setType(Type)
- Set the type to be associated with this instruction
There are many instructions which sub-type this Value class; these can be found in <TODO: Insert path here and put all Value-based instructions in their own module>.
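As a rough sketch of how a Value-based instruction carries its type, consider the following. The Type class here is a hypothetical stand-in, the LiteralValue constructor signature is invented for illustration, and the accessor names follow the list above (the example code later in this document calls the getter getInstrType(), so the real names may differ).

```d
// Hypothetical stand-ins for the compiler's real types
class Type
{
    string name;
    this(string name) { this.name = name; }
}

class Instruction {}

// A Value is an Instruction that also carries the type it evaluates to
class Value : Instruction
{
    private Type type;

    Type getType() { return type; }
    void setType(Type type) { this.type = type; }
}

// Hypothetical Value-based instruction holding a literal's text
class LiteralValue : Value
{
    string literal;

    this(string literal, Type type)
    {
        this.literal = literal;
        setType(type);
    }
}

unittest
{
    auto lit = new LiteralValue("2", new Type("ubyte"));
    assert(lit.getType().name == "ubyte");

    // Down-casting a non-Value instruction yields null
    Instruction plain = new Instruction();
    assert(cast(Value) plain is null);
}
```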
Code generation
Code generation and type checking start from a so-called “action list”, which is a linear array of dependency-nodes (DNodes in the code). This list is iterated through by a for-loop, and each DNode is passed to a method called typeCheckThing(DNode).
Every instruction type, along with its associated typechecking requirements, is handled in one huge if-statement within the typeCheckThing(DNode) method. This method analyses a given dependency-node and performs the required typechecking by extracting the DNode’s embedded parser-node. If a type check passes, code generation takes place: the corresponding instruction is generated and added at some position in the code queue (discussed later).
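The overall shape of this loop and the if-statement dispatch might look something like the sketch below. All type names other than DNode and typeCheckThing are hypothetical (Statement, VariableDeclNode, FuncCallNode, getEntity, process), and the branches only record which case matched rather than performing real type checking or code generation.

```d
// Hypothetical stand-ins; the real parser-node and DNode types are richer
class Statement {}
class VariableDeclNode : Statement {}
class FuncCallNode : Statement {}

class DNode
{
    private Statement entity;
    this(Statement entity) { this.entity = entity; }

    // Hypothetical accessor for the embedded parser-node
    Statement getEntity() { return entity; }
}

class TypeChecker
{
    string[] log; // records which branch handled each node

    void typeCheckThing(DNode dnode)
    {
        Statement node = dnode.getEntity();

        // One branch per kind of parser-node; a failed cast yields null
        if(cast(VariableDeclNode) node)
        {
            log ~= "variable declaration";
        }
        else if(cast(FuncCallNode) node)
        {
            log ~= "function call";
        }
        else
        {
            log ~= "unhandled";
        }
    }

    // Walk the action list in order, one DNode at a time
    void process(DNode[] actionList)
    {
        foreach(DNode dnode; actionList)
        {
            typeCheckThing(dnode);
        }
    }
}
```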
Code queue
TODO: Add information on this
The code queue is used as both a stack and a queue in order to facilitate instruction generation. Certain instructions are produced once-off and then added to the back of the queue (“consuming” instructions), whilst others are produced and pushed onto the top of the queue (“producing” instructions) for consumption by other consuming instructions later.
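A minimal sketch of such a stack-queue is given here. The CodeQueue class and its method names (pushFront, pushBack, pop) are hypothetical and chosen only for this illustration, and the order in which a consumer pops its operands is likewise an assumption.

```d
// Hypothetical stand-in: an instruction identified only by a name
class Instruction
{
    string name;
    this(string name) { this.name = name; }
}

class CodeQueue
{
    private Instruction[] queue;

    // Push onto the top (front), for instructions awaiting consumption
    void pushFront(Instruction instr) { queue = instr ~ queue; }

    // Append to the back, for instructions nothing else will consume
    void pushBack(Instruction instr) { queue ~= instr; }

    // Pop the top (front) instruction; returns null when empty
    Instruction pop()
    {
        if(queue.length == 0)
        {
            return null;
        }
        Instruction top = queue[0];
        queue = queue[1 .. $];
        return top;
    }
}

unittest
{
    auto q = new CodeQueue();
    q.pushFront(new Instruction("LiteralValue"));
    q.pushFront(new Instruction("FuncCall"));

    // A consumer pops its operands, most recently produced first
    Instruction rhs = q.pop();
    Instruction lhs = q.pop();
    q.pushFront(new Instruction("BinOp(" ~ lhs.name ~ ", " ~ rhs.name ~ ")"));
    assert(q.pop().name == "BinOp(LiteralValue, FuncCall)");
}
```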
An example of this would be the following T code which uses a binary operation with two operands (one being a LiteralValue instruction and the other being a FuncCall instruction):
This would result in a situation where we have the following production
Enforcement
Enforcement is the procedure of ensuring that a given Value-based instruction, \(instr_{i}\), conforms to the target type or “to-type”, \(type_{i}\). An optional flag can be passed such that, if \(typeof(instr_{i}) \neq type_{i}\), coercion is attempted in order to bring the instruction to the target type.
The method by which this is done is:
We will discuss exact equality and exact equality through coercion in the next two sections.
Type equality
In order to check strict equality the type enforcer will initially check the following condition. We label the toType as \(t_{1}\) and the type of v2 as \(typeof(v_{2})\) (otherwise referred to as \(t_{2}\)).
The method isSameType(Type t1, Type t2) provides exact equality checking between the two given types in the form of \(t_{1} = t_{2}\).
Coercion
In the case of coercion, \(coerce()\) is applied to the incoming instruction to produce an instruction \(coerceInstr_{i}\), a CastedValueInstruction, which wraps the original instruction but allows for a type cast/conversion to the target type. This makes the statement \(type_{i} = typeof(coerce(instr_{i}))\) (which is the same as \(type_{i} = typeof(coerceInstr_{i})\)) valid.
TODO: Document this now
Coercion has a set of rules (TODO: document them) in terms of what can be coerced. If the coercion fails then a CoercionException is thrown. If, however, it succeeds then a CastedValueInstruction will be placed into the memory location (the variable) pointed to by the ref parameter of typeEnforce(..., ..., ref Instruction coercedInstruction, true).
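To make the flow of enforcement concrete, here is a simplified sketch of how an exact-equality check followed by optional coercion could fit together. It is not the compiler's implementation: the name-based isSameType, the canCoerce helper and its single ubyte-to-uint rule are placeholders, the ref parameter is typed as Value so that a Value variable can be passed to it, and throwing a plain Exception when coercion is not allowed is an assumption, since the text does not describe that case.

```d
class Type
{
    string name;
    this(string name) { this.name = name; }
}

class Instruction {}

class Value : Instruction
{
    private Type type;
    Type getInstrType() { return type; }
    void setInstrType(Type type) { this.type = type; }
}

// Wraps the original instruction and carries the target type
class CastedValueInstruction : Value
{
    Value uncasted;
    this(Value uncasted, Type castTo)
    {
        this.uncasted = uncasted;
        setInstrType(castTo);
    }
}

class CoercionException : Exception
{
    this(string msg) { super(msg); }
}

// Placeholder equality check (compares names only)
bool isSameType(Type t1, Type t2)
{
    return t1.name == t2.name;
}

// Placeholder rule: only ubyte -> uint is coercible in this sketch
bool canCoerce(Type from, Type to)
{
    return from.name == "ubyte" && to.name == "uint";
}

void typeEnforce(Type toType, Value v2, ref Value coercedInstruction, bool allowCoercion = false)
{
    Type t2 = v2.getInstrType();

    // Exact match: nothing to do
    if(isSameType(toType, t2))
    {
        return;
    }

    // Assumption: the source does not say what happens in this case
    if(!allowCoercion)
    {
        throw new Exception("Type mismatch: " ~ t2.name ~ " is not " ~ toType.name);
    }

    if(!canCoerce(t2, toType))
    {
        throw new CoercionException("Cannot coerce " ~ t2.name ~ " to " ~ toType.name);
    }

    // Hand the wrapping instruction back through the ref parameter
    coercedInstruction = new CastedValueInstruction(v2, toType);
}
```

Because v2 is passed by value and the result is written through a separate ref parameter, the same variable can be given for both, which is how the example that follows calls it.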
Example
Below we have an example of the code which processes variable declarations with assignments (think of byte i = 2):
Value assignmentInstr;
if(variablePNode.getAssignment())
{
    Instruction poppedInstr = popInstr();
    assert(poppedInstr);
    // Obtain the value instruction of the variable assignment
    // ... along with the assignment's type
    assignmentInstr = cast(Value)poppedInstr;
    assert(assignmentInstr);
    Type assignmentType = assignmentInstr.getInstrType();
    /**
     * Here we can call the `typeEnforce` with the popped
     * `Value` instruction and the type to coerce to
     * (our variable's type)
     */
    typeEnforce(variableDeclarationType, assignmentInstr, assignmentInstr, true);
    assert(isSameType(variableDeclarationType, assignmentInstr.getInstrType())); // Sanity check
}
...
What the above code is doing is:
- Firstly popping off an Instruction from the stack-queue and then down-casting it to Value (a Value-based instruction is required, as an expression is being assigned)
- We then call typeEnforce() providing it with:
  * The variable’s type - the variableDeclarationType
  * The incoming Value-based instruction assignmentInstr
  * The third argument, which is ref-based, meaning what we provide it is the variable into which the result of the enforcement (if coercion is required) will be placed
  * The last argument is true, meaning “Please attempt coercion if the types are not exactly equal”
The last line of the if-block contains an assertion:
assert(isSameType(variableDeclarationType, assignmentInstr.getInstrType())); // Sanity check
This is a sanity check: if the type coercion had failed then an exception would have been thrown and the assertion would not be reached, whereas if the types were an exact match, or were not but could be coerced, then the two types must now match.
Variable reference counting
Firstly let me make it clear that this has nothing to do with runtime reference counting but rather a simple mechanism used to maintain a count or number of references to variables after their declaration.
Below is a method table of the methods of concern:
Method | Description | Return |
---|---|---|
touch(Variable) | Increments the count by 1 for the given variable, creates a mapping if one does not yet exist | void |
getUnusedVariables() | Returns an array of all Variables which have a reference count no higher than 1 (touched only by their declaration) | Variable[] |
This aids us in implementing a single feature: unused variable detection. It’s rather simple: reference counts are incremented using a touch(Variable) method defined in the TypeChecker, which is called whilst doing dependency generation in the dependency generator.
The first time a variable is encountered, even if that is its declaration, we touch(...) it. At the end of type checking we then call the getUnusedVariables() method, which returns a list of the unused variables. Since the declaration itself counts as one touch, these are the variables whose reference count is no higher than 1.
We then print these out so the user can see which are unused.
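The counting mechanism could be sketched roughly as below. The RefCounter class and the minimal Variable type are hypothetical (in the compiler these methods live on the TypeChecker), and the sketch assumes that a variable counts as unused when it was touched only once, by its own declaration.

```d
// Hypothetical stand-in for the compiler's Variable entity
class Variable
{
    string name;
    this(string name) { this.name = name; }
}

class RefCounter
{
    // Maps each variable (by object identity) to its touch count
    private uint[Variable] refCounts;

    // Increment the count, creating the mapping on first touch
    void touch(Variable variable)
    {
        if(auto count = variable in refCounts)
        {
            (*count)++;
        }
        else
        {
            refCounts[variable] = 1;
        }
    }

    // Assumption: count <= 1 means only the declaration touched it
    Variable[] getUnusedVariables()
    {
        Variable[] unused;
        foreach(variable, count; refCounts)
        {
            if(count <= 1)
            {
                unused ~= variable;
            }
        }
        return unused;
    }
}

unittest
{
    auto rc = new RefCounter();
    auto a = new Variable("a");
    auto b = new Variable("b");

    rc.touch(a); // declaration of a
    rc.touch(b); // declaration of b
    rc.touch(b); // b is read in an expression

    Variable[] unused = rc.getUnusedVariables();
    assert(unused.length == 1 && unused[0] is a);
}
```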
Usage
Example usage below shows us touch-ing a variable when we process it in an expression such as a VariableExpression in the dependency module:
...
/* Get the entity as a Variable */
Variable variable = cast(Variable)namedEntity;
/* Variable reference count must increase */
tc.touch(variable);
...
We then, after typechecking, run the following in the type checker module’s doPostChecks() method:
/**
 * Find the variables which were declared but never used
 */
// Use && so that getConfig() is only called when the entry exists
if(this.config.hasConfig("typecheck:warnUnusedVars") && this.config.getConfig("typecheck:warnUnusedVars").getBoolean())
{
    Variable[] unusedVariables = getUnusedVariables();
    gprintln("There are "~to!(string)(unusedVariables.length)~" unused variables");
    if(unusedVariables.length)
    {
        foreach(Variable unusedVariable; unusedVariables)
        {
            // TODO: Get a nicer name, full path-based
            gprintln("Variable '"~to!(string)(unusedVariable.getName())~"' is declared but never used");
        }
    }
}