- Attempt to fix the tables rendering
parent 0bca21937b
commit 68d7ce5b8e
|
@ -1,10 +1,11 @@
|
||||||
## Lexical analysis
|
## Lexical analysis
|
||||||
|
|
||||||
Lexical analysis is the process of taking a program as an input string
|
Lexical analysis is the process of taking a program as an input string
|
||||||
$A$ and splitting it into a list of $n$ sub-strings
|
*A* and splitting it into a list of *n* sub-strings
|
||||||
$A_{1},\,A_{2}\ldots A_{n}$ called tokens. The length $n$ of this list
|
*A*<sub>1</sub>, *A*<sub>2</sub>…*A*<sub>*n*</sub> called tokens. The
|
||||||
is dependent on several rules that determine how, when and where new
|
length *n* of this list is dependent on several rules that determine
|
||||||
tokens are built - this set of rules is called a *grammar*.
|
how, when and where new tokens are built - this set of rules is called a
|
||||||
|
*grammar*.
|
||||||
|
|
||||||
### Grammar
|
### Grammar
|
||||||
|
|
||||||
|
@ -27,7 +28,7 @@ definitions:
|
||||||
new Token("int") == new Token("int")
|
new Token("int") == new Token("int")
|
||||||
```
|
```
|
||||||
|
|
||||||
- ...would evaluate to `true`, rather than false by reference
|
- …would evaluate to `true`, rather than false by reference
|
||||||
equality (the default in D)
|
equality (the default in D)
|
||||||
- `Lexer` - The token builder
|
- `Lexer` - The token builder
|
||||||
- `sourceCode`, the whole input program (as a string) to be
|
- `sourceCode`, the whole input program (as a string) to be
|
||||||
|
@ -38,7 +39,7 @@ definitions:
|
||||||
- Contains a list of the currently built tokens, `Token[] tokens`
|
- Contains a list of the currently built tokens, `Token[] tokens`
|
||||||
- Current line and column numbers as `line` and `column`
|
- Current line and column numbers as `line` and `column`
|
||||||
respectively
|
respectively
|
||||||
- A "build up" - this is the token (in string form) currently
|
- A “build up” - this is the token (in string form) currently
|
||||||
being built - `currentToken`
|
being built - `currentToken`
|
||||||
|
|
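The value-equality behaviour described for `Token` above can be sketched as follows. This is a minimal illustration, not the project's actual class: the `text` field and the constructor shape are assumptions made for the example, and only `opEquals` is the point of interest.

```d
/* Hypothetical, stripped-down Token; the real class may hold more state */
class Token
{
    private string text;

    this(string text)
    {
        this.text = text;
    }

    /* Compare by token text instead of D's default reference equality */
    override bool opEquals(Object other)
    {
        auto otherToken = cast(Token) other;
        return otherToken !is null && otherToken.text == text;
    }
}
```

With such an override, `==` compares the token text while `is` still compares object identity, so two separately constructed `Token("int")` instances are equal but not identical.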
||||||
### Implementation
|
### Implementation
|
||||||
|
@ -124,13 +125,10 @@ character == ':';
|
||||||
|
|
||||||
!!! error Finish this page
|
!!! error Finish this page
|
||||||
|
|
||||||
•
|
•
|
||||||
`\texttt{;}`{=tex} `\texttt{,}`{=tex} `\texttt{(}`{=tex} `\texttt{)}`{=tex} `\texttt{[}`{=tex} `\texttt{]}`{=tex} `\texttt{+}`{=tex} `\texttt{-}`{=tex} `\texttt{/}`{=tex} `\texttt{\%}`{=tex} `\texttt{*}`{=tex} `\texttt{\&}`{=tex} `\texttt{\{}`{=tex} `\texttt{\}}`{=tex}
|
|
||||||
|
|
||||||
• `\texttt{=}`{=tex} \| (TODO: make it
|
• \| (TODO: make it texttt) \\texttt{^} (TODO: not
|
||||||
texttt) \\texttt{\^} `\texttt{!}`{=tex} `\texttt{\\n}`{=tex}(TODO:
|
appearing) \\texttt{\~}
|
||||||
`\n `{=tex}not
|
|
||||||
appearing) \\texttt{\~} `\texttt{.}`{=tex} `\texttt{\:}`{=tex}
|
|
||||||
|
|
||||||
Whenever this method returns `true` it generally means you should flush
|
Whenever this method returns `true` it generally means you should flush
|
||||||
the current token, start a new token, add the offending splitter token and
|
the current token, start a new token, add the offending splitter token and
|
||||||
|
|
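A rough sketch of the splitter check and the flush step described above, under stated assumptions: the names `isSplitter` and `tokenize` are hypothetical, the character set is copied from the list in this hunk, and treating a space as a plain separator is an assumption of the example rather than documented lexer behaviour.

```d
/* Hypothetical predicate; the characters are those listed in this hunk */
bool isSplitter(char character)
{
    switch (character)
    {
        case ';', ',', '(', ')', '[', ']', '+', '-', '/', '%', '*',
             '&', '{', '}', '=', '|', '^', '!', '\n', '~', '.', ':':
            return true;
        default:
            return false;
    }
}

/* Hypothetical driver showing "flush, then add the splitter as a token" */
string[] tokenize(string sourceCode)
{
    string[] tokens;
    string currentToken;

    foreach (char character; sourceCode)
    {
        if (isSplitter(character) || character == ' ')
        {
            /* Flush whatever has been built up so far */
            if (currentToken.length > 0)
            {
                tokens ~= currentToken;
                currentToken = "";
            }

            /* A splitter becomes a token of its own; a space is dropped */
            if (character != ' ')
            {
                tokens ~= [character].idup;
            }
        }
        else
        {
            currentToken ~= character;
        }
    }

    /* Flush any trailing build-up */
    if (currentToken.length > 0)
    {
        tokens ~= currentToken;
    }

    return tokens;
}
```

For the input `int x=1;` this sketch yields the tokens `int`, `x`, `=`, `1` and `;`.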
|
@ -2,7 +2,7 @@
|
||||||
|
|
||||||
Once we have generated a list of tokens (instances of `Token`) from the
|
Once we have generated a list of tokens (instances of `Token`) from the
|
||||||
`Lexer` instance we need to turn these into a structure that represents
|
`Lexer` instance we need to turn these into a structure that represents
|
||||||
our program's source code *but* using in-memory data-structures which we
|
our program’s source code *but* using in-memory data-structures which we
|
||||||
can traverse and process at a later stage.
|
can traverse and process at a later stage.
|
||||||
|
|
||||||
### Overview
|
### Overview
|
||||||
|
@ -12,7 +12,7 @@ sub-structures of a TLang program and returning different data types
|
||||||
generated by these methods. The parser has the ability to move back and
|
generated by these methods. The parser has the ability to move back and
|
||||||
forth between the token stream provided and fetch the current token
|
forth between the token stream provided and fetch the current token
|
||||||
(along with analysing it to return the type of symbol the token
|
(along with analysing it to return the type of symbol the token
|
||||||
represents - known as the `SymbolType` (TODO: Cite the "Symbol types"
|
represents - known as the `SymbolType` (TODO: Cite the “Symbol types”
|
||||||
section).
|
section).
|
||||||
|
|
||||||
For example, the method `parseIf()` is used to parse if statements, it
|
For example, the method `parseIf()` is used to parse if statements, it
|
||||||
|
@ -30,7 +30,7 @@ proper module support
|
||||||
|
|
||||||
### API
|
### API
|
||||||
|
|
||||||
The API exposed by the parser is rather minimal as there isn't much to a
|
The API exposed by the parser is rather minimal as there isn’t much to a
|
||||||
parser than controlling the token stream pointer (the position in the
|
parser than controlling the token stream pointer (the position in the
|
||||||
token stream), fetching the token and acting upon the type or value of
|
token stream), fetching the token and acting upon the type or value of
|
||||||
said token. Therefore we have the methods summarised below:
|
said token. Therefore we have the methods summarised below:
|
||||||
|
@ -132,7 +132,7 @@ within the `parsing/data/check.d` and contains the following methods:
|
||||||
that you provide it a `SymbolType` and it will return the
|
that you provide it a `SymbolType` and it will return the
|
||||||
corresponding string that is of that type.
|
corresponding string that is of that type.
|
||||||
- This will work only for back-mapping a sub-section of tokens as
|
- This will work only for back-mapping a sub-section of tokens as
|
||||||
you won't get anything back if you provide
|
you won’t get anything back if you provide
|
||||||
`SymbolType.IDENT_TYPE` as there are infinite possibilities for
|
`SymbolType.IDENT_TYPE` as there are infinite possibilities for
|
||||||
that - not a fixed token.
|
that - not a fixed token.
|
||||||
|
|
||||||
|
@ -280,7 +280,7 @@ while (hasTokens())
|
||||||
```
|
```
|
||||||
|
|
||||||
Following this we now have several checks that make use of
|
Following this we now have several checks that make use of
|
||||||
`getSymbolType(Token)` in order to determine what the token's type is
|
`getSymbolType(Token)` in order to determine what the token’s type is
|
||||||
and then in our case if the token is `"if"` then we will make a call to
|
and then in our case if the token is `"if"` then we will make a call to
|
||||||
`parseIf()` and append the returned Statement-sub-type to the body of
|
`parseIf()` and append the returned Statement-sub-type to the body of
|
||||||
statements (`Statement[]`):
|
statements (`Statement[]`):
|
||||||
|
|
|
@ -12,7 +12,7 @@ time but also makes the implementation bloated as all logic must be in
|
||||||
one file to support each stage. There are also other disbenefits:
|
one file to support each stage. There are also other disbenefits:
|
||||||
|
|
||||||
1. Symbol definitions
|
1. Symbol definitions
|
||||||
- Doing a single pass means you haven't stored all symbols in the
|
- Doing a single pass means you haven’t stored all symbols in the
|
||||||
program yet, hence resolution of some will fail unless you do
|
program yet, hence resolution of some will fail unless you do
|
||||||
some sort of over-complicated lookahead to find them - cache
|
some sort of over-complicated lookahead to find them - cache
|
||||||
them - and then retry. In general it makes all sort of
|
them - and then retry. In general it makes all sort of
|
||||||
|
@ -24,28 +24,28 @@ one file to support each stage. There are also other disbenefits:
|
||||||
2. Dependencies
|
2. Dependencies
|
||||||
- Some instructions must be generated which do not have a
|
- Some instructions must be generated which do not have a
|
||||||
syntactical mapping. I.e. the static initialization of a class
|
syntactical mapping. I.e. the static initialization of a class
|
||||||
doesn't have a parser/AST node equivalent. Therefore our
|
doesn’t have a parser/AST node equivalent. Therefore our
|
||||||
multi-stage system of parser-to-dependency conversions allows
|
multi-stage system of parser-to-dependency conversions allows
|
||||||
us to convert all AST nodes to dependency nodes and add extra
|
us to convert all AST nodes to dependency nodes and add extra
|
||||||
dependency nodes (such as `ClassStaticInit`) into the dependency
|
dependency nodes (such as `ClassStaticInit`) into the dependency
|
||||||
tree despite them having no AST equivalent.
|
tree despite them having no AST equivalent.
|
||||||
- Splitting this up also lets us more easily reason, once again, about
|
- Splitting this up also lets us more easily reason, once again, about
|
||||||
symbols that are defined but require static initializations, and
|
symbols that are defined but require static initializations, and
|
||||||
looping structures which must be resolved and can easily be done
|
looping structures which must be resolved and can easily be done
|
||||||
if we know all symbols (we just walk the AST tree)
|
if we know all symbols (we just walk the AST tree)
|
||||||
|
|
||||||
*And the list goes on...*
|
*And the list goes on…*
|
||||||
|
|
||||||
Hopefully now one understands as to why a multi-pass compiler is both
|
Hopefully now one understands as to why a multi-pass compiler is both
|
||||||
easier to write (as the code is more modular) and easier to reason about
|
easier to write (as the code is more modular) and easier to reason about
|
||||||
in terms of symbol resolution. It is for this reason that a lot of the code
|
in terms of symbol resolution. It is for this reason that a lot of the code
|
||||||
you see in the dependency processor looks like a duplicate of the parser
|
you see in the dependency processor looks like a duplicate of the parser
|
||||||
processor but in reality it's doing something different - it's generating
|
processor but in reality it’s doing something different - it’s generating
|
||||||
the actual executable atoms that must be typechecked and have code
|
the actual executable atoms that must be typechecked and have code
|
||||||
generated for - taking into account looping structures and so forth.
|
generated for - taking into account looping structures and so forth.
|
||||||
|
|
||||||
> The dependency processor adds execution to the AST tree and the
|
> The dependency processor adds execution to the AST tree and the
|
||||||
> ability to reason about visited nodes and "already-initted" structures
|
> ability to reason about visited nodes and “already-initted” structures
|
||||||
|
|
||||||
### What gets accomplished?
|
### What gets accomplished?
|
||||||
|
|
||||||
|
@ -60,19 +60,19 @@ and creation process provides us:
|
||||||
mark them as visited hence a use-before-declare situation is
|
mark them as visited hence a use-before-declare situation is
|
||||||
easy to detect and report to the end-user
|
easy to detect and report to the end-user
|
||||||
2. Tree of execution
|
2. Tree of execution
|
||||||
- When the dependency tree is fully created it can be "linearized"
|
- When the dependency tree is fully created it can be “linearized”
|
||||||
or left-hand leaf visited whereby each leaf-left node is
|
or left-hand leaf visited whereby each leaf-left node is
|
||||||
appended into an array.
|
appended into an array.
|
||||||
- This array then provides us a list of `DNode`s we walk through
|
- This array then provides us a list of `DNode`s we walk through
|
||||||
in the typechecker and can effectively generate instructions
|
in the typechecker and can effectively generate instructions
|
||||||
from and perform typechecking
|
from and perform typechecking
|
||||||
- It's an easy to walk through "process - typecheck - code gen".
|
- It’s an easy to walk through “process - typecheck - code gen”.
|
||||||
3. Non-AST equivalents
|
3. Non-AST equivalents
|
||||||
- There is no equivalent AST node that represents a "static
|
- There is no equivalent AST node that represents a “static
|
||||||
allocation" - that is something derived from the AST tree,
|
allocation” - that is something derived from the AST tree,
|
||||||
therefore we need a list of **concrete** "instructions" which
|
therefore we need a list of **concrete** “instructions” which
|
||||||
precisely tell the code generator what to do - this is one of
|
precisely tell the code generator what to do - this is one of
|
||||||
those cases where an AST tree wouldn't help us - or we would
|
those cases where an AST tree wouldn’t help us - or we would
|
||||||
effectively have to implement this all in the parser which leads
|
effectively have to implement this all in the parser which leads
|
||||||
to an overly complex parser.
|
to an overly complex parser.
|
||||||
|
|
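The "linearization" in item 2 above can be read as a depth-first, dependencies-first walk. The sketch below is only an illustration of that idea: this `DNode` is a stand-in with just the fields the example needs, and the free function `linearize` is a hypothetical name, not the project's `performLinearization()`.

```d
/* Stand-in dependency node; the real DNode carries more state */
class DNode
{
    string name;
    DNode[] dependencies;
    bool visited;

    this(string name)
    {
        this.name = name;
    }
}

/* Append each node only after all of its dependencies have been appended */
void linearize(DNode node, ref DNode[] output)
{
    /* Skip nodes that were already reached via another path */
    if (node.visited)
    {
        return;
    }
    node.visited = true;

    foreach (DNode dependency; node.dependencies)
    {
        linearize(dependency, output);
    }

    output ~= node;
}
```

If `a` depends on `b` and `c`, and `b` depends on `c`, the resulting array is `c`, `b`, `a`, which is an order the typechecker and code generator can walk front to back.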
||||||
|
@ -104,7 +104,7 @@ wraps the following methods and fields within it:
|
||||||
needed, therefore a second visitation state is required. See
|
needed, therefore a second visitation state is required. See
|
||||||
`tree()`.
|
`tree()`.
|
||||||
7. `DNode[] dependencies`
|
7. `DNode[] dependencies`
|
||||||
- The current `DNode`'s array of dependencies which themselves are
|
- The current `DNode`’s array of dependencies which themselves are
|
||||||
`DNode`s
|
`DNode`s
|
||||||
8. `performLinearization()`
|
8. `performLinearization()`
|
||||||
- Performs the linearization process on the dependency tree,
|
- Performs the linearization process on the dependency tree,
|
||||||
|
@ -133,7 +133,7 @@ wraps the following methods and fields within it:
|
||||||
|
|
||||||
The DNodeGenerator is used to generate dependency node objects
|
The DNodeGenerator is used to generate dependency node objects
|
||||||
(`DNode`s) based on the current state of the type checker. It will use
|
(`DNode`s) based on the current state of the type checker. It will use
|
||||||
the type checker's facilities to lookup the `Module` that is contained
|
the type checker’s facilities to lookup the `Module` that is contained
|
||||||
within and use this container-based entity to traverse the entire parse
|
within and use this container-based entity to traverse the entire parse
|
||||||
tree of the container and process each different possible type of
|
tree of the container and process each different possible type of
|
||||||
`Statement` found within, step-by-step generating a dependency node for
|
`Statement` found within, step-by-step generating a dependency node for
|
||||||
|
@ -156,7 +156,7 @@ TODO: Discuss the `DNodeGenerator`
|
||||||
|
|
||||||
### Pooling
|
### Pooling
|
||||||
|
|
||||||
Pooling is the technique of mapping a given parse node, let's say some
|
Pooling is the technique of mapping a given parse node, let’s say some
|
||||||
kind-of `Statement`, to the same `DNode` every time and if no mapping
|
kind-of `Statement`, to the same `DNode` every time and if no mapping
|
||||||
exists then creating a `DNode` for the respective parse node once off
|
exists then creating a `DNode` for the respective parse node once off
|
||||||
and then returning that same dependency node on successive requests.
|
and then returning that same dependency node on successive requests.
|
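One way to picture the pooling described above is a lookup table keyed by the parse node. The following is a minimal sketch under assumptions: `Statement`, `Variable` and `DNode` are empty stand-ins, and the wrapper class `Pool` with its associative array is a hypothetical design, not necessarily how the `DNodeGenerator` stores its mappings.

```d
/* Stand-ins for the parser and dependency node types */
class Statement {}
class Variable : Statement {}

class DNode
{
    Statement entity;

    this(Statement entity)
    {
        this.entity = entity;
    }
}

/* Hypothetical pool: one DNode per parse node, created on first request */
class Pool
{
    private DNode[Statement] nodes;

    DNode pool(Statement statement)
    {
        /* Return the existing mapping if there is one */
        if (auto existing = statement in nodes)
        {
            return *existing;
        }

        /* Otherwise create the DNode once and remember it */
        auto created = new DNode(statement);
        nodes[statement] = created;
        return created;
    }
}
```

Repeated calls with the same parse node return the very same `DNode` object, which is what lets visitation state set on one lookup be observed on the next.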
||||||
|
@ -176,7 +176,7 @@ status of said `DNode` during processing.
|
||||||
|
|
||||||
Below we have an example of what this process looks like. In this case
|
Below we have an example of what this process looks like. In this case
|
||||||
we would have done something akin to the following. Our scenario is that
|
we would have done something akin to the following. Our scenario is that
|
||||||
we have some sort of parse node, let's assume it was a `Variable` parse
|
we have some sort of parse node, let’s assume it was a `Variable` parse
|
||||||
node which would represent a variable declaration.
|
node which would represent a variable declaration.
|
||||||
|
|
||||||
![](/projects/tlang/graphs/pandocplot11037938885968638614.svg)
|
![](/projects/tlang/graphs/pandocplot11037938885968638614.svg)
|
||||||
|
@ -190,7 +190,7 @@ it and then confirmed that the `varDNode.entity` is equal to that of the
|
||||||
(`varPNode`) in order to show the returned dependency node will be the
|
(`varPNode`) in order to show the returned dependency node will be the
|
||||||
same as that referenced by `varDNode`.
|
same as that referenced by `varDNode`.
|
||||||
|
|
||||||
``` {.d .numberLines}
|
``` d
|
||||||
Variable varPNode = <... fetch node>;
|
Variable varPNode = <... fetch node>;
|
||||||
|
|
||||||
DNode varDNode = pool(varPNode);
|
DNode varDNode = pool(varPNode);
|
||||||
|
|
|
@ -23,7 +23,7 @@ instructions and contains some common methods used by all of them:
|
||||||
that if such context is needed during further code generation
|
that if such context is needed during further code generation
|
||||||
(or even emit) it can then be accessed
|
(or even emit) it can then be accessed
|
||||||
2. `Context getContext()`
|
2. `Context getContext()`
|
||||||
- Returns this instruction's associated context via its `Context`
|
- Returns this instruction’s associated context via its `Context`
|
||||||
object
|
object
|
||||||
3. `string produceToStrEnclose(string addInfo)`
|
3. `string produceToStrEnclose(string addInfo)`
|
||||||
- Returns a string containing the additional info provided through
|
- Returns a string containing the additional info provided through
|
||||||
|
@ -58,12 +58,12 @@ be found in
|
||||||
### Code generation
|
### Code generation
|
||||||
|
|
||||||
The method of code generation and type checking starts by being provided
|
The method of code generation and type checking starts by being provided
|
||||||
a so-called "action list" which is a linear array of dependency-nodes
|
a so-called “action list” which is a linear array of dependency-nodes
|
||||||
(or `DNode`s for code's sake), this list is then iterated through by a
|
(or `DNode`s for code’s sake), this list is then iterated through by a
|
||||||
for-loop, and each `DNode` is passed to a method called
|
for-loop, and each `DNode` is passed to a method called
|
||||||
`typeCheckThing(DNode)`:
|
`typeCheckThing(DNode)`:
|
||||||
|
|
||||||
``` {.d .numberLines}
|
``` d
|
||||||
foreach(DNode node; actionList)
|
foreach(DNode node; actionList)
|
||||||
{
|
{
|
||||||
/* Type-check/code-gen this node */
|
/* Type-check/code-gen this node */
|
||||||
|
@ -75,7 +75,7 @@ The handling of every different instruction type and its associated
|
||||||
typechecking requirements are handled in one huge if-statement within
|
typechecking requirements are handled in one huge if-statement within
|
||||||
the `typeCheckThing(DNode)` method. This method will analyse a given
|
the `typeCheckThing(DNode)` method. This method will analyse a given
|
||||||
dependency-node and perform the required typechecking by extracting the
|
dependency-node and perform the required typechecking by extracting the
|
||||||
`DNode`'s embedded parser-node, whilst doing so if a type check passes
|
`DNode`’s embedded parser-node, whilst doing so if a type check passes
|
||||||
then code generation takes place by generating the corresponding
|
then code generation takes place by generating the corresponding
|
||||||
instruction and adding this to some position in the code queue
|
instruction and adding this to some position in the code queue
|
||||||
(discussed later).
|
(discussed later).
|
||||||
|
@ -86,8 +86,8 @@ TODO: Add information on this
|
||||||
|
|
||||||
The code queue is used as a stack and a queue in order to facilitate
|
The code queue is used as a stack and a queue in order to facilitate
|
||||||
instruction generation. Certain instructions are produced once off and
|
instruction generation. Certain instructions are produced once off and
|
||||||
then added to the back of the queue (*"consuming"* instructions) whilst
|
then added to the back of the queue (*“consuming”* instructions) whilst
|
||||||
other are produced and pushed onto the top of the queue (*"producing"*
|
other are produced and pushed onto the top of the queue (*“producing”*
|
||||||
instructions) for consumption by other consuming instructions later.
|
instructions) for consumption by other consuming instructions later.
|
||||||
|
|
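The dual stack/queue behaviour of the code queue can be sketched as below. This is an illustration only: `Instruction` is a stand-in, and the names `CodeQueue`, `pushTop`, `pushBack` and `popTop` are hypothetical, chosen for the example rather than taken from the type checker.

```d
/* Stand-in instruction type */
class Instruction
{
    string name;

    this(string name)
    {
        this.name = name;
    }
}

/* Hypothetical code queue usable from both ends */
class CodeQueue
{
    private Instruction[] queue;

    /* "Producing" instructions are pushed onto the top (front) */
    void pushTop(Instruction instruction)
    {
        queue = instruction ~ queue;
    }

    /* "Consuming" instructions are added to the back */
    void pushBack(Instruction instruction)
    {
        queue ~= instruction;
    }

    /* A consuming instruction pops the operands it needs from the top */
    Instruction popTop()
    {
        auto top = queue[0];
        queue = queue[1 .. $];
        return top;
    }

    size_t length()
    {
        return queue.length;
    }
}
```

In this sketch two operand instructions would be pushed on top, then popped (most recent first) by the operator that consumes them, and the finished operator instruction would be appended at the back.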
||||||
An example of this would be the following T code which uses a binary
|
An example of this would be the following T code which uses a binary
|
||||||
|
|
|
@ -94,7 +94,7 @@ else if(cast(VariableDeclaration)instruction)
|
||||||
What we have here is some code which will extract the name of the
|
What we have here is some code which will extract the name of the
|
||||||
variable being declared via `varDecInstr.varName` which is then used to
|
variable being declared via `varDecInstr.varName` which is then used to
|
||||||
lookup the parser node of type `Variable`. The `Variable` object
|
lookup the parser node of type `Variable`. The `Variable` object
|
||||||
contains information such as the variable's type and also if a variable
|
contains information such as the variable’s type and also if a variable
|
||||||
assignment is attached to this declaration or not.
|
assignment is attached to this declaration or not.
|
||||||
|
|
||||||
TODO: Insert code regarding assignment checking
|
TODO: Insert code regarding assignment checking
|
||||||
|
@ -129,7 +129,7 @@ usage. In this case we want to translate the symbol of the entity named
|
||||||
`simple_variables_decls_ass`. Therefore we provide both pieces of
|
`simple_variables_decls_ass`. Therefore we provide both pieces of
|
||||||
information into the function `symbolLookup`:
|
information into the function `symbolLookup`:
|
||||||
|
|
||||||
``` {.d .numberLines}
|
``` d
|
||||||
// The relative container of this variable is the module
|
// The relative container of this variable is the module
|
||||||
Container container = tc.getModule();
|
Container container = tc.getModule();
|
||||||
|
|
||||||
|
|
|
@ -21,9 +21,9 @@ following methods:
|
||||||
1. `this(string sourceCode, File emitOutFile)`
|
1. `this(string sourceCode, File emitOutFile)`
|
||||||
- Constructs a new compiler object with the given source code and
|
- Constructs a new compiler object with the given source code and
|
||||||
the file to write the emitted code out to
|
the file to write the emitted code out to
|
||||||
- A newly initialized `File` struct that doesn't contain a valid
|
- A newly initialized `File` struct that doesn’t contain a valid
|
||||||
file handle can be passed in in the case whereby the emitter
|
file handle can be passed in in the case whereby the emitter
|
||||||
won't be used but an instance of the compiler is required
|
won’t be used but an instance of the compiler is required
|
||||||
2. `doLex()`
|
2. `doLex()`
|
||||||
- Performs the tokenization of the input source code,
|
- Performs the tokenization of the input source code,
|
||||||
`sourceCode`.
|
`sourceCode`.
|
||||||
|
@ -118,7 +118,7 @@ The types that can be stored and their respectives methods are:
|
||||||
Below is an example of the usage of the `ConfigEntry`s in the
|
Below is an example of the usage of the `ConfigEntry`s in the
|
||||||
`CompilerConfiguration` system, here we add a few entries:
|
`CompilerConfiguration` system, here we add a few entries:
|
||||||
|
|
||||||
``` {.d .numberLines}
|
``` d
|
||||||
/* Enable Behaviour-C fixes */
|
/* Enable Behaviour-C fixes */
|
||||||
config.addConfig(ConfigEntry("behavec:preinline_args", true));
|
config.addConfig(ConfigEntry("behavec:preinline_args", true));
|
||||||
|
|
||||||
|
@ -138,7 +138,7 @@ Later on we can retrieve these entries, the below is code from the
|
||||||
`DGen` class which emits the C code), here we check for any object files
|
`DGen` class which emits the C code), here we check for any object files
|
||||||
that should be linked in:
|
that should be linked in:
|
||||||
|
|
||||||
``` {.d .numberLines}
|
``` d
|
||||||
//NOTE: Change to system compiler (maybe, we need to choose a good C compiler)
|
//NOTE: Change to system compiler (maybe, we need to choose a good C compiler)
|
||||||
string[] compileArgs = ["clang", "-o", "tlang.out", file.name()];
|
string[] compileArgs = ["clang", "-o", "tlang.out", file.name()];
|
||||||
|
|
||||||
|
|
|
@ -3,15 +3,15 @@
|
||||||
Despite my eagerness to jump directly into the subject matter at hand I
|
Despite my eagerness to jump directly into the subject matter at hand I
|
||||||
believe there is something of even greater importance. Despite
|
believe there is something of even greater importance. Despite
|
||||||
there being a myriad of reasons I embarked upon this project something
|
there being a myriad of reasons I embarked upon this project something
|
||||||
more important than the stock-and-standard "I needed it to solve a
|
more important than the stock-and-standard “I needed it to solve a
|
||||||
problem of mine" reasoning comes to mind. There is indeed a better
|
problem of mine” reasoning comes to mind. There is indeed a better
|
||||||
reason for embarking on something than the mere technical *requirement
|
reason for embarking on something than the mere technical *requirement
|
||||||
thereof* - I did this **because I can**. This sentiment is something
|
thereof* - I did this **because I can**. This sentiment is something
|
||||||
that I really hold dear to my heart despite it being a seemingly obvious
|
that I really hold dear to my heart despite it being a seemingly obvious
|
||||||
one. Of course you can do what you want with your code - it's a free
|
one. Of course you can do what you want with your code - it’s a free
|
||||||
country. One would not be wrong to make such a statement but mention
|
country. One would not be wrong to make such a statement but mention
|
||||||
your ideas online and you get hounded down by others saying "that's
|
your ideas online and you get hounded down by others saying “that’s
|
||||||
dumb, just use X" or "your implementation will be inefficient". These
|
dumb, just use X” or “your implementation will be inefficient”. These
|
||||||
statements are not entirely untrue but they miss the point that this is
|
statements are not entirely untrue but they miss the point that this is
|
||||||
an exercise in scientific thinking and an artistic approach at it in
|
an exercise in scientific thinking and an artistic approach at it in
|
||||||
that as well.
|
that as well.
|
||||||
|
@ -23,5 +23,5 @@ expectations but luckily I do not require the external feedback of the
|
||||||
mass - just some close few friends who can appreciate my work and join
|
mass - just some close few friends who can appreciate my work and join
|
||||||
the endeavor with me.
|
the endeavor with me.
|
||||||
|
|
||||||
*Don't let people stop you, you only have one life - take it by the
|
*Don’t let people stop you, you only have one life - take it by the
|
||||||
horns and fly*
|
horns and fly*
|
||||||
|
|
|
@ -13,7 +13,7 @@ filter.
|
||||||
|
|
||||||
Tristan aims to be able to support all of these but with certain limits,
|
Tristan aims to be able to support all of these but with certain limits,
|
||||||
this is after all mainly an imperative language with those paradigms as
|
this is after all mainly an imperative language with those paradigms as
|
||||||
*"extra features"*. Avoiding feature creep in other systems-levels
|
*“extra features”*. Avoiding feature creep in other systems-levels
|
||||||
languages such as C++ is something I really want to stress about the
|
languages such as C++ is something I really want to stress about the
|
||||||
design of this language, I do not want a big and confusing mess that has
|
design of this language, I do not want a big and confusing mess that has
|
||||||
an extremely steep learning curve and way too many moving parts.
|
an extremely steep learning curve and way too many moving parts.
|
||||||
|
@ -117,7 +117,7 @@ in my viewpoint and hence we support such features as:
|
||||||
- Pointers
|
- Pointers
|
||||||
- The mere *support* of pointers allowing one to take a
|
- The mere *support* of pointers allowing one to take a
|
||||||
memory-level view of objects in memory rather than the normal
|
memory-level view of objects in memory rather than the normal
|
||||||
"safe access" means
|
“safe access” means
|
||||||
- Inline assembly
|
- Inline assembly
|
||||||
- Inserting of arbitrary assembler is allowed, providing the
|
- Inserting of arbitrary assembler is allowed, providing the
|
||||||
programmer with access to systems level registers,
|
programmer with access to systems level registers,
|
||||||
|
@ -125,7 +125,7 @@ in my viewpoint and hence we support such features as:
|
||||||
- Custom byte-packing
|
- Custom byte-packing
|
||||||
- Allowing the user to deviate from the normal struct packing
|
- Allowing the user to deviate from the normal struct packing
|
||||||
structure in favor of a tweaked packing technique
|
structure in favor of a tweaked packing technique
|
||||||
- Custom packing on a system that doesn't agree with the alignment
|
- Custom packing on a system that doesn’t agree with the alignment
|
||||||
of your data **is** allowed but the default is to pack
|
of your data **is** allowed but the default is to pack
|
||||||
according to the respective platform
|
according to the respective platform
|
||||||
|
|
||||||
|
|
|
@ -1,6 +1,6 @@
|
||||||
# Language
|
# Language
|
||||||
|
|
||||||
This page serves as an official manual for both users of TLang and
|
This page serves as an official manual for both users of TLang and
|
||||||
those who want to understand/develop the internals of the compiler and
|
those who want to understand/develop the internals of the compiler and
|
||||||
runtime (the language itself).
|
runtime (the language itself).
|
||||||
|
|
||||||
|
|
|
@ -2,7 +2,7 @@
|
||||||
|
|
||||||
The grammar for the language is still a work in progress and may take
|
The grammar for the language is still a work in progress and may take
|
||||||
some time to become concrete but every now and then I update this page
|
some time to become concrete but every now and then I update this page
|
||||||
to add more to it or fix any incongruencies with the parser's actual
|
to add more to it or fix any incongruencies with the parser’s actual
|
||||||
implementation. The grammar starts from the simplest building blocks and
|
implementation. The grammar starts from the simplest building blocks and
|
||||||
then progresses to the more complex (heavily composed) ones and these
|
then progresses to the more complex (heavily composed) ones and these
|
||||||
are placed into sections whereby they are related.
|
are placed into sections whereby they are related.
|
||||||
|
|
|
@ -12,22 +12,22 @@ attributes:
|
||||||
|
|
||||||
### Integral types
|
### Integral types
|
||||||
|
|
||||||
Type Width Intended interpretation
|
| Type | Width | Intended interpretation |
|
||||||
-------- ------- ---------------------------------
|
|--------|-------|---------------------------------|
|
||||||
byte 8 signed byte (two's complement)
|
| byte | 8 | signed byte (two’s complement) |
|
||||||
ubyte 8 unsigned byte
|
| ubyte | 8 | unsigned byte |
|
||||||
short 16 signed short (two's complement)
|
| short | 16 | signed short (two’s complement) |
|
||||||
ushort 16 unsigned short
|
| ushort | 16 | unsigned short |
|
||||||
int 32 signed int (two's complement)
|
| int | 32 | signed int (two’s complement) |
|
||||||
uint 32 unsigned int
|
| uint | 32 | unsigned int |
|
||||||
long 64 signed long (two's complement)
|
| long | 64 | signed long (two’s complement) |
|
||||||
ulong 64 unsigned long
|
| ulong | 64 | unsigned long |
|
||||||
|
|
||||||
#### Conversion rules
|
#### Conversion rules
|
||||||
|
|
||||||
1. TODO: Sign/zero extension
|
1. TODO: Sign/zero extension
|
||||||
2. Promotion?
|
2. Promotion?
|
||||||
3. Precedence in interpretation when the first two don't apply
|
3. Precedence in interpretation when the first two don’t apply
|
||||||
|
|
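Since the conversion rules are still marked TODO, the following is only a C sketch of what sign extension and zero extension mean for the widths in the table above. The `<stdint.h>` fixed-width types stand in for `byte`, `ubyte`, `int` and `uint`, and the helper names `sign_extend8` and `zero_extend8` are hypothetical. It shows C's behaviour, which is an assumption about what T may end up doing, not a statement of it.

```c
#include <assert.h>
#include <stdint.h>

/* The 8-bit stand-ins really are one byte wide. */
static_assert(sizeof(int8_t) == 1 && sizeof(uint8_t) == 1, "8-bit types");

/* Widen a signed 8-bit value to 32 bits: the sign bit is copied into
 * the upper 24 bits, so the numeric value is preserved. */
int32_t sign_extend8(int8_t v)
{
    return (int32_t)v;
}

/* Widen an unsigned 8-bit value to 32 bits: the upper 24 bits are
 * filled with zeros, so the numeric value is preserved. */
uint32_t zero_extend8(uint8_t v)
{
    return (uint32_t)v;
}
```

The bit pattern `0xFF` therefore widens to `-1` when treated as a signed byte and to `255` when treated as an unsigned byte.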
||||||
### Decimal
|
### Decimal
|
||||||
|
|
||||||
|
|
|
@ -32,7 +32,7 @@ else if(val == 3)
|
||||||
```
|
```
|
||||||
|
|
||||||
In the case the conditions are not true for any of the `if` or `else if`
|
In the case the conditions are not true for any of the `if` or `else if`
|
||||||
branches then "default" code can be run in the `else` branch as such:
|
branches then “default” code can be run in the `else` branch as such:
|
||||||
|
|
||||||
``` d
|
``` d
|
||||||
int val = 2;
|
int val = 2;
|
||||||
|
|
|
@ -3,7 +3,7 @@
|
||||||
Arrays allow us to have one name refer to multiple instances of the same
|
Arrays allow us to have one name refer to multiple instances of the same
|
||||||
type. Think of an array like having multiple variables of the same type
|
type. Think of an array like having multiple variables of the same type
|
||||||
tightly packed next to one-another but being able to refer to this group
|
tightly packed next to one-another but being able to refer to this group
|
||||||
by a *single name* and *each instance* by a number - an *"offset"* so to
|
by a *single name* and *each instance* by a number - an *“offset”* so to
|
||||||
speak.
|
speak.
|
||||||
|
|
||||||
### Stack arrays
|
### Stack arrays
|
||||||
|
@ -13,7 +13,7 @@ Stack arrays are what we refer to when we allocate an array
|
||||||
stack space of the current stack frame (the space for the current
|
stack space of the current stack frame (the space for the current
|
||||||
function call).
|
function call).
|
||||||
|
|
||||||
``` {.d numberLines="1" hl_lines="5"}
|
``` d
|
||||||
module simple_stack_arrays4;
|
module simple_stack_arrays4;
|
||||||
|
|
||||||
int function()
|
int function()
|
||||||
|
|
|
@ -2,14 +2,14 @@
|
||||||
|
|
||||||
Pointers are just like any other variable one would declare but what is
|
Pointers are just like any other variable one would declare but what is
|
||||||
important is that their values can be used in certain operations. A
|
important is that their values can be used in certain operations. A
|
||||||
pointer's value is an address of another variable and one can use a
|
pointer’s value is an address of another variable and one can use a
|
||||||
pointer to indirectly refer to such a variable and indirectly fetch or
|
pointer to indirectly refer to such a variable and indirectly fetch or
|
||||||
update its value.
|
update its value.
|
||||||
|
|
||||||
A pointer type is written in the form of `<type>*`, for example one may
|
A pointer type is written in the form of `<type>*`, for example one may
|
||||||
write `int*` which is read as "a pointer to an `int`".
|
write `int*` which is read as “a pointer to an `int`”.
|
||||||
|
|
||||||
TODO: All pointers are 64-bit values - the size of addresses on one's
|
TODO: All pointers are 64-bit values - the size of addresses on one’s
|
||||||
system.
|
system.
|
||||||
|
|
||||||
### Pointer syntax
|
### Pointer syntax
|
||||||
|
@ -17,22 +17,16 @@ system.
|
||||||
There are a few operators that can be used on pointers which are shown
|
There are a few operators that can be used on pointers which are shown
|
||||||
below, most specific of which are the `*` and `&` unary operators:
|
below, most specific of which are the `*` and `&` unary operators:
|
||||||
|
|
||||||
---------------------------------------------------------------------------------
|
| Operator | Description | Example |
|
||||||
Operator Description Example
|
|----------|---------------------------------------------------------------|----------------------------|
|
||||||
---------------------- ----------------------------- ----------------------------
|
| `&` | Gets the address of the identifier | `int* myVarPtr = &myVar` |
|
||||||
`&` Gets the address of the `int* myVarPtr = &myVar`
|
| `*` | Gets the value at the address held in the referred identifier | `int myVarVal = *myVarPtr` |
|
||||||
identifier
|
|
||||||
|
|
||||||
`*` Gets the value at the address `int myVarVal = *myVarPtr`
|
|
||||||
held in the referred
|
|
||||||
identifier
|
|
||||||
---------------------------------------------------------------------------------
|
|
||||||
|
|
||||||
Below we will declare a module-level global variable `j` of type `int`
|
Below we will declare a module-level global variable `j` of type `int`
|
||||||
and then use a function to indirectly update its value by the use of a
|
and then use a function to indirectly update its value by the use of a
|
||||||
pointer to this integer - in other words an `int*`:
|
pointer to this integer - in other words an `int*`:
|
||||||
|
|
||||||
``` {.d numberLines="1"}
|
``` d
|
||||||
module simple_pointer;
|
module simple_pointer;
|
||||||
|
|
||||||
int j;
|
int j;
|
||||||
|
@ -58,7 +52,7 @@ named `ptr`. This can hold the address of memory that points to an
|
||||||
function with the argument `&j` which means it is passing a pointer to
|
function with the argument `&j` which means it is passing a pointer to
|
||||||
the `j` variable in.
|
the `j` variable in.
|
||||||
|
|
||||||
``` {.d numberLines="1"}
|
``` d
|
||||||
int function(int* ptr)
|
int function(int* ptr)
|
||||||
{
|
{
|
||||||
*ptr = 2+2;
|
*ptr = 2+2;
|
||||||
|
@ -81,15 +75,15 @@ What `int function(int* ptr)` does is two things:
|
||||||
Some of the existing operators such as those used for arithmetic have
|
Some of the existing operators such as those used for arithmetic have
|
||||||
special usage when used on pointers:
|
special usage when used on pointers:
|
||||||
|
|
||||||
Operator Description Example
|
| Operator | Description | Example |
|
||||||
---------- ------------------------------------------------------------------ ---------
|
|----------|------------------------------------------------------------------|---------|
|
||||||
`+` Allows one to offset the pointer by a `+ offset*sizeof(ptrType)` `ptr+1`
|
| `+` | Allows one to offset the pointer by a `+ offset*sizeof(ptrType)` | `ptr+1` |
|
||||||
`-` Allows one to offset the pointer by a `- offset*sizeof(ptrType)` `ptr-1`
|
| `-` | Allows one to offset the pointer by a `- offset*sizeof(ptrType)` | `ptr-1` |
|
||||||
|
|
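The `offset*sizeof(ptrType)` scaling in the table can be checked with a small C sketch, on the assumption that T's pointer arithmetic scales the way C's does. The function names here are hypothetical; each one measures how many bytes a pointer moves when `offset` is added to it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes moved when `offset` is added to a pointer to 32-bit integers.
 * The addition is scaled by the pointee size (4 bytes here). */
ptrdiff_t int_ptr_step_bytes(ptrdiff_t offset)
{
    int32_t storage[8] = {0};
    int32_t *ptr = &storage[0];

    /* Stay inside the array, or one past its end. */
    assert(offset >= 0 && offset <= 8);

    int32_t *moved = ptr + offset;
    return (char *)moved - (char *)ptr;
}

/* Same measurement for a pointer to single bytes: no scaling is visible. */
ptrdiff_t byte_ptr_step_bytes(ptrdiff_t offset)
{
    uint8_t storage[8] = {0};
    uint8_t *ptr = &storage[0];

    assert(offset >= 0 && offset <= 8);

    return (char *)(ptr + offset) - (char *)ptr;
}
```

With these definitions `int_ptr_step_bytes(1)` is `4`, while `byte_ptr_step_bytes(1)` is `1`.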
||||||
Below we show how one can use pointer arithmetic and the casting of
|
Below we show how one can use pointer arithmetic and the casting of
|
||||||
pointers to work on sub-sections of data referenced to by a pointer:
|
pointers to work on sub-sections of data referenced to by a pointer:
|
||||||
|
|
||||||
``` {.d linenums="1" hl_lines="12-14"}
|
``` d
|
||||||
module simple_pointer_cast_le;
|
module simple_pointer_cast_le;
|
||||||
|
|
||||||
int j;
|
int j;
|
||||||
|
@ -123,7 +117,7 @@ access the 4 byte integer byte-by-byte, on x86 we would be starting with
|
||||||
the least-significant byte. What we have done here is updated said byte
|
the least-significant byte. What we have done here is updated said byte
|
||||||
to the value of `2+2`:
|
to the value of `2+2`:
|
||||||
|
|
||||||
``` {.d linenums="1"}
|
``` d
|
||||||
byte* bytePtr = cast(byte*)ptr;
|
byte* bytePtr = cast(byte*)ptr;
|
||||||
*bytePtr = 2+2;
|
*bytePtr = 2+2;
|
||||||
```
|
```
|
||||||
|
@ -133,7 +127,7 @@ which would increment the address by `1`, resultingly pointing to the
|
||||||
second least significant byte, we then use the dereference operator `*`
|
second least significant byte, we then use the dereference operator `*`
|
||||||
to set this byte to `1`:
|
to set this byte to `1`:
|
||||||
|
|
||||||
``` {.d linenums="1"}
|
``` d
|
||||||
*(bytePtr+1) = 1;
|
*(bytePtr+1) = 1;
|
||||||
```
|
```
|
||||||
|
|
||||||
|
@ -142,7 +136,7 @@ should (TODO: we can explain the memory here) become the result of
|
||||||
`256+4` (that is `260`). After this we then return that number with two
|
`256+4` (that is `260`). After this we then return that number with two
|
||||||
added to it:
|
added to it:
|
||||||
|
|
||||||
``` {.d linenums="1"}
|
``` d
|
||||||
return (*ptr)+1*2;
|
return (*ptr)+1*2;
|
||||||
```
|
```
|
||||||
|
|
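The byte-by-byte walkthrough above can be mirrored in C as a rough sketch. It assumes the integer starts at `0` (the diff does not show its initial value) and uses the hypothetical names `is_little_endian` and `update_bytewise`. It is an analogue for illustration, not the T implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when the least-significant byte is stored first (as on x86). */
int is_little_endian(void)
{
    uint32_t probe = 1;
    return *(unsigned char *)&probe == 1;
}

/* C analogue of the steps described above.
 * Assumption: *ptr is 0 on entry, so its upper two bytes stay zero. */
int32_t update_bytewise(int32_t *ptr)
{
    assert(ptr != NULL);

    unsigned char *bytePtr = (unsigned char *)ptr; /* view the int as bytes */
    *bytePtr = 2 + 2;        /* first byte in memory becomes 4 */
    *(bytePtr + 1) = 1;      /* second byte in memory becomes 1 */

    /* Multiplication binds tighter than addition, so this adds 2. */
    return (*ptr) + 1 * 2;
}
```

On a little-endian machine the integer ends up as `1*256 + 4 = 260` and the function returns `262`. On a big-endian machine the same two writes land in the most-significant bytes instead, so the result differs.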
||||||
|
@ -151,7 +145,7 @@ return (*ptr)+1*2;
|
||||||
One can even mix these if they want, for example we can do the
|
One can even mix these if they want, for example we can do the
|
||||||
following:
|
following:
|
||||||
|
|
||||||
``` {.d numberLines="1"}
|
``` d
|
||||||
module simple_stack_arrays3;
|
module simple_stack_arrays3;
|
||||||
|
|
||||||
void function()
|
void function()
|
||||||
|
|
|
@ -18,7 +18,7 @@ struct <name>
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
Note: Assignments to these variables within the struct's body are not
|
Note: Assignments to these variables within the struct’s body are not
|
||||||
allowed.
|
allowed.
|
||||||
|
|
||||||
#### Example
|
#### Example
|
||||||
|
|
|
@ -1,21 +1,21 @@
|
||||||
## External symbols
|
## External symbols
|
||||||
|
|
||||||
Sometimes it is required that a symbol be processed at a later stage
|
Sometimes it is required that a symbol be processed at a later stage
|
||||||
that is not within the T compiler's symbol processing stage but rather
|
that is not within the T compiler’s symbol processing stage but rather
|
||||||
at the linking stage. This is known as late-binding at link time where
|
at the linking stage. This is known as late-binding at link time where
|
||||||
such symbols are only resolved then which can help one link their T
|
such symbols are only resolved then which can help one link their T
|
||||||
program to some symbol in an ELF file (linked in with extra `gcc`
|
program to some symbol in an ELF file (linked in with extra `gcc`
|
||||||
arguments to `DGen`) or in a C standard library automatically included
|
arguments to `DGen`) or in a C standard library automatically included
|
||||||
in the DGen's emitted C code.
|
in the DGen’s emitted C code.
|
||||||
|
|
||||||
In order to use such a feature one can make use of the `extern` keyword
|
In order to use such a feature one can make use of the `extern` keyword
|
||||||
which lets us specify either a function's signature or a variable that should
|
which lets us specify either a function’s signature or a variable that should
|
||||||
be resolved during C compilation time **but** such that we can still use
|
be resolved during C compilation time **but** such that we can still use
|
||||||
it in our T program with typechecking and all.
|
it in our T program with typechecking and all.
|
||||||
|
|
||||||
One could take a C program such as the following:
|
One could take a C program such as the following:
|
||||||
|
|
||||||
``` {.c .numberLines}
|
``` c
|
||||||
#include<unistd.h>
|
#include<unistd.h>
|
||||||
|
|
||||||
int ctr = 2;
|
int ctr = 2;
|
||||||
|
@ -29,14 +29,14 @@ unsigned int doWrite(unsigned int fd, unsigned char* buffer, unsigned int count)
|
||||||
and then compile it to an object file named `file_io.o` with the
|
and then compile it to an object file named `file_io.o` with the
|
||||||
following command:
|
following command:
|
||||||
|
|
||||||
``` {.bash .numberLines}
|
``` bash
|
||||||
gcc source/tlang/testing/file_io.c -c -o file_io.o
|
gcc source/tlang/testing/file_io.c -c -o file_io.o
|
||||||
```
|
```
|
||||||
|
|
||||||
And then link this with your T program using the command (take note of
|
And then link this with your T program using the command (take note of
|
||||||
the flag `-ll file_io.o` which specifies the object to link in):
|
the flag `-ll file_io.o` which specifies the object to link in):
|
||||||
|
|
||||||
``` {.bash .numberLines}
|
``` bash
|
||||||
./tlang compile source/tlang/testing/simple_extern.t \
|
./tlang compile source/tlang/testing/simple_extern.t \
|
||||||
-sm HASHMAPPER \
|
-sm HASHMAPPER \
|
||||||
-et true \
|
-et true \
|
||||||
|
@ -47,16 +47,16 @@ the flag `-ll file_io.o` which specifies the object to link in):
|
||||||
### External functions
|
### External functions
|
||||||
|
|
||||||
To declare an external function use the `extern efunc ...` clause
|
To declare an external function use the `extern efunc ...` clause
|
||||||
followed by a function's signature. Below we have an example of the
|
followed by a function’s signature. Below we have an example of the
|
||||||
`doWrite` function from our C program (seen earlier) being specified:
|
`doWrite` function from our C program (seen earlier) being specified:
|
||||||
|
|
||||||
``` {.d .numberLines}
|
``` d
|
||||||
extern efunc uint doWrite(uint fd, ubyte* buffer, uint count);
|
extern efunc uint doWrite(uint fd, ubyte* buffer, uint count);
|
||||||
```
|
```
|
||||||
|
|
||||||
We can now go ahead and use this function as a call such as with:
|
We can now go ahead and use this function as a call such as with:
|
||||||
|
|
||||||
``` {.d .numberLines}
|
``` d
|
||||||
extern efunc uint write(uint fd, ubyte* buffer, uint count);
|
extern efunc uint write(uint fd, ubyte* buffer, uint count);
|
||||||
|
|
||||||
void test()
|
void test()
|
||||||
|
@ -75,13 +75,13 @@ followed by the variable declaration (type and name). Below we have an
|
||||||
example of the `ctr` variable from our C program seen earlier being
|
example of the `ctr` variable from our C program seen earlier being
|
||||||
specified:
|
specified:
|
||||||
|
|
||||||
``` {.d .numberLines}
|
``` d
|
||||||
extern evar int ctr;
|
extern evar int ctr;
|
||||||
```
|
```
|
||||||
|
|
||||||
We have the same program as before where we then refer to it with:
|
We have the same program as before where we then refer to it with:
|
||||||
|
|
||||||
``` {.d .numberLines}
|
``` d
|
||||||
...
|
...
|
||||||
extern evar int ctr;
|
extern evar int ctr;
|
||||||
|
|
||||||
|
|
|
@ -11,7 +11,7 @@ Primitive data type are the building blocks of which other more complex types ar
|
||||||
### Integral types
|
### Integral types
|
||||||
|
|
||||||
| Type | Width | Intended interpretation |
|
| Type | Width | Intended interpretation |
|
||||||
|-|-|-|
|
|------|-------|-------------------------|
|
||||||
| byte | 8 | signed byte (two's complement) |
|
| byte | 8 | signed byte (two's complement) |
|
||||||
| ubyte | 8 | unsigned byte |
|
| ubyte | 8 | unsigned byte |
|
||||||
| short | 16| signed short (two's complement) |
|
| short | 16| signed short (two's complement) |
|
||||||
|
|
|
@ -37,7 +37,7 @@ function generateMarkdown()
|
||||||
|
|
||||||
outputFile="docs/$(echo $doc | cut -b 9-)"
|
outputFile="docs/$(echo $doc | cut -b 9-)"
|
||||||
|
|
||||||
pandoc -F pandoc-plot -M plot-configuration=pandoc-plot.conf -f markdown -t markdown "$doc" -o "$outputFile"
|
pandoc -F pandoc-plot -M plot-configuration=pandoc-plot.conf -f markdown -t gfm "$doc" -o "$outputFile"
|
||||||
|
|
||||||
echo "$(cat $outputFile | sed -e s/docs\\//\\/projects\\/tlang\\//)" > "$outputFile"
|
echo "$(cat $outputFile | sed -e s/docs\\//\\/projects\\/tlang\\//)" > "$outputFile"
|
||||||
cat "$outputFile"
|
cat "$outputFile"
|
||||||
|
|