Commit Graph

1151 Commits

Author SHA1 Message Date
Tristan B. Velloza Kildaire 8bf076b999 TypeChecker (unittests)
- Fixed up calls to `getBuiltInType(...)` which now require some `Container` to pivot on. We now pass in the `Program`
- Fixed up unittests which used `getModule()`, for such uni-modular tests we just use `program.getModules()[0]` now
2024-02-27 09:19:40 +02:00
Tristan B. Velloza Kildaire b352b7d3f6 MetaProcessor
- When getting the type, root everything on the `Program` rather than `tc.getModule()`. We therefore now use `tc.getProgram()` in its place. This updates the `sizeOf_Literalize(string typeName)` method.

DGen

- `emit()` has been updated to enumerate all `Module[]` of the given `Program` and then call the respective `emit(...)` methods with a (`File`, `Module`) pair, which contains the file that the emitted code should be written to and the `Module` being emitted.
- At the end of `emit()` we then try to find the `Module` which contains a function named `main` and set that as the entrypoint. If no such `main` function can be found, we check whether the `dgen:emit_entrypoint_test` configuration option is `true`; if so, we try to emit a testing entrypoint and assume that the test uses only one `Module` (this is something to be cleaned up later, as all tests should have a `main` method).
- Implemented `findEntrypoint(ref Module mainModule, ref Function mainFunc)` which does as the name implies and finds the containing `Module` which has a `Function` named `"main"` and if so sets both ref arguments and returns `true`, otherwise they are left unset and `false` is returned
- `emitHeaderComment(...)`, `emitStaticAllocations(...)`, `emitFunctionPrototypes(...)`, `emitFunctionDefinitions(...)`, `emitFunctionPrototype(...)`, `emitFunctionDefinition(...)`, `emitCodeQueue(...)`, `emitStdint(...)`, `emitEntrypoint(...)`, `emitTestingEntrypoint(...)` now take in `File modOut` and `Module mod`
- Note the above have not YET been updated to select correct code queues, static init. queues and `emitEntrypoint(...)` has not yet been implemented
- Updated `emitTestingEntrypoint(...)` to check using the incoming `Module`'s name
- `finalize()` will now compile all source files, then link all generated object files together, optionally cleaning up all but the final generated executable at the end
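
The entrypoint search described for `findEntrypoint(...)` above can be sketched roughly as follows. This is a Python model, not the compiler's D code: the `Module` and `Function` classes are hypothetical stand-ins, and the two `ref` out-parameters are modelled as extra return values.

```python
from dataclasses import dataclass, field


@dataclass
class Function:
    # Hypothetical stand-in for the compiler's `Function` type
    name: str


@dataclass
class Module:
    # Hypothetical stand-in for the compiler's `Module` type
    name: str
    functions: list = field(default_factory=list)


def find_entrypoint(modules):
    """Return (True, module, function) for the first module that holds a
    function named "main"; otherwise return (False, None, None)."""
    for mod in modules:
        for func in mod.functions:
            if func.name == "main":
                return True, mod, func
    # Nothing found: the "out" values stay unset
    return False, None, None
```

A caller would fall back to the testing entrypoint only when the first element of the result is `False`.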

CollidingNameException

- Fixed up name generation (we now anchor on the `Program`)
- Fixed up how we detect if a module's name is in use. We now will check and ensure it doesn't match ANY of the modules contained within the `Program`

DNode

- Implemented `forceName(string)` to set the `name` field forcefully
- The `generate()` method now calls `generalPass(Module, Context)` on each of the `Program`'s `Module`s. We then get an array of `DNode[]` from them. After this we then create a root `DNode` to represent the program and then we make it `needs(DNode)` each `Module` `DNode` created previously.

DNodeGenerator

- When processing `VariableExpression` in `expressionPass(Expression exp, Context context)`, if we get a `Function` being referred to in the expression, then we now set the `Context`'s `Container` still to the `Module` containing the function's definition, _however_ we use a method from the resolver to ensure it is the correct module (because we are now using a multi-module setup).
- We have always set the **root** `Module` when entering `generalPass(Container c, Context context)` and processing a `Function` but, as with the last change mentioned above, we need to find that containing `Module` correctly. Therefore we use a method from the `Resolver` to do that.
2024-02-27 09:01:06 +02:00
Tristan B. Velloza Kildaire 4479df0501 HashMapper
- Added support for `ScopeType.LOCAL` in a similar fashion to how it is done in `LebanonMapper`
2024-02-26 09:41:39 +02:00
Tristan B. Velloza Kildaire 0be3376fc2 LebanonMapper
- Join with periods and let step afterwards do the replacement
2024-02-26 09:39:41 +02:00
Tristan B. Velloza Kildaire b2cf6a53a1 LebanonMapper
- If called with `ScopeType.LOCAL` then the name will be generated in an absolute sense, however the module name will be stripped from the path
- For example `simple_module.a.b` would become `a.b` and then be mapped to `a_b`
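
A minimal sketch of the local-scope mapping described above, written in Python rather than the project's D. The function name `map_name` and its boolean `local` flag are assumptions for illustration; the real `map(Entity item, ScopeType type)` works on entities, not strings.

```python
def map_name(absolute_path: str, local: bool) -> str:
    """Map a dotted absolute path to a symbol name.

    For the local case the leading module name is stripped first.
    """
    parts = absolute_path.split(".")
    if local and len(parts) > 1:
        # Drop the module name, e.g. simple_module.a.b -> a.b
        parts = parts[1:]
    # Join with periods first, then let a later step do the replacement
    joined = ".".join(parts)
    return joined.replace(".", "_")
```

With this sketch, `map_name("simple_module.a.b", local=True)` yields `a_b`, matching the example in the commit message.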
2024-02-26 09:36:03 +02:00
Tristan B. Velloza Kildaire 0f51f73e36 LebanonMapper
- Fixed wrong variable name in `map(Entity item, ScopeType type)`

HashMapper

- Implemented `map(Entity item, ScopeType type)` for `ScopeType.GLOBAL`
- `ScopeType.LOCAL` is yet to be implemented
2024-02-26 08:32:39 +02:00
Tristan B. Velloza Kildaire 94dc6f9c71 LebanonMapper
- Now respects the `type` field for `ScopeType.GLOBAL`
- `ScopeType.LOCAL` is still to be implemented
2024-02-26 08:28:17 +02:00
Tristan B. Velloza Kildaire 4625e0516e LebanonMapper
- Implemented `map(Entity item, ScopeType type)`
2024-02-26 08:25:34 +02:00
Tristan B. Velloza Kildaire ac54002cac Implementations
- Added some stub types `LebanonMapper` and `HashMapper`
2024-02-26 08:21:48 +02:00
Tristan B. Velloza Kildaire 6ce79a8721 SymbolMapperV2
- Added missing import for `Entity` type
2024-02-25 21:54:34 +02:00
Tristan B. Velloza Kildaire 229bbf92b3 tlang.compiler.codegen.mapper.api
- Added new module

SymbolMapperV2

- Added new interface

ScopeType

- Added new enum type
2024-02-25 21:53:16 +02:00
Tristan B. Velloza Kildaire 8ecd48207c DGen
- Added some future code for `emit()` which will enumerate all `Module`(s) available in the current `Program` and then do an emit step per-each of them and have those go to the correct `File`(s)
- Added a comment for `finalize()` which indicates where we should link all generated object files for all the modules. After this we should then link these against each other and generate an executable.
2024-02-25 20:01:13 +02:00
Tristan B. Velloza Kildaire e72e24f8f8 Makefile
- Added a separate stage to _create_ the `main.o` and then only later link with the intent of making an executable.
- This is BETTER because it means we could treat everything in the same way (never mind whether it is a `main` or a library)
2024-02-25 19:24:38 +02:00
Tristan B. Velloza Kildaire 513fb920ce Makefile
- Added `-c` to ONLY compile but do NOT attempt to link and make an executable. If we try this it looks for a `main` symbol to satisfy the `_start` for linux-ld
- Updated the build instructions for being able to statically link against the library
2024-02-25 19:17:15 +02:00
Tristan B. Velloza Kildaire 9c72115dc1 Scratch
- Playing with a way to build library objects without immediately linking
- I am then also seeing how to link it to an application in which `_start` can find its `main` symbol
2024-02-25 19:08:19 +02:00
Tristan B. Velloza Kildaire 86c7e49cc2 Parser
- Fixed missing ending curly brace
2023-12-27 08:31:42 +02:00
Tristan B. Velloza Kildaire c3463cd721 Merge branch 'vardec_varass_dependency' into feature/multi_module 2023-12-27 08:31:16 +02:00
Tristan B. Velloza Kildaire ee537f2b25 Feature: Lexer improvements (#18)
* Created branch

* Created branch (removed placeholder)

* Lexer

- Added handling for tabs wherever spaces would be allowed.
- Added unit tests for the new tab processing
- Resolved issues where whitespace was allowed before and/or after the
  '.' character
- Renamed isSpliter to isSplitter
- Some Code styling

* Check

- Added two new `SymbolType`s for comments
- `SINGLE_LINE_COMMENT` (for `//`) and `MULTI_LINE_COMMENT` (for `/*`)

* Parser

- Added a bogus `parseComment()` which returns nothing, prints out the comment, consumes the `Token` and returns
- `parseStatement()` now supports `parseComment()` whenever a single-line or multi-line comment is detected

* Parser

- Fixed token consumption code in `parseComment()`

* BasicLexer

- Fixed style mishaps

* ArrLexer

- Implemented dummy lexer

* Parser

- Added some comment related functions (for testing)
- Added `pushComment(Token)`, `hasCommentsOnStack()` and `getCommentCount()`
- `parseComment()` now pushes the current comment-based `Token` onto the comment-stack
- Added a comment stack

Unit tests

- Added testing for various examples of comment-type `Token`s

* Lexer
- Replaced the characters with Enumerated type
- Working Comment lexing, single and multiline
- Working escape codes for strings
- Working Signage and Size Encoder indicators

- Removed floatMode in favour of float lexing function
- Added doComment for the comment lexing instead of comment mode
- Added doEscapeCode for escape codes in string

Testing
- Added unit tests for comments
- Added unit tests for numerical encoders

TODO
- ADD unit tests for all valid escape sequences and some invalid

* Lexer
- Removed stringMode in favour of doString

TODO
- Decide on multiline strings, currently not supported

* Parser

- Test comments which appear not at the `Module` level but rather at the statement level

* Parser

- Changed to using `BasicLexer` for comment tests now seeing as it is now implemented therein

* Basic

- Added `roll()` and `shourt()` to mark unittests

* Basic

- `shout()` now adds line number to print out

* Lexer rewrite
- flush
- underscores in numbers
- escape codes
- character escapes
- bug fixes

* Basic

- Fixed `shourt(int)`

* Basic

- Removed crashing (on purpose) unittest

* Resolved bug where isSplitter evaluated to true every time

* Basic

- Removed `goggaWithLineInfo(...)`

* Basic

- Updated `shout()` to remove rolling
- Removed `roll()`
- Added function and module name as well

* Basic

- Documented `shout()`

* Lexer Done and 100% coverage

* LexerSymbols

- Documented
- Formatted

* Lexer (module)

- Added `LS` alias
- Added `isOperator(char c)`, `isSplitter(char c)`, `isNumericalEncoder_Size(char character)`, `isNumericalEncoder_Signage(char character)` and `isValidEscape_String(char character)`

* BasicLexer

- Documented the constructor, `hasToken()`, `performLex()`, `doIdentOrPath()`, `doChar()`, `doString()`, `doComment()`, `doEscapeCode()`, `doNumber()`, `doEncoder()`, `doFloat()`, `flush()`, `buildAdvance()`, `improvedAdvance()`, `advanceLine()`, `isOperator(char)`, `isSplitter(char)`, `isValidDotPrecede(char character)`, `isNumericalEncoder(char character)`, `isNumericalEncoder_Size(char character)`, `isNumericalEncoder_Signage(char character)` and `isValidEscape_String(char character)`
- Tried reformatting some of `doChar()`, `doString()`, `flush()`, `buildAdvance()`, `improvedAdvance()`, `advanceLine()`, `isOperator(char)`, `isSplitter(char)`

* Basic

- Removed `LS` alias

Lexer

- Made `LS` alias public

* BasicLexer

- Removed methods `isValidEscape_String(char character)`, `isNumericalEncoder_Signage(char character)`, `isNumericalEncoder_Size(char character)`, `isNumericalEncoder(char character)`, `isSplitter(char c)` and `isOperator(char c)`

Lexer

- Added method `isNumericalEncoder(char character)`

* BasicLexer

- Documented `isValidDotPrecede(char character)`

* Lexer

- Added method `isValidDotPrecede(char character)`

* BasicLexer

- Removed method `isValidDotPrecede(char character)`

* BasicLexer (unittests)

- Documented the unittests
- Fixed formatting

* BasicLexer

- Typo fixes

* BasicLexer (unittests)

- Only compile-in `shourt(...)` when in unittest build mode

* BasicLexer

- Documented `isForward()` and `isBackward()`
- Made `isBackward()` private

---------

Co-authored-by: GMeyer <21568499@sun.ac.za>
Co-authored-by: GMeyer <gustav.meyer1999@gmail.com>
2023-12-27 08:18:17 +02:00
Tristan B. Velloza Kildaire e4d10953b7 Merge branch 'vardec_varass_dependency' into feature/multi_module 2023-12-10 12:12:08 +02:00
Tristan B. Velloza Kildaire 4c3a72b026 Pipelines
- Run `apt update` before doing an `apt install`
2023-12-08 18:27:42 +02:00
Tristan B. Velloza Kildaire b0ac107e2d TypeChecker
- `getType(Container c, string typeString)` always checks for built in types first and then if one
of such is found it then immediately returns, else does a search. We have now updated that inner
call to `getBuiltInType(TypeChecker, string)` to use the new API which is `getBuiltInType(TypeChecker, Container, string)`
- When processing a `ReturnStmt` and we need to check the return type of the `Function` it is contained within
and the type of its returned `Expression`, we need not call `getBuiltInType(TypeChecker, Container, string)`, let's
just call `getType(Container, string)` as it does that call for us in any case
- With the above I also anchor it to search for the type based on the `funcContainer` `Container` as that is at the same level as the `ReturnStmt` itself
2023-12-06 10:49:32 +02:00
Tristan B. Velloza Kildaire 137ca621d5 Builtins
- Updated the `getBuiltInType(TypeChecker, string)` method to take in a `Container` as the second
argument
2023-12-06 10:48:44 +02:00
Tristan B. Velloza Kildaire aafc4341b7 TypeChecker
- Implemented `getProgram()`
2023-12-05 22:12:00 +02:00
Tristan B. Velloza Kildaire 554d21de1d TypeChecker
- Renamed `getModule()` to `deprecated_getModule()` so that we generate compilation errors now
2023-12-05 22:07:56 +02:00
Tristan B. Velloza Kildaire 37af10aadc Parser
- Updates for `parentToContainer(Container, Statement[])`:
	* We now have a default argument called `allowRecursivePainting` which is set to `true`
by default
	* What this does is handle special cases of `Statement`s which are not placed within
a `Container` and therefore can not be reached in a generic way (i.e. `getStatements()` with `Container`
types)
	* These special cases are:
		1. `Variable`
			* Declarations may have a `VariableAssignment` attached to them which has an `Expression`
		2. `BinaryOperatorExpression`
			* Left and right-hand side operands have `Expression`'s attached to them
		3. `VariableAssignmentStdAlone`
			* The assignment `Expression`
		4. `ReturnStmt`
			* It may have an `Expression` in it
		5. `FunctionCall`
			* We need to process each of its actual arguments, which comes in the form of a `Expression[]`
		6. `DiscardStatement`
			* This is like a `ReturnStmt` but it always has an `Expression`, therefore we must process it
		7. `IfStatement`
			* Contains an array of `Branch` (i.e. a `Branch[]`)
		8. `WhileLoop`
			* Contains a singular `Branch`
		9. `ForLoop`
			* Contains a singular `Branch`
		10. `Branch`
			* Contains a condition in the form of an `Expression` and a set of body statements
			in the form of a `Statement[]` array
	* What we then do for 1-6 (so-called normal cases):
		* These we will recurse upon and parent them to the same `Container` that came in in
		the original call which was what got us to enter into the cases of 1 to 6.
	* What we then do for 7 onwards (so-called "maintain" cases):
		* Notice that each type mentioned in these are all a kind-of `Container`
		* We therefore call `parentToContainer(Container, Statement)` not with the incoming
		`Container` that came into the call that matched one of these cases but
		rather to that of this case's container itself
		* We do this because we want the scoping to be consistent and that is done by keeping
		the ancestry tree as it is expected when multiple calls are done for parenting, for example,
		the body `Statement[]` items to their respective `Branch` and then those `Branch[]` to their
		`IfStatement`.
	* "I said there would be special cases ;)"
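
The difference between the "normal" and "maintain" cases above can be sketched as follows. This is a simplified Python model covering only three of the ten statement kinds; the classes are hypothetical stand-ins for the D types, and parenting a `Branch`'s condition to the `Branch` itself is an assumption of this sketch.

```python
class Statement:
    def __init__(self):
        self.parent = None  # set by parent_to_container()


class Expression(Statement):
    pass


class ReturnStmt(Statement):
    def __init__(self, expr=None):
        super().__init__()
        self.expr = expr  # may be absent


class Branch(Statement):
    def __init__(self, condition, body):
        super().__init__()
        self.condition = condition
        self.body = body


class IfStatement(Statement):
    def __init__(self, branches):
        super().__init__()
        self.branches = branches


def parent_to_container(container, statements, allow_recursive_painting=True):
    for stmt in statements:
        stmt.parent = container
        if not allow_recursive_painting:
            continue
        if isinstance(stmt, ReturnStmt):
            # "Normal" case: the nested expression keeps the incoming container
            if stmt.expr is not None:
                parent_to_container(container, [stmt.expr])
        elif isinstance(stmt, IfStatement):
            # "Maintain" case: branches are parented to the IfStatement itself
            parent_to_container(stmt, stmt.branches)
        elif isinstance(stmt, Branch):
            # "Maintain" case: condition and body are parented to the Branch
            parent_to_container(stmt, [stmt.condition] + stmt.body)
```

Parenting an `IfStatement` this way leaves each body statement pointing at its `Branch` and each `Branch` pointing at its `IfStatement`, which is the ancestry the commit message describes.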

Parser (unittests)

- Fixed unittests such that they run by using the `Compiler` object now
2023-12-05 21:55:39 +02:00
Tristan B. Velloza Kildaire b231d0b77e Test cases
- Added `alone.t`
2023-12-05 19:35:54 +02:00
Tristan B. Velloza Kildaire 8edc53b884 TypeChecker
- Calling `getModule()` will now return the first `Module` in the `Program`'s list of modules
- Marked `getModule()` for removal
- The constructor now constructs a `Resolver` instance by also passing in the `TypeChecker` itself
- We have updated `dependencyCheck()` to call `checkDefinitionTypes()` on each `Module` of the current `Program`
- We have updated `dependencyCheck()` to call `checkClassInherit()` on each `Module` of the current `Program`
- When handling a `VariableExpression` we look up the `Variable` it refers to, then we must generate a name for it
rooted by its module, we now do this by anchoring at the `this.program` rather than the `modulle`
(what the deprecated `getModule()` would return), this would use the updated `Resolver` code to generate
the name we expect as with the implementation that came before
- When handling a `FunctionCall` we lookup the `Function` it refers to, then we must ask the resolver
to resolve the `Function` by its name using `resolveBest(Container, string)` and this should be best-effort
starting at the `Module` of which the `FunctionCall` is present in. If it happens to be that the `Module`
of the `Function` referred to is outside the provided starting anchor then it will be handled for us, that
is under the assumption it is `otherModule.myFunc()` being called and present in `otherModule`, compared to
the first case of having `myFunc` being declared in `myModule` of which so is the `myFunc()` `FunctionCall`
statement
- When generating the name of a variable being assigned to for `StaticVariableDeclaration`, we now
anchor at `this.program` instead of `modulle` (the deprecated value returned by `getModule()`)
- Updated `beginCheck()` to now call the `MetaProcessor` with the instance of `this.program` rather than `modulle` (deprecated - as stated before)
- Updated `beginCheck()` such that it now calls `processPseudoEntities()` on each `Module` that is contained
within the current `Program`
- Updated `beginCheck()` such that it now calls `checkContainerCollision()` on each `Module` that is contained
within the current `Program`
- Updated `checkContainerCollision(Container c)` to do the following:
	* On each `Entity` we loop over we ensure that it never has a name which matches that of any `Module` that
is stored within our `Program` (exclusivity principle of module names)
	* Updated calls to `generateName(Container, Entity)` to use `this.program` as the anchor container (instead
of the deprecated `modulle`)
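
The module-name exclusivity check described for `checkContainerCollision(Container c)` could look roughly like this. It is a Python sketch using plain dictionaries as containers; the `CollidingNameError` type, the dictionary layout, and the recursion into nested containers are assumptions, not the compiler's actual structures.

```python
class CollidingNameError(Exception):
    """Raised when an entity reuses the name of a module (hypothetical type)."""


def check_container_collision(container, module_names):
    """Walk `container` and reject any entity named like a module."""
    for entity in container.get("entities", []):
        if entity["name"] in module_names:
            raise CollidingNameError(
                f"'{entity['name']}' clashes with a module of the same name"
            )
        # Entities that are themselves containers are checked as well
        check_container_collision(entity, module_names)
```

The check would be run once per `Module`, always against the names of every module in the `Program`.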

TypeChecker (unittests)

- Fixed a unittest to not spawn a `TypeChecker` using a `Module` but rather a dummy `Compiler` instance
- Fixed a unittest to not spawn a `TypeChecker` using a `Module` but rather a dummy `Compiler` instance
- Fixed a unittest to not spawn a `TypeChecker` using a `Module` but rather a dummy `Compiler` instance
AND to create a dummy `Program` of which we then add a dummy `Module` to
- Fixed remaining unittests to now use a `Compiler` instance
2023-12-05 19:33:27 +02:00
Tristan B. Velloza Kildaire ec9b721e69 Compiler
- Added a `setProgram(Program program)` method
- For now when we finish the call to `parse()` in the `Parser` in `doParse()` we will also add ourselves
to the list of modules. Simply done because as of right now we only add ourselves if we are visited via an
import which implies that you would need a cyclic import to be visited - which isn't a constraint we obviously
would like to impose upon code writers using TLang. For now we manually add it at the end.
- At the end of `doParse()` call `debugDumpOrds()` on the `Program`
- Added a `getTypeChecker()` method

Compiler (unittests)

- Prints out more information during the exception catch
2023-12-05 15:46:05 +02:00
Tristan B. Velloza Kildaire 8b93e54dfd Program
- Renamed `debugDump()` to `debugDumpOrds()`
- Added `debugDump()` which dumps `modulesImported`
2023-12-05 14:53:10 +02:00
Tristan B. Velloza Kildaire 5432c36cb8 Resolver
- Don't shadow global `Program` variable
- Updated `resolveBest(Container c, string name)`
2023-12-05 14:46:49 +02:00
Tristan B. Velloza Kildaire 216d6d0d39 Resolver
- Added an assertion that the result from `split(..., ...)` may not be an empty array
- This is an update to the `resolveBest(Container c, string name)` method
2023-12-05 14:43:15 +02:00
Tristan B. Velloza Kildaire ff765e4b60 Resolver
- Added another unittest for my sanity
2023-12-05 14:42:27 +02:00
Tristan B. Velloza Kildaire 4cc974c978 Resolver
- Fixed the path array usage in `resolveBest(Container c, string name)`
2023-12-05 14:41:49 +02:00
Tristan B. Velloza Kildaire 08e7f8d98c Resolver
- Updated `resolveBest(Container c, string name)` to now include a check for a condition
whereby we pass in a `Container` which is not an `Entity` as well (i.e. it has no name).
The only case whereby this is the case (in our current code) is that of a `Program`
which is purposefully **not** an `Entity` (as I wanted it to be nameless) but it _is_
a `Container`. Therefore we don't want to continue to hit the code below which
does a cast to `Entity` in the form of `containerEntity`, as that would not work
(and the associated assertion would fail).
- What I wanted to accomplish with the two checks is to check if we are given a name
which then directly matches that of a `Module` that is _in_ the provided `Program`,
and in such a case return said `Module`
- Else, we have a case whereby we have `moduleName.<iets>` whereby we then want to
also similarly scan the `Program` for all its modules and then match to the `moduleName`,
as shown in the above example, we then look this `Module` up **but** we don't return yet.
No, what we do is now do a search for the incoming `name` but we _anchor it on_ the found
`Module` to then search therein
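
The two `Program` cases described above (a bare module name, and `moduleName.<iets>`) can be sketched in Python as follows. The `Entity`, `Module` and `Program` classes and the `members` dictionary are hypothetical simplifications; the real resolver searches statements rather than a name table.

```python
class Entity:
    def __init__(self, name, members=None):
        self.name = name
        self.members = members or {}  # child name -> Entity (assumed layout)


class Module(Entity):
    pass


class Program:
    # Deliberately nameless: a container, but not an Entity
    def __init__(self, modules):
        self.modules = modules


def resolve_within(module, path):
    """Resolve path[1:] by descending from the anchoring module."""
    current = module
    for part in path[1:]:
        current = current.members.get(part)
        if current is None:
            return None
    return current


def resolve_best(program, name):
    path = name.split(".")
    assert path, "split() may not yield an empty array"
    for mod in program.modules:
        if mod.name == path[0]:
            if len(path) == 1:
                return mod  # the name directly matches a module
            # Otherwise anchor the search on the module that was found
            return resolve_within(mod, path)
    return None
```

Under these assumptions, resolving `otherModule.myFunc` first finds `otherModule` in the program and only then looks for `myFunc` inside it.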
2023-12-05 14:41:04 +02:00
Tristan B. Velloza Kildaire 1c7cd75cb2 Resolution
- Added some unittests just for testing my sanity when it comes to using `split(string, char)`
2023-12-05 14:35:18 +02:00
Tristan B. Velloza Kildaire 9a5023a8b3 Resolution
- Updated the method `generateName(Container relativeTo, Entity entity)` to have a special case
which is when we pass in a `Container` of which is a `Program`. Because the `Entity` provided
may have been declared in some module at, for example a top-level of the `Module`, we then basically
have no way to know from the top-down where it belongs, _rather_ we must start **at** the `Entity`
itself and then ask the resolver to work its way up to the container of which _is_ a `Module` in
order to find its containing top-level container (i.e. a `Module`). After doing this we then call
the `generateName(Container, Entity)` method with said `Module` and the same `Entity` and return early.
- Some work is being done on `resolveBest(Container c, string name)` but so far no changes have been
necessary (through testing) but some may yet be, so the code added is commented out for now and has no associated
commit message as I am still working on it.
- Added debugging messages to `findContainerOfType(TypeInfo_Class containerType, Statement startingNode)`
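
The `Program` special case of `generateName(Container relativeTo, Entity entity)` described above can be sketched like this. It is a Python model with hypothetical `Node`-based classes and a `parent` link standing in for the resolver's upward search; the dotted output format is assumed from the naming examples elsewhere in this log.

```python
class Node:
    def __init__(self, name=None, parent=None):
        self.name = name
        self.parent = parent


class Program(Node):
    pass  # nameless root container


class Module(Node):
    pass


class Entity(Node):
    pass


def find_module(entity):
    """Walk upwards from the entity until a Module is reached."""
    node = entity
    while node is not None and not isinstance(node, Module):
        node = node.parent
    return node


def generate_name(relative_to, entity):
    if isinstance(relative_to, Program):
        # Special case: start at the entity, find its Module, return early
        module = find_module(entity)
        if module is None:
            raise ValueError("entity is not contained in any module")
        return generate_name(module, entity)
    parts = []
    node = entity
    while node is not relative_to:
        if node is None:
            raise ValueError("entity is not inside the given container")
        parts.append(node.name)
        node = node.parent
    parts.append(relative_to.name)
    return ".".join(reversed(parts))
```

Anchoring on the `Program` therefore gives the same name as anchoring on the entity's own `Module`.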
2023-12-05 14:29:20 +02:00
Tristan B. Velloza Kildaire 4d894ef6fc Testing
- Fixed the shebang for `extern_test.sh`
2023-12-05 13:01:33 +02:00
Tristan B. Velloza Kildaire b09697f8bf Compiler
- Now pass in the `Compiler` instance to the `TypeChecker` constructor
2023-11-06 10:54:18 +02:00
Tristan B. Velloza Kildaire 6db12bc6b3 Resolver
- Removed now-completed TODOs
- These were done already for years but they may as well be removed now
2023-11-06 10:52:33 +02:00
Tristan B. Velloza Kildaire 6deb10c910 Resolver
- If there is no dot-path but the name matches one of the `Module`s attached to the `Program`, then return it
- This is an update for `resolveBest(Container c, string name)`
2023-11-06 10:51:42 +02:00
Tristan B. Velloza Kildaire 570c3c34a9 Resolver
- Removed `Module` and replaced it with a `Program`
- Updated constructor to take in a `Program` and a `TypeChecker`
- `resolveBest(Container c, string name)` will now loop through every `Module` of the `Program` checking to see if `path[0]` matches the name of any of those modules
2023-11-06 10:48:40 +02:00
Tristan B. Velloza Kildaire b43ad6b540 Program
- Made it a kind-of `Container`
- Made class final
- Added stub interface implementations
- Added `addStatement(Statement statement)` and `addStatements(Statement[] statements)`
- Added `getStatements()`
2023-11-06 10:34:16 +02:00
Tristan B. Velloza Kildaire df0c421d26 TypeChecker (unittests)
- Pass in a `null` `Module` such that the constructor is selected correctly (old constructor)
2023-11-05 21:23:21 +02:00
Tristan B. Velloza Kildaire ea0cca9e6a TypeChecker
- Added a constructor which takes in a `Compiler`
- It also then extracts the `Program` from it
2023-11-05 21:14:55 +02:00
Tristan B. Velloza Kildaire 2d7cfb5083 Parser
- Removed `getWorkingDirectory()`
- Removed unneeded imports
2023-11-03 15:03:23 +02:00
Tristan B. Velloza Kildaire f19e94927f Parser
- No longer pass in a value to `parseImport()`
2023-11-03 15:02:33 +02:00
Tristan B. Velloza Kildaire 6853e7949d Parser
- Added some code to do the collecting of several modules in a single import statement
2023-11-03 14:55:52 +02:00
Tristan B. Velloza Kildaire df3fe6390a Parser
- Removed old code from `parseImport(string)`
2023-11-03 14:52:31 +02:00
Tristan B. Velloza Kildaire c6f047cfe4 Program
- `debugDump()` needs something with a non-private field associated to correctly work it seems
2023-11-03 14:16:21 +02:00
Tristan B. Velloza Kildaire 7639e23842 ModuleManager
- Upgraded to new `niknaks` debug
2023-11-03 14:15:57 +02:00