tlang/small_doc_Resolver.md


⚡ Feature: Multi-module support (#31)

* Parser
  - `findModulesInDirectory(string directory)` now replaces all the `/` with `.`'s
* ModMan
  - Added stub `ModuleManager`
* ModuleManager
  - Added `validate(string searchPath)`
  - Added `validate(string[] searchPaths)`
  - Added a constructor

  ModMan (unittests)
  - Added a positive unittest
* ModMan (unittests)
  - Added negative test case example
* ModMan
  - Moved to new package
* ModMan
  - Added exception class

  ModuleManager
  - Throw an exception (a `ModuleManagerError`) if one of the search paths is invalid during construction
* ModMan
  - Added a package for easy importing
* Configuration
  - Calling `defaultConfig()` will now add an entry of type `ARRAY` with key `modman:path`, whose starting value is an array with a single element: the current working directory
* Compiler
  - On construction of a `Compiler` we now construct a `ModuleManager` with the list of paths provided
* ModMan
  - Cleaned up imports
  - Added another TODO
* Parser
  - Constructor now takes in an instance of `Program` (but by default it is `null`)
* Data
  - Added a `toString()` for `Program`
* Compiler
  - We now create an instance of `Program` on construction
  - We pass a `Compiler` to the parser
  - Added `getProgram()`

  Parser
  - Accept an instance of `Compiler`
  - Print out some debug information about the program in `parseImport(string)`
* Compiler
  - Added `getModMan()` method
* ModMan (package)
  - Import `ModuleEntry`

  ModMan
  - Added `ModuleEntry`
  - Added `getModulesInDirectory(string directory, bool recurse = false)` which can resolve all the `ModuleEntry`(s) in a given directory (with optional recursion)
  - Added `search(string curModDir, string name, ref ModuleEntry found)` for searching for a module in a given directory (and then all search paths)
* ModuleManager
  - Only print debug text if the compile-time flag `DBG_MODMAN` is passed
  - Added support for recursing down and fixed how it is done in `getModulesInDirectory(string directory, bool recurse =
false)` * Parser - Removed old module searching code - Added a TODO for ACTUAL parsing that is still to be done - Added `ModuleManager`-based code for doing module lookups Test cases - Updated `modules/a.t` * Parser - Store current module globally (not sure if we need to) - WIP: Work on module header name validation - Removed `findModulesFromStartingPath(string startingModulePath)` - Removed `findModulesInDirectory(string directory)` - Removed `slashToDot(string strIn)` ModuleManager - Added `isValidModuleDeclaration(string, string, string)` (still a work-in-progress) - Added `slashToDot(string strIn)` * Test cases - Updated testing files `a.t`, `b.t`, `z.t` and `niks/c.t` * Configuration - `defaultConfig()` now adds NOTHING to the default search path (just for now) * ModuleManager - The constructor now takes in an instance of the `Compiler` - Added new method `entries()` which calls `entries(string[])` with the given search paths from the compiler config entry but also tacks on the current working directory - Added `entriesWithInitial(string initialModulePath)` which, given a path to a module initially, will extract said module's containing directory and tack that onto the search paths and then call `entries(string[])` with that - Implemented `entries(string[])` which currently recurses down the given directories discovering all files that end with `.t` and of which have a module header (if not an error is thrown) and then creates a `ModuleEntry` per each mapping their absolute path to their skimmed-off module name - Added an `expect(SymbolType, Token)` so that we can throw `SyntaxError`'s (with a `null`-parser) - Added `skimModuleDeclaredName(string, ref string)` which skims the module's header for the module's name, setting it in the `ref`-based parameter if parsed correcttly, else returns `false` - (WIP) Made `isValidModuleDeclaration(string, string, string)` return `true` * Compiker - Now pass in the `Compiler` to the `ModuleManager` constructor * Parser - 
Added testing code to `parseImport(string)` which tests the new module discovery/mapping mechanism of `ModuleManager` - It purposefully crashes at the end to prevent it from running the old code * Commands - Added a TODO regarding allowing multiple modules to be specified on the command-line in the future * Parser - Removed old code which made calls to the old `ModuleManager` methods in `parseImport(string)` * Parser - Removed purposeful crash in `parseImport(string)` * ModuleManager - Removed deprecated methods `search(...)`, `getModulesInDirectory(...)` * Parser - Removed module name checking in `parse(string)` * ModuleManager - Removed method `isValiddModuleDeclaration(string declaredName, string fileName, string curModDir)` * Modman - Documented `ModuleEntry` * ModuleManager - Cleaned up and corrected comments * ModuleManager - Implemented `addSearchPath(string)` * ModuleManager - Implemented `addSearchPaths(string[])` * ModuleManager - Switched to using the `addSearchPaths(string[])` method when taking the search paths from the `modman:path` compiler config entry so as to clean it of any duplicates (if any are provided) - This also increases code re-use * ModuleManager - Calling `addSearchPath(string path)` will now validate the path before adding it * ModuleManager - Added TODOs for any debug prints - Added `searchFrom(string searchQuery, string initialModulePath, ref ModuleEntry foundEntry)` * Parser - Added code to test the module searching method (in `parseImport(string)`) * ModuleManager - Implemented `searchFrom_throwable(string searchQuery, string initialModulePath)` * Parser - Switched to using `searchFrom_throwable(string, string)` when searching for the module being imported * ModuleManager - Added `readModuleData(ModuleEntry ent, ref string source)` * ModuleManager - On error or success close the file if it was opened for `readModuleData(ModuleEntry ent, ref string source)` * MduleManager - Added `readModuleData_throwable(ModuleEntry ent)` method * 
Parser - Read in the module's source * Containers - Added a `toString()` method for the `Module` container type * Test cases - Made `b.t` import module `ap` (useful for cycle detection testing) * Dub - Added `niknaks` dependency * ModuleManager - We now check for any possible duplicates by checking if the discovered module (or modules in the case of a directory-recurse) are present in the found list. This updates `entries(string[] directories)` ModuleEntry - Two `ModuleEntry`s are said to be equal if their file paths are equal * ModuleEntry - Added explicit constructor - Made file private - Added validity checker in the form of `isValid()` - Added `getPath()` and `getName()` * ModuleManager - Removed old `entries()` method - Switched to using `getName()` in `searchFrom(...)` - Whenever a `ModuleEntry` is made assert that it is valid * ModuleManager - Removed debug output * Compiler - During construction add the containing directory of the module file specified on the command-line to the search paths of the `ModuleManager` - `doParse()` now dumps some information from the `ModuleManager` * Test cases - Updated `niks/c.t` to import module `ap` * Program - Added `isModulePresent(ModuleEntry)` - Added `markAsVisited(ModuleEnty)` - Added `setModule(ModuleEntry, Module)` - Added getMods()` and `getOMods()` (for insertion order for debugging purposes) - Added `debugDump()` * Parser - `parseImport(string)` now searches for the module by the name and with the current module's path - Return immediately if the module we found is already present in the `Program` - If NOT, then mark as visited, open parse it and then add the parsed `Moudle` just imported to the `Program` * Compiler - The current directory should be added to the search path * Compiler - This was WRONG. 
- DO NOT add the current working directory to the search path * ModuleManager - Added `findAllTFiles(string directory)` - Working on new search methods now * Parser - Added testing code and panic to not progress * ModuleManager - Now implemenetd module-name-to-path mapping with fallback mechanism * ModuleManager - Added `find(string modName)` * Test cases - Fixed import statements such that the testing works * Parser - `parseImport(string)` now uses new `ModuleManager` code * ModuleManager - Removed `entriesWithInitial(string initialModulePath)` - Removed `entries(string[] directories)` - Removed `searchFrom(string searchQuery, string initialModulePath, ref ModuleEntry foundEntry)` * Parser - Switch to new Module manager system (for real this time) * Dub - Upgraded `niknaks` package - Enable `DBG_MODMAN` when compiling ModuleManager - Added some useful debugging prints * Dub - Upgraded `niknaks` to version `0.6.0` * ModuleManager - Upgraded to new `niknaks` debug * Program - `debugDump()` needs something with a non-private field associated to correctly work it seems * Parser - Removed old code from `parseImport(string)` * Parser - Added some code to do the collecting of several modules in a single import statement * Parser - No longer pass in a value to `parseImport()` * Parser - Removed `getWorkingDirectory()` - Removed unneeded imports * TypeChecker - Added a constructor which takes in a `Compiler` - It also then extracts the `Program` from it * TypeChecker (unittests) - Pass in a `null` `Module` such that the constructor is selected correctly (old constructor) * Program - Made it a kind-of `Container` - Made class final - Added stub interface implementations - Added `addStatement(Statement statement)` and `addStatements(Statement[] statements)` - Added `getStatements()` * Resolver - Removed `Module` and replaced it with a `Program` - Updated constructor to take in a `Program` and a `TypeChecker - `resolveBest(Container c, string name)` will now loop through 
every `Module` of the `Program`, checking whether `path[0]` matches the name of any of those modules
* Resolver
  - If there is no dot-path but the name matches one of the `Module`s attached to the `Program`, then return it
  - This is an update for `resolveBest(Container c, string name)`
* Resolver
  - Removed now-completed TODOs
  - These were done years ago already, but they may as well be removed now
* Compiler
  - Now pass in the `Compiler` instance to the `TypeChecker` constructor
* Testing
  - Fixed the shebang for `extern_test.sh`
* Resolution
  - Updated the method `generateName(Container relativeTo, Entity entity)` to have a special case for when the `Container` passed in is a `Program`. Because the `Entity` provided may have been declared in some module (for example, at the top level of a `Module`), we have no way of knowing from the top down where it belongs. _Rather_, we must start **at** the `Entity` itself and ask the resolver to work its way up to the container which _is_ a `Module`, in order to find its containing top-level container. After doing this we call `generateName(Container, Entity)` with said `Module` and the same `Entity` and return early.
  - Some work is being done on `resolveBest(Container c, string name)`. So far testing has shown no changes to be necessary, but some may yet be needed, so the added code is commented out for now and has no associated commit message as I am still working on it.
  - Added debugging messages to `findContainerOfType(TypeInfo_Class containerType, Statement startingNode)`
* Resolution
  - Added some unittests just for testing my sanity when it comes to using `split(string, char)`
* Resolver
  - Updated `resolveBest(Container c, string name)` to now include a check for the case where we pass in a `Container` which is not also an `Entity` (i.e. it has no name).
The only case where this happens (in our current code) is that of a `Program`, which is purposefully **not** an `Entity` (as I wanted it to be nameless) but _is_ a `Container`. Therefore we don't want to continue on to the code below, which casts to `Entity` in the form of `containerEntity`, as that would not work (and the associated assertion would fail).
  - What I wanted to accomplish with the two checks is, first, to check whether we are given a name which directly matches that of a `Module` that is _in_ the provided `Program`, and in such a case return said `Module`
  - Else, we have a name of the form `moduleName.<something>`. Here we similarly scan the `Program` for all its modules and match on `moduleName`. We look this `Module` up **but** we don't return yet. Instead, we now search for the incoming `name` but _anchor it on_ the found `Module`, so that the search happens therein
* Resolver
  - Fixed the path array usage in `resolveBest(Container c, string name)`
* Resolver
  - Added another unittest for my sanity
* Resolver
  - Added an assertion that the result from `split(..., ...)` may not be an empty array
  - This is an update to the `resolveBest(Container c, string name)` method
* Resolver
  - Don't shadow global `Program` variable
  - Updated `resolveBest(Container c, string name)`
* Program
  - Renamed `debugDump()` to `debugDumpOrds()`
  - Added `debugDump()` which dumps `modulesImported`
* Compiler
  - Added a `setProgram(Program program)` method
  - For now, when we finish the call to `parse()` on the `Parser` in `doParse()`, we also add ourselves to the list of modules. This is done because, as of right now, we only add ourselves if we are visited via an import, which implies that you would need a cyclic import to be visited. That obviously isn't a constraint we would like to impose upon code writers using TLang. For now we manually add it at the end.
- At the end of `doParse()` call `debugDumpOrds()` on the `Program` - Added a `getTypeChecker()` method Compiler (unittests) - Prints out more information during the exception catch * TypeChecker - Calling `getModule()` will now return the first `Module` in the `Program`'s list of modules - Marked `getModule()` for removal - The constructor now constructs a `Resolver` instance by also passing in the `TypeChecker` itself - We have updated `dependencyCheck()` to call `checkDefinitionTypes()` on each `Module` of the current `Program` - We have updated `dependencyCheck()` to call `checkClassInherit()` on each `Module` of the current `Program` - When handling a `VariableExpression` we look up the `Variable` it referes to, then we must generate a name for it rooted by its module, we now do this by anchoring at the `this.program` rather than the `modulle` (what the deprecated `getModule()` would return), this would use the updated `Resolver` code to generate the name we expect as with the implementation that came before - When handling a `FunctionCall` we lookup the `Function` it refers to, then we must ask the resolver to resolve the `Function` by its name using `resolveBest(Container, string)` and this should be best-effort starting at the `Module` of which the `FunctionCall` is present in. 
If it happens to be that the `Module` of the `Function` referred to is outside the provided starting anchor then it will be handled for us, that is undert the assumption it is `otherModule.myFunc()` being called and present in `otherModule`, compared to the first case of having `myFunc` being declared in `myModule` of which so is the `myFunc()` `FunctionCall` statement - When generating the name of a variable being assigned to for `StaticVariableDeclaration`, we now anchor at `this.program` instead of `modulle` (the deprecated value returned by `getModule()`) - Updated `beginCheck()` to now call the `MetaProcessor` with the instance of `this.program` rather than `modulle` (deprecated - as stated before) - Updated `beginCheck()` such that it now calls `processPseudoEntities()` on each `Module` that is contained within the current `Program` - Updated `beginCheck()` such that it now calls `checkContainerCollision()` on each `Module` that is contained within the current `Program` - Updated `checkContainerCollision(Container c)` to do the following: * On each `Entity` we loop over we ensure that it never has a name which matches that of any `Module` that is stored within our `Program` (exclusivity principle of module names) * Updated calls to `generateName(Container, Entity)` to use `this.program` as the anchor container (insead of the deprecated `modulle`) TypeChecker (unittests) - Fixed a unittest to not spawn a `TypeChecker` using a `Module` but rather a dummy `Compiler` instance - Fixed a unittest to not spawn a `TypeChecker` using a `Module` but rather a dummy `Compiler` instance - Fixed a unittest to not spawn a `TypeChecker` using a `Module` but rather a dummy `Compiler` instance AND to create a dummy `Program` of which we then add a dummy `Module` to - Fixed remaining unittests to now use a `Compiler` instance * Test cases - Added `alone.t` * Parser - Updates for `parentToContainer(Container, Statement[])`: * We now have a default argument called 
`allowRecursivePainting` which is set to `true` by default * What this does is handle special cases of `Statement`s which are not placed within a `Container` and therefore can not be reached in a generic way (i.e. `getStatements()` with `Container` types) * These special cases are: 1. `Variable` * Declarations may have a `VariableAssignment` attached to them which has an `Expression` 2. `BinaryOperatorExpression` * Left and right-hand side operands have `Expression`'s attached to them 3. `VariableAssignmentStdAlone` * The assignment `Expression` 4. `ReturnStmt` * It may have an `Expression` in it 5. `FunctionCall` * We need to process each of its actual arguments, which comes in the form of a `Expression[]` 6. `DiscardStatement` * This is like a `ReturnStmt` but it always has an `Expression`, therefore we must process it 7. `IfStatement` * Contains an array of `Branch` (i.e. a `Branch[]`) 8. `WhileLoop` * Contains a singular `Branch` 9. `ForLoop` * Contains a singular `Branch` 10. `Branch` * Contains a condition in the form of an `Expression` and a set of body statements in the form of a `Statement[]` array * What we then do for 1-6 (so-called normal cases): * These we will recurse upon and parent them to the same `Container` that came in in the original call which was what got us to enter into the cases of 1 to 6. * What we then do for 7 onwrads (so-called "maintain" cases): * Notice that each type mentioned in these are all a kind-of `Container` * We therefore call `parentToContainer(Container, Statement)` not with the incoming `Container` that came into the call that macthed to one of these cases but rather to that of this case's container itself * We do this because we want the scoping to be consistent and that is done by keeping the ancestry tree as it is expected when multiple calls are done for parenting, for example, the body `Statement[]` items to their respective `Branch` and then those `Branch[]` to their `IfStatement`. 
* "I said there would be special cases ;)" Parser (unittests) - Fixed unittests such that they run by using the `Compiler` object now * TypeChecker - Renamed `getModule()` to `deprecated_getModule()` so that we generate compilation errors now * TypeChecker - Implemented `getProgram()` * Builtins - Updated the `getBuiltInType(TypeChecker, string)` method to take in a `Container` as the second argument * TypeChecker - `getType(Container c, string typeString)` always checks for built in types first and then if one of such is found it then immediately returns, else does a search. We have now updated that inner call to `getBuiltInType(TypeChecker, string)` to use the new API which is `getBuiltInType(TypeChecker, Container, string)` - When processing a `ReturnStmt` and we need to check the return type of the `Function` it is contained within and the type of its returned `Expression`, we need not call `getBuiltInType(TypeChecker, Containert, string)`, let's just call `getType(Container, string)` as it does that call for us in any case - With the above I also anchor it to search for the type based on the `funcContainer` `Container` as that is at the same level as the `ReturnStmt` itself * Parser - Fixed missing ending curly brace * Scratch - Playing with a way to build library objects without immediately linking - I am then also seeing how to link it to an application which `_start_` can find its `main` symbol * Makefile - Added `-c` to ONLY compile but do NOT attempt to link and make an executable. If we try this it looks for a `main` symbol to satisfy the `_start` for linux-ld - Updated the build instructions for being able to statically link against the library * Makefile - Added seperate stage to _create_ the `main.o` and then only later link with the intent of making an executable. 
- This is BETTER because it means we could treat everything in the same way (never mind whether it is a `main` or a library)
* DGen
  - Added some future code for `emit()` which will enumerate all `Module`(s) available in the current `Program`, do an emit step for each of them and have those go to the correct `File`(s)
  - Added a comment for `finalize()` which indicates where we should link all generated object files for all the modules. After this we should then link these against each other and generate an executable.
* tlang.compiler.codegen.mapper.api
  - Added new module

  SymbolMapperV2
  - Added new interface

  ScopeType
  - Added new enum type
* SymbolMapperV2
  - Added missing import for `Entity` type
* Implementations
  - Added some stub types `LebanonMapper` and `HashMapper`
* LebanonMapper
  - Implemented `map(Entity item, ScopeType type)`
* LebanonMapper
  - Now respects the `type` field for `ScopeType.GLOBAL`
  - `ScopeType.LOCAL` is still to be implemented
* LebanonMapper
  - Fixed wrong variable name in `map(Entity item, ScopeType type)`

  HashMapper
  - Implemented `map(Entity item, ScopeType type)` for `ScopeType.GLOBAL`
  - `ScopeType.LOCAL` is yet to be implemented
* LebanonMapper
  - If called with `ScopeType.LOCAL` then the name will be generated in an absolute sense, however the module name will be stripped from the path
  - For example `simple_module.a.b` would become `a.b` and then be mapped to `a_b`
* LebanonMapper
  - Join with periods and let the step afterwards do the replacement
* HashMapper
  - Added support for `ScopeType.LOCAL` in a similar fashion to how it is done in `LebanonMapper`
* MetaProcessor
  - When getting the type, root everything on the `Program` rather than `tc.getModule()`. We therefore now use `tc.getProgram()` in its place. This updates the `sizeOf_Literalize(string typeName)` method.
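
The `ScopeType` behaviour described above for `LebanonMapper` and `HashMapper` can be sketched roughly as follows. This is an illustrative Python sketch, not the project's D code: the names `scoped_path`, `lebanon_map` and `hash_map` are hypothetical, the use of MD5 is an assumption (this log does not say which hash `HashMapper` uses), and the `t_` prefix is taken from a later `HashMapper` entry in this log.

```python
import hashlib

# Stand-ins for the two ScopeType values mentioned in the log
GLOBAL = "global"
LOCAL = "local"


def scoped_path(absolute_name: str, scope: str) -> str:
    """Return the dotted path to map; LOCAL strips the leading module name."""
    segments = absolute_name.split(".")
    if scope == LOCAL and len(segments) > 1:
        segments = segments[1:]
    # Join with periods and leave any replacement to a later step
    return ".".join(segments)


def lebanon_map(absolute_name: str, scope: str) -> str:
    """Readable mapping: periods become underscores."""
    return scoped_path(absolute_name, scope).replace(".", "_")


def hash_map(absolute_name: str, scope: str) -> str:
    """Hashed mapping; MD5 is an assumption made for this sketch only."""
    digest = hashlib.md5(scoped_path(absolute_name, scope).encode("utf-8")).hexdigest()
    # Prefix with `t_` so the symbol never starts with a digit
    return "t_" + digest
```

Under these assumptions, `lebanon_map("simple_module.a.b", LOCAL)` gives `a_b`, matching the example in the log, while the `GLOBAL` variant keeps the module name in the mapped symbol.
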
DGen - `emit()` has been updated to enumerate all `Module[]` of the given `Program` and then calls the respecitve `emit(...)` methods with a (`File`, `Module`) which contains the file which emitting should be written to and also the `Module` being emitted. - At the end of `emit()` we then try to find the `Module` which contains a function named `main` and then set that as the entrypoint. If such a `main` function cannot be found we then go and do a check whether or not the `dgen:emit_entrypoint_test` configuration option is `true`, if so we then try emit a testing entrypoint and assume that the tst uses only one `Module` (this is something to be cleaned up later; as in all tests should have a `main` method). - Implemented `findEntrypoint(ref Module mainModule, ref Function mainFunc)` which does as the name implies and finds the containing `Module` which has a `Function` named `"main"` and if so sets both ref arguments and returns `true`, otherwise they are left unset and `false` is returned - `emitHeaderComment(...)`, `emitStaticAllocations(...)`, `emitFunctionPrototypes(...)`, `emitFunctionDefinitions(...)`, `emitFunctionPrototype(...)`, `emitFunctionDefinition(...)`, `emitCodeQueue(...)`, `emitStdint(...)`, `emitEntrypoint(...)`, `emitTestingEntrypoint(...)` now takes in `File modOut` and `Module mod` - Note the above have not YET been updated to select correct code queues, static init. queues and `emitEntrypoint(...)` has not yet been implemented - Updated `emitTestingEntrypoint(...)` to check using the incoming `Module`'s name - `finalize()` will now compile all sources files, then it will link all generated object files together; optionally cleaning up all but the final genersted excutable at the end CollidingNameException - Fixed up name generation (we now anchor on the `Program`) - Fixed up how we detect if a module's name is in use. 
We now will check and ensure it doesn't match ANY of the modules contained within the `Program` DNode - Implemented `forceName(string)` to set the `name` field forcefully - The `generate()` method now calls `generalPass(Module, Context)` on each of the `Program`'s `Module`s. We then get an array of `DNode[]` from them. After this we then create a root `DNode` to represent the program and then we make it `needs(DNode)` each `Module` `DNode` created previously. DNodeGenerator - When processing `VariableExpression` in `expressionPass(Expression exp, Context context)`, if we get a `Function` being referred to in the expression, then we now set the `Context`'s `Container` still to the `Module` containing the function's definition, _however_ we use a method from the resolver to ensure it is the correct module (because we are now using a multi-module setup). - We have always set the **root** `Module` when entering `generalPass(Container c, Context context)` and processing a `Function` but, as with the last change mentioned above, we need to find that containing `Module` correctly. Therefore we use a method from the `Resolver` to do that. * TypeChecker (unittests) - Fixed up calls to `getBuiltInType(...)` which now require some `Container` to pivot on. We now pass in the `Program` - Fixed up unittests which used `getModule()`, for such uni-modular tests we just use `program.getModules()[0]` now * FuncDefStore - Use `tc.getProgram()` in place of `tc.getModule()` when doing a name generation via the `Resolver` * DGen - Use `else` rather than hard coding all test cases. 
This will only be used if no main module is found AND testing is enabled - so it should be a safe bet * DNodeGenerator - Fixed `DNode` construction in `generate()` method PoolManager (unitttests) - Fixed unittests * TypeChecker - Removed `deprecated_getModule()` - Removed old `modulle` field - Cleaned up field definition section - Removed old usages of `modulle` and now rooting at the now-new topmost `Container`; the `Program` * LebanonMapper - Added unittest - Seems like it completely works as expected for both `ScopeType`'s * HashMapper - Added unittests - Looks like it works as expected as well * HashMapper - Actually fixed unittest - Added more debug-time debugging prints - Had to use `dup` (BUG)! * CodeEmitter - Removed already-completed TODO * TypeChecker - Use `Module`, as no cast is needed - Cleaned up * TypeChecker - Cleaned up some more * DNodeGenerator - Cleaned up the constructor - Made `resolver` private * DNodeGenerator - Removed `functionDefinitions` as it is no longer used anymore, we now make use of the `IFuncDefStore` * ProgramDepNode - Added new type * DNode - Implemented `getDepCount()` and `getDeps()` DNodeGenerator - Now uses `ProgramDepNode` * DNodeGenerator - When calling `generate()` set a nice name for the dependency nodes of the `Module`(s) * IFuncDefStore - All methods now require an owner `Module` * FuncDefStore - Now conforms to the new `IFuncDefStore` API * FunctionData - Added `Module`-based ownership model - Implemented methods `setOwner(Module mod)` and `getOwner()` - Implemented method `getName()` DNodeGenerator - When processing a function definition (a `Function`) we now will call `addFunctionDef` with the owner of this `Function`, this is done by making use of the `Module root` which is set in such a case (and others) when entering the `generalPass(Module, Context)` method * TypeChecker - We now get all of the top-level `DNode` (dependency nodes) and then from there onwards we `performLinearization()` -> `getLinearizedNoes()` 
-> `doTypeCheck(DNode[])`, at each stage collecting the global code queue, the function definitions (each of their code queues) and the static init queues. These are then collected ina `ModuleQueue` for the respective `Module` being processed - Added methods `getModQueueFor(Module owner)`, `scratchToModQueue(Module owner)`, `funcScratchToModQueue(Module owner, FunctionData fd)` and `initsScratchToModQueue(Module owner)` - Updated the following methods to make use of the `ModuleQueue`(s) available by selecting by an owner `Module`: `getGlobalCodeQueue(Module owner)`, `getFunctionBodyCodeQueues(Module owner)` and `getInitQueue(Module owner)` ModuleQueue - Added new type * CodeEmitter - Selecting a queue now requires an owner `Module` and it will also, when the `QueueType` is set to `FUNCTION_DEF_QUEUE`, copy over the correct `Instruction[][string]` and then, furthermore, select the correct `Instruction[]` out of it and set that as the current code queue - Updated `getFunctionDefinitionsCount(Module owner)` and `getFunctionDefinitionNames(Module owner)` to select the correct `Instruction[][string]` when called * DGen - When we find an entrypoint use `a` mode when opening the file for it, `w` mode would wipe the previously written C source code - Also added a note regarding this branch (if-else) above - Updated error message when no entry point (and not a test file) is found - Fixed `emitStaticAllocations(File modOut, Module mod)` to select the static init queue using the provided `Module` - Fixed `emitFunctionPrototypes(File modOut, Module mod)` and `emitFunctionDefinitions(File modOut, Module mod)` to use the inheritted `CodeEmitter` methods correctly by passing in a `Module`, similar changes affecting the `TypeChecker` are now conformed to in s similar manner - Fixed `emitFunctionPrototype(File modOut, Module mod, string functionName)` and `emitFunctionDefinition(File modOut, Module mod, string functionName)` to select the function definition queue of the given 
`functionName` by the given `Module`
  - Fixed `emitCodeQueue(File modOut, Module mod)` to select the globals queue by `Module`
  - Added some comments about future features to `emitEntrypoint(File modOut, Module mod)`
* ⚡ Feature: Pluggable predicate-based resolution method (#36)
* Dub
  - Upgraded `niknaks` to version `0.9.7`
* Resolver
  - Documented the class and some new and existing
  - Implemented a version of `resolveWithin(...)` which now takes in a `Predicate!(Entity)` instead of a name and uses that as the final matching step when iterating over the given `Container`'s `Statement[]` (the container's immediate body)
  - Implemented `derive_nameMatch(string name)` which returns a `Predicate!(Entity)` closure which matches based on names
  - Updated `resolveWithin(Container, string)` to make use of the above two changes
  - A similar change, including the above rework and predicate derivation, has been done for `resolveUp(...)`, of which there is now a version available as `resolveUp(Container, Predicate!(Entity))`
  - Stub version of `resolveBest(Container c, Predicate!(Entity) d)` added but not implemented yet
  - Added some unused types and methods, `SearchCtx` and `findFrom!(...)(...)` respectively
  - Added `derive_functionAccMod(AccessorType)` (this is to be moved into `DGen` as that is where it is needed)
* Resolver
  - Removed `derive_functionAccMod(AccessorType)`

  DGen
  - Added `derive_functionAccMod(AccessorType)`
* Resolver
  - Removed the old code that was commented out in `resolveWithin(Container currentContainer, string name)`
* Resolver
  - Cleaned up commented-out code for `resolveUp(Container currentContainer, string name)` and also documented it
* Resolver
  - Removed unused things
* DGen
  - Initial code for `emitExterns(File, Module)`
* ⚡ Feature: Collector-based resolution methods (#37)
* Resolver
  - Added collector-based searching for `resolveWithin(..., ..., ...)`
* DGen
  - Corrected `DGen` call to use new `Resolver` method
* Dub
  - Upgraded `niknaks` to version
`0.9.8` * Resolver - Fixed a bug in `resolveUp(Container currentContainer, Predicate!(Entity) predicate)` which would cause a segmentation fault when we had climbed to the top of the AST hierachy but we tried to find the `parentOf()` a `Program`. This would fail because only kind-of `Statement`(s) have the `parentOf()` method. We normally do a `cast(Entity)container` to therefore get such access to that method (as all `Entity`(s) are a kind-of `Statement`). However a `Program` is notably NOT an `Entity` and hence this would fail. Ths fix therefore was to add a check that if `entity` was `null` meaning that the `resolveWithin(Container, `Predicate!(Entity))` failed that we must not try to climb further. We do this by checking that with an intermediary else-if branch that, if the within-search failed AND we have a `cast(Program)container` which is non-`null` then we stop the search with zero-results by returning `null` Program - When one calls `setModule(ModuleEntry, Module)` then the incoming `Module` should be parented to the `Program` itself * Test cases - Updated `a.t` to have its own `main` method and try refer to something outside of its own module (something in `b.t`) - Updated `b.t` to have a method named `doThing()` * HashMapper - Updated the mapping technique to prepend the characters `t_` or else the `DGen` gets very angry bevause the symbol names sometimes start with numbers of a sequencce of characters that meanns something else (an illegal symbol name). HashMapper (unittest) - Updated unittest to correspond with the fix) * CodeEmitter - Now makes use of `SymbolMapperV2` for the mapper Compiler - Now uses the new `HashMapper` and `LebanonMapper`. - Along with this it also now uses the `SymbolmapperV2` API * DGen - Now takes in a `SymbolMapperV2` - We now have made use of the new symbol mapping facilities and (for now) we are mapping everything with a `GLOBAL` `ScopeType`. 
n this case have added comments above each mapping request to reconsider which scope type should be used on a case-by-case basis. - We now first call `emitStdint(..., ...)` prior to `emitExterns(..., ...)` because the latter uses types defined in the former (think of well, the `uint32_t` types for example) - Defined a new type `ModuleExternSet` which holds a `Module` and its respetcive publically accessible `Function`(s) and `Variable`(s) - Implemented `generateExternsForModule(Module mod)` which generates a `ModuleExternSet` for the given `Module` - Updated `emitExterns(File modOut, Module mod)` to now determine all `Module[]` except the incoming `mod` itself and then generate `ModuleExternSet`()s for each. These are then looped through and extern statements are generated for each `ModuleExternSet's `pubFns()` and (soon) we will also add support for the `pubVars()` of each * TypeChecker - When pro0cessing a `DNode` which contains a `Statement` of which is a `FunctionCall` we now will generate the corresponding `FuncCallInstr` using the name that is refered to in the `FunctionCall` (i.e. using `getname()` on the `FunctionCall`) INSTEAD of using the `getName()` method from the `Function` looked up CVIA the `FunctionCall`'s `getName()` method. This ensures that the full path expression is passed into the `FuncCallInstr`. Remember it is looked up later in the emitter but having this name fixed a lto of problems. i.e. having a `b.doThing()` call in module `a` now works. Previously it would look up just `doThng()` as that is all that was all whic was saved within the `FuncCallInstr`. - Rule of thumb when deating any sort of `Instruction` whic refers to some entity using a name, save-what-you-see. Don't eaergly lookup stuff and re-generate names because the name y7ou see is normally syntactically correct from the user. 
* Resolver - Fixed up documentation - `generateNameBest(Entity)` now has a WAY cleaner implementation that uses the `findContainerOfType(..., ...)` method - Documented `isDescendant(Container, Entity)` and fixed it up for `Program` related things - Updasted documentation and error messages which print when using `resolveBwest(Container, string)` with `Program` as the container and referring to a non-module-only name. Resolver (unittests) - Added one huge unittest to test EVERYTHING * Compiler - No longer store a global `Module` - Removed `getModule()` * Commands - Removed all references to `getModule()` to the `Compiler` object * Parser - When calling `parse(string, bool)`, store a `ModuleEntry` corresponding to the just-created `Module` such that it is pre-visited * Compiler - Call `parse(string, bool)` with a flag to indicate this is an entrypoint call and the `Module` immediately parsed should be stored as visited (as we have access to its name now (after parsing the `module <x>;` header)) and the `Module` object itself (as per the instantiation of it PRIOR to any further `LexerInterface` calls but just late enough after the module header parsing in order to obtain its name * Test cases (multi-module) - Updated `a.t` and `b.t` * test cases - Updated `niks/c.t` * Small documentation - Added description of modules and programs along with example code, directory structure and then also usage * Test cases - made the multi-module test case more complicated to ensure it all works - Also fixed the module name at module `niks/c.t` to be `c` and NOT `niks.c` which is incorrect * Small docs - Updated the docs to reflect current code * DGen - Show elapsed time per compiled unit * DGen - Show only milliseconds * DGen - Calculate and print out total compilation time at the end of compilation * ModuleEntry - Fully documented * ModuleEntry - Typo fix in documentation * Small docs - Working on implementation details now * Small docs - Added `ModuleEntry` documentation - Working 
on `ModuleManager` docs now * Smaol docs - Updated docs * Smol docs - Typo fixes * Small docs - Updated * ModuleManager - Made many methods `private` which should have been from the start - Optimized out calculations that were being done over and over again when they needed to only be calculated once - Added some missing documentation * ModMan - Cleaned up * ModuleManager - Documented more methods * ModuleManager - Documented last item * ModMan - Removed unused unittest * Small docs - Removed unrelated section * Small docs - Updated * Container - Corrected typo * Smol docs - Working on the resolution docs * Smol docs - Updated docs * Smol docs - Finished section on program * Program - Cleaned up - New method names Parser - Updated to use the new `Program` API Compiler - Cleaned up * Container - Added documentation * Program - made `addModule(Module)` public again * Small docs - Added some more on resolver * Small doc - Added more documentation on the `Resolver` API * Small doc - Added more documentation on the `Resolver` * Small docs - updated * Small docs - TYpo fix * Resolver - Fixed the typos in some documentation - `generateNameBest(Entity entity)` now relies on `generateName(Container, Entity)`'s special handling of `Program`'s in order to achieve its goal. - Cleaned up `generateName(Container, Entity)` - `generateName_Internal(...)` is now `generateName0(...)` * Small docsd - Updated * Small doc - Added more information * Small docs - Added examples * Samll docs - netaened up * Small docs - Fixed run onlines * Smal docs - Updated * Small docs - Updated * Small docs - Finioshed resolver docs * Small docs - Added code excerpt * Small docs - Added code insertion * Parser - Cleaned up imports - Removed unused global `curModule` - Cleaned up in general * Resolution - Neatened up - Removed check for `statement !is null` from `resolveWithin(..., ...)` because that should honestly never be the case. 
In the case it ever is stuff would still work, but it won't - Formatted code - Added docs * Parser - Refactored importing code into `doImport(string)` * Parsing (unittests) - Added a test which tests out the multi-module support by testing `a.t`/`b.t` and `niks/c.t` in a unittest itself * Pipelines - Added test for `a.t` (multi-module) test * Pipelines - Added test for `a.t` (multi-module) test * Pipelines - FIXED it (removed set e in wrong place) - Added test for `a.t` (multi-module) test * Pipelines - No, actual fix is `-e` as we want to exit on first bad command (non-zero exit code) * Pipelines - Smh, fixed * Pipelines - But don't fail after running it lmao * Pipelines - try this? * Pipelines - test to make sure * Revert "Pipelines" This reverts commit 73efdf9d4c09751015e4efb399ebdaa6c3d8e332. * Parser - Updated `doImport(string)` * Parser - Make `doImport(string)` now use `doImports(string[])` - `parseImport()` now supports multi-line imports * test cases - Updated `a.t` to use a multi-line import * Parser - Added doc * Configuration - Added default config option of `modman:strict_headers` and set it to `false` * ModuleManager - Removed `slashToDot(string strIn)` - Removed `skimModuleDeclaredName(string modulePath, ref string skimmedName)` * Parser (unittests) - Comments unittests disabled for now * Small docs - Added new one to work on soon * Parser - `moduleFilePath` is no longer an optional argument * ModuleManager - Fixed bug in `findAllTFilesShallow(string directory)` whereby we would never check `directory` to be a valid path nor even a path (if valid) to a directory - which is required in order to run `dirEntries(...)` on it * Configuration - Cleaned up the configuration code which initially creates the `ConfigEntry` for `modman:path` * Small docs - Yebop * Compiler - Added documentation to `getModMan()` * Commands - Added `ParseBase` mixin template, this contains the support for `--paths` command-line option - `compileCommand` now mixes in `ParseBase` 
and `TypeCheckerBase` and initializes the `ParseBaseInit()` as well - `parseCommand` now mixes in `ParseBase` and initializes the `ParseBaseInit()` as well - `typecheckCommand` now mixes in `ParseBase` and `TypeCheckerBase` and initializes the `ParseBaseInit()` as well * ModuleManager - Fixed bug in `validate(string searchPath)` which would not check if a path existed * Parser - Removed now-completed TODO * Parser - Removed unused method `doImport(string moduleName)` * Dgen - Added `generateSignature_Variable(Variable var)` which generates JUST a variable signature (with symbol mapping) - `emitExterns(File modOut, Module mod)` now also generates extern statements for global variables * Test cases - Updated to show use of extern variables * DGen - Removed old commented-out code from `emitExterns(File modOut, Module mod)` * TypeChecker (unittests) - Fixed missing `sourceFile` argument to various `Compiler` constructors - Fixed the module obtaining in one of the tests * DGen - Documented * LebaneseMadpper - Removed old mapper * HashMapper - Removed old mapper * SymbolMapper - Removed old definition * SymbolMappingTechnique - Documented * SymbolMapperV2 - Renamed * SymbolMapper - Moved from `api.d` to `core.d` * SymbolMapper - Documented - Cleaned up * DGen - When generating symbols for `Variable`, if it IS `isExternal()` then do not symbol map, else symbol map * DGen - When generating `extern ...` statements for a given `Module` only tack on the `extern` to the emit if it is NOT `isExternal()` (if it is not a TLang externed variable/function). This is because those would already have them via the call to the signature generation function. This updates the `emitxterns(Module, File)` method. - When generating signatures for `Variable`(s) ensure that ones which have `isExternal()` return `true` have `extern ...` added to the front og the generated signature. This updates the `generateSignature_Variable(Variable var)` method. 
* DGen - Removed now-completed TODO * Notes - Removed random ass note * Symbols - Documented HashMapper - Removed `.dup` which is no longer needed * HashMapper - Cleaned up * Mappers - Documented
2024-04-08 11:43:22 +01:00
## Resolution
Once the parser has constructed an AST for us, what we have is a tree of nodes with nested nodes, nested nodes of those nodes, and so on. This is great, but we need to be able to search this tree for certain things we may want to find - perhaps by _some predicate_ such as searching by **name**. All of this is made possible by the _resolver_.
### Containers, programs and modules
Before we examine the resolver's API and how to use it, it is worth understanding the main types that play a big role in the resolution process.
The interfaces of importance:
1. `Container`
    * Anything which is a `Container` will have methods allowing one to add `Statement`(s) to its body and retrieve all of said added `Statement`(s)
    * As of recently it also implies that methods from the `MStatementSearchable` and `MStatementReplaceable` interfaces are available as well - but those won't be covered here as they are not important for the base functionality
The concrete types of importance:
1. `Program`
    * A `Program` holds multiple `Module`(s)
    * It is a kind-of `Container` **but NOT** any sort of `Statement` at all
2. `Module`
    * It is a kind-of `Container` and an `Entity`
3. `Entity`
    * You would have seen this earlier, anything which is an entity has a _name_ associated with it
#### Container API
Let us quickly provide a breakdown of what methods the `Container` interface type requires one to have implemented and readily available for usage on the _implementing type_.
| Method | Return type | Description |
|------------------------------|-------------|----------------------------|
| `addStatement(Statement)` | `void` | Appends the given statement to this container's body |
| `addStatements(Statement[])` | `void` | Appends the list of statements (in order) to this container's body |
| `getStatements()` | `Statement[]` | Returns the body of this container |
> We mentioned that the `Container` interface also implements the `MStatementSearchable` and `MStatementReplaceable` interfaces. Those **are** important but their applicability is not within the resolution process at all, so they are excluded from the above method listing.
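To make the table concrete, here is a minimal sketch of what an implementing type could look like. The `Statement` class and `ToyContainer` are hypothetical stand-ins written for this example, not the compiler's real classes, and the sketch assumes the body is stored as a plain dynamic array.

```d
// Hypothetical stand-in for the compiler's real AST node type.
class Statement {}

// The three methods listed in the table above.
interface Container
{
    void addStatement(Statement statement);
    void addStatements(Statement[] statements);
    Statement[] getStatements();
}

// Toy implementing type (an assumption, not a class from the code base):
// the body is simply a dynamic array that is appended to in order.
class ToyContainer : Container
{
    private Statement[] stmts;

    void addStatement(Statement statement)
    {
        this.stmts ~= statement;
    }

    void addStatements(Statement[] statements)
    {
        this.stmts ~= statements;
    }

    Statement[] getStatements()
    {
        return this.stmts;
    }
}

unittest
{
    Container c = new ToyContainer();
    Statement first = new Statement();
    c.addStatement(first);
    c.addStatements([new Statement(), new Statement()]);

    // Insertion order is preserved
    assert(c.getStatements().length == 3);
    assert(c.getStatements()[0] is first);
}
```

Anything that works against the `Container` interface, such as the resolver, only ever needs `getStatements()` to walk a body, which is why the searchable and replaceable interfaces can be left out here.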
#### Program API
The _program_ holds a bunch of _modules_ as its _body statements_ (hence being a `Container` type). A program, unlike a module, is not an `Entity` - meaning it has no name associated with it - **but** it is the root of the AST.
| Method | Return type | Description |
|---------------------------------------|-------------|-----------------------|
| `getModules()` | `Module[]` | Returns the list of all modules which make up this program. |
| `setEntryModule(ModuleEntry, Module)` | `void` | Given a module entry this will assign (map) a module to it. Along with doing this the incoming module shall be added to the body of this `Program` and this module will have its parent set to said `Program`. |
| `markEntryAsVisited(ModuleEntry)` | `void` | Marks the given entry as present. This effectively means simply adding the name of the incoming module entry as a key to the internal map but without it mapping to a module in particular. |
| `isEntryPresent(ModuleEntry)` | `bool` | Check if the given module entry is present. This is based on whether a module entry within the internal map is present which has a name equal to the incoming entry. |
Some of the methods above are related to the `Container`-side of the `Program` type. These methods are useful once the `Program` is already fully constructed, i.e. all parsing has been completed.
The _other_ methods, such as `markEntryAsVisited(ModuleEntry)` and `isEntryPresent(ModuleEntry)`, have to do with the mechanism by which the parser adds new modules to the program during parsing and ensures that no cycles are traversed (i.e. when a module is already being visited it should not be visited again).
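The visit-tracking idea can be sketched in isolation. The sketch below is an assumption-laden miniature: `ToyProgram`, the two-field `ModuleEntry` and the `Module[string]` map are illustrative choices based only on the table's descriptions, and the real `setEntryModule(...)` additionally adds the module to the program's body and parents it, which is left out here.

```d
// Hypothetical miniatures of the real types.
class Module
{
    string name;
    this(string name) { this.name = name; }
}

struct ModuleEntry
{
    string path; // where the module's file lives
    string name; // the module's name
}

class ToyProgram
{
    // Entry name -> module; a `null` value means "visited but not yet parsed"
    private Module[string] modsMap;

    void markEntryAsVisited(ModuleEntry ent)
    {
        // Only reserve the key, never overwrite an already-assigned module
        if ((ent.name in modsMap) is null)
        {
            modsMap[ent.name] = null;
        }
    }

    bool isEntryPresent(ModuleEntry ent)
    {
        return (ent.name in modsMap) !is null;
    }

    void setEntryModule(ModuleEntry ent, Module mod)
    {
        modsMap[ent.name] = mod;
    }

    Module[] getModules()
    {
        Module[] result;
        foreach (Module mod; modsMap)
        {
            if (mod !is null)
            {
                result ~= mod;
            }
        }
        return result;
    }
}

unittest
{
    auto program = new ToyProgram();
    auto entry = ModuleEntry("a.t", "a");

    assert(!program.isEntryPresent(entry));
    program.markEntryAsVisited(entry);

    // A second import of `a` would now be skipped by the parser
    assert(program.isEntryPresent(entry));
    assert(program.getModules().length == 0);

    program.setEntryModule(entry, new Module("a"));
    assert(program.getModules().length == 1);
}
```

Marking an entry before its module exists is what lets a parser break import cycles: a module that imports itself indirectly finds its own entry already present.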
### The _resolver_
Now that we have a good idea of the types involved we can take a look at the API which the resolver has to offer and how it may be used in order to generate names of _entities_ and perform the resolution of _entities_.
Let's first take a look at the constructor that the `Resolver` has:
```d
this
(
    Program program,
    TypeChecker typeChecker
)
```
This constructs a new resolver with the given root program and the type checking instance. This implies you must have performed parsing and constructed a `TypeChecker`, and **only then** can you instantiate a resolver.
### Name resolution
Now that we know how to construct a resolver, let's see what methods it makes available to every component from the `TypeChecker` (as it is constructed here) and onwards.
The first set of methods relates to the name generation of entities in the AST.
| Method | Return type | Description |
|---------------------------|-------------|---------------------------------------|
| `isDescendant(Container, Entity)` | `bool` | Returns `true` if entity `e` is `c` itself or is contained (directly or indirectly) under `c`, `false` otherwise |
| `generateName0(Container, Entity)` | `string[]` | Generates the components of the path from a given entity up to (and including) the given container. The latter implies that the given `Container` must also be a kind-of `Entity` such that a name can be generated from it. |
| `generateNameBest(Entity)`| `string` | Generate the absolute full path of the given entity without specifying which anchor point to use. |
| `generateName(Container, Entity)` | `string` | Given an entity and a container this will generate the entity's full path relative to the given container. If the container is a `Program` then the absolute name of the entity is derived. |
#### How `isDescendant(Container, Entity)` works
The first check we do is an obvious one: check if the provided entity is equal to the provided container, in which case it is a descendant by definition.
```d
/**
 * If they are the same
 */
if (c == e)
{
    return true;
}
```
If this is _not_ the case then we check the ancestral relationship by traversing from the entity upwards.
We start off with this loop variable for our do-while loop:
```d
Entity currentEntity = e;
```
**Steps**:
The process of checking for descendance is now described and the actual implementation will follow.
1. At each iteration we obtain `currentEntity`'s parent by using `parentOf()`, we store this as `parentOfCurrent`
2. _If_ the `parentOfCurrent` is equal to the given container then we exit and return `true`. This is the case whereby the direct parent is found.
3. _If not_, then...
    a. Every other case, use current entity's parent as starting point and keep climbing
    b. If no match is found in the intermediary we will eventually climb to the `Program` node. Since a `Program` _is_ a `Container` but _is **not**_ an `Entity` it will fail to cast and `currentEntity` will be `null`, hence exiting the loop and returning with `false`.
```d
do
{
    gprintln
    (
        format("c isdecsenat: %s", c)
    );
    gprintln
    (
        format("currentEntity: %s", currentEntity)
    );
    Container parentOfCurrent = currentEntity.parentOf();
    gprintln
    (
        format("currentEntity(parent): %s", parentOfCurrent)
    );
    // If the parent of the currentEntity
    // is what we were searching for, then
    // yes, we found it to be a descendant
    // of it
    if(parentOfCurrent == c)
    {
        return true;
    }
    // Every other case, use current entity's parent
    // as starting point and keep climbing
    //
    // This would also be null (and stop the search
    // if we reached the end of the tree in a case
    // where the given container to anchor by is
    // the `Program` BUT was not that of a valid one
    // that actually belonged to the same tree as
    // the starting node. This becomes `null` because
    // remember that a `Program` is not a kind-of `Entity`
    currentEntity = cast(Entity)(parentOfCurrent);
}
while (currentEntity);
return false;
```
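To see the climb in action without the rest of the compiler, the following self-contained sketch rebuilds the same loop on a miniature AST. Every type here (`Statement`, `Entity`, `Program`, `Block`) is a hypothetical stand-in, and identities are compared through `Object` casts as a conservative choice for this toy rather than as a claim about how the real code compares nodes.

```d
// All types below are hypothetical miniatures of the real AST classes.
interface Container {}

class Statement
{
    private Container parent;
    void parentTo(Container newParent) { this.parent = newParent; }
    Container parentOf() { return this.parent; }
}

class Entity : Statement
{
    private string name;
    this(string name) { this.name = name; }
    string getName() { return this.name; }
}

// A Container that is NOT a Statement (and hence not an Entity)
class Program : Container {}

// A Container that IS an Entity (stands in for modules, functions, ...)
class Block : Entity, Container
{
    this(string name) { super(name); }
}

bool isDescendant(Container c, Entity e)
{
    // Compare identities through Object so that interface and class
    // references to the same instance compare as the same
    if (cast(Object) c is cast(Object) e)
    {
        return true;
    }

    Entity currentEntity = e;
    do
    {
        Container parentOfCurrent = currentEntity.parentOf();
        if (parentOfCurrent !is null
            && cast(Object) parentOfCurrent is cast(Object) c)
        {
            return true;
        }

        // Becomes null at a Program (or a missing parent), ending the loop
        currentEntity = cast(Entity) parentOfCurrent;
    }
    while (currentEntity !is null);

    return false;
}

unittest
{
    auto program = new Program();
    auto modA = new Block("a");
    auto modB = new Block("b");
    auto func = new Block("f");
    auto x = new Entity("x");

    modA.parentTo(program);
    modB.parentTo(program);
    func.parentTo(modA);
    x.parentTo(func);

    assert(isDescendant(modA, x));    // indirect: x -> f -> a
    assert(isDescendant(program, x)); // the root contains everything
    assert(isDescendant(modA, modA)); // same node
    assert(!isDescendant(modB, x));   // sibling module
}
```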
#### How `generateNameBest(Entity)` works
The definition of this method is suspiciously simple:
```d
public string generateNameBest(Entity entity)
{
    assert(entity);
    return generateName(this.program, entity);
}
```
So what's going on? Well...
What this will do is call `generateName(Container, Entity)` with the container set to the `Program`, this will therefore cause the intended behavior described above - see the aforementioned method for the reason as to why this works out.
This will climb the AST until it finds the containing `Module` of the given entity and then it will generate the name using that as the anchor - hence giving you the absolute path (because remember, a `Program` has no name, so the next best anchor is the `Module`).
#### How `generateName(Container, Entity)` works
The definition of this method is where the real complexity is housed. This also accounts for how the previous method, `generateNameBest(Entity)`, is implemented.
Firstly we ensure that both arguments are non-`null` with:
```d
assert(relativeTo);
assert(entity);
```
A special case is when the container is a `Program`. In that case the entity's containing `Module` will be found and the name will be generated relative to that. Since `Program`s have no names, such a call gives you the absolute (full) path of the entity within the entire program, as the `Module` is the second highest node in the AST and the first `Entity`-typed object, meaning the first "thing" with a name.
```d
if(cast(Program)relativeTo)
{
    Container potModC = findContainerOfType(Module.classinfo, entity);
    assert(potModC); // Should always be true (unless you butchered the AST)
    Module potMod = cast(Module)potModC;
    assert(potMod); // Should always be true (unless you butchered the AST)
    return generateName(potMod, entity);
}
```
Given an entity and a container this will generate the entity's full path relative to the given container. This means calling `generateName0(Container, Entity)` and then joining each path element with a period.
```d
string[] name = generateName0(relativeTo, entity);
string path;
for (ulong i = 0; i < name.length; i++)
{
    path ~= name[name.length - 1 - i];
    if (i != name.length - 1)
    {
        path ~= ".";
    }
}
return path;
```
Once `path` is calculated we then finally return with it.
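Note that the loop above reads `name` from its last element to its first, so the elements are joined in the reverse of the order in which they were produced. As an illustration only, the same joining step can be written with Phobos' `retro` and `join`; `joinLowestFirst` is a hypothetical helper name for this sketch and is not part of the `Resolver`.

```d
import std.array : join;
import std.range : retro;

// Hypothetical helper (not part of the Resolver): takes path elements
// ordered lowest-first and produces the dotted path highest-first.
string joinLowestFirst(string[] elements)
{
    return elements.retro.join(".");
}

unittest
{
    // Elements as climbed from entity `x` up to module `a`
    assert(joinLowestFirst(["x", "f", "a"]) == "a.f.x");
    assert(joinLowestFirst(["a"]) == "a");
    assert(joinLowestFirst(null) == "");
}
```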
#### How `generateName0(Container, Entity)` works
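Lastly, let's look at how `generateName0(Container relativeTo, Entity entity)` is implemented. The idea behind this method is to generate an array of strings, i.e. `string[]`, which contains the name of each node on the path between the given entity and the given container. As the code below shows, the elements are appended while climbing, so the array runs from the lowest node (the given entity) to the highest node (the container) from left to right; `generateName(Container, Entity)` reverses this order when it joins the elements.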
As mentioned the given container, `relativeTo`, has to be a kind-of `Entity` as well such that a name can be generated for it, hence we ensure that the developer is not misusing it with the first check:
```d
Entity containerEntity = cast(Entity) relativeTo;
assert(containerEntity);
```
**Steps**:
1. The first check we then do is to see whether or not `relativeTo == entity`
    a. _If so_, then we simply return a singular path element of `containerEntity.getName()`
2. The next check is whether or not the given entity is a descendant, either directly or indirectly, of the given container
    a. _If so_, then we begin generating the elements by swimming up the ancestor tree, stopping once the `relativeTo` is reached
3. Lastly, if neither check 1 nor check 2 was true, we return `null` (an empty array)
The above steps are shown now below in their code form:
```d
/**
 * If the Entity and Container are the same then
 * just returns its name
 */
if (relativeTo == entity)
{
    return [containerEntity.getName()];
}
/**
 * If the Entity is contained within the Container
 */
else if (isDescendant(relativeTo, entity))
{
    string[] items;
    Entity currentEntity = entity;
    do
    {
        items ~= currentEntity.getName();
        /**
         * So far all objects we have being used
         * of which are kind-of Containers are also
         * and ONLY also kind-of Entity's hence the
         * cast should never fail.
         *
         * This method is never called with,
         * for example, a `Program` relativeTo.
         */
        assert(cast(Entity) currentEntity.parentOf());
        currentEntity = cast(Entity)(currentEntity.parentOf());
    }
    while (currentEntity != relativeTo);
    /* Add the relative to container */
    items ~= containerEntity.getName();
    return items;
}
/**
 * If not
 */
else
{
    return null;
}
```
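A worked example may help to fix the element order. The sketch below is a simplified model under stated assumptions: the toy `Entity` keeps a direct `Entity` parent (with `null` standing in for the `Program`), and the descendant check is folded into the climb instead of calling `isDescendant(...)` first. It is not the real implementation, only an illustration of the array it produces.

```d
// Hypothetical miniature: every node is an Entity with an Entity parent,
// and a `null` parent stands in for "we reached the Program".
class Entity
{
    string name;
    Entity parent;

    this(string name, Entity parent = null)
    {
        this.name = name;
        this.parent = parent;
    }
}

// Collects names while climbing from `entity` to `relativeTo`.
// Returns `null` when `relativeTo` is never reached.
string[] generateName0(Entity relativeTo, Entity entity)
{
    string[] items;
    Entity current = entity;

    while (current !is null)
    {
        items ~= current.name;
        if (current is relativeTo)
        {
            return items;
        }
        current = current.parent;
    }

    return null;
}

unittest
{
    auto modA = new Entity("a");
    auto modB = new Entity("b");
    auto func = new Entity("f", modA);
    auto x = new Entity("x", func);

    // Lowest node first, anchor last
    assert(generateName0(modA, x) == ["x", "f", "a"]);
    assert(generateName0(func, x) == ["x", "f"]);

    // Same node: a single element
    assert(generateName0(modA, modA) == ["a"]);

    // Not a descendant: empty result
    assert(generateName0(modB, x) is null);
}
```

Reading `["x", "f", "a"]` from right to left gives `a.f.x`, which is the order `generateName(Container, Entity)` emits.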
### Entity resolution
The second set of methods relate to the resolution facilities made available which allow one to search for entities based on various different sorts of custom _predicates_ and by name.
| Method | Return type | Description |
|---------------------------|-------------|---------------------------------------|
| `resolveWithin(Container, Predicate!(Entity), ref Entity[])` | `void` | Performs a horizontal-level search of the given `Container`. Every `Entity` on which the supplied predicate returns a positive verdict is appended to the `ref` parameter. |
| `resolveWithin(Container, Predicate!(Entity))` | `Entity` | Performs a horizontal-level search of the given `Container`, returning a found `Entity` for which the supplied predicate returns a positive verdict. |
| `resolveUp(Container, Predicate!(Entity))` | `Entity` | Performs a horizontal-based search of the given `Container`, returning the first `Entity` found when a positive verdict is returned from having the provided predicate applied to it. If the verdict is `false` then we do not give up immediately but rather recurse up the parental tree searching the container of the current container and applying the same logic. |
| `resolveBest(Container, string)` | `Entity` | This will do a best-effort search for an entity with the given name. The search will start from the given container and perform a search within it; in the case no such entity is found there it will recurse upwards, stopping when it reaches the program-level. This also handles special cases such as dotted-paths, which it can decode and follow to the intended entity. In the case that the container given is a `Program` then each name must either be solely a module name or a dotted-path beginning with one. In this mode nothing else is accepted; it is effectively an absolute downwards search (rather than a potentially upwards one). |
| `findContainerOfType(TypeInfo_Class, Statement)` | `Container` | Given a type-of `Container` and a starting `Statement` (AST node) this will swim upwards to try and find the first matching parent which is of the given type (exactly, not kind-of). |
Only the important methods are covered here. Methods pertaining to single-item returns and predicate generation are not; for those, please examine the source code in `resolution.d`.
#### How resolution _within_ works
The method `resolveWithin(Container, Predicate!(Entity), ref Entity[] collection)` is responsible for providing a facility whereby a given predicate can be applied to all entities available at the immediate level of the given container.
With this understanding one can imagine that the implementation is rather simple:
```d
gprintln
(
    format
    (
        "resolveWithin(cntnr=%s) entered",
        currentContainer
    )
);
Statement[] statements = currentContainer.getStatements();
gprintln
(
    format
    (
        "resolveWithin(cntnr=%s) container has statements %s",
        currentContainer,
        statements
    )
);
foreach(Statement statement; statements)
{
    Entity entity = cast(Entity) statement;
    if(entity)
    {
        if(predicate(entity))
        {
            collection ~= entity;
        }
    }
}
```
Simply iterate over all _statements_ present within the container (immediately, not considering nested ones) and apply the predicate to each one that is an `Entity`. If a match is found then add it to the `collection`, otherwise continue iterating.
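The collector-based search pairs naturally with a closure that produces the predicate. The following sketch assumes that `Predicate!(Entity)` behaves like a `bool delegate(Entity)` (a local alias stands in for the `niknaks` type) and uses hypothetical miniature `Statement` and `Entity` classes; `nameMatch` is an illustrative counterpart to the name-matching predicate derivation, not its actual code.

```d
// Local stand-in for the predicate type (assumed to be delegate-like).
alias Predicate(T) = bool delegate(T);

// Hypothetical miniatures of the AST types.
class Statement {}

class Entity : Statement
{
    private string name;
    this(string name) { this.name = name; }
    string getName() { return this.name; }
}

// Returns a closure which matches entities by name
Predicate!(Entity) nameMatch(string name)
{
    return (Entity e) => e.getName() == name;
}

// Toy version working directly on a body rather than on a Container
void resolveWithin
(
    Statement[] statements,
    Predicate!(Entity) predicate,
    ref Entity[] collection
)
{
    foreach (Statement statement; statements)
    {
        // Non-entities (plain statements) are skipped
        Entity entity = cast(Entity) statement;
        if (entity !is null && predicate(entity))
        {
            collection ~= entity;
        }
    }
}

unittest
{
    Statement[] stmts =
    [
        new Entity("x"),
        new Statement(),
        new Entity("y"),
        new Entity("x")
    ];

    Entity[] found;
    resolveWithin(stmts, nameMatch("x"), found);
    assert(found.length == 2);

    // Any other predicate plugs in the same way
    Entity[] longNames;
    resolveWithin(stmts, (Entity e) => e.getName().length > 1, longNames);
    assert(longNames.length == 0);
}
```

Because the match logic lives in the predicate, callers can search by access modifier, type or anything else without the search loop changing.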
#### How resolving _upwards_ works
The method `resolveUp(Container currentContainer, Predicate!(Entity) predicate)` performs a horizontal-based search of the given `Container`, returning the first `Entity` found when a positive verdict is returned from having the provided predicate applied to it. We can see this below:
```d
/* Try to find the Entity within the current Container */
gprintln
(
    format
    (
        "resolveUp(c=%s, pred=%s)",
        currentContainer,
        predicate
    )
);
Entity entity = resolveWithin(currentContainer, predicate);
gprintln
(
    format
    (
        "resolveUp(c=%s, pred=%s) within-search returned '%s'",
        currentContainer,
        predicate,
        entity
    )
);
/* If we found it return it */
if(entity)
{
    return entity;
}
```
If the verdict is `false` _and_ the `currentContainer` is a kind-of `Program` then it means that there is no further up we can go and we must return `null`:
```d
else if(cast(Program)currentContainer)
{
    gprintln
    (
        format
        (
            "resolveUp(cntr=%s, pred=%s) Entity was not found and we cannot crawl any further up as we are at the Program container now",
            currentContainer,
            predicate
        )
    );
    return null;
}
```
However if the verdict is `false` but the `currentContainer` _is **not**_ a kind-of `Program` then we do not give up immediately but rather recurse up the parental tree searching the container of the current container and applying the same logic.
```d
/**
 * We will ONLY ever have a `Container`
 * here of which is ALSO an `Entity`.
 */
assert(cast(Entity)currentContainer);
Container possibleParent = (cast(Entity) currentContainer).parentOf();
gprintln
(
    format
    (
        "resolveUp(c=%s, pred=%s) cur container typeid: %s",
        currentContainer,
        predicate,
        currentContainer
    )
);
gprintln
(
    format
    (
        "resolveUp(c=%s, pred=%s) possible parent: %s",
        currentContainer,
        predicate,
        possibleParent
    )
);
/* Can we go up */
if(possibleParent)
{
    return resolveUp(possibleParent, predicate);
}
/* If the current container has no parent container */
else
{
    gprintln
    (
        format
        (
            "resolveUp(c=%s, pred=%s) Simply not found ",
            currentContainer,
            predicate
        )
    );
    return null;
}
```
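Putting the three branches together, the upward search can be exercised on a miniature tree. This is a sketch under assumptions: the types are hypothetical stand-ins, `Predicate` is aliased locally to a delegate, the within-search returns the first match, and a defensive `null` return replaces the `assert` on the `Entity` cast.

```d
alias Predicate(T) = bool delegate(T);

// Hypothetical miniatures of the real AST types.
interface Container
{
    Statement[] getStatements();
    void addStatement(Statement statement);
}

class Statement
{
    private Container parent;
    Container parentOf() { return this.parent; }
    void parentTo(Container newParent) { this.parent = newParent; }
}

class Entity : Statement
{
    private string name;
    this(string name) { this.name = name; }
    string getName() { return this.name; }
}

class Program : Container
{
    private Statement[] stmts;
    Statement[] getStatements() { return stmts; }
    void addStatement(Statement s) { stmts ~= s; s.parentTo(this); }
}

class Block : Entity, Container
{
    private Statement[] stmts;
    this(string name) { super(name); }
    Statement[] getStatements() { return stmts; }
    void addStatement(Statement s) { stmts ~= s; s.parentTo(this); }
}

Predicate!(Entity) byName(string name)
{
    return (Entity e) => e.getName() == name;
}

Entity resolveWithin(Container c, Predicate!(Entity) predicate)
{
    foreach (Statement s; c.getStatements())
    {
        Entity e = cast(Entity) s;
        if (e !is null && predicate(e)) { return e; }
    }
    return null;
}

Entity resolveUp(Container c, Predicate!(Entity) predicate)
{
    Entity found = resolveWithin(c, predicate);
    if (found !is null) { return found; }

    // At the Program there is nowhere further up to go
    if (cast(Program) c !is null) { return null; }

    Entity asEntity = cast(Entity) c;
    if (asEntity is null) { return null; } // toy guard instead of assert

    Container parent = asEntity.parentOf();
    return parent is null ? null : resolveUp(parent, predicate);
}

unittest
{
    auto program = new Program();
    auto modA = new Block("a");
    auto g = new Entity("g");
    auto func = new Block("f");
    auto x = new Entity("x");
    program.addStatement(modA);
    modA.addStatement(g);
    modA.addStatement(func);
    func.addStatement(x);

    assert(resolveUp(func, byName("x")) is x);    // found immediately
    assert(resolveUp(func, byName("g")) is g);    // one level up
    assert(resolveUp(func, byName("a")) is modA); // found at the Program
    assert(resolveUp(func, byName("nope")) is null);
}
```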
#### How _best-effort_ resolution works
Best effort resolution is now described in this section. The method of concern for this is `resolveBest(Container c, string name)`.
**Steps**:
1. We first obtain the `path` as a `string[]` by splitting the incoming `name` by any periods present (`.`s)
2. _If_ the container `c` is a kind-of `Program` _then_...
    a. _If_ the `path` is a single element
        i. Search for a module with the name of `path[0]`, returning it if found and `null` otherwise
    b. _If_ the `path` is more than a single element then we take it that `path[0]` is the name of a module, we first search for that
        i. _If **not** found_ we return `null`
        ii. _If found_ we then call `resolveBest(moduleFound, join(path[1..$], '.'))`, so we re-anchor our search based on the module as the container node for the recursive call and the rest of the search path is handed off to the nested call.
3. _If **not**_ and we have a single element in the `path` then we have a few more checks which follow
    a. We check if any of the _modules_ within the current _program_ matches the name
    b. _If_ no match is found _then_ we try to resolve the `name` (in other words `path[0]`) upwards
4. _If_ the `path` has more than one element
    a. _If_ `path[0]` refers to the container entity `c` then...
        i. _If_ there is only one element left, namely, `path[1]`, then we return with the result of calling `resolveWithin(c, path[1])`.
        ii. _If_ there are more than two elements then what we effectively do is these several steps. First, we check that there is an entity at `path[1]` by resolving it against `c` with `resolveWithin(c, path[1])`; if `null` we then return `null`, else we continue and call the found entity `entityNext`. Then we calculate as such: if the path was `x.y.z` then we make a `newPath` containing `y.z`. We now resolve the `newPath` (the `y.z`) against `entityNext` (which we cast to a `Container`, ensuring that this is possible, and call `containerWithin`); this is accomplished with `resolveBest(containerWithin, newPath)`. This sets in motion the path-walking, recursive nature of this part of the algorithm.
    b. _If_ `path[0]` does **not** refer to the container entity `c`, then...
        i. First we check if `path[0]` matches the name of any _module_ attached to the _current program_. If a match is found then we return with a call to `resolveBest(curModule, name)` and let it handle that. We do this so that module names are **always treated as absolute** and hence can always be referenced, unlike other containers which can have duplicate names if distanced away by at least one non-name-sharing container.
        ii. If a module name match _is **not**_ found then we attempt the following. We try to find an entity named by `path[0]` by resolving upwards; if we _do **not**_ find one, we return `null`, _else_ we use the found entity as a container called `con` and then do a `resolveBest(con, name)` in order to try and find it. This effectively is a step to find the nearest anchoring point (as `c` clearly isn't it) and then start the search from there.
The code for this is shown below. Note that it is quite a hefty piece of code but it does after all entail the above process.
```{.d .numberLines}
gprintln
(
format
(
"resolveBest(cntnr=%s, name=%s) Entered",
c,
name
)
);
string[] path = split(name, '.');
assert(path.length); // We must have _something_ here
// In fact this should probably only be
// ...called relative to a Module, there
// are only some cases where it makes sense
// otherwise
if(cast(Program)c)
{
gprintln
(
format
(
"resolveBest: Container is program (%s)",
c
)
);
Program programC = cast(Program)c;
// If you were asking just for the module
// e.g. `simple_module`
//
// Note that this won't consider doing
// a find of the entity in any other module
// if the path = ['g']. The reason for that is
// because a search rooted at the `Program`
// could find such an entity in ANY of the
// modules if we added such support but that
// would be kind of useless
if(path.length == 1)
{
string moduleRequested = name;
foreach(Module curMod; programC.getModules())
{
gprintln
(
format
(
"resolveBest(moduleHorizontal): %s",
curMod
)
);
if(cmp(moduleRequested, curMod.getName()) == 0)
{
return curMod;
}
}
gprintln
(
"resolveBest(moduleHorizontal) We found nothing and will not go down from Program to any Module[]. You probably did a rooted search on the Program for a non-Module entity, didn't ya?",
DebugType.ERROR
);
return null;
}
// If you were asking for some entity
// anchored within a module
// e.g.`simple_module.x`
else
{
// First ensure a valid module name as anchor
string moduleRequested = path[0];
Container anchor;
foreach(Module curMod; programC.getModules())
{
gprintln
(
format
(
"resolveBest(moduleHorizontal): %s",
curMod
)
);
if(cmp(moduleRequested, curMod.getName()) == 0)
{
anchor = curMod;
break;
}
}
// If we found the module
// then do an anchored search
// on the remaining path
if(anchor)
{
string remainingPath = join(path[1..$], ".");
return resolveBest(anchor, remainingPath);
}
// If we could not find the module
else
{
gprintln
(
format
(
"resolveBest(Program root): Could not find module '%s' for ANCHORED access",
moduleRequested
),
DebugType.ERROR
);
return null;
}
}
}
/**
* All objects that implement Container so far
* are also Entities (hence they have a name).
*
* The above is ONLY true except when you
* have a `Program` BUT we handle the case
* whereby `c` is a `Program` above, hence
* meaning that this code is unreachable in
* such a case and therefore safe.
*/
Entity containerEntity = cast(Entity) c;
assert(containerEntity);
gprintln
(
format
(
"resolveBest(cntr=%s,name=%s) path = %s",
c,
name,
path
)
);
/**
* If no dot
*
* Try and find `name` within c
*/
if (path.length == 1)
{
/**
* Check if the name, regardless of container,
* matches any of the roots (modules attached
* to this program)
*/
foreach(Module curModule; this.program.getModules())
{
if(cmp(name, curModule.getName()) == 0)
{
return curModule;
}
}
Entity entityWithin = resolveUp(c, name);
/* If `name` was in container `c` or above it */
if (entityWithin)
{
return entityWithin;
}
/* If `name` was NOT found within container `c` or above it */
else
{
return null;
}
}
else
{
/* If the root is the current container */
if (cmp(path[0], containerEntity.getName()) == 0)
{
/* If only 1 left then just grab it */
if (path.length == 2)
{
Entity entityNext = resolveWithin(c, path[1]);
return entityNext;
}
/* Go deeper */
else
{
string newPath = name[indexOf(name, '.') + 1 .. name.length];
Entity entityNext = resolveWithin(c, path[1]);
/* Only continue if `path[1]` was found within `c` */
if (entityNext)
{
Container containerWithin = cast(Container) entityNext;
/* Only recurse if the found entity is itself a container */
if (containerWithin)
{
return resolveBest(containerWithin, newPath);
}
else
{
return null;
}
}
else
{
return null;
}
}
}
/* We need to search higher */
else
{
/**
* Check if the name is of one of the modules
* attached to the program
*/
foreach(Module curModule; this.program.getModules())
{
if(cmp(curModule.getName(), path[0]) == 0)
{
gprintln
(
format
(
"About to search for name='%s' in module %s",
name,
curModule
)
);
return resolveBest(curModule, name);
}
}
Entity entityFound = resolveUp(c, path[0]);
if (entityFound)
{
Container con = cast(Container) entityFound;
if (con)
{
gprintln(format("resolveBest: Re-anchoring search at container %s", con));
return resolveBest(con, name);
}
else
{
gprintln(format("resolveBest: '%s' is not a container, cannot re-anchor", path[0]));
return null;
}
}
else
{
gprintln(format("resolveBest: Could not find '%s' in or above the container", path[0]));
return null;
}
}
}
```
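To see the non-program branches in isolation, the following is a minimal, self-contained sketch of the same walk. It is not the compiler's code: `Node`, `within`, `upwards`, `findModule` and `best` are hypothetical stand-ins for the entity/container classes, `resolveWithin`, `resolveUp`, the module loop and `resolveBest`, and the list of modules is passed in explicitly rather than read from a `Program`. In this toy every `Node` is a container, so the failed-cast case cannot occur.

```d
import std.array : join, split;

// Hypothetical stand-in for an entity that can also contain other entities
class Node
{
    string name;
    Node parent;      // null for a module
    Node[] children;

    this(string name, Node parent = null)
    {
        this.name = name;
        this.parent = parent;
        if (parent !is null)
        {
            parent.children ~= this;
        }
    }
}

// Assumed behaviour of `resolveWithin`: direct members of `c` only
Node within(Node c, string name)
{
    foreach (Node child; c.children)
    {
        if (child.name == name)
        {
            return child;
        }
    }
    return null;
}

// Assumed behaviour of `resolveUp`: try `c`, then each enclosing container
Node upwards(Node c, string name)
{
    for (Node cur = c; cur !is null; cur = cur.parent)
    {
        Node found = within(cur, name);
        if (found !is null)
        {
            return found;
        }
    }
    return null;
}

// Module names are looked up regardless of the current container
Node findModule(Node[] modules, string name)
{
    foreach (Node m; modules)
    {
        if (m.name == name)
        {
            return m;
        }
    }
    return null;
}

Node best(Node[] modules, Node c, string name)
{
    string[] path = name.split(".");

    // Single element: a module name wins, otherwise search upwards
    if (path.length == 1)
    {
        Node m = findModule(modules, name);
        return m !is null ? m : upwards(c, name);
    }

    // `path[0]` is the current container: step into `path[1]`
    if (path[0] == c.name)
    {
        Node next = within(c, path[1]);
        if (path.length == 2 || next is null)
        {
            return next;
        }
        return best(modules, next, path[1 .. $].join("."));
    }

    // Otherwise re-anchor: first at a module, else at something above `c`
    Node anchor = findModule(modules, path[0]);
    if (anchor is null)
    {
        anchor = upwards(c, path[0]);
    }
    return anchor is null ? null : best(modules, anchor, name);
}

void main()
{
    Node m = new Node("m");
    Node g = new Node("g", m);
    Node outerFn = new Node("outerFn", m);
    Node innerFn = new Node("innerFn", outerFn);
    Node v = new Node("v", innerFn);

    assert(best([m], innerFn, "g") is g);                  // found by searching upwards
    assert(best([m], m, "m.outerFn.innerFn.v") is v);      // recursive walk downwards
    assert(best([m], innerFn, "outerFn.innerFn.v") is v);  // re-anchored at `outerFn`
    assert(best([m], innerFn, "m.g") is g);                // module names are absolute
    assert(best([m], innerFn, "missing.v") is null);
}
```

Note how the re-anchoring branch passes the full `name` on unchanged: once the anchor's own name equals `path[0]`, the nested call takes the "current container" branch and starts walking down.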
#### How finding a container of a concrete type works
It is sometimes useful to be able to find a _container_ of a _given type_. The other methods do not consider this; for them the _container anchoring point_ and the _name_ are already known. There are however cases where one may want to find a _container_ of a certain type given a starting _statement_, and this is what this method provides.
The method definition is shown below:
```d
Container findContainerOfType
(
TypeInfo_Class containerType,
Statement startingNode
)
```
**Steps**:
1. _If_ the `startingNode` _is_ `null` then we return with `null`
2. _If_ `typeid(startingNode)`, that is the actual (runtime) type of `startingNode`, is equal to `containerType` then we return the `startingNode` cast to a `Container`. This is a match on the first call with no swimming upwards.
3. _Else_ we find the _parent of_ the `startingNode` and recurse into this method using `findContainerOfType(containerType, cast(Statement)startingNode.parentOf())`. That is, we find the starting node's parent and re-apply the same logic, swimming up in the hope of finding a match somewhere above.
This is a relatively simple algorithm and the implementation is shown below:
```{.d .numberLines}
// If the given AST object is null, return null.
// This is checked first so that the debug output
// below never dereferences a null reference.
if(startingNode is null)
{
return null;
}
gprintln
(
format
(
"findContainerOfType(TypeInfo_Class, Statement): StmtStart: %s",
startingNode
)
);
gprintln
(
format
(
"findContainerOfType(TypeInfo_Class, Statement): StmtStart (type): %s",
startingNode.classinfo
)
);
// If the given AST object's type is of the type given
if(typeid(startingNode) == containerType)
{
// Sanity check: You should not be calling
// with a TypeInfo_Class referring to a non-`Container`
assert(cast(Container)startingNode);
return cast(Container)startingNode;
}
// If not, swim up to the parent
else
{
gprintln
(
format
(
"parent of %s is %s",
startingNode,
startingNode.parentOf()
)
);
return findContainerOfType(containerType, cast(Statement)startingNode.parentOf());
}
```
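One consequence of comparing with `typeid` is that the match is on the exact runtime class, so a subclass of the requested type is skipped over. The following self-contained sketch illustrates the swim-up and this exact-match behaviour; `Stmt`, `Mod`, `Func`, `Lambda`, `Var` and `findOfType` are hypothetical names made up for the illustration and are not the compiler's AST classes.

```d
// Hypothetical stand-ins for AST node types; not the compiler's classes
class Stmt
{
    Stmt parent;
    this(Stmt parent = null) { this.parent = parent; }
}
class Mod : Stmt { this() { super(null); } }
class Func : Stmt { this(Stmt parent) { super(parent); } }
class Lambda : Func { this(Stmt parent) { super(parent); } }
class Var : Stmt { this(Stmt parent) { super(parent); } }

// Same three steps as above: null check, exact type match, else swim up
Stmt findOfType(TypeInfo_Class wanted, Stmt start)
{
    if (start is null)
    {
        return null;
    }
    if (typeid(start) == wanted)
    {
        return start;
    }
    return findOfType(wanted, start.parent);
}

void main()
{
    Mod m = new Mod();
    Func f = new Func(m);
    Var v = new Var(f);

    assert(findOfType(typeid(Func), v) is f);    // one step up
    assert(findOfType(typeid(Mod), v) is m);     // two steps up
    assert(findOfType(typeid(Var), v) is v);     // match on the first call
    assert(findOfType(typeid(Var), m) is null);  // ran out of parents

    // `Lambda` derives from `Func`, but its typeid differs, so it is skipped
    Lambda l = new Lambda(m);
    Var w = new Var(l);
    assert(findOfType(typeid(Func), w) is null);
    assert(findOfType(typeid(Lambda), w) is l);
}
```

If subclass matches were wanted instead, a `cast` to the target type would be the usual alternative, but that would be a different technique from the one the method uses.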
### Worked examples
Given a program with a single module `resolution_test_1` as follows:
```d
string sourceCode = `
module resolution_test_1;
int g;
`;
```
We then set up such a relationship (for the sake of the test):
```d
File dummyFile;
Compiler compiler = new Compiler(sourceCode, "legitidk.t", dummyFile);
compiler.doLex();
compiler.doParse();
Program program = compiler.getProgram();
// There is only a single module in this program
Module modulle = program.getModules()[0];
/* Module name must be resolution_test_1 */
assert(cmp(modulle.getName(), "resolution_test_1")==0);
TypeChecker tc = new TypeChecker(compiler);
```
We first try and search for an entity named `g` using the program as the anchoring container:
```d
// Now try to find the variable `g` by starting at the program-level
// this SHOULD fail as it should NOT be allowed
Entity var = tc.getResolver().resolveBest(program, "g");
assert(var is null);
```
This lookup returns `null` because a search anchored at the program-level can only resolve a module name or a name of the form `<moduleName>.<entity... `, hence the `assert(var is null)`.
After this we then try to find the variable `g` by starting at the module-level:
```d
// Try to find the variable `g` by starting at the module-level
var = tc.getResolver().resolveBest(modulle, "g");
assert(var);
assert(cast(Variable)var); // Ensure it is a variable
```
This passes, compared to the last, because the search is anchored at a non-program container and there is an entity named `"g"` within the module `modulle`.
After this we should be able to do a search rooted at the program-level, provided that the name being looked up is that of a module:
```d
Entity myModule = tc.getResolver().resolveBest(program, "resolution_test_1");
assert(myModule);
assert(cast(Module)myModule); // Ensure it is a Module
```
This _passes_ because, as stated earlier, only module names (and dotted-paths starting with them) are allowed when using `resolveBest` with a program as the anchoring container.
We then do some tests with descendancy:
```d
// The `g` should be a descendant of the module and the module of the program
assert(tc.getResolver().isDescendant(cast(Container)myModule, var));
assert(tc.getResolver().isDescendant(cast(Container)program, myModule));
```
We can also do a full path resolution using a _dotted-path_, as we alluded to earlier. In this case we resolve using the program as the anchoring container and request resolution for the name `"resolution_test_1.g"`:
```d
// Lookup `resolution_test_1.g` but anchored from the `Program`
Entity varAgain = tc.getResolver().resolveBest(program, "resolution_test_1.g");
assert(varAgain);
assert(cast(Variable)varAgain); // Ensure it is a Variable
```
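The three program-anchored lookups above (`"g"`, `"resolution_test_1"` and `"resolution_test_1.g"`) can be summarised with a toy model. This is only a sketch under simplifying assumptions: modules are represented as an associative array from module name to member names, only one level of members is modelled, and `lookupFromProgram` is a hypothetical helper that returns the matched name rather than an `Entity`.

```d
import std.algorithm.searching : canFind;
import std.array : split;

// Hypothetical helper: returns the name that was matched, or null if the
// program-rooted lookup fails. `modules` maps module name -> member names.
string lookupFromProgram(string[][string] modules, string name)
{
    string[] path = name.split(".");

    // The first path element must always name a module
    auto members = path[0] in modules;
    if (members is null)
    {
        return null;
    }

    // A bare module name resolves to the module itself
    if (path.length == 1)
    {
        return path[0];
    }

    // Otherwise the remainder is looked up inside that module
    // (toy model: a single level of members only)
    if (path.length == 2 && (*members).canFind(path[1]))
    {
        return name;
    }
    return null;
}

void main()
{
    string[][string] modules = ["resolution_test_1": ["g"]];

    // A bare entity name is never searched for inside the modules
    assert(lookupFromProgram(modules, "g") is null);

    // A module name, or a dotted-path starting with one, does resolve
    assert(lookupFromProgram(modules, "resolution_test_1") == "resolution_test_1");
    assert(lookupFromProgram(modules, "resolution_test_1.g") == "resolution_test_1.g");
}
```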
---
The last few tests relate to name generation and, similarly, use differing anchoring points and methods:
```d
// Generate the name from the program as the anchor
string nameFromProgram = tc.getResolver().generateName(program, var);
gprintln(format("nameFromProgram: %s", nameFromProgram));
assert(nameFromProgram == "resolution_test_1.g");
// Generate the name from the module as the anchor (should be same as above)
string nameFromModule = tc.getResolver().generateName(cast(Container)myModule, var);
gprintln(format("nameFromModule: %s", nameFromModule));
assert(nameFromModule == "resolution_test_1.g");
// Generate absolute path of the entity WITHOUT an anchor point
string bestName = tc.getResolver().generateNameBest(var);
gprintln(format("bestName: %s", bestName));
assert(bestName == "resolution_test_1.g");
```
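One way to picture why all three results agree is to assume that an entity's absolute name is formed by joining its own name with the names of its enclosing containers, up to and including its module. The sketch below illustrates only that assumption; `Named` and `absoluteName` are hypothetical and say nothing about how `generateName` actually treats its anchor argument.

```d
import std.array : join;

// Hypothetical stand-in for a named entity with a parent container
class Named
{
    string name;
    Named parent;   // null for a module

    this(string name, Named parent = null)
    {
        this.name = name;
        this.parent = parent;
    }
}

// Walk from the entity up to its module, prepending each name on the way
string absoluteName(Named entity)
{
    string[] parts;
    for (Named cur = entity; cur !is null; cur = cur.parent)
    {
        parts = [cur.name] ~ parts;
    }
    return parts.join(".");
}

void main()
{
    Named mod = new Named("resolution_test_1");
    Named g = new Named("g", mod);

    assert(absoluteName(g) == "resolution_test_1.g");
    assert(absoluteName(mod) == "resolution_test_1");
}
```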