diff options
author | Klaus Aehlig <klaus.aehlig@huawei.com> | 2025-01-23 10:17:08 +0100 |
---|---|---|
committer | Klaus Aehlig <klaus.aehlig@huawei.com> | 2025-01-23 11:07:27 +0100 |
commit | bbfd7d286ee5ed4e3caceb20d9debb0f971adb19 (patch) | |
tree | 252d64357d5a52a4e880e36a622937b8ed0099c7 /doc/concepts/computed-roots.md | |
parent | fc9f622cba6b6671f5b5f0371de9bf31ae75c7d1 (diff) | |
download | justbuild-bbfd7d286ee5ed4e3caceb20d9debb0f971adb19.tar.gz |
Document computed roots as implemented concept
... rather than as future design. While there, also add target-level
caching as a service to the list of documentation pages.
Diffstat (limited to 'doc/concepts/computed-roots.md')
-rw-r--r-- | doc/concepts/computed-roots.md | 193 |
1 files changed, 193 insertions, 0 deletions
diff --git a/doc/concepts/computed-roots.md b/doc/concepts/computed-roots.md new file mode 100644 index 00000000..171c7648 --- /dev/null +++ b/doc/concepts/computed-roots.md @@ -0,0 +1,193 @@ +Computed roots +============== + +Use cases for computed build descriptions +----------------------------------------- + +### Generated target files + +Sometimes projects (or parts thereof that can form a separate logical +repository) have a simple structure. For example, there is a list of +directories and for each one there is a library, named and staged in a +systematic way. Repeating all those systematic target files seems +unnecessary work. Instead, we could store the list of directories to +consider and a small script containing the naming/staging/globbing +logic; this approach would also be more maintainable. A similar approach +could also be attractive for a directory tree with tests where, on top, +all the individual tests should be collected to test suites. + +### Staging according to embedded information + +For importing prebuilt libraries, it is sometimes desirable to stage +them in a way honoring the embedded `soname`. The current approach is to +provide that information out of band in the target file, so that it can +be used during analysis. Still, the information is already present in +the prebuilt binary, causing unnecessary maintenance overhead; instead, +the target file could be a function of that library which can form its +own content-fixed root (e.g., a `git tree` root), so that the computed +value is easily cacheable. + +### Simplified rule definition and alternative syntax + +Rules can share computation through expressions. However, the interface, +deliberately has to be explicit, including the documentation strings +that are used by `just describe`. While this allows easy and efficient +implementation of `just describe`, there is some redundancy involved, as +often fields are only there to be used by a common expression, but this +have to be documented in a redundant way (causing additional maintenance +burden). + +Moreover, using JSON encoding of abstract syntax trees is an +unambiguously readable and easy to automatically process format, but +people argue that it is hard to write by hand. However, it is unlikely +to get agreement on which syntax is best to use. Now, if rule and +expression files can be generated, this argument is not +necessary. Moreover, rules are typically versioned and infrequently +changed, so the step of generating the official syntax from the +convenient one would typically be in cache. + +Root types depending on computation +----------------------------------- + +There are two additional types of roots that are defined through +computation. They allow a clean principle to add the needed (and a +lot more) flexibility for the described use cases, while ensuring that +all computations of roots are properly cacheable at high level. In this +way, we do not compromise efficient builds, as the price of the +additional flexibility; in the typical case, is just a single cache +lookup. Of course, it is up to the user to ensure that this case really +is the typical one, in the same way as it is their responsibility to +describe the targets in a way to have proper incrementality. + +### Root type `"computed"` + +The `just` multi-repository configuration allows a type of root, +called `"computed"`. A `"computed"` root is given by + + - the (global) name of a repository + - the name of a target (in `["module", "target"]` format), and + - a configuration (as JSON object, taken literally). + +It is a requirement that the specified target is an `"export"` target +and the specified repository content-fixed; `"computed"` roots are +considered content-fixed. However, the dependency structure of computed +roots must be cycle free. In other words, there must exist an ordering +of computed roots (the implicit topological order, not a declared one) +such that for each computed root, the referenced repository as well as +all repositories reachable from that one via the `"bindings"` map only +contain computed roots earlier in that order. + +### Root type `"tree structure"` + +In the described use case of generated target files, the tree of +target files only depends on the structure of the workspace root. To +avoid unnecessary actions, an additional root type is defined, +that of a `"tree structure"`. Such a root is given by precisely +one root. It evaluates to that root but with all files replaced +by empty files. Obviously, this computation can be done without +spawning actions and is cachable. + +The serve functionality also allows to answer queries for the +tree structure of a given tree known to serve. + +### Strict evaluation of roots as artifact tree + +The building of required computed roots happens in topological order; +the build of the defining target of a root is, in principle (subject to +a user-defined restriction of parallelism) started as soon as all roots +in the repositories reachable via bindings are available. The root is +then considered the artifact tree of the defining target. + +In particular, the evaluation is strict: all roots of reachable +repositories have to be successfully computed before the evaluation is +started, even if it later turns out that one of these roots is never +accessed in the computation of the defining target. The reason for this +strictness requirement is to ensure that the cache key for target-level +caching can be computed ahead of time (and we expect the entry to be in +target-level cache most of the time anyway). + +### Intensional equality of computed roots + +During a build, each computed root is evaluated only once, even if +required in several places. Two computed roots are considered equal, if +they are defined in the same way, i.e., repository name, target, and +configuration agree. The repository or layer using the computed root is +not part of the root definition. Similarly, two tree-structure roots +are equal if the defining roots are equal. + +### Evaluation through serve endpoint preferred + +When determining the value of a computed root, as for every export +target, the provided serve endpoint (if any) is consulted first. +Only if it is not aware of the root, a local evaluation is carried +out. This strategy is also applied for tree-stucture roots. + +### `just-mr` support for computed roots + +To allow simply setting up a `just` configuration using computed +roots, `just-mr` allows a repository type `"computed"` with the same +parameters as a computed root, as well as a repository type `"tree +structure"` with another root as parameter. These repositories can +be used as roots, like any other `just-mr` repository type. When +generating the `just` multi-repository configuration, the definition +of a `"computed"` repository is just forwarded as computed root. + +### Computed roots and `just serve` + +Due to the presence of `just serve`, roots can be absent. This +affects computed roots in two ways, + - roots, in particular the target roots, of the repository referred + to can be absent, and + - a computed root can be absent itself. +The latter has to be supported, as dependencies that should be +delegated to `just serve` might contain computed roots themselves. +In this case, we consider it acceptable to have one round of talking +back and forth with the serve instance per computed root involved, +however we do not want to fetch the artifacts of those intermediate +roots. After all, whole point of the serve service was to use +dependencies without having them locally. + +#### Sytnax for absent computed roots + +As for other roots, we let the user specify which roots are to be +absent. Tools like `just-import-git` will extend their marking of absent +dependencies (e.g., by the option `--absent` of `just-import-git`) +to computed roots as well. + +In a `just-mr` repository config, `"pragma": {"absent": true}` can +be used for computed roots as well. Also `just-mr` will also honor +the passed absent specification (via `--absent` or implicitly via +the rc file) for computed roots the same way as for other roots. + +In a `just` repository config, computed roots are given by the +tuple `["computed", <repository>, <module>, <target>, <config>]`. +Optionally, an additional entry can be added; that entry has to be +an object. A computed root is absent if that additional argument +is present and contains an entry for the value `"absent"` that +is `true`. E.g., `["computed", "base", "", "", {}]` is a concrete +computed root and `["computed", "base", "", "", {}, {"absent": +true}]` is the same computed root considered absent. + +#### Evaluation of computed roots in connection with absent roots + +If a computed root is absent then, in native mode, regardless of whether +the base repository is absent or not, + - serve will be asked for the result, and + - from the result the tree identifier of the root will be computed + in memory and the root set to that value, as absent. + +If a concrete computed root refers to a base repository with absent +target root, + - the client will ask serve about the flexible variables of the + specified target, and + - with this information will compute locally the cache key and + inspect the local target-level cache. If not there, the root will + be built, installed to a local temporary directory and imported + into the git cas. + +In the remaining case of a concrete computed root with concrete +target root of the referred base repository, the cache key can be +computed locally and a local check for a cache hit can be performed; +in this way, unnecessary IO-operations are avoided. If no cache +hit is found, the target will be built, installed to a temporary +directory and imported into the git cas. |