Diffstat (limited to 'doc/future-designs')
-rw-r--r-- | doc/future-designs/computed-roots.org | 154 |
1 files changed, 154 insertions, 0 deletions
diff --git a/doc/future-designs/computed-roots.org b/doc/future-designs/computed-roots.org
new file mode 100644
index 00000000..d3c355ce
--- /dev/null
+++ b/doc/future-designs/computed-roots.org
@@ -0,0 +1,154 @@
+* Computed roots
+
+** Status quo
+
+As of version ~1.0.0~, the ~just~ build tool requires the repository
+configuration, including all roots, to be specified ahead of time.
+This has a couple of consequences.
+
+*** Flexible source views, thanks to staging
+
+For source files, the flexibility of using them in a layout different
+from how they occur in the source tree is gained through staging.
+If a different view of sources is needed, instead of a source
+target, a defined target can be used that rearranges the sources as
+desired. In this way, programmatic transformations of source
+files can also be carried out (while the result is still visible at
+the original location), as is done, e.g., by the ~["patch", "file"]~
+rule of the ~just~ main repository.
+
+*** Restricted flexibility in target-definitions via globbing
+
+When defining targets, the general principle is that the definition
+of target and action graph only depends on the description (given by
+the target files, the rules and expressions, and the configuration).
+There is, however, a single exception to that rule: a target file
+may use the ~GLOB~ built-in construct and in this way depend on
+the index of the respective source directory. This allows, e.g.,
+defining a separate action for every source file and, in this
+way, getting good incrementality and parallelism, while still having
+a concise target description.
+
+*** Modularity in rules through expressions
+
+Rules might share common tasks. For example, for both ~C~ binaries
+and ~C~ libraries, the source files have to be compiled to object
+files. To avoid duplication of descriptions, expressions can be
+called (also from expressions themselves).
+
+** Use cases that require more flexibility
+
+*** Generated target files
+
+Sometimes projects (or parts thereof that can form a separate
+logical repository) have a simple structure. For example, there is
+a list of directories and for each one there is a library, named
+and staged in a systematic way. Repeating all those systematic
+target files seems unnecessary work. Instead, we could store the
+list of directories to consider and a small script containing the
+naming/staging/globbing logic; this approach would also be more
+maintainable. A similar approach could also be attractive for a
+directory tree with tests where, on top, all the individual tests
+should be collected into test suites.
+
+*** Staging according to embedded information
+
+For importing prebuilt libraries, it is sometimes desirable to
+stage them in a way honoring the embedded ~soname~. The current
+approach is to provide that information out of band in the target
+file, so that it can be used during analysis. Still, the information
+is already present in the prebuilt binary, so duplicating it causes
+unnecessary maintenance overhead; instead, the target file could be
+a function of that library which can form its own content-fixed root
+(e.g., a ~git tree~ root), so that the computed value is easily cachable.
+
+*** Simplified rule definition and alternative syntax
+
+Rules can share computation through expressions. However, the
+interface deliberately has to be explicit, including the documentation
+strings that are used by ~just describe~. While this allows easy
+and efficient implementation of ~just describe~, there is some
+redundancy involved, as often fields are only there to be used by
+a common expression, but these have to be documented in a redundant
+way (causing additional maintenance burden).
+
+Moreover, the JSON encoding of abstract syntax trees is an unambiguously
+readable format that is easy to process automatically,
+but people argue that it is hard to write by hand.
+However, it is unlikely that agreement will be reached on which syntax
+is best to use. Now, if rule and expression files could be generated,
+this argument would not be necessary. Moreover, rules are typically
+versioned and infrequently changed, so the step of generating the
+official syntax from the convenient one would typically be in cache.
+
+** Proposal: Support computed roots
+
+We propose computed roots as a clean principle to add the needed (and
+a lot more) flexibility for the described use cases, while ensuring
+that all computations of roots are properly cachable at a high level.
+In this way, we do not compromise efficient builds, as the price of
+the additional flexibility, in the typical case, is just a single
+cache lookup. Of course, it is up to the user to ensure that this
+case really is the typical one, in the same way as it is their
+responsibility to describe the targets in a way that yields proper
+incrementality.
+
+*** New root type ~"computed"~
+
+The ~just~ multi-repository configuration will allow a new type
+of root (besides ~"file"~ and ~"git tree"~ and variants thereof),
+called ~"computed"~. A ~"computed"~ root is given by
+- the (global) name of a repository,
+- the name of a target (in ~["module", "target"]~ format), and
+- a configuration (as a JSON object, taken literally).
+It is a requirement that the specified target is an ~"export"~
+target and the specified repository is content-fixed; ~"computed"~ roots
+are considered content-fixed. However, the dependency structure of
+computed roots must be cycle free. In other words, there must exist
+an ordering of computed roots (the implicit topological order, not
+a declared one) such that for each computed root, the referenced
+repository as well as all repositories reachable from that one
+via the ~"bindings"~ map only contain computed roots earlier in
+that order.
+
+*** Strict evaluation of roots as artifact tree
+
+The building of required computed roots happens in topological order;
+the build of the defining target of a root is, in principle (subject
+to a user-defined restriction of parallelism), started as soon as all
+roots in the repositories reachable via bindings are available. The
+root is then considered the artifact tree of the defining target.
+
+In particular, the evaluation is strict: all roots of reachable
+repositories have to be successfully computed before the evaluation
+is started, even if it later turns out that one of these roots is
+never accessed in the computation of the defining target. The reason
+for this strictness requirement is to ensure that the cache key for
+target-level caching can be computed ahead of time (and we expect
+the entry to be in target-level cache most of the time anyway).
+
+*** Intensional equality of computed roots
+
+During a build, each computed root is evaluated only once, even
+if required in several places. Two computed roots are considered
+equal if they are defined in the same way, i.e., repository name,
+target, and configuration agree. The repository or layer using the
+computed root is not part of the root definition.
+
+*** Computed roots available to the user
+
+As computed roots are defined by export targets, the respective
+artifacts are stored in the local CAS anyway. Additionally, the
+tree that forms the root will be added to CAS as well. Moreover,
+an option will be added to specify a log file that contains, in
+a machine-readable way, all the tree identifiers of all computed
+roots used in this build, together with their definition.
+
+*** ~just-mr~ to support computed roots
+
+To allow easily setting up a ~just~ configuration using computed
+roots, ~just-mr~ will allow a repository type ~"computed"~ with the
+same parameters as a computed root. These repositories can be used
+as roots, like any other ~just-mr~ repository type.
+When generating the ~just~ multi-repository configuration, the
+definition of a ~"computed"~ repository is simply forwarded as a computed root.
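
The ordering and equality rules proposed in this document can be illustrated with a minimal Python sketch. It is not part of the patch and does not reflect how ~just~ implements or will implement computed roots. The data model is an assumption made for illustration only: a dictionary mapping repository names to descriptions with a ~"roots"~ list and a ~"bindings"~ map, and computed roots written as the tuple ~("computed", repo, target, config)~. The helper names ~root_key~, ~reachable~, and ~evaluation_order~ are hypothetical as well.

```python
import json
from graphlib import CycleError, TopologicalSorter


def root_key(repo, target, config):
    """Identity of a computed root: repository name, target, configuration.

    The repository or layer *using* the root is deliberately not part
    of the key, so the same definition used twice yields one key.
    """
    return (repo, tuple(target), json.dumps(config, sort_keys=True))


def reachable(repos, start):
    """Names of all repositories reachable from `start` via "bindings"
    (including `start` itself)."""
    seen, stack = set(), [start]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(repos[name].get("bindings", {}).values())
    return seen


def evaluation_order(repos):
    """Return the computed-root keys in an order in which they could be built.

    A computed root depends on every computed root occurring in the
    referenced repository or in a repository reachable from it.
    Raises graphlib.CycleError if the dependency structure has a cycle.
    """
    graph = {}
    for desc in repos.values():
        for root in desc.get("roots", []):
            if root[0] != "computed":
                continue
            _, repo, target, config = root
            deps = set()
            for name in reachable(repos, repo):
                for other in repos[name].get("roots", []):
                    if other[0] == "computed":
                        deps.add(root_key(*other[1:]))
            graph[root_key(repo, target, config)] = deps
    return list(TopologicalSorter(graph).static_order())
```

In this sketch, two repositories that use a computed root with the same repository name, target, and configuration contribute a single node to the graph, which mirrors the "evaluated only once" rule. A configuration that violates the cycle-freeness requirement is rejected before any build would start, because ~static_order~ raises ~CycleError~.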