Diffstat (limited to 'doc/future-designs/computed-roots.md')
-rw-r--r-- doc/future-designs/computed-roots.md | 156
1 files changed, 156 insertions, 0 deletions
diff --git a/doc/future-designs/computed-roots.md b/doc/future-designs/computed-roots.md
new file mode 100644
index 00000000..8bbff401
--- /dev/null
+++ b/doc/future-designs/computed-roots.md
@@ -0,0 +1,156 @@
+Computed roots
+==============
+
+Status quo
+----------
+
+As of version `1.0.0`, the `just` build tool requires the repository
+configuration, including all roots, to be specified ahead of time. This
+has a couple of consequences.
+
+### Flexible source views, thanks to staging
+
+For source files, the flexibility of using them in a layout different
+from how they occur in the source tree is gained through staging. If a
+different view of sources is needed, instead of a source target, a
+defined target can be used that rearranges the sources as desired. In
+this way, programmatic transformations of source files can also be
+carried out (while the result is still visible at the original
+location), as is done, e.g., by the `["patch", "file"]` rule of the
+`just` main repository.
+
+### Restricted flexibility in target-definitions via globbing
+
+When defining targets, the general principle is that the definition of
+target and action graph only depends on the description (given by the
+target files, the rules and expressions, and the configuration). There
+is, however, a single exception to that rule: a target file may use the
+`GLOB` built-in construct and in this way depend on the index of the
+respective source directory. This allows one, e.g., to define a separate
+action for every source file and, in this way, get good incrementality
+and parallelism, while still having a concise target description.
+
+### Modularity in rules through expressions
+
+Rules might share common tasks. For example, for both `C` binaries and
+`C` libraries, the source files have to be compiled to object files. To
+avoid duplication of descriptions, expressions can be called (also from
+expressions themselves).
+
+Use cases that require more flexibility
+---------------------------------------
+
+### Generated target files
+
+Sometimes projects (or parts thereof that can form a separate logical
+repository) have a simple structure. For example, there is a list of
+directories and for each one there is a library, named and staged in a
+systematic way. Repeating all those systematic target files seems like
+unnecessary work. Instead, we could store the list of directories to
+consider and a small script containing the naming/staging/globbing
+logic; this approach would also be more maintainable. A similar approach
+could also be attractive for a directory tree with tests where, on top,
+all the individual tests should be collected to test suites.
+
+### Staging according to embedded information
+
+For importing prebuilt libraries, it is sometimes desirable to stage
+them in a way honoring the embedded `soname`. The current approach is to
+provide that information out of band in the target file, so that it can
+be used during analysis. Still, the information is already present in
+the prebuilt binary, so duplicating it causes unnecessary maintenance
+overhead; instead, the target file could be a function of that library,
+which can form its own content-fixed root (e.g., a `git tree` root), so
+that the computed value is easily cacheable.
+
+### Simplified rule definition and alternative syntax
+
+Rules can share computation through expressions. However, the interface
+deliberately has to be explicit, including the documentation strings
+that are used by `just describe`. While this allows easy and efficient
+implementation of `just describe`, there is some redundancy involved, as
+often fields are only there to be used by a common expression, but these
+have to be documented in a redundant way (causing additional maintenance
+burden).
+
+Moreover, the JSON encoding of abstract syntax trees is unambiguous to
+read and easy to process automatically, but people argue that it is hard
+to write by hand. However, agreement on which syntax is best to use is
+unlikely. Now, if rule and expression files could be generated, this
+argument would not be necessary. Moreover, rules are typically versioned
+and infrequently changed, so the step of generating the official syntax
+from the convenient one would typically be in cache.
+
+Proposal: Support computed roots
+--------------------------------
+
+We propose computed roots as a clean principle to add the needed (and a
+lot more) flexibility for the described use cases, while ensuring that
+all computations of roots are properly cacheable at high level. In this
+way, we do not compromise efficient builds, as the price of the
+additional flexibility, in the typical case, is just a single cache
+lookup. Of course, it is up to the user to ensure that this case really
+is the typical one, in the same way as it is their responsibility to
+describe the targets in a way to have proper incrementality.
+
+### New root type `"computed"`
+
+The `just` multi-repository configuration will allow a new type of root
+(besides `"file"` and `"git tree"` and variants thereof), called
+`"computed"`. A `"computed"` root is given by
+
+ - the (global) name of a repository,
+ - the name of a target (in `["module", "target"]` format), and
+ - a configuration (as JSON object, taken literally).
+
+It is a requirement that the specified target is an `"export"` target
+and that the specified repository is content-fixed; `"computed"` roots
+are themselves considered content-fixed. However, the dependency
+structure of computed roots must be cycle-free. In other words, there
+must exist an ordering of computed roots (the implicit topological
+order, not a declared one) such that for each computed root, the
+referenced repository as well as all repositories reachable from that
+one via the `"bindings"` map only contain computed roots earlier in that
+order.
+
+### Strict evaluation of roots as artifact tree
+
+The building of required computed roots happens in topological order;
+the build of the defining target of a root is, in principle (subject to
+a user-defined restriction of parallelism), started as soon as all roots
+in the repositories reachable via bindings are available. The root is
+then considered the artifact tree of the defining target.
+
+In particular, the evaluation is strict: all roots of reachable
+repositories have to be successfully computed before the evaluation is
+started, even if it later turns out that one of these roots is never
+accessed in the computation of the defining target. The reason for this
+strictness requirement is to ensure that the cache key for target-level
+caching can be computed ahead of time (and we expect the entry to be in
+target-level cache most of the time anyway).
+
+### Intensional equality of computed roots
+
+During a build, each computed root is evaluated only once, even if
+required in several places. Two computed roots are considered equal if
+they are defined in the same way, i.e., repository name, target, and
+configuration agree. The repository or layer using the computed root is
+not part of the root definition.
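+
+For example, consider the following fragment of a hypothetical
+repository configuration. As the proposal does not fix a concrete
+encoding, the list form of the computed roots is an assumption, and all
+repository, target, and variable names are placeholders.
+
+```json
+{ "repositories":
+  { "a":
+    { "workspace_root":
+      ["computed", "base", "", "generated", {"COUNT": "10"}]
+    }
+  , "b":
+    { "target_root":
+      ["computed", "base", "", "generated", {"COUNT": "10"}]
+    }
+  , "c":
+    { "workspace_root":
+      ["computed", "base", "", "generated", {"COUNT": "11"}]
+    }
+  }
+}
+```
+
+The roots used by `"a"` and `"b"` agree in repository name, target, and
+configuration, so they are the same root and would be evaluated only
+once, even though they are used by different repositories and as
+different layers. The root used by `"c"` differs in its configuration
+and is therefore a distinct computed root.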
+
+### Computed roots available to the user
+
+As computed roots are defined by export targets, the respective
+artifacts are stored in the local CAS anyway. Additionally, the tree
+that forms the root will be added to CAS as well. Moreover, an option
+will be added to specify a log file that contains, in a machine-readable
+way, all the tree identifiers of all computed roots used in this build,
+together with their definition.
+
+### `just-mr` to support computed roots
+
+To make it easy to set up a `just` configuration using computed roots,
+`just-mr` will allow a repository type `"computed"` with the same
+parameters as a computed root. These repositories can be used as roots,
+like any other `just-mr` repository type. When generating the `just`
+multi-repository configuration, the definition of a `"computed"`
+repository is simply forwarded as a computed root.
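+
+A `just-mr` configuration along these lines might look as follows. This
+is a sketch under assumptions: the field names `"repo"`, `"target"`,
+and `"config"` of the `"computed"` repository type are not fixed by
+this proposal, and the repository names, URL, branch, commit
+identifier, target, and configuration variable are placeholders.
+
+```json
+{ "main": "derived"
+, "repositories":
+  { "base":
+    { "repository":
+      { "type": "git"
+      , "repository": "https://example.org/base.git"
+      , "branch": "main"
+      , "commit": "<commit-id>"
+      }
+    }
+  , "derived":
+    { "repository":
+      { "type": "computed"
+      , "repo": "base"
+      , "target": ["", "generated"]
+      , "config": {"COUNT": "10"}
+      }
+    }
+  }
+}
+```
+
+The three parameters correspond to the repository name, the target, and
+the configuration that define a computed root; the `"git"` repository
+pinned to a commit provides the required content-fixed base.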