1 files changed, 154 insertions, 0 deletions
diff --git a/doc/future-designs/computed-roots.org b/doc/future-designs/computed-roots.org
new file mode 100644
index 00000000..d3c355ce
--- /dev/null
+++ b/doc/future-designs/computed-roots.org
@@ -0,0 +1,154 @@
+* Computed roots
+
+** Status quo
+
+As of version ~1.0.0~, the ~just~ build tool requires the repository
+configuration, including all roots, to be specified ahead of time.
+This has a couple of consequences.
+
+*** Flexible source views, thanks to staging
+
+For source files, the flexibility of using them in a layout different
+from how they occur in the source tree is gained through staging.
+If a different view of sources is needed, instead of a source
+target, a defined target can be used that rearranges the sources as
+desired. In this way, programmatic transformations of source files
+can also be carried out (while the result is still visible at the
+original location), as is done, e.g., by the ~["patch", "file"]~
+rule of the ~just~ main repository.
+
+*** Restricted flexibility in target-definitions via globbing
+
+When defining targets, the general principle is that the definition
+of target and action graph only depends on the description (given by
+the target files, the rules and expressions, and the configuration).
+There is, however, a single exception to that rule: a target file
+may use the ~GLOB~ built-in construct and in this way depend on
+the index of the respective source directory. This allows, e.g.,
+to define a separate action for every source file and, in this
+way, get good incrementality and parallelism, while still having
+a concise target description.
+
+*** Modularity in rules through expressions
+
+Rules might share common tasks. For example, for both ~C~ binaries
+and ~C~ libraries, the source files have to be compiled to object
+files. To avoid duplication of descriptions, expressions can be
+called (also from expressions themselves).
+
+** Use cases that require more flexibility
+
+*** Generated target files
+
+Sometimes projects (or parts thereof that can form a separate
+logical repository) have a simple structure. For example, there is
+a list of directories and for each one there is a library, named
+and staged in a systematic way. Repeating all those systematic
+target files seems unnecessary work. Instead, we could store the
+list of directories to consider and a small script containing the
+naming/staging/globbing logic; this approach would also be more
+maintainable. A similar approach could also be attractive for a
+directory tree with tests where, on top, all the individual tests
+should be collected into test suites.
+
+*** Staging according to embedded information
+
+For importing prebuilt libraries, it is sometimes desirable to
+stage them in a way honoring the embedded ~soname~. The current
+approach is to provide that information out of band in the target
+file, so that it can be used during analysis. However, the information
+is already present in the prebuilt binary, so repeating it causes
+unnecessary maintenance overhead. Instead, the target file could be a
+function of that library, which can form its own content-fixed root (e.g.,
+a ~git tree~ root), so that the computed value is easily cachable.
+
+*** Simplified rule definition and alternative syntax
+
+Rules can share computation through expressions. However, the
+interface deliberately has to be explicit, including the documentation
+strings that are used by ~just describe~. While this allows easy
+and efficient implementation of ~just describe~, there is some
+redundancy involved, as fields are often only there to be used by
+a common expression, but still have to be documented redundantly
+(causing additional maintenance burden).
+
+Moreover, the JSON encoding of abstract syntax trees is an
+unambiguously readable format that is easy to process automatically,
+but people argue that it is hard to write by hand. However, agreement
+on which syntax is best to use is unlikely. Now, if rule and
+expression files could be generated, this argument would not be
+necessary. Moreover, rules are typically versioned and
+infrequently changed, so the step of generating the official syntax
+from the convenient one would typically be in cache.
+
+** Proposal: Support computed roots
+
+We propose computed roots as a clean principle to add the needed (and
+a lot more) flexibility for the described use cases, while ensuring
+that all computations of roots are properly cachable at a high level.
+In this way, we do not compromise efficient builds, as the price of
+the additional flexibility, in the typical case, is just a single
+cache lookup. Of course, it is up to the user to ensure that this
+case really is the typical one, in the same way as it is their
+responsibility to describe the targets in a way to have proper
+incrementality.
+
+*** New root type ~"computed"~
+
+The ~just~ multi-repository configuration will allow a new type
+of root (besides ~"file"~ and ~"git tree"~ and variants thereof),
+called ~"computed"~. A ~"computed"~ root is given by
+- the (global) name of a repository
+- the name of a target (in ~["module", "target"]~ format), and
+- a configuration (as JSON object, taken literally).
+It is a requirement that the specified target is an ~"export"~ target
+and that the specified repository is content-fixed; ~"computed"~ roots
+themselves are considered content-fixed. However, the dependency structure of
+computed roots must be cycle free. In other words, there must exist
+an ordering of computed roots (the implicit topological order, not
+a declared one) such that for each computed root, the referenced
+repository as well as all repositories reachable from that one
+via the ~"bindings"~ map only contain computed roots earlier in
+that order.
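+
+The cycle-freeness requirement can be illustrated by a minimal
+sketch in Python. It is not part of ~just~; the data layout (a map
+from repository names to the computed roots they use, and a map of
+bindings) and all function names are assumptions made purely for
+illustration. A computed root is modelled as a tuple of repository
+name, module, target name, and the configuration as a JSON string.
+
+#+BEGIN_SRC python
+from graphlib import CycleError, TopologicalSorter
+
+
+def reachable(repo, bindings):
+    """Return repo and all repositories reachable from it via bindings."""
+    seen, todo = set(), [repo]
+    while todo:
+        current = todo.pop()
+        if current not in seen:
+            seen.add(current)
+            todo.extend(bindings.get(current, {}).values())
+    return seen
+
+
+def evaluation_order(roots_used, bindings):
+    """Order computed roots so that each comes after all it depends on.
+
+    roots_used: repository name -> list of computed roots used there
+    bindings:   repository name -> {local name: global repository name}
+    Raises graphlib.CycleError if the dependency structure is cyclic.
+    """
+    all_roots = {root for roots in roots_used.values() for root in roots}
+    deps = {}
+    for root in all_roots:
+        base_repo = root[0]
+        # A root depends on every computed root used by its referenced
+        # repository or by any repository reachable from that one.
+        deps[root] = {
+            dep
+            for repo in reachable(base_repo, bindings)
+            for dep in roots_used.get(repo, [])
+        }
+    return list(TopologicalSorter(deps).static_order())
+#+END_SRC
+
+In this sketch, a root whose referenced repository (directly or
+via bindings) uses that same root is reported as a cycle.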
+
+*** Strict evaluation of roots as artifact tree
+
+The building of required computed roots happens in topological order;
+the build of the defining target of a root is, in principle (subject
+to a user-defined restriction of parallelism), started as soon as all
+roots in the repositories reachable via bindings are available. The
+root is then considered the artifact tree of the defining target.
+
+In particular, the evaluation is strict: all roots of reachable
+repositories have to be successfully computed before the evaluation
+is started, even if it later turns out that one of these roots is
+never accessed in the computation of the defining target. The reason
+for this strictness requirement is to ensure that the cache key for
+target-level caching can be computed ahead of time (and we expect
+the entry to be in target-level cache most of the time anyway).
+
+*** Intensional equality of computed roots
+
+During a build, each computed root is evaluated only once, even
+if required in several places. Two computed roots are considered
+equal if they are defined in the same way, i.e., repository name,
+target, and configuration agree. The repository or layer using the
+computed root is not part of the root definition.
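+
+As a minimal sketch of this notion of equality (in Python, not
+part of ~just~): a computed root can be keyed by its definition
+alone, so that it is built at most once per build. The class and
+function names, the ~build~ callback, and the choice of a
+key-sorted JSON serialization to compare configurations are all
+assumptions made for illustration.
+
+#+BEGIN_SRC python
+import json
+
+
+def root_key(repo, target, config):
+    """Key a computed root by repository name, target, and configuration."""
+    module, name = target
+    # Serialize with sorted keys so that equal JSON objects give equal keys.
+    return (repo, module, name, json.dumps(config, sort_keys=True))
+
+
+class ComputedRoots:
+    """Evaluate each computed root at most once, however often it is used."""
+
+    def __init__(self, build):
+        self._build = build  # hypothetical callback: key -> tree identifier
+        self._trees = {}
+
+    def get(self, repo, target, config):
+        key = root_key(repo, target, config)
+        if key not in self._trees:
+            self._trees[key] = self._build(key)
+        return self._trees[key]
+#+END_SRC
+
+Note that the repository or layer asking for the root does not
+enter the key, matching the definition above.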
+
+*** Computed roots available to the user
+
+As computed roots are defined by export targets, the respective
+artifacts are stored in the local CAS anyway. Additionally, the
+tree that forms the root will be added to CAS as well. Moreover,
+an option will be added to specify a log file that contains, in
+a machine-readable way, all the tree identifiers of all computed
+roots used in this build, together with their definition.
+
+*** ~just-mr~ to support computed roots
+
+To allow easy setup of a ~just~ configuration using computed
+roots, ~just-mr~ will allow a repository type ~"computed"~ with the
+same parameters as a computed root. These repositories can be used
+as roots, like any other ~just-mr~ repository type. When generating
+the ~just~ multi-repository configuration, the definition of a
+~"computed"~ repository is simply forwarded as a computed root.