author Oliver Reiche <oliver.reiche@huawei.com> 2023-06-01 13:36:32 +0200
committer Oliver Reiche <oliver.reiche@huawei.com> 2023-06-12 16:29:05 +0200
commit b66a7359fbbff35af630c88c56598bbc06b393e1 (patch)
tree d866802c4b44c13cbd90f9919cc7fc472091be0c
parent 144b2c619f28c91663936cd445251ca28af45f88 (diff)
download justbuild-b66a7359fbbff35af630c88c56598bbc06b393e1.tar.gz
doc: Convert orgmode files to markdown
-rw-r--r-- README.md 36
-rw-r--r-- doc/concepts/anonymous-targets.md 345
-rw-r--r-- doc/concepts/anonymous-targets.org 336
-rw-r--r-- doc/concepts/built-in-rules.md 172
-rw-r--r-- doc/concepts/built-in-rules.org 167
-rw-r--r-- doc/concepts/cache-pragma.md 134
-rw-r--r-- doc/concepts/cache-pragma.org 130
-rw-r--r-- doc/concepts/configuration.md 115
-rw-r--r-- doc/concepts/configuration.org 107
-rw-r--r-- doc/concepts/doc-strings.md 152
-rw-r--r-- doc/concepts/doc-strings.org 145
-rw-r--r-- doc/concepts/expressions.md 368
-rw-r--r-- doc/concepts/expressions.org 344
-rw-r--r-- doc/concepts/garbage.md 86
-rw-r--r-- doc/concepts/garbage.org 82
-rw-r--r-- doc/concepts/multi-repo.md 170
-rw-r--r-- doc/concepts/multi-repo.org 167
-rw-r--r-- doc/concepts/overview.md 210
-rw-r--r-- doc/concepts/overview.org 206
-rw-r--r-- doc/concepts/rules.md 567
-rw-r--r-- doc/concepts/rules.org 551
-rw-r--r-- doc/concepts/target-cache.md 231
-rw-r--r-- doc/concepts/target-cache.org 219
-rw-r--r-- doc/future-designs/computed-roots.md 156
-rw-r--r-- doc/future-designs/computed-roots.org 154
-rw-r--r-- doc/future-designs/execution-properties.md 125
-rw-r--r-- doc/future-designs/execution-properties.org 119
-rw-r--r-- doc/future-designs/service-target-cache.md 236
-rw-r--r-- doc/future-designs/service-target-cache.org 227
-rw-r--r-- doc/future-designs/symlinks.md 113
-rw-r--r-- doc/future-designs/symlinks.org 108
-rw-r--r-- doc/specification/remote-protocol.md 145
-rw-r--r-- doc/specification/remote-protocol.org 139
-rw-r--r-- doc/tutorial/getting-started.md 217
-rw-r--r-- doc/tutorial/getting-started.org 212
-rw-r--r-- doc/tutorial/hello-world.md 379
-rw-r--r-- doc/tutorial/hello-world.org 370
-rw-r--r-- doc/tutorial/proto.md (renamed from doc/tutorial/proto.org) 245
-rw-r--r-- doc/tutorial/rebuild.md (renamed from doc/tutorial/rebuild.org) 164
-rw-r--r-- doc/tutorial/target-file-glob-tree.md (renamed from doc/tutorial/target-file-glob-tree.org) 326
-rw-r--r-- doc/tutorial/tests.md (renamed from doc/tutorial/tests.org) 270
-rw-r--r-- doc/tutorial/third-party-software.md 473
-rw-r--r-- doc/tutorial/third-party-software.org 475
43 files changed, 4916 insertions, 4777 deletions
diff --git a/README.md b/README.md
index 46fc78e4..bd0938c4 100644
--- a/README.md
+++ b/README.md
@@ -15,25 +15,25 @@ taken from user-defined rules described by functional expressions.
[installation guide](INSTALL.md).
* Tutorial
- - [Getting Started](doc/tutorial/getting-started.org)
- - [Hello World](doc/tutorial/hello-world.org)
- - [Third party dependencies](doc/tutorial/third-party-software.org)
- - [Tests](doc/tutorial/tests.org)
- - [Targets versus `FILE`, `GLOB`, and `TREE`](doc/tutorial/target-file-glob-tree.org)
- - [Ensuring reproducibility](doc/tutorial/rebuild.org)
- - [Using protobuf](doc/tutorial/proto.org)
+ - [Getting Started](doc/tutorial/getting-started.md)
+ - [Hello World](doc/tutorial/hello-world.md)
+ - [Third party dependencies](doc/tutorial/third-party-software.md)
+ - [Tests](doc/tutorial/tests.md)
+ - [Targets versus `FILE`, `GLOB`, and `TREE`](doc/tutorial/target-file-glob-tree.md)
+ - [Ensuring reproducibility](doc/tutorial/rebuild.md)
+ - [Using protobuf](doc/tutorial/proto.md)
- [How to create a single-node remote execution service](doc/tutorial/just-execute.org)
## Documentation
-- [Overview](doc/concepts/overview.org)
-- [Build Configurations](doc/concepts/configuration.org)
-- [Multi-Repository Builds](doc/concepts/multi-repo.org)
-- [Expression Language](doc/concepts/expressions.org)
-- [Built-in Rules](doc/concepts/built-in-rules.org)
-- [User-Defined Rules](doc/concepts/rules.org)
-- [Documentation Strings](doc/concepts/doc-strings.org)
-- [Cache Pragma and Testing](doc/concepts/cache-pragma.org)
-- [Anonymous Targets](doc/concepts/anonymous-targets.org)
-- [Target-Level Caching](doc/concepts/target-cache.org)
-- [Garbage Collection](doc/concepts/garbage.org)
+- [Overview](doc/concepts/overview.md)
+- [Build Configurations](doc/concepts/configuration.md)
+- [Multi-Repository Builds](doc/concepts/multi-repo.md)
+- [Expression Language](doc/concepts/expressions.md)
+- [Built-in Rules](doc/concepts/built-in-rules.md)
+- [User-Defined Rules](doc/concepts/rules.md)
+- [Documentation Strings](doc/concepts/doc-strings.md)
+- [Cache Pragma and Testing](doc/concepts/cache-pragma.md)
+- [Anonymous Targets](doc/concepts/anonymous-targets.md)
+- [Target-Level Caching](doc/concepts/target-cache.md)
+- [Garbage Collection](doc/concepts/garbage.md)
diff --git a/doc/concepts/anonymous-targets.md b/doc/concepts/anonymous-targets.md
new file mode 100644
index 00000000..6692d0ae
--- /dev/null
+++ b/doc/concepts/anonymous-targets.md
@@ -0,0 +1,345 @@
+Anonymous targets
+=================
+
+Motivation
+----------
+
+Using [Protocol buffers](https://github.com/protocolbuffers/protobuf)
+makes it possible to specify, in a language-independent way, a wire
+format for structured data. This is done by using description files from
+which APIs for various languages can be generated. As protocol buffers
+can contain other protocol buffers, the description files themselves
+have a dependency structure.
+
+From a software-engineering point of view, the challenge is to ensure
+that the author of the description files does not have to be aware of
+the languages for which APIs will be generated later. In fact, the main
+benefit of the language-independent description is that clients in
+various languages can be implemented using the same wire protocol (and
+are thus capable of communicating with the same server).
+
+For a build system that means that we have to expect that language
+bindings are requested at places far away from the protocol definition,
+and potentially several times. Such a duplication can also occur
+implicitly if two buffers, for which language bindings are generated,
+both use a common buffer for which bindings are never requested
+explicitly. Still, we want to avoid duplicate work for common parts and
+we have to avoid conflicts with duplicate symbols and staging conflicts
+for the libraries for the common part.
+
+Our approach is that a "proto" target only provides the description
+files together with their dependency structure. From those, a consuming
+target generates "anonymous targets" as additional dependencies; as
+those targets will have an appropriate notion of equality, no duplicate
+work is done and hence, as a side effect, staging or symbol conflicts
+are avoided as well.
+
+Preliminary remark: action identifiers
+--------------------------------------
+
+Actions are identified by the Merkle-tree hash of their contents. As all
+components (input tree, list of output strings, command vector,
+environment, and cache pragma) are given by expressions, that hash can
+quickly be computed. This identifier also defines the notion of equality
+for actions, and hence for action artifacts. Recall that equality of
+artifacts is also (implicitly) used in our notion of disjoint map union
+(where the set of keys does not have to be disjoint, as long as the
+values for all duplicate keys are equal).
+
+When constructing the action graph for traversal, we can drop duplicates
+(i.e., actions with the same identifier, and hence the same
+description). For the serialization of the graph as part of the analyse
+command, we can afford the preparatory step to compute a map from action
+id to list of origins.
+
+Equality
+--------
+
+### Notions of equality
+
+In the context of builds, there are different concepts of equality to
+consider. We recall the definitions, as well as their use in our build
+tool.
+
+#### Locational equality ("Defined at the same place")
+
+Names (for targets and rules) are given by repository name, module
+name, and target name (inside the module); additionally, for target
+names, there's a bit specifying that we explicitly refer to a file.
+Names are equal if and only if the respective strings (and the file
+bit) are equal.
+
+For targets, we use locational equality, i.e., we consider targets
+equal precisely if their names are equal; targets defined at
+different places are considered different, even if they're defined
+in the same way. The reason we use this notion of equality is that we
+have to refer to targets (and also check if we already have a
+pending task to analyse them) before we have fully explored them
+with all the targets referred to in their definition.
+
+#### Intensional equality ("Defined in the same way")
+
+In our expression language we handle definitions; in particular, we
+treat artifacts by their definition: a particular source file, the
+output of a particular action, etc. Hence we use intensional
+equality in our expression language; two objects are equal precisely
+if they are defined in the same way. This notion of equality is easy
+to determine without the need of reading a source file or running an
+action. We implement quick tests by keeping a Merkle-tree hash of
+all expression values.
+
+#### Extensional equality ("Defining the same object")
+
+For built artifacts, we use extensional equality, i.e., we consider
+two files equal if they are bit-by-bit identical.
+Implementation-wise, we compare an appropriate cryptographic hash.
+Before running an action, we build its inputs. In particular (as
+inputs are considered extensionally) an action might cause a cache
+hit with an intensionally different one.
+
+#### Observable equality ("The defined objects behave in the same way")
+
+Finally, there is the notion of observable equality, i.e., the
+property that two binaries behave the same way in all situations.
+As this notion is undecidable, it is never used directly by any
+build tool. However, it is often the motivation for a build in the
+first place: we want a binary that behaves in a particular way.
+
+### Relation between these notions
+
+The notions of equality were introduced in order from most fine-grained
+to most coarse. Targets defined at the same place are obviously defined
+in the same way. Intensionally equal artifacts create equal action
+graphs; here we can confidently say "equal" and not only isomorphic:
+due to our preliminary clean up, even the node names are equal. Making
+sure that equal actions produce bit-by-bit equal outputs is the realm of
+[reproducible builds](https://reproducible-builds.org/). The tool can
+support this by appropriate sandboxing, etc., but the rules still have to
+define actions that don't pick up non-input information like the
+current time, user id, readdir order, etc. Files that are bit-by-bit
+identical will behave in the same way.
+
+### Example
+
+Consider the following target file.
+
+```jsonc
+{ "foo":
+ { "type": "generic"
+ , "outs": ["out.txt"]
+ , "cmds": ["echo Hello World > out.txt"]
+ }
+, "bar":
+ { "type": "generic"
+ , "outs": ["out.txt"]
+ , "cmds": ["echo Hello World > out.txt"]
+ }
+, "baz":
+ { "type": "generic"
+ , "outs": ["out.txt"]
+ , "cmds": ["echo -n Hello > out.txt && echo ' World' >> out.txt"]
+ }
+, "foo upper":
+ { "type": "generic"
+ , "deps": ["foo"]
+ , "outs": ["upper.txt"]
+ , "cmds": ["cat out.txt | tr a-z A-Z > upper.txt"]
+ }
+, "bar upper":
+ { "type": "generic"
+ , "deps": ["bar"]
+ , "outs": ["upper.txt"]
+ , "cmds": ["cat out.txt | tr a-z A-Z > upper.txt"]
+ }
+, "baz upper":
+ { "type": "generic"
+ , "deps": ["baz"]
+ , "outs": ["upper.txt"]
+ , "cmds": ["cat out.txt | tr a-z A-Z > upper.txt"]
+ }
+, "ALL":
+ { "type": "install"
+ , "files":
+ {"foo.txt": "foo upper", "bar.txt": "bar upper", "baz.txt": "baz upper"}
+ }
+}
+```
+
+Assume we build the target `"ALL"`. Then we will analyse 7 targets: all
+the locationally different ones (`"ALL"`, `"foo"`, `"bar"`, `"baz"`,
+`"foo upper"`, `"bar upper"`, `"baz upper"`). For the targets `"foo"`
+and `"bar"`, we immediately see that the definition is equal; their
+intensional equality also renders `"foo upper"` and `"bar upper"`
+intensionally equal. Our action graph will contain 4 actions: one with
+origins `["foo", "bar"]`, one with origins `["baz"]`, one with origins
+`["foo upper", "bar upper"]`, and one with origins
+`["baz upper"]`. The `"install"` target will, of course, not create any
+actions. Building sequentially (`-J 1`), we will get one cache hit. Even
+though the artifacts of `"foo"` and `"bar"` and of `"baz"` are defined
+differently, they are extensionally equal; both define a file with
+contents `"Hello World\n"`.
+
+Anonymous targets
+-----------------
+
+Besides named targets we also have additional targets (and hence also
+configured targets) that are not associated with a location they are
+defined at. Due to the absence of definition location, their notion of
+equality will take care of the necessary deduplication (implicitly, by
+the way our dependency exploration works). We will call them "anonymous
+targets", even though, technically, they're not fully anonymous as the
+rules that are part of their structure will be given by name, i.e., by
+the location where the rule is defined.
+
+### Value type: target graph node
+
+In order to allow targets to adequately describe a dependency structure,
+we have a value type in our expression language, that of a (target)
+graph node. As with all value types, equality is intensional, i.e.,
+nodes defined in the same way are equal even if defined at different
+places. This can be achieved by our usual approach for expressions of
+having cached Merkle-tree hashes and comparing them when an equality
+test is required. This efficient test for equality also allows using
+graph nodes as part of a map key, e.g., for our asynchronous map
+consumers.
+
+As a graph node can only be defined with all data given, the defined
+dependency structure is cycle-free by construction. However, the tree
+unfolding will usually be exponentially larger. For internal handling,
+this is not a problem: our shared-pointer implementation can efficiently
+represent a directed acyclic graph and since we cache hashes in
+expressions, we can compute the overall hash without folding the
+structure to a tree. When presenting nodes to the user, we only show the
+map of identifier to definition, to avoid that exponential unfolding.
+
+We have two kinds of nodes.
+
+#### Value nodes
+
+These represent a target that, in any configuration, returns a fixed
+value. Source files would typically be represented this way. The
+constructor function `"VALUE_NODE"` takes a single argument `"$1"`
+that has to be a result value.
+
+#### Abstract nodes
+
+These represent internal nodes in the dag. Their constructor
+`"ABSTRACT_NODE"` takes the following arguments (all evaluated).
+
+ - `"node_type"`. An arbitrary string, not interpreted in any way,
+ to indicate the role that the node has in the dependency
+ structure. When we create an anonymous target from a node, this
+ will serve as the key into the rule mapping to be applied.
+ - `"string_fields"`. This has to be a map of strings.
+ - `"target_fields"`. These have to be a map of lists of graph
+ nodes.
+
+Moreover, we require that the keys for maps provided as
+`"string_fields"` and `"target_fields"` be disjoint.
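+
+As a purely illustrative sketch (the node type `"proto_library"` and the
+variable names are made up; assume `"result"`, `"strings"`, and
+`"nodes"` are bound to a result value, a suitable map of strings, and a
+suitable map of node lists, respectively), a defining expression might
+construct such nodes as follows.
+
+```jsonc
+[ // a value node, wrapping the result value bound to "result"
+  {"type": "VALUE_NODE", "$1": {"type": "var", "name": "result"}}
+, // an abstract node; "proto_library" will later serve as the key
+  // into the rule map when an anonymous target is formed
+  { "type": "ABSTRACT_NODE"
+  , "node_type": "proto_library"
+  , "string_fields": {"type": "var", "name": "strings"}
+  , "target_fields": {"type": "var", "name": "nodes"}
+  }
+]
+```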
+
+### Graph nodes in `export` targets
+
+Graph nodes are completely free of names and hence are eligible for
+exporting. As with other values, in the cache the intensional definition
+of artifacts implicit in them will be replaced by the corresponding,
+extensionally equal, known value.
+
+However, some care has to be taken in the serialisation that is part of
+the caching, as we do not want to unfold the dag to a tree. Therefore,
+we take as JSON serialisation a simple dict with `"type"` set to
+`"NODE"`, and `"value"` set to the Merkle-tree hash. That serialisation
+respects intensional equality. To allow deserialisation, we add an
+additional map to the serialisation from node hash to its definition.
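+
+For illustration, the serialisation of a single node might hence look as
+follows, with the hash being a placeholder value.
+
+```jsonc
+{"type": "NODE", "value": "<Merkle-tree hash of the node>"}
+```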
+
+### Depending on anonymous targets
+
+#### Parts of an anonymous target
+
+An anonymous target is given by a pair of a node and a map taking the
+node-type strings of abstract nodes to rule names. So, in the
+implementation these are just two expression pointers (with their
+defined notion of equality, i.e., equality of the respective
+Merkle-tree hashes). Such a pair of pointers also forms an
+additional variant of a name value, referring to such an anonymous
+target.
+
+It should be noted that such an anonymous target contains all the
+information needed to evaluate it in the same way as a regular
+(named) target defined by a user-defined rule. It is an analysis
+error analysing an anonymous target where there is no entry in the
+rules map for the string given as `"node_type"` for the
+corresponding node.
+
+#### Anonymous targets as additional dependencies
+
+We keep the property that a user can only request named targets. So
+anonymous targets have to be requested by other targets. We also
+keep the property that other targets are only requested at certain
+fixed steps in the evaluation of a target. To still achieve a
+meaningful use of anonymous targets, our rule language handles
+anonymous targets in the following way.
+
+##### Rules parameter `"anonymous"`
+
+In the rule definition a parameter `"anonymous"` (with empty map
+as default) is allowed. It is used to define an additional
+dependency on anonymous targets. The value has to be a map with
+keys the additional implicitly defined field names. It is hence
+a requirement that the set of keys be disjoint from all other
+field names (the values of `"config_fields"`, `"string_fields"`,
+and `"target_fields"`, as well as the keys of the `"implicit"`
+parameter). Another consequence is that the `"config_transitions"`
+map may now also have meaningful entries for the keys of the
+`"anonymous"` map. Each value in the map has to be itself a map,
+with entries `"target"`, `"provider"`, and `"rule_map"`.
+
+For `"target"`, a single string has to be specified, and the
+value has to be a member of the `"target_fields"` list. For
+`"provider"`, a single string has to be specified as well. The idea
+is that the nodes are collected from that provider of the
+targets in the specified target field. For `"rule_map"` a map
+has to be specified from strings to rule names; the latter are
+evaluated in the context of the rule definition.
+
+###### Example
+
+For generating language bindings for protocol buffers, a
+rule might look as follows.
+
+```jsonc
+{ "cc_proto_bindings":
+ { "target_fields": ["proto_deps"]
+ , "anonymous":
+ { "protos":
+ { "target": "proto_deps"
+ , "provider": "proto"
+ , "rule_map": {"proto_library": "cc_proto_library"}
+ }
+ }
+ , "expression": {...}
+ }
+}
+```
+
+##### Evaluation mechanism
+
+The evaluation of a target defined by a user-defined rule is
+handled as follows. After the target fields are evaluated as
+usual, an additional step is carried out.
+
+For each anonymous-target field, i.e., for each key in the
+`"anonymous"` map, a list of anonymous targets is generated from
+the corresponding value: take all targets from the specified
+`"target"` field in all their specified configuration
+transitions (they have already been evaluated) and take the
+values provided for the specified `"provider"` key (using the
+empty list as default). That value has to be a list of nodes.
+All the node lists obtained that way are concatenated. The
+configuration transition for the respective field name is
+evaluated. Those targets are then evaluated for all the
+transitioned configurations requested.
+
+In the final evaluation of the defining expression, the
+anonymous-target fields are available in the same way as any
+other target field. Also, they contribute to the effective
+configuration in the same way as regular target fields.
diff --git a/doc/concepts/anonymous-targets.org b/doc/concepts/anonymous-targets.org
deleted file mode 100644
index 98d194c7..00000000
--- a/doc/concepts/anonymous-targets.org
+++ /dev/null
@@ -1,336 +0,0 @@
-* Anonymous targets
-** Motivation
-
-Using [[https://github.com/protocolbuffers/protobuf][Protocol
-buffers]] allows to specify, in a language-independent way, a wire
-format for structured data. This is done by using description files
-from which APIs for various languages can be generated. As protocol
-buffers can contain other protocol buffers, the description files
-themselves have a dependency structure.
-
-From a software-engineering point of view, the challenge is to
-ensure that the author of the description files does not have to
-be aware of the languages for which APIs will be generated later.
-In fact, the main benefit of the language-independent description
-is that clients in various languages can be implemented using the
-same wire protocol (and thus capable of communicating with the
-same server).
-
-For a build system that means that we have to expect that language
-bindings at places far away from the protocol definition, and
-potentially several times. Such a duplication can also occur
-implicitly if two buffers, for which language bindings are generated
-both use a common buffer for which bindings are never requested
-explicitly. Still, we want to avoid duplicate work for common parts
-and we have to avoid conflicts with duplicate symbols and staging
-conflicts for the libraries for the common part.
-
-Our approach is that a "proto" target only provides the description
-files together with their dependency structure. From those, a
-consuming target generates "anonymous targets" as additional
-dependencies; as those targets will have an appropriate notion of
-equality, no duplicate work is done and hence, as a side effect,
-staging or symbol conflicts are avoided as well.
-
-** Preliminary remark: action identifiers
-
-Actions are defined as Merkle-tree hash of the contents. As all
-components (input tree, list of output strings, command vector,
-environment, and cache pragma) are given by expressions, that can
-quickly be computed. This identifier also defines the notion of
-equality for actions, and hence action artifacts. Recall that equality
-of artifacts is also (implicitly) used in our notion of disjoint
-map union (where the set of keys does not have to be disjoint, as
-long as the values for all duplicate keys are equal).
-
-When constructing the action graph for traversal, we can drop
-duplicates (i.e., actions with the same identifier, and hence the
-same description). For the serialization of the graph as part of
-the analyse command, we can afford the preparatory step to compute
-a map from action id to list of origins.
-
-** Equality
-
-*** Notions of equality
-
-In the context of builds, there are different concepts of equality
-to consider. We recall the definitions, as well as their use in
-our build tool.
-
-**** Locational equality ("Defined at the same place")
-
-Names (for targets and rules) are given by repository name, module
-name, and target name (inside the module); additionally, for target
-names, there's a bit specifying that we explicitly refer to a file.
-Names are equal if and only if the respective strings (and the file
-bit) are equal.
-
-For targets, we use locational equality, i.e., we consider targets
-equal precisely if their names are equal; targets defined at different
-places are considered different, even if they're defined in the
-same way. The reason we use notion of equality is that we have to
-refer to targets (and also check if we already have a pending task
-to analyse them) before we have fully explored them with all the
-targets referred to in their definition.
-
-**** Intensional equality ("Defined in the same way")
-
-In our expression language we handle definitions; in particular,
-we treat artifacts by their definition: a particular source file,
-the output of a particular action, etc. Hence we use intensional
-equality in our expression language; two objects are equal precisely
-if they are defined in the same way. This notion of equality is easy
-to determine without the need of reading a source file or running
-an action. We implement quick tests by keeping a Merkle-tree hash
-of all expression values.
-
-**** Extensional equality ("Defining the same object")
-
-For built artifacts, we use extensional equality, i.e., we consider
-two files equal, if they are bit-by-bit identical. Implementation-wise,
-we compare an appropriate cryptographic hash. Before running an
-action, we built its inputs. In particular (as inputs are considered
-extensionally) an action might cause a cache hit with an intensionally
-different one.
-
-**** Observable equality ("The defined objects behave in the same way")
-
-Finally, there is the notion of observable equality, i.e., the
-property that two binaries behaving the same way in all situations.
-As this notion is undecidable, it is never used directly by any
-build tool. However, it is often the motivation for a build in the
-first place: we want a binary that behaves in a particular way.
-
-*** Relation between these notions
-
-The notions of equality were introduced in order from most fine grained
-to most coarse. Targets defined at the same place are obviously defined
-in the same way. Intensionally equal artifacts create equal action
-graphs; here we can confidently say "equal" and not only isomorphic:
-due to our preliminary clean up, even the node names are equal.
-Making sure that equal actions produce bit-by-bit equal outputs
-is the realm of [[https://reproducible-builds.org/][reproducibe
-builds]]. The tool can support this by appropriate sandboxing,
-etc, but the rules still have to define actions that don't pick
-up non-input information like the current time, user id, readdir
-order, etc. Files that are bit-by-bit identical will behave in
-the same way.
-
-*** Example
-
-Consider the following target file.
-
-#+BEGIN_SRC
-{ "foo":
- { "type": "generic"
- , "outs": ["out.txt"]
- , "cmds": ["echo Hello World > out.txt"]
- }
-, "bar":
- { "type": "generic"
- , "outs": ["out.txt"]
- , "cmds": ["echo Hello World > out.txt"]
- }
-, "baz":
- { "type": "generic"
- , "outs": ["out.txt"]
- , "cmds": ["echo -n Hello > out.txt && echo ' World' >> out.txt"]
- }
-, "foo upper":
- { "type": "generic"
- , "deps": ["foo"]
- , "outs": ["upper.txt"]
- , "cmds": ["cat out.txt | tr a-z A-Z > upper.txt"]
- }
-, "bar upper":
- { "type": "generic"
- , "deps": ["bar"]
- , "outs": ["upper.txt"]
- , "cmds": ["cat out.txt | tr a-z A-Z > upper.txt"]
- }
-, "baz upper":
- { "type": "generic"
- , "deps": ["baz"]
- , "outs": ["upper.txt"]
- , "cmds": ["cat out.txt | tr a-z A-Z > upper.txt"]
- }
-, "ALL":
- { "type": "install"
- , "files":
- {"foo.txt": "foo upper", "bar.txt": "bar upper", "baz.txt": "baz upper"}
- }
-}
-#+END_SRC
-
-Assume we build the target ~"ALL"~. Then we will analyse 7 targets,
-all the locationally different ones (~"foo"~, ~"bar"~, ~"baz"~,
-~"foo upper"~, ~"bar upper"~, ~"baz upper"~). For the targets ~"foo"~
-and ~"bar"~, we immediately see that the definition is equal; their
-intensional equality also renders ~"foo upper"~ and ~"bar upper"~
-intensionally equal. Our action graph will contain 4 actions: one
-with origins ~["foo", "bar"]~, one with origins ~["baz"]~, one with
-origins ~["foo upper", "bar upper"]~, and one with origins ~["baz
-upper"]~. The ~"install"~ target will, of course, not create any
-actions. Building sequentially (~-J 1~), we will get one cache hit.
-Even though the artifacts of ~"foo"~ and ~"bar"~ and of ~"baz~"
-are defined differently, they are extensionally equal; both define
-a file with contents ~"Hello World\n"~.
-
-** Anonymous targets
-
-Besides named targets we also have additional targets (and hence also
-configured targets) that are not associated with a location they are
-defined at. Due to the absence of definition location, their notion
-of equality will take care of the necessary deduplication (implicitly,
-by the way our dependency exploration works). We will call them
-"anonymous targets", even though, technically, they're not fully
-anonymous as the rules that are part of their structure will be
-given by name, i.e., defining rule location.
-
-*** Value type: target graph node
-
-In order to allow targets to adequately describe a dependency
-structure, we have a value type in our expression language, that
-of a (target) graph node. As with all value types, equality is
-intensional, i.e., nodes defined in the same way are equal even
-if defined at different places. This can be achieved by our usual
-approach for expressions of having cached Merkle-tree hashes and
-comparing them when an equality test is required. This efficient
-test for equality also allows using graph nodes as part of a map
-key, e.g., for our asynchronous map consumers.
-
-As a graph node can only be defined with all data given, the defined
-dependency structure is cycle-free by construction. However, the
-tree unfolding will usually be exponentially larger. For internal
-handling, this is not a problem: our shared-pointer implementation
-can efficiently represent a directed acyclic graph and since we
-cache hashes in expressions, we can compute the overall hash without
-folding the structure to a tree. When presenting nodes to the user,
-we only show the map of identifier to definition, to avoid that
-exponential unfolding.
-
-We have two kinds of nodes.
-
-**** Value nodes
-
-These represent a target that, in any configuration, returns a fixed
-value. Source files would typically be represented this way. The
-constructor function ~"VALUE_NODE"~ takes a single argument ~"$1"~
-that has to be a result value.
-
-**** Abstract nodes
-
-These represent internal nodes in the dag. Their constructor
-~"ABSTRACT_NODE"~ takes the following arguments (all evaluated).
-- ~"node_type"~. An arbitrary string, not interpreted in any way, to
- indicate the role that the node has in the dependency structure.
- When we create an anonymous target from a node, this will serve
- as the key into the rule mapping to be applied.
-- ~"string_fields"~. This has to be a map of strings.
-- ~"target_fields"~. These have to be a map of lists of graph nodes.
-Moreover, we require that the keys for maps provided as ~"string_fields"~
-and ~"target_fields"~ be disjoint.
-
-*** Graph nodes in ~export~ targets
-
-Graph nodes are completely free of names and hence are eligible
-for exporting. As with other values, in the cache the intensional
-definition of artifacts implicit in them will be replaced by the
-corresponding, extensionally equal, known value.
-
-However, some care has to be taken in the serialisation that is
-part of the caching, as we do not want to unfold the dag to
-a tree. Therefore, we take as JSON serialisation a simple dict
-with ~"type"~ set to ~"NODE"~, and ~"value"~ set to the Merkle-tree
-hash. That serialisation respects intensional equality. To allow
-deserialisation, we add an additional map to the serialisation from
-node hash to its definition.
-
-*** Dependings on anonymous targets
-
-**** Parts of an anonymous target
-
-An anonymous target is given by a pair of a node and a map mapping
-the abstract node-type specifying strings to rule names. So, in
-the implementation these are just two expression pointers (with
-their defined notion of equality, i.e., equality of the respective
-Merkle-tree hashes). Such a pair of pointers also forms an additional
-variant of a name value, referring to such an anonymous target.
-
-It should be noted that such an anonymous target contains all the
-information needed to evaluate it in the same way as a regular (named)
-target defined by a user-defined rule. It is an analysis error
-analysing an anonymous target where there is no entry in the rules
-map for the string given as ~"node_type"~ for the corresponding node.
-
-**** Anonymous targets as additional dependencies
-
-We keep the property that a user can only request named targets.
-So anonymous targets have to be requested by other targets. We
-also keep the property that other targets are only requested at
-certain fixed steps in the evaluation of a target. To still achieve
-a meaningful use of anonymous targets our rule language handles
-anonymous targets in the following way.
-
-***** Rules parameter ~"anonymous"~
-
-In the rule definition a parameter ~"anonymous"~ (with empty map as
-default) is allowed. It is used to define an additional dependency on
-anonymous targets. The value has to be a map with keys the additional
-implicitly defined field names. It is hence a requirement that the
-set of keys be disjoint from all other field names (the values of
-~"config_fields"~, ~"string_fields"~, and ~"target_fields"~, as well as
-the keys of the ~"implict"~ parameter). Another consequence is that
-~"config_transitions"~ map may now also have meaningful entries for
-the keys of the ~"anonymous"~ map. Each value in the map has to be
-itself a map, with entries ~"target"~, ~"provider"~, and ~"rule_map"~.
-
-For ~"target"~, a single string has to be specifed, and the value has
-to be a member of the ~"target_fields"~ list. For provider, a single
-string has to be specified as well. The idea is that the nodes are
-collected from that provider of the targets in the specified target
-field. For ~"rule_map"~ a map has to be specified from strings to
-rule names; the latter are evaluated in the context of the rule
-definition.
-
-****** Example
-
-For generating language bindings for protocol buffers, a rule might
-look as follows.
-
-#+BEGIN_SRC
-{ "cc_proto_bindings":
- { "target_fields": ["proto_deps"]
- , "anonymous":
- { "protos":
- { "target": "proto_deps"
- , "provider": "proto"
- , "rule_map": {"proto_library": "cc_proto_library"}
- }
- }
- , "expression": {...}
- }
-}
-#+END_SRC
-
-***** Evaluation mechanism
-
-The evaluation of a target defined by a user-defined rule is handled
-as follows. After the target fields are evaluated as usual, an
-additional step is carried out.
-
-For each anonymous-target field, i.e., for each key in the ~"anonymous"~
-map, a list of anonymous targets is generated from the corresponding
-value: take all targets from the specified ~"target"~ field in all
-their specified configuration transitions (they have already been
-evaluated) and take the values provided for the specified ~"provider"~
-key (using the empty list as default). That value has to be a list
-of nodes. All the node lists obtained that way are concatenated.
-The configuration transition for the respective field name is
-evaluated. Those targets are then evaluated for all the transitioned
-configurations requested.
-
-In the final evaluation of the defining expression, the anonymous-target
-fields are available in the same way as any other target field.
-Also, they contribute to the effective configuration in the same
-way as regular target fields.
diff --git a/doc/concepts/built-in-rules.md b/doc/concepts/built-in-rules.md
new file mode 100644
index 00000000..3672df36
--- /dev/null
+++ b/doc/concepts/built-in-rules.md
@@ -0,0 +1,172 @@
+Built-in rules
+==============
+
+Targets are defined in `TARGETS` files. Each target file is a single
+`JSON` object. If the target name is contained as a key in that object,
+the corresponding value defines the target; otherwise it is implicitly
+considered a source file. The target definition itself is a `JSON`
+object as well. The mandatory key `"type"` specifies the rule defining
+the target; the meaning of the remaining keys depends on the rule
+defining the target.
+
+There are a couple of rules built in, all named by a single string. The
+user can define additional rules (and, in fact, we expect the majority
+of targets to be defined by user-defined rules); referring to them in a
+qualified way (with module) will always refer to those even if new
+built-in rules are added later (as built-in rules will always be only
+named by a single string).
+
+The following rules are built in. Built-in rules can have a special
+syntax.
+
+`"export"`
+----------
+
+The `"export"` rule evaluates a given target in a specified
+configuration. More precisely, the field `"target"` has to name a single
+target (not a list of targets), the field `"flexible_config"` a list of
+strings, treated as variable names, and the field `"fixed_config"` has
+to be a map that is taken unevaluated. It is a requirement that the
+domain of the `"fixed_config"` and the `"flexible_config"` be disjoint.
+The optional fields `"doc"` and `"config_doc"` can be used to describe
+the target and the `"flexible_config"`, respectively.
+
+To evaluate an `"export"` target, first the configuration is restricted
+to the `"flexible_config"` and then the union with the `"fixed_config"`
+is built. The target specified in `"target"` is then evaluated. It is a
+requirement that this target be untainted. The result is the result of
+this evaluation; artifacts, runfiles, and provides map are forwarded
+unchanged.
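+
+As a sketch (the target names and configuration variables here are made
+up), an `"export"` target might look as follows.
+
+```jsonc
+{ "exported":
+  { "type": "export"
+  , "target": "main internal"
+  , "flexible_config": ["ARCH", "DEBUG"]
+  , "fixed_config": {"USE_FEATURE_X": true}
+  , "doc": ["The main library of this repository."]
+  }
+}
+```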
+
+The main point of the `"export"` rule is that the relevant part of the
+configuration can be determined without having to analyze the target
+itself. This makes such rules eligible for target-level caching
+(provided the content of the repository as well as all reachable ones
+can be determined cheaply). This eligibility is also the reason why it
+is good practice to only depend on `"export"` targets of other
+repositories.
+
+`"install"`
+-----------
+
+The `"install"` rule allows staging artifacts (and runfiles) of other
+targets in a different way. More precisely, a new stage (i.e., map of
+artifacts with keys treated as file names) is constructed in the
+following way.
+
+The runfiles from all targets in the `"deps"` field are taken; the
+`"deps"` field is an evaluated field and has to evaluate to a list of
+targets. It is an error if those runfiles conflict.
+
+The `"files"` argument is a special form. It has to be a map, and the
+keys are taken as paths. The values are evaluated and have to evaluate
+to a single target. That target has to have a single artifact, or no
+artifacts and a single runfile. In this way, `"files"` defines a stage;
+this stage overlays the runfiles of the `"deps"` and conflicts are
+ignored.
+
+Finally, the `"dirs"` argument has to evaluate to a list of pairs (i.e.,
+lists of length two) with the first entry a target name and the
+second entry a string, taken as directory name. For each entry, both
+runfiles and artifacts of the specified target are staged to the
+specified directory. It is an error if a conflict with the stage
+constructed so far occurs.
+
+Both runfiles and artifacts of the `"install"` target are the stage
+just described. An `"install"` target always has an empty provides map.
+Any provided information of the dependencies is discarded.
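+
+A hypothetical `"install"` target combining all three fields might look
+as follows; the target names are made up.
+
+```jsonc
+{ "installed":
+  { "type": "install"
+  , "deps": ["libfoo"]
+  , "files": {"bin/tool": "tool"}
+  , "dirs": [["docs", "share/doc"]]
+  }
+}
+```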
+
+`"generic"`
+-----------
+
+The `"generic"` rule allows defining artifacts as the output of an
+action. This is mainly useful for ad-hoc constructions; for anything
+occurring more often, a proper user-defined rule is usually the better
+choice.
+
+The `"deps"` argument is evaluated and has to evaluate to a list of
+target names. The runfiles and artifacts of these targets form the
+inputs of the action. Conflicts are not an error and resolved by giving
+precedence to the artifacts over the runfiles; conflicts within
+artifacts or runfiles are resolved in a latest-wins fashion using the
+order of the targets in the evaluated `"deps"` argument.
+
+The fields `"cmds"`, `"out_dirs"`, `"outs"`, and `"env"` are evaluated
+fields where `"cmds"`, `"out_dirs"`, and `"outs"` have to evaluate to a
+list of strings, and `"env"` has to evaluate to a map of strings. During
+their evaluation, the functions `"out_dirs"`, `"outs"` and `"runfiles"`
+can be used to access the logical paths of the directories, artifacts
+and runfiles, respectively, of a target specified in `"deps"`. Here,
+`"env"` specifies the environment in which the action is carried out.
+`"out_dirs"` and `"outs"` define the output directories and files,
+respectively, the action has to produce. Since some artifacts are to be
+produced, at least one of `"out_dirs"` or `"outs"` must be a non-empty
+list of strings. It is an error if one or more paths are present in
+both `"out_dirs"` and `"outs"`. Finally, the strings in `"cmds"` are
+extended by a newline character and joined, and the command of the
+action is the interpretation of this string by `sh`.
+
+The artifacts of this target are the outputs (as declared by
+`"out_dirs"` and `"outs"`) of this action. Runfiles and provider map are
+empty.
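+
+For example, a `"generic"` target that concatenates two source files
+might be written as follows.
+
+```jsonc
+{ "combined":
+  { "type": "generic"
+  , "deps": ["part1.txt", "part2.txt"]
+  , "outs": ["combined.txt"]
+  , "cmds": ["cat part1.txt part2.txt > combined.txt"]
+  }
+}
+```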
+
+`"file_gen"`
+------------
+
+The `"file_gen"` rule allows specifying a file with a given content. To
+be able to accurately report the file names of artifacts or runfiles
+of other targets, they can be specified in the field `"deps"` which has
+to evaluate to a list of targets. The names of the artifacts and
+runfiles of a target specified in `"deps"` can be accessed through the
+functions `"outs"` and `"runfiles"`, respectively, during the evaluation
+of the arguments `"name"` and `"data"` which have to evaluate to a
+single string.
+
+Artifacts and runfiles of a `"file_gen"` target are a singleton map with
+key the result of evaluating `"name"` and value a (non-executable) file
+with content the result of evaluating `"data"`. The provides map is
+empty.
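+
+A minimal example, defining a target that provides a single file
+`greeting.txt` with fixed content, might look as follows.
+
+```jsonc
+{ "greeting":
+  { "type": "file_gen"
+  , "name": "greeting.txt"
+  , "data": "Hello World\n"
+  }
+}
+```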
+
+`"tree"`
+--------
+
+The `"tree"` rule allows specifying a tree out of the artifact stage of
+given targets. More precisely, the field `"deps"` has to evaluate
+to a list of targets. For each target, runfiles and artifacts are
+overlaid in an artifacts-win fashion and the union of the resulting
+stages is taken; it is an error if conflicts arise in this way. The
+resulting stage is transformed into a tree. Both artifacts and runfiles
+of the `"tree"` target are a singleton map with the key the result of
+evaluating `"name"` (which has to evaluate to a single string) and value
+that tree.
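+
+For example, the following sketch (with made-up target names) collects
+two targets into a single tree named `include`.
+
+```jsonc
+{ "header tree":
+  { "type": "tree"
+  , "name": "include"
+  , "deps": ["foo headers", "bar headers"]
+  }
+}
+```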
+
+`"configure"`
+-------------
+
+The `"configure"` rule allows configuring a target with a given
+configuration. The field `"target"` is evaluated and the result of the
+evaluation must name a single target (not a list). The `"config"` field
+is evaluated and must result in a map, which is used as configuration
+for the given target.
+
+This rule uses the given configuration to overlay the current
+environment for evaluating the given target, and thereby performs a
+configuration transition. It forwards all results
+(artifacts/runfiles/provides map) of the configured target to the upper
+context. The result of a target that uses this rule is the result of the
+target given in the `"target"` field (the configured target).
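+
+A sketch of a `"configure"` target, setting the hypothetical variable
+`"DEBUG"` for a made-up target `"main"`, might look as follows; here the
+expression `"singleton_map"` is used to construct the configuration map.
+
+```jsonc
+{ "main debug":
+  { "type": "configure"
+  , "target": "main"
+  , "config": {"type": "singleton_map", "key": "DEBUG", "value": true}
+  }
+}
+```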
+
+As a full configuration transition is performed, the same care has to be
+taken when using this rule as when writing a configuration transition in
+a rule. Typically, this rule is used only at a top-level target of a
+project and configures only variables internal to the project. In any
+case, when using non-internal targets as dependencies (i.e., targets
+that a caller of the `"configure"` potentially might use as well), care
+should be taken that those are only used in the initial configuration.
+Such preservation of the configuration is necessary to avoid conflicts,
+if the targets depended upon are visible in the `"configure"` target
+itself, e.g., as link dependency (which almost always happens when
+depending on a library). Even if a non-internal target depended upon is
+not visible in the `"configure"` target itself, requesting it in a
+modified configuration causes additional overhead by increasing the
+target graph and potentially the action graph.
diff --git a/doc/concepts/built-in-rules.org b/doc/concepts/built-in-rules.org
deleted file mode 100644
index 9463b10c..00000000
--- a/doc/concepts/built-in-rules.org
+++ /dev/null
@@ -1,167 +0,0 @@
-* Built-in rules
-
-Targets are defined in ~TARGETS~ files. Each target file is a single
-~JSON~ object. If the target name is contained as a key in that
-object, the corresponding value defines the target; otherwise it is
-implicitly considered a source file. The target definition itself
-is a ~JSON~ object as well. The mandatory key ~"type"~ specifies
-the rule defining the target; the meaning of the remaining keys
-depends on the rule defining the target.
-
-There are a couple of rules built in, all named by a single string.
-The user can define additional rules (and, in fact, we expect the
-majority of targets to be defined by user-defined rules); referring
-to them in a qualified way (with module) will always refer to those
-even if new built-in rules are added later (as built-in rules will
-always be only named by a single string).
-
-The following rules are built in. Built-in rules can have a
-special syntax.
-
-** ~"export"~
-
-The ~"export"~ rule evaluates a given target in a specified
-configuration. More precisely, the field ~"target"~ has to name a single
-target (not a list of targets), the field ~"flexible_config"~ a list
-of strings, treated as variable names, and the field ~"fixed_config"~
-has to be a map that is taken unevaluated. It is a requirement that
-the domain of the ~"fixed_config"~ and the ~"flexible_config"~ be
-disjoint. The optional fields ~"doc"~ and ~"config_doc"~ can be used
-to describe the target and the ~"flexible_config"~, respectively.
-
-To evaluate an ~"export"~ target, first the configuration is
-restricted to the ~"flexible_config"~ and then the union with the
-~"fixed_config"~ is built. The target specified in ~"target"~ is
-then evaluated. It is a requirement that this target be untainted.
-The result is the result of this evaluation; artifacts, runfiles,
-and provides map are forwarded unchanged.
-
-The main point of the ~"export"~ rule is, that the relevant part
-of the configuration can be determined without having to analyze
-the target itself. This makes such rules eligible for target-level
-caching (provided the content of the repository as well as all
-reachable ones can be determined cheaply). This eligibility is also
-the reason why it is good practice to only depend on ~"export"~
-targets of other repositories.
-
-** ~"install"~
-
-The ~"install"~ rules allows to stage artifacts (and runfiles) of
-other targets in a different way. More precisely, a new stage (i.e.,
-map of artifacts with keys treated as file names) is constructed
-in the following way.
-
-The runfiles from all targets in the ~"deps"~ field are taken; the
-~"deps"~ field is an evaluated field and has to evaluate to a list
-of targets. It is an error, if those runfiles conflict.
-
-The ~"files"~ argument is a special form. It has to be a map, and
-the keys are taken as paths. The values are evaluated and have
-to evaluate to a single target. That target has to have a single
-artifact or no artifacts and a single run file. In this way, ~"files"~
-defines a stage; this stage overlays the runfiles of the ~"deps"~
-and conflicts are ignored.
-
-Finally, the ~"dirs"~ argument has to evaluate to a list of
-pairs (i.e., lists of length two) with the first argument a target
-name and the second argument a string, taken as directory name. For
-each entry, both, runfiles and artifacts of the specified target
-are staged to the specified directory. It is an error if a conflict
-with the stage constructed so far occurs.
-
-Both, runfiles and artifacts of the ~"install"~ target are the stage
-just described. An ~"install"~ target always has an empty provides
-map. Any provided information of the dependencies is discarded.
-
-** ~"generic"~
-
-The ~"generic"~ rules allows to define artifacts as the output
-of an action. This is mainly useful for ad-hoc constructions; for
-anything occurring more often, a proper user-defined rule is usually
-the better choice.
-
-The ~"deps"~ argument is evaluated and has to evaluate to a list
-of target names. The runfiles and artifacts of these targets form
-the inputs of the action. Conflicts are not an error and resolved
-by giving precedence to the artifacts over the runfiles; conflicts
-within artifacts or runfiles are resolved in a latest-wins fashion
-using the order of the targets in the evaluated ~"deps"~ argument.
-
-The fields ~"cmds"~, ~"out_dirs"~, ~"outs"~, and ~"env"~ are evaluated
-fields where ~"cmds"~, ~"out_dirs"~, and ~"outs"~ have to evaluate to
-a list of strings, and ~"env"~ has to evaluate to a map of
-strings. During their evaluation, the functions ~"out_dirs"~, ~"outs"~
-and ~"runfiles"~ can be used to access the logical paths of the
-directories, artifacts and runfiles, respectively, of a target
-specified in ~"deps"~. Here, ~"env"~ specifies the environment in
-which the action is carried out. ~"out_dirs"~ and ~"outs"~ define the
-output directories and files, respectively, the action has to
-produce. Since some artifacts are to be produced, at least one of
-~"out_dirs"~ or ~"outs"~ must be a non-empty list of strings. It is an
-error if one or more paths are present in both the ~"out_dirs"~ and
-~"outs"~. Finally, the strings in ~"cmds"~ are extended by a newline
-character and joined, and command of the action is interpreting this
-string by ~sh~.
-
-The artifacts of this target are the outputs (as declared by
-~"out_dirs"~ and ~"outs"~) of this action. Runfiles and provider map
-are empty.
-
-** ~"file_gen"~
-
-The ~"file_gen"~ rule allows to specify a file with a given content.
-To be able to accurately report about file names of artifacts
-or runfiles of other targets, they can be specified in the field
-~"deps"~ which has to evaluate to a list of targets. The names
-of the artifacts and runfiles of a target specified in ~"deps"~
-can be accessed through the functions ~"outs"~ and ~"runfiles"~,
-respectively, during the evaluation of the arguments ~"name"~ and
-~"data"~ which have to evaluate to a single string.
-
-Artifacts and runfiles of a ~"file_gen"~ target are a singleton map
-with key the result of evaluating ~"name"~ and value a (non-executable)
-file with content the result of evaluating ~"data"~. The provides
-map is empty.
-
-** ~"tree"~
-
-The ~"tree"~ rule allows to specify a tree out of the artifact
-stage of given targets. More precisely, the deps field ~"deps"~
-has to evaluate to a list of targets. For each target, runfiles
-and artifacts are overlayed in an artifacts-win fashion and
-the union of the resulting stages is taken; it is an error if conflicts
-arise in this way. The resulting stage is transformed into a tree.
-Both, artifacts and runfiles of the ~"tree"~ target are a singleton map
-with the key the result of evaluating ~"name"~ (which has to evaluate to
-a single string) and value that tree.
-
-
-** ~"configure"~
-
-The ~"configure"~ rule allows to configure a target with a given
-configuration. The field ~"target"~ is evaluated and the result
-of the evaluation must name a single target (not a list). The
-~"config"~ field is evaluated and must result in a map, which is
-used as configuration for the given target.
-
-This rule uses the given configuration to overlay the current environment for
-evaluating the given target, and thereby performs a configuration transition. It
-forwards all results (artifacts/runfiles/provides map) of the configured target
-to the upper context. The result of a target that uses this rule is the result
-of the target given in the ~"target"~ field (the configured target).
-
-As a full configuration transition is performed, the same care has
-to be taken when using this rule as when writing a configuration
-transition in a rule. Typically, this rule is used only at a
-top-level target of a project and configures only variables internally
-to the project. In any case, when using non-internal targets as
-dependencies (i.e., targets that a caller of the ~"configure"~
-potentially might use as well), care should be taken that those
-are only used in the initial configuration. Such preservation of
-the configuration is necessary to avoid conflicts, if the targets
-depended upon are visible in the ~"configure"~ target itself, e.g.,
-as link dependency (which almost always happens when depending on a
-library). Even if a non-internal target depended upon is not visible
-in the ~"configure"~ target itself, requesting it in a modified
-configuration causes additional overhead by increasing the target
-graph and potentially the action graph.
diff --git a/doc/concepts/cache-pragma.md b/doc/concepts/cache-pragma.md
new file mode 100644
index 00000000..858f2b4f
--- /dev/null
+++ b/doc/concepts/cache-pragma.md
@@ -0,0 +1,134 @@
+Action caching pragma
+=====================
+
+Introduction: exit code, build failures, and caching
+----------------------------------------------------
+
+The exit code of a process is used to signal success or failure of that
+process. By convention, 0 indicates success and any other value
+indicates some form of failure.
+
+Our tool expects all build actions to follow this convention. A non-zero
+exit code of a regular build action has two consequences.
+
+ - As the action failed, the whole build is aborted and considered
+ failed.
+ - As such a failed action can never be part of a successful build, it
+ is (effectively) not cached.
+
+This non-caching is achieved by re-requesting an action without cache
+lookup if a failed action is reported from cache.
+
+In particular, for building, we have the property that everything that
+does not lead to aborting the build can (and will) be cached. This
+property is justified as we expect build actions to behave in a
+functional way.
+
+Test and run actions
+--------------------
+
+Tests have a lot of similarity to regular build actions: a process is
+run with given inputs, and the results are processed further (e.g., to
+create reports on test suites). However, they break the above described
+connection between caching and continuation of the build: we expect that
+some tests might be flaky (even though they shouldn't be, of course)
+and hence only want to cache successful tests. Nevertheless, we do want
+to continue testing after the first test failure.
+
+Another breakage of the functionality assumption of actions is "run"
+actions, i.e., local actions that are executed either because of their
+side effect on the host system, or because of their non-deterministic
+results (e.g., monitoring some resource). Those actions should never be
+cached, but if they fail, the build should be aborted.
+
+Tainting
+--------
+
+Targets that, directly or indirectly, depend on non-functional actions
+are not regular targets. They are test targets, run targets, benchmark
+results, etc.; in any case, they are tainted in some way. When adding
+high-level caching of targets, we will only support caching for
+untainted targets.
+
+To make everybody aware of their special nature, they are clearly marked
+as such: tainted targets not generated by a tainted rule (e.g., a test
+rule) have to explicitly state their taintedness in their attributes.
+This declaration also gives a natural way to mark targets that are
+technically pure, but still should be used only in tests, e.g., a mock
+version of a larger library.
+
+Besides being for tests only, there might be other reasons why a target
+might not be fit for general use, e.g., configuration files with
+accounts for developer access, or files under restrictive licences. To
+avoid having to extend the framework for each new use case, we allow
+arbitrary strings as markers for the kind of taintedness of a target. Of
+course, a target can be tainted in more than one way.
+
+More precisely, rules can have `"tainted"` as an additional property.
+Moreover, `"tainted"` is another reserved keyword for target arguments
+(like `"type"` and `"arguments_config"`). In both cases, the value has
+to be a list of strings, and the empty list is assumed, if not
+specified.
+
+A rule is tainted with the set of strings in its `"tainted"` property. A
+target is tainted with the union of the set of strings of its
+`"tainted"` argument and the set of strings its generating rule is
+tainted with.
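+
+For example, a target that is technically pure but intended for use in
+tests only might declare its taintedness as follows (target names made
+up).
+
+```jsonc
+{ "api mock":
+  { "type": "install"
+  , "tainted": ["test"]
+  , "deps": ["mock implementation"]
+  }
+}
+```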
+
+Every target has to be tainted with (at least) the union of what its
+dependencies are tainted with.
+
+For tainted targets, the `analyse`, `build`, and `install` commands
+report the set of strings the target is tainted with.
+
+### `"may_fail"` and `"no_cache"` properties of `"ACTION"`
+
+The `"ACTION"` function in the defining expression of a rule has two
+additional parameters (besides inputs, etc.), `"may_fail"` and
+`"no_cache"`. Those are not evaluated and have to be lists of strings
+(with empty assumed if the respective parameter is not present). Only
+strings the defining rule is tainted with may occur in that list. If the
+list is not empty, the corresponding may-fail or no-cache bit of the
+action is set.
+
+For actions with the `"may_fail"` bit set, the optional parameter
+`"fail_message"` with default value `"action failed"` is evaluated. That
+message will be reported if the action returns a non-zero exit value.
+
+Actions with the no-cache bit set are never cached. If an action with
+the may-fail bit set exits with non-zero exit value, the build is
+continued if the action nevertheless managed to produce all expected
+outputs. We continue to ignore actions with non-zero exit status from
+cache.
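+
+As a sketch, inside the defining expression of a rule tainted with
+`"test"`, such an action might be created as follows (the inputs,
+command, and output names are illustrative).
+
+``` jsonc
+{ "type": "ACTION"
+, "inputs": {"type": "var", "name": "test-inputs"}
+, "outs": ["result", "stdout"]
+, "cmd": ["sh", "./run-test.sh"]
+, "may_fail": ["test"]
+, "no_cache": ["test"]
+, "fail_message": "test suite failed"
+}
+```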
+
+### Marking of failed artifacts
+
+To simplify finding failures in accumulated reports, our tool keeps
+track of artifacts generated by failed actions. More precisely,
+artifacts are considered failed if one of the following conditions
+applies.
+
+ - Artifacts generated by failed actions are failed.
+ - Tree artifacts containing a failed artifact are failed.
+ - Artifacts generated by an action taking a failed artifact as input
+ are failed.
+
+The identifiers used for built artifacts (including trees) remain
+unchanged; in particular, they will only describe the contents, not
+whether they were obtained in a failed way.
+
+When reporting artifacts, e.g., in the log file, an additional marker is
+added to indicate that the artifact is a failed one. After every `build`
+or `install` command, if the requested artifacts contain a failed one, a
+different exit code is returned.
+
+### The `install-cas` subcommand
+
+A typical workflow for testing is to first run the full test suite and
+then only look at the failed tests in more detail. As we don't take
+failed actions from cache, installing the output can't be done by
+rerunning the same target with `install` instead of `build`. Instead, the
+output has to be taken from CAS using the identifier shown in the build
+log. To simplify this workflow, there is the `install-cas` subcommand
+that installs a CAS entry, identified by the identifier as shown in the
+log, to a given location or (if no location is specified) to `stdout`.
diff --git a/doc/concepts/cache-pragma.org b/doc/concepts/cache-pragma.org
deleted file mode 100644
index 11953702..00000000
--- a/doc/concepts/cache-pragma.org
+++ /dev/null
@@ -1,130 +0,0 @@
-* Action caching pragma
-
-** Introduction: exit code, build failures, and caching
-
-The exit code of a process is used to signal success or failure
-of that process. By convention, 0 indicates success and any other
-value indicates some form of failure.
-
-Our tool expects all build actions to follow this convention. A
-non-zero exit code of a regular build action has two consequences.
-- As the action failed, the whole build is aborted and considered failed.
-- As such a failed action can never be part of a successful build,
- it is (effectively) not cached.
-This non-caching is achieved by rerequesting an action without
-cache look up, if a failed action from cache is reported.
-
-In particular, for building, we have the property that everything
-that does not lead to aborting the build can (and will) be cached.
-This property is justified as we expect build actions to behave in
-a functional way.
-
-** Test and run actions
-
-Tests have a lot of similarity to regular build actions: a process is
-run with given inputs, and the results are processed further (e.g.,
-to create reports on test suites). However, they break the above
-described connection between caching and continuation of the
-build: we expect that some tests might be flaky (even though they
-shouldn't be, of course) and hence only want to cache successful
-tests. Nevertheless, we do want to continue testing after the first
-test failure.
-
-Another breakage of the functionality assumption of actions are
-"run" actions, i.e., local actions that are executed either because
-of their side effect on the host system, or because of their
-non-deterministic results (e.g., monitoring some resource). Those
-actions should never be cached, but if they fail, the build should
-be aborted.
-
-** Tainting
-
-Targets that, directly or indirectly, depend on non-functional
-actions are not regular targets. They are test targets, run targets,
-benchmark results, etc; in any case, they are tainted in some way.
-When adding high-level caching of targets, we will only support
-caching for untainted targets.
-
-To make everybody aware of their special nature, they are clearly
-marked as such: tainted targets not generated by a tainted rule (e.g.,
-a test rule) have to explicitly state their taintedness in their
-attributes. This declaration also gives a natural way to mark targets
-that are technically pure, but still should be used only in test,
-e.g., a mock version of a larger library.
-
-Besides being for tests only, there might be other reasons why a
-target might not be fit for general use, e.g., configuration files
-with accounts for developer access, or files under restrictive
-licences. To avoid having to extend the framework for each new
-use case, we allow arbitrary strings as markers for the kind of
-taintedness of a target. Of course, a target can be tainted in more
-than one way.
-
-More precisely, rules can have ~"tainted"~ as an additional
-property. Moreover ~"tainted"~ is another reserved keyword for
-target arguments (like ~"type"~ and ~"arguments_config"~). In both
-cases, the value has to be a list of strings, and the empty list
-is assumed, if not specified.
-
-A rule is tainted with the set of strings in its ~"tainted"~
-property. A target is tainted with the union of the set of strings
-of its ~"tainted"~ argument and the set of strings its generating
-rule is tainted with.
-
-Every target has to be tainted with (at least) the union of what
-its dependencies are tainted with.
-
-For tainted targets, the ~analyse~, ~build~, and ~install~ commands
-report the set of strings the target is tainted with.
-
-*** ~"may_fail"~ and ~"no_cache"~ properties of ~"ACTION"~
-
-The ~"ACTION"~ function in the defining expression of a rule
-have two additional (besides inputs, etc) parameters ~"may_fail"~
-and ~"no_cache"~. Those are not evaluated and have to be lists
-of strings (with empty assumed if the respective parameter is not
-present). Only strings the defining rule is tainted with may occur
-in that list. If the list is not empty, the corresponding may-fail
-or no-cache bit of the action is set.
-
-For actions with the ~"may_fail"~ bit set, the optional parameter
-~"fail_message"~ with default value ~"action failed"~ is evaluated.
-That message will be reported if the action returns a non-zero
-exit value.
-
-Actions with the no-cache bit set are never cached. If an action
-with the may-fail bit set exits with non-zero exit value, the build
-is continued if the action nevertheless managed to produce all
-expected outputs. We continue to ignore actions with non-zero exit
-status from cache.
-
-*** Marking of failed artifacts
-
-To simplify finding failures in accumulated reports, our tool
-keeps track of artifacts generated by failed actions. More
-precisely, artifacts are considered failed if one of the following
-conditions applies.
-- Artifacts generated by failed actions are failed.
-- Tree artifacts containing a failed artifact are failed.
-- Artifacts generated by an action taking a failed artifact as
- input are failed.
-The identifiers used for built artifacts (including trees) remain
-unchanged; in particular, they will only describe the contents and
-not if they were obtained in a failed way.
-
-When reporting artifacts, e.g., in the log file, an additional marker
-is added to indicate that the artifact is a failed one. After every
-~build~ or ~install~ command, if the requested artifacts contain
-failed one, a different exit code is returned.
-
-*** The ~install-cas~ subcommand
-
-A typical workflow for testing is to first run the full test suite
-and then only look at the failed tests in more details. As we don't
-take failed actions from cache, installing the output can't be
-done by rerunning the same target as ~install~ instead of ~build~.
-Instead, the output has to be taken from CAS using the identifier
-shown in the build log. To simplify this workflow, there is the
-~install-cas~ subcommand that installs a CAS entry, identified by
-the identifier as shown in the log to a given location or (if no
-location is specified) to ~stdout~.
diff --git a/doc/concepts/configuration.md b/doc/concepts/configuration.md
new file mode 100644
index 00000000..743ed41e
--- /dev/null
+++ b/doc/concepts/configuration.md
@@ -0,0 +1,115 @@
+Configuration
+=============
+
+Targets describe abstract concepts like "library". Depending on
+requirements, a library might manifest itself in different ways. For
+example,
+
+ - it can be built for various target architectures,
+ - it can have the requirement to produce position-independent code,
+ - it can be a special build for debugging, profiling, etc.
+
+So, a target (like a library described by header files, source files,
+dependencies, etc) has some additional input. As those inputs are
+typically of a global nature (e.g., a profiling build usually wants all
+involved libraries to be built for profiling), this additional input,
+called "configuration", follows the same approach as the `UNIX`
+environment: it is a global collection of key-value pairs, and every
+target picks what it needs.
+
+Top-level configuration
+-----------------------
+
+The configuration is a `JSON` object. The configuration for the target
+requested can be specified on the command line using the `-c` option;
+its argument is a file name and that file is supposed to contain the
+`JSON` object.
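+
+For example, a file passed via `-c` might contain the following object;
+note that the variable names and their meaning are defined by the rules
+in use, not by the tool itself.
+
+``` jsonc
+{"DEBUG": true, "CC": "clang"}
+```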
+
+Propagation
+-----------
+
+Rules and target definitions have to declare which parts of the
+configuration they want to have access to. The (essentially) full
+configuration, however, is passed on to the dependencies; in this way, a
+target not using a part of the configuration can still depend on it, if
+one of its dependencies does.
+
+### Rules configuration and configuration transitions
+
+As part of the definition of a rule, it specifies a set `"config_vars"`
+of variables. During the evaluation of the rule, the configuration
+restricted to those variables (variables unset in the original
+configuration are set to `null`) is used as environment.
+
+Additionally, the rule can request that certain targets be evaluated in
+a modified configuration by specifying `"config_transitions"`
+accordingly. Typically, this is done when a tool is required during the
+build; then this tool has to be built for the architecture on which the
+build is carried out and not the target architecture. Those tools often
+are `"implicit"` dependencies, i.e., dependencies that every target
+defined by that rule has, without the need to specify it in the target
+definition.
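+
+As a sketch, a rule requesting that an implicit tool dependency be built
+for the architecture the build runs on might contain the following
+(field and variable names are illustrative; the precise format of
+transitions is described in the documentation on rules).
+
+``` jsonc
+{ "config_vars": ["ARCH", "HOST_ARCH"]
+, "implicit": {"toolchain": [["@", "toolchain", "", "compiler"]]}
+, "config_transitions":
+  { "toolchain":
+    [ { "type": "singleton_map"
+      , "key": "ARCH"
+      , "value": {"type": "var", "name": "HOST_ARCH"}
+      }
+    ]
+  }
+, "expression": { ... }
+}
+```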
+
+### Target configuration
+
+Additionally (and independently of the configuration-dependency of the
+rule), the target definition itself can depend on the configuration.
+This can happen, if a debug version of a library has additional
+dependencies (e.g., for structured debug logs).
+
+If such a configuration-dependency is needed, the reserved keyword
+`"arguments_config"` is used to specify a set of variables (if unset,
+the empty set is assumed; this should be the usual case). The
+environment in which all arguments of the target definition are
+evaluated is the configuration restricted to those variables (again,
+with values unset in the original configuration set to `null`).
+
+For example, a library where the debug version has an additional
+dependency could look as follows.
+
+``` jsonc
+{ "libfoo":
+ { "type": ["@", "rules", "CC", "library"]
+ , "arguments_config": ["DEBUG"]
+ , "name": ["foo"]
+ , "hdrs": ["foo.hpp"]
+ , "srcs": ["foo.cpp"]
+ , "local defines":
+ { "type": "if"
+ , "cond": {"type": "var", "name": "DEBUG"}
+ , "then": ["DEBUG"]
+ }
+ , "deps":
+ { "type": "++"
+ , "$1":
+ [ ["libbar", "libbaz"]
+ , { "type": "if"
+ , "cond": {"type": "var", "name": "DEBUG"}
+ , "then": ["libdebuglog"]
+ }
+ ]
+ }
+ }
+}
+```
+
+Effective configuration
+-----------------------
+
+A target is influenced by the configuration through
+
+ - the configuration dependency of target definition, as specified in
+ `"arguments_config"`,
+ - the configuration dependency of the underlying rule, as specified in
+ the rule's `"config_vars"` field, and
+ - the configuration dependency of target dependencies, not taking into
+ account values explicitly set by a configuration transition.
+
+Restricting the configuration to this collection of variables yields the
+effective configuration for that target-configuration pair. The
+`--dump-targets` option of the `analyse` subcommand allows one to
+inspect the effective configurations of all involved targets. Due to
+configuration transitions, a target can be analyzed in more than one
+configuration, e.g., if a library is used both for a tool needed during
+the build and for the final binary cross-compiled for a different
+target architecture.
diff --git a/doc/concepts/configuration.org b/doc/concepts/configuration.org
deleted file mode 100644
index 4217d22d..00000000
--- a/doc/concepts/configuration.org
+++ /dev/null
@@ -1,107 +0,0 @@
-* Configuration
-
-Targets describe abstract concepts like "library". Depending on
-requirements, a library might manifest itself in different ways.
-For example,
-- it can be built for various target architectures,
-- it can have the requirement to produce position-independent code,
-- it can be a special build for debugging, profiling, etc.
-
-So, a target (like a library described by header files, source files,
-dependencies, etc) has some additional input. As those inputs are
-typically of a global nature (e.g., a profiling build usually wants
-all involved libraries to be built for profiling), this additional
-input, called "configuration" follows the same approach as the
-~UNIX~ environment: it is a global collection of key-value pairs
-and every target picks, what it needs.
-
-** Top-level configuration
-
-The configuration is a ~JSON~ object. The configuration for the
-target requested can be specified on the command line using the
-~-c~ option; its argument is a file name and that file is supposed
-to contain the ~JSON~ object.
-
-** Propagation
-
-Rules and target definitions have to declare which parts of the
-configuration they want to have access to. The (essentially) full
-configuration, however, is passed on to the dependencies; in this way,
-a target not using a part of the configuration can still depend on
-it, if one of its dependencies does.
-
-*** Rules configuration and configuration transitions
-
-As part of the definition of a rule, it specifies a set ~"config_vars"~
-of variables. During the evaluation of the rule, the configuration
-restricted to those variables (variables unset in the original
-configuration are set to ~null~) is used as environment.
-
-Additionally, the rule can request that certain targets be evaluated
-in a modified configuration by specifying ~"config_transitions"~
-accordingly. Typically, this is done when a tool is required during
-the build; then this tool has to be built for the architecture on
-which the build is carried out and not the target architecture. Those
-tools often are ~"implicit"~ dependencies, i.e., dependencies that
-every target defined by that rule has, without the need to specify
-it in the target definition.
-
-*** Target configuration
-
-Additionally (and independently of the configuration-dependency
-of the rule), the target definition itself can depend on the
-configuration. This can happen, if a debug version of a library
-has additional dependencies (e.g., for structured debug logs).
-
-If such a configuration-dependency is needed, the reserved key
-word ~"arguments_config"~ is used to specify a set of variables (if
-unset, the empty set is assumed; this should be the usual case).
-The environment in which all arguments of the target definition are
-evaluated is the configuration restricted to those variables (again,
-with values unset in the original configuration set to ~null~).
-
-For example, a library where the debug version has an additional
-dependency could look as follows.
-#+BEGIN_SRC
-{ "libfoo":
- { "type": ["@", "rules", "CC", "library"]
- , "arguments_config": ["DEBUG"]
- , "name": ["foo"]
- , "hdrs": ["foo.hpp"]
- , "srcs": ["foo.cpp"]
- , "local defines":
- { "type": "if"
- , "cond": {"type": "var", "name": "DEBUG"}
- , "then": ["DEBUG"]
- }
- , "deps":
- { "type": "++"
- , "$1":
- [ ["libbar", "libbaz"]
- , { "type": "if"
- , "cond": {"type": "var", "name": "DEBUG"}
- , "then": ["libdebuglog"]
- }
- ]
- }
- }
-}
-#+END_SRC
-
-** Effective configuration
-
-A target is influenced by the configuration through
-- the configuration dependency of target definition, as specified
- in ~"arguments_config"~,
-- the configuration dependency of the underlying rule, as specified
- in the rule's ~"config_vars"~ field, and
-- the configuration dependency of target dependencies, not taking
- into account values explicitly set by a configuration transition.
-Restricting the configuration to this collection of variables yields
-the effective configuration for that target-configuration pair.
-The ~--dump-targets~ option of the ~analyse~ subcommand allows to
-inspect the effective configurations of all involved targets. Due to
-configuration transitions, a target can be analyzed in more than one
-configuration, e.g., if a library is used both, for a tool needed
-during the build, as well as for the final binary cross-compiled
-for a different target architecture.
diff --git a/doc/concepts/doc-strings.md b/doc/concepts/doc-strings.md
new file mode 100644
index 00000000..a1a156ac
--- /dev/null
+++ b/doc/concepts/doc-strings.md
@@ -0,0 +1,152 @@
+Documentation of build rules, expressions, etc
+==============================================
+
+Build rules can be of non-trivial complexity. This is especially true
+if several rules have to exist for slightly different use cases, or if
+the rule supports many different fields. Therefore, documentation of the
+rules (and also expressions for the benefit of rule authors) is
+desirable.
+
+Experience shows that documentation that is not versioned together with
+the code it refers to quickly gets out of date, or lost. Therefore, we
+add documentation directly into the respective definitions.
+
+Multi-line strings in JSON
+--------------------------
+
+In JSON, the newline character is encoded specially and not taken
+literally; also, there is no implicit joining of string literals. So,
+in order to also have documentation readable in the JSON representation
+itself, instead of single strings, we take arrays of strings, with the
+understanding that they describe the strings obtained by joining the
+entries with newline characters.
+
+Documentation is optional
+-------------------------
+
+While documentation is highly recommended, it still remains optional.
+Therefore, when in the following we state that a key is for a list or a
+map, it is always implied that it may be absent; in this case, the empty
+array or the empty map is taken as default, respectively.
+
+Rules
+-----
+
+Each rule is described as a JSON object with a fixed set of keys. So
+having fixed keys for documentation does not cause conflicts. More
+precisely, the keys `doc`, `field doc`, `config_doc`, `artifacts_doc`,
+`runfiles_doc`, and `provides_doc` are reserved for documentation. Here,
+`doc` has to be a list of strings describing the rule in general.
+`field doc` has to be a map from (some of) the field names to an array
+of strings, containing additional information on that particular field.
+`config_doc` has to be a map from (some of) the config variables to an
+array of strings describing the respective variable. `artifacts_doc` is
+an array of strings describing the artifacts produced by the rule.
+`runfiles_doc` is an array of strings describing the runfiles produced
+by this rule. Finally, `provides_doc` is a map describing (some of) the
+providers by that rule; as opposed to fields or config variables there
+is no authoritative list of providers given elsewhere in the rule, so it
+is up to the rule author to give an accurate documentation on the
+provided data.
+
+### Example
+
+``` jsonc
+{ "library":
+ { "doc":
+ [ "A C library"
+ , ""
+ , "Define a library that can be used to be statically linked to a"
+ , "binary. To do so, the target can simply be specified in the deps"
+ , "field of a binary; it can also be a dependency of another library"
+ , "and the information is then propagated to the corresponding binary."
+ ]
+ , "string_fields": ["name"]
+ , "target_fields": ["srcs", "hdrs", "private-hdrs", "deps"]
+ , "field_doc":
+ { "name":
+ ["The base name of the library (i.e., the name without the leading lib)."]
+ , "srcs": ["The source files (i.e., *.c files) of the library."]
+ , "hdrs":
+ [ "The public header files of this library. Targets depending on"
+ , "this library will have access to those header files"
+ ]
+ , "private-hdrs":
+ [ "Additional internal header files that are used when compiling"
+ , "the source files. Targets depending on this library have no access"
+ , "to those header files."
+ ]
+ , "deps":
+ [ "Any other libraries that this library uses. The dependency is"
+ , "also propagated (via the link-deps provider) to any consumers of"
+ , "this target. So only direct dependencies should be declared."
+ ]
+ }
+ , "config_vars": ["CC"]
+ , "config_doc":
+ { "CC":
+ [ "single string. defaulting to \"cc\", specifying the compiler"
+ , "to be used. The compiler is also used to launch the preprocessor."
+ ]
+ }
+ , "artifacts_doc":
+ ["The actual library (libname.a) staged in the specified directory"]
+ , "runfiles_doc": ["The public headers of this library"]
+ , "provides_doc":
+ { "compile-deps":
+ [ "Map of artifacts specifying any additional files that, besides the runfiles,"
+ , "have to be present in compile actions of targets depending on this library"
+ ]
+ , "link-deps":
+ [ "Map of artifacts specifying any additional files that, besides the artifacts,"
+ , "have to be present in a link actions of targets depending on this library"
+ ]
+ , "link-args":
+ [ "List of strings that have to be added to the command line for linking actions"
+ , "in targets depending on this library"
+ ]
+ }
+ , "expression": { ... }
+ }
+}
+```
+
+Expressions
+-----------
+
+Expressions are also described by a JSON object with a fixed set of
+keys. Here we use the keys `doc` and `vars_doc` for documentation, where
+`doc` is an array of strings describing the expression as a whole and
+`vars_doc` is a map from (some of) the `vars` to an array of strings
+describing this variable.
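+
+For example, a documented expression might look as follows (with the
+expression body elided).
+
+``` jsonc
+{ "vars": ["name", "srcs"]
+, "doc": ["Compute the objects for a library with the given name."]
+, "vars_doc":
+  { "name": ["The base name of the library."]
+  , "srcs": ["The source files to compile."]
+  }
+, "expression": { ... }
+}
+```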
+
+Export targets
+--------------
+
+As export targets play the role of interfaces between repositories, it
+is important that they be documented as well. Again, export targets are
+described as a JSON object with a fixed set of keys, and we use the keys
+`doc` and `config_doc` for documentation. Here `doc` is an array of
+strings describing the target in general and `config_doc` is a map
+from (some of) the variables of the `flexible_config` to an array of
+strings describing this parameter.
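+
+For example, a documented export target might look as follows.
+
+``` jsonc
+{ "type": "export"
+, "target": ["./", "src", "libfoo"]
+, "flexible_config": ["CC", "DEBUG"]
+, "doc": ["The foo library, the main product of this repository."]
+, "config_doc":
+  {"DEBUG": ["Whether to provide the debug version of the library."]}
+}
+```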
+
+Presentation of the documentation
+---------------------------------
+
+As all documentation is just values (that need not be evaluated) in
+JSON objects, it is easy to write tools rendering documentation pages for
+rules, etc, and we expect those tools to be written independently.
+Nevertheless, for the benefit of developers using rules from git-tree
+roots that might not be checked out, there is a subcommand `describe`
+which takes a target specification like the `analyse` command, looks up
+the corresponding rule and describes it fully, i.e., prints in
+human-readable form
+
+ - the documentation for the rule
+ - all the fields available for that rule together with
+ - their type (`string_field`, `target_field`, etc), and
+ - their documentation,
+ - all the configuration variables of the rule with their documentation
+ (if given), and
+ - the documented providers.
diff --git a/doc/concepts/doc-strings.org b/doc/concepts/doc-strings.org
deleted file mode 100644
index d9a94dc5..00000000
--- a/doc/concepts/doc-strings.org
+++ /dev/null
@@ -1,145 +0,0 @@
-* Documentation of build rules, expressions, etc
-
-Build rules can obtain a non-trivial complexity. This is especially
-true if several rules have to exist for slightly different use
-cases, or if the rule supports many different fields. Therefore,
-documentation of the rules (and also expressions for the benefit
-of rule authors) is desirable.
-
-Experience shows that documentation that is not versioned together with
-the code it refers to quickly gets out of date, or lost. Therefore,
-we add documentation directly into the respective definitions.
-
-** Multi-line strings in JSON
-
-In JSON, the newline character is encoded specially and not taken
-literally; also, there is not implicit joining of string literals.
-So, in order to also have documentation readable in the JSON
-representation itself, instead of single strings, we take arrays
-of strings, with the understanding that they describe the strings
-obtained by joining the entries with newline characters.
-
-** Documentation is optional
-
-While documentation is highly recommended, it still remains optional.
-Therefore, when in the following we state that a key is for a list
-or a map, it is always implied that it may be absent; in this case,
-the empty array or the empty map is taken as default, respectively.
-
-** Rules
-
-Each rule is described as a JSON object with a fixed set of keys.
-So having fixed keys for documentation does not cause conflicts.
-More precisely, the keys ~doc~, ~field doc~, ~config_doc~,
-~artifacts_doc~, ~runfiles_doc~, and ~provides_doc~
-are reserved for documentation. Here, ~doc~ has to be a list of
-strings describing the rule in general. ~field doc~ has to be a map
-from (some of) the field names to an array of strings, containing
-additional information on that particular field. ~config_doc~ has
-to be a map from (some of) the config variables to an array of
-strings describing the respective variable. ~artifacts_doc~ is
-an array of strings describing the artifacts produced by the rule.
-~runfiles_doc~ is an array of strings describing the runfiles produced
-by this rule. Finally, ~provides_doc~ is a map describing (some
-of) the providers by that rule; as opposed to fields or config
-variables there is no authoritative list of providers given elsewhere
-in the rule, so it is up to the rule author to give an accurate
-documentation on the provided data.
-
-*** Example
-
-#+BEGIN_SRC
-{ "library":
- { "doc":
- [ "A C library"
- , ""
- , "Define a library that can be used to be statically linked to a"
- , "binary. To do so, the target can simply be specified in the deps"
- , "field of a binary; it can also be a dependency of another library"
- , "and the information is then propagated to the corresponding binary."
- ]
- , "string_fields": ["name"]
- , "target_fields": ["srcs", "hdrs", "private-hdrs", "deps"]
- , "field_doc":
- { "name":
- ["The base name of the library (i.e., the name without the leading lib)."]
- , "srcs": ["The source files (i.e., *.c files) of the library."]
- , "hdrs":
- [ "The public header files of this library. Targets depending on"
- , "this library will have access to those header files"
- ]
- , "private-hdrs":
- [ "Additional internal header files that are used when compiling"
- , "the source files. Targets depending on this library have no access"
- , "to those header files."
- ]
- , "deps":
- [ "Any other libraries that this library uses. The dependency is"
- , "also propagated (via the link-deps provider) to any consumers of"
- , "this target. So only direct dependencies should be declared."
- ]
- }
- , "config_vars": ["CC"]
- , "config_doc":
- { "CC":
- [ "single string. defaulting to \"cc\", specifying the compiler"
- , "to be used. The compiler is also used to launch the preprocessor."
- ]
- }
- , "artifacts_doc":
- ["The actual library (libname.a) staged in the specified directory"]
- , "runfiles_doc": ["The public headers of this library"]
- , "provides_doc":
- { "compile-deps":
- [ "Map of artifacts specifying any additional files that, besides the runfiles,"
- , "have to be present in compile actions of targets depending on this library"
- ]
- , "link-deps":
- [ "Map of artifacts specifying any additional files that, besides the artifacts,"
- , "have to be present in a link actions of targets depending on this library"
- ]
- , "link-args":
- [ "List of strings that have to be added to the command line for linking actions"
- , "in targets depending on this library"
- ]
- }
- , "expression": { ... }
- }
-}
-#+END_SRC
-
-** Expressions
-
-Expressions are also described by a JSON object with a fixed set of
-keys. Here we use the keys ~doc~ and ~vars_doc~ for documentation,
-where ~doc~ is an array of strings describing the expression as a
-whole and ~vars_doc~ is a map from (some of) the ~vars~ to an array
-of strings describing this variable.
-
-** Export targets
-
-As export targets play the role of interfaces between repositories,
-it is important that they be documented as well. Again, export targets
-are described as a JSON object with fixed set of keys amd we use
-the keys ~doc~ and ~config_doc~ for documentation. Here ~doc~ is an
-array of strings describing the targeted in general and ~config_doc~
-is a map from (some of) the variables of the ~flexible_config~ to
-an array of strings describing this parameter.
-
-** Presentation of the documentation
-
-As all documentation are just values (that need not be evaluated)
-in JSON objects, it is easy to write tool rendering documentation
-pages for rules, etc, and we expect those tools to be written
-independently. Nevertheless, for the benefit of developers using
-rules from a git-tree roots that might not be checked out, there is
-a subcommand ~describe~ which takes a target specification like the
-~analyze~ command, looks up the corresponding rule and describes
-it fully, i.e., prints in human-readable form
-- the documentation for the rule
-- all the fields available for that rule together with
- - their type (~string_field~, ~target_field~, etc), and
- - their documentation,
-- all the configuration variables of the rule with their
- documentation (if given), and
-- the documented providers.
diff --git a/doc/concepts/expressions.md b/doc/concepts/expressions.md
new file mode 100644
index 00000000..9e8a8f36
--- /dev/null
+++ b/doc/concepts/expressions.md
@@ -0,0 +1,368 @@
+Expression language
+===================
+
+At various places, in particular in order to define a rule, we need a
+restricted form of functional computation. This is achieved by our
+expression language.
+
+Syntax
+------
+
+All expressions are given by JSON values. One can think of expressions
+as abstract syntax trees serialized to JSON; nevertheless, the precise
+semantics is given by the evaluation mechanism described later.
+
+Semantic Values
+---------------
+
+Expressions evaluate to semantic values. Semantic values are JSON values
+extended by additional atomic values for build-internal values like
+artifacts, names, etc.
+
+### Truth
+
+Every value can be treated as a boolean condition. We follow a
+convention similar to `LISP`, considering everything that is not
+empty to be true. More precisely, the values
+
+ - `null`,
+ - `false`,
+ - `0`,
+ - `""`,
+ - the empty map, and
+ - the empty list
+
+are considered logically false. All other values are logically true.
+
+Evaluation
+----------
+
+The evaluation follows a strict, functional, call-by-value evaluation
+mechanism; the precise evaluation is as follows.
+
+ - Atomic values (`null`, booleans, strings, numbers) evaluate to
+ themselves.
+ - For lists, each entry is evaluated in the order they occur in the
+ list; the result of the evaluation is the list of the results.
+ - For JSON objects (which can be understood as maps, or dicts), the key
+ `"type"` has to be present and has to be a literal string. That
+ string determines the syntactical construct (sloppily also referred
+ to as "function") the object represents, and the remaining
+ evaluation depends on the syntactical construct. The syntactical
+ construct has to be either one of the built-in ones or a special
+ function available in the given context (e.g., `"ACTION"` within the
+ expression defining a rule).
+
+All evaluation happens in an "environment", which is a map from strings
+to semantic values.
+
+### Built-in syntactical constructs
+
+#### Special forms
+
+##### Variables: `"var"`
+
+There has to be a key `"name"`; its value (i.e., the expression in the
+object at that key) has to be a literal string, taken as
+variable name. If the variable name is in the domain of the
+environment and the value of the environment at the variable
+name is non-`null`, then the result of the evaluation is the
+value of the variable in the environment.
+
+Otherwise, the key `"default"` is taken (if present, otherwise
+the value `null` is taken as default for `"default"`) and
+evaluated. The value obtained this way is the result of the
+evaluation.
+
+##### Sequential binding: `"let*"`
+
+The key `"bindings"` (default `[]`) has to be (syntactically) a
+list of pairs (i.e., lists of length two) with the first
+component a literal string.
+
+For each pair in `"bindings"` the second component is evaluated,
+in the order the pairs occur. After each evaluation, a new
+environment is taken for the subsequent evaluations; the new
+environment is like the old one but amended at the position
+given by the first component of the pair to now map to the value
+just obtained.
+
+Finally, the `"body"` is evaluated in the final environment
+(after evaluating all binding entries) and the result of
+evaluating the `"body"` is the value for the whole `"let*"`
+expression.
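+
+For example, the following expression evaluates to `["foo", "bar"]`;
+note that the second binding already sees the first one.
+
+``` jsonc
+{ "type": "let*"
+, "bindings":
+  [ ["x", ["foo"]]
+  , ["x", {"type": "++", "$1": [{"type": "var", "name": "x"}, ["bar"]]}]
+  ]
+, "body": {"type": "var", "name": "x"}
+}
+```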
+
+##### Environment Map: `"env"`
+
+Creates a map from selected environment variables.
+
+The key `"vars"` (default `[]`) has to be a list of literal
+strings referring to the variable names that should be included
+in the produced map. This field is not evaluated. This
+expression is only for convenience and does not add expressive
+power. It is equivalent to, but a lot shorter than, multiple
+`singleton_map` expressions combined with `map_union`.
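+
+For example, `{"type": "env", "vars": ["CC", "CFLAGS"]}` evaluates to a
+map with keys `"CC"` and `"CFLAGS"`, each mapped to the value the
+respective variable has in the current environment.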
+
+##### Conditionals
+
+###### Binary conditional: `"if"`
+
+First the key `"cond"` is evaluated. If it evaluates to a
+value that is logically true, then the key `"then"` is
+evaluated and its value is the result of the evaluation.
+Otherwise, the key `"else"` (if present, otherwise `[]` is
+taken as default) is evaluated and the obtained value is the
+result of the evaluation.
+
+###### Sequential conditional: `"cond"`
+
+The key `"cond"` has to be a list of pairs. In the order of
+the list, the first components of the pairs are evaluated,
+until one evaluates to a value that is logically true. For
+that pair, the second component is evaluated and the result
+of this evaluation is the result of the `"cond"` expression.
+
+If all first components evaluate to a value that is
+logically false, the result of the expression is the result
+of evaluating the key `"default"` (defaulting to `[]`).
+
+###### String case distinction: `"case"`
+
+If the key `"case"` is present, it has to be a map (an
+"object", in JSON's terminology). In this case, the key
+`"expr"` is evaluated; it has to evaluate to a string. If
+the value is a key in the `"case"` map, the expression at
+this key is evaluated and the result of that evaluation is
+the value for the `"case"` expression.
+
+Otherwise (i.e., if `"case"` is absent or `"expr"` evaluates
+to a string that is not a key in `"case"`), the key
+`"default"` (with default `[]`) is evaluated and this gives
+the result of the `"case"` expression.
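+
+For example, the following expression might select linker flags
+depending on an (illustrative) variable `OS`.
+
+``` jsonc
+{ "type": "case"
+, "expr": {"type": "var", "name": "OS"}
+, "case": {"linux": ["-ldl"], "darwin": []}
+, "default": {"type": "fail", "msg": "unsupported operating system"}
+}
+```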
+
+###### Sequential case distinction on arbitrary values: `"case*"`
+
+If the key `"case"` is present, it has to be a list of
+pairs. In this case, the key `"expr"` is evaluated. It is an
+error if that evaluates to a name-containing value. The
+result of that evaluation is sequentially compared to the
+evaluation of the first components of the `"case"` list
+until an equal value is found. In this case, the evaluation
+of the second component of the pair is the value of the
+`"case*"` expression.
+
+If the `"case"` key is absent, or no equality is found, the
+result of the `"case*"` expression is the result of
+evaluating the `"default"` key (with default `[]`).
+
+##### Conjunction and disjunction: `"and"` and `"or"`
+
+For conjunction, if the key `"$1"` (with default `[]`) is
+syntactically a list, its entries are sequentially evaluated
+until a logically false value is found; in that case, the result
+is `false`, otherwise `true`. If the key `"$1"` has a different
+shape, it is evaluated and has to evaluate to a list. The result
+is the conjunction of the logical values of the entries. In
+particular, `{"type": "and"}` evaluates to `true`.
+
+For disjunction, the evaluation mechanism is the same, but the
+truth values and connective are taken dually. So, `"and"` and
+`"or"` are logical conjunction and disjunction, respectively,
+using short-cut evaluation if syntactically admissible (i.e., if
+the argument is syntactically a list).
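+
+For example, `{"type": "or", "$1": [[], "foo"]}` evaluates to `true`:
+the empty list is logically false, so evaluation continues, and the
+non-empty string `"foo"` is logically true.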
+
+##### Mapping
+
+###### Mapping over lists: `"foreach"`
+
+First the key `"range"` is evaluated and has to evaluate to
+a list. For each entry of this list, the expression `"body"`
+is evaluated in an environment that is obtained from the
+original one by setting the value for the variable specified
+at the key `"var"` (which has to be a literal string,
+default `"_"`) to that value. The result is the list of
+those evaluation results.
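+
+For example, the following expression evaluates to `["foo.o", "bar.o"]`.
+
+``` jsonc
+{ "type": "foreach"
+, "var": "src"
+, "range": ["foo.c", "bar.c"]
+, "body":
+  {"type": "change_ending", "$1": {"type": "var", "name": "src"}, "ending": ".o"}
+}
+```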
+
+###### Mapping over maps: `"foreach_map"`
+
+Here, `"range"` has to evaluate to a map. For each entry (in
+lexicographic order (according to native byte order) by
+keys), the expression `"body"` is evaluated in an
+environment obtained from the original one by setting the
+variables specified at `"var_key"` and `"var_val"` (literal
+strings, default values `"_"` and `"$_"`, respectively). The
+result of the evaluation is the list of those values.
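+
+For example, the following expression evaluates to the list of keys
+`["bar.txt", "foo.txt"]` (in lexicographic order); note that, as JSON
+objects without `"type"` are not valid expressions, the range is
+constructed via `map_union`.
+
+``` jsonc
+{ "type": "foreach_map"
+, "var_key": "name"
+, "var_val": "content"
+, "range":
+  { "type": "map_union"
+  , "$1":
+    [ {"type": "singleton_map", "key": "foo.txt", "value": "1"}
+    , {"type": "singleton_map", "key": "bar.txt", "value": "2"}
+    ]
+  }
+, "body": {"type": "var", "name": "name"}
+}
+```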
+
+##### Folding: `"foldl"`
+
+The key `"range"` is evaluated and has to evaluate to a list.
+Starting from the result of evaluating `"start"` (default `[]`)
+a new value is obtained for each entry of the range list by
+evaluating `"body"` in an environment obtained from the original
+by binding the variable specified by `"var"` (literal string,
+default `"_"`) to the list entry and the variable specified by
+`"accum_var"` (literal string, default value `"$1"`) to the old
+value. The result is the last value obtained.
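+
+For example, the following expression concatenates a list of lists,
+evaluating to `["a", "b", "c"]`.
+
+``` jsonc
+{ "type": "foldl"
+, "range": [["a"], ["b"], ["c"]]
+, "start": []
+, "var": "x"
+, "accum_var": "acc"
+, "body":
+  { "type": "++"
+  , "$1": [{"type": "var", "name": "acc"}, {"type": "var", "name": "x"}]
+  }
+}
+```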
+
+#### Regular functions
+
+First `"$1"` is evaluated; for binary functions `"$2"` is evaluated
+next. For functions that accept keyword arguments, those are
+evaluated as well. Finally the function is applied to this (or
+those) argument(s) to obtain the final result.
+
+##### Unary functions
+
+ - `"nub_right"` The argument has to be a list. It is an error
+ if that list contains (directly or indirectly) a name. The
+ result is the input list, except that for all duplicate
+    values, all but the rightmost occurrence are removed.
+
+ - `"basename"` The argument has to be a string. This string is
+ interpreted as a path, and the file name thereof is
+ returned.
+
+ - `"keys"` The argument has to be a map. The result is the
+ list of keys of this map, in lexicographical order
+ (according to native byte order).
+
+ - `"values"` The argument has to be a map. The result are the
+ values of that map, ordered by the corresponding keys
+ (lexicographically according to native byte order).
+
+ - `"range"` The argument is interpreted as a non-negative
+ integer as follows. Non-negative numbers are rounded to the
+ nearest integer; strings have to be the decimal
+ representation of an integer; everything else is considered
+ zero. The result is a list of the given length, consisting
+ of the decimal representations of the first non-negative
+ integers. For example, `{"type": "range",
+ "$1": "3"}` evaluates to `["0", "1", "2"]`.
+
+ - `"enumerate"` The argument has to be a list. The result is a
+ map containing one entry for each element of the list. The
+ key is the decimal representation of the position in the
+ list (starting from `0`), padded with leading zeros to
+ length at least 10. The value is the element. The padding is
+ chosen in such a way that iterating over the resulting map
+ (which happens in lexicographic order of the keys) has the
+ same iteration order as the list for all lists indexable by
+ 32-bit integers.
+
+ - `"++"` The argument has to be a list of lists. The result is
+ the concatenation of those lists.
+
+ - `"map_union"` The argument has to be a list of maps. The
+ result is a map containing as keys the union of the keys of
+ the maps in that list. For each key, the value is the value
+ of that key in the last map in the list that contains that
+    key; see the example after this list.
+
+ - `"join_cmd"` The argument has to be a list of strings. A
+ single string is returned that quotes the original vector in
+ a way understandable by a POSIX shell. As the command for an
+ action is directly given by an argument vector, `"join_cmd"`
+ is typically only used for generated scripts.
+
+ - `"json_encode"` The result is a single string that is the
+ canonical JSON encoding of the argument (with minimal white
+ space); all atomic values that are not part of JSON (i.e.,
+ the added atomic values to represent build-internal values)
+ are serialized as `null`.
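+
+For example, the following `map_union` (using `singleton_map`, described
+below) evaluates to a map sending `"a"` to `"2"` and `"b"` to `"3"`, as
+for duplicate keys the last map containing the key wins.
+
+``` jsonc
+{ "type": "map_union"
+, "$1":
+  [ {"type": "singleton_map", "key": "a", "value": "1"}
+  , {"type": "singleton_map", "key": "a", "value": "2"}
+  , {"type": "singleton_map", "key": "b", "value": "3"}
+  ]
+}
+```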
+
+##### Unary functions with keyword arguments
+
+ - `"change_ending"` The argument has to be a string,
+    interpreted as a path. The ending is replaced by the value of
+ the keyword argument `"ending"` (a string, default `""`).
+ For example, `{"type":
+ "change_ending", "$1": "foo/bar.c", "ending": ".o"}`
+ evaluates to `"foo/bar.o"`.
+
+ - `"join"` The argument has to be a list of strings. The
+ return value is the concatenation of those strings,
+    separated by the specified `"separator"` (a string,
+    default `""`).
+
+  - `"escape_chars"` Prefix, in the argument, every character
+    occurring in `"chars"` (a string, default `""`) by
+    `"escape_prefix"` (a string, default `"\"`); see the example
+    after this list.
+
+ - `"to_subdir"` The argument has to be a map (not necessarily
+ of artifacts). The keys as well as the `"subdir"` (string,
+ default `"."`) argument are interpreted as paths and keys
+ are replaced by the path concatenation of those two paths.
+ If the optional argument `"flat"` (default `false`)
+ evaluates to a true value, the keys are instead replaced by
+ the path concatenation of the `"subdir"` argument and the
+ base name of the old key. It is an error if conflicts occur
+ in this way; in case of such a user error, the argument
+ `"msg"` is also evaluated and the result of that evaluation
+ reported in the error message. Note that conflicts can also
+ occur in non-flat staging if two keys are different as
+ strings, but name the same path (like `"foo.txt"` and
+ `"./foo.txt"`), and are assigned different values. It also
+ is an error if the values for keys in conflicting positions
+ are name-containing.
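+
+For example, the following `escape_chars` expression evaluates to the
+string `foo\ bar`, i.e., the space is prefixed by a backslash (in the
+JSON source, `"\\"` denotes a single backslash).
+
+``` jsonc
+{ "type": "escape_chars"
+, "$1": "foo bar"
+, "chars": " "
+, "escape_prefix": "\\"
+}
+```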
+
+##### Binary functions
+
+  - `"=="` The result is `true` if the arguments are equal,
+    `false` otherwise. It is an error if one of the arguments
+    is a name-containing value; see the example after this list.
+
+ - `"concat_target_name"` This function is only present to
+ simplify transitions from some other build systems and
+ normally not used outside code generated by transition
+ tools. The second argument has to be a string or a list of
+    strings (in the latter case, it is treated as the string
+    obtained by concatenating the entries). If the first argument is a
+ string, the result is the concatenation of those two
+ strings. If the first argument is a list of strings, the
+ result is that list with the second argument concatenated to
+ the last entry of that list (if any).
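+
+For example, `{"type": "==", "$1": {"type": "var", "name": "OS"}, "$2":
+"linux"}` evaluates to `true` precisely if the variable `OS` is set to
+the string `"linux"` in the current environment.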
+
+##### Other functions
+
+ - `"empty_map"` This function takes no arguments and always
+ returns an empty map.
+
+ - `"singleton_map"` This function takes two keyword arguments,
+ `"key"` and `"value"` and returns a map with one entry,
+ mapping the given key to the given value.
+
+ - `"lookup"` This function takes two keyword arguments,
+ `"key"` and `"map"`. The `"key"` argument has to evaluate to
+ a string and the `"map"` argument has to evaluate to a map.
+ If that map contains the given key and the corresponding
+ value is non-`null`, the value is returned. Otherwise the
+ `"default"` argument (with default `null`) is evaluated and
+ returned.
+
+#### Constructs related to reporting of user errors
+
+Normally, if an error occurs during the evaluation, the error is
+reported together with a stack trace. This, however, might not be
+the most informative way to present a problem to the user,
+especially if the underlying problem is a proper user error, e.g.,
+in rule usage (leaving out mandatory arguments, violating semantic
+prerequisites, etc). To allow proper error reporting, the following
+functions are available. All of them have an optional argument
+`"msg"` that is evaluated (only) in case of error and the result of
+that evaluation included in the error message presented to the user.
+
+ - `"fail"` Evaluation of this function unconditionally fails.
+
+ - `"context"` This function is only there to provide additional
+    information in case of error. Otherwise it is the identity
+ function (a unary function, i.e., the result of the evaluation
+ is the result of evaluating the argument `"$1"`).
+
+ - `"assert_non_empty"` Evaluate the argument (given by the
+ parameter `"$1"`). If it evaluates to a non-empty string, map,
+ or list, return the result of the evaluation. Otherwise fail.
+
+ - `"disjoint_map_union"` Like `"map_union"` but it is an error, if
+ two (or more) maps contain the same key, but map it to different
+ values. It is also an error if the argument is a name-containing
+ value.
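+
+As a sketch, a rule author might combine these constructs as follows to
+report a missing field with a helpful message (the field name is
+illustrative).
+
+``` jsonc
+{ "type": "context"
+, "msg": "while computing the library objects"
+, "$1":
+  { "type": "assert_non_empty"
+  , "msg": "The field \"name\" must not be empty"
+  , "$1": {"type": "var", "name": "name"}
+  }
+}
+```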
diff --git a/doc/concepts/expressions.org b/doc/concepts/expressions.org
deleted file mode 100644
index ac66e878..00000000
--- a/doc/concepts/expressions.org
+++ /dev/null
@@ -1,344 +0,0 @@
-* Expression language
-
-At various places, in particular in order to define a rule, we need
-a restricted form of functional computation. This is achieved by
-our expression language.
-
-** Syntax
-
-All expressions are given by JSON values. One can think of expressions
-as abstract syntax trees serialized to JSON; nevertheless, the precise
-semantics is given by the evaluation mechanism described later.
-
-** Semantic Values
-
-Expressions evaluate to semantic values. Semantic values are JSON
-values extended by additional atomic values for build-internal
-values like artifacts, names, etc.
-
-*** Truth
-
-Every value can be treated as a boolean condition. We follow a
-convention similar to ~LISP~ considering everything true that is
-not empty. More precisely, the values
-- ~null~,
-- ~false~,
-- ~0~,
-- ~""~,
-- the empty map, and
-- the empty list
-are considered logically false. All other values are logically true.
-
-** Evaluation
-
-The evaluation follows a strict, functional, call-by-value evaluation
-mechanism; the precise evaluation is as follows.
-
-- Atomic values (~null~, booleans, strings, numbers) evaluate to
- themselves.
-- For lists, each entry is evaluated in the order they occur in the
- list; the result of the evaluation is the list of the results.
-- For JSON objects (wich can be understood as maps, or dicts), the
- key ~"type"~ has to be present and has to be a literal string.
- That string determines the syntactical construct (sloppily also
- referred to as "function") the object represents, and the remaining
- evaluation depends on the syntactical construct. The syntactical
- construct has to be either one of the built-in ones or a special
- function available in the given context (e.g., ~"ACTION"~ within
- the expression defining a rule).
-
-All evaluation happens in an "environment" which is a map from
-strings to semantic values.
-
-*** Built-in syntactical constructs
-
-**** Special forms
-
-***** Variables: ~"var"~
-
-There has to be a key ~"name"~ that (i.e., the expression in the
-object at that key) has to be a literal string, taken as variable
-name. If the variable name is in the domain of the environment and
-the value of the environment at the variable name is non-~null~,
-then the result of the evaluation is the value of the variable in
-the environment.
-
-Otherwise, the key ~"default"~ is taken (if present, otherwise the
-value ~null~ is taken as default for ~"default"~) and evaluated.
-The value obtained this way is the result of the evaluation.
-
-***** Sequential binding: ~"let*"~
-
-The key ~"bindings"~ (default ~[]~) has to be (syntactically) a
-list of pairs (i.e., lists of length two) with the first component
-a literal string.
-
-For each pair in ~"bindings"~ the second component is evaluated, in
-the order the pairs occur. After each evaluation, a new environment
-is taken for the subsequent evaluations; the new environment is
-like the old one but amended at the position given by the first
-component of the pair to now map to the value just obtained.
-
-Finally, the ~"body"~ is evaluated in the final environment (after
-evaluating all binding entries) and the result of evaluating the
-~"body"~ is the value for the whole ~"let*"~ expression.
-
-***** Environment Map: ~"env"~
-
-Creates a map from selected environment variables.
-
-The key ~"vars"~ (default ~[]~) has to be a list of literal strings referring to
-the variable names that should be included in the produced map. This field is
-not evaluated. This expression is only for convenience and does not give new
-expression power. It is equivalent but lot shorter to multiple ~singleton_map~
-expressions combined with ~map_union~.
-
-***** Conditionals
-
-****** Binary conditional: ~"if"~
-
-First the key ~"cond"~ is evaluated. If it evaluates to a value that
-is logically true, then the key ~"then"~ is evaluated and its value
-is the result of the evaluation. Otherwise, the key ~"else"~ (if
-present, otherwise ~[]~ is taken as default) is evaluated and the
-obtained value is the result of the evaluation.
-
-****** Sequential conditional: ~"cond"~
-
-The key ~"cond"~ has to be a list of pairs. In the order of the
-list, the first components of the pairs are evaluated, until one
-evaluates to a value that is logically true. For that pair, the
-second component is evaluated and the result of this evaluation is
-the result of the ~"cond"~ expression.
-
-If all first components evaluate to a value that is logically false,
-the result of the expression is the result of evaluating the key
-~"default"~ (defaulting to ~[]~).
-
-****** String case distinction: ~"case"~
-
-If the key ~"case"~ is present, it has to be a map (an "object", in
-JSON's terminology). In this case, the key ~"expr"~ is evaluated; it
-has to evaluate to a string. If the value is a key in the ~"case"~
-map, the expression at this key is evaluated and the result of that
-evaluation is the value for the ~"case"~ expression.
-
-Otherwise (i.e., if ~"case"~ is absent or ~"expr"~ evaluates to a
-string that is not a key in ~"case"~), the key ~"default"~ (with
-default ~[]~) is evaluated and this gives the result of the ~"case"~
-expression.
-
-****** Sequential case distinction on arbitrary values: ~"case*"~
-
-If the key ~"case"~ is present, it has to be a list of pairs. In this
-case, the key ~"expr"~ is evaluated. It is an error if that evaluates
-to a name-containing value. The result of that evaluation
-is sequentially compared to the evaluation of the first components
-of the ~"case"~ list until an equal value is found. In this case,
-the evaluation of the second component of the pair is the value of
-the ~"case*"~ expression.
-
-If the ~"case"~ key is absent, or no equality is found, the result of
-the ~"case*"~ expression is the result of evaluating the ~"default"~
-key (with default ~[]~).
-
-***** Conjunction and disjunction: ~"and"~ and ~"or"~
-
-For conjunction, if the key ~"$1"~ (with default ~[]~) is syntactically
-a list, its entries are sequentially evaluated until a logically
-false value is found; in that case, the result is ~false~, otherwise
-true. If the key ~"$1"~ has a different shape, it is evaluated and
-has to evaluate to a list. The result is the conjunction of the
-logical values of the entries. In particular, ~{"type": "and"}~
-evaluates to ~true~.
-
-For disjunction, the evaluation mechanism is the same, but the truth
-values and connective are taken dually. So, ~"and"~ and ~"or"~ are
-logical conjunction and disjunction, respectively, using short-cut
-evaluation if syntactically admissible (i.e., if the argument is
-syntactically a list).
-
-***** Mapping
-
-****** Mapping over lists: ~"foreach"~
-
-First the key ~"range"~ is evaluated and has to evaluate to a list.
-For each entry of this list, the expression ~"body"~ is evaluated
-in an environment that is obtained from the original one by setting
-the value for the variable specified at the key ~"var"~ (which has
-to be a literal string, default ~"_"~) to that value. The result
-is the list of those evaluation results.
-
-****** Mapping over maps: ~"foreach_map"~
-
-Here, ~"range"~ has to evaluate to a map. For each entry (in
-lexicographic order (according to native byte order) by keys), the
-expression ~"body"~ is evaluated in an environment obtained from
-the original one by setting the variables specified at ~"var_key"~
-and ~"var_val"~ (literal strings, default values ~"_"~ and
-~"$_"~, respectively). The result of the evaluation is the list of
-those values.
-
-***** Folding: ~"foldl"~
-
-The key ~"range"~ is evaluated and has to evaluate to a list.
-Starting from the result of evaluating ~"start"~ (default ~[]~) a
-new value is obtained for each entry of the range list by evaluating
-~"body"~ in an environment obtained from the original by binding
-the variable specified by ~"var"~ (literal string, default ~"_"~) to
-the list entry and the variable specified by ~"accum_var"~ (literal
-string, default value ~"$1"~) to the old value. The result is the
-last value obtained.
-
-**** Regular functions
-
-First ~"$1"~ is evaluated; for binary functions ~"$2"~ is evaluated
-next. For functions that accept keyword arguments, those are
-evaluated as well. Finally the function is applied to this (or
-those) argument(s) to obtain the final result.
-
-***** Unary functions
-
-- ~"nub_right"~ The argument has to be a list. It is an error if that list
- contains (directly or indirectly) a name. The result is the
- input list, except that for all duplicate values, all but the
- rightmost occurrence is removed.
-
-- ~"basename"~ The argument has to be a string. This string is
- interpreted as a path, and the file name thereof is returned.
-
-- ~"keys"~ The argument has to be a map. The result is the list of
- keys of this map, in lexicographical order (according to native
- byte order).
-
-- ~"values"~ The argument has to be a map. The result are the values
- of that map, ordered by the corresponding keys (lexicographically
- according to native byte order).
-
-- ~"range"~ The argument is interpreted as a non-negative integer as
- follows. Non-negative numbers are rounded to the nearest integer;
- strings have to be the decimal representation of an integer;
- everything else is considered zero. The result is a list of the
- given length, consisting of the decimal representations of the
- first non-negative integers. For example, ~{"type": "range",
- "$1": "3"}~ evaluates to ~["0", "1", "2"]~.
-
-- ~"enumerate"~ The argument has to be a list. The result is a map
- containing one entry for each element of the list. The key is
- the decimal representation of the position in the list (starting
- from ~0~), padded with leading zeros to length at least 10. The
- value is the element. The padding is chosen in such a way that
- iterating over the resulting map (which happens in lexicographic
- order of the keys) has the same iteration order as the list for
- all lists indexable by 32-bit integers.
-
-- ~"++"~ The argument has to be a list of lists. The result is the
- concatenation of those lists.
-
-- ~"map_union"~ The argument has to be a list of maps. The result
- is a map containing as keys the union of the keys of the maps in
- that list. For each key, the value is the value of that key in
- the last map in the list that contains that key.
-
-- ~"join_cmd"~ The argument has to be a list of strings. A single
- string is returned that quotes the original vector in a way
- understandable by a POSIX shell. As the command for an action is
- directly given by an argument vector, ~"join_cmd"~ is typically
- only used for generated scripts.
-
-- ~"json_encode"~ The result is a single string that is the canonical
- JSON encoding of the argument (with minimal white space); all atomic
- values that are not part of JSON (i.e., the added atomic values
- to represent build-internal values) are serialized as ~null~.
-
-***** Unary functions with keyword arguments
-
-- ~"change_ending"~ The argument has to be a string, interpreted as
- path. The ending is replaced by the value of the keyword argument
- ~"ending"~ (a string, default ~""~). For example, ~{"type":
- "change_ending", "$1": "foo/bar.c", "ending": ".o"}~ evaluates
- to ~"foo/bar.o"~.
-
-- ~"join"~ The argument has to be a list of strings. The return
- value is the concatenation of those strings, separated by the
- the specified ~"separator"~ (strings, default ~""~).
-
-- ~"escape_chars"~ Prefix every in the argument every character
- occuring in ~"chars"~ (a string, default ~""~) by ~"escape_prefix"~ (a
- strings, default ~"\\"~).
-
-- ~"to_subdir"~ The argument has to be a map (not necessarily of
- artifacts). The keys as well as the ~"subdir"~ (string, default
- ~"."~) argument are interpreted as paths and keys are replaced
- by the path concatenation of those two paths. If the optional
- argument ~"flat"~ (default ~false~) evaluates to a true value,
- the keys are instead replaced by the path concatenation of the
- ~"subdir"~ argument and the base name of the old key. It is an
- error if conflicts occur in this way; in case of such a user
- error, the argument ~"msg"~ is also evaluated and the result
- of that evaluation reported in the error message. Note that
- conflicts can also occur in non-flat staging if two keys are
- different as strings, but name the same path (like ~"foo.txt"~
- and ~"./foo.txt"~), and are assigned different values.
- It also is an error if the values for keys in conflicting positions
- are name-containing.
-
-***** Binary functions
-
-- ~"=="~ The result is ~true~ is the arguments are equal, ~false~
- otherwise. It is an error if one of the arguments are name-containing
- values.
-
-- ~"concat_target_name"~ This function is only present to simplify
- transitions from some other build systems and normally not used
- outside code generated by transition tools. The second argument
- has to be a string or a list of strings (in the latter case,
- it is treated as strings by concatenating the entries). If the
- first argument is a string, the result is the concatenation of
- those two strings. If the first argument is a list of strings,
- the result is that list with the second argument concatenated to
- the last entry of that list (if any).
-
-***** Other functions
-
-- ~"empty_map"~ This function takes no arguments and always returns
- an empty map.
-
-- ~"singleton_map"~ This function takes two keyword arguments,
- ~"key"~ and ~"value"~ and returns a map with one entry, mapping
- the given key to the given value.
-
-- ~"lookup"~ This function takes two keyword arguments, ~"key"~
- and ~"map"~. The ~"key"~ argument has to evaluate to a string
- and the ~"map"~ argument has to evaluate to a map. If that map
- contains the given key and the corresponding value is non-~null~,
- the value is returned. Otherwise the ~"default"~ argument (with
- default ~null~) is evaluated and returned.
-
-**** Constructs related to reporting of user errors
-
-Normally, if an error occurs during the evaluation the error is
-reported together with a stack trace. This, however, might not
-be the most informative way to present a problem to the user,
-especially if the underlying problem is a proper user error, e.g.,
-in rule usage (leaving out mandatory arguments, violating semantical
-prerequisites, etc). To allow proper error reporting, the following
-functions are available. All of them have an optional argument
-~"msg"~ that is evaluated (only) in case of error and the result of
-that evaluation included in the error message presented to the user.
-
-- ~"fail"~ Evaluation of this function unconditionally fails.
-
-- ~"context"~ This function is only there to provide additional
- information in case of error. Otherwise it is the identify
- function (a unary function, i.e., the result of the evaluation
- is the result of evaluating the argument ~"$1"~).
-
-- ~"assert_non_empty"~ Evaluate the argument (given by the parameter
- ~"$1"~). If it evaluates to a non-empty string, map, or list,
- return the result of the evaluation. Otherwise fail.
-
-- ~"disjoint_map_union"~ Like ~"map_union"~ but it is an error,
- if two (or more) maps contain the same key, but map it to
- different values. It is also an error if the argument is a
- name-containing value.
diff --git a/doc/concepts/garbage.md b/doc/concepts/garbage.md
new file mode 100644
index 00000000..69594b1c
--- /dev/null
+++ b/doc/concepts/garbage.md
@@ -0,0 +1,86 @@
+Garbage Collection
+==================
+
+For every build, for all non-failed actions an entry is created in the
+action cache and the corresponding artifacts are stored in the CAS. So,
+over time, a lot of files accumulate in the local build root. Hence we
+have a way to reclaim disk space while keeping the benefits of having a
+cache. This operation is referred to as garbage collection; it usually
+follows the heuristic of keeping what was most recently used. Our
+approach follows this paradigm as well.
+
+Invariants assumed by our build system
+--------------------------------------
+
+Our tool assumes several invariants on the local build root that we
+need to maintain during garbage collection. Those are the following.
+
+ - If an artifact is referenced in any cache entry (action cache,
+ target-level cache), then the corresponding artifact is in CAS.
+ - If a tree is in CAS, then so are its immediate parts (and hence also
+ all transitive parts).
+
+Generations of cache and CAS
+----------------------------
+
+In order to allow garbage collection while keeping the desired
+invariants, we keep several (currently two) generations of cache and
+CAS. Each generation in itself has to fulfill the invariants. The
+effective cache or CAS is the union of the caches or CASes of all
+generations, respectively. Obviously, then the effective cache and CAS
+fulfill the invariants as well.
+
+The actual `gc` command rotates the generations: the oldest generation
+is removed and the remaining generations are moved one number up
+(i.e., currently the young generation will simply become the old
+generation), implicitly creating a new, empty, youngest generation. As
+an empty generation fulfills the required invariants, this operation
+preserves the requirement that each generation individually fulfill the
+invariants.
+
+All additions are made to the youngest generation; in order to keep the
+invariant, relevant entries only present in an older generation are also
+added to the youngest generation first. Moreover, whenever an entry is
+referenced in any way (cache hit, request for an entry to be in CAS) and
+is only present in an older generation, it is also added to the youngest
+generation, again adding referenced parts first. As a consequence, the
+youngest generation contains everything directly or indirectly
+referenced since the last garbage collection; in particular, everything
+referenced since the last garbage collection will remain in the
+effective cache or CAS upon the next garbage collection.
+
+These generations are stored as separate directories inside the local
+build root. As the local build root is, starting from an empty
+directory, entirely managed by `just` and compatible tools, generations
+are on the same file system. Therefore, adding old entries to the
+youngest generation can be implemented efficiently by using hard links.
+
+The moving up of generations can happen atomically by renaming the
+respective directory. Also, the oldest generation can be removed
+logically by renaming a directory to a name that is not searched for
+when looking for existing generations. The actual recursive removal from
+the file system can then happen in a separate step without any
+requirements on order.
+
+Parallel operations in the presence of garbage collection
+---------------------------------------------------------
+
+The addition to cache and CAS can continue to happen in parallel; that
+certain values are taken from an older generation instead of being
+freshly computed does not make a difference for the youngest generation
+(which is the only generation modified). But build processes assume they
+don't violate the invariant if they first add files to CAS and later a
+tree or cache entry referencing them. This, however, only holds true if
+no generation rotation happens in between. To avoid this kind of race,
+we make processes coordinate over a single lock for each build root.
+
+ - Any build process keeps a shared lock for the entirety of the build.
+ - The garbage collection process takes an exclusive lock for the
+ period it does the directory renames.
+
+We consider it acceptable that, in theory, local build processes could
+starve local garbage collection. Moreover, it should be noted that the
+actual removal of no-longer-needed files from the file system happens
+without any lock being held. Hence the disturbance of builds caused by
+garbage collection is small.
diff --git a/doc/concepts/garbage.org b/doc/concepts/garbage.org
deleted file mode 100644
index 26f6cc51..00000000
--- a/doc/concepts/garbage.org
+++ /dev/null
@@ -1,82 +0,0 @@
-* Garbage Collection
-
-For every build, for all non-failed actions an entry is created in
-the action cache and the corresponding artifacts are stored in the
-CAS. So, over time, a lot of files accumulate in the local build
-root. Hence we have a way to reclaim disk space while keeping the
-benefits of having a cache. This operation is referred to as garbage
-collection and usually uses the heuristics to keeping what is most
-recently used. Our approach follows this paradigm as well.
-
-** Invariants assumed by our build system
-
-Our tool assumes several invariants on the local build root, that we
-need to maintain during garbage collection. Those are the following.
-- If an artifact is referenced in any cache entry (action cache,
- target-level cache), then the corresponding artifact is in CAS.
-- If a tree is in CAS, then so are its immediate parts (and hence
- also all transitive parts).
-
-
-** Generations of cache and CAS
-
-In order to allow garbage collection while keeping the desired
-invariants, we keep several (currently two) generations of cache
-and CAS. Each generation in itself has to fulfill the invariants.
-The effective cache or CAS is the union of the caches or CASes of
-all generations, respectively. Obviously, then the effective cache
-and CAS fulfill the invariants as well.
-
-The actual ~gc~ command rotates the generations: the oldest
-generation is be removed and the remaining generations are moved
-one number up (i.e., currently the young generation will simply
-become the old generation), implicitly creating a new, empty,
-youngest generation. As an empty generation fulfills the required
-invariants, this operation preservers the requirement that each
-generation individually fulfill the invariants.
-
-All additions are made to the youngest generation; in order to keep
-the invariant, relevant entries only present in an older generation
-are also added to the youngest generation first. Moreover, whenever
-an entry is referenced in any way (cache hit, request for an entry
-to be in CAS) and is only present in an older generation, it is
-also added to the younger generation, again adding referenced
-parts first. As a consequence, the youngest generation contains
-everything directly or indirectly referenced since the last garbage
-collection; in particular, everything referenced since the last
-garbage collection will remain in the effective cache or CAS upon
-the next garbage collection.
-
-These generations are stored as separate directories inside the
-local build root. As the local build root is, starting from an
-empty directory, entirely managed by `just` and compatible tools,
-generations are on the same file system. Therefore the adding of
-old entries to the youngest generation can be implemented in an
-efficient way by using hard links.
-
-The moving up of generations can happen atomically by renaming the
-respective directory. Also, the oldest generation can be removed
-logically by renaming a directory to a name that is not searched
-for when looking for existing generations. The actual recursive
-removal from the file system can then happen in a separate step
-without any requirements on order.
-
-** Parallel operations in the presence of garbage collection
-
-The addition to cache and CAS can continue to happen in parallel;
-that certain values are taken from an older generation instead
-of freshly computed does not make a difference for the youngest
-generation (which is the only generation modified). But build
-processes assume they don't violate the invariant if they first
-add files to CAS and later a tree or cache entry referencing them.
-This, however, only holds true if no generation rotation happens in
-between. To avoid those kind of races, we make processes coordinate
-over a single lock for each build root.
-- Any build process keeps a shared lock for the entirety of the build.
-- The garbage collection process takes an exclusive lock for the
- period it does the directory renames.
-We consider it acceptable that, in theory, local build processes
-could starve local garbage collection. Moreover, it should be noted
-that the actual removal of no-longer-needed files from the file
-system happens without any lock being held. Hence the disturbance
-of builds caused by garbage collection is small.
diff --git a/doc/concepts/multi-repo.md b/doc/concepts/multi-repo.md
new file mode 100644
index 00000000..c465360e
--- /dev/null
+++ b/doc/concepts/multi-repo.md
@@ -0,0 +1,170 @@
+Multi-repository build
+======================
+
+Repository configuration
+------------------------
+
+### Open repository names
+
+A repository can have external dependencies. This is realized by using
+unbound ("open") repository names as references. The actual definition
+of those external repositories is not part of the repository; we think
+of them as inputs, i.e., we think of this repository as a function of
+the referenced external targets.
+
+### Binding in a separate repository configuration
+
+The actual binding of the free repository names is specified in a
+separate repository-configuration file, which is specified on the
+command line (via the `-C` option); this command-line argument is
+optional and the default is that the repository worked on has no
+external dependencies. Typically (but not necessarily), this
+repository-configuration file is located outside the referenced
+repositories and versioned separately or generated from such a file via
+`bin/just-mr.py`. It serves as meta-data for a group of repositories
+belonging together.
+
+This file contains one JSON object. For the key `"repositories"` the
+value is an object; its keys are the global names of the specified
+repositories. For each repository, there is an object describing it. The
+key `"workspace_root"` describes where to find the repository and should
+be present for all (direct or indirect) external dependencies of the
+repository worked upon. Additional roots and file names (for targets,
+rules, and expressions) can be specified. For keys not given, the same
+rules for default values apply as for the corresponding command-line
+arguments. Additionally, for each repository, the key `"bindings"`
+specifies the map of the open repository names to the global names that
+provide these dependencies. Repositories may depend on each other (or
+even themselves), but the resulting global target graph has to be cycle
+free.
+
+Whenever a location has to be specified, the value has to be a list,
+with the first entry specifying the naming scheme; the semantics of the
+remaining entries depends on the scheme (see "Root naming scheme"
+below).
+
+Additionally, the key `"main"` (with default `""`) specifies the main
+repository. The target to be built (as specified on the command line) is
+taken from this repository. Also, the command-line arguments `-w`,
+`--target_root`, etc., apply to this repository. If no option `-w` is
+given and `"workspace_root"` is not specified in the
+repository-configuration file either, the root is determined from the
+working directory as usual.
+
+The value of `main` can be overwritten on the command line (with the
+`--main` option). In this way, a consistent configuration of
+interdependent repositories can be versioned and referred to regardless
+of the repository worked on.
+
+#### Root naming scheme
+
+##### `"file"`
+
+The `"file"` scheme tells that the repository (or respective
+root) can be found in a directory in the local file system; the
+only argument is the absolute path to that directory.
+
+##### `"git tree"`
+
+The `"git tree"` scheme tells that the root is defined to be a
+tree given by a git tree identifier. It takes two arguments
+
+ - the tree identifier, as hex-encoded string, and
+ - the absolute path to some repository containing that tree
+
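+As an illustration (the paths and the tree identifier here are made up),
+a root specification in either scheme could look as follows:
+
+``` jsonc
+// a root in the local file system
+["file", "/opt/foobar/repo"]
+
+// the same kind of role, but pinned to a git tree
+["git tree", "2c6f4ab1d3f5ae9ba7a4b1b4fca8a47b1a2e3c4d", "/var/cache/repos.git"]
+```
+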
+#### Example
+
+Consider, for example, the following repository-configuration file.
+In the following, we assume it is located at `/etc/just/repos.json`.
+
+``` jsonc
+{ "main": "env"
+, "repositories":
+ { "foobar":
+ { "workspace_root": ["file", "/opt/foobar/repo"]
+ , "rule_root": ["file", "/etc/just/rules"]
+ , "bindings": {"base": "barimpl"}
+ }
+ , "barimpl":
+ { "workspace_root": ["file", "/opt/barimpl"]
+ , "target_file_name": "TARGETS.bar"
+ }
+ , "env": {"bindings": {"foo": "foobar", "bar": "barimpl"}}
+ }
+}
+```
+
+It specifies 3 repositories, with global names `foobar`, `barimpl`,
+and `env`. Within `foobar`, the repository name `base` refers to
+`barimpl`, the repository that can be found at `/opt/barimpl`.
+
+The repository `env` is the main repository and there is no
+workspace root defined for it, so it only provides bindings for
+external repositories `foo` and `bar`, but the actual repository is
+taken from the working directory (unless `-w` is specified). In this
+way, it provides an environment for developing applications based on
+`foo` and `bar`.
+
+For example, the invocation `just build -C /etc/just/repos.json
+baz` tells our tool to build the target `baz` from the module the
+working directory is located in. `foo` will refer to the repository
+found at `/opt/foobar/repo` (using rules from `/etc/just/rules` and
+taking `base` to refer to the repository at `/opt/barimpl`) and `bar`
+will refer to the repository at `/opt/barimpl`.
+
+Naming of targets
+-----------------
+
+### Reference in target files
+
+In addition to the normal target references (a string for a target in
+the same module, a module-target pair for a target in the same
+repository, `["./", relpath, target]` for relative addressing, and
+`["FILE", null, name]` for an explicit file reference in the same
+module), references of the form `["@", repo, module, target]` can be
+specified, where `repo` is a string referring to an open name. That
+open repository name is resolved to the global name by the
+`"bindings"` parameter of the repository the target reference is made
+in. Within the repository the resolved name refers to, the target
+`[module, target]` is taken.
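+
+As a sketch, a target file in the repository `foobar` from the example
+above (which binds `base` to `barimpl`) could reference a target of
+that dependency via this form; the module and target names `utils` and
+`libutils` are made up, and `"install"` is a built-in rule:
+
+``` jsonc
+{ "use external dep":
+  { "type": "install"
+  , "deps": [["@", "base", "utils", "libutils"]]
+  }
+}
+```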
+
+### Expression language: names as abstract values
+
+Targets are a global concept as they distinguish targets from different
+repositories. Their names, however, depend on the repository they occur
+in (as the local names might differ in various repositories). Moreover,
+some targets cannot be named in certain repositories as not every
+repository has a local name in every other repository.
+
+To handle this naming problem, we note the following. During the
+evaluation of a target, names occur in two places: as the result of
+evaluating the parameters (for target fields) and in the evaluation of
+the defining expression when requesting properties of a target depended
+upon (via `DEP_ARTIFACTS` and related functions). In the latter case,
+however, the only legitimate way to obtain a target name is by the
+`FIELD` function. To enforce this behavior, and to avoid problems with
+serializing target names, our expression language considers target names
+as opaque values. More precisely,
+
+ - in a target description, the target fields are evaluated and the
+ result of the evaluation is parsed, in the context of the module the
+ `TARGET` file belongs to, as a target name, and
+ - during evaluation of the defining expression of the target's
+   rule, when accessing `FIELD` the values of target fields will be
+   reported as abstract name values, and when querying values of
+   dependencies (via `DEP_ARTIFACTS` etc.) the correct abstract target
+   name has to be provided.
+
+While the defining expression has access to target names (via target
+fields), it is not useful to provide them in the provided data; a
+consuming target cannot use names unless it has those targets as
+dependencies anyway. Our tool will not enforce this policy; however,
+only targets not having names in their provided data are eligible to be
+used in `export` rules.
+
+File layout in actions
+----------------------
+
+As `just` does full staging for actions, no special considerations are
+needed when combining targets of different repositories. Each target
+brings its staging of artifacts as usual. In particular, no repository
+names (neither local nor global ones) will ever be visible in any
+action. So for the consuming target it makes no difference whether its
+dependency comes from the same or a different repository.
diff --git a/doc/concepts/multi-repo.org b/doc/concepts/multi-repo.org
deleted file mode 100644
index f1ad736f..00000000
--- a/doc/concepts/multi-repo.org
+++ /dev/null
@@ -1,167 +0,0 @@
-* Multi-repository build
-
-** Repository configuration
-
-*** Open repository names
-
-A repository can have external dependencies. This is realized by
-having unbound ("open") repository names being used as references.
-The actual definition of those external repositories is not part
-of the repository; we think of them as inputs, i.e., we think of
-this repository as a function of the referenced external targets.
-
-*** Binding in a separate repository configuration
-
-The actual binding of the free repository names is specified in a
-separate repository-configuration file, which is specified on the
-command line (via the ~-C~ option); this command-line argument
-is optional and the default is that the repository worked on has
-no external dependencies. Typically (but not necessarily), this
-repository-configuration file is located outside the referenced
-repositories and versioned separately or generated from such a
-file via ~bin/just-mr.py~. It serves as meta-data for a group of
-repositories belonging together.
-
-This file contains one JSON object. For the key ~"repositories"~ the
-value is an object; its keys are the global names of the specified
-repositories. For each repository, there is an object describing it.
-The key ~"workspace_root"~ describes where to find the repository and
-should be present for all (direct or indirect) external dependencies
-of the repository worked upon. Additional roots file names (for
-target, rule, and expression) can be specified. For keys not given,
-the same rules for default values apply as for the corresponding
-command-line arguments. Additionally, for each repository, the
-key "bindings" specifies the map of the open repository names to
-the global names that provide these dependencies. Repositories may
-depend on each other (or even themselves), but the resulting global
-target graph has to be cycle free.
-
-Whenever a location has to be specified, the value has to be a
-list, with the first entry being specifying the naming scheme; the
-semantics of the remaining entries depends on the scheme (see "Root
-Naming Schemes" below).
-
-Additionally, the key ~"main"~ (with default ~""~) specifies
-the main repository. The target to be built (as specified on the
-command line) is taken from this repository. Also, the command-line
-arguments ~-w~, ~--target_root~, etc, apply to this repository. If
-no option ~-w~ is given and ~"workspace_root"~ is not specified in
-the repository-configuration file either, the root is determined
-from the working directory as usual.
-
-The value of ~main~ can be overwritten on the command line (with
-the ~--main~ option) In this way, a consistent configuration
-of interdependent repositories can be versioned and referred to
-regardless of the repository worked on.
-
-**** Root naming scheme
-
-***** ~"file"~
-
-The ~"file"~ scheme tells that the repository (or respective root)
-can be found in a directory in the local file system; the only
-argument is the absolute path to that directory.
-
-
-***** ~"git tree"~
-
-The ~"git tree"~ scheme tells that the root is defined to be a tree
-given by a git tree identifier. It takes two arguments
-- the tree identifier, as hex-encoded string, and
-- the absolute path to some repository containing that tree
-
-**** Example
-
-Consider, for example, the following repository-configuration file.
-In the following, we assume it is located at ~/etc/just/repos.json~.
-
-#+BEGIN_SRC
-{ "main": "env"
-, "repositories":
- { "foobar":
- { "workspace_root": ["file", "/opt/foobar/repo"]
- , "rule_root": ["file", "/etc/just/rules"]
- , "bindings": {"base": "barimpl"}
- }
- , "barimpl":
- { "workspace_root": ["file", "/opt/barimpl"]
- , "target_file_name": "TARGETS.bar"
- }
- , "env": {"bindings": {"foo": "foobar", "bar": "barimpl"}}
- }
-}
-#+END_SRC
-
-It specifies 3 repositories, with global names ~foobar~, ~barimpl~,
-and ~env~. Within ~foobar~, the repository name ~base~ refers to
-~barimpl~, the repository that can be found at ~/opt/barimpl~.
-
-The repository ~env~ is the main repository and there is no workspace
-root defined for it, so it only provides bindings for external
-repositories ~foo~ and ~bar~, but the actual repository is taken
-from the working directory (unless ~-w~ is specified). In this way,
-it provides an environment for developing applications based on
-~foo~ and ~bar~.
-
-For example, the invocation ~just build -C /etc/just/repos.conf
-baz~ tells our tool to build the target ~baz~ from the module the
-working directory is located in. ~foo~ will refer to the repository
-found at ~/opt/foobar/repo~ (using rules from ~/etc/just/rules~,
-taking ~base~ refer to the repository at ~/opt/barimpl~) and
-~bar~ will refer to the repository at ~/opts/barimpl~.
-
-** Naming of targets
-
-*** Reference in target files
-
-In addition to the normal target references (string for a target in
-the name module, module-target pair for a target in same repository,
-~["./", relpath, target]~ relative addressing, ~["FILE", null,
-name]~ explicit file reference in the same module), references of the
-form ~["@", repo, module, target]~ can be specified, where ~repo~
-is string referring to an open name. That open repository name is
-resolved to the global name by the ~"bindings"~ parameter of the
-repository the target reference is made in. Within the repository
-the resolved name refers to, the target ~[module, target]~ is taken.
-
-*** Expression language: names as abstract values
-
-Targets are a global concept as they distinguish targets from different
-repositories. Their names, however, depend on the repository they
-occur in (as the local names might differ in various repositories).
-Moreover, some targets cannot be named in certain repositories as
-not every repository has a local name in every other repository.
-
-To handle this naming problem, we note the following. During the
-evaluation of a target names occur at two places: as the result of
-evaluating the parameters (for target fields) and in the evaluation
-of the defining expression when requesting properties of a target
-dependent upon (via ~DEP_ARTIFACTS~ and related functions). In the
-later case, however, the only legitimate way to obtain a target
-name is by the ~FIELD~ function. To enforce this behavior, and
-to avoid problems with serializing target names, our expression
-language considers target names as opaque values. More precisely,
-- in a target description, the target fields are evaluated and the
- result of the evaluation is parsed, in the context of the module
- the ~TARGET~ file belongs to, as a target name, and
-- during evaluation of the defining expression of a the target's
- rule, when accessing ~FIELD~ the values of target fields will
- be reported as abstract name values and when querying values of
- dependencies (via ~DEP_ARTIFACTS~ etc) the correct abstract target
- name has to be provided.
-
-While the defining expression has access to target names (via
-target fields), it is not useful to provide them in provided data;
-a consuming data cannot use names unless it has those fields as
-dependency anyway. Our tool will not enforce this policy; however,
-only targets not having names in their provided data are eligible
-to be used in ~export~ rules.
-
-** File layout in actions
-
-As ~just~ does full staging for actions, no special considerations
-are needed when combining targets of different repositories. Each
-target brings its staging of artifacts as usual. In particular, no
-repository names (neither local nor global ones) will ever be visible
-in any action. So for the consuming target it makes no difference
-if its dependency comes from the same or a different repository.
diff --git a/doc/concepts/overview.md b/doc/concepts/overview.md
new file mode 100644
index 00000000..a9bcc847
--- /dev/null
+++ b/doc/concepts/overview.md
@@ -0,0 +1,210 @@
+Tool Overview
+=============
+
+Structuring
+-----------
+
+### Structuring the Build: Targets, Rules, and Actions
+
+The primary units this build system deals with are targets: the user
+requests the system to build (or install) a target, targets depend on
+other targets, etc. Targets typically reflect the units a software
+developer thinks in: libraries, binaries, etc. The definition of a
+target only describes the information directly belonging to the target,
+e.g., its source, private and public header files, and its direct
+dependencies. Any other information needed to build a target (like the
+public header files of an indirect dependency) is inferred by the build
+tool. In this way, the build description can be kept maintainable.
+
+A built target consists of files logically belonging together (like the
+actual library file and its public headers) as well as information on
+how to use the target (linking arguments, transitive header files, etc).
+For a consumer of a target, this collection of files together with the
+additionally provided information is what defines the target as a
+dependency, regardless of where the target is coming from (i.e., targets
+coinciding here are indistinguishable for other targets).
+
+Of course, to actually build a single target from its dependencies, many
+invocations of the compiler or other tools are necessary (so called
+"actions"); the build tool translates these high level description
+into the individual actions necessary and only re-executes those where
+inputs have changed.
+
+This translation of high-level concepts into individual actions is not
+hard coded into the tool. It is provided by the user as "rules" and
+forms additional input to the build. To avoid duplicate work, rules are
+typically maintained centrally for a project or an organization.
+
+### Structuring the Code: Modules and Repositories
+
+The code base is usually split into many directories, each containing
+source files belonging together. To allow the definition of targets
+where their code is, the targets are structured in a similar way. For
+each directory, there can be a targets file. Directories for which such
+a targets file exists are called "modules". Each file belongs to the
+module that is closest when searching upwards in the directory tree. The
+targets file of a module defines the targets formed from the source
+files belonging to this module.
+
+Larger projects are often split into "repositories". For this build
+tool, a repository is a logical unit. Often those coincide with the
+repositories in the sense of version control. This, however, does not
+have to be the case. Also, from one directory in the file system many
+repositories can be formed that might differ in the rules used, targets
+defined, or binding of their dependencies.
+
+Staging
+-------
+
+A peculiarity of this build system is the complete separation between
+physical and logical paths. Targets have their own view of the world,
+i.e., they can place their artifacts at any logical path they like, and
+this is how they look to other targets. It is up to the consuming
+targets what they do with artifacts of the targets they depend on; in
+particular, they are not obliged to leave them at the logical location
+their dependency put them.
+
+When such a collection of artifacts at logical locations (often referred
+to as the "stage") is realized on the file system (when installing a
+target, or as inputs to actions), the paths are interpreted as paths
+relative to the respective root (installation or action directory).
+
+This separation is what allows the flexible combination of targets from
+various sources without leaking repository names or requiring a
+different file arrangement depending on whether a target is in the
+"main" repository.
+
+Repository data
+---------------
+
+A repository uses a (logical) directory for several purposes: to obtain
+source files, to read definitions of targets, to read rules, and to read
+expressions that can be used by rules. While all those directories can
+be (and often are) the same, this does not have to be the case. For each
+of those purposes, a different logical directory (also called "root")
+can be used. In this way, one can, e.g., add target definitions to a
+source tree originally written for a different build tool without
+modifying the original source tree.
+
+Those roots are usually defined in a repository configuration. For the
+"main" repository, i.e., the repository from which the target to be
+built is requested, the roots can also be overwritten at the command
+line. Roots can be defined as paths in the file system, but also as
+`git` tree identifiers (together with the location of some repository
+containing that tree). The latter definition is preferable for rules and
+dependencies, as it allows high-level caching of targets. It also
+motivates the need to add target definitions without changing the
+root itself.
+
+The same flexibility as for the roots is also present for the names of
+the files defining targets, rules, and expressions. While the default
+names `TARGETS`, `RULES`, and `EXPRESSIONS` are often used, other file
+names can be specified for those as well, either in the repository
+configuration or (for the main repository) on the command line.
+
+The final piece of data needed to describe a repository is the binding
+of the open repository names that are used to refer to other
+repositories. More details can be found in the documentation on
+multi-repository builds.
+
+Targets
+-------
+
+### Target naming
+
+In description files, targets, rules, and expressions are referred to by
+name. As the context always fixes whether a name for a target, rule, or
+expression is expected, they use the same naming scheme.
+
+ - A single string refers to the target with this name in the same
+ module.
+ - A pair `[module, name]` refers to the target `name` in the module
+ `module` of the same repository. There are no module names with a
+ distinguished meaning. The naming scheme is unambiguous, as all
+ other names given by lists have length at least 3.
+ - A list `["./", relative-module-path, name]` refers to a target with
+ the given name in the module that has the specified path relative to
+ the current module (in the current repository).
+ - A list `["@", repository, module, name]` refers to the target with
+ the specified name in the specified module of the specified
+ repository.
+
+Additionally, there are special targets that can also be referred to in
+target files.
+
+ - An explicit reference of a source-file target in the same module,
+ specified as `["FILE", null, name]`. The explicit `null` at the
+ second position (where normally the module would be) is necessary to
+ ensure the name has length more than 2 to distinguish it from a
+ reference to the module `"FILE"`.
+ - A reference to a collection of explicit source files, given by a
+   shell pattern, in the top-level directory of the same module,
+   specified as `["GLOB", null, pattern]`. The explicit `null` at the
+   second position is required for the same reason as in the explicit
+   file reference.
+ - A reference to a tree target in the same module, specified as
+   `["TREE", null, name]`. The explicit `null` at the second position
+   is required for the same reason as in the explicit file reference.
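+
+A hypothetical target file combining these forms of references might
+look as follows (all names are made up; `"install"` is a built-in
+rule):
+
+``` jsonc
+{ "app":
+  { "type": "install"
+  , "deps":
+    [ "helper"                      // target in the same module
+    , ["src/core", "lib"]           // module-target pair
+    , ["./", "tests", "runner"]     // module relative to this one
+    , ["@", "foo", "utils", "lib"]  // target in repository "foo"
+    , ["FILE", null, "main.c"]      // explicit source file
+    , ["GLOB", null, "*.h"]         // files matching a shell pattern
+    , ["TREE", null, "data"]        // directory as an opaque tree
+    ]
+  }
+}
+```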
+
+### Data of an analyzed target
+
+Analyzing a target results in 3 pieces of data.
+
+ - The "artifacts" are a staged collection of artifacts. Typically,
+ these are what is normally considered the main reason to build a
+ target, e.g., the actual library file in case of a library.
+
+ - The "runfiles" are another staged collection of artifacts.
+ Typically, these are files that directly belong to the target and
+ are somehow needed to use the target. For example, in case of a
+ library that would be the public header files of the library itself.
+
+ - A "provides" map with additional information the target wants to
+ provide to its consumers. The data contained in that map can also
+   contain additional artifacts. Typically, this is the remaining
+ information needed to use the target in a build.
+
+ In case of a library, that typically would include any other
+ libraries this library transitively depends upon (a stage), the
+ correct linking order (a list of strings), and the public headers of
+ the transitive dependencies (another stage).
+
+A target is completely determined by these 3 pieces of data. A consumer
+of the target will have no other information available. Hence it is
+crucial that everything (apart from artifacts and runfiles) needed to
+build against that target is contained in the provides map.
+
+When the installation of a target is requested on the command line,
+artifacts and runfiles are installed; in case of staging conflicts,
+artifacts take precedence.
+
+### Source targets
+
+#### Files
+
+If a target is not found in the targets file, it is implicitly
+treated as a source file. Both explicit and implicit source files
+look the same. The artifacts stage has a single entry: the path is
+the relative path of the file to the module root and the value is the
+file artifact located at the specified location. The runfiles are
+the same as the artifacts, and the provides map is empty.
+
+#### Collection of files given by a shell pattern
+
+A collection of files given by a shell pattern has, both as
+artifacts and runfiles, the (necessarily disjoint) union of the
+artifact maps of the (zero or more) source targets that match the
+pattern. Only *files* in the *top-level* directory of the given
+modules are considered for matches. The provides map is empty.
+
+#### Trees
+
+A tree describes a directory. Internally, however, it is a single
+opaque artifact. Consuming targets cannot look into the internal
+structure of that tree. Only when realized in the file system (when
+installation is requested or as part of the input to an action), the
+directory structure is visible again.
+
+An explicit tree target is similar to an explicit file target,
+except that at the specified location there has to be a directory
+rather than a file and the tree artifact corresponding to that
+directory is taken instead of a file artifact.
diff --git a/doc/concepts/overview.org b/doc/concepts/overview.org
deleted file mode 100644
index 5dc7ad20..00000000
--- a/doc/concepts/overview.org
+++ /dev/null
@@ -1,206 +0,0 @@
-* Tool Overview
-
-** Structuring
-
-*** Structuring the Build: Targets, Rules, and Actions
-
-The primary units this build system deals with are targets: the
-user requests the system to build (or install) a target, targets
-depend on other targets, etc. Targets typically reflect the units a
-software developer thinks in: libraries, binaries, etc. The definition
-of a target only describes the information directly belonging to
-the target, e.g., its source, private and public header files, and
-its direct dependencies. Any other information needed to build a
-target (like the public header files of an indirect dependency)
-are inferred by the build tool. In this way, the build description
-can be kept maintainable
-
-A built target consists of files logically belonging together (like
-the actual library file and its public headers) as well as information
-on how to use the target (linking arguments, transitive header files,
-etc). For a consumer of a target, the definition of this collection
-of files as well as the additionally provided information is what
-defines the target as a dependency, respectively of where the target
-is coming from (i.e., targets coinciding here are indistinguishable
-for other targets).
-
-Of course, to actually build a single target from its dependencies,
-many invocations of the compiler or other tools are necessary (so
-called "actions"); the build tool translates these high level
-description into the individual actions necessary and only re-executes
-those where inputs have changed.
-
-This translation of high-level concepts into individual actions
-is not hard coded into the tool. It is provided by the user as
-"rules" and forms additional input to the build. To avoid duplicate
-work, rules are typically maintained centrally for a project or an
-organization.
-
-*** Structuring the Code: Modules and Repositories
-
-The code base is usually split into many directories, each containing
-source files belonging together. To allow the definition of targets
-where their code is, the targets are structured in a similar way.
-For each directory, there can be a targets files. Directories for
-which such a targets file exists are called "modules". Each file
-belongs to the module that is closest when searching upwards in the
-directory tree. The targets file of a module defines the targets
-formed from the source files belonging to this module.
-
-Larger projects are often split into "repositories". For this build
-tool, a repository is a logical unit. Often those coincide with
-the repositories in the sense of version control. This, however,
-does not have to be the case. Also, from one directory in the file
-system many repositories can be formed that might differ in the
-rules used, targets defined, or binding of their dependencies.
-
-** Staging
-
-A peculiarity of this build system is the complete separation
-between physical and logical paths. Targets have their own view of
-the world, i.e., they can place their artifacts at any logical path
-they like, and this is how they look to other targets. It is up to
-the consuming targets what they do with artifacts of the targets
-they depend on; in particular, they are not obliged to leave them
-at the logical location their dependency put them.
-
-When such a collection of artifacts at logical locations (often
-referred to as the "stage") is realized on the file system (when
-installing a target, or as inputs to actions), the paths are
-interpreted as paths relative to the respective root (installation
-or action directory).
-
-This separation is what allows flexible combination of targets from
-various sources without leaking repository names or different file
-arrangement if a target is in the "main" repository.
-
-** Repository data
-
-A repository uses a (logical) directory for several purposes: to
-obtain source files, to read definitions of targets, to read rules,
-and to read expressions that can be used by rules. While all those
-directories can (and often are) be the same, this does not have
-to be the case. For each of those purposes, a different logical
-directory (also called "root") can be used. In this way, one can,
-e.g., add target definitions to a source tree originally written for
-a different build tool without modifying the original source tree.
-
-Those roots are usually defined in a repository configuration. For
-the "main" repository, i.e., the repository from which the target
-to be built is requested, the roots can also be overwritten at the
-command line. Roots can be defined as paths in the file system,
-but also as ~git~ tree identifiers (together with the location
-of some repository containing that tree). The latter definition
-is preferable for rules and dependencies, as it allows high-level
-caching of targets. It also motivates the need of adding target
-definitions without changing the root itself.
-
-The same flexibility as for the roots is also present for the names
-of the files defining targets, rules, and expressions. While the
-default names ~TARGETS~, ~RULES~, and ~EXPRESSIONS~ are often used,
-other file names can be specified for those as well, either in
-the repository configuration or (for the main repository) on the
-command line.
-
-The final piece of data needed to describe a repository is the
-binding of the open repository names that are used to refer to
-other repositories. More details can be found in the documentation
-on multi-repository builds.
-
-** Targets
-
-*** Target naming
-
-In description files, targets, rules, and expressions are referred
-to by name. As the context always fixes if a name for a target,
-rule, or expression is expected, they use the same naming scheme.
-- A single string refers to the target with this name in the
- same module.
-- A pair ~[module, name]~ refers to the target ~name~ in the module
- ~module~ of the same repository. There are no module names with
- a distinguished meaning. The naming scheme is unambiguous, as
- all other names given by lists have length at least 3.
-- A list ~["./", relative-module-path, name]~ refers to a target
- with the given name in the module that has the specified path
- relative to the current module (in the current repository).
-- A list ~["@", repository, module, name]~ refers to the target
- with the specified name in the specified module of the specified
- repository.
-
-Additionally, there are special targets that can also be referred
-to in target files.
-- An explicit reference of a source-file target in the same module,
- specified as ~["FILE", null, name]~. The explicit ~null~ at the
- second position (where normally the module would be) is necessary
- to ensure the name has length more than 2 to distinguish it from
- a reference to the module ~"FILE"~.
-- A reference to an collection, given by a shell pattern, of explicit
- source files in the top-level directory of the same module,
- specified as ~["GLOB", null, pattern]~. The explicit ~null~ at
- second position is required for the same reason as in the explicit
- file reference.
-- A reference to a tree target in the same module, specified as
- ~["TREE", null, name]~. The explicit ~null~ at second position is
- required for the same reason as in the explicit file reference.
-
-*** Data of an analyzed target
-
-Analyzing a target results in 3 pieces of data.
-- The "artifacts" are a staged collection of artifacts. Typically,
- these are what is normally considered the main reason to build
- a target, e.g., the actual library file in case of a library.
-- The "runfiles" are another staged collection of artifacts. Typically,
- these are files that directly belong to the target and are somehow
- needed to use the target. For example, in case of a library that
- would be the public header files of the library itself.
-- A "provides" map with additional information the target wants
- to provide to its consumers. The data contained in that map can
- also contain additional artifacts. Typically, this the remaining
- information needed to use the target in a build.
-
- In case of a library, that typically would include any other
- libraries this library transitively depends upon (a stage),
- the correct linking order (a list of strings), and the public
- headers of the transitive dependencies (another stage).
-
-A target is completely determined by these 3 pieces of data. A
-consumer of the target will have no other information available.
-Hence it is crucial, that everything (apart from artifacts and
-runfiles) needed to build against that target is contained in the
-provides map.
-
-When the installation of a target is requested on the command line,
-artifacts and runfiles are installed; in case of staging conflicts,
-artifacts take precedence.
-
-*** Source targets
-
-**** Files
-
-If a target is not found in the targets file, it is implicitly
-treated as a source file. Both, explicit and implicit source files
-look the same. The artifacts stage has a single entry: the path is
-the relative path of the file to the module root and the value the
-file artifact located at the specified location. The runfiles are
-the same as the artifacts and the provides map is empty.
-
-**** Collection of files given by a shell pattern
-
-A collection of files given by a shell pattern has, both as artifacts
-and runfiles, the (necessarily disjoint) union of the artifact
-maps of the (zero or more) source targets that match the pattern.
-Only /files/ in the /top-level/ directory of the given modules are
-considered for matches. The provides map is empty.
-
-**** Trees
-
-A tree describes a directory. Internally, however, it is a single
-opaque artifact. Consuming targets cannot look into the internal
-structure of that tree. Only when realized in the file system (when
-installation is requested or as part of the input to an action),
-the directory structure is visible again.
-
-An explicit tree target is similar to an explicit file target, except
-that at the specified location there has to be a directory rather
-than a file and the tree artifact corresponding to that directory
-is taken instead of a file artifact.
diff --git a/doc/concepts/rules.md b/doc/concepts/rules.md
new file mode 100644
index 00000000..2ab4c334
--- /dev/null
+++ b/doc/concepts/rules.md
@@ -0,0 +1,567 @@
+User-defined Rules
+==================
+
+Targets are defined in terms of high-level concepts like "libraries",
+"binaries", etc. In order to translate these high-level definitions
+into actionable tasks, the user defines rules, explaining at a single
+point how all targets of a given type are built.
+
+Rules files
+-----------
+
+Rules are defined in rules files (by default named `RULES`). Those
+contain a JSON object mapping rule names to their rule definition. For
+rules, the same naming scheme as for targets applies. However, built-in
+rules (always named by a single string) take precedence in naming; to
+explicitly refer to a rule defined in the current module, the module has
+to be specified, possibly by a relative path, e.g.,
+`["./", ".", "install"]`.
+
+Basic components of a rule
+--------------------------
+
+A rule is defined through a JSON object with various keys. The only
+mandatory key is `"expression"` containing the defining expression of
+the rule.
+
+### `"config_fields"`, `"string_fields"` and `"target_fields"`
+
+These keys specify the fields that a target defined by that rule can
+have. In particular, those have to be disjoint lists of strings.
+
+For `"config_fields"` and `"string_fields"` the respective field has to
+evaluate to a list of strings, whereas `"target_fields"` have to
+evaluate to a list of target references. Those references are evaluated
+immediately, and in the name context of the target they occur in.
+
+The difference between `"config_fields"` and `"string_fields"` is that
+`"config_fields"` are evaluated before the target fields and hence can
+be used by the rule to specify config transitions for the target fields.
+`"string_fields"` on the other hand are evaluated *after*
+the target fields; hence the rule cannot use them to specify a
+configuration transition, however the target definition in those fields
+may use the `"outs"` and `"runfiles"` functions to have access to the
+names of the artifacts or runfiles of a target specified in one of the
+target fields.
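+
+Schematically, the field declarations of a rule could look as follows;
+this made-up rule has a trivial defining expression (`RESULT`, described
+below, with all arguments defaulted):
+
+``` jsonc
+{ "my rule":
+  { "config_fields": ["mode"]
+  , "string_fields": ["name"]
+  , "target_fields": ["srcs", "deps"]
+  , "expression": {"type": "RESULT"}
+  }
+}
+```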
+
+### `"implicit"`
+
+This key specifies a map of implicit dependencies. The keys of the map
+are additional target fields, the values are the fixed list of targets
+for those fields. If a short-form name of a target is used (e.g., only a
+string instead of a module-target pair), it is interpreted relative to
+the repository and module the rule is defined in, not the one the rule
+is used in. Other than this, those fields are evaluated the same way as
+target fields settable on invocation of the rule.
+
+### `"config_vars"`
+
+This is a list of strings specifying which parts of the configuration
+the rule uses. The defining expression of the rule is evaluated in an
+environment that is the configuration restricted to those variables; if
+one of those variables is not specified in the configuration the value
+in the restriction is `null`.
+
+### `"config_transitions"`
+
+This key specifies a map of (some of) the target fields (whether
+declared as `"target_fields"` or as `"implicit"`) to a configuration
+expression. Here, a configuration expression is any expression in our
+language. It has access to the `"config_vars"` and the `"config_fields"`
+and has to evaluate to a list of maps. Each map specifies a transition
+of the current configuration, amending it on the domain of that map to
+the given values.
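+
+For example, a rule could request that the targets in its `"deps"`
+field additionally be analyzed with a changed configuration value. A
+sketch (the configuration variable `"ARCH"` is made up; `"empty_map"`
+and `"singleton_map"` are functions of the expression language):
+
+``` jsonc
+{ "config_transitions":
+  { "deps":
+    [ {"type": "empty_map"}
+    , {"type": "singleton_map", "key": "ARCH", "value": "x86_64"}
+    ]
+  }
+}
+```
+
+Here each dependency in `"deps"` is requested twice: once in the
+unmodified configuration and once with `"ARCH"` set to `"x86_64"`.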
+
+### `"imports"`
+
+This specifies a map of expressions that can later be used by
+`CALL_EXPRESSION`. In this way, duplication of (rule) code can be
+avoided. The value for each key has to be the name of an expression;
+expressions are named following the same naming scheme as targets and
+rules. The names are resolved in the context of the rule. Expressions
+themselves are defined in expression files, the default name being
+`EXPRESSIONS`.
+
+Each expression is a JSON object. The only mandatory key is
+`"expression"` which has to be an expression in our language. It
+optionally can have a key `"vars"` where the value has to be a list of
+strings (and the default is the empty list). Additionally, it can have
+another optional key `"imports"` following the same scheme as the
+`"imports"` key of a rule; in the `"imports"` key of an expression,
+names are resolved in the context of that expression. It is a
+requirement that the `"imports"` graph be cycle free.
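+
+A minimal expression file could look as follows (all names made up);
+the expression concatenates two list variables using the `"++"` and
+`"var"` functions of the expression language:
+
+``` jsonc
+{ "combined flags":
+  { "vars": ["cflags", "extra-cflags"]
+  , "expression":
+    { "type": "++"
+    , "$1":
+      [ {"type": "var", "name": "cflags", "default": []}
+      , {"type": "var", "name": "extra-cflags", "default": []}
+      ]
+    }
+  }
+}
+```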
+
+### `"expression"`
+
+This specifies the defining expression of the rule. The value has to be
+an expression of our expression language (basically, an abstract syntax
+tree serialized as JSON). It has access to the following extra functions
+and, when evaluated, has to return a result value.
+
+#### `FIELD`
+
+The field function takes one argument, `name`, which has to evaluate
+to the name of a field. For string fields, the given list of strings
+is returned; for target fields, the list of abstract names for the
+given targets is returned. These abstract names are opaque within the
+rule language (but meaningful when reported in error messages) and
+should only be passed on to other functions that expect names as
+inputs.
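+
+For example, a rule declaring a target field `"srcs"` could obtain the
+abstract names of the targets in that field via
+
+``` jsonc
+{"type": "FIELD", "name": "srcs"}
+```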
+
+#### `DEP_ARTIFACTS` and `DEP_RUNFILES`
+
+These functions give access to the artifacts or runfiles,
+respectively, of one of the targets depended upon. They take two
+(evaluated) arguments, the mandatory `"dep"` and the optional
+`"transition"`.
+
+The argument `"dep"` has to evaluate to an abstract name (as can be
+obtained from the `FIELD` function) of some target specified in one
+of the target fields. The `"transition"` argument has to evaluate to
+a configuration transition (i.e., a map) and the empty transition is
+taken as default. It is an error to request a target-transition pair
+for a target that was not requested in the given transition through
+one of the target fields.
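+
+A typical pattern, as a sketch, maps over a target field and collects
+the artifacts of each dependency; `"foreach"` and `"var"` are functions
+of the expression language, and the field name `"deps"` is made up:
+
+``` jsonc
+{ "type": "foreach"
+, "var": "dep"
+, "range": {"type": "FIELD", "name": "deps"}
+, "body": {"type": "DEP_ARTIFACTS", "dep": {"type": "var", "name": "dep"}}
+}
+```
+
+The resulting list of artifact maps could then be combined, e.g., with
+`"map_union"`.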
+
+#### `DEP_PROVIDES`
+
+This function gives access to a particular entry of the provides map
+of one of the targets depended upon. The arguments `"dep"` and
+`"transition"` are as for `DEP_ARTIFACTS`; additionally, there is
+the mandatory argument `"provider"` which has to evaluate to a
+string. The function returns the value of the provides map of the
+target at the given provider. If the key is not in the provides map
+(or the value at that key is `null`), the optional argument
+`"default"` is evaluated and returned. The default for `"default"`
+is the empty list.
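+
+For example (a sketch assuming the variable `"dep"` is bound to an
+abstract target name, e.g., by a `foreach` over a target field), the
+`"link-args"` provider of a dependency could be queried as follows:
+
+``` jsonc
+{ "type": "DEP_PROVIDES"
+, "dep": {"type": "var", "name": "dep"}
+, "provider": "link-args"
+, "default": []
+}
+```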
+
+#### `BLOB`
+
+The `BLOB` function takes a single (evaluated) argument `data` which
+is optional and defaults to the empty string. This argument has to
+evaluate to a string. The function returns an artifact that is a
+non-executable file with the given string as content.
+
+#### `TREE`
+
+The `TREE` function takes a single (evaluated) argument `$1` which
+has to be a map of artifacts. The result is a single tree artifact
+formed from the input map. It is an error if the map cannot be
+transformed into a tree (e.g., due to staging conflicts).
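+
+For instance (a sketch, with `"headers"` standing for some variable
+bound to a map of artifacts), the following stages that whole map as a
+single tree artifact under `"include"`:
+
+``` jsonc
+{ "type": "singleton_map"
+, "key": "include"
+, "value": {"type": "TREE", "$1": {"type": "var", "name": "headers"}}
+}
+```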
+
+#### `ACTION`
+
+Actions are a way to define new artifacts from (zero or more)
+already defined artifacts by running a command, typically a
+compiler, linker, archiver, etc. The action function takes the
+following arguments.
+
+ - `"inputs"` A map of artifacts. These artifacts are present when
+ the command is executed; the keys of the map are the relative
+ path from the working directory of the command. The command must
+ not make any assumption about the location of the working
+ directory in the file system (and instead should refer to files
+ by path relative to the working directory). Moreover, the
+ command must not modify the input files in any way. (In-place
+ operations can be simulated by staging, as is shown in the
+ example later in this document.)
+
+ It is an additional requirement that no conflicts occur when
+ interpreting the keys as paths. For example, `"foo.txt"` and
+ `"./foo.txt"` are different as strings and hence legitimately
+ can be assigned different values in a map. When interpreted as a
+ path, however, they name the same path; so, if the `"inputs"`
+ map contains both those keys, the corresponding values have to
+ be equal.
+
+ - `"cmd"` The command to execute, given as `argv` vector, i.e., a
+ non-empty list of strings. The 0'th element of that list will
+ also be the program to be executed.
+
+ - `"env"` The environment in which the command should be executed,
+ given as a map of strings to strings.
+
+ - `"outs"` and `"out_dirs"` Two lists of strings naming the files
+ and directories, respectively, the command is expected to
+ create. It is an error if the command fails to create the
+ promised output files. These two lists have to be disjoint, but
+ an entry of `"outs"` may well name a location inside one of the
+ `"out_dirs"`.
+
+This function returns a map whose keys are the strings mentioned in
+`"outs"` and `"out_dirs"`; the corresponding values are the artifacts
+defined to be the ones created by running the given command (in the
+given environment with the given inputs).
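+
+As a minimal sketch (assuming a variable `"srcs"` bound to a map of
+artifacts staging the files `a.txt` and `b.txt`), an action
+concatenating the two inputs could be defined, and its single output
+extracted, as follows:
+
+``` jsonc
+{ "type": "lookup"
+, "key": "out"
+, "map":
+  { "type": "ACTION"
+  , "inputs": {"type": "var", "name": "srcs"}
+  , "cmd": ["/bin/sh", "-c", "cat a.txt b.txt > out"]
+  , "outs": ["out"]
+  }
+}
+```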
+
+#### `RESULT`
+
+The `RESULT` function is the only way to obtain a result value. It
+takes three (evaluated) arguments, `"artifacts"`, `"runfiles"`, and
+`"provides"`, all of which are optional and default to the empty
+map. It defines the result of a target that has the given artifacts,
+runfiles, and provided data, respectively. In particular,
+`"artifacts"` and `"runfiles"` have to be maps to artifacts, and
+`"provides"` has to be a map. Moreover, they keys in `"runfiles"`
+and `"artifacts"` are treated as paths; it is an error if this
+interpretation yields to conflicts. The keys in the artifacts or
+runfile maps as seen by other targets are the normalized paths of
+the keys given.
+
+Result values themselves are opaque in our expression language and
+cannot be deconstructed in any way. Their only purpose is to be the
+result of the evaluation of the defining expression of a target.
+
+#### `CALL_EXPRESSION`
+
+This function takes one mandatory argument `"name"` which is
+unevaluated; it has to be a string literal. The expression
+imported by that name through the imports field is evaluated in the
+current environment restricted to the variables of that expression.
+The result of that evaluation is the result of the `CALL_EXPRESSION`
+statement.
+
+During the evaluation of an expression, rule fields can still be
+accessed through the functions `FIELD`, `DEP_ARTIFACTS`, etc. In
+particular, even an expression with no variables (that, hence, is
+always evaluated in the empty environment) can carry out non-trivial
+computations and be non-constant. The special functions `BLOB`,
+`ACTION`, and `RESULT` are also available. If inside the evaluation
+of an expression the function `CALL_EXPRESSION` is used, the name
+argument refers to the `"imports"` map of that expression. So the
+call graph is deliberately recursion free.
+
+Evaluation of a target
+----------------------
+
+A target defined by a user-defined rule is evaluated in the following
+way.
+
+ - First, the config fields are evaluated.
+
+ - Then, the target-fields are evaluated. This happens for each field
+ as follows.
+
+ - The configuration transition for this field is evaluated and the
+ transitioned configurations determined.
+ - The argument expression for this field is evaluated. The result
+ is interpreted as a list of target names. Each of those targets
+ is analyzed in all the specified configurations.
+
+ - The string fields are evaluated. If the expression for a string
+ field queries a target (via `outs` or `runfiles`), the value for
+   that target is returned in the first configuration. The rationale
+ here is that such generator expressions are intended to refer to the
+ corresponding target in its "main" configuration; they are hardly
+ used anyway for fields branching their targets over many
+ configurations.
+
+ - The effective configuration for the target is determined. The parts
+   of the configuration the target has effectively used are the
+   variables used by the `arguments_config` in the rule invocation, the
+   `config_vars` the rule specified, and the parts of the configuration
+   used by the targets depended upon. For a target depended upon, all
+   parts it used of its configuration are relevant except for those
+   fixed by the configuration transition.
+
+ - The rule expression is evaluated and the result of that evaluation
+ is the result of the rule.
+
+Example of developing a rule
+----------------------------
+
+Let's consider step by step an example of writing a rule. Say we want
+to write a rule that programmatically patches some files.
+
+### Framework: The minimal rule
+
+Every rule has to have a defining expression evaluating to a `RESULT`.
+So the minimally correct rule is the `"null"` rule in the following
+example rule file.
+
+``` jsonc
+{ "null": {"expression": {"type": "RESULT"}}}
+```
+
+This rule accepts no parameters, and has the empty map as artifacts,
+runfiles, and provided data. So it is not very useful.
+
+### String inputs
+
+Let's allow the target definition to have some fields. The simplest
+fields are `string_fields`; they are given by a list of strings. In the
+defining expression we can access them directly via the `FIELD`
+function. Strings can be used when defining maps, but we can also create
+artifacts from them, using the `BLOB` function. To create a map, we can
+use the `singleton_map` function. We define values step by step, using
+the `let*` construct.
+
+``` jsonc
+{ "script only":
+ { "string_fields": ["script"]
+ , "expression":
+ { "type": "let*"
+ , "bindings":
+ [ [ "script content"
+ , { "type": "join"
+ , "separator": "\n"
+ , "$1":
+ { "type": "++"
+ , "$1":
+ [["H"], {"type": "FIELD", "name": "script"}, ["w", "q", ""]]
+ }
+ }
+ ]
+ , [ "script"
+ , { "type": "singleton_map"
+ , "key": "script.ed"
+ , "value":
+ {"type": "BLOB", "data": {"type": "var", "name": "script content"}}
+ }
+ ]
+ ]
+ , "body":
+ {"type": "RESULT", "artifacts": {"type": "var", "name": "script"}}
+ }
+ }
+}
+```
+
+### Target inputs and derived artifacts
+
+Now it is time to add the input files. Source files are targets like any
+other target (and happen to contain precisely one artifact). So we add a
+target field `"srcs"` for the file to be patched. Here we have to keep
+in mind that, on the one hand, target fields accept a list of targets
+and, on the other hand, the artifacts of a target are a whole map. We
+chose to patch all the artifacts of all given `"srcs"` targets. We can
+iterate over lists with `foreach` and maps with `foreach_map`.
+
+Next, we have to keep in mind that targets may place their artifacts at
+arbitrary logical locations. For us that means that we first have to
+decide at which logical locations we want to place the output
+artifacts. As one thinks of patching as an in-place operation, we chose
+to logically place the outputs where the inputs have been. Of course, we
+do not modify the input files in any way; after all, we have to define a
+mathematical function computing the output artifacts, not a collection
+of side effects. With that choice of logical artifact placement, we have
+to decide what to do if two (or more) input targets place their
+artifacts at logically the same location. We could simply take a
+"latest wins" semantics (keep in mind that target fields give a list
+of targets, not a set) as provided by the `map_union` function. We chose
+to consider it a user error if targets with conflicting artifacts are
+specified. This is provided by the `disjoint_map_union` function,
+which also allows specifying an error message to be shown to the
+user. Here, a conflict
+means that values for the same map position are defined in a different
+way.
+
+The actual patching is done by an `ACTION`. We have the script already;
+to make things easy, we stage the input to a fixed place and also expect
+a fixed output location. Then the actual command is a simple shell
+script. The only thing we have to keep in mind is that we want useful
+output precisely if the action fails. Also note that, while we define
+our actions sequentially, they will be executed in parallel, as none of
+them depends on the output of another one of them.
+
+``` jsonc
+{ "ed patch":
+ { "string_fields": ["script"]
+ , "target_fields": ["srcs"]
+ , "expression":
+ { "type": "let*"
+ , "bindings":
+ [ [ "script content"
+ , { "type": "join"
+ , "separator": "\n"
+ , "$1":
+ { "type": "++"
+ , "$1":
+ [["H"], {"type": "FIELD", "name": "script"}, ["w", "q", ""]]
+ }
+ }
+ ]
+ , [ "script"
+ , { "type": "singleton_map"
+ , "key": "script.ed"
+ , "value":
+ {"type": "BLOB", "data": {"type": "var", "name": "script content"}}
+ }
+ ]
+ , [ "patched files per target"
+ , { "type": "foreach"
+ , "var": "src"
+ , "range": {"type": "FIELD", "name": "srcs"}
+ , "body":
+ { "type": "foreach_map"
+ , "var_key": "file_name"
+ , "var_val": "file"
+ , "range":
+ {"type": "DEP_ARTIFACTS", "dep": {"type": "var", "name": "src"}}
+ , "body":
+ { "type": "let*"
+ , "bindings":
+ [ [ "action output"
+ , { "type": "ACTION"
+ , "inputs":
+ { "type": "map_union"
+ , "$1":
+ [ {"type": "var", "name": "script"}
+ , { "type": "singleton_map"
+ , "key": "in"
+ , "value": {"type": "var", "name": "file"}
+ }
+ ]
+ }
+ , "cmd":
+ [ "/bin/sh"
+ , "-c"
+ , "cp in out && chmod 644 out && /bin/ed out < script.ed > log 2>&1 || (cat log && exit 1)"
+ ]
+ , "outs": ["out"]
+ }
+ ]
+ ]
+ , "body":
+ { "type": "singleton_map"
+ , "key": {"type": "var", "name": "file_name"}
+ , "value":
+ { "type": "lookup"
+ , "map": {"type": "var", "name": "action output"}
+ , "key": "out"
+ }
+ }
+ }
+ }
+ }
+ ]
+ , [ "artifacts"
+ , { "type": "disjoint_map_union"
+ , "msg": "srcs artifacts must not overlap"
+ , "$1":
+ { "type": "++"
+ , "$1": {"type": "var", "name": "patched files per target"}
+ }
+ }
+ ]
+ ]
+ , "body":
+ {"type": "RESULT", "artifacts": {"type": "var", "name": "artifacts"}}
+ }
+ }
+}
+```
+
+A typical invocation of that rule would be a target file like the
+following.
+
+``` jsonc
+{ "input.txt":
+ { "type": "ed patch"
+ , "script": ["%g/world/s//user/g", "%g/World/s//USER/g"]
+ , "srcs": [["FILE", null, "input.txt"]]
+ }
+}
+```
+
+As the input file has the same name as a target (in the same module), we
+use the explicit file reference in the specification of the sources.
+
+### Implicit dependencies and config transitions
+
+Say, instead of patching a file, we want to generate source files from
+some high-level description using our actively developed code generator.
+Then we have to take some additional considerations into account.
+
+ - First of all, every target defined by this rule not only depends on
+ the targets the user specifies. Additionally, our code generator is
+ also an implicit dependency. And as it is under active development,
+ we certainly do not want it to be taken from the ambient build
+ environment (as we did in the previous example with `ed` which,
+ however, is a pretty stable tool). So we use an `implicit` target
+ for this.
+ - Next, we notice that our code generator is used during the build. In
+ particular, we want that tool (written in some compiled language) to
+ be built for the platform we run our actions on, not the target
+ platform we build our final binaries for. Therefore, we have to use
+ a configuration transition.
+ - As our defining expression also needs the configuration transition
+ to access the artifacts of that implicit target, we better define it
+ as a reusable expression. Other rules in our rule collection might
+ also have the same task; so `["transitions", "for host"]` might be a
+ good place to define it. In fact, it can look like the expression
+ with that name in our own code base.
+
+So, the overall organization of our rule might be as follows.
+
+``` jsonc
+{ "generated code":
+ { "target_fields": ["srcs"]
+ , "implicit": {"generator": [["generators", "foogen"]]}
+ , "config_vars": ["HOST_ARCH"]
+ , "imports": {"for host": ["transitions", "for host"]}
+ , "config_transitions":
+ {"generator": [{"type": "CALL_EXPRESSION", "name": "for host"}]}
+ , "expression": ...
+ }
+}
+```
+
+### Providing information to consuming targets
+
+In the simple case of patching, the resulting file is indeed the only
+information the consumer of that target needs; in fact, the main point
+was that the resulting target could be a drop-in replacement of a source
+file. A typical rule, however, defines something like a library and a
+library is much more, than just the actual library file and the public
+headers: a library may depend on other libraries; therefore, in order to
+use it, we need
+
+ - to have the header files of dependencies available that might be
+ included by the public header files of that library,
+ - to have the libraries transitively depended upon available during
+ linking, and
+ - to know the order in which to link the dependencies (as they might
+ have dependencies among each other).
+
+In order to keep a maintainable build description, all this should be
+taken care of by simply depending on that library. We do
+*not* want the consumer of a target to have to be aware of
+such transitive dependencies (e.g., when constructing the link command
+line), as used to be the case in early build tools like `make`.
+
+It is a deliberate design choice that a target is given only by the
+result of its analysis, regardless of where it is coming from.
+Therefore, all this information needs to be part of the result of a
+target. Such kind of information is precisely what the mentioned
+`"provides"` map is for. As a map, it can contain an arbitrary amount of
+information, and the interface function `"DEP_PROVIDES"` is designed in
+such a way that adding more providers does not affect targets not aware
+of them (there is no function asking for all providers of a target). The
+keys and their meaning have to be agreed upon by a target and its
+consumers. As the latter, however, typically are targets of the same
+family (authored by the same group), this usually is not a problem.
+
+A typical example of computing a provided value is the `"link-args"` in
+the rules used by `just` itself. They are defined by the following
+expression.
+
+``` jsonc
+{ "type": "nub_right"
+, "$1":
+ { "type": "++"
+ , "$1":
+ [ {"type": "keys", "$1": {"type": "var", "name": "lib"}}
+ , {"type": "CALL_EXPRESSION", "name": "link-args-deps"}
+ , {"type": "var", "name": "link external", "default": []}
+ ]
+ }
+}
+```
+
+This expression
+
+ - collects the respective provider of its dependencies,
+ - adds itself in front, and
+ - deduplicates the resulting list, keeping only the right-most
+ occurrence of each entry.
+
+In this way, the invariant is kept that the `"link-args"` form a
+topological ordering of the dependencies (in the sense that each entry
+is mentioned before its dependencies).
diff --git a/doc/concepts/rules.org b/doc/concepts/rules.org
deleted file mode 100644
index d4c61b5e..00000000
--- a/doc/concepts/rules.org
+++ /dev/null
@@ -1,551 +0,0 @@
-* User-defined Rules
-
-Targets are defined in terms of high-level concepts like "libraries",
-"binaries", etc. In order to translate these high-level definitions
-into actionable tasks, the user defines rules, explaining at a
-single point how all targets of a given type are built.
-
-** Rules files
-
-Rules are defined in rules files (by default named ~RULES~). Those
-contain a JSON object mapping rule names to their rule definition.
-For rules, the same naming scheme as for targets applies. However,
-built-in rules (always named by a single string) take precedence
-in naming; to explicitly refer to a rule defined in the current
-module, the module has to be specified, possibly by a relative
-path, e.g., ~["./", ".", "install"]~.
-
-** Basic components of a rule
-
-A rule is defined through a JSON object with various keys. The only
-mandatory key is ~"expression"~ containing the defining expression
-of the rule.
-
-*** ~"config_fields"~, ~"string_fields"~ and ~"target_fields"~
-
-These keys specify the fields that a target defined by that rule can
-have. In particular, those have to be disjoint lists of strings.
-
-For ~"config_fields"~ and ~"string_fields"~ the respective field
-has to evaluate to a list of strings, whereas ~"target_fields"~
-have to evaluate to a list of target references. Those references
-are evaluated immediately, and in the name context of the target
-they occur in.
-
-The difference between ~"config_fields"~ and ~"string_fields"~ is
-that ~"config_fields"~ are evaluated before the target fields and
-hence can be used by the rule to specify config transitions for the
-target fields. ~"string_fields"~ on the other hand are evaluated
-_after_ the target fields; hence the rule cannot use them to
-specify a configuration transition, however the target definition
-in those fields may use the ~"outs"~ and ~"runfiles"~ functions to
-have access to the names of the artifacts or runfiles of a target
-specified in one of the target fields.
-
-*** ~"implicit"~
-
-This key specifies a map of implicit dependencies. The keys of the
-map are additional target fields, the values are the fixed list
-of targets for those fields. If a short-form name of a target is
-used (e.g., only a string instead of a module-target pair), it is
-interpreted relative to the repository and module the rule is defined
-in, not the one the rule is used in. Other than this, those fields
-are evaluated the same way as target fields settable on invocation
-of the rule.
-
-*** ~"config_vars"~
-
-This is a list of strings specifying which parts of the configuration
-the rule uses. The defining expression of the rule is evaluated in an
-environment that is the configuration restricted to those variables;
-if one of those variables is not specified in the configuration
-the value in the restriction is ~null~.
-
-*** ~"config_transitions"~
-
-This key specifies a map of (some of) the target fields (whether
-declared as ~"target_fields"~ or as ~"implicit"~) to a configuration
-expression. Here, a configuration expression is any expression
-in our language. It has access to the ~"config_vars"~ and the
-~"config_fields"~ and has to evaluate to a list of maps. Each map
-specifies a transition to the current configuration by amending
-it on the domain of that map to the given value.
-
-*** ~"imports"~
-
-This specifies a map of expressions that can later be used by
-~CALL_EXPRESSION~. In this way, duplication of (rule) code can be
-avoided. For each key, we have to have a name of an expression;
-expressions are named following the same naming scheme as targets
-and rules. The names are resolved in the context of the rule.
-Expressions themselves are defined in expression files, the default
-name being ~EXPRESSIONS~.
-
-Each expression is a JSON object. The only mandatory key is
-~"expression"~ which has to be an expression in our language. It
-optionally can have a key ~"vars"~ where the value has to be a list
-of strings (and the default is the empty list). Additionally, it
-can have another optional key ~"imports"~ following the same scheme
-as the ~"imports"~ key of a rule; in the ~"imports"~ key of an
-expression, names are resolved in the context of that expression.
-It is a requirement that the ~"imports"~ graph be cycle free.
-
-*** ~"expression"~
-
-This specifies the defining expression of the rule. The value has to
-be an expression of our expression language (basically, an abstract
-syntax tree serialized as JSON). It has access to the following
-extra functions and, when evaluated, has to return a result value.
-
-**** ~FIELD~
-
-The field function takes one argument, ~name~ which has to evaluate
-to the name of a field. For string fields, the given list of strings
-is returned; for target fields, the list of abstract names for the
-given target is returned. These abstract names are opaque within
-the rule language (but meaningful when reported in error messages)
-and should only be used to be passed on to other functions that
-expect names as inputs.
-
-**** ~DEP_ARTIFACTS~ and ~DEP_RUNFILES~
-
-These functions give access to the artifacts, or runfiles, respectively,
-of one of the targets depended upon. It takes two (evaluated)
-arguments, the mandatory ~"dep"~ and the optional ~"transition"~.
-
-The argument ~"dep"~ has to evaluate to an abstract name (as can be
-obtained from the ~FIELD~ function) of some target specified in one
-of the target fields. The ~"transition"~ argument has to evaluate
-to a configuration transition (i.e., a map) and the empty transition
-is taken as default. It is an error to request a target-transition
-pair for a target that was not requested in the given transition
-through one of the target fields.
-
-**** ~DEP_PROVIDES~
-
-This function gives access to a particular entry of the provides
-map of one of the targets depended upon. The arguments ~"dep"~
-and ~"transition"~ are as for ~DEP_ARTIFACTS~; additionally, there
-is the mandatory argument ~"provider"~ which has to evaluate to a
-string. The function returns the value of the provides map of the
-target at the given provider. If the key is not in the provides
-map (or the value at that key is ~null~), the optional argument
-~"default"~ is evaluated and returned. The default for ~"default"~
-is the empty list.
-
-**** ~BLOB~
-
-The ~BLOB~ function takes a single (evaluated) argument ~data~
-which is optional and defaults to the empty string. This argument
-has to evaluate to a string. The function returns an artifact that
-is a non-executable file with the given string as content.
-
-**** ~TREE~
-
-The ~TREE~ function takes a single (evaluated) argument ~$1~ which
-has to be a map of artifacts. The result is a single tree artifact
-formed from the input map. It is an error if the map cannot be
-transformed into a tree (e.g., due to staging conflicts).
-
-**** ~ACTION~
-
-Actions are a way to define new artifacts from (zero or more) already
-defined artifacts by running a command, typically a compiler, linker,
-archiver, etc. The action function takes the following arguments.
-- ~"inputs"~ A map of artifacts. These artifacts are present when
- the command is executed; the keys of the map are the relative path
- from the working directory of the command. The command must not
- make any assumption about the location of the working directory
- in the file system (and instead should refer to files by path
- relative to the working directory). Moreover, the command must
- not modify the input files in any way. (In-place operations can
- be simulated by staging, as is shown in the example later in
- this document.)
-
- It is an additional requirement that no conflicts occur when
- interpreting the keys as paths. For example, ~"foo.txt"~ and
- ~"./foo.txt"~ are different as strings and hence legitimately
- can be assigned different values in a map. When interpreted as
- a path, however, they name the same path; so, if the ~"inputs"~
- map contains both those keys, the corresponding values have
- to be equal.
-- ~"cmd"~ The command to execute, given as ~argv~ vector, i.e.,
- a non-empty list of strings. The 0'th element of that list will
- also be the program to be executed.
-- ~"env"~ The environment in which the command should be executed,
- given as a map of strings to strings.
-- ~"outs"~ and ~"out_dirs"~ Two list of strings naming the files
- and directories, respectively, the command is expected to create.
- It is an error if the command fails to create the promised output
- files. These two lists have to be disjoint, but an entry of
- ~"outs"~ may well name a location inside one of the ~"out_dirs"~.
-
-This function returns a map with keys the strings mentioned in
-~"outs"~ and ~"out_dirs"~. As values this map has artifacts defined
-to be the ones created by running the given command (in the given
-environment with the given inputs).
-
-**** ~RESULT~
-
-The ~RESULT~ function is the only way to obtain a result value.
-It takes three (evaluated) arguments, ~"artifacts"~, ~"runfiles"~, and
-~"provides"~, all of which are optional and default to the empty map.
-It defines the result of a target that has the given artifacts,
-runfiles, and provided data, respectively. In particular, ~"artifacts"~
-and ~"runfiles"~ have to be maps to artifacts, and ~"provides"~ has
-to be a map. Moreover, they keys in ~"runfiles"~ and ~"artifacts"~
-are treated as paths; it is an error if this interpretation yields
-to conflicts. The keys in the artifacts or runfile maps as seen by
-other targets are the normalized paths of the keys given.
-
-
-Result values themselves are opaque in our expression language
-and cannot be deconstructed in any way. Their only purpose is to
-be the result of the evaluation of the defining expression of a target.
-
-**** ~CALL_EXPRESSION~
-
-This function takes one mandatory argument ~"name"~ which is
-unevaluated; it has to a be a string literal. The expression imported
-by that name through the imports field is evaluated in the current
-environment restricted to the variables of that expression. The result
-of that evaluation is the result of the ~CALL_EXPRESSION~ statement.
-
-During the evaluation of an expression, rule fields can still be
-accessed through the functions ~FIELD~, ~DEP_ARTIFACTS~, etc. In
-particular, even an expression with no variables (that, hence, is
-always evaluated in the empty environment) can carry out non-trivial
-computations and be non-constant. The special functions ~BLOB~,
-~ACTION~, and ~RESULT~ are also available. If inside the evaluation
-of an expression the function ~CALL_EXPRESSION~ is used, the name
-argument refers to the ~"imports"~ map of that expression. So the
-call graph is deliberately recursion free.
-
-** Evaluation of a target
-
-A target defined by a user-defined rule is evaluated in the
-following way.
-
-- First, the config fields are evaluated.
-
-- Then, the target-fields are evaluated. This happens for each
- field as follows.
- - The configuration transition for this field is evaluated and
- the transitioned configurations determined.
- - The argument expression for this field is evaluated. The result
- is interpreted as a list of target names. Each of those targets
- is analyzed in all the specified configurations.
-
-- The string fields are evaluated. If the expression for a string
- field queries a target (via ~outs~ or ~runfiles~), the value for
- that target is returned in the first configuration. The rational
- here is that such generator expressions are intended to refer to
- the corresponding target in its "main" configuration; they are
- hardly used anyway for fields branching their targets over many
- configurations.
-
-- The effective configuration for the target is determined. The target
- effectively has used of the configuration the variables used by
- the ~arguments_config~ in the rule invocation, the ~config_vars~
- the rule specified, and the parts of the configuration used by
- a target dependent upon. For a target dependent upon, all parts
- it used of its configuration are relevant expect for those fixed
- by the configuration transition.
-
-- The rule expression is evaluated and the result of that evaluation
- is the result of the rule.
-
-** Example of developing a rule
-
-Let's consider step by step an example of writing a rule. Say we want
-to write a rule that programmatically patches some files.
-
-*** Framework: The minimal rule
-
-Every rule has to have a defining expression evaluating
-to a ~RESULT~. So the minimally correct rule is the ~"null"~
-rule in the following example rule file.
-
-#+BEGIN_SRC
-{ "null": {"expression": {"type": "RESULT"}}}
-#+END_SRC
-
-This rule accepts no parameters, and has the empty map as artifacts,
-runfiles, and provided data. So it is not very useful.
-
-*** String inputs
-
-Let's allow the target definition to have some fields. The most
-simple fields are ~string_fields~; they are given by a list of
-strings. In the defining expression we can access them directly via
-the ~FIELD~ function. Strings can be used when defining maps, but
-we can also create artifacts from them, using the ~BLOB~ function.
-To create a map, we can use the ~singleton_map~ function. We define
-values step by step, using the ~let*~ construct.
-
-#+BEGIN_SRC
-{ "script only":
- { "string_fields": ["script"]
- , "expression":
- { "type": "let*"
- , "bindings":
- [ [ "script content"
- , { "type": "join"
- , "separator": "\n"
- , "$1":
- { "type": "++"
- , "$1":
- [["H"], {"type": "FIELD", "name": "script"}, ["w", "q", ""]]
- }
- }
- ]
- , [ "script"
- , { "type": "singleton_map"
- , "key": "script.ed"
- , "value":
- {"type": "BLOB", "data": {"type": "var", "name": "script content"}}
- }
- ]
- ]
- , "body":
- {"type": "RESULT", "artifacts": {"type": "var", "name": "script"}}
- }
- }
-}
-#+END_SRC
-
-*** Target inputs and derived artifacts
-
-Now it is time to add the input files. Source files are targets like
-any other target (and happen to contain precisely one artifact). So
-we add a target field ~"srcs"~ for the file to be patched. Here we
-have to keep in mind that, on the one hand, target fields accept a
-list of targets and, on the other hand, the artifacts of a target
-are a whole map. We chose to patch all the artifacts of all given
-~"srcs"~ targets. We can iterate over lists with ~foreach~ and maps
-with ~foreach_map~.
-
-Next, we have to keep in mind that targets may place their artifacts
-at arbitrary logical locations. For us that means that first
-we have to make a decision at which logical locations we want
-to place the output artifacts. As one thinks of patching as an
-in-place operation, we chose to logically place the outputs where
-the inputs have been. Of course, we do not modify the input files
-in any way; after all, we have to define a mathematical function
-computing the output artifacts, not a collection of side effects.
-With that choice of logical artifact placement, we have to decide
-what to do if two (or more) input targets place their artifacts at
-logically the same location. We could simply take a "latest wins"
-semantics (keep in mind that target fields give a list of targets,
-not a set) as provided by the ~map_union~ function. We chose to
-consider it a user error if targets with conflicting artifacts are
-specified. This is provided by the ~disjoint_map_union~ that also
-allows to specify an error message to be provided the user. Here,
-conflict means that values for the same map position are defined
-in a different way.
-
-The actual patching is done by an ~ACTION~. We have the script
-already; to make things easy, we stage the input to a fixed place
-and also expect a fixed output location. Then the actual command
-is a simple shell script. The only thing we have to keep in mind
-is that we want useful output precisely if the action fails. Also
-note that, while we define our actions sequentially, they will
-be executed in parallel, as none of them depends on the output of
-another one of them.
-
-#+BEGIN_SRC
-{ "ed patch":
- { "string_fields": ["script"]
- , "target_fields": ["srcs"]
- , "expression":
- { "type": "let*"
- , "bindings":
- [ [ "script content"
- , { "type": "join"
- , "separator": "\n"
- , "$1":
- { "type": "++"
- , "$1":
- [["H"], {"type": "FIELD", "name": "script"}, ["w", "q", ""]]
- }
- }
- ]
- , [ "script"
- , { "type": "singleton_map"
- , "key": "script.ed"
- , "value":
- {"type": "BLOB", "data": {"type": "var", "name": "script content"}}
- }
- ]
- , [ "patched files per target"
- , { "type": "foreach"
- , "var": "src"
- , "range": {"type": "FIELD", "name": "srcs"}
- , "body":
- { "type": "foreach_map"
- , "var_key": "file_name"
- , "var_val": "file"
- , "range":
- {"type": "DEP_ARTIFACTS", "dep": {"type": "var", "name": "src"}}
- , "body":
- { "type": "let*"
- , "bindings":
- [ [ "action output"
- , { "type": "ACTION"
- , "inputs":
- { "type": "map_union"
- , "$1":
- [ {"type": "var", "name": "script"}
- , { "type": "singleton_map"
- , "key": "in"
- , "value": {"type": "var", "name": "file"}
- }
- ]
- }
- , "cmd":
- [ "/bin/sh"
- , "-c"
- , "cp in out && chmod 644 out && /bin/ed out < script.ed > log 2>&1 || (cat log && exit 1)"
- ]
- , "outs": ["out"]
- }
- ]
- ]
- , "body":
- { "type": "singleton_map"
- , "key": {"type": "var", "name": "file_name"}
- , "value":
- { "type": "lookup"
- , "map": {"type": "var", "name": "action output"}
- , "key": "out"
- }
- }
- }
- }
- }
- ]
- , [ "artifacts"
- , { "type": "disjoint_map_union"
- , "msg": "srcs artifacts must not overlap"
- , "$1":
- { "type": "++"
- , "$1": {"type": "var", "name": "patched files per target"}
- }
- }
- ]
- ]
- , "body":
- {"type": "RESULT", "artifacts": {"type": "var", "name": "artifacts"}}
- }
- }
-}
-#+END_SRC
-
-A typical invocation of that rule would be a target file like the following.
-#+BEGIN_SRC
-{ "input.txt":
- { "type": "ed patch"
- , "script": ["%g/world/s//user/g", "%g/World/s//USER/g"]
- , "srcs": [["FILE", null, "input.txt"]]
- }
-}
-#+END_SRC
-As the input file has the same name as a target (in the same module),
-we use the explicit file reference in the specification of the sources.
-
-*** Implicit dependencies and config transitions
-
-Say, instead of patching a file, we want to generate source files
-from some high-level description using our actively developed code
-generator. Then we have to do some additional considerations.
-- First of all, every target defined by this rule not only depends
- on the targets the user specifies. Additionally, our code
- generator is also an implicit dependency. And as it is under
- active development, we certainly do not want it to be taken from
- the ambient build environment (as we did in the previous example
- with ~ed~ which, however, is a pretty stable tool). So we use an
- ~implicit~ target for this.
-- Next, we notice that our code generator is used during the
- build. In particular, we want that tool (written in some compiled
- language) to be built for the platform we run our actions on, not
- the target platform we build our final binaries for. Therefore,
- we have to use a configuration transition.
-- As our defining expression also needs the configuration transition
- to access the artifacts of that implicit target, we better define
- it as a reusable expression. Other rules in our rule collection
- might also have the same task; so ~["transitions", "for host"]~
- might be a good place to define it. In fact, it can look like
- the expression with that name in our own code base.
-
-So, the overall organization of our rule might be as follows.
-
-#+BEGIN_SRC
-{ "generated code":
- { "target_fields": ["srcs"]
- , "implicit": {"generator": [["generators", "foogen"]]}
- , "config_vars": ["HOST_ARCH"]
- , "imports": {"for host": ["transitions", "for host"]}
- , "config_transitions":
- {"generator": [{"type": "CALL_EXPRESSION", "name": "for host"}]}
- , "expression": ...
- }
-}
-#+END_SRC
-
-*** Providing information to consuming targets
-
-In the simple case of patching, the resulting file is indeed the
-only information the consumer of that target needs; in fact, the main
-point was that the resulting target could be a drop-in replacement
-of a source file. A typical rule, however, defines something like
-a library and a library is much more, than just the actual library
-file and the public headers: a library may depend on other libraries;
-therefore, in order to use it, we need
-- to have the header files of dependencies available that might be
- included by the public header files of that library,
-- to have the libraries transitively depended upon available during
- linking, and
-- to know the order in which to link the dependencies (as they
- might have dependencies among each other).
-In order to keep a maintainable build description, all this should
-be taken care of by simply depending on that library. We do _not_
-want the consumer of a target having to be aware of such transitive
-dependencies (e.g., when constructing the link command line), as
-it used to be the case in early build tools like ~make~.
-
-It is a deliberate design choice that a target is given only by
-the result of its analysis, regardless of where it is coming from.
-Therefore, all this information needs to be part of the result of
-a target. Such kind of information is precisely, what the mentioned
-~"provides"~ map is for. As a map, it can contain an arbitrary
-amount of information and the interface function ~"DEP_PROVIDES"~
-is in such a way that adding more providers does not affect targets
-not aware of them (there is no function asking for all providers
-of a target). The keys and their meaning have to be agreed upon
-by a target and its consumers. As the latter, however, typically
-are a target of the same family (authored by the same group), this
-usually is not a problem.
-
-A typical example of computing a provided value is the ~"link-args"~
-in the rules used by ~just~ itself. They are defined by the following
-expression.
-#+BEGIN_SRC
-{ "type": "nub_right"
-, "$1":
- { "type": "++"
- , "$1":
- [ {"type": "keys", "$1": {"type": "var", "name": "lib"}}
- , {"type": "CALL_EXPRESSION", "name": "link-args-deps"}
- , {"type": "var", "name": "link external", "default": []}
- ]
- }
-}
-#+END_SRC
-This expression
-- collects the respective provider of its dependencies,
-- adds itself in front, and
-- deduplicates the resulting list, keeping only the right-most
- occurrence of each entry.
-In this way, the invariant is kept, that the ~"link-args"~ from a
-topological ordering of the dependencies (in the order that a each
-entry is mentioned before its dependencies).
diff --git a/doc/concepts/target-cache.md b/doc/concepts/target-cache.md
new file mode 100644
index 00000000..0db627e1
--- /dev/null
+++ b/doc/concepts/target-cache.md
@@ -0,0 +1,231 @@
+Target-level caching
+====================
+
+`git` trees as content-fixed roots
+----------------------------------
+
+### The `"git tree"` root scheme
+
+The multi-repository configuration supports a scheme `"git tree"`. This
+scheme is given by two parameters,
+
+ - the id of the tree (as a string with the hex encoding), and
+ - an arbitrary `git` repository containing the specified tree object,
+ as well as all needed tree and blob objects reachable from that
+ tree.
+
+For example, a root could be specified as follows.
+
+``` jsonc
+["git tree", "6a1820e78f61aee6b8f3677f150f4559b6ba77a4", "/usr/local/src/justbuild.git"]
+```
+
+It should be noted that the `git` tree identifier alone already
+specifies the content of the full tree. However, `just` needs access to
+some repository containing the tree in order to know what the tree looks
+like.
+
+Nevertheless, it is an important observation that the tree identifier
+alone already specifies the content of the whole (logical) directory.
+The equality of two such directories can be established by comparing the
+two identifiers *without* the need to read any file from
+disk. Those "fixed-content" descriptions, i.e., descriptions of a
+repository root that already fully determine the content, are the key to
+caching whole targets.
+
+### `KNOWN` artifacts
+
+The in-memory representation of known artifacts has an optional
+reference to a repository containing that artifact. Artifacts "known"
+from local repositories might not be known to the CAS used for the
+action execution; this additional reference allows filling such misses
+in the CAS.
+
+Content-fixed repositories
+--------------------------
+
+### The parts of a content-fixed repository
+
+In order to meaningfully cache a target, we need to be able to
+efficiently compute the cache key. We restrict this to the case where we
+can compute the information about the repository without file-system
+access. This requires that all roots (workspace, target root, etc) be
+content fixed, as well as the bindings of the free repository names (and
+hence also all transitively reachable repositories). We call such
+repositories "content-fixed" repositories.
+
+### Canonical description of a content-fixed repository
+
+The local data of a repository consists of the following.
+
+ - The roots (for workspace, targets, rules, expressions). As the tree
+ identifier already defines the content, we leave out the path to the
+ repository containing the tree.
+ - The names of the targets, rules, and expression files.
+ - The names of the outgoing "bindings".
+
+Additionally, repositories can reach additional repositories via
+bindings. Moreover, this repository-level dependency relation is not
+necessarily cycle free. In particular, we cannot use the tree unfolding
+as canonical representation of that graph up to bisimulation, as we do
+with most other data structures. To still get a canonical
+representation, we factor out the largest bisimulation, i.e., minimize
+the respective automaton (with repositories as states, local data as
+locally observable properties, and the binding relation as edges).
+
+Finally, for each repository individually, the reachable repositories
+are renamed `"0"`, `"1"`, `"2"`, etc, following a depth-first traversal
+starting from the repository in question where outgoing edges are
+traversed in lexicographical order. The entry point is hence
+recognisable as repository `"0"`.
+
+The repository key is the content identifier of the canonical
+serialisation of the JSON encoding of the obtained multi-repository
+configuration (with repository-free git-root descriptions). The
+serialisation itself is stored in CAS.
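+
+Schematically (leaving out roots other than the workspace root; the
+tree identifiers and binding names are made up for illustration), such
+a canonical multi-repository configuration could look as follows:
+
+``` jsonc
+{ "repositories":
+  { "0":
+    { "workspace_root":
+      ["git tree", "6a1820e78f61aee6b8f3677f150f4559b6ba77a4"]
+    , "bindings": {"base": "1"}
+    }
+  , "1":
+    { "workspace_root":
+      ["git tree", "0123456789abcdef0123456789abcdef01234567"]
+    }
+  }
+}
+```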
+
+These identifications and the replacement of global names do not change
+the semantics, as our name data types are completely opaque to our
+expression language. In the `"json_encode"` expression, they are
+serialized as `null`, and a string representation is only generated in
+user messages not available to the language itself. Moreover, names
+cannot be compared for equality either, so their only observable
+properties, i.e., the way `"DEP_ARTIFACTS"`, `"DEP_RUNFILES"`, and
+`"DEP_PROVIDES"` react to them, are invariant under repository
+bisimulation.
+
+Configuration and the `"export"` rule
+-------------------------------------
+
+Targets not only depend on the content of their repository, but also on
+their configurations. Normally, the effective part of a configuration is
+only determined after analysing the target. However, for caching, we
+need to compute the cache key directly. This property is provided by the
+built-in `"export"` rule; only `"export"` targets residing in
+content-fixed repositories will be cached. This also serves as
+indication, which targets of a repository are intended for consumption
+by other repositories.
+
+An `"export"` rule takes precisely the following arguments.
+
+ - `"target"` specifying a single target, the target to be cached. It
+ must not be tainted.
+ - `"flexible_config"` a list of strings; those specify the variables
+ of the configuration that are considered. All other parts of the
+ configuration are ignored. So the effective configuration for the
+ `"export"` target is the configuration restricted to those variables
+ (filled up with `null` if the variable was not present in the
+ original configuration).
+ - `"fixed_config"` a dict of arbitrary JSON values (taken
+ unevaluated) with keys disjoint from the `"flexible_config"`.
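+
+A target file using the `"export"` rule could hence look as follows
+(all names and configuration variables are hypothetical):
+
+``` jsonc
+{ "exported-lib":
+  { "type": "export"
+  , "target": "mylib"
+  , "flexible_config": ["ARCH", "DEBUG"]
+  , "fixed_config": {"TOOLCHAIN": "gcc"}
+  }
+}
+```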
+
+An `"export"` target is analyzed as follows. The configuration is
+restricted to the variables specified in the `"flexible_config"`; this
+will result in the effective configuration for the exported target. It
+is a requirement that the effective configuration contain only pure JSON
+values. The (necessarily conflict-free) union with the `"fixed_config"`
+is computed and the `"target"` is evaluated in this configuration. The
+result (artifacts, runfiles, provided information) is the result of that
+evaluation. It is a requirement that the provided information contains
+only pure JSON values and artifacts (including tree artifacts); in
+particular, they may not contain names.
+
+Cache key
+---------
+
+We only consider `"export"` targets in content-fixed repositories for
+caching. An export target is then fully described by
+
+ - the repository key of the repository the export target resides in,
+ - the target name of the export target within that repository,
+ described as module-name pair, and
+ - the effective configuration.
+
+More precisely, the canonical description is the JSON object with those
+values for the keys `"repo_key"`, `"target_name"`, and
+`"effective_config"`, respectively. The cache key is the blob
+identifier of the canonical serialisation (including sorted keys, etc)
+of the just described piece of JSON. To allow debugging and cooperation
+with other tools, whenever a cache key is computed, it is ensured that
+the serialisation ends up in the applicable CAS.
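+
+For illustration (with a hypothetical repository key, target, and
+configuration), such a canonical description could be the following
+JSON object:
+
+``` jsonc
+{ "repo_key": "0123456789abcdef0123456789abcdef01234567"
+, "target_name": ["src", "exported-lib"]
+, "effective_config": {"ARCH": "x86_64", "DEBUG": null}
+}
+```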
+
+It should be noted that the cache key can be computed
+*without* analyzing the target referred to. This is
+possible, as the configuration is pruned a priori, instead of following
+the usual procedure of analysing first and only afterwards determining
+the parts of the configuration that were relevant.
+
+Cached value
+------------
+
+The value to be cached is the result of evaluating the target, that is,
+its artifacts, runfiles, and provided data. All artifacts inside those
+data structures will be described as known artifacts.
+
+As serialisation, we will essentially use our usual JSON encoding; while
+this can be used as is for artifacts and runfiles where we know that
+they have to be a map from strings to artifacts, additional information
+will be added for the provided data. The provided data can contain
+artifacts, but also legitimately pure JSON values that coincide with our
+JSON encoding of artifacts; the same holds true for nodes and result
+values. Moreover, the tree unfolding implicit in the JSON serialisation
+can be exponentially larger than the value.
+
+Therefore, in our serialisation, we add an entry for every subexpression
+and separately add a list of which subexpressions are artifacts, nodes,
+or results. During deserialisation, we use this subexpression structure
+to deserialize every subexpression only once.
+
+Sharding of target cache
+------------------------
+
+In our target description, the execution environment is not included.
+For local execution, it is implicit anyway. As we also want to cache
+high-level targets when using remote execution, we shard the target
+cache (e.g., by using appropriate subdirectories) by the blob identifier
+of the serialisation of the description of the execution backend. Here,
+`null` stands for local execution, and for remote execution we use an
+object with keys `"remote_execution_address"` and
+`"remote_execution_properties"` filled in the obvious way. As usual, we
+add the serialisation to the CAS.
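+
+For example (the address and properties are hypothetical), the
+description of a remote-execution backend could be the following
+object, and the target cache would be sharded by the blob identifier
+of its canonical serialisation:
+
+``` jsonc
+{ "remote_execution_address": "build.example.com:8980"
+, "remote_execution_properties": {"OS": "linux"}
+}
+```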
+
+`"export"` targets, strictness and the extensional projection
+-------------------------------------------------------------
+
+As opposed to the target that is exported, the corresponding export
+target, if part of a content-fixed repository, will be strict: a build
+depending on such a target can only succeed if all artifacts in the
+result of the target (regardless of whether they are direct artifacts,
+runfiles, or part of the provided data) can be built, even if not all
+(or even none)
+are actually used in the build.
+
+Upon cache hit, the artifacts of an export target are the known
+artifacts corresponding to the artifacts of the exported target. While
+extensionally equal, known artifacts are defined differently, so an
+export target and the exported target are intensionally different (and
+that difference might only be visible on the second build). As
+intensional equality is used when testing for absence of conflicts in
+staging, a target and its exported version almost always conflict and
+hence should not be used together. One way to achieve this is to always
+use the export target for any target that is exported. This fits well
+together with the recommendation of only depending on export targets of
+other repositories.
+
+If a target forwards artifacts of an exported target (indirect header
+files, indirect link dependencies, etc), and is exported again, no
+additional conflicts occur; replacing by the corresponding known
+artifact is a projection: the known artifact corresponding to a known
+artifact is the artifact itself. Moreover, by the strictness property
+described earlier, if an export target has a cache hit, then so have all
+export targets it depends upon. Keep in mind that a repository can only
+be content-fixed if all its dependencies are.
+
+For this strictness-based approach to work, it is, however, a
+requirement that any artifact that is exported (typically indirectly,
+e.g., as part of a common dependency) by several targets is only used
+through the same export target. For a well-structured repository, this
+should be a natural property anyway.
+
+The forwarding of artifacts is the reason why we chose that, in the
+non-cached analysis of an export target, the artifacts are passed on as
+received and are not wrapped in an "add to cache" action. The latter
+choice would violate the projection property we rely upon.
diff --git a/doc/concepts/target-cache.org b/doc/concepts/target-cache.org
deleted file mode 100644
index 591a66af..00000000
--- a/doc/concepts/target-cache.org
+++ /dev/null
@@ -1,219 +0,0 @@
-* Target-level caching
-
-** ~git~ trees as content-fixed roots
-
-*** The ~"git tree"~ root scheme
-
-The multi-repository configuration supports a scheme ~"git tree"~.
-This scheme is given by two parameters,
-- the id of the tree (as a string with the hex encoding), and
-- an arbitrary ~git~ repository containing the specified tree
- object, as well as all needed tree and blob objects reachable
- from that tree.
-For example, a root could be specified as follows.
-#+BEGIN_SRC
-["git tree", "6a1820e78f61aee6b8f3677f150f4559b6ba77a4", "/usr/local/src/justbuild.git"]
-#+END_SRC
-
-It should be noted that the ~git~ tree identifier alone already
-specifies the content of the full tree. However, ~just~ needs access
-to some repository containing the tree in order to know what the
-tree looks like.
-
-Nevertheless, it is an important observation that the tree identifier
-alone already specifies the content of the whole (logical) directory.
-The equality of two such directories can be established by comparing
-the two identifiers _without_ the need to read any file from
-disk. Those "fixed-content" descriptions, i.e., descriptions of a
-repository root that already fully determines the content are the
-key to caching whole targets.
-
-*** ~KNOWN~ artifacts
-
-The in-memory representation of known artifacts has an optional
-reference to a repository containing that artifact. Artifacts
-"known" from local repositories might not be known to the CAS used
-for the action execution; this additional reference allows to fill
-such misses in the CAS.
-
-** Content-fixed repositories
-
-*** The parts of a content-fixed repository
-
-In order to meaningfully cache a target, we need to be able to
-efficiently compute the cache key. We restrict this to the case where
-we can compute the information about the repository without file-system
-access. This requires that all roots (workspace, target root, etc)
-be content fixed, as well as the bindings of the free repository
-names (and hence also all transitively reachable repositories).
-The call such repositories "content-fixed" repositories.
-
-*** Canonical description of a content-fixed repository
-
-The local data of a repository consists of the following.
-- The roots (for workspace, targets, rules, expressions). As the
- tree identifier already defines the content, we leave out the
- path to the repository containing the tree.
-- The names of the targets, rules, and expression files.
-- The names of the outgoing "bindings".
-
-Additionally, repositories can reach additional repositories via
-bindings. Moreover, this repository-level dependency relation
-is not necessarily cycle free. In particular, we cannot use the
-tree unfolding as canonical representation of that graph up to
-bisimulation, as we do with most other data structures. To still get
-a canonical representation, we factor out the largest bisimulation,
-i.e., minimize the respective automaton (with repositories as
-states, local data as locally observable properties, and the binding
-relation as edges).
-
-Finally, for each repository individually, the reachable repositories
-are renamed ~"0"~, ~"1"~, ~"2"~, etc, following a depth-first
-traversal starting from the repository in question where outgoing
-edges are traversed in lexicographical order. The entry point is
-hence recognisable as repository ~"0"~.
-
-The repository key content-identifier of the canonically formatted
-canonical serialisation of the JSON encoding of the obtain
-multi-repository configuration (with repository-free git-root
-descriptions). The serialisation itself is stored in CAS.
-
-These identifications and replacement of global names does not change
-the semantics, as our name data types are completely opaque to our
-expression language. In the ~"json_encode"~ expression, they're
-serialized as ~null~ and string representation is only generated in
-user messages not available to the language itself. Moreover, names
-cannot be compared for equality either, so their only observable
-properties, i.e., the way ~"DEP_ARTIFACTS"~, ~"DEP_RUNFILES~, and
-~"DEP_PROVIDES"~ reacts to them are invariant under repository
-bisimulation.
-
-** Configuration and the ~"export"~ rule
-
-Targets not only depend on the content of their repository, but also
-on their configurations. Normally,
-the effective part of a configuration is only determined after
-analysing the target. However, for caching, we need to compute
-the cache key directly. This property is provided by the built-in ~"export"~ rule; only ~"export"~ targets
-residing in content-fixed repositories will be cached. This also
-serves as indication, which targets of a repository are intended
-for consumption by other repositories.
-
-An ~"export"~ rule takes precisely the following arguments.
-- ~"target"~ specifying a single target, the target to be cached.
- It must not be tainted.
-- ~"flexible_config"~ a list of strings; those specify the variables
- of the configuration that are considered. All other parts of
- the configuration are ignored. So the effective configuration for
- the ~"export"~ target is the configuration restricted to those
- variables (filled up with ~null~ if the variable was not present
- in the original configuration).
-- ~"fixed_config"~ a dict with of arbitrary JSON values (taken
- unevaluated) with keys disjoint from the ~"flexible_config"~.
-
-An ~"export"~ target is analyzed as follows. The configuration is
-restricted to the variables specified in the ~"flexible_config"~;
-this will result in the effective configuration for the exported
-target. It is a requirement that the effective configuration contain
-only pure JSON values. The (necessarily conflict-free) union with
-the ~"fixed_config"~ is computed and the ~"target"~ is evaluated
-in this configuration. The result (artifacts, runfiles, provided
-information) is the result of that evaluation. It is a requirement
-that the provided information contains only pure JSON values
-and artifacts (including tree artifacts); in particular, it may
-not contain names.
-
-** Cache key
-
-We only consider ~"export"~ targets in content-fixed repositories
-for caching. An export target is then fully described by
-- the repository key of the repository the export target resides in,
-- the target name of the export target within that repository,
- described as module-name pair, and
-- the effective configuration.
-More precisely, the canonical description is the JSON object with
-those values for the keys ~"repo_key"~, ~"target_name"~, and ~"effective_config"~,
-respectively. The cache key is the blob identifier of the
-canonical serialisation (including sorted keys, etc) of the
-just-described piece of JSON. To allow debugging and cooperation
-with other tools, whenever a cache key is computed, it is ensured
-that the serialisation ends up in the applicable CAS.
-
-It should be noted that the cache key can be computed _without_
-analyzing the target referred to. This is possible, as the
-configuration is pruned a priori instead of following the usual
-procedure of analysing first and only afterwards determining the
-parts of the configuration that were relevant.
-
-** Cached value
-
-The value to be cached is the result of evaluating the target,
-that is, its artifacts, runfiles, and provided data. All artifacts
-inside those data structures will be described as known artifacts.
-
-As serialisation, we will essentially use our usual JSON encoding;
-while this can be used as is for artifacts and runfiles where we
-know that they have to be a map from strings to artifacts, additional
-information will be added for the provided data. The provided data
-can contain artifacts, but also legitimately pure JSON values that
-coincide with our JSON encoding of artifacts; the same holds true
-for nodes and result values. Moreover, the tree unfolding implicit
-in the JSON serialisation can be exponentially larger than the value.
-
-Therefore, in our serialisation, we add an entry for every subexpression
-and separately add a list of which subexpressions are artifacts,
-nodes, or results. During deserialisation, we use this subexpression
-structure to deserialize every subexpression only once.
-
-** Sharding of target cache
-
-In our target description, the execution environment is not included.
-For local execution, it is implicit anyway. As we also want to
-cache high-level targets when using remote execution, we shard the
-target cache (e.g., by using appropriate subdirectories) by the blob
-identifier of the serialisation of the description of the execution
-backend. Here, ~null~ stands for local execution, and for remote
-execution we use an object with keys ~"remote_execution_address"~
-and ~"remote_execution_properties"~ filled in the obvious way. As
-usual, we add the serialisation to the CAS.
-
-** ~"export"~ targets, strictness and the extensional projection
-
-As opposed to the target that is exported, the corresponding export
-target, if part of a content-fixed repository, will be strict: a
-build depending on such a target can only succeed if all artifacts
-in the result of the target (regardless of whether direct artifacts,
-runfiles, or part of the provided data) can be built, even if
-not all (or even none) are actually used in the build.
-
-Upon cache hit, the artifacts of an export target are the known
-artifacts corresponding to the artifacts of the exported target.
-While extensionally equal, known artifacts are defined differently,
-so an export target and the exported target are intensionally
-different (and that difference might only be visible on the second
-build). As intensional equality is used when testing for absence
-of conflicts in staging, a target and its exported version almost
-always conflict and hence should not be used together. One way to
-achieve this is to always use the export target for any target that
-is exported. This fits well together with the recommendation of
-only depending on export targets of other repositories.
-
-If a target forwards artifacts of an exported target (indirect header
-files, indirect link dependencies, etc), and is exported again, no
-additional conflicts occur; replacing by the corresponding known
-artifact is a projection: the known artifact corresponding to a
-known artifact is the artifact itself. Moreover, by the strictness
-property described earlier, if an export target has a cache hit,
-then so have all export targets it depends upon. Keep in mind that
-a repository can only be content-fixed if all its dependencies are.
-
-For this strictness-based approach to work, it is, however, a
-requirement that any artifact that is exported (typically indirectly,
-e.g., as part of a common dependency) by several targets is only
-used through the same export target. For a well-structured repository,
-this should be a natural property anyway.
-
-This forwarding of artifacts is the reason we chose that, in the
-non-cached analysis of an export target, the artifacts are passed on
-as received and are not wrapped in an "add to cache" action. The
-latter choice would violate the projection property we rely upon.
diff --git a/doc/future-designs/computed-roots.md b/doc/future-designs/computed-roots.md
new file mode 100644
index 00000000..8bbff401
--- /dev/null
+++ b/doc/future-designs/computed-roots.md
@@ -0,0 +1,156 @@
+Computed roots
+==============
+
+Status quo
+----------
+
+As of version `1.0.0`, the `just` build tool requires the repository
+configuration, including all roots, to be specified ahead of time. This
+has a couple of consequences.
+
+### Flexible source views, thanks to staging
+
+For source files, the flexibility of using them in a layout different
+from how they occur in the source tree is gained through staging. If a
+different view of sources is needed, instead of a source target, a
+defined target can be used that rearranges the sources as desired. In
+this way, programmatic transformations of source files can also be
+carried out (while the result is still visible at the original
+location), as is done, e.g., by the `["patch", "file"]` rule of the
+`just` main repository.
+
+### Restricted flexibility in target-definitions via globbing
+
+When defining targets, the general principle is that the definition of
+target and action graph only depends on the description (given by the
+target files, the rules and expressions, and the configuration). There
+is, however, a single exception to that rule: a target file may use the
+`GLOB` built-in construct and in this way depend on the index of the
+respective source directory. This makes it possible, e.g., to define a
+separate action for every source file and, in this way, to get good
+incrementality and parallelism, while still having a concise target
+description.
+
+### Modularity in rules through expressions
+
+Rules might share common tasks. For example, for both `C` binaries and
+`C` libraries, the source files have to be compiled to object files. To
+avoid duplication of descriptions, expressions can be called (also from
+expressions themselves).
+
+Use cases that require more flexibility
+---------------------------------------
+
+### Generated target files
+
+Sometimes projects (or parts thereof that can form a separate logical
+repository) have a simple structure. For example, there is a list of
+directories and for each one there is a library, named and staged in a
+systematic way. Repeating all those systematic target files seems
+unnecessary work. Instead, we could store the list of directories to
+consider and a small script containing the naming/staging/globbing
+logic; this approach would also be more maintainable. A similar approach
+could also be attractive for a directory tree with tests where, on top,
+all the individual tests should be collected to test suites.
+
+### Staging according to embedded information
+
+For importing prebuilt libraries, it is sometimes desirable to stage
+them in a way honoring the embedded `soname`. The current approach is to
+provide that information out of band in the target file, so that it can
+be used during analysis. Still, the information is already present in
+the prebuilt binary, causing unnecessary maintenance overhead; instead,
+the target file could be a function of that library which can form its
+own content-fixed root (e.g., a `git tree` root), so that the computed
+value is easily cacheable.
+
+### Simplified rule definition and alternative syntax
+
+Rules can share computation through expressions. However, the interface
+deliberately has to be explicit, including the documentation strings
+that are used by `just describe`. While this allows an easy and
+efficient implementation of `just describe`, there is some redundancy
+involved: often, fields are only there to be used by a common
+expression, but they still have to be documented in a redundant way
+(causing additional maintenance burden).
+
+Moreover, the JSON encoding of abstract syntax trees is unambiguous
+to read and easy to process automatically, but people argue that it
+is hard to write by hand. However, it is unlikely that agreement can
+be reached on which syntax is best to use. Now, if rule and
+expression files could be generated, this argument would become
+moot. Moreover, rules are typically versioned and infrequently
+changed, so the step of generating the official syntax from the
+convenient one would typically be in cache.
+
+Proposal: Support computed roots
+--------------------------------
+
+We propose computed roots as a clean principle to add the needed (and a
+lot more) flexibility for the described use cases, while ensuring that
+all computations of roots are properly cacheable at high level. In this
+way, we do not compromise efficient builds, as the price of the
+additional flexibility, in the typical case, is just a single cache
+lookup. Of course, it is up to the user to ensure that this case really
+is the typical one, in the same way as it is their responsibility to
+describe the targets in a way to have proper incrementality.
+
+### New root type `"computed"`
+
+The `just` multi-repository configuration will allow a new type of root
+(besides `"file"` and `"git tree"` and variants thereof), called
+`"computed"`. A `"computed"` root is given by
+
+ - the (global) name of a repository
+ - the name of a target (in `["module", "target"]` format), and
+ - a configuration (as JSON object, taken literally).
+
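+For concreteness, a computed root in the multi-repository
+configuration might be written as follows; this is only a sketch, as
+the proposal does not fix the concrete syntax, and the repository
+`"base"`, the target `["gen", "targets"]`, and the configuration are
+made up for illustration.
+
+``` jsonc
+{ "repositories":
+  { "base":
+    {"workspace_root": ["git tree", "ca11ab1e...", "/var/repos/base.git"]}
+  , "generated":
+    { "workspace_root":
+      ["computed", "base", ["gen", "targets"], {"ARCH": "x86_64"}]
+    }
+  }
+}
+```
+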
+It is a requirement that the specified target is an `"export"` target
+and that the specified repository is content-fixed; `"computed"` roots are
+considered content-fixed. However, the dependency structure of computed
+roots must be cycle free. In other words, there must exist an ordering
+of computed roots (the implicit topological order, not a declared one)
+such that for each computed root, the referenced repository as well as
+all repositories reachable from that one via the `"bindings"` map only
+contain computed roots earlier in that order.
+
+### Strict evaluation of roots as artifact tree
+
+The building of required computed roots happens in topological order;
+the build of the defining target of a root is, in principle (subject to
+a user-defined restriction of parallelism), started as soon as all roots
+in the repositories reachable via bindings are available. The root is
+then considered the artifact tree of the defining target.
+
+In particular, the evaluation is strict: all roots of reachable
+repositories have to be successfully computed before the evaluation is
+started, even if it later turns out that one of these roots is never
+accessed in the computation of the defining target. The reason for this
+strictness requirement is to ensure that the cache key for target-level
+caching can be computed ahead of time (and we expect the entry to be in
+target-level cache most of the time anyway).
+
+### Intensional equality of computed roots
+
+During a build, each computed root is evaluated only once, even if
+required in several places. Two computed roots are considered equal if
+they are defined in the same way, i.e., repository name, target, and
+configuration agree. The repository or layer using the computed root is
+not part of the root definition.
+
+### Computed roots available to the user
+
+As computed roots are defined by export targets, the respective
+artifacts are stored in the local CAS anyway. Additionally, the tree
+that forms the root will be added to CAS as well. Moreover, an option
+will be added to specify a log file that contains, in a machine-readable
+way, all the tree identifiers of all computed roots used in this build,
+together with their definition.
+
+### `just-mr` to support computed roots
+
+To allow simply setting up a `just` configuration using computed roots,
+`just-mr` will allow a repository type `"computed"` with the same
+parameters as a computed root. These repositories can be used as roots,
+like any other `just-mr` repository type. When generating the `just`
+multi-repository configuration, the definition of a `"computed"`
+repository is just forwarded as computed root.
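+
+A sketch of what such a repository description could look like in the
+`just-mr` configuration (the key names `"repo"`, `"target"`, and
+`"config"` are assumptions mirroring the parameters of a computed
+root, not a fixed interface):
+
+``` jsonc
+{ "repositories":
+  { "generated":
+    { "repository":
+      { "type": "computed"
+      , "repo": "base"
+      , "target": ["gen", "targets"]
+      , "config": {"ARCH": "x86_64"}
+      }
+    }
+  }
+}
+```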
diff --git a/doc/future-designs/computed-roots.org b/doc/future-designs/computed-roots.org
deleted file mode 100644
index a83eee67..00000000
--- a/doc/future-designs/computed-roots.org
+++ /dev/null
@@ -1,154 +0,0 @@
-* Computed roots
-
-** Status quo
-
-As of version ~1.0.0~, the ~just~ build tool requires the repository
-configuration, including all roots, to be specified ahead of time.
-This has a couple of consequences.
-
-*** Flexible source views, thanks to staging
-
-For source files, the flexibility of using them in a layout different
-from how they occur in the source tree is gained through staging.
-If a different view of sources is needed, instead of a source
-target, a defined target can be used that rearranges the sources as
-desired. In this way, programmatic transformations of source files
-can also be carried out (while the result is still visible at the
-original location), as is done, e.g., by the ~["patch", "file"]~
-rule of the ~just~ main repository.
-
-*** Restricted flexibility in target-definitions via globbing
-
-When defining targets, the general principle is that the definition
-of target and action graph only depends on the description (given by
-the target files, the rules and expressions, and the configuration).
-There is, however, a single exception to that rule: a target file
-may use the ~GLOB~ built-in construct and in this way depend on
-the index of the respective source directory. This makes it possible,
-e.g., to define a separate action for every source file and, in this
-way, get good incrementality and parallelism, while still having
-a concise target description.
-
-*** Modularity in rules through expressions
-
-Rules might share common tasks. For example, for both ~C~ binaries
-and ~C~ libraries, the source files have to be compiled to object
-files. To avoid duplication of descriptions, expressions can be
-called (also from expressions themselves).
-
-** Use cases that require more flexibility
-
-*** Generated target files
-
-Sometimes projects (or parts thereof that can form a separate
-logical repository) have a simple structure. For example, there is
-a list of directories and for each one there is a library, named
-and staged in a systematic way. Repeating all those systematic
-target files seems like unnecessary work. Instead, we could store the
-list of directories to consider and a small script containing the
-naming/staging/globbing logic; this approach would also be more
-maintainable. A similar approach could also be attractive for a
-directory tree with tests where, on top, all the individual tests
-should be collected to test suites.
-
-*** Staging according to embedded information
-
-For importing prebuilt libraries, it is sometimes desirable to
-stage them in a way honoring the embedded ~soname~. The current
-approach is to provide that information out of band in the target
-file, so that it can be used during analysis. Still, the information
-is already present in the prebuilt binary, causing unnecessary
-maintenance overhead; instead, the target file could be a function
-of that library which can form its own content-fixed root (e.g., a
-~git tree~ root), so that the computed value is easily cacheable.
-
-*** Simplified rule definition and alternative syntax
-
-Rules can share computation through expressions. However, the
-interface deliberately has to be explicit, including the documentation
-strings that are used by ~just describe~. While this allows an easy
-and efficient implementation of ~just describe~, there is some
-redundancy involved: often, fields are only there to be used by
-a common expression, but they still have to be documented in a
-redundant way (causing additional maintenance burden).
-
-Moreover, the JSON encoding of abstract syntax trees is unambiguous
-to read and easy to process automatically, but people argue that
-it is hard to write by hand. However, it is
-unlikely to get agreement on which syntax is best to use. Now, if
-rule and expression files could be generated, this argument would
-not be necessary. Moreover, rules are typically versioned and
-infrequently changed, so the step of generating the official syntax
-from the convenient one would typically be in cache.
-
-** Proposal: Support computed roots
-
-We propose computed roots as a clean principle to add the needed (and
-a lot more) flexibility for the described use cases, while ensuring
-that all computations of roots are properly cacheable at high level.
-In this way, we do not compromise efficient builds, as the price of
-the additional flexibility, in the typical case, is just a single
-cache lookup. Of course, it is up to the user to ensure that this
-case really is the typical one, in the same way as it is their
-responsibility to describe the targets in a way to have proper
-incrementality.
-
-*** New root type ~"computed"~
-
-The ~just~ multi-repository configuration will allow a new type
-of root (besides ~"file"~ and ~"git tree"~ and variants thereof),
-called ~"computed"~. A ~"computed"~ root is given by
-- the (global) name of a repository
-- the name of a target (in ~["module", "target"]~ format), and
-- a configuration (as JSON object, taken literally).
-It is a requirement that the specified target is an ~"export"~
-target and that the specified repository is content-fixed; ~"computed"~ roots
-are considered content-fixed. However, the dependency structure of
-computed roots must be cycle free. In other words, there must exist
-an ordering of computed roots (the implicit topological order, not
-a declared one) such that for each computed root, the referenced
-repository as well as all repositories reachable from that one
-via the ~"bindings"~ map only contain computed roots earlier in
-that order.
-
-*** Strict evaluation of roots as artifact tree
-
-The building of required computed roots happens in topological order;
-the build of the defining target of a root is, in principle (subject
-to a user-defined restriction of parallelism), started as soon as all
-roots in the repositories reachable via bindings are available. The
-root is then considered the artifact tree of the defining target.
-
-In particular, the evaluation is strict: all roots of reachable
-repositories have to be successfully computed before the evaluation
-is started, even if it later turns out that one of these roots is
-never accessed in the computation of the defining target. The reason
-for this strictness requirement is to ensure that the cache key for
-target-level caching can be computed ahead of time (and we expect
-the entry to be in target-level cache most of the time anyway).
-
-*** Intensional equality of computed roots
-
-During a build, each computed root is evaluated only once, even
-if required in several places. Two computed roots are considered
-equal if they are defined in the same way, i.e., repository name,
-target, and configuration agree. The repository or layer using the
-computed root is not part of the root definition.
-
-*** Computed roots available to the user
-
-As computed roots are defined by export targets, the respective
-artifacts are stored in the local CAS anyway. Additionally, the
-tree that forms the root will be added to CAS as well. Moreover,
-an option will be added to specify a log file that contains, in
-a machine-readable way, all the tree identifiers of all computed
-roots used in this build, together with their definition.
-
-*** ~just-mr~ to support computed roots
-
-To allow simply setting up a ~just~ configuration using computed
-roots, ~just-mr~ will allow a repository type ~"computed"~ with the
-same parameters as a computed root. These repositories can be used
-as roots, like any other ~just-mr~ repository type. When generating
-the ~just~ multi-repository configuration, the definition of a
-~"computed"~ repository is just forwarded as computed root.
diff --git a/doc/future-designs/execution-properties.md b/doc/future-designs/execution-properties.md
new file mode 100644
index 00000000..d6fc53e8
--- /dev/null
+++ b/doc/future-designs/execution-properties.md
@@ -0,0 +1,125 @@
+Action-controlled execution properties
+======================================
+
+Motivation
+----------
+
+### Varying execution platforms
+
+It is a common situation that software is developed for one platform,
+but it is desirable to build on a different one. For example, the other
+platform could be faster (common theme when developing for embedded
+devices), cheaper, or simply available in larger quantities. The
+standard solution for these kind of situations is cross compiling: the
+binary is completely built on one platform, while being intended to run
+on a different one. This can be achieved by constructing the compiler
+invocations accordingly and is already built into our rules (at least
+for `C` and `C++`).
+
+The situation changes, however, once testing (especially end-to-end
+testing) comes into play. Here, we actually have to run the built
+binary---and do so on the target architecture. Nevertheless, we still
+want to offload as much as possible of the work to the other platform
+and perform only the actual test execution on the target platform. This
+requires a single build executing actions on two (or more) platforms.
+
+### Varying execution times
+
+#### Calls to foreign build systems
+
+Often, third-party dependencies that natively build with a different
+build system and don't change too often (yet often enough not to
+have them be part of the build image) are simply put in a single
+action, so that they get built only once and then stay in cache for
+everyone. This is precisely what our `rules-cc` rules like
+`["CC/foreign/make", "library"]` and `["CC/foreign/cmake", "library"]`
+do.
+
+For those compound actions, we of course expect them to run longer
+than normal actions that only consist of a single compiler or linker
+invocation. Giving an absolute amount of time needed for such an
+action is not reasonable, as that very much depends on the
+underlying hardware. However, it is reasonable to state the number of
+"typical" actions this compound action corresponds to.
+
+#### Long-running end-to-end tests
+
+A similar situation where a significantly longer action is needed in
+a build otherwise consisting of short actions is end-to-end testing.
+Tests using the final binary might have a complex set-up, potentially
+involving several running instances to test communication, and might
+require a lengthy sequence of interactions to get into the situation
+that is to be tested, or to verify the absence of degradation of the
+service under high load or extended usage.
+
+Status Quo
+----------
+
+Actions can at the moment specify
+
+ - the actual action, i.e., inputs, outputs, and the command vector,
+ - the environment variables,
+ - a property that the action can fail (e.g., for test actions), and
+ - a property that the action is not to be taken from cache (e.g.,
+ testing for flakiness).
+
+No other properties can be set by the action itself. In particular,
+remote-execution properties and timeout are equal for all actions of a
+build.
+
+Proposed changes
+----------------
+
+### Extension of the `"ACTION"` function
+
+We propose to extend the `"ACTION"` function available in the rule
+definition by the following attributes. All of the new attributes are
+optional, and the default is taken to reflect the status quo. Hence, the
+proposed changes are backwards compatible.
+
+#### `"execution properties"`
+
+This value has to evaluate to a map of strings; if not given, the
+empty map is taken as default. This map is taken as a union with any
+remote-execution properties specified at the invocation of the build
+(if a key is defined both for the entire build and in the
+`"execution properties"` of a specific action, the latter takes
+precedence).
+
+Local execution continues to ignore any execution properties specified.
+However, with the auxiliary change to `just` described later, such
+execution properties can also influence a build that is local by
+default.
+
+#### `"timeout scaling"`
+
+If given, the value has to be a number greater than or equal to `1.0`,
+with `1.0` taken as default. The action timeout specified for this
+build (the default value, or whatever is specified on the command
+line) is multiplied by the given factor and taken as the timeout for
+this action. This applies to both local and remote builds.
+
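+Putting both attributes together, an action definition inside a rule
+could then look like the following sketch (the command, the property
+values, and the scaling factor are made-up examples; the remaining
+fields are the already-existing `"ACTION"` attributes):
+
+``` jsonc
+{ "type": "ACTION"
+, "inputs": {"type": "var", "name": "srcs"}
+, "outs": ["out/third-party.tar"]
+, "cmd": ["sh", "-c", "./build-third-party.sh"]
+, "execution properties":
+  {"type": "singleton_map", "key": "image", "value": "foreign-build"}
+, "timeout scaling": 30
+}
+```
+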
+### `just` to support dispatching based on remote-execution properties
+
+In simple setups, like using `just execute`, the remote execution is not
+capable of dispatching to different workers based on remote-execution
+properties. To nevertheless have the benefits of using different
+execution environments, `just` will allow an optional configuration file
+to be passed on the command line via a new option
+`--endpoint-configuration`. This configuration file will contain a list
+of pairs of remote-execution properties and remote-execution endpoints.
+The first matching entry (i.e., the first entry where the
+remote-execution property map coincides with the given map when
+restricted to its domain) determines the remote-execution endpoint to be
+used; if no entry matches, the default remote-execution endpoint is
+used. In any case, the remote-execution properties are forwarded to the
+chosen remote-execution endpoint without modification.
+
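+Such a configuration file could, e.g., look as follows; this is only a
+sketch, as the precise schema is left open by this proposal, and the
+property names and addresses are invented for illustration.
+
+``` jsonc
+[ [{"platform": "arm64"}, "arm-workers.example.com:8980"]
+, [{"runner": "foreign-build"}, "big-machine.example.com:8980"]
+]
+```
+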
+When connecting to a non-standard remote-execution endpoint, `just` will
+ensure that the applicable CAS of that endpoint will have all the needed
+artifacts for that action. It will also transfer all result artifacts
+back to the CAS of the default remote-execution endpoint.
+
+`just serve` (once implemented) will also support this new option. As
+with the default execution endpoint, there is the understanding that the
+client uses the same configuration as the `just serve` endpoint.
diff --git a/doc/future-designs/execution-properties.org b/doc/future-designs/execution-properties.org
deleted file mode 100644
index 6e9cf9e3..00000000
--- a/doc/future-designs/execution-properties.org
+++ /dev/null
@@ -1,119 +0,0 @@
-* Action-controlled execution properties
-
-** Motivation
-
-*** Varying execution platforms
-
-It is a common situation that software is developed for one platform,
-but it is desirable to build on a different one. For example,
-the other platform could be faster (common theme when developing
-for embedded devices), cheaper, or simply available in larger
-quantities. The standard solution for these kind of situations is
-cross compiling: the binary is completely built on one platform,
-while being intended to run on a different one. This can be achieved
-by constructing the compiler invocations accordingly and is already
-built into our rules (at least for ~C~ and ~C++~).
-
-The situation changes, however, once testing (especially end-to-end
-testing) comes into play. Here, we actually have to run the built
-binary---and do so on the target architecture. Nevertheless, we
-still want to offload as much as possible of the work to the other
-platform and perform only the actual test execution on the target
-platform. This requires a single build executing actions on two (or
-more) platforms.
-
-*** Varying execution times
-
-**** Calls to foreign build systems
-
-Often, third-party dependencies that natively build with a different
-build system and don't change too often (yet often enough not to have
-them be part of the build image) are simply put in a single action, so
-that they get built only once and then stay in cache for everyone.
-This is precisely what our ~rules-cc~ rules like ~["CC/foreign/make",
-"library"]~ and ~["CC/foreign/cmake", "library"]~ do.
-
-For those compound actions, we of course expect them to run longer
-than normal actions that only consist of a single compiler or
-linker invocation. Giving an absolute amount of time needed for
-such an action is not reasonable, as that very much depends on the
-underlying hardware. However, it is reasonable to state the number of
-"typical" actions this compound action corresponds to.
-
-**** Long-running end-to-end tests
-
-A similar situation where a significantly longer action is needed in
-a build otherwise consisting of short actions is end-to-end testing.
-Tests using the final binary might have a complex set-up, potentially
-involving several running instances to test communication, and might
-require a lengthy sequence of interactions to get into the situation
-that is to be tested, or to verify the absence of degradation of the
-service under high load or extended usage.
-
-** Status Quo
-
-Actions can at the moment specify
-- the actual action, i.e., inputs, outputs, and the command vector,
-- the environment variables,
-- a property that the action can fail (e.g., for test actions), and
-- a property that the action is not to be taken from cache (e.g.,
- testing for flakiness).
-No other properties can be set by the action itself. In particular,
-remote-execution properties and timeout are equal for all actions
-of a build.
-
-** Proposed changes
-
-*** Extension of the ~"ACTION"~ function
-
-We propose to extend the ~"ACTION"~ function available in the rule
-definition by the following attributes. All of the new attributes
-are optional, and the default is taken to reflect the status quo.
-Hence, the proposed changes are backwards compatible.
-
-**** ~"execution properties"~
-
-This value has to evaluate to a map of strings; if not given, the
-empty map is taken as default. This map is taken as a union with
-any remote-execution properties specified at the invocation of
-  the build (if a key is defined both for the entire build and in the
-~"execution properties"~ of a specific action, the latter takes
-precedence).
-
-Local execution continues to ignore any execution properties specified.
-However, with the auxiliary change to ~just~ described later,
-such execution properties can also influence a build that is local
-by default.
-
-**** ~"timeout scaling"~
-
-If given, the value has to be a number greater than or equal to ~1.0~,
-with ~1.0~ taken as default. The action timeout specified for this
-build (the default value, or whatever is specified on the command
-line) is multiplied by the given factor and taken as timeout for
-this action. This applies to both local and remote builds.
-
-*** ~just~ to support dispatching based on remote-execution properties
-
-In simple setups, like using ~just execute~, the remote execution
-is not capable of dispatching to different workers based on
-remote-execution properties. To nevertheless have the benefits of
-using different execution environments, ~just~ will allow an optional
-configuration file to be passed on the command line via a new option
-~--endpoint-configuration~. This configuration file will contain a
-list of pairs of remote-execution properties and remote-execution
-endpoints. The first matching entry (i.e., the first entry where
-the remote-execution property map coincides with the given map when
-restricted to its domain) determines the remote-execution endpoint to
-be used; if no entry matches, the default remote-execution endpoint
-is used. In any case, the remote-execution properties are forwarded
-to the chosen remote-execution endpoint without modification.
-
-When connecting to a non-standard remote-execution endpoint, ~just~ will
-ensure that the applicable CAS of that endpoint will have all the
-needed artifacts for that action. It will also transfer all result
-artifacts back to the CAS of the default remote-execution endpoint.
-
-~just serve~ (once implemented) will also support this new option. As
-with the default execution endpoint, there is the understanding that
-the client uses the same configuration as the ~just serve~ endpoint.
diff --git a/doc/future-designs/service-target-cache.md b/doc/future-designs/service-target-cache.md
new file mode 100644
index 00000000..941115e9
--- /dev/null
+++ b/doc/future-designs/service-target-cache.md
@@ -0,0 +1,236 @@
+Target-level caching as a service
+=================================
+
+Motivation
+----------
+
+Projects can have quite a lot of dependencies that are not part of the
+build environment, but are, instead, built from source, e.g., in order
+to always build against the latest snapshot. The latter is a typical
+workflow in case of first-party dependencies. In the case of
+`justbuild`, those first-party dependencies form a separate logical
+repository that is typically content fixed (e.g., because that
+dependency is versioned in a `git` repository).
+
+Moreover, code is typically first built (and tested) by the owning
+project before being used as a dependency. Therefore, if remote
+execution is used, for a first-party dependency, we expect all actions
+to be in cache. As dependencies are typically updated less often than
+the code being developed is changed, in most builds, the dependencies
+are in target-level cache. In other words, in a remote-execution setup,
+the whole code of dependencies is fetched just to walk through the
+action graph a single time to get the necessary cache hits.
+
+Proposal: target-level caching as a service
+-------------------------------------------
+
+To avoid these unnecessary fetches, we add a new subcommand `just
+serve` that starts a service that provides the dependencies. This
+typically happens by looking up a target-level cache entry. If,
+however, the entry is not in cache, this also includes building the
+respective `export` target using an associated remote-execution
+endpoint.
+
+### Scope: eligible `export` targets
+
+In order to typically have requests in cache, `just serve` will refuse
+to handle requests that do not refer to `export` targets in
+content-fixed repositories; recall that for a repository to be
+content-fixed, all repositories reachable from it have to be
+content-fixed as well.
+
+### Communication through an associated remote-execution service
+
+Each `just serve` endpoint is always associated with a remote-execution
+endpoint. All artifacts exchanged between client and `just serve`
+endpoint are exchanged via the CAS that is part of the associated
+remote-execution endpoint. This remote-execution endpoint is also used
+if `just serve` has to build targets.
+
+The associated remote-execution endpoint can well be the same process
+simultaneously acting as `just execute`. In fact, this is the default if
+no remote-execution endpoint is specified.
+
+### Protocol
+
+Communication is handled via `grpc` exchanging `proto` buffers
+containing the information described in the rest of this section.
+
+#### Main request and answer format
+
+A request is given by
+
+ - the map of remote-execution properties for the designated
+   remote-execution endpoint; together with the knowledge of the
+ fixed endpoint, the `just serve` instance can compute the
+ target-level cache shard, and
+ - the identifier of the target-level cache key; it is the
+ client's responsibility to ensure that the referred blob (i.e.,
+ the JSON object with appropriate values for the keys
+ `"repo_key"`, `"target_name"`, and `"effective_config"`) as well
+ as the indirectly referred repository description (the JSON
+ object the `"repo_key"` in the cache key refers to) are uploaded
+ to CAS (of the designated remote-execution endpoint) beforehand.
+
+The answer to that request is the identifier of the corresponding
+target-level cache value (in the same format as for local
+target-level caching). The `just serve` instance will ensure that
+the actual value, as well as any directly or indirectly referenced
+artifacts are available in the respective remote-execution CAS.
+Alternatively, the answer can indicate the kind of error (unknown
+root, not an export target, build failure, etc).
+
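+For illustration, the cache-key blob referred to above is a JSON
+object of the following shape (the keys are the ones named above; the
+identifier is shortened and the values are made up):
+
+``` jsonc
+{ "repo_key": "3f1a..."
+, "target_name": ["utils", "libfoo"]
+, "effective_config": {"ARCH": "x86_64", "DEBUG": null}
+}
+```
+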
+#### Auxiliary request: tree of a commit
+
+As for `git` repositories, it is common to specify a commit in order
+to fix a dependency (even though the corresponding tree identifier
+would be enough). Moreover, the standard `git` protocol supports
+asking for the commit of a given remote branch, but additional
+overhead is needed in order to get the tree identifier.
+
+Therefore, in order to support clients (or, more precisely,
+`just-mr` instances setting up the repository description) in
+constructing an appropriate request for `just serve` without
+unnecessary overhead, `just serve` will support a second kind of
+request, where the client request consists of a `git` commit
+identifier and the server answers with the tree identifier for that
+commit if it is aware of that commit, or indicates that it is not
+aware of that commit.
+
+#### Auxiliary request: describe
+
+To support `just describe` also in the cases where code is delegated
+to the `just serve` endpoint, the `describe` information of a target
+can be obtained via an additional request type; as `just serve` only
+handles `export` targets, this target necessarily has to be an
+export target.
+
+The request is given by the identifier of the target-level cache
+key, again with the promise that the referred blob is available in
+CAS. The answer is the identifier of a blob containing a JSON object
+with the needed information, i.e., those parts of the target
+description that are used by `just describe`. Alternatively, the
+answer may indicate the kind of error (unknown root, not an export
+target, etc).
+
+### Sources: local git repositories and remote trees
+
+A `just serve` instance takes roots from various sources,
+
+ - the `git` repository contained in the local build root,
+ - additional `git` repositories, optionally specified in the
+ invocation, and
+ - as last resort, asking the CAS in the designated remote-execution
+ service for the specified `git` tree.
+
+Allowing a list of repositories as sources (rather than a single one)
+increases the effort of searching for a specified tree (in case the
+requested `export` target is not in cache and an actual analysis of
+the build has to be carried out) or for a specific commit (in case a
+client asks for the tree of a given commit). However, it allows for
+the natural workflow of keeping separate upstream repositories in
+separate clones (updated in an appropriate way) without artificially
+putting them in a single repository (as orphan branches).
+
+Supporting building against trees from CAS allows more flexibility in
+defining roots that clients do not have to care about. In fact, they can
+be defined in any way, as long as
+
+ - the client is aware of the git tree identifier of the root, and
+ - some entity ensures the needed trees are known to the CAS.
+
+The auxiliary changes to `just-mr` described later in this document
+provide one possible way to handle archives in this way. Moreover, this
+additional flexibility will be necessary if we ever support computed
+roots, i.e., roots that are the output of a `just` build.
+
+### Absent roots in `just` repository specification
+
+In order for `just` to know for which repositories to delegate the build
+to the designated `just serve` endpoint, the repository configuration
+for `just` can mark roots as absent; this is done by only giving the
+type as `"git tree"` (or the corresponding ignore-special variant
+thereof) and the tree identifier in the root specification, but no
+witnessing repository.
+
+Any repository containing an absent root has to be content-fixed, but
+not all roots have to be absent (as `just` can always upload those trees
+to CAS). It is an error if, outside the computations delegated to
+`just serve`, a non-export target is requested from a repository
+containing an absent root. Moreover, whenever there is a dependency on a
+repository containing an absent root, a `just serve`
+endpoint has to be specified in the invocation of `just`.
+
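+A root marked absent could hence look as follows (a sketch; the tree
+identifier is made up, and the entry naming a witnessing repository is
+simply left out):
+
+``` jsonc
+{ "repositories":
+  { "dependency":
+    {"workspace_root": ["git tree", "9e1b2c..."]}
+  }
+}
+```
+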
+### Auxiliary changes
+
+#### `just-mr` pragma `"absent"`
+
+For `just-mr` to know how to construct the repository description,
+the description used by `just-mr` is extended. More precisely, a new
+key `"absent"` is allowed in the `"pragma"` dictionary of a
+repository description. If the specified value is true, `just-mr`
+will generate an absent root out of this description, using all
+available means to generate that root without ever having to fetch
+the repository locally. In the typical case of a `git` repository,
+the auxiliary `just serve` function to obtain the tree of a commit
+is used. To allow this communication, `just-mr` also accepts the
+arguments describing a `just serve` endpoint and forwards them as
+early arguments to `just`, in the same way as it does with
+`--local-build-root`.
+
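+In a `just-mr` repository description, this could look as follows
+(a sketch; the repository data is made up for illustration):
+
+``` jsonc
+{ "repositories":
+  { "dependency":
+    { "repository":
+      { "type": "git"
+      , "repository": "https://example.com/dependency.git"
+      , "branch": "main"
+      , "commit": "c0ffee..."
+      , "pragma": {"absent": true}
+      }
+    }
+  }
+}
+```
+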
+#### `just-mr` to inquire remote execution before fetching
+
+In line with the idea that fetching sources from upstream should
+happen only once and not once per developer, we add remote execution
+as another way for `just-mr` to obtain files. More precisely,
+`just-mr` will support the options `just` accepts to connect to the
+remote CAS. When given, those will be forwarded to `just` as early
+arguments (so that later `just`-only ones can override them);
+moreover, when a file needed to set up a (present) root is found
+neither in local CAS nor in one of the specified distdirs, `just-mr`
+will first ask the remote CAS for the missing file before trying to
+fetch itself from the specified URL. The rationale for this search
+order is that the designated remote-execution service is typically
+reachable over the network in a more reliable way than external
+resources (while local resources do not require a network at all).
+
+#### `just-mr` to support new repository type `git tree`
+
+A new repository type is added to `just-mr`, called `git tree`. Such
+a repository is given by
+
+ - a `git` tree identifier, and
+ - a command that, when executed in an empty directory (anywhere in
+   the file system), will create in that directory a directory
+ structure containing the specified `git` tree (either top-level
+ or in some subdirectory). Moreover, that command does not modify
+ anything outside the directory it is called in; it is an error
+ if the specified tree is not created in this way.
+
+In this way, content-fixed repositories can be generated in a
+generic way, e.g., using other version-control systems or
+specialized artifact-fetching tools.
+
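+A sketch of such a repository description follows (the keys `"id"`
+and `"cmd"` are assumed names for the two pieces of data just
+described, and the fetch command is only an example):
+
+``` jsonc
+{ "repositories":
+  { "exotic-dependency":
+    { "repository":
+      { "type": "git tree"
+      , "id": "6a2e0f..."
+      , "cmd": ["hg", "clone", "https://example.com/repo", "."]
+      }
+    }
+  }
+}
+```
+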
+Additionally, for archive-like repositories in the `just-mr`
+repository specification (currently `archive` and `zip`), a `git`
+tree identifier can be specified. If the tree is known to `just-mr`,
+or the `"pragma"` `"absent"` is given, it will just use that tree.
+Otherwise, it will fetch as usual, but error out if the obtained
+tree is not the promised one after unpacking and taking the
+specified subdirectory. In this way, also archives can be used as
+absent roots.
+
+#### `just-mr fetch` to support storing in remote-execution CAS
+
+The `fetch` subcommand of `just-mr` will get an additional option to
+support backing up the fetched information not to a local directory,
+but instead to the CAS of the specified remote-execution endpoint.
+This includes
+
+ - all archives fetched, but also
+ - all trees computed in setting up the respective repository
+   description, both from `git tree` repositories and from
+ archives.
+
+In this way, `just-mr` can be used to fill the CAS from one central
+point with all the information the clients need to treat all
+content-fixed roots as absent.
diff --git a/doc/future-designs/service-target-cache.org b/doc/future-designs/service-target-cache.org
deleted file mode 100644
index 10138db5..00000000
--- a/doc/future-designs/service-target-cache.org
+++ /dev/null
@@ -1,227 +0,0 @@
-* Target-level caching as a service
-
-** Motivation
-
-Projects can have quite a lot of dependencies that are not part of
-the build environment, but are, instead, built from source, e.g.,
-in order to always build against the latest snapshot. The latter
-is a typical workflow in case of first-party dependencies. In the
-case of ~justbuild~, those first-party dependencies form a separate
-logical repository that is typically content fixed (e.g., because
-that dependency is versioned in a ~git~ repository).
-
-Moreover, code is typically first built (and tested) by the owning
-project before being used as a dependency. Therefore, if remote
-execution is used, for a first-party dependency, we expect all
-actions to be in cache. As dependencies are typically updated less
-often than the code being developed is changed, in most builds,
-the dependencies are in target-level cache. In other words, in a
-remote-execution setup, the whole code of dependencies is fetched
-just to walk through the action graph a single time to get the
-necessary cache hits.
-
-** Proposal: target-level caching as a service
-
-To avoid these unnecessary fetches, we add a new subcommand ~just
-serve~ that starts a service that provides the dependencies. This
-typically happens by looking up a target-level cache entry. If the
-entry, however, is not in cache, this also includes building the
-respective ~export~ target using an associated remote-execution
-endpoint.
-
-*** Scope: eligible ~export~ targets
-
-In order to typically have requests in cache, ~just serve~ will
-refuse to handle requests that do not refer to ~export~ targets
-in content-fixed repositories; recall that for a repository to be
-content-fixed, all repositories reachable from it have to be
-content-fixed as well.
-
-*** Communication through an associated remote-execution service
-
-Each ~just serve~ endpoint is always associated with a remote-execution
-endpoint. All artifacts exchanged between client and ~just serve~
-endpoint are exchanged via the CAS that is part of the associated
-remote-execution endpoint. This remote-execution endpoint is also
-used if ~just serve~ has to build targets.
-
-The associated remote-execution endpoint can well be the same
-process simultaneously acting as ~just execute~. In fact, this is
-the default if no remote-execution endpoint is specified.
-
-*** Protocol
-
-Communication is handled via ~grpc~ exchanging ~proto~ buffers
-containing the information described in the rest of this section.
-
-**** Main request and answer format
-
-A request is given by
-- the map of remote-execution properties for the designated
-  remote-execution endpoint; together with the knowledge of the fixed
- endpoint, the ~just serve~ instance can compute the target-level
- cache shard, and
-- the identifier of the target-level cache key; it is the client's
- responsibility to ensure that the referred blob (i.e., the
- JSON object with appropriate values for the keys ~"repo_key"~,
- ~"target_name"~, and ~"effective_config"~) as well as the
- indirectly referred repository description (the JSON object the
- ~"repo_key"~ in the cache key refers to) are uploaded to CAS (of
- the designated remote-execution endpoint) beforehand.
-
-The answer to that request is the identifier of the corresponding
-target-level cache value (in the same format as for local target-level
-caching). The ~just serve~ instance will ensure that the actual
-value, as well as any directly or indirectly referenced artifacts
-are available in the respective remote-execution CAS. Alternatively,
-the answer can indicate the kind of error (unknown root, not an
-export target, build failure, etc).
-
-**** Auxiliary request: tree of a commit
-
-As for ~git~ repositories, it is common to specify a commit in order
-to fix a dependency (even though the corresponding tree identifier
-would be enough). Moreover, the standard ~git~ protocol supports
-asking for the commit of a given remote branch, but additional
-overhead is needed in order to get the tree identifier.
-
-Therefore, in order to support clients (or, more precisely, ~just-mr~
-instances setting up the repository description) in constructing an
-appropriate request for ~just serve~ without unnecessary overhead,
-~just serve~ will support a second kind of request, where the
-client request consists of a ~git~ commit identifier and the server
-answers with the tree identifier for that commit if it is aware of
-that commit, or indicates that it is not aware of that commit.
-
-**** Auxiliary request: describe
-
-To support ~just describe~ also in the cases where code is
-delegated to the ~just serve~ endpoint, the ~describe~ information
-of a target can be obtained via an additional request type; as ~just
-serve~ only handles ~export~ targets, this target necessarily has
-to be an export target.
-
-The request is given by the identifier of the target-level cache
-key, again with the promise that the referred blob is available
-in CAS. The answer is the identifier of a blob containing a JSON
-object with the needed information, i.e., those parts of the target
-description that are used by ~just describe~. Alternatively, the
-answer may indicate the kind of error (unknown root, not an export
-target, etc).
-
-*** Sources: local git repositories and remote trees
-
-A ~just serve~ instance takes roots from various sources,
-- the ~git~ repository contained in the local build root,
-- additional ~git~ repositories, optionally specified in the
- invocation, and
-- as last resort, asking the CAS in the designated remote-execution
- service for the specified ~git~ tree.
-
-Allowing a list of repositories as sources (rather than
-a single one) increases the effort of searching for a
-specified tree (in case the requested ~export~ target is not in
-cache and an actual analysis of the build has to be carried out)
-or for a specific commit (in case a client asks for the tree of a
-given commit). However, it allows for the natural workflow of keeping
-separate upstream repositories in separate clones (updated in an
-appropriate way) without artificially putting them in a single
-repository (as orphan branches).
-
-Supporting building against trees from CAS allows more flexibility
-in defining roots that clients do not have to care about. In fact,
-they can be defined in any way, as long as
-- the client is aware of the git tree identifier of the root, and
-- some entity ensures the needed trees are known to the CAS.
-The auxiliary changes to ~just-mr~ described later in this document
-provide one possible way to handle archives in this way. Moreover,
-this additional flexibility will be necessary if we ever support
-computed roots, i.e., roots that are the output of a ~just~ build.
-
-*** Absent roots in ~just~ repository specification
-
-In order for ~just~ to know for which repositories to delegate
-the build to the designated ~just serve~ endpoint, the repository
-configuration for ~just~ can mark roots as absent; this is done
-by only giving the type as ~"git tree"~ (or the corresponding
-ignore-special variant thereof) and the tree identifier in the root
-specification, but no witnessing repository.
-
-Any repository containing an absent root has to be content fixed,
-but not all roots have to be absent (as ~just~ can always upload
-those trees to CAS). It is an error if, outside the computations
-delegated to ~just serve~, a non-export target is requested from a
-repository containing an absent root. Moreover, whenever there is
-a dependency on a repository containing an absent root, a ~just serve~
-endpoint has to be specified in the invocation of ~just~.
-
-*** Auxiliary changes
-
-**** ~just-mr~ pragma ~"absent"~
-
-For ~just-mr~ to know how to construct the repository description,
-the description used by ~just-mr~ is extended. More precisely, a
-new key ~"absent"~ is allowed in the ~"pragma"~ dictionary of a
-repository description. If the specified value is true, ~just-mr~
-will generate an absent root out of this description, using all
-available means to generate that root without ever having to fetch
-the repository locally. In the typical case of a ~git~ repository,
-the auxiliary ~just serve~ function to obtain the tree of a commit
-is used. To allow this communication, ~just-mr~ also accepts the
-arguments describing a ~just serve~ endpoint and forwards them
-as early arguments to ~just~, in the same way as it does with
-~--local-build-root~.
-
-**** ~just-mr~ to inquire remote execution before fetching
-
-In line with the idea that fetching sources from upstream should
-happen only once and not once per developer, we add remote execution
-as another way for ~just-mr~ to obtain files. More precisely,
-~just-mr~ will support the options ~just~ accepts to connect to
-the remote CAS. When given, those will be forwarded to ~just~
-as early arguments (so that later ~just~-only ones can override
-them); moreover, when a file needed to set up a (present) root is
-found neither in local CAS nor in one of the specified distdirs,
-~just-mr~ will first ask the remote CAS for the missing file before
-trying the fetch from the specified URL itself. The rationale for
-this search order is that the designated remote-execution service
-is typically reachable over the network in a more reliable way than
-external resources (while local resources do not require a network
-at all).
-
-**** ~just-mr~ to support new repository type ~git tree~
-
-A new repository type is added to ~just-mr~, called ~git tree~.
-Such a repository is given by
-- a ~git~ tree identifier, and
-- a command that, when executed in an empty directory (anywhere
- in the file system) will create in that directory a directory
- structure containing the specified ~git~ tree (either top-level
- or in some subdirectory). Moreover, that command does not modify
- anything outside the directory it is called in; it is an error
- if the specified tree is not created in this way.
-In this way, content-fixed repositories can be generated in a
-generic way, e.g., using other version-control systems or specialized
-artifact-fetching tools.
-
-Additionally, for archive-like repositories in the ~just-mr~
-repository specification (currently ~archive~ and ~zip~), a ~git~
-tree identifier can be specified. If the tree is known to ~just-mr~,
-or the ~"pragma"~ ~"absent"~ is given, it will just use that tree.
-Otherwise, it will fetch as usual, but error out if the obtained
-tree is not the promised one after unpacking and taking the specified
-subdirectory. In this way, archives can also be used as absent roots.
-
-**** ~just-mr fetch~ to support storing in remote-execution CAS
-
-The ~fetch~ subcommand of ~just-mr~ will get an additional option to
-support backing up the fetched information not to a local directory,
-but instead to the CAS of the specified remote-execution endpoint.
-This includes
-- all archives fetched, but also
-- all trees computed in setting up the respective repository
-  description, both from ~git tree~ repositories as well as
- from archives.
-
-In this way, ~just-mr~ can be used to fill the CAS from one central
-point with all the information the clients need to treat all
-content-fixed roots as absent.
diff --git a/doc/future-designs/symlinks.md b/doc/future-designs/symlinks.md
new file mode 100644
index 00000000..05215030
--- /dev/null
+++ b/doc/future-designs/symlinks.md
@@ -0,0 +1,113 @@
+Symbolic links
+==============
+
+Background
+----------
+
+Besides files and directories, symbolic links are also an important
+entity in the file system, and `git` natively supports symbolic links
+as entries in a tree object. Technically, a symbolic link is a string
+that can be read via `readlink(2)`. However, symbolic links can also
+be followed, and functions that access a file, like `open(2)`, do so
+by default. When following a symbolic link, both relative and absolute
+names can be used.
+
+Symbolic links in build systems
+-------------------------------
+
+### Follow and reading both happen
+
+Compilers usually follow symlinks for all inputs. Archivers (like
+`tar(1)` and package-building tools) usually read the link in order to
+package the link itself, rather than the file referred to (if any).
+For a generic build system, it is desirable not to have to make
+assumptions about the intention of the called program (and hence the
+way it deals with symlinks). This, however, has the consequence that
+only symbolic links themselves can properly model symbolic links.
+
+### Self-containedness and location-independence of roots
+
+From a build-system perspective, a root should be self-contained; in
+fact, the target-level caching assumes that the git tree identifier
+entirely describes a `git`-tree root. For this to be true, such a root
+has to be both self-contained and independent of its (assumed) location
+in the file system. In particular, we can allow neither absolute
+symbolic links (as they, depending on the assumed location, might point
+out of the root) nor relative symbolic links that go upwards (via a
+`../` reference) too far.
+
+### Symbolic links in actions
+
+As with source roots, we understand action directories as
+self-contained and independent of their location in the file system.
+Therefore, we have to impose the same restrictions there as well, i.e.,
+neither absolute symbolic links nor relative symbolic links going up
+too far.
+
+Allowing all relative symbolic links that don't point outside the
+action directory, however, adds a layer of complication to the
+definition of actions: a string might be allowed as a symlink in some
+places in the action directory, but not in others; in particular, we
+cannot tell, only from the information that an artifact is a relative
+symlink, whether it can safely be placed at a particular location in an
+action. The same holds for trees of which we only know that they might
+contain relative symbolic links.
+
+### Presence of symbolic links in system source trees
+
+It can be desirable to use system libraries or tools as dependencies. A
+typical use case, but not the only one, is packaging a tool for a
+distribution. An obvious approach is to declare a system directory as a
+root of a repository (providing the needed target files in a separate
+root). As it turns out, however, those system directories do contain
+symbolic links, e.g., shared libraries pointing to the specific version
+(like `libfoo.so.3` as a symlink pointing to `libfoo.so.3.1.4`) or
+detours through `/etc/alternatives`.
+
+Implemented stop-gap: "shopping list" for bootstrapping
+---------------------------------------------------------
+
+As a stop-gap measure to support building the tool itself against
+pre-installed dependencies (where the respective directories contain
+symbolic links, or tools like `protoc` are themselves symbolic links,
+e.g., to the specific version), repositories can specify, in the
+`"copy"` attribute of the `"local_bootstrap"` parameter, a list of
+files and directories to be copied, as part of the bootstrapping
+process, to a fresh, clean directory serving as root; during this
+copying, symlinks are followed.
+
+Proposed treatment of symbolic links
+------------------------------------
+
+### "Ignore-special" roots
+
+To allow working with source trees containing symbolic links, we extend
+the existing roots by "ignore-special" versions thereof. In such a root
+(regardless of whether it is file based or `git`-tree based), everything
+that is not a file or a directory is treated as absent. For any
+compile-like task, the effect of symlinks can be modeled by appropriate
+staging.
+
+As certain entries have to be ignored, source trees can only be obtained
+by traversing the respective tree; in particular, the `TREE` reference
+is no longer constant time on those roots, even if `git`-tree based.
+Nevertheless, for `git`-tree roots, the effective tree is a function of
+the `git`-tree of the root, so `git`-tree-based ignore-special roots are
+content fixed and hence eligible for target-level caching.
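+
+For concreteness, such a root could be declared as a variant of the
+existing root types in the build configuration; the syntax below is
+purely illustrative, as this document only proposes the feature:
+
+``` jsonc
+{ "workspace_root": ["file ignore-special", "/usr/lib/foo"] }
+```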
+
+### Accepting non-upwards relative symlinks as first-class objects
+
+Finally, a restricted form of symlinks, more precisely relative
+non-upwards symbolic links, will be added as a first-class object. That
+is, a new artifact type (besides blobs and trees) for relative
+non-upwards symbolic links is added. Like any other artifact, they can
+be freely placed into the inputs of an action, as well as into the
+artifacts, runfiles, or provides map of a target. Artifacts of this new
+type can be defined
+
+ - as a source-symlink reference, as well as implicitly as part of a
+   source tree,
+ - as a symlink output of an action, as well as implicitly as part of a
+   tree output of an action, and
+ - explicitly in the rule language, from a string, through a new
+   `SYMLINK` constructor function (a sketch follows below).
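+
+By analogy with the existing `BLOB` constructor, which defines a blob
+artifact from a string, a definition in the rule language might look
+roughly as follows; note that name and shape of this constructor are
+part of the proposal, not of the current expression language:
+
+``` jsonc
+{"type": "SYMLINK", "data": "libfoo.so.3.1.4"}
+```
+
+Staged, e.g., as `libfoo.so.3`, such an artifact would model the
+version symlink from the system-library example above.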
diff --git a/doc/future-designs/symlinks.org b/doc/future-designs/symlinks.org
deleted file mode 100644
index 47ca5063..00000000
--- a/doc/future-designs/symlinks.org
+++ /dev/null
@@ -1,108 +0,0 @@
-* Symbolic links
-
-** Background
-
-Besides files and directories, symbolic links are also an important
-entity in the file system. Also ~git~ natively supports symbolic
-links as entries in a tree object. Technically, a symbolic link
-is a string that can be read via ~readlink(2)~. However, they can
-also be followed and functions to access a file, like ~open(2)~ do
-so by default. When following a symbolic link, both, relative and
-absolute, names can be used.
-
-** Symbolic links in build systems
-
-*** Follow and reading both happen
-
-Compilers usually follow symlinks for all inputs. Archivers (like
-~tar(1)~ and package-building tools) usually read the link in order
-to package the link itself, rather than the file referred to (if
-any). As a generic build system, it is desirable to not have to make
-assumptions on the intention of the program called (and hence the
-way it deals with symlinks). This, however, has the consequence that
-only symbolic links themselves can properly model symbolic links.
-
-*** Self-containedness and location-independence of roots
-
-From a build-system perspective, a root should be self-contained; in
-fact, the target-level caching assumes that the git tree identifier
-entirely describes a ~git~-tree root. For this to be true, such a
-root has to be both, self contained and independent of its (assumed)
-location in the file system. In particular, we can neither allow
-absolute symbolic links (as they, depending on the assumed location,
-might point out of the root), nor relative symbolic links that go
-upwards (via a ~../~ reference) too far.
-
-*** Symbolic links in actions
-
-Like for source roots, we understand action directories as self
-contained and independent of their location in the file system.
-Therefore, we have to require the same restrictions there as well,
-i.e., neither absolute symbolic links nor relative symbolic links
-going up too far.
-
-Allowing all relative symbolic links that don't point outside the
-action directory, however, poses an additional layer of complications
-in the definition of actions: a string might be allowed as symlink
-in some places in the action directory, but not in others; in
-particular, we can't tell only from the information that an artifact
-is a relative symlink whether it can be safely placed at a particular
-location in an action or not. Similarly for trees for which we only
-know that they might contain relative symbolic links.
-
-*** Presence of symbolic links in system source trees
-
-It can be desirable to use system libraries or tools as dependencies.
-A typical use case, but not the only one, is packaging a tool for a
-distribution. An obvious approach is to declare a system directory
-as a root of a repository (providing the needed target files in a
-separate root). As it turns out, however, those system directories
-do contain symbolic links, e.g., shared libraries pointing to
-the specific version (like ~libfoo.so.3~ as a symlink pointing to
-~libfoo.so.3.1.4~) or detours through ~/etc/alternatives~.
-
-** Implemented stop-gap: "shopping list" for bootstrapping
-
-As a stop-gap measure to support building the tool itself against
-pre-installed dependencies with the respective directories containing
-symbolic links, or tools (like ~protoc~) being symbolic links (e.g.,
-to the specific version), repositories can specify, in the ~"copy"~
-attribute of the ~"local_bootstrap"~ parameter, a list of files
-and directories to be copied as part of the bootstrapping process
-to a fresh clean directory serving as root; during this copying,
-symlinks are followed.
-
-** Proposed treatment of symbolic links
-
-*** "Ignore-special" roots
-
-To allow working with source trees containing symbolic links, we
-extend the existing roots by "ignore-special" versions thereof. In
-such a root (regardless whether file based, or ~git~-tree based),
-everything not a file or a directory will be pretended to be absent.
-For any compile-like tasks, the effect of symlinks can be modeled
-by appropriate staging.
-
-As certain entries have to be ignored, source trees can only be
-obtained by traversing the respective tree; in particular, the
-~TREE~ reference is no longer constant time on those roots, even
-if ~git~-tree based. Nevertheless, for ~git~-tree roots, the
-effective tree is a function of the ~git~-tree of the root, so
-~git~-tree-based ignore-special roots are content fixed and hence
-eligible for target-level caching.
-
-*** Accepting non-upwards relative symlinks as first-class objects
-
-Finally, a restricted form of symlinks, more precisely relative
-non-upwards symbolic links, will be added as first-class object.
-That is, a new artifact type (besides blobs and trees) for relative
-non-upwards symbolic links is added. Like any other artifact they
-can be freely placed into the inputs of an action, as well as in
-artifacts, runfiles, or provides map of a target. Artifacts of this
-new type can be defined as
-- source-symlink reference, as well as implicitly as part of a
- source tree,
-- as a symlink output of an action, as well as implicitly as part
- of a tree output of an action, and
-- explicitly in the rule language from a string through a new
- ~SYMLINK~ constructor function.
diff --git a/doc/specification/remote-protocol.md b/doc/specification/remote-protocol.md
new file mode 100644
index 00000000..1afd7e32
--- /dev/null
+++ b/doc/specification/remote-protocol.md
@@ -0,0 +1,145 @@
+Specification of the just Remote Execution Protocol
+===================================================
+
+Introduction
+------------
+
+just supports remote execution of actions across multiple machines. As
+such, it makes use of a remote execution protocol. The basis of our
+protocol is the open-source gRPC [remote execution
+API](https://github.com/bazelbuild/remote-apis/blob/main/build/bazel/remote/execution/v2/remote_execution.proto).
+We use this protocol in a **compatible** mode, but by default, we use a
+modified version, allowing us to pass git trees and files directly
+without even looking at their content or traversing them. This
+modification makes sense since much open-source code is hosted in git
+repositories, so sources are frequently available in git form already.
+With this protocol, we take advantage of already-hashed git content as
+much as possible by avoiding unnecessary conversion and communication
+overhead.
+
+In the following sections, we explain which modifications we applied to
+the original protocol and which requirements a remote-execution service
+has to fulfill to work seamlessly with just.
+
+just Protocol Description
+-------------------------
+
+### git Blob and Tree Hashes
+
+In order to be able to work with git hashes, both the client side and
+the server side need to be extended to support the regular git hash
+functions for blobs and trees:
+
+The hash of a blob is computed as
+
+ sha1sum(b"blob <size_of_content>\0<content>")
+
+The hash of a tree is computed as
+
+ sha1sum(b"tree <size_of_entries>\0<entries>")
+
+where `<entries>` is a sequence (without newlines) of `<entry>`, and
+each `<entry>` is
+
+ <mode> <file or dir name>\0<git-hash of the corresponding blob or tree>
+
+`<mode>` is a number defining whether the object is a file (`100644`),
+an executable file (`100755`), a tree (`040000`), or a symbolic link
+(`120000`). More information on how git internally stores its objects
+can be found in the official [git
+documentation](https://git-scm.com/book/en/v2/git-Internals-git-Objects).
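+
+As a quick sanity check of the blob formula (assuming a POSIX shell
+with `sha1sum` and `git` available), hashing a 12-byte file containing
+`Hello World` manually and via `git hash-object` yields the same value:
+
+``` sh
+$ echo 'Hello World' > out.txt
+$ printf 'blob 12\0' | cat - out.txt | sha1sum
+557db03de997c86a4a028e1ebd3a1ceb225be238  -
+$ git hash-object out.txt
+557db03de997c86a4a028e1ebd3a1ceb225be238
+```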
+
+Since git hashes blob content differently from trees, this type of
+information has to be transmitted in addition to the content and the
+hash. To this aim, just prefixes the git hash values passed over the
+wire with a single-byte marker, thus allowing the remote side to
+distinguish a blob from a tree without inspecting the (potentially
+large) content. The markers are
+
+ - `0x62` for a git blob (`0x62` corresponds to the character `b`)
+ - `0x74` for a git tree (`0x74` corresponds to the character `t`)
+
+Since hashes are transmitted as hexadecimal strings, the resulting
+length of such prefixed git hashes is 42 characters. The server side
+has to accept this length as a valid hash length in order to detect our
+protocol, and has to apply the corresponding git hash function based on
+the detected prefix.
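+
+For example, the blob hash computed above would be passed over the wire
+as the 42-character string
+
+    62557db03de997c86a4a028e1ebd3a1ceb225be238
+
+where the leading `62` is the hex encoding of the marker byte `0x62`.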
+
+### Blob and Tree Availability
+
+Typically, it makes sense for a client to check the availability of a
+blob or a tree at the remote side, before it actually uploads it. Thus,
+the remote side should be able to answer availability requests based on
+our prefixed hash values.
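+
+For illustration, using the message types of the open-source API, a
+client could probe for the `Hello World` blob from above with a
+`FindMissingBlobs` request carrying the prefixed digest (shown here in
+JSON notation):
+
+``` jsonc
+{ "blob_digests":
+  [{"hash": "62557db03de997c86a4a028e1ebd3a1ceb225be238", "size_bytes": 12}]
+}
+```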
+
+### Blob Upload
+
+A blob is uploaded to the remote side by passing its raw content as well
+as its `Digest` containing the git hash value for a blob prefixed by
+`0x62`. The remote side needs to verify the received content by applying
+the git blob hash function to it, before the blob is stored in the
+content addressable storage (CAS).
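+
+For illustration, again in the message types of the open-source API,
+uploading the example blob from above via a `BatchUpdateBlobs` request
+would carry the prefixed digest together with the raw content (shown in
+JSON notation; on the wire, `data` is a bytes field, here rendered
+base64-encoded):
+
+``` jsonc
+{ "requests":
+  [ { "digest":
+      {"hash": "62557db03de997c86a4a028e1ebd3a1ceb225be238", "size_bytes": 12}
+    , "data": "SGVsbG8gV29ybGQK"
+    }
+  ]
+}
+```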
+
+If a blob is part of a git repository and already known to the remote
+side, we do not even have to calculate the hash value from a possibly
+large file; instead, we can directly use the hash value calculated by
+git and pass it through.
+
+### Tree Upload
+
+In contrast to regular files, which are uploaded as blobs, the original
+protocol has no notion of directories on the remote side. Thus,
+directories need to be traversed and converted to `Directory` Protobuf
+messages, which are then serialized and uploaded as blobs.
+
+In our modified protocol, we avoid this traversal and conversion
+overhead by directly uploading the git tree objects instead of the
+serialized Protobuf messages if the directory is part of a git
+repository. Consequently, we can also reuse the corresponding git hash
+value for a tree object, which just needs to be prefixed by `0x74` when
+uploaded.
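+
+As an illustration (inside any scratch `git` repository), the tree that
+would be uploaded for a directory containing only the 12-byte
+`Hello World` file can be produced with standard `git` plumbing; the id
+printed by `git write-tree`, prefixed by the hex marker `74`, is
+exactly the value that would be transmitted as the digest hash:
+
+``` sh
+$ git init -q scratch && cd scratch
+$ echo 'Hello World' > out.txt
+$ git add out.txt
+$ git write-tree
+```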
+
+The remote side must accept git tree objects instead of `Directory`
+Protobuf messages at any location where `Directory` messages are
+referred to (e.g., the root directory of an action). The tree content is
+verified using the git hash function for trees. In addition, the remote
+side has to be able to parse the git tree object format.
+
+Using this git tree representation makes tree handling much more
+efficient, since the effort of traversing and uploading the content of a
+git tree occurs only once; for each subsequent request, we directly pass
+around the git tree id. We require the invariant that if a tree is part
+of any CAS, then all its content is also available in this CAS. To
+adhere to this invariant, the client side has to ensure that the content
+of a tree is available in the CAS before uploading this tree; one way to
+do so is for the client to upload that content itself. The server side,
+in turn, has to ensure that this invariant holds over time. In
+particular, if the remote side implements any sort of pruning strategy
+for the CAS, it has to honor this invariant when elements get pruned.
+
+Another consequence of this efficient tree handling is that it speeds
+up **action digest** calculation noticeably, since known git trees
+referred to by the root directory do not need to be traversed. This, in
+turn, makes it faster to determine whether an action result is already
+available in the action cache.
+
+### Tree Download
+
+Once an action is successfully executed, it might have generated output
+files or output directories in its staging area on the remote side. Each
+output file needs to be uploaded to the CAS with the corresponding git
+blob hash. Each output directory needs to be translated to a git tree
+object and uploaded to the CAS with the corresponding git tree hash.
+Only if the content of a tree is available in the CAS is the server side
+allowed to return the tree to the client.
+
+In case of a generated output directory, the server only returns the
+corresponding git tree id to the client, instead of a flat list of all
+recursively generated output directories as part of a `Tree` Protobuf
+message, as is done in the original protocol. The remote side promises
+that each blob and subtree contained in the root tree is available in
+the remote CAS. Such blobs and trees must be accessible, using the
+streaming interface, without specifying the size (since sizes are not
+stored in a git tree). Due to the Protobuf 3 specification, which is
+used in this remote-execution API, not specifying the size means that
+the default value 0 is used.
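+
+For illustration, a client would thus fetch such a subtree via the
+streaming interface with a digest along the following lines (field
+names as in the open-source API), the size simply being the Protobuf
+default:
+
+``` jsonc
+{"hash": "74<40 hex characters of the git tree hash>", "size_bytes": 0}
+```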
diff --git a/doc/specification/remote-protocol.org b/doc/specification/remote-protocol.org
deleted file mode 100644
index dea7177e..00000000
--- a/doc/specification/remote-protocol.org
+++ /dev/null
@@ -1,139 +0,0 @@
-* Specification of the just Remote Execution Protocol
-
-** Introduction
-
-just supports remote execution of actions across multiple machines. As such, it
-makes use of a remote execution protocol. The basis of our protocol is the
-open-source gRPC
-[[https://github.com/bazelbuild/remote-apis/blob/main/build/bazel/remote/execution/v2/remote_execution.proto][remote
-execution API]]. We use this protocol in a *compatible* mode, but by default, we
-use a modified version, allowing us to pass git trees and files directly without
-even looking at their content or traversing them. This modification makes sense
-since it is more efficient if sources are available in git repositories and much
-open-source code is hosted in git repositories. With this protocol, we take
-advantage of already hashed git content as much as possible by avoiding
-unnecessary conversion and communication overhead.
-
-In the following sections, we explain which modifications we applied to the
-original protocol and which requirements we have to the remote execution service
-to seamlessly work with just.
-
-
-** just Protocol Description
-
-*** git Blob and Tree Hashes
-
-In order to be able work with git hashes, both client side as well as server
-side need to be extended to support the regular git hash functions for blobs and
-trees:
-
-The hash of a blob is computed as
-#+BEGIN_SRC
-sha1sum(b"blob <size_of_content>\0<content>")
-#+END_SRC
-The hash of a tree is computed as
-#+BEGIN_SRC
-sha1sum(b"tree <size_of_entries>\0<entries>")
-#+END_SRC
-where ~<entries>~ is a sequence (without newlines) of ~<entry>~, and each
-~<entry>~ is
-#+BEGIN_SRC
-<mode> <file or dir name>\0<git-hash of the corresponding blob or tree>
-#+END_SRC
-~<mode>~ is a number defining if the object is a file (~100644~), an executable
-file (~100755~), a tree (~040000~), or a symbolic link (~120000~). More
-information on how git internally stores its objects can be found in the
-official [[https://git-scm.com/book/en/v2/git-Internals-git-Objects][git
-documentation]].
-
-Since git hashes blob content differently from trees, this type of information
-has to be transmitted in addition to the content and the hash. To this aim, just
-prefixes the git hash values passed over the wire with a single-byte marker.
-Thus allowing the remote side to distinguish a blob from a tree without
-inspecting the (potentially large) content. The markers are
-
-- ~0x62~ for a git blob (~0x62~ corresponds to the character ~b~)
-- ~0x74~ for a git tree (~0x74~ corresponds to the character ~t~)
-
-Since hashes are transmitted as hexadecimal string, the resulting length of such
-prefixed git hashes is 42 characters. The server side has to accept this hash
-length as valid hash length to detect our protocol and to apply the according
-git hash functions based on the detected prefix.
-
-
-*** Blob and Tree Availability
-
-Typically, it makes sense for a client to check the availability of a blob or a
-tree at the remote side, before it actually uploads it. Thus, the remote side
-should be able to answer availability requests based on our prefixed hash
-values.
-
-
-*** Blob Upload
-
-A blob is uploaded to the remote side by passing its raw content as well as its
-~Digest~ containing the git hash value for a blob prefixed by ~0x62~. The remote
-side needs to verify the received content by applying the git blob hash function
-to it, before the blob is stored in the content addressable storage (CAS).
-
-If a blob is part of git repository and already known to the remote side, we
-even do not have to calculate the hash value from a possible large file, instead
-we can directly use the hash value calculated by git and pass it through.
-
-
-*** Tree Upload
-
-In contrast to regular files, which are uploaded as blobs, the original protocol
-has no notion of directories on the remote side. Thus, directories need to be
-traversed and converted to ~Directory~ Protobuf messages, which are then
-serialized and uploaded as blobs.
-
-In our modified protocol, we prevent this traversing and conversion overhead by
-directly uploading the git tree objects instead of the serialized Protobuf
-messages if the directory is part of a git repository. Consequently, we can also
-reuse the corresponding git hash value for a tree object, which just needs to be
-prefixed by ~74~, when uploaded.
-
-The remote side must accepts git tree objects instead ~Directory~ Protobuf
-messages at any location where ~Directory~ messages are referred (e.g., the root
-directory of an action). The tree content is verified using the git hash
-function for trees. In addition, it has to be modified to parse the git tree
-object format.
-
-Using this git tree representation makes tree handling much more efficient,
-since the effort of traversing and uploading the content of a git tree occurs
-only once and for each subsequent request, we directly pass around the git tree
-id. We require the invariant that if a tree is part of any CAS then all its
-content is also available in this CAS. To adhere to this invariant, the client
-side has to prove that the content of a tree is available in the CAS, before
-uploading this tree. One way to ensure that the tree content is known to the
-remote side is that it is uploaded by the client. The server side has to ensure
-this invariant holds. In particular, if the remote side implements any sort of
-pruning strategy for the CAS, it has to honor this invariant when an element got
-pruned.
-
-Another consequence of this efficient tree handling is that it improves *action
-digest* calculation noticeably, since known git trees referred by the root
-directory do not need to be traversed. This in turn allows to faster determine
-whether an action result is already available in the action cache or not.
-
-
-*** Tree Download
-
-Once an action is successfully executed, it might have generated output files or
-output directories in its staging area on the remote side. Each output file
-needs to be uploaded to its CAS with the corresponding git blob hash. Each
-output directory needs to be translated to a git tree object and uploaded to the
-CAS with the corresponding git tree hash. Only if the content of a tree is
-available in the CAS, the server side is allowed to return the tree to the
-client.
-
-In case of a generated output directory, the server only returns the
-corresponding git tree id to the client instead of a flat list of all
-recursively generated output directories as part of a ~Tree~ Protobuf message as
-it is done in the original protocol. The remote side promises that each blob and
-subtree contained in the root tree is available in the remote CAS. Such blobs
-and trees must be accessible, using the streaming interface, without specifying
-the size (since sizes are not stored in a git tree). Due to the Protobuf 3
-specification, which is used in this remote execution API, not specifying the
-size means the default value 0 is used.
diff --git a/doc/tutorial/getting-started.md b/doc/tutorial/getting-started.md
new file mode 100644
index 00000000..36a57d26
--- /dev/null
+++ b/doc/tutorial/getting-started.md
@@ -0,0 +1,217 @@
+Getting Started
+===============
+
+In order to use *justbuild*, first make sure that `just`, `just-mr`, and
+`just-import-git` are available in your `PATH`.
+
+Creating a new project
+----------------------
+
+*justbuild* needs to know the root of the project worked on. By default,
+it searches upwards from the current directory until it finds a marker.
+Currently, we support three different markers: the files `ROOT` and
+`WORKSPACE`, or the directory `.git`. Let's create a new project by
+creating one of those markers:
+
+``` sh
+$ touch ROOT
+```
+
+Creating a generic target
+-------------------------
+
+By default, targets are described in `TARGETS` files. These files
+contain a `JSON` object with the target name as key and the target
+description as value. A target description is an object with at least a
+single mandatory field: `"type"`. This field specifies which rule
+(built-in or user-defined) to apply for this target.
+
+A simple target that only executes commands can be created using the
+built-in `"generic"` rule, which requires at least one command and one
+output file or directory. To create such a target, create the file
+`TARGETS` with the following content:
+
+``` {.jsonc srcname="TARGETS"}
+{ "greeter":
+ { "type": "generic"
+ , "cmds": ["echo -n 'Hello ' > out.txt", "cat name.txt >> out.txt"]
+ , "outs": ["out.txt"]
+ , "deps": ["name.txt"]
+ }
+}
+```
+
+In this example, the `"greeter"` target will run two commands to produce
+the output file `out.txt`. The second command depends on the input file
+`name.txt` that we need to create as well:
+
+``` sh
+$ echo World > name.txt
+```
+
+Building a generic target
+-------------------------
+
+To build a target, we need to run `just` with the subcommand `build`:
+
+``` sh
+$ just build greeter
+INFO: Requested target is [["@","","","greeter"],{}]
+INFO: Analysed target [["@","","","greeter"],{}]
+INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
+INFO: Discovered 1 actions, 0 trees, 0 blobs
+INFO: Building [["@","","","greeter"],{}].
+INFO: Processed 1 actions, 0 cache hits.
+INFO: Artifacts built, logical paths are:
+ out.txt [557db03de997c86a4a028e1ebd3a1ceb225be238:12:f]
+$
+```
+
+The subcommand `build` just builds the artifact but does not stage it to
+any user-defined location on the file system. Instead, it reports a
+description of the artifact consisting of the `git` blob identifier,
+size, and type (in this case `f` for a non-executable file). To also
+stage the produced artifact to the working directory, use the `install`
+subcommand and specify the output directory:
+
+``` sh
+$ just install greeter -o .
+INFO: Requested target is [["@","","","greeter"],{}]
+INFO: Analysed target [["@","","","greeter"],{}]
+INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
+INFO: Discovered 1 actions, 0 trees, 0 blobs
+INFO: Building [["@","","","greeter"],{}].
+INFO: Processed 1 actions, 1 cache hits.
+INFO: Artifacts can be found in:
+ /tmp/tutorial/out.txt [557db03de997c86a4a028e1ebd3a1ceb225be238:12:f]
+$ cat out.txt
+Hello World
+$
+```
+
+Note that the `install` subcommand initiates the build a second time,
+but without executing any actions, as all actions are served from
+cache. The produced artifact is identical, which is indicated by the
+same hash/size/type.
+
+If one is only interested in a single final artifact, one can also
+request via the `-P` option that this artifact be written to standard
+output after the build. As all messages are reported to standard error,
+this can be used both for interactively reading a text file and for
+piping the artifact to another program.
+
+``` sh
+$ just build greeter -Pout.txt
+INFO: Requested target is [["@","","","greeter"],{}]
+INFO: Analysed target [["@","","","greeter"],{}]
+INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
+INFO: Discovered 1 actions, 0 trees, 0 blobs
+INFO: Building [["@","","","greeter"],{}].
+INFO: Processed 1 actions, 1 cache hits.
+INFO: Artifacts built, logical paths are:
+ out.txt [557db03de997c86a4a028e1ebd3a1ceb225be238:12:f]
+Hello World
+$
+```
+
+Alternatively, we could also directly request the artifact `out.txt`
+from *justbuild*'s CAS (content-addressable storage) and print it on
+the command line via:
+
+``` sh
+$ just install-cas [557db03de997c86a4a028e1ebd3a1ceb225be238:12:f]
+Hello World
+$
+```
+
+The canonical way of requesting an object from the CAS is, as just
+shown, to specify the full triple of hash, size, and type, separated by
+colons and enclosed in square brackets. To simplify usage, the brackets
+can be omitted, and the size and type fields then have the default
+values `0` and `f`, respectively. While the default size is wrong for
+every content except the empty one, the hash alone still determines the
+content of the file, and hence the local CAS is still able to retrieve
+it. So the typical invocation simply specifies the hash.
+
+``` sh
+$ just install-cas 557db03de997c86a4a028e1ebd3a1ceb225be238
+Hello World
+$
+```
+
+Targets versus Files: The Stage
+-------------------------------
+
+When invoking the `build` command, we had to specify the target
+`greeter`, not the output file `out.txt`. While other build systems
+allow requests specifying an output file, for *justbuild* this would
+conflict with a fundamental design principle, staging: each target has
+its own logical output space, the "stage", where it can put its
+artifacts. We can therefore, without any problem, add a second target
+that also generates a file `out.txt`.
+
+``` {.jsonc srcname="TARGETS"}
+...
+, "upper":
+ { "type": "generic"
+ , "cmds": ["cat name.txt | tr a-z A-Z > out.txt"]
+ , "outs": ["out.txt"]
+ , "deps": ["name.txt"]
+ }
+...
+```
+
+As we only request targets, no conflicts arise.
+
+``` sh
+$ just build upper -P out.txt
+INFO: Requested target is [["@","","","upper"],{}]
+INFO: Analysed target [["@","","","upper"],{}]
+INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
+INFO: Discovered 1 actions, 0 trees, 0 blobs
+INFO: Building [["@","","","upper"],{}].
+INFO: Processed 1 actions, 0 cache hits.
+INFO: Artifacts built, logical paths are:
+ out.txt [83cf24cdfb4891a36bee93421930dd220766299a:6:f]
+WORLD
+$ just build greeter -P out.txt
+INFO: Requested target is [["@","","","greeter"],{}]
+INFO: Analysed target [["@","","","greeter"],{}]
+INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
+INFO: Discovered 1 actions, 0 trees, 0 blobs
+INFO: Building [["@","","","greeter"],{}].
+INFO: Processed 1 actions, 1 cache hits.
+INFO: Artifacts built, logical paths are:
+ out.txt [557db03de997c86a4a028e1ebd3a1ceb225be238:12:f]
+Hello World
+$
+```
+
+While one normally tries to design targets in such a way that they
+don't have conflicting files if they are meant to be used together, it
+is up to the receiving target to decide what to do with those artifacts.
+A built-in rule for rearranging artifacts is `"install"`; a detailed
+description of this rule can be found in the documentation. In the
+simple case of a target producing precisely one file, the argument
+`"files"` can be used to map that file to a new location.
+
+``` {.jsonc srcname="TARGETS"}
+...
+, "both":
+ {"type": "install", "files": {"hello.txt": "greeter", "upper.txt": "upper"}}
+...
+```
+
+``` sh
+$ just build both
+INFO: Requested target is [["@","","","both"],{}]
+INFO: Analysed target [["@","","","both"],{}]
+INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
+INFO: Discovered 2 actions, 0 trees, 0 blobs
+INFO: Building [["@","","","both"],{}].
+INFO: Processed 2 actions, 2 cache hits.
+INFO: Artifacts built, logical paths are:
+ hello.txt [557db03de997c86a4a028e1ebd3a1ceb225be238:12:f]
+ upper.txt [83cf24cdfb4891a36bee93421930dd220766299a:6:f]
+$
+```
diff --git a/doc/tutorial/getting-started.org b/doc/tutorial/getting-started.org
deleted file mode 100644
index 5a041397..00000000
--- a/doc/tutorial/getting-started.org
+++ /dev/null
@@ -1,212 +0,0 @@
-* Getting Started
-
-In order to use /justbuild/, first make sure that ~just~, ~just-mr~, and
-~just-import-git~ are available in your ~PATH~.
-
-** Creating a new project
-
-/justbuild/ needs to know the root of the project worked on. By default, it
-searches upwards from the current directory till it finds a marker. Currently,
-we support three different markers: the files ~ROOT~ and ~WORKSPACE~ or the
-directory ~.git~. Lets create a new project by creating one of those markers:
-
-#+BEGIN_SRC sh
-$ touch ROOT
-#+END_SRC
-
-** Creating a generic target
-
-By default, targets are described in ~TARGETS~ files. These files contain a
-~JSON~ object with the target name as key and the target description as value. A
-target description is an object with at least a single mandatory field:
-~"type"~. This field specifies which rule (built-in or user-defined) to apply
-for this target.
-
-A simple target that only executes commands can be created using the built-in
-~"generic"~ rule, which requires at least one command and one output file or
-directory. To create such a target, create the file ~TARGETS~ with the following
-content:
-
-#+SRCNAME: TARGETS
-#+BEGIN_SRC js
-{ "greeter":
- { "type": "generic"
- , "cmds": ["echo -n 'Hello ' > out.txt", "cat name.txt >> out.txt"]
- , "outs": ["out.txt"]
- , "deps": ["name.txt"]
- }
-}
-#+END_SRC
-
-In this example, the ~"greeter"~ target will run two commands to produce the
-output file ~out.txt~. The second command depends on the input file ~name.txt~
-that we need to create as well:
-
-#+BEGIN_SRC sh
-$ echo World > name.txt
-#+END_SRC
-
-** Building a generic target
-
-To build a target, we need to run ~just~ with the subcommand ~build~:
-
-#+BEGIN_SRC sh
-$ just build greeter
-INFO: Requested target is [["@","","","greeter"],{}]
-INFO: Analysed target [["@","","","greeter"],{}]
-INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
-INFO: Discovered 1 actions, 0 trees, 0 blobs
-INFO: Building [["@","","","greeter"],{}].
-INFO: Processed 1 actions, 0 cache hits.
-INFO: Artifacts built, logical paths are:
- out.txt [557db03de997c86a4a028e1ebd3a1ceb225be238:12:f]
-$
-#+END_SRC
-
-The subcommand ~build~ just builds the artifact but does not stage it to any
-user-defined location on the file system. Instead it reports a description
-of the artifact consisting of ~git~ blob identifier, size, and type (in
-this case ~f~ for non-executable file). To also stage the produced artifact to
-the working directory, use the ~install~ subcommand and specify the output
-directory:
-
-#+BEGIN_SRC sh
-$ just install greeter -o .
-INFO: Requested target is [["@","","","greeter"],{}]
-INFO: Analysed target [["@","","","greeter"],{}]
-INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
-INFO: Discovered 1 actions, 0 trees, 0 blobs
-INFO: Building [["@","","","greeter"],{}].
-INFO: Processed 1 actions, 1 cache hits.
-INFO: Artifacts can be found in:
- /tmp/tutorial/out.txt [557db03de997c86a4a028e1ebd3a1ceb225be238:12:f]
-$ cat out.txt
-Hello World
-$
-#+END_SRC
-
-Note that the ~install~ subcommand initiates the build a second time, without
-executing any actions as all actions are being served from cache. The produced
-artifact is identical, which is indicated by the same hash/size/type.
-
-If one is only interested in a single final artifact, one can
-also request via the ~-P~ option that this artifact be written to
-standard output after the build. As all messages are reported to
-standard error, this can be used for both, interactively reading a
-text file, as well as for piping the artifact to another program.
-
-#+BEGIN_SRC sh
-$ just build greeter -Pout.txt
-INFO: Requested target is [["@","","","greeter"],{}]
-INFO: Analysed target [["@","","","greeter"],{}]
-INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
-INFO: Discovered 1 actions, 0 trees, 0 blobs
-INFO: Building [["@","","","greeter"],{}].
-INFO: Processed 1 actions, 1 cache hits.
-INFO: Artifacts built, logical paths are:
- out.txt [557db03de997c86a4a028e1ebd3a1ceb225be238:12:f]
-Hello World
-$
-#+END_SRC
-
-Alternatively, we could also directly request the artifact ~out.txt~ from
-/justbuild/'s CAS (content-addressable storage) and print it on the command line
-via:
-
-#+BEGIN_SRC sh
-$ just install-cas [557db03de997c86a4a028e1ebd3a1ceb225be238:12:f]
-Hello World
-$
-#+END_SRC
-
-The canonical way of requesting an object from the CAS is, as just shown, to
-specify the full triple of hash, size, and type, separated by colons and
-enclosed in square brackets. To simplify usage, the brackets can be omitted
-and the size and type fields have the default values ~0~ and ~f~, respectively.
-While the default value for the size is wrong for all but one string, the hash
-still determines the content of the file and hence the local CAS is still
-able to retrieve the file. So the typical invocation would simply specify the
-hash.
-
-#+BEGIN_SRC sh
-$ just install-cas 557db03de997c86a4a028e1ebd3a1ceb225be238
-Hello World
-$
-#+END_SRC
-
-** Targets versus Files: The Stage
-
-When invoking the ~build~ command, we had to specify the target ~greeter~,
-not the output file ~out.txt~. While other build systems allow requests
-specifying an output file, for /justbuild/ this would conflict with a
-fundamental design principle: staging; each target has its own logical
-output space, the "stage", where it can put its artifacts. We can, without
-any problem, add a second target also generating a file ~out.txt~.
-
-#+SRCNAME: TARGETS
-#+BEGIN_SRC js
-...
-, "upper":
- { "type": "generic"
- , "cmds": ["cat name.txt | tr a-z A-Z > out.txt"]
- , "outs": ["out.txt"]
- , "deps": ["name.txt"]
- }
-...
-#+END_SRC
-
-As we only request targets, no conflicts arise.
-
-#+BEGIN_SRC sh
-$ just build upper -P out.txt
-INFO: Requested target is [["@","","","upper"],{}]
-INFO: Analysed target [["@","","","upper"],{}]
-INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
-INFO: Discovered 1 actions, 0 trees, 0 blobs
-INFO: Building [["@","","","upper"],{}].
-INFO: Processed 1 actions, 0 cache hits.
-INFO: Artifacts built, logical paths are:
- out.txt [83cf24cdfb4891a36bee93421930dd220766299a:6:f]
-WORLD
-$ just build greeter -P out.txt
-INFO: Requested target is [["@","","","greeter"],{}]
-INFO: Analysed target [["@","","","greeter"],{}]
-INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
-INFO: Discovered 1 actions, 0 trees, 0 blobs
-INFO: Building [["@","","","greeter"],{}].
-INFO: Processed 1 actions, 1 cache hits.
-INFO: Artifacts built, logical paths are:
- out.txt [557db03de997c86a4a028e1ebd3a1ceb225be238:12:f]
-Hello World
-$
-#+END_SRC
-
-While one normally tries to design targets in such a way that they
-don't have conflicting files if they should be used together, it is
-up to the receiving target to decide what to do with those artifacts.
-A built-in rule allowing to rearrange artifacts is ~"install"~; a
-detailed description of this rule can be found in the documentation.
-In the simple case of a target producing precisely one file, the
-argument ~"files"~ can be used to map that file to a new location.
-
-#+SRCNAME: TARGETS
-#+BEGIN_SRC js
-...
-, "both":
- {"type": "install", "files": {"hello.txt": "greeter", "upper.txt": "upper"}}
-...
-#+END_SRC
-
-#+BEGIN_SRC sh
-$ just build both
-INFO: Requested target is [["@","","","both"],{}]
-INFO: Analysed target [["@","","","both"],{}]
-INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
-INFO: Discovered 2 actions, 0 trees, 0 blobs
-INFO: Building [["@","","","both"],{}].
-INFO: Processed 2 actions, 2 cache hits.
-INFO: Artifacts built, logical paths are:
- hello.txt [557db03de997c86a4a028e1ebd3a1ceb225be238:12:f]
- upper.txt [83cf24cdfb4891a36bee93421930dd220766299a:6:f]
-$
-#+END_SRC
diff --git a/doc/tutorial/hello-world.md b/doc/tutorial/hello-world.md
new file mode 100644
index 00000000..9af68f07
--- /dev/null
+++ b/doc/tutorial/hello-world.md
@@ -0,0 +1,379 @@
+Building C++ Hello World
+========================
+
+*justbuild* is a truly language-agnostic (there are no more-equal
+languages) multi-repository build system. As a consequence, high-level
+concepts (e.g., C++ binaries, C++ libraries, etc.) are not hardcoded
+built-ins of the tool, but rather provided via a set of rules. These
+rules can be specified as a true dependency of your project, like any
+other external repository your project might depend on.
+
+Setting up the Multi-Repository Configuration
+---------------------------------------------
+
+To build a project with multi-repository dependencies, we first need to
+provide a configuration that declares the required repositories. Before
+we begin, we need to declare where the root of our workspace is located
+by creating an empty file `ROOT`:
+
+``` sh
+$ touch ROOT
+```
+
+Second, we also need to create the multi-repository configuration
+`repos.json` in the workspace root:
+
+``` {.jsonc srcname="repos.json"}
+{ "main": "tutorial"
+, "repositories":
+ { "rules-cc":
+ { "repository":
+ { "type": "git"
+ , "branch": "master"
+ , "commit": "123d8b03bf2440052626151c14c54abce2726e6f"
+ , "repository": "https://github.com/just-buildsystem/rules-cc.git"
+ , "subdir": "rules"
+ }
+ }
+ , "tutorial":
+ { "repository": {"type": "file", "path": "."}
+ , "bindings": {"rules": "rules-cc"}
+ }
+ }
+}
+```
+
+In that configuration, two repositories are defined:
+
+1. The `"rules-cc"` repository located in the subdirectory `rules` of
+ [just-buildsystem/rules-cc:123d8b03bf2440052626151c14c54abce2726e6f](https://github.com/just-buildsystem/rules-cc/tree/123d8b03bf2440052626151c14c54abce2726e6f),
+ which contains the high-level concepts for building C/C++ binaries
+ and libraries.
+
+2. The `"tutorial"` repository located at `.`, which contains the
+ targets that we want to build. It has a single dependency, which is
+ the *rules* that are needed to build the target. These rules are
+ bound via the open name `"rules"` to the just created repository
+ `"rules-cc"`. In this way, the entities provided by `"rules-cc"` can
+ be accessed from within the `"tutorial"` repository via the
+ fully-qualified name `["@", "rules", "<module>", "<name>"]`;
+ fully-qualified names (for rules, targets to build (like libraries,
+ binaries), etc) are given by a repository name, a path specifying a
+ directory within that repository (the "module") where the
+ specification file is located, and a symbolic name (i.e., an
+ arbitrary string that is used as key in the specification).
+
+The final repository configuration contains a single `JSON` object with
+the key `"repositories"`, referring to an object with repository names
+as keys and repository descriptions as values. For convenience, the
+main repository to pick is set to `"tutorial"`.
+
+Description of the helloworld target
+------------------------------------
+
+For this tutorial, we want to create a target `helloworld` that produces
+a binary from the C++ source `main.cpp`. To define such a target, create
+a `TARGETS` file with the following content:
+
+``` {.jsonc srcname="TARGETS"}
+{ "helloworld":
+ { "type": ["@", "rules", "CC", "binary"]
+ , "name": ["helloworld"]
+ , "srcs": ["main.cpp"]
+ }
+}
+```
+
+The `"type"` field refers to the rule `"binary"` from the module `"CC"`
+of the `"rules"` repository. This rule additionally requires the string
+field `"name"`, which specifies the name of the binary to produce; as
+the generic interface of rules is to have fields either take a list of
+strings or a list of targets, we have to specify the name as a list
+(this rule will simply concatenate all strings given in this field).
+Furthermore, at least one input to the binary is required, which can be
+specified via the target fields `"srcs"` or `"deps"`. In our case, the
+former is used, which contains our single source file (files are
+considered targets).
+
+Now, the last file that is missing is the actual source file `main.cpp`:
+
+``` {.cpp srcname="main.cpp"}
+#include <iostream>
+
+int main() {
+ std::cout << "Hello world!\n";
+ return 0;
+}
+```
+
+Building the helloworld target
+------------------------------
+
+To build the `helloworld` target, we need to specify it on the `just-mr`
+command line:
+
+``` sh
+$ just-mr build helloworld
+INFO: Requested target is [["@","tutorial","","helloworld"],{}]
+INFO: Analysed target [["@","tutorial","","helloworld"],{}]
+INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
+INFO: Discovered 2 actions, 1 trees, 0 blobs
+INFO: Building [["@","tutorial","","helloworld"],{}].
+INFO: Processed 2 actions, 0 cache hits.
+INFO: Artifacts built, logical paths are:
+ helloworld [b5cfca8b810adc4686f5cac00258a137c5d4a3ba:17088:x]
+$
+```
+
+Note that the target is taken from the `tutorial` repository, as it is
+specified as the main repository in `repos.json`. If targets from other
+repositories should be built, the repository to use must be specified
+via the `--main` option.
+
+`just-mr` reads the repository configuration, fetches externals (if
+any), generates the actual build configuration, and stores it in its
+cache directory (by default under `$HOME/.cache/just`). Afterwards, the
+generated configuration is used to call the `just` binary, which
+performs the actual build.
+
+Note that these two programs, `just-mr` and `just`, can also be run
+individually. To do so, first run `just-mr` with `setup` and capture the
+path to the generated build configuration from stdout by assigning it to
+a shell variable (e.g., `CONF`). Afterwards, `just` can be called to
+perform the actual build by explicitly specifying the configuration file
+via `-C`:
+
+``` sh
+$ CONF=$(just-mr setup tutorial)
+$ just build -C $CONF helloworld
+```
+
+Note that `just-mr` needs to be run only the very first time, and then
+once again whenever the `repos.json` file is modified.
+
+By default, the BSD-default compiler front-ends `cc` and `c++` (which
+are also available on most Linux distributions) are used for C and C++
+(variables `"CC"` and `"CXX"`). If you want to temporarily use different
+defaults, you can use `-D` to provide a JSON object that sets different
+default variables. For instance, to use Clang as C++ compiler for a
+single build invocation, you can use the following command to provide an
+object that sets `"CXX"` to `"clang++"`:
+
+``` sh
+$ just-mr build helloworld -D'{"CXX":"clang++"}'
+INFO: Requested target is [["@","tutorial","","helloworld"],{"CXX":"clang++"}]
+INFO: Analysed target [["@","tutorial","","helloworld"],{"CXX":"clang++"}]
+INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
+INFO: Discovered 2 actions, 1 trees, 0 blobs
+INFO: Building [["@","tutorial","","helloworld"],{"CXX":"clang++"}].
+INFO: Processed 2 actions, 0 cache hits.
+INFO: Artifacts built, logical paths are:
+ helloworld [b8cf7b8579d9dc7172b61660139e2c14521cedae:16944:x]
+$
+```
+
+Defining project defaults
+-------------------------
+
+To define a custom set of defaults (toolchain and compile flags) for
+your project, you need to create a separate file root providing the
+required `TARGETS` file, which contains the `"defaults"` target that
+should be used by the rules. This file root is then used as the *target
+root* for the rules, i.e., the search path for `TARGETS` files. In this
+way, the description of the `"defaults"` target is provided in a
+separate file root, to keep the rules repository independent of these
+definitions.
+
+We will call the new file root `tutorial-defaults` and need to create a
+module directory `CC` in it:
+
+``` sh
+$ mkdir -p ./tutorial-defaults/CC
+```
+
+In that module, we need to create the file
+`tutorial-defaults/CC/TARGETS` that contains the target `"defaults"` and
+specifies which toolchain and compile flags to use; it has to specify
+the complete toolchain, but can specify a `"base"` toolchain to inherit
+from. In our case, we don't use any base, but specify all the required
+fields directly.
+
+``` {.jsonc srcname="tutorial-defaults/CC/TARGETS"}
+{ "defaults":
+ { "type": ["CC", "defaults"]
+ , "CC": ["cc"]
+ , "CXX": ["c++"]
+ , "CFLAGS": ["-O2", "-Wall"]
+ , "CXXFLAGS": ["-O2", "-Wall"]
+ , "AR": ["ar"]
+ , "PATH": ["/bin", "/usr/bin"]
+ }
+}
+```
+
+To use the project defaults, modify the existing `repos.json` to reflect
+the following content:
+
+``` {.jsonc srcname="repos.json"}
+{ "main": "tutorial"
+, "repositories":
+ { "rules-cc":
+ { "repository":
+ { "type": "git"
+ , "branch": "master"
+ , "commit": "123d8b03bf2440052626151c14c54abce2726e6f"
+ , "repository": "https://github.com/just-buildsystem/rules-cc.git"
+ , "subdir": "rules"
+ }
+ , "target_root": "tutorial-defaults"
+ , "rule_root": "rules-cc"
+ }
+ , "tutorial":
+ { "repository": {"type": "file", "path": "."}
+ , "bindings": {"rules": "rules-cc"}
+ }
+ , "tutorial-defaults":
+ { "repository": {"type": "file", "path": "./tutorial-defaults"}
+ }
+ }
+}
+```
+
+Note that the `"defaults"` target uses the rule `["CC", "defaults"]`
+without specifying any external repository (e.g.,
+`["@", "rules", ...]`). This is because `"tutorial-defaults"` is not a
+full-fledged repository but merely a file root that is considered local
+to the `"rules-cc"` repository. In fact, the `"rules-cc"` repository
+cannot refer to any external repository as it does not have any defined
+bindings.
+
+To rebuild the project, we need to rerun `just-mr` (note that due to
+configuration changes, rerunning only `just` would not suffice):
+
+``` sh
+$ just-mr build helloworld
+INFO: Requested target is [["@","tutorial","","helloworld"],{}]
+INFO: Analysed target [["@","tutorial","","helloworld"],{}]
+INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
+INFO: Discovered 2 actions, 1 trees, 0 blobs
+INFO: Building [["@","tutorial","","helloworld"],{}].
+INFO: Processed 2 actions, 0 cache hits.
+INFO: Artifacts built, logical paths are:
+ helloworld [487dc9e47b978877ed2f7d80b3395ce84b23be92:16992:x]
+$
+```
+
+Note that the output binary may have changed due to different defaults.
+
+Modeling target dependencies
+----------------------------
+
+For demonstration purposes, we will separate the print statements into a
+static library `greet`, which will become a dependency of our binary.
+Therefore, we create a new subdirectory `greet` with the files
+`greet/greet.hpp`:
+
+``` {.cpp srcname="greet/greet.hpp"}
+#include <string>
+
+void greet(std::string const& s);
+```
+
+and `greet/greet.cpp`:
+
+``` {.cpp srcname="greet/greet.cpp"}
+#include "greet.hpp"
+#include <iostream>
+
+void greet(std::string const& s) {
+ std::cout << "Hello " << s << "!\n";
+}
+```
+
+These files can now be used to create a static library `libgreet.a`. To
+do so, we need to create the following target description in
+`greet/TARGETS`:
+
+``` {.jsonc srcname="greet/TARGETS"}
+{ "greet":
+ { "type": ["@", "rules", "CC", "library"]
+ , "name": ["greet"]
+ , "hdrs": ["greet.hpp"]
+ , "srcs": ["greet.cpp"]
+ , "stage": ["greet"]
+ }
+}
+```
+
+Similar to `"binary"`, we have to provide a name and a source file.
+Additionally, a library has public headers defined via `"hdrs"` and an
+optional staging directory `"stage"` (default value `"."`). The staging
+directory specifies where the consumer of this library can expect to
+find the library's artifacts. Note that this does not need to reflect
+the location on the file system (i.e., a fully-qualified path like
+`["com", "example", "utils", "greet"]` could be used to distinguish it
+from greeting libraries of other projects). The staging directory
+affects not only the main artifact `libgreet.a` but also its
+*runfiles*, a second set of artifacts, usually those a consumer needs
+to make proper use of the actual artifact; in the case of a library,
+the runfiles are its public headers. Hence, the public header will be
+staged to `"greet/greet.hpp"`. With that knowledge, we can now perform
+the necessary modifications to `main.cpp`:
+
+``` {.cpp srcname="main.cpp"}
+#include "greet/greet.hpp"
+
+int main() {
+ greet("Universe");
+ return 0;
+}
+```
+
+In the top-level `TARGETS` file, the target `"helloworld"` gets a
+direct dependency on the target `"greet"` of the module `"greet"`:
+
+``` {.jsonc srcname="TARGETS"}
+{ "helloworld":
+ { "type": ["@", "rules", "CC", "binary"]
+ , "name": ["helloworld"]
+ , "srcs": ["main.cpp"]
+ , "private-deps": [["greet", "greet"]]
+ }
+}
+```
+
+Note that there is no need to explicitly specify `"greet"`'s public
+headers here as the appropriate artifacts of dependencies are
+automatically added to the inputs of compile and link actions. The new
+binary can be built with the same command as before (no need to rerun
+`just-mr`):
+
+``` sh
+$ just-mr build helloworld
+INFO: Requested target is [["@","tutorial","","helloworld"],{}]
+INFO: Analysed target [["@","tutorial","","helloworld"],{}]
+INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
+INFO: Discovered 4 actions, 2 trees, 0 blobs
+INFO: Building [["@","tutorial","","helloworld"],{}].
+INFO: Processed 4 actions, 0 cache hits.
+INFO: Artifacts built, logical paths are:
+ helloworld [2b81e3177afc382452a2df9f294d3df90a9ccaf0:17664:x]
+$
+```
+
+To only build the static library target `"greet"` from module `"greet"`,
+run the following command:
+
+``` sh
+$ just-mr build greet greet
+INFO: Requested target is [["@","tutorial","greet","greet"],{}]
+INFO: Analysed target [["@","tutorial","greet","greet"],{}]
+INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
+INFO: Discovered 2 actions, 1 trees, 0 blobs
+INFO: Building [["@","tutorial","greet","greet"],{}].
+INFO: Processed 2 actions, 2 cache hits.
+INFO: Artifacts built, logical paths are:
+ greet/libgreet.a [83ed406e21f285337b0c9bd5011f56f656bba683:2992:f]
+ (1 runfiles omitted.)
+$
+```
diff --git a/doc/tutorial/hello-world.org b/doc/tutorial/hello-world.org
deleted file mode 100644
index 342eaf82..00000000
--- a/doc/tutorial/hello-world.org
+++ /dev/null
@@ -1,370 +0,0 @@
-* Building C++ Hello World
-
-/justbuild/ is a true language-agnostic (there are no more-equal languages) and
-multi-repository build system. As a consequence, high-level concepts (e.g., C++
-binaries, C++ libraries, etc.) are not hardcoded built-ins of the tool, but
-rather provided via a set of rules. These rules can be specified as a true
-dependency to your project like any other external repository your project might
-depend on.
-
-** Setting up the Multi-Repository Configuration
-
-To build a project with multi-repository dependencies, we first need to provide
-a configuration that declares the required repositories. Before we begin, we
-need to declare where the root of our workspace is located by creating an empty
-file ~ROOT~:
-
-#+BEGIN_SRC sh
-$ touch ROOT
-#+END_SRC
-
-Second, we also need to create the multi-repository configuration ~repos.json~
-in the workspace root:
-
-#+SRCNAME: repos.json
-#+BEGIN_SRC js
-{ "main": "tutorial"
-, "repositories":
- { "rules-cc":
- { "repository":
- { "type": "git"
- , "branch": "master"
- , "commit": "123d8b03bf2440052626151c14c54abce2726e6f"
- , "repository": "https://github.com/just-buildsystem/rules-cc.git"
- , "subdir": "rules"
- }
- }
- , "tutorial":
- { "repository": {"type": "file", "path": "."}
- , "bindings": {"rules": "rules-cc"}
- }
- }
-}
-#+END_SRC
-
-In that configuration, two repositories are defined:
-
- 1. The ~"rules-cc"~ repository located in the subdirectory ~rules~ of
- [[https://github.com/just-buildsystem/rules-cc/tree/123d8b03bf2440052626151c14c54abce2726e6f][just-buildsystem/rules-cc:123d8b03bf2440052626151c14c54abce2726e6f]],
- which contains the high-level concepts for building C/C++ binaries and
- libraries.
-
- 2. The ~"tutorial"~ repository located at ~.~, which contains the targets that
- we want to build. It has a single dependency, which is the /rules/ that are
- needed to build the target. These rules are bound via the open name
- ~"rules"~ to the just created repository ~"rules-cc"~. In this way, the
- entities provided by ~"rules-cc"~ can be accessed from within the
- ~"tutorial"~ repository via the fully-qualified name
- ~["@", "rules", "<module>", "<name>"]~; fully-qualified
- names (for rules, targets to build (like libraries, binaries),
- etc) are given by a repository name, a path specifying a
- directory within that repository (the "module") where the
- specification file is located, and a symbolic name (i.e., an
- arbitrary string that is used as key in the specification).
-
-The final repository configuration contains a single ~JSON~ object with the key
-~"repositories"~ referring to an object of repository names as keys and
-repository descriptions as values. For convenience, the main repository to pick
-is set to ~"tutorial"~.
-
-** Description of the helloworld target
-
-For this tutorial, we want to create a target ~helloworld~ that produces a
-binary from the C++ source ~main.cpp~. To define such a target, create a
-~TARGETS~ file with the following content:
-
-#+SRCNAME: TARGETS
-#+BEGIN_SRC js
-{ "helloworld":
- { "type": ["@", "rules", "CC", "binary"]
- , "name": ["helloworld"]
- , "srcs": ["main.cpp"]
- }
-}
-#+END_SRC
-
-The ~"type"~ field refers to the rule ~"binary"~ from the module ~"CC"~ of the
-~"rules"~ repository. This rule additionally requires the string field ~"name"~,
-which specifies the name of the binary to produce; as the generic interface of
-rules is to have fields either take a list of strings or a list of targets,
-we have to specify the name as a list (this rule will simply concatenate all
-strings given in this field). Furthermore, at least one
-input to the binary is required, which can be specified via the target fields
-~"srcs"~ or ~"deps"~. In our case, the former is used, which contains our single
-source file (files are considered targets).
-
-Now, the last file that is missing is the actual source file ~main.cpp~:
-
-#+SRCNAME: main.cpp
-#+BEGIN_SRC cpp
-#include <iostream>
-
-int main() {
- std::cout << "Hello world!\n";
- return 0;
-}
-#+END_SRC
-
-** Building the helloworld target
-
-To build the ~helloworld~ target, we need to specify it on the ~just-mr~
-command line:
-
-#+BEGIN_SRC sh
-$ just-mr build helloworld
-INFO: Requested target is [["@","tutorial","","helloworld"],{}]
-INFO: Analysed target [["@","tutorial","","helloworld"],{}]
-INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
-INFO: Discovered 2 actions, 1 trees, 0 blobs
-INFO: Building [["@","tutorial","","helloworld"],{}].
-INFO: Processed 2 actions, 0 cache hits.
-INFO: Artifacts built, logical paths are:
- helloworld [b5cfca8b810adc4686f5cac00258a137c5d4a3ba:17088:x]
-$
-#+END_SRC
-
-Note that the target is taken from the ~tutorial~ repository, as it is
-specified as the main repository in ~repos.json~. If targets from other
-repositories should be built, the repository to use must be specified via
-the ~--main~ option.
-
-~just-mr~ reads the repository configuration, fetches externals (if any),
-generates the actual build configuration, and stores it in its cache directory
-(by default under ~$HOME/.cache/just~). Afterwards, the generated configuration
-is used to call the ~just~ binary, which performs the actual build.
-
-Note that these two programs, ~just-mr~ and ~just~, can also be run
-individually. To do so, first run ~just-mr~ with ~setup~ and capture the path to
-the generated build configuration from stdout by assigning it to a shell
-variable (e.g., ~CONF~). Afterwards, ~just~ can be called to perform the actual
-build by explicitly specifying the configuration file via ~-C~:
-
-#+BEGIN_SRC sh
-$ CONF=$(just-mr setup tutorial)
-$ just build -C $CONF helloworld
-#+END_SRC
-
-Note that ~just-mr~ only needs to be run the very first time and only once again
-whenever the ~repos.json~ file is modified.
-
-By default, the BSD-default compiler front-ends (which are also defined for most
-Linux distributions) ~cc~ and ~c++~ are used for C and C++ (variables ~"CC"~ and
-~"CXX"~). If you want to temporarily use different defaults, you can use ~-D~ to
-provide a JSON object that sets different default variables. For instance, to
-use Clang as C++ compiler for a single build invocation, you can use the
-following command to provide an object that sets ~"CXX"~ to ~"clang++"~:
-
-#+BEGIN_SRC sh
-$ just-mr build helloworld -D'{"CXX":"clang++"}'
-INFO: Requested target is [["@","tutorial","","helloworld"],{"CXX":"clang++"}]
-INFO: Analysed target [["@","tutorial","","helloworld"],{"CXX":"clang++"}]
-INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
-INFO: Discovered 2 actions, 1 trees, 0 blobs
-INFO: Building [["@","tutorial","","helloworld"],{"CXX":"clang++"}].
-INFO: Processed 2 actions, 0 cache hits.
-INFO: Artifacts built, logical paths are:
- helloworld [b8cf7b8579d9dc7172b61660139e2c14521cedae:16944:x]
-$
-#+END_SRC
-
-** Defining project defaults
-
-To define a custom set of defaults (toolchain and compile flags) for your
-project, you need to create a separate file root providing the required
-~TARGETS~ file, which contains the ~"defaults"~ target that should be used by
-the rules. This file root is then used as the /target root/ for the rules, i.e.,
-the search path for ~TARGETS~ files. In this way, the description of the
-~"defaults"~ target is provided in a separate file root, to keep the rules
-repository independent of these definitions.
-
-We will call the new file root ~tutorial-defaults~ and need to create a module
-directory ~CC~ in it:
-
-#+BEGIN_SRC sh
-$ mkdir -p ./tutorial-defaults/CC
-#+END_SRC
-
-In that module, we need to create the file ~tutorial-defaults/CC/TARGETS~ that
-contains the target ~"defaults"~ and specifies which toolchain and compile flags
-to use; it has to specify the complete toolchain, but can specify a ~"base"~
-toolchain to inherit from. In our case, we don't use any base, but specify all
-the required fields directly.
-
-#+SRCNAME: tutorial-defaults/CC/TARGETS
-#+BEGIN_SRC js
-{ "defaults":
- { "type": ["CC", "defaults"]
- , "CC": ["cc"]
- , "CXX": ["c++"]
- , "CFLAGS": ["-O2", "-Wall"]
- , "CXXFLAGS": ["-O2", "-Wall"]
- , "AR": ["ar"]
- , "PATH": ["/bin", "/usr/bin"]
- }
-}
-#+END_SRC
-
-To use the project defaults, modify the existing ~repos.json~ to reflect the
-following content:
-
-#+SRCNAME: repos.json
-#+BEGIN_SRC js
-{ "main": "tutorial"
-, "repositories":
- { "rules-cc":
- { "repository":
- { "type": "git"
- , "branch": "master"
- , "commit": "123d8b03bf2440052626151c14c54abce2726e6f"
- , "repository": "https://github.com/just-buildsystem/rules-cc.git"
- , "subdir": "rules"
- }
- , "target_root": "tutorial-defaults"
- , "rule_root": "rules-cc"
- }
- , "tutorial":
- { "repository": {"type": "file", "path": "."}
- , "bindings": {"rules": "rules-cc"}
- }
- , "tutorial-defaults":
- { "repository": {"type": "file", "path": "./tutorial-defaults"}
- }
- }
-}
-#+END_SRC
-
-Note that the ~"defaults"~ target uses the rule ~["CC", "defaults"]~ without
-specifying any external repository (e.g., ~["@", "rules", ...]~). This is
-because ~"tutorial-defaults"~ is not a full-fledged repository but merely a file
-root that is considered local to the ~"rules-cc"~ repository. In fact, the
-~"rules-cc"~ repository cannot refer to any external repository as it does not
-have any defined bindings.
-
-To rebuild the project, we need to rerun ~just-mr~ (note that due to
-configuration changes, rerunning only ~just~ would not suffice):
-
-#+BEGIN_SRC sh
-$ just-mr build helloworld
-INFO: Requested target is [["@","tutorial","","helloworld"],{}]
-INFO: Analysed target [["@","tutorial","","helloworld"],{}]
-INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
-INFO: Discovered 2 actions, 1 trees, 0 blobs
-INFO: Building [["@","tutorial","","helloworld"],{}].
-INFO: Processed 2 actions, 0 cache hits.
-INFO: Artifacts built, logical paths are:
- helloworld [487dc9e47b978877ed2f7d80b3395ce84b23be92:16992:x]
-$
-#+END_SRC
-
-Note that the output binary may have changed due to different defaults.
-
-** Modeling target dependencies
-
-For demonstration purposes, we will separate the print statements into a static
-library ~greet~, which will become a dependency of our binary. Therefore, we
-create a new subdirectory ~greet~ with the files ~greet/greet.hpp~:
-
-#+SRCNAME: greet/greet.hpp
-#+BEGIN_SRC cpp
-#include <string>
-
-void greet(std::string const& s);
-#+END_SRC
-
-and ~greet/greet.cpp~:
-
-#+SRCNAME: greet/greet.cpp
-#+BEGIN_SRC cpp
-#include "greet.hpp"
-#include <iostream>
-
-void greet(std::string const& s) {
- std::cout << "Hello " << s << "!\n";
-}
-#+END_SRC
-
-These files can now be used to create a static library ~libgreet.a~. To do so,
-we need to create the following target description in ~greet/TARGETS~:
-
-#+SRCNAME: greet/TARGETS
-#+BEGIN_SRC js
-{ "greet":
- { "type": ["@", "rules", "CC", "library"]
- , "name": ["greet"]
- , "hdrs": ["greet.hpp"]
- , "srcs": ["greet.cpp"]
- , "stage": ["greet"]
- }
-}
-#+END_SRC
-
-Similar to ~"binary"~, we have to provide a name and source file. Additionally,
-a library has public headers defined via ~"hdrs"~ and an optional staging
-directory ~"stage"~ (default value ~"."~). The staging directory specifies where
-the consumer of this library can expect to find the library's artifacts. Note
-that this does not need to reflect the location on the file system (i.e., a
-fully-qualified path like ~["com", "example", "utils", "greet"]~ could be used
-to distinguish it from greeting libraries of other projects). The staging
-directory affects not only the main artifact ~libgreet.a~ but also its
-/runfiles/, a second set of artifacts, usually those a consumer needs to make
-proper use of the actual artifact; in the case of a library, the runfiles are
-its public headers.
-Hence, the public header will be staged to ~"greet/greet.hpp"~. With that
-knowledge, we can now perform the necessary modifications to ~main.cpp~:
-
-#+SRCNAME: main.cpp
-#+BEGIN_SRC cpp
-#include "greet/greet.hpp"
-
-int main() {
- greet("Universe");
- return 0;
-}
-#+END_SRC
-
-The target ~"helloworld"~ will have a direct dependency to the target ~"greet"~
-of the module ~"greet"~ in the top-level ~TARGETS~ file:
-
-#+SRCNAME: TARGETS
-#+BEGIN_SRC js
-{ "helloworld":
- { "type": ["@", "rules", "CC", "binary"]
- , "name": ["helloworld"]
- , "srcs": ["main.cpp"]
- , "private-deps": [["greet", "greet"]]
- }
-}
-#+END_SRC
-
-Note that there is no need to explicitly specify ~"greet"~'s public headers here
-as the appropriate artifacts of dependencies are automatically added to the
-inputs of compile and link actions. The new binary can be built with the same
-command as before (no need to rerun ~just-mr~):
-
-#+BEGIN_SRC sh
-$ just-mr build helloworld
-INFO: Requested target is [["@","tutorial","","helloworld"],{}]
-INFO: Analysed target [["@","tutorial","","helloworld"],{}]
-INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
-INFO: Discovered 4 actions, 2 trees, 0 blobs
-INFO: Building [["@","tutorial","","helloworld"],{}].
-INFO: Processed 4 actions, 0 cache hits.
-INFO: Artifacts built, logical paths are:
- helloworld [2b81e3177afc382452a2df9f294d3df90a9ccaf0:17664:x]
-$
-#+END_SRC
-
-To only build the static library target ~"greet"~ from module ~"greet"~, run the
-following command:
-
-#+BEGIN_SRC sh
-$ just-mr build greet greet
-INFO: Requested target is [["@","tutorial","greet","greet"],{}]
-INFO: Analysed target [["@","tutorial","greet","greet"],{}]
-INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
-INFO: Discovered 2 actions, 1 trees, 0 blobs
-INFO: Building [["@","tutorial","greet","greet"],{}].
-INFO: Processed 2 actions, 2 cache hits.
-INFO: Artifacts built, logical paths are:
- greet/libgreet.a [83ed406e21f285337b0c9bd5011f56f656bba683:2992:f]
- (1 runfiles omitted.)
-$
-#+END_SRC
diff --git a/doc/tutorial/proto.org b/doc/tutorial/proto.md
index b4a02d48..8a04e373 100644
--- a/doc/tutorial/proto.org
+++ b/doc/tutorial/proto.md
@@ -1,27 +1,28 @@
-* Using protocol buffers
+Using protocol buffers
+======================
-The rules /justbuild/ uses for itself also support protocol
-buffers. This tutorial shows how to use those rules and the targets
-associated with them. It is not a tutorial on protocol buffers
-itself; rather, it is assumed that the reader has some knowledge on
-[[https://developers.google.com/protocol-buffers/][protocol buffers]].
+The rules *justbuild* uses for itself also support protocol buffers.
+This tutorial shows how to use those rules and the targets associated
+with them. It is not a tutorial on protocol buffers itself; rather, it
+is assumed that the reader has some knowledge of [protocol
+buffers](https://developers.google.com/protocol-buffers/).
-** Setting up the repository configuration
+Setting up the repository configuration
+---------------------------------------
-Before we begin, we first need to declare where the root of our workspace is
-located by creating the empty file ~ROOT~:
+Before we begin, we first need to declare where the root of our
+workspace is located by creating the empty file `ROOT`:
-#+BEGIN_SRC sh
+``` sh
$ touch ROOT
-#+END_SRC
+```
-The ~protobuf~ repository conveniently contains an
-[[https://github.com/protocolbuffers/protobuf/tree/v3.12.4/examples][example]],
-so we can use this and just add our own target files. We create
-file ~repos.template.json~ as follows.
+The `protobuf` repository conveniently contains an
+[example](https://github.com/protocolbuffers/protobuf/tree/v3.12.4/examples),
+so we can use this and just add our own target files. We create file
+`repos.template.json` as follows.
-#+SRCNAME: repos.template.json
-#+BEGIN_SRC js
+``` {.jsonc srcname="repos.template.json"}
{ "repositories":
{ "":
{ "repository":
@@ -36,45 +37,45 @@ file ~repos.template.json~ as follows.
, "tutorial": {"repository": {"type": "file", "path": "."}}
}
}
-#+END_SRC
+```
-The missing entry ~"rules-cc"~ refers to our C/C++ build rules provided
-[[https://github.com/just-buildsystem/rules-cc][online]]. These rules support
-protobuf if the dependency ~"protoc"~ is provided. To import this rule
-repository including the required transitive dependencies for protobuf, the
-~bin/just-import-git~ script with option ~--as rules-cc~ can be used to
-generate the actual ~repos.json~:
+The missing entry `"rules-cc"` refers to our C/C++ build rules provided
+[online](https://github.com/just-buildsystem/rules-cc). These rules
+support protobuf if the dependency `"protoc"` is provided. To import
+this rule repository including the required transitive dependencies for
+protobuf, the `bin/just-import-git` script with option `--as rules-cc`
+can be used to generate the actual `repos.json`:
-#+BEGIN_SRC sh
+``` sh
$ just-import-git -C repos.template.json -b master --as rules-cc https://github.com/just-buildsystem/rules-cc > repos.json
-#+END_SRC
+```
-To build the example with ~just~, the only task is to write targets files. As
-that contains a couple of new concepts, we will do this step by step.
+To build the example with `just`, the only task is to write targets
+files. As that contains a couple of new concepts, we will do this step
+by step.
-** The proto library
+The proto library
+-----------------
First, we have to declare the proto library. In this case, it only
-contains the file ~addressbook.proto~ and has no dependencies. To
-declare the library, create a ~TARGETS~ file with the following
-content:
+contains the file `addressbook.proto` and has no dependencies. To
+declare the library, create a `TARGETS` file with the following content:
-#+SRCNAME: TARGETS
-#+BEGIN_SRC js
+``` {.jsonc srcname="TARGETS"}
{ "address":
{ "type": ["@", "rules", "proto", "library"]
, "name": ["addressbook"]
, "srcs": ["addressbook.proto"]
}
}
-#+END_SRC
+```
In general, proto libraries could also depend on other proto libraries;
-those would be added to the ~"deps"~ field.
+those would be added to the `"deps"` field.
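+
+As a hedged sketch (the file `person.proto` and the target `"person"`
+are hypothetical and not part of the protobuf example), a dependency
+between two proto libraries would look like this:
+
+``` {.jsonc srcname="TARGETS"}
+{ "person":
+  { "type": ["@", "rules", "proto", "library"]
+  , "name": ["person"]
+  , "srcs": ["person.proto"]
+  }
+, "address":
+  { "type": ["@", "rules", "proto", "library"]
+  , "name": ["addressbook"]
+  , "srcs": ["addressbook.proto"]
+  , "deps": ["person"]
+  }
+}
+```
+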
When building the library, there's very little to do.
-#+BEGIN_SRC sh
+``` sh
$ just-mr build address
INFO: Requested target is [["@","","","address"],{}]
INFO: Analysed target [["@","","","address"],{}]
@@ -84,20 +85,21 @@ INFO: Building [["@","","","address"],{}].
INFO: Processed 0 actions, 0 cache hits.
INFO: Artifacts built, logical paths are:
$
-#+END_SRC
+```
On the other hand, what did we expect? A proto library is an abstract
description of a protocol, so, as long as we don't specify for which
language we want to have bindings, there is nothing to generate.
-Nevertheless, a proto library target is not empty. In fact, it can't be empty,
-as other targets can only access the values of a target and have no
-insights into its definitions. We already relied on this design principle
-implicitly, when we exploited target-level caching for our external dependencies
-and did not even construct the dependency graph for that target. A proto
-library simply provides the dependency structure of the ~.proto~ files.
+Nevertheless, a proto library target is not empty. In fact, it can't be
+empty, as other targets can only access the values of a target and have
+no insights into its definitions. We already relied on this design
+principle implicitly, when we exploited target-level caching for our
+external dependencies and did not even construct the dependency graph
+for that target. A proto library simply provides the dependency
+structure of the `.proto` files.
-#+BEGIN_SRC sh
+``` sh
$ just-mr analyse --dump-nodes - address
INFO: Requested target is [["@","","","address"],{}]
INFO: Result of target [["@","","","address"],{}]: {
@@ -146,36 +148,35 @@ INFO: Target nodes of target [["@","","","address"],{}]:
}
}
$
-#+END_SRC
+```
-The target has one provider ~"proto"~, which is a node. Nodes are
-an abstract representation of a target graph. More precisely, there
-are two kind of nodes, and our example contains one of each.
+The target has one provider `"proto"`, which is a node. Nodes are an
+abstract representation of a target graph. More precisely, there are two
+kinds of nodes, and our example contains one of each.
-The simple kind of nodes are the value nodes; they represent a
-target that has a fixed value, and hence are given by artifacts,
-runfiles, and provided data. In our case, we have one value node,
-the one for the ~.proto~ file.
+The simple kind of nodes are the value nodes; they represent a target
+that has a fixed value, and hence are given by artifacts, runfiles, and
+provided data. In our case, we have one value node, the one for the
+`.proto` file.
The other kind of nodes are the abstract nodes. They describe the
-arguments for a target, but only have an abstract name (i.e., a
-string) for the rule. Combining such an abstract target with a
-binding for the abstract rule names gives a concrete "anonymous"
-target that, in our case, will generate the library with the bindings
-for the concrete language. In this example, the abstract name is
-~"library"~. The alternative in our proto rules would have been
-~"service library"~, for proto libraries that also contain ~rpc~
-definitions (which is used by [[https://grpc.io/][gRPC]]).
-
-** Using proto libraries
-
-Using proto libraries requires, as discussed, bindings for the
-abstract names. Fortunately, our ~CC~ rules are aware of proto
-libraries, so we can simply use them. Our target file hence
-continues as follows.
-
-#+SRCNAME: TARGETS
-#+BEGIN_SRC js
+arguments for a target, but only have an abstract name (i.e., a string)
+for the rule. Combining such an abstract target with a binding for the
+abstract rule names gives a concrete "anonymous" target that, in our
+case, will generate the library with the bindings for the concrete
+language. In this example, the abstract name is `"library"`. The
+alternative in our proto rules would have been `"service library"`, for
+proto libraries that also contain `rpc` definitions (which is used by
+[gRPC](https://grpc.io/)).
+
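+Schematically (a sketch with abbreviated keys; the exact shape of the
+dump may differ in detail), the two kinds of nodes look like:
+
+``` jsonc
+// value node: a fixed value, given by artifacts, runfiles, and provides
+{ "type": "VALUE_NODE"
+, "result": {"artifact_stage": {"addressbook.proto": "..."}, "provides": {}, "runfiles": {}}
+}
+// abstract node: the arguments plus an abstract rule name ("library")
+{ "type": "ABSTRACT_NODE"
+, "node_type": "library"
+, "string_fields": {"name": ["addressbook"], "stage": []}
+, "target_fields": {"srcs": ["..."], "deps": []}
+}
+```
+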
+Using proto libraries
+---------------------
+
+Using proto libraries requires, as discussed, bindings for the abstract
+names. Fortunately, our `CC` rules are aware of proto libraries, so we
+can simply use them. Our target file hence continues as follows.
+
+``` {.jsonc srcname="TARGETS"}
...
, "add_person":
{ "type": ["@", "rules", "CC", "binary"]
@@ -190,14 +191,14 @@ continues as follows.
, "private-proto": ["address"]
}
...
-#+END_SRC
+```
-The first time, we build a target that requires the proto compiler
-(in that particular version, built in that particular way), it takes
-a bit of time, as the proto compiler has to be built. But in follow-up
-builds, also in different projects, the target-level cache is filled already.
+The first time we build a target that requires the proto compiler (in
+that particular version, built in that particular way), it takes a bit
+of time, as the proto compiler has to be built. But in follow-up builds,
+also in different projects, the target-level cache is filled already.
-#+BEGIN_SRC sh
+``` sh
$ just-mr build add_person
...
$ just-mr build add_person
@@ -210,12 +211,12 @@ INFO: Processed 5 actions, 5 cache hits.
INFO: Artifacts built, logical paths are:
add_person [bcbb3deabfe0d77e6d3ea35615336a2f59a1b0aa:2285928:x]
$
-#+END_SRC
+```
If we look at the actions associated with the binary, we find that those
are still the two actions we expect: a compile action and a link action.
-#+BEGIN_SRC sh
+``` sh
$ just-mr analyse add_person --dump-actions -
INFO: Requested target is [["@","","","add_person"],{}]
INFO: Result of target [["@","","","add_person"],{}]: {
@@ -251,17 +252,17 @@ INFO: Actions for target [["@","","","add_person"],{}]:
}
]
$
-#+END_SRC
+```
-As discussed, the ~libaddressbook.a~ that is conveniently available
-during the linking of the binary (as well as the ~addressbook.pb.h~
-available in the ~include~ tree for the compile action) are generated
-by an anonymous target. Using that during the build we already
-filled the target-level cache, we can have a look at all targets
-still analysed. In the one anonymous target, we find again the
-abstract node we discussed earlier.
+As discussed, the `libaddressbook.a` that is conveniently available
+during the linking of the binary (as well as the `addressbook.pb.h`
+available in the `include` tree for the compile action) is generated by
+an anonymous target. Using the fact that the build already filled the
+target-level cache, we can have a look at all targets still analysed. In
+the one anonymous target, we find again the abstract node we discussed
+earlier.
-#+BEGIN_SRC sh
+``` sh
$ just-mr analyse add_person --dump-targets -
INFO: Requested target is [["@","","","add_person"],{}]
INFO: Result of target [["@","","","add_person"],{}]: {
@@ -302,25 +303,24 @@ INFO: List of analysed targets:
}
}
$
-#+END_SRC
-
-It should be noted, however, that this tight integration of proto
-into our ~C++~ rules is just convenience of our code base. If we had
-to cooperate with rules not aware of proto, we could have created
-a separate rule delegating the library creation to the anonymous
-target and then simply reflecting the values of that target.
-In fact, we could simply use an empty library with a public ~proto~
-dependency for this purpose.
-
-#+SRCNAME: TARGETS
-#+BEGIN_SRC js
+```
+
+It should be noted, however, that this tight integration of proto into
+our `C++` rules is just a convenience of our code base. If we had to
+cooperate with rules not aware of proto, we could have created a
+separate rule delegating the library creation to the anonymous target
+and then simply reflecting the values of that target. In fact, we could
+simply use an empty library with a public `proto` dependency for this
+purpose.
+
+``` {.jsonc srcname="TARGETS"}
...
, "address proto library":
{"type": ["@", "rules", "CC", "library"], "proto": ["address"]}
...
-#+END_SRC
+```
-#+BEGIN_SRC sh
+``` sh
$ just-mr analyse 'address proto library'
...
INFO: Requested target is [["@","","","address proto library"],{}]
@@ -347,18 +347,18 @@ INFO: Result of target [["@","","","address proto library"],{}]: {
}
}
$
-#+END_SRC
+```
-** Adding a test
+Adding a test
+-------------
-Finally, let's add a test. As we use the ~protobuf~ repository as
-workspace root, we add the test script ad hoc into a targets file,
-using the ~"file_gen"~ rule. For debugging a potentially failing
-test, we also keep the intermediate files the test generates.
-Create a top-level ~TARGETS~ file with the following content:
+Finally, let's add a test. As we use the `protobuf` repository as
+workspace root, we add the test script ad hoc into a targets file, using
+the `"file_gen"` rule. For debugging a potentially failing test, we also
+keep the intermediate files the test generates. Create a top-level
+`TARGETS` file with the following content:
-#+SRCNAME: TARGETS
-#+BEGIN_SRC js
+``` {.jsonc srcname="TARGETS"}
...
, "test.sh":
{ "type": "file_gen"
@@ -382,17 +382,16 @@ Create a top-level ~TARGETS~ file with the following content:
, "keep": ["addressbook.data", "out.txt"]
}
...
-#+END_SRC
+```
-That example also shows why it is important that the generation
-of the language bindings is delegated to an anonymous target: we
-want to analyse only once how the ~C++~ bindings are generated.
-Nevertheless, many targets can depend (directly or indirectly) on
-the same proto library. And, indeed, analysing the test, we get
-the expected additional targets and the one anonymous target is
-reused by both binaries.
+That example also shows why it is important that the generation of the
+language bindings is delegated to an anonymous target: we want to
+analyse only once how the `C++` bindings are generated. Nevertheless,
+many targets can depend (directly or indirectly) on the same proto
+library. And, indeed, analysing the test, we get the expected additional
+targets and the one anonymous target is reused by both binaries.
-#+BEGIN_SRC sh
+``` sh
$ just-mr analyse test --dump-targets -
INFO: Requested target is [["@","","","test"],{}]
INFO: Result of target [["@","","","test"],{}]: {
@@ -444,11 +443,11 @@ INFO: List of analysed targets:
}
INFO: Target tainted ["test"].
$
-#+END_SRC
+```
Finally, the test passes and the output is as expected.
-#+BEGIN_SRC sh
+``` sh
$ just-mr build test -Pwork/out.txt
INFO: Requested target is [["@","","","test"],{}]
INFO: Analysed target [["@","","","test"],{}]
@@ -472,4 +471,4 @@ Person ID: 12345
Updated: 2022-12-14T18:08:36Z
INFO: Target tainted ["test"].
$
-#+END_SRC
+```
diff --git a/doc/tutorial/rebuild.org b/doc/tutorial/rebuild.md
index 80aafb6f..3f1ddd88 100644
--- a/doc/tutorial/rebuild.org
+++ b/doc/tutorial/rebuild.md
@@ -1,15 +1,17 @@
-* Ensuring reproducibility of the build
-
-Software builds should be [[https://reproducible-builds.org/][reproducible]].
-The ~just~ tool, supports this goal in local builds by isolating
-individual actions, setting permissions and file time stamps to
-canonical values, etc; most remote execution systems take even further
-measures to ensure the environment always looks the same to every
-action. Nevertheless, it is always possible to break reproducibility
-by bad actions, both coming from rules not carefully written, as
-well as from ad-hoc actions added by the ~generic~ target.
-
-#+BEGIN_SRC js
+Ensuring reproducibility of the build
+=====================================
+
+Software builds should be
+[reproducible](https://reproducible-builds.org/). The `just` tool
+supports this goal in local builds by isolating individual actions,
+setting permissions and file time stamps to canonical values, etc; most
+remote execution systems take even further measures to ensure the
+environment always looks the same to every action. Nevertheless, it is
+always possible to break reproducibility by bad actions, both coming
+from rules not carefully written, as well as from ad-hoc actions added
+by the `generic` target.
+
+``` jsonc
...
, "version.h":
{ "type": "generic"
@@ -18,29 +20,29 @@ well as from ad-hoc actions added by the ~generic~ target.
, "outs": ["version.h"]
}
...
-#+END_SRC
-
-Besides time stamps there are many other sources of nondeterminism,
-like properties of the build machine (name, number of CPUs available,
-etc), but also subtle ones like ~readdir~ order. Often, those
-non-reproducible parts get buried deeply in a final artifact (like
-the version string embedded in a binary contained in a compressed
-installation archive); and, as long as the non-reproducible action
-stays in cache, it does not even result in bad incrementality.
-Still, others won't be able to reproduce the exact artifact.
-
-There are tools like [[https://diffoscope.org/][diffoscope]] to deeply
+```
+
+Besides time stamps there are many other sources of nondeterminism, like
+properties of the build machine (name, number of CPUs available, etc),
+but also subtle ones like `readdir` order. Often, those non-reproducible
+parts get buried deeply in a final artifact (like the version string
+embedded in a binary contained in a compressed installation archive);
+and, as long as the non-reproducible action stays in cache, it does not
+even result in bad incrementality. Still, others won't be able to
+reproduce the exact artifact.
+
+There are tools like [diffoscope](https://diffoscope.org/) to deeply
compare archives and other container formats. Nevertheless, it is
desirable to find the root causes, i.e., the first (in topological
order) actions that yield a different output.
-** Rebuilding
+Rebuilding
+----------
-For the remainder of this section, we will consider the following example
-project with the C++ source file ~hello.cpp~:
+For the remainder of this section, we will consider the following
+example project with the C++ source file `hello.cpp`:
-#+SRCNAME: hello.cpp
-#+BEGIN_SRC cpp
+``` {.cpp srcname="hello.cpp"}
#include <iostream>
#include "version.h"
@@ -50,12 +52,11 @@ int main(int argc, const char* argv[]) {
}
return 0;
}
-#+END_SRC
+```
-and the following ~TARGETS~ file:
+and the following `TARGETS` file:
-#+SRCNAME: TARGETS
-#+BEGIN_SRC js
+``` {.jsonc srcname="TARGETS"}
{ "":
{ "type": "install"
, "files":
@@ -95,17 +96,17 @@ and the following ~TARGETS~ file:
, "deps": ["out.txt"]
}
}
-#+END_SRC
+```
-To search for the root cause of non-reproducibility, ~just~ has
-a subcommand ~rebuild~. It builds the specified target again, requesting
+To search for the root cause of non-reproducibility, `just` has a
+subcommand `rebuild`. It builds the specified target again, requesting
that every action be executed again (but target-level cache is still
active); then the result of every action is compared to the one in the
action cache, if present with the same inputs. So, you typically would
-first ~build~ and then ~rebuild~. Note that a repeated ~build~ simply
+first `build` and then `rebuild`. Note that a repeated `build` simply
takes the action result from cache.
-#+BEGIN_SRC sh
+``` sh
$ just-mr build
INFO: Requested target is [["@","tutorial","",""],{}]
INFO: Analysed target [["@","tutorial","",""],{}]
@@ -135,30 +136,31 @@ INFO: Export targets found: 0 cached, 0 uncached, 0 not eligible for caching
INFO: Discovered 6 actions, 1 trees, 0 blobs
INFO: Rebuilding [["@","tutorial","",""],{}].
WARN: Found flaky action:
- - id: c854a382ea26628e1a5b8d4af00d6d0cef433436
- - cmd: ["sh","-c","echo '#define VERSION \"0.0.0.'`date +%Y%m%d%H%M%S`'\"' > version.h\n"]
- - output 'version.h' differs:
- - [6aac3477e22cd57e8c98ded78562d3c017e5d611:39:f] (rebuilt)
- - [789a29f39b6aa966f91776bfe092e247614e6acd:39:f] (cached)
+ - id: c854a382ea26628e1a5b8d4af00d6d0cef433436
+ - cmd: ["sh","-c","echo '#define VERSION \"0.0.0.'`date +%Y%m%d%H%M%S`'\"' > version.h\n"]
+ - output 'version.h' differs:
+ - [6aac3477e22cd57e8c98ded78562d3c017e5d611:39:f] (rebuilt)
+ - [789a29f39b6aa966f91776bfe092e247614e6acd:39:f] (cached)
INFO: 2 actions compared with cache, 1 flaky actions found (0 of which tainted), no cache entry found for 4 actions.
INFO: Artifacts built, logical paths are:
bin/hello [73994ff43ec1161aba96708f277e8c88feab0386:16608:x]
share/hello/OUT.txt [428b97b82b6c59cad7488b24e6b618ebbcd819bc:13:f]
share/hello/version.txt [8dd65747395c0feab30891eab9e11d4a9dd0c715:39:f]
$
-#+END_SRC
+```
-In the example, the second action compared to cache is the upper
-casing of the output. Even though the generation of ~out.txt~ depends
-on the non-reproducible ~hello~, the file itself is reproducible.
-Therefore, the follow-up actions are checked as well.
+In the example, the second action compared to cache is the upper casing
+of the output. Even though the generation of `out.txt` depends on the
+non-reproducible `hello`, the file itself is reproducible. Therefore,
+the follow-up actions are checked as well.
-For this simple example, reading the console output is enough to understand
-what's going on. However, checking for reproducibility usually is part
-of a larger, quality-assurance process. To support the automation of such
-processes, the findings can also be reported in machine-readable form.
+For this simple example, reading the console output is enough to
+understand what's going on. However, checking for reproducibility
+usually is part of a larger, quality-assurance process. To support the
+automation of such processes, the findings can also be reported in
+machine-readable form.
-#+BEGIN_SRC sh
+``` sh
$ just-mr rebuild --dump-flaky flakes.json --dump-graph actions.json
[...]
$ cat flakes.json
@@ -186,40 +188,40 @@ $ cat flakes.json
}
}
}$
-#+END_SRC
+```
The file reports the flaky actions together with the non-reproducible
artifacts they generated, reporting both the cached and the newly
-generated output. The files themselves can be obtained via ~just
-install-cas~ as usual, allowing deeper comparison of the outputs.
-The full definitions of the actions can be found in the action graph,
-in the example dumped as well as ~actions.json~; this definition
-also includes the origins for each action, i.e., the configured
-targets that requested the respective action.
+generated output. The files themselves can be obtained via `just
+install-cas` as usual, allowing deeper comparison of the outputs. The
+full definitions of the actions can be found in the action graph (in the
+example also dumped, as `actions.json`); this definition also includes
+the origins for each action, i.e., the configured targets that requested
+the respective action.
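+
+For instance (a sketch reusing the blob identifiers from the report
+above), the two divergent versions of `version.h` can be fetched from
+CAS and compared directly:
+
+``` sh
+$ just install-cas 6aac3477e22cd57e8c98ded78562d3c017e5d611:39:f > version.h.rebuilt
+$ just install-cas 789a29f39b6aa966f91776bfe092e247614e6acd:39:f > version.h.cached
+$ diff version.h.rebuilt version.h.cached
+```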
-
-** Comparing build environments
+Comparing build environments
+----------------------------
Simply rebuilding on the same machine is a good way to detect embedded
time stamps of sufficiently small granularity; for other sources of
-non-reproducibility, however, more modifications of the environment
-are necessary.
-
-A simple, but effective, way for modifying the build environment
-is the option ~-L~ to set the local launcher, a list of
-strings the argument vector is prefixed with before the action is
-executed. The default ~["env", "--"]~ simply resolves the program
-to be executed in the current value of ~PATH~, but a different
-value for the launcher can obviously be used to set environment
-variables like ~LD_PRELOAD~. Relevant libraries and tools
-include [[https://github.com/wolfcw/libfaketime][libfaketime]],
-[[https://github.com/dtcooper/fakehostname][fakehostname]],
-and [[https://salsa.debian.org/reproducible-builds/disorderfs][disorderfs]].
+non-reproducibility, however, more modifications of the environment are
+necessary.
+
+A simple, but effective, way for modifying the build environment is the
+option `-L` to set the local launcher, a list of strings the argument
+vector is prefixed with before the action is executed. The default
+`["env", "--"]` simply resolves the program to be executed in the
+current value of `PATH`, but a different value for the launcher can
+obviously be used to set environment variables like `LD_PRELOAD`.
+Relevant libraries and tools include
+[libfaketime](https://github.com/wolfcw/libfaketime),
+[fakehostname](https://github.com/dtcooper/fakehostname), and
+[disorderfs](https://salsa.debian.org/reproducible-builds/disorderfs).
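+
+As an example (a sketch; the path to the `libfaketime` shared object is
+system-dependent and an assumption here), a rebuild with the clock
+shifted one day into the past could be launched as:
+
+``` sh
+$ just-mr rebuild -L '["env", "LD_PRELOAD=/usr/lib/faketime/libfaketime.so.1", "FAKETIME=-1d", "--"]'
+```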
More variation can be achieved by comparing remote execution builds,
-either for two different remote-execution end points or comparing
-one remote-execution end point to the local build. The latter is
-also a good way to find out where a build that "works on my machine"
-differs. The endpoint on which the rebuild is executed can be set,
-in the same way as for build with the ~-r~ option; the cache end
-point to compare against can be set via the ~--vs~ option.
+either for two different remote-execution end points or comparing one
+remote-execution end point to the local build. The latter is also a good
+way to find out where a build that "works on my machine" differs. The
+endpoint on which the rebuild is executed can be set, in the same way as
+for `build`, with the `-r` option; the cache endpoint to compare against
+can be set via the `--vs` option.
diff --git a/doc/tutorial/target-file-glob-tree.org b/doc/tutorial/target-file-glob-tree.md
index 58e9c725..524cf358 100644
--- a/doc/tutorial/target-file-glob-tree.org
+++ b/doc/tutorial/target-file-glob-tree.md
@@ -1,34 +1,35 @@
-* Target versus ~FILE~, ~GLOB~, and ~TREE~
+Target versus `FILE`, `GLOB`, and `TREE`
+========================================
-So far, we referred to defined targets as well as source files
-by their name and it just worked. When considering third-party
-software we already saw the ~TREE~ reference. In this section, we
-will highlight in more detail the ways to refer to sources, as well
-as the difference between defined and source targets. The latter
-is used, e.g., when third-party software has to be patched.
+So far, we referred to defined targets as well as source files by their
+name and it just worked. When considering third-party software we
+already saw the `TREE` reference. In this section, we will highlight in
+more detail the ways to refer to sources, as well as the difference
+between defined and source targets. The latter distinction matters,
+e.g., when third-party software has to be patched.
-As example for this section we use gnu ~units~ where we want to
-patch into the standard units definition add two units of area
-popular in German news.
+As an example for this section we use GNU `units`, where we want to
+patch the standard units definition to add two units of area popular in
+German news.
-** Repository Config for ~units~ with patches
+Repository Config for `units` with patches
+------------------------------------------
-Before we begin, we first need to declare where the root of our workspace is
-located by creating the empty file ~ROOT~:
+Before we begin, we first need to declare where the root of our
+workspace is located by creating the empty file `ROOT`:
-#+BEGIN_SRC sh
+``` sh
$ touch ROOT
-#+END_SRC
+```
The sources are an archive available on the web. As upstream uses a
-different build system, we have to provide our own build description;
-we take the top-level directory as layer for this. As we also want
-to patch the definition file, we add the subdirectory ~files~ as
-logical repository for the patches. Hence we create a file ~repos.json~
-with the following content.
-
-#+SRCNAME: repos.json
-#+BEGIN_SRC js
+different build system, we have to provide our own build description; we
+take the top-level directory as the layer for this. As we also want to patch
+the definition file, we add the subdirectory `files` as logical
+repository for the patches. Hence we create a file `repos.json` with the
+following content.
+
+``` {.jsonc srcname="repos.json"}
{ "main": "units"
, "repositories":
{ "rules-cc":
@@ -55,31 +56,33 @@ with the following content.
}
}
}
-#+END_SRC
+```
-The repository to set up is ~units~ and, as usual, we can use ~just-mr~ to
-fetch the archive and obtain the resulting multi-repository configuration.
+The repository to set up is `units` and, as usual, we can use `just-mr`
+to fetch the archive and obtain the resulting multi-repository
+configuration.
-#+BEGIN_SRC sh
+``` sh
$ just-mr setup units
-#+END_SRC
+```
-** Patching a file: targets versus ~FILE~
+Patching a file: targets versus `FILE`
+--------------------------------------
-Let's start by patching the source file ~definitions.units~. While,
-conceptionally, we want to patch a third-party source file, we do /not/
+Let's start by patching the source file `definitions.units`. While,
+conceptually, we want to patch a third-party source file, we do *not*
modify the sources. The workspace root is a git tree and stays like this.
-Instead, we remember that we specify /targets/ and the definition of a
+Instead, we remember that we specify *targets* and the definition of a
target is looked up in the targets file; only if not defined there, it
is implicitly considered a source target and taken from the target root.
-So we will define a /target/ named ~definitions.units~ to replace the
+So we will define a *target* named `definitions.units` to replace the
original source file.
-Let's first generate the patch. As we're already referring to source files
-as targets, we have to provide a targets file already; we start with the
-empty object and refine it later.
+Let's first generate the patch. As we're already referring to source
+files as targets, we have to provide a targets file already; we start
+with the empty object and refine it later.
-#+BEGIN_SRC sh
+``` sh
$ echo {} > TARGETS.units
$ just-mr install -o . definitions.units
INFO: Requested target is [["@","units","","definitions.units"],{}]
@@ -100,41 +103,39 @@ $ mkdir files
$ echo {} > files/TARGETS
$ diff -u definitions.units.orig definitions.units > files/definitions.units.diff
$ rm definitions.units*
-#+END_SRC
-
-Our rules conveniently contain a rule ~["patch", "file"]~ to patch
-a single file, and we already created the patch. The only other
-input missing is the source file. So far, we could refer to it as
-~"definitions.units"~ because there was no target of that name, but
-now we're about to define a target with that very name. Fortunately,
-in target files, we can use a special syntax to explicitly refer to
-a source file of the current module, even if there is a target with
-the same name: ~["FILE", null, "definition.units"]~. The syntax
-requires the explicit ~null~ value for the current module, despite
-the fact that explicit file references are only allowed for the
-current module; in this way, the name is a list of length more than
-two and cannot be confused with a top-level module called ~FILE~.
-So we add this target and obtain as ~TARGETS.units~ the following.
-
-#+SRCNAME: TARGETS.units
-#+BEGIN_SRC js
+```
+
+Our rules conveniently contain a rule `["patch", "file"]` to patch a
+single file, and we already created the patch. The only other input
+missing is the source file. So far, we could refer to it as
+`"definitions.units"` because there was no target of that name, but now
+we're about to define a target with that very name. Fortunately, in
+target files, we can use a special syntax to explicitly refer to a
+source file of the current module, even if there is a target with the
+same name: `["FILE", null, "definitions.units"]`. The syntax requires the
+explicit `null` value for the current module, despite the fact that
+explicit file references are only allowed for the current module; in
+this way, the name is a list of length more than two and cannot be
+confused with a top-level module called `FILE`. So we add this target
+and obtain as `TARGETS.units` the following.
+
+``` {.jsonc srcname="TARGETS.units"}
{ "definitions.units":
{ "type": ["@", "rules", "patch", "file"]
, "src": [["FILE", ".", "definitions.units"]]
, "patch": [["@", "patches", "", "definitions.units.diff"]]
}
}
-#+END_SRC
+```
-Analysing ~"definitions.units"~ we find our defined target which
-contains an action output. Still, it looks like a patched source
-file; the new artifact is staged to the original location. Staging
-is also used in the action definition, to avoid magic names (like
-file names starting with ~-~), in-place operations (all actions
-must not modify their inputs) and, in fact, have a
-fixed command line.
+Analysing `"definitions.units"` we find our defined target which
+contains an action output. Still, it looks like a patched source file;
+the new artifact is staged to the original location. Staging is also
+used in the action definition to avoid magic names (like file names
+starting with `-`) and in-place operations (all actions must not modify
+their inputs) and, in fact, to have a fixed command line.
-#+BEGIN_SRC sh
+``` sh
$ just-mr analyse definitions.units --dump-actions -
INFO: Requested target is [["@","units","","definitions.units"],{}]
INFO: Result of target [["@","units","","definitions.units"],{}]: {
@@ -172,11 +173,11 @@ INFO: Actions for target [["@","units","","definitions.units"],{}]:
}
]
$
-#+END_SRC
+```
-Building ~"definitions.units"~ we find out patch applied correctly.
+Building `"definitions.units"` we find out patch applied correctly.
-#+BEGIN_SRC sh
+``` sh
$ just-mr build definitions.units -P definitions.units | grep -A 5 'German units'
INFO: Requested target is [["@","units","","definitions.units"],{}]
INFO: Analysed target [["@","units","","definitions.units"],{}]
@@ -193,24 +194,24 @@ area_soccerfield 105 m * 68 m
area_saarland 2570 km^2
zentner 50 kg
$
-#+END_SRC
+```
-** Globbing source files: ~"GLOB"~
+Globbing source files: `"GLOB"`
+-------------------------------
-Next, we collect all ~.units~ files. We could simply do this by enumerating
-them in a target.
+Next, we collect all `.units` files. We could simply do this by
+enumerating them in a target.
-#+SRCNAME: TARGETS.units
-#+BEGIN_SRC js
+``` {.jsonc srcname="TARGETS.units"}
...
, "data-draft": { "type": "install", "deps": ["definitions.units", "currency.units"]}
...
-#+END_SRC
+```
-In this way, we get the desired collection of one unmodified source file and
-the output of the patch action.
+In this way, we get the desired collection of one unmodified source file
+and the output of the patch action.
-#+BEGIN_SRC sh
+``` sh
$ just-mr analyse data-draft
INFO: Requested target is [["@","units","","data-draft"],{}]
INFO: Result of target [["@","units","","data-draft"],{}]: {
@@ -226,77 +227,76 @@ INFO: Result of target [["@","units","","data-draft"],{}]: {
}
}
$
-#+END_SRC
-
-The disadvantage, however, that we might miss newly added ~.units~
-files if we update and upstream added new files. So we want all
-source files that have the respective ending. The corresponding
-source reference is ~"GLOB"~. A glob expands to the /collection/
-of all /sources/ that are /files/ in the /top-level/ directory of
-the current module and that match the given pattern. It is important
-to understand this in detail and the rational behind it.
-- First of all, the artifact (and runfiles) map has an entry for
- each file that matches. In particular, targets have the option to
- define individual actions for each file, like ~["CC", "binary"]~
- does for the source files. This is different from ~"TREE"~ where
- the artifact map contains a single artifact that happens to be a
- directory. The tree behaviour is preferable when the internals
- of the directory only matter for the execution of actions and not
- for analysis; then there are less entries to carry around during
- analysis and action-key computation, and the whole directory
- is "reserved" for that tree avoid staging conflicts when latter
- adding entries there.
-- As a source reference, a glob expands to explicit source files;
- targets having the same name as a source file are not taken into
- account. In our example, ~["GLOB", null, "*.units"]~ therefore
- contains the unpatched source file ~definitions.units~. In this
- way, we avoid any surprises in the expansion of a glob when a new
- source file is added with a name equal to an already existing target.
-- Only files are considered for matching the glob. Directories
- are ignored.
-- Matches are only considered at the top-level directory. In this
- way, only one directory has to be read during analysis; allowing
- deeper globs would require traversal of subdirectories requiring
- larger cost. While the explicit ~"TREE"~ reference allows recursive
- traversal, in the typical use case of the respective workspace root
- being a ~git~ root, it is actually cheap; we can look up the
- ~git~ tree identifier without traversing the tree. Such a quick
- look up would not be possible if matches had to be selected.
-
-So, ~["GLOB", null, "*.units"]~ expands to all the relevant source
-files; but we still want to keep the patching. Most rules, like ~"install"~,
-disallow staging conflicts to avoid accidentally ignoring a file due
-to conflicting name. In our case, however, the dropping of the source
-file in favour of the patched one is deliberate. For this, there is
-the rule ~["data", "overlay"]~ taking the union of the artifacts of
+```
+
+The disadvantage, however, is that we might miss newly added `.units`
+files if we update and upstream added new files. So we want all source
+files that have the respective ending. The corresponding source reference
+is `"GLOB"`. A glob expands to the *collection* of all *sources* that are
+*files* in the *top-level* directory of the current module and that
+match the given pattern. It is important to understand this in detail
+and the rationale behind it.
+
+ - First of all, the artifact (and runfiles) map has an entry for each
+   file that matches. In particular, targets have the option to define
+   individual actions for each file, like `["CC", "binary"]` does for
+   the source files. This is different from `"TREE"` where the artifact
+   map contains a single artifact that happens to be a directory. The
+   tree behaviour is preferable when the internals of the directory
+   only matter for the execution of actions and not for analysis; then
+   there are fewer entries to carry around during analysis and
+   action-key computation, and the whole directory is "reserved" for
+   that tree, avoiding staging conflicts when later adding entries there.
+ - As a source reference, a glob expands to explicit source files;
+   targets having the same name as a source file are not taken into
+   account. In our example, `["GLOB", null, "*.units"]` therefore
+   contains the unpatched source file `definitions.units`. In this way,
+   we avoid any surprises in the expansion of a glob when a new source
+   file is added with a name equal to an already existing target.
+ - Only files are considered for matching the glob. Directories are
+   ignored.
+ - Matches are only considered at the top-level directory. In this way,
+   only one directory has to be read during analysis; allowing deeper
+   globs would require traversal of subdirectories, incurring a larger
+   cost. While the explicit `"TREE"` reference allows recursive
+   traversal, in the typical use case of the respective workspace root
+   being a `git` root, it is actually cheap; we can look up the `git`
+   tree identifier without traversing the tree. Such a quick look-up
+   would not be possible if matches had to be selected.
+
+So, `["GLOB", null, "*.units"]` expands to all the relevant source
+files; but we still want to keep the patching. Most rules, like
+`"install"`, disallow staging conflicts to avoid accidentally ignoring a
+file due to a conflicting name. In our case, however, the dropping of the
+source file in favour of the patched one is deliberate. For this, there
+is the rule `["data", "overlay"]` taking the union of the artifacts of
the specified targets, accepting conflicts and resolving them in a
-latest-wins fashion. Keep in mind, that our target fields are list,
-not sets. Looking at the definition of the rule, one finds that
-it is simply a ~"map_union"~. Hence we refine our ~"data"~ target.
+latest-wins fashion. Keep in mind that our target fields are lists, not
+sets. Looking at the definition of the rule, one finds that it is simply
+a `"map_union"`. Hence we refine our `"data"` target.
-#+SRCNAME: TARGETS.units
-#+BEGIN_SRC js
+``` {.jsonc srcname="TARGETS.units"}
...
, "data":
{ "type": ["@", "rules", "data", "overlay"]
, "deps": [["GLOB", null, "*.units"], "definitions.units"]
}
...
-#+END_SRC
+```
The result of the analysis, of course, still is the same.
-** Finishing the example: binaries from globbed sources
+Finishing the example: binaries from globbed sources
+----------------------------------------------------
-The source-code organisation of units is pretty simple. All source
-and header files are in the top-level directory. As the header files
-are not in a directory of their own, we can't use a tree, so we use
-a glob, which is fine for the private headers of a binary. For the
-source files, we have to have them individually anyway. So our first
-attempt of defining the binary is as follows.
+The source-code organisation of units is pretty simple. All source and
+header files are in the top-level directory. As the header files are not
+in a directory of their own, we can't use a tree, so we use a glob,
+which is fine for the private headers of a binary. For the source files,
+we have to have them individually anyway. So our first attempt at
+defining the binary is as follows.
-#+SRCNAME: TARGETS.units
-#+BEGIN_SRC js
+``` {.jsonc srcname="TARGETS.units"}
...
, "units-draft":
{ "type": ["@", "rules", "CC", "binary"]
@@ -307,12 +307,12 @@ attempt of defining the binary is as follows.
, "private-hdrs": [["GLOB", null, "*.h"]]
}
...
-#+END_SRC
+```
-The result basically work and shows that we have 5 source files in total,
-giving 5 compile and one link action.
+The result basically works and shows that we have 5 source files in
+total, giving 5 compile actions and one link action.
-#+BEGIN_SRC sh
+``` sh
$ just-mr build units-draft
INFO: Requested target is [["@","units","","units-draft"],{}]
INFO: Analysed target [["@","units","","units-draft"],{}]
@@ -328,13 +328,14 @@ INFO: Processed 6 actions, 0 cache hits.
INFO: Artifacts built, logical paths are:
units [718cb1489bd006082f966ea73e3fba3dd072d084:124488:x]
$
-#+END_SRC
+```
-To keep the build clean, we want to get rid of the warning. Of course, we could
-simply set an appropriate compiler flag, but let's do things properly and patch
-away the underlying reason. To do so, we first create a patch.
+To keep the build clean, we want to get rid of the warning. Of course,
+we could simply set an appropriate compiler flag, but let's do things
+properly and patch away the underlying reason. To do so, we first create
+a patch.
-#+BEGIN_SRC sh
+``` sh
$ just-mr install -o . strfunc.c
INFO: Requested target is [["@","units","","strfunc.c"],{}]
INFO: Analysed target [["@","units","","strfunc.c"],{}]
@@ -353,12 +354,11 @@ $ echo -e "109\ns|N|// N\nw\nq" | ed strfunc.c
$ diff strfunc.c.orig strfunc.c > files/strfunc.c.diff
$ rm strfunc.c*
$
-#+END_SRC
+```
-Then we amend our ~"units"~ target.
+Then we amend our `"units"` target.
-#+SRCNAME: TARGETS.units
-#+BEGIN_SRC js
+``` {.jsonc srcname="TARGETS.units"}
...
, "units":
{ "type": ["@", "rules", "CC", "binary"]
@@ -378,14 +378,15 @@ Then we amend our ~"units"~ target.
, "patch": [["@", "patches", "", "strfunc.c.diff"]]
}
...
-#+END_SRC
+```
-Building the new target, 2 actions have to be executed: the patching, and
-the compiling of the patched source file. As the patched file still generates
-the same object file as the unpatched file (after all, we only wanted to get
-rid of a warning), the linking step can be taken from cache.
+Building the new target, 2 actions have to be executed: the patching,
+and the compiling of the patched source file. As the patched file still
+generates the same object file as the unpatched file (after all, we only
+wanted to get rid of a warning), the linking step can be taken from
+cache.
-#+BEGIN_SRC sh
+``` sh
$ just-mr build units
INFO: Requested target is [["@","units","","units"],{}]
INFO: Analysed target [["@","units","","units"],{}]
@@ -396,22 +397,21 @@ INFO: Processed 7 actions, 5 cache hits.
INFO: Artifacts built, logical paths are:
units [718cb1489bd006082f966ea73e3fba3dd072d084:124488:x]
$
-#+END_SRC
+```
-To finish the example, we also add a default target (using that, if no target
-is specified, ~just~ builds the lexicographically first target), staging
-artifacts according to the usual conventions.
+To finish the example, we also add a default target (exploiting the fact
+that, if no target is specified, `just` builds the lexicographically
+first target), staging artifacts according to the usual conventions.
-#+SRCNAME: TARGETS.units
-#+BEGIN_SRC js
+``` {.jsonc srcname="TARGETS.units"}
...
, "": {"type": "install", "dirs": [["units", "bin"], ["data", "share/units"]]}
...
-#+END_SRC
+```
Then things work as expected:
-#+BEGIN_SRC sh
+``` sh
$ just-mr install -o /tmp/testinstall
INFO: Requested target is [["@","units","",""],{}]
INFO: Analysed target [["@","units","",""],{}]
@@ -427,4 +427,4 @@ $ /tmp/testinstall/bin/units 'area_saarland' 'area_soccerfield'
* 359943.98
/ 2.7782101e-06
$
-#+END_SRC
+```
diff --git a/doc/tutorial/tests.org b/doc/tutorial/tests.md
index d6842ab2..138769b1 100644
--- a/doc/tutorial/tests.org
+++ b/doc/tutorial/tests.md
@@ -1,38 +1,41 @@
-* Creating Tests
+Creating Tests
+==============
-To run tests with justbuild, we do /not/ have a dedicated ~test~
+To run tests with justbuild, we do *not* have a dedicated `test`
subcommand. Instead, we consider tests to be a specific action that
-generates a test report. Consequently, we use the ~build~ subcommand
-to build the test report, and thereby run the test action. Test
-actions, however, are slightly different from normal actions in
-that we don't want the build of the test report to be aborted if
-a test action fails (but still, we want only successfully actions
-taken from cache). Rules defining targets containing such special
-actions have to identify themselves as /tainted/ by specifying
-a string explaining why such special actions are justified; in
-our case, the string is ~"test"~. Besides the implicit marking by
-using a tainted rule, those tainting strings can also be explicitly
-assigned by the user in the definition of a target, e.g., to mark
-test data. Any target has to be tainted with (at least) all the
-strings any of its dependencies is tainted with. In this way, it
-is ensured that no test target will end up in a production build.
-
-For the remainder of this section, we expect to have the project files available
-resulting from successfully completing the tutorial section on /Building C++
-Hello World/. We will demonstrate how to write a test binary for the ~greet~
-library and a shell test for the ~helloworld~ binary.
-
-** Creating a C++ test binary
-
-First, we will create a C++ test binary for testing the correct functionality of
-the ~greet~ library. Therefore, we need to provide a C++ source file that performs
-the actual testing and returns non-~0~ on failure. For simplicity reasons, we do
-not use a testing framework for this tutorial. A simple test that captures
-standard output and verifies it with the expected output should be provided in
-the file ~tests/greet.test.cpp~:
-
-#+SRCNAME: tests/greet.test.cpp
-#+BEGIN_SRC cpp
+generates a test report. Consequently, we use the `build` subcommand to
+build the test report, and thereby run the test action. Test actions,
+however, are slightly different from normal actions in that we don't
+want the build of the test report to be aborted if a test action fails
+(but still, we want only successful actions taken from cache). Rules
+defining targets containing such special actions have to identify
+themselves as *tainted* by specifying a string explaining why such
+special actions are justified; in our case, the string is `"test"`.
+Besides the implicit marking by using a tainted rule, those tainting
+strings can also be explicitly assigned by the user in the definition of
+a target, e.g., to mark test data. Any target has to be tainted with (at
+least) all the strings any of its dependencies is tainted with. In this
+way, it is ensured that no test target will end up in a production
+build.
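+
+As an illustration (a hypothetical target, not part of this tutorial),
+test data could be tainted explicitly via the `"tainted"` field of a
+built-in rule such as `"install"`; any target depending on it would
+then have to be tainted with `"test"` as well:
+
+``` jsonc
+{ "test-data":
+  { "type": "install"
+  , "tainted": ["test"]
+  , "deps": ["golden-output.txt"]
+  }
+}
+```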
+
+For the remainder of this section, we expect to have the project files
+available resulting from successfully completing the tutorial section on
+*Building C++ Hello World*. We will demonstrate how to write a test
+binary for the `greet` library and a shell test for the `helloworld`
+binary.
+
+Creating a C++ test binary
+--------------------------
+
+First, we will create a C++ test binary for testing the correct
+functionality of the `greet` library. Therefore, we need to provide a
+C++ source file that performs the actual testing and returns non-`0` on
+failure. For simplicity, we do not use a testing framework for
+this tutorial. A simple test that captures standard output and verifies
+it against the expected output should be provided in the file
+`tests/greet.test.cpp`:
+
+``` {.cpp srcname="tests/greet.test.cpp"}
#include <functional>
#include <iostream>
#include <string>
@@ -68,15 +71,14 @@ auto test_greet(std::string const& name) -> bool {
int main() {
return test_greet("World") && test_greet("Universe") ? 0 : 1;
}
-#+END_SRC
+```
-Next, a new test target needs to be created in module ~greet~. This target uses
-the rule ~["@", "rules", "CC/test", "test"]~ and needs to depend on the
-~["greet", "greet"]~ target. To create the test target, add the following to
-~tests/TARGETS~:
+Next, a new test target needs to be created in module `greet`. This
+target uses the rule `["@", "rules", "CC/test", "test"]` and needs to
+depend on the `["greet", "greet"]` target. To create the test target,
+add the following to `tests/TARGETS`:
-#+SRCNAME: tests/TARGETS
-#+BEGIN_SRC js
+``` {.jsonc srcname="tests/TARGETS"}
{ "greet":
{ "type": ["@", "rules", "CC/test", "test"]
, "name": ["test_greet"]
@@ -84,35 +86,33 @@ the rule ~["@", "rules", "CC/test", "test"]~ and needs to depend on the
, "private-deps": [["greet", "greet"]]
}
}
-#+END_SRC
+```
-Before we can run the test, a proper default module for ~CC/test~ must be
-provided. By specifying the appropriate target in this module the default test
-runner can be overwritten by a different test runner fom the rule's workspace
-root. Moreover, all test targets share runner infrastructure from ~shell/test~,
-e.g., summarizing multiple runs per test (to detect flakyness) if the configuration
-variable ~RUNS_PER_TEST~ is set.
+Before we can run the test, a proper default module for `CC/test` must
+be provided. By specifying the appropriate target in this module, the
+default test runner can be overwritten by a different test runner from
+the rule's workspace root. Moreover, all test targets share runner
+infrastructure from `shell/test`, e.g., summarizing multiple runs per
+test (to detect flakiness) if the configuration variable `RUNS_PER_TEST`
+is set.
However, in our case, we want to use the default runner and therefore it
is sufficient to create an empty module. To do so, create the file
-~tutorial-defaults/CC/test/TARGETS~ with content
+`tutorial-defaults/CC/test/TARGETS` with content
-#+SRCNAME: tutorial-defaults/CC/test/TARGETS
-#+BEGIN_SRC js
+``` {.jsonc srcname="tutorial-defaults/CC/test/TARGETS"}
{}
-#+END_SRC
+```
-as well as the file ~tutorial-defaults/shell/test/TARGETS~ with content
+as well as the file `tutorial-defaults/shell/test/TARGETS` with content
-#+SRCNAME: tutorial-defaults/shell/test/TARGETS
-#+BEGIN_SRC js
+``` {.jsonc srcname="tutorial-defaults/shell/test/TARGETS"}
{}
-#+END_SRC
-
+```
Now we can run the test (i.e., build the test result):
-#+BEGIN_SRC sh
+``` sh
$ just-mr build tests greet
INFO: Requested target is [["@","tutorial","tests","greet"],{}]
INFO: Analysed target [["@","tutorial","tests","greet"],{}]
@@ -130,41 +130,45 @@ INFO: Artifacts built, logical paths are:
(1 runfiles omitted.)
INFO: Target tainted ["test"].
$
-#+END_SRC
-
-Note that the target is correctly reported as tainted with ~"test"~. It will
-produce 3 additional actions for compiling, linking and running the test binary.
-
-The result of the test target are 5 artifacts: ~result~ (containing ~UNKNOWN~,
-~PASS~, or ~FAIL~), ~stderr~, ~stdout~, ~time-start~, and ~time-stop~, and a
-single runfile (omitted in the output above), which is a tree artifact with the
-name ~test_greet~ that contains all of the above artifacts. The test was run
-successfully as otherwise all reported artifacts would have been reported as
-~FAILED~ in the output, and justbuild would have returned the exit code ~2~.
-
-To immediately print the standard output produced by the test binary on the
-command line, the ~-P~ option can be used. Argument to this option is the name
-of the artifact that should be printed on the command line, in our case
-~stdout~:
-
-#+BEGIN_SRC sh
+```
+
+Note that the target is correctly reported as tainted with `"test"`. It
+will produce 3 additional actions for compiling, linking and running the
+test binary.
+
+The result of the test target consists of 5 artifacts: `result`
+(containing `UNKNOWN`, `PASS`, or `FAIL`), `stderr`, `stdout`,
+`time-start`, and `time-stop`, and a single runfile (omitted in the
+output above), which is a tree artifact with the name `test_greet` that
+contains all of the above artifacts. The test ran successfully;
+otherwise, all reported artifacts would have been reported as `FAILED`
+in the output, and justbuild would have returned the exit code `2`.
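+
+If flakiness is a concern, a test can also be run multiple times by
+setting the configuration variable `RUNS_PER_TEST` mentioned earlier.
+A sketch of such an invocation (using the `-D` option to overlay the
+configuration, here requesting 5 runs) would be:
+
+``` sh
+$ just-mr build tests greet -D '{"RUNS_PER_TEST": 5}'
+```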
+
+To immediately print the standard output produced by the test binary on
+the command line, the `-P` option can be used. The argument to this
+option is the name of the artifact that should be printed on the
+command line, in our case `stdout`:
+
+``` sh
$ just-mr build tests greet --log-limit 1 -P stdout
greet output: Hello World!
greet output: Hello Universe!
$
-#+END_SRC
+```
-Note that ~--log-limit 1~ was just added to omit justbuild's ~INFO:~ prints.
+Note that `--log-limit 1` was just added to omit justbuild's `INFO:`
+prints.
-Our test binary does not have any useful options for directly interacting
-with it. When working with test frameworks, it sometimes can be desirable to
-get hold of the test binary itself for manual interaction. The running of
-the test binary is the last action associated with the test and the test
-binary is, of course, one of its inputs.
+Our test binary does not have any useful options for directly
+interacting with it. When working with test frameworks, it can
+sometimes be desirable to get hold of the test binary itself for manual
+interaction. Running the test binary is the last action associated
+with the test, and the test binary is, of course, one of its inputs.
-#+BEGIN_SRC sh
+``` sh
$ just-mr analyse --request-action-input -1 tests greet
INFO: Requested target is [["@","tutorial","tests","greet"],{}]
INFO: Request is input of action #-1
@@ -197,15 +201,15 @@ INFO: Result of input of action #-1 of target [["@","tutorial","tests","greet"],
}
INFO: Target tainted ["test"].
$
-#+END_SRC
+```
The provided data also shows us the precise description of the action
-for which we request the input. This allows us to manually rerun
-the action. Or we can simply interact with the test binary manually
-after installing the inputs to this action. Requesting the inputs
-of an action can also be useful when debugging a build failure.
+for which we request the input. This allows us to manually rerun the
+action, or to simply interact with the test binary after installing
+the inputs to this action. Requesting the inputs of an action can also
+be useful when debugging a build failure.
-#+BEGIN_SRC sh
+``` sh
$ just-mr install -o work --request-action-input -1 tests greet
INFO: Requested target is [["@","tutorial","tests","greet"],{}]
INFO: Request is input of action #-1
@@ -231,26 +235,25 @@ $ echo $?
0
$ cd ..
$ rm -rf work
-#+END_SRC
+```
-** Creating a shell test
+Creating a shell test
+---------------------
-Similarly, to create a shell test for testing the ~helloworld~ binary, a test
-script ~tests/test_helloworld.sh~ must be provided:
+Similarly, to create a shell test for testing the `helloworld` binary, a
+test script `tests/test_helloworld.sh` must be provided:
-#+SRCNAME: tests/test_helloworld.sh
-#+BEGIN_SRC sh
+``` {.sh srcname="tests/test_helloworld.sh"}
set -e
[ "$(./helloworld)" = "Hello Universe!" ]
-#+END_SRC
+```
The test target for this shell test uses the rule
-~["@", "rules", "shell/test", "script"]~ and must depend on the ~"helloworld"~
-target. To create the test target, add the following to the ~tests/TARGETS~
-file:
+`["@", "rules", "shell/test", "script"]` and must depend on the
+`"helloworld"` target. To create the test target, add the following to
+the `tests/TARGETS` file:
-#+SRCNAME: tests/TARGETS
-#+BEGIN_SRC js
+``` {.jsonc srcname="tests/TARGETS"}
...
, "helloworld":
{ "type": ["@", "rules", "shell/test", "script"]
@@ -259,11 +262,11 @@ file:
, "deps": [["", "helloworld"]]
}
...
-#+END_SRC
+```
Now we can run the shell test (i.e., build the test result):
-#+BEGIN_SRC sh
+``` sh
$ just-mr build tests helloworld
INFO: Requested target is [["@","tutorial","tests","helloworld"],{}]
INFO: Analysed target [["@","tutorial","tests","helloworld"],{}]
@@ -281,29 +284,28 @@ INFO: Artifacts built, logical paths are:
(1 runfiles omitted.)
INFO: Target tainted ["test"].
$
-#+END_SRC
-
-The result is also similar, containing also the 5 artifacts and a single runfile
-(omitted in the output above), which is a tree artifact with the name
-~test_helloworld~ that contains all of the above artifacts.
-
-** Creating a compound test target
-
-As most people probably do not want to call every test target by hand, it is
-desirable to compound test target that triggers the build of multiple test
-reports. To do so, an ~"install"~ target can be used. The field ~"deps"~ of
-an install target is a list of targets for which the runfiles are collected.
-As for the tests the runfiles happen to be
-tree artifacts named the same way as the test and containing all test results,
-this is precisely what we need.
-Furthermore, as the dependent test targets are tainted by ~"test"~, also the
-compound test target must be tainted by the same string. To create the compound
-test target combining the two tests above (the tests ~"greet"~ and
-~"helloworld"~ from module ~"tests"~), add the following to the ~tests/TARGETS~
-file:
-
-#+SRCNAME: tests/TARGETS
-#+BEGIN_SRC js
+```
+
+The result is similar, also containing the 5 artifacts and a single
+runfile (omitted in the output above), which is a tree artifact with the
+name `test_helloworld` that contains all of the above artifacts.
+
+Creating a compound test target
+-------------------------------
+
+As most people probably do not want to call every test target by hand,
+it is desirable to have a compound test target that triggers the build
+of multiple test reports. To do so, an `"install"` target can be used.
+The field `"deps"` of an install target is a list of targets for which
+the runfiles are collected. As, for tests, the runfiles happen to be
+tree artifacts named the same way as the test and containing all test
+results, this is precisely what we need. Furthermore, as the dependent
+test targets are tainted by `"test"`, the compound test target must
+also be tainted by the same string. To create the compound test target
+combining the two tests above (the tests `"greet"` and `"helloworld"`
+from module `"tests"`), add the following to the `tests/TARGETS` file:
+
+``` {.jsonc srcname="tests/TARGETS"}
...
, "ALL":
{ "type": "install"
@@ -311,12 +313,12 @@ file:
, "deps": ["greet", "helloworld"]
}
...
-#+END_SRC
+```
-Now we can run all tests at once by just building the compound test target
-~"ALL"~:
+Now we can run all tests at once by just building the compound test
+target `"ALL"`:
-#+BEGIN_SRC sh
+``` sh
$ just-mr build tests ALL
INFO: Requested target is [["@","tutorial","tests","ALL"],{}]
INFO: Analysed target [["@","tutorial","tests","ALL"],{}]
@@ -330,8 +332,8 @@ INFO: Artifacts built, logical paths are:
test_helloworld [63fa5954161b52b275b05c270e1626feaa8e178b:177:t]
INFO: Target tainted ["test"].
$
-#+END_SRC
+```
-As a result it reports the runfiles (result directories) of both tests as
-artifacts. Both tests ran successfully as none of those artifacts in this output
-above are tagged as ~FAILED~.
+As a result, it reports the runfiles (result directories) of both tests
+as artifacts. Both tests ran successfully, as none of the artifacts in
+the output above are tagged as `FAILED`.
diff --git a/doc/tutorial/third-party-software.md b/doc/tutorial/third-party-software.md
new file mode 100644
index 00000000..daaf5b2d
--- /dev/null
+++ b/doc/tutorial/third-party-software.md
@@ -0,0 +1,473 @@
+Building Third-party Software
+=============================
+
+Third-party projects usually ship with their own build description,
+which often happens to be incompatible with justbuild. Nevertheless,
+it is highly desirable to include external projects via their source
+code base, instead of relying on the integration of out-of-band binary
+distributions. justbuild offers a flexible approach to provide the
+required build description via an overlay layer without the need to
+touch the original code base.
+
+For the remainder of this section, we expect to have the project files
+available resulting from successfully completing the tutorial section on
+*Building C++ Hello World*. We will demonstrate how to use the
+open-source project [fmtlib](https://github.com/fmtlib/fmt) as an
+example for integrating third-party software into a justbuild project.
+
+Creating the target overlay layer for fmtlib
+--------------------------------------------
+
+Before we construct the overlay layer for fmtlib, we need to determine
+its file structure ([tag
+8.1.1](https://github.com/fmtlib/fmt/tree/8.1.1)). The relevant header
+and source files are structured as follows:
+
+ fmt
+ |
+ +--include
+ | +--fmt
+ | +--*.h
+ |
+ +--src
+ +--format.cc
+ +--os.cc
+
+The public headers can be found in `include/fmt`, while the library's
+source files are located in `src`. For the overlay layer, the `TARGETS`
+files should be placed in a tree structure that resembles the original
+code base's structure. It is also good practice to provide a top-level
+`TARGETS` file, leading to the following structure for the overlay:
+
+ fmt-layer
+ |
+ +--TARGETS
+ +--include
+ | +--fmt
+ | +--TARGETS
+ |
+ +--src
+ +--TARGETS
+
+Let's create the overlay structure:
+
+``` sh
+$ mkdir -p fmt-layer/include/fmt
+$ mkdir -p fmt-layer/src
+```
+
+The directory `include/fmt` contains only header files. As we want all
+files in this directory to be included in the `"hdrs"` target, we can
+safely use the explicit `TREE` reference[^1], which collects, in a
+single artifact (describing a directory) *all* directory contents from
+`"."` of the workspace root. Note that the `TARGETS` file is only part
+of the overlay, and therefore will not be part of this tree.
+Furthermore, this tree should be staged to `"fmt"`, so that any consumer
+can include those headers via `<fmt/...>`. The resulting header
+directory target `"hdrs"` in `include/fmt/TARGETS` should be described
+as:
+
+``` {.jsonc srcname="fmt-layer/include/fmt/TARGETS"}
+{ "hdrs":
+ { "type": ["@", "rules", "data", "staged"]
+ , "srcs": [["TREE", null, "."]]
+ , "stage": ["fmt"]
+ }
+}
+```
+
+The actual library target is defined in the directory `src`. For the
+public headers, it refers to the previously created `"hdrs"` target via
+its fully-qualified target name (`["include/fmt", "hdrs"]`). Source
+files are the two local files `format.cc`, and `os.cc`. The final target
+description in `src/TARGETS` will look like this:
+
+``` {.jsonc srcname="fmt-layer/src/TARGETS"}
+{ "fmt":
+ { "type": ["@", "rules", "CC", "library"]
+ , "name": ["fmt"]
+ , "hdrs": [["include/fmt", "hdrs"]]
+ , "srcs": ["format.cc", "os.cc"]
+ }
+}
+```
+
+Finally, the top-level `TARGETS` file can be created. While it is
+technically not strictly required, it is considered good practice to
+*export* every target that may be used by another project. Exported
+targets are subject to high-level target caching, which allows skipping
+the analysis and traversal of entire subgraphs in the action graph.
+Therefore, we create an export target that exports the target
+`["src", "fmt"]`, with only the variables in the field
+`"flexible_config"` being propagated. The top-level `TARGETS` file
+contains the following content:
+
+``` {.jsonc srcname="fmt-layer/TARGETS"}
+{ "fmt":
+ { "type": "export"
+ , "target": ["src", "fmt"]
+ , "flexible_config": ["CXX", "CXXFLAGS", "ADD_CXXFLAGS", "AR", "ENV"]
+ }
+}
+```
+
+After adding the library to the multi-repository configuration (next
+step), the list of configuration variables a target, like `["src",
+"fmt"]`, actually depends on can be obtained using the `--dump-vars`
+option of the `analyse` subcommand. In this way, an informed decision
+can be taken when deciding which variables of the export target to make
+tunable for the consumer.
+
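+Once the `"fmtlib"` repository is set up as described next, a sketch of
+such an analysis (assuming `-` requests the dump on stdout) would be:
+
+``` sh
+$ just-mr --main fmtlib analyse fmt --dump-vars -
+```
+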
+Adding fmtlib to the Multi-Repository Configuration
+---------------------------------------------------
+
+Based on the *hello world* tutorial, we can extend the existing
+`repos.json` by the layer definition `"fmt-targets-layer"` and the
+repository `"fmtlib"`, which is based on the Git repository with its
+target root being overlaid. Furthermore, we want to use `"fmtlib"` in
+the repository `"tutorial"`, and therefore need to introduce an
+additional binding `"format"` for it:
+
+``` {.jsonc srcname="repos.json"}
+{ "main": "tutorial"
+, "repositories":
+ { "rules-cc":
+ { "repository":
+ { "type": "git"
+ , "branch": "master"
+ , "commit": "123d8b03bf2440052626151c14c54abce2726e6f"
+ , "repository": "https://github.com/just-buildsystem/rules-cc.git"
+ , "subdir": "rules"
+ }
+ , "target_root": "tutorial-defaults"
+ , "rule_root": "rules-cc"
+ }
+ , "tutorial":
+ { "repository": {"type": "file", "path": "."}
+ , "bindings": {"rules": "rules-cc", "format": "fmtlib"}
+ }
+ , "tutorial-defaults":
+ { "repository": {"type": "file", "path": "./tutorial-defaults"}
+ }
+ , "fmt-targets-layer":
+ { "repository": {"type": "file", "path": "./fmt-layer"}
+ }
+ , "fmtlib":
+ { "repository":
+ { "type": "git"
+ , "branch": "8.1.1"
+ , "commit": "b6f4ceaed0a0a24ccf575fab6c56dd50ccf6f1a9"
+ , "repository": "https://github.com/fmtlib/fmt.git"
+ }
+ , "target_root": "fmt-targets-layer"
+ , "bindings": {"rules": "rules-cc"}
+ }
+ }
+}
+```
+
+This `"format"` binding can you be used to add a new private dependency
+in `greet/TARGETS`:
+
+``` {.jsonc srcname="greet/TARGETS"}
+{ "greet":
+ { "type": ["@", "rules", "CC", "library"]
+ , "name": ["greet"]
+ , "hdrs": ["greet.hpp"]
+ , "srcs": ["greet.cpp"]
+ , "stage": ["greet"]
+ , "private-deps": [["@", "format", "", "fmt"]]
+ }
+}
+```
+
+Consequently, the `fmtlib` library can now be used by `greet/greet.cpp`:
+
+``` {.cpp srcname="greet/greet.cpp"}
+#include "greet.hpp"
+#include <fmt/format.h>
+
+void greet(std::string const& s) {
+ fmt::print("Hello {}!\n", s);
+}
+```
+
+Due to the changes made to `repos.json`, building this tutorial
+requires rerunning `just-mr`, which will fetch the necessary sources
+for the external repositories:
+
+``` sh
+$ just-mr build helloworld
+INFO: Requested target is [["@","tutorial","","helloworld"],{}]
+INFO: Analysed target [["@","tutorial","","helloworld"],{}]
+INFO: Export targets found: 0 cached, 0 uncached, 1 not eligible for caching
+INFO: Discovered 7 actions, 3 trees, 0 blobs
+INFO: Building [["@","tutorial","","helloworld"],{}].
+INFO: Processed 7 actions, 1 cache hits.
+INFO: Artifacts built, logical paths are:
+ helloworld [0ec4e36cfb5f2c3efa0fff789349a46694a6d303:132736:x]
+$
+```
+
+Note that to build the `fmt` target alone, its containing repository
+`fmtlib` must be specified via the `--main` option:
+
+``` sh
+$ just-mr --main fmtlib build fmt
+INFO: Requested target is [["@","fmtlib","","fmt"],{}]
+INFO: Analysed target [["@","fmtlib","","fmt"],{}]
+INFO: Export targets found: 0 cached, 0 uncached, 1 not eligible for caching
+INFO: Discovered 3 actions, 1 trees, 0 blobs
+INFO: Building [["@","fmtlib","","fmt"],{}].
+INFO: Processed 3 actions, 3 cache hits.
+INFO: Artifacts built, logical paths are:
+ libfmt.a [513b2ac17c557675fc841f3ebf279003ff5a73ae:240914:f]
+ (1 runfiles omitted.)
+$
+```
+
+Employing high-level target caching
+-----------------------------------
+
+To make use of high-level target caching for exported targets, we need
+to ensure that all inputs to an export target are transitively
+content-fixed. This is automatically the case for `"type":"git"`
+repositories. However, the `fmtlib` repository also depends on
+`"rules-cc"`, `"tutorial-defaults"`, and `"fmt-targets-layer"`. As the
+latter two are `"type":"file"` repositories, they must be put under Git
+versioning first:
+
+``` sh
+$ git init .
+$ git add tutorial-defaults fmt-layer
+$ git commit -m"fix compile flags and fmt targets layer"
+```
+
+Note that `rules-cc` is already under Git versioning.
+
+Now, to instruct `just-mr` to use the content-fixed, committed source
+trees of those `"type":"file"` repositories, the pragma `"to_git"` must
+be set for them in `repos.json`:
+
+``` {.jsonc srcname="repos.json"}
+{ "main": "tutorial"
+, "repositories":
+ { "rules-cc":
+ { "repository":
+ { "type": "git"
+ , "branch": "master"
+ , "commit": "123d8b03bf2440052626151c14c54abce2726e6f"
+ , "repository": "https://github.com/just-buildsystem/rules-cc.git"
+ , "subdir": "rules"
+ }
+ , "target_root": "tutorial-defaults"
+ , "rule_root": "rules-cc"
+ }
+ , "tutorial":
+ { "repository": {"type": "file", "path": "."}
+ , "bindings": {"rules": "rules-cc", "format": "fmtlib"}
+ }
+ , "tutorial-defaults":
+ { "repository":
+ { "type": "file"
+ , "path": "./tutorial-defaults"
+ , "pragma": {"to_git": true}
+ }
+ }
+ , "fmt-targets-layer":
+ { "repository":
+ { "type": "file"
+ , "path": "./fmt-layer"
+ , "pragma": {"to_git": true}
+ }
+ }
+ , "fmtlib":
+ { "repository":
+ { "type": "git"
+ , "branch": "master"
+ , "commit": "b6f4ceaed0a0a24ccf575fab6c56dd50ccf6f1a9"
+ , "repository": "https://github.com/fmtlib/fmt.git"
+ }
+ , "target_root": "fmt-targets-layer"
+ , "bindings": {"rules": "rules-cc"}
+ }
+ }
+}
+```
+
+Due to the changes in the repository configuration, we need to rebuild;
+the benefits of the target cache should then be visible on the second
+build:
+
+``` sh
+$ just-mr build helloworld
+INFO: Requested target is [["@","tutorial","","helloworld"],{}]
+INFO: Analysed target [["@","tutorial","","helloworld"],{}]
+INFO: Export targets found: 0 cached, 1 uncached, 0 not eligible for caching
+INFO: Discovered 7 actions, 3 trees, 0 blobs
+INFO: Building [["@","tutorial","","helloworld"],{}].
+INFO: Processed 7 actions, 7 cache hits.
+INFO: Artifacts built, logical paths are:
+ helloworld [0ec4e36cfb5f2c3efa0fff789349a46694a6d303:132736:x]
+$
+$ just-mr build helloworld
+INFO: Requested target is [["@","tutorial","","helloworld"],{}]
+INFO: Analysed target [["@","tutorial","","helloworld"],{}]
+INFO: Export targets found: 1 cached, 0 uncached, 0 not eligible for caching
+INFO: Discovered 4 actions, 2 trees, 0 blobs
+INFO: Building [["@","tutorial","","helloworld"],{}].
+INFO: Processed 4 actions, 4 cache hits.
+INFO: Artifacts built, logical paths are:
+ helloworld [0ec4e36cfb5f2c3efa0fff789349a46694a6d303:132736:x]
+$
+```
+
+Note that in the second run the export target `"fmt"` was taken from
+cache and its 3 actions were eliminated, as their result has been
+recorded to the high-level target cache during the first run.
+
+Combining overlay layers for multiple projects
+----------------------------------------------
+
+Projects typically depend on multiple external repositories. Creating an
+overlay layer for each external repository might unnecessarily clutter
+up the repository configuration and the file structure of your
+repository. One solution to mitigate this issue is to combine the
+`TARGETS` files of multiple external repositories in a single overlay
+layer. To avoid conflicts, the `TARGETS` files can be assigned different
+file names per repository. As an example, imagine a common overlay layer
+with the files `TARGETS.fmt` and `TARGETS.gsl` for the repositories
+`"fmtlib"` and `"gsl-lite"`, respectively:
+
+ common-layer
+ |
+ +--TARGETS.fmt
+ +--TARGETS.gsl
+ +--include
+ | +--fmt
+ | | +--TARGETS.fmt
+ | +--gsl
+ | +--TARGETS.gsl
+ |
+ +--src
+ +--TARGETS.fmt
+
+Such a common overlay layer can be used as the target root for both
+repositories with only one difference: the `"target_file_name"` field.
+Specifying this field implements the dispatch that determines where to
+find the respective target description for each repository. For the
+given example, the following `repos.json` defines the overlay
+`"common-targets-layer"`, which is used by `"fmtlib"` and `"gsl-lite"`:
+
+``` {.jsonc srcname="repos.json"}
+{ "main": "tutorial"
+, "repositories":
+ { "rules-cc":
+ { "repository":
+ { "type": "git"
+ , "branch": "master"
+ , "commit": "123d8b03bf2440052626151c14c54abce2726e6f"
+ , "repository": "https://github.com/just-buildsystem/rules-cc.git"
+ , "subdir": "rules"
+ }
+ , "target_root": "tutorial-defaults"
+ , "rule_root": "rules-cc"
+ }
+ , "tutorial":
+ { "repository": {"type": "file", "path": "."}
+ , "bindings": {"rules": "rules-cc", "format": "fmtlib"}
+ }
+ , "tutorial-defaults":
+ { "repository":
+ { "type": "file"
+ , "path": "./tutorial-defaults"
+ , "pragma": {"to_git": true}
+ }
+ }
+ , "common-targets-layer":
+ { "repository":
+ { "type": "file"
+ , "path": "./common-layer"
+ , "pragma": {"to_git": true}
+ }
+ }
+ , "fmtlib":
+ { "repository":
+ { "type": "git"
+ , "branch": "8.1.1"
+ , "commit": "b6f4ceaed0a0a24ccf575fab6c56dd50ccf6f1a9"
+ , "repository": "https://github.com/fmtlib/fmt.git"
+ }
+ , "target_root": "common-targets-layer"
+ , "target_file_name": "TARGETS.fmt"
+ , "bindings": {"rules": "rules-cc"}
+ }
+ , "gsl-lite":
+ { "repository":
+ { "type": "git"
+ , "branch": "v0.40.0"
+ , "commit": "d6c8af99a1d95b3db36f26b4f22dc3bad89952de"
+ , "repository": "https://github.com/gsl-lite/gsl-lite.git"
+ }
+ , "target_root": "common-targets-layer"
+ , "target_file_name": "TARGETS.gsl"
+ , "bindings": {"rules": "rules-cc"}
+ }
+ }
+}
+```
+
+Using pre-built dependencies
+----------------------------
+
+While building external dependencies from source brings advantages, most
+prominently the flexibility to quickly and seamlessly switch to a
+different build configuration (production, debug, instrumented for
+performance analysis; cross-compiling for a different target
+architecture), there are also legitimate reasons to use pre-built
+dependencies. The most prominent one is when your project is packaged
+as part of a larger distribution. For that reason, just also has (in
+`etc/import.prebuilt`) target files for all its dependencies, assuming
+they are pre-installed. The reason why target files are used at all in
+this situation is twofold.
+
+ - On the one hand, having a target allows the remaining targets to not
+ care about where their dependencies come from, or if it is a build
+ against pre-installed dependencies or not. Also, the top-level
+ binary does not have to know the linking requirements of its
+ transitive dependencies. In other words, information stays where it
+ belongs to and if one target acquires a new dependency, the
+ information is automatically propagated to all targets using it.
+ - Still, some information is needed to use a pre-installed library and,
+ as explained, a target describing the pre-installed library is the
+ right place to collect this information.
+ - The public header files of the library. By having this explicit,
+ we do not accumulate directories in the include search path and
+ hence also properly detect include conflicts.
+ - The information on how to link the library itself (i.e.,
+ basically its base name).
+ - Any dependencies on other libraries that the library might have.
+ This information is used to obtain the correct linking order and
+ complete transitive linking arguments while keeping the
+ description maintainable, as each target still only declares its
+ direct dependencies.
+
+The target description for a pre-built version of the format library
+that was used as an example in this section is shown next; with our
+staging mechanism the logical repository it belongs to is rooted in the
+`fmt` subdirectory of the `include` directory of the ambient system.
+
+``` {.jsonc srcname="etc/import.prebuilt/TARGETS.fmt"}
+{ "fmt":
+ { "type": ["@", "rules", "CC", "library"]
+ , "name": ["fmt"]
+ , "stage": ["fmt"]
+ , "hdrs": [["TREE", null, "."]]
+ , "private-ldflags": ["-lfmt"]
+ }
+}
+```
+
+[^1]: Explicit `TREE` references are always a list of length 3, to
+ distinguish them from target references of length 2 (module and
+ target name). Furthermore, the second list element is always `null`
+ as we only want to allow tree references from the current module.
diff --git a/doc/tutorial/third-party-software.org b/doc/tutorial/third-party-software.org
deleted file mode 100644
index d1712cc8..00000000
--- a/doc/tutorial/third-party-software.org
+++ /dev/null
@@ -1,475 +0,0 @@
-* Building Third-party Software
-
-Third-party projects usually ship with their own build description, which often
-happens to be not compatible with justbuild. Nevertheless, it is highly
-desireable to include external projects via their source code base, instead of
-relying on the integration of out-of-band binary distributions. justbuild offers
-a flexible approach to provide the required build description via an overlay
-layer without the need to touch the original code base.
-
-For the remainder of this section, we expect to have the project files available
-resulting from successfully completing the tutorial section on /Building C++
-Hello World/. We will demonstrate how to use the open-source project
-[[https://github.com/fmtlib/fmt][fmtlib]] as an example for integrating
-third-party software to a justbuild project.
-
-** Creating the target overlay layer for fmtlib
-
-Before we construct the overlay layer for fmtlib, we need to determine its file
-structure ([[https://github.com/fmtlib/fmt/tree/8.1.1][tag 8.1.1]]). The
-relevant header and source files are structured as follows:
-
-#+BEGIN_SRC
- fmt
- |
- +--include
- | +--fmt
- | +--*.h
- |
- +--src
- +--format.cc
- +--os.cc
-#+END_SRC
-
-The public headers can be found in ~include/fmt~, while the library's source
-files are located in ~src~. For the overlay layer, the ~TARGETS~ files should be
-placed in a tree structure that resembles the original code base's structure.
-It is also good practice to provide a top-level ~TARGETS~ file, leading to the
-following structure for the overlay:
-
-#+BEGIN_SRC
- fmt-layer
- |
- +--TARGETS
- +--include
- | +--fmt
- | +--TARGETS
- |
- +--src
- +--TARGETS
-#+END_SRC
-
-Let's create the overlay structure:
-
-#+BEGIN_SRC sh
-$ mkdir -p fmt-layer/include/fmt
-$ mkdir -p fmt-layer/src
-#+END_SRC
-
-The directory ~include/fmt~ contains only header files. As we want all files in
-this directory to be included in the ~"hdrs"~ target, we can safely
-use the explicit ~TREE~ reference[fn:1], which collects, in a single
-artifact (describing a directory) /all/ directory contents
-from ~"."~ of the workspace root. Note that the ~TARGETS~ file is only part of
-the overlay, and
-therefore will not be part of this tree. Furthermore, this tree should be staged
-to ~"fmt"~, so that any consumer can include those headers via ~<fmt/...>~. The
-resulting header directory target ~"hdrs"~ in ~include/fmt/TARGETS~ should be
-described as:
-
-[fn:1] Explicit ~TREE~ references are always a list of length 3, to distinguish
-them from target references of length 2 (module and target name). Furthermore,
-the second list element is always ~null~ as we only want to allow tree
-references from the current module.
-
-
-#+SRCNAME: fmt-layer/include/fmt/TARGETS
-#+BEGIN_SRC js
-{ "hdrs":
- { "type": ["@", "rules", "data", "staged"]
- , "srcs": [["TREE", null, "."]]
- , "stage": ["fmt"]
- }
-}
-#+END_SRC
-
-The actual library target is defined in the directory ~src~. For the public
-headers, it refers to the previously created ~"hdrs"~ target via its
-fully-qualified target name (~["include/fmt", "hdrs"]~). Source files are the
-two local files ~format.cc~, and ~os.cc~. The final target description in
-~src/TARGETS~ will look like this:
-
-#+SRCNAME: fmt-layer/src/TARGETS
-#+BEGIN_SRC js
-{ "fmt":
- { "type": ["@", "rules", "CC", "library"]
- , "name": ["fmt"]
- , "hdrs": [["include/fmt", "hdrs"]]
- , "srcs": ["format.cc", "os.cc"]
- }
-}
-#+END_SRC
-
-Finally, the top-level ~TARGETS~ file can be created. While it is technically
-not strictly required, it is considered good practice to /export/ every target
-that may be used by another project. Exported targets are subject to high-level
-target caching, which allows to skip the analysis and traversal of entire
-subgraphs in the action graph. Therefore, we create an export target that
-exports the target ~["src", "fmt"]~, with only the variables in the field
-~"flexible_config"~ being propagated. The top-level ~TARGETS~ file contains the
-following content:
-
-#+SRCNAME: fmt-layer/TARGETS
-#+BEGIN_SRC js
-{ "fmt":
- { "type": "export"
- , "target": ["src", "fmt"]
- , "flexible_config": ["CXX", "CXXFLAGS", "ADD_CXXFLAGS", "AR", "ENV"]
- }
-}
-#+END_SRC
-
-After adding the library to the multi-repository configuration (next
-step), the list of configuration variables a target, like ~["src",
-"fmt"]~, actually depends on can be obtained using the ~--dump-vars~
-option of the ~analyse~ subcommand. In this way, an informed decision
-can be taken when deciding which variables of the export target to
-make tunable for the consumer.
-
-** Adding fmtlib to the Multi-Repository Configuration
-
-Based on the /hello world/ tutorial, we can extend the existing ~repos.json~ by
-the layer definition ~"fmt-targets-layer"~ and the repository ~"fmtlib"~, which
-is based on the Git repository with its target root being overlayed.
-Furthermore, we want to use ~"fmtlib"~ in the repository ~"tutorial"~, and
-therefore need to introduce an additional binding ~"format"~ for it:
-
-#+SRCNAME: repos.json
-#+BEGIN_SRC js
-{ "main": "tutorial"
-, "repositories":
- { "rules-cc":
- { "repository":
- { "type": "git"
- , "branch": "master"
- , "commit": "123d8b03bf2440052626151c14c54abce2726e6f"
- , "repository": "https://github.com/just-buildsystem/rules-cc.git"
- , "subdir": "rules"
- }
- , "target_root": "tutorial-defaults"
- , "rule_root": "rules-cc"
- }
- , "tutorial":
- { "repository": {"type": "file", "path": "."}
- , "bindings": {"rules": "rules-cc", "format": "fmtlib"}
- }
- , "tutorial-defaults":
- { "repository": {"type": "file", "path": "./tutorial-defaults"}
- }
- , "fmt-targets-layer":
- { "repository": {"type": "file", "path": "./fmt-layer"}
- }
- , "fmtlib":
- { "repository":
- { "type": "git"
- , "branch": "8.1.1"
- , "commit": "b6f4ceaed0a0a24ccf575fab6c56dd50ccf6f1a9"
- , "repository": "https://github.com/fmtlib/fmt.git"
- }
- , "target_root": "fmt-targets-layer"
- , "bindings": {"rules": "rules-cc"}
- }
- }
-}
-#+END_SRC
-
-This ~"format"~ binding can you be used to add a new private dependency in
-~greet/TARGETS~:
-
-#+SRCNAME: greet/TARGETS
-#+BEGIN_SRC js
-{ "greet":
- { "type": ["@", "rules", "CC", "library"]
- , "name": ["greet"]
- , "hdrs": ["greet.hpp"]
- , "srcs": ["greet.cpp"]
- , "stage": ["greet"]
- , "private-deps": [["@", "format", "", "fmt"]]
- }
-}
-#+END_SRC
-
-Consequently, the ~fmtlib~ library can now be used by ~greet/greet.cpp~:
-
-#+SRCNAME: greet/greet.cpp
-#+BEGIN_SRC cpp
-#include "greet.hpp"
-#include <fmt/format.h>
-
-void greet(std::string const& s) {
- fmt::print("Hello {}!\n", s);
-}
-#+END_SRC
-
-Due to changes made to ~repos.json~, building this tutorial requires to rerun
-~just-mr~, which will fetch the necessary sources for the external repositories:
-
-#+BEGIN_SRC sh
-$ just-mr build helloworld
-INFO: Requested target is [["@","tutorial","","helloworld"],{}]
-INFO: Analysed target [["@","tutorial","","helloworld"],{}]
-INFO: Export targets found: 0 cached, 0 uncached, 1 not eligible for caching
-INFO: Discovered 7 actions, 3 trees, 0 blobs
-INFO: Building [["@","tutorial","","helloworld"],{}].
-INFO: Processed 7 actions, 1 cache hits.
-INFO: Artifacts built, logical paths are:
- helloworld [0ec4e36cfb5f2c3efa0fff789349a46694a6d303:132736:x]
-$
-#+END_SRC
-
-Note to build the ~fmt~ target alone, its containing repository ~fmtlib~ must be
-specified via the ~--main~ option:
-#+BEGIN_SRC sh
-$ just-mr --main fmtlib build fmt
-INFO: Requested target is [["@","fmtlib","","fmt"],{}]
-INFO: Analysed target [["@","fmtlib","","fmt"],{}]
-INFO: Export targets found: 0 cached, 0 uncached, 1 not eligible for caching
-INFO: Discovered 3 actions, 1 trees, 0 blobs
-INFO: Building [["@","fmtlib","","fmt"],{}].
-INFO: Processed 3 actions, 3 cache hits.
-INFO: Artifacts built, logical paths are:
- libfmt.a [513b2ac17c557675fc841f3ebf279003ff5a73ae:240914:f]
- (1 runfiles omitted.)
-$
-#+END_SRC
-
-** Employing high-level target caching
-
-The make use of high-level target caching for exported targets, we need to
-ensure that all inputs to an export target are transitively content-fixed. This
-is automatically the case for ~"type":"git"~ repositories. However, the ~libfmt~
-repository also depends on ~"rules-cc"~, ~"tutorial-defaults"~, and
-~"fmt-target-layer"~. As the latter two are ~"type":"file"~ repositories, they
-must be put under Git versioning first:
-
-#+BEGIN_SRC sh
-$ git init .
-$ git add tutorial-defaults fmt-layer
-$ git commit -m"fix compile flags and fmt targets layer"
-#+END_SRC
-
-Note that ~rules-cc~ already is under Git versioning.
-
-Now, to instruct ~just-mr~ to use the content-fixed, committed source trees of
-those ~"type":"file"~ repositories the pragma ~"to_git"~ must be set for them in
-~repos.json~:
-
-#+SRCNAME: repos.json
-#+BEGIN_SRC js
-{ "main": "tutorial"
-, "repositories":
- { "rules-cc":
- { "repository":
- { "type": "git"
- , "branch": "master"
- , "commit": "123d8b03bf2440052626151c14c54abce2726e6f"
- , "repository": "https://github.com/just-buildsystem/rules-cc.git"
- , "subdir": "rules"
- }
- , "target_root": "tutorial-defaults"
- , "rule_root": "rules-cc"
- }
- , "tutorial":
- { "repository": {"type": "file", "path": "."}
- , "bindings": {"rules": "rules-cc", "format": "fmtlib"}
- }
- , "tutorial-defaults":
- { "repository":
- { "type": "file"
- , "path": "./tutorial-defaults"
- , "pragma": {"to_git": true}
- }
- }
- , "fmt-targets-layer":
- { "repository":
- { "type": "file"
- , "path": "./fmt-layer"
- , "pragma": {"to_git": true}
- }
- }
- , "fmtlib":
- { "repository":
- { "type": "git"
- , "branch": "master"
- , "commit": "b6f4ceaed0a0a24ccf575fab6c56dd50ccf6f1a9"
- , "repository": "https://github.com/fmtlib/fmt.git"
- }
- , "target_root": "fmt-targets-layer"
- , "bindings": {"rules": "rules-cc"}
- }
- }
-}
-#+END_SRC
-
-Due to changes in the repository configuration, we need to rebuild and the
-benefits of the target cache should be visible on the second build:
-
-#+BEGIN_SRC sh
-$ just-mr build helloworld
-INFO: Requested target is [["@","tutorial","","helloworld"],{}]
-INFO: Analysed target [["@","tutorial","","helloworld"],{}]
-INFO: Export targets found: 0 cached, 1 uncached, 0 not eligible for caching
-INFO: Discovered 7 actions, 3 trees, 0 blobs
-INFO: Building [["@","tutorial","","helloworld"],{}].
-INFO: Processed 7 actions, 7 cache hits.
-INFO: Artifacts built, logical paths are:
- helloworld [0ec4e36cfb5f2c3efa0fff789349a46694a6d303:132736:x]
-$
-$ just-mr build helloworld
-INFO: Requested target is [["@","tutorial","","helloworld"],{}]
-INFO: Analysed target [["@","tutorial","","helloworld"],{}]
-INFO: Export targets found: 1 cached, 0 uncached, 0 not eligible for caching
-INFO: Discovered 4 actions, 2 trees, 0 blobs
-INFO: Building [["@","tutorial","","helloworld"],{}].
-INFO: Processed 4 actions, 4 cache hits.
-INFO: Artifacts built, logical paths are:
- helloworld [0ec4e36cfb5f2c3efa0fff789349a46694a6d303:132736:x]
-$
-#+END_SRC
-
-Note that in the second run the export target ~"fmt"~ was taken from cache and
-its 3 actions were eliminated, as their result has been recorded to the
-high-level target cache during the first run.
-
-** Combining overlay layers for multiple projects
-
-Projects typically depend on multiple external repositories. Creating an overlay
-layer for each external repository might unnecessarily clutter up the repository
-configuration and the file structure of your repository. One solution to
-mitigate this issue is to combine the ~TARGETS~ files of multiple external
-repositories in a single overlay layer. To avoid conflicts, the ~TARGETS~ files
-can be assigned different file names per repository. As an example, imagine a
-common overlay layer with the files ~TARGETS.fmt~ and ~TARGETS.gsl~ for the
-repositories ~"fmtlib"~ and ~"gsl-lite"~, respectively:
-
-#+BEGIN_SRC
- common-layer
- |
- +--TARGETS.fmt
- +--TARGETS.gsl
- +--include
- | +--fmt
- | | +--TARGETS.fmt
- | +--gsl
- | +--TARGETS.gsl
- |
- +--src
- +--TARGETS.fmt
-#+END_SRC
-
-Such a common overlay layer can be used as the target root for both repositories
-with only one difference: the ~"target_file_name"~ field. By specifying this
-field, the dispatch where to find the respective target description for each
-repository is implemented. For the given example, the following ~repos.json~
-defines the overlay ~"common-targets-layer"~, which is used by ~"fmtlib"~ and
-~"gsl-lite"~:
-
-#+SRCNAME: repos.json
-#+BEGIN_SRC js
-{ "main": "tutorial"
-, "repositories":
- { "rules-cc":
- { "repository":
- { "type": "git"
- , "branch": "master"
- , "commit": "123d8b03bf2440052626151c14c54abce2726e6f"
- , "repository": "https://github.com/just-buildsystem/rules-cc.git"
- , "subdir": "rules"
- }
- , "target_root": "tutorial-defaults"
- , "rule_root": "rules-cc"
- }
- , "tutorial":
- { "repository": {"type": "file", "path": "."}
- , "bindings": {"rules": "rules-cc", "format": "fmtlib"}
- }
- , "tutorial-defaults":
- { "repository":
- { "type": "file"
- , "path": "./tutorial-defaults"
- , "pragma": {"to_git": true}
- }
- }
- , "common-targets-layer":
- { "repository":
- { "type": "file"
- , "path": "./common-layer"
- , "pragma": {"to_git": true}
- }
- }
- , "fmtlib":
- { "repository":
- { "type": "git"
- , "branch": "8.1.1"
- , "commit": "b6f4ceaed0a0a24ccf575fab6c56dd50ccf6f1a9"
- , "repository": "https://github.com/fmtlib/fmt.git"
- }
- , "target_root": "common-targets-layer"
- , "target_file_name": "TARGETS.fmt"
- , "bindings": {"rules": "rules-cc"}
- }
- , "gsl-lite":
- { "repository":
- { "type": "git"
- , "branch": "v0.40.0"
- , "commit": "d6c8af99a1d95b3db36f26b4f22dc3bad89952de"
- , "repository": "https://github.com/gsl-lite/gsl-lite.git"
- }
- , "target_root": "common-targets-layer"
- , "target_file_name": "TARGETS.gsl"
- , "bindings": {"rules": "rules-cc"}
- }
- }
-}
-#+END_SRC
-
-** Using pre-built dependencies
-
-While building external dependencies from source brings advantages,
-most prominently the flexibility to quickly and seamlessly switch
-to a different build configuration (production, debug, instrumented
-for performance analysis; cross-compiling for a different target
-architecture), there are also legitimate reasons to use pre-built
-dependencies. The most prominent one is if your project is packaged
-as part of a larger distribution. For that reason, just also has (in
-~etc/import.prebuilt~) target files for all its dependencies assuming
-they are pre-installed. The reason why target files are used at
-all for this situation is twofold.
-- On the one hand, having a target allows the remaining targets
- to not care about where their dependencies come from, or if it
- is a build against pre-installed dependencies or not. Also, the
- top-level binary does not have to know the linking requirements
- of its transitive dependencies. In other words, information stays
- where it belongs to and if one target acquires a new dependency,
- the information is automatically propagated to all targets using it.
-- Still some information is needed to use a pre-installed library
- and, as explained, a target describing the pre-installed library
- is the right place to collect this information.
- - The public header files of the library. By having this explicit,
- we do not accumulate directories in the include search path
- and hence also properly detect include conflicts.
- - The information on how to link the library itself (i.e.,
- basically its base name).
- - Any dependencies on other libraries that the library might have.
- This information is used to obtain the correct linking order
- and complete transitive linking arguments while keeping the
- description maintainable, as each target still only declares
- its direct dependencies.
-
-The target description for a pre-built version of the format
-library that was used as an example in this section is shown next;
-with our staging mechanism the logical repository it belongs to is
-rooted in the ~fmt~ subdirectory of the ~include~ directory of the
-ambient system.
-
-#+SRCNAME: etc/import.prebuilt/TARGETS.fmt
-#+BEGIN_SRC js
-{ "fmt":
- { "type": ["@", "rules", "CC", "library"]
- , "name": ["fmt"]
- , "stage": ["fmt"]
- , "hdrs": [["TREE", null, "."]]
- , "private-ldflags": ["-lfmt"]
- }
-}
-#+END_SRC