author Klaus Aehlig <klaus.aehlig@huawei.com> 2024-01-17 10:17:25 +0100
committer Klaus Aehlig <klaus.aehlig@huawei.com> 2024-01-18 11:57:35 +0100
commit 8b4e94a1adf9d87ab9e5756136993f595a38c981
tree f5598040a6be0e5afd93772ba32d10a501f978fb
parent e30fd2df4f4e5dae5cad261f46a3eb189e16b36c
Document the implementation of tc deps tracking on gc
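
The uplinking invariant documented by this change (a target-level cache entry is only present if its implied entries are present too) can be illustrated with a minimal sketch. This is not the justbuild implementation: the function name `uplink` and the use of plain dictionaries for the old and young cache generations are assumptions made for illustration. Only the JSON key `"implied export targets"` is taken from the documentation in the diff below.

```python
import json

# Key name as documented in doc/concepts/target-cache.md.
IMPLIED_KEY = "implied export targets"


def uplink(key, old_gen, young_gen):
    """Hypothetical helper: bring entry `key` into the young generation.

    Both generations are modelled as dicts mapping a cache-key hash to the
    serialized JSON value. Implied entries are uplinked first, so the young
    generation never holds an entry whose implied entries are missing.
    Returns True if the entry is available in the young generation afterwards.
    """
    if key in young_gen:
        return True  # invariant already holds for everything in young_gen
    raw = old_gen.get(key)
    if raw is None:
        return False  # entry unknown in both generations
    entry = json.loads(raw)
    # The field is optional; it is only serialized if the list is non-empty.
    for implied in entry.get(IMPLIED_KEY, []):
        if not uplink(implied, old_gen, young_gen):
            return False  # treat the entry as a cache miss
    young_gen[key] = raw
    return True
```

In this model, an entry whose implied entries have been lost is treated as a cache miss rather than being uplinked on its own, which is what prevents mixing intensional and extensional definitions of the same artifact.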
Diffstat:
-rw-r--r-- CHANGELOG.md | 4
-rw-r--r-- doc/concepts/target-cache.md | 72
-rw-r--r-- doc/future-designs/tc-gc.md | 129
3 files changed, 77 insertions, 128 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 617d1ca6..f5f24cc2 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -45,6 +45,10 @@ A feature release on top of `1.2.0`, backwards compatible.
on upgrading or downgrading. However, old target-level cache
entries will not be used leading potentially to rebuilding of
some targets.
+- Garbage collection now honors the dependencies of target-level
+ cache entries on one another. When upgrading in place, this only
+ applies for target-level cache entries written initially after
+ the upgrade.
- Improved portability and update of the bundled dependencies.
- Various minor improvements and typo fixes in the documentation.
- Fixed a race condition in an internal cache of `just execute`
diff --git a/doc/concepts/target-cache.md b/doc/concepts/target-cache.md
index 24b9fcd4..4ebad03e 100644
--- a/doc/concepts/target-cache.md
+++ b/doc/concepts/target-cache.md
@@ -229,3 +229,75 @@ The forwarding of artifacts are the reason we chose that in the
non-cached analysis of an export target the artifacts are passed on as
received and are not wrapped in an "add to cache" action. The latter
choice would violate that projection property we rely upon.
+
+### Example
+
+Consider the following target file (on a content-fixed root) as
+example.
+
+```
+{ "generated":
+ {"type": "generic", "outs": ["out.txt"], "cmds": ["echo Hello > out.txt"]}
+, "export": {"type": "export", "target": "generated"}
+, "use":
+ {"type": "install", "dirs": [["generated", "."], ["generated", "other-use"]]}
+, "": {"type": "export", "target": "use"}
+}
+```
+
+Upon initial analysis (on an empty local build root) of the default
+target `""`, the output artifact `out.txt` is an action artifact, more
+precisely the same one that is output of the target `"generated"`;
+the target `"export"` also has the same artifact on output. After
+building the default target, a target-cache entry will be written
+for this target, containing the extensional definition of the target,
+so for `out.txt` the known artifact `e965047ad7c57865...` is stored; as
+a side effect, also for the target `"export"` a target-cache entry
+will be written, containing, of course, the same known artifact.
+So on subsequent analysis, both `"export"` and `""` will still
+have the same artifact for `out.txt`, but this time a known one.
+This artifact is now different from the artifact of the target
+`"generated"` (which is still an action artifact), but no conflicts
+arise as the usual target discipline requires that any target not
+a (direct or indirect) dependency of `"export"` use the target
+`"generated"` only indirectly by using the target `"export"`.
+
+Also note that further exporting such a target has no effect, as a
+known artifact always evaluates to itself. In that sense, replacing
+by the extensional definition is a projection.
+
+### Interaction with garbage collection
+
+While adding the implied export targets happens automatically due
+to the evaluation mechanism, the dependencies of target-level cache
+entries on one another still have to be persisted to honor them
+during garbage collection. Otherwise it would be possible that an
+implied target gets garbage collected. In fact, that would even be
+likely as typical builds only reference the top-level export targets.
+
+
+#### Analysis to track the export targets depended upon
+
+As we have to persist this dependency, we need to explicitly track
+it. More precisely, the internal data structure of an analyzed
+target is extended by a set of all the export targets eligible
+for caching, represented by the hashes of the `TargetCacheKey`s,
+encountered during the analysis of that target.
+
+### Extension of the value of a target-level cache entry
+
+The cached value for a target-level cache entry is serialized as a
+JSON object, with besides the keys `"artifacts"`, `"runfiles"`, and
+`"provides"` also a key `"implied export targets"` that lists (in
+lexicographic order) the hashes of the cache keys of the export
+targets the analysis of the given export target depends upon; the
+field is only serialized if that list is non-empty.
+
+### Additional invariant honored during uplinking
+
+Our cache honors the additional invariant that, whenever a target-level
+cache entry is present, so are the implied target-level cache
+entries. This invariant is honored when adding new target-level
+cache entries by adding them in the correct order, as well as when
+uplinking by uplinking the implied entries first (and there, of
+course, honoring the respective invariants).
diff --git a/doc/future-designs/tc-gc.md b/doc/future-designs/tc-gc.md
index afaf4bd3..8060a546 100644
--- a/doc/future-designs/tc-gc.md
+++ b/doc/future-designs/tc-gc.md
@@ -1,133 +1,6 @@
# Target-level cache dependencies for garbage collection
-## Background
-
-### Target-level caching
-
-In order to keep analysis manageable, `just` cuts out unchanged
-parts of the target graph by means of target-level caching. More
-precisely, `export` targets of transitively content-fixed repositories
-are cached; if such a target is requested a second time, the cached
-value is used without even analysing the target graph defining
-this target.
-
-Implicit in that caching is a projection in analysis of this
-target from its intensional description to its extensional one.
-This change in definition requires the build to be organized in a
-way that the intensional and extensional definition are not used
-together. Typically, this is achieved by ensuring that whenever an
-artifact is used that goes through this `export` target, then so
-do all uses; as the change to the extensional name is a projection,
-the strictness of the evaluation of `export` targets together with
-the fact that (only) on successful build _all_ analysed `export`
-targets are cached ensures the absence of conflicts.
-
-#### Example
-
-Consider the following target file (on a content-fixed root) as
-example.
-
-```
-{ "generated":
- {"type": "generic", "outs": ["out.txt"], "cmds": ["echo Hello > out.txt"]}
-, "export": {"type": "export", "target": "generated"}
-, "use":
- {"type": "install", "dirs": [["generated", "."], ["generated", "other-use"]]}
-, "": {"type": "export", "target": "use"}
-}
-```
-
-Upon initial analysis (on an empty local build root) of the default
-target `""`, the output artifact `out.txt` is an action artifact, more
-precisely the same one that is output of the target `"generated"`;
-the target `"export"` also has the same artifact on output. After
-building the default target, a target-cache entry will be written
-for this target, containing the extensional definition of the target,
-so for `out.txt` the known artifact `e965047ad7c57865...` stored; as
-a side effect, also for the target `"export"` a target-cache entry
-will be written, containing, of course, the same known artifact.
-So on subsequent analysis, both `"export"` and `""` will still
-have the same artifact for `out.txt`, but this time a known one.
-This artifact is now different from the artifact of the target
-`"generated"` (which is still an action artifact), but no conflicts
-arise as the usual target discipline requires that any target not
-a (direct or indirect) dependency of `"export"` use the target
-`"generated"` only indirectly by using the target `"export"`.
-
-Also note that further exporting such a target has no effect, as a
-known artifact always evaluates to itself. In that sense, replacing
-by the extensional definition is a projection.
-
-### Garbage collection
-
-In order to reclaim disk space used by the cache directory, `just`
-has an option to carry out garbage collection. More precisely, the
-cache is organized in two generations, and a garbage-collection
-step removes the old generation and renames the young generation
-to be the old one. All operations are carried out from the young
-generation; entries found in the old generation are linked to the
-young generation before being used. While doing so, the following
-invariants are kept by uplinking, in the correct order, more entries.
-
- - If an artifact is referenced in any cache entry (action cache,
- target-level cache), then the corresponding artifact is in CAS.
- - If a tree is in CAS, then so are its immediate parts (and hence
- also all transitive parts).
-
-## Current situation and shortcomings
-
-As it is implemented currently, garbage collection does not honor
-the invariant on export targets that if one export target is in
-cache, the ones traversed during the analysis of that particular
-target are in cache as well. In fact, those implied targets tend to
-be garbage collected, as typical builds only reference the top-level
-export targets.
-
-This can lead to a staging conflict, e.g., in the situation where
-two `export` targets that contain artifacts from a common `export`
-target are used together. Now, if that common `export` target
-goes out of cache and for one of the two top-level targets the
-description changes, that target will use, due to the cache loss, the
-intensional definition of the artifact from the common target with
-the other (still cached) target still using the extensional one.
-
-## Proposed solution
-
-We propose to honor the dependency on export targets in garbage
-collection by appropriately uplinking the implied target-level cache
-entries as well. To do so, the dependency of configured `export`
-targets on others will be stored in the corresponding cache value.
-
-### Analysis to track the export targets depended upon
-
-So far, the dependency of export targets on one another was only
-tracked implicitly by the evaluation model of target definitions.
-As we now have to persist this dependency, we need to explicitly
-track it. More precisely, the internal data structure of an analyzed
-target will be extended by a set of all the export targets eligible
-for caching, represented by their `TargetCacheKey`, encountered
-during the analysis of that target.
-
-### Extension of the value of a target-level cache entry
-
-The cached value for a target-level cache entry is serialized as a
-JSON object, with currently the keys `"artifacts"`, `"runfiles"`, and
-`"provides"`. This object will be extended by an additional (optional)
-key `"implied export targets"` that lists (in lexicographic order)
-the hashes of the cache keys of the export targets the analysis of
-the given export target depends upon; the field is only serialized
-if that list is non empty.
-
-### Additional invariant honored during uplinking
-
-Our cache will honor the additional invariant that, whenever a
-target-level cache entry is present, so are the implied target-level
-cache entries. This invariant will be honored when adding new
-target-level cache entries by adding them in the correct order, as
-well as when uplinking by uplinking the implied entries first (and
-there, of course, honoring the respective invariants).
-
-### Interaction with `just serve`
+## Interaction with `just serve`
When building, `just` normally does not create an entry for
target-level cache hit received from `just serve`. However, it