...in LocalApi and BazelApi.
...in LocalApi and BazelApi.
During compactification, invalid entries must be deleted.
During garbage collection, split every entry that is larger
than a threshold and remove it from the storage.
During garbage collection, remove from the storage every entry
for which a corresponding large entry exists.
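Taken together, the three compactification messages above amount
to a single scan over a storage generation. A minimal sketch,
with hypothetical helper names (IsValid, SplitAndStore, and
kLargeThreshold are illustrative, not the justbuild API):

  #include <cstdint>
  #include <filesystem>

  // Hypothetical helpers; names are illustrative only.
  auto IsValid(std::filesystem::path const&) -> bool;  // valid CAS entry?
  void SplitAndStore(std::filesystem::path const&);    // store parts + record

  constexpr std::uintmax_t kLargeThreshold = 1024 * 1024;  // example threshold

  // One compactification pass over a storage generation.
  void Compactify(std::filesystem::path const& storage_root) {
      for (auto const& entry :
           std::filesystem::recursive_directory_iterator{storage_root}) {
          if (!entry.is_regular_file()) { continue; }
          if (!IsValid(entry.path())) {               // invalid entry: delete
              std::filesystem::remove(entry.path());
          } else if (entry.file_size() > kLargeThreshold) {
              SplitAndStore(entry.path());            // keep the parts ...
              std::filesystem::remove(entry.path());  // ... drop the original
          }
      }
  }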
and trees.
executable files during splitting.
As we use chunking also for reducing storage, we have to consider
the overhead of block devices, which is in the order of kB per
file. So our target chunk size should be at least two orders of
magnitude above this. This suggests aiming for a chunk size of at
least 128kB, a target size that also has the advantage that the
associated maximal chunk size is 1MB, which is still well below
the maximal transmission size of gRPC, allowing us to avoid the
streaming API.
As we're scaling everything up by a factor of 16, we also have to
increase the number of bits in the involved masks by 4. We use
this to also extend the window size by using the two most
significant octets. Following the advice of the paper proposing
FastCDC to spread out the one-bits roughly equally suggests
0x4444 as a suitable value for the two most significant octets.
We also change the suggested extension of the remote-execution
API accordingly. As the precise parameters for FastCDC, when
announced over the remote-execution APIs, are still under
discussion upstream, we simplify the name so as not to mention
the target size.
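To make the mask arithmetic concrete: a mask with n one-bits
matches a uniformly distributed fingerprint with probability
2^-n, so an expected chunk size of 2^17 bytes = 128kB corresponds
to 17 one-bits, 4 more than the 13-bit average mask of the
paper's 8kB configuration. The sketch below shows how such masks
drive the cut-point search; the 0x4444 top octets and the
128kB/1MB sizes are taken from the message above, while the lower
mask bits (borrowed from the paper's masks), the gear function,
and the min/switch-over sizes are illustrative assumptions, not
the actual implementation.

  #include <cstddef>
  #include <cstdint>

  constexpr std::uint64_t kMaskHard = 0x4444D9F003530000ULL;  // 19 one-bits
  constexpr std::uint64_t kMaskEasy = 0x4444D90003530000ULL;  // 15 one-bits
  constexpr std::size_t kAvgSize = 128u * 1024u;   // target chunk size (128kB)
  constexpr std::size_t kMinSize = kAvgSize / 4u;  // paper's min/avg ratio (assumed)
  constexpr std::size_t kMaxSize = 8u * kAvgSize;  // 1MB, forced cut

  // Stand-in for a table of 256 random 64-bit gear values.
  constexpr auto Gear(unsigned char b) -> std::uint64_t {
      auto x = (static_cast<std::uint64_t>(b) + 1) * 0x9E3779B97F4A7C15ULL;
      return x ^ (x >> 31);
  }

  // Compute the length of the next chunk of data[0..size). With the gear
  // fingerprint below, higher mask bits test older bytes, so one-bits in the
  // two most significant octets effectively extend the window.
  auto NextChunkSize(unsigned char const* data, std::size_t size) -> std::size_t {
      if (size <= kMinSize) { return size; }       // remainder is one chunk
      std::uint64_t fp = 0;
      std::size_t const limit = size < kMaxSize ? size : kMaxSize;
      for (std::size_t i = 0; i < limit; ++i) {
          fp = (fp << 1) + Gear(data[i]);          // rolling gear fingerprint
          if (i < kMinSize) { continue; }          // never cut below min size
          // harder mask before the target size, easier mask after it
          auto const mask = i < kAvgSize ? kMaskHard : kMaskEasy;
          if ((fp & mask) == 0) { return i + 1; }  // cut point found
      }
      return limit;  // no match: cut at max size (or end of data)
  }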
... as the typical chunk size is mainly determined by the masks
used internally. So, as long as we hard-code them, we should be
honest with ourselves and accept that the chunking parameters are
hard-coded as well.
For splicing of large objects from external sources, additional
checks are performed (see the sketch after this list):
* The digest of the spliced result must be equal to the expected
  digest;
* The parts of a spliced tree must be present in the storage.
Tested:
* Regular splicing of large objects;
* If the result is unexpected, splicing fails;
* If some parts of a tree are missing, splicing fails.
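A minimal sketch of these checks against a toy in-memory CAS; the
names (CasStorage, FakeHash, Splice) and the stand-in hash are
hypothetical, not the justbuild API:

  #include <cstddef>
  #include <functional>
  #include <map>
  #include <optional>
  #include <string>
  #include <vector>

  // Toy in-memory CAS with a stand-in hash; real storages use e.g. SHA-256.
  using Digest = std::size_t;
  auto FakeHash(std::string const& s) -> Digest {
      return std::hash<std::string>{}(s);
  }

  struct CasStorage {
      std::map<Digest, std::string> blobs;
      auto Contains(Digest d) const -> bool { return blobs.count(d) > 0; }
      auto Store(std::string const& b) -> Digest {
          auto d = FakeHash(b);
          blobs[d] = b;
          return d;
      }
  };

  // Splice parts into one object; fail if a part is missing from the storage
  // or if the digest of the spliced result differs from the expected one.
  auto Splice(CasStorage& cas, Digest expected,
              std::vector<Digest> const& parts) -> std::optional<Digest> {
      std::string content{};
      for (auto d : parts) {
          if (!cas.Contains(d)) { return std::nullopt; }  // part missing
          content += cas.blobs.at(d);
      }
      if (FakeHash(content) != expected) { return std::nullopt; }  // mismatch
      return cas.Store(content);
  }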
* Uplink the parts of a large entry before the entry itself (see
  the ordering sketch below);
* Uplink large entries in LargeObjectCAS::GetEntryPath to avoid
  splitting things twice;
* Promote the spliced tree during uplinking of a large tree entry
  to properly promote the parts of the tree;
* Uplink large entries in LocalUplink{Blob, Tree} to support
  proper uplinking in the Action Cache and Target Cache.
Tested:
* Uplinking of large blobs and trees;
* Uplinking of a large object that depends on other large objects.
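The ordering requirement of the first item can be sketched as a
recursion that visits parts before the record referring to them;
all names below are hypothetical stubs, not the justbuild API:

  #include <string>
  #include <vector>

  using Digest = std::string;
  auto PartsOf(Digest const&) -> std::vector<Digest> { return {}; }  // stub
  void UplinkBlob(Digest const&) {}    // stub: move a part to the older generation
  void UplinkRecord(Digest const&) {}  // stub: move the large-entry record

  // Uplink the parts of a large entry before the entry itself, so a reader
  // of the older generation never sees a record whose parts are missing.
  void UplinkLargeEntry(Digest const& large_entry) {
      for (auto const& part : PartsOf(large_entry)) {
          UplinkBlob(part);       // parts first ...
      }
      UplinkRecord(large_entry);  // ... the record itself last
  }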
Implicitly reconstruct objects during regular uplinking of Blobs/Trees.
* Add LargeObjectCAS fields for files and trees to LocalCAS;
* Add logic for splitting objects located in the main storage.
Tested:
Splitting of large, small and empty objects.
Every large object is keyed by the hash of the result and contains hashes of the parts from which the result can be reconstructed.
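In other words, the large-object CAS is conceptually a map from
the digest of the result to the ordered digests of its parts; a
toy model of that invariant (types are stand-ins):

  #include <map>
  #include <string>
  #include <vector>

  // Toy model: the key is the digest of the reconstructed object, the value
  // the ordered digests of its parts; concatenating the parts and hashing
  // the result must yield exactly the key again.
  using Digest = std::string;
  using LargeObjectEntries = std::map<Digest, std::vector<Digest>>;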
Main culprits (see the include list below):
- std::size_t, std::nullptr_t, and NULL require <cstddef>
- std::move and std::forward require <utility>
- unordered maps and sets require their respective includes
- std::for_each and std::all_of require <algorithm>
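For reference, these rules boil down to include lines like the
following:

  #include <algorithm>      // std::for_each, std::all_of
  #include <cstddef>        // std::size_t, std::nullptr_t, NULL
  #include <unordered_map>  // std::unordered_map
  #include <unordered_set>  // std::unordered_set
  #include <utility>        // std::move, std::forward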
When running in single-node mode, the serve endpoint should not
even consider sharding. Additionally, garbage-collection
uplinking should also take the shard into account. For this
purpose, a TargetCache instance now remembers whether it was
explicitly sharded and passes that information to the
GarbageCollector for uplinking.
Some of the more specific issues addressed:
- missing log_level target/include
- header-only libs wrongly marking deps as private
- missing/misplaced gsl includes
... as the fs_utils have a lot more dependencies, making them
usable in fewer places. Moreover, this function also serves to
shape the layout of the local build root and hence is more
appropriately placed in the config anyway.
This is needed in order to pass the correctly instantiated
TargetCache to AnalyseTarget even when bootstrapping 'just'.
... and make rotation of generations optional, as with the removal
of ephemeral data there is now a useful collection even without
rotating generations.
... under a common root in the youngest generation. This will allow
a simple way of cleaning them up during garbage collection.
As the internal distdir data structure now supports the
executable bit, it is also expressive enough to support
foreign-file repositories. Hence we can use this cache,
getting potentially more cache hits.
We ensure that, for each export target to be written to the
target cache, all its implied export targets are written to the
target cache first. This way, the target cache maintains its
consistency with respect to export-target dependencies at all
times.
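This write discipline is, in essence, a post-order recursion over
the implied export targets; a sketch with hypothetical helpers
(ImpliedExportTargets and WriteCacheEntry are not justbuild
names):

  #include <string>
  #include <vector>

  using TargetId = std::string;

  // Hypothetical helpers; stubs keep the sketch self-contained.
  auto ImpliedExportTargets(TargetId const&) -> std::vector<TargetId> {
      return {};
  }
  void WriteCacheEntry(TargetId const&) {}

  // Write all implied export targets before the target itself, so every
  // entry in the target cache only refers to entries already present.
  // Assumes the implied-targets relation is acyclic.
  void WriteWithDeps(TargetId const& target) {
      for (auto const& dep : ImpliedExportTargets(target)) {
          WriteWithDeps(dep);   // dependencies first (post-order)
      }
      WriteCacheEntry(target);  // the target itself last
  }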
... by uplinking them appropriately.
TargetCache...
...backed by the same CAS, but the FileStorage uses the given
shard. This is particularly useful for the just-serve server
implementation, since sharding must be performed according to
the client's request and not according to the server
configuration.
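What such a shard-selecting variant might look like, with class
layout and names as assumptions rather than the actual
TargetCache:

  #include <filesystem>
  #include <string>
  #include <utility>

  // Illustrative only: a cache whose entries live under <root>/<shard>/...,
  // so two instances can share the same backing store while reading and
  // writing different shards, e.g. one derived from a client's request.
  class ShardedCache {
      std::filesystem::path root_;
      std::string shard_;  // e.g. a hash of the execution properties

    public:
      explicit ShardedCache(std::filesystem::path root, std::string shard)
          : root_{std::move(root)}, shard_{std::move(shard)} {}

      // Same backing root, different shard.
      [[nodiscard]] auto WithShard(std::string shard) const -> ShardedCache {
          return ShardedCache{root_, std::move(shard)};
      }

      [[nodiscard]] auto EntryPath(std::string const& key) const
          -> std::filesystem::path {
          return root_ / shard_ / key;
      }
  };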
This constructor is used by TargetService::ServeTarget.
... which was accidentally a list of (a single) object,
instead of only a single JSON object.
While we don't want to fail if the 'checkouts' map values are
not strings, we shouldn't just accept non-string values either;
instead, we should warn the user and continue without them.
This is required in order to make them available to 'just serve'
in a minimal just installation.
The bytestream server implementation (deployed by just execute)
now stores the temporary files under
$local_build_root/protocol-dependent/generation-0
so that they can be garbage collected if "just execute" is
terminated before they are cleaned up.
Extend the configuration data structure by a dispatch list of
endpoints, chosen based on the first match of the execution
properties.
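"First match" here means walking the dispatch list in order and
picking the first entry whose properties all occur in the
action's execution properties; a sketch of that lookup, where the
types and the exact matching rule are assumptions:

  #include <map>
  #include <optional>
  #include <string>
  #include <vector>

  using Props = std::map<std::string, std::string>;
  struct DispatchEntry {
      Props props;           // properties this endpoint is responsible for
      std::string endpoint;  // e.g. "10.0.0.3:8985" (made-up address)
  };

  auto SelectEndpoint(std::vector<DispatchEntry> const& dispatch,
                      Props const& execution_props)
      -> std::optional<std::string> {
      for (auto const& entry : dispatch) {
          bool matches = true;
          for (auto const& [key, value] : entry.props) {
              auto it = execution_props.find(key);
              if (it == execution_props.end() || it->second != value) {
                  matches = false;
                  break;
              }
          }
          if (matches) { return entry.endpoint; }  // first match wins
      }
      return std::nullopt;  // no match: fall back to the default endpoint
  }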
...as early as possible. This ensures that callers always receive
only the tree entries for the supported object types.
For the non-upwardness check on symlinks, we pass a lambda
capturing the real backend of the tree entries, such that the
symlinks can be read.
The git_tree tests are updated accordingly.
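A sketch of the lambda-based check, with hypothetical signatures:
the tree-reading code only knows entry identifiers, so the caller
injects a reader for symlink targets; "non-upward" is modeled
here as "neither absolute nor escaping via '..'":

  #include <functional>
  #include <string>
  #include <vector>

  // Hypothetical: a reader for symlink contents, capturing the real backend
  // (e.g. the object database the tree entries come from).
  using SymlinkReader = std::function<std::string(std::string const& id)>;

  // Reject absolute targets and targets escaping through "..".
  auto IsNonUpward(std::string const& target) -> bool {
      if (!target.empty() && target.front() == '/') { return false; }
      int depth = 0;
      std::string part;
      for (char c : target + "/") {   // trailing '/' flushes the last part
          if (c != '/') { part += c; continue; }
          if (part == "..") {
              if (--depth < 0) { return false; }  // escapes the tree
          } else if (!part.empty() && part != ".") {
              ++depth;
          }
          part.clear();
      }
      return true;
  }

  // The tree parser calls back into the capturing lambda for each symlink.
  auto CheckSymlinks(std::vector<std::string> const& symlink_ids,
                     SymlinkReader const& read) -> bool {
      for (auto const& id : symlink_ids) {
          if (!IsNonUpward(read(id))) { return false; }  // reject early
      }
      return true;
  }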