Due to the random nature of the LargeObjectUtils generator, it may generate two identical files in a row. To prevent the test from failing, check that a newly generated file doesn't collide with any file already added to the CAS.
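A minimal sketch of the retry pattern described above, assuming a toy generator and a hash set in place of LargeObjectUtils and the test's CAS bookkeeping (all names here are illustrative):

```cpp
#include <cstddef>
#include <functional>
#include <random>
#include <string>
#include <unordered_set>

// Toy stand-in for the LargeObjectUtils generator: being random, it may
// produce the same content twice in a row.
static auto GenerateRandomContent(std::mt19937& rng) -> std::string {
    std::uniform_int_distribution<int> byte{0, 3};  // tiny alphabet, collisions likely
    std::string data(8, '\0');
    for (auto& c : data) { c = static_cast<char>('a' + byte(rng)); }
    return data;
}

// Regenerate until the new content does not collide with anything already
// added to the CAS (tracked here as a set of content hashes).
static auto GenerateUniqueContent(std::mt19937& rng,
                                  std::unordered_set<std::size_t>& seen)
    -> std::string {
    while (true) {
        auto data = GenerateRandomContent(rng);
        if (seen.insert(std::hash<std::string>{}(data)).second) {
            return data;  // fresh digest; safe to add to the CAS
        }
        // duplicate of an earlier file; try again
    }
}
```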
...that ignores compactification.
... and rename appropriately to reflect contents more precisely
than the generic "common". This separation also disentangles
dependencies a bit.
... instead of relying on those dependencies being pulled in
indirectly.
...instead of calling ProtocolTraits::IsCompatible
...and move it to the common stage.
...with ArtifactDigestFactory::HashDataAs
...with ArtifactDigestFactory::HashFileAs
...with ArtifactDigest.
...with ArtifactDigest.
...with ArtifactDigest.
...with ArtifactDigest.
...and move this functionality to bazel_msg_factory_test, where it is actually used. For local_cas.test, regular hashing is used, since blob_creator is redundant there.
... while keeping our .clang-format file.
... so that linting information gets propagated properly.
...and create StorageConfig and Storage in place if needed.
... instead of static calls to GarbageCollector
...instead of std::filesystem::path.
StorageConfig is extended to return paths of Storage's parts.
...to make it easier to track changes during refactoring.
During compactification, invalid entries must be deleted.
During garbage collection, split and remove from the storage every entry that is larger than a threshold.
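A hedged sketch of that step; the Entry type, the threshold value, and SplitIntoChunks are illustrative assumptions, not the repository's actual interfaces:

```cpp
#include <cstdint>
#include <filesystem>
#include <vector>

// Illustrative threshold; the real value is a storage configuration detail.
constexpr std::uint64_t kLargeObjectThreshold = 1024 * 1024;

struct Entry {
    std::filesystem::path path;
    std::uint64_t size;
};

// Hypothetical splitter: stores the entry's chunks in the large-object CAS.
void SplitIntoChunks(std::filesystem::path const& /*path*/) { /* ... */ }

// During garbage collection, split every entry larger than the threshold
// and remove the oversized original from the storage.
void CompactifyLargeEntries(std::vector<Entry> const& entries) {
    for (auto const& entry : entries) {
        if (entry.size > kLargeObjectThreshold) {
            SplitIntoChunks(entry.path);
            std::filesystem::remove(entry.path);
        }
    }
}
```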
During garbage collection, remove from the storage every entry that has a corresponding large entry.
and trees.
executable files during splitting.
As we use chunking also for reducing storage, we have to consider
the overhead of block devices, which is on the order of kB per file.
Our target chunk size should therefore be at least two orders of
magnitude above this, suggesting a minimal target chunk size of 128kB.
This target size also has the advantage that the associated maximal
chunk size of 1MB is still well below the maximal transmission size
of grpc, allowing us to avoid the streaming API.
As we're scaling everything up by a factor of 16, we also have to
increase the number of bits in the involved masks by 4. We use this
to also extend the window size by using the 2 most significant
octets. Following the advice of the paper proposing FastCDC to
spread out the ones roughly equally suggests 0x4444 as a suitable
value for the two most significant octets.
We also change the suggested extension of the remote-execution API
accordingly. As the precise parameters for FastCDC, when announced
over the remote-execution APIs, are still under discussion upstream,
we simplify the name to not mention the target size.
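The size and mask arithmetic above can be made concrete with a small check; the base mask here is a made-up example value, and only the "4 additional bits" and 0x4444 relations come from the message:

```cpp
#include <bit>
#include <cstdint>

// Target at least two orders of magnitude above the ~kB block-device
// overhead; the associated maximum stays below grpc's transmission limit.
constexpr std::uint64_t kTargetChunkSize = 128 * 1024;  // 128kB
constexpr std::uint64_t kMaxChunkSize = 1024 * 1024;    // 1MB

// Scaling sizes by 16 = 2^4 requires 4 more one-bits in the masks.
// Placing 0x4444 in the two most significant octets spreads the extra
// ones roughly equally, following the FastCDC paper's advice.
constexpr std::uint64_t kBaseMask = 0x0000'0000'0000'3B53;    // made-up example
constexpr std::uint64_t kScaledMask = 0x4444'0000'0000'3B53;  // 4 more bits set

static_assert(std::popcount(kScaledMask) == std::popcount(kBaseMask) + 4);
static_assert(kMaxChunkSize == 8 * kTargetChunkSize);
```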
For splicing of large objects from external sources, additional checks are performed (see the sketch after this list):
* The digest of the spliced result must be equal to the expected digest;
* The parts of a spliced tree must be in the storage.
Tested:
* Regular splicing of large objects;
* If the result is unexpected, splicing fails;
* If some parts of a tree are missing, splicing fails.
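A minimal sketch of these two checks under assumed types; Digest, AllPartsInStorage, SpliceToTemporaryFile, and HashFile are stand-ins, not the actual interfaces:

```cpp
#include <optional>
#include <string>
#include <vector>

struct Digest {
    std::string hash;
    bool operator==(Digest const&) const = default;
};

// Stand-ins assumed to be provided by the storage layer.
auto AllPartsInStorage(std::vector<Digest> const& parts) -> bool;
auto SpliceToTemporaryFile(std::vector<Digest> const& parts) -> std::string;
auto HashFile(std::string const& path) -> Digest;

// Splice parts into one object; fail if either of the checks above fails.
auto SpliceChecked(std::vector<Digest> const& parts,
                   Digest const& expected,
                   bool is_tree) -> std::optional<std::string> {
    if (is_tree and not AllPartsInStorage(parts)) {
        return std::nullopt;  // parts of a spliced tree must be in the storage
    }
    auto path = SpliceToTemporaryFile(parts);
    if (not (HashFile(path) == expected)) {
        return std::nullopt;  // spliced result must match the expected digest
    }
    return path;
}
```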
* Uplink the parts of a large entry before the entry itself (see the sketch below);
* Uplink large entries in LargeObjectCAS::GetEntryPath to avoid splitting things twice;
* Promote the spliced tree during uplinking of a large tree entry to properly promote the parts of the tree;
* Uplink large entries in LocalUplink{Blob, Tree} to support proper uplinking in the Action Cache and Target Cache.
Tested:
* Uplink large blobs and trees;
* Uplink a large object that depends on other large objects.
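A hedged sketch of the parts-before-entry ordering from the first bullet; LargeEntry and UplinkObject are illustrative names only:

```cpp
#include <string>
#include <vector>

// Illustrative shape of a large entry: its own digest plus the digests
// of the parts it was split into.
struct LargeEntry {
    std::string digest;
    std::vector<std::string> part_digests;
};

// Hypothetical uplink of one object from a younger to an older generation.
void UplinkObject(std::string const& digest);

// Uplink the parts of a large entry before the entry itself, so the older
// generation never holds a large entry whose parts are still missing.
void UplinkLargeEntry(LargeEntry const& entry) {
    for (auto const& part : entry.part_digests) {
        UplinkObject(part);      // parts first
    }
    UplinkObject(entry.digest);  // then the large entry itself
}
```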
Implicitly reconstruct objects during regular uplinking of Blobs/Trees.