authorSascha Roloff <sascha.roloff@huawei.com>2023-11-22 18:33:45 +0100
committerSascha Roloff <sascha.roloff@huawei.com>2023-11-23 11:08:46 +0100
commitb7b438d5bba82a391d581537d144c2af6ac233ac (patch)
tree4b8e4919ebb4fad31861d7095c959d8fdca29de2 /doc/future-designs
parent957b010bb06509ec2b7db2c79836791684e8bc04 (diff)
downloadjustbuild-b7b438d5bba82a391d581537d144c2af6ac233ac.tar.gz
Move blob-splitting design document to implemented concepts
Diffstat (limited to 'doc/future-designs')
-rw-r--r--doc/future-designs/blob-splitting.md222
1 files changed, 0 insertions, 222 deletions
diff --git a/doc/future-designs/blob-splitting.md b/doc/future-designs/blob-splitting.md
deleted file mode 100644
index 7bfc53c4..00000000
--- a/doc/future-designs/blob-splitting.md
+++ /dev/null
@@ -1,222 +0,0 @@
-Blob Splitting Protocol
-=======================
-
-Introduction
-------------
-
-Due to the generic nature of our build system, we do not impose any
-restrictions on the type of artifacts that are generated by actions or
-that are inputs to actions. Thus, artifacts might be large executables,
-libraries, or a compiled set of slides in PDF format, or even whole file
-system images. Such artifacts result in large blobs, which need to be
-fetched from a remote CAS if they are to be inspected or used locally.
-Depending on the network connection, this might imply a significant
-waiting time until the complete artifact is downloaded as well as
-results in a lot of network traffic. This situation is especially
-painful in case of short modification-inspection cycles. For each small
-modification of the sources, the complete artifact needs to be
-downloaded even though maybe only a small fraction of the compiled
-binary artifact has been changed.
-
-Thus, we propose a blob splitting strategy as a conservative extension
-to the original [remote execution
-API](https://github.com/bazelbuild/remote-apis/blob/main/build/bazel/remote/execution/v2/remote_execution.proto),
-which allows splitting the binary data of a blob into chunks and
-transmitting only the modified chunks instead of the whole blob. The
-goal is to reduce traffic between the local host and the remote server
-and to avoid unnecessary waiting times for users who frequently work on
-large artifacts, at the cost of increased storage consumption.
-
-Remote execution API extension
-------------------------------
-
-We extend the existing remote execution API by introducing a new remote
-procedure call (rpc) to the `ContentAddressableStorage` service to split
-a blob into chunks along with respective request and response messages,
-and by adding a flag to the `ServerCapabilities` message to indicate
-whether blob splitting is supported by the server or not. The following
-code snippet depicts the proposed extensions to the remote execution
-protocol:
-
- service ContentAddressableStorage {
- ...
- rpc SplitBlob(SplitBlobRequest) returns (SplitBlobResponse) {}
- }
-
- message SplitBlobRequest {
- string instance_name = 1;
- Digest blob_digest = 2;
- }
-
- message SplitBlobResponse {
- repeated Digest chunk_digests = 1;
- }
-
- message ServerCapabilities {
- ...
- bool blob_split_support = <free field number>;
- }
-
-### Contract between client and server
-
-The contract between the client and the server for this protocol
-extension is that if a client requests to split a blob at the remote
-execution server, the server answers with a list of chunk digests and
-gives the promise that first, all referenced blobs are available in its
-CAS to be downloaded and second, the concatenation of the returned blobs
-in the given order will result in the original blob the client has
-requested.
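The server's promise can be checked mechanically on the client side: concatenating the received chunks in order and hashing the result must reproduce the requested digest. A minimal sketch, assuming plain hex-encoded SHA-256 digests (the helper name `verify_split` is hypothetical):

```python
import hashlib

def verify_split(blob_digest_hex: str, chunks: list[bytes]) -> bytes:
    """Reassemble chunks in order and verify the split contract."""
    blob = b"".join(chunks)
    if hashlib.sha256(blob).hexdigest() != blob_digest_hex:
        raise ValueError("server violated the split contract")
    return blob
```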
-
-The client does not give any promise to the server; it is free not to
-use the blob splitting rpc. However, to benefit from the protocol
-extension (saving traffic), a meaningful client behavior is to request
-the chunk digests of a blob from the server, fetch only those chunks
-that are missing in its local CAS, and store the fetched chunks as well
-as the reconstructed original blob in its local CAS. If the client
-requests to split a blob, but the server does not support blob
-splitting or an error occurred during the request, the client falls
-back to fetching the entire blob.
-
-Blob split procedure
---------------------
-
-### Server side
-
-When the server receives a request to split a blob via the `SplitBlob`
-rpc, it first checks whether the blob given by the `blob_digest` from
-the `SplitBlobRequest` message exists in its CAS. If not, the status
-code `google.rpc.Code.NOT_FOUND` is returned. Otherwise, it loads the
-blob data and splits it into chunks according to the implemented data
-chunking algorithm. As explained later, there are different
-implementation options for this algorithm, but they all must ensure one
-property: the concatenation of the chunks result in the original blob.
-After the blob is splitted, the server computes a digest for each chunk
-according to the configured digest function and puts each chunk that is
-not yet stored in its CAS. If an error occurs during storing the chunks
-in the CAS due to storage shortage, the status code
-`google.rpc.Code.RESOURCE_EXHAUSTED` is returned. Otherwise, the chunk
-digests are collected in the `chunk_digests` field of a
-`SplitBlobResponse` message in the same order as the chunks occur within
-the binary blob data and the message is sent back to the client with the
-status code `google.rpc.Code.OK`.
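The server-side procedure can be sketched as follows. This is a simplified model, not the actual justbuild implementation: the CAS is a plain dict keyed by hex SHA-256 digests, errors are mapped to exceptions instead of grpc status codes, and fixed-size chunking stands in for the real chunking algorithm:

```python
import hashlib

def split_blob(cas: dict[str, bytes], blob_digest: str) -> list[str]:
    """Model of the SplitBlob handler: look up the blob, chunk it,
    store missing chunks, and return the chunk digests in order."""
    if blob_digest not in cas:
        raise KeyError("NOT_FOUND")  # -> google.rpc.Code.NOT_FOUND
    data = cas[blob_digest]
    # Placeholder chunking; the real server uses content-defined chunks.
    chunks = [data[i:i + 4] for i in range(0, len(data), 4)]
    digests = []
    for chunk in chunks:
        d = hashlib.sha256(chunk).hexdigest()
        cas.setdefault(d, chunk)     # store only chunks not yet in the CAS
        digests.append(d)
    return digests                   # -> SplitBlobResponse.chunk_digests
```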
-
-> Note: Since the same data is basically stored twice for each blob, the
-> storage consumption for the remote server is roughly doubled. Some
-> savings occur when chunks are reused in more than one blob or even
-> several times in the same blob.
-
-### Client side
-
-If a client wants to take advantage of blob splitting, it requests to
-split a blob into chunks at the remote side by calling the `SplitBlob`
-rpc from the `ContentAddressableStorage` service given a
-`SplitBlobRequest` message containing the `blob_digest` of the
-respective blob. If the status code returned by the `SplitBlob` rpc
-contains an error, the split operation failed and the client needs to
-fall back to fetch the entire blob. Otherwise, blob splitting was
-successful and the remote server returns an ordered list of
-`chunk_digests` and guarantees that the chunks are available in its CAS
-and that the concatenation of all chunks in the given order will result
-in the requested blob. The client checks which chunks are available
-locally, fetches only the locally missing chunks from the remote side
-via the `BatchReadBlobs` rpc or the `ByteStream` API, and assembles the
-requested blob by concatenating the binary data of its chunks. Finally,
-it stores the chunks as well as the resulting blob in its local CAS.
-
-> Note: Since the same data is basically stored twice for each blob, the
-> local storage consumption is roughly doubled. Some savings occur when
-> chunks are reused in more than one blob or even several times in the
-> same blob.
-
-Compatibility issues
---------------------
-
-We aim to get this protocol extension into the official remote execution
-API mentioned above. Once this is done and field numbers are assigned,
-a server that has upgraded to the new API version informs upgraded
-clients whether it supports blob splitting by setting the
-`blob_split_support` field in the `ServerCapabilities` message
-accordingly. A server only provides the aforementioned guarantee to
-clients once it has announced support for blob splitting via this
-field.
-
-An old client can communicate with a server implementing the new API
-without any modification. Nobody is forced to use the blob splitting
-functionality and unknown fields are just ignored at the client side. A
-client implementing the new API can communicate with an old server by
-evaluating the `blob_split_support` field, which will be set to its
-default value `false` at the client side.
-
-Until the protocol extension is officially accepted, the field number
-for the `blob_split_support` field is not known. In this case, early
-implementors can use a _trial request_ to determine whether the remote
-server supports blob splitting. This makes it possible to implement a
-prototypical client and server employing blob splitting without
-requiring the protocol extension to be officially accepted. It also
-allows new clients to communicate with old servers: such servers still
-behave correctly and do not need to be adapted. The client
-implementation needs to be rebuilt once a field number has been
-assigned.
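One way to structure such a trial request, sketched with a hypothetical client stub whose `split_blob` raises `NotImplementedError` when the server does not know the rpc. Any other error (e.g., `NOT_FOUND` for the probe digest) still proves the rpc exists:

```python
def probe_blob_split_support(remote, probe_digest: str) -> bool:
    """Issue one trial SplitBlob call and interpret the outcome."""
    try:
        remote.split_blob(probe_digest)
        return True
    except NotImplementedError:  # server predates the extension
        return False
    except Exception:
        # Other failures (e.g., probe blob not found) still indicate
        # that the server implements the rpc.
        return True
```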
-
-Data chunking algorithm
------------------------
-
-As stated earlier, a blob split request from a client is answered by the
-server with the promise that the concatenation of the returned chunks
-results in the requested blob. While this property is guaranteed by the
-server, it does not tell anything about the quality of the split result.
-It is desirable to generate chunks that are likely to be known by the
-client, because then less data needs to be transferred.
-
-A trivial implementation of the split operation would be to return a
-_singleton list_ containing the original blob as its single chunk.
-While this implementation is correct and fast, it is not very useful,
-since it does not save any traffic.
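In code, the trivial split is a one-liner that satisfies the contract while saving nothing:

```python
def split_trivial(blob: bytes) -> list[bytes]:
    # Correct but useless: the whole blob is the single chunk.
    return [blob]
```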
-
-A better approach is _fixed-block chunking_, where the data is split
-into chunks of fixed, equal size. While this approach typically results
-in more than one chunk and improves the probability of saving some
-traffic, it suffers from the data shifting problem: inserting a single
-character at the beginning of a data stream shifts the entire data
-while the chunk boundaries stay fixed. Thus, completely new chunks,
-unknown to the client, are created even though the data patterns are
-mostly similar.
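The data shifting problem is easy to demonstrate: with fixed-size chunks, prepending a single byte changes every chunk, so no chunk of the old blob can be reused (chunk size 4 is chosen here purely for illustration):

```python
def split_fixed(data: bytes, size: int = 4) -> list[bytes]:
    """Fixed-block chunking: cut every `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]

# Prepending one byte shifts every chunk boundary relative to the
# content, so the old and new chunk sets share nothing.
old_chunks = split_fixed(b"abcdefghijkl")
new_chunks = split_fixed(b"X" + b"abcdefghijkl")
```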
-
-A more intelligent way of splitting blobs is to select chunk boundaries
-based on locations in the content of the data, such that splitting a
-slightly modified blob results in mostly identical chunks and is
-insensitive to the data shifting problem of fixed-block chunking. Since
-the resulting chunk sizes typically differ, this approach is also
-called _variable-block chunking_; it is explained in the following
-section.
-
-### Variable-block chunking
-
-In variable-block chunking, the chunk boundaries are content-defined
-and shift with the data pattern. This can be achieved by computing hash
-values of data blocks of a specific size at every byte position in the
-data stream and matching those hash values against a predefined
-pattern. In case of a match, that byte position is declared a chunk
-boundary. This approach avoids the data shifting problem of fixed-block
-chunking and ensures that unmodified data more than a block size away
-from the changes will have the same boundaries. A possible pattern to
-be matched might be all zeroes in the lower `n` bits, which results in
-chunks with an average size of `2^n` bytes.
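The scheme above can be sketched as follows. For clarity the window hash is recomputed from scratch at each position (a real implementation would use a rolling hash, as described in the next section), and the parameters `n` and `window` are illustrative:

```python
import hashlib

def _window_hash(window: bytes) -> int:
    # Deterministic stand-in for a rolling hash; a real implementation
    # would update the value incrementally instead of rehashing.
    return int.from_bytes(hashlib.sha256(window).digest()[:4], "big")

def split_content_defined(data: bytes, n: int = 6,
                          window: int = 8) -> list[bytes]:
    """Cut after any position whose window hash has its lower n bits
    all zero, yielding chunks with an average size of about 2^n."""
    mask = (1 << n) - 1
    chunks, start = [], 0
    for i in range(window, len(data) + 1):
        if _window_hash(data[i - window:i]) & mask == 0:
            chunks.append(data[start:i])
            start = i
    if start < len(data):
        chunks.append(data[start:])   # trailing partial chunk
    return chunks
```

Because boundaries depend only on the surrounding bytes, prepending data shifts the boundaries with the content: all chunks after the first boundary survive a prefix insertion unchanged.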
-
-### Rolling hashes
-
-A common technique to efficiently compute hash values of consecutive
-bytes at every byte position in the data stream is to use a rolling hash
-function. A rolling hash function allows the hash value of a chunk of
-bytes of a specific size at byte position `i` to be computed cheaply
-from the hash value of the chunk of the same size at byte position
-`i-1`. Several rolling hash functions are available for implementation,
-such as a [polynomial rolling
-hash](https://ieeexplore.ieee.org/document/5390135), the [Rabin
-fingerprint](http://www.cs.cmu.edu/~15-749/READINGS/optional/rabin1981.pdf),
-or a [cyclic polynomial rolling
-hash](https://dl.acm.org/doi/abs/10.1145/256163.256168) (also called
-Buzhash). However, the fast rolling Gear hash, as used in the
-[FastCDC](https://www.usenix.org/conference/atc16/technical-sessions/presentation/xia)
-chunking algorithm, has been proven to be very compute efficient and
-faster than the other rolling hash algorithms specifically for
-content-based chunking of a stream of data, while achieving
-deduplication ratios similar to the Rabin fingerprint.
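The Gear hash update is a single shift plus a table lookup per byte; old bytes fall out of the hash automatically as they are shifted past the word width, which makes it rolling. A minimal sketch (the table values are illustrative, and a production chunker such as FastCDC would additionally enforce minimum/maximum chunk sizes and normalized masks):

```python
import random

# Illustrative Gear table: one pseudo-random 32-bit value per byte value.
random.seed(0)
GEAR = [random.getrandbits(32) for _ in range(256)]

def gear_boundaries(data: bytes, n: int = 6) -> list[int]:
    """Scan the data once, updating a Gear hash per byte, and declare a
    chunk boundary wherever the lower n bits are all zero."""
    mask = (1 << n) - 1
    h, cuts = 0, []
    for i, byte in enumerate(data):
        # Gear update: age out old bytes by shifting, mix in the next byte.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        if h & mask == 0:
            cuts.append(i + 1)  # boundary after byte i
            h = 0               # restart the hash for the next chunk
    return cuts
```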