summaryrefslogtreecommitdiff
path: root/src/buildtool/execution_api/remote
AgeCommit message (Collapse)Author
2024-11-14execution_api/remote: Implement IWYU suggestionsPaul Cristian Sarbu
2024-10-29Clean up unused dependenciesKlaus Aehlig
2024-10-28Retry Execution on FAILED_PRECONDITIONKlaus Aehlig
The specification for this status code is as follows. One or more errors occurred in setting up the action requested, such as a missing input or command or no worker being available. The client may be able to fix the errors and retry. We routinely ensure all inputs are available to the remote execution before we start an action, so all prerequisites will be there on a compliant server, however might not actually be on a server where the CAS only has eventual consistency or is incorrect (due to old cache entries on CAS purge) in its answer to FindMissingBlobs. While we have no guarantee that a retry will help, we still retry; at least in the case of an unavailable worker or CAS entries not yet available due to eventual consistency, this will help. Also, we log at debug lvel the full response, including the repeated Any message. In this way, we can find out what useful information (if any) is sent by popular remote-execution services and implement more specific mitigations in the future.
2024-10-28BatchUploadBlobs: decrease log level for retried tasksKlaus Aehlig
In BatchUploadBlobs we accept short writes and, in case of no progress, fall back to single blob upload. Therefore, failure to upload blobs is not fatal and therefore should not be reported at error level. Decrease the log level accordingly: a protocol failure to upload is a performance-related event (as the retry needs additional time), catching an internal exception is something that shouldn't really happen, so we warn the user.
2024-10-25Add dependencies explicitly that are included directlyKlaus Aehlig
... instead of relying on those dependencies being pulled in indirectly.
2024-10-25bugfix: bazel_network.cpp: fix handling of ongoing actionsAlberto Sartori
The rpc Execution::Execute returns stream google.longrunning.Operation. When the client reads the stream, the server can report that the operation is still in progress and the client has to wait. Before this patch, we were not checking for this particular condition. As a result, an ongoing action was interpreted as an execution failure.
2024-10-10Remove from OSS intersecting public-private dependenciesMaksim Denisov
2024-10-08Name type template parameters using CamelCase.Maksim Denisov
2024-10-08Name classes, structs and enums using CamelCase.Maksim Denisov
2024-10-07Enable cppcoreguidelines-* checks.Maksim Denisov
2024-10-07Disable misc-no-recursion checkMaksim Denisov
...since we use recursion for trees a lot, but skip this check manually.
2024-10-07Enable readability-redundant-member-init check.Maksim Denisov
2024-10-07Enable bugprone-unhandled-exception-at-new check.Maksim Denisov
2024-10-07Enable bugprone-narrowing-conversions checkMaksim Denisov
2024-09-26Fix enum sizes proposed by clang-tidy.Maksim Denisov
Enable performance-enum-size check.
2024-09-26Fix automatic moves proposed by clang-tidy.Maksim Denisov
Enable performance-no-automatic-move check.
2024-09-26Fix assignments in conditionsMaksim Denisov
...proposed by clang-tidy. Enable bugprone-assignment-in-if-condition check.
2024-09-23Reorder dependencies and remove duplicates in OSSMaksim Denisov
2024-09-23Store HashFunction by const reference.Maksim Denisov
Despite the fact that HashFunction is a small type, it still makes sense to store it by reference to reflect the ownership. StorageConfig becomes the main holder. Reference holders store HashFunction by const ref and aren't allowed to change it. However, they are free to return HashFunction by value since this doesn't benefit readability anyhow.
2024-09-18Avoid additional memory allocations when working with protobuf's types.Maksim Denisov
Although this change doesn't benefit performance anyhow (protobuf's mutable_*() methods allocate memory lazily), it is better to let protobuf do this on its own.
2024-09-18Implement ByteStreamUtils::WriteRequest classMaksim Denisov
...and remove split serialization/deserialization logic.
2024-09-18Implement ByteStreamUtils::ReadRequest classMaksim Denisov
...and remove split serialization/deserialization implementations.
2024-09-18Introduce ByteStreamUtils, wrap kChunkSizeMaksim Denisov
...and use the qualified name ByteStreamUtils::kChunkSize
2024-09-18Remove ByteStreamClient::{Read, Write}ManyMaksim Denisov
...since they were used only in tests.
2024-09-18Remove a redundant string allocation.Maksim Denisov
2024-09-17Small code improvements based on lint warningsPaul Cristian Sarbu
- add more noexcept requirements and enforce existing - fixing inconsistencies related to function arguments - remove redundant static keywords - silencing excessive lint reporting in test cases While there, make more getters const ref.
2024-09-16remote execution: Check validity of symlinksPaul Cristian Sarbu
Invalid entries, currently all upwards symlinks (pending implementation of a better way of handling them), are now identified and handled properly: in compatible mode on the client side, during handling of remote response, and in native mode on the server side during the population of the action result.
2024-09-16bazel_execution_client: Remove duplication of error messagesPaul Cristian Sarbu
...during WithRetry, as we know that the error message of the callable has already been appropriately logged.
2024-09-16bazel_response: Improve error reporting for uploading treesPaul Cristian Sarbu
2024-09-16execution_response: Allow failures to be reported while populatingPaul Cristian Sarbu
As populating the containers from remote response only takes place once, no assumptions should be made that this cannot fail (for example if wrong or invalid entries were produced). Instead, return error messages on failure to callers that can log accordingly.
2024-09-13Pass HashFunction to BazelCasClientMaksim Denisov
...to determine whether splitting-splicing functionality is supported.
2024-09-13Check compatibility in BazelAPI based on the hash typeMaksim Denisov
2024-09-13Check compatibility in readers based on the hash typeMaksim Denisov
2024-09-13Add to ProtocolTraits static functions that provide protocol-specific behaviourMaksim Denisov
2024-09-13Rename Compatibility class to ProtocolTraitsMaksim Denisov
...and move it to the common stage.
2024-09-09Remove NativeSupport classMaksim Denisov
2024-09-09Use ArtifactDigestFactory casts in readersMaksim Denisov
2024-09-09Use ArtifactDigest in BazelApi/BazelResponseMaksim Denisov
2024-09-09Return ArtifactDigest from CreateActionDigestFromCommandLineMaksim Denisov
2024-09-09Replace ArtifactDigest::CreateMaksim Denisov
...with ArtifactDigestFactory::HashDataAs
2024-09-09Remove redundant operator lessMaksim Denisov
...from ObjectInfo and ArtifactDigest
2024-08-30Cast ArtifactDigest to bazel_re::Digest explicitlyMaksim Denisov
...to simplify further refactoring.
2024-08-30Use BazelDigestFactory to create bazel_re::Digest directly if neededMaksim Denisov
...bypassing ArtifactDigest functionality.
2024-08-30Replace bazel_re::Digest in BazelMsgFactory (trees)Maksim Denisov
...with ArtifactDigest.
2024-08-30Move artifact_blob_container to a standalone libraryMaksim Denisov
2024-08-30Replace bazel_re::Digest in GitRepo::SymlinksCheckFunc callbackMaksim Denisov
...with ArtifactDigest.
2024-08-30Retry on exceeding deadline obtaining the status of a running executionKlaus Aehlig
Remote execution of actions is handled via long-running operations. Here we have to be careful with the involved status codes: there is the status code of the operation and the response contains a faild that also happens to be a status code. The protocol states Errors discovered during creation of the `Operation` will be reported as gRPC Status errors, while errors that occurred while running the action will be reported in the `status` field of the `ExecuteResponse` So we have to distinguish between two kinds of DEADLINE_EXCEEDED. - If reported by the rpc, it means, we failed to obtain the status of the ongoing action in a reasonable amount of time; here we can do nothing but retry. - If we obtain an answer and that answer has state DEADLINE_EXCEEDED this means "The execution timed out."; hence we must not retry and report the result properly to the user.
2024-08-27CasClient: Fall back to single blob upload, if batch uploading made no progressKlaus Aehlig
We already accept short writes in batch uploads, but when no progress is made, we cannot simply retry, as this might lead to an infinite loop. Instead, we give up on batching and upload each blob one by one.
2024-08-23CasClient: Retry batch update for missing responseOliver Reiche
The remote execution protocol is a bit unclear about how to deal with blob updates for which we got no response. While some clients consider a blob update failed only if a failed response is received, we are going extra defensive here and also consider missing responses to be a failed blob update. Issue a retry for the missing blobs.
2024-08-22Remote-execution properties: restore the latest-wins semanticsKlaus Aehlig
... that was accidentially replaced by a first-wins semantics in 62d204ff4cc94c12c1635f189255710901682825 which fortunately did not make it to any release.