author Oliver Reiche <oliver.reiche@huawei.com> 2023-06-01 13:36:32 +0200
committer Oliver Reiche <oliver.reiche@huawei.com> 2023-06-12 16:29:05 +0200
commit b66a7359fbbff35af630c88c56598bbc06b393e1 (patch)
tree d866802c4b44c13cbd90f9919cc7fc472091be0c /doc/future-designs/service-target-cache.org
parent 144b2c619f28c91663936cd445251ca28af45f88 (diff)
doc: Convert orgmode files to markdown
Diffstat (limited to 'doc/future-designs/service-target-cache.org')
-rw-r--r-- doc/future-designs/service-target-cache.org 227
1 files changed, 0 insertions, 227 deletions
diff --git a/doc/future-designs/service-target-cache.org b/doc/future-designs/service-target-cache.org
deleted file mode 100644
index 10138db5..00000000
--- a/doc/future-designs/service-target-cache.org
+++ /dev/null
@@ -1,227 +0,0 @@
-* Target-level caching as a service
-
-** Motivation
-
-Projects can have quite a lot of dependencies that are not part of
-the build environment, but are, instead, built from source, e.g.,
-in order to always build against the latest snapshot. The latter
-is a typical workflow in case of first-party dependencies. In the
-case of ~justbuild~, those first-party dependencies form a separate
-logical repository that is typically content fixed (e.g., because
-that dependency is versioned in a ~git~ repository).
-
-Moreover, code is typically first built (and tested) by the owning
-project before being used as a dependency. Therefore, if remote
-execution is used, for a first-party dependency, we expect all
-actions to be in cache. As dependencies are typically updated less
-often than the code being developed is changed, in most builds,
-the dependencies are in target-level cache. In other words, in a
-remote-execution setup, the whole code of dependencies is fetched
-just to walk through the action graph a single time to get the
-necessary cache hits.
-
-** Proposal: target-level caching as a service
-
-To avoid these unnecessary fetches, we add a new subcommand ~just
-serve~ that starts a service that provides the dependencies. This
-typically happens by looking up a target-level cache entry. If the
-entry, however, is not in cache, this also includes building the
-respective ~export~ target using an associated remote-execution
-end point.
-
-*** Scope: eligible ~export~ targets
-
-So that requests can typically be answered from cache, ~just serve~
-will refuse to handle requests that do not refer to ~export~ targets
-in content-fixed repositories; recall that a repository is content
-fixed only if all repositories reachable from it are content fixed.
-
-*** Communication through an associated remote-execution service
-
-Each ~just serve~ endpoint is always associated with a remote-execution
-endpoint. All artifacts exchanged between client and ~just serve~
-endpoint are exchanged via the CAS that is part of the associated
-remote-execution endpoint. This remote-execution endpoint is also
-used if ~just serve~ has to build targets.
-
-The associated remote-execution endpoint can well be the same
-process simultaneously acting as ~just execute~. In fact, this is
-the default if no remote-execution endpoint is specified.
-
-*** Protocol
-
-Communication is handled via ~grpc~ exchanging ~proto~ buffers
-containing the information described in the rest of this section.
-
-**** Main request and answer format
-
-A request is given by
-- the map of remote-execution properties for the designated
- remote-execution endpoint; together with the knowledge of the fixed
- endpoint, the ~just serve~ instance can compute the target-level
- cache shard, and
-- the identifier of the target-level cache key; it is the client's
- responsibility to ensure that the referred blob (i.e., the
- JSON object with appropriate values for the keys ~"repo_key"~,
- ~"target_name"~, and ~"effective_config"~) as well as the
- indirectly referred repository description (the JSON object the
- ~"repo_key"~ in the cache key refers to) are uploaded to CAS (of
- the designated remote-execution endpoint) beforehand.
-
-The answer to that request is the identifier of the corresponding
-target-level cache value (in the same format as for local target-level
-caching). The ~just serve~ instance will ensure that the actual
-value, as well as any directly or indirectly referenced artifacts,
-are available in the respective remote-execution CAS. Alternatively,
-the answer can indicate the kind of error (unknown root, not an
-export target, build failure, etc.).
-
-**** Auxiliary request: tree of a commit
-
-For ~git~ repositories, it is common to specify a commit in order
-to fix a dependency (even though the corresponding tree identifier
-would be enough). Moreover, the standard ~git~ protocol supports
-asking for the commit of a given remote branch, but additional
-overhead is needed in order to get the tree identifier.
-
-Therefore, in order to support clients (or, more precisely, ~just-mr~
-instances setting up the repository description) in constructing an
-appropriate request for ~just serve~ without unnecessary overhead,
-~just serve~ will support a second kind of request, where the
-client request consists of a ~git~ commit identifier and the server
-answers with the tree identifier for that commit if it is aware of
-that commit, or indicates that it is not aware of that commit.
-
-**** Auxiliary request: describe
-
-To support ~just describe~ also in the cases where code is
-delegated to the ~just serve~ endpoint, ~just serve~ accepts an
-additional request for the ~describe~ information of a target; as
-~just serve~ only handles ~export~ targets, this target necessarily
-has to be an ~export~ target.
-
-The request is given by the identifier of the target-level cache
-key, again with the promise that the referred blob is available
-in CAS. The answer is the identifier of a blob containing a JSON
-object with the needed information, i.e., those parts of the target
-description that are used by ~just describe~. Alternatively, the
-answer may indicate the kind of error (unknown root, not an export
-target, etc.).
-
-*** Sources: local git repositories and remote trees
-
-A ~just serve~ instance takes roots from various sources:
-- the ~git~ repository contained in the local build root,
-- additional ~git~ repositories, optionally specified in the
- invocation, and
-- as a last resort, asking the CAS in the designated remote-execution
- service for the specified ~git~ tree.
-
-Allowing a list of repositories as sources (rather than
-a single one) increases the effort when having to search for a
-specified tree (in case the requested ~export~ target is not in
-cache and an actual analysis of the build has to be carried out)
-or specific commit (in case a client asks for the tree of a given
-commit). However, it allows for the natural workflow of keeping
-separate upstream repositories in separate clones (updated in an
-appropriate way) without artificially putting them in a single
-repository (as orphan branches).
-
-Supporting building against trees from CAS allows more flexibility
-in defining roots that clients do not have to care about. In fact,
-they can be defined in any way, as long as
-- the client is aware of the git tree identifier of the root, and
-- some entity ensures the needed trees are known to the CAS.
-The auxiliary changes to ~just-mr~ described later in this document
-provide one possible way to handle archives in this way. Moreover,
-this additional flexibility will be necessary if we ever support
-computed roots, i.e., roots that are the output of a ~just~ build.
-
-*** Absent roots in ~just~ repository specification
-
-In order for ~just~ to know for which repositories to delegate
-the build to the designated ~just serve~ endpoint, the repository
-configuration for ~just~ can mark roots as absent; this is done
-by only giving the type as ~"git tree"~ (or the corresponding
-ignore-special variant thereof) and the tree identifier in the root
-specification, but no witnessing repository.
-
-Any repository containing an absent root has to be content fixed,
-but not all roots have to be absent (as ~just~ can always upload
-those trees to CAS). It is an error if, outside the computations
-delegated to ~just serve~, a non-export target is requested from a
-repository containing an absent root. Moreover, whenever there is
-a dependency on a repository containing an absent root, a ~just
-serve~ endpoint has to be specified in the invocation of ~just~.
-
-*** Auxiliary changes
-
-**** ~just-mr~ pragma ~"absent"~
-
-For ~just-mr~ to know how to construct the repository description,
-the description used by ~just-mr~ is extended. More precisely, a
-new key ~"absent"~ is allowed in the ~"pragma"~ dictionary of a
-repository description. If the specified value is true, ~just-mr~
-will generate an absent root out of this description, using all
-available means to generate that root without ever having to fetch
-the repository locally. In the typical case of a ~git~ repository,
-the auxiliary ~just serve~ function to obtain the tree of a commit
-is used. To allow this communication, ~just-mr~ also accepts the
-arguments describing a ~just serve~ endpoint and forwards them
-as early arguments to ~just~, in the same way as it does with
-~--local-build-root~.
-
-**** ~just-mr~ to inquire remote execution before fetching
-
-In line with the idea that fetching sources from upstream should
-happen only once and not once per developer, we add remote execution
-as another way of obtaining files to ~just-mr~. More precisely,
-~just-mr~ will support the options ~just~ accepts to connect to
-the remote CAS. When given, those will be forwarded to ~just~
-as early arguments (so that later ~just~-only ones can override
-them); moreover, when a file needed to set up a (present) root is
-found neither in local CAS nor in one of the specified distdirs,
-~just-mr~ will first ask the remote CAS for the missing file before
-trying to fetch it itself from the specified URL. The rationale for
-this search order is that the designated remote-execution service
-is typically reachable over the network in a more reliable way than
-external resources (while local resources do not require a network
-at all).
-
-**** ~just-mr~ to support new repository type ~git tree~
-
-A new repository type is added to ~just-mr~, called ~git tree~.
-Such a repository is given by
-- a ~git~ tree identifier, and
-- a command that, when executed in an empty directory (anywhere
- in the file system) will create in that directory a directory
- structure containing the specified ~git~ tree (either top-level
- or in some subdirectory). Moreover, that command does not modify
- anything outside the directory it is called in; it is an error
- if the specified tree is not created in this way.
-In this way, content-fixed repositories can be generated in a
-generic way, e.g., using other version-control systems or specialized
-artifact-fetching tools.
-
-Additionally, for archive-like repositories in the ~just-mr~
-repository specification (currently ~archive~ and ~zip~), a ~git~
-tree identifier can be specified. If the tree is known to ~just-mr~,
-or the ~"pragma"~ ~"absent"~ is given, it will just use that tree.
-Otherwise, it will fetch as usual, but error out if the obtained
-tree is not the promised one after unpacking and taking the specified
-subdirectory. In this way, archives can also be used as absent roots.
-
-**** ~just-mr fetch~ to support storing in remote-execution CAS
-
-The ~fetch~ subcommand of ~just-mr~ will get an additional option to
-support backing up the fetched information not to a local directory,
-but instead to the CAS of the specified remote-execution endpoint.
-This includes
-- all archives fetched, but also
-- all trees computed in setting up the respective repository
- description, both from ~git tree~ repositories and from
- archives.
-
-In this way, ~just-mr~ can be used to fill the CAS from one central
-point with all the information the clients need to treat all
-content-fixed roots as absent.