author | Paul Cristian Sarbu <paul.cristian.sarbu@huawei.com> | 2023-06-27 10:41:11 +0200 |
---|---|---|
committer | Paul Cristian Sarbu <paul.cristian.sarbu@huawei.com> | 2023-06-28 18:28:30 +0200 |
commit | 1733e00fd739c59c1708e36fde6d2d9caf6c7322 (patch) | |
tree | 755d9c75d9de74b1b862d553cf13667c61fe24f8 /doc/concepts | |
parent | 902d1c66ad7961896206fac0ab52625495128820 (diff) | |
doc: Move non-upwards symlinks to concepts
Also link it in the README.
Diffstat (limited to 'doc/concepts')
-rw-r--r-- | doc/concepts/symlinks.md | 119 |
1 files changed, 119 insertions, 0 deletions
diff --git a/doc/concepts/symlinks.md b/doc/concepts/symlinks.md
new file mode 100644
index 00000000..34691ef3
--- /dev/null
+++ b/doc/concepts/symlinks.md
@@ -0,0 +1,119 @@
+Symbolic links
+==============
+
+Background
+----------
+
+Besides files and directories, symbolic links are also an important
+entity in the file system. Also `git` natively supports symbolic links
+as entries in a tree object. Technically, a symbolic link is a string
+that can be read via `readlink(2)`. However, they can also be followed
+and functions to access a file, like `open(2)` do so by default. When
+following a symbolic link, both, relative and absolute, names can be
+used.
+
+Symbolic links in build systems
+-------------------------------
+
+### Follow and reading both happen
+
+Compilers usually follow symlinks for all inputs. Archivers (like
+`tar(1)` and package-building tools) usually read the link in order to
+package the link itself, rather than the file referred to (if any). As a
+generic build system, it is desirable to not have to make assumptions on
+the intention of the program called (and hence the way it deals with
+symlinks). This, however, has the consequence that only symbolic links
+themselves can properly model symbolic links.
+
+### Self-containedness and location-independence of roots
+
+From a build-system perspective, a root should be self-contained; in
+fact, the target-level caching assumes that the git tree identifier
+entirely describes a `git`-tree root. For this to be true, such a root
+has to be both, self contained and independent of its (assumed) location
+in the file system. In particular, we can neither allow absolute
+symbolic links (as they, depending on the assumed location, might point
+out of the root), nor relative symbolic links that go upwards (via a
+`../` reference) too far.
+
+### Symbolic links in actions
+
+Like for source roots, we understand action directories as self
+contained and independent of their location in the file system.
+Therefore, we have to require the same restrictions there as well, i.e.,
+neither absolute symbolic links nor relative symbolic links going up too
+far.
+
+Allowing all relative symbolic links that don't point outside the
+action directory, however, poses an additional layer of complications in
+the definition of actions: a string might be allowed as symlink in some
+places in the action directory, but not in others; in particular, we
+can't tell only from the information that an artifact is a relative
+symlink whether it can be safely placed at a particular location in an
+action or not. Similarly for trees for which we only know that they
+might contain relative symbolic links.
+
+### Presence of symbolic links in system source trees
+
+It can be desirable to use system libraries or tools as dependencies. A
+typical use case, but not the only one, is packaging a tool for a
+distribution. An obvious approach is to declare a system directory as a
+root of a repository (providing the needed target files in a separate
+root). As it turns out, however, those system directories do contain
+symbolic links, e.g., shared libraries pointing to the specific version
+(like `libfoo.so.3` as a symlink pointing to `libfoo.so.3.1.4`) or
+detours through `/etc/alternatives`.
+
+Bootstrapping "shopping list" option
+---------------------------------------------------------
+
+In order to more easily support building the tool itself against
+pre-installed dependencies with the respective directories containing
+symbolic links, or tools (like `protoc`) being symbolic links (e.g., to
+the specific version), repositories can specify, in the `"copy"`
+attribute of the `"local_bootstrap"` parameter, a list of files and
+directories to be copied as part of the bootstrapping process to a fresh
+clean directory serving as root; during this copying, symlinks are
+followed.
+
+Our treatment of symbolic links
+------------------------------------
+
+### "Ignore-special" roots
+
+To allow working with source trees containing symbolic links, all the existing
+roots have "ignore-special" versions thereof. In such a root
+(regardless whether file based, or `git`-tree based), everything
+not a file or a directory is pretended to be absent. For any
+compile-like tasks, the effect of symlinks can be modeled by appropriate
+staging.
+
+As certain entries have to be ignored, source trees can only be obtained
+by traversing the respective tree; in particular, the `TREE` reference
+is no longer constant time on those roots, even if `git`-tree based.
+Nevertheless, for `git`-tree roots, the effective tree is a function of
+the `git`-tree of the root, so `git`-tree-based ignore-special roots are
+content fixed and hence eligible for target-level caching.
+
+### Non-upwards relative symlinks as first-class objects
+
+Finally, a restricted form of symlinks, more precisely *relative*
+*non-upwards symbolic links*, exist as first-class objects.
+That is, a new artifact type (besides blobs and trees) for relative
+non-upwards symbolic links has been introduced. Like any other artifact,
+they can be freely placed into the inputs of an action, as well as in artifacts,
+runfiles, or provides map of a target. Artifacts of this new type can be
+defined as:
+
+ - source-symlink reference, as well as implicitly as part of a source
+   tree,
+ - as a symlink output of an action, as well as implicitly as part of a
+   tree output of an action, and
+ - explicitly in the rule language from a string through a new
+   `SYMLINK` constructor function.
+
+While existing as separate artifacts in order to properly stage them, (relative
+non-upwards) symbolic links are, in many aspects, simple files with elevated
+permissions. As such, they locally use the existing file CAS. Remotely, the
+existing execution protocol already allows the handling of symbolic links via
+corresponding Protobuf messages, therefore no extensions are needed.
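
The restriction to *relative non-upwards* symbolic links described in the added document can be illustrated with a small lexical check. The sketch below is not taken from the justbuild sources; the function name `is_non_upwards` is hypothetical, and it assumes that "non-upwards" means the link target is a relative path whose components, read left to right, never step above the directory containing the link. Rejecting the empty string is a further assumption of this sketch.

```python
from pathlib import PurePosixPath


def is_non_upwards(target: str) -> bool:
    """Hypothetical helper: decide, purely from the link string, whether
    `target` is a relative path that never climbs above its starting
    directory. No file system access is performed."""
    path = PurePosixPath(target)
    # Assumption: an empty target is not a usable link; absolute targets
    # depend on the assumed location of the root and are rejected.
    if not target or path.is_absolute():
        return False
    depth = 0  # number of directory levels below the link's own directory
    for part in path.parts:  # pathlib already drops "." components
        if part == "..":
            depth -= 1
            if depth < 0:  # stepped above the starting directory
                return False
        else:
            depth += 1
    return True


# Example calls (targets borrowed from the text above where possible)
is_non_upwards("libfoo.so.3.1.4")        # True: sibling file
is_non_upwards("a/../b")                 # True: stays at or below the start
is_non_upwards("../b")                   # False: goes upwards
is_non_upwards("/etc/alternatives/foo")  # False: absolute
```

Because the decision depends only on the link string and not on where the link is staged, a check of this kind gives the same answer at every location in an action directory, which is the property the text asks of a first-class symlink artifact.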