summaryrefslogtreecommitdiff
path: root/doc/future-designs/symlinks.org
diff options
context:
space:
mode:
Diffstat (limited to 'doc/future-designs/symlinks.org')
-rw-r--r--doc/future-designs/symlinks.org107
1 files changed, 107 insertions, 0 deletions
diff --git a/doc/future-designs/symlinks.org b/doc/future-designs/symlinks.org
new file mode 100644
index 00000000..d7e87c95
--- /dev/null
+++ b/doc/future-designs/symlinks.org
@@ -0,0 +1,107 @@
+* Symbolic links
+
+** Background
+
+Besides files and directories, symbolic links are also an important
+entity in the file system. Also ~git~ natively supports symbolic
+links as entries in a tree object. Technically, a symbolic link
+is a string that can be read via ~readlink(2)~. However, they can
+also be followed and functions to access a file, like ~open(2)~ do
+so by default. When following a symbolic link, both, relative and
+absolute, names can be used.
+
+** Symbolic links in build systems
+
+*** Follow and reading both happen
+
+Compilers usually follow symlinks for all inputs. Archivers (like
+~tar(1)~ and package-building tools) usually read the link in order
+to package the link itself, rather than the file referred to (if
+any). As a generic build system, it is desirable to not have to make
+assumptions on the intention of the program called (and hence the
+way it deals with symlinks). This, however, has the consequence that
+only symbolic links themselves can properly model symbolic links.
+
+*** Self-containedness and location-independence of roots
+
+From a build-system perspective, a root should be self-contained; in
+fact, the target-level caching assumes that the git tree identifier
+entirely describes a ~git~-tree root. For this to be true, such a
+root has to be both, self contained and independent of its (assumed)
+location in the file system. In particular, we can neither allow
+absolute symbolic links (as they, depending on the assumed location,
+might point out of the root), nor relative symbolic links that go
+upwards (via a ~../~ reference) too far.
+
+*** Symbolic links in actions
+
+Like for source roots, we understand action directories as self
+contained and independent of their location in the file system.
+Therefore, we have to require the same restrictions there as well,
+i.e., neither absolute symbolic links nor relative symbolic links
+going up too far.
+
+Allowing all relative symbolic links that don't point outside the
+action directory, however, poses an additional layer of complications
+in the definition of actions: a string might be allowed as symlink
+in some places in the action directory, but not in others; in
+particular, we can't tell only from the information that an artifact
+is a relative symlink whether it can be safely placed at a particular
+location in an action or not. Similarly for trees for which we only
+know that they might contain relative symbolic links.
+
+*** Presence of symbolic links in system source trees
+
+It can be desirable to use system libraries or tools as dependencies.
+A typical use case, but not the only one, is packaging a tool for a
+distribution. An obvious approach is to declare a system directory
+as a root of a repository (providing the needed target files in a
+separate root). As it turns out, however, those system directories
+do contain symbolic links, e.g., shared libraries pointing to
+the specific version (like ~libfoo.so.3~ as a symlink pointing to
+~libfoo.so.3.1.4~) or detours through ~/etc/alternatives~.
+
+** Proposed treatment of symbolic links
+
+*** "Shopping list" for bootstrapping against pre-compiled dependencies
+
+As a stop-gap measure to support building the tool itself against
+pre-installed dependencies with the respective directories containing
+symbolic links, or tools (like ~protoc~) being symbolic links (e.g.,
+to the specific version), repositories can specify a list of files
+and directories to be copied as part of the bootstrapping process
+to a fresh clean directory serving as root; during this copying,
+symlinks are followed.
+
+*** "Ignore-special" roots
+
+To allow working with source trees containing symbolic links, we
+extend the existing roots by "ignore-special" versions thereof. In
+such a root (regardless whether file based, or ~git~-tree based),
+everything not a file or a directory will be pretended to be absent.
+For any compile-like tasks, the effect of symlinks can be modeled
+by appropriate staging.
+
+As certain entries have to be ignored, source trees can only be
+obtained by traversing the respective tree; in particular, the
+~TREE~ reference is no longer constant time on those roots, even
+if ~git~-tree based. Nevertheless, for ~git~-tree roots, the
+effective tree is a function of the ~git~-tree of the root, so
+~git~-tree-based ignore-special roots are content fixed and hence
+eligible for target-level caching.
+
+*** Accepting non-upwards relative symlinks as first-class objects
+
+Finally, a restricted form of symlinks, more precisely relative
+non-upwards symbolic links, will be added as first-class object.
+That is, a new artifact type (besides blobs and trees) for relative
+non-upwards symbolic links is added. Like any other artifact they
+can be freely placed into the inputs of an action, as well as in
+artifacts, runfiles, or provides map of a target. Artifacts of this
+new type can be defined as
+- source-symlink reference, as well as implicitly as part of a
+ source tree,
+- as a symlink output of an action, as well as implicitly as part
+ of a tree output of an action, and
+- explicitly in the rule language from a string through a new
+ ~SYMLINK~ constructor function.