summaryrefslogtreecommitdiff
path: root/doc/future-designs/symlinks.org
blob: 47ca5063cbf7ba0bd7bbcd8833b97841e43e4a98 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
* Symbolic links

** Background

Besides files and directories, symbolic links are also an important
entity in the file system. Also ~git~ natively supports symbolic
links as entries in a tree object. Technically, a symbolic link
is a string that can be read via ~readlink(2)~. However, they can
also be followed and functions to access a file, like ~open(2)~ do
so by default. When following a symbolic link, both, relative and
absolute, names can be used.

** Symbolic links in build systems

*** Follow and reading both happen

Compilers usually follow symlinks for all inputs. Archivers (like
~tar(1)~ and package-building tools) usually read the link in order
to package the link itself, rather than the file referred to (if
any). As a generic build system, it is desirable to not have to make
assumptions on the intention of the program called (and hence the
way it deals with symlinks). This, however, has the consequence that
only symbolic links themselves can properly model symbolic links.

*** Self-containedness and location-independence of roots

From a build-system perspective, a root should be self-contained; in
fact, the target-level caching assumes that the git tree identifier
entirely describes a ~git~-tree root. For this to be true, such a
root has to be both, self contained and independent of its (assumed)
location in the file system. In particular, we can neither allow
absolute symbolic links (as they, depending on the assumed location,
might point out of the root), nor relative symbolic links that go
upwards (via a ~../~ reference) too far.

*** Symbolic links in actions

Like for source roots, we understand action directories as self
contained and independent of their location in the file system.
Therefore, we have to require the same restrictions there as well,
i.e., neither absolute symbolic links nor relative symbolic links
going up too far.

Allowing all relative symbolic links that don't point outside the
action directory, however, poses an additional layer of complications
in the definition of actions: a string might be allowed as symlink
in some places in the action directory, but not in others; in
particular, we can't tell only from the information that an artifact
is a relative symlink whether it can be safely placed at a particular
location in an action or not. Similarly for trees for which we only
know that they might contain relative symbolic links.

*** Presence of symbolic links in system source trees

It can be desirable to use system libraries or tools as dependencies.
A typical use case, but not the only one, is packaging a tool for a
distribution. An obvious approach is to declare a system directory
as a root of a repository (providing the needed target files in a
separate root). As it turns out, however, those system directories
do contain symbolic links, e.g., shared libraries pointing to
the specific version (like ~libfoo.so.3~ as a symlink pointing to
~libfoo.so.3.1.4~) or detours through ~/etc/alternatives~.

** Implemented stop-gap: "shopping list" for bootstrapping

As a stop-gap measure to support building the tool itself against
pre-installed dependencies with the respective directories containing
symbolic links, or tools (like ~protoc~) being symbolic links (e.g.,
to the specific version), repositories can specify, in the ~"copy"~
attribute of the ~"local_bootstrap"~ parameter, a list of files
and directories to be copied as part of the bootstrapping process
to a fresh clean directory serving as root; during this copying,
symlinks are followed.

** Proposed treatment of symbolic links

*** "Ignore-special" roots

To allow working with source trees containing symbolic links, we
extend the existing roots by "ignore-special" versions thereof. In
such a root (regardless whether file based, or ~git~-tree based),
everything not a file or a directory will be pretended to be absent.
For any compile-like tasks, the effect of symlinks can be modeled
by appropriate staging.

As certain entries have to be ignored, source trees can only be
obtained by traversing the respective tree; in particular, the
~TREE~ reference is no longer constant time on those roots, even
if ~git~-tree based. Nevertheless, for ~git~-tree roots, the
effective tree is a function of the ~git~-tree of the root, so
~git~-tree-based ignore-special roots are content fixed and hence
eligible for target-level caching.

*** Accepting non-upwards relative symlinks as first-class objects

Finally, a restricted form of symlinks, more precisely relative
non-upwards symbolic links, will be added as first-class object.
That is, a new artifact type (besides blobs and trees) for relative
non-upwards symbolic links is added. Like any other artifact they
can be freely placed into the inputs of an action, as well as in
artifacts, runfiles, or provides map of a target. Artifacts of this
new type can be defined as
- source-symlink reference, as well as implicitly as part of a
  source tree,
- as a symlink output of an action, as well as implicitly as part
  of a tree output of an action, and
- explicitly in the rule language from a string through a new
  ~SYMLINK~ constructor function.