Ensuring reproducibility of the build
=====================================

Software builds should be
[reproducible](https://reproducible-builds.org/). *Justbuild*
supports this goal in local builds by isolating individual actions,
setting permissions and file time stamps to canonical values, etc.; most
remote execution systems take even further measures to ensure the
environment always looks the same to every action. Nevertheless, it is
always possible to break reproducibility with bad actions, coming both
from carelessly written rules and from ad-hoc actions added by the
`generic` target, such as

``` jsonc
...
, "version.h":
  { "type": "generic"
  , "cmds":
    ["echo '#define VERSION \"0.0.0.'`date +%Y%m%d%H%M%S`'\"' > version.h"]
  , "outs": ["version.h"]
  }
...
```

Besides time stamps there are many other sources of nondeterminism, like
properties of the build machine (name, number of CPUs available, etc.),
but also subtle ones like `readdir` order. Often, those non-reproducible
parts get buried deeply in a final artifact (like the version string
embedded in a binary contained in a compressed installation archive);
and, as long as the non-reproducible action stays in cache, it does not
even result in bad incrementality. Still, others won't be able to
reproduce the exact artifact.

There are tools like [diffoscope](https://diffoscope.org/) to deeply
compare archives and other container formats. Nevertheless, it is
desirable to find the root causes, i.e., the first (in topological
order) actions that yield a different output.

Rebuilding
----------

For the remainder of this section, we will consider the following
example project with the C++ source file `hello.cpp`:

``` {.cpp srcname="hello.cpp"}
#include <iostream>
#include <string>
#include "version.h"

int main(int argc, const char* argv[]) {
    if (argc > 1 && std::string{argv[1]} == "-v") {
        std::cout << VERSION << std::endl;
    }
    std::cout << "Hello world!\n";
    return 0;
}
```

and the following `TARGETS` file:

``` {.jsonc srcname="TARGETS"}
{ "":
  { "type": "install"
  , "files":
    { "bin/hello": "hello"
    , "share/hello/version.txt": "version.txt"
    , "share/hello/OUT.txt": "OUT.txt"
    }
  }
, "hello":
  { "type": ["@", "rules", "CC", "binary"]
  , "name": ["hello"]
  , "srcs": ["hello.cpp"]
  , "private-hdrs": ["version.h"]
  }
, "version.h":
  { "type": "generic"
  , "cmds":
    ["echo '#define VERSION \"0.0.0.'`date +%Y%m%d%H%M%S`'\"' > version.h"]
  , "outs": ["version.h"]
  }
, "version.txt":
  { "type": "generic"
  , "outs": ["version.txt"]
  , "cmds": ["./hello -v > version.txt"]
  , "deps": ["hello"]
  }
, "out.txt":
  { "type": "generic"
  , "outs": ["out.txt"]
  , "cmds": ["./hello > out.txt"]
  , "deps": ["hello"]
  }
, "OUT.txt":
  { "type": "generic"
  , "outs": ["OUT.txt"]
  , "cmds": ["tr a-z A-Z > OUT.txt < out.txt"]
  , "deps": ["out.txt"]
  }
}
```

The `repos.json` only needs the `"rules-cc"` repository and, as main
repository, the current working directory:

``` {.jsonc srcname="repos.json"}
{ "main": ""
, "repositories":
  { "rules-cc":
    { "repository":
      { "type": "git"
      , "branch": "master"
      , "commit": "7a2fb9f639a61cf7b7d7e45c7c4cea845e7528c6"
      , "repository": "https://github.com/just-buildsystem/rules-cc.git"
      , "subdir": "rules"
      }
    }
  , "":
    { "repository": {"type": "file", "path": "."}
    , "bindings": {"rules": "rules-cc"}
    }
  }
}
```

To search for the root cause of non-reproducibility, `just` has a
subcommand `rebuild`. It builds the specified target again, requesting
that every action be executed again (but target-level cache is still
active); then the result of every action is compared to the one in the
action cache, if present with the same inputs. So, you typically would
first `build` and then `rebuild`. Note that a repeated `build` simply
takes the action result from cache.

``` sh
$ touch ROOT
$ just-mr build
INFO: Performing repositories setup
INFO: Found 2 repositories involved
INFO: Setup finished, exec ["just","build","-C","..."]
INFO: Requested target is [["@","","",""],{}]
INFO: Analysed target [["@","","",""],{}]
INFO: Discovered 6 actions, 0 tree overlays, 1 trees, 0 blobs
INFO: Building [["@","","",""],{}].
INFO: Processed 6 actions, 0 cache hits.
INFO: Artifacts built, logical paths are:
        bin/hello [1910a58cdd5c270ca375b3222ec1e602b00dee73:18072:x]
        share/hello/OUT.txt [428b97b82b6c59cad7488b24e6b618ebbcd819bc:13:f]
        share/hello/version.txt [de0d4f12aeb65c9e0a52909a07b0638e16e112fd:34:f]
$ sleep 1
$ just-mr build
INFO: Performing repositories setup
INFO: Found 2 repositories involved
INFO: Setup finished, exec ["just","build","-C","..."]
INFO: Requested target is [["@","","",""],{}]
INFO: Analysed target [["@","","",""],{}]
INFO: Discovered 6 actions, 0 tree overlays, 1 trees, 0 blobs
INFO: Building [["@","","",""],{}].
INFO: Processed 6 actions, 6 cache hits.
INFO: Artifacts built, logical paths are:
        bin/hello [1910a58cdd5c270ca375b3222ec1e602b00dee73:18072:x]
        share/hello/OUT.txt [428b97b82b6c59cad7488b24e6b618ebbcd819bc:13:f]
        share/hello/version.txt [de0d4f12aeb65c9e0a52909a07b0638e16e112fd:34:f]
$ just-mr rebuild
INFO: Performing repositories setup
INFO: Found 2 repositories involved
INFO: Setup finished, exec ["just","rebuild","-C","..."]
INFO: Requested target is [["@","","",""],{}]
INFO: Analysed target [["@","","",""],{}]
INFO: Discovered 6 actions, 0 tree overlays, 1 trees, 0 blobs
INFO: Rebuilding [["@","","",""],{}].
WARN: Found flaky action:
       - id: 50e387d4d4c4dd9d8e6d08e1895c7dc729e5a4f3e7c7ad90cc93e373b5dea947
       - cmd: ["sh","-c","echo '#define VERSION \"0.0.0.'`date +%Y%m%d%H%M%S`'\"' > version.h\n"]
       - output 'version.h' differs:
         - [a3c9ccb6547a898c51c2d46cb651f2df668ef007:39:f] (rebuilt)
         - [d8a442743402f7b589e2c25f7981149eeaa1a8f8:39:f] (cached)
INFO: 2 actions compared with cache, 1 flaky actions found (0 of which tainted), no cache entry found for 4 actions.
INFO: Artifacts built, logical paths are:
        bin/hello [84d0282a5b1a9ab09638d02955ad1e92aa911103:18072:x]
        share/hello/OUT.txt [428b97b82b6c59cad7488b24e6b618ebbcd819bc:13:f]
        share/hello/version.txt [d15119f103c0c1322e759c5e9fe5ef45926036fa:34:f]

$
```

In the example, the second action compared to cache is the upper casing
of the output. Even though the generation of `out.txt` depends on the
non-reproducible `hello`, the file itself is reproducible. Therefore,
the follow-up actions are checked as well.
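
Why only two of the six actions are compared can be illustrated with a toy model. The following is a minimal sketch, not *justbuild*'s implementation: it assumes the six actions are the three `generic` ones for the text files, the one for `version.h`, and a compile and a link step for the binary (the name `hello.o` is hypothetical), and it identifies an action simply by its name and the hashes of its inputs.

``` python
import hashlib


def digest(data: str) -> str:
    """Content hash standing in for a blob identifier."""
    return hashlib.sha256(data.encode()).hexdigest()


# Toy stand-ins for the six actions, listed in topological order:
# name -> (input names, function of (input contents, run number)).
# Only "version.h" looks at the run number, mimicking the time stamp.
ACTIONS = {
    "version.h": ([], lambda ins, run: f"#define VERSION {run}"),
    "hello.o": (["version.h"], lambda ins, run: f"obj({ins[0]})"),
    "hello": (["hello.o"], lambda ins, run: f"bin({ins[0]})"),
    "version.txt": (["hello"], lambda ins, run: f"version of {ins[0]}"),
    "out.txt": (["hello"], lambda ins, run: "Hello world!"),
    "OUT.txt": (["out.txt"], lambda ins, run: ins[0].upper()),
}


def execute_all(run: int, cache: dict, compare: bool) -> dict:
    """Run every action; either fill the cache or compare against it."""
    outputs = {}
    report = {"compared": 0, "flaky": [], "cache misses": 0}
    for name, (inputs, action) in ACTIONS.items():
        ins = [outputs[i] for i in inputs]
        # An action is identified by what it does and its exact inputs.
        key = digest(name + "|" + "|".join(digest(i) for i in ins))
        out = action(ins, run)
        outputs[name] = out  # the fresh output feeds the later actions
        if not compare:
            cache[key] = digest(out)
        elif key not in cache:
            report["cache misses"] += 1
        else:
            report["compared"] += 1
            if cache[key] != digest(out):
                report["flaky"].append(name)
    return report


cache = {}
execute_all(run=1, cache=cache, compare=False)          # like `build`
report = execute_all(run=2, cache=cache, compare=True)  # like `rebuild`
print(report)
```

In this model, everything that consumes the changed `version.h` or the changed binary has new inputs and therefore no cache entry, whereas the action producing `OUT.txt` sees an unchanged `out.txt` and can be compared.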

For this simple example, reading the console output is enough to
understand what's going on. However, checking for reproducibility
usually is part of a larger quality-assurance process. To support the
automation of such processes, the findings can also be reported in
machine-readable form.

``` sh
$ just-mr rebuild --dump-flaky flakes.json --dump-graph actions.json
[...]
$ cat flakes.json
{
  "cache misses": [
    "059fc6b8047bbaf6353f5813be72e387406dd9a171da1f628b167785ed710f84",
    "d2ae0c3a1b3e588e531ff9624def1dbddff9e61b185888602704854f2ab6338d",
    "1c7636801667a48bbb0fbd5fa5404dbff32d92150a6d6fb54b8d48f9ca648271",
    "8ae961996bd2c4c03afb29549053dc9a9cd8d0cc12a0e58aade87159e133c528"
  ],
  "flaky actions": {
    "50e387d4d4c4dd9d8e6d08e1895c7dc729e5a4f3e7c7ad90cc93e373b5dea947": {
      "version.h": {
        "cached": {
          "file_type": "f",
          "id": "d8a442743402f7b589e2c25f7981149eeaa1a8f8",
          "size": 39
        },
        "rebuilt": {
          "file_type": "f",
          "id": "6fe7020f82b32335ee3478e8f7628e293c995139",
          "size": 39
        }
      }
    }
  }
}$
```

The file reports the flaky actions together with the non-reproducible
artifacts they generated, reporting both the cached and the newly
generated output; it also lists the actions for which no cache entry
was found. The files themselves can be obtained via `just
install-cas` as usual, allowing deeper comparison of the outputs. The
full definitions of the actions can be found in the action graph, which
in the example is dumped as well, as `actions.json`; this definition
also includes the origins of each action, i.e., the configured targets
that requested the respective action.
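
As an illustration of how such a report could be consumed in an automated check, here is a minimal sketch. It relies only on the keys visible in the `flakes.json` shown above; the function name `summarize_flakes` is hypothetical, and the object identifiers are formatted in the `[id:size:type]` style of the console output.

``` python
import json


def summarize_flakes(report: dict) -> list:
    """Return one (action id, path, cached, rebuilt) tuple per differing output."""

    def object_id(entry: dict) -> str:
        # Same notation as in the console output, e.g. [d8a4...:39:f]
        return f"[{entry['id']}:{entry['size']}:{entry['file_type']}]"

    rows = []
    for action_id, outputs in sorted(report.get("flaky actions", {}).items()):
        for path, versions in sorted(outputs.items()):
            rows.append((action_id, path,
                         object_id(versions["cached"]),
                         object_id(versions["rebuilt"])))
    return rows


# Copy of the flaky part of the report shown above; a real script
# would read the file written by --dump-flaky instead.
SAMPLE = json.loads("""
{ "cache misses": [],
  "flaky actions": {
    "50e387d4d4c4dd9d8e6d08e1895c7dc729e5a4f3e7c7ad90cc93e373b5dea947": {
      "version.h": {
        "cached": {"file_type": "f",
                   "id": "d8a442743402f7b589e2c25f7981149eeaa1a8f8",
                   "size": 39},
        "rebuilt": {"file_type": "f",
                    "id": "6fe7020f82b32335ee3478e8f7628e293c995139",
                    "size": 39}
      }
    }
  }
}
""")

rows = summarize_flakes(SAMPLE)
for action_id, path, cached, rebuilt in rows:
    print(f"{action_id[:12]} {path}: {cached} (cached) vs {rebuilt} (rebuilt)")
```

A quality-assurance job could, for instance, fail whenever `rows` is non-empty and attach the printed identifiers to its log for later inspection.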

Comparing build environments
----------------------------

Simply rebuilding on the same machine is a good way to detect embedded
time stamps of sufficiently small granularity; for other sources of
non-reproducibility, however, more modifications of the environment are
necessary.

A simple but effective way to modify the build environment is the
option `-L`, which sets the local launcher: a list of strings that is
prepended to the argument vector before the action is executed. The
default `["env", "--"]` simply resolves the program to be executed in
the current value of `PATH`, but a different value for the launcher can
be used to set environment variables like `LD_PRELOAD`.
Relevant libraries and tools include
[libfaketime](https://github.com/wolfcw/libfaketime),
[fakehostname](https://github.com/dtcooper/fakehostname), and
[disorderfs](https://salsa.debian.org/reproducible-builds/disorderfs).
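
To make the launcher idea concrete, the following sketch composes a launcher value that preloads libfaketime. It is an illustration under assumptions: the library path is hypothetical and depends on the installation, the helper name `faketime_launcher` is made up, and `FAKETIME` with a relative offset such as `-15d` is the variable described in the libfaketime documentation. The launcher is given as a JSON list, like the default quoted above.

``` python
import json


def faketime_launcher(library: str, faketime: str) -> str:
    """Return a JSON launcher that preloads libfaketime for each action."""
    # "env -- NAME=VALUE ... program args" sets the variables and then
    # looks up the program in PATH, just as the default launcher does.
    return json.dumps(
        ["env", "--", f"LD_PRELOAD={library}", f"FAKETIME={faketime}"])


# Hypothetical library path; adjust it to where libfaketime is installed.
launcher = faketime_launcher("/usr/lib/faketime/libfaketime.so.1", "-15d")
command = ["just-mr", "rebuild", "-L", launcher]
print(command)
```

Shifting the clock by several days in this way is meant to expose time stamps of coarse granularity, such as an embedded build date, which a rebuild a few seconds later would not reveal.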

More variation can be achieved by comparing remote-execution builds,
either by comparing two different remote-execution endpoints or by
comparing one remote-execution endpoint to the local build. The latter
is also a good way to find out where a build that "works on my machine"
differs. The endpoint on which the rebuild is executed can be set with
the `-r` option, in the same way as for `build`; the cache endpoint to
compare against can be set via the `--vs` option.
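
A small sketch of how a script might assemble such a comparison is given below. The endpoint addresses are placeholders, the helper `rebuild_command` is hypothetical, and the exact value syntax accepted by `-r` and `--vs` should be taken from the tool's own documentation.

``` python
def rebuild_command(execute_on: str, compare_with: str) -> list:
    """Assemble a rebuild invocation for two given endpoints."""
    return ["just-mr", "rebuild", "-r", execute_on, "--vs", compare_with]


# Placeholder addresses of two remote-execution services.
command = rebuild_command("remote-a.example.org:8980",
                          "remote-b.example.org:8980")
print(" ".join(command))
```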