summaryrefslogtreecommitdiff
path: root/src/buildtool/file_system
AgeCommit message (Collapse)Author
2024-03-08just-mr: Fix shell out executionOliver Reiche
... by avoiding reusing temp dirs for execute. While we are at it, also refactor LocalFetchViaTmpRepo() to create its own empty temp dirs, that cannot be reused by the caller.
2024-02-16async maps: Create utility library to handle cycle detectionPaul Cristian Sarbu
2024-01-31serve source tree: Increase server-side granularity in response statusesPaul Cristian Sarbu
For archives and Git repositories we should ensure that not finding the witnessing entity (archive content blob or Git commit, respectively) results in a distinct status in the response to a request that sets up roots on the serve endpoint. This will allow just-mr to better handle its interaction with the serve endpoint.
2024-01-08GitRepo: Add blob lookup methodPaul Cristian Sarbu
2023-12-12Filesystem: Fix copy overwrite of symlink with fileOliver Reiche
... and improve log messages in case of failure.
2023-12-05FileRoot: Add ignore-special flag getter and remove unneeded inliningPaul Cristian Sarbu
2023-12-05just main: Move root parser into static FileRoot method...Paul Cristian Sarbu
...to make it available also for setting up 'just serve' builds.
2023-11-30Resolve inconsistencies in third-party headers include formatPaul Cristian Sarbu
2023-11-30Add files to CAS without fully keeping them in memoryKlaus Aehlig
2023-11-30ObjectType: Add noexcept specifierPaul Cristian Sarbu
2023-11-30FileRoot: Fix content description for ignore-special rootsPaul Cristian Sarbu
Ignore-special git-tree-based roots are still content defined, so one should use the correct marker in the JSON description of the root that will be stored in the repository description of target cache keys. This commit fixes the issue and improves documentation.
2023-11-20Remove O_SYNC from low-level file-writing flags.Sascha Roloff
The man page for open(2) says the following to the O_SYNC flag: 'O_SYNC provides synchronized I/O file integrity completion, meaning write operations will flush data and all associated metadata to the underlying hardware.' This flag results in a high delay when files are stored in casx, e.g., several seconds for medium-sized files such as 23 MB. Since just does not care about persistency, this strong synchronization mechanism is not required and is deactivated.
2023-11-15FileRoot: Add new absent root underlying type variantPaul Cristian Sarbu
Absent roots are characterised only by a Git tree hash, so a new variant of the underlying stored information was added in the form of a plain string. In order to avoid unwanted implicit conversions when instantiating via literal strings, we force callers of the constructors to explicitly differentiate between plain strings and filesystem paths. Existing tests were updated to reflect this. Co-authored-by: Alberto Sartori <alberto.sartori@huawei.com>
2023-11-15TargetCache: add new member function WithShard(shard) that returns a new ↵Alberto Sartori
TargetCache... ...backed by the same CAS, but the FileStorage uses the given shard. This is particularly useful for the just-serve server implementation, since the sharding must be performed according to the client's request and not following the server configuration.
2023-11-13bugfix: Also unlink symlinks before installingOliver Reiche
Make sure that all CopyFile, WriteFile, and CreateSymlink functions properly unlink the target file (if it exists and overwrite requested) to avoid interferences of the install command. With this change, the clean up step for install-cas and the within GraphTraverser can new be omitted.
2023-11-02Decoupling symlinks map and CAS utilities from just-mrPaul Cristian Sarbu
This is required in order to make them available to 'just serve' in a minimal just installation.
2023-11-02GitRepo: Add method for async fetch from local repositoryPaul Cristian Sarbu
This avoids using the more geenric GitRepoRemote method which has libcurl as a dependency, something that is not needed for this Git operation.
2023-11-02GitRepo: Add blob existence checkerPaul Cristian Sarbu
Also fixes a small typo in tree existence checker log messages.
2023-08-08just execute: Fix uncollected upwards symlinksPaul Cristian Sarbu
Upwards symlinks should still be collected from actions, even if only the non-upwards symlinks are supported artifact types. The client side is thus the one responsible with enforcing the non-upwardness condition.
2023-08-07GitRepo: Add method for reading object from tree by its pathPaul Cristian Sarbu
2023-07-11filesystem: Avoid unwanted indirections...Paul Cristian Sarbu
...that std::filesystem::* calls produce. This is because existence and type checks use almost exclusively std::filesystem::status, which follows symbolic links, when being called with path arguments. Instead, one should instead use these methods with the value returned by a call of std::filesystem::symlink_status. This commit also streamlines the FileSystemManager tests, as well as replace bare calls to std::filesystem with their FileSystemManager counterparts (where suitable).
2023-07-11Git: Fix handling of symlinks in tree artifactsPaul Cristian Sarbu
The introduction of non-upwards symlinks as first-class objects should have updated the handling of known git tree artifacts containing symlinks. In particular, one should consider trees in their entirety when uploading (irrespective of the ignore_special flag), and git trees should only be reported as known only if the ignore_special flag is set to false.
2023-07-10just-mr: Fix handling of .gitignore files in git repositoriesPaul Cristian Sarbu
The command 'git add .' does not include paths found in .gitignore files in the directory tree where the command is issued. This is not the desired behaviour, as we expect for a tree with a given commit id to contain all of the entries, irrespective of their meaning to Git. This commit addresses the issue as described. For the just-mr.py script we modified the staging command to 'git add -f .'. For the compiled just-mr, simply adding the force flag to 'git_index_add_all' did not work as intended for files found in ignored subdirectories. This is a known libgit2 issue which has been fixed in v1.6.3. Until we can upgrade our libgit2 version, a workaround was implemented: we recursively read the directory entries ourselves and add each of them iteratively using 'git_index_add_bypath', making sure to ignore the root '.git' subtree (which cannot be staged). At the moment the handling of Git submodules remains an open issue, as Git does not allow '.git' subtrees to be forcefully added to the index, and thus such directory entries will currently not be considered as part of a git tree. This however is consistent behavior between Git and libgit2.
2023-07-10FileSystemManager: Add recursive directory entries reader...Paul Cristian Sarbu
...allowing the skipping of certain subtrees if needed. This is useful, e.g., in simulating what a 'git add' call would do, which ignores all '.git' subdirectories. Also adds a corresponding test for the new method.
2023-06-26Allow non-upwards symlinks with Git APIPaul Cristian Sarbu
2023-06-26Allow non-upwards symlinks with local APIPaul Cristian Sarbu
2023-06-26bazel_msg_factory: Allow non-upwards symlinks in uploaded treesPaul Cristian Sarbu
2023-06-26FileRoot: Add handling of non-upwards symlink...Paul Cristian Sarbu
...and update tests accordingly.
2023-06-26ReadTree: Add check for non-upwards symlinks...Paul Cristian Sarbu
...as early as possible. This ensures that callers always receive only the tree entries for the supported object types. For the symlinks non-upwardness check we pass a lambda capturing the real backend of the tree entries, such that the symlinks can be read. Updates git_tree tests accordingly.
2023-06-26ObjectType: Add non-upwards symlinks as a known object type...Paul Cristian Sarbu
...but make sure it is still considered a special type. The only non-special entry types remain file, executable, and tree.
2023-06-26filesystem: Add logic for handling (non-upwards) symlinksPaul Cristian Sarbu
2023-06-09Git CAS: report absence of CAS at debug levelKlaus Aehlig
A git CAS ist just a fall back, so it is OK if it is absent (e.g., the specificed directory does not exist). Therefore only log at debug level, not at error level if we cannot open it.
2023-06-07file_system: Drop unused template argumentOliver Reiche
... which became obsolete with the new fdless write/copy implementations.
2023-06-07file_system: Avoid malloc in 'fdless' copy/writeOliver Reiche
... to remove the risk of deadlocks on certain combinations of C++ standard library and libc when performing the copy/write in a child process. For 'fdless' copy/write, a child process is used to prevent the parent from getting polluted with open writable file descriptors (which might get inherited by other children that keep them open and can cause EBUSY errors).
2023-06-06style: Use designated initializersPaul Cristian Sarbu
This feature has been introduced with C++20.
2023-06-05git cas: only compute absolute path if not absolute alreadyKlaus Aehlig
... and in this way, continue to work correctly in the absence of a current working directory.
2023-05-31just: Add handling of ignore-special rootsPaul Cristian Sarbu
2023-05-31FileRoot: Add ignore-special roots logicPaul Cristian Sarbu
2023-05-15memcheck: fix race in libgit2...Paul Cristian Sarbu
...caused by incorrectly setting and resetting the library internal state and the misuse of pthreads in libgit2. Normally, git_libgit2_init and git_libgit2_shutdown should span the life of a worker thread in order to be safely used. However, due to an incorrect implementation of libgit2's threadstate with pthreads, on unix systems there is a race condition. Until the use of pthread_key_t is corrected in libgit2, we need to apply a workaround by always ensuring that the main thread is the first thread reaching the GitContext constructor.
2023-05-05GitTree: Check optional before accessing itOliver Reiche
... and drop unecessary IsTree() check.
2023-04-26imports: Switch to Microsoft GSL implementationOliver Reiche
... with two minor code base changes compared to previous use of gsl-lite: - dag.hpp: ActionNode::Ptr and ArtifactNode::Ptr are not wrapped in gsl::not_null<> anymore, due to lack of support for wrapping std::unique_ptr<>. More specifically, the move constructor is missing, rendering it impossible to use std::vector<>::emplace_back(). - utils/cpp/gsl.hpp: New header file added to implement the macros ExpectsAudit() and EnsureAudit(), asserts running only in debug builds, which were available in gsl-lite but are missing in MS GSL.
2023-03-31git tree: degrade log level for missing entry in a git tree to debugKlaus Aehlig
... as this is only an internal functionality, and the caller will take care of a proper error message if the absence of that entry is not expected.
2023-03-30GitRepo: Guard fake repository odb wrappingPaul Cristian Sarbu
In the current libgit2 implementation, a fake repository wrapped around an existing odb is being registered as owner the same way as a normal repository object. Therefore, one has to guard both the creation and destruction of the fake repository against all other git operations that might access the internal cache during this transfer of ownership.
2023-03-23GitRepo: Make tag creation operation more robustPaul Cristian Sarbu
Use a similar logic as for repository initialisation: first check if tag has not already been created in another process, and only then try creation; make more tries with more wait in between; only retry if failure was due to internal locking.
2023-03-23GitRepo: Add proper error message for keep tag operationPaul Cristian Sarbu
2023-03-23GitRepo: Make repository initialisation more robustPaul Cristian Sarbu
As the initialisation of Git repositories is something that only takes place once, we should check early and cheaply whether the repository is already there before trying to initialize it. If we do need to initilize a repo, we can afford more attempts and longer wait times between tries to initalize if the failure to initialize happens due to the internal Git locking mechanism.
2023-03-23GitRepo: Add proper error message for repository initialisationPaul Cristian Sarbu
2023-03-23GitRepo: Make repository path usage explicitPaul Cristian Sarbu
Opening a repo should not check parent directories, only try to open at given path.
2023-03-23targets: Fix deps structurePaul Cristian Sarbu
2023-03-22just-mr: Shell out to system Git for fetches over SSH...Paul Cristian Sarbu
...due to limited SSH support in libgit2. In order to allow the fetches to still be parallel, we execute: git fetch --no-auto-gc --no-write-fetch-head <repo> [<branch>] This only fetches the packs without updating any refs, at the slight cost of sometimes fetching some redundant information, which for our purposes is practically a non-issue. (If really needed, a 'git gc' call can be done eventually to try to compact the fetched packs, although a save in disk space is not actually guaranteed.)