IRC logs for #buildstream for Sunday, 2022-12-04

*** tristan <tristan!tristan@223.62.169.153> has joined #buildstream [05:08]
*** ChanServ sets mode: +o tristan [05:08]
*** tristan <tristan!tristan@223.62.169.153> has quit IRC [07:21]
<doras> juergbi: was something like hardlink checkout + overlayfs considered as an alternative CAS protection mechanism instead of FUSE? [15:08]
<doras> For staging, I mean. [15:11]
<juergbi> doras: overlayfs was not available to unprivileged users back then, which essentially ruled this out. however, I think it has been available since linux 5.11, so it may be worth considering as an alternative [15:12]
<juergbi> hardlink staging does add a significant cost in staging time if there are many files, while staging with FUSE takes almost no time but adds a performance hit on some operations later [15:15]
<juergbi> btw: buildbox-casd already supports hardlink staging but buildbox-casd has to be installed setuid (of a casd user, not root) for that [15:16]
<juergbi> (as that uses uid separation for protection instead of overlayfs / bst1 fuse) [15:17]
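The per-file cost of hardlink staging mentioned above can be pictured with a minimal Python sketch. It is not buildbox-casd code: the function name, the flat `cas_dir` layout, and the `manifest` mapping are all hypothetical, chosen only to show that staging by hardlink needs one link() call per staged file, so the work grows with the file count.

```python
import os
import tempfile


def stage_with_hardlinks(cas_dir, manifest, stage_dir):
    """Hard-link CAS objects into stage_dir; return the number of links made.

    manifest maps a relative staged path to an object name inside cas_dir.
    This layout is an assumption for illustration, not the real CAS layout.
    """
    links = 0
    for rel_path, object_name in manifest.items():
        target = os.path.join(stage_dir, rel_path)
        # Directories are created as real directories; only files are linked.
        os.makedirs(os.path.dirname(target), exist_ok=True)
        os.link(os.path.join(cas_dir, object_name), target)
        links += 1
    return links


def demo():
    """Stage one object under two paths and report what happened."""
    with tempfile.TemporaryDirectory() as root:
        cas_dir = os.path.join(root, "cas")
        stage_dir = os.path.join(root, "stage")
        os.makedirs(cas_dir)
        os.makedirs(stage_dir)
        obj = os.path.join(cas_dir, "obj1")
        with open(obj, "w") as f:
            f.write("hello")
        manifest = {"usr/bin/tool": "obj1", "usr/share/copy": "obj1"}
        made = stage_with_hardlinks(cas_dir, manifest, stage_dir)
        same = os.path.samefile(obj, os.path.join(stage_dir, "usr/bin/tool"))
        # The object now has three names: the CAS entry plus two staged paths.
        return made, os.stat(obj).st_nlink, same
```

Because every staged path shares an inode with the CAS object, no file data is copied, but the loop still touches every file once, which is the staging-time cost being compared with FUSE here.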
<doras> juergbi: doesn't this mean that staged hardlinks are essentially read-only? [15:18]
<juergbi> doras: which part/variant are you referring to? [15:19]
<doras> Hardlink staging + different uid for CAS (if I understood things correctly). [15:20]
<juergbi> yes, indeed, that's the purpose of using a different uid [15:20]
<juergbi> ah, or are you referring to potential issues due to that? [15:21]
<doras> I'm referring to potential issues, yes [15:22]
<doras> I mean, can bst2 actually make use of it without breaking something use cases that otherwise work with FUSE? [15:22]
<doras> s/something/some/ [15:22]
<juergbi> there may be issues in rare cases but well-written code usually has no issue with that (the staged files can still be replaced, they simply can't be modified in-place) [15:22]
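The difference between replacing a staged file and modifying it in place can be sketched in Python. The helper name `replace_staged_file` is hypothetical, and the sketch does not simulate the uid separation itself. It only shows the write-to-temp-then-rename pattern, which gives the staged path a new inode and leaves the hardlinked CAS object untouched. Renaming needs write permission on the directory rather than on the file, which is why this pattern keeps working when the file itself is owned by another uid.

```python
import os
import tempfile


def replace_staged_file(path, new_data):
    """Replace the file at path with new_data without writing to its inode."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so the rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_data)
        # Points the name at the new inode; the old inode is left as it was.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise


def demo():
    """Link a stand-in CAS object, replace the staged copy, report contents."""
    with tempfile.TemporaryDirectory() as root:
        cas_obj = os.path.join(root, "cas_obj")
        staged = os.path.join(root, "staged")
        with open(cas_obj, "wb") as f:
            f.write(b"original")
        os.link(cas_obj, staged)
        replace_staged_file(staged, b"patched")
        with open(cas_obj, "rb") as f:
            cas_content = f.read()
        with open(staged, "rb") as f:
            staged_content = f.read()
        return cas_content, staged_content, os.stat(cas_obj).st_nlink
```

Code that instead opens the staged path for writing would be writing to the shared inode, which is the case the separate uid is meant to block.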
<juergbi> that said, we strongly recommend use of buildbox-fuse on Linux. the hardlink support is mainly for non-Linux [15:25]
<juergbi> with regular non-setuid buildbox-casd, it will fall back to full file copies if buildbox-fuse is not available. while the full file copy fallback is obviously slow, it allows files to be modified, matching buildbox-fuse [15:26]
<doras> I guess the recent composefs proposal could be useful to avoid the hardlink staging costs while still allowing the use of overlayfs on top for native read/write file access. Best of both worlds, almost. [15:27]
<juergbi> composefs could indeed be useful but that's not something we can rely on in the foreseeable future [15:28]
<juergbi> although, we could support it as an additional option even if we can't rely on it being available, of course [15:29]
<juergbi> I can't estimate the chances of it being mainlined [15:30]
*** tristan94 <tristan94!tristan@78.40.148.178> has quit IRC [15:31]
<doras> juergbi: supposedly EROFS already provides capabilities similar to composefs. [15:33]
<doras> Minus a shared page cache between images, which is supposedly planned. [15:42]
<doras> I'm not sure if it supports user namespaces though. [15:43]
<juergbi> doras: isn't EROFS image-based? [15:48]
<juergbi> doras: actually, it could be useful to use overlayfs also with buildbox-fuse. this could eliminate the hit on write performance. and read performance is typically less of an issue with buildbox-fuse [15:49]
<juergbi> it might be tricky to use, though, as unprivileged mount is only possible within a corresponding mount namespace. might not be possible with the current protocol between buildbox-run and buildbox-casd. also, bubblewrap doesn't support overlayfs mounts yet (unclear whether that would actually be useful, though, depends on what the new protocol would look like) [15:49]
<doras> juergbi: I see what you mean. Basically provide write protection through an overlay, and use FUSE only to redirect reads to CAS? [16:22]
<juergbi> doras: yes. well, write protection is the wrong term in context of buildbox-fuse but redirect writes to a separate (possibly even tmpfs) overlay and use FUSE only to read unmodified staged files [16:24]
<doras> juergbi: can't buildbox-fuse be used as-is as the lower directory of an overlay and the upper directory of the overlay would be on a native filesystem? Or is this what you meant? [16:33]
<juergbi> yes, that's what I meant [16:33]
<doras> juergbi: I'm not familiar with the protocol between buildbox-casd and buildbox-fuse, but it sounds like in theory the FUSE filesystem itself can remain unchanged as long as it's mounted read-only. [16:38]
<juergbi> yes, the question is how to handle overlayfs with regard to buildbox-run and buildbox-casd (and buildstream). buildbox-fuse itself shouldn't be affected [16:40]
<juergbi> (except possibly minor changes to allow mounting within an unprivileged user namespace) [16:40]
<doras> Oh, I now see I read your message wrong. [16:41]
<doras> juergbi: back to hardlinks + overlayfs idea for a moment, can't we perform the CAS "checkout" offline and not during staging? i.e., after an artifact is pulled? [16:49]
<doras> Then use an overlay with multiple lower read-only directories, each representing a different artifact? [16:50]
<doras> It will have some storage overhead for the hardlinks themselves, but it may be worth it. [16:53]
<juergbi> doras: persistent artifact checkouts are an issue for LRU cleanup of the CAS. we used to have that. and there may also be issues with hardlink limits per file [16:53]
<juergbi> overlayfs with many layers also has a performance penalty for open and readdir. don't know how much but overall I think it's questionable that it would perform better than read-only buildbox-fuse [16:54]
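For reference, an overlay with several read-only lower directories is expressed through the overlayfs `lowerdir`, `upperdir`, and `workdir` mount options, with lower layers separated by colons and the leftmost one on top. The Python helper below is a hypothetical sketch that only builds that option string. It does not mount anything, the paths in the usage note are made up, and it rejects paths containing `:` or `,` rather than attempting to escape them.

```python
def overlay_mount_options(lower_dirs, upper_dir, work_dir):
    """Build the -o option string for an overlayfs mount.

    lower_dirs is ordered top-most layer first, matching overlayfs semantics.
    upper_dir and work_dir have to live on the same filesystem; that is not
    checked here.
    """
    if not lower_dirs:
        raise ValueError("at least one lower directory is required")
    for path in [*lower_dirs, upper_dir, work_dir]:
        # ':' separates layers and ',' separates options, so refuse both
        # instead of guessing at escaping rules.
        if ":" in path or "," in path:
            raise ValueError("unsupported character in path: " + repr(path))
    return "lowerdir={},upperdir={},workdir={}".format(
        ":".join(lower_dirs), upper_dir, work_dir
    )
```

With one lower directory per artifact, something like `overlay_mount_options(["/stage/app", "/stage/base"], "/build/upper", "/build/work")` would produce the string to pass to `mount -t overlay overlay -o ... <mountpoint>`. The single-lower-layer variant discussed for buildbox-fuse is the same call with a one-element list.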
<doras> I see. [16:54]
<juergbi> if we can implement overlayfs support (single lower layer), I don't think there will be a significant remaining performance issue with buildbox-fuse [16:55]
<juergbi> the reflink optimization for buildbox-fuse that we recently discussed would probably not be possible with this, though. not a disadvantage compared to building on e.g. ext4 without FUSE but that optimization might be nice for some special cases such as ostree [16:59]
<juergbi> avoiding performance regression compared to building on ext4 is probably more important than being faster than ext4 in some special cases [17:00]
<nanonyme> juergbi: sorry, where would this overlayfs help? I missed that [22:48]
<nanonyme> Also wait wait wait, do you mean us putting buildbox-casd as setuid another user would help with performance? Did I read that correctly? That can be done if needed. [22:51]
<nanonyme> We have a hack that allows putting setuid for files we create with BuildStream before creating Docker images out of them [22:53]
<nanonyme> juergbi: the checkout performance isn't currently our primary problem. Checking out 7GB of data takes only a couple of minutes apparently. It's a waste of IO but such is life. The bigger problem is sandbox performance [22:55]
