IRC logs for #buildstream for Tuesday, 2021-05-18

<juergbi> nanonyme: no, I'm not aware of any hard limits [06:32]
<juergbi> there might be disk space issues with temporary files when dealing with multi-GB blobs [06:34]
<nanonyme> juergbi, well, this just reproducibly broke with no failures logged when large files were pushed (I guess multi-GB) [06:36]
<nanonyme> juergbi, it's a normal(ish) scenario, some projects have eg debuginfo files where individual files are multiple gigabytes in size [06:36]
<juergbi> do you happen to know whether it might have run out of disk space (i.e. not triggering expiry early enough)? [06:37]
<juergbi> casd reserves 2GB extra headroom by default. if you can reproduce the issue and suspect a disk space issue, you could increase that [06:39]
<juergbi> I would still expect an error be reported even if it was a disk space issue, though [06:40]
<juergbi> nanonyme: is this an issue with casd running on the client side or an issue on the server side when casd is used as remote CAS server? [06:42]
<juergbi> purely pushing shouldn't require any extra disk space on the client side [06:43]
<nanonyme> We're using buildbox-casd as remote CAS server here [06:43]
<nanonyme> So latter [06:43]
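
The headroom idea juergbi describes can be sketched as a simple free-space check before a large blob is written. This is only an illustration under stated assumptions: the function names are hypothetical, the 2 GiB default mirrors the "2GB" figure mentioned above (whether casd counts in GB or GiB is not stated in the log), and nothing here is taken from buildbox-casd's actual implementation.

```python
import shutil

GIB = 1024 ** 3  # assumption: sizes are counted in binary gigabytes


def has_headroom(free_bytes: int, blob_bytes: int,
                 reserved_bytes: int = 2 * GIB) -> bool:
    """Return True if writing blob_bytes still leaves reserved_bytes free."""
    return free_bytes - blob_bytes >= reserved_bytes


def can_accept_blob(cache_dir: str, blob_bytes: int,
                    reserved_bytes: int = 2 * GIB) -> bool:
    """Check the filesystem holding cache_dir (hypothetical helper)."""
    free = shutil.disk_usage(cache_dir).free
    return has_headroom(free, blob_bytes, reserved_bytes)
```

With a check like this, a 3 GiB upload onto a disk with 4 GiB free would be refused under the default reserve, which is the point where a server would ideally expire old blobs or report an explicit error rather than fail silently.
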
*** toscalix has joined #buildstream [08:06]
*** toscalix has quit IRC [08:09]
*** toscalix has joined #buildstream [08:14]
*** sstriker has joined #buildstream [08:34]
*** santi has joined #buildstream [08:35]
*** toscalix has quit IRC [08:45]
*** toscalix has joined #buildstream [08:45]
*** toscalix has quit IRC [09:48]
*** toscalix has joined #buildstream [10:17]
*** toscalix has quit IRC [10:31]
*** toscalix has joined #buildstream [10:35]
*** toscalix has quit IRC [10:49]
*** santi has quit IRC [10:54]
*** toscalix has joined #buildstream [11:02]
*** santi has joined #buildstream [11:08]
*** toscalix has quit IRC [12:22]
*** abderrahim[m] has quit IRC [13:23]
*** sstriker has quit IRC [16:16]
*** santi has quit IRC [17:54]
<nanonyme> juergbi, the nasty bit here is as said buildbox-casd doesn't emit errors in this scenario and bst1 vomits something like "Unexpected error in RPC handling" so it's hard to debug [18:27]
<juergbi> nanonyme: if you can reproduce it, can you monitor disk usage on the server to check whether this might indeed be the issue? [18:29]
<juergbi> I assume you can't easily test it with bst2 as client to check whether it reports a clearer error [18:29]
<nanonyme> I haven't yet tried to reproduce it locally, I think abderrahim did [18:30]
<nanonyme> BuildStream is having serious issues working at all with CAS as we realized (it's missing retries both for pulls and pushes) [18:31]
<nanonyme> As in, master [18:31]
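
The missing retries nanonyme mentions could look roughly like the following wrapper around a pull or push call. This is a generic exponential-backoff sketch, not BuildStream code: `TransientError` and `with_retries` are hypothetical names, and which failures are actually safe to retry would depend on the real RPC layer.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure (e.g. a dropped connection)."""


def with_retries(operation: Callable[[], T], attempts: int = 3,
                 base_delay: float = 0.5,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying on TransientError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise ValueError("attempts must be at least 1")
```

A caller would pass the pull or push as a zero-argument callable, for example `with_retries(lambda: push(blob))`, where `push` is whatever function performs the transfer. Passing `sleep` as a parameter keeps the wrapper easy to test without real delays.
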
<juergbi> or alternatively, invoke buildbox-casd on the server with a larger disk headroom, e.g. --reserved=8G (default is 2GB) [18:31]
<juergbi> I assume that's for the (small) remote asset server requests as per the open issue [18:32]
<nanonyme> I see. These are the current parameters https://gitlab.com/freedesktop-sdk/infrastructure/local-cache/-/blob/master/docker-compose.yml#L14 [18:32]
<juergbi> need to fix that, probably not too difficult [18:32]
<nanonyme> In other news, we in freedesktop-sdk project currently run buildboxx-casd as our primary remote cache implementation [18:33]
<nanonyme> buildbox-casd even [18:33]
<juergbi> hm, the quota is set to SIZE-100G. does this mean 100G are unused, i.e. there should be plenty of headroom on disk? or is this space used by something else? [18:34]
<nanonyme> This is basically a shared builder, there's at maximum three concurrent containers running bst + separate container running a local CAS instance [18:35]
<juergbi> buildbox-casd isn't designed as scalable server but if you only need a single instance, it should be fine - after fixing this issue [18:35]
<nanonyme> Current caching architecture is each runner has its own remote CAS (buildbox-casd) and then there's one central remote CAS (buildbox-casd) [18:36]
<nanonyme> Former has been hitting these issues with large binaries but it's possible latter would too (bst just fails before trying) [18:36]
<juergbi> we should definitely fix disk space handling for large temporary files [18:37]
<nanonyme> juergbi, right. The first choice was bb-remote-asset but it had suspicious object eviction characteristics so we switched to buildbox-casd instead [18:38]
<juergbi> however, can't be sure this is the (only) issue in your case, though [18:38]
<juergbi> buildgrid cas server would be an alternative that should be able to scale across multiple machines, if that's something you need [18:39]
<nanonyme> I don't think multiple machines is currently the desired thing. [18:39]
<nanonyme> The runner CAS is mostly for data locality [18:41]
<nanonyme> Well, also for heterogenic content caching as runner only has given arch builds [18:43]
<nanonyme> Or rather, handling caching of heterogenic content in more efficient way [18:43]
<nanonyme> juergbi, I think the 100GB is mostly "leave at minimum 100GB on disk for server to remain functional" [18:49]
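
The "SIZE-100G" quota discussed above amounts to simple arithmetic: take the size of the disk and subtract the space that must stay free for the rest of the server. A minimal sketch follows, assuming SIZE means the total size of the filesystem and that sizes are in GiB. The log does not say how SIZE is derived in the linked docker-compose file, and `quota_for` and `quota_for_path` are hypothetical helpers.

```python
import shutil

GIB = 1024 ** 3  # assumption: "100G" is read as 100 GiB


def quota_for(total_bytes: int, keep_free_bytes: int = 100 * GIB) -> int:
    """Return a cache quota that leaves keep_free_bytes of the disk untouched."""
    if keep_free_bytes >= total_bytes:
        raise ValueError("disk is too small to keep that much space free")
    return total_bytes - keep_free_bytes


def quota_for_path(cache_dir: str, keep_free_bytes: int = 100 * GIB) -> int:
    """Same calculation, using the total size of the filesystem under cache_dir."""
    return quota_for(shutil.disk_usage(cache_dir).total, keep_free_bytes)
```

On a 500 GiB disk this gives a 400 GiB quota. Note that the 100 GiB set aside this way is shared with everything else on the machine (here, the concurrent bst containers), so it is not the same thing as guaranteed headroom for casd's own temporary files.
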
*** toscalix has joined #buildstream [19:12]
*** toscalix has quit IRC [22:26]
