IRC logs for #buildstream for Friday, 2020-03-27

*** benschubert has joined #buildstream08:10
*** tpollard has joined #buildstream08:46
*** rdale has joined #buildstream09:05
*** tristan has quit IRC09:09
*** santi has joined #buildstream09:40
*** phildawson has joined #buildstream09:50
*** lachlan has joined #buildstream09:53
*** tristan has joined #buildstream10:04
*** lachlan has quit IRC10:27
*** narispo has quit IRC10:33
*** narispo has joined #buildstream10:34
*** lachlan has joined #buildstream10:36
*** lachlan has quit IRC10:56
*** lachlan has joined #buildstream10:57
*** jib has joined #buildstream11:02
*** jib has left #buildstream11:03
*** lachlan has quit IRC11:06
*** lachlan has joined #buildstream11:18
*** lachlan has quit IRC12:05
WSalmonjuergbi, benschubert dose bst start a new cas-d for every involcation of bst? or can it share them12:11
benschubertevery invocation, having it for multiple would be more complex but something we might have to consider12:12
WSalmondo they look for the size at the start and then just add to a counter as they make stuff?12:14
WSalmonif two were making stuff at the same time then they would get out of sink but if you start bst tomorrow then it will be write then12:16
WSalmonbut under the new way it may not12:16
WSalmonthats why buildgrid just has one that says around all the time but thats not really practiacl here as far as i can tell12:17
benschubertbuildgrid is a service, it's used in a widely different way than buildstream is.12:22
juergbiWSalmon: yes, that's why I mentioned on the ML that one issue is missing protection against multiple casd instances12:22
benschubertjuergbi: however, running multiple bst at the same time has always been undefined behavior right? :)12:22
juergbiit would be good to add this before implementing the disk usage file12:22
benschubert(Not saying that's a good thing)12:22
benschubertadd this -> not sure what you mean by "this" ?12:23
juergbiwith regards to expiry, yes12:23
juergbidon't allow multiple casd instances in the same directory12:23
benschubertah right12:23
juergbi(or keep spawned casd running but I'm not really a fan of that)12:23
benschubertAdding a lock file in the directory with a PID inside would do right?12:23
WSalmonbenschubert, 2 bst's is bad but it dosnt mess up your cache going forward generally which was my concern12:25
juergbiPID is not perfect but it might be good enough in practice12:25
benschubertthere is no perfect solution though right? :)12:25
juergbiwondering whether race free uniqueness would be possible in a reasonable way by means of the socket file12:26
benschubertI mean other than telling the user "ah you have a .lock file, is it stray? IF so delete it"12:26
WSalmoni was gona say about maybe just removing the file if it was already there but making sure that only one casd can run aggenst one cache sounds like a good idea12:26
benschubertBut we generate random sockets. Would you want to encode the directory in the socket name?12:27
juergbithe reason we generate random socket files is to allow multiple casd instances12:27
juergbiwe'd drop that, of course12:27
benschubert(We _could_, but then what if I create my socket somewhere else?)12:27
juergbiI wouldn't be worried about non-buildstream casd invocations in the same directory12:28
juergbito clarify, the socket file is already in the cas directory12:28
benschubertfair12:28
benschubertthen yeah using a unique name for the socket would be a solution12:29
juergbiwe also create a directory in /tmp and a symlink to the socket but that's just to workaround the idiotic path length restriction12:29
juergbi(or rather, the missing connectat() syscall)12:29
benschubertjuergbi: oh btw, for userchroot and the newer pytest version, it seems like it's something between pytest and pytest-forked that fails :/12:32
benschubert(running without "-n2" makes the test pass12:32
benschubertI'll probably pin pytest to 4.3 in the meantime12:32
juergbiI guess bisecting pytest or pytest-forked is not quite as straight forward'12:33
benschubertYep :) It's a change in pytest breaking pytest-forked12:37
benschubertI have found nothing obvious12:37
benschubertI'll dig a bit more but will soon give up and pin12:37
*** lachlan has joined #buildstream12:49
*** lachlan has quit IRC12:56
*** CTtpollard has joined #buildstream14:00
*** tpollard has quit IRC14:00
*** lachlan has joined #buildstream14:18
*** lachlan has quit IRC14:28
*** CTtpollard has quit IRC14:36
*** lachlan has joined #buildstream14:44
gitlab-br-botjjardon opened issue #1276 (BuildStream build fails if the CAS is missing blobs) on buildstream https://gitlab.com/BuildStream/buildstream/-/issues/127615:20
*** lachlan has quit IRC15:26
juergbijjardon: at a very quick glance it seems BuildStream doesn't recognize the error as a NOT_FOUND error and we currently only fall back to local build on NOT_FOUND errors15:32
juergbibesides fixing that issue at hand, it might make sense to fall back even for other pull errors. that said, it would still be good for the user to be aware of such unexpected errors15:33
juergbi(could be an issue in BuildStream or BuildBox)15:34
jjardonjuergbi: yup, that is why I mentioned there that maybe the behavious should be configurable15:35
jjardon("always use the cache if present", "fallback even if present", "never use even if present", etc)15:36
*** tpollard has joined #buildstream15:36
juergbifor `bst build` we should always perform builds if it's not cached yet15:36
*** lachlan has joined #buildstream15:36
jjardonyeah, agree15:37
juergbiremote cache is optional, of course. we might still be missing some CLI option to override remote cache in config files, don't remember15:37
juergbiand if you don't want any builds, you should call bst artifact pull instead of bst build15:37
jjardonthat makes sense15:37
juergbiwhat I was referring to is the distinction between 'blob not found' and other gRPC errors15:38
juergbie.g. network or internal server error15:38
cphangjuergbi this came up as with the deployments we've been developing with, there isn't referential integrity between the artifact cache and CAS.15:38
tpollardbst2 build can override remove cache via the cli15:38
jjardonright15:38
tpollards/remove/remote15:38
juergbicphang: correct, buildstream should be able to deal with missing blobs15:38
tpollards/bst2/bst master15:39
juergbibut within certain limits15:39
cphangindeed15:39
juergbii.e., if we call findmissingblobs on the remote CAS server and it's all there (or freshly uploaded), we expect it to stay there for a while15:39
cphangSo in the buildbarn deployments we're working with, we'll be doing some server side development to make sure we can provide those guarantees.15:39
jjardontpollard: seems it's a bit broken: https://gitlab.com/BuildStream/buildstream/-/issues/124015:39
juergbiwe can't recover from blobs going missing in the same session15:39
cphangIndeed15:39
cphangBazel is the same (well with builds without the bytes enabled)15:40
cphangBut Bazel does have the means to fall back and reupload, if that mode is not enabled.15:40
cphangSo I think it's beneificial for buildstream to have that same behaviour, if not *strictly* essential.15:41
tpollardjjardon: there should be test coverage for it on master (the --remote option)15:41
jjardonmmm, so can we say that at the moment buildstream remote cache have the same cache restrictions as bazel with builds without the bytes?15:42
cphangjjardon similar. Then there's the distinction between the action cache that Bazel uses, and the artifact cache that buildstream currently uses, and with the pending move to using the remote-asset api15:43
coldtomtpollard, that means that if i want to avoid pushing, i also lose the ability to pull though?15:43
cphangjuergbi is that an accurate statement?15:43
juergbicphang: BuildStream checks all required blobs for a particular action are on the CAS server (uploads missing ones) before issuing an action15:43
jjardontpollard: nice, since when? seems coldtom  experience the same some months ago15:43
tpollardcoldtom;15:44
juergbicphang: BuildStream does not assume that all blobs are available for artifacts in the artifact cache, afaik15:44
tpollardcoldtom: yep, I would like to see it extended15:44
cphangjuergbi, I believe coldtom found that if you delete the CAS and then in a separate bst session try and pull blobs from the CAS that are referenced in the artifact cache then the build fails. coldtom can you confirm?15:47
*** lachlan has quit IRC15:47
cphangThis is documented at https://gitlab.com/celduin/infrastructure/celduin-infra/-/issues/3715:47
juergbiit's definitely possible but that would be a bug, not by design15:47
juergbias per my initial comment here to jjardon15:48
juergbijjardon, cphang: this might help https://gitlab.com/BuildStream/buildstream/-/commit/b0e84e0cfaffa1fc7a196b991458600a8afd14c016:00
cphangooh16:01
cphangcoldtom ^16:01
cphangtvm juergbi16:04
coldtomta juergbi16:07
*** lachlan has joined #buildstream16:08
jjardonjuergbi: great, thanks!16:09
jjardonvalentind: ^^ let's try again to build when that get merged16:15
*** lachlan has quit IRC16:46
*** lachlan has joined #buildstream16:52
*** lachlan has quit IRC17:13
valentindjjardon, we use the latest tag.17:18
valentindSo it would be nice if there was a snapshot done at some point17:18
valentindI can try to apply the patch however.17:20
valentindjjardon, here: https://gitlab.com/freedesktop-sdk/infrastructure/freedesktop-sdk-docker-images/-/merge_requests/9517:24
*** lachlan has joined #buildstream17:28
*** tpollard has quit IRC17:28
jjardonvalentind: coolio, thanks!17:33
valentindjjardon, Just approve it, and I will merge.17:34
jjardonvalentind: done!17:34
*** lachlan has quit IRC17:45
*** lachlan has joined #buildstream17:56
*** lachlan has quit IRC18:15
*** lachlan has joined #buildstream18:30
*** santi has quit IRC18:42
*** lachlan has quit IRC18:42
*** lachlan has joined #buildstream18:45
*** lachlan has quit IRC19:12
*** toscalix has joined #buildstream19:18
*** toscalix has quit IRC19:23
*** toscalix has joined #buildstream19:23
*** phildawson has quit IRC19:25
*** phildawson has joined #buildstream19:26
*** phildawson has quit IRC19:30
*** rdale has quit IRC20:01
*** mohan43u has quit IRC20:22
*** mohan43u has joined #buildstream20:25
*** benschubert has quit IRC20:31
*** toscalix has quit IRC21:09
*** narispo has quit IRC21:43
*** narispo has joined #buildstream21:43
valentindjjardon, juergbi, same error with the patch: https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/jobs/48906540622:47
valentindIt could be that in _CASBatchRead.send, missing_blobs is not None, so it never raises BlobNotFound. And instead it raises a generic CASRemoteError.22:59

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!