IRC logs for #buildstream for Friday, 2020-03-27

*** benschubert has joined #buildstream		08:10
*** tpollard has joined #buildstream		08:46
*** rdale has joined #buildstream		09:05
*** tristan has quit IRC		09:09
*** santi has joined #buildstream		09:40
*** phildawson has joined #buildstream		09:50
*** lachlan has joined #buildstream		09:53
*** tristan has joined #buildstream		10:04
*** lachlan has quit IRC		10:27
*** narispo has quit IRC		10:33
*** narispo has joined #buildstream		10:34
*** lachlan has joined #buildstream		10:36
*** lachlan has quit IRC		10:56
*** lachlan has joined #buildstream		10:57
*** jib has joined #buildstream		11:02
*** jib has left #buildstream		11:03
*** lachlan has quit IRC		11:06
*** lachlan has joined #buildstream		11:18
*** lachlan has quit IRC		12:05
WSalmon	juergbi, benschubert: does bst start a new casd for every invocation of bst, or can invocations share one?	12:11
benschubert	every invocation, having it for multiple would be more complex but something we might have to consider	12:12
WSalmon	does each casd look up the cache size at the start and then just add to a counter as it makes stuff?	12:14
WSalmon	if two were making stuff at the same time then they would get out of sync, but if you start bst tomorrow then it will be right again	12:16
WSalmon	but under the new way it may not be	12:16
WSalmon	that's why buildgrid just has one that stays around all the time, but that's not really practical here as far as I can tell	12:17
benschubert	buildgrid is a service, it's used in a widely different way than buildstream is.	12:22
juergbi	WSalmon: yes, that's why I mentioned on the ML that one issue is missing protection against multiple casd instances	12:22
benschubert	juergbi: however, running multiple bst at the same time has always been undefined behavior right? :)	12:22
juergbi	it would be good to add this before implementing the disk usage file	12:22
benschubert	(Not saying that's a good thing)	12:22
benschubert	add this -> not sure what you mean by "this" ?	12:23
juergbi	with regards to expiry, yes	12:23
juergbi	don't allow multiple casd instances in the same directory	12:23
benschubert	ah right	12:23
juergbi	(or keep spawned casd running but I'm not really a fan of that)	12:23
benschubert	Adding a lock file in the directory with a PID inside would do right?	12:23
WSalmon	benschubert, running 2 bsts is bad but it generally doesn't mess up your cache going forward, which was my concern	12:25
juergbi	PID is not perfect but it might be good enough in practice	12:25
benschubert	there is no perfect solution though right? :)	12:25
juergbi	wondering whether race free uniqueness would be possible in a reasonable way by means of the socket file	12:26
benschubert	I mean other than telling the user "ah you have a .lock file, is it stray? If so, delete it"	12:26
WSalmon	I was going to suggest maybe just removing the file if it was already there, but making sure that only one casd can run against one cache sounds like a good idea	12:26
benschubert	But we generate random sockets. Would you want to encode the directory in the socket name?	12:27
juergbi	the reason we generate random socket files is to allow multiple casd instances	12:27
juergbi	we'd drop that, of course	12:27
benschubert	(We _could_, but then what if I create my socket somewhere else?)	12:27
juergbi	I wouldn't be worried about non-buildstream casd invocations in the same directory	12:28
juergbi	to clarify, the socket file is already in the cas directory	12:28
benschubert	fair	12:28
benschubert	then yeah using a unique name for the socket would be a solution	12:29
juergbi	we also create a directory in /tmp and a symlink to the socket but that's just to work around the idiotic path length restriction	12:29
juergbi	(or rather, the missing connectat() syscall)	12:29
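
[Editor's sketch] The PID lock file idea discussed above could look roughly like the following. This is a minimal illustration under stated assumptions, not BuildStream or buildbox-casd code: the file name `casd.lock`, the `AlreadyLocked` exception and the helper functions are all hypothetical, and POSIX semantics for `os.kill(pid, 0)` are assumed. As noted in the discussion, a PID check is not perfect (PIDs can be reused, and there is a small window between creating the file and writing the PID into it), so this is only "good enough in practice" at best.

```python
import os


class AlreadyLocked(Exception):
    """Raised when a live process appears to hold the lock (hypothetical name)."""


def pid_alive(pid):
    """Best-effort liveness check (POSIX only). PIDs can be reused, so this can lie."""
    try:
        os.kill(pid, 0)  # signal 0 sends nothing; it only checks existence/permission
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def acquire_lock(cas_dir, name="casd.lock"):
    """Create <cas_dir>/<name> containing our PID, or raise AlreadyLocked."""
    path = os.path.join(cas_dir, name)
    for _ in range(2):  # second attempt happens only after removing a stale file
        try:
            # O_EXCL makes creation atomic: exactly one process can win.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            try:
                with open(path) as f:
                    owner = int(f.read().strip())
            except (FileNotFoundError, ValueError):
                owner = 0  # vanished, empty, or garbage: treat as stale
            if owner > 0 and pid_alive(owner):
                raise AlreadyLocked(f"{path} is held by PID {owner}")
            try:
                os.unlink(path)  # stale lock: remove it and retry once
            except FileNotFoundError:
                pass
            continue
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))
        return path
    raise AlreadyLocked(f"could not acquire {path}")


def release_lock(path):
    """Remove the lock file; missing files are ignored."""
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass
```

A caller would take the lock before spawning casd for a given cache directory and release it on shutdown; a stray file left by a crashed process is cleaned up automatically when its PID is no longer alive.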
benschubert	juergbi: oh btw, for userchroot and the newer pytest version, it seems like it's something between pytest and pytest-forked that fails :/	12:32
benschubert	(running without "-n2" makes the test pass)	12:32
benschubert	I'll probably pin pytest to 4.3 in the meantime	12:32
juergbi	I guess bisecting pytest or pytest-forked is not quite as straightforward	12:33
benschubert	Yep :) It's a change in pytest breaking pytest-forked	12:37
benschubert	I have found nothing obvious	12:37
benschubert	I'll dig a bit more but will soon give up and pin	12:37
*** lachlan has joined #buildstream		12:49
*** lachlan has quit IRC		12:56
*** CTtpollard has joined #buildstream		14:00
*** tpollard has quit IRC		14:00
*** lachlan has joined #buildstream		14:18
*** lachlan has quit IRC		14:28
*** CTtpollard has quit IRC		14:36
*** lachlan has joined #buildstream		14:44
gitlab-br-bot	jjardon opened issue #1276 (BuildStream build fails if the CAS is missing blobs) on buildstream https://gitlab.com/BuildStream/buildstream/-/issues/1276	15:20
*** lachlan has quit IRC		15:26
juergbi	jjardon: at a very quick glance it seems BuildStream doesn't recognize the error as a NOT_FOUND error and we currently only fall back to local build on NOT_FOUND errors	15:32
juergbi	besides fixing that issue at hand, it might make sense to fall back even for other pull errors. that said, it would still be good for the user to be aware of such unexpected errors	15:33
juergbi	(could be an issue in BuildStream or BuildBox)	15:34
jjardon	juergbi: yup, that is why I mentioned there that maybe the behaviour should be configurable	15:35
jjardon	("always use the cache if present", "fallback even if present", "never use even if present", etc)	15:36
*** tpollard has joined #buildstream		15:36
juergbi	for `bst build` we should always perform builds if it's not cached yet	15:36
*** lachlan has joined #buildstream		15:36
jjardon	yeah, agree	15:37
juergbi	remote cache is optional, of course. we might still be missing some CLI option to override remote cache in config files, don't remember	15:37
juergbi	and if you don't want any builds, you should call bst artifact pull instead of bst build	15:37
jjardon	that makes sense	15:37
juergbi	what I was referring to is the distinction between 'blob not found' and other gRPC errors	15:38
juergbi	e.g. network or internal server error	15:38
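
[Editor's sketch] The distinction drawn above, falling back to a local build on "blob not found" while still making other pull errors visible to the user, can be modelled as below. This is a hedged illustration only: `Status`, `PullError` and `pull_or_build` are hypothetical names, the enum is a tiny stand-in rather than the real gRPC status type, and nothing here is taken from BuildStream's code.

```python
from enum import Enum


class Status(Enum):
    """Tiny stand-in for gRPC status codes (not the real grpc.StatusCode)."""
    NOT_FOUND = "NOT_FOUND"
    UNAVAILABLE = "UNAVAILABLE"
    INTERNAL = "INTERNAL"


class PullError(Exception):
    """Hypothetical error raised when pulling an artifact fails."""

    def __init__(self, status, detail=""):
        super().__init__(f"{status.name}: {detail}")
        self.status = status


def pull_or_build(pull, build, warn=print):
    """Try the remote cache first; build locally if the pull fails.

    NOT_FOUND is the expected "not cached" case and falls back silently.
    Any other status also falls back, but is reported through `warn`
    so that unexpected errors do not go unnoticed.
    """
    try:
        return pull()
    except PullError as err:
        if err.status is not Status.NOT_FOUND:
            warn(f"unexpected pull error, falling back to local build: {err}")
        return build()
```

Here `pull` and `build` are plain callables, so the policy can be exercised without any network access; making the fallback behaviour configurable, as suggested earlier in the discussion, would amount to choosing which statuses are allowed to reach `build()`.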
cphang	juergbi this came up as with the deployments we've been developing with, there isn't referential integrity between the artifact cache and CAS.	15:38
tpollard	bst2 build can override remove cache via the cli	15:38
jjardon	right	15:38
tpollard	s/remove/remote	15:38
juergbi	cphang: correct, buildstream should be able to deal with missing blobs	15:38
tpollard	s/bst2/bst master	15:39
juergbi	but within certain limits	15:39
cphang	indeed	15:39
juergbi	i.e., if we call findmissingblobs on the remote CAS server and it's all there (or freshly uploaded), we expect it to stay there for a while	15:39
cphang	So in the buildbarn deployments we're working with, we'll be doing some server side development to make sure we can provide those guarantees.	15:39
jjardon	tpollard: seems it's a bit broken: https://gitlab.com/BuildStream/buildstream/-/issues/1240	15:39
juergbi	we can't recover from blobs going missing in the same session	15:39
cphang	Indeed	15:39
cphang	Bazel is the same (well with builds without the bytes enabled)	15:40
cphang	But Bazel does have the means to fall back and reupload, if that mode is not enabled.	15:40
cphang	So I think it's beneficial for buildstream to have that same behaviour, if not strictly essential.	15:41
tpollard	jjardon: there should be test coverage for it on master (the --remote option)	15:41
jjardon	mmm, so can we say that at the moment buildstream remote cache has the same cache restrictions as bazel with builds without the bytes?	15:42
cphang	jjardon similar. Then there's the distinction between the action cache that Bazel uses, and the artifact cache that buildstream currently uses, and with the pending move to using the remote-asset api	15:43
coldtom	tpollard, that means that if i want to avoid pushing, i also lose the ability to pull though?	15:43
cphang	juergbi is that an accurate statement?	15:43
juergbi	cphang: BuildStream checks all required blobs for a particular action are on the CAS server (uploads missing ones) before issuing an action	15:43
jjardon	tpollard: nice, since when? seems coldtom experienced the same some months ago	15:43
tpollard	coldtom;	15:44
juergbi	cphang: BuildStream does not assume that all blobs are available for artifacts in the artifact cache, afaik	15:44
tpollard	coldtom: yep, I would like to see it extended	15:44
cphang	juergbi, I believe coldtom found that if you delete the CAS and then in a separate bst session try and pull blobs from the CAS that are referenced in the artifact cache then the build fails. coldtom can you confirm?	15:47
*** lachlan has quit IRC		15:47
cphang	This is documented at https://gitlab.com/celduin/infrastructure/celduin-infra/-/issues/37	15:47
juergbi	it's definitely possible but that would be a bug, not by design	15:47
juergbi	as per my initial comment here to jjardon	15:48
juergbi	jjardon, cphang: this might help https://gitlab.com/BuildStream/buildstream/-/commit/b0e84e0cfaffa1fc7a196b991458600a8afd14c0	16:00
cphang	ooh	16:01
cphang	coldtom ^	16:01
cphang	tvm juergbi	16:04
coldtom	ta juergbi	16:07
*** lachlan has joined #buildstream		16:08
jjardon	juergbi: great, thanks!	16:09
jjardon	valentind: ^^ let's try again to build when that get merged	16:15
*** lachlan has quit IRC		16:46
*** lachlan has joined #buildstream		16:52
*** lachlan has quit IRC		17:13
valentind	jjardon, we use the latest tag.	17:18
valentind	So it would be nice if there was a snapshot done at some point	17:18
valentind	I can try to apply the patch however.	17:20
valentind	jjardon, here: https://gitlab.com/freedesktop-sdk/infrastructure/freedesktop-sdk-docker-images/-/merge_requests/95	17:24
*** lachlan has joined #buildstream		17:28
*** tpollard has quit IRC		17:28
jjardon	valentind: coolio, thanks!	17:33
valentind	jjardon, Just approve it, and I will merge.	17:34
jjardon	valentind: done!	17:34
*** lachlan has quit IRC		17:45
*** lachlan has joined #buildstream		17:56
*** lachlan has quit IRC		18:15
*** lachlan has joined #buildstream		18:30
*** santi has quit IRC		18:42
*** lachlan has quit IRC		18:42
*** lachlan has joined #buildstream		18:45
*** lachlan has quit IRC		19:12
*** toscalix has joined #buildstream		19:18
*** toscalix has quit IRC		19:23
*** toscalix has joined #buildstream		19:23
*** phildawson has quit IRC		19:25
*** phildawson has joined #buildstream		19:26
*** phildawson has quit IRC		19:30
*** rdale has quit IRC		20:01
*** mohan43u has quit IRC		20:22
*** mohan43u has joined #buildstream		20:25
*** benschubert has quit IRC		20:31
*** toscalix has quit IRC		21:09
*** narispo has quit IRC		21:43
*** narispo has joined #buildstream		21:43
valentind	jjardon, juergbi, same error with the patch: https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/jobs/489065406	22:47
valentind	It could be that in _CASBatchRead.send, missing_blobs is not None, so it never raises BlobNotFound. And instead it raises a generic CASRemoteError.	22:59
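
[Editor's sketch] The control flow that valentind suspects can be illustrated with a small stand-alone model. This is not the BuildStream source: `check_responses` is a hypothetical function, and `BlobNotFound` and `CASRemoteError` are local stand-in classes that merely reuse the names from the message above. The model assumes a per-blob status loop in which a NOT_FOUND blob is appended to a caller-supplied `missing_blobs` list and then still falls through to a generic "status is not OK" check.

```python
from enum import Enum


class Code(Enum):
    """Subset of gRPC status codes, using their standard numeric values."""
    OK = 0
    NOT_FOUND = 5
    INTERNAL = 13


class BlobNotFound(Exception):
    """Local stand-in for the specific 'blob is missing' error."""


class CASRemoteError(Exception):
    """Local stand-in for the generic remote error."""


def check_responses(responses, missing_blobs=None):
    """Model of the suspected flow; `responses` is a list of (digest, Code)."""
    for digest, code in responses:
        if code is Code.NOT_FOUND:
            if missing_blobs is None:
                raise BlobNotFound(digest)
            missing_blobs.append(digest)
            # No `continue` here: execution falls through to the check below.
        if code is not Code.OK:
            raise CASRemoteError(f"failed to download blob {digest}: {code.name}")
```

In this model, a caller that passes a list never sees `BlobNotFound`; it gets `CASRemoteError` even though the only problem was a missing blob, which would match the symptom reported in the job above. Whether the real code behaves this way would need to be confirmed against the BuildStream source.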
