*** benschubert has quit IRC | 00:01 | |
*** phildawson has quit IRC | 02:09 | |
*** phildawson has joined #buildstream | 02:22 | |
*** tristan has quit IRC | 05:18 | |
*** tristan has joined #buildstream | 06:13 | |
*** ChanServ sets mode: +o tristan | 06:13 | |
*** tristan has quit IRC | 06:30 | |
*** tristan has joined #buildstream | 06:39 | |
*** ChanServ sets mode: +o tristan | 06:39 | |
*** benschubert has joined #buildstream | 06:57 | |
tristan | benschubert, Ummm, your 6K element test project.... is it 6K elements *wide* by chance ? | 07:30 |
tristan | It appears to be impossible to have a project that is > 1K elements *tall* | 07:30 |
benschubert | tristan: No, it's not a flat project | 07:30 |
benschubert | oh? | 07:30 |
benschubert | It's definitely not 1k elements tall though | 07:30 |
* tristan gets recursion error | 07:30 | |
benschubert | it's based on `base-files` in debian (and all its deps) | 07:30 |
* tristan used a simple script to make a chain of elements depend on each other | 07:31 | |
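A rough sketch of the kind of generator script described here, with hypothetical file names and chain length: each element depends only on the previous one, so the dependency graph is thousands of elements *tall*, which is what trips Python's default recursion limit (roughly 1000 frames) in a recursive dependency walk.

```python
# Sketch of a chain generator (names and paths are made up for illustration):
# chain-N.bst depends on chain-(N-1).bst, so loading the top of the chain
# recursively has to descend the full height of the project.
import os
import sys

def write_chain(directory: str, count: int) -> None:
    os.makedirs(directory, exist_ok=True)
    for i in range(count):
        lines = ["kind: stack\n"]
        if i > 0:
            lines += ["depends:\n", "- chain-{}.bst\n".format(i - 1)]
        with open(os.path.join(directory, "chain-{}.bst".format(i)), "w") as f:
            f.writelines(lines)

if __name__ == "__main__":
    # Python's default recursion limit is ~1000 frames, which is why a project
    # much taller than that blows up a recursive graph traversal.
    print("recursion limit:", sys.getrecursionlimit())
    write_chain("elements", 2000)
```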
tristan | I see | 07:31 |
tristan | You know, we have an `except RecursionError` in _frontend/app.py which says, "Sorry, you're SOL" | 07:31 |
benschubert | wait what? | 07:31 |
tristan | in more professional language | 07:32 |
benschubert | ah, I got worried :'D | 07:32 |
tristan | So that you don't get to see any stack trace | 07:32 |
tristan | I guess that's better than a stack trace | 07:32 |
benschubert | unless you --debug :) | 07:32 |
tristan | Really ? | 07:33 |
* tristan uncomments | 07:33 | |
benschubert | at least it used to be the case | 07:33 |
benschubert | if it's not the case anymore we need to re-enable that | 07:33 |
* tristan also noticed that it appears that --error-lines is getting ignored | 07:33 | |
tristan | I get the stack trace in a BUG message if I uncomment that handling | 07:34 |
tristan | But then, it shows me a hard coded limit of stderr lines | 07:34 |
tristan | or of exception lines | 07:34 |
tristan | Without consulting Context.log_error_lines | 07:34 |
* tristan guesses we really need to have a test for that if we want people to pay attention... sadly :-S | 07:35 | |
tristan | maybe it's just with stack traces | 07:36 |
tristan | no no | 07:36 |
tristan | benschubert, --debug doesn't cause the RecursionError stacktrace to be printed either ;-) | 07:36 |
tristan | Aaaaaanyway | 07:37 |
tristan | Polish for another day, and hope it stays polished after that | 07:37 |
benschubert | yeah :/ | 07:37 |
* tristan doesn't think we need to cover every little detail with a test case, and would rather expect people to fully understand code they are replacing | 07:38 | |
benschubert | oh by the way, I had a wild thought the other day: we are using mypy for the public API, how do you find it? Useful? I was wondering if we should switch from "only the minimum for the public API to pass" to "as you wish, as long as it's well made and not blocking others" ? | 07:38 |
tristan | I would like to get to a place where it is *everywhere* so long as there is no runtime overhead | 07:38 |
tristan | I don't know what the official policy is, but I guess it makes sense to mandate it in the public API | 07:39 |
tristan | Getting it *everywhere* would force us to restructure modules to avoid any circular dependencies of modules at all | 07:39 |
benschubert | yeah currently we've said "only public" | 07:39 |
tristan | Which would be the main benefit I would see | 07:39 |
benschubert | not necessarily, but yeah that could help with such refactors at least | 07:40 |
tristan | I mean, right now we think it's okay to call a method on an object whose class we did not even import | 07:40 |
tristan | That kind of stuff would become impossible | 07:40 |
tristan | not sure, but thought it might | 07:41 |
benschubert | still possible with forward declarations and `if TYPE_CHECKING` guards | 07:41 |
benschubert | so it's more work but it's possible | 07:41 |
benschubert | however it becomes obvious :) | 07:41 |
tristan | Yeah, that too, we can expose the warts in their horrible glory | 07:41 |
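For reference, the forward-declaration pattern mentioned above looks roughly like this; the module and class names are hypothetical, not BuildStream's real layout:

```python
# The import only runs for the type checker, so no circular import happens at
# runtime, but mypy still sees the real type behind the string annotation.
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from ._hypothetical_module import Element   # never imported at runtime

class Widget:
    def attach(self, element: "Element") -> None:
        # The wart is now visible: this module calls into Element without a
        # real runtime import of its class.
        element.do_something()
```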
benschubert | so would you be rather in favor if I start a ML thread about allowing mypy anywhere we want? | 07:41 |
tristan | I think I've already pushed code which adds annotations in non-public API | 07:42 |
tristan | nobody caught me doing it ;-) | 07:42 |
benschubert | ok I take that for a yes :D | 07:42 |
tristan | PluginFactory | 07:42 |
tristan | That uses some, not completely but some | 07:42 |
benschubert | second point I wanted to ask: I have found `black` to be very useful for our formatting. One last thing bugs me: our import statements are never consistent. Would you be fine with using `isort` in addition, to sort all our imports _in a single possible way_ ? | 07:43 |
tristan | Of course, it makes no sense for cdef cython things, but I guess outward facing def/cpdef functions on cython classes can still do it right ? | 07:43 |
tristan | what is isort ? | 07:43 |
benschubert | for cython, we can use .pyi files like I did for yaml.pyx :) | 07:44 |
benschubert | isort is a tool that focuses on one thing: sorting import statements for python | 07:44 |
benschubert | it has an autoformat mode and a check mode | 07:44 |
tristan | right, I would not favor alphabetic, but the webpage says something about sections | 07:44 |
tristan | not sure how it would work | 07:44 |
benschubert | you can decide, I want "stdlib - third party - first party - tests" for example | 07:44 |
tristan | we have a nicely treeish thing | 07:44 |
benschubert | and then inside each section you can say: "normal imports first, then 'from ... import', sorted alphabetically" | 07:45 |
benschubert | ah you prefer the tree? | 07:45 |
tristan | So I'd like to have [standard libraries] -> [parent package] -> [local package] imports | 07:45 |
tristan | I would not like those to be interspersed | 07:46 |
benschubert | yeah, you can ask for this :) | 07:46 |
tristan | from . import foo goes last | 07:46 |
tristan | preceded by from .. | 07:46 |
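As a concrete illustration of the ordering being asked for (module names are examples only, and this would sit inside a package so the relative imports resolve), the grouping isort can be configured to enforce would look like:

```python
# Standard library first
import os
from collections import OrderedDict

# Third party next
import click

# Parent package, then the local package last
from .. import utils
from . import widgets
from .widgets import Widget
```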
benschubert | So if we can get such an order, you'd be in favor of an auto-import formatter? | 07:46 |
tristan | Yep :) | 07:46 |
tristan | That's pretty much my only nit I think | 07:47 |
tristan | Yeah | 07:47 |
benschubert | ok! I'll see what I can do and push it to the list then, thanks | 07:47 |
traveltissues | tristan: is there a planned release of 1.93.4.dev0 to pypi? | 07:48 |
tristan | traveltissues, not particularly | 07:49 |
traveltissues | it would be useful if that were published. would you please consider that? | 07:49 |
tristan | traveltissues, pypi for dev snapshots at this point is more of an on demand/depending on whether cs-shadow feels like it kinda thing | 07:49 |
tristan | Pushing out tags is cheap | 07:50 |
benschubert | tristan: I can ping cs-shadow to ask him if he can push it if you need it | 07:50 |
tristan | Going and doing a whole pypi dance is another thing, we'd probably have another tag out by the time a pypi dance is performed | 07:50 |
tristan | benschubert, Ask traveltissues ... I'm not the one expressing a need :D | 07:50 |
benschubert | oops, I meant traveltissues | 07:50 |
benschubert | my autocomplete tricked me :) | 07:51 |
tristan | Ah, 't' ! | 07:51 |
tristan | 'tr' ! | 07:51 |
tristan | traveltissues, we're sorted in the same bucket ;-) | 07:51 |
tristan | bucket buddies | 07:51 |
traveltissues | well i don't strictly need it, it would be nicer if we could add the local dir to the sys.path when installing in dev mode | 07:53 |
benschubert | traveltissues: what do you mean? | 07:54 |
traveltissues | if you install bst-plugins-experimental in dev mode | 07:55 |
traveltissues | >>> print(sys.path) | 07:55 |
traveltissues | ['', '/usr/lib/python38.zip', '/usr/lib/python3.8', '/usr/lib/python3.8/lib-dynload', '/root/.local/lib/python3.8/site-packages', '/bst-plugins-experimental/src', '/usr/lib/python3.8/site-packages'] | 07:55 |
traveltissues | trying the same thing with bst and copying the source, libs, and binary to another container breaks it | 07:55 |
traveltissues | bash-5.0# bst | 07:55 |
traveltissues | Traceback (most recent call last): | 07:55 |
traveltissues | File "/usr/local/bin/bst", line 5, in <module> | 07:55 |
traveltissues | from buildstream._frontend import cli | 07:55 |
traveltissues | ModuleNotFoundError: No module named 'buildstream' | 07:55 |
benschubert | what do you call dev mode? `pip install -e ` ? | 07:55 |
traveltissues | yes | 07:55 |
benschubert | are you doing this in a venv? | 07:56 |
traveltissues | in docker | 07:56 |
benschubert | so you are `pip install -e buildstream` and it doesn't work? | 07:56 |
traveltissues | no, the installation works, copying this to another container doesn't. that's not the surprising thing, what i'm wondering is why is bst-plugins-experimental still found when it's copied but buildstream is not | 07:57 |
benschubert | you might not be copying everything correctly | 07:58 |
benschubert | pip install -e does quite a few things under the hood | 07:58 |
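One of the things it does, depending on the pip/setuptools version, is drop an `.egg-link` file and/or a `.pth` entry into site-packages that points back at the source checkout, so copying a container without that pointer (or without the checkout at the same path) leaves the module unimportable. A quick, hedged inspection sketch:

```python
# List what editable installs left behind in site-packages; if the paths these
# files point at don't exist in the destination container, the import breaks.
import glob
import os
import site
import sys

print(sys.path)
candidates = site.getsitepackages() + [site.getusersitepackages()]
for directory in candidates:
    for pattern in ("*.egg-link", "*.pth"):
        for path in glob.glob(os.path.join(directory, pattern)):
            print(path)
```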
benschubert | I'm curious why you need to copy stuff like that? | 07:58 |
traveltissues | yes, but i'm trying to understand what the difference is here if they're both installed the same way | 07:58 |
benschubert | bst-plugins-experimental is not a cli | 07:58 |
tristan | I think the next tag will be more interesting to have on pypi for what it's worth | 07:59 |
traveltissues | i'm copying it because the container i have doesn't have the tools to install, it doesn't have pip for example | 07:59 |
tristan | The last tag doesn't include the junction merge | 07:59 |
tristan | Which is a game changer | 07:59 |
benschubert | traveltissues: ./setup.py develop would work too | 07:59 |
tristan | I'd like to land a fix to Element.search() supporting full paths properly first and get all the full-pathing cases right | 08:00 |
traveltissues | nice | 08:00 |
benschubert | tristan: speaking of junction jungle.... it's not possible to have a project depending on itself right? | 08:00 |
traveltissues | i didn't know that, thanks benschubert | 08:00 |
tristan | benschubert, yes it is | 08:00 |
benschubert | traveltissues: no problem, guess it should solve your problem :) | 08:00 |
benschubert | tristan: oh! perfect | 08:00 |
traveltissues | yes, it did | 08:00 |
* benschubert is going back to his recursive project | 08:01 | |
tristan | benschubert, I think it needs to be 'internal' though | 08:01 |
tristan | we don't have a notation to express 'self' in 'duplicates' | 08:01 |
benschubert | tristan: yep my aim is to have it internal anyways | 08:02 |
tristan | But I think it's alright this way, the use cases for self junctioning I can think of should be internal, and we could extend with that if the need arises | 08:02 |
benschubert | yep, my aim is to have a stage3 rebuild itself | 08:02 |
benschubert | so internal is exactly what I need | 08:02 |
tristan | yeah, bootstrapping is the main use case | 08:02 |
tristan | (the known one anyway) | 08:03 |
tristan | So it looks like for me to do meaningful benchmarks, I really need to run the same intense project 1000 times and take an average | 08:04 |
tristan | The results are so erratic | 08:04 |
benschubert | yeah the way I run them is 4 runs and I discard the first on the debian-like | 08:05 |
tristan | only 4 ? maybe with 6K elements you have more stable results | 08:05 |
benschubert | yep | 08:06 |
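A minimal sketch of that measurement loop, assuming `bst show` of a single target is the workload and the first run is discarded as warm-up (the element name and run count are placeholders):

```python
# Time repeated `bst show` invocations and average everything but the first,
# cold-cache run; the stdev gives a feel for how erratic the results are.
import statistics
import subprocess
import time

def time_show(element: str, runs: int = 4):
    samples = []
    for _ in range(runs):
        start = time.monotonic()
        subprocess.run(["bst", "show", element], check=True,
                       stdout=subprocess.DEVNULL)
        samples.append(time.monotonic() - start)
    warm = samples[1:]  # discard the warm-up run
    return statistics.mean(warm), statistics.stdev(warm)

if __name__ == "__main__":
    mean, stdev = time_show("target.bst")
    print("mean: {:.2f}s, stdev: {:.2f}s".format(mean, stdev))
```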
tristan | With my measly 900K tall project it varies a lot | 08:06 |
tristan | 900 tall | 08:06 |
tristan | -K | 08:06 |
benschubert | you've seen the difference in timing in my results :) (the third column) | 08:06 |
tristan | yeah | 08:07 |
tristan | Ok what should I do :-S | 08:07 |
tristan | Hmmm | 08:07 |
tristan | benschubert, mind if I toss you some branches to run benchmarks, you can do this in background without too much distraction ? | 08:08 |
tristan | I'd like to try without the unneeded ref/unref calls, and try a more complete array approach to see if that improves things | 08:08 |
tristan | the first one I can dish out in a minute | 08:08 |
benschubert | Sure, won't be the same machine and will take longer but I can :) | 08:09 |
tristan | alright | 08:09 |
tristan | benschubert, pushed removal of refcounting | 08:13 |
benschubert | on it :) | 08:13 |
tristan | Aha | 08:14 |
tristan | benschubert, wait, can you run it while comparing the second last commit too ? | 08:14 |
tristan | Since it's a different machine, I'd like to compare the cost of that with the last, see if it makes a difference | 08:15 |
benschubert | sure | 08:15 |
traveltissues | i've been out of the loop for a while. what happened with the autobenchmark bot effort? | 08:20 |
tristan | Ok that will be helpful, I think I might have a breakthrough too, I'll let you know | 08:20 |
* tristan has spotted a crucial difference between the old and new _variables code | 08:20 | |
benschubert | traveltissues: not sure it is "ready"yet? Haven't heard about it for a bit | 08:23 |
tristan | Last I heard, we had a plan to at least run `bst build` for every commit in the benchmarks repo, without gating it on CI, and automatically updating a webpage with that result | 08:29 |
tristan | Even that never happened | 08:30 |
tristan | I think it just involves using an API key and interacting with gitlab restful API | 08:30 |
tristan | I noticed where my code is more complex. | 08:37 |
tristan | The old code has the _expstr_map, which is a dictionary of "either list or string" | 08:37 |
tristan | If it's a string, then it's resolved, if it's a list, then it needs to be recursively resolved | 08:37 |
tristan | There is a quicker return for resolved values | 08:38 |
benschubert | oh good catch | 08:38 |
tristan | Adding a second dictionary for resolved values gets much more similar performance, *but* it would significantly increase the memory footprint | 08:39 |
tristan | Looking at the current code, it would seem that the way things do get resolved, we don't technically need the ValueClasses for already resolved values | 08:39 |
tristan | i.e. you cannot have a circular reference error | 08:40 |
tristan | or an unresolved counterpart, for something that is already resolved | 08:40 |
tristan | So... I could probably change this to do a similar "str-or-Value" dictionary | 08:40 |
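A toy version of that "str-or-unresolved" table, mirroring the quick return the old _expstr_map gets for free (this is a sketch of the idea, not the real _variables code, and the variable names are invented):

```python
# Each entry is either a plain resolved string, or a list of
# (is_literal, text) parts that still needs resolving.  Resolved results are
# written back as flat strings so later lookups take the quick path.
def resolve(name, table):
    value = table[name]
    if isinstance(value, str):        # quick return for resolved values
        return value
    resolved = "".join(
        text if is_literal else resolve(text, table)
        for is_literal, text in value
    )
    table[name] = resolved            # cache the flat string for next time
    return resolved

variables = {
    "prefix": "/usr",
    "bindir": [(False, "prefix"), (True, "/bin")],   # i.e. "%{prefix}/bin"
}
print(resolve("bindir", variables))   # -> /usr/bin
```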
tristan | crap, that might not be it :-S | 08:44 |
* tristan was looking at false results | 08:44 | |
tristan | I could still be close though | 08:44 |
tristan | Hmmm, ok I need to rethink this | 08:45 |
tristan | too many dictionary lookups while resolving a variable, that is what is changing things | 08:45 |
tristan | Ahhhh, ok so a tiny quicker return could be helpful but maybe not | 08:46 |
tristan | However, when I do the reverse loop over value dependencies, instead of indexing variables, the resolved values of previously iterated values could be readily in context | 08:47 |
tristan | this would match the nature of the recursive resolve_variable_and_use_result algo we had before | 08:47 |
tristan | https://gitlab.com/BuildStream/buildstream/-/blob/tristan/partial-variables/src/buildstream/_variables.pyx#L332 | 08:48 |
tristan | iter_value.resolve() right there, could be given the previously resolved values, the order of the array should ensure that any value it *wants*, is the next value in the list | 08:49 |
tristan | so if a value needs 2 variables, it takes the last 2 previously resolved ones | 08:49 |
tristan | (whether they were resolved this round or not) | 08:49 |
WSalmon | juergbi, I reran this job quite a few times, it always crashes while just pulling, https://gitlab.com/celduin/bsps/bst-boardsupport/-/jobs/622631814 but https://gitlab.com/celduin/bsps/bst-boardsupport/-/commit/b3d248c38efb191088258df8d5a2d090de8fc038 this change allows it to get MUCH further, I'm mostly running 1.93.4, see master of https://gitlab.com/freedesktop-sdk/infrastructure/freedesktop-sdk-docker-images for the exact versions of everything | 09:00 |
WSalmon | i presume https://gitlab.com/freedesktop-sdk/infrastructure/freedesktop-sdk-docker-images/-/blob/master/project.refs#L176 is the version of casd that should not cause excessive memory usage? | 09:01 |
cphang | WSalmon I thought that the error handling of buildbox was cleaned up so there are more informative errors now? | 09:01 |
WSalmon | cphang, i think the issue is that the system is OOM so everything crashes a bit badly | 09:02 |
WSalmon | when I changed the other stages to be limited like that, I did it after running locally with the amount of RAM that those runners have. but I thought we had updated things since then. I should have done that for the last job too, but gitlab-ci variable substitution only goes 2 variables deep, which makes everything tricky, so that stage got missed.. | 09:07 |
WSalmon | I didn't mean to send ^, I was trying to delete it and fat-fingered, but you get the gist | 09:07 |
juergbi | WSalmon: do you have the kernel log? | 09:10 |
juergbi | to see the casd memory usage at time of OOM kill | 09:11 |
WSalmon | no, all i have is https://gitlab.com/celduin/bsps/bst-boardsupport/-/jobs/622631814/artifacts/browse/artifact/ | 09:11 |
juergbi | casd 0.0.9 should indeed already have the memory usage fixes but there might be more issues | 09:11 |
juergbi | WSalmon: how much RAM do these runners have? and are multiple docker containers running on a single runner without RAM restrictions? could it be caused by another container? | 09:13 |
juergbi | WSalmon: running buildbox-casd with `/usr/bin/time -v` prints the peak memory usage, iirc | 09:14 |
juergbi | if you want to look into this issue, describing steps that trigger higher than expected peak memory usage in casd would be very helpful | 09:15 |
juergbi | could be done by isolated pulling | 09:15 |
juergbi | I thought I did this before and after my memory fixes and it was reasonable afterwards | 09:16 |
WSalmon | juergbi, they are amazonec2-instance-type=a1.2xlarge with 16GB of RAM | 09:16 |
juergbi | however, my focus was on handling of large files, maybe there is an additional issue with lots of regular sized files (instead of individual large files) | 09:16 |
WSalmon | they have one ci job per vm | 09:16 |
WSalmon | so each ci job has 16GB to itself minus the OS and docker overhead, the vms are spun up by docker machine each morning | 09:17 |
juergbi | ok, so presumably 10 pulls at once trigger this on 16 GB | 09:17 |
WSalmon | yep | 09:17 |
WSalmon | afaict | 09:17 |
juergbi | should definitely not be happening | 09:17 |
juergbi | might be a leak | 09:18 |
juergbi | (the fixed issue was not a leak, simply not-optimal code requiring high peak usage) | 09:18 |
WSalmon | I can't remember if I have tried it with 0.0.9 but I have had it use that much RAM locally for that job | 09:18 |
WSalmon | * the x86 version | 09:18 |
juergbi | the ISA will likely not make a big difference | 09:19 |
juergbi | so it should be possible to reproduce this on x86-64 | 09:19 |
WSalmon | that's what I meant by locally | 09:19 |
WSalmon | incidentally you can pull that arm job on to your x86 machine with `rpi4-features]$ bst -o target_arch aarch64 build deploy/image.bst` | 09:20 |
WSalmon | i think you have the keys to that cache but its open for anyone to pull from | 09:21 |
WSalmon | if you clone and run from the sample/rpi4-feature folder | 09:22 |
juergbi | WSalmon: I think I can reproduce it | 09:36 |
WSalmon | :D | 09:36 |
juergbi | doesn't appear to be a memory leak but rather excessive memory usage with large trees | 09:41 |
juergbi | e.g. can be seen by the memory usage of pulling bootstrap/acl-build-deps.bst alone | 09:41 |
juergbi | WSalmon: https://gitlab.com/BuildGrid/buildbox/buildbox-casd/-/issues/76 | 10:00 |
tristan | Whoa, ok lists are definitely more expensive than arrays | 10:00 |
WSalmon | thanks juergbi | 10:03 |
juergbi | I'm just realizing, while this will fix 'the large artifact of small files' issue, it's still not ideal for large files | 10:05 |
WSalmon | juergbi, they want writing strat to disk? | 10:07 |
WSalmon | straight | 10:08 |
WSalmon | presumably it's worth not pulling too fast as well, as then you just get clogged up in IO issues? | 10:08 |
WSalmon | or will linux handle that nicely | 10:08 |
juergbi | we have to write them to disk, that's where our CAS cache is | 10:09 |
juergbi | (disk could be tmpfs but that can require a lot of RAM) | 10:09 |
WSalmon | right, I was just thinking that in some data centers you have huge pull bandwidth but terrible disk IO | 10:11 |
WSalmon | but that's probably premature optimisation, and we just need to get them onto disk and off RAM quicker | 10:11 |
juergbi | yes, there isn't too much that can be done about this | 10:12 |
juergbi | except striving towards reducing unnecessary pulling - or using machines with tons of RAM and tmpfs | 10:12 |
juergbi | *and | 10:13 |
WSalmon | i think bst defaulting to unlimited pullers might not be the best bet either | 10:15 |
juergbi | it defaults to 10 max | 10:15 |
WSalmon | ah | 10:18 |
coldtom | 10 pullers is still enough to put a lot of stress on a machine ime | 10:18 |
WSalmon | +1 | 10:18 |
WSalmon | Especially for anything based on FD | 10:18 |
benschubert | you can always set the default in your user conf no? | 10:20 |
juergbi | yes, the best default will vary depending on your machine and your connection | 10:21 |
WSalmon | benschubert, yes but it seems silly to have a default that requires you to have a 32GB machine..? having it be a function of ram or cpu seems more sane to me, but getting buildbox-casd to write straight to disk seems like the most important thing | 10:21 |
WSalmon | if you have a huge machine you may well need to bump it up anyway | 10:22 |
juergbi | the buildbox-casd issue needs to be fixed, of course | 10:22 |
juergbi | I don't think we should reduce the default in buildstream because of that | 10:22 |
benschubert | I agree with juergbi, it seems to work well with 10 even on my core 2duo 8GB ram machine | 10:22 |
WSalmon | benschubert, have you pulled 10 big chunks of FD from a fast cache like celduin's at the same time, or were you fetching from sources like github that are quite slow? | 10:25 |
benschubert | WSalmon: I have a local cache at home, so we're talking 1GBps :) | 10:26 |
benschubert | and I'm currently working with clang and friends | 10:27 |
benschubert | my version of bst is not 100% up to date though | 10:27 |
WSalmon | umm, interesting. also fetching and pulling work quite differently; I have thought for a while that they should have different limiters, as fetching hits the network via the source tool but pulling is via casd | 10:28 |
juergbi | the issue is with large artifacts | 10:29 |
juergbi | the relevant fdo-sdk artifacts are gigabytes (single artifact) | 10:29 |
juergbi | as they compose build-deps into an artifact of their own | 10:29 |
benschubert | juergbi: but that's a very old bug no? I remember opening it when I had 60GB artifacts | 10:29 |
benschubert | or was it fixed and came back in? | 10:30 |
juergbi | there was a capture issue that was fixed | 10:30 |
juergbi | for pulling I don't think it's a regression | 10:30 |
juergbi | at least on the buildbox side | 10:30 |
juergbi | it's very possible we fixed this in buildstream before we handed this task to buildbox-casd | 10:31 |
juergbi | (or it was correct from the beginning in pre-casd buildstream) | 10:31 |
benschubert | tristan: https://gitlab.com/BuildStream/buildstream/-/compare/master...bschubert%2Fisort that's roughly what isort would look like, there's still 1-2 details I'd need to fix around comments though | 10:49 |
benschubert | juergbi: seems like the only tests now failing are the ones using an artifact_share, do you have any idea where we could have more problems? | 10:50 |
juergbi | benschubert: BaseRemote.init? | 10:51 |
juergbi | can hopefully get rid of that fairly soon with step 2 of Remote Asset support | 10:51 |
benschubert | ah good point let me try | 10:52 |
tristan | benschubert, looking at a modified file (instead of the diff): https://gitlab.com/BuildStream/buildstream/-/blob/2833a6abd562a628f4039e74a1dadf12661d2b6d/src/buildstream/_frontend/app.py ... those comments are somehow preserved from the source code ? | 11:04 |
* tristan can see that private modules come before public ones, simply due to the sort, which seems alright | 11:04 | |
benschubert | tristan: yeah the comments are preserved but not moved correctly, I'd need to go through all of them to ensure they are in the correct order / remove the stale ones ;) | 11:08 |
benschubert | juergbi: it works! awesome, thanks a lot | 11:12 |
benschubert | ok now cleanup and make a WIP | 11:13 |
juergbi | :) | 11:13 |
tristan | benschubert, so confusing results so far, any word on that benchmark ? | 11:20 |
benschubert | let me go and check | 11:20 |
tristan | it will be interesting just to know if it makes any difference | 11:20 |
tristan | and if so, how much | 11:20 |
* tristan actually *lost* performance by removing unnecessary dict lookups | 11:21 | |
* tristan palmface | 11:21 | |
tristan | doesn't seem to make any sense | 11:21 |
benschubert | tristan: https://gitlab.com/snippets/1992419 though I realized I forgot to do one of the checks | 11:22 |
benschubert | which other commit did you want again? | 11:22 |
tristan | oh no :( | 11:22 |
tristan | I wanted the tip and the before tip | 11:23 |
tristan | just to measure the difference with and without refcounting | 11:23 |
benschubert | ok let me restart with the one before | 11:24 |
benschubert | is just the `bst show` enough? | 11:24 |
benschubert | I can speed up benchmarks a lot if so :) | 11:24 |
tristan | yes | 11:25 |
tristan | its enough | 11:25 |
benschubert | tristan: i went for everything: https://gitlab.com/snippets/1992419#note_373289531 | 12:00 |
*** tristan has quit IRC | 12:02 | |
*** tristan has joined #buildstream | 12:20 | |
*** ChanServ sets mode: +o tristan | 12:20 | |
benschubert | juergbi: seems like I've got still more problems :'D https://gitlab.com/BuildStream/buildstream/-/jobs/623129875#L860 | 14:09 |
benschubert | the diff's https://gitlab.com/BuildStream/buildstream/-/compare/master...bschubert%2Fno-multiprocessing | 14:10 |
benschubert | I'm not 100% sure how that can happen :/ | 14:10 |
* douglaswinship is particularly concerned by https://gitlab.com/BuildStream/buildstream/-/jobs/623129875#L6023 | 14:16 | |
douglaswinship | The error messages seem to be getting quite insulting! | 14:16 |
juergbi | benschubert: so that's ostree-related, right? | 14:16 |
juergbi | ostree is also known for background thread issues | 14:16 |
juergbi | do you only see this in CI? maybe it depends on the ostree version | 14:17 |
benschubert | juergbi: no sorry, line 860 :) | 14:17 |
benschubert | I've seen the ostree ones and I'll have a look at them | 14:17 |
benschubert | but the one line 860 seems worse currently | 14:17 |
juergbi | ah, hm | 14:18 |
juergbi | benschubert: CASDChannel.get_local_cas() is not thread-safe | 14:19 |
benschubert | oh gosh :'D | 14:20 |
juergbi | it assumes that _local_cas is set if _casd_channel is set | 14:20 |
benschubert | I've got a lock around establish_connection though | 14:20 |
benschubert | and same on the close | 14:20 |
juergbi | yes, but the lock is not used in get_local_cas | 14:21 |
juergbi | changing the conditional in get_local_cas to check for _local_cas probably fixes it | 14:21 |
benschubert | https://gitlab.com/BuildStream/buildstream/-/compare/master...bschubert%2Fno-multiprocessing#6b68d6e75602b4de08330322d00edd6aabf0a718_249_246 isn't that enough? | 14:21 |
benschubert | I mean, the 3 only get set at the same time right? | 14:21 |
juergbi | can't the GIL switch between threads between these assignments? | 14:22 |
benschubert | oh | 14:22 |
benschubert | -_-' right | 14:22 |
benschubert | gah thanks | 14:23 |
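A stand-in sketch of the race and the suggested fix (this is not the real CASDChannel, just an illustration): because the GIL can switch threads between the individual attribute assignments, the getter either has to take the same lock or key its check on the attribute it actually returns.

```python
import threading

class Channel:
    """Illustrative stand-in for a lazily-connected channel object."""

    def __init__(self):
        self._lock = threading.Lock()
        self._casd_channel = None
        self._local_cas = None

    def _establish_connection(self):
        # Called with self._lock held; the assignments happen one at a time,
        # so another thread could observe _casd_channel set but _local_cas not.
        self._casd_channel = object()   # placeholder for the gRPC channel
        self._local_cas = object()      # placeholder for the LocalCAS stub

    def get_local_cas(self):
        # Check the attribute we actually return (and double-check it under
        # the lock) instead of checking _casd_channel.
        if self._local_cas is None:
            with self._lock:
                if self._local_cas is None:
                    self._establish_connection()
        return self._local_cas
```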
benschubert | now ostree... | 14:23 |
benschubert | # Use local-only GIO, don't use the DBus-based GVfs backend. | 14:37 |
benschubert | # GVfs/DBus uses background threads, which can cause problems | 14:37 |
benschubert | # with BuildStream's fork-based process model. | 14:37 |
benschubert | os.environ["GIO_USE_VFS"] = "local" | 14:37 |
benschubert | Seems like we should not have threads? -_-' | 14:37 |
juergbi | benschubert: maybe GIO initialization is triggered at some earlier point | 14:43 |
benschubert | even setting it before the import of ostree doesn't change | 14:44 |
juergbi | try setting that environment variable early on in buildstream startup, if you can reproduce this | 14:44 |
benschubert | yeah I can reproduce easily locally | 14:44 |
juergbi | glib/gio might be triggered by something non-ostree, although I don't know what | 14:45 |
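The ordering being suggested, sketched out; the assumption (matching the guesswork above rather than a verified cause) is that GIO picks its VFS backend the first time something in the process touches it, which may be earlier than the ostree import:

```python
import os

# Must be exported before anything in the process touches GIO -- e.g. at the
# very top of the frontend entry point, not inside the plugin.
os.environ["GIO_USE_VFS"] = "local"

import gi                          # noqa: E402
gi.require_version("OSTree", "1.0")
from gi.repository import OSTree   # noqa: E402
```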
WSalmon | so I can pull down an artifact built on a different arch, but when I try to check it out it tries to run the sandbox. the final element is a script element with no runtime deps; I presume bst wants to run some final integration steps, can I turn that off? https://pastebin.com/cFX2ZGyT | 14:45 |
benschubert | juergbi: putting that in my pytest env doesn't change anything either | 14:47 |
abderrahim[m] | WSalmon: --no-integrate may help | 14:47 |
WSalmon | i just noticed that but it seems not | 14:47 |
WSalmon | https://pastebin.com/SUkzCW8E | 14:48 |
juergbi | benschubert: can you do a 'thread apply all bt' in gdb at a point where the extra thread exists? | 14:48 |
juergbi | maybe the backtrace of that thread provides some hints | 14:48 |
WSalmon | juergbi, tristan, benschubert i would have thought the --no-integration would fix it; is that a bug? or am I using it wrong / does it do something else? | 14:49 |
juergbi | WSalmon: maybe we instantiate the sandbox when we don't need to | 14:49 |
WSalmon | ok, ill go look at the code | 14:50 |
juergbi | I wouldn't expect artifact checkout --no-integrate to require the sandbox | 14:50 |
juergbi | it should even work without buildbox-run installed | 14:50 |
benschubert | juergbi: I can tell they are not python threads, I'm trying to get more info | 14:54 |
benschubert | do you know by any chance where's the code/doc for all of this? | 14:55 |
benschubert | (I mean ostree python) | 14:55 |
juergbi | benschubert: there is no Python OSTree code per se. pygobject dynamically generated Python bindings for the libostree C library | 15:06 |
benschubert | oh :/ less than ideal | 15:07 |
juergbi | the background thread is very likely to be created by libostree (or a dependency of that), not pygobject | 15:07 |
juergbi | it may also depend on the HTTP backend of ostree. iirc, there is a compile-time option between curl and libsoup | 15:08 |
benschubert | so what would you recommend? In the ostree tests we increment the potential number of threads? | 15:08 |
tristan | I'm not sure I understand the context of why ostree libraries starting threads becomes problematic | 15:10 |
tristan | I thought it was mostly problematic that it gets called too early when using the fork() model | 15:10 |
tristan | Quick recommendation though which would result in (A) Easier install story, (B) Lots of code removed, (C) Addressing any conflicts with threads | 15:11 |
juergbi | without fork it should be much less of an issue | 15:11 |
tristan | Would be: Just stop using python bindings for ostree, and call it normally with Plugin.call() | 15:11 |
juergbi | however, it would still be good to be sure that everything is cleaned up after each test | 15:11 |
juergbi | i.e., the tests are isolated | 15:12 |
benschubert | yeah that's the problem, for testing | 15:12 |
tristan | That would also have the benefit of (D): Not special casing that Python API break anymore | 15:12 |
juergbi | if the command line tool is sufficient for the purpose of the ostree source, that might indeed make sense | 15:12 |
juergbi | originally we depended on libostree as it was also used as our artifact cache | 15:13 |
juergbi | so the CLI might not have been sufficient | 15:13 |
juergbi | but purely for the source, it might be | 15:13 |
benschubert | that's true that this would solve the problem | 15:13 |
benschubert | and we need ostree-cli anyways to run the plugin | 15:13 |
benschubert | so in the end... | 15:13 |
tristan | I think the initial decision to use python bindings was rather arbitrary, but definitely driven by using it as artifact cache | 15:14 |
tristan | iirc there was a deeper API we needed for the ostree pushing thing to work well | 15:14 |
benschubert | So you'd both be fine with ostree-cli? We'd lose the reporting of progress, which I think is an improvement :) | 15:14 |
tristan | Haha, that progress has always been whacky | 15:15 |
tristan | percentage up and down as you discover during pull that your total is changing | 15:15 |
*** hasebastian has joined #buildstream | 15:15 | |
benschubert | ok! I'll see what I can do | 15:15 |
juergbi | I'd be fine with that. we don't have this for other sources either | 15:15 |
juergbi | benschubert: that said, fixing the background thread might be as simple as adding a close() call somewhere | 15:16 |
juergbi | with the forked job processes, background threads get eliminated by termination, so we didn't care | 15:16 |
benschubert | juergbi: I've run dir() on every ostree-related object we create | 15:16 |
benschubert | none has a `close` method | 15:16 |
benschubert | afaik | 15:16 |
tristan | Myeah, I don't expect state of the art | 15:17 |
tristan | I don't think it's GIOStream based | 15:17 |
juergbi | benschubert: have you also tried setting .repo to None? | 15:17 |
tristan | I was thinking that | 15:18 |
juergbi | essentially we might need the inverse of the ensure() method | 15:18 |
benschubert | I've `del repo` | 15:18 |
tristan | Even if it's GC'ed synchronously and "disposed", it would not be uncommon for the thread itself to asynchronously complete (no g_thread_join() in dispose()) | 15:19 |
juergbi | could be the case, yes | 15:19 |
juergbi | if ostree cli works, let's go for that | 15:20 |
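For what it's worth, a very rough sketch of what the CLI route could look like from the source plugin, using the Plugin.call() helper mentioned above. The exact ostree flags, the repo mode, and the remote handling are assumptions that would need checking against the real plugin and `ostree --help`:

```python
# Hypothetical helper inside the ostree source plugin; `source.call()` is the
# BuildStream Plugin API for running host commands with a failure message.
def fetch_with_cli(source, mirror_dir, url, ref):
    # Create the mirror repository on first use
    source.call(
        ["ostree", "init", "--repo=" + mirror_dir, "--mode=archive-z2"],
        fail="Failed to initialize OSTree mirror repository",
    )
    # Register the remote explicitly rather than relying on a default
    source.call(
        ["ostree", "remote", "add", "--if-not-exists", "--no-gpg-verify",
         "--repo=" + mirror_dir, "origin", url],
        fail="Failed to add remote {}".format(url),
    )
    # Mirror the requested ref into the local repository
    source.call(
        ["ostree", "pull", "--repo=" + mirror_dir, "--mirror", "origin", ref],
        fail="Failed to fetch ref {} from {}".format(ref, url),
    )
```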
benschubert | we're waiting up to 5s for all threads to close | 15:20 |
* tristan recalls spending time to make this happen synchronously for EDS EBookClient handles, and noticing that there wasn't much interest in having this work | 15:20 | |
* tristan rubs tabasco into his eyeball | 15:23 | |
tristan | sad | 15:23 |
benschubert | What part of the code prompted you to do that so I can avoid it? | 15:23 |
tristan | Right before the return from the pizza() function | 15:24 |
tristan | whew ok let's give this another go... I'm gonna throw it ALL away, _variable.py... the Values, the ValueClass, the ValueParts, all the nice stuff | 15:26 |
tristan | And see if I can't get a performant loop-instead-of-recurse using the ol' list returned from python regex | 15:26 |
tristan | I'm not particularly hopeful, every way I look at it, avoiding recursion requires allocating data structures | 15:27 |
tristan | So it would logically be slower than pushing a stack frame | 15:27 |
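A sketch of that loop-instead-of-recurse shape, driven by the alternating literal/variable-name list that re.split() with a capturing group returns, with an explicit work stack in place of recursive calls. The %{...} syntax matches BuildStream variables, but the rest is illustrative and skips circular-reference detection:

```python
import re

VARIABLE = re.compile(r"%\{([^}]+)\}")

def expand(name, variables):
    resolved = {}
    stack = [name]
    while stack:
        current = stack[-1]
        if current in resolved:
            stack.pop()
            continue
        # Odd indices are variable names, even indices are literal text
        parts = VARIABLE.split(variables[current])
        missing = [v for v in parts[1::2] if v not in resolved]
        if missing:
            stack.extend(missing)          # resolve dependencies first
            continue
        parts[1::2] = [resolved[v] for v in parts[1::2]]
        resolved[current] = "".join(parts)
        stack.pop()
    return resolved[name]

print(expand("bindir", {"prefix": "/usr", "bindir": "%{prefix}/bin"}))
# -> /usr/bin
```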
Frazer61 | hi, I'm getting an issue where it says "Did not find 'bzr' in PATH" when testing in the venv, I have tried installing bzr but it doesn't seem to fix it unfortunately | 15:29 |
tristan | That is odd | 15:31 |
tristan | and when you run `tox -e venv /bin/bash` and type `which bzr`, do you have it ? | 15:31 |
Frazer61 | it's when I try to build cachekey. nothing; when I try to install bzr in the venv: https://pastebin.com/hQS9ePxB | 15:32 |
tristan | uhhh | 15:33 |
tristan | is there a reason to install bzr into the venv itself ? | 15:34 |
tristan | maybe bzr is implemented in python (I don't keep track) but it shouldn't matter I think | 15:34 |
Frazer61 | https://pastebin.com/y1s4F5S1 | 15:34 |
Frazer61 | for bst build in cachekey | 15:34 |
tristan | right so when you are not in the venv, you have host bzr installed ? | 15:35 |
tristan | `which bzr` ? | 15:35 |
tristan | and when you are in `tox -e venv /bin/bash`, `which bzr` doesn't find your host bzr ? | 15:36 |
Frazer61 | no, because when i try to install it, it gives the error shown in the first pastebin. happens locally and in venv | 15:36 |
tristan | ok but what OS are you running ? | 15:37 |
Frazer61 | fedora 32 | 15:37 |
tristan | don't you have a reliable way of installing bzr on your host ? | 15:37 |
Frazer61 | using pip to try and install it | 15:37 |
tristan | what is the latest rpm -i... `aptitude` ? | 15:37 |
tristan | No thats not fedora | 15:37 |
tristan | I forget | 15:37 |
tristan | Frazer61, try using a reliable method provided by your distro to install bzr, that should work | 15:38 |
Frazer61 | when i try to install bzr it tries to install breezy, but i thought it was bazaar. is it fine to continue? | 15:39 |
Frazer61 | via dnf | 15:39 |
* tristan doesn't know what it is on fedora :-S | 15:40 | |
Frazer61 | dnf is equivalent to apt | 15:40 |
tristan | https://fedoraproject.org/wiki/Changes/ReplaceBazaarWithBreezy | 15:40 |
tristan | I mean I don't know what bzr is on fedora :) | 15:40 |
coldtom | breezy is bazaar iirc | 15:40 |
Frazer61 | nice bit of confusion :), thanks will give it a go | 15:41 |
tristan | Frazer61, yeah so it looks like it will work, different implementation of the same thing | 15:41 |
Frazer61 | y | 15:41 |
tristan | indeed confusing | 15:41 |
Frazer61 | sorry ignore that | 15:41 |
*** phildawson has quit IRC | 15:41 | |
Frazer61 | hopefully should get the first bit of bst source show working | 15:41 |
tristan | Frazer61, awesome, please do remember to summarize and get that formal proposal on the list though | 15:42 |
tristan | I know, hacking is more fun | 15:42 |
Frazer61 | will do, just using elements from fetch to get it working and go from there to make a proposal for what we want as a final solution | 15:43 |
tristan | It should pass without issue but not everyone is on IRC all the time, so it's better to have it formally on the list, and we can link to that in a resulting merge request etc for good provenance | 15:43 |
tristan | Yup :) | 15:43 |
benschubert | tristan: it does remove quite a bit of code from ostree to use the cli :D | 16:00 |
benschubert | i should have a prospective PR soon | 16:01 |
WSalmon | https://gitlab.com/BuildStream/buildstream/-/commits/willsalmon/checkout_no_sandbox feels really close to working. do we have a pattern for creating a vdir without a sandbox? I can just make a temp dir context and trigger one with that.. but I wonder if we already do it somewhere, so I can make sure I am consistent with it. juergbi maybe? | 16:03 |
juergbi | WSalmon: could just create a CASBasedDirectory directly | 16:06 |
juergbi | WSalmon: a possible alternative is to create a SandboxDummy | 16:07 |
juergbi | not sure off the top of my head what's the cleanest approach | 16:07 |
WSalmon | thanks juergbi | 16:17 |
benschubert | juergbi: tristan https://gitlab.com/BuildStream/bst-plugins-experimental/-/merge_requests/123/ still two tests failing but should work well :) | 16:40 |
juergbi | nice | 16:41 |
benschubert | not sure what's the deal with the failures :/ | 16:59 |
benschubert | thought I had reproduced the API correctly | 16:59 |
tristan | nice | 17:01 |
tristan | what are the failures ? gpg key related ? | 17:02 |
tristan | I think I added a test recently to ensure that keeps working | 17:02 |
tristan | I'm not sure if command line would tolerate changes of gpg key presence well either | 17:03 |
tristan | i.e. it happens in repo initialization, so if you add a gpg key or remove it later, it should probably mean you need a separate mirror directory for it ? | 17:04 |
tristan | repo url + hash of gpg key ? | 17:04 |
tristan | Either that or, `ostree init` needs to be able to reconfigure the repo at the beginning of every session ? | 17:05 |
benschubert | tristan: one sec, finishing the run, but apparently not, more fetch related afaict? | 17:06 |
tristan | not spotting the problem easily | 17:12 |
* tristan has been looking it over | 17:12 | |
tristan | what happens ? | 17:12 |
benschubert | I'm just waiting on a test to finish to share, will be simpler | 17:13 |
benschubert | but basically `remote origin not found` is one | 17:13 |
benschubert | and `${ref}.commit doesn't exist` | 17:13 |
benschubert | (two different errors) | 17:14 |
tristan | might be is_cached giving you trouble somehow ? | 17:15 |
tristan | that doesn't seem to ensure the remote anymore ? | 17:16 |
benschubert | oh, did it use to? shoot | 17:16 |
tristan | was just noticing that in the diff | 17:16 |
tristan | Also if is_cached() is a delegated method, which I think is one of the things replacing get_consistency(), seems unnecessary to call it in fetch() | 17:17 |
tristan | but probably our plugin facing API docs need to be clarified to specify these invariants | 17:17 |
benschubert | tristan: true I can remove that | 17:18 |
tristan | is_cached() should of course be able to return False if the mirror directory doesnt even exist | 17:19 |
tristan | like we shouldnt require spending time initializing an empty repo | 17:19 |
benschubert | yep I've just fixed that | 17:19 |
benschubert | [gw0] [ 66%] FAILED mirror.py::test_mirror_junction_from_includes[ostree] | 17:20 |
benschubert | [gw4] [ 53%] FAILED tests/sources/ostree.py::test_fetch_gpg_verify | 17:20 |
benschubert | Those are the two failures now | 17:20 |
tristan | clue: The reason why we don't just use 'origin' all around is because of mirrors | 17:21 |
tristan | if we have some commands that just rely on default remote, that could cause mirror tests to fail | 17:22 |
benschubert | so I need to set the default remote all the time? | 17:22 |
benschubert | I thought we only set one remote for every one? | 17:22 |
tristan | I don't think so | 17:22 |
tristan | Just guessing every command needs to specify the remote | 17:23 |
tristan | maybe in the old code we were setting the default remote ? not sure | 17:23 |
benschubert | https://gitlab.com/BuildStream/bst-plugins-experimental/-/jobs/623441172#L575 is one of them :/ | 17:24 |
benschubert | yeah I'm really not familiar with ostree | 17:24 |
*** hasebastian has quit IRC | 17:27 | |
WSalmon | so this works locally with and without --no-integration https://gitlab.com/BuildStream/buildstream/-/commits/willsalmon/checkout_no_sandbox juergbi the only difference in reality as far as I can tell is that self.__configure_sandbox(sandbox) is not called, would that affect what gets checked out for a --no-integration checkout? | 17:29 |
WSalmon | juergbi, ^ | 17:29 |
WSalmon | as far as I can tell that would only affect the case with integration | 17:29 |
tristan | benschubert, looks like it should be fairly easy to figure out with the failed test data and the ostree command line locally though | 17:30 |
tristan | You can literally copy paste the commands, run them on the test data and figure out where things went wrong | 17:30 |
tristan | I'm not sure if you want --mirror for instance, you might, I don't know | 17:31 |
benschubert | tristan: right, that seems the right way to go :D | 17:33 |
benschubert | Seems also like my multiprocessing change prevents termination :/ | 17:36 |
benschubert | pause works perfectly though | 17:36 |
benschubert | so there is that | 17:36 |
tristan | cool ! | 17:51 |
tristan | benschubert, checking `ps axf` to make sure you've put everything to sleep ? | 17:52 |
benschubert | I'll try that | 17:52 |
tristan | TS I think you want... iirc | 17:52 |
benschubert | So I was building LLVM, 12 cores used, load average 12. ctrl + c and in a second no more cores used at all :) | 17:53 |
benschubert | that works | 17:53 |
benschubert | the only thing I'm not 100% sure is around the python code | 17:53 |
benschubert | and continue -> restart immediately | 17:53 |
tristan | some things don't respond well, I think ninja might still suck at job control wrt SIGTSTP | 17:54 |
tristan | not our fault if those process trees don't suspend | 17:54 |
benschubert | yep, we're doing the best we can | 17:54 |
benschubert | but unless we are the kernel, well :D | 17:54 |
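The suspend/resume mechanics being described, as a minimal sketch; the child command is a placeholder, and whether a real build tool honours SIGTSTP cleanly is exactly the open question above:

```python
import os
import signal
import subprocess
import time

# Start the job in its own session/process group so the signal reaches the
# whole process tree, not just the immediate child.
proc = subprocess.Popen(["sleep", "60"], start_new_session=True)
pgid = os.getpgid(proc.pid)

time.sleep(1)
os.killpg(pgid, signal.SIGTSTP)   # suspend the group (shows as 'T' in ps axf)
time.sleep(1)
os.killpg(pgid, signal.SIGCONT)   # resume it again

proc.terminate()
proc.wait()
```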
tristan | We could get better support from the kernel in the future | 17:55 |
tristan | with containers | 17:55 |
tristan | who knows :) | 17:55 |
benschubert | ah namespaces yeah :) | 17:55 |
* benschubert is losing his sanity on the last bug of ostree | 17:55 |
tristan | join the club :) | 17:55 |
benschubert | WSalmon: what are the symptoms of your OOM errors again? | 17:56 |
tristan | I've lost it already earlier this week trying to get non-recursive variable resolution to perform decently :) | 17:56 |
tristan | Another thing to consider wrt ostree: compatibility with the older plugin | 17:56 |
benschubert | what kind of compatibility? | 17:57 |
tristan | if for any reason we do things differently (repo config or smth) causing an old cached repo to not work with the new cli based access commands | 17:57 |
tristan | then we'd have to namespace the cache directory | 17:57 |
benschubert | no I don't think that's a problem | 17:58 |
tristan | and trigger an expensive re-download of probably a lot of data (ostree mostly used for big downloads) | 17:58 |
tristan | yeah I doubt it | 17:58 |
WSalmon | I can't find an example but casd just uses all of the memory | 18:03 |
benschubert | ok I'm giving up for today, will come back to it tomorrow :'D | 18:03 |
benschubert | WSalmon: And what exit code? | 18:03 |
WSalmon | then it depends what needs a chunk of memory once it's all gone | 18:03 |
benschubert | ok, will need to monitor my build more closely | 18:04 |
benschubert | it's not succeeding and I've got 32GB of RAM :/ | 18:04 |
WSalmon | :( | 18:05 |
WSalmon | i should be able to find one but not now | 18:05 |
WSalmon | ill ping you one | 18:05 |
WSalmon | when i find it | 18:05 |
WSalmon | have a good non buildstreaming evening :D | 18:06 |
*** tomaz has joined #buildstream | 18:36 | |
tomaz | people, is there any `kind` for installing a ruby-gem? | 18:37 |
*** tomaz has quit IRC | 18:51 | |
WSalmon | ohhh I helped someone out with one a while back | 19:24 |
WSalmon | can't remember if it was here or in FD-SDK | 19:25 |
benschubert | Yep I can confirm, with casd's master from two days ago I can't build llvm on a 32GB machine | 19:35 |
WSalmon | toma | 20:49 |
WSalmon | sorry, fat fingers | 20:49 |
*** pointswaves has joined #buildstream | 21:01 |