*** tristan has quit IRC | 06:27 | |
*** benschubert has joined #buildstream | 07:17 | |
*** phildawson has joined #buildstream | 08:10 | |
*** santi has joined #buildstream | 08:37 | |
*** tristan has joined #buildstream | 09:47 | |
*** ChanServ sets mode: +o tristan | 09:47 | |
benschubert | tristan: hey, if you have time, I'd appreciate a second review round on https://gitlab.com/BuildStream/buildstream/-/merge_requests/1985 since I've fixed a few bugs and haven't done all the changes you were mentioning | 10:00 |
juergbi | tristan: iirc, we decided about a year ago that we would keep bst 1.x client support (ReferenceStorageService) in bst-artifact-server master in addition to the then newish BuildStream artifact/source protocols | 10:16 |
juergbi | things have changed a bit since then since we're moving to the Remote Asset API and we're planning to completely drop bst-artifact-server if/when a suitable replacement exists | 10:17 |
juergbi | would it thus be ok to drop bst 1.x client support in bst-artifact-server now or what do you think? | 10:17 |
juergbi | see https://gitlab.com/BuildStream/buildstream/-/merge_requests/1978#note_372884164 | 10:18 |
juergbi | do we need to discuss this on the list? | 10:18 |
tristan | benschubert, I won't be able to until later tonight :-/ | 10:23 |
benschubert | tristan: no worries, it's not that pressing :) | 10:23 |
tristan | benschubert, regarding the stylistic issues, I have no problems with this, e.g. stop/quit etc | 10:23 |
* tristan has an iterator approach to Value.resolve(), and now an approach which encodes all value parts to utf-8 strings, concatenates them and decodes back to `unicode` type | 10:25 | |
tristan | which "works" but probably needs more touchups to see if it's a winner | 10:25 |
tristan | meh, currently not great | 10:26 |
* tristan just got the test to work before needing to run out the door | 10:26 | |
tristan | if only I could get Py_UNICODE memcpy to do anything meaningful :-S | 10:26 |
* tristan cloned cython and peeked at the insides of that, looks like most of it is mapping to raw C | 10:27 | |
tristan | with #defines and such for things sitting in Include/cpython/* | 10:27 |
tristan | probably better guidelines for using the raw C API of CPython would be more useful than sugar-coated .pxd header files | 10:28 |
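A minimal Python sketch of the encode/concatenate/decode approach tristan describes above; the function name and inputs are illustrative, not actual BuildStream API:

```python
def resolve_parts(parts):
    # Encode each already-resolved part to UTF-8 bytes, join them in a
    # single pass, then decode back to one str ("unicode") object.
    return b"".join(part.encode("utf-8") for part in parts).decode("utf-8")

assert resolve_parts(["hello", ", ", "world"]) == "hello, world"
```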
benschubert | tristan: btw, ostree now passes all tests: https://gitlab.com/BuildStream/bst-plugins-experimental/-/merge_requests/123 :) | 10:31 |
tristan | yay \o/ | 10:31 |
benschubert | juergbi: https://gitlab.com/BuildStream/bst-plugins-experimental/-/merge_requests/124 that can also go now right? | 10:34 |
juergbi | yes | 10:35 |
benschubert | thanks! | 10:37 |
benschubert | I've got a design choice to make on the Messenger with the threading scheduler: | 10:52 |
benschubert | The way it currently works is thanks to a child/parent separation (everything in the master thread/process is what is outside of jobs; everything in a job is run in another thread). | 10:52 |
benschubert | This is good as it is easy to handle with thread-locals, but it means we couldn't increase usage of the ioloop, and would need to have the threads do _all_ the work instead of a smaller part, which might work better with the loop and deliver potentially better performance. | 10:52 |
benschubert | My first idea was to use a Messenger per job, and set it on the `plugin` at the start, which would have meant transparent usage to the element. | 10:53 |
benschubert | However, this does not work, since the messenger can be used by dependencies, for example when staging them. Which means we could need multiple messengers on a specific instance at the same time. | 10:53 |
benschubert | So I think the simplest I could do is: | 10:55 |
benschubert | - A global messenger, the default if nothing else is set | 10:55 |
benschubert | - An explicit messenger passed around in _buildstream private_ code, when we need to override the global messenger | 10:55 |
benschubert | - A thread-local messenger, we'd use in the jobs at the exact point where we live inside a thread (that way, if we only have plugin code running in the thread, we still have transparent usage for BuildStream plugins in the public API) | 10:55 |
benschubert | Does that seem too convoluted to people? tristan, juergbi if you have time? :) | 10:55 |
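A rough Python sketch of the three-level lookup being proposed; all helper names here are hypothetical, none of this is existing BuildStream code:

```python
import threading

_global_messenger = None           # level 1: default when nothing else is set
_thread_local = threading.local()  # level 3: set per job thread

def set_global_messenger(messenger):
    global _global_messenger
    _global_messenger = messenger

def set_thread_messenger(messenger):
    _thread_local.messenger = messenger

def get_messenger(explicit=None):
    # Resolution order: an explicit messenger passed around in private
    # code (level 2), then the thread-local one, then the global default.
    if explicit is not None:
        return explicit
    thread_messenger = getattr(_thread_local, "messenger", None)
    return thread_messenger if thread_messenger is not None else _global_messenger
```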
traveltissues | juergbi: i think that if removing support for something it makes sense to mention it on the list | 11:05 |
juergbi | benschubert: per job messenger makes sense to me as that's also how our log files are structured. and as we execute each job in a thread, using thread-local storage for this seems reasonable, assuming it's not a performance bottleneck | 12:11 |
juergbi | and a global messenger for all non-job threads also makes sense | 12:12 |
juergbi | benschubert: the only issue I can think of is if a job were to start using multiple threads. we probably don't generally want to support explicit thread creation. however, if there is a chance a job is using a thread pool somewhere, thread-local storage would be an issue | 12:13 |
benschubert | juergbi: sorry, I might not have explained correctly. The thread-local messenger is what we have today, but it brings some problems :) | 12:29 |
benschubert | So I was thinking about the 3-level messenger, with more explicit setup on the internal code paths | 12:29 |
benschubert | it might cause problems with threads spawned from plugin code, that's true | 12:30 |
benschubert | though the only way I can see that working then is always passing the messenger (or other identification) explicitly | 12:30 |
benschubert | which doesn't make it ideal | 12:30 |
benschubert | unless you would have another idea? | 12:30 |
juergbi | benschubert: not really. well, one tweak could be to explicitly set the global messenger as the thread-local messenger of the main thread, instead of using it as fallback | 12:34 |
juergbi | and then bail out if attempting to log with neither thread-local messenger nor explicit messenger available | 12:35 |
juergbi | that might at least allow us to detect this issue | 12:35 |
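That tweak could look like the following variant of the hypothetical get_messenger() above: the main thread registers its messenger as thread-local state just like a job thread, and lookup fails loudly instead of silently falling back to a global:

```python
def get_messenger_strict(explicit=None):
    # No global fallback: the main thread must have registered its own
    # thread-local messenger, so a miss here means a thread was spawned
    # without messenger setup, and we detect it instead of mislogging.
    if explicit is not None:
        return explicit
    messenger = getattr(_thread_local, "messenger", None)
    if messenger is None:
        raise RuntimeError("no messenger set for this thread")
    return messenger
```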
benschubert | Oh yeah that could work | 12:35 |
juergbi | (if necessary we could provide API to 'copy' the thread-local messenger into a helper thread but let's hope it's not needed in the public API) | 12:36 |
benschubert | yeah or we could have helpers to start threads if it's ever needed | 12:36 |
benschubert | ok! I like this and will start looking into it, thanks! | 12:36 |
juergbi | yes. but with the direction we're going, I assume it shouldn't | 12:36 |
juergbi | (in the public API at least) | 12:37 |
juergbi | yw | 12:37 |
benschubert | yep | 12:37 |
Frazer61 | hi, seem to be getting a fail here locally and on master, is this a known issue? https://gitlab.com/BuildStream/buildstream/-/jobs/625428946 | 13:16 |
benschubert | Frazer61: on master too? https://gitlab.com/BuildStream/buildstream/-/jobs/625428946#L7309 is pretty clear on what the error is there at least: you are telling logging to use '{' (new-style formatting), but your format string is using '%', aka old-style formatting | 13:18 |
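The mismatch is reproducible with the stock logging module; since Python 3.8, Formatter validates the format string against the chosen style:

```python
import logging

# style="{" means str.format()-style fields are expected, so a %-style
# format string fails validation with a ValueError.
try:
    logging.Formatter("%(levelname)s: %(message)s", style="{")
except ValueError as err:
    print(err)

# Matching new-style fields pass validation:
logging.Formatter("{levelname}: {message}", style="{")
```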
Frazer61 | oh, seem to be getting the issue locally on master | 13:18 |
*** hasebastian has joined #buildstream | 13:29 | |
Frazer61 | sorry, made a mistake, ignore that | 13:35 |
WSalmon | so why did autotools not get moved to bst-plugins-experimental? and how is https://gitlab.com/BuildStream/buildstream/-/blob/master/src/buildstream/plugins/elements/autotools.yaml#L52 supposed to work? i still get MAKEFLAGS: -j8 when it should be j1? https://gitlab.com/celduin/bsps/bst-boardsupport/-/jobs/625698066/artifacts/file/artifact/freedesktop-sdk/components-slang/1ad3d1df-build.3146.log https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/blob/master/elements/components/slang.bst#L9 | 16:07 |
WSalmon | juergbi, benschubert tristan ^ | 16:08 |
juergbi | WSalmon: move: I think it's not that simple as it's used by the test suite. however, we still want to move it, afaik | 16:11 |
juergbi | can you link to the place where you set notparallel? | 16:11 |
WSalmon | https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/blob/master/elements/components/slang.bst#L9 | 16:12 |
WSalmon | https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/blob/master/elements/components/slang.bst#L10 | 16:12 |
WSalmon | sorry | 16:12 |
juergbi | WSalmon: I'm wondering whether this might be another fallout from the variable resolution change. however, we have a test case for notparallel, which seems to pass. tests/integration/manual.py::test_manual_element_noparallel | 16:22 |
douglaswinship | re the plugins API: node_subst_vars and node_subst_sequence_vars both say they're deprecated. What should be used instead? | 16:30 |
douglaswinship | Does strip_node_info automatically replace variables? Because it doesn't mention that in the docs, so far as I can see. | 16:31 |
benschubert | douglaswinship: all the variables are not automatically expanded yes | 16:47 |
*** hasebastian has quit IRC | 16:48 | |
douglaswinship | so if they're not expanded automatically, how should we expand them? I believe the traditional way was with node_subst_vars and node_subst_sequence_vars. But the docs say they're both deprecated. | 16:48 |
douglaswinship | What's the new way? | 16:48 |
douglaswinship | benschubert: ^^ | 16:50 |
benschubert | *now, not "not", sorry | 16:50 |
benschubert | they are now expanded :) | 16:50 |
douglaswinship | Oh! that's much easier to deal with. Thanks :) | 16:51 |
*** santi has quit IRC | 17:25 | |
tristan | juergbi, regarding bst-1 support in artifact servers for bst-2: clearly, if we are steering towards not offering an artifact server in bst-2, that decision doesn't stand anymore | 17:43 |
tristan | juergbi, I also seem to recall that we made this decision initially due to the low cost it incurred at that time | 17:44 |
tristan | benschubert, regarding messenger and things "not working", this is an interesting thing which has regressed in the past and is worth some attention | 17:45 |
douglaswinship | jjardon: WSalmon: I believe I've fixed the issue that was causing that overnight job to fail: | 17:45 |
douglaswinship | https://gitlab.com/BuildStream/bst-plugins-experimental/-/merge_requests/125 | 17:45 |
douglaswinship | (Could use some review) | 17:45 |
tristan | benschubert, I don't recall all the details, and was not around for the introduction of this `Messenger` object in the first place, but I do recall that messages being delivered to the frontend have 2 related elements, and both are meaningful | 17:46 |
tristan | benschubert, i.e. the `task` element and the element which issued a message are separate IDs on the Message object, and had reason to be. I believe that in most cases, we do *not care* about the element ID of which element issued a message within a task; we are mostly concerned (in the main log) with the *task* element | 17:47 |
tristan | benschubert, on the other hand, when logging a task, we might prefer the element ID, since the logging in that specific log file is already task specific | 17:47 |
douglaswinship | jjardon, WSalmon: I'm still not sure why it worked on the freedesktop runners, and not on the freedesktop-sdk runners. | 17:48 |
tristan | benschubert, what this all amounts to IIRC, is that you get proper grouping and understanding of what is going on *for a task* in the main log, and more minute detail in a task-specific log (it really sucks to see many different element names in, say, "staging this and that dependency") | 17:48 |
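A simplified sketch of the two IDs tristan describes on the Message object; the field and function names are illustrative, not the exact attributes of BuildStream's internal Message class:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Message:
    text: str
    element_id: Optional[int] = None  # element that issued the message
    task_id: Optional[int] = None     # element the overall task is for

def main_log_group(message: Message) -> Optional[int]:
    # The main log groups messages by task, caring about the issuing
    # element only when no task is associated with the message.
    return message.task_id if message.task_id is not None else message.element_id
```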
jjardon | douglaswinship: cool, can you change the CI job to use that to check if it passes? | 17:50 |
douglaswinship | "And the fix can be seen working here: https://gitlab.com/BuildStream/buildstream/-/jobs/625901528" | 17:50 |
tristan | benschubert, I'm not sure what the implications of job-specific messengers are, or why we need them, but we do want to ensure that when plugins issue messages from a task, they inherit the task_id from that task before being sent to frontend message handlers | 17:50 |
douglaswinship | jjardon: It doesn't pass, but it hits a different issue | 17:51 |
tristan | benschubert, maybe your thread-local Messenger mostly exists for this sole purpose | 17:51 |
douglaswinship | jjardon: And we have a fix for that different issue too. It's not in the 'bst2' branch of freedesktop-sdk, but it'll be in the new bst2 branch, once we merge https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/merge_requests/2853 | 17:52 |
jjardon | douglaswinship: nice, so with those 2 the CI passes? | 17:53 |
douglaswinship | jjardon: for that, we need to see the results of https://gitlab.com/BuildStream/buildstream/-/jobs/625935362 | 17:54 |
douglaswinship | Which isn't finished yet. | 17:54 |
douglaswinship | But i'm optimistic | 17:54 |
* jjardon cross fingers :) | 17:54 | |
douglaswinship | And toes! | 17:55 |
benschubert | tristan: oh right, so if we were to group by task we'd be fine then? We always specify the task right? | 18:03 |
benschubert | Yeah, the reason for the thread-local etc. is logging to the correct log file | 18:03 |
benschubert | tristan: thanks for the review! I've assigned marge :D | 18:07 |
tristan | Great :) | 18:11 |
tristan | benschubert, yeah the thing is, the frontend IIRC will *fallback* to logging with the element_id if no task_id is specified in a log message | 18:11 |
tristan | benschubert, so it's easy to mess up in the core, if the task_id goes unstamped, then you get a lot of element specific messages which appear incoherent when reading the master log | 18:12 |
Frazer61 | tristan, just wondering if you have had a chance to look at the `bst source show` proposal, or will that be a job for tomorrow? | 18:12 |
benschubert | tristan: and in plugins, do they specify the task_id whenever they log something? | 18:12 |
tristan | benschubert, Basically in the master log you want the main [cache key, task name, task element] stuff to be consistent for a given task (i.e. anything that did a START and will finish with either SUCCESS/FAILED/SKIPPED) | 18:13 |
tristan | benschubert, in the current model the job takes care of stamping the message before sending it to the frontend | 18:13 |
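That stamping step could be as small as this sketch, reusing the illustrative Message class from above (Job and send_to_frontend are stand-ins, not BuildStream internals):

```python
class Job:
    def __init__(self, task_id, send_to_frontend):
        self.task_id = task_id                  # the element this job is for
        self.send_to_frontend = send_to_frontend

    def message(self, message: Message) -> None:
        # Stamp the job's task_id onto any message that lacks one before
        # it reaches the frontend, so the main log groups it correctly.
        if message.task_id is None:
            message.task_id = self.task_id
        self.send_to_frontend(message)
```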
benschubert | tristan: that's the problem, before the job was in a separate process, so it was a global thing | 18:14 |
benschubert | now, we can't really do that anymore if we go for threads | 18:14 |
benschubert | Unless we pass the messenger object/something else explicitly around in the private API | 18:14 |
tristan | benschubert, Right so in that context you want something to automate it, a thread-local (per-job) Messenger object to do the stamping and job-specific logfile logging seems to make sense to me for this | 18:15 |
benschubert | which works only if we run the entire job in a single thread, which is not that efficient :/ | 18:15 |
tristan | benschubert, Either that or have *some* thread local data which the Messenger object uses to do things conditionally | 18:16 |
tristan | Eh ? | 18:16 |
tristan | Really ? | 18:16 |
benschubert | well, we could do better, by having the ioloop handle more stuff, and keeping what runs in a thread more minimal | 18:16 |
tristan | You would run multiple threads for the span of a single job ? | 18:16 |
tristan | Frazer61, Sorry :-/ | 18:17 |
benschubert | Basically, what I think would be the "most efficient" would be to run most of the things in the ioloop with async mechanism, and launch the plugin's public code inside background threads | 18:17 |
tristan | Frazer61, I got side tracked today and took a long afternoon off, have a friend who is going in for surgery tomorrow and will be in hospital for a week | 18:17 |
tristan | benschubert, That sounds a bit convoluted to me but maybe I'm missing how that all works when written in code :) | 18:18 |
tristan | benschubert, just a first impression | 18:18 |
Frazer61 | no worries tristan, hope it goes well. I'll prioritise other BuildStream related issues in the mean time | 18:18 |
benschubert | tristan: ok, I'll try something, and put it up so we can discuss it more easily :) | 18:18 |
tristan | benschubert, I would have imagined that (A) You run a thread and that thread does blocking operations (B) If the blocking operations are IPC calls to CASD or other IO based things, those are all just transparently handled in the thread, automatically yielding the GIL whenever going to sleep and blocking | 18:19 |
tristan | Getting the file descriptors and handling them in the main loop of the main thread seems like quite a chore, and... *maybe* more optimal ? | 18:20 |
tristan | Does it pay off to have a single poll() in the main thread and not have other threads sleeping in poll() simultaneously ? I don't know | 18:21 |
tristan | Anyway sure, let's discuss more around code later :) | 18:22 |
benschubert | tristan: right, I'm not explaining everything there, sorry. Basically, my medium-term plan for this scheduler is that we could start using `uvloop` since we would not need child watchers anymore. That would probably give us a nice boost in performance. However, for that to be the most efficient, we'd need to have most of the code running in the loop instead of other threads :) | 18:23 |
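For context, the uvloop switch itself is a one-liner; the rest of this hypothetical sketch shows the shape being discussed, where only the blocking plugin-facing code is pushed into a worker thread while everything else stays on the loop:

```python
import asyncio
import uvloop  # third-party drop-in replacement event loop for asyncio

def run_plugin_code(n):
    # Stand-in for the blocking, plugin-facing part of a job.
    return n * n

async def run_job(n):
    loop = asyncio.get_running_loop()
    # Async-friendly work (IPC, IO) would stay on the loop; only the
    # blocking plugin call is handed to a background thread.
    return await loop.run_in_executor(None, run_plugin_code, n)

async def main():
    print(await asyncio.gather(*(run_job(n) for n in range(4))))

uvloop.install()  # replaces the default asyncio event loop policy
asyncio.run(main())
```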
tristan | benschubert, in any case, at every turn that you launch a thread, there is a Job() that is related to it, so even if you end up launching many threads for a single Job(), you should have one single focus point for launching a Job() specific thread | 18:23 |
benschubert | tristan: right, we could hide everything from user's code like that :) | 18:24 |
tristan | Maybe the thread local data is just a single job reference variable, and the Messenger checks if there is a job in context and does the job specific thing then | 18:25 |
benschubert | yeah, I'll dig into it to see what would look best :) thanks! | 18:28 |
benschubert | Can someone enlighten me: why is https://gitlab.com/BuildStream/buildstream/-/blob/master/src/buildstream/_scheduler/queues/queue.py#L270 needed? | 19:32 |
benschubert | oh wait, is it to keep the last build tree so we can rebuild on top of it next time? | 19:43 |
benschubert | Then shouldn't we only be doing this in the build queue? | 19:43 |
benschubert | juergbi: ^? | 19:43 |
*** toscalix has joined #buildstream | 19:47 | |
tristan | benschubert, it's there because it's basically "state on which a job can have side effects, which need to be replicated in the main process" | 21:20 |
tristan | Regardless of who sets it, or why | 21:20 |
benschubert | tristan: the only thing that changes is workspace data right? And that can only change in some but not all types of jobs no? Shouldn't we be more explicit on when that can change? | 21:21 |
tristan | Not sure if we should | 21:22 |
benschubert | Surely any push operation would not affect that at all, nor would a fetch | 21:22 |
tristan | I think I would be more happy to write jobs which explicitly modify state, and a single point at which all state which "can be modified" is pulled into the main data model in a synchronized way | 21:23 |
tristan | Than to have to pick and choose | 21:23 |
benschubert | I'm not that sure, a lot of the job/queues interactions are quite complex and the intent might be clearer if we were doing things explicitly | 21:24 |
tristan | I think we are already explicit when we modify state | 21:24 |
tristan | And being doubly explicit is only extra special casing | 21:25 |
benschubert | Well, this part of the code is only there to update workspaces, and there is a single part of the code that I can think of that can do that, though this is called by 5 different operations :) | 21:26 |
benschubert | Anyways, I'll leave it as is for now, was just wondering if we could not chop off a good chunk of complexity there | 21:27 |
tristan | I guess what makes workspaces different, is that this state synchronization is not bound to the semantics of the job invocation | 21:30 |
tristan | like the result of an operation being complete | 21:30 |
tristan | it would be nice if it would be | 21:30 |
tristan | but in this case it's in a category of its own | 21:30 |
tristan | because a job can ask the project for the workspaces data, call an API on it, and expect that to be persistent | 21:31 |
tristan | Originally, it was written to just synchronously rewrite the workspaces file from the subprocess | 21:31 |
tristan | with no regard whatsoever for updating it in the main process | 21:31 |
benschubert | But it kind of is no? It's only when we get a new build tree that the state changes, and that's for builds only? | 21:31 |
tristan | which passed a test | 21:31 |
tristan | but was obviously buggy | 21:32 |
tristan | Well no, it's not | 21:32 |
tristan | benschubert, it might be something that happens to be a predictable side effect of a single case | 21:32 |
tristan | but it is not bound to the semantics of launching a job | 21:32 |
tristan | it's not written that way | 21:32 |
benschubert | I know, but should we rewrite it like that? | 21:33 |
tristan | it could be nice if it were written that way, such that a Job did not have access to the global Workspaces object and could not just call functions on it to set things | 21:33 |
tristan | That could be nice yes :) | 21:33 |
benschubert | I'd actually argue this should be moved to the build queue :) | 21:34 |
tristan | I don't think so by itself no | 21:34 |
benschubert | Why? | 21:34 |
tristan | Because of the way the workspace API is set | 21:34 |
tristan | Something is allowed to call workspace_set_foo() | 21:34 |
tristan | in a global context on the data model | 21:34 |
tristan | Instead, it should be written in such a way that the job explicitly returns data to be applied in the main process | 21:35 |
tristan | And the child job should not be just calling workspace_set_foo() and expecting things to "just sort themselves out" | 21:35 |
tristan | That should be streamlined into the semantics of launching the job and collecting the jobs return values | 21:35 |
tristan | otherwise it's currently an expectation that any job can workspace_set_foo() and things will sort themselves out | 21:36 |
benschubert | Mmh not sure I 100% agree with the how, but it would be definitely better than now. Ok I'll put it on a background thread :) | 21:37 |
tristan | Element is modifying the workspace | 21:37 |
tristan | As a side effect of assemble I think | 21:37 |
tristan | internal Element._assemble() could be extended to return the side effects it wants to affect the workspaces with | 21:38 |
tristan | That could be passed back through the BuildJob | 21:38 |
tristan | Currently it's just "Element sets workspaces at times of its choosing" | 21:38 |
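A hedged sketch of the direction being converged on here: _assemble() returns the workspace updates it wants instead of mutating global state, and only the build queue applies them in the main process. All names below beyond _assemble are hypothetical, not current BuildStream code:

```python
class Workspaces:
    # Stand-in for the main-process workspace data model.
    def __init__(self):
        self._state = {}

    def update(self, element_name, data):
        self._state.setdefault(element_name, {}).update(data)

class Element:
    def __init__(self, name):
        self.name = name

    def _assemble(self):
        artifact_key = "cafe1234"  # stand-in for the real build result
        # Return the desired workspace side effects instead of applying
        # them directly from job context.
        return artifact_key, {"last_build": artifact_key}

class BuildQueue:
    def __init__(self, workspaces):
        self._workspaces = workspaces

    def done(self, element, result):
        # The job's return value is the only channel for side effects;
        # only the build queue (main process) touches the workspace model.
        _artifact_key, workspace_update = result
        self._workspaces.update(element.name, workspace_update)

# Example: the main process applies the job's returned side effects.
workspaces = Workspaces()
queue = BuildQueue(workspaces)
element = Element("hello.bst")
queue.done(element, element._assemble())
```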
benschubert | Ah yes! There we agree :) | 21:38 |
tristan | Right, that's what I mean, it's currently this hazy thing where jobs/elements just set stuff | 21:39 |
tristan | that's why it's in a category where "things that get set, become synchronized" | 21:39 |
tristan | Which yeah, I agree is sloppy | 21:39 |
tristan | but let's fix it all the way if we do :) | 21:40 |
benschubert | Right, so with a thread-based scheduler, we can return more meaningful data in the return of the _assemble and use it to update the workspace in the build queue (and only that one) | 21:41 |
benschubert | Would that make sense? | 21:42 |
tristan | In a multiprocess model we could too | 21:42 |
tristan | Only, in a thread-based model, things become more dangerous | 21:42 |
tristan | I.e. one could have overlooked the relevance of synchronizing state at the end of a job when originally writing up this workspace side effect | 21:43 |
tristan | and it might even work well most of the time | 21:43 |
tristan | I would expect in hindsight that this would have happened without much consideration given to when "the right time to set workspace data" is | 21:44 |
tristan | Not exactly sure if it's important for it to be synchronized at the end of a job, just saying; it's dangerous that things could "just work" without forcing that consideration to occur | 21:44 |
*** toscalix has quit IRC | 21:46 | |
benschubert | Right, ok I've got enough to think about for now, thanks! | 21:46 |
tristan | I think that as far as workspaces go, it's just important that the side effects of a single workspace modification do not affect adjacent workspace modifications in the same session, which might be happening concurrently | 21:46 |
*** benschubert has quit IRC | 23:56 |