IRC logs for #buildstream for Monday, 2020-07-06

*** tristan has quit IRC06:27
*** benschubert has joined #buildstream07:17
*** phildawson has joined #buildstream08:10
*** santi has joined #buildstream08:37
*** tristan has joined #buildstream09:47
*** ChanServ sets mode: +o tristan09:47
benschuberttristan: hey, if you have time, I'd appreciate a second review round on https://gitlab.com/BuildStream/buildstream/-/merge_requests/1985 since I've fixed a few bugs and haven't done all the changes you were mentioning10:00
juergbitristan: iirc, we decided about a year ago that we would keep bst 1.x client support (ReferenceStorageService) in bst-artifact-server master in addition to the then newish BuildStream artifact/source protocols10:16
juergbithings have changed a bit since then since we're moving to the Remote Asset API and we're planning to completely drop bst-artifact-server if/when a suitable replacement exists10:17
juergbiwould it thus be ok to drop bst 1.x client support in bst-artifact-server now or what do you think?10:17
juergbisee https://gitlab.com/BuildStream/buildstream/-/merge_requests/1978#note_37288416410:18
juergbido we need to discuss this on the list?10:18
tristanbenschubert, I won't be able to until later tonight :-/10:23
benschuberttristan: no worries, it's not that pressing :)10:23
tristanbenschubert, regarding the stylistic issues, I have no problems with this, e.g. stop/quit etc10:23
* tristan has an iterator approach to Value.resolve(), and now an approach which encodes all value parts to utf-8 strings, concatenates them and decodes back to `unicode` type10:25
tristanwhich "works" but probably needs more touchups to see if it's a winner10:25
tristanmeh, currently not great10:26
* tristan just got the test to work before needing to run out the door10:26
tristanif only I could get Py_UNICODE memcpy to do anything meaningful :-S10:26
* tristan cloned cython and peeked at the insides of that, looks like most of it is mapping to raw C10:27
tristanwith #defines and such for things sitting in Include/cpython/*10:27
tristanbetter guidelines for using the raw C API of CPython would probably be more useful than sugar-coated .pxd header files10:28
benschuberttristan: btw, ostree now passes all tests: https://gitlab.com/BuildStream/bst-plugins-experimental/-/merge_requests/123 :)10:31
tristanyay \o/10:31
benschubertjuergbi: https://gitlab.com/BuildStream/bst-plugins-experimental/-/merge_requests/124 that can also go now right?10:34
juergbiyes10:35
benschubertthanks!10:37
benschubertI've got a design choice to make on the Messenger with the threading scheduler:10:52
benschubertThe way it currently works relies on a child/parent separation (everything in the master thread/process is outside of jobs; everything in a job runs in another thread).10:52
benschubertThis is good as it is easy to handle with thread-locals, but it means we couldn't increase usage of the ioloop, and would need to have the threads do _all_ the work instead of a smaller part, which might work better with the loop and deliver potentially better performance.10:52
benschubertMy first idea was to use a Messenger per job, and set it on the `plugin` at the start, which would have meant transparent usage for the element.10:53
benschubertHowever, this does not work, since the messenger can be used by dependencies, for example when staging them. Which means we could need multiple messengers on a specific instance at the same time.10:53
benschubertSo I think the simplest I could do is:10:55
benschubert- A global messenger, the default if nothing else is set10:55
benschubert- An explicit messenger passed around in _buildstream private_ code, when we need to override the global messenger10:55
benschubert- A thread-local messenger, we'd use in the jobs at the exact point where we run inside a thread (that way, if we only have plugin code running in the thread, we still have transparent usage for BuildStream plugins in the public API)10:55
benschubertDoes that seem too convoluted to people? tristan, juergbi if you have time? :)10:55
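A minimal sketch of the three-level lookup benschubert proposes (explicit messenger over thread-local over global fallback); all names here (`Messenger`, `resolve_messenger`, `set_thread_messenger`) are hypothetical, not the real BuildStream API:

```python
# Illustrative three-level messenger resolution:
#   1. explicit messenger passed through private code paths
#   2. thread-local messenger set by the job running in this thread
#   3. global default messenger
import threading

class Messenger:
    def __init__(self, name):
        self.name = name
        self.messages = []

    def message(self, text):
        self.messages.append(text)

GLOBAL_MESSENGER = Messenger("global")
_thread_local = threading.local()

def set_thread_messenger(messenger):
    # Called at the point where a job starts running inside its thread
    _thread_local.messenger = messenger

def resolve_messenger(explicit=None):
    if explicit is not None:
        return explicit
    local = getattr(_thread_local, "messenger", None)
    if local is not None:
        return local
    return GLOBAL_MESSENGER
```

With this shape, plugin code that only ever runs inside a job thread keeps fully transparent usage, while private BuildStream code can still override the messenger explicitly.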
traveltissuesjuergbi: I think that if we're removing support for something, it makes sense to mention it on the list11:05
juergbibenschubert: per job messenger makes sense to me as that's also how our log files are structured. and as we execute each job in a thread, using thread-local storage for this seems reasonable, assuming it's not a performance bottleneck12:11
juergbiand a global messenger for all non-job threads also makes sense12:12
juergbibenschubert: the only issue I can think of is if a job were to start using multiple threads. we probably don't generally want to support explicit thread creation. however, if there is a chance a job is using a thread pool somewhere, thread-local storage would be an issue12:13
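A small demonstration of the pitfall juergbi raises: a value stored in thread-local storage by a job thread is not visible in worker threads spawned from a pool inside that job. Purely illustrative, using only the stdlib:

```python
# Thread-local data set in the "job" thread is absent in pool workers.
import threading
from concurrent.futures import ThreadPoolExecutor

local = threading.local()
local.messenger = "job-messenger"  # set in the "job" (main) thread

def read_messenger():
    # Runs in a pool worker thread: the attribute was never set there
    return getattr(local, "messenger", None)

with ThreadPoolExecutor(max_workers=1) as pool:
    seen_in_worker = pool.submit(read_messenger).result()

seen_in_job = getattr(local, "messenger", None)
```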
benschubertjuergbi: sorry, I might not have explained it correctly. The thread-local messenger is what we have today, but it brings some problems :)12:29
benschubertSo I was thinking about the 3-level messenger, with more explicit setup on the internal code paths12:29
benschubertit might cause problems with threads spawned from plugin code, that's true12:30
benschubertthough the only way I can see that working then is always passing the messenger (or other identification) explicitly12:30
benschubertwhich doesn't make it ideal12:30
benschubertunless you would have another idea?12:30
juergbibenschubert: not really. well, one tweak could be to explicitly set the global messenger as the thread-local messenger of the main thread, instead of using it as fallback12:34
juergbiand then bail out if attempting to log with neither thread-local messenger nor explicit messenger available12:35
juergbithat might at least allow us to detect this issue12:35
benschubertOh yeah that could work12:35
juergbi(if necessary we could provide API to 'copy' the thread-local messenger into a helper thread but let's hope it's not needed in the public API)12:36
benschubertyeah or we could have helpers to start threads if it's ever needed12:36
benschubertok! I like this and will start looking into it, thanks!12:36
juergbiyes. but the direction we're going I assume it shouldn't12:36
juergbi(in the public API at least)12:37
juergbiyw12:37
benschubertyep12:37
Frazer61hi, I seem to be getting a failure here locally and on master, is this a known issue? https://gitlab.com/BuildStream/buildstream/-/jobs/62542894613:16
benschubertFrazer61: on master too? https://gitlab.com/BuildStream/buildstream/-/jobs/625428946#L7309 is pretty clear on what the error is there at least: you are telling logging to use '{' (new style formatting), but your format string is using '%', aka old style formatting13:18
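A minimal reproduction of the style mismatch benschubert describes, using only the stdlib logging module. On Python 3.8+ the `Formatter` validates the format string against the chosen style by default and rejects a '%'-style string handed to a '{'-style formatter:

```python
# A '%'-style format string passed to a '{'-style Formatter.
import logging

try:
    # style="{" expects str.format-style fields like "{levelname}"
    formatter = logging.Formatter("%(levelname)s: %(message)s", style="{")
    valid = True
except ValueError:
    # Python 3.8+ raises ValueError here because the string contains
    # no '{'-style fields at all
    valid = False
```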
Frazer61oh, seem to be getting the issue locally on master13:18
*** hasebastian has joined #buildstream13:29
Frazer61sorry, made a mistake, ignore that13:35
WSalmonso why did autotools not get moved to bst-plugins-experimental? and how is https://gitlab.com/BuildStream/buildstream/-/blob/master/src/buildstream/plugins/elements/autotools.yaml#L52 supposed to work? i still get     MAKEFLAGS: -j8 when it should be -j1?  https://gitlab.com/celduin/bsps/bst-boardsupport/-/jobs/625698066/artifacts/file/artifact/freedesktop-sdk/components-slang/1ad3d1df-build.3146.log https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/blob/master/elements/components/slang.bst#L916:07
WSalmonjuergbi, benschubert tristan ^16:08
juergbiWSalmon: move: I think it's not that simple as it's used by the test suite. however, we still want to move it, afaik16:11
juergbican you link to the place where you set notparallel?16:11
WSalmon https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/blob/master/elements/components/slang.bst#L916:12
WSalmon https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/blob/master/elements/components/slang.bst#L1016:12
WSalmonsorry16:12
juergbiWSalmon: I'm wondering whether this might be another fallout from the variable resolution change. however, we have a test case for notparallel, which seems to pass. tests/integration/manual.py::test_manual_element_noparallel16:22
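For context, the user-side setup under discussion boils down to an element opting out of parallel make via the `notparallel` variable. This is a hypothetical minimal element mirroring the linked slang.bst report, not a verbatim copy; the expectation is that autotools.yaml turns it into `MAKEFLAGS: -j1` in the build environment:

```yaml
# Hypothetical minimal element reproducing the report: setting the
# notparallel variable should result in MAKEFLAGS: -j1 at build time.
kind: autotools

variables:
  notparallel: True
```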
douglaswinshipre the plugins API: node_subst_vars and node_subst_sequence_vars both say they're deprecated. What should be used instead?16:30
douglaswinshipDoes strip_node_info automatically replace variables? Because it doesn't mention that in the docs, so far as I can see.16:31
benschubertdouglaswinship: all the variables are not automatically expanded yes16:47
*** hasebastian has quit IRC16:48
douglaswinshipso if they're not expanded automatically, how should we expand them? I believe the traditional way was with node_subst_vars and node_subst_sequence. But the docs say they're both deprecated.16:48
douglaswinshipWhat's the new way?16:48
douglaswinshipbenschubert: ^^16:50
benschubert*now not not sorry16:50
benschubertthey are now expanded :)16:50
douglaswinshipOh! that's much easier to deal with. Thanks :)16:51
*** santi has quit IRC17:25
tristanjuergbi, regarding bst-1 support in artifact servers for bst-2, I think that clearly if we are steering towards not offering an artifact server in bst-2 that doesn't stand anymore17:43
tristanjuergbi, I also seem to recall that we made this decision initially due to the low cost it incurred at that time17:44
tristanbenschubert, regarding messenger and things "not working", this is an interesting thing which has regressed in the past and is worth some attention17:45
douglaswinshipjjardon: WSalmon: I believe I've fixed the issue that was causing that overnight job to fail:17:45
douglaswinshiphttps://gitlab.com/BuildStream/bst-plugins-experimental/-/merge_requests/12517:45
douglaswinship(Could use some review)17:45
tristanbenschubert, I don't recall all the details, and was not around for the introduction of this `Messenger` object in the first place, but I do recall that messages being delivered to the frontend have 2 related elements, and both are meaningful17:46
tristanbenschubert, i.e. the `task` element and the element which issued a message are separate IDs on the Message object, and had reason to be. I believe that in most cases, we do *not care* about the element ID of which element issued a message within a task, we are mostly concerned (in the main log) with the *task* element17:47
tristanbenschubert, on the other hand, when logging a task, we might prefer the element ID, since the logging in that specific log file is already task specific17:47
douglaswinshipjjardon, WSalmon: I'm still not sure why it worked on the freedesktop runners, and not on the freedesktop-sdk runners.17:48
tristanbenschubert, what this all amounts to IIRC, is that you get proper grouping and understanding of what is going on *for a task* in the main log, and more minute detail in a task specific log (it really sucks to see many different element names in say "staging this and that dependency")17:48
jjardondouglaswinship: cool, can you change the CI job to use that to check if it passes?17:50
douglaswinshipAnd the fix can be seen working here: https://gitlab.com/BuildStream/buildstream/-/jobs/62590152817:50
tristanbenschubert, I'm not sure what the implications of job-specific messengers are, or why we need them, but we do want to ensure that when plugins issue messages from a task, they inherit the task_id from that task before being sent to frontend message handlers17:50
douglaswinshipjjardon: It doesn't pass, but it hits a different issue17:51
tristanbenschubert, maybe your thread-local Messenger mostly exists for this sole purpose17:51
douglaswinshipjjardon: And we have a fix for that different issue too. It's not in the 'bst2' branch of freedesktop-sdk, but it'll be in the new bst2 branch, once we merge https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/merge_requests/285317:52
jjardondouglaswinship: nice, so with those 2 the CI passes?17:53
douglaswinshipjjardon: for that, we need to see the results of https://gitlab.com/BuildStream/buildstream/-/jobs/62593536217:54
douglaswinshipWhich isn't finished yet.17:54
douglaswinshipBut i'm optimistic17:54
* jjardon cross fingers :)17:54
douglaswinshipAnd toes!17:55
benschuberttristan: oh right, so if we were to group by task we'd be fine then? We always specify the task right?18:03
benschubertYeah the reason of thread-local etc is for logging to the correct log file etc18:03
benschuberttristan: thanks for the review! I've assigned marge :D18:07
tristanGreat :)18:11
tristanbenschubert, yeah the thing is, the frontend IIRC will *fallback* to logging with the element_id if no task_id is specified in a log message18:11
tristanbenschubert, so it's easy to mess up in the core, if the task_id goes unstamped, then you get a lot of element specific messages which appear incoherent when reading the master log18:12
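An illustrative sketch of the stamping behavior tristan describes: the job stamps its task id onto messages before they reach the frontend, and the frontend falls back to the element id only when no task id was set. All names (`Message`, `Job`, `frontend_route`) are hypothetical stand-ins for the BuildStream internals:

```python
# Messages carry both the issuing element's id and the owning task's id;
# the job is responsible for stamping the task id before delivery.
class Message:
    def __init__(self, text, element_id, task_id=None):
        self.text = text
        self.element_id = element_id   # which element issued the message
        self.task_id = task_id         # which task the message belongs to

class Job:
    def __init__(self, task_id):
        self.task_id = task_id

    def send(self, message, handler):
        # Stamp the task id before forwarding to the frontend handler
        if message.task_id is None:
            message.task_id = self.task_id
        handler(message)

def frontend_route(message):
    # The master log groups by task; element id is only a fallback,
    # which is why an unstamped message looks incoherent there
    return message.task_id if message.task_id is not None else message.element_id

routed = []
job = Job(task_id="task-42")
job.send(Message("staging dependency", element_id="dep.bst"),
         lambda m: routed.append(frontend_route(m)))
```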
Frazer61tristan, just wondering if you have had a chance to look at the `bst source show` proposal, or will that be a job tomorrow?18:12
benschuberttristan: and in plugins, do they specify the task_id whenever they log something?18:12
tristanbenschubert, Basically in the master log you want the main [cache key, task name, task element] stuff to be consistent for a given task (i.e. anything that did a START and will finish with either SUCCESS/FAILED/SKIPPED)18:13
tristanbenschubert, in the current model the job takes care of stamping the message before sending it to the frontend18:13
benschuberttristan: that's the problem, before the job was in a separate process, so it was a global thing18:14
benschubertnow, we can't really do that anymore if we go for threads18:14
benschubertUnless we pass the messenger object/something else explicitly around in the private API18:14
tristanbenschubert, Right so in that context you want something to automate it, a thread-local (per-job) Messenger object to do the stamping and job-specific logfile logging seems to make sense to me for this18:15
benschubertwhich works only if we run the entire job in a single thread, which is not that efficient :/18:15
tristanbenschubert, Either that or have *some* thread local data which the Messenger object uses to do things conditionally18:16
tristanEh ?18:16
tristanReally ?18:16
benschubertwell, we could do better, by having the ioloop handle more stuff, and have what's run in thread more minimal18:16
tristanYou would run multiple threads for the span of a single job ?18:16
tristanFrazer61, Sorry :-/18:17
benschubertBasically, what I think would be the "most efficient" would be to run most of the things in the ioloop with async mechanism, and launch the plugin's public code inside background threads18:17
tristanFrazer61, I got side tracked today and took a long afternoon off, have a friend who is going in for surgery tomorrow and will be in hospital for a week18:17
tristanbenschubert, That sounds a bit convoluted to me but maybe I'm missing how that all works when written in code :)18:18
tristanbenschubert, just a first impression18:18
Frazer61no worries tristan, hope it goes well. I'll prioritise other BuildStream related issues in the mean time18:18
benschuberttristan: ok, I'll try something, and put it up so we can discuss it more easily :)18:18
tristanbenschubert, I would have imagined that (A) You run a thread and that thread does blocking operations (B) If the blocking operations are IPC calls to CASD or other IO based things, those are all just transparently handled in the thread, automatically yielding the GIL whenever going to sleep and blocking18:19
tristanGetting the file descriptors and handling them in the main loop of the main thread seems like quite a chore, and... *maybe* more optimal ?18:20
tristanDoes it pay off to have a single poll() in the main thread and not have other threads sleeping in poll() simultaneously ? I don't know18:21
tristanAnyway sure, let's discuss more around code later :)18:22
benschuberttristan: right, I'm not explaining everything there, sorry. Basically, my medium-term plan for this scheduler is that we could start using `uvloop` since we would not need child watchers anymore. That would probably give us a nice boost in perf. However, for that to be the most efficient, we'd need to have most of the code running in the loop instead of other threads :)18:23
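A sketch of that direction under stated assumptions: the scheduler's event loop could be swapped for uvloop (a third-party drop-in), while blocking plugin code still runs in executor threads so the loop stays free for other jobs' I/O. uvloop is treated as optional here, with a stdlib fallback:

```python
# Run blocking "plugin" work in an executor thread under an asyncio
# loop, optionally backed by uvloop when it is installed.
import asyncio

try:
    import uvloop  # third-party drop-in event loop
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    loop_kind = "uvloop"
except ImportError:
    loop_kind = "stdlib"

def blocking_plugin_code():
    # Stands in for public plugin code that may block
    return "built"

async def run_job():
    loop = asyncio.get_running_loop()
    # The blocking part runs in a worker thread; the loop keeps
    # servicing other jobs' I/O in the meantime
    return await loop.run_in_executor(None, blocking_plugin_code)

result = asyncio.run(run_job())
```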
tristanbenschubert, in any case, at every turn that you launch a thread, there is a Job() that is related to it, so even if you end up launching many threads for a single Job(), you should have one single focus point for launching a Job() specific thread18:23
benschuberttristan: right, we could hide everything from user's code like that :)18:24
tristanMaybe the thread local data is just a single job reference variable, and the Messenger checks if there is a job in context and does the job specific thing then18:25
benschubertyeah, I'll dig into it to see what would look best :) thanks!18:28
benschubertCan someone enlighten me: why is https://gitlab.com/BuildStream/buildstream/-/blob/master/src/buildstream/_scheduler/queues/queue.py#L270 needed?19:32
benschubertoh wait, is it to keep the last build tree so we can rebuild on top of it next time?19:43
benschubertThen shouldn't we only be doing this in the build queue?19:43
benschubertjuergbi: ^?19:43
*** toscalix has joined #buildstream19:47
tristanbenschubert, it's there because it's basically "state which a job can modify with side effects that need to be replicated in the main process"21:20
tristanRegardless of who sets it, or why21:20
benschuberttristan: the only thing that changes is workspace data right? And that can only change in some but not all types of jobs no? Shouldn't we be more explicit on when that can change?21:21
tristanNot sure if we should21:22
benschubertSurely any push operation would not affect that at all, nor would a fetch21:22
tristanI think I would be more happy to write jobs which explicitly modify state, and a single point at which all state which "can be modified" is pulled into the main data model in a synchronized way21:23
tristanThan to have to pick and choose21:23
benschubertI'm not that sure, a lot of the job/queues interactions are quite complex and the intent might be clearer if we were doing things explicitly21:24
tristanI think we are already explicit when we modify state21:24
tristanAnd being doubly explicit is only extra special casing21:25
benschubertWell, this part of the code is only there to update workspaces, and there is a single part of the code that I can think of that can do that, though this is called by 5 different operations :)21:26
benschubertAnyways, I'll leave it as is for now, was just wondering if we could not chop off a good chunk of complexity there21:27
tristanI guess what makes workspaces different, is that this state synchronization is not bound to the semantics of the job invocation21:30
tristanlike the result of an operation being complete21:30
tristanit would be nice if it would be21:30
tristanbut in this case it's in a category of its own21:30
tristanbecause a job can ask the project for the workspaces data, call an API on it, and expect that to be persistent21:31
tristanOriginally, it was written to just synchronously rewrite the workspaces file from the subprocess21:31
tristanwith no regard whatsoever for updating it in the main process21:31
benschubertBut it kind of is no? It's only when we get a new build tree that the state changes, and that's for builds only?21:31
tristanwhich passed a test21:31
tristanbut was obviously buggy21:32
tristanWell no, it's not21:32
tristanbenschubert, it might be something that happens to be a predictable side effect of a single case21:32
tristanbut it is not bound to the semantics of launching a job21:32
tristanit's not written that way21:32
benschubertI know, but should we rewrite it like that?21:33
tristanit could be nice if it were written that way, and if a Job did not have access to the global Workspaces object to just call functions on it and set things21:33
tristanThat could be nice yes :)21:33
benschubertI'd actually argue this should be moved to the build queue :)21:34
tristanI don't think so by itself no21:34
benschubertWhy?21:34
tristanBecause of the way the workspace API is set21:34
tristanSomething is allowed to call workspace_set_foo()21:34
tristanin a global context on the data model21:34
tristanInstead, it should be written in such a way that the job explicitly returns data to be applied in the main process21:35
tristanAnd the child job should not be just calling workspace_set_foo() and expecting things to "just sort themselves out"21:35
tristanThat should be streamlined into the semantics of launching the job and collecting the jobs return values21:35
tristanotherwise it's currently an expectation that any job can workspace_set_foo() and things will sort themselves out21:36
benschubertMmh not sure I 100% agree with the how, but it would be definitely better than now. Ok I'll put it on a background thread :)21:37
tristanElement is modifying the workspace21:37
tristanAs a side effect of assemble I think21:37
tristaninternal Element._assemble() could be extended to return the side effects it wants to affect the workspaces with21:38
tristanThat could be passed back through the BuildJob21:38
tristanCurrently it's just "Element sets workspaces at times of its choosing"21:38
benschubertAh yes! There we agree :)21:38
tristanRight, that's what I mean, it's currently this hazy thing where jobs/elements just set stuff21:39
tristanthat's why its in a category where "things that get set, become synchronized"21:39
tristanWhich yeah, I agree is sloppy21:39
tristanbut lets fix it all the way if we do :)21:40
benschubertRight, so with a thread-based scheduler, we can return more meaningful data in the return of the _assemble and use it to update the workspace in the build queue (and only that one)21:41
benschubertWould that make sense?21:42
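A hedged sketch of the direction being agreed here: instead of the element mutating global workspace state mid-job, assemble-like code returns the workspace updates it wants, and the build queue applies them at a single point in the main thread when collecting the job result. All names are illustrative, not the real BuildStream internals:

```python
# Job-side code computes workspace updates; the main-thread queue is
# the single synchronization point that applies them.
class BuildResult:
    def __init__(self, artifact, workspace_updates):
        self.artifact = artifact
        # e.g. {"last_build": <cache key>} for workspaced builds
        self.workspace_updates = workspace_updates

def assemble(element_name, workspace):
    # Runs in the job thread: computes updates but does NOT apply them
    updates = {"last_build": "cachekey-123"} if workspace else {}
    return BuildResult(artifact=element_name + "/artifact",
                       workspace_updates=updates)

class BuildQueue:
    def __init__(self, workspaces):
        self.workspaces = workspaces  # main-process data model

    def collect(self, element_name, result):
        # Runs in the main thread when the job result is collected
        if result.workspace_updates:
            self.workspaces[element_name] = dict(result.workspace_updates)
        return result.artifact

queue = BuildQueue(workspaces={})
res = assemble("hello.bst", workspace=True)
artifact = queue.collect("hello.bst", res)
```

This confines workspace mutation to one place, which also sidesteps the concurrency concern tristan raises about side effects leaking between concurrent jobs in a thread-based scheduler.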
tristanIn a multiprocess model we could too21:42
tristanOnly in a thread-based model, things become more dangerous21:42
tristanI.e. one could have overlooked the relevance of synchronizing state at the end of a job when originally writing up this workspace side effect21:43
tristanand it might even work well most of the time21:43
tristanI would expect in hindsight, that this would have happened, without much consideration given to "when the right time to set workspace data" is21:44
tristanNot exactly sure if it's important for it to be synchronized at the end of a job, just saying; it's dangerous that things could "just work" without forcing that consideration to occur21:44
*** toscalix has quit IRC21:46
benschubertRight, ok I've got enough to think about for now, thanks!21:46
tristanI think that as far as workspaces go, it's just important that the side effects of a single workspace modification do not affect adjacent workspace modifications in the same session, which might be happening concurrently21:46
*** benschubert has quit IRC23:56

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!