*** tristan has quit IRC | 06:27 | |
*** benschubert has joined #buildstream | 07:17 | |
*** phildawson has joined #buildstream | 08:10 | |
*** santi has joined #buildstream | 08:37 | |
*** tristan has joined #buildstream | 09:47 | |
*** ChanServ sets mode: +o tristan | 09:47 | |
benschubert | tristan: hey, if you have time, I'd appreciate a second review round on https://gitlab.com/BuildStream/buildstream/-/merge_requests/1985 since I've fixed a few bugs and haven't done all the changes you were mentioning | 10:00 |
juergbi | tristan: iirc, we decided about a year ago that we would keep bst 1.x client support (ReferenceStorageService) in bst-artifact-server master in addition to the then newish BuildStream artifact/source protocols | 10:16 |
juergbi | things have changed a bit since then since we're moving to the Remote Asset API and we're planning to completely drop bst-artifact-server if/when a suitable replacement exists | 10:17 |
juergbi | would it thus be ok to drop bst 1.x client support in bst-artifact-server now or what do you think? | 10:17 |
juergbi | see https://gitlab.com/BuildStream/buildstream/-/merge_requests/1978#note_372884164 | 10:18 |
juergbi | do we need to discuss this on the list? | 10:18 |
tristan | benschubert, I won't be able to until later tonight :-/ | 10:23 |
benschubert | tristan: no worries, it's not that pressing :) | 10:23 |
tristan | benschubert, regarding the stylistic issues, I have no problems with this, e.g. stop/quit etc | 10:23 |
* tristan has an iterator approach to Value.resolve(), and now an approach which encodes all value parts to utf-8 strings, concatenates them and decodes back to `unicode` type | 10:25 | |
tristan | which "works" but probably needs more touchups to see if it's a winner | 10:25 |
tristan | meh, currently not great | 10:26 |
* tristan just got the test to work before needing to run out the door | 10:26 | |
tristan | if only I could get Py_UNICODE memcpy to do anything meaningful :-S | 10:26 |
* tristan cloned cython and peeked at the insides of that, looks like most of it is mapping to raw C | 10:27 | |
tristan | with #defines and such for things sitting in Include/cpython/* | 10:27 |
tristan | probably better guidelines for using the raw C API of CPython would be more useful than sugar-coated .pxd header files | 10:28 |
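A minimal Python sketch of the encode/concatenate/decode approach tristan describes above; the function name and inputs are illustrative, not actual BuildStream API:

```python
def resolve_parts(parts):
    # Encode each already-resolved part to UTF-8 bytes, join them in a
    # single pass, then decode back to one str ("unicode") object.
    return b"".join(part.encode("utf-8") for part in parts).decode("utf-8")

assert resolve_parts(["hello", ", ", "world"]) == "hello, world"
```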
benschubert | tristan: btw, ostree now passes all tests: https://gitlab.com/BuildStream/bst-plugins-experimental/-/merge_requests/123 :) | 10:31 |
tristan | yay \o/ | 10:31 |
benschubert | juergbi: https://gitlab.com/BuildStream/bst-plugins-experimental/-/merge_requests/124 that can also go now right? | 10:34 |
juergbi | yes | 10:35 |
benschubert | thanks! | 10:37 |
benschubert | I've got a design choice to make on the Messenger with the threading scheduler: | 10:52 |
benschubert | The way it currently works is thanks to a child/parent separation (everything in the master thread/process is what is outside of jobs; everything in a job is run in another thread). | 10:52 |
benschubert | This is good as it is easy to handle with thread-locals, but it means we couldn't increase usage of the ioloop, and would need to have the threads do _all_ the work instead of a smaller part, which might work better with the loop and deliver potentially better performance. | 10:52 |
benschubert | My first idea was to use a Messenger per job, and set it on the `plugin` at the start, which would have meant transparent usage to the element. | 10:53 |
benschubert | However, this does not work, since the messenger can be used by dependencies, for example when staging them. Which means we could need multiple messengers on a specific instance at the same time. | 10:53 |
benschubert | So I think the simplest I could do is: | 10:55 |
benschubert | - A global messenger, the default if nothing else is set | 10:55 |
benschubert | - An explicit messenger passed around in _buildstream private_ code, when we need to override the global messenger | 10:55 |
benschubert | - A thread-local messenger, we'd use in the jobs at the exact point where we live inside a thread (that way, if we only have plugin code running in the thread, we still have transparent usage for BuildStream plugins in the public API) | 10:55 |
benschubert | Does that seem too convoluted to people? tristan, juergbi if you have time? :) | 10:55 |
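A rough Python sketch of the three-level lookup being proposed; all helper names here are hypothetical, none of this is existing BuildStream code:

```python
import threading

_global_messenger = None           # level 1: default when nothing else is set
_thread_local = threading.local()  # level 3: set per job thread

def set_global_messenger(messenger):
    global _global_messenger
    _global_messenger = messenger

def set_thread_messenger(messenger):
    _thread_local.messenger = messenger

def get_messenger(explicit=None):
    # Resolution order: an explicit messenger passed around in private
    # code (level 2), then the thread-local one, then the global default.
    if explicit is not None:
        return explicit
    thread_messenger = getattr(_thread_local, "messenger", None)
    return thread_messenger if thread_messenger is not None else _global_messenger
```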
traveltissues | juergbi: i think that if removing support for something it makes sense to mention it on the list | 11:05 |
juergbi | benschubert: per job messenger makes sense to me as that's also how our log files are structured. and as we execute each job in a thread, using thread-local storage for this seems reasonable, assuming it's not a performance bottleneck | 12:11 |
juergbi | and a global messenger for all non-job threads also makes sense | 12:12 |
juergbi | benschubert: the only issue I can think of is if a job were to start using multiple threads. we probably don't generally want to support explicit thread creation. however, if there is a chance a job is using a thread pool somewhere, thread-local storage would be an issue | 12:13 |
benschubert | juergbi: sorry, I might not have explained correctly. The thread-local messenger is what we have today, but it brings some problems :) | 12:29 |
benschubert | So I was thinking about the 3-level messenger, with more explicit setup on the internal code paths | 12:29 |
benschubert | it might cause problems with threads spawned from plugin code, that's true | 12:30 |
benschubert | though the only way I can see that working then is always passing the messenger (or other identification) explicitly | 12:30 |
benschubert | which doesn't make it ideal | 12:30 |
benschubert | unless you would have another idea? | 12:30 |
juergbi | benschubert: not really. well, one tweak could be to explicitly set the global messenger as the thread-local messenger of the main thread, instead of using it as fallback | 12:34 |
juergbi | and then bail out if attempting to log with neither thread-local messenger nor explicit messenger available | 12:35 |
juergbi | that might at least allow us to detect this issue | 12:35 |
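That tweak could look like the following variant of the hypothetical get_messenger() above: the main thread registers its messenger as thread-local state just like a job thread, and lookup fails loudly instead of silently falling back to a global:

```python
def get_messenger_strict(explicit=None):
    # No global fallback: the main thread must have registered its own
    # thread-local messenger, so a miss here means a thread was spawned
    # without messenger setup, and we detect it instead of mislogging.
    if explicit is not None:
        return explicit
    messenger = getattr(_thread_local, "messenger", None)
    if messenger is None:
        raise RuntimeError("no messenger set for this thread")
    return messenger
```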
benschubert | Oh yeah that could work | 12:35 |
juergbi | (if necessary we could provide API to 'copy' the thread-local messenger into a helper thread but let's hope it's not needed in the public API) | 12:36 |
benschubert | yeah or we could have helpers to start threads if it's ever needed | 12:36 |
benschubert | ok! I like this and will start looking into it, thanks! | 12:36 |
juergbi | yes. but with the direction we're going, I assume it shouldn't | 12:36 |
juergbi | (in the public API at least) | 12:37 |
juergbi | yw | 12:37 |
benschubert | yep | 12:37 |
Frazer61 | hi, seem to be getting a fail here locally and on master, is this a known issue? https://gitlab.com/BuildStream/buildstream/-/jobs/625428946 | 13:16 |
benschubert | Frazer61: on master too? https://gitlab.com/BuildStream/buildstream/-/jobs/625428946#L7309 is pretty clear on what the error is there at least: you are telling logging to use '{' (new-style formatting), but your format string is using '%', aka old-style formatting | 13:18 |
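The mismatch is reproducible with the stock logging module; since Python 3.8, Formatter validates the format string against the chosen style:

```python
import logging

# style="{" means str.format()-style fields are expected, so a %-style
# format string fails validation with a ValueError.
try:
    logging.Formatter("%(levelname)s: %(message)s", style="{")
except ValueError as err:
    print(err)

# Matching new-style fields pass validation:
logging.Formatter("{levelname}: {message}", style="{")
```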
Frazer61 | oh, seem to be getting the issue locally on master | 13:18 |
*** hasebastian has joined #buildstream | 13:29 | |
Frazer61 | sorry, made a mistake, ignore that | 13:35 |
WSalmon | so why did autotools not get moved to bst-plugins-experimental? and how is https://gitlab.com/BuildStream/buildstream/-/blob/master/src/buildstream/plugins/elements/autotools.yaml#L52 supposed to work? i still get MAKEFLAGS: -j8 when it should be j1? https://gitlab.com/celduin/bsps/bst-boardsupport/-/jobs/625698066/artifacts/file/artifact/freedesktop-sdk/components-slang/1ad3d1df-build.3146.log https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/blob/master/elements/components/slang.bst#L9 | 16:07 |
WSalmon | juergbi, benschubert tristan ^ | 16:08 |
juergbi | WSalmon: move: I think it's not that simple as it's used by the test suite. however, we still want to move it, afaik | 16:11 |
juergbi | can you link to the place where you set notparallel? | 16:11 |
WSalmon | https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/blob/master/elements/components/slang.bst#L9 | 16:12 |
WSalmon | https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/blob/master/elements/components/slang.bst#L10 | 16:12 |
WSalmon | sorry | 16:12 |
juergbi | WSalmon: I'm wondering whether this might be another fallout from the variable resolution change. however, we have a test case for notparallel, which seems to pass. tests/integration/manual.py::test_manual_element_noparallel | 16:22 |
douglaswinship | re the plugins API: node_subst_vars and node_subst_sequence_vars both say they're deprecated. What should be used instead? | 16:30 |
douglaswinship | Does strip_node_info automatically replace variables? Because it doesn't mention that in the docs, so far as I can see. | 16:31 |
benschubert | douglaswinship: all the variables are not automatically expanded yes | 16:47 |
*** hasebastian has quit IRC | 16:48 | |
douglaswinship | so if they're not expanded automatically, how should we expand them? I believe the traditional way was with node_subst_vars and node_subst_sequence_vars. But the docs say they're both deprecated. | 16:48 |
douglaswinship | What's the new way? | 16:48 |
douglaswinship | benschubert: ^^ | 16:50 |
benschubert | *now, not "not", sorry | 16:50 |
benschubert | they are now expanded :) | 16:50 |
douglaswinship | Oh! that's much easier to deal with. Thanks :) | 16:51 |
*** santi has quit IRC | 17:25 | |
tristan | juergbi, regarding bst-1 support in artifact servers for bst-2: clearly, if we are steering towards not offering an artifact server in bst-2, that decision doesn't stand anymore | 17:43 |
tristan | juergbi, I also seem to recall that we made this decision initially due to the low cost it incurred at that time | 17:44 |
tristan | benschubert, regarding messenger and things "not working", this is an interesting thing which has regressed in the past and is worth some attention | 17:45 |
douglaswinship | jjardon: WSalmon: I believe I've fixed the issue that was causing that overnight job to fail: | 17:45 |
douglaswinship | https://gitlab.com/BuildStream/bst-plugins-experimental/-/merge_requests/125 | 17:45 |
douglaswinship | (Could use some review) | 17:45 |
tristan | benschubert, I don't recall all the details, and was not around for the introduction of this `Messenger` object in the first place, but I do recall that messages being delivered to the frontend have 2 related elements, and both are meaningful | 17:46 |
tristan | benschubert, i.e. the `task` element and the element which issued a message are separate IDs on the Message object, and had reason to be. I believe that in most cases, we do *not care* about the element ID of which element issued a message within a task; we are mostly concerned (in the main log) with the *task* element | 17:47 |
tristan | benschubert, on the other hand, when logging a task, we might prefer the element ID, since the logging in that specific log file is already task specific | 17:47 |
douglaswinship | jjardon, WSalmon: I'm still not sure why it worked on the freedesktop runners, and not on the freedesktop-sdk runners. | 17:48 |
tristan | benschubert, what this all amounts to IIRC, is that you get proper grouping and understanding of what is going on *for a task* in the main log, and more minute detail in a task-specific log (it really sucks to see many different element names in, say, "staging this and that dependency") | 17:48 |
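A simplified sketch of the two IDs tristan describes on the Message object; the field and function names are illustrative, not the exact attributes of BuildStream's internal Message class:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Message:
    text: str
    element_id: Optional[int] = None  # element that issued the message
    task_id: Optional[int] = None     # element the overall task is for

def main_log_group(message: Message) -> Optional[int]:
    # The main log groups messages by task, caring about the issuing
    # element only when no task is associated with the message.
    return message.task_id if message.task_id is not None else message.element_id
```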
jjardon | douglaswinship: cool, can you change the CI job to use that to check if it passes? | 17:50 |
douglaswinship | "And the fix can be seen working here: https://gitlab.com/BuildStream/buildstream/-/jobs/625901528" | 17:50 |
tristan | benschubert, I'm not sure what the implications of job-specific messengers are, or why we need them, but we do want to ensure that when plugins issue messages from a task, they inherit the task_id from that task before being sent to frontend message handlers | 17:50 |
douglaswinship | jjardon: It doesn't pass, but it hits a different issue | 17:51 |
tristan | benschubert, maybe your thread-local Messenger mostly exists for this sole purpose | 17:51 |
douglaswinship | jjardon: And we have a fix for that different issue too. It's not in the 'bst2' branch of freedesktop-sdk, but it'll be in the new bst2 branch, once we merge https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/merge_requests/2853 | 17:52 |
jjardon | douglaswinship: nice, so with those 2 the CI passes? | 17:53 |
douglaswinship | jjardon: for that, we need to see the results of https://gitlab.com/BuildStream/buildstream/-/jobs/625935362 | 17:54 |
douglaswinship | Which isn't finished yet. | 17:54 |
douglaswinship | But i'm optimistic | 17:54 |
* jjardon cross fingers :) | 17:54 | |
douglaswinship | And toes! | 17:55 |
benschubert | tristan: oh right, so if we were to group by task we'd be fine then? We always specify the task right? | 18:03 |
benschubert | Yeah, the reason for the thread-local etc. is logging to the correct log file | 18:03 |
benschubert | tristan: thanks for the review! I've assigned marge :D | 18:07 |
tristan | Great :) | 18:11 |
tristan | benschubert, yeah the thing is, the frontend IIRC will *fallback* to logging with the element_id if no task_id is specified in a log message | 18:11 |
tristan | benschubert, so it's easy to mess up in the core, if the task_id goes unstamped, then you get a lot of element specific messages which appear incoherent when reading the master log | 18:12 |
Frazer61 | tristan, just wondering if you have had a chance to look at the `bst source show` proposal, or will that be a job for tomorrow? | 18:12 |
benschubert | tristan: and in plugins, do they specify the task_id whenever they log something? | 18:12 |
tristan | benschubert, Basically in the master log you want the main [cache key, task name, task element] stuff to be consistent for a given task (i.e. anything that did a START and will finish with either SUCCESS/FAILED/SKIPPED) | 18:13 |
tristan | benschubert, in the current model the job takes care of stamping the message before sending it to the frontend | 18:13 |
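That stamping step could be as small as this sketch, reusing the illustrative Message class from above (Job and send_to_frontend are stand-ins, not BuildStream internals):

```python
class Job:
    def __init__(self, task_id, send_to_frontend):
        self.task_id = task_id                  # the element this job is for
        self.send_to_frontend = send_to_frontend

    def message(self, message: Message) -> None:
        # Stamp the job's task_id onto any message that lacks one before
        # it reaches the frontend, so the main log groups it correctly.
        if message.task_id is None:
            message.task_id = self.task_id
        self.send_to_frontend(message)
```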
benschubert | tristan: that's the problem, before the job was in a separate process, so it was a global thing | 18:14 |
benschubert | now, we can't really do that anymore if we go for threads | 18:14 |
benschubert | Unless we pass the messenger object/something else explicitly around in the private API | 18:14 |
tristan | benschubert, Right so in that context you want something to automate it, a thread-local (per-job) Messenger object to do the stamping and job-specific logfile logging seems to make sense to me for this | 18:15 |
benschubert | which works only if we run the entire job in a single thread, which is not that efficient :/ | 18:15 |
tristan | benschubert, Either that or have *some* thread local data which the Messenger object uses to do things conditionally | 18:16 |
tristan | Eh ? | 18:16 |
tristan | Really ? | 18:16 |
benschubert | well, we could do better, by having the ioloop handle more stuff, and keeping what runs in a thread more minimal | 18:16 |
tristan | You would run multiple threads for the span of a single job ? | 18:16 |
tristan | Frazer61, Sorry :-/ | 18:17 |
benschubert | Basically, what I think would be the "most efficient" would be to run most of the things in the ioloop with async mechanism, and launch the plugin's public code inside background threads | 18:17 |
tristan | Frazer61, I got side tracked today and took a long afternoon off, have a friend who is going in for surgery tomorrow and will be in hospital for a week | 18:17 |
tristan | benschubert, That sounds a bit convoluted to me but maybe I'm missing how that all works when written in code :) | 18:18 |
tristan | benschubert, just a first impression | 18:18 |
Frazer61 | no worries tristan, hope it goes well. I'll prioritise other BuildStream related issues in the mean time | 18:18 |
benschubert | tristan: ok, I'll try something, and put it up so we can discuss it more easily :) | 18:18 |
tristan | benschubert, I would have imagined that (A) You run a thread and that thread does blocking operations (B) If the blocking operations are IPC calls to CASD or other IO based things, those are all just transparently handled in the thread, automatically yielding the GIL whenever going to sleep and blocking | 18:19 |
tristan | Getting the file descriptors and handling them in the main loop of the main thread seems like quite a chore, and... *maybe* more optimal ? | 18:20 |
tristan | Does it pay off to have a single poll() in the main thread and not have other threads sleeping in poll() simultaneously ? I don't know | 18:21 |
tristan | Anyway sure, let's discuss more around code later :) | 18:22 |
benschubert | tristan: right, I'm not explaining everything there, sorry. Basically, my medium-term plan for this scheduler is that we could start using `uvloop` since we would not need child watchers anymore. That would probably give us a nice boost in performance. However, for that to be the most efficient, we'd need to have most of the code running in the loop instead of other threads :) | 18:23 |
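For context, the uvloop switch itself is a one-liner; the rest of this hypothetical sketch shows the shape being discussed, where only the blocking plugin-facing code is pushed into a worker thread while everything else stays on the loop:

```python
import asyncio
import uvloop  # third-party drop-in replacement event loop for asyncio

def run_plugin_code(n):
    # Stand-in for the blocking, plugin-facing part of a job.
    return n * n

async def run_job(n):
    loop = asyncio.get_running_loop()
    # Async-friendly work (IPC, IO) would stay on the loop; only the
    # blocking plugin call is handed to a background thread.
    return await loop.run_in_executor(None, run_plugin_code, n)

async def main():
    print(await asyncio.gather(*(run_job(n) for n in range(4))))

uvloop.install()  # replaces the default asyncio event loop policy
asyncio.run(main())
```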
tristan | benschubert, in any case, at every turn that you launch a thread, there is a Job() that is related to it, so even if you end up launching many threads for a single Job(), you should have one single focus point for launching a Job() specific thread | 18:23 |
benschubert | tristan: right, we could hide everything from user's code like that :) | 18:24 |
tristan | Maybe the thread local data is just a single job reference variable, and the Messenger checks if there is a job in context and does the job specific thing then | 18:25 |
benschubert | yeah, I'll dig into it to see what would look best :) thanks! | 18:28 |
benschubert | Can someone enlighten me: why is https://gitlab.com/BuildStream/buildstream/-/blob/master/src/buildstream/_scheduler/queues/queue.py#L270 needed? | 19:32 |
benschubert | oh wait, is it to keep the last build tree so we can rebuild on top of it next time? | 19:43 |
benschubert | Then shouldn't we only be doing this in the build queue? | 19:43 |
benschubert | juergbi: ^? | 19:43 |
*** toscalix has joined #buildstream | 19:47 | |
tristan | benschubert, it's there because it's basically "state on which a job can have side effects, which need to be replicated in the main process" | 21:20 |
tristan | Regardless of who sets it, or why | 21:20 |
benschubert | tristan: the only thing that changes is workspace data right? And that can only change in some but not all types of jobs no? Shouldn't we be more explicit on when that can change? | 21:21 |
tristan | Not sure if we should | 21:22 |
benschubert | Surely any push operation would not affect that at all, nor would a fetch | 21:22 |
tristan | I think I would be more happy to write jobs which explicitly modify state, and a single point at which all state which "can be modified" is pulled into the main data model in a synchronized way | 21:23 |
tristan | Than to have to pick and choose | 21:23 |
benschubert | I'm not that sure, a lot of the job/queues interactions are quite complex and the intent might be clearer if we were doing things explicitly | 21:24 |
tristan | I think we are already explicit when we modify state | 21:24 |
tristan | And being doubly explicit is only extra special casing | 21:25 |
benschubert | Well, this part of the code is only there to update workspaces, and there is a single part of the code that I can think of that can do that, though this is called by 5 different operations :) | 21:26 |
benschubert | Anyways, I'll leave it as is for now, was just wondering if we could not chop off a good chunk of complexity there | 21:27 |
tristan | I guess what makes workspaces different, is that this state synchronization is not bound to the semantics of the job invocation | 21:30 |
tristan | like the result of an operation being complete | 21:30 |
tristan | it would be nice if it would be | 21:30 |
tristan | but in this case it's in a category of its own | 21:30 |
tristan | because a job can ask the project for the workspaces data, call an API on it, and expect that to be persistent | 21:31 |
tristan | Originally, it was written to just synchronously rewrite the workspaces file from the subprocess | 21:31 |
tristan | with no regard whatsoever for updating it in the main process | 21:31 |
benschubert | But it kind of is no? It's only when we get a new build tree that the state changes, and that's for builds only? | 21:31 |
tristan | which passed a test | 21:31 |
tristan | but was obviously buggy | 21:32 |
tristan | Well no, it's not | 21:32 |
tristan | benschubert, it might be something that happens to be a predictable side effect of a single case | 21:32 |
tristan | but it is not bound to the semantics of launching a job | 21:32 |
tristan | it's not written that way | 21:32 |
benschubert | I know, but should we rewrite it like that? | 21:33 |
tristan | it could be nice if it were written that way, such that a Job did not have access to the global Workspaces object and could not just call functions on it to set things | 21:33 |
tristan | That could be nice yes :) | 21:33 |
benschubert | I'd actually argue this should be moved to the build queue :) | 21:34 |
tristan | I don't think so by itself no | 21:34 |
benschubert | Why? | 21:34 |
tristan | Because of the way the workspace API is set | 21:34 |
tristan | Something is allowed to call workspace_set_foo() | 21:34 |
tristan | in a global context on the data model | 21:34 |
tristan | Instead, it should be written in such a way that the job explicitly returns data to be applied in the main process | 21:35 |
tristan | And the child job should not be just calling workspace_set_foo() and expecting things to "just sort themselves out" | 21:35 |
tristan | That should be streamlined into the semantics of launching the job and collecting the jobs return values | 21:35 |
tristan | otherwise it's currently an expectation that any job can workspace_set_foo() and things will sort themselves out | 21:36 |
benschubert | Mmh not sure I 100% agree with the how, but it would be definitely better than now. Ok I'll put it on a background thread :) | 21:37 |
tristan | Element is modifying the workspace | 21:37 |
tristan | As a side effect of assemble I think | 21:37 |
tristan | internal Element._assemble() could be extended to return the side effects it wants to affect the workspaces with | 21:38 |
tristan | That could be passed back through the BuildJob | 21:38 |
tristan | Currently it's just "Element sets workspaces at times of its choosing" | 21:38 |
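A hedged sketch of the direction being converged on here: _assemble() returns the workspace updates it wants instead of mutating global state, and only the build queue applies them in the main process. All names below beyond _assemble are hypothetical, not current BuildStream code:

```python
class Workspaces:
    # Stand-in for the main-process workspace data model.
    def __init__(self):
        self._state = {}

    def update(self, element_name, data):
        self._state.setdefault(element_name, {}).update(data)

class Element:
    def __init__(self, name):
        self.name = name

    def _assemble(self):
        artifact_key = "cafe1234"  # stand-in for the real build result
        # Return the desired workspace side effects instead of applying
        # them directly from job context.
        return artifact_key, {"last_build": artifact_key}

class BuildQueue:
    def __init__(self, workspaces):
        self._workspaces = workspaces

    def done(self, element, result):
        # The job's return value is the only channel for side effects;
        # only the build queue (main process) touches the workspace model.
        _artifact_key, workspace_update = result
        self._workspaces.update(element.name, workspace_update)

# Example: the main process applies the job's returned side effects.
workspaces = Workspaces()
queue = BuildQueue(workspaces)
element = Element("hello.bst")
queue.done(element, element._assemble())
```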
benschubert | Ah yes! There we agree :) | 21:38 |
tristan | Right, that's what I mean, it's currently this hazy thing where jobs/elements just set stuff | 21:39 |
tristan | that's why it's in a category where "things that get set, become synchronized" | 21:39 |
tristan | Which yeah, I agree is sloppy | 21:39 |
tristan | but let's fix it all the way if we do :) | 21:40 |
benschubert | Right, so with a thread-based scheduler, we can return more meaningful data in the return of the _assemble and use it to update the workspace in the build queue (and only that one) | 21:41 |
benschubert | Would that make sense? | 21:42 |
tristan | In a multiprocess model we could too | 21:42 |
tristan | Only, in a thread-based model, things become more dangerous | 21:42 |
tristan | I.e. one could have overlooked the relevance of synchronizing state at the end of a job when originally writing up this workspace side effect | 21:43 |
tristan | and it might even work well most of the time | 21:43 |
tristan | I would expect in hindsight that this would have happened without much consideration given to when "the right time to set workspace data" is | 21:44 |
tristan | Not exactly sure if it's important for it to be synchronized at the end of a job, just saying; it's dangerous that things could "just work" without forcing that consideration to occur | 21:44 |
*** toscalix has quit IRC | 21:46 | |
benschubert | Right, ok I've got enough to think about for now, thanks! | 21:46 |
tristan | I think that as far as workspaces go, it's just important that the side effects of a single workspace modification do not affect adjacent workspace modifications in the same session, which might be happening concurrently | 21:46 |
*** benschubert has quit IRC | 23:56 |