IRC logs for #buildstream for Saturday, 2020-07-04

juergbibenschubert: hm, the build is failing with up-to-date casd? i.e., getting OOM killed at capture? that would be a different issue than what WSalmon reported00:12
*** benschubert has quit IRC00:13
*** benschubert has joined #buildstream09:19
benschubertjuergbi: let me update everything and come back to you :)09:19
benschubertI need to see if it's also not simply llvm that's using too much09:19
juergbiright, the llvm build can use a lot of memory if you use high parallelization09:20
benschubertyeah I'll try with less first09:20
benschubertseems like the default of 8 might be too much09:20
juergbibenschubert: did you see an OOM kill, the terminated process in the kernel log might already provide a hint in that case09:21
juergbiif the compiler was killed it is definitely not related to the capture phase09:21
benschubertjuergbi: the kernel logs where only showing messages about fuse cleaning up stuff, I did not have more history than that?09:22
benschubertmore accurately:09:22
benschubert[ 2577.966320] fuse: trying to steal weird page09:22
benschubert[ 2577.966323]   page=000000004335e751 index=47445 flags=80000000000000ad, count=1, mapcount=0, mapping=          (null)09:22
juergbiah, right, that message is quite spammy, hopefully fixed soon in master09:22
juergbiyou don't have journald?09:22
benschubertnope, running in docker09:23
benschuberton a windows machine :)09:23
juergbiah, I see09:23
*** doras has joined #buildstream10:36
dorasI really need a way to force CAS cleanup. It's getting far too large and once I set a quota is just says "Error instantiating artifact cache: Your system does not have enough available space to support the cache quota specified."10:37
doras... instead of try to cleanup right there and then.10:38
benschubertdoras: BuildStream master or 1.4?10:39
doras1.410:39
benschubertwhat's the size of your cache? If it's very big, it might be more worth wiping it completely and rebuild your project, bst 1.4 cache cleanup is notoriously slow10:40
dorasDeleting it means 3 hours of pulling10:42
dorasAnd doing it every few weeks is really getting old.10:43
benschubertdoras: there's a lot of improvements made on the master branch for such things. We can't backport them however10:43
benschubertI haven't touched the 1.4 cache for a very long time. can you du -hs ~/.cache/buildstream/* please? and how much disk space do you have available?10:44
benschubertNote  that if your cache is around 2TB currently, cleaning it will take more than 8hours, so 3h of download might be better in that case10:45
benschuberthttps://gitlab.com/BuildStream/bst-plugins-experimental/-/merge_requests/123 if anyone has time for reviews :)10:48
benschubertjuergbi: can confirm it was not a buildbox problem :)10:59
*** tomaz has joined #buildstream11:04
tomazI'm trying to port a specific application from flatpak-builder to buildsteram, and while trying to run it I'm having this error:12:06
tomazhttps://pastebin.com/yjD9xWNx12:06
tomazaparently it can't access the display :0 - anyone can share a ligth?12:07
juergbibenschubert: good to know12:10
juergbitomaz: you might be missing configuration for bst shell to allow access to display etc.12:11
juergbithis could be useful as starting point: https://gitlab.gnome.org/GNOME/gnome-build-meta/-/blob/master/project.conf#L15712:11
tomazjuergbi: on the pastebin there's the environment variables to access the project. /me looking the project conf12:14
tomazjuergbi: yeah, I'm setting everything that the configuration you send me sets.12:14
tomaz(on the file app.yml)12:15
juergbitomaz: also the host-files? e.g. ${XDG_RUNTIME_DIR}?12:16
juergbithe xauth file might be in there12:16
juergbithis may also depend on your host system / display server12:17
tomazhm, no, not that one, thanks for pointing it out.12:17
tomazthis needs to be in the sdk (in that case, gnome-build-meta) or is should be set per project? (currently I'm trying to build dolphin)12:17
juergbiif that doesn't help you could try disabling xauth access control12:17
juergbiit's per project12:18
tomazcool cool.12:18
juergbi(include files may be helpful to avoid duplication)12:18
tomazand yes, it was there (I just didn't pay attention) - I coppyed the app.yml file from the transmission buildstream repo.12:20
benschuberttristan: are you around and have 5 minutes? :)14:55
tristanbenschubert, Yeah15:14
tristanI will probably run off in a bit, I was curious and giving this a try: https://cython.readthedocs.io/en/latest/src/tutorial/profiling_tutorial.html15:15
tristannot super verbose but a bit helpful15:15
benschuberttristan: ok , then quick I'll be: I think that if we want to make the threading approach really neat, we need to stop having a "Messenger" object on the context that is used everywhere, but rather control the scope of them explicitely.15:19
benschubert1) Would you rather have this in the same MR or should I do this separately15:19
benschubert2) This is only for "jobs" so I could move everything that _might_ be running in jobs to take it more or less explicitely, or just get rid of it in the context entirely and pass it where relevant15:19
tristanInterestingly, it's telling me that the majority of time spent from Variables.check() (which does most resolution), is actually spent in Value.resolve(): https://gitlab.com/BuildStream/buildstream/-/blob/tristan/partial-variables/src/buildstream/_variables.pyx#L52215:19
tristan0.061 in Value.resolve() out of 0.070 time spent in Variables._resolve()15:20
tristanbenschubert, I'm not sure I understand or am on board with that15:20
tristanbenschubert, Maybe we want separate instances of a messenger object ?15:21
benschubertyep that's it15:21
tristanRegarding Messenger and Context separation, it appears to me that it was mostly already done15:21
benschubertand that means that an object might use different instances of it15:21
benschubertwell, everything in the codebase is reaching the messenger through the context15:22
tristanMessenger was created, before that it was methods handled on Context, I think that separation makes sense15:22
benschubertwhich makes separation hard15:22
tristanRight, that could be split up15:22
tristanAt the cost of minor burden15:22
benschubertI could also keep it for the things that do not run in the scheduler and then for the rest pass it more or less explicitely around15:23
benschubertE.g. add a _messenger on Plugin, and a use_messenger(messenger) context manager on them15:23
tristanWell, if we kept Messenger accessible on Context, it would be a good thing to have a single focal point when a thread is created, to doctor the Messenger in the child thread (or replace it with a new one which has access to a Queue or such)15:24
tristanWhat I hope to preserve is that code needs not know where it's running of course15:25
tristanand that we don't have to explicitly pass things around15:25
tristanI would guess that job base class takes care of this15:25
benschuberttristan: the problem comes with the Job that sends messages back to the parent15:25
benschubertwhere we actually need to be able to access both15:25
benschubertso we can't really use a "thread-local" approach for better performance15:25
benschubertwe need a finer grain handling15:26
tristanUmmm15:26
benschubertit's all inter-tangled :/15:26
tristanSo you want to marshal messages in the same memory space without sending them over a queue15:26
benschubertyep15:27
benschubertso we can avoid having to pass everything through the asyncio loop15:27
benschubertwhich is one of the two biggest slowing points by moving to threads15:27
tristanMaybe it's better to do that separately as a performance enhancement, it will involve some explicit locking and such15:27
benschubertnot necessarily15:28
tristanit's just a lot to think about in a single review process, I get where you're going15:28
benschubertYeah, but on the other hand I'd rather not have a half-backed branch go in :)15:28
benschubertsince we're speaking about a x2 slowdown15:28
tristanI do think there is a part of this I don't get though15:28
benschubertotherwsie a feature-branch with multiple MRs going to it, then a massive MR to master15:29
benschubertah?15:29
tristan" but rather control the scope of them explicitely. "15:29
tristanI mean, I can envision a world where messages get added to a queue (not an IPC) for the frontend to process serially, in the same memory space15:29
tristanThat much I grok15:29
tristanWhat I don't grok is why various parts of the code need to have any explicit knowledge of how messages get marshalled, or what extra explicit stuff needs to happen15:30
tristanMaybe get to the merge request, and then we start figuring out how to make this transparent to Message() API clients15:31
benschubertOk, an example: let's take an Element. It runs some stuff in the background threads when running through jobs, and other stuff in the main thread (e.g. is_cached()). In both you need an access to a different Messenger15:31
tristanDo you though ?15:31
benschubertthe background thread will write to a file, have a difference "silence" context and suh15:32
tristanMaybe without process separation, you just have a single messenger and a mutex15:32
tristanAh that15:32
benschubertyeah15:32
tristanThat could just be a pid() check ?15:32
tristanBut still keep the same messenger ?15:32
benschubertbut then that means that you have:15:33
benschubertmessage sent from job -> messenger -> queue -> root messenger15:33
benschubertmmh we could actually use a stack of messenger15:33
tristanYou have "parts of the code base issuing messages", and a focal point (Stream) for an abstract frontend to display/log main messages15:34
tristanAll in one memory space (at least for Stream and below it)15:34
benschubertNope, the messenger is also used for the communication between child and parent jobs15:34
benschubertuh that's actually maybe the pain point I need to solve first15:34
tristanBut with threading, those are the same right ?15:34
benschubertwell I'm trying to merge them but can't right now, since they run in different threads15:35
tristanOhhh is the messenger hijacked and mixed into the parent/child stuff ?15:35
benschubertwhich allowed for an easy messenger separation through locals()15:35
benschubertyeah15:35
benschubertok I guess I need to solve this first actually15:35
tristanThat used to be just a discrete queue between Job parent and child15:35
tristanThat needs to go away first I'd expect, and the rest gets simplified yeah15:35
benschubert        if envelope.message_type is _MessageType.LOG_MESSAGE:15:36
benschubert            # Propagate received messages from children15:36
benschubert            # back through the context.15:36
benschubert            self._scheduler.notify_messenger(envelope.message)15:36
benschubertIs the real pain point15:36
benschubertbecause at that point, you want the main thread's messenger15:36
tristanAhhh right, so Jobs had their discrete Queue, which was also used to marshal messages, along with it's internal IPC15:36
benschubertyeah15:36
tristanThen Messenger object was born at some point, and got mixed up in all of this15:36
benschubertand then you get the notifier15:37
tristanUh oh15:38
benschubertwhich is even worse, it is from the App, but called and calls to the scheduler15:38
tristanThe notifier sounds like the madness which got introduced to the scheduler, instead of passing State objects around everywhere15:38
tristanWe can kill that off with a firing squad15:38
benschuberthttps://gitlab.com/BuildStream/buildstream/-/blob/master/src/buildstream/_stream.py#L1614 is where it gets murky15:41
tristanYeah15:41
tristanI've been planning on nuking that all to hell15:41
benschubertHow would you see that instead?15:41
tristanIt may make sense to do that first and put it at the beginning of your MR, maybe even merge the nuking of that first15:42
benschubertYeah, might make my whole job easier15:42
tristanSo originally, we had Context (no Messenger)15:42
benschubertHow would you see that? Direct 'state' access in the scheduler?15:42
tristanAnd that had a callback which the frontend can use to display messages15:42
tristan*YES*15:42
tristanThat is exactly what state is for15:42
* benschubert is not clear about what Stream is meant to do15:42
tristanIt is a goal to make sure that the frontend only ever talks to Stream15:43
tristanAnd possibly Context and Project and State15:43
tristanStream is the main interface to the core15:43
benschubertwhen we were talking about splitting scheduler and frontend in two processes, Stream would have been the entry point in the child process right?15:44
tristanFrontends can be implemented in various ways, so long as Stream draws a hard line15:44
benschubertThat's how it should be seen15:44
tristanRight15:44
tristanStream would be created in the frontend15:44
tristanAnd Stream would decide on it's own to create a subprocess to run the scheduler in15:45
benschubertThen how do you handle https://gitlab.com/BuildStream/buildstream/-/blob/master/src/buildstream/_stream.py#L1614 ? That seems to be callbacks that should be sent back "up" ?15:45
tristanBut Stream already was the main abstraction point before starting talking about process separation, it would have stayed that way15:45
tristanState was created so that it could be updated within the core and changes received in the frontend15:46
tristanAll passed through State15:46
tristanState state = stream.get_state()15:46
tristanstate.register_callback("foo", self.foo_changed_cb)15:47
tristanin the core... State is passed through constructors15:47
tristanAnd the core says stuff like state.set_foo()15:47
tristanWhere internally state says { if foo != self.foo: self.foo = foo; run_callbacks("foo") }15:48
tristanlike that15:48
tristanbenschubert, if ever process separation happened, State internally would need to grow a Queue, so that all that explicit extra codepaths running through the Scheduler, would have never needed to happen15:48
tristanState would take care of knowing it's in what process and where to deliver the notifications to15:49
tristancore just gets a State object passed around, and frontend just listens to a State object, and if process separation happens, State has to work it out while providing everyone with the same functioning API15:50
benschubertOk, I think I understand the gist, but not sure I'm confident of having clear vision on how to do this15:51
benschubertAnd digging around the messenger it seems I could also do without that change for now, so I'll probably let you get that one15:52
tristanIf you start from a fresh branch off master, and then give the State()/Messenger() instances to Scheduler and Queue constructors and whoever needs it, remove this crazy manual message marshalling function, then there will be gaps to fill15:53
tristanThings which relied on this extra "notification" thing just call directly into the State they were given, or the Messenger they were given15:53
tristanState for state changes and Messenger for events/messages15:53
tristanAlright :)15:54
benschubertOh wait, you mean basically nuking 'MessageType' and calling into messenger / state instead?15:54
tristanMessageType is required15:55
tristanIt's a part of the message15:55
tristanit indicates START/SUCCESS/FAIL/SKIPPED etc, it's the component of a Message() which is the thing that get's pumped through the Messenger() queue15:55
benschubertah we're not talking about the same one then I think15:56
benschuberthttps://gitlab.com/BuildStream/buildstream/-/blob/master/src/buildstream/_scheduler/jobs/job.py#L7415:56
tristanAh yeah, that's different15:56
tristanThat used to just be strings I think15:57
tristanbenschubert, *that* thing only goes away with your branch15:57
benschuberttrue15:57
tristanThe scheduler getting it's dirty hands involved in stage change notifications and message delivery, is what I want to remove15:57
tristanbasically that notification handler15:58
tristanscheduler._notify() goes away15:58
benschuberthttps://gitlab.com/BuildStream/buildstream/-/blob/master/src/buildstream/_scheduler/scheduler.py#L605 so we would remove 'NotificationType' right?15:58
tristanNotification and NotificationType goes away15:58
benschubertok I *think* I got it15:59
benschubertI'll try a MR today and if I didn't get it right, feel free to take over/start anew ?15:59
tristanYeah, all that extra noise getting in between people sending messages and notifying state changes, and the receiving State and Messenger objects15:59
tristanYeah15:59
tristanhttps://gitlab.com/BuildStream/buildstream/-/blob/tristan/partial-variables/src/buildstream/_variables.pyx#L522 <-- bottleneck is here :-S16:00
* tristan goes and get's chicken16:00
benschuberthaha, enjoy!16:00
tristanbenschubert, it looks like creating that list, and calling "".join() is in fact the hottest part of the whole thing16:00
tristan.061 out of 0.070 is spent there16:00
benschuberttristan: I would not have guessed :/16:01
tristanSo all of the recursion stuff is quite performant with the arrays and everything16:01
tristanYeah me neither :-/16:01
tristanIt could be I could run through the arrays, count up the size, use a loop of strncpy() into an allocated buffer, and return a fresh string16:01
tristanand make it faster just there16:01
tristanI don't know16:01
tristanIt could also be the nature of the data from valentind's variable test which causes the hotspot to go there16:02
tristanshould verify first16:02
benschubertyou can try on fsdk too16:02
tristanyeah, what's the branch which works with bst2 ?16:03
* tristan could run show on that to help benchmark16:03
tristanvalentind's test data has a lot of variables with many many components16:03
tristanwhere real life has more variables with lest components16:03
tristanless16:04
benschubertI'm not sure what's the branch to use for that nowadays16:06
benschuberthaven't looked at it for quite a bit16:06
Frazer61Hi, just made the proposal for bst source show. I was wondering though about pylint, i noticed the pylint config, and most likely pylint itself is quite outdated and doesnt support all the new features, or enable a few things that could be beneficial, such as enforcing LF for the end of lines. Is this something you would be interested in me making16:26
Frazer61a patch for?16:26
benschubertFrazer61: sure, that would be beneficial!16:32
benschubertdon't hesitate to ask here if you are unsure whether a new rule should be enabled or not :)16:32
Frazer61ok, do you want me to make a patch to show off, i have done some work locally16:41
benschubertYou can always make a MR it's good for visibility16:42
Frazer61will do :)16:42
tristanFrazer61, we've updated it a few times over time, it of course gets outdated and is work to support new errors, patches pretty much always welcome for that yes :)17:00
tristanEnabling new pylint errors, or errors which are currently disabled, can be more difficult :)17:01
benschubertBut might be worth it if you have the courage (but better to ask first for those :D)17:01
benschuberttristan: https://gitlab.com/BuildStream/buildstream/-/merge_requests/1985 I've started17:01
benschubertcurrently working on the `retry` part17:01
benschuberttristan: does it look like what you were thinking?17:01
tristanFrazer61, I'll probably not have time to reply to that until monday17:02
tristantwo things I can say off hand is that (A) It would be good to follow the same API as `bst show`, i.e. use a `--format` parameter... (B) There is no way to know what a "git" plugin is, i.e. filtering output based on plugin 'kind' strings is not meaningful because a plugin 'kind' can mean something different in every project (even multiple projects connected through junctions)17:04
tristanbenschubert, yeah that looks like it's on the right track17:06
tristanbenschubert, ideally the whole Notification/NotificationType would be removed of course17:06
benschuberttristan: yep, I'm going part by part to make the commits easier to review :)17:06
benschubertFor SUSPEND/QUIT/etc, I'll make methods in State, where you can register a callback, does that sound good?17:07
tristanThat sounds good, I'm not sure what the right API is for that, previously that part is mostly controlled by the frontend anyway17:08
benschubertI think frontend -> stream -> state -> whatever is the logical poitn there17:08
tristanBut if the frontend needs to be told, it could be good to have a kind of execution state17:08
benschubertso the only two real interface between stream and scheduler, are messenger and state17:08
tristanYeah, but these notifications look weird because... the frontend basically tells the stream to suspend17:09
benschubertYeah we have that in Stream for now17:09
tristanand to resume17:09
benschubertwhich is right no?17:09
tristanSo I don't know why the frontend needs to be told17:09
benschubertah yeah absolutely :)17:09
benschubertfor the "scheduling state" (terminated/running/etc), I guess moving that from Stream to state is the better way?17:10
tristanIf we do need that state tracking and notification for some reason, it could be good to have state have an API like State.set_execution_state(ExecutionState.SUSPENDING | ExecutionState.SUSPENDED | ExecutionState.TERMINATING ...)17:10
benschubertyup, will see what I need17:10
tristanand have a single notification "execution-state-changed"17:10
tristanbenschubert, regarding your last, I don't think so17:11
tristanI mean, it depends17:11
benschubertSo keeping that in stream?17:12
tristanI think that the frontend controls everything via Stream17:12
tristanAnd I think the frontend cannot call things on State which "cause things to happen"17:12
benschubertright17:12
tristanFor the frontend, State is only something to add listeners to and update UI17:12
benschubertyeah I don't think I did anything there that changes that view17:12
tristanSo if there is an execution-state-changed signal/notification, that goes on State17:12
tristanBut the frontend should still be calling Stream.suspend() and such17:13
* tristan not sure what the API exactly is named17:13
benschubertyeah I think we're on the same page there17:13
tristanRight, we probably agree, I'm just being pedantic :)17:13
benschubertanyways, with well separated commit, it should be easy to change in the worst case17:13
tristanThis will remove a bunch of unneeded redirection of API calls, will be good :)17:14
tristanHopefully make your refactor easier17:14
tristaninteresting, the string "/usr" is 16 bytes long17:42
* tristan might have a memcpy version that'll work17:42
tristanwith https://github.com/cython/cython/blob/master/Cython/Includes/cpython/unicode.pxd17:42
benschubertnice :)17:45
benschubertI'm almost done too with the notification stuff17:45
benschuberttristan: for 'interrupt', do you prefer:17:48
benschubert- callback set on 'state'17:48
benschubert- callback set on 'stream'17:48
benschubert? I think a callback on 'stream' would be nicer (we could have stream pass the callback to Scheduler.run() ?17:48
benschubertI actually think we should do that for the last 5 types: tick,suspend,unsuspend,quit, terminate, just pass callbacks17:48
benschubertit seems weird passing callbacks through the state for such17:54
tristanI don't know if I follow17:57
tristanUmmm17:57
tristanSo right, the scheduler handles the interrupt and informs the frontend17:57
benschubertwhich would not happen like this if they were separated processes17:57
tristanit is not the frontend who catches ^C and decides to tell the stream to suspend17:57
tristanOr is it ?17:57
benschubertit's the event loop17:58
tristanI guess this is more user friendly to the frontend17:58
tristanI think the frontend *does* handle it in some cases, but only when interacting with the user using Click and the click.Abort exception17:58
benschubertLet me do two more commits I think I have a good solution actually :)17:58
benschubertyeah, frontend handles17:58
benschubertbut loop catches17:58
tristanI think you're on the right track17:59
benschubertso loop needs to bubble it up17:59
tristanhmmm, /me not getting this unicode API18:01
tristanI'm not sure that cython header file really means "UTF-8 encoded string" when it says so18:04
benschubertIf someone has time, I've got a few MRs that could have reviews:18:05
benschuberthttps://gitlab.com/BuildStream/bst-plugins-experimental/-/merge_requests/123 (ostree rewrite)18:05
benschuberthttps://gitlab.com/BuildStream/bst-plugins-experimental/-/merge_requests/124 (quick cleanup on CI)18:05
benschuberthttps://gitlab.com/BuildStream/buildstream/-/merge_requests/1984 (quick cleanup)18:05
tristancrap18:15
* tristan is using PyUnicode_AsUnicode, PyUnicode_GetSize to build the buffer with memcpy, and PyUnicode_FromUnicode() in the end keeps throwing me invalid unicode bytes18:17
tristanValueError: character U+abf7b198 is not in range [U+0000; U+10ffff]18:17
tristanAlso, https://github.com/cython/cython/blob/master/Cython/Includes/cpython/unicode.pxd is not the same as .tox/.package/lib/python3.7/site-packages/Cython/Includes/cpython/unicode.pxd18:18
tristanPyUnicode_GetSize() says "Return the length of the Unicode object.", this does not specify in code points or bytes either18:19
benschubert:/18:20
* tristan needs to find out what those map to and how18:21
tristanmaybe it's not all installed into .tox18:21
benschuberttristan: https://gitlab.com/BuildStream/buildstream/-/merge_requests/1985 ready! if you have time to review, it would be great to get that in18:21
benschubertI'm going to eat, and will adress your comments after if you had time :)18:22
tristanYeah, it's looking great overall :)18:31
tristanWhat a relief :)18:31
tristanI won't be around when you come back18:32
benschuberttristan: haha I'm not away yet, but great thanks :)18:32
tristanat least if I am, I will feel quite embarrassed about it, and be late meeting people tomorrow because of it :)18:32
benschubertI'll add the nit18:32
benschubertdon't overstay :) thanks for the view, I'll address the nits and once I've got an approval I;ll merge18:32
benschubertactually, if you're fine with it and I address your comments I guess I can merge as is?18:33
tristanI think so yes18:33
tristanbenschubert, Just try interacting with it first18:33
tristanYou probably did18:33
tristanbut check the usual suspects, ^C, terminate, `kill -TERM`18:34
tristan^Z18:34
tristanIf you don't see any obvious regressions, let's just land it18:34
benschubertokay! thanks18:36
tristanmaybe instead of constructing a list, I could make an iterator for Value to pass to "".join()18:37
tristanThen we reuse join on python str objects without the overhead of creating a list or appending strings to it18:38
tristanHmm18:38
* tristan will give it a shot tomorrow-ish18:38
tristantomorrow real life kicks in for a day18:38
*** benschubert has quit IRC23:09

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!