IRC logs for #buildstream for Wednesday, 2018-10-03

*** bochecha has quit IRC00:06
*** alatiera_ has quit IRC00:07
*** catonano has quit IRC00:35
*** abderrahim has quit IRC00:40
*** abderrahim has joined #buildstream00:46
*** xjuan has quit IRC00:49
*** abderrahim has quit IRC00:51
*** abderrahim has joined #buildstream00:51
*** abderrahim has quit IRC01:27
*** abderrahim has joined #buildstream01:28
*** abderrahim has quit IRC01:31
*** abderrahim has joined #buildstream01:41
gitlab-br-botbuildstream: merge request (valentindavid/rmtree_oserror->master: Catch correct exception from shutil.rmtree) #849 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/84904:17
*** mohan43u has joined #buildstream04:29
*** alatiera_ has joined #buildstream05:00
gitlab-br-botbuildstream: merge request (juerg/dummy-sandbox->master: _platform/linux.py: Accept all configs for dummy sandbox) #843 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/84305:08
*** abderrahim has quit IRC05:23
*** abderrahim has joined #buildstream05:23
gitlab-br-botbuildstream: merge request (juerg/dummy-sandbox->master: _platform/linux.py: Accept all configs for dummy sandbox) #843 changed state ("merged"): https://gitlab.com/BuildStream/buildstream/merge_requests/84305:36
*** abderrahim has quit IRC05:44
*** abderrahim has joined #buildstream05:44
*** catonano has joined #buildstream05:48
*** abderrahim has quit IRC06:08
*** abderrahim has joined #buildstream06:09
*** abderrahim has quit IRC06:18
*** abderrahim has joined #buildstream06:18
*** Nexus has joined #buildstream06:56
*** catonano has quit IRC07:18
gitlab-br-botbuildstream: merge request (tristan/fix-status-messages->master: fix status messages) #845 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/84507:33
gitlab-br-botbuildstream: merge request (tristan/fix-status-messages-1.2->bst-1.2: fix status messages (1.2)) #846 changed state ("merged"): https://gitlab.com/BuildStream/buildstream/merge_requests/84607:35
gitlab-br-botbuildstream: merge request (juerg/cas-batch-1.2->bst-1.2: CAS: Implement BatchUpdateBlobs support) #844 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/84407:35
*** tristan has joined #buildstream07:45
*** toscalix has joined #buildstream07:47
*** toscalix has quit IRC07:48
*** toscalix has joined #buildstream07:49
gitlab-br-botbuildstream: merge request (juerg/cas-batch-1.2->bst-1.2: CAS: Implement BatchUpdateBlobs support) #844 changed state ("merged"): https://gitlab.com/BuildStream/buildstream/merge_requests/84408:05
gitlab-br-botbuildstream: merge request (tristan/fix-status-messages->master: fix status messages) #845 changed state ("merged"): https://gitlab.com/BuildStream/buildstream/merge_requests/84508:07
*** jennis has joined #buildstream08:18
*** jmac has joined #buildstream08:19
*** paulsherwood has joined #buildstream08:20
*** phildawson has joined #buildstream08:20
adds68juergbi, hey would you have any  idea what the error on issue https://gitlab.com/BuildStream/buildstream/issues/692, could be indicating?08:23
Kinnisonadds68: are you giving it the full chain?08:25
*** tristan has quit IRC08:25
Kinnisonadds68: Your system might lack the intermediate certs for LE08:25
gitlab-br-botbuildstream: merge request (docs_Search_not_working->master: Fixing: Search functionality in the Documentation has stopped working) #848 changed state ("merged"): https://gitlab.com/BuildStream/buildstream/merge_requests/84808:25
adds68Kinnison, yea, i'm passing fullchain.pem08:25
KinnisonOddness08:26
KinnisonDoes the system you're running on have a "normal" set of root certs?08:26
adds68Kinnison, viewing that file also shows it has the intermediate LE cert in there08:26
Kinnisone.g. is /etc/ssl/certs full'o'stuff ?08:26
* adds68 checks again08:26
adds68Kinnison, inside there is only a "ca-bundle" cert08:27
Kinnisonis that sizeable?08:27
KinnisonIf no, you're probably lacking the OS-provided root set08:27
KinnisonSSL is hard :(08:27
adds68Kinnison, du returns 0, yet there is data in the file?08:29
Kinnisonit may be a symlink08:29
Kinnisonwhat OS is this?08:29
gitlab-br-botbuildstream: merge request (jmac/remote_execution_client->master: Remote execution client) #626 changed state ("merged"): https://gitlab.com/BuildStream/buildstream/merge_requests/62608:29
gitlab-br-botbuildstream: issue #454 ("BuildStream client-side implementation of remote execution") changed state ("closed") https://gitlab.com/BuildStream/buildstream/issues/45408:30
adds68Kinnison, ahh, 1 sec, this is fedora08:30
Kinnisonis the ca-certificates RPM installed?08:30
adds68Kinnison, yea they are both symlinks --> /etc/pki/ca-trust/extracted/pem/*08:31
adds68Kinnison, i'll check now08:31
Kinnisonand that's well populated?08:31
*** tristan has joined #buildstream08:32
adds68Kinnison, yea that is well populated08:32
gitlab-br-botbuildstream: issue #476 ("Specify the behaviour when overwriting files, symlinks or directories") changed state ("closed") https://gitlab.com/BuildStream/buildstream/issues/47608:32
adds68Kinnison, seems ca-certificates is also installed08:32
*** ChanServ sets mode: +o tristan08:33
Kinnisonadds68: one grpc related thing I've seen suggests it might be choking if the root cert is duplicated08:34
Kinnisonadds68: try with the plain cert, rather than the fullchain08:34
adds68Kinnison, oo ok, i'll try that now :)08:34
juergbiadds68: depending on where your grpc comes from, it might use its own internal root certificates instead of the system one08:42
juergbihowever, I would expect let's encrypt certificates to work in either case08:42
juergbijmac: I commented on !83708:44
jmacI saw08:47
jmacI'm quite certain 837 does fix the issues described in #684, either that or I don't get them in the first place08:48
jmacBut the points you make are valid08:48
*** raoul has joined #buildstream08:48
juergbijmac: is your TMP on the same filesystem as ~/.cache/buildstream?08:49
jmacYes08:50
juergbiok, that explains08:50
juergbithat's often not the case08:50
jmacI was getting permission errors on centos.bst though, and not ones from the artifact cache. I'll have to re-run it08:51
juergbiyes, there are three potential points for exceptions08:51
juergbithe first one you only get if TMP is a different filesystem08:51
juergbithe second one is from import_files and fixed by the added chmod in your MR08:51
juergbithe third one is CAS08:51
jmacRight08:51
juergbithe first two are regressions from 1.2, the third one is not08:51
jmacMoving the temporary import directory to ~/.cache/buildstream sounds easy08:52
juergbiyes08:52
*** jonathanmaw has joined #buildstream09:00
*** finn_ has joined #buildstream09:04
gitlab-br-botbuildstream: merge request (willsalmon/outOfSourecBuild->master: Out of source builds) #776 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/77609:04
adds68Kinnison, juergbi  sorry i had a meeting09:12
adds68Kinnison, plain cert did not work =/09:12
adds68juergbi, it is installed via pip i think?09:13
juergbiyes, the pip one might indeed not properly use the system root certificates09:13
juergbipip vs distro fight ;)09:13
Kinnison:/09:14
juergbihowever, I would still expect let's encrypt to work unless the pip package is configured for no trusted CAs09:14
juergbiI don't think it's something we can really fix in BuildStream code. however, maybe we could adapt the install instructions to improve this, once we know how...09:15
adds68juergbi, hmm ok, i will try and install from the system, i thought LE was now trusted09:16
*** bochecha has joined #buildstream09:22
* tlater[m] wonders if he should approve CI messages to the buildstream-notification-list09:23
tlater[m]Were those recently blocked for some reason?09:23
tiagogomeserm, do you have to manually approve all of these messages? That won't work well09:24
tlater[m]Agreed, but I'm not quite sure why they're popping up - I'd have figured these would be allowed one way or the other.09:24
tlater[m]But then I'm not quite sure what we want on that list, so...09:25
*** lachlan has joined #buildstream09:27
tiagogomesI activated CI notifications for whom subscribes to those be notified of failures on the overnight scheduled pipelines09:27
tlater[m]Ah, fair enough09:29
tlater[m]I'll have a look and see if I can blanket allow these then09:29
tlater[m]Because it's stupid that they're being caught09:29
* tlater[m] clicked the "train bayesian" button, seems like there's not much else that can be done09:34
tlater[m]I suppose it's to avoid someone abusing the gitlab email to spam us09:34
adds68juergbi, seems dnf can not find grpc/grpcio09:35
adds68juergbi, on the grpc docs it only shows installation via package managers like pip ?09:36
coldtomadds68: it's not in the repos yet, bochecha is currently trying to get it in09:38
gitlab-br-botbuildstream: merge request (tristan/fix-terminated-jobs->master: _scheduler: Fix bookkeeping of terminated jobs) #850 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/85009:38
adds68/o\09:38
bochechaadds68: still in my copr for now09:39
gitlab-br-botbuildstream: merge request (tristan/fix-terminated-jobs-1.2->bst-1.2: _scheduler: Fix bookkeeping of terminated jobs) #851 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/85109:39
bochechaadds68: https://gitlab.com/BuildStream/buildstream/issues/68609:39
tristantlater[m], I've just fixed your issue #479, with !850 & !85109:42
tristantlater[m], Care to give those a lookover ?09:42
tlater[m]Sure!09:42
tristanSeems there are more regressions around termination, though :-(09:42
tristanOne by one09:42
tristanEspecially, I'd like to fix that when you hit ^C a second time, it doesnt pop up and ask you again09:43
tristanthat UX is horrible09:43
tlater[m]Oh, that is actually annoying, never caught onto that.09:43
tristanerr, anyway phrased backwards, it *does* currently pop up and ask you again09:43
tlater[m]Maybe we should also have a message that goes "terminating"09:43
tristanWe do09:43
tristanWe have two actually09:44
tristanFirst, we show a message that we are terminating tasks09:44
tristanThen, if those tasks don't respond to SIGTERM after a timeout, we have warnings that they were killed09:44
tlater[m]Ah, but I meant something a bit more permanent, so that it looks like buildstream is actually busy terminating.09:44
adds68bochecha, ah so this means installing Buildstream i meant grpc, i think there are some crossed wires here09:45
tlater[m]It's easy to miss that message and assume we're just continuing to build09:45
tristantlater[m], Well, they are the last lines which appear, and the status bar doesnt show up at that time09:45
bochechaadds68: I missed the beginning of the conversation and assumed you were talking about bst :P09:45
adds68But yes maybe installing buildstream via the system package manager will solve the issue09:45
tristantlater[m], really, if you don't have that obnoxious double question popping up, it looks very clear what is going on09:45
bochechaadds68: though grpc is still in my copr as well, and the issue I mentioned has details about its inclusion in fedora :)09:45
adds68bochecha, seems currently grpcio is having difficulties with system certificates, which maybe down to the python installation09:46
tristantlater[m], note that my fixes for #479, consequently also fix the end of build summary message to say "WARNING Build terminated", instead of "FAILURE"09:46
tristanThat seems to have just been a side effect of bad bookkeeping of tasks09:46
adds68bochecha, i guess i can just install using your linked rpm in the that bugzilla? :d09:47
bochechaadds68: it's a source rpm in the bugzilla09:48
bochechaadds68: add the copr and install from there?09:48
bochechaadds68: https://docs.buildstream.build/install_linux_distro.html#fedora09:49
adds68bochecha, documentation :O nice!09:50
bochechaadds68: that's been there for a while already :P09:50
tlater[m]tristan: This is a surprisingly simple fix. I guess we just weren't sure whether it was intended to be a failure when this was originally written.09:50
tristantlater[m], yeah it was pretty straight forward09:51
tristantlater[m], Note that the fix for master in queue.py is different than for bst-1.2, because now that we cache failed builds, failed jobs need to be propagated through the queues unconditionally09:51
tristanThat is a whole other can of worms09:52
tristantlater[m], I.e., I wonder how this plays fairly with the frontend popping up and asking whether you want to terminate,quit,continue,debug etc09:52
tlater[m]Ah, right, I actually missed that09:53
tristanI think I recall discussing this with skullman, and the result was that we force the next job to be included in the batch of pending jobs to execute in a `quit` scenario09:53
tlater[m]So the difference is that in master jobs are always added to the done queue?09:53
tristanBut I'm not exactly clear on how the scheduler has grown a feature to understand this09:53
tristantlater[m], Yes, which - I'm not sure how that plays out in reality honestly; we probably need more rigorous testing because, the possible side effects are pretty scary to think of; maybe there is a clause somewhere to ignore failed elements landing on your incoming queue09:55
*** finn_ is now known as finnb09:55
tristantlater[m], Anyway, I think we can put #479 behind us :)09:55
adds68juergbi, bochecha seems using the copr installation is also giving me the same error :(09:55
tristanThe double prompt at terminate time is next on the chopping block09:55
* adds68 isn't sure if it's LE or bst at this point 09:55
tlater[m]Yeah, it at least fixes that :)09:56
bochechaadds68: ok, what's the error?09:56
adds68bochecha, oops sorry, here is the original link, https://gitlab.com/BuildStream/buildstream/issues/69209:56
gitlab-br-botbuildstream: issue #479 ("Incorrect summary when terminating builds") changed state ("closed") https://gitlab.com/BuildStream/buildstream/issues/47909:59
gitlab-br-botbuildstream: merge request (tristan/fix-terminated-jobs->master: _scheduler: Fix bookkeeping of terminated jobs) #850 changed state ("merged"): https://gitlab.com/BuildStream/buildstream/merge_requests/85009:59
gitlab-br-botbuildstream: merge request (tristan/fix-terminated-jobs-1.2->bst-1.2: _scheduler: Fix bookkeeping of terminated jobs) #851 changed state ("merged"): https://gitlab.com/BuildStream/buildstream/merge_requests/85110:00
tristanWould be nice to also fix "Unknown child process pid 19753, will report returncode 255" when terminating10:01
tlater[m]That's a double-kill for a process10:01
tristanI think this might be done with ChildWather.remove_child_watcher()10:01
tristantlater[m], It is not !10:01
tristantlater[m], It seems to be happening every time; or wait... maybe you are right10:02
tlater[m]I've seen that error before when double-killing processes10:02
tlater[m]I thought I'd caught all scenarios in which that can happen, but evidently not10:02
tristantlater[m], Maybe the two are linked ! because of force terminate needing a second kick in the nuts with the obnoxious double-ask to the user !10:02
tristani.e., it's not the case where we timeout and force kill, it's happening in a regular terminate10:03
*** finnb has quit IRC10:03
tlater[m]I've only seen it on force exits, so that's quite possible10:03
*** finn_ has joined #buildstream10:04
tlater[m]Also no integration tests for interactive stuff, so this wouldn't be caught...10:04
tlater[m]These termination issues just show that we really need those.10:04
bochechaadds68: try validating the cert with openssl cmd10:06
gitlab-br-botbuildstream: issue #693 ("User is promted a second time when force terminating with ^C") changed state ("opened") https://gitlab.com/BuildStream/buildstream/issues/69310:09
tristantlater[m], I've just filed it in detail https://gitlab.com/BuildStream/buildstream/issues/69310:10
tristantlater[m], I would argue that they also indicate that developers need to smoke test more with real projects, rather than relying on CI as a safety net10:11
tristantlater[m], But, both are true10:11
*** finn has quit IRC10:31
*** finn_ is now known as finn10:31
*** finn_ has joined #buildstream10:31
finnif anyone uses the bst-artifact-server, I'd be interested to have any feedback on this:10:32
finnhttps://buildgrid.gitlab.io/buildgrid/using_cas_server.html10:32
gitlab-br-botbuildstream: merge request (jmac/vdir_import_test->master: WIP: Import test for virtual directories) #815 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/81510:38
tristanHmmm, any idea what could be re-introducing a SIGINT after terminate ?10:39
tristanAnalyzing this, things seem to be happening as expected: User hit's ^C -> We enter Scheduler.jobs_suspended() context manager (indirectly) -> This disconnects the event loop connections for SIGINT -> We prompt the user and handle click.Abort (which is ^C) -> We tell the Scheduler.terminate_jobs() (indirectly) -> We exit Scheduler.jobs_suspended(), which reconnects to SIGINT -> Then after the jobs get terminated... we get another SIGINT event !10:41
tlater[m]That seems... odd? Maybe we don't eat the original sigint event and it sticks around somehow?10:43
tristanOhhhhhhh10:43
tristanLooky here10:43
tristanIn that scenario10:43
tristanScheduler.terminate_jobs() gets called a bunch of times !10:43
tristanAha10:44
tristantlater[m], So, there is some kind of feedback loop where we receive the SIGTERM signals we send to the child processes in the parent10:45
tristantlater[m], Remember that there is a counter for that ?10:45
tristanAh no, it is with SIGTSTP10:45
Kinnisonisn't that ^Z ?10:45
tristanKinnison, Yes10:46
tristanI was wrong, but still suspect that we are receiving a SIGTERM in the parent process for every SIGTERM we send to the child10:46
tlater[m]Ah, right, that was an issue we had at some point last year10:46
KinnisonThat wouldn't be normal POSIX signal behaviour10:46
tristanKinnison, is it standard for SIGTSTP ?10:47
tristan(^Z)10:47
KinnisonTSTP might be sent through the group10:47
* Kinnison can't recall off-hand10:47
tristanAlthough, literally the first thing we do in a child process is call os.setsid()10:48
tristanAnd we launch the job with blocked signals, letting the child unblock them when ready10:49
* Kinnison recalls something about setpgrp() and setsid() working together10:49
jmacjonathanmaw: Is source mirroring a recent addition to BuildStream, or has it been around forever? I couldn't see an MR adding it.10:57
*** slaf_ has joined #buildstream10:58
*** slaf_ has joined #buildstream10:58
jonathanmawjmac: it's been around for a while10:58
tristanKinnison, That seems to be system daemon related, no permission to do that afaics (os.setpgrp() hits EPERM)10:58
*** slaf_ has joined #buildstream10:58
*** slaf_ has joined #buildstream10:58
jmacOK10:58
tristanWell, I have a fix for the behavior, I just don't like my fix very much10:58
*** slaf_ has joined #buildstream10:59
*** slaf_ has joined #buildstream10:59
*** slaf has quit IRC10:59
*** slaf_ is now known as slaf10:59
jonathanmawjmac: the original MR was https://gitlab.com/BuildStream/buildstream/merge_requests/404 I think, though there have been various changes building on it10:59
toscalixdo we want to have the IRC monthly meeting while most of us are in the same room?11:00
jmacjonathanmaw: That's fine - I was just looking for some documentation on how mirroring should be specified. I've found it in the documentation now.11:00
tristanI can verify now that we are indeed having feedback SIGTERM in the main process11:02
toscalixI would prefer to have the IRC monthly meeting next week11:05
toscalixso we have more time face to face and we can use the meeting to prepare the gathering11:05
tristantoscalix, Is it already scheduled for next week ?11:06
tristanI agree that we don't change it11:06
toscalixno, for week 4211:06
tristanOh you mean you want to bump it up one week11:06
toscalixyes11:06
tristanI am okay with that, in any case I would like to have it separately11:06
tristanFor anyone who doesn't attend Manchester, they still get a meeting where we are not all distracted11:07
toscalixvalentind: laurence Kinnison jmac tlater[m] others?11:08
* tlater[m] thinks this is sensible11:09
jmacIt'll then be during BazelCon which a few of us are attending. I don't mind if you move it, I probably won't attend though.11:10
toscalixah, bazelCon. And the week after is ELCE11:11
toscalixso maybe we simply leave it as it is11:11
juergbiKinnison: added env variable test to buildbox !711:12
gitlab-br-botbuildstream: merge request (jonathan/source-mirror-project-refs->master: tests: Add regression test for mirroring with project.refs) #823 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/82311:12
tlater[m]tristan: I'll be holding a tutorial session during the meetup in two weeks... Trying to come up with a script right now.11:13
tlater[m]I'm not sure what I should focus on though11:13
tlater[m]On the one hand, we have BuildStream's integration features, and on the other its application development features11:13
toscalixtlater[m]: is 101. Assume people do not know anything about buildstream11:14
toscalixso installation is first11:14
mablanchjuergbi, I can't approved BuildBox !7, but looks good to me and tested ok.11:14
tlater[m]Absolutely, I'm just not sure what to show after installation11:14
toscalixtlater[m]: you should also walk them through the docu11:14
juergbimablanch: thanks11:14
toscalixtlater[m]: examples11:14
toscalixthe ones in the documentation11:15
toscalixwe use the training sessions to validate the docu11:15
tlater[m]Heh, fair enough, that should be alright.11:15
toscalixif you need to add info to the examples or new examples... you have there info to include in the docu11:16
tlater[m]I just wonder if it properly shows why you'd want to use buildstream over whatever you're already using buildstream11:16
toscalixtlater[m]: not sure if this helps... there is a 102 session the day after, a more advance one11:20
tlater[m]Ah, who is holding that? Probably a good idea to sync up on what each of us covers11:20
toscalixtpollard: and jennis11:20
toscalixadds68: will support all of you with examples related with freedesktop-sdk11:21
jennisyes tlater[m], let's sync up11:24
jennis102 should probably cover workspaces, build shells, junctions11:25
jennismaybe dive into plugins a bit more11:25
tlater[m]jennis: Ah, that's interesting. I'd have thought of workspaces as one of the basic parts11:25
jennisok then include that in 101 :p11:25
tlater[m]Hell, even build shells - I think the 101 should cover anything you need to develop with buildstream.11:25
jennisLet's get a whiteboard later today?11:26
tlater[m]Yeah, that's probably best. I'll run through the docs in the meantime and get a write up of all features, so we can coordinate.11:26
jennissounds like a plan!11:26
jennisDo we have an option to 'only try and push' to the cache, and not bother looking for an artifact to pull11:30
laurencetoscalix, perhaps we can try the irc meeting, but if all of the attendees are physically present in manchester, then abandon it quickly11:31
laurencesince surely the important points will be covered and captured during the week11:31
jennisCurrently I'm trying to push (previously) unbuilt elements to a remote cache, but it always checks the remote cache first to see if it's there11:32
toscalixlaurence: we can do that if there is not external participants, yes11:32
jennisAnd for my pipeline of 1500 elements, I'm checking the empty remote cache for about 20 mins before I start building and pushing things11:32
tpollardjennis: not as far as I'm aware11:33
jennisI know we have `bst push`, but I assume that needs a local build before hand, so I guess I could build without a specific remote cache, then add the cache and then push --deps all11:34
jennisbut I'm wondering whether this option is something we should consider11:34
jenniswithout a specified remote cache*11:35
juergbijennis: isn't it starting the first build before it finishes pull attempts?11:41
jennisno11:41
juergbihm, maybe we could fix that11:41
juergbimight be due to fetch jobs being the same class as pull jobs11:41
juergbialso, maybe we should introduce a batch API for refs11:42
jennis^^ I wondered whether this was already in motion11:42
tpollardthat would be good11:42
jennis?11:42
juergbinot planned yet. the API/protocol should be simple to add, however, it might not be so easy to actually use it in BuildStream11:43
juergbias these checks are handled element by element11:43
jennisAhh, I thought this batch request could be sent immediately after we determine the pipeline11:45
jennisbecause at that point we have refs and names...11:45
jenniscan't we then retrieve a list from the server about which elements are 'fetchable'?11:47
juergbidue to non-strict build and workspaces, not everything is always available at the start of the session11:47
jennisoh ofc11:47
juergbiI don't think providing full listing makes sense11:47
juergbi1) it doesn't scale if the same CAS is used for tons of projects11:48
jennisbut you wouldn't be fetching a workspace and it's reverse dependencies anyway, you'll have to rebuild these right11:48
juergbi2) this could actually be a security issue (right now, nothing is accessible without digests)11:48
gitlab-br-botbuildstream: merge request (jonathan/source-mirror-project-refs->master: tests: Add regression test for mirroring with project.refs) #823 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/82311:49
jennisright ok, perhaps this would need further discussion then11:49
juergbiwith strict builds, you're correct. but there are non-strict builds11:49
jennisoh yes, good point11:49
juergbianother point is that we actually want to allow pulling elements later during a session even if they were not cached at the beginning11:50
juergbithink of long-running build sessions11:50
jennisif another dev pushes it in the meantime...?11:50
juergbiyes (or more realistically, CI)11:50
jennisOk, perhaps batched isn't the way forward here then11:51
jennisbut halting all builds until we've checked should probably be fixed11:51
juergbiI think the best option is really to ensure that fetch jobs are not starved by pull jobs (if that indeed is the issue)11:51
juergbichecking in the background shouldn't hurt too much11:51
juergbiI'd even say fetch jobs could always be considered higher priority than pull jobs11:52
juergbimaybe we need something more sophisticated, but a simple solution might work quite well11:53
gitlab-br-botbuildstream: merge request (tristan/fix-double-terminate-prompt->master: _scheduler/scheduler.py: Ignore interrupt events while terminating.) #852 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/85211:55
gitlab-br-botbuildstream: merge request (tristan/fix-double-terminate-prompt-1.2->bst-1.2: _scheduler/scheduler.py: Ignore interrupt events while terminating.) #853 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/85311:56
tristantlater[m], !852 and !853 fix the behavior, I'm still quite confused as to why this started happening, though11:57
tlater[m]tristan: I'll have a look.11:57
tristanI added a lot of trace while debugging this, interestingly it really only happens if you hit ^C a second time, if you choose (t)erminate, then it doesnt happen11:57
tlater[m]That does sound like the signal is blocked but re-appears when we unblock signals11:58
tristanAnd this keyboard interrupt *does* happen while in the Scheduler.jobs_suspended() context manager, where we have clearly disconnected the signals from the main loop11:58
tristanBut we don't do that11:58
tristanI.e. we don't block SIGINT, we actually *need* SIGINT for click.Abort to be raised and handled in the App interactive session11:59
tlater[m]Oh, right11:59
tristanWe *do* go ahead and block it for the remainder of the program lifetime, after that point12:00
tristanBecause we don't want ^C interfering with clean shutdown12:00
tristanSo, we (A) disconnect the asyncio mainloop signal handler... (B) Receive SIGINT in the form of click.Abort exception... (C) Call terminate_jobs, which blocks SIGINT forever and schedules termination in the mainloop... (D) Reconnect signals at that point12:02
tristantlater[m], While that patch does fix the really obnoxious double prompt, it however does *not* fix the "Unknown child process pid 32561, will report returncode 255" messages12:02
tristanAnd those happen *before* we redundantly call terminate_jobs() as a result of receiving feedback SIGTERM events in the main process12:03
tristanStill, situation much improved12:03
tristanHmmm, I did consider making this a FIXME/XXX comment12:04
tristanWill do, why not12:05
tlater[m]Cool, I'm happy with it otherwise. It should just be clear that we want to get to the bottom of this eventually12:05
gitlab-br-botbuildstream: merge request (tristan/fix-double-terminate-prompt->master: _scheduler/scheduler.py: Ignore interrupt events while terminating.) #852 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/85212:05
gitlab-br-botbuildstream: merge request (tristan/fix-double-terminate-prompt-1.2->bst-1.2: _scheduler/scheduler.py: Ignore interrupt events while terminating.) #853 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/85312:06
gitlab-br-botbuildstream: merge request (tristan/fix-double-terminate-prompt->master: _scheduler/scheduler.py: Ignore interrupt events while terminating.) #852 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/85212:06
gitlab-br-botbuildstream: merge request (tristan/fix-double-terminate-prompt-1.2->bst-1.2: _scheduler/scheduler.py: Ignore interrupt events while terminating.) #853 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/85312:07
tristanHmm, not sure why we got double notifications for those MR updates12:07
Kinnisonjuergbi: I'll look again in a bit12:16
tlater[m]Do we have a nice way to access logging yet? Or do you still need to wade through `~/.cache/buildstream/logs`?12:33
tristantlater[m], Still no feature :-/12:35
tristanWhich reminds me... We *still* don't print anything useful in the cleanup/cache_size logs12:36
tristanThere is interesting information there, like how much size was detected and how much remains in quota... and which artifacts we deleted in a cleanup job12:37
gitlab-br-botbuildstream: merge request (tristan/fix-double-terminate-prompt-1.2->bst-1.2: _scheduler/scheduler.py: Ignore interrupt events while terminating.) #853 changed state ("merged"): https://gitlab.com/BuildStream/buildstream/merge_requests/85312:37
tlater[m]Yep, this would be useful at least for debugging12:37
tristanEeek, is this already a filed issue: https://gitlab.com/BuildStream/buildstream/-/jobs/104201523 ?12:38
tristanCacheSizeJob unhandled exception: FileNotFoundError: [Errno 2] No such file or directory: '/builds/BuildStream/buildstream/dist/buildstream/tmp/test_push_pull_non_strict0/cache/artifacts/cas/refs/heads/test/target/tmp8_hdcyb3'12:39
tlater[m]tristan: Not as far as I know12:41
tlater[m]Looks like a race condition when walking the cache directory12:42
tristanThis is because of tmp files being created/removed while committing an artifact, and CacheSizeJob not finding a file it thinks should exist12:42
tristantiagogomes, Didn't we fix this by giving CAS a tmp directory outside of it's store ?12:43
tlater[m]Regardless, utils._get_dir_size() should guarantee that there's no race condition between it finding a file and calculating its size12:45
tristanI think we did12:45
tristantlater[m], I was thinking that before, but this might be better12:45
tristanI.e. we should be able to guarantee that files in cas/ only get deleted as a result of cleanup, no ?12:46
Kinnisonjuergbi: LGTM, you have an approve for bb!712:46
tristanAnd the fact that this error is raising, shows us that for some reason CasCache is not using it's tempdir for tempfiles, but creating temp files in "refs/heads/..."12:47
tlater[m]tristan: My only worry with leaving it as-is is that we might use that function elsewhere12:47
tlater[m]And we might not be able to guarantee that for things other than the cache12:47
gitlab-br-botbuildstream: merge request (jonathan/source-mirror-project-refs->master: tests: Add regression test for mirroring with project.refs) #823 changed state ("merged"): https://gitlab.com/BuildStream/buildstream/merge_requests/82312:48
tristantlater[m], I'm not convinced, rather we should document the function to assume that things don't inadvertently disappear from the directory while it is being walked12:48
tlater[m]Yeah, that works too12:48
tlater[m]Anyway we need to figure out why files blip out of existence :)12:49
tristanAha, I see12:50
tristanThe issue is that utils.save_file_atomic() needs to be given a tempdir, or else it will assume that it's okay to put it beside where the file is going to land12:50
* tristan notices that utils.save_file_atomic() doesnt even have proper API docs, while it does even have an example12:51
tristanWhy does save_file_atomic expose so many parameters anyway ?12:54
tristanOk, I won't fix that together12:55
gitlab-br-botbuildstream: issue #694 ("Re-think and document incremental workspaces better") changed state ("opened") https://gitlab.com/BuildStream/buildstream/issues/69412:58
gitlab-br-botbuildstream: merge request (Qinusty/634-workspace-failed-builds->master: Do not save workspace on failed build) #812 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/81212:59
gitlab-br-botbuildstream: merge request (tristan/fix-cache-size-race->master: fix cache size race) #854 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/85413:02
tristanAnd here is the fix for the race condition: https://gitlab.com/BuildStream/buildstream/merge_requests/85413:03
* tristan backports13:03
gitlab-br-botbuildstream: merge request (Qinusty/634-workspace-failed-builds->master: Do not save workspace on failed build) #812 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/81213:03
gitlab-br-botbuildstream: merge request (tristan/fix-cache-size-race-1.2->bst-1.2: Tristan/fix cache size race 1.2) #855 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/85513:05
gitlab-br-botbuildstream: merge request (tristan/fix-double-terminate-prompt->master: _scheduler/scheduler.py: Ignore interrupt events while terminating.) #852 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/85213:05
gitlab-br-botbuildstream: merge request (tristan/fix-cache-size-race->master: fix cache size race) #854 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/85413:06
*** solid_black has quit IRC13:12
*** cs-shadow has joined #buildstream13:19
tlater[m]So python 3.7 is giving me this warning: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working13:26
tlater[m]This happens in a lot of our yaml loading code13:26
tlater[m]As far as I can tell, python<3.7 don't support loading mapping from collections.abc13:27
tlater[m]But python>=3.8 won't support loading it from collections13:27
tlater[m]Any idea how we should migrate?13:27
tlater[m]Maybe better to ask on #python or somesuch13:27
tristanThis comes as a surprise to me, as I recall we *were* considering using collections.abc, but never went ahead and actually did it yet13:27
tlater[m]Ah, nevermind, I misread13:28
tlater[m]That has been possible since 3.313:28
tlater[m]I figure we should just start using the .abc then?13:28
tlater[m]We require 3.5 anyway, so this shouldn't be an issue13:28
tristanI think if we use abc *at all*, we should do it across the codebase, not arbitrarily here and there13:29
tlater[m]tristan: I'd move the entire codebase to start using collections.abc for Mapping so that we don't fall under the bus when 3.8 starts being used13:29
tristanIn fact, I feel the same way about haphazardly deciding to use @property here and there, while the rest of the codebase communicates a clear consistent standard use of explicit accessors name `def get_foo(self):`13:30
tlater[m]Do you mean I should start using abstract base classes for our abstract classes to?13:30
tristantlater[m], it would be good to use ABC for ArtifactCache, Sandbox, Plugin, Element, Source13:31
tristanIt can be done without breaking API13:31
tristanAnd will allow better linting13:31
tlater[m]Hm, to be fair, that doesn't seem too involved13:31
tristanHowever, there are some cases where ABC is too rigid13:31
tristanFor instance, it thinks that an abstract method should not have a default implementation, or that an implementor *must* implement an abstract method13:32
tristanWhich, is just not true13:32
tlater[m]Well, in that case it's not an abstract method13:32
tristanSo, our uses of ABC would be a bit contorted, I think that's why I didnt go forward with it13:32
tristantlater[m], Only according to ABC, not according to basic OOP concepts13:33
tlater[m]I'm pretty sure according to OOP concepts it is not considered an abstract method, since abstract methods are those that have no implementation.13:33
tlater[m]That doesn't keep you from overriding a normal method13:33
tristanWell, not sure what it buys us then to use ABC13:34
tlater[m]tristan: I'm getting a bit annoyed by my linter showing me deprecation warnings, would you accept an MR that only deals with the about-to-be-deprecated python API?13:34
tristanI.e. if we don't describe it as an @abstractmethod, then we don't have any added protection for other... what would you call them, virtual methods ?13:35
tristantlater[m], How about stop using ABC altogether, where are we using it and why ?13:35
tlater[m]tristan: We aren't using ABC - this is python moving one of their modules to a different name, and has nothing to do with ABC itself13:35
tlater[m](Internally perhaps this means that they started using ABC)13:36
tristantlater[m], Right, but the deprecation warning you mentioned above mentions this, so if we are not using it, it is a dependency that is doing that13:36
tlater[m]tristan: The dependency is python13:37
tristanAs such, it's sort of out of our control, but I would presume that dependency will probably work itself out by that time13:37
tristanSigh, so python introduces a deprecation warning, which itself causes to happen in it's own standard library ?13:37
tristanThen, this is a non-issue right ?13:37
tristanPython will sort itself out13:37
tlater[m]tristan: One of the lines showing that warning is this one: https://gitlab.com/BuildStream/buildstream/blob/master/buildstream/_context.py#L2213:38
tlater[m]If we keep this into python3.8 buildstream will fail to load half of our yaml parsing13:38
tristanAhh13:38
tlater[m]What that warning suggests is replacing that line with `from collections.abc import deque, Mapping`13:38
tristantlater[m], That deprecation warning is poorly worded IMP13:38
tristanIMO13:39
tristantlater[m], Yes that sounds entirely sensible13:39
tlater[m]Agreed, I had to dig into docs to understand it ;p13:39
tristanIt seems to suggest that people think about dequeue, Mapping, etc... as "ABCs"13:39
tristanBut anyway, I got the picture now, yes please of course that MR would be correct :)13:40
tlater[m]Cool - I was mostly confused because I thought we'd deprecate using python<3.713:40
*** catonano has joined #buildstream13:40
adds68juergbi, bochecha, Kinnison it seems rebuilding the server fixed this issue, i am wondering if it was due to snakeoil certificates being generated before the valid ones13:41
tristantlater[m], Seems pretty superficial change on Python's part; I don't think it's really that costly for them to keep supporting it until python 413:42
gitlab-br-botbuildstream: issue #693 ("User is promted a second time when force terminating with ^C") changed state ("closed") https://gitlab.com/BuildStream/buildstream/issues/69313:42
gitlab-br-botbuildstream: merge request (tristan/fix-double-terminate-prompt->master: _scheduler/scheduler.py: Ignore interrupt events while terminating.) #852 changed state ("merged"): https://gitlab.com/BuildStream/buildstream/merge_requests/85213:42
tlater[m]I agree, I think it's a bit stupid. But we'd have to lobby python to change that, so...13:42
* tristan dislikes that python thinks it's just a-okay to warn people in one cycle that they will break peoples apps in the next13:42
tristanAn app written against a stable API today, should work in 10 years13:43
gitlab-br-botbuildstream: merge request (tristan/fix-cache-size-race->master: fix cache size race) #854 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/85413:44
gitlab-br-botbuildstream: merge request (tristan/fix-cache-size-race-1.2->bst-1.2: Tristan/fix cache size race 1.2) #855 changed state ("merged"): https://gitlab.com/BuildStream/buildstream/merge_requests/85513:44
*** catonano has quit IRC13:44
*** catonano has joined #buildstream13:45
*** catonano has quit IRC13:46
*** catonano has joined #buildstream13:47
Kinnisonadds68: glad to hear you solved your problem13:54
*** tiagogomes has quit IRC13:57
adds68Kinnison, :D14:00
tlater[m]Heh, I just discovered a pylint annoyance where @abc classes would help14:05
tlater[m]It can't quite figure out that a method raises a NotImplementedError but is intended to return something when implemented14:06
*** tristan has quit IRC14:06
tlater[m]So it flags a bunch of lines as us assigning from a function that doesn't return a value14:06
* tlater[m] should investigate what @abc would add over our current implementation at some point14:07
jmacSimilarly, pylint likes to complain that a class's __init__ didn't call its superclass's __init__, despite the superclass __init__ just being a NotImplementedError14:09
gitlab-br-botbuildstream: merge request (tristan/fix-cache-size-race->master: fix cache size race) #854 changed state ("merged"): https://gitlab.com/BuildStream/buildstream/merge_requests/85414:09
tlater[m]Yeah, these things would be fixed by using python's proper abstract class infrastructure14:10
tlater[m]It would also flag them early for people who don't believe in pylint14:11
*** tristan has joined #buildstream14:14
gitlab-br-botbuildstream: issue #695 ("Allow configuring default location for workspaces") changed state ("opened") https://gitlab.com/BuildStream/buildstream/issues/69514:34
*** tiagogomes has joined #buildstream14:36
gitlab-br-botbuildstream: issue #229 ("Allow configuring default location for workspaces") changed state ("closed") https://gitlab.com/BuildStream/buildstream/issues/22914:38
gitlab-br-botbuildstream: issue #692 ("SSL root certificate error when configuring a cache") changed state ("closed") https://gitlab.com/BuildStream/buildstream/issues/69214:46
gitlab-br-botbuildstream: merge request (tiagogomes/docs-improvements->master: Documentation improvements) #835 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/83515:39
*** catonano has quit IRC16:18
gitlab-br-botbuildstream: merge request (jmac/temporaries-inside-cachedir->master: Move the temporary staging directory to artifactdir) #856 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/85616:20
gitlab-br-botbuildstream: merge request (danielsilverstone-ct/bwrap-check-runtime-only->master: WIP: Make bwrap check runtime only) #847 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/84716:21
*** tiagogomes has quit IRC16:22
*** toscalix has quit IRC16:29
Kinnisonjuergbi: I've updated !847 since your change to the linux platform was merged.  All the current test suite passes so I hope it's enough.  I'm not happy without a specific test for 847's changes, but ultimately the timeline on that is up to you and tristan regarding extra dockers etc.16:29
jmachttps://gitlab.com/BuildStream/buildstream/merge_requests/856 is a (hopefully) 5-minute review job if anyone fancies it16:32
* Kinnison takes a quick look for jmac16:36
gitlab-br-botbuildstream: merge request (jmac/temporaries-inside-cachedir->master: Move the temporary staging directory to artifactdir) #856 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/85616:37
Kinnisonjmac: the reasoning is sound and the change looks good to me16:37
* Kinnison approves16:37
*** tiagogomes has joined #buildstream16:37
*** jonathanmaw has quit IRC16:43
gitlab-br-botbuildstream: merge request (jmac/temporaries-inside-cachedir->master: Move the temporary staging directory to artifactdir) #856 changed state ("merged"): https://gitlab.com/BuildStream/buildstream/merge_requests/85616:44
*** xjuan has joined #buildstream16:49
*** phildawson has quit IRC16:54
*** raoul has quit IRC17:30
*** tristan has quit IRC18:53
*** tristan has joined #buildstream18:56
*** tristan has quit IRC19:02
*** tristan has joined #buildstream19:15
*** lachlan has quit IRC19:19
*** catonano has joined #buildstream19:50
cs-shadowhi! Do we have a known issue with our aarch64 runners? https://gitlab.com/BuildStream/buildstream-docker-images/-/jobs/104044816 failed without ever getting assigned a worker20:26
gitlab-br-botbuildstream: merge request (chandan/bst-docker-import->master: Add contrib script to generate Docker images from bst checkout) #857 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/85721:31
gitlab-br-botbuildstream: merge request (chandan/bst-docker-import->master: Add contrib script to generate Docker images from bst checkout) #857 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/85721:31
gitlab-br-botbuildstream: merge request (chandan/bst-docker-import->master: Add contrib script to generate Docker images from bst checkout) #857 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/85721:43
*** bochecha has quit IRC22:08
*** bochecha has joined #buildstream22:08
*** bochecha has quit IRC22:12
*** bochecha has joined #buildstream22:13
gitlab-br-botbuildstream: merge request (valentindavid/staging_order_and_link_files->master: Make link_files and copy_files and behave independently to staging order) #858 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/85822:34
*** xjuan has quit IRC23:02
*** catonano has quit IRC23:06
*** bochecha has quit IRC23:21

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!