IRC logs for #buildstream for Tuesday, 2017-12-05

*** benbrown has quit IRC00:16
*** nexus has quit IRC00:16
*** paulsherwood has quit IRC00:16
*** nexus has joined #buildstream00:17
*** benbrown has joined #buildstream00:17
*** paulsherwood has joined #buildstream00:18
*** mcatanzaro has quit IRC01:31
*** saxa has quit IRC01:55
*** saxa has joined #buildstream01:58
*** mcatanzaro has joined #buildstream02:01
*** mcatanzaro has quit IRC02:34
*** el has joined #buildstream05:19
*** el has left #buildstream05:19
*** bochecha has quit IRC05:51
paulsherwoodhmmm07:23
persia?07:28
paulsherwoodthe overnight spam07:42
paulsherwoodi'd rather we didn't keep that kind of thing for posterity07:42
paulsherwoodbut once folks start tampering with historical records all bets are off :/07:43
persiaIndeed.  There are no good choices for that sort of thing.  I suppose the part I find strangest is the content of this one: this isn't freenode.07:45
paulsherwoodheh :)07:47
*** jude has joined #buildstream08:22
*** bochecha has joined #buildstream08:57
nexushmm...odd09:15
*** ppp has joined #buildstream09:16
*** ppp has left #buildstream09:17
nexushmm09:22
*** bethw has joined #buildstream09:35
*** ppp_ has joined #buildstream09:38
*** ppp_ has left #buildstream09:39
*** bochecha has quit IRC09:58
*** bochecha has joined #buildstream09:58
*** bochecha has quit IRC10:03
*** ssam2 has joined #buildstream10:07
ssam2i am still pretty confused by why my subprocesses are dying10:49
ssam2it seems to be genuinely unrelated to the actual code that runs in the subprocesses10:50
ssam2like, if the first thing I do is raise an exception, sometimes i don't see the exception (but sometimes I do)10:50
ssam2ah, no it's me being dumb again10:51
ssam2argh, no it still doesn't make sense. it's so hard to debug this stuff10:53
ssam2juergbi, out of interest what version of Python are you using when you see the tests passing ?10:54
ssam2on the sam/multiple-caches branch10:55
tlaterssam2: Not sure how much this helps/if you already have it enabled, but there's an asyncio debug mode: https://docs.python.org/3/library/asyncio-dev.html#asyncio-debug-mode10:58
ssam2this is multiprocessing, not asyncio10:58
tlaterAh, fair enough10:59
ssam2not sure if asyncio would be a better choice in the long term... it'd require us to stop supporting python 3.4 though were we to use it10:59
tlaterI don't think we *can* use it for anything but the schedulers, it seems that only one object can run at any time11:00
ssam2that's my impression, it seems like it only supports single-threaded "async" behaviour11:01
ssam2although i haven't got my head around it at all11:01
tlaterWhen I ran into abruptly ending multiprocessing processes it turned out that python simply stopped before it finished running them - any chance this is happening to you?11:02
ssam2i don't *think* so ...11:03
ssam2no because the main process reports that they all died11:03
*** tristan has quit IRC11:04
juergbissam2: i'm on 3.6.311:06
ssam2bah, same as me11:06
juergbihowever, i've tested only your older branch (with my changes on top), not your latest branch yet11:06
ssam2ah ok11:06
juergbii can test your latest branch, if that helps11:06
ssam2worth a try11:07
gitlab-br-botbuildstream: merge request (sam/multiple-caches->master: WIP: multiple remote cache support) #166 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/16611:07
ssam2i've just pushed my current version (which is hideously messy)11:07
ssam2if you try: `python3 ./setup.py test --addopts '--capture=no -x tests/frontend/pull.py '`11:07
ssam2it should fail in the test_push_pull[user-config] test, and with the following errors from the `bst push` invocation11:08
ssam2Fetching artifact list from /home/shared/src/buildstream/tmp/test_push_pull_user_config_0/artifactshare/repo11:08
ssam2Fetching artifact list from /tmp/share/user11:08
ssam2Error loading pipeline: All processes died!11:08
juergbiit's hanging there right now11:09
juergbithat was without the latest push11:09
ssam2ah right11:09
ssam2with the latest version it wakes up from queue.get() once per second to check if all the subprocesses died while we were waiting11:10
juergbiyes, now they don't hang but fail. let me check the errors.11:12
ssam2ah, i might have spotted the issue11:12
ssam2the exception handler in the subprocess was only handling 2 kinds of exception11:12
ssam2if I change it to catch all, I see:11:12
ssam2Logging exception: g-io-error-quark: Invalid remote name file:///home/shared/src/buildstream/tmp/test_push_pull_user_config_0/artifactshare/repo (0)11:12
ssam2although it still seems to fail to actually return that exception to the main process11:13
juergbiah11:13
tlaterPerhaps it doesn't because it's a glib error? I've seen warnings that they don't extend the default python exception objects.11:14
ssam2it could be triggering some pygobject related issue, indeed11:15
ssam2seems that errors from pygobject can't be pickled11:21
ssam2that's probably why they aren't returned from the subprocesses11:21
ssam2ok, it all starts to make sense11:21
nexusi've added the change to the OSTree README, as discussed yesterday https://github.com/knownexus/ostree11:21
nexuswhat should i say in my PR?11:22
juergbissam2: ah ok, we should probably catch them and convert them to something picklable then?11:24
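A rough sketch of juergbi's suggestion, assuming the child reports back over a queue: catch everything in the subprocess and, if the exception does not survive pickling, replace it with a plain exception carrying the same message. `RemoteJobError`, `to_picklable` and `run_job` are hypothetical names for illustration, not BuildStream or PyGObject API.

```python
import pickle
import traceback


class RemoteJobError(Exception):
    """Plain, picklable stand-in for an error raised in a child process."""


def to_picklable(exc):
    """Return exc if it survives a pickle round trip, else a RemoteJobError."""
    try:
        pickle.loads(pickle.dumps(exc))
        return exc
    except Exception:
        # Keep the "Type: message" text so the parent can still log it.
        detail = "".join(traceback.format_exception_only(type(exc), exc)).strip()
        return RemoteJobError(detail)


def run_job(q, job, *args):
    """Child-side wrapper: always put ('ok', result) or ('error', exception)."""
    try:
        q.put(("ok", job(*args)))
    except Exception as exc:  # catch all, not just the expected kinds
        q.put(("error", to_picklable(exc)))
```

The parent can then re-raise or log whatever arrives under `'error'` without caring which library produced it.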
nexusthe change is here btw: https://github.com/knownexus/ostree#building let me know if you have suggestions on ways i should change it11:25
ssam2looks OK, I could suggest changes but i can only speculate about what the actual ostree maintainers will want11:36
ssam2the issue with the current instructions is that they can't actually be copied and pasted as-is, right ?11:37
ssam2saying "building is the same as almost every autotools project" isn't enough for folk who don't know autotools11:37
nexusthat's what was already there11:37
ssam2whereas your change is something that can be done11:37
ssam2exactlyu11:37
ssam2*exactly11:37
ssam2one issue is that you give environment variable settings, but you don't say what to do with them11:38
ssam2so the reader already needs to know about what .profile or .bashrc is11:38
ssam2which again, we shouldn't assume if we want these to be widely usable11:38
nexustrue, but how do i cover every option in that case?11:38
nexuse.g. i use zsh11:38
nexusso the instructions wouldnt work11:39
ssam2i don't know11:39
nexus:/11:39
ssam2perhaps just give an example for bash11:39
tlaterAssume bash, anyone who uses zsh has to make these changes all the time anyway11:39
ssam2if someone chooses to use zsh, we can assume they are fairly advanced11:39
nexusand mention that others might differ?11:39
ssam2yeah11:39
nexuskk11:39
ssam2https://unix.stackexchange.com/questions/11544/what-is-the-difference-between-opt-and-usr-local an "interesting" discussion of /opt vs /usr/local11:44
* tlater wonders where the "interesting" part kicks in; people seem to generally agree for once11:47
nexustlater: isn't that interesting enough? xD11:47
tlaterI suppose, but it isn't as fun to read ;P11:48
* persia isn't sure how such a conversation can be complete without inclusion of NFS and reminding folk that in the old days, it was difficult to fit enough disks into a computer to hold what we might consider a full install today.11:49
*** cs_shadow has joined #buildstream12:02
* ssam2 reports https://bugzilla.gnome.org/show_bug.cgi?id=79126512:53
jjardon[m]Hi, can someone take a look at https://gitlab.com/BuildStream/buildstream/merge_requests/173 , please?13:07
juergbijjardon[m]: commented. following the approach of the meson plugin, i would also go with make -C in the variables instead of inserting 'cd' commands, where possible13:22
ssam2ugh, seems that the click CliRunner merges stdout and stderr together when getting the output of running bst ..13:24
juergbioh, for output analysis that sounds wrong13:25
ssam2yeah13:25
ssam2the Result object just has an 'output' attribute13:26
jjardon[m]juergbi: thanks; seems cmake doesnt officially support what meson does, but happy to change the patch if we are ok with that13:26
juergbijjardon[m]: can you elaborate? would we need to use functionality that is not officially supported?13:26
juergbimake -C should be equivalent to cd followed by make, no matter what kind of makefile cmake generates13:27
juergbidon't know whether there is an equivalent for the cmake command itself, but we could also use cd in the variable there, if necessary13:28
jjardon[m]sure, I meant the cmake part: seems cmake supports "cmake -BBuilddir -Hbuildsrc" but is not documented13:28
juergbihm13:29
ssam2https://github.com/pallets/click/issues/737 -- "Enable CliRunner to echo output to stdout/stderr"13:33
ssam2no movement on this of course as Click is abandoned13:33
ssam2wait, that's not the one I meant13:33
ssam2https://github.com/pallets/click/issues/37113:33
ssam2'Result object should have .stdout and .stderr in addition to .output'13:33
tlaterMy integration commands for an element including a linux element fail because it lacks /bin/sh, but I also include busybox which is supposed to provide that... Running bst checkout --no-integrate on it gives me a directory with a working /bin/sh, as expected. Any ideas?13:34
ssam2note that an error like 'Not found: /bin/sh' can also mean that the ld.so that it requires was not found13:35
tlaterEugh. Well, that will probably be it then.13:36
ssam2try: readelf -a /bin/sh|grep 'interpreter'13:37
ssam2you should see something like: [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]13:37
ssam2if that file doesn't exist in the checkout, then there is your issue13:37
tlaterActually, no, the error is: bwrap: execvp sh: No such file or directory13:37
ssam2might still be ld.so, not sure13:38
tlaterProblem is I can't shell into the sysroot, so I can't check for that...13:38
tlaterHmm13:38
ssam2readelf from your host will work on binaries inside the sysroot13:38
ssam2unless it's a cross build13:38
tlaterAlright, that seems to be the issue, ta ssam2... I'm guessing I'm missing that in some split rule13:41
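The readelf check ssam2 suggests can also be approximated without leaving Python. The sketch below reads the PT_INTERP program header directly and then looks for that path inside a checkout directory. It is an illustration under stated assumptions: `elf_interpreter` and `missing_interpreter` are hypothetical helpers, only the standard ELF32/ELF64 header layouts are handled, and symlink targets are not resolved relative to the sysroot.

```python
import os
import struct

PT_INTERP = 3  # program header type holding the interpreter path


def elf_interpreter(path):
    """Return the requested program interpreter of an ELF file, or None."""
    with open(path, "rb") as f:
        data = f.read()
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file: %s" % path)
    is_64bit = data[4] == 2                # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little endian
    if is_64bit:
        e_phoff, = struct.unpack_from(endian + "Q", data, 0x20)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x36)
    else:
        e_phoff, = struct.unpack_from(endian + "I", data, 0x1C)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x2A)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        p_type, = struct.unpack_from(endian + "I", data, base)
        if p_type != PT_INTERP:
            continue
        if is_64bit:
            p_offset, = struct.unpack_from(endian + "Q", data, base + 0x08)
            p_filesz, = struct.unpack_from(endian + "Q", data, base + 0x20)
        else:
            p_offset, = struct.unpack_from(endian + "I", data, base + 0x04)
            p_filesz, = struct.unpack_from(endian + "I", data, base + 0x10)
        return data[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
    return None  # statically linked, or not an executable


def missing_interpreter(sysroot, binary):
    """Return the interpreter path if it is absent from sysroot, else None."""
    interp = elf_interpreter(os.path.join(sysroot, binary.lstrip("/")))
    # lexists: a symlink counts as present; its target is not checked here.
    if interp and not os.path.lexists(os.path.join(sysroot, interp.lstrip("/"))):
        return interp
    return None
```

Calling `missing_interpreter(checkout_dir, "/bin/sh")` on a checkout like tlater's would return something like `/lib64/ld-linux-x86-64.so.2` when the loader was dropped by a split rule, and `None` when it is present.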
*** mcatanzaro has joined #buildstream14:01
gitlab-br-botbuildstream: merge request (jjardon/cmake_build->master: buildstream/plugins/elements/cmake.yaml: Always create build folder) #173 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/17314:19
juergbijjardon[m]: thanks. given that the tests passed, i take it mkdir is not necessary with -B?15:36
jjardon[m]juergbi: it seems to be automatically generated, yes (I have tested myself here and everything seems to build fine)15:37
juergbigreat, let's merge this15:37
gitlab-br-botbuildstream: merge request (jjardon/cmake_build->master: buildstream/plugins/elements/cmake.yaml: Always create build folder) #173 changed state ("closed"): https://gitlab.com/BuildStream/buildstream/merge_requests/17315:38
jjardon[m]juergbi: thanks!15:39
*** bochecha has joined #buildstream15:56
*** tristan has joined #buildstream16:09
ssam2i'm /still/ having trouble with this concurrent repo initialization ... now ostree is claiming a remote doesn't exist when it does16:33
ssam2it seems to cache them in memory in a hash table, so perhaps doing that is breaking something16:34
ssam2i'm really strongly tempted to just do everything in the same process here; the code is just too fragile for me to be confident merging it or maintaining it16:34
tlaterWhy should this be done in multiple processes in the first place?16:35
ssam2it was already done that way16:36
ssam2i presume because fetching the list of refs is network-IO bound16:36
tlaterRight, yeah, that makes sense16:36
ssam2actually that doesn't make sense16:36
ssam2because this dates from when there was only one cache, and we block waiting for the result16:36
tlaterWhy not?16:36
tlaterIt's probably still better not to block on network IO16:37
tlaterRegardless of whether that made sense originally16:37
ssam2unless it makes the code impossible to work with ...16:38
ssam2also, i didn't profile this properly yet, but it seems to take us a while to marshall the list of refs between processes16:38
ssam2I think that multiprocessing.Queue pickles everything it transfers, which is not actually very efficient16:38
tlaterIt does, that bit me at some point16:39
ssam2so this optimisation only benefits folk who use multiple large caches across slow networks16:39
juergbissam2: might you actually be running into a ostree-level concurrency issue?16:39
ssam2juergbi, yes I think so16:39
juergbisubprocess is required due to the (possibly) implicit libostree threading as discussed16:39
ssam2as the remote definitely exists on disk, but in-process the OSTree code doesn't find it16:39
juergbihowever, we could do everything in a single subprocess16:39
ssam2surely initializing the remotes in the main process would be fine ?16:40
juergbiif it's a libostree operation, how can you be sure it doesn't use a background thread?16:40
ssam2and seems less dangerous than mutating them concurrently in subprocesses16:40
ssam2true but this is all done during frontend init16:40
ssam2so the task scheduler isn't running yet16:41
juergbiif the background thread properly terminates afterwards, it would probably not be an issue. but if it might hang around, fork is a bad idea16:41
juergbiwhich might be the case with thread pools16:41
juergbii don't know too much about libostree internals, but they could do arbitrary things there. and might even add such potential issues in future versions16:41
juergbii'd play it safe and serially spawn a separate subprocess for each remote16:42
juergbior rather, multiprocessing process16:42
ssam2hmm, ok16:42
ssam2that seems like we get the worst of both worlds, but I'll see if it fixes this issue16:43
juergbifork is very fast on Linux, so it shouldn't be a huge issue16:43
ssam2yes, but returning a 10MB dict via pickle isn't fast16:44
juergbiyou can also use a single subprocess if that doesn't make the code more complex16:44
*** valentind has joined #buildstream16:44
ssam2the cache can contain a huge number of refs, and we query *all* of them16:44
juergbitrue16:44
juergbii don't really see another safe option, though16:45
ssam2fair enough16:45
juergbispawning ostree binary is safe even from the main process but i doubt this would help here16:45
ssam2might be faster to json encode the dict and return that ... would need some proper profiling to see though16:46
ssam2and profiling infra is still an (important) TODO16:46
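Until proper profiling exists, the pickle-versus-JSON question can at least be probed with a quick timing script. The sketch below is only a starting point: `fake_ref_list` is a made-up stand-in for a cache's ref list (ref name to checksum), and the numbers it prints depend on the machine and Python version, so no result is claimed here.

```python
import json
import pickle
import time


def time_roundtrip(name, dumps, loads, payload, repeats=3):
    """Return (name, best wall-clock seconds) for one encode/decode round trip."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        decoded = loads(dumps(payload))
        best = min(best, time.perf_counter() - start)
    assert decoded == payload  # both formats must preserve the dict
    return name, best


def fake_ref_list(count):
    # Hypothetical data shape: "ref name" -> 64-character hex checksum.
    return {"project/element-%d/key" % i: "%064x" % i for i in range(count)}


if __name__ == "__main__":
    refs = fake_ref_list(100000)
    for name, seconds in (
        time_roundtrip("pickle", pickle.dumps, pickle.loads, refs),
        time_roundtrip("json", json.dumps, json.loads, refs),
    ):
        print("%-6s %.3fs" % (name, seconds))
```

This measures only serialisation cost, not the pipe transfer that `multiprocessing.Queue` adds on top, so it cannot settle the question on its own.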
juergbii'm wondering whether we could somehow catch libraries that attempt to create a background thread in the main process16:46
juergbiso we could immediately abort16:47
juergbithat way we could also test for such issues, at least for a fixed libostree version16:47
tristanIs there a reason we dont use the cache key for ostree artifact extract directories ?16:48
juergbicould probably be done manually with gdb but not sure whether there is something reasonable we can do from within Python16:48
juergbitristan: multiple cache keys can point to the same artifact16:49
juergbi(especially with non-strict mode)16:49
tristanjuergbi, when there is a weak cache key in use, there is still a strong cache key, though16:49
tristanare we also showing weak cache key in the UI ?16:50
juergbiyes, they could still share the extract dir, though16:50
* tristan wonders if that would make sense or not16:50
juergbii don't think we show that in the UI but not 100% sure16:51
tristanI dont understand, short of a cache key collision, how they can share the same dir; if the strong key is always used :-/16:51
juergbidon't we use the commit ID for the extract dir?16:51
tristanI think that the actual weak key can (and should) remain opaque16:51
tristan(i.e. not in the UI)16:51
tristanwe do something like that yes16:52
tristanwe parse_rev() and use that16:52
tristan_ostree.checksum()16:52
juergbiright16:53
tristanit's just really confusing I guess16:53
tristanmaybe I should add checkout --hardlinks16:53
juergbiif i'm not mistaken, we don't have the strict cache key during a non-strict build before we actually extract the artifact16:53
juergbiso we have to go by ostree rev16:53
juergbias we don't want to create an extract directory with the weak cache key (as that doesn't uniquely point to an artifact)16:54
tristanIf we dont have a strict cache key, then we know we dont yet have a strict cache key16:54
tristanso we simply cannot extract an artifact in that case16:55
juergbiif we're in non-strict mode and the artifact was already built, there is a strict cache key inside that artifact16:55
juergbibut we have to extract it to retrieve it16:55
juergbithe only unique id we can use for this extraction is the ostree rev16:55
juergbiwe could also use a temp directory, of course, but that would result in re-extracting everything every time16:56
juergbissam2: btw: https://gitlab.com/BuildStream/buildstream/commit/21f546fa35eaefef2048918afa83b1f222d6839c16:56
ssam2ah, thanks16:58
ssam2starting subprocesses one after the other doesn't seem to fix this issue of disappearing remotes17:01
ssam2i have somehow introduced yet another hang, too17:02
ssam2or maybe its the same issue and its just random17:03
tristanIf we add something for optimized checkout, which gives you files hardlinked to the local artifact cache; what would we call it ?17:05
ssam2`checkout --hardlinks` ? would be nice if it contained the word 'unsafe' or 'readonly' somewhere, though17:07
tristan`bst checkout --hardlinks ...` ?17:07
tristanexactly what I was thinking17:07
tristanalso, I was thinking that technically, it would be better to try os.rename() on the root directory17:07
tristanand then fall back to the regular hardlink-or-copy codepath17:08
tristan(but still in either case, it has hardlinks to the internal cache)17:08
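One reading of tristan's idea, sketched with the standard library only: try a single `os.rename()` of the root directory first, and fall back to a per-file hardlink-or-copy walk when that fails. `checkout_tree` and `link_or_copy` are hypothetical names, not the functions in BuildStream, and the sketch ignores symlinks, permissions and special files.

```python
import os
import shutil


def link_or_copy(src, dest):
    """Hardlink src to dest, copying instead if linking is not possible."""
    try:
        os.link(src, dest)
    except OSError:
        shutil.copy2(src, dest)


def checkout_tree(src_root, dest_root):
    """Move src_root to dest_root if possible, else hardlink-or-copy each file."""
    try:
        os.rename(src_root, dest_root)
        return "renamed"
    except OSError:
        pass  # e.g. different filesystem, or dest_root exists and is not empty

    for dirpath, dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dest_root, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            link_or_copy(os.path.join(dirpath, name),
                         os.path.join(target_dir, name))
    return "linked"
```

Either way the resulting files share inodes with whatever the source tree was hardlinked to, which is why the option names floated in the channel lean towards words like "unsafe" or "readonly".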
ssam2`bst checkout --internal-cache-links` ?17:08
ssam2that sounds a bit more scary17:09
juergbithe (inverse) git clone option is called --no-hardlinks. however, there hardlinks are less dangerous17:09
juergbiit's really unfortunate that linux doesn't support immutable files (proper support, not the unusable chattr kind)17:10
tristan--impatient ?17:12
tristan`bst checkout --reckless ...`17:13
juergbi--no-warranty17:13
paulsherwoodbst checkout --before-i-go-to-the-pub17:13
bochechawhy is `OrderedDict()` printed when I `bst build`?17:16
bochechathat seems like a debug output someone forgot to remove before pushing? :P17:17
tristanoh ??17:17
tristanbochecha, I'm not seeing that17:19
tristanat any particular time ? have a log ?17:19
tristanmaybe it's plugin specific ?17:19
bochechatristan: at the very beginning: https://paste.gnome.org/phqqukxf517:19
bochechait's an autotools element17:20
* tlater sees the same thing17:20
bochechaI see it with pip elements as well17:20
tlaterOn the current docker image, at least17:20
bochechaI'm running BuildStream from fc96ff0 (I see there's one newer commit in master)17:21
tristanaha17:21
* tlater will check his commit in a sec17:21
tristantlater, it will be the recent changes on making overlaps pretty17:21
tristanmaybe it goes bonkers when there is no overlaps or something17:22
tlaterOn that note, they *are* pretty - helped me debug my image quite quickly17:22
tlaterIt prints that regardless of how many overlaps there are, btw17:22
tristanAlso I think that should be warning, not info17:22
bochechawhat are overlaps?17:23
* bochecha hasn't looked into BuildStream for some time, coming back now17:23
tristanhaha ok I see it now17:23
tlaterOne of my elements prints: OrderedDict([(usr/bin/awk, [base/base-configure.bst, base/gawk.bst]), (usr/bin/ninja, [base/base-configure.bst, base/ninja.bst])])17:23
tlaterbochecha: Two elements might contain the same files - you could, for example, try to overwrite /usr/bin/ninja on your base system with one you freshly build as part of your pipeline.17:25
* tristan is removing the .info() line17:25
tlaterbuildstream figures out when you do that and tells you so you don't do it on accident17:25
bochechatlater: neat17:26
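The overlap report tlater pasted has the shape "file path, then the list of elements that provide it". A minimal sketch of how such a mapping could be computed is below. It is not BuildStream's implementation: `find_overlaps` and the manifest format (element name mapped to a list of paths, in staging order) are assumptions for illustration.

```python
from collections import OrderedDict


def find_overlaps(manifests):
    """Map each path provided by more than one element to those elements.

    `manifests` maps element name -> iterable of file paths, in staging order.
    """
    owners = OrderedDict()
    for element, paths in manifests.items():
        for path in paths:
            owners.setdefault(path, []).append(element)
    # Keep only the paths that more than one element wants to write.
    return OrderedDict((path, elements)
                       for path, elements in owners.items()
                       if len(elements) > 1)


if __name__ == "__main__":
    manifests = OrderedDict([
        ("base/base-configure.bst", ["usr/bin/awk", "usr/bin/ninja", "usr/bin/sh"]),
        ("base/gawk.bst", ["usr/bin/awk"]),
        ("base/ninja.bst", ["usr/bin/ninja"]),
    ])
    for path, elements in find_overlaps(manifests).items():
        print("%s: %s" % (path, ", ".join(elements)))
```

With the element names from tlater's paste, the two overlapping paths are `usr/bin/awk` and `usr/bin/ninja`; `usr/bin/sh` has a single owner and is left out.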
*** valentind has quit IRC17:29
*** valentind has joined #buildstream17:29
ssam2doing all the repo init in the main process definitely fixes this issue of the list of repos being wrong17:36
ssam2list of remotes, ratherh17:36
ssam2*rather17:36
ssam2of course, maybe it reintroduces hangs somewhere else17:36
ssam2ok, alternatively if I call _ostree.configure_remote() both in the subprocess, and in the main process, then the issue goes away17:57
ssam2that seems like the best option I guess17:57
juergbihm18:00
*** bethw has quit IRC18:33
*** ssam2 has quit IRC18:34
*** jude has quit IRC18:59
gitlab-br-botbuildstream: merge request (sam/multiple-caches->master: WIP: multiple remote cache support) #166 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/16619:04
gitlab-br-botbuildstream: merge request (sam/multiple-caches->master: WIP: multiple remote cache support) #166 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/16619:08
*** tiago has quit IRC19:15
*** xjuan has joined #buildstream19:21
*** bethw has joined #buildstream19:25
gitlab-br-botbuildstream: merge request (checkout-hardlinks->master: Checkout hardlinks) #174 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/17419:46
gitlab-br-botbuildstream: merge request (checkout-hardlinks->master: Checkout hardlinks) #174 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/17419:49
*** bethw has quit IRC19:51
*** tristan has quit IRC19:59
gitlab-br-botbuildstream: issue #162 ("Add --unsafe option to bst checkout") changed state ("closed") https://gitlab.com/BuildStream/buildstream/issues/16221:00
gitlab-br-botbuildstream: merge request (checkout-hardlinks->master: Checkout hardlinks) #174 changed state ("merged"): https://gitlab.com/BuildStream/buildstream/merge_requests/17421:00
*** xjuan has quit IRC21:33
*** valentind has quit IRC22:52
*** bochecha has quit IRC23:05
