*** philn has quit IRC | 00:07 | |
*** phildawson has quit IRC | 02:28 | |
*** phildawson has joined #buildstream | 02:28 | |
*** tristan has quit IRC | 05:44 | |
*** tristan_ has joined #buildstream | 05:49 | |
*** ChanServ sets mode: +o tristan_ | 05:49 | |
*** benschubert has joined #buildstream | 07:13 | |
tristan_ | https://bpa.st/KFWQ | 07:47 |
tristan_ | benschubert, ^^^^ | 07:47 |
tristan_ | And better performance on the way, possibly equal | 07:47 |
benschubert | oh nice! That's a really good output | 07:48 |
tristan_ | So interestingly, constructors in cython all want tuples | 07:59 |
tristan_ | But method calls don't | 07:59 |
tristan_ | I'm thinking, having constructors with no arguments, and a separate init() method, is cheaper than tuplizing your constructor arguments | 08:00 |
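(A minimal Cython sketch of the pattern tristan_ describes above; the class and method names are illustrative assumptions, not BuildStream's actual code. Instantiating a cdef class packs its arguments into a Python tuple for __cinit__/__init__, whereas a cdef method call on a typed variable compiles to a plain C call:)

```cython
cdef class Value:

    cdef str original

    # No-argument constructor means nothing to tuplize;
    # a separate C-level init() takes the real arguments.
    cdef Value init(self, str original):
        self.original = original
        return self

cdef Value make_value(str original):
    # Value() allocates, init() is a direct C call with no tuple packing
    return Value().init(original)
```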
benschubert | it's probably worth a try, though I'd bet that if cython does it like that, they have a good reason :) | 08:01 |
*** dftxbs3e has joined #buildstream | 08:07 | |
tristan_ | Well, it will decrease yellowness | 08:08 |
benschubert | as I said, worth a shot! For Cython I realized that benchmarking was actually the only way of knowing if I was improving the code or not :'D | 08:09 |
tristan_ | Yeah, I think that I have one more avenue though | 08:09 |
tristan_ | Remove the lists | 08:09 |
tristan_ | That appears expensive, now that I have ResolutionStep, I should be able to just link the steps and avoid append()/pop() | 08:10 |
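(A rough sketch, under assumed names, of the idea above: replace the python list used as a work stack with an intrusive linked list, so pushing and popping a ResolutionStep are just pointer assignments rather than append()/pop() calls:)

```cython
cdef class ResolutionStep:
    cdef str varname
    cdef ResolutionStep prev

cdef resolve(str varname):
    cdef ResolutionStep step
    cdef ResolutionStep stack = None

    # push: link a new step onto the head of the chain
    step = ResolutionStep()
    step.varname = varname
    step.prev = stack
    stack = step

    while stack is not None:
        # pop: unlink the head instead of calling list.pop()
        step = stack
        stack = step.prev
        print(step.varname)  # ... process the step, pushing deps the same way ...
```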
benschubert | ok :) | 08:25 |
benschubert | Gah, for the threaded scheduler, I now have all tests passing when running pytest directly, but most fail when running through tox -_-' | 08:26 |
tristan_ | Myeah | 08:31 |
juergbi | benschubert: due to tox running with -n? or independent of -n? | 08:43 |
*** santi has joined #buildstream | 08:49 | |
tristan_ | benschubert, Can you run another benchmark on the latest update ? | 08:58 |
* tristan_ will remove the lists now | 09:00 | |
tristan_ | benschubert, In fact, the last three commits on the tree (at this moment) are interesting to benchmark | 09:22 |
tristan_ | benschubert, The third last has an algorithm change for circular deps which very noticeably improves performance, the last two commits are (A): Remove yellowness, by avoiding tuplizing constructor arguments... (B) Remove lists in Variables._resolve(), by adding ->prev pointers to ResolutionStep, and adding a ValueLink object to use instead of a python list of deps | 09:24 |
tristan_ | list removal being the tip of the branch | 09:24 |
benschubert | tristan_: sure will do, starting now :) | 09:28 |
benschubert | juergbi: independent of -n | 09:28 |
benschubert | pytest -n 10 works perfectly well | 09:28 |
benschubert | I'll start looking in the env variables and see if I can reproduce | 09:29 |
benschubert | but it's annoying :/ | 09:29 |
benschubert | Otherwise I've got profiles now, and it seems like it might be just that we overwhelm the asyncio loop, so some cleanup might well help. Nothing too weird | 09:31 |
benschubert | tristan_: benchmark running. Will let you know when it's done. | 09:34 |
robjh | my workflow: disable push to remote cache, build. fail? fix. success? reenable push, delete local cache and build again. | 09:54 |
robjh | is there an easier way to make it not push failures? | 09:54 |
benschubert | robjh: I don't think so. You also do not need to delete the local cache in case of success, you could "just" rebuild / push directly, which would already be slightly better | 09:59 |
benschubert | --no-push-failure or something might make sense though, not sure we have an issue about that | 09:59 |
juergbi | robjh: can you elaborate on the 'fix' step. are you fixing the .bst file or the plugin? | 09:59 |
robjh | the bst file juergbi | 10:01 |
juergbi | ok, so you should definitely get a new cache key, so no manual deletion of local cache should be necessary | 10:01 |
juergbi | and the only advantage of first disabling push is to save a bit of network bandwidth, correct? | 10:01 |
robjh | if i dont delete the local cache first it doesnt trigger a push to the remote | 10:02 |
juergbi | that seems odd | 10:02 |
juergbi | can you check whether the cache key of the element changes as it should? | 10:02 |
juergbi | if not, this might be a plugin bug | 10:02 |
robjh | no. failures being in the remote cache reduces my ability to debug because it doesnt show the full log of the failure and isnt then able to drop to a shell | 10:03 |
juergbi | hm, failure behavior shouldn't change just because the artifact has been pushed | 10:04 |
juergbi | if you can reproduce this, can you please write down the steps in an issue? | 10:04 |
juergbi | the main issue I'm aware of with cached failures is for transient failures or plugin bugs | 10:05 |
robjh | i can certainly try | 10:05 |
juergbi | e.g. when an element fails because your system ran out of memory | 10:05 |
juergbi | but in that case there is no fix in the .bst file and thus, this is a completely different scenario | 10:06 |
robjh | in this case, im not sure the fix will work and i need more than one shot at seeing the failure happen | 10:07 |
tristan_ | benschubert, thanks | 10:08 |
robjh | also, what on earth does this mean?; | 10:08 |
robjh | Choice: [continue]: s | 10:08 |
robjh | Dropping into an interactive shell in the failed build sandbox | 10:08 |
robjh | Error while attempting to create interactive shell: Buildtree is not cached locally or in available remotes | 10:08 |
juergbi | robjh: I don't see why logs would deteriorate after pushing the failure | 10:08 |
robjh | i just saw it try to build this one. it really should have everything | 10:08 |
juergbi | that's right after the build failure in the same session? | 10:09 |
robjh | yep | 10:09 |
juergbi | don't have an idea off the top of my head | 10:09 |
robjh | on failures, i like to go in with a shell and try it manually. once its failed, and cached, i cant do that | 10:10 |
juergbi | maybe caching the buildtree failed (e.g. out of space) but I'd expect a different error message in that case | 10:10 |
juergbi | shell from a cached failure is supposed to work | 10:10 |
robjh | once its failed, and cached onto a server, its a pain to go and sort that out | 10:10 |
juergbi | both from the interactive build session and with `bst shell` afterwards | 10:10 |
abderrahim[m] | caching the buildtree often fails for me | 10:10 |
abderrahim[m] | and it fails silently | 10:10 |
abderrahim[m] | I have filed this bug, but forgot to update it with my investigation | 10:11 |
juergbi | oh, that's new to me | 10:11 |
abderrahim[m] | I'll try to do it later today | 10:11 |
juergbi | or maybe I forgot | 10:11 |
juergbi | thanks | 10:11 |
abderrahim[m] | like it only caches things correctly for BuildElement, not for all Elements | 10:11 |
juergbi | https://gitlab.com/BuildStream/buildstream/-/issues/1263 | 10:11 |
juergbi | ah, that could be | 10:12 |
juergbi | robjh: what | 10:12 |
juergbi | robjh: what's the element kind in the shell failure? | 10:13 |
robjh | juergbi, this one is a kind: script | 10:13 |
juergbi | ok, thanks, maybe there is an issue in scriptelement | 10:13 |
robjh | ive worked around the issue by putting all the build deps into a compose element, checking it out and chrooting into it | 10:13 |
juergbi | as root cause for all of this | 10:13 |
robjh | juergbi, changing it to a manual and im able to get a shell | 10:15 |
abderrahim[m] | https://gitlab.com/BuildStream/buildstream/-/blob/master/src/buildstream/element.py#L1654 | 10:15 |
robjh | (<juergbi> both from the interactive build session and with `bst shell` afterwards) `bst shell` you say? | 10:17 |
juergbi | yes, bst shell --build can use cached buildtrees | 10:18 |
juergbi | but the behavior will/should be the same as in the interactive build session. so if it already fails there, bst shell will probably fail as well | 10:18 |
robjh | i'll make a note of that | 10:18 |
robjh | Do you want to use the cached buildtree? [y/N]: y | 10:26 |
robjh | WARNING: using a buildtree from a failed build. | 10:26 |
robjh | shocker! | 10:26 |
*** SamThursfield[m] has joined #buildstream | 10:36 | |
SamThursfield[m] | registering your nickname on the Matrix IRC bridge is certainly a faff. I felt like i was trying to get into a berlin nightclub :) | 10:37 |
*** benbrown_ has joined #buildstream | 10:47 | |
*** juergbi` has joined #buildstream | 10:48 | |
*** benbrown has quit IRC | 10:49 | |
*** juergbi has quit IRC | 10:49 | |
*** juergbi` is now known as juergbi | 10:51 | |
abderrahim[m] | Sam Thursfield: :) | 10:51 |
juergbi | Hi SamThursfield[m] o/ | 10:52 |
tristan_ | It's been a while since there's been a 'faff', about as long a time since there was a SamThursfield[m] ! | 10:52 |
tristan_ | coincidence ? hmmmm | 10:53 |
SamThursfield[m] | i've been teaching more people to faff, though | 10:53 |
*** tristan_ has quit IRC | 11:17 | |
robjh | im trying to stage a dependency at a specific location. I have a config: layout: section but buildstream is complaining that layout is an unexpected key. has the way you do this changed since 1.4? | 11:24 |
juergbi | robjh: is this in a script element? I don't think it has changed | 11:25 |
robjh | this is in a kind: compose | 11:26 |
juergbi | it's not supported for standard build elements | 11:26 |
robjh | ahhh | 11:26 |
juergbi | compose doesn't support it either | 11:26 |
juergbi | it's a scriptelement feature | 11:26 |
WSalmon | robjh, only scripts and other plugins that do it themselves can do layout-like things | 11:26 |
juergbi | there has been discussion about generalizing this in the yaml format | 11:27 |
WSalmon | you can have a pretty basic script element move it tho, and then you dep on that | 11:27 |
WSalmon | but i think you lose all the filter domains etc | 11:27 |
WSalmon | so its not ideal | 11:27 |
juergbi | https://gitlab.com/BuildStream/buildstream/-/merge_requests/894 is related but it's unlikely to get merged as is | 11:27 |
robjh | ack, thanks | 11:28 |
douglaswinship | If buildbox-casd dies during a CI run, and causes the job to fail, is that generally a problem with the branch I uploaded, or something going wrong with the runners? I'm pretty sure I've seen it before, and at the time I got the impression it was because the runners or the cache were reset in some way, while the CI job was running. | 11:50 |
abderrahim[m] | IME buildbox only dies when it runs out of disk space | 11:51 |
abderrahim[m] | you can add the buildstream logs to the CI artifacts (possibly only on failure) to investigate | 11:52 |
benschubert | tristan: https://gitlab.com/snippets/1991725 | 11:54 |
*** Trevinho has quit IRC | 11:56 | |
*** Trevinho has joined #buildstream | 11:56 | |
WSalmon | douglaswinship, buildbox-casd is still using a lot of memory, you will note that i limit the amount that it does to stop it running out of ram in some places | 11:57 |
WSalmon | especially on pull | 11:57 |
juergbi | WSalmon: do you still see this with 0.0.9+? | 11:58 |
WSalmon | yep, i had it today locally | 11:58 |
WSalmon | i think it might be better | 11:58 |
juergbi | hm, I thought I fixed all these issues but maybe I missed something | 11:59 |
WSalmon | i am trying to do too much at once atm so i will double check when i get a chance | 11:59 |
WSalmon | and update/create issues | 11:59 |
WSalmon | juergbi, it seems a good bit quicker tho, so thank you for all your efforts | 12:00 |
WSalmon | douglaswinship, quite a few projects capture the casd logs tho, worth a check | 12:00 |
WSalmon | i think fd do | 12:00 |
douglaswinship | will have a look. Not sure yet what i'd expect to see. | 12:05 |
*** mohan43u has quit IRC | 12:05 | |
*** mohan43u has joined #buildstream | 12:08 | |
*** mohan43u has quit IRC | 12:11 | |
*** mohan43u has joined #buildstream | 12:14 | |
*** Frazer has quit IRC | 12:18 | |
*** Frazer6 has joined #buildstream | 12:18 | |
*** pointswaves has quit IRC | 12:21 | |
*** Frazer61 has joined #buildstream | 12:52 | |
*** Frazer6 has quit IRC | 12:52 | |
*** tristan_ has joined #buildstream | 13:08 | |
*** ChanServ sets mode: +o tristan_ | 13:08 | |
tristan_ | benschubert, any luck ? | 13:08 |
benschubert | tristan_: https://gitlab.com/snippets/1991725 | 13:08 |
tristan_ | still not quite perfect | 13:10 |
tristan_ | how many elements is this btw ? | 13:11 |
tristan_ | Would be nice to know just how severely worse performance is | 13:11 |
tristan_ | right now we're down to half a second slower out of around 6 seconds | 13:12 |
*** dftxbs3e has quit IRC | 13:29 | |
*** dftxbs3e has joined #buildstream | 13:32 | |
benschubert | tristan_: 6k elements | 13:33 |
benschubert | I'd say 15-20% slower is quite a big thing for the `show` :/ | 13:34 |
tristan_ | I'm trying structs and PyMem_Malloc/PyMem_Free | 13:43 |
Frazer61 | hey, not sure if this should be an issue for the mailing list or a gitlab issue, but i was wondering if it's wanted to add BuildStream to the linguist repository https://github.com/github/linguist ? mainly so BuildStream can show up in projects alongside the other languages used in the project on github or gitlab down the road and give | 13:45 |
Frazer61 | BuildStream more notice? doesn't seem to be too hard to do https://github.com/github/linguist/blob/master/CONTRIBUTING.md#adding-a-language | 13:45 |
* tristan_ doesnt quite get struct pointer member dereferencing in python | 13:49 | |
tristan_ | Struct *struct = <Struct *>PyMem_Malloc(...) ... struct->member is disliked by cython | 13:50 |
benschubert | tristan_: that indeed seems weird :/ | 13:50 |
tristan_ | struct.member doesnt cause any compile time error, but seems obviously wrong | 13:50 |
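(For reference, a small sketch of why that is: Cython has no '->' operator; attribute access with '.' on a pointer is what gets translated to '->' in the generated C, so 'step.member' on a struct pointer is in fact the intended spelling. The struct and field names below are illustrative:)

```cython
from cpython.mem cimport PyMem_Malloc, PyMem_Free

cdef struct Step:
    int refcount

def example():
    cdef Step *step = <Step *>PyMem_Malloc(sizeof(Step))
    if step == NULL:
        raise MemoryError()
    step.refcount = 1   # compiles to step->refcount in the generated C
    PyMem_Free(step)
```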
*** hasebastian has joined #buildstream | 14:28 | |
tristan_ | I think 15-20% slower is an exaggeration: we're .5 seconds slower for 6k elements to show (where it already took almost 6 seconds). I think that is at maximum 10% slower in show (and probably entirely negligible for anything other than show) | 14:52 |
benschubert | tristan_: ah sorry, I was still on the earlier numbers | 14:56 |
benschubert | yeah, 10% is still too much though | 14:57 |
benschubert | even the current 6s is difficult to work with (and that's for only 6k elements) | 14:57 |
tristan_ | *only* | 14:57 |
tristan_ | heh | 14:57 |
tristan_ | Like as if I'll ever see a project in my life with that many elements ;-) | 14:57 |
benschubert | I can tell you I need at least one more order of magnitude :) | 14:57 |
benschubert | So yeah, I care about those .5s :) | 14:59 |
tristan_ | yeah yeah... you'll get em back | 14:59 |
tristan_ | I'll have to ramp up the voltage though | 15:00 |
* tristan_ has been getting segfaults and this time around `gdb src/buildstream/_variables.cpython-37m-x86_64-linux-gnu.so core` is not being useful | 15:00 | |
tristan_ | So I'll start with a little corner below, keeping the rest intact, and try out some things more incrementally | 15:01 |
benschubert | Let me know if I can help, instead of only being picky ;) | 15:01 |
tristan_ | nah, there's not much room | 15:02 |
tristan_ | better to concentrate on other 2.0 related items | 15:02 |
tristan_ | I should get there in one more day, maaaaaybe two | 15:03 |
benschubert | ok! | 15:03 |
benschubert | I'll go back hammering my threads then | 15:03 |
WSalmon | juergbi, et al. i have just had https://gitlab.com/celduin/bsps/bst-boardsupport/-/jobs/619700756 fail to push an index to https://push.public.aws.celduin.co.uk or http://cache.pointswaves.co.uk:50053, the second server does not have any proxy in front of it, it's just running a reasonably recent nightly bst docker image with the artifact server. the second one has the artifact falsely reporting that it's already there far less tho, so i'm truly | 15:33 |
WSalmon | perplexed, the second is in a Scaleway data center while the runner and the first cache are both in | 15:33 |
WSalmon | aws Ireland | 15:34 |
juergbi | WSalmon: maybe the remaining issue is related to the push issue I mentioned last week or so | 15:35 |
juergbi | which should be fixed by https://gitlab.com/BuildStream/buildstream/-/merge_requests/1976 | 15:35 |
juergbi | I also have a branch now that uses the Remote Asset API. however, the BuildStream core logic hasn't changed that much, so it might not change anything | 15:37 |
WSalmon | so line 1385 is [--:--:--][c6f8bfed][ push:freedesktop-sdk.bst:bootstrap/diffutils.bst] INFO Remote (http://cache.pointswaves.co.uk:50053) already has artifact c6f8bfed cached but sh-5.0# ls /data/artifacts/refs/freedesktop-sdk/bootstrap-diffutils/ | 15:46 |
WSalmon | 68e95fa6fb1e1751f01ee175f2bad646443b2309bb4be9dd0aef85b20bdb71e0 9a9a994c84e7111f7fa6dbbac4cd755ed2b1799382b5f2624accd4a86a444a0b | 15:46 |
WSalmon | 81a4239a68f96a594bb2c18a6dd9edb74947c77c59d942445fdace52c99f0272 9b5e41aa0123519fc4b32d6ace6f2d783b14bc59f79b24a21412e6b490b5b47b | 15:46 |
WSalmon | sh-5.0# | 15:46 |
WSalmon | there is no artifact for that key, its not that it is not updating it | 15:46 |
WSalmon | juergbi^ | 15:46 |
coldtom | yup that's the behaviour i observed too WSalmon, there's no artifacts in the cache | 15:47 |
WSalmon | and there is 10s of Gb of disk left and plenty of cpu and ram | 15:47 |
coldtom | seems like it happens consistently on the cache key too | 15:47 |
WSalmon | but my server does it a lot less than yours | 15:48 |
WSalmon | i wondered if it was a payload thing but the artifacts are pretty small | 15:48 |
WSalmon | sh-5.0# cat /data/artifacts/refs/freedesktop-sdk/bootstrap-coreutils/e27244816ea8bba93588004b40d2d1183ace971a512a5e1b8f67e06494ccec90 | 15:49 |
WSalmon | �succeeded*@e27244816ea8bba93588004b40d2d1183ace971a512a5e1b8f67e06494ccec902@c97c7cbe50806804e80fbe58b6d791e70c6cf163171bd039c7869bbe4b19bf39BD | 15:49 |
WSalmon | @7383888b4c7ce5616a33965cfbfda64a830558ea820a1c11effb994999f200ecNJw | 15:49 |
WSalmon | freedesktop-sdk"bootstrap/coreutils-build-deps.bst�@a8c01fedda1bcff6e47b5ffb5aeee40a096bcdbefc22715816d534908e810f32RE | 15:49 |
WSalmon | @4e73db136d9542d8e28e632f2adb18cc25f94df4cbe6df58d4240e4334c44a17Za | 15:49 |
WSalmon | e2724481-build.7008.logF | 15:49 |
WSalmon | it seems like once the server decides it doesn't like that artifact it does it again, i wondered if there could be some weird thing with files not getting closed but i presume the with block should sort that | 15:51 |
WSalmon | and the atomic write | 15:51 |
juergbi | WSalmon: do you use non-strict mode? | 15:54 |
WSalmon | no | 15:54 |
juergbi | it might still be related to weak cache keys, though | 15:55 |
juergbi | it's possible that the weak cache key already exists on the server and that alone may be sufficient to trigger that message, due to a bug | 15:56 |
juergbi | !1976 should fix this as well | 15:56 |
WSalmon | that would make more sense as the first build when we redeploy the server doesn't have issues | 15:56 |
WSalmon | *as far as i have seen | 15:57 |
juergbi | yes, the loop in _push_artifact_proto is buggy | 15:57 |
WSalmon | will !1976 be enough? | 15:58 |
WSalmon | it takes a while to build the docker image i will use to test this so i can build one for that MR but I can hold back if more is likely to change? | 15:59 |
WSalmon | these are all client side changes right? looking at the MR that's my impression.. | 16:00 |
juergbi | yes, I don't see this issue with the version in !1976 based on reading the code | 16:01 |
juergbi | yes | 16:01 |
WSalmon | thanks juergbi | 16:01 |
WSalmon | i spent a while looking at both the client and server code but not that bit.. | 16:03 |
WSalmon | always the way | 16:03 |
juergbi | abderrahim[m]: I've noticed your source-cache branch. that will likely conflict with my remote-asset branch | 16:45 |
juergbi | I already have something in mind for a related speedup, which is to use casd's StageTree instead of staging previous sources in BuildStream itself. casd will use buildbox-fuse where available, so staging will be instant | 16:46 |
abderrahim[m] | ok | 16:47 |
juergbi | (and when planned capture optimizations are in place, the capture part will also be much faster) | 16:47 |
abderrahim[m] | let's see :) | 16:47 |
juergbi | can hopefully get my changes into an MR soon | 16:47 |
abderrahim[m] | any ETA on the remote-asset? | 16:48 |
juergbi | the main part is working now in my branch, all tests are passing | 16:49 |
juergbi | need to fix up my casd branch | 16:49 |
juergbi | and then it might soon be ready for an MR | 16:49 |
juergbi | probably at some point next week | 16:50 |
juergbi | this is just the first step, though, moving over to the new protocol. a few additional things will be changed, especially in the source cache area | 16:51 |
abderrahim[m] | ack | 16:54 |
abderrahim[m] | I'm more interested in the new protocol :) | 16:54 |
juergbi | abderrahim[m]: the changes are pushed in juerg/remote-asset if you're interested | 16:58 |
*** hasebastian has quit IRC | 17:17 | |
*** santi has quit IRC | 17:44 | |
*** tristan_ has quit IRC | 19:31 | |
*** tristan has joined #buildstream | 19:32 | |
*** ChanServ sets mode: +o tristan | 19:33 | |
tristan | benschubert, I hit the mark ! | 19:49 |
tristan | Well, according to the variables test, I'm getting better results | 19:49 |
tristan | And I'm only half way through structifying | 19:50 |
tristan | I'm not sure about dropping 'str' for 'char *' though, if we do, I'd still want to take advantage of 'dict' | 19:50 |
tristan | and intern strings | 19:50 |
* tristan not sure if that's possible or if it's worth bringing in a hash table implementation | 19:51 | |
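(A small plain-Python illustration of the interning idea mentioned above, with made-up example strings: sys.intern() returns one canonical object per string value, so repeated keys share storage, comparisons can short-circuit on identity, and dict lookups benefit from the cached hash:)

```python
import sys

a = sys.intern("%{prefix}/bin")
b = sys.intern("%{prefix}" + "/bin")   # built separately, same value

assert a is b                # interning hands back the same object
variables = {a: "/usr/bin"}  # dict lookup hits the identity fast path
assert variables[b] == "/usr/bin"
```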
* tristan pushes tristan/partial-variables for the night (or early morning) | 19:53 | |
tristan | still pretty dirty in there... but speed is getting nicer | 19:53 |
tristan | PyCapsule ! | 20:12 |
*** benschubert has quit IRC | 21:31 |