IRC logs for #buildstream for Friday, 2020-07-03

*** benschubert has quit IRC00:01
*** phildawson has quit IRC02:09
*** phildawson has joined #buildstream02:22
*** tristan has quit IRC05:18
*** tristan has joined #buildstream06:13
*** ChanServ sets mode: +o tristan06:13
*** tristan has quit IRC06:30
*** tristan has joined #buildstream06:39
*** ChanServ sets mode: +o tristan06:39
*** benschubert has joined #buildstream06:57
tristanbenschubert, Ummm, your 6K element test project.... is it 6K elements *wide* by chance ?07:30
tristanIt appears to be impossible to have a project that is > 1K elements *tall*07:30
benschuberttristan: No, it's not a flat project07:30
benschubertoh?07:30
benschubertIt's definitely not 1k elements tall though07:30
* tristan gets recursion error07:30
benschubertit's based on `base-files` in debian (and all its deps)07:30
* tristan used a simple script to make a chain of elements depend on eachother07:31
tristanI see07:31
tristanYou know, we have an `except RecursionError` in _frontend/app.py which says, "Sorry, you'r SOL"07:31
benschubertwait what?07:31
tristanin more professional language07:32
benschubertah, I got worried :'D07:32
tristanSo that you don't get to see any stack trace07:32
tristanI guess that's better than a stack trace07:32
benschubertunless you --debug :)07:32
tristanReally ?07:33
* tristan uncomments07:33
benschubertat least it used to be the case07:33
benschubertif it's not the case anymore we need to re-enable that07:33
* tristan also noticed that it appears that --error-lines is getting ignored07:33
tristanI get the stack trace in a BUG message if I uncomment that handling07:34
tristanBut then, it shows me a hard coded limit of stderr lines07:34
tristanor of exception lines07:34
tristanWithout consulting Context.log_error_lines07:34
* tristan guesses we really need to have a test for that if we want people to pay attention... sadly :-S07:35
tristanmaybe it's just with stack traces07:36
tristanno no07:36
tristanbenschubert, --debug doesnt cause the RecursionError stacktrace to be printed either ;-)07:36
tristanAaaaaanyway07:37
tristanPolish for another day, and hope it stays polished after that07:37
benschubertyeah :/07:37
* tristan doesn't think we need to cover every little detail with a test case, and would rather expect people to fully understand code they are replacing07:38
benschubertoh by the way, I had a wild thought the other day: we are using mypy for the public API, how do you find it? Useful? I was wondering if we should switch from "only the minimum for the public API to pass" to "as you wish, as long as it's well made and not blocking others" ?07:38
tristanI would like to get to a place where it is *everywhere* so long as there is no runtime overhead07:38
tristanI don't know what is the official policy but, I guess that it makes sense to mandate it in public API07:39
tristanGetting it *everywhere* would force us to restructure modules to avoid any circular dependencies of modules at all07:39
benschubertyeah currently we've said "only public"07:39
tristanWould would be the main benefit I would see07:39
benschubertnot necessarily, but yeah that could help with such refactors at least07:40
tristanI mean, right now we think it's okay to call a method on an object which we did not import it's class07:40
tristanThat kind of stuff would become impossible07:40
tristannot sure, but thought it might07:41
benschubertstill possible with forward declarations and If TYPECHECKING guards07:41
benschubertso it's more work but it's possible07:41
benschuberthowever it becomes obvious :)07:41
tristanYeah, that too, we can expose the warts in their horrible glory07:41
benschubertso would you be rather in favor if I start a ML thread about allowing mypy anywhere we want?07:41
tristanI think I've already pushed code which adds annotations in non-public API07:42
tristannobody caught me doing it ;-)07:42
benschubertok I take that for a yes :D07:42
tristanPluginFactory07:42
tristanThat uses some, not completely but some07:42
benschubertsecond point I wanted to ask: I have found `black` to be very useful for our formatting. One last thing bugs me: our import statements are never consistent. Would you be fine with using `isort` in addition to sort all our includes _in a single_possible_way_ ?07:43
tristanOf course, it makes no sense for cdef cython things, but I guess outward facing def/cpdef functions on cython classes can still do it right ?07:43
tristanwhat is isort ?07:43
benschubertfor cython, we can use .pyi files like I did for yaml.pyx :)07:44
benschubertisort is a tool that focus on one thing: sorting import statements for python07:44
benschubertit has an autoformat mode and a check mode07:44
tristanright, I would not favor alphabetic, but the webpage says something about sections07:44
tristannot sure how it would work07:44
benschubertyou can decide, I want "stdlib - htird party - first party - tests" for example07:44
tristanwe have a nicely treeish thing07:44
benschubertand then inside each section you can say: "normal imports first, then "from ... import", sorted alphabetically07:45
benschubertah you prefer the tree?07:45
tristanSo I'd like to have [standard libraries] -> [parent package] -> [local package] imports07:45
tristanI would not like those to be interspersed07:46
benschubertyeah, you can ask for this :)07:46
tristanfrom . import foo goes last07:46
tristanpreceded by from ..07:46
benschubertSo if we can get such an order, you'd be in favor of an auto-import formatter?07:46
tristanYep :)07:46
tristanThat's pretty much my only nit I think07:47
tristanYeah07:47
benschubertok! I'll see what I can do and push it to the list then, thanks07:47
traveltissuestristan: is there a planned release of 1.93.4.dev0 to pypi?07:48
tristantraveltissues, not particularly07:49
traveltissuesit would be useful is that were published. would you please consider that07:49
tristantraveltissues, pypi for dev snapshots at this point is more of an on demand/depending on whether cs-shadow feels like it kinda thing07:49
tristanPushing out tags is cheap07:50
benschuberttristan: I can ping cs-shadow to ask him if he can push it if you need it07:50
tristanGoing and doing a whole pypi dance is another thing, we probably have another tag out by the time a pypi dance is performed07:50
tristanbenschubert, Ask traveltissues ... I'm not the one expressing a need :D07:50
benschubertoups, I means traveltissues07:50
benschubertmy autocomplete tricked me :)07:51
tristanAh, 't' !07:51
tristan'tr' !07:51
tristantraveltissues, we're sorted in the same bucket ;-)07:51
tristanbucket buddies07:51
traveltissueswell i don't strictly need it, it would be nicer if we could add the local dir to the sys.path when installing in dev mode07:53
benschuberttraveltissues: what do you mean?07:54
traveltissuesif you install bst-plugins-experimental in dev mode07:55
traveltissues>>> print(sys.path)07:55
traveltissues['', '/usr/lib/python38.zip', '/usr/lib/python3.8', '/usr/lib/python3.8/lib-dynload', '/root/.local/lib/python3.8/site-packages', '/bst-plugins-experimental/src', '/usr/lib/python3.8/site-packages']07:55
traveltissuestrying the same thing with bst and copying the source, libs, and binary to another container breaks it07:55
traveltissuesbash-5.0# bst07:55
traveltissuesTraceback (most recent call last):07:55
traveltissues  File "/usr/local/bin/bst", line 5, in <module>07:55
traveltissues    from buildstream._frontend import cli07:55
traveltissuesModuleNotFoundError: No module named 'buildstream'07:55
benschubertwhat do you call dev mode? `pip install -e ` ?07:55
traveltissuesyes07:55
benschubertare you doing this in a venv?07:56
traveltissuesin docker07:56
benschubertso you are `pip install -e buildstream` and it doesn't work?07:56
traveltissuesno, the installation works, copying this to another container doesn't. that's not the surprising thing, what i'm wondering is why is bst-plugins-experimental still found when it's copied but buildstream is not07:57
benschubertyou might not be copying everything correctly07:58
benschubertpip install -e does quite a few things under the hood07:58
benschubertI'm curious why you need to copy stuff like that?07:58
traveltissuesyes, but i'm trying to understand what the difference is here if they're both installed the same way07:58
benschubertbst-plugins-experimental is not a cli07:58
tristanI think the next tag will be more interesting to have on pypi for what it's worth07:59
traveltissuesi'm copying it because the container i have doesn't have the tools to install, it doesn't have pip for example07:59
tristanThe last tag doesnt include the juncle merge07:59
tristanWhich is a game changer07:59
benschuberttraveltissues: ./setup.py develop would work too07:59
tristanI'd like to land a fix to Element.search() supporting full paths properly first and get all the full-pathing cases right08:00
traveltissuesnice08:00
benschuberttristan: speaking of junction jungle.... it's not possible to have a project depending on itself right?08:00
traveltissuesi didn't know that, thanks benschubert08:00
tristanbenschubert, yes it is08:00
benschuberttraveltissues: no problem, guess it should solve your problem :)08:00
benschuberttristan: oh! perfect08:00
traveltissuesyes, it did08:00
* benschubert is going back to his recursive project08:01
tristanbenschubert, I think it needs to be 'internal' though08:01
tristanwe don't have a notation to express 'self' in 'duplicates'08:01
benschuberttristan: yep my aim is to have it internal anyways08:02
tristanBut I think it's alright this way, the use cases for self junctioning I can think of should be internal, and we could extend with that if the need arises08:02
benschubertyep, my aim is to have a stage3 rebuild itself08:02
benschubertso internal is exactly what I need08:02
tristanyeah, bootstrapping is the main use case08:02
tristan(the known one anyway)08:03
tristanSo it looks like for me to do meaningful benchmarks, I really need to run the same intense project 1000 times and take an average08:04
tristanThe results are so erratic08:04
benschubertyeah the way I run them is 4 runs and I discard the first on the debian-like08:05
tristanonly 4 ? maybe with 6K elements you have more stable results08:05
benschubertyep08:06
tristanWith my measly 900K tall project it varies a lot08:06
tristan900 tall08:06
tristan-K08:06
benschubertyou've seen the difference in timing in my results :) (the third column)08:06
tristanyeah08:07
tristanOk what should I do :-S08:07
tristanHmmm08:07
tristanbenschubert, mind if I toss you some branches to run benchmarks, you can do this in background without too much distraction ?08:08
tristanI'd like to try without the unneeded ref/unref calls, and try a more complete array approach to see if that improves things08:08
tristanthe first one I can dish out in a minute08:08
benschubertSure, won't be the same machine and will take longer but I can :)08:09
tristanalright08:09
tristanbenschubert, pushed removal of refcounting08:13
benschuberton it :)08:13
tristanAha08:14
tristanbenschubert, wait, can you run it while comparing the second last commit too ?08:14
tristanSince it's a different machine, I'd like to compare the cost of that with the last, see if it makes a difference08:15
benschubertsure08:15
traveltissuesi've been out of the loop for a while. what happened with the autobenchmark bot effort?08:20
tristanOk that will be helpful, I think I might have a breakthrough too, I'll let you know08:20
* tristan has spotted a crucial difference between the old and new _variables code08:20
benschuberttraveltissues: not sure it is "ready"yet? Haven't heard about it for a bit08:23
tristanLast I heard, we had a plan to at least run `bst build` for every commit in the benchmarks repo, without gating it on CI, and automatically updating a webpage with that result08:29
tristanEven that never happened08:30
tristanI think it just involves using an API key and interacting with gitlab restful API08:30
tristanI noticed where my code is more complex.08:37
tristanThe old code has the _expstr_map, which is a dictionary of "either list or string"08:37
tristanIf it's a string, then it's resolved, if it's a list, then it needs to be recursively resolved08:37
tristanThere is a quicker return for resolved values08:38
benschubertoh good catch08:38
tristanAdding a second dictionary for resolved values gets much similar performance, *but* it would much increase memory signature08:39
tristanLooking at the current code, it would seem that the way things do get resolved, we don't technically need the ValueClasses for already resolved values08:39
tristani.e. you cannot have a circular reference error08:40
tristanor an unresolved counterpart, for something that is already resolved08:40
tristanSo... I could probably change this to do a similar "str-or-Value" dictionary08:40
tristancrap, that might not be it :-S08:44
* tristan was looking at false results08:44
tristanI could still be close though08:44
tristanHmmm, ok I need to rethink this08:45
tristantoo many dictionary lookups while resolving a variable, that is what is changing things08:45
tristanAhhhh, ok so tiny quicker return could be helpful but maybe not08:46
tristanHowever, when I do the reverse loop over value dependencies, instead of indexing variables, the resolved values of previously iterated values could be readily in context08:47
tristanthis would match the nature of the recursive resolve_variable_and_use_result algo we had before08:47
tristanhttps://gitlab.com/BuildStream/buildstream/-/blob/tristan/partial-variables/src/buildstream/_variables.pyx#L33208:48
tristaniter_value.resolve() right there, could be given the previously resolved values, the order of the array should ensure that any value it *wants*, is the next value in the list08:49
tristanso if a value needs 2 variables, it takes the last 2 previously resolved ones08:49
tristan(whether they were resolved this round or not)08:49
WSalmonjuergbi, i reran this job quite a few times, it always crashes while just pulling, https://gitlab.com/celduin/bsps/bst-boardsupport/-/jobs/622631814 but https://gitlab.com/celduin/bsps/bst-boardsupport/-/commit/b3d248c38efb191088258df8d5a2d090de8fc038 this change allows it to get MUCH further, im mostly running 1.93.4 see master of https://gitlab.com/freedesktop-sdk/infrastructure/freedesktop-sdk-docker-images for the exact versions of everything09:00
WSalmoni presume https://gitlab.com/freedesktop-sdk/infrastructure/freedesktop-sdk-docker-images/-/blob/master/project.refs#L176 is the version of casd that should not cause excessive memeory usage?09:01
cphangWSalmon I thought that the error handling of build box was cleaned up so there are more informative errors now?09:01
WSalmoncphang, i think the issue is that the system is OOM so everything crashes a bit badly09:02
WSalmonwhen i changed the other stages to be limited like that i did it after localy running using the amount of ram that those runners have. but i though we had updated things since then. i should have dont that last job too but gitlab-ci variable subsutition only gose 2 variables deep which makes everything trickythat stage got missed..09:07
WSalmoni didnt mean to send ^ i was trying to delete it and fat finngered but you get the jist09:07
juergbiWSalmon: do you have the kernel log?09:10
juergbito see the casd memory usage at time of OOM kill09:11
WSalmonno, all i have is https://gitlab.com/celduin/bsps/bst-boardsupport/-/jobs/622631814/artifacts/browse/artifact/09:11
juergbicasd 0.0.9 should indeed already have the memory usage fixes but there might be more issues09:11
juergbiWSalmon: how much RAM do these runners have? and are multiple docker containers running on a single runner without RAM restrictions? could it be caused by another container?09:13
juergbiWSalmon: running buildbox-casd with `/usr/bin/time -v` prints the peak memory usage, iirc09:14
juergbiif you want to look into this issue, describing steps that trigger higher than expected peak memory usage in casd would be very helpful09:15
juergbicould be done by isolated pulling09:15
juergbiI thought I did this before and after my memory fixes and it was reasonable afterwards09:16
WSalmonjuergbi, they are amazonec2-instance-type=a1.2xlarge with 16Gb of ram09:16
juergbihowever, my focus was on handling of large files, maybe there is an additional issue with lots of regular sized files (instead of individual large files)09:16
WSalmonthey have one ci job per vm09:16
WSalmonso each ci job has 16GB to its self minus the OS and docker over head, the vms are spun up by docker machine each morning09:17
juergbiok, so presumably 10 pulls at once trigger this on 16 GB09:17
WSalmonyep09:17
WSalmonafaict09:17
juergbishould definitely not be happening09:17
juergbimight be a leak09:18
juergbi(the fixed issue was not a leak, simply not-optimal code requiring high peak usage)09:18
WSalmoni cant rember if i have tride it with 0.0.9 but i have had it use that much ram locally for that job09:18
WSalmon* the x86 verion09:18
juergbithe ISA will likely not make a big difference09:19
juergbiso it should be possible to reproduce this on x86-6409:19
WSalmonthats what i ment by locally09:19
WSalmonincedentally you can pull that arm job on to you x86 machine with `rpi4-features]$ bst -o target_arch aarch64 build deploy/image.bst`09:20
WSalmoni think you have the keys to that cache but its open for anyone to pull from09:21
WSalmonif you clone and run from the sample/rpi4-feature folder09:22
juergbiWSalmon: I think I can reproduce it09:36
WSalmon:D09:36
juergbidoesn't appear to be a memory leak but rather excessive memory usage with large trees09:41
juergbie.g. can be seen by the memory usage of pulling bootstrap/acl-build-deps.bst alone09:41
juergbiWSalmon: https://gitlab.com/BuildGrid/buildbox/buildbox-casd/-/issues/7610:00
tristanWhoa, ok lists are definitely more expensive than arrays10:00
WSalmonthanks juergbi10:03
juergbiI'm just realizing, while this will fix 'the large artifact of small files' issue, it's still not idea for large files10:05
WSalmonjuergbi, they want writing strat to disk?10:07
WSalmonstraight10:08
WSalmonpresumably its worth not pulling to fast as well as then you just get clogged up in IO issues?10:08
WSalmonor will linux handdle that nicely10:08
juergbiwe have to write them to disk, that's where our CAS cache is10:09
juergbi(disk could be tmpfs but that can require a lot of RAM)10:09
WSalmonright, i was just thinking that in some  data centers you have huge pull bandwidth but terible disk IO10:11
WSalmonbut thats probly premiture optomisation, and we just need to get them on to disk and off RAM quicker10:11
juergbiyes, there isn't too much that can be done about this10:12
juergbiexcept striving towards reducing unnecessary pulling - or using machines with tons of RAM and tmpfs10:12
juergbi*and10:13
WSalmoni think bst defauling to unlimited pullers might not be the best bet ether10:15
juergbiit defaults to 10 max10:15
WSalmonah10:18
coldtom10 pullers is still enough to put a lot of stress on a machine ime10:18
WSalmon+110:18
WSalmonEspecally for anything based on FD10:18
benschubertyou can always set the default in your user conf no?10:20
juergbiyes, the best default will vary depending on your machine and your connection10:21
WSalmonbenschubert, yes but it seems silly to have a default that requires you to have a 32Gb machine..? having a function of ram or cpu seems more sane to me, but getting buildbox-casd to write straight to disk seems like the most important10:21
WSalmonif you have a huge machine you may well need to bump it up any way10:22
juergbithe buildbox-casd issue needs to be fixed, of course10:22
juergbiI don't think we should reduce the default in buildstream because of that10:22
benschubertI agree with juergbi, it seems to work well with 10 even on my core 2duo 8GB ram machine10:22
WSalmonbenschubert, you have pulled 10 big chunks of FD from a fast cache like celduins at the same time or were you fetching from sources like github that are quite slow?10:25
benschubertWSalmon: I have a local cache at home, so we're talking 1GBps :)10:26
benschubertand I'm currently working with clang and friends10:27
benschubertmy version of bst is not 100% up to date though10:27
WSalmonumm, interesting, also fetching and pulling work quite diffrenlty i have though for a while that they should have diffrent limiters, as fetching hits the network via the source tool but the pull is via casd10:28
juergbithe issue is with large artifacts10:29
juergbithe relevant fdo-sdk artifacts are gigabytes (single artifact)10:29
juergbias they compose build-deps into an artifact of their own10:29
benschubertjuergbi: but that's a very old bug no? I remember opening it when I had 60GB artifacts10:29
benschubertor was it fixed and came back in?10:30
juergbithere was a capture issue that was fixed10:30
juergbifor pulling I don't think it's a regression10:30
juergbiat least on the buildbox side10:30
juergbiit's very possible we fixed this in buildstream before we handed this task to buildbox-casd10:31
juergbi(or it was correct from the beginning in pre-casd buildstream)10:31
benschuberttristan: https://gitlab.com/BuildStream/buildstream/-/compare/master...bschubert%2Fisort that's roughly what isort would look like, there's still 1-2 details I'd need to fix around comments though10:49
benschubertjuergbi: seems like the only tests now failing are the ones using an artifact_share, do you have any idea where we could have more problems?10:50
juergbibenschubert: BaseRemote.init?10:51
juergbican hopefully get rid of that fairly soon with step 2 of Remote Asset support10:51
benschubertah good point let me try10:52
tristanbenschubert, looking at a modified file (instead of the diff): https://gitlab.com/BuildStream/buildstream/-/blob/2833a6abd562a628f4039e74a1dadf12661d2b6d/src/buildstream/_frontend/app.py ... those comments are somehow preserved from the source code ?11:04
* tristan can see that private modules come before public ones, simply due to the sort, which seems alright11:04
benschuberttristan: yeah the comments are preserved but not moved correctly, I'd need to go through all of them to ensure they are in the correct order / remove the stale ones ;)11:08
benschubertjuergbi: it works! awesome, thanks a lot11:12
benschubertok now cleanup and make a WIP11:13
juergbi:)11:13
tristanbenschubert, so confusing results so far, any word on that benchmark ?11:20
benschubertlet me go and check11:20
tristanit will be interesting just to know if it makes any difference11:20
tristanand if so, how much11:20
* tristan actually *lost* performance by removing unnecessary dict lookups11:21
* tristan palmface11:21
tristandoesnt seem to make any sense11:21
benschuberttristan: https://gitlab.com/snippets/1992419 should realized I forgot to do one of the checks11:22
benschubertwhich other commit did you want again?11:22
tristanoh no :(11:22
tristanI wanted the tip and the before tip11:23
tristanjust to measure the difference with and without refcounting11:23
benschubertok let me restart with the one before11:24
benschubertis just the `bst show` enough?11:24
benschubertI can speed up benchmarks a lot if so :)11:24
tristanyes11:25
tristanits enough11:25
benschuberttristan: i went for everything: https://gitlab.com/snippets/1992419#note_37328953112:00
*** tristan has quit IRC12:02
*** tristan has joined #buildstream12:20
*** ChanServ sets mode: +o tristan12:20
benschubertjuergbi: seems like I've got still more problems :'D https://gitlab.com/BuildStream/buildstream/-/jobs/623129875#L86014:09
benschubertthe diff's https://gitlab.com/BuildStream/buildstream/-/compare/master...bschubert%2Fno-multiprocessing14:10
benschubertI'm not 100% sure how that can happen :/14:10
* douglaswinship is particularly concerned by https://gitlab.com/BuildStream/buildstream/-/jobs/623129875#L602314:16
douglaswinshipThe error messages seem to be getting quiet insulting!14:16
juergbibenschubert: so that's ostree-related, right?14:16
juergbiostree is also known for background thread issues14:16
juergbido you only see this in CI? maybe it depends on the ostree version14:17
benschubertjuergbi: no sorry, line 860 :)14:17
benschubertI've seen the ostree ones and I'll have a look at them14:17
benschubertbut the one line 860 seems worse currently14:17
juergbiah, hm14:18
juergbibenschubert: CASDChannel.get_local_cas() is not thread-safe14:19
benschubertoh gosh :'D14:20
juergbiit assumes that _local_cas is set if _casd_channel is set14:20
benschubertI've got a lock around establish_connection though14:20
benschubertand same on the close14:20
juergbiyes, but the lock is not used in get_local_cas14:21
juergbichanging the conditional in get_local_cas to check for _local_cas probably fixes it14:21
benschuberthttps://gitlab.com/BuildStream/buildstream/-/compare/master...bschubert%2Fno-multiprocessing#6b68d6e75602b4de08330322d00edd6aabf0a718_249_246 isn't that enough?14:21
benschubertI mean, the 3 only get set at the same time right?14:21
juergbican't the GIL switch between threads between these assignments?14:22
benschubertoh14:22
benschubert-_-' right14:22
benschubertgah thanks14:23
benschubertnow ostree...14:23
benschubert    # Use local-only GIO, don't use the DBus-based GVfs backend.14:37
benschubert    # GVfs/DBus uses background threads, which can cause problems14:37
benschubert    # with BuildStream's fork-based process model.14:37
benschubert    os.environ["GIO_USE_VFS"] = "local"14:37
benschubertSeems like we should not have threads? -_-'14:37
juergbibenschubert: maybe GIO initialization is triggered at some earlier point14:43
benschuberteven setting it before the import of ostree doesn't change14:44
juergbitry setting that environment variable early on in buildstream startup, if you can reproduce this14:44
benschubertyeah I can reproduce easily locally14:44
juergbiglib/gio might be triggered by something non-ostree, although I don't know what14:45
WSalmonso i can pull down a artifact build on a diffrent arch but when i try to check it out it tries to run the sandbox, the final element is a script element with no run time deps, i presume bst want to run some final intergrations stops, can i turn that off? https://pastebin.com/cFX2ZGyT14:45
benschubertjuergbi: putting that in my pytest env doesn't change anything either14:47
abderrahim[m]WSalmon: --no-integrate may help14:47
WSalmoni just noticed that but it seems not14:47
WSalmonhttps://pastebin.com/SUkzCW8E14:48
juergbibenschubert: can you do a 'thread apply all bt' in gdb at a point where the extra thread exists?14:48
juergbimaybe the backtrace of that thread provides some hints14:48
WSalmonjuergbi, tristan, benschubert  i would have thought the --no-integration would fix it is that a bug? or am i using it wrong / it dose something else?14:49
juergbiWSalmon: maybe we instantiate the sandbox when we don't need to14:49
WSalmonok, ill go look at the code14:50
juergbiI wouldn't expect artifact checkout --no-integrate to require the sandbox14:50
juergbiit should even work without buildbox-run installed14:50
benschubertjuergbi: I can tell they are not python threads, I'm trying to get more info14:54
benschubertdo you know by any chance where's the code/doc for all of this?14:55
benschubert(I mean ostree python)14:55
juergbibenschubert: there is no Python OSTree code per se. pygobject dynamically generated Python bindings for the libostree C library15:06
benschubertoh :/ less than ideal15:07
juergbithe background thread is very likely to be created by libostree (or a dependency of that), not pygobject15:07
juergbiit may also depend on the HTTP backend of ostree. iirc, there is a compile-time option between curl and libsoup15:08
benschubertso what would you recommend? In the ostree tests we increment the potential number of threads?15:08
tristanI'm not sure I understand the context of why ostree libraries starting threads becomes problematic15:10
tristanI thought it was mostly problematic that it gets called to early when using the fork() model15:10
tristanQuick recommendation though which would result in (A) Easier install story, (B) Lots of code removed, (C) Addressing any conflicts with threads15:11
juergbiwithout fork it should be much less of an issue15:11
tristanWould be: Just stop using python bindings for ostree, and call it normally with Plugin.call()15:11
juergbihowever, it would still be good to be sure that everything is cleaned up after each test15:11
juergbii.e., the tests are isolated15:12
benschubertyeah that's the problem, for testing15:12
tristanThat would also have the benefit of (D): Not special casing that Python API break anymore15:12
juergbiif the command line tool is sufficient for the purpose of the ostree source, that might indeed make sense15:12
juergbioriginally we depended on libostree as it was also used as our artifact cache15:13
juergbiso the CLI might not have be sufficient15:13
juergbibut purely for the source, it might be15:13
benschubertthat's true that this would solve the problem15:13
benschubertand we need ostree-cli anyways to run the plugin15:13
benschubertso in the end...15:13
tristanI think the initial decision to use python bindings was rather arbitrary, but definitely driven by using it as artifact cache15:14
tristaniirc there was a deeper API we needed for the ostree pushing thing to work well15:14
benschubertSo you'd be both fine with ostree-cli? We'd loose the reporting of progress, which I think is an improvement :)15:14
tristanHaha, that progress has always been whacky15:15
tristanpercentage up and down as you discover during pull that your total is changing15:15
*** hasebastian has joined #buildstream15:15
benschubertok! I'll see what I can do15:15
juergbiI'd be fine with that. we don't have this for other sources either15:15
juergbibenschubert: that said, fixing the background thread might be as simple as adding a close() call somewhere15:16
juergbiwith the forked job processes, background threads get eliminated by termination, so we didn't care15:16
benschubertjuergbi: I've dir() on every ostree-related object we create15:16
benschubertnone has a `close` method15:16
benschubertafaik15:16
tristanMyeah, I don't expect state of the art15:17
tristanI don't think it's GIOStream based15:17
juergbibenschubert: have you also tried setting .repo to None?15:17
tristanI was thinking that15:18
juergbiessentially we might need the inverse of the ensure() method15:18
benschubertI've `del repo`15:18
tristanEven if it's GC'ed syncrhonously and "disposed", it would not be uncommon for the thread itself to asynchronously complete (no g_thread_join() in dispose())15:19
juergbicould be the case, yes15:19
juergbiif ostree cli works, let's go for that15:20
benschubertwe're waiting up to 5s for all threads to close15:20
* tristan recalls spending time to make this happen synchronously for EDS EBookClient handles, and noticing that there wasn't much interest in having this work15:20
* tristan rubs tabasco into his eyeball15:23
tristansad15:23
benschubertWhat part of the code prompted you to do that so I can avoid it?15:23
tristanRight before the return from the pizza() function15:24
tristanwhew ok lets give this another go... I'm gonna throw it ALL away, _variable.py... the Values the ValueClass, the ValueParts, all the nice stuff15:26
tristanAnd see if I can't get a performant loop-instead-of-recurse using the ol' list returned from python regex15:26
tristanI'm not particularly hopeful, every way I look at it, avoiding recursion requires allocating data structures15:27
tristanSo it would logically be slower than pushing a stack frame15:27
Frazer61hi, im getting an issue where it says "Did not find 'bzr' in PATH" when testing in the venv, i have tried installing bzr but it doesnt seem to fix it unfortunately15:29
tristanThat is odd15:31
tristanand when you run `tox -e venv /bin/bash` and type `which bzr`, do you have it ?15:31
Frazer61its when i try to build cachekey. nothing, when i try to install bzr in the venv https://pastebin.com/hQS9ePxB15:32
tristanuhhh15:33
tristanis there a reason to install bzr into the venv itself ?15:34
tristanmaybe bzr is implemented in python (I don't keep track) but it shouldnt matter I think15:34
Frazer61https://pastebin.com/y1s4F5S115:34
Frazer61for bst build in cachekey15:34
tristanright so when you are not in the venv, you have host bzr installed ?15:35
tristan`which bzr` ?15:35
tristanand when you are in `tox -e venv /bin/bash`, `which bzr` doesn't find your host bzr ?15:36
Frazer61no, because when i try to install it, it gives the error shown in the first pastebin. happens locally and in venv15:36
tristanok but what OS are you running ?15:37
Frazer61fedora 3215:37
tristandon't you have a reliable way of installing bzr on your host ?15:37
Frazer61using pip to try and install it15:37
tristanwhat is the latest rpm -i... `aptitude` ?15:37
tristanNo thats not fedora15:37
tristanI forget15:37
tristanFrazer61, try using a reliable method provided by your distro to install bzr, that should work15:38
Frazer61when i try to install bzr it tries to install breezy, but i thought it was bazaar. is it fine to continue?15:39
Frazer61via dnf15:39
* tristan doesnt know what it is on fedora :-S15:40
Frazer61dnf is equivalent to apt15:40
tristanhttps://fedoraproject.org/wiki/Changes/ReplaceBazaarWithBreezy15:40
tristanI mean I don't know what bzr is on fedora :)15:40
coldtombreezy is bazaar iirc15:40
Frazer61nice bit of confusion :), thanks will give it a go15:41
tristanFrazer61, yeah so it looks like it will work, different implementation of the same thing15:41
Frazer61y15:41
tristanindeed confusing15:41
Frazer61sorry ignore that15:41
*** phildawson has quit IRC15:41
Frazer61hopefully should get the first bit of bst source show working15:41
tristanFrazer61, awesome, please do remember to summarize and get that formal proposal on the list though15:42
tristanI know, hacking is more fun15:42
Frazer61will do, just using elements from fetch to get it working and go from there to make a proposal for what we want as a final solution15:43
tristanIt should pass without issue but not everyone is on IRC all the time, so it's better to have it formally on the list, and we can link to that in a resulting merge request etc for good provenance15:43
tristanYup :)15:43
benschuberttristan: it does remove quite a bit of code from ostree to use the cli :D16:00
benschuberti should have a prospective PR soon16:01
WSalmonhttps://gitlab.com/BuildStream/buildstream/-/commits/willsalmon/checkout_no_sandbox feels really close to working, do we have a patten for creating a vdir with out a sandbox? i can just make a temp dir context and trigger one with that.. but wonder if we already do it some were i can make sure i am consistent with. juergbi maybe?16:03
juergbiWSalmon: could just create a CASBasedDirectory directly16:06
juergbiWSalmon: a possible alternative is to create a SandboxDummy16:07
juergbinot sure off the top of my head what's the cleanest approach16:07
WSalmonthanks juergbi16:17
benschubertjuergbi: tristan https://gitlab.com/BuildStream/bst-plugins-experimental/-/merge_requests/123/ still two tests failing but should work well :)16:40
juergbinice16:41
benschubertnot sure what's the deal with the failures :/16:59
benschubertthought I had reproduced the API correctly16:59
tristannice17:01
tristanwhat are the failures ? gpg key related ?17:02
tristanI think I added a test recently to ensure that keeps working17:02
tristanI'm not sure if command line would tolerate changes of gpg key presence well either17:03
tristani.e. it happens in repo initialization, so if you add a gpg key or remove it later, it should probably mean you need a seperate mirror directory for it ?17:04
tristanrepo url + hash of gpg key ?17:04
tristanEither that or, `ostree init` needs to be able to reconfigure the repo at the beginning of every session ?17:05
benschuberttristan: one sec, finishing the run, but apparently not, more fetch related afaict?17:06
tristannot spotting the problem easily17:12
* tristan has been looking it over17:12
tristanwhat happens ?17:12
benschubertI'm just waiting on a test to finish to share, will be simpler17:13
benschubertbut basically `remote origin not found` is one17:13
benschubertand `${ref}.commit doesn't exist`17:13
benschubert(two different errors)17:14
tristanmight be is_cached giving you trouble somehow ?17:15
tristanthat doesnt seem to ensure the remote anymore ?17:16
benschubertoh, did it use to? shoot17:16
tristanwas just noticing that in the diff17:16
tristanAlso if is_cached() is a delegated method, which I think is one of the things replacing get_consistency(), seems unnecessary to call it in fetch()17:17
tristanbut probably our plugin facing API docs need to be clarified to specify these invariants17:17
benschuberttristan: true I can remove that17:18
tristanis_cached() should of course be able to return False if the mirror directory doesnt even exist17:19
tristanlike we shouldnt require spending time initializing an empty repo17:19
benschubertyep I've just fixed that17:19
benschubert[gw0] [ 66%] FAILED mirror.py::test_mirror_junction_from_includes[ostree]17:20
benschubert[gw4] [ 53%] FAILED tests/sources/ostree.py::test_fetch_gpg_verify17:20
benschubertThose are the two failures now17:20
tristanclue: The reason why we don't just use 'origin' all around is because of mirrors17:21
tristanif we have some commands that just rely on default remote, that could cause mirror tests to fail17:22
benschubertso I need to set the default remote all the time?17:22
benschubertI thought we only set one remote for every one?17:22
tristanI don't think so17:22
tristanJust guessing every command needs to specify the remote17:23
tristanmaybe in the old code we were setting the default remote ? not sure17:23
benschuberthttps://gitlab.com/BuildStream/bst-plugins-experimental/-/jobs/623441172#L575 is one of them :/17:24
benschubertyeah I'm really not familiar with ostree17:24
*** hasebastian has quit IRC17:27
WSalmonso this works locally with and without --no-intergration https://gitlab.com/BuildStream/buildstream/-/commits/willsalmon/checkout_no_sandbox juergbi the only diffrent in reality as far as i can tell is that self.__configure_sandbox(sandbox) is not called, would that effect what gets checked out for a --no--intergartion checkout?17:29
WSalmonjuergbi, ^17:29
WSalmonas far as i can tell that would only effect the case with intergration17:29
tristanbenschubert, looks like it should be fairly easy to figure out with the failed test data and the ostree command line locally though17:30
tristanYou can literally copy paste the commands, run them on the test data and figure out where things went wrong17:30
tristanI'm not sure if you want --mirror for instance, you might, I don't know17:31
benschuberttristan: right, that seems the right way to go :D17:33
benschubertSeems also like my multiprocessing change prevents termination :/17:36
benschubertpause works perfectly though17:36
benschubertso there is that17:36
tristancool !17:51
tristanbenschubert, checking `ps axf` to make sure you've put everything to sleep ?17:52
benschubert I'll try that17:52
tristanTS I think you want... iirc17:52
benschubertSo I was building LLVM, 12 cores used, LVA 12. ctrl + c and in a second no more core used at all :)17:53
benschubertthat works17:53
benschubertthe only thing I'm not 100% sure is around the python code17:53
benschubertand contrinue -> restart immediatly17:53
tristansome things don't respond well, I think ninja might still suck at job control wrt SIGTSTP17:54
tristannot our fault if those process trees don't suspend17:54
benschubertyep, we're doing the best we can17:54
benschubertbut unless we are the kernel, well :D17:54
tristanWe could get better support from the kernel in the future17:55
tristanwith containers17:55
tristanwho knows :)17:55
benschubertah namespaces yeah :)17:55
* benschubert is loosing his sanity on the last bug of ostree17:55
tristanjoin the club :)17:55
benschubertWSalmon: what are the symptoms of your OOM errors again?17:56
tristanI've lost it already earlier this week trying to get non-recursive variable resolution to perform decently :)17:56
tristanAnother thing to consider wrt ostree: compatibility with the older plugin17:56
benschubertwhat kind of compatibility?17:57
tristanif for any reason we do things differently (repo config or smth) causing an old cached repo to not work with the new cli based access commands17:57
tristanthen we'd have to namespace the cache directory17:57
benschubertno I don't think that's a problem17:58
tristanand trigger an expensive re-download of probably a lot of data (ostree mostly used for big downloads)17:58
tristanyeah I doubt it17:58
WSalmoni cant find a example but casd just uses all of the memory18:03
benschubertok I'm giving up for today, will come back to it tomorrow :'D18:03
benschubertWSalmon: And what exit code?18:03
WSalmonthen it depends what needs a chunk of memery once its all gona18:03
benschubertok, will need to monitor my build more closely18:04
benschubertit's not succeeding and I've got 32Go of Ram :/18:04
WSalmon:(18:05
WSalmoni should be able to find one but not now18:05
WSalmonill ping you one18:05
WSalmonwhen i find it18:05
WSalmonhave a good non buildstreaming evening :D18:06
*** tomaz has joined #buildstream18:36
tomazpeople, is there any `kind` for installing a ruby-gem?18:37
*** tomaz has quit IRC18:51
WSalmonohhh i helped some one out with one a while back19:24
WSalmoncant rember if it was here or in FD-SDK19:25
benschubertYep I can confirm, with casd's master from two days ago I can't build llvm on a 32GB machine19:35
WSalmontoma20:49
WSalmonsorry fat figgers20:49
*** pointswaves has joined #buildstream21:01

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!