*** tristan has quit IRC | 04:49 | |
*** hasebastian has joined #buildstream | 04:58 | |
*** hasebastian has quit IRC | 05:00 | |
*** tristan has joined #buildstream | 05:02 | |
*** ChanServ sets mode: +o tristan | 05:02 | |
tristan | I'm confused, why do some objects implement __getstate__/__setstate__, while other objects implement get_state_for_child_job_pickling() ? | 05:16 |
---|---|---|
tristan | I find references to get_state_for_child_job_pickling() in jobpickler.py, but I cannot find where this interface is documented | 05:17 |
juergbi | tristan: a reason is that we also have non-job subprocesses (cache usage monitor) | 05:34 |
juergbi | don't know whether that's the only reason. also not sure why we can't simply use getstate/setstate for everything (maybe some pickling aspects are specific to job subprocesses) | 05:35 |
juergbi | get_state_for_child_job_pickling() appears to be implemented by Loader and Messenger, and used by jobpickler | 05:35 |
tristan | yeah | 05:45 |
tristan | juergbi, I came across https://gitlab.com/BuildStream/buildstream/-/merge_requests/1463#note_192464620 | 05:46 |
tristan | and I'm looking at just nulling out the project Includes() instance and the project Loader instances for all loaded projects directly at the end of Stream._load(), hopefully removing this stuff | 05:47 |
tristan | But basically, this job pickling is killing me at every turn | 05:47 |
tristan | My whole branch has been failing the multiprocess job for an issue in PluginFactory, not because my branch changes anything in that area (it doesn't), but because I've added tests which exercise the core a bit more | 05:49 |
tristan | Now my loader refactor is also failing this pickling | 05:49 |
juergbi | :-/ it would be good if someone picked this up again (assuming there is still interest in native Windows) | 05:50 |
tristan | I think I have a clue as to why it's failing in the plugin factory, it's after the cross-plugin junctions | 05:51 |
tristan | https://gitlab.com/BuildStream/buildstream/-/blob/master/src/buildstream/_scheduler/jobs/jobpickler.py#L151 <-- this bit seems to assume that a plugin is associated to a factory (and not the multiple factories it might have to traverse in order to find the factory which initially produces the plugin) | 05:53 |
tristan | not sure though, that code is very cryptic and undocumented | 05:53 |
tristan | Sweet, now when I run: BST_FORCE_START_METHOD="spawn" tox -e py37 -- tests/format/include.py::test_include_junction_file ... it hangs for hours | 05:56 |
tristan | But if I do a `tox -e venv /bin/bash` and run `BST_FORCE_START_METHOD="spawn" bst -C .tox/py37/tmp/test_include_junction_file0/junction/ show --deps none element.bst` from there (the equivalent of the test case in question); it just works fine | 05:56 |
tristan | I think if we wanted to support this picklejobber, we should have a base BstObject class which just lists the members it wants wiped out at pickle time, asserts that those members actually exist on the class, and does the __getstate__/__setstate__ stuff automatically | 05:59 |
tristan | This `assert "foo" in state` is very delicate, and the sprinkling of these methods all over the place doesn't appear to make much sense upon reading | 06:00 |
*** finnb has quit IRC | 06:03 | |
tristan | juergbi, do you know of any way to get the stderr/stdout of a hanging test run in tox ? | 06:06 |
tristan | Clearly something is wrong in there | 06:06 |
juergbi | tristan: -s doesn't show anything? | 06:07 |
tristan | Testing with pytest directly in a tox venv doesn't cause a problem, and when I add sys.stderr.write("DEBUG STUFF") in _pickle_child_job_data() on the pytest side, it's not even triggering | 06:07 |
tristan | juergbi, that never worked | 06:08 |
tristan | juergbi, -s only prints the output of the test once it's finished | 06:08 |
juergbi | tristan: may need to disable output capturing in our cli class | 06:08 |
juergbi | and then use -s | 06:08 |
tristan | :-S | 06:08 |
juergbi | or kill the process when it's hanging and hope for a backtrace? | 06:09 |
tristan | maybe that | 06:09 |
juergbi | depends on where/why it's hanging, I suppose | 06:09 |
tristan | I've got `python -c from multiprocessing.semaphore_tracker import main;main(17)` and `python -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=18, pipe_handle=20) --multipr.........` | 06:10 |
tristan | hah, no, killing both just makes them zombies and it just continues to hang | 06:12 |
tristan | juergbi, Hacking runcli.py indeed is the only way | 06:15 |
tristan | Ok if I could get this error https://gitlab.com/BuildStream/buildstream/-/blob/master/src/buildstream/_stream.py#L1646 to tell me *what* is referring to Stream, then I might get somewhere, at least through this hurdle, land my changes, and then start a larger discussion about what to do with this obnoxious stick jammed into our front wheel | 06:17 |
juergbi | tristan: the backtrace of the error doesn't say? or is there no backtrace? | 06:19 |
tristan | juergbi, https://bpa.st/7TEA | 06:20 |
tristan | not helpful | 06:21 |
tristan | just says "Dont pickle streams", but the stack doesn't have any context about the referrers | 06:21 |
tristan | maybe pickle.Pickler(pickled_data) has some options for this | 06:22 |
tristan | Has anyone ever been able to successfully use pdb with BuildStream, and if so, were they able to do it in a tox test context ? | 06:41 |
* tristan has tried to get that thing to work years ago and gave up with pdb | 06:42 | |
tristan | clue: This stuff happens during subproject fetch time mostly | 07:18 |
tristan | So we cannot realistically break references to Stream while performing a subproject fetch | 07:19 |
tristan | Well, we should be able to cause the subproject fetch callback to _not_ be a Stream method | 07:20 |
* tristan can't help feeling the sting of how much of a waste of time it is to be trying to fix this halfway implemented thing | 07:44 | |
*** santi has joined #buildstream | 08:48 | |
*** tristan has quit IRC | 09:04 | |
traveltissues | tristan, i've used pdb in a very limited way but it involved trawling through a huge stack, not via tox though | 09:23 |
*** tristan_ has joined #buildstream | 10:09 | |
*** ChanServ sets mode: +o tristan_ | 10:09 | |
tristan_ | juergbi, Can I get a review on https://gitlab.com/BuildStream/buildstream/-/merge_requests/1964 ? | 10:24 |
tristan_ | At least for that refactor, I was finally able to tiptoe around the pickle jobber | 10:24 |
juergbi | will take a look | 10:25 |
tristan_ | !1901 still has an issue with the pickle jobber, and I still have to figure out duplicate markers on overridden junctions | 10:25 |
tristan_ | But !1964 paves the way for !1901 well | 10:25 |
*** tristan_ has quit IRC | 11:37 | |
*** tristan_ has joined #buildstream | 12:16 | |
*** ChanServ sets mode: +o tristan_ | 12:16 | |
*** tristan_ is now known as tristan | 12:17 | |
*** santi has quit IRC | 14:08 | |
*** santi has joined #buildstream | 14:10 | |
*** santi has quit IRC | 17:02 | |
*** toscalix has joined #buildstream | 18:29 | |
*** xjuan has joined #buildstream | 20:12 | |
*** toscalix has quit IRC | 21:32 | |
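
The base class tristan proposes at 05:59 could look roughly like the sketch below. This is only an illustration of the idea and not BuildStream code: the `_PICKLE_WIPE` attribute, the `Project` stand-in and its members are hypothetical names chosen for the example. Each class declares the members to drop, and the base class checks that they exist and implements `__getstate__`/`__setstate__` once.

```python
import pickle


class BstObject:
    """Hypothetical base class: subclasses list the members to drop when pickled."""

    # Names of instance attributes to reset to None in the pickled state
    # (assumed convention for this sketch, not an existing BuildStream API).
    _PICKLE_WIPE = ()

    def __getstate__(self):
        state = self.__dict__.copy()
        # Collect the wipe lists declared anywhere in the class hierarchy.
        for klass in type(self).__mro__:
            for name in vars(klass).get("_PICKLE_WIPE", ()):
                # Fail loudly if a declared member no longer exists (for
                # example after a rename) instead of silently pickling it.
                assert name in state, "{} has no member '{}' to wipe".format(
                    type(self).__name__, name
                )
                state[name] = None
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)


class Project(BstObject):
    """Stand-in for a core object holding members that must not be pickled."""

    _PICKLE_WIPE = ("loader", "includes")

    def __init__(self, name):
        self.name = name
        self.loader = lambda: None  # a lambda cannot be pickled
        self.includes = object()


if __name__ == "__main__":
    # Round trip through pickle: works because the lambda has been wiped.
    restored = pickle.loads(pickle.dumps(Project("demo")))
    print(restored.name, restored.loader)
```

With this shape, the list of wiped members lives next to the class that owns them, and a stale entry fails at pickle time with the class and member named, rather than through a scattered `assert "foo" in state`.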
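
For the 06:17 to 07:20 part of the discussion (finding out *what* refers to Stream), one possible debugging aid is a small graph walk that reports the attribute path from a root object to the first instance of a given type. This is a rough sketch under stated assumptions and not a `pickle.Pickler` option: it follows instance `__dict__`s, containers and bound methods only, so it approximates what pickle would traverse and ignores custom `__getstate__`/`__reduce__`. The `Stream`, `Loader` and `Project` classes here are stand-ins, and `find_reference_path()` is a hypothetical helper.

```python
import types


def _children(obj):
    """Yield (label, child) pairs for the references this sketch follows."""
    if isinstance(obj, types.MethodType):
        # A bound method keeps its instance alive through __self__.
        yield ".__self__", obj.__self__
    elif isinstance(obj, dict):
        for key, value in obj.items():
            yield "[{!r}]".format(key), value
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for index, item in enumerate(obj):
            yield "[{}]".format(index), item
    elif hasattr(obj, "__dict__") and not isinstance(
        obj, (type, types.ModuleType, types.FunctionType)
    ):
        for name, value in vars(obj).items():
            yield "." + name, value


def find_reference_path(root, target_type):
    """Return the path steps from root to an instance of target_type, or None."""
    seen = set()
    stack = [(root, ["root"])]
    while stack:
        obj, path = stack.pop()
        if isinstance(obj, target_type):
            return path
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        for label, child in _children(obj):
            stack.append((child, path + [label]))
    return None


# Stand-in classes mimicking a callback that is a Stream method.
class Stream:
    def fetch_subprojects(self):
        pass


class Loader:
    def __init__(self, callback):
        self.fetch_callback = callback


class Project:
    def __init__(self, loader):
        self.loader = loader


stream = Stream()
project = Project(Loader(stream.fetch_subprojects))
path = "".join(find_reference_path(project, Stream))
print(path)
```

In this toy setup the path ends in the bound method's `__self__`, which is the situation described at 07:20: the subproject fetch callback being a Stream method is what drags the Stream into the pickled state.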