juergbi | nanonyme: I've already rebased and merged jjardon's PR #1759 | 14:33 |
---|---|---|
nanonyme | Oh, ok. | 14:33 |
nanonyme | juergbi: did you notice it crashed on Debian 10? | 14:34 |
nanonyme | Closed | 14:35 |
juergbi | the debian-10 job passed the first run in CI (in the grpcio update PR) and also locally, so I merged it | 14:35 |
juergbi | but yes, I'm seeing now that it sometimes still crashes there | 14:36 |
juergbi | I suspect it's still a coverage-related issue | 14:36 |
juergbi | haven't seen any issues with the fedora 36 and fedora 37 jobs (which still have coverage disabled) | 14:37 |
nanonyme | juergbi: looks like the top ones by @abderrahim:gnome.org are more or less safe to merge | 14:37 |
nanonyme | Haha. doras just noticed that if you run out of disk space during build and build fails because of this, BuildStream caches this as failure and pushes it into CAS which prevents any future retries. | 14:38 |
nanonyme | This caching of failures seems like it should be opt-out | 14:38 |
doras | And disabled by default, to be honest | 14:39 |
nanonyme | doras: the point is that everything is reproducible and you can debug failures from CI | 14:39 |
nanonyme | But obviously disk space running out is not something you can control | 14:40 |
nanonyme | And not a deterministic phenomenon | 14:40 |
doras | BuildStream doesn't live in a bubble of its own with infinite resources. A kernel bug, a random process crash or disk storage running out are indeed real reasons for failures and are very much temporary. | 14:40 |
nanonyme | Right | 14:40 |
nanonyme | And when there is central CAS, this can in fact mess up the entire project more or less permanently until CAS is fixed | 14:45 |
nanonyme | So it's an actively dangerous feature | 14:45 |
juergbi | not caching failures at all would likely be very inconvenient because in that case also logs of the failed builds would not be available | 14:48 |
nanonyme | That is, caching failures itself is not dangerous but pushing failures is | 14:48 |
nanonyme | juergbi: who cares? We archive them as artifacts | 14:48 |
nanonyme | It's harmful to have the artifact in CAS | 14:48 |
nanonyme | If you want to store the logs, there needs to be some other mechanism that does not prevent rebuilding the element. | 14:49 |
juergbi | a config option to ignore cached failures for bst build would make sense to me | 14:49 |
nanonyme | But this is not really a helpful feature in general. If you hit infra failure, it will not be possible to recover without wiping your CAS. | 14:50 |
juergbi | if build ignores the cached failure, it shouldn't be an issue | 14:51 |
nanonyme | Ahh, right, so you would ignore it for build only but pull it as normal? | 14:52 |
juergbi | yes | 14:52 |
nanonyme | What if build succeeds after that? Will the successful build artifact replace the failed artifact in CAS? | 14:52 |
juergbi | yes | 14:52 |
nanonyme | Ok, good. I was worried BuildStream would skip the push. | 14:52 |
nanonyme | This would be user config thing, probably? | 14:53 |
juergbi | yes, could likely be both user config + CLI option | 14:53 |
juergbi | btw: my test fix for updated grpcio doesn't seem sufficient for bst-1, unfortunately | 14:54 |
juergbi | presumably due to bst-1 still using fork inside bst itself | 14:55 |
nanonyme | Right, and changing that would be way too invasive | 14:56 |
juergbi | CI seems reliable so far with updated coverage: https://github.com/apache/buildstream/pull/1817 | 16:00 |
nanonyme | juergbi: don't we run 3.7 on Debian 10? | 16:09 |
juergbi | yes | 16:11 |
nanonyme | juergbi: it's just weird that you say in PR that it works on 3.8+, that's all. | 16:22 |
juergbi | nanonyme: ah, what I meant was that Coverage 6 also works on Python 3.8+ with Cython while older versions of Coverage broke with Python 3.8+ (with Cython) | 16:23 |
nanonyme | Fair enough | 16:23 |
nanonyme | juergbi: it is btw interesting that Fedora is using grpc 1.48.2 together with BuildStream on Python 3.11 | 17:37 |
nanonyme | I wonder if that combination would work at all even if the bug with regex parser was released | 17:37 |
nanonyme | Erm, bug fix was released even | 17:38 |
nanonyme | juergbi: looks like https://github.com/apache/buildstream/pull/1793 after rebase didn't fail so maybe those were random failures? | 18:37 |
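
The behaviour proposed in the 14:49 to 14:53 exchange (build ignores a cached failure, pull still returns it, and a later successful build replaces the failed artifact) might look roughly like the sketch below. This is a toy model, not BuildStream code: `Artifact`, `ArtifactCache`, `build` and the `retry_failed` flag are all hypothetical names made up for illustration, and the in-memory dict only stands in for a shared CAS.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Artifact:
    success: bool  # False means a cached build failure
    log: str       # build log kept alongside the result


class ArtifactCache:
    """Toy stand-in for a shared artifact cache, keyed by cache key."""

    def __init__(self) -> None:
        self._store: Dict[str, Artifact] = {}

    def pull(self, key: str) -> Optional[Artifact]:
        # Pull behaves as usual: failures are returned too, so logs stay available.
        return self._store.get(key)

    def push(self, key: str, artifact: Artifact) -> None:
        # A later push for the same key replaces the earlier entry.
        self._store[key] = artifact


def build(
    key: str,
    cache: ArtifactCache,
    run_build: Callable[[], Artifact],
    retry_failed: bool = False,  # hypothetical option, not a real BuildStream flag
) -> Artifact:
    cached = cache.pull(key)
    # Reuse the cached result unless it is a failure and retries are requested.
    if cached is not None and (cached.success or not retry_failed):
        return cached
    artifact = run_build()
    cache.push(key, artifact)  # a success here overwrites the cached failure
    return artifact
```

With `retry_failed=False` a failure caused by, say, a full disk keeps being served from the cache; with `retry_failed=True` the element is rebuilt and the new result replaces the old one, while successful artifacts are still reused without rebuilding.
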