IRC logs for #buildstream for Tuesday, 2019-10-29

[00:08] *** slaf has joined #buildstream
[00:08] *** slaf has joined #buildstream
[00:08] *** slaf has joined #buildstream
[00:08] *** slaf has joined #buildstream
[00:09] *** slaf has joined #buildstream
[00:09] *** slaf has joined #buildstream
[09:19] *** traveltissues has joined #buildstream
[09:26] *** jonathanmaw has joined #buildstream
[09:27] *** santi has joined #buildstream
[09:40] *** tiagogomes has joined #buildstream
[10:00] *** bochecha has joined #buildstream
[10:04] *** phildawson_ has joined #buildstream
[10:29] *** lachlan has joined #buildstream
[10:58] *** lachlan has quit IRC
[11:16] *** santi has quit IRC
[11:24] *** santi has joined #buildstream
[11:24] *** lachlan has joined #buildstream
[11:54] *** lachlan has quit IRC
[12:06] *** phildawson_ has quit IRC
[12:07] *** phildawson_ has joined #buildstream
[12:22] *** dtf has joined #buildstream
[12:28] <gitlab-br-bot> traveltissues opened issue #1183 (automatic cache key updating is not compatible with !1651) on buildstream https://gitlab.com/BuildStream/buildstream/issues/1183
[12:36] <gitlab-br-bot> traveltissues opened (was WIP) MR !1651 (traveltissues/1161->master: extend source api and remove private use from workspace plugin) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/1651
[12:42] <gitlab-br-bot> cs-shadow opened issue #1184 (Remote Execution tests produce too much logs) on buildstream https://gitlab.com/BuildStream/buildstream/issues/1184
[13:03] *** phildawson_ has quit IRC
[13:14] *** lachlan has joined #buildstream
[13:31] <gitlab-br-bot> traveltissues closed issue #1183 (automatic cache key updating is not compatible with !1651) on buildstream https://gitlab.com/BuildStream/buildstream/issues/1183
[13:32] <gitlab-br-bot> traveltissues opened (was WIP) MR !1651 (traveltissues/1161->master: extend source api and remove private use from workspace plugin) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/1651
[13:41] <tlater[m]> juergbi: Hrm, we have a test that tries to spam the remote server with lots of small files
[13:42] <juergbi> yes
[13:42] <tlater[m]> It's either hanging or taking more than 20 minutes to process with the buildbox-casd proxying
[13:42] <juergbi> hm, the proxying is supposed to reduce overhead
[13:43] <juergbi> could potentially be a message size issue
[13:43] <juergbi> (was the motivation for the test)
[13:43] <tlater[m]> juergbi: Specifically, we never seem to make it past FindMissingBlobs
[13:43] <tlater[m]> buildbox-casd seems to get a lot of Write requests, that's all I can see
[13:44] <tlater[m]> What could happen to the message size?
[13:45] * tlater[m] just literally forwards the message, and sets the same message size restraints BuildStream normally does
[13:46] <tlater[m]> Curiously the casserver thinks FindMissingBlobs() completes, but the client doesn't.
[13:59] *** phildawson_ has joined #buildstream
[14:06] *** santi has quit IRC
[14:08] <juergbi> tlater[m]: is this when proxying FindMissingBlobs() requests that bst-artifact-server receives to casd
[14:08] <juergbi> or when issuing FindMissingBlobs() as part of the artifact timestamp update code?
[14:09] <tlater[m]> juergbi: This is part of proxying them when bst-artifact-server receives them
[14:09] * tlater[m] isn't sure why he never ran into this before
[14:09] <tlater[m]> Hm, worth checking if reverting things causes issues, this may be an issue with master?
[14:10] <tlater[m]> buildbox-casd master that is
[14:10] <juergbi> I've very recently tested buildbox-casd master against the test suite
[14:11] <tlater[m]> Yes, but not proxied buildbox-casd
[14:13] <tlater[m]> Yep, looks like it has broken between buildbox-casd versions
[14:13] * tlater[m] should check the old version, just to be sure
[14:14] <tlater[m]> Ah, no, just as I say that the test finishes
[14:14] <gitlab-br-bot> BenjaminSchubert approved MR !1657 (aevri/enable_spawn_ci_5->master: job pickling: also pickle global state in node.pyx) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/1657
[14:21] * tlater[m] wonders if it's because he set LogLevel.TRACE
[14:21] <tlater[m]> Maybe that just makes buildbox-casd significantly slower
[14:22] <benschubert> tlater[m]: I would be very surprised about that :)
[14:22] *** santi has joined #buildstream
[14:22] <tlater[m]> So would I!
[14:22] <tlater[m]> But it's the only notable change I can see
[14:22] <tlater[m]> Oh, yes
[14:22] <tlater[m]> It is that
[14:22] <tlater[m]> Wow
[14:23] <benschubert> wut?
[14:23] <tlater[m]> My guess is that it just spams so many files that the logging overhead becomes severe
[14:23] <tlater[m]> Since every Write request is logged with an individual message
[14:29] <tlater[m]> \o/ my tests pass now
[14:30] <tlater[m]> So yeah, TRACE log level in buildbox-casd is very slow
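Editorial note: the effect described above (one log message per Write request) can be illustrated with a minimal sketch. It uses Python's standard `logging` module rather than buildbox-casd's own logger, and the `TRACE` level value, `CountingHandler` and `simulate_writes` are hypothetical names introduced only for this illustration. It shows only that the number of emitted records grows linearly with the number of blobs when the threshold is lowered to TRACE; it does not measure buildbox-casd itself.

```python
import logging

# Hypothetical TRACE level below DEBUG (10); not a buildbox-casd constant.
TRACE = 5
logging.addLevelName(TRACE, "TRACE")


class CountingHandler(logging.Handler):
    """Counts the records it receives instead of writing them anywhere."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def emit(self, record):
        self.count += 1


def simulate_writes(n_blobs, level):
    """Log one TRACE message per simulated Write request.

    Returns how many records actually reached the handler at the
    given logger threshold.
    """
    logger = logging.getLogger(f"casd-sketch-{level}")
    logger.setLevel(level)
    logger.propagate = False
    handler = CountingHandler()
    logger.addHandler(handler)
    for i in range(n_blobs):
        logger.log(TRACE, "Write request for blob %d", i)
    logger.removeHandler(handler)
    return handler.count


if __name__ == "__main__":
    print(simulate_writes(10000, TRACE))         # every request is logged
    print(simulate_writes(10000, logging.INFO))  # TRACE messages filtered out
```

With a test that uploads many small files, each record also has to be formatted and written out, which is consistent with the slowdown guessed at in the conversation.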
[15:12] <gitlab-br-bot> traveltissues closed issue #1181 (resolve workspaces stages only once test failure for multiprocessing run) on buildstream https://gitlab.com/BuildStream/buildstream/issues/1181
[15:42] *** santi has quit IRC
[15:47] <gitlab-br-bot> marge-bot123 merged MR !1657 (aevri/enable_spawn_ci_5->master: job pickling: also pickle global state in node.pyx) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/1657
[15:56] *** lachlan has quit IRC
[16:03] <gitlab-br-bot> aevri opened (was WIP) MR !1663 (aevri/enable_spawn_ci_6->master: Fix remaining spawn unit test breaks under 'tests/') on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/1663
[16:13] <gitlab-br-bot> tpollard opened issue #1185 (Build does not exit gracefully on a second CTRL-C) on buildstream https://gitlab.com/BuildStream/buildstream/issues/1185
[16:16] *** phil has joined #buildstream
[16:17] *** phildawson_ has quit IRC
[16:18] <tlater[m]> juergbi: buildbox-casd uses the same old LRU expiry approach, right?
[16:18] * tlater[m] sees an artifact that is not LRU being expired and is very confused
[16:19] <tlater[m]> That is, not LRU on the fs leve
[16:19] <tlater[m]> *l
[16:19] <juergbi> tlater[m]: it uses blob-based LRU expiry
[16:19] <juergbi> i.e., like bst-artifact-server before the switch to casd, but not like the old bst client artifact-based expiry
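Editorial note: a minimal sketch of what blob-based LRU expiry means, assuming each blob carries a size and a last-used timestamp (mtime). This is an illustrative model, not buildbox-casd's implementation; `Blob` and `expire_lru` are hypothetical names. The point is that expiry looks only at individual blobs, so an artifact whose blobs have stale mtimes can lose them even if the artifact itself was used recently.

```python
from dataclasses import dataclass


@dataclass
class Blob:
    size: int     # bytes occupied in the cache
    mtime: float  # last-used timestamp; older means less recently used


def expire_lru(blobs, quota):
    """Delete least recently used blobs until the total size fits the quota.

    `blobs` maps a digest string to a Blob and is modified in place.
    Returns the digests that were removed, oldest first.
    """
    removed = []
    total = sum(blob.size for blob in blobs.values())
    # Sort once by mtime; the resulting list is independent of later deletions.
    for digest in sorted(blobs, key=lambda d: blobs[d].mtime):
        if total <= quota:
            break
        total -= blobs[digest].size
        del blobs[digest]
        removed.append(digest)
    return removed


if __name__ == "__main__":
    cache = {"a": Blob(5, 1.0), "b": Blob(5, 2.0), "c": Blob(5, 3.0)}
    print(expire_lru(cache, quota=10))
    print(sorted(cache))
```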
[16:20] <tlater[m]> juergbi: Does it delete the actual artifact protos?
[16:21] <tlater[m]> I think I get that, but I'm trying to figure out why my artifact disappears
[16:21] <tlater[m]> Does it, besides removing the least recently used blobs, also delete any artifacts that refer to them?
[16:21] <tlater[m]> Or is that done by the ArtifactServicer?
[16:22] <juergbi> tlater[m]: there is no artifact proto expiry at all yet
[16:22] <juergbi> I think there is a bug about it but I don't think this has been implemented yet
[16:23] <tlater[m]> Hehe, so why is that proto disappearing :D
[16:23] * tlater[m] will try to figure that out then
[16:45] *** lachlan has joined #buildstream
[16:50] *** lachlan has quit IRC
[17:04] <tlater[m]> juergbi: I'm stuck - it looks like FetchTree is simply not updating my mtimes
[17:04] <tlater[m]> If/when you have time, is there anything obviously wrong you can spot in https://gitlab.com/BuildStream/buildstream/merge_requests/1645/diffs?commit_id=35fb6969f474c32b69df3270a8aae1131c65830b ?
[17:06] <gitlab-br-bot> aevri opened (was WIP) MR !1674 (aevri/fuse_mount_private->master: _fuse/mount: make mount() and unmount() private) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/1674
[17:07] <gitlab-br-bot> aevri opened (was WIP) MR !1673 (aevri/testutils_artifactshare->master: tests/artifactshare: safer cleanup_on_sigterm use) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/1673
[17:07] <juergbi> tlater[m]: I commented on an issue with the old reference service but I assume you're testing with the artifact service
[17:07] <juergbi> it's about artifact.files, I presume
[17:08] <juergbi> I don't see an issue at a quick glance
[17:09] <tlater[m]> I suppose I'll need to start stracing buildbox-casd then
[17:09] *** santi has joined #buildstream
[17:09] * tlater[m] doesn't see why it wouldn't update mtimes, but yet some blobs are removed despite their parent artifacts not being marked as outdated
[17:12] *** santi has quit IRC
[17:25] <juergbi> tlater[m]: you see this as a failure in one of our expiry tests, or is this manual testing?
[17:25] <tlater[m]> juergbi: It's in frontend/push.py::test_recently_pulled_artifact_does_not_expire
[17:26] <tlater[m]> I've played a little with the cache quota sizes to make sure it's a proper failure
[17:26] <tlater[m]> And I've dug into the buildbox-casd logs, too
[17:26] <juergbi> ok
[17:26] <juergbi> I assume you've checked we're passing the right quota to casd?
[17:27] <tlater[m]> yeah, buildbox-casd handily reports the cache quota early on
[17:27] <tlater[m]> juergbi: For reference, here are the logs: https://hastebin.com/rodedorepo.sql
[17:28] <tlater[m]> What should be happening is that the cache usage drops to ~5M
[17:28] <tlater[m]> Err, ~10M that is
[17:28] <tlater[m]> For element1 and element3 to fit
[17:28] <tlater[m]> But that's probably hard to read from those logs :)
[17:33] *** santi has joined #buildstream
[17:38] <juergbi> tlater[m]: shouldn't it use a 22 MB quota, not a 25 MB one for this test?
[17:38] <juergbi> or rather, instead of the 24 MB one
[17:38] <tlater[m]> juergbi: Yes, that's my experimentation to push that higher in case the lower quota trips
[17:39] <tlater[m]> I have two logs in there, one with 24M, one with 25M
[17:39] <juergbi> ah ok
[17:41] <juergbi> tlater[m]: I assume it deletes element1 and element2 instead of just element2?
[17:42] <tlater[m]> Yes
[17:42] <tlater[m]> The assertions for element 3 and 2 pass, but it then fails with the final assertion
[17:43] <tlater[m]> I've traced all the way into the assertion, and it's a FileNotFoundError in the code that reads `artifact.files`' blobs
[17:43] <tlater[m]> So it's pretty clear that the mtime setting isn't working
[17:47] <juergbi> tlater[m]: ah, you need to set fetch_file_blobs in the FetchTreeRequest
[17:47] <juergbi> otherwise only the directories will be covered
[17:48] <tlater[m]> Oh!
[17:48] <tlater[m]> Hm, I wonder if that instead of means I need to call it twice, actually
[17:49] <tlater[m]> But yes, that makes sense
[17:49] * tlater[m] was wondering if it was an argument he was missing, but dismissed that as unimportant o\
[17:49] <tlater[m]> Ta juergbi, really appreciate it.
[17:49] <tlater[m]> Ugh, I've been banging my head against this for hours :|
[17:50] <juergbi> directory objects are always covered
[17:50] <juergbi> files are optional
[17:50] <juergbi> only one call required
[17:50] <juergbi> I also missed it when first looking at your code
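Editorial note: the behaviour juergbi describes (directory objects are always covered, file blobs only when `fetch_file_blobs` is set, and a single call is enough) can be modelled in a few lines. This is a toy model of the semantics as stated in the conversation, not the real gRPC call. Only the names `FetchTree` and `fetch_file_blobs` come from the discussion; `Directory`, `fetch_tree` and the `mtimes` dictionary are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Directory:
    digest: str
    files: list = field(default_factory=list)    # digests of file blobs
    subdirs: list = field(default_factory=list)  # nested Directory objects


def fetch_tree(root, mtimes, now, fetch_file_blobs=False):
    """Refresh mtimes for a tree in one traversal.

    Directory objects are always touched; file blobs are touched only
    when fetch_file_blobs is True.
    """
    stack = [root]
    while stack:
        directory = stack.pop()
        mtimes[directory.digest] = now
        if fetch_file_blobs:
            for file_digest in directory.files:
                mtimes[file_digest] = now
        stack.extend(directory.subdirs)


if __name__ == "__main__":
    tree = Directory("dir-root", files=["file-a"],
                     subdirs=[Directory("dir-sub", files=["file-b"])])
    mtimes = {"dir-root": 0, "dir-sub": 0, "file-a": 0, "file-b": 0}
    fetch_tree(tree, mtimes, now=100)
    print(mtimes)  # file blobs keep their old mtime
    fetch_tree(tree, mtimes, now=200, fetch_file_blobs=True)
    print(mtimes)  # directories and files all refreshed
```

Under blob-based LRU expiry, the first call leaves the file blobs looking stale, which matches the FileNotFoundError seen when reading the blobs behind `artifact.files`.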
[17:51] <tlater[m]> The doc should probably be changed then :)
[17:51] <tlater[m]> Eh, easier to miss in review than when actually coding it
[17:52] <juergbi> yes, I didn't look at the proto at that point
[17:52] * tlater[m] almost wants to try writing an artifact-server testing binary to really drive the protocol into his brain
[17:55] *** lachlan has joined #buildstream
[17:57] *** lachlan has quit IRC
[18:02] *** traveltissues has quit IRC
[18:04] *** jonathanmaw has quit IRC
[18:06] *** santi has quit IRC
[18:18] *** phil has quit IRC
[18:20] *** rdale has quit IRC
[21:26] *** benschubert has quit IRC
[22:42] *** narispo has quit IRC
[22:43] *** narispo has joined #buildstream
[23:12] *** narispo has quit IRC
[23:12] *** narispo has joined #buildstream
