*** tristan has joined #buildstream | 00:05 | |
*** leopi has joined #buildstream | 04:06 | |
*** leopi has quit IRC | 06:53 | |
*** tristan has quit IRC | 07:19 | |
*** tristan has joined #buildstream | 08:01 | |
*** ChanServ sets mode: +o tristan | 08:31 | |
*** alatiera_ has joined #buildstream | 08:32 | |
tristan | tlater[m], I didn't notice your reply yesterday (lost connection temporarily it seems) | 08:32 |
tristan | tlater[m], later in log I elaborate on *some* of the things I've been seeing: https://irclogs.baserock.org/buildstream/%23buildstream.2018-09-08.log.html#t2018-09-08T10:55:52 | 08:33 |
tristan | tlater[m], I am a bit flabbergasted as to how messy things have gotten, but I'm not on a witch hunt... I am however thinking of writing these up and calling them out on the mailing list | 08:35 |
tristan | the headline is basically that; we need to do better in review; all of us | 08:35 |
tristan | That shouldn't have happened to bochecha, looks like a gitlab bug :-S | 08:40 |
tristan | bst-external has master as a protected branch, and only maintainers allowed to merge; bochecha is developer | 08:41 |
* tristan files https://gitlab.com/gitlab-org/gitlab-ce/issues/51279 | 08:55 | |
*** abderrahim has quit IRC | 09:11 | |
*** abderrahim has joined #buildstream | 09:11 | |
*** alatiera_ has quit IRC | 09:18 | |
*** alatiera_ has joined #buildstream | 09:19 | |
*** alatiera_ has quit IRC | 09:20 | |
*** alatiera_ has joined #buildstream | 09:20 | |
*** alatiera_ has quit IRC | 09:25 | |
*** bochecha has joined #buildstream | 12:06 | |
tristan | tlater[m], I have an awesome branch now to clean up... only about 80% of the mess, I will share tomorrow - I want to run some tests | 12:42 |
tristan | tests/artifactcache/expiry.py::test_never_delete_dependencies will have to be disabled unless we can get better synchronization; for that test to work realistically, we need more elaborate prevention of builds in the main process, but the test is rather unimportant | 12:43 |
tristan | I.e. once we determine that our estimate of the size has exceeded the quota... we need to run a cache calculation job in advance of the cleanup; that job is not exclusive of builds | 12:44 |
tristan | Or, it could be a "calculate cache size and maybe cleanup job" which is exclusive at that point; that would be straightforward and work | 12:44 |
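A rough sketch of the combined "calculate cache size and maybe cleanup" job described above might look like the following; this is not BuildStream's actual ArtifactCache code, and the names (`add_artifact_size`, `size_and_maybe_clean`, `quota`) and the locking are illustrative assumptions.

```python
import threading

class ArtifactCacheSketch:
    """Sketch of the combined 'calculate size and maybe clean up' job
    discussed above; names and structure are illustrative assumptions."""

    def __init__(self, quota_bytes):
        self.quota = quota_bytes
        self._cache_size = 0            # running estimate of the cache size
        self._lock = threading.Lock()   # stands in for "exclusive of build jobs"

    def add_artifact_size(self, nbytes):
        # Build jobs report the size of freshly committed artifacts; the
        # estimate only grows here, and exceeding the quota schedules the
        # combined job rather than failing the build.
        self._cache_size += nbytes
        if self._cache_size > self.quota:
            self.size_and_maybe_clean()

    def size_and_maybe_clean(self):
        # While this job holds the lock, no new artifacts are committed, so
        # the recalculated size and any cleanup decision stay consistent.
        with self._lock:
            self._cache_size = self._compute_real_size()
            if self._cache_size > self.quota:
                self._cache_size -= self._clean_unneeded_artifacts()

    def _compute_real_size(self):
        # Placeholder: walk the cache directory and sum the file sizes.
        return self._cache_size

    def _clean_unneeded_artifacts(self):
        # Placeholder: delete expendable artifacts and return the bytes freed.
        return 0
```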
tlater[m] | Hrm, it's annoying that with the latter we'd have to block builds during cache size calculation. | 12:46 |
*** tristan has quit IRC | 12:50 | |
*** tristan has joined #buildstream | 13:19 | |
bochecha | valentind, hi, I'm looking at https://gitlab.com/BuildStream/buildstream/issues/523 ; I tried hardcoding 'aarch64' in the SandboxConfig key instead of using the result of os.uname, just to test… I still don't hit the cache key :-/ | 13:45 |
*** leopi has joined #buildstream | 14:23 | |
*** leopi has quit IRC | 14:38 | |
*** alatiera_ has joined #buildstream | 14:58 | |
persia | bochecha: Have you tried running BuildStream in an aarch64 chroot on your laptop? My guess is that while the binfmt trick lets you execute aarch64 binaries, something about BuildStream arch autodetection is running a native binary (might be some python internal, for example). | 15:55 |
bochecha | persia, buildstream is running natively on my laptop, i.e. x86_64 | 15:56 |
bochecha | I haven't tried running it in an aarch64 chroot, I should probably try that, thanks | 15:56 |
persia | Also note that even if your kernel has the binfmt stuff configured, the interpreter needs to be injected into the chroot to run the foreign arch in the chroot, which I think BuildStream may need help doing. | 15:57 |
bochecha | what do you mean? | 16:05 |
persia | As I understand it, one runs commands under an interpreter, so one can define interpreters with entries in /proc/sys/fs/binfmt_misc | 16:07 |
persia | For example, you might have a file named "qemu-aarch64" in that directory, in which there is a line that says something like "interpreter /usr/bin/qemu-aarch64-static" | 16:08 |
persia | This means that when an aarch64 ELF file is encountered, it will call that interpreter to run the file. If one is operating in a chroot environment, one needs to copy the interpreter from the host to that path in the chroot (or bind-mount or something). | 16:09 |
persia | If the interpreter is not found in the chroot, then when the foreign binary is encountered, it will fail in the same way it would for a system not using binfmt to run foreign code. | 16:10 |
persia | (or at least, such is my experience: someone might have done something cool to make it work differently since I last looked) | 16:10 |
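The manual step persia describes can be scripted; the sketch below reads the binfmt_misc registration to find the interpreter and copies it to the same path inside a chroot. The entry name, the interpreter path, and the example chroot location are illustrative assumptions, not anything BuildStream does itself.

```python
import os
import shutil

def copy_binfmt_interpreter(chroot, entry="qemu-aarch64"):
    """Read the binfmt_misc registration for `entry` to find its interpreter
    (e.g. /usr/bin/qemu-aarch64-static) and copy it to the same path inside
    the chroot, so foreign binaries can run inside it too."""
    with open(os.path.join("/proc/sys/fs/binfmt_misc", entry)) as f:
        for line in f:
            if line.startswith("interpreter "):
                interpreter = line.split(None, 1)[1].strip()
                break
        else:
            raise RuntimeError("no interpreter line in binfmt_misc entry %r" % entry)

    target = os.path.join(chroot, interpreter.lstrip("/"))
    os.makedirs(os.path.dirname(target), exist_ok=True)
    shutil.copy2(interpreter, target)
    return target

# Example, assuming the aarch64 chroot lives at /srv/aarch64-root:
# copy_binfmt_interpreter("/srv/aarch64-root")
```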
gitlab-br-bot | buildstream: merge request (Qinusty/skipped-rework->master: Add SkipError for indicating a skipped activity) #765 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/765 | 16:35 |
*** alatiera_ has quit IRC | 16:47 | |
*** leopi has joined #buildstream | 16:49 | |
tristan | bochecha, I filed https://gitlab.com/gitlab-org/gitlab-ce/issues/51279 btw | 17:01 |
tristan | tlater[m], I'm not sure how annoying it actually is, though; because note that that "maybe_cleanup" job only happens when the estimated artifact cache size exceeds the quota | 17:03 |
tristan | then again, not doing it at all seems not horrible either, considering our logic is already fuzzy | 17:04 |
tlater[m] | It depends on the project, if you have a lot of deduplication there will be lots of cache size recalculation... But maybe it's not that big of an issue in practice. | 17:04 |
tristan | at least, if we reach an out of space disk condition for some reason (calculating the size took a very very long time perhaps ?), then we *still* have to fail gracefully | 17:05 |
tristan | Well, if you have a lot of deduplication, then it will be a long time between "maybe_cleanup" jobs | 17:05 |
tristan | If you do not have a lot of deduplication, then a real cleanup is "coming soon" | 17:06 |
tlater[m] | I feel like this is the sort of thing you'll want to test with lots of different kinds of projects. | 17:07 |
tlater[m] | I suppose that in a long build a few minutes of cleanup probably won't matter much. | 17:09 |
tristan | tlater[m], Does the "current thing" somehow make sense, though ? From what I can tell; the mentioned test case is basically a race condition | 17:09 |
tristan | It says that "If I try to build twice as many artifacts as my quota allows, then only the artifacts which fit should get into the cache, and the build should fail" | 17:09 |
tristan | But, we launch a calculate size job once we reach the quota, and then we might launch a cleanup when that calculate size job completes | 17:10 |
tristan | In the mean time, we're dispatching those other two build jobs | 17:10 |
tlater[m] | Yes, I see what you mean. | 17:11 |
tristan | tlater[m], It *could* fail at the time that those build jobs try to commit the artifacts which cause the cache quota to be busted, but that seems rather disingenuous | 17:11 |
tristan | we already built them, and barring an actual "disk out of space" condition, it's better to just cache them | 17:11 |
tlater[m] | This is definitely easier to handle if calculation and cleanup are one thing. | 17:13 |
tristan | Considering the current crashing state of things, that is definitely better | 17:16 |
tlater[m] | Yeah, I agree, it's probably better just to keep this logic from blowing up completely. | 17:16 |
tristan | But, we could equally not have that test; or we could manufacture that test in such a way as to have a sleep() to ensure it's not a race in some way | 17:16 |
tristan | My branch btw, does not have a distinction between 'cache_size' or 'estimated_size', there is only '_cache_size', and nobody except artifactcache.py can access it | 17:17 |
tristan | actually all of that related public stuff in artifactcache.py is now safely private | 17:18 |
tristan | and cache_size is never allowed to be set to None to trigger weird intentional side effects that are extremely hard to follow | 17:18 |
tristan | The elementjob.py and queue.py don't communicate the cache_size anymore, either; that is reported back to the main thread only through explicit channels, not like workspaces | 17:20 |
tristan | anyway; it's much simplified; I needed to get things simple just to really be sure of what I was doing at this point | 17:21 |
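A minimal sketch of the "explicit channel" reporting tristan describes, assuming a multiprocessing-style job model: the size computed in a child job is handed back through a queue, and the parent's private estimate is updated in exactly one place and never set to None. This is not the actual branch; the names and paths are hypothetical.

```python
import multiprocessing
import os

def cache_size_job(cache_dir, result_queue):
    # Child-process job: walk the cache and report the total size back
    # explicitly, instead of mutating shared state behind the scenes.
    total = 0
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            total += os.lstat(os.path.join(root, name)).st_size
    result_queue.put(total)

if __name__ == "__main__":
    queue = multiprocessing.Queue()
    job = multiprocessing.Process(target=cache_size_job,
                                  args=("/path/to/cache", queue))  # hypothetical path
    job.start()
    # The parent updates its private estimate only here, through the
    # explicit channel, and the value is always an integer.
    cache_size = queue.get()
    job.join()
```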
*** alatiera_ has joined #buildstream | 17:55 | |
*** dtf has quit IRC | 17:59 | |
bochecha | tristan, yeah, I received the email | 18:07 |
bochecha | tristan, I get a 404 though, is it private/hidden? | 18:08 |
*** mohan43u has quit IRC | 18:35 | |
tristan | bochecha, somebody set it to confidential yeah, strange that you cannot see it even though you are one of the participants | 18:40 |
tristan | anyway; I guess it means they are taking it seriously, it's a kind of security breach | 18:41 |
bochecha | yeah | 18:54 |
*** alatiera_ has quit IRC | 18:56 | |
*** alatiera_ has joined #buildstream | 18:56 | |
*** leopi has quit IRC | 19:00 | |
bochecha | basically, the field is called "Target branch", which means it's the branch you'll commit to | 19:13 |
bochecha | except usually in gitlab, "target branch" is the branch your MR will be merged into | 19:13 |
bochecha | so when I saw "target branch: master", I thought sure, that's where I want the MR to be merged in | 19:13 |
bochecha | turns out, I was supposed to change the value in that field, and that would have created a new branch, and an MR for it | 19:14 |
bochecha | so it's a bad UX issue | 19:14 |
bochecha | coupled with a permission issue | 19:14 |
*** bochecha_ has joined #buildstream | 19:43 | |
*** bochecha has quit IRC | 19:44 | |
tristan | bochecha_, indeed, either the UX should work as you describe, or it shouldn't even let you start editing the file at all where you don't have permission | 20:44 |
bochecha_ | I've edited files in repos where I had 0 permissions | 20:44 |
bochecha_ | and that just made MRs | 20:44 |
*** bochecha_ has quit IRC | 22:43 | |
*** alatiera_ has quit IRC | 23:29 | |
*** tristan has quit IRC | 23:51 |