*** xjuan has quit IRC | 01:50 | |
*** jonathanmaw has joined #buildstream | 07:51 | |
*** jonathanmaw has quit IRC | 08:03 | |
*** jonathanmaw has joined #buildstream | 08:03 | |
*** tristan has joined #buildstream | 08:04 | |
*** ChanServ sets mode: +o tristan | 08:05 | |
*** tlater has joined #buildstream | 08:14 | |
gitlab-br-bot | push on buildstream@dual-cache-keys (by Jürg Billeter): 20 commits (last: context.py: Add strict_build_plan option) https://gitlab.com/BuildStream/buildstream/commit/7f4e2618fc091b44a43c6c5ac6f8fd54f8518fbc | 08:27 |
---|---|---|
gitlab-br-bot | buildstream: merge request (dual-cache-keys->master: Dual cache keys) #55 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/55 | 08:27 |
juergbi | i had to fight against cyclic python imports, solved it by moving ArtifactError to exceptions.py. i assume that's ok | 08:30 |
tristan | juergbi, I had similar problem, I moved it there and back once | 08:33 |
tristan | juergbi, to be honest I'm hoping to do an "API sweep" before any stable release and remove as much as possible | 08:34 |
tristan | from public surfaces | 08:34 |
tristan | what I found in the ArtifactError case, was that it was importing Element | 08:34 |
tristan | which was unneeded really | 08:35 |
tristan | juergbi, anyway it's not a huge deal for right now | 08:35 |
tristan | maybe all these exceptions should be public eventually anyway; I'm thinking in an eventual future I want to move all of the highlevel ops stuff out of _pipeline.py and put that somewhere else; maybe even in context.py or such | 08:37 |
tristan | The scheduler should be a detail of context.py | 08:38 |
tristan | And the pipeline should really just be the loading of a fully resolved project | 08:38 |
tristan | (this is thinking towards recursive pipelines; where I want to be able to "add a pipeline to a scheduler") | 08:38 |
tristan | and the Pipeline I would like to be public to allow external projects to load and iterate over elements | 08:39 |
*** dabukalam has left #buildstream | 08:39 | |
tristan | but anyway that's just my pipedream of pipelines, for immediate future; just private as much as possible; and location of ArtifactError is not hugely important | 08:40 |
juergbi | yes, it's still private. i was a bit surprised how painful python can be in this regard | 08:50 |
tristan | yeah in 3.5 circular imports are supported | 08:51 |
tristan | which caused me to mess up before and introduce them | 08:51 |
tristan | but, it's all around saner to avoid them | 08:51 |
tristan | circular imports tend to turn code into spaghetti :) | 08:52 |
juergbi | yes, i like clear module structure but with internal functionality split across a couple of modules, you sometimes want python --no-strict ;) | 08:54 |
juergbi | i'm actually on python 3.6, doesn't change anything in this regard | 08:55 |
tristan | oh strange, I have that somehow disabled by default with python 3.5 | 08:55 |
tristan | maybe they became stricter | 08:55 |
tristan | anyway we support 3.4, and if we have fuzz of internal functionality split across modules; maybe we should fix that anyway | 08:56 |
tristan | I wonder if it was a mistake to start letting the artifact cache work out what refs to commit based on the element object in the first place | 08:56 |
tristan | but meh | 08:56 |
juergbi | yes, i was actually thinking about the element/artifactcache boundary as well recently | 08:57 |
tristan | it's not hugely important but I guess if it were a bit more explicit, it *might* (only maybe) make things more straightforward for implementing a second artifact cache | 08:57 |
juergbi | but it's not public api, we can change things later as we see fit | 08:57 |
tristan | indeed | 08:57 |
*** ssam2 has joined #buildstream | 09:11 | |
* tristan has a variant conditional source | 09:16 | |
tristan | I need 0.36 branch of vala for GNOME SDK | 09:17 |
tristan | And running `bst track --variant flatpak core-deps/vala.bst` works like a charm :) | 09:17 |
tristan | jonathanmaw, so... as far as I know SIGTSTP is not handled whilst shelling into a failed build environment, we dont have handlers in place for that | 09:35 |
tristan | jonathanmaw, but one thing important to note, this started happening when we changed the sandbox spawning for interactive mode; to not create a new session for the bwrap process | 09:36 |
tristan | So, if a child process is in the same session, it generally means you will receive the unix signals it receives too | 09:36 |
tristan | However, changing it so that we run the interactive sandbox directly in the same session; caused the terminal to be functional | 09:37 |
jonathanmaw | ah, ok | 09:38 |
tristan | The bug that happens with non-functional terminal, is well; shitty terminal first of all (no tab completion and such), and also ^C causes the shell to abort | 09:38 |
tristan | So, yeah it's possible that a SIGSTOP is being issued from bash itself | 09:38 |
tristan | at exit time ? | 09:38 |
tristan | Weird but possible ? | 09:38 |
tristan | If that's the case, then we may not have a workaround for this, and might be stuck implementing namespaced functional terminal "properly" | 09:39 |
tristan | which is... a hassle | 09:39 |
tristan | it means we need to have another process marshalling data to and from the namespaced child process | 09:39 |
tristan | that's more or less what I gathered from the kernel experts in OFTC irc, but I didnt take the time to understand it in detail | 09:40 |
jonathanmaw | ok. | 09:41 |
jonathanmaw | and iiuc, https://gitlab.com/BuildStream/buildstream/commit/c14424ed1e8c457b4a3a886824c2885d0c82c2a4 is the commit where sandboxing stopped using a new session | 09:42 |
tristan | not really, it's one of them | 09:43 |
tristan | that one does the signal handling also conditional on interactive mode | 09:43 |
tristan | the original one only did new_session | 09:43 |
tristan | jonathanmaw, I feel like maybe this is too involved, it really is hard to solve - and might mean really understanding linux namespaces and how to setup a functional console in a namespaced process with everything unshared | 09:45 |
tristan | maybe I threw you a curve ball | 09:45 |
jonathanmaw | ok, for the moment I'll try and find out where the SIGSTOP is coming from. | 09:46 |
tristan | yeah that might help, I'm not even really sure how to trace that | 09:46 |
tristan | maybe strace -f will help with that, though | 09:46 |
tristan | (does strace record signals though ?) | 09:46 |
jonathanmaw | tristan: I believe so | 09:46 |
tristan | I think I suspected a TSTOP was coming from that process, and that's why I removed the _signals.suspendable() in interactive mode | 09:48 |
tristan | but then I left that out because anyway it makes more sense to not do that signals stuff (which is intended for the child tasks) directly in the parent session | 09:48 |
jonathanmaw | hmm, strace seems to break fuse in some way. I can confirm that strace is capable of picking up SIGSTOP, though. | 09:57 |
jonathanmaw | it says which signal the PID came from, and there's si_code in there as well, but I'm not sure what that part means. | 09:58 |
jonathanmaw | aha, man 2 sigaction seems to be informative | 10:00 |
tlater | I'm having trouble building the gnome-modulesets base system. It looks like it doesn't contain a /usr/bin directory... | 10:06 |
tristan | hmmm | 10:06 |
tristan | that wouldnt make much sense | 10:07 |
tlater | I don't set split-rules anywhere, either | 10:07 |
tristan | split-rules should not effect that | 10:08 |
tristan | tlater, always assume that I need all the specific details | 10:09 |
tristan | log file can help, maybe | 10:09 |
tristan | tlater, but in anycase, once you get passed base-configure.bst you should have /usr/bin | 10:09 |
tristan | it's a debian import | 10:09 |
tristan | maybe there was an error on the latest debian import ? | 10:09 |
tlater | Ah, I don't get past base-configure | 10:10 |
tlater | Because it's lacking ln -s | 10:10 |
tristan | no | 10:10 |
tristan | it's not lacking ln -s | 10:11 |
tristan | tlater, maybe there was an error on the last debian import, or the debian import you are using | 10:11 |
tristan | base-system.bst is the import element | 10:11 |
tristan | that downloads a debian image created by multistrap | 10:11 |
tristan | I dont know what ref you have in there | 10:12 |
tristan | but I suspect it's a bad | 10:12 |
tristan | one | 10:12 |
tlater | `bst track` should solve that, shouldn't it? | 10:12 |
tristan | tlater, `bst track base/base-system.bst` should get you the latest ref | 10:13 |
tristan | whether the latest ref is broken or not, I dont know | 10:13 |
tristan | I dont usually expect them to fail | 10:13 |
tristan | but, the other day there was a disk out of space condition | 10:13 |
tlater | I'll try that again, though it didn't work yesterday. | 10:13 |
tristan | tlater, you can also try 37fea071158573b62f186f03cdbc450ddefcc5c40ed622acb2d4eb8dd2eed482 | 10:13 |
tristan | `bst track base/base-system.bst` didnt work ? | 10:14 |
tlater | It didn't solve the problem, it was functional | 10:14 |
* tristan doesnt understand | 10:15 | |
tlater | I had the same build issue yesterday, and tried `bst track` just in case something was wrong with my ref - it didn't solve the build issue then. | 10:15 |
tlater | Though it did successfully update the ref | 10:17 |
tristan | tlater, try 37fea071158573b62f186f03cdbc450ddefcc5c40ed622acb2d4eb8dd2eed482 | 10:17 |
tlater | I will in a second, waiting for it to download things :) | 10:17 |
tristan | this looks successful https://gnome7.codethink.co.uk/debootstrap-logs/debootstrap-ostree-2017-07-14-103716-amd64.txt :-/ | 10:18 |
tristan | yesterday fail: https://gnome7.codethink.co.uk/debootstrap-logs/debootstrap-ostree-2017-07-13-100406-amd64.txt | 10:19 |
tlater | Hm, perhaps I just happened to get that twice somehow. | 10:19 |
tristan | day before successful https://gnome7.codethink.co.uk/debootstrap-logs/debootstrap-ostree-2017-07-12-100401-amd64.txt | 10:19 |
tristan | 11th successful | 10:20 |
tristan | yeah it could be | 10:20 |
tristan | we're probably ignoring the return value of multistrap in debootstrap-ostree scripts | 10:20 |
tristan | that could use a patch | 10:20 |
tristan | ssam2, maybe artifact push errors should not be fatal https://gitlab.com/baserock/definitions/-/jobs/22481024 ? | 10:23 |
tristan | just a thought | 10:23 |
tlater | Yeah, this seems to have worked. The build is taking significantly longer at any rate. Thanks tristan :) | 10:25 |
ssam2 | tristan, oh I didn't notice that was the issue | 10:25 |
ssam2 | tristan, they should be highlighted better, at least! | 10:25 |
ssam2 | tristan, but in this case I do want it to fail if it can't push. maybe that should be an option | 10:26 |
ssam2 | FWIW I looked at adding a 'preflight' artifact cache check, but it turned out quite difficult | 10:26 |
ssam2 | probably needs a 'ping' option adding to bst-artifact-receive | 10:26 |
ssam2 | will open an issue | 10:27 |
tlater | [c8dd6b30][ push:base/ninja.bst ] WARNING Not pushing tainted artifact. | 10:36 |
tlater | \o/ | 10:36 |
juergbi | tristan: do you agree with my latest comment on the MR regarding push? | 10:40 |
tristan | juergbi, so... | 11:29 |
tristan | juergbi, I was looking at https://github.com/dbnicholson/ostree-push/blob/master/TODO | 11:30 |
tristan | juergbi, from what I understand, there is *some*, I'm not sure how much it covers, we should look deeper | 11:30 |
tristan | juergbi, specifically the third point | 11:31 |
tristan | juergbi, in our case; as I understand it at least; we push two separate refs which have all the same objects | 11:31 |
tristan | the second push should only push the ref | 11:31 |
tristan | and indeed as I understand the code; we push everything at the tip of a commit through the tar streaming thing | 11:31 |
tristan | not a subset of only what the server lacks | 11:32 |
juergbi | yes but as i mentioned in my comment, the bulk of the code already supports pushing multiple refs at once (with identical or differing commits) | 11:32 |
tristan | Ah | 11:32 |
juergbi | i.e., using that to avoid the weak ref penalty should be trivial | 11:32 |
juergbi | the third TODO point would still be nice to have but less crucial/urgent | 11:32 |
tristan | juergbi, Okay yes; I agree | 11:32 |
tristan | the third point would still be a great optimization | 11:32 |
juergbi | yes, definitely | 11:32 |
tristan | ok | 11:33 |
tristan | So... juergbi are we about ready to merge that ? | 11:33 |
tristan | I would like to merge it first because you and tlater have this conflict with artifact data... would be good to have him rebase immediately after merge; and have cache keys recorded in some main artifact.yaml along with some other attributes, like workspaced: | 11:34 |
tristan | And I think tlater is getting also very close to a mergeable branch | 11:34 |
juergbi | essentially yes. let me test this push change and then it's hopefully good to go | 11:35 |
juergbi | ah, actually, i forgot to make the change from keys.yaml to artifact.yaml | 11:35 |
juergbi | probably makes sense to do this now before the merge instead of in tlater's branch | 11:35 |
tristan | tlater, btw if you are going to issue that warning (I think it's a good warning to have too), ensure that Element._push() returns False for that, and then not-pushed artifacts will show up as skipped elements in the queue | 11:35 |
tristan | juergbi, sure, sounds like a simple rename but will be easier for tlater to follow | 11:36 |
juergbi | yes and we don't break artifact metadata that way | 11:36 |
tristan | that too :) | 11:36 |
tristan | juergbi, side note; you may have noticed PushExistsException, which is an addition of my own | 11:39 |
tristan | it happens when you try to push the same artifact twice | 11:40 |
tristan | in our case; it will mean we cannot truly use branches for weak artifacts, they will silently fail as non-fatal pushes | 11:40 |
juergbi | hm, right, this is an issue | 11:41 |
tristan | juergbi, you probably have to make sure that when we push the two refs, the strong key push is not aborted just because a weak key already exists | 11:41 |
tristan | I think though, not using branches is sensible, especially since you already prioritize the ideal key at pull time | 11:41 |
juergbi | we should update the existing weak ref, though | 11:42 |
tristan | because we cant really know what is a sensible branch tip, the idea of age also does not make sense | 11:42 |
tristan | We could | 11:42 |
tristan | and it may make sense most of the time | 11:42 |
tristan | but not truly, i.e. you could have the same weak key on two different branches | 11:42 |
tristan | i.e. two buildstream project branches, could produce same weak key for a given element | 11:43 |
tristan | and neither of them is "better" | 11:43 |
juergbi | yes, there are no guarantees | 11:43 |
tristan | I think it's not worth updating the branch tip | 11:44 |
juergbi | ok, so let's just fix it such that it doesn't abort due to pre-existing weak ref | 11:44 |
tristan | especially that even if the concept of "age" worked "most of the time", we dont know that two separate users had todays master sources of the same buildstream project branch when pushing a weak ref | 11:45 |
tristan | we cannot even deduce a correct age | 11:45 |
juergbi | true but i would say in most typical scenarios, latest push winning is in average the better end result than the first push winning | 11:46 |
tristan | :) | 11:46 |
tristan | yeah | 11:47 |
tristan | juergbi, well take it this way; its a bit complicated to do; because you need to know the parent ref you want to push it as (and you dont) | 11:47 |
tristan | and it's low priority because of all the uncertainties | 11:47 |
tristan | juergbi, if you can easily make it do that, go ahead, if not; put it aside and file a bug | 11:48 |
tristan | good ? | 11:48 |
juergbi | yes, sounds fine to me | 11:48 |
tristan | maybe server side you can force it by getting the latest branch tip | 11:48 |
tristan | but then, concurrency is... a bit of an issue | 11:49 |
tristan | also, server side changes need a bump in the protocol version, and we need to update buildstream installations on cache servers (not really an issue right now, there exist exactly 2 artifact servers) | 11:49 |
juergbi | RuntimeError: Click will abort further execution because Python 3 was configured to use ASCII as encoding for the environment. Consult http://click.pocoo.org/python3/for mitigation steps. | 12:29 |
* juergbi checks python config | 12:29 | |
ssam2 | you need to force LANG=C.UTF-8 and LC_ALL=C.UTF-8 | 12:30 |
juergbi | ssam2: regular xx_XX.UTF-8 locale should work as well, shouldn't it? | 12:31 |
ssam2 | yeah | 12:31 |
juergbi | i get this on the SSH receive side. my locale is only set via login shell, i suppose, but not when directly executing the receive process | 12:31 |
juergbi | not sure where to properly fix that | 12:32 |
ssam2 | in a way it's a bug in the host OS | 12:32 |
ssam2 | e.g. https://github.com/fedora-cloud/docker-brew-fedora/issues/14 | 12:33 |
ssam2 | (comment #3) | 12:33 |
juergbi | ta | 12:34 |
juergbi | ssam2: do you know where LANG is set on host OS without this issue? | 12:37 |
tristan | oh :-/ | 12:37 |
juergbi | non-interactive non-login shell doesn't read any bashrc or profile, does it? | 12:37 |
tristan | I dont think so | 12:37 |
juergbi | i.e., sounds like i have to fix this either in my sshd config or maybe PAM | 12:37 |
tristan | it's similar situation for cron jobs | 12:37 |
ssam2 | is /etc/locale.conf relevant ? | 12:38 |
juergbi | that has the UTF-8 locale | 12:38 |
ssam2 | ah, obviously not then | 12:38 |
tristan | I feel like the right way to avoid this is to use click in a different way | 12:38 |
juergbi | even the sshd process has LANG set. but sshd doesn't pass it along to the new process | 12:38 |
tristan | Like, what do we care about utf8 on the bst-artifact-receive side ? | 12:39 |
* tristan greps click | 12:39 | |
juergbi | while we probably don't need anything beyond ASCII in this instance, i don't see it as an issue in that | 12:39 |
tlater | Hm, is there a simple way to remove a specific artifact from the cache? I'm guessing this is backed by an ostree repo locally as well. | 12:41 |
tlater | Just for testing - not particularly fond of re-building the base system again. | 12:41 |
juergbi | tlater: rm the file in refs/head/... | 12:41 |
tlater | Thanks :) | 12:42 |
juergbi | refs/heads/PROJECT/ELEMENT/key | 12:42 |
tristan | ugh | 12:42 |
tristan | no simple way out of this, this is about python trying to act smart about unicode, rather than whether you actually need to display unicode characters in output | 12:43 |
tristan | tlater, I usually use `ostree refs --list ~/.cache/buildstream/artifacts/ostree` to list the refs | 12:43 |
tristan | sorry + '--repo ~/.cache...' | 12:44 |
tristan | tlater, then I do `ostree refs --repo ~/.cache... --delete <ref>` | 12:44 |
tristan | you can also then prune the repo | 12:44 |
tristan | anyway | 12:44 |
tristan | juergbi, ok so I made a change to use click instead of argparse, and that was mostly because man page generation was breaking | 12:45 |
tristan | but man page generation is half broken anyway | 12:45 |
tristan | juergbi, we can revert that to argparse if it's an issue; it seems the click_man extension just takes whichever setup.py 'entry_point' comes first in the arbitrarily ordered list of entry points | 12:46 |
tristan | so I end up running click_man extension multiple times until it choses to generate pages for `bst` instead of `bst-artifact-receive` | 12:47 |
juergbi | well, the whole thing might be a config issue on my system, so i wouldn't jump to that | 12:47 |
tristan | if it hits bst-artifact-receive, it throws an ugly error if it's argparse | 12:47 |
tristan | yeah, it's working on gnome7 and on the baserock cache :) | 12:47 |
tristan | juergbi, do you want super power on gnome7 ? | 12:48 |
juergbi | i've fixed / worked around it locally now | 12:49 |
juergbi | i could check how sshd is configured on gnome7 but other than that, i don't think i need super powers there | 12:50 |
tristan | alright | 12:50 |
tristan | I can easily add you a user if you need though | 12:50 |
juergbi | ok, thanks | 12:51 |
juergbi | push to local test server is working :) | 12:51 |
tristan | I'm going to disappear shortly... | 12:51 |
gitlab-br-bot | buildstream: issue #45 ("Artifact pull jobs say "SUCCESS" even when no artifact was fetched") changed state ("opened") https://gitlab.com/BuildStream/buildstream/issues/45 | 12:51 |
gitlab-br-bot | push on buildstream@dual-cache-keys (by Jürg Billeter): 20 commits (last: widget.py: Use _get_full_display_key()) https://gitlab.com/BuildStream/buildstream/commit/ebfdb5905659f512ae773b4bf52cff29280fc746 | 12:52 |
gitlab-br-bot | buildstream: merge request (dual-cache-keys->master: Dual cache keys) #55 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/55 | 12:52 |
tristan | damn vte doesnt want to build, with exactly the same runtime, and compile / link line as with flatpak-builder, where it succeeds | 12:52 |
juergbi | how does it fail? | 12:53 |
* tristan will have to dig a bit deeper into flatpak-builder over the weekend | 12:53 | |
juergbi | i've pushed a branch update. shall we merge after integration tests pass? | 12:53 |
tristan | like this: /usr/lib/gcc/x86_64-unknown-linux/6.2.0/../../../../x86_64-unknown-linux/bin/ld: .libs/libvte_2_91_la-vtegtk.o: relocation R_X86_64_PC32 against undefined hidden symbol `_vte_marshal_VOID__STRING_BOXED' can not be used when making a shared object | 12:53 |
juergbi | it could definitely use more people testing but i'm not aware of any pending issues | 12:53 |
* tristan hits the fancy blue button on the web UI | 12:54 | |
jonathanmaw | hrm, strace breaks fuse, by the looks of it. | 12:54 |
juergbi | \o/ | 12:54 |
tristan | and then: /usr/lib/gcc/x86_64-unknown-linux/6.2.0/../../../../x86_64-unknown-linux/bin/ld: final link failed: Bad value | 12:54 |
jonathanmaw | I can't read the full backtrace because the output overwrites it with the progress indicator | 12:55 |
tristan | tlater, so you are going to have to rebase against that, and if you have a revision by the end of the day, I will try to merge it over the weekend | 12:55 |
tristan | jonathanmaw, bst | cat | 12:55 |
jonathanmaw | neat | 12:55 |
tristan | There is a bug, hitting control-C with bst | cat will cause a BrokenPipeError that is not handled :-/ | 12:56 |
tristan | but, doesnt really matter much :) | 12:56 |
tristan | jonathanmaw, and for you... I will take a look also and if there is nothing crazily outstanding, I'll also merge the dpkg plugins | 12:56 |
tlater | Should hopefully manage, my only problem at this point is that artifacts stay tainted after re-building. Which is obviously a stupid oversight, but I keep deleting the wrong artifacts -.- | 12:57 |
tristan | tlater, dont forget to remove the coresponding extract directories | 12:57 |
tristan | tlater, that's what is biting you, sorry I didnt think of that earlier | 12:57 |
tristan | tlater, if the artifact already has an extract directory, I think we dont even bother looking at the ostree and just take that | 12:58 |
tlater | Well, I just accidentally deleted the full ostree repo for this project, twice in a row. That's not the behavior I saw, unfortunately. | 12:58 |
juergbi | extract directory names are based on ostree commit instead of cache key now with my branch | 12:59 |
tristan | tlater, I dont understand what is "stay tainted after rebuilding" | 12:59 |
tristan | tlater, if a given cache key produces a tainted artifact; that cache key should always be a tainted artifact | 12:59 |
tristan | juergbi, ahhh interesting :) | 12:59 |
tlater | But only that one key, atm new keys also become tainted | 13:00 |
tlater | Even without a workspace | 13:00 |
juergbi | ah, that's odd | 13:00 |
tristan | juergbi, not entirely necessary I think, though; its all hardlinks | 13:00 |
tristan | I guess we save a bit of overhead though | 13:00 |
juergbi | tristan: the reason was to avoid creating extract dirs with weak cache names | 13:00 |
tristan | tlater, because they depend on that tainted artifact ? | 13:00 |
juergbi | tristan: we don't always know the strong key from the artifact before we actually extract it | 13:01 |
tristan | juergbi, anyway no problem with that :) | 13:01 |
tlater | tristan: No, because the associated element loads the artifact metadata of the old artifact, and fails to delete it when building a new artifact. Just a bit of debugging... | 13:02 |
tristan | mmmm | 13:02 |
gitlab-br-bot | buildstream: merge request (dual-cache-keys->master: Dual cache keys) #55 changed state ("merged"): https://gitlab.com/BuildStream/buildstream/merge_requests/55 | 13:02 |
gitlab-br-bot | push on buildstream@master (by Tristan Van Berkom): 20 commits (last: widget.py: Use _get_full_display_key()) https://gitlab.com/BuildStream/buildstream/commit/ebfdb5905659f512ae773b4bf52cff29280fc746 | 13:02 |
gitlab-br-bot | buildstream: Jürg Billeter deleted branch dual-cache-keys | 13:02 |
tristan | tlater, take cache key resolution time into consideration, and you might want to rebase before resolving that because it may have changed slightly | 13:03 |
tristan | tlater, or wildly | 13:03 |
tlater | Probably a good call | 13:04 |
tristan | tlater, probably you want to ensure that... we never try to load the 'workspaced' attribute from an artifact we depend on, until the moment we try to store the element in the local cache | 13:04 |
tristan | probably we need some better assertions and policy about when to load artifact metadata across the board | 13:05 |
tristan | we have an assertion for public data | 13:05 |
tlater | Wouldn't that be a bit messy if we needed different kinds of artifact data? | 13:07 |
tlater | ^ Only loading that particular attribute | 13:07 |
tristan | tlater, it's never correct to load artifact metadata before we know the correct cache key for the artifact | 13:07 |
tristan | tlater, and cache keys are resolved late with --track enabled, and also juergbi's branch changes that dramatically | 13:08 |
tlater | I guess I'll have to see juergbi's branch | 13:08 |
tristan | the "cache key for build" is in fact the effective key | 13:08 |
tristan | juergbi, I suddenly think that's a better name :) | 13:08 |
tristan | "effective key" | 13:08 |
juergbi | it's only the effective key if we're not using a previously built artifact | 13:09 |
tristan | really ? | 13:09 |
tristan | how does that change ? | 13:09 |
juergbi | in non-strict mode, the strong key in the previously built artifact may differ from the strong key that would be in an artifact if we were to build it now | 13:10 |
juergbi | as it may have been built with different dependency versions | 13:10 |
tristan | juergbi, I understand 3 keys... weak key (without deps) ... strong key (the idealistic key that we would produce when building in strict mode) ... and the effective key (the artifact we are really going to produce, which only might be the strong key) | 13:10 |
tristan | Is that not correct ? | 13:10 |
juergbi | this is correct but once we build the artifact, that effective key is stored in the artifact | 13:11 |
tristan | juergbi, and... in general... we only ever want to load metadata from the effective key | 13:11 |
juergbi | and when we use that artifact in a later session again, we load it from that artifact. and that key may be different than the other 3 | 13:11 |
juergbi | i.e., if you only look at a single session, you 'only' have 3 keys | 13:11 |
tristan | :-/ | 13:12 |
juergbi | but across sessions in --no-strict mode, 4 keys are relevant | 13:12 |
tristan | juergbi, we are storing the "cache key for build" in the artifact, and caching it as the "cache key for build", correct ? | 13:13 |
tristan | i.e. it's the real cache key of any produced artifact, whether or not it happens to be the ideally strong key | 13:13 |
juergbi | yes, we always store that key in the artifact metadata when we build an artifact | 13:13 |
tristan | And that is based on the effective keys of the dependencies | 13:13 |
tristan | + the weak key | 13:14 |
juergbi | yes | 13:14 |
tristan | So whether or not I built a dependency in this session, my effective key will be the same | 13:14 |
tristan | as long as I'm producing the same thing of course | 13:14 |
juergbi | yes | 13:15 |
tristan | So what is the fourth key ? | 13:15 |
juergbi | when an element is not being built in the current session. when you load it from the cache, you can't calculate the effective key | 13:16 |
juergbi | you have to read it out from the artifact metadata | 13:16 |
tristan | Correct | 13:16 |
juergbi | that read out key is different than all 3 calculated keys mentioned above | 13:16 |
juergbi | or rather, may be | 13:16 |
tristan | juergbi, which means you can never know what your effective key is going to be, until you have all the effective keys of your dependencies | 13:16 |
juergbi | correct | 13:17 |
tristan | But in my mind, that does not mean there is a fourth key | 13:17 |
juergbi | not even that is sufficient | 13:17 |
tristan | rather, the 3rd key cannot be allowed to exist until it's known | 13:17 |
juergbi | with 3rd key you mean _for_build? | 13:18 |
tristan | exactly | 13:18 |
tristan | yes | 13:18 |
juergbi | yes, that function must only be called if we're building that element | 13:19 |
juergbi | in that sense, there aren't 4 keys used at the same time, you're right | 13:19 |
juergbi | but there are 3 functions to calculate a key and the effective key may be different from all of those | 13:19 |
tristan | The source of the effective key changes, but it is the same thing, it is the effective key | 13:19 |
tristan | I dont see it that way | 13:20 |
juergbi | it could be called that, yes | 13:20 |
tristan | that is a confusing way to think about it | 13:20 |
tristan | really | 13:20 |
tristan | I mean: "there are 3 functions to calculate a key and the effective key may be different from all of those" | 13:20 |
tristan | Rather, there are 3 ways to reach an effective key A.) strict mode, know the key right away (unless delayed by --track) | 13:21 |
tristan | B.) You are building everything in --no-strict mode, so you could technically know what they are going to be before hand (but not really necessary to do so, because who needs a third code path, we can just encode effective key into artifact and read it back later on) | 13:22 |
tristan | C.) You are downloading pieces as you go along | 13:22 |
tristan | 3 functions to produce the effective key, sure | 13:22 |
tristan | However... regardless of how it was calculated, if the dependencies were stacked the same way, and the inputs are the same, the effective key is always the same | 13:23 |
tristan | cannot differ | 13:23 |
juergbi | if you're rebuilding the element in question | 13:24 |
tristan | juergbi, how is there 3 functions to calculate a key *not counting* the effective key ? | 13:24 |
juergbi | we're comparing the case of rebuild or no rebuild, though | 13:24 |
tristan | I am suddenly very lost, what does "rebuild" entail ? | 13:25 |
tristan | To me it means, reproducing the same thing on another computer, or "I lost an artifact in my cache" | 13:25 |
juergbi | sorry, should be just build or no build (use cache) | 13:25 |
tristan | Ok | 13:25 |
juergbi | weak key, idealistic strong key, and for_build are the three functions that calculate keys. _from_artifact is the function that loads the effective key | 13:26 |
juergbi | it may be possible to merge the latter two with appropriate checks of the context | 13:26 |
tristan | _from_artifact() and _for_build() are not the same, but they both produce the effective key | 13:26 |
tristan | I see what you mean now | 13:26 |
tristan | It would make sense if we can look at those as the same yes | 13:27 |
juergbi | yes, might clear up some confusion | 13:27 |
tristan | i.e. if we could ensure that every time _for_build() is used, it only ever takes _from_artifact() as input | 13:27 |
tristan | maybe that's a stretch | 13:27 |
tristan | juergbi, sorry I lost my mind for a moment there, I mean... it's a very sensitive/crucial thing | 13:28 |
tristan | needs to be understood well | 13:28 |
juergbi | _from_artifact() can't be used yet when _for_build() is used | 13:28 |
juergbi | as there is no cached artifact at that point | 13:28 |
juergbi | but because of this, we might actually be able to merge them into a get_effective_key() function that always does the right thing | 13:29 |
tristan | juergbi, _from_artifact() (of a dependency) *is* used to produce _for_build() (of the element we're building), isn't it ? | 13:29 |
juergbi | yes, for dependencies | 13:29 |
juergbi | i meant, you can't use it for the element itself | 13:29 |
tristan | that's what I meant by "if we could ensure that every time _for_build() is used, it only ever takes _from_artifact() as input" | 13:30 |
tristan | of course | 13:30 |
juergbi | that's already the case | 13:30 |
tristan | Ok that's a good thing | 13:30 |
tristan | juergbi, because it's not strictly necessary | 13:30 |
tristan | juergbi, when you know in advance that you will be building an element, you can at that time calculate cache keys of reverse dependencies based on it's _for_build() key | 13:31 |
tristan | (cause you know what the _from_artifact() *will be* for that element) | 13:31 |
tristan | But, if it's already the case that it doesnt happen, that is cleaner I think, yes. | 13:31 |
juergbi | you can do it once you're sure you have all the dependencies in the version you will actually use | 13:31 |
juergbi | i.e., we couldn't do it early on | 13:31 |
juergbi | due to pulling | 13:32 |
tristan | we could if you dont have a pull queue | 13:32 |
juergbi | yes | 13:32 |
tristan | or once elements have all passed through the pull queue :) | 13:32 |
tristan | But lets not | 13:32 |
tristan | I like that invariant anyway | 13:32 |
juergbi | agreed, don't add unnecessary code paths | 13:32 |
tristan | yeah, anyway I had the experience already of adding code to ensure exactly that premature cache key calculation wouldnt happen (i.e. track queues in play) | 13:34 |
tristan | So I get paranoid about that, to me it feels like there is always a risk that an element has cached a key value before it's correct | 13:34 |
tristan | anyway, enlightening conversation; I think that wrapping these around an _effective_key() method which has a decent comment; and using the effective key, might help things be more readable | 13:35 |
juergbi | i actually added explicit recalculate calls to the pullqueue to be sure | 13:36 |
tristan | when you end up reading the note around _effective_key(), then you understand the gruesome details surrounding that, but not before hehe | 13:36 |
tristan | yeah | 13:36 |
tristan | I saw that in the branch | 13:36 |
* tristan has to exit now | 13:37 | |
*** tristan has quit IRC | 13:40 | |
tlater | juergbi: What is supposed to happen in `_get_cache_key_from_artifact` if meta['keys'] is not defined? | 13:45 |
juergbi | that would be an incompatible cached artifact | 13:46 |
tlater | Ah, never mind, I had a local artifact cache I forgot to remove | 13:46 |
juergbi | we really need to add artifact/key versions to avoid such issues | 13:46 |
tlater | Probably, yes | 13:46 |
tlater | Fixed my issue :) | 14:19 |
gitlab-br-bot | push on buildstream@workspaces (by Tristan Maat): 20 commits (last: element.py: Use _get_cache_key_for_build() for artifact.yaml) https://gitlab.com/BuildStream/buildstream/commit/f43aa7bbbf8cc02faecbeb7a279d47bf97f6beca | 14:32 |
gitlab-br-bot | buildstream: merge request (workspaces->master: WIP: Workspaces) #50 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/50 | 14:32 |
tlater | Should only build dependencies affect the tainted status of an element? | 14:41 |
ssam2 | hmmmmmmm | 14:44 |
ssam2 | I think 'element' is too generic a concept for us to assert that | 14:44 |
tlater | I suppose, some script elements may rely on their dependencies doing something | 14:45 |
tlater | Then again, that's also a build dependency | 14:45 |
ssam2 | yeah. and if the element is a system image, for example, then it includes elements that might be considered runtime dependencies of something else | 14:45 |
juergbi | build dependencies and their runtime dependencies | 14:45 |
juergbi | but direct runtime dependencies shouldn't matter | 14:45 |
juergbi | i.e., same logic as what's used to calculate the cache key | 14:46 |
juergbi | or in code that would be: element.dependencies(Scope.BUILD) | 14:47 |
tlater | Oh, that includes build dependencies' run dependencies? | 14:48 |
juergbi | yes, they are excluded only if you specify recurse=False | 14:52 |
gitlab-br-bot | push on buildstream@sam/artifactcache-preflight-check (by Sam Thursfield): 2 commits (last: Add initial `bst push` command) https://gitlab.com/BuildStream/buildstream/commit/2594a3109c08013b0c161f524f83ec2942657f41 | 14:52 |
gitlab-br-bot | push on buildstream@workspaces (by Tristan Maat): 1 commit (last: element.py: Make element dependencies affect taint status) https://gitlab.com/BuildStream/buildstream/commit/4f343992725db461d17c43fca198e52f0c0c7fcf | 15:01 |
gitlab-br-bot | buildstream: merge request (workspaces->master: WIP: Workspaces) #50 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/50 | 15:01 |
gitlab-br-bot | push on buildstream@workspaces (by Tristan Maat): 1 commit (last: element.py: Make element dependencies affect taint status) https://gitlab.com/BuildStream/buildstream/commit/7227cd41b8f593357df2c4c10d16c5b48c421de1 | 15:01 |
gitlab-br-bot | buildstream: merge request (workspaces->master: WIP: Workspaces) #50 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/50 | 15:01 |
gitlab-br-bot | buildstream: merge request (workspaces->master: Workspaces) #50 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/50 | 15:26 |
gitlab-br-bot | push on buildstream@sam/bst-push (by Sam Thursfield): 1 commit (last: Add initial `bst push` command) https://gitlab.com/BuildStream/buildstream/commit/b9bfe1c727b2c39bb824e3369ed9edc69b95a118 | 15:56 |
ssam2 | oh dear | 15:57 |
ssam2 | https://gitlab.com/baserock/definitions/-/jobs/22540811 | 15:57 |
ssam2 | this doesn't look good | 15:57 |
ssam2 | i did the same thing locally (built the system that is in the cache) and didn't get any errors | 15:58 |
ssam2 | but, this does look like a bst bug... | 15:58 |
ssam2 | maybe a race? | 15:58 |
ssam2 | seems to be non-fatal, actually... | 15:58 |
tlater | It might be a cache issue, considering the missing files | 15:58 |
*** jonathanmaw has quit IRC | 16:02 | |
*** tlater has quit IRC | 17:07 | |
gitlab-br-bot | buildstream: merge request (sam/bst-push->master: Add initial `bst push` command) #56 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/56 | 17:22 |
gitlab-br-bot | push on buildstream@sam/artifactcache-preflight-check (by Sam Thursfield): 20 commits (last: context.py: Add strict_build_plan option) https://gitlab.com/BuildStream/buildstream/commit/3f065aa70d9d5bb4088ef43b0552e1190764a54e | 17:24 |
gitlab-br-bot | buildstream: merge request (sam/artifactcache-preflight-check->master: Check for write access to remote artifact cache early on in the pipeline) #57 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/57 | 17:29 |
gitlab-br-bot | buildstream: issue #46 ("Strange artifact pull failures on GitLab CI") changed state ("opened") https://gitlab.com/BuildStream/buildstream/issues/46 | 17:32 |
*** ssam2 has quit IRC | 17:46 | |
*** tristan has joined #buildstream | 18:39 | |
*** ChanServ sets mode: +o tristan | 18:43 | |
*** tristan has quit IRC | 22:06 | |
jjardon[m] | mmm, Is this a bug or I have configured sth wrong? https://gitlab.com/baserock/definitions/-/jobs/22586590 | 23:45 |
Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!