| *** benschubert has quit IRC | 00:34 | |
| *** mohan43u has quit IRC | 01:40 | |
| *** mohan43u has joined #buildstream | 01:40 | |
| *** tristan has quit IRC | 06:50 | |
| *** benschubert has joined #buildstream | 07:37 | |
| *** tristan has joined #buildstream | 07:39 | |
| *** ChanServ sets mode: +o tristan | 07:39 | |
| *** samwilson has joined #buildstream | 07:43 | |
| tristan | benschubert, There is still probably a lot which I don't understand about cython... could you explain how it is that PARSE_CACHE here is bound to a given element, and not process global state? https://gitlab.com/BuildStream/buildstream/-/blob/master/src/buildstream/_variables.pyx#L223 | 08:21 |
|---|---|---|
| benschubert | tristan: It is global state | 08:22 |
| tristan | Oh wait | 08:22 |
| tristan | I understand now | 08:23 |
| tristan | It's a translation table of "%{varname}" -> "varname" | 08:23 |
| benschubert | right | 08:23 |
| benschubert | tht's actually a super comment we should add on top | 08:23 |
| benschubert | would help a lot | 08:23 |
| tristan | heh | 08:23 |
| tristan | Yeah the comment is confusing, speaking already of lifecycles of contained strings being tied to elements which use it | 08:24 |
| *** santi has joined #buildstream | 08:24 | |
| tristan | Or wait, its a translation to a list | 08:25 |
| tristan | So, it also indexes things like "%{foo} bar %{baz}" -> ["foo", "baz"] iiuc | 08:25 |
| benschubert | I amsure of the correctness of this part | 08:25 |
| tristan | yes that seems correct, when you try to expand something, you can have this form, so you would need a list of things which need to be resolved in order to resolve the thing | 08:26 |
| benschubert | *I am not sure | 08:26 |
| tristan | No ? | 08:26 |
| benschubert | mmh no actually, should work sorry | 08:26 |
| benschubert | this part of the code always gets me | 08:26 |
| tristan | Well, it appears correct so far, but I think I will need to understand a bit more :) | 08:26 |
| benschubert | don't hesitate to add comments there | 08:27 |
| * tristan will try to leave it in a more understandable state yeah | 08:27 | |
| benschubert | it's definitely a part of the coe I'm not comfortable with (most of the rest of the file is more direct at least) | 08:27 |
| tristan | may rename some variables too | 08:27 |
| tristan | Do you know if we support strings like "There are some { literal %{braces} in } here" ? | 08:29 |
| tristan | I think at first we didn't really support that | 08:29 |
| tristan | And I'm sure we never supported "%{var-%{name}}" dynamic variable name resolution | 08:30 |
| tristan | (nor do I think we need to) | 08:30 |
| tristan | Anyway, that's kind of orthogonal, I'll try to focus | 08:31 |
| tristan | The part of variables which always bites me is the regex, I kind of despise them :) | 08:32 |
| benschubert | > "%{var-%{name}}" definitely not. We should probably efend against it | 08:35 |
| benschubert | > "There are some { literal %{braces} in } here" ? should work correctly, cs-shadow did a fix for that recently | 08:35 |
| *** tomaz has joined #buildstream | 08:41 | |
| *** phildawson has quit IRC | 08:42 | |
| *** phildawson has joined #buildstream | 08:43 | |
| *** tristan has quit IRC | 08:52 | |
| juergbi | benschubert: have you seen https://github.com/cython/cython/issues/3515#issuecomment-646374155 ? adding a dummy inline function to node.pxd might work around the coverage 5 issues, which might allow getting py38 coverage to finally work | 09:02 |
| juergbi | haven't tested it yet | 09:02 |
| benschubert | That would be great news! | 09:09 |
| *** tristan has joined #buildstream | 10:16 | |
| *** ChanServ sets mode: +o tristan | 10:16 | |
| tristan | benschubert, you don't like the try / except KeyErrors ? I think that we save a line of code in loader.py with your suggestion, but in _project.py, it seems cleaner to me to have try/except KeyError no ? | 10:45 |
| tristan | To be honest, I've been hoping to make more explicit `try / except KeyError` statements throughout the codebase, to enforce a more "ask for forgiveness instead of permission" mentality/pattern everywhere | 10:46 |
| tristan | To many code blocks are doing "if (this_thing_exists): do_this_thing()" instead of "try: get_this_thing() / except KeyError: oops_nothing_here()" | 10:47 |
| tristan | resulting in a handful of redundant lookups | 10:47 |
| tristan | I think/hope that try/except coding style helps to change the readers mind about how to go about things | 10:48 |
| benschubert | tristan: I am not against try-except on the contrary! | 10:55 |
| benschubert | Just that when we do `try-except-pass`, I think that `suppress` is more explicit. the `pass` always seems like someone forgot to implement it | 10:56 |
| benschubert | if you find try-except-pass more redaable I'm fine with that too | 10:56 |
| tristan | Hah | 10:56 |
| tristan | Yeah I rather liked suppress in loader.py, in _project.py there is not pass | 10:57 |
| *** abderrahim[m] has joined #buildstream | 10:57 | |
| tristan | benschubert, however, looking again, I think even better is `junction_list = self._junction_duplicates.get(project_name, {})` | 10:58 |
| tristan | rendering the following loop a no-op and doing it in one-line | 10:58 |
| benschubert | definitely | 10:58 |
| tristan | I will also rename it junction_dict, since it is a dict of dicts afterall | 10:58 |
| benschubert | Seems good! | 10:59 |
| tristan | That used to be a list of tuples | 10:59 |
| tristan | But I changed it | 10:59 |
| abderrahim[m] | no gitlab bot :S | 11:15 |
| abderrahim[m] | tristan: I think you would love https://gitlab.com/BuildStream/buildstream/-/merge_requests/1970 | 11:16 |
| abderrahim[m] | please tell me if not :) | 11:16 |
| tristan | abderrahim[m], Well, removing `element._build_log_path = ...` from _artifact.py is already a great course correction ! | 11:22 |
| tristan | I'm unsure exactly everything that's going on in there | 11:22 |
| tristan | just a glance | 11:22 |
| tristan | Logging the build environment in element.assemble() rather than in buildjob also looks good to me | 11:23 |
| tristan | Oh ! that horrible hack from buildjob is going away ! | 11:24 |
| * tristan jumps for joy \o/ | 11:24 | |
| abderrahim[m] | I'd also appreciate a hint on where to put a regression test for https://gitlab.com/BuildStream/buildstream/-/merge_requests/1971 | 11:24 |
| tristan | ohhhh | 11:25 |
| tristan | abderrahim[m], you may want to wait for me to finish !1901, not sure... | 11:25 |
| tristan | abderrahim[m], yeah, I think you should wait for !1901 to land, I'm landing it quite presently | 11:26 |
| tristan | abderrahim[m], the point is that, an element path is not always the same, you can have multiple element paths from the command line which are all valid ways of reaching a given element | 11:27 |
| abderrahim[m] | ok, I'll try to see if !1901 fixes the issue then | 11:27 |
| tristan | So test cases for that should be parameterized to try paths (A) Absolute resolved paths, (B) Paths which involve links, (C) Paths which point to elements via overridden subprojects | 11:28 |
| tristan | benschubert, I just replied regarding the error messages, if you have any thoughts on how to represent these strings (or agree with the approach), let me know: https://gitlab.com/BuildStream/buildstream/-/merge_requests/1901/#note_366414606 | 11:31 |
| tristan | I'll try to bang it out now | 11:31 |
| benschubert | tristan: answered | 11:34 |
| benschubert | I agree with the approach and if you'd rather have it like that I'm also fine with it | 11:34 |
| tristan | And answered your question :) | 11:40 |
| tristan | So the whole error message is about a "project" being defined multiple times, where we display every instance - it is indeed left up to the user to know *why*, that is probably a shortcomming of the message | 11:41 |
| tristan | Maybe a standard paragraph along with the error would be good. | 11:41 |
| benschubert | or a link to the docs? | 11:42 |
| benschubert | it might get out of date but it is a nice way of helping | 11:42 |
| tristan | Hmmm | 11:43 |
| tristan | Referring to urls in error messages | 11:43 |
| benschubert | anyways I'll be back in a moment, I'm off for food | 11:43 |
| tristan | I'm hoping we don't have that pattern too much | 11:43 |
| tristan | Yup | 11:43 |
| benschubert | I don't think we have | 11:43 |
| tristan | benschubert, We have an agreement to do so for one case | 11:43 |
| tristan | And I think it's exceptional | 11:43 |
| benschubert | ok, let's not do that then | 11:44 |
| tristan | I.e. it's regarding parallel installation of bst-1 / bst-2 | 11:44 |
| tristan | Enjoy your dinner :) | 11:44 |
| * tristan thinks he can capture the important bit with one sentence | 11:45 | |
| tristan | "The following instances of project 'foo' were encountered: [ insert list ] .... \n\n Internal projects do not cause any conflicts. Conflicts can also be avoided by marking every instance of the project as a duplicate." | 11:47 |
| tristan | Two short sentences | 11:48 |
| benschubert | tristan: I like that | 12:22 |
| tristan | benschubert, juergbi, I think I've addressed the majority of comments on the !1901 branch now | 12:44 |
| tristan | juergbi, do you have strong feelings about https://gitlab.com/BuildStream/buildstream/-/merge_requests/1901/#note_366357312 ? Would you like additional comments explaining that alternative parents are needed in order to access the entire ancestry ? | 12:45 |
| tristan | As I mentioned in the comments, I would rather have that function stand on it's own, and not speak about internal/duplicates as one of it's use cases | 12:45 |
| tristan | How about, I'll let you guys think on it a bit more, and in the absence of further comments on the MR I will merge tomorrow morning Korea time ? | 12:47 |
| *** tristan has quit IRC | 13:11 | |
| *** phildawson has quit IRC | 13:34 | |
| *** phildawson has joined #buildstream | 13:35 | |
| benschubert | tristan: good for me, I'll have a lsat look tonight | 13:39 |
| *** santi has quit IRC | 13:42 | |
| *** santi has joined #buildstream | 13:45 | |
| WSalmon | can anyone think of a good reason https://gitlab.com/BuildStream/buildstream/-/blob/master/src/buildstream/_cas/casserver.py#L512 would fail? | 15:50 |
| juergbi | WSalmon: how does it fail? | 16:13 |
| juergbi | the serialization or the write? | 16:13 |
| WSalmon | any of it | 16:14 |
| juergbi | out of disk space but rather unlikely | 16:14 |
| juergbi | I wouldn't expect SerializeToString to fail | 16:14 |
| WSalmon | i *think* something todo with pushing is failing and its being reported as a `artifact already there` the server its on is 200M/1G disk wise | 16:15 |
| WSalmon | is that right coldtom ? | 16:15 |
| WSalmon | when coldtom went and looked he couldnt find the artifact even though the error said `already here` | 16:15 |
| WSalmon | <coldtom> [root@infra /]# kubectl exec -ti -n rex statefulset/storage-artifact -- find /artifacts -name f7b9a8c38b4d65f633117ff028ee433c488d6ce4a05643869a49f74b5c99c4f3 | 16:16 |
| WSalmon | <coldtom> [root@infra /]# | 16:16 |
| coldtom | that's correct WSalmon | 16:16 |
| juergbi | what was the exact error when it couldn't find the artifact? | 16:16 |
| juergbi | i.e., could it have been a blob missing? | 16:16 |
| WSalmon | its a index only server | 16:16 |
| juergbi | ok but buildstream always does both | 16:17 |
| coldtom | juergbi, there have been no gRPC NOT_FOUNDs from the CAS | 16:17 |
| coldtom | so i don't think it's missing blobs | 16:17 |
| WSalmon | this is the right up https://gitlab.com/celduin/infrastructure/celduin-infra/-/issues/208 | 16:17 |
| WSalmon | the error message seems to imply that it could push the cas bits | 16:17 |
| WSalmon | but then failed to push to the index | 16:18 |
| coldtom | the behaviour is "not cached" -> build -> push -> "already cached" | 16:18 |
| coldtom | inspecting the artifact store shows that it is not cached | 16:18 |
| juergbi | there is a logic error in the buildstream artifact cache client I've noticed recently, which may be related | 16:18 |
| juergbi | pushing the artifact proto should actually replace the proto on the server, not return that it's already cached | 16:19 |
| juergbi | as the artifact proto contents may be different (pointing to different CAS objects) | 16:19 |
| juergbi | i.e., _push_artifact_proto() first calling get_artifact() seems incorrect to me | 16:20 |
| juergbi | it's not a big issue on combined index/CAS servers as there get_artifact() will return NOT_FOUND if a CAS blob is missing | 16:20 |
| juergbi | but with a standalone index server, the artifact proto might remain forever with dead references to CAS objects | 16:21 |
| juergbi | I can't be sure that artifact protos with dead references are the issue in your tests but it could be | 16:22 |
| WSalmon | that would make sence if the pull reported index but not cas, but the pull fails striaght away | 16:22 |
| coldtom | hmm, we shouldn't have had any blob expiry in the CAS yet | 16:22 |
| coldtom | and in that case, there should be a reference stored on the server, afaict the artifact is never stored | 16:23 |
| juergbi | hm, so get_artifact() is successful despite the artifact proto not existing on the index server? | 16:24 |
| juergbi | that's pretty odd | 16:24 |
| juergbi | coldtom: you mention GetArtifact requests being logged. do you also have logs of the result (grpc status)? | 16:26 |
| juergbi | i.e., is the server incorrectly returning OK or is BuildStream not detecting the error? | 16:27 |
| coldtom | juergbi, unfortunately bst-artifact-server doesn't log the responses, and there's no way to tell which GetArtifact request is the offending one from the nginx logs | 16:29 |
| juergbi | ok. I would expect at least one test in our test suite to exercise GetArtifact for a not existing artifact | 16:30 |
| WSalmon | coldtom, do you record the standard out of the artifact-index server? | 16:35 |
| coldtom | yup, but it only logs when requests come in so doesn't know the response | 16:38 |
| juergbi | WSalmon, coldtom: I think we should head in this direction: https://gitlab.com/BuildStream/buildstream/commits/juerg/push | 16:40 |
| juergbi | (and at a higher level, completely skip artifact push for remotes where we've pulled the artifact in the same session) | 16:41 |
| juergbi | I still can't explain the odd behavior you're seeing but it might still help | 16:42 |
| WSalmon | [--:--:--][f46ae283][ push:freedesktop-sdk.bst:bootstrap/file.bst] INFO Remote (https://push.public.aws.celduin.co.uk) already has artifact f46ae283 cached | 16:45 |
| WSalmon | [--:--:--][f46ae283][ push:freedesktop-sdk.bst:bootstrap/file.bst] STATUS Pushing artifact f46ae283 -> http://localhost:10101 | 16:45 |
| WSalmon | [--:--:--][f46ae283][ push:freedesktop-sdk.bst:bootstrap/file.bst] INFO Pushed artifact f46ae283 -> http://localhost:10101 | 16:45 |
| WSalmon | [00:00:00][f46ae283][ push:freedesktop-sdk.bst:bootstrap/file.bst] SUCCESS freedesktop-sdk/bootstrap-file/f46ae283-push.2372052.log | 16:45 |
| WSalmon | so this is annoying, (https://push.public.aws.celduin.co.uk) already has artifact f46ae283 cached is a lie | 16:45 |
| WSalmon | but my local index only has just excepted the artifact | 16:46 |
| *** tristan has joined #buildstream | 16:46 | |
| *** ChanServ sets mode: +o tristan | 16:46 | |
| * WSalmon looks suspiciously at coldtom's deployment | 16:47 | |
| *** samwilson has quit IRC | 16:52 | |
| WSalmon | [--:--:--][3a490e29][ push:freedesktop-sdk.bst:bootstrap/binutils.bst] STATUS Pushing data from artifact 3a490e29 -> https://push.public.aws.celduin.co.uk | 16:52 |
| WSalmon | [--:--:--][3a490e29][ push:freedesktop-sdk.bst:bootstrap/binutils.bst] INFO Pushed data from artifact 3a490e29 -> https://push.public.aws.celduin.co.uk | 16:52 |
| WSalmon | [--:--:--][3a490e29][ push:freedesktop-sdk.bst:bootstrap/binutils.bst] STATUS Pushing artifact 3a490e29 -> https://push.public.aws.celduin.co.uk | 16:52 |
| WSalmon | [--:--:--][3a490e29][ push:freedesktop-sdk.bst:bootstrap/binutils.bst] INFO Remote (https://push.public.aws.celduin.co.uk) already has artifact 3a490e29 cached | 16:52 |
| WSalmon | [--:--:--][3a490e29][ push:freedesktop-sdk.bst:bootstrap/binutils.bst] STATUS Pushing artifact 3a490e29 -> http://localhost:10101 | 16:52 |
| WSalmon | [--:--:--][3a490e29][ push:freedesktop-sdk.bst:bootstrap/binutils.bst] INFO Pushed artifact 3a490e29 -> http://localhost:10101 | 16:52 |
| WSalmon | [00:00:00][3a490e29][ push:freedesktop-sdk.bst:bootstrap/binutils.bst] SUCCESS freedesktop-sdk/bootstrap-binutils/3a490e29-push.2386028.log | 16:52 |
| WSalmon | same again | 16:52 |
| juergbi | WSalmon: and you're really certain that the artifact doesn't exist on the server? | 16:54 |
| WSalmon | [--:--:--][3a490e29][ pull:freedesktop-sdk.bst:bootstrap/binutils.bst] INFO Remote (https://push.public.aws.celduin.co.uk) does not have artifact 3a490e29 cached | 16:57 |
| WSalmon | [--:--:--][3a490e29][ pull:freedesktop-sdk.bst:bootstrap/binutils.bst] STATUS Pulling artifact 3a490e29 <- http://localhost:10101 | 16:57 |
| WSalmon | [--:--:--][3a490e29][ pull:freedesktop-sdk.bst:bootstrap/binutils.bst] INFO Remote (http://localhost:10101) does not have artifact 3a490e29 cached | 16:57 |
| WSalmon | [00:00:00][3a490e29][ pull:freedesktop-sdk.bst:bootstrap/binutils.bst] SKIPPED Pull | 16:57 |
| WSalmon | one or other is failing and as top showed for that other one i dosent look like its there | 16:58 |
| juergbi | WSalmon: and there is no chance it has been pushed by another buildstream client in the meantime? | 16:59 |
| WSalmon | no | 16:59 |
| WSalmon | we get this a lot | 16:59 |
| WSalmon | its not the odd thing | 17:00 |
| juergbi | ah, wait, that's the localhost:10101 | 17:00 |
| juergbi | i.e., not the same server | 17:00 |
| juergbi | it successfully pushes to localhost:10101 | 17:00 |
| WSalmon | yes | 17:00 |
| WSalmon | thats what i mean | 17:00 |
| WSalmon | * WSalmon looks suspiciously at coldtom's deployment | 17:00 |
| WSalmon | im gona delete the artifact and try again but i need to go sort something els out now | 17:01 |
| juergbi | i.e., verify the deployed bst-artifact-server is actually up-to-date? | 17:01 |
| juergbi | coldtom: is it possible that not both of the two bst-artifact-server instances (public pull-only and push) are launched with --index-only | 17:03 |
| *** persia has joined #buildstream | 17:10 | |
| *** adds68 has joined #buildstream | 17:11 | |
| persia | Hi everyone. Was just chatting with adds68, who pointed out that https://buildstream.build/ was a bit outdated. Who covers https://gitlab.com/BuildStream/website these days? | 17:15 |
| WSalmon | juergbi, i think there the same instance but with diffrent proxy settings but i could be wrong | 17:36 |
| WSalmon | juergbi, so when i deleted bootstrap/file.bst from my local cache it could pull the proto from http://localhost:10101 and the elemnets from the cas of https://push.public.aws.celduin.co.uk even tho `Remote (https://public.aws.celduin.co.uk) does not have artifact f46ae283 cached` | 17:42 |
| juergbi | WSalmon: could the issue be nginx caching grpc requests? | 17:43 |
| juergbi | i.e., the responses are not up-to-date? | 17:43 |
| WSalmon | so i think its coldtom's instance's index server or nginx that is the issue.... | 17:43 |
| WSalmon | juergbi, maybe or the grpc forwarding, but it seems to have excepted some elements and not be excepting others.. | 17:45 |
| WSalmon | i would expect the nginx to ether work or not, but it can be effected by size of artifact but the protos should be pretty small | 17:46 |
| juergbi | if it's a stale cache issue it might depend on timing, i.e., it may seem random | 17:46 |
| WSalmon | should be pretty easy for coldtom to rule anything like that out in the morning | 17:47 |
| WSalmon | thanks juergbi for all you insight | 17:47 |
| WSalmon | juergbi, once we get to the bottom of this i think we need to improve the error reporting in bst tho.. | 17:52 |
| *** santi has quit IRC | 17:58 | |
| *** tomaz has quit IRC | 18:08 | |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!