IRC logs for #buildstream for Friday, 2019-03-08

*** nimish has joined #buildstream01:19
*** toscalix has joined #buildstream01:56
*** nimish has quit IRC03:18
*** tristan has joined #buildstream05:49
*** tristan has quit IRC06:52
*** tristan has joined #buildstream07:02
gitlab-br-bottristanvb opened MR !1212 (tristan/cleanup-workspace-tests->master: tests/frontend/workspace.py: Remove redundant and pointless tests) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/121207:49
*** ChanServ sets mode: +o tristan08:22
tristanjuergbi, Can I pick your brain for the infamous !1070 algorithm ?08:22
* tristan gropes around channel for the bot which should be linking !107008:24
juergbitristan: ah, I just commented before seeing this here08:24
juergbidisclaimer, haven't followed it too closely08:25
tristanjuergbi, hold a sec, I can explain in human language08:27
tristanRight... commented back, and seems like a good idea indeed08:29
tristanjuergbi, So here is the thing, Element.__update_state_recursively() gets called everywhere (except _fetch_done() which cannot resolve a key) instead of _update_state()08:29
juergbitristan: why artifact version bump?08:29
juergbiwe wouldn't have to store that recursive key08:30
juergbi(nor want to)08:30
tristanjuergbi, Wouldn't the hash allow you to avoid recursing into dependencies while calculating the key ?08:30
tristanOh, so it's a different idea I didn't really comprehend completely08:30
tristanI think08:30
juergbiyes, but only avoiding repeated calls at runtime08:30
juergbiruntime dep versions can change, we can't store something like this in the artifact08:30
juergbi(which doesn't get rebuilt for runtime-only dep changes)08:31
tristanYeah at runtime, but I thought instead of using a "dependencies" list, we would slowly add data to the "dependencies hash" as we go, contributing to the key as we truck along08:32
juergbiI would just recalculate it whenever something changes, otherwise it gets tricky08:32
juergbibut we don't have to do things recursively all the time08:32
tristanhmmm, just rereading this08:32
tristanI'm not sure it handles the case where indirect reverse dependencies need to have their state updated08:33
juergbiactually, we also have to update reverse runtime dependencies, not just reverse build dependencies08:33
juergbibut the main point is that it would suffice to have a flat list of reverse dependencies08:34
tristanjuergbi, What I discovered, is that if Element.__update_state_recursively() only iterates over direct dependencies (build + runtime), then reverse dependencies of reverse runtime-only dependencies are missed08:34
juergbiyes, that's what the new hash would fix08:34
tristanI think you have a good idea08:35
*** toscalix has joined #buildstream08:35
tristanIt would still have to be recursive when the hash of something changes08:36
juergbithe state update would still be recursive, yes08:37
juergbibut not driven from a single element08:37
juergbii.e., recurse only if the cache key or the new hash of the reverse dependency was updated08:37
juergbito bubble up08:37
tristanAnd keep a list of *all* reverse dependencies, not only direct ones ?08:38
juergbino, just direct ones, but build+runtime08:38
juergbithe recursion should happen on the method level, not by having indirect reverse dependencies stored08:38
juergbii.e., the basic structure of the current branch would still fit08:39
juergbibut use Scope.ALL, recurse=False to construct reverse dependency lists08:40
juergbiand then use that new hash as additional criterion in __update_state_recursively()08:41
tristanOk so I don't really understand what this hash is :-S08:42
tristanIt is a cache key of itself and its direct runtime dependencies08:43
tristanAh yes I see what it is now08:43
juergbiself.recursive_runtime_hash = sha256(self.cache_key + [dep.recursive_runtime_hash for dep in self.dependencies(Scope.RUN, recurse=False)])08:43
juergbior so08:43
juergbiand it's undefined as long as any of the dependencies don't have a recursive_runtime_hash, of course08:44
tristanSo basically it is a cache key which behaves differently, and does not omit the runtime-only gaps08:45
tristanwell, minus build-only dependencies, simply because we don't really need that if we're already observing the cache key changes08:45
juergbicorrect08:46
juergbitheoretically, we'd only have to check runtime-only dependencies, but it's conceptually easier to cover all runtime dependencies08:46
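juergbi's one-liner above, fleshed out as runnable Python (a hedged sketch: the function and the way inputs are passed are illustrative, not BuildStream's actual API; only the rule "undefined until all inputs are resolved" is taken from the discussion):

```python
import hashlib

def recursive_runtime_hash(cache_key, dep_hashes):
    # Hedged sketch: combine an element's own cache key with the
    # recursive_runtime_hash of each *direct* runtime dependency.
    # Returns None as long as any input is still unresolved,
    # matching juergbi's "undefined" rule above.
    if cache_key is None or any(h is None for h in dep_hashes):
        return None
    digest = hashlib.sha256(cache_key.encode())
    for dep_hash in dep_hashes:  # direct runtime deps, in loader order
        digest.update(dep_hash.encode())
    return digest.hexdigest()
```

Because each dependency contributes its own recursive hash, indirect runtime dependencies are covered implicitly, which is what later lets the reverse-dependency lists stay flat.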
tristanUpdated my original comment08:50
tristanWith a description of your idea in my own words after understanding it :)08:50
tristanjuergbi, Thanks I think that will solve it !08:50
tristanJust I still wonder...08:50
tristanIs it enough ?08:50
tristanThe order is correct, I think it is enough08:51
juergbiI think so, too08:52
tristanjuergbi, Maybe we could have a cache key which consists of every different kind of key (weak, strong, etc), and also considers direct runtime-only dependencies - like a "current state" hash08:52
juergbiand I expect it to be lighter than the current branch but don't know how much memory/CPU is saved08:52
juergbiweak is always a subset of strong08:52
tristanAnd then we only ever have to check that one (i.e. just for coherency)08:52
juergbiso that suggested hash would already cover everything08:53
tristanRight, I mean we would *only* consult the current state hash, instead of consulting cache keys + the new hash08:53
tristanmaking it... maybe conceptually lighter to swallow ?08:53
tristanjust an idea08:54
juergbiagreed08:54
tristanI wonder if I should hack it out, I feel bad for stringing benschubert along for so long now haha08:55
juergbiit doesn't necessarily cover strict cache keys. but they might not be relevant for the condition08:55
tristanbenschubert, When you surface, let me know what you think of this idea and whether you will be able to work on it08:55
juergbithe current branch should be correct as well, right? just maybe doing more than necessary08:56
tristanYes08:56
juergbihowever, if that's still an improvement compared to master, we could consider merging it already08:56
tristanI don't know if the latest state of the branch corresponds to the numbers reported08:56
juergbiright, should definitely be rechecked before08:57
tristanI.e. it could be that the list of reverse dependencies being stored became recursive since the last benchmark08:57
juergbipossible08:58
*** tristan has quit IRC09:02
benschubertI'm here! But seem to have missed tristan -_-'09:30
benschubertjuergbi: I'm not sure I understand this idea of hashes09:30
benschuberttrying to catch up on the rest09:31
juergbibenschubert: one way to look at it would be to define a new cache key variant that is like the strong cache key except that it takes all dependencies into account, not just build dependencies09:37
juergbiI called it hash, not cache key, because it wouldn't be used as a key for the cache09:38
benschubertjuergbi: what would be the advantage compared to now?09:38
juergbiit would allow use of recurse=False to construct the reverse dependency list09:39
benschubertand what advantages would that bring?09:39
juergbiwe'd use that new hash as condition in __update_state_recursively()09:39
juergbirecurse=False would avoid having to store indirect reverse dependencies in every element (quadratic memory/CPU)09:39
benschubertjuergbi: on essential-stack.bst from debian stretch, that results in 1MB of additional peak memory09:40
benschubert(I haven't published those, oops)09:40
juergbias mentioned above, the current branch with Scope.BUILD, recurse=True should also be correct. if total performance is already better than master (and peak memory usage is not too high), my optimization shouldn't be a blocker09:41
benschubertI should probably check for debian-stack.bst though09:41
*** tristan has joined #buildstream09:42
*** jonathanmaw has joined #buildstream09:43
*** ChanServ sets mode: +o tristan09:43
* tristan catches backlog09:43
benschubertjuergbi: just one quick check that I understand everything correctly:09:43
benschubert1) when an element has its weak_cache_key discovered, this allows us to discover the weak cache key of its reverse dependencies09:43
benschubert2) when an element has its strong_cache_key discovered, this allows us to discover the strong cache key of its reverse dependencies09:43
benschubertCorrect?09:43
benschubert(having trouble waking up this morning)09:44
juergbibenschubert: mostly but there are some exceptions09:44
benschubertwould you mind listing them?09:44
benschubertworkspace is one09:44
juergbiif BST_STRICT_REBUILD is set, the weak cache of an element includes the strong cache keys of build dependencies09:45
juergbii.e., in that case 1) is not true, however, that shouldn't break the MR09:46
benschubertyep09:47
*** raoul_ has quit IRC09:49
tristanbenschubert, in short, the general advantage of only storing direct reverse dependencies is that you avoid recursing into dependencies which you know cannot be resolved yet (because the dependencies in between have not yet been resolved, so it would be impossible) - looking at __update_state_recurse(), it *appears* to be the intention09:50
tristanAlso they are much shorter lists (less memory consumption), and generally overall less iteration09:50
benschubertSo basically, if we didn't have _update_state(), we would have something like:09:50
benschubert1) on weak cache key update -> check updates on weak cache keys of rdeps that only depend on us for (RUNTIME if BST_STRICT_REBUILD else ALL)09:50
benschubert2) on strong cache key update -> check updates on weak cache keys of rdeps that only depend on us for BUILD if BST_STRICT_REBUILD09:50
benschubert3) on strong cache key update -> check updates on strong cache keys of rdeps09:50
benschubertCorrect?09:50
benschuberttristan: if you look at the _update_state_recursive(), it has a break if no key has changed, so it doesn't propagate up :)09:51
tristan:-/09:51
tristanSo we somehow enforce the order of elements in the rdeps list ?09:52
benschuberttristan: also, the problem with only storing direct BUILD deps is that runtime deps of runtime deps of build deps become way harder to treat09:52
tristanIf that is the intention, it would be worth a nice paragraph of comment around it09:52
benschuberttristan: no, we don't need that. What happens is "did my cache key change? yes => propagate to rdeps, no? => we are done!"09:52
tristanbenschubert, storing only direct build dependencies is what I have been pointing out is incorrect since the beginning, runtime deps also need to be stored09:53
juergbibenschubert: 1-3) yes, this sounds correct. there is also 'strict cache key', though, which may have to be considered09:53
benschubertjuergbi: then keeping a list of rdeps (we might need two, one for RUNTIME only, the other one for the rest) will allow more granular state propagation once _update_state is no more09:54
benschubertjuergbi: correct?09:54
tristanI.e. you need *all* direct reverse deps for it to work, and the additional hash of the "cache key state" which juergbi proposes, fixes the issue of missing out on recursing into reverse runtime deps of deps09:54
benschuberttristan: no, you don't need runtime deps, but runtime deps of BUILD deps, which is why we use scope=BUILD, recurse=TRUE, which gives us exactly that09:54
benschuberttristan: because our runtime dependencies don't affect our cache key09:55
juergbibenschubert: latest question: yes, that works but we might do too much work09:55
benschubertjuergbi: in what cases?09:55
juergbiwe iterate over reverse_dependencies which includes indirect runtime dependencies09:56
juergbiand these indirect dependencies we might visit multiple times09:56
juergbibecause of the recursion of __update_state_recursively() itself09:56
tristanbenschubert, You are still speaking in terms of what dependencies are considered for the cache key of an element, I am speaking in terms of what direct reverse dependencies need to be stored to ensure that the recursive algorithm reaches all the deps which need to be updated09:56
benschubertjuergbi: agreed, we will visit them too many times, that is correct09:57
juergbii.e., as mentioned before, my new hash suggestion is just a possible optimization addressing this09:57
juergbiI don't know how much the impact is in practice09:57
juergbi*big09:57
benschubertjuergbi: I fail to see how it addresses this09:57
juergbithe new hash allows reducing reverse_dependencies to the list of direct dependencies09:58
juergbi(although we now have to include runtime dependencies as well)09:58
benschubertjuergbi: but removes the granularity of updates we could have if update_state was split, plus adds dependencies that are not needed09:58
tristanThat granularity is not worth it09:59
juergbiit's not an alternative to splitting update_state()09:59
benschuberttristan: why?09:59
tristanAnd I don't believe it is true: In order to attempt that granularity, you are adding *all* the elements which might need to have their state recalculated as a result of a state update09:59
tristanI.e. you are making the loop immensely more heavy duty, like deps * deps heavier10:00
tristanAnd you will end up hitting those very few anyway10:00
juergbiyes, the hash approach should scale a lot better10:00
tristanAt least it very much appears like this10:00
benschuberttristan: so if instead of a recursive algorithm I had a list and kept an order, would that be satisfactory?10:01
juergbiI can't estimate the impact but given that we're testing with huge element counts, I'd expect it to be relevant10:01
tristanbenschubert, that would be dangerous, the order of the list would be significant and delicate10:01
tristanrecurse one level at a time is much cleaner10:01
benschuberttristan: we have a very simple order10:01
benschuberttristan: that is the order in which the elements were created in _new_from_meta10:02
benschubertso if we do in the constructor: __node_id = next(counter) as for MetaElement, we are done10:02
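The counter idea mentioned here, sketched out (hypothetical names; BuildStream's real attribute would be the name-mangled __node_id assigned in the Element constructor, and the claim that creation order is topological rests on how _new_from_meta walks the graph):

```python
import itertools

# Hedged sketch of a creation-order id: a shared counter hands out a
# monotonically increasing id at construction time, so sorting any
# subset of elements by it reproduces the loader's creation order.
_node_counter = itertools.count()

class MiniElement:
    def __init__(self, name):
        self.name = name
        self.node_id = next(_node_counter)  # stand-in for __node_id
```

If elements really are constructed in topological order, any sub-list sorted by this id is itself topologically ordered, which is the property benschubert is relying on.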
*** tpollard has joined #buildstream10:02
tristanbenschubert, Exactly, that is delicate and honestly, I don't for a second trust anyone who jumps into the project today, to know that that order should remain the same10:02
benschuberttristan: as I don't trust anybody jumping in the project today to know how _update_state works10:03
*** phildawson has joined #buildstream10:03
tristan(hence: if that is significant, that line of code is worth a comment that is a paragraph long)10:03
benschubertI have no problem with that10:03
tristanI do10:04
juergbiI'm not sure this is even sufficient10:04
benschubertjuergbi: why?10:04
tristanbenschubert, I have a problem with overworking in the attempt to exclude a handful of elements while I am rather convinced that direct rdep recursion will anyway work better than this10:05
juergbiI'm not sure whether the order is guaranteed to be topological if we include indirect runtime dependencies10:05
benschubertjuergbi: it is per the sorting we are doing in the loader. If it is not, then we have a bug in that place10:06
tristanStaging order is topological, that is not the same as the order of the list of reverse dependencies10:06
tristanit is only the order in which an element *adds itself* as a reverse dependency to its deps10:07
benschuberttristan: and since we build the elements in topological order, the way they add themselves stays a topological order10:07
tristanreasoning out whether that is a correct order is quite an expensive mental process10:07
benschubertIf you iterate over a topological tree, in topological order, and create lists of segments you go through, each segment will be a sub-topological order of its own10:08
tristanSo long as every line of code stays at exactly the same place, and care is taken in adding to the list (append/prepend significant)10:09
benschubertas long as _new_from_meta doesn't change yes10:10
tristanPerhaps, and even then, I wouldn't trust that it is without trying to reason it out; it kind of makes sense that it *should* be10:10
benschubertbut if _new_from_meta changes, then EVERY assumption about ordering also becomes false10:10
tristanRight, in other words: Very complicated for no clear gain10:10
* Kinnison worries that we have behaviour which is this delicate, and for which we can't be confident the test suite will notify us if we break it10:10
* juergbi agrees10:11
* tristan only worries that we are adding to this delicateness10:12
benschuberttristan: Another approach that would be sane: as we call _set_required, add the current element to the list of every dep; that would assure us that the behavior is sane, but again it's more work than necessary10:12
tristanI very much dislike having more than direct rdeps in the list, and also having the updates be recursive; and they have to be recursive anyway10:13
tristan(A) it's an exponentially more complex loop, (B) most importantly, it causes the code to be harder to understand, (C) it's exponentially more expensive in memory consumption10:14
benschubertif we have more than direct rdep, how do you handle runtime deps of runtime deps, without adding this hash that will make splitting _update_state harder, and prevent us from having sane optimizations in the future?10:14
tristanBut (B) is still at the top of my list10:14
tristanbenschubert, you add the hash, that propagation can still be safely split up by domain in the future10:15
benschubertAnd C is not that significant, as I said, a 1MB increase out of 680MB10:15
tristanat least you start with the sane approach10:15
tristanOk here is the sensible consideration to make...10:17
tristanAll of this added complexity is in the hope of avoiding descending into an element which might not need to have its state resolved "right now"10:18
tristanIn a reverse dependency10:18
benschuberttristan: the hash, at least as how it is described in https://gitlab.com/BuildStream/buildstream/merge_requests/1070#note_148535495 doesn't fix anything, as we are still missing runtime deps of runtime deps. How would you fix that, without resorting to what I have, or having an update of everything?10:18
tristanHowever, a saner, level-by-level recursion, storing only direct reverse deps, ensures that we don't get ahead of ourselves and call _update_state() unnecessarily10:19
tristanWhich we go to lengths to try to achieve with this approach of storing everything10:19
tristanAnd in the end, it is a reverse dependency which is in the build plan somewhere, so its state *will* end up being calculated at some point10:19
juergbibenschubert: the hash covers runtime deps of runtime deps10:20
tristanIn any case it will, maybe not at the same time, but ensuring we don't prematurely do it is the key10:20
tristan(i.e. ensuring we don't call _update_state() until we know it could result in resolving a key)10:20
juergbibenschubert: as the new recursive_runtime_hash would use the recursive_runtime_hash of the dependencies10:20
tristanIn the end, we get the same overall performance without all the headache10:20
juergbiand thus, implicitly covering indirect dependencies10:20
benschubertjuergbi: so10:23
benschubert1) have __reverse_build_deps, containing all direct reverse dependencies that depend on us for either BUILD or BUILD and RUN10:23
benschubert2) have __reverse_runtime_deps, containing all direct reverse dependencies that depend on us for RUN only10:23
benschubert3) whenever we update our key, recurse in all __reverse_build_deps10:23
benschubert4) whenever we update our strong key, recurse in all __reverse_runtime_deps10:23
benschubertIs that what you have in mind?10:23
juergbiI would use a single list with build+runtime direct reverse dependencies10:24
juergbiwhenever the new hash changes, recurse10:24
tristanYeah, when you split up that propagation by domain, it is only the loops which change (different loops resolve different things)10:25
benschubertjuergbi: however, a weak cache key will never update a reverse_runtime_dep, correct?10:25
tristanbut the lists (or list) can mostly stay the same... that means we might check different hashes in the loops to decide whether or not to recurse10:25
benschuberttristan: now, how does this fix calling multiple times the same update?10:26
juergbibenschubert: if the weak cache key changes, either the strong cache key changes as well, or it's None10:26
juergbithe big difference is that the reverse_dependencies list no longer includes indirect dependencies10:27
tristan<benschubert> tristan: now, how does this fix calling multiple times the same update? <--- I don't compute what that means10:27
juergbiso you only have recursion via __update_state_recursively(), and no longer implicit recursion by iterating through a list that already includes the indirect dependencies10:27
benschubertjuergbi: so that makes the recursion explicit, but actually introduces more work as we are now also updating runtime dependencies that might never be needed (if our element is a leaf in the current plan, correct?) and does not save any work otherwise since the calls will be done anyway10:29
benschubertor am I missing something?10:30
juergbibenschubert: right now the branch does both10:30
juergbii.e., explicit and implicit recursion10:30
juergbiresulting in potential quadratic scaling10:30
juergbiwhile with the hash we only have explicit recursion, although indeed covering some unneeded reverse dependencies10:30
juergbiwith large/deep element graphs, I'd expect the hash-based approach to win out10:31
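Putting the two halves of the discussion together, the hash-driven propagation could look roughly like this: a toy, self-contained sketch with illustrative names, not BuildStream code, where plain string concatenation stands in for the real sha256-based recursive_runtime_hash:

```python
class Element:
    """Toy element: tracks direct reverse deps and a toy runtime hash."""
    def __init__(self, name, runtime_deps=()):
        self.name = name
        self.runtime_deps = list(runtime_deps)
        self.cache_key = None
        self.recursive_runtime_hash = None
        self.direct_reverse_dependencies = []
        for dep in runtime_deps:
            dep.direct_reverse_dependencies.append(self)

    def update_state(self):
        # Toy hash: defined only once our key and all dep hashes are known.
        if self.cache_key is None:
            return
        dep_hashes = [d.recursive_runtime_hash for d in self.runtime_deps]
        if any(h is None for h in dep_hashes):
            return
        self.recursive_runtime_hash = self.cache_key + "".join(dep_hashes)

def update_state_recursively(element):
    # Recurse into *direct* reverse dependencies only when the hash
    # actually changed; otherwise stop bubbling up immediately.
    old = element.recursive_runtime_hash
    element.update_state()
    if element.recursive_runtime_hash == old:
        return
    for rdep in element.direct_reverse_dependencies:
        update_state_recursively(rdep)
```

The point of the early return is exactly juergbi's argument: since each element's hash already encodes its whole runtime subtree, a flat list of direct reverse dependencies suffices, and unchanged subtrees are never revisited.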
tristanThose unneeded reverse dependencies are only in the present loop10:31
tristanThey will probably be needed later anyway10:31
tristanAnd if we want to avoid that completely, we can check if it is "required" in the loop and omit them10:31
tristanin the off chance, but I *think* that by the nature of how we load, it can't happen that we won't want to calculate the keys10:32
tristanI.e. we don't load all the possible reverse dependencies to start with, we only load dependencies whose keys we know we are going to want10:32
benschubertOk another proposal:10:33
benschubertInstead of building this list at that moment, we actually build it in _set_required, that way we have a minimal graph and maximum gains and only direct deps10:33
benschubertand this actually improves even more stuff10:33
benschubertas we are not computing cache keys of things that are not yet required10:33
tristanMaybe, depending on the meaning of required10:33
tristanI.e. even if we are not going to need some elements across a build-only boundary, that doesn't necessarily mean we don't want to include their cache keys in a report, if those cache keys can be known10:34
benschuberttristan: but then those cache keys would have been computed before right?10:35
tristanbenschubert, But yes in general I would be amenable to building the list of direct rdeps lazily at _set_required() time10:35
tristanthat does sound sensible10:35
benschubertor that would mean removing my commit that checks the state at the end10:35
benschubertok10:35
benschubertI'll see what this gives10:35
*** raoul has joined #buildstream10:40
tristanbenschubert, One more point about that: it might not really make a difference whether we load that lazily or not - because we might assume that we are only going to be calling _update_state() on elements that we are processing10:41
tristanI mean, that might be helpful if attempting to do it lazily turns out to be a weird and complex eyesore10:41
benschuberttristan: but that's not the case, correct?10:41
benschubertso we are always updating everything10:42
tristanbenschubert, I think it is the case10:42
benschubertwhereas here, we would have the smallest, flattest case10:42
tristanbenschubert, if you are iterating over reverse deps on an element already, you will not reach something that is not "required"10:42
tristanbenschubert, unrequired things are lower in the tree (cut off by build-only dependency boundaries), not higher (closer to the targets)10:43
benschuberttristan: ah true10:43
benschuberttristan: however doing it at this moment has two advantages:10:44
benschubert1) Don't create the lists when we do not need them10:44
benschubert2) We get the flattened list for free10:44
tristanYeah10:44
tristanI'm just pointing it out in case it turns out to be weird code10:44
benschubertright10:45
tristanLike, when you set something required, you can easily do it recursively... _set_required() is already called for each element individually10:45
tristanand without recursion, you dont have the rdeps in context right ?10:45
tristananyway, I'm not sure how that can work10:45
benschuberttristan: I'll need to check10:46
gitlab-br-botmarge-bot123 closed issue #922 (Make it easy to test BuildStream against external plugins) on buildstream https://gitlab.com/BuildStream/buildstream/issues/92210:56
gitlab-br-botmarge-bot123 merged MR !1158 (phil/external-plugin-testing->master: Make it easy to test BuildStream against external plugins) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/115810:56
*** cs-shadow has joined #buildstream11:15
jonathanmawcs-shadow: WSalmon says you'd be the person to ask about pypi credentials. I'm looking at https://gitlab.com/BuildStream/bst-external/merge_requests/71 and it'll need a username and password in pypi to be useful11:18
gitlab-br-botBenjaminSchubert approved MR !1211 (jennis/move_node_get_project_path->master: Cleanup: Move _yaml.node_get_project_path() to Project._get_path_from_node()) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/121111:18
jonathanmawis there an account buildstream uses for uploading to pypi already?11:18
cs-shadowjonathanmaw: not yet, we've been using our personal accounts so far. Although it definitely seems a worthwhile idea to get a buildstream-bot account11:19
cs-shadowI'd imagine that can be shared by buildstream project and various extrernal plugins repos as well11:19
WSalmoncs-shadow, did we get as far as having a "group" that some individual accounts are linked to?11:24
*** jonathanmaw has quit IRC11:29
cs-shadowWSalmon: I don't think PyPI supports the notion of groups, unlike DockerHub11:30
cs-shadowhaving a buildstream-bot account that has a secondary email id of more than one person is as close as it is going to get AFAIK11:31
*** toscalix has quit IRC11:35
*** toscalix has joined #buildstream11:37
*** jonathanmaw has joined #buildstream11:38
*** phildawson_ has joined #buildstream11:52
*** phildawson has quit IRC11:52
benschuberttristan: my first test with set_required fails when workspaces are involved. I don't have time to look at this now, I'll postpone it a bit :/12:00
benschubertinteresting. Running bst build on a project has so far printed "Resource temporarily unavailable" 89k times because we are spawning too many subprocesses a second12:25
tristanbenschubert, I believe there is a report for that12:32
* tristan recalls filing one a long time ago, it just doesn't happen in practice so I guess we're still not catching the exception12:32
benschubertIt does happen in practice. That was our production system12:33
tristanOhhh maybe with remote execution, or a lot of --builders and a very horizontal graph ?12:33
tristanbenschubert, https://gitlab.com/BuildStream/buildstream/issues/9312:34
tristan93 !12:34
tristanhaha small number12:34
benschubert114'000 times for a 6k pipeline, that seems excessive13:28
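The flood of EAGAIN errors described here is the kind of thing a bounded retry-with-backoff around process spawning would absorb; a hedged sketch (an illustrative helper, not BuildStream's actual fix for issue #93):

```python
import errno
import time

def retry_on_eagain(spawn, attempts=5, delay=0.1):
    """Retry a process-spawning callable when the OS reports EAGAIN
    ("Resource temporarily unavailable"), backing off between attempts
    instead of flooding the log. `spawn` is any callable that
    forks/spawns a subprocess; names here are illustrative.
    """
    for attempt in range(attempts):
        try:
            return spawn()
        except OSError as e:
            if e.errno != errno.EAGAIN or attempt == attempts - 1:
                raise  # unrelated error, or out of retries
            time.sleep(delay * (2 ** attempt))  # exponential backoff
```

On Linux, fork() failing with EAGAIN typically means a process or memory limit was hit momentarily, so a short backoff often lets the spawn succeed on retry.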
*** nimish has joined #buildstream14:12
*** toscalix has quit IRC14:15
*** toscalix has joined #buildstream14:15
*** jonathanmaw has quit IRC14:15
*** slaf has quit IRC14:26
gitlab-br-botjmacarthur approved MR !1192 (juerg/remote-execution-cas->master: Improve remote execution) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/119214:42
gitlab-br-botmartinblanchard approved MR !1192 (juerg/remote-execution-cas->master: Improve remote execution) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/119214:43
*** nimish has quit IRC14:47
*** nimish has joined #buildstream14:49
gokcennurluI am confused about the `tar` source plugin14:50
jmacWhat's the issue? I had some dealings with that recently14:51
gokcennurluI want to keep the folders intact, but buildstream is flattening the output14:51
jmacThat's odd, it shouldn't14:51
gokcennurluI test my tar with `tar -xvf` and it works, but can't do it with buildstream. I might be missing something, I will dig a bit more14:52
gokcennurlubut14:52
gokcennurluThe base_dir raised my eyebrow, here https://gitlab.com/BuildStream/buildstream/blob/master/buildstream/plugins/sources/tar.py#L7514:52
gokcennurlumy output is also missing some files. I will try inserting some print statements in `_find_base_dir` etc.14:59
jmacYeah, find_base_dir looks a bit suspicious to me15:00
jmacDoes your tarfile extract (in bst) if you specify base-dir: '' ?15:02
*** raoul has quit IRC15:03
gokcennurluno, nothing.15:05
gitlab-br-botphildawson approved MR !1212 (tristan/cleanup-workspace-tests->master: tests/frontend/workspace.py: Remove redundant and pointless tests) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/121215:06
gokcennurluver 1.3.0 by the way.15:06
gokcennurlujmac sorry, it fixed it, yes.15:11
gokcennurluThank you!15:14
jmacgokcennurlu: np, I don't think the base-dir option is particularly clear15:16
*** slaf has joined #buildstream15:16
*** raoul has joined #buildstream15:21
*** slaf has quit IRC15:21
*** slaf has joined #buildstream15:21
*** nimish has quit IRC15:29
gitlab-br-botmarge-bot123 closed issue #935 (Remote Execution: build command failures hard to diagnose (always 'Directory not found')) on buildstream https://gitlab.com/BuildStream/buildstream/issues/93515:42
gitlab-br-botmarge-bot123 merged MR !1192 (juerg/remote-execution-cas->master: Improve remote execution) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/119215:42
benschuberthow can I ensure I'm not caching build trees?15:45
tpollard--cache-buildtrees never on the cli15:46
tpollardor cache-buildtrees: never in user.conf15:47
benschuberttpollard: I have cache-buildtrees:never in user.conf, but the size of my cache didn't go down (I wiped it entirely and rebuilt my stuff)15:47
tpollardunder the cache: section?15:48
benschuberttpollard: under the cache: section in my configuration file provided as --config when running buildstream15:48
raouljuergbi fyi, I think the remaining discussion points on !1124 are fixed, though I've left some open as I'm not 100% sure they're resolved15:49
gitlab-br-botMR !1124: Source cache https://gitlab.com/BuildStream/buildstream/merge_requests/112415:49
juergbiok, ta, will take a look asap, but will probably be Monday15:49
raoulcheers, shall put up an MR for the remote source cache soon too as that's gone surprisingly smoothly15:50
juergbigreat15:50
tpollardbenschubert: what happpens if you try open a build shell on one of the elements that would normally have a buildtree, with --use-buildtree?15:51
tpollardThere's integration tests for cache-buildtrees via cli and user conf15:51
tpollardif you've got an empty buildtree cached, it should warn that it's empty when staging it15:52
tpollardthat's should be covered with tests too15:53
benschuberttpollard: I'll have a look15:53
jmactpollard: Does the --config option to bst specify project config or user config?15:54
tpollarduser iirc15:55
jmacThat should be fine then15:56
tpollardthe only other thing I can think is that none of the elements have a build-root, so the buildtree is irrelevant16:00
mablanchjuergbi: [WIP] CI job testing BuildStream remote execution against BuildGrid for your just-merged changes in !1192 (triggered manually): https://gitlab.com/BuildGrid/buildgrid.hub.docker.com/-/jobs/17429835016:00
gitlab-br-botMR !1192: Improve remote execution https://gitlab.com/BuildStream/buildstream/merge_requests/119216:00
benschuberttpollard: not sure I understand16:01
juergbimablanch: great, ta16:02
tpollardThe buildtree will always be empty regardless of config if the element being cached doesn't have a build-root benschubert16:02
benschuberttpollard: ah, so if I didn't have it before, it wouldn't change anything correct?16:03
tpollardbenschubert: afaict yes16:03
tpollardsetting cache-buildtrees: never just enforces this behaviour for all elements16:04
*** nimish has joined #buildstream16:04
tpollardthis will be handled more clearly once we move to the proto (which will define whether an artifact includes a buildtree, and whether it was originally created with one)16:07
*** nimish has quit IRC16:09
*** nimish has joined #buildstream16:09
gitlab-br-botcs-shadow opened MR !1213 (chandan/container-plugins->master: doc/source/core_plugins.rst: Add link to bst-plugins-container) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/121316:13
*** nimish has quit IRC16:29
*** nimish has joined #buildstream16:35
*** nimish has quit IRC16:40
*** nimish has joined #buildstream16:40
*** toscalix has quit IRC16:43
*** nimish has quit IRC16:45
*** nimish has joined #buildstream16:45
gitlab-br-botraoul.hidalgocharman opened MR !1214 (raoul/440-source-cache-remotes->master: Remote source cache) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/121416:46
*** nimish has quit IRC17:00
*** nimish has joined #buildstream17:01
*** nimish has quit IRC17:06
*** nimish has joined #buildstream17:21
*** tpollard has quit IRC17:25
*** nimish has quit IRC17:26
*** nimish has joined #buildstream17:27
*** nimish has quit IRC17:32
*** nimish has joined #buildstream17:32
benschuberttristan: I tried with a PQ and the ordering instead of the _set_required part. This is what it would give: https://gitlab.com/BuildStream/buildstream/merge_requests/1070/diffs?commit_id=828510a8bbb6eeffdbfd05afd9998cdf2768377a (I can obviously comment/clean up the code) I'll also run benchmarks on this to see what we gain17:33
benschuberttristan: let me know if this way would be a big no or not17:34
*** nimish has quit IRC17:37
*** nimish has joined #buildstream17:48
*** raoul has quit IRC17:51
*** nimish has quit IRC17:58
*** nimish has joined #buildstream17:58
tristanbenschubert, it's a bit late here, I don't understand why the aversion to only storing direct reverse dependencies18:03
tristanlooks more complicated than it needs to be18:04
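The approach tristan is describing — storing only *direct* reverse dependencies and propagating state changes iteratively rather than recursively — can be sketched in a few lines (illustrative names only, not BuildStream's real internals):

```python
from collections import deque


class Element:
    """Toy model: each element records only its *direct* reverse
    dependencies, and a state change walks outward breadth-first."""

    def __init__(self, name):
        self.name = name
        self.reverse_deps = []   # direct reverse dependencies only
        self.state_version = 0   # stand-in for "state was recomputed"

    def depends_on(self, dep):
        # Registering a dependency also registers us as its reverse dep
        dep.reverse_deps.append(self)


def propagate_update(changed):
    """Visit each affected element at most once, without deep recursion."""
    queue = deque([changed])
    seen = {changed}
    while queue:
        element = queue.popleft()
        element.state_version += 1
        for rdep in element.reverse_deps:
            if rdep not in seen:
                seen.add(rdep)
                queue.append(rdep)


# a -> b -> c dependency chain: updating 'c' reaches 'b' and then 'a'
a, b, c = Element("a"), Element("b"), Element("c")
a.depends_on(b)
b.depends_on(c)
propagate_update(c)
```

Because each element keeps only one level of reverse dependencies, the propagation is a simple queue walk and each element is touched at most once per change.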
benschuberttristan: ok I'll have a look with having both too18:05
benschubertand benchmark all three solutions on a bigger dataset18:06
tristanHonestly, the whole patch fixes an important bug *and* makes the code more sensible in general, if it doesnt make things slower, that is a good thing overall18:07
tristanAlso, I would prefer looking into splitting up update state details into more granular updated, before trying more complicated things18:08
tristan*into more granular updates18:08
benschuberttristan: because even those updates won't be sufficient to get acceptable performance on medium-large projects18:10
benschubertbut yes, we can go per step18:10
*** nimish has quit IRC18:33
*** nimish has joined #buildstream18:49
*** nimish has quit IRC18:54
*** nimish has joined #buildstream18:55
tristanbenschubert, I think we can eliminate calls to Element.dependencies() in cache key calculation with this reverse deps things as a first step18:59
tristani.e. since we're informing reverse deps in order to trigger changes, we could be passing the cache keys down one level at a time, and cumulatively constructing the dependency portion of the cache keys19:00
tristanthat should be a good win and scale better than recursively calculating them19:01
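The cumulative-key idea sketched above could look like this (a hypothetical helper, not BuildStream's actual key code: each element hashes its own data together with the already-computed keys of its *direct* dependencies, so keys flow up one level at a time instead of being recomputed by a recursive walk):

```python
import hashlib


def element_key(own_data, dep_keys):
    """Derive a cache key from an element's own data plus the keys of
    its direct dependencies, which were computed one level below."""
    h = hashlib.sha256()
    h.update(own_data.encode())
    for key in sorted(dep_keys):  # stable contribution order
        h.update(key.encode())
    return h.hexdigest()


# Keys propagate "upward": each element only ever sees its direct
# dependencies' finished keys, never the whole transitive graph.
base_key = element_key("base.bst", [])
lib_key = element_key("lib.bst", [base_key])
app_key = element_key("app.bst", [lib_key])
```

A change anywhere below still reaches the top, because a changed dependency key changes its reverse dependency's key, which changes the next level up, and so on.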
*** nimish has quit IRC19:05
*** nimish has joined #buildstream19:05
*** nimish has joined #buildstream19:05
*** nimish has quit IRC19:08
*** nimish has joined #buildstream19:18
*** nimish has quit IRC19:23
*** tristan has quit IRC19:43
benschuberttristan: that is true and indeed better19:58
*** slaf has quit IRC20:53
*** slaf has joined #buildstream21:12
*** slaf has joined #buildstream21:12
*** slaf has joined #buildstream21:12
*** slaf has joined #buildstream21:12
*** cs-shadow has quit IRC21:40

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!