*** nimish has joined #buildstream | 01:19 | |
*** toscalix has joined #buildstream | 01:56 | |
*** nimish has quit IRC | 03:18 | |
*** tristan has joined #buildstream | 05:49 | |
*** tristan has quit IRC | 06:52 | |
*** tristan has joined #buildstream | 07:02 | |
gitlab-br-bot | tristanvb opened MR !1212 (tristan/cleanup-workspace-tests->master: tests/frontend/workspace.py: Remove redundant and pointless tests) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/1212 | 07:49 |
*** ChanServ sets mode: +o tristan | 08:22 | |
tristan | juergbi, Can I pick your brain for the infamous !1070 algorithm ? | 08:22 |
* tristan gropes around channel for the bot which should be linking !1070 | 08:24 |
juergbi | tristan: ah, I just commented before seeing this here | 08:24 |
juergbi | disclaimer, haven't followed it too closely | 08:25 |
tristan | juergbi, hold a sec, I can explain in human language | 08:27 |
tristan | Right... commented back, and seems like a good idea indeed | 08:29 |
tristan | juergbi, So here is the thing, Element.__update_state_recursively() gets called everywhere (except _fetch_done() which cannot resolve a key) instead of _update_state() | 08:29 |
juergbi | tristan: why artifact version bump? | 08:29 |
juergbi | we wouldn't have to store that recursive key | 08:30 |
juergbi | (nor want to) | 08:30 |
tristan | juergbi, Wouldn't the hash allow you to avoid recursing into dependencies while calculating the key ? | 08:30 |
tristan | Oh, so it's a different idea I didn't really comprehend completely | 08:30 |
tristan | I think | 08:30 |
juergbi | yes, but only avoiding repeated calls at runtime | 08:30 |
juergbi | runtime dep versions can change, we can't store something like this in the artifact | 08:30 |
juergbi | (which doesn't get rebuilt for runtime-only dep changes) | 08:31 |
tristan | Yeah at runtime, but I thought instead of using a "dependencies" list, we would slowly add data to the "dependencies hash" as we go, contributing to the key as we truck along | 08:32 |
juergbi | I would just recalculate it whenever something changes, otherwise it gets tricky | 08:32 |
juergbi | but we don't have to do things recursively all the time | 08:32 |
tristan | hmmm, just rereading this | 08:32 |
tristan | I'm not sure it handles the case where indirect reverse dependencies need to have their state updated | 08:33 |
juergbi | actually, we also have to update reverse runtime dependencies, not just reverse build dependencies | 08:33 |
juergbi | but the main point is that it would suffice to have a flat list of reverse dependencies | 08:34 |
tristan | juergbi, What I discovered, is that if Element.__update_state_recursively() only iterates over direct dependencies (build + runtime), then reverse dependencies of reverse runtime-only dependencies are missed | 08:34 |
juergbi | yes, that's what the new hash would fix | 08:34 |
tristan | I think you have a good idea | 08:35 |
*** toscalix has joined #buildstream | 08:35 | |
tristan | It would still have to be recursive when the hash of something changes | 08:36 |
juergbi | the state update would still be recursive, yes | 08:37 |
juergbi | but not driven from a single element | 08:37 |
juergbi | i.e., recurse only if the cache key or the new hash of the reverse dependency was updated | 08:37 |
juergbi | to bubble up | 08:37 |
tristan | And keep a list of *all* reverse dependencies, not only direct ones ? | 08:38 |
juergbi | no, just direct ones, but build+runtime | 08:38 |
juergbi | the recursion should happen on the method level, not by having indirect reverse dependencies stored | 08:38 |
juergbi | i.e., the basic structure of the current branch would still fit | 08:39 |
juergbi | but use Scope.ALL, recurse=False to construct reverse dependency lists | 08:40 |
juergbi | and then use that new hash as additional criterion in __update_state_recursively() | 08:41 |
tristan | Ok so I don't really understand what this hash is :-S | 08:42 |
tristan | It is a cache key of itself and its direct runtime dependencies | 08:43 |
tristan | Ah yes I see what it is now | 08:43 |
juergbi | self.recursive_runtime_hash = sha256(self.cache_key + [dep.recursive_runtime_hash for dep in self.dependencies(Scope.RUN, recurse=False)]) | 08:43 |
juergbi | or so | 08:43 |
juergbi | and it's undefined as long as any of the dependencies don't have a recursive_runtime_hash, of course | 08:44 |
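juergbi's one-liner above can be sketched as a small self-contained function. `Element` here is a hypothetical stand-in, not the real BuildStream class:

```python
import hashlib

class Element:
    """Hypothetical stand-in for a BuildStream element."""
    def __init__(self, cache_key=None, runtime_deps=()):
        self.cache_key = cache_key
        self.runtime_deps = list(runtime_deps)  # direct only (recurse=False)

def recursive_runtime_hash(element):
    # Undefined until the element's own cache key is resolved...
    if element.cache_key is None:
        return None
    h = hashlib.sha256(element.cache_key.encode())
    for dep in element.runtime_deps:
        dep_hash = recursive_runtime_hash(dep)
        # ...and undefined as long as any dependency's hash is undefined.
        if dep_hash is None:
            return None
        h.update(dep_hash.encode())
    return h.hexdigest()
```

Because each direct dependency's hash already folds in that dependency's own runtime dependencies, indirect runtime deps are covered implicitly, which is what makes `recurse=False` sufficient for the reverse-dependency lists.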
tristan | So basically it is a cache key which behaves differently, and does not omit the runtime-only gaps | 08:45 |
tristan | well, minus build-only dependencies, simply because we don't really need that if we're already observing the cache key changes | 08:45 |
juergbi | correct | 08:46 |
juergbi | theoretically, we'd only have to check runtime-only dependencies, but it's conceptually easier to cover all runtime dependencies | 08:46 |
tristan | Updated my original comment | 08:50 |
tristan | With a description of your idea in my own words after understanding it :) | 08:50 |
tristan | juergbi, Thanks I think that will solve it ! | 08:50 |
tristan | Just I still wonder... | 08:50 |
tristan | Is it enough ? | 08:50 |
tristan | The order is correct, I think it is enough | 08:51 |
juergbi | I think so, too | 08:52 |
tristan | juergbi, Maybe we could have a cache key which consists of every different kind of key (weak, strong, etc), and also considers direct runtime-only dependencies - like a "current state" hash | 08:52 |
juergbi | and I expect it to be lighter than the current branch but don't know how much memory/CPU is saved | 08:52 |
juergbi | weak is always a subset of strong | 08:52 |
tristan | And then we only ever have to check that one (i.e. just for coherency) | 08:52 |
juergbi | so that suggested hash would already cover everything | 08:53 |
tristan | Right, I mean we would *only* consult the current state hash, instead of consulting cache keys + the new hash | 08:53 |
tristan | making it... maybe conceptually lighter to swallow ? | 08:53 |
tristan | just an idea | 08:54 |
juergbi | agreed | 08:54 |
tristan | I wonder if I should hack it out, I feel bad for stringing benschubert along for so long now haha | 08:55 |
juergbi | it doesn't necessarily cover strict cache keys. but they might not be relevant for the condition | 08:55 |
tristan | benschubert, When you surface, let me know what you think of this idea and whether you will be able to work on it | 08:55 |
juergbi | the current branch should be correct as well, right? just maybe doing more than necessary | 08:56 |
tristan | Yes | 08:56 |
juergbi | however, if that's still an improvement compared to master, we could consider merging it already | 08:56 |
tristan | I don't know if the latest state of the branch corresponds to the numbers reported | 08:56 |
juergbi | right, should definitely be rechecked before | 08:57 |
tristan | I.e. it could be that the list of reverse dependencies being stored became recursive since the last benchmark | 08:57 |
juergbi | possible | 08:58 |
*** tristan has quit IRC | 09:02 | |
benschubert | I'm here! But I seem to have missed tristan -_-' | 09:30 |
benschubert | juergbi: I'm not sure I understand this idea of hashes | 09:30 |
benschubert | trying to catch up on the rest | 09:31 |
juergbi | benschubert: one way to look at it would be to define a new cache key variant that is like the strong cache key except that it takes all dependencies into account, not just build dependencies | 09:37 |
juergbi | I called it hash, not cache key, because it wouldn't be used as a key for the cache | 09:38 |
benschubert | juergbi: what would be the advantage compared to now? | 09:38 |
juergbi | it would allow use of recurse=False to construct the reverse dependency list | 09:39 |
benschubert | and what advantages would that bring? | 09:39 |
juergbi | we'd use that new hash as condition in __update_state_recursively() | 09:39 |
juergbi | recurse=False would avoid having to store indirect reverse dependencies in every element (quadratic memory/CPU) | 09:39 |
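juergbi's quadratic-scaling point is easy to see on a linear chain of elements (a purely illustrative model, not BuildStream code):

```python
def total_rdep_entries(n, recursive):
    """Total reverse-dependency entries stored across a linear chain of
    n elements, e1 <- e2 <- ... <- en (each depending on the previous)."""
    if recursive:
        # element i stores every element above it: n - i entries,
        # summing to n*(n-1)/2 overall
        return sum(n - i for i in range(1, n + 1))
    # element i stores only its single direct reverse dependency
    return n - 1
```

For a deep pipeline the difference is dramatic: a 1000-element chain stores 999 entries with direct-only lists versus 499500 with recursive lists.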
benschubert | juergbi: on essential-stack.bst from debian stretch, that results in 1MB of additional peak memory usage | 09:40 |
benschubert | (I haven't published those, oops) | 09:40 |
juergbi | as mentioned above, the current branch with Scope.BUILD, recurse=True should also be correct. if total performance is already better than master (and peak memory usage is not too high), my optimization shouldn't be a blocker | 09:41 |
benschubert | I should probably check for debian-stack.bst though | 09:41 |
*** tristan has joined #buildstream | 09:42 | |
*** jonathanmaw has joined #buildstream | 09:43 | |
*** ChanServ sets mode: +o tristan | 09:43 | |
* tristan catches backlog | 09:43 | |
benschubert | juergbi: just one quick check that I understand everything correctly: | 09:43 |
benschubert | 1) when an element has its weak_cache_key discovered, this allows us to discover the weak cache key of its reverse dependencies | 09:43 |
benschubert | 2) when an element has its strong_cache_key discovered, this allows us to discover the strong cache key of its reverse dependencies | 09:43 |
benschubert | Correct? | 09:43 |
benschubert | (having trouble waking up this morning) | 09:44 |
juergbi | benschubert: mostly but there are some exceptions | 09:44 |
benschubert | would you mind listing them? | 09:44 |
benschubert | workspace is one | 09:44 |
juergbi | if BST_STRICT_REBUILD is set, the weak cache of an element includes the strong cache keys of build dependencies | 09:45 |
juergbi | i.e., in that case 1) is not true, however, that shouldn't break the MR | 09:46 |
benschubert | yep | 09:47 |
*** raoul_ has quit IRC | 09:49 | |
tristan | benschubert, in short, the general advantage of only storing direct reverse dependencies is that you avoid recursing into dependencies which you know cannot be resolved yet (because the dependencies in between have not yet been resolved, so it would be impossible) - looking at __update_state_recursively(), it *appears* to be the intention | 09:50 |
tristan | Also they are much shorter lists (less memory consumption), and generally overall less iteration | 09:50 |
benschubert | So basically, if we didn't have _update_state(), we would have something like: | 09:50 |
benschubert | 1) on weak cache key update -> check updates on weak cache keys of rdeps that only depend on us for (RUNTIME if BST_STRICT_REBUILD else ALL) | 09:50 |
benschubert | 2) on strong cache key update -> check updates on weak cache keys of rdeps that depend on us for BUILD, if BST_STRICT_REBUILD | 09:50 |
benschubert | 3) on strong cache key update -> check updates on strong cache keys of rdeps | 09:50 |
benschubert | Correct? | 09:50 |
benschubert | tristan: if you look at __update_state_recursively(), it has a break if no key has changed, so it doesn't propagate up :) | 09:51 |
tristan | :-/ | 09:51 |
tristan | So do we somehow enforce the order of elements in the rdeps list ? | 09:52 |
benschubert | tristan: also, the problem with only storing direct BUILD deps is that runtime deps of runtime deps of build deps become way harder to treat | 09:52 |
tristan | If that is the intention, it would be worth a nice paragraph of comment around it | 09:52 |
benschubert | tristan: no, we don't need that. What happens is "did my cache key change? yes => propagate to rdeps, no? => we are done!" | 09:52 |
tristan | benschubert, storing only direct build dependencies is what I have been pointing out is incorrect since the beginning, runtime deps also need to be stored | 09:53 |
juergbi | benschubert: 1-3) yes, this sounds correct. there is also 'strict cache key', though, which may have to be considered | 09:53 |
benschubert | juergbi: then keeping a list of rdeps (we might need two, one for RUNTIME only, the other one for the rest), will allow more granular state propagation once _update_state is no more | 09:54 |
benschubert | juergbi: correct? | 09:54 |
tristan | I.e. you need *all* direct reverse deps for it to work, and the additional hash of the "cache key state" which juergbi proposes, fixes the issue of missing out on recursing into reverse runtime deps of deps | 09:54 |
benschubert | tristan: no, you don't need runtime deps, but runtime deps of BUILD deps, which is why we use scope=BUILD, recurse=TRUE, which gives us exactly that | 09:54 |
benschubert | tristan: because our runtime dependencies don't affect our cache key | 09:55 |
juergbi | benschubert: latest question: yes, that works but we might do too much work | 09:55 |
benschubert | juergbi: in what cases? | 09:55 |
juergbi | we iterate over reverse_dependencies which includes indirect runtime dependencies | 09:56 |
juergbi | and these indirect dependencies we might visit multiple times | 09:56 |
juergbi | because of the recursion of __update_state_recursively() itself | 09:56 |
tristan | benschubert, You are still speaking in terms of what dependencies are considered for the cache key of an element, I am speaking in terms of what direct reverse dependencies need to be stored to ensure that the recursive algorithm reaches all the deps which need to be updated | 09:56 |
benschubert | juergbi: agreed, we will visit them too many times, that is correct | 09:57 |
juergbi | i.e., as mentioned before, my new hash suggestion is just a possible optimization addressing this | 09:57 |
juergbi | I don't know how much the impact is in practice | 09:57 |
juergbi | *big | 09:57 |
benschubert | juergbi: I fail to see how it addresses this | 09:57 |
juergbi | the new hash allows reducing reverse_dependencies to the list of direct dependencies | 09:58 |
juergbi | (although we now have to include runtime dependencies as well) | 09:58 |
benschubert | juergbi: but removes the granularity of updates we could have if update_state was split, plus adds dependencies that are not needed | 09:58 |
tristan | That granularity is not worth it | 09:59 |
juergbi | it's not an alternative to splitting update_state() | 09:59 |
benschubert | tristan: why? | 09:59 |
tristan | And I don't believe it is true: in order to attempt that granularity, you are adding *all* the elements which might need to have their state recalculated as a result of a state update | 09:59 |
tristan | I.e. you are making the loop immensely more heavy duty, like deps * deps heavier | 10:00 |
tristan | And you will end up hitting those very few anyway | 10:00 |
juergbi | yes, the hash approach should scale a lot better | 10:00 |
tristan | At least it very much appears like this | 10:00 |
benschubert | tristan: so if instead of a recursive algorithm I had a list and kept an order, would that be satisfactory? | 10:01 |
juergbi | I can't estimate the impact but given that we're testing with huge element counts, I'd expect it to be relevant | 10:01 |
tristan | benschubert, that would be dangerous, the order of the list would be significant and delicate | 10:01 |
tristan | recurse one level at a time is much cleaner | 10:01 |
benschubert | tristan: we have a very simple order | 10:01 |
benschubert | tristan: that is the order in which the elements were created in _new_from_meta | 10:02 |
benschubert | so if we do in the constructor: __node_id = next(counter) as for MetaElement, we are done | 10:02 |
*** tpollard has joined #buildstream | 10:02 | |
tristan | benschubert, Exactly, that is delicate and honestly, I don't for a second trust anyone who jumps into the project today, to know that that order should remain the same | 10:02 |
benschubert | tristan: as I don't trust anybody jumping in the project today to know how _update_state works | 10:03 |
*** phildawson has joined #buildstream | 10:03 | |
tristan | (hence: if that is significant, that line of code is worth a comment that is a paragraph long) | 10:03 |
benschubert | I have no problem with that | 10:03 |
tristan | I do | 10:04 |
juergbi | I'm not sure this is even sufficient | 10:04 |
benschubert | juergbi: why? | 10:04 |
tristan | benschubert, I have a problem with overworking in the attempt to exclude a handful of elements while I am rather convinced that direct rdep recursion will work better than this anyway | 10:05 |
juergbi | I'm not sure whether the order is guaranteed to be topological if we include indirect runtime dependencies | 10:05 |
benschubert | juergbi: it is per the sorting we are doing in the loader. If it is not, then we have a bug in that place | 10:06 |
tristan | Staging order is topological, that is not the same as the order of the list of reverse dependencies | 10:06 |
tristan | it is only the order in which an element *adds itself* as a reverse dependency to its deps | 10:07 |
benschubert | tristan: and since we build the elements in topological order, the way they add themselves stays a topological order | 10:07 |
tristan | reasoning out whether that is a correct order is quite an expensive mental process | 10:07 |
benschubert | If you iterate over a topological tree, in topological order, and create lists of segments you go through, each segment will be a sub-topological order of its own | 10:08 |
tristan | So long as every line of code stays at exactly the same place, and care is taken in adding to the list (append/prepend significant) | 10:09 |
benschubert | as long as _new_from_meta doesn't change yes | 10:10 |
tristan | Perhaps, and even then, I wouldnt trust that it is without trying to reason it out; it kind of makes sense that it *should* be | 10:10 |
benschubert | but if _new_from_meta changes, then EVERY assumption about ordering also becomes false | 10:10 |
tristan | Right, in other words: Very complicated for no clear gain | 10:10 |
* Kinnison worries that we have behaviour which is this delicate, and for which we can't be confident the test suite will notify us if we break it | 10:10 | |
* juergbi agrees | 10:11 | |
* tristan only worries that we are adding to this delicateness | 10:12 | |
benschubert | tristan: Another approach that would be sane: as we call _set_required, add the current element to the list of every dep. That would assure us that the behavior is sane, but again it's more work than necessary | 10:12 |
tristan | I very much dislike having more than direct rdeps in the list, and also having the updates be recursive; and they have to be recursive anyway | 10:13 |
tristan | (A) it's an exponentially more complex loop, (B) most importantly, it causes the code to be harder to understand, (C) it's exponentially more expensive in memory consumption | 10:14 |
benschubert | if we have more than direct rdep, how do you handle runtime deps of runtime deps, without adding this hash that will make splitting _update_state harder, and prevent us from having sane optimizations in the future? | 10:14 |
tristan | But (B) is still at the top of my list | 10:14 |
tristan | benschubert, you add the hash, that propagation can still be safely split up by domain in the future | 10:15 |
benschubert | And C is not that significant, as I said, a 1MB increase out of 680MB, | 10:15 |
tristan | at least you start with the sane approach | 10:15 |
tristan | Ok here is the sensible consideration to make... | 10:17 |
tristan | All of this added complexity is in the hope of avoiding a descent into an element which might not need to have its state resolved "right now" | 10:18 |
tristan | In a reverse dependency | 10:18 |
benschubert | tristan: the hash, at least as how it is described in https://gitlab.com/BuildStream/buildstream/merge_requests/1070#note_148535495 doesn't fix anything, as we are still missing runtime deps of runtime deps. How would you fix that, without resorting to what I have, or having an update of everything? | 10:18 |
tristan | However, a saner, level-by-level recursion, storing only direct reverse deps, ensures that we don't get ahead of ourselves and call _update_state() unnecessarily | 10:19 |
tristan | Which we go to lengths to try to achieve with this approach of storing everything | 10:19 |
tristan | And in the end, it is a reverse dependency which is in the build plan somewhere, so its state *will* end up being calculated at some point | 10:19 |
juergbi | benschubert: the hash covers runtime deps of runtime deps | 10:20 |
tristan | In any case it will, maybe not at the same time, but ensuring we don't prematurely do it is the key | 10:20 |
tristan | (i.e. ensuring we don't call _update_state() until we know it could result in resolving a key) | 10:20 |
juergbi | benschubert: as the new recursive_runtime_hash would use the recursive_runtime_hash of the dependencies | 10:20 |
tristan | In the end, we get the same overall performance without all the headache | 10:20 |
juergbi | and thus, implicitly covering indirect dependencies | 10:20 |
benschubert | juergbi: so | 10:23 |
benschubert | 1) have __reverse_build_deps, containing all direct reverse dependencies that depend on us for either BUILD or BUILD and RUN | 10:23 |
benschubert | 2) have __reverse_runtime_deps, containing all direct reverse dependencies that depend on us for RUN only | 10:23 |
benschubert | 3) whenever we update our key, recurse in all __reverse_build_deps | 10:23 |
benschubert | 4) whenever we update our strong key, recurse in all __reverse_runtime_deps | 10:23 |
benschubert | Is that what you have in mind? | 10:23 |
juergbi | I would use a single list with build+runtime direct reverse dependencies | 10:24 |
juergbi | whenever the new hash changes, recurse | 10:24 |
tristan | Yeah, when you split up that propagation by domain, it is only the loops which change (different loops resolve different things) | 10:25 |
benschubert | juergbi: however, a weak cache key will never update a reverse_runtime_dep, correct? | 10:25 |
tristan | but the lists (or list) can mostly stay the same... that means we might check different hashes in the loops to decide whether or not to recurse | 10:25 |
benschubert | tristan: now, how does this fix calling multiple times the same update? | 10:26 |
juergbi | benschubert: if the weak cache key changes, either the strong cache key changes as well, or it's None | 10:26 |
juergbi | the big difference is that the reverse_dependencies list no longer includes indirect dependencies | 10:27 |
tristan | <benschubert> tristan: now, how does this fix calling multiple times the same update? <--- I don't compute what that means | 10:27 |
juergbi | so you only have recursion via __update_state_recursively(), and no longer implicit recursion by iterating through a list that already includes the indirect dependencies | 10:27 |
benschubert | juergbi: so that makes the recursion explicit, but actually introduces more work as we are now also updating runtime dependencies that might never be needed (if our element is a leaf in the current plan, correct?) and does not save any work otherwise since the calls will be done anyway | 10:29 |
benschubert | or am I missing something? | 10:30 |
juergbi | benschubert: right now the branch does both | 10:30 |
juergbi | i.e., explicit and implicit recursion | 10:30 |
juergbi | resulting in potential quadratic scaling | 10:30 |
juergbi | while with the hash we only have explicit recursion, although indeed covering some unneeded reverse dependencies | 10:30 |
juergbi | with large/deep element graphs, I'd expect the hash-based approach to win out | 10:31 |
tristan | Those unneeded reverse dependencies are only in the present loop | 10:31 |
tristan | They will probably be needed later anyway | 10:31 |
tristan | And if we want to avoid that completely, we can check if it is "required" in the loop and omit them | 10:31 |
tristan | in the off chance, but I *think* that by the nature of how we load, it can't happen that we won't want to calculate the keys | 10:32 |
tristan | I.e. we don't load all the possible reverse dependencies to start with, we only load dependencies whose keys we know we are going to want | 10:32 |
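The scheme juergbi and tristan converge on above could be sketched like this (all names are hypothetical, not BuildStream's internals): each element stores only its *direct* reverse dependencies, and a change in the recursive runtime hash is the gate that decides whether to keep bubbling up.

```python
import hashlib

class Element:
    """Minimal stand-in for a BuildStream element (illustrative only)."""
    def __init__(self, name, runtime_deps=()):
        self.cache_key = name + "-v1"
        self.runtime_deps = list(runtime_deps)
        self.reverse_deps = []            # direct reverse deps only
        for dep in self.runtime_deps:
            dep.reverse_deps.append(self)
        self.hash = None                  # the recursive runtime hash
        self.visits = 0                   # instrumentation for the sketch

    def compute_hash(self):
        h = hashlib.sha256(self.cache_key.encode())
        for dep in self.runtime_deps:
            if dep.hash is None:
                return None               # undefined until deps resolve
            h.update(dep.hash.encode())
        return h.hexdigest()

def update_state_recursively(element):
    # Recurse into *direct* reverse dependencies, and only when this
    # element's hash actually changed -- unchanged subgraphs are pruned.
    element.visits += 1
    old = element.hash
    element.hash = element.compute_hash()
    if element.hash == old:
        return
    for rdep in element.reverse_deps:
        update_state_recursively(rdep)
```

The explicit recursion happens one level at a time, so an element is revisited only when something it runs against really changed, rather than once per entry in a flattened indirect list.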
benschubert | Ok another proposal: | 10:33 |
benschubert | Instead of building this list at that moment, we actually build it in _set_required, that way we have a minimal graph and maximum gains and only direct deps | 10:33 |
benschubert | and this actually improves even more stuff | 10:33 |
benschubert | as we are not computing cache keys of things that are not yet required | 10:33 |
tristan | Maybe, depending on the meaning of required | 10:33 |
tristan | I.e. even if we are not going to need some elements across a build-only boundary, that doesn't necessarily mean we don't want to include their cache keys in a report, if those cache keys can be known | 10:34 |
benschubert | tristan: but then those cache keys would have been computed before right? | 10:35 |
tristan | benschubert, But yes in general I would be amenable to building the list of direct rdeps lazily at _set_required() time | 10:35 |
tristan | that does sound sensible | 10:35 |
benschubert | or that would mean removing my commit that checks the state at the end | 10:35 |
benschubert | ok | 10:35 |
benschubert | I'll see what this gives | 10:35 |
*** raoul has joined #buildstream | 10:40 | |
tristan | benschubert, One more point about that: it might not really make a difference whether we load that lazily or not - because we might assume that we are only going to be calling _update_state() on elements that we are processing | 10:41 |
tristan | I mean, that might be helpful if attempting to do it lazily turns out to be a weird and complex eyesore | 10:41 |
benschubert | tristan: but that's not the case, correct? | 10:41 |
benschubert | so we are always updating everything | 10:42 |
tristan | benschubert, I think it is the case | 10:42 |
benschubert | whereas here, we would have the smallest, flattest case | 10:42 |
tristan | benschubert, if you are iterating over reverse deps on an element already, you will not reach something that is not "required" | 10:42 |
tristan | benschubert, unrequired things are lower in the tree (cut off by build-only dependency boundaries), not higher (closer to the targets) | 10:43 |
benschubert | tristan: ah true | 10:43 |
benschubert | tristan: however doing it at this moment has two advantages: | 10:44 |
benschubert | 1) Don't create the lists when we do not need them | 10:44 |
benschubert | 2) We get the flattened list for free | 10:44 |
tristan | Yeah | 10:44 |
tristan | I'm just pointing it out in case it turns out to be weird code | 10:44 |
benschubert | right | 10:45 |
tristan | Like, when you set something required, you can easily do it recursively... _set_required() is already called for each element individually | 10:45 |
tristan | and without recursion, you don't have the rdeps in context right ? | 10:45 |
tristan | anyway, I'm not sure how that can work | 10:45 |
benschubert | tristan: I'll need to check | 10:46 |
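The lazy construction benschubert proposes (building the direct rdep lists as elements are marked required) might look roughly like this; all names are illustrative, not the real BuildStream internals:

```python
class Element:
    """Minimal stand-in for a BuildStream element (illustrative only)."""
    def __init__(self, name, build_deps=(), runtime_deps=()):
        self.name = name
        self.build_deps = list(build_deps)
        self.runtime_deps = list(runtime_deps)
        self.reverse_deps = []   # direct reverse deps, built lazily
        self.required = False

    def set_required(self):
        # Sketch of the proposal: when an element becomes required,
        # register it as a direct reverse dependency of each dep and
        # propagate requiredness downward. Elements that are never
        # marked required never get reverse-dependency entries at all.
        if self.required:
            return
        self.required = True
        for dep in self.build_deps + self.runtime_deps:
            dep.reverse_deps.append(self)
            dep.set_required()
```

As tristan notes, unrequired elements sit below build-only boundaries rather than above them, so iterating reverse deps from a required element should never reach an unrequired one anyway; this sketch just avoids allocating lists for the unrequired subgraph.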
gitlab-br-bot | marge-bot123 closed issue #922 (Make it easy to test BuildStream against external plugins) on buildstream https://gitlab.com/BuildStream/buildstream/issues/922 | 10:56 |
gitlab-br-bot | marge-bot123 merged MR !1158 (phil/external-plugin-testing->master: Make it easy to test BuildStream against external plugins) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/1158 | 10:56 |
*** cs-shadow has joined #buildstream | 11:15 | |
jonathanmaw | cs-shadow: WSalmon says you'd be the person to ask about pypi credentials. I'm looking at https://gitlab.com/BuildStream/bst-external/merge_requests/71 and it'll need a username and password in pypi to be useful | 11:18 |
gitlab-br-bot | BenjaminSchubert approved MR !1211 (jennis/move_node_get_project_path->master: Cleanup: Move _yaml.node_get_project_path() to Project._get_path_from_node()) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/1211 | 11:18 |
jonathanmaw | is there an account buildstream uses for uploading to pypi already? | 11:18 |
cs-shadow | jonathanmaw: not yet, we've been using our personal accounts so far. Although it definitely seems a worthwhile idea to get a buildstream-bot account | 11:19 |
cs-shadow | I'd imagine that can be shared by the buildstream project and various external plugin repos as well | 11:19 |
WSalmon | cs-shadow, did we get as far as having a "group" that some individual accounts are linked to? | 11:24 |
*** jonathanmaw has quit IRC | 11:29 | |
cs-shadow | WSalmon: I don't think PyPI supports the notion of groups, unlike DockerHub | 11:30 |
cs-shadow | having a buildstream-bot account that has a secondary email id of more than one person is as close as it is going to get AFAIK | 11:31 |
*** toscalix has quit IRC | 11:35 | |
*** toscalix has joined #buildstream | 11:37 | |
*** jonathanmaw has joined #buildstream | 11:38 | |
*** phildawson_ has joined #buildstream | 11:52 | |
*** phildawson has quit IRC | 11:52 | |
benschubert | tristan: my first test with set_required fails when workspaces are involved. I don't have time to look at this now, I'll postpone it a bit :/ | 12:00 |
benschubert | interesting. Running bst build on a project has so far printed "Resource temporarily unavailable" 89k times because we are spawning too many subprocesses a second | 12:25 |
tristan | benschubert, I believe there is a report for that | 12:32 |
* tristan recalls filing one a long time ago, it just doesn't happen in practice so I guess we're still not catching the exception | 12:32 | |
benschubert | It does happen in practice. That was our production system | 12:33 |
tristan | Ohhh maybe with remote execution, or a lot of --builders and a very horizontal graph ? | 12:33 |
tristan | benschubert, https://gitlab.com/BuildStream/buildstream/issues/93 | 12:34 |
tristan | 93 ! | 12:34 |
tristan | haha small number | 12:34 |
benschubert | 114'000 times for a 6k pipeline, that seems excessive | 13:28 |
*** nimish has joined #buildstream | 14:12 | |
*** toscalix has quit IRC | 14:15 | |
*** toscalix has joined #buildstream | 14:15 | |
*** jonathanmaw has quit IRC | 14:15 | |
*** slaf has quit IRC | 14:26 | |
gitlab-br-bot | jmacarthur approved MR !1192 (juerg/remote-execution-cas->master: Improve remote execution) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/1192 | 14:42 |
gitlab-br-bot | martinblanchard approved MR !1192 (juerg/remote-execution-cas->master: Improve remote execution) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/1192 | 14:43 |
*** nimish has quit IRC | 14:47 | |
*** nimish has joined #buildstream | 14:49 | |
gokcennurlu | I am confused about the `tar` source plugin | 14:50 |
jmac | What's the issue? I had some dealings with that recently | 14:51 |
gokcennurlu | I want to keep the folders intact, but buildstream is flattening the output | 14:51 |
jmac | That's odd, it shouldn't | 14:51 |
gokcennurlu | I test my tar with `tar -xvf` and it works, but can't do it with buildstream. I might be missing something, I will dig a bit more | 14:52 |
gokcennurlu | but | 14:52 |
gokcennurlu | The base_dir raised my eyebrow, here https://gitlab.com/BuildStream/buildstream/blob/master/buildstream/plugins/sources/tar.py#L75 | 14:52 |
gokcennurlu | my output is also missing some files. I will try inserting some print statements in `_find_base_dir` etc. | 14:59 |
jmac | Yeah, find_base_dir looks a bit suspicious to me | 15:00 |
jmac | Does your tarfile extract (in bst) if you specify base-dir: '' ? | 15:02 |
*** raoul has quit IRC | 15:03 | |
gokcennurlu | no, nothing. | 15:05 |
gitlab-br-bot | phildawson approved MR !1212 (tristan/cleanup-workspace-tests->master: tests/frontend/workspace.py: Remove redundant and pointless tests) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/1212 | 15:06 |
gokcennurlu | ver 1.3.0 by the way. | 15:06 |
gokcennurlu | jmac: sorry, that fixed it, yes. | 15:11 |
gokcennurlu | Thank you! | 15:14 |
jmac | gokcennurlu: np, I don't think the base-dir option is particularly clear | 15:16 |
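For reference, the workaround jmac suggested would look something like this in the element's source declaration (the alias/URL is illustrative; an empty base-dir disables the automatic base-directory guessing so the archive's top-level layout is staged as-is):

```yaml
sources:
- kind: tar
  url: upstream:archive.tar.gz
  # An empty base-dir stages the archive root itself, instead of
  # descending into a guessed top-level directory.
  base-dir: ''
```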
*** slaf has joined #buildstream | 15:16 | |
*** raoul has joined #buildstream | 15:21 | |
*** slaf has quit IRC | 15:21 | |
*** slaf has joined #buildstream | 15:21 | |
*** nimish has quit IRC | 15:29 | |
gitlab-br-bot | marge-bot123 closed issue #935 (Remote Execution: build command failures hard to diagnose (always 'Directory not found')) on buildstream https://gitlab.com/BuildStream/buildstream/issues/935 | 15:42 |
gitlab-br-bot | marge-bot123 merged MR !1192 (juerg/remote-execution-cas->master: Improve remote execution) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/1192 | 15:42 |
benschubert | how can I ensure I'm not caching build trees? | 15:45 |
tpollard | --cache-buildtrees never on the cli | 15:46 |
tpollard | or cache-buildtrees: never in user.conf | 15:47 |
benschubert | tpollard: I have cache-buildtrees: never in user.conf, but the size of my cache didn't go down (I wiped it entirely and rebuilt my stuff) | 15:47 |
tpollard | under the cache: section? | 15:48 |
benschubert | tpollard: under the cache: section in my configuration file provided as --config when running buildstream | 15:48 |
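For anyone following along, the configuration being described would look roughly like this (a fragment of a BuildStream user configuration file, as discussed above; only the cache section is shown):

```yaml
# Fragment of a user configuration file passed via --config.
cache:
  cache-buildtrees: never
```

The same behaviour can be requested per invocation with `--cache-buildtrees never` on the command line, as mentioned above.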
raoul | juergbi fyi, I think the remaining discussion points on !1124 are fixed, though I've left some open as I'm not 100% sure they're resolved | 15:49 |
gitlab-br-bot | MR !1124: Source cache https://gitlab.com/BuildStream/buildstream/merge_requests/1124 | 15:49 |
juergbi | ok, ta, will take a look asap, but will probably be Monday | 15:49 |
raoul | cheers, shall put up an MR for the remote source cache soon too as that's gone surprisingly smoothly | 15:50 |
juergbi | great | 15:50 |
tpollard | benschubert: what happens if you try open a build shell on one of the elements that would normally have a buildtree, with --use-buildtree? | 15:51 |
tpollard | There's integration tests for cache-buildtrees via cli and user conf | 15:51 |
tpollard | if you've got an empty buildtree cached, it should warn that it's empty when staging it | 15:52 |
tpollard | that should be covered by tests too | 15:53 |
benschubert | tpollard: I'll have a look | 15:53 |
jmac | tpollard: Does the --config option to bst specify project config or user config? | 15:54 |
tpollard | user iirc | 15:55 |
jmac | That should be fine then | 15:56 |
tpollard | the only other thing I can think is that none of the elements have a build-root, so the buildtree is irrelevant | 16:00 |
mablanch | juergbi: [WIP] CI job testing BuildStream remote execution against BuildGrid for your just-merged changes in !1192 (triggered manually): https://gitlab.com/BuildGrid/buildgrid.hub.docker.com/-/jobs/174298350 | 16:00 |
gitlab-br-bot | MR !1192: Improve remote execution https://gitlab.com/BuildStream/buildstream/merge_requests/1192 | 16:00 |
benschubert | tpollard: not sure I understand | 16:01 |
juergbi | mablanch: great, ta | 16:02 |
tpollard | The buildtree will always be empty regardless of config if the element being cached doesn't have a build-root benschubert | 16:02 |
benschubert | tpollard: ah, so if I didn't have it before, it wouldn't change anything correct? | 16:03 |
tpollard | benschubert: afaict yes | 16:03 |
tpollard | setting cache-buildtrees: never just enforces this behaviour for all elements | 16:04 |
*** nimish has joined #buildstream | 16:04 | |
tpollard | this will be handled more clearly once we move to the proto (which will define whether an artifact includes a buildtree, and whether it was originally created with one) | 16:07 |
*** nimish has quit IRC | 16:09 | |
*** nimish has joined #buildstream | 16:09 | |
gitlab-br-bot | cs-shadow opened MR !1213 (chandan/container-plugins->master: doc/source/core_plugins.rst: Add link to bst-plugins-container) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/1213 | 16:13 |
*** nimish has quit IRC | 16:29 | |
*** nimish has joined #buildstream | 16:35 | |
*** nimish has quit IRC | 16:40 | |
*** nimish has joined #buildstream | 16:40 | |
*** toscalix has quit IRC | 16:43 | |
*** nimish has quit IRC | 16:45 | |
*** nimish has joined #buildstream | 16:45 | |
gitlab-br-bot | raoul.hidalgocharman opened MR !1214 (raoul/440-source-cache-remotes->master: Remote source cache) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/1214 | 16:46 |
*** nimish has quit IRC | 17:00 | |
*** nimish has joined #buildstream | 17:01 | |
*** nimish has quit IRC | 17:06 | |
*** nimish has joined #buildstream | 17:21 | |
*** tpollard has quit IRC | 17:25 | |
*** nimish has quit IRC | 17:26 | |
*** nimish has joined #buildstream | 17:27 | |
*** nimish has quit IRC | 17:32 | |
*** nimish has joined #buildstream | 17:32 | |
benschubert | tristan: I tried with a PQ and the ordering instead of the _set_required part. This is what it would give: https://gitlab.com/BuildStream/buildstream/merge_requests/1070/diffs?commit_id=828510a8bbb6eeffdbfd05afd9998cdf2768377a (I can obviously comment/clean up the code) I'll also run benchmarks on this to see what we gain | 17:33 |
benschubert | tristan: let me know if this way would be a big no or not | 17:34 |
*** nimish has quit IRC | 17:37 | |
*** nimish has joined #buildstream | 17:48 | |
*** raoul has quit IRC | 17:51 | |
*** nimish has quit IRC | 17:58 | |
*** nimish has joined #buildstream | 17:58 | |
tristan | benschubert, it's a bit late here, I don't understand why the aversion to only storing direct reverse dependencies | 18:03 |
tristan | looks more complicated than it needs to be | 18:04 |
benschubert | tristan: ok I'll have a look with having both too | 18:05 |
benschubert | and benchmark all three solutions on a bigger dataset | 18:06 |
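The priority-queue idea mentioned above can be sketched generically (plain Python, not BuildStream's actual classes; the element names and the depth-based ordering are invented for illustration):

```python
import heapq
from functools import lru_cache

# Toy dependency graph: element -> direct dependencies.
deps = {
    "app.bst": ["lib.bst", "base.bst"],
    "lib.bst": ["base.bst"],
    "base.bst": [],
}

@lru_cache(maxsize=None)
def depth(element):
    """Topological depth: 0 for leaves, 1 + max(dep depths) otherwise."""
    return 1 + max((depth(d) for d in deps[element]), default=-1)

# Priority queue ordered by depth: dependencies are always
# popped before the elements that depend on them, so each
# element's state update sees up-to-date dependency state.
pq = [(depth(e), e) for e in deps]
heapq.heapify(pq)

order = []
while pq:
    _, element = heapq.heappop(pq)
    order.append(element)

print(order)  # -> ['base.bst', 'lib.bst', 'app.bst']
```

The point of the ordering is that no element is visited before all of its dependencies, which is what makes a single non-recursive pass per element possible.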
tristan | Honestly, the whole patch fixes an important bug *and* makes the code more sensible in general, if it doesnt make things slower, that is a good thing overall | 18:07 |
tristan | Also, I would prefer looking into splitting up update state details into more granular updates, before trying more complicated things | 18:08 |
benschubert | tristan: because even those updates won't be sufficient to get acceptable performance on medium-large projects | 18:10 |
benschubert | but yes, we can go per step | 18:10 |
*** nimish has quit IRC | 18:33 | |
*** nimish has joined #buildstream | 18:49 | |
*** nimish has quit IRC | 18:54 | |
*** nimish has joined #buildstream | 18:55 | |
tristan | benschubert, I think we can eliminate calls to Element.dependencies() in cache key calculation with this reverse deps thing as a first step | 18:59 |
tristan | i.e. since we're informing reverse deps in order to trigger changes, we could be passing the cache keys down one level at a time, and cumulatively constructing the dependency portion of the cache keys | 19:00 |
tristan | that should be a good win and scale better than recursively calculating them | 19:01 |
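A minimal sketch of what tristan describes above: each element's key folds in only its *direct* dependencies' keys, and because those keys already incorporate their own dependencies, the dependency portion of the key accumulates one level at a time with no recursive graph walk (plain Python; the graph, names and hashing scheme are invented for illustration, this is not BuildStream's Element API):

```python
import hashlib

# Toy graph: element -> direct dependencies.
deps = {
    "app.bst": ["lib.bst"],
    "lib.bst": ["base.bst"],
    "base.bst": [],
}

keys = {}

def resolve(element):
    """Compute a cache key from the direct dependencies' keys only.

    Each dependency key already folds in that dependency's own
    dependencies, so one level of lookups suffices -- no recursion.
    """
    h = hashlib.sha256()
    h.update(element.encode())
    for dep in sorted(deps[element]):
        h.update(keys[dep].encode())  # assumes deps were resolved first
    keys[element] = h.hexdigest()
    return keys[element]

# Resolve in dependency order (deepest first), as a scheduler would
# when informing reverse dependencies of newly resolved keys.
for element in ("base.bst", "lib.bst", "app.bst"):
    resolve(element)

# A change in base.bst ripples into app.bst's key via lib.bst's key,
# even though app.bst never inspected base.bst directly.
```

This is the "cumulatively constructing the dependency portion of the cache keys" idea: the cost per element is proportional to its direct dependencies, not to the size of its whole dependency subtree.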
*** nimish has quit IRC | 19:05 | |
*** nimish has joined #buildstream | 19:05 | |
*** nimish has joined #buildstream | 19:05 | |
*** nimish has quit IRC | 19:08 | |
*** nimish has joined #buildstream | 19:18 | |
*** nimish has quit IRC | 19:23 | |
*** tristan has quit IRC | 19:43 | |
benschubert | tristan: that is true and indeed better | 19:58 |
*** slaf has quit IRC | 20:53 | |
*** slaf has joined #buildstream | 21:12 | |
*** cs-shadow has quit IRC | 21:40 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!