IRC logs for #buildstream for Thursday, 2019-06-20

*** tristan has quit IRC		05:32
*** tristan has joined #buildstream		05:45
*** persia has quit IRC		05:57
*** persia has joined #buildstream		05:58
*** samkirkham has joined #buildstream		07:29
*** bochecha has joined #buildstream		07:44
*** adds68 has joined #buildstream		08:03
*** toscalix has joined #buildstream		08:11
*** bochecha has quit IRC		08:11
*** bochecha has joined #buildstream		08:11
*** raoul_ has joined #buildstream		08:27
*** phildawson has joined #buildstream		08:29
*** ikerperez has joined #buildstream		08:36
*** benschubert has joined #buildstream		08:40
*** Becky has joined #buildstream		08:43
juergbi	tristan: I've been working on moving subproject fetching to Stream as preparation for casd support	08:50
juergbi	doing that I've noticed that there seems to be an inconsistency in master with regards to the decision whether subprojects should be fetched	08:51
juergbi	Stream enables this for commands such as build, fetch, and a few others, but not for commands such as show, as far as I can tell	08:52
juergbi	however, subproject fetching seems to be always enabled as part of include processing (for include directives that use junctions)	08:52
juergbi	I assume this is not intentional	08:53
juergbi	I'd suggest to move the decision whether to fetch subprojects or not to the frontend	08:53
juergbi	and then apply this also to include directive processing	08:53
juergbi	does this sound reasonable?	08:55
juergbi	this would also allow CLI options to override the default, if we want that	08:55
benschubert	juergbi: would it be possible to use the '--fetchers = 0' to disable fetching? That would seem the most straightforwards to me, and thus, always fetch is fetchers > 0 and we need to fetch?	08:58
juergbi	benschubert: this would mean that by default we would implicitly fetch (subprojects) even for bst show	09:00
juergbi	I think we wanted to avoid that	09:00
*** phildawson_ has joined #buildstream		09:01
juergbi	using --fetchers=0 to disable fetching for other commands might make sense, though	09:01
benschubert	I remember it going forwards and backwards on it, definitely, but I must admit, as a user, I'm always annoyed when bst fails and tells me to fetch first, when it knows exactly what to do to execute the command I asked :)	09:01
*** phildawson has quit IRC		09:01
juergbi	yes, I can understand that. I'm not sure what's best overall. however, I'd like to first focus on fixing the internals to be at least consistent per command	09:03
juergbi	changing defaults would probably require more discussion but could be done later	09:03
*** jonathanmaw has joined #buildstream		09:04
benschubert	juergbi: sure, let's fix things and once it settles, let's review all the defaults before 2.0 :)	09:04
juergbi	+1	09:05
juergbi	could potentially also be something for user config. however, we probably want to avoid adding too many knobs	09:05
*** ChanServ sets mode: +o tristan		09:10
tristan	juergbi, Not intentional... sounds sensible... reading rest of backscroll	09:10
tristan	Ok so... I think that definitely if --fetchers is 0, we should handle that case everywhere gracefully	09:11
tristan	And return with an error message, instead of running an infinite scheduler that doesnt queue any jobs	09:11
tristan	Also agree that the decision of "whether to automatically fetch subprojects" is a different decision	09:12
tristan	However... juergbi... do you remember off hand why we have that as optional at all ?	09:12
tristan	I think that subprojects are source like anything else, why would we want that to be optional ?	09:12
tristan	Also... if it is going to remain optional, I certainly think it should have user config... as a user I don't want to tell bst on the command line every time, if my preference is not the default	09:13
tristan	juergbi, unless I'm missing something though, I really think it aught to not be optional at all	09:14
tristan	benschubert, any thoughts why we would want fetching of subprojects to be optional ?	09:14
benschubert	tristan: I cannot remember either, maybe cs-shadow might remember?	09:15
tristan	If it were not optional, and were logically treated as any other source of any other element (which I think it is), then we can still disable it with `--fetchers 0`	09:15
tristan	I mean it really is a source of a junction element after all	09:15
juergbi	tristan: it's not optional from the user/CLI point of view right now	09:15
tristan	juergbi, Curiously, does your branch use the scheduler to run the fetches ?	09:16
tristan	(/me hopes...)	09:16
juergbi	it's enabled for some commands but not for others	09:16
tristan	juergbi, Yes I understand this part... I am largely in favor of treating junction sources in the same way as every other element's sources from a user perspective	09:16
juergbi	the issue is that subproject fetching may be required as part of project/element loading, so commands that normally don't fetch anything might require fetching (or bailing out)	09:16
tristan	and removing the knob completely	09:16
tristan	Right	09:17
juergbi	I have a WIP branch to use the scheduler for the subproject fetching, yes :)	09:17
tristan	But those commands actually require fetching	09:17
juergbi	(probably not ideal yet, but hopefully going in the right direction)	09:17
tristan	juergbi, I.e. `bst show` does not normally require fetching any more than `bst build` normally requires fetching	09:17
tristan	bst build only requires fetching if the sources are not already present	09:18
tristan	in exactly the same way that bst show only requires fetching if it's sources (the subprojects) are not already present	09:18
tristan	juergbi, YAY \o/ (for using the scheduler)	09:18
juergbi	I think the argument was that bst show is not expected to start a scheduler session / take a long time	09:18
juergbi	I thought you were actually arguing for avoiding implicit fetching for bst show	09:18
juergbi	but I might misremember	09:18
tristan	It may be indeed	09:19
tristan	I could very well have argued that... but this was in a time when junctions were brand new :)	09:19
juergbi	I don't have a strong opinion on this at the moment	09:19
juergbi	as mentioned above, I want to fix the internals first before worrying about changing defaults	09:19
juergbi	as this blocks casd	09:20
tristan	I don't understand how this blocks casd, but I'm thankful that it does !	09:20
tristan	haha	09:20
juergbi	unless we can all agree on a change without further discussion	09:20
juergbi	casd means that we need to use grpc for importing blobs into CAS, that includes source cache	09:20
juergbi	we must avoid using grpc in the main process	09:20
tristan	I feel like, it might be the same cost to you (or even less), if we can right now agree that optionality of fetching of subprojects can rest in peace	09:21
juergbi	so scheduler it is :)	09:21
juergbi	yes, I'd be happy to just always fetch subprojects	09:21
tristan	juergbi, I would suggest this brutish move: Write an email to the list, explaining that junction sources are the same as sources of any other element, and if you really want to avoid fetching them, you can specify `--fetchers 0` anyway, just like any other source	09:22
tristan	juergbi, and say "Unless there are any objections in the next few days... this optionality is disappearing, speak now or forever hold your peace !"	09:22
tristan	haha	09:22
juergbi	hehe	09:22
juergbi	fair enough	09:22
*** benbrown has joined #buildstream		09:23
tristan	juergbi, I think that a side effect of this is that `--fetchers 0` currently doesnt do what we would like it to do... but I think that that is safely orthogonal from your branch	09:23
tristan	It should be easy enough to get a better failure mode for a `--fetchers 0` or `--builders 0` or whichever scenario	09:24
juergbi	yes, probably have to improve the UX around that	09:24
tristan	as a separate patch	09:24
juergbi	that sounds good	09:24
tristan	but anyone can do that	09:24
tristan	juergbi, Your using the scheduler... I expect will "mostly just work", you will probably run into a little trouble in the App side of things, due to it's expectations and notions of "sessions" and such	09:25
tristan	But that is all really easy stuff to shuffle around	09:25
juergbi	yes, I suspect that as well. I actually haven't manually tested it yet, only via pytest, so not sure how bad the UX is right now	09:25
tristan	I think there are some callbacks from Stream about when the scheduler starts and ends which the frontend uses to print summaries	09:26
juergbi	will look into that when the basics are in place	09:26
tristan	and that's probably all that conflicts with running the scheduler multiple times	09:26
*** raoul__ has joined #buildstream		09:38
*** raoul_ has quit IRC		09:39
*** raoul__ has quit IRC		09:44
*** lachlan has joined #buildstream		09:47
*** raoul__ has joined #buildstream		09:48
*** tpollard has quit IRC		09:50
*** tpollard has joined #buildstream		09:50
*** lachlan has quit IRC		09:59
juergbi	jonathanmaw: "This job is stuck because you don't have any active runners online with any of these tags assigned to them: wsl"	10:02
juergbi	known problem?	10:02
jonathanmaw	the machine seems to have locked up	10:05
jonathanmaw	I'll sort it out	10:05
*** lachlan has joined #buildstream		10:06
juergbi	ta	10:21
*** raoul__ has quit IRC		10:26
gitlab-br-bot	BenjaminSchubert opened (was WIP) MR !1401 (bschubert/node-api-noget->bschubert/new-node-api: introduce a Node api, and remove node_get) on buildstream https://gitlab.com/BuildStream/buildstream/merge_requests/1401	10:42
benschubert	tristan, juergbi would you have time to have a look at this part of the rework for the new node API? It gets rid of 'node_get'. Once we have an agreement on this part, I'll attack the rest	10:43
*** CTtpollard has joined #buildstream		10:47
*** tpollard has quit IRC		10:48
tristan	benschubert, No replacement for Plugin.node_get_project_path() ?	10:49
tristan	I'll try to take a more thorough look later... kind of in the middle of testing my initrd	10:51
tristan	benschubert, I'm also curious, do the docstrings work well with sphinx, will we get the new Node APIs nicely formatted in our API docs ?	10:51
tristan	having that linked all together well will be an attractive part of this change	10:52
tristan	Plugin.configure() takes a `Node`, click on `Node` in the docs brings you to the docs in how to get data from it	10:52
tristan	regarding the docstrings comment... I was referring to cython (forgot to mention that)	10:57
benschubert	tristan: I was planning in tackling documentation in a second phase before we merge to master, so that we are sure the API is ready to go	10:57
benschubert	oh, I would need to check this	10:57
tristan	Surely there would be a workaround but would be nice if it "just works" well :)	10:58
benschubert	agreed!	10:58
*** bochecha has quit IRC		10:58
*** bochecha has joined #buildstream		10:59
benschubert	And for node_get_project_path, I went under my radar, I'll have a look at it	10:59
benschubert	tristan: seems like this function actually does a lot of stuff, might be better to refactor later :)	11:00
tristan	Yeah it polices that a path doesnt escape the project directory	11:05
tristan	cause we expect project paths to be revisioned with the project, they cannot point to external things	11:05
*** raoul__ has joined #buildstream		11:05
tristan	So basically, the Node just requires knowledge of the related project and where that project is staged	11:05
benschubert	tristan: since that is quite tied to the project, I think it would be best to keep it there, in order to not move everything to 'node', and end up with a huge API surface there	11:06
tristan	Would it be a huge API surface ?	11:07
*** bochecha has quit IRC		11:07
*** bochecha has joined #buildstream		11:07
tristan	I thought it made good sense that the core hands over a Node to Plugin.configure(), and the plugin uses that API to load things in a safe way	11:08
tristan	There may be other cases too	11:08
tristan	For instance, expanding aliases	11:08
benschubert	I think 'Node's shouldn't know about the project at all, they are, after all, a node in a yaml structure, and doesn't really know if it's a path	11:08
benschubert	So you would rather be leaning into moving more knowledge of variables, aliases, etc into nodes?	11:08
tristan	I'm not convinced, I mean I would lean towards a very clean and luxurious API surface for plugins every time	11:09
tristan	Keep that surface nice, even if it means a bit of contortion behind the scenes	11:10
benschubert	I would be leaning to	11:11
benschubert	1) Have a small but understandable and easy to work with Node api, with Provenance information errors handled for you	11:11
benschubert	2) Have helpers on the Plugin, to do things like getting a proper path from the root of the project or other things that require more knowledge of how BuildStream is architectured	11:11
benschubert	and have all the api in 2 work on Nodes from 1	11:12
tristan	Right, we need to limit the knowledge a plugin needs to know about how BuildStream is architectured	11:12
tristan	If we tell people: This is a Node, and go and fetch your config from it... they are unlikely to notice that there is a completely separate API in another location for policing project paths	11:13
tristan	As a user of the API... I am likely to think: BuildStream gave me this Node in the first place... Why is it asking me to give it to this other API ?	11:14
tristan	Shouldn't I be able to just ask the Node ?	11:14
tristan	Heh	11:14
*** bochecha_ has joined #buildstream		11:15
benschubert	We coudl also tell:	11:15
benschubert	This is the plugins public API, it contains helper to ensure you access your configuration correctly.	11:15
benschubert	If you want to access your configuration directly, you can access it through the Node's API	11:15
tristan	But we sort of already have that with node_get()	11:16
*** bochecha has quit IRC		11:16
*** bochecha_ is now known as bochecha		11:16
benschubert	and explain that Node's API maps to the underlying yaml of their configuration	11:16
benschubert	we already have part of it with node_get/set/del, but those operations become very painful and blur the limit between lists/dicts/etc and Nodes	11:17
benschubert	making it unclear what we are working with	11:17
benschubert	such that some code paths are sanitizing nodes multiple times in a row because we have no idea what is given to us	11:17
tristan	I think the current very painful part is constructing the nodes, as we need to do when dynamically setting public data, and I agree it would be nicer to have all of this on a separate Node interface	11:18
tristan	I don't think that it's nice to split the domain of "how to extract configuration data" into two separate locations, though	11:18
tristan	benschubert, You are basically asking the plugin author to understand something about the core - to understand that different entities have different knowledge	11:19
tristan	I think it's more lightweight if that is assembled neatly in one place	11:19
tristan	Again, this mostly amounts to not coding what makes sense because the implementation and underlying architecture dictates that it makes sense	11:20
tristan	and instead coding what makes sense in terms of convenience for the plugin author, which I think we should favor every time	11:20
benschubert	I would disagree with that, because we end up with many, subtly different, functions on Plugin, some referencing methods in Element, on how to handle the data.	11:21
benschubert	Therefore the current dataflow is almost impossible to grasp correctly. Having too many helper functions around an otherwise simple concept might make things	11:21
benschubert	harder to grasp	11:21
benschubert	I do understand that having everything in a single place would be nice	11:22
benschubert	but I also do think that if users have a clear separation in mind between configuration data and project-aware data, it would make things simpler	11:22
benschubert	I am not talking about making the variables stuff public. That should be entirely hidden to users	11:22
benschubert	but telling them "This is how you access your data" and "these are helpers when you need to access something in a context" would make sense I believe	11:23
benschubert	Also, for node_get_project_path, renamign this to something like 'sanitize_path(Node)' would also make things more understandable	11:24
benschubert	It also seems that 'node_get_project_path' would be the only method that would be contentious here, do you agree or do you see other ones?	11:27
tristan	I don't think 'sanitize' is really the right word here... rather we want to get a path for the purpose of loading data from inside the project	11:27
tristan	and we want to enforce that it's there (we don't transform the path I think... we police it here)	11:28
tristan	benschubert, I didnt read through the patches but... substituting variables is indeed another thing	11:28
tristan	actually	11:29
*** CTtpollard is now known as tpollard		11:29
tristan	benschubert, I've been wondering if we should abolish completely the distinction between getting strings, and getting substituted strings	11:29
tristan	benschubert, right now Sources don't have variable substitution capability, however it would be nice if they did, for one	11:29
tristan	And also... it would be simpler if extracted strings were just always substituted, and the plugin didnt have any choice	11:30
tristan	that is a different change of course	11:30
tristan	benschubert, there is also extraction of URLs	11:30
tristan	benschubert, For instance... I think currently we have a separate API for substituting the URLs	11:31
tristan	which means the core does not have a focal point for raising a warning when a source tries to load an aliased URL but the project did not provide an alias ?	11:31
tristan	Or... that is not exactly true, I think we at least have this focal point	11:31
tristan	but the plugin is forced to extract the string first and then do the aliasing separately	11:32
benschubert	(I'm off to eat, I'll answer after)	11:32
tristan	Me too !	11:32
*** lachlan has quit IRC		11:33
*** tristan has quit IRC		11:37
*** bochecha_ has joined #buildstream		11:39
*** bochecha has quit IRC		11:41
*** bochecha_ is now known as bochecha		11:41
*** tristan has joined #buildstream		11:43
*** lachlan has joined #buildstream		11:51
jennis	tristan, regarding the discussion we had the other day of caching/not caching junctioned elements, are we ok with implementing this with just "local elements only", rather than "local push only" and "local pull only"?	11:54
jennis	But this boolean should be defined within "artifacts:" configuration, not just a project level setting	11:55
jennis	(This is what the latest branch accomplishes)	11:55
*** lachlan has quit IRC		11:57
*** raoul__ has quit IRC		12:23
benschubert	tristan: right, I'm not sure having a single point (the Node) for everything would be best, but we should at least not have more than 2. Same for substitution, that should be done by the plugin initialization, before we give the node to the plugin	12:32
*** raoul__ has joined #buildstream		12:37
benschubert	hey raoul__ I think I adressed all your comments on https://gitlab.com/BuildStream/buildstream/merge_requests/1401, could you confirm? :)	12:37
*** raoul__ is now known as raoul		12:38
raoul	Shall have a look in a bit :)	12:39
*** phildawson_ has quit IRC		13:00
*** lachlan has joined #buildstream		13:04
*** phildawson_ has joined #buildstream		13:28
*** lachlan has quit IRC		13:46
jjardon	Hi, what is the latest bst format version bst master supports?	14:15
jjardon	I think bst should fail if you try to build something = or < that version	14:17
cs-shadow	it's at 24 currently (https://gitlab.com/BuildStream/buildstream/blob/master/src/buildstream/_versions.py#L26)	14:20
cs-shadow	I thought the CLI already bailed out when the required format version was more recent than what the BuildStream package came with. Are you not seeing that behavior	14:21
jjardon	cs-shadow: does that mean current master supports 24 and all the previous versions?	14:21
cs-shadow	that's the idea in general	14:22
cs-shadow	I don't think we've made any backwards-incompatible changes to the format, but I'm not 100% sure	14:23
jjardon	cs-shadow: ok, I will recheck and come back; thanks!	14:24
cs-shadow	cool, np	14:25
*** lachlan has joined #buildstream		14:34
*** lachlan has quit IRC		14:51
*** bochecha has quit IRC		15:06
*** toscalix has quit IRC		15:07
*** tpollard has quit IRC		15:59
*** ChanServ sets mode: +o tristan		16:44
tristan	jjardon, master should probably reset to 0 and not care about format-version until 2.0 honestly, there is no relation between format versions of 1.x and 2.x	16:44
tristan	as cs-shadow says, there is no guarantee of compatibility, once 2.x exists we can identify what incompatibilities there are and create a porting guide	16:45
tristan	jjardon, fwiw, I've not made any effort whatsoever to bump format version for the addition of any YAML features in master since it has been decided that we would break API	16:48
tristan	and I don't think format version bumps have been happening there	16:48
tristan	Also, I should write a patch which removes all of the "Since: 1.2" and "Since: 1.4" notations in all of the plugin facing python APIs	16:49
tristan	since they are no longer relevant, as bst2 is a new and separate API	16:49
tristan	they will come back with "Since: 2.2" notations, in a couple of years or whenever the second stable release of the 2.x version comes along	16:50
benschubert	tristan: concerning our previous discussion, I would propose finishing the yaml rework, once we have an API, look how we can make the url/path handling neater and then land everything in one go, do you have any concern with such a plan?	16:50
tristan	benschubert, I still didnt look deeply at the patch honestly, just got home from dinner in a distant neighborhood (2:00am here)	16:51
persia	I have a concern with any plan that includes "land everything in one go" as having a potential to involve massive amounts of rebasing and rework. If it can be broken into parts (and still meet others' concerns), I think that would be preferable.	16:51
tristan	benschubert, the next thing I was going to look at was what the plugins are doing with the API, i.e. I hope that they are still one liners like they were with Plugin.node_get()	16:52
benschubert	oh sure, well, let me know if you are interested in looking at it and I'll wait before merging it to the other branch :)	16:52
tristan	and we don't have to like... "Get the integer node from the dict node, and get the integer value from the integer node" in separate steps	16:52
benschubert	tristan: sure, then let's get this rework done and then do a second phase to polish out the path/url part of it	16:53
tristan	persia, This is an API change which deeply affects all existing plugins, hence it is desirable to land the API surface change once and reduce iterations and churn in external repos	16:54
benschubert	tristan: the way I was thinking about it, we have to do it, like 'node.get_scalar('key').as_int(), I think it's the most readable and less error prone	16:54
persia	tristan: Fair point :)	16:54
tristan	scalar	16:54
tristan	sounds eccentric and esoteric	16:54
tristan	why not just get_int() ?	16:54
tristan	anyway let me come back to it... I would really like it to be dead simple if possible... I also have an ingrained aversion to tail calling in general, but maybe that's just my personal matter of taste	16:55
benschubert	tristan: the advantage of it is that we could access both the node and/or the value in various part of the API, which, for example, allows us to trivially defer access to provenance information	16:56
tristan	in most cases, I avoid tail calling and write two lines of code, in the knowledge that one day in the future I might need to do something in between those two statements which got crammed into the same line of code	16:57
tristan	and rather not muck up history	16:57
benschubert	I can also add helpers on MappingNode/SequenceNode to get directly the correct value if needed, but I'd rather not have a too wide API surface, as it will not help users	16:57
tristan	yeah, I see that provenance being tied to the value and not the dict is valuable here, so probably the tail calling will become a thing and it makes sense	16:58
benschubert	Anyways, I'll let you review the code when you have time, and will continue the next steps in the meantime :)	16:59
tristan	I'll give it priority first thing in the morning... honestly I don't expect to have other criticisms than the one that I kind of feel like we shouldnt burden the plugin author with knowledge that accessors can come from multiple API surfaces	17:01
tristan	like, Project and Context are not public, we manage all of that internally and give the plugin one surface	17:01
tristan	but anyway, I'm not in a state to consider it deeply right now... the tail calling is probably right, lemme look at it in the morning :)	17:02
benschubert	sure, good night :D	17:03
*** raoul has quit IRC		17:54
*** jonathanmaw has quit IRC		18:11
*** bochecha has joined #buildstream		18:39
*** abderrahim has joined #buildstream		20:35
*** abderrahim has quit IRC		20:52
*** abderrahim has joined #buildstream		20:53
*** bochecha has quit IRC		21:24

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!