IRC logs for #buildstream for Thursday, 2018-11-01

*** ctgriffiths has quit IRC00:49
*** ctgriffiths has joined #buildstream00:50
*** xjuan has quit IRC01:01
*** bochecha has quit IRC01:41
*** kolbe has joined #buildstream04:42
*** Prince781 has quit IRC04:59
*** catonano has joined #buildstream05:00
*** catonano has quit IRC05:30
*** catonano has joined #buildstream05:56
*** obre has joined #buildstream06:21
*** ItzExor has joined #buildstream07:21
*** warweasle has joined #buildstream08:13
*** tristan has joined #buildstream08:13
*** toscalix has joined #buildstream08:43
*** catonano has quit IRC08:47
*** alatiera_ has joined #buildstream09:15
laurencemablanch, I think this ticket can be closed? https://gitlab.com/BuildStream/buildstream/issues/63009:25
mablanchlaurence: Yep, I think so!09:28
laurencemablanch, cool, thanks !09:29
*** tristan has quit IRC09:31
*** raoul has joined #buildstream09:35
toscalixvalentind: and I went over some tickets today. There is plenty of them still on the backlog. We will have another session on Monday as usual.09:39
*** benschubert has joined #buildstream09:39
toscalixlaurence: I just did close it09:41
benschubert Any maintainer with bandwidth for a quick review of https://gitlab.com/BuildStream/buildstream/merge_requests/886? I just want to make sure I'm not breaking any feature I might have overlooked :)09:47
*** jonathanmaw has joined #buildstream09:50
*** tiagogomes has joined #buildstream09:54
juergbibenschubert: it seems fine to me but I'm not a pytest/setuptools expert10:00
juergbibenschubert: maybe explain the reason for the change in the commit message itself. dropping the only setup_requires entry10:01
benschubertok sure I will do that, rebase and then merge! Thanks a lot!10:01
WSalmoncould some one how knows more than me about cli.run from tests have a look at https://gitlab.com/BuildStream/buildstream/issues/736#note_113732200 Many thanks, tlater[m] juergbi tristan skullman jonathanmaw10:09
*** tristan has joined #buildstream10:09
* jonathanmaw has a poke10:09
WSalmonta10:10
jonathanmawWSalmon: I'm fairly sure that nothing in cli.run explicitly interacts with the Platform, and the reason why it's persisting between invocations of buildstream is because we directly execute buildstream's main(), rather than having a separate python context10:17
WSalmonbut isnt the issue that it is not! surely cli.run should be resetting stuff like that?10:17
jonathanmawit sounds like the appropriate solution to me is either finding some way of separating out python contexts (I tried forking once, it went badly), or manually resetting the context between invocations10:18
skullmanresetting the context would be the least awful IMO10:18
jonathanmaws/resetting the context/resetting the platform/10:18
* skullman would prefer an explicit reset of the context to cli.run automatically doing it10:18
WSalmonwhy?10:21
*** bochecha has joined #buildstream10:22
benschubertWhat about a fixture that does only that, and is autoused in all the integration tests?10:23
skullmanWSalmon: don't want to do something abnormal unless it's necessary for that test, and caching is the default behaviour10:24
benschubertSo using an explicit fixture in that specific case? I'm not sure about it, we might have unknown dependencies between tests because of this behavior and I'd rather be safe, at least resetting it between every integration test10:25
skullman¯\_(ツ)_/¯ it's a weak preference to be given less weight than the opinion of whoever's doing the work10:27
juergbiglobal pytest setup function that resets this sounds acceptable to me10:30
juergbithat would simply be one step closer to the ideal world where we have a separate python context for each test10:31
*** lachlan has joined #buildstream10:31
benschubertShould we do it at the global level (also for unit tests) or just for the integration one? I'll submit a PR asap :)10:31
juergbimost of our non-integration tests are much more than unit tests and also use cli.run10:31
juergbiI'd do it for all, don't see why not10:31
benschubertsure!10:32
juergbior do you expect a big performance issue?10:32
juergbimight be worth checking whether a full test run gets significantly slower10:32
benschubertI'll do both!10:33
juergbita10:33
benschubertfirst check with master then the change10:33
*** ChanServ sets mode: +o tristan10:46
tristanGah, gitlab is broken again10:46
* tristan works around10:46
*** jonathanmaw_ has joined #buildstream10:51
*** jonathanmaw has quit IRC10:52
Kinnisonjuergbi: I think your last open discussion point on https://gitlab.com/BuildStream/buildstream/merge_requests/786 has been addressed by tpollard.  If that was the last of the review cycle, it'd be awesome if you'd resolve that and approve so we can get this MR landed :-)11:23
*** ahmed89 has joined #buildstream11:31
Kinnisonskullman: Has anyone offered any reviews of !909 for you yet?11:31
skullmanno, but I've not pushed and am behind in reviewing other stuff for the goodwill to deserve it11:32
* Kinnison will see if he can review at some point today, though I'm not the best person for the job :-)11:32
jmacI'll take a look too11:33
KinnisonDoes anyone here have a good understanding of the use of the ChainMap in _yaml.py ?11:36
tristanKinnison, Yep11:37
tristanProbably could use a rename11:37
Kinnisontristan: aah you're online - awesome11:37
tristanShadowMap ? ShallowMap ?11:37
tristanKinnison, ChainMap, the original python one, basically implements the Mapping type (like a dictionary), with a list of Mappings internally11:38
Kinnisontristan: The reason I'm asking is because profiling `bst build` to the point just before `run()` on the scheduler has resulted in ChainMap.__{contains,getitem}__ showing up as serious hotspots (totalling a significant proportion of pre-scheduler time)11:39
tristanWe use this construct to optimize the load time and avoid creating copies of dictionaries11:39
tristanOur version, in addition to pythons version, not only combines the keys/values of the underlying mappings, but also keeps track of deleted keys11:40
Kinnisontristan: roughly 20% of CPU time seems to be in those methods11:40
tristanKinnison, Sure, but pre-scheduler time is going to blow up a *lot* if we do copies instead :)11:40
Kinnisontristan: I appreciate that, I'm just wondering if there's some way we can improve matters11:41
Kinnisontristan: if this is a dead-end then I need to look at other options11:41
Kinnisonstep 1 is always "Can we improve the obvious stuff" :-)11:41
tristanYeah, I'm not sure11:41
* Kinnison feared that :/11:42
tristanKinnison, Looking at contains/getitem there... it does look like *possibly* we can improve the surrounding calls11:42
tristanFor instance... in the patch which does the YamlCache optimization, the first iterations were doing things like: if "foo" in dictionary: ... foo = dictionary[foo]11:43
Kinnisonnod.11:43
tristanInstead of: try: foo dictionary['foo'] except: ...11:43
tristanRooting out any of that might significantly half the lookups11:43
* Kinnison basically noticed that since the optimisation patches he landed last Friday, around half of pre-scheduler time in build is spent constructing element instances, and a large proportion of that ends up in _yaml.py::node_get() which is dominated by the ChainMap11:44
Kinnison(note, I'm assuming the yaml is already in the cache for these profiles)11:44
tristanYeah, the YamlCache is not the point, it's just a place where I noticed double-lookups occurring in hot codepaths11:45
tristan(which I believe was fixed before merging anyway)11:45
tristanBut, if lookups are dominating load time, there is a good chance that other code segments also have this problem11:46
tristanMaybe even node_get()11:46
KinnisonI'll continue to focus efforts at node_get() for now then11:46
KinnisonThough if you spot any potential gains, I'd be happy to try them out in my profiling rig11:46
* Kinnison is running a 25,000 element profiling project because that's similar in size to a client of mine's projects11:47
tristanlooks like that is using the .get(key, default_return) accessor; which I would presume to be optimized by python11:48
tristanNot sure if that is faster or if try / except is faster, but they should be equal11:48
Kinnison    def get(self, key, default=None):11:50
Kinnison        return self[key] if key in self else default11:50
Kinnisonfrom collections/__init__.py11:50
KinnisonAssuming you're interested in the ChainMap base from python11:50
tristanWell11:50
tristanThat is amusing11:50
tristanKinnison, Try changing _yaml.node_get() to use try / except and see if that halfs the time of contains/getitem11:51
Kinnisonwould overriding def get() in _yaml.py's ChainMap to be:11:52
Kinnison    def get(self, key, default=None):11:52
Kinnison        try:11:52
Kinnison            return self[key]11:52
Kinnison        except KeyError:11:52
Kinnison            return default11:52
Kinnisonbe easier?11:52
juergbiKinnison: !786 I've resolved that discussion. however, tpollard is asking tristan for a review, in which case a re-review from my side should not be necessary (all my original points were addressed but because there were a few iterations I'd have to do a re-review to some extent)11:52
Kinnisonjuergbi: aah righty11:52
tristanKinnison, maybe, I'm not sure that we want to rely on everything always being a ChainMap, though11:53
Kinnisontristan: I'll try the above addition of an overridden get() for now and see if that knocks down how much ChainMap.__{getitem,contains}__ shows up11:53
tristanKinnison, Also, if this turns out to really save time, we have a while surface of .get calls all around the codebase which are worth investigating11:53
* Kinnison nods11:53
tristans/while/whole11:54
Kinnisontristan: since I have your attention, any chance you can look over !786?11:54
Kinnisontristan: given juergbi's comment above11:55
tristanKinnison, Ah I see in collections that is *only* for ChainMap11:55
Kinnisontristan: yeah, I think further .get()s might be cleaner11:56
tristanSure, lets see which one of the dozens of MRs that one is about...11:56
Kinnisontristan: don't always pull buildtrees11:56
tristanAh, the important one11:56
Kinnison:-)11:56
Kinnisonbahaha11:58
Kinnisontristan: excellent call.  That replacement .get() method knocked 50% off node_get's cost11:58
* Kinnison will take a further look and submit a clean MR later today11:58
KinnisonIn the meantime I'll grab some lunch12:00
tristanKinnison, Nice !12:09
ironfoottristan: do you want the bot to answer whenever a mention to a MR / Issue appears in channel?12:15
ironfoot"<tristan> Sure, lets see which one of the dozens of MRs that one is about..."12:15
jmacThat could make standups interesting12:17
ironfoottrue _o_12:17
jmacUnless it just responds to "!666" and not "[[!bsissue 666]]]", I suppose12:17
jmacWait, standups have no meaning here12:17
ironfootin anycase it will only respond to !xxxx and #xxxx12:18
jmacYes, sorry, I'm on the wrong channel - we don't use the bot on our internal channel so it won't matter12:19
tristanironfoot, YES !!!!12:19
raoulAnyone want to review https://gitlab.com/BuildStream/buildstream/merge_requests/900? Just checked it worked with an update to buildgrid12:20
tristanironfoot, I'm so jealous that the bots in #gnome-hackers and #gtk+ do that :)12:20
ironfoottristan: I will make a note, and do it once I get the current situation fixed (bot broken)12:20
tristan... Now if only we could avoid premature creation of merge requests, we could also cut down on the spam ... <whistle to self.../>12:21
ironfoot*if* they are marked as WIP maybe that's something I could fix in the bot side of things12:22
tristanOh yeah, that would be cool too !12:22
ironfootwill keep you updated12:23
benschubertI've made initial tests for resetting the platform between each test. The results are: without change: 559s,  570s, 574s, with changes: 587s,  583s. This hints to a 2-5% performance hit. What are your thoughts? juergbi ?12:27
benschubertMy take is, it would be an acceptable slow down for better isolating tests12:30
tristanjuergbi, Are you confident in the CAS/ArtifactCache level changes in !786 ? E.g. this conversation appears resolved but is unclear to me whether it really is: https://gitlab.com/BuildStream/buildstream/merge_requests/786/diffs#note_10995163512:31
tristanbenschubert, I agree, proper tear down is worth it12:32
Kinnisonjennis: If you have time, !919 might be of interest to you12:46
tristantpollard, Kinnison, juergbi; a ton of comments posted on 786 - including a couple of things which are are not strictly related to the patch but are obviously broken and it makes sense to fix instead of growing on top of the breakage12:59
tristanjuergbi, I did not take the exhaustive time to understand all of the low level CAS related changes and would appreciate that you look at only that, since it is much closer to home for you13:00
tristanTo elaborate on the above:  (A) There are missing API documenting comments in element.py, this patch extends their arguments but doesnt fix them... (B) These same functions are declared with core visibility (single underscore), but are internal private details to Element (should be moved to the internal/private section, and have double underscores)13:01
tristan(C): There is a wild assumption currently living in `Element._stage_sources_at()`, which makes the dangerous assumption that well ... if we have a build tree then let's go ahead and stage it, who cares who is asking !13:02
tristanThis patch builds on that wildly dangerous assumption but should fix it instead.13:02
* tpollard notes that (C) was very recently merged, this MR just mitigates it not being available 13:03
tristanOther comments relate mostly to API interactions of modules13:03
mablanchraoul: I can have look [at !900].13:22
*** ahmed89 has quit IRC13:22
*** tristan has quit IRC13:34
* Kinnison hrfs. !919 says it'll close #466 but I never said that to it. I wonder if it's possible to unmake that relationship. If not, I'll just have to reopen #466 once !919 gets merged13:37
*** Prince781 has joined #buildstream13:38
KinnisonFixed13:43
Kinnisonsigh13:43
*** WSalmon has quit IRC13:43
*** WSalmon has joined #buildstream13:43
benschubertthanks Tristan! I'll make a MR then14:00
*** lachlan has quit IRC14:01
*** tristan has joined #buildstream14:03
*** lachlan has joined #buildstream14:17
*** ChanServ sets mode: +o tristan14:30
tristanjmac, There is a problem with SandboxRemote, it redeclares it's own brand of SandboxError, instead of using the one from _exceptions14:30
tristanjmac, Looking at the first errors it raises, they seem to be intended to be user facing14:31
tristanThe way it is implemented, they will show up as BUG messages14:31
tristanwhich should normally just be triggered by assert statements14:31
jmacOk, thanks for raising an issue about it, I'll take a look before the end of the week14:55
lachlanKinnison: Thanks for the review of !91715:31
*** jonathanmaw_ has quit IRC15:31
*** jonathanmaw has joined #buildstream15:32
lachlanDoes anyone on channel have experience of gitlab ci triggers and have time to review one for benchmarking?15:32
*** brlogger has joined #buildstream16:08
Kinnisonjennis: thanks16:23
Kinnisonskullman: I've skimmed through !909 for you.  Only one thing for me to raise, but there's plenty of code there I'm not in a position to grok right now16:24
tpollardtristan: I've replied to the bulk of your comments, some of them are more problematic than others in the context of the MR16:44
adds68If a project junctions another project, will the junctions data be available via the "public_data" when writing a plugin?16:51
*** xjuan has joined #buildstream16:59
skullmanjuergbi: I've got https://gitlab.com/BuildStream/buildstream/merge_requests/920/diffs?commit_id=d614987bd098cb4d7eb0578b62bb64a3ecb58654#1c09d8ba42b5e54906be5cb0a4042b329f2b8104_855_896 for showing artifact logs. It's using CASCache private methods because the API doesn't have anything public for handling refs, everything takes elements and cache keys.17:05
skullmanI'm guessing I'll need to add or promote existing method to APIs that take those17:06
juergbiskullman: yes. the untangling of ArtifactCache and CASCache should help here. MR for this should follow shortly17:07
skullmancool, looks like I'll be rebasing next week then, I'll make a note to watch out for it17:07
juergbiwith the untangling, the public API of CASCache will always use refs, never elements/cachekeys. and the public API of ArtifactCache will accept elements/cachekeys17:07
*** jonathanmaw_ has joined #buildstream17:12
*** jonathanmaw has quit IRC17:13
*** abderrahim1 has quit IRC17:23
*** abderrahim2 has joined #buildstream17:23
jmacI've added some review comments to !909.17:25
skullmanta, I'll get on them on monday17:27
*** ChanServ sets mode: +o tristan17:31
* tristan winks at juergbi and looks forward to embracing a bright future of untangling ;-)17:31
juergbihehe17:31
tristanThe build trees MR would be easier to review in that world, I think17:31
juergbiprobably a bit, yes. I'm looking forward to also getting rid of extract directories, which would have simplified that MR as well, but that's further into the future...17:32
tristanwell, we can agree that even the patch would have been easier to write, as well :D17:33
*** Prince781 has joined #buildstream17:45
*** jonathanmaw_ is now known as jonathanmaw17:50
jonathanmawIs there a way in buildstream's tests to make tmpdir and datafiles point to different directories?18:02
jonathanmawthe only way I can find is to use tmpdir_factory and do tmpdir_factory.mktemp()18:08
* jonathanmaw will see how well that goes18:08
lachlanI'm finding gitlabs caching approach to be unreliable and would like to canvas opinion on how and where benchmarking data should be stored?18:08
*** raoul has quit IRC18:24
WSalmonHi tristan & tlater[m], could you please add 'check my implementations of your review points' to your respective todo's, many thanks (https://gitlab.com/BuildStream/buildstream/merge_requests/897)18:27
*** lachlan has quit IRC18:30
*** jonathanmaw has quit IRC18:44
*** catonano has joined #buildstream19:09
*** toscalix has quit IRC19:40
*** Prince781 has quit IRC19:51
*** Prince781 has joined #buildstream19:55
*** benschubert has quit IRC20:10
*** Prince781 has quit IRC20:10
*** Prince781 has joined #buildstream20:17
*** toscalix has joined #buildstream20:28
*** toscalix has quit IRC20:30
*** Prince781 has quit IRC20:39
*** Prince781 has joined #buildstream20:43
*** catonano has quit IRC20:54
*** tristan has quit IRC21:26
*** alatiera_ has quit IRC22:32
*** Prince781 has quit IRC22:49
*** Prince781_ has joined #buildstream22:49
*** Prince781_ is now known as Prince78122:50
*** Prince781 has quit IRC23:17
*** Prince781 has joined #buildstream23:19

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!