*** gtristan has joined #baserock | 07:05 | |
*** fay has joined #baserock | 08:14 | |
*** fay is now known as Guest94123 | 08:15 | |
*** ctbruce has joined #baserock | 08:27 | |
*** paulwaters_ has joined #baserock | 08:31 | |
*** rdale has joined #baserock | 08:56 | |
*** tiagogomes has joined #baserock | 08:56 | |
*** noisecell has joined #baserock | 09:02 | |
*** toscalix has joined #baserock | 09:10 | |
*** jonathanmaw has joined #baserock | 09:22 | |
paulsherwood | elsewhere i'm seeing folks fiddling with -j again | 09:25 |
paulsherwood | back in the day, morph had an algorithm to set max-jobs: ncores * 1.5 | 09:25 |
paulsherwood | because a couple of experienced engineers had found, in their experience, that it gave fastest results | 09:26 |
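A minimal sketch of that heuristic, assuming a simple core-count lookup; the function name and the rounding are illustrative, not morph's actual code:

```python
# Sketch of the "max-jobs = ncores * 1.5" default described above.
# The function name and rounding choice are assumptions for illustration,
# not morph's real implementation.
import multiprocessing

def default_max_jobs(factor=1.5):
    """Default build parallelism: CPU core count times a fudge factor."""
    return int(multiprocessing.cpu_count() * factor)

if __name__ == "__main__":
    print(default_max_jobs())  # e.g. 12 on an 8-core machine
```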
paulsherwood | i tested that assumption for ybd, because i was watching *lots* of morph builds and ybd builds, and total elapsed time was an important metric for me | 09:26 |
gtristan | Sounds perfectly sensible, but continue :) | 09:26 |
paulsherwood | disclaimer: there may be some gaps in my testing method, since i am a dilettante at this... i only cared about optimising wallclock time, for a given kind of workload (ybd ci.morph) on the fastest kit i could find | 09:27 |
paulsherwood | so, i noticed that much time was spent building big things - qtwebkit, gcc for example | 09:29 |
paulsherwood | i assumed that these projects were as parallelisable as possible - their upstreams, just like me, must care to make their builds as fast as poss | 09:29 |
paulsherwood | using ybd i was able to time creation of qtwebkit and gcc artifacts in isolation, on huge AWS machines | 09:30 |
paulsherwood | i could try max-jobs: 1,2,4,8,10,16,24,32 etc | 09:30 |
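A hedged sketch of that kind of sweep: time one build at each -j value and record the wallclock. Driving a plain `make` build here is an assumption for illustration; the real measurements timed ybd building the qtwebkit and gcc artifacts in isolation.

```python
# Wallclock sweep over -j values, in the spirit of the experiment above.
# Timing a plain `make -j N` build is an assumption; the actual runs used
# ybd to build qtwebkit/gcc artifacts in isolation on AWS machines.
import subprocess
import time

JOB_COUNTS = (1, 2, 4, 8, 10, 16, 24, 32)

for jobs in JOB_COUNTS:
    subprocess.run(["make", "clean"], check=True)      # start each run from a clean tree
    start = time.time()
    subprocess.run(["make", f"-j{jobs}"], check=True)  # parallel build under test
    print(f"-j{jobs}: {time.time() - start:.0f}s wallclock")
```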
gtristan | here we're talking about one build in isolation and configuring -j <jobs> | 09:31 |
paulsherwood | the graphs are https://docs.google.com/spreadsheets/d/149jS3VVJ7jA14dSJybeOr4otlEFrzCVJjbxMSppiMcg/edit#gid=245680236 | 09:31 |
* gtristan just noting | 09:31 | |
paulsherwood | so basically that says to me, at least, that for these well established jobs, i don't get a huge benefit beyond 10 cores | 09:32 |
gtristan | How many cores did that machine have btw ? | 09:32 |
gtristan | mx4.10xlarge means 10 cores ? | 09:32 |
paulsherwood | https://docs.google.com/spreadsheets/d/149jS3VVJ7jA14dSJybeOr4otlEFrzCVJjbxMSppiMcg/edit#gid=1336804058 | 09:32 |
paulsherwood | that's two different kinds of AWS machines | 09:33 |
paulsherwood | mx is 'memory optimised' | 09:33 |
paulsherwood | cx is 'compute optimised' | 09:33 |
paulsherwood | https://aws.amazon.com/ec2/instance-types/ | 09:33 |
paulsherwood | so, then i used that info, and tried to optimise for elapsed time on ci.morph as a whole... | 09:34 |
paulsherwood | AWS cx4.8xlarge gives me better price/performance than mx, so i focused on that | 09:35 |
gtristan | I dont get how many cores... from that graph... maximum is 8 cores ? | 09:35 |
gtristan | oh below | 09:36 |
gtristan | does vCPU mean number of cores ? | 09:36 |
paulsherwood | yup i think so | 09:37 |
paulsherwood | and actually my graph calls it cores, but really i mean the number after -j | 09:37 |
gtristan | yes I get that, I'm wondering about the machine constraints :) | 09:37 |
paulsherwood | understood. this can be affected by memory, io etc | 09:38 |
gtristan | Assuming unlimited cores, or cores > max-jobs... we reach the average limit of 'number of object files to generate in a given directory' | 09:38 |
paulsherwood | but anyway... given that broadly i saw much less gain in elapsed time once you get beyond 10 or so cores, i fiddled with instances of ybd vs max-jobs, and compared results | 09:39 |
gtristan | gcc has one specific directory that has (I would think) vastly parallelizable objects | 09:39 |
paulsherwood | that's not what my results show, iiuc | 09:39 |
paulsherwood | it's bound to be an exponential curve at best anyway | 09:39 |
gtristan | Sure, so far they show to me that "there is no noticeable performance gain by reducing the number of max-jobs" | 09:40 |
gtristan | but I'm missing the next part of your experiment :) | 09:40 |
paulsherwood | "there is no noticeable performance gain by INCREASING the number of max-jobs" either. and those extra cores cost money | 09:40 |
gtristan | That. | 09:41 |
* paulsherwood pays for his AWS instances, which helps to focus the mind | 09:41 | |
paulsherwood | so i trade off the cost of the machine time, versus the cost of my own time | 09:41 |
gtristan | paulsherwood, that's a very particular case indeed, you are saying that spawning a task will magically cost more money on some fancy service which detects that a process is launched and hands it more resources ? | 09:42 |
paulsherwood | and fastest wallclock time i could get for ci.morph (builds, not git clones) was achieved on cx4.8, 4 instances, 8 cores each | 09:42 |
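A rough sketch of the instances-versus-max-jobs trade-off being described: share a fixed core budget evenly across concurrent build instances. The 32-core figure and the 4 x 8 result are the ones quoted in this conversation; the helper itself is hypothetical.

```python
# Illustrative split of a fixed core budget across parallel ybd instances.
# The numbers come from the chat; the helper is hypothetical.
def max_jobs_per_instance(total_cores, instances):
    """Give each concurrent build instance an equal share of the cores."""
    return total_cores // instances

TOTAL_CORES = 32  # the 32-core machine discussed here
for instances in (1, 2, 4, 5, 8):
    jobs = max_jobs_per_instance(TOTAL_CORES, instances)
    print(f"instances={instances}  max-jobs={jobs}")
# Reported fastest ci.morph wallclock above: instances=4, max-jobs=8.
```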
paulsherwood | no, not quite. i have a lot of builds to do, on a constrained machine, and want to get them done in fastest possible time | 09:43 |
paulsherwood | that's not so uncommon | 09:43 |
gtristan | On my machine (most machines ?), spawning a task which is in a wait state does not cost more energy | 09:43 |
paulsherwood | you are considering your personal machine. i'm considering shared/cloud/ci | 09:44 |
gtristan | I am considering a machine with its own resource limitations of course | 09:45 |
paulsherwood | if one build has 32 cores on a 32 core machine, even if most of them are in wait state, other builds are affected, afaict | 09:45 |
paulsherwood | ymmv | 09:45 |
gtristan | paulsherwood, right now I suspect you may be onto something, but it's not clear to me what that is exactly | 09:47 |
paulsherwood | heh | 09:48 |
paulsherwood | once you figure it out, i'll agree that's what i meant all along | 09:48 |
paulsherwood | gtristan: my results are empirical. fastest wallclock for the whole load, for the fastest/biggest kit i could afford | 09:49 |
gtristan | Have we proven that spawning processes in wait state in advance of actually processing them, incurs a non negligible overhead ? | 09:49 |
paulsherwood | i didn't dig into that level of detail - these tests were quite time-consuming and expensive as it is :) | 09:49 |
paulsherwood | i did collect the logs, though | 09:49 |
gtristan | the tests should be checking "what does it cost us more to spawn additional tasks in advance" | 09:49 |
paulsherwood | from my pov, the wallclock time was the metric i wanted to optimise, period | 09:50 |
paulsherwood | and as someone paying for both engineer time and computer/cloud time, that's the one that i continue to care about | 09:50 |
gtristan | And you found that "reducing number of spawned jobs does not cause the build to take longer to complete", correct ? | 09:50 |
gtristan | But are worried that for some reason, a build which spawns more tasks in advance may be more costly | 09:51 |
paulsherwood | i wasn't testing that. i was just interested in getting to fastest builds, by varying instances vs max-jobs, after selecting highest perf/price aws | 09:51 |
gtristan | I dont see how that follows, but that's the metric we want | 09:52 |
paulsherwood | iirc wallclock ci.morph for instances: 4, max-jobs: 9 was slower than instances: 4, max-jobs: 8 on a 32 core machine | 09:52 |
paulsherwood | and instances:5 max-jobs anything was slower still | 09:53 |
paulsherwood | (but i'd have to check the logs, and i'm unable to spend more time on this now, sorry) | 09:53 |
* paulsherwood notes that ybd already got to be way faster than morph, by being able to support instances: >1, and the knowledge that max-jobs: cores was not normally slower than max-jobs: cores*1.5 | 09:55 | |
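One hedged way to read those numbers (an illustration of the arithmetic only, not a claim about the cause of the slowdown): 4 instances at max-jobs 9 request more parallel jobs than the 32 cores available, while 4 x 8 exactly fills the machine.

```python
# Total parallel jobs requested versus the 32 cores available, for the
# configurations mentioned above. Oversubscription is shown only as
# arithmetic; the chat does not establish it as the cause of the slowdown.
CORES = 32
for instances, max_jobs in ((4, 8), (4, 9), (5, 8)):
    demand = instances * max_jobs
    status = "fits" if demand <= CORES else "oversubscribed"
    print(f"{instances} instances x max-jobs {max_jobs} = {demand} jobs ({status})")
```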
gtristan | An interesting thing to check would be comparing "max-jobs = cores / instances" on a 32core machine... vs the same configuration on a measly 8core laptop | 09:57 |
gtristan | (or 8thread) | 09:57 |
gtristan | I am certain that compiling webkit with only 4 threads while my other instance is crunching configure scripts, is wasting 3 threads most of the time | 09:58 |
paulsherwood | gtristan: it's relatively easy for you to test this, i think :) | 09:59 |
gtristan | (that is quite frustrating :)) | 09:59 |
gtristan | sure, we'll do some benchmarking then, after we have conversions :) | 09:59 |
* paulsherwood did experiment with instances: 2, max-jobs: 8 on 8-core, but can't remember the outcome | 09:59 | |
* gtristan likes benchmarks | 09:59 | |
paulsherwood | https://github.com/devcurmudgeon/build-logs and http://wiki.baserock.org/build-times/ | 10:01 |
*** CTtpollard has quit IRC | 10:03 | |
*** CTtpollard has joined #baserock | 10:06 | |
*** locallycompact has joined #baserock | 10:08 | |
*** ssam2 has joined #baserock | 10:11 | |
*** ChanServ sets mode: +v ssam2 | 10:11 | |
*** jude_ has joined #baserock | 10:14 | |
*** ssam2 has quit IRC | 10:14 | |
*** ssam2 has joined #baserock | 10:16 | |
*** ChanServ sets mode: +v ssam2 | 10:16 | |
*** tiagogomes has quit IRC | 10:23 | |
*** locallycompact has quit IRC | 10:30 | |
jonathanmaw | Has something gone wrong with ybd's CI? the cache_keys stage has been failing with an ugly stack trace even after I reverted my newest changes. | 10:30 |
jonathanmaw | https://gitlab.com/jonathanmaw/ybd/builds/10240954 for example | 10:30 |
pedroalvarez | jonathanmaw: yes | 10:31 |
paulsherwood | jonathanmaw: there was a problem yesterday, https://gitlab.com/baserock/ybd/commit/0af8892a267e9d7f4534c25c1d27feab7ddb9563 | 10:31 |
jonathanmaw | ah, ok. | 10:32 |
*** cornel has quit IRC | 10:46 | |
pedroalvarez | I wish gitlab wasn't stuck in "Checking ability to merge automatically…" | 10:57 |
pedroalvarez | i don't know if i will have to rebase, and re-trigger the CI | 10:57 |
pedroalvarez | or not | 10:57 |
pedroalvarez | lots of results in google regarding this, none of them useful | 11:01 |
*** locallycompact has joined #baserock | 11:01 | |
*** tiagogomes has joined #baserock | 11:06 | |
*** CTtpollard has quit IRC | 11:12 | |
*** CTtpollard has joined #baserock | 11:12 | |
jjardon | pedroalvarez: yeah, seems a problem since the last upgrade | 12:25 |
jjardon | Can someone else approve pedro's MR so we can merge it? https://gitlab.com/baserock/definitions/merge_requests/30 | 12:27 |
paulsherwood | jjardon: done | 13:04 |
pedroalvarez | thanks | 13:05 |
jjardon | thanks for the updates pedroalvarez ! | 13:36 |
*** gtristan has quit IRC | 14:32 | |
*** jonathanmaw has quit IRC | 16:35 | |
*** toscalix has quit IRC | 16:37 | |
*** jonathanmaw has joined #baserock | 16:51 | |
*** toscalix has joined #baserock | 16:54 | |
*** ctbruce has quit IRC | 17:01 | |
*** jonathanmaw has quit IRC | 17:06 | |
*** Guest94123 has quit IRC | 17:16 | |
*** noisecell has quit IRC | 17:18 | |
*** jonathanmaw has joined #baserock | 17:18 | |
*** jonathanmaw_ has joined #baserock | 17:19 | |
*** toscalix has quit IRC | 17:21 | |
*** toscalix has joined #baserock | 17:24 | |
*** jonathanmaw_ has quit IRC | 17:48 | |
*** jonathanmaw has quit IRC | 17:48 | |
*** locallycompact has quit IRC | 17:53 | |
*** locallycompact has joined #baserock | 18:01 | |
*** CTtpollard has quit IRC | 18:04 | |
*** locallycompact has quit IRC | 18:24 | |
*** toscalix has quit IRC | 18:28 | |
*** toscalix has joined #baserock | 18:28 | |
*** toscalix has quit IRC | 19:00 | |
*** tiagogomes_ has joined #baserock | 19:04 | |
*** tiagogomes has quit IRC | 19:04 | |
*** gtristan has joined #baserock | 19:09 | |
*** paulwaters_ has joined #baserock | 19:19 | |
*** ssam2 has quit IRC | 19:24 | |
*** tiagogomes_ has quit IRC | 20:33 | |
*** gtristan has quit IRC | 21:03 | |
*** toscalix has joined #baserock | 21:34 | |
*** toscalix has quit IRC | 22:45 |