IRC logs for #baserock for Wednesday, 2016-04-20

*** astrophys has quit IRC00:14
*** gtristan has quit IRC01:51
*** gtristan has joined #baserock02:19
*** bashrc has joined #baserock08:01
*** edcragg has joined #baserock08:04
*** rdale has joined #baserock08:09
*** bruce_ has joined #baserock08:18
*** anahuelamo has joined #baserock08:19
*** jonathanmaw has joined #baserock08:27
*** edcragg has quit IRC08:30
*** tiagogomes has joined #baserock08:38
*** bashrc has left #baserock08:39
*** edcragg has joined #baserock08:58
*** ssam2 has joined #baserock08:58
*** ChanServ sets mode: +v ssam208:58
*** locallycompact has joined #baserock09:39
*** pedroalvarez has left #baserock09:56
*** pedroalvarez has joined #baserock09:56
*** ChanServ sets mode: +v pedroalvarez09:56
paulsherwoodoverlaps...10:01
paulsherwood/tools/lib/gcc/x86_64-bootstrap-linux-gnu/4.9.2/specs10:01
pedroalvarezyup10:01
pedroalvarezi remember that one10:01
paulsherwoodwhich one should be chosen?10:02
paulsherwoodlinux-api-headers? or what was there before?10:03
rjekWhich chunks are providing the overlapping files?10:03
richard_mawthat overlap is required in the bootstrap10:03
richard_mawotherwise it can't find the tools10:03
paulsherwoodack10:04
paulsherwoodso rdale's proposal may not work for this one10:04
pedroalvarezwe create a specs file in stage2-gcc, and then in stage2-reset-specs we overwrite it IIRC10:05
ssam2the bootstrap is basically going to be a hack for as long as build-depends are transitive10:05
rdalemaybe that is a result of transitive build dependencies10:05
ssam2the stage2 tools need the stage1 tools, but the final tools don't need the stage2 tools10:05
paulsherwoodack10:05
ssam2the stage2 tools need the stage1 tools, but the final tools don't need the stage1 tools, even10:05
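
The point about transitive build-depends can be illustrated with a small sketch. The graph below is hypothetical: the component names only echo the stages mentioned in the discussion and are not taken from the real definitions. It shows how, if build-depends are followed transitively, the final tools end up depending on the stage1 tools even though they only name stage2 directly.

```python
# Hypothetical build-dependency graph (an assumption for illustration only;
# these are not the real Baserock chunk or stratum names).
BUILD_DEPENDS = {
    "stage1-tools": [],
    "stage2-tools": ["stage1-tools"],
    "final-tools": ["stage2-tools"],
}


def transitive_build_depends(component, graph):
    """Return every component reachable through build-depends edges."""
    seen = set()
    stack = list(graph[component])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph[dep])
    return seen


if __name__ == "__main__":
    # final-tools names only stage2-tools, yet stage1-tools comes along too.
    print(sorted(transitive_build_depends("final-tools", BUILD_DEPENDS)))
```

With non-transitive build-depends the result for "final-tools" would be just its direct dependency, which is the behaviour the bootstrap would want.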
jjardonnice, GCC6 has "native" support for musl: https://gcc.gnu.org/gcc-6/changes.html10:06
paulsherwoodi always thought we should have three strata, not 1, for this10:07
pedroalvarezpatches welcome10:07
paulsherwoodheh :)10:07
paulsherwoodi'd patch, if i understood this well enough10:07
paulsherwoodi'll keep thinking about it, though10:08
pedroalvarezit takes a while to get how all of it works, but it's possible to dig into it10:08
paulsherwoodmaybe gtristan can solve it10:08
paulsherwood:)10:09
pedroalvarezhehe10:10
pedroalvareznot sure, but I think he nuked build-essentials for the aboriginal work10:10
paulsherwoodsounds good to me :-)10:11
ssam2if only aboriginal were less complex ...10:18
paulsherwoodyup... but maybe we can shield most folks from the complexity10:24
rjekDoes anybody know why a trove's lorry controller might be sat consuming 100% of a core?10:28
rjekstrace says it's doing nothing but select(), accept()=5, fcntl(5,...), futex(), futex(), repeat10:29
rjekInterestingly it never seems to close fd 510:29
rjek(Might be in another thread?)10:29
richard_mawrjek: I vaguely recall that happening when the database was full of historical jobs, but I added code to lorry-controller to fix that, don't know if we're running that version of course10:30
rjekAnything I can look for that might be diagnostically useful, richard_maw?10:30
richard_mawcheck the lorry controller config, if it's still got the expiry period set to a year, it'll be that problem10:31
pedroalvarezmaybe easier to check the size of the database first10:33
pedroalvarez/home/lorry/webapp.db i think10:33
rjekWhere does the lorry controller config live?10:34
richard_maw/etc/lorry-controller* IIRC10:35
benbrown_richard_maw, rjek: can't see that set anywhere, and the default is set to 3 days in lorry-controller-remove-old-jobs11:04
rjekNothing in /etc/lorry-controller/ contains any expiry info11:31
rjek(sorry, got dragged into a meeting)11:31
richard_mawrjek: it'll be set to the default then, which assuming we're running a version with the default fixed to be more reasonable, will be 3 days, and there's something else wrong instead11:43
rjekNod11:49
rjekIs there any logging?11:49
rjekOh yes.11:50
rjekAnd there's lots of it11:50
rjekThe log is full of minions asking for work, once a second or so11:56
ssam2how big is the ~lorry/webapp.db file ?12:03
ssam2I think it's normal for minions to ask for work once a second or so12:03
rjek~160MB12:03
ssam2hmm, that should be ok12:04
ssam2when I've seen super high CPU usage, it's normally been because that file was > 1GB12:04
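
The size check discussed above could be scripted roughly as follows. This is a minimal sketch: the path is the one guessed at in the conversation, the 1 GB figure is treated as 1 GiB, and the function name is made up for illustration.

```python
import os

ONE_GIB = 1024 ** 3  # the "> 1GB" figure from the discussion, taken as 1 GiB


def db_looks_oversized(path, threshold=ONE_GIB):
    """Return (size_in_bytes, True if the file is larger than threshold)."""
    size = os.path.getsize(path)
    return size, size > threshold


if __name__ == "__main__":
    # Path as guessed in the log; adjust if the trove keeps the file elsewhere.
    path = os.path.expanduser("~lorry/webapp.db")
    if os.path.exists(path):
        size, oversized = db_looks_oversized(path)
        print(f"{path}: {size / 1024 ** 2:.0f} MiB, oversized={oversized}")
    else:
        print(f"{path}: not found")
```

A file of about 160MB would come out well under the threshold, which matches the "that should be ok" assessment.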
*** CTtpollard has quit IRC12:16
*** gtristan has quit IRC12:17
*** CTtpollard has joined #baserock12:19
rjekThis is the loop it's sat in: http://www.rjek.com/p/34677567.txt12:24
jjardonpaulsherwood: do you know if https://github.com/devcurmudgeon/ybd/issues/205 is easily reproducible? I have ybd and kbas running and I'm not getting any error so far12:30
rjeklorry      353 86.6  0.2 503036 41944 ?        Sl   11:55  35:33 /usr/bin/python /usr/bin/lorry-controller-webapp --config=/etc/lorry-controller/webapp.conf12:37
rjekHmm, so it's the webapp, not the controller itself12:37
richard_mawhmm, don't recall anything the webapp would be doing that could peg the CPU like that12:38
rjek-sh: tcpdump: not found12:38
rjekboo12:38
rjekbottle is threaded12:39
* rjek straces with all the threads12:39
ssam2if it's using 100% CPU, probably whatever the problem is will be inside Python itself12:40
ssam2and may not be calling any syscalls12:40
ssam2the only way I know of to live-debug a Python process is to attach gdb, which is not exactly nice12:41
ssam2although there may be some helper scripts i don't know about12:41
*** anahuelamo has quit IRC12:41
rjekhttp://www.rjek.com/p/79731e3e.txt12:41
rjek661's FD 5 is constantly changing12:42
rjek6 is the db12:42
rjekso it's thrashing the DB for some reason12:42
* rjek wonders if it's burning a hole in his shiny expensive SSDs :(12:43
rjekOh, it's reading. :)12:43
richard_mawso, just the atime metadata updates to worry about then?12:45
ssam2could be an sqlite bug even with the small db. you could move the webapp.db file out of the way, and see if the problem goes away12:45
rjekLooks like something is making a great many requests for data from it12:45
rjek(ie, every time I ls /proc/X/fd/, the FD the thread is handling is different)12:45
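
On Linux, a process's open descriptors are exposed as symlinks under /proc/<pid>/fd, so the repeated ls can be replaced by a small helper that maps each descriptor to its target. This is a sketch only; the function name is invented here, and inspecting another user's process needs suitable permissions.

```python
import os


def open_fd_targets(pid):
    """Map each open file descriptor of pid to what it points at (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets


if __name__ == "__main__":
    # Inspect this process itself as a harmless demonstration.
    for fd, target in sorted(open_fd_targets(os.getpid()).items()):
        print(fd, "->", target)
```

Calling it twice a moment apart and comparing the results would show which descriptors are churning (sockets from accept()) and which stay put (the database file).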
rjekssam2: What are the risks in moving the db out of the way?12:46
richard_mawbreaking it completely if you haven't stopped it all first12:46
rjekie, will everything explode into a pile, or will it recreate the DB using information stored elsewhere and continue?12:46
richard_maweverything will explode in a pile since it will fail to initialise the DB correctly if it goes away at runtime rather than startup12:47
pedroalvarezstop webapp service, remove webapp.db, restart services (but there are a few, maybe better reboot)12:48
rjek-sh: service: not found12:48
rjekBah12:48
pedroalvarezsystemctl!!!12:49
rjekWhat ludicrous system-specific syntax do I need for that?12:49
pedroalvarezsystemctl | grep webapp (to find out the name)12:49
pedroalvarezsystemctl stop lorry-controller-webapp.service (IIRC)12:49
ssam2if it breaks completely you can just restart the webapp process12:50
richard_mawnot entirely12:51
richard_mawif it breaks the wrong way it will have an uninitialised database and won't notice it needs to initialise it12:51
rjekRight, stopping webapp, moving db out of the way, rebooting has resulted in a somewhat quieter system12:52
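
The stop, move aside, restart sequence can be written down as a dry-run script so the commands are reviewed before anything is touched. This is a sketch under assumptions: the unit name and database path are the ones recalled from memory in the conversation, the backup path is made up, and it restarts only the one unit whereas the system in the log was rebooted.

```python
import shlex
import subprocess


def db_reset_plan(unit, db_path, backup_path):
    """Build the stop / move-aside / start commands as argv lists."""
    return [
        ["systemctl", "stop", unit],
        ["mv", db_path, backup_path],
        ["systemctl", "start", unit],
    ]


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in plan:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Unit name and paths are assumptions; confirm them first with
    # `systemctl | grep webapp` as suggested in the log.
    plan = db_reset_plan(
        "lorry-controller-webapp.service",
        "/home/lorry/webapp.db",
        "/home/lorry/webapp.db.bak",
    )
    run_plan(plan, dry_run=True)
```

Stopping the service before moving the file matters, since the database disappearing at runtime is the failure case described above. Moving rather than deleting keeps the old database around for later inspection.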
rjekLet's wait and see if it is a /working/ system12:52
rjekthanks all12:53
paulsherwoodlocallycompact: do you have any more info about https://github.com/devcurmudgeon/ybd/issues/205 ?12:54
paulsherwoodi've not seen it anywhere else12:54
pedroalvarezrjek: I suggest having a look later today12:57
rjekpedroalvarez: I've kept the db files anyway12:58
*** anahuelamo has joined #baserock13:06
*** CTtpollard has quit IRC14:02
*** gtristan has joined #baserock14:03
*** CTtpollard has joined #baserock14:16
*** astrophys has joined #baserock14:49
*** ssam2 has quit IRC14:53
*** ssam2 has joined #baserock15:07
*** ChanServ sets mode: +v ssam215:07
jjardonpaulsherwood: Hi, what do the 3 numbers mean in "[0/20/291] [xorg-lib-libfontenc]"? I guess the last one is the total of chunks to build, but what about the other 2?15:09
richard_mawI think it's meant to be "Current job/number of jobs to complete/total number of components in system"15:28
richard_mawSo the middle number doesn't include jobs that are already done15:28
*** bruce_ has quit IRC15:51
*** anahuelamo has quit IRC15:51
*** anahuelamo has joined #baserock15:52
*** bwh_ has joined #baserock15:54
*** fay_ has quit IRC16:17
*** jonathanmaw has quit IRC16:33
paulsherwoodyup :)16:51
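
The counter format explained above can be pulled apart with a short parser. This is a sketch based only on the one example line quoted in the channel and on the reading given there (current job / jobs to complete / total components); the field names are invented for illustration and the real ybd output may vary.

```python
import re

# Matches "[a/b/c] [name]" as in the example quoted in the channel.
PROGRESS_RE = re.compile(r"\[(\d+)/(\d+)/(\d+)\] \[([^\]]+)\]")


def parse_progress(line):
    """Split a progress prefix into its three counters and component name."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    current, to_complete, total = (int(g) for g in match.groups()[:3])
    return {
        "current": current,          # current job
        "to_complete": to_complete,  # jobs still to be done in this run
        "total": total,              # all components in the system
        "component": match.group(4),
    }


if __name__ == "__main__":
    print(parse_progress("[0/20/291] [xorg-lib-libfontenc]"))
```

On that reading, the example line describes a run with 20 jobs left to do out of 291 components, the rest presumably being already built or cached.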
*** ssam2 has quit IRC16:52
*** tiagogomes has quit IRC17:04
*** franred has quit IRC17:13
*** edcragg has quit IRC17:15
*** anahuelamo has quit IRC17:27
*** gtristan has quit IRC17:30
*** locallycompact has quit IRC17:44
*** cosm has quit IRC18:43
*** edcragg has joined #baserock20:19
*** gtristan has joined #baserock20:31
*** edcragg has quit IRC20:51
*** edcragg has joined #baserock21:45
*** edcragg has quit IRC23:11
*** gtristan has quit IRC23:53
