paulsherwoodis it true that morph-cache-server is now in morph?10:18
perrylthe separate git repo seems to be marked obsolete on GBO10:18
paulsherwoodthat invalidates most of the email i've been writing, then....10:19
paulsherwoodok... i'll re-phrase :)10:19
perrylAFAICT everything from mcs.git has just been moved to a directory inside morph.git as-is, i could be wrong though10:20
ssam2it would be trivial to move it back out again10:27
ssam2as it is 2 files10:28
paulsherwood15-07-01 15:03:31 [1/273/273] [stage1-binutils] Elapsed time 00:02:3814:06
paulsherwoodon thunderX...14:06
persiaHow many parallel jobs?14:07
rjekAnd how long does it take on other machines?14:08
paulsherwood1 minute on macbook. this is max-jobs=114:09
nowsterpaulsherwood: I take it that's ybd?14:09
persiapaulsherwood: Try max-jobs=9614:09
* rjek finds that Macbooks have surprisingly good storage I/O14:09
rjekHow much RAM does each core have on ThunderX?14:10
rjek(I'm assuming it's all shared, but what is ram_size/cores?)14:10
persia64G/48 cores14:11
persiaAt least for the low-end servers.  Some folk have bigger servers.  I've heard of 256G/96cores, but haven't actually touched such a unit.14:11
rjekNot much RAM per core for doing lots of builds, then14:11
rjekAnd how is RAM arranged?  ie, how many banks and thus how much access must be serialised?14:12
persiaHmm.  Good point.  max-jobs=24 is a better arrangement.14:12
* rjek is thinking that those sort of CPUs might be great for doing processing that fits entirely in cache14:12
* persia doesn't know enough about paulsherwood's specific server.14:12
rjekpersia: Do a build with each of max-jobs=$(seq 1 96) and report back with a histogram >:)14:13
persiaThe results of that would be interesting, actually.14:14
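A minimal sketch of the experiment rjek suggests: time one build per max-jobs value and print a text histogram. The `run` callable is a placeholder for whatever actually starts the build (the log does not give the command), and `time_runs` and `histogram` are hypothetical names used only for illustration.

```python
import time


def time_runs(run, job_counts):
    """Call run(n) once per job count and return {n: elapsed_seconds}."""
    timings = {}
    for n in job_counts:
        start = time.perf_counter()
        run(n)  # assumption: run(n) blocks until the build finishes
        timings[n] = time.perf_counter() - start
    return timings


def histogram(timings, width=40):
    """Render one bar per job count, scaled to the slowest run."""
    longest = max(timings.values()) or 1.0
    lines = []
    for n in sorted(timings):
        bar = "#" * int(round(width * timings[n] / longest))
        lines.append("max-jobs=%-3d %8.2fs %s" % (n, timings[n], bar))
    return "\n".join(lines)
```

With a real build wrapped in `run`, `print(histogram(time_runs(run, range(1, 97))))` would give one row per setting. A single run per value is noisy, so repeating each measurement would be a sensible refinement.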
persiaI don't think we've played much with Baserock on systems with more than 8 cores, so we haven't really been able to measure the right core/memory ratio for building.14:18
paulsherwoodpersia: no difference with 96 afaict14:21
paulsherwoodnowster: yes14:21
persiaInteresting.  I wonder what is taking the time.14:21
KinnisonI strongly doubt that most workloads will parallelise that greatly14:21
Kinnisonyou'd do better having multiple less parallel workloads14:21
persiaHrm?  Doesn't max-jobs=96 mean that it will kick off as many builds within a stratum as it can (up to 96)?14:22
persiaOr, actually, if we're only doing stage1-binutils, that's not something that can parallelise well.14:23
ssam2persia: In Morph, --max-jobs N is used to set MAKEFLAGS to -j N14:28
ssam2that's all14:28
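A minimal sketch of what ssam2 describes: the job count only ends up in `MAKEFLAGS`, so it affects parallelism inside a single `make` invocation and nothing else. This is not Morph's actual code; `build_env` is a hypothetical helper.

```python
import os


def build_env(max_jobs, base_env=None):
    """Return a copy of the environment with MAKEFLAGS set to -jN."""
    env = dict(os.environ if base_env is None else base_env)
    # Only make (and tools that honour MAKEFLAGS) see this value;
    # separate builds are still started one after another.
    env["MAKEFLAGS"] = "-j%d" % max_jobs
    return env
```

The resulting dictionary could be passed as `env=` to `subprocess.run` when invoking a build command. A step that is mostly serial, such as a configure script, would gain little from a larger value.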
paulsherwoodpersia: it doesn't parallelise itself (yet)14:33
KinnisonThat's what distbuild was for, for morph14:34
paulsherwoodpersia: i need to iron out how to clean up dead tmp directories is all... that proved to be a surprisingly complex topic on here a few days ago14:34
paulsherwoodactually, i could let it go without that, and see 'does not clean up' as a tolerable bug14:35
* persia digs through backscroll to develop context for the tmp directories issue15:14
persiaAh, I was actually part of that discussion.  Ignoring the digression on how to make temporary directories secure, the usual way is either to use lockfiles (as Kinnison suggested) or for the controlling program to track the temporary directories, trap on common signals, and attempt to remove directories before exiting.15:19
paulsherwoodin this case the controlling program doesn't want to remove directories on exiting15:24
paulsherwoodand/or there is no 'controlling program'15:24
paulsherwoodi think my use case is: n instances of ybd run. some fail. the temp directories contain useful debris from the crash15:24
* richard_maw would like a tmpdir-daemon which you request a tempdir from, and if you close the fd the daemon cleans up the tempdir, so it's done automatically when your process exits15:25
paulsherwoodbut eventually this fills up disk, so i need some periodic cleanup to notice that some tmp dirs have been untouched and therefore are probably useless15:25
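A minimal sketch of the periodic cleanup paulsherwood describes: remove directories under a base path whose contents have not been modified for some time. "Untouched" is assumed here to mean "no modification time newer than the cutoff", and `reap_stale` and `newest_mtime` are hypothetical names.

```python
import os
import shutil
import time


def newest_mtime(path):
    """Return the most recent mtime of path or anything beneath it."""
    newest = os.lstat(path).st_mtime
    for root, dirs, files in os.walk(path):
        for name in dirs + files:
            try:
                mtime = os.lstat(os.path.join(root, name)).st_mtime
            except OSError:
                continue  # entry vanished while we were scanning
            newest = max(newest, mtime)
    return newest


def reap_stale(base, max_age_seconds, now=None):
    """Delete subdirectories of base untouched for max_age_seconds."""
    now = time.time() if now is None else now
    removed = []
    for entry in sorted(os.listdir(base)):
        path = os.path.join(base, entry)
        if os.path.islink(path) or not os.path.isdir(path):
            continue
        if now - newest_mtime(path) > max_age_seconds:
            shutil.rmtree(path, ignore_errors=True)
            removed.append(path)
    return removed
```

Run from cron or a timer, this keeps recent crash debris around while bounding disk use. A long build that writes nothing for longer than the cutoff would be reaped by mistake, so age alone is a weak signal.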
persiarichard_maw: +115:25
richard_mawpersia: you'd also need it to clean up any tempdirs that are on-disk in its managed area which it doesn't recognise, so it can cleanup on system crash15:26
paulsherwoodrichard_maw: maybe offer that as a systemd function? :)15:26
richard_mawpaulsherwood: I got the inspiration from the systemd-logind inhibitor functionality15:27
persiapaulsherwood: I'd recommend adding a stamp file to each tmpdir when it is created, with the PID of the running integration tool.  When the reaper runs, it can check the creation time of the stamp file, and whether the PID in question is in use, and use that as the basis for a decision.15:27
richard_mawalternatively you can rely on your OS' built-in tempdir management15:27
KinnisonPID reuse is a real race condition people don't take account of :(15:28
richard_mawsystemd will clean up tempfiles older than a specified period15:28
richard_mawbut only if systemd is managing the directory used15:28
persiaKinnison: Good point.  Maybe PID+program start time?15:31
richard_mawthat was what kdbus was doing to mitigate the problem15:32
richard_mawas expected it got a bunch of hate, but is the best approach currently, though now I think people are working on some kind of pidfd15:32
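A minimal sketch of the stamp-file idea with persia's refinement of recording the PID together with the process start time, so that a reused PID does not look like the original owner. It is Linux-specific (it reads `/proc/<pid>/stat`), and the stamp file name and all function names are hypothetical.

```python
import json
import os

STAMP = ".owner-stamp"  # hypothetical file name inside each tmpdir


def process_start_ticks(pid):
    """Return the process start time in clock ticks since boot, or None."""
    try:
        with open("/proc/%d/stat" % pid) as f:
            data = f.read()
    except OSError:
        return None  # no such process, or no /proc on this system
    # The command name (field 2) may contain spaces or parentheses,
    # so split only the text after the last ')'.
    fields = data[data.rindex(")") + 2:].split()
    # fields[0] is field 3 (state); starttime is field 22, index 19.
    return int(fields[19])


def write_stamp(tmpdir, pid=None):
    """Record the owning PID and its start time in the tmpdir."""
    pid = os.getpid() if pid is None else pid
    stamp = {"pid": pid, "start": process_start_ticks(pid)}
    with open(os.path.join(tmpdir, STAMP), "w") as f:
        json.dump(stamp, f)


def owner_alive(tmpdir):
    """True only if the recorded PID exists with the same start time."""
    try:
        with open(os.path.join(tmpdir, STAMP)) as f:
            stamp = json.load(f)
    except (OSError, ValueError):
        return False
    start = process_start_ticks(stamp["pid"])
    return start is not None and start == stamp["start"]
```

A reaper could skip any directory for which `owner_alive` returns true. The check is still not atomic: the owner can exit between the check and the removal, which is the gap a pidfd-style handle would close.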
persiaAre you using morph?  If so, did you create a morph.conf?16:46
persiaThe error message for not doing so can be confusing.16:47
edcraggit's not morph, just running some python16:47
persiaThen the answer to the question is entirely dependent on the system.  Are you running a reference system?  If so, which?  How was it deployed?16:47
edcraggjetson, the usual flashing method16:48
edcraggit was the current release around a week or so ago16:48
richard_mawedcragg: when does it claim to not have any space in /tmp, and how are you proving it should have enough space?16:48
edcraggit's also the same if i unmount /tmp and leave it on the rootfs16:50
richard_mawedcragg: it looks like you're running a deploy of some form, you're right that the tempdir isn't too small, but you've got /dev/sda mounted into a tempdir in /tmp16:51
richard_mawso it's your / that is too small16:51
edcraggi'm still confused... / has plenty of space too16:54
edcraggyes, i'm running a deployment extension directly with python16:54
richard_mawhow big is the thing you want to deploy onto it?16:57
edcraggit's around 1.4 GB16:59
paulsherwoodedcragg: interesting - you may be the first person to run directly with python :)17:00
richard_mawpaulsherwood: incorrect, I've done it before.17:00
paulsherwoodah, i should have known! :)17:00
richard_mawedcragg: what does `btrfs filesystem df` say?17:00
* paulsherwood is the first to run ybd on thunderx though :-)17:01
richard_mawhmm, should still be plenty available17:03
richard_mawwait… is this an upgrade or a raw disk deployment?17:04
edcraggraw disk17:04
richard_mawsince if it's the latter, then you may need to change the DISK_SIZE variable17:04
edcraggchange it to what? it's set to 7G17:04
edcraggit has been working for some time, just suddenly stopped this afternoon17:05
richard_mawhmm, it's not the bug I was thinking of then17:06
edcraggit's a mystery to me...17:09
