*** thecorconian [~thecorcon@136.1.1.102] has quit [Remote host closed the connection] | 01:06 | |
*** thecorconian [~thecorcon@136.1.1.102] has joined #baserock | 02:07 | |
*** thecorconian [~thecorcon@136.1.1.102] has quit [] | 02:43 | |
*** ssam2 [~ssam2@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 06:54 | |
*** dutch [~william@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 07:20 | |
*** dutch [~william@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Read error: Connection reset by peer] | 07:20 | |
*** ct-dutch [~william@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 07:21 | |
*** tiagogomes [~tiagogome@213.15.255.100] has joined #baserock | 07:33 | |
*** franred [~franred@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 07:47 | |
*** franred [~franred@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Remote host closed the connection] | 07:49 | |
* ssam2 is just finishing up talking in "The GENIVI Baselines" webcast: http://automotive.linuxfoundation.org/webinars | 07:58 | |
ssam2 | there's another one at 15:00 UTC if anyone is interested | 07:59 |
---|---|---|
ssam2 | wait, 16:00 UTC | 07:59 |
ssam2 | timezones are hard | 07:59 |
paulsher1ood | how did it go? | 08:00 |
ssam2 | first question: ""Native-compile only (right now)" -- any tentative timeline we can expect a cross compilation?" | 08:01 |
paulsher1ood | who asked that? | 08:01 |
ssam2 | went well though I think, about 15 attendees | 08:01 |
ssam2 | one of the attendees | 08:01 |
paulsher1ood | what did you tell him/her? | 08:01 |
ssam2 | I said we don't have a timeline at this point | 08:01 |
ssam2 | although we're interested in it | 08:01 |
paulsher1ood | 'when hell freezes over, and the moon falls from the sky' | 08:02 |
ssam2 | you never know :) | 08:02 |
paulsher1ood | i replied to anand. i'm slightly worried that his patch didn't apply cleanly | 08:02 |
ssam2 | Thanks for that | 08:03 |
ssam2 | were you applying on rc3 or rc5 ? | 08:03 |
paulsher1ood | rc5 | 08:03 |
ssam2 | hmm, weird. could be that his patch is against the btrfs-next tree or something | 08:03 |
paulsher1ood | maybe. hopefully he'll notice the difference in my reply | 08:05 |
pedroalvarez | hi all | 08:17 |
pedroalvarez | yesterday I finally found the problem with delta/gusb | 08:17 |
pedroalvarez | (is not lorrying | 08:17 |
pedroalvarez | ) | 08:17 |
Kinnison | oh? | 08:18 |
*** franred [~franred@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 08:18 | |
pedroalvarez | Is failing in this assert: http://git.baserock.org/cgi-bin/cgit.cgi/baserock/baserock/lorry.git/tree/lorry#n286 | 08:18 |
pedroalvarez | and I think I know why and how to reproduce it | 08:19 |
Kinnison | Oh, did it splorf during initial clone? | 08:19 |
Kinnison | mariadb seems to not have completed either :-( | 08:20 |
Kinnison | Despite completing its conversion | 08:20 |
pedroalvarez | Kinnison: we were trying to clone it with an ssh url. So the repository was created but empty. | 08:23 |
pedroalvarez | then we fixed it: http://git.baserock.org/cgi-bin/cgit.cgi/baserock/local-config/lorries.git/commit/?id=c437b37edfe39068853e4bb06b4259a06ca542b9 | 08:23 |
Kinnison | pedroalvarez: Hmm | 08:23 |
Kinnison | pedroalvarez: it can be fixed by a trove admin, but bleurgh | 08:24 |
Kinnison | pedroalvarez: Do you want to fix it or shall I? | 08:24 |
pedroalvarez | Hm.. I'm not sure about how to fix it. Delete the repo? | 08:24 |
Kinnison | in the lorry working area, yes | 08:25 |
Kinnison | Also, that assert is probably removable | 08:25 |
pedroalvarez | is what I was going to suggest. | 08:25 |
Kinnison | it seems over-pernickety | 08:25 |
pedroalvarez | If there is no source code, why do you want to do a backup? | 08:26 |
*** jonathanmaw [~jonathanm@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 08:35 | |
ssam2 | what is going on with http://85.199.252.95/ ? | 08:40 |
ssam2 | seems to have encountered a reproducible error in definitions ref 0a8fa2da3197a43a220e8d68a3287a66f90b2104 | 08:40 |
ssam2 | but then the build failure has magically fixed itself | 08:40 |
ssam2 | without the definitions ref changing | 08:40 |
* ssam2 very much hopes this is a parallel make issue | 08:42 | |
Kinnison | Quite probably | 08:43 |
Kinnison | those kinds of failures in the past tended to be parallel build issues | 08:43 |
persia | Even so, if the build process isn't behaving deterministically, it bodes ill for reproducibility | 08:43 |
ssam2 | persia: agreed. | 08:44 |
ssam2 | I'm not sure what we can do, other than either force 'max-jobs: 1' everywhere | 08:44 |
ssam2 | or stress test all of the chunks so that we discover any broken makefiles | 08:44 |
ssam2 | perhaps fixing up these errors as we see them will be enough | 08:44 |
Kinnison | persia: s'a missing build-dep in the makefile | 08:44 |
paulsher1ood | ssam2: well in this case, force max-jobs for the breaking chunk | 08:44 |
paulsher1ood | ? | 08:44 |
ssam2 | or fix the makefile, yeah | 08:45 |
Kinnison | max-jobs 1 will potentially help | 08:45 |
Kinnison | fixing the makefile is the right approach | 08:45 |
ssam2 | but the general problem is 'something that built successfully once may not build successfully again' | 08:45 |
Kinnison | There's nothing we can do for that beyond either forcing non-parallel builds, or fixing the issues as we find them | 08:46 |
Kinnison | It's very sad, but many upstreams simply don't care about making their makefiles robust | 08:46 |
persia | Another alternative is patching make to accept an input file describing the specific acceptable parallelism, but that's likely to be more painful. | 08:47 |
Kinnison | persia: the issue is a missing dependency | 08:48 |
Kinnison | persia: the acceptable parallelism in any missing deps situation is, erm, one at a time | 08:48 |
persia | Well. no. | 08:48 |
*** violeta [~violeta@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 08:48 | |
persia | It really depends on how fast things take: if something is slow, and other things happening theoretically later have missing deps, the build fails. | 08:48 |
persia | If it's fast the next time (using ccache, for example), then the build succeeds. | 08:49 |
Kinnison | In this case, I think ccache exacerbates the issue | 08:49 |
persia | Although it can also happen as a result of changes in build hardware (more cores, faster cores, etc.). | 08:51 |
Kinnison | yep | 08:52 |
persia | For the mechanism that populates the artifact cache: is there a facility to stage chunk artifacts in such a way that it is possible to test if they misbuilt, or built significantly differently? | 08:53 |
* persia is pondering building everything a few times, and using consensus to determine which is "correct" | 08:53 | |
Kinnison | Unfortunately unclear | 08:54 |
Kinnison | (how to achieve that) | 08:54 |
persia | Why? | 08:54 |
Kinnison | We don't have any useful bit-for-bit comparison stuff yet, so how would you determine if two chunks are equivalent in any automatic way? | 08:55 |
persia | You just build it several times, and compare the results: the content that matches the plurality of builds is probably correct. | 08:55 |
Kinnison | Hmm | 08:56 |
Kinnison | That'd slow distbuilds down somewhat | 08:56 |
persia | bit-for-bit fails. Capture the list of build artifacts, and some relevant aspects of them (symbols definitions for binaries, text match for text, etc.) | 08:56 |
pedroalvarez | delta/gusb fixed | 08:56 |
Kinnison | pedroalvarez: cool | 08:56 |
pedroalvarez | jjardon_: ^ | 08:56 |
persia | By at least a factor of the number of builds participating in consensus, yes. | 08:56 |
persia | But if a given definitions may build or not build non-deterministically, it's one way to deal with that without actually fixing the problem. | 08:57 |
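As a rough illustration of the consensus idea persia describes above, here is a minimal Python sketch. The names `fingerprint` and `consensus`, and the representation of a build as a mapping from path to a content summary, are assumptions made for this example and are not part of Morph. Since persia notes that bit-for-bit comparison fails, the summary is meant to be something like a symbol list or normalised text rather than raw bytes.

```python
import hashlib
from collections import Counter


def fingerprint(artifact):
    """Hash a build result given as {path: content summary} (hypothetical format)."""
    h = hashlib.sha256()
    for path in sorted(artifact):
        h.update(path.encode())
        h.update(b"\0")
        h.update(artifact[path].encode())
        h.update(b"\0")
    return h.hexdigest()


def consensus(builds):
    """Return (index of a build in the plurality, votes), or None on a tie or no builds."""
    if not builds:
        return None
    prints = [fingerprint(b) for b in builds]
    counts = Counter(prints).most_common()
    # A tie between the two most common results means there is no plurality.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    winner, votes = counts[0]
    return prints.index(winner), votes
```

As noted in the conversation, this costs at least one full build per participant in the vote, and it only hides a nondeterministic build rather than fixing the makefile.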
ssam2 | for e2fsprogs, http://git.baserock.org/cgi-bin/cgit.cgi/delta/e2fsprogs.git/commit?h=5c15bf5f978bae01f1ca3cbe6414ab1d355a6adf seems to fix the issue | 08:57 |
ssam2 | so we just need to upgrade the version in use. I'll do a patch | 08:57 |
Kinnison | ssam2: yep, that should fix it | 08:58 |
Kinnison | ssam2: specifically the inclusion of prof_err.h in the deps | 08:58 |
ssam2 | sadly that commit doesn't seem to be in a stable release tag yet | 08:58 |
ssam2 | no, I'm lying | 08:59 |
ssam2 | (confused by 1.42.12 being sorted before 1.42.9 :) | 08:59 |
Kinnison | heh | 09:00 |
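The tag-ordering confusion ssam2 mentions (1.42.12 listed before 1.42.9) is what plain string sorting produces. A small Python sketch of the difference follows; `version_key` is a hypothetical helper that only handles purely numeric dotted tags, not suffixes such as `-rc1`.

```python
def version_key(tag):
    """Split a dotted version such as '1.42.12' into a tuple of integers."""
    return tuple(int(part) for part in tag.split("."))


tags = ["1.42.9", "1.42.12", "1.42.10"]

# String comparison goes character by character, so '1' < '9' puts 1.42.12 first.
lexical = sorted(tags)

# Comparing integer tuples gives the order a human expects.
numeric = sorted(tags, key=version_key)
```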
ridgerun1er is now known as ridgerunner | 09:23 | |
*** ridgerunner [~robjones@access.ducie-dc1.codethink.co.uk] has quit [Quit: leaving] | 09:30 | |
ssam2 | http://wiki.baserock.org/genivi/ says that one needs latest Morph to build the x86_64 GENIVI baseline. I think that's no longer true. Anyone know for sure ? | 09:46 |
ssam2 | also, I guess we can replace the deploy instructions on that page which give an explicit cluster with 'morph deploy clusters/release.morph genivi-baseline-system-x86_64-generic' | 09:47 |
pedroalvarez | ssam2: hmm,,, I think you are right about the "latest Morph" | 09:48 |
ssam2 | did 14.29 support chunks in definitions? I forget now ! | 09:48 |
ssam2 | morph NEWS is only up to date for 14.28 anyway | 09:49 |
ssam2 | I guess we should still recommend to use latest morph, then | 09:49 |
franred | ssam2, I think you need the latest morph to build anything from morph in definitions | 09:50 |
ssam2 | right :) | 09:50 |
ssam2 | someone asked the disk space and CPU requirements for building the baseline too, so I'd like to put that on http://wiki.baserock.org/genivi/. Any rough guesses? I'd put 4GB RAM, 20GB disk space and at least 1GHz CPU, as a best-guess | 09:50 |
*** ridgerunner [~robjones@access.ducie-dc1.codethink.co.uk] has joined #baserock | 09:52 | |
richard_maw | you probably won't need a full 4GB, since it's possible to build a 32-bit baseline | 09:56 |
Kinnison | ssam2: Is there a branch somewhere with the fix for the 3.17 btrfs issue which I could use to test on my Jetson? | 10:02 |
ssam2 | Kinnison: I don't have one, paulsherwood may have | 10:07 |
Kinnison | paulsher1ood: ? | 10:08 |
ssam2 | do we have a public artifact cache I can use? I remember there was one, but I forgot the link | 10:28 |
ssam2 | an artifact cache with up to date builds of master, I mean | 10:28 |
pedroalvarez | ssam2: yeah! | 10:29 |
persia | Can that be added to baserock.org DNS? e.g. artifacts.baserock.org? | 10:29 |
persia | And can morph default to that? | 10:29 |
pedroalvarez | http://http://85.199.252.93:8080/ | 10:29 |
rjek | I put an http in your http. | 10:29 |
pedroalvarez | meh | 10:30 |
pedroalvarez | chrome... | 10:30 |
pedroalvarez | to use it, add the following in morph.conf | 10:30 |
pedroalvarez | artifact-cache-server = http://85.199.252.93:8080/ | 10:30 |
ssam2 | persia: that might be a good solution | 10:31 |
persia | Saves everyone trying to remember annoying numbers | 10:31 |
pedroalvarez | persia: I had that suggestion in my notes | 10:31 |
ssam2 | I think we've always thought that trove.baserock.org should be doing that job, but the current system at trove.baserock.org doesn't have enough disk space to contain continuous builds of the artifacts | 10:31 |
ssam2 | if we had a separate subdomain for artifacts, we could be a bit more flexible | 10:31 |
persia | I'm not convinced I want the same backing hardware for git repos and an artifact cache. | 10:32 |
pedroalvarez | I agree with persia | 10:32 |
ssam2 | logical | 10:32 |
ssam2 | i've been brain damaged I guess, because 'everything in one box' is convenient for a developer | 10:33 |
persia | Absolutely. Makes code easy. Causes extreme stress to sysadmins. | 10:33 |
ssam2 | the artifact cache isn't working, sadly, and Morph doesn't give me sufficient output to diagnose why | 10:34 |
ssam2 | I could have given it the wrong URL or whatever. I'll see about improving the error logging to find out why. | 10:34 |
persia | What do you mean by "isn't working"? | 10:34 |
ssam2 | ah, perhaps it is working | 10:34 |
pedroalvarez | persia: the thing is that this artifact-cache server may be temporary, and redeployed | 10:35 |
ssam2 | the problem is actually that the CI isn't building the GENIVI systems, so I have to build them locally | 10:35 |
persia | Isn't it behind NAT anyway? Can't the NAT be adjusted? | 10:35 |
persia | ssam2: Yeah, I think we need elastic build workers before we can build all systems defined in definitions. | 10:36 |
jjardon_ is now known as jjardon | 10:36 | |
jjardon | pedroalvarez: thanks! and sorry for sending the incorrect url the first time | 10:38 |
paulsher1ood | Kinnison: http://git.baserock.org/cgi-bin/cgit.cgi/delta/linux.git/log/?h=baserock/ps/btrfs-fix | 10:40 |
tlsa | ssam2: I did leave a mason instance building GENIVI systems, and uploading artifacts to ct-mcr-1 | 10:41 |
tlsa | ssam2: only for x86-64 though | 10:41 |
paulsher1ood | tlsa: that's a private trove | 10:42 |
tlsa | it was tlsa-controller on ct-stack-2 | 10:42 |
tlsa | oh, I see | 10:42 |
tlsa | yes indeed | 10:42 |
ssam2 | I've switched http://85.199.252.95/ to building release.morph so we can publicise GENIVI artifacts | 10:42 |
tlsa | cool | 10:43 |
ssam2 | I guess this means feedback might be slower. | 10:43 |
ssam2 | but more valuable, as we'll notice if GENIVI breaks :) | 10:43 |
persia | My worry is that it might miss a commit. | 10:43 |
ssam2 | how does including the GENIVI system make it possible for it to miss a commit ? | 10:43 |
persia | But then again, it's better to miss some commits than all of them (which was the case before that was present). | 10:44 |
tlsa | maybe that mason instance could be redeployed with several distbuild workers | 10:44 |
tlsa | to speed up cycle time | 10:44 |
persia | Unless I missed something in the backscroll I have yet to process, there's a timed poll, so that if two commits to definitions.git happen in sufficiently close succession, one may be missed. | 10:45 |
persia | Slowing down the build time increases the window of time in which this can happen. | 10:45 |
tlsa | persia: that's correct | 10:45 |
persia | tlsa: What happens if the prior build isn't yet complete when the next poll occurs? | 10:45 |
persia | Or does it only poll when it's not building? | 10:46 |
tlsa | it doesn't poll until a cycle is complete | 10:46 |
pedroalvarez | persia: hm... the same thing will happen when we add tests for the deployed image in mason | 10:46 |
persia | pedroalvarez: yes. | 10:46 |
pedroalvarez | I wonder if we can force mason to just pull one commit | 10:46 |
persia | The solution to these problems is to do things elastically: deploy separate build and test nodes for each commit. | 10:46 |
persia | Which means decomposing the current solution into a controller and workers (each of which workers manage several build/test workers, etc.). | 10:47 |
paulsher1ood | persia: with docker :) | 10:49 |
* richard_maw grumbles about buzzwords | 10:50 | |
persia | paulsher1ood: Why? I'd use a cloud, personally. | 10:51 |
petefotheringham | I thought our preferred eventual solution involved workers registering with the controller and saying 'Gis a job!' | 10:57 |
paulsher1ood | persia: maybe i'm over-optimising :) i was imagining cloud vms with docker containers :) | 10:57 |
persia | Adding more levels of virtualisation is rarely described as "optimisation" | 10:58 |
persia | petefotheringham: That's even better :) | 10:58 |
petefotheringham is now known as petefoth | 10:59 | |
* petefoth wonders where the petefotherignam came from! | 10:59 | |
* paulsher1ood notices an odd message during a `morph deploy` - http://fpaste.org/133843/ | 11:02 | |
richard_maw | paulsher1ood: that's new from Lars' work to reduce the amount of data transferred when uploading images | 11:04 |
paulsher1ood | is it a bug? | 11:05 |
richard_maw | I'll let you know after I've investigated mroe | 11:06 |
richard_maw | s/mroe/more/ | 11:06 |
* paulsher1ood updates morph in the meantime | 11:06 | |
richard_maw | that won't fix it | 11:06 |
richard_maw | in this case, it's because the mktemp on your mac behaves differently to those available on Linux | 11:07 |
paulsher1ood | urgh :/ | 11:07 |
richard_maw | the filename template is optional on Linux, but mandatory on a Mac | 11:08 |
persia | morph runs on a mac now? | 11:09 |
richard_maw | no, the replacement for rsyncing data across runs a shell script on the remote end to convert the input stream into a disk image sparsely | 11:10 |
paulsher1ood | as justin said on the ml, probably we should try to fix rsync? | 11:10 |
persia | Ah, yeah, that needs to be POSIX clean, because we never know what lives on the other end. | 11:10 |
persia | paulsher1ood: How is a userspace tool supposed to know how the OS decided to represent a file? | 11:11 |
richard_maw | I fear the dd command is the show-stopper for Mac support | 11:12 |
persia | Or is the suggestion that rsync grow a flag to cause a written file to be sparse? | 11:12 |
paulsher1ood | http://listmaster.pepperfish.net/pipermail/baserock-dev-baserock.org/2014-September/007906.html | 11:13 |
persia | Oh, heh, yes, that would be a bug :) | 11:13 |
* paulsher1ood is a bug magnet | 11:13 | |
paulsher1ood | so this deploy is just sitting there. i guess i need to rollback morph to before the sparse changes? | 11:15 |
richard_maw | if the dd command on your Mac supports iflag=fullblock, then it's possible to fix the script, which would be better than rolling it back | 11:16 |
pedroalvarez | paulsher1ood: `dd --help` | 11:17 |
paulsher1ood | no help on mac :) it has man, though | 11:17 |
richard_maw | https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/dd.1.html | 11:18 |
richard_maw | no mention of iflag | 11:18 |
paulsher1ood | http://fpaste.org/133846/ | 11:19 |
paulsher1ood | richard_maw beat me to it :) | 11:19 |
* pedroalvarez wonders what happens if the host is running windows | 11:20 | |
richard_maw | not a lot | 11:20 |
ssam2 | presumably there's no ssh server running on Windows anyway, unless the user has Cygwin | 11:20 |
persia | richard_maw: Shouldn't `...conv=sparse...` do the right thing? | 11:20 |
pedroalvarez | ssam2: true :) | 11:21 |
richard_maw | paulsher1ood: in which case the simplest thing would be to either roll back, or try `git revert -m1 ef6a4743aaaada781685ed6988917f299dbcfcda` | 11:21 |
persia | ssam2: There are other ssh servers for windows, but most windows doesn't have the POSIX compat stuff loaded, so can be safely ignored. | 11:21 |
richard_maw | persia: coreutils dd doesn't have conv=sparse, and it doesn't help having to send a long sequence of 0s in the input | 11:22 |
persia | richard_maw: my install of coreutils contains a dd.1.gz that claims to have conv=sparse | 11:23 |
persia | It's 8.20 | 11:23 |
richard_maw | ah, I've got 8.20 | 11:24 |
richard_maw | 8.13 I mean | 11:24 |
persia | You're certainly not alone: I suspect that (or older) to be installed for many folk who don't upgrade often. | 11:25 |
richard_maw | I'm on Debian Wheezy | 11:25 |
persia | Which is a sensible place (current stable). Ubuntu 12.04 is also 8.13 | 11:28 |
richard_maw | fixing rsync would be the most useful solution, but I have no idea what would be involved, and I don't have time to look at it | 11:28 |
richard_maw | I just remember being terrified the last time I looked at rsync's codebase | 11:29 |
pedroalvarez | richard_maw: meh, we sent almost the same reply | 11:38 |
pedroalvarez | richard_maw: I like your "*64" suggestion. | 11:39 |
richard_maw | I'm not sure I do, but I can't adequately express why. | 11:39 |
pedroalvarez | richard_maw: what about `uname -m` vs $MORPH_ARCH | 11:41 |
Kinnison | paulsher1ood: thanks for the linux link | 11:43 |
paulsher1ood | Kinnison: yw | 11:44 |
paulsher1ood | does it work? | 11:44 |
Kinnison | I thought you'd tried it | 11:45 |
paulsher1ood | not on jetson | 11:45 |
* Kinnison is still trying to work out if 3.17 will work on the K1 | 11:45 | |
paulsher1ood | it works on x86 | 11:45 |
Kinnison | It seems to have stuff targeted at K1 | 11:45 |
Kinnison | but I'm unsure if enough of nvidia's stuff is in | 11:45 |
Kinnison | paulsher1ood: btw, Codethink is cited on http://en.wikipedia.org/wiki/Tegra#Tegra_K1 | 11:47 |
Kinnison | There's no mention in mach-tegra on master of the GK20A family of SoCs | 11:47 |
Kinnison | so I'm guessing it's not merged up yet anyway | 11:47 |
Kinnison | which makes me super-sad | 11:48 |
franred | pedroalvarez, richard_maw, thanks for the fast review, I'm trying the "*64" and "$MORPH_ARCH" version of the case statement (this does not include aarch64 yet, though) | 11:51 |
Kinnison | *64 is probably not good enough | 11:52 |
Kinnison | given it'll likely be aarch64le if we do it | 11:52 |
persia | Isn't there already ppc64? | 11:53 |
paulsher1ood | Kinnison: check state of play in #tegra? | 11:54 |
Kinnison | persia: Yes, we have ppc64 | 11:55 |
Kinnison | I'm just downloading the current tegra kernel source to see how it manifests in there | 11:56 |
franred | Kinnison, ok, then I will leave it as richard_maw's 1st suggestion x86_64|ppc64) and when aarch64 comes we can modify or I can add *64*) but I think this is very weak regular expression | 11:56 |
Kinnison | It's a shell case pattern, but yes :-) | 11:57 |
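To see why Kinnison considers "*64" too narrow and franred considers "*64*" too loose, the candidate patterns can be tried against a few architecture names. This sketch uses Python's `fnmatch.fnmatchcase`, whose `*` wildcard behaves like the one in a shell `case` pattern for these inputs; the shell alternation `x86_64|ppc64` is modelled as a list. The `matches` helper is hypothetical, and `aarch64le` is only the name Kinnison guesses at above.

```python
from fnmatch import fnmatchcase


def matches(arch, patterns):
    """True if arch matches any of the glob patterns (like a|b in a case arm)."""
    return any(fnmatchcase(arch, pattern) for pattern in patterns)


EXPLICIT = ["x86_64", "ppc64"]  # richard_maw's first suggestion
SUFFIX = ["*64"]                # matches only names ending in 64
ANYWHERE = ["*64*"]             # matches 64 anywhere in the name

results = {
    arch: (matches(arch, EXPLICIT), matches(arch, SUFFIX), matches(arch, ANYWHERE))
    for arch in ["x86_64", "ppc64", "aarch64le", "armv7l"]
}
```

The explicit list never matches by accident, at the cost of needing an edit whenever a new 64-bit architecture is added.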
Kinnison | radiofree: Okay, I give up, what's your recommendation for Tegra kernels then? | 12:03 |
Kinnison | radiofree: because you have 3.15 and 3.16 branches, but nvidia seem to be shipping 3.10 | 12:12 |
paulsher1ood | Kinnison: we should aim for mainline, with minimum patch set, i believe | 12:13 |
paulsher1ood | radiofree had things working on 3.16 but compilations were unstable iirc. | 12:14 |
* paulsher1ood hopes things have improved in the intervening time | 12:15 | |
Kinnison | paulsher1ood: I am, of course, aiming at mainline, but I'm having difficulty tracking everything, I've found a git tree to poke, but if I get completely lost I'll probably go chat to #tegra | 12:15 |
Kinnison | Specifically I'm just poking through https://git.kernel.org/cgit/linux/kernel/git/tegra/linux.git/ | 12:17 |
Kinnison | Hmm, the tegra 124 family might be the k1 | 12:26 |
Kinnison | I hate marketing terms vs. engineering terms issues | 12:27 |
Kinnison | Okay, I'mma try paulsher1ood's branch, with radiofree's kernel config | 12:30 |
Kinnison | let's see how explodey this is | 12:30 |
*** fay_ [~fay@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Ping timeout: 258 seconds] | 12:31 | |
*** fay_ [~fay@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 12:43 | |
* paulsher1ood crosses his fingers | 12:44 | |
Kinnison | Today I have very little brain | 12:46 |
Kinnison | it's running again with a defconfig and dtb which might actually exist :-) | 12:46 |
*** fay_ [~fay@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Ping timeout: 250 seconds] | 12:51 | |
*** fay_ [~fay@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 12:59 | |
*** thecorconian [~thecorcon@eccvpn1.ford.com] has joined #baserock | 13:00 | |
Kinnison | Does morph deploy not update git repos if it needs to? | 13:14 |
Kinnison | Because I had an odd issue | 13:14 |
paulsher1ood | why would it need to? | 13:14 |
Kinnison | where a ref was on my trove, and distbuild could build with it, but it wasn't in my local git cache | 13:15 |
Kinnison | so when I ran deploy, it failed because the ref was "bad" | 13:15 |
SotK | I'm pretty sure that deploy never updates git repos now | 13:15 |
ssam2 | that's a bug introduced when we made `morph deploy` never update git repos | 13:15 |
Kinnison | Right, well that means I ran face-first into an obvious workflow issue | 13:15 |
paulsher1ood | nor should it, imo. it should only deploy a built system? | 13:15 |
ssam2 | we should revert that change now that Morph doesn't blindly update every ref on each run | 13:15 |
ssam2 | *every repo | 13:15 |
Kinnison | paulsher1ood: You're missing the point, *I* didn't build the system, distbuild did | 13:15 |
Kinnison | and I never had that SHA locally because I never needed it locally | 13:16 |
persia | Why do you need it locally at deploy time? | 13:16 |
Kinnison | persia: because morph needs to calculate the artifact IDs, and to do that it needs tree SHAs etc | 13:16 |
Kinnison | persia: and if we have a repo locally, we use that rather than take network time to talk to the trove | 13:16 |
Kinnison | persia: but we're missing a fallback to say "if the ref isn't in the cache, and I've not tried updating the cache, {update the cache and retry,use the remote git resolver}" | 13:17 |
persia | Why does it need to calculate any artifact IDs other than that of the system being deployed? | 13:17 |
paulsher1ood | i disagree, i think. morph shouldn't need to be calculating all the time. distbuild did the calculation already, can't morph deploy benefit from that? | 13:17 |
Kinnison | paulsher1ood: that calculation happened on the distbuild controller | 13:18 |
paulsher1ood | Kinnison: i know | 13:18 |
Kinnison | persia: erm, to calculate the system artifact needs all the strata artifact ids which needs all the chunk ids | 13:18 |
Kinnison | persia: it's a hierarchy | 13:18 |
Kinnison | paulsher1ood: My point is, without examining the graph to know if anything changed, deploy cannot know if it's the same artifact graph as distbuild built anyway | 13:18 |
paulsher1ood | Kinnison: the output of the calculation is auseful articatc - we could just publish/cache it? | 13:19 |
persia | No, Kinnison is right here. | 13:19 |
paulsher1ood | bah. typing | 13:19 |
paulsher1ood | i still disagree. | 13:19 |
persia | The alternative is to never build anything that isn't committed, at which point we can rely on the definitions tree ID to confirm things haven't changed. | 13:19 |
persia | But that's not how morph works. | 13:19 |
ssam2 | we could potentially cache the cache key of each system morph in a given SHA1 of definitions | 13:20 |
Kinnison | persia: *iff* all the refs happen to be SHA1s | 13:20 |
ssam2 | I think it'd be more trouble than it's worth, and calculating the build graph only takes seconds anyway | 13:20 |
Kinnison | which, in a tree where I have run 'morph edit' is currently not the case | 13:20 |
persia | Ah, right. | 13:20 |
paulsher1ood | distbuild only works on committed stuffs | 13:20 |
ssam2 | it'd be much faster if we batched requests to the remote git cache, too | 13:20 |
Kinnison | ssam2: I believe we modified distbuild to batch some operations. we could batch more | 13:20 |
Kinnison | specifically the "are these artifacts built?" query is batchable I think | 13:21 |
ssam2 | Morph resolves refs one by one | 13:21 |
Kinnison | aye | 13:21 |
Kinnison | those could usefully be batched | 13:21 |
paulsher1ood | ssam2: yes. but why do we do it every time? | 13:21 |
ssam2 | paulsher1ood: if it took a second, why wouldn't we? | 13:22 |
Kinnison | but it'd require a reasonably large rework of the graph walker iirc | 13:22 |
paulsher1ood | it takes minutes | 13:22 |
* Kinnison would much rather make it fast to calculate, than add another potentially out-of-date cache | 13:22 | |
ssam2 | paulsherwood: Not for me. There may be some other performance issue on your system, or an existing performance problem is greatly exacerbated for you | 13:22 |
ssam2 | so we should fix that, not add extra code to cache stuff | 13:22 |
persia | There are already so many caches that there is a significant performance impact maintaining cache consistency | 13:23 |
paulsher1ood | :) | 13:38 |
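Kinnison's point that the system artifact ID depends on every stratum ID, which in turn depends on every chunk ID and tree SHA, can be sketched as a nested hash. This is not Morph's actual cache-key algorithm; `cache_key`, `system_key`, and the chunk, stratum and system names below are assumptions chosen purely for illustration.

```python
import hashlib


def cache_key(name, tree_sha, dep_keys=()):
    """Hash an artifact's name, its source tree SHA, and the keys of its dependencies."""
    h = hashlib.sha256()
    for field in [name, tree_sha, *sorted(dep_keys)]:
        h.update(field.encode())
        h.update(b"\0")  # separator so adjacent fields cannot run together
    return h.hexdigest()


def system_key(chunk_tree_sha):
    """Key for a toy system holding one stratum, which holds one chunk."""
    chunk = cache_key("e2fsprogs", chunk_tree_sha)
    stratum = cache_key("foundation", "stratum-tree", [chunk])
    return cache_key("devel-system", "system-tree", [stratum])
```

Because a change to any chunk's tree SHA propagates up to the system key, a deploy cannot confirm it is using the artifact distbuild produced without resolving every ref in the graph, which is why the missing ref in the local git cache was fatal.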
Kinnison | paulsher1ood: fyi that kernel and dtb built, I'm just waiting on my superslow laptop drive to deploy it :-) | 13:40 |
* ssam2 worries that the build instructions for the ARM Genivi baseline begin: "wget http://download.baserock.org/baserock/baserock-11-devel-system-armv7-versatile.img.gz" | 13:42 | |
ssam2 | any thoughts on what we should change this to? | 13:43 |
ssam2 | on http://wiki.baserock.org/genivi/ | 13:43 |
ssam2 | maybe '1. Buy a Jetson' ? | 13:43 |
paulsher1ood | Kinnison: i wonder whether the whole process would have been faster using a jetson as devel, and upgrading self | 13:43 |
Kinnison | paulsher1ood: possibly, but then I wouldn't have my comfortable dev environment | 13:44 |
* Kinnison is odd | 13:44 | |
pedroalvarez | "buy a jetson board" | 13:44 |
Kinnison | it'd be faster if my laptop had an SSD :-) | 13:44 |
* pedroalvarez is lagging | 13:44 | |
*** genii [~quassel@ubuntu/member/genii] has joined #baserock | 13:49 | |
straycat | Is there an analogue of ATTACH_DISK for openstack deployments? | 14:06 |
richard_maw | straycat: no | 14:07 |
straycat | Okay | 14:07 |
pedroalvarez | straycat: also, the openstack deployment only creates the image. To attach a volume (disk) is at instantiation time. | 14:08 |
straycat | I was going to ask what might be involved in writing an analogue for ATTACH_DISK, but I guess for now it's a case of running the right commands when you instantiate the thing. | 14:10 |
pedroalvarez | richard_maw: Can I use then "if [ $DISTBUILD_GENERIC = True ]", or would you prefer to stick with "if [ -n $DISTBUILD_GENERIC ]" | 14:10 |
pedroalvarez | ? | 14:10 |
pedroalvarez | hm... Am I the only one who doesn't like "openstack-app" for the name of the stratum that only has gerrit at the moment? | 14:15 |
richard_maw | pedroalvarez: so long as you quote $DISTBUILD_GENERIC in the code, I don't mind whether you compare it to True or check whether it's set | 14:17 |
* persia thinks gerrit ought be in a "gerrit" stratum, unless there's some strong expectation to deploy systems containing gerrit *and* something else. | 14:18 | |
petefoth | persia: what an *odd* idea :) | 14:20 |
* franred thought that we aim to add gerrit + zuul + turbo hipster + storyBoard... but maybe I'm wrong | 14:21 | |
franred | petefoth, persia, pedroalvarez ^^ | 14:22 |
persia | franred: Wouldn't one typically deploy those as separate systems, rather than trying to stuff it all in one place? | 14:22 |
paulsher1ood | franred: persia is probably right. he often is :) | 14:23 |
persia | "often" is an operative word here: it's typically a good idea to validate my claims, rather than just believing them as gospel :) | 14:23 |
pedroalvarez | just to validate persia's claims: I raised the point because I had the same opinion | 14:25 |
petefoth | My feeling is that these apps should be in separate chunks which can be combined in different ways into different systems | 14:25 |
petefoth | So our CI Pipeline might consist of a number of co-operating systems. Others might make a system where all of the tools are in the same system | 14:26 |
pedroalvarez | petefoth: did you mean "separate strata"? </google> | 14:26 |
petefoth | pedroalvarez: I almost certainly did :) | 14:26 |
* petefoth is not yet fluent in the Baserock language | 14:26 | |
franred | ok, I will change the name of the stratum from openstack-app to gerrit | 14:27 |
paulsher1ood | can we set a standard for naming of stratum 'foo and its dependencies' - i for one would rather not have it called just 'foo' | 14:29 |
franred | paulsher1ood, this stratum only depends on itself - the system which includes this stratum is called gerrit_x86_64.morph and its cluster openstack-gerrit.morph | 14:31 |
persia | " and its dependencies" is long though, and some strata may sensibly only contain a single chunk. | 14:31 |
paulsher1ood | i wasn't actually suggesting that as a name, just noting the use-case. what about 'gerrit-stuff' as a specific example | 14:32 |
persia | "-stuff" is less bad. | 14:32 |
Kinnison | richard_maw: can we stop morph doing the disk space check on distbuild operations, it's very annoying :-) | 14:33 |
paulsher1ood | gerrit-set ? | 14:33 |
franred | what stuff means in this case? | 14:33 |
persia | "gerrit" is actually bad, because it causes name collision (same name for chunk and stratum). | 14:33 |
paulsher1ood | that's why i raised this | 14:33 |
* paulsher1ood is obviously not making much sense today | 14:33 | |
franred | chunk is called gerrit-installation-binaries | 14:33 |
persia | "patch-manager" might also work, where gerrit is one chunk needed, but others may be useful. | 14:33 |
franred | so no collision in this case | 14:34 |
persia | franred: Yes, but the chunk should be called "gerrit" once it is being built from source. | 14:34 |
richard_maw | Kinnison: I guess | 14:34 |
franred | persia, code-review-manager? | 14:35 |
persia | franred: That also works, although less terse. | 14:36 |
* persia fails at grammar | 14:36 | |
persia | s/terse/tersely/ | 14:36 |
franred | persia, ok, patch-manager then | 14:37 |
franred | pedroalvarez, ^^ any thoughts against patch-manager? | 14:38 |
persia | either is fine. It's not like it's something that gets typed often, and it tab-completes now | 14:38 |
pedroalvarez | franred: better :) | 14:39 |
rjek | Minix is now cross-buildable from Linux. Here's a Debian package definition for it: https://build.opensuse.org/package/show/home:beng-nl/Minix3 | 14:39 |
rjek | It could be an interesting thing to build with Baserock instead, given its aim for high-reliability embedded systems. | 14:40 |
rjek | (It also now has a working ARM port) | 14:41 |
richard_maw | rjek: ARM for which hardware? | 14:45 |
rjek | A8 and Beagle* boards atm. | 14:45 |
petefoth | I'm concerned that the name 'patch-manager' is too generic. 'patch-manager-gerrit' makes clear this is a specific patch manager | 14:46 |
ssam2 | paulsherwood, persia: interesting point about stratum names | 14:49 |
ssam2 | petefoth: I think you've just described the systemd unit naming policy, too :) | 14:49 |
petefoth | ssam2: is that a good thing? | 14:50 |
ssam2 | I think it's sensible, yeah | 14:50 |
ssam2 | https://fedoraproject.org/wiki/Packaging:Systemd#Naming | 14:50 |
ssam2 | "Unit files should be named after the software implementation that they support as opposed to the generic type of software. So, a good name would be apache-httpd.service and bad names would be httpd.service or apache.service as there are multiple httpd implementations and multiple projects produced by the apache foundation." | 14:50 |
persia | In classic distros, one only needs a few systems, expecting users to customise to needs. For a Baserock distro, one ends up wanting a system for every intended use, which can lead to some confusion when also wanting somewhat generic strata. | 14:51 |
persia | Especially in the case of application-centric systems: as an example, what might be sensible strata/system names for a nagios controller? | 14:52 |
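[Editor's sketch, not part of the conversation] The two naming concerns raised above (a chunk and a stratum sharing one name, and a generic name hiding the specific implementation) can be checked mechanically. The Python below is only an illustration under stated assumptions: the function names are hypothetical and are not part of morph or any Baserock tooling, and the names passed in are just the ones mentioned in this discussion.

```python
# Hypothetical helpers; nothing here is a real morph or Baserock API.

def find_name_collisions(chunk_names, stratum_names):
    """Return, sorted, every name used for both a chunk and a stratum."""
    return sorted(set(chunk_names) & set(stratum_names))


def implementation_specific_name(role, implementation):
    """Combine a generic role with the specific implementation that fills it."""
    return f"{role}-{implementation}"


if __name__ == "__main__":
    # Names taken from the discussion above.
    chunks = ["gerrit"]
    strata = ["gerrit", "patch-manager"]
    print(find_name_collisions(chunks, strata))
    print(implementation_specific_name("patch-manager", "gerrit"))
```

Following the systemd convention quoted above, the second helper puts both the role and the implementation in the name, which is the shape petefoth suggested with 'patch-manager-gerrit'.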
*** ct-dutch [~william@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Quit] | 14:54 | |
*** dutch [~william@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 14:54 | |
*** tiagogomes [~tiagogome@213.15.255.100] has quit [Quit: Leaving] | 15:48 | |
*** dutch [~william@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Quit] | 16:15 | |
straycat | Good news, you can format ephemeral disks when you create an instance | 16:35 |
straycat | Bad news, you can't specify a volume label | 16:36 |
straycat | Good news, you can attach a volume when you bring up an instance instead | 16:36 |
straycat | Bad news, gbo is 500 again | 16:36 |
straycat | Does anyone have any more of an idea what's up with it? I noticed at the weekend that one of the lighttpd services was using around 5GB of memory. | 16:37 |
straycat | We can just restart the process again, but that doesn't help us much in the long run. | 16:39 |
*** mwilliams_ct [~mikewilli@access.ducie-dc1.codethink.co.uk] has joined #baserock | 16:44 | |
*** tpollard [~tom@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 16:48 | |
straycat | For the sake of the docs it's probably simpler not to use labels. | 16:56 |
pedroalvarez | franred: wrt gerrit, my previous +1 is still valid | 16:59 |
persia | Labels are useful in a libvirt environment where one is sharing volumes between VMs, but not so much in other contexts | 16:59 |
richard_maw | is it safe to have multiple VMs use the same volume? | 16:59 |
franred | pedroalvarez, cheers, I will merge into master | 16:59 |
pedroalvarez | richard_maw: I think it is not allowed | 17:00 |
persia | richard_maw: Yes, if they aren't running at the same time. | 17:01 |
persia | If you need consistency between two hosts, you need another layer of abstraction. | 17:02 |
persia | Err, two concurrently running hosts, that is. | 17:02 |
richard_maw | probably in the form of a networked filesystem with good locking semantics | 17:02 |
persia | The other common solution is the object store abstraction, without traditional FS semantics | 17:03 |
*** jonathanmaw [~jonathanm@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Leaving] | 17:04 | |
pedroalvarez | urgh... I wonder if we broke this with the automake upgrade: http://pastebin.com/w5z3ApM4 | 17:12 |
persia | Does it build correctly if you reset automake to the old ref? | 17:13 |
radiofree | won't that require a rebuild of most things? | 17:14 |
pedroalvarez | not if I have all the artifacts cached :) | 17:14 |
*** pdar [~patrickda@access.ducie-dc1.codethink.co.uk] has joined #baserock | 17:15 | |
*** ssam2 [~ssam2@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Remote host closed the connection] | 17:22 | |
*** flatmush [~flatmush@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Ping timeout: 272 seconds] | 17:29 | |
*** flatmush [~flatmush@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 17:32 | |
*** franred [~franred@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Leaving] | 17:35 | |
*** violeta [~violeta@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Ping timeout: 255 seconds] | 17:43 | |
pedroalvarez | reverting automake, libevent built | 19:00 |
pedroalvarez | reverting the automake upgrade | 19:00 |
pedroalvarez | hm.. upstream has fixed the problem but there are no stable releases with that fix in them | 19:09 |
pedroalvarez | this is the fix: http://git.baserock.org/cgi-bin/cgit.cgi/delta/libevent.git/commit/?id=a55514eeed96b9bf9a16fbed1a709dfcce5a6080 | 19:10 |
tlsa | can you cherry pick it? | 19:12 |
pedroalvarez | given that it is the makefile of the 'test' folder, I was wondering if I can disable the tests | 19:12 |
tlsa | ah | 19:13 |
pedroalvarez | so I don't have to create a branch for it | 19:13 |
pedroalvarez | tlsa: but yes, that is also a good idea | 19:14 |
pedroalvarez | tlsa: git cherry-pick worked. I'm going to opt for that solution. | 19:23 |
tlsa | cool | 19:23 |
paulsher1ood | gbo is down... | 20:24 |
straycat | again? | 20:31 |
straycat | :/ | 20:31 |
straycat | Looks like the same problem | 20:32 |
*** benbrown_ [~benbrown@access.ducie-dc1.codethink.co.uk] has quit [Ping timeout: 245 seconds] | 20:38 | |
*** benbrown_ [~benbrown@access.ducie-dc1.codethink.co.uk] has joined #baserock | 20:39 | |
*** bjdooks [~ben@trinity.fluff.org] has quit [Ping timeout: 250 seconds] | 22:12 | |
*** thecorconian [~thecorcon@eccvpn1.ford.com] has quit [] | 23:19 | |
*** Kinnison [~dsilvers@gateway/shell/pepperfish/x-pgoajfzccsllvrvs] has quit [Ping timeout: 260 seconds] | 23:19 | |
*** Kinnison [~dsilvers@gateway/shell/pepperfish/x-ytsfmzvonwqarpnd] has joined #baserock | 23:20 |