*** JPohlmann [~jannis@m65s13.vlinux.de] has joined #baserock | 00:15 | |
*** JPohlmann [~jannis@m65s13.vlinux.de] has quit [Changing host] | 00:15 | |
*** JPohlmann [~jannis@xfce/core-developer/JPohlmann] has joined #baserock | 00:15 | |
*** zoli_ [~zoli_@linaro/zoli] has joined #baserock | 03:55 | |
*** thecorconian [~jte@75-27-44-31.lightspeed.orpkil.sbcglobal.net] has quit [Remote host closed the connection] | 04:32 | |
*** zoli_ [~zoli_@linaro/zoli] has quit [Remote host closed the connection] | 04:42 | |
*** aananth [~caananth@74.112.167.117] has joined #baserock | 07:18 | |
*** zoli_ [~zoli_@linaro/zoli] has joined #baserock | 07:56 | |
*** radiofree_ [radiofree@unaffiliated/radiofree] has quit [Read error: Connection reset by peer] | 07:58 | |
*** fay_ [~fay@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Remote host closed the connection] | 07:59 | |
*** wdutch [~william@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 08:02 | |
*** radiofree [radiofree@unaffiliated/radiofree] has joined #baserock | 08:06 | |
*** fay [~fay@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 08:10 | |
fay is now known as Guest12589 | 08:11 | |
*** zoli_ [~zoli_@linaro/zoli] has quit [Remote host closed the connection] | 08:29 | |
*** rdale [~quassel@9.Red-83-45-185.dynamicIP.rima-tde.net] has joined #baserock | 08:43 | |
*** franred [~franred@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 08:45 | |
*** franred [~franred@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Remote host closed the connection] | 08:46 | |
*** zoli_ [~zoli_@linaro/zoli] has joined #baserock | 08:55 | |
*** franred [~franred@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 08:58 | |
*** wdutch [~william@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Quit] | 09:16 | |
*** mariaderidder [~maria@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 09:17 | |
*** wdutch [~william@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 09:18 | |
*** tiagogomes [~tiagogome@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 09:23 | |
*** jonathanmaw [~jonathanm@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 09:26 | |
richard_maw | straycat: If you update your system to include a newer version of yarn, then it will be quicker and require less storage, as it doesn't snapshot directories it doesn't need to | 09:37 |
---|---|---|
Guest12589 is now known as fay_ | 09:37 | |
straycat | richard_maw, okay cool i'll do that | 09:38 |
Mode #baserock +o pedroalvarez by ChanServ | 09:40 | |
* Kinnison congratulates the project -- my baserock-dev folder just hit 10,000 entries :-) | 09:53 | |
*** ssam2 [~ssam2@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 09:54 | |
Mode #baserock +v ssam2 by ChanServ | 09:54 | |
pedroalvarez | Kinnison: yay! | 09:54 |
pedroalvarez | paulsher1ood: wrt oFono; I've just realized that 1.11 >= 1.9 | 09:55 |
pedroalvarez | :) | 09:55 |
Kinnison | heh | 09:55 |
Kinnison | What algorithm were you using for version comparison before?! | 09:55 |
pedroalvarez | not sure, but executed by a human brain | 09:56 |
Kinnison | heh | 09:56 |
* Kinnison strongly recommends using the Debian version number comparison algorithm if you can | 09:56 | |
DavePage | pedroalvarez: Always the weak point :) | 09:59 |
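The 1.11 versus 1.9 mix-up above is what happens when versions are compared as text or as decimals. A minimal sketch of a numeric comparison follows, assuming plain dotted versions made only of digits. This is not the Debian algorithm Kinnison recommends (which also handles epochs, letters and `~`), and the helper names `numeric_version` and `version_at_least` are hypothetical.

```python
def numeric_version(version):
    """Turn a dotted version such as "1.11" into a tuple of ints, (1, 11)."""
    return tuple(int(part) for part in version.split("."))


def version_at_least(have, need):
    """Return True if version `have` is greater than or equal to `need`."""
    # Tuples compare element by element, so (1, 11) >= (1, 9) is True,
    # whereas the string comparison "1.11" >= "1.9" is False.
    return numeric_version(have) >= numeric_version(need)


if __name__ == "__main__":
    print(version_at_least("1.11", "1.9"))
    print(version_at_least("1.11", "1.14"))
```

For anything beyond simple dotted numbers, a real implementation of the Debian ordering is the safer choice.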
pedroalvarez | Also, I realized on friday that we need `morph edit` for now | 10:06 |
pedroalvarez | we are using it in our licensecheck script | 10:06 |
aananth | pedroalvarez: Good Morning! I just checked http://136.18.233.152/cgi-bin/cgit.cgi/delta/linux.git/ of my trove server, the directory is still empty. The server ran for 2 complete days. :-) | 10:11 |
aananth | Any test to see if things go well? | 10:11 |
pedroalvarez | hmm that's odd | 10:11 |
ssam2 | aananth: I think the problem Aananth has is that trove-setup.service failed | 10:11 |
pedroalvarez | ssam2: but I thought that some of the lorries worked | 10:11 |
ssam2 | ah, ok, hmm | 10:12 |
ssam2 | I remember seeing a log from Aananth where trove-setup.service failed because it tried to read SSH host keys from git.baserock.org and couldn't because it lacked proxy config | 10:12 |
ssam2 | which I was hoping to make a fix for today | 10:12 |
pedroalvarez | ah yeah, you missed all the fun last week :) | 10:12 |
*** Krin [~mikesmith@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 10:13 | |
aananth | ssam2: I am sorry, the proxy issue got resolved last week. I failed to add a ',' between username and password. | 10:13 |
ssam2 | so it's a specific lorry (linux.git) that isn't working | 10:13 |
aananth | apart from the above, I had to set up corkscrew, .gitconfig, ssh proxy etc. | 10:14 |
ssam2 | it could be that it's timing out, but really we need to see the logs for that lorry to see what's going on | 10:15 |
aananth | ssam2: yes, today I think all linux*.git repo are empty. Ok I will send the log. Output of journalctl? | 10:15 |
ssam2 | aananth: we should be able to filter the output of journalctl a bit so we only see what's needed | 10:16 |
ssam2 | I'll try to work out the correct commandline | 10:16 |
pedroalvarez | aananth: If you run this: `ssh -L 12765:localhost:12765 root@136.18.233.152` and you keep that ssh connection open, you will be able to open "localhost:12765/1.0/status-html" in your browser | 10:16 |
*** petefoth_ [~petefoth@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 10:18 | |
*** petefoth [~petefoth@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Ping timeout: 245 seconds] | 10:18 | |
petefoth_ is now known as petefoth | 10:18 | |
pedroalvarez | aananth: once you are there we can work out how to get some logs :) | 10:21 |
aananth | ssam2, pedroalvarez: http://fpaste.org/151383/14162197/ | 10:23 |
aananth | pedroalvarez: "localhost" did not resolve, hence i replaced it with my trove-server IP. I got lorry controller status only. | 10:24 |
pedroalvarez | aananth: but you have the button "See separate list of all jobs that have ever been started" | 10:26 |
pedroalvarez | from there you will have access to the list of all the lorry jobs (mirroring attempts) of your trove | 10:27 |
aananth | pedroalvarez: its a link? | 10:27 |
pedroalvarez | aananth: it is, can't you open it? | 10:28 |
ssam2 | <http://localhost:12765/1.0/status-html> should work | 10:28 |
ssam2 | if not try <http://127.0.0.1:12765/1.0/status-html> | 10:29 |
aananth | pedroalvarez: Yes, I am opening, I could see a long list, with last column 0, 1, 127. I am pasting it. | 10:29 |
pedroalvarez | great, | 10:29 |
pedroalvarez | the ones that are working have the 0 | 10:29 |
pedroalvarez | you should be worried about these with 1, 127, etc | 10:29 |
pedroalvarez | If you click in the Job ID of a job that failed, you see a log | 10:32 |
pedroalvarez | you will se a log* | 10:33 |
aananth | pedroalvarez, ssam2: Ok, I am trying to paste it, but it is a huge text, hence it is not working. Ok I will look into those non zero items and report those logs. | 10:34 |
pedroalvarez | aananth: delta/linux would be a good start | 10:34 |
aananth | pedroalvarez, ssam2: Thank you, here it is: http://fpaste.org/151389/20801141/ | 10:39 |
*** rdale [~quassel@9.Red-83-45-185.dynamicIP.rima-tde.net] has quit [Ping timeout: 258 seconds] | 10:40 | |
pedroalvarez | everything looks ok in that log, but it stopped before finishing | 10:43 |
radiofree | new systemd patches work a treat | 10:44 |
ssam2 | radiofree: is that a +1 ? | 10:45 |
radiofree | yes i have +1ed in the list as well | 10:45 |
radiofree | erm apart from one thing actually! | 10:45 |
radiofree | let me respond on the list | 10:46 |
ssam2 | ok, sweet | 10:46 |
pedroalvarez | I wonder how long is the default lorry timeout | 10:47 |
aananth | Few other failures: http://fpaste.org/151391/21427141/ . Everytime linux or linux-rt ran/executed, it failed. What is the meaning of return code '-9'? | 10:50 |
pedroalvarez | radiofree: I'm interested in knowing if it improves the network connectivity on them, and if you can unplug the ethernet cable and plug it again and get connection | 10:51 |
pedroalvarez | aananth: I think that -9 means that the process was killed, and probably because it was taking too long | 10:51 |
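As an illustration of where a status like -9 can come from: Python's `subprocess` module reports a child that was terminated by signal N with a return code of -N on POSIX systems, and signal 9 is SIGKILL. The sketch below assumes a POSIX host; whether Lorry Controller produces its -9 through this same convention is an assumption consistent with the guess above, and the helpers `run_with_timeout` and `describe_status` are hypothetical.

```python
import signal
import subprocess
import sys


def run_with_timeout(argv, timeout_seconds):
    """Run argv and return its exit status, killing it if it runs too long."""
    process = subprocess.Popen(argv)
    try:
        return process.wait(timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        process.kill()  # sends SIGKILL on POSIX
        return process.wait()


def describe_status(status):
    """Explain an exit status in the convention used by subprocess on POSIX."""
    if status == 0:
        return "success"
    if status < 0:
        # Negative values mean the process was terminated by a signal.
        return "killed by signal %d (%s)" % (-status, signal.Signals(-status).name)
    return "failed with exit code %d" % status


if __name__ == "__main__":
    slow_job = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(describe_status(run_with_timeout(slow_job, 0.5)))
```

In this convention a positive status such as 1 or 127 is an exit code chosen by the program or shell itself, while a negative one means something outside the process ended it.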
radiofree | pedroalvarez: yes that was the first thing i tried and yes that works! | 10:52 |
paulsher1ood | pedroalvarez: i can't reproduce it here, because for some reason it's *not public* but in the doc I saw it appears that Bluetooth Handsfree requires 1.14 or greater.... is that a typo, do you think? | 10:52 |
pedroalvarez | aananth: to solve this you can add "lorry-timeout" in your lorry-controller.conf file, like we do in our trove: http://git.baserock.org/cgi-bin/cgit.cgi/baserock/local-config/lorries.git/tree/lorry-controller.conf | 10:52 |
aananth | pedroalvarez: The system is too slow and only 117M free (from free command). Is it worth restarting? | 10:52 |
richard_maw | I'm merging "Use same kernel for jetson genivi and devel systems", just in case anyone else is attempting it and I would tread on toes | 10:53 |
pedroalvarez | paulsher1ood: ah! I thought it was 1.9, I'll look again | 10:53 |
pedroalvarez | radiofree: amazing | 10:53 |
radiofree | it is yes! | 10:54 |
*** petefoth [~petefoth@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Remote host closed the connection] | 11:01 | |
*** petefoth [~petefoth@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 11:02 | |
aananth | pedroalvarez, ssam2: I rebooted after adding timeout on both lorries. But upon reboot, the system reported a disk error and asked me to check (fsck) manually... Maybe this could be the reason? | 11:12 |
JPohlmann | Hi folks! Did anyone ping me in here during the last few days? (The IRC tab was highlighted) | 11:13 |
ssam2 | aananth: I'd expect to see an error in the log for delta/linux if disk corruption caused the delta/linux lorry to fail | 11:14 |
paulsher1ood | JPohlmann: i think you were mentioned in passing. we have logs, now ... http://testirclogs.baserock.org/ | 11:14 |
ssam2 | aananth: definitely run fsck anyway though! | 11:14 |
ssam2 | jpohlmann: hi! I don't remember needing to summon you for anything :) | 11:14 |
JPohlmann | paulsher1ood: Ok :) | 11:15 |
JPohlmann | Yup, found it. Something about a script I wrote that extracts the Bison version from NEWS. | 11:19 |
pedroalvarez | oh yeah :) | 11:20 |
pedroalvarez | I had to use again that patch | 11:20 |
*** locallycompact [~lc@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 11:23 | |
*** rdale [~quassel@9.Red-83-45-185.dynamicIP.rima-tde.net] has joined #baserock | 11:27 | |
aananth | ssam2, pedroalvarez: fscked, added timeout and restarted. Another interesting info: http://fpaste.org/151398/24191141/ clearly say timeout is the reason. | 11:38 |
aananth | Any idea, why the run queue in lorry controller status is still 893? Is that normal? | 11:39 |
pedroalvarez | aananth: yes, that is the normal status. They will be there running every 2 hours (you can configure it) to fetch new changes | 11:40 |
aananth | pedroalvarez: Any reason why the disk corruption happened? Slow CPU / network speed? | 11:40 |
aananth | ... second delta/linux started before the first one got completed? | 11:41 |
pedroalvarez | aananth: no Idea about why the corruption happened | 11:41 |
pedroalvarez | aananth: there is only one delta/linux | 11:42 |
pedroalvarez | aananth: why do you think there are 2? | 11:44 |
pedroalvarez | "Richard Moore pointed out that the URLs aren't terminated with quotes" | 11:49 |
pedroalvarez | was this Richard Maw? | 11:49 |
pedroalvarez | :) | 11:50 |
richard_maw | yes | 11:50 |
rdale | oops sorry | 11:50 |
richard_maw | no worries | 11:50 |
aananth | pedroalvarez: I assume that the delta/linux will be queued every 2 or 6 hrs and I meant the 2nd instance of the queue started before the first job was completed... etc :-) | 11:53 |
pedroalvarez | ah! no, that can't happen. delta/linux will appear in the queue again once it finishes, but not before | 11:54 |
aananth | pedroalvarez: can such a scenario happen, in case of slow network & cpu? | 11:54 |
aananth | Ok | 11:55 |
pedroalvarez | aananth: btw, I think you don't need to reboot the machine everytime you want to change the configuration | 11:57 |
pedroalvarez | it was needed when trove-setup was failing, but not anymore :) | 11:57 |
aananth | Ok | 11:57 |
aananth | pedroalvarez: So, if I change the timeout value and push the config, will the system take the new timeout value immediately, or only after I perform "systemctl start lorry-controller-readconf.service"? | 11:59 |
pedroalvarez | you can run that, and you will force the lorry-controller to read the configuration | 12:00 |
pedroalvarez | but that is going to happen anyway | 12:00 |
aananth | Ok, so the lorry check its configs before it starts loading/unloading work! Very intelligent lorry :-) | 12:01 |
pedroalvarez | :D | 12:01 |
jjardon | pedroalvarez: hi! seems /usr/share/zoneinfo is not there anymore, I think its related with the glibc transition | 12:03 |
pedroalvarez | jjardon: I've no idea about what's that and why is that needed | 12:03 |
jjardon | do you remember to touch anything related with the zoneinfo or the tzdata? | 12:03 |
pedroalvarez | jjardon: nope | 12:03 |
jjardon | ok, I will take a look | 12:04 |
pedroalvarez | jjardon: can this be related to "cd o && make localtime=UTC" ? | 12:05 |
pedroalvarez | this is how we build glibc | 12:05 |
jjardon | pedroalvarez: see 6.9.2 in http://www.linuxfromscratch.org/lfs/view/development/chapter06/glibc.html | 12:05 |
jjardon | this used to be there with eglibc, so maybe now it's configured differently, not sure yet | 12:06 |
pedroalvarez | now I'm tempted to install the locales also in glibc | 12:07 |
jjardon | ah, seems the zoneinfo is now provided by tzdata | 12:08 |
jjardon | pedroalvarez: yes please! so I can see my name correctly ;) | 12:08 |
paulsher1ood | jjardon: i vaguely remember tripping over this ages ago | 12:09 |
pedroalvarez | you can install them manually in your system | 12:09 |
jjardon | paulsher1ood: the zoneinfo problem? | 12:09 |
paulsher1ood | yup | 12:09 |
* paulsher1ood can't remember why | 12:10 | |
jjardon | pedroalvarez: I think it would be better if baserock supports UTF8 by default ;) | 12:10 |
paulsher1ood | +1 | 12:11 |
pedroalvarez | jjardon: for locales: `mkdir -p /usr/lib/locale/ && localedef -v -c -i en_GB -f UTF-8 en_GB.UTF-8` | 12:11 |
ssam2 | pedroalvarez: how did we fix Aananth's issue with trove-setup.service last week? | 12:14 |
ssam2 | would http://git.baserock.org/cgi-bin/cgit.cgi/baserock/baserock/trove-setup.git/commit/?h=sam/fix-init-behind-proxy&id=bd80ed9a1690accd8d5dcb964ce387a27f6b014b be useful ? | 12:14 |
ssam2 | (i've not tested that patch yet and I won't bother if we don't need it :) | 12:14 |
jjardon | pedroalvarez: is there a place where I can file a bug or not yet? | 12:16 |
pedroalvarez | jjardon: on the mail list :/ | 12:16 |
pedroalvarez | ssam2: nice! | 12:19 |
pedroalvarez | I really like it | 12:22 |
pedroalvarez | jjardon: so is this a bug? | 12:25 |
jjardon | pedroalvarez: Its a regression, yes | 12:25 |
* pedroalvarez awaits the BUG email :) | 12:26 | |
*** dabukalam [~quassel@ec2-54-69-244-150.us-west-2.compute.amazonaws.com] has quit [Quit: dabukalam] | 12:32 | |
*** dabukalam [~quassel@ec2-54-69-244-150.us-west-2.compute.amazonaws.com] has joined #baserock | 12:33 | |
*** petefoth_ [~petefoth@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 12:37 | |
*** petefoth [~petefoth@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Ping timeout: 265 seconds] | 12:37 | |
petefoth_ is now known as petefoth | 12:37 | |
jjardon | pedroalvarez: sure ;) probably a first step is to lorry ftp://ftp.iana.org/tz/tzdata-latest.tar.gz , ok to push this? http://paste.baserock.org/nucaqasewo.sql | 12:39 |
pedroalvarez | jjardon: I'd prefer ftp://ftp.iana.org/tz/releases/tzcode2014j.tar.gz | 12:43 |
paulsher1ood | isn't there a git repo? | 12:43 |
paulsher1ood | https://github.com/tzinfo/tzinfo | 12:44 |
paulsher1ood | https://github.com/tzinfo/tzinfo-data | 12:44 |
paulsher1ood | pedroalvarez, jjardon ^^ | 12:44 |
pedroalvarez | isn't it that for ruby? | 12:45 |
paulsher1ood | yes. maybe i'm completely off-topic here | 12:46 |
pedroalvarez | paulsher1ood: but you raised a good point :) | 12:46 |
jjardon | paulsher1ood: I think that has a copy of tzdata inside | 12:46 |
straycat | good news, looks like upstream setuptools might accept our patch | 12:47 |
pedroalvarez | jjardon: isn't this also tzdata? http://git.baserock.org/cgi-bin/cgit.cgi/delta/glibc.git/tree/timezone | 12:47 |
jjardon | pedroalvarez: why wouldn't you want to lorry the latest tzdata? doesn't lorry pick up changes in the tarball automatically and make a new commit? | 12:48 |
pedroalvarez | straycat: that's good! | 12:48 |
pedroalvarez | jjardon: I really don't know, and also I don't know where it is going to get the version number from to create a tag | 12:48 |
jmacs | It would be useful to have a description of what "morph branch" and "morph edit" do on http://wiki.baserock.org/contributing/ | 12:48 |
ssam2 | straycat: cool | 12:49 |
paulsher1ood | jjardon: i'm just generally against tarballs if there's a vcs source | 12:50 |
straycat | I say might because they haven't merged it yet, they seem to want to update some of the docs associated with the change | 12:50 |
jjardon | paulsher1ood: sure | 12:50 |
robtaylor | jjardon: i don't think tarball import can import a series of tarballs and recreate history | 12:50 |
robtaylor | jjardon: would be nice if it did =) | 12:51 |
jjardon | paulsher1ood: https://people.gnome.org/~walters/Tarballs.jpg ;) | 12:51 |
jjardon | pedroalvarez: mmm, looks like it, yes; but at least fedora and arch depends on tzdata to build glibc | 12:52 |
paulsher1ood | petefoth: not sure where this fits in with the current site-structure http://wiki.baserock.org/traceability-and-reproducibility/ | 12:52 |
straycat | http://sprunge.us/BTFE so i could do with this now | 12:53 |
paulsher1ood | straycat: +1 | 12:54 |
petefoth | paulsher1ood: neither am I yet. Let me have a think. It maybe that it should be linked from a top-level story in StoryBoard or in a hypothetical ‘Roadmap’ document. | 12:55 |
pedroalvarez | I still don't like the -bitbucket suffix | 12:55 |
jjardon | robtaylor: btw, about the status of GNOME: it's currently broken because of this tzdata problem: it used to build up to gnome-shell but failed to run (after fixing some problems with the glib schemas not being compiled and dbus services not being installed in the correct location) | 12:55 |
franred | straycat, +1 | 12:56 |
pedroalvarez | jjardon: is this bug also going to be a problem in our devel systems? | 12:56 |
pedroalvarez | or just in your gnome system> | 12:56 |
pedroalvarez | ? | 12:56 |
paulsher1ood | petefoth: -1 for pages which are in your head but don't exist in real life :-) | 12:57 |
petefoth | paulsher1ood: creating it is on my to-do list | 12:57 |
petefoth | I’d like to do it in StoryBoard but…. | 12:57 |
straycat | merged thanks | 12:58 |
*** jonathanmaw [~jonathanm@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Ping timeout: 240 seconds] | 12:59 | |
pedroalvarez | I still think that there wasn't a strong reason to use the bitbucket suffix, and I hope we don't start doing this for new lorries | 13:01 |
petefoth | paulsher1ood: add a readable version of the Traceability document in the misc-docs directory | 13:03 |
*** jonathanmaw [~jonathanm@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 13:20 | |
persia__ is now known as persia | 13:27 | |
*** persia [quassel@2400:8900::f03c:91ff:feae:3452] has quit [Changing host] | 13:27 | |
*** persia [quassel@ubuntu/member/persia] has joined #baserock | 13:27 | |
*** Guest77313 [quassel@2400:8900::f03c:91ff:feae:3452] has quit [Changing host] | 13:27 | |
*** Guest77313 [quassel@ubuntu/member/persia] has joined #baserock | 13:27 | |
Guest77313 is now known as persia_ | 13:27 | |
straycat | not sure what's going on with gbo atm but there's a lot of cvsimport's running that don't look like they're associated with any currently running lorry job | 13:34 |
Kinnison | Possibly killed jobs, cvs might have lost its controlling process group :-( | 13:35 |
paulsher1ood | petefoth: are you saying that page is unreadable? | 14:07 |
petefoth | paulsher1ood: no, but it’s a Google Doc not accessible to non-Codethinkers. So it needs to be moved out of Google Docs (or maybe shared publicly if you want to do that), and put in a format that users of w.b.o can read | 14:08 |
straycat | radiofree, didn't you say you were going to fix vim? :p | 14:10 |
* richard_maw wants to move all configuration file generation out of the install commands, and into deployment time configuration extensions, so that you can change how future vim instances will be configured without affecting the cache key and requiring a rebuild | 14:12 | |
paulsher1ood | petefoth: http://wiki.baserock.org/traceability-and-reproducibility is not a googledoc | 14:13 |
jjardon | straycat: what problem do you have? | 14:13 |
paulsher1ood | nor was it last time i sent the link :-) | 14:13 |
straycat | jjardon, copy pasting | 14:13 |
straycat | can we move webtools into the devel system? or else move pip further down? | 14:14 |
straycat | jjardon, radiofree mentioned the same problem a few weeks ago and i thought he said he was going to fix it :) | 14:14 |
pedroalvarez | straycat: so you need pip for the import tool | 14:14 |
straycat | yeah | 14:15 |
pedroalvarez | makes sense to me | 14:15 |
pedroalvarez | is there anything huge in the webtools? | 14:15 |
jjardon | straycat: you have to comment the "set mouse=a" line in /etc/vimrc but a patch to change this by default would be indeed helpful | 14:16 |
paulsher1ood | i don't think so, pedroalvarez | 14:16 |
paulsher1ood | but maybe it's time to tidy this anyway | 14:16 |
pedroalvarez | in that case, you can go ahead with that, but bear in mind that GNU tar doesn't build | 14:17 |
straycat | oh? why not? | 14:17 |
petefoth | paulsher1ood: sorry! Too much flitting between jobs today :( | 14:18 |
pedroalvarez | straycat: because of the glibc change. I sent a patch last week, not sure if it can be accepted | 14:18 |
straycat | jjardon, that's awesome thanks | 14:18 |
pedroalvarez | I'll take a look at the discussion | 14:18 |
*** aananth [~caananth@74.112.167.117] has quit [Quit: Leaving] | 14:23 | |
*** rdale [~quassel@9.Red-83-45-185.dynamicIP.rima-tde.net] has quit [Ping timeout: 264 seconds] | 15:02 | |
*** genii [~quassel@ubuntu/member/genii] has joined #baserock | 15:03 | |
pedroalvarez | franred, ssam2, how are you today? are you still ok of having a quick meeting here to talk about the present of our infra? Can we do it at 16:00 utc? | 15:03 |
pedroalvarez | s/of having/with having/ | 15:04 |
jmacs | Is there anything better for making Baserock videos with than kdenlive? | 15:04 |
pedroalvarez | kdenlive? is that a tool for recording> | 15:05 |
pedroalvarez | ? | 15:05 |
DavePage | kdenlive is a tool for video editing | 15:05 |
DavePage | Actually it's a tool for eating all your RAM and crashing but hey | 15:05 |
jmacs | Yes, it's a video editor that crashes all the time | 15:06 |
jmacs | Hence me asking if there's anything better | 15:06 |
DavePage | jmacs: Not that I've found. pitivi eats all your RAM and hangs rather than crashing, but that's not a great improvement. | 15:07 |
franred | pedroalvarez, I can do at 16:00 utc | 15:08 |
ssam2 | pedroalvarez: sure | 15:13 |
*** rdale [~quassel@9.Red-83-45-185.dynamicIP.rima-tde.net] has joined #baserock | 15:17 | |
*** rdale [~quassel@9.Red-83-45-185.dynamicIP.rima-tde.net] has quit [Ping timeout: 250 seconds] | 15:36 | |
*** rdale [~quassel@9.Red-83-45-185.dynamicIP.rima-tde.net] has joined #baserock | 15:46 | |
pedroalvarez | Reminder: Baserock Infrastructure meeting in 12 minutes | 15:48 |
*** petefoth [~petefoth@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: petefoth] | 15:48 | |
paulsher1ood | pedroalvarez: what is that, where? | 15:52 |
pedroalvarez | here | 15:52 |
paulsher1ood | ooh, cool :) | 15:52 |
pedroalvarez | you are welcome to join if you have points to raise | 15:53 |
pedroalvarez | everybody is | 15:53 |
pedroalvarez | but the meeting is to talk about the present, not the future | 15:53 |
paulsher1ood | SotK: did you actually put code or an instance of your stuff somewher? | 15:53 |
paulsher1ood | pedroalvarez: is there an agenda? sorry - i've been in my own little world | 15:54 |
* paulsher1ood guesses this may all be on a trello board somewhere :) | 15:54 | |
pedroalvarez | actually there is not an agenda, and is not in the trello board | 15:55 |
pedroalvarez | and I guess I should do it the next time | 15:55 |
SotK | paulsher1ood: I put the code for my prototype stuff here: http://git.baserock.org/cgi-bin/cgit.cgi/delta/openstack/turbo-hipster.git/log/?h=baserock/adamcoldrick/mason-plugin-prototype | 15:55 |
pedroalvarez | is the first time I want to do a meeting on IRC on an opensource project :) | 15:56 |
SotK | paulsher1ood: there isn't an instance of it running somewhere accessible yet though | 15:56 |
pedroalvarez | Infrastructure meeting in 1 minute, please don't interrupt the meeting if is not related. | 15:59 |
pedroalvarez | all right, Baserock Infrastructure meeting starts | 16:01 |
pedroalvarez | franred, ssam2, are you around? | 16:01 |
ssam2 | hi | 16:01 |
franred | hi | 16:01 |
pedroalvarez | anybody else wants to join? | 16:01 |
paulsher1ood | hi | 16:01 |
straycat | eh? wha? | 16:02 |
pedroalvarez | Ok, I'd like to discuss the following points: | 16:02 |
* straycat hides | 16:02 | |
pedroalvarez | * [1] Migration of our Havana instances to Icehouse. | 16:02 |
pedroalvarez | * [2] baserock-clone and cache.baserock.org. | 16:02 |
pedroalvarez | * [3] git.baserock.org. Datacentred Migration. | 16:02 |
pedroalvarez | * [4] Prioritizing the work and check who has time to start with it. | 16:02 |
pedroalvarez | Does any of you have more points to discuss? | 16:02 |
ssam2 | a plan for setting up future infrastructure | 16:03 |
ssam2 | basically to see if you guys like the approach I've been taking in https://github.com/ssssam/test-baserock-infrastructure and want to adopt it or not | 16:03 |
pedroalvarez | cool, that will be [5] | 16:03 |
pedroalvarez | Migration of our Havana instances to Icehouse. | 16:04 |
ssam2 | is there an easy way to do that? | 16:04 |
pedroalvarez | I've been talking with the datacentred guys and we should start moving our infra to the new tenant that we have | 16:04 |
ssam2 | actually, I know using 'nova' you can download an image | 16:05 |
pedroalvarez | the way to do it I believe is: create a snapshot, download it, and upload it to the other tenant | 16:05 |
ssam2 | right | 16:05 |
ssam2 | one issue is that all the public IPs we've been using need to change | 16:05 |
ssam2 | and we have less floating IPs available in the new tenant | 16:05 |
ssam2 | I looked on Friday at how to set up HAProxy, and I think it'd make sense to use that | 16:06 |
paulsher1ood | can't the download/upload be somewhere at dc, to save time? | 16:06 |
paulsher1ood | +1 for haproxy | 16:06 |
pedroalvarez | paulsher1ood: yes, we should do the migration from dc to dc | 16:06 |
ssam2 | paulsher1ood: good idea, better than downloading them to the office! | 16:06 |
pedroalvarez | ssam2: do you think that 10 Ips is not enough for now? | 16:06 |
ssam2 | pedroalvarez: we're using 12 right now :) | 16:07 |
ssam2 | I think setting up HAProxy will be simple enough, and we can then point all our subdomains at one IP and just update the HAProxy config when we want to add/remove/change pieces of infrastructure | 16:07 |
ssam2 | it can forward requests to the correct instance based on subdomain or even matching bits of the URL | 16:07 |
pedroalvarez | sounds like the right thing to do | 16:08 |
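The routing idea ssam2 describes (one public IP, with requests forwarded by subdomain or by part of the URL) can be sketched in a few lines. This is only an illustration of the matching logic, not HAProxy configuration; the hostnames, backend names and the `pick_backend` helper are all hypothetical.

```python
# Hypothetical routing table: (hostname, path prefix, backend name).
ROUTES = [
    ("paste.example.org", "/", "paste-instance"),
    ("logs.example.org", "/", "irclogs-instance"),
    ("git.example.org", "/", "trove-instance"),
    ("git.example.org", "/artifacts/", "cache-instance"),
]


def pick_backend(host, path, routes=ROUTES, default="default-instance"):
    """Choose a backend by exact hostname, preferring the longest path prefix."""
    hostname = host.split(":")[0].lower()  # drop any port, ignore case
    best = None
    for route_host, prefix, backend in routes:
        if hostname == route_host and path.startswith(prefix):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, backend)
    return best[1] if best else default


if __name__ == "__main__":
    print(pick_backend("paste.example.org", "/abc"))
    print(pick_backend("git.example.org:80", "/artifacts/foo"))
```

Adding or moving a piece of infrastructure then means editing the table in one place, which is the property ssam2 is after when he talks about only updating the proxy config.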
*** zoli_ [~zoli_@linaro/zoli] has quit [Remote host closed the connection] | 16:08 | |
paulsher1ood | haproxy on a baserock system should be easy enough... has anyone done that? | 16:08 |
ssam2 | I've been avoiding using Baserock for the infrastructure so far, because all this infrastructure work is unfamiliar territory for me anyway | 16:08 |
paulsher1ood | nod | 16:08 |
ssam2 | I'd like to get it working first, then start moving it to Baserock | 16:09 |
pedroalvarez | I agree | 16:09 |
ssam2 | I can finish setting up HAproxy on Friday, or maybe earlier | 16:09 |
radiofree | straycat: yes i did, for now set mouse-=a | 16:09 |
franred | can not we do some networking trick in neutron to use NAT? | 16:10 |
pedroalvarez | franred: not sure about that, want to research? | 16:10 |
franred | I can have a look yes | 16:10 |
ssam2 | so the main things to migrate are: | 16:11 |
ssam2 | paste. and testirclogs. | 16:11 |
radiofree | what would be the correct patch for that vim thing? | 16:11 |
ssam2 | and the mason-x86-64 and its trove | 16:11 |
pedroalvarez | radiofree: please, can you wait until the meeting ends? | 16:11 |
pedroalvarez | ssam2: yes | 16:11 |
radiofree | oh sorry i had no idea! | 16:11 |
pedroalvarez | radiofree: np | 16:11 |
pedroalvarez | do we agree that we can start moving these things asap? | 16:11 |
ssam2 | could we just redeploy the Mason and the Trove ? should be easy enough | 16:11 |
ssam2 | and just move the Trove's volume to the new instance? | 16:12 |
pedroalvarez | redeploying them makes sense | 16:12 |
ssam2 | I think we agree that we can start moving them ASAP, and I can look at doing some of the work on Friday | 16:12 |
pedroalvarez | about the trove, it was my second bullet point | 16:12 |
ssam2 | I'll leave paste. and testirclogs. to you as you deployed them originally | 16:12 |
pedroalvarez | good | 16:12 |
franred | ssam2, we will have to copy the volume, so we cannot avoid creating a snapshot of it? | 16:12 |
ssam2 | franred: i'm not sure if a snapshot of an instance includes any volumes, to be honest | 16:12 |
ssam2 | i'd have thought that the volume needs to be migrated separately anyway? or am I wrong? | 16:13 |
pedroalvarez | I think you are right | 16:13 |
ssam2 | i'm not sure if we can redeploy the Trove and reuse the volume. I think we can if we have the original cluster morph for the trove kept somewhere | 16:13 |
franred | ssam2, yes, you are right, but could the volumes be copied or do we need to create a snapshot of them? | 16:13 |
pedroalvarez | I think we should move to my second point now | 16:14 |
pedroalvarez | [2] baserock-clone and cache.baserock.org. | 16:14 |
pedroalvarez | should it be the same thing? Do we still want baserock-clone? | 16:14 |
ssam2 | baserock-clone is useful I think, but not valuable | 16:14 |
ssam2 | i.e. it doesn't matter if there's downtime | 16:14 |
pedroalvarez | It is good to use the jetsons we have in DC | 16:14 |
ssam2 | oh, that's true | 16:15 |
franred | I use baserock-clone for my testing and I'm still cloning repos from there | 16:15 |
pedroalvarez | I've been thinking about that, because I want to create a mason instance to test things in armv7lhf, and I found that this mason will be different than the others | 16:16 |
franred | I think we should use as a test baserock-lorry and use it in the new instance too, but not sure if someone is using it at the moment | 16:16 |
pedroalvarez | hm.. I'm still unsure, a trove just to test lorries sounds like too much | 16:16 |
pedroalvarez | but, I think we need it anyway to use the jetsons | 16:17 |
paulsher1ood | i thought we'd end up with lorry in devel so users could test themselves? | 16:17 |
franred | well, problem that we have is, that if we want a clean g.b.o or a g.b.o with mess | 16:17 |
ssam2 | paulsher1ood: i hope that happens, yeah | 16:17 |
paulsher1ood | clean gbo | 16:17 |
pedroalvarez | franred: but not having baserock-clone doesn't mean that we will have a g.b.o with mess | 16:18 |
franred | but we don't have the lorry in devel, so for the moment having a test-trove is the clean solution, I thought | 16:18 |
paulsher1ood | pedroalvarez: we should look closer at what SotK is proposing for mason i think | 16:18 |
pedroalvarez | paulsher1ood: true, but I want discuss the present in this meeting, not the future | 16:18 |
pedroalvarez | once we have sorted out our current situation we can start looking forward | 16:19 |
ssam2 | we need baserock-clone anyway for Jetsons, so I think we are decided that we will keep it and migrate it as soon as possible, which may lead to some downtime for it | 16:19 |
paulsher1ood | pedroalvarez: ok | 16:19 |
franred | ssam2, ok | 16:20 |
pedroalvarez | then, I think that cache.baserock.org should be baserock-clone | 16:20 |
paulsher1ood | +1 | 16:20 |
ssam2 | that works for now | 16:20 |
pedroalvarez | cool | 16:20 |
franred | for now, ok | 16:20 |
paulsher1ood | can we rename baserock-clone to something better? | 16:20 |
ssam2 | in fact, that works forever as long as all our masons upload artifacts to baserock-clone | 16:20 |
pedroalvarez | paulsher1ood: yes, any suggestion? | 16:20 |
paulsher1ood | (mirror?) | 16:20 |
ssam2 | point 3 is about migrating git.baserock.org to datacentred, so perhaps in the end we rename it to 'git.baserock.org' :) | 16:21 |
pedroalvarez | mirror as trove-id? | 16:21 |
paulsher1ood | :) | 16:21 |
paulsher1ood | gbo-mirror as trove id perhaps | 16:21 |
paulsher1ood | can be discussed out of meeting | 16:21 |
pedroalvarez | works for me | 16:21 |
pedroalvarez | fair | 16:21 |
pedroalvarez | ssam2: is a possibility | 16:22 |
richard_maw | unless baserock-clone also has copies of the lorry state, it's not valid to just rename it to git.baserock.org | 16:22 |
pedroalvarez | richard_maw: yeah sure, I know that | 16:22 |
ssam2 | richard_maw: that's true. I just mean that eventually we might not need baserock-clone to exist | 16:22 |
ssam2 | I guess baserock-clone has multiple roles and it's hard to pick a name that reflects all of them | 16:23 |
richard_maw | fair enough, I just dipped into the conversation and was concerned that things would just be renamed | 16:23 |
ssam2 | the instance is currently named 'mason-artifact-cache-server-plus-git.baserock.org-mirror' :) | 16:23 |
pedroalvarez | anything else to raise regarding this point? | 16:23 |
ssam2 | how about baserock-2 ? | 16:23 |
ssam2 | i'll stop, we can discuss that outside this meeting | 16:24 |
pedroalvarez | [3] git.baserock.org. Datacentred Migration. | 16:24 |
pedroalvarez | I just wanted to raise this point, we should move it to DC | 16:24 |
ssam2 | this reminds me of a 6th point we should discuss: backups | 16:24 |
pedroalvarez | but I think we should move other things first so we can test it | 16:24 |
pedroalvarez | ssam2: point added | 16:25 |
ssam2 | we need a lot more capacity for volumes before we can migrate g.b.o, too | 16:25 |
ssam2 | it won't fit in 200GB | 16:25 |
paulsher1ood | i can ask | 16:25 |
pedroalvarez | ssam2: it will, but we need also space for cache.b.o | 16:26 |
DavePage | Is that because g.b.o serves many functions? | 16:26 |
DavePage | It might be worth trying to split that out as part of the migration. | 16:26 |
pedroalvarez | yeah, splitting the cache is part of the migration | 16:26 |
ssam2 | just VCS imports and hosting is still going to take up a lot of space | 16:26 |
ssam2 | it's currently 115GB of artifacts + Gits | 16:26 |
ssam2 | but we will keep adding more stuff, so we'll hit 200GB soon enough | 16:27 |
pedroalvarez | paulsher1ood: I'd appreciate that | 16:27 |
pedroalvarez | ok, so this migration is going to be more complex and it has to wait until we have resources and we have tested DC | 16:27 |
pedroalvarez | and we have a plan for backups | 16:28 |
ssam2 | also, g.b.o is working fine, unlike e.g. Storyboard :) | 16:28 |
pedroalvarez | true, but I know that DavePage is not confident about its security situation | 16:28 |
DavePage | Well, to be specific I'm not confident about the host it's running on either :) | 16:29 |
ssam2 | why? is that not more to do with the lack of a formal security process for the OS it runs, than which VM hosting service it's hosted on? | 16:29 |
DavePage | For starters I could do with rebooting the VM host for a kernel security update. For another thing the host is running kvm/qemu with no security support. | 16:30 |
franred | what is the difference between the host it is running on now and the one it will be running on in DC? | 16:30 |
* paulsher1ood wonders about the expected duration of this meeting | 16:30 | |
ssam2 | ok, so it would be good to migrate anyway | 16:30 |
pedroalvarez | paulsher1ood: I expect we can finish it in 10 minutes | 16:30 |
DavePage | franred: One is my problem, the other is not ;) | 16:31 |
pedroalvarez | should we move to [4] Prioritizing the work and check who has time to start with it.? | 16:31 |
ssam2 | yes | 16:31 |
ssam2 | we need a todo list for this, I guess | 16:32 |
ssam2 | should we use the existing Trello for now ? or a wiki page with a list of tasks ? | 16:32 |
paulsher1ood | +1 for wiki :) | 16:32 |
pedroalvarez | I was going to say trello for simplicity, but ok wiki | 16:33 |
pedroalvarez | things to do: | 16:33 |
paulsher1ood | i'll drop my +1 if others prefer | 16:33 |
pedroalvarez | * Migrate irclogs | 16:33 |
pedroalvarez | * Migrate paste.baserock | 16:33 |
pedroalvarez | * Migrate mason and trove | 16:33 |
ssam2 | I guess mason and trove can be done independently, which makes it slightly less daunting | 16:34 |
ssam2 | just need to update mason with the new IP of the trove | 16:34 |
pedroalvarez | true | 16:34 |
pedroalvarez | I guess I'm missing things | 16:35 |
ssam2 | i still plan to set up HAProxy, an OpenID provider, and Storyboard | 16:35 |
ssam2 | in that order | 16:35 |
pedroalvarez | right, I can do the paste.baserock, and the mason migration tomorrow | 16:35 |
pedroalvarez | and irc logs I guess | 16:35 |
pedroalvarez | and with this we can move to [5] plan for setting up future infrastructure | 16:36 |
franred | pedroalvarez, I can give you a hand | 16:36 |
pedroalvarez | franred: thanks | 16:36 |
ssam2 | I can look at the Trove then | 16:36 |
ssam2 | will be on Friday | 16:36 |
pedroalvarez | ssam2: cool | 16:36 |
pedroalvarez | [5] I think we should follow what ssam2 is doing to set up the infra | 16:37 |
pedroalvarez | https://github.com/ssssam/test-baserock-infrastructure | 16:37 |
pedroalvarez | I'll bear this in mind when doing the migration | 16:37 |
ssam2 | my idea is that Packer maps reasonably closely to 'morph deploy', so there should be fairly clear migration paths when moving stuff to Baserock | 16:38 |
pedroalvarez | makes sense | 16:38 |
ssam2 | we should decide where that repo lives permanently, then. On g.b.o makes sense except then everyone will be mirroring it | 16:39 |
ssam2 | but I suppose it's only ever going to be small | 16:39 |
pedroalvarez | I think that for now that location is ok | 16:39 |
ssam2 | ok | 16:39 |
pedroalvarez | should we move then? | 16:39 |
ssam2 | move on to [6]? ok | 16:40 |
pedroalvarez | backups in DC | 16:40 |
pedroalvarez | I don't know anything about the possibilities yet, but yes, we should ask, get information about our current backups plan and decide what we are going to use in DC | 16:40 |
pedroalvarez | i volunteer to do that | 16:40 |
ssam2 | I was thinking for future infrastructure we should set up one database server shared by all the infrastructure | 16:41 |
ssam2 | Storyboard seems to need MySQL so I guess it'll have to be a MySQL server | 16:41 |
ssam2 | then Gerrit, Storyboard and whatever else can use that and we only need to back that up | 16:41 |
ssam2 | everything else can be redeployed | 16:41 |
* pedroalvarez nods | 16:41 | |
ssam2 | we should also find out what DC's backup policy is | 16:41 |
paulsher1ood | i wonder if we've properly established why SB needs mysql. they used to support pg - maybe we could re-animate that | 16:42 |
ssam2 | hopefully they can take care of physical backups, and we just need to worry about the logical (database) backup | 16:42 |
ssam2 | paulsher1ood: I've not investigated why, I'll ask them | 16:42 |
franred | ssam2, sounds like a good plan | 16:42 |
pedroalvarez | ok, anything else? | 16:42 |
pedroalvarez | I think this is more than enough to start :) | 16:43 |
ssam2 | thanks for running the meeting Pedro | 16:43 |
pedroalvarez | I declare this meeting finished | 16:43 |
straycat | irc meetings are cool | 16:43 |
* paulsher1ood notes that pg works on baserock out of the box...mysql is more work | 16:43 | |
franred | thanks Pedro :) | 16:43 |
straycat | radiofree, not sure, can't it be a patch to vim? | 16:44 |
ssam2 | straycat: my poor eyes beg to differ | 16:44 |
straycat | pedroalvarez, so what's the deal with tar? | 16:46 |
pedroalvarez | straycat: I believe it can be merged :) | 16:47 |
straycat | "it" ? | 16:48 |
ssam2 | paulsherwood: sounds non-trivial to use PostgreSQL for Storyboard --all the migrations are MySQL-specific | 16:48 |
ssam2 | there are about 30 migrations and we'd need to fix and maintain them and any future ones | 16:48 |
wdutch | :w | 16:49 |
straycat | No file name | 16:50 |
*** CTtpollard [~tom@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock | 16:50 | |
paulsher1ood | urgh :/ | 16:50 |
ssam2 | all of https://github.com/openstack-infra/storyboard/tree/master/storyboard/db/migration/alembic_migrations/versions | 16:51 |
straycat | f | 16:51 |
straycat | heh >.> | 16:51 |
ssam2 | there doesn't seem to be much MySQL specific stuff in there, actually | 16:51 |
pedroalvarez | straycat: "it" = the patch I sent to fix tar | 16:52 |
rdale | having something like MYSQL_ENGINE = 'InnoDB' in a database migration seems a bit broken to me | 16:54 |
paulsher1ood | :-) | 16:57 |
straycat | pedroalvarez, is this with upstream? | 16:59 |
pedroalvarez | straycat: no, I just had to upgrade it to a newer, glibc-compatible version | 17:00 |
pedroalvarez | I'll merge it soon | 17:00 |
straycat | why can't we merge it now? | 17:01 |
pedroalvarez | We can, I'm just in the middle of something. If you want to merge it, I'd appreciate it :) | 17:01 |
radiofree | straycat: i was thinking it would be easier to just create a /root/.vimrc file with "set mouse-=a" and install that in the chunk | 17:04 |
radiofree | rather than having to modify the vim source code | 17:04 |
radiofree | s/source code/repo | 17:04 |
straycat | pedroalvarez, it has two +1s, i'm fine merging it if all we're doing is effectively disabling -werror | 17:04 |
richard_maw | radiofree: I don't think vim as root allows .vimrc | 17:04 |
richard_maw | security reasons | 17:04 |
pedroalvarez | straycat: please :) | 17:05 |
richard_maw | though I may be mixing that up with the `vim: foo` lines | 17:05 |
ssam2 | richard_maw: I use a .vimrc in Baserock all the time, so it must work as root | 17:05 |
radiofree | richard_maw: works here | 17:06 |
paulsher1ood | SotK: is there some reason your patches for mason are authored by 'Mason Test Runner'? | 17:06 |
pedroalvarez | straycat: and also you have my +1 to move webtools to devel, although I prefer if you send a patch to see if anybody disagrees | 17:06 |
straycat | have a system integration thing that modified the /etc/vimrc ? | 17:06 |
straycat | *s | 17:06 |
pedroalvarez | s/move /include in/g | 17:07 |
* richard_maw has come to the conclusion that if we can't do atomic runtime fs updates, then we can't do runtime updates at all, as the real value in package-based distributions is that it encodes the logic to safely remove a bunch of files from the filesystem. delta-based application can result in the filesystem being in states not viable for running applications | 17:07 | |
*** wdutch [~william@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Quit] | 17:07 | |
richard_maw | that's the real value of packages | 17:08 |
richard_maw | but if you can do an atomic fs update then you don't need them | 17:08 |
robtaylor | projectatomic | 17:08 |
robtaylor | but I still want to do apps with baserock, but that's something rather different | 17:09 |
ssam2 | robtaylor: does OSTree pivot OS version without rebooting? | 17:09 |
richard_maw | according to the docs I could find it runs `systemctl reboot` | 17:10 |
ssam2 | I had a feeling it requires a reboot to upgrade, but I might be wrong | 17:10 |
robtaylor | ssam2: yep, in project atomic you updates | 17:10 |
robtaylor | umm bad paste | 17:10 |
robtaylor | you systemctl reboot | 17:11 |
ssam2 | right. So it provides a similar thing to what we currently have with Btrfs subvolumes (except with the implementation in userspace instead of in the kernel) | 17:11 |
robtaylor | yep | 17:11 |
* richard_maw needs runtime atomic updates and has been informed that containerising the applications is not an option | 17:12 | |
richard_maw | it's doable with clever use of pivot_root | 17:12 |
robtaylor | richard_maw: is it the applications that need updates or the whole system? | 17:12 |
richard_maw | both | 17:13 |
robtaylor | whole system updates without restart? ouch | 17:13 |
DavePage | kexec? :) | 17:14 |
richard_maw | if init or the kernel changes it's permissible to kexec, but for everything else I need it to stay up | 17:14 |
robtaylor | yep, pivot root is your approach there, and i guess it's up to the sysadmin to figure out when they need to reboot | 17:14 |
robtaylor | all very old school | 17:15 |
paulsher1ood | really? that sounds hard :-) would super-fast boot not be an option? | 17:15 |
robtaylor | paulsher1ood: that would be the modern container-oriented way, indeed | 17:15 |
richard_maw | paulsher1ood: not possible with the class of hardware involved | 17:15 |
*** franred [~franred@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Leaving] | 17:16 | |
richard_maw | kexec is the fastest reboot semantics possible, and I hope the hardware supports kexecing | 17:16 |
robtaylor | mm. sounds big irony ;) | 17:16 |
richard_maw | I couldn't possibly comment. | 17:16 |
robtaylor | richard_maw: hah, just had an evil thought | 17:17 |
richard_maw | robtaylor: do tell >:-D | 17:17 |
robtaylor | richard_maw: you could always boot systems in a pid namespace. Then you can boot up a new system, and use cgroups to close down the old system when it's stopped doing things | 17:18 |
robtaylor | i guess you could end up with some weird behaviour for apps that think they're the only people talking to their store | 17:19 |
robtaylor | hell, just always boot in a container =) | 17:19 |
* paulsher1ood likes it | 17:19 | |
richard_maw | robtaylor: I've been informed that containers impose too much overhead | 17:19 |
robtaylor | probably could just add support to systemd to do this | 17:19 |
paulsher1ood | even better :-) | 17:20 |
robtaylor | richard_maw: um, they don't know what they're talking about then | 17:20 |
paulsher1ood | robtaylor: careful... :) | 17:20 |
robtaylor | richard_maw: there's only overhead when you use VETH devices to bridge network namespaces | 17:20 |
robtaylor | hmm, which you would probably have to do in this model | 17:21 |
richard_maw | robtaylor: that was my first thought, but it means that there's more copies of binaries and libraries around, which means there's more pressure on the caches, so things keep dropping out of it all the time | 17:21 |
robtaylor | not if you do things right | 17:21 |
robtaylor | the pressure on the caches will be the same for the pivot_root approach, actually | 17:21 |
robtaylor | you'll have the old shared libraries and executables mmapped in when you load the new system, | 17:22 |
robtaylor | and no managed way to get rid of them | 17:22 |
robtaylor | if you containerise, you can add systemd commands to query the state | 17:22 |
robtaylor | hmm | 17:22 |
robtaylor | maybe just use a cgroup | 17:22 |
richard_maw | robtaylor: not exactly. With the container approach you need to keep the outer system's libraries pristine when you update the inner one. But with the pivot root, you re-exec _all_ your binaries in the new version | 17:23 |
richard_maw | so there's no processes left using the old binaries | 17:23 |
robtaylor | that's just a reboot | 17:23 |
richard_maw | if you do it right, it's a reboot with no service interruption, since you can have processes gracefully re-exec and keep the connections open | 17:24 |
richard_maw | `systemctl daemon-reexec` is an example of this | 17:24 |
robtaylor | righty | 17:24 |
robtaylor | so you kinda want to do a full-system daemon-reexec? | 17:24 |
richard_maw | yeah | 17:24 |
richard_maw | after migrating all the processes to a new mount tree | 17:25 |
robtaylor | it would then make a lot of sense to put everything in a cgroup, and start a new cgroup when you reexec | 17:25 |
robtaylor | then you can easily track what hasn't restarted | 17:25 |
robtaylor | (and warn/debug if it's having problems) | 17:25 |
robtaylor | does that make sense? | 17:26 |
richard_maw | interesting idea, and if it maintains the systemd state you can track it back to the service, and have a nice interface to be able to tell the service to gracefully re-exec before forcing it to | 17:27 |
robtaylor | yep | 17:27 |
*** tiagogomes [~tiagogome@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Leaving] | 17:27 | |
robtaylor | (can you tell i've recently spent a lot of time trying to understand containers? ;) ;)) | 17:27 |
richard_maw | first class support for this in systemd would be the best place to put this pivot and gracefully re-exec logic I think | 17:28 |
*** jonathanmaw [~jonathanm@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Leaving] | 17:28 | |
robtaylor | yep, i expect it'll be gladly accepted and walters will love you | 17:28 |
richard_maw | currently there's not a race-free way of doing this, unless you can freeze all the processes | 17:28 |
robtaylor | well, you can | 17:28 |
robtaylor | you can freeze a cgroup | 17:28 |
robtaylor | and be told when everything is quiescent | 17:29 |
robtaylor | https://www.kernel.org/doc/Documentation/cgroups/freezer-subsystem.txt | 17:29 |
robtaylor | but if everything providing a service is a well behaved socket-activated unit, you could totally do it race free | 17:30 |
robtaylor | old active connections carry on using the old service, new connects startup and use the new service | 17:31 |
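The handover described above depends on a newly started service picking up listening sockets that something else already holds open. A minimal sketch of the receiving side, assuming the documented systemd convention (`LISTEN_PID`, `LISTEN_FDS`, descriptors numbered from 3); the function name `inherited_listen_fds` is hypothetical and nothing here comes from the log itself:

```python
import os

SD_LISTEN_FDS_START = 3  # first inherited descriptor in the systemd convention


def inherited_listen_fds(environ=None, pid=None):
    """Return the descriptor numbers handed over by a socket-activating parent.

    `environ` and `pid` are parameters only so the parsing can be exercised
    without a real activation; a service would normally call this with no
    arguments.
    """
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    try:
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []  # the descriptors were meant for another process
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        return []  # variables missing or malformed: not socket-activated
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + max(count, 0)))
```

A service would then wrap each number with `socket.socket(fileno=fd)` and accept new connections on it, while the old instance keeps serving the connections it already has.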
richard_maw | nice | 17:31 |
robtaylor | and you use the freezer subsystem to tell when the old services have finished | 17:31 |
robtaylor | hmm, maybe | 17:33 |
robtaylor | you may need some more than freezer | 17:34 |
richard_maw | so you'd only worry about ensuring systemd is properly migrated, and over time the system eventually migrates itself | 17:34 |
robtaylor | oh | 17:34 |
robtaylor | use notify_on_release rather than freezer | 17:34 |
robtaylor | and maybe have a way to manually freeze the old system if it's being bad about restarting | 17:35 |
robtaylor | (or maybe some units need some special handling) | 17:35 |
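The two cgroup mechanisms mentioned here are both driven by writing to control files. A rough sketch, assuming a cgroup v1 freezer hierarchy where a group's directory contains `freezer.state` and `notify_on_release`; the helper names and the example path are hypothetical, and the hierarchy's `release_agent` still has to be configured separately for the notification to do anything:

```python
from pathlib import Path


def set_notify_on_release(cgroup_dir, enabled=True):
    """Ask the kernel to run the hierarchy's release_agent once the group empties."""
    Path(cgroup_dir, "notify_on_release").write_text("1" if enabled else "0")


def freeze(cgroup_dir):
    """Request that every task in the group be stopped."""
    Path(cgroup_dir, "freezer.state").write_text("FROZEN")


def thaw(cgroup_dir):
    """Let the tasks in the group run again."""
    Path(cgroup_dir, "freezer.state").write_text("THAWED")


def is_frozen(cgroup_dir):
    """True once the kernel reports the whole group as frozen."""
    return Path(cgroup_dir, "freezer.state").read_text().strip() == "FROZEN"
```

For example, `freeze("/sys/fs/cgroup/freezer/old-system")` followed by polling `is_frozen(...)`, since on a real system the state can read `FREEZING` for a while before it settles.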
richard_maw | if I'm understanding your freezing suggestion correctly, then if it behaves that badly, then you're better off killing it and accepting that some connections get dropped | 17:36 |
robtaylor | as long as you can a) tell what is actually happening easily and b) tell stuff to go away if it's being an arse and c) rollback if it's not working | 17:36 |
robtaylor | richard_maw: yeah, probably | 17:36 |
robtaylor | richard_maw: i think the freezer may be useful in terms of checkpointing | 17:36 |
richard_maw | ah, so you're suggesting _process_ rollback may be possible? | 17:36 |
robtaylor | with freezer you can rollback a cgroup | 17:36 |
* robtaylor suddenly feels very evil | 17:37 | |
* richard_maw feels like he's a child, it's christmas and he's been given lots of new toys, some of which have warning stickers | 17:37 | |
robtaylor | you probably don't want that though, as you can only get a consistent checkpoint by forcing a cgroup to be quiescent | 17:38 |
DavePage | "Some of these are toys, some of them are beartraps that will take off your hand. Have fun!" | 17:38 |
robtaylor | richard_maw: you probably want to take a look at CRIU | 17:39 |
robtaylor | DavePage: about right | 17:39 |
robtaylor | here be shiny shiny dragons | 17:39 |
richard_maw | robtaylor: probably, I had assumed it required that checkpointed processes need to be restored in an identical filesystem, but I could be wrong | 17:41 |
robtaylor | richard_maw: well you can do that with btrfs | 17:41 |
robtaylor | richard_maw: you can even send/receive a given state to a new system | 17:42 |
* robtaylor all kinds of evil | 17:42 | |
richard_maw | yes, but I don't want to restore it in an identical filesystem, I want to restore it in a different filesystem because it contains updates | 17:42 |
robtaylor | oh, yeah, i'm not suggesting you do that | 17:42 |
robtaylor | that would be bad | 17:42 |
robtaylor | just checkpointing for rollback on failure | 17:42 |
robtaylor | so new system blows up, freeze that, make a chroot with your old snapshot and restart your old cgroup | 17:43 |
robtaylor | and then kill off the borken new cgroup | 17:44 |
richard_maw | that's assuming the services are continuing then re-execing, rather than us freezing the old ones and starting new versions | 17:44 |
robtaylor | (if that makes sense) | 17:44 |
richard_maw | I'm assuming that checkpointing isn't quick, because there's a lot of process data that needs to be serialised | 17:44 |
robtaylor | depends, when you snapshot and send/receive | 17:45 |
robtaylor | but i'm suggesting this sequence -> 1) start up new system in a new cgroup, hand over as per daemon reexec. 2) wait for all the old system to stop doing stuff 3) snapshot 4) kill it | 17:46 |
robtaylor | if that makes sense | 17:46 |
robtaylor | 2) is the hardest bit. I don't think you can assume the services will exec, you'll just have to monitor the cgroup | 17:47 |
robtaylor | s/exec/exit/ | 17:47 |
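Step 2 of that sequence, monitoring the old cgroup until nothing is left in it, could look roughly like the polling loop below. It assumes only that the group's directory has a `cgroup.procs` file listing one PID per line; the function names, timeout and interval are hypothetical choices for illustration:

```python
import time
from pathlib import Path


def cgroup_pids(cgroup_dir):
    """Return the PIDs currently listed in the group's cgroup.procs file."""
    text = Path(cgroup_dir, "cgroup.procs").read_text()
    return [int(pid) for pid in text.split()]


def wait_until_empty(cgroup_dir, timeout=60.0, interval=0.5):
    """Poll until the group has no processes; return False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if not cgroup_pids(cgroup_dir):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A `False` result is the point at which the operator would have to decide between waiting longer, freezing the stragglers, or killing them.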
richard_maw | pivot_root has some limitations a) it works per-namespace, so I'd need to pivot in each namespace, rather than all at once. b) chrooted processes don't get their root changed c) unless the working directory of the process is /, it also won't be chdir'd. b) and c) can be dealt with by chrooting before pivoting, but without a pivot_root that can take fds, you can't always refer to the mount_points | 17:47 |
richard_maw | s/mount_points/new mount point/ | 17:48 |
richard_maw | plus, openat means old processes can still see the old state until they re-open those files | 17:48 |
richard_maw | but that was from when I was thinking I needed to do some magic without cooperation from systemd | 17:49 |
robtaylor | in this scheme you don't really pivot root, you're really doing something like nspawn --share-system | 17:49 |
robtaylor | hmm, you may also want to worry about systemd upgrades =) | 17:50 |
richard_maw | not exactly though, as I need systemd to also do the transition to the new system, which may require systemd being backwards compatible with its serialised state when doing a `systemctl daemon-reexec` | 17:50 |
robtaylor | yep | 17:50 |
robtaylor | that sounds about right | 17:50 |
richard_maw | daemon-reexec is only supported for being able to reload the libraries it depends on and re-execing for debugging | 17:51 |
richard_maw | re-execing for debugging is basically just so you can compile a version with print statements in | 17:51 |
robtaylor | interesting http://www.freedesktop.org/wiki/Software/systemd/SystemUpdates/ | 17:51 |
robtaylor | (not what you want but a little informative) | 17:52 |
richard_maw | it appears to miss the point for me, as it's doing offline update when you have packages available, when one of the advantages of packages is that you can do online updates | 17:53 |
robtaylor | hmm, actually does any state really need to be passed? | 17:53 |
robtaylor | between new and old systemd? | 17:53 |
robtaylor | its just really atomic socket activation handover | 17:54 |
richard_maw | only if we're allowing services to see the new version of the system, which may be allowable, since it's likely to work, as you see the same transition with non-atomic updates | 17:54 |
pedroalvarez | jjardon: I still want to know how critical is the bug you have found with glibc and tzdata.. :P | 17:54 |
richard_maw | as services may be used to being still alive when packages are being updated | 17:55 |
robtaylor | yeah, anything that breaks in this model would have broken before | 17:55 |
robtaylor | indeed, actually the model you're replacing never had atomic handover | 17:55 |
robtaylor | you'd always have downtime unless your service specifically had graceful restart | 17:56 |
robtaylor | (e.g. apache2) | 17:56 |
richard_maw | also irssi according to one source :-) | 17:56 |
richard_maw | you run /update or something | 17:56 |
richard_maw | ah /upgrade | 17:57 |
robtaylor | so you just need a way to say how to graceful in the unit, and if that isn't there, you stop and restart | 17:57 |
robtaylor | and of course, that's already there | 17:57 |
robtaylor | ExecReload=/usr/sbin/apache2 -k graceful $APACHE2_OPTS | 17:57 |
richard_maw | I thought reload was just for config | 17:58 |
*** ssam2 [~ssam2@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Leaving] | 17:59 | |
robtaylor | mm, yes, you'd probably need a new unit line | 17:59 |
*** mariaderidder [~maria@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Ex-Chat] | 18:00 | |
richard_maw | so, we want graceful restarts for everything, but systemd can make that easier to implement for services by them just having a graceful shutdown and being socket activated | 18:00 |
robtaylor | umm, well i'm saying you probably can't have graceful restarts for everything | 18:01 |
robtaylor | but you can where services *support* graceful restarts | 18:01 |
robtaylor | and that can be indicated in the unit | 18:01 |
robtaylor | if they don't, you just stop and start them | 18:02 |
robtaylor | and you'll be at parity with current systems, but a lot more controlled | 18:03 |
*** CTtpollard [~tom@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Ex-Chat] | 18:03 | |
richard_maw | if we had a magic pivot_root that rebased all the paths in all the processes (cwd, root and all open fds) we could get away without special logic in systemd, but it sounds like systemd support would be easier to achieve | 18:05 |
robtaylor | i think bad things would probably happen if you did that | 18:06 |
robtaylor | imagine not all your shared library is paged in | 18:07 |
robtaylor | (or your mmapped data set) | 18:07 |
*** Krin [~mikesmith@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Remote host closed the connection] | 18:07 | |
richard_maw | hm, so not all fds, but fds referring to directories might be ok | 18:08 |
robtaylor | (actually shared library is probably a red herring, as that'll all be in ram anyhow) | 18:08 |
robtaylor | it'd be certainly very tricksy and probably hit a lot of unexercised code paths | 18:09 |
richard_maw | the processes are going to be gracefully restarted soon anyway, so keeping files open wouldn't be too problematic | 18:09 |
jjardon | pedroalvarez: sorry, no time to investigate now, I will take a look when arrive home | 18:09 |
robtaylor | systemd approach would of course be much cooler and give you more kudos ;) | 18:09 |
richard_maw | robtaylor: yeah, and it's unlikely to be accepted as a kernel patch, since it feels like pivot_root is probably already a security vulnerability waiting to happen | 18:10 |
robtaylor | yep | 18:10 |
robtaylor | just realised one issue - actually nothing will support graceful restarts on this kind of system out of the box | 18:11 |
richard_maw | I'll probably still have to knock up a racy prototype, just to show that it's possible, but the systemd route looks plausible | 18:11 |
robtaylor | if you apachectl --graceful, it's expecting to shutdown and do the restart | 18:11 |
richard_maw | robtaylor: oh? | 18:11 |
robtaylor | whereas we want to have two halves | 18:11 |
robtaylor | --graceful-shutdown and --graceful-restart | 18:12 |
richard_maw | I suppose ExecStop is what you want for graceful shutdown anyway | 18:12 |
robtaylor | graceful-stop | 18:13 |
robtaylor | yep | 18:13 |
robtaylor | i was just thinking you could basically just get the old systemd to command the new systemd | 18:13 |
richard_maw | you need systemd to re-exec itself and be PID1 though | 18:14 |
robtaylor | bindmount the new systemd's socket from the new ns into the host | 18:14 |
robtaylor | use pid namespace | 18:14 |
robtaylor | (and user namespace) | 18:14 |
richard_maw | can the parent pid and user namespace go away when that happens? | 18:15 |
robtaylor | i'm still interested to hear why someone thinks containers have a performance impact and where they think that performance impact is | 18:15 |
robtaylor | richard_maw: i think you'd always keep real uid0 and pid 1 pristine | 18:15 |
robtaylor | hmm | 18:16 |
richard_maw | this would have been worth discussing at the Linux Plumbers conference | 18:17 |
robtaylor | we wouldn't have thought of it then | 18:18 |
robtaylor | i'm sure i could arrange some beers with lennart though | 18:18 |
robtaylor | probably after a poc ;) | 18:18 |
richard_maw | poc? | 18:18 |
robtaylor | proof of concept | 18:18 |
richard_maw | yeah, as I said, I'll need to make one anyway for the deadline of my current chunk of work | 18:19 |
robtaylor | cool | 18:19 |
robtaylor | maybe i can finish my app sandboxing stuff in the same timeline ;) | 18:20 |
richard_maw | it's likely to be racy as hell if I want to be able to change everything of importance | 18:20 |
robtaylor | well, you can probably just make everything actually quiesce | 18:21 |
robtaylor | the tricky bit will be handing over the sockets between the systemds | 18:21 |
richard_maw | but if we can get online atomic updates going, then it could be the nail in the coffin for packages | 18:22 |
robtaylor | yep, this would get widely used, i'm sure | 18:22 |
robtaylor | it solves the real problem you always had with upgrades, old cr*p hanging around and you having no real way to know what was what | 18:22 |
richard_maw | plus you don't need the complication of packages needing to leave the system in a runnable state after every installation or removal, you can get away with just applying a delta | 18:23 |
robtaylor | maybe a first poc would be best done with just working within the same systemd instance | 18:23 |
richard_maw | yeah, pivot_root in all mount namespaces that are just for private mounts, rather than full containers (shared PID namespace probably) | 18:25 |
robtaylor | i'd just use nspawn tbh | 18:26 |
robtaylor | hmm | 18:26 |
* robtaylor is running out of brain. maybe we could pick this up again tomorrow | 18:27 | |
robtaylor | one thing. If the container concern is just cache pressure, you can easily still use containers in this scheme but mitigate that concern | 18:29 |
robtaylor | you have a top level (real) pid 1 that's a systemd. The 'current system' would be a container under that systemd, as would be your 'new system' | 18:30 |
richard_maw | probably, I've got a planning meeting to decide who's doing what to get us closer to the proof of concept level for a whole bunch of stuff, but if I immediately start on picking through the atomic online update stuff a face to face chat about this would probably be of immense value | 18:30 |
robtaylor | I could probably do a face to face tomorrow if you'd like | 18:31 |
robtaylor | oh no i can't , i won't be in mcr | 18:31 |
robtaylor | can do a call/online whiteboard if you'd like | 18:31 |
robtaylor | anyhow, lets catch up tomorrow here first ;) | 18:32 |
richard_maw | sure | 18:32 |
* robtaylor drives home | 18:32 | |
* richard_maw would still like a variant of pivot_root that took fds and had flags for future expansion | 18:34 | |
* richard_maw is a little amused that he might be able to call himself a Linux Plumber in the future | 18:37 | |
dabukalam | richard_maw: Is a linux plumber anyone that's committed code to the kernel? | 18:52 |
*** cosm [~Unknown@host-78-150-56-250.as13285.net] has quit [Ping timeout: 265 seconds] | 19:02 | |
*** cosm [~Unknown@host-78-150-56-250.as13285.net] has joined #baserock | 19:02 | |
*** cosm [~Unknown@host-78-150-56-250.as13285.net] has quit [Ping timeout: 264 seconds] | 19:19 | |
*** rdale [~quassel@9.Red-83-45-185.dynamicIP.rima-tde.net] has quit [Ping timeout: 256 seconds] | 19:40 | |
*** cosm [~Unknown@cspc154.cs.man.ac.uk] has joined #baserock | 19:54 | |
*** zoli_ [~zoli_@linaro/zoli] has joined #baserock | 19:56 | |
*** zoli_ [~zoli_@linaro/zoli] has quit [Remote host closed the connection] | 19:58 | |
* jjardon merges the systemd 217 branch \o/ | 21:47 | |
paulsher1ood | w00t! :) | 21:50 |
* paulsher1ood kicks off a build | 21:50 | |
robtaylor | dabukalam: linux 'plumbing' is the lower levels of userspace and the user land interfaces of the kernel | 21:52 |
robtaylor | dabukalam: linux plumbers is a conference for this http://www.linuxplumbersconf.org/ | 21:53 |
jjardon | paulsher1ood: :) if you get bored: baserock/jjardon/gstreamer14 is available as well ;) | 22:00 |
* paulsher1ood stops his build, merges the above, and restarts :-) | 22:05 | |
*** genii [~quassel@ubuntu/member/genii] has quit [Read error: Connection reset by peer] | 22:15 | |
cosm | @robtaylor do you know what's the plan for the kernel hacking workshop at UoM? | 22:42 |