IRC logs for #baserock for Monday, 2014-11-17

*** JPohlmann [~jannis@m65s13.vlinux.de] has joined #baserock		00:15
*** JPohlmann [~jannis@m65s13.vlinux.de] has quit [Changing host]		00:15
*** JPohlmann [~jannis@xfce/core-developer/JPohlmann] has joined #baserock		00:15
*** zoli_ [~zoli_@linaro/zoli] has joined #baserock		03:55
*** thecorconian [~jte@75-27-44-31.lightspeed.orpkil.sbcglobal.net] has quit [Remote host closed the connection]		04:32
*** zoli_ [~zoli_@linaro/zoli] has quit [Remote host closed the connection]		04:42
*** aananth [~caananth@74.112.167.117] has joined #baserock		07:18
*** zoli_ [~zoli_@linaro/zoli] has joined #baserock		07:56
*** radiofree_ [radiofree@unaffiliated/radiofree] has quit [Read error: Connection reset by peer]		07:58
*** fay_ [~fay@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Remote host closed the connection]		07:59
*** wdutch [~william@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock		08:02
*** radiofree [radiofree@unaffiliated/radiofree] has joined #baserock		08:06
*** fay [~fay@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock		08:10
fay is now known as Guest12589		08:11
*** zoli_ [~zoli_@linaro/zoli] has quit [Remote host closed the connection]		08:29
*** rdale [~quassel@9.Red-83-45-185.dynamicIP.rima-tde.net] has joined #baserock		08:43
*** franred [~franred@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock		08:45
*** franred [~franred@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Remote host closed the connection]		08:46
*** zoli_ [~zoli_@linaro/zoli] has joined #baserock		08:55
*** franred [~franred@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock		08:58
*** wdutch [~william@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Quit]		09:16
*** mariaderidder [~maria@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock		09:17
*** wdutch [~william@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock		09:18
*** tiagogomes [~tiagogome@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock		09:23
*** jonathanmaw [~jonathanm@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock		09:26
richard_maw	straycat: If you update your system to include a newer version of yarn, then it will be quicker and require less storage, as it doesn't snapshot directories it doesn't need to	09:37
Guest12589 is now known as fay_		09:37
straycat	richard_maw, okay cool i'll do that	09:38
Mode #baserock +o pedroalvarez by ChanServ		09:40
* Kinnison congratulates the project -- my baserock-dev folder just hit 10,000 entries :-)		09:53
*** ssam2 [~ssam2@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock		09:54
Mode #baserock +v ssam2 by ChanServ		09:54
pedroalvarez	Kinnison: yay!	09:54
pedroalvarez	paulsher1ood: wrt oFono; I've just realized that 1.11 >= 1.9	09:55
pedroalvarez	:)	09:55
Kinnison	heh	09:55
Kinnison	What algorithm were you using for version comparison before?!	09:55
pedroalvarez	not sure, but executed by a human brain	09:56
Kinnison	heh	09:56
* Kinnison strongly recommends using the Debian version number comparison algorithm if you can		09:56
DavePage	pedroalvarez: Always the weak point :)	09:59
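The 1.11 versus 1.9 mix-up above is the classic result of comparing version strings as text or as decimals. A minimal sketch of a numeric comparison follows; the helper names `parse_version` and `version_at_least` are made up for illustration, and this is not the Debian algorithm Kinnison recommends (which also handles epochs, letters and tildes). It only covers purely numeric dotted versions.

```python
def parse_version(text):
    """Split a dotted version string such as '1.11' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def version_at_least(found, required):
    """Return True if 'found' is the same as or newer than 'required'."""
    # Tuples compare element by element, so (1, 11) > (1, 9).
    return parse_version(found) >= parse_version(required)


print(version_at_least("1.11", "1.9"))   # numeric comparison: True
print("1.11" >= "1.9")                   # plain string comparison: False
print(version_at_least("1.11", "1.14"))  # False
```

Comparing tuples of integers avoids the trap where "1.11" sorts before "1.9" because the character '1' is less than '9'.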
pedroalvarez	Also, I realized on friday that we need `morph edit` for now	10:06
pedroalvarez	we are using it in our licensecheck script	10:06
aananth	pedroalvarez: Good Morning! I just checked http://136.18.233.152/cgi-bin/cgit.cgi/delta/linux.git/ of my trove server, the directory is still empty. The server ran for 2 complete days. :-)	10:11
aananth	Any test to see if things go well?	10:11
pedroalvarez	hmm that's odd	10:11
ssam2	aananth: I think the problem Aananth has is that trove-setup.service failed	10:11
pedroalvarez	ssam2: but I thought that some of the lorries worked	10:11
ssam2	ah, ok, hmm	10:12
ssam2	I remember seeing a log from Aananth where trove-setup.service failed because it tried to read SSH host keys from git.baserock.org and couldn't because it lacked proxy config	10:12
ssam2	which I was hoping to make a fix for today	10:12
pedroalvarez	ah yeah, you missed all the fun last week :)	10:12
*** Krin [~mikesmith@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock		10:13
aananth	ssam2: I am sorry, the proxy issue got resolved last week. I failed to add a ',' between username and password.	10:13
ssam2	so it's a specific lorry (linux.git) that isn't working	10:13
aananth	apart from the above, I had to set up corkscrew, .gitconfig, ssh proxy etc.	10:14
ssam2	it could be that it's timing out, but really we need to see the logs for that lorry to see what's going on	10:15
aananth	ssam2: yes, today I think all linux*.git repo are empty. Ok I will send the log. Output of journalctl?	10:15
ssam2	aananth: we should be able to filter the output of journalctl a bit so we only see what's needed	10:16
ssam2	I'll try to work out the correct commandline	10:16
pedroalvarez	aananth: If you run this: `ssh -L 12765:localhost:12765 root@136.18.233.152` and you keep that ssh connection open, you will be able to open "localhost:12765/1.0/status-html" in your browser	10:16
*** petefoth_ [~petefoth@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock		10:18
*** petefoth [~petefoth@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Ping timeout: 245 seconds]		10:18
petefoth_ is now known as petefoth		10:18
pedroalvarez	aananth: once you are there we can work out how to get some logs :)	10:21
aananth	ssam2, pedroalvarez: http://fpaste.org/151383/14162197/	10:23
aananth	pedroalvarez: "localhost" did not resolve, hence i replaced it with my trove-server IP. I got lorry controller status only.	10:24
pedroalvarez	aananth: but you have the button "See separate list of all jobs that have ever been started"	10:26
pedroalvarez	from there you will access to the list of all the lorry jobs (mirroring attempts) of your trove	10:27
aananth	pedroalvarez: its a link?	10:27
pedroalvarez	aananth: it is, can't you open it?	10:28
ssam2	<http://localhost:12765/1.0/status-html> should work	10:28
ssam2	if not try <http://127.0.0.1:12765/1.0/status-html>	10:29
aananth	pedroalvarez: Yes, I am opening, I could see a long list, with last column 0, 1, 127. I am pasting it.	10:29
pedroalvarez	great,	10:29
pedroalvarez	the ones that are working have the 0	10:29
pedroalvarez	you should be worried about these with 1, 127, etc	10:29
pedroalvarez	If you click in the Job ID of a job that failed, you see a log	10:32
pedroalvarez	you will see a log*	10:33
aananth	pedroalvarez, ssam2: Ok, I am trying to paste it, but it is a huge text, hence it is not working. Ok I will look into those non zero items and report those logs.	10:34
pedroalvarez	aananth: delta/linux would be a good start	10:34
aananth	pedroalvarez, ssam2: Thank you, here it is: http://fpaste.org/151389/20801141/	10:39
*** rdale [~quassel@9.Red-83-45-185.dynamicIP.rima-tde.net] has quit [Ping timeout: 258 seconds]		10:40
pedroalvarez	everything looks ok in that log, but it stopped before finishing	10:43
radiofree	new systemd patches work a treat	10:44
ssam2	radiofree: is that a +1 ?	10:45
radiofree	yes i have +1ed in the list as well	10:45
radiofree	erm apart from one thing actually!	10:45
radiofree	let me respond on the list	10:46
ssam2	ok, sweet	10:46
pedroalvarez	I wonder how long is the default lorry timeout	10:47
aananth	Few other failures: http://fpaste.org/151391/21427141/ . Everytime linux or linux-rt ran/executed, it failed. What is the meaning of return code '-9'?	10:50
pedroalvarez	radiofree: I'm interested in knowing if it improves the network connectivity on them, and if you can unplug the ethernet cable and plug it again and get connection	10:51
pedroalvarez	aananth: I think that -9 means that the process was killed, and probably because it was taking too long	10:51
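A small sketch of how the codes seen in the job list (0, 1, 127, -9) could be read. It assumes the status page reports return codes in the convention used by Python's `subprocess` module, where a negative value -N means the child was terminated by signal N; whether Lorry Controller does exactly that is an assumption here, and `describe_return_code` is a hypothetical helper.

```python
import signal


def describe_return_code(code):
    """Give a human-readable guess at what a job's return code means."""
    if code == 0:
        return "success"
    if code < 0:
        # subprocess convention on POSIX: -N means "killed by signal N".
        try:
            name = signal.Signals(-code).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {-code} ({name})"
    if code == 127:
        # Shell convention: the command could not be found.
        return "command not found (shell convention)"
    return f"failed with exit status {code}"


for code in (0, 1, 127, -9):
    print(code, "->", describe_return_code(code))
```

On Linux, signal 9 is SIGKILL, which fits the guess that the job was killed for exceeding a timeout.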
radiofree	pedroalvarez: yes that was the first thing i tried and yes that works!	10:52
paulsher1ood	pedroalvarez: i can't reproduce it here, because for some reason it's not public but in the doc I saw it appears that Bluetooth Handsfree requires 1.14 or greater.... is that a typo, do you think?	10:52
pedroalvarez	aananth: to solve this you can add "lorry-timeout" in your lorry-controller.conf file, like we do in our trove: http://git.baserock.org/cgi-bin/cgit.cgi/baserock/local-config/lorries.git/tree/lorry-controller.conf	10:52
aananth	pedroalvarez: The system is too slow and only 117M free (from free command). Is it worth restarting?	10:52
richard_maw	I'm merging "Use same kernel for jetson genivi and devel systems", just in case anyone else is attempting it and I would tread on toes	10:53
pedroalvarez	paulsher1ood: ah! I thought it was 1.9, I'll look again	10:53
pedroalvarez	radiofree: amazing	10:53
radiofree	it is yes!	10:54
*** petefoth [~petefoth@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Remote host closed the connection]		11:01
*** petefoth [~petefoth@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock		11:02
aananth	pedroalvarez, ssam2: I rebooted after adding timeout on both lorries. But up-on reboot, the system reported disk error and asking me to check (fsck) manually... May be this could be the reason?	11:12
JPohlmann	Hi folks! Did anyone ping me in here during the last few days? (The IRC tab was highlighted)	11:13
ssam2	aananth: I'd expect to see an error in the log for delta/linux if disk corruption caused the delta/linux lorry to fail	11:14
paulsher1ood	JPohlmann: i think you were mentioned in passing. we have logs, now ... http://testirclogs.baserock.org/	11:14
ssam2	aananth: definitely run fsck anyway though!	11:14
ssam2	jpohlmann: hi! I don't remember needing to summon you for anything :)	11:14
JPohlmann	paulsher1ood: Ok :)	11:15
JPohlmann	Yup, found it. Something about a script I wrote that extracts the Bison version from NEWS.	11:19
pedroalvarez	oh yeah :)	11:20
pedroalvarez	I had to use again that patch	11:20
*** locallycompact [~lc@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock		11:23
*** rdale [~quassel@9.Red-83-45-185.dynamicIP.rima-tde.net] has joined #baserock		11:27
aananth	ssam2, pedroalvarez: fscked, added timeout and restarted. Another interesting info: http://fpaste.org/151398/24191141/ clearly say timeout is the reason.	11:38
aananth	Any idea, why the run queue in lorry controller status is still 893? Is that normal?	11:39
pedroalvarez	aananth: yes, that is the normal status. They will be there running every 2 hours (you can configure it) to fetch new changes	11:40
aananth	pedroalvarez: Any reason why the disk corruption happened? Slow CPU / network speed?	11:40
aananth	... second delta/linux started before the first one got completed?	11:41
pedroalvarez	aananth: no Idea about why the corruption happened	11:41
pedroalvarez	aananth: there is only one delta/linux	11:42
pedroalvarez	aananth: why do you think there are 2?	11:44
pedroalvarez	"Richard Moore pointed out that the URLs aren't terminated with quotes"	11:49
pedroalvarez	was this Richard Maw?	11:49
pedroalvarez	:)	11:50
richard_maw	yes	11:50
rdale	oops sorry	11:50
richard_maw	no worries	11:50
aananth	pedroalvarez: I assume that the delta/linux will be queued every 2 or 6 hrs and I meant the 2nd instance of the queue started before the first job was completed... etc :-)	11:53
pedroalvarez	ah! no, that can't happen. delta/linux will appear in the queue again once it finishes, but not before	11:54
aananth	pedroalvarez: whether such scenario can happen, in case of slow network & cpu?	11:54
aananth	Ok	11:55
pedroalvarez	aananth: btw, I think you don't need to reboot the machine everytime you want to change the configuration	11:57
pedroalvarez	it was needed when trove-setup was failing, but not anymore :)	11:57
aananth	Ok	11:57
aananth	pedroalvarez: So, if I change the timeout value and push the config, will the system take the new timeout value immediately or after I perform "systemctl start lorry-controller-readconf.service"?	11:59
pedroalvarez	you can run that, and you will force the lorry-controller to read the configuration	12:00
pedroalvarez	but that is going to happen anyway	12:00
aananth	Ok, so the lorry check its configs before it starts loading/unloading work! Very intelligent lorry :-)	12:01
pedroalvarez	:D	12:01
jjardon	pedroalvarez: hi! seems /usr/share/zoneinfo is not there anymore, I think it's related to the glibc transition	12:03
pedroalvarez	jjardon: I've no idea about what's that and why is that needed	12:03
jjardon	do you remember to touch anything related with the zoneinfo or the tzdata?	12:03
pedroalvarez	jjardon: nope	12:03
jjardon	ok, I will take a look	12:04
pedroalvarez	jjardon: can this be related to "cd o && make localtime=UTC" ?	12:05
pedroalvarez	this is how we build glibc	12:05
jjardon	pedroalvarez: see 6.9.2 in http://www.linuxfromscratch.org/lfs/view/development/chapter06/glibc.html	12:05
jjardon	this used to be there with eglibc, so maybe now it's differently configured, not sure yet	12:06
pedroalvarez	now I'm tempted to install the locales also in glibc	12:07
jjardon	ah, seems the zoneinfo is now provided by tzdata	12:08
jjardon	pedroalvarez: yes please! so I can see my name correctly ;)	12:08
paulsher1ood	jjardon: i vaguely remember tripping over this ages ago	12:09
pedroalvarez	you can install them manually in your system	12:09
jjardon	paulsher1ood: the zoneinfo problem?	12:09
paulsher1ood	yup	12:09
* paulsher1ood can't remember why		12:10
jjardon	pedroalvarez: I think it would be better if baserock supports UTF8 by default ;)	12:10
paulsher1ood	+1	12:11
pedroalvarez	jjardon: for locales: `mkdir -p /usr/lib/locale/ && localedef -v -c -i en_GB -f UTF-8 en_GB.UTF-8`	12:11
ssam2	pedroalvarez: how did we fix Aananth's issue with trove-setup.service last week?	12:14
ssam2	would http://git.baserock.org/cgi-bin/cgit.cgi/baserock/baserock/trove-setup.git/commit/?h=sam/fix-init-behind-proxy&id=bd80ed9a1690accd8d5dcb964ce387a27f6b014b be useful ?	12:14
ssam2	(i've not tested that patch yet and I won't bother if we don't need it :)	12:14
jjardon	pedroalvarez: is there a place where I can file a bug or not yet?	12:16
pedroalvarez	jjardon: on the mail list :/	12:16
pedroalvarez	ssam2: nice!	12:19
pedroalvarez	I really like it	12:22
pedroalvarez	jjardon: so is this a bug?	12:25
jjardon	pedroalvarez: Its a regression, yes	12:25
* pedroalvarez awaits the BUG email :)		12:26
*** dabukalam [~quassel@ec2-54-69-244-150.us-west-2.compute.amazonaws.com] has quit [Quit: dabukalam]		12:32
*** dabukalam [~quassel@ec2-54-69-244-150.us-west-2.compute.amazonaws.com] has joined #baserock		12:33
*** petefoth_ [~petefoth@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock		12:37
*** petefoth [~petefoth@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Ping timeout: 265 seconds]		12:37
petefoth_ is now known as petefoth		12:37
jjardon	pedroalvarez: sure ;) probably a first step is to lorry ftp://ftp.iana.org/tz/tzdata-latest.tar.gz , ok to push this? http://paste.baserock.org/nucaqasewo.sql	12:39
pedroalvarez	jjardon: I'd prefer ftp://ftp.iana.org/tz/releases/tzcode2014j.tar.gz	12:43
paulsher1ood	isn't there a git repo?	12:43
paulsher1ood	https://github.com/tzinfo/tzinfo	12:44
paulsher1ood	https://github.com/tzinfo/tzinfo-data	12:44
paulsher1ood	pedroalvarez, jjardon ^^	12:44
pedroalvarez	isn't it that for ruby?	12:45
paulsher1ood	yes. maybe i'm completely off-topic here	12:46
pedroalvarez	paulsher1ood: but you raised a good point :)	12:46
jjardon	paulsher1ood: I think that has a copy of tzdata inside	12:46
straycat	good news, looks like upstream setuptools might accept our patch	12:47
pedroalvarez	jjardon: isn't this also tzdata? http://git.baserock.org/cgi-bin/cgit.cgi/delta/glibc.git/tree/timezone	12:47
jjardon	pedroalvarez: why wouldn't you want to lorry the latest tzdata? does lorry not support changes in the tarball automatically and make a new commit?	12:48
pedroalvarez	straycat: that's good!	12:48
pedroalvarez	jjardon: I really don't know, and also I don't know from where is going to get the version number to create a tag	12:48
jmacs	It would be useful to have a description of what "morph branch" and "morph edit" do on http://wiki.baserock.org/contributing/	12:48
ssam2	straycat: cool	12:49
paulsher1ood	jjardon: i'm just generally against tarballs if there's a vcs source	12:50
straycat	I say might because they haven't merged it yet, they seem to want to update some of the docs associated with the change	12:50
jjardon	paulsher1ood: sure	12:50
robtaylor	jjardon: i don't think tarball import can import a series of tarballs and recreate history	12:50
robtaylor	jjardon: would be nice if it did =)	12:51
jjardon	paulsher1ood: https://people.gnome.org/~walters/Tarballs.jpg ;)	12:51
jjardon	pedroalvarez: mmm, looks like it, yes; but at least fedora and arch depends on tzdata to build glibc	12:52
paulsher1ood	petefoth: not sure where this fits in with the current site-structure http://wiki.baserock.org/traceability-and-reproducibility/	12:52
straycat	http://sprunge.us/BTFE so i could do with this now	12:53
paulsher1ood	straycat: +1	12:54
petefoth	paulsher1ood: neither am I yet. Let me have a think. It maybe that it should be linked from a top-level story in StoryBoard or in a hypothetical ‘Roadmap’ document.	12:55
pedroalvarez	I still don't like the -bitbucket suffix	12:55
jjardon	robtaylor: btw, about the status of GNOME: it's currently broken because of this tzdata problem: it used to build up to gnome-shell but failed to run (after fixing some problems with the glib schemas not being compiled and dbus services not being installed in the correct location)	12:55
franred	straycat, +1	12:56
pedroalvarez	jjardon: is this bug also going to be a problem in our devel systems?	12:56
pedroalvarez	or just in your gnome system>	12:56
pedroalvarez	?	12:56
paulsher1ood	petefoth: -1 for pages which are in your head but don't exist in real life :-)	12:57
petefoth	paulsher1ood: creating it is on my to-do list	12:57
petefoth	I’d like to do it in StoryBoard but….	12:57
straycat	merged thanks	12:58
*** jonathanmaw [~jonathanm@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Ping timeout: 240 seconds]		12:59
pedroalvarez	I still think that there wasn't a strong reason to use the bitbucket suffix, and I hope we don't start doing this for new lorries	13:01
petefoth	paulsher1ood: add a readable version of the Traceability document in the misc-docs directory	13:03
*** jonathanmaw [~jonathanm@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock		13:20
persia__ is now known as persia		13:27
*** persia [quassel@2400:8900::f03c:91ff:feae:3452] has quit [Changing host]		13:27
*** persia [quassel@ubuntu/member/persia] has joined #baserock		13:27
*** Guest77313 [quassel@2400:8900::f03c:91ff:feae:3452] has quit [Changing host]		13:27
*** Guest77313 [quassel@ubuntu/member/persia] has joined #baserock		13:27
Guest77313 is now known as persia_		13:27
straycat	not sure what's going on with gbo atm but there's a lot of cvsimport's running that don't look like they're associated with any currently running lorry job	13:34
Kinnison	Possibly killed jobs, cvs might have lost its controlling process group :-(	13:35
paulsher1ood	petefoth: are you saying that page is unreadable?	14:07
petefoth	paulsher1ood: no, but it’s a Google Doc not accessible to non-Codethinkers. So it needs to be moved out of Google Docs (or maybe shared publicly if you want to do that), and put in a format that users of w.b.o can read	14:08
straycat	radiofree, didn't you say you were going to fix vim? :p	14:10
* richard_maw wants to move all configuration file generation out of the install commands, and into deployment time configuration extensions, so that you can change how future vim instances will be configured without affecting the cache key and requiring a rebuild		14:12
paulsher1ood	petefoth: http://wiki.baserock.org/traceability-and-reproducibility is not a googledoc	14:13
jjardon	straycat: what problem do you have?	14:13
paulsher1ood	nor was it last time i sent the link :-)	14:13
straycat	jjardon, copy pasting	14:13
straycat	can we move webtools into the devel system? or else move pip further down?	14:14
straycat	jjardon, radiofree mentioned the same problem a few weeks ago and i thought he said he was going to fix it :)	14:14
pedroalvarez	straycat: so you need pip for the import tool	14:14
straycat	yeah	14:15
pedroalvarez	makes sense to me	14:15
pedroalvarez	is there anything huge in the webtools?	14:15
jjardon	straycat: you have to comment the "set mouse=a" line in /etc/vimrc but a patch to change this by default would be indeed helpful	14:16
paulsher1ood	i don't think so, pedroalvarez	14:16
paulsher1ood	but maybe it's time to tidy this anyway	14:16
pedroalvarez	in that case, you can go ahead with that, but bear in mind that GNU tar doesn't build	14:17
straycat	oh? why not?	14:17
petefoth	paulsher1ood: sorry! Too much flitting between jobs today :(	14:18
pedroalvarez	straycat: because of the glibc change. I sent a patch last week, not sure if it can be accepted	14:18
straycat	jjardon, that's awesome thanks	14:18
pedroalvarez	I'll take a look at the discussion	14:18
*** aananth [~caananth@74.112.167.117] has quit [Quit: Leaving]		14:23
*** rdale [~quassel@9.Red-83-45-185.dynamicIP.rima-tde.net] has quit [Ping timeout: 264 seconds]		15:02
*** genii [~quassel@ubuntu/member/genii] has joined #baserock		15:03
pedroalvarez	franred, ssam2, how are you today? are you still ok of having a quick meeting here to talk about the present of our infra? Can we do it at 16:00 utc?	15:03
pedroalvarez	s/of having/with having/	15:04
jmacs	Is there anything better for making Baserock videos with than kdenlive?	15:04
pedroalvarez	kdenlive? is that a tool for recording>	15:05
pedroalvarez	?	15:05
DavePage	kdenlive is a tool for video editing	15:05
DavePage	Actually it's a tool for eating all your RAM and crashing but hey	15:05
jmacs	Yes, it's a video editor that crashes all the time	15:06
jmacs	Hence me asking if there's anything better	15:06
DavePage	jmacs: Not that I've found. pitivi eats all your RAM and hangs rather than crashing, but that's not a great improvement.	15:07
franred	pedroalvarez, I can do at 16:00 utc	15:08
ssam2	pedroalvarez: sure	15:13
*** rdale [~quassel@9.Red-83-45-185.dynamicIP.rima-tde.net] has joined #baserock		15:17
*** rdale [~quassel@9.Red-83-45-185.dynamicIP.rima-tde.net] has quit [Ping timeout: 250 seconds]		15:36
*** rdale [~quassel@9.Red-83-45-185.dynamicIP.rima-tde.net] has joined #baserock		15:46
pedroalvarez	Reminder: Baserock Infrastructure meeting in 12 minutes	15:48
*** petefoth [~petefoth@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: petefoth]		15:48
paulsher1ood	pedroalvarez: what is that, where?	15:52
pedroalvarez	here	15:52
paulsher1ood	ooh, cool :)	15:52
pedroalvarez	you are welcome to join if you have points to raise	15:53
pedroalvarez	everybody is	15:53
pedroalvarez	but the meeting is to talk about the present, not the future	15:53
paulsher1ood	SotK: did you actually put code or an instance of your stuff somewhere?	15:53
paulsher1ood	pedroalvarez: is there an agenda? sorry - i've been in my own little world	15:54
* paulsher1ood guesses this may all be on a trello board somewhere :)		15:54
pedroalvarez	actually there is not an agenda, and is not in the trello board	15:55
pedroalvarez	and I guess I should do it the next time	15:55
SotK	paulsher1ood: I put the code for my prototype stuff here: http://git.baserock.org/cgi-bin/cgit.cgi/delta/openstack/turbo-hipster.git/log/?h=baserock/adamcoldrick/mason-plugin-prototype	15:55
pedroalvarez	is the first time I want to do a meeting on IRC on an opensource project :)	15:56
SotK	paulsher1ood: there isn't an instance of it running somewhere accessible yet though	15:56
pedroalvarez	Infrastructure meeting in 1 minute, please don't interrupt the meeting if is not related.	15:59
pedroalvarez	all right, Baserock Infrastructure meeting starts	16:01
pedroalvarez	franred, ssam2, are you around?	16:01
ssam2	hi	16:01
franred	hi	16:01
pedroalvarez	anybody else wants to join?	16:01
paulsher1ood	hi	16:01
straycat	eh? wha?	16:02
pedroalvarez	Ok, I'd like to discuss the following points:	16:02
* straycat hides		16:02
pedroalvarez	* [1] Migration of our Havana instances to Icehouse.	16:02
pedroalvarez	* [2] baserock-clone and cache.baserock.org.	16:02
pedroalvarez	* [3] git.baserock.org. Datacentred Migration.	16:02
pedroalvarez	* [4] Prioritizing the work and check who has time to start with it.	16:02
pedroalvarez	Does any of you have more points to discuss?	16:02
ssam2	a plan for setting up future infrastructure	16:03
ssam2	basically to see if you guys like the approach I've been taking in https://github.com/ssssam/test-baserock-infrastructure and want to adopt it or not	16:03
pedroalvarez	cool, that will be [5]	16:03
pedroalvarez	Migration of our Havana instances to Icehouse.	16:04
ssam2	is there an easy way to do that?	16:04
pedroalvarez	I've been talking with the datacentred guys and we should start moving our infra to the new tenant that we have	16:04
ssam2	actually, I know using 'nova' you can download an image	16:05
pedroalvarez	the way to do it I believe is: create a snapshot, download it, and upload it to the other tenant	16:05
ssam2	right	16:05
ssam2	one issue is that all the public IPs we've been using need to change	16:05
ssam2	and we have less floating IPs available in the new tenant	16:05
ssam2	I looked on Friday at how to set up HAProxy, and I think it'd make sense to use that	16:06
paulsher1ood	can't the download/upload be somewhere at dc, to save time?	16:06
paulsher1ood	+1 for haproxy	16:06
pedroalvarez	paulsher1ood: yes, we should do the migration from dc to dc	16:06
ssam2	paulsher1ood: good idea, better than downloading them to the office!	16:06
pedroalvarez	ssam2: do you think that 10 Ips is not enough for now?	16:06
ssam2	pedroalvarez: we're using 12 right now :)	16:07
ssam2	I think setting up HAProxy will be simple enough, and we can then point all our subdomains at one IP and just update the HAProxy config when we want to add/remove/change pieces of infrastructure	16:07
ssam2	it can forward requests to the correct instance based on subdomain or even matching bits of the URL	16:07
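To make the routing idea concrete, here is a toy model in plain Python of what "forward requests based on subdomain or bits of the URL" means. It is not HAProxy configuration; the hostnames are ones mentioned in this channel, while the private backend addresses, the `ROUTES` table and `pick_backend` are invented for illustration.

```python
# Hypothetical routing table: (host, path prefix, backend address).
# Rules are checked in order and the first match wins.
ROUTES = [
    ("paste.baserock.org", "/", "10.0.0.11:80"),
    ("testirclogs.baserock.org", "/", "10.0.0.12:80"),
    ("git.baserock.org", "/cgi-bin/", "10.0.0.13:8080"),
    ("git.baserock.org", "/", "10.0.0.13:80"),
]


def pick_backend(host_header, path="/"):
    """Return the backend for a request, or None if no rule matches."""
    host = host_header.split(":")[0].lower()  # drop any ":port" suffix
    for rule_host, prefix, backend in ROUTES:
        if host == rule_host and path.startswith(prefix):
            return backend
    return None


print(pick_backend("paste.baserock.org"))
print(pick_backend("git.baserock.org:80", "/cgi-bin/cgit.cgi/"))
print(pick_backend("unknown.example.org"))
```

With a table like this, every subdomain can resolve to a single public IP, and adding or moving a service only means editing the table rather than allocating another floating IP.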
pedroalvarez	sounds like the right thing to do	16:08
*** zoli_ [~zoli_@linaro/zoli] has quit [Remote host closed the connection]		16:08
paulsher1ood	haproxy on a baserock system should be easy enough... has anyone done that?	16:08
ssam2	I've been avoiding using Baserock for the infrastructure so far, because all this infrastructure work is unfamiliar territory for me anyway	16:08
paulsher1ood	nod	16:08
ssam2	I'd like to get it working first, then start moving it to Baserock	16:09
pedroalvarez	I agree	16:09
ssam2	I can finish setting up HAproxy on Friday, or maybe earlier	16:09
radiofree	straycat: yes i did, for now set mouse-=a	16:09
franred	can not we do some networking trick in neutron to use NAT?	16:10
pedroalvarez	franred: not sure about that, want to research?	16:10
franred	I can have a look yes	16:10
ssam2	so the main things to migrate are:	16:11
ssam2	paste. and testirclogs.	16:11
radiofree	what would be the correct patch for that vim thing?	16:11
ssam2	and the mason-x86-64 and its trove	16:11
pedroalvarez	radiofree: please, can you wait until the meeting ends?	16:11
pedroalvarez	ssam2: yes	16:11
radiofree	oh sorry i had no idea!	16:11
pedroalvarez	radiofree: np	16:11
pedroalvarez	do we agree that we can start moving these things asap?	16:11
ssam2	could we just redeploy the Mason and the Trove ? should be easy enough	16:11
ssam2	and just move the Trove's volume to the new instance?	16:12
pedroalvarez	redeploying them makes sense	16:12
ssam2	I think we agree that we can start moving them ASAP, and I can look at doing some of the work on Friday	16:12
pedroalvarez	about the trove, it was my second bullet point	16:12
ssam2	I'll leave paste. and testirclogs. to you as you deployed them originally	16:12
pedroalvarez	good	16:12
franred	ssam2, we will have to copy the volume so we can not avoid to create an snapshot of it?	16:12
ssam2	franred: i'm not sure if a snapshot of an instance includes any volumes, to be honest	16:12
ssam2	i'd have thought that the volume needs to be migrated separately anyway? or am I wrong?	16:13
pedroalvarez	I think you are right	16:13
ssam2	i'm not sure if we can redeploy the Trove and reuse the volume. I think we can if we have the original cluster morph for the trove kept somewhere	16:13
franred	ssam2, yes, you are right, but could the volumes be copied or we need to create an snapshot of them?	16:13
pedroalvarez	I think we should move to my second point now	16:14
pedroalvarez	[2] baserock-clone and cache.baserock.org.	16:14
pedroalvarez	should it be the same thing? Do we still want baserock-clone?	16:14
ssam2	baserock-clone is useful I think, but not valuable	16:14
ssam2	i.e. it doesn't matter if there's downtime	16:14
pedroalvarez	It is good to use the jetsons we have in DC	16:14
ssam2	oh, that's true	16:15
franred	I use baserock-clone for my testing and I still cloning repos from there	16:15
pedroalvarez	I've been thinking about that, because I want to create a mason instance to test things in armv7lhf, and I found that this mason will be different than the others	16:16
franred	I think we should use as a test baserock-lorry and use it in the new instance too, but not sure if someone is using it at the moment	16:16
pedroalvarez	hm.. I'm still unsure, a trove just to test lorries sounds like too much	16:16
pedroalvarez	but, I think we need it anyway to use the jetsons	16:17
paulsher1ood	i thought we'd end up with lorry in devel so users could test themselves?	16:17
franred	well, problem that we have is, that if we want a clean g.b.o or a g.b.o with mess	16:17
ssam2	paulsher1ood: i hope that happens, yeah	16:17
paulsher1ood	clean gbo	16:17
pedroalvarez	franred: but not having baserock-clone doesn't mean that we will have a g.b.o with mess	16:18
franred	but we don't have the lorry in devel, so for the moment having a test-trove is the clean solution, I thought	16:18
paulsher1ood	pedroalvarez: we should look closer at what SotK is proposing for mason i think	16:18
pedroalvarez	paulsher1ood: true, but I want discuss the present in this meeting, not the future	16:18
pedroalvarez	once we have sorted out our current situation we can start looking forward	16:19
ssam2	we need baserock-clone anyway for Jetsons, so I think we are decided that we will keep it and migrate it as soon as possible, which may lead to some downtime for it	16:19
paulsher1ood	pedroalvarez: ok	16:19
franred	ssam2, ok	16:20
pedroalvarez	then, I think that cache.baserock.org should be baserock-clone	16:20
paulsher1ood	+1	16:20
ssam2	that works for now	16:20
pedroalvarez	cool	16:20
franred	for now, ok	16:20
paulsher1ood	can we rename baserock-clone to something better?	16:20
ssam2	in fact, that works forever as long as all our masons upload artifacts to baserock-clone	16:20
pedroalvarez	paulsher1ood: yes, any suggestion?	16:20
paulsher1ood	(mirror?)	16:20
ssam2	point 3 is about migrating git.baserock.org to datacentred, so perhaps in the end we rename it to 'git.baserock.org' :)	16:21
pedroalvarez	mirror as trove-id?	16:21
paulsher1ood	:)	16:21
paulsher1ood	gbo-mirror as trove id perhaps	16:21
paulsher1ood	can be discussed out of meeting	16:21
pedroalvarez	works for me	16:21
pedroalvarez	fair	16:21
pedroalvarez	ssam2: is a possibility	16:22
richard_maw	unless baserock-clone also has copies of the lorry state, it's not valid to just rename it to git.baserock.org	16:22
pedroalvarez	richard_maw: yeah sure, I know that	16:22
ssam2	richard_maw: that's true. I just mean that eventually we might not need baserock-clone to exist	16:22
ssam2	I guess baserock-clone has multiple roles and it's hard to pick a name that reflects all of them	16:23
richard_maw	fair enough, I just dipped into the conversation and was concerned that things would just be renamed	16:23
ssam2	the instance is currently named 'mason-artifact-cache-server-plus-git.baserock.org-mirror' :)	16:23
pedroalvarez	anything else to raise regarding this point?	16:23
ssam2	how about baserock-2 ?	16:23
ssam2	i'll stop, we can discuss that outside this meeting	16:24
pedroalvarez	[3] git.baserock.org. Datacentred Migration.	16:24
pedroalvarez	I just wanted to raise this point, we should move it to DC	16:24
ssam2	this reminds me of a 6th point we should discuss: backups	16:24
pedroalvarez	but I think we should move other things first so we can test it	16:24
pedroalvarez	ssam2: point added	16:25
ssam2	we need a lot more capacity for volumes before we can migrate g.b.o, too	16:25
ssam2	it won't fit in 200GB	16:25
paulsher1ood	i can ask	16:25
pedroalvarez	ssam2: it will, but we need also space for cache.b.o	16:26
DavePage	Is that because g.b.o serves many functions?	16:26
DavePage	It might be worth trying to split that out as part of the migration.	16:26
pedroalvarez	yeah, splitting the cache is part of the migration	16:26
ssam2	just VCS imports and hosting is still going to take up a lot of space	16:26
ssam2	it's currently 115GB of artifacts + Gits	16:26
ssam2	but we will keep adding more stuff, so we'll hit 200GB soon enough	16:27
pedroalvarez	paulsher1ood: I'll appreciate that	16:27
pedroalvarez	ok, so this migration is going to be more complex and it has to wait until we have resources and we have tested DC	16:27
pedroalvarez	and we have a plan for backups	16:28
ssam2	also, g.b.o is working fine, unlike e.g. Storyboard :)	16:28
pedroalvarez	true, but I know that DavePage is not confident about its security situation	16:28
DavePage	Well, to be specific I'm not confident about the host it's running on either :)	16:29
ssam2	why? is that not more to do with the lack of a formal security process for the OS it runs, than which VM hosting service it's hosted on?	16:29
DavePage	For starters I could do with rebooting the VM host for a kernel security update. For another thing the host is running kvm/qemu with no security support.	16:30
franred	what is the difference between the host it is running on now and the one it will be running on in DC?	16:30
* paulsher1ood wonders about the expected duration of this meeting		16:30
ssam2	ok, so it would be good to migrate anyway	16:30
pedroalvarez	paulsher1ood: I expect we can finish it in 10 minutes	16:30
DavePage	franred: One is my problem, the other is not ;)	16:31
pedroalvarez	should we move to [4] Prioritizing the work and check who has time to start with it.?	16:31
ssam2	yes	16:31
ssam2	we need a todo list for this, I guess	16:32
ssam2	should we use the existing Trello for now ? or a wiki page with a list of tasks ?	16:32
paulsher1ood	+1 for wiki :)	16:32
pedroalvarez	I was going to say trello for simplicity, but ok wiki	16:33
pedroalvarez	things to do:	16:33
paulsher1ood	i'll drop my +1 if others prefer	16:33
pedroalvarez	* Migrate irclogs	16:33
pedroalvarez	* Migrate paste.baserock	16:33
pedroalvarez	* Migrate mason and trove	16:33
ssam2	I guess mason and trove can be done independently, which makes it slightly less daunting	16:34
ssam2	just need to update mason with the new IP of the trove	16:34
pedroalvarez	true	16:34
pedroalvarez	I guess I'm missing things	16:35
ssam2	i still plan to set up HAProxy, an OpenID provider, and Storyboard	16:35
ssam2	in that order	16:35
pedroalvarez	right, I can do the paste.baserock, and the mason migration tomorrow	16:35
pedroalvarez	and irc logs I guess	16:35
pedroalvarez	and with this we can move to [5] plan for setting up future infrastructure	16:36
franred	pedroalvarez, I can give you a hand	16:36
pedroalvarez	franred: thanks	16:36
ssam2	I can look at the Trove then	16:36
ssam2	will be on Friday	16:36
pedroalvarez	ssam2: cool	16:36
pedroalvarez	[5] I think we should follow what ssam2 is doing to setup the infra	16:37
pedroalvarez	https://github.com/ssssam/test-baserock-infrastructure	16:37
pedroalvarez	I'll bear this in mind when doing the migration	16:37
ssam2	my idea is that Packer maps reasonably closely to 'morph deploy', so there should be fairly clear migration paths when moving stuff to Baserock	16:38
pedroalvarez	makes sense	16:38
ssam2	we should decide where that repo lives permanently, then. On g.b.o makes sense except then everyone will be mirroring it	16:39
ssam2	but I suppose it's only ever going to be small	16:39
pedroalvarez	I think that for now that location is ok	16:39
ssam2	ok	16:39
pedroalvarez	should we move then?	16:39
ssam2	move on to [6]? ok	16:40
pedroalvarez	backups in DC	16:40
pedroalvarez	I don't know anything about the possibilities yet, but yes, we should ask, get information about our current backups plan and decide what we are going to use in DC	16:40
pedroalvarez	i volunteer to do that	16:40
ssam2	I was thinking for future infrastructure we should set up one database server shared by all the infrastructure	16:41
ssam2	Storyboard seems to need MySQL so I guess it'll have to be a MySQL server	16:41
ssam2	then Gerrit, Storyboard and whatever else can use that and we only need to back that up	16:41
ssam2	everything else can be redeployed	16:41
* pedroalvarez nods		16:41
ssam2	we should also find out what DC's backup policy is	16:41
paulsher1ood	i wonder if we've properly established why SB needs mysql. they used to support pg - maybe we could re-animate that	16:42
ssam2	hopefully they can take care of physical backups, and we just need to worry about the logical (database) backup	16:42
ssam2	paulsher1ood: I've not investigated why, I'll ask them	16:42
franred	ssam2, sounds like a good plan	16:42
pedroalvarez	ok, anything else?	16:42
pedroalvarez	I think this is more than enough to start :)	16:43
ssam2	thanks for running the meeting Pedro	16:43
pedroalvarez	I declare this meeting finished	16:43
straycat	irc meetings are cool	16:43
* paulsher1ood notes that pg works on baserock out of the box...mysql is more work		16:43
franred	thanks Pedro :)	16:43
straycat	radiofree, not sure, can't it be a patch to vim?	16:44
ssam2	straycat: my poor eyes beg to differ	16:44
straycat	pedroalvarez, so what's the deal with tar?	16:46
pedroalvarez	straycat: I believe it can be merged :)	16:47
straycat	"it" ?	16:48
ssam2	paulsherwood: sounds non-trivial to use PostgreSQL for Storyboard --all the migrations are MySQL-specific	16:48
ssam2	there are about 30 migrations and we'd need to fix and maintain them and any future ones	16:48
wdutch	:w	16:49
straycat	No file name	16:50
*** CTtpollard [~tom@82-70-136-246.dsl.in-addr.zen.co.uk] has joined #baserock		16:50
paulsher1ood	urgh :/	16:50
ssam2	all of https://github.com/openstack-infra/storyboard/tree/master/storyboard/db/migration/alembic_migrations/versions	16:51
straycat	f	16:51
straycat	heh >.>	16:51
ssam2	there doesn't seem to be much MySQL specific stuff in there, actually	16:51
pedroalvarez	straycat: "it" = the patch I sent to fix tar	16:52
rdale	having something like MYSQL_ENGINE = 'InnoDB' in a database migration seems a bit broken to me	16:54
paulsher1ood	:-)	16:57
straycat	pedroalvarez, is this with upstream?	16:59
pedroalvarez	straycat: no, I just had to upgrade it to a newer version glibc compatible	17:00
pedroalvarez	I'll merge it soon	17:00
straycat	why can't we merge it now?	17:01
pedroalvarez	We can, I'm just in the middle of something. If you want to merge it, i'll appreciate it :)	17:01
radiofree	straycat: i was thinking it would be easier to just create a /root/.vimrc file with "set mouse-=a" and install that in the chunk	17:04
radiofree	rather than having to modify the vim source code	17:04
radiofree	s/source code/repo	17:04
straycat	pedroalvarez, it has two +1s i'm fine merging it if all we're doing is effectively disabling -werror	17:04
richard_maw	radiofree: I don't think vim as root allows .vimrc	17:04
richard_maw	security reasons	17:04
pedroalvarez	straycat: please :)	17:05
richard_maw	though I may be mixing that up with the `vim: foo` lines	17:05
ssam2	richard_maw: I use a .vimrc in Baserock all the time, so it must work as root	17:05
radiofree	richard_maw: works here	17:06
paulsher1ood	SotK: is there some reason your patches for mason are authored by 'Mason Test Runner'?	17:06
pedroalvarez	straycat: and also you have my +1 to move webtools to devel, although I prefer if you send a patch to see if anybody disagrees	17:06
straycat	have a system integration thing that modified the /etc/vimrc ?	17:06
straycat	*s	17:06
pedroalvarez	s/move /include in/g	17:07
* richard_maw has come to the conclusion that if we can't do atomic runtime fs updates, then we can't do runtime updates at all, as the real value in package-based distributions is that it encodes the logic to safely remove a bunch of files from the filesystem. delta-based application can result in the filesystem being in states not viable for running applications		17:07
*** wdutch [~william@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Quit]		17:07
richard_maw	that's the real value of packages	17:08
richard_maw	but if you can do an atomic fs update then you don't need them	17:08
robtaylor	projectatomic	17:08
robtaylor	but I still want to do apps with baserock, but that's something rather different	17:09
ssam2	robtaylor: does OSTree pivot OS version without rebooting?	17:09
richard_maw	according to the docs I could find it runs `systemctl reboot`	17:10
ssam2	I had a feeling it requires a reboot to upgrade, but I might be wrong	17:10
robtaylor	ssam2: yep, in project atomic you updates	17:10
robtaylor	umm bad paste	17:10
robtaylor	you systemctl reboot	17:11
ssam2	right. So it provides a similar thing to what we currently have with Btrfs subvolumes (except with the implementation in userspace instead of in the kernel)	17:11
robtaylor	yep	17:11
* richard_maw needs runtime atomic updates and has been informed that containerising the applications is not an option		17:12
richard_maw	it's doable with clever use of pivot_root	17:12
robtaylor	richard_maw: is it the applications that need updates or the whole system?	17:12
richard_maw	both	17:13
robtaylor	whole system updates without restart? ouch	17:13
DavePage	kexec? :)	17:14
richard_maw	if init or the kernel changes it's permissible to kexec, but for everything else I need it to stay up	17:14
robtaylor	yep, pivot root is your approach there, and i guess it's up to the sysadmin to figure out when they need to reboot	17:14
robtaylor	all very old school	17:15
paulsher1ood	really? that sounds hard :-) would super-fast boot not be an option?	17:15
robtaylor	paulsher1ood: that would be the modern container-oriented way, indeed	17:15
richard_maw	paulsher1ood: not possible with the class of hardware involved	17:15
*** franred [~franred@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Leaving]		17:16
richard_maw	kexec is the fastest reboot semantics possible, and I hope the hardware supports kexecing	17:16
robtaylor	mm. sounds big irony ;)	17:16
richard_maw	I couldn't possibly comment.	17:16
robtaylor	richard_maw: hah, just had an evil thought	17:17
richard_maw	robtaylor: do tell >:-D	17:17
robtaylor	richard_maw: you could always boot systems in a pid namespace. Then you can boot up a new system, and use cgroups to close down the old system when it's stopped doing things	17:18
robtaylor	i guess you could end up with some weird behaviour for apps that think they're the only people talking to their store	17:19
robtaylor	hell, just always boot in a container =)	17:19
* paulsher1ood likes it		17:19
richard_maw	robtaylor: I've been informed that containers impose too much overhead	17:19
robtaylor	probably could just add support to systemd to do this	17:19
paulsher1ood	even better :-)	17:20
robtaylor	richard_maw: um, they don't know what they're talking about then	17:20
paulsher1ood	robtaylor: careful... :)	17:20
robtaylor	richard_maw: there's only overhead when you use VETH devices to bridge network namespaces	17:20
robtaylor	hmm, which you would probably have to do in this model	17:21
richard_maw	robtaylor: that was my first thought, but it means that there's more copies of binaries and libraries around, which means there's more pressure on the caches, so things keep dropping out of it all the time	17:21
robtaylor	not if you do things right	17:21
robtaylor	the pressure on the caches will be the same for the pivot_root approach, actually	17:21
robtaylor	you'll have the old shared libraries and executables mmapped in when you load the new system,	17:22
robtaylor	and no managed way to get rid of them	17:22
robtaylor	if you containerise, you can add systemd commands to query the state	17:22
robtaylor	hmm	17:22
robtaylor	maybe just use a cgroup	17:22
richard_maw	robtaylor: not exactly. With the container approach you need to keep the outer system's libraries pristine when you update the inner one. But with the pivot root, you re-exec _all_ your binaries in the new version	17:23
richard_maw	so there's no processes left using the old binaries	17:23
robtaylor	that's just a reboot	17:23
richard_maw	if you do it right, it's a reboot with no service interruption, since you can have processes gracefully re-exec and keep the connections open	17:24
richard_maw	`systemctl daemon-reexec` is an example of this	17:24
robtaylor	righty	17:24
robtaylor	so you kinda want to do a full-system daemon-reexec?	17:24
richard_maw	yeah	17:24
richard_maw	after migrating all the processes to a new mount tree	17:25
robtaylor	it would then make a lot of sense to put everything in a cgroup, and start a new cgroup when you reexec	17:25
robtaylor	then you can easily track what hasn't restarted	17:25
robtaylor	(and warn/debug if it's having problems)	17:25
robtaylor	does that make sense?	17:26
richard_maw	interesting idea, and if it maintains the systemd state you can track it back to the service, and have a nice interface to be able to tell the service to gracefully re-exec before forcing it to	17:27
robtaylor	yep	17:27
*** tiagogomes [~tiagogome@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Leaving]		17:27
robtaylor	(can you tell i've recently spent a lot of time trying to understand containers? ;) ;))	17:27
richard_maw	first class support for this in systemd would be the best place to put this pivot and gracefully re-exec logic I think	17:28
*** jonathanmaw [~jonathanm@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Leaving]		17:28
robtaylor	yep, i expect it'll be gladly accepted and walters will love you	17:28
richard_maw	currently there's not a race-free way of doing this, unless you can freeze all the processes	17:28
robtaylor	well, you can	17:28
robtaylor	you can freeze a cgroup	17:28
robtaylor	and be told when everything is quiescent	17:29
robtaylor	https://www.kernel.org/doc/Documentation/cgroups/freezer-subsystem.txt	17:29
robtaylor	but if everything providing a service is a well behaved socket-activated unit, you could totally do it race free	17:30
robtaylor	old active connections carry on using the old service, new connects startup and use the new service	17:31
richard_maw	nice	17:31
robtaylor	and you use the freezer subsystem to tell when the old services have finished	17:31
robtaylor	hmm, maybe	17:33
robtaylor	you may need some more than freezer	17:34
richard_maw	so you'd only worry about ensuring systemd is properly migrated, and over time the system eventually migrates itself	17:34
robtaylor	oh	17:34
robtaylor	use notify_on_release rather than freezer	17:34
robtaylor	and maybe have a way to manually freeze the old system if it's being bad about restarting	17:35
robtaylor	(or maybe some units need some special handling)	17:35
richard_maw	if I'm understanding your freezing suggestion correctly, then if it behaves that badly, then you're better off killing it and accepting that some connections get dropped	17:36
robtaylor	as long as you can a) tell what is actually happening easily and b) tell stuff to go away if it's being an arse and c) rollback if it's not working	17:36
robtaylor	richard_maw: yeah, probably	17:36
robtaylor	richard_maw: i think the freezer may be useful in terms of checkpointing	17:36
richard_maw	ah, so you're suggesting _process_ rollback may be possible?	17:36
robtaylor	with freezer you can rollback a cgroup	17:36
* robtaylor suddenly feels very evil		17:37
* richard_maw feels like he's a child, it's christmas and he's been given lots of new toys, some of which have warning stickers		17:37
robtaylor	you probably don't want that though, as you can only get a consistent checkpoint by forcing a cgroup to be quiescent	17:38
DavePage	"Some of these are toys, some of them are beartraps that will take off your hand. Have fun!"	17:38
robtaylor	richard_maw: you probably want to take a look at CRIU	17:39
robtaylor	DavePage: about right	17:39
robtaylor	here be shiny shiny dragons	17:39
richard_maw	robtaylor: probably, I had assumed it required that checkpointed processes need to be restored in an identical filesystem, but I could be wrong	17:41
robtaylor	richard_maw: well you can do that with btrfs	17:41
robtaylor	richard_maw: you can even send/receive a given state to a new system	17:42
* robtaylor all kinds of evil		17:42
richard_maw	yes, but I don't want to restore it in an identical filesystem, I want to restore it in a different filesystem because it contains updates	17:42
robtaylor	oh, yeah, i'm not suggesting you do that	17:42
robtaylor	that would be bad	17:42
robtaylor	just checkpointing for rollback on failure	17:42
robtaylor	so new system blows up, freeze that, make a chroot with your old snapshot and restart your old cgroup	17:43
robtaylor	and then kill off the borken new cgroup	17:44
richard_maw	that's assuming the services are continuing then re-execing, rather than us freezing the old ones and starting new versions	17:44
robtaylor	(if that makes sense)	17:44
richard_maw	I'm assuming that checkpointing isn't quick, because there's a lot of process data that needs to be serialised	17:44
robtaylor	depends, when you snapshot and send/receive	17:45
robtaylor	but i'm suggesting this sequence -> 1) start up new system in a new cgroup, hand over as per daemon reexec. 2) wait for all the old system to stop doing stuff 3) snapshot 4) kill it	17:46
robtaylor	if that makes sense	17:46
robtaylor	2) is the hardest bit. I don't think you can assume the services will exec, you'll just have to monitor the cgroup	17:47
robtaylor	s/exec/exit/	17:47
richard_maw	pivot_root has some limitations a) it works per-namespace, so I'd need to pivot in each namespace, rather than all at once. b) chrooted processes don't get their root changed c) unless the working directory of the process is /, it also won't be chdir'd. b) and c) can be dealt with by chrooting before pivoting, but without a pivot_root that can take fds, you can't always refer to the mount_points	17:47
richard_maw	s/mount_points/new mount point/	17:48
richard_maw	plus, openat means old processes can still see the old state until they re-open those files	17:48
richard_maw	but that was from when I was thinking I needed to do some magic without cooperation from systemd	17:49
robtaylor	in this scheme you don't really pivot root, you're really doing something like nspawn --share-system	17:49
robtaylor	hmm, you may also want to worry about systemd upgrades =)	17:50
richard_maw	not exactly though, as I need systemd to also do the transition to the new system, which may require systemd being backwards compatible with its serialised state when doing a `systemctl daemon-reexec`	17:50
robtaylor	yep	17:50
robtaylor	that sounds about right	17:50
richard_maw	daemon-reexec is only supported for being able to reload the libraries it depends on and re-execing for debugging	17:51
richard_maw	re-execing for debugging is basically just so you can compile a version with print statements in	17:51
robtaylor	interesting http://www.freedesktop.org/wiki/Software/systemd/SystemUpdates/	17:51
robtaylor	(not what you want but a little informative)	17:52
richard_maw	it appears to miss the point for me, as it's doing offline update when you have packages available, when one of the advantages of packages is that you can do online updates	17:53
robtaylor	hmm, actually does any state really need to be passed?	17:53
robtaylor	between new and old systemd?	17:53
robtaylor	it's just really atomic socket activation handover	17:54
richard_maw	only if we're allowing services to see the new version of the system, which may be allowable, since it's likely to work, as you see the same transition with non-atomic updates	17:54
pedroalvarez	jjardon: I still want to know how critical is the bug you have found with glibc and tzdata.. :P	17:54
richard_maw	as services may be used to being still alive when packages are being updated	17:55
robtaylor	yeah, anything that breaks in this model would have broken before	17:55
robtaylor	indeed, actually the model you're replacing never had atomic handover	17:55
robtaylor	you'd always have downtime unless your service specifically had graceful restart	17:56
robtaylor	(e.g. apache2)	17:56
richard_maw	also irssi according to one source :-)	17:56
richard_maw	you run /update or something	17:56
richard_maw	ah /upgrade	17:57
robtaylor	so you just need a way to say how to graceful in the unit, and if that isn't there, you stop and restart	17:57
robtaylor	and of course, that's already there	17:57
robtaylor	ExecReload=/usr/sbin/apache2 -k graceful $APACHE2_OPTS	17:57
richard_maw	I thought reload was just for config	17:58
*** ssam2 [~ssam2@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Leaving]		17:59
robtaylor	mm, yes, you'd probably need a new unit line	17:59
*** mariaderidder [~maria@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Ex-Chat]		18:00
richard_maw	so, we want graceful restarts for everything, but systemd can make that easier to implement for services by them just having a graceful shutdown and being socket activated	18:00
robtaylor	umm, well i'm saying you probably can't have graceful restarts for everything	18:01
robtaylor	but you can where services support graceful restarts	18:01
robtaylor	and that can be indicated in the unit	18:01
robtaylor	if they don't, you just stop and start them	18:02
robtaylor	and you'll be at parity with current systems, but a lot more controlled	18:03
*** CTtpollard [~tom@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Quit: Ex-Chat]		18:03
richard_maw	if we had a magic pivot_root that rebased all the paths in all the processes (cwd, root and all open fds) we could get away without special logic in systemd, but it sounds like systemd support would be easier to achieve	18:05
robtaylor	i think bad things would probably happen if you did that	18:06
robtaylor	imagine not all your shared library is paged in	18:07
robtaylor	(or your mmapped data set)	18:07
*** Krin [~mikesmith@82-70-136-246.dsl.in-addr.zen.co.uk] has quit [Remote host closed the connection]		18:07
richard_maw	hm, so not all fds, but fds referring to directories might be ok	18:08
robtaylor	(actually shared library is probably a red herring, as that'll all be in ram anyhow)	18:08
robtaylor	it'd be certainly very tricksy and probably hit a lot of unexercised code paths	18:09
richard_maw	the processes are going to be gracefully restarted soon anyway, so keeping files open wouldn't be too problematic	18:09
jjardon	pedroalvarez: sorry, no time to investigate now, I will take a look when arrive home	18:09
robtaylor	systemd approach would of course be much cooler and give you more kudos ;)	18:09
richard_maw	robtaylor: yeah, and it's unlikely to be accepted as a kernel patch, since it feels like pivot_root is probably already a security vulnerability waiting to happen	18:10
robtaylor	yep	18:10
robtaylor	just realised one issue - actually nothing will support graceful restarts on this kind of system out of the box	18:11
richard_maw	I'll probably still have to knock up a racy prototype, just to show that it's possible, but the systemd route looks plausible	18:11
robtaylor	if you apachectl --graceful, it's expecting to shutdown and do the restart	18:11
richard_maw	robtaylor: oh?	18:11
robtaylor	whereas we want to have two halves	18:11
robtaylor	--graceful-shutdown and --graceful-restart	18:12
richard_maw	I suppose ExecStop is what you want for graceful shutdown anyway	18:12
robtaylor	graceful-stop	18:13
robtaylor	yep	18:13
robtaylor	i was just thinking you could basically just get the old systemd to command the new systemd	18:13
richard_maw	you need systemd to re-exec itself and be PID1 though	18:14
robtaylor	bindmount the new systemd's socket from the new ns into the host	18:14
robtaylor	use pid namespace	18:14
robtaylor	(and user namespace)	18:14
richard_maw	can the parent pid and user namespace go away when that happens?	18:15
robtaylor	i'm still interested to hear why someone thinks containers have a performance impact and where they think that performance impact is	18:15
robtaylor	richard_maw: i think you'd always keep real uid0 and pid 1 pristine	18:15
robtaylor	hmm	18:16
richard_maw	this would have been worth discussing at the Linux Plumbers conference	18:17
robtaylor	we wouldn't have thought of it then	18:18
robtaylor	i'm sure i could arrange some beers with lennart though	18:18
robtaylor	probably after a poc ;)	18:18
richard_maw	poc?	18:18
robtaylor	proof of concept	18:18
richard_maw	yeah, as I said, I'll need to make one anyway for the deadline of my current chunk of work	18:19
robtaylor	cool	18:19
robtaylor	maybe i can finish my app sandboxing stuff in the same timeline ;)	18:20
richard_maw	it's likely to be racy as hell if I want to be able to change everything of importance	18:20
robtaylor	well, you can probably just make everything actually quiesce	18:21
robtaylor	the tricky bit will be handing over the sockets between the systemds	18:21
richard_maw	but if we can get online atomic updates going, then it could be the nail in the coffin for packages	18:22
robtaylor	yep, this would get widely used, i'm sure	18:22
robtaylor	it solves the real problem you always had with upgrades, old cr*p hanging around and you having no real way to know what was what	18:22
richard_maw	plus you don't need the complication of packages needing to leave the system in a runnable state after every installation or removal, you can get away with just applying a delta	18:23
robtaylor	maybe a first poc would be best done with just working within the same systemd instance	18:23
richard_maw	yeah, pivot_root in all mount namespaces that are just for private mounts, rather than full containers (shared PID namespace probably)	18:25
robtaylor	i'd just use nspawn tbh	18:26
robtaylor	hmm	18:26
* robtaylor is running out of brain. maybe we could pick this up again tomorrow		18:27
robtaylor	one thing. If the container concern is just cache pressure, you can easily still use containers in this scheme but mitigate that concern	18:29
robtaylor	you have a top level (real) pid 1 that's a systemd. The 'current system' would be a container under that systems, as would be your 'new system'	18:30
richard_maw	probably, I've got a planning meeting to decide who's doing what to get us closer to the proof of concept level for a whole bunch of stuff, but if I immediately start on picking through the atomic online update stuff a face to face chat about this would probably be of immense value	18:30
robtaylor	I could probably do a face to face tomorrow if you'd like	18:31
robtaylor	oh no i can't , i won't be in mcr	18:31
robtaylor	can do a call/online whiteboard if you'd like	18:31
robtaylor	anyhow, lets catch up tomorrow here first ;)	18:32
richard_maw	sure	18:32
* robtaylor drives home		18:32
* richard_maw would still like a variant of pivot_root that took fds and had flags for future expansion		18:34
* richard_maw is a little amused that he might be able to call himself a Linux Plumber in the future		18:37
dabukalam	richard_maw: Is a linux plumber anyone that's committed code to the kernel?	18:52
*** cosm [~Unknown@host-78-150-56-250.as13285.net] has quit [Ping timeout: 265 seconds]		19:02
*** cosm [~Unknown@host-78-150-56-250.as13285.net] has joined #baserock		19:02
*** cosm [~Unknown@host-78-150-56-250.as13285.net] has quit [Ping timeout: 264 seconds]		19:19
*** rdale [~quassel@9.Red-83-45-185.dynamicIP.rima-tde.net] has quit [Ping timeout: 256 seconds]		19:40
*** cosm [~Unknown@cspc154.cs.man.ac.uk] has joined #baserock		19:54
*** zoli_ [~zoli_@linaro/zoli] has joined #baserock		19:56
*** zoli_ [~zoli_@linaro/zoli] has quit [Remote host closed the connection]		19:58
* jjardon merges the systemd 217 branch \o/		21:47
paulsher1ood	w00t! :)	21:50
* paulsher1ood kicks off a build		21:50
robtaylor	dabukalam: linux 'plumbing' is the lower levels of userspace and the user land interfaces of the kernel	21:52
robtaylor	dabukalam: linux plumbers is a conference for this http://www.linuxplumbersconf.org/	21:53
jjardon	paulsher1ood: :) if you get bored: baserock/jjardon/gstreamer14 is available as well ;)	22:00
* paulsher1ood stops his build, merges the above, and restarts :-)		22:05
*** genii [~quassel@ubuntu/member/genii] has quit [Read error: Connection reset by peer]		22:15
cosm	@robtaylor do you know what's the plan for the kernel hacking workshop at UoM?	22:42
