IRC logs for #buildstream for Tuesday, 2017-12-19

*** mcatanzaro has joined #buildstream		02:26
*** valentind has joined #buildstream		08:36
gitlab-br-bot	buildstream: merge request (master->master: WIP: Documentation improvements) #183 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/183	09:04
*** tiago has joined #buildstream		09:12
*** noisecell has joined #buildstream		09:29
*** valentind has quit IRC		09:31
*** tpollard has joined #buildstream		10:13
*** ssam2 has joined #buildstream		10:17
gitlab-br-bot	buildstream: issue #173 ("Element state / cache key handling") changed state ("opened") https://gitlab.com/BuildStream/buildstream/issues/173	10:30
tlater	Is there any reason to use the click runner over subrpocess.check?	10:50
* tlater is struggling enough trying to whip this in to place that he's considering just using subprocess...		10:51
ssam2	it avoids a fork() and exec()	10:51
ssam2	i don't know if that matters at all, though	10:51
tlater	It seems to do magic to file descriptors that's incompatible with other things, though...	10:52
ssam2	yes	10:53
ssam2	the docs themselves say this: 'This only works in single-threaded systems without any concurrency as it changes the global interpreter state.'	10:53
ssam2	i don't fully understand what that is implying though	10:53
ssam2	http://click.pocoo.org/5/api/#testing	10:53
ssam2	perhaps it's just referring to the way it modifies sys.stdin/out/err	10:54
tlater	I guess it does catch exceptions neatly	10:55
persia	I think it means that if you try to use that as a thread in a larger application, the global state change for the individual thread will affect the rest of the application (so don't: instead restrict use to single-threaded calls).	11:14
*** jude has joined #buildstream		11:32
tlater	So it looks like passing the sys.* objects created by pytest to subprocess.Popen causes the subprocess to hang	11:40
* tlater figures this is deadlock because the parent process and the child process attempt to write to the same file		11:42
tlater	The only way I can see this working is if the sandbox subprocesses write to a pipe and then write all their stdout/stderr to sys.stdout/sys.stderr after they finish	11:42
ssam2	watching output of subprocesses is always painful in python	11:45
ssam2	do you know it's due to pytest creating the sys.stdin/out/err objects?	11:46
ssam2	it's easy to hang even with the default stdin/stdout/stderr	11:46
ssam2	if you don't read from the pipe in the parent process, the child process blocks as soon as it fills up the buffer	11:46
tlater	I've never seen a hang for this reason outside of these test cases, so I'm fairly sure	11:47
ssam2	"fairly sure" is rarely good enough when dealing with software :=)	11:47
tlater	Fair enough, I don't really see a way to be more certain though	11:47
ssam2	you should be able to run the test as a standalone program easily enough ?	11:48
ssam2	just add a call at the bottom of the script to call the test function	11:48
tlater	Yep, that I have done	11:48
ssam2	ah, then pytest shouldn't be messing with sys.stdin/out anyway I don't think	11:48
tlater	I've also replaced sys.stdout with None in the actual code, which seems to prompt python to create a new stdout for that process	11:48
ssam2	if it does that when you just import the module, it's a bad bad panda	11:48
tlater	Nah, it doesn't, luckily	11:48
tlater	But the reason behind the earlier error is that python creates new std* file descriptors for subprocesses	11:49
tlater	Which means that the original one we are capturing from isn't written to	11:49
tlater	Which in turn means we can't get subprocess output	11:49
ssam2	right	11:49
tlater	Now, trying to swap that out for our current file descriptor causes deadlock	11:50
ssam2	using stdout=subprocess.PIPE is probably the best option then, indeed	11:50
ssam2	i always find myself using that to be honest	11:50
tlater	In that case, we would be changing buildstream functionality though	11:50
ssam2	when faced with problems like this	11:50
ssam2	hm ? i thought this was only the test runner	11:50
tlater	No, this is testing the `bst shell` code	11:50
tlater	And because that launches a subprocess...	11:50
ssam2	right	11:51
ssam2	but why not launch `bst` itself as a subprocess ?	11:51
tlater	I'm still trying to use the click test launch thing	11:51
tlater	Though I'm really considering throwing that out	11:51
ssam2	personally, if you can replace it i would say that's no bad thing	11:51
ssam2	this is another situation where Python's stdlib is pretty lacking	11:52
ssam2	there's no way to start a process and then read from stdout/stderr as it runs	11:52
ssam2	i must have written code to do that a dozen times already, always slightly different	11:52
tlater	Yeah, it would be nice to have	11:52
ssam2	but anyway, the click thing is not that smart, we already hit limits of it (no separate stderr/stdout feeds) and it's clearly unmaintained	11:53
* tlater removes it and the hack tristan installed to work around stderr/stdout splitting		11:53
ssam2	so... all we really gain is this smart way of avoiding a fork() and exec() every time we start an instance of `bst` from the tests	11:53
tlater	It also catches exceptions	11:54
ssam2	ah true	11:54
ssam2	that's pretty useful for testscases indeed	11:54
tlater	I don't think it's possible to do that with a fork() and exec() unfortunately	11:54
ssam2	yeah	11:55
ssam2	still, from the point of view of the integration tests this isn't a regression as we already couldn't do that	11:56
ssam2	if the existing test suite uses click.CliRunner and the integration tests use something else, we're still in a better position than before	11:56
tlater	I'd rather use the same functions for both, tbh, but I guess there's no helping it	11:57
persia	The CliRunner docs fairly strongly assert they are to be used for unit tests, and not other sorts, encouraging the split	11:58
*** jude has quit IRC		12:03
tlater	\o/ Finally have a successful test case :)	12:23
gitlab-br-bot	buildstream: merge request (sam/savefile-utility->master: utils.py: Add SaveFile utility class) #179 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/179	12:24
tristan	tlater, ssam2, we probably want to make public API out of `tests/testing` module	14:50
tristan	and share that between the "integration" tests and other regression tests	14:51
tristan	but not before 1.0	14:51
tristan	otherwise, how will third party plugin repos use the test fixtures; write it all by hand ? meh...	14:52
ssam2	makes sense	15:16
tlater	pytest.parametrize is so handy for integration tests :)	15:41
* tristan not sure what the above thing is really about, we could run the exec but then we'd have to provide the entry point		15:42
tristan	(so that tests are not dependent on install)	15:43
ssam2	oh that's true	15:43
tristan	also, we want to refactor the exceptions into error domains/codes	15:43
tristan	where we have like LoadErrorReason already for load domain	15:43
tlater	With the stderr/stdout redirection we can't get the actual exception object in integration tests	15:44
tristan	ideally we get it all machine readable, we could have a way for the app to communicate it back to the caller, if that makes sense	15:44
tristan	I dont expect to be able to pass an exception object between processes	15:44
tlater	So instead you suggest a neater way of outputting exceptions?	15:45
*** mcatanzaro has joined #buildstream		15:45
tristan	tlater, I think you want to be using the same fixtures as in tests/	15:45
tlater	I am for the most part, but I can't use the click test runner	15:46
tristan	tlater, I would, but not if it's unnecessary.	15:46
tristan	tlater, we dont use the click test runner	15:46
tristan	and why not ?	15:46
tlater	Because `bst shell` runs a subprocess, and I've been unable to capture stdout of that subprocess through the click test runner.	15:47
tristan	(we used to use the click test runner, but now we just invoke the cli() directly with a pytest hack)	15:47
tlater	I'm referring to this: cli.main(args=args or (), prog_name=cli.name, **extra)	15:47
tlater	Supposedly that's the click test runner?	15:47
tlater	Or part of it, anyway	15:48
tristan	not anymore	15:48
tristan	we used to use CliRunner	15:48
tlater	Hm, well, either way, it runs buildstream in the same process, which makes it very difficult to get stdout from its subprocesses	15:48
tristan	tlater, it's actually cli() defined in _frontend/main.py	15:49
tlater	Ah, fair enough	15:49
tristan	so you are basically saying that child processes dont inherit the same stdout/stderr ?	15:50
tlater	Yes, and if you force them to, you end up with deadlock	15:50
tristan	what is "force them" ?	15:51
tlater	Setting stdout of the suprocess.Popen call to sys.stdout	15:51
tlater	Which, by default, is set to "None"	15:51
tlater	And should make them get sys.stdout, but presumably that's a different file descriptor, which means that our wrapping code doesn't capture it.	15:52
tristan	tlater, you might try FDCapture instead of SysCapture in tests/testutils/runcli.py	15:53
tristan	that should capture the stdout/stderr at the os level instead of the python sys level	15:54
tlater	Hm, that's worth a try	15:54
tristan	if it works as advertised, not sure that's what it's for	15:54
* tristan notes that pytest is itself running the same captures; so we are already a capture within a capture		15:56
tristan	stuff we print in runcli.py only gets passed back to pytest	15:56
tlater	tristan: The FDCapture works \o/	16:01
tristan	tlater, nice :D	16:02
tlater	Well, at least with the new tests	16:02
tlater	Here's hoping it doesn't break the already existing tests	16:03
* tristan would not be against changing the whole thing to launch an exec, but that's a bit involved and deserves to be it's own topic		16:03
tlater	Well, it's not too big of an issue anymore now, and it means that we gain some small benefits... All it really costs us is some slightly more complex cli wrapper code.	16:05
* tristan thinks it's nicely isolated and replaceable anyway; we could throw it away and change it at any time without having to rewrite tests		16:06
* tlater in fact did just that for a little while to test his integration tests; it's very easy to overwrite the invoke() method		16:07
tristan	so I guess feel free to `git mv tests/testing buildstream/_testing` for your purposes	16:08
tristan	then we can later polish it up and make it `buildstream/testing` and public	16:08
tristan	oh it's currently testutils	16:08
*** noisecell has quit IRC		16:17
* paulsherwood wonders whether he can yet expect to 'get started' smoothly with buildstream		16:37
tlater	There are a few documentation pages coming up with what you need to 'get started', they should land around 1.0 :)	16:42
*** tristan has quit IRC		16:51
paulsherwood	tlater: pls can it be sooner than that? i have a window to look at this over the xmas break... if it's not ready, i'll miss it	17:09
tlater	Hm, with a bit of luck we'll be pushing them tomorrow	17:10
* paulsherwood crosses fingers		17:10
tlater	Which reminds me, I should probably review this today so that tings go smoothly	17:10
gitlab-br-bot	buildstream: merge request (sam/meson->master: WIP: use Meson instead of setuptools to install and distribute BuildStream) #196 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/196	17:13
ssam2	profiling `bst shell`, it seems we spend loads of time in os.realpath	17:27
ssam2	from _ensure_real_directory	17:27
ssam2	this profiling method doesn't measure the FUSE code though, which is a bit useless	17:31
ssam2	commenting out the realpath() calls in _ensure_real_directory() makes the link_files() call go from 26 seconds to 6 seconds, so we will have to figure something out there	17:32
ssam2	i don't think it can just be removed though	17:33
bochecha	ssam2: 20 seconds to resolve symlinks? o_O	17:33
ssam2	i think the issue is Python implements realpath() itself, in Python	17:34
ssam2	we have 1000s of files, and its just doing loads of slow string processing	17:34
ssam2	if we could use realpath() from glibc it would probably be much faster	17:36
ssam2	anyway, i guess i need to open issues for these things and do more profiling, rather than spewing thoughts into IRC :-)	17:37
bochecha	Python's realpath is surprising :-/	17:37
bochecha	path, ok = _joinrealpath(filename[:0], filename, {})	17:37
ssam2	yeah	17:38
bochecha	I'm pretty sure that `filename[:0]` is just the empty string :)	17:38
ssam2	yeah I puzzled over that one	17:38
ssam2	but I think it's because joinrealpath checks if that is bytes or string	17:38
bochecha	it could do that on filename, though	17:39
bochecha	seems more like _joinrealpath is called recursively, so the first call has one parameter empty	17:40
ssam2	true	17:40
ssam2	ah yeah	17:40
bochecha	but then it does `os.readlink` on each component of the path? that's... it's too late to be reading this kind of code :P	17:41
gitlab-br-bot	buildstream: merge request (sam/meson->master: WIP: use Meson instead of setuptools to install and distribute BuildStream) #196 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/196	18:01
gitlab-br-bot	buildstream: issue #174 ("Python os.realpath() function is too slow") changed state ("opened") https://gitlab.com/BuildStream/buildstream/issues/174	18:10
*** noisecell has joined #buildstream		18:12
*** noisecell has quit IRC		18:13
*** tristan has joined #buildstream		18:19
*** valentind has joined #buildstream		18:38
*** ssam2 has quit IRC		18:52
*** tiago has quit IRC		20:00
*** tristan has quit IRC		21:00
*** mcatanzaro has quit IRC		21:11
*** valentind has quit IRC		22:41
*** semanticdesign_ has quit IRC		23:47

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!