*** jennis has joined #buildstream | 00:14 | |
*** jennis_ has joined #buildstream | 00:14 | |
*** tristan has joined #buildstream | 02:24 | |
tristan | gitlab now has a terms of service ! | 02:31 |
*** Prince781 has joined #buildstream | 03:19 | |
*** tristan has quit IRC | 03:47 | |
*** jennis_ has quit IRC | 04:36 | |
*** jennis has quit IRC | 04:36 | |
*** Prince781 has quit IRC | 04:44 | |
*** ernestask has joined #buildstream | 06:05 | |
paulsherwood | eek... what does that mean? | 06:07 |
*** tristan has joined #buildstream | 06:14 | |
paulsherwood | tristan: is there something in the terms of service you're concerned about? | 07:24 |
tristan | not really, I just had to click through a "There is now a terms of service" button :) | 07:27 |
tristan | I did read through it horizontally out of curiosity; they are fairly transparent when compared with evil social media and human harvesting corps :) | 07:28 |
*** toscalix has joined #buildstream | 07:32 | |
paulsherwood | :-) | 07:35 |
gitlab-br-bot | buildstream: merge request (juerg/googlecas->master: WIP: Google CAS-based artifact cache) #337 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/337 | 07:52 |
*** jonathanmaw has joined #buildstream | 08:41 | |
*** dominic has joined #buildstream | 09:09 | |
*** noisecell has joined #buildstream | 09:12 | |
gitlab-br-bot | buildstream: merge request (jmac/rename_size_request->master: WIP: Logging widgets: Rename 'size_request' to 'prepare') #382 changed state ("closed"): https://gitlab.com/BuildStream/buildstream/merge_requests/382 | 09:17 |
*** bochecha_ has joined #buildstream | 09:21 | |
finn | I'm sourcing a file for autotools. The file has a directory structure such that the file I actually want to make is inside 'doc/amhello' | 09:26 |
finn | Is it possible to prepend to autogen? | 09:26 |
finn | I've tried this: | 09:27 |
finn | https://pastebin.com/9kwCNTdg | 09:27 |
finn | But I think prepend only works for lists, not dicts | 09:27 |
finn | base-amhello.bst [line 9 column 4]: List cannot overwrite value at: autotools.yaml [line 5 column 11] | 09:28 |
*** aday has joined #buildstream | 09:33 | |
tristan | finn, correct, dicts are entirely unordered things | 10:07 |
finn | I can't think of a nice solution to grab the autotools hello world example. I've uploaded as far as I got to the examples repo with comments | 10:10 |
tristan | finn, does setting the 'command-subdir' to 'doc/amhello' not work for your purposes ? or you *really* only want to override the autogen part of `configure-commands` ? | 10:10 |
finn | Will try now | 10:10 |
tristan | command-subdir applies to all commands | 10:11 |
tristan | finn, do you have a link to the specific example I can view, also ? | 10:12 |
gitlab-br-bot | buildstream: merge request (juerg/googlecas->master: WIP: Google CAS-based artifact cache) #337 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/337 | 10:13 |
finn | thanks, command-subdir worked :) | 10:19 |
finn | You mentioned that yesterday too but I don't think I'd quite understood | 10:19 |
tlater | Huh, I didn't know about command-subdir | 10:22 |
tlater | That's neat :) | 10:22 |
finn | I'll upload in a mo. The example now builds the official automake example and installs | 10:24 |
tristan | tlater, it's admittedly tricky to find, because it's implemented (and thus documented) in the shared BuildElement base class | 10:24 |
finn | ^^ I hadn't quite understood that doc yesterday | 10:24 |
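For reference, a minimal sketch of what such an element might look like. The element contents below are an illustrative assumption, not finn's actual pastebin; the source URL and ref in particular are stand-ins:

```yaml
kind: autotools
description: Build the amhello example shipped inside the automake tarball

sources:
- kind: tar
  # Hypothetical URL and elided ref, standing in for wherever the
  # tarball is actually mirrored in the example project
  url: https://ftp.gnu.org/gnu/automake/automake-1.16.tar.gz
  ref: ...

variables:
  # Documented on the shared BuildElement base class: run the
  # configure/build/install commands inside this subdirectory of the
  # staged sources instead of at their root.
  command-subdir: doc/amhello
```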
tlater | tristan: On the expiry branch btw... | 10:27 |
tlater | The last thing I have to think about is the triple quota | 10:28 |
tristan | tlater, you mean triple threshold | 10:28 |
tlater | Err, yeah | 10:28 |
tlater | Obviously we'd rather have the user set a single threshold and calculate the other two from that | 10:29 |
tlater | The question is - what should the margins be? | 10:29 |
tristan | tlater, As I recall, we had decided that there was only going to be one threshold in your branch, as it is intended to land well in advance of CAS | 10:29 |
tlater | Didn't we need it anyway to ensure that we don't spend all our time in cache cleanup? | 10:29 |
tristan | i.e. "for now" you are not allowing cache cleanup to happen concurrently with ongoing builds (and potential cache commits) | 10:29 |
tristan | two thresholds is enough for that | 10:30 |
tlater | We have to wait for all current jobs to finish, because otherwise we may remove in-progress ostree commits | 10:30 |
tristan | you need a third only when you allow a cleanup to happen with ongoing commits at the same time, which we should ideally be doing, but are not going to before CAS | 10:30 |
tlater | Alright, I'll just take that at face value for now... In either case, finding a good margin here is difficult, because it depends on the average artifact size | 10:32 |
tristan | tlater, the lower threshold is the ideal smallest target cache size, the higher threshold is the one where you wait for builds to complete and queue up a cleanup job before triggering more builds | 10:32 |
tristan | yes, that is a separate problem | 10:32 |
tlater | I think the optimal solution is the maximum expected artifact size * the number of possible concurrent build jobs | 10:33 |
tristan | i.e. how we derive these values from user configuration is a separate activity | 10:33 |
tlater | Because that way we'd end up perfectly hitting the spot where we filled up the cache every time | 10:33 |
gitlab-br-bot | buildstream: merge request (issue-21_Caching_build_trees->master: WIP: Issue 21 caching build trees) #372 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/372 | 10:33 |
tristan | tlater, that is exactly wrong | 10:33 |
tlater | The question is just, what is the maximum expected artifact size? | 10:33 |
tristan | tlater, i.e. if you mean that you want *that* distance between the two thresholds, it's untrue | 10:34 |
*** bethw has joined #buildstream | 10:34 | |
tristan | tlater, you ideally want a cache cleanup to trigger at most once per build session | 10:35 |
tristan | i.e. you never *want* it to happen, it's an evil necessity that slows things down, it's an annoyance | 10:35 |
tristan | tlater, I might consider starting with something like 50% for the lower threshold | 10:36 |
tristan | and that 50% has a lower limit *anyway*: it is limited by not being allowed to delete artifacts which are "pinned" by the current build plan | 10:36 |
tristan | so that 50% might become 75% with a small quota | 10:37 |
*** bochecha_ has quit IRC | 10:37 | |
tlater | So, on this low threshold, what do we do? Do we launch a cleanup job, just to make sure we go down to a reasonable amount at the end of the build? | 10:38 |
tristan | ok, so to think about that... take the user configured value and throw it away | 10:38 |
tristan | now we're thinking of how the machine works | 10:39 |
tristan | you have two values | 10:39 |
tristan | When you reach the upper value, you stop queuing build tasks | 10:39 |
tristan | Then you launch the cleanup task once build tasks are complete | 10:39 |
tristan | or "At the earliest free opportunity" | 10:39 |
tristan | Accounting for the failed builds and such which will cause interactive prompts etc, when the user "continues"... and no builds are running... | 10:40 |
tristan | Then you launch your cleanup task | 10:40 |
tristan | tlater, When the cleanup task runs... it removes artifacts until you reach the lower threshold (or stops earlier if removing more artifacts would cause "pinned" artifacts to be removed) | 10:41 |
tristan | Ideally, when the cleanup task completes, the artifact cache size is <= lower threshold | 10:41 |
tlater | Hm, this would mean that we'd always launch a cleanup job once the lower threshold has been reached | 10:41 |
tristan | No | 10:42 |
tristan | Once the higher threshold is reached, you stop queueing jobs and launch a cleanup | 10:42 |
tristan | That single cleanup job, removes alllllllll the artifacts all the way down to the lower threshold | 10:42 |
tristan | leaving you with tons of space in your quota | 10:42 |
tlater | tristan: I mean that we'd launch a cleanup job once per pipeline once we ever reach the lower threshold | 10:43 |
tristan | so you likely don't have to see the cleanup job more than once in a run | 10:43 |
tlater | Because you'll inevitably run out of elements to build eventually | 10:43 |
tristan | What do you mean ? | 10:43 |
tlater | Say the user continued out of a pipeline with failed build elements, and this launched a cleanup job | 10:44 |
tlater | We are now just under the lower threshold | 10:44 |
tristan | You only ever schedule the cleanup job once you reach the UPPER threshold | 10:44 |
tlater | The next time the user launches a pipeline that creates any artifact, we will almost certainly launch a cleanup job at the end again | 10:44 |
tristan | I'm not following why you think that | 10:44 |
tlater | I thought you'd launch a cleanup task at the lower threshold, but that makes sense, yeah | 10:45 |
tlater | The lower threshold is essentially just the value we want to reduce the cache size to - it doesn't trigger any action from buildstream | 10:45 |
tristan | I think I've been saying the same thing... but capitalization of 'upper' works ? | 10:45 |
tristan | exactly | 10:45 |
tlater | Haha, sorry, I thought you'd said the opposite earlier | 10:45 |
tristan | those are the meanings of the two thresholds | 10:45 |
tlater | Probably misread | 10:45 |
tlater | Yep, that makes sense | 10:46 |
tristan | So I think 50% is a decent start for the lower, "target size" threshold | 10:46 |
tlater | If we add a third threshold, *that* threshold would be when we launch the job | 10:46 |
tlater | And that threshold should then be close to max artifact size * number of jobs | 10:46 |
tlater | I presume? | 10:46 |
tristan | tlater, exactly... and it would reside somewhere around 75% lets say | 10:46 |
tristan | tlater, but it would not stop queueing jobs; jobs only stop getting queued at the upper limit | 10:47 |
tlater | Yep, I understand, ta | 10:47 |
tlater | I think 50% is a pretty good value | 10:47 |
tristan | The way that we arrive at that number is rather orthogonal to the initial implementation, but I think it's a decent number too :) | 10:48 |
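A minimal sketch of the two-threshold scheme as discussed above; all the names here are hypothetical illustrations, not BuildStream's actual scheduler API:

```python
# Hypothetical sketch: builds keep getting queued until the UPPER
# threshold is hit; then one cleanup job removes artifacts all the
# way down to the LOWER (target-size) threshold, skipping artifacts
# pinned by the current build plan.

class CacheQuota:
    def __init__(self, quota_bytes, lower_pct=0.5, upper_pct=1.0):
        self.lower = int(quota_bytes * lower_pct)  # target size after cleanup (~50%)
        self.upper = int(quota_bytes * upper_pct)  # stop queuing builds here

    def may_queue_builds(self, cache_size):
        # Only the upper threshold gates job queuing.
        return cache_size < self.upper

    def cleanup(self, cache_size, artifacts, pinned_keys):
        # Remove least-recently-used artifacts first, stopping early
        # rather than touching pinned artifacts.
        for artifact in sorted(artifacts, key=lambda a: a.mtime):
            if cache_size <= self.lower:
                break
            if artifact.key in pinned_keys:
                continue
            cache_size -= artifact.size
            artifact.remove()
        return cache_size
```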
tlater | We can always optimize in the future if we get more data... | 10:48 |
tlater | Buildstream telemetrics? | 10:48 |
tlater | ;p | 10:48 |
tristan | Capital S | 10:48 |
tristan | toscalix has started a trend I'm afraid | 10:48 |
tlater | Ack, apologies, typo | 10:49 |
* tlater will hopefully get to implementing this over the next few days and finally get this MR done | 10:49 | |
tristan | tlater, sec... | 10:50 |
toscalix | Based on my experience with OpenSUSE, openSUSE, openSuSe, openSuse, Opensuse..... I predict capitalization will be the biggest source of headaches for tristan in the next 5 years | 10:52 |
tristan | tlater, https://gitlab.com/BuildStream/buildstream/merge_requests/421/diffs#note_72664689 <-- is this going to be relevant local cache ? | 10:52 |
tristan | toscalix, uh oh... now I *KNOW* who is responsible for having started mis-capitalization in the openSUSE camp | 10:53 |
tristan | toscalix, don't do it ! ;-) | 10:53 |
tlater | tristan: No, we worked quite closely together. That MR actually contains some of the API I added for local expiry. | 10:54 |
tristan | tlater, that comment, you know... calculating the size is a *very intense* operation | 10:54 |
tlater | Oh, it didn't link me to the comment :| | 10:55 |
tlater | Hang | 10:55 |
tristan | tlater, I am presuming we are caching some number with the assistance of ostree here, and that we know how much we've added to the repo when we make a commit | 10:55 |
tlater | tristan: Oh, yeah, I made very sure to be careful about using that function | 10:55 |
tristan | so we can easily keep track of what we're spending without spamming the hardware relentlessly | 10:55 |
tlater | We should use it once at most for the full cache - unfortunately ostree can't report cache sizes :| | 10:56 |
tristan | tlater, once ... "in a session" ? | 10:56 |
tlater | Yes, unless I start writing the session size to a file | 10:56 |
tlater | Even then the accuracy would be abysmal | 10:57 |
tlater | But I make sure a separate thread calculates it, at least | 10:57 |
tristan | So how do we know how much the cache grew after a commit ? | 10:57 |
tlater | We guess based on the artifact's directory size | 10:57 |
tlater | And only calculate when *that* grows too large | 10:57 |
tristan | So we assume that builds are 100% non-reproducible, and that ostree has no deduplication ? | 10:58 |
tlater | No, we assume that that gives us a nice upper limit on the actual size | 10:58 |
tristan | I guess that is kind of okay-ish | 10:58 |
tlater | We then figure out the actual cache size if that upper limit becomes too large | 10:59 |
tristan | So we fake it by calculating the size of directories we add, assuming no deduplication | 10:59 |
tristan | and when we near the quota, we scratch the disks ? | 10:59 |
tlater | Yes, that's what the MR currently does... | 11:00 |
tlater | I don't really have any other way to deal with it though | 11:00 |
tlater | Because assuming no deduplication means that the guess is quite inaccurate | 11:00 |
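A sketch of what tlater is describing: keep a running over-estimate by adding each committed artifact's directory size (i.e. assuming zero deduplication), and only pay for a real filesystem walk when that estimate nears the quota. Names are hypothetical, not the MR's actual code:

```python
import os

def directory_size(path):
    # The expensive part: a full `du`-style walk over the tree.
    total = 0
    for dirpath, _, filenames in os.walk(path):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # file vanished mid-walk
    return total

class CacheSizeEstimator:
    def __init__(self, repo_path, quota_bytes):
        self.repo_path = repo_path
        self.quota = quota_bytes
        self.estimate = directory_size(repo_path)  # once per session

    def record_commit(self, artifact_dir):
        # Upper bound: assume no deduplication between artifacts.
        self.estimate += directory_size(artifact_dir)
        if self.estimate >= self.quota:
            # The guess may be far too pessimistic; re-measure for real.
            self.estimate = directory_size(self.repo_path)
        return self.estimate
```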
tristan | tlater, note that thread or no thread... this is still very concerning... try running `du -hs` in a directory that is around 60GB in size full of fairly small files (preferably after a clean boot) | 11:00 |
tristan | it will bog down the machine for sure, I/O wait is going to suffer a lot | 11:01 |
tristan | Maybe we should reverse this completely; however, if CAS is designed for this, quotas are a preferable configuration API | 11:04 |
tlater | I dislike this, too... Does CAS support a nicer way of determining repository size? | 11:04 |
tristan | If this is going to be a short lived experience of "OK everybody, lets wait for 20min to determine the size of the cache before the builds continue !", then maybe it's alright | 11:05 |
* tlater doesn't think 20 minutes is alright for small builds, but this should be very rare for those at least | 11:06 | |
jmac | tlater: I don't think it does, unfortunately. | 11:07 |
tlater | Considering that, it probably isn't a bad idea to keep a file containing the cache size between runs. | 11:07 |
tristan | tlater, I am making a bit of a presumption based on juergbi being aware of the designs we've been working on, but we ought to make sure that juergbi has considered this in his CAS implementation | 11:07 |
tristan | if we're going to use this approach at all | 11:07 |
tristan | tlater, the opposite approach is to not have a quota, but to instead clean up artifacts based on remaining space on the partition | 11:08 |
tristan | which is always snappy, but a shitty configuration API | 11:09 |
tlater | We *could* make that the default | 11:09 |
juergbi | The current CAS branch doesn't have any special features for this, so it will work the same way (once purge is implemented) | 11:09 |
tristan | A user wants to say "Use at most 50GB", not "Always leave 50GB space on my disk" | 11:09 |
juergbi | However, it should be easier to implement a repository size cache as we control the whole code | 11:10 |
tlater | But some users say "I don't care" and for those we could leave a few GB on their disk :) | 11:10 |
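The free-space alternative tristan mentions is cheap to check, since asking the filesystem for remaining space is effectively instant compared to walking the cache. A sketch, with an arbitrary headroom value:

```python
import shutil

HEADROOM = 2 * 1024 ** 3  # arbitrary example: "always leave 2 GiB free"

def needs_cleanup(cache_path):
    # statvfs-backed and effectively instant, unlike du-style walks,
    # but it makes for an awkward configuration API, as noted above.
    return shutil.disk_usage(cache_path).free < HEADROOM
```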
juergbi | (it's generally not trivial due to parallel operations, though) | 11:10 |
tlater | How do DBs solve this? | 11:11 |
juergbi | non-cache DBs can't implicitly expire stuff | 11:11 |
juergbi | a DB-like approach for handling the parallel size updates would definitely be possible | 11:12 |
tristan | You would likely need an active process for CAS | 11:12 |
tristan | wouldn't be nice and simple like SQLite | 11:12 |
juergbi | no, something SQLite-like (or an actual SQLite DB) should work as well | 11:12 |
tristan | would be full-blown, complex IPC-level stuff | 11:13 |
tristan | oh ? | 11:13 |
tristan | perhaps, I never actually trusted parallel writes and file locking with SQLite in production | 11:13 |
juergbi | SQLite is well-tested, I don't expect any issues if we are ok depending on it | 11:14 |
juergbi | with WAL (write ahead logging) it uses a SHM area for coordination | 11:15 |
tristan | I've done a lot of many-readers/one-writer with SQLite | 11:15 |
juergbi | but doesn't require a daemon | 11:15 |
tristan | With WAL indeed | 11:15 |
tristan | WAL gives you a huge performance boost with many-readers-one-writer, you either read the last commit or the next, but never block | 11:16 |
tristan | whether that means you trust multiple processes to not futz things up is another story | 11:16 |
tristan | (at write time) | 11:16 |
tristan | but perhaps it has gotten better in recent years | 11:16 |
juergbi | yes, in general you can't use it if you don't trust the other processes | 11:16 |
juergbi | but that shouldn't be an issue at least for the local artifact cache case | 11:16 |
juergbi | and hopefully also not for the server side | 11:17 |
*** aday has quit IRC | 11:18 | |
juergbi | a small SHM area and a cross-process mutex might even be sufficient | 11:18 |
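A sketch of the SQLite-backed bookkeeping juergbi is suggesting: with WAL enabled, multiple BuildStream processes can update a shared size counter without a coordinating daemon. The schema and helper names here are assumptions for illustration, not an agreed design:

```python
import sqlite3

def open_size_db(path):
    db = sqlite3.connect(path, timeout=30.0)
    db.execute("PRAGMA journal_mode=WAL")  # readers never block the writer
    db.execute("CREATE TABLE IF NOT EXISTS cache_size "
               "(id INTEGER PRIMARY KEY CHECK (id = 0), bytes INTEGER NOT NULL)")
    db.execute("INSERT OR IGNORE INTO cache_size VALUES (0, 0)")
    db.commit()
    return db

def add_bytes(db, delta):
    # One short write transaction per artifact commit; SQLite
    # serializes concurrent writers across processes.
    with db:
        db.execute("UPDATE cache_size SET bytes = bytes + ? WHERE id = 0",
                   (delta,))

def current_size(db):
    return db.execute("SELECT bytes FROM cache_size WHERE id = 0").fetchone()[0]
```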
tristan | about 17 seconds to `du -hs` my cache, which is nice and small at 32gigs | 11:18 |
juergbi | there will likely be a huge difference depending on how hot the kernel dentry cache is with regard to the directory tree | 11:19 |
juergbi | twice in a row the second one will likely be very fast | 11:19 |
tristan | yes | 11:19 |
*** aday has joined #buildstream | 11:20 | |
tristan | So, likely this means a really obnoxious delay at the beginning of `bst build` and `bst pull` | 11:21 |
tristan | when it's the first time in a while | 11:21 |
tristan | and cold cache | 11:21 |
tristan | after that, the later checks won't be as costly | 11:22 |
tlater | That's presumably only on Linux though | 11:22 |
jmac | On the server we'd just record size updates in bst-artifact-receive, no? | 11:22 |
tlater | Other platforms might do worse | 11:22 |
tristan | jmac, on the server side we've been stalling on implementing a "stopgap" for too long, and it will be entirely obsoleted by CAS | 11:22 |
tristan | but with `bst-artifact-receive` the problem is certainly not simpler | 11:23 |
tristan | assume you have approximately 8 artifacts being simultaneously uploaded at all times | 11:23 |
juergbi | tristan: not really obsoleted, no. it's just a 'touch' on access and then use the same code | 11:23 |
tristan | juergbi, we're not getting rid of the OSTree remote cache ?? | 11:24 |
tristan | I thought we are going CAS all the way | 11:24 |
tristan | no more ? | 11:24 |
juergbi | yes, we are | 11:24 |
jmac | tristan: It's only an atomic update of one integer | 11:24 |
juergbi | but the basic approach of artifact expiry can be very similar | 11:24 |
tristan | Right | 11:24 |
juergbi | one nice difference being that we can use access time instead of mtime, but the rest stays pretty much the same | 11:25 |
tristan | juergbi, I see, so you don't anticipate that we synchronize cleanups on the server ? | 11:25 |
juergbi | we probably should, yes, and not just on the server, also on the client side | 11:26 |
juergbi | but I see this as orthogonal to OSTree vs. CAS | 11:26 |
tristan | jmac, it is most certainly not as simple as just an atomic update of one integer; right now you have hundreds of hashed objects in an artifact, you have to know which ones already exist in the cache to know how much you grow the cache, and 8 separate uploads are happening at once | 11:28 |
tristan | all of that has to get the right number at the end of the day | 11:28 |
juergbi | it should definitely be easier with all code being under our control | 11:29 |
tristan | So I suppose it could be done while renaming files from the tempdir to the cache, and observing if it did exist; those operations then need to be made atomic too | 11:29 |
tristan | maybe I'm off base, but I'm much less worried about a coordination process crashing than the side effects of stale locks left behind by independent processes | 11:30 |
Nexus | why does workspace reset contain a lot of cloned code from workspace open instead of just calling workspace open like it used to? | 11:31 |
juergbi | with O_EXCL, lock files can be handled properly | 11:31 |
juergbi | should be overall much easier than a daemon | 11:31 |
tristan | Nexus, loading is an expensive recursive process; you don't want to do it again and again and again | 11:32 |
juergbi | on the server we already have a daemon, so we could implement it like that, however, if we need a daemon-less approach on the client side, it could make sense to do the same on the server | 11:32 |
tlater | Nexus: If you really wanted to you could make that bit of code a separate function so you can call it without loading :) | 11:32 |
jmac | tristan: Each file we upload to the cache either overwrites a file or adds a new file; either way we know how much we've increased storage by. You'd need to stat the existing object, but that's all | 11:33 |
Nexus | tristan: it makes it rather difficult for me to do the cached build tree logic, as i'm basically writing duplicated code | 11:33 |
Nexus | what is currently there only works if you're using sources, not a cached build | 11:33 |
tristan | Nexus, not to mention, you want to reuse Stream._fetch() on the *whole batch* if and when you need it, and if you have a failure, you want it to happen *before* you start modifying workspaces | 11:33 |
tristan | Nexus, in other words, IMO the old code was quite broken in practice while attempting to reuse code | 11:34 |
jmac | If you have two artifacts being uploaded at once which overwrite the same files, then you have a potential ordering problem | 11:34 |
jmac | I think that problem disappears with the CAS, though | 11:34 |
tristan | jmac, right, we upload to a temp cache so that we don't write partial stuff directly; currently they are a safe atomic os.rename(), but (os.rename() + increment size by os.lstat().st_size) has to become atomic afaics | 11:36 |
tristan | I guess it's not *immensely* complicated | 11:36 |
tristan | but it's just ingrained in me, to avoid flock() at all costs | 11:36 |
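One way to picture making the (rename + size-increment) pair atomic without flock() is the O_EXCL lock file juergbi mentions above. Everything here is a hypothetical sketch, not bst-artifact-receive's actual code; add_bytes could be, for instance, the SQLite counter sketched earlier:

```python
import os
import time

def with_lock(lock_path, func):
    # O_CREAT|O_EXCL creation is atomic, so the lock file acts as a
    # simple cross-process mutex without flock().
    while True:
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            time.sleep(0.05)  # another upload holds the lock
    try:
        return func()
    finally:
        os.close(fd)
        os.unlink(lock_path)

def commit_object(tmp_path, obj_path, lock_path, add_bytes):
    # The cache only grows when the hashed object is actually new;
    # holding the lock keeps the existence check, the rename and the
    # size increment consistent across concurrent uploads.
    def _commit():
        grew = 0 if os.path.exists(obj_path) else os.lstat(tmp_path).st_size
        os.rename(tmp_path, obj_path)  # atomic on POSIX
        add_bytes(grew)
        return grew
    return with_lock(lock_path, _commit)
```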
jmac | I'm not sure you need file-level locking; I was imagining a basic service which used multiprocessing's locks | 11:38 |
tristan | jmac, right, that's kind of the idea I was spinning above; rather than using locks, once you have a live server entity you can start to serialize things | 11:38 |
tristan | jmac, I *feel* like it's safer, but juergbi prefers locks to a daemon, and I'm honestly not sure I'm prepared to argue either way right now | 11:39 |
tristan | my feelings on it being safer probably come from the days of the linux 2.6 kernel | 11:40 |
tristan | the world has changed, and people are using SQLite with multiple writing processes to the same DB, so I am out of date ;-) | 11:41 |
tristan | tlater, ok well - all of this started with a "thing" which is not yet solved for you | 11:42 |
Nexus | tristan: ok, well i will need some way of storing whether or not a cached build tree was used originally when opening the workspace | 11:43 |
tristan | tlater, I'll say this then: It seems to cost a not-so-HUGE amount of time, likely ~30 seconds at startup for a 50GB cache, and then later requests the way you have arranged them seem to be low cost on linux | 11:43 |
tristan | tlater, I think that we can run with this for now, but we really may want to seriously reverse this after | 11:44 |
tristan | tlater, it may be worth raising this on the ML for more eyes, too, but I don't want this detail to block landing the cleanup tasks | 11:44 |
tristan | Nexus, How so ? | 11:45 |
tlater | tristan: Alright, I'll also make sure I give the ostree doc another dive when I get the chance to. | 11:45 |
tristan | Nexus, when you reset a workspace, it's quite similar to closing and opening it, what changes here ? | 11:46 |
Nexus | tristan: because from what i can see, it looks like the old workspace is deleted, a new one created, and then opened using sources | 11:47 |
tristan | Nexus, when it comes time to call Element._open_workspace(), you have a clean area to start with | 11:47 |
gitlab-br-bot | buildstream: merge request (valentindavid/359_cross_junction_elements->master: WIP: Allow names for cross junction elements) #454 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/454 | 11:47 |
tristan | Nexus, so I would suppose that Element._open_workspace() can at that time know whether a cached build tree is available as usual | 11:47 |
tristan | Nexus, in other words, I'm not sure how any of this highlevel Stream() stuff makes a difference to the underlying mechanics, even | 11:48 |
Nexus | tristan: because Element._open_workspace() was only going to get called if they didn't want to use a cached build | 11:49 |
tristan | Nexus, note that by the time we hit the element, in that loop in Stream.workspace_reset(), it is already determined whether tracking was done (potentially changing the cache key), so when Element._open_workspace() asks for the cache key, it should have the new updated one | 11:49 |
tristan | Nexus, So then your business logic is too high up in the food chain, when we ask an Element to open a workspace, we are not asking the element to specifically just stage the sources, we are just asking it to open a workspace | 11:50 |
tristan | Nexus, if there is a preference coming from on high at the CLI level, we should pass that preference through as an argument | 11:50 |
*** aday has quit IRC | 11:51 | |
Nexus | tristan: if we wait until the element to decide to use a cached build tree, then we're doing a lot of un-needed work beforehand in the stream.workspace_open function | 11:52 |
*** aday has joined #buildstream | 11:52 | |
tristan | Nexus, I don't understand... Why ? What exactly is the preference, and what time have we wasted ? | 11:52 |
Nexus | you'd be wasting a lot of time on finding all of the sources which would then not be used | 11:53 |
tristan | I.e. we're still going to re-open the workspace in a reset, whether we prefer to use a cached build or not, right ? | 11:53 |
tristan | Ah, because fetch ? | 11:53 |
tristan | Nexus, now you raise an interesting problem :) | 11:54 |
Nexus | :) | 11:54 |
tristan | So, A.) It's important to run self._fetch() on everything before moving on, especially because it might track, and the operations might fail | 11:55 |
*** aday has quit IRC | 11:55 | |
tristan | But B.) It's unneeded to self._fetch() in the case that you want to use a cached build | 11:55 |
*** aday_ has joined #buildstream | 11:56 | |
Nexus | which is why i had the "do you want to use the cache" logic so high up | 11:56 |
tristan | I think (A) is important because the operation can become destructive | 11:56 |
*** aday_ is now known as aday | 11:56 | |
tristan | Nexus, so I'm not sure exactly what you had before, but I presume you were basically "falling back" to fetching the sources only if a cached build is unavailable ? | 11:57 |
Nexus | tristan: unavailable or explicitly not wanted | 11:58 |
tristan | Nexus, I think you can still factor that into the current code, of course you cannot do that if you need to `--track` | 11:58 |
tristan | it will just become a bit more wordy | 11:58 |
tristan | Nexus, I would still delegate the real work to Element._open_workspace() though, and ask it specifically for a cached build first but have it either return something meaningful or raise a meaningful error in the case one was unavailable | 11:59 |
Nexus | well i think we need to make the `--track` and `--cached-build` flags mutually exclusive, because you can't have both | 12:00 |
tristan | Then again, the (A) problem comes back | 12:00 |
tristan | which quite sucks | 12:00 |
tristan | Well, you surely could have both | 12:01 |
tristan | Nexus, only after you track do you know the new cache key | 12:01 |
tristan | And then depending on that, you probably try another scheduler run to attempt to pull the artifact, etc | 12:01 |
Nexus | assuming you don't want a previous cache? | 12:01 |
Nexus | tristan: i think all of the tracking/fetching logic needs to be moved into the element function, because having that stuff done in both workspace_open and workspace_reset seems wasteful to me | 12:02 |
*** jennis has joined #buildstream | 12:02 | |
*** jennis_ has joined #buildstream | 12:02 | |
tristan | Nexus, If you run `bst workspace reset --track foo.bst`, then you *certainly* are interested in whatever corresponds to the *new* cache key | 12:02 |
tristan | Nexus, I don't see any room for ambiguity there | 12:02 |
Nexus | ok, in the case of reset then yes | 12:03 |
Nexus | but not in open | 12:03 |
tristan | Nexus, no, you cannot put that stuff into the element; that stuff runs schedulers, which creates cyclic ownership weirdness | 12:03 |
Nexus | :/ | 12:03 |
tristan | When opening a workspace, if you `bst workspace open --track foo.bst`, AGAIN you can only possibly be interested in the *tracked* cache key | 12:04 |
tristan | The problem with grouped workspace commands is basically that you want to succeed or fail as a group | 12:04 |
tristan | Or get as damn close to it as possible | 12:05 |
tristan | it's pretty ugly to have it fail but have also modified stuff | 12:05 |
tristan | Nexus, so, first of all it seems to me A.) you won't want to use the shared Stream._fetch() function, or at least you may need to modify it to allow track() without fetch() B.) You probably want to share code with workspace_open() and workspace_reset() C.) workspace_open() is going to be another target of multiple element operations | 12:17 |
tristan | Nexus, the code for this logic seems to involve tries with uncertain success and then fallbacks to other things, in order to safely share this... one option might be to do the work in a temp dir first before committing the results | 12:18 |
tristan | Nexus, i.e. in this light, the worst thing which can happen at "commit" time is that we fail to remove the original directory and replace it with the newly created open workspace | 12:19 |
tristan | This approach would seem to allow for sharing the complex logic of "doing the minimal" for both workspace_open() and workspace_reset(), while minimizing the risk of committing a partially successful result | 12:20 |
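A sketch of the temp-dir-then-commit shape tristan describes, under assumed names. The staging callback stands in for either "stage the cached build tree" or "fetch and stage sources"; only the final swap can leave the workspace half-modified:

```python
import os
import shutil
import tempfile

def open_workspace(target_dir, stage_func):
    # Hypothetical helper: stage_func does all the fallible work
    # against a scratch directory next to the real workspace.
    parent = os.path.dirname(os.path.abspath(target_dir))
    tmp_dir = tempfile.mkdtemp(prefix=".workspace-", dir=parent)
    try:
        stage_func(tmp_dir)
    except BaseException:
        # A failure during staging leaves the original untouched.
        shutil.rmtree(tmp_dir, ignore_errors=True)
        raise
    # "Commit" time: the worst remaining failure is between these steps.
    if os.path.exists(target_dir):
        shutil.rmtree(target_dir)
    os.rename(tmp_dir, target_dir)
```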
Nexus | ok, this is going to need a bit of planning then | 12:21 |
finn | Competition time. | 12:27 |
finn | Can anyone think of a better name for BuildFarm? | 12:28 |
tlater | Dam | 12:28 |
tristan | Hah | 12:28 |
tristan | finn, We need a plumbing analogy :) | 12:29 |
tristan | what is a network of pipelines ? | 12:29 |
finn | MarioWorld | 12:29 |
finn | Arboretum | 12:30 |
skullman | Interchange | 12:31 |
skullman | SewageTreatmentPlant | 12:31 |
jmac | Pipemania | 12:32 |
finn | WaterSupply | 12:32 |
tristan | 'Sewage' on its own is cute haha | 12:32 |
tlater | Hmm, where else do pipes go if not sewage? | 12:33 |
tlater | Pool? | 12:33 |
tlater | BuildPool? Meh | 12:33 |
finn | Bayou | 12:33 |
skullman | WateringHole | 12:33 |
tlater | River? Many streams? | 12:34 |
tlater | Sea/Ocean? | 12:34 |
tristan | Irrigation ! | 12:34 |
skullman | Hydraulics | 12:34 |
tlater | Heh, irrigation is neat, but sounds too technical | 12:35 |
skullman | Hydroponics | 12:35 |
finn | Although irrigation is used in farming | 12:35 |
skullman | hm, a bit close to NixOS' Hydra. | 12:36 |
skullman | sewage.buildstream.gnome.org would clearly be the artifact server containing the failed builds | 12:37 |
finn | Confluence | 12:37 |
skullman | Effluence | 12:37 |
tristan | Conduit ? | 12:38 |
skullman | ooh, Conduit's a good one | 12:38 |
finn | Conflux | 12:38 |
* tlater likes conduit, too | 12:38 | |
finn | it's a pipe which channels water | 12:40 |
finn | plenty of puns in there | 12:40 |
finn | Conduit then? | 12:40 |
tristan | It's got a ring to it, I wouldn't throw in the towel just yet | 12:41 |
tristan | (might clog up the conduit) | 12:41 |
dominic | ahh, never knew conduit had another meaning | 12:42 |
tlater | dominic: "Another"? | 12:44 |
tristan | I like it though, seems like there are opportunities for naming of accompanying tooling too, like "Plunger" or "Faucet" | 12:45 |
dominic | as in, I only knew of an electrical conduit | 12:45 |
tlater | Oh, right | 12:45 |
dominic | so was confused about the suggestion at first | 12:45 |
tristan | "Just extract your build results with Faucet...", or "Did you try freeing up some resources with Plunger ?" | 12:46 |
* tlater just renamed his cleanupjob to plunger | 12:46 | |
jmac | ISTR one of the common complaints about Baserock was the silly names for components | 12:47 |
paulsherwood | +1 | 12:47 |
paulsherwood | there were others, of course :) | 12:48 |
aiden | a manifold is sort of a network of pipes | 12:49 |
toscalix | valentind: does the feature you are working on correspond to this feature request? https://gitlab.com/BuildStream/buildstream/issues/328 | 12:49 |
valentind | No, #330. | 12:50 |
toscalix | thanks | 12:50 |
tristan | jmac, paulsherwood ... that was a good idea for naming, just... the analogies were SO unpopular | 12:50 |
tristan | what is the field even called again ? mineralogy or smth ? | 12:50 |
tristan | some ology | 12:50 |
paulsherwood | tristan: morphology. and stratum. whoosh... | 12:51 |
toscalix | finn: BuildFarm - BuildGrid | 12:51 |
tristan | yeah, not that ology, morphology is a word in the field of ... somethingology | 12:51 |
finn | BuildGrid is simple | 12:52 |
finn | like it | 12:52 |
jmac | +1 for BuildGrid | 12:52 |
paulsherwood | +1 | 12:52 |
paulsherwood | (assuming it actually gives a sense of what it is) | 12:52 |
tlater | At that point we could just go for BuildFram | 12:52 |
tlater | *Farm | 12:52 |
tristan | I liked Conduit better, but not against BuildGrid | 12:52 |
tlater | I don't think Grid is very obvious | 12:52 |
tristan | I thought the reason to not be BuildFarm is... obviously that is Google branding for a similar thing | 12:53 |
tristan | it really makes no sense to be BuildFarm | 12:53 |
finn | no | 12:53 |
finn | I don't want to have to keep saying Bazel / Uber BuildFarm and BuildStream BuildFarm | 12:54 |
tristan | finn, I agree, and think we should have a distinctive name | 12:54 |
tristan | even if it's said to have compatible components which adhere to some BuildFarm standards | 12:54 |
tristan | it's better to distinguish | 12:55 |
tristan | Then again... is there even actually such a component ? | 12:55 |
tristan | From what I understand, we are already recycling the ultraboring name CAS | 12:56 |
tristan | because we were too lazy to come up with a name for it | 12:56 |
tristan | finn, is there even a "thing" to be named ? | 12:56 |
finn | a project repo | 12:56 |
tristan | no executable or service ? | 12:57 |
tristan | or library even ? | 12:57 |
*** jennis_ has quit IRC | 12:58 | |
*** jennis has quit IRC | 12:58 | |
tlater | finn: Won't that project produce a daemon for a server to run? That was my impression of this, at least. | 12:58 |
*** jennis has joined #buildstream | 13:01 | |
*** jennis_ has joined #buildstream | 13:01 | |
tristan | in any case, you can have a grid of pipes, but you don't grow pipelines on your farm, so I think grid is already better than farm (if there is indeed a thing to name; if there is not a tangible thing to name, that's just boring :)) | 13:01 |
finn | It will be a background service for a server to run, so I think that's yes to your question - though I'm no Computer Scientist | 13:02 |
* tristan goes to harvest a fried chicken at the fried chicken grid | 13:03 | |
skullman | grid works analogously to the power grid, as something you connect to, to get work done | 13:04 |
*** tristan has quit IRC | 13:06 | |
*** jennis_ has quit IRC | 13:08 | |
*** jennis has quit IRC | 13:08 | |
*** jennis has joined #buildstream | 13:10 | |
*** jennis_ has joined #buildstream | 13:10 | |
tlater | Hm, I wonder if the various BuildElement variables are documented somewhere | 13:11 |
tlater | bindir and co | 13:11 |
*** tiagogomes has joined #buildstream | 13:12 | |
*** tiago has quit IRC | 13:13 | |
*** jennis has quit IRC | 13:14 | |
*** jennis_ has quit IRC | 13:14 | |
*** solid_black has joined #buildstream | 13:35 | |
*** solid_black has quit IRC | 13:39 | |
*** tristan has joined #buildstream | 13:40 | |
*** tiagogomes has quit IRC | 13:50 | |
*** bethw has quit IRC | 14:24 | |
*** bethw has joined #buildstream | 14:30 | |
*** tiagogomes has joined #buildstream | 14:48 | |
gitlab-br-bot | buildstream: merge request (jmac/virtual_directories->master: WIP: Abstract directory class and filesystem-backed implementation) #445 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/445 | 15:38 |
gitlab-br-bot | buildstream: merge request (valentindavid/359_cross_junction_elements->master: WIP: Allow names for cross junction elements) #454 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/454 | 16:02 |
*** dominic has quit IRC | 16:04 | |
*** toscalix has quit IRC | 16:08 | |
*** bethw has quit IRC | 16:20 | |
*** jonathanmaw has quit IRC | 17:03 | |
*** bochecha_ has joined #buildstream | 17:19 | |
*** ernestask has quit IRC | 18:05 | |
*** tristan has quit IRC | 19:28 | |
gitlab-br-bot | buildstream: merge request (valentindavid/359_cross_junction_elements->master: WIP: Allow names for cross junction elements) #454 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/454 | 20:00 |
*** aday has quit IRC | 20:51 | |
*** aday has joined #buildstream | 20:52 | |
*** aday has quit IRC | 21:38 | |
*** bochecha_ has quit IRC | 22:39 | |
*** bochecha_ has joined #buildstream | 22:41 | |
*** bochecha_ has quit IRC | 23:11 | |
gitlab-br-bot | buildstream: merge request (chandan/393-fix-workspace-no-reference->master: WIP: element.py: Fix consistency of workspaced elements when ref is missing) #462 changed state ("opened"): https://gitlab.com/BuildStream/buildstream/merge_requests/462 | 23:49 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!