Add ostree checkout overwrite mode "--update" #1408

wmanley · 2018-01-11T10:27:14Z

ostree checkout --update=FROM_COMMIT TO_COMMIT DESTINATION

will assume that DESTINATION already contains a checkout of FROM_COMMIT and will attempt to do the least amount of work to make it contain TO_COMMIT.

The intention is to enhance ostree admin deploy to update previous deployments rather than checking out whole new ones. This should be faster. Currently on our system with slow eMMC disk it takes 1m40s to do a deploy (~150000 files). Ideally if only one or two files have changed it should be nearly instantaneous - suitable for interactive development. This should also reduce eMMC wear. This can happen in future PRs though.

TODO:

Write a test so we're sure it's not touching directories that are unmodified.
Decide how subtree checkouts should work

cgwalters · 2018-01-11T14:48:13Z

I haven't looked at your patches here in detail yet but just a quick coment: it seems pretty related to rpm-ostree livefs work; see also:
#1224

If this is about development though I find ostree admin unlock to be super useful. I could certainly imagine teaching libostree how to do "live deployment updates" on top of the overlayfs.

wmanley · 2018-01-15T16:46:07Z

bash/ostree

    local options_with_args="
        --from-file
        --fsync
+        --replace


This should be --update

wmanley · 2018-01-15T16:46:13Z

bash/ostree

            __ostree_compreply_all_files
            return 0
            ;;
+        --replace)


wmanley · 2018-01-15T16:46:24Z

man/ostree-checkout.xml

+            become the destination checkout target.  If COMMIT destination
+            already exists, command will error unless one of the
+            <option>--union</option>, <option>--union-add</option>,
+            <option>--union-identical</option> or <option>--replace</option>


wmanley · 2018-01-15T16:46:33Z

man/ostree-checkout.xml

            </varlistentry>

+            <varlistentry>
+                <term><option>--replace</option>="COMMIT"</term>


wmanley · 2018-01-15T16:47:14Z

src/ostree/ot-builtin-checkout.c

+      if (opt_update && (opt_union || opt_union_add || opt_union_identical))
+        {
+          g_set_error (error, G_IO_ERROR, G_IO_ERROR_FAILED,
+                       "Cannot combine --replace with --union, "


This should be --update too

cgwalters · 2018-01-18T15:20:33Z

Pretty interesting stuff, thanks a lot for working on this! The code looks sane at a very quick glance, but I'm going to need a little bit of time to think about it. In particular I want to try to figure out if this applies to what we have today in rpm-ostree ex livefs and also my idea from #1224

But it'd be great if we can pin down the higher level use case again - do you see this as just a development thing or do you see it as potentially useful for production hotfixes?

In the rpm-ostree livefs issue one concrete use case I had was being able to hotfix OpenSSH, which is a classic example of a service that is pretty safe to hotfix.

wmanley · 2018-01-23T13:52:05Z

But it'd be great if we can pin down the higher level use case again - do you see this as just a development thing or do you see it as potentially useful for production hotfixes?

This is just intended as supporting work for #1419. The motivation for that is faster deploys with large numbers of files on a slow filesystem. Currently it takes 1m40s to do a deploy and a reboot takes about 10s. This is too slow - we want to be able to deploy a new version of software to our devices fast enough that our customers don't notice any outage. In particular we want to add some more containers which increases the number of files. This will make it even slower.

We can't do a deploy while our customers are using the systems because doing the checkout monopolises the disk. As a side effect it makes development faster too. I'll expand on how our build/deploy system works a little later, when I have time...

Note: I'm not live-updating the current deployment with this command. This isn't a use case I'd considered, but I can see how it could work out well. I'll expand on this on #1419.

I've made this a separate pull-request because I think it's easier to review that way and easier to test in isolation than bundled with #1419.

wmanley · 2018-01-28T19:01:09Z

I definitely prefer your approach of recording the tree SHAs in xattrs as suggested in #1224. It's simpler and less error-prone to passing the xattrs in to the --update.

In #1419 I introduced the concept of the deploy cache which stored old deploys named by the commit id. The commit id would be retrieved from the directory name when doing a new deploy. This would be passed to this --update mode when doing a deploy.

This xattrs approach is better for a few reasons:

The way the csums are stored is not specific to ostree admin deploy, they can also be set when doing a normal checkout.
I don't have to special-case /etc when updating deploys. In WIP: deploy: Attempt accelerated deploy by updating old deploy #1419 I was working around the fact that the csum in of the deploy wouldn't match the real checked-out data because /etc had been merged. Because these trees would be csumed individually that's no longer an issue - we just unset the root csum xattr before doing etc merge.
Updating a live deploy in place becomes possible because we don't need to rename the deployment just because the sha of the contents have changed
Updating a live deploy becomes possible because we can cope with non-atomically updated trees - a directory that temporarily doesn't correspond to a particular SHA doesn't have that SHA set on its xattrs. Theres no risk of a partial update being undetected (assuming the appropriate calls to sync).

I'm not saying that live updates would be a good idea, but they'd certainly be possible under this scheme.

I'll have a think about what it would take to implement that.

cgwalters · 2018-02-06T19:56:51Z

As a general request; can you file an issue first for bigger features so we can sketch out the design/use cases a bit before we do the code? Getting big patches is obviously great, not going to complain 😉!

So...are you thinking it makes sense to do #1224 first? I can probably take a crack at that sometime soon myself.

wmanley · 2018-02-07T14:55:06Z

As a general request; can you file an issue first for bigger features so we can sketch out the design/use cases a bit before we do the code? Getting big patches is obviously great, not going to complain 😉!

Yeah, I agree that this is ideally how it would work. In this case time pressures have meant that we don't always have time for this. I spent probably 2-3 days on this. We're already shipping this and #1419 on our product. It was a prerequisite for other important customer-facing work (https://stb-tester.com/blog/2018/02/06/stb-tester-v28-released)

As such I'm not precious about changing it to get it merged properly so I don't think we've lost much with this approach in this instance. I won't be upset if you ask for changes or flat out say no :).

The other element is that I didn't want to propose it until I knew I could do it. I had a stab at implementing this over a year ago - but failed. Your commit e6f17b9 made it much easier for me to understand the code, because it removed a layer of abstraction. So when I had another go and succeeded I raised this PR. I wouldn't have felt good about coming along and starting a discussion about a new feature that I wanted if I felt like I'd be "demanding" that others implement it for me,

So...are you thinking it makes sense to do #1224 first? I can probably take a crack at that sometime soon myself.

I think the only thing that would change in this PR if #1224 were implemented is where the initial tree sha would come from. So changes here would depend only on a small subset of #1224 - specifically recording the SHAs in the xattrs.

How do you see this working? The two questions in my mind are:

When does the caller make the guarantee that the directories won't be modified? For the xattr to actually match the contents of the directory the contents can't change between when the xattr is set to when it is overwritten. I see a few options:
1. Have an option to the intiial ostree checkout like --record-dirtree-xattrs. This is a promise to ostree that the caller won't modify the checked-out tree without clearing the xattrs. And then when you come to do an ostree checkout --update ostree just trusts that the xattrs are correct.
2. Always record dirtree xattrs when checking out and have an option like --trust-dirtree-xattrs when updating a checkout. This flag would signify that the user hasn't modified the checkout since the initial checkout.
3. Implement both flags
4. Have a single flag that says both that the xattrs are trustworthy and that xattrs should be recorded. This could be implied by ostree checkout --update
5. There's no need for an additional flag - ostree checkout --update implies this mode.
How to handle sync/fsync? When updating you'd have to clear the xattr, then update the directory contents, then set the new xattr. If there's no fsync then a power loss could lead to inconsistency between the xattrs and the contents. At the same time we don't want to add a whole bunch of fsync calls as it will be too slow (1810de2). I would propose:
1. Recurse through removing the xattrs,
2. syncfs
3. Update the contents of the directories
4. syncfs
5. Recurse through adding the new xattrs.
The increased complexity is probably not unmanageable, but isn't ideal either. An alternative is to not worry about it and push the complexity to a higher level. I've already had to handle this case in WIP: deploy: Attempt accelerated deploy by updating old deploy #1419 (See b68ecc2) so maybe it's not that big a deal - but having reliable xattrs would allow updates in place. Maybe these additional guarantees (and the associated complexity) can be pushed into the future.

Thoughts?

wmanley · 2018-02-12T16:35:50Z

I'm going away on holiday so I'm not going to be able to look at this until March. I'd still be interested in your thoughts in the mean time, I'm just letting you know that I won't be doing anything about them for a little while.

cgwalters · 2018-06-16T12:07:04Z

We can't do a deploy while our customers are using the systems because doing the checkout monopolises the disk. As a side effect it makes development faster too.

I get the development case for sure. But I want to understand better how you envision using this for production. I reread through the thread and your patches, but some important details aren't yet clear to me.

Let's say we have two deployments, current and rollback (the ostree default). Normally when we do an upgrade, ostree will end up GCing the rollback but only after an upgrade has been completed - in the bootloader order we always have 2, but on disk we will transiently have 3 checkouts. However, we clearly could, on upgrade, immediately "take ownership" of the rollback. Is something like that what you're thinking? I see the "deploy cache" in later patches.

The downside of this is that the upgrade is less "transactional" in that the rollback will vanish if we're interrupted. Today we don't have ostree rollback but it's trivial to implement (as it is in rpm-ostree); we could add an API for it, and have that API "resurrect" the rollback. I'm not too concerned about this though; in practice I feel like the rpm-ostree deploy <version> approach is nicer (another thing to drive into ostree CLI).

Am I right that your approach is about taking control over the rollback?

papr-bot · 2018-06-16T12:07:06Z

Can one of the admins verify this patch?

cgwalters · 2018-06-16T12:13:59Z

Some other notes I have:

The xattr diffing is an enhancement but I don't see a need to require it; a valid concern you have is eMMC wear, and (AIUI) the extra reads we do as part of a full diff traverse aren't a problem for that

We should be able to rebase and land some of the preparatory patches now.

I don't have to special-case /etc when updating deploys.

This is definitely a concern - how we handle the ostree 3-way merge. Are you using systemd today? We need to think about how this feature intersects with staged deployments as I'd really like to make that the default in the future.

What filesystem are you using? Does it have reflinks? That helps a lot for /etc. Although presumably you can also minimize /etc for your use case.

wmanley · 2018-06-22T11:50:44Z

Let's say we have two deployments, current and rollback (the ostree default). Normally when we do an upgrade, ostree will end up GCing the rollback but only after an upgrade has been completed

Yes, so in #1419 I've modified the GC code to move unused deploys to the cache rather than deleting it:

https://github.com/stb-tester/ostree/blob/09a1e345573caf9c30e70bbf36674f81f88fced6/src/libostree/ostree-sysroot-cleanup.c#L443-L444

This is a lot faster than recursively deleting the deploy as it just involves a rename and we can reuse the deploy later without interfering with rollbacks, etc.

Am I right that your approach is about taking control over the rollback?

No, we take control over the last GCed deploy.

This is in preparation for adding `OSTREE_REPO_CHECKOUT_OVERWRITE_UPDATE` where we will need to iterate through two directories simultaneously.

Out of date since e6f17b9.

Used for updating an already checked out tree to a new version. This is intended to make deploys much faster in the future. In default operation `checkout_tree_at` now is doing an update from an empty tree to the given one. This implements diffs for directories. The next commit will introduce the same for files.

The goal is to do ostree deploys faster by updating already checked out trees. This commit continues the work in the previous commit. `checkout_tree_at` now takes an `orig_dirtree_checksum` telling it to assume that the directory it's checking out into already has this content.

The goal is to do ostree deploys faster by updating already checked out trees. This adds an overwrite mode and a new option to specify the original tree. It will do a lot less work when you're updating a tree with a large number of files but only a few files have changed. So if you have a full distro tree checked out and want to update `/usr/bin/ostree` we no longer have to recurse through every file in `/usr/share/doc` to do the update.

This exposes the OSTREE_REPO_CHECKOUT_OVERWRITE_UPDATE functionality added in the previous commit on the command line.

openshift-ci-robot · 2020-07-09T16:24:42Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: wmanley
To complete the pull request process, please assign cgwalters
You can assign the PR to them by writing /assign @cgwalters in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

wmanley · 2020-07-09T16:26:07Z

Rebased on master

cgwalters · 2020-08-19T14:53:29Z

Can you split off some of the prep patches here to a separate PR?

openshift-ci-robot · 2021-04-12T21:19:13Z

@wmanley: PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci · 2023-06-29T14:45:15Z

@wmanley: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/sanity	`ce75735`	link	true	`/test sanity`
ci/prow/images	`ce75735`	link	true	`/test images`
ci/prow/fcos-e2e	`ce75735`	link	true	`/test fcos-e2e`

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

wmanley force-pushed the checkout-update branch 4 times, most recently from cfbbcb7 to aeb8f71 Compare January 11, 2018 12:30

wmanley force-pushed the checkout-update branch from c382f1f to b6af556 Compare January 11, 2018 19:31

wmanley mentioned this pull request Jan 15, 2018

WIP: deploy: Attempt accelerated deploy by updating old deploy #1419

Open

5 tasks

wmanley commented Jan 15, 2018

View reviewed changes

bash/ostree Outdated

local options_with_args="

--from-file

--fsync

--replace

Copy link

Member Author

wmanley Jan 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be --update

wmanley commented Jan 15, 2018

View reviewed changes

bash/ostree Outdated

__ostree_compreply_all_files

return 0

;;

--replace)

Copy link

Member Author

wmanley Jan 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and this

wmanley commented Jan 15, 2018

View reviewed changes

man/ostree-checkout.xml Outdated

</varlistentry>

<varlistentry>

<term><option>--replace</option>="COMMIT"</term>

Copy link

Member Author

wmanley Jan 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and this

wmanley commented Jan 15, 2018

View reviewed changes

wmanley force-pushed the checkout-update branch from b6af556 to 6f2a0ef Compare January 17, 2018 00:50

wmanley added 7 commits July 8, 2020 16:45

checkout: Refactor adding DirDirectoryIter

1fa445f

This is in preparation for adding `OSTREE_REPO_CHECKOUT_OVERWRITE_UPDATE` where we will need to iterate through two directories simultaneously.

checkout: Refactor adding FileDirectoryIter

3e658cd

This is in preparation for adding `OSTREE_REPO_CHECKOUT_OVERWRITE_UPDATE` where we will need to iterate through two directories simultaneously.

checkout_tree_at: Update documentation

0fa8931

Out of date since e6f17b9.

ostree checkout: Add option --update=COMMIT

ce75735

This exposes the OSTREE_REPO_CHECKOUT_OVERWRITE_UPDATE functionality added in the previous commit on the command line.

wmanley force-pushed the checkout-update branch from 6f2a0ef to ce75735 Compare July 9, 2020 16:24

openshift-ci-robot added the needs-rebase label Apr 12, 2021

Add ostree checkout overwrite mode "--update" #1408

Are you sure you want to change the base?

Add ostree checkout overwrite mode "--update" #1408

Conversation

wmanley commented Jan 11, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cgwalters commented Jan 11, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

wmanley Jan 15, 2018

Choose a reason for hiding this comment

Uh oh!

wmanley Jan 15, 2018

Choose a reason for hiding this comment

Uh oh!

wmanley Jan 15, 2018

Choose a reason for hiding this comment

Uh oh!

wmanley Jan 15, 2018

Choose a reason for hiding this comment

Uh oh!

wmanley Jan 15, 2018

Choose a reason for hiding this comment

Uh oh!

cgwalters commented Jan 18, 2018

Uh oh!

wmanley commented Jan 23, 2018

Uh oh!

wmanley commented Jan 28, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cgwalters commented Feb 6, 2018

Uh oh!

wmanley commented Feb 7, 2018

Uh oh!

wmanley commented Feb 12, 2018

Uh oh!

cgwalters commented Jun 16, 2018

Uh oh!

papr-bot commented Jun 16, 2018

Uh oh!

cgwalters commented Jun 16, 2018

Uh oh!

wmanley commented Jun 22, 2018

Uh oh!

openshift-ci-robot commented Jul 9, 2020

Uh oh!

wmanley commented Jul 9, 2020

Uh oh!

cgwalters commented Aug 19, 2020

Uh oh!

openshift-ci-robot commented Apr 12, 2021

Uh oh!

openshift-ci bot commented Jun 29, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

wmanley commented Jan 11, 2018 •

edited

Loading

cgwalters commented Jan 11, 2018 •

edited

Loading

wmanley commented Jan 28, 2018 •

edited

Loading