Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use explicit rsync options to detect unchanged RPMs when uploading to stagingyum #292

Merged
merged 1 commit into from
Nov 2, 2023

Conversation

ehelms
Copy link
Member

@ehelms ehelms commented Nov 1, 2023

I am proposing we use checksum to determine if an RPM should get synced to stagingyum.theforeman.org. In the current state, using the default of mtime and size, we download the packages from Copr and they all end up getting synced to stagingyum (and thus to yum.theforeman.org) whether they changed or not.

@evgeni @ekohl This is a follow on to our bigger original discussion around not overriding RPMs in production that have not changed.

@evgeni
Copy link
Member

evgeni commented Nov 1, 2023

Ooh, I forgot to tell you…

I did some experiments (outside the scripts here), and it seemed to me that (at least on Fedora 38) it doesn't matter which "is file changed" logic you select and the modification time of a file is always kept uptodate when you use --times.

"But I don't use --times!" you'll say. Well, but --archive, which is defined as

-a, --archive               archive mode; equals -rlptgoD (no -H,-A,-X)

And -t is --times.

In my tests, switching --archive to the more arcane, verbose notation and dropping t from it made the files retain their original mtimes.

Yes I wish there would be --archive --no-times, but there isn't.

@ehelms ehelms changed the title Use checksum to decide if a file gets rsync over to stagingyum Use explicit rsync options to detect unchanged RPMs when uploading to stagingyum Nov 1, 2023
@ehelms
Copy link
Member Author

ehelms commented Nov 1, 2023

Excellent. I've updated this to reflect that idea and spell out each option with verbosity. I also found with my local testing the same result.

I had a mis-step in my testing. Unless I am missing with this change what you meant, this does not produce what we want. I am still getting it detecting everything as having changed.

@@ -5,4 +5,4 @@
USER='yumrepostage'
HOST='web01.osuosl.theforeman.org'

rsync --archive --verbose --partial --one-file-system --delete-after "tmp/$PROJECT/$VERSION" "$USER@$HOST:rsync_cache/$PROJECT"
rsync --perms --devices --recursive --links --owner --group --verbose --partial --one-file-system --delete-after "tmp/$PROJECT/$VERSION" "$USER@$HOST:rsync_cache/$PROJECT"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can drop owner and group - the remote rsync isn't running with super powers and would ignore those anyway (as it can't change ownership to anyone but the user it's running as).
We can also drop devices, as I am kinda sure we won't have devices between our RPMs.
Not sure about perms, this has the potential of leaking the rel-eng users umask and making files unreadable to "other", or do ee ensure somewhere that folders are 755 and files 644 mode?

@evgeni
Copy link
Member

evgeni commented Nov 1, 2023

Yeah, you still need --checksum (or --size-only, but I prefer checksum) to avoid "different timestamp, file changed"

@ehelms
Copy link
Member Author

ehelms commented Nov 1, 2023

Were you saying to drop --archive, trim the list of what we use from it but also use --checksum as our change detection?

The --archive option includes --times implicitly that preserves modification
time and results in all RPMs being thought of as changed. Instead we explicitly
define the options that are included in --archive and drop the use of --times.
@ehelms
Copy link
Member Author

ehelms commented Nov 1, 2023

Yeah, you still need --checksum (or --size-only, but I prefer checksum) to avoid "different timestamp, file changed"

I did a few laps around the round-about but I think I finally exited properly.

@ehelms
Copy link
Member Author

ehelms commented Nov 1, 2023

Note that this change will be needed in CI as well, however, I am proposing to re-use this script in CI to avoid the two-places problem -- theforeman/jenkins-jobs#369

@evgeni
Copy link
Member

evgeni commented Nov 1, 2023

Were you saying to drop --archive, trim the list of what we use from it but also use --checksum as our change detection?

Yes.

@ehelms ehelms merged commit e87725e into theforeman:master Nov 2, 2023
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

2 participants