Run OPTIMIZE TABLE when required #101

Open
wants to merge 1 commit into
base: master

Conversation

nicois
Contributor

@nicois nicois commented Nov 24, 2022

Due to changes in the underlying MySQL version, it may be necessary to run OPTIMIZE TABLE on tables which have been ALTERed, to prevent corruption.

xtrabackup detects this situation and reports which tables are impacted, so myhoard simply needs to carry out this operation.

@nicois nicois requested a review from a team as a code owner November 24, 2022 03:16
@codecov-commenter

codecov-commenter commented Nov 24, 2022

Codecov Report

Base: 76.59% // Head: 76.81% // Increases project coverage by +0.22% 🎉

Coverage data is based on head (195f83e) compared to base (cbd6e9a).
Patch coverage: 80.76% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #101      +/-   ##
==========================================
+ Coverage   76.59%   76.81%   +0.22%     
==========================================
  Files          16       16              
  Lines        3905     3930      +25     
  Branches      932      940       +8     
==========================================
+ Hits         2991     3019      +28     
+ Misses        702      688      -14     
- Partials      212      223      +11     
Impacted Files                            Coverage Δ
myhoard/basebackup_operation.py           89.86% <63.63%> (-2.86%) ⬇️
myhoard/util.py                           89.44% <93.33%> (+0.13%) ⬆️
myhoard/basebackup_restore_operation.py   80.76% <0.00%> (-3.85%) ⬇️
myhoard/controller.py                     79.07% <0.00%> (-0.21%) ⬇️
myhoard/restore_coordinator.py            78.79% <0.00%> (+2.63%) ⬆️


@nicois nicois force-pushed the nicois/INC-348-optimise-tables-on-error branch 3 times, most recently from 195f83e to e4057b6 Compare November 24, 2022 05:24
Due to changes in the underlying mysql version, it may be required to
run OPTIMIZE TABLE on tables which have been ALTERed, to prevent
corruption.

xtrabackup detects this situation and reports which tables are impacted,
so myhoard simply needs to carry out this operation.
@nicois nicois force-pushed the nicois/INC-348-optimise-tables-on-error branch from e4057b6 to 78d7b26 Compare November 24, 2022 05:31
@andygrunwald

This change originated from INC-348

Contributor

@alanfranz alanfranz left a comment

  1. GitHub is reporting conflicts that need to be addressed.
  2. Some tests are failing.
  3. The implemented behaviour seems to be very xtrabackup-version dependent, but we currently only test with one xtrabackup version, while in prod for Aiven we employ the same version for mysql and xtrabackup.

This change won't appear for mysql < 8.0.30 and xtrabackup 8.0.30, btw; but I think we should pin the xtrabackup version in the CI files, otherwise this will break as soon as xtrabackup 8.0.30 is out.

for line in xtrabackup_lines[2:]:
    if line.startswith("Please run OPTIMIZE TABLE"):
        return
    yield line.replace("/", ".")
Contributor

The output of this function is fed directly to the db via string interpolation. I think we'd need a bit of validation here: check for valid chars only, or perform some escaping, so that the wrong data cannot enter the db.
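
One way to do the validation suggested here is an allow-list check before a name ever reaches the query. The sketch below is only illustrative: `quote_identifier` and `build_optimize_statement` are hypothetical helpers, not part of myhoard, and the allowed character set is an assumption (MySQL permits more characters inside quoted identifiers than this pattern accepts).

```python
import re

# Assumption: only plain identifiers are accepted; anything else is rejected
# rather than escaped. This is stricter than what MySQL itself allows.
_SAFE_IDENTIFIER = re.compile(r"[A-Za-z0-9_$]+")


def quote_identifier(name: str) -> str:
    """Return the name wrapped in backticks, or raise ValueError if it looks unsafe."""
    # fullmatch ensures the whole string, including any trailing newline, is checked.
    if not _SAFE_IDENTIFIER.fullmatch(name):
        raise ValueError(f"Refusing to use suspicious identifier: {name!r}")
    return f"`{name}`"


def build_optimize_statement(db: str, table: str) -> str:
    """Compose the statement from separately validated database and table names."""
    return f"OPTIMIZE TABLE {quote_identifier(db)}.{quote_identifier(table)}"
```

Rejecting unexpected names outright, instead of trying to escape them, keeps the failure loud: a malformed line in the xtrabackup output becomes an exception rather than a statement sent to the server.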

Contributor

Also: I'd return a better object, e.g. a 2-tuple with (db, table), then I'd compose it when creating the query, so that we can validate the names separately and we don't have to deal with string content if we want to perform some other manipulation in the future.
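
A rough sketch of the (db, table) idea, assuming the relevant xtrabackup lines have the `db/table` shape implied by the `replace("/", ".")` call under review, and that any header lines were already dropped by the caller. `parse_tables_to_optimise` is a hypothetical name and the sample lines in the usage note are made up.

```python
def parse_tables_to_optimise(xtrabackup_lines: list[str]) -> list[tuple[str, str]]:
    """Collect (db, table) pairs until the 'Please run OPTIMIZE TABLE' line is reached."""
    tables: list[tuple[str, str]] = []
    for raw_line in xtrabackup_lines:
        line = raw_line.strip()
        if line.startswith("Please run OPTIMIZE TABLE"):
            break
        # Split on the first "/" only, so db and table stay separate values.
        db, separator, table = line.partition("/")
        if not separator or not db or not table:
            raise ValueError(f"Unexpected line in xtrabackup output: {raw_line!r}")
        tables.append((db, table))
    return tables
```

With input such as `["shop/orders", "shop/customers", "Please run OPTIMIZE TABLE ..."]`, the caller gets two pairs back and can validate each name on its own before composing the query.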

@@ -25,6 +25,29 @@
ERR_TIMEOUT = 2013


def get_tables_to_optimise(error_lines: list[str], *, log: Logger) -> Iterator[str]:
Contributor

I'd really prefer a non-lazy return type. Why?

I doubt there's a performance issue here that generating an iterator would help with, and accidentally reusing an iterator is quite a common bug.
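
To illustrate the reuse problem with a small, made-up example (the function names and sample lines are hypothetical, not taken from the PR): a generator can only be consumed once, so a second pass over it silently yields nothing, while a list can be iterated as often as needed.

```python
from typing import Iterator


def lazy_tables(lines: list[str]) -> Iterator[str]:
    """Generator version: values are produced on demand and only once."""
    for line in lines:
        yield line.replace("/", ".")


def eager_tables(lines: list[str]) -> list[str]:
    """List version: all values are materialised up front."""
    return [line.replace("/", ".") for line in lines]


sample_lines = ["shop/orders", "shop/customers"]

lazy = lazy_tables(sample_lines)
first_pass = list(lazy)   # consumes the generator
second_pass = list(lazy)  # generator is exhausted: empty, and no error is raised

eager = eager_tables(sample_lines)
eager_first = list(eager)
eager_second = list(eager)  # same contents both times
```

The second pass over the generator fails quietly, which is what makes this kind of bug easy to miss, for example when the result is logged first and then iterated again to run the statements.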

@alanfranz
Contributor

Also, I think that #102 should work as well?

4 participants