-
Notifications
You must be signed in to change notification settings - Fork 131
Broken tables caused by non transactional table operations
MySQL 5.6 (and previous versions) does not support transactional data definition at either the SQL interface or the storage engine interface. The lack of transactions data definition can result in inconsistencies in the meta-data that define a table if the MySQL server crashes while data definition operations are running. We show some examples with TokuDB. Similar inconsistencies also occur with InnoDB. The inconsistency problems occur because of the lack of transactional data definition in MySQL.
Create table
executes the following:
Step 1: MySQL creates the table's FRM file.
Step 2: The storage engine table handler create
method is called.
Step 2.1: TokuDB begins a transaction.
Step 2.2: For all indexes (including the status
index), TokuDB transactionally creates a row in its tokudb.directory
that maps the name for the table's index to the file that contains the data for the index. TokuDB then transactionally creates a file for the index.
Step 2.3: TokuDB commits the transaction.
If the transaction commits, then both the fractal tree mappings in the tokudb.directory
and the fractal tree files should exist for the table. Otherwise, no mappings should exist in the tokudb.directory
and no fractal tree files should exist for the table. It is all or nothing.
Failure 1: The FRM file is not included in the transaction since step 1 is not part of the transaction. If mysqld
crashes after step 1 and before step 2.3 commits, there may be an FRM file for the table without any of the underlying storage engine data files. For TokuDB, there should be NO TokuDB files for the table and NO mappings for the table in the tokudb.directory
.
Commands may see Can't find file: 't' (errno: 2)
errors. The solution is to delete the FRM file by executing a drop table
command.
Drop table
executes the following:
Step 1: The storage engine handler delete_table
method is called.
Step 1.1: TokuDB begins a transaction.
Step 1.2: For all indexes (including the status
index), TokuDB transactionally deletes a row from the tokudb.directory
and transactionally unlinks the fractal tree file using the transaction from step 1.1.
Step 1.3: TokuDB commits the transaction.
Step 2: MySQL removes the table's FRM file.
Failure 1: If mysqld
crashes after step 1.3 and before step 2 completes, then the FRM file exists for the table but NO file mappings exist in the tokudb.directory
and NO fractal tree files exist.
Commands may see Can't find file: 't' (errno: 2)
errors. The solution is to delete the FRM file by executing a drop table
command.
Create table
for a partitioned table executes the following:
Step 1: MySQL creates the table's FRM file.
Step 2: For each partition, MySQL calls the storage engine handler create
method.
If there are 10 partitions, there will be 10 calls to the storage engine to create a partition. There is no single transaction that wraps all of these calls. Therefore, failures during step 2 can leave the state of the table in an inconsistent state.
Failure 1: Crash before step 2. The table FRM file exists but there are no tokudb.directory
mappings and no fractal tree files for the table.
Commands may see Can't find file: 't' (errno: 2)
errors. The solution is to delete the FRM file by executing a drop table
command.
Failure 2: Crash during step 2 before the final transaction commit. The table FRM file exists but only the tokudb.directory
mappings and the fractal tree files for the partitions that completed step 2 exist.
Commands may see Can't find file: 't' (errno: 2)
errors. The solution is to restore the table with the missing fractal tree files and then execute a drop table
command.
Drop table
for a partitioned table executes the following:
Step 1: For each partition, the storage engine handler delete_table
method is called. Each of the partitions is transactionally deleted separately.
Step 2: MySQL removes the table's FRM file.
Failure 1: A crash inside of step 1 will result in some of the partition's state being deleted but the table's FRM file still exists.
Commands may see Can't find file: 't' (errno: 2)
errors. The solution is to restore the table with the missing fractal tree files and then execute a drop table
command.
Failure 2: A crash after step 1 and before step 2 completes will result in all of the storage engine state for the table being deleted but the table's FRM file still exists.
Commands may see Can't find file: 't' (errno: 2)
errors. The solution is to delete the FRM file by executing a drop table
command.
Failure 3: A concurrent query to the information_schema.tokudb_fractal_tree_info
may cause tokudb.delete_table
to fail for one of the partitions. drop table
will fail but the FRM still exists and only a subset of the fractal tree files still exist.
Create table as select
executes the following:
Step 1: MySQL creates the table's FRM file.
Step 2: The source tables are locked with TokuDB's external_lock
method which causes a root transaction for the statement to be started.
Step 3. The target table is created with the TokuDB's create
method for the target table. Rows are added to the tokudb.directory
and fractal tree files are transactionally created using the root transaction.
Step 4: start_bulk_insert
may use the TokuDB loader if the table is empty and the tokudb_prelock_empty
variable allows.
Step 5: write_row
will be called for each row being inserted into the target table.
Step 6: end_bulk_insert
may close the TokuDB loader if it is being used by this transaction.
Step 7: commit or rollback the root transaction.
Failure 1: If a crash occurs after step 1 completes, then there is a table with an FRM file but no files in the storage engine. All of the storage engine files and storage engine file maps should be rolled back.
Alter table drop partition
executes the following:
Step 1: MySQL creates and installs the new FRM file for the table
Step 2: MySQL updates ddl_log
.
Step 3: TokuDB deletes the partition table. This transactionally deletes the maps in the tokudb.directory
for the fractal trees and transactionally deletes the fractal tree files.
Step 4. MySQL updates ddl_log
again.
The TokuDB loader uses a merge sort algorithm to bulk load fractal tree files. While the TokuDB loader is in progress, there may be 2 fractal tree files: the original fractal tree file and the new fractal tree file that loader is writing new data into. The TokuDB loader will keep the new fractal tree file and unlink the old file when it successfully completes.
The use of the TokuDB loader is transactional with the following actions when the transaction completes:
Commit action: commit the file mapping to the new fractal tree file, and unlink the old fractal tree file.
Abort action: rollback the file mapping to the old fractal tree file, and unlink the new fractal tree file.
Problem 1: Rows exist in the tokudb.directory
for fractal trees whose underlying data files do NOT exist in the MySQL data directory.
Problem 2: Fractal tree files exist in the MySQL data directory but there are no rows in the tokudb.directory
that map these files.
The fractal tree environment open
method could clean up the tokudb.directory
by deleting all rows that have missing fractal tree files. This would happen after recovery completes and could be disabled by default.
What to do with fractal tree files that have no map in the tokudb.directory
? These files can be identified by computing the set of all fractal tree files in the data directory and subtracting those fractal tree files that are in the tokudb.directory
. These files can simply be removed.