Skip to content
This repository has been archived by the owner on Jun 12, 2020. It is now read-only.

Broken tables caused by non transactional table operations

Rik Prohaska edited this page Apr 6, 2015 · 3 revisions

Introduction

MySQL 5.6 (and previous versions) does not support transactional data definition at either the SQL interface or the storage engine interface. The lack of transactions data definition can result in inconsistencies in the meta-data that define a table if the MySQL server crashes while data definition operations are running. We show some examples with TokuDB. Similar inconsistencies also occur with InnoDB. The inconsistency problems occur because of the lack of transactional data definition in MySQL.

Table creation

Create table executes the following:

Step 1: MySQL creates the table's FRM file.

Step 2: The storage engine table handler create method is called.

Step 2.1: TokuDB begins a transaction.

Step 2.2: For all indexes (including the status index), TokuDB transactionally creates a row in its tokudb.directory that maps the name for the table's index to the file that contains the data for the index. TokuDB then transactionally creates a file for the index.

Step 2.3: TokuDB commits the transaction.

If the transaction commits, then both the fractal tree mappings in the tokudb.directory and the fractal tree files should exist for the table. Otherwise, no mappings should exist in the tokudb.directory and no fractal tree files should exist for the table. It is all or nothing.

Failure 1: The FRM file is not included in the transaction since step 1 is not part of the transaction. If mysqld crashes after step 1 and before step 2.3 commits, there may be an FRM file for the table without any of the underlying storage engine data files. For TokuDB, there should be NO TokuDB files for the table and NO mappings for the table in the tokudb.directory.

Commands may see Can't find file: 't' (errno: 2) errors. The solution is to delete the FRM file by executing a drop table command.

Table deletion

Drop table executes the following:

Step 1: The storage engine handler delete_table method is called.

Step 1.1: TokuDB begins a transaction.

Step 1.2: For all indexes (including the status index), TokuDB transactionally deletes a row from the tokudb.directory and transactionally unlinks the fractal tree file using the transaction from step 1.1.

Step 1.3: TokuDB commits the transaction.

Step 2: MySQL removes the table's FRM file.

Failure 1: If mysqld crashes after step 1.3 and before step 2 completes, then the FRM file exists for the table but NO file mappings exist in the tokudb.directory and NO fractal tree files exist.

Commands may see Can't find file: 't' (errno: 2) errors. The solution is to delete the FRM file by executing a drop table command.

Partitioned table creation

Create table for a partitioned table executes the following:

Step 1: MySQL creates the table's FRM file.

Step 2: For each partition, MySQL calls the storage engine handler create method.

If there are 10 partitions, there will be 10 calls to the storage engine to create a partition. There is no single transaction that wraps all of these calls. Therefore, failures during step 2 can leave the state of the table in an inconsistent state.

Failure 1: Crash before step 2. The table FRM file exists but there are no tokudb.directory mappings and no fractal tree files for the table.

Commands may see Can't find file: 't' (errno: 2) errors. The solution is to delete the FRM file by executing a drop table command.

Failure 2: Crash during step 2 before the final transaction commit. The table FRM file exists but only the tokudb.directory mappings and the fractal tree files for the partitions that completed step 2 exist.

Commands may see Can't find file: 't' (errno: 2) errors. The solution is to restore the table with the missing fractal tree files and then execute a drop table command.

Partitioned table deletion

Drop table for a partitioned table executes the following:

Step 1: For each partition, the storage engine handler delete_table method is called. Each of the partitions is transactionally deleted separately.

Step 2: MySQL removes the table's FRM file.

Failure 1: A crash inside of step 1 will result in some of the partition's state being deleted but the table's FRM file still exists.

Commands may see Can't find file: 't' (errno: 2) errors. The solution is to restore the table with the missing fractal tree files and then execute a drop table command.

Failure 2: A crash after step 1 and before step 2 completes will result in all of the storage engine state for the table being deleted but the table's FRM file still exists.

Commands may see Can't find file: 't' (errno: 2) errors. The solution is to delete the FRM file by executing a drop table command.

Failure 3: A concurrent query to the information_schema.tokudb_fractal_tree_info may cause tokudb.delete_table to fail for one of the partitions. drop table will fail but the FRM still exists and only a subset of the fractal tree files still exist.

Tables created with create select

Create table as select executes the following:

Step 1: MySQL creates the table's FRM file.

Step 2: The source tables are locked with TokuDB's external_lock method which causes a root transaction for the statement to be started.

Step 3. The target table is created with the TokuDB's create method for the target table. Rows are added to the tokudb.directory and fractal tree files are transactionally created using the root transaction.

Step 4: start_bulk_insert may use the TokuDB loader if the table is empty and the tokudb_prelock_empty variable allows.

Step 5: write_row will be called for each row being inserted into the target table.

Step 6: end_bulk_insert may close the TokuDB loader if it is being used by this transaction.

Step 7: commit or rollback the root transaction.

Failure 1: If a crash occurs after step 1 completes, then there is a table with an FRM file but no files in the storage engine. All of the storage engine files and storage engine file maps should be rolled back.

Partition deletion

Alter table drop partition executes the following:

Step 1: MySQL creates and installs the new FRM file for the table

Step 2: MySQL updates ddl_log.

Step 3: TokuDB deletes the partition table. This transactionally deletes the maps in the tokudb.directory for the fractal trees and transactionally deletes the fractal tree files.

Step 4. MySQL updates ddl_log again.

Tables created with the TokuDB loader

The TokuDB loader uses a merge sort algorithm to bulk load fractal tree files. While the TokuDB loader is in progress, there may be 2 fractal tree files: the original fractal tree file and the new fractal tree file that loader is writing new data into. The TokuDB loader will keep the new fractal tree file and unlink the old file when it successfully completes.

The use of the TokuDB loader is transactional with the following actions when the transaction completes:

Commit action: commit the file mapping to the new fractal tree file, and unlink the old fractal tree file.

Abort action: rollback the file mapping to the old fractal tree file, and unlink the new fractal tree file.

Unexplained inconsistency problems with TokuDB

Problem 1: Rows exist in the tokudb.directory for fractal trees whose underlying data files do NOT exist in the MySQL data directory.

Problem 2: Fractal tree files exist in the MySQL data directory but there are no rows in the tokudb.directory that map these files.

Cleanup

The fractal tree environment open method could clean up the tokudb.directory by deleting all rows that have missing fractal tree files. This would happen after recovery completes and could be disabled by default.

What to do with fractal tree files that have no map in the tokudb.directory? These files can be identified by computing the set of all fractal tree files in the data directory and subtracting those fractal tree files that are in the tokudb.directory. These files can simply be removed.

Clone this wiki locally