Skip to content
This repository has been archived by the owner on Jun 12, 2020. It is now read-only.

Interactive Debugging of Transaction Conflicts with TokuDB

RIch Prohaska edited this page Oct 9, 2013 · 16 revisions

Interactive Debugging of Transaction Conflicts with TokuDB

I am developing a concurrent application that uses TokuDB to store its database. Sometimes, one of my SQL statements returns with a 'lock wait timeout exceeded' error. How do I identify the cause of this error? First, I need to understand a little bit about how TokuDB transactions use locks. Then, I need to understand how to use the MySQL information schema to look at the current state of the locks.

Transactions and Locks

TokuDB uses key range locks to implement serializable transactions. These locks are acquired as the transaction progresses. The locks are released when the transaction commits or aborts.

TokuDB stores these locks in a data structure called the lock tree. The lock tree stores the set of range locks granted to each transaction. In addition, the lock tree stores the set of locks that are not granted due to a conflict with locks granted to some other transaction.

TokuDB 7.1 provides access to the set of locks taken and the set of locks that are blocked via tables in the MySQL information schema. In addition, TokuDB stores the last lock conflict information in a session variable that can be retrieved by the application.

Example

Here is an example debug session.

I create a table with a single column that is the primary key.

mysql> show create table t;
Create Table: CREATE TABLE `t` (
  `id` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=TokuDB DEFAULT CHARSET=latin1

I have 2 MySQL clients with ID's 1 and 2 respectively. MySQL client 1 inserts some values into the table 't'. TokuDB transaction 51 is created for the insert statement. Since autocommit is off, transaction 51 is still live after the insert statement completes, and I can query the 'tokudb_locks' information schema table to see the locks that are held by the transaction.

mysql 1> set autocommit=off;

mysql 1> insert into t values (1),(10),(100);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql 1> select * from information_schema.tokudb_locks;
+--------------+-----------------------+---------------+----------------+-----------------+
| locks_trx_id | locks_mysql_thread_id | locks_dname   | locks_key_left | locks_key_right |
+--------------+-----------------------+---------------+----------------+-----------------+
|           51 |                     1 | ./test/t-main | 0001000000     | 0001000000      |
|           51 |                     1 | ./test/t-main | 000a000000     | 000a000000      |
|           51 |                     1 | ./test/t-main | 0064000000     | 0064000000      |
+--------------+-----------------------+---------------+----------------+-----------------+

mysql 1> select * from information_schema.tokudb_lock_waits;
Empty set (0.00 sec)

The keys are currently hex dumped. Someday we will pretty print the keys.

Now we switch to the other MySQL client with ID 2.

mysql 2> insert into t values (2),(20),(100);
The insert gets blocked since there is a conflict on the primary key with value 100.

The granted TokuDB locks are:

mysql 1> select * from information_schema.tokudb_locks;
+--------------+-----------------------+---------------+----------------+-----------------+
| locks_trx_id | locks_mysql_thread_id | locks_dname   | locks_key_left | locks_key_right |
+--------------+-----------------------+---------------+----------------+-----------------+
|           51 |                     1 | ./test/t-main | 0001000000     | 0001000000      |
|           51 |                     1 | ./test/t-main | 000a000000     | 000a000000      |
|           51 |                     1 | ./test/t-main | 0064000000     | 0064000000      |
|           62 |                     2 | ./test/t-main | 0002000000     | 0002000000      |
|           62 |                     2 | ./test/t-main | 0014000000     | 0014000000      |
+--------------+-----------------------+---------------+----------------+-----------------+

The locks that are pending due to a conflict are:

mysql 1> select * from information_schema.tokudb_lock_waits;
+-------------------+-----------------+------------------+---------------------+----------------------+-----------------------+
| requesting_trx_id | blocking_trx_id | lock_waits_dname | lock_waits_key_left | lock_waits_key_right | lock_waits_start_time |
+-------------------+-----------------+------------------+---------------------+----------------------+-----------------------+
|                62 |              51 | ./test/t-main    | 0064000000          | 0064000000           |         1380656990910 |
+-------------------+-----------------+------------------+---------------------+----------------------+-----------------------+

Eventually, the lock for client 2 times out, and we can retrieve a JSON coded document that describes the conflict.

ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

mysql 2> select @@tokudb_last_lock_timeout;
+----------------------------------------------------------------------------------------------------------------+
| @@tokudb_last_lock_timeout                                                                                     |
+----------------------------------------------------------------------------------------------------------------+
| {"mysql_thread_id":2, "dbname":"./test/t-main", "requesting_txnid":62, "blocking_txnid":51, "key":"0064000000"} |
+----------------------------------------------------------------------------------------------------------------+

mysql 2> rollback;

The JSON document encodes the database, table and index where the conflict occurred in the 'dname' key. The 'key' encodes the conflicting key. The 'blocking_txnid' encodes the transaction identifier that is blocking my transaction. I can find the MySQL client that created the blocking transaction via joins of information schema tables.

mysql> select * from information_schema.tokudb_trx,information_schema.processlist 
where trx_id = 51 and trx_mysql_thread_id = id;
+--------+---------------------+----+----------+-----------+------+---------+------+-------+------+
| trx_id | trx_mysql_thread_id | ID | USER     | HOST      | DB   | COMMAND | TIME | STATE | INFO |
+--------+---------------------+----+----------+-----------+------+---------+------+-------+------+
|     51 |                   1 |  1 | prohaska | localhost | test | Sleep   |  151 |       | NULL |
+--------+---------------------+----+----------+-----------+------+---------+------+-------+------+

Now I have information about the two client operations that are conflicting and can hopefully change my application to avoid this conflict.

Clone this wiki locally