-
Notifications
You must be signed in to change notification settings - Fork 131
Bulk Fetch
The TokuDB bulk fetch algorithm speeds up fractal tree scans by returning multiple rows per tree search rather than a single row. This amortizes the cost of a fractal tree search over several rows. We have observed a speedup of between 2x and 5x when using bulk fetch compared to not using it. TokuDB 7.1.7 (and prior versions) only use bulk fetch for simple selects
. As a result, only simple selects
run fast. Other operations, such as create select
, run slow. We describe when bulk fetch can be used, how its use is triggered, and why it can be used to speed up operations such as create select
.
MySQL uses handler APIs to fetch one row at a time from TokuDB (and other storage engines). Unfortunately, a fractal tree search is too heavy to use for each handler call to get the next
or previous
row from a fractal tree. TokuDB uses a bulk fetch buffer in the handler that is filled by a single fractal tree search. When MySQL calls the next
or previous
TokuDB handler and the bulk fetch buffer is not empty, then a row is popped from the bulk fetch buffer and returned to MySQL. Otherwise, the bulk fetch buffer is refilled by a fractal tree search that returns multiple rows. Sometimes, the bulk fetch buffer contents must be invalidated when its contents may have become inconsistent with the fractal tree.
The MySQL query executor uses one of several table scanning algorithms to read rows from a table. An index scan is used to retrieve all of the rows from an index. A range scan is used to retrieve all of the rows from an index within a key range. TokuDB can use its bulk fetch algorithm for both of these types of scans. However, TokuDB must be notified by the MySQL query executor that an index or range scan will happen. Sometimes, this does not occur and the scans run slow.
The prepare_index_scan
handler method informs the storage engine that an index scan will happen. Similarly, the prepare_range_scan
handler method informs the storage engine that a range scan will happen. We placed calls to these methods in the appropriate places of the MySQL query executor code.
Suppose that prepare_index_scan
is not called by MySQL because of a missing Tokutek patch to the MySQL source. TokuDB can automatically detect an index scan by monitoring the sequence of index calls. If a call to index_first
is followed by a call to index_next
, then TokuDB can infer that an index scan is being done and setup bulk fetch appropriately. Similarly, if a call to index_last
is followed by a call to index_prev
, then TokuDB can infer that a reverse index scan is begin done.
Bulk fetch should be used for index and range scans on simple tables as well as partitioned tables. We recently found that bulk fetch is NOT being used for index scans on partitioned TokuDB tables. #261 Index scans on partitions do not use bulk fetch because the prepare_index_scan
method is not currently called by the partition storage engine. We fixed this problem by auto detecting index scans.
The bulk fetch buffer is invalidated at
- index_init
- index_first
- index_last
- index_read
- index_end
- in get_next when the buffer has been emptied
- A select handler in the trigger is different than the handlers used to scan the source tables. Therefore, there is no interference on the bulk fetch buffers with select statements.
- Rows are only read from the source tables and never changed.
- Insert, delete, and update triggers can not fire since none of these operations ocurrs with a simple select.
- Rows are only read from the source tables and never changed.
- Since the destination table is new, no triggers exist in the created table.
- MySQL will use a temporary table if the destination table and source tables are the same.
- Insert trigger
- Inserts, deletes, and updates on the source table are NOT allowed in the trigger by MySQL.
- MySQL will use a temporary table if the destination table and source tables are the same.
- The update can only change columns in the target row.
- Insert trigger
- Inserts, deletes, and updates on the source table are NOT allowed in the trigger by MySQL.
- Delete trigger does not apply.
- Update trigger
- Inserts, deletes, and updates on the source table are NOT allowed in the trigger by MySQL.
- MySQL will use a temporary table if the destination table and source tables are the same
- Insert trigger like insert select.
- Inserts, deletes, and updates on the source table are NOT allowed in the trigger by MySQL.
- Delete trigger
- Inserts, deletes, and updates on the source table are NOT allowed in the trigger by MySQL.
- Update trigger does not apply.
- Delete trigger
- Inserts, deletes, and updates on the source table are NOT allowed in the trigger by MySQL.
- Insert and update triggers do not apply.
- TODO