Cardinality
Cardinality is an estimate of the number of rows per distinct key value for each index of a table. For a key composed of multiple columns, MySQL accepts a cardinality estimate for each key prefix. The storage engine computes the cardinality estimates and posts them to the table, which makes the estimates available to the query optimizer.
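For example, consider a table with 1000 rows and an index on (a, b) where a has 100 distinct values and (a, b) has 500 distinct pairs: the rows-per-key estimates are 10 for the prefix (a) and 2 for (a, b). MySQL keeps these per-prefix estimates in the rec_per_key array of each key. A toy illustration of the arithmetic (made-up numbers, not a real server code path):

```cpp
// Toy illustration of MySQL's per-prefix statistics for an index on (a, b):
// rec_per_key[0] estimates rows per distinct a, rec_per_key[1] estimates
// rows per distinct (a, b) pair.
#include <cstdio>

int main() {
    const unsigned long long table_rows = 1000;
    const unsigned long long distinct_per_prefix[2] = {100, 500};  // |a|, |(a,b)|
    for (int p = 0; p < 2; p++) {
        // rows per key for the (p+1)-column prefix
        unsigned long long rec_per_key = table_rows / distinct_per_prefix[p];
        std::printf("prefix length %d: rec_per_key = %llu\n", p + 1, rec_per_key);
    }
    return 0;  // prints 10 for (a) and 2 for (a, b)
}
```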
MySQL’s ‘analyze table’ statement lets an administrator update the table statistics used by the query optimizer, including index cardinality. The ‘analyze table’ implementation in MySQL has these properties:

* Analyze table runs with the MDL_SHARED_READ metadata lock, which allows concurrent reads from the table but prohibits concurrent writes to it. We can patch the MySQL codebase to downgrade the lock to allow concurrent writes; see the sketch after this list.
* Analyze table closes the table when it completes, which forces subsequent SQL statements that reference the table to reopen it.
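A minimal sketch of such a patch, assuming the metadata-lock API from the server’s mdl.h; the exact place where ‘analyze table’ initializes its lock request varies by server version, so treat this as an assumption rather than the actual patch:

```cpp
// Hypothetical patch sketch: request MDL_SHARED_WRITE (the lock type taken by
// ordinary DML, which permits concurrent writers) instead of MDL_SHARED_READ
// when the server opens the table for 'analyze table'.
table_list->mdl_request.init(MDL_key::TABLE, db_name, table_name,
                             MDL_SHARED_WRITE /* was MDL_SHARED_READ */,
                             MDL_TRANSACTION);
```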
The ‘InnoDB and table statistics’ blog post describes how InnoDB handles table statistics.
TokuDB handles cardinality as follows:

* TokuDB computes cardinality estimates from ‘analyze table’.
* TokuDB does not compute cardinality automatically.
* TokuDB computes cardinality for all prefixes of multi-column keys.
* TokuDB persists the cardinality estimates in the status dictionary.
The TokuDB analyze method computes cardinality estimates and stores them in the status dictionary. Since we expect the table to be closed when ‘analyze table’ completes, we do not update the table’s key info structures at that point. When the table is reopened, TokuDB’s ‘info’ method reads the cardinality estimates from the status dictionary and posts the values to the table’s key info structures, as sketched below.
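A sketch of this flow; the status-dictionary accessors and the pseudotypes below are hypothetical stand-ins, not the actual TokuDB functions or server structures:

```cpp
// Hypothetical sketch of the analyze/reopen flow. status_put_card and
// status_get_card stand in for the status-dictionary accessors; KeyInfo
// stands in for the server's KEY structure.
#include <cstdint>

struct KeyInfo {
    unsigned long rec_per_key[16];  // rows-per-key estimate per prefix length
    unsigned num_parts;             // number of columns in the key
};

// Persisted by analyze, one record per index, inside a transaction.
void status_put_card(unsigned key, const uint64_t *rows_per_prefix, unsigned num_parts);
// Read back by info(); returns false if analyze has never run on this index.
bool status_get_card(unsigned key, uint64_t *rows_per_prefix);

// Called from the handler's info() method after the table is reopened.
void info_post_cardinality(KeyInfo *key_info, unsigned num_keys) {
    for (unsigned k = 0; k < num_keys; k++) {
        uint64_t rows_per_prefix[16];
        if (!status_get_card(k, rows_per_prefix))
            continue;  // no persisted estimates; leave the defaults alone
        for (unsigned p = 0; p < key_info[k].num_parts; p++)
            key_info[k].rec_per_key[p] = (unsigned long)rows_per_prefix[p];
    }
}
```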
TokuDB computes the cardinality of a dictionary with a bounded scan of the dictionary that starts from the first key. The bound is expressed as a time limit; a TokuDB session variable can be used to change the default time limit. A sketch of such a bounded scan follows.
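This is a minimal sketch of the time-bounded scan, assuming a hypothetical cursor interface rather than the real fractal tree API; the per-row work (updating per-prefix unique counts) is passed in as a callback:

```cpp
// Minimal sketch of a time-bounded analyze scan. Cursor is any type with
// next() and key() methods; this is a hypothetical interface, not the
// fractal tree API. The time limit would come from the TokuDB session
// variable mentioned above.
#include <ctime>
#include <cstdint>

struct ScanStats {
    uint64_t rows_scanned = 0;
    bool hit_time_limit = false;
};

template <class Cursor, class PerRow>
ScanStats bounded_scan(Cursor &cursor, double time_limit_secs, PerRow per_row) {
    ScanStats stats;
    const std::time_t start = std::time(nullptr);
    while (cursor.next()) {            // advance from the first key
        per_row(cursor.key());         // e.g. update per-prefix unique counts
        stats.rows_scanned++;
        // Check the clock periodically rather than on every row.
        if ((stats.rows_scanned & 1023) == 0 &&
            std::difftime(std::time(nullptr), start) >= time_limit_secs) {
            stats.hit_time_limit = true;  // estimates come from a partial scan
            break;
        }
    }
    return stats;
}
```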
Another approach is to examine several randomly selected row ranges and compute cardinality over those ranges, with the hope that the sample is statistically representative. This algorithm needs a fractal tree method that positions a cursor at roughly the ith row (or the ith leaf node); a candidate positioning algorithm is described at the end of this page.
Some properties of the design:

* Cardinality is displayed via ‘show indexes’.
* Show processlist displays analyze progress.
* Add index will transactionally delete the table’s persistent cardinality estimates.
* Drop index will transactionally delete the table’s persistent cardinality estimates.
* Analyze table allows concurrent queries but blocks concurrent writes.
* Primary keys and unique secondary keys can skip the cardinality computation since they are unique by definition.
* If a key prefix comparison of two adjacent keys finds that the prefixes differ, then all longer prefixes must also differ. We can use this property to terminate the unique key count algorithm early; see the sketch after this list.

Possible improvements:

* Analyze should use a fractal tree cursor that does not pollute the fractal tree cache. This feature does not exist yet.
* Analyze should do statistical sampling of the dictionary’s key space. This requires a fractal tree API that positions the cursor at some fraction of the key space, like the algorithm sketched below.
* The bulk loader could compute cardinality at the end of the load.
* Downgrade the MDL of the analyze to allow concurrent writes, as discussed above.
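Here is a sketch of that early-termination property; key parts are modeled as plain integers for illustration, whereas the real code would compare key parts with the server’s column comparison functions:

```cpp
// Update per-prefix distinct counts from one adjacent-key comparison. Once
// key part d differs, every prefix of length d+1 or longer also differs, so
// a single pass over the key parts settles all prefix counts.
#include <cstdint>
#include <cstddef>

void update_prefix_counts(const int *prev, const int *curr, size_t num_parts,
                          uint64_t *unique_count /* one slot per prefix */) {
    for (size_t d = 0; d < num_parts; d++) {
        if (prev[d] != curr[d]) {
            // Prefixes of length d+1 .. num_parts each gained a distinct value.
            for (size_t p = d; p < num_parts; p++)
                unique_count[p]++;
            return;  // the termination property: no need to compare further
        }
    }
    // All key parts equal: no prefix gained a distinct value.
}
```

The rows-per-key estimate for a prefix of length p is then rows_scanned / unique_count[p-1], scaled up to the table’s row count when the scan stopped at the time limit.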
Suppose that we want to position a cursor at roughly the ith row in a dictionary. Here is a cursor-positioning algorithm that expresses the ith row as a fraction f between 0 and 1 of the key range.
```
key find(node, f)
    child_num = floor(f * node->num_of_children)
    if node->height == 0
        return first key in child[child_num]
    else
        new_f = f * node->num_of_children - child_num
        return find(node->child[child_num], new_f)
```
Proof sketch: we pick the child subtree based on f, interpreted as a fraction of the number of child subtrees. Once we pick a subtree, we must recompute f relative to that subtree. Let M be the number of rows in the tree; M does not need to be known by the algorithm and is used only in this proof.

* The ith row is row f * M.
* Assume the rows are spread evenly, so each subtree holds rows_per_child = M / num_of_children rows.
* The chosen subtree starts at row child_num * rows_per_child, so the new fraction is new_f = (f * M - child_num * rows_per_child) / rows_per_child, which simplifies to new_f = f * num_of_children - child_num.
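A self-contained toy translation of the algorithm, useful for checking the arithmetic above; it uses in-memory nodes with integer keys, not the fractal tree API:

```cpp
// Toy model of the cursor-positioning algorithm. A leaf (height 0) holds keys
// directly; an internal node holds children. Only for checking the recursion.
#include <cmath>
#include <cstdio>
#include <vector>

struct Node {
    int height;                  // 0 = leaf
    std::vector<Node> children;  // populated when height > 0
    std::vector<int> keys;       // populated when height == 0
};

// Return the key that fraction f (0 <= f < 1) of the key range lands on.
int find(const Node &node, double f) {
    size_t n = node.height == 0 ? node.keys.size() : node.children.size();
    size_t child_num = (size_t)std::floor(f * n);
    if (node.height == 0)
        return node.keys[child_num];   // "first key in child[child_num]"
    double new_f = f * n - child_num;  // rescale f into the chosen subtree
    return find(node.children[child_num], new_f);
}

int main() {
    // Height-1 tree: 4 leaves of 4 keys each, 16 rows with keys 0..15.
    Node root{1, {}, {}};
    for (int leaf = 0; leaf < 4; leaf++) {
        Node l{0, {}, {}};
        for (int k = 0; k < 4; k++) l.keys.push_back(leaf * 4 + k);
        root.children.push_back(l);
    }
    // With evenly filled subtrees, f = 10/16 lands exactly on row 10.
    std::printf("find(root, 10.0/16) = %d\n", find(root, 10.0 / 16));  // prints 10
    return 0;
}
```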