Skip to content

DDL Survival

Eugene Kosov edited this page Apr 17, 2018 · 87 revisions

Revision: pre2.2

Introduction

DDL Survival feature allows table history to survive DDL operations like ALTER TABLE, DROP TABLE, TRUNCATE TABLE and RENAME TABLE.

What's new:

2.0

  • per-table VTMD instead of global versioning_table_ddl;
  • frm_image blob is depreciated in favor of archive tables.

2.1

  • Error handling section.

2.2

Configuration

DDL Survival mode is switched by versioning_alter_history global/session variable:

set versioning_alter_history = SURVIVE;

Part I: archive tables for persistent history

Archive tables are created on history modifying DDL. When such operation occurs all history is kept in originally schemed-table and all alive data goes to an altered table. Technically archive table is just renamed original table before DDL operation.

Archive tables are named by this template:

NAME_YYYYMMDD_hhmmss_uuuuuu

where

NAME: original table name (arbitrary length); YYYY: year (4 chars); MM: month (2 chars); DD: day of month (2 chars); hh: hour (2 chars); ss: seconds (2 chars); uuuuuu: microseconds (6 chars).

Thus, the length of mandatory suffix for archive tables is: 8 + 6 + 6 + 3 = 23.

Usually archive tables are hidden from SHOW TABLES command except in versioning_hide = NEVER mode.

Example

create or replace table dept (
  id int primary key auto_increment, 
  name varchar(255),
  employees int
)
with system versioning;

insert into dept (name, employees) values
  ("lime squeezing", 3),
  ("ice cracking", 5),
  ("waiters", 8);

delete from dept where id = 1;
select *,sys_trx_start,sys_trx_end from dept for system_time all;
+----+----------------+-----------+---------------+----------------------+
| id | name           | employees | sys_trx_start | sys_trx_end          |
+----+----------------+-----------+---------------+----------------------+
|  1 | lime squeezing |         3 |         16678 |                16684 |
|  2 | ice cracking   |         5 |         16678 | 18446744073709551615 |
|  3 | waiters        |         8 |         16678 | 18446744073709551615 |
+----+----------------+-----------+---------------+----------------------+

SYSTEM_TIME ALL query returns whole history as well as alive data. In this example record with ID 1 was deleted by transaction 16684. Records 2 and 3 are alive, its sys_trx_end is equal to infinity.

alter table dept drop column employees;

select *,sys_trx_start,sys_trx_end from dept for system_time all;
+----+----------------+-----------+---------------+----------------------+
| id | name           | employees | sys_trx_start | sys_trx_end          |
+----+----------------+-----------+---------------+----------------------+
|  2 | ice cracking   |         5 |         16690 | 18446744073709551615 |
|  3 | waiters        |         8 |         16690 | 18446744073709551615 |
+----+----------------+-----------+---------------+----------------------+

After data-modifying DDL the original dept table retained only alive data.

select *,sys_trx_start,sys_trx_end from dept_20161223_191443 for system_time all;
+----+----------------+-----------+-----------------------------+
| id | name           | employees | sys_trx_start | sys_trx_end |
+----+----------------+-----------+-----------------------------+
|  1 | lime squeezing |         3 |         16678 |       16684 |
|  2 | ice cracking   |         5 |         16678 |       16690 |
|  3 | waiters        |         8 |         16678 |       16690 |
+----+----------------+-----------+-----------------------------+

Part II: Versioned Tracking Metadata (VTMD)

Versioned Tracking Metadata table (VTMD) is special System Versioning table that helps to transparently traverse tables schema changes in historical queries (see part IV for SELECT implementation). Thus it logically connects across time multiple archive tables of one versioned table (for archive table see part I).

There are 3 types of operation that are tracked by VTMD:

  • table rename;
  • column rename and/or reorder (see part III;
  • any data-modifying DDL.

VTMD is personal to each versioned table. Every versioned table may have at most one VTMD table.

VTMD is named after corresponding versioned table by adding _vtmd suffix. VTMD is versioned itself and can be queried with SYSTEM_TIME clause.

VTMD structure

  • start: TRX_ID of table lifetime start;
  • end: TRX_ID of table lifetime end;
  • name: Table name during period [start, end);
  • archive_name: Name of archive table (NULL for end = Inf and last sequential table renames);
  • col_renames: blob with new -> old column name mappings.

VTMD table name tracking

Timepoints are marked by t0, t1, ..., tN. Each tN is in fact TRX_ID that references time value in TRT.BEGIN_TIME.

All operations on VTMD are done when versioning_alter_history = SURVIVE. Otherwise there will be no VTMD changes as well as no DDL survival tracking at all.

Table is created

VTMD is created and new record is added with start marking table creation time, end = Inf, name is set to table name, archive_name is NULL.

Table is altered

  1. Archive table is created.
  2. Last record (where end = Inf) end is set to alter time;
  3. All last records where archive_name = NULL are set archive_name to archive table name.
  4. New record is added with start as alter time, end is Inf, name is table name, archive_name is NULL.

Table is renamed

  1. Last record (where end = Inf) end is set to alter time;
  2. New record is added with start as alter time, end is Inf, name is new table name, archive_name is NULL.

Table is dropped

  1. Table is renamed to archive table.
  2. All last records where archive_name = NULL are set archive_name to archive table name.
  3. Last record (where end = Inf) end is set to drop time.

VTMD recovery

Table is altered and there is no VTMD or VTMD is empty. This may be the situation for example, when VTMD was dropped manually or table was created in non-survival mode.

  1. VTMD is created and new record is added with start set to table creation time (same as INFORMATION_SCHEMA.TABLES.CREATE_TIME), end = current_time(), name is set to table name (or previous name for rename), archive_name is NULL for rename or name of archive table for alter.

  2. Second record is added to VTMD corresponding to alter or rename operation (see above).

This does not apply to RENAME: only ALTER and DROP are subject to VTMD recovery.

Example

t0: Table A is created

A_VTMD is created.

start end name archive_name
t0 Inf A NULL

t1: Table A is renamed to B

A_VTMD is renamed to B_VTMD.

start end name archive_name
t0 t1 A NULL
t1 Inf B NULL

t2: Table B is altered

Archive table B_t2 is created.

start end name archive_name
t0 t1 A B_t2
t1 t2 B B_t2
t2 Inf B NULL

t3: Table B is renamed to C

B_VTMD is renamed to C_VTMD.

start end name archive_name
t0 t1 A B_t2
t1 t2 B B_t2
t2 t3 B NULL
t3 Inf C NULL

t4: Table C is renamed to A

C_VTMD is renamed to A_VTMD.

start end name archive_name
t0 t1 A B_t2
t1 t2 B B_t2
t2 t3 B NULL
t3 t4 C NULL
t4 Inf A NULL

t5: Table A is altered

Archive table A_t5 is created.

start end name archive_name
t0 t1 A B_t2
t1 t2 B B_t2
t2 t3 B A_t5
t3 t4 C A_t5
t4 t5 A A_t5
t5 Inf A NULL

A_VTMD is dropped

Archive tables B_t2 and A_t5 are now orphaned.

t6: Table A is renamed to B

B_VTMD is created, t0 is set to B (or A) creation time.

start end name archive_name
t0 t6 A NULL
t6 Inf B NULL

t7: Table B is dropped

Table B is renamed to B_t7

start end name archive_name
t0 t6 A B_t7
t6 t7 B B_t7

Versioned SHOW CREATE TABLE

Versioned SQL show command accepts SYSTEM_TIME expression to show table state in the past from corresponding VTMD record.

show create table dept for system_time as of transaction 16684;
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                     |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| dept  | CREATE TABLE `dept` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT NULL,
  `employees` int(11) DEFAULT NULL,
  `sys_trx_start` bigint(20) unsigned GENERATED ALWAYS AS ROW START,
  `sys_trx_end` bigint(20) unsigned NOT NULL GENERATED ALWAYS AS ROW END,
  PRIMARY KEY (`id`,`sys_trx_end`),
  PERIOD FOR SYSTEM_TIME (`sys_trx_start`, `sys_trx_end`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

show create table dept;
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                     |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| dept  | CREATE TABLE `dept` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT NULL,
  `sys_trx_start` bigint(20) unsigned GENERATED ALWAYS AS ROW START,
  `sys_trx_end` bigint(20) unsigned NOT NULL GENERATED ALWAYS AS ROW END,
  PRIMARY KEY (`id`,`sys_trx_end`),
  PERIOD FOR SYSTEM_TIME (`sys_trx_start`, `sys_trx_end`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Error handling

In DDL Survival mode in case any VTMD-related errors must lead original state recovery. I.e. the state before ALTER operation.

Part III: Column Mapping Metadata (CMMD)

Column Mapping Metadata (CMMD) is special System Versioning table that helps to track down column order and name changes.

CMMD like VTMD is personal to each versioned table. Every VTMD table may have at most one CMMD table.

CMMD is named after corresponding versioned table by adding _cmmd suffix.

Example

t0: Table A is created

create or replace table A (x int, y int, z int) with system versioning;

Table columns: x, y, z

A_VTMD is created:
start end name archive_name col_map
t0 Inf A NULL false

t1: Column order changed

alter table A change z z int first;

Table columns: z, x, y

A_VTMD is updated:
start end name archive_name col_map
t0 t1 A A_t1 true
t1 Inf A NULL NULL
A_CMMD is created:
  • current is column ID at time t0;
  • later is column ID at time t1.
start current later
t0 0 1
t0 1 2
t0 2 0

t2: Column is renamed

alter table A change x x2 int;

Table columns: z, x2, y

A_VTMD is updated:
start end name archive_name col_map
t0 t1 A A_t1 true
t1 t2 A NULL true
t2 Inf A NULL NULL
A_CMMD is updated:
start current later
t0 0 1
t0 1 2
t0 2 0
t1 1 1

t3: Column order and name changed

alter table A change x2 x3 int after y;

Table columns: z, y, x3

A_VTMD is updated:
start end name archive_name col_map
t0 t1 A A_t1 true
t1 t2 A A_t3 true
t2 t3 A A_t3 true
t3 Inf A NULL NULL
A_CMMD is updated:
start current later
t0 0 1
t0 1 2
t0 2 0
t1 1 1
t3 1 2
t3 2 1

Part IV: multi-schema SELECT

Multi-schema SELECT allows to make single query across multiple history points of schema change.

Name select modes

There are 2 modes of accessing tables and columns by name from historical SELECT:

  1. by present name;
  2. by past name.

Mode 1: SELECT by present names.

Uses latest names for tables and columns, contemporary to the time when SELECT is issued.

Find table schema as of specific time

This is pseudocode close to MySQL SP language with some shortcuts in the sake of brevity.

create procedure DDL_SCHEMA_MODE_1(
    TABLE_NAME varchar(255),
    AS_OF_TIME timestamp(6),
    COL_NAMES array of varchar(255))
DDL_SCHEMA_MODE_1_PROC:
begin
    declare TRX_ID bigint unsigned default MAX_VALUE;
    declare COL_MAP blob; # map of varchar(255) to varchar(255)
    declare FRM blob; # frm binary image
    declare OLD_NAME varchar(255);
    while true do
        # find DDL record where trx_id_end = TRX_ID and name = TABLE_NAME
        select sys_trx_start, col_rename, frm_image, old_name
        from DDL
        where trx_id_end = TRX_ID and name = TABLE_NAME
        into TRX_ID, COL_MAP, FRM, OLD_NAME;
        # if there is no such record
        if found_rows() = 0 then
            # and this is first iteration
            if TRX_ID = MAX_VALUE then
                # then use frm from file
                call USE_FRM_FROM_FILE(TABLE_NAME);
                leave DDL_SCHEMA_MODE_1_PROC;
            else
                # otherwise raise error
                call RAISE_ERROR(DDL_CORRUPTED);
                leave DDL_SCHEMA_MODE_1_PROC;
            end if;
        end if;
        # translate column names via mapping in `col_rename`
        set COL_NAMES= TRANSLATE_COL_NAMES(COL_NAMES, COL_MAP);
        if TRT_COMMIT_TS(trx_id_start) < AS_OF_TIME then
            call USE_FRM(frm);
            leave DDL_SCHEMA_MODE_1_PROC;
        end if;
        if not is_null(OLD_NAME) then
            set TABLE_NAME= OLD_NAME;
        end if;
    end while;
end
  1. define variables: TRX_ID = Inf, input parameters: TABLE_NAME, AS_OF_TIME;
  2. find DDL record where trx_id_end = TRX_ID and name = TABLE_NAME;
  3. if there is no such record, if this is first iteration then use frm from file, otherwise return error;
  4. translate column names via mapping in col_rename;
  5. if TRT_COMMIT_TS(trx_id_start) < AS_OF_TIME then return record;
  6. if old_name is not NULL, set TABLE_NAME = old_name;
  7. set TRX_ID = trx_id_start; goto 1.

Example

select id, alias from dept2 for system_time as of transaction 1503;

Selects record with trx_id_start 1503 from VTMD, frm_B is in effect, but column and table names are from frm_C.

select id, name from dept for system_time as of transaction 1503;

Returns error no such table dept.

Mode 2: SELECT by past names.

Uses names for tables and columns, contemporary to the moment specified in SYSTEM_TIME expression. If time range is used in SYSTEM_TIME expression (FROM..TO, BETWEEN..AND), then the moment is specified by a higher boundary of that range.

create procedure DDL_SCHEMA_MODE_2(
    TABLE_NAME varchar(255),
    AS_OF_TIME timestamp(6))
begin
    declare TRX_ID bigint unsigned default MAX_VALUE;
    declare FRM blob; # frm binary image

    select frm_image
    from DDL
    where
        name = TABLE_NAME and
        TRT_COMMIT_TS(trx_id_start) < AS_OF_TIME and
        AS_OF_TIME <= TRT_COMMIT_TS(trx_id_end);
    into FRM;

    if found_rows() = 0 then
        call USE_FRM_FROM_FILE(TABLE_NAME);
    else
        call USE_FRM(frm);
    end if;
end
  1. input parameters TABLE_NAME, AS_OF_TIME;
  2. find DDL record where name = TABLE_NAME and TRT_COMMIT_TS(trx_id_start) < AS_OF_TIME and AS_OF_TIME <= TRT_COMMIT_TS(trx_id_end);
  3. return record.

No column name translation needed since past table name is used with past column names.

Example

select id, name from dept for system_time as of transaction 1503;

frm_B is in effect from time of 1503 with names for table and columns as of that time.

Historical range SELECT

Historical range SELECT is SYSTEM_TIME query that selects history range across multiple schema changes that are stored in VTMD (see part II). The main difficulty with such queries is to track properly table and column renames as well as any corner cases when column is added, deleted or its type changed. This is part of System Versioning 2.0.

Cursory algorithm of historical range SELECT

  1. for each versioned table split range to array of VTMD ranges;
  2. for each latest range per table expand wildcard columns ('*');
  3. make intersection of these array ranges. If intersection is not full, return error "Inexistent table T at range R";
  4. on intersection edges mark column and table name mappings if there are any at corresponding VTMD edges. If the mapping is not possible, i.e. column was added or deleted, return error "Inexistent column C at range R".
  5. for each portion of intersection create corresponding versioned SELECT (preferably of mode 2.);
  6. get resultsets for each SELECT. Connect resultsets by column mappings at intersection edges into single resultset. Return final resultset.

Connection of inexistent columns and EXISTS() function

In 3. and 4. when there is inexistent column or table for current period of time instead of failing with error the inexistence may be marked in appropriate intersection edge. In 6. when connecting resultsets such inexstence gaps may be filled with NULL. Additionally existence property may be tested by EXISTS(column name or table name) function.

This mode of SELECT over history range should be controlled by vers_fail_on_inexistent system variable.

Example

select * from dept2 for system_time all;
  1. create ranges: (0, 1503), [1503, 3750), [3750, 6656), [6656, Inf);

  2. no need to do intersection since only one table is queried;

  3. name mappings are added: (0, 1503), [1503, 3750(dept2->dept)), [3750, 6656(alias->name)), [6656, Inf);

  4. execute selects:

    select id, alias from dept2 for system_time transaction from 6656 to Inf;
    select id, name from dept2 for system_time transaction from 3750 to 6656;
    select id, name from dept for system_time transaction from 1503 to 3750;
    select id, name from dept for system_time transaction from 0 to 1503;
  5. connect resultsets

Example of range intersection

We have 3 tables with following ranges:

  • (0, 7), [7, 15), [15, 35), [35, Inf);
  • (0, 3), [3, 25), [25, 35), [35, Inf);
  • (0, 85), [85, Inf).

For simplicity let's don't check transactions order. In this case we will have the following intersection:

  • (0, 3), [3, 7), [7, 15), [15, 25), [25, 35), [35, 85), [85, Inf).

For each of these ranges we must perform its own SELECT, thus there will be 7 selects.

Glossary

Data-modifying DDL

DDL operation that leads to different query to achieve same result or when same result is not possible.

Inf

Abbreviation for infinity. Technically a maximum possible value for sys_trx_end.

Past name

Historical name as of some time in past. If name was not changed since that time, it will be equal to present name.

Present name (alive name)

Current table or column name, i.e. as of current time.

sys_trx_end

System Versioning special field. Contains marker of record's lifetime end (TRX_ID or timestamp).

sys_trx_start

System Versioning special field. Contains marker of record's lifetime start (TRX_ID or timestamp).

VTMD

Abbreviation for Versioned Tracking Metadata (see above for description).

VTMD range

Time range of [trx_id_start, trx_id_end) of single VTMD record.

Clone this wiki locally