Transactions
Kvdb reuses the ets backend as a transaction store. Strictly speaking, kvdb_trans is also another kvdb backend.
kvdb_trans:run(DbName, fun(Db) ->
    %% DbOrName: either the Db passed to the fun, or DbName itself
    kvdb:put(DbOrName, Obj),
    ...
end)
The way this works is:
- The kvdb_trans module opens a transaction store with the kvdb_ets backend, then creates a #kvdb_ref{} instance where the database reference consists of the pair: {TransactionStore, OriginalDb}.
- This #kvdb_ref{} instance is passed as an argument to the fun(), but also stored in the process dictionary.
- If the calling process calls a kvdb operation referencing the database by name, kvdb_trans will detect the ongoing transaction and substitute the #kvdb_ref{} from the process dictionary for the name. This substitution is triggered already in kvdb.erl, so it doesn't add much overhead. This is very similar to how Mnesia does it.
- Each operation has a corresponding callback in kvdb_trans.erl, where the appropriate semantics is implemented, pulling metadata and data from the original store as needed, and updating the transaction store.
- A few special functions are added to kvdb_ets, to allow for insertion of "markers", e.g. noting that an object or table has been deleted.
- On commit (when the fun returns), kvdb_trans asks the kvdb_ets instance for a #commit{} record, which is obtained by scraping and slightly transforming the data in the transaction store. If logging is enabled, the #commit{} record is logged, and all updates are then performed on the original backend.
- Any schema-triggered events generated during the transaction are kept in the process dictionary until commit time, at which point they are fetched and issued in order.
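The steps above can be illustrated with a small, self-contained sketch. It is not the kvdb_trans implementation: the module name trans_sketch, the process-dictionary key, the '$deleted' marker, and the list-of-operations commit format (standing in for the #commit{} record) are all assumptions made for illustration, and a named ets table stands in for the original backend.

```erlang
-module(trans_sketch).
-export([run/2, write/3, read/2, delete/2]).

-define(TRANS_KEY, '$trans_sketch_ref').   % hypothetical process-dictionary key
-define(DELETED, '$deleted').              % hypothetical delete marker

%% Run Fun in a transaction on Orig, a named ets table standing in for the
%% original backend. Returns {Result, Commit}. If Fun raises, nothing is
%% applied to Orig and the transaction store is simply dropped.
run(Orig, Fun) when is_atom(Orig), is_function(Fun, 1) ->
    TStore = ets:new(trans_store, [set, private]),
    Ref = {TStore, Orig},
    erlang:put(?TRANS_KEY, Ref),
    try
        Result = Fun(Ref),
        Commit = commit_set(TStore),
        ok = apply_commit(Orig, Commit),
        {Result, Commit}
    after
        erlang:erase(?TRANS_KEY),
        ets:delete(TStore)
    end.

%% Substitute the ref from the process dictionary when the database is
%% referenced by name while a transaction on that database is ongoing.
resolve({_TStore, _Orig} = Ref) -> Ref;
resolve(Name) when is_atom(Name) ->
    case erlang:get(?TRANS_KEY) of
        {_TStore, Name} = Ref -> Ref;
        _ -> Name
    end.

write(Db, Key, Value) ->
    case resolve(Db) of
        {TStore, _Orig} -> ets:insert(TStore, {Key, {value, Value}});
        Name -> ets:insert(Name, {Key, Value})
    end.

%% Inside a transaction, a delete only records a marker in the store.
delete(Db, Key) ->
    case resolve(Db) of
        {TStore, _Orig} -> ets:insert(TStore, {Key, ?DELETED});
        Name -> ets:delete(Name, Key)
    end.

%% Reads consult the transaction store first, then the original store.
read(Db, Key) ->
    case resolve(Db) of
        {TStore, Orig} ->
            case ets:lookup(TStore, Key) of
                [{_, ?DELETED}]       -> {error, not_found};
                [{_, {value, Value}}] -> {ok, Value};
                []                    -> read_orig(Orig, Key)
            end;
        Name -> read_orig(Name, Key)
    end.

read_orig(Tab, Key) ->
    case ets:lookup(Tab, Key) of
        [{_, Value}] -> {ok, Value};
        []           -> {error, not_found}
    end.

%% "Scrape" the transaction store into a sorted list of operations.
commit_set(TStore) ->
    lists:sort([to_op(E) || E <- ets:tab2list(TStore)]).

to_op({Key, ?DELETED})       -> {delete, Key};
to_op({Key, {value, Value}}) -> {write, Key, Value}.

apply_commit(Orig, Ops) ->
    lists:foreach(fun({delete, K})   -> ets:delete(Orig, K);
                     ({write, K, V}) -> ets:insert(Orig, {K, V})
                  end, Ops).
```

Inside the fun, trans_sketch:write(Db, k, v) and trans_sketch:write(DbName, k, v) behave the same way, and the original table is only touched once the fun has returned. Nested transactions, tables, metadata and schema events are deliberately left out.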
The following is mostly speculative at this point, as it hasn't been prototyped yet.
Side-effects should be possible inside transactions, as long as the locking regime doesn't enforce transaction restarts due to deadlock prevention. Of course, there will be no rollback on the side-effects... but a kvdb transaction instance could potentially (although not yet) be passed around between processes and survive for some time. This would fit well with NETCONF-style transactions, which stretch over multiple network operations.
An interesting shorter scenario to consider is the store-and-forward handling of RPCs. The transaction should begin with the validation of the RPC, and ideally carry on until the RPC is either immediately fetched from the queue and dispatched, or likely to remain in the queue. Again ideally, the push/pop sequence would not have to touch the persistent store if it happens within the scope of the transaction.
The kvdb_ets backend has an option for persistency, either by periodically saving to a snapshot, or by enabling update logging. When opened, such an instance will look for the specified snapshot file, load it, then search for logs and replay them, restoring the instance to a current state. A timestamp is stored in the snapshot to allow the instance to figure out which logs are more recent than the snapshot.
The log entries are written at the time just before an update of the database is to take place. The log entries use standardized macros in kvdb.hrl, and a generic function in kvdb_lib is used to replay the logs. Thus, while (at this time) only kvdb_ets actually performs logging, the replay should work for any backend. This should make it possible to use logs as an export/import facility, e.g. for migrating between backends (but this is yet to be implemented).
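As a rough illustration of the load-then-replay sequence, the sketch below restores a state from a snapshot and replays only the logs that are newer than it. Everything here is an assumption made for the example: the module name replay_sketch, the {Timestamp, Data} shapes for snapshots and logs, the {write, Key, Value} / {delete, Key} operation format, and the idea that each log carries a single timestamp. The real log format is defined by the macros in kvdb.hrl and is not reproduced here.

```erlang
-module(replay_sketch).
-export([restore/4, map_apply/2]).

%% Snapshot: {SnapTime, [{Key, Value}]}
%% Logs:     [{LogTime, [Op]}], Op :: {write, Key, Value} | {delete, Key}
%% Apply:    fun(Op, State) -> State, the backend-specific update callback
%% Empty:    the empty state of the backend being restored
restore({SnapTime, Objects}, Logs, Apply, Empty) ->
    %% Load the snapshot contents first.
    Loaded = lists:foldl(fun({K, V}, S) -> Apply({write, K, V}, S) end,
                         Empty, Objects),
    %% Keep only logs more recent than the snapshot, oldest first.
    Newer = lists:keysort(1, [L || {T, _} = L <- Logs, T > SnapTime]),
    %% Replay every operation of every remaining log, in order.
    lists:foldl(fun({_T, Ops}, S) -> lists:foldl(Apply, S, Ops) end,
                Loaded, Newer).

%% Example callback: a plain map acting as the backend.
map_apply({write, K, V}, M) -> maps:put(K, V, M);
map_apply({delete, K}, M)   -> maps:remove(K, M).
```

Because the backend is reached only through the Apply callback, the same replay loop could in principle feed any store, which is the property that would make logs usable for export/import.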
There are configurable log thresholds, either (currently) on number of writes or number of bytes. When a threshold is reached, an asynchronous signal is sent to the instance-owning process, which spawns a temporary process to open a new log, and modifies its own state to log a pending log switch operation.
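A threshold check of this kind might look roughly like the following sketch. The module name log_threshold_sketch, the option names writes and bytes, and the use of term_to_binary/1 to measure entry size are assumptions for illustration, not the kvdb_ets option names or its accounting method.

```erlang
-module(log_threshold_sketch).
-export([new/1, note_write/2]).

-record(st, {writes = 0, bytes = 0,
             max_writes = infinity, max_bytes = infinity}).

%% Opts is a proplist, e.g. [{writes, 1000}] or [{bytes, 1048576}].
new(Opts) ->
    #st{max_writes = proplists:get_value(writes, Opts, infinity),
        max_bytes  = proplists:get_value(bytes, Opts, infinity)}.

%% Account for one logged term. Returns {ok, St} while below the thresholds,
%% or {switch, St} with the counters reset once a threshold is reached.
%% Numbers sort before atoms in Erlang term order, so a counter is never
%% >= the atom 'infinity'.
note_write(Term, #st{writes = W, bytes = B} = St) ->
    W1 = W + 1,
    B1 = B + byte_size(term_to_binary(Term)),
    case W1 >= St#st.max_writes orelse B1 >= St#st.max_bytes of
        true  -> {switch, St#st{writes = 0, bytes = 0}};
        false -> {ok, St#st{writes = W1, bytes = B1}}
    end.
```

On a {switch, St} result, the writer would send its asynchronous signal to the instance-owning process and carry on; the sketch leaves out the message itself and the opening of the new log.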
All transactions must synchronize with the instance owner before starting. This makes it possible to insert a brief log switch pause, and update the #kvdb_ref{} record with a handle to the new log. A copy of the ets table is also made, and a new snapshot is built (currently synchronously) from this copy.
NOTE: If we decide to allow long-running transactions, this sync mechanism becomes problematic. Pausing all transaction starts while waiting for a log switch is only acceptable if transactions are short. It might be better to perform the sync before commits, since transactions don't affect log activity or the original database until that time anyway.