Chapter 22 - Transaction Management
functions that a database management system (DBMS) should provide
- transaction support - concurrency control services - recovery services
left in a wait state indefinitely, unable to acquire any new locks
livelock
The technique of locking a child node and releasing the lock on the parent node if possible
lock-coupling or crabbing
Concurrency control techniques
locking and timestamping * both are conservative (pessimistic) approaches: they delay transactions in case they conflict with other transactions at some time in the future
log records (or at least certain parts of them) are written before the corresponding write to the database
write-ahead log protocol
Aspects of transactions
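- A transaction should always transform the database from one consistent state to another - "Committed" if successful; "aborted" otherwise, in which case the database is "rolled back" ("undone") to the state it was in before the transaction started - A committed transaction cannot be aborted; a "compensating transaction" must be run to reverse its effects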
Properties of a transaction
- Atomicity: "all or nothing" - Consistency: one consistent state to another - Isolation: execute independently of one another - Durability: effects permanently recorded
A DBMS should provide the following facilities to assist with recovery:
- a backup mechanism: makes periodic backup copies of the database - logging facilities: keep track of the current state of transactions and database changes - a checkpoint facility: enables updates to the database that are in progress to be made permanent - a recovery manager: allows the system to restore the database to a consistent state following a failure
steal policy
- allows the buffer manager to write a buffer to disk before a transaction commits (the buffer is unpinned). In other words, the buffer manager "steals" a page from the transaction - steal avoids the need for a very large buffer space - the alternative policy is no-steal
Optimistic methods
- based on the premise that conflict is rare - avoids locking to gain performance, at the risk of costly rollback - three phases: read (updates are applied to local copies of the data, not to the database itself), validation, write
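The three phases can be sketched as follows. This is a minimal illustration, not a production scheme: `VersionedStore`, `OptimisticTxn`, and the per-item version counter used for validation are all assumptions made for the example.

```python
class VersionedStore:
    """Hypothetical in-memory store with one version counter per item."""

    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in data}


class OptimisticTxn:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # item -> version seen in the read phase
        self.local = {}          # private copies of updated values

    def read(self, key):
        # Read phase: remember which version of the item we saw.
        if key in self.local:
            return self.local[key]
        self.read_versions.setdefault(key, self.store.version[key])
        return self.store.data[key]

    def write(self, key, value):
        # Updates go to a local copy, not to the database.
        self.local[key] = value

    def commit(self):
        # Validation phase: fail if anything we read has changed since.
        for key, seen in self.read_versions.items():
            if self.store.version[key] != seen:
                return False  # caller must roll back and restart
        # Write phase: apply the local copies to the database.
        for key, value in self.local.items():
            self.store.data[key] = value
            self.store.version[key] = self.store.version.get(key, 0) + 1
        return True
```

With two transactions that both read `x` before either commits, the first commit succeeds and the second fails validation, which is the "costly rollback" case.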
Deadlock detection is usually handled by
- construction of a wait-for graph (WFG) that shows the transaction dependencies - generated at regular intervals
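Deadlock exists exactly when the WFG contains a cycle. A minimal sketch of that check, assuming the graph is given as a dictionary mapping each waiting transaction to the transactions it waits for (the function name and representation are illustrative):

```python
def find_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in wait_for.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                return True  # back edge: a cycle, hence deadlock
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n)
               for n in list(wait_for))
```

For example, `find_deadlock({"T1": ["T2"], "T2": ["T1"]})` reports a deadlock, while a simple chain of waits does not.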
Recovery techniques using deferred update vs immediate update
- deferred update recovery protocol: updates are not written to the database until after a transaction has reached its commit point - immediate update recovery protocol: updates are applied to the database as they occur without waiting to reach the commit point
potential problems caused by concurrency:
- lost update problem: an apparently successful update is overridden by another user - uncommitted dependency problem: one transaction is allowed to see the intermediate results of another transaction before it has committed - inconsistent analysis problem: a transaction reads several values from the database but a second transaction updates some of them during the execution of the first
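The lost update problem is easiest to see with a fixed interleaving. The sketch below simulates two transactions on one balance in plain sequential code; the amounts and function names are made up for illustration.

```python
def interleaved_lost_update(balance):
    """T1 withdraws 10 and T2 deposits 100, with their reads interleaved."""
    t1_seen = balance        # T1 reads the balance
    t2_seen = balance        # T2 reads the same old value
    balance = t1_seen - 10   # T1 writes its result
    balance = t2_seen + 100  # T2 overwrites it: T1's update is lost
    return balance


def serial_execution(balance):
    """The same two transactions run one after the other."""
    balance = balance - 10   # T1 runs to completion
    balance = balance + 100  # then T2 runs
    return balance
```

Starting from 100, the interleaved version ends at 200 while the serial version ends at 190, so the withdrawal has silently disappeared.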
shadow paging
- maintains two page tables during the life of a transaction: a current page table and a shadow page table - when the transaction starts, the two page tables are the same (the shadow page table is never changed thereafter) - when the transaction completes, the current page table becomes the shadow page table
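A rough sketch of the idea, assuming pages live in a Python list and each page table is a list mapping logical page numbers to physical slots. `ShadowPagedDB` is a hypothetical name, and reclaiming unused slots is deliberately left out.

```python
class ShadowPagedDB:
    def __init__(self, pages):
        self.disk = list(pages)                # physical page slots
        self.shadow = list(range(len(pages)))  # shadow page table
        self.current = None                    # current page table

    def begin(self):
        # At transaction start the two page tables are identical.
        self.current = list(self.shadow)

    def write(self, logical, value):
        # Copy-on-write: the new value goes to a fresh slot and only the
        # current page table is changed; the shadow table is untouched.
        self.disk.append(value)
        self.current[logical] = len(self.disk) - 1

    def read(self, logical):
        table = self.current if self.current is not None else self.shadow
        return self.disk[table[logical]]

    def commit(self):
        # The current page table becomes the shadow page table.
        self.shadow = self.current
        self.current = None

    def abort(self):
        # Discarding the current table restores the pre-transaction state.
        self.current = None
```

Because the shadow table still points at the old pages, an abort needs no undo work in this sketch.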
There are three general techniques for handling deadlock:
- timeouts (a transaction that requests a lock will wait for only a system-defined period of time. If the lock has not been granted within this period, the lock request times out) - deadlock prevention - deadlock detection and recovery * once deadlock has been detected, the only way to recover is to abort one or more of the transactions
Transaction records contain:
- transaction identifier - type of log record (transaction start, insert, update, delete, abort, commit) - identifier of data item affected by the database action (insert, delete, and update operations) - before-image of the data item, that is, its value before change (update and delete operations only) - after-image of the data item, that is, its value after change (insert and update operations only) - log management information, such as a pointer to previous and next log records for that transaction (all operations)
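One possible shape for such a record, together with the write-ahead rule that the log entry is appended before the database is changed. The field names, the `Log` class, and `logged_update` are assumptions for illustration; only the pointer to the previous record is kept here, not the pointer to the next one.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class LogRecord:
    txn_id: str                 # transaction identifier
    kind: str                   # "start", "insert", "update", "delete", "abort", "commit"
    item: Optional[str] = None  # identifier of the affected data item
    before: Any = None          # before-image (update and delete only)
    after: Any = None           # after-image (insert and update only)
    prev: Optional[int] = None  # index of this transaction's previous record


class Log:
    def __init__(self):
        self.records = []
        self.last = {}  # txn_id -> index of its most recent record

    def append(self, txn_id, kind, item=None, before=None, after=None):
        rec = LogRecord(txn_id, kind, item, before, after, self.last.get(txn_id))
        self.records.append(rec)
        self.last[txn_id] = len(self.records) - 1
        return rec


def logged_update(log, db, txn_id, item, value):
    # Write-ahead: the log record is written before the database write.
    log.append(txn_id, "update", item, before=db.get(item), after=value)
    db[item] = value
```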
Timestamping
- if a transaction attempts to read or write a data item, then the read or write is only allowed to proceed if the last update on that data item was carried out by an older transaction; otherwise the requesting transaction is restarted with a new timestamp - each data item contains a read_timestamp, giving the timestamp of the last transaction to read the item, and a write_timestamp, giving the timestamp of the last transaction to write (update) the item
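A minimal sketch of the basic timestamp ordering checks, assuming integer timestamps where a smaller value means an older transaction. `Item`, `Rollback`, `ts_read`, and `ts_write` are hypothetical names.

```python
class Rollback(Exception):
    """Raised when a transaction must be restarted with a new timestamp."""


class Item:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0   # timestamp of the last transaction to read
        self.write_ts = 0  # timestamp of the last transaction to write


def ts_read(item, ts):
    # A younger transaction has already written the item: too late to read.
    if ts < item.write_ts:
        raise Rollback("read arrives after a younger write")
    item.read_ts = max(item.read_ts, ts)
    return item.value


def ts_write(item, ts, value):
    # A younger transaction has already read or written the item.
    if ts < item.read_ts or ts < item.write_ts:
        raise Rollback("write arrives after a younger read or write")
    item.value = value
    item.write_ts = ts
```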
View serializability
A schedule is view serializable if it is view equivalent to a serial schedule
2PL
A transaction follows the two-phase locking protocol if all locking operations precede the first unlock operation in the transaction. According to the rules of this protocol, every transaction can be divided into two phases: - growing phase: in which it acquires all the locks needed but cannot release any locks - shrinking phase: in which it releases its locks but cannot acquire any new locks.
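The two-phase rule can be enforced with a single flag, as in this sketch. It tracks only one transaction's own lock set and does not model conflicts with other transactions; `TwoPhaseTxn` is a hypothetical name.

```python
class TwoPhaseTxn:
    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False  # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            # Acquiring a lock after any unlock would violate 2PL.
            raise RuntimeError(f"{self.name}: cannot lock {item} in shrinking phase")
        self.held.add(item)

    def unlock(self, item):
        self.held.remove(item)
        self.shrinking = True  # the growing phase is over
```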
Transaction
An action, or series of actions, carried out by a single user or application program, that reads or updates the contents of the database.
Deadlock
An impasse that may result when two (or more) transactions are each waiting for locks to be released that are held by the other.
ACID properties of a transaction
Atomicity, Consistency, Isolation, and Durability
used to improve database recovery - all modified buffer blocks, all log records, and a record identifying all active transactions are written to disk
Checkpoints
A schedule where the operations from a set of concurrent transactions are interleaved
Nonserial schedule
A schedule in which for each pair of transactions Ti and Tj, if Tj reads a data item previously written by Ti, then the commit operation of Ti precedes the commit operation of Tj.
Recoverable schedule
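The definition translates into a direct check over a schedule. The sketch assumes a schedule is a list of `(transaction, operation, item)` tuples with operations `"r"`, `"w"`, and `"c"` (commit), and that no transaction aborts; both are assumptions made for the example.

```python
def is_recoverable(schedule):
    last_writer = {}   # item -> transaction that last wrote it
    reads_from = {}    # Tj -> set of Ti whose writes Tj has read
    committed = set()
    for txn, op, item in schedule:
        if op == "w":
            last_writer[item] = txn
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif op == "c":
            # Every Ti that txn read from must already have committed.
            if any(w not in committed for w in reads_from.get(txn, ())):
                return False
            committed.add(txn)
    return True
```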
A sequence of the operations by a set of concurrent transactions that preserves the order of the operations in each of the individual transactions.
Schedule
A schedule where the operations of each transaction are executed consecutively without any interleaved operations from other transactions
Serial schedule
Locking
Shared lock: If a transaction has a shared lock on a data item, it can read the item but not update it. Exclusive lock: If a transaction has an exclusive lock on a data item, it can both read and update the item.
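These rules imply the usual compatibility matrix: shared locks can coexist, and an exclusive lock is compatible with nothing. A small sketch, using `"S"` and `"X"` as assumed labels for the two modes:

```python
# (mode already held by another transaction, mode requested) -> compatible?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


def can_grant(requested, held_modes):
    """Grant only if the request is compatible with every lock held by others."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)
```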
concurrency control
The process of managing simultaneous operations on the database without having them interfere with one another.
UNDO/REDO in context of a crash
UNDO the updates of transactions that had not committed at the time of the crash; REDO the updates of committed transactions
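A simplified recovery pass over a log, as a sketch only. It assumes immediate update, a strict schedule (so uncommitted and committed writes to the same item do not interleave), no checkpoints, and log entries shaped as `(txn, "update", item, before_image, after_image)` or `(txn, "start" | "commit")`. All of these shapes are assumptions.

```python
def recover(log, db):
    committed = {entry[0] for entry in log if entry[1] == "commit"}

    # REDO: replay the after-images of committed transactions, oldest first.
    for entry in log:
        if entry[1] == "update" and entry[0] in committed:
            _, _, item, _before, after = entry
            db[item] = after

    # UNDO: restore the before-images of uncommitted transactions, newest first.
    for entry in reversed(log):
        if entry[1] == "update" and entry[0] not in committed:
            _, _, item, before, _after = entry
            db[item] = before

    return db
```

Both passes are idempotent in this sketch, so running recovery again after a second crash gives the same result.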
Situation in which a single transaction leads to a series of rollbacks
cascading rollback *cascading rollbacks are undesirable, because they potentially lead to the undoing of a significant amount of work
A schedule that orders any conflicting operations in the same way as some serial execution
conflict serializability
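Conflict serializability can be tested by building a precedence graph and checking it for cycles. In this sketch a schedule is assumed to be a list of `(transaction, operation, item)` tuples with operations `"r"` and `"w"`; the function names are illustrative.

```python
def precedence_edges(schedule):
    """Edge Ti -> Tj for each pair of conflicting operations, Ti's first."""
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Two operations conflict if they come from different
            # transactions, touch the same item, and one is a write.
            if ti != tj and item_i == item_j and "w" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def is_conflict_serializable(schedule):
    edges = precedence_edges(schedule)
    nodes = {txn for txn, _, _ in schedule}
    indegree = {n: 0 for n in nodes}
    for _, dst in edges:
        indegree[dst] += 1
    # Topological sort: it consumes every node only if the graph is acyclic.
    ready = [n for n in nodes if indegree[n] == 0]
    ordered = 0
    while ready:
        node = ready.pop()
        ordered += 1
        for src, dst in edges:
            if src == node:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
    return ordered == len(nodes)
```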
a transaction obtains all its locks when it begins, or it waits until all the locks are available
conservative 2PL
a transaction updates a data item based on its old value, which is first read by the transaction
constrained write rule * under this rule, conflict serializability can be tested with a precedence (or serialization) graph
force policy
ensures that all pages updated by a transaction are immediately written to disk when the transaction commits. The alternative policy is no-force
Thomas's write rule
ignore obsolete write rule
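Thomas's write rule modifies the write check of timestamp ordering: a write that is older than the item's current write timestamp is skipped rather than causing a rollback. A sketch, assuming each item is a dictionary with `value`, `read_ts`, and `write_ts` keys (names chosen for the example):

```python
def thomas_write(item, ts, value):
    """Return "rollback", "ignored", or "written" for a write with timestamp ts."""
    if ts < item["read_ts"]:
        # A younger transaction has already read the item.
        return "rollback"
    if ts < item["write_ts"]:
        # Obsolete write: a younger transaction has already written the
        # item, so this write can safely be ignored.
        return "ignored"
    item["value"] = value
    item["write_ts"] = ts
    return "written"
```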
leave the release of all locks until the end of the transaction
rigorous 2PL
holds only exclusive locks until the end of the transaction
strict 2PL