Databases: Ch. 5 Transaction Management
List 3 causes of a rollback and examples for each
1) OS detects device failure (e.g. magnetic disk fails during execution of transaction, thus results cannot be written) 2) Database detects conflict between concurrent transactions (e.g. two airline customers attempt to reserve same seat on flight) 3) App logic detects unsuccessful database operation and conditionally runs code to reverse it via database
2 components of the definition of a transaction
1) Sequence of database operations 2) Must be either completed or rejected as a whole
Snapshot Isolation
A concurrency technique creating a private snapshot of data accessed by a transaction, effective for infrequent conflicts
Exclusive Lock
A lock that allows a transaction to read and write data
Shared Lock
A lock that allows a transaction to read but not write data
Two-Phase Locking
A locking technique ensuring serializable transactions with expand and contract phases
Nonrecoverable Schedule
A schedule in which one or more transactions cannot be rolled back
Cascading Schedule
A schedule where the rollback of one transaction forces the rollback of other transactions
Strict Schedule
A schedule where the rollback of one transaction never forces the rollback of other transactions
Deadlock
A state where a group of transactions are frozen, unable to proceed due to circular dependencies on locked data
All transactions must be:
ACID - Atomic - Consistent - Isolated - Durable
Issue with running transactions currently
Accessing the same data might conflict - subresults of one transaction might be used as operands for another, thus may provide faulty results for one or both transactions
Atomic transaction
All or none principle: all operations are executed and applied, and partial/incomplete results are rolled back End goal is full commit or full rollback - new state is fully committed or returned to prior state
Serializable Snapshot Isolation
An extension of snapshot isolation ensuring serializable schedules when the isolation level is set to SERIALIZABLE, supported in Postgres
Read Uncommitted
An isolation level allowing reads from uncommitted data, providing efficient processing but allowing various isolation violations
Read Committed
An isolation level allowing reads only from committed data and allowing nonrepeatable and phantom reads
Repeatable Read
An isolation level allowing reads only from committed data, but permitting phantom reads
Serializable
An isolation level where transactions run in a serializable schedule, ensuring full isolation from concurrent transactions
ACID (abbreviation)
Atomic, Consistent, Isolated, Durable
Fuzzy Checkpoint
Checkpoint that resumes processing while saving dirty blocks, improving database availability
Storage Media Failure
Corrupted database or lost connection; recovery involves copying the latest backup and executing system failure recovery process
Isolation Levels
Different levels like SERIALIZABLE, READ UNCOMMITTED, etc., set using SET TRANSACTION; balance between isolation and performance
List the 3 types of conflicts for a concurrent transaction
Dirty read, Nonrepeatable read, phantom read
Consistent transaction
Follow the rules of the database: all rules governing data are valid when transaction is committed - Committed transactions that violate the rules are rolled back - Applies to both universal rules (i.e. all relational data) and business rules (i.e. specific database/app)
Recovery Log Records
Four types - update, compensation, transaction, and checkpoint records; provide a complete history for recovery
How multi-core CPUs process multiple transactions
In parallel - e.g. a single processor can wait for one transaction to R/W, while it starts another transaction
Consequence of partial execution of transaction
Inconsistent or incorrect data
External resoruces
Laptops, email systems, files not managed by database
System Failure
Loss of main memory; recovery involves redo and undo phases; checkpoint records indicate saved data in main memory
RELEASE
Optional keyword with COMMIT; ends the current session and disconnects from the server
AND CHAIN
Optional keyword with COMMIT; starts a new transaction, overriding the autocommit setting
Redo Phase
Part of system failure recovery; restores transactions committed or rolled back since the last checkpoint by reading the recovery log forward
Undo Phase
Part of system failure recovery; rolls back transactions neither committed nor rolled back by reading the recovery log backward
Availability
Percentage of time a system is working; high availability is crucial for databases, e.g., stock exchange trades
Durable transaction
Permanently saves committed data in database, regardless of system failures
Lock
Permission for one transaction to read or write data, preventing concurrent transactions from accessing the same data
Savepoints
Points within a transaction where partial results are saved temporarily; managed with SAVEPOINT, RELEASE SAVEPOINT, and ROLLBACK TO statements
Isolated transaction
Processed without interference from other transactions as if each transaction were executed one at a time or serially, even if transactions are processed concurrently
Which two subsystems ACID properties are supported in
Recovery system and concurrency system
Rollback, the result of one, and when it's not possible
Rejecting an incomplete transaction - Reverses the transaction and resets data to initial values - Not possible after a commit
Transaction Failure
Results in rollback; caused by logical errors, deadlocks, or insufficient disk space; recovery restores changed data to original values
CHECKPOINT
SQL statement (supported by some databases) to force a checkpoint manually. Syntax varies among databases
BEGIN
SQL statement identical to START TRANSACTION; starts a new transaction
SET flush_time
SQL statement to change the default interval for fuzzy checkpoints (MySQL example)
COMMIT
SQL statement to commit the current transaction; one of the transaction boundary statements
SAVEPOINT
SQL statement to create a savepoint within a transaction; helps in restarting the transaction from a specific point in case of errors
RELEASE SAVEPOINT
SQL statement to discard a savepoint and its associated saved data within a transaction
SET flush
SQL statement to execute a fuzzy checkpoint after each SQL statement (MySQL example)
FLUSH
SQL statement to explicitly flush log records and dirty blocks. Syntax varies among databases
ROLLBACK TO
SQL statement to reset transaction data to the savepoint values, restart processing at the savepoint, and release subsequent savepoints
ROLLBACK
SQL statement to roll back the current transaction; one of the transaction boundary statements
SET TRANSACTION
SQL statement to set the isolation level for subsequent transactions. Syntax includes keywords like ISOLATION LEVEL, GLOBAL, and SESSION
START TRANSACTION
SQL statement to start a new transaction; one of the transaction boundary statements
Checkpoints
Save dirty blocks and log records; improve recovery time from system failure; syntax and procedures vary across databases
Commit
Saving complete transaction results in the database
"Action" - in context of the database, and how they can potentially violate ACID
Sequence of software operations that affect database OR external resources Actions may violate ACID properties since app logic has limited control over external resources
Recovery Log
Sequential record of all database operations stored separately from the database; used for restoring the database after a failure
Discuss concurrency system pros and cons of preventing conflicts, and ideal state of the system
Strict prevention of all conflicts = increased transaction duration and resource utilization - Ideally, database is configured for relaxed enforcement, producing greater efficiency with occasional violations of isolation
Recovery System
Supports atomic transactions by preventing partial transaction results from being saved and durable transactions by ensuring committed transactions are not lost due to failures
Recovery subsystem
System of database that enforces A, C and D transactions (A + D, C for both)
Concurrency subsystem
System of database that enforces I and C transactions (I for concurrency subsystem, C for both)
Hot Backup
Technique for storage media failure recovery; maintains a nearly synchronized secondary database; provides quick failover in seconds
Cold Backup
Technique for storage media failure recovery; periodic checkpoints and database copy to backup media; used when low availability is acceptable
Database action on a temporary vs. persistent database failure
Temporary: database attempts to restart the transaction - e.g. intermittent network problems Persistent: database 'kills' the transaction permanently - e.g. deleted back account
Lock Scope
The collection of data reserved by a lock, such as a row, block, table, or index
Concurrency - Locking
The management of concurrent transactions using locks to ensure data consistency and isolation levels
Failure Scenarios
Three failure scenarios managed by the recovery system: transaction failure, system failure, and storage media failure
Phantom read and example
Transaction T1 inserts or deletes a table row that transaction T2 is reading 1) T1 begins reading table rows 2) T2 inserts new row into table 3) T1 continues reading table rows - Depends on when T2 writes the new row, thus T1 result is unpredictable
Dirty read and example
Transaction T1 reads data that was updated in a second and uncommitted transaction T2, thus potentially creating invalid results 1) T2 updates data X 2) T1 reads X before T2 commits 3) T2 fails and is rolled back
Nonrepeatable read and example
Transaction T1 repeatedly reads changing data, with T1 assumption that data was stable and non-changing 1) T1 reads data X 2) T2 updates X 3) T1 rereads X