================ Transactions ================

How do transactions & recovery interact with real-world actions?
  Examples:
    Print message on printer
    Dispense $20 bill from ATM
    Move nuclear reactor rods down 2 cm
  Real-world operations cannot be undone
    Must delay real-world operations until after a transaction commits
  But what if you crash after committing?  Should you re-do the operation?
    Can never know for sure--can't atomically write disk and send message
  Possible solutions:
    Keep replay cache at device
      E.g., ATM remembers transaction ID
    Design devices such that operations are idempotent
      E.g., move reactor rods to absolute position 20cm, not relative
    Or design devices such that they are "testable"
      E.g., ask device its position before issuing relative command
      maybe ask ATM how much money it has dispensed

Physical vs. Logical logging
  Physical logging = value logging
    just says what data you wrote here
    Idempotent -- can write data multiple times with same effect
    But could get very large
      E.g., insert that requires block allocation and B-tree split
  Logical logging = operation logging
    logs operation
      E.g., Insert record in table
      With B-tree split, physical logging would require many log records
    Often logical undo operations are already operations on an object
      (Undo of table insert is delete operation, already implemented)
    Seems like a preferable solution--smaller logs, less code to implement...

Does System R work with logical logging?
  Recall recovery/rollback assumed system was in an ACTION-CONSISTENT state
  What if a table insert operation failed, for one of these reasons:
    Logical failure - key already exists in table, xaction started aborting
    Limit failure - an index file runs out of space, xaction aborted
    Contention failure - deadlock causes xaction abort
    Media failure - disk crash
    System failure - system crashes, need to recover
  Need to make sure that UNDO can undo partial actions during abort
  What about during recovery?
    This is where shadow paging is really useful
      Always checkpoint at action-consistent states
  But System R folks regretted implementing shadow paging.  Drawbacks?
    System must periodically stop initiating actions--not desirable
    Long actions must be designed to "come up for air" periodically
    What if someone is waiting for a lock inside an action?  Must abort
    Requires locking at page granularity
      Very bad for tails of logs and interior B-tree nodes

Would logical logging work with write-ahead logging in IMS (sec. 3.8)?
  No, because it requires undo/redo to be restartable

Compromise: "Physiological" logging
  a.k.a. physical-to-a-page logical-within-a-page logging
  Allow two kinds of log messages: page actions & message actions
    Message actions record I/O [was missing from System R]
    Page actions record modifications to a particular page
      Can be viewed as atomic/consistent "mini-transactions" to page
  After restart, system is page-action and message-action consistent
    From there can transform it into transaction-consistent state

What about transactions in a distributed system?
  Two-phase commit
    1a. Coordinator sends VOTE-REQUEST
    1b. Participant votes with VOTE-COMMIT or VOTE-ABORT
    2a. Coordinator broadcasts GLOBAL-COMMIT or GLOBAL-ABORT
          If everyone voted VOTE-COMMIT
            coordinator durably records transaction
            coordinator broadcasts GLOBAL-COMMIT
          If anyone voted VOTE-ABORT
            coordinator broadcasts GLOBAL-ABORT
    2b. Participants that voted VOTE-COMMIT wait for decision by coordinator
  Problem: nodes crash, coordinators crash
  Partial solution: Nodes can communicate with each other
    - If anyone didn't hear VOTE-REQUEST, all can abort
    - If anyone heard GLOBAL-COMMIT, all can commit
    - If no one is in the above two cases, must wait for coordinator to recover
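
Sketch of the two-phase commit decision rules above, in Python. This is a minimal illustration, not a full protocol: there is no networking, logging, or timeout handling. The names State, coordinator_decision, and resolve_without_coordinator are hypothetical. Two assumptions go beyond the notes: an empty vote list is treated as an abort, and a peer known to have aborted lets everyone else abort (the coordinator cannot have committed without its VOTE-COMMIT).

```python
from enum import Enum


class State(Enum):
    """What one participant knows about the transaction (hypothetical names)."""
    INIT = "never heard VOTE-REQUEST"
    READY = "voted VOTE-COMMIT, waiting for the coordinator's decision"
    ABORT = "voted VOTE-ABORT or heard GLOBAL-ABORT"
    COMMIT = "heard GLOBAL-COMMIT"


def coordinator_decision(votes):
    """Step 2a: commit only if every participant voted VOTE-COMMIT."""
    # Assumption: no votes at all is treated as an abort.
    if votes and all(v == "VOTE-COMMIT" for v in votes):
        # A real coordinator durably records the transaction here,
        # before broadcasting the decision.
        return "GLOBAL-COMMIT"
    return "GLOBAL-ABORT"


def resolve_without_coordinator(peer_states):
    """What reachable participants can conclude while the coordinator is down."""
    if State.COMMIT in peer_states:
        # Someone heard GLOBAL-COMMIT, so the decision was commit.
        return "GLOBAL-COMMIT"
    if State.INIT in peer_states or State.ABORT in peer_states:
        # Someone never voted VOTE-COMMIT, so commit cannot have been decided.
        return "GLOBAL-ABORT"
    # Everyone voted VOTE-COMMIT and nobody heard the outcome: blocked.
    return "WAIT"


if __name__ == "__main__":
    print(coordinator_decision(["VOTE-COMMIT", "VOTE-ABORT"]))
    print(resolve_without_coordinator([State.READY, State.READY]))
```

The "WAIT" result is the blocking case from the last bullet: if every reachable participant is in the READY state, none of them can tell whether the coordinator recorded a commit before crashing.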