Transaction & Concurrency Control (TCC)

Ladies and gentlemen, good evening. Thank you for your attention and for the opportunity given to us. Allow our group to present material related to distributed systems under the title "Transaction & Concurrency Control". Here is what we have been able to understand so far.


Getting to Know Transaction & Concurrency Control

Subject:

  1. Abstract
  2. Introduction
  3. Transactions
  4. Nested transactions
  5. Locks
  6. Optimistic concurrency control
  7. Timestamp ordering
  8. Comparison of methods for concurrency control.

1. Abstract

This chapter will discuss the application of transactions and concurrency control to shared objects managed by servers.

A transaction defines a series of server operations that the server guarantees are atomic in the face of multiple clients and server crashes.

Nested transactions (transactions within transactions) are composed of sets of other transactions. They are particularly useful in distributed systems because they allow additional concurrency.

All concurrency control protocols are based on the criterion of serial equivalence and are derived from rules for resolving conflicts between operations. They are described in the following three methods:

Locks are used to order transactions that access the same object according to the order in which their operations arrive (we think of it as taking queue numbers at the same counter, i.e. the same object).

Talking about serial equivalence: according to the online KBBI (the Indonesian dictionary), 'equivalent' means having the same value (size, meaning, or effect); comparable. This is reflected in the methods used by each transaction: in Fig. 1, transactions T and U both use the getBalance(), setBalance(), and withdraw() methods, which means that both transactions have the same kind of effect on the shared resource. 'Serial' means consecutive or sequential, which is reflected in the pattern of the two transactions' operations: they take turns in an ordered sequence.


Fig. 1 - A serially equivalent interleaving of T and U
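
To make this concrete, the sketch below re-enacts, in a single thread, the kind of serially equivalent interleaving shown in Fig. 1. It is only our reconstruction, using a hypothetical local Account class rather than the remote interface of Fig. 3; the exact amounts and operations in the figure may differ.

// A minimal, sequential re-enactment of a serially equivalent interleaving.
// Account here is a hypothetical local class, not the remote interface of Fig. 3.
class Account {
    private double balance;
    Account(double balance) { this.balance = balance; }
    double getBalance() { return balance; }
    void setBalance(double b) { balance = b; }
    void withdraw(double amount) { balance -= amount; }
    void deposit(double amount) { balance += amount; }
}

public class SerialEquivalenceDemo {
    public static void main(String[] args) {
        Account a = new Account(100), b = new Account(200), c = new Account(300);

        // --- Transaction T: operations on B first ---
        double balT = b.getBalance();
        b.setBalance(balT * 1.1);
        // --- Transaction U: its operations on B interleave here ---
        double balU = b.getBalance();
        b.setBalance(balU * 1.1);
        // --- T finishes on A, U finishes on C ---
        a.withdraw(balT / 10);
        c.withdraw(balU / 10);

        // The effect is the same as running T completely before U, because every
        // pair of conflicting operations (those on the shared account B) occurs
        // in the same order: T before U.
        System.out.println(a.getBalance() + " " + b.getBalance() + " " + c.getBalance());
    }
}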

Optimistic concurrency control allows transactions to proceed until they are ready to commit, at which point a check is made to see whether their operations conflict with other transactions' operations on the same objects. We understand this idea from working with GitHub.


Fig. 2 - Transaction Before Commit On Github Desktop

Timestamp ordering uses timestamps to order transactions that are accessing the same object, according to their start time.

A timestamp is the current time recorded by a computer, kept accurate through a mechanism such as the Network Time Protocol (NTP). This accuracy allows computers and online applications to communicate and coordinate effectively.

The goal of transactions is to ensure that all of the objects managed by a server remain in a consistent state when they are accessed by multiple transactions and when the server crashes.

Recoverable objects are objects that can be recovered after a crash.

2. Introduction

A transaction performed by a client is a sequence of operations on objects (accesses to objects) that is treated as an indivisible unit by the server that manages those objects.

The server must ensure that either the whole transaction is carried out and its results saved to permanent storage or, if it is interrupted by a crash, its effects are completely removed.

A client's transaction is also isolated from those of other clients, in the sense that the operations of one transaction must not be able to observe the partial effects of the operations of another transaction.

To illustrate some of the points in this chapter, we use a banking example, shown in Fig. 3, where each account is represented by a remote object whose interface, Account, provides operations for making deposits and withdrawals and for reading and setting the balance. Each branch of the bank is represented by a remote object whose interface, Branch, provides operations for creating new accounts, looking up accounts by name, and reading the total funds at the branch.


Fig. 3 - Banking Methods

2.1. Simple synchronization (without transactions)

One issue that often arises with transactions is interference between the operations of different clients, which can lead to incorrect object values.

Atomic operations at the server: using multiple threads is very beneficial for a server because it allows operations from multiple clients to run concurrently and access the same objects. Therefore, object methods must be designed for use in a multi-threaded context.

Atomic operations are operations that are free from interference by concurrent operations running in other threads.

For example, a method can be written in a class so that access to the object is limited to one thread at a time:

public synchronized void deposit(int amount) throws RemoteException {
    // add amount to the account's balance (assumes a balance field in the account class)
    balance = balance + amount;
}

If one thread invokes a synchronized method on an object, the object is locked, so another thread that invokes one of the object's synchronized methods is blocked until the lock is released. This synchronization ensures that only one thread at a time executes inside the object, so that its variables are accessed consistently.

Java's synchronized methods are one way of achieving such atomic operations.

In other programming languages, too, operations on objects in a multi-threaded server environment must still be made atomic in order to keep the objects consistent.

Synchronization

Synchronization is the process of controlling several processes that run at the same time. It aims to avoid data inconsistency caused by access from several different processes (mutual exclusion) and to regulate the order in which those processes run, so that they proceed smoothly and avoid deadlock or starvation.

Enhancing client cooperation by synchronizing server operations: clients can use a server to share resources. One client may update an object on the server with one operation, while other clients access the same object with other operations.

For example, there are situations in which one client cannot complete its operation until an operation performed by another client has finished. This is the case when one client is a producer and another is a consumer: the client that needs the resource (the consumer) must wait until the producing client has completed its operation.

The Java wait() & notify() Methods

In Java programming, the 2 methods above are used to solve communication problems between threads.

For example, when thread A calls the wait() method on an object, it releases the object's lock and is suspended, so other threads can use the object's synchronized methods. The notify() method is called to alert a waiting thread that the object's state has changed. Both methods affect the lock status of the object.
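
As an illustration, here is a minimal producer-consumer sketch using wait() and notifyAll() (a safer variant of notify()); the Slot class and its names are our own invention for illustration and are not part of the chapter's banking example.

// Minimal producer-consumer coordination with wait()/notifyAll().
// A consumer thread blocks in take() until a producer calls put().
class Slot {
    private Integer value;  // null means "empty"

    public synchronized void put(int v) throws InterruptedException {
        while (value != null) {
            wait();               // releases the lock while waiting for the slot to empty
        }
        value = v;
        notifyAll();              // wake threads waiting in take()
    }

    public synchronized int take() throws InterruptedException {
        while (value == null) {
            wait();               // releases the lock while waiting for a value
        }
        int v = value;
        value = null;
        notifyAll();              // wake threads waiting in put()
        return v;
    }
}

public class ProducerConsumerDemo {
    public static void main(String[] args) {
        Slot slot = new Slot();
        Thread consumer = new Thread(() -> {
            try {
                System.out.println("consumed " + slot.take());
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        consumer.start();
        try {
            slot.put(42);         // the waiting consumer is notified and resumes
        } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}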

2.2. Failure model for transactions

Lampson [1981a] proposed a model describing distributed transactions in the presence of disk, server, and communication failures. The model claims that its algorithms work correctly for the failures it covers, but it makes no claims about the behaviour of objects when an unanticipated disaster occurs. The failures covered can be summarized as follows:

  • Failures when writing to permanent storage,
  • A server crashing at any time,
  • Message failures: messages may be delayed, duplicated, or corrupted.

The design of stable systems is based on these failure models for permanent storage, processors, and communication.

In practice, stable storage provides atomic write operations in the presence of faults, whether a fault in the write operation itself or a process failure. Stable processors use stable storage to enable recovery after a crash.

3. Transactions

In many situations, a client requires a sequence of separate requests to a server to be handled in such a way that:

  • the client is free from interference by the operations of other clients running concurrently, and
  • either all of the operations are completed, or they have no effect at all if the server crashes.

Transactions originate in database management systems, where a transaction is the execution of a program that accesses a database. Transactions were introduced to distributed systems in the form of transactional file servers such as XDFS [Mitchell and Dion 1982], where, in that context, a transaction is defined as the execution of a sequence of client requests for file operations.

Transactions on distributed objects are provided by several systems, such as Argus [Liskov 1988] and Arjuna [Shrivastava et al. 1991]. From the client's perspective, a transaction is a sequence of operations that forms a single step, transforming the server's data from one consistent state to another.

Atomic transactions

All or nothing: a transaction either completes successfully or has no effect at all. This all-or-nothing property has two aspects:

  • Failure atomicity: the effects are atomic even when the server crashes.
  • Durability: after a transaction has completed successfully, its effects are saved in permanent storage and survive later server crashes.

Isolation is a separate requirement: each transaction must be carried out without interference from other transactions.

Requirements for failure atomicity and durability:

  • Objects must be recoverable,
  • Any committed changes to objects must be saved to permanent storage.

Servers that support transactions must synchronize their operations so that the isolation requirement is met.

  • One way to do this is to perform transactions serially, one at a time.
  • Unfortunately, serial execution is unacceptable for servers whose resources are shared among multiple users.

The goal of a server that supports transactions is to maximize concurrency. Transactions are therefore allowed to execute concurrently as long as their combined effect is the same as a serial execution.

3.1 Concurrency control

The main problems and concepts are:

  • The lost update problem: an update is lost when a transaction overwrites a value without taking into account an update made by another transaction at almost the same time (see the sketch after this list).
  • Inconsistent retrievals: a transaction retrieves values while another transaction is part-way through updating them, and so obtains an inconsistent view.
  • Serial equivalence: the correctness criterion for concurrent execution, used to avoid lost updates and inconsistent retrievals.
  • Conflicting operations: two operations conflict if their combined effect depends on the order in which they are executed.
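
To make the lost update problem concrete, here is a small, self-contained Java sketch (our own illustration, not part of the chapter's banking interfaces): two threads each perform an unsynchronized read-modify-write on the same balance, so one of the two deposits can be lost. Making the update method synchronized, or running the two updates as serially equivalent transactions, removes the problem.

// Demonstrates how a lost update can occur when two clients do an
// unsynchronized read-modify-write on the same shared balance.
public class LostUpdateDemo {
    private long balance = 100;

    // Deliberately NOT synchronized: the read and the write can interleave.
    void unsafeDeposit(long amount) {
        long read = balance;          // read
        Thread.yield();               // encourage an unlucky interleaving
        balance = read + amount;      // write based on a possibly stale read
    }

    public static void main(String[] args) throws InterruptedException {
        LostUpdateDemo account = new LostUpdateDemo();
        Thread t = new Thread(() -> account.unsafeDeposit(50));
        Thread u = new Thread(() -> account.unsafeDeposit(50));
        t.start(); u.start();
        t.join(); u.join();
        // Expected 200; may print 150 if one update was lost.
        System.out.println("final balance = " + account.balance);
    }
}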

Serial equivalence 

Often used as a concurrency control protocol criterion.

Basically, concurrency control can be achieved by forcing a client to wait for another client to complete its operation or by restarting a client's transaction once a conflict is detected, or a combination of both.

3.2 Recoverability from aborts

The server must record all of the effects of committed transactions and none of the effects of aborted transactions. It must also ensure that an aborted transaction does not disrupt other transactions that are in progress.


Problems related to transaction aborts:

  • A "dirty read" is caused by the interaction between a read operation in one transaction and an earlier write operation on the same object by a transaction that is later aborted.
  • "Premature writes" arise from the interaction between write operations on the same object by different transactions, one of which is aborted.

Both can occur during transaction execution.

"dirty read"

Things to note regarding dirty reads:

Recoverability of transactions:

  • Once a transaction that has performed a dirty read commits, the situation can no longer be repaired.
  • Dirty reads therefore need to be detected early and prevented.
  • Prevention can be achieved by delaying a transaction's commit until every transaction whose updates it has read has committed or aborted.

Cascading aborts:

  • Problems can propagate when a transaction aborts.
  • Any other transaction that has seen the effects of the aborted transaction must also be aborted.
  • To avoid cascading aborts, transactions are only allowed to read objects written by committed transactions. To ensure this, any read operation must be delayed until other transactions that have written to the same object have committed or aborted.

"premature writes"

Other things to note regarding recoverability issues:

Strict executions of transactions: in general, transactions need to delay both their read and their write operations to prevent dirty reads and premature writes. The execution of transactions is called 'strict' if read and write operations on an object are delayed until all transactions that previously wrote that object have committed or aborted.

Tentative versions: servers of recoverable objects must be designed so that any updates made by transactions that are later aborted can be deleted. To achieve this, all of the updates made during a transaction are stored in the transaction's own tentative versions of the objects, held in volatile memory. Each transaction is provided with its own set of tentative versions of the objects it updates.

The tentative versions are transferred to the objects only when the transaction commits, at which point they are also written to permanent storage. When a transaction aborts, its tentative versions are simply deleted.
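
The idea of tentative versions can be sketched in a few lines of Java. The VersionedStore and TransactionContext names below are our own invention: updates go into a per-transaction map held in memory, are copied into the committed state on commit, and are simply discarded on abort. A real server would additionally write the committed values to permanent storage, which this sketch omits.

import java.util.HashMap;
import java.util.Map;

// Sketch of tentative versions: each transaction buffers its updates privately
// and the server applies them to the committed state only on commit.
class VersionedStore {
    private final Map<String, Integer> committed = new HashMap<>();

    class TransactionContext {
        private final Map<String, Integer> tentative = new HashMap<>();

        int read(String key) {
            // Reads see this transaction's own tentative value if one exists.
            return tentative.getOrDefault(key, committed.getOrDefault(key, 0));
        }

        void write(String key, int value) {
            tentative.put(key, value);       // recorded only in the tentative version
        }

        void commit() {
            committed.putAll(tentative);     // tentative versions become permanent
            tentative.clear();
        }

        void abort() {
            tentative.clear();               // aborting just discards the tentative versions
        }
    }

    TransactionContext begin() { return new TransactionContext(); }
}

public class TentativeVersionDemo {
    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        VersionedStore.TransactionContext t = store.begin();
        t.write("A", 150);
        System.out.println(t.read("A"));              // 150, visible only inside the transaction
        t.abort();                                    // the update disappears
        System.out.println(store.begin().read("A"));  // 0 again
    }
}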

4. Nested Transactions

Nested transactions are transactions composed of other transactions. The outermost transaction in a set of nested transactions is called the top-level transaction; the others are called subtransactions.


Fig. 4 - Nested transactions

Subtransactions at the same level can run concurrently, but their access to common objects is serialized, for example by a locking scheme. When we want to distinguish the original form of transaction from the nested form, the original (un-nested) form is called a flat transaction.

Advantages of nested transactions:

  • Sub transactions at the same level can work concurrently. When sub transactions are executed on different servers they can work in parallel.
  • Sub transactions can commit or abort independently.

Nested transaction commitment rules:

  1. A transaction may commit or abort only after its child transactions have completed.
  2. When a subtransaction completes, it makes an independent decision either to commit provisionally or to abort.
  3. When the top-level transaction aborts, its child transactions are automatically aborted as well.
  4. When a child transaction aborts, the parent decides whether to abort or to continue.

Example of nested transactions:

Transfer $100 from B to A
a.deposit(100)
b.withdraw(100)

From the example above, the top-level Transfer transaction is divided into two subtransactions, deposit and withdraw. When both subtransactions have committed provisionally, the top-level transaction can commit as well. But if the top-level transaction, the transfer, is aborted, then the subtransactions (deposit and withdraw) are aborted too.
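
The commitment rules above can be sketched as follows, using an invented NestedTransaction class. The sketch only models the states and the rules; it does not perform any real operations, and a full implementation would also make provisionally committed descendants permanent when the top-level transaction commits.

import java.util.ArrayList;
import java.util.List;

// Sketch of the nested-transaction commitment rules.
public class NestedTransaction {
    enum State { ACTIVE, PROVISIONALLY_COMMITTED, COMMITTED, ABORTED }

    private final NestedTransaction parent;          // null for the top-level transaction
    private final List<NestedTransaction> children = new ArrayList<>();
    private State state = State.ACTIVE;

    NestedTransaction(NestedTransaction parent) {
        this.parent = parent;
        if (parent != null) parent.children.add(this);
    }

    // Rule 2: a subtransaction decides independently to commit provisionally or abort.
    // Rule 1: it may only do so after its own children have finished.
    void provisionalCommit() {
        requireChildrenFinished();
        state = (parent == null) ? State.COMMITTED : State.PROVISIONALLY_COMMITTED;
    }

    // Rule 3: aborting a transaction aborts its subtransactions as well.
    void abort() {
        state = State.ABORTED;
        for (NestedTransaction child : children) {
            if (child.state != State.ABORTED) child.abort();
        }
    }
    // Rule 4 is a policy decision: when a child aborts, its parent may choose
    // whether to abort as well or to continue without that child's effects.

    private void requireChildrenFinished() {
        for (NestedTransaction child : children) {
            if (child.state == State.ACTIVE) {
                throw new IllegalStateException("children must finish before the parent commits");
            }
        }
    }

    public static void main(String[] args) {
        NestedTransaction transfer = new NestedTransaction(null);    // top level: Transfer
        NestedTransaction deposit  = new NestedTransaction(transfer);
        NestedTransaction withdraw = new NestedTransaction(transfer);
        deposit.provisionalCommit();
        withdraw.provisionalCommit();
        transfer.provisionalCommit();                  // top level commits once children are done
        System.out.println(transfer.state + " " + deposit.state + " " + withdraw.state);
    }
}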

5. Locks


Fig. 5 - Example of lock operation

In the example above there are two transactions, T and U; transaction T involves objects A and B, while transaction U involves objects B and C. Transaction T starts first, so it locks objects B and A. While these locks are held, transaction U must wait until transaction T releases object B; only then can transaction U lock object B and continue its work.

"strict two-phases locking"

A transaction can abort, so strict execution is needed to prevent "dirty reads" and "Premature Writes". At this time a transaction that will read or write an object must be postponed until other transactions that are active with the object complete / cancel their activities. This situation is called "strict two-phases locking".

The conflict rules for locks can therefore be summarized as:

  1. If transaction T is reading an object, transaction U must not write that object until T commits or aborts.
  2. If transaction T is writing an object, transaction U must not read or write that object until T commits or aborts.

Use of strict two-phase locking mechanism 

1. When an operation accesses an object within a transaction:

  • If the object is not already locked, it is locked and the operation proceeds.
  • If the object has a conflicting lock set by another transaction, the transaction must wait until that lock is released.
  • If the object has a non-conflicting lock set by another transaction, the lock is shared and the operation proceeds.
  • If the object has already been locked by the same transaction, the lock is promoted if necessary and the operation proceeds (promotion is subject to the conflict rules above).

2. When a transaction commits or is aborted, the server releases all of the locks that it set for that transaction. A sketch of these rules follows.
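
The following sketch shows one way these rules could be implemented, using an invented LockManager class with per-object read/write locks: non-conflicting (read) locks are shared, a lock already held by the requesting transaction is promoted when needed, conflicting requests wait, and all of a transaction's locks are released when it ends.

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of the strict two-phase locking rules described above.
public class LockManager {
    enum LockType { READ, WRITE }

    private static class Lock {
        LockType type = LockType.READ;
        final Set<String> holders = new HashSet<>();   // ids of transactions holding the lock
    }

    private final Map<String, Lock> locks = new HashMap<>();

    // Blocks (waits) until the requested lock can be granted to the transaction.
    public synchronized void acquire(String txId, String objectId, LockType wanted)
            throws InterruptedException {
        while (true) {
            Lock lock = locks.computeIfAbsent(objectId, k -> new Lock());
            if (lock.holders.isEmpty()) {                       // not locked: set the lock
                lock.type = wanted;
                lock.holders.add(txId);
                return;
            }
            boolean onlyHolder = lock.holders.size() == 1 && lock.holders.contains(txId);
            if (onlyHolder) {                                   // already locked by this transaction
                if (wanted == LockType.WRITE) lock.type = LockType.WRITE;   // promotion
                return;
            }
            if (lock.type == LockType.READ && wanted == LockType.READ) {
                lock.holders.add(txId);                         // non-conflicting lock: share it
                return;
            }
            wait();                                             // conflicting lock: wait for release
        }
    }

    // Called when the transaction commits or aborts: release every lock it holds.
    public synchronized void releaseAll(String txId) {
        locks.values().forEach(lock -> lock.holders.remove(txId));
        locks.values().removeIf(lock -> lock.holders.isEmpty());
        notifyAll();                                            // waiting transactions retry
    }

    public static void main(String[] args) throws InterruptedException {
        LockManager lm = new LockManager();
        lm.acquire("T", "B", LockType.READ);
        lm.acquire("U", "B", LockType.READ);   // non-conflicting: the read lock is shared
        lm.releaseAll("U");
        lm.acquire("T", "B", LockType.WRITE);  // T is now the only holder: promotion succeeds
        lm.releaseAll("T");
        System.out.println("done");
    }
}

A transaction would call acquire() before each read or write and releaseAll() when it commits or aborts; as with any locking scheme, transactions using this sketch can still deadlock (see Section 5.1).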

5.1. Deadlock

The risk of using locks is deadlock, which can happen when transactions wait for each other's locks to be released, as illustrated in Fig. 6.


Fig. 6 - Deadlock

It can be concluded that deadlock is a condition in which each member of a group of transactions is waiting for another member to release a lock, as in the situation above; such a situation is represented by a cycle in a wait-for graph.

Prevention

One preventive solution is to lock all of the objects a transaction will use when it starts, so that the transaction can never reach a deadlock state with other transactions. The obstacle is that it is often impossible to predict which objects a transaction will use. This preventive method reduces the potential for deadlock, but it has the side effects of premature locking and reduced concurrency.

Deadlock detection

Deadlock can be detected by finding cycles in the wait-for graph. Once a cycle has been found, one transaction in it is selected and aborted to break the cycle.

In addition to cycle detection, timeouts can be used, giving each lock or transaction a time limit: if a deadlock occurs, the timeout eventually expires and one of the transactions involved in the deadlock is aborted.
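
A small sketch of deadlock detection by cycle-finding in the wait-for graph follows; the class and method names are our own. A real server would, after detecting the cycle, choose a victim transaction in it and abort that transaction.

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch: detect deadlock by looking for a cycle in the wait-for graph.
// An edge T -> U means "transaction T waits for a lock held by U".
public class WaitForGraph {
    private final Map<String, Set<String>> waitsFor = new HashMap<>();

    void addEdge(String waiter, String holder) {
        waitsFor.computeIfAbsent(waiter, k -> new HashSet<>()).add(holder);
    }

    // Depth-first search; a node revisited on the current path means a cycle (deadlock).
    boolean hasDeadlock() {
        Set<String> visited = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String tx : waitsFor.keySet()) {
            if (dfs(tx, visited, onPath)) return true;
        }
        return false;
    }

    private boolean dfs(String tx, Set<String> visited, Set<String> onPath) {
        if (onPath.contains(tx)) return true;        // cycle found
        if (!visited.add(tx)) return false;          // already fully explored
        onPath.add(tx);
        for (String next : waitsFor.getOrDefault(tx, Set.of())) {
            if (dfs(next, visited, onPath)) return true;
        }
        onPath.remove(tx);
        return false;
    }

    public static void main(String[] args) {
        WaitForGraph wfg = new WaitForGraph();
        wfg.addEdge("T", "U");   // T waits for U
        wfg.addEdge("U", "T");   // U waits for T  -> deadlock
        System.out.println("deadlock? " + wfg.hasDeadlock());
    }
}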

5.2. Concurrency Improvements in Locking Schemes

  1. Two-version locking
  2. Hierarchic locks

1. Two-version locking

Basically, in this method read operations must wait only while another transaction is committing the same object. The scheme allows more concurrency than ordinary read-write locks. However, a transaction cannot commit its write operations immediately if other, not yet completed, transactions have read the same object.

2. Hierarchic locks

In hierarchic locking, setting a lock at a parent level has the same effect as setting equivalent locks on all of its children. Any node in the hierarchy can be locked, giving the owner of the lock explicit access to that node and implicit access to its children. For example, a read/write lock on a branch implicitly sets read/write locks on every account in that branch.

Hierarchic locks have the advantage of reducing the number of locks required when mixed-granularity locking is needed. They also allow each transaction to lock a portion whose size is chosen according to its needs.

6. Optimistic Concurrency Control

Kung and Robinson identified a number of inherent drawbacks of locking and proposed an optimistic approach to transaction serialization that avoids these drawbacks. The drawbacks include:

  • Lock maintenance represents an overhead that is not present in systems that do not support concurrent access to shared data; even read-only transactions must, in general, set locks as a precaution, to guarantee that the data they read is not modified at the same time.
  • The use of locks can result in deadlock.
  • To avoid cascading aborts, locks cannot be released until the end of the transaction, which can significantly reduce concurrency.

The alternative approach proposed by Kung and Robinson is called 'optimistic' because it is based on the observation that the probability of two clients' transactions accessing the same data is low. When a conflict does occur, some transaction is generally aborted and must be restarted by the client.

Transaction Phase

  1. Working phase: during this phase, each transaction keeps a tentative version of each object that it updates, so that the transaction can abort during the working phase, or fail validation, without affecting the real objects. Read operations are performed immediately; if a tentative version for that transaction already exists, the read accesses it. Write operations record the new values as tentative values. When there are several concurrent transactions, several different tentative values for the same object may exist.
  2. Validation phase: when the closeTransaction request is received, the transaction is validated to establish whether its operations on objects conflict with the operations of other overlapping transactions. If the validation succeeds, the transaction can commit; if it fails, some form of conflict resolution must be used, which generally means aborting this transaction or a conflicting one.
  3. Update phase: once a transaction has been validated, all of the changes recorded in its tentative versions are made permanent.

Transaction validation

Validation uses the read-write conflict rules to ensure that the schedule of a particular transaction is serially equivalent with respect to all other overlapping transactions. To aid validation, each transaction is assigned a transaction number when it enters the validation phase; the number becomes permanent only if the transaction is validated and commits. Transaction numbers are integers assigned in ascending order, so a transaction always finishes its working phase after all transactions with smaller numbers.


Fig. 7 - Transaction validation

For example, if a transaction is given the number Ti, it always precedes the transaction with the number Tj if i < j.

Backward validation

Since all of the read operations of earlier overlapping transactions were performed before the validation of Tv started, they cannot be affected by the writes of Tv. The validation of transaction Tv checks whether its read set overlaps with the write sets of earlier overlapping transactions Ti. If there is any overlap, the validation fails.

In backward validation, the read set of the transaction being validated is compared with the write sets of transactions that have already committed, so the only way to resolve a conflict is to abort the transaction being validated.

Optimistic concurrency control with backward validation requires that the write sets of old committed versions of objects be retained until there are no unvalidated overlapping transactions left that could conflict with them.
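
The backward validation check itself is simple to sketch; the method below (our own illustration) compares the read set of the transaction being validated with the write sets of overlapping transactions that have already committed, and reports failure on any overlap, in which case the transaction being validated must be aborted.

import java.util.List;
import java.util.Set;

// Sketch of backward validation in optimistic concurrency control.
public class BackwardValidation {

    // Compare the read set of the transaction being validated (Tv) with the
    // write sets of committed transactions that overlap with it.
    static boolean validate(Set<String> readSetOfTv, List<Set<String>> writeSetsOfOverlappingCommitted) {
        for (Set<String> committedWriteSet : writeSetsOfOverlappingCommitted) {
            for (String object : committedWriteSet) {
                if (readSetOfTv.contains(object)) {
                    return false;          // conflict: Tv read an object a committed transaction wrote
                }
            }
        }
        return true;                       // no overlap: Tv may commit
    }

    public static void main(String[] args) {
        Set<String> readSet = Set.of("A", "B");
        List<Set<String>> overlapping = List.of(Set.of("C"), Set.of("B"));  // "B" conflicts
        System.out.println("valid? " + validate(readSet, overlapping));     // false -> abort Tv
    }
}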

Forward validation

In forward validation of a transaction Tv, the write set of Tv is compared with the read sets of all overlapping transactions that are still active (still in their working phase). Forward validation must allow for the fact that the read sets of active transactions may change while validation is taking place. When Tv conflicts with an active transaction, we can either abort the transaction being validated or resolve the conflict in another way. Härder suggests several alternatives:

  • Defer the validation until the conflicting active transactions have finished.
  • Abort all the conflicting active transactions and commit the transaction being validated.
  • Abort the transaction being validated.

Backward and forward comparison

Forward validation allows flexibility in how conflicts are resolved, whereas backward validation has only one option: aborting the transaction being validated. On the other hand, backward validation requires old write sets to be retained until they are no longer needed, whereas forward validation has to allow for new transactions starting while the validation process is taking place.

Starvation

An aborted transaction is restarted by the client program, but there is no guarantee that it will not fail validation again: it may conflict with other transactions over the same objects each time it is restarted.

The prevention of a transaction from ever being able to commit is called starvation; it should be a rare occurrence.

7. Timestamp Ordering

In concurrency control based on timestamp ordering, each operation of a transaction is validated when it is carried out. If an operation cannot be validated, the transaction is immediately aborted and can then be restarted by the client program. Each transaction is given a unique timestamp value when it starts; the timestamp defines its position in the time ordering of transactions, so requests from transactions can be ordered by their timestamps.

The timestamp ordering rules are designed to ensure that each transaction accesses a consistent set of values of the objects. They must also ensure that the tentative versions of each object are committed in the order determined by the timestamps of the transactions that created them.

As usual, write operations are recorded in tentative versions of objects and are invisible to other transactions until closeTransaction is called and the transaction commits. Each object carries a write timestamp, a set of tentative versions with their own write timestamps, and a set of read timestamps; the write timestamp of the committed object is earlier than that of any of its tentative versions, and the set of read timestamps can be represented by its maximum member. In timestamp ordering, every request by a transaction to read or write an object is checked against the conflict rules; as a result, when a coordinator receives a request to commit a transaction, it can always do so, because every operation has already been checked for consistency with earlier transactions.

The timestamp method prevents deadlock, but it is prone to restarting transactions. A modification known as the "ignore obsolete write" rule reduces the number of aborts: a write operation that arrives too late (the object has already been written by a transaction with a later timestamp) can simply be ignored rather than aborting the transaction, since its effects would have been overwritten anyway.
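
A minimal sketch of the basic timestamp ordering checks for a single object follows, using one common formulation of the rules (a read is rejected if the object was written by a later transaction; a write is rejected if the object was read or written by a later transaction). Tentative versions and commit handling are omitted, and the class is our own illustration. Under the "ignore obsolete write" rule, the late write in the last line of main would simply be skipped instead of forcing an abort.

// Sketch of basic timestamp ordering checks for one object (no tentative versions).
public class TimestampOrderedObject {
    private long readTimestamp = 0;    // largest timestamp of any transaction that read the object
    private long writeTimestamp = 0;   // timestamp of the transaction that last wrote the object
    private int value;

    // Returns null to signal that the requesting transaction must abort and restart.
    synchronized Integer read(long txTimestamp) {
        if (txTimestamp < writeTimestamp) {
            return null;                               // object already written by a later transaction
        }
        readTimestamp = Math.max(readTimestamp, txTimestamp);
        return value;
    }

    // Returns false to signal that the requesting transaction must abort and restart.
    synchronized boolean write(long txTimestamp, int newValue) {
        if (txTimestamp < readTimestamp || txTimestamp < writeTimestamp) {
            return false;                              // a later transaction already read or wrote it
        }
        value = newValue;
        writeTimestamp = txTimestamp;
        return true;
    }

    public static void main(String[] args) {
        TimestampOrderedObject obj = new TimestampOrderedObject();
        System.out.println(obj.write(2, 100));   // true: transaction with timestamp 2 writes
        System.out.println(obj.read(3));         // 100, and the read timestamp becomes 3
        System.out.println(obj.write(1, 50));    // false: transaction 1 is too old and must abort
    }
}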

Multiversion timestamp ordering

In multiversion timestamp ordering, a list of old committed versions as well as tentative versions is kept for each object. This list represents the history of the values of the object. The advantage of using multiple versions is that read operations that arrive late do not have to be rejected.
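
The benefit for late reads can be seen in this small sketch (our own simplification): each committed version is tagged with the write timestamp of the transaction that created it, and a read with timestamp t is served from the most recent version written no later than t.

import java.util.ArrayList;
import java.util.List;

// Sketch of multiversion timestamp ordering reads: keep old committed versions
// so that a read arriving "late" can still be served from the right version.
public class MultiversionObject {
    private static class Version {
        final long writeTimestamp;
        final int value;
        Version(long ts, int value) { this.writeTimestamp = ts; this.value = value; }
    }

    private final List<Version> versions = new ArrayList<>();

    synchronized void commitWrite(long txTimestamp, int value) {
        versions.add(new Version(txTimestamp, value));
    }

    // Return the value of the most recent version written at or before txTimestamp.
    synchronized Integer read(long txTimestamp) {
        Version best = null;
        for (Version v : versions) {
            if (v.writeTimestamp <= txTimestamp
                    && (best == null || v.writeTimestamp > best.writeTimestamp)) {
                best = v;
            }
        }
        return best == null ? null : best.value;
    }

    public static void main(String[] args) {
        MultiversionObject obj = new MultiversionObject();
        obj.commitWrite(1, 100);
        obj.commitWrite(5, 200);
        System.out.println(obj.read(3));   // 100: a "late" read with timestamp 3 is not rejected
    }
}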

8. Comparison of Methods for Concurrency Control

We have described three separate methods for controlling concurrent access to shared data: strict two-phase locking, optimistic methods, and timestamp ordering. Each method has advantages in the situations for which it is suited, and all three place some limits on concurrent operation. Some distributed systems research has explored the utility of semantic locks, timestamp ordering, and new approaches to remote transactions.

Work in two application areas has shown that the usual concurrency control mechanisms are not always adequate. One area of concern is multi-user applications, in which each user expects an up-to-date view of objects that are being updated by other users. Such applications require their data to remain consistent in the face of concurrent updates and server crashes, and transaction techniques appear to provide a suitable approach to their design.

Application Requirements Related to Concurrency Control

  1. Users need immediate notification of changes made by other users.
  2. Users need to be able to access objects before other users complete their transactions, which is the rationale for developing a new type of lock that triggers an action when an object is accessed.

Some examples of applications that use Concurrency Control

  1. Dropbox
  2. Google Apps
  3. Wikipedia.

1. Dropbox

It is a cloud service that provides file backup and allows users to share files and folders, and access them from anywhere.

Dropbox uses an optimistic form of concurrency control to maintain the consistency of files and to detect conflicts between users' updates.

So if two users make concurrent updates to the same file, the update that is written first is accepted and the other is rejected. However, Dropbox also keeps a version history, allowing users to merge their updates manually or to restore previous versions.

2. Google Apps

This is a cloud service that provides web-based applications such as word processors, spreadsheets, and presentations, which allow users to collaborate. Users can edit the same document at the same time; if two users update the same cell at the same time, the last update wins.

3. Wikipedia

Wikipedia allows concurrent editing of its pages. If two editors submit changes to the same page concurrently, the edit that is saved first is accepted, while the other editor is shown an 'edit conflict' page and asked to resolve the conflict.

Reference

TOOL

Google Translate (as a tool for approximation)

APPENDIX

The main goal in developing a database is to allow multiple users to access data simultaneously. Such access causes no problem if all users only read the data, since they cannot interfere with one another. But when two or more users access the same database simultaneously and at least one of them updates the data, the result can be inconsistent data.

To overcome the possibility of data inconsistency, a mechanism is needed to regulate the running of transactions accessing the same data. This mechanism is known as concurrency control. Concurrency control is the process of regulating operations in many transactions running simultaneously on a database without disrupting operations in other transactions so that they can produce consistent data (Connolly, 2005, p577). Three examples of important problems related to concurrency are the Lost-Update problem, the Uncommitted Dependency problem, and the Inconsistent Analysis problem.

Lost-Update Problem

Explanation: Transactions T1 and T2 start at approximately the same time, and both read the balance of $100. T2 increases balx by $100 to $200 and stores the change in the database. On the other hand, transaction T1 decreases its copy of balx by $10 to $90 and stores this value in the database, overwriting the previous update and ultimately losing the $100 that was previously added to the balance. Losing the update of transaction T2 can be avoided by preventing T1 from reading the value of balx until T2's update has completed.

Uncommitted Dependency Problem (dirty read)

Explanation: Transaction T4 changes balx to $200, but T4 then rolls back, so balx must be restored to its original value of $100. However, by that time transaction T3 has already read the new value of balx ($200) and used it as the basis for a $10 deduction, giving an incorrect balance of $190 instead of $90. The value of balx that T3 read is called dirty data, which gives the problem its alternative name, the dirty read problem.

The reason for the rollback is unimportant; the transaction may simply have failed with an error, perhaps after debiting the wrong account. The effect is that T3 assumed T4's update had been carried out successfully, even though the update was subsequently rolled back. The problem is avoided by preventing T3 from reading balx until the decision to commit or abort T4 has been made. The two problems above concern transactions that update the database, and their interference can corrupt the database. However, transactions that only read the database can also produce inaccurate results if they are allowed to read the partial results of unfinished transactions that are simultaneously updating the database. An example is described in the inconsistent analysis problem.

Inconsistent Analysis Problem

Explanation: the inconsistent analysis problem arises when one transaction reads several values from the database while a second transaction updates some of them during the execution of the first. For example, a transaction that summarizes data in a database (say, total balances) will produce inaccurate results if another transaction is modifying the database while it runs. In the example above, transaction T6 runs the summary at the same time as transaction T5. T6 is summing the balances of account x ($100), account y ($50), and account z ($25). Along the way, however, T5 transfers $10 from balx to balz, so T6 produces an incorrect result (it is off by $10).

Serializability and Recoverability

The goal of concurrency control protocols is to schedule transactions in a way that avoids interference, and also prevents the types of problems described in the previous section. One obvious solution is to allow only one transaction to run at a time. One transaction is committed before the next transaction is allowed to start. However, the goal of a multiuser DBMS is also to maximize the degree of concurrency or parallelism in a system, so that transactions that can run without interfering with each other can run in parallel. For example, transactions that access different parts of the database can be scheduled together without interference. In this section, we examine serializability as a way to help identify those transaction executions that are guaranteed to ensure consistency. First, let's give some definitions.

Schedule

A schedule is a sequence of the operations of a set of concurrent transactions that preserves the order of the operations within each individual transaction (Connolly, 2005, p580). A transaction consists of a sequence of read and/or write operations on the database, followed by a commit or abort action. A schedule S consists of a sequence of operations from a set of n transactions T1, T2, ..., Tn, subject to the constraint that the order of the operations of each transaction is preserved: for each transaction Ti in schedule S, the operations of Ti must appear in S in the same order as in Ti. A serial schedule is a schedule in which the operations of each transaction are executed consecutively, without any interleaving from other transactions (Connolly, 2005, p580).

A nonserial schedule is a schedule in which the operations of a set of concurrent transactions are interleaved (Connolly, 2005, p580). In a serial schedule, transactions are executed in serial order; for example, with two transactions T1 and T2, the serial order is either T1 followed by T2 or T2 followed by T1. In serial execution there is no interference between transactions, because only one transaction runs at a time.

The goal of serializability is to find a nonserial schedule that allows transactions to run concurrently without interfering with each other, and then produces a database state that can be reproduced by a serial execution. If a set of transactions run concurrently, we say that the (nonserial) schedule is correct if it produces the same result as some other serial execution. Such a schedule is called serializable. To prevent inconsistencies from transactions interfering with each other, it is important to guarantee the serializability of concurrent transactions.

In serializability, the order of read and write operations is important. Here are some things to consider:

  • If two transactions only read the same data item, they do not conflict and the order is unimportant.
  • If two transactions read or write entirely different data items, they do not conflict and the order is unimportant.
  • If one transaction writes a data item and another transaction reads or writes the same data item, the order of execution is important.

Suppose schedule S1 shown in figure (a) below contains operations from two concurrent transactions, T7 and T8. Since the write operation on balx in T8 does not conflict with the subsequent read operation on baly in T7, we can rearrange the operations to produce an equivalent schedule (S2) shown in figure (b).

If we now also change the order of the following non-conflicting operations, we produce an equivalent serial schedule (S3) shown in figure (c) below.

  • Swap the write(balx) of T8 with the write(baly) of T7.
  • Swap the read(balx) of T8 with the read(baly) of T7.
  • Swap the read(balx) of T8 with the write(baly) of T7.

a. nonserial schedule S1
b. nonserial schedule S2
c. serial schedule S3, equivalent to S1 and S2.

Schedule S3 is a serial schedule and since S1 and S2 are equivalent to S3, then S1 and S2 are serializable schedules.

A recoverable schedule is a schedule in which, for any transactions Ti and Tj, if Tj reads a data item previously written by Ti, then the commit operation of Ti precedes the commit operation of Tj.

Concurrency Control Techniques

There are two main concurrency control techniques that allow transactions to proceed safely in parallel subject to certain constraints: locking and certain timestamp methods. Locking and timestamping are conservative approaches because they cause transactions to be suspended in case they conflict with another transaction at some time in the future. Optimistic methods, however, are based on the premise that conflicts are rare, so they allow transactions to proceed asynchronously and only check for conflicts at the end, when the transaction performs a commit operation.

The Locking Method

Locking is a procedure used to control concurrent access to data. When a transaction is accessing a database, a lock may deny access to other transactions to prevent incorrect results ( Connolly, 2005, p587 ). There are two types of locks, namely shared locks and exclusive locks that must be used before reading or writing access to the database. The use of this lock is to maintain data consistency in the database. If a transaction has a shared lock on a data item, the transaction can read the item but cannot change the data ( Connolly, 2005, p588 ). If a transaction has an exclusive lock on a data item, the transaction can read and change the data item ( Connolly, 2005, p588 ).

Locks are used in the following way:

  • Any transaction that requires access to a data item must lock it, requesting a shared lock for read-only access or an exclusive lock for read and write access.
  • If the item is not already locked by another transaction, the lock will be granted.
  • If an item is locked, the DBMS determines whether the request is compatible with the current lock. If a shared lock is requested on an item that already has a shared lock attached to it, the request is granted. Otherwise, the transaction must wait until the existing lock is released.
  • A transaction continues to hold the lock until it releases it either at execution time or at the time the transaction terminates (abort or commit). The effects of a write operation will be visible to other transactions only when the exclusive lock has been released.

Two-phase locking: a transaction follows the two-phase locking protocol if all of its locking operations precede its first unlock operation (Connolly, 2005, p589).
The rules are as follows:

  • A transaction must obtain a lock on an item before operating on it. The lock can be either read or write, depending on the type of access required.
  • Once a transaction has released a lock, it may never acquire any new locks.

Deadlock

Deadlock is an impasse that can occur when two or more transactions are each waiting for a lock held by another to be released. There is only one way to break a deadlock: abort one or more of the transactions. There are three ways to handle deadlocks: timeouts, deadlock prevention, and deadlock detection and recovery.

Timeout

A simple approach to deadlock prevention is based on lock timeout. With this approach, a transaction requesting a lock will wait only for a certain period of time defined by the system.

Deadlock Prevention

Another approach to preventing deadlock is to order transactions using transaction timestamps. Two algorithms were devised by Rosenkrantz. The first, Wait-Die, allows only an older transaction to wait for a younger one; a younger transaction that requests a lock held by an older one is aborted (dies) and restarted with the same timestamp, so that over time it becomes the oldest active transaction and will no longer die. The second algorithm, Wound-Wait, takes the symmetric approach: only a younger transaction may wait for an older one, and if an older transaction requests a lock held by a younger transaction, the younger transaction is aborted (wounded).
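
The decision logic of the two algorithms fits in a few lines; this sketch (with names of our own choosing) uses the usual convention that a smaller timestamp means an older transaction.

// Sketch of Rosenkrantz's timestamp-based deadlock prevention schemes.
// Convention: a smaller timestamp means an older transaction.
public class DeadlockPrevention {
    enum Decision { WAIT, ABORT_REQUESTER, ABORT_HOLDER }

    // Wait-Die: an older requester waits; a younger requester dies (is aborted).
    static Decision waitDie(long requesterTs, long holderTs) {
        return requesterTs < holderTs ? Decision.WAIT : Decision.ABORT_REQUESTER;
    }

    // Wound-Wait: an older requester wounds (aborts) the younger holder;
    // a younger requester waits for the older holder.
    static Decision woundWait(long requesterTs, long holderTs) {
        return requesterTs < holderTs ? Decision.ABORT_HOLDER : Decision.WAIT;
    }

    public static void main(String[] args) {
        // Older transaction (ts=1) requests a lock held by a younger one (ts=2).
        System.out.println(waitDie(1, 2));    // WAIT
        System.out.println(woundWait(1, 2));  // ABORT_HOLDER
        // Younger transaction (ts=2) requests a lock held by an older one (ts=1).
        System.out.println(waitDie(2, 1));    // ABORT_REQUESTER (the requester dies)
        System.out.println(woundWait(2, 1));  // WAIT
    }
}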

Deadlock Detection

Deadlock detection is usually handled by constructing a wait-for graph (WFG) that shows transaction dependencies, i.e., transaction Ti depends on Tj if transaction Tj holds a lock on a data item that is waited for by Ti.

WFG is a directed graph G = (N, E ) consisting of a set of nodes N and a set of directed edges E, which is constructed as follows.

  • Create a node for each transaction.
  • Create a directed edge Ti → Tj , if transaction Ti is waiting to lock an item that is currently locked by Tj.

Deadlock occurs if and only if the WFG contains a cycle; a cycle between two transactions in the WFG represents a deadlock between them.

