Introduction to Database Concurrency Control
This article presents a practical overview of concurrency control. Concurrency control deals with the issues involved in allowing multiple people simultaneous access to shared entities, be they objects, data records, or some other representation.
Assume that you and I both read the same row from the Customer table, we both change the data, and then we both try to write our new versions back to the database. Whose changes should be saved? Yours? Mine? Neither? A combination? Similarly, if we both work with the same Customer object stored in a shared object cache and try to make changes to it, what should happen?
To understand how to implement concurrency control within your system you must start by understanding the basics of collisions – you can either avoid them or detect and then resolve them. The next step is to understand transactions, which are collections of actions that potentially modify two or more entities.
An important message of this article is that on modern software development initiatives, concurrency control and transactions are not simply the domain of databases; instead, they are issues that are potentially pertinent to all of your architectural tiers. In this article, I discuss:
- Introduction to collisions
- Locking strategies
- Collision resolution strategies
- Choosing a locking strategy
Implementing Referential Integrity and Shared Business Logic discusses the referential integrity challenges that result from mapping an object schema to a data schema, something called cross-schema referential integrity problems. With respect to collisions things are a little simpler: we only need to worry about ensuring the consistency of entities within the system of record. The system of record is the location where the official version of an entity resides.
A collision occurs when two activities, which may or may not be full-fledged transactions, attempt to change entities within a system of record. There are three fundamental ways (Celko 1999) that two activities can interfere with one another:
- Dirty read. Activity 1 (A1) reads an entity from the system of record and then updates the system of record but does not commit the change (for example, the change hasn’t been finalized). Activity 2 (A2) reads the entity, unknowingly making a copy of the uncommitted version. A1 rolls back (aborts) the changes, restoring the entity to the original state that A1 found it in. A2 now has a version of the entity that was never committed and therefore is not considered to have actually existed.
- Non-repeatable read. A1 reads an entity from the system of record, making a copy of it. A2 deletes the entity from the system of record. A1 now has a copy of an entity that does not officially exist.
- Phantom read. A1 retrieves a collection of entities from the system of record, making copies of them, based on some sort of search criteria such as “all customers with first name Bill.” A2 then creates new entities which would have met the search criteria (for example, inserts “Bill Klassen” into the database), saving them to the system of record. If A1 reapplies the search criteria it gets a different result set.
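The interference patterns above can be illustrated without a real database. The sketch below simulates a non-repeatable read using an in-memory dictionary standing in for the system of record; the key and attribute names are invented for illustration.

```python
# Hypothetical in-memory "system of record" used to illustrate a
# non-repeatable read; no real database is involved.
system_of_record = {"customer-17": {"name": "Wayne Miller"}}

# Activity 1 (A1) reads the entity, making a local copy.
a1_copy = dict(system_of_record["customer-17"])

# Activity 2 (A2) deletes the entity from the system of record.
del system_of_record["customer-17"]

# A1 still holds a copy of an entity that no longer officially
# exists -- this is the non-repeatable read scenario.
assert "customer-17" not in system_of_record
assert a1_copy["name"] == "Wayne Miller"
```

The dirty read and phantom read scenarios follow the same shape: a copy taken at one moment no longer reflects the official state a moment later.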
Collisions occur more often when:
- Data is allowed to go stale in a cache.
- The number of concurrent users or threads grows.
So what can you do? First, you can take a pessimistic locking approach that avoids collisions but reduces system performance. Second, you can use an optimistic locking strategy that enables you to detect collisions so you can resolve them. Third, you can take an overly optimistic locking strategy that ignores the issue completely.
Pessimistic locking is an approach where an entity is locked in the database for the entire time that it is in application memory (often in the form of an object). With pessimistic locking:
- Locks either limit or prevent other users from working with the entity in the database.
- A write lock indicates that the holder of the lock intends to update the entity and disallows anyone from reading, updating, or deleting the entity.
- A read lock indicates that the holder of the lock does not want the entity to change while they hold the lock, allowing others to read the entity but not update or delete it.
- The scope of a lock might be the entire database, a table, a collection of rows, or a single row. These types of locks are called database locks, table locks, page locks, and row locks respectively.
Pessimistic locking is easier to implement and guarantees that database changes are made consistently and safely. The primary disadvantage is that this approach isn’t scalable. When a system has many users, when transactions involve a greater number of entities, or when transactions are long lived, the chance of having to wait for a lock to be released increases. This limits the practical number of simultaneous users that your system can support.
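As a concrete sketch of pessimistic locking, the example below uses SQLite via Python's sqlite3 module. SQLite only offers database-level locks (the coarsest granularity mentioned above), but the pattern is the same: `BEGIN IMMEDIATE` acquires a write lock before the entity is read, and the lock is held for the entire time the data is in application memory. The table and column names are invented for illustration.

```python
import sqlite3

# Set up a throwaway database with one customer row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Wayne Miller')")
conn.commit()

conn.isolation_level = None          # manage transactions explicitly
conn.execute("BEGIN IMMEDIATE")      # acquire the write lock up front
row = conn.execute("SELECT name FROM customer WHERE id = 1").fetchone()
# ... the entity stays locked for the whole time it is in memory,
# so no other writer can touch it while we work ...
conn.execute("UPDATE customer SET name = ? WHERE id = 1", (row[0].upper(),))
conn.execute("COMMIT")               # the lock is released only here
```

In a server database you would typically express the same intent with `SELECT ... FOR UPDATE` at row granularity; the cost in either case is that other writers block until the transaction commits.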
With multi-user systems it is quite common to be in a situation where collisions are infrequent. Although the two of us may both be working with Customer objects, you’re working with the Wayne Miller object while I work with the John Berg object, and therefore we won’t collide. When this is the case, optimistic locking becomes a viable concurrency control strategy. The idea is that you accept the fact that collisions occur infrequently; instead of trying to prevent them you simply choose to detect them and then resolve them when they do occur.
Figure 1 depicts the logic for updating an object when optimistic locking is used:
- The application reads the object into memory. To do this a read lock is obtained on the data, the data is read into memory, and the lock is released. At this point in time the row(s) may be marked to facilitate detection of a collision (more on this later).
- The application then manipulates the object until the point that it needs to be updated.
- The next step is to obtain a write lock on the data and read the original source back to determine whether there has been a collision. If there has not been one, the application updates the data and unlocks it.
- Had a collision been detected (for example, the data had been updated by another process after it was originally read into memory), then the collision would need to be resolved.
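The steps above can be sketched in code. The example below, using SQLite for illustration, marks each row with an incremental counter (one of the mark types discussed shortly); the `CollisionError` class and table layout are invented for this sketch.

```python
import sqlite3

class CollisionError(Exception):
    """Raised when the entity changed after we read it."""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, version INTEGER)")
conn.execute("INSERT INTO customer VALUES (1, 'Wayne Miller', 1)")
conn.commit()

# 1. Read the object, and its concurrency mark, into memory.
name, version = conn.execute(
    "SELECT name, version FROM customer WHERE id = 1").fetchone()

# 2. Manipulate the object in memory, with no locks held.
name = name.upper()

# 3. At update time, re-check the mark; a changed mark means a collision.
current = conn.execute(
    "SELECT version FROM customer WHERE id = 1").fetchone()[0]
if current != version:
    raise CollisionError("customer 1 was changed by another activity")
conn.execute(
    "UPDATE customer SET name = ?, version = ? WHERE id = 1",
    (name, version + 1))
conn.commit()
```

In production the re-check and the update would run inside one transaction holding a write lock, so no other activity can slip in between them.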
There are two basic strategies for determining if a collision has occurred:
- Mark the source with a unique identifier. The source data row is marked with a unique value each time it is updated. At the point of update, the mark is checked, and if there is a different value than what you originally read in, then you know that there has been an update to the source. There are different types of concurrency marks:
  - Datetime stamps (the database server should assign this value because you can’t count on the time clocks of all machines to be in sync).
  - Incremental counters.
  - User IDs (this only works if everyone has a unique ID, each user is logged into only one machine, and the applications ensure that only one copy of an object exists in memory).
  - Values generated by a globally unique surrogate key generator.
- Retain a copy of the original. The source data is retrieved at the point of updating and compared with the values that were originally retrieved. If the values have changed, then a collision has occurred. This strategy may be your only option if you are unable to add sufficient columns to your database schema to maintain the concurrency marks.
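A minimal sketch of the "retain a copy of the original" strategy follows, again using SQLite for illustration. No extra mark column is needed; the values read earlier serve as the mark. The table and column names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, prefs TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Wayne Miller', 'email')")
conn.commit()

# Read the entity, retaining a copy of the original values.
original = conn.execute(
    "SELECT name, prefs FROM customer WHERE id = 1").fetchone()

# Meanwhile, another activity updates the same row.
conn.execute("UPDATE customer SET prefs = 'sms' WHERE id = 1")
conn.commit()

# At the point of update, re-read and compare against the retained copy.
current = conn.execute(
    "SELECT name, prefs FROM customer WHERE id = 1").fetchone()
collision = current != original   # True: the row changed underneath us
```

The trade-off is extra memory and comparison work per entity, in exchange for leaving the database schema untouched.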
Figure 1 depicts a naïve approach, and in fact there are ways to reduce the number of database interactions. The first three requests to the database – the initial lock, marking (if appropriate) the source data, and unlocking – can be performed as a single transaction. The next two interactions, to lock and obtain a copy of the source data, can easily be combined into a single trip to the database. Similarly, the update and unlock can be combined. Another improvement is to combine the last four interactions into a single transaction and perform the collision detection on the database server instead of the application server.
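Detecting the collision on the database server can be done by folding the mark check into the UPDATE statement itself: if no rows are affected, the mark had changed and a collision occurred. A sketch, with an invented `update_name` helper and table layout:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, version INTEGER)")
conn.execute("INSERT INTO customer VALUES (1, 'Wayne Miller', 1)")
conn.commit()

def update_name(conn, cust_id, new_name, expected_version):
    """Update only if the version mark is unchanged; report success."""
    cur = conn.execute(
        "UPDATE customer SET name = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_name, cust_id, expected_version))
    conn.commit()
    return cur.rowcount == 1   # zero rows touched means a collision

ok = update_name(conn, 1, "Wayne A. Miller", expected_version=1)       # succeeds
stale = update_name(conn, 1, "W. Miller", expected_version=1)          # collides
```

This reduces the whole check-and-update to one round trip, at the cost of the application only learning that a collision happened, not what the conflicting values were.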
With the overly optimistic locking strategy you neither try to avoid nor detect collisions, assuming that they will never occur. This strategy is appropriate for single-user systems, systems where the system of record is guaranteed to be accessed by only one user or system process at a time, and read-only tables. These situations do occur. It is important to recognize that this strategy is completely inappropriate for multi-user systems.
Collision resolution is critical to effective concurrency control. You have five basic strategies that you can apply to resolve collisions:
- Give up.
- Display the problem and let the user decide.
- Merge the changes.
- Log the problem so someone can decide later.
- Ignore the collision and overwrite.
It is important to recognize that the granularity of a collision counts. Assume that both of us are working with a copy of the same Customer entity. If you update a customer’s name and I update their shopping preferences, then we can still recover from this collision. In effect the collision occurred at the entity level (we updated the same customer) but not at the attribute level. It is very common to detect potential collisions at the entity level and then get smart about resolving them at the attribute level.
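Attribute-level resolution can be sketched as a three-way merge against the commonly read base copy: only an attribute changed by both activities is a true collision. The `merge` helper and the sample values are invented for illustration.

```python
# Both activities started from the same base copy of the customer.
base  = {"name": "Wayne Miller",    "prefs": "email"}
yours = {"name": "Wayne A. Miller", "prefs": "email"}   # you changed name
mine  = {"name": "Wayne Miller",    "prefs": "sms"}     # I changed prefs

def merge(base, a, b):
    """Merge two sets of changes; fail only on a same-attribute conflict."""
    merged = {}
    for key in base:
        a_changed = a[key] != base[key]
        b_changed = b[key] != base[key]
        if a_changed and b_changed:
            raise ValueError(f"true collision on attribute {key!r}")
        merged[key] = a[key] if a_changed else b[key]
    return merged

assert merge(base, yours, mine) == {"name": "Wayne A. Miller", "prefs": "sms"}
```

Had we both changed the same attribute, the merge would fail and one of the other resolution strategies (give up, ask the user, log it, or overwrite) would apply.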
For simplicity’s sake, many teams will choose a single locking strategy and apply it for all tables. This works well when all, or at least most, tables in your application have the same access characteristics. However, for more complex applications you will likely need to implement several locking strategies based on the access characteristics of individual tables. One approach, suggested by Willem Bogaerts, is to categorize each table by type to provide guidance as to a locking strategy for it. Strategies for doing so are described in Table 1.
Table 1. Suggested locking strategies by table type.

| Table Type | Suggested Locking Strategy |
| --- | --- |
| Log (typically append only) | Overly optimistic – rows are appended but never updated, so update collisions cannot occur |
| Lookup/Reference (typically read only) | Overly optimistic – read-only access means collisions cannot occur |
I’d like to thank Willem Bogaerts and Rich Stone for their feedback regarding this article.