If you missed the first 2 tips, go back and read 5 Tips to Avoid Deadlocks in Amazon RDS (Part 1), and then come back for the last 3 tips on deadlock avoidance.
Once again, I want to re-emphasize that RDS is not actually capable of creating deadlocks — it merely reports them from the underlying database engine. While all of my 5 tips are applicable to all six supported database engines in RDS (Aurora, MySQL, MariaDB, PostgreSQL, Oracle, and Microsoft SQL Server), your specific database may provide additional recommendations on deadlock avoidance.
Tip #3: Always Acquire Locks in the Same Order
It’s not easy to do, but if you can guarantee multiple threads always lock resources in the same order, you can also guarantee that you have no deadlocks. Let’s consider a simple case, and then reverse the order to show my point. I will have two threads, A and B, executing multiple statements. First, the case that results in deadlock:
A => BEGIN
B => BEGIN
A => SELECT * FROM users WHERE id = 1 FOR UPDATE
B => SELECT * FROM users WHERE id = 2 FOR UPDATE
A => SELECT * FROM users WHERE id = 2 FOR UPDATE
B => SELECT * FROM users WHERE id = 1 FOR UPDATE
If you execute these statements slow enough, you’ll see that thread A blocks on line 5, and that thread B causes deadlock on line 6. This will happen every single time.
Deadlock is unavoidable when running these types of queries manually, though it will take a lot more serendipity to break in production. The timing matters a lot. That’s why deadlocks are so sporadic and not constant — the timing.
The guaranteed deadlock from above can be completely avoided if the order of queries is done right. Consider the following correct example:
A => BEGIN
B => BEGIN
A => SELECT * FROM users WHERE id = 1 FOR UPDATE
B => SELECT * FROM users WHERE id = 1 FOR UPDATE
A => SELECT * FROM users WHERE id = 2 FOR UPDATE
B => SELECT * FROM users WHERE id = 2 FOR UPDATE
Notice the difference in IDs. It doesn’t really matter which ID goes first and which ID goes second, so long as the order between the threads is consistent.
If you ran these same queries manually, just like above, a different pattern would emerge. thread B would block on line 4 until you finished all your queries with thread A. Then, without deadlocks, you could finish all the queries on thread B.
Pragmatically, it’s much harder to achieve this in code. Sometimes, an update to users has to be done before an update to accounts, and sometimes vice versa. The way around this would be to get exclusive locks before starting any of the updates, deletes, or inserts.
Tip #4: Keep All Transaction Logic Inside the Model
Developers want flexibility and code reusability. Both desires lead to this type of pseudo-code:
def updateAuth():
txn = model.begin()
user = model.find(userId)
model.verifyRequirements(user.account)
response = googleAPI.updateAuth(user)
verifyResponse(json.decode(response))
model.updateAuth(user)
model.updatePasswordHistory(user)
txn.commit()
While both flexibility and code reusability are a good thing to have, they can come at the expense of lock contention, race conditions, and deadlocks.
When database connections, transactions, and query order leave the model, it becomes much more difficult to ensure the correct ordering of queries and the shortening of transaction times. This may seem like a continuation of tips #2 and #3, and it partially is. However, this is also a recommendation for a particular type of code structure.
You should strongly avoid ever returning connections, queries, result sets, or any other database resource from any model (in a model-view-controller paradigm). Also, any functions that take as arguments any connection, query, result set, or any other database resource should be made private, so as to completely isolate all database logic in a single file.
In fact, if you can get your model code to be this simple, more power to you, though developers will likely want to abstract a little to avoid rewriting the same queries over and over again.
def setPassword():
conn = this.connect()
conn.query("BEGIN")
conn.query("UPDATE users SET ....")
conn.query("INSERT INTO password_history ...")
conn.query("INSERT INTO google_sync_requests ...")
conn.query("COMMIT")
This type of function is easy to debug, easy to verify ordering, and guaranteed to avoid calculation or external web calls in the middle of execution.
Tip #5: Fix Early, Fix Often
The cost of fixing bugs and defects increases exponentially over the SDLC lifecycle — you’ve probably already heard this. Deadlocks is no different, having to untangle the web of queries, pull out which threads were executing which query, and then making an appropriate code change so deep in the stack without making any worse bugs, that’s a chore.
The problem with deadlocks is that there is another degree of complexity. Deadlocks are a byproduct of lock contention and query order, and lock contention increases exponentially with a linear increase in concurrent transactions.
A single deadlock today could be repeated many times over in the matter of a few weeks. As time continues, more types of deadlocks will be introduced, which leaves a tangled mess of work for developers and a heap of frustrated users.
Fixing a single deadlock is less stressful; the trick is setting up the notifications for that first deadlock. Surprisingly, very few people set up notifications for database issues. RDS reports a Deadlocks metric which you can create an alert for via Cloudwatch. I recommend setting the threshold at 1, and monitoring it like a hawk. The downside is that it will tend to wake you up at night.
Blue Matador can automatically monitor RDS deadlocks and hundreds of other metrics, too. No config, easy setup, and a free 14-day trial. If you enjoyed my tips on deadlock avoidance, consider trying out my software and see how it can help you.