Thursday, August 9, 2007

Threading and Locks

This week I have already been bitten twice by a deadlock bug in my website's code, so I decided to fix it and to write about it, since it seems to be a common problem when starting out with threading. To use a thread lock in Python, the threading module provides a Lock class whose two main methods are acquire() and release(). At first, one might think that thread-unsafe code should be "protected" by the lock like this ("lo" is a Lock variable):
def do_something():
    lo.acquire()
    do_threadunsafe_operation()
    lo.release()

The problem is that if do_threadunsafe_operation() raises an exception, the lock is never released, and all subsequent calls to do_something() block forever in acquire(). You have to put the code in a try..finally block to make sure the lock is released however the function ends:
def do_something():
    lo.acquire()
    try:
        do_threadunsafe_operation()
    finally:
        lo.release()
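
To check that the try..finally version really frees the lock, here is a minimal sketch written for current Python 3. The always-failing do_threadunsafe_operation() and the helper lock_is_free_after_failure() are stand-ins made up for this illustration; they are not from my website's code.

```python
import threading

lo = threading.Lock()

def do_threadunsafe_operation():
    # Hypothetical stand-in that always fails, to exercise the error path.
    raise ValueError("simulated failure")

def do_something():
    lo.acquire()
    try:
        do_threadunsafe_operation()
    finally:
        # Runs whether or not the operation raised.
        lo.release()

def lock_is_free_after_failure():
    # Hypothetical helper: call the failing function, then inspect the lock.
    try:
        do_something()
    except ValueError:
        pass
    # locked() reports whether the lock is currently held.
    return not lo.locked()
```

Without the try..finally, the same check would report that the lock is still held after the exception, and the next call to do_something() would hang.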

The Python Cookbook also has a decorator version of this. If you are on Python 2.4 or above and like using decorators, try the synchronized decorator and "decorate" your function:
@synchronized(lo)
def do_threadunsafe_operation():
    pass
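
For readers without the Cookbook at hand, such a decorator can be sketched roughly as follows. This is my own approximation in current Python 3, not necessarily the recipe's exact code; the counter, increment() and run_threads() names are made up purely to exercise it.

```python
import functools
import threading

def synchronized(lock):
    """Return a decorator that runs the wrapped function while holding `lock`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            lock.acquire()
            try:
                return func(*args, **kwargs)
            finally:
                # Released even if func raises.
                lock.release()
        return wrapper
    return decorator

lo = threading.Lock()
counter = {"value": 0}  # hypothetical shared state for the demo

@synchronized(lo)
def increment():
    counter["value"] += 1

def run_threads(n_threads=8, per_thread=1000):
    # Hypothetical driver: hammer increment() from several threads.
    counter["value"] = 0

    def worker():
        for _ in range(per_thread):
            increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]
```

The decorator is just the try..finally pattern packaged once, so every decorated function gets the release-on-exception behaviour without repeating the boilerplate.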

Threads in Python are cool, and if you are careful with them, you can make your application run faster without affecting its stability or flow. That is a lesson I learned last weekend, when a locking bug left my WSGI app deadlocked and the website unreachable.

4 comments:

Chris said...

In Python 2.5 and above you can also use the with statement to do locking safely (in 2.5 itself you first need "from __future__ import with_statement"):

with lo:
    do_threadunsafe_operation()
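
A small sketch of that behaviour in current Python 3: the lock is acquired when the with block is entered and released when it is left, including when an exception escapes. The shared list, append_item() and demo() are hypothetical names used only for this illustration.

```python
import threading

lo = threading.Lock()
shared = []  # hypothetical shared state

def append_item(item):
    # The lock is acquired on entry and released on exit, even on error.
    with lo:
        if item is None:
            raise ValueError("None is not allowed")
        shared.append(item)

def demo():
    append_item(1)
    try:
        append_item(None)  # raises inside the with block
    except ValueError:
        pass
    append_item(2)  # would block forever if the lock had leaked
    return list(shared), lo.locked()
```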

René Dudfield said...

Processes are much better for webserving compared to threads... imho.

You'll get far fewer bugs, and your code will more easily run across many machines.

Using threads is like running code on win95, where one application can easily crash the other applications and the OS.

But, of course sometimes threads are nice :) So there.

thp said...

The problem I had with multi-process vs. multi-threaded was that when using SQLAlchemy and mod_python, one Apache2 process would update an object in the database while another process still had a cached version of that object and worked with the old state, resulting in many problems.

With multi-threading, I put a lock around the SQLAlchemy access code, and so far it seems to work well (there is only one object cache per process, so no caching issues).

Mike Lowe said...

There are explicit instructions for using the 'with' statement; they even mention an example using locks.