Locking Buzhug
I have recently decided to work with Buzhug on a project. As far as I can tell, it has proven efficient, fast, easy to use and to maintain. However, I ran into a few gotchas.
Simple solutions are often the best
I came to use Buzhug for the following requirements:
- I needed a single table
- I did not want to add additional dependencies to the project
- The size of the table will average 5K entries (without having more than 10k entries in peaks)
And an additional (personal) one:
- I did not want to bother with SQL. Really not. no way!
That left me one option: pure-python embedded database.
After having considered a few libraries, I have been seduced by the way Buzhug interface is close to manipulating python objects. And the benchmarks seemed to show that it is performant enough for this project.
After a quick prototyping (1 day), the choice was done.
Then came a few weeks of development and the first stress tests…
And the real world came back fast
A few times a day, the application backed by this database is intensely used:
- It can be run up to 50 times simultaneously in separate python process
- Each run makes a read and a write/delete operation
This causes a race condition on the files used to store data, and concurent writes corrupts database.
Using buzhug.TS_Base
instead of buzhug.Base
did not solve anything,
as the problem is not thread, but processes. What I need is a system-wide
cross-process lock.
Here is the answer
First step was to find how to implement a cross-process, system-wide lock. As it only has to work on Linux, the Lock class given by Chris from Vmfarms fits perfectly. Here is a version slightly modified to make it a context manager :
import fcntl
class PsLock:
"""
Taken from:
http://blog.vmfarms.com/2011/03/cross-process-locking-and.html
"""
def __init__(self, filename):
self.filename = filename
self.handle = open(filename, 'w')
# Bitwise OR fcntl.LOCK_NB if you need a non-blocking lock
def acquire(self):
fcntl.flock(self.handle, fcntl.LOCK_EX)
def release(self):
fcntl.flock(self.handle, fcntl.LOCK_UN)
def __del__(self):
self.handle.close()
def __exit__(self, exc_type, exc_val, exc_tb):
if exc_type is None:
pass
self.release()
def __enter__(self):
self.acquire()
The second step is to define a new class that inheritates from buzhug.Base
that uses PsLock
(inspired by TS_Base
):
import buzhug
_lock = PsLock("/tmp/buzhug.lck")
class PS_Base(buzhug.Base):
def create(self,*args,**kw):
with _lock:
res = buzhug.Base.create(self,*args,**kw)
return res
def open(self,*args,**kw):
with _lock:
res = buzhug.Base.open(self,*args,**kw)
return res
def close(self,*args,**kw):
with _lock:
res = buzhug.Base.close(self,*args,**kw)
return res
def destroy(self,*args,**kw):
with _lock:
res = buzhug.Base.destroy(self,*args,**kw)
return res
def set_default(self,*args,**kw):
with _lock:
res = buzhug.Base.set_default(self,*args,**kw)
return res
def insert(self,*args,**kw):
with _lock:
res = buzhug.Base.insert(self,*args,**kw)
return res
def update(self,*args,**kw):
with _lock:
res = buzhug.Base.update(self,*args,**kw)
return res
def delete(self,*args,**kw):
with _lock:
res = buzhug.Base.delete(self,*args,**kw)
return res
def cleanup(self,*args,**kw):
with _lock:
res = buzhug.Base.cleanup(self,*args,**kw)
return res
def commit(self,*args,**kw):
with _lock:
res = buzhug.Base.commit(self,*args,**kw)
return res
def add_field(self,*args,**kw):
with _lock:
res = buzhug.Base.add_field(self,*args,**kw)
return res
def drop_field(self,*args,**kw):
with _lock:
res = buzhug.Base.drop_field(self,*args,**kw)
return res
Now I just use
database = PS_Base( ... )
And all the errors have vanished.