Bruno Carlin

Locking Buzhug

I recently decided to use Buzhug on a project. So far, it has proven efficient, fast, and easy to use and maintain. However, I ran into a few gotchas.

Simple solutions are often the best

I came to use Buzhug for the following requirements:

And an additional (personal) one:

That left me one option: a pure-Python embedded database.

After considering a few libraries, I was seduced by how close the Buzhug interface is to manipulating plain Python objects. The benchmarks also seemed to show that it performs well enough for this project.

After a quick prototype (one day), the choice was made.

Then came a few weeks of development and the first stress tests…

And the real world came back fast

A few times a day, the application backed by this database is intensely used:

This causes a race condition on the files used to store data, and concurrent writes corrupt the database.

Using buzhug.TS_Base instead of buzhug.Base did not solve anything, because the problem is not threads but processes: a thread-safe base only protects threads within a single process. What I need is a system-wide, cross-process lock.

Here is the answer

The first step was to find out how to implement a cross-process, system-wide lock. As it only has to work on Linux, the Lock class given by Chris from Vmfarms fits perfectly. Here is a version slightly modified to make it a context manager:

import fcntl

class PsLock:
    """
    Cross-process lock backed by flock() on a lock file.

    Taken from:
    http://blog.vmfarms.com/2011/03/cross-process-locking-and.html
    """
    def __init__(self, filename):
        self.filename = filename
        # The file content is irrelevant; only the lock held on it matters.
        self.handle = open(filename, 'w')

    # Bitwise OR fcntl.LOCK_NB if you need a non-blocking lock
    def acquire(self):
        fcntl.flock(self.handle, fcntl.LOCK_EX)

    def release(self):
        fcntl.flock(self.handle, fcntl.LOCK_UN)

    def __del__(self):
        self.handle.close()

    def __enter__(self):
        self.acquire()
        # Returning self allows "with PsLock(...) as lock:".
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Always release, whether or not the block raised.
        self.release()
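
The comment in the class mentions fcntl.LOCK_NB. As a minimal sketch of what a non-blocking attempt could look like (Python 3; the helper name `try_lock` and the lock-file path are invented for this illustration), the code below relies on flock failing with EAGAIN/EWOULDBLOCK when the lock is already held:

```python
import errno
import fcntl
import os
import tempfile

def try_lock(handle):
    """Try to take an exclusive flock on `handle` without blocking.

    Returns True if the lock was acquired, False if it is held elsewhere.
    """
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        # EAGAIN / EWOULDBLOCK means the lock is currently held elsewhere.
        if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return False
        raise
    return True

# Hypothetical lock file, created in a temporary directory for the demo.
lock_path = os.path.join(tempfile.mkdtemp(), "demo.lck")

# Two independent open() calls behave like two different lock holders,
# because flock locks belong to the open file, not to the process.
first = open(lock_path, "w")
second = open(lock_path, "w")

got_first = try_lock(first)          # lock is free: acquired
got_second = try_lock(second)        # already held via `first`: refused
fcntl.flock(first, fcntl.LOCK_UN)
got_second_retry = try_lock(second)  # free again: acquired

fcntl.flock(second, fcntl.LOCK_UN)
first.close()
second.close()
```

A non-blocking lock is mostly useful when the caller would rather skip or retry later than wait; for the database wrapper described here, the blocking form is what keeps writers in line.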

The second step is to define a new class that inherits from buzhug.Base and uses PsLock (inspired by TS_Base):

import buzhug

_lock = PsLock("/tmp/buzhug.lck")

class PS_Base(buzhug.Base):

    def create(self,*args,**kw):
        with _lock:
            res = buzhug.Base.create(self,*args,**kw)
        return res

    def open(self,*args,**kw):
        with _lock:
            res = buzhug.Base.open(self,*args,**kw)
        return res

    def close(self,*args,**kw):
        with _lock:
            res = buzhug.Base.close(self,*args,**kw)
        return res

    def destroy(self,*args,**kw):
        with _lock:
            res = buzhug.Base.destroy(self,*args,**kw)
        return res

    def set_default(self,*args,**kw):
        with _lock:
            res = buzhug.Base.set_default(self,*args,**kw)
        return res

    def insert(self,*args,**kw):
        with _lock:
            res = buzhug.Base.insert(self,*args,**kw)
        return res

    def update(self,*args,**kw):
        with _lock:
            res = buzhug.Base.update(self,*args,**kw)
        return res

    def delete(self,*args,**kw):
        with _lock:
            res = buzhug.Base.delete(self,*args,**kw)
        return res

    def cleanup(self,*args,**kw):
        with _lock:
            res = buzhug.Base.cleanup(self,*args,**kw)
        return res

    def commit(self,*args,**kw):
        with _lock:
            res = buzhug.Base.commit(self,*args,**kw)
        return res

    def add_field(self,*args,**kw):
        with _lock:
            res = buzhug.Base.add_field(self,*args,**kw)
        return res

    def drop_field(self,*args,**kw):
        with _lock:
            res = buzhug.Base.drop_field(self,*args,**kw)
        return res

Now I just use

    database = PS_Base( ... )

And all the errors have vanished.
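
To convince yourself that this kind of lock really serializes processes, a small stress test helps. The sketch below does not use Buzhug at all: several processes perform a read-modify-write on a counter file while holding an flock, and the final value should equal workers * times if no update was lost. It assumes Linux and Python 3, and the file names, the `bump` and `run` helpers and the counts are arbitrary choices for the example.

```python
import fcntl
import multiprocessing
import os
import tempfile

def bump(counter_path, lock_path, times):
    """Increment the integer stored in `counter_path`, `times` times."""
    with open(lock_path, "w") as lock_file:
        for _ in range(times):
            fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until free
            try:
                # Read-modify-write: unsafe without the surrounding lock.
                with open(counter_path) as f:
                    value = int(f.read())
                with open(counter_path, "w") as f:
                    f.write(str(value + 1))
            finally:
                fcntl.flock(lock_file, fcntl.LOCK_UN)

def run(workers=4, times=200):
    workdir = tempfile.mkdtemp()
    counter_path = os.path.join(workdir, "counter.txt")
    lock_path = os.path.join(workdir, "counter.lck")
    with open(counter_path, "w") as f:
        f.write("0")

    # "fork" keeps the sketch simple and matches the Linux-only assumption.
    ctx = multiprocessing.get_context("fork")
    procs = [
        ctx.Process(target=bump, args=(counter_path, lock_path, times))
        for _ in range(workers)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

    with open(counter_path) as f:
        return int(f.read())

final_count = run()
```

Removing the two flock calls turns this into a reproduction of the original problem: updates get lost or the counter file is read while empty, which is the same class of failure that corrupted the database files.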