Notes on repoze.bfg (now Pyramid) and ZODB

Published on 17 November 2010, updated on 13 April 2011

Introduction
Code reloading
ZODB
Traversal
Templating
Beware of Middleware
Conclusion

Introduction

I’ve been working on a couple of projects based on ZODB and repoze.bfg (which is now becoming Pyramid). Along the way I learned a few useful things that I couldn’t find in the official documentation. I think the docs assume some familiarity with the Zope ecosystem, which I didn’t have when I started, and that assumption made my learning curve steeper than it could have been. The things I found missing are not part of the framework itself; they are provided by satellite packages, from either Zope or Repoze. It’s not the role of the official manual to document these tools, even though you will probably need them for any real-world development.

repoze.bfg deserves more introductory material, especially for people who haven’t used Zope before. In this article I’m sharing the things I wish I had known when I started. repoze.bfg plays well with both ZODB and SQLAlchemy, but here I’m only talking about working with ZODB.

Code reloading

First of all, while developing, you’ll want to have the code reloading itself automatically when you make changes:

paster serve etc/myapp.ini --reload

ZODB

Persistent classes

In ZODB, in-place changes to regular mutable Python objects, such as lists and dictionaries, are not automatically persisted. This confused me a few times. For example, if you have a list attribute on one of your objects:

>>> root['foo'].bar 
[1, 2, 3]

You might think (as I did) that this would work:

>>> root['foo'].bar.append(4)
>>> transaction.commit() # Commit is needed in the shell

Well, as you might have guessed by now, the change won’t get saved. ZODB only notices changes made to persistent objects, and a plain list isn’t one. One way to tell ZODB to persist the change is to set the _p_changed attribute on the persistent object that holds the collection after you’ve updated it:

>>> root['foo'].bar.append(4)
>>> root['foo']._p_changed = True
>>> transaction.commit()

It’s ugly and error-prone, but there is a better way: use one of the persistent classes that ship with ZODB:

>>> from persistent.list import PersistentList
>>> root['foo'].my_list = PersistentList()

That way any change to your list will be persisted and you don’t have to remember to set the _p_changed attribute. There is also PersistentMapping for dictionary-like objects, and the generic Persistent class, which you can subclass to make your own persistent objects:

>>> from persistent import Persistent
>>> from persistent.mapping import PersistentMapping
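
The reason the plain list slipped through can be modelled without ZODB at all. The sketch below is a toy, not ZODB’s actual implementation: Tracked and TrackedList are hypothetical stand-ins for Persistent and PersistentList, and the changed flag plays the role of _p_changed. It only illustrates that attribute assignment passes through the owning object and can be noticed, while an in-place append on a plain list never does.

```python
class Tracked(object):
    """Toy stand-in for a Persistent object (hypothetical, not ZODB code)."""

    def __init__(self):
        # Bypass our own __setattr__ so a fresh object starts out clean.
        object.__setattr__(self, "changed", False)

    def __setattr__(self, name, value):
        # Every attribute assignment marks the object as modified.
        object.__setattr__(self, name, value)
        object.__setattr__(self, "changed", True)

    def mark_clean(self):
        # Pretend a transaction was just committed.
        object.__setattr__(self, "changed", False)


class TrackedList(list):
    """Toy stand-in for PersistentList: it records its own mutations."""

    changed = False

    def append(self, item):
        list.append(self, item)
        self.changed = True


foo = Tracked()
foo.bar = [1, 2, 3]                  # assignment goes through __setattr__
foo.mark_clean()

foo.bar.append(4)                    # in-place mutation of a plain list
plain_list_noticed = foo.changed     # the owner was never told

foo.my_list = TrackedList()
foo.mark_clean()
foo.my_list.append(4)                # the list flags itself as changed
tracked_list_noticed = foo.my_list.changed
```

In this model the append on the plain list leaves the owner looking untouched, which is the situation where ZODB has nothing to write at commit time.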

Folders

While persistent classes are good for storing objects in ZODB, objects that we intend to use with Traversal also need a __name__ and a __parent__ attribute. The tutorial explains well how to manage these attributes ourselves, but there is an easier way: repoze.folder provides a Folder class, a persistent dictionary-like container that sets the __name__ and __parent__ attributes of the objects you add to it. To get this benefit, you just need to define your models by subclassing Folder.
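
To make the bookkeeping concrete, here is a minimal in-memory sketch of what “managing __name__ and __parent__” means. SimpleFolder, Book and path_of are hypothetical names of mine and this is not repoze.folder’s code (there is no persistence here at all); it only shows the idea of recording each child’s name and container when it is stored.

```python
class SimpleFolder(dict):
    """Hypothetical in-memory model of a folder (not repoze.folder's code)."""

    def __init__(self):
        dict.__init__(self)
        self.__name__ = None      # the root has no name...
        self.__parent__ = None    # ...and no parent

    def __setitem__(self, name, obj):
        # Record where the child lives before storing it.
        obj.__name__ = name
        obj.__parent__ = self
        dict.__setitem__(self, name, obj)


class Book(object):
    def __init__(self, title):
        self.title = title


def path_of(obj):
    """Rebuild an object's URL path by walking up its __parent__ chain."""
    names = []
    while obj.__parent__ is not None:
        names.append(obj.__name__)
        obj = obj.__parent__
    return "/" + "/".join(reversed(names))


root = SimpleFolder()
root["books"] = SimpleFolder()
root["books"]["42"] = Book("Some title")
book_path = path_of(root["books"]["42"])
```

Because every object knows its name and its parent, the path of the book can be rebuilt from the object alone, which is the kind of information URL generation relies on.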

Database queries

Coming from an SQL background, I spent a comical amount of time searching the web for information about how to query data from ZODB. Actually, you just don’t do that. ZODB is a storage mechanism; it doesn’t provide any facility for querying the data. Instead you use a third-party indexing package: repoze.catalog. There is even a tutorial on how to integrate it with repoze.bfg, which I ignored until I realized what it was for.

Conflicts

ZODB transactions use optimistic concurrency control and therefore, now and again, a transaction can fail. This will give you an error such as:

ConflictError: database conflict error (oid 0x114f, class
myapp.models.MyModel, serial this txn started with 0x038d931ff0f3c944
2011-04-11 18:07:56.473193, serial currently committed 0x038d932e737a9077

If your code is committing transactions explicitly using transaction.commit(), then you can catch this exception and try again. However, your app may be using the transaction WSGI middleware repoze.tm2 (or maybe the older repoze.tm) so that you don’t have to call transaction.commit() explicitly in your code. In this case you won’t be able to catch the error, because the commit is done by the middleware after your own code has been executed. The solution is to use another middleware: repoze.retry. It will retry the WSGI request in case of ConflictError.
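
For the explicit-commit case, the catch-and-retry pattern can be sketched as follows. This is an illustration under assumptions, not repoze.retry’s implementation: run_with_retries is a hypothetical helper, the local ConflictError class stands in for ZODB.POSException.ConflictError, and the fake commit and abort functions replace transaction.commit and transaction.abort so that the sketch runs without a database.

```python
class ConflictError(Exception):
    """Local stand-in for ZODB.POSException.ConflictError."""


def run_with_retries(work, commit, abort, attempts=3):
    """Run work() and commit(); on a conflict, abort and start over."""
    for attempt in range(1, attempts + 1):
        try:
            result = work()
            commit()
            return result
        except ConflictError:
            abort()                  # discard the failed transaction
            if attempt == attempts:
                raise                # give up after the last attempt


# Fake transaction hooks: the first two commits conflict,
# the third one succeeds.
commit_calls = []
abort_calls = []


def fake_commit():
    commit_calls.append(1)
    if len(commit_calls) < 3:
        raise ConflictError("simulated conflict")


def fake_abort():
    abort_calls.append(1)


outcome = run_with_retries(lambda: "saved", fake_commit, fake_abort)
```

Note that work() is executed again on every attempt, so it has to be safe to repeat. The same constraint applies when a middleware replays the whole request.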

Database Maintenance

It’s important to be aware of the fact that ZODB records all changes. Even if you save the same value multiple times, it will record copies of the value and your database file will grow and eventually fill up your disk space much faster than you’d think.

You can be careful not to write to the database when it’s not really needed, but of course your app very likely needs to modify existing data. To keep the database file from growing more than necessary, you have to “pack” the database regularly. ZODB comes with a command line tool called zeopack which does this. For example, if you use buildout, the command line might look like:

  $ ./bin/zeopack localhost:9000

You will probably want to call it periodically using cron. This article has more info about this issue.

Traversal

When I first read the description of Traversal, it seemed very mysterious and clever. All the talk about graphs, context finding, view lookup, etc. got me quite confused at first. The description is abstract because the mechanism itself is abstract and should in theory be usable in different contexts, but in reality it’s mainly used for one thing: mapping URLs to objects stored in ZODB. Be aware that I am intentionally simplifying here. Traversal is more than that, but if you’re a beginner my explanation should help you get started.

Matching model classes

The idea of Traversal is excellent. URLs in web applications often correspond to objects in a database. As web backend developers, we’ve probably all been doing something like:

# Pseudo code example
def show_book():
    book_id = request.params.get('book_id')
    book = Book.get(book_id)
    if book:
        # do something with book
        # ...
        return render_template("book.html", {'book': book})
    else:
        return 404

The object id could also be part of the URL path (/books/42), but the principle is the same: we get an id, we try to get an object for that id and we do some work on that object or return an error if the object wasn’t found. That’s exactly what Traversal does, without you having to write a single line of code. Of course, Traversal is no magic and your database has to follow a certain structure for the mechanism to work. It has to follow the structure of the URLs you want to map. ZODB databases are structured as nested dictionary-like objects (often Folder objects, as we mentioned before).

Now let’s say your ZODB database has the following structure:

database_root = {
    'books': {
        '42': <Book Object>
    }
}

Using repoze.bfg’s Traversal mechanism, here is what the equivalent of the previous example would look like:

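# Pseudo bfg-style code example
from repoze.bfg.view import bfg_view

@bfg_view(for_=Book)
def show_book(context, request):
    # context is the Book that Traversal found for the requested URL
    # Do stuff here if needed...
    return render_template("book.html", {'book': context})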

If you visit the URL /books/42, Traversal will automatically map it to your Book object located at root['books']['42'] in ZODB and pass it as the context argument of your function. This is already quite useful but it gets even more powerful when you combine this with interfaces.
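
The lookup itself can be approximated in a few lines. The function below is a deliberately simplified model of my own, not repoze.bfg’s traversal code: it walks the path segments through nested dictionary-like objects and, when it can’t go any deeper, treats the next segment as the name of the view to call. The Book class and the sample tree are hypothetical.

```python
class Book(object):
    def __init__(self, title):
        self.title = title


def traverse(root, path):
    """Return (context, view_name) for a URL path (simplified model)."""
    context = root
    view_name = ""
    for segment in [s for s in path.split("/") if s]:
        getitem = getattr(context, "__getitem__", None)
        if getitem is None:
            # Leaf object: the segment can only be a view name.
            view_name = segment
            break
        try:
            context = getitem(segment)
        except KeyError:
            # No such child: the segment is taken as a view name.
            view_name = segment
            break
    return context, view_name


book = Book("Some title")
root = {"books": {"42": book}}

found, name = traverse(root, "/books/42")
found_for_comment, comment_name = traverse(root, "/books/42/comment")
missing_context, missing_name = traverse(root, "/books/99")
```

For /books/42 the context is the book and the view name is empty, which selects the default view. For /books/99 the walk stops at the books container with 99 as the view name; since no view is registered under that name, the framework would answer with a 404.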

Matching interfaces

Interfaces come from Zope and provide a way to declare expectations about how a class should behave (if you happen to know Java, you’re already familiar with the principle). For example, let’s say your app allows users to store books and photos and you want to allow visitors to leave comments on both books and photos. You can define an interface such as:

from zope.interface import Interface

class ICommentable(Interface):
    pass

Then you mark your models as implementing this interface:

from repoze.folder import Folder
from zope.interface import implements

class Book(Folder):
    implements(ICommentable)
    # ... rest of the class definition

class Photo(Folder):
    implements(ICommentable)
    # ... rest of the class definition

Now you can write a request handler that will work for any object implementing the interface:

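from repoze.bfg.view import bfg_view

@bfg_view(for_=ICommentable, name="comment")
def create_comment(context, request):
    comment = request.params.get('comment_content')
    # context is either a Book or a Photo; its comments attribute
    # should be a PersistentList so that the append gets persisted
    context.comments.append(comment)
    # ... then send a response...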

If you visit /books/42/comment or /photos/23/comment, your create_comment function will be called. In the case of books, Traversal picks create_comment rather than the show_book view from the previous section because of the name argument we passed to the view definition: once it reaches the Book at /books/42 and finds no child called comment, it treats that leftover path segment as the view name. show_book was registered without a name, so it only answers /books/42 itself. Using interfaces with Traversal allows you to write generic request handlers easily.
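
How a single handler ends up serving both kinds of object can be shown with a toy registry. Everything here is hypothetical and much cruder than the real view lookup: a marker base class called Commentable stands in for the ICommentable interface, and views are stored in a dictionary keyed by (type, view name).

```python
class Commentable(object):
    """Marker base class standing in for the ICommentable interface."""

    def __init__(self):
        self.comments = []


class Book(Commentable):
    pass


class Photo(Commentable):
    pass


VIEWS = {}


def view(for_, name=""):
    """Register a function as the view called `name` for type `for_`."""
    def register(func):
        VIEWS[(for_, name)] = func
        return func
    return register


def find_view(context, name=""):
    """Look up a view by walking the context's class hierarchy."""
    for cls in type(context).__mro__:
        func = VIEWS.get((cls, name))
        if func is not None:
            return func
    return None


@view(for_=Book)
def show_book(context, request):
    return "book page"


@view(for_=Commentable, name="comment")
def create_comment(context, request):
    context.comments.append(request["comment_content"])
    return "comment saved"


book, photo = Book(), Photo()
request = {"comment_content": "Nice!"}
book_result = find_view(book, "comment")(book, request)
photo_result = find_view(photo, "comment")(photo, request)
```

The view name is part of the lookup key, so show_book and create_comment never compete: one is registered under the empty name and the other under comment.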

Why don’t we specify any attribute in the interface definition?

As far as Traversal is concerned we don’t need to. Python uses duck-typing: if our objects behave like commentable objects, they are commentable. Here the interface is merely a marker. However you can and maybe should specify attributes in your interface definition to make sure your generic code receives what it expects. I left it out for simplicity but zope.interface has all you need for that.

Templating

Chameleon is a popular choice among repoze.bfg users, probably because it reminds them of Zope templates.

In Chameleon templates, the equivalent of if statements is tal:condition. If you’re like me, you might find yourself looking for the equivalent of an else clause. Well, it just doesn’t exist. If you think about it, Chameleon is based on XML tags. Any XML tag needs a matching closing tag, so what would an else tag look like?

Instead you just write another condition with the opposite value:

<p tal:condition="foo">
   foo is true
</p>
<p tal:condition="not foo">
   foo is false
</p>

Beware of Middleware

Be careful when using middleware. If you don’t configure your application properly it could break your scripts and the bfg shell. You might end up seeing an error such as:

$ paster --plugin=repoze.bfg bfgshell etc/paste.ini zodb
Traceback (most recent call last):
 [...]
  File "[...]/site-packages/repoze.bfg-1.1-py2.5.egg/repoze/bfg/scripting.py", line 14, in get_root
    registry = app.registry
AttributeError: MyApp instance has no attribute 'registry'

This message might be a bit confusing, but the key to the problem is actually provided by the bfgshell help:

$ paster --plugin=repoze.bfg help bfgshell
    [...]
    Example::

        $ paster bfgshell myapp.ini main

    .. note:: You should use a ``section_name`` that refers to the
              actual ``app`` section in the config file that points at
              your BFG app without any middleware wrapping, or this
              command will almost certainly fail.
    [...]

There we are. The error message above is caused by doing precisely what we shouldn’t do: passing a section name that refers to an app wrapped in WSGI middleware. So inspect your INI config file and check whether the section you’re calling (in our case a section called zodb) makes use of any middleware. If it’s a pipeline or filter-app section, it does use middleware. If it’s just an app section, look for a filter-with entry in that section. If you can’t find anything suspicious in your INI file, the middleware might be called programmatically within your app’s initialization code (grepping for “middleware” is probably the quickest way to find out where).

Now that you’ve identified the cause of the problem, you will need to reorganize your config file so that it provides two different application sections:

an app section that refers to your bare BFG app, which will be used by bfgshell and by scripts,
a pipeline or filter-app section that wraps your bare BFG app with the WSGI middleware you need and that you’ll call with Paster, mod_wsgi or whatever you happen to use to serve your app.

There is more than one way to do this, so please refer to Paste Deploy reference documentation to make informed decisions about how to restructure your configuration.
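
As one possible layout, and only as a sketch, the two sections could look like this. The egg:myapp#app entry point is a placeholder for your own application, and the choice of middleware (repoze.retry wrapping repoze.tm2) is my assumption based on the packages mentioned earlier, not a required setup. A server section is still needed to serve the app and is omitted here.

```ini
# Bare BFG app: the section to give to bfgshell and to scripts.
[app:zodb]
use = egg:myapp#app

# What the web server runs: the same app wrapped in middleware.
# retry comes first so that it can replay a request whose commit,
# performed by the transaction middleware, failed.
[pipeline:main]
pipeline =
    egg:repoze.retry#retry
    egg:repoze.tm2#tm
    zodb
```

With this layout, bfgshell is pointed at the zodb section, as in the command shown earlier, while the server uses the main pipeline.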

Conclusion

repoze.bfg is a very robust framework. While working with it, I didn’t hit a single bug, which is quite rare with web frameworks. Using Traversal and ZODB is an interesting and refreshing approach to building web applications. I hope these notes can make it a little easier for beginners to get started.

Alex Marandon
