Living on the Edge

Django-SQLAlchemy

Posted on March 21, 2008

One of the things that has kept me very busy over the past several months is something called Django-SQLAlchemy. I started toying with the idea back in the fall, and it rumbled around in my head for a while as I thought through different approaches to integrating SQLAlchemy into my Django projects. Yes, I’m aware you can just import sqlalchemy and you’re done. But no one is ever really talking about that, are they? What people want is the ability to keep everything the same but still have the power and flexibility of SQLAlchemy available to them. That’s what this project aims to do.

Finally, sometime in or around December 2007, Brian Rosner and I began discussing the project. I was glad to hear that he had been thinking about similar things, and we decided to put some effort into it. Brian and I have spent the past several months building a proof of concept and developing a roadmap for the project. A lot of hackish code was written, and we frequently went down one road only to discover there was a much easier way. In the end we figured out what we think is a plausible plan for integrating SQLAlchemy into Django.

At PyCon 2008 in Chicago, Brian and I decided to set up a Birds of a Feather session in one of the Open Spaces to discuss django-sqlalchemy. Much to our surprise, we had a great turnout. First of all, Mike Bayer, creator of SQLAlchemy, and Jason Kirtland, core contributor to SQLAlchemy, both showed up and were extremely helpful. They indicated that they are really interested in seeing the project become a success, and that they would help in any way possible. During the meeting they even discussed modifying some functionality within SQLAlchemy in order to make our job a bit easier. Adam Gomaa also showed up and expressed an interest in helping on the project. We added Adam as a contributor immediately and we all began hacking away.

Over the next several days, Adam, Brian, and I made really good progress. We solidified some of the hackish filter translation code, we created a test runner, we made a lot of progress on mapping the related fields, and most importantly we completely replaced the declarative layer (previously Elixir based) with code based on the declarative layer that Mike created in SQLAlchemy. We also bugged Mike and Jason continuously, and every time they provided us with valuable guidance that saved us hours of grief. We cannot thank them enough for their help.

Development Roadmap

The fundamental approach of the django-sqlalchemy project is to implement SQLAlchemy as a custom database backend. The actual SQLAlchemy backends are made available through custom project settings. By doing so we can “hijack” all calls to the QuerySet, substituting our own custom QuerySet that interprets these calls and passes them on to SQLAlchemy. We also need to provide SQLAlchemy hooks into our Django models, and this is handled by the declarative layer of the project. Finally, management commands that affect the database backend need to be overridden to do the right thing in SQLAlchemy terms.
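To make the “hijack” idea concrete, here is a minimal sketch in plain Python. It is not the project's code and uses no Django or SQLAlchemy APIs: the class names, the registry, and the backend keys are all hypothetical stand-ins. It only shows how a backend named in settings could decide which QuerySet class answers calls, while application code keeps using the same interface.

```python
# Hypothetical stand-ins only; none of these names come from Django
# or from django-sqlalchemy.


class DefaultQuerySet:
    """Stand-in for the ORM's stock QuerySet."""

    def __init__(self, model_name):
        self.model_name = model_name
        self.lookups = {}

    def filter(self, **lookups):
        # Return a new object so chained calls never mutate the original.
        clone = type(self)(self.model_name)
        clone.lookups = {**self.lookups, **lookups}
        return clone

    def describe(self):
        return f"default ORM query on {self.model_name}: {self.lookups}"


class SqlAlchemyQuerySet(DefaultQuerySet):
    """Same public interface; the real work would be handed to SQLAlchemy."""

    def describe(self):
        return f"SQLAlchemy query on {self.model_name}: {self.lookups}"


# Hypothetical registry keyed by the backend named in project settings.
QUERYSET_FOR_BACKEND = {
    "default": DefaultQuerySet,
    "sqlalchemy": SqlAlchemyQuerySet,
}


class Manager:
    """Stand-in for a model manager; application code only talks to this."""

    def __init__(self, model_name, backend):
        self.model_name = model_name
        self.queryset_class = QUERYSET_FOR_BACKEND[backend]

    def filter(self, **lookups):
        return self.queryset_class(self.model_name).filter(**lookups)


articles = Manager("Article", backend="sqlalchemy")
query = articles.filter(author="Brian").filter(published=True)
print(query.describe())
```

The calling code is identical for either backend; only the registry entry changes, which is the property that lets existing apps keep working.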

1. Management Commands – Management commands that affect the database backend in any way need to be overridden to pass on calls to the SQLAlchemy layer. This affects at least the following: syncdb, sql*, reset, dbshell, createcachetable, dumpdata, flush, inspectdb, loaddata.

2. Declarative Mapper – Map the Django model structure into SQLAlchemy Table and Mapper information so that SQLAlchemy is accessible while the model still behaves the way Django expects a model to behave. This includes mapping all of the fields, related fields, polymorphic relations, etc. Once we get it functionally compliant, there will also be a lot of hooks for extensions that provide greater access to SQLAlchemy's power.

3. QuerySet Mapping – Hook into the django-sqlalchemy database backend so that Django’s QuerySet is replaced by a SqlAlchemyQuerySet that interprets all backend query functionality and passes it on to SQLAlchemy. This includes the handling of sessions, metadata, and transactions.
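As an illustration of the kind of filter translation the QuerySet mapping involves, here is a rough sketch that turns Django-style keyword lookups into calls on column objects. It is an assumption about the general shape of the problem, not the project's implementation. The operator methods it calls (startswith, endswith, contains, in_, and the comparison operators) are ones SQLAlchemy columns provide; translate_filters, LOOKUPS, and FakeColumn are hypothetical names. FakeColumn stands in for a real column so the sketch runs without SQLAlchemy installed, and lookups that span relations are not handled.

```python
import operator

# Django lookup suffix -> function applied to (column, value).
LOOKUPS = {
    "exact": operator.eq,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
    "startswith": lambda col, value: col.startswith(value),
    "endswith": lambda col, value: col.endswith(value),
    "contains": lambda col, value: col.contains(value),
    "in": lambda col, value: col.in_(value),
}


def translate_filters(columns, **filters):
    """Turn Django-style keyword filters into a list of column expressions.

    `columns` maps field names to column-like objects.
    """
    expressions = []
    for key, value in filters.items():
        field, _, lookup = key.partition("__")
        lookup = lookup or "exact"  # `title="x"` means `title__exact="x"`
        if lookup not in LOOKUPS:
            raise ValueError(f"unsupported lookup: {lookup!r}")
        expressions.append(LOOKUPS[lookup](columns[field], value))
    return expressions


class FakeColumn:
    """Hypothetical stand-in for a column; returns readable strings."""

    def __init__(self, name):
        self.name = name

    def _compare(self, symbol, value):
        return f"{self.name} {symbol} {value!r}"

    def __eq__(self, value):
        return self._compare("=", value)

    def __gt__(self, value):
        return self._compare(">", value)

    def __ge__(self, value):
        return self._compare(">=", value)

    def __lt__(self, value):
        return self._compare("<", value)

    def __le__(self, value):
        return self._compare("<=", value)

    def startswith(self, value):
        return self._compare("LIKE", value + "%")

    def endswith(self, value):
        return self._compare("LIKE", "%" + value)

    def contains(self, value):
        return self._compare("LIKE", "%" + value + "%")

    def in_(self, values):
        return self._compare("IN", tuple(values))


columns = {"title": FakeColumn("title"), "votes": FakeColumn("votes")}
clauses = translate_filters(columns, title__startswith="Django", votes__gte=10)
print(clauses)
```

Keeping the lookup table separate from the loop means a new lookup is a one-line addition, and an unsupported lookup fails loudly rather than being silently ignored.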

Benefits

So why are we doing all of this? Well, mostly we are scratching our own itch. We also sense that there is a lot of interest in the community for this type of thing. Additionally, we realize that a django-sqlalchemy integration provides a lot of benefits that will help make Django even more powerful:

  • Additional database support: MS-SQL, Firebird, MaxDB, MS Access, Sybase and Informix. IBM has also released a DB2 driver.
  • Connection Pooling
  • Multi-Database Support, including things like sharding
  • Extremely powerful query language

Not all of these things will be available initially, as we first aim to achieve functional equivalence with Django’s ORM, but over time we will provide hooks and ways to tie into the additional functionality provided by SQLAlchemy.

Interest

So if you’re still reading along at this point, you probably have some interest in the project. We welcome you to get involved in any way that works best for you. We have set up the following resources for the project:

Google Code for Ticket Tracking and Wiki

Gitorious for Source Control

Google Group for Discussion

IRC Channel #django-sqlalchemy

I will also try to be good about posting progress updates here.

Comments
  1. Nick, March 21, 2008 @ 01:30 PM

    This is a great new development, and should help to make Django an even more attractive option for web developers than it is right now.

    I hope you guys find enough time to work on this in addition to your excellent podcasting activities…

  2. Tony Becker, March 21, 2008 @ 03:54 PM

    I missed a lot of the fun stuff at PyCon because I was sick, but something like this is exactly what I thought would be useful after going to the “State of SQLAlchemy” talk. By implementing the Django ORM on top of SQLAlchemy, you get support for a lot more databases, plus a lot more functions.

    Another idea I had was hacking the test runner so that you could provide it with multiple database backends – say for now, Postgres/MySQL/Sqlite3 – and running all of the fixtures and tests against each database. You wouldn’t want to run them every time, probably, but it would be nice to run them before releasing a module that people might use against any of the supported database engines.

  3. Robert Lofthouse (siudesign), March 21, 2008 @ 06:53 PM

    Seems like an awesome project. I shall take a dig around your source and see where you’re at.

  4. Mick, March 22, 2008 @ 03:34 AM

    Very cool to see someone work on this. Good luck!

  5. Vernon Walker, March 22, 2008 @ 03:55 PM

    This is great news, thanks Michael for this and all the other great work you’ve been doing with Django lately (the weekly updates etc.). I’ve been a big believer in integrating SQLAlchemy with Django, there’s a lot of extra functionality and some of the hard problems in object-relational mapping seem to have already been solved in SQLAlchemy + Elixir (model inheritance, composite primary keys). I had a question (if it makes sense): your approach is mapping the Django model definitions and queries to SQLAlchemy, but given how closely the latest Elixir syntax matches Django, was some consideration given to just replacing Django models and dropping in SQLAlchemy + Elixir, the hard work making it function with generic views and the admin, etc.? Just curious whether that approach would have been viable.

  6. Empty, March 22, 2008 @ 06:00 PM

    Tony: Sounds like a good idea and something we’ve talked about. Feel free to clone the mainline and work on it.

    Vernon: I agree, but the problem with that approach is that you break every existing app out there. If you’re going to do that you might as well just fork the Django project. We are trying to preserve what’s out there.

  7. Trond, March 22, 2008 @ 06:11 PM

    Great boost to both django and sqlalchemy if could be done in a sane way.

    I really need multi database support in django…

    Keep up the great work :-)

  8. Antti Kaihola, March 22, 2008 @ 06:22 PM

    Great news for those of us who constantly run into the limits of Django’s ORM.

    I didn’t see any mention about compatibility with new stuff coming from the queryset-refactor branch. Are you eventually going to cover that as well?

  9. Koen Bok, March 22, 2008 @ 08:36 PM

    Great news!

  10. Az Nach, March 22, 2008 @ 11:18 PM

    Bravo!

    Despite numerous acts in defense of Django’s ORM attempting to veil its shortcomings, this initiative may prevent Django’s ORM to be a show-stopper in the long run in terms of database connectivity.

    I sincerely hope that your combined effort bears fruit and does not strand like previous endeavours that set out to achieve similar goals.

    In my opinion, the current ORM is Django’s Achilles’ heel, a “fatal weakness in spite of overall strength”.

    Having the apparent advantages of progressively decoupled newforms-admin functionality on top of a SA-injected database layer would leverage larger-scale Django adoption, even or especially at the enterprise level. It seems to me that your initiative can be considered a logical extension of Malcolm’s great strides towards refactoring Django’s Queryset.

    Good luck, I’m sure you can work things out together!

    Az

  11. Empty, March 22, 2008 @ 11:37 PM

    Az: We will try not to get sucked into the Abyss that Jacob talks about. Thanks for the encouraging words.

    Antti: Our code base is based on Queryset Refactor branch, so yes.

  12. Adam, March 23, 2008 @ 05:40 PM

    Sounds fantastic. Have you talked to any of the Django developers about this being part of the Django core?