Clone My Fields, Please

The Introduction

I recently started using the SearchManager from the Mercury Tide white paper on using MySQL full-text search with Django. It's been helpful, but I ran into a bug recently while trying to add a default filter to a SearchManager subclass.

The Boring Context

Rather than deleting objects from the database, my application sets a boolean flag to indicate that the content is no longer relevant. I wanted my manager to apply a filter to every query set so that it includes only items that are not disabled. Here's what the manager class looks like:

class SearchableItemManager(SearchManager):
    def __init__(self):
        # Fields handed to the full-text search.
        zuper = super(SearchableItemManager, self)
        zuper.__init__(('name', 'description',))

    def get_query_set(self):
        # Hide disabled items from every query made through this manager.
        query = super(SearchableItemManager, self).get_query_set()
        return query.filter(is_enabled=True)

The Ugly Crash

When I made the change, I found that calling the search() method raised a TypeError: "'NoneType' object is not iterable." The error occurred when the SearchQuerySet tried to construct the SQL for the MATCH…AGAINST clause. Somehow, the _search_fields tuple on the SearchQuerySet was None.

The Mystery Solved

This had me baffled until I had a look at the _QuerySet code in Django. It seems obvious now, but adding an additional filter to a query set returns a clone of the original with the new filter added. The _QuerySet object contains a _clone method that copies a hard-coded list of fields from the old QS to the new one. Naturally, that hard-coded list doesn't know anything about my _search_fields, so the property has no value on the clone.
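The failure is easy to reproduce without Django. The sketch below uses hypothetical stand-in classes (BaseQuerySet and a toy SearchQuerySet with a simplified constructor); it is not Django's actual implementation, only an illustration of how a _clone that copies a fixed list of attributes drops anything a subclass adds.

```python
class BaseQuerySet(object):
    """Hypothetical stand-in for Django's query set; not the real class."""

    def __init__(self):
        self._filters = {}

    def _clone(self):
        # Copies only the attributes the base class knows about.
        clone = self.__class__()
        clone._filters = dict(self._filters)
        return clone

    def filter(self, **kwargs):
        # Filtering never mutates self; it returns a modified clone.
        clone = self._clone()
        clone._filters.update(kwargs)
        return clone


class SearchQuerySet(BaseQuerySet):
    """Toy subclass adding one attribute the base class has never heard of."""

    def __init__(self, fields=None):
        super(SearchQuerySet, self).__init__()
        self._search_fields = fields


original = SearchQuerySet(fields=('name', 'description'))
filtered = original.filter(is_enabled=True)

print(original._search_fields)  # ('name', 'description')
print(filtered._search_fields)  # None

try:
    # Anything that iterates over the fields now fails on the clone.
    for field in filtered._search_fields:
        pass
except TypeError as exc:
    print(exc)
```

The original object is untouched; only the clone returned by filter() has lost its fields, which is why the problem appeared as soon as a default filter was added.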

The Fix

Now, depending on how much of a zealot you are about modifying “private” functions, there are two ways to fix this. The easiest method is to simply override the _clone method and add the _search_fields tuple to the clone. The alternative is to override every method that depends on the _clone method, and copy over the _search_fields tuple for each one. I think that would be stupid, and will speak of it no further. Here's the code I added to generate happiness:

class SearchQuerySet(models.query.QuerySet):
    # ... code from the original Mercury Tide class
    def _clone(self, klass=None, **kwargs):
        zuper = super(SearchQuerySet, self)
        clone = zuper._clone(klass, **kwargs)
        clone._search_fields = self._search_fields
        return clone
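To show why one override is enough, here is a minimal sketch with hypothetical stand-in classes (again, not Django's real code, and the exclude() bookkeeping is invented for the toy). Because every chaining method goes through _clone, patching _clone once carries the attribute through any combination of calls.

```python
class BaseQuerySet(object):
    """Hypothetical stand-in for Django's query set; not the real class."""

    def __init__(self):
        self._filters = {}

    def _clone(self):
        clone = self.__class__()
        clone._filters = dict(self._filters)
        return clone

    def filter(self, **kwargs):
        clone = self._clone()
        clone._filters.update(kwargs)
        return clone

    def exclude(self, **kwargs):
        # Toy bookkeeping: record exclusions under a "not_" prefix.
        clone = self._clone()
        clone._filters.update(('not_' + key, value) for key, value in kwargs.items())
        return clone


class SearchQuerySet(BaseQuerySet):
    def __init__(self, fields=None):
        super(SearchQuerySet, self).__init__()
        self._search_fields = fields

    def _clone(self):
        # Single override: every method that clones picks this up.
        clone = super(SearchQuerySet, self)._clone()
        clone._search_fields = self._search_fields
        return clone


chained = SearchQuerySet(('name', 'description')).filter(is_enabled=True).exclude(name='')
print(chained._search_fields)  # ('name', 'description')
```

Neither filter() nor exclude() had to be touched in the subclass.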

Hmm.. tried to adapt that white-paper.

I keep getting the following error upon MyModel.objects.search("x"):

“module has no attribute quote_name”

Any clue?

Posted by Yeago at 04:34:10 PM on 1 April 2008

That sounds familiar, but I don’t remember the cause. This is probably a stupid question, but you are using MySQL, right?

Posted by Jason Wadsworth at 04:46:39 PM on 1 April 2008

Yes, mysql.

I’m using Django SVN and I’ve verified that quote_name exists in that class.

Posted by Yeago at 09:49:46 AM on 3 April 2008

Actually, I’m going to ignore the stuff at mercurytide as this has since been built-in to Django.

Also, for those non-mysqlers:

http://www.davidcramer.net/code/79/in-depth-django-sphinx-tutorial.html

Posted by Yeago at 01:25:16 PM on 3 April 2008

Mmm…. ok, I retract the above remark about it being built in. Apparently Django doesn’t provide for full-text across columns.

Posted by Yeago at 01:59:10 PM on 3 April 2008

Maybe the whole comment was dumb. Sphinx is for MySQL+Django only.

Posted by Yeago at 02:04:55 PM on 3 April 2008

Last I checked, the built-in Django support only supported single columns, and only in boolean mode. That’s definitely useful, but it’s not appropriate for every situation.

I looked at my code and remembered the fix for the quote_name problem. The organization of the backends in Django changed in the trunk so that quote_name was accessed through a DatabaseOperations interface.

Replace:
backend.quote_name(...)

With:
ops = backend.DatabaseOperations()
ops.quote_name(...)

Posted by Jason Wadsworth at 10:04:04 AM on 5 April 2008

Don’t suppose the SearchManager is choking upon the last svn update?

django/db/models/query.py

line c = klass(model=self.model, query=self.query.clone())

“__init__() got an unexpected keyword argument ‘query’”

Digging around. Letcha know if I find something out.

Posted by Yeago at 01:28:22 PM on 28 April 2008

I haven’t checked it to any real extent, but it looks like just bad subclassing form. Adding *args and **kwargs in the usual way should make the problem disappear:

def __init__(self, index_column, *args, **kwargs):
    super(SearchManager, self).__init__(*args, **kwargs)

Similarly for SearchQuerySet.

Posted by Craig Ogg at 07:39:19 PM on 28 April 2008
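As an illustration of Craig's suggestion, here is a minimal sketch using hypothetical stand-in classes (none of these are Django's or Mercury Tide's real classes). It assumes a base class whose _clone rebuilds the object by passing keyword arguments to the constructor, which is what the traceback above suggests is happening.

```python
class BaseQuerySet(object):
    """Hypothetical stand-in: a base class that rebuilds itself by keyword."""

    def __init__(self, model=None, query=None):
        self.model = model
        self.query = query

    def _clone(self):
        # The base class constructs the subclass with its own keyword arguments.
        return self.__class__(model=self.model, query=self.query)


class StrictQuerySet(BaseQuerySet):
    # Fixed signature: rejects the keywords that _clone passes in.
    def __init__(self, fields=None):
        super(StrictQuerySet, self).__init__()
        self._search_fields = fields


class ForwardingQuerySet(BaseQuerySet):
    # Accepts its own argument and forwards everything else to the base class.
    def __init__(self, fields=None, *args, **kwargs):
        super(ForwardingQuerySet, self).__init__(*args, **kwargs)
        self._search_fields = fields


try:
    StrictQuerySet(('name',))._clone()
    strict_error = None
except TypeError as exc:
    # Same kind of failure as the "unexpected keyword argument" error above.
    strict_error = str(exc)

print(strict_error)

# The forwarding version survives being rebuilt by the base class.
clone = ForwardingQuerySet(('name',), model='Item', query='q')._clone()
print(clone.model, clone.query)
```

Note that the forwarding version clones without error, but the clone's _search_fields is still None unless _clone copies it across, which is the fix described in the article.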

I think that fixed one issue, Craig.

Now I’m attempting to track down an odd bug whereby attempting to filter() a result-set results in [], regardless of match.

Posted by Yeago at 11:54:09 AM on 14 May 2008

This article and the accompanying comments were most helpful in converting MercuryTide’s search to Django 1.0+ compatibility, thank you!

I ended up using MercuryTide’s search as a starting point, but wrote the actual search aspect myself, because InnoDB doesn’t support MySQL’s fulltext search.

Posted by JR at 07:14:08 PM on 13 November 2008
