Wednesday, February 16, 2011

Lucene search fails against decimal numbers in dotCMS

Update: 17/02/2011 11:30:17 AM. Christopher F. Falzone shows how to filter out extraneous results (see the end of this post).

When running a Lucene query in dotCMS, searching or filtering by a number with a decimal point doesn't work against a field with the float data type. For example, if a float field has a value of "3.0" or "3.24", searching for "3.0" or "3.24" respectively will not work. Any search term that contains a decimal point will fail against a float field.

This affects both the Admin UI, when searching for or filtering content, and front-end pages that run a macro such as #pullContent().

This bug affects dotCMS versions prior to 1.9.2. The bug report for it is filed under "Can't Filter Content by Field Stored as Decimal". I stumbled across this bug in my own project work against dotCMS 1.7 and reported it in the Yahoo forum post "Searching for decimal via Lucene?"

There is no fix apart from either upgrading (ugh) or porting across the fixed Java code (if you can find it and if it has no other dependencies). Instead, I used the following workaround in my code: rip off the decimal portion of the number and tack on an asterisk, so that the query becomes a wildcard search on only the integral portion of the number. For example:

#set($indexVal = $floatSearchTerm.indexOf("."))
#if($indexVal > 0)
   #set($floatSearchTerm = $floatSearchTerm.substring(0, $indexVal))
#end
#pullContent("+text2:$floatSearchTerm*" "0" "text1")

This means that instead of searching for "3.0" or "3.24", I will be searching for "3*" in both cases. Sure, you will probably get more results (anything starting with "3" will match), but at least those results will include the one you need. One sticking point is whether, and how, to tell the user that you have changed their search term. Perhaps you might re-display their search term as "3" instead of "3.0" or "3.24". In my case, I am not doing anything and am hoping that users will be satisfied just by getting relevant results.

Note that in terms of filtering content in the Admin UI, just leave off the decimal portion and the Admin UI will add the wild card to the end of your search term by default.


Update: 17/02/2011 11:30:17 AM. Christopher F. Falzone posted an update in the dotCMS Yahoo forum noting that it is not too hard to loop through your results and remove those that don't exactly match the original search term. Here is the code as he adjusted it to do that.

#set($origFloatTerm = $floatSearchTerm)
#set($indexVal = $floatSearchTerm.indexOf("."))
#if($indexVal > 0)
   #set($floatSearchTerm = $floatSearchTerm.substring(0, $indexVal))
#end
#pullContent("+text2:$floatSearchTerm*" "0" "text1")

#foreach($content in $list)
   #if($content.floatField.equals($origFloatTerm))
   ...
   #end
#end

Of course, this filtering loop won't work if you are relying on #pageContent(), but I will address that in a later post.
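
For anyone who wants to experiment with the truncate-then-filter logic outside of a Velocity template, here is a minimal sketch in plain Java. It is not dotCMS code: the class and method names are hypothetical, and the list of hits is just a stand-in for whatever field values the wildcard query returns. One assumption worth flagging is that the filter compares values numerically with BigDecimal rather than as strings, so a stored "3.240" still matches a search for "3.24"; the Velocity version above uses a plain equals() instead.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; none of these names come from dotCMS.
class FloatTermWorkaround {

    // Drop the decimal portion and append "*", mirroring the Velocity snippet:
    // "3.24" becomes "3*". A term with no "." or with a leading "." (index 0)
    // keeps its text, just as the #if($indexVal > 0) check does.
    static String toWildcardTerm(String term) {
        int dot = term.indexOf('.');
        String integral = dot > 0 ? term.substring(0, dot) : term;
        return integral + "*";
    }

    // Keep only the values that are numerically equal to the original term.
    // BigDecimal.compareTo ignores scale, so "3.24" and "3.240" are equal.
    // Non-numeric input would throw NumberFormatException (not handled here).
    static List<String> exactMatches(List<String> values, String originalTerm) {
        BigDecimal wanted = new BigDecimal(originalTerm);
        List<String> kept = new ArrayList<>();
        for (String value : values) {
            if (new BigDecimal(value).compareTo(wanted) == 0) {
                kept.add(value);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String term = "3.24";
        System.out.println(toWildcardTerm(term)); // prints 3*

        // Stand-in for field values that a "3*" wildcard query might return.
        List<String> hits = Arrays.asList("3.0", "3.24", "3.240", "31.5");
        System.out.println(exactMatches(hits, term)); // prints [3.24, 3.240]
    }
}
```

Note how "31.5" shows up among the wildcard hits: "3*" matches anything that starts with "3", which is exactly why the second filtering pass is worth doing.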