Local Parameters

DSE Version: 6.0



Earlier in the course, we talked about how we can add instructions to our overall search request by using JSON notation.  Now, let's look at how we can add instructions to individual search expressions.

Local parameters are a way of explicitly setting or overriding parameters for query execution on a clause-by-clause basis.  We can give instructions to the query parser, and we can even select which query parser is going to be used to act on our clause.

The basic syntax for specifying local parameters is an opening curly brace followed immediately by an exclamation point, and then a closing curly brace.  Within those braces, we can specify various instructions as key-value pairs, separated by spaces. This structure is inserted at the beginning of the search expression.
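To make the shape of that prefix concrete, here is a minimal sketch in Python that assembles one.  The helper name local_params and the field names are hypothetical, made up for illustration; only the curly brace, exclamation point, and key=value layout comes from the syntax described above.

```python
def local_params(params, parser=None):
    """Build a local-parameters prefix such as {!df=title_tt} (hypothetical helper)."""
    # A bare parser name right after "{!" is shorthand for type=<parser>.
    parts = [parser] if parser else []
    for key, value in params.items():
        text = str(value)
        # Values containing whitespace are quoted so they stay a single value;
        # inner double quotes are backslash-escaped.
        if any(ch.isspace() for ch in text):
            text = '"' + text.replace('"', '\\"') + '"'
        parts.append(f"{key}={text}")
    return "{!" + " ".join(parts) + "}"


# The prefix goes at the very beginning of the search expression.
expression = local_params({"df": "title_tt"}) + "godfather"
print(expression)
```

The prefix always comes first, and the search terms follow the closing brace with nothing in between.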

Some of the local parameters supported in open source Apache Solr don't make sense to use in DSE Search -- for instance, the FL -- or field list parameter.  Our CQL SELECT clause takes care of this, so there's no reason to pass it as a local parameter.

But, several of the parameters are useful -- such as df (or default field) and Q DOT OP which lets us specify the boolean logic with which multiple clauses are combined.

Here you can see an example using the DF parameter to specify that we want to run this search against the title TT field.
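As a rough sketch of what such an expression might look like when it is sent through CQL, the Python below wraps a search expression in a SELECT statement.  The table name movies, the field name title_tt (our guess at how "title TT" is spelled), and the helper name are all assumptions, not taken from the course material.

```python
def select_with_solr_query(table, expression, columns="*"):
    """Wrap a search expression in a CQL SELECT (hypothetical helper)."""
    # CQL string literals use single quotes; an embedded single quote is doubled.
    escaped = expression.replace("'", "''")
    return f"SELECT {columns} FROM {table} WHERE solr_query = '{escaped}';"


# df picks the default field; q.op=AND requires both terms to match.
df_only = select_with_solr_query("movies", "{!df=title_tt}godfather", "title")
df_and_op = select_with_solr_query("movies", "{!df=title_tt q.op=AND}godfather part", "title")
print(df_only)
print(df_and_op)
```

With q.op=AND, both terms would have to appear in the default field; without it, the parser's default operator applies.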

There are several query parsers which can be used to parse our search expressions.  The default is to use the Standard Query Parser which is referred to as LUCENE. If we want to explicitly specify a parser, we can do so using the type parameter.

As mentioned previously, the default query parser is the Standard Query Parser.  It is Lucene's query parser with a few modifications. It's a very strict parser in terms of how it deals with improper syntax.

An alternative to the standard parser is the DisMax parser.  DisMax stands for Maximum Disjunction. If you want to know what that means, I invite you to research it on your own; but all you really need to remember is that the dismax parser allows for some advanced features like score boosting, and it is much less strict about search expression syntax.  We can invoke it by specifying type=dismax as a local parameter.
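Here is a small sketch of selecting the parser from inside the expression.  It shows the explicit type=dismax form alongside the short form, where the parser name directly follows the exclamation point.  The table movies and field title_tt are assumed names for illustration.

```python
# Hypothetical names: table "movies", field "title_tt".
long_form = "{!type=dismax qf=title_tt}godfather"
short_form = "{!dismax qf=title_tt}godfather"  # shorthand for type=dismax

for expression in (long_form, short_form):
    print(f"SELECT title FROM movies WHERE solr_query = '{expression}';")
```

Both forms ask for the same parser; the short form is just less typing.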

They say that the third time is a charm, and that turns out to be the case in the world of query parsers, too.

The eDisMax or extended DisMax parser is a big step forward from the dismax parser.  It supports the full lucene query parser syntax and adds some additional features.

One thing to pay attention to is that the edismax parser requires you to specify a default field using the DF parameter -- OR -- multiple fields using the QF parameter.  This is true even if your search expression explicitly qualifies each of its terms with a field.

Here's an example using QF.  Our search term is Godfather with no hint as to what field to use.  But, our QF clause specifies both the title TT field and the description TT field.  When DSE Search processes this expression, it will apply the search term "Godfather" against both of these fields.
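An expression along those lines might be built as in the sketch below.  The field names title_tt and description_tt are our assumption about how the spoken names are spelled, and the table name movies is hypothetical.

```python
# Hypothetical field names; qf takes a space-separated list, so the value is quoted.
fields = ["title_tt", "description_tt"]
qf = " ".join(fields)

expression = f'{{!edismax qf="{qf}"}}Godfather'
cql = f"SELECT title FROM movies WHERE solr_query = '{expression}';"
print(cql)
```

Because the qf value contains a space, it has to be wrapped in quotes so the parser treats it as one value.  Double quotes are convenient here because the surrounding CQL string literal already uses single quotes.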

You can see in the results that we get movies with Godfather in the title; but we also get some results that DON'T have Godfather in the title, but they have it in their description.

This is a much cleaner syntax than specifying something like: title contains godfather OR description contains godfather OR fields X, Y, and Z also contain godfather.  This is especially true if you're searching against a lot of fields.

When we get into the realm of searching against multiple fields, we might want to assign more relevancy to documents that show up in our results because they matched against one field versus another field. 

This is where field boosting comes into play.

Inside the QF parameter, we can use the caret character to supply a boost factor to certain fields.

In our previous Godfather example, we got matches based on title and description, but you may remember that the Godfather trilogy movies did NOT show up as the first -- and therefore most relevant -- results.

By applying a larger boost to the title field than we apply to the description field, we can affect the scoring and make sure that the Godfather trilogy shows up first in our results.
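The caret syntax can be sketched as follows.  The boost factor of 10 and the field names are illustrative assumptions, not the values used in the course demo.

```python
def qf_value(boosts):
    """Render {field: boost} as a qf string; a boost of 1 is left implicit."""
    return " ".join(
        field if boost == 1 else f"{field}^{boost}"
        for field, boost in boosts.items()
    )


# Hypothetical weights: a title match counts ten times as much as a description match.
qf = qf_value({"title_tt": 10, "description_tt": 1})
expression = f'{{!edismax qf="{qf}"}}godfather'
print(expression)
```

The exact number matters less than the ratio between the fields, so it is usually worth trying a few values and checking how the ordering changes.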

Boosting certain results can also be accomplished by using the boost parameter and supplying a numeric field whose values will be applied as a boost factor.

Additionally, the boost query parameter can be used to apply boosts to results that match the supplied search expressions.

Finally, the boost functions parameter can be used to supply a function that will be used to apply the boost factor.

Between these various parameters, we have lots of interesting ways to apply boosting.  Let's look at some examples.

Here, we are specifying the average rating field as our boost field.  The effect of doing this is that our results will favor highly rated movies over movies with lower ratings.
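To get a feel for why this favors highly rated movies, here is a small sketch.  In edismax, the boost parameter is multiplied into the score from the main query, so the code below simply multiplies made-up base scores by made-up ratings.  The field name avg_rating and every number here are assumptions for illustration only.

```python
# Hypothetical expression: avg_rating is an assumed numeric field name.
expression = '{!edismax qf="title_tt description_tt" boost=avg_rating}godfather'
print(expression)

# Made-up (title, base score, avg_rating) rows to show the multiplicative effect.
rows = [("Movie A", 2.0, 3.0), ("Movie B", 1.5, 9.0), ("Movie C", 1.8, 6.0)]

boosted = sorted(
    ((title, score * rating) for title, score, rating in rows),
    key=lambda pair: pair[1],
    reverse=True,
)
for title, score in boosted:
    print(title, round(score, 2))
```

In this toy data, Movie A has the best base score but the lowest rating, so it drops to the bottom once the boost is applied.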

What if we want certain specific values to really impact a search result's relevancy?  We can specify those terms in the boost query parameter.

Maybe we're Robert De Niro and Andy Garcia fans, so we want to see the two Godfather sequels get higher rankings.  In this example, we boost results where the title contains TWO or THREE, and we can see that in the corresponding result sets, parts two and three end up at the top of our lists.
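A boost query along these lines might look like the sketch below.  The terms II and III, the boost factor of 5, and the field names are all assumptions; the original demo may well have used different values.

```python
# Hypothetical boost query: favor titles containing II or III.
# bq is given in regular query syntax and only influences scoring;
# it does not filter documents out of the results.
boost_query = "title_tt:(II OR III)^5"

expression = f'{{!edismax qf="title_tt description_tt" bq="{boost_query}"}}godfather'
print(expression)
```

Documents that match the boost query get extra score on top of whatever they earned from the main search term, which is how the sequels can move to the top.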

Sometimes, not even the boost query gives us enough flexibility in expressing how we want to boost things.

In this example, we want to boost the relevancy of newer movies over older movies.

We can specify a function that converts our date values into numeric values -- suitable for use as a boost factor. Here we're using MS -- which returns the milliseconds between two date values -- and RECIP -- which yields the mathematical reciprocal.
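To see what that combination does numerically, here is a sketch that mimics it in Python.  Solr's recip(x,m,a,b) computes a / (m*x + b).  The constants 3.16e-11, 1, 1 are the ones commonly shown in the Solr documentation (3.16e-11 is roughly one over the number of milliseconds in a year), and the field name release_date, the fixed "now", and the approximate release dates are assumptions for illustration.

```python
from datetime import datetime, timezone

# Hypothetical expression: release_date is an assumed date field name.
expression = (
    '{!edismax qf="title_tt description_tt" '
    'bf="recip(ms(NOW,release_date),3.16e-11,1,1)"}godfather'
)


def recip(x, m, a, b):
    """Same formula as Solr's recip function: a / (m*x + b)."""
    return a / (m * x + b)


def recency_boost(released, now):
    # ms(NOW, date) is the number of milliseconds between the two dates.
    age_ms = (now - released).total_seconds() * 1000.0
    return recip(age_ms, 3.16e-11, 1.0, 1.0)


now = datetime(2018, 1, 1, tzinfo=timezone.utc)  # fixed "now" so results are repeatable
part_two = recency_boost(datetime(1974, 12, 20, tzinfo=timezone.utc), now)    # approximate date
part_three = recency_boost(datetime(1990, 12, 25, tzinfo=timezone.utc), now)  # approximate date
print(part_two < part_three)
```

A movie released today gets a value of 1, a one-year-old movie gets about 0.5, and the value keeps shrinking with age, so the newer movie always ends up with the larger boost.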

The effect in this case is that now Godfather Part THREE is ranked higher than Godfather Part TWO -- since it was released afterwards.

You can find the list of available functions in the online Solr Reference guide.
