Filter Query

DSE Version: 6.0



Another powerful feature that we can leverage in DSE Search is the filter query.

The filter query parameter filters search results, just like the Q parameter does, but it also keeps its results in a cache which can be re-used by other searches that have the same filter query criteria.

By applying these cached filters, searches that use the filter query parameters can be significantly faster than if you placed those same predicates in the Q parameter.

Let's take a look at the mechanics of how a filter query is applied.

For each FQ parameter specified in the search, that predicate is executed and its results are stored in the filter cache.  If that particular predicate has been executed before, this step can be skipped, and the existing results will be used.

Before the Q parameter is executed, the intersection of the various filter query results is taken and used as the starting point for the Q search.  In this case, documents 2, 9, and 22 are the possible candidates.

The predicate from the Q clause can now be executed against our pre-filtered set, and in this case, our only matching record is document 9.
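
The steps just described can be sketched in a few lines of Python.  This is a toy model and not how DSE Search implements its filter cache: the FilterCache class, the sample documents, and the field names are all made up for illustration, and a real filter cache also has a size limit and eviction, which this sketch ignores.

```python
from typing import Callable, Dict, List, Set

Doc = Dict[str, object]
Predicate = Callable[[Doc], bool]


class FilterCache:
    """Toy filter cache: maps an fq string to the ids of the documents it matches."""

    def __init__(self) -> None:
        self._entries: Dict[str, Set[int]] = {}
        self.hits = 0
        self.misses = 0

    def matching_ids(self, fq: str, predicate: Predicate, docs: Dict[int, Doc]) -> Set[int]:
        if fq in self._entries:
            # This predicate was executed before: reuse the stored result.
            self.hits += 1
            return self._entries[fq]
        self.misses += 1
        ids = {doc_id for doc_id, doc in docs.items() if predicate(doc)}
        self._entries[fq] = ids
        return ids


def search(docs: Dict[int, Doc], cache: FilterCache,
           filters: Dict[str, Predicate], q: Predicate) -> List[int]:
    # Intersect the (possibly cached) result of every fq predicate.
    candidates = set(docs)
    for fq, predicate in filters.items():
        candidates &= cache.matching_ids(fq, predicate, docs)
    # Run the q predicate only against the pre-filtered candidates.
    return sorted(doc_id for doc_id in candidates if q(docs[doc_id]))


# Hypothetical sample data, chosen so that documents 2, 9, and 22 pass both filters.
docs = {
    2: {"color": "red", "in_stock": True, "name": "wool scarf"},
    5: {"color": "red", "in_stock": False, "name": "rain jacket"},
    9: {"color": "red", "in_stock": True, "name": "rain jacket"},
    14: {"color": "blue", "in_stock": True, "name": "rain jacket"},
    22: {"color": "red", "in_stock": True, "name": "sun hat"},
}
filters = {
    "color:red": lambda d: d["color"] == "red",
    "in_stock:true": lambda d: d["in_stock"] is True,
}

cache = FilterCache()
first = search(docs, cache, filters, lambda d: "rain" in d["name"])
second = search(docs, cache, filters, lambda d: "hat" in d["name"])
print(first, second, cache.hits, cache.misses)  # [9] [22] 2 2
```

The second search uses a different Q predicate but the same two filters, so both filter results come straight from the cache.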

One thing to keep in mind is that predicates specified as part of the filter query are not used in calculating the relevancy of the results; that relevancy score is based solely on the predicates from the Q parameter.

If your Q parameter is set to a wildcard (for example, *:*), then all of your results will have a constant -- and therefore meaningless -- relevancy score.

As an aside, if you specify non-key fields in the CQL WHERE clause -- which is a new feature in DSE 6 -- those fields are evaluated in DSE Search using the filter query and therefore won't affect relevancy scores.

What are some use cases where filter queries can be useful?

Fields that are commonly searched against and that have low cardinality values are great candidates to be used as filter queries.

In this example, we're searching for stock trades for a given stock symbol and a given user.  Stock symbol has relatively low cardinality whereas user probably has high cardinality; in other words, there might be thousands of distinct stock symbol values, but there are potentially millions of users in our data.  So, it's much more likely that a given stock symbol will be searched for repeatedly than a given user, which makes the stock symbol the better candidate for a filter query.
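
To get a rough feel for why cardinality matters, here is a small simulation.  It is only a sketch under simplifying assumptions that are not from the lesson: filter values are drawn uniformly at random, the cache never evicts anything, and the counts (1,000 symbols, 1,000,000 users, 10,000 searches) are illustrative.

```python
import random


def hit_rate(distinct_values: int, searches: int, seed: int = 42) -> float:
    """Fraction of searches whose filter value was already seen (a cache hit)."""
    rng = random.Random(seed)
    seen = set()
    hits = 0
    for _ in range(searches):
        value = rng.randrange(distinct_values)
        if value in seen:
            hits += 1
        else:
            seen.add(value)
    return hits / searches


# Low-cardinality field (stock symbol) versus high-cardinality field (user).
symbol_rate = hit_rate(distinct_values=1_000, searches=10_000)
user_rate = hit_rate(distinct_values=1_000_000, searches=10_000)
print(f"symbol filter hit rate: {symbol_rate:.2%}")
print(f"user filter hit rate:   {user_rate:.2%}")
```

Under these assumptions, most symbol filters are served from the cache while almost every user filter is a miss.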

Here's a case where we are searching for products -- a common thing you might do on an eCommerce website.

It is likely that we only want to show products that are in stock, so we can add a filter query on the in_stock field.  The in_stock field is a boolean field, which is about as low cardinality as you can get, and it certainly qualifies as a commonly searched field in our fictitious example.

After applying that filter query, we get a more relevant set of results, and our in-stock products are now cached and available to be used by the next product search.
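
As a rough illustration, a search like this could be sent from CQL through the solr_query column using its JSON form, with the search term in q and the in-stock restriction in fq.  The sketch below only builds the statement text; the store.products table, the description field, and the search term are assumptions made up for this example.

```python
import json


def solr_query_json(q: str, fq: str) -> str:
    """Build the JSON body for solr_query with a main query and one filter query."""
    return json.dumps({"q": q, "fq": fq})


def cql_search(table: str, q: str, fq: str) -> str:
    # Double any single quotes so the JSON can sit inside a CQL string literal.
    payload = solr_query_json(q, fq).replace("'", "''")
    return f"SELECT * FROM {table} WHERE solr_query = '{payload}'"


# Hypothetical table and fields: relevancy comes from q, in_stock only filters.
statement = cql_search("store.products", "description:tent", "in_stock:true")
print(statement)
```

Keeping in_stock:true in fq rather than in q means the same cached filter can serve every product search, whatever the search term is.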

Filter query and faceting are highly complementary features.  The fields that we might normally want to facet on are the same types of fields that we'd want to use in a filter query.

We can put them together by running a search, getting its facets, and then letting our users further refine their searches using facet values as the filter query.  Here's an example.

Imagine we search for movies by description and retrieve a list of results.  We also get facets back based on the films' ratings.

Given this additional context, our user realizes that they want to restrict their results to movies with a G rating.

We can then run a second search using the G rating as our filter query, retrieving those results and their corresponding facets.  We can continue to apply this technique until the user finds the results that they were seeking.  This is a really powerful combination of tools.
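
The drill-down loop can be sketched in plain Python.  This is an in-memory imitation rather than a real DSE Search call: the movie catalog, the substring match standing in for the description search, and the search_movies helper are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Movie = Dict[str, str]

# Hypothetical catalog used only for this illustration.
movies: List[Movie] = [
    {"title": "Homeward Trail", "description": "a lost dog crosses the country", "rating": "G"},
    {"title": "Night Kennel", "description": "a guard dog uncovers a crime ring", "rating": "R"},
    {"title": "Puppy Summer", "description": "a shy kid adopts a dog", "rating": "G"},
    {"title": "Dog Days", "description": "a dog walker falls in love", "rating": "PG"},
    {"title": "Cat Burglar", "description": "a thief plans one last job", "rating": "PG"},
]


def search_movies(term: str, rating: Optional[str] = None) -> Tuple[List[Movie], Dict[str, int]]:
    """Match on description (the q part), optionally filter by rating (the fq part)."""
    results = [
        m for m in movies
        if term in m["description"] and (rating is None or m["rating"] == rating)
    ]
    # Facet counts are computed over the results that survived the filter.
    facets = dict(Counter(m["rating"] for m in results))
    return results, facets


# First search: no filter, facets show how the results split by rating.
first_results, first_facets = search_movies("dog")
# Second search: the user picked the G facet, so it becomes the filter.
g_results, g_facets = search_movies("dog", rating="G")
print(first_facets)
print([m["title"] for m in g_results], g_facets)
```

Each refinement reuses the same search term and only adds a facet value as a filter, which is the kind of repeated, low-cardinality predicate that benefits from caching.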
