Joins

DSE Version: 6.0

Video

Transcript: 

This is Joe Chu again, and in this video we will be talking about joins.

As you probably know, the idea of a join is something that is expressly forbidden in the land of Cassandra. This is intentional by design, and has good justification as far as query performance is concerned. However in DSE, the addition of search indexes with Cassandra means that certain joins are now feasible.

There are limitations to what can be done though, as performance is still a priority. However, I'm happy to tell you that it is possible to join multiple search indexes together for searching. What this means is that you can search for certain terms across multiple search indexes, and then retrieve results from the base table.

Sounds a bit strange, doesn't it? Well I said joining search indexes, not joining tables, so you do have the ability to search, but unfortunately you are not able to return data from multiple tables.

Furthermore, the search indexes that are joining must share the same keyspace and partition key, again for performance reasons. This automatically sets up the join condition, and prevents a search index on one node from having to join with a search index on another node, which would be inefficient. You may think that having to have the same partition key is a bit restrictive as far as what you can do, but if you are following best practices when it comes Cassandra data modeling, you might find that you have quite a few tables with the same partition key that you can use for search index joins.

Well if all of this still appeals to you, let's see how it works.  Basically we have a special parser called join that will allow you to specify a search index to execute the search on. Data will only be retrieved from the table mentioned in the original search query.

In the example here we are looking for the actor name Leonardo DiCaprio in the search index killrvideo.actors_by_video, with title and release year data being returned from the videos table. The table definitions that we see at the top show that the actors_by_video table and the videos table have the same partition key, which is the column video_id.

It is also possible to do a recursive join, where you have separate search conditions for each search index that you are joining. This is done through a special field called underscore query underscore, which is used in a search query to use the results from additional search queries, kind of like an SQL SELECT IN query.

As you can see the syntax uses the AND operator and the magic field underscore query underscore so that only the results from the embedded query are used to find matches for the search query.

That embedded query should be another search query that uses the join parser to search in a separate search index. You can see then it is also possible to chain these searches again using AND and the the underscore query underscore field.​​​​​​​

If we expand this we have another search query.​​​​​​​

Note that here the last close parens and double quote is for our first recursive search query here​​​​​​​

The example here shows a recursive join query with two search queries searching for different actors in the actors_by_video search index, and another search query trying to match movies with a specific genre in the videos search index. The data returned in the search results would be from the videos table.

The final technique that we'll be showing here is a join that uses filter queries. This is similar to the recursive join that we've just seen, but is much cleaner as far as trying to read what the search query is doing.

Here we are doing the same recursive join that we've seen previously, but using filter queries to do the recursion. Both the query q and the filter query fq parameters can use the join parser, and you can see that we can as many filter queries as we want to set up the recursive join.

No write up.
No Exercises.
No FAQs.
No resources.
Comments are closed.