Branching Traversals

DSE Version: 6.0

Video

Unlike simple traversals, where there is a sequence of steps, branching traversals have nested traversals and are more like trees with branches. In this unit, you will be learning about branching traversals.

Transcript: 

Hi, I am Artem Chebotko and this is Branching Traversals.

Unlike simple traversals, where there is a sequence of steps, branching traversals have nested traversals and are more like trees with branches.

Different traversers may follow different branches of that tree and therefore paths that they travel could be drastically different.

Just like in this illustration where gremlins guided by a branching traversal went onto different branches.

Here are some steps that result in branching.

First, we have coalesce that evaluates the provided nested traversals in order and returns the first traversal with a non-empty result.

And then there are two versions of the choose step. The first one resembles a switch statement with options instead of cases. The second one is more like an if-then-else statement found in many programming languages.

“union” will evaluate all of the traversals we give it and merge their results and pass that on to the next step in a traversal.

“local” evaluates its internal traversal in the context of each individual object in a stream rather than in the context of the whole stream. It might be better to wait for an example to understand this step better.

“repeat” will repeatedly evaluate the internal traversal a certain number of times or until a condition is satisfied. We will not see an example of “repeat” in this presentation because it has its own show called Recursive Traversals. You can say that “repeat” is better known for repetition than for branching.

Here is our KillrVideo graph schema.

We have four types of vertices ... movies, users, genres and people. They can be related to each other by different types of relationships.

Users may know each other and can rate movies. Movies belong to genres. Movies can have actors, directors, composers, screenwriters, and cinematographers.

And of course we have various vertex properties and one edge property in this schema. User-defined vertex IDs are underlined. We have no meta-properties but we do have one multi-property … that is the production property.

Please take a moment to study this schema as it will help you to follow our traversal examples.

Here is an example when coalesce is used.

Remember, coalesce is going to give us the first traversal that return a non-empty result.

If we have a tak of finding documentary movies with Johnny Depp and if none exists, we can show his 2010 movies … that is where coalesce will help.

Looking at this example, we start with the Johnny Depp vertex and follow any incoming edges to get to his movies. At this point, we use coalesce with two traversals … one will check that a movie belongs to a genre with name Documentary and the other will check that the movie release year is 2010.

Based on the output of Alice in Wonderland 2010, we can conclude that there are no any documentary movies with Johnny Depp in our graph.

In this example, we are going to use the choose step to display information about a particular movie using a format that is specific to the type of information we may find.

We start with a movie whose ID is m267 and move to all adjacent vertices using step “both”. Then we use “choose” to make a decision about what kind of a computation to do depending in a vertex label.

If the label is “user”, we will return a value map, which is a map with all of the properties and their values.

If the label is “person”, we will return the name.

If the label is “genre”, we will return the name again.

Otherwise, if the label is none of the above, we will simple return the label itself.

We also use the sample step to limit our output to 3 sample elements from the stream.

We can see in the results, we get Johnny Depp - that is the actor associated with this movie …

We get Adventure that is the genre associated with this movie …

And finally we get a value map for one of the user who rated the film.

Let’s look at another choose example.

We want to find all users in our graph who are able to watch a PG-13 movie.

This choose step is using the if-then-else semantics.

The condition is whether an age of a user vertex is greater or equal to 13.

If true, we return a value map with all information about a user.

If false, we inject a constant literal that states “no children under 13” into the output stream.

The three sample results show what output you can expect from this traversal. We happened to stumble upon a user who is 44 and two users who are under 13.

Here is an example that uses “union”. Again, the “union” step will evaluate all of its internal traversals and merge their results.

This time we want to find names of all crew members and production companies for the Alice in Wonderland movie m267.

So we select that movie in the first line using the has step and then we get the union of two traversal results. The first one traverses outgoing edges with labels “director”, “composer”, “screenwriter”, and “cinematographer” and gets people names. The other one simply gets all production companies that are associated with the movie vertex via the production multi-property.

As a result, we get crew member names and production company names in the same output stream.

Note that when we union results of multiple traversals, those results do not have to be of similar types. It is ok to mix integers, dates, string literals, and even graph elements like vertices and edges. In other words, if you heard about the notion of union-compatibility in relational databases, we can say that all traversals are union-compatible in Gremlin.

This example should help us understand what the “local” step is about.

We want to compute an average rating for every movie in our graph.

We begin by selecting all movie vertices in the first line and then use “local” with a nested traversal that goes to incoming rated-edges, gets rating values and calculates their mean or average.

If we have 100 movies in our graph, we will get 100 numbers in the output, which is what we expect anyway.

So what is the purpose of “local” here? If we do not use “local” and simply do the inE().values().mean outside of “local”, we will always get a single number which will represent an average of all ratings across all movies. That number is usually not very useful.

“local” ensures that its internal traversal is executed independently for each movie-vertex in the input stream and therefore we get a separate average number of each movie.

We just used this simple aggregation step called “mean” to calculate rating averages. It may be a good time to take a quick look at other simple aggregation steps, such as count, sum, min and max.

What these steps do is usually universally understood, however they do have an optional parameter “scope” that may be less intuitive.

The scope can be global or local with global being the default. When an aggregation step is evaluation with the global scope, it computes an aggregate based on all values in the input traversal stream. If the scope is local, an aggregate is computed separately for each object in the input traversal stream. Of course, it only makes sense to use the local scope when objects in the stream are some kind of collections like lists or sets.

You can see in the example below, how a scope can affect traversal results for the two Alice in Wonderland movies we have in the graph.

Also, notice that the local step is of course related to the local scope. The step is much more expressive as it allows executing arbitrarily complex traversals as if they had a local scope.

We will look at these and other aggregation steps again we learn about Statistical Traversals. But for now, you know how to do simple aggregation.

How about we use more than one branching step in the same traversal?

Yes! That is what you will do in this exercise!

No write up.
No Exercises.
No FAQs.
No resources.
Comments are closed.