Graph Traversal Steps

DSE Version: 6.0

Video

In this unit, you will learn about graph traveral steps. Gremlin traversal steps and step modulators are defined by interface GraphTraversal. There are around 100 methods with distinct names in this interface. We will be showing you a few things to get you started.

Transcript: 

Hi, I am Artem Chebotko and this is Graph Traversal Steps.

Gremlin traversal steps and step modulators are defined by interface GraphTraversal. There are around 100 methods with distinct names in this interface. The majority of them are graph traversal steps (~80). So this is a rich interface with a lot of functionality. We are just going to be scratching the surface to get you started.

Lambda steps are the foundational constructs of the Gremlin language. They are the most general steps that can be used to implement most other steps. However they are not the easiest traversal steps to use and usually the least efficient because they are hard to optimize. As a result, lambda steps are disabled by default in DataStax Enterprise Graph. We will still take a closer look at these steps mostly for educational purposes.

Derived steps, on the other hand, are useful in practice. Most traversals will typically rely on around 10 most commonly used steps, so learning all of them is not necessary to start using Gremlin.

There are also a few steps that do not fall under the lambda/derived classification.

There are step modulators that are used to manipulate or pass additional parameters to a traversal step.

Finally, there are predicates that are used to relate values and objects.

It is unlikely that you will need to use lambda steps for any of your real-life traversals, but we will still look at them for educational purposes and see why derived steps are so much better.

The first lambda step we look at is “map”. It maps the traverser from the current object of type S to some object of type E according to some function passed as a parameter. In other words, “map” takes one object from a stream and turns it into some other object of the same or different type.

Let’s look at an example and we will see it is not really a hard thing to understand. We are going to use “map” to find a title of a movie. We begin by getting to a vertex that has movieId m267. Then we use the map step and pass a lambda (or anonymous function) to it as a parameter. To understand this Groovy syntax, it refers to the traverser and get() returns the current object of type S (vertex in our case). The lambda returns the title, which is our object of type E.

Basically, we did not do anything complex here. We maped a movie-vertex to the value of its title-property. The result is “Alice in Wonderland”.

While map is a one-to-one mapping, flatMap is a one-to-many mapping.

In this example, we want to retrieve actors associated with a particular movie.

As before, we begin by getting to a vertex that has movieId m267.

We then use flatMap to hop from the movie vertex to all adjacent vertices reachable via outgoing "actor" edges.

We further map vertices to values of their name-properties because we want to see actor names in the output rather than vertices themselves. And we get Johnny Depp, Anne Hathaway, and some others.

Lambda step filter can be used to decide whether to pass a traverser onto the next step in a traversal or not based on some predicate.

In this example, we are still looking for names of actors associated with a particular movie.

But we are using filter to eliminate all vertices except one with label “movie” and movieId

m267, which is specified in somewhat complex logical expression for the filter step.

We still get the same result as in the previous traversal with actors being Johnny Depp, Anne Hathaway, and a few others.

Now, this sideEffect step does not affect a traverser itself. Instead it does some external thing; it creates a side effect like maybe printing a line of text to the screen or sending a message to an external system.

So in this example, we do some printing of text “The actors of Alice in Wonderland” as a side effect. That literal does not actually appear in the output stream of the traversal where you find actor names.

An example of more useful side effects would be traversing a graph and extracting its subgraph as a separate data structure. We will see such an example when we study subgraph traversals.

The last lambda step is branch.

It splits and sends the traverser onto multiple branches to perform alternative traversals.

This simple example demonstrates how to use branch to return Johnny Depp’s id, while returning names of all the other actors.

As before, we start by getting one specific movie vertex and traversing outgoing actor-edges to get to person-vertices.

And then we do something different. Rather than simply getting names of the actors, we go on a branch. The branch step defines that for those vertices which have property called “name” whose value is equal to Johnny Depp, map them onto their personIds. Otherwise, map them to their names.

So if you visit the Johnny Depp actor vertex, you will output ID. If you visit any other vertex, you will output name.

And you can observe that result in the sample output.

We do have a separate presentation about branching traversals where we will explore many usefu l and  practical examples.

Ok, I get it. Lambda steps are pretty hard. The good news is you will never need to use them in practice; they are more of a theoretical foundation for Gremlin.

And even if you feel comfortable using lambda steps, please don’t! The optimizer in DSE Graph will not be able to analyse and interpret the code inside of lambdas which will limit possible execution plan optimizations and will result in poor performance.

What you will use are the derived steps.

Take a look at how our traversal with lambda steps is rewritten using derived steps. It is so much simpler to write and understand, and more efficient, too.

filter was replaced with has. flatMap was replaced with out. And branch was replaced with choose. The result is the same however.

For more information about individual graph traversal steps, you can always refer to the TinkerPop documentation. But this is not always the best way to learn. The approach that we take is studying different types of traversals and their respective, commonly used steps. In this training, we are going to learn by example!

No write up.
No Exercises.
No FAQs.
No resources.
Comments are closed.