Domain-Specific Traversals

DSE Version: 6.0


It is a good idea to tailor the language you use to talk about graph traversals to your specific domain. In this unit, you will learn about domain-specific traversals.



Hey everyone, I am Denise Gosnell, and this is domain-specific traversals.

Now, as you can see by looking at the Gremlin language, it is a powerful, expressive, general-purpose language for defining graph traversals. You can do a lot with it. It is technically a Turing-complete language.

Usually, the graphs we deal with inhabit a particular domain. All of our examples deal with film. The graphs you build will live in the domain of the business that you serve.

It will be a really good idea to customize the language you use to talk about graph traversals to that domain.

This is a topic that shows up in many places other than just graphs. Domain-specific languages (DSLs) are often a good idea. Gremlin is extensible, which allows us to build our own DSLs on top of it to describe traversals in a much more compact and expressive way.

Now, I am not going to show you how to implement a Gremlin DSL. That is a slightly bigger topic. But here, we are going to introduce the idea and show you the kinds of things you will do with domain-specific traversals. And, we hope that you see the value in DSLs and see that they are a useful tool to implement and worth pursuing.

Here is the hypothetical DSL we are going to define. This maps onto the kinds of traversals that we have done in our other examples looking at the other parts of the Gremlin query language. We might define a step that looks for all of the vertices labeled “movie”, labels them “m”, and then returns their values at the end. In the middle, there are other kinds of steps that we can put in to do other things. Those can be steps like “with actor”, where we look for the movies with an actor with a particular name, or “with director”, or “in category”, where we limit the movie vertices to only those that belong to a particular genre. Or, we could constrain to a higher rating, as seen with the WithHigherRating step.

Now, as you can see here in the slide, each one of those steps translates to a certain amount of native Gremlin code. We would rather not look at that native Gremlin code if we do not have to, though. We would rather speak in terms of the domain. That ends up creating queries that are so much easier to read and so much more expressive.
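To make that idea concrete, here is a minimal sketch in plain Python. It is not Gremlin, and it is not how a Gremlin DSL is actually built. It only illustrates that each domain step is a readable name wrapped around lower-level filtering code. The Movies class, its method names, and all of the data (including the ratings) are hypothetical placeholders.

```python
# Toy stand-in for a graph: each movie "vertex" with its edges flattened
# into plain fields. All values are illustrative placeholders, not data
# from the course's graph.
MOVIES = [
    {"title": "Forrest Gump", "actors": {"Tom Hanks", "Gary Sinise"},
     "directors": {"Robert Zemeckis"}, "genres": {"Drama"}, "ratings": [9.0, 8.6]},
    {"title": "The Green Mile", "actors": {"Tom Hanks"},
     "directors": {"Frank Darabont"}, "genres": {"Drama"}, "ratings": [8.5, 8.7]},
    {"title": "Toy Story", "actors": {"Tom Hanks"},
     "directors": {"John Lasseter"}, "genres": {"Animation"}, "ratings": [8.3]},
    {"title": "The Terminal", "actors": {"Tom Hanks"},
     "directors": {"Steven Spielberg"}, "genres": {"Drama"}, "ratings": [7.0, 7.2]},
]


class Movies:
    """Hypothetical DSL: every step narrows the current set of movies."""

    def __init__(self, movies=MOVIES):
        self._movies = list(movies)

    def _keep(self, predicate):
        # The "native" code hidden behind every domain step is a filter.
        return Movies([m for m in self._movies if predicate(m)])

    def with_actor(self, name):
        return self._keep(lambda m: name in m["actors"])

    def with_director(self, name):
        return self._keep(lambda m: name in m["directors"])

    def in_category(self, genre):
        return self._keep(lambda m: genre in m["genres"])

    def with_higher_rating(self, threshold):
        # Keep movies whose average rating is above the threshold.
        return self._keep(
            lambda m: sum(m["ratings"]) / len(m["ratings"]) > threshold)

    def titles(self):
        return sorted(m["title"] for m in self._movies)


result = (Movies()
          .with_actor("Tom Hanks")
          .in_category("Drama")
          .with_higher_rating(7.5)
          .titles())
print(result)
```

The chained call at the bottom reads in terms of the domain, while the filtering details stay inside the class.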

Now this isn’t a full example of how to implement a DSL, but instead shows us what a DSL would allow us to do.

Let’s look at an example.

So, at the beginning, we say that we want movies with actor Tom Hanks. That reads almost like English.

That is much more readable than the equivalent Gremlin, which you see in the next step. There, you look for vertices with the movie label and label them as m. You walk out the actor edges to the person named Tom Hanks. Then, you go back, select m, and get the title values.

It is much easier to say and read “movies with actor Tom Hanks”.

Lastly, you can see here that you can say something like: “movies with actor Tom Hanks in the genre category of drama”.

The first version of this query is pretty readable and communicates the traversal very well.

The subsequent Gremlin shows how to write the full query. And, if you have studied the basics of Gremlin, you might not find the second example to be too bad. But, if you are not a practiced Gremlin developer, you would probably rather write the top line instead of the second example.

We will keep adding to this query to keep showing you how great the DSL can be.

The first query looks for movies with Tom Hanks in the drama genre, and then we say that we only want those with a rating higher than 7.5.

Compare the use of the DSL in the first example to the full query below.

In the second query, we label all of the movie vertices as m. In the second line, we walk out the actor edges to those with the name Tom Hanks. Then, we select m to get back to all of the movie vertices. Next, we walk out the belongs edges to the drama genre. Again, we select all of the movie vertices so that we keep a movie in the traversal stream only if its average rating is greater than 7.5. Finally, we select m and look at the titles of the movies that satisfy all of these constraints, and we see Forrest Gump and The Green Mile.
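The label-and-select pattern described there can be imitated in a few lines of plain Python. This is only a rough sketch of the idea, not Gremlin itself: the helper functions out, has, as_, and select are hypothetical stand-ins for the steps with similar names, the edge labels follow the narration, and the avg_rating values are invented placeholders stored directly on the movie vertices for simplicity.

```python
# Vertices: id -> (label, properties). Edges: (from_id, edge_label, to_id).
# Toy data only; avg_rating values are invented placeholders.
VERTICES = {
    1: ("movie", {"title": "Forrest Gump", "avg_rating": 8.8}),
    2: ("movie", {"title": "The Green Mile", "avg_rating": 8.6}),
    3: ("movie", {"title": "Toy Story", "avg_rating": 8.3}),
    4: ("movie", {"title": "The Terminal", "avg_rating": 7.0}),
    10: ("person", {"name": "Tom Hanks"}),
    20: ("genre", {"name": "Drama"}),
    21: ("genre", {"name": "Animation"}),
}
EDGES = [
    (1, "actor", 10), (2, "actor", 10), (3, "actor", 10), (4, "actor", 10),
    (1, "belongs", 20), (2, "belongs", 20), (3, "belongs", 21), (4, "belongs", 20),
]

# A traverser is a pair: (current vertex id, {step label: remembered vertex id}).


def as_(traversers, name):
    # Remember the current vertex under a label, like labeling movies as m.
    return [(v, {**labels, name: v}) for v, labels in traversers]


def out(traversers, edge_label):
    # Walk outgoing edges with the given label to the adjacent vertices.
    return [(dst, labels) for v, labels in traversers
            for src, lbl, dst in EDGES if src == v and lbl == edge_label]


def has(traversers, key, predicate):
    # Keep traversers whose current vertex has a matching property value.
    return [(v, labels) for v, labels in traversers
            if key in VERTICES[v][1] and predicate(VERTICES[v][1][key])]


def select(traversers, name):
    # Jump back to the vertex remembered under the label.
    return [(labels[name], labels) for _, labels in traversers]


t = [(v, {}) for v, (label, _) in VERTICES.items() if label == "movie"]
t = as_(t, "m")                                              # label movies as m
t = has(out(t, "actor"), "name", lambda n: n == "Tom Hanks")  # actor edges
t = select(t, "m")                                           # back to movies
t = has(out(t, "belongs"), "name", lambda n: n == "Drama")    # belongs edges
t = select(t, "m")                                           # back to movies
t = has(t, "avg_rating", lambda r: r > 7.5)                   # rating filter
titles = sorted(VERTICES[v][1]["title"] for v, _ in t)
print(titles)
```

Every hop away from the movie vertices needs a matching select to get back, which is why the low-level version grows with each constraint.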

Finally, we will put together the other steps we laid out in our hypothetical DSL. We will look for movies with actor Tom Hanks and the actor Gary Sinise. Then, we also want to look for those movies with director Robert Zemeckis in the drama category, with a rating higher than 7.5. And this ends up being Forrest Gump. You can see the equivalent traversal on the bottom part of the code. And again, it isn’t using any sophisticated constructs, but it does end up being a lot of code. We would rather look at the simpler thing.
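One way to picture why these steps combine so freely is to treat each one as a small, independent constraint. The sketch below is plain Python rather than Gremlin and makes that assumption explicit: the functions with_actor, with_director, in_category, with_higher_rating, and movies are hypothetical names, and the cast lists and ratings are illustrative placeholders.

```python
# Toy movie records; casts are partial and avg_rating values are invented.
MOVIES = [
    {"title": "Forrest Gump", "actors": {"Tom Hanks", "Gary Sinise"},
     "directors": {"Robert Zemeckis"}, "genres": {"Drama"}, "avg_rating": 8.8},
    {"title": "Apollo 13", "actors": {"Tom Hanks", "Gary Sinise"},
     "directors": {"Ron Howard"}, "genres": {"Drama"}, "avg_rating": 7.7},
    {"title": "Cast Away", "actors": {"Tom Hanks"},
     "directors": {"Robert Zemeckis"}, "genres": {"Drama"}, "avg_rating": 7.8},
    {"title": "The Green Mile", "actors": {"Tom Hanks", "Gary Sinise"},
     "directors": {"Frank Darabont"}, "genres": {"Drama"}, "avg_rating": 8.6},
]


# Each hypothetical step returns a predicate over a single movie record.
def with_actor(name):
    return lambda m: name in m["actors"]


def with_director(name):
    return lambda m: name in m["directors"]


def in_category(genre):
    return lambda m: genre in m["genres"]


def with_higher_rating(threshold):
    return lambda m: m["avg_rating"] > threshold


def movies(*constraints):
    # A movie is kept only if it satisfies every constraint that was passed.
    return sorted(m["title"] for m in MOVIES
                  if all(check(m) for check in constraints))


result = movies(
    with_actor("Tom Hanks"),
    with_actor("Gary Sinise"),
    with_director("Robert Zemeckis"),
    in_category("Drama"),
    with_higher_rating(7.5),
)
print(result)
```

Because each constraint is independent, adding or removing one does not change how the others are written.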

And that is really what DSLs are all about. They don’t make the impossible possible. Everything they do is still done in a lower-level language that is more flexible and more powerful. But, for the common operations, the DSL is more usable and more expressive.

Now, domain experts who do not know Gremlin will be able to read a well-crafted DSL. And, if you have designed the DSL well, everyone who has to build code with the DSL will be a little bit more productive.

It is also easier to validate code written in the DSL. Since the language is domain specific, it will be easier to see that it is expressing the thing that we intend for it to express.

So, as your traversals get more complex and you begin to see the patterns emerge in your domain, this is a topic worth exploring further.
