Introducing DataStax Enterprise Graph

DSE Version: 6.0


In this unit, you will be introduced to DataStax Enterprise Graph. DataStax Enterprise Graph is a real-time, secure, highly available, efficient, scalable, and analytics-ready graph system. You'll learn more about it in this course.


Hi. My name is Jonathan Lacefield and I’m a member of DataStax’s Product Management team.

Today we’re going to review DSE Graph.

DataStax Enterprise Graph is a real-time, secure, highly available, efficient, scalable, and analytics-ready graph database system.

It is built using open-source technologies, such as Apache TinkerPop™, Apache Cassandra™, Apache Solr™, and Apache Spark™, that are fully integrated into a seamless graph data management system.

Here you can see a simplified, high-level design of DSE Graph.

The front end is what an application or user of DSE Graph interacts with. It includes Apache TinkerPop™'s data model (property graph) and traversal language (Gremlin), as well as Spark SQL, an extension of Spark’s DataFrame API, and a proprietary DSE Graph API.

The back end consists of Apache Cassandra™, Apache Solr™, and Apache Spark™. We essentially project a graph model onto DataStax’s core data engine.

The middle layer is the translation and integration layer that maps schema, data, and queries between the front and back ends.

DSE Graph is an Apache TinkerPop™-enabled database. Apache TinkerPop™ is a graph computing framework supported by DSE Graph and many other graph database systems. Apache TinkerPop™ sets standards for structure (the property graph) and process (the Gremlin traversal).

A property graph is made up of three types of elements.

Vertices, which represent domain entities, that is, “things.”

Edges, which represent relationships between entities.

And properties, which represent attributes of entities and relationships.

This slide illustrates a sample property graph. Here you can see a person vertex with properties personId and name and an edge named actor between the person vertex and movie vertex.
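To make the three element types concrete, here is a minimal in-memory sketch of the sample graph in plain Python. It is not the DSE Graph or TinkerPop API. The class names, the property values, the movie properties (movieId, title, year), and the direction of the actor edge (from movie to person) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """A domain entity ("thing") with a label and key/value properties."""
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A labeled relationship between two vertices; edges can carry properties too."""
    label: str
    out_v: Vertex  # vertex the edge leaves
    in_v: Vertex   # vertex the edge enters
    properties: dict = field(default_factory=dict)


# Sample data is illustrative; only personId and name are named in the talk.
person = Vertex("person", {"personId": "p1", "name": "Johnny Depp"})
movie = Vertex("movie", {"movieId": "m1", "title": "Alice in Wonderland", "year": 2010})

# Assumption: the actor edge points from the movie to the person.
actor = Edge("actor", out_v=movie, in_v=person)

print(actor.out_v.properties["title"], "-[actor]->", actor.in_v.properties["name"])
```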

Gremlin is a very powerful graph traversal language that can be used to express both OLTP and OLAP processes.

This means that you can write a Gremlin traversal that touches a specific vertex and its direct neighbors in a graph and expect millisecond-level responsiveness, as you would from other OLTP databases.

And, you can also write a Gremlin traversal that traverses an entire graph or subgraph with OLAP processing.

Gremlin is very flexible.

Gremlin can be very straightforward to use.

This example finds titles and years of movies released in 2010 or later with Johnny Depp as an actor. This is a very efficient, real-time (OLTP) traversal around a specific vertex representing Johnny Depp.
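The slide itself is not reproduced in this transcript, so what follows is only a sketch of the same logic in plain Python over a tiny in-memory graph. The likely Gremlin form appears in a comment. The vertex labels, property keys, edge direction, and sample movies are assumptions, and `movies_since` is a hypothetical helper, not part of any DSE or TinkerPop API.

```python
# Assumed Gremlin form of the traversal (labels, keys, and edge direction
# are assumptions, not taken from the slide):
#   g.V().has("person", "name", "Johnny Depp")
#        .in("actor")
#        .has("year", gte(2010))
#        .values("title", "year")

# Illustrative sample data.
vertices = {
    "p1": {"label": "person", "name": "Johnny Depp"},
    "m1": {"label": "movie", "title": "Alice in Wonderland", "year": 2010},
    "m2": {"label": "movie", "title": "Edward Scissorhands", "year": 1990},
    "m3": {"label": "movie", "title": "Rango", "year": 2011},
}

# Each edge is (out_vertex_id, edge_label, in_vertex_id).
edges = [
    ("m1", "actor", "p1"),
    ("m2", "actor", "p1"),
    ("m3", "actor", "p1"),
]


def movies_since(name, min_year):
    """Return sorted (title, year) pairs for movies with the named actor."""
    # Step 1: find the starting vertex (the person with this name).
    starts = {
        vid for vid, v in vertices.items()
        if v["label"] == "person" and v.get("name") == name
    }
    # Step 2: follow incoming actor edges, then filter movies by year.
    results = []
    for out_v, label, in_v in edges:
        if label == "actor" and in_v in starts:
            movie = vertices[out_v]
            if movie["year"] >= min_year:
                results.append((movie["title"], movie["year"]))
    return sorted(results)


print(movies_since("Johnny Depp", 2010))
# [('Alice in Wonderland', 2010), ('Rango', 2011)]
```

The work stays local to one starting vertex and its direct neighbors, which is what makes this kind of traversal a good fit for OLTP.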

In addition to Gremlin, DSE Graph users also have the ability to use Spark SQL and other standard Spark APIs through DSE GraphFrames, DataStax’s proprietary implementation of the Spark DataFrame object.

DSE Graph projects a property graph data model based on something called index-free adjacency lists onto DataStax’s version of Apache Cassandra™. Apache Cassandra is the foundation of the entire DataStax Enterprise product and provides DSE Graph with the always-on, responsive, and distributed capabilities that are familiar to DSE users.
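As a rough illustration of the adjacency list idea, the sketch below keeps each vertex's edges together under that vertex's id, so finding a vertex's neighbors is a single keyed lookup rather than a scan of every edge. This is a conceptual sketch in plain Python only. It does not show DSE Graph's actual storage layout, and the id format, `add_edge`, and `neighbors` are hypothetical names.

```python
from collections import defaultdict

# vertex id -> list of (edge_label, direction, other_vertex_id)
adjacency = defaultdict(list)


def add_edge(out_v, label, in_v):
    """Record the edge on both endpoints so either side can reach the other."""
    adjacency[out_v].append((label, "OUT", in_v))
    adjacency[in_v].append((label, "IN", out_v))


def neighbors(vertex_id, label, direction):
    """Look up one vertex's list and keep the matching neighbors."""
    return [
        other
        for (edge_label, edge_dir, other) in adjacency.get(vertex_id, [])
        if edge_label == label and edge_dir == direction
    ]


# Assumption: actor edges point from movie to person.
add_edge("movie:m1", "actor", "person:p1")
add_edge("movie:m3", "actor", "person:p1")

print(neighbors("person:p1", "actor", "IN"))
# ['movie:m1', 'movie:m3']
```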

DSE Graph leverages DSE Search, based on Apache Solr, as an integrated search and indexing engine. Most graph problems have a heavy search need because users need to find the spot in a graph to begin traversing. DSE Graph uses DSE Search for this purpose.

Not to be left out, DSE Analytics provides DSE Graph with a great analytics processing engine. Graph data can be aggregated and analyzed through Gremlin or one of the Spark native APIs. DSE Analytics also provides mechanisms for users to pin graphs into memory for analytical processing purposes.

As mentioned above, DSE Graph projects a property graph model onto the integrated DSE engine. This magic happens in what we call the middle layer of DSE Graph. This is the proprietary glue layer of DSE Graph where we optimize the path for reads and writes in DSE for graph workloads.

When we work with DSE Graph users, we typically are working with one of these listed use cases. Some of the design patterns represented in these use cases can be applied to other domains, like how a customer 360 system’s design pattern can be applied to a product 360 or even GDPR system.

There are a lot of natural graph problems in the world. We’ve learned over the years that graph problems require more than just a graph database. We work with users on a daily basis who are moving off of native graph database systems and onto DSE Graph because they have realized that forcing a graph database onto a graph problem isn’t always the best approach. Most graph problems need more than just a graph database. They need search and analytics capabilities, and they need a system that is resilient and able to process hundreds of thousands or more writes per second. They need a data management system that provides the right tool for the job. This is DSE Graph: a graph database that provides users with the flexibility to tackle large graph problems in real time.
