## Property Graph Data Model

DSE Version: 6.0

In this unit, you will learn about managing property graphs in DataStax Enterprise Graph. The material is split into two sections: the first explains property graph data models, and the second explains operations on the property graph data model. This unit covers the data model.

Transcript:

Hi. Welcome back.

As a reminder, my name is Jonathan Lacefield and I’m a member of DataStax’s Product Management team.

Now that you've had an overview of DSE Graph, we're going to take a look at Managing Property Graphs.

We'll do this in two main sections of content: the first explains Property Graph Data Models, and the second explains operations on the Property Graph Data Model.

Mathematically, we can define a property graph as a directed, binary, attributed multi-graph. It consists of a set of vertices, a multi-set of directed, binary edges, and key-value pairs or properties associated with vertices and edges.

For those who don't come from a mathematical background, Property Graph Data Models represent graphs using three elements: vertices (things), edges (relationships), and properties (attributes on either vertices or edges).

Each vertex in a property graph has a unique id that is specified by the graph creator. In DSE Graph, this id type is called a User Defined Vertex ID. Each vertex also has a label that generally denotes the type of entity the vertex represents.

In this illustration, we have 7 vertices with labels movie, person, user, and genre. All of the vertices have unique ids, such as m267 or p4361. It is convenient to think of ids as simple strings in this example but, in reality, vertex ids in DSE Graph are composed of multiple components that are used not only for uniqueness but also for graph partitioning. We will revisit ids when we talk about custom ids.

We'll talk more in a subsequent section of this course about how to use ids to achieve data locality in DSE Graph.

Each (binary) edge connects exactly two vertices. Edges are always directed, have automatically assigned unique identifiers and have labels. Since a property graph is a multi-graph, it is possible to have multiple edges connecting the same two vertices; such parallel edges may have the same direction and labels but will have different ids.
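To make the multi-graph idea concrete, here is a minimal in-memory sketch in Python. The `MultiGraph` class and its `add_edge` method are hypothetical names made up for this illustration; they are not DSE Graph's API, and real edge ids in DSE Graph will not look like these small integers.

```python
from itertools import count


class MultiGraph:
    """Toy directed multi-graph (illustration only, not DSE Graph's storage format)."""

    def __init__(self):
        self._next_id = count(1)  # source of automatically assigned edge ids
        self.edges = {}           # edge id -> (label, out vertex id, in vertex id)

    def add_edge(self, label, out_v, in_v):
        """Store a directed edge out_v -> in_v and return its new id."""
        edge_id = next(self._next_id)
        self.edges[edge_id] = (label, out_v, in_v)
        return edge_id


g = MultiGraph()
first = g.add_edge("rated", "u185", "m267")
# A parallel edge: same endpoints, same direction, same label.
second = g.add_edge("rated", "u185", "m267")
```

Both calls describe the same relationship, yet the graph keeps two separate edges because each one receives its own id.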

In this illustration, we connected our vertices with labeled edges. Edges and their labels usually denote relationships between entities represented by vertices. It is getting more interesting! We can now read that some user u185 rated movie m267 that has some actor p4361.
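As a rough sketch, the part of the illustration we just read aloud can be written down with plain Python data classes. The `Vertex` and `Edge` classes and the `describe` helper are assumptions made for this example, not DSE Graph types, and the edge labels `rated` and `actor` are inferred from the sentence above. Edge ids are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vertex:
    id: str     # unique id specified by the graph creator
    label: str  # type of entity the vertex represents


@dataclass(frozen=True)
class Edge:
    label: str  # type of relationship
    out_v: str  # id of the vertex the edge leaves
    in_v: str   # id of the vertex the edge points to


vertices = {
    v.id: v
    for v in (Vertex("u185", "user"), Vertex("m267", "movie"), Vertex("p4361", "person"))
}
edges = [
    Edge("rated", out_v="u185", in_v="m267"),
    Edge("actor", out_v="m267", in_v="p4361"),
]


def describe(edge):
    """Render one edge as '<label> <id> -<edge label>-> <label> <id>'."""
    src, dst = vertices[edge.out_v], vertices[edge.in_v]
    return f"{src.label} {src.id} -{edge.label}-> {dst.label} {dst.id}"


sentences = [describe(e) for e in edges]
```

Each entry in `sentences` reads like the spoken description: a labeled vertex, a labeled edge, and another labeled vertex.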

Though edges are created with direction, DSE Graph automatically stores both directions, in and out, of an edge to ensure users have the flexibility of traversing an edge in any direction.
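One way to picture this is an index that records every edge twice, once under each endpoint. The sketch below is only an in-memory analogy under that assumption; the names `out_index`, `in_index`, and `add_edge` are hypothetical, and nothing here describes how DSE Graph lays out data internally.

```python
from collections import defaultdict

# vertex id -> [(edge label, neighbor id)] for edges leaving the vertex
out_index = defaultdict(list)
# vertex id -> [(edge label, neighbor id)] for edges entering the vertex
in_index = defaultdict(list)


def add_edge(label, out_v, in_v):
    """Record the edge once per direction so either endpoint can find it."""
    out_index[out_v].append((label, in_v))
    in_index[in_v].append((label, out_v))


add_edge("rated", "u185", "m267")
add_edge("actor", "m267", "p4361")

# Follow the edge in its own direction: which movies did u185 rate?
movies_rated_by_user = [v for label, v in out_index["u185"] if label == "rated"]
# Follow the same edge against its direction: who rated m267?
users_who_rated_movie = [v for label, v in in_index["m267"] if label == "rated"]
```

Because both indexes are maintained at write time, a traversal can start from either end of the edge without scanning the whole edge set.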

Properties are key/value pairs that can be associated with any vertex or edge in a property graph. They describe individual vertices (entities) and edges (relationships).

Our sample graph now contains a lot of useful information. Let's take a look. We can see properties on both the vertices and edges.

A vertex property that can have multiple values is called a multi-property.

In this example, we have production and budget as multi-properties and show their values in square brackets. Another way to think about them is as multiple key-value pairs with the same key (e.g., budget: \$150M and budget: \$200M). Note that multi-properties are only applicable to vertices and not to edges.
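The two ways of thinking about a multi-property can be sketched in Python as follows. The `Vertex` class and `add_property` method are hypothetical names for this illustration, and attaching the budget values to vertex m267 is an assumption made so the example has something to hold them.

```python
from collections import defaultdict


class Vertex:
    """Toy vertex whose properties may hold several values per key."""

    def __init__(self, vertex_id, label):
        self.id = vertex_id
        self.label = label
        self.properties = defaultdict(list)  # key -> list of values

    def add_property(self, key, value):
        """Append a value instead of overwriting the previous one."""
        self.properties[key].append(value)


movie = Vertex("m267", "movie")  # assumed owner of the budget values
movie.add_property("budget", "$150M")
movie.add_property("budget", "$200M")

# View 1: one key with a list of values (the square-bracket notation).
budgets = movie.properties["budget"]

# View 2: several key-value pairs that share the same key.
pairs = [(key, value) for key, values in movie.properties.items() for value in values]
```

Both views hold the same information; the first is closer to how the illustration draws it, the second to the key-value wording.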

A property that can be assigned to a vertex property is called a meta-property. Meta-properties are usually used for describing access control, provenance, and audit information for individual vertex properties.

In this example, we have meta-properties describing provenance of two movie budget estimates, as well as meta-properties describing access control permissions for the person properties.

Note that meta-properties are only applicable to vertex properties and not to edge properties. A meta-property cannot be a multi-property or have meta-properties assigned to it.
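These rules can be sketched with a small Python data class. `VertexProperty`, its `meta` field, and the `source` key with its placeholder values are all assumptions made for illustration; the actual provenance values from the illustration are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, Union

Scalar = Union[str, int, float, bool]


@dataclass
class VertexProperty:
    key: str
    value: Scalar
    # Meta-properties map a key to a single scalar value. Scalars cannot
    # carry further properties, so a meta-property has no meta-properties
    # of its own, and a dict key holds one value, so it is not a multi-property.
    meta: Dict[str, Scalar] = field(default_factory=dict)


# Two budget estimates (a multi-property), each with its own provenance.
# The "source" key and its values are hypothetical placeholders.
budget = [
    VertexProperty("budget", "$150M", meta={"source": "estimate A"}),
    VertexProperty("budget", "$200M", meta={"source": "estimate B"}),
]

sources = {p.value: p.meta["source"] for p in budget}
```

The nesting stops at one level by construction: vertex, then vertex property, then meta-property.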

Here we see all of these concepts combined to create a robust data model that represents a real world graph.

Graph databases are based on graph mathematics.  Though not required, it is helpful to know the meaning of a few graph theory terms.

Incident edge: An edge is called incident to a vertex when this vertex is one of the edge endpoints.

Incoming and outgoing edges: An edge directed into a vertex is incoming to that vertex; an edge directed out of a vertex is outgoing from it.

Vertex degree, in-degree, and out-degree: the number of incident, incoming, and outgoing edges of a vertex, respectively; degree = in-degree + out-degree.

Adjacent vertex: A vertex a is called adjacent to a vertex v when there exists an edge between the two vertices.
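The terms so far can be checked on a small fragment of our sample graph. This is a minimal sketch in Python: the function names are made up for this example, and the edge list contains only the two edges read out earlier, not the full illustration.

```python
# Directed edges as (out vertex, in vertex).
edges = [("u185", "m267"), ("m267", "p4361")]


def incoming(v):
    """Edges directed into v."""
    return [e for e in edges if e[1] == v]


def outgoing(v):
    """Edges directed out of v."""
    return [e for e in edges if e[0] == v]


def degree(v):
    """degree = in-degree + out-degree."""
    return len(incoming(v)) + len(outgoing(v))


def adjacent(v):
    """Vertices that share an edge with v, regardless of direction."""
    return {a for e in edges if v in e for a in e if a != v}


in_degree = len(incoming("m267"))
out_degree = len(outgoing("m267"))
```

In this fragment, m267 has one incoming and one outgoing edge, so its degree is 2 and its adjacent vertices are u185 and p4361.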

Path: A sequence of alternating vertices and edges, beginning and ending with vertices, where each edge’s endpoints are the preceding and following vertices in the sequence.

Simple, cyclic, and shortest paths: a simple path does not repeat any edge or vertex; a cyclic path allows repetition and may therefore contain cycles; a shortest path is a path between two vertices with the fewest edges.
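Here is a small Python sketch of these path notions. As a simplification, a path is written as a sequence of vertex ids only, which is unambiguous here because the fragment has no parallel edges. The function names are hypothetical, the shortest path is found with a standard breadth-first search, and edges are followed in either direction.

```python
from collections import deque

# Directed edges as (out vertex, in vertex).
edges = [("u185", "m267"), ("m267", "p4361")]


def neighbors(v):
    """Vertices reachable from v over one edge, in either direction."""
    return [b if a == v else a for a, b in edges if v in (a, b)]


def is_simple(path):
    """A path is simple when no vertex (and hence no edge) repeats."""
    return len(path) == len(set(path))


def shortest_path(start, goal):
    """Breadth-first search: a path with the fewest edges, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for n in neighbors(path[-1]):
            if n not in seen:
                seen.add(n)
                queue.append(path + [n])
    return None


path = shortest_path("u185", "p4361")
```

Breadth-first search explores paths in order of length, so the first path that reaches the goal has the fewest edges.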

Subgraph: A subgraph S of a graph G is a graph whose vertices are a subset of the vertex set of G, and whose edges are a subset of the edge set of G.
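The subgraph definition translates almost directly into set operations. The sketch below is an illustration in Python with a hypothetical `is_subgraph` helper; it models edges as (out vertex, in vertex) pairs, which ignores edge ids and therefore parallel edges. It also checks that every edge of S has both endpoints in S, since S must itself be a graph.

```python
def is_subgraph(s_vertices, s_edges, g_vertices, g_edges):
    """True when S's vertices and edges are subsets of G's and S is a graph."""
    endpoints_ok = all(a in s_vertices and b in s_vertices for a, b in s_edges)
    return s_vertices <= g_vertices and s_edges <= g_edges and endpoints_ok


g_vertices = {"u185", "m267", "p4361"}
g_edges = {("u185", "m267"), ("m267", "p4361")}

# Keeps one edge together with both of its endpoints.
valid = is_subgraph({"m267", "p4361"}, {("m267", "p4361")}, g_vertices, g_edges)

# Keeps an edge but drops one of its endpoints, so it is not a graph.
dangling = is_subgraph({"m267"}, {("m267", "p4361")}, g_vertices, g_edges)
```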

The illustration shows a subgraph of the original graph we saw previously with 5 vertices and 4 edges. Vertex m267 has 4 incident edges (1 incoming and 3 outgoing) and 4 adjacent vertices. Only simple paths exist in this subgraph.

This is only to get us started! There will be plenty of new notions introduced gradually.
