@jeremysears
Last active May 10, 2022 21:53
DSE Graph Schema Management Cheat Sheet in Gremlin Groovy

DSE Graph schema management examples taken from the excellent DS330: DataStax Enterprise Graph course.

Graph Definition

List all graph names:

system.graphs();  // => KillrVideo

Describe all graphs:

system.describe(); // => system.graph("KillrVideo").create()

Describe a specific graph:

system.graph("KillrVideo").describe();  // => system.graph("KillrVideo").create()

Create a graph:

system.graph("KillrVideo").create()
system.graph("KillrVideo").ifNotExists().create()

Create a graph with replication:

system
  .graph("KillrVideo")
  .option("graph.replication_config")
  .set("{'class' : 'NetworkTopologyStrategy','DC-East' : 3,'DC-West' : 5}")
  .option("graph.system_replication_config")
  .set("{'class' : 'NetworkTopologyStrategy','DC-East' : 3,'DC-West' : 3}")
  .ifNotExists()
  .create()

Create a graph in development mode:

system.graph("KillrVideo").create()
schema.config().option("graph.schema_mode").set("Development")

Check if a graph exists:

system.graph("KillrVideo").exists() // => true

Drop a specific graph:

system.graph("SomeGraph").drop();
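
Dropping a graph that does not exist may raise an error, so a script can guard the call with the exists() check shown above. This is a minimal sketch that reuses only the calls from this cheat sheet; "SomeGraph" is the same placeholder name.

```groovy
// Drop the graph only when it is present.
// "SomeGraph" is a placeholder graph name, not a real graph.
if (system.graph("SomeGraph").exists()) {
    system.graph("SomeGraph").drop()
}
```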

Graph Schema Definition

Define a property key for a scalar integer:

// Property key with a single integer value
schema.propertyKey("year").Int().single().create();

// single() is assumed by default and can be omitted
schema.propertyKey("year").Int().create();

Define a multi-property (note that multi-properties can only be associated with vertices):

// Property key that allows many text values for film production companies
schema.propertyKey("production").Text().multiple().create();

Define a meta-property (note that meta-properties can only be associated with vertex properties):

// Property key definitions
schema.propertyKey("source").Text().create();
schema.propertyKey("date").Timestamp().create();

// Multi-property key with two meta-properties
schema.propertyKey("budget").Text().multiple().properties("source","date").create();
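
Once the schema above is in place, a budget value and its meta-properties can be written and read back. The sketch below assumes a "movie" vertex with movieId "m267" already exists and uses the standard TinkerPop form property(key, value, metaKey, metaValue, ...); the budget amount and source are illustrative values only.

```groovy
// Add one budget estimate together with its two meta-properties.
// The amount and source strings are made-up sample values.
g.V().has("movie", "movieId", "m267").
  property("budget", "160000000",
           "source", "Los Angeles Times",
           "date", Instant.now())

// Show the meta-properties attached to each budget value
g.V().has("movie", "movieId", "m267").properties("budget").valueMap()
```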

Creating vertex labels:

// Vertex label with six associated property keys
schema
  .vertexLabel("movie")
  .properties("movieId","title","year", "duration","country","production")
  .create();
  
// Different vertex labels can use the same property key

// Property key "name" definition
schema.propertyKey("name").Text().create();

// Property key "name" is associated with both vertex labels
schema.vertexLabel("genre").properties("genreId","name").create();
schema.vertexLabel("person").properties("personId","name").create();  

Creating Edge Labels

// A single cardinality edge label definition
// A movie can be rated by a user at most once
schema
  .edgeLabel("rated")
  .single()
  .properties("rating")
  .connection("user","movie")
  .create();
  
// A multi-cardinality edge label definition
// Acting in a movie in multiple roles is possible
schema
  .edgeLabel("actor")
  .multiple()
  .connection("movie","person")
  .create();

// multiple() is assumed by default and can be omitted
schema
  .edgeLabel("actor")
  .connection("movie","person")
  .create();
  
// A multi-connection edge label definition
// Edge label with different domains and ranges
schema
  .edgeLabel("knows")
  .single()
  .connection("user","user")
  .connection("user","person")
  .connection("person","user")
  .create();

Dropping graph schemas:

// Dropping graph schema will also result in losing all graph data!
schema.clear()

Retrieve a vertex with a Default ID:

g.V().hasId("{~label=movie, member_id=0, community_id=63341568}")
// Sample output:
// v[{~label=movie, member_id=0, community_id=63341568}]

g.V("{~label=movie, member_id=0, community_id=63341568}")
// Sample output:
// v[{~label=movie, member_id=0, community_id=63341568}]

Defining Custom Vertex IDs

// Property keys
schema.propertyKey("username").Text().create();
schema.propertyKey("age").Int().create();
schema.propertyKey("gender").Text().create();

// Vertex label with a custom ID
schema
  .vertexLabel("user")
  .partitionKey("username")
  .properties("age","gender")
  .create();
  
// Property keys
schema.propertyKey("movieId").Text().create();
schema.propertyKey("title").Text().create();
schema.propertyKey("year").Int().create();
schema.propertyKey("duration").Int().create();
schema.propertyKey("country").Text().create();

// Vertex label with a custom ID
schema.vertexLabel("movie").
       partitionKey("year","country").
       clusteringKey("movieId").
       properties("title","duration").create();  

Retrieving a Vertex with a Custom ID

g.V("{~label=user, username=agent007}")   // or
g.V().hasId("{~label=user, username=agent007}")
// Sample output:
// v[{~label=user, username=agent007}]

g.V("{country=United States, movieId=m267, ~label=movie, year=2010}")   
// or
g.V().hasId("{country=United States, movieId=m267, ~label=movie, year=2010}")
// Sample output:
// v[{country=United States, movieId=m267, ~label=movie, year=2010}]

Graph Index Definition

Vertex Indexes

Create a materialized view index on a high cardinality property

// Indexing movies by movieId
schema.vertexLabel("movie").index("moviesById").materialized().by("movieId").add()

// Find a movie with a given movieId
// Both vertex label and property key-value must be 
// specified for a traversal to use an index.
g.V().hasLabel("movie").has("movieId","m267")
// or
g.V().has("movie","movieId","m267")

Create a secondary index on a low cardinality property

// Indexing movies by year
schema.vertexLabel("movie").index("moviesByYear").secondary().by("year").add()

// Find movies from a given year
g.V().hasLabel("movie").has("year",2010)
// or
g.V().has("movie","year",2010)
// Note: A traversal with no explicitly specified vertex label, 
// e.g., g.V().has("year",2010), cannot take advantage of a vertex index.
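
One way to check whether a traversal is actually served by an index is to append profile(), as in the Profiling Traversals section. This is a sketch only; the exact metrics printed depend on the DSE version and the data.

```groovy
// With the vertex label given, the first step is expected to
// report an index-query entry in its metrics
g.V().has("movie", "year", 2010).profile()

// Without the label no vertex index applies; in Production mode this
// may be rejected unless graph scans are allowed
g.V().has("year", 2010).profile()
```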

Create a full text search index on a text property

// Indexing movies by title
schema.vertexLabel("movie").index("search").search().by("title").asText().add()

// Find movies with words that start with "Wonder" in their titles
g.V().has("movie","title",Search.tokenRegex("Wonder.*"))
// Indexed properties can be queried using token(), tokenPrefix(), and tokenRegex().
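
The other two text-search predicates named above follow the same pattern as tokenRegex(). A short sketch, reusing the "title" index and the sample word "Wonder":

```groovy
// Movies whose title contains the token "Wonder"
g.V().has("movie", "title", Search.token("Wonder"))

// Movies whose title contains a token that starts with "Wonder"
g.V().has("movie", "title", Search.tokenPrefix("Wonder"))
```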

Create a string search index on a text property

// Indexing movies by country
schema.vertexLabel("movie").index("search").search().by("country").asString().add()

// Find movies from countries that start with letter "U"
g.V().has("movie","country",Search.prefix("U"))
// Indexed properties can be queried using prefix(), regex(), eq() and neq().
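
The remaining string-search predicates can be sketched the same way. The country values below are illustrative; eq() and neq() are the standard Gremlin predicates.

```groovy
// Countries matching a regular expression
g.V().has("movie", "country", Search.regex("United.*"))

// Exact match and its negation
g.V().has("movie", "country", eq("United States"))
g.V().has("movie", "country", neq("United States"))
```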

Create additional specialized search index capabilities

// Spatial indexes: asCartesian() and asGeo()
// Other non-text indexes: no special index type
// One search index, many properties
// Indexing users by name, location, and age
schema
  .vertexLabel("user")
  .index("search")
  .search()
  .by("name").asText()
  .by("location").asCartesian(0.0,0.0,100.0,100.0)
  .by("age").add()

List vertex indexes

// Listing all graph schema information
schema.describe()

// Listing schema information for a particular vertex label
schema.vertexLabel("movie").describe()
// Sample output:
// schema.vertexLabel("movie").properties("movieId", "title", "year", "duration",
//                                                       "country", "production").create()
// schema.vertexLabel("movie").index("moviesById").materialized().by("movieId").add()
// schema.vertexLabel("movie").index("moviesByYear").secondary().by("year").add()
// schema.vertexLabel("movie").index("search").search().by("title").asText()
//                                                     .by("country").asString().add()
// ...

Drop a vertex index

// Dropping a materialized view index
schema.vertexLabel("movie").index("moviesById").remove()

// Dropping a secondary index
schema.vertexLabel("movie").index("moviesByYear").remove()

// Dropping a specific search index property
schema.vertexLabel("user").index("search").
                           search().properties("location").remove()

// Dropping a search index
schema.vertexLabel("user").index("search").remove()

Property Indexes

A property index uses a materialized view in Cassandra to efficiently retrieve properties of a known vertex by meta-property values that are either known or fall into a known range.

Create and use a Property Index

// Indexing movie budget estimates by source
schema.vertexLabel("movie").index("movieBudgetBySource")
  .property("budget").by("source").add()

// Querying movie budget estimates based on source
g.V().has("movie","movieId","m267").properties("budget")
  .has("source","Los Angeles Times").value()
      
// Indexing movie budget estimates by date
schema.vertexLabel("movie").index("movieBudgetByDate")
  .property("budget").by("date").add()

// Querying movie budget estimates based on date
g.V().has("movie","movieId","m267").properties("budget")
  .has("date", gt(Instant.now().minusSeconds(86400 * 365)))
  .value()

Edge Indexes

An edge index uses a materialized view in Cassandra to efficiently traverse edges that are incident to a known vertex, have a known label, and have property values that are either known or fall into a known range.

Create and use an Edge Index

// Find how many users rated a particular movie with an 8-star rating
schema.vertexLabel("movie")
  .index("toUsersByRating")
  .inE("rated").by("rating")
  .add();

g.V().has("movie","movieId","m267")
  .inE("rated").has("rating",8).count()
  
// Find movies rated with a greater-than-7 rating by a particular user
schema.vertexLabel("user")
  .index("toMoviesByRating")
  .outE("rated").by("rating")
  .add();

g.V().has("user","userId","u1")
  .outE("rated").has("rating",gt(7)).inV()
// Both incoming and outgoing edges of a vertex can be indexed by 
// specifying bothE() when creating an edge index.  
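
A bothE() index follows the same shape as the inE() and outE() examples. The sketch below assumes the "knows" edge label carries a hypothetical "since" timestamp property, which is not part of the schema shown earlier; the index name is also made up for illustration.

```groovy
// Index "knows" edges in both directions by the hypothetical "since" property
schema.vertexLabel("user")
  .index("knowsBySince")
  .bothE("knows").by("since")
  .add()

// Find users connected to a particular user within the last year,
// regardless of edge direction
g.V().has("user","userId","u1")
  .bothE("knows").has("since", gt(Instant.now().minusSeconds(86400 * 365)))
  .otherV()
```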

Production/Development Schema Modes

Change the schema mode for a graph:

schema.config().option("graph.schema_mode").get() // Production

schema.config().option("graph.schema_mode").set("Development")
schema.config().option("graph.schema_mode").get() // Development
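
Switching back uses the same option. A minimal sketch, assuming "Production" is accepted as the value, as the get() output above suggests:

```groovy
// Return the graph to Production mode once the schema has settled
schema.config().option("graph.schema_mode").set("Production")

// Confirm the current mode
schema.config().option("graph.schema_mode").get()
```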

Enabling graph scans in production mode

// This is OK for scanning small portions of data, but caution is advised
schema.config().option("graph.schema_mode").get() // Production
g.V().hasLabel("genre").values("name")
// Could not find a suitable index ... and graph scans are disabled

schema.config().option("graph.allow_scan").set(true)
g.V().hasLabel("genre").values("name")
// Action Adventure Animation Comedy ... 18 genres in total
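
Since caution is advised, scans can be turned off again after the one-off query. This sketch assumes the option accepts false in the same way it accepts true above:

```groovy
// Disable graph scans again so unindexed traversals are rejected
schema.config().option("graph.allow_scan").set(false)

// Check the current setting
schema.config().option("graph.allow_scan").get()
```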

Profiling Traversals

Execute a query and track profiling data

OLTP traversal example

g.V().has("person","name","Johnny Depp").in("actor").values("title").profile()

Traversal Metrics
Step                                              Count  Traversers       Time (ms)    % Dur
============================================================================================
DsegGraphStep([~label.eq(person), name.eq(Johnn...    1           1           1.103    16.25
  query-optimizer                                                             0.117
  query-setup                                                                 0.004
  index-query                                                                 0.330
DsegVertexStep(IN,[actor],vertex)                    14          14           1.312    19.33
  query-optimizer                                                             0.068
  query-setup                                                                 0.001
  vertex-query                                                                0.519
DsegPropertiesStep([title],value)                    14          14           4.373    64.43
  query-optimizer                                                             0.154
  query-setup                                                                 0.000
  vertex-query                                                                0.161
  query-setup                                                                 0.000
  vertex-query                                                                0.150
  query-setup                                                                 0.000
  vertex-query                                                                0.179
  query-setup                                                                 0.000
  vertex-query                                                                0.209
  query-setup                                                                 0.000
                           >TOTAL                     -           -           6.789        -

OLAP traversal example

g.E().groupCount().by(label).profile()

Traversal Metrics
Step                               Count  Traversers       Time (ms)    % Dur
=============================================================================
GraphStep(edge,[])                 69054       69054        4161.214    98.50
GroupCountStep(label)                  1           1          63.277     1.50
                        >TOTAL         -           -        4224.492        -

Switching to the OLTP traversal engine

:remote config alias g KillrVideo.g
g.V().has("person","name","Johnny Depp").in("actor").values("title")

Switching to the OLAP traversal engine

:remote config alias g KillrVideo.a
g.E().groupCount().by(label)