Learn to query graph databases with Gremlin.

We originally commissioned Josh Perryman, of Expero, to teach this TinkerPop/Gremlin workshop for the recent Graph Day conference held in Austin in January 2017. To our knowledge, no one has a more Wikipedia-like knowledge of every commercial graph database than Josh. The workshop sold out and received rave reviews. When Josh offered it again at Graph Day in San Francisco, it sold out as well, once again to rave reviews. We have asked Josh to come to Seattle and offer the course yet again in conjunction with Graph Day at Data Day Seattle. As far as we know, this is currently the only TinkerPop/Gremlin training workshop in the world.

There will only be one section of this class, and enrollment is limited to 30. Don't miss this opportunity!

What is Gremlin?

Gremlin, part of the Apache TinkerPop framework, is an incredibly rich and powerful query language for property graphs. Its functional roots and novel execution model can make it a little difficult to get started with; many instincts from set-based query languages like SQL don't translate directly. In this four-hour workshop, led by Gremlin expert Josh Perryman, you will work through a series of increasingly complex exercises.

What is TinkerPop?

Apache TinkerPop is an open source, vendor-agnostic graph computing framework distributed under the commercial-friendly Apache 2 license. When a data system is TinkerPop-enabled, its users are able to model their domain as a graph and analyze that graph using the Gremlin query language. All TinkerPop-enabled systems integrate with one another, which lets vendors easily expand their offerings and lets users choose the appropriate graph technology for their application.

Why should you learn Gremlin?

Pretty much every graph database can be queried with Gremlin, and Microsoft just announced Gremlin/TinkerPop support for its own CosmosDB. Gremlin is well on its way to becoming the SQL of graph query languages. Here are some of the tools and databases that are compatible with Gremlin and TinkerPop:

  • Blazegraph - RDF graph database with OLTP support.
  • ChronoGraph - A versioned graph database.
  • DSEGraph - DataStax graph database with OLTP and OLAP support.
  • GRAKN.AI - Distributed OLTP/OLAP knowledge graph system.
  • Hadoop (Giraph) - OLAP graph processor using Giraph.
  • Hadoop (Spark) - OLAP graph processor using Spark.
  • HGraphDB - OLTP graph database running on Apache HBase.
  • IBM Graph - OLTP graph database as a service.
  • JanusGraph - Distributed OLTP and OLAP graph database with BerkeleyDB, Cassandra and HBase support.
  • Neo4j - OLTP graph database (embedded and high availability).
  • neo4j-gremlin-bolt - OLTP graph database (using Bolt Protocol).
  • OrientDB - OLTP graph database.
  • Sqlg - RDBMS OLTP implementation with HSQLDB and PostgreSQL support.
  • Stardog - RDF graph database with OLTP and OLAP support.
  • TinkerGraph - In-memory OLTP and OLAP reference implementation.
  • Titan - Distributed OLTP and OLAP graph database with BerkeleyDB, Cassandra and HBase support.
  • Titan (Amazon) - The Amazon DynamoDB storage backend for Titan.
  • Titan (Tupl) - The Tupl storage backend for Titan.
  • Unipop - OLTP Elasticsearch and JDBC backed graph.

Course Description

The “Introduction to Gremlin” half-day course takes students from little or no knowledge of property graphs and the Gremlin traversal language to a basic ability to navigate and make changes to TinkerPop-enabled data. The course uses the reference implementation of TinkerPop Gremlin, and a handful of sample data sets, to teach the Gremlin traversal language through hands-on examples.

For most students, their hands will never leave the keyboard as they follow the instructor and the examples in the student handout. At the end of the course, students will:
• understand the difference between the graph data and the traversal process
• know the elements of a TinkerPop property graph
• be able to write basic traversals through the graph
• be able to perform common mutations of the graph (insert, change and remove data)
• be exposed to simple data transformations such as grouping, ordering, and aggregations.

Course Summary

Section 1: Introduction to Property Graphs
Apache TinkerPop, Gremlin Console, traversals vs. graphs, elements of a property graph: vertex, edge, property. Iterating results.
Section 2: Basic Traversals - Finding, Filtering & Projecting
Finding vertices, finding edges, returning property values. Traversing the graph. Filtering using predicates and the where(), is() and has() steps.
Section 3: Mutating the Graph - Adding, Changing & Deleting Data
Graph API vs. the Traversal API. Adding vertices, edges, properties. Editing properties. Removing properties, edges, vertices. Dropping a graph.
Section 4: Common Transformations - Grouping, Ordering & Aggregations
Review projections. Simple grouping examples. Ordering results by a property value. Aggregations.
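
To make the outline above a little more concrete, here is a minimal plain-Python sketch of the ideas it names: the elements of a property graph (vertex, edge, property), finding and filtering, traversing, mutating, and simple grouping and ordering. It is not the TinkerPop or Gremlin API and is not taken from the course materials. The class names, the helper methods (has, out, drop_vertex) and the sample data are hypothetical, chosen only to loosely echo the vocabulary of the outline.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: int
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    label: str
    out_v: int  # id of the vertex the edge leaves
    in_v: int   # id of the vertex the edge enters
    properties: dict = field(default_factory=dict)


class Graph:
    """Hypothetical toy property graph; only the data, no query engine."""

    def __init__(self):
        self.vertices = {}  # vertex id -> Vertex
        self.edges = []

    def add_vertex(self, vertex_id, label, **properties):
        self.vertices[vertex_id] = Vertex(vertex_id, label, properties)
        return self.vertices[vertex_id]

    def add_edge(self, label, out_v, in_v, **properties):
        edge = Edge(label, out_v, in_v, properties)
        self.edges.append(edge)
        return edge

    def drop_vertex(self, vertex_id):
        # Removing a vertex also removes every edge that touches it.
        del self.vertices[vertex_id]
        self.edges = [e for e in self.edges
                      if vertex_id not in (e.out_v, e.in_v)]

    def has(self, label, key, value):
        # Find vertices by label and an exact property value.
        return [v for v in self.vertices.values()
                if v.label == label and v.properties.get(key) == value]

    def out(self, vertex, edge_label):
        # Follow outgoing edges with the given label to adjacent vertices.
        return [self.vertices[e.in_v] for e in self.edges
                if e.out_v == vertex.id and e.label == edge_label]


# Section 1: build a small graph (made-up sample data).
g = Graph()
g.add_vertex(1, "person", name="alice", age=30)
g.add_vertex(2, "person", name="bob", age=25)
g.add_vertex(3, "person", name="carol", age=35)
g.add_vertex(4, "software", name="graphtool")
g.add_edge("knows", 1, 2)
g.add_edge("knows", 1, 3)
g.add_edge("created", 2, 4)
g.add_edge("created", 3, 4)

# Section 2: find a vertex, traverse its "knows" edges, project names.
alice = g.has("person", "name", "alice")[0]
friend_names = sorted(v.properties["name"] for v in g.out(alice, "knows"))

# Section 3: change a property, then remove a vertex and its edges.
alice.properties["age"] = 31
g.drop_vertex(2)

# Section 4: group vertices by label and order people by age.
label_counts = Counter(v.label for v in g.vertices.values())
people = [v for v in g.vertices.values() if v.label == "person"]
by_age = [v.properties["name"]
          for v in sorted(people, key=lambda v: v.properties["age"])]

print(friend_names, dict(label_counts), by_age)
```

The sketch keeps the graph (the stored vertices and edges) separate from the steps that walk over it, which is the distinction between graph data and traversal process that the course sets out to teach.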

Course Requirements

Attendees must have:
• A laptop with Wi-Fi capabilities
• git or unzip software (the GitHub repo can be cloned with git or downloaded as a zip file)
• Docker
• ~300 MB free disk space for the Docker image
• ~100 MB available RAM for the Docker container

Attendees should clone or download the public repo: https://github.com/experoinc/gremlin-lang-intro and then follow the instructions in the README.md file.

About the instructor

Josh Perryman is a Managing Consultant / Data Junkie / Technology Lead at Expero, Inc. His deep familiarity with a multitude of graph platforms and tools makes him a highly sought-after speaker, trainer, and consultant in the graph space.

Apache TinkerPop, TinkerPop, Apache, Apache feather logo, and Apache TinkerPop project logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse or review the materials provided at this event, which is managed by Global Data Geeks.