Over the past few months, I've explored many of the different options for modeling and persisting data using graph databases. My goal was to decide what my technology stack would look like if I were to start a new project that required a backend service exposing a GraphQL API.
Next, I explored JanusGraph. JanusGraph is a powerful graph database implementation that supports multiple storage backends, including Cassandra, Apache HBase, and Google Cloud Bigtable. It exposes data through the Gremlin traversal language, the de facto query language for graph databases. I used an object-graph-mapping (OGM) library called Ferma to map my domain objects to and from the graph; it seemed to be the only Gremlin-compatible OGM available. Unfortunately, I found Ferma to have a few fatal shortcomings:
- Its runtime code generation was incompatible with graphql-tools.
- Enums were not supported.
- Domain objects must be abstract.
- Domain objects must be mutable.
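For contrast, here is a rough sketch of the kind of domain model these constraints rule out: a concrete, immutable Kotlin class with an enum property. The names `House` and `Human` (with these fields) are made up for illustration and are not taken from Ferma or any other library.

```kotlin
// Hypothetical domain model: concrete (not abstract), immutable, and using an enum.
enum class House { GRYFFINDOR, SLYTHERIN }

data class Human(val name: String, val house: House)

fun main() {
    val harry = Human(name = "Harry Potter", house = House.GRYFFINDOR)
    // "Updating" an immutable object means creating a modified copy.
    val renamed = harry.copy(name = "Harry James Potter")
    println(harry)
    println(renamed)
}
```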
Then, I set up a backend app using Neo4j. Neo4j is a widely used graph database with mature documentation and tooling. Again, I was disappointed with the object-graph-mapping library offered by Neo4j, for similar reasons: for example, model objects are required to be mutable. Also, adding support for clusters in Neo4j requires paying for their enterprise edition.
Ultimately, I decided that the most powerful backend infrastructure would be built with JanusGraph plus a type-safe OGM wrapping JanusGraph's Gremlin interface. This type-safe OGM still needed to be created, and so Kremlin was born.
Kotlin + Gremlin = Kremlin
I chose to implement this OGM in Kotlin because of its nullable types. When traversing a relationship in a graph, say from a child to its mother, it would be nice to enforce that the result of the traversal will never be null. You can do this with Kremlin by specifying the relationship:
val children = Relationship.asymmetricSingleToMany<Human, Human>(name = "children")
val mother = children.inverse
Then when traversing:
val lilyPotter = g.traverse(harryPotter out mother)
lilyPotter will be a non-nullable Human. However, if we defined the relationship:
val significantOther = Relationship.symmetricOptionalToOptional<Human>(name = "significantOther")
And then traversed:
val harrysGirlfriend = g.traverse(harryPotter out significantOther)
harrysGirlfriend would be of type Human?, since the relationship is optional.
Given the relationships children and significantOther, we can easily traverse from a mother to her children's significant others:
val childrensSignificantOthers = g.traverse(lilyPotter out children.to(significantOther))
The result would be a list of type [Human] because the chained traversal children.to(significantOther) has a cardinality of to-many. When two relationships are linked together and traversed, the OGM is smart enough not to map intermediate vertices to their domain object form, for maximum efficiency.
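To make the idea concrete, here is a minimal in-memory sketch of how a relationship's cardinality can be carried in its type, so that the compiler decides whether a traversal returns `Human`, `Human?`, or `List<Human>`. This is not Kremlin's implementation or API: `Rel`, `ToSingle`, `ToOptional`, `ToMany`, `Graph`, and `link` are hypothetical names, the graph is a plain map rather than Gremlin, and the sketch does not model skipping the mapping of intermediate vertices.

```kotlin
// Hypothetical sketch, not Kremlin's API: cardinality is encoded in the relationship type.
data class Human(val name: String)

sealed class Rel<FROM, TO>(val name: String)
class ToSingle<FROM, TO>(name: String) : Rel<FROM, TO>(name)
class ToOptional<FROM, TO>(name: String) : Rel<FROM, TO>(name)
class ToMany<FROM, TO>(name: String) : Rel<FROM, TO>(name)

class Graph {
    // Edges keyed by (source object, relationship name).
    private val edges = mutableMapOf<Pair<Any, String>, MutableList<Any>>()

    fun <F : Any, T : Any> link(from: F, rel: Rel<F, T>, to: T) {
        edges.getOrPut(from to rel.name) { mutableListOf() }.add(to)
    }

    @Suppress("UNCHECKED_CAST")
    private fun <T> targets(from: Any, name: String): List<T> =
        (edges[from to name] ?: emptyList<Any>()) as List<T>

    // The overload chosen by the relationship type fixes the return type.
    fun <F : Any, T : Any> traverse(from: F, rel: ToSingle<F, T>): T =
        targets<T>(from, rel.name).single()

    fun <F : Any, T : Any> traverse(from: F, rel: ToOptional<F, T>): T? =
        targets<T>(from, rel.name).singleOrNull()

    fun <F : Any, T : Any> traverse(from: F, rel: ToMany<F, T>): List<T> =
        targets<T>(from, rel.name)

    // Chaining to-many with to-optional yields to-many; missing targets are dropped.
    fun <A : Any, B : Any, C : Any> traverse(
        from: A, first: ToMany<A, B>, second: ToOptional<B, C>
    ): List<C> = traverse(from, first).mapNotNull { traverse(it, second) }
}

fun main() {
    val children = ToMany<Human, Human>("children")
    val mother = ToSingle<Human, Human>("mother")
    val significantOther = ToOptional<Human, Human>("significantOther")

    val lily = Human("Lily Potter")
    val harry = Human("Harry Potter")
    val ginny = Human("Ginny Weasley")

    val g = Graph()
    g.link(lily, children, harry)
    g.link(harry, mother, lily)
    g.link(harry, significantOther, ginny)

    val harrysMother: Human = g.traverse(harry, mother)             // non-nullable
    val ginnysPartner: Human? = g.traverse(ginny, significantOther) // nullable; no edge was added from ginny
    val partners: List<Human> = g.traverse(lily, children, significantOther)

    println(harrysMother)
    println(ginnysPartner)
    println(partners)
}
```

The point of the design is that a caller cannot forget a null check on an optional relationship, and never has to write one for a required relationship.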
In addition to linking relationships to other relationships, it's possible to modulate a traversal in any way that Gremlin supports. There are a few built-in "Steps", including dedup and others.
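As a rough illustration of what a dedup step does to a traversal's results, the sketch below models a step as a function applied to a list of results. The `Step` alias and the `dedup` and `via` helpers are hypothetical stand-ins, not Kremlin's API; in Gremlin itself, the corresponding step is `dedup()`.

```kotlin
// Hypothetical sketch: a "step" is modeled as a function over traversal results.
data class Human(val name: String)

typealias Step<T> = (List<T>) -> List<T>

// Removes duplicate results while keeping first-seen order.
fun <T> dedup(): Step<T> = { results -> results.distinct() }

// Applies a step to a list of results.
fun <T> List<T>.via(step: Step<T>): List<T> = step(this)

fun main() {
    val molly = Human("Molly Weasley")
    // Traversing from two siblings to their mother reaches the same vertex twice.
    val mothersOfSiblings = listOf(molly, molly)
    println(mothersOfSiblings.via(dedup()))
}
```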
The current version of Kremlin is 0.9.5 and is considered beta. The library is already used in production in a few of my own side projects, and a 1.0 release will come once it has been battle-tested in a variety of environments. Please feel free to contribute on GitHub.