Sammy Kaye Introduces
Query Languages
Graph DBs don't usually use SQL. The most common query languages are:
What are the key concepts of graph DBs and how do they differ from relational DBs?
- Graph DBs are flexible
- Graph DBs use 'nodes' and 'edges'. These can be thought of as 'entities'/'nouns' and 'relationships'/'verbs'
- Graph DBs map more closely to how we might think or draw a diagram
- It is easy to get started, but the complexity quickly ramps up
- Graph DBs are new - there's less information out there. Relational DBs are well studied and well documented
- Graph DB schema can be evolved much more easily than relational DBs
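To make the 'nodes and edges' idea concrete, here is a minimal in-memory sketch. It is not how any particular graph DB is implemented; the `Node`, `Edge` and `Graph` classes and the `FOLLOWS`/`ATTENDS` names are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                   # the 'noun', e.g. "User"
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    start: Node
    kind: str                                    # the 'verb', e.g. "FOLLOWS"
    end: Node
    properties: dict = field(default_factory=dict)


class Graph:
    """A toy graph: two lists and a lookup, nothing more."""

    def __init__(self):
        self.nodes = []
        self.edges = []

    def add_node(self, label, **properties):
        node = Node(label, properties)
        self.nodes.append(node)
        return node

    def add_edge(self, start, kind, end, **properties):
        edge = Edge(start, kind, end, properties)
        self.edges.append(edge)
        return edge

    def neighbours(self, node, kind):
        # Follow outgoing edges of one kind from the given node.
        return [e.end for e in self.edges if e.start is node and e.kind == kind]


graph = Graph()
alice = graph.add_node("User", name="Alice")
bob = graph.add_node("User", name="Bob")
graph.add_edge(alice, "FOLLOWS", bob)

# 'Evolving the schema' here is just adding a new kind of node or edge.
party = graph.add_node("Event", title="Swing night")
graph.add_edge(bob, "ATTENDS", party, since="2015-01-01")
```

Note that nothing had to be declared up front before the `Event` node and the `ATTENDS` edge were added, which is the flexibility the list above refers to.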
When should I use graph DBs - what are the use cases?
- Highly connected data: If your data model uses a lot of joins, graph DBs may be a good approach
- If your data model uses a lot of one-to-one relationships, or 'get item by id' type queries, graph DBs are not a good fit
- If you don't yet know what sort of queries you want to answer, graph DBs can offer flexibility later
- If your queries are based more around relationships or verbs than objects or nouns, graph DBs may be a good approach
- 'Sparse' schemas lend themselves to graph DBs
- The panel debates whether graph DBs should work alongside relational DBs or replace them
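As a rough illustration of the 'highly connected data' case, the sketch below walks a small made-up `follows` graph to find everyone within a given number of hops. In a relational model each extra hop would typically mean another self-join on a link table; here it is one more step of the traversal. The data and the `within_hops` helper are hypothetical.

```python
from collections import deque

# Hypothetical data: who follows whom.
follows = {
    "alice": ["bob"],
    "bob": ["carol"],
    "carol": ["dave"],
    "dave": [],
}


def within_hops(start, max_hops):
    """Breadth-first traversal: everyone reachable in at most max_hops edges."""
    seen = {start}
    queue = deque([(start, 0)])
    reached = []
    while queue:
        person, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for other in follows.get(person, []):
            if other not in seen:
                seen.add(other)
                reached.append(other)
                queue.append((other, hops + 1))
    return reached


friends_of_friends = within_hops("alice", 2)
```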
How should we interact with graph databases?
- Adapt existing ORMs to work with graph DBs, e.g. NeoEloquent
- Throw out the ORM model and build specialized OGMs like Neo4jrb
- Write queries directly in a query language such as Cypher or Gremlin
- Use specialized tools such as recommendation engines, e.g. neo4j-reco
Other topics
- Security - do we need to worry about 'Cypher injection'? Most libraries support named query parameters 'out of the box'
- Graphical tools, e.g. the Neo4j browser
- Graph DBs don't give nodes auto-increment IDs - how do we identify nodes?
  - We can use UUIDs
  - With graph DBs, we are less likely to need IDs
  - Some graph DB engines discourage the use of IDs
  - Use 'natural' IDs where possible
- Graph DBs can benefit from indexing and unique constraints in much the same way as relational DBs
- Neo4j doesn't have triggers. OrientDB does
- Neo4j is more of a 'pure' graph database. OrientDB combines NoSQL and graph approaches
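The 'Cypher injection' and UUID points can be sketched without a running database. The example below only builds query text and parameter dictionaries; the function names and the `User` label are hypothetical. The `$name` placeholder is Cypher's parameter syntax in current Neo4j versions (older releases used `{name}`), and a driver would receive the text and the parameters as two separate arguments.

```python
import uuid


def unsafe_query(name):
    # Anti-pattern: user input becomes part of the query text itself.
    return "MATCH (u:User {name: '" + name + "'}) RETURN u"


def safe_query(name):
    # The query text is constant; the value travels separately as a parameter.
    return "MATCH (u:User {name: $name}) RETURN u", {"name": name}


def create_user_query(name):
    # No auto-increment ID is assumed, so the application generates a UUID.
    return (
        "CREATE (u:User {uuid: $uuid, name: $name}) RETURN u",
        {"uuid": str(uuid.uuid4()), "name": name},
    )


# A crafted input that closes the string literal and appends its own clause.
malicious = "x'}) DETACH DELETE u //"

injected_text = unsafe_query(malicious)
safe_text, safe_params = safe_query(malicious)
create_text, create_params = create_user_query("Sam")
```

With string concatenation the crafted input ends up inside the query text, whereas with a named parameter the text never changes and the input is only ever treated as a value.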
A Concrete example
Sammy Kaye talks about a problem he is facing with his DancerDeck project, discussed in episodes 21 and 39:
I want to be able to allow a user to subscribe (edge) to an event (node), but manage what sort of notifications are associated with that subscription. Should I use multiple edges between the user and node (an edge for each notification), or one edge (subscribe) with multiple properties (notifications) associated with that edge?
The panel offer their opinions on both solutions and also propose another solution: represent each notification as a node (rather than an edge or an edge property) and create edges between the user and the notification nodes.
Ultimately, it depends on how you want to query it.
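To compare the three shapes side by side, here is a small sketch using plain Python tuples rather than a real graph DB. All names (`sam`, `swing_night`, `NOTIFY_BY_EMAIL`, `SUBSCRIBED_TO`, `WANTS`, `ABOUT`) are invented for illustration, and the `ABOUT` edge linking a notification node back to its event is an assumption that the discussion above does not spell out. Each function answers the same question: which notification channels does this user want for this event?

```python
# Option 1: one edge per notification type.
multi_edges = [
    ("sam", "NOTIFY_BY_EMAIL", "swing_night"),
    ("sam", "NOTIFY_BY_SMS", "swing_night"),
]

# Option 2: a single 'subscribe' edge with the notifications as a property.
single_edges = [
    ("sam", "SUBSCRIBED_TO", "swing_night", {"notifications": ["email", "sms"]}),
]

# Option 3: each notification is a node of its own.
notification_nodes = {
    "n1": {"channel": "email"},
    "n2": {"channel": "sms"},
}
node_edges = [
    ("sam", "WANTS", "n1"),
    ("n1", "ABOUT", "swing_night"),   # assumed link from notification to event
    ("sam", "WANTS", "n2"),
    ("n2", "ABOUT", "swing_night"),
]


def channels_multi(user, event):
    # The channel is encoded in the edge type, so it has to be parsed back out.
    prefix = "NOTIFY_BY_"
    return sorted(
        kind[len(prefix):].lower()
        for start, kind, end in multi_edges
        if start == user and end == event and kind.startswith(prefix)
    )


def channels_single(user, event):
    # One edge lookup, then read its property.
    for start, kind, end, props in single_edges:
        if (start, kind, end) == (user, "SUBSCRIBED_TO", event):
            return sorted(props["notifications"])
    return []


def channels_nodes(user, event):
    # Two hops: user -> notification node -> event.
    wanted = {end for start, kind, end in node_edges
              if start == user and kind == "WANTS"}
    about = {start for start, kind, end in node_edges
             if kind == "ABOUT" and end == event}
    return sorted(notification_nodes[n]["channel"] for n in wanted & about)
```

All three give the same answer for this question; they differ in how awkward other questions become, for example 'who wants SMS for any event?', which is the sense in which the choice depends on the queries.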
Sammy Kaye wraps up with