The PHP podcast where everyone chimes in.

Originally aired on

February 24th, 2016

040: Graph Databases

Traditional relational databases like MySQL or Postgres are really good at providing many solutions to the problem of persisting state. But these types of database are really horrible at querying highly connected models in an efficient way.

Graph databases like Neo4j and OrientDB excel at highly connected data. In fact, graph technologies are the backbone of social networks like Facebook and Twitter. We discuss how to think about our data using the graph model and what tools we can use in our PHP projects to interface with them. We also discuss the considerations we'll need to make when deciding whether or not to use a graph database in our next project.

with

Chris White

Using graph databases in PHP Show Summary


Sammy Kaye Introduces

Query Languages

Graph DBs don't usually use SQL. The most common query languages are:

What are the key concepts of graph DBs and how do they differ from relational DBs?

  • Graph DBs are flexible
  • Graph DBs use 'nodes' and 'edges'. These can be thought of as 'entities'/'nouns' and 'relationships'/'verbs'
  • Graph DBs map more closely to how we might think or draw a diagram
  • It is easy to get started, but the complexity quickly ramps up
  • Graph DBs are new - there's less information out there. Relational DBs are well studied and well documented
  • Graph DB schema can be evolved much more easily than relational DBs

When should I use graph DBs - what are the use cases?

  • Highly connected data: If your data model uses a lot of joins, graph DBs may be a good approach
  • If your data model uses a lot of one-to-one relationships, or 'get item by id' type queries, graph DBs are not a good fit
  • If you don't yet know what sort of queries you want to answer, graph DBs can offer flexibility later
  • If your queries are based more around relationships or verbs than objects or nouns, graph DBs may be a good approach
  • 'sparse' schemas lend themselves to graph DBs
  • There is some debate from the panel around whether graph DBs should work alongside relational DBs or instead of relational DBs

How should we interact with graph databases?

  • Adapt existing ORMs to work with graph DBs like NeoEloquent
  • Throw out the ORM model and build specialized OGMs like Neo4jrb
  • Write query languages like Cypher/Gremlin/etc directly
  • Specialized tools like recommendation engines e.g: neo4j-reco

Other topics

  • Security - do we need to worry about 'cypher injection'? Most libraries have named parameters 'out of the box'
  • Graphical tools e.g: neo4j browser
  • Graph DBs don't give nodes auto increment IDs - how do we identify nodes?
    • We can use uuids
    • With graph DBs, we are less likely to need IDs
    • Some graph DB engines discourage the use of IDs
    • Use 'natural' IDs where possible
  • Graph DBs can benefit from indexing and unique constraints in much the same way as relational DBs
  • Neo4j doesn't have triggers. OrientDB does
  • Neo4j is more of a 'pure' graph database. OrientDB combines NoSQL and graph approaches

A Concrete example

Sammy Kaye talks about a problem he is facing with his DancerDeck project, discussed in episodes 21 and 39:

I want to be able to allow a user to subscribe (edge) to an event (node), but manage what sort of notifications are associated with that subscription. Should I use multiple edges between the user and node (an edge for each notification), or one edge (subscribe) with multiple properties (notifications) associated with that edge?

The panel offer their opinions on both solutions and also propose another solution: represent each notification as a node (rather than en edge or an edge property) and create edges between the user and the notification nodes.

Ultimately, it depends on how you want to query it.

Sammy Kaye wraps up with

Michelle Sanver

  • Proud Liiper <3
  • Co-President of PHPWomen
  • Loves colours and nature

Ed Finkler



Chris White

Chris White


Developer Shout-Out

The Developer Shout-Out recognizes developers in the community for their contributions.

For this episode the panel guests, Michelle, Ed, Jeremy, and Chris nominated Adam Engebretson for the Developer Shout-Out segment.

Thank you, Adam Engebretson for your contributions to the Laravel community. A $50 Amazon gift card is on its way to you.

$50 Amazon gift card sponsored by Laracasts

Laracasts

It's like Netflix for developers.

Show Notes Credit

Chris Shaw

Thank you Chris Shaw for authoring the show notes for this episode!

If you'd like to contribute show notes and totally get credit for it, check out the show-notes repo!