Graph. Databases. Are. Awesome.
At least in certain use cases.
Who knows whom, and how? Graph data.
By way of a quick intro, a graph database is a type of NoSQL (“Not Only SQL”) datastore. Since the beginning of the interwebs, the dominant form of database has been SQL, with MySQL taking the lead. And MySQL is great. However, when your data is highly related in complex ways, SQL starts to crack. Tack onto that the enormous amounts of data manipulated by today’s applications, and SQL really starts to slow things down, both in the actual read/write operations and in the development of code, because of ridiculously complicated JOIN statements.
This is where graph databases shine, because they treat the relationships between data as first-class citizens, not an afterthought. Your data is therefore saved the way you would expect highly related data to be saved: as a property graph.
Now, it’s not only easier to visualize, but much, much faster to traverse. Graph databases shine in asking questions about your data like:
How many mutual friends do Bob and Jane have?
What other movies would Bob likely enjoy?
What is the shortest path between Bob’s house and Jane’s house?
These kinds of questions can be translated into relatively simple queries and executed in milliseconds. Any way you slice it, for complex, highly related data that is much faster than traditional SQL databases or other NoSQL options.
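To make those questions concrete, here is a minimal sketch of two of them, run over a tiny in-memory graph. It is written in Python purely as an illustration and is not how any particular graph database works internally. The `friends` data and the function names `mutual_friends` and `shortest_path` are made up for this example. The point is only that both questions are a matter of following relationships directly rather than joining tables.

```python
from collections import deque

# Hypothetical friendship graph: each person maps to the set of people they know.
friends = {
    "Bob": {"Alice", "Carol", "Dave"},
    "Jane": {"Carol", "Dave", "Erin"},
    "Alice": {"Bob"},
    "Carol": {"Bob", "Jane"},
    "Dave": {"Bob", "Jane"},
    "Erin": {"Jane"},
}


def mutual_friends(graph, a, b):
    """Return the people that both a and b are directly connected to."""
    return graph.get(a, set()) & graph.get(b, set())


def shortest_path(graph, start, goal):
    """Breadth-first search: return a shortest list of hops, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # Sort neighbors so the result is deterministic.
        for neighbor in sorted(graph.get(node, set())):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


print(sorted(mutual_friends(friends, "Bob", "Jane")))
print(shortest_path(friends, "Bob", "Jane"))
```

A real graph database would express the same traversals in its own query language and run them against indexed, persisted data. The sketch only shows the shape of the problem.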
This is the best video explaining graph databases I’ve found.
PHP and Graph Databases
PHP is like BASIC for the web: it’s easy to learn and easy to get going. There are probably millions of web hosts that offer cheap or free PHP hosting with a MySQL database, making it the choice for those on a budget, those just getting into web development, or those who want a project up fast that can be installed virtually anywhere.
That doesn’t mean, however, that PHP is somehow a child’s language. Since PHP 5.4 (and even more in 5.5 and 5.6), PHP has been a very powerful language that offers most of the object-oriented, down-and-dirty options that other languages offer. Add to that frameworks like Laravel and Yii, components from The League and Symfony, standards from FIG, Composer support, and an excellent community, and PHP has grown into a vibrant language that is still easy for beginners, but powerful enough for complexity (Facebook and WordPress both use it).
However, the adoption of graph databases in PHP has been slow for two reasons: 1.) the lack of cheap and easy GraphDB hosting, and 2.) the lack of a solid set of PHP tools to work with graph databases. The first problem is remedying itself thanks to cloud hosting services like Amazon AWS, Rackspace, and fortrabbit. It’s not to the MySQL place yet. But anyone building a relatively complex application would need some form of (at least virtual) dedicated hosting anyway. If you want a blog, use WordPress.
Pheonix Labs has decided to tackle the second problem. In preliminary research for a new community-driven, creative collaboration platform, we settled on a graph database very quickly (OrientDB to be exact). We also decided to use PHP and Laravel (another post will discuss that). We couldn’t find the tools we wanted to work with graph data, though. Graph utilities for PHP seemed unfinished and unmaintained. We would have to start from the ground up.
Spider
What the PHP universe doesn’t lack are excellent ORMs for SQL databases. An ORM (simply put) is a tool that sits between the PHP script and the database. The ORM usually handles some security, maps the data to objects that are easier to use, and provides a consistent API to work with data (through models). It’s pretty standard practice (and a time saver).
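As a rough picture of what that mapping layer does, here is a minimal sketch in Python (used only for illustration; the same shape applies in PHP). Everything in it is hypothetical: the `User` model, the `UserRepository` class, and the list of rows standing in for a database connection. It is not the API of any real ORM.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical model: the object the application code works with."""
    id: int
    name: str


class UserRepository:
    """Hypothetical mapper: hides raw rows behind a small, consistent API."""

    def __init__(self, rows):
        # Stand-in for a database connection; a real ORM would run queries here.
        self._rows = rows

    def find(self, user_id):
        """Return the matching row as a User object, or None if absent."""
        for row in self._rows:
            if row["id"] == user_id:
                return User(id=row["id"], name=row["name"])
        return None


# Raw records as a database driver might return them.
rows = [{"id": 1, "name": "Bob"}, {"id": 2, "name": "Jane"}]
repo = UserRepository(rows)
print(repo.find(2))
```

The application only ever sees `User` objects and the `find` method, so the storage underneath can change without touching the calling code.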
Now there is (in development) an OGM, an object graph mapper, for PHP and several graph database backends. Check out Spider.
Some other great links
http://programming.oreilly.com/2013/07/why-choose-a-graph-database.html
http://en.wikipedia.org/wiki/Graph_database
http://www.orientechnologies.com/orientdb/
