Using Robo to automatically trigger unit tests while developing

Robo is an excellent PHP task runner, similar to Gulp in the Node.js world. I was looking for a pure PHP alternative to Gulp for my PHP projects because I prefer to have a tool chain written in the same language as the project where possible. I also found the resulting code a lot more manageable than with Gulp.

Here is one of my main use cases for it: I wanted the unit tests to run as soon as I modified a test case. The following code is for a CakePHP 3 project.

You can adapt the above script to other projects pretty easily. All that should be required is to change line 13 so that it fits the paths and conventions of your project.
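A minimal sketch of such a watch task, not the original script, could look like the following. It assumes Robo is installed through Composer together with the Lurker file watching library that `taskWatch()` needs, that PHPUnit lives in `vendor/bin/phpunit`, and that the tests are in `tests/TestCase` (the CakePHP 3 convention). The method name `watchTests` is made up for this example, and the sketch simply re-runs the whole suite on every change.

```php
<?php
// RoboFile.php in the project root.
// Assumptions: Robo and its Lurker dependency are installed via Composer,
// PHPUnit is available as vendor/bin/phpunit.

class RoboFile extends \Robo\Tasks
{
    /**
     * Re-runs the test suite whenever something below the test folder changes.
     */
    public function watchTests()
    {
        $this->taskWatch()
            // Adapt this path to the layout and conventions of your project.
            ->monitor('tests/TestCase', function () {
                // Runs the whole suite; narrowing it down to the changed
                // file is left out of this sketch.
                $this->taskExec('vendor/bin/phpunit')->run();
            })
            ->run();
    }
}
```

Robo maps the method name to a command, so this task would be started with `vendor/bin/robo watch:tests` and keeps running until you stop it.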

Friends of friends – performant social network implementations

I did some research on this because I was curious how Facebook handles its huge amount of data and searches it quickly, and because I've seen people complaining about custom-made social network scripts becoming slow as their user base grows. After I did some benchmarking myself with just 10k users and 2.5 million friend connections (not even bothering with group permissions, likes and wall posts), it quickly turned out that the plain SQL approach is flawed. So I spent some time searching the web for how to do it better and came across this official Facebook article:

I really recommend watching the presentation behind the first link above before you continue reading. It's probably the best explanation you can find of how Facebook works behind the scenes.

The video and the article tell you a few things:

  • They’re using MySQL at the very bottom of their stack
  • Above the SQL DB there is the TAO layer, which contains at least two levels of caching and uses graphs to describe the connections.
  • I could not find anything on what software / DB they actually use for their cached graphs. I think I've seen somewhere that they used memcache, but I don't know whether they're still using it. I doubt it.

Let’s take a look at this; friend connections are at the top left:

[Image: graph of the connections, with friend connections at the top left]

This is a graph. It doesn’t tell you how to build it in SQL. There are several ways to do it, and this site covers a good number of different approaches.

Also consider that you have to do more complex queries than just friends of friends, for example when you want to filter all locations around a given coordinate that you and your friends of friends like. A graph is the perfect solution here.
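To illustrate why a graph fits so naturally, here is a small sketch of a friends of friends lookup as a two-hop traversal. It assumes the graph is held in memory as a plain PHP array that maps a user id to the ids of that user's friends. The function name `twoHopFriends` and the sample data are made up for this example, and it says nothing about how a real graph store would perform.

```php
<?php
// Sketch only: the graph is a plain array (user id => list of friend ids).
// Function name and sample data are hypothetical.

function twoHopFriends(array $graph, int $me): array
{
    // Flip the list of direct friends so membership checks are isset() lookups.
    $direct = array_flip($graph[$me] ?? []);
    $found = [];

    foreach (array_keys($direct) as $friend) {
        foreach ($graph[$friend] ?? [] as $candidate) {
            // Skip myself and people I am already friends with.
            if ($candidate !== $me && !isset($direct[$candidate])) {
                $found[$candidate] = true; // array keys remove duplicates
            }
        }
    }

    $ids = array_keys($found);
    sort($ids);

    return $ids;
}

$graph = [
    1 => [2, 4],
    2 => [1, 3, 4],
    3 => [2],
    4 => [1, 2],
];

echo implode(', ', twoHopFriends($graph, 1)), PHP_EOL; // 3
```

Each hop is just a walk along the edges of the current node, which is why additional filters such as "locations my friends of friends like" become further hops instead of further joins.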

I can’t tell you how to build it so that it will perform well but it clearly requires some trial and error and benchmarking.

Here is my disappointing test for just finding friends of friends:

DB schema:

Friends of friends query:

I really recommend creating some sample data with at least 10k user records, each with at least 250 friend connections, and then running this query. On my machine (i7 4770K, SSD, 16 GB RAM) the query took ~0.18 seconds. Maybe it can be optimized; I’m not a DB genius (suggestions are welcome). However, if this scales linearly with the number of users, you’re already at 1.8 seconds for just 100k users and 18 seconds for 1 million users.
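A test like the one described above can be sketched roughly as follows. This is not the original schema or query: the table names `users` and `friends`, their columns, the function name `findFriendsOfFriends` and the tiny sample data set are all assumptions, and an in-memory SQLite database is used only to keep the example self-contained.

```php
<?php
// Sketch only: table names, column names and sample data are assumptions.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
// One row per direction: if 1 and 2 are friends, both (1, 2) and (2, 1) exist.
$db->exec('CREATE TABLE friends (
    user_id INTEGER NOT NULL REFERENCES users (id),
    friend_id INTEGER NOT NULL REFERENCES users (id),
    PRIMARY KEY (user_id, friend_id)
)');

$addUser = $db->prepare('INSERT INTO users (id, name) VALUES (?, ?)');
foreach ([1 => 'Ann', 2 => 'Bob', 3 => 'Cid', 4 => 'Dee'] as $id => $name) {
    $addUser->execute([$id, $name]);
}

// Friendships: 1-2, 2-3, 2-4, 1-4, stored in both directions.
$addFriend = $db->prepare('INSERT INTO friends (user_id, friend_id) VALUES (?, ?)');
foreach ([[1, 2], [2, 3], [2, 4], [1, 4]] as [$a, $b]) {
    $addFriend->execute([$a, $b]);
    $addFriend->execute([$b, $a]);
}

// Returns the ids of friends of friends, without the user and the direct friends.
function findFriendsOfFriends(PDO $db, int $me): array
{
    $stmt = $db->prepare(
        'SELECT DISTINCT f2.friend_id
         FROM friends f1
         JOIN friends f2 ON f2.user_id = f1.friend_id
         WHERE f1.user_id = ?
           AND f2.friend_id <> ?
           AND f2.friend_id NOT IN (SELECT friend_id FROM friends WHERE user_id = ?)
         ORDER BY f2.friend_id'
    );
    $stmt->execute([$me, $me, $me]);

    return array_map('intval', $stmt->fetchAll(PDO::FETCH_COLUMN));
}

echo implode(', ', findFriendsOfFriends($db, 1)), PHP_EOL; // 3
```

To get meaningful timings, replace the four sample users with generated data of the size mentioned above and wrap the call in `microtime(true)` measurements.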

This might still sound OKish for ~100k users, but consider that you just fetched friends of friends and didn’t run any more complex query like “show me only posts from friends of friends + do the permission check whether I’m allowed or NOT allowed to see some of them + do a sub query to check if I liked any of them”. You want to let the DB check whether you already liked a post, or you’ll have to do it in code. Also consider that this is not the only query you run and that you have more than one active user at the same time on a more or less popular site.

I’ve started experimenting with OrientDB to do the graph queries and to map my edges to the underlying SQL DB. If I ever get it done I’ll write an article about it.

Conclusion: Implementing a social network is easy but making sure it performs well is clearly not – IMHO.

Loading static JSON data from the rendered page into AngularJS

In our scenario we didn’t want to build a whole single-page application. Instead we needed to deal with the data after a form was posted to the server but could not be saved for some reason, so that the page is rendered again and the data is shown again as well. This caused the Angular controller to lose the data it had set.

I was looking for a way to prevent this and to inject any kind of data into my Angular app. I finally came up with a small directive that reads the JSON data, decodes it and assigns it to a given scope variable.

This is the small directive that will load your data into your controller’s scope:

In your application’s code, in this case PHP, you can now do this:
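A hedged sketch of what the PHP side could look like, not the original code: it assumes the directive is attached as a `json-data` attribute to a `<script type="application/json">` element and takes the name of the scope variable as its value. The attribute name, the helper `jsonDataTag` and the sample data are assumptions made for this example.

```php
<?php
// Hypothetical helper: renders $data as a JSON <script> element that a
// directive (assumed here to be named "json-data") can read and decode.
function jsonDataTag(string $scopeVariable, array $data): string
{
    // The JSON_HEX_* flags encode <, >, &, ' and " as \uXXXX sequences, so the
    // payload cannot close the script element early. JSON.parse() on the
    // client side turns them back into the original characters.
    $json = json_encode(
        $data,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT
    );

    return sprintf(
        '<script type="application/json" json-data="%s">%s</script>',
        htmlspecialchars($scopeVariable, ENT_QUOTES, 'UTF-8'),
        $json
    );
}

// Example: hand the data of the failed post back to the page.
echo jsonDataTag('formData', ['title' => 'My <b>post</b>', 'published' => false]);
```

Encoding the data on the server and decoding it in the directive keeps the template free of hand-written JavaScript literals, which is the main reason for going through JSON here.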

If anyone knows a better way to deal with this scenario, I’m open to any suggestions and criticism!