File Conveyor

17 February, 2010

In this final article in my bachelor thesis series, I explain how I proved that the work I had done for my bachelor thesis (which includes the Episodes module, the Episodes Server module, the CDN integration module and File Conveyor) actually had a positive impact on page loading performance. For that, I converted a fairly high-traffic web site to Drupal and installed File Conveyor to optimize and sync files to both a static file server and an FTP Push CDN. I then used the CDN integration module to serve files from either the static file server or the FTP Push CDN (the choice between the two is based on the visitor’s location, i.e. the IP address), measured the results using Episodes and proved the positive impact using Episodes Server’s charts.

Previously in this series:

16 February, 2010

In this article, I explain the rationale behind the CDN integration module for Drupal 6, which was written as part of my bachelor thesis. It supports integration with both Origin Pull CDNs (out-of-the-box) and Push CDNs (by using File Conveyor).
Note that development of version 2 of this module has already begun! Version 2 will also be ported to Drupal 7.
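
To make the Origin Pull case concrete: an Origin Pull CDN fetches a file from the origin server the first time it is requested, so integration mostly comes down to rewriting file URLs to point at the CDN. The sketch below is only an illustration of that idea, not the module's actual code; the host name `cdn.example.com`, the extension list and the function name `rewrite_file_url` are all assumptions.

```python
from urllib.parse import urljoin

# Hypothetical CDN host (an assumption for this sketch). An Origin Pull CDN
# retrieves the file from the origin server itself on the first request.
CDN_BASE_URL = "http://cdn.example.com/"

# File extensions treated as static files here (also an assumption).
STATIC_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".gif")


def rewrite_file_url(path: str, cdn_base_url: str = CDN_BASE_URL) -> str:
    """Return the CDN URL for static files, and the unchanged path otherwise."""
    if path.lower().endswith(STATIC_EXTENSIONS):
        return urljoin(cdn_base_url, path.lstrip("/"))
    return path
```

With these assumptions, `rewrite_file_url("/sites/default/files/logo.png")` points at the CDN host, while a page path such as `/node/1` is left alone.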

Previously in this series:

15 February, 2010

In this extensive article, I explain the architecture of the “File Conveyor” daemon that I wrote to detect files immediately (through the file system event monitors on each OS, i.e. inotify on Linux), process them (e.g. recompress images, compress CSS/JS files, transcode videos …) and finally, sync them (FTP, Amazon S3, Amazon CloudFront and Rackspace CloudFiles are supported).
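
The detect, process and sync stages described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not File Conveyor's actual code: detection is done by comparing modification-time snapshots instead of using inotify, the only processor is a toy one that strips blank lines, and "syncing" is a local copy standing in for FTP or Amazon S3. All function names are hypothetical.

```python
import os
import shutil
from typing import Dict, List


def snapshot(root: str) -> Dict[str, float]:
    """Map every file path under root to its modification time."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            state[path] = os.path.getmtime(path)
    return state


def detect_changes(old: Dict[str, float], new: Dict[str, float]) -> List[str]:
    """Return the paths that are new or modified since the old snapshot."""
    return sorted(path for path, mtime in new.items() if old.get(path) != mtime)


def strip_blank_lines(path: str, work_dir: str) -> str:
    """Toy processor: write a copy without blank lines into work_dir.

    Files with the same base name would overwrite each other here; a real
    daemon would preserve the directory structure.
    """
    out_path = os.path.join(work_dir, os.path.basename(path))
    with open(path, encoding="utf-8") as src, open(out_path, "w", encoding="utf-8") as dst:
        dst.writelines(line for line in src if line.strip())
    return out_path


def sync_to_directory(path: str, destination: str) -> str:
    """Stand-in for a transporter: copy the processed file to destination."""
    os.makedirs(destination, exist_ok=True)
    return shutil.copy2(path, destination)


def run_once(root: str, work_dir: str, destination: str,
             old_state: Dict[str, float]) -> Dict[str, float]:
    """Detect changed files, process each one, sync it, return the new state."""
    new_state = snapshot(root)
    for path in detect_changes(old_state, new_state):
        processed = strip_blank_lines(path, work_dir)
        sync_to_directory(processed, destination)
    return new_state
```

Calling `run_once` repeatedly, each time passing in the state returned by the previous call, only touches files that changed in between. A real daemon would react to file system events rather than rescan the whole tree.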

Previously in this series:


So now that we have the tools to accurately (or at least representatively) measure the effects of using a CDN, we still have to start using a CDN. Next, we will examine how a web site can take advantage of a CDN.

3 February, 2010

This weekend on Sunday, February 7, we’ll have a full day of Drupal talks at the 10th edition of FOSDEM, Europe’s biggest, free-est and open-est software conference.

FOSDEM is a free and non-commercial event organized by the community, for the community. Its goal is to provide Free and Open Source developers a place to meet. The Drupal project was granted a developer room at FOSDEM to do exactly that: to share knowledge about Drupal.

The presentation schedule for the Drupal devroom features interesting speakers such as Robert Douglass, Károly Négyesi, Roel de Meester and Kristof van Tomme, and even more interesting subjects such as mobile device design, AHAH, eID and Views 3. Everyone is invited to attend the presentations.

29 August, 2009

I will be presenting together with Konstantin Käfer on Front End Performance. To be more exact, he will be talking about Front End Performance in general, and I will be talking about a subdomain of that: CDN integration.
Our sessions were merged because they overlapped to some extent — so now there’s just one supercharged session instead! It’s scheduled for Thursday (3 September), at 9 AM, in the La Reserre (translated: coal-shed) room.

Specifically, I will be talking about the work I’ve been doing as part of my bachelor thesis. Integrating Drupal with a CDN was quite painful previously, but by using the CDN integration module, you can choose either:

26 August, 2009

In this very brief article, I highlight the key properties of CDNs: what differentiates them and which technical implications you should keep in mind.


A content delivery network (CDN) is a collection of web servers distributed across multiple locations to deliver content more efficiently to users. The server selected for delivering content to a specific user is typically based on a measure of network proximity.
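
As a rough illustration of proximity-based server selection, the sketch below picks the server with the smallest great-circle distance to the user. This is an assumption-laden simplification: real CDNs usually measure network proximity (for example through DNS or latency), which geographic distance only approximates. The server names and coordinates are made up for the example.

```python
import math
from typing import Dict, Tuple

Location = Tuple[float, float]  # (latitude, longitude) in degrees

# Hypothetical edge servers and their approximate coordinates.
EDGE_SERVERS: Dict[str, Location] = {
    "brussels": (50.85, 4.35),
    "new-york": (40.71, -74.01),
    "tokyo": (35.68, 139.69),
}


def haversine_km(a: Location, b: Location) -> float:
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_server(user: Location, servers: Dict[str, Location] = EDGE_SERVERS) -> str:
    """Return the name of the server geographically closest to the user."""
    return min(servers, key=lambda name: haversine_km(user, servers[name]))
```

Under these assumptions, a visitor in Paris (48.86, 2.35) would be served by the "brussels" server.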

It is extremely hard to decide which CDN to use. In fact, by just looking at a CDN’s performance, it is close to impossible (see “Content Owners Struggling To Compare One CDN To Another” and “How Is CDNs Network Performance For Streaming Measured?”)!

24 August, 2009

I’ve been so caught up in work, and in reducing the amount of work (by lowering the number of projects I’m involved in), that I had not yet posted my results.

I finished my bachelor degree on July 7, 2009, with honors! (It’s actually honors over the entire course of the bachelor degree: it is calculated over all three years.) Most importantly though, I received an extremely high score for my bachelor thesis: 19/20! It’s the highest score possible (a perfect score of 20/20 is never given) and was the highest of my year. My bachelor thesis was considered to be at the level of a master thesis! (And for a master thesis, you get twice as much time to write it.)

22 May, 2009

Finally, my bachelor thesis has come to an end! I now have a very strong feeling of relief (because I managed to finish it in time!) and accomplishment (because it wasn’t always trivial to see the light at the end…). Now I can start studying for my upcoming exams, of which there are fortunately only two!

For those who don’t know yet, there are basically three big components:

  1. Drupal Episodes module
  2. the daemon, which performs the discovery, processing and syncing of files (it still doesn’t have a proper name — your suggestions are welcome!)
  3. Drupal CDN integration module

For more information, I’d like to refer you to the bachelor thesis text draft that I’ve attached to this blog post and possibly even to the blog post in which I announced what my bachelor thesis would be about.

18 October, 2008

I’ve alluded to it before, but now it’s also been officially approved: I’ll be doing my bachelor thesis on Drupal! I will focus on integrating Drupal with CDNs. Yay! :)

Don’t know what a CDN is? It’s short for Content Delivery Network: a network of (static file or streaming media) servers that are located around the globe. These servers all mirror each other’s files. When a user requests a certain file from the CDN, the server that is closest to the user will serve the file.
By using a CDN to serve the static components on your web site (CSS, JS, images, fonts), your web site will load much faster: the latency will be lower and the throughput will be greater.