Comparing MongoDB write-performance on CentOS, FreeBSD and Ubuntu

February 13, 2012 12:20 pm    Posted by Viktor Petersson    Comments (7)

Recently I wrote a post titled ‘Notes on MongoDB, GridFS, sharding and deploying in the cloud,’ in which I talked about various aspects of running MongoDB and how to scale it. One thing we didn’t really take into consideration was whether MongoDB performed differently on different operating systems. I naively assumed that it would perform relatively similarly. That was a very incorrect assumption. Here are my findings from testing write performance.

As it turns out, MongoDB performs very differently on CentOS 6.2, FreeBSD 9.0 and Ubuntu 10.04, at least when virtualized. I set up the nodes as similarly as possible: they all had a 2GHz CPU, 2GB of RAM and used VirtIO for both disk and network. All nodes also ran MongoDB 2.0.2.

To test the performance, I set up a FreeBSD 9.0 machine (with the same specifications) to act as the client. I then created a 5GB file with ‘dd’ and copied it into MongoDB on the various nodes using ‘mongofiles.’ I also made sure to wipe MongoDB’s database before starting to ensure similar conditions.

For FreeBSD, I installed MongoDB from ports; for CentOS and Ubuntu, I used 10gen’s MongoDB binaries. The data was copied over a private network interface. I copied the data five times to each server (“mongofiles -h MyHost -l file put fileN”) and recorded each run with the ‘time’ command. The numbers below are simply (5120 MB)/(average real time in seconds).
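To make the procedure concrete, here is a rough shell sketch of the steps described above. The host name is a placeholder, the dd/mongofiles commands are shown as comments so the snippet runs on its own, and the ‘real’ times in the log are illustrative values I made up (chosen so the math lands near the FreeBSD result), not the actual measurements:

```shell
# 1) Create the 5GB test file (commented out here so the sketch runs
#    without 5GB of disk or a MongoDB host):
#      dd if=/dev/zero of=testfile bs=1M count=5120
# 2) Copy it into GridFS five times, logging POSIX-format timings
#    ("real <seconds>") for each run:
#      for i in 1 2 3 4 5; do
#          /usr/bin/time -p mongofiles -h MyHost -l testfile put "file$i" 2>> times.log
#      done

# 3) Average the "real" times and convert to MB/sec (5120 MB per run).
#    These times are placeholders to illustrate the arithmetic.
cat > times.log <<'EOF'
real 356.51
real 356.51
real 356.51
real 356.51
real 356.51
EOF

awk '/^real/ { total += $2; runs++ }
     END { printf "%.2f MB/sec\n", 5120 / (total / runs) }' times.log
```

With these placeholder times the last command prints 14.36 MB/sec, i.e. 5120 MB divided by the average real time per run.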

The result was surprising to say the least.

MongoDB write-speed in MB/sec:

    FreeBSD 9.0:   14.36
    Ubuntu 10.04:  27.25
    CentOS 6.2:    21.19

Ubuntu is almost 2x as fast as FreeBSD! I didn’t see that one coming.

I’m not really sure why the difference is this great. One factor may be that CentOS and Ubuntu used 10gen’s own builds, which might include some optimizations. Another thing that surprised me was the difference between Ubuntu and CentOS, as they’re both using 10gen’s binaries.

It should also be mentioned that this test was performed in a public cloud, so load from other systems could have impacted the performance. That said, I ran a few tests a few days earlier and came up with similar numbers.

Update: I’ve been in touch with Ben Becker over at 10gen, and he said that they’re not doing anything special with their binaries. They’re built in a VM from code fetched from the GitHub repository without any changes. Hence, this difference is most likely explained by a less mature/optimized VirtIO driver in FreeBSD.

7 Responses to “Comparing MongoDB write-performance on CentOS, FreeBSD and Ubuntu”

  1. Great advice Viktor. We’ve luckily and successfully run MongoDB on Ubuntu since the start, but those numbers come as a bit of a surprise to me. Thanks for the insight! Keep it up!

    • Sreenath V says:

      We are evaluating MongoDB for use as a persistent cache server as well as a DB for some content.

      However, Ubuntu Server 11.x was ~1.5 times faster than CentOS 6.2 for both inserts and updates. Reads were pretty much the same on both Ubuntu and CentOS. The hardware was identical.

      Both machines ran ext4, with 8 GB of RAM and 4 CPUs at 2.9 GHz.

      We are still looking into the data block size and other parameters to see if anything is missing.

  2. Mitch Pirtle says:

    Also keep in mind that these numbers will diverge even more based on actual hardware. A friend of mine in NJ learned that the hard way with RAID accelerator cards and varying degrees of driver support. In the end his RAID card performed best with CentOS, and was totally unusable on my beloved Debian…

    Also note the different memory management facilities in the *BSD family, which I’m not sure mongod is really taking full advantage of yet.

    • vpetersson says:

      Mitch,

      RAID drivers aren’t really an issue in this case, since this is a virtualized environment. That would only be an issue for a bare-metal system.

      As far as memory management goes, that might very well be a problem. Ben Becker over at 10gen pointed out that there are many optimizations in the code that only apply to Linux and OS X. That said, I don’t know what these optimizations are.

  3. tom_m says:

    Is there any difference between GridFS and normal documents? This is just a single 5GB file into GridFS… but what about something more natural, like a bunch of inserts? Even a batchInsert. I’m thinking that you’d never hit this threshold unless you are using the mongofiles command with a large file. That doesn’t sound like a typical MongoDB use case. Do all commands like insert, batchInsert, mongoimport, mongofiles, etc. perform exactly the same?

    More specifically, if there is a difference it would be good to know, because I’m curious whether MongoDB would be able to keep up with the Twitter streaming API, as demonstrated by Eliot at MongoSV. I imagine that was the free 1% stream, so it wasn’t an issue in the demo… but reading this article makes me wonder about the feasibility of capturing data as fast as Twitter can send it out (at the 100% firehose).

    I have an application that imports a bunch of data. It’s too much to hold in memory, but I’m not using mongofiles. I use mongoimport. Speed is less of a concern for me because I wrote the data to a JSON file first, but this is good stuff to know! Thanks for the testing.

    • vpetersson says:

      Tom,

      For sure. This perhaps doesn’t reflect the typical usage of MongoDB, but it’s very close to the way we use it for YippieMove. We use GridFS with large files, and it’s pretty write-heavy.

      You’re right that it would be interesting to benchmark smaller inserts too, but I did these benchmarks for our own scenario and decided to publish the results. You’re also right that it’s too much data to hold in memory.

