MongoDB Performance Analysis: Bare Metal v. Virtual

December 20, 2012

Developers can be cynical. When "the next great thing in technology" is announced, I usually wait to see how it performs before I get too excited about it ... Show me how that "next great thing" compares apples-to-apples with the competition, and you'll get my attention. With the launch of MongoDB at SoftLayer, I'd guess a lot of developers outside of SoftLayer and 10gen have the same "wait and see" attitude about the new platform, so I put our new MongoDB engineered servers to the test.

When I shared MongoDB architectural best practices, I referenced a few of the significant optimizations our team worked with 10gen to incorporate into our engineered servers (cheat sheet). To illustrate the impact of these changes in MongoDB performance, we ran 10gen's recommended benchmarking harness (freely available for download and testing of your own environment) on our three tiers of engineered servers alongside equivalent shared virtual environments commonly deployed by the MongoDB community. We've made a pretty big deal about the performance impact of running MongoDB on optimized bare metal infrastructure, so it's time to put our money where our mouth is.

The Testing Environment

For each of the available SoftLayer MongoDB engineered servers, data sets of 512KB documents were preloaded onto single MongoDB instances. Data set sizes were varied relative to available memory so that we could test working sets both larger (2X) and smaller than RAM. Each test also altered the data set frequently enough during the run to prevent all of the data from being cached in memory.
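
To make the preload step concrete, here is a minimal pymongo sketch of how a data set of ~0.5MB documents might be seeded; the host, collection name, document layout, and target size are illustrative assumptions rather than the exact loader we used:

    # Hypothetical preload sketch (not the exact loader used in these tests):
    # insert ~0.5MB documents until the collection reaches a target size
    # chosen relative to available memory (e.g. 2X an 8GB memory footprint).
    from pymongo import MongoClient

    DOC_BYTES = 512 * 1024                # ~0.5MB of padding per document
    TARGET_BYTES = 16 * 1024 ** 3         # assumption: 2X an 8GB RAM footprint
    PAYLOAD = "x" * DOC_BYTES

    client = MongoClient("mongodb://mongo-host:27017")   # hypothetical host
    coll = client.benchdb.docs

    doc_id = 0
    while doc_id * DOC_BYTES < TARGET_BYTES:
        batch = [{"_id": doc_id + i, "payload": PAYLOAD} for i in range(100)]
        coll.insert_many(batch)
        doc_id += len(batch)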

Once the data sets were created, JMeter server instances with 4 cores and 16GB of RAM were used to drive 'benchrun' from the 10gen benchmarking harness. This diagram illustrates how we set up the testing environment:

[Diagram: MongoDB performance analysis testing environment]

These JMeter servers functioned as the clients generating traffic against the MongoDB instances. Each client generated random query and update requests at a ratio of six queries per update (the updates ensured that the data set could not be fully cached in memory, so the tests continued to exercise reads from disk). The tests were designed to place extreme load on the servers from an exponentially increasing number of clients until system resources became saturated, and we recorded the resulting performance of the MongoDB application.
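
To illustrate the shape of that workload, here is a rough Python approximation of one ramp step: each worker issues six random reads per update, and the number of concurrent workers doubles at each step. The real tests drove 10gen's benchrun harness from JMeter, so treat this only as a sketch of the traffic pattern, with the host, collection, document count, and step duration all assumed for illustration:

    # Rough approximation of the 6:1 query-to-update workload; the actual tests
    # used 10gen's benchrun harness driven from JMeter, not this script.
    import random
    import threading
    from pymongo import MongoClient

    DOC_COUNT = 16000        # assumption: number of preloaded 0.5MB documents
    QUERIES_PER_UPDATE = 6   # six random queries for every update

    def client_worker(stop_event):
        coll = MongoClient("mongodb://mongo-host:27017").benchdb.docs
        while not stop_event.is_set():
            for _ in range(QUERIES_PER_UPDATE):
                coll.find_one({"_id": random.randrange(DOC_COUNT)})
            # updates keep the data set churning so it cannot fully cache in RAM
            coll.update_one({"_id": random.randrange(DOC_COUNT)},
                            {"$set": {"touched": random.random()}})

    # exponentially increase concurrency: 1, 2, 4, 8, 16, 32 concurrent clients
    for clients in (2 ** n for n in range(6)):
        stop = threading.Event()
        workers = [threading.Thread(target=client_worker, args=(stop,))
                   for _ in range(clients)]
        for w in workers:
            w.start()
        stop.wait(60)        # assumption: run each concurrency step for 60 seconds
        stop.set()           # stop this step's workers before scaling up
        for w in workers:
            w.join()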

At the Medium (MD) and Large (LG) engineered server tiers, performance tests were run separately for servers using 15K SAS data mounts and servers using SSD data mounts. If you missed the post comparing IOPS statistics between the different engineered server drive configurations, be sure to check it out.

Test Case 1: Small MongoDB Engineered Servers vs Shared Virtual Instance

Servers

Small (SM) MongoDB Engineered Server
  • Single 4-core Intel 1270 CPU
  • 64-bit CentOS
  • 8GB RAM
  • 2 x 500GB SATA II - RAID1
  • 1Gb Network

Virtual Provider Instance
  • 4 Virtual Compute Units
  • 64-bit CentOS
  • 7.5GB RAM
  • 2 x 500GB Network Storage - RAID1
  • 1Gb Network

Tests Performed

  • Small Data Set (8GB of 0.5MB documents)
  • 200 iterations of 6:1 query-to-update operations
  • Concurrent client connections exponentially increased from 1 to 32
  • Test duration spanned 48 hours
[Charts: Average and Peak Read Operations per Second by Concurrent Client; Average and Peak Write Operations per Second by Concurrent Client]

Test Case 2: Medium MongoDB Engineered Servers vs Shared Virtual Instance

Servers (15K SAS Data Mount Comparison)

Medium (MD) MongoDB Engineered Server
  • Dual 6-core Intel 5670 CPUs
  • 64-bit CentOS
  • 36GB RAM
  • 2 x 64GB SSD - RAID1 (Journal Mount)
  • 4 x 300GB 15K SAS - RAID10 (Data Mount)
  • 1Gb Network - Bonded

Virtual Provider Instance
  • 26 Virtual Compute Units
  • 64-bit CentOS
  • 30GB RAM
  • 2 x 64GB Network Storage - RAID1 (Journal Mount)
  • 4 x 300GB Network Storage - RAID10 (Data Mount)
  • 1Gb Network

Tests Performed

  • Small Data Set (32GB of 0.5MB documents)
  • 200 iterations of 6:1 query-to-update operations
  • Concurrent client connections exponentially increased from 1 to 128
  • Test duration spanned 48 hours
[Charts: Average and Peak Read Operations per Second by Concurrent Client; Average and Peak Write Operations per Second by Concurrent Client]

Servers (SSD Data Mount Comparison)

Medium (MD) MongoDB Engineered Server
  • Dual 6-core Intel 5670 CPUs
  • 64-bit CentOS
  • 36GB RAM
  • 2 x 64GB SSD - RAID1 (Journal Mount)
  • 4 x 400GB SSD - RAID10 (Data Mount)
  • 1Gb Network - Bonded

Virtual Provider Instance
  • 26 Virtual Compute Units
  • 64-bit CentOS
  • 30GB RAM
  • 2 x 64GB Network Storage - RAID1 (Journal Mount)
  • 4 x 300GB Network Storage - RAID10 (Data Mount)
  • 1Gb Network

Tests Performed

  • Small Data Set (32GB of 0.5MB documents)
  • 200 iterations of 6:1 query-to-update operations
  • Concurrent client connections exponentially increased from 1 to 128
  • Test duration spanned 48 hours
[Charts: Average and Peak Read Operations per Second by Concurrent Client; Average and Peak Write Operations per Second by Concurrent Client]

Test Case 3: Large MongoDB Engineered Servers vs Shared Virtual Instance

Servers (15K SAS Data Mount Comparison)

Large (LG) MongoDB Engineered Server
  • Dual 8-core Intel E5-2620 CPUs
  • 64-bit CentOS
  • 128GB RAM
  • 2 x 64GB SSD - RAID1 (Journal Mount)
  • 6 x 600GB 15K SAS - RAID10 (Data Mount)
  • 1Gb Network - Bonded

Virtual Provider Instance
  • 26 Virtual Compute Units
  • 64-bit CentOS
  • 64GB RAM (maximum available from this provider)
  • 2 x 64GB Network Storage - RAID1 (Journal Mount)
  • 6 x 600GB Network Storage - RAID10 (Data Mount)
  • 1Gb Network

Tests Performed

  • Small Data Set (64GB of 0.5MB documents)
  • 200 iterations of 6:1 query-to-update operations
  • Concurrent client connections exponentially increased from 1 to 128
  • Test duration spanned 48 hours
[Charts: Average and Peak Read Operations per Second by Concurrent Client; Average and Peak Write Operations per Second by Concurrent Client]

Servers (SSD Data Mount Comparison)

Large (LG) MongoDB Engineered Server
  • Dual 8-core Intel E5-2620 CPUs
  • 64-bit CentOS
  • 128GB RAM
  • 2 x 64GB SSD - RAID1 (Journal Mount)
  • 6 x 400GB SSD - RAID10 (Data Mount)
  • 1Gb Network - Bonded

Virtual Provider Instance
  • 26 Virtual Compute Units
  • 64-bit CentOS
  • 64GB RAM (maximum available from this provider)
  • 2 x 64GB Network Storage - RAID1 (Journal Mount)
  • 6 x 600GB Network Storage - RAID10 (Data Mount)
  • 1Gb Network

Tests Performed

  • Small Data Set (64GB of 0.5MB documents)
  • 200 iterations of 6:1 query-to-update operations
  • Concurrent client connections exponentially increased from 1 to 128
  • Test duration spanned 48 hours
[Charts: Average and Peak Read Operations per Second by Concurrent Client; Average and Peak Write Operations per Second by Concurrent Client]

Impressions from Performance Testing

The results speak for themselves. Running a MongoDB big data solution in a shared virtual environment has significant drawbacks compared to running MongoDB on a single-tenant bare metal offering. Disk I/O is by far the most limiting resource for MongoDB, and relying on shared network-attached storage (with much lower disk I/O) makes this limitation very apparent. Beyond the average and peak statistics above, performance varied much more significantly in the virtual instance environment, so it is not as consistent and predictable as bare metal.

Highlights:

  • When a working data set is smaller than available memory, query performance increases.
  • The number of clients performing queries has an impact on query performance, because more concurrent clients cause data to be actively cached at a faster rate.
  • Adding a separate journal mount volume significantly improves performance. Because the Small (SM) engineered server does not include a secondary mount for journaling, whenever MongoDB journaled, the associated disk I/O was disruptive to the query and update operations on the data mount.
  • The best deployments in terms of operations per second, stability, and control were the configurations with a RAID10 SSD data mount and a RAID1 SSD journal mount. These configurations are available in both our Medium and Large offerings, and I'd highly recommend them.

-Harold

Comments

December 20th, 2012 at 9:47am

Thanks for the tests.

My first question would be: what about non-virtual with shared network-attached storage?

How does that perform? That would show what hit you take from virtualisation, and whether the performance variance comes from the virtualisation or from the shared network-attached storage.

What if you add an SSD as a cache at the host that does the virtualisation (think flashcache or bcache)? Does that solve a large part of the problem?

December 20th, 2012 at 10:33am

Good questions, Lennie!

The largest part of the performance hit comes from the network hop to the network-attached storage. I have done some personal testing on our CCIs with local storage, as well as tests where the data was completely cached in memory on CCIs with remote storage, and in both cases MongoDB performed well. To be fair, I didn't do a local storage test on the other vendor (it wasn't available), but our own internal testing indicates that the majority of the performance hit comes from I/O to the storage. In addition to being a network hop away, most network-attached storage solutions can't offer QoS per mount, which makes a shared environment with network storage not only less performant in most cases but also unpredictable in how much its performance deviates.

The compute side of the equation really comes into play with data sets smaller than available memory, where MongoDB can cache the entire working set. When that happens, the bus speed and clock speed of the compute resources become the limiting factor rather than storage I/O. In those cases, if your provider can clearly tell you what kind of CPU architecture you are running on, you can make a sound decision about whether the platform will meet your needs. Raw compute power (cores and architecture) really only matters when you are doing map-reduce aggregation operations; those seem to be the only operations that start to utilize CPU more than the other resources.

On the flash cache, I would love to find a provider that offered such an option. You could always "roll your own" and just have a locally mounted SSD available. We haven't tested any flash cache setups ourselves yet, but in theory it would alleviate some of the I/O issues if your working set were smaller than the cache. If I had that option, though, I would just mount the local SSD as the data drive itself and be done with it. In certain cases where your total data set is large but your working set is smaller than the cache SSD, it might be worth trying a flash cache setup to see if it helps. It would be fun to test regardless!

December 21st, 2012 at 5:39am

In every case the data set is too large to fit into memory on the virtual provider instance, but there is more than enough RAM on the SoftLayer server. Wouldn't this skew the results by causing page faults on the virtualised instance where there would be none on the SoftLayer instance?

December 21st, 2012 at 7:57am

I work at a provider, and we are thinking of deploying Ceph, maybe with a local cache on the virtualisation hosts.

You can obviously also create separate groups in Ceph for different types of devices, like SSD-backed and HDD-backed groups.

Ceph basically has a similar setup to Lustre (possibly a superior design), and Lustre is in use for a number of top500 systems, so at least in theory it should scale pretty much as far as your budget can take you. As Ceph is fairly new, I'm sure there are still some limits, of course.

December 21st, 2012 at 10:09am

@David, the virtual instance provider did not offer instances that were an exact match for our offerings, so we did our best to find the closest match among the competitor's offerings (in some cases they just didn't have enough RAM). In reality, since blended updates, inserts, and queries occur during the test, there is very little opportunity for MongoDB to cache the records being queried, because the data set is changing rapidly. This was intentional, to help prevent MongoDB from running exclusively in memory.

Watching top, iostat, and mongostat during these tests, you could observe that the limiting factor on both platforms ended up being storage I/O. If the tests had been query-only, you would be 100% correct that the results would have been heavily skewed and the data fairly worthless. Given the overall data set sizes and the fact that records were being updated while they were queried, the 0.5GB and 2GB differences in memory did not have a significant impact on these tests. Even the larger difference in the final comparison (again, the virtual provider just didn't have a matching offering) doesn't indicate that RAM had much of an impact; if it had, it would have shown up as a large skew in the query results alone, which doesn't appear to be the case in the numbers. Once again, that's because care was taken to keep MongoDB from caching too much data into memory.

I hope that helps; it was a little challenging to line up the two platforms.

August 13th, 2013 at 8:19am

Hello,

The fact that the network storage technology is not specified makes this report a little weak.

It makes a real difference whether the storage array uses iSCSI or Fibre Channel connectivity, and whether the drives are SATA or SAS (considering that common SATA drives are 7200 RPM while SAS drives are 10K-15K RPM). The array's cache and its age also make a difference in terms of performance.

The main thing I can take from your report is how important it is not to spend a lot of money on drives if you have already hit the controller's limit.

Thanks,
Ruben

August 13th, 2013 at 9:36am

The technology behind "Network Storage" is not specified because the virtual provider that supplied it doesn't publish any detail about the architecture behind it. That's part of the problem: public cloud providers like this one don't have to explain how the network storage is attached or what hardware supports it, so they don't. Beyond the raw performance numbers, the consistency of results in a public cloud environment also varies significantly (which is clear in our Riak Performance Analysis).
