<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>SoftLayer Blog &#187; big data</title>
	<atom:link href="http://blog.softlayer.com/tag/big-data/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.softlayer.com</link>
	<description>A Behind the Scenes Look at the Best Hosting Provider in the World</description>
	<lastBuildDate>Fri, 24 May 2013 18:19:59 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.4.1</generator>
		<item>
		<title>Big Data at SoftLayer: Riak</title>
		<link>http://blog.softlayer.com/2013/big-data-at-softlayer-riak/</link>
		<comments>http://blog.softlayer.com/2013/big-data-at-softlayer-riak/#comments</comments>
		<pubDate>Tue, 30 Apr 2013 15:15:35 +0000</pubDate>
		<dc:creator>Duke Skarda</dc:creator>
				<category><![CDATA[Cloud]]></category>
		<category><![CDATA[Executive Blog]]></category>
		<category><![CDATA[SoftLayer]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[announcement]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[availability]]></category>
		<category><![CDATA[Basho]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[cluster]]></category>
		<category><![CDATA[configure]]></category>
		<category><![CDATA[nodes]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[release]]></category>
		<category><![CDATA[Riak]]></category>
		<category><![CDATA[scale]]></category>

		<guid isPermaLink="false">http://blog.softlayer.com/?p=11226</guid>
		<description><![CDATA[Big data is only getting bigger. Late last year, SoftLayer teamed up with 10gen to launch a high-performance MongoDB solution, and since then, many of our customers have been clamoring for us to support other big data platforms in the same way. By automating the provisioning process of a complex big data environment on bare [...]]]></description>
			<content:encoded><![CDATA[<p>Big data is only getting bigger. Late last year, SoftLayer teamed up with 10gen to launch a <a href="http://www.softlayer.com/solutions/big-data/mongodb#utm_source=blog&#038;utm_medium=social&#038;utm_content=info&#038;utm_campaign=outreach">high-performance MongoDB solution</a>, and since then, many of our customers have been clamoring for us to support other big data platforms in the same way. By automating the provisioning process of a complex big data environment on bare metal infrastructure, we made life a lot easier for developers who demanded <a href="http://blog.softlayer.com/2012/mongodb-performance-analysis-bare-metal-v-virtual/">performance</a> and on-demand scalability for their big data applications, and it&#8217;s clear that our simple formula produced amazing results. As Marc mentioned when he started <a href="http://blog.softlayer.com/2012/breaking-down-big-data-database-models/">breaking down big data database models</a>, document-oriented databases like MongoDB are phenomenal for certain use-cases, and in other situations, a key-value store might be a better fit. With that in mind, we called up our friends at <a href="http://basho.com/">Basho</a> and started building a high-performance architecture specifically for <a href="http://basho.com/riak/">Riak</a> &#8230; And I&#8217;m excited to announce that we&#8217;re launching it today!</p>
<p>Riak is an open source, distributed database platform based on the principles enumerated in Amazon&#8217;s Dynamo paper. It uses a simple key/value model for object storage, and it was architected for high availability, fault tolerance, operational simplicity and scalability. A Riak cluster is composed of multiple nodes that are all connected, all communicating and sharing data automatically. If one node fails, the other nodes automatically take over the data that the failed node was storing and processing until that node is back up and running or a new node is added. See the diagram below for a simple illustration of how adding a node to a cluster works within Riak.</p>
<p><a href="http://cdn.softlayer.com/innerlayer/riak_nodes.png" target="_blank"><img class="centered" src="http://cdn.softlayer.com/innerlayer/riak_nodes_s.png" alt="Riak Nodes"/></a></p>
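<p>To make the idea in the diagram concrete, here is a minimal sketch of a consistent-hash ring in the spirit of the Dynamo design. It is not Riak&#8217;s actual code: the <code>HashRing</code> class, the node names, the key names and the count of 64 ring points per node are all assumptions made for illustration. It shows that when a node joins, only a fraction of the keys change owner, and every key that moves goes to the new node.</p>

```python
import hashlib
from bisect import bisect_right


def ring_position(label):
    """Map any string to a position on the hash ring (a large integer)."""
    return int(hashlib.sha1(label.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring. Hypothetical helper, not part of Riak."""

    def __init__(self, nodes, points_per_node=64):
        self.points_per_node = points_per_node
        self.points = []  # sorted list of (position, node name)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each node claims several points so that load spreads evenly.
        for i in range(self.points_per_node):
            self.points.append((ring_position(f"{node}#{i}"), node))
        self.points.sort()

    def owner(self, key):
        # The owner is the first point clockwise from the key's position,
        # wrapping around to the start of the ring when needed.
        positions = [position for position, _ in self.points]
        index = bisect_right(positions, ring_position(key)) % len(self.points)
        return self.points[index][1]


keys = [f"user:{n}" for n in range(1000)]
ring = HashRing(["node1", "node2", "node3"])
before = {key: ring.owner(key) for key in keys}

ring.add_node("node4")
after = {key: ring.owner(key) for key in keys}

moved = [key for key in keys if before[key] != after[key]]
print(len(moved), "of", len(keys), "keys changed owner")
print("all moved keys now live on node4:",
      all(after[key] == "node4" for key in moved))
```

<p>The keys that did not move stay exactly where they were, which is why a cluster built this way can grow or shrink without reshuffling the whole data set.</p>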
<p>We will support both the open source and the Enterprise versions of Riak. The open source version is a great place to start. It has all of the database functionality of Riak Enterprise, but it is limited to a single cluster. The Enterprise version supports replication between clusters across data centers, giving you lots of architectural options. You can use replication to build highly available, live-live failover applications. You can also use it to distribute your application&#8217;s data across regions, giving you a global platform that you can update anywhere in the world and know that those modifications will be available anywhere else. Riak Enterprise customers also receive 24×7 coverage, both from SoftLayer and Basho. This includes SoftLayer&#8217;s one-hour guaranteed response for Severity 1 hardware issues and unlimited support available via our secure web portal, email and phone.</p>
<p>The business use-case for this flexibility is that if you need to scale up or down, nodes can be easily added or taken down as your requirements change. You can opt for a single-data center environment with a few nodes or you can broaden your architecture to a multi-data center deployment with a 40-node cluster. While these capabilities are inherent in Riak, they can be complicated to build and configure, so we spent countless hours working with Basho to streamline Riak deployment on the SoftLayer platform. The fruit of that labor can be found in our <a href="https://www.softlayer.com/Sales/orderRiakClusters">Riak Solution Designer</a>:</p>
<p><a href="https://www.softlayer.com/Sales/orderRiakClusters" target="_blank"><img class="centered" src="http://cdn.softlayer.com/innerlayer/riak_solution_s.png" alt="Riak Solution Designer"/></a></p>
<p>The server configurations and packages in the Riak Solution Designer have been selected to deliver the performance, availability and stability that our customers expect from their bare metal and virtual cloud infrastructure at SoftLayer. With a few quick clicks, you can order a fully configured Riak environment, and it&#8217;ll be provisioned and online for you in two to four hours. And everything you order is on a month-to-month contract.</p>
<p>Thanks to the hard work done by the SoftLayer development group and Basho&#8217;s team, we&#8217;re proud to be the first in the marketplace to offer a turn-key Riak solution on bare metal infrastructure. You don&#8217;t need to sacrifice performance and agility for simplicity. </p>
<p>For more information, visit <a href="http://www.softlayer.com/riak">SoftLayer.com/Riak</a> or contact our sales team.</p>
<p>-Duke</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.softlayer.com/2013/big-data-at-softlayer-riak/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>MongoDB Performance Analysis: Bare Metal v. Virtual</title>
		<link>http://blog.softlayer.com/2012/mongodb-performance-analysis-bare-metal-v-virtual/</link>
		<comments>http://blog.softlayer.com/2012/mongodb-performance-analysis-bare-metal-v-virtual/#comments</comments>
		<pubDate>Thu, 20 Dec 2012 15:30:32 +0000</pubDate>
		<dc:creator>Harold Hannon</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[SoftLayer]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[10gen]]></category>
		<category><![CDATA[bare metal]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[configuration]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[deployment]]></category>
		<category><![CDATA[engineered servers]]></category>
		<category><![CDATA[IOPS]]></category>
		<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[platform]]></category>
		<category><![CDATA[testing]]></category>

		<guid isPermaLink="false">http://blog.softlayer.com/?p=10070</guid>
		<description><![CDATA[Developers can be cynical. When &#8220;the next great thing in technology&#8221; is announced, I usually wait to see how it performs before I get too excited about it &#8230; Show me how that &#8220;next great thing&#8221; compares apples-to-apples with the competition, and you&#8217;ll get my attention. With the launch of MongoDB at SoftLayer, I&#8217;d guess [...]]]></description>
			<content:encoded><![CDATA[<p>Developers can be cynical. When &#8220;the next great thing in technology&#8221; is announced, I usually wait to see how it performs before I get too excited about it &#8230; Show me how that &#8220;next great thing&#8221; compares apples-to-apples with the competition, and you&#8217;ll get my attention. With the launch of <a href="https://www.softlayer.com/solutions/big-data/mongodb">MongoDB</a> at SoftLayer, I&#8217;d guess a lot of developers outside of SoftLayer and 10gen have the same &#8220;wait and see&#8221; attitude about the new platform, so I put our new <a href="https://www.softlayer.com/solutions/big-data/mongodb/pricing">MongoDB engineered servers</a> to the test.</p>
<p>When I shared <a href="http://blog.softlayer.com/2012/mongodb-architectural-best-practices/">MongoDB architectural best practices</a>, I referenced a few of the significant optimizations our team worked with 10gen to incorporate into our engineered servers (<a href="http://knowledgelayer.softlayer.com/questions/585/Engineered+MongoDB+Installations">cheat sheet</a>). To illustrate the impact of these changes in MongoDB performance, we ran 10gen&#8217;s recommended <a href="http://www.mongodb.org/display/DOCS/JS+Benchmarking+Harness">benchmarking harness</a> (freely available for download and testing of your own environment) on our three tiers of engineered servers alongside equivalent shared virtual environments commonly deployed by the MongoDB community. We&#8217;ve made a pretty big deal about the performance impact of running MongoDB on optimized bare metal infrastructure, so it&#8217;s time to put our money where our mouth is.</p>
<h3>The Testing Environment</h3>
<p style="margin-top:5px; padding-top:0;">For each of the available SoftLayer MongoDB engineered servers, data sets of 512KB documents were preloaded onto single MongoDB instances. The data sets were sized relative to available memory, so that some were larger (2X) and some were smaller than the memory available. Each test also altered the data set frequently enough during the run to prevent the queries from caching all of the data in memory.</p>
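<p>As a rough illustration of that sizing, the sketch below works out how many 512KB documents each data set holds and how the data set compares to the RAM on each tier. The data set and RAM figures come from the test cases listed later in this post. The helper names are hypothetical, and treating 1KB as 1024 bytes is an assumption.</p>

```python
KB = 1024
GB = 1024 ** 3

DOC_SIZE_BYTES = 512 * KB  # document size used in the tests


def documents_needed(data_set_gb):
    """Number of 512KB documents needed to fill a data set of this size."""
    return (data_set_gb * GB) // DOC_SIZE_BYTES


def memory_ratio(data_set_gb, ram_gb):
    """Data set size divided by RAM. Above 1.0, the set cannot be fully cached."""
    return data_set_gb / ram_gb


# (tier, data set in GB, RAM in GB) as reported in the test cases
tiers = [("Small", 8, 8), ("Medium", 32, 36), ("Large", 64, 128)]

for tier, data_set_gb, ram_gb in tiers:
    print(tier,
          documents_needed(data_set_gb), "documents,",
          round(memory_ratio(data_set_gb, ram_gb), 2), "x RAM")
```

<p>A data set at twice the available memory, the 2X case mentioned above, would give a ratio of 2.0 and force a large share of reads to come from disk.</p>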
<p>Once the data sets were created, <a href="http://jmeter.apache.org/">JMeter</a> server instances with 4 cores and 16GB of RAM were used to drive &#8216;benchrun&#8217; from the 10gen benchmarking harness. This diagram illustrates how we set up the testing environment (click for a better look):</p>
<p><a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/mongodbtestsetup.png"><img class="centered" src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/mongodbtestsetup_s.png" alt="MongoDB Performance Analysis Setup"/></a></p>
<p>These JMeter servers function as the clients generating traffic on the MongoDB instances. Each client generated random query and update requests at a ratio of six queries per update. (The update requests ensured that the data could not be fully cached in memory, so the tests always exercised reads from disk.) These tests were designed to create an extreme load on the servers from an exponentially increasing number of clients until the system resources became saturated, and we recorded the resulting performance of the MongoDB application.</p>
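<p>The shape of that load can be sketched in a few lines. This is not the benchmarking harness itself: the function names are hypothetical, and reading &#8220;exponentially increasing&#8221; as doubling the client count at each step is an assumption. The sketch only shows the client ramp and the 6:1 query-to-update mix.</p>

```python
import random


def client_steps(max_clients):
    """Client counts that double each step: 1, 2, 4, ... up to max_clients."""
    return [2 ** exponent for exponent in range(max_clients.bit_length())]


def operation_mix(total_ops, queries_per_update=6, seed=0):
    """Shuffled list of 'query' and 'update' labels in a 6:1 ratio."""
    cycle = ["query"] * queries_per_update + ["update"]
    ops = [cycle[i % len(cycle)] for i in range(total_ops)]
    random.Random(seed).shuffle(ops)  # seeded so runs are repeatable
    return ops


print(client_steps(32))   # ramp used for the Small tier
print(client_steps(128))  # ramp used for the Medium and Large tiers

ops = operation_mix(70)
print(ops.count("query"), "queries,", ops.count("update"), "updates")
```

<p>Each entry in the ramp would be one stage of the test, with that many clients issuing the shuffled mix of operations at the same time.</p>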
<p>At the Medium (MD) and Large (LG) engineered server tiers, performance metrics were run separately for servers using 15K SAS hard drive data mounts and servers using SSD hard drive data mounts. If you missed the post <a href="http://blog.softlayer.com/2012/big-data-at-softlayer-the-importance-of-iops/">comparing the IOPS statistics between different engineered server hard drive configurations</a>, be sure to check it out. For a better view of the results in a given graph, click the image included in the results below to see a larger version.</p>
<style type="text/css">
  .perfan {width:305px; float:left; padding:5px; margin:5px; size:6px; font-weight:bold; text-align:center; color:#972f2c;}
  .perfan img {margin-left:auto; margin-right:auto;}
  .comparison {float:left; margin-left:10px; margin-right:10px;}
  h3{font-size:16px;color:#972F2c;}
 </style>
<h3>Test Case 1: Small MongoDB Engineered Servers vs Shared Virtual Instance</h3>
<p style="margin-top:5px; padding-top:0; margin-bottom:0;padding-bottom:0;"><strong>Servers</strong></p>
<div class="comparison">Small (SM) MongoDB Engineered Server<br />
Single 4-core Intel 1270 CPU<br />
64-bit CentOS<br />
8GB RAM<br />
2 x 500GB SATAII &#8211; RAID1<br />
1Gb Network</div>
<div class="comparison">Virtual Provider Instance<br />
4 Virtual Compute Units<br />
64-bit CentOS<br />
7.5GB RAM<br />
2 x 500GB Network Storage &#8211; RAID1<br />
1Gb Network</div>
<div style="clear:both; height:1px; margin:0; padding:0;">&nbsp;</div>
<p style="margin-bottom:0;padding-bottom:0;"><strong>Tests Performed</strong></p>
<div style="margin-left:10px;">Small Data Set (8GB of 0.5MB documents)<br />
200 iterations of 6:1 query-to-update operations<br />
Concurrent client connections exponentially increased from 1 to 32<br />
Test duration spanned 48 hours</div>
<div class="perfan">Average Read Operations per Second<br />
by Concurrent Client<br />
<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan1.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan1.jpg" alt="MongoDB Performance Analysis"/></a></div>
<div class="perfan">Peak Read Operations per Second<br />
by Concurrent Client<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan2.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan2.jpg" alt="MongoDB Performance Analysis"/></a></div>
<div class="perfan">Average Write Operations per Second<br />
by Concurrent Client<br />
<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan3.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan3.jpg" alt="MongoDB Performance Analysis"/></a></div>
<div class="perfan">Peak Write Operations per Second<br />
by Concurrent Client<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan4.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan4.jpg" alt="MongoDB Performance Analysis"/></a></div>
<p><span id="more-10070"></span></p>
<h3>Test Case 2: Medium MongoDB Engineered Servers vs Shared Virtual Instance</h3>
<p style="margin-top:5px; padding-top:0; margin-bottom:0;padding-bottom:0;"><strong>Servers (15K SAS Data Mount Comparison)</strong></p>
<div class="comparison">Medium (MD) MongoDB Engineered Server<br />
Dual 6-core Intel 5670 CPUs<br />
64-bit CentOS<br />
36GB RAM<br />
2 x 64GB SSD &#8211; RAID1 (Journal Mount)<br />
4 x <strong>300GB 15K SAS</strong> &#8211; RAID10 (Data Mount)<br />
1Gb Network &#8211; Bonded</div>
<div class="comparison">Virtual Provider Instance<br />
26 Virtual Compute Units<br />
64-bit CentOS<br />
30GB RAM<br />
2 x 64GB Network Storage &#8211; RAID1 (Journal Mount)<br />
4 x 300GB Network Storage &#8211; RAID10 (Data Mount)<br />
1Gb Network</div>
<div style="clear:both; height:1px; margin:0; padding:0;">&nbsp;</div>
<p style="margin-bottom:0;padding-bottom:0;"><strong>Tests Performed</strong></p>
<div style="margin-left:10px;">Small Data Set (32GB of 0.5MB documents)<br />
200 iterations of 6:1 query-to-update operations<br />
Concurrent client connections exponentially increased from 1 to 128<br />
Test duration spanned 48 hours</div>
<div class="perfan">Average Read Operations per Second<br />
by Concurrent Client<br />
<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan5.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan5.jpg" alt="MongoDB Performance Analysis"/></a></div>
<div class="perfan">Peak Read Operations per Second<br />
by Concurrent Client<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan6.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan6.jpg" alt="MongoDB Performance Analysis"/></a></div>
<div class="perfan">Average Write Operations per Second<br />
by Concurrent Client<br />
<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan7.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan7.jpg" alt="MongoDB Performance Analysis"/></a></div>
<div class="perfan">Peak Write Operations per Second<br />
by Concurrent Client<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan8.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan8.jpg" alt="MongoDB Performance Analysis"/></a></div>
<p style="margin-top:5px; padding-top:0; margin-bottom:0;padding-bottom:0;"><strong>Servers (SSD Data Mount Comparison)</strong></p>
<div class="comparison">Medium (MD) MongoDB Engineered Server<br />
Dual 6-core Intel 5670 CPUs<br />
64-bit CentOS<br />
36GB RAM<br />
2 x 64GB SSD &#8211; RAID1 (Journal Mount)<br />
4 x <strong>400GB SSD</strong> &#8211; RAID10 (Data Mount)<br />
1Gb Network &#8211; Bonded</div>
<div class="comparison">Virtual Provider Instance<br />
26 Virtual Compute Units<br />
64-bit CentOS<br />
30GB RAM<br />
2 x 64GB Network Storage &#8211; RAID1 (Journal Mount)<br />
4 x 300GB Network Storage &#8211; RAID10 (Data Mount)<br />
1Gb Network</div>
<div style="clear:both; height:1px; margin:0; padding:0;">&nbsp;</div>
<p style="margin-bottom:0;padding-bottom:0;"><strong>Tests Performed</strong></p>
<div style="margin-left:10px;">Small Data Set (32GB of 0.5MB documents)<br />
200 iterations of 6:1 query-to-update operations<br />
Concurrent client connections exponentially increased from 1 to 128<br />
Test duration spanned 48 hours</div>
<div class="perfan">Average Read Operations per Second<br />
by Concurrent Client<br />
<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan9.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan9.jpg" alt="MongoDB Performance Analysis"/></a></div>
<div class="perfan">Peak Read Operations per Second<br />
by Concurrent Client<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan10.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan10.jpg" alt="MongoDB Performance Analysis"/></a></div>
<div class="perfan">Average Write Operations per Second<br />
by Concurrent Client<br />
<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan11.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan11.jpg" alt="MongoDB Performance Analysis"/></a></div>
<div class="perfan">Peak Write Operations per Second<br />
by Concurrent Client<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan12.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan12.jpg" alt="MongoDB Performance Analysis"/></a></div>
<h3>Test Case 3: Large MongoDB Engineered Servers vs Shared Virtual Instance</h3>
<p style="margin-top:5px; padding-top:0; margin-bottom:0;padding-bottom:0;"><strong>Servers (15K SAS Data Mount Comparison)</strong></p>
<div class="comparison">Large (LG) MongoDB Engineered Server<br />
Dual 8-core Intel E5-2620 CPUs<br />
64-bit CentOS<br />
128GB RAM<br />
2 x 64GB SSD &#8211; RAID1 (Journal Mount)<br />
6 x <strong>600GB 15K SAS</strong> &#8211; RAID10 (Data Mount)<br />
1Gb Network &#8211; Bonded</div>
<div class="comparison">Virtual Provider Instance<br />
26 Virtual Compute Units<br />
64-bit CentOS<br />
64GB RAM (Maximum available on this provider)<br />
2 x 64GB Network Storage &#8211; RAID1 (Journal Mount)<br />
6 x 600GB Network Storage &#8211; RAID10 (Data Mount)<br />
1Gb Network</div>
<div style="clear:both; height:1px; margin:0; padding:0;">&nbsp;</div>
<p style="margin-bottom:0;padding-bottom:0;"><strong>Tests Performed</strong></p>
<div style="margin-left:10px;">Small Data Set (64GB of 0.5MB documents)<br />
200 iterations of 6:1 query-to-update operations<br />
Concurrent client connections exponentially increased from 1 to 128<br />
Test duration spanned 48 hours</div>
<div class="perfan">Average Read Operations per Second<br />
by Concurrent Client<br />
<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan13.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan13.jpg" alt="MongoDB Performance Analysis"/></a></div>
<div class="perfan">Peak Read Operations per Second<br />
by Concurrent Client<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan14.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan14.jpg" alt="MongoDB Performance Analysis"/></a></div>
<div class="perfan">Average Write Operations per Second<br />
by Concurrent Client<br />
<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan15.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan15.jpg" alt="MongoDB Performance Analysis"/></a></div>
<div class="perfan">Peak Write Operations per Second<br />
by Concurrent Client<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan16.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan16.jpg" alt="MongoDB Performance Analysis"/></a></div>
<p style="margin-top:5px; padding-top:0; margin-bottom:0;padding-bottom:0;"><strong>Servers (SSD Data Mount Comparison)</strong></p>
<div class="comparison">Large (LG) MongoDB Engineered Server<br />
Dual 8-core Intel E5-2620 CPUs<br />
64-bit CentOS<br />
128GB RAM<br />
2 x 64GB SSD &#8211; RAID1 (Journal Mount)<br />
6 x <strong>400GB SSD</strong> &#8211; RAID10 (Data Mount)<br />
1Gb Network &#8211; Bonded</div>
<div class="comparison">Virtual Provider Instance<br />
26 Virtual Compute Units<br />
64-bit CentOS<br />
64GB RAM (Maximum available on this provider)<br />
2 x 64GB Network Storage &#8211; RAID1 (Journal Mount)<br />
6 x 600GB Network Storage &#8211; RAID10 (Data Mount)<br />
1Gb Network</div>
<div style="clear:both; height:1px; margin:0; padding:0;">&nbsp;</div>
<p style="margin-bottom:0;padding-bottom:0;"><strong>Tests Performed</strong></p>
<div style="margin-left:10px;">Small Data Set (64GB of 0.5MB documents)<br />
200 iterations of 6:1 query-to-update operations<br />
Concurrent client connections exponentially increased from 1 to 128<br />
Test duration spanned over 48 hours</div>
<div class="perfan">Average Read Operations per Second<br />
by Concurrent Client<br />
<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan17.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan17.jpg" alt="MongoDB Performance Analysis"/></a></div>
<div class="perfan">Peak Read Operations per Second<br />
by Concurrent Client<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan18.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan18.jpg" alt="MongoDB Performance Analysis"/></a></div>
<div class="perfan">Average Write Operations per Second<br />
by Concurrent Client<br />
<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan19.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan19.jpg" alt="MongoDB Performance Analysis"/></a></div>
<div class="perfan">Peak Write Operations per Second<br />
by Concurrent Client<a href="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan20.png"><img src="http://cdn.softlayer.com/innerlayer/mongodbperformanceanalysis/perfan20.jpg" alt="MongoDB Performance Analysis"/></a></div>
<h3>Impressions from Performance Testing</h3>
<p style="margin-top:5px; padding-top:0;">The results speak for themselves. Running a MongoDB <a href="https://www.softlayer.com/bigdata">big data</a> solution on a shared virtual environment has significant drawbacks when compared to running MongoDB on a single-tenant bare metal offering. Disk I/O is by far the most limiting resource for MongoDB, and relying on shared network-attached storage (with much lower disk I/O) makes this limitation very apparent. Beyond the average and peak statistics above, performance varied much more significantly in the virtual instance environment, so it&#8217;s not as consistent and predictable as bare metal.</p>
<p style="margin-bottom:0; padding-bottom:0;"><strong>Highlights:</strong></p>
<ul style="margin-top:5px; padding-top:0;">
<li>When a working data set is smaller than available memory, query performance increases.</li>
<li>The number of clients performing queries has an impact on query performance because more data is being actively cached at a rapid rate.</li>
<li>The addition of a separate Journal Mount volume significantly improves performance. Because the Small (SM) engineered server does not include a secondary mount for Journals, whenever MongoDB began to journal, the disk I/O associated with journalling was disruptive to the query and update operations performed on the Data Mount.</li>
<li>The best deployments in terms of operations per second, stability and control were the configurations with a RAID10 SSD Data Mount and a RAID1 SSD Journal Mount. These configurations are available in both our Medium and Large offerings, and I&#8217;d highly recommend them.</li>
</ul>
<p>-Harold</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.softlayer.com/2012/mongodb-performance-analysis-bare-metal-v-virtual/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Big Data at SoftLayer: The Importance of IOPS</title>
		<link>http://blog.softlayer.com/2012/big-data-at-softlayer-the-importance-of-iops/</link>
		<comments>http://blog.softlayer.com/2012/big-data-at-softlayer-the-importance-of-iops/#comments</comments>
		<pubDate>Mon, 17 Dec 2012 20:00:31 +0000</pubDate>
		<dc:creator>Kelly Hurst</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[SoftLayer]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[comparison]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[data set]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[disk I/O]]></category>
		<category><![CDATA[engineered servers]]></category>
		<category><![CDATA[hard drives]]></category>
		<category><![CDATA[input]]></category>
		<category><![CDATA[IOPS]]></category>
		<category><![CDATA[journal]]></category>
		<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[operations]]></category>
		<category><![CDATA[output]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[RAID]]></category>
		<category><![CDATA[rate]]></category>
		<category><![CDATA[SAS]]></category>
		<category><![CDATA[speed]]></category>
		<category><![CDATA[SSD]]></category>

		<guid isPermaLink="false">http://blog.softlayer.com/?p=10220</guid>
		<description><![CDATA[The jet flow gates in the Hoover Dam can release up to 73,000 cubic feet &#8212; the equivalent of 546,040 gallons &#8212; of water per second at 120 miles per hour. Imagine replacing those jet flow gates with a single garden hose that pushes 25 gallons per minute (or 0.42 gallons per second). Things would [...]]]></description>
			<content:encoded><![CDATA[<p>The jet flow gates in the Hoover Dam can release up to 73,000 cubic feet &mdash; the equivalent of 546,040 gallons &mdash; of water per second at 120 miles per hour. Imagine replacing those jet flow gates with a single garden hose that pushes 25 gallons per minute (or 0.42 gallons per second). Things would get ugly pretty quickly. In the same way, a massive &#8220;big data&#8221; infrastructure can be crippled by insufficient IOPS.</p>
<p><a href="http://en.wikipedia.org/wiki/IOPS">IOPS</a> (Input/Output Operations Per Second) measure computer storage in terms of the number of read and write operations it can perform in a second. IOPS are a primary concern for database environments where content is being written and queried constantly, and when we take those database environments to the extreme (big data), the importance of IOPS can&#8217;t be overstated: If you aren&#8217;t able to perform database reads and writes quickly in a big data environment, it doesn&#8217;t matter how many gigabytes, terabytes or petabytes you have in your database &#8230; You won&#8217;t be able to efficiently access, add to or modify your data set.</p>
<p>As we worked with <a href="http://www.10gen.com/">10gen</a> to create, test and tweak SoftLayer&#8217;s <a href="http://www.softlayer.com/solutions/big-data/mongodb/pricing">MongoDB engineered servers</a>, our primary focus centered on performance. Since the performance of massively scalable databases is dictated by the read and write operations to that database&#8217;s data set, we invested significant resources into maximizing the IOPS for each engineered server &#8230; And that involved a lot more than just swapping hard drives out of servers until we found a configuration that worked best. Yes, &#8220;Disk I/O&#8221; (the number of input/output operations a given disk can perform) plays a significant role in big data IOPS, but many other factors limit big data performance. How is performance impacted by network-attached storage? At what point will a given CPU become a bottleneck? How much RAM should be included in a base configuration to accommodate the load we expect our users to put on each tier of server? Are there operating system changes that can optimize the performance of a platform like MongoDB?</p>
<p>The resulting engineered servers are a testament to the blood, sweat and tears that were shed in the name of creating a reliable, high-performance big data environment. And I can prove it.</p>
<p>Most shared virtual instances (the scalable infrastructure many users employ for big data) use network-attached storage for their platform&#8217;s storage. When data has to be queried over a network connection (rather than from a local disk), you introduce latency and more &#8220;moving parts&#8221; that have to work together. Disk I/O might be amazing on the enterprise SAN where your data lives, but because that data is not stored on-server with your processor or memory resources, performance can sporadically go from &#8220;Amazing&#8221; to &#8220;I Hate My Life&#8221; depending on network traffic. When I tested the IOPS for network-attached storage from a large competitor&#8217;s virtual instances, I saw an average of around 400 IOPS per mount. It&#8217;s difficult to say whether that&#8217;s &#8220;not good enough&#8221; because every application will have different needs in terms of concurrent reads and writes, but it certainly could be better. We performed some internal testing of the IOPS for the hard drive configurations in our Medium and Large MongoDB engineered servers to give you an apples-to-apples comparison.</p>
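<p>To put 400 IOPS in perspective, an IOPS figure can be turned into an approximate throughput by multiplying it by the block size. The sketch below assumes an 8k block size, matching the setting used in our own tests described later in this post. The block size behind the competitor&#8217;s number is not known, so treat the result as a ballpark only. The function names are hypothetical.</p>

```python
def throughput_mb_per_s(iops, block_size_kb=8):
    """Approximate throughput in MB/s: operations per second times block size."""
    return iops * block_size_kb / 1024


def seconds_to_move(data_mb, iops, block_size_kb=8):
    """Time to read or write data_mb megabytes at a sustained IOPS rate."""
    return data_mb / throughput_mb_per_s(iops, block_size_kb)


print(throughput_mb_per_s(400))    # 3.125 MB/s
print(seconds_to_move(1024, 400))  # 327.68 seconds for 1GB
```

<p>At that rate, randomly touching a single gigabyte of data takes more than five minutes, which is why a low IOPS ceiling shows up so quickly once a working set no longer fits in memory.</p>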
<p>Before we get into the tests, here are the specs for the servers we&#8217;re using:</p>
<style type="text/css">
  .comparison {float:left; margin-left:10px; margin-right:10px;}
  table th{background-color: #972f2c; color:#fff; padding:2px;}
 </style>
<div class="comparison"><strong>Medium (MD) MongoDB Engineered Server</strong><br />
Dual 6-core Intel 5670 CPUs<br />
CentOS 6 64-bit<br />
36GB RAM<br />
1Gb Network &#8211; Bonded</div>
<div class="comparison"><strong>Large (LG) MongoDB Engineered Server</strong><br />
Dual 8-core Intel E5-2620 CPUs<br />
CentOS 6 64-bit<br />
128GB RAM<br />
1Gb Network &#8211; Bonded</div>
<div style="clear:both; height:1px; margin:0; padding:0;">&nbsp;</div>
<p>The numbers shown in the table below reflect the average number of IOPS we recorded with a 100% random read/write workload on each of these engineered servers. To measure these IOPS, we used a tool called <a href="http://freecode.com/projects/fio">fio</a> with an 8k block size and iodepth at 128. Remembering that the virtual instance using network-attached storage was able to get 400 IOPS per mount, let&#8217;s look at how our &#8220;base&#8221; configurations perform:</p>
<table style="margin:0 auto; border:0;">
<tr>
<th colspan="2">Medium &#8211; 2 x 64GB SSD RAID1 (Journal) &#8211; 4 x 300GB 15k SAS RAID10 (Data)</th>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/logs</td>
<td>2937</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/logs</td>
<td>1306</td>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/data</td>
<td>1720</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/data</td>
<td>772</td>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/data/journal</td>
<td>19659</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/data/journal</td>
<td>8869</td>
</tr>
<tr>
<td>&nbsp;</td>
<td>&nbsp;</td>
</tr>
<tr>
<th colspan="2">Medium &#8211; 2 x 64GB SSD RAID1 (Journal) &#8211; 4 x 400GB SSD RAID10 (Data)</th>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/logs</td>
<td>30269</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/logs</td>
<td>13124</td>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/data</td>
<td>33757</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/data</td>
<td>14168</td>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/data/journal</td>
<td>19644</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/data/journal</td>
<td>8882</td>
</tr>
<tr>
<td>&nbsp;</td>
<td>&nbsp;</td>
</tr>
<tr>
<th colspan="2">Large &#8211; 2 x 64GB SSD RAID1 (Journal) &#8211; 6 x 600GB 15k SAS RAID10 (Data)</th>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/logs</td>
<td>4820</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/logs</td>
<td>2080</td>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/data</td>
<td>2461</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/data</td>
<td>1099</td>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/data/journal</td>
<td>19639</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/data/journal</td>
<td>8772</td>
</tr>
<tr>
<td>&nbsp;</td>
<td>&nbsp;</td>
</tr>
<tr>
<th colspan="2">Large &#8211; 2 x 64GB SSD RAID1 (Journal) &#8211; 6 x 400GB SSD RAID10 (Data)</th>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/logs</td>
<td>32403</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/logs</td>
<td>13928</td>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/data</td>
<td>34536</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/data</td>
<td>15412</td>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/data/journal</td>
<td>19578</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/data/journal</td>
<td>8835</td>
</tr>
</table>
<p>Clearly, the 400 IOPS per mount results you&#8217;d see in SAN-based storage can&#8217;t hold a candle to the performance of a physical disk, regardless of whether it&#8217;s SAS or SSD. As you&#8217;d expect, the &#8220;Journal&#8221; reads and writes have roughly the same IOPS between all of the configurations because all four configurations use 2 x 64GB SSD drives in RAID1. In both configurations, SSD drives provide better Data mount read/write performance than the 15K SAS drives, and the results suggest that having more physical drives in a Data mount will provide higher average IOPS. To put that observation to the test, I maxed out the number of hard drives in both configurations (10 in the 2U MD server and 34 in the 4U LG server) and recorded the results:</p>
<table style="margin:0 auto;">
<tr>
<th colspan="2">Medium &#8211; 2 x 64GB SSD RAID1 (Journal) &#8211; 10 x 300GB 15k SAS RAID10 (Data)</th>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/logs</td>
<td>7175</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/logs</td>
<td>3481</td>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/data</td>
<td>6468</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/data</td>
<td>1763</td>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/data/journal</td>
<td>18383</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/data/journal</td>
<td>8765</td>
</tr>
<tr>
<td>&nbsp;</td>
<td>&nbsp;</td>
</tr>
<tr>
<th colspan="2">Medium &#8211; 2 x 64GB SSD RAID1 (Journal) &#8211; 10 x 400GB SSD RAID10 (Data)</th>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/logs</td>
<td>32160</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/logs</td>
<td>12181</td>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/data</td>
<td>34642</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/data</td>
<td>14545</td>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/data/journal</td>
<td>19699</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/data/journal</td>
<td>8764</td>
</tr>
<tr>
<td>&nbsp;</td>
<td>&nbsp;</td>
</tr>
<tr>
<th colspan="2">Large &#8211; 2 x 64GB SSD RAID1 (Journal) &#8211; 34 x 600GB 15k SAS RAID10 (Data)</th>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/logs</td>
<td>17566</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/logs</td>
<td>11918</td>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/data</td>
<td>9978</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/data</td>
<td>6526</td>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/data/journal</td>
<td>18522</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/data/journal</td>
<td>8722</td>
</tr>
<tr>
<td>&nbsp;</td>
<td>&nbsp;</td>
</tr>
<tr>
<th colspan="2">Large &#8211; 2 x 64GB SSD RAID1 (Journal) &#8211; 34 x 400GB SSD RAID10 (Data)</th>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/logs</td>
<td>34220</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/logs</td>
<td>15388</td>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/data</td>
<td>35998</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/data</td>
<td>17120</td>
</tr>
<tr>
<td>Random Read IOPS &#8211; /var/lib/mongo/data/journal</td>
<td>17998</td>
</tr>
<tr>
<td>Random Write IOPS &#8211; /var/lib/mongo/data/journal</td>
<td>8822</td>
</tr>
</table>
<p>It should come as no surprise that by adding more drives into the configuration, we get better IOPS, but you might be wondering why the results aren&#8217;t &#8220;betterer&#8221; when it comes to the IOPS in the SSD drive configurations. While the IOPS numbers improve going from four to ten drives in the medium engineered server and six to thirty-four drives in the large engineered server, they don&#8217;t increase as significantly as the IOPS differences in the SAS drives. This is what I meant when I explained that several factors contribute to and potentially limit IOPS performance. In this case, the limiting factor throttling the (ridiculously high) IOPS is the RAID card we are using in the servers. We&#8217;ve been working with our RAID card vendor to test a new card that will open a little more headroom for SSD IOPS, but that replacement card doesn&#8217;t provide the consistency and reliability we need for these servers (which is just as important as speed).</p>
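<p>To put a number on that difference, here is a small sketch that compares how much the drive count grew with how much the Data mount&#8217;s random read IOPS grew. The figures are copied from the two tables above; the names <code>DATA_READ_IOPS</code> and <code>scaling_summary</code> are made up for this illustration.</p>

```python
# Data-mount random read IOPS copied from the two tables above.
# Key: (server, drive type)
# Value: (base drive count, base IOPS, maxed-out drive count, maxed-out IOPS)
DATA_READ_IOPS = {
    ("Medium", "15k SAS"): (4, 1720, 10, 6468),
    ("Medium", "SSD"): (4, 33757, 10, 34642),
    ("Large", "15k SAS"): (6, 2461, 34, 9978),
    ("Large", "SSD"): (6, 34536, 34, 35998),
}


def scaling_summary(results):
    """Return {config: (drive_ratio, iops_ratio)}, each rounded to two decimals."""
    summary = {}
    for config, (base_n, base_iops, max_n, max_iops) in results.items():
        summary[config] = (
            round(max_n / base_n, 2),
            round(max_iops / base_iops, 2),
        )
    return summary


if __name__ == "__main__":
    for (server, drive), (drives, iops) in scaling_summary(DATA_READ_IOPS).items():
        print(f"{server} {drive}: {drives}x the drives, {iops}x the read IOPS")
```

<p>The SAS arrays gain roughly 3.8x and 4.1x in read IOPS, while the SSD arrays gain only about 3 to 4 percent, which is what you would expect if something other than the drives is the ceiling.</p>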
<p>There are probably a dozen other observations I could point out about how each result compares with the others (and why), but I&#8217;ll stop here and open the floor for you. Do you notice anything interesting in the results? Does anything surprise you? What kind of IOPS performance have you seen from your server/cloud instance when running a tool like fio?</p>
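<p>If you want to try a comparable run on your own server, the sketch below builds an fio command line with the two settings mentioned earlier (8k block size, iodepth of 128). Everything else is an assumption on my part and not necessarily what was used for the numbers in this post: the <code>libaio</code> engine, direct I/O, the 4g file size, the 60-second runtime and the helper name <code>fio_command</code>.</p>

```python
import shlex


def fio_command(directory, rw, size="4g", runtime_s=60):
    """Build an fio argument list for one random-I/O test against a directory."""
    if rw not in ("randread", "randwrite"):
        raise ValueError("rw must be 'randread' or 'randwrite'")
    return [
        "fio",
        f"--name={rw}-test",
        f"--directory={directory}",
        f"--rw={rw}",
        "--bs=8k",                 # block size from the post
        "--iodepth=128",           # queue depth from the post
        "--ioengine=libaio",       # assumption: Linux asynchronous I/O engine
        "--direct=1",              # assumption: bypass the page cache
        f"--size={size}",          # assumption: test file size
        f"--runtime={runtime_s}",  # assumption: run length in seconds
        "--time_based",
        "--group_reporting",
    ]


if __name__ == "__main__":
    # Print the commands instead of running them so nothing is written by accident.
    for mount in ("/var/lib/mongo/logs", "/var/lib/mongo/data", "/var/lib/mongo/data/journal"):
        for mode in ("randread", "randwrite"):
            print(shlex.join(fio_command(mount, mode)))
```

<p>Random write tests create and overwrite a test file in the target directory, so point them at a scratch location rather than a live database volume.</p>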
<p>-Kelly</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.softlayer.com/2012/big-data-at-softlayer-the-importance-of-iops/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>MongoDB: Architectural Best Practices</title>
		<link>http://blog.softlayer.com/2012/mongodb-architectural-best-practices/</link>
		<comments>http://blog.softlayer.com/2012/mongodb-architectural-best-practices/#comments</comments>
		<pubDate>Thu, 06 Dec 2012 21:45:35 +0000</pubDate>
		<dc:creator>Harold Hannon</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[SoftLayer]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Tips and Tricks]]></category>
		<category><![CDATA[10gen]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[clusters]]></category>
		<category><![CDATA[configuration]]></category>
		<category><![CDATA[deployment]]></category>
		<category><![CDATA[engineered servers]]></category>
		<category><![CDATA[engineering]]></category>
		<category><![CDATA[IOPS]]></category>
		<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[platform]]></category>
		<category><![CDATA[strategy]]></category>
		<category><![CDATA[tweaks]]></category>

		<guid isPermaLink="false">http://blog.softlayer.com/?p=10036</guid>
		<description><![CDATA[With the launch of our MongoDB solutions, developers can provision powerful, optimized, horizontally scaling NoSQL database clusters in real-time on bare metal infrastructure in SoftLayer data centers around the world. We worked tirelessly with our friends at 10gen &#8212; the creators of MongoDB &#8212; to build and tweak hardware and software configurations that enable peak [...]]]></description>
			<content:encoded><![CDATA[<p>With the launch of our <a href="https://www.softlayer.com/solutions/big-data/mongodb">MongoDB solutions</a>, developers can provision powerful, optimized, horizontally scaling NoSQL database clusters in real-time on bare metal infrastructure in SoftLayer data centers around the world. We worked tirelessly with our friends at 10gen (the creators of <a href="http://www.mongodb.org/">MongoDB</a>) to build and tweak hardware and software configurations that enable peak MongoDB performance, and the resulting platform is pretty amazing. As Duke mentioned in his <a href="http://blog.softlayer.com/2012/big-data-at-softlayer-mongodb/">blog post</a>, those efforts followed 10gen&#8217;s MongoDB best practices, but what he didn&#8217;t mention was that we created some architectural best practices of our own for MongoDB in deployments on our platform.</p>
<p>The <a href="http://www.softlayer.com/solutions/big-data/mongodb/pricing">MongoDB engineered servers</a> that you order from SoftLayer already implement several of the recommendations you&#8217;ll see below, and I&#8217;ll note which have been incorporated as we go through them. Given the scope of the topic, it&#8217;s probably easiest to break down this guide into a few sections to make it a little more digestible. Let&#8217;s take a look at the architectural best practices of running MongoDB through the phases of the roll-out process: Selecting a deployment strategy to prepare for your MongoDB installation, the installation itself, and the operational considerations of running it in production.</p>
<h3>Deployment Strategy</h3>
<p style="padding-top:0; margin-top:5px;">When planning your MongoDB deployment, you should follow Sun Tzu&#8217;s (modified) advice: &#8220;If you know the [friend] and know yourself, you need not fear the result of a hundred battles.&#8221; &#8220;Friend&#8221; was substituted for the &#8220;enemy&#8221; in this advice because the other party is MongoDB. If you aren&#8217;t familiar with MongoDB, the top of your to-do list should be to read <a href="http://www.mongodb.org/display/DOCS/Home">MongoDB&#8217;s official documentation</a>. That information will give you the background you&#8217;ll need as you build and use your database. When you feel comfortable with what MongoDB is all about, it&#8217;s time to &#8220;know yourself.&#8221;</p>
<p>Your most important consideration will be the current and anticipated sizes of your data set. Understanding the volume of data you&#8217;ll need to accommodate will be the primary driver for your choice of individual physical nodes as well as your sharding plans. Once you&#8217;ve established an expected size of your data set, you need to consider the importance of your data and how tolerant you are of the possibility of lost or lagging data (especially in replicated scenarios). With this information in hand, you can plan and start testing your deployment strategy. </p>
<p>It sounds a little strange to hear that you should test a deployment strategy, but when it comes to big data, you want to make sure your databases start with a strong foundation. You should perform load testing scenarios on a potential deployment strategy to confirm that a given architecture will meet your needs, and there are a few specific areas that you should consider:</p>
<div style="margin-left:10px;"><strong>Memory Sizing</strong><br />
MongoDB (like many data-oriented applications) works best when the data set can reside in memory. Nothing performs better than a MongoDB instance that does not require disk I/O. Whenever possible, select a platform that has more available RAM than your working data set size. If your data set exceeds the available RAM for a single node, then consider using <a href="http://docs.mongodb.org/manual/core/sharding/">sharding</a> to increase the amount of available RAM in a cluster to accommodate the larger data set. This will maximize the overall performance of your deployment. If you notice page faults when you put your database under production load, they may indicate that you are exceeding the available RAM in your deployment.</p>
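<p>As a rough planning aid, the sizing rule above can be sketched as a small calculation. The function name <code>shards_needed</code> and the <code>usable_fraction</code> parameter (the share of a node&#8217;s RAM assumed to be available for data once the OS and connections take their cut) are assumptions for this illustration, not figures from 10gen or SoftLayer.</p>

```python
import math


def shards_needed(working_set_gb, ram_per_node_gb, usable_fraction=0.8):
    """Smallest number of shards whose combined usable RAM holds the working set."""
    if working_set_gb < 0 or ram_per_node_gb <= 0:
        raise ValueError("sizes must be positive")
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    usable_gb = ram_per_node_gb * usable_fraction
    # Always plan for at least one node, even for a tiny data set.
    return max(1, math.ceil(working_set_gb / usable_gb))


if __name__ == "__main__":
    # 300 GB working set on nodes with 128 GB of RAM each.
    print(shards_needed(300, 128))
    # 30 GB working set on nodes with 36 GB of RAM each.
    print(shards_needed(30, 36))
```

<p>Treat the result as a starting point for load testing rather than a guarantee; page faults under production load are still the signal to watch.</p>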
<p><strong>Disk Type</strong><br />
If speed is not your primary concern or if you have a data set that is far larger than any available in-memory strategy can support, selecting the proper disk type for your deployment is important. <a href="http://en.wikipedia.org/wiki/IOPS">IOPS</a> will be key in selecting your disk type: the higher the IOPS, the better MongoDB will perform. Local disks should be used whenever possible (as network storage can cause high latency and poor performance for your deployment). It&#8217;s also advised that you use RAID 10 when creating disk arrays.</p>
<p>To give you an idea of what kind of IOPS to expect from a given type of drive, these are the approximate ranges of IOPS per drive in SoftLayer MongoDB engineered servers:</p>
<div style="margin-left:10px;">SATA II – 100-200 IOPS<br />
15K SAS – 300-400 IOPS<br />
SSD – 7,000-8,000 IOPS (read) 19,000-20,000 IOPS (write)</div>
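<p>To turn per-drive ranges like these into a ballpark figure for a whole array, you can apply the usual RAID 10 rule of thumb: random reads can be served by every drive, while each write has to land on both halves of a mirror, so only half the drives count toward writes. This is a simplification that ignores controller cache and controller limits, and the helper name <code>raid10_iops_estimate</code> is hypothetical.</p>

```python
def raid10_iops_estimate(drive_count, per_drive_iops):
    """Estimate (low, high) random read and write IOPS for a RAID 10 array.

    per_drive_iops is a (low, high) range for a single drive.
    """
    if drive_count < 2 or drive_count % 2 != 0:
        raise ValueError("RAID 10 needs an even number of drives (at least 2)")
    low, high = per_drive_iops
    mirrored_pairs = drive_count // 2
    return {
        "read": (drive_count * low, drive_count * high),
        "write": (mirrored_pairs * low, mirrored_pairs * high),
    }


if __name__ == "__main__":
    # Four 15K SAS drives at 300-400 IOPS each.
    print(raid10_iops_estimate(4, (300, 400)))
    # Four SATA II drives at 100-200 IOPS each.
    print(raid10_iops_estimate(4, (100, 200)))
```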
<p><strong>CPU</strong><br />
Clock speed and the number of available processors become a consideration if you anticipate using <a href="http://en.wikipedia.org/wiki/MapReduce">MapReduce</a>. It has also been noted that when running a MongoDB instance with the majority of the data in memory, clock speed can have a major impact on overall performance. If you are planning to use MapReduce or you&#8217;re able to operate with a majority of your data in memory, consider a deployment strategy that includes a CPU with a high clock/bus speed to maximize your operations per second.</p>
<p><strong>Replication</strong><br />
Replication provides high availability of your data if a node fails in your cluster. It should be standard to replicate with at least three nodes in any MongoDB deployment. The most common configuration for replication with three nodes is a 2&#215;1 deployment, with two nodes in a primary data center and a backup node in a secondary data center:</p>
<p><img class="centered" src="http://cdn.softlayer.com/innerlayer/mongodbreplication.png" alt="MongoDB Replication"/> </p>
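<p>A replica set is described by a configuration document with an <code>_id</code> and a list of <code>members</code>, so the 2&#215;1 layout can be sketched as plain data. The host names and the set name below are placeholders, and giving the remote node a <code>priority</code> of 0 (so it is never elected primary) is one possible choice for a backup node, not something the post prescribes.</p>

```python
import json


def replica_set_config(set_name, primary_dc_hosts, secondary_dc_hosts):
    """Build a replica set configuration document for a split-data-center layout."""
    members = []
    for host in primary_dc_hosts:
        # Nodes in the main data center are eligible to become primary.
        members.append({"_id": len(members), "host": host, "priority": 1})
    for host in secondary_dc_hosts:
        # Assumption: the remote backup node replicates data but is never elected.
        members.append({"_id": len(members), "host": host, "priority": 0})
    return {"_id": set_name, "members": members}


if __name__ == "__main__":
    config = replica_set_config(
        "rs0",  # hypothetical set name
        ["mongo1.dc1.example.com:27017", "mongo2.dc1.example.com:27017"],
        ["mongo3.dc2.example.com:27017"],
    )
    print(json.dumps(config, indent=2))
```

<p>A document of this shape is what you would pass to <code>rs.initiate()</code> in the mongo shell.</p>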
<p><strong>Sharding</strong><br />
If you anticipate a large, active data set, you should deploy a sharded MongoDB deployment. Sharding allows you to partition a single data set across multiple nodes. You can allow MongoDB to automatically distribute the data across nodes in the cluster or you may elect to define a shard key and create range-based sharding for that key. </p>
<p>Sharding may also help write performance, so you can also elect to shard even if your data set is small but requires a high volume of updates or inserts. It&#8217;s important to note that when you deploy a sharded set, MongoDB will require three (and only three) config server instances, which are specialized Mongo runtimes that track the current shard configuration. Loss of one of these nodes will cause the cluster to go into a read-only mode (for the configuration only) and will require that all nodes be brought back online before any configuration changes can be made.</p>
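<p>To make range-based sharding a little more concrete, here is a toy router that maps a shard key to a shard by looking up which key range it falls into. It is only a sketch of the idea: real MongoDB tracks ranges as chunks, keeps that mapping on the config servers, and moves chunks between shards to balance them. The class name, split points and shard names are all invented for the example.</p>

```python
import bisect


class RangeRouter:
    """Route a shard key to a shard using sorted split points.

    N split points define N + 1 contiguous key ranges, one per shard.
    """

    def __init__(self, split_points, shards):
        if list(split_points) != sorted(split_points):
            raise ValueError("split points must be sorted")
        if len(shards) != len(split_points) + 1:
            raise ValueError("need exactly one more shard than split points")
        self.split_points = list(split_points)
        self.shards = list(shards)

    def shard_for(self, key):
        # bisect_right places a key equal to a split point in the upper range.
        return self.shards[bisect.bisect_right(self.split_points, key)]


if __name__ == "__main__":
    router = RangeRouter(["h", "p"], ["shard0", "shard1", "shard2"])
    for username in ("alice", "harold", "mongo", "zed"):
        print(username, router.shard_for(username))
```

<p>Notice that a poorly chosen key (for example, one where most values start with the same letter) would send nearly everything to one shard, which is why the shard key deserves as much thought as the hardware.</p>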
<p><strong>Write Safety Mode</strong><br />
There are several write safety modes that govern how MongoDB will handle the persistence of the data to disk. It is important to consider which mode best fits your needs for both data integrity and performance. The following write safety modes are available:</p>
<div style="margin-left:10px;">None – This mode provides a deferred writing strategy that is non-blocking. This allows for high performance, but there is a small chance that data will be lost if a node fails. There is also the possibility that data written to one node in a cluster will not be immediately available on all nodes in that cluster for read consistency. The &#8216;None&#8217; strategy will also not provide any sort of protection in the case of network failures. That lack of protection makes this mode highly unreliable, so it should only be used when performance is a priority and data integrity is not a concern.</p>
<p>Normal – This is the default for MongoDB if you do not select any other mode. It provides a deferred writing strategy that is non-blocking. This will allow for high performance, however there is a small opportunity in the case of a node failing that data can be lost. There is also the possibility that data written to one node in a cluster will not be immediately available on all nodes in that cluster for read consistency.</p>
<p>Safe – This mode will block until MongoDB has acknowledged that it has received the write request but will not block until the write is actually performed. This provides a better level of data integrity and will ensure that read consistency is achieved within a cluster.</p>
<p>Journal Safe – Journals provide a recovery option for MongoDB. Using this mode will ensure that the data has been acknowledged and a Journal update has been performed before returning.</p>
<p>Fsync &#8211; This mode provides the highest level of data integrity and blocks until a physical write of the data has occurred. This comes with a degradation in performance and should be used only if data integrity is the primary concern for your application.</p></div>
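<p>Under the hood, these modes differ in which acknowledgement options the driver asks for after a write. The sketch below maps each mode to <code>getLastError</code>-style options (<code>w</code>, <code>j</code>, <code>fsync</code>). The exact values are my understanding of how the drivers of this era define their write concern constants, so treat the mapping as an assumption and check your driver&#8217;s documentation before relying on it.</p>

```python
# Assumed mapping from the write safety modes above to acknowledgement options.
WRITE_MODES = {
    "None": {"w": -1},                   # no acknowledgement, network errors ignored
    "Normal": {"w": 0},                  # no acknowledgement, network errors raised
    "Safe": {"w": 1},                    # wait for the server to acknowledge
    "Journal Safe": {"w": 1, "j": True},  # also wait for the journal update
    "Fsync": {"w": 1, "fsync": True},    # also wait for a flush to disk
}


def get_last_error_command(mode):
    """Return the acknowledgement command for a mode, or None if none is sent."""
    options = WRITE_MODES[mode]
    if options["w"] < 1:
        return None
    command = {"getlasterror": 1}  # the command name must come first
    command.update(options)
    return command


if __name__ == "__main__":
    for mode in WRITE_MODES:
        print(mode, get_last_error_command(mode))
```

<p>The trade-off is visible in the structure: every extra option is one more thing the client waits for before the write call returns.</p>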
<p><strong>Testing the Deployment</strong><br />
Once you&#8217;ve determined your deployment strategy, test it with a data set similar to your production data. 10gen has several tools to help you with load testing your deployment, and the mongo shell includes a tool named &#8216;benchRun&#8217; which can execute operations from within a <a href="http://www.mongodb.org/display/DOCS/JS+Benchmarking+Harness">JavaScript test harness</a>. These tools will return operation information as well as latency numbers for each of those operations. If you require more detailed information about the MongoDB instance, consider using the <a href="http://docs.mongodb.org/manual/reference/mongostat/">mongostat</a> command or MongoDB Monitoring Service (MMS) to monitor your deployment during the testing.</div>
<h3>Installation</h3>
<p style="padding-top:0; margin-top:5px;">When performing the installation of MongoDB, a few considerations can help create both a stable and performance-oriented solution. 10gen recommends the use of CentOS (64-bit) as the base operating system if at all possible. If you try installing MongoDB on a 32-bit operating system, you might run into file size limits that cause issues, and if you feel the urge to install it on Windows, you&#8217;ll see performance issues if virtual memory begins to be utilized by the OS to make up for a lack of RAM in your deployment. As a result, 32-bit operating systems and Windows operating systems should be avoided on MongoDB servers. SoftLayer provisions CentOS 6.X 64-bit operating systems by default on all of our MongoDB engineered server deployments.</p>
<p>When you&#8217;ve got CentOS 64-bit installed, you should also make the following changes to maximize your performance (all of which are included <a href="http://knowledgelayer.softlayer.com/articles/engineered-mongodb-installations">by default</a> on all SoftLayer engineered servers):</p>
<div style="margin-left:10px;">Set SSD Read Ahead Defaults to 16 Blocks &#8211; SSD drives have excellent seek times allowing for shrinking the Read Ahead to 16 blocks. Spinning disks might require slight buffering so these have been set to 32 blocks.</p>
<p>noatime &#8211; Adding the noatime option eliminates the need for the system to make writes to the file system for files which are simply being read &mdash; or in other words: Faster file access and less disk wear.</p>
<p>Turn NUMA Off in BIOS &#8211; Linux, NUMA and MongoDB tend not to work well together. If you are running MongoDB on NUMA hardware, we recommend turning it off (running with an interleave memory policy). If you don&#8217;t, problems will manifest in strange ways like massive slowdowns for periods of time or high system CPU time.</p>
<p>Set ulimit &#8211; We have set the ulimit to 64000 for open files and 32000 for user processes to prevent failures due to a loss of available file handles or user processes.</p>
<p>Use ext4 &#8211; We have selected ext4 over ext3.  We found ext3 to be very slow in allocating files (or removing them). Additionally, access within large files is poor with ext3.</p></div>
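<p>If you are building a server yourself and want to confirm the ulimit values, a quick check like the one below can help. It uses Python&#8217;s standard <code>resource</code> module and reports the limits of the process that runs it, so run it as the same user (and from the same kind of session) that starts mongod. The thresholds are the ones listed above; the function names are made up for this sketch.</p>

```python
import resource

# Thresholds from the list above: 64000 open files, 32000 user processes.
RECOMMENDED = {
    "open files": (resource.RLIMIT_NOFILE, 64000),
    "user processes": (resource.RLIMIT_NPROC, 32000),
}


def limit_ok(soft_limit, recommended):
    """True if a soft limit is unlimited or at least the recommended value."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= recommended


def check_limits():
    """Return {name: (current soft limit, meets recommendation)}."""
    report = {}
    for name, (which, recommended) in RECOMMENDED.items():
        soft, _hard = resource.getrlimit(which)
        report[name] = (soft, limit_ok(soft, recommended))
    return report


if __name__ == "__main__":
    for name, (soft, ok) in check_limits().items():
        status = "ok" if ok else "below recommendation"
        print(f"{name}: soft limit {soft} ({status})")
```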
<p>One last tip on installation: Put the Journal and Data directories on distinct physical volumes. If they reside on a single physical volume, flushes to the Journal will interrupt access to the data and cause spikes of high latency within your MongoDB deployment.</p>
<h3>Operations</h3>
<p style="padding-top:0; margin-top:5px;">Once a MongoDB deployment has been promoted to production, there are a few recommendations for monitoring and optimizing performance. You should always have the MMS agent running on all MongoDB instances to help monitor the health and performance of your deployment. This tool is also very useful if you have 10gen MongoDB Cloud Subscriptions because it provides debugging data for the 10gen team during support interactions. In addition to MMS, you can use the mongostat command (mentioned in the deployment section) to see runtime information about the performance of a MongoDB node. If either of these tools flags performance issues, sharding or indexing are first-line options to resolve them:</p>
<div style="margin-left:10px;">Indexes &#8211; Indexes should be created for a MongoDB deployment if monitoring tools indicate that field based queries are performing poorly. Always use indexes when you are querying data based on distinct fields to help boost performance.</p>
<p>Sharding &#8211; Sharding can be leveraged when the overall performance of the node is suffering because of a large operating data set. Be sure to shard before you get in the red; the system only splits chunks for sharding on insert or update so if you wait too long to shard you may have some uneven distribution for a period of time or forever depending on your data set and sharding key strategy.</p></div>
<p>I know it seems like we&#8217;ve covered a lot over the course of this blog post, but this list of best practices is far from exhaustive. If you want to learn more, the <a href="https://groups.google.com/forum/?fromgroups#!forum/mongodb-user">MongoDB forums</a> are a great resource to connect with the rest of the MongoDB community and learn from their experiences, and the documentation on MongoDB&#8217;s site is another phenomenal resource. The best people to talk to when it comes to questions about MongoDB are the folks at 10gen, so I also highly recommend taking advantage of MongoDB Cloud Subscriptions to get their direct support for your one-off questions and issues.</p>
<p>-Harold</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.softlayer.com/2012/mongodb-architectural-best-practices/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Breaking Down &#8216;Big Data&#8217; &#8211; Database Models</title>
		<link>http://blog.softlayer.com/2012/breaking-down-big-data-database-models/</link>
		<comments>http://blog.softlayer.com/2012/breaking-down-big-data-database-models/#comments</comments>
		<pubDate>Wed, 05 Dec 2012 16:15:55 +0000</pubDate>
		<dc:creator>Marc Jones</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Executive Blog]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[BigCouch]]></category>
		<category><![CDATA[Cassandra]]></category>
		<category><![CDATA[CouchBase]]></category>
		<category><![CDATA[CouchDB]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[developers]]></category>
		<category><![CDATA[hbase]]></category>
		<category><![CDATA[hstore]]></category>
		<category><![CDATA[information]]></category>
		<category><![CDATA[LevelDB]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[postgreSQL]]></category>
		<category><![CDATA[Redis]]></category>
		<category><![CDATA[Riak]]></category>
		<category><![CDATA[scalability]]></category>
		<category><![CDATA[scaling]]></category>
		<category><![CDATA[store]]></category>

		<guid isPermaLink="false">http://blog.softlayer.com/?p=9806</guid>
		<description><![CDATA[Forrester defines big data as &#8220;techniques and technologies that make capturing value from data at an extreme scale economical.&#8221; Gartner says, &#8220;Big data is the term adopted by the market to describe extreme information management and processing issues which exceed the capability of traditional information technology along one or multiple dimensions to support the use [...]]]></description>
			<content:encoded><![CDATA[<p>Forrester defines big data as &#8220;techniques and technologies that make capturing value from data at an extreme scale economical.&#8221; Gartner says, &#8220;Big data is the term adopted by the market to describe extreme information management and processing issues which exceed the capability of traditional information technology along one or multiple dimensions to support the use of the information assets.&#8221; Big data demands extreme horizontal scale that traditional IT management can&#8217;t handle, and it&#8217;s not a challenge exclusive to the Facebooks, Twitters and Tumblrs of the world &#8230; Just look at the Google search volume for &#8220;big data&#8221; over the past eight years:</p>
<p><img class="centered" src="http://cdn.softlayer.com/innerlayer/bigdatagoogle.jpg" alt="Big Data Search Interest"/></p>
<p>Developers are collectively facing information overload. As storage has become more and more affordable, it&#8217;s easier to justify collecting and saving more data. Users are more comfortable with creating and sharing content, and we&#8217;re able to track, log and index metrics and activity that previously would have been deleted in consideration of space constraints or cost. As the information age progresses, we are collecting more and more data at an ever-accelerating pace, and we&#8217;re sharing that data at an incredible rate.</p>
<p>To understand the different facets of this increased usage and demand, Gartner came up with the three V&#8217;s of big data that vary significantly from traditional data requirements: Volume, Velocity and Variety. Larger, more abundant pieces of data (&#8220;Volume&#8221;) are coming at a much faster speed (&#8220;Velocity&#8221;) in formats like media and walls of text that don&#8217;t easily fit into a column-and-row database structure (&#8220;Variety&#8221;). Given those equally important factors, many of the biggest players in the IT world have been hard at work to create solutions that provide the scale and speed developers need when they build social, analytics, gaming, financial or medical apps with large data sets.</p>
<p>When we talk about scaling databases here, we&#8217;re talking about scaling horizontally across multiple servers rather than scaling vertically by upgrading a single server (adding more RAM, increasing HDD capacity, etc.). It&#8217;s important to make that distinction because it leads to a unique challenge shared by all distributed computer systems: <a href="http://en.wikipedia.org/wiki/CAP_theorem">The CAP Theorem</a>. According to the CAP theorem, a distributed storage system must choose to sacrifice either <strong>consistency</strong> (that everyone sees the same data) or <strong>availability</strong> (that you can always read/write) <em>while</em> having <strong>partition tolerance</strong> (where the system continues to operate despite arbitrary message loss or failure of part of the system).</p>
<p>Let&#8217;s take a look at a few of the most common database models, what their strengths are, and how they handle the CAP theorem compromise of consistency v. availability:</p>
<h3>Relational Databases</h3>
<p style="margin-top:5px; padding-top:0; margin-left:10px;"><strong>What They Do:</strong> Stores data in rows/columns. Parent-child records can be joined remotely on the server. Provides speed over scale. Some capacity for vertical scaling, poor capacity for horizontal scaling. This type of database is where most people start.<br />
<strong>Horizontal Scaling:</strong> In a relational database system, horizontal scaling is possible via replication (sharing data between redundant nodes to ensure consistency), and some people have success with sharding (horizontal partitioning of data), but those techniques add a lot of complexity.<br />
<strong>CAP Balance:</strong> Prefer consistency over availability.<br />
<strong>When to Use:</strong> When you have highly structured data, and you know what you&#8217;ll be storing. Great when production queries will be predictable.<br />
<strong>Example Products:</strong> <a href="http://www.oracle.com/us/products/database/overview/index.html">Oracle</a>, <a href="http://www.sqlite.org/">SQLite</a>, <a href="http://www.postgresql.org">PostgreSQL</a>, <a href="http://www.mysql.com/">MySQL</a></p>
<h3>Document-Oriented Databases</h3>
<p style="margin-top:5px; padding-top:0; margin-left:10px;"><strong>What They Do:</strong> Stores data in documents. Parent-child records can be stored in the same document and returned in a single fetch operation with no join. The server is aware of the fields stored within a document, can query on them, and return their properties selectively.<br />
<strong>Horizontal Scaling:</strong> Horizontal scaling is provided via replication, or replication + sharding. Document-oriented databases also usually support relatively low-performance <a href="http://en.wikipedia.org/wiki/MapReduce">MapReduce</a> for ad-hoc querying.<br />
<strong>CAP Balance:</strong> Generally prefer consistency over availability.<br />
<strong>When to Use:</strong> When your concept of a &#8220;record&#8221; has relatively bounded growth, and can store all of its related properties in a single doc.<br />
<strong>Example Products:</strong> <a href="http://www.mongodb.org/">MongoDB</a>, <a href="http://couchdb.apache.org/">CouchDB</a>, <a href="http://bigcouch.cloudant.com/">BigCouch</a>, <a href="https://cloudant.com/">Cloudant</a></p>
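<p>Here is a small sketch of the document idea using plain Python dictionaries. The <code>fetch</code> helper and the sample blog post are hypothetical; real document databases expose their own query APIs, so treat this only as a picture of the data shape.</p>

```python
# A hypothetical blog post stored as one document, comments included.
post = {
    "_id": "post-1",
    "title": "Database Models",
    "author": "marc",
    "comments": [
        {"user": "ann", "text": "Nice overview"},
        {"user": "bob", "text": "What about graphs?"},
    ],
}

collection = {post["_id"]: post}


def fetch(collection, doc_id, fields=None):
    """Return a whole document, or only the requested fields of it."""
    doc = collection[doc_id]
    if fields is None:
        return doc
    return {name: doc[name] for name in fields if name in doc}


# One fetch returns the parent and its children; no join is needed.
full = fetch(collection, "post-1")

# Selective return: only the title comes back.
title_only = fetch(collection, "post-1", fields=["title"])

print(len(full["comments"]), title_only)  # 2 {'title': 'Database Models'}
```

<p>In a relational design the comments would usually live in their own table and be joined back to the post at query time.</p>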
<h3>Key-Value Stores</h3>
<p style="margin-top:5px; padding-top:0; margin-left:10px;"><strong>What They Do:</strong> Stores an arbitrary value at a key. Most can perform simple operations on a single value. Typically, fetching all of the properties of a record takes multiple trips, with Redis being an exception. Very simple, and very fast.<br />
<strong>Horizontal Scaling:</strong> Horizontal scale is provided via sharding.<br />
<strong>CAP Balance:</strong> Generally prefer consistency over availability.<br />
<strong>When to Use:</strong> Very simple schemas, caching of upstream query results, or extreme speed scenarios (like real-time counters).<br />
<strong>Example Products:</strong> <a href="http://www.couchbase.com/">CouchBase</a>, <a href="http://redis.io/">Redis</a>, <a href="http://www.postgresql.org/docs/9.0/static/hstore.html">PostgreSQL HStore</a>, <a href="http://code.google.com/p/leveldb/">LevelDB</a></p>
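<p>The sketch below is a toy in-memory key-value store, written only to illustrate the points above. The <code>KeyValueStore</code> class and its <code>get</code>, <code>set</code> and <code>incr</code> methods are hypothetical and do not mirror the client API of any product in the list.</p>

```python
class KeyValueStore:
    """A toy in-memory key-value store (illustration only)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def incr(self, key, amount=1):
        """A simple operation on a single value: add to a counter."""
        self._data[key] = self._data.get(key, 0) + amount
        return self._data[key]


store = KeyValueStore()

# Real-time counter: one key, one cheap operation per page view.
for _ in range(3):
    views = store.incr("page:home:views")

# A record split across keys needs one trip per property.
store.set("user:42:name", "Ann")
store.set("user:42:email", "ann@example.com")
record = {field: store.get("user:42:" + field) for field in ("name", "email")}

print(views, record)  # 3 {'name': 'Ann', 'email': 'ann@example.com'}
```

<p>The counter shows why these stores are so fast for simple workloads, and the two-key user record shows the cost of rebuilding a multi-property record one key at a time.</p>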
<h3>BigTable-Inspired Databases</h3>
<p style="margin-top:5px; padding-top:0; margin-left:10px;"><strong>What They Do:</strong> Stores data in column-oriented stores inspired by Google&#8217;s <a href="http://research.google.com/archive/bigtable.html">BigTable</a> paper. Stores in this family can have tunable CAP parameters, and can be adjusted to prefer either consistency or availability. Both example products are fairly operationally intensive.<br />
<strong>Horizontal Scaling:</strong> Good speed and very wide horizontal scale capabilities.<br />
<strong>CAP Balance:</strong> Prefer consistency over availability.<br />
<strong>When to Use:</strong> When you need consistency and write performance that scales past the capabilities of a single machine. HBase in particular has been used with around 1,000 nodes in production.<br />
<strong>Example Products:</strong> <a href="http://hbase.apache.org/">HBase</a>, <a href="http://cassandra.apache.org/">Cassandra</a> (inspired by both BigTable and Dynamo)</p>
<h3>Dynamo-Inspired Databases</h3>
<p style="margin-top:5px; padding-top:0; margin-left:10px;"><strong>What They Do:</strong> Distributed key/value stores inspired by Amazon&#8217;s <a href="http://www.read.seas.harvard.edu/~kohler/class/cs239-w08/decandia07dynamo.pdf">Dynamo</a> paper. A key written to a Dynamo ring is persisted to several nodes at once before a successful write is reported. Riak also provides a native MapReduce implementation.<br />
<strong>Horizontal Scaling:</strong> Dynamo-inspired databases usually provide the best scale and extremely strong data durability.<br />
<strong>CAP Balance:</strong> Prefer availability over consistency.<br />
<strong>When to Use:</strong> When the system must always be available for writes and effectively cannot lose data.<br />
<strong>Example Products:</strong> <a href="http://cassandra.apache.org/">Cassandra</a>, <a href="http://wiki.basho.com/">Riak</a>, <a href="http://bigcouch.cloudant.com/">BigCouch</a></p>
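<p>The ring idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the implementation used by Riak, Cassandra or BigCouch: the <code>Ring</code> class, the <code>required_acks</code> parameter, and the choice of three replicas with two acknowledgements are all hypothetical. Each key hashes to a position on the ring, the next few nodes clockwise store it, and the write is reported as successful once enough of them have acknowledged it.</p>

```python
import bisect
import hashlib


def ring_position(name):
    """Hash a node name or key onto the ring."""
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy Dynamo-style ring; `replicas` must not exceed the node count."""

    def __init__(self, node_names, replicas=3):
        self.replicas = replicas
        self.storage = {name: {} for name in node_names}
        self.points = sorted((ring_position(n), n) for n in node_names)

    def preference_list(self, key):
        """Walk clockwise from the key's position and take `replicas` nodes."""
        start = bisect.bisect(self.points, (ring_position(key), ""))
        count = len(self.points)
        return [self.points[(start + i) % count][1] for i in range(self.replicas)]

    def put(self, key, value, required_acks=2, down=()):
        """Store on every reachable replica; succeed if enough acknowledge."""
        acks = 0
        for node in self.preference_list(key):
            if node not in down:
                self.storage[node][key] = value
                acks += 1
        return acks >= required_acks


ring = Ring(["node-a", "node-b", "node-c", "node-d", "node-e"])
owners = ring.preference_list("cart:42")

# Still writable with one replica down, because two of three acknowledge.
ok = ring.put("cart:42", ["book"], down={owners[0]})
stored_on = [n for n in owners if "cart:42" in ring.storage[n]]

print(ok, len(stored_on))  # True 2
```

<p>Accepting the write with one replica missing is the availability-over-consistency choice in action: the node that was down holds no copy of the key until it is repaired, which is the kind of temporary disagreement these systems are designed to tolerate.</p>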
<p>Each of the database models has strengths and weaknesses, and there are huge communities that support each of the open source examples I gave in each model. If your database is a bottleneck or you&#8217;re not getting the flexibility and scalability you need to handle your application&#8217;s volume, velocity and variety of data, start looking at some of these &#8220;big data&#8221; solutions.</p>
<p>Tried any of the above models and have feedback that differs from ours? Leave a comment below and tell us about it!</p>
<p>-<a href="http://twitter.com/marcalanjones">@marcalanjones</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.softlayer.com/2012/breaking-down-big-data-database-models/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Big Data at SoftLayer: MongoDB</title>
		<link>http://blog.softlayer.com/2012/big-data-at-softlayer-mongodb/</link>
		<comments>http://blog.softlayer.com/2012/big-data-at-softlayer-mongodb/#comments</comments>
		<pubDate>Tue, 04 Dec 2012 15:25:56 +0000</pubDate>
		<dc:creator>Duke Skarda</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Executive Blog]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[SoftLayer]]></category>
		<category><![CDATA[10gen]]></category>
		<category><![CDATA[announcement]]></category>
		<category><![CDATA[bare metal]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[Cloud]]></category>
		<category><![CDATA[custom]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[global]]></category>
		<category><![CDATA[horizontal]]></category>
		<category><![CDATA[launch]]></category>
		<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[on-demand]]></category>
		<category><![CDATA[product]]></category>
		<category><![CDATA[scaling]]></category>
		<category><![CDATA[specialized]]></category>
		<category><![CDATA[volume]]></category>

		<guid isPermaLink="false">http://blog.softlayer.com/?p=9955</guid>
		<description><![CDATA[In one day, Facebook&#8217;s databases ingest more than 500 terabytes of data, Twitter processes 500 million Tweets and Tumblr users publish more than 75 million posts. With such an unprecedented volume of information, developers face significant challenges when it comes to building an application&#8217;s architecture and choosing its infrastructure. As a result, demand has exploded [...]]]></description>
			<content:encoded><![CDATA[<p>In one day, Facebook&#8217;s databases ingest more than <a href="http://gigaom.com/data/facebook-is-collecting-your-data-500-terabytes-a-day/">500 terabytes of data</a>, Twitter processes <a href="http://news.cnet.com/8301-1023_3-57541566-93/report-twitter-hits-half-a-billion-tweets-a-day/">500 million Tweets</a> and Tumblr users publish more than <a href="http://www.tumblr.com/about">75 million posts</a>. With such an unprecedented volume of information, developers face significant challenges when it comes to building an application&#8217;s architecture and choosing its infrastructure. As a result, demand has exploded for &#8220;big data&#8221; solutions &mdash; resources that make it possible to process, store, analyze, search and deliver data from large, complex data sets. In light of that demand, SoftLayer has been working in strategic partnership with <a href="http://www.10gen.com/">10gen</a> &mdash; the creators of <a href="http://www.mongodb.org/">MongoDB</a> &mdash; to develop a high-performance, on-demand, big data solution. Today, we&#8217;re excited to announce the launch of specialized <a href="https://www.softlayer.com/solutions/big-data/mongodb">MongoDB servers</a> at SoftLayer.</p>
<p>If you&#8217;ve configured an infrastructure to accommodate big data, you know how much of a pain it can be: You choose your hardware, you configure it to run NoSQL, you install an open source NoSQL project that you think will meet your needs, and you keep tweaking your environment to optimize its performance. Assuming you have the resources (and patience) to get everything running efficiently, you&#8217;ll wind up with the horizontally scalable database infrastructure you need to handle the volume of content you and your users create and consume. SoftLayer and 10gen are making that process a whole lot easier.</p>
<p>Our new MongoDB solutions take the time and guesswork out of configuring a big data environment. We give you an easy-to-use system for designing and ordering everything you need. You can start with a single server or roll out multiple servers in a single replica set across multiple data centers, and in under two hours, an <em>optimized</em> MongoDB environment is provisioned and ready to be used. I stress that it&#8217;s an &#8220;optimized&#8221; environment because that&#8217;s been our key focus. We collaborated with 10gen engineers on hardware and software configurations that provide the most robust performance for MongoDB, and we incorporated many of their MongoDB best practices. The resulting &#8220;engineered servers&#8221; are <a href="https://www.softlayer.com/solutions/big-data/">big data</a> powerhouses:</p>
<p><a href="http://www.softlayer.com/solutions/big-data/mongodb/pricing"><img class="centered" src="http://cdn.softlayer.com/innerlayer/mongodbfullconfig.jpg" alt="MongoDB Configs"/></a></p>
<p>From each engineered server base configuration, you can customize your MongoDB server to meet your application&#8217;s needs, and as you choose your upgrades from the base configuration, you&#8217;ll see the thresholds at which you should consider upgrading other components. As your data set&#8217;s size and the number of indexes in your database increase, you&#8217;ll need additional RAM, CPU, and storage resources, but you won&#8217;t need them in the same proportions, because certain components become bottlenecks before others. Sure, you could upgrade all of the components in a given database server at the same rate, but if, say, you upgrade everything when you only <em>need</em> to upgrade RAM, you&#8217;d be adding (and paying for) unnecessary CPU and storage capacity.</p>
<p>Using our new <a href="http://www.softlayer.com/Sales/orderMongoDbReplicaSet">Solution Designer</a>, it&#8217;s very easy to graphically design a complex multi-site replica set. Once you finalize your locations and server configurations, you&#8217;ll click &#8220;Order,&#8221; and our automated provisioning system will kick into high gear. It deploys your server hardware, installs CentOS (with OS optimizations to provide MongoDB performance enhancements), installs MongoDB, installs MMS (MongoDB Monitoring Service) and configures the network connection on each server to cluster it with the other servers in your environment. A process that may have taken days of work and months of tweaking is completed in less than four hours. And because everything is standardized and automated, you run much less risk of human error.</p>
<p><a href="http://www.softlayer.com/Sales/orderMongoDbReplicaSet"><img class="centered" src="http://cdn.softlayer.com/innerlayer/solutiondesigner.jpg" alt="MongoDB Configs"/></a></p>
<p>One of the other massive benefits of working so closely with 10gen is that we&#8217;ve been able to integrate 10gen&#8217;s MongoDB Cloud Subscriptions into our offering. Customers who opt for a MongoDB Cloud Subscription get additional MongoDB features (like SSL and SNMP support) and support direct from <em>the</em> MongoDB authority. As an added bonus, since the 10gen team has an intimate understanding of the SoftLayer environment, they&#8217;ll be able to provide even better support to SoftLayer customers!</p>
<p>You shouldn&#8217;t have to sacrifice agility for performance, and you shouldn&#8217;t have to sacrifice performance for agility. Most of the &#8220;big data&#8221; offerings in the market today are built on virtual servers that can be provisioned quickly but offer meager performance levels relative to running the same database on bare metal infrastructure. To get the performance benefits of dedicated hardware, many users have chosen to build, roll out and tweak their own configurations. With our MongoDB offering, you get the on-demand availability and flexibility of a cloud infrastructure with the raw power and full control of dedicated hardware. </p>
<p>If you&#8217;ve been toying with the idea of rolling out your own big data infrastructure, life just got a lot better for you.</p>
<p>-Duke</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.softlayer.com/2012/big-data-at-softlayer-mongodb/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>3 Bars &#124; 3 Questions: Big Data and Search</title>
		<link>http://blog.softlayer.com/2011/3-bars-3-questions-big-data-and-search/</link>
		<comments>http://blog.softlayer.com/2011/3-bars-3-questions-big-data-and-search/#comments</comments>
		<pubDate>Wed, 16 Feb 2011 23:39:25 +0000</pubDate>
		<dc:creator>Marc Jones</dc:creator>
				<category><![CDATA[3 Bars 3 Questions]]></category>
		<category><![CDATA[Culture]]></category>
		<category><![CDATA[Development]]></category>
		<category><![CDATA[SoftLayer]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[3 Bars]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[IaaS]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[video]]></category>

		<guid isPermaLink="false">http://blog.softlayer.com/2011/</guid>
		<description><![CDATA[Last week, Duke chose me as this week&#8217;s &#8220;3 Bars &#124; 3 Questions&#8221; participant, so my desk chair became the hot seat this afternoon. The topic of discussion: &#8220;Big Data and Search.&#8221; Have you started working with big data? What&#8217;s the best method you&#8217;ve found to keep it organized and accessible? How do you scale [...]]]></description>
			<content:encoded><![CDATA[<p>Last week, <a href="http://blog.softlayer.com/2011/3-bars-3-questions-hybrid-hosting-video-interview/">Duke chose me</a> as this week&#8217;s &#8220;3 Bars | 3 Questions&#8221; participant, so my desk chair became the hot seat this afternoon. The topic of discussion: &#8220;Big Data and Search.&#8221;</p>
<div class="yt560"><iframe width="560" height="349" src="http://www.youtube.com/embed/6CwZ2c7MbX0" frameborder="0" allowfullscreen></iframe></div>
<p>Have you started working with big data? What&#8217;s the best method you&#8217;ve found to keep it organized and accessible? How do you scale your infrastructure to maintain performance?</p>
<p>-Marc</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.softlayer.com/2011/3-bars-3-questions-big-data-and-search/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
