Posts Tagged 'Platform'

August 1, 2013

The "Unified Field Theory" of Storage

This guest blog was contributed by William Rocca of OS NEXUS. OS NEXUS makes the QuantaStor Software Defined Storage platform, designed to tackle the storage challenges facing cloud computing, Big Data and high-performance applications.

Over the last decade, the creation and popularization of SAN/NAS systems simplified the management of storage into a single appliance so businesses could efficiently share, secure and manage data centrally. Fast forward about 10 years in storage innovation, and we're now rapidly changing from a world of proprietary hardware sold by big-iron vendors to open-source, scale-out storage technologies from software-only vendors that make use of commodity off-the-shelf hardware. Some of the new technologies are derivatives of traditional SAN/NAS with better scalability while others are completely new. Object storage technologies such as OpenStack Swift have created a foundation for whole new types of applications, and big data technologies like MongoDB, Riak and Hadoop go even further to blur the lines between storage and compute. These innovations provide a means for developing next-generation applications that can collect and analyze mountains of data. This is the exciting frontier of open storage today.

This frontier looks a lot like the "Wild West." With ad-hoc solutions that have great utility but are complex to set up and maintain, many users are effectively solving one-off problems, but these solutions are often narrowly defined and specifically designed for a particular application. The question everyone starts asking is, "Can't we just evolve to having one protocol ... one technology that unites them all?"

If each of these data-storing technologies has unique advantages for specific use cases or applications, the answer isn't to eliminate protocols. To borrow a well-known concept from physics, the solution lies in a "Unified Field Theory of Storage" — weaving them together into a cohesive software platform that makes them simple to deploy, maintain and operate.

When you look at the latest generation of storage technologies, you'll notice a common thread: They're all highly available, scale-out and open source, and they serve as platforms for next-generation applications. While SAN/NAS storage is still the bread-and-butter enterprise storage platform today (and will be for some time to come), these older protocols often don't measure up to the needs of applications being developed today. They run into problems storing, processing and gleaning value out of the mountains of data we're all producing.

Thinking about these challenges, how do we make these next-generation open storage technologies easy to manage and turn-key to deploy? What kind of platform could bring them all together? In short, "What does the 'Unified Field Theory of Storage' look like?"

These are the questions we've been trying to answer for the last few years at OS NEXUS, and the result of our efforts is the QuantaStor Software Defined Storage platform. In its first versions, we focused on building a flexible foundation supporting the traditional SAN/NAS protocols, but with the launch of QuantaStor v3 this year, we introduced the first scale-out version of QuantaStor and integrated the first next-gen open storage technology, Gluster, into the platform. In June, we launched support for ZFS on Linux (ZoL) and enhanced the platform with a number of advanced enterprise features, such as snapshots, compression, deduplication and end-to-end checksums.

This is just the start, though. In our quest to solve the "Unified Field Theory of Storage," we're turning our eyes to integrating platforms like OpenStack Swift and Hadoop into QuantaStor v4 later this year. As these high-power technologies are streamlined under a single platform, end users will be able to select the type(s) of storage that best fit a given application without having to learn (or unlearn) specific technologies.

The "Unified Field Theory of Storage" is emerging, and we hope to make it downloadable. Visit OSNEXUS.com to keep an eye on our progress. If you want to incorporate QuantaStor into your environment, check out SoftLayer's preconfigured QuantaStor Mass Storage Server solution.

-William Rocca, OS NEXUS

June 4, 2013

IBM to Acquire SoftLayer

As most have seen by now, this morning we announced IBM's intent to acquire SoftLayer. It's not just big news, it's great news for SoftLayer and our customers. I'd like to take a moment and share a little background on the deal and pass along a few resources to answer questions you may have.

We founded SoftLayer in 2005 with the vision of becoming the de facto platform for the Internet. We committed ourselves to automation and innovation. We could have taken shortcuts to make a quick buck by creating manual processes or providing one-off services, but we invested in processes that would enable us to build the strongest, most scalable, most controllable foundation on which customers can build whatever they want. We created a network-within-a-network topology of three physical networks to every SoftLayer server, and all of our services live within a unified API. "Can it be automated?" was not the easiest question to ask, but it's the question that enabled us to grow at Internet scale.

As part of the newly created IBM Cloud Services division, customers and clients from both companies will benefit from a higher level of choice and a higher level of service from a single partner. More important, the real significance will come as we merge the technology we developed within the SoftLayer platform with the power and vision that drive SmartCloud, pioneering next-generation cloud services. It might seem like everyone is "in the cloud" now, but the reality is that we're still in the early days of this technology revolution. What the cloud looks like and what businesses are doing with it will change even more in the next two years than it has in the last five.

You might have questions in the midst of the buzz around this acquisition, and I want you to get answers. A great place to learn more about the deal is the SoftLayer page on IBM.com. From there, you can access a FAQ with more information, and you'll also learn more about the IBM SmartCloud portfolio that SoftLayer will complement.

A few questions that may be top of mind for the customers reading this blog:

How does this affect my SoftLayer services?
Between now and when the deal closes (expected in the third quarter of this year), SoftLayer will continue to operate as an independent company with no changes to SoftLayer services or delivery. Nothing will change for you in the foreseeable future.

Your SoftLayer account relationships and support infrastructure will remain unchanged, and your existing sales and technical representatives will continue to provide the support you need. At any time, please don't hesitate to reach out to your SoftLayer team members.

Over time as any changes occur, information will be communicated to customers and partners with ample time to allow for planning and a smooth transition. Our customers will benefit from the combined technologies and skills of both companies, including increased investment, global reach, industry expertise and support available from IBM, along with IBM and SoftLayer's joint commitment to innovation.

Once the acquisition has been completed, we will be able to provide more details.

What does it mean for me?
We entered this agreement because it will enable us to continue doing what we've done since 2005, but on an even bigger scale and with greater opportunities. We believe in its success and the opportunity it brings customers.

It's going to be a smooth integration. The executive leadership of both IBM and SoftLayer are committed to the long-term success of this acquisition. The SoftLayer management team will remain part of the integrated leadership team to drive the broader IBM SmartCloud strategy into the marketplace. And IBM is best-in-class at integration and has a significant track record of 26 successful acquisitions over the past three years.

IBM will continue to support and enhance SoftLayer's technologies while enabling clients to take advantage of the broader IBM portfolio, including SmartCloud Foundation, SmartCloud Services and SmartCloud Solutions.

-@lavosby

UPDATE: On July 8, 2013, IBM completed its acquisition of SoftLayer: http://sftlyr.com/30z

February 27, 2013

The Three Most Common Hosting-Related Phobias

As a member of the illustrious SoftLayer sales (SLales) team, I have the daily pleasure of talking with any number of potential, prospective, new and current customers, and in many of those conversations, I've picked up on a fairly common theme: FEAR. Now, we're not talking about lachanophobia (fear of vegetables) or nomophobia (fear of losing cell phone contact) here ... We're talking about fear that paralyzes users and holds them captive — effectively preventing their growth and limiting their business's potential. Fear is a disease.

I've created my own little naming convention for the top three most common phobias I hear from users as they consider making changes to their hosting environments:

1. Pessimisobia
This phobia is best summarized by the saying, "Better the devil you know than the devil you don't." Users with this phobia could suffer from frequent downtime, a lack of responsive support and long-term contracts, but their service is a known quantity. What if a different provider is even worse? If you don't suffer from pessimisobia, this phobia probably seems silly, but it's very evident in many of the conversations I have.

2. Whizkiditus
This affliction is particularly prevalent in established companies. Symptoms of this phobia include recurring discomfort associated with the thought of learning a new management system or deviating from a platform where users have become experts. There's an efficiency to being comfortable with how a particular platform works, but the ceiling to that efficiency is the platform itself. Users with whizkiditus might not admit it, but the biggest reason they shy away from change is that they are afraid of losing the familiarity they've built with their old systems over the years ... even if that means staying on a platform that prohibits scale and growth.

3. Everythingluenza
To illustrate this phobia (the fear that a change can't be compartmentalized and phased in, and must instead happen all at once), let's look at a little scenario:

I host all of my applications at Company 1. I want to move Application A to the more-qualified Company 2, but if I do that, I'll have to move Applications B through Z to Company 2 also. All of that work would be too time-consuming and cumbersome, so I won't change anything.

It's easy to get overwhelmed when considering a change of cloud hosting for any piece of your business, and it's even more intimidating when you feel like it has to be an "all or nothing" decision.

Unless you are afflicted with euphobia (the fear of hearing good news), you'll be happy to hear that these common fears, once properly diagnosed, are quickly and easily curable on the SoftLayer platform. There are no known side effects from treatment, and patients experience immediate symptom relief with a full recovery within one to three months.

This might be a lighthearted look at some quirky fears, but I don't want to downplay how significant these phobias are to the developers and entrepreneurs that suffer from them. If any of these fears strike a chord with you, reach out to the SLales team (by phone, chat or email), and we'll help you create a treatment plan. Once you address and conquer these fears, you can devote all of your energy back to getting over your selenophobia (fear of the moon).

-Arielle

December 20, 2012

MongoDB Performance Analysis: Bare Metal v. Virtual

Developers can be cynical. When "the next great thing in technology" is announced, I usually wait to see how it performs before I get too excited about it ... Show me how that "next great thing" compares apples-to-apples with the competition, and you'll get my attention. With the launch of MongoDB at SoftLayer, I'd guess a lot of developers outside of SoftLayer and 10gen have the same "wait and see" attitude about the new platform, so I put our new MongoDB engineered servers to the test.

When I shared MongoDB architectural best practices, I referenced a few of the significant optimizations our team worked with 10gen to incorporate into our engineered servers (cheat sheet). To illustrate the impact of these changes in MongoDB performance, we ran 10gen's recommended benchmarking harness (freely available for download and testing of your own environment) on our three tiers of engineered servers alongside equivalent shared virtual environments commonly deployed by the MongoDB community. We've made a pretty big deal about the performance impact of running MongoDB on optimized bare metal infrastructure, so it's time to put our money where our mouth is.

The Testing Environment

For each of the available SoftLayer MongoDB engineered servers, data sets of 512KB documents were preloaded onto single MongoDB instances. The data sets were sized relative to available memory so that some were larger (2X) and some were smaller than available memory. Each test also ensured that the data set was altered during the test run frequently enough to prevent the queries from caching all of the data into memory.

Once the data sets were created, JMeter server instances with 4 cores and 16GB of RAM were used to drive 'benchrun' from the 10gen benchmarking harness. This diagram illustrates how we set up the testing environment:

[Diagram: MongoDB performance analysis testing environment]

These JMeter servers function as the clients generating traffic on the MongoDB instances. Each client generated random query and update requests with a ratio of six queries per update. (The update requests ensured that the data set could never be fully cached in memory, so reads from disk were always exercised.) These tests were designed to create an extreme load on the servers from an exponentially increasing number of clients until the system resources became saturated, and we recorded the resulting performance of the MongoDB application.
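For a rough feel of that workload pattern, here's a minimal sketch in Python with pymongo. It is not the actual harness (that was 10gen's 'benchrun'), and the hostnames, database and collection names are placeholders:

```python
import random
import threading

from pymongo import MongoClient

def client_load(host, n_ops=100000):
    """One simulated client: random reads and writes at a 6:1 ratio."""
    coll = MongoClient(host).bench.docs  # placeholder database/collection
    doc_count = coll.estimated_document_count()
    for _ in range(n_ops):
        doc_id = random.randrange(doc_count)
        if random.random() < 6 / 7:  # six queries for every update
            coll.find_one({"_id": doc_id})
        else:  # updates keep the data set from being fully cached
            coll.update_one({"_id": doc_id}, {"$inc": {"rev": 1}})

# Exponentially increase concurrent clients: 1, 2, 4, ..., 32
for n_clients in (2 ** i for i in range(6)):
    threads = [threading.Thread(target=client_load, args=("mongo-host:27017",))
               for _ in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```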

At the Medium (MD) and Large (LG) engineered server tiers, performance metrics were run separately for servers using 15K SAS hard drive data mounts and servers using SSD hard drive data mounts. If you missed the post comparing the IOPS statistics between different engineered server hard drive configurations, be sure to check it out.

Test Case 1: Small MongoDB Engineered Servers vs Shared Virtual Instance

Servers

Small (SM) MongoDB Engineered Server
  • Single 4-core Intel 1270 CPU
  • 64-bit CentOS
  • 8GB RAM
  • 2 x 500GB SATA II - RAID1
  • 1Gb Network

Virtual Provider Instance
  • 4 Virtual Compute Units
  • 64-bit CentOS
  • 7.5GB RAM
  • 2 x 500GB Network Storage - RAID1
  • 1Gb Network

Tests Performed

  • Small Data Set (8GB of 0.5MB documents)
  • 200 iterations of 6:1 query-to-update operations
  • Concurrent client connections exponentially increased from 1 to 32
  • Test duration spanned 48 hours

[Chart: Average Read Operations per Second by Concurrent Client]
[Chart: Peak Read Operations per Second by Concurrent Client]
[Chart: Average Write Operations per Second by Concurrent Client]
[Chart: Peak Write Operations per Second by Concurrent Client]

Test Case 2: Medium MongoDB Engineered Servers vs Shared Virtual Instance

Servers (15K SAS Data Mount Comparison)

Medium (MD) MongoDB Engineered Server
  • Dual 6-core Intel 5670 CPUs
  • 64-bit CentOS
  • 36GB RAM
  • 2 x 64GB SSD - RAID1 (Journal Mount)
  • 4 x 300GB 15K SAS - RAID10 (Data Mount)
  • 1Gb Network - Bonded

Virtual Provider Instance
  • 26 Virtual Compute Units
  • 64-bit CentOS
  • 30GB RAM
  • 2 x 64GB Network Storage - RAID1 (Journal Mount)
  • 4 x 300GB Network Storage - RAID10 (Data Mount)
  • 1Gb Network

Tests Performed

  • Small Data Set (32GB of 0.5MB documents)
  • 200 iterations of 6:1 query-to-update operations
  • Concurrent client connections exponentially increased from 1 to 128
  • Test duration spanned 48 hours

[Chart: Average Read Operations per Second by Concurrent Client]
[Chart: Peak Read Operations per Second by Concurrent Client]
[Chart: Average Write Operations per Second by Concurrent Client]
[Chart: Peak Write Operations per Second by Concurrent Client]

Servers (SSD Data Mount Comparison)

Medium (MD) MongoDB Engineered Server
  • Dual 6-core Intel 5670 CPUs
  • 64-bit CentOS
  • 36GB RAM
  • 2 x 64GB SSD - RAID1 (Journal Mount)
  • 4 x 400GB SSD - RAID10 (Data Mount)
  • 1Gb Network - Bonded

Virtual Provider Instance
  • 26 Virtual Compute Units
  • 64-bit CentOS
  • 30GB RAM
  • 2 x 64GB Network Storage - RAID1 (Journal Mount)
  • 4 x 300GB Network Storage - RAID10 (Data Mount)
  • 1Gb Network

Tests Performed

  • Small Data Set (32GB of 0.5MB documents)
  • 200 iterations of 6:1 query-to-update operations
  • Concurrent client connections exponentially increased from 1 to 128
  • Test duration spanned 48 hours

[Chart: Average Read Operations per Second by Concurrent Client]
[Chart: Peak Read Operations per Second by Concurrent Client]
[Chart: Average Write Operations per Second by Concurrent Client]
[Chart: Peak Write Operations per Second by Concurrent Client]

Test Case 3: Large MongoDB Engineered Servers vs Shared Virtual Instance

Servers (15K SAS Data Mount Comparison)

Large (LG) MongoDB Engineered Server
  • Dual 8-core Intel E5-2620 CPUs
  • 64-bit CentOS
  • 128GB RAM
  • 2 x 64GB SSD - RAID1 (Journal Mount)
  • 6 x 600GB 15K SAS - RAID10 (Data Mount)
  • 1Gb Network - Bonded

Virtual Provider Instance
  • 26 Virtual Compute Units
  • 64-bit CentOS
  • 64GB RAM (Maximum available on this provider)
  • 2 x 64GB Network Storage - RAID1 (Journal Mount)
  • 6 x 600GB Network Storage - RAID10 (Data Mount)
  • 1Gb Network

Tests Performed

  • Small Data Set (64GB of 0.5MB documents)
  • 200 iterations of 6:1 query-to-update operations
  • Concurrent client connections exponentially increased from 1 to 128
  • Test duration spanned 48 hours

[Chart: Average Read Operations per Second by Concurrent Client]
[Chart: Peak Read Operations per Second by Concurrent Client]
[Chart: Average Write Operations per Second by Concurrent Client]
[Chart: Peak Write Operations per Second by Concurrent Client]

Servers (SSD Data Mount Comparison)

Large (LG) MongoDB Engineered Server
  • Dual 8-core Intel E5-2620 CPUs
  • 64-bit CentOS
  • 128GB RAM
  • 2 x 64GB SSD - RAID1 (Journal Mount)
  • 6 x 400GB SSD - RAID10 (Data Mount)
  • 1Gb Network - Bonded

Virtual Provider Instance
  • 26 Virtual Compute Units
  • 64-bit CentOS
  • 64GB RAM (Maximum available on this provider)
  • 2 x 64GB Network Storage - RAID1 (Journal Mount)
  • 6 x 600GB Network Storage - RAID10 (Data Mount)
  • 1Gb Network

Tests Performed

  • Small Data Set (64GB of 0.5MB documents)
  • 200 iterations of 6:1 query-to-update operations
  • Concurrent client connections exponentially increased from 1 to 128
  • Test duration spanned 48 hours

[Chart: Average Read Operations per Second by Concurrent Client]
[Chart: Peak Read Operations per Second by Concurrent Client]
[Chart: Average Write Operations per Second by Concurrent Client]
[Chart: Peak Write Operations per Second by Concurrent Client]

Impressions from Performance Testing

The results speak for themselves. Running a MongoDB big data solution on a shared virtual environment has significant drawbacks when compared to running MongoDB on a single-tenant bare metal offering. Disk I/O is by far the most limiting resource for MongoDB, and relying on shared network-attached storage (with much lower disk I/O) makes this limitation very apparent. Beyond the average and peak statistics above, performance varied much more significantly in the virtual instance environment, so it's not as consistent and predictable as bare metal.

Highlights:

  • When a working data set is smaller than available memory, query performance increases.
  • The number of clients performing queries has an impact on query performance because more data is being actively cached at a rapid rate.
  • The addition of a separate Journal Mount volume significantly improves performance. Because the Small (SM) engineered server does not include a secondary mount for journals, whenever MongoDB began to journal, the disk I/O associated with journaling disrupted the query and update operations performed on the Data Mount.
  • The best deployments in terms of operations per second, stability and control were the configurations with a RAID10 SSD Data Mount and a RAID1 SSD Journal Mount. These configurations are available in both our Medium and Large offerings, and I'd highly recommend them.

-Harold

December 6, 2012

MongoDB: Architectural Best Practices

With the launch of our MongoDB solutions, developers can provision powerful, optimized, horizontally scaling NoSQL database clusters in real-time on bare metal infrastructure in SoftLayer data centers around the world. We worked tirelessly with our friends at 10gen — the creators of MongoDB — to build and tweak hardware and software configurations that enable peak MongoDB performance, and the resulting platform is pretty amazing. As Duke mentioned in his blog post, those efforts followed 10gen's MongoDB best practices, but what he didn't mention was that we created some architectural best practices of our own for MongoDB deployments on our platform.

The MongoDB engineered servers that you order from SoftLayer already implement several of the recommendations you'll see below, and I'll note which have been incorporated as we go through them. Given the scope of the topic, it's probably easiest to break down this guide into a few sections to make it a little more digestible. Let's take a look at the architectural best practices of running MongoDB through the phases of the roll-out process: Selecting a deployment strategy to prepare for your MongoDB installation, the installation itself, and the operational considerations of running it in production.

Deployment Strategy

When planning your MongoDB deployment, you should follow Sun Tzu's (modified) advice: "If you know the [friend] and know yourself, you need not fear the result of a hundred battles." ("Friend" is substituted for "enemy" here because the other party is MongoDB.) If you aren't familiar with MongoDB, the top of your to-do list should be to read MongoDB's official documentation. That information will give you the background you'll need as you build and use your database. When you feel comfortable with what MongoDB is all about, it's time to "know yourself."

Your most important consideration will be the current and anticipated sizes of your data set. Understanding the volume of data you'll need to accommodate will be the primary driver for your choice of individual physical nodes as well as your sharding plans. Once you've established an expected size of your data set, you need to consider the importance of your data and how tolerant you are of the possibility of lost or lagging data (especially in replicated scenarios). With this information in hand, you can plan and start testing your deployment strategy.

It sounds a little strange to hear that you should test a deployment strategy, but when it comes to big data, you want to make sure your databases start with a strong foundation. You should perform load testing scenarios on a potential deployment strategy to confirm that a given architecture will meet your needs, and there are a few specific areas that you should consider:

Memory Sizing
MongoDB (like many data-oriented applications) works best when the data set can reside in memory. Nothing performs better than a MongoDB instance that does not require disk I/O. Whenever possible, select a platform that has more available RAM than your working data set size. If your data set exceeds the available RAM for a single node, then consider using sharding to increase the amount of available RAM in a cluster to accommodate the larger data set. This will maximize the overall performance of your deployment. If you notice page faults when you put your database under production load, they may indicate that you are exceeding the available RAM in your deployment.
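One practical way to keep an eye on that is to poll serverStatus and watch the page fault counter climb. Here's a minimal sketch with pymongo; the host and polling interval are placeholders, and the exact field layout varies by MongoDB version and platform:

```python
import time

from pymongo import MongoClient

client = MongoClient("mongo-host:27017")  # placeholder host

previous = None
while True:
    status = client.admin.command("serverStatus")
    faults = status["extra_info"]["page_faults"]  # cumulative count (Linux)
    if previous is not None:
        print(f"page faults in the last 10s: {faults - previous}")
    previous = faults
    time.sleep(10)
```

A steadily climbing delta under production load is the signal described above: the working set no longer fits in RAM.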

Disk Type
If speed is not your primary concern, or if you have a data set that is far larger than any available in-memory strategy can support, selecting the proper disk type for your deployment is important. IOPS will be key in selecting your disk type; the higher the IOPS, the better MongoDB will perform. Local disks should be used whenever possible (network storage can cause high latency and poor performance for your deployment). It's also advised that you use RAID 10 when creating disk arrays.

To give you an idea of what kind of IOPS to expect from a given type of drive, these are the approximate ranges of IOPS per drive in SoftLayer MongoDB engineered servers:

  • SATA II – 100-200 IOPS
  • 15K SAS – 300-400 IOPS
  • SSD – 7,000-8,000 IOPS (read), 19,000-20,000 IOPS (write)

CPU
Clock speed and the number of available processors become a consideration if you anticipate using MapReduce. It has also been noted that when running a MongoDB instance with the majority of the data in memory, clock speed can have a major impact on overall performance. If you are planning to use MapReduce or you're able to operate with a majority of your data in memory, consider a deployment strategy that includes a CPU with a high clock/bus speed to maximize your operations per second.

Replication
Replication provides high availability of your data if a node fails in your cluster. It should be standard to replicate with at least three nodes in any MongoDB deployment. The most common three-node configuration is a 2x1 deployment — two replica set members in a primary data center with a backup member in a secondary data center:

[Diagram: 2x1 replication topology with two members in a primary data center and a backup member in a secondary data center]
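For reference, a three-member set like the one pictured could be initiated with something like the following sketch. The set name and hostnames are placeholders; giving the remote member priority 0 keeps the WAN node from being elected primary:

```python
from pymongo import MongoClient

# Connect to the node that should become primary (placeholder host)
client = MongoClient("dc1-mongo1:27017")

config = {
    "_id": "rs0",
    "members": [
        {"_id": 0, "host": "dc1-mongo1:27017"},
        {"_id": 1, "host": "dc1-mongo2:27017"},
        # Backup member in the secondary data center
        {"_id": 2, "host": "dc2-mongo1:27017", "priority": 0},
    ],
}
client.admin.command("replSetInitiate", config)
```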

Sharding
If you anticipate a large, active data set, you should deploy a sharded MongoDB deployment. Sharding allows you to partition a single data set across multiple nodes. You can allow MongoDB to automatically distribute the data across nodes in the cluster or you may elect to define a shard key and create range-based sharding for that key.

Sharding may also help write performance, so you can elect to shard even if your data set is small but requires a high number of updates or inserts. It's important to note that when you deploy a sharded set, MongoDB requires three (and only three) config server instances, which are specialized Mongo runtimes that track the current shard configuration. Loss of one of these nodes will cause the cluster to go into a read-only mode (for the configuration only) and will require that all nodes be brought back online before any configuration changes can be made.
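As a sketch of what defining a range-based shard key looks like from the driver side (the mongos host, database, collection and key names are placeholders, not a SoftLayer-specific setup):

```python
from pymongo import MongoClient

# Connect to a mongos query router (placeholder host)
mongos = MongoClient("mongos-host:27017")

# Enable sharding on the database, then shard one collection
# on a range-based key.
mongos.admin.command("enableSharding", "appdb")
mongos.admin.command("shardCollection", "appdb.events", key={"user_id": 1})
```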

Write Safety Mode
There are several write safety modes that govern how MongoDB will handle the persistence of the data to disk. It is important to consider which mode best fits your needs for both data integrity and performance. The following write safety modes are available:

None – This mode provides a deferred, non-blocking write strategy. It allows for high performance, but there is a small chance that data will be lost if a node fails, and data written to one node in a cluster may not be immediately available on all nodes for read consistency. The 'None' strategy also provides no protection against network failures, which makes this mode highly unreliable; it should only be used when performance is a priority and data integrity is not a concern.

Normal – This is the default for MongoDB if you do not select any other mode. Like 'None', it provides a deferred, non-blocking write strategy, with the same small risk of losing data if a node fails and the same read consistency caveat across nodes in a cluster.

Safe – This mode will block until MongoDB has acknowledged that it has received the write request but will not block until the write is actually performed. This provides a better level of data integrity and will ensure that read consistency is achieved within a cluster.

Journal Safe – Journals provide a recovery option for MongoDB. Using this mode will ensure that the data has been acknowledged and a Journal update has been performed before returning.

Fsync - This mode provides the highest level of data integrity and blocks until a physical write of the data has occurred. This comes with a degradation in performance and should be used only if data integrity is the primary concern for your application.
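In driver terms, these modes correspond roughly to write concern settings. A hedged sketch with pymongo (host, database and collection names are placeholders, and the exact mapping varies by driver and server version):

```python
from pymongo import MongoClient
from pymongo.write_concern import WriteConcern

db = MongoClient("mongo-host:27017").appdb  # placeholder host/database

# Fire-and-forget, roughly 'None'/'Normal': fast, but writes can be lost
fast = db.get_collection("events", write_concern=WriteConcern(w=0))

# 'Safe': block until the server acknowledges the write
safe = db.get_collection("events", write_concern=WriteConcern(w=1))

# 'Journal Safe': also block until the journal has been committed
journal_safe = db.get_collection("events", write_concern=WriteConcern(w=1, j=True))

# 'Fsync': block until data is physically flushed to disk (slowest)
fsynced = db.get_collection("events", write_concern=WriteConcern(w=1, fsync=True))

safe.insert_one({"event": "signup"})  # an acknowledged write
```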

Testing the Deployment
Once you've determined your deployment strategy, test it with a data set similar to your production data. 10gen has several tools to help you load test your deployment, and the mongo console includes a tool named 'benchrun', which can execute operations from within a JavaScript test harness. These tools will return operation information as well as latency numbers for each of those operations. If you require more detailed information about the MongoDB instance, consider using the mongostat command or MongoDB Monitoring Service (MMS) to monitor your deployment during the testing.

Installation

When performing the installation of MongoDB, a few considerations can help create both a stable and performance-oriented solution. 10gen recommends the use of CentOS (64-bit) as the base operating system if at all possible. If you try installing MongoDB on a 32-bit operating system, you might run into file size limits that cause issues, and if you feel the urge to install it on Windows, you'll see performance issues if virtual memory begins to be utilized by the OS to make up for a lack of RAM in your deployment. As a result, 32-bit operating systems and Windows should be avoided on MongoDB servers. SoftLayer provisions CentOS 6.X 64-bit operating systems by default on all of our MongoDB engineered server deployments.

When you've got CentOS 64-bit installed, you should also make the following changes to maximize your performance (all of which are included by default on all SoftLayer engineered servers):

Set SSD Read Ahead Defaults to 16 Blocks - SSD drives have excellent seek times, allowing the read ahead to shrink to 16 blocks. Spinning disks might require slight buffering, so those have been set to 32 blocks.

noatime - Adding the noatime option eliminates the need for the system to make writes to the file system for files which are simply being read — or in other words: Faster file access and less disk wear.

Turn NUMA Off in BIOS - Linux, NUMA and MongoDB tend not to work well together. If you are running MongoDB on NUMA hardware, we recommend turning it off (running with an interleave memory policy). If you don't, problems will manifest in strange ways, like massive slowdowns for periods of time or high system CPU time.

Set ulimit - We have set the ulimit to 64000 for open files and 32000 for user processes to prevent failures due to a loss of available file handles or user processes.

Use ext4 - We have selected ext4 over ext3. We found ext3 to be very slow in allocating files (or removing them). Additionally, access within large files is poor with ext3.

One last tip on installation: Make the Journal and Data volumes distinct physical volumes. If the Journal and Data directories reside on a single physical volume, flushes to the Journal will interrupt data access and cause spikes of high latency within your MongoDB deployment.
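Most of these are OS-level settings rather than MongoDB settings. As a rough sketch of applying the runtime-adjustable ones before launching mongod (the device and config paths are placeholders; noatime and ext4 belong in /etc/fstab, and NUMA is best disabled in BIOS as described above):

```python
import resource
import subprocess

# Read ahead: 16 blocks for an SSD data device (placeholder device path)
subprocess.check_call(["blockdev", "--setra", "16", "/dev/sdb"])

# ulimit equivalents: 64000 open files, 32000 user processes
resource.setrlimit(resource.RLIMIT_NOFILE, (64000, 64000))
resource.setrlimit(resource.RLIMIT_NPROC, (32000, 32000))

# If NUMA can't be turned off in BIOS, run mongod with interleaved memory;
# the launched process inherits the limits set above.
subprocess.Popen(["numactl", "--interleave=all",
                  "mongod", "--config", "/etc/mongod.conf"])
```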

Operations

Once a MongoDB deployment has been promoted to production, there are a few recommendations for monitoring and optimizing performance. You should always have the MMS agent running on all MongoDB instances to help monitor the health and performance of your deployment. This tool is also very useful if you have 10gen MongoDB Cloud Subscriptions because it provides useful debugging data for the 10gen team during support interactions. In addition to MMS, you can use the mongostat command (mentioned in the deployment section) to see runtime information about the performance of a MongoDB node. If either of these tools flags performance issues, sharding or indexing are first-line options to resolve them:

Indexes - Indexes should be created for a MongoDB deployment if monitoring tools indicate that field-based queries are performing poorly. Always use indexes when you are querying data based on distinct fields to help boost performance.

Sharding - Sharding can be leveraged when the overall performance of the node is suffering because of a large operating data set. Be sure to shard before you get in the red; the system only splits chunks for sharding on insert or update, so if you wait too long to shard, you may see uneven distribution for a period of time (or forever, depending on your data set and shard key strategy).
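To illustrate the indexing side, a short pymongo sketch (the field, database and collection names are placeholders):

```python
from pymongo import ASCENDING, MongoClient

coll = MongoClient("mongo-host:27017").appdb.events  # placeholder names

# Index the field that monitoring showed was queried without an index
coll.create_index([("user_id", ASCENDING)])

# Verify the query plan now uses the index instead of a collection scan
# (the exact explain() output layout varies by server version)
plan = coll.find({"user_id": 42}).explain()
print(plan["queryPlanner"]["winningPlan"])
```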

I know it seems like we've covered a lot over the course of this blog post, but this list of best practices is far from exhaustive. If you want to learn more, the MongoDB forums are a great resource to connect with the rest of the MongoDB community and learn from their experiences, and the documentation on MongoDB's site is another phenomenal resource. The best people to talk to when it comes to questions about MongoDB are the folks at 10gen, so I also highly recommend taking advantage of MongoDB Cloud Subscriptions to get their direct support for your one-off questions and issues.

-Harold

September 17, 2012

Joining the Internet Infrastructure Coalition

In January, we posted a series of blogs about legislation in the U.S. House of Representatives and Senate that would have had a serious impact on the hosting industry. We talked about SOPA and PIPA, and how those proposed laws would "break the Internet" as we know it. The hosting industry rallied together to oppose the passage of those bills, and in doing so, we proved to be a powerful collective force.

In the months that followed the shelving of SOPA and PIPA, many of the hosting companies that were active in the fight were invited to join a new coalition that would focus on proposed legislation that affects Internet infrastructure providers ... The Internet Infrastructure Coalition (or "i2Coalition") was born. i2Coalition co-founder and Board Chair Christian Dawson explains the basics:

[Video: Christian Dawson introduces the i2Coalition]

SoftLayer is proud to be a Charter Member of i2Coalition, and we're excited to see how many vendors, partners, peers and competitors have joined us. Scrolling the ranks of founding members is a veritable "Who's who?" of the companies that make up the "nuts and bolts" of the Internet.

The goal of i2Coalition is to facilitate public policy education and advocacy, develop market-driven standards formed by consensus and give the industry a unified voice. On the i2Coalition's Public Policy page, that larger goal is broken down into focused priorities, the first being to encourage the growth and development of the Internet infrastructure industry and to protect the interests of members of the Coalition consistent with this development.

Another huge priority worth noting is the focus on enabling and promoting the free exercise of human rights — including freedom of speech, freedom of assembly and the protection of personal privacy. Those rights are essential to fostering effective Internet advancement and to maintaining a free and open Internet, and SoftLayer is a strong supporter of that platform.

If you operate in the hosting or Internet infrastructure space and you want to be part of the i2Coalition, we encourage you to become a member and join the conversation. When policymakers are talking about getting "an Internet" from their staff members, we know that there are plenty of opportunities to educate and provide context on the technical requirements and challenges that would result from proposed legislation, and the Internet Infrastructure Coalition is well equipped to capitalize on those opportunities.

-@toddmitchell

June 20, 2012

How Do You Build a Private Cloud?

If you read Nathan's "A Cloud to Call Your Own" blog and you want to learn a little more about private clouds in general or SoftLayer Private Clouds specifically, this post is for you. We're going to take a little time to dive deeper into the technology behind SoftLayer Private Clouds, and in the process, I'll talk a little about why particular platforms/hardware/configurations were chosen.

The Platform: Citrix CloudPlatform

There are several cloud infrastructure frameworks to choose from these days. We have surveyed a number of them and actively work with several. We are active in the OpenStack community, and we have working implementations of vSphere, Nimbula, Eucalyptus and other stacks in our data centers. So why CloudPlatform by Citrix?

First off, it's one of the most mature of these options. It's been around for several years and now has the substantial backing of Citrix. That backing includes investment, support organizations and the multitude of other products managed by Citrix. We also have some forward-looking ideas about how to leverage products like CloudBridge and NetScaler with Private Clouds. Second, CloudPlatform operates in accordance with how we believe a private cloud should work: It's simple, it doesn't have a huge management infrastructure and we can charge for it by the CPU per month, just like all of our other products. Finally, CloudPlatform has made good inroads with enterprise customers. We love the idea that an enterprise ops team could leverage CloudPlatform as the management platform for both their on-premises and their off-premises private clouds.

So, we selected CloudPlatform for a multitude of reasons; not just one.

Another huge key was our ability to integrate CloudPlatform into the SoftLayer portals/mobile apps/API. Because many SoftLayer customers manage their environments exclusively through the SoftLayer API, we knew that a seamless integration there was an absolute necessity. With the help of the SoftLayer dev team and the CloudStack folks, we've been able to automate private clouds the same way we did for public cloud instances and dedicated servers.

The Hardware

When it came to choosing what hardware the private clouds would use, the decision was pretty simple. Given our need for automation, SoftLayer Private Clouds would need to be indistinguishable from a standard dedicated server or CloudLayer environment. We use the latest and greatest server hardware available on the market, and every month, you can see thousands of new Supermicro boxes being delivered to our data centers around the world. Because we know we have a reliable, powerful and consistent hardware foundation on which to build the private clouds product, integrating the system is even easier.

When it comes to the specs of the hardware provided for a private cloud environment, we provide as much transparency and flexibility as we can for a customer to build exactly what he or she needs. Let's look into what that means...

The Hardware Configurations

A CloudPlatform environment can be broken down into these components:

  • A single management server (that can manage multiple zones across layer 2 networks)
  • One or more zones
  • One or more clusters in a zone
  • One or more hosts in a cluster
  • Storage shared by a cluster (which can be a single server)
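To give a sense of the automation angle, here's a rough sketch of listing those components through the CloudStack-style API that CloudPlatform exposes, using the standard CloudStack request-signing scheme. The endpoint and keys are placeholders, and this is separate from the SoftLayer portal/API integration described above:

```python
import base64
import hashlib
import hmac
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://cloud.example.com/client/api"   # placeholder endpoint
API_KEY, SECRET_KEY = "my-api-key", "my-secret-key"  # placeholder credentials

def call(command, **params):
    """Sign and issue a CloudStack-style API request."""
    params.update({"command": command, "apiKey": API_KEY, "response": "json"})
    # Canonical form: params sorted by name, URL-encoded, lowercased
    canonical = "&".join(
        "{}={}".format(k, urllib.parse.quote(str(params[k]), safe="")).lower()
        for k in sorted(params))
    digest = hmac.new(SECRET_KEY.encode(), canonical.encode(), hashlib.sha1).digest()
    params["signature"] = base64.b64encode(digest).decode()
    with urllib.request.urlopen(ENDPOINT + "?" + urllib.parse.urlencode(params)) as resp:
        return json.load(resp)

# Walk the hierarchy: zones, then the clusters inside them
for zone in call("listZones")["listzonesresponse"].get("zone", []):
    print("zone:", zone["name"])
    for cluster in call("listClusters", zoneid=zone["id"])[
            "listclustersresponse"].get("cluster", []):
        print("  cluster:", cluster["name"])
```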

A simple diagram of a two-zone private cloud might look like this:

[Diagram: a simple two-zone private cloud]

We've set a standard "management server" configuration that we know will be able to accommodate all of your needs when it comes to running CloudPlatform, and how you build and configure the rest of your private cloud infrastructure is up to you. Whether you want a simple dual-proc, quad-core Nehalem box with a lot of local disk space for a dev cloud or an environment made up of quad-proc, 10-core Westmeres with SSDs, you have the freedom to choose exactly what you want.

Oh, and everything can be online in two to four hours, and it's offered on a month-to-month contract.

The Network Configuration

When it comes to where the hardware is provisioned, you have the ability to deploy zones in multiple geographies and manage them all through a single CloudPlatform management node. Given the way the SoftLayer three-tier network is built, the management node and host nodes do not even need to be accessible by our public network. You can choose to make accessible only the IPs used by the VMs you create. If your initial private cloud infrastructure is in Dallas and you want a node online in Singapore, you can just click a few buttons, and the new node will be provisioned and configured securely by CloudPlatform in a couple of hours.

Imagine how long it would have taken you to build this kind of infrastructure in the past:

[Diagram: a geographically distributed private cloud deployment]

It doesn't take days or weeks now. It takes hours.

As you can see, when we approached the challenge of bringing private clouds to the SoftLayer platform, we had to innovate. In Texas, that would be roughly translated as "Go big or go home." Given the response we've seen from customers and partners since the announcement of SoftLayer Private Clouds, we know the industry has taken notice.

Will all of our customers need their own private cloud infrastructure? Probably not. But will the customers who've been looking for this kind of functionality be ecstatic with the CloudPlatform environment on SoftLayer's network? Absolutely.

-Duke

June 13, 2012

SoftLayer Private Clouds - A Cloud to Call Your Own

Those of us who've been in this industry for years have seen computing evolve pretty significantly, especially recently. We started with dedicated servers running a single operating system, and we were floored by innovations that allowed dedicated servers to run a hypervisor with many operating systems. The next big leap brought virtual machine "cloud" instances into the spotlight ... And the resulting marketing shenanigans have been a blessing and a curse. On the positive side, the approachable "cloud" term is a lot easier to talk about with a nontechnical audience, but on the negative side, we see uninformative TV commercials that leverage cloud as a marketing term, and we see products that further obfuscate what cloud technology actually means:

[Image: a "cloud phone" product]

To make sure we're all on the same page, as we continue to talk about "cloud," our definition is pretty straightforward:

  • It's an operations model.
  • It provides capacity on demand.
  • It offers consumption-based pricing.
  • It features self-service provisioning.
  • It can be accessed and managed via an API.

Understanding those characteristics, when you hear about cloud in the hosting industry, you're usually hearing about cloud computing instances in a public cloud environment. An instance in a public cloud is one of many instances operating on a shared cloud infrastructure alongside other similar instances that aren't managed by you. Your data is still secure, and you can still get good performance in a public cloud environment, but you're not managing the cloud infrastructure on which your instance resides ... You're using a piece of a cloud.

What we announced at Cloud Expo East is the next step in the evolution of technology in our industry ... We're providing a turnkey, on-demand way for our customers to provision their own Private Clouds with Citrix CloudPlatform, powered by Apache CloudStack.

You don't get a piece of the cloud. You have your own cloud, provisioned in a matter of hours on a month-to-month contract.

For those who have looked into building a private cloud for their business in the past, it's probably worth reiterating: With SoftLayer and CloudStack, you can have a geographically distributed, secure, private cloud environment provisioned in a matter of hours (not months). Given the complexity of a private cloud environment — involving a management server, private cloud zones, host servers and object storage — this is no small feat.

[Diagram: SoftLayer Private Clouds components]

Those unbelievable provisioning times are only part of the story ... When that cloud infrastructure is deployed quickly, it's fully integrated into the SoftLayer platform, so it leverages our global private network alongside your existing bare metal, dedicated and virtual servers. Want to add public cloud instances to your private cloud as web heads? You'll log into one portal or use a singular API to have that done in an instant.

Your own cloud infrastructure, fully integrated into SoftLayer's global infrastructure. If you're chomping at the bit to try it out for yourself, email us at privateclouds@softlayer.com, and we'll get you on the "early access" list.

Before I sign off, I want to be sure to thank everyone at SoftLayer and Citrix who worked so hard to make SoftLayer Private Clouds such an amazing new addition to our platform.

-@nday91

June 7, 2012

Meet Catalyst, SoftLayer's Startup Incubator Program

catalyst [kat-l-ist] noun - A person or thing that precipitates an event or change; also, SoftLayer's killer startup incubator program.

It's official. Catalyst has launched on the SoftLayer website:

[Image: Catalyst startup program]

You've heard us talk about SoftLayer's ongoing involvement with entrepreneurs, incubators, accelerators and startup events, but for the most part, we've been flying "under the radar" without an official presence on SoftLayer's website. The Catalyst team has been busy building relationships with more than 50 of the world's best startup-focused organizations, and we've been working directly with hundreds of startups and entrepreneurs to provide some pretty unique resources:

$1,000/month Hosting Credit

SoftLayer is the world's most advanced cloud, dedicated and hybrid hosting company. We integrate best-in-class technology and connectivity into the industry's only fully automated platform, empowering startups with complete access, control, security and scalability. Startups in Catalyst get a $1,000/month hosting credit for one full year. That includes dedicated servers, cloud servers or a hybrid compute environment.

Mentorship from SoftLayer Innovation Team

You'll get connected with SoftLayer's award-winning Innovation Team. These are the über smart guys who created the SoftLayer Automated Platform. They're our most senior technology team, and they're experts at things like massively scalable software and hardware architectures, cloud, globally distributed computing, security, "Big Data" databases and all the other crazy new "best and next" practices in modern and forward-looking compute.

Increased Market Visibility

Catalyst startups receive marketing opportunities with SoftLayer like guest blog posts on the InnerLayer, video interviews, white papers and use cases to help you tell the world about the cool stuff you're doing. When you're out of Beta, ask about our Technology Partners Marketplace, which exposes you to thousands of our customers.

Empowering entrepreneurs and startups is a core principle for SoftLayer, and we're doing everything we can to provide the platform for the next Facebook, Twitter or Tumblr. The Catalyst page on our website might be brand new, but the startup companies in Catalyst today are already taking advantage of $100,000+ of free hosting ... every month. How is that possible? We've got friends in the right places:

[Image: Catalyst partner organizations]

Cultivating a pipeline of amazing startup companies has been easy, thanks to organizations like the TechStars Global Accelerator Network and the other featured partners we're recognizing this month. Without any official "public" presence, we've become a go-to resource in the startup community ... Having a Catalyst site to call "home" is just icing on the cake. If you have a few minutes and you want to learn more about whether SoftLayer may be able to help you build your idea or fuel your startup, head over to the Catalyst startup incubator page and submit a quick application.

Join Catalyst » See Change.

-@PaulFord

February 21, 2012

Startup Series: Distil

As you may have read in one of my previous posts, SoftLayer partners with various startup accelerator programs around the world. This gives us the incredible opportunity to get up close and personal with some of the brightest entrepreneurs in the tech industry. Because SoftLayer grew out of a classic startup environment, we have a passion for helping new companies achieve their goals. From C-level execs all the way down the chain, we're committed to finding the best innovators out there and mentoring them on their way to success.

We're planning a pretty big public debut for the SoftLayer startup program in the coming months, but we want to start introducing you to some of the killer startup companies we're already working with. Today's incredible business: Distil.

[Image: Distil logo]

Distil is currently enrolled in the TechStars Cloud Accelerator program, where SoftLayer CSO George Karidis, CTO Duke Skarda, and I serve as mentors. After meeting the guys at Distil, I couldn't wait to get them set up with us as well.

Here's some insight into the company from a quick Q&A with the brains of the operation, Rami Essaid, founder and CEO of Distil:

Q: Tell me a little bit about Distil and how you got started.

A: Distil is the first content protection network that helps companies identify and block malicious bots from harvesting and stealing their data. We started after talking to online publishers about their security needs, and we quickly realized that digital publishers had no control over their content once they put it on the web. We started working to create the first platform aimed to help them protect and control their information.

Q: When was the moment you first recognized you had a big idea?

A: It happened after presenting our proof of concept to a couple of digital publishers. The enthusiastic feedback we received made us instantly realize that this was it.

Q: How did you build your company?

A: The company started as an after-work hobby. As the platform picked up momentum, we slowly started leaving our jobs to devote all of our time to Distil. We quickly raised seed capital to help fuel our growth.

Q: What are the keys to Distil's success?

A: The team I have at Distil is absolutely the reason for our success. Each person's hard work, energy, and dedication allow us to accomplish twice as much in half the time. This group of guys is the most intelligent and keen I have ever had the pleasure of working with.

Q: How would you describe the market for your product?

A: Distil is a technology solution to a problem that traditionally relied only on laws and litigation. Copyright infringement has been an issue since the World Wide Web began, but until now, most companies have treated data theft reactively. We are disrupting that way of thinking and creating a new market: protecting data and content proactively, before it is ever stolen.

Q: How did you arrive at SoftLayer and how have we helped?

A: We were connected to SoftLayer through the TechStars Cloud Accelerator program. We were introduced to SoftLayer's leadership team, and they worked with us to improve our platform performance and tweak our designs to utilize both dedicated and cloud servers. By using this hybrid solution, we've been able to gain the power and speed of dedicated servers while still having the flexibility to burst and scale on demand.

Q: What advice would you give to other startups?

A: The best advice I can give to any startup is to make sure they're passionate about what they're doing. Startup life is not easy. You work 16-20 hours a day, seven days a week, have very little money, and are always worried someone else will beat you to the prize. Passion is the only reason you get up in the morning.

Learn more about Distil at distil.it.

In my short conversation with Rami, I could hear his passion. That's exactly what we're looking for in companies who join the SoftLayer startup program. We can't wait to see what the future holds for Distil.

If you enjoy reading about cool new startups, bookmark the Startups page here on the SoftLayer Blog or subscribe to the "Startups" RSS feed to meet some of the most badass startups in the world.

Calling All Startups!

Companies in our program receive mentoring, best practices advice, industry insight, and tangible resources including:

  • A $1,000 per month credit for dedicated hosting, cloud hosting or any kind of hybrid hosting setup
  • Advanced infrastructure help and advice
  • A dedicated Senior Account Representative
  • Marketing support

If you're interested in joining our program and getting the help you deserve, shoot me an email, and we'll help you start the application process.

-@PaulFord
