Posts Tagged 'Comparison'

December 17, 2012

Big Data at SoftLayer: The Importance of IOPS

The jet flow gates in the Hoover Dam can release up to 73,000 cubic feet — the equivalent of 546,040 gallons — of water per second at 120 miles per hour. Imagine replacing those jet flow gates with a single garden hose that pushes 25 gallons per minute (or 0.42 gallons per second). Things would get ugly pretty quickly. In the same way, a massive "big data" infrastructure can be crippled by insufficient IOPS.

IOPS — Input/Output Operations Per Second — measure computer storage in terms of the number of read and write operations it can perform in a second. IOPS are a primary concern for database environments where content is being written and queried constantly, and when we take those database environments to the extreme (big data), the importance of IOPS can't be overstated: If you aren't able to perform database reads and writes quickly in a big data environment, it doesn't matter how many gigabytes, terabytes or petabytes you have in your database ... You won't be able to efficiently access, add to or modify your data set.
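To put the unit in context, an IOPS figure multiplied by the I/O block size gives a rough ceiling on throughput. Here's a quick back-of-the-envelope sketch (the 8k block size matches the tests later in this post; the function name is just illustrative):

```python
# Rough conversion from IOPS to throughput: each operation moves one block,
# so throughput is approximately IOPS x block size.

def iops_to_mb_per_sec(iops, block_size_kb):
    """Approximate throughput in MB/s for a given IOPS figure and block size."""
    return iops * block_size_kb / 1024

# A mount doing 400 random 8k operations per second moves only ~3 MB/s ...
print(f"{iops_to_mb_per_sec(400, 8):.1f} MB/s")    # 3.1 MB/s
# ... while 20,000 of the same operations per second is ~156 MB/s.
print(f"{iops_to_mb_per_sec(20000, 8):.1f} MB/s")  # 156.2 MB/s
```

The gap between those two numbers is the gap between a garden hose and a jet flow gate.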

As we worked with 10gen to create, test and tweak SoftLayer's MongoDB engineered servers, our primary focus centered on performance. Since the performance of massively scalable databases is dictated by the read and write operations to that database's data set, we invested significant resources into maximizing the IOPS for each engineered server ... And that involved a lot more than just swapping hard drives out of servers until we found a configuration that worked best. Yes, "Disk I/O" — the amount of input/output operations a given disk can perform — plays a significant role in big data IOPS, but many other factors limit big data performance. How is performance impacted by network-attached storage? At what point will a given CPU become a bottleneck? How much RAM should be included in a base configuration to accommodate the load we expect our users to put on each tier of server? Are there operating system changes that can optimize the performance of a platform like MongoDB?

The resulting engineered servers are a testament to the blood, sweat and tears that were shed in the name of creating a reliable, high-performance big data environment. And I can prove it.

Most shared virtual instances — the scalable infrastructure many users employ for big data — use network-attached storage for their platform's storage. When data has to be queried over a network connection (rather than from a local disk), you introduce latency and more "moving parts" that have to work together. Disk I/O might be amazing on the enterprise SAN where your data lives, but because that data is not stored on-server with your processor or memory resources, performance can sporadically go from "Amazing" to "I Hate My Life" depending on network traffic. When I tested the IOPS for network-attached storage from a large competitor's virtual instances, I saw an average of around 400 IOPS per mount. It's difficult to say whether that's "not good enough" because every application will have different needs in terms of concurrent reads and writes, but it certainly could be better. We performed some internal testing of the IOPS for the hard drive configurations in our Medium and Large MongoDB engineered servers to give you an apples-to-apples comparison.

Before we get into the tests, here are the specs for the servers we're using:

Medium (MD) MongoDB Engineered Server

  • Dual 6-core Intel 5670 CPUs
  • CentOS 6 64-bit
  • 36GB RAM
  • 1Gb Network - Bonded

Large (LG) MongoDB Engineered Server

  • Dual 8-core Intel E5-2620 CPUs
  • CentOS 6 64-bit
  • 128GB RAM
  • 1Gb Network - Bonded

The numbers shown in the table below reflect the average number of IOPS we recorded with a 100% random read/write workload on each of these engineered servers. To measure these IOPS, we used a tool called fio with an 8k block size and iodepth at 128. Remembering that the virtual instance using network-attached storage was able to get 400 IOPS per mount, let's look at how our "base" configurations perform:

Medium - 2 x 64GB SSD RAID1 (Journal) - 4 x 300GB 15k SAS RAID10 (Data)
Random Read IOPS - /var/lib/mongo/logs 2937
Random Write IOPS - /var/lib/mongo/logs 1306
Random Read IOPS - /var/lib/mongo/data 1720
Random Write IOPS - /var/lib/mongo/data 772
Random Read IOPS - /var/lib/mongo/data/journal 19659
Random Write IOPS - /var/lib/mongo/data/journal 8869
   
Medium - 2 x 64GB SSD RAID1 (Journal) - 4 x 400GB SSD RAID10 (Data)
Random Read IOPS - /var/lib/mongo/logs 30269
Random Write IOPS - /var/lib/mongo/logs 13124
Random Read IOPS - /var/lib/mongo/data 33757
Random Write IOPS - /var/lib/mongo/data 14168
Random Read IOPS - /var/lib/mongo/data/journal 19644
Random Write IOPS - /var/lib/mongo/data/journal 8882
   
Large - 2 x 64GB SSD RAID1 (Journal) - 6 x 600GB 15k SAS RAID10 (Data)
Random Read IOPS - /var/lib/mongo/logs 4820
Random Write IOPS - /var/lib/mongo/logs 2080
Random Read IOPS - /var/lib/mongo/data 2461
Random Write IOPS - /var/lib/mongo/data 1099
Random Read IOPS - /var/lib/mongo/data/journal 19639
Random Write IOPS - /var/lib/mongo/data/journal 8772
 
Large - 2 x 64GB SSD RAID1 (Journal) - 6 x 400GB SSD RAID10 (Data)
Random Read IOPS - /var/lib/mongo/logs 32403
Random Write IOPS - /var/lib/mongo/logs 13928
Random Read IOPS - /var/lib/mongo/data 34536
Random Write IOPS - /var/lib/mongo/data 15412
Random Read IOPS - /var/lib/mongo/data/journal 19578
Random Write IOPS - /var/lib/mongo/data/journal 8835
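For readers who want to reproduce this kind of measurement on their own hardware, the parameters described above (8k block size, iodepth of 128, fully random workload) can be expressed as a fio job file along these lines. This is an illustrative sketch, not the exact job we ran: the target filename, file size and runtime are assumptions.

```ini
; random-read.fio -- illustrative fio job approximating the test parameters
; described above. Run against each mount you want to measure, e.g.:
;   fio random-read.fio
; Swap rw=randread for rw=randwrite to measure random writes.
[global]
ioengine=libaio
direct=1
bs=8k
iodepth=128
runtime=60
time_based
group_reporting

[randread]
rw=randread
filename=/var/lib/mongo/data/fio-testfile
size=4g
```

Note that `direct=1` bypasses the page cache, so you're measuring the disks rather than RAM; with cached I/O the numbers would look far better than the storage can actually sustain.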

Clearly, the 400 IOPS per mount results you'd see in SAN-based storage can't hold a candle to the performance of a physical disk, regardless of whether it's SAS or SSD. As you'd expect, the "Journal" reads and writes have roughly the same IOPS between all of the configurations because all four configurations use 2 x 64GB SSD drives in RAID1. In both configurations, SSD drives provide better Data mount read/write performance than the 15K SAS drives, and the results suggest that having more physical drives in a Data mount will provide higher average IOPS. To put that observation to the test, I maxed out the number of hard drives in both configurations (10 in the 2U MD server and 34 in the 4U LG server) and recorded the results:

Medium - 2 x 64GB SSD RAID1 (Journal) - 10 x 300GB 15k SAS RAID10 (Data)
Random Read IOPS - /var/lib/mongo/logs 7175
Random Write IOPS - /var/lib/mongo/logs 3481
Random Read IOPS - /var/lib/mongo/data 6468
Random Write IOPS - /var/lib/mongo/data 1763
Random Read IOPS - /var/lib/mongo/data/journal 18383
Random Write IOPS - /var/lib/mongo/data/journal 8765
   
Medium - 2 x 64GB SSD RAID1 (Journal) - 10 x 400GB SSD RAID10 (Data)
Random Read IOPS - /var/lib/mongo/logs 32160
Random Write IOPS - /var/lib/mongo/logs 12181
Random Read IOPS - /var/lib/mongo/data 34642
Random Write IOPS - /var/lib/mongo/data 14545
Random Read IOPS - /var/lib/mongo/data/journal 19699
Random Write IOPS - /var/lib/mongo/data/journal 8764
   
Large - 2 x 64GB SSD RAID1 (Journal) - 34 x 600GB 15k SAS RAID10 (Data)
Random Read IOPS - /var/lib/mongo/logs 17566
Random Write IOPS - /var/lib/mongo/logs 11918
Random Read IOPS - /var/lib/mongo/data 9978
Random Write IOPS - /var/lib/mongo/data 6526
Random Read IOPS - /var/lib/mongo/data/journal 18522
Random Write IOPS - /var/lib/mongo/data/journal 8722
 
Large - 2 x 64GB SSD RAID1 (Journal) - 34 x 400GB SSD RAID10 (Data)
Random Read IOPS - /var/lib/mongo/logs 34220
Random Write IOPS - /var/lib/mongo/logs 15388
Random Read IOPS - /var/lib/mongo/data 35998
Random Write IOPS - /var/lib/mongo/data 17120
Random Read IOPS - /var/lib/mongo/data/journal 17998
Random Write IOPS - /var/lib/mongo/data/journal 8822

It should come as no surprise that by adding more drives into the configuration, we get better IOPS, but you might be wondering why the results aren't "betterer" when it comes to the IOPS in the SSD drive configurations. While the IOPS numbers improve going from four to ten drives in the medium engineered server and six to thirty-four drives in the large engineered server, they don't increase as significantly as the IOPS differences in the SAS drives. This is what I meant when I explained that several factors contribute to and potentially limit IOPS performance. In this case, the limiting factor throttling the (ridiculously high) IOPS is the RAID card we are using in the servers. We've been working with our RAID card vendor to test a new card that will open a little more headroom for SSD IOPS, but that replacement card doesn't provide the consistency and reliability we need for these servers (which is just as important as speed).

There are probably a dozen other observations I could point out about how each result compares with the others (and why), but I'll stop here and open the floor for you. Do you notice anything interesting in the results? Does anything surprise you? What kind of IOPS performance have you seen from your server/cloud instance when running a tool like fio?

-Kelly

May 10, 2012

The SoftLayer API and its 'Star Wars' Sibling

When I present about the SoftLayer API at conferences and meetups, I often use an image that shows how many of the different services in the API are interrelated and connected. As I started building the visual piece of my presentation, I noticed a curious "coincidence" about the layout of the visualization:

SoftLayer API Visualization

What does that look like to you?

You might need to squint your eyes and tilt your head or "look beyond the image" like it's one of those "Magic Eye" pictures, but if you're a geek like me, you can't help but notice a striking resemblance to one of the most iconic images from Star Wars:

SoftLayer API == Death Star?

The SoftLayer API looks like the Death Star.

The similarity is undeniable ... The question is whether that resemblance is coincidental or whether we can extrapolate some kind of fuller meaning in light of the visible similarities. I can hear KHazzy now ... "Phil, while that's worth a chuckle and all, there is no way you can actually draw a relevant parallel between the SoftLayer API and The Death Star." While Alderaan may be far too remote for an effective demonstration, this task is no match for the power of the Phil-side.

Challenge Accepted.

The Death Star: A large space station constructed by the Galactic Empire equipped with a super-laser capable of destroying an entire planet.

The SoftLayer API: A robust set of services and methods which provide programmatic access to all portions of the SoftLayer Platform capable of automating any task: administrative, configuration or otherwise.

Each is the incredible result of innovation and design. The construction of the Death Star and creation of the SoftLayer API took years of hard work and a significant investment. Both are massive in scale, and they're both effective and ruthless when completing their objectives.

The most important distinction: The Death Star was made to destroy while the SoftLayer API was made to create ... The Death Star was designed to subjugate a resistance force and destroy anything in the empire's way. The SoftLayer API was designed to help customers create a unified, automated way of managing infrastructure; though admittedly, in the process, that "creation" often involves subjugating redundant, compulsory tasks.

The Death Star and the SoftLayer API can both seem pretty daunting. It can be hard to find exactly what you need to solve all of your problems ... Whether that be an exhaust port or your first API call. Fear not, for I will be with you during your journey, and unlike Obi-Wan Kenobi, I'm not your only hope. There is no need for rebel spies to acquire the schematics for the API ... We publish them openly at sldn.softlayer.com, and we encourage our customers to break the API down into the pieces of functionality they need.
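To show just how small that first step can be, here is a minimal sketch of a first API call over the REST endpoint. It assumes the v3 REST URL scheme and HTTP basic auth (username plus API key) documented on sldn.softlayer.com; the credentials shown are obviously placeholders, and the `build_request` helper is just an illustrative name:

```python
# Sketch of a first SoftLayer API call: fetch your account record via the
# REST endpoint using HTTP basic auth with your username and API key.
import base64
import urllib.request

API_BASE = "https://api.softlayer.com/rest/v3"

def build_request(service, method, username, api_key):
    """Construct an authenticated GET request for a SoftLayer service method."""
    url = f"{API_BASE}/{service}/{method}.json"
    credentials = base64.b64encode(f"{username}:{api_key}".encode()).decode()
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {credentials}")
    return request

# Build the request for SoftLayer_Account::getObject. Sending it (with real
# credentials) is one urllib.request.urlopen(req) away.
req = build_request("SoftLayer_Account", "getObject", "myuser", "my-api-key")
print(req.full_url)  # https://api.softlayer.com/rest/v3/SoftLayer_Account/getObject.json
```

From there, every other service and method in the schematics follows the same pattern ... no proton torpedoes required.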

-Phil (@SoftLayerDevs)

April 23, 2012

Choosing a Cloud: Which Cloud Chooses You?

It's not easy to choose a cloud hosting provider.

In the first post of this series, we talked about the three key deciding factors every cloud customer has to consider, and we set up a Venn diagram to distinguish the surprisingly broad range of unique priorities customers can have:

Cloud Customer Zones

Because every customer will prioritize a cloud's cost, technology and hosting provider a little differently (for completely valid reasons), we mapped out seven distinct "zones" to differentiate some of the basic market segments, or "personas," of cloud hosting buyers. That post was intended to set the stage for a larger discussion on how customers choose their cloud providers and how cloud providers choose their customers, and we're just scratching the surface. We're tackling a pretty big topic here, so as Bill Cosby famously says, "I told you that story to tell you this one."

As a hosting provider, SoftLayer can't expect to be all things for all people. It's impossible to offer a quad-core hex-proc dedicated server for a price that will appeal to a customer in the market for a $49/mo dedicated server.

To better illustrate SoftLayer's vision in the cloud market, we need to take that generic cost v. technology v. hosting provider diagram and give it the "Three Bars" treatment:

SoftLayer Venn Diagram

We're much more interested in living and breathing the Zone 5 "Technology" space than the traditional Zone 2 "Hosting Provider" space. That's why in the past two months, you've seen announcements about our launch of the latest Intel Processors, HPC computing with NVIDIA GPUs, searchable OpenStack Object Storage, and an innovative "Flex Image" approach to blurring the lines between physical and virtual servers. We choose to pursue the cloud customers who make their buying decisions in Zone 3.

That's a challenging pursuit ... It's expensive to push the envelope in technology, customers primarily interested in technology/performance have demanding needs and expectations, and it's easier to make mistakes when you're breaking new ground. The majority of the hosting industry seems to have an eye on the buyer in Zone 1 because they believe the average hosting customer is only interested in the bottom line ... That hosting is more or less a commodity, so the focus should be on some unverifiable qualitative measure of support or the next big special that'll bring in new orders.

As you may have seen recently, GigaOm posted a lovely article that references several high-profile companies in our 25,000+ customer family. We like to say that SoftLayer builds the platform on which our customers build the future, and that short post speaks volumes about the validity of that statement. Our goal is to provide the most powerful, scalable and seamlessly integrated IT infrastructure for the most innovative companies in the world. Innovate or Die isn't just our company motto ... It's our hope for our customers, as well.

We might miss out on your business if you want a $49/mo dedicated server, but if you're looking to change the world, we've got you covered. :-)

-@khazard

April 20, 2012

Choosing a Cloud: Cost v. Technology v. Hosting Provider

If you had to order a new cloud server right now, how would you choose it?

I've worked in the hosting industry for the better part of a decade, and I can safely say that I've either observed or been a part of the buying decision for a few thousand hosting customers — from small business owners getting a website online for the first time to established platforms that are now getting tens of millions of visits every day. While each of those purchasers had different requirements and priorities, I've noticed a few key deciding factors that are consistent in all of those decisions:

The Hosting Decision

  • How much will the dedicated server or cloud computing instance cost?
  • What configuration/technology do I need (or want)?
  • Which hosting provider should I trust with my business?

Every website administrator of every site on the Internet has had to answer those three questions, and while they seem pretty straightforward, they end up overlapping, and the buying decision starts to get a little more complicated:

The Hosting Decision

The natural assumption is that everyone will choose a dedicated server or cloud computing instance that falls in the "sweet spot" where the three circles overlap, right? While that makes sense on paper, hosting decisions are not made in a vacuum, so you'll actually see completely valid hosting decisions targeting every spot on that graph.

Why would anyone choose an option that wouldn't fit in the sweet spot?

That's a great question, and it's a tough one to answer in broad strokes. Let's break the chart down into a few distinct zones to look at why a user would choose a server in each area:

The Hosting Decision

Zone 1

Buyers choosing a server in Zone 1 are easiest to understand: Their budget takes priority over everything else. They might want to host with a specific provider or have a certain kind of hardware, but their budget doesn't allow for either. Maybe they don't need their site to use the latest and greatest hardware or have it hosted anywhere in particular. Either way, they choose a cloud solely based on whether it fits their budget. After the initial buying decision, if another server needs to be ordered, they might become a Zone 4 buyer.

Zone 2

Just like Zone 1 buyers, Zone 2 buyers are a pretty simple bunch as well. If you're an IT administrator at a huge enterprise that does all of your hosting in-house, your buying decision is more or less made for you. It doesn't matter how much the solution costs, you have to choose an option in your data center, and while you might like a certain technology, you're going to get what's available. Enterprise users aren't the only people deciding to order a server in Zone 2, though ... It's where you see a lot of loyal customers who have the ability to move to another provider but prefer not to — whether it's because they want their next server to be in the same place as their current servers, they value the capabilities of a specific hosting provider (or they just like the witty, interesting blogs that hosting provider writes).

Zone 3

As with Zone 1 and Zone 2, when a zone doesn't have any overlapping areas, the explanation is pretty easy. In Zone 3, the buying decision is being made with a priority on technology. Buyers in this area don't care what it costs or where it's hosted ... They need the fastest, most powerful, most scalable infrastructure on the market. Similar to Zone 1 buyers, once Zone 3 buyers make their initial buying decision, they might shift to Zone 5 for their next server or cloud instance, but we'll get to that in a minute.

Zone 4

Now we're starting to overlap. In Zone 4, a customer will be loyal to a hosting provider as long as that loyalty doesn't take them out of their budget. This is a relatively common customer ... They'll try to compare options apples-to-apples, and they'll make their decision based on which hosting provider they like/trust most. As we mentioned above, if a Zone 1 buyer is adding another server to their initial server order, they'll likely look to add to their environment in one place to make it easier to manage and to get the best performance between the two servers.

Zone 5

Just like the transitional Zone 1 buyers, when Zone 3 buyers look to build on their environment, they'll probably become Zone 5 buyers. When your initial buying decision is based entirely on technology, it's unusual to reinvent the wheel when it comes to your next buying decision. While there are customers that will reevaluate their environment and choose a Zone 3 option irrespective of where their current infrastructure is hosted, it's less common. Zone 5 users love having the latest and greatest technology, and they value being able to manage it through one provider.

Zone 6

A Zone 6 buyer is usually a Zone 1 buyer that has specific technology needs. With all the options on the table, a Zone 6 buyer will choose the cloud environment that provides the latest technology or best performance for their budget, regardless of the hosting provider. As with Zone 1 and Zone 3 buyers, a Zone 6 buyer will probably become a Zone 7 buyer if they need to order another server.

Zone 7

Zone 7 buyers are in the sweet spot. They know the technology they want, they know the price they want to pay, and they know the host they want to use. They're able to value all three of their priorities equally, and they can choose an environment that meets all of their needs. After Zone 6 buyers order their first server(s), they're going to probably become Zone 7 buyers when it comes time for them to place their next order.

As you probably noticed, a lot of transitioning happens between an initial buying decision and a follow-up buying decision, so let's look at that quickly:

The Hosting Decision

Regardless of how you make your initial buying decision, when it's time for your next server or cloud computing instance, you have a new factor to take into account: You already have a cloud infrastructure at a hosting provider, so when it comes time to grow, you'll probably want to grow in the same place. Why? Moving between providers can be a pain, managing environments between several providers is more difficult, and if your servers have to work together, they're generally doing so across the public Internet, so you're not getting the best performance.

Where does SoftLayer fit in all of this? Well, beyond being a hosting provider that buyers are choosing, we have to understand how buyers make their buying decisions, and we have to position our business to appeal to the right people with the right priorities. It's impossible to be all things for all people, so we have to choose where to invest our attention ... I'll leave that post for another day, though.

If you had to choose a zone that best describes how you made (or are currently making) your buying decision, which one would it be?

-@khazard

November 27, 2011

Change is Good

We are closing down 2011 and beginning to prepare for a new year that is bound to be full of exciting changes and growth for our company, and in the midst of the calendar change, I'm reminded that my two-year anniversary of becoming a SLayer will be here soon too. Has time flown?! So many things have changed in the past two years, so I thought it would be fun to think about some things that have changed since my first day on the job.

To give you an idea of how things have changed in our office alone:

  • Our last office had two kitchens and two microwaves. At our Alpha headquarters, we have six kitchens with twelve microwaves. It's so nice that I don't have to wait in line to heat up my lunches anymore.
  • In the Alpha office's main kitchen, we have a Sonic ice machine ... if you aren't from the southern part of the US, you might not know why this is so cool, but if you've had a Cherry Limeade delivered to your car, you'll know exactly what I'm talking about.
  • Previously, we had to share a bathroom with a few other companies. Now we're the only company in our building, and there are three sets of bathrooms just for us.
  • When I started we had four conference rooms. Now we have sixteen ... Not even counting the conference rooms in our other locations!

Speaking of "other locations," it'd probably be worthwhile to talk about a few of the bigger changes that happened outside of the walls of the Dallas office.

  • When I started, SoftLayer was run by around 160 SLayers. Now we're over 650!
  • In January 2010, we were on one continent. Now we've added Asia and Europe presences to our foundation in North America.
  • Those international presences have helped us expand our data center footprint. We had three data centers (Dallas, Seattle and Washington, D.C.) when I started. Now we have thirteen data centers around the world, and in addition to those three markets, we now have SLayers in Houston, San Jose, Singapore and Amsterdam!
  • On my first day, our marketing team consisted of three people. Now we have more than fifteen people ... and looking to hire more.
  • Two years ago, we had around 6,000 customers. Today we have more than 25,000 customers located in over 110 countries!

I've been through a headquarters move, a merger, a huge network expansion and multiple product additions, but one thing that remains the same is our dedication to providing our customers with the best on-demand hosting solution in the world ... and of course having fun while we are at it!

-Summer

October 21, 2011

SoftLayer, The Texas Rangers & The World Series

At the beginning of the baseball season, we gave away tickets for a lucky customer to see a Texas Rangers game, and as a result of that generosity, the Rangers thought it fitting to make it to the World Series. Well ... our little giveaway may not have had anything to do with their success, but we like to think our support helped a little.

Understanding that we have customers and employees who are die-hard St. Louis Cardinals fans, I don't want to turn anyone off with this blog post, but with all of the buzz in the air about the World Series coming back to Arlington this year, I started thinking about the Top 10 Ways SoftLayer is Like the Texas Rangers:

  1. Secret handshakes / fist bumps.
  2. Have a no "I" in "Team" mentality ... In fact, there are no I's in "Texas Rangers" or "SoftLayer."
  3. Teams' leaders (i.e. coach and the CEO) are ... um ... charismatic (to say the least).
  4. Come ready to play on any day that ends in "y."
  5. Strong lineups all the way through.
  6. Texas is home, but both teams do amazing jobs "on the road."
  7. Both have Michaels who like pink.*
  8. Both have Louisville Slugger bats ... The Rangers' bats do great things, while SoftLayer's bats are given to recognize employees that have done great things.
  9. Support is awesome from the customers (fans) to the back office to the team on the field making plays.
  10. Champions of the World, baby!!**

* Apologies to Michael Young, as this statement may not be true as applied to him. Each of my blogs to date has a veiled (or obvious) reference to our CFO, and it was very difficult to think of how to incorporate this reference in a blog dealing with the Texas Rangers, so I may have taken undue liberties for which I apologize.

** The aspirations associated with that last comparison may have swayed me from an unbiased comparison. :-)

-@badvizsla
