Posts Tagged 'Cloud Computing'

October 28, 2014

SoftLayer and AWS: What's the Difference?

People often compare SoftLayer with Amazon Web Services (AWS).

It’s easy to understand why. We’ve both built scalable infrastructure platforms to provide cloud resources to the same broad range of customers—from individual entrepreneurs to the world’s largest enterprises.

But while the desire to compare is understandable, the comparison itself isn’t quite apt. The SoftLayer platform is fundamentally different from AWS.

In fact, AWS could be run on SoftLayer. SoftLayer couldn’t be run on AWS.

AWS provisions in the public cloud.

When AWS started letting customers deploy virtual machines on the infrastructure Amazon had built for its e-commerce business, it accelerated the adoption of virtual server hosting within the existing world of Web hosting.

In an AWS cloud environment, customers order the computing and storage resources they need, and AWS deploys those resources on demand. The mechanics of that deployment are important to note, though.

AWS has data centers full of physical servers that are integrated with each other in a massive public cloud environment. These servers are managed and maintained by AWS, and they collectively make up the available cloud infrastructure in the facility.

AWS installs a virtualization layer (also known as a hypervisor) on these physical servers to tie the individual nodes into the environment's total capacity. When a customer orders a cloud server from AWS, this virtualization layer finds a node with the requested resources available and provisions a server image with the customer's desired operating system, applications, etc. The entire process is quick and automated, and each customer has complete control over the resources he or she ordered.
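
To make that flow a little more concrete, here is a deliberately simplified Python sketch of hypervisor-level provisioning: an orchestrator scans a pool of physical nodes for one with enough free capacity and places a virtual server image on it. The node names, resource numbers, and function names are all made up for illustration; no provider's actual code is this simple.

# Toy illustration of hypervisor-level provisioning (illustrative only).
from dataclasses import dataclass, field

@dataclass
class PhysicalNode:
    name: str
    free_cores: int
    free_ram_gb: int
    guests: list = field(default_factory=list)

def provision_virtual_server(nodes, cores, ram_gb, image):
    """Find a node with enough free capacity and place a guest image on it."""
    for node in nodes:
        if node.free_cores >= cores and node.free_ram_gb >= ram_gb:
            node.free_cores -= cores
            node.free_ram_gb -= ram_gb
            guest = {"image": image, "cores": cores, "ram_gb": ram_gb}
            node.guests.append(guest)
            return node.name, guest
    raise RuntimeError("no node in the pool has the requested capacity")

pool = [PhysicalNode("node-a", free_cores=8, free_ram_gb=32),
        PhysicalNode("node-b", free_cores=16, free_ram_gb=64)]

placed_on, vm = provision_virtual_server(pool, cores=4, ram_gb=16, image="centos-6")
print(f"virtual server provisioned on {placed_on}: {vm}")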

That virtualization layer may seem insignificant, but it highlights a critical difference between their platform and ours:

AWS automates and provisions at the hypervisor level, while SoftLayer automates and provisions at the data center level.

SoftLayer provisions down to bare metal resources.

While many have their sights set on beating AWS at its own game, SoftLayer plays a different game.

The SoftLayer platform is designed to give customers complete access and control over the actual infrastructure they need to build a solution in the cloud. Customers can remotely order, deploy, and manage the actual server, storage, and security hardware hosted in our data centers, so they don't have to build their own facilities or purchase their own hardware to get the reliable, high-performance computing they need.

Everything in SoftLayer data centers is transparent, automated, integrated, and built on an open API that customers can access directly. Every server is connected to three distinct physical networks so that public, private, and management network traffic are segmented. And our expert technical support is available for all customers, 24x7.
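
To give a flavor of what "access the API directly" looks like in practice, here's a minimal sketch using the open-source softlayer-python client (pip install softlayer). The credentials are placeholders, and the exact fields returned depend on the account, so treat this as illustrative rather than a definitive recipe.

# A minimal sketch of pulling account inventory through the SoftLayer API
# with the open-source Python client. Credentials below are placeholders.
import SoftLayer

client = SoftLayer.create_client_from_env(
    username='YOUR_USERNAME',
    api_key='YOUR_API_KEY')

# List the bare metal servers and virtual guests on the account.
hardware = SoftLayer.HardwareManager(client).list_hardware()
guests = SoftLayer.VSManager(client).list_instances()

for server in hardware:
    print('bare metal:', server.get('hostname'), server.get('primaryIpAddress'))
for vsi in guests:
    print('virtual:   ', vsi.get('hostname'), vsi.get('primaryIpAddress'))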

Notice that the automation and integration of our platform happen at the data center level. We don't need a virtualization layer to deploy our cloud resources. As a result, we can deploy bare metal servers in the same way AWS deploys public cloud servers (though, admittedly, bare metal servers take more time to deploy than virtual servers in the public cloud). By provisioning down to a lower level in the infrastructure stack, we're able to offer customers more choice and control in their cloud environments.

In addition to the control customers have over infrastructure resources, with our unique network architecture, their servers aren’t isolated inside the four walls of a single data center. Customers can order one server in Dallas and another in Hong Kong, and those two servers can communicate with each other directly and freely across our private network without interfering with customers’ public network traffic. So with every new data center we build, we geographically expand a unified cloud footprint. No regions. No software-defined virtual networks. No isolation.

SoftLayer vs. AWS

Parts of our cloud business certainly compete with AWS. When users compare our virtual servers with theirs, they find a number of similarities. But this post isn't about comparing and contrasting offerings in the areas where we're similar … it's about explaining how we're different:
  • SoftLayer is able to provision bare metal resources to customers. This gives customers free rein over the raw compute power of a specific server configuration, saves them the 2–3 percent performance hit from the hypervisor, and prevents "noisy neighbors" from being provisioned alongside a customer's virtual server. AWS does not provision bare metal resources.

  • AWS differentiates “availability zones” and “regions” for customers who want to expand their cloud infrastructure into multiple locations. SoftLayer has data centers interconnected on a global private network. Customers can select the specific SoftLayer data center location they want so they can provision servers in the exact location they desire.

  • When AWS customers move data between their AWS servers, they see “Inter-Region Data Transfer Out” and “Intra-Region Data Transfer” on their bills. If you’re moving data from one SoftLayer facility to another SoftLayer facility (anywhere in the world), that transfer is free and unmetered. And it doesn’t fight your public traffic for bandwidth.

  • SoftLayer bare metal servers ordered with monthly billing include 20TB/mo of public outbound bandwidth, and virtual servers ordered with monthly billing include 5TB/mo of public outbound bandwidth. With AWS, customers pay a per-GB charge for bandwidth on every bill.

  • SoftLayer offers a broad range of management, monitoring, and support options to customers at no additional cost. AWS charges for monitoring based on metrics, frequency, and number of alarms per resource. And having access to support requires an additional monthly cost.

Do SoftLayer and AWS both offer Infrastructure as a Service? Yes.

Does that make SoftLayer and AWS the same? No.

-@khazard

September 30, 2014

SELLING SOFTLAYER (in Amsterdam)

Selling SoftLayer services to Internet-centric companies—hosting resellers, Software-as-a-Service (SaaS) and Platform-as-a-Service (PaaS) providers, big data and e-commerce companies—is a no-brainer. These companies clearly see the advantages that come with having their servers (the backbone of their business) hosted by a specialist. They convert capital expenses into variable costs that can be spread over time.

On the flip side are companies in non-Internet-centric industries—banking, health care, oil & gas, and aerospace. How do these companies find value in the IaaS offered by SoftLayer? IT infrastructure (servers, to be precise) accounts for less than 5 percent of their capital expenditure (CAPEX), as opposed to almost 95 percent for Internet-centric companies.

Will the same value proposition work for both Internet-centric and non-Internet-centric companies?

With Internet-centric companies (where servers constitute up to 95 percent of CAPEX), the majority of the workforce is server-savvy. This means there is a very high chance any contact we have with these companies will be with a server-savvy fellow. Selling SoftLayer then becomes a question of how SoftLayer's unique selling points (USPs) differentiate it from the competition.

The current industry trend is driving a faulty message: The cloud is a commodity.

The truth is: Unlike basic commodities (electricity, gas, or cable), where the end user gets essentially the same thing regardless of the provider, cloud and hosting services differ widely from provider to provider. That faulty commodity assumption is what drives the price wars in cloud computing.

Comparing apples and oranges (or rather, cumulus and stratus).

To test and disprove this theory, I brought a customer’s systems engineer (a server expert) into a sales discussion with the CTO.

I asked to put the price negotiations on hold for about four hours and to evaluate the services first. To do this, I asked for the exact configuration that the customer had hosted with a competitor. I ordered the same configuration on the SoftLayer platform, and within two hours the servers were ready. When the customer's systems engineer tested the performance of the SoftLayer server and compared it to what they had from the competitor, the price comparison was thrown out the window for good.

There are many facets in which SoftLayer outperforms the competition, but unfortunately, most prospective customers only see price.

For non-Internet-centric companies, reaching the price discussion is a milestone in itself. Pricing negotiations only begin once the need and suitability (originality) have been established.

The IBM and SoftLayer effect.

As a salesperson, I subscribe to the SCOTSMAN Sales Qualification Matrix (Solution, Competition, Originality, Timescales, Size, Money, Authority, and Need). Most companies in this group need solutions. IaaS is just part of that solution. This is where IBM (Big Blue) comes into the picture. As a service giant in the IT Sector, IBM can and will build on SoftLayer’s IaaS prowess to conquer this landscape. The synergies that are coming from this acquisition will send shockwaves across the industry.

Question is: Will the stakeholders maximize this potential to the fullest?

- Valentine Che, Global Sales, AMS01

September 18, 2014

The Cloud Doesn't Bite, Part III

Why it's OK to be a server-hugger—a cloud server hugger.

(This is the final post in a three-part series. Read the first and second posts here.)

By now, you probably understand the cloud enough to know what it is and does. Maybe it's something you've even considered for your own business. But you're still not sold. You still have nagging concerns. You still have questions that you wish you could ask, but you're pretty sure no cloud company would dignify those questions with an honest, legitimate response.

Well, we're a cloud company, and we'll answer those questions.

Inspired by a highly illuminating (!) thread on Slashdot about the video embedded below, we've noticed that some of you aren't ready to get your head caught up in the cloud just yet. And that's cool. But let's see if maybe we can put a few of those fears to rest right now.

“[The] reason that companies are hesitant to commit all of their IT to the cloud [relates to] keeping control. It's not about jobs, it's about being sure that critical services are available when you need them. Whenever you see ‘in the CLOUD!’, mentally replace it with ‘using someone else's server’—all of a sudden it looks a whole lot less appealing. Yes, you gain some flexibility, but you lose a LOT of control. I like my data to not be in the hands of someone else. If I don't control the actual machine that has my data on it, then I don't control the data.”

You guys are control FREAKS! And rightfully so. But some of us actually don't take that away from you. Believe it or not, we make it easier for you.

In fact, sometimes you even get to manage your own infrastructure—and that means you can do anything an employee can do. You'll probably even get so good at it that you'll wonder why we don't pay you.

But it doesn't stop at mere management. Oh, no, no, no, friends. You can even take it one further and build, manage, and have total control over your very own private cloud of virtual servers. Yes, yours, and yours only. Now announcing you, the shot caller.

The point is, you don't lose control over your data in the cloud. None of it. 'Cause cloud companies don't play like that.

“The first rule of computer security is physical access, which is impossible with cloud services, which means they are inherently insecure.”

Curious. So since you can't physically touch your money in your bank account, does that mean it's a free-for-all on your savings? Let us know; we'll bring buckets.

“These cloud guys always forget to mention one glaring problem with their model— they're not adding any new software to the picture.”

Ready for us to blow your minds? We're actually adding software all the time; you just don't see it—but you do feel it.

Your friendly Infrastructure as a Service (IaaS) providers out there are doing a lot of development behind the scenes. An internal software update might let us deploy servers 10 minutes faster, for example. You won't see that, but that doesn't mean it's not happening. If you're happy with your servers, then rest assured you're seeing some sweet software in action. Not every cloud company is software-first the way, say, Salesforce is, but that doesn't mean the software running behind your infrastructure is dial-up grade.

“I personally don't trust the cloud. Think about it for a moment. You are putting your data on a server, and you have no clue as to where it is. You have no clue about who else is able to see that data, and you have no clue about who is watching as you access your data and probably no clue if that server is up to date on security patches.”

Just ask. Simply ask all these questions, and you'd have all these answers. Not to be cheeky, but all of this is information you can and do have a right to know before you commit to anything. We're not sure what makes you think you don't, but you do. Your own due diligence on behalf of your data makes that a necessity, not a luxury.

“As long as I'm accountable, I want the hardware and software under my control. That way when something goes wrong and my boss calls and asks 'WTF?', I can give him something more than ’Well I called Amazon and left a message with our account representative.’"

We can't speak for Amazon, but cloud companies often offer multiple ways you can get a hold of a real, live person because we get that you want to talk to us, like, yesterday. Yes, we totally get you. And we want to fix whatever ails you. In the cloud, that is.

But what makes you think we won't know when something goes wrong before you do? (Checkmate.)

“No matter how much marketing jargon you spew at people, ‘the cloud’ is still just a bunch of servers. Stop lying.”

Why yes, yes, it is. Who's lying to you about that? You're right. "They" should stop lying.

The concept of "the cloud" is simply about where the servers are located and how you consume computing, storage, and networking resources. In "the cloud," your servers are accessed remotely via a network connection (often the Internet, for most of the clouds you know and love) as opposed to being locally accessed while housed in a server room or physical location on the company premises. Your premises, as in wherever you are while performing your computing functions. But no one's trying to pull the wool over your eyes with that one.

Think about it this way: If servers at your location are "on the ground," then servers away from your location can be considered "in the cloud." And that's all there is to it.

Did we help? Did we clear the cloudy haze? We certainly hope so.

But this is just the beginning, and our door is always open for you to question, criticize, and wax philosophical with us when it comes to all things cloud. So get at us. You can chat with us live via our homepage, message us or post up on Facebook, or sling a tweet at a SLayer. We've got real, live people manning their stations. Consider the gauntlet thrown.

-Fayza

September 3, 2014

The Cloud Doesn’t Bite, Part I

Why it's OK to be a server hugger—a cloud server hugger.

By now, you probably understand the cloud enough to know what it is and does. Maybe it's something you've even considered for your own business. But you're still not sold. You still have nagging concerns. You still have questions that you wish you could ask, but you're pretty sure no cloud company would dignify those questions with an honest, legitimate response.

Well, we're a cloud company, and we'll answer those questions.

Inspired by a highly illuminating (!) thread on Slashdot about the video embedded below, we've noticed that some of you aren't ready to get your head caught up in the cloud just yet. And that's cool. But let's see if maybe we can put a few of those fears to rest right now.

"I'm worried about cloud services going down or disappearing, and there’s nothing anyone can do about it."

Let's just get one thing straight here: we're human, and the devices and infrastructures and networks we create are fallible. They're intelligent and groundbreaking and mind-boggling, but they are—like us—susceptible to bad things and prone to error at any given time.

But it's not the end of the world if or when it happens. Your cloud service provider has solutions. And so do you.

First, be smart about who you choose to work with. The larger and more reputable the company you select, the less likely you are to experience outages or outright disappearances. It's the nature of the beast—the big guys aren't going out of business any time soon. And if the worst should happen, they're not going down without a fight for your precious data.

Most outages end up being temporary blips. It'd take a major disaster (think hurricane or zombie apocalypse) to take any cloud-based platform out for more than a few hours. That, of course, sounds like a long time, but we're talking worst-case scenario here. And in the event of a zombie apocalypse, you probably have bigger fish to fry anyway.

But the buck doesn't stop there. Moving data to the cloud doesn't mean you get to kick up your heels and set cruise control. (You don't really want that anyway, and you know it.) Be proactive. Know your service-level agreements, and make sure your systems are structured so that you're not losing out when it comes to outages and downtime. Know your provider's plan for redundancy. Know what monitoring systems are in place. Identify which applications and data are critical and should be treated differently in a worst-case scenario. Have a plan in the event of doomsday. You wouldn't go headfirst into sharknado season without a strategy for what to do if disaster hits, right? Why would the (unlikely) downfall of your data be any different?

Remember when we backed things up to external hard drives, before we'd ever heard of that network in the sky (a quaint concept, we know)? Well, we think it would behoove you to have a backup of what's essential to you and your business.

In fact, being realistic about technology these days is paramount. We can't prevent every failure, and we know better than to pretend we can. According to Microsoft's chief reliability strategist, David Bills, "It's about designing resilient services in which inevitable failures have a minimal effect on service availability and functionality."

In any event, don't panic. You think you're freaking out about the cloud going down? Chances are, your provider is one step ahead of you already.

"Most of the time you don't find out about the cloud host's deficiencies until far too late." "One cloud company I had a personal Linux server with got hit with a DOS attack, and their response was to ignore their customer service email and phone for almost a week while trying to clean it up.”

Uh. Call us crazy, but we're guessing that company's no longer around—just a hunch.

We cloud infrastructure providers don't exactly pride ourselves on hoarding your data and then being completely inaccessible to you. Do your research on potential providers. Find out how easy it is (or difficult as the case may be) to get a hold of your customer service team. Make sure your potential provider's customer support meets your business needs. Make sure there's extra expertise available to you if you need personal attention or a little TLC. Make sure those response times are to your liking. Make sure those methods of contact are diverse enough and align with the way you do work.

We know you don't want to need us, but when you do need us, we are here for you.

"Of course, you have to either provide backup yourself, or routinely hard-verify the cloud provider's backup scheme. And you'd better have a backup-backup offsite recovery contract for when the cloud provider announces it can't really recover (e.g. Hurricane Sandy). And a super-backup-backup plan in case the cloud provider disappears with no forwarding address or has all its servers confiscated by DHS."

Hey, you don't have to have any of these things if your data's not that important to you. But if you'd have backups of your local servers, why wouldn't you have backups of anything you put in the cloud?

We thought so.

Nota bene: Sounds like you might want to take up some of this beef with Hurricane Sandy.

Stay tuned for part two where we tackle accountability, security, and buying ourselves new yachts.

- Fayza

August 28, 2014

Dude, how do I get into the cloud?

I know you may think that's just a catchy title to get you to read my blog, but it's not. I've actually had someone ask me that at a party. In fact, that's the first thing anyone asks me when they find out I work for SoftLayer. The funny thing is, everyone is already in the cloud—they just don't realize it! To make my point, I pick up their smartphone, tell them they're already in the cloud, and walk away. That, of course, sparks more conversation and the opportunity to educate my friends and family on the magic and mystery that is the cloud. But truthfully, it really is a very simple concept:

  • On demand
  • Compute
  • Consumption-based billing

That's it. At its core. But if you want more detail, check out the NIST definition of cloud computing.

And, just to shed light on the backend of what the cloud is, well, it’s nothing but servers. I know, you were expecting something more exciting—maybe unicorns and fairy dust. But it’s not. We house the servers. We care for them daily. We store them and protect them. All from our data center.

What makes SoftLayer stand out from others in the cloud space is that we offer more than one-size-fits-all servers. We offer both public and private virtual servers like other cloud providers, but we also offer highly customizable, high-performance bare metal servers. And as with any good infrastructure, we offer all the ancillary services such as load balancing, firewalls, attached storage, DNS, etc.

There’s no magic involved here. We’ve simply taken your infrastructure and removed your capex and headache. You’re welcome.

So when you hear “The Cloud,” don’t be mystified, and don’t feel inadequate. Now you too can be the cloud genius at your next party. When they talk cloud, just say things like, “Oh yeah, it’s totally on demand computing that bills based on consumption.” Chicks dig that, trust me.

-Cheeku

September 30, 2013

The Economics of Cloud Computing: If It Seems Too Good to Be True, It Probably Is

One of the hosts of a popular Sirius XM radio talk show was recently in the market to lease a car, and a few weeks ago, he shared an interesting story. In his research, he came across an offer that seemed "too good to be true": Lease a new Nissan Sentra with no money due at signing on a 24-month lease for $59 per month. The car would be as "base" as a base model could be, but a reliable car that can be driven safely from Point A to Point B doesn't need fancy "upgrades" like power windows or an automatic transmission. Is it possible to lease a new car for zero down and $59 per month? What's the catch?

After sifting through all of the paperwork, the host admitted the offer was technically legitimate: He could lease a new Nissan Sentra for $0 down and $59 per month for two years. Unfortunately, he also found that "lease" is just about the extent of what he could do with it for $59 per month. The fine print revealed that the yearly mileage allowance was 0 (zero) — he'd pay a significant per-mile rate for every mile he drove the car.

Let's say the mileage on the Sentra was charged at $0.15 per mile and that the car would be driven a very conservative 5,000 miles per year. At the end of the two-year lease, the 10,000 miles on the car would amount to a $1,500 mileage charge. Breaking that cost out across the 24 months of the lease, the effective monthly payment would be around $121, twice the $59/mo advertised lease price. Even for a car that would be used sparingly, the numbers didn't add up, so the host wound up leasing a nicer car (that included a non-zero mileage allowance) for the same monthly cost.
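
If you'd rather let a few lines of Python do the back-of-the-envelope math, the figures above work out like this (using the per-mile rate and annual mileage assumed in the example):

# The lease math from the example above, spelled out.
advertised_monthly = 59.00    # advertised lease payment
per_mile_charge = 0.15        # assumed mileage rate
miles_per_year = 5000         # conservative annual mileage
lease_months = 24

mileage_cost = per_mile_charge * miles_per_year * (lease_months / 12)   # $1,500
effective_monthly = advertised_monthly + mileage_cost / lease_months    # ~$121.50

print(f"mileage charge over the lease: ${mileage_cost:,.2f}")
print(f"effective monthly payment:     ${effective_monthly:,.2f}")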

The "zero-down, $59/mo" Sentra lease would be a fantastic deal for a person who wants the peace of mind of having a car available for emergency situations only, but for drivers who put on the national average of 15,000 miles per year, the economic benefit of such a low lease rate is completely nullified by the mileage cost. If you were in the market to lease a new car, would you choose that Sentra deal?

At this point, you might be wondering why this story found its way onto the SoftLayer Blog. If you are, you haven't seen the connection yet: Most cloud computing providers sell cloud servers the same way that car lease was sold.

The "on demand" and "pay for what you use" aspects of cloud computing make it easy for providers to offer cloud servers exclusively as short-term utilities: "Use this cloud server for a couple of days (or hours) and return it to us. We'll just charge you for what you use." From a buyer's perspective, this approach is easy to justify because it limits the possibility of excess capacity — paying for something you're not using. While that structure is effective (and inexpensive) for customers who sporadically spin up virtual server instances and turn them down quickly, for the average customer looking to host a website or application that won't be turned off in a given month, it's a different story.

Instead of discussing the costs in theoretical terms, let's look at a real world example: One of our competitors offers an entry-level Linux cloud server for just over $15 per month (based on a 730-hour month). When you compare that offer to SoftLayer's least expensive monthly virtual server instance (@ $50/mo), you might think, "OMG! SoftLayer is more than three times as expensive!"

But then you remember that you actually want to use your server.

You see, like the "zero down, $59/mo" car lease that doesn't include any mileage, the $15/mo cloud server doesn't include any bandwidth. As soon as you "drive your server off the lot" and start using it, that "fantastic" rate starts becoming less and less fantastic. In this case, outbound bandwidth for this competitor's cloud server starts at $0.12/GB and is applied to the server's first outbound gigabyte (and every subsequent gigabyte in that month). If your server sends 300GB of data outbound every month, you pay $36 in bandwidth charges (for a combined monthly total of $51). If your server uses 1TB of outbound bandwidth in a given month, you end up paying $135 for that "$15/mo" server.

Cloud servers at SoftLayer are designed to be "driven." Every monthly virtual server instance from SoftLayer includes 1TB of outbound bandwidth at no additional cost, so if your cloud server sends 1TB of outbound bandwidth, your total charge for the month is $50. The "$15/mo v. $50/mo" comparison becomes "$135/mo v. $50/mo" when we realize that these cloud servers don't just sit in the garage. This illustration shows how the costs compare between the two offerings with monthly bandwidth usage up to 1.3TB*:
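
If you want to recreate the comparison yourself, here's a small sketch using the prices quoted above: $15/mo plus $0.12/GB outbound from the first gigabyte for the competitor, and $50/mo with 1TB of outbound bandwidth included for SoftLayer (the $0.10/GB overage rate is noted under the graphic below).

# Recreates the monthly cost comparison described above.
def competitor_monthly_cost(outbound_gb):
    # $15/mo base, plus $0.12/GB outbound starting with the first GB
    return 15.00 + 0.12 * outbound_gb

def softlayer_monthly_cost(outbound_gb):
    # $50/mo base with 1TB (1,000GB) of outbound bandwidth included,
    # then $0.10/GB beyond the allotment
    overage_gb = max(0, outbound_gb - 1000)
    return 50.00 + 0.10 * overage_gb

for gb in (0, 100, 300, 500, 1000, 1300):
    print(f"{gb:>5} GB out   competitor ${competitor_monthly_cost(gb):7.2f}"
          f"   SoftLayer ${softlayer_monthly_cost(gb):7.2f}")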

Cloud Cost v Bandwidth

*The graphic extends to 1.3TB to show how SoftLayer's $0.10/GB charge for bandwidth over the initial 1TB allotment compares with the competitor's $0.12/GB charge.

Most cloud hosting providers sell these "zero down, $59/mo car leases" and encourage you to window-shop for the lowest monthly price based on number of cores, RAM and disk space. You find the lowest price and mentally justify the cost-per-GB bandwidth charge you receive at the end of the month because you know that you're getting value from the traffic that used that bandwidth. But you'd be better off getting a more powerful server that includes a bandwidth allotment.

As a buyer, you should make your buying decisions based on your specific use case. Are you going to spin up and spin down instances throughout the month, or are you looking for a cloud server that is going to stay online the entire month? From there, you should estimate your bandwidth usage to get an idea of the actual monthly cost you can expect for a given cloud server. If you don't expect to use 300GB of outbound bandwidth in a given month, your usage might be best suited for that competitor's offering. But then again, it's probably worth mentioning that SoftLayer's base virtual server instance has twice the RAM, more disk space, and higher-throughput network connections than the competitor's offering we compared against. Oh yeah, and all those other cloud differentiators.

-@khazard

July 29, 2013

A Brief History of Cloud Computing

Believe it or not, "cloud computing" concepts date back to the 1950s when large-scale mainframes were made available to schools and corporations. The mainframe's colossal hardware infrastructure was installed in what could literally be called a "server room" (since the room would generally only be able to hold a single mainframe), and multiple users were able to access the mainframe via "dumb terminals" – stations whose sole function was to facilitate access to the mainframes. Due to the cost of buying and maintaining mainframes, an organization wouldn't be able to afford a mainframe for each user, so it became practice to allow multiple users to share access to the same data storage layer and CPU power from any station. By enabling shared mainframe access, an organization would get a better return on its investment in this sophisticated piece of technology.

Mainframe Computer

A couple decades later in the 1970s, IBM released an operating system called VM that allowed admins on their System/370 mainframe systems to have multiple virtual systems, or "Virtual Machines" (VMs) on a single physical node. The VM operating system took the 1950s application of shared access of a mainframe to the next level by allowing multiple distinct compute environments to live in the same physical environment. Most of the basic functions of any virtualization software that you see nowadays can be traced back to this early VM OS: Every VM could run custom operating systems or guest operating systems that had their "own" memory, CPU, and hard drives along with CD-ROMs, keyboards and networking, despite the fact that all of those resources would be shared. "Virtualization" became a technology driver, and it became a huge catalyst for some of the biggest evolutions in communications and computing.

Mainframe Computer

In the 1990s, telecommunications companies that had historically only offered single dedicated point-to-point data connections started offering virtualized private network connections with the same service quality as their dedicated services at a reduced cost. Rather than building out physical infrastructure to allow more users to have their own connections, telcos were able to provide users with shared access to the same physical infrastructure. This change allowed the telcos to shift traffic as necessary to allow for better network balance and more control over bandwidth usage. Meanwhile, virtualization for PC-based systems started in earnest, and as the Internet became more accessible, the next logical step was to take virtualization online.

If you were in the market to buy servers ten or twenty years ago, you know that the costs of physical hardware, while not at the same level as the mainframes of the 1950s, were pretty outrageous. As more and more people expressed demand to get online, the costs had to come out of the stratosphere, and one of the ways that was made possible was by ... you guessed it ... virtualization. Servers were virtualized into shared hosting environments, Virtual Private Servers, and Virtual Dedicated Servers using the same types of functionality provided by the VM OS in the 1970s. As an example of what that looked like in practice, let's say your company required 13 physical systems to run your sites and applications. With virtualization, you can take those 13 distinct systems and split them up between two physical nodes. Obviously, this kind of environment saves on infrastructure costs and minimizes the amount of actual hardware you would need to meet your company's needs.

Virtualization

As the costs of server hardware slowly came down, more users were able to purchase their own dedicated servers, and they started running into a different kind of problem: One server isn't enough to provide the resources I need. The market shifted from a belief that "these servers are expensive, let's split them up" to "these servers are cheap, let's figure out how to combine them." Because of that shift, the most basic understanding of "cloud computing" was born online. By installing and configuring a piece of software called a hypervisor across multiple physical nodes, a system would present all of the environment's resources as though those resources were in a single physical node. To help visualize that environment, technologists used terms like "utility computing" and "cloud computing" since the sum of the parts seemed to become a nebulous blob of computing resources that you could then segment out as needed (like telcos did in the 90s). In these cloud computing environments, it became easy to add resources to the "cloud": Just add another server to the rack and configure it to become part of the bigger system.

Clouds

As technologies and hypervisors got better at reliably sharing and delivering resources, many enterprising companies decided to start carving up the bigger environment to bring the cloud's benefits to users who don't happen to have an abundance of physical servers available to create their own cloud computing infrastructure. Those users could order "cloud computing instances" (also known as "cloud servers") by ordering the resources they need from the larger pool of available cloud resources, and because the servers are already online, the process of "powering up" a new instance or server is almost instantaneous. Because little overhead is involved for the owner of the cloud computing environment when a new instance is ordered or cancelled (since it's all handled by the cloud's software), management of the environment is much easier. Most companies today operate with this idea of "the cloud" as the current definition, but SoftLayer isn't "most companies."

SoftLayer took the idea of a cloud computing environment and pulled it back one more step: Instead of installing software on a cluster of machines to allow for users to grab pieces, we built a platform that could automate all of the manual aspects of bringing a server online without a hypervisor on the server. We call this platform "IMS." What hypervisors and virtualization do for a group of servers, IMS does for an entire data center. As a result, you can order a bare metal server with all of the resources you need and without any unnecessary software installed, and that server will be delivered to you in a matter of hours. Without a hypervisor layer between your operating system and the bare metal hardware, your servers perform better. Because we automate almost everything in our data centers, you're able to spin up load balancers and firewalls and storage devices on demand and turn them off when you're done with them. Other providers have cloud-enabled servers. We have cloud-enabled data centers.

SoftLayer Pod

IBM and SoftLayer are leading the drive toward wider adoption of innovative cloud services, and we have ambitious goals for the future. If you think we've come a long way from the mainframes of the 1950s, you ain't seen nothin' yet.

-James

July 16, 2013

Riak Performance Analysis: Bare Metal v. Virtual

In December, I posted a MongoDB performance analysis that showed the quantitative benefits of using bare metal servers for MongoDB workloads. It should come as no surprise that in the wake of SoftLayer's Riak launch, we've got some similar data to share about running Riak on bare metal.

To run this test, we started by creating five-node clusters with Riak 1.3.1 on SoftLayer bare metal servers and on a popular competitor's public cloud instances. For the SoftLayer environment, we created these clusters using the Riak Solution Designer, so the nodes were all provisioned, configured and clustered for us automatically when we ordered them. For the public cloud virtual instance Riak cluster, each node was provisioned individually using a Riak image template and manually configured into a cluster after all had come online. To optimize for Riak performance, I made a few tweaks at the OS level of our servers (running CentOS 64-bit):

noatime
nodiratime
barrier=0
data=writeback
ulimit -n 65536

The common noatime and nodiratime mount options eliminate the metadata writes that normally happen during reads, which helps performance and reduces disk wear. The barrier and writeback settings are a little less common and may not be what you'd normally set. Although those settings present a very slight risk of data loss on disk failure, remember that the Riak solution is deployed in five-node rings with data redundantly available across multiple nodes in the ring. With that in mind, and considering that each node is also deployed with a RAID10 storage array, the minor risk of data loss on the failure of a single disk would have no impact on the data set as a whole (there are plenty of redundant copies of that data available). Given the minor risk involved, the performance increases of those two settings justify their use.

With all of the nodes tweaked and configured into clusters, we set up Basho's test harness — Basho Bench — to remotely simulate load on the deployments. Basho Bench allows you to create a configurable test plan for a Riak cluster by configuring a number of workers to utilize a driver type to generate load. It comes packaged as an Erlang application with a config file example that you can alter to create the specifics for the concurrency, data set size, and duration of your tests. The results can be viewed as CSV data, and there is an optional graphics package that allows you to generate the graphs that I am posting in this blog. A simplified graphic of our test environment would look like this:

Riak Test Environment

The following Basho Bench config is what we used for our testing:

{mode, max}.
{duration, 120}.
{concurrent, 8}.
{driver, basho_bench_driver_riakc_pb}.
{key_generator,{int_to_bin,{uniform_int,1000000}}}.
{value_generator,{exponential_bin,4098,50000}}.
{riakc_pb_ips, [{10,60,68,9},{10,40,117,89},{10,80,64,4},{10,80,64,8},{10,60,68,7}]}.
{riakc_pb_replies, 2}.
{operations, [{get, 10},{put, 1}]}.

To spell it out a little more simply:

Tests Performed

Data Set: 400GB
10:1 Query-to-Update Operations
8 Concurrent Client Connections
Test Duration: 2 Hours

You may notice that in the test cases that use SoftLayer "Medium" servers, the virtual provider nodes are running 26 virtual compute units against our dual-proc, hex-core servers (12 cores total). In testing with Riak, memory is more important to the operations than CPU resources, so we provisioned the virtual instances to align with the 36GB of memory in each of the "Medium" SoftLayer servers. In the public cloud environment, the higher level of RAM was restricted to packages with higher CPU counts, so while the CPU counts differ, the RAM amounts are as close to even as we could make them.

One final "housekeeping" note before we dive into the results: The graphs below are pulled directly from the optional graphics package that displays Basho Bench results. You'll notice that the scale on the left-hand side of the graphs differs dramatically between the two environments, so a cursory look at the results might not tell the whole story. Click any of the graphs below for a larger version. At the end of each test case, we'll share a few observations about the operations per second and latency results from each test. When we talk about latency in the "key observation" sections, we'll talk about the 99th percentile line — 99% of the results had latency below this line. More simply, you could say, "This is the highest latency we saw on this platform in this test." The primary reason we're focusing on this line is that it's much easier to read on the graphs than the mean/median lines in the bottom graphs.

Riak Test 1: "Small" Bare Metal 5-Node Cluster vs Virtual 5-Node Cluster

Servers

SoftLayer Small Riak Server Node
  • Single 4-core Intel 1270 CPU
  • 64-bit CentOS
  • 8GB RAM
  • 4 x 500GB SATAII – RAID10
  • 1Gb Bonded Network

Virtual Provider Node
  • 4 Virtual Compute Units
  • 64-bit CentOS
  • 7.5GB RAM
  • 4 x 500GB Network Storage – RAID10
  • 1Gb Network
 

Results

Riak Performance Analysis

Riak Performance Analysis

Key Observations

The SoftLayer environment showed much more consistency in operations per second, with an average throughput around 450 Op/sec. The virtual environment throughput varied significantly, from about 50 operations per second to more than 600 operations per second, with the trend line fluctuating between about 220 Op/sec and 350 Op/sec.

Comparing the latency of get and put requests, the 99th percentile of results in the SoftLayer environment stayed around 50ms for gets and under 200ms for puts, while the same metric for the virtual environment hovered around 800ms for gets and 4,000ms for puts. The scale of the graphs is drastically different, so if you aren't looking closely, you won't see how significantly the performance varies between the two.

Riak Test 2: "Medium" Bare Metal 5-Node Cluster vs Virtual 5-Node Cluster

Servers

SoftLayer Medium Riak Server Node
  • Dual 6-core Intel 5670 CPUs
  • 64-bit CentOS
  • 36GB RAM
  • 4 x 300GB 15K SAS – RAID10
  • 1Gb Network – Bonded

Virtual Provider Node
  • 26 Virtual Compute Units
  • 64-bit CentOS
  • 30GB RAM
  • 4 x 300GB Network Storage
  • 1Gb Network
 

Results

Riak Performance Analysis

Riak Performance Analysis

Key Observations

Similar to the results of Test 1, the throughput numbers from the bare metal environment are more consistent (and are consistently higher) than the throughput results from the virtual instance environment. The SoftLayer environment performed between 1500 and 1750 operations per second on average while the virtual provider environment averaged around 1200 operations per second throughout the test.

The latency of get and put requests in Test 2 also paints a similar picture to Test 1. The 99th percentile of results in the SoftLayer environment stayed below 50ms for gets and under 400ms for puts, while the same metric for the virtual environment averaged about 250ms for gets and over 1,000ms for puts. Latency in a big data application can be a killer, so the results from the virtual provider might be setting off alarm bells in your head.

Riak Test 3: "Medium" Bare Metal 5-Node Cluster vs Virtual 5-Node Cluster

Servers

SoftLayer Medium Riak Server Node
  • Dual 6-core Intel 5670 CPUs
  • 64-bit CentOS
  • 36GB RAM
  • 4 x 128GB SSD – RAID10
  • 1Gb Network – Bonded

Virtual Provider Node
  • 26 Virtual Compute Units
  • 64-bit CentOS
  • 30GB RAM
  • 4 x 300GB Network Storage
  • 1Gb Network
 

Results

Riak Performance Analysis

Riak Performance Analysis

Key Observations

In Test 3, we're using the same specs in our virtual provider nodes, so the results for the virtual node environment are the same in Test 3 as they are in Test 2. In this test, the SoftLayer environment substitutes SSDs for the 15K SAS drives used in Test 2, and the throughput numbers show the impact of that improved I/O. The average throughput of the bare metal environment with SSDs is between 1,750 and 2,000 operations per second. Those numbers are slightly higher than the SoftLayer results in Test 2, further distancing the bare metal results from the virtual provider results.

The latency of gets for the SoftLayer environment is very difficult to see in this graph because the latency was so low throughout the test. The 99th percentile of puts in the SoftLayer environment settled between 500ms and 625ms, which was a little higher than the bare metal results from Test 2 but still well below the latency from the virtual environment.

Summary

The results show that — similar to the majority of data-centric applications that we have tested — Riak has more consistent, better performing, and lower latency results when deployed onto bare metal instead of a cluster of public cloud instances. The stark differences in consistency of the results and the latency are noteworthy for developers looking to host their big data applications. We compared the 99th percentile of latency, but the mean/median results are worth checking out as well. Look at the mean and median results from the SoftLayer SSD Node environment: For gets, the mean latency was 2.5ms and the median was somewhere around 1ms. For puts, the mean was between 7.5ms and 11ms and the median was around 5ms. Those kinds of results are almost unbelievable (and that's why I've shared everything involved in completing this test so that you can try it yourself and see that there's no funny business going on).

It's commonly understood that local, single-tenant resources like bare metal will always outperform shared network storage resources, but putting some concrete numbers on paper makes the difference in performance pretty amazing. Virtualizing on multi-tenant solutions with network-attached storage often introduces latency issues, and performance will vary significantly depending on host load. These results may seem obvious, but sometimes the promise of quick and easy deployments on public cloud environments can lure even the sanest and most rational developer. Some applications are suited for the public cloud, but big data isn't one of them. When you have data-centric apps that require extreme I/O traffic to your storage medium, nothing beats local, high-performance resources.

-Harold

September 24, 2012

Cloud Computing is not a 'Thing' ... It's a way of Doing Things.

I like to think that we are beyond 'defining' cloud, but what I find in reality is that we still argue over basics. I have conversations in which people still delineate things like "hosting" from "cloud computing" based on degrees of single-tenancy. Now I'm a stickler for definitions just like the next pedantic software-religious guy, but when it comes to arguing minutiae about cloud computing, it's easy to lose the forest for the trees. Instead of discussing underlying infrastructure and comparing hypervisors, we'll look at two well-cited definitions of cloud computing that may help us unify our understanding of the model.

I use the word "model" intentionally there because it's important to note that cloud computing is not a "thing" or a "product." It's a way of doing business. It's an operations model that is changing the fundamental economics of writing and deploying software applications. It's not about a strict definition of some underlying service provider architecture or whether multi-tenancy is at the data center edge, the server or the core. It's about enabling new technology to be tested and fail or succeed in blazing calendar time and being able to support super-fast growth and scale with little planning. Let's try to keep that in mind as we look at how NIST and Gartner define cloud computing.

The National Institute of Standards and Technology (NIST) is a government organization that develops standards, guidelines and minimum requirements as needed by industry or government programs. Given the confusion in the marketplace, there's a huge "need" for a simple, consistent definition of cloud computing, so NIST had a pretty high profile topic on its hands. Their resulting Cloud Computing Definition describes five essential characteristics of cloud computing, three service models, and four deployment models. Let's table the service models and deployment models for now and look at the five essential characteristics of cloud computing. I'll summarize them here; follow the link if you want more context or detail on these points:

  • On-Demand Self Service: A user can automatically provision compute without human interaction.
  • Broad Network Access: Capabilities are available over the network.
  • Resource Pooling: Computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned.
  • Rapid Elasticity: Capabilities can be elastically provisioned and released.
  • Measured Service: Resource usage can be monitored, controlled and reported.

The characteristics NIST uses to define cloud computing are pretty straightforward, but they are still a little ambiguous: How quickly does an environment have to be provisioned for it to be considered "on-demand?" If "broad network access" could just mean "connected to the Internet," why include that as a characteristic? When it comes to "measured service," how granular does the resource monitoring and control need to be for something to be considered "cloud computing?" A year? A minute? These characteristics cast a broad net, and we can build on that foundation as we set out to create a more focused definition.

For our next stop, let's look at Gartner's view: "A style of computing in which scalable and elastic IT-enabled capabilities are delivered as a service using Internet infrastructure." From a philosophical perspective, I love their use of "style" when talking about cloud computing. Little differentiates the underlying IT capabilities of cloud computing from other types of computing, so when looking at cloud computing, we really just see a variation on how those capabilities are being leveraged. It's important to note that Gartner's definition includes "elastic" alongside "scalable" ... Cloud computing gets the most press for being able to scale remarkably, but the flip-side of that expansion is that it also needs to contract on-demand.

All of this describes a way of deploying compute power that is completely different from the way we've done it for the decades we've been writing software. It used to take months to get funding and order the hardware to deploy an application. That's a lot of time and risk that startups and enterprises alike can now erase from their business plans.

How do we wrap all of those characteristics up into a unified definition of cloud computing? The way I look at it, cloud computing is an operations model that yields seemingly unlimited compute power when you need it. It enables (scalable and elastic) capacity as you need it, and that capacity's pricing is based on consumption. That doesn't mean a provider should charge by the compute cycle, generator fan RPM or some other arcane measurement of usage ... It means that a customer should understand the resources that are being invoiced, and he/she should have the power to change those resources as needed. A cloud computing environment has to have self-service provisioning that doesn't require manual intervention from the provider, and I'd even push that requirement a little further: A cloud computing environment should have API accessibility so a customer doesn't even have to manually intervene in the provisioning process (the customer's app could use automated logic and API calls to scale infrastructure up or down based on resource usage).
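
To put a little code behind that last point, here's a rough sketch of the kind of automated logic a customer's app could run: watch a utilization metric and call a provisioning API to add or remove capacity. The client object, method names, and thresholds below are stand-ins, not any specific provider's API.

# A sketch of self-service, consumption-based scaling driven by an API.
# FakeCloud is an in-memory stand-in for a real provisioning client.

SCALE_UP_AT = 0.80      # add capacity above 80% average utilization
SCALE_DOWN_AT = 0.30    # release capacity below 30% average utilization
MIN_NODES, MAX_NODES = 2, 10

def autoscale(cloud, average_load):
    """Order or cancel capacity based on a utilization reading (0.0-1.0)."""
    nodes = cloud.list_nodes()
    if average_load > SCALE_UP_AT and len(nodes) < MAX_NODES:
        cloud.order_node(cores=4, ram_gb=8)    # self-service, no human in the loop
    elif average_load < SCALE_DOWN_AT and len(nodes) > MIN_NODES:
        cloud.cancel_node(nodes[-1])           # stop paying for idle capacity

class FakeCloud:
    """In-memory stand-in so the sketch runs; swap in a real API client."""
    def __init__(self):
        self.nodes = ["node-1", "node-2"]
    def list_nodes(self):
        return list(self.nodes)
    def order_node(self, cores, ram_gb):
        self.nodes.append(f"node-{len(self.nodes) + 1}")
    def cancel_node(self, name):
        self.nodes.remove(name)

cloud = FakeCloud()
autoscale(cloud, average_load=0.92)   # heavy load: a node gets ordered
autoscale(cloud, average_load=0.10)   # light load: a node gets cancelled
print(cloud.list_nodes())             # back to the two-node baseline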

I had the opportunity to speak at Cloud Connect Chicago, and I shared SoftLayer's approach to cloud computing and how it has evolved into a few distinct products that speak directly to our customers' needs:

The session was about 45 minutes, so the video above has been slimmed down a bit for easier consumption. If you're interested in seeing the full session and getting into a little more detail, we've uploaded an un-cut version here.

-Duke

September 12, 2012

How Can I Use SoftLayer Message Queue?

One of the biggest challenges developers run into when coding large, scalable systems is automating batch processes and distributing workloads to optimize compute resource usage. More simply, intra-application and inter-system communications tend to become a bottleneck that affects the user experience, and there is no easy way to get around it. Well ... There *was* no easy way around it.

Meet SoftLayer Message Queue.

As the name would suggest, Message Queue allows you to create one or more "queues" or containers which contain "messages" — strings of text that you can assign attributes to. The queues pass along messages in first-in-first-out order, and in doing so, they allow for parallel processing of high-volume workflows.

That all sounds pretty complex and "out there," but you might be surprised to learn that you're probably using a form of message queuing right now. Message queuing allows for discrete threads or applications to share information with one another without needing to be directly integrated or even operating concurrently. That functionality is at the heart of many of the most common operating systems and applications on the market.

What does it mean in a cloud computing context? Well, Message Queue facilitates more efficient interaction between different pieces of your application or independent software systems. The easiest way to demonstrate how that happens is by sharing a quick example:

Creating a Video-Sharing Site

Let's say we have a mobile application providing the ability to upload video content to your website: sharevideoswith.phil. The problem we have is that our webserver and CMS can only share videos in a specific format from a specific location on a CDN. Transcoding the videos on the mobile device before it uploads proves to be far too taxing, what with all of the games left to complete from the last Humble Bundle release. Having the videos transcoded on our webserver would require a lot of time/funds/patience/knowledge, and we don't want to add infrastructure to our deployment for transcoding app servers, so we're faced with a conundrum. A conundrum that's pretty easily answered with Message Queue and SoftLayer's (free) video transcoding service.

What We Need

  • Our Video Site
  • The SoftLayer API Transcoding Service
  • SoftLayer Object Storage
    • A "New Videos" Container
    • A "Transcoded Videos" Container with CDN Enabled
  • SoftLayer Message Queue
    • "New Videos" Queue
    • "Transcoding Jobs" Queue

The Process

  1. Your user uploads the video to sharevideoswith.phil. Your web app creates a page for the video and populates the content with a "processing" message.
  2. The web application saves the video file into the "New Videos" container on object storage.
  3. When the video is saved into that container, the web application creates a new message in the "New Videos" message queue with the video file name as the body.
  4. From here, we have two worker functions. These workers work independently of each other and can be run at any comfortable interval via cron or any scheduling agent:
Worker One: Looks for messages in the "New Videos" message queue. If a message is found, Worker One transfers the video file to the SoftLayer Transcoding Service, starts the transcoding process and creates a message in the "Transcoding Jobs" message queue with the Job ID of the newly created transcoding job. Worker One then deletes the originating message from the "New Videos" message queue to prevent the process from happening again the next time Worker One runs.

Worker Two: Looks for messages in the "Transcoding Jobs" queue. If a message is found, Worker Two checks whether the transcoding job is complete. If not, it does nothing with the message, and that message is placed back into the queue for the next run of Worker Two to pick up and check. When Worker Two finds a completed job, the newly transcoded video is pushed to the "Transcoded Videos" container on object storage, and Worker Two updates the page our web app created for the video to display an embedded media player using the CDN location of our transcoded video on object storage.

Each step in the process is handled by an independent component. This allows us to scale or substitute each piece as necessary without needing to refactor the other portions. As long as each piece receives and sends the expected message, its colleague components will keep doing their jobs.
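
To make the shape of those workers a little more tangible, here's a stripped-down Python sketch. The queue, storage, transcoder, and web_app objects are hypothetical wrappers around the real Message Queue, Object Storage, transcoding, and CMS APIs; the point is the pattern: poll a queue, do one unit of work, hand off a message, and delete the one you consumed.

# Hypothetical sketch of Worker One and Worker Two from the process above.
# queue, storage, transcoder, and web_app stand in for real API clients.

def worker_one(queue, storage, transcoder):
    """Pick up newly uploaded videos and kick off transcoding jobs."""
    for message in queue.pop("New Videos", limit=5):
        filename = message.body
        video = storage.get("New Videos", filename)
        job_id = transcoder.start(video)                           # begin transcoding
        queue.push("Transcoding Jobs", body=f"{job_id}|{filename}")
        queue.delete("New Videos", message.id)                     # don't process it twice

def worker_two(queue, storage, transcoder, web_app):
    """Publish videos whose transcoding jobs have finished."""
    for message in queue.pop("Transcoding Jobs", limit=5):
        job_id, filename = message.body.split("|", 1)
        if not transcoder.is_complete(job_id):
            continue                                  # message stays queued for the next run
        transcoded = transcoder.fetch_result(job_id)
        cdn_url = storage.put("Transcoded Videos", filename, transcoded)  # CDN-enabled container
        web_app.update_video_page(filename, cdn_url)  # swap "processing" for a media player
        queue.delete("Transcoding Jobs", message.id)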

Video transcoding is a simple use-case that shows some of the capabilities of Message Queue. If you check out the Message Queue page on our website, you can see a few other examples — from online banking to real-time stock, score and weather services.

Message Queue leverages Cloudant as the highly scalable low latency data layer for storing and distributing messages, and SoftLayer customers get their first 100,000 messages free every month (with additional messages priced at $0.01 for every 10,000).

What are you waiting for? Go get started with Message Queue!

-Phil (@SoftLayerDevs)
