Technology Posts

September 2, 2015

Backup and Restore in a Cloud and DevOps World

Virtualization has brought many improvements to the compute infrastructure, including snapshots and live migration1. When an infrastructure moves to the cloud, these options often become a client’s primary backup strategy. While snapshots and live migration are also part of a successful strategy, backing up on the cloud may need additional tools.

First, a basic question: Why do we take backups? They’re taken to recover from

  • The loss of an entire machine
  • Partially corrupted files
  • A complete data loss (either through hardware or human error)

While losing an entire machine is frightening, corrupted files or data loss are the more common reasons for data backups.

Snapshots are useful when the snapshot and restore occur in close proximity to each other, e.g., when you’re migrating middleware or an operating system and want to fall back quickly if something goes wrong. If you need to restore after extensive changes (hardware or data), a snapshot isn’t an adequate resource. The restore may require restoring to a new machine, selecting files to be restored, and moving data back to the original machine.

So if a snapshot isn’t the silver bullet for backing up in the cloud, what are the effective backup alternatives? The solution needs to handle a full system loss, partial data loss, or corruption, and ideally work for both virtualized and non-virtualized environments.

What to back up

There are three types of files that you’ll want to consider when backing up an active machine’s disks:

  • Binary files: Changed by operating system and middleware updates; can be easily stored and recovered.
  • Configuration files: Defined by how the binary files are connected, configured, and what data is accessible to them.
  • Data files: Generated by users and unrecoverable if not backed up. Data files are the most precious part of the disk content and losing them may result in a financial impact on the client’s business.

Keep in mind when determining your backup strategy that each file type has a different change rate—data files change faster than configuration files, which are more fluid than binary files. So, what are your options for backing up and restoring each type of file?

Binary files
In the case of a system failure, DevOps advocates (see Phoenix Servers from Martin Fowler) propose getting a new machine, which all cloud providers can automatically provision, including middleware. Automated provisioning processes are available for both bare metal and virtual machines.

Note that most Open Source products only require an Internet connection and a single command line for installation, while commercial products can be provisioned through automation.

Configuration files
Cloud-centric operations have a distinct advantage over traditional operations when it comes to backing up configuration files. With traditional operations, each element is configured manually, which has several drawbacks such as being time-consuming and error-prone. Cloud-centric operations, or DevOps, treat each configuration as code, which allows an environment to be built from a source configuration via automated tools and procedures. Tools such as Chef, Puppet, Ansible, and SaltStack show their power with central configuration repositories that are used to drive the composition of an environment. A central repository works well with another component of automated provisioning—changing the IP address and hostname.

You have limited control of how the cloud will allocate resources, so you need an automated method to collect the information and apply it to all the machines being provisioned.

In a cloud context, it’s suboptimal to manage machines individually; instead, the machines have to be seen as part of a cluster of servers, managed via automation. Cluster automation is one the core tenants of solutions like CoreOS’ Fleet and Apache Mesos. Resources are allocated and managed as a single entity via API, configuration repositories, and automation.

You can attain automation in small steps. Start by choosing an automation tool and begin converting your existing environment one file at a time. Soon, your entire configuration is centrally available and recovering a machine or deploying a full environment is possible with a single automated process.

In addition to being able to quickly provision new machines with your binary and configuration files, you are also able to create parallel environments, such as disaster recovery, test and development, and quality assurance. Using the same provisioning process for all of your environments assures consistent environments and early detection of potential production problems. Packages, binaries, and configuration files can be treated as data and stored in something similar to object stores, which are available in some form with all cloud solutions.

Data files
The final files to be backed up and restored are the data files. These files are the most important part of a backup and restore and the hardest ones to replace. Part of the challenge is the volume of data as well as access to it. Data files are relatively easy to back up; the exception being files that are in transition, e.g., files being uploaded. Data file backups can be done with several tools, including synchronization tools or a full file backup solution. Another option is object stores, which is the natural repository for relatively static files, and allows for a pay–as-you-go model.

Database content is a bit harder to back up. Even with instant snapshots on storage, backing up databases can be challenging. A snapshot at the storage level is an option, but it doesn’t allow for a partial database restore. Also, a snapshot can capture inflight transactions that can cause issues during a restore; which is why most database systems provide a mechanism for online backups. The online backups should be leveraged in combination with tools for file backups.

Something to remember about databases: many solutions end up accumulating data even after the data is no longer used by users. The data within an active database includes data currently being used and historical data. Having current and historical data allows for data analytics on the same database, it also increases the size of the database, making database-related operations harder. It may make sense to archive older data in either other databases or flat files, which makes the database volumes manageable.


To recap, because cloud provides rapid deployment of your operating system and convenient places to store data (such as object stores), it’s easy to factor cloud into your backup and recovery strategy. By leveraging the containerization approach, you should split the content of your machines—binary, configuration, and data. Focus on automating the deployment of binaries and configuration; it allows easier delivery of an environment, including quality assurance, test, and disaster recovery. Finally, use traditional backup tools for backing up data files. These tools make it possible to rapidly and repeatedly recover complete environments while controlling the amount of backed up data that has to be managed.


1 Snapshots are not available on bare metal servers that have no virtualization capability.

August 12, 2015

Network Performance 101: What is latency, and why does it matter?

We’ve all been there. Waiting for a web page to load can be so frustrating that we end up just closing out. You might ask yourself, “Hey, I have high-speed Internet. Why is this happening to me?” Well, there are a lot of factors outside your control that … control page loads. And whether you have an online store, run big data solutions, or have your employees set up on a network accessing files around the world, you never want to hear that your data, consumer products, information, or otherwise, is keeping you from a sale or slowing down employee productivity because of slow data transfer.

So why are some pages so much slower to load than others?
It could be that poorly written code or large images are slowing the load on the backend, but slow page loads can also be caused by network latency. This might sound elementary, but data is not just floating out there in some non-physical Internet space. In reality, data is stored on hard drives … somewhere. Network connectivity provides a path for that data to travel to end users around the world, and that connectivity can vary significantly—depending on how far it’s going, how many times the data has to hop between service providers, how much bandwidth is available along the way, the other data traveling across the same path, and a number of other variables.

The measurement of how quickly data travels between two connected points is called network latency. Network latency is an expression of the amount of time it takes a packet of data to get from one place to another.

Understanding Network Latency
Theoretically, data can travel at the speed of light across optical fiber network cables, but in practice, data typically travels slower than light due to the variables we referenced in the previous section. If a network connection doesn’t have any available bandwidth capacity, data might temporarily queue up to wait for its turn to travel across the line. If a service provider’s network doesn’t route a network path optimally, data could be sent hundreds or thousands of miles away from the destination in the process of routing to the destination. These kinds of delays and detours lead to higher network latency, which lead to slower page loads and download speeds.

We express network latency in milliseconds (that’s 1,000 milliseconds per second), and while a few thousandths of a second may not mean much to us as we’re living our daily lives, those milliseconds are often the deciding factors for whether we stay on a webpage or give up and try another site. As consumers of high-speed Internet, we like what we like, and we want what we want when we want it. In the financial sector, milliseconds can mean billions of dollars in gains or losses from trade transactions on a day-to-day basis.

Logical conclusion: Everyone wants the lowest network latency to the greatest number of users.

Common Approaches to Minimize Network Latency
If our shared goal is to minimize latency for our data, the most common approaches to addressing network latency involve limiting the number of potential variables that can impact the speed of data’s movement. While we don’t have complete control over how our data travels across the Internet, we can do a few things to keep our network latency in line:

  • Distribute data around the world: Users in different locations can pull data from a location that’s geographically close to them. Because the data is closer to the users, it is handed off fewer times, it has a shorter distance to travel, and inefficient routing is less likely to cause a significant performance impact.
  • Provision servers with high-capacity network ports: Huge volumes of data can travel to and from the server every second. If packets are delayed due to fully saturated ports, milliseconds of time pass, pages load slower, download speeds drop, and users get unhappy.
  • Understand how your providers route traffic: When you know how your data is transferred to users around the world, you can make better decisions about where you host your data.

How SoftLayer Minimizes Network Latency
To minimize latency, we took a unique approach to building our network. All of our data centers are connected to network points of presence. All of our network points of presence are connected to each other via our global backbone network. And by maintaining our own global backbone network, our network operations team is able to control network paths and data handoffs much more granularly than if we relied on other providers to move data between geographies.

SoftLayer Private Network

For example, if a user in Berlin wants to watch a cat video hosted on a SoftLayer server in Dallas, the packets of data that make up that cat video will travel across our backbone network (which is exclusively used by SoftLayer traffic) to Frankfurt, where the packets would be handed off to one of our peering or transit public network partners to get to the user in Berlin.

Without a global backbone network, the packets would be handed off to a peering or transit public network provider in Dallas, and that provider would route the packets across its network and/or hand the packets off to another provider at a network hop, and the packets would bounce their way to Germany. It’s entirely possible that the packets could get from Dallas to Berlin with the same network latency with or without the global backbone network, but without the global backbone network, there are a lot more variables.

In addition to building a global backbone network, we also segment public, private, and management traffic onto different network ports so that different types of traffic can be transferred without interfering with each other.

SoftLayer Private Network

But at the end of the day, all of that network planning and forethought doesn’t amount to a hill of beans if you can’t see the results for yourself. That’s why we put speed tests on our website so you can check out our network yourself (for more on speed tests, check out this blog post).

TL;DR: Network Latency
Your users want your data as quickly as you can get it to them. The time it takes for your data to get to them across the Internet is called network latency. The more control you (or your provider) have over your data’s network path, the more consistent (and lower) your network latency will be.

Stay tuned. Next month we will be discussing Network Performance 101: Security, where we’ll discuss all things cloud security—including answering your burning questions: Can other people see or access my data in a public cloud? Is my data more prone to hackers? And, what safeguards do SoftLayer have in place to protect data?


July 10, 2015

GPU Accelerated Supercomputing Comes to IBM’s SoftLayer Cloud Service

NVIDIA GPU technology powers some of the world’s fastest supercomputers. In fact, GPU technology is at the heart of the current #1 U.S. system, Titan, located at Oak Ridge National Labs. It will also be an important part of Titan’s forthcoming successor, Summit, an advanced new supercomputer based on next-generation, ultra-high performance GPU-accelerated OpenPOWER servers.

But, not everyone has access to these monster machines for their high-performance computing, deep learning, and scientific computing work. That’s why NVIDIA is working with IBM to make supercomputing-class technology more accessible to reserachers, engineers, developers, and other HPC users.

IBM Cloud announced earlier this week that NVIDIA Tesla K80 dual-GPU accelerators are now available on SoftLayer bare metal cloud servers. The team worked closely together to test and tune the speedy delivery of NVIDIA Tesla K80 enabled servers. The Tesla K80 GPU accelerators are the flagship technology of our Tesla Accelerated Computing Platform, delivering 10 times higher performance than today’s fastest CPU for a range of deep learning, data analytics and HPC applications.

Bringing Tesla K80 GPUs to SoftLayer means that more researchers and engineers worldwide will have access to ultra-high computing resources – without having to deal with the cost and time commitment of purchasing and maintaining their own HPC clusters. On-demand high performance computing can now be delivered in a matter of hours instead of the weeks or months it takes to build and deploy a dedicated system. Never before has bare-metal compute infrastructure been so agile. Fully populated Tesla K80 GPU nodes can be provisioned and used in two to four hours. Then, they can be de-provisioned or reassigned just as quickly.

With support for GPU accelerators, SoftLayer is providing full-scale data center resources for users to build a compute cluster, burst an existing cluster, or launch a compute intensive project—all on easy to use, cost effective, and easily accessible cloud infrastructure.

The strength of SoftLayer’s API and the experience of IBM Cloud make it easy for users to provision and reclaim resources to enable true cloud bursting for compute clusters, and controlling resources is key to controlling costs.

We’re delighted to expand the reach of GPU-accelerated computing broader than ever before. For more info on IBM Cloud’s GPU offerings on SoftLayer or to sign up, visit


Michael O’Neill is an established leader for NVIDIA. He provides specialized strategic thought leadership and technical guidance to customers on NVIDIA GRID and Tesla GPUs in virtualized environments. He works closely with business leaders to develop innovative solutions for graphical and compute heavy workloads. With over twenty years of experience in planning, developing, and implementing state of the art information systems, he has built a significant body of work empowering people to live, work and collaborate from anywhere on any device. His guidance has provided Fortune 500 companies with cloud computing solutions to help IT and service providers build private, hybrid and public clouds to deliver high-performance, elastic and cost-effective services for mobile workstyles.

June 22, 2015

3 Reasons Citrix NetScaler Should Be in Your PCI DSS Compliant Application Stack at SoftLayer

Whether you already process credit card information or are just starting to consider it, you’ve likely made yourself familiar with the Payment Card Industry Data Security Standard (PCI-DSS). The PCI-DSS’s 12 requirements (plus one appendix for service providers) outlines what you need to do to have a compliant workload and to pass your audits.

While SoftLayer handles the physical access and security aspects on our platform, we also offer tools to supplement your internal tools and processes to help you maintain PCI-DSS compliance such as the Citrix NetScaler VPX and MPX Platinum Edition product line.

Unique Features NetScaler Offers That Support PCI-DSS

  1. Mask Payment Account Numbers (PANs)
  2. With NetScaler Platinum Edition it’s possible to configure the device to block or mask PANs to prevent leakage of cardholder data—even if your application is attempting to present the data to a user. This is extremely useful when adhering to PCI-DSS Section 3.3—the first six and last four digits are the maximum number of digits to be displayed.

    NetScaler provides reporting as well so that your developers can tighten up that aspect of your application for more identification protection.

  3. Detect and Prevent Web-based Attacks
  4. By deploying a Web application firewall into your application stack, you can fully comply with PCI-DSS Section 6.6, which requires addressing new threats and vulnerabilities on an ongoing basis and ensuring these applications are protected against known attacks. The NetScaler Application Firewall module included in Platinum Edition provides continuous protection and can dynamically adjust to changes in your application code.

  5. Prevent Buffer Overflow, XML Security, Cross Site Scripting, & SQL Injection
  6. The NetScaler Web Application Firewall helps close the door on many common coding vulnerabilities outlined in PCI-DSS Section 6.5. By utilizing XML security protections, form tagging, dynamic context sensitive protections, and deep stream inspection, you can block, log, and report on these common security vectors and ensure your development team can shore up you applications

How to Order
SoftLayer offers Citrix NetScaler VPX Standard and Platinum Editions in multiple bandwidth packages—10Mbps, 200Mbps, and 1Gbps. Order these quickly and easily from your customer portal devices page (click order devices, scroll to networking devices, and select Citrix NetScaler).

SoftLayer also provides the NetScaler MPX for customers that require a dedicated hardware appliance running the NetScaler OS that can handle thousands of concurrent SSL transactions. To order the MPX product, chat with one of our sales advisors.

Be sure to take a look at some of the other features included with Citrix NetScaler.

Learn More About PCI-DSS
SoftLayer supports PCI workloads by providing the physical security required in the DSS. Within the customer portal you’re able to pull our most recent SOC 2 Type II audit report. You can use this as part of your compliance strategy. The rest is up to you to take advantage of the tools and services to make sure you meet the remaining PCI standards. Additionally, when you’re working with your PCI-DSS qualified security assessor, we can also provide an Attestation of Compliance.

For more information on compliance standards, check out


May 14, 2015

Update - VENOM Vulnerability

Yesterday, a security advisory designated CVE-2015-3456 / XSA-133 was publicly announced. The advisory identified a vulnerability, which has become commonly known as "VENOM", through which an attacker could exploit floppy driver support in QEMU to escalate their privileges.

SoftLayer engineers, in concert with our technology partners, completed a deep analysis of the vulnerability and determined that SoftLayer virtual servers are not affected by this issue.

We're always committed to ensuring our customers' operations and data are well protected. If customers have any questions or concerns, don't hesitate to reach out to SoftLayer support or your direct SoftLayer contacts.


March 30, 2015

The Importance of Data's Physical Location in the Cloud

If top-tier cloud providers use similar network hardware in their data centers and connect to the same transit and peering bandwidth providers, how can SoftLayer claim to provide the best network performance in the cloud computing industry?

Over the years, I've heard variations of that question asked dozens of times, and it's fairly easy to answer with impressive facts and figures. All SoftLayer data centers and network points of presence (PoPs) are connected to our unique global network backbone, which carries public, private, and management traffic to and from servers. Using our network connectivity table, some back-of-the-envelope calculations reveal that we have more than 2,500Gbps of bandwidth connectivity with some of the largest transit and peering bandwidth providers in the world (and that total doesn't even include the private peering relationships we have with other providers in various regional markets). Additionally, customers may order servers with up to 10Gbps network ports in our data centers.

For the most part, those stats explain our differentiation, but part of the bigger network performance story is still missing, and to a certain extent it has been untold—until today.

The 2,500+Gbps of bandwidth connectivity we break out in the network connectivity table only accounts for the on-ramps and off-ramps of our network. Our global network backbone is actually made up of an additional 2,600+Gbps of bandwidth connectivity ... and all of that backbone connectivity transports SoftLayer-related traffic.

This robust network architecture streamlines the access to and delivery of data on SoftLayer servers. When you access a SoftLayer server, the network is designed to bring you onto our global backbone as quickly as possible at one of our network PoPs, and when you're on our global backbone, you'll experience fewer hops (and a more direct route that we control). When one of your users requests data from your SoftLayer server, that data travels across the global backbone to the nearest network PoP, where it is handed off to another provider to carry the data the "last mile."

With this controlled environment, I decided to undertake an impromptu science experiment to demonstrate how location and physical distance affect network performance in the cloud.

Speed Testing on the SoftLayer Global Network Backbone

I work in the SoftLayer office in downtown Houston, Texas. In network-speak, this location is HOU04. You won't find that location on any data center or network tables because it's just an office, but it's connected to the same global backbone as our data centers and network points of presence. From my office, the "last mile" doesn't exist; when I access a SoftLayer server, my bits and bytes only travel across the SoftLayer network, so we're effectively cutting out a number of uncontrollable variables in the process of running network speed tests.

For better or worse, I didn't tell any network engineers that I planned to run speed tests to every available data center and share the results I found, so you're seeing exactly what I saw with no tomfoolery. I just fired up my browser, headed to our Data Centers page, and made my way down the list using the SpeedTest option for each facility. Customers often go through this process when trying to determine the latency, speeds, and network path that they can expect from servers in each data center, but if we look at the results collectively, we can learn a lot more about network performance in general.

With the results, we'll discuss how network speed tests work, what the results mean, and why some might be surprising. If you're feeling scientific and want to run the tests yourself, you're more than welcome to do so.

The Ookla SpeedTests we link to from the data centers table measured the latency (ping time), jitter (variation in latency), download speeds, and upload speeds between the user's computer and the data center's test server. To run this experiment, I connected my MacBook Pro via Ethernet to a 100Mbps wired connection. At the end of each speed test, I took a screenshot of the performance stats:

SoftLayer Network Speed Test

To save you the trouble of trying to read all of the stats on each data center as they cycle through that animated GIF, I also put them into a table (click the data center name to see its results screenshot in a new window):

Data Center Latency (ms) Download Speed (Mbps) Upload Speed (Mbps) Jitter (ms)
AMS01 121 77.69 82.18 1
DAL01 9 93.16 87.43 0
DAL05 7 93.16 83.77 0
DAL06 7 93.11 83.50 0
DAL07 8 93.08 83.60 0
DAL09 11 93.05 82.54 0
FRA02 128 78.11 85.08 0
HKG02 184 50.75 78.93 2
HOU02 2 93.12 83.45 1
LON02 114 77.41 83.74 2
MEL01 186 63.40 78.73 1
MEX01 27 92.32 83.29 1
MON01 52 89.65 85.94 3
PAR01 127 82.40 83.38 0
SJC01 44 90.43 83.60 1
SEA01 50 90.33 83.23 2
SNG01 195 40.35 72.35 1
SYD01 196 61.04 75.82 4
TOK02 135 75.63 82.20 2
TOR01 40 90.37 82.90 1
WDC01 43 89.68 84.35 0

By performing these speed tests on the SoftLayer network, we can actually learn a lot about how speed tests work and how physical location affects network performance. But before we get into that, let's take note of a few interesting results from the table above:

  • The lowest latency from my office is to the HOU02 (Houston, Texas) data center. That data center is about 14.2 miles away as the crow flies.
  • The highest latency results from my office are to the SYD01 (Sydney, Australia) and SNG01 (Singapore) data centers. Those data centers are at least 8,600 and 10,000 miles away, respectively.
  • The fastest download speed observed is 93.16Mbps, and that number was seen from two data centers: DAL01 and DAL05.
  • The slowest download speed observed is 40.35Mbps from SNG01.
  • The fastest upload speed observed is 87.43Mbps to DAL01.
  • The slowest upload speed observed is 72.35Mbps to SNG01.
  • The upload speeds observed are faster than the download speeds from every data center outside of North America.

Are you surprised that we didn't see any results closer to 100Mbps? Is our server in Singapore underperforming? Are servers outside of North America more selfish to receive data and stingy to give it back?

Those are great questions, and they actually jumpstart an explanation of how the network tests work and what they're telling us.

Maximum Download Speed on 100Mbps Connection

If my office is 2 milliseconds from the test server in HOU02, why is my download speed only 93.12Mbps? To answer this question, we need to understand that to perform these tests, a connection is made using Transmission Control Protocol (TCP) to move the data, and TCP does a lot of work in the background. The download is broken into a number of tiny chunks called packets and sent from the sender to the receiver. TCP wants to ensure that each packet that is sent is received, so the receiver sends an acknowledgement back to the sender to confirm that the packet arrived. If the sender is unable to verify that a given packet was successfully delivered to the receiver, the sender will resend the packet.

This system is pretty simple, but in actuality, it's very dynamic. TCP wants to be as efficient as possible ... to send the fewest number of packets to get the entire message across. To accomplish this, TCP is able to modify the size of each packet to optimize it for each communication. The receiver dictates how large the packet should be by providing a receive window to accommodate a small packet size, and it analyzes and adjusts the receive window to get the largest packets possible without becoming unstable. Some operating systems are better than others when it comes to tweaking and optimizing TCP transfer rates, but the processes TCP takes to ensure that the packets are sent and received without error takes overhead, and that overhead limits the maximum speed we can achieve.

Understanding the SNG01 Results

Why did my SNG01 speed test max out at a meager 40.35Mbps on my 100Mbps connection? Well, now that we understand how TCP is working behind the scenes, we can see why our download speeds from Singapore are lower than we'd expect. Latency between the sending and successful receipt of a packet plays into TCP’s considerations of a stable connection. Higher ping times will cause TCP to send smaller packet sizes than it would for lower ping times to ensure that no sizable packet is lost (which would have to be reproduced and resent).

With our global backbone optimizing the network path of the packets between Houston and Singapore, the more than 10,000-mile journey, the nature of TCP, and my computer's TCP receive window adjustments all factor into the download speeds recorded from SNG01. Looking at the results in the context of the distance the data has to travel, our results are actually well within the expected performance.

Because the default behavior of TCP is partially to blame for the results, we could actually tweak the test and tune our configurations to deliver faster speeds. To confirm that improvements can be made relatively easily, we can actually just look at the answer to our third question...

Upload > Download?

Why are the upload speeds faster than the download speeds after latency jumps from 50ms to 114ms? Every location in North America is within 2,000 miles of Houston, while the closest location outside of North America is about 5,000 miles away. With what we've learned about how TCP and physical distance play into download speeds, that jump in distance explains why the download speeds drop from 90.33Mbps to 77.41Mbps as soon as we cross an ocean, but how can the upload speeds to Europe (and even APAC) stay on par with their North American counterparts? The only difference between our download path and upload path is which side is sending and which side is receiving. And if the receiver determines the size of the TCP receive window, the most likely culprit in the discrepancy between download and upload speeds is TCP windowing.

A Linux server is built and optimized to be a server, whereas my MacOSX laptop has a lot of other responsibilities, so it shouldn't come as a surprise that the default TCP receive window handling is better on the server side. With changes to the way my laptop handles TCP, download speeds would likely be improved significantly. Additionally, if we wanted to push the envelope even further, we might consider using a different transfer protocol to take advantage of the consistent, controlled network environment.

The Importance of Physical Location in Cloud Computing

These real-world test results under controlled conditions demonstrate the significance of data's geographic proximity to its user on the user's perceived network performance. We know that the network latency in a 14-mile trip will be lower than the latency in a 10,000-mile trip, but we often don't think about the ripple effect latency has on other network performance indicators. And this experiment actually controls a lot of other variables that can exacerbate the performance impact of geographic distance. The tests were run on a 100Mbps connection because that's a pretty common maximum port speed, but if we ran the same tests on a GigE line, the difference would be even more dramatic. Proof: HOU02 @ 1Gbps v. SNG01 @ 1Gbps

Let's apply our experiment to a real-world example: Half of our site's user base is in Paris and the other half is in Singapore. If we chose to host our cloud infrastructure exclusively from Paris, our users would see dramatically different results. Users in Paris would have sub-10ms latency while users in Singapore have about 300ms of latency. Obviously, operating cloud servers in both markets would be the best way to ensure peak performance in both locations, but what if you can only afford to provision your cloud infrastructure in one location? Where would you choose to provision that infrastructure to provide a consistent user experience for your audience in both markets?

Given what we've learned, we should probably choose a location with roughly the same latency to both markets. We can use the SoftLayer Looking Glass to see that San Jose, California (SJC01) would be a logical midpoint ... At this second, the latency between SJC and PAR on the SoftLayer backbone is 149ms, and the latency between SJC and SNG is 162ms, so both would experience very similar performance (all else being equal). Our users in the two markets won't experience mind-blowing speeds, but neither will experience mind-numbing speeds either.

The network performance implications of physical distance apply to all cloud providers, but because of the SoftLayer global network backbone, we're able to control many of the variables that lead to higher (or inconsistent) latency to and from a given data center. The longer a single provider can route traffic, the more efficiently that traffic will move. You might see the same latency speeds to another provider's cloud infrastructure from a given location at a given time across the public Internet, but you certainly won't see the same consistency from all locations at all times. SoftLayer has spent millions of dollars to build, maintain, and grow our global network backbone to transport public and private network traffic, and as a result, we feel pretty good about claiming to provide the best network performance in cloud computing.


March 25, 2015

Introducing New Block Storage and File Storage

Everyone knows data growth is exploding. The chart below illustrates data growth—in zettabytes—over the last 11 years.

Storing all that data can get complicated. The rise of cloud computing and virtualization has led to myriad options for data storage. Kevin Trachier did a great job of defining and highlighting the differences in various cloud storage options in his blog post, Which storage solution is best for your project?

Today, I’m excited to announce that we’ve expanded SoftLayer’s cloud storage portfolio to include two new storage products: block storage and file storage, both featuring Performance and Endurance options. These storage offerings allow you to create storage volumes or shares and connect them to your bare metal or virtual servers using either NFS or iSCSI connectivity.

The Endurance and Performance classes of both block storage and file storage feature:

  • Storage sizes to fit any application—from 20GB to 12TB
  • Highly available connectivity—redundant networking connections reduce risk and mitigate against unplanned events to provide business continuity
  • Allocated IOPS—meet any workload requirement through customizable levels of IOPS that are there when you need them
  • Durable and Resilient —infrastructure provides safety of mind against data loss without managing system-level RAID arrays
  • Concurrent Access—multiple hosts can simultaneously access both block and file volumes in support of advanced use cases such as clustered databases

The Endurance class of both block storage and file storage is available in three tiers, allowing you can choose the right balance of performance and cost for your needs:

  • 0.25 IOPS per GB is designed for workloads with low I/O intensity. Example applications include storing mailboxes or departmental level file shares.
  • 2 IOPS per GB is designed for most general purpose use. Example applications include hosting small databases backing Web applications or virtual machine disk images for a hypervisor.
  • 4 IOPS per GB is designed for higher intensity workloads. Example applications include transactional and other performance-sensitive databases.

All Endurance tiers support snapshots and replication to remote data centers.

We designed the Performance class of both block storage and file storage to support high I/O applications like relational databases that require consistent levels of performance. Block volumes and file shares can be provisioned with up to 6,000 IOPS and 96MB/s of throughput.

Available sizes and IOPS combinations:

Block storage and file storage are available in SoftLayer data centers worldwide. SoftLayer customers can log in to the customer portal and start using them today.


March 20, 2015

Startups: Always Be Hiring

In late 2014, I was at a Denver job fair promoting an event I was organizing, NewCo Boulder. All the usual suspects of the Colorado tech community were there; companies ranging in size from 50 to 500 employees. It's a challenge to stand out from the crowd when vying for the best talent in this competitive job market, so the companies had pop-up banners, posters, swag of every kind on the table, and swarms of teams clad in company t-shirts to talk to everyone who walked by.

Nestled amid the dizzying display of logos was MediaNest, a three-person, pre-funding startup in the Catalyst program, at the time they were in the Boomtown Boulder fall 2014 cohort. What the heck was a scrappy startup doing among the top Colorado tech companies? In a word: hiring.

MediaNest was there to hire for three roles: front end developer, back end developer, and sales representative. They were there to double the size of their team ... when they had the money. In the war for talent, they started early and were doing it right.

I've often heard VCs (venture capitalists) and highly successful startup CEOs say the primary roles for a startup CEO are to always keep money in the bank and butts in seats. Both take tremendous time and energy, and they go hand-in-hand. It takes months to close a funding round, and similarly, it takes months to fill roles with the right people. If you're just getting started with hiring once that money is in the bank, you're starting from a deficit, burning capital, and straining resources while you get the recruiting gears going.

The number one resource for startup hiring is personal networks. Start with your friends and acquaintances and let everyone know you're looking to fill specific roles, even as you're out raising the capital to pay them. As the round gets closer to closing, intensify your efforts and expand your reach.

But what happens if you find someone perfect before you’re ready to hire them? Julien Khaleghy, CEO of MediaNest, says, "It's a tricky question. We will tend to be generous on the equity portion and conservative on the salary portion. If a comfortable salary is a requirement for the person, we will lock them for our next round of funding."

MediaNest wasn’t funded when I saw them in Denver, and they weren’t ready to make offers, so why attend a job fair? Khaleghy adds, based on his experience as CEO, "It's actually a good thing to show a letter of intent to hire someone when you are raising money."

At that job fair in Denver, MediaNest, with its simple table and two of the co-founders present, was just as busy that day as the companies with a full complement of staff giving away every piece of imaginable swag. I recommend following their example and getting ahead of the hiring game.

As long as you're successful, you'll never stop hiring. So start today.


March 18, 2015

SoftLayer, Bluemix and OpenStack: A Powerful Combination

Building and deploying applications on SoftLayer with Bluemix, IBM’s Platform as a Service (PaaS), just got a whole lot more powerful. At IBM’s Interconnect, we announced a beta service for deploying OpenStack-based virtual servers within Bluemix. Obviously, the new service is exciting because it brings together the scalable, secure, high-performance infrastructure from SoftLayer with the open, standards-based cloud management platform of OpenStack. But making the new service available via Bluemix presents a particularly unique set of opportunities.

Now Bluemix developers can deploy OpenStack-based virtual servers on SoftLayer or their own private OpenStack cloud in a consistent, developer-friendly manner. Without changing your code, your configuration, or your deployment method, you can launch your application to a local OpenStack cloud on your premises, a private OpenStack cloud you have deployed on SoftLayer bare metal servers, or to SoftLayer virtual servers within Bluemix. For instance, you could instantly fire up a few OpenStack-based virtual servers on SoftLayer to test out your new application. After you have impressed your clients and fully tested everything, you could deploy that application to a local OpenStack cloud in your own data center ̶all from within Bluemix. With Bluemix providing the ability to deploy applications across cloud deployment models, developers can create an infrastructure configuration once and deploy consistently, regardless of the stage of their application development life cycle.

OpenStack-based virtual servers on SoftLayer enable you to manage all of your virtual servers through standard OpenStack APIs and user interfaces, and leverage the tooling, knowledge and process you or your organization have already built out. So the choice is yours: you may fully manage your virtual servers directly from within the Bluemix user interface or choose standard OpenStack interface options such as the Horizon management portal, the OpenStack API or the OpenStack command line interface. For clients who are looking for enterprise-class infrastructure as a service but also wish to avoid getting locked in a vendor’s proprietary interface, our new OpenStack standard access provides clients a new choice.

Providing OpenStack-based virtual servers is just one more (albeit major) step toward our goal of providing even more OpenStack integration with SoftLayer services. For clients looking for enterprise-class Infrastructure as a Service (IaaS) available globally and accessible via standard OpenStack interfaces, OpenStack-based virtual servers on SoftLayer provide just what they are looking for.

The beta is open now for you to test deploying and running servers on the new SoftLayer OpenStack public cloud service through Bluemix. You can sign up for a Bluemix 30-day free trial.

- @marcalanjones

March 4, 2015

Docker: Containerization for Software

Before modern-day shipping, packing and transporting different shaped boxes and other oddly shaped items from ships to trucks to warehouses was difficult, inefficient, and cumbersome. That was until the modern day shipping container was introduced to the industry. These containers could easily be stacked and organized onto a cargo ship then easily transferred to a truck where it would be sent on to its final destination. Solomon Hykes, Docker founder and CTO, likens the Docker to the modern-day shipping industry’s solution for shipping goods. Docker utilizes containerization for shipping software.

Docker, an open platform for distributed applications used by developers and system administrators, leverages standard Linux container technologies and some git-inspired image management technology. Users can create containers that have everything they need to run an application just like a virtual server but are much lighter to deploy and manage. Each container has all the binaries it needs including library and middleware, configuration, and activation process. The containers can be moved around [like containers on ships] and executed in any Docker-enabled server.

Container images are built and maintained using deltas, which can be used by several other images. Sharing reduces the overall size and allows for easy image storage in Docker registries [like containers on ships]. Any user with access to the registry can download the image and activate it on any server with a couple of commands. Some organizations have development teams that build the images, which are run by their operations teams.

Docker & SoftLayer

The lightweight containers can be used on both virtual servers and bare metal servers, making Docker a nice fit with a SoftLayer offering. You get all the flexibility of a re-imaged server without the downtime. You can create red-black deployments, and mix hourly and monthly servers, both virtual and bare metal.

While many people share images on the public Docker registry, security-minded organizations will want to create a private registry by leveraging SoftLayer object storage. You can create Docker images for a private registry that will store all its information with object storage. Registries are then easy to create and move to new hosts or between data centers.

Creating a Private Docker Registry on SoftLayer

Use the following information to create a private registry that stores data with SoftLayer object storage. [All the commands below were executed on an Ubuntu 14.04 virtual server on SoftLayer.]

Optional setup step: Change Docker backend storage AuFS

Docker has several options for an image storage backend. The default backend is DeviceMapper. The option was not very stable during the test, failing to start and export images. This step may not be necessary in your specific build depending on updates of the operating system or Docker itself. The solution was to move to Another Union File System (AuFS).
  1. Install the following package to enable AuFS:
    apt-get install linux-image-extra-3.13.0-36-generic
  2. Edit /etc/init/docker.conf, and add the following line or argument:
  3. Restart Docker, and check if the backend was changed:
    service docker restart
    docker info

The command should indicate AuFS is being used. The output should look similar to the following:
Containers: 2
Images: 29
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Dirs: 33
Execution Driver: native-0.2
Kernel Version: 3.13.0-36-generic
WARNING: No swap limit support

Step 1: Create image repo

  1. Create the directory registry-os in a work directory.
  2. Create a file named Dockerfile in the registry-os directory. It should contain the following code:
    # start from a registry release known to work
    FROM registry:0.7.3
    # get the swift driver for the registry
    RUN pip install docker-registry-driver-swift==0.0.1
    # SoftLayer uses v1 auth and the sample config doesn't have an option
    # for it so inject one
    RUN sed -i '91i\ swift_auth_version: _env:OS_AUTH_VERSION' /docker-registry/config/config_sample.yml
  3. Execute the following command from the directory that contains the registry-os directory to build the registry container:
    docker build -t registry-swift:0.7.3 registry-os

Step 2: Start it with your object storage credential

The credentials and container on the object storage must be provided in order to start the registry image. The standard Docker way of doing this is to pass the credentials as environment variables.

docker run -it -d -e SETTINGS_FLAVOR=swift -e
OS_CONTAINER='docker' -e GUNICORN_WORKERS=8 -p registry-swift:0.7.3

This example assumes we are storing images in DAL05 on a container called docker. API_USER and API_KEY are the object storage credentials you can obtain from the portal.

Step 3: Push image

An image needs to be pushed to the registry to make sure everything works. The image push involves two steps: tagging an image and pushing it to the registry.
docker tag registry-swift:0.7.3 localhost:5000/registry-swift

docker push localhost:5000/registry-swift

You can ensure that it worked by inspecting the contents of the container in the object storage.

Step 4: Get image

The image can be downloaded once successfully pushed to object storage via the registry by issuing the following command:
docker pull localhost:5000/registry-swift

Images can be downloaded from other servers by replacing localhost with the IP address to the registry server.

Final Considerations

The Docker container can be pushed throughout your infrastructure once you have created your private registry. Failure of the machine that contains the registry can be quickly mitigated by restarting the image on another node. To restart the image, make sure it’s on more than one node in the registry allowing you to leverage the SoftLayer platform and the high durability of object storage.

If you haven’t explored Docker, visit their site, and review the use cases.


Subscribe to technology