Infrastructure Posts

May 17, 2016

New routes configured for SoftLayer customers

Customers will see a new route configured on a newly provisioned customer host, or on a customer host after a portal-initiated OS reload. This is part of a greater goal to enable new services and offerings for SoftLayer customers. The route will direct traffic addressed to hosts in the 161.26.0.0/16 network block (161.26.0.0 - 161.26.255.255) to the back end private gateway IP address configured on customer servers or virtual server instances.

The 161.26.0.0/16 address space is assigned to SoftLayer by IANA and will not be advertised over the front end public network. This space will be used exclusively on SoftLayer’s backend private network, will never conflict with network addresses on the Internet, and should never conflict with address space used by third-party VPN service providers.

This new route is similar to the 10.0.0.0/8 route already located on SoftLayer hosts, in that SoftLayer services are addressed out of both ranges. Also, both the 10.0.0.0/8 route and the 161.26.0.0/16 route will need to be configured on a customer host if it is required to access all SoftLayer services hosted on the back end private network. Unlike the 10.0.0.0/8 range, the 161.26.0.0/16 range will be used exclusively for SoftLayer services. Customers will need to ensure that ACL/firewalls on customer servers, virtual server instances, and gateway appliances are configured to allow connectivity to the 161.26.0.0/16 network block to access these new services.
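As a rough illustration of what the route covers (not a substitute for the KnowledgeLayer instructions), here's a minimal Python sketch using the standard library's ipaddress module; the private gateway address in it is purely hypothetical:

```python
import ipaddress

# The new SoftLayer services block described above.
services_net = ipaddress.ip_network("161.26.0.0/16")

# First and last addresses in the block (161.26.0.0 - 161.26.255.255).
print(services_net[0], services_net[-1])

# Given a host's back end private gateway (hypothetical example address),
# show the static route a Linux host would need. The exact commands for
# your OS are the ones documented on KnowledgeLayer.
private_gateway = ipaddress.ip_address("10.0.64.1")  # assumption: example only
print(f"ip route add {services_net} via {private_gateway}")

# An ACL/firewall-style check: does a service address fall in the block?
addr = ipaddress.ip_address("161.26.0.10")
print(addr in services_net)  # True -> allow outbound connectivity
```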

For more information on this new route, including how to configure existing systems to use it, read more on KnowledgeLayer.

-Curtis

February 10, 2016

The Compliance Commons: Do you know our ISOs?

Editor’s note: This is the first of a three-part series designed to address general compliance topics and to answer frequently asked compliance questions.

How many times have you been asked by a customer if SoftLayer is ISO compliant?  Do you ever find yourself struggling for an immediate answer?  If so, you're not alone. 

ISO stands for International Organization for Standardization. The organization has published more than 19,000 international standards, covering almost all aspects of technology and business. If you have any questions about a specific ISO standard, you can search the ISO website. If you would like the full details of any ISO standard, an online copy of the standard can be purchased through their website. 

SoftLayer holds three ISO certifications, and we’re going after more. We follow industry-standard security best practices relating to cloud infrastructure, including:

ISO/IEC 27001: This certification covers the information security management process. It certifies that SoftLayer follows the industry's best security practices relating to cloud infrastructure as a service (IaaS). Going through this process and obtaining certification means that SoftLayer offers a safe and secure place to live in the cloud, and that our information security management practices adhere to strict, internationally recognized best practices.

ISO/IEC 27018: This certifies that SoftLayer follows the most stringent code of practice for protection of personally identifiable information (PII) in public clouds acting as PII processors. It establishes commonly accepted control objectives, controls, and guidelines for implementing measures to protect PII in accordance with the privacy principles in ISO/IEC 29100 for the public cloud computing environment. While not all of SoftLayer is public, and while we have very distinct definitions for processing PII for customers, we decided to obtain the certification to demonstrate that our security and privacy principles are robust.

ISO/IEC 27017: This is a code of practice for information security controls for cloud services. It’s the global standard for cloud security practices—not only for what SoftLayer should do, but also for what our customers should do to protect information. SoftLayer’s ISO 27017 certification demonstrates our continued commitment to upholding the highest, most secure information security controls and applying them effectively and efficiently to our cloud infrastructure environment. The standard provides guidance in areas including, but not limited to, the following:

  • Information Security
  • Human Resources
  • Asset Management
  • Access Control
  • Cryptography
  • Physical and Environmental Security
  • Operations Security
  • Communications Security
  • System Acquisition, Development & Maintenance
  • Supplier Relations
  • Incident Management
  • Business Continuity Management
  • Compliance
  • Network Security

How can SoftLayer’s ISO certification benefit me as a customer?

Customers can leverage SoftLayer’s certifications as long as it’s done in the proper manner. Customers cannot claim that they’re ISO certified just because they’re using SoftLayer infrastructure. That’s not how it works. SoftLayer’s ISO certifications may make it easier for customers to become certified because they can leverage our certification for the SoftLayer boundary. Our SOC2 report (available through our customer portal or sales team) describes our boundary in greater detail: the customers are not responsible for certifying what’s inside SoftLayer’s boundary.  


How does SoftLayer prove its ISO compliance?

SoftLayer’s ISO Certificates of Registration are publicly available on our website and on our third-party assessor’s website. By design, our ISO certificates denote that we conform to and meet all the applicable objectives of each standard. Because the ISO standards impose the same fixed set of controls on everyone, we don’t share the reports from our audits, but we can provide our certificates.

What SoftLayer data centers are applicable to the ISO certifications?

All of them! Each ISO certificate is applicable to every one of our data centers, in the U.S. and internationally. SoftLayer obtained ISO certifications on every one of our facilities because we operate with consistency across the globe. When a new SoftLayer data center comes online, there is some lag time between opening and certification because we need to be reviewed by our third-party assessor and have operational evidence available to support our data center certification. But as soon as we obtain the certifications, we’ll make them available.

Visit www.softlayer.com/compliance for a full list of our certifications and reports. They can also be found through the customer portal.

-Dana

 

December 28, 2015

Semantics: "Public," "Private," and "Hybrid" in Cloud Computing, Part II

Welcome back! In the second post in this two-part series, we’ll look at the third definition of “public” and “private,” and we’ll have that broader discussion about “hybrid”—and we’ll figure out where we go after the dust has cleared on the semantics. If you missed the first part of our series, take a moment to get up to speed here before you dive in.

Definition 3—Control: Bare Metal v. Virtual

A third school of thought in the “public v. private” conversation is actually an extension of Definition 2, but with an important distinction. In order for infrastructure to be “private,” no one else (not even the infrastructure provider) can have access to a given hardware node.

In Definition 2, a hardware node provisioned for single-tenancy would be considered private. That single-tenant environment could provide customers with control of the server at the bare metal level—or it could provide control at the operating system level on top of a provider-managed hypervisor. In Definition 3, the latter example would not be considered “private” because the infrastructure provider has some level of control over the server in the form of the virtualization hypervisor.

Under Definition 3, infrastructure provisioned with full control over bare metal hardware is “private,” while any provider-virtualized or shared environment would be considered “public.” With complete, uninterrupted control down to the bare metal, a user can monitor all access and activity on the infrastructure and secure it from any third-party usage.

Defining “public cloud” and “private cloud” using the bare metal versus virtual delineation is easy. If a user orders infrastructure resources from a provider, and those resources are delivered from a shared, virtualized environment, that infrastructure would be considered public cloud. If the user orders a number of bare metal servers and chooses to install and maintain his or her own virtualization layer across those bare metal servers, that environment would be a private cloud.

“Hybrid”

Mix and Match

Now that we see the different meanings “public” and “private” can have in cloud computing, the idea of a “hybrid” environment is a lot less confusing. In actuality, it really only has one definition: A hybrid environment is a combination of any variation of public and private infrastructure.

Using bare metal servers for your database and virtual servers for your Web tier? That’s a hybrid approach. Using your own data centers for some of your applications and scaling out into another provider’s data centers when needed? That’s hybrid, too. As soon as you start using multiple types of infrastructure, by definition, you’ve created a hybrid environment.

And Throw in the Kitchen Sink

Taking our simple definition of “hybrid” one step further, we find a few other variations of that term’s usage. Because the cloud stack is made up of several levels of services—Infrastructure as a Service, Platform as a Service, Software as a Service, Business Process as a Service—“hybrid” may be defined by incorporating various “aaS” offerings into a single environment.

Perhaps you need bare metal infrastructure to build an off-prem private cloud at the IaaS level—and you also want to incorporate a managed analytics service at the BPaaS level. Or maybe you want to keep all of your production data on-prem and do your sandbox development in a PaaS environment like Bluemix. At the end of the day, what you’re really doing is leveraging a “hybrid” model.

Where do we go from here?

Once we can agree that this underlying semantic problem exists, we should be able to start having better conversations:

  • Them: We’re considering a hybrid approach to hosting our next application.
  • You: Oh yeah? What platforms or tools are we going to use in that approach?
  • Them: We want to try and incorporate public and private cloud infrastructure.
  • You: That’s interesting. I know that there are a few different definitions of public and private when it comes to infrastructure…which do you mean?
  • Them: That’s a profound observation! Since we have our own data centers, we consider the infrastructure there to be our private cloud, and we’re going to use bare metal servers from SoftLayer as our public cloud.
  • You: Brilliant! Especially the fact that we’re using SoftLayer.

Your mileage may vary, but that’s the kind of discussion we can get behind.

And if your conversation partner balks at either of your questions, send them over to this blog post series.

-@khazard

December 18, 2015

Semantics: "Public, "Private," and "Hybrid" in Cloud Computing, Part I

What does the word “gift” mean to you? In English, it most often refers to a present or something given voluntarily. In German, it has a completely different meaning: “poison.” If a box marked “gift” is placed in front of an English-speaker, it’s safe to assume that he or she would interact with it very differently than a German-speaker would.

In the same way, simple words like “public,” “private,” and “hybrid” in cloud computing can mean very different things to different audiences. But unlike our “gift” example above (which would normally have some language or cultural context), it’s much more difficult for cloud computing audiences to decipher meaning when terms like “public cloud,” “private cloud,” and “hybrid cloud” are used.

We, as an industry, need to focus on semantics.

In this two-part series, we’ll look at three different definitions of “public” and “private” to set the stage for a broader discussion about “hybrid.”

“Public” v. “Private”

Definition 1—Location: On-premises v. Off-premises

For some audiences (and the enterprise market), whether an infrastructure is public or private is largely a question of location. Does a business own and maintain the data centers, servers, and networking gear it uses for its IT needs, or does the business use gear that’s owned and maintained by another party?

This definition of “public v. private” makes sense for an audience that happens to own and operate its own data centers. If a business has exclusive physical access to and ownership of its gear, the business considers that gear “private.” If another provider handles the physical access and ownership of the gear, the business considers that gear “public.”

We can extend this definition a step further to understand what this audience would consider to be a “private cloud.” Using this definition of “private,” a private cloud is an environment with an abstracted “cloud” management layer (a la OpenStack or CloudStack or VMware) that runs in a company’s own data center. In contrast, this audience would consider a “public cloud” to be a similar environment that’s owned and maintained by another provider.

Enterprises are often more likely to use this definition because they’re often the only ones that can afford to build and run their own data centers. They use “public” and “private” to distinguish between their own facilities and outside facilities. This definition does not make sense for businesses that don’t have their own data center facilities.

Definition 2—Population: Single-tenant v. Multi-tenant

Businesses that don’t own their own data center facilities would not use Definition 1 to distinguish “public” and “private” infrastructure. If the infrastructure they use is wholly owned and physically maintained by another provider, these businesses are most interested in whether hardware resources are shared with any other customers: Do any other customers have data on or access to a given server’s hardware? If so, the infrastructure is public. If not, the infrastructure is private.

Using this definition, public and private infrastructure could be served from the same third-party-owned data center, and the infrastructure could even be in the same server rack. “Public” infrastructure just happens to provide multiple users with resources and access to a single hardware node. Note: Even though the hardware node is shared, each user can only access his or her own data and allotted resources.

On the flip side, if a user has exclusive access to a hardware node, a business using Definition 2 would consider the node to be private.

Using this definition of “public” and “private,” multiple users share resources at the server level in a “public cloud” environment—and only one user has access to resources at the server level in a “private cloud” environment. Depending on the environment configuration, a “private cloud” user may or may not have full control over the individual servers he or she is using.

This definition echoes back to Definition 1, but it is more granular. Businesses using Definition 2 believe that infrastructure is public or private based on single-tenancy or multi-tenancy at the hardware level, whereas businesses using Definition 1 consider infrastructure to be public or private based on whether the data center itself is single-tenant or multi-tenant.

Have we blown your minds yet? Stay tuned for Part II, where we’ll tackle bare metal servers, virtual servers, and control. We’ll also show you how clear hybrid environments really are, and we’ll figure out where the heck we go from here now that we’ve figured it all out.

-@khazard

December 17, 2015

Xen Hypervisor Maintenance - December 2015

Security of your assets on our cloud platform is very important to the SoftLayer team. Last week, our Security Operations Center, which provides real-time monitoring of suspicious activity (including participation in multiple security pre-disclosure lists), alerted our engineering team to a potential vulnerability (advisory CVE-2015-8555 / XSA-165) in the Xen Hypervisor that, if left unremediated, could allow a malicious user to access data from another VSI guest sharing the same hardware node and hypervisor instance.

Upon learning of this vulnerability, SoftLayer issued a notification including a per-data center schedule for applying critical maintenance to remediate the vulnerability. The maintenance was performed over multiple days on a POD-by-POD basis, with individual VM instances offline for only minutes while they rebooted. The updates were completed successfully in all data centers in advance of the public announcement of this vulnerability.

While deployment techniques such as clustering and failover across data centers and PODs allow continuous operations during a planned or unplanned event, SoftLayer is also committed to working aggressively to further reduce the impact of such events on your deployment and operations teams.

We value your business and will continue to take actions that ensure your environment is secure and efficient to operate. If you have any questions or concerns, don't hesitate to reach out to SoftLayer support or your direct SoftLayer contacts.

-Sonny

October 28, 2015

Ongoing Actions to Eliminate Spam Hosting

We are announcing a new policy, effective today, as part of our regular efforts to reduce the ability for spam to be sent from the SoftLayer network.

Starting October 28, 2015, bare metal servers and virtual servers provisioned on new accounts will not be able to send email directly via outbound connections through TCP port 25 (SMTP). Port 25 can be used as a conduit for distributing unsolicited bulk email.

In a follow-up phase, we will roll out this network policy change to customers who established accounts before October 28. (A separate communication will be sent with timeline and implementation guidance to those customers.)

You can read the technical details on KnowledgeLayer.

SendGrid Services Available to Send and Track Emails

We have partnered with SendGrid™ since 2011 to provide email delivery services. We have arranged for SendGrid to provide SoftLayer customers with an account allowing sending of up to 25,000 emails per month at no charge, which can be activated via the SoftLayer customer portal.

SendGrid allows you to use a SmartHost to relay your outbound mail while generating metrics, including tracking lists, bounce rates, open rates, and click-through rates. It also assists with newsletters and provides authentication. All of these services are designed to provide stronger email analytics so you can optimize your communications and eNurture programs. Full details on our SendGrid service, including free options, can be found here.

Use Your Email Service Through a Custom Email Port

You are welcome to use your own email service on a custom port following the API or SMTP guidelines provided by your mail provider to configure your servers to an email port other than TCP port 25. This is common practice for most mail providers and should not be an inhibitor to you sending and measuring your communications.
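As a hedged illustration of that practice, here's a minimal Python sketch that relays a message through a provider's submission port (587 with STARTTLS) instead of port 25; the relay host and credentials are placeholders, and your mail provider's documentation governs the actual settings:

```python
import smtplib
from email.message import EmailMessage

# Placeholder relay settings: use the host, port, and credentials your
# mail provider documents (587/STARTTLS and 465/TLS are common choices).
RELAY_HOST = "smtp.example-provider.com"  # assumption: example only
RELAY_PORT = 587

msg = EmailMessage()
msg["From"] = "alerts@example.com"
msg["To"] = "ops@example.com"
msg["Subject"] = "Test relay via submission port"
msg.set_content("Delivered without touching outbound TCP port 25.")

with smtplib.SMTP(RELAY_HOST, RELAY_PORT) as smtp:
    smtp.starttls()                     # upgrade to TLS before authenticating
    smtp.login("username", "password")  # placeholder credentials
    smtp.send_message(msg)
```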

Need an Exception?

If you are a new client and need the ability to send outbound SMTP email via TCP port 25, please open a support ticket in the customer portal and provide details about why you require an exception to this policy. Be sure to explain why the SendGrid email relaying solution does not fit your system or application needs. Our team specializes in assisting with most email relaying and blacklisting issues for recognized and reputable real-time blackhole lists (RBLs) and can evaluate your situation.

Dedicated to Your Success

We continuously work with established monitoring authorities and groups to eliminate fraudulent spammers and to block the usage of port 25 for email communications.

As we all know, spam is unsolicited bulk email. Our network architecture isolates devices so customers cannot see or share traffic across accounts. We follow ISO 27001. And for federal accounts, we are aligned to the NIST 800-53 framework and maintain SOC 2 Type II reporting compliance for all data centers. We integrate three distinct network topologies for each physical or virtual server and offer security solutions for systems, applications, and data as well.

Thank you again for your commitment to SoftLayer as we continue to work hard to ensure a secure environment for you.

-Dani

August 12, 2015

Network Performance 101: What is latency, and why does it matter?

We’ve all been there. Waiting for a web page to load can be so frustrating that we end up just closing out. You might ask yourself, “Hey, I have high-speed Internet. Why is this happening to me?” Well, there are a lot of factors outside your control that … control page loads. And whether you have an online store, run big data solutions, or have your employees set up on a network accessing files around the world, you never want slow data transfer to cost you a sale or drag down employee productivity.

So why are some pages so much slower to load than others?
It could be that poorly written code or large images are slowing the load on the backend, but slow page loads can also be caused by network latency. This might sound elementary, but data is not just floating out there in some non-physical Internet space. In reality, data is stored on hard drives … somewhere. Network connectivity provides a path for that data to travel to end users around the world, and that connectivity can vary significantly—depending on how far it’s going, how many times the data has to hop between service providers, how much bandwidth is available along the way, the other data traveling across the same path, and a number of other variables.

This measurement of how quickly data travels between two connected points is called network latency: an expression of the amount of time it takes a packet of data to get from one place to another.
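As a rough, hedged illustration, latency can be approximated by timing a TCP handshake to a server; this minimal Python sketch (the hostname is a placeholder) reports it in milliseconds:

```python
import socket
import time

def tcp_latency_ms(host: str, port: int = 443) -> float:
    """Time a TCP three-way handshake to approximate round-trip latency."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=5):
        pass  # connection established; we only care about the setup time
    return (time.perf_counter() - start) * 1000

# Placeholder hostname: substitute a server you actually want to measure.
print(f"{tcp_latency_ms('speedtest.example.com'):.1f} ms")
```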

Understanding Network Latency
Theoretically, data can travel at the speed of light across optical fiber network cables, but in practice, data typically travels slower than light due to the variables we referenced in the previous section. If a network connection doesn’t have any available bandwidth capacity, data might temporarily queue up to wait for its turn to travel across the line. If a service provider’s network doesn’t route a network path optimally, data could be sent hundreds or thousands of miles away from the destination in the process of routing to the destination. These kinds of delays and detours lead to higher network latency, which lead to slower page loads and download speeds.

We express network latency in milliseconds (that’s 1,000 milliseconds per second), and while a few thousandths of a second may not mean much to us as we’re living our daily lives, those milliseconds are often the deciding factors for whether we stay on a webpage or give up and try another site. As consumers of high-speed Internet, we like what we like, and we want what we want when we want it. In the financial sector, milliseconds can mean billions of dollars in gains or losses from trade transactions on a day-to-day basis.

Logical conclusion: Everyone wants the lowest network latency to the greatest number of users.

Common Approaches to Minimize Network Latency
If our shared goal is to minimize latency for our data, the most common approaches to addressing network latency involve limiting the number of potential variables that can impact the speed of data’s movement. While we don’t have complete control over how our data travels across the Internet, we can do a few things to keep our network latency in line:

  • Distribute data around the world: Users in different locations can pull data from a location that’s geographically close to them. Because the data is closer to the users, it is handed off fewer times, it has a shorter distance to travel, and inefficient routing is less likely to cause a significant performance impact (see the sketch after this list).
  • Provision servers with high-capacity network ports: Huge volumes of data can travel to and from the server every second. If packets are delayed due to fully saturated ports, milliseconds of time pass, pages load slower, download speeds drop, and users get unhappy.
  • Understand how your providers route traffic: When you know how your data is transferred to users around the world, you can make better decisions about where you host your data.
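To make the first approach concrete, here's a small, hypothetical sketch that reuses the tcp_latency_ms helper from the earlier example to probe a few candidate locations and pick the nearest one; the hostnames are placeholders, not real SoftLayer test servers:

```python
# Probe a few candidate endpoints and serve a user from the nearest one.
CANDIDATES = {
    "dallas": "speedtest-dal.example.com",
    "amsterdam": "speedtest-ams.example.com",
    "singapore": "speedtest-sng.example.com",
}

def median_latency_ms(host: str, samples: int = 5) -> float:
    results = sorted(tcp_latency_ms(host) for _ in range(samples))
    return results[len(results) // 2]  # median smooths out jitter

nearest = min(CANDIDATES, key=lambda name: median_latency_ms(CANDIDATES[name]))
print(f"Serve this user from: {nearest}")
```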

How SoftLayer Minimizes Network Latency
To minimize latency, we took a unique approach to building our network. All of our data centers are connected to network points of presence. All of our network points of presence are connected to each other via our global backbone network. And by maintaining our own global backbone network, our network operations team is able to control network paths and data handoffs much more granularly than if we relied on other providers to move data between geographies.


For example, if a user in Berlin wants to watch a cat video hosted on a SoftLayer server in Dallas, the packets of data that make up that cat video will travel across our backbone network (which is exclusively used by SoftLayer traffic) to Frankfurt, where the packets would be handed off to one of our peering or transit public network partners to get to the user in Berlin.

Without a global backbone network, the packets would be handed off to a peering or transit public network provider in Dallas, and that provider would route the packets across its network and/or hand the packets off to another provider at a network hop, and the packets would bounce their way to Germany. It’s entirely possible that the packets could get from Dallas to Berlin with the same network latency with or without the global backbone network, but without the global backbone network, there are a lot more variables.

In addition to building a global backbone network, we also segment public, private, and management traffic onto different network ports so that different types of traffic can be transferred without interfering with each other.


But at the end of the day, all of that network planning and forethought doesn’t amount to a hill of beans if you can’t see the results for yourself. That’s why we put speed tests on our website so you can check out our network yourself (for more on speed tests, check out this blog post).

TL;DR: Network Latency
Your users want your data as quickly as you can get it to them. The time it takes for your data to get to them across the Internet is called network latency. The more control you (or your provider) have over your data’s network path, the more consistent (and lower) your network latency will be.

Stay tuned. Next month we’ll continue with Network Performance 101: Security, where we’ll cover all things cloud security, including answers to your burning questions: Can other people see or access my data in a public cloud? Is my data more prone to hackers? And what safeguards does SoftLayer have in place to protect data?

-JRL

June 29, 2015

Opening Up the Cloud

This guest blog post is written by Alexia Emmanoulopoulou, marketing manager at Canonical.

With OpenStack, cloud computing becomes easily accessible to everyone. It tears down financial barriers to cloud deployments and tackles the fear of lock-in. One of the main benefits of OpenStack is that it is open source and supported by a wide ecosystem, with contributions from more than 200 companies, including Canonical and IBM. Users can change service providers and hardware at any time, and compared to other clouds using virtualization technology, OpenStack can double server utilization to as much as 85 percent. This means that an OpenStack cloud is economical and delivers more flexibility, scalability, and agility to businesses. The challenge, however, lies in recruiting and retaining OpenStack experts, who are in high demand, making it hard for companies to deploy OpenStack on time and on budget. BootStack, Canonical’s managed cloud product, solves that problem by offering all the benefits of a private cloud without any of the pain of day-to-day infrastructure management.

Addressing the Challenge of Finding OpenStack Experts

Resourcing a six-strong OpenStack team to work 24x7 would cost between $900,000 and $1.5 million and can take months of headhunting, eroding the savings that OpenStack should bring companies. So Canonical created BootStack, short for Build, Operate, and Optionally Transfer. It’s a new service for setting up and operating an OpenStack cloud, in both on-premises and hosted environments, and it gives users the option of taking over the management of their cloud in the future.

After working with each customer to define their requirements and specify the right cloud infrastructure for their business, Canonical’s experienced engineering and support team builds and manages the entire cloud infrastructure of the customer, including Ubuntu OpenStack, the underlying hypervisor, and deployment onto hosted or on-premises hardware. As a result, users get all the benefits of a private cloud without any of the pain of day-to-day infrastructure management. For added protection, BootStack is backed by a clear SLA that covers cloud availability at the user’s desired scale as well as uptime and responsiveness metrics.

Choosing Between On-premises and Hosted Cloud

Some companies prefer to host on-premises because they feel more secure knowing their cloud is running on their own site. However, when things go wrong, some companies find they don’t have the expertise on hand to quickly recover. Furthermore, on-site hosting is at least three times as expensive as outsourcing to a hosting specialist.

With the hosted option for BootStack, your OpenStack cloud will be hosted on Ubuntu-certified hardware in SoftLayer data centers. SoftLayer provides customizable bare metal and virtual servers running on the highest-performing cloud infrastructure available. Users can seamlessly move data between servers at no cost and benefit from secure, fast, and low-latency communications between data centers. 24x7 expert staff in each data center can troubleshoot any rare issues that can’t be resolved directly through the self-service management portal. Canonical and SoftLayer also take care of patches and upgrades to both the operating system and OpenStack, hardware and software failure prevention and repair, proactive health monitoring of the cloud and hardware, and resolution of any other problems.

No Lock-In and Predictable Cost

The two features that set BootStack apart from other managed cloud products are the predictable cost structure and the lack of lock-in. With BootStack, users can access every tool and every machine, any time. A company can choose to take over the management of its cloud at any time, at which point it will receive training and support from Canonical to ensure a smooth transition. BootStack customers can then choose to either bring their cloud in-house or continue hosting with SoftLayer.

In terms of costs, BootStack cloud is priced at $15 per day per server, plus the cost of the hosting. SoftLayer offers a number of bare metal servers that exceed the OpenStack recommended configuration, starting at $699 per month. You pay as you go, and can scale as your business needs change.

All in all, it’s a flexible managed cloud at a predictable cost, with expert staff to manage it until you’re ready to take over!

For more information about BootStack, SoftLayer, and OpenStack, download our free white paper: The Easiest Way to Build and Manage an OpenStack Cloud.

-Alexia

May 14, 2015

Update - VENOM Vulnerability

Yesterday, a security advisory designated CVE-2015-3456 / XSA-133 was publicly announced. The advisory identified a vulnerability, which has become commonly known as "VENOM", through which an attacker could exploit floppy driver support in QEMU to escalate their privileges.

SoftLayer engineers, in concert with our technology partners, completed a deep analysis of the vulnerability and determined that SoftLayer virtual servers are not affected by this issue.

We're always committed to ensuring our customers' operations and data are well protected. If customers have any questions or concerns, don't hesitate to reach out to SoftLayer support or your direct SoftLayer contacts.

-Sonny

March 30, 2015

The Importance of Data's Physical Location in the Cloud

If top-tier cloud providers use similar network hardware in their data centers and connect to the same transit and peering bandwidth providers, how can SoftLayer claim to provide the best network performance in the cloud computing industry?

Over the years, I've heard variations of that question asked dozens of times, and it's fairly easy to answer with impressive facts and figures. All SoftLayer data centers and network points of presence (PoPs) are connected to our unique global network backbone, which carries public, private, and management traffic to and from servers. Using our network connectivity table, some back-of-the-envelope calculations reveal that we have more than 2,500Gbps of bandwidth connectivity with some of the largest transit and peering bandwidth providers in the world (and that total doesn't even include the private peering relationships we have with other providers in various regional markets). Additionally, customers may order servers with up to 10Gbps network ports in our data centers.

For the most part, those stats explain our differentiation, but part of the bigger network performance story is still missing, and to a certain extent it has been untold—until today.

The 2,500+Gbps of bandwidth connectivity we break out in the network connectivity table only accounts for the on-ramps and off-ramps of our network. Our global network backbone is actually made up of an additional 2,600+Gbps of bandwidth connectivity ... and all of that backbone connectivity transports SoftLayer-related traffic.

This robust network architecture streamlines the access to and delivery of data on SoftLayer servers. When you access a SoftLayer server, the network is designed to bring you onto our global backbone as quickly as possible at one of our network PoPs, and when you're on our global backbone, you'll experience fewer hops (and a more direct route that we control). When one of your users requests data from your SoftLayer server, that data travels across the global backbone to the nearest network PoP, where it is handed off to another provider to carry the data the "last mile."

With this controlled environment, I decided to undertake an impromptu science experiment to demonstrate how location and physical distance affect network performance in the cloud.

Speed Testing on the SoftLayer Global Network Backbone

I work in the SoftLayer office in downtown Houston, Texas. In network-speak, this location is HOU04. You won't find that location on any data center or network tables because it's just an office, but it's connected to the same global backbone as our data centers and network points of presence. From my office, the "last mile" doesn't exist; when I access a SoftLayer server, my bits and bytes only travel across the SoftLayer network, so we're effectively cutting out a number of uncontrollable variables in the process of running network speed tests.

For better or worse, I didn't tell any network engineers that I planned to run speed tests to every available data center and share the results I found, so you're seeing exactly what I saw with no tomfoolery. I just fired up my browser, headed to our Data Centers page, and made my way down the list using the SpeedTest option for each facility. Customers often go through this process when trying to determine the latency, speeds, and network path that they can expect from servers in each data center, but if we look at the results collectively, we can learn a lot more about network performance in general.

With the results, we'll discuss how network speed tests work, what the results mean, and why some might be surprising. If you're feeling scientific and want to run the tests yourself, you're more than welcome to do so.

The Ookla SpeedTests we link to from the data centers table measured the latency (ping time), jitter (variation in latency), download speeds, and upload speeds between the user's computer and the data center's test server. To run this experiment, I connected my MacBook Pro via Ethernet to a 100Mbps wired connection. At the end of each speed test, I took a screenshot of the performance stats:

[Animated GIF: SoftLayer network speed test result screenshots]

To save you the trouble of trying to read all of the stats on each data center as they cycle through that animated GIF, I also put them into a table (click the data center name to see its results screenshot in a new window):

Data Center | Latency (ms) | Download Speed (Mbps) | Upload Speed (Mbps) | Jitter (ms)
AMS01 | 121 | 77.69 | 82.18 | 1
DAL01 | 9 | 93.16 | 87.43 | 0
DAL05 | 7 | 93.16 | 83.77 | 0
DAL06 | 7 | 93.11 | 83.50 | 0
DAL07 | 8 | 93.08 | 83.60 | 0
DAL09 | 11 | 93.05 | 82.54 | 0
FRA02 | 128 | 78.11 | 85.08 | 0
HKG02 | 184 | 50.75 | 78.93 | 2
HOU02 | 2 | 93.12 | 83.45 | 1
LON02 | 114 | 77.41 | 83.74 | 2
MEL01 | 186 | 63.40 | 78.73 | 1
MEX01 | 27 | 92.32 | 83.29 | 1
MON01 | 52 | 89.65 | 85.94 | 3
PAR01 | 127 | 82.40 | 83.38 | 0
SJC01 | 44 | 90.43 | 83.60 | 1
SEA01 | 50 | 90.33 | 83.23 | 2
SNG01 | 195 | 40.35 | 72.35 | 1
SYD01 | 196 | 61.04 | 75.82 | 4
TOK02 | 135 | 75.63 | 82.20 | 2
TOR01 | 40 | 90.37 | 82.90 | 1
WDC01 | 43 | 89.68 | 84.35 | 0

By performing these speed tests on the SoftLayer network, we can actually learn a lot about how speed tests work and how physical location affects network performance. But before we get into that, let's take note of a few interesting results from the table above:

  • The lowest latency from my office is to the HOU02 (Houston, Texas) data center. That data center is about 14.2 miles away as the crow flies.
  • The highest latency results from my office are to the SYD01 (Sydney, Australia) and SNG01 (Singapore) data centers. Those data centers are at least 8,600 and 10,000 miles away, respectively.
  • The fastest download speed observed is 93.16Mbps, and that number was seen from two data centers: DAL01 and DAL05.
  • The slowest download speed observed is 40.35Mbps from SNG01.
  • The fastest upload speed observed is 87.43Mbps to DAL01.
  • The slowest upload speed observed is 72.35Mbps to SNG01.
  • The upload speeds observed are faster than the download speeds from every data center outside of North America.

Are you surprised that we didn't see any results closer to 100Mbps? Is our server in Singapore underperforming? Are servers outside of North America more selfish to receive data and stingy to give it back?

Those are great questions, and they actually jumpstart an explanation of how the network tests work and what they're telling us.

Maximum Download Speed on 100Mbps Connection

If my office is 2 milliseconds from the test server in HOU02, why is my download speed only 93.12Mbps? To answer this question, we need to understand that to perform these tests, a connection is made using Transmission Control Protocol (TCP) to move the data, and TCP does a lot of work in the background. The download is broken into a number of tiny chunks called packets and sent from the sender to the receiver. TCP wants to ensure that each packet that is sent is received, so the receiver sends an acknowledgement back to the sender to confirm that the packet arrived. If the sender is unable to verify that a given packet was successfully delivered to the receiver, the sender will resend the packet.

This system is pretty simple, but in actuality, it's very dynamic. TCP wants to be as efficient as possible: to send the fewest number of packets needed to get the entire message across. To accomplish this, TCP adjusts how much data it allows in flight at once. The receiver dictates that amount by advertising a receive window, which it continually analyzes and adjusts to allow as much unacknowledged data as possible without the connection becoming unstable. Some operating systems are better than others when it comes to tweaking and optimizing TCP transfer rates, but the work TCP does to ensure that packets are sent and received without error adds overhead, and that overhead limits the maximum speed we can achieve.
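One slice of that overhead is easy to quantify. Every payload-carrying packet also spends bytes on TCP, IP, and Ethernet framing, so even a perfect transfer can't use the full line rate. A back-of-the-envelope sketch, assuming a standard 1,500-byte MTU and typical header sizes:

```python
# Back-of-the-envelope framing overhead on a 100Mbps Ethernet line.
# Assumes a standard 1500-byte MTU with TCP timestamps enabled.
link_mbps = 100
mtu = 1500
tcp_ip_headers = 20 + 20 + 12        # TCP + IP + TCP timestamp option
payload = mtu - tcp_ip_headers       # 1448 bytes of actual data per packet
# On the wire, each frame also costs Ethernet header/FCS/preamble/gap.
wire_frame = mtu + 14 + 4 + 8 + 12   # 1538 bytes per frame

goodput = link_mbps * payload / wire_frame
print(f"~{goodput:.1f} Mbps of usable throughput")  # ~94.1 Mbps
```

Acknowledgement traffic, occasional retransmissions, and window adjustments shave off a little more, which lines up with the 93.12Mbps observed from HOU02.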

Understanding the SNG01 Results

Why did my SNG01 speed test max out at a meager 40.35Mbps on my 100Mbps connection? Well, now that we understand how TCP works behind the scenes, we can see why our download speeds from Singapore are lower than we’d expect. The latency between sending a packet and receiving its acknowledgement plays into TCP’s assessment of a stable connection. Higher ping times cause TCP to keep less data in flight than it would on a low-latency connection, to ensure that no sizable chunk of data is lost (and has to be reproduced and resent).

With our global backbone optimizing the network path of the packets between Houston and Singapore, the more than 10,000-mile journey, the nature of TCP, and my computer's TCP receive window adjustments all factor into the download speeds recorded from SNG01. Looking at the results in the context of the distance the data has to travel, our results are actually well within the expected performance.
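We can sanity-check that explanation with the classic window-limited throughput bound (throughput ≤ window / round-trip time). A hedged sketch, assuming the Singapore transfer was limited purely by the effective window:

```python
# Window-limited TCP throughput: throughput <= window / round-trip time.
def max_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    return (window_bytes * 8) / (rtt_ms / 1000) / 1e6

rtt_ms = 195           # observed latency to SNG01
observed_mbps = 40.35  # observed download speed from SNG01

# Invert the bound to estimate the effective window during the test.
implied_window = observed_mbps * 1e6 / 8 * (rtt_ms / 1000)
print(f"implied window ~{implied_window / 1024:.0f} KiB")  # ~960 KiB

# The same window is no bottleneck at all at Houston's 2ms round trip:
print(f"{max_throughput_mbps(int(implied_window), 2):.0f} Mbps ceiling at 2ms")
```

An effective window of roughly 1MB over a 195ms round trip yields almost exactly the 40.35Mbps observed, while the same window at Houston's 2ms round trip could theoretically sustain gigabits per second, far above the 100Mbps port.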

Because the default behavior of TCP is partially to blame for the results, we could actually tweak the test and tune our configurations to deliver faster speeds. To confirm that improvements can be made relatively easily, we can actually just look at the answer to our third question...

Upload > Download?

Why are the upload speeds faster than the download speeds after latency jumps from 50ms to 114ms? Every location in North America is within 2,000 miles of Houston, while the closest location outside of North America is about 5,000 miles away. With what we've learned about how TCP and physical distance play into download speeds, that jump in distance explains why the download speeds drop from 90.33Mbps to 77.41Mbps as soon as we cross an ocean, but how can the upload speeds to Europe (and even APAC) stay on par with their North American counterparts? The only difference between our download path and upload path is which side is sending and which side is receiving. And if the receiver determines the size of the TCP receive window, the most likely culprit in the discrepancy between download and upload speeds is TCP windowing.

A Linux server is built and optimized to be a server, whereas my Mac OS X laptop has a lot of other responsibilities, so it shouldn't come as a surprise that the default TCP receive window handling is better on the server side. With changes to the way my laptop handles TCP, download speeds would likely be improved significantly. Additionally, if we wanted to push the envelope even further, we might consider using a different transfer protocol to take advantage of the consistent, controlled network environment.
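As one hedged example of the kind of TCP change mentioned above, an application (or a tuned OS) can request larger socket buffers, giving TCP room to advertise a bigger receive window; the buffer size below is arbitrary, and the OS may clamp it to its configured limits:

```python
import socket

# Request a larger receive buffer so TCP can advertise a bigger window.
# The OS may round or clamp this value to its own configured maximums.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 4 * 1024 * 1024)

# Read back what the OS actually granted (Linux reports double the request).
granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print(f"receive buffer: {granted / 1024:.0f} KiB")
sock.close()
```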

The Importance of Physical Location in Cloud Computing

These real-world test results under controlled conditions demonstrate the significance of data's geographic proximity to its user on the user's perceived network performance. We know that the network latency in a 14-mile trip will be lower than the latency in a 10,000-mile trip, but we often don't think about the ripple effect latency has on other network performance indicators. And this experiment actually controls a lot of other variables that can exacerbate the performance impact of geographic distance. The tests were run on a 100Mbps connection because that's a pretty common maximum port speed, but if we ran the same tests on a GigE line, the difference would be even more dramatic. Proof: HOU02 @ 1Gbps v. SNG01 @ 1Gbps

Let's apply our experiment to a real-world example: Half of our site's user base is in Paris and the other half is in Singapore. If we chose to host our cloud infrastructure exclusively from Paris, our users would see dramatically different results. Users in Paris would have sub-10ms latency while users in Singapore have about 300ms of latency. Obviously, operating cloud servers in both markets would be the best way to ensure peak performance in both locations, but what if you can only afford to provision your cloud infrastructure in one location? Where would you choose to provision that infrastructure to provide a consistent user experience for your audience in both markets?

Given what we've learned, we should probably choose a location with roughly the same latency to both markets. We can use the SoftLayer Looking Glass to see that San Jose, California (SJC01) would be a logical midpoint ... At this second, the latency between SJC and PAR on the SoftLayer backbone is 149ms, and the latency between SJC and SNG is 162ms, so both would experience very similar performance (all else being equal). Our users in the two markets won't experience mind-blowing speeds, but neither will experience mind-numbing speeds either.
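That placement logic is easy to sketch. Using the SJC latencies quoted above, plus hypothetical figures for hosting directly in either market, we pick the site whose worst-case latency to any market is smallest:

```python
# Pick the site that minimizes the worst-case latency to our two markets.
# SJC01 figures are the ones quoted above; the others are hypothetical.
latency_ms = {
    "SJC01": {"paris": 149, "singapore": 162},
    "PAR01": {"paris": 5,   "singapore": 305},  # assumption: illustrative
    "SNG01": {"paris": 305, "singapore": 5},    # assumption: illustrative
}

best = min(latency_ms, key=lambda site: max(latency_ms[site].values()))
print(best, max(latency_ms[best].values()))     # SJC01 162
```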

The network performance implications of physical distance apply to all cloud providers, but because of the SoftLayer global network backbone, we're able to control many of the variables that lead to higher (or inconsistent) latency to and from a given data center. The longer a single provider can route traffic, the more efficiently that traffic will move. You might see the same latency to another provider's cloud infrastructure from a given location at a given time across the public Internet, but you certainly won't see the same consistency from all locations at all times. SoftLayer has spent millions of dollars to build, maintain, and grow our global network backbone to transport public and private network traffic, and as a result, we feel pretty good about claiming to provide the best network performance in cloud computing.

-@khazard
