Posts Tagged 'Speed'

January 24, 2013

Startup Series: SPEEDILICIOUS

Research from the Aberdeen Group shows the average website is losing 9% of its business because the speed of the site frustrates visitors into leaving. Think about that: 9% of your traffic may be abandoning your site simply because it feels too slow. That's a staggering thought, and any site owner would be foolish not to fix the problem. SPEEDILICIOUS — one of our new Catalyst partners — has an innovative solution that optimizes website performance and helps businesses deliver content to their end users faster.

SPEEDILICIOUS

I recently had the chance to chat with SPEEDILICIOUS founders Seymour Segnit and Chip Krauskopf, and Seymour rephrased that "9%" statistic in a pretty alarming way: "Losing 9% of your business is the equivalent of simply allowing your website to go offline, down, dark, dead, 404 for over a MONTH each year!" There is ample data to back this up from high-profile sites like Amazon, Microsoft and Walmart.com, but intuitively, you know it already ... A slow site (even a slightly slow site) is annoying.

The challenge many website owners have when it comes to their loading speeds is that problems might not be noticeable from their own workstations. Thanks to caching and the Internet connections most of us have, when we visit our own sites, we don't have any trouble accessing our content quickly. Unfortunately, many of our customers don't share that experience when they visit our sites over mobile, hotel, airport and (worst of all) conference connections. The most common approach to speeding up load times is to throw bigger servers or a CDN (content delivery network) at the problem, but while those improvements make a difference, they only address part of the problem ... Even with the most powerful servers in SoftLayer's fleet, your page can load at a crawl if your code can't be rendered quickly by a browser.

That makes life as a website developer difficult. The process of optimizing code and tweaking settings to speed up load times can be time-consuming and frustrating. Or as Chip explained to me, "Speeding up your site is essential; it shouldn't be slow and complicated. We fix that problem." Take a look:

The idea that your site performance can be sped up significantly overnight seems a little crazy, but if it works (which it clearly does), wouldn't it be crazier not to try it? SPEEDILICIOUS offers a $1 trial for you to see the results on your own site, and they regularly host a free webinar called "How to Grow Your Business 5-15% Overnight" which covers the critical techniques for speeding up any website.

As technology continues to improve and behavioral patterns of purchasing migrate away from the mall and onto our computers and smart phones, SPEEDILICIOUS has a tremendous opportunity to capture a ripe market. So they're clearly a great fit for Catalyst. If you're interested in learning more or would like to speak to Seymour, Chip or anyone on their team, please let me know and I'll make the direct introduction any time.

-@JoshuaKrammes

December 17, 2012

Big Data at SoftLayer: The Importance of IOPS

The jet flow gates in the Hoover Dam can release up to 73,000 cubic feet — the equivalent of 546,040 gallons — of water per second at 120 miles per hour. Imagine replacing those jet flow gates with a single garden hose that pushes 25 gallons per minute (or 0.42 gallons per second). Things would get ugly pretty quickly. In the same way, a massive "big data" infrastructure can be crippled by insufficient IOPS.

IOPS — Input/Output Operations Per Second — measure computer storage in terms of the number of read and write operations it can perform in a second. IOPS are a primary concern for database environments where content is being written and queried constantly, and when we take those database environments to the extreme (big data), the importance of IOPS can't be overstated: If you aren't able to perform database reads and writes quickly in a big data environment, it doesn't matter how many gigabytes, terabytes or petabytes you have in your database ... You won't be able to efficiently access, add to or modify your data set.

As we worked with 10gen to create, test and tweak SoftLayer's MongoDB engineered servers, our primary focus centered on performance. Since the performance of massively scalable databases is dictated by the read and write operations to that database's data set, we invested significant resources into maximizing the IOPS for each engineered server ... And that involved a lot more than just swapping hard drives out of servers until we found a configuration that worked best. Yes, "Disk I/O" — the number of input/output operations a given disk can perform — plays a significant role in big data IOPS, but many other factors limit big data performance. How is performance impacted by network-attached storage? At what point will a given CPU become a bottleneck? How much RAM should be included in a base configuration to accommodate the load we expect our users to put on each tier of server? Are there operating system changes that can optimize the performance of a platform like MongoDB?

The resulting engineered servers are a testament to the blood, sweat and tears that were shed in the name of creating a reliable, high-performance big data environment. And I can prove it.

Most shared virtual instances — the scalable infrastructure many users employ for big data — rely on network-attached storage. When data has to be queried over a network connection (rather than from a local disk), you introduce latency and more "moving parts" that have to work together. Disk I/O might be amazing on the enterprise SAN where your data lives, but because that data is not stored on-server with your processor or memory resources, performance can sporadically go from "Amazing" to "I Hate My Life" depending on network traffic. When I tested the IOPS for network-attached storage from a large competitor's virtual instances, I saw an average of around 400 IOPS per mount. It's difficult to say whether that's "not good enough" because every application has different needs in terms of concurrent reads and writes, but it certainly could be better. We performed some internal testing of the IOPS for the hard drive configurations in our Medium and Large MongoDB engineered servers to give you an apples-to-apples comparison.

Before we get into the tests, here are the specs for the servers we're using:

Medium (MD) MongoDB Engineered Server
  • Dual 6-core Intel 5670 CPUs
  • CentOS 6 64-bit
  • 36GB RAM
  • 1Gb Network - Bonded

Large (LG) MongoDB Engineered Server
  • Dual 8-core Intel E5-2620 CPUs
  • CentOS 6 64-bit
  • 128GB RAM
  • 1Gb Network - Bonded
 

The numbers shown below reflect the average number of IOPS we recorded with a 100% random read/write workload on each of these engineered servers. To measure these IOPS, we used a tool called fio with an 8k block size and an iodepth of 128. Remembering that the virtual instance using network-attached storage was able to get 400 IOPS per mount, let's look at how our "base" configurations perform:

Medium - 2 x 64GB SSD RAID1 (Journal) - 4 x 300GB 15k SAS RAID10 (Data)
  • Random Read IOPS - /var/lib/mongo/logs: 2937
  • Random Write IOPS - /var/lib/mongo/logs: 1306
  • Random Read IOPS - /var/lib/mongo/data: 1720
  • Random Write IOPS - /var/lib/mongo/data: 772
  • Random Read IOPS - /var/lib/mongo/data/journal: 19659
  • Random Write IOPS - /var/lib/mongo/data/journal: 8869

Medium - 2 x 64GB SSD RAID1 (Journal) - 4 x 400GB SSD RAID10 (Data)
  • Random Read IOPS - /var/lib/mongo/logs: 30269
  • Random Write IOPS - /var/lib/mongo/logs: 13124
  • Random Read IOPS - /var/lib/mongo/data: 33757
  • Random Write IOPS - /var/lib/mongo/data: 14168
  • Random Read IOPS - /var/lib/mongo/data/journal: 19644
  • Random Write IOPS - /var/lib/mongo/data/journal: 8882

Large - 2 x 64GB SSD RAID1 (Journal) - 6 x 600GB 15k SAS RAID10 (Data)
  • Random Read IOPS - /var/lib/mongo/logs: 4820
  • Random Write IOPS - /var/lib/mongo/logs: 2080
  • Random Read IOPS - /var/lib/mongo/data: 2461
  • Random Write IOPS - /var/lib/mongo/data: 1099
  • Random Read IOPS - /var/lib/mongo/data/journal: 19639
  • Random Write IOPS - /var/lib/mongo/data/journal: 8772

Large - 2 x 64GB SSD RAID1 (Journal) - 6 x 400GB SSD RAID10 (Data)
  • Random Read IOPS - /var/lib/mongo/logs: 32403
  • Random Write IOPS - /var/lib/mongo/logs: 13928
  • Random Read IOPS - /var/lib/mongo/data: 34536
  • Random Write IOPS - /var/lib/mongo/data: 15412
  • Random Read IOPS - /var/lib/mongo/data/journal: 19578
  • Random Write IOPS - /var/lib/mongo/data/journal: 8835

Clearly, the 400 IOPS per mount you'd see from SAN-based storage can't hold a candle to the performance of a physical disk, regardless of whether it's SAS or SSD. As you'd expect, the "Journal" reads and writes have roughly the same IOPS across all of the configurations because all four configurations use 2 x 64GB SSD drives in RAID1. In both server sizes, the SSD drives provide better Data mount read/write performance than the 15K SAS drives, and the results suggest that having more physical drives in a Data mount will provide higher average IOPS. To put that observation to the test, I maxed out the number of hard drives in both configurations (10 in the 2U MD server and 34 in the 4U LG server) and recorded the results:

Medium - 2 x 64GB SSD RAID1 (Journal) - 10 x 300GB 15k SAS RAID10 (Data)
  • Random Read IOPS - /var/lib/mongo/logs: 7175
  • Random Write IOPS - /var/lib/mongo/logs: 3481
  • Random Read IOPS - /var/lib/mongo/data: 6468
  • Random Write IOPS - /var/lib/mongo/data: 1763
  • Random Read IOPS - /var/lib/mongo/data/journal: 18383
  • Random Write IOPS - /var/lib/mongo/data/journal: 8765

Medium - 2 x 64GB SSD RAID1 (Journal) - 10 x 400GB SSD RAID10 (Data)
  • Random Read IOPS - /var/lib/mongo/logs: 32160
  • Random Write IOPS - /var/lib/mongo/logs: 12181
  • Random Read IOPS - /var/lib/mongo/data: 34642
  • Random Write IOPS - /var/lib/mongo/data: 14545
  • Random Read IOPS - /var/lib/mongo/data/journal: 19699
  • Random Write IOPS - /var/lib/mongo/data/journal: 8764

Large - 2 x 64GB SSD RAID1 (Journal) - 34 x 600GB 15k SAS RAID10 (Data)
  • Random Read IOPS - /var/lib/mongo/logs: 17566
  • Random Write IOPS - /var/lib/mongo/logs: 11918
  • Random Read IOPS - /var/lib/mongo/data: 9978
  • Random Write IOPS - /var/lib/mongo/data: 6526
  • Random Read IOPS - /var/lib/mongo/data/journal: 18522
  • Random Write IOPS - /var/lib/mongo/data/journal: 8722

Large - 2 x 64GB SSD RAID1 (Journal) - 34 x 400GB SSD RAID10 (Data)
  • Random Read IOPS - /var/lib/mongo/logs: 34220
  • Random Write IOPS - /var/lib/mongo/logs: 15388
  • Random Read IOPS - /var/lib/mongo/data: 35998
  • Random Write IOPS - /var/lib/mongo/data: 17120
  • Random Read IOPS - /var/lib/mongo/data/journal: 17998
  • Random Write IOPS - /var/lib/mongo/data/journal: 8822

It should come as no surprise that adding more drives into the configuration gets us better IOPS, but you might be wondering why the results aren't "betterer" when it comes to the IOPS in the SSD drive configurations. While the IOPS numbers improve going from four to ten drives in the medium engineered server and from six to thirty-four drives in the large engineered server, they don't scale anywhere near as dramatically as they do for the SAS drives. This is what I meant when I explained that several factors contribute to and potentially limit IOPS performance. In this case, the limiting factor throttling the (ridiculously high) IOPS is the RAID card we are using in the servers. We've been working with our RAID card vendor to test a new card that would open a little more headroom for SSD IOPS, but that replacement card doesn't yet provide the consistency and reliability we need for these servers (which are just as important as speed).

There are probably a dozen other observations I could point out about how each result compares with the others (and why), but I'll stop here and open the floor for you. Do you notice anything interesting in the results? Does anything surprise you? What kind of IOPS performance have you seen from your server/cloud instance when running a tool like fio?

-Kelly

August 21, 2012

High Performance Computing - GPU v. CPU

Sometimes, technical conversations can sound like people are just making up tech-sounding words and acronyms: "If you want HPC to handle Gigaflops of computational operations, you probably need to supplement your server's CPU and RAM with a GPU or two." It's like hearing a shady auto mechanic talk about replacing gaskets on double overhead flange valves or hearing Chris Farley (in Tommy Boy) explain that he was "just checking the specs on the endline for the rotary girder" ... You don't know exactly what they're talking about, but you're pretty sure they're lying.

When we talk about high performance computing (HPC), a natural tendency is to go straight into technical specifications and acronyms, but that makes the learning curve steeper for people who are trying to understand why a solution is better suited for certain types of workloads than technology they are already familiar with. With that in mind, I thought I'd share a quick explanation of graphics processing units (GPUs) in the context of central processing units (CPUs).

The first thing that usually confuses people about GPUs is the name: "Why do I need a graphics processing unit on a server? I don't need to render the visual textures from Crysis on my database server ... A GPU is not going to benefit me." It's true that you don't need cutting-edge graphics on your server, but a GPU's power isn't limited to "graphics" operations. The "graphics" part of the name reflects the kind of processing GPUs were originally intended to perform, but over the last ten years or so, developers and engineers have adapted that processing power for general-purpose computing.

GPUs were designed with a highly parallel structure that allows large blocks of data to be processed at one time — similar computations are made on many pieces of data simultaneously (rather than in order). If you assigned the task of rendering a 3D environment to a CPU, it would slow to a crawl because a CPU handles requests more linearly. Because GPUs are better than CPUs at performing repetitive tasks on large blocks of data, you start to see the benefit of enlisting a GPU in a server environment.

The Folding@home project and bitcoin mining are two of the most visible distributed computing projects that GPUs are accelerating, and they're perfect examples of workloads made exponentially faster with the parallel processing power of graphics processing units. You don't need to be folding proteins or mining bitcoin to get the performance benefits, though; if you are taxing your CPUs with repetitive compute tasks, a GPU could make your life a lot easier.

If that still doesn't make sense, I'll turn the floor over to the Mythbusters in a presentation for our friends at NVIDIA:

SoftLayer uses NVIDIA Tesla GPUs in our high performance computing servers, so developers can use "Compute Unified Device Architecture" (CUDA) to easily take advantage of their GPU's capabilities.

Hopefully, this quick rundown is helpful in demystifying the "technobabble" about GPUs and HPC ... As a quick test, see if this sentence makes more sense now than it did when you started this blog: "If you want HPC to handle Gigaflops of computational operations, you probably need to supplement your server's CPU and RAM with a GPU or two."

-Phil

August 17, 2012

SoftLayer Private Clouds - Provisioning Speed

SoftLayer Private Clouds are officially live, and that means you can now order and provision your very own private cloud infrastructure on Citrix CloudPlatform quickly and easily. Chief Scientist Nathan Day introduced private clouds on the blog when the offering was announced at Cloud Expo East, and CTO Duke Skarda followed up with an explanation of the architecture powering SoftLayer Private Clouds. The most amazing claim: You can order a private cloud infrastructure and spin up its first virtual machines in a matter of hours rather than days, weeks or months.

If you've ever looked at building your own private cloud in the past, the "days, weeks or months" timeline isn't very surprising — you have to get the hardware provisioned, the software installed and the network configured ... and it all has to work together. Hearing that SoftLayer Private Clouds can be provisioned in "hours" probably seems too good to be true to administrators who have tried building a private cloud in the past, so I thought I'd put it to the test by ordering a private cloud and documenting the experience.

At 9:30am, I walked over to Phil Jackson's desk and asked him if he would be interested in helping me out with the project. By 9:35am, I had him convinced (proof), and the clock was started.

When we started the order process, part of our work was already done for us:

SoftLayer Private Clouds

To guarantee peak performance of the CloudPlatform management server, SoftLayer selected the hardware for us: A single processor quad core Xeon 5620 server with 6GB RAM, GigE, and two 2.0TB SATA II HDDs in RAID1. With the management server selected, our only task was choosing our host server and where we wanted the first zone (host server and management server) to be installed:

SoftLayer Private Clouds

For our host server, we opted for a dual processor quad core Xeon 5504 with the default specs, and we decided to spin it up in DAL05. We added (and justified) a block of 16 secondary IP addresses for our first zone, and we submitted the order. The time: 9:38am.

At this point, it would be easy for us to game the system to shave off a few minutes from the provisioning process by manually approving the order we just placed (since we have access to the order queue), but we stayed true to the experiment and let it be approved as it normally would be. We didn't have to wait long:

SoftLayer Private Clouds

At 9:42am, our order was approved, and the pressure was on. How long would it take before we were able to log into the CloudStack portal to create a virtual machine? I'd walked over to Phil's desk 12 minutes ago, and we still had to get two physical servers online and configured to work with each other on CloudPlatform. Luckily, the automated provisioning process took on the brunt of that pressure.

Both server orders were sent to the data center, and the provisioning system selected two pieces of hardware that best matched what we needed. Our exact configurations weren't available, so an SBT in the data center was dispatched to make the appropriate hardware changes to meet our needs, and the automated system kicked into high gear. IP addresses were assigned to the management and host servers, and we were able to monitor each server's progress in the customer portal. The hardware was tested and prepared for OS install, and when it was ready, the base operating systems were loaded — CentOS 6 on the management server and Citrix XenServer 6 on the host server. After CentOS 6 finished provisioning on the management server, CloudStack was installed. Then we got an email:

SoftLayer Private Clouds

At 11:24am, less than two hours from when I walked over to Phil's desk, we had two servers online and configured with CloudStack, and we were ready to provision our first virtual machines in our private cloud environment.

We logged into CloudStack and added our first instance:

SoftLayer Private Clouds

We configured our new instance in a few clicks, and we clicked "Launch VM" at 11:38am. It came online in just over 3 minutes (11:42am):

SoftLayer Private Clouds

I went from "walking to Phil's desk" to having a multi-server private cloud infrastructure running a VM in exactly two hours and twelve minutes. For fun, I created a second VM on the host server, and it was provisioned in 31.7 seconds. It's safe to say that the claim that SoftLayer takes "hours" to provision a private cloud has officially been confirmed, but we thought it would be fun to add one more wrinkle to the system: What if we wanted to add another host server in a different data center?

From the "Hardware" tab in the SoftLayer portal, we selected "Add Zone" to from the "Actions" in the "Private Clouds" section, and we chose a host server with four portable IP addresses in WDC01. The zone was created, and the host server went through the same hardware provisioning process that our initial deployment went through, and our new host server was online in < 2 hours. We jumped into CloudStack, and the new zone was created with our host server ready to provision VMs in Washington, D.C.

Given how quickly the instances were spinning up in the first zone, we timed a few in the second zone ... The first instance was online in about 4 minutes, and the second was running in 26.8 seconds.

SoftLayer Private Clouds

By the time I went out for a late lunch at 1:30pm, we'd spun up a new private cloud infrastructure with geographically dispersed zones that launched new cloud instances in under 30 seconds. Not bad.

Don't take my word for it, though ... Order a SoftLayer Private Cloud and see for yourself.

-@khazard

July 5, 2012

Bandwidth Utilization: Managing a Global Network

SoftLayer has over 1,750 Gbit/s of network capacity. In each of our data centers and points of presence, we have an extensive library of peering relationships and multiple 10 Gbit/s connections to independent Tier 1 carriers. We operate one of the fastest, most reliable networks on the planet, and our customers love it:

From a network operations standpoint, that means we have our work cut out for us to keep everything running smoothly while continuing to build the network to accommodate a steady increase in customer demand. It might be easier to rest on our laurels and simply maintain what we already have in place, but when you look at the trend of bandwidth usage over the past 18 months, you'll see why we need to be proactive about expanding our network:

Long Term Bandwidth Usage Trend

The purple line above plots the 95th percentile of weekly outbound bandwidth utilization on the SoftLayer network, and the red line shows the linear trend of that consumption over time. From week to week, the total usage appears relatively consistent, growing at a steady rate, but when you look a little deeper, you get a better picture of how dynamic our network actually is:

SoftLayer Weekly Bandwidth Usage

The animated gif above shows the 2-hour average of bandwidth usage on our entire network over a seven-week period (times in CDT). As you can see, on a day-to-day basis, consumption fluctuates pretty significantly. The NOC (Network Operations Center) needs to be able to accommodate every spike of usage at any time of day, and our network engineering and strategy teams have to stay ahead of the game when it comes to planning points of presence and increasing bandwidth capacity to accommodate our customers' ever-expanding needs.
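If you're curious how a "95th percentile" figure like the ones plotted above is actually derived, the arithmetic is simple: collect a bandwidth sample for every interval, sort the samples, and throw away the top 5% of readings so that short bursts don't dominate the picture. Here's a minimal sketch in JavaScript — illustrative only, with made-up sample values, and not the code our monitoring systems actually use:

// One common way to compute a 95th percentile from raw bandwidth samples (Mbps).
// Illustrative sketch only -- billing/monitoring systems differ in the exact method.
function percentile95(samples) {
    var sorted = samples.slice().sort(function(a, b) { return b - a; });  // highest first
    var discard = Math.floor(sorted.length * 0.05);  // ignore the top 5% of readings
    return sorted[discard];  // the next-highest reading is the 95th percentile
}

// Twenty made-up 5-minute samples with one short spike to 900 Mbps:
var readings = [480, 515, 610, 470, 900, 530, 495, 605, 640, 505,
                520, 498, 560, 610, 475, 590, 525, 610, 480, 500];
console.log(percentile95(readings));  // 640 -- the single 900 Mbps spike is thrown out

Computed over each week's worth of samples, that's essentially what the purple line in the first graph represents.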

But wait. There's more.

Let's go one level deeper and look at a graph of the 95th percentile bandwidth usage on 5-minute intervals from one week in a single data center:

Long Term Bandwidth Usage Trend

The variations in usage are even more dramatic. Because we have thirteen data centers geographically dispersed around the world with an international customer base, the variations you see in total bandwidth utilization understate the complexity of our network's bandwidth usage. Customers targeting the Asian market might host content in SNG01, and the peaks in bandwidth consumption from Singapore will counterbalance the valleys of consumption at the same time in the United States and Europe.

With that in mind, here's a challenge for you: Looking at the graph above, if the times listed are in CDT, which data center do you think that data came from?

It would be interesting to look at weekly usage trends, how those trends are changing and what those trends tell us about our customer base, but that assessment would probably be "information overload" in this post, so I'll save that for another day.

-Dani

P.S. If you came to this post expecting to see "a big truck" or "a series of tubes," I'm sorry I let you down.

January 10, 2012

Web Development - HTML5 - Custom Data Attributes

I recently worked on a project that involved creating promotion codes for our clients. I wanted to make this tool as simple as possible to use, and because it involved dealing with thousands of our products in dozens of categories — each with custom pricing — I had to find a generic way to handle client-side form validation. I didn't want to write custom JavaScript functions for each of the required inputs, so I decided to use custom data attributes.

Last month, we started a series focusing on web development tips and tricks with a post about JavaScript optimization. In this installment, we'll cover how to use HTML5 custom data attributes to assist you in validating forms.

Custom data attributes for elements are "[attributes] in no namespace whose name starts with the string 'data-', has at least one character after the hyphen, is XML-compatible, and contains no characters in the range U+0041 to U+005A (LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z)." Thanks W3C. That definition is bookish, so let's break it down and look at some examples.

Valid:

<div data-name="Philip">Mr. Thompson is an okay guy.</div>
<a href="softlayer.com" data-company-name="SoftLayer" data-company-state="TX">SoftLayer</a>
<li data-color="blue">Smurfs</li>

Invalid:

// This attribute is not prefixed with 'data-'
    <h2 database-id="244">Food</h2>
 
// These 2 attributes contain capital letters in the attribute names
    <p data-firstName="Ashley" data-lastName="Thompson">...</p>
 
// This attribute does not have any valid characters following 'data-'
    <img src="/images/pizza.png" data-="Sausage" />

Now that you know what custom data attributes are, why would we use them? Custom attributes allow us to relate specific information to particular elements. This information is hidden from the end user, so we don't have to worry about the data cluttering screen space, and we don't have to create separate hidden elements whose only purpose is to hold custom data (which is just bad practice). A JavaScript programmer can put this data to many different uses. Among the most common are manipulating elements, providing custom styles (using CSS) and performing form validation. In this post, we'll focus on form validation.
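As a quick illustration (using the <li data-color="blue"> element from the "Valid" examples above), reading that hidden data back out in JavaScript is a one-liner:

// MooTools: grab the first element carrying the attribute and read its value.
var color = $$('li[data-color]')[0].get('data-color');  // "blue"

// Plain JavaScript works too -- custom data attributes are just regular attributes.
var sameColor = document.querySelector('li[data-color]').getAttribute('data-color');  // "blue"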

Let's start out with a simple form with two radio inputs. Since I like food, our labels will be food items!

<input type="radio" name="food" value="pizza" data-sl-required="food" data-sl-show=".pizza" /> Pizza
<input type="radio" name="food" value="sandwich" data-sl-required="food" data-sl-show="#sandwich" /> Sandwich
<div class="hidden required" data-sl-error-food="1">A food item must be selected</div>

Here we have two standard radio inputs with some custom data attributes and a hidden div (using CSS) that contains our error message. The first input has a name of food and a value of pizza – these will be used on the server side once our form is submitted. There are two custom data attributes for this input: data-sl-required and data-sl-show. I am defining the data attribute data-sl-required to indicate that this radio button is required in the form and one of them must be selected to pass validation. Note that I have prefixed required with sl- to namespace my data attribute. required is generic and I don't want to have any conflicts with any other attributes – especially ones written for other projects! The value of data-sl-required is food, which I have tied to the div with the attribute data-sl-error-food (more on this later).

The second custom attribute is data-sl-show with a selector of .pizza. (To review selectors, jump back to the JavaScript optimization post.) This data attribute will be used to show a hidden div that contains the class pizza when the radio button is clicked. The sandwich radio input has the same data attributes but with slightly different values.

Now we can move on to our HTML with the hidden text inputs:

<div class="hidden pizza">
    Pizza type: <input type="text" name="pizza" data-sl-required="pizza" data-sl-depends="{&quot;type&quot;:&quot;radio&quot;,&quot;name&quot;:&quot;food&quot;,&quot;value&quot;:&quot;pizza&quot;}" />
    <div class="hidden required" data-sl-error-pizza="1">The pizza type must not be empty</div>
</div>
 
<div class="hidden" id="sandwich">
    Sandwich: <input type="text" name="sandwich" data-sl-required="sandwich" data-sl-depends="{&quot;type&quot;:&quot;radio&quot;,&quot;name&quot;:&quot;food&quot;,&quot;value&quot;:&quot;sandwich&quot;}" />
    <div class="hidden required" data-sl-error-sandwich="1">The sandwich must not be empty</div>
</div>

These are hidden divs that contain text inputs that will be used to input more data depending on the radio input selected. Notice that the first outer div has a class of pizza and the second outer div has an id of sandwich. These values tie back to the data-sl-show selectors for the radio inputs. These text inputs also contain the data-sl-required attribute like the previous radio inputs. The data-sl-depends data attribute is a fun attribute that contains a JSON-encoded object that's been run through PHP's htmlentities() function. For readability, the data-sl-depends values contain:

{"type":"radio","name":"food","value":"pizza"}
{"type":"radio","name":"food","value":"sandwich"}

The first value says that our text input "depends" on the radio input with the name food and the value pizza: that radio has to be visible and selected for the text input to be treated as required. You can just imagine the combinations you can create to build very custom functionality with very generic JavaScript.

Now we can examine our JavaScript to make sense of all these custom data attributes. Note that I'll be using Mootools' syntax, but the same can fairly easily be accomplished using jQuery or your favorite JavaScript framework. I'm going to start by creating a DataForm class. This will be generic enough so that you can use it in multiple forms and it's not tied to this specific instance. Reuse is good! To have it fit our needs, we're going to pass some options when instantiating it.

new DataForm({
    ...,
    dataAttributes: {
        required: 'data-sl-required',
        errorPrefix: 'data-sl-error',
        depends: 'data-sl-depends',
        show: 'data-sl-show'
    }
});

As you can see, I'm using the data attribute names from our form as the options – this will allow you to create your own data attribute names depending on your uses. We first need to make our hidden divs visible whenever our radio buttons are clicked – we'll use our initData() method for that.

initData: function() {
    var attrs = this.options.dataAttributes,
        divs = [];
 
    $$('input[type=radio]['+attrs.show+']').each(function(input) {
        var div = $$(input.get(attrs.show));
        divs.push(div);
        input.addEvent('change', function(e) {
            divs.each(function(div) { div.hide(); });
            div.show();
        });
    });
}

This method grabs all the radio inputs with the show attribute (data-sl-show) and adds an onchange event to each of them. When a radio input is checked, it first hides all the divs, and then shows the div that's associated with that radio input.

Great! Now we have our text inputs showing and hiding depending on which radio button is selected. Now onto the actual validation. First, we'll make sure our required radio inputs are checked:

$$('input[type=radio]['+attrs.required+']:not(:checked)').each(function(input) {
    var name = input.get('name');
    checkError(name, isRadioWithNameSelected(name))
});

This grabs all the unchecked radio inputs with the required attribute (data-sl-required) and checks to see if any radio with that same name is selected using the function isRadioWithNameSelected(). The checkError() function will show or hide the error div (data-sl-error-*) depending on whether the radio is checked. Don't worry - you'll see how these functions are implemented in the JSFiddle below.
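If you'd like a rough idea of those helpers before you open the fiddle, here's a minimal sketch of what they could look like — my own guess at an implementation (using the attribute names from this post), not necessarily what's in the JSFiddle:

function isRadioWithNameSelected(name) {
    // True if any radio button in the named group is currently checked.
    return $$('input[type=radio][name="' + name + '"]:checked').length > 0;
}

function checkError(name, isValid) {
    // Show the matching data-sl-error-* div when validation fails; hide it when it passes.
    $$('[data-sl-error-' + name + ']').each(function(errorDiv) {
        if (isValid) {
            errorDiv.hide();
        } else {
            errorDiv.show();
        }
    });
}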

Now we can check our text inputs:

$$('input[type=text]['+attrs.required+']:visible').each(function(input) {
    var name = input.get('name'),
        depends = input.get(attrs.depends),
        value = input.get('value').trim();
 
    if (depends) {
        depends = JSON.decode(depends);
        switch (depends.type) {
            case 'radio':
                if (depends.name && depends.value) {
                    var radio = $$('input[type=radio][name="'+depends.name+'"][value="'+depends.value+'"]:checked');
                    if (!radio.length) {
                        checkError(input.get(attrs.required), true);
                        return;
                    }
                }
                break;
        }
    }
 
    checkError(name, value!='');
});

This obtains the required and visible text inputs and determines if they are empty or not. Here we look at our depends object to grab the associated radio inputs. If that radio is checked and the text input is empty, the error message appears. If that radio is not checked, it doesn't even evaluate that text input. The depends.type is in a switch statement to allow for easy expansion of types. There are other cases for evaluating relationships ... I'll let you come up with more for yourself.
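As one hypothetical example, a "checkbox" case could slot into the switch above so that a text input depends on a checked checkbox instead of a radio button — the data-sl-depends value would look something like {"type":"checkbox","name":"toppings"}:

case 'checkbox':
    // Hypothetical extension: only require the text input when the named checkbox is checked.
    if (depends.name) {
        var checkbox = $$('input[type=checkbox][name="' + depends.name + '"]:checked');
        if (!checkbox.length) {
            checkError(input.get(attrs.required), true);
            return;
        }
    }
    break;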

That concludes our usage of custom data attributes in form validation. This really only touched upon the very tip of the iceberg. Data attributes allow you – the creative developer – to interact in a new, HTML-valid way with your web pages.

Check out the working example of the above code at jsfiddle.net. To read more on custom data attributes, see what Google has to say. If you want to see really cool functionality that uses data attributes plus so much more, check out Aaron Newton's Behavior module over at clientcide.com.

Happy coding!

-Philip

December 27, 2011

186,282.4 Miles Per Second

Let's say there are 2495 miles separating me and the world's foremost authority on orthopedics, who lives in Vancouver, Canada. If I needed some medical advice on how to remove a screwdriver from the palm of my hand — the result of a Christmas toy with "some assembly required" — I'd be pretty happy I live in the year 2011. Here are a few of the communication methods I might have had to settle for in years past:

On Foot: The average human can sustain a walking pace of about 3.5 mph. Using this method, it would take a messenger 29.7 days of non-stop walking to get a description of the problem and a drawing of the damage to that doctor. Because the doctor in this theoretical scenario is the only person on the planet who knows how to perform the screwdriver removal surgery, the doctor would have to accompany the messenger back to Texas, and I am fairly sure that by the time they arrived, they'd have to visit a grave with a terrible epitaph like "He got screwed," or they'd find me answering to a crass nickname like "Stumpy."

On Horseback: A horse can sustain a gallop of around 30 mph, so with the help of a couple of equestrian friends, the message could reach the doctor in 3.5 days if the horse ran the whole journey without stopping. The doctor could then saddle up and hit the trail back to Houston, getting here in about 7 days. In that span of time, I'd only be able to wave to him with one hand, given the inevitable amputation.

Via High-Speed Rail: With an average speed of 101 mph, it would take a mere 24.7 hours to get from Houston to Vancouver, so if this means of communication were the only one used, I could have the doctor at my bedside in a little over 48 hours. That turnaround time might mean my hand would be saved, but the delay would still yield a terrible headache and a lot of embarrassment ... seeing as how a screwdriver in your hand is relatively noticeable at Christmas parties.

Via Commercial Flight: If the message was taken by plane and the doctor returned by plane, the round trip would be around 12.4 hours at an average rate of 400 mph ... I'd only have to endure half a day of mockery.

Via E-mail: With the multimedia capabilities of email, the doctor could be sent a picture of the damage instantly and a surgeon in Houston could be instructed on how to best save my hand. There would be little delay, but there are no guarantees that the stand-in surgeon would be able to correctly execute on the instructions given by this theoretical world's only orthopedic surgeon.

Via Video Chat: In milliseconds, a video connection could be made between the stand-in surgeon and the orthopedic specialist. The specialist could watch and instruct the stand-in surgeon on how to complete the surgery, and I'd be using both hands again by Christmas morning. Technology is also getting to a point where the specialist could perform parts of the surgery remotely ... Let's just hope they use a good network connection on both ends, since any latency would be pretty significant.

I started thinking about the amazing speed with which we access information when I met with CTO Duke Skarda. He gave a few examples of customers whose businesses piqued his interest because of their innovative nature, and one in particular made me realize how far we've come in the availability and speed of our access to information:

The company facilitates online advertising by customizing the ad experience for each visitor, auctioning off ad space to companies that fit that particular visitor's profile. In the simplest sense, a website has a blank area for an advertisement, and the site sends non-sensitive information about the visitor to an advertising network. The advertising network then distributes that information to multiple advertisers, who process it, generate targeted ads and place a bid to "purchase" the space for that visitor. The winner of the auction is determined, and the winner's ad is displayed on the website.

All of this is done in under a second, before the visitor even knows the process took place.

We live in a time of instant access. We are only limited by the speed of light, a blazing 186,282.4 miles per second. That means you could, theoretically, send a message all the way around the world — roughly 24,900 miles — in about 0.13 seconds. Businesses use this speed to create and market products and services to the global market, and I can't wait to see what tomorrow holds ... Maybe some kind of technology that prevents screwdrivers from piercing hands?

-Clayton

December 12, 2011

Web Development - JavaScript Optimization

So you have a fast website. You've optimized your database queries and you're using the most efficient page caching mechanisms. The users of your site are pretty satisfied, but you want just a little bit more. How can you push your site to the next level by making sure you've included all the optimizations you can to get the fastest speed possible?

Optimize your JavaScript.

You might not be concerned with optimizing your code because newer browsers have gotten significantly faster at processing JavaScript. But as with any other programming language, there are minor tweaks you can make that provide a significant performance gain. Let's take a look at a few of the ways you can be certain that you're getting the most out of your client-side JavaScript code.

JavaScript in External Files
The first optimization is to write your JavaScript in external files. By using external files, the browser is able to cache your code. This prevents users from needing to re-download your JavaScript during every page load and potentially during every AJAX request. When a browser visits any website, it checks the src attribute in the <script> tags and determines whether it already has a copy of that file cached locally. If it does, it won't spend wasteful time re-downloading the exact same code. If you include your code directly in your HTML within <script> tags, the browser has to download that same code every time so that it can be processed. Save those precious bytes!

Minification
Now that you're putting your code in external files, your next goal is to reduce the amount of time it takes to initially download that code so that it can be cached by the browser. This is where minification comes into play. Minification (or minimization) is the process of removing all unnecessary characters from source code (without changing its functionality). Essentially this means you'll remove whitespace characters and comments.

Unless you like to torture yourself (and those who follow your work), you probably write code that's pretty readable. This includes splitting code up between multiple lines, indenting your code with spaces or tabs, and writing comments that explain some esoteric portion of code. All these items are not important to the JavaScript engine because it will just ignore them, but it still takes time to download these extra bytes that will never get used.

Luckily for you, you don't have to spend time creating the tool that will remove these unneeded characters. There are several good tools out there that will minify your code, and I recommend YUI Compressor. This tool can reduce your code by up to 60% (depending upon how efficiently you write your code). In addition to removing all the unnecessary characters, YUI Compressor will rewrite your code to use smaller variable names. If you have a local variable in a function called var myDescriptiveVariable, the compressor will rename it to var a. We just took our 21-character variable name down to 1 character! If this variable is used in 10 places, we've trimmed 200 characters with this one variable, and this happens for every local variable in your script! That's a lot of bytes saved.
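To make that concrete, here's a made-up function before and after minification; the "after" line is roughly the kind of output a tool like YUI Compressor produces (whitespace and comments stripped, local names shortened), not its exact output:

// Before: readable, commented, easy to maintain.
function calculateTotalPrice(itemPrices) {
    // Add up every price in the list.
    var totalPrice = 0;
    for (var itemIndex = 0; itemIndex < itemPrices.length; itemIndex++) {
        totalPrice += itemPrices[itemIndex];
    }
    return totalPrice;
}

// After: same behavior, a fraction of the bytes. The global function name survives,
// but the local parameter and variable names do not.
function calculateTotalPrice(a){var b=0;for(var c=0;c<a.length;c++){b+=a[c]}return b}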

If you're paying close attention, the key takeaway is that using local variables (as opposed to global variables, which don't get renamed during minification) is one great way to reduce the size of your code after minification occurs.

Initial Page Load Operations
Now that your code has been minified, let's take a look at initial page load operations. Since some JavaScript operations are synchronous and will halt other browser processing, we want to make sure we don't slow down the page when the user is trying to perform an action. There are two things we can do to improve initial page loading performance. First, reduce the amount of code you actually invoke on page load (or when the DOM is ready for interaction using Mootools' domready event, jQuery's document ready event or your favorite framework's equivalent). Second, implement lazy-loading techniques. For example, instead of adding all your events to all the elements on the page initially, wait for user interaction or some other process to add the events. Let's look at a few specific examples so you can see what I mean. Note: the code samples are using Mootools syntax.

Before:

var comments = $$('.articles > div.comments');
$('showComments').addEvent('click', function() { comments.show(); });

After:

$('showComments').addEvent('click', function() { $$('.articles > div.comments').show(); });

While this may improve initial page load, each time we click on showComments, we have to parse the DOM (Document Object Model) to find all the comments, and this is not a cheap operation. The key here is to test your own code and determine for each scenario which way is faster and which will keep your users the happiest. Just realize that on-demand operations may be more acceptable to users than having to wait for the page to load. You be the judge!
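If you want numbers to back up that judgment call, a rough timing harness is all it takes to compare the two approaches. This is just a sketch — the timeIt() helper and the iteration count are invented for illustration, and real results will vary from browser to browser:

// Crude benchmark: run each approach many times and compare elapsed time.
function timeIt(label, fn, iterations) {
    var start = new Date().getTime();
    for (var i = 0; i < iterations; i++) {
        fn();
    }
    console.log(label + ': ' + (new Date().getTime() - start) + 'ms for ' + iterations + ' runs');
}

var cachedComments = $$('.articles > div.comments');  // queried once, up front
timeIt('cached elements', function() { cachedComments.show(); }, 1000);
timeIt('query on demand', function() { $$('.articles > div.comments').show(); }, 1000);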

Code Caching
We discussed file caching before, but what about code caching? We need to determine which operations will benefit the most from caching. The two items I will focus on are loops and DOM interactions. Loops can be costly because you may be performing unnecessary actions within each iteration.

Before:

for (var i=0; i<$$('.toggle').length; i++) {
    var el = new Element('div.something', {
        html: $$('.toggle')[i].get('text')
    });
    el.inject($('content'));
}

After:

var i = 0,
    els = $$('.toggle'),
    length = els.length,
    container = new Element('div');

for (i=0; i<length; i++) {
    new Element('div.something', {
        html: els[i].get('text')
    }).inject(container);
}

container.inject($('content'));

The "Before" loop is inefficient because we are querying the DOM three separate times to get the elements we need (twice for .toggle elements and once for the #content element), and we are having to determine the length property of that collection of elements during each iteration. In our case, the length won't change, so we only need to find it one time. Finally, during each iteration, we add a new element to the #content div, and this causes a redraw of the page. DOM manipulation and redrawing can be expensive, so let's look at how the second method is so much more efficient.

We start by caching our DOM elements so we only have to pull them once and get the length property of those elements only once. We've also created an element which will be used to contain all our new elements. Since the container has not been added to the DOM yet, we can add and remove from it without having to worry about the expensive redraw of the page. Once all our elements have been added to the container, we then inject the container into the #content div. Since this is only happening once, we have significantly improved performance.

Selector Efficiency
The last optimization technique I'll review is selector efficiency. Selectors are used to grab specific elements from the document in order to interact with them in your code. It's very possible to write poor-performing selectors. Some selectors to avoid:

$$('*')

This will grab all the elements in your document. If you have a huge document, this could take a while. You better not be using this selector in a loop!

$$('[data-some-attribute]')

This selector is inefficient because it has to look for this data attribute within every element in the document. If there are thousands of elements and each has several data attributes, you should just go get some coffee.

$$('body .person:nth-child(3n+1) .category p span.title')

This selector is fairly complicated. The reason this can be inefficient is because it may have to unnecessarily inspect many elements to get to the one(s) it needs. Determine if you can simplify the selector by being more specific and using an id to get to elements or even consider slightly modifying your HTML so that it's easier to create efficient selectors.

There are plenty of other techniques that will help improve the speed and efficiency of your JavaScript, but these are a good start and should help make you and your users happy. Remember that DOM interactions are slow and expensive, so do what you can to reduce chatting back and forth with the document. Check your loops and minify your code, and your users will surely return.

Happy coding!

-Philip

December 8, 2011

UNIX Sysadmin Boot Camp: bash - Keyboard Shortcuts

On the support team, we're jumping in and out of shells constantly. At any time during my work day, I'll see at least four instances of PuTTY in my task bar, so one thing I learned quickly was that efficiency and accuracy in navigating those shells ultimately make life easier for our customers and for us as well. Spending too much time retyping paths and commands, navigating in vi and cycling through history can really bring you to a crawl. So now that you have had some time to study bash and practice a little, I thought I'd share some of the keyboard shortcuts that help us work as effectively and as expediently as we do. I won't be able to cover all of the shortcuts, but these are the ones I use most:

Tab

[Tab] is one of the first keyboard shortcuts that most people learn, and it's ever-so-convenient. Let's say you just downloaded pckg54andahalf-5.2.17-v54-2-x86-686-Debian.tar.gz, but a quick listing of the directory shows you ALSO downloaded 5.1.11, 4.8.6 and 1.2.3 at some point in the past. What was that file name again? Fret not. You know you downloaded 5.2.something, so you just start with, say, pckg, and hit [Tab]. This autocompletes as much of the file name as can be matched uniquely, so if no other files started with "pckg," it would fill in the whole name (and this works at any point in a command).

In this case, we've got four different files that are similar:
pckg54andahalf-5.2.17-v54-2-x86-686-Debian.tar.gz
pckg54andahalf-5.1.11-v54-2-x86-686-Debian.tar.gz
pckg54andahalf-4.8.6-v54-2-x86-686-Debian.tar.gz
pckg54andahalf-1.2.3-v54-2-x86-686-Debian.tar.gz

So typing "pckg" and hitting [Tab] brings up:
pckg54andahalf-

NOW, what you could do, knowing what files are there already, is type "5.2" and hit [Tab] again to fill out the rest. However, if you didn't know what the potential matches were, you could double-tap [Tab]. This displays all matching file names with that string.

Another fun fact: This trick also works in Windows. ;)

CTRL+R

[CTRL+R] is a very underrated shortcut in my humble opinion. When you've been working in the shell for untold hours parsing logs, moving files and editing configs, your bash history can get pretty immense. Often you'll come across a situation where you want to reproduce a command or series of commands that were run regarding a specific file or circumstance. You could type "history" and pore through the commands line by line, but I propose something more efficient: a reverse search.

Example: I've just hopped on my system and discovered that my SVN server isn't doing what it's supposed to. I want to take a look at any SVN related commands that were executed from bash, so I can make sure there were no errors. I'd simply hit [CTRL+R], which would pull up the following prompt:

(reverse-i-search)`':

Typing "s" at this point would immediately return the first command with the letter "s" in it in the history ... Keep in mind that's not just starting with s, it's containing an s. Finishing that out to "svn" brings up any command executed with those letters in that order. Pressing [CTRL+R] again at this point will cycle through the commands one by one.

In the search, I find the command that was run incorrectly ... There was a typo in it. I can edit the command within the search prompt before hitting enter and committing it to the command prompt. Pretty handy, right? This can quickly become one of your most used shortcuts.

CTRL+W & CTRL+Y

This pair of shortcuts is the one I find myself using the most. [CTRL+W] will basically take the word before your cursor and "cut" it, just like you would with [CTRL+X] in Windows if you highlighted a word. A "word" doesn't really describe what it cuts in bash, though ... It uses whitespace as a delimiter, so if you have an ultra long file path that you'll probably be using multiple times down the road, you can [CTRL+W] that sucker and keep it stowed away.

Example: I'm typing nano /etc/httpd/conf/httpd.conf (Related: The redundancy of this path always irked me just a little).
Before hitting [ENTER] I tap [CTRL+W], which chops that path right back out and stores it to memory. Because I want to run that command right now as well, I hit [CTRL+Y] to paste it back into the line. When I'm done with that and I'm out referencing other logs or doing work on other files and need to come back to it, I can simply type "nano " and hit [CTRL+Y] to go right back into that file.

CTRL+C

For the sake of covering most of my bases, I want to make sure that [CTRL+C] is covered. Not only is it useful, but it's absolutely essential for standard shell usage. This little shortcut performs the most invaluable act of killing whatever process you were running at that point. This can go for most anything, aside from the programs that have their own interfaces and kill commands (vi, nano, etc). If you start something, there's a pretty good chance you're going to want to stop it eventually.

I should be clear that this will terminate a process unless that process is otherwise instructed to trap [CTRL+C] and perform a different function. If you're compiling something or running a database command, generally you won't want to use this shortcut unless you know what you're doing. But, when it comes to everyday usage such as running a "top" and then quitting, it's essential.

Repeating a Command

There are four simple ways you can easily repeat a command with a keyboard shortcut, so I thought I'd run through them here before wrapping up:

  1. The [UP] arrow will display the previously executed command.
  2. [CTRL+P] will do the exact same thing as the [UP] arrow.
  3. Typing "!!" and hitting [Enter] will execute the previous command. Note that this actually runs it. The previous two options only display the command, giving you the option to hit [ENTER].
  4. Typing "!-1" will do the same thing as "!!", though I want to point out how it does this: When you type "history", you see a numbered list of commands executed in the past -1 being the most recent. What "!-1" does is instructs the shell to execute (!) the first item on the history (-1). This same concept can be applied for any command in the history at all ... This can be useful for scripting.

Start Practicing

What it really comes down to is finding what works for you and what suits your work style. There are a number of other shortcuts that are definitely worthwhile to take a look at. There are plenty of cheat sheets on the internet available to print out while you're learning, and I'd highly recommend checking them out. Trust me on this: You'll never regret honing your mastery of bash shortcuts, particularly once you've seen the lightning speed at which you start flying through the command line. The tedium goes away, and the shell becomes a much more friendly, dare I say inviting, place to be.

Quick reference for these shortcuts:

  • [TAB] - Autocomplete to furthest point in a unique matching file name or path.
  • [CTRL+R] - Reverse search through your bash history
  • [CTRL+W] - Cut one "word" back, or until whitespace encountered.
  • [CTRL+Y] - Paste a previously cut string
  • [CTRL+P] - Display previously run command
  • [UP] - Display previously run command

-Ryan

November 28, 2011

Brisket and BYOC

With all of the cooking and eating going on around Thanksgiving, Summer's Truffle Mac and Cheese blog inspired me to think back on any of the "expertise" I can provide for SoftLayer customers in the kitchen. One of the first things my mother taught me to cook was brisket. While it might not be as exotic as 3 Bars Barbeque, it's pretty easy to make. Everyone who tastes it sings its praises and thinks it took forever to prepare, and while it does have to cook in the oven for about four hours, there are only five ingredients, so the "preparation" time is actually only around ten minutes. Since it's not exactly a family secret, I don't think I'll get into any trouble for sharing it:

Easy-To-Make Brisket Ingredients

  • 1 Brisket - I'd recommend having the majority (not all) of the fat trimmed off at the store
  • 2 1/2 Cups of Ketchup - Buy the largest ketchup bottle and plan on using a little more than half
  • 1 1/2 Cups of Water
  • 1 Packet of Onion Soup Mix
  • 1 Can of Tomato Paste (Optional, adds flavor)

Instructions

  1. Pre-heat oven to 300 degrees
  2. Mix all of the non-brisket ingredients and pour them on top of the brisket in a large roaster (one with a lid would be preferable)
  3. Make sure the entire brisket is covered. Pick it up to get your other ingredients underneath.
  4. Pop it into the oven for four hours at 300 degrees.
  5. Take it out, let it cool, and enjoy!

That's the basic, original recipe, but I've found a few ways to make it juicier along the way. One tip is to pull the brisket from the oven after about three and a half hours and slice it against the grain. If you have an electric knife, this is the perfect chance to use it, and if you don't, this could be an excuse to get one. Put the brisket back in the roaster for another half hour, and you'll love the results. Because ovens differ, just make sure it's moist before you take it out to serve.

At this point, you're probably asking yourself what a brisket recipe has to do with SoftLayer. If you've used our Build Your Own Cloud wizard, you might already see the similarity: You can put something together that seems dauntingly time consuming quickly and without breaking a sweat ... And the end result is amazing. There are a few simple steps to making an impressive brisket, and it takes a few clicks to build a customized cloud instance with all the benefits of SoftLayer's global network and support.

Too often, selecting a cloud instance involves more limitations than it does choices, so we wanted to make sure the BYOC service gave customers the granularity to choose CPU, RAM and storage configurations on newer, more powerful servers than our competition offers. Just like my tweak of the original recipe, we want customers to have the ability to tweak their cloud platform to provide the best application performance, cost efficiency and availability for their specific needs.

If this blog left you hungry, you've got everything you need to make an amazing brisket. If you don't have the ingredients (or the four hours) you need to make one now, you can try the quicker BYOC recipe:

SoftLayer Cloud Ordering Ingredients

  • The device you're using to read this blog.
  • A list of what you want on your cloud instance.

Instructions

  1. Visit SoftLayer's Build Your Own Cloud page.
  2. Select the options you want and submit your order.
  3. Start using your custom cloud instance in less than 20 minutes!

Happy Building! :-)

-Rachel
