serverchallenge

July 6, 2009

The Cure for Irrelevance

I’ve been feeling rather irrelevant lately… yeah, yeah, I know. Watch out, because when lawyers feel irrelevant we sue!

Anyway, just thinking about the things going on in the rest of the world, like the wars in Iraq and Afghanistan. We have men and women over there who are still dying and losing their sanity on a daily basis, but for those of us who don’t have anyone close to us involved in the wars, it’s become a low hum…car bomb, soldier killed, hum, hum. But then I think of the daily terror that those who have loved ones go through – did she have to go out on patrol today? Did she get hurt? Did she get killed? Every day, every hour spent wondering if that person is safe and will come home again. That soldier is not irrelevant – he is making the greatest sacrifice so we can go on with our safe, secure lives over here.

My thoughts also turn to the people of Iran, and I find myself thinking: “If I lived in a repressive regime, would I be out on the streets in defiance of the government and particularly with the threat of being beaten, jailed or disappearing from the face of the earth?” I like to think I would, but I don’t know the answer, and that feeds my irrelevance. And come on Iran; give your people some credit. Make it at least look like the election wasn’t predetermined. You declare a landslide victory for Ahmadinejad (cut and paste, baby, cut and paste!) mere minutes after the polls close, yet the ballots are supposed to be hand counted. How can that work? I mean, wait an hour or so – pretend you counted. Please! Are the people of Iran irrelevant? No, they are making the greatest sacrifice in a battle for freedom, and in an uprising that may very well change the course of their history. The world is watching.

So how do I become relevant? (Assuming, that is, that a lawyer can ever be relevant.) How do SLayers and SLackers become relevant? We go that extra mile. We’ve been dealing with cranky clients all day – keep the smile on the face and in the voice and treat them like they’ve just put in an order for 300+ servers a month. We don’t remain satisfied with the status quo – figure out how to make our systems and services better, stronger, faster. We don’t rest on our laurels because we just had a major release, e.g., CloudLayer instances. We look ahead and figure out or invent the next new thing our customers need or want. We scan the forums to keep a pulse on our clients (and it’s usually good for an eye-roll or two). We keep Lance and Mike out of trouble.

So am I like a U.S. soldier or the Iranian people? Not so much, but I can do things to stay relevant in my own little world.

July 4, 2009

Fourth of July

Fourth of July – Independence Day is more than just a day for us across the United States to hang out with friends and family, gather around the BBQ, and watch fireworks and bombs blow up. It is a day that we celebrate our founding fathers’ courage and bravery in the pursuit of liberty and freedom.

If it weren’t for these men and their dreams, I would not be sitting here at SoftLayer writing this blog for a company that loves for us to share our words and views with others. I have been amazed over the last few weeks at how Twitter and other sites have helped the people of Iran raise their voice and let the world know what is going on over there. We would never know what is going on otherwise, as their government would not allow it to be voiced on state-run television.

So, as I am camping this Fourth of July in the San Juan Islands, fishing on the lake and watching the skies over Friday Harbor light up, I will be thankful for what our founding fathers accomplished on that day in 1776.

July 1, 2009

Pre-configuration and Upgrades

I recently bought a new computer for my wife. Being a developer, and a former hardware engineering student, I opted to buy the parts and assemble the machine ourselves. Actually assembling a computer these days doesn't take too long; it's the software that really gets you. Windows security updates, driver packs, incompatibilities, inconsistencies, broken websites, and just plain bad code plagued me for most of the night. The video card, in particular, has a “known issue” where it just “uh-oh” turns off the monitor when Windows starts. The issue was first reported in March of 2006, and has yet to be fixed.

This is why SoftLayer always tests and verifies the configurations we offer. We don't make the end user discover on their own that Debian doesn't work on Nehalems; we install it first to be sure. This is also why our order forms prevent customers from ordering pre-installed software that is incompatible with any of the rest of the order. We want to make sure that customers avoid the frustration of ordering things only to find out later that they don't work together.

The problem with desktop computers, especially for people who are particular about their configurations, is that you cannot buy a pre-configured machine where all the parts are exactly what you want. We attempted to get a computer from Dell and HP, but neither company would even display all the specifications we were interested in, never mind actually having the parts we desired. Usually pre-built systems skimp on important things like the motherboard or the power supply, giving you very little room to upgrade.

At SoftLayer, we don't cut corners on our systems, and we ensure that each customer can upgrade as high as they possibly can. Each machine type can support more RAM and hard drives than the default level, and we normally have spare machines handy at all levels so that once you outgrow the expansion capabilities of your current box, you can move to a new system type. If you're thinking of getting a dedicated server, but you're worried about the cost, visit the SoftLayer Outlet Store and start small. We have single-core Pentium Ds in the outlet store, and you can upgrade from there until you're running a 24-core Xeon system.

June 29, 2009

Leaving Normal

What is normal for a server? In support we get that question from time to time. The problem is that normal varies from server to server. A load average of 200 is probably not normal but a load of 5 to 10 very well could be normal, depending on the server's application. What to do?

Baselining to the rescue. The idea behind baselining is to get performance numbers on your application when things are "normal" so that you have solid math to indicate when things are not "normal".
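As a rough sketch of what that math might look like, assume you have already collected load-average samples while the server was behaving normally. The sample values and the three-standard-deviation tolerance below are illustrative assumptions, not recommendations for any particular server.

```python
from statistics import mean, stdev

def build_baseline(samples):
    """Summarize samples taken while the server was behaving normally."""
    return {"mean": mean(samples), "stdev": stdev(samples)}

def is_abnormal(value, baseline, tolerance=3.0):
    """Flag a reading more than `tolerance` standard deviations from the mean."""
    return abs(value - baseline["mean"]) > tolerance * baseline["stdev"]

# Hypothetical 1-minute load averages recorded during a normal period.
normal_load = [5.2, 6.1, 7.4, 5.8, 6.6, 8.0, 7.1, 6.3]
baseline = build_baseline(normal_load)

print(is_abnormal(6.9, baseline))    # a reading inside the usual range
print(is_abnormal(200.0, baseline))  # a reading far outside it
```

The same two functions work for any numeric metric, so you could keep one baseline per counter (RAM use, page generation time, and so on) and rebuild it after each system change.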

What makes a good baseline? Things like RAM use (overall, per process, rate of change), number and types of processes running, processor usage, disk usage (total, per app), disk speed and network utilization are all good OS metrics. You can also get metrics from your application. E-mails per hour, web page generation time, and number of users logged in are good to know.

You can capture OS metrics on Linux using tools like top, free, ps and iostat. If you have iostat you probably also have sar, since both ship in the sysstat package, and sar is great for performance history. Its collector runs every few minutes and records various OS counters, including processor info, RAM use, disk I/O and the like.

For the Windows people you have Task Manager and Performance Monitor. Task Manager is pretty simple and gives mostly an overview. PerfMon is really where it's at on Windows. Using PerfMon you can track dozens of performance counters on disk, processor, memory, the network and even application-specific metrics if you are running apps like MS Exchange that support them.

As with most tasks related to being the lord and master of a server, performance monitoring isn't a one-time thing. As you make changes to the system you have to run new baselines. Between changes you should run your performance routines periodically to see how things are changing. It is much easier to look into an issue if you spot it earlier rather than later.

Go forth and make sure all your baselines are belong to you!

*bonus cool points for those who knew the title of this blog was also the title of a "Roswell" episode.

June 24, 2009

Clouds and Elephants

So there I was after work today, sitting in my favorite watering hole drinking my Jagerbomb, when Caira, my bartender, asked what was on my mind. I told her that I had been working with clouds and elephants all day at work and neither of those things is little. She laughed and asked if I had stopped anywhere to get a drink prior to her bar. I replied, "No, I'm serious. I had to make some large clouds and a stampede of elephants work together." I then explained to her what Hadoop was. Hadoop is a popular open source implementation of Google's MapReduce. It allows transformation and extensive analysis of large data sets using thousands of nodes while processing petabytes of data. It is used by websites such as Yahoo!, Facebook, Google, and China's best search engine, Baidu. I explained to her what cloud computing was (multiple computing nodes working together), hence my reference to the clouds, and how Hadoop was named after the stuffed elephant that belonged to the child of one of its founders, Doug Cutting. Now she doesn't think I am as crazy.
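For the curious, here is a toy sketch of the MapReduce idea in plain Python. It does not use Hadoop or its API; the function names and the two input lines are made up for illustration, and everything runs on one machine instead of thousands of nodes.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)

# Hypothetical input; a real job would read these lines from many files.
lines = ["clouds and elephants", "elephants and more elephants"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)
```

Because each call to map_phase looks at only one line and each call to reduce_phase looks at only one key, the work can be spread across many machines, which is the part a framework like Hadoop takes care of.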

June 22, 2009

Really?

In catching up on some of my blog reading, I ran across this blog by Jill Eckhaus of AFCOM (a professional organization for data center managers). Yes, I realize that article is four months old, but like I said – I’m catching up.

One of the things that really concerns me with articles and blogs such as this one is the repetitive concern about “data security” and “loss of control” of your infrastructure. Both of those points are easy to state because they prey on the natural fear of any system administrator or data center manager.

System administrators have long ago come to realize that, in the proper environment, there is no real downside to not being able to physically place their hands upon their servers. In the proper environment the system administrator can power on or off the server, can get instant KVM access to the server, can boot the server into a rescue kernel to try to salvage a corrupt file system, can control network port speeds and connectivity, can reload the operating system, can instantly add and manage services such as load balancers and firewalls, can manage software licenses and naturally, can control full access to the server with root or administrator level privileges. In other words, there is no “loss of control” and “data security” is still up to the system administrator.

The data center managers are understandably concerned about outsourcing because it can potentially impact their jobs. But let’s face it – in today’s economy, the capital outlay required to acquire new data center space or additional data center equipment is extremely difficult to justify. In those cases sometimes the only two options are to do nothing or to outsource to an available facility. Of course, another option is to jeopardize your existing facility by trying to cram even more services into an already overloaded data center. If a data center manager is trying to build a fiefdom of facilities and personnel, outsourcing is certainly going to be a concern. One interesting aspect of outsourcing is that data center management jobs are still there; they are just at consolidated and oftentimes more efficient facilities.

In reality, “data security” and “loss of control” should be of no more or less concern if you are using your own data center versus if you are doing the proper research and selecting a viable outsourcing opportunity with a provider that can prove it has the processes, procedures and tools in place to handle the job for you.

(In the spirit of full disclosure: I am both a local and national AFCOM member and find the organization and the information they make available to be quite useful.)

-SamF

June 19, 2009

Self Signed SSL

A customer called up concerned the other day after getting a dire-looking warning in Firefox 3 regarding a self-signed SSL certificate.

"The certificate is not trusted because it is self signed."

In that case, she was connecting to her Plesk Control Panel and she wondered if it was safe. I figured the explanation might make for a worthwhile blog entry, so here goes.

When you connect to an HTTPS website, the server presents its certificate to your browser, and the two use it to set up an encrypted communication session. A certificate can be signed in one of two ways: by a certificate authority (CA) or by its own key, which is known as self-signed. Either is just as good from an encryption point of view: keys are exchanged and data gets encrypted.

So if they are equally good from an encryption point of view why would someone pay for a CA signed certificate? The answer to that comes from the second function of an SSL cert: identity.

A CA signed cert is considered superior because someone (the CA) has said "Yes, the people to whom we've sold this cert have convinced us they are who they say they are". This convincing is sometimes little more than presenting some money to the CA. What makes the browser trust a given CA? That would be its configured store of trusted root certificates. For example, in Firefox 3, if you go to Options > Advanced > Encryption and select View Certificates you can see the pre-installed trusted certificates under the Authorities tab. Provided a certificate has a chain of signatures leading back to one of these Authorities, Firefox will accept that it is legitimately signed.

To make the browser completely happy a certificate has to pass the following tests:

1) Valid signature
2) The Common Name needs to match the hostname you're trying to hit
3) The certificate has to be within its valid time period

A self-signed cert can match all of those criteria, provided you configure the browser to accept it as an Authority certificate.
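Tests 2 and 3 are simple enough to sketch in a few lines of Python. This is only an illustration, not how any browser implements it: it skips test 1 (signature validation, which a TLS library handles), compares the Common Name by exact match only, and uses a made-up certificate and hostnames. The dictionary layout follows what Python's ssl.SSLSocket.getpeercert() returns.

```python
import ssl

def common_name(cert):
    """Pull the commonName out of a getpeercert()-style subject tuple."""
    for rdn in cert["subject"]:
        for field, value in rdn:
            if field == "commonName":
                return value
    return None

def check_cert(cert, hostname, now):
    """Return a list of the checks (tests 2 and 3) that the cert fails."""
    problems = []
    if common_name(cert) != hostname:
        problems.append("hostname mismatch")
    not_before = ssl.cert_time_to_seconds(cert["notBefore"])
    not_after = ssl.cert_time_to_seconds(cert["notAfter"])
    if not (not_before <= now <= not_after):
        problems.append("outside validity period")
    return problems

# Hypothetical certificate for a hypothetical Plesk host.
cert = {
    "subject": ((("commonName", "plesk.example.com"),),),
    "notBefore": "Jun  1 00:00:00 2009 GMT",
    "notAfter": "Jun  1 00:00:00 2010 GMT",
}
now = ssl.cert_time_to_seconds("Jun 19 12:00:00 2009 GMT")

print(check_cert(cert, "plesk.example.com", now))  # name and dates both fine
print(check_cert(cert, "bank.example.com", now))   # wrong hostname
```

An empty list means the certificate passed both checks; anything else names the test that failed.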

Back to the original question... is it safe to work with a certificate which your browser has flagged as problematic? The answer is yes, if the problem is expected, such as hitting the self-signed cert on a new Plesk installation. Where you should be concerned is if a certificate that SHOULD be good, such as your bank's, is causing the browser to complain. In that case further investigation is definitely warranted. It could be just a glitch or misconfiguration. It could also be someone trying to impersonate the target site.

Until next time... go forth and encrypt everything!

June 17, 2009

Problem Solving

Quite often my friends who are not really that internet savvy ask me what I do at work. When they do, I think back to the first grade, when my teacher Mrs. Hyde told me: “Bill, you’re going to be a great problem solver when you get older; your problem solving skills are already at a fourth grade level.” Now you’re probably reading this wondering how solving problems in the first grade has anything to do with my job. It is, as she told me, all about how you think. She told me I was an outside-the-box thinker.

My co-workers and I deal with a network of 20,000+ servers and 5,500+ customers in over 110 different countries, and support over 15 different operating systems. That leads to an almost infinite combination of language, hardware, and software options. When our customers submit an issue for us to work on, it is always different from the time before – whether that is a ticket from the same customer or a ticket on a similar topic. We have a very diverse range of customers using our servers for a number of things, so not every server in here is doing the same thing. In order to be good at supporting our customers, SoftLayer’s management, in my opinion, has hired some of the best problem solvers around the world to address all of our customer issues. So that is what I am: I am a problem solver! Otherwise known as a Customer Systems Administrator. We’re required to know a broad range of technologies and have the passion to learn the new ones as they come along. I think that is why I chose to work in the field that I work in: it is always changing. I tried moving over to telecommunications engineering a few years ago, but got bored with it as it was the same issues day in and day out on the equipment. Working here at SoftLayer is wonderful as there is never a dull moment.

June 15, 2009

Help Us Help You

Working the System Admin queue in the middle of the night I see lots of different kinds of tickets. One thing that has become clear over the months is that a well-formed ticket is a happy ticket and a quickly resolved one. What makes a well-formed ticket? Mostly it is all about information, and attention to these few suggestions can do a great deal toward speeding your ticket to a conclusion.

Category
When you create a ticket you're asked to choose a category for it, such as "Portal Information Question" or "Reboots and Remote Access". Selecting the proper category helps us to triage the tickets. If you're locked out of your server, say due to a firewall configuration, you'd use "Reboots and Remote Access". We have certain guys who are better at CDNLayer tickets, for example, and they will seek out that kind, so if you have a CDN question, you'd be best served by using that category. Avoid using Sales and Accounting tickets for technical issues, as those end up first in their respective departments and not in support.

Login Information
This one is a bit controversial. I'm going to state straight out... I get that some people don't want us knowing the login information for the server. My personal server at SoftLayer doesn't have up-to-date login information in the portal. I do this knowing that this could slow things down if I ever had to have one of the guys take a look at it while I'm not at work.

If necessary, we can ask for it in the ticket, but that can cost you time that we could otherwise spend addressing your issue. If you would like us to log into your server for assistance, please provide us with valid login information in the ticket form. Providing up-to-date login credentials will greatly expedite the troubleshooting process and mitigate any potential downtime, but is not a requirement for us to help with issues you may be facing.

Server Identification
If you have multiple servers with us, please make sure to clearly identify the system involved in the issue. If we have a doubt, we're going to stop and ask you, which again can cost you time.

Problem Description
This is really the big one. When typing up the problem description in the ticket please provide as much detail as you can. Each sentence of information about the issue can cut out multiple troubleshooting steps which is going to lead to a faster resolution for you.

Example:

  • Not-so-good: I cannot access my server!
  • Good: I was making adjustments to the Windows 2008 firewall on my server and I denied my home IP of 1.2.3.4 instead of allowing it. Please fix.

The tickets describe the same symptom. I can guarantee though we're going to have the second customer back into his server quicker because we have good information about the situation and can go straight to the source of the problem.

June 10, 2009

Medieval Financial Techniques in the 21st Century?

Recently I had the chance to attend the annual Beyond Budgeting Round Table (BBRT) conference to help me keep up on my CPE credits. Those darn accounting licenses have to be maintained, ya know.

I was pleasantly surprised at the conference that SoftLayer was already doing the crux of what this group preaches – namely, that assembling an annual budget and trying to live by it is a colossal waste of time!

One speaker pointed out that budgeting originated back in medieval times long before the Industrial Revolution. During those days, the feudal system was the order of the day. Landowners allowed people to live on their land and raise crops. Once per year, when the harvest came in, the landowners received payment from the people living on the land in the form of a share of the crops or a share of the gold for which the crops were sold. Since the landowners were paid once per year, they had to plan how to make their annual payday last for a whole year. You guessed it – this plan was called “the budget.”

Unfortunately, most companies and organizations still use this horribly outdated financial management technique to run their business in today’s fast-paced information age economy. In most cases, this just flat doesn’t work.

For example, one of the speakers was the CFO of a very large healthcare organization. He said that back in the days when they produced an annual budget, there were 240 budget managers that spent 90 days of full-time effort to produce the annual budget. That equates to 60 man-labor years of total time to produce that budget. If you assume that each of those managers averages $50K per year in compensation, the cost of producing that budget is $3 million. What’s worse is that the CFO said it was worthless before the final version was printed because it was built on stale fundamental assumptions that were several months old.
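The arithmetic behind those figures can be checked in a few lines. The one assumption added here is that 90 days is treated as a quarter of a 360-day year, which is what makes the numbers come out to exactly 60 years; the variable names are just for illustration.

```python
managers = 240
days_each = 90
days_per_year = 360          # assumption: 90 days counts as one quarter of a year
average_compensation = 50_000

labor_years = managers * days_each / days_per_year
budget_cost = labor_years * average_compensation

print(labor_years)   # total labor years spent on the budget
print(budget_cost)   # cost of that time in dollars
```

With a 365-day year the total would come out slightly under 60 labor years, so the speaker's round numbers are best read as an approximation.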

Once these obsolete documents are produced, they become static financial contracts. They limit spending for each department, and this isn’t always a good thing. Some departments may see some fantastic market opportunities develop halfway through the year, but they can do nothing to take advantage of them because they would exceed their budget. On the other hand, some departments can be allotted too much money, so they go on wasteful spending sprees at year end to be sure and use up their budget or else lose that funding next year. People often ask for permission to exceed budget, but usually no one gives back any unused budget dollars. Even worse, management compensation is often tied to these obsolete financial contracts. Business schools are awash with case studies of bad business decisions that were made to maximize bonus compensation in relation to the budget.

From the beginning, SoftLayer realized the futility of producing an annual budget. In the rapidly developing business of web hosting, the landscape can dramatically change much more quickly than an annual cycle. So we implemented the policy of maintaining a rolling forecast that is updated to the best of our current knowledge each and every month. This practice has served us well, and is one of the “best practices” adopted by the BBRT.

Another best practice recommended by BBRT is to maintain multiple forecast scenarios that factor in macroeconomic possibilities. Then as reality develops, you have a better handle on the tactics to implement because you now know what most of these decisions should be in advance. At SL, we will be implementing the multiple scenario practice over this summer.
