So there I was after work today, sitting in my favorite watering hole drinking my Jagerbomb, when Caira, my bartender asked what was on my mind. I told her that I had been working with clouds and elephants all day at work and neither of those things are little. She laughed and asked if I had stopped anywhere to get a drink prior to her bar. I replied no, I'm serious I had to make some large clouds and a stampede of elephants work together. I then explained to her what Hadoop was. Hadoop is a popular open source implementation of Google's MapReduce. It allows transformation and extensive analysis of large data sets using thousands of nodes while processing peta-bytes of data. It is used by websites such as Yahoo!, Facebook, Google, and China's best search engine Baidu. I explained to her what cloud computing was (multiple computing nodes working together) hence my reference to the clouds, and how Hadoop was named after the stuffed elephant that belonged to one of the founders - Doug Cutting - child. Now she doesn't think I am as crazy.
Posts Tagged 'Data'
When it comes to managing a server remember you can never be to careful. In this day and age we face a lot of things that can damage and even take a server to its knees here’s a few things for everyone to consider.
This is a must on systems open to the net now days. There are always nasty little things floating around looking to take your server apart from the OS out. For windows servers there are a multitude of choices and I’ll just mention a few that can help protect your goods. You can use several programs such as avast (which offers a free edition), ClamWin (open source), Kaspersky , and Panda just to name a few. I would suggest before installing any of these you check links such as http://en.wikipedia.org/wiki/List_of_antivirus_software to name one that provides a list of several choices and their compatibility. You may also want to read reviews that compare the available options and give you an idea of what to expect from them. This will allow you to make an informed choice on which one works best for you. Now with linux there are also several options for this including the well known clamav which from personal experience works really well and can be installed on a variety of linux disro’s(aka distributions). It’s very simple to use and may prevent you from headache later on down the road.
Firewalls are a double edged sword but are most defiantly needed. When it comes to firewalls you can protect yourself from quite a bit of headache however if setup to strict you can block positive traffic and even yourself from reaching your server but in the long run a defiant way to help protect your server from unwanted visitors. A lot of firewalls also have modules and add-ons that further assist in protecting you and securing your server. If in doubt it’s always a good idea to have a security company do an audit and even a security hardening session with your server to make sure you are protected the best way possible.
This is probably one of the most important this you can do to secure your server. Use strong passwords (no using password or jello is not a secure password even if it is in all caps) and if you are worried about not being able to come up with a secure one there are several password generators on the web that can come up with secure ones to assist. Passwords should contain caps letters, numbers, symbols, and should be at minimum 8 – 10 characters (the more the better). It’s the easy to remember and easy read passwords that get you into trouble.
Armed with this information and so much more on security that can be located on the web using the great and all powerful Google should be a good start to making sure you don’t have to worry about data loss and system hacks. Also remember no matter how secure you think you are make regular backups of all your important data as if you server could crash at any time.
My grandmother used to say an ounce of prevention is worth a pound of cure. Usually this was her polite way of telling me to pick my skateboard up off the stairs before she stepped on it and broke her neck or to put a sheet of newspaper over her antique kitchen table before I began refueling my model airplane. All very sound advice looking back. And now here I find myself repeating the same adage some twenty years later in the context of predicting mechanical drive failure. An ounce of prevention is worth a pound of cure.
Hard disk drive manufacturers recognized both the reality and the advantages of being able to predict normal hard disk failures associated with drive degradation sometime around 2003. This led a number of leading hard disk makers to collaborate on a standard which eventually became known as SMART. This acronym stands for Self-Monitoring, Analysis and Reporting Technology and when used properly is a formidable weapon in any system administrator's arsenal.
The basic concept is that firmware on the hard disk itself will record and report key "attributes" of that drive which when monitored and analyzed over time can be used to predict and avoid catastrophic hard disk failures. Anyone who has been around computers for more than a day knows the terrible feeling that manifests in the pit of your stomach when it becomes apparent that your server or workstation will not boot because the hard disk has cratered. Luckily, we ALL of course back up our hard drives daily! Right?
All kidding aside even with a recent back up just the task of restoring and getting your system back in working order is a serious hassle and it’s not something you get the luxury of scheduling if the machine is critical to operations and failed in the middle of your work day or worse yet, the middle of your beauty sleep. That is where SMART comes in. When properly used SMART data can give “clues” that a drive is reaching a failure point--prior to it failing. This in turns means you can schedule a drive cloning and replacement within your next regular maintenance window. Really aside from a hard disk that lasts forever what more could an administrator ask for?
SMART drive data has been described as a jigsaw puzzle. That's because it takes monitoring a myriad of data points consistently over time to be able to put together a picture of your hard disk health. The idea is that an administrator regularly records and analyzes characteristics about the installed spinning media and looks for early warning signs that something is going wrong. While different drives have different data points, some of the key and most common attributes are:
- head flying height
- data throughput performance
- spin-up time
- re-allocated sector count
- seek error rate
- seek time performance
- spin try recount
- drive calibration retry count
These items are considered typical drive health indicators and should be base-lined at drive installation and then monitored for significant degradation. While the experts still disagree on the exact value of SMART data analysis, I have seen sources that claim at least 30% of drive failures can be detected some 60 days prior to the actual failure through the monitoring of SMART data.
Of course not all drive failures can be predicted. Plus some failures are caused by factors other than drive degradation. Consider drives damaged by power surges or drives that are dropped in shipping as good examples of drive failures that cannot normally be detected through SMART monitoring. However in my humble opinion even one hard disk failure prevented over the course of my career is something to celebrate--unless you happen to own stock in McNeil Consumer Healthcare, a.k.a. the distributors of Tylenol!
So what does this have to do with SoftLayer? Well I am certainly not claiming that SoftLayer is going to predict all your hard drive disasters so there is no reason for you to back up your data. In fact, I recommend not just backing it up but backing it up in geographically disparate locations (did I mention we have data centers in Dallas and Seattle?). What I do mean to share is that technologies like SMART data are just one of the many ways SoftLayer is currently investigating to improve what is already the best hosting company in the business.
I should know. I was tasked with writing the low-level software to extract this data. That’s right. SoftLayer has engineers working at the application layer, down at the device driver layer, and everywhere in between. If that doesn’t give you a warm fuzzy about your hosting company, I don’t know what will.