So there I was after work today, sitting in my favorite watering hole drinking my Jagerbomb, when Caira, my bartender, asked what was on my mind. I told her that I had been working with clouds and elephants all day at work and neither of those things is little. She laughed and asked if I had stopped anywhere to get a drink prior to her bar. I replied no, I'm serious: I had to make some large clouds and a stampede of elephants work together. I then explained to her what Hadoop was. Hadoop is a popular open source implementation of Google's MapReduce. It allows transformation and extensive analysis of large data sets, using thousands of nodes to process petabytes of data. It is used by websites such as Yahoo!, Facebook, Google, and China's best search engine, Baidu. I explained to her what cloud computing was (multiple computing nodes working together), hence my reference to the clouds, and how Hadoop was named after the stuffed elephant that belonged to the child of one of the founders, Doug Cutting. Now she doesn't think I am as crazy.
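For the curious, the MapReduce idea itself fits in a few lines. What follows is only a single-machine sketch in Python, not Hadoop's actual API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and a real cluster would spread the map and reduce steps across many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce over a list of text documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Calling `word_count(["big cloud", "big elephant"])` counts "big" twice and the other words once each. The point of the split is that every `map_phase` call is independent of the others, which is what lets a framework farm them out to thousands of machines.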
Quite often my friends who are not really that internet savvy ask me what I do at work. When they do, I think back to the time in the first grade when my teacher, Mrs. Hyde, told me: “Bill, you’re going to be a great problem solver when you get older. Your problem solving skills are already at a fourth grade level.” Now you’re probably reading this wondering how solving problems in the first grade has anything to do with my job. It is, as she told me, all about how you think. She told me I was an outside-the-box thinker.
My co-workers and I deal with a network of 20,000+ servers and 5,500+ customers in over 110 different countries, and we support over 15 different operating systems. That leads to an almost infinite combination of language, hardware, and software options. When our customers submit an issue for us to work on, it is always different from the time before – whether that is a ticket from the same customer or a ticket on a similar topic. We have a very diverse range of customers using our servers for a number of things, so not every server in here is doing the same thing. In order to be good at supporting our customers, SoftLayer’s management, in my opinion, has hired some of the best problem solvers around the world to address all of our customer issues. So that is what I am: I am a problem solver! Otherwise known as a Customer Systems Administrator. We’re required to know a broad range of technologies and have the passion to learn the new ones as they come along. I think that is why I chose to work in the field that I work in: it is always changing. I tried moving over to telecommunications engineering a few years ago, but got bored with it, as it was the same issues day in and day out on the equipment. Working here at SoftLayer is wonderful, as there is never a dull moment.
Over the years I have had many motorized toys, including boats, cars, trucks, dirt bikes, quads, riding lawn mowers and others. I got my first mini bike when I was about 6 years old. That thing was powerful - it had a 4HP engine on it. One day I was riding it on our 100 acre homestead and the chain broke. Well, I just popped the kick stand up and left it there waiting for Dad to get home. Upon my father arriving I let him know the chain broke, and he explained to me the proper maintenance one must do in order to keep a chain working: proper oiling techniques, making sure it has the right tension, and more. A few years later I got my first two stroke dirt bike. I loved that thing! I rode it all weekend long, and then I mixed the gas too lean and blew the top head off it. That’s when I learned how to maintain a 2 cycle engine. My uncle helped me rebuild the bike engine (or shall I say I handed him the tools, and he rebuilt it). All motorized engines need proper care and maintenance. I now take my car for an oil change every 4,000 miles (even though they say it can go 5,000) and get everything checked out.
The same thing can be said for internet servers. Quite often I talk to people and they think they can just install their operating system, upload the applications they want to run and/or data they want to serve, and walk away from that machine for the next 12 months. That is wrong! Computer software is always updating and you need to stay on top of updating your software. Security threats are found hourly, and viruses are written daily to exploit the weaknesses found yesterday. Proper maintenance is the only way to make sure your data is safe and secure. That is why SoftLayer has partnered with companies that offer extended server management. We call them SoftLayer Certified Management Companies. You can find them in our forums. These companies, like rackaid.com, seeksadmin.com, Bitpusher, and many more, have all been certified by SoftLayer to know our infrastructure and work closely with us and many of our clients. They provide the same great level of customer service that is standard at SoftLayer and do a lot of the advanced administration tasks for our customers. We have teamed up with these managed services partners in order to provide our customers with the proper maintenance of their infrastructure. So if you haven’t done a security audit on one of your machines in a few months, I would suggest taking it to the service center and contacting one of these companies, so you can ensure your machine is safe and secure!
From the beginning of my coming of age in the IT industry, it’s been one thing – Windows. As a system administrator in a highly mobile Windows environment, you learn a thing or two to make things tick, and to make them keep ticking. I had become quite proficient with the Active Directory environment, and was able to keep a domain going. While Windows is a useful enterprise-grade server solution, it’s certainly not the only solution. Unfortunately, when I made my departure from that particular environment, I hadn’t had much exposure to the plethora of options available to an administrator.
Then along comes SoftLayer and opens my eyes to an array of new (well, at least to me) operating systems. Now I had begun my ‘new’ IT life, with exposure to the latest and greatest, including Windows, as well as virtualization software such as Xen and Virtuozzo, and great open source operating systems such as CentOS and FreeBSD. With the new exposure to all these high-speed technologies, I felt that maybe it was time for me to let the de-facto home operating system take a break, and kick the tires on a new installation.
I can say that while switching to open source was a bit nerve racking, it ended up being quick and painless, and I’m not looking back. I’ve lost a few hours of sleep here and there trying to dive in and learn a thing or two about the new operating system, as well as making some tweaks to get it just like I like it. The process was certainly a learning experience, and I’ve become much more familiar with an operating system that, at first, can seem rather intimidating. I went through a few different distributions till I settled on one that’s perfect for what I do (like reading the InnerLayer, and finishing the multitude of college papers).
The only problem with always reloading a PC is you have to sit there and watch it. It doesn’t hurt to have a TV and an MP3 player sitting around while you configure everything and get the reload going, but you still have to be around to make sure everything goes as planned. Imagine this… You click a button, and check back in a few. Sound familiar? Yep, it would have been nice to have an automated reload system much like we have here at SoftLayer. Not to mention, if something goes awry, there’s the assurance that someone will be there to investigate and correct the issue. That way, I can open a cold one and watch the game, or attend to other matters more important than telling my computer my time zone.
When you think about all the things that have to go right all the time (where "all the time" is millions of times per second) for a user to get your content, it can be a little... daunting. The software, the network, and the hardware all have to work for this bit of magic we call the Internet to actually occur.
There are points of failure all over the place. Take a server for example: hard drives can fail, power supplies can fail, the OS could fail. The people running servers can fail, too: maybe you try something new and it has unforeseen consequences. This is simply the way of things.
Mitigation comes in many forms. If your content is mostly images you could use something like a content delivery network to move your content into the "cloud" so that failure in one area might not take out everything. On the server itself you can do things like redundant power supplies and RAID arrays. Proper testing and staging of changes can help minimize the occurrence of software bugs and configuration errors impacting your production setup.
Even if nothing fails there will come a time when you have to shut down a service or reboot an entire server. Patches can't always update files that are in use, for example. One way to work around this problem is to have multiple servers working together in a server cluster. Clustering can be done in various ways, using Unix machines, Windows machines and even a combination of operating systems.
Since I've recently set up a Windows 2008 cluster, that is what we're going to discuss. First we need to discuss some terms. A node is a member of a cluster. Nodes are used to host resources, which are things that a cluster provides. When a node in a cluster fails, another node takes over the job of offering that resource to the network. This can be done because resources (files, IPs, etc.) are stored on the network using shared storage, which is typically a set of SAN drives to which multiple machines can connect.
Windows clusters come in a couple of conceptual forms. Active/Passive clusters have the resources hosted on one node and have another node just sitting idle waiting for the first to fail. Active/Active clusters, on the other hand, host some resources on each node. This puts each node to work. The key with clusters is that you need to size the nodes such that your workloads can still function even if there is a node failure.
Ok, so you have multiple machines, a SAN between them, some IPs and something you wish to serve up in a highly available manner. How does this work? Once you create the cluster you then go about defining resources. In the case of the cluster I set up, my resource was a file share. I wanted these files to be available on the network even if I had to reboot one of the servers. The resource was actually a combination of an IP address that could be answered by either machine and the iSCSI drive mount which contained the actual files.
Once the resource was established it was hosted on NodeA. When I rebooted NodeA, though, the resource was automatically failed over to NodeB, so that the total interruption in service was only a couple of seconds. NodeB took possession of the IP address and the iSCSI mount automatically once it determined that NodeA had gone away.
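The sizing rule can be sanity-checked with a little arithmetic. Here is a rough Python sketch; the function name `survives_node_failure` and the idea of a single "capacity unit" per node are my own simplifications, not anything Windows clustering exposes. It only compares totals and ignores whether each individual workload fits on one node.

```python
def survives_node_failure(node_capacities, workloads, failures=1):
    """Return True if the surviving nodes can carry the total workload.

    Worst case is assumed: the `failures` largest nodes are the ones lost.
    Capacities and workloads are in the same arbitrary unit (an assumption).
    """
    if failures >= len(node_capacities):
        return False  # nothing left to fail over to
    # Sort ascending and drop the largest nodes from the end.
    remaining = sorted(node_capacities)[:len(node_capacities) - failures]
    return sum(remaining) >= sum(workloads)
```

For two nodes of 100 units each, workloads of 40 and 50 units still fit after one node dies, while workloads of 60 and 60 do not. That second case is the Active/Active trap: both nodes look comfortable until one of them goes away.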
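As a toy model of that hand-off, here is a short Python sketch. It is purely illustrative: the `Cluster` class and its methods are hypothetical names, and real cluster software detects failure with heartbeats and quorum rather than being told a node is down.

```python
class Cluster:
    """Toy model: tracks which node currently owns each resource."""

    def __init__(self, nodes):
        self.alive = {name: True for name in nodes}  # node name -> is it up?
        self.owner = {}  # resource name -> node currently hosting it

    def add_resource(self, resource, node):
        """Host a resource (e.g. an IP plus a drive mount) on a node."""
        self.owner[resource] = node

    def node_down(self, node):
        """Mark a node as failed and move its resources to a survivor."""
        self.alive[node] = False
        survivors = [name for name, up in self.alive.items() if up]
        if not survivors:
            raise RuntimeError("no surviving node to take over")
        for resource, host in self.owner.items():
            if host == node:
                # Only values change here, so iterating is safe.
                self.owner[resource] = survivors[0]
```

With nodes `"NodeA"` and `"NodeB"` and a `"fileshare"` resource hosted on NodeA, calling `node_down("NodeA")` leaves the share owned by NodeB, which mirrors what happened during my reboot.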
File serving is a really basic example, but you can do clustering with much more complicated things like the Microsoft Exchange e-mail server, Internet Information Server, Virtual Machines and even network services like DHCP/DNS/WINS.
Clusters are not the end of service failures. The shared storage can fail, the network can fail, the software configuration or the humans could fail. With a proper technical staff implementing and maintaining them, however, clusters can be a useful tool in the quest for high availability.
You'll see the word "cloud" bouncing around quite a bit in IT nowadays. If you have been following The Inner Layer you'll have seen it a few times here as well. A cloud service is just something that is hosted on the Internet. Typically in a cloud scenario you are not actually doing the hosting but rather using hosted resources someone else is providing. Usually you'll hear it in terms of computing and storage.
This is going to be a brief article on a cloud storage product we are doing here at SoftLayer called CloudLayer™ Storage.
CloudLayer™ Storage is a WebDAV based system which uses client software on the local machine in order to redirect filesystem operations to the storage repository here at SoftLayer. In Windows you end up with a drive letter; on a Unix system you end up with a mount point. In both cases when you create folders and save files on those locations the actions actually happen on our storage repository here. Because the files are located with us you are able to access them wherever you are. Through the CloudLayer™ web UI you're able to also set up files for sharing with others so even if where you are never changes there is still value to using the cloud storage.
Even using a cloud storage system you must maintain proper backups. Hardware, software and human errors all happen. Tied in with that concept of "errors happen" ... if you have files on CloudLayer™ that you need for a presentation, download them ahead of time. You don't want to be caught without files simply because the Internet connection at your hotel decided to take a nap the morning of your event.
Now what of security? Well, the connection to the CloudLayer™ endpoint here at SoftLayer is done via an encrypted session so people cannot snoop on your transmissions. This does mean you need to allow 443/tcp outbound communications via your firewall but since that is the normal HTTPS port I'd imagine you already have it open. Within CloudLayer™ you can control with whom you share your files.
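If you are not sure whether your firewall lets 443/tcp out, a quick check looks something like the Python sketch below. The helper name `can_reach` is mine, and the hostname in the usage note is a placeholder, not the real CloudLayer™ endpoint. A successful TCP connection only shows the port is open; it says nothing about the encrypted session on top of it.

```python
import socket


def can_reach(host, port=443, timeout=3.0):
    """Return True if an outbound TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the name and opens the socket;
        # the with-block closes it again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False
```

You would call it as `can_reach("storage.example.com")`, swapping in whatever endpoint you were actually given.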
Since CloudLayer™ is a filesystem redirected over the Internet, the performance you get will be dependent on your local connection speed. It's best to treat CloudLayer™ Storage as simply a file repository. If you find you need some kind of off-machine storage for running applications on your server you could look into our iSCSI product.
So, dear readers, go forth and have a cloudy day.
What if I asked you to guess the name of a video game that came out within the last 10 years, and has sold more copies than the Halo series, the Half-Life series AND the Metal Gear series? No, it’s not Guitar Hero or Rock Band, and it’s not Pokemon. It’s not even made by one of the “serious” game development companies. The game that I’m talking about is Bejeweled (published by PopCap), a simple online flash game that has garnered 25 million purchases and more than 350 million free downloads.
The secret to PopCap’s success lies in creating simple, easy to use games that the average person finds fun. They’ve built an entire market segment from the simple beginnings of Bejeweled, and now offer more than 50 games for sale, and even more in their free download section with almost a billion downloads between them. The “casual gaming” market is so large that the Nintendo Wii has almost been completely taken over by casual games.
But why has the industry taken off so much? Sure, casual games can be easy to make. I remember whipping up a version of Bejeweled in a VBA form that I built as an Excel macro so I could play it in my “business software” class in high school. The real secret is that these games are easy to pick up and play, and in that sense they’re far better than their competition for people who are busy, inexperienced, or just plain tired.
People these days have less and less free time, which means they have less time to learn the function of the right trigger in crouch mode, run mode, driving mode, flying mode, stealth mode, raspberry jam mode, etc. The instructions for Bejeweled (“Swap adjacent gems to align sets of 3 or more”) are almost as simple as the original Pong’s instructions (“Avoid missing ball for high score”).
That’s what we try to do here at SoftLayer. Our portal is specifically designed to be used by people who just don’t have the time or inclination to perform menial repetitive tasks manually. From configuring a load balancer to rebooting your servers to performing notoriously difficult SWIP requests, the portal handles it all for you. Of course, the task we’re trying to help you accomplish is a lot more complex than “avoid missing ball for high score,” but we try our best to make the process as easy as possible. Maybe with the time saved you can come up with a new business segment to send more server orders our way, but I’m betting you’ll be playing Bejeweled, or Peggle, or Zuma…
On May 14th my buddy Shawn wrote On Site Development. Aside from the ambiguous title (I originally thought it was an article on web site development, rather than the more appropriate on-site development), there were a number of things that I felt could be expanded upon. I started by simply commenting on his post, but the comment hit half a page and I had to admit to myself that I was, in fact, writing an entire new post.
Updating the computer systems in these restaurants is a question of scale. Sure, it seems cheap to update the software on the 6 computers in a local fast food restaurant. However, a certain “largest fast-food chain in the world” has 31,000+ locations (according to Wikipedia). Now I know how much I would charge to update greasy fast-food computers, and if you multiply that by 31,000, you get a whole lot of dollars. It just doesn’t scale well enough to make it worthwhile. The bottom line is, the companies do cost-benefit analysis on all projects, and the cost of re-doing the messed up orders is apparently less than the cost of patching the software on a quarter million little cash registers and kitchen computers.
It's the same logic that led to Coke being sold for 5 cents for more than 60 years, spanning two world wars and the Great Depression without fluctuating in price. The vast majority of Coca-Cola during that time period was sold from vending machines. These vending machines only accepted nickels, and once a nickel was inserted, a Coke came out. That’s it. Nothing digital, no multi-coin receptacles, just insert nickel…receive Coke. The cost of replacing 100,000 vending machines was far higher than the profits they would get by increasing the price of Coke slightly. Only after World War II, when industrialization and the suburb were really taking off, did Coca-Cola start to phase out their existing vending machine line and replace it with machines capable of charging more than 5 cents per bottle.
Of course, we all know how Coke machines operate now. Computerized bill changers, many of them hooked up to the internet, allow Coke to charge upwards of $3 for a 20oz beverage on a hot day at a theme park. Coke even attempted (in 1999) to fluctuate the price of Coke based on local weather conditions. People would want a Coke more on a hot summer day, so why not charge more for it? (Because the public backlash was severe to the point where boycotts were suggested the very same day Coke announced their new plan, but that’s another story.)
The fast food problem Shawn mentioned, as well as the vending machine problem, is why so many companies are moving onto the web. Online retail is exploding at a rate that can be described as a “barely controlled Bubble.” To tie back in with my comments on the fast food restaurant, this means that all your customers see the exact same website, written by the exact same piece of code. Want to change the way orders are displayed? Well simply alter the order display page, and every customer in every country from now on will see that new display format.
This doesn’t just apply to retail, however. Many companies are moving towards web-based internal pages. When I got my mortgage, the loan officer entered all my information into a web form on their intranet. This is brilliant, because it takes away all the cost of synchronizing the employee computers with the software, it removes the time needed for upgrades, and (most importantly) it means developers don’t have to come into the office at 4am to ensure that upgrades go smoothly before the start of the business day. So any of you business owners out there that have had to deal with the nightmare of upgrading antiquated POS software on dozens, hundreds, or hundreds of thousands of computers, consider making everything a web site.
SoftLayer has geographically diverse data centers, so your stores can always log in to a nearby server to cut down on latency, and we allow for VPN access, distributed databases, and real-time backups, making a web-based solution preferable to even the hard coded local systems that many stores use now.
Around the office I am commonly considered a "low-level" software engineer. If you are in the business of computer programming you know this means I generally have various pieces of computer hardware strewn about my work area, and an ASCII chart hanging on my wall complete with a cheat-table so I can quickly convert numbers between binary, decimal, and hex. If you are not in the business of developing software, think of me as a guy who couldn’t decide if he wanted to be an electrical engineer or a computer programmer and thus, through his own indecision, eventually found himself stuck somewhere in between. I know a bit about both but am not an expert in either. (I think the Roman word for this sort of limbo is purgatory, but I find it pretty cozy most days.)
At any rate when a project comes along that walks the fence between the realms of hardware and software my name naturally comes up. Such was the case a few weeks ago when one of our systems administrators had the need to retrieve the serial number from the RAM chips already installed in a number of servers. He asked me if it could be done. I looked and saw the information was reported in the BIOS of one of my machines, so I promptly responded with a “you bet”. After all, if the BIOS can display the information on the screen I should be able to as well. Right? I told him it would take a week.
The problem in this career field, which I have worked in for some ten years now, is that you don’t know what you don’t know. Fast forward two weeks. Now think the Friday before Easter. That’s right, the one I am supposed to be off lounging around the house in my pajamas. It took a little longer to pull that serial number than I expected. If you’re interested, the slowdown turned out to be that the information existed at a physical memory address that was not easily accessible from Microsoft Windows (luckily for the BIOS, it gets to display the data before an operating system is loaded).
Remember the old Chevy Chase movie "Funny Farm"? Chevy’s character is driving around lost when he passes the old man sitting on his porch in a rocking chair. Chevy stops his vehicle, rolls down his window, and says: “Excuse me Sir. Can you tell me how you would get to Redbud?” The old man leans forward, spits, and replies: “If I were going to Redbud I sure as hell wouldn’t start from here.”
Like Mr. Chase’s character in the movie, I didn’t get to pick where I started the journey from. We need the data available to us after the operating system boots. So I am hacking my way through it. I’m nearly there now. Close enough at least that I felt comfortable taking a break from the code and blowing off some steam by writing this blog. And the truth is, while I might have been whining just a bit I actually have enjoyed this project immensely. I appreciate the fact that the management here at SoftLayer gives us the opportunity to challenge ourselves and then grow to meet those challenges. We are encouraged to “get our hands dirty”. When I finish up here I will have a deeper understanding of how the BIOS relates to the operating system (and through the BIOS indirectly to the hardware).
As for our customers, well, it just so happens once I got to digging around in the binary mud there was a whole lot of other useful insight buried in the swirls of all those zeros and ones. Instead of extracting just the serial numbers I am pulling about a dozen pages of hardware data points we can use in statistical analysis for predicting failures, standards compliance, and availability trends. Like I said, you don’t know what you don’t know. But sometimes you are pleasantly surprised once you find out. By promoting such an amiable work environment, fostering creativity, and encouraging innovation, SoftLayer continues to boldly go where no other hosting company has gone before.
Alright, time to climb down from the pulpit and finish up my software.
Thanks for listening!
It’s a fact -- all software ends up relying on a piece of hardware at some point. And hardware can fail. But the secret is to create redundancy to minimize the impact if hardware does fail.
RAID arrays, load balancers, redundant power supplies, cloud computing - the list goes on. And we support them all. Many of these options are not mandatory, but I wish they were! That’s where the customer comes in – it is critical to understand the value of the application and data sitting on the hardware and set a redundancy and recovery plan that fits.
Keep your DATA safe:
- RAID - For starters *everyone* should have a RAID 1, 5, or 10. This keeps your server online in the event of a drive failure.
The best approach – RAID 10 all the way. You get the benefits of a RAID 0 (striping across 2 drives so you get the data almost twice as fast) and the security of RAID 1 (mirroring data on 2 separate drives) all rolled into one. I think every server should have this as a default.
- Separate Backups – EVault Backup, iSCSI Storage, FTP/NAS Storage, your own NAS server or just a different server. Lose data just once (or have the ability to recover it painlessly) and these will pay for themselves. Remember, hardware is not the only way in which you can lose data – hackers, software failures, and human error will always be a risk.
StorageLayer. Use it or lose it.
- Redundant servers in different locations – spread your servers out across different datacenters and use a load balancer. Nothing is safer than a duplicate server thousands of miles away. That’s why we have invested in a second data center – to keep your data and business safe.
Check 'em out in our Services > Network Services section.
- Solid state drives – aww yeah baby. They are coming.
Solid state drives are just that – a drive with no moving parts. No more platters or read/write heads. I mean come on, hard drives are essentially using the same basics that old record players use. CDs use this technology too. And you see where those went (can you say iPod? I prefer my iPod touch. I have never had an iPod until now so I skipped right to the new fancy pants model. Can you tell I just got it?).
Check out these comparison tests of solid state drives vs. conventional ones:
- Faster, faster, faster! – Processors, memory, drives, network – everything is getting much faster. And that is in part thanks to redundancy (dual and quad core processors, dual and quad processor motherboards). See? Redundancy is the way of the future!
We have 4 Intel Xeon quad-core Tigertown processors on one motherboard. That’s 16 cores on one server! Shazam!
- Robot DC patrol sharks – yep. Got the plans on my desk right now. But I can’t take all the credit. Josh R. suggested this one; I just make things happen.
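Back to the RAID bullet for a second: the trade-off between the levels is easy to put into numbers. This is a back-of-the-envelope Python sketch using the textbook definitions, with a helper name (`raid_summary`) I made up. It assumes identical drives and reports only the number of drive failures an array is guaranteed to survive (RAID 10 can survive more if the failures land in different mirror pairs).

```python
def raid_summary(level, drives, drive_size_gb):
    """Return (usable_gb, guaranteed_failures_survived) for a RAID level."""
    if level == 0:
        if drives < 2:
            raise ValueError("RAID 0 needs at least 2 drives")
        return drives * drive_size_gb, 0  # striping only, no redundancy
    if level == 1:
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return drive_size_gb, drives - 1  # every drive holds a full copy
    if level == 5:
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * drive_size_gb, 1  # one drive's worth of parity
    if level == 10:
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, 4 or more")
        return (drives // 2) * drive_size_gb, 1  # striped mirror pairs
    raise ValueError("unsupported RAID level")
```

With four 500 GB drives, RAID 10 gives 1,000 GB usable and RAID 0 gives 2,000 GB, but only the RAID 10 array keeps running after a drive dies.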
I work to keep all of our hardware running in tip top condition. But I look at the bigger picture when it comes to hardware – how to completely eliminate the impact of any hardware issue. That’s why I suggest all the redundancies listed above. While I can reduce the probability of hardware issues with testing, monitoring of firmware updates, proper handling procedures, choosing quality components, etc., redundancy is the ultimate solution to invisible hardware.
Hardwhere?, if you will.