Posts Tagged 'Memory'

October 4, 2011

An Introduction to Redis

I recently had the opportunity to get re-acquainted with Redis while evaluating solutions for a project on the Product Innovation team here at SoftLayer. I'd actually played with it a couple of times before, but this time it "clicked." Or my brain broke. Either way, I see a lot of potential for Redis now.

No one product is a perfect fit for all of your data storage needs, of course. There are such fundamental tradeoffs to be made in designing storage architectures that you should be immediately suspicious of any product that claims to fit every need.

The best solutions tend to be products that actually embrace these tradeoffs. Redis, for instance, has sacrificed a small amount of data durability in exchange for being awesome.

What is it?

Redis is a key/value store, but describing it that way is sort of like calling a helicopter a "vehicle." It's a technically correct description, but it leaves out some important stuff.

You can think of it like a sophisticated older brother of Memcached. It presents a flat keyspace, and you can set those keys to string values. Another feature of Memcached is the ability to perform remote atomic operations, like "incr" and "append." These are really handy, because you have the ability to modify remote data without fetching, and you have an assurance that you're the only one performing that operation at that instant.

Redis takes this concept of remote commands on data and goes completely nuts with it. The database is aware of data structures like hashes, lists and sets in addition to simple string values. You can sort, union, intersect, slice and dice to your heart's content without fetching any data. Redis is a data structure server. You can treat it like remote memory, and this has an awesome immediate benefit for a programmer: your code and brain are already optimized for these data types.

But it's not just about making storage simpler. It's fast, too. Crazy fast. If you make intelligent use of its data structures, it's possible to serve a lot of traffic from relatively modest hardware. Redis 2.4 can easily handle ~50k list appends a second on my notebook. With batching, it can append 2 million items to a list on a remote host in about 1.28 seconds.

It allows the remote, atomic and performant manipulation of data structures. It took me a little while to realize exactly how useful that is.

What's wrong with it?

Nothing. Move along.

OK, it's a little short on durability. Redis uses memory as its primary store and periodically flushes to disk. A common configuration is to do so every second.

That sounds pretty reasonable. If a server goes down, you could lose a second of data. Keep in mind, however, how many operations Redis can perform in a second. If you're in a high-volume environment, that could be a lot of data. It's not for your financial transactions.

It also supports relatively limited availability options. Currently, it only supports master/slave replication. Clustering support is planned for an upcoming release. It's looking pretty powerful, but it will take some real-world testing to know its performance impact.

These challenges should be taken into consideration, and it's probably clear if you're in a situation where the current tradeoffs aren't a good fit.

In my experience, a lot of developers seriously overestimate the consequences of their application losing small amounts of data. Also consider whether or not the chance of losing a second (or less) of data genuinely represents a bigger threat to your application than any other compromises you might have made.

More Information
You can check out the slightly aging docs or browse the impressively simple source. There are probably already bindings for your language of choice as well.


October 24, 2008

Pushing the Microsoft Kool-Aid

Recently on one of our technical forums I contributed to a discussion about the Windows operating system. One of our director’s saw the post and thought it might be of interest to readers of the InnerLayer as well. The post focused on the pros and cons of Windows 2008 from the viewpoint of a systems / driver engineer (aka me). If you have no technical background, or interest in Microsoft operating system offerings, what follows probably will not be of interest to you—just the same, here is my two cents.

Microsoft is no different than any other developer when it comes to writing software--they get better with each iteration. There is not a person out there who would argue that the world of home computers would have been better off if none of us ever progressed beyond MS-DOS 1.0. Not to say there is anything wrong with MS-DOS. I love it. And still use it occasionally doing embedded work. But my point is that while there have certainly been some false starts along the way (can you say BOB), Microsoft's operating systems generally get better with each release.

So why not go out and update everything the day the latest and greatest OS hits the shelves? Because as most of you know, there are bugs that have to get worked out. To add to that, the more complex the OS gets, the more bugs there are and the more time it takes to shake those bugs out. Windows Server 2008 is no different. In my experience there are still a number of troublesome issues with W2K8 that need to be addressed. Just to name a few:

  • UAC (user access control) - these are the security features that give us so much headache. I'm not saying we don't need the added security. I'm just saying this is a new arena for MS and they still have a lot to learn. After clicking YES, I REALLY REALLY REALLY WANT TO INSTALL SAID APPLICATION for the 40th time in a day, most administrators will opt to disable UAC, thereby thwarting the added security benefits entirely. If I were running this team at MS I'd require all my developers to take a good hard look at LINUX.
  • UMD (user mode drivers) - the idea of running a device driver, or a portion of a device driver, in the restricted and therefore safe user memory of the kernel is a great idea in terms of improving OS reliability. I've seen numbers suggesting that as many as 90% of hard OS failures are caused by faulty third-party drivers mucking around in kernel mode. However implementing user mode drivers adds some new complexities if hardware manufacturers don't want to take a performance hit and from my experience not all hardware vendors are up to speed yet.
  • Driver Verification - this to me is the most troublesome and annoying issue right now with the 64-bit only version of W2K8. Only kernel mode software that has been certified in the MS lab is allowed to execute on a production boot of the OS. Period. Since I am writing this on the SoftLayer blog, I am assuming most of you are not selecting hardware and drivers to run on your boxes. We are handling that for you. But let me tell you it’s a pain in the butt to only run third party drivers that have been through the MS quality lab. Besides not being able to run drivers we have developed in house it is impossible for us to apply a patch from even the largest of hardware vendors without waiting on that patch to get submitted to MS and then cleared for the OS. A good example was a problem we ran into with an Intel Enet driver. Here at SoftLayer we found a bug in the driver and after a lot of back and forth with Intel's Engineers we had a fix in hand. But that fix could not be applied to the W2K8 64-bit boxes until weeks later when the fix finally made it from Intel to MS and back to Intel and us again. Very frustrating.

Okay, so now that you see some of the reasons NOT to use MS Windows Server 2008 what are some of the reasons it’s at least worth taking a look at? Well here are just a few that I know of from some of the work I have done keeping up to speed with the latest driver model.

  • Improved Memory Management – W2K8 issues fewer and larger disk I/O's than its 2003 predecessor. This applies to standard disk fetching, but also paging and even read-aheads. On Windows 2003 it is not uncommon for disk writes to happen in blocks
  • Improved Data Reliability - Everyone knows how painful disk corruption can be. And everyone knows taking a server offline on a regular basis to run chkdsk and repair disk corruption is slow. One of the ideal improvements in terms of administering a websever is that W2K8 employs a technology called NTFS self-healing. This new feature built into the file system detects disk corruption on the fly and quarantines that sector, allowing system worker-threads to execute chkdsk like repairs on the corrupted area without taking the rest of the volume offline.
  • Scalability - The W2K8 kernel introduces a number of streamlining factors that greatly enhance system wide performance. A minor but significant change to the operating system's low level timer code, combined with new I/O completion handling, and more efficient thread pool, offer marked improvement on load-heavy server applications. I have read documentation supporting claims that the minimization in CPU synchronization alone results directly in a 30% gain on the number of concurrent Windows 2008 users over 2003. That's not to say once you throw in all the added security and take the user mode driver hit you won't be looking at 2003 speeds. I'm just pointing out hard kernel-level improvements that can be directly quantified by multiplying your resources against the number of saved CPU cycles.

Alright, no need to beat a dead horse. My hope was if nothing else to muddy the waters a bit. The majority of posts I read on our internal forums seemed to recommend avoiding W2K8 like the plague. I'm only suggesting while it is certainly not perfect, there are some benefits to at least taking it for a test drive. Besides, with SoftLayer's handy dandy portal driven OS deployment, in the amount of time it took you to read all my rambling you might have already installed Windows Server 2008 and tried it out for yourself. Okay, maybe that's a bit of an exaggeration. But get the idea!


February 11, 2008

Spares at the Ready

In Steve's last post he talked about the logic of outsourcing. The rationale included the cost of redundant internet connections, the cost of the server, UPS, small AC, etc. He covers a lot of good reasons to get the server out of the broom closet and into a real datacenter. However, I would like to add one more often over looked component to that argument: the Spares Kit.

Let's say that you do purchase your own server and you set it up in the broom closet (or a real datacenter for that matter) and you get the necessary power, cooling and internet connectivity for it. What about spare parts?

If you lose a hard drive on that server, do you have a spare one available for replacement? Maybe so - that's a common part with mechanical features that is liable to fail - so you might have that covered. Not only do you have a spare drive, the server is configured with some level of RAID so you're probably well covered there.

What if that RAID card fails? It happens - and it happens with all different brands of cards.

What about RAM? Do you keep a spare RAM DIMM handy or if you see failures on one stick, do you just plan to remove it and run with less RAM until you can get more on site? The application might run slower because it's memory starved or because now your memory is not interleaved - but that might be a risk you are willing to take.

How about a power supply? Do you keep an extra one of those handy? Maybe you keep a spare. Or, you have dual power supplies. Are those power supplies plugged into separate power strips on separate circuits backed up by separate UPSs?

What if the NIC on the motherboard gets flaky or goes out completely? Do you keep a spare motherboard handy?

If you rely on out of band management of your server via an IPMI, Lights Out or DRAC card - what happens if that card goes bad while you're on vacation?

Even if you have all necessary spare parts for your server or you have multiple servers in a load balanced configuration inside the broom closet; what happens if you lose your switch or your load balancer or your router or your... What happens if that little AC you purchased shuts down on Friday night and the broom closet heats up all weekend until the server overheats? Do you have temperature sensors in the closet that are configured to send you an alert - so that now you have to drive back to the office to empty the water pail of the spot cooler?

You might think that some of these scenarios are a bit far fetched but I can certainly assure you that they're not. At SoftLayer, we have spares of everything. We maintain hundreds of servers in inventory at all times, we maintain a completely stocked inventory room full of critical components, and we staff it all 24/7 and back it all up with a 4 hour SLA.

Some people do have all of their bases covered. Some people are willing to take a chance, and even if you convince your employer that it's ok to take those chances, how do you think the boss will respond when something actually happens and critical services are offline?


January 23, 2008

640K Ought to Be Enough for Everybody

I was talking with a friend about memory on computers. He said he wanted a computer with tons of memory for Photoshop. I said something to the effect that I've never seen a desktop that can handle more than 16 GB, and that most operating systems now don't want to handle more than that... that Windows will only give a process 4 GB max, 2 GB for the application and 2 GB shared with the operating system. He then said "I can't imagine having to use more than 16 GB!" This immediately reminded me of Bill Gates' famous quote, that "640K ought to be more than enough for everybody." Striving for accuracy, I went to the Internet to find when and where he said this.

Interesting fact came up: HE NEVER SAID IT. He vehemently denies having ever uttered this phrase. Every quote I've seen is always un-sourced, so let's give him the benefit of the doubt. So what makes this quote clog the tubes so easily?

First, the irony that a person renowned for his computer visions of the future would say something so backward causes people to smile. Secondly, there is a nugget of truth. In a 1989 speech, Bill Gates said:

"I have to say that in 1981, making those decisions, I felt like I was providing enough freedom for 10 years. That is, a move from 64k to 640k felt like something that would last a great deal of time. Well, it didn't - it took about only 6 years before people started to see that as a real problem."

Bill never said "640K is enough for everybody." He said "640K should keep everyone happy for the next 10 years." Turns out he was wrong. Within less than 6 years we started hitting the "640K limit"... which wasn't a hard limit at all, it was just a limit proposed by the operating system at the time. Mr. Gates thought that 640K was generous, and in the beginning it was. But as the industry marched forward, it started cramping. Pretty soon, we had DOS memory managers, extended and expanded memory, virtual memory... we would do ANYTHING to escape the 640K limit.

So what's the moral of this story? First, you can't believe just anything you read on the Internet. Sometimes something gets into the tubes because it's funny, not because it's accurate. Secondly, predicting the computer industry is hard. To be a successful computer development company (like the portion of SoftLayer I work in), you have to be able to look to the horizon and attempt to spy the most likely location the software industry is moving in. We developers work on projects 3, 6, 9, 12 months before they're used by our users; we have to make predictions at least double that size in advance to give growth room whilst the next tool is developed. And, I dare say, we've done a good job of that around here!

In conclusion, Bill Gates never said "640K should be enough for everybody." That quote is a myth. It's a funny joke, but a joke nonetheless. However, Bill Gates did actually say this at a Macintosh conference:

"To create a new standard, it takes something that's not just a little bit different; it takes something that's really new and really captures people's imagination — and the Macintosh, of all the machines I've ever seen, is the only one that meets that standard."

And this time there's video evidence. And there's the not-quite-quote where Bill Gates implies that Vista isn't really the best thing since 8086 Segmented Physical Memory Models. Aren't real quotes more humorous fake ones? I think so.


Subscribe to memory