An Introduction to Redis

October 4, 2011

I recently had the opportunity to get re-acquainted with Redis while evaluating solutions for a project on the Product Innovation team here at SoftLayer. I'd actually played with it a couple of times before, but this time it "clicked." Or my brain broke. Either way, I see a lot of potential for Redis now.

No one product is a perfect fit for all of your data storage needs, of course. There are such fundamental tradeoffs to be made in designing storage architectures that you should be immediately suspicious of any product that claims to fit every need.

The best solutions tend to be products that actually embrace these tradeoffs. Redis, for instance, has sacrificed a small amount of data durability in exchange for being awesome.

What is it?

Redis is a key/value store, but describing it that way is sort of like calling a helicopter a "vehicle." It's a technically correct description, but it leaves out some important stuff.

You can think of it as the sophisticated older brother of Memcached. Memcached presents a flat keyspace, and you can set those keys to string values. Another feature of Memcached is the ability to perform remote atomic operations, like "incr" and "append." These are really handy: you can modify remote data without fetching it first, and you have an assurance that nobody else is touching that value while your operation runs.
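To make the difference concrete, here is a minimal Python sketch built around a toy in-process store. It is not Memcached or Redis client code: `ToyStore`, `unsafe_incr`, `hammer`, and the lock standing in for a server that runs one command at a time are all assumptions made for illustration.

```python
import threading


class ToyStore:
    """Toy stand-in for a remote store; the lock models one command running at a time."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data.get(key, 0)

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def incr(self, key):
        # Read, add, and write happen inside one command, so no other
        # client can slip in between the read and the write.
        with self._lock:
            self._data[key] = self._data.get(key, 0) + 1
            return self._data[key]


def unsafe_incr(store, key):
    # Fetch-modify-write from the client: two separate commands, so two
    # clients can read the same value and one update can be lost.
    store.set(key, store.get(key) + 1)


def hammer(store, key, workers=8, per_worker=1000):
    """Increment `key` from several threads using the atomic command."""
    def work():
        for _ in range(per_worker):
            store.incr(key)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store.get(key)
```

With `incr`, `hammer(ToyStore(), "hits")` always ends at `workers * per_worker`. Swap in `unsafe_incr` and that guarantee is gone, because another thread can write between the `get` and the `set`.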

Redis takes this concept of remote commands on data and goes completely nuts with it. The database is aware of data structures like hashes, lists and sets in addition to simple string values. You can sort, union, intersect, slice and dice to your heart's content without fetching any data. Redis is a data structure server. You can treat it like remote memory, and this has an awesome immediate benefit for a programmer: your code and brain are already optimized for these data types.
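As a rough illustration of that last point, the sketch below mimics a few Redis commands with plain Python built-ins. The command names in the comments (SADD, SINTER, SUNION, RPUSH, LRANGE, HSET, HGET) are real Redis commands; the key names, the data, and the `lrange` helper are made up, and nothing here talks to a server.

```python
# Sets: SADD adds members, SINTER / SUNION combine sets on the server.
python_fans = {"alice", "bob", "carol"}      # SADD fans:python alice bob carol
redis_fans = {"bob", "carol", "dave"}        # SADD fans:redis bob carol dave
both = python_fans & redis_fans              # SINTER fans:python fans:redis
either = python_fans | redis_fans            # SUNION fans:python fans:redis

# Lists: RPUSH appends to the tail, LRANGE reads a slice.
recent = []
recent.extend(["login", "search", "checkout"])   # RPUSH recent login search checkout


def lrange(items, start, stop):
    """Mimic LRANGE for non-negative indices: unlike a Python slice, `stop` is inclusive."""
    return items[start:stop + 1]


first_two = lrange(recent, 0, 1)             # LRANGE recent 0 1

# Hashes: HSET / HGET work on one field of a dict-like value.
user = {}
user["name"] = "alice"                       # HSET user:1 name alice
name = user["name"]                          # HGET user:1 name
```

The one detail worth remembering is that LRANGE treats both ends of the range as inclusive, so `0 1` returns two elements rather than one.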

But it's not just about making storage simpler. It's fast, too. Crazy fast. If you make intelligent use of its data structures, it's possible to serve a lot of traffic from relatively modest hardware. Redis 2.4 can easily handle ~50k list appends a second on my notebook. With batching, it can append 2 million items to a list on a remote host in about 1.28 seconds.
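The post doesn't say how the batching was done, so the following is only a sketch of the general idea: group items so that each network round trip carries many of them. The `chunked` and `round_trips` helpers and the batch size of 1,000 are assumptions, not the benchmark code behind the numbers above.

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def round_trips(total_items, batch_size):
    """Number of requests needed when each request carries one batch."""
    return -(-total_items // batch_size)  # ceiling division


# Figures quoted above: 2 million items in about 1.28 seconds.
items_per_second = 2_000_000 / 1.28          # roughly 1.56 million items per second
unbatched = round_trips(2_000_000, 1)        # one request per item
batched = round_trips(2_000_000, 1000)       # assumed batch size of 1,000
```

At the assumed batch size, 2 million items need 2,000 requests instead of 2 million, which is where most of the speedup over one-at-a-time appends would come from.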

It allows the remote, atomic and performant manipulation of data structures. It took me a little while to realize exactly how useful that is.

What's wrong with it?

Nothing. Move along.

OK, it's a little short on durability. Redis uses memory as its primary store and periodically flushes to disk. A common configuration is to do so every second.

That sounds pretty reasonable. If a server goes down, you could lose a second of data. Keep in mind, however, how many operations Redis can perform in a second. If you're in a high-volume environment, that could be a lot of data. It's not for your financial transactions.
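Here is a quick back-of-the-envelope sketch of that exposure, assuming the write rate stays constant across the flush interval. The function name is hypothetical, and the 50,000 writes per second figure is borrowed from the notebook benchmark earlier in the post purely as an example.

```python
def writes_at_risk(writes_per_second, flush_interval_seconds):
    """Upper bound on writes not yet on disk if the server dies just before a flush."""
    return int(writes_per_second * flush_interval_seconds)


worst_case = writes_at_risk(50_000, 1.0)     # busy server, one-second flush window
quiet_app = writes_at_risk(20, 1.0)          # low-traffic app, same window
```

The same one-second window means tens of thousands of writes for the busy server and a couple dozen for the quiet one, so the right answer depends on your traffic, not just on the flush interval.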

Its availability options are also relatively limited. Currently, Redis only supports master/slave replication. Clustering support is planned for an upcoming release. The clustering design looks pretty powerful, but it will take some real-world testing to know its performance impact.

Take these limitations into consideration. If the current tradeoffs aren't a good fit for your situation, that will probably be clear pretty quickly.

In my experience, a lot of developers seriously overestimate the consequences of their application losing small amounts of data. Also consider whether or not the chance of losing a second (or less) of data genuinely represents a bigger threat to your application than any other compromises you might have made.

More Information

You can check out the slightly aging docs or browse the impressively simple source. There are probably already bindings for your language of choice as well.