Posts Tagged 'Cluster'

July 9, 2013

When to Consider Riak for Your Big Data Architecture

In my Breaking Down 'Big Data' – Database Models, I briefly covered the most common database models, their strengths, and how they handle the CAP theorem — how a distributed storage system balances demands of consistency and availability while maintaining partition tolerance. Here's what I said about Dynamo-inspired databases:

What They Do: Distributed key/value stores inspired by Amazon's Dynamo paper. A key written to a dynamo ring is persisted in several nodes at once before a successful write is reported. Riak also provides a native MapReduce implementation.
Horizontal Scaling: Dynamo-inspired databases usually provide for the best scale and extremely strong data durability.
CAP Balance: Prefer availability over consistency
When to Use: When the system must always be available for writes and effectively cannot lose data.
Example Products: Cassandra, Riak, BigCouch

This type of key/value store architecture is very unique from the document-oriented MongoDB solutions we launched at the end of last year, so we worked with Basho to prioritize development of high-performance Riak solutions on our global platform. Since you already know about MongoDB, let's take a few minutes to meet the new kid on the block.

Riak is a distributed database architected for availability, fault tolerance, operational simplicity and scalability. Riak is masterless, so each node in a Riak cluster is the same and contains a complete, independent copy of the Riak package. This design makes the Riak environment highly fault tolerant and scalable, and it also aids in replication — if a node goes down, you can still read, write and update data.

As you approach the daunting prospect of choosing a big data architecture, there are a few simple questions you need to answer:

  1. How much data do/will I have?
  2. In what format am I storing my data?
  3. How important is my data?

Riak may be the choice for you if [1] you're working with more than three terabytes of data, [2] your data is stored in multiple data formats, and [3] your data must always be available. What does that kind of need look like in real life, though? Luckily, we've had a number of customers kick Riak's tires on SoftLayer bare metal servers, so I can share a few of the use cases we've seen that have benefited significantly from Riak's unique architecture.

Use Case 1 – Digital Media
An advertising company that serves over 10 billion ads per month must be able to quickly deliver its content to millions of end users around the world. Meeting that demand with relational databases would require a complex configuration of expensive, vertically scaled hardware, but it can be scaled out horizontally much easier with Riak. In a matter of only a few hours, the company is up and running with an ad-serving infrastructure that includes a back-end Riak cluster in Dallas with a replication cluster in Singapore along with an application tier on the front end with Web servers, load balancers and CDN.

Use Case 2 – E-commerce
An e-commerce company needs 100-percent availability. If any part of a customer's experience fails, whether it be on the website or in the shopping cart, sales are lost. Riak's fault tolerance is a big draw for this kind of use case: Even if one node or component fails, the company's data is still accessible, and the customer's user experience is uninterrupted. The shopping cart structure is critical, and Riak is built to be available ... It's a perfect match.

As an additional safeguard, the company can take advantage of simple multi-datacenter replication in their Riak Enterprise environment to geographically disperse content closer to its customers (while also serving as an important tool for disaster recovery and backup).

Use Case 3 – Gaming
With customers like Broken Bulb and Peak Games, SoftLayer is no stranger to the gaming industry, so it should come as no surprise that we've seen interesting use cases for Riak from some of our gaming customers. When a game developer incorporated Riak into a new game to store player data like user profiles, statistics and rankings, the performance of the bare metal infrastructure blew him away. As a result, the game's infrastructure was redesigned to also pull gaming content like images, videos and sounds from the Riak database cluster. Since the environment is so easy to scale horizontally, the process on the infrastructure side took no time at all, and the multimedia content in the game is getting served as quickly as the player data.

Databases are common bottlenecks for many applications, but they don't have to be. Making the transition from scaling vertically (upgrading hardware, adding RAM, etc.) to scaling horizontally (spreading the work intelligently across multiple nodes) alleviates many of the pain points for a quickly growing database environment. Have you made that transition? If not, what's holding you back? Have you considered implementing Riak?

-@marcalanjones

April 30, 2013

Big Data at SoftLayer: Riak

Big data is only getting bigger. Late last year, SoftLayer teamed up with 10Gen to launch a high-performance MongoDB solution, and since then, many of our customers have been clamoring for us to support other big data platforms in the same way. By automating the provisioning process of a complex big data environment on bare metal infrastructure, we made life a lot easier for developers who demanded performance and on-demand scalability for their big data applications, and it's clear that our simple formula produced amazing results. As Marc mentioned when he started breaking down big data database models, document-oriented databases like MongoDB are phenomenal for certain use-cases, and in other situations, a key-value store might be a better fit. With that in mind, we called up our friends at Basho and started building a high-performance architecture specifically for Riak ... And I'm excited to announce that we're launching it today!

Riak is an open source, distributed database platform based on the principles enumerated in the DynamoDB paper. It uses a simple key/value model for object storage, and it was architected for high availability, fault tolerance, operational simplicity and scalability. A Riak cluster is composed of multiple nodes that are all connected, all communicating and sharing data automatically. If one node were to fail, the other nodes would automatically share the data that the failed node was storing and processing until the node is back up and running or a new node is added. See the diagram below for a simple illustration of how adding a node to a cluster works within Riak.

Riak Nodes

We will support both the open source and the Enterprise versions of Riak. The open source version is a great place to start. It has all of the database functionality of Riak Enterprise, but it is limited to a single cluster. The Enterprise version supports replication between clusters across data centers, giving you lots of architectural options. You can use replication to build highly available, live-live failover applications. You can also use it to distribute your application's data across regions, giving you a global platform that you can update anywhere in the world and know that those modifications will be available anywhere else. Riak Enterprise customers also receive 24×7 coverage, both from SoftLayer and Basho. This includes SoftLayer's one-hour guaranteed response for Severity 1 hardware issues and unlimited support available via our secure web portal, email and phone.

The business use-case for this flexibility is that if you need to scale up or down, nodes can be easily added or taken down as your requirements change. You can opt for a single-data center environment with a few nodes or you can broaden your architecture to a multi-data center deployment with a 40-node cluster. While these capabilities are inherent in Riak, they can be complicated to build and configure, so we spent countless hours working with Basho to streamline Riak deployment on the SoftLayer platform. The fruit of that labor can be found in our Riak Solution Designer:

Riak Solution Designer

The server configurations and packages in the Riak Solution Designer have been selected to deliver the performance, availability and stability that our customers expect from their bare metal and virtual cloud infrastructure at SoftLayer. With a few quick clicks, you can order a fully configured Riak environment, and it'll be provisioned and online for you in two to four hours. And everything you order is on a month-to-month contract.

Thanks to the hard work done by the SoftLayer development group and Basho's team, we're proud to be the first in the marketplace to offer a turn-key Riak solution on bare metal infrastructure. You don't need to sacrifice performance and agility for simplicity.

For more information, visit SoftLayer.com/Riak or contact our sales team.

-Duke

February 15, 2012

SoftLayer + OpenStack Swift = SoftLayer Object Storage

Since our inception in 2005, SoftLayer's goal has been to provide an array of on-demand data center and hosting services that combine exceptional access, control, scalability and security with unparalleled network robustness and ease of use ... That's why we're so excited to unveil SoftLayer Object Storage to our customers.

Based on OpenStack Object Storage (codenamed Swift) — open-source software that allows the creation of redundant, scalable object storage on clusters of standardized servers — SoftLayer Object Storage provides customers with new opportunities to leverage cost-effective cloud-based storage and to simultaneously realize significant capex-related cost savings.

OpenStack has been phenomenally successful thanks to a global software community comprised of developers and other technologists that has built and tweaked a standards-based, massively scalable open-source platform for public and private cloud computing. The simple goal of the OpenStack project is to deliver code that enables any organization to create and offer feature-rich cloud computing services from industry-standard hardware. The overarching OpenStack technology consists of several interrelated project components: One for compute, one for an image service, one for object storage, and a few more projects in development.

SoftLayer Object Storage
Like the OpenStack Swift system on which it is based, SoftLayer Object Storage is not a file system or real-time data-storage system, rather it's a long-term storage system for a more permanent type of static data that can be retrieved, leveraged and updated when necessary. Typical applications for this type of storage can involve virtual machine images, photo storage, email storage and backup archiving.

One of the primary benefits of Object Storage is the role that it can play in automating and streamlining data storage in cloud computing environments. SoftLayer Object Storage offers rich metadata features and search capability that can be leveraged to automate the way unstructured data gets accessed. In this way, SoftLayer Object Storage will provide organizations with new capabilities for improving overall data management and storage efficiency.

File Storage v. Object Storage
To better understand the difference between file storage and object storage, let's look at how file storage and object storage differ when it comes to metadata and search for a simple photo image. When a digital camera or camera-enabled phone snaps a photo, it embeds a series of metadata values in the image. If you save the image in a standard image file format, you can search for it by standard file properties like name, date and size. If you save the same image with additional metadata as an object, you can set object metadata values for the image (after reading them from the image file). This detail provides granular search capability based on the metadata keys and values, in addition to the standard object properties. Here is a sample comparison of an image's metadata value in both systems:

File Metadata Object Metadata
Name:img01.jpg Name:img01.jpg
Date: 2012-02-13 Date:2012-02-13
Size:1.2MB Size:1.2MB
Manufacturer:CASIO
Model:QV-4000
x-Resolution:72.00
y-Resolution:72.00
PixelXDimension:2240
PixelYDimension:1680
FNumber:f/4.0
Exposure Time:1/659 sec.

Using the rich metadata and search capability enabled by object storage, you would be able to search for all images with a dimension of 2240x1680 or a resolution of 72x72 in a quick/automated fashion. The object storage system "understands" more about what is being stored because it is able to differentiate files based on characteristics that you define.

What Makes SoftLayer Object Storage Different?
SoftLayer Object Storage features several unique features and ways for SoftLayer customers to upload, access and manage data:

  • Search — Quickly access information through user-defined metadata key-value pairs, file name or unique identifier
  • CDN — Serve your content globally over our high-performance content delivery network
  • Private Network — Free, secure private network traffic between all data centers and storage cluster nodes
  • API — Access to a full-feature OpenStack-compatible API with additional support for CDN and search integration
  • Portal — Web application integrated into the SoftLayer portal
  • Mobile — iPhone and Android mobile apps, with Windows Phone app coming soon
  • Language Bindings — Feature-complete bindings for Java, PHP, Python and Ruby*

*Language bindings, documentation, and guides are available on SLDN.

We think SoftLayer Object Storage will be attractive to a broad range of current and prospective customers, from web-centric businesses dependent on file sharing and content distribution to legal/medical/financial-services companies which possess large volumes of data that must be stored securely while remaining readily accessible.

SoftLayer Object Storage significantly extends our cloud-services portfolio while substantially enriching the storage capabilities that we bring to our customers. What are you waiting for? Go order yourself some object storage @ $0.12/GB!

-Marc

Subscribe to cluster