Posts Tagged 'Configuration'

January 29, 2013

iptables Tips and Tricks: APF (Advanced Policy Firewall) Configuration

Let's talk about APF. APF — Advanced Policy Firewall — is a policy-based iptables firewall system that provides simple, powerful control over your day-to-day server security. It might seem intimidating to be faced with all of the features and configuration tools in APF, but this blog should put your fears to rest.

APF is an iptables wrapper that works alongside iptables and extends its functionality. I personally don't use iptables wrappers, but I have a lot of experience with them, and I've seen that they do offer some additional features that streamline policy management. For example, by employing APF, you'll get several simple on/off toggles (set via configuration files) that make some complex iptables configurations available without extensive coding requirements. The flip-side of a wrapper's simplicity is that you aren't directly in control of the iptables commands, so if something breaks it might take longer to diagnose and repair. Before you add a wrapper like APF, be sure that you know what you are getting into. Here are a few points to consider:

  • Make sure that what you're looking to use adds a feature you need but cannot easily incorporate with iptables on its own.
  • You need to know how to effectively enable and disable the iptables wrapper (the correct way ... read the manual!), and you should always have a trusted failsafe iptables ruleset handy in the unfortunate event that something goes horribly wrong and you need to disable the wrapper.
  • Learn about the basic configurations and rule changes you can apply via the command line. You'll need to understand the way your wrapper takes rules because it may differ from the way iptables handles rules.
  • You can't manually configure your iptables rules once you have your wrapper in place (or at least you shouldn't).
  • Be sure to know how to access your server via the IPMI management console so that if you completely lock yourself out beyond repair, you can get back in. You might even go so far as to have a script or set of instructions ready for tech support to run, in the event that you can't get in via the management console.

TL;DR: Have a Band-Aid ready!
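For example, that Band-Aid can be as simple as a known-good ruleset saved to disk before the wrapper goes in, plus the commands to shut the wrapper off (the file path here is just an example):

# Save a known-good ruleset before installing/enabling APF
iptables-save > /root/failsafe.rules

# If APF misbehaves, stop it (flushing its rules) and fall back to the saved set
apf -f
iptables-restore < /root/failsafe.rules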

APF Configuration

Now that you have been sufficiently advised about the potential challenges of using a wrapper (and you've got your Band-Aid ready), we can check out some of the useful APF rules that make iptables administration a lot easier. Most of the configuration for APF is in conf.apf. This file handles the default behavior, but not necessarily the specific blocking rules, and when we make any changes to the configuration, we'll need to restart the APF service for the changes to take effect.
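On a standard install, that configuration lives at /etc/apf/conf.apf, and the restart is a one-liner:

# Reload APF after editing /etc/apf/conf.apf
apf -r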

Let's jump into conf.apf and break down what we see. The first code snippet is fairly self-explanatory. It's another way to make sure you don't lock yourself out of your server as you are making configuration changes and testing them:

# !!! Do not leave set to (1) !!!
# When set to enabled (1), a 5-minute cron job is set to stop the firewall. Set
# this off (0) when the firewall is determined to be operating as desired.
DEVEL_MODE="1"

The next configuration options we'll look at are where you can make quick high-level changes if you find that legitimate traffic is being blocked and you want to make APF a little more lenient:

# This controls the amount of violation hits an address must have before it
# is blocked. It is a good idea to keep this very low to prevent evasive
# measures. The default is 0 or 1, meaning instant block on first violation.
RAB_HITCOUNT="1"
 
# This is the amount of time (in seconds) that an address gets blocked for if
# a violation is triggered, the default is 300s (5 minutes).
RAB_TIMER="300"
# This allows RAB to 'trip' the block timer back to 0 seconds if an address
# attempts ANY subsequent communication while still in the initial block period.
RAB_TRIP="1"
 
# This controls if the firewall should log all violation hits from an address.
# The use of LOG_DROP variable set to 1 will override this to force logging.
RAB_LOG_HIT="1"
 
# This controls if the firewall should log all subsequent traffic from an address
# that is already blocked for a violation hit; this can generate a lot of logs.
# The use of LOG_DROP variable set to 1 will override this to force logging.
RAB_LOG_TRIP="0"

Next, we have an option to adjust ICMP flood protection. This protection should be useful against some forms of DoS attacks, and the associated rules show up in your INPUT chain:

# Set a reasonable packet/time ratio for ICMP packets, exceeding this flow
# will result in dropped ICMP packets. Supported values are in the form of:
# pkt/s (packets/seconds), pkt/m (packets/minutes)
# Set value to 0 for unlimited, anything above is enabled.
ICMP_LIM="30/s"

If you want to add more ports to block for p2p traffic (which will show up in the P2P chain), you'll update this code:

# A common set of known Peer-To-Peer (p2p) protocol ports that are often
# considered undesirable traffic on public Internet servers. These ports
# are also often abused on web hosting servers where clients upload p2p
# client agents for the purpose of distributing or downloading pirated media.
# Format is comma separated for single ports and an underscore separator for
# ranges (4660_4678).
BLK_P2P_PORTS="1214,2323,4660_4678,6257,6699,6346,6347,6881_6889,6346,7778"

The next few lines let you designate the ports that you want to have closed at all times. They will be blocked for INPUT and OUTPUT chains:

# These are common Internet service ports that are understood in the wild as
# services you would not want logged under normal circumstances. All ports
# that are defined here will be implicitly dropped with no logging for
# TCP/UDP traffic inbound or outbound. Format is comma separated for single
# ports and an underscore separator for ranges (135_139).
BLK_PORTS="135_139,111,513,520,445,1433,1434,1234,1524,3127"

The next important section to look at deals with conntrack. If you get "conntrack full" errors, this is where you'd increase the allowed connections. It's not uncommon to need more connections than the default, so if you need to adjust that value, you'd do it here:

# This is the maximum number of "sessions" (connection tracking entries) that
# can be handled simultaneously by the firewall in kernel memory. Increasing
# this value too high will simply waste memory - setting it too low may result
# in some or all connections being refused, in particular during denial of
# service attacks.
SYSCTL_CONNTRACK="65536"

We've talked about the ports we want closed at all times, so it only makes sense that we'd specify which ports we want open for all interfaces:

# Common inbound (ingress) TCP ports
IG_TCP_CPORTS="22"
# Common inbound (ingress) UDP ports
IG_UDP_CPORTS=""
# Common outbound (egress) TCP ports
EG_TCP_CPORTS="21,25,80,443,43"
# Common outbound (egress) UDP ports
EG_UDP_CPORTS="20,21,53"

And when we want a special port allowance for specific users, we can declare it easily. For example, if we want port 22 open for user ID 0, we'd use this code:

# Allow outbound access to destination port 22 for uid 0
EG_TCP_UID="0:22"

The next few sections on Remote Rule Imports and Global Trust are a little more specialized, and I encourage you to read a little more about them (since there's so much to them and not enough space to cover them here on the blog). An important feature of APF is that it imports block lists from outside sources to keep you safe from some attackers, so the Remote Rule Imports can prove to be very useful. The Global Trust section is incredibly useful for multi-server deployments of APF. Here, you can set up your global allow/block lists and have them all pull from a central location so that you can make a single update to the source and have the update propagated to all servers in your configuration. These changes are synced to the glob_allow/deny.rules files, and they will be downloaded (and overwritten) on a regular basis from your specified source, so don't make any manual edits in glob_allow/deny.rules.

As you can see, conf.apf is no joke. It has a lot of stuff going on, but it's very straightforward and documented well. Once we've set up conf.apf with the configurations we need, it's time to look at the more focused allow_hosts.rules and deny_hosts.rules files. These .rules files are where you put your typical firewall rules in place. If there's one piece of advice I can give you about these configurations, it would be to check whether your traffic is already allowed or blocked before adding a new rule. Having multiple rules that do the same thing (possibly in different places) is confusing and potentially dangerous.

The deny_hosts.rules configuration will look just like allow_hosts.rules, but it's performing the opposite function. Let's check out an allow_hosts.rules configuration that will allow the Nimsoft service to function:

tcp:in:d=48000_48020:s=10.0.0.0/8
tcp:out:d=48000_48020:d=10.0.0.0/8

The format is fairly simple, but the file gives a little more context in the comments:

# The trust rules can be made in advanced format with 4 options
# (proto:flow:port:ip);
# 1) protocol: [packet protocol tcp/udp]
# 2) flow in/out: [packet direction, inbound or outbound]
# 3) s/d=port: [packet source or destination port]
# 4) s/d=ip(/xx) [packet source or destination address, masking supported]
# Syntax:
# proto:flow:[s/d]=port:[s/d]=ip(/mask)
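The file also accepts a simple format of one IP address or FQDN per line, which trusts that host completely. The entries below are illustrative examples of both formats (not defaults):

# Simple format: trust this host on all ports and protocols
10.0.0.5

# Advanced format: allow inbound MySQL from the private network
tcp:in:d=3306:s=10.0.0.0/8

# Advanced format: allow outbound DNS to a specific resolver
udp:out:d=53:d=10.0.80.11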

APF also uses ds_hosts.rules to load the DShield.org blocklist, and I assume the ecnshame_hosts.rules does something similar (can't find much information about it), so you won't need to edit these files manually. Additionally, you probably don't need to make any changes to log.rules, unless you want to make changes to what exactly you log. As it stands, it logs certain dropped connections, which should be enough. Also, it might be worth noting that this file is a script, not a configuration file.

The last two configuration files are the preroute.rules and postroute.rules that (unsurprisingly) are used to make routing changes. If you have been following my articles, this corresponds to the iptables chains for PREROUTING and POSTROUTING where you would do things like port forwarding and other advanced configuration that you probably don't want to do in most cases.

APF Command Line Management

As I mentioned in the "points to consider" at the top of this post, it's important to learn the changes you can perform from the command line, and APF has some very useful command line tools:

[root@server]# apf --help
APF version 9.7 <apf@r-fx.org>
Copyright (C) 2002-2011, R-fx Networks <proj@r-fx.org>
Copyright (C) 2011, Ryan MacDonald <ryan@r-fx.org>
This program may be freely redistributed under the terms of the GNU GPL
 
usage /usr/local/sbin/apf [OPTION]
-s|--start ......................... load all firewall rules
-r|--restart ....................... stop (flush) & reload firewall rules
-f|--stop .......................... stop (flush) all firewall rules
-l|--list .......................... list all firewall rules
-t|--status ........................ output firewall status log
-e|--refresh ....................... refresh & resolve dns names in trust rules
-a HOST CMT|--allow HOST COMMENT ... add host (IP/FQDN) to allow_hosts.rules and
                                     immediately load new rule into firewall
-d HOST CMT|--deny HOST COMMENT .... add host (IP/FQDN) to deny_hosts.rules and
                                     immediately load new rule into firewall
-u|--remove HOST ................... remove host from [glob]*_hosts.rules
                                     and immediately remove rule from firewall
-o|--ovars ......................... output all configuration options

You can use these command line tools to turn your firewall on and off, add allowed or blocked hosts and display troubleshooting information. These commands are very easy to use, but if you want more fine-tuned control, you'll need to edit the configuration files directly (as we looked at above).
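For instance, day-to-day allow/deny management usually looks something like this (the IPs and comments are placeholders):

# Trust an office IP and load the rule immediately
apf -a 203.0.113.25 office_vpn

# Block an abusive host
apf -d 198.51.100.77 ssh_brute_force

# Remove the entry when it's no longer needed
apf -u 198.51.100.77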

I know it seems like a lot of information, but to a large extent, that's all you need to know to get started with APF. Take each section slowly and understand what each configuration file is doing, and you'll master APF in no time at all.

-Mark

December 20, 2012

MongoDB Performance Analysis: Bare Metal v. Virtual

Developers can be cynical. When "the next great thing in technology" is announced, I usually wait to see how it performs before I get too excited about it ... Show me how that "next great thing" compares apples-to-apples with the competition, and you'll get my attention. With the launch of MongoDB at SoftLayer, I'd guess a lot of developers outside of SoftLayer and 10gen have the same "wait and see" attitude about the new platform, so I put our new MongoDB engineered servers to the test.

When I shared MongoDB architectural best practices, I referenced a few of the significant optimizations our team worked with 10gen to incorporate into our engineered servers (cheat sheet). To illustrate the impact of these changes in MongoDB performance, we ran 10gen's recommended benchmarking harness (freely available for download and testing of your own environment) on our three tiers of engineered servers alongside equivalent shared virtual environments commonly deployed by the MongoDB community. We've made a pretty big deal about the performance impact of running MongoDB on optimized bare metal infrastructure, so it's time to put our money where our mouth is.

The Testing Environment

For each of the available SoftLayer MongoDB engineered servers, data sets of 512KB documents were preloaded onto single MongoDB instances. The data sets were sized relative to available memory so that some were larger (2X) and some were smaller than the available memory. Each test also ensured that the data set was altered frequently enough during the run to prevent all of the data from being cached in memory.

Once the data sets were created, JMeter server instances with 4 cores and 16GB of RAM were used to drive 'benchrun' from the 10gen benchmarking harness. This diagram illustrates how we set up the testing environment:

[Diagram: MongoDB performance analysis testing environment]

These JMeter servers function as the clients generating traffic on the MongoDB instances. Each client generated random query and update requests with a ratio of six queries per update (the update requests ensured that the data was never fully cached in memory, so the tests would still exercise reads from disk). These tests were designed to create an extreme load on the servers from an exponentially increasing number of clients until the system resources became saturated, and we recorded the resulting performance of the MongoDB application.
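If you'd like to drive a similar test against your own environment, the mongo shell exposes the benchRun() helper that the harness is built around. Here's a minimal sketch, not the exact harness settings we used; the host, namespace and key range are assumptions:

mongo --host 10.0.0.20 --eval '
  printjson(benchRun({
    host: "10.0.0.20",
    ops: [
      // one query op and one update op; the real harness weights these 6:1
      { ns: "test.docs", op: "findOne",
        query: { _id: { "#RAND_INT": [0, 1000000] } } },
      { ns: "test.docs", op: "update",
        query: { _id: { "#RAND_INT": [0, 1000000] } },
        update: { $inc: { n: 1 } } }
    ],
    parallel: 16,  // concurrent client threads
    seconds: 60    // duration of the run
  }))'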

At the Medium (MD) and Large (LG) engineered server tiers, performance metrics were run separately for servers using 15K SAS hard drive data mounts and servers using SSD hard drive data mounts. If you missed the post comparing the IOPS statistics between different engineered server hard drive configurations, be sure to check it out.

Test Case 1: Small MongoDB Engineered Servers vs Shared Virtual Instance

Servers

Small (SM) MongoDB Engineered Server
Single 4-core Intel 1270 CPU
64-bit CentOS
8GB RAM
2 x 500GB SATAII - RAID1
1Gb Network
Virtual Provider Instance
4 Virtual Compute Units
64-bit CentOS
7.5GB RAM
2 x 500GB Network Storage - RAID1
1Gb Network
 

Tests Performed

Small Data Set (8GB of 0.5MB documents)
200 iterations of 6:1 query-to-update operations
Concurrent client connections exponentially increased from 1 to 32
Test duration spanned 48 hours
[Charts: Average Read, Peak Read, Average Write, and Peak Write Operations per Second by Concurrent Client]

Test Case 2: Medium MongoDB Engineered Servers vs Shared Virtual Instance

Servers (15K SAS Data Mount Comparison)

Medium (MD) MongoDB Engineered Server
Dual 6-core Intel 5670 CPUs
64-bit CentOS
36GB RAM
2 x 64GB SSD - RAID1 (Journal Mount)
4 x 300GB 15K SAS - RAID10 (Data Mount)
1Gb Network - Bonded
Virtual Provider Instance
26 Virtual Compute Units
64-bit CentOS
30GB RAM
2 x 64GB Network Storage - RAID1 (Journal Mount)
4 x 300GB Network Storage - RAID10 (Data Mount)
1Gb Network
 

Tests Performed

Small Data Set (32GB of 0.5MB documents)
200 iterations of 6:1 query-to-update operations
Concurrent client connections exponentially increased from 1 to 128
Test duration spanned 48 hours
[Charts: Average Read, Peak Read, Average Write, and Peak Write Operations per Second by Concurrent Client]

Servers (SSD Data Mount Comparison)

Medium (MD) MongoDB Engineered Server
Dual 6-core Intel 5670 CPUs
64-bit CentOS
36GB RAM
2 x 64GB SSD - RAID1 (Journal Mount)
4 x 400GB SSD - RAID10 (Data Mount)
1Gb Network - Bonded
Virtual Provider Instance
26 Virtual Compute Units
64-bit CentOS
30GB RAM
2 x 64GB Network Storage - RAID1 (Journal Mount)
4 x 300GB Network Storage - RAID10 (Data Mount)
1Gb Network
 

Tests Performed

Small Data Set (32GB of 0.5MB documents)
200 iterations of 6:1 query-to-update operations
Concurrent client connections exponentially increased from 1 to 128
Test duration spanned 48 hours
[Charts: Average Read, Peak Read, Average Write, and Peak Write Operations per Second by Concurrent Client]

Test Case 3: Large MongoDB Engineered Servers vs Shared Virtual Instance

Servers (15K SAS Data Mount Comparison)

Large (LG) MongoDB Engineered Server
Dual 8-core Intel E5-2620 CPUs
64-bit CentOS
128GB RAM
2 x 64GB SSD - RAID1 (Journal Mount)
6 x 600GB 15K SAS - RAID10 (Data Mount)
1Gb Network - Bonded
Virtual Provider Instance
26 Virtual Compute Units
64-bit CentOS
64GB RAM (Maximum available on this provider)
2 x 64GB Network Storage - RAID1 (Journal Mount)
6 x 600GB Network Storage - RAID10 (Data Mount)
1Gb Network
 

Tests Performed

Small Data Set (64GB of 0.5MB documents)
200 iterations of 6:1 query-to-update operations
Concurrent client connections exponentially increased from 1 to 128
Test duration spanned 48 hours
[Charts: Average Read, Peak Read, Average Write, and Peak Write Operations per Second by Concurrent Client]

Servers (SSD Data Mount Comparison)

Large (LG) MongoDB Engineered Server
Dual 8-core Intel E5-2620 CPUs
64-bit CentOS
128GB RAM
2 x 64GB SSD - RAID1 (Journal Mount)
6 x 400GB SSD - RAID10 (Data Mount)
1Gb Network - Bonded
Virtual Provider Instance
26 Virtual Compute Units
64-bit CentOS
64GB RAM (Maximum available on this provider)
2 x 64GB Network Storage - RAID1 (Journal Mount)
6 x 600GB Network Storage - RAID10 (Data Mount)
1Gb Network
 

Tests Performed

Small Data Set (64GB of 0.5MB documents)
200 iterations of 6:1 query-to-update operations
Concurrent client connections exponentially increased from 1 to 128
Test duration spanned 48 hours
[Charts: Average Read, Peak Read, Average Write, and Peak Write Operations per Second by Concurrent Client]

Impressions from Performance Testing

The results speak for themselves. Running a MongoDB big data solution on a shared virtual environment has significant drawbacks when compared to running MongoDB on a single-tenant bare metal offering. Disk I/O is by far the most limiting resource for MongoDB, and relying on shared network-attached storage (with much lower disk I/O) makes this limitation very apparent. Beyond the average and peak statistics above, performance varied much more significantly in the virtual instance environment, so it's not as consistent and predictable as bare metal.

Highlights:

  • When a working data set is smaller than available memory, query performance increases.
  • The number of clients performing queries has an impact on query performance because more data is being actively cached at a rapid rate.
  • The addition of a separate Journal Mount volume significantly improves performance. Because the Small (SM) engineered server does not include a secondary mount for journals, whenever MongoDB began to journal, the disk I/O associated with journaling was disruptive to the query and update operations performed on the Data Mount.
  • The best deployments in terms of operations per second, stability and control were the configurations with a RAID10 SSD Data Mount and a RAID1 SSD Journal Mount. These configurations are available in both our Medium and Large offerings, and I'd highly recommend them.

-Harold

December 6, 2012

MongoDB: Architectural Best Practices

With the launch of our MongoDB solutions, developers can provision powerful, optimized, horizontally scaling NoSQL database clusters in real-time on bare metal infrastructure in SoftLayer data centers around the world. We worked tirelessly with our friends at 10gen — the creators of MongoDB — to build and tweak hardware and software configurations that enable peak MongoDB performance, and the resulting platform is pretty amazing. As Duke mentioned in his blog post, those efforts followed 10gen's MongoDB best practices, but what he didn't mention was that we created some architectural best practices of our own for MongoDB in deployments on our platform.

The MongoDB engineered servers that you order from SoftLayer already implement several of the recommendations you'll see below, and I'll note which have been incorporated as we go through them. Given the scope of the topic, it's probably easiest to break down this guide into a few sections to make it a little more digestible. Let's take a look at the architectural best practices of running MongoDB through the phases of the roll-out process: Selecting a deployment strategy to prepare for your MongoDB installation, the installation itself, and the operational considerations of running it in production.

Deployment Strategy

When planning your MongoDB deployment, you should follow Sun Tzu's (modified) advice: "If you know the [friend] and know yourself, you need not fear the result of a hundred battles." "Friend" was substituted for the "enemy" in this advice because the other party is MongoDB. If you aren't familiar with MongoDB, the top of your to-do list should be to read MongoDB's official documentation. That information will give you the background you'll need as you build and use your database. When you feel comfortable with what MongoDB is all about, it's time to "know yourself."

Your most important consideration will be the current and anticipated sizes of your data set. Understanding the volume of data you'll need to accommodate will be the primary driver for your choice of individual physical nodes as well as your sharding plans. Once you've established an expected size of your data set, you need to consider the importance of your data and how tolerant you are of the possibility of lost or lagging data (especially in replicated scenarios). With this information in hand, you can plan and start testing your deployment strategy.

It sounds a little strange to hear that you should test a deployment strategy, but when it comes to big data, you want to make sure your databases start with a strong foundation. You should perform load testing scenarios on a potential deployment strategy to confirm that a given architecture will meet your needs, and there are a few specific areas that you should consider:

Memory Sizing
MongoDB (like many data-oriented applications) works best when the data set can reside in memory. Nothing performs better than a MongoDB instance that does not require disk I/O. Whenever possible, select a platform that has more available RAM than your working data set size. If your data set exceeds the available RAM for a single node, then consider using sharding to increase the amount of available RAM in a cluster to accommodate the larger data set. This will maximize the overall performance of your deployment. If you notice page faults when you put your database under production load, they may indicate that you are exceeding the available RAM in your deployment.
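A quick way to check for page faults under load is mongostat; if the 'faults' column climbs steadily, your working set probably no longer fits in RAM:

# Print server stats every 5 seconds; watch the "faults" column
mongostat 5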

Disk Type
If speed is not your primary concern, or if you have a data set that is far larger than any available in-memory strategy can support, selecting the proper disk type for your deployment is important. IOPS will be key in selecting your disk type; the higher the IOPS, the better your MongoDB deployment will perform. Local disks should be used whenever possible (network storage can cause high latency and poor performance for your deployment). It's also advised that you use RAID 10 when creating disk arrays.

To give you an idea of what kind of IOPS to expect from a given type of drive, these are the approximate ranges of IOPS per drive in SoftLayer MongoDB engineered servers:

SATA II – 100-200 IOPS
15K SAS – 300-400 IOPS
SSD – 7,000-8,000 IOPS (read) 19,000-20,000 IOPS (write)

CPU
Clock speed and the number of available processors become a consideration if you anticipate using MapReduce. It has also been noted that when running a MongoDB instance with the majority of the data in memory, clock speed can have a major impact on overall performance. If you are planning to use MapReduce or you're able to operate with a majority of your data in memory, consider a deployment strategy that includes a CPU with a high clock/bus speed to maximize your operations per second.

Replication
Replication provides high availability of your data if a node fails in your cluster. It should be standard to replicate with at least three nodes in any MongoDB deployment. The most common configuration for replication with three nodes is a 2x1 deployment — two nodes in a single data center with a backup member in a secondary data center:

[Diagram: MongoDB replication in a 2x1 deployment]
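To make that concrete, a 2x1 replica set can be initiated from the mongo shell along these lines (the hostnames are placeholders, each mongod is assumed to be started with --replSet rs0, and priority 0 keeps the remote member from being elected primary):

mongo --host dal01.example.com --eval '
  rs.initiate({
    _id: "rs0",
    members: [
      { _id: 0, host: "dal01.example.com:27017" },              // primary data center
      { _id: 1, host: "dal02.example.com:27017" },              // primary data center
      { _id: 2, host: "sea01.example.com:27017", priority: 0 }  // backup data center
    ]
  })'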

Sharding
If you anticipate a large, active data set, you should deploy a sharded MongoDB deployment. Sharding allows you to partition a single data set across multiple nodes. You can allow MongoDB to automatically distribute the data across nodes in the cluster or you may elect to define a shard key and create range-based sharding for that key.

Sharding may also help write performance, so you can elect to shard even if your data set is small but requires a high amount of updates or inserts. It's important to note that when you deploy a sharded set, MongoDB requires three (and only three) config server instances, which are specialized Mongo runtimes that track the current shard configuration. Loss of one of these nodes will cause the cluster to go into a read-only mode (for the configuration only) and will require that all nodes be brought back online before any configuration changes can be made.
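As a rough sketch of the moving parts (hosts and paths are placeholders), a sharded deployment consists of those three config servers plus one or more mongos query routers:

# On each of exactly three config hosts:
mongod --configsvr --dbpath /data/configdb --port 27019

# On each query router:
mongos --configdb cfg1.example.com:27019,cfg2.example.com:27019,cfg3.example.com:27019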

Write Safety Mode
There are several write safety modes that govern how MongoDB will handle the persistence of the data to disk. It is important to consider which mode best fits your needs for both data integrity and performance. The following write safety modes are available:

None – This mode provides a deferred, non-blocking writing strategy. It allows for high performance, but there is a small window in which data can be lost if a node fails. There is also the possibility that data written to one node in a cluster will not be immediately available on all nodes in that cluster for read consistency. The 'None' strategy also provides no protection in the case of network failures, which makes this mode highly unreliable; it should only be used when performance is a priority and data integrity is not a concern.

Normal – This is the default mode for MongoDB if you do not select any other. It provides a deferred, non-blocking writing strategy, allowing for high performance, but with the same small window of possible data loss if a node fails and the same possibility that data written to one node will not be immediately available on all nodes for read consistency.

Safe – This mode will block until MongoDB has acknowledged that it has received the write request but will not block until the write is actually performed. This provides a better level of data integrity and will ensure that read consistency is achieved within a cluster.

Journal Safe – Journals provide a recovery option for MongoDB. Using this mode will ensure that the data has been acknowledged and a Journal update has been performed before returning.

Fsync - This mode provides the highest level of data integrity and blocks until a physical write of the data has occurred. This comes with a degradation in performance and should be used only if data integrity is the primary concern for your application.
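In the shell, these modes map to arguments of the getLastError command issued after a write. A rough illustration (the collection and values are examples, not a prescription):

mongo --eval '
  db.test.insert({ x: 1 });
  // "Safe": block until the server acknowledges the write
  printjson(db.runCommand({ getLastError: 1, w: 1 }));
  // "Journal Safe": also wait for the journal commit
  printjson(db.runCommand({ getLastError: 1, j: true }));
  // "Fsync": wait for a physical flush to disk
  printjson(db.runCommand({ getLastError: 1, fsync: true }));'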

Testing the Deployment
Once you've determined your deployment strategy, test it with a data set similar to your production data. 10gen has several tools to help you with load testing your deployment, and the mongo shell has a built-in tool named 'benchRun' which can execute operations from within a JavaScript test harness. These tools will return operation information as well as latency numbers for each of those operations. If you require more detailed information about the MongoDB instance, consider using the mongostat command or MongoDB Monitoring Service (MMS) to monitor your deployment during the testing.

Installation

When performing the installation of MongoDB, a few considerations can help create both a stable and performance-oriented solution. 10gen recommends the use of CentOS (64-bit) as the base operating system if at all possible. If you try installing MongoDB on a 32-bit operating system, you might run into file size limits that cause issues, and if you feel the urge to install it on Windows, you'll see performance issues if virtual memory begins to be utilized by the OS to make up for a lack of RAM in your deployment. As a result, 32-bit operating systems and Windows operating systems should be avoided on MongoDB servers. SoftLayer provisions CentOS 6.X 64-bit operating systems by default on all of our MongoDB engineered server deployments.

When you've got CentOS 64-bit installed, you should also make the following changes to maximize your performance (all of which are included by default on all SoftLayer engineered servers):

Set SSD Read Ahead Defaults to 16 Blocks - SSD drives have excellent seek times, which allows the read-ahead to be shrunk to 16 blocks. Spinning disks might require slight buffering, so those have been set to 32 blocks.
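On Linux, the read-ahead is set with blockdev (the device names are placeholders, and the setting needs to be reapplied at boot, e.g. from rc.local):

# Read-ahead is specified in 512-byte sectors ("blocks")
blockdev --setra 16 /dev/sda   # SSD data volume
blockdev --setra 32 /dev/sdb   # spinning-disk volume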

noatime - Adding the noatime option eliminates the need for the system to make writes to the file system for files which are simply being read — or in other words: Faster file access and less disk wear.
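That's a one-word addition to the data volume's /etc/fstab entry; the device and mount point below are examples:

/dev/sdb1   /var/lib/mongo   ext4   defaults,noatime   0 0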

Turn NUMA Off in BIOS - Linux, NUMA and MongoDB tend not to work well together. If you are running MongoDB on NUMA hardware, we recommend turning it off (running with an interleave memory policy). If you don't, problems will manifest in strange ways like massive slow downs for periods of time or high system CPU time.
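If you can't change the BIOS setting, the interleave memory policy mentioned above can be applied when mongod starts (the config path is a placeholder):

# Start mongod with memory interleaved across NUMA nodes
numactl --interleave=all mongod -f /etc/mongod.conf

# And disable zone reclaim
echo 0 > /proc/sys/vm/zone_reclaim_mode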

Set ulimit - We have set the ulimit to 64000 for open files and 32000 for user processes to prevent failures due to a loss of available file handles or user processes.
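On CentOS, those limits are typically persisted in /etc/security/limits.conf for the user that runs mongod (the user name here is an assumption):

mongod   soft   nofile   64000
mongod   hard   nofile   64000
mongod   soft   nproc    32000
mongod   hard   nproc    32000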

Use ext4 - We have selected ext4 over ext3. We found ext3 to be very slow in allocating files (or removing them). Additionally, access within large files is poor with ext3.

One last tip on installation: Make the Journal and Data volumes be distinct physical volumes. If the Journal and Data directories reside on a single physical volume, flushes to the Journal will interrupt the access of data and provide spikes of high latency within your MongoDB deployment.
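One straightforward way to do that is to mount a second device at the journal subdirectory of the data path (devices and paths below are examples):

# Data files on one physical volume...
mount /dev/sdb1 /var/lib/mongo
# ...and the journal on another
mount /dev/sdc1 /var/lib/mongo/journal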

Operations

Once a MongoDB deployment has been promoted to production, there are a few recommendations for monitoring and optimizing performance. You should always have the MMS agent running on all MongoDB instances to help monitor the health and performance of your deployment. This tool is also very useful if you have 10gen MongoDB Cloud Subscriptions because it provides useful debugging data for the 10gen team during support interactions. In addition to MMS, you can use the mongostat command (mentioned in the deployment section) to see runtime information about the performance of a MongoDB node. If either of these tools flags performance issues, sharding or indexing are first-line options to resolve them:

Indexes - Indexes should be created for a MongoDB deployment if monitoring tools indicate that field based queries are performing poorly. Always use indexes when you are querying data based on distinct fields to help boost performance.
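Creating such an index is a one-liner with ensureIndex(); the database, collection and field here are examples:

mongo mydb --eval 'db.users.ensureIndex({ email: 1 })'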

Sharding - Sharding can be leveraged when the overall performance of the node is suffering because of a large operating data set. Be sure to shard before you get in the red: the system only splits chunks for sharding on insert or update, so if you wait too long to shard, you may end up with an uneven distribution for a period of time (or forever, depending on your data set and shard key strategy).
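Enabling range-based sharding on a collection looks like this when run against a mongos (the names and shard key are examples):

mongo --host router.example.com --eval '
  sh.enableSharding("mydb");
  sh.shardCollection("mydb.users", { userId: 1 });'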

I know it seems like we've covered a lot over the course of this blog post, but this list of best practices is far from exhaustive. If you want to learn more, the MongoDB forums are a great resource to connect with the rest of the MongoDB community and learn from their experiences, and the documentation on MongoDB's site is another phenomenal resource. The best people to talk to when it comes to questions about MongoDB are the folks at 10gen, so I also highly recommend taking advantage of MongoDB Cloud Subscriptions to get their direct support for your one-off questions and issues.

-Harold

November 2, 2012

The Trouble with Open DNS Resolvers

In the last couple of days, there's been a bit of buzz about "open DNS resolvers" and DNS amplification DDoS attacks, and SoftLayer's name has been brought up a few times. In a blog post on October 30, CloudFlare explained DNS Amplification DDoS attacks and reported the geographic and network sources of open DNS resolvers that were contributing to a 20Gbps attack on their network. SoftLayer's AS numbers (SOFTLAYER and the legacy THEPLANET-AS number) show up on the top ten "worst offenders" list, and Dan Goodin contacted us to get a comment for a follow-up piece on Ars Technica — Meet the network operators helping to fuel the spike in big DDoS attacks.

While the content of that article is less sensationalized than the title, there are still a few gaps to fill when it comes to how SoftLayer is actually involved in the big picture (*SPOILER ALERT* We aren't "helping to fuel the spike in big DDoS attacks"). The CloudFlare blog and the Ars Technica post presuppose that the presence of open recursive DNS resolvers is a sign of negligence on the part of the network provider at best and maliciousness at worst, and that's not the case.

The majority of SoftLayer's infrastructure is made up of self-managed dedicated and cloud servers. Customers who rent those servers on a monthly basis have unrestricted access to operate their servers in any way they'd like as long as that activity meets our acceptable use policy. Some of our largest customers are hosting resellers who provide that control to their customers who can then provide that control to their own customers. And if 23 million hostnames reside on the SoftLayer network, you can bet that we've got a lot of users hosting their DNS on SoftLayer infrastructure. Unfortunately, it's easier for those customers and customers-of-customers and customers-of-customers-of-customers to use "defaults" instead of looking for, learning and implementing "best practices."

It's all too common to find those DNS resolvers open and ultimately vulnerable to DNS amplification attacks, and whenever our team is alerted to that vulnerability on our network, we make our customers aware of it. In turn, they may pass the word down the customer-of-customer chain to get to the DNS owner. It's usually not a philosophical question about whether DNS resolvers should be open for the greater good of the Internet ... It's a question of whether the DNS owner has any idea that their "configuration" is vulnerable to be abused in this way.

SoftLayer's network operations, abuse and support teams have tools that flag irregular and potentially abusive traffic coming from any server on our network, and we take immediate action when we find a problem or are alerted to one by someone who sends details to abuse@softlayer.com. The challenge we run into is that flagging obvious abusive behavior from an active DNS server is a bit of a cat-and-mouse game ... Attackers cloak their activity in normal traffic. Instead of sending a huge amount of traffic from a single domain, they send a marginal amount of traffic from a large number of machines, and the "abusive" traffic is nearly impossible for even the DNS owner to differentiate from "regular" traffic.

CloudFlare effectively became a honeypot, and they caught a distributed DNS amplification DoS attack. The results they gathered are extremely valuable to teams like mine at SoftLayer, so if they go the next step to actively contact the abuse channel for each of the network providers in their list, I hope that each of the other providers will jump on that information as I know my team will.

If you have a DNS server on the SoftLayer network, and you're not sure whether it's configured to prevent it from being used for these types of attacks, our support team is happy to help you out. For those of you interested in doing a little DNS homework to learn more, Google's Developer Network has an awesome overview of DNS security threats and mitigations which gives an overview of potential attacks and preventative measures you can take. If you're just looking for an easy way to close an open recursor, scroll to the bottom of CloudFlare's post, and follow their quick guide.
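As a quick sanity check, you can ask your server to resolve a name it isn't authoritative for from a host outside your network (the IP below is a placeholder):

dig @203.0.113.10 www.example.com A

# A properly restricted server typically returns REFUSED (or omits the
# "ra" recursion-available flag); an open resolver happily returns an answer.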

If, on the other hand, you have your own DNS server and you don't want to worry about all of this configuration or administration, SoftLayer operates private DNS resolvers that are limited to our announced IP space. Feel free to use ours instead!

-Ryan

June 20, 2012

How Do You Build a Private Cloud?

If you read Nathan's "A Cloud to Call Your Own" blog, and you wanted to learn a little more about private clouds in general or SoftLayer Private Clouds specifically, this post is for you. We're going to take a little time to dive deeper into the technology behind SoftLayer Private Clouds, and in the process, I'll talk a little about why particular platforms/hardware/configurations were chosen.

The Platform: Citrix CloudPlatform

There are several cloud infrastructure frameworks to choose from these days. We have surveyed a number of them and actively work with several of them. We are actively involved in the OpenStack community, and we have working implementations of vSphere, Nimbula, Eucalyptus and other stacks in our data centers. So why CloudPlatform by Citrix?

First off, it's one of the most mature of these options. It's been around for several years and now has the substantial backing of Citrix. That backing includes investment, support organizations and the multitude of other products managed by Citrix. There are also some futuristic ideas we have regarding how to leverage products like CloudBridge and Netscaler with Private Clouds. Second, CloudPlatform operates in accordance with how we believe a private cloud should work: It's simple, it doesn't have a huge management infrastructure and we can charge for it by the CPU per month, just like all of our other products. Finally, CloudPlatform has made good inroads with enterprise customers. We love the idea that an enterprise ops team could leverage CloudPlatform as the management platform for both their on-premise and their off-premise private cloud.

So, we selected CloudPlatform for a multitude of reasons, not just one.

Another huge key was our ability to integrate CloudPlatform into the SoftLayer portals/mobile apps/API. Because many SoftLayer customers manage their environments exclusively through the SoftLayer API, we knew that a seamless integration there was an absolute necessity. With the help of the SoftLayer dev team and the CloudStack folks, we've been able to automate private clouds the same way we did for public cloud instances and dedicated servers.

The Hardware

When it came to choosing what hardware the private clouds would use, the decision was pretty simple. Given our need for automation, SoftLayer Private Clouds would need to be indistinguishable from a standard dedicated server or CloudLayer environment. We use the latest and greatest server hardware available on the market, and every month, you can see thousands of new SuperMicro boxes being delivered to our data centers around the world. Because we know we have a reliable, powerful and consistent hardware foundation on which we can build the private clouds product, it makes the integration of the system even easier.

When it comes to the specs of the hardware provided for a private cloud environment, we provide as much transparency and flexibility as we can for a customer to build exactly what he or she needs. Let's look into what that means...

The Hardware Configurations

A CloudPlatform environment can be broken down into these components:

  • A single management server (that can manage multiple zones across layer 2 networks)
  • One or more zones
  • One or more clusters in a zone
  • One or more hosts in a cluster
  • Storage shared by a cluster (which can be a single server)

A simple diagram of a two-zone private cloud might look like this:

SoftLayer Private Clouds

We've set a standard "management server" configuration that we know will be able to accommodate all of your needs when it comes to running CloudPlatform, and how you build and configure the rest of your private cloud infrastructure is up to you. Whether you want a simple dual-proc, quad-core Nehalem box with a lot of local disk space for a dev cloud or an environment made up of quad-proc, 10-core Westmeres with SSDs, you have the freedom to choose exactly what you want.

Oh, and everything can be online in two to four hours, and it's offered on a month-to-month contract.

The Network Configuration

When it comes to where the hardware is provisioned, you have the ability to deploy zones in multiple geographies and manage them all through a single CloudPlatform management node. Given the way the SoftLayer three-tier network is built, the management node and host nodes do not even need to be accessible by our public network. You can choose to make accessible only the IPs used by the VMs you create. If your initial private cloud infrastructure is in Dallas and you want a node online in Singapore, you can just click a few buttons, and the new node will be provisioned and configured securely by CloudPlatform in a couple of hours.

Imagine how long it would have taken you to build this kind of infrastructure in the past:

SoftLayer Private Clouds

It doesn't take days or weeks now. It takes hours.

As you can see, when we approached the challenge of bringing private clouds to the SoftLayer platform, we had to innovate. In Texas, that would be roughly translated as "Go big or go home." Given the response we've seen from customers and partners since the announcement of SoftLayer Private Clouds, we know the industry has taken notice.

Will all of our customers need their own private cloud infrastructure? Probably not. But will the customers who've been looking for this kind of functionality be ecstatic with the CloudPlatform environment on SoftLayer's network? Absolutely.

-Duke

March 27, 2012

Tips and Tricks - How to Secure WordPress

As a hobby, I dabble in WordPress, so I thought I'd share a few security features I use to secure my WordPress blogs as soon as they're installed. Nothing in this blog will be earth-shattering, but because security is such a priority, I have no doubt that it will be useful to many of our customers. Often, the answer to the question, "How much security do I need on my site?" is simply, "More," so even if you have a solid foundation of security, you might learn a new trick or two that you can incorporate into your next (or current) WordPress site.

Move wp-config.php

The first thing I do is change the location of my wp-config.php. By default, it's installed in the WordPress parent directory. If the config file is in the parent directory, it can be viewed and accessed by Apache, so I move it out of the web root. Because you're changing the default location of a pretty significant file, you need to tell WordPress how to find it in wp-load.php. Let's say my WordPress runs out of /webroot on my host ... I'd need to make a change around Line 26:

if ( file_exists( ABSPATH . 'wp-config.php') ) {
 
        /** The config file resides in ABSPATH */
        require_once( ABSPATH . 'wp-config.php' );
 
} elseif ( file_exists( dirname(ABSPATH) . '/wp-config.php' ) && ! file_exists( dirname(ABSPATH) . '/wp-settings.php' ) ) {
 
        /** The config file resides one level above ABSPATH but is not part of another install*/
        require_once( dirname(ABSPATH) . '/wp-config.php' );

The code above is the default setup, and the code below is the version with my subtle update incorporated.

if ( file_exists( ABSPATH . '../wp-config.php') ) {

        /** The config file resides one level above ABSPATH (outside the web root) */
        require_once( ABSPATH . '../wp-config.php' );

} elseif ( file_exists( dirname(ABSPATH) . '/../wp-config.php' ) && ! file_exists( dirname(ABSPATH) . '/../wp-settings.php' ) ) {

        /** The config file resides two levels above ABSPATH but is not part of another install */
        require_once( dirname(ABSPATH) . '/../wp-config.php' );

All we're doing is telling the application that the wp-config.php file is one directory higher. By making this simple change, you ensure that only the application can see your wp-config.php script.
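While you're at it, it's worth tightening the relocated file's permissions so only the web server's user can read it (the user/group and relative path are examples, run from the web root):

chown apache:apache ../wp-config.php
chmod 600 ../wp-config.php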

Turn Down Access to /wp-admin

After I make that change, I want to turn down access to /wp-admin. I allow users to contribute on some of my blogs, but I don't want them to do so from /wp-admin; only users with admin rights should be able to access that panel. To limit access to /wp-admin, I recommend the plugin uCan Post. This plugin creates a page that allows users to write posts and submit them within your theme.

But won't a user just be able to navigate to http://site.com/wp-admin? Yes ... Until we add a simple function to our theme's functions.php file to limit that access. At the bottom of your functions.php file, add this:

############ Disable admin access for users ############

add_action('admin_init', 'no_more_dashboard');
function no_more_dashboard() {
    // Send non-admins back to the home page, but leave AJAX requests alone
    if ( !current_user_can('manage_options') && $_SERVER['PHP_SELF'] != '/wp-admin/admin-ajax.php' ) {
        wp_redirect( site_url() );
        exit;
    }
}

###########################################################

Log in as a non-admin user, and you'll get redirected to the blog's home page if you try to access the admin panel. Voila!

Start Securing the WordPress Database

Before you go any further, you need to look at WordPress database security. This is the most important piece in my opinion, and it's not just because I'm a DBA. WordPress never needs every database permission. The only permissions WordPress needs to function are ALTER, CREATE, CREATE TEMPORARY TABLES, DELETE, DROP, INDEX, INSERT, LOCK TABLES, SELECT and UPDATE.

If you run WordPress and MySQL on the same server, the permissions grant would look something like:

GRANT ALTER, CREATE, CREATE TEMPORARY TABLES, DELETE, DROP, INDEX, INSERT, LOCK TABLES, SELECT, UPDATE ON <DATABASE>.* TO <USER>@'localhost' IDENTIFIED BY '<PASSWORD>';

If you have a separate database server, make sure the host of the webserver is allowed to connect to the database server:

GRANT ALTER, CREATE, CREATE TEMPORARY TABLES, DELETE, DROP, INDEX, INSERT, LOCK TABLES, SELECT, UPDATE ON <DATABASE>.* TO <USER>@'<ip of web server>' IDENTIFIED BY '<PASSWORD>';

The password you use should be random, and you should not need to change this. DO NOT USE THE SAME PASSWORD AS YOUR ADMIN ACCOUNT.
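If you need a quick way to generate a random password for the grant, openssl does the trick:

openssl rand -base64 24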

By taking those quick steps, we're able to go a long way to securing a default WordPress installation. There are other plugins out there that are great tools to enhance your blog's security, and once you've got the fundamental security updates in place, you might want to check some of them out. Login LockDown is designed to stop brute force login attempts, and Secure WordPress has some great additional features.

What else do you do to secure your WordPress sites?

-Lee

March 5, 2012

iptables Tips and Tricks - Not Locking Yourself Out

The iptables tool is one of the simplest, most powerful tools you can use to protect your server. We've covered port redirection, rule processing and troubleshooting in previous installments to this "Tips and Tricks" series, but what happens when iptables turns against you and locks you out of your own system?

Getting locked out of a production server can cost both time and money, so it's worth your time to avoid this. If you follow the correct procedures, you can safeguard yourself from being firewalled off of your server. Here are seven tips to help you keep your sanity and prevent you from locking yourself out.

Tip 1: Keep a safe ruleset handy.

If you are starting with a working ruleset, or even if you are trying to troubleshoot an existing ruleset, take a backup of your iptables configuration before you ever start working on it.

iptables-save > /root/iptables-safe

Then if you do something that prevents your website from working, you can quickly restore it.

iptables-restore < /root/iptables-safe

Tip 2: Create a cron script that will reload to your safe ruleset every minute during testing.

This was pointed out to me by a friend who swears by this method. Just write a quick bash script and set a cron entry that will reload the safe ruleset every minute. You'll have to test quickly, but it will keep you from getting locked out.
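A minimal version of that safety net might look like this (paths are examples):

#!/bin/bash
# /root/iptables-reload.sh -- restore the known-good ruleset
/sbin/iptables-restore < /root/iptables-safe

Paired with a temporary crontab entry (remove it when you're done testing!):

* * * * * /root/iptables-reload.sh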

Tip 3: Have the IPMI KVM ready.

SoftLayer-pod servers* are equipped with some sort of remote access device. Most of them have a KVM console. You will want to have your VPN connection set up, connected and the KVM window up. You can't paste to and from the KVM, so SSH is typically easier to work with, but it will definitely cut down on the downtime if something does go wrong.

*This may not apply to servers that were originally provisioned under another company name.

Tip 4: Try to avoid generic rules.

The more criteria you specify in the rule, the less chance you will have of locking yourself out. I would liken this to a pie. A specific rule is a very thin slice of the pie.

iptables -A INPUT -p tcp --dport 22 -s 10.0.0.0/8 -d 123.123.123.123 -j DROP

But if you block port 22 from any to any, it's a very large slice.

iptables -A INPUT -p tcp --dport 22 -j DROP

There are plenty of ways that you can be more specific. For example, using "-i eth0" will limit the processing to a single NIC in your server. This way, it will not apply the rule to eth1.
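For example, adding an interface match to the earlier rule narrows it even further (same placeholder addresses as above):

# Only matches traffic arriving on eth0; eth1 is unaffected
iptables -A INPUT -i eth0 -p tcp --dport 22 -s 10.0.0.0/8 -d 123.123.123.123 -j DROP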

Tip 5: Whitelist your IP address at the top of your ruleset.

This may make testing more difficult unless you have a secondary offsite test server, but this is a very effective method of not getting locked out.

iptables -I INPUT -s <your IP> -j ACCEPT

You need to put this as the FIRST rule in order for it to work properly ("-I" inserts it as the first rule, whereas "-A" appends it to the end of the list).

Tip 6: Know and understand all of the rules in your current configuration.

Not making the mistake in the first place is half the battle. If you understand the inner workings behind your iptables ruleset, it will make your life easier. Draw a flow chart if you must.

Tip 7: Understand the way that iptables processes rules.

Remember, the rules start at the top of the chain and go down, unless specified otherwise. Crack open the iptables man page and learn about the options you are using.

-Mark

February 13, 2012

Logic Challenge: SoftLayer Server Rack Riddle

After I spent a little time weaving together a story in response to SKinman's "Choose Your Own Adventure" puzzle (which you can read in the comments section), I was reminded of another famous logic puzzle that I came across a few years ago. Because it was begging to be SoftLayer-ized, I freshened it up to challenge our community.

In 1962, Life International magazine published a logic puzzle that was said to be so difficult that it could only be solved by two percent of the world's population. It's been attributed to Einstein, and apparently Lewis Carroll is given a claim to it as well, but regardless of the original author, it's a great brain workout.

If you haven't tried a puzzle like this before, don't get discouraged and go Googling for the answer. You're given every detail you need to answer the question at the end ... Take your time and think about how the components are interrelated. If you've solved this puzzle before, this iteration might only be light mental calisthenics, but with its new SoftLayer twist, it should still be fun:

Einstein's SoftLayer Riddle

The Scenario: You're in a SoftLayer data center. You walk up to a server rack and you see five servers in the top five slots on the rack. Each of the five servers has a distinct hard drive configuration, processor type, operating system, control panel (or absence thereof) and add-on storage. No two servers in this rack are the same in any of those aspects.

  • The CentOS 6 operating system is being run on the Xeon 3230 server.
  • The Dual Xeon 5410 server is racked next to (immediately above or below) the server running the Red Hat 6 operating system.
  • The Dual Xeon 5610 server uses 50GB of CloudLayer Storage as its add-on storage.
  • The Quad Xeon 7550 server has no control panel.
  • The CentOS 5 operating system is racked immediately below the server running the Red Hat 5 operating system.
  • The server using 80GB NAS add-on storage is racked next to (immediately above or below) the server with two 100GB SSD hard drives.
  • The server running the Red Hat 5 operating system uses Parallels Virtuozzo (3VPS) as a control panel.
  • The server running the Windows 2008 operating system has two 100GB SSD hard drives.
  • The server using Plesk 9 as a control panel is in the middle space in the five-server set in the rack.
  • The top server in the rack is the Dual Xeon 5410 server.
  • The Xeon 3450 server has two 147GB 10K RPM SA-SCSI hard drives.
  • The server using 20GB EVault as its add-on storage has one 250GB SATA II hard drive.
  • The server with four 600GB 15K RPM SA-SCSI hard drives is next to (immediately above or below) the server using 100GB iSCSI SAN add-on storage.
  • The server using cPanel as a control panel has two 2TB SATA II hard drives.
  • The server with four 600GB 15K RPM SA-SCSI hard drives is racked next to (immediately above or below) the server using Plesk 10 (Unlimited) as a control panel.
  • One server will use a brand new, soon-to-be-announced product offering as its add-on storage.

Question: What is the monthly cost of the server that will be using our super-secret new product offering for its add-on storage?

Use the SoftLayer Shopping Cart to come up with your answer. You can assume that the server has a base configuration (unless specifically noted in the clues above), that SoftLayer's promotions are not used, and that the least expensive version of the control panel is being used for any control panel with several price points. You won't be able to include the cost of the add-on storage (yet), so just provide the base configuration cost of that server in one of our US-based data centers with all of the specs you are given.

Bonus Question: If you ordered all five of those servers, how long would it take for them to be provisioned for you?

Submit your answers via comment, and we'll publish the comments in about a week so other people have a chance to answer it without the risk of scrolling down and seeing spoilers.

-@khazard

January 9, 2012

iptables Tips and Tricks – Troubleshooting Rulesets

One of the most time-consuming tasks with iptables is troubleshooting a problematic ruleset. That will not change no matter how much experience you have with it. However, with the right mindset, this task becomes considerably easier.

If you missed my last installment about iptables rule processing, here's a crash course:

  1. The rules start at the top, and proceed down, one by one unless otherwise directed.
  2. A rule must match exactly.
  3. Once iptables has accepted, rejected, or dropped a packet, it will not process any further rules on it.

There are essentially two things that you will be troubleshooting with iptables ... Either it's not accepting traffic and it should be OR it's accepting traffic and it shouldn't be. If the server is intermittently blocking or accepting traffic, that may take some additional troubleshooting, and it may not even be related to iptables.
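
Before you dig into the ruleset at all, it's worth confirming what the traffic is actually doing. Here's a minimal sanity check, assuming netcat is available on the client and using 10.0.0.1 as a placeholder for your server's IP:

# From the affected client: does TCP port 22 answer at all?
nc -vz -w 5 10.0.0.1 22

# On the server: is anything actually listening on that port?
netstat -tlpn | grep :22

If nothing is listening in the first place, no amount of iptables work will fix the problem.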

Keep in mind what you are looking for, and don't jump to any conclusions. Troubleshooting iptables takes patience and time, and there shouldn't be any guesswork involved. If you have a configuration of 800 rules, you should expect to need to look through every single rule until you find the rule that is causing your problems.

Before you begin troubleshooting, you first need to know some information about the traffic:

  1. What is the source IP address or range that is having difficulty connecting?
  2. What is the destination IP address or website IP?
  3. What is the port or port range affected, or what type of traffic is it (TCP, ICMP, etc.)?
  4. Is it supposed to be accepted or blocked?

Those bits of information should be all you need to begin troubleshooting a buggy ruleset, except in some rare cases that are outside the scope of this article.
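
Armed with that information, dump the ruleset in a form that is easy to step through. Nothing here is exotic; --line-numbers just makes it easier to refer to rules by position as you work down the chain:

# List the filter table with rule positions, packet/byte counters,
# and numeric output (no DNS lookups)
iptables -vnL --line-numbers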

Here are some things to keep in mind (especially if you did not program every rule by hand):

  • iptables has three built-in chains. These are for INPUT – the traffic coming into the server, OUTPUT – the traffic coming out of the server, and FORWARD – traffic that is not destined to or coming from the server (usually only used when iptables is acting as a firewall for other servers). You will start your troubleshooting at the top of one of these three chains, depending on the type of traffic.
  • The "target" is the action that is taken when the rule matches. This may be another custom chain, so if you see a rule with another chain as the target that matches exactly, be sure to step through every rule in that chain as well. In the following example, you will see the BLACKLIST2 sub-chain that applies to traffic on port 80. If traffic comes through on port 80, it will be diverted to this other chain.
  • The RETURN target indicates that you should return to the parent chain. If you see a matching rule with a RETURN target, stop troubleshooting the current chain and resume at the rule directly after the one that referenced the custom chain.
  • If there are no matching rules, the chain policy is applied.
  • There may be rules in the "nat," "mangle" or "raw" tables that are blocking or diverting your traffic. Typically, all of the rules will be in the "filter" table, but you might run into situations where this is not the case. Try running this to check: iptables -t mangle -nL ; iptables -t nat -nL ; iptables -t raw -nL (a loop that surveys every table in one pass appears after this list).
  • Be cognizant of the policy. If the policy is ACCEPT, all traffic that does not match a rule will be accepted. Conversely, if the policy is DROP or REJECT, all traffic that does not match a rule will be blocked.
  • My goal with this article is to introduce you to the algorithm by which you can troubleshoot a more complex ruleset. It is intentionally left simple, but you should still follow through even when the answer may be obvious.
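
Pulling a few of those points together, here is a small sketch that surveys every table and shows each chain's policy (the "policy" line at the top of each chain tells you what happens to traffic that matches no rule):

# Walk all four tables in one pass
for table in filter nat mangle raw; do
    echo "=== table: $table ==="
    iptables -t "$table" -vnL --line-numbers
done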

Here is the example ruleset I will be using:

Chain INPUT (policy DROP)
target prot opt source destination
BLACKLIST2 tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:80
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:50
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:53
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:22
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:1010

Chain BLACKLIST2 (1 references)
target prot opt source destination
REJECT * -- 123.123.123.123 0.0.0.0/0
REJECT * -- 45.34.234.234 0.0.0.0/0
ACCEPT * -- 0.0.0.0/0 0.0.0.0/0

Here is the problem: Your server is accepting SSH traffic to anyone, and you wish to only allow SSH to your IP – 111.111.111.111. We know that this is inbound traffic, so this will affect the INPUT chain.

We are looking for:

source IP: any
destination IP: any
protocol: tcp
port: 22

Step 1: The first rule denotes any source IP and any destination IP on destination port 80. Since this is regarding port 22, this rule does not match, so we'll continue to the next rule. If the traffic here were on port 80, it would invoke the BLACKLIST2 sub-chain.
Step 2: The second rule denotes any source IP and any destination IP on destination port 50. Since this is regarding port 22, this rule does not match, so let's continue on.
Step 3: The third rule denotes any source IP and any destination IP on destination port 53. Since this is regarding port 22, this rule does not match, so let's continue on.
Step 4: The fourth rule denotes any source IP and any destination IP on destination port 22. Since this is regarding port 22, this rule matches exactly, and the ACCEPT target is applied to the traffic. We found the problem (you can confirm it empirically with the counter check below), and now we need to construct a solution. I will be showing you the Red Hat method of doing this.
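
Before constructing the fix, it's worth confirming that this really is the rule being hit. Every rule keeps a packet counter, so zero the counters and watch the dpt:22 rule while you attempt an SSH connection (watch is a common utility; repeating the listing by hand works just as well):

# Zero the INPUT chain counters, then watch them while you reconnect;
# the pkts column of the dpt:22 rule should climb with each attempt
iptables -Z INPUT
watch -n 1 "iptables -vnL INPUT --line-numbers"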

Do this to save the running ruleset as a file:

iptables-save > current-iptables-rules

Then edit the current-iptables-rules file in your favorite editor, and find the rule that looks like this:

-A INPUT -p tcp --dport 22 -j ACCEPT

Then you can modify this to only apply to your IP address (the source, or "-s", IP address).

-A INPUT -p tcp -s 111.111.111.111 --dport 22 -j ACCEPT

Once you have this line, you will need to load the iptables configuration from this file for testing.

iptables-restore < current-iptables-rules
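
As an aside, since we know the rule's position (it is rule 4 in the INPUT chain above), iptables can also replace it in place with -R, which skips the save/edit/restore cycle. This is a sketch of the same change, not a different result:

# Replace rule #4 in INPUT (the port 22 ACCEPT) with a source-restricted version
iptables -R INPUT 4 -p tcp -s 111.111.111.111 --dport 22 -j ACCEPT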

Don't edit the /etc/sysconfig/iptables file directly, as a mistake there might lock you out of your server. It is good practice to test a configuration before saving it to the system configuration files. That way, if you do get locked out, you can reboot your server and it will come back up with the old working ruleset. The ruleset should now look like this:

Chain INPUT (policy DROP)
target prot opt source destination
BLACKLIST2 tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:80
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:50
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:53
ACCEPT tcp -- 111.111.111.111 0.0.0.0/0 tcp dpt:22
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:1010

Chain BLACKLIST2 (1 references)
target prot opt source destination
REJECT * -- 123.123.123.123 0.0.0.0/0
REJECT * -- 45.34.234.234 0.0.0.0/0
ACCEPT * -- 0.0.0.0/0 0.0.0.0/0

The policy of "DROP" will now block any other connection on port 22. Remember, the rule must match exactly, so the rule on port 22 now *ONLY* applies if the IP address is 111.111.111.111.

Once you have confirmed that the rule is behaving properly (be sure to test from another IP address to confirm that you are not able to connect), you can write the system configuration:

service iptables save
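
Note that "service iptables save" is specific to the Red Hat family. On Debian-style systems there is no iptables service script; assuming the iptables-persistent package is installed (check your distribution), the rough equivalent is:

# Debian/Ubuntu with iptables-persistent: write the running rules
# to the file that is loaded at boot
iptables-save > /etc/iptables/rules.v4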

If this troubleshooting sounds boring and repetitive, you are right. However, this is the secret to solid iptables troubleshooting. As I said earlier, there is no guesswork involved. Just take it step by step, make sure the rule matches exactly, and follow it through till you find the rule that is causing the problem. This method may not be fast, but it's reliable. You'll look like an expert in no time.

-Mark

January 5, 2012

iptables Tips and Tricks - Rule Processing

As I mentioned in "iptables Tips and Tricks - Port Redirection," iptables is probably a complete mystery to a lot of users, and one of the biggest hurdles is understanding the method by which it filters traffic ... Once you understand this, you'll be able to tame the beast.

When I think of iptables, the best analogy that comes to mind is a gravity coin sorting bank with four rules and one policy. If you're not familiar with a gravity coin sorting bank, each coin starts at the same place and slides down an inclined plane until it can fall into its appropriate tube:

[Image: iptables Rule Sorter]

As you can see, once a coin starts down the path, there are four rules – each one "filtering traffic" based on the width of the coin in millimeters (Quarter = 25mm, Nickel = 22mm, Penny = 20mm, Dime = 18mm). Due to possible inconsistencies in the coins, the tube widths are slightly larger than the official sizes of each coin to prevent jamming. At the end of the line, if a coin didn't fit in any of the tubes, it's dropped out of the sorter.

As we use this visualization to apply to iptables, there are three important things to remember:

  1. The rules start at the top, and proceed down, one by one unless otherwise directed.
  2. A rule must match exactly.
  3. Once iptables has accepted, rejected, or dropped a packet, it will not process any further rules on it.

Let's jump back to the coin sorter. What would happen if you introduced a 23mm coin (slightly larger than a nickel)? What would happen if you introduced a 17mm coin (smaller than a dime)? What would happen if you dropped in a $1 coin @ 26.5mm?

In the first scenario, the coin would enter into the rule processing by being dropped in at the top. It would first pass by the dime slot, which requires a diameter of less than 18mm. It passes by the pennies slot as well, which requires less than 20mm. It continues past the nickels slot, which requires less than 22mm. It will then be "accepted" into the quarters slot, and there will be no further "processing" on the coin.

The iptables rules might look something like this:

Chain INPUT (policy DROP)
target prot opt source destination
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 width < 18mm
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 width < 20mm
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 width < 22mm
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 width < 26mm

It's important to remember that once iptables has accepted, rejected, or dropped a packet, it will not process any further rules on it. In the second scenario (17mm coin), the coin would only be processed through the first rule; the other three rules would not be used even though the coin would match them as well. Likewise, just because a port or an IP address is allowed somewhere in a chain doesn't mean that rule will ever be reached: if an earlier matching rule has dropped the packet, no further rules will be processed.

The final scenario (26.5mm coin) outlines a situation where none of the rules match, and this indicates that the policy will be used. In the coin bank example, it would be physically dropped off the side of the bank. iptables keeps a tally of the number of packets dropped and the corresponding size of the data. You can view this data by using the "iptables -vnL" command.

Chain OUTPUT (policy ACCEPT 3418K packets, 380M bytes)

cPanel even uses this tally functionality to track bandwidth usage (You may have seen the "acctboth" chain - this is used for tracking usage per IP).
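
That accounting trick is easy to reproduce yourself: a rule with no -j target matches and counts packets without accepting or dropping them, so the packet simply continues down the chain. A quick sketch, using 203.0.113.5 as a placeholder address:

# Count traffic from one address without affecting it:
# a rule with no -j target only increments its counters
iptables -A INPUT -s 203.0.113.5

# Later, read the tally in the pkts/bytes columns
iptables -vnL INPUT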

So there you have it: iptables is just like a gravity coin sorting bank!

-Mark
