Posts Tagged 'Alerts'

October 3, 2013

Improving Communications for Customer-Affecting Events

Service disruptions are never a good thing. Though SoftLayer invests extensively in design, equipment, and personnel training to reduce the risk of disruptions to our customers, in the technology world there are times when scheduled events or unplanned incidents are inevitable. During those times, we understand that restoring service is the top priority, and almost as important is communicating to customers regarding the cause of the incident and the current status of our work to resolve it.

To date we've used a combination of tickets, emails, forum posts, portal "yellow" notifications, as well as RSS and Twitter feeds to provide status updates during service-affecting events. Many of these methods require customers to "come and get it," so we've been working on a more targeted, proactive approach to disseminating information.

I'm excited to report that our Development and Operations teams have collaborated on new functionality in the SoftLayer portal that will improve the way we share information with customers about unplanned infrastructure troubles or upcoming planned maintenances. With our new Event Communications toolset, we're able to pinpoint the accounts affected by an event and update users who opt in to receive notifications about how these events may impact their services.

Notifications

As the development work is finalized, we plan to roll out a few phases of improvements. The first phase of implementation, which is ready today, enables email alerts for unplanned incidents, and any portal user account can opt in to receive them. These emails provide details about the impact and current status of an unplanned incident in progress (UIP). In this phase, notifications can be sent for devices such as physical servers, CCIs and shared SLB VIPs, and we will be adding more services over time.

In future phases of this project, we plan to include:

  • A new "Event" section of the Customer Portal which will allow customers to browse upcoming scheduled maintenances or current/recent unplanned incidents which may impact their services. In the past, we generated tickets for scheduled maintenances, so separating these event notifications will improve customer visibility.
  • Enhanced visibility for events in our mobile apps (phone/tablet).
  • Updates to affected services for a given event as customers add / change services.
  • Notification of newly added or newly updated events that have not been read by the user (similar to email "inbox" functionality) in the portal.
  • Identification of any related current or recent events as a customer begins to open a ticket in the portal.
  • Reminders of upcoming scheduled maintenances along with progress updates to the event notification throughout the maintenance in some cases.
  • Improved ability to correlate specific incidents to customer service troubles.
  • Dissemination of RFO (reason-for-outage) statements to customers following a post-incident review of an unplanned service disruption.

Since we respect our customers' inboxes, these notifications will only be sent to user accounts that have opted in. If you'd like to receive them, simply log into the Customer Portal and navigate to "Notification Subscriptions" under the "Administration" menu (direct link). From that page, individual users can control event subscriptions, and portal logins that have administrative control over multiple users on the account can control the opt-in for themselves and their downstream users. For a more detailed walkthrough of the opt-in process, visit the KnowledgeLayer: "Update Subscription Settings for the Event Management System".

The Network Operations Center has already begun using this customer notification toolset for customer-affecting events, so we recommend that you opt in as soon as possible to benefit from this new functionality.

-Dani

November 14, 2012

Risk Management: Securing Your Servers

How do you secure your home when you leave? If you're like most people, you make sure to lock the door you leave from, and you head off to your destination. If Phil is right about "locks keeping honest people honest," simply locking your front door may not be enough. When my family moved into a new house recently, we evaluated its physical security and tried to determine possible avenues of attack (garage, doors, windows, etc.), tools that could be used (a stolen key, a brick, a crowbar, etc.) and ways to mitigate the risk of each kind of attack ... We were effectively creating a risk management plan.

Every risk has different probabilities of occurrence, potential damages, and prevention costs, and the risk management process helps us balance the costs and benefits of various security methods. When it comes to securing a home, the most effective protection comes by using layers of different methods ... To prevent a home invasion, you might lock your door, train your dog to make intruders into chew toys and have an alarm system installed. Even if an attacker can get a key to the house and bring some leftover steaks to appease the dog, the motion detectors for the alarm are going to have the police on their way quickly. (Or you could violate every HOA regulation known to man by digging a moat around the house, filling it with sharks with laser beams attached to their heads, and building a medieval drawbridge over the moat.)

I use the example of securing a house because it's usually a little more accessible than talking about "server security." Server security doesn't have to be overly complex or difficult to implement, but its stigma of complexity usually prevents systems administrators from incorporating even the simplest of security measures. Let's take a look at the easiest steps to begin securing your servers in the context of their home security parallels, and you'll see what I'm talking about.

Keep "Bad People" Out: Have secure password requirements.

Passwords are your keys and your locks — the controls you put into place that ensure that only the people who should have access get it. There's no "catch all" method of keeping the bad people out of your systems, but employing a variety of authentication and identification measures can greatly enhance the security of your systems. A first line of defense for server security would be to set password complexity and minimum/maximum password age requirements.
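As a rough illustration of what "complexity and minimum/maximum age requirements" can mean in practice, here is a minimal Python sketch. The class name, the thresholds (12 characters, 1 and 90 days) and the specific character rules are assumptions chosen for the example, not recommendations from this post or settings from any particular operating system.

```python
from dataclasses import dataclass
from datetime import date
import string


@dataclass
class PasswordPolicy:
    # All three thresholds are illustrative assumptions.
    min_length: int = 12
    min_age_days: int = 1    # how long a user must wait before changing again
    max_age_days: int = 90   # how long a password may be used before it expires

    def complexity_problems(self, password: str) -> list[str]:
        """Return a list of reasons the password fails the policy (empty if it passes)."""
        problems = []
        if len(password) < self.min_length:
            problems.append(f"shorter than {self.min_length} characters")
        if not any(c.islower() for c in password):
            problems.append("no lowercase letter")
        if not any(c.isupper() for c in password):
            problems.append("no uppercase letter")
        if not any(c.isdigit() for c in password):
            problems.append("no digit")
        if not any(c in string.punctuation for c in password):
            problems.append("no punctuation character")
        return problems

    def is_expired(self, last_changed: date, today: date) -> bool:
        """True once the password is older than the maximum age."""
        return (today - last_changed).days > self.max_age_days

    def can_change(self, last_changed: date, today: date) -> bool:
        """True once the minimum age has passed since the last change."""
        return (today - last_changed).days >= self.min_age_days
```

In a real environment you would normally enforce these rules through the operating system or directory service rather than in application code; the sketch only shows the kind of checks such a policy performs.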

If you want to add an additional layer of security at the authentication level, you can incorporate "Strong" or "Two-Factor" authentication. From there, you can learn about a dizzying array of authentication protocols (like TACACS+ and RADIUS) to centralize access control or you can use Active Directory groups to simplify the process of granting and/or restricting access to your systems. Each layer of authentication security has benefits and drawbacks, and most often, you'll want to weigh the security risk against your need for ease-of-use and availability as you plan your implementation.

Stay Current on your "Good People": When authorized users leave, make sure their access to your system leaves with them.

If you gave your neighbor a key to your tool shed while he was finishing his renovation and he doesn't return the tools he borrowed, you need to take the key back when you tell him he can't borrow any more. If you don't, nothing is stopping him from walking over to the shed when you're not looking and taking more (all?) of your tools. I know it seems like a silly example, but that kind of thing is a big oversight when it comes to server security.

Employees are granted access to perform their duties (the principle of least privilege), and when they no longer require access, the "keys to the castle" should be revoked. Auditing who has access to what (whether it be for your systems or for your applications) should be continual.

You might have processes in place to grant and remove access, but it's also important to audit those privileges regularly to catch any breakdowns or oversights. The last thing you want is to have a disgruntled former employee wreak all sorts of havoc on your key systems, sell proprietary information or otherwise cost you revenue, fines, recovery efforts or lost reputation.
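One simple way to picture a regular access audit is as a comparison between the people who should have access and the accounts that actually exist. The sketch below assumes you can export both lists yourself (for example from HR records and from the server); the function name and the sample usernames are hypothetical.

```python
def audit_access(authorized: set[str], actual: set[str]) -> dict[str, set[str]]:
    """Compare who should have access with who actually does.

    "revoke"  -> accounts that exist but are no longer authorized
    "missing" -> authorized people who have no account yet
    """
    return {
        "revoke": actual - authorized,
        "missing": authorized - actual,
    }


# Hypothetical example data: carol has left the company, alice just joined.
report = audit_access(
    authorized={"alice", "bob"},
    actual={"bob", "carol"},
)
for username in sorted(report["revoke"]):
    print(f"Access should be revoked for: {username}")
```

Running a comparison like this on a schedule is one way to catch the breakdowns that a grant-and-remove process misses on its own.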

Catch Attackers: Monitor your systems closely and set up alerts if an intrusion is detected.

There is always a chance that bad people are going to keep looking for a way to get into your house. Maybe they'll walk around the house to try and open the doors and windows you don't use very often. Maybe they'll ring the doorbell and if no lights turn on, they'll break a window and get in that way.

You can never completely eliminate all risk. Security is a continual process, and eventually some determined, over-caffeinated hacker is going to find a way in. Thinking your security is impenetrable makes you vulnerable if by some stretch of the imagination, an attacker breaches your security (see: Trojan Horse). Continuous monitoring strategies can alert administrators if someone does things they shouldn't be doing. Think of it as a motion detector in your house ... "If someone gets in, I want to know where they are." When you implement monitoring, logging and alerting, you will also be able to recover more quickly from security breaches because every file accessed will be documented.
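To make the "motion detector" idea a little more concrete, here is a minimal sketch of one common alerting rule: flag any source address with too many failed logins. The event format (a list of address/success pairs), the function name and the threshold of five are assumptions for illustration; a real deployment would read from your actual authentication logs and monitoring tools.

```python
from collections import Counter


def sources_to_alert(events, threshold=5):
    """Return source addresses with at least `threshold` failed logins.

    `events` is an iterable of (source_address, succeeded) pairs, a
    hypothetical format standing in for parsed authentication logs.
    """
    failures = Counter(source for source, succeeded in events if not succeeded)
    return sorted(source for source, count in failures.items() if count >= threshold)


# Hypothetical events: one address fails repeatedly, another mistypes once.
events = [("10.0.0.5", False)] * 5 + [("10.0.0.9", False), ("10.0.0.9", True)]
for source in sources_to_alert(events):
    print(f"ALERT: repeated failed logins from {source}")
```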

Minimize the Damage: Lock down your system if it is breached.

A burglar smashes through your living room window, runs directly to your DVD collection, and takes your limited edition "Saved by the Bell" series box set. What can you do to prevent them from running back into the house to get the autographed poster of Alf off of your wall?

When you're monitoring your servers and you get alerted to malicious activity, you're already late to the game ... The damage has already started, and you need to minimize it. In a home security environment, that might involve an ear-piercing alarm or filling the moat around your house even higher so the sharks get a better angle to aim their laser beams. File integrity monitors and IDS software can mitigate damage in a security breach by reverting files when checksums don't match or stopping malicious behavior in its tracks.
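The checksum comparison at the heart of a file integrity monitor can be sketched in a few lines of Python. This is only the detection half: it records a SHA-256 baseline and reports files that have changed or disappeared, and it does not revert anything. The function names are made up for this example and are not taken from any particular product.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_baseline(paths):
    """Record the known-good checksum of each file."""
    return {str(p): sha256_of(Path(p)) for p in paths}


def changed_files(baseline):
    """Return files that are missing or whose checksum no longer matches."""
    changed = []
    for name, expected in baseline.items():
        path = Path(name)
        if not path.is_file() or sha256_of(path) != expected:
            changed.append(name)
    return sorted(changed)
```

The baseline itself needs to be stored somewhere an attacker can't modify it (ideally off the server), otherwise a changed file and a changed checksum will still match.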

These recommendations are only a few of the first-line layers of defense when it comes to server security. Even if you're only able to incorporate one or two of these tips into your environment, you should. When you look at server security in terms of a journey rather than a destination, you can celebrate the progress you make and look forward to the next steps down the road.

Now if you'll excuse me, I have to go to a meeting where I'm proposing moats, drawbridges, and sharks with laser beams on their heads to SamF for data center security ... Wish me luck!

-Matthew
