We invite each of our featured SoftLayer Tech Marketplace Partners to contribute a guest post to the SoftLayer Blog, and this week, we’re happy to welcome David Mytton, Founder of ServerDensity. Server Density is a hosted server and website monitoring service that alerts you when your website is slow, down or back up.
Tech Partners Marketplace: http://www.softlayer.com/marketplace/serverdensity
5 Ways to Minimize Downtime During Summer Vacation
It’s a fact of life that everything runs smoothly until you’re out of contact, away from the Internet or on holiday. However, you can’t be available 24/7 on the chance that something breaks; instead, there are several things you can do to ensure that when things go wrong, the problem can be managed and resolved quickly. To help you set up your own “get back up” plan, we’ve come up with a checklist of the top five things you can do to prepare for an ill-timed issue.
How will you know when things break? Using a tool like Server Density — which combines availability monitoring from locations around the world with internal server metrics like disk usage, Apache and MySQL — means that you can be alerted if your site goes down, and have the data to find out why.
Surprisingly, the most common problems we see are some that are the easiest to fix. One problem that happens all too often is when a customer simply runs out of disk space in a volume! If you’ve ever had it happen to you, you know that running out of space will break things in strange ways — whether it prevents the database from accepting writes or fails to store web sessions on disk. By doing something as simple as setting an alert to monitor used disk space for all important volumes (not just root) at around 75%, you’ll have proactive visibility into your server to avoid hitting volume capacity.
Additionally, you should define triggers for unusual values that will set off a red flag for you. For example, if your Apache requests per second suddenly drop significantly, that change could indicate a problem somewhere else in your infrastructure, and if you’re not monitoring those indirect triggers, you may not learn about those other problems as quickly as you’d like. Find measurable direct and indirect relationships that can give you this kind of early warning, and find a way to measure them and alert yourself when something changes.
2. Dealing with Alerts
It’s no good having alerts sent to someone who isn’t responding (or who can’t at a given time). Using a service like Pagerduty allows you to define on-call rotations for different types of alerts. Nobody wants to be on-call every hour of every day, so differentiating and channeling alerts in an automated way could save you a lot of hassle. Another huge benefit of a platform like Pagerduty is that it also handles escalations: If the first contact in the path doesn’t wake up or is out of service, someone else gets notified quickly.