Service Level Agreement - Best Practices & Crucial Elements
When an elevator gets stuck, its load capacity is the first thing to be investigated. Were there too many people or did its manufacturer not deliver on its promises? The maximum capacity badge clears up all doubt. Just like a Service Level Agreement (SLA).
An SLA provides legal certainty to customer and provider. The customer knows what service to expect; the provider knows what requirements to meet.
Here are the crucial elements of a watertight SLA and best practices for setting it up.
Best practices for setting up an SLA
Make coordination a two-sided effort. An SLA only makes sense if both sides work toward a mutual agreement. A company should not rubber-stamp a customer’s demands just to land the deal. Likewise, when sending counteroffers, it should only propose terms that it can achieve and that it considers satisfying for the customer.
Your overall objective in SLA negotiations should be to avoid penalties while leaving yourself room to win the customer over by exceeding what you warranted. The concept of underpromising and overdelivering, as described by Jason Zook in this Inc. article, offers a suitable game plan for SLA negotiations.
Use the “SMART” model. An SLA is a legal document that should be clear at the very least and, ideally, ironclad. Once an expensive issue arises, so will the question of liability. Loopholes can then be costly and can also jeopardize a company’s reputation. Ensuring there aren’t any is the provider’s responsibility, since the SLA is part of their service.
George T. Doran’s “SMART” model gives a handy overview of service level agreement best practices: specific, measurable, achievable, relevant, and time-bound. Since many loopholes in SLAs are due to ambiguous wording, I’d like to stress the attributes specific, measurable, and time-bound here.
Specific, in our context, means that every SLA rule has one clear meaning and leaves no room for a deviating interpretation. The meaning of words is often subjective. Numbers measured with verified tools, by contrast, carry universal meaning.
The definition of the service warranted, e.g. what issue to solve and how, can only be described in words. The quality of service delivery you warrant, e.g. how quickly you solve the issue, should be stated in numbers.
Sticking to numbers whenever possible is a sure way to prevent ambiguity. For example, when stating a time for deliveries to be sent out after weekends, you wouldn’t warrant “Monday morning” but “between 5:00 and 8:00 am on the following Monday morning.”
To define your area of accountability even more sharply, make those numbers time-bound. The correct amount of an item or service is irrelevant if it’s not delivered at the required time. When referring to, say, the number of onboarding and training sessions a customer is warranted, state a time frame in which they will take place.
Go into detail. Cover anything that matters in your service. If you’ve agreed on a personal success manager, for example, how will you handle your rep being unavailable due to illness? Since you don’t know when that will happen, agree on a deadline by which you have to notify the customer. Then also state whether there will be a substitute and, if so, the substitute’s qualification, quantified through years of experience.
A point often underestimated is to also state the obvious. What’s not covered by your service is relevant if leaving it out may cause uncertainty. Seemingly obvious exceptions, like longer processing and response times during holiday seasons, should also be mentioned explicitly. Calling something “obvious” or “self-evident” won’t save your neck when you’ve breached a warranty in your SLA on New Year’s Eve.
Review and adjust. An SLA describes the intersection of a customer’s operational needs and a provider’s capabilities to serve them. Both sides have their own operational needs, which are in constant flux due to market movements or deliberate strategic changes. Naturally, an SLA has to be reevaluated every once in a while. For example, an SLA is commonly reviewed after software updates and after staffing changes in departments with direct customer contact.
Crucial Elements of the SLA
This ITIL checklist contains the general points to include in an SLA. It provides a good overview of legal standards and structure. I’m going to go into more detail with the most pivotal elements now.
Urgency and technicality categories
Prioritizing important requests is in the mutual interest of customers and providers. It allows businesses to distribute their resources efficiently and assures customers that their most pressing issues are dealt with first.
To define prioritization in an SLA, you need to define request categories first. If your SLA refers to a single process with a routine of ten or so recurring steps, you may go into detail like Chaseblueloans.
If the SLA refers to a whole service operation, it’ll be hard to predefine every single scenario. So you’ll need more general issue types, still defined clearly enough to allow for a logical categorization of more detailed issue types.
Label categories with expressive terms like “normal,” “urgent,” and “top priority.” Also take the ticket’s level of technicality into account, as suggested by Ankita Kaushik. Add to your SLA a list or table of categories and their respective urgency levels according to the customer’s operational needs.
| Severity level | Definition | Response time |
| 1. System Down | AS/400, mainframe, server | Immediate |
| 2. Critical | Business outage or significant customer impact that threatens future productivity | Within 1 hour |
| 3. Urgent | High-impact problem where production is proceeding, but in a significantly impaired fashion; there is a time-sensitive issue important to long-term productivity that is not causing an immediate work stoppage; or there is significant customer concern | Within 2 hours |
| 4. Important | Important issue that does not have significant current productivity impact | Within 4 hours |
| 5. Monitor | Issue requiring no further action beyond monitoring for follow-up, if needed | Within 1 business day |
| 6. Informational | Request for information only | Within 1 business day |
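To make such a table actionable, the response times can be turned into concrete deadlines. The following is a minimal sketch, not a definitive implementation: the names `RESPONSE_TARGETS`, `add_one_business_day`, and `response_deadline` are hypothetical, and "1 business day" is assumed to mean the same clock time on the next weekday, ignoring public holidays.

```python
from datetime import datetime, timedelta

# Response-time targets copied from the severity table above.
# None marks the levels whose target is "within 1 business day".
RESPONSE_TARGETS = {
    1: timedelta(0),        # System Down: immediate
    2: timedelta(hours=1),  # Critical
    3: timedelta(hours=2),  # Urgent
    4: timedelta(hours=4),  # Important
    5: None,                # Monitor
    6: None,                # Informational
}


def add_one_business_day(moment):
    """Return the same clock time on the next weekday (assumption: holidays are ignored)."""
    nxt = moment + timedelta(days=1)
    while nxt.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        nxt += timedelta(days=1)
    return nxt


def response_deadline(severity, reported_at):
    """Latest moment at which a first response still meets the SLA."""
    target = RESPONSE_TARGETS[severity]
    if target is None:
        return add_one_business_day(reported_at)
    return reported_at + target


if __name__ == "__main__":
    friday_afternoon = datetime(2024, 3, 1, 16, 0)  # a Friday
    print(response_deadline(2, friday_afternoon))   # one hour later
    print(response_deadline(5, friday_afternoon))   # the following Monday, same time
```

Under these assumptions, a Critical ticket reported on a Friday at 16:00 is due by 17:00 the same day, while a Monitor ticket is due the following Monday at 16:00. Whether weekends and holidays count is exactly the kind of detail the SLA itself should spell out.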
Accessibility is a core principle of customer service, and therefore of any SLA. One of your customer’s main concerns is how easy it is for her to get in touch for support. In your website’s contact section, you simply refer to your office hours. In an SLA intended for larger corporate customers, you’ll likely grant additional options through custom terms.
Here are the most important general availability elements. Differentiate them based on weekdays, weekends, holidays, time zones and urgency labels (“urgency”):
- Time of day during which you’re available
- Channels you’re available on
- Unique “online” times of each channel
- Option to call through to the managers in charge (yes/no/who)
Waiting limits per time, channel, and urgency:
- Maximum first response time when not available in real-time
- Maximum response time
- Maximum delivery time after order placement
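The availability elements listed above lend themselves to a simple lookup that both sides can check against. Below is a rough sketch under stated assumptions: the channels, days, and hours in `SUPPORT_HOURS` are invented placeholders, `is_available` is a hypothetical helper, and time zones and holidays are left out for brevity.

```python
from datetime import datetime, time

# Hypothetical support hours per channel (placeholder values, not from any real SLA).
# Weekdays are numbered as in datetime.weekday(): 0 = Monday, 6 = Sunday.
SUPPORT_HOURS = {
    "chat": {"days": range(0, 5), "start": time(9, 0), "end": time(18, 0)},
    "phone": {"days": range(0, 7), "start": time(8, 0), "end": time(20, 0)},
}


def is_available(channel, moment):
    """Return True if the channel is staffed at the given moment."""
    hours = SUPPORT_HOURS[channel]
    on_duty_day = moment.weekday() in hours["days"]
    within_window = hours["start"] <= moment.time() < hours["end"]
    return on_duty_day and within_window


if __name__ == "__main__":
    saturday_morning = datetime(2024, 3, 2, 10, 0)
    print(is_available("chat", saturday_morning))
    print(is_available("phone", saturday_morning))
```

With these placeholder hours, chat is offline on a Saturday morning while the phone line is staffed. The waiting limits would then apply to requests that arrive while a channel is not available in real time.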
Your customer wants to be sure she’ll reach you when something goes wrong. What’s more, she wants to be certain that you’ll fix it. The SLA should assure her of that, and she will find it much easier to trust that you’ll fix the issue if you say how you’ll fix it.
At the heart of all problem resolutions is the disaster recovery plan. Usually, “disaster” refers to a complete failure of the service or of one of its central functions. In the case of Userlike, every scenario in which a customer can no longer chat with its website visitors is a disaster.
As a provider in the IT sector, you have probably already worked out a disaster plan that applies to all your customers in scenarios of data loss and disconnection. Its contents should find their way into your SLA along with possible disaster scenarios and solution strategies for the specific customer.
Including a disaster plan in the SLA lets both sides agree that the actions planned for a given scenario are the best option. Still, once that scenario becomes real, the customer’s first question will be: When will you fix it? Consequently, the recovery time objective is the metric of interest here.
- Maximum recovery time
For less existential issues, state the maximum time in which a ticket is dealt with and resolved, per time, channel, and urgency:
- Maximum problem resolution time
“How” also covers “by whom.” Your customer may want certain issues with top urgency to be handled by a particularly qualified employee or a specialized department. If so, name that person or department in the SLA, as well as their respective time restrictions.
When putting single employees on call for several companies, always keep an eye on their overall workload. You might otherwise breach the SLA when several companies need your specialist at the same time.
Most SLAs are signed and left to collect dust until something goes wrong. And since processes and requirements in the IT sector change swiftly, the paper loses applicability over time.
Periodic performance reviews can keep the SLA from becoming obsolete. Additionally, they build genuine trust with your customers and help to identify performance trends. Say you’ve been getting increasingly sluggish in responding to customers. Reviews then enable you to take timely countermeasures before you breach the SLA.
Agree on a fixed rhythm for sending out performance reviews and include metrics that correspond to the agreed terms. Consider the following metrics to represent your performance with regard to the crucial elements of your SLA. List them in your review per day, time, channel, and urgency label next to their SLA element counterpart (in brackets):
- Average first response time (maximum first response time when not available in real-time)
- Average response time (maximum response time)
- Average problem resolution time (maximum problem resolution time)
- Average time until issue is touched (maximum time until issue is first touched)
- Average delivery time after order was placed (maximum delivery time after order placement)
- Singular breaches
- Average breaches of SLA per 100 requests (cf. “Things Gone Wrong” metric)
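As an illustration of how a few of these review metrics could be computed, here is a minimal sketch. It assumes each request is logged with its measured first response time and the limit agreed in the SLA; the `Ticket` class, its field names, and the `review` function are hypothetical and not taken from any particular help desk tool.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Ticket:
    urgency: str               # urgency label, e.g. "urgent"
    first_response_min: float  # measured minutes until the first response
    limit_min: float           # maximum first response time agreed in the SLA


def review(tickets):
    """Summarize first-response performance for a non-empty list of tickets."""
    breaches = [t for t in tickets if t.first_response_min > t.limit_min]
    return {
        "average_first_response_min": mean(t.first_response_min for t in tickets),
        "breaches": len(breaches),
        "breaches_per_100": 100 * len(breaches) / len(tickets),
    }


if __name__ == "__main__":
    sample = [
        Ticket("urgent", 10, 30),
        Ticket("urgent", 20, 30),
        Ticket("urgent", 30, 30),
        Ticket("urgent", 60, 30),  # the only breach: 60 minutes against a 30-minute limit
    ]
    print(review(sample))
```

For the four sample tickets, the average first response time is 30 minutes, with one breach, which corresponds to 25 breaches per 100 requests. Grouping the same calculation by day, channel, and urgency label gives the breakdown described above.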
General security measures
In an IT company’s SLA, security measures translate to detailed descriptions of its setup. The customer will largely wave these elements through if you explain why and how they suit your processes. If not, that’s likely because the customer has spotted a vulnerability, which you too would want to close.
Ready.gov emphasizes the following elements of an IT setup, which I’d suggest addressing in an SLA:
- Computer room environment (secure computer room with climate control, conditioned and backup power supply, etc.)
- Hardware (networks, servers, desktop and laptop computers, wireless devices and peripherals)
- Connectivity to a service provider (fiber, cable, wireless, etc.)
- Software applications (electronic data interchange, electronic mail, enterprise resource management, office productivity, etc.)
- Data and restoration
In an SLA, penalties work like insurance against breaches. Obviously, not every breach’s consequences can be estimated down to the penny. Service downtimes can damage a company’s reputation and cause diffuse long-term losses. Still, it’s important to go through various what-if scenarios and tag them with a price.
In penalty negotiations, customers might try to push the numbers up to increase the incentive for you. You should try not to agree to anything that exceeds appropriate compensation.