The 4 Questions You Must Answer Before Running an A/B Test

Ever been in an argument with your boss about website design? They want the header image to be just so. They like the buttons to be blue, not orange.

That copy you wrote is good, but it could be tweaked just a little bit.

And so you decide to run an A/B test to see who’s right. You can’t wait to be vindicated by solid, inarguable maths. Never again will your wisdom be called into question.

Time for a reality check: this is not a good reason to run an A/B test, and it is not a good way to optimise your website.

Tests born this way tend to do nothing more than provide ammunition for future arguments, and meanwhile you’re missing opportunities to do some serious site optimisation with proper methodology behind it.

Split tests can be:

- Time consuming

- Expensive

- Annoying to your site visitors

- Inconclusive

With those facts in mind, it’s worth applying some thought before you split test every idea that pops into your head.

Here are the questions you must answer before running an A/B test.

If you can’t say “yes” to all of them, consider some of the alternatives I suggest at the end of the post instead:

1. Is the Test Based on a Real Piece of Customer Insight?

Note customer insight. Your own guesswork isn’t the same. Gathering customer feedback and seeing your website through the eyes of the customer is essential to creating solid tests.

Use a CX analytics tool like Decibel Insight to watch visitor replays and analyse heatmaps, or conduct some user testing on your site with WhatUsersDo.

Spot trends in visitor behaviour, identify moments of friction, and learn how visitors really use the site – then take these insights and turn them into working hypotheses for your tests.

2. Can the Test Reach Statistical Significance in a Reasonable Timeframe?

Statistical significance is a mathematical concept that, in layman’s terms, tells you whether the result of your split test is reliable or not.

In other words, it indicates whether the result of your test – say, that 70% preferred option A and 30% preferred option B – reflects a genuine difference rather than random chance, and is therefore repeatable. It’s the threshold at which you can act on the result of the test.

Reaching statistical significance requires the test to be taken by a certain minimum number of subjects, and for the gap between the variants to be large enough that chance alone is an unlikely explanation.

Don’t understand what I’m talking about? Don’t worry. All you need to know is that without statistical significance, your test result can’t be relied on. So use this brilliant calculator to make sure your test won’t have to run for an impractical length of time before it delivers a result.
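If you want to see what such a calculator is doing under the hood, here’s a minimal Python sketch of the two calculations involved: a two-proportion z-test for significance, and an approximate sample size per variant. It illustrates the standard formulas, not any particular tool’s implementation, and the 3% baseline rate, 20% target lift and 500-visitors-a-day figures are assumptions I’ve picked for the example.

```python
import math

def z_test_p_value(conversions_a, visitors_a, conversions_b, visitors_b):
    """Two-proportion z-test: two-sided p-value for the difference
    between the variants' conversion rates."""
    p_a = conversions_a / visitors_a
    p_b = conversions_b / visitors_b
    p_pool = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / visitors_a + 1 / visitors_b))
    z = abs(p_b - p_a) / se
    # Convert z to a two-sided p-value via the standard normal CDF.
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

def visitors_needed_per_variant(baseline_rate, relative_lift,
                                z_alpha=1.96, z_beta=0.84):
    """Approximate visitors needed per variant to detect the given
    relative lift at 95% confidence (z_alpha) and 80% power (z_beta)."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    top = (z_alpha * math.sqrt(2 * p1 * (1 - p1))
           + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(top / (p2 - p1) ** 2)

# Is 360 vs 300 conversions out of 10,000 visitors each a real difference?
print(z_test_p_value(300, 10_000, 360, 10_000))  # ≈ 0.018: significant at 95%

# Illustration: 3% baseline conversion rate, hoping to detect a 20% lift.
n = visitors_needed_per_variant(0.03, 0.20)   # ≈ 13,000 visitors per variant
days = n * 2 / 500                            # ≈ 52 days at 500 visitors/day
print(n, round(days))
```

If the estimated duration is impractically long for your traffic, you have your answer to this question – the test isn’t ready to run.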

3. Does Your Test Have a Working Hypothesis?

A hypothesis is the theory you’re trying to prove with your test. The point of having a hypothesis is to restrict your testing activities to a logical framework – so no more randomly experimenting with button colours and different fonts hoping to find something that works.

Instead you need to frame each test around the insight discussed in point 1, and mould it into a statement like this:

“I believe that changing element A in this way will result in a change to metric B”

For example, through heatmaps and session replays you might glean some evidence that visitors are ignoring a call to action on your homepage. A hypothesis for a test might read:

“I believe that changing the call to action’s headline so that it’s all upper case will result in more clicks on the call to action.”

That hypothesis becomes your reference point for evaluating the test once it’s delivered a significant result.
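If it helps to keep that discipline, you can record each hypothesis as structured data rather than a loose sentence. Here’s a small Python sketch – the `Hypothesis` class and its fields are my own illustration, not part of any testing tool:

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    """One test = one structured statement, per the template above."""
    element: str   # the thing you're changing (element A)
    change: str    # how you're changing it
    metric: str    # what you expect to move (metric B)
    insight: str   # the customer insight that prompted the test

    def statement(self) -> str:
        return (f"I believe that changing {self.element} {self.change} "
                f"will result in a change to {self.metric}.")

cta_test = Hypothesis(
    element="the call to action's headline",
    change="to all upper case",
    metric="clicks on the call to action",
    insight="heatmaps show visitors ignoring the homepage CTA",
)
print(cta_test.statement())
```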

4. Can You Measure the Result and Attribute It to the Variation?

The test outcome must be measurable. So, if you’re running a test to see if a green background is “better”, you’re going to struggle. Rather, run a test to see if a green background “increases clicks” or “increases dwell time on the page”.

You also need to be able to attribute the change you measure to the variation in the test. That boils down to some old-school scientific discipline – test one variable at a time.

If you’re running a test for the copy on your landing page, don’t start tweaking page layout at the same time – you’ll never know which change contributed to the outcome.
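Here’s a short Python sketch of the mechanics behind that: each visitor is deterministically bucketed into one of two variants (one variable, two versions), and the measurable outcome – clicks – is tallied per variant so the result can be attributed to the variation. The variant names and visitor IDs are illustrative assumptions.

```python
import hashlib

VARIANTS = ("control", "uppercase_headline")  # one variable, two versions

def assign_variant(visitor_id: str) -> str:
    """Stable assignment: the same visitor always sees the same variant,
    so any change in the metric can be attributed to the variation."""
    digest = hashlib.sha256(visitor_id.encode()).hexdigest()
    return VARIANTS[int(digest, 16) % len(VARIANTS)]

# Tally the measurable outcome (clicks) per variant.
clicks = {variant: 0 for variant in VARIANTS}

def record_click(visitor_id: str) -> None:
    clicks[assign_variant(visitor_id)] += 1

record_click("visitor-42")
print(assign_variant("visitor-42"), clicks)
```

Deterministic bucketing matters: if a returning visitor could land in a different variant on each visit, you’d lose the clean attribution the test depends on.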

Some Suggestions

Don’t have a test that meets those criteria? It’s probably not worth running. Here are some suggestions about what to do if you find yourself in that situation:

- If you can see an obvious improvement to the website, it might fall into the “just do it” category that doesn’t require a test. Fixing errors, improving page load speed and correcting typos are all examples.

- Steal other people’s ideas – there are loads of ways to glean learning from other people’s tests and put it into action on your own website. Check out Which Test Won for starters.

- Get into analytics – data is the starting point for strong tests. If you’re not getting your head into the metrics, you’re not benchmarking performance, and you won’t find it easy to identify the areas that will benefit from a test. So, before you reach for the split testing software, make sure you know the numbers.

- Focus on driving traffic instead. It sounds like heresy when the world has gone CRO crazy, but there’s still value in increasing traffic to your website. If you’re not yet ready for a robust programme of split testing, focus on getting more visitors, more cheaply. Optimise your PPC campaign, boost organic rankings, and work with third-party referrers and advertising networks to acquire visitors for the lowest possible investment.