
A/B Testing Guide

There are two main ways people approach testing:
1. Comprehensive evaluation
2. Going for quick wins: testing the elements that are most noticeable and easiest to implement

Below is a list of useful learnings that I've gathered over time, which work fairly well when institutionalizing a testing strategy/program in an organization.

10 Principles for A/B Testing:

1.       Have 100 (or 250) conversions per variation
a.      If confidence is still below 95% at that point, the test has not shown a significant difference between the variations
b.      Check the test results across segments to see if significance was achieved in any segment
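
Here is a minimal sketch of how that confidence figure can be computed, assuming a standard two-proportion z-test (testing tools may use a different method internally). The function names and the 100-conversion default are my own illustration, not taken from any tool.

```python
import math


def confidence_level(conv_a, n_a, conv_b, n_b):
    """Two-sided confidence (0..1) that variations A and B really differ.

    conv_* are conversion counts, n_* are visitor counts.
    Assumes a pooled two-proportion z-test.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0
    z = abs(p_b - p_a) / se
    # erf(z / sqrt(2)) equals 1 minus the two-sided p-value of a normal z-score
    return math.erf(z / math.sqrt(2))


def can_call_winner(conv_a, n_a, conv_b, n_b,
                    min_conversions=100, min_confidence=0.95):
    """Apply both rules: enough conversions per variation AND enough confidence."""
    enough_data = conv_a >= min_conversions and conv_b >= min_conversions
    return enough_data and confidence_level(conv_a, n_a, conv_b, n_b) >= min_confidence
```

For example, 100 conversions from 2,000 visitors against 150 from 2,000 clears 95% comfortably, while 100 against 110 does not.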

2.       Tests need to run for full weeks (7 days minimum)
a.       Run tests in whole-week increments to rule out day-of-week seasonality
        i.      Your conversion rate can vary greatly depending on the day of the week

3.       Send test data into your analytics tool and study segments of traffic
a.       While Optimizely has some built-in segmentation of results, it’s still no match for what you can do within a web analytics tool
b.      If A beats B by 10%, that’s not the full picture. You need to segment the test data; that’s where the insights are. Averages lie
        i.      Segment the heck out of the results: by source of traffic, by user behavior (repeat visits), by outcomes (goals).
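
To illustrate why averages lie, here is a rough sketch that breaks conversion rate down by segment and variation. The record layout (dicts with `variation`, `converted`, and a segment field such as `source`) is an assumption for the example, not an export format of any particular tool.

```python
from collections import defaultdict


def conversion_by_segment(visits, segment_key):
    """Return {(segment, variation): conversion_rate} for the given segment field."""
    counts = defaultdict(lambda: [0, 0])  # (segment, variation) -> [visitors, conversions]
    for visit in visits:
        key = (visit[segment_key], visit["variation"])
        counts[key][0] += 1
        counts[key][1] += int(visit["converted"])
    return {key: conversions / visitors
            for key, (visitors, conversions) in counts.items()}
```

Calling `conversion_by_segment(visits, "source")` and then again with a repeat-visitor field makes it easy to spot a variation that wins overall but loses for, say, paid traffic.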

4.       Declare a winner only when you’ve reached 95% confidence or higher
a.       A/B split testing tools like Optimizely and VWO both tend to call tests too early: their minimum sample sizes are way too small.
b.      A sample size of 100 visitors per variation is not enough. Optimizely leads many people to call tests early and doesn’t have a setting for changing the minimum sample size needed before declaring a winner
        i.      Tool for calculating/confirming sample sizes: http://www.testsignificance.com
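
If you want to sanity-check a sample size yourself, the sketch below uses the textbook formula for comparing two proportions. This is not necessarily what the calculator above does; the 80% power default is my assumption, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_variation(baseline_rate, relative_lift,
                              confidence=0.95, power=0.80):
    """Visitors needed per variation to detect a relative lift over the baseline."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)
```

For a 5% baseline conversion rate and a 10% relative lift, this comes out in the tens of thousands of visitors per variation, which shows how far off 100 visitors is.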

5.       Test hypotheses, not random ideas
a.      A hypothesis is a proposed statement, made on the basis of limited evidence, that can be proved or disproved and is used as a starting point for further investigation.
b.      Complete proper conversion research to discover where the problems lie, then analyze what the problems might be, and finally come up with a hypothesis for overcoming them.
        i.      Use web analytics to identify pain points on the site
c.       Before launching a test, state a clear action plan for each possible result

6.       Designing and running best-in-class experiments:
a.       Test one element (or section) of the page at a time between variations
b.      Don’t test too many variations at once. It’s better to do simple A/B testing; you’ll get results faster
c.       Test segments of visitors (targeting), as averages can lead to false positives
d.      Elements of an experiment:
        i.      Existence: helps examine clutter and points of confusion, and find the most important element on the page
       ii.      Real estate: once the most favorable element is identified, rearrange the layout
      iii.      Presentation: image, placement, color, background, borders, outline, style, font, etc.
       iv.      Function: navigation, search, tools, Flash, pop-ups, widgets, forms
        v.      Copy: headline, paragraph vs. bullet, keywords, value prop, price, short vs. long
e.      Experiment types:
        i.      A/B: testing a single element with different versions
       ii.      Multivariate: testing multiple elements with different versions
      iii.      Multi-page: testing a single element across multiple pages, usually for funnels

7.       Don’t overlap traffic with multiple tests
a.       Overlapping traffic will skew results
b.      If you want to test a new version of several layouts at once—for instance product page, cart and checkout—you should use multi-page experiments designed for this purpose.
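
One way to keep traffic from overlapping is to bucket each visitor into exactly one test before picking a variation. The sketch below does this with a hash of the visitor ID; the function names and the `visitor_id` format are hypothetical, and testing tools generally offer their own mechanism for mutually exclusive experiments.

```python
import hashlib


def _bucket(key, buckets):
    """Map a string to a stable bucket index in range(buckets)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def assign(visitor_id, tests, variations=("control", "variant")):
    """Place a visitor in exactly one test, then pick a variation inside it."""
    test = tests[_bucket("test:" + visitor_id, len(tests))]
    variation = variations[_bucket(test + ":" + visitor_id, len(variations))]
    return test, variation
```

Because the assignment depends only on the visitor ID, the same visitor always lands in the same test and variation, and no visitor is ever exposed to two tests at once.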

8.       No-brainer tests
a.       There is no best color; it’s always about visual hierarchy
b.      Use your traffic on high-impact stuff. Test data-driven hypotheses.
c.       Avoid testing in areas with low traffic or few conversions
        i.      Don’t waste time waiting for a test result that takes many months
       ii.      Go for massive, radical changes instead: no testing, just switch

9.       Small gains are acceptable
a.       If your site is pretty good, you’re not going to get massive lifts all the time
b.      Only 1 out of 8 A/B tests has driven significant change
        i.      “I have not failed 10,000 times. I have successfully found 10,000 ways that will not work.”
c.       Most winning tests are going to give small gains: 1%, 5%, 8%
        i.      You’re going to do many, many tests. If you increase your conversion rate 5% each month, that’s going to be an 80% lift over 12 months. That’s compound interest
d.      Be aware of the conversion rate range of control vs. test
        i.      You may beat control by 2% at a given time, but understand how that relates to the control’s past performance
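
The compounding arithmetic is easy to check. A quick sketch (the function name is mine), assuming each monthly lift applies on top of the previous month’s rate:

```python
def compounded_lift(monthly_lift, months):
    """Total relative lift after compounding a fixed monthly lift."""
    return (1 + monthly_lift) ** months - 1
```

`compounded_lift(0.05, 12)` comes to roughly 0.80, which is the 80% figure quoted above.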

10.    Pay attention to external factors
a.       A winning test during the holidays might not be a winner in January (seasonality)
b.       You need to be aware of what your company is doing and of conversion rate ranges in previous months

Besides the above, there are other factors that should be considered, such as mobile testing, SEO considerations, prioritization of test ideas, path analysis from the test page to conversion, qualitative findings (voice of customer), and organizational structure.

Here is a guide for getting started and managing expectations:

Tools for influencing behavior | Expected impact | Time to implement
CTA, headline, paragraph vs. bullet, keywords | Low | Easy
Imagery, phone numbers | Low | Medium
Rearrange layout | Medium | Medium / High
Promotions (free delivery, gift card, % off, etc.) | High | Medium / High
Price (test by area, segment, sensitivity, presentation) | High | High
New functionality / redesign (shopping cart, proactive call/chat) | High | High

Happy Testing!

Credits for my research go to:
