A/B Testing Interview Questions
What if I don't have a control?
A control is the existing version of a landing page or webpage that you are testing against. Sometimes you may want to test two versions of a page that never existed before ... and that's okay. Just choose one of the variations and call that one the control. Try to pick the one that's the most similar to how you currently design pages, and use the other as the treatment.
What do I need to start A/B testing on my site?
The best way to run A/B tests is to use a software tool designed for it. HubSpot offers one in its all-in-one marketing software platform. Other providers include Unbounce and Visual Website Optimizer. If you don't mind messing with a little code, Google also has a free tool called Content Experiments in Google Analytics. It's a little different than traditional A/B testing, but if you're tech savvy, you could try it out.
How and when do I interpret my split test results?
The test starts. The results begin to roll in. You scramble to check who's winning. But the early stages of a test are not the right time to start interpreting your results. Wait until your test has reached statistical significance (see the question on how many visits you need) and then revisit your original hypothesis. Did the test clearly support or contradict your hypothesis? If so, you can start to draw some conclusions. When you interpret your test, try to stay disciplined about attributing your results to the specific changes made. Make sure there are clear connections between the change and the outcome, and there aren't any other forces at play.
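If your tool doesn't report significance for you, the check can be sketched by hand. The example below is a minimal sketch, assuming a two-proportion z-test (one common choice, not necessarily what any particular tool uses); the function name and the visitor counts are hypothetical.

```python
from math import erf, sqrt

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variations convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF written with the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)

# Hypothetical counts: control converts 200 of 5,000, treatment 250 of 5,000.
z, p_value = two_proportion_z_test(200, 5000, 250, 5000)
significant = p_value < 0.05  # 95% confidence threshold
print(f"z = {z:.2f}, p = {p_value:.4f}, significant: {significant}")
```

A p-value below 0.05 corresponds to the 95% confidence level. The sketch assumes at least one conversion overall, since the standard error would otherwise be zero.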
What do I do if I don't trust the results?
If you really don't trust the results and have ruled out any errors or challenges to the test's validity, the best thing to do is to run the same test again. Treat it as an entirely separate test and see if you can replicate the results. If you're able to replicate again and again, you probably have a solid set of results.
What is a null hypothesis?
A null hypothesis is the hypothesis that any difference in outcomes is the result of sampling error or normal variation. Think about flipping a coin. While you have 50/50 odds for the coin to land on heads, sometimes the outcome in practice is 51/49 or some other variation due to chance. The more you flip the coin, though, the closer you should get to a 50/50 result. In statistics, the way you support an idea is to reject the null hypothesis. Rejecting a null hypothesis is a matter of collecting enough data to show that the observed difference is unlikely to be due to chance alone. This concept is also referred to as reaching statistical significance.
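The coin example can be made concrete. The sketch below computes an exact two-sided binomial p-value under the null hypothesis of a fair coin; the function name and the flip counts are illustrative assumptions. It shows that 51% heads is unremarkable after 100 flips but hard to attribute to chance after 10,000.

```python
def fair_coin_p_value(heads, flips):
    """Chance that a fair coin lands at least this far from 50/50."""
    deviation = abs(2 * heads - flips)
    total = 0
    term = 1  # number of ways to get 0 heads, i.e. C(flips, 0)
    for k in range(flips + 1):
        # Count every outcome at least as lopsided as the observed one.
        if abs(2 * k - flips) >= deviation:
            total += term
        # Step from C(flips, k) to C(flips, k + 1) with exact integer math.
        term = term * (flips - k) // (k + 1)
    return total / 2 ** flips

p_small = fair_coin_p_value(51, 100)      # 51% heads in 100 flips
p_large = fair_coin_p_value(5100, 10000)  # 51% heads in 10,000 flips
print(f"100 flips: p = {p_small:.3f}; 10,000 flips: p = {p_large:.3f}")
```

With 100 flips the p-value is above 0.9, so the null hypothesis stands. With 10,000 flips the same 51% share falls below 0.05, which is the usual threshold for rejecting it.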
What's multivariate testing, and how does it compare to A/B testing?
A/B testing is typically used for redesigns to test out the effectiveness of a single design direction or theory against a goal (like driving conversions). Multivariate testing tends to be used for smaller changes over a longer period of time. It will take a number of elements of your site and test out all possible combinations of these elements together for ongoing optimization. In a post in January, my colleague Corey Eridon explained the differences between when you'd use one test over the other in detail, saying:
"A/B testing is a great testing method if you need meaningful results fast. Because the changes from page to page are so stark, it will be easier to tell which page is most effective. It is also the right method to choose if you don't have a ton of traffic to your site. Because of the multiple variables being tested in a multivariate test, you'll need a highly trafficked site to get meaningful results with MVT. If you do have enough site traffic to pull off a successful multivariate test (though you can still use A/B testing if you're testing brand new designs and layouts!) a great time to use the testing method is when you want to make subtle changes to a page and understand how certain elements interact with one another to incrementally improve on an existing design."
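To see why a multivariate test needs so much more traffic, it helps to count the combinations. This is a rough sketch: the page elements, their versions, and the 1,000-visitor minimum per variation are all made-up assumptions for illustration.

```python
from itertools import product

# Hypothetical page elements and the versions of each you want to try.
elements = {
    "headline": ["benefit-led", "urgency-led"],
    "image": ["product only", "person using product", "no image"],
    "cta_color": ["red", "blue"],
}

# A multivariate test serves every possible combination of the versions.
combinations = [
    dict(zip(elements, combo)) for combo in product(*elements.values())
]

visitors_per_variation = 1000  # assumed minimum traffic per variation
mvt_visitors = len(combinations) * visitors_per_variation
ab_visitors = 2 * visitors_per_variation

print(f"{len(combinations)} combinations -> {mvt_visitors} visitors for MVT")
print(f"2 variations -> {ab_visitors} visitors for a simple A/B test")
```

Each extra element multiplies the number of combinations, so traffic requirements grow quickly.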
When is A/B testing a good idea? When is it a bad idea?
A/B testing most commonly fails because the test itself has unclear goals, so you've got to know what you're testing. Use A/B testing to test a theory, for example -- would adding a picture to this landing page increase conversions? Are people more likely to click a red button or a blue button? What if I change the headline to stress the time-limit of the offer? These are all changes that can be easily quantified. People run into trouble with A/B testing when their theories are too vague, like testing two entirely different designs with multiple variants. While it can be done, unless there is a clear landslide winner, testing different designs can lead to softer conclusions and an uncertainty about what actually caused the increase in conversions.
How many visits to a page do I need to get good results with A/B testing?
Before you can interpret the results of an A/B test, you have to be sure the test has reached statistical significance -- the point at which you can have 95% confidence or more in the results. The good news is, many A/B testing tools have statistical significance built right in so you get an indication as to when your test is ready for interpretation. If you don't have that, however, there are also a number of free calculators and tools out there for understanding the statistical significance. HubSpot offers one, and you can also check out a more detailed Excel spreadsheet over on the Occam's Razor blog.
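For a rough sense of the numbers before you launch, a standard sample-size approximation for comparing two conversion rates can be sketched as follows. The 80% power setting, the function name, and the example rates are assumptions for illustration, not figures from any particular tool.

```python
from math import ceil
from statistics import NormalDist

def required_visitors_per_variation(baseline_rate, expected_rate,
                                    confidence=0.95, power=0.80):
    """Approximate visitors needed per variation to detect the given lift."""
    # z-scores for the two-sided confidence level and the desired power.
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = (baseline_rate * (1 - baseline_rate)
                + expected_rate * (1 - expected_rate))
    effect = expected_rate - baseline_rate
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)

# Hypothetical goal: detect a lift from a 4% to a 5% conversion rate.
needed = required_visitors_per_variation(0.04, 0.05)
print(f"About {needed} visitors per variation")
```

The smaller the lift you hope to detect, the more visits you need, which is why subtle changes take longer to test on low-traffic pages.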
What you test is up to you, but we recommend starting with a few basic lynchpins of your webpage.
Calls-to-Action: Even with the single element of a call-to-action, there are a number of different things you can test. Just make sure you're clear on what aspect of the CTA you're testing. You could test the text -- what the CTA compels the viewer to do; the location -- where the CTA is positioned on the page; the shape and style -- what the CTA looks like. For instance, HubSpot tested the shape and style of our demo CTA to see which performed better. The CTA shaped like a button performed significantly better than the CTA that included a sprocket image, giving us a 13% increase in conversions.
Headline: It's typically the first thing a viewer reads on your site, so the potential for impact is significant. Try out different styles of headlines on your A/B tests. Make sure that the difference between each headline's positioning is clear rather than some simple wordsmithing so you can be certain as to what caused the change.
Images: What's more effective, an image of a person using your product, or the product on its own? Test different versions of your pages with alternate supporting images to see if there's a difference in action.
Copy length: Does shortening the text on your page result in a clearer message, or do you need the extra text to explain your offer? Trying out different versions of your body text can help you determine what amount of explanation a reader needs before converting. To make this test work, try to keep the text similar and just test the volume of it.
How many variations should I have in A/B testing?
Let's say you've brainstormed as a marketing team and you have four great ideas for a landing page design. It can be tempting to run all four treatments at once to declare a winner, but similar to the problem with testing multiple variables at once, it's not a true A/B test if you have multiple different treatments running at once. A number of factors from each different design can get in there and muddy the test result waters, so to speak. The beauty of an A/B test is that its results are straightforward and concrete. We suggest running two versions against each other, and then running a second test afterwards to compare the winners. Think of it as a really techy basketball bracket.
Beyond sample size, what other validity traps are there?
MECLABS ran a great web clinic last year on a collection of other threats to a test's validity beyond sample size. In it, Dr. Flint McGlaughlin ran through three testing errors and how to mitigate the risk of coming across them in your tests. I'd recommend reading the full transcript of the clinic, but here are a couple of errors from the list.
History Effect: Something happens in the outside world to adversely bias your test results.
Instrumental Effect: An error in your testing software undermines the testing results.
How often should I run A/B testing?
Perspectives vary on this one. There's a good case to be made for always testing and iterating on your site. Just be sure that each test has a clear purpose and will result in a more functional site for your visitors and company. If you're running a lot of tests that are resulting in minimal outcomes or minor victories, reconsider your testing strategy.
Why isn't my split exactly 50/50?
Sometimes during A/B tests you may notice that the traffic numbers of each variation are not identical. This doesn't mean that anything is wrong with the test, just that random variations work, well, randomly. Think about flipping a coin. You have a 50-50 shot of heads or tails, but sometimes you get three tails in a row. As more traffic hits your site, however, the numbers should become closer to 50-50.
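A quick simulation illustrates the point. This is a minimal sketch that assumes each visitor is assigned to a variation by an independent coin flip; real testing tools may assign traffic differently, and the function name is made up.

```python
import random
from math import sqrt

def simulate_split(visitors, seed=0):
    """Randomly assign each visitor to A or B; return the share sent to A."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    in_a = sum(rng.random() < 0.5 for _ in range(visitors))
    return in_a / visitors

for visitors in (100, 10_000, 100_000):
    share_a = simulate_split(visitors)
    # One standard deviation of the share under a fair 50/50 assignment.
    typical_gap = 0.5 / sqrt(visitors)
    print(f"{visitors:>7} visitors: {share_a:.2%} in A "
          f"(typical gap +/- {typical_gap:.2%})")
```

The typical gap shrinks with the square root of the traffic, so a lopsided split at 100 visitors is normal, while a persistent gap at high traffic is worth a second look.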
How do I find A/B tests from similar companies?
There are a number of sites out there that aggregate A/B testing examples and results. Some allow you to search by company type, and most provide details as to how the company interpreted the test results. When you're first getting started, it's not a bad idea to read through some of these sites to get ideas on what to test for your own company.
Which Test Won: Anne Holland's website, Which Test Won, has a series of examples as well as some annual contests that you can submit your own tests to.
Visual Website Optimizer: Like HubSpot, Visual Website Optimizer provides A/B testing software and has a number of examples on their blogs that you can learn from.
ABTests.com: This site is no longer being updated, but it does have a good archive of tests and some quick takeaways, as well as links to the original test conductor and write-ups. The site was founded by HubSpot UX director Josh Porter, who also has some great advice on A/B testing over on his personal blog, bokardo.com.
Does A/B testing negatively affect SEO?
There's a myth that A/B testing hurts search engine rankings because it could be classified as duplicate content, which search engines don't look kindly upon. This myth is most definitely false. In fact, Google's Matt Cutts advises running A/B tests to improve the functionality of your site. Website Optimizer has a good breakdown of the myth too, and why it doesn't hold up. If you're still concerned, you can always add a "noindex" tag to your variation page. Detailed instructions on adding a "noindex" tag can be found here.
Should I A/B test my homepage?
While I wouldn't completely rule out A/B testing your homepage, I will say that it can be difficult to run a conclusive test of your homepage. Your homepage gets a variable mix of traffic, from accidental visitors, to leads, to customers. There's also typically a ton of content on your homepage, so it can be challenging in one test to determine what's driving a visitor to act or not act. Finally, because of the variety of people who come to your homepage, figuring out a goal for the page and test can be a challenge. You may think that your goal is to test lead-to-customer conversions, but if the sample visiting during the test is heavy on customers instead of prospects, your goal for that group could shift. If you want to test your homepage, think about just testing your CTAs. At the enterprise level, HubSpot's CTA manager allows you to run a split test just on a single CTA on your site rather than the whole page.
Can you run A/B tests on things other than web pages?
Yes! In addition to landing pages and webpages, many marketers run A/B tests on emails, PPC campaigns, and calls-to-action.
Email: Email testing variables include the subject line, personalization features, and sender name, among others.
PPC: For paid search ad campaigns, you can A/B test the headline, body text, link text, or keywords.
CTAs: With CTAs, try altering the text on the CTA, its shape, color, or placement on the page.
How many variables should I test?
You want your A/B test to be conclusive -- you're investing time in it, so you want a clear and actionable answer! The problem with testing multiple variables at once is you aren't able to accurately determine which of the variables made the difference. So while you can say one page performed better than the other, if there are three or four variables on each, you can't be certain as to why or if one of those variables is actually a detriment to the page, nor can you replicate the good elements on other pages. Our advice? Do a series of basic one-variable tests to iterate your way to a page you know is more effective.