Multivariate testing
What is multivariate testing?
Multivariate testing (MVT) is a technique for testing a hypothesis in which multiple variables are modified. The goal of multivariate testing is to determine which combination of variations performs the best out of all of the possible combinations.
Websites and mobile apps are made of combinations of changeable elements. A multivariate test changes several of those elements at once, such as a picture and a headline. For example, three variations of the image and two variations of the headline combine to create six variants of the content, which are tested concurrently to find the winning variation.
Key characteristics:
- Tests multiple page elements simultaneously
- Requires larger sample sizes than A/B testing
- Provides insights into element interactions
- Best for optimizing critical pages without full redesigns
- Useful for understanding complex user behaviors
The total number of variations in a multivariate test will always be:
[# of Variations on Element A] X [# of Variations on Element B] ... = [Total # of Variations]
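As a minimal sketch of the formula above, the total is simply the product of the variation counts. The element names and the `total_variations` helper are illustrative assumptions, not part of any testing tool:

```python
from math import prod

# Hypothetical elements and their variation counts, mirroring the
# image/headline example above (3 images x 2 headlines).
variations_per_element = {"image": 3, "headline": 2}


def total_variations(counts: dict[str, int]) -> int:
    """Multiply the number of variations of every element under test."""
    return prod(counts.values())


print(total_variations(variations_per_element))  # 6
```

Adding a third element with two variations would double the total to 12, which is why the number of variants grows quickly.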
When running multivariate tests and trying to increase conversions, you may encounter some challenges. Here's how to address them:
- Inconclusive results: If your test isn't producing clear winners, try reducing the number of variables or increasing the test duration.
- Slow data collection or low traffic: Focus on high-traffic pages or simplify the test by reducing the number of variations.
- Conflicting results: Consider rerunning the test or segmenting your data for deeper analysis.
- Technical glitches: Ensure proper implementation and cross-browser compatibility of your test variations.
Multivariate testing vs. A/B testing
To better understand multivariate testing, let's compare it with A/B testing:
Aspect | A/B Testing | Multivariate Testing |
Variables tested | One at a time | Multiple simultaneously |
Complexity | Simple A version versus B version | More complex |
Sample size | Smaller | Larger |
Test duration | Shorter | Longer |
Best use case | Testing a single element | Optimizing multiple elements on a page |
The process of running a multivariate test is similar to A/B split testing, with one key difference: an A/B test changes a single variable to determine the effect of that one change. In a multivariate test, multiple variables are tested together to uncover the combination that most improves the primary metric chosen when setting up the test.
For example, a multivariate test lets you test many more elements on a homepage or other webpage at once. This makes it especially important to craft your hypothesis carefully. The more elements you test, the larger the sample size you need. So multivariate tests either need more visitors or take longer to reach statistical significance.
Which type of test you should use depends on the use case and should be evaluated on a case-by-case basis. Always optimize for the maximum possible conversions in your marketing campaigns. Multivariate tests can help uncover a better user experience where A/B tests might not be able to.
Learn more about Multivariate testing vs A/B testing.
Best metrics to focus on with multivariate testing
When setting up a multivariate test, as with an A/B test, start by defining your variants, the target page, the audience (optional) and the metrics you think will be influenced by the change.
From our research, most ecommerce companies tend to focus on revenue, and B2B companies on conversions. Although these are the most valuable conversions you can have, adding other conversion metrics can sometimes help a test reach statistical significance because there’s more data to work with. Based on our findings, companies that focused on these alternative metrics had higher returns and success rates in testing.
Example web metrics you can track in multivariate tests are:
- Call-to-action clicks - Commonly on buttons or banners, measured as click-through rate (CTR)
- Conversion rate (CVR) - Form submits or sales as a share of traffic. CVR is the most common metric.
- Engagement rate (ER) - A blog might measure the share of website visitors who scrolled 75% of the page
- View-through rate (VTR) - The share of website visitors or play events that watched 75% or 90%
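Each of these metrics can be read as a simple ratio: the number of qualifying events divided by the number of visitors who could have triggered them. The sketch below assumes that reading. The `rate` helper and all counts are hypothetical, chosen only to illustrate the arithmetic:

```python
def rate(events: int, visitors: int) -> float:
    """Share of visitors who triggered the event; 0.0 when there are no visitors."""
    return events / visitors if visitors else 0.0


# Hypothetical counts for a single variation (illustrative, not real data).
visitors = 2000
event_counts = {
    "CTR": 300,  # call-to-action clicks
    "CVR": 80,   # form submits or sales
    "ER": 900,   # visitors who scrolled 75%
    "VTR": 500,  # visitors who watched 75% of a video
}

metrics = {name: rate(events, visitors) for name, events in event_counts.items()}
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Note how the secondary metrics (CTR, ER, VTR) produce far more events than CVR from the same traffic, which is the "more data to work with" effect described above.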
Setting up a multivariate test with these usability metrics allows the system to pick the best combination of elements for your page, based on your primary metric. Most tools do allow you to track more than one metric, but only the first (or main) one will be used to measure success.
Adding alternative metrics to plain conversions can also help with conversion rate optimization and even break you out of a local maximum. For instance, tracking user engagement before visitors reach a form increases the data ingested by your multivariate test, giving the testing tool more to work with as it tries to reach statistical significance.
A note on statistical significance in multivariate testing
Depending on how much traffic your page gets, and how strongly the elements you’re changing are likely to affect your primary metric, it can take a while to reach statistical significance.
This is the same as with any A/B test, where the more variations you add (A/B/C/D), the longer your test can take to reach a conclusion. With multivariate testing, this is extra pronounced because every combination of the changed elements needs to be measured against the others.
For example, if I’m changing an image (A and B), a headline (A and B) and a description (A and B), the formula is as follows:
Image A and B (2 options) * Headline A and B (2 options) * Description A and B (2 options) = 8 combinations.
Even though this seems like a relatively simple test, 8 variants would take a long time to test, just as an A/B/C/D/E/F/G/H split test would (testing every combination this way is known as a full factorial test). To compensate, as with an A/B test, focus on page elements that people can see immediately, on pages with a high amount of traffic.
Another option to improve your results in multivariate testing is to reduce the total number of variables. If your tests are not reaching significance in a timely manner:
- Use fewer different versions of elements
- Use different elements that impact metrics more
- Focus on pages with a high amount of traffic (more on that below)
- Use previous test data to determine likely impact on conversion rates
- Redesign elements to make changes more prominent. Bolder tests tend to produce outsized results
On the amount of traffic: depending on what changes you’re making and how much impact you expect them to have on your primary metric, you need at least enough traffic to reach statistical significance. Given the minimum detectable effect, you can estimate the traffic you need using our Sample Size Calculator. The output sample size is the traffic needed in this case.
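To give a rough feel for how the minimum detectable effect drives traffic requirements, here is a sketch using the textbook fixed-horizon sample size formula for comparing two conversion rates. This is an assumption for illustration only: it is not the method used by the Sample Size Calculator mentioned above, it applies no correction for comparing many variations at once, and the function names, default `alpha` and `power` values are hypothetical choices.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variation(baseline_rate: float, relative_mde: float,
                              alpha: float = 0.05, power: float = 0.8) -> int:
    """Visitors needed per variation to detect a relative lift of `relative_mde`
    over `baseline_rate` (two-sided test on two proportions)."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


def total_traffic_needed(n_combinations: int, baseline_rate: float,
                         relative_mde: float) -> int:
    """Every combination needs its own sample, so traffic scales with the count."""
    return n_combinations * sample_size_per_variation(baseline_rate, relative_mde)


# Hypothetical inputs: 10% baseline conversion rate, 20% relative lift, 8 combinations.
print(sample_size_per_variation(0.10, 0.20))
print(total_traffic_needed(8, 0.10, 0.20))
```

Under these assumptions, halving the minimum detectable effect roughly quadruples the sample needed per variation, which is why bolder changes tend to conclude faster.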
Benefits of multivariate testing
Using multivariate testing for conversion rate optimization (CRO) can be helpful when multiple elements on the same page can be changed in tandem to improve a single conversion goal: sign ups, clicks, form completions or social shares. If conducted properly, a multivariate test can eliminate the need to run several sequential A/B tests on the same page with the same goal, and can help find the best of the different combinations. The tests are run concurrently with a greater number of variations in a shorter period of time.
Multivariate testing, together with other testing methods, can help give you confidence that the changes you’re making to your app or website have the maximum positive impact on your conversion metrics, without having to test each variation separately.
Common things that are well suited to multivariate tests specifically are:
- Button colors - Improving click-through rates (CTR)
- CTA button text - Also improving click-through rates (CTR)
- Different call to action button designs - like banners or buttons, improving CTR or conversion rates (CVR)
- Page layout - Engagement rate (ER)
- Interactive and media elements - Engagement rate (ER) or view-through rate (VTR)
How multivariate testing differs from a full factorial test
In statistics, a full factorial test is a variant of multivariate testing. It differs from a typical multivariate test in that, as the name implies, all combinations are tested in full. In a multivariate test in most A/B testing tools, like Optimizely Web Experimentation, we instead try to find the best combination faster before drawing a conclusion.
Let’s take an example. A full factorial test takes the same 8 combinations of changes to your page into account, but divides traffic equally among them. That means each of these options gets equal traffic and data in order to reach statistical significance:
AAA, AAB, ABA, ABB, BAA, BAB, BBA and BBB all get tested equally to get to the purest result possible. Everything was given a fair chance to compete for the highest outcome.
In partial factorial testing, the more commonly seen type of multivariate testing, once a positive change is detected early, only the combinations that are more likely to outperform continue to be tested. For example, if the system sees that test variations with variant B significantly outperform those with A, there’s no need to keep testing variant A.
Bringing that back to the example from before, I’d only need to keep testing BAA, BAB, BBA and BBB, drastically cutting back the number of variants I need to keep testing and reaching statistical significance faster.
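The difference between the two approaches can be sketched in a few lines. The element names and the `keep_winning_variant` helper are hypothetical, and the winner is passed in by hand here, whereas a real testing tool would decide when and what to prune based on its own statistics:

```python
from itertools import product

# The three elements from the earlier example, each with variants A and B.
elements = {
    "image": ["A", "B"],
    "headline": ["A", "B"],
    "description": ["A", "B"],
}

# Full factorial: every combination is tested with an equal share of traffic.
full_factorial = ["".join(combo) for combo in product(*elements.values())]
print(full_factorial)
# ['AAA', 'AAB', 'ABA', 'ABB', 'BAA', 'BAB', 'BBA', 'BBB']


def keep_winning_variant(combinations: list[str], position: int, winner: str) -> list[str]:
    """Partial factorial pruning: drop combinations whose element at
    `position` is not the variant that has been found to outperform."""
    return [combo for combo in combinations if combo[position] == winner]


# Assumption: variant B of the first element (the image) is the early winner.
remaining = keep_winning_variant(full_factorial, position=0, winner="B")
print(remaining)
# ['BAA', 'BAB', 'BBA', 'BBB']
```

Halving the number of combinations means each remaining one receives twice the share of traffic, which is where the speed-up comes from.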
Downsides of multivariate testing
Here are a few potential challenges when running multivariate tests:
- Requires high traffic and page variation: The most difficult challenge in executing multivariate tests is the amount of visitor traffic and page variations required to reach meaningful results. Because of the fully factorial nature of these tests, the number of variations in a test can add up quickly. The result of a many-variation test is that the allocated traffic to each variation is lower. In A/B testing, traffic for an experiment is split in half, with 50 percent of traffic visiting each variation. In a multivariate test, traffic will be split into quarters, sixths, eighths, or even smaller segments, with variations receiving a much smaller portion of traffic than in a simple A/B test.
- Statistical significance: Before running a multivariate test, project the traffic sample size that you will need for each variation to reach a statistically significant result. If traffic to the page you would like to test is low, consider using an A/B test instead of a multivariate test.
- Lack of measurable effect: Sometimes, one or more of the variables being tested do not have a measurable effect on the conversion goal. For instance, if variations of an image on a landing page do not affect the conversion goal, while modifications to a headline do, the test would have been more effectively run as an A/B test rather than a multivariate test.
- Evolving user behavior: As with any kind of testing, it’s important to note that user behavior can differ from page to page, and it might be worth running a multivariate test on a different page with the same elements to verify test results. This will help make sure you’re making the right data-driven decisions with test data and delivering a better user experience.
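To make the traffic-splitting point concrete, the sketch below shows how an even split shrinks the visitors available to each variation as the number of variations grows. The helper name and the 24,000-visitor total are hypothetical:

```python
def visitors_per_variation(total_visitors: int, n_variations: int) -> int:
    """Visitors each variation receives when traffic is split evenly."""
    return total_visitors // n_variations


total = 24_000  # hypothetical traffic to the tested page during the test

# 2 = simple A/B test; 4, 6 and 8 = multivariate tests of increasing size.
for n in (2, 4, 6, 8):
    print(f"{n} variations: {visitors_per_variation(total, n)} visitors each")
# 2 variations: 12000 visitors each
# 4 variations: 6000 visitors each
# 6 variations: 4000 visitors each
# 8 variations: 3000 visitors each
```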
Examples of multivariate testing
Common examples of multivariate tests include:
- Testing text and visual elements on a webpage together
- Testing the number of form fields and CTA text together
- Testing the text and color of a CTA button together instead of focusing on a single element
Multivariate testing is a powerful method of website optimization: it gathers visitor and user data that gives detailed insights into complex customer behavior. The data uncovered in multivariate testing reduces doubt and uncertainty in website optimization. Continuously testing, implementing winning variations and building on testing insights can lead to significant conversion gains.