
You're spending more on ads every quarter, your CAC keeps climbing, and the pages you're sending traffic to aren't converting the way they should. So someone on the team suggests A/B testing. Good instinct, but here's the problem: most brands treat testing like a slot machine. They pick a random element, swap the headline, change a button color, and hope something sticks.
After 12+ years in CRO, one rule hasn't changed: better test ideas come from customer research, not best practices.
This guide breaks down how to run A/B tests that produce reliable, revenue-moving results for your Shopify store, from the research that feeds your hypotheses to the discipline that separates $10M brands from $100M ones.
A/B testing, also known as split testing, compares two or more versions of a web page or landing page. At minimum, you'd create a control version (Version A) and a variant version (Version B). You'd then show these variants to two randomized groups of customers simultaneously to determine which one works best and make website optimizations accordingly.
When it's part of a comprehensive CRO strategy, A/B testing helps identify and validate site changes that drive real business outcomes:
A/B testing is not the only method for determining the impact of website changes, though.
While slightly rarer, and only possible for bigger brands with a lot of traffic, multivariate testing (MVT) is an alternative to split testing.
With A/B testing, you test two or more versions of a page or a specific element to understand user behavior. By extension, you can develop hypotheses about how to influence that behavior to increase average order value or boost conversion rate, for example. These tests can get you insights faster than their multivariate counterparts, and because they typically test significant website changes, they often have a bigger impact.
Alternatively, multivariate testing compares different variants of a webpage or different combinations of changes to see which one performs best. For example, you might want to test combinations of different headlines, subheadlines, hero images, and buttons in the hero section of your site.
These two types of testing complement each other. A/B testing helps brands gather customer insights and develop overarching hypotheses, while MVT can help you further optimize and find the right combination of elements.
A/B/n testing (more than two variants) only makes sense when the variants answer the same question, like copy tweaks, positioning of a new element, or tactical design execution. When variants test different hypotheses (say, a video-focused PDP vs. a long-form storytelling PDP), those are strategies, not executions, and you should run them sequentially.
It must be said, however, that MVT isn't necessary for most brands doing minimal A/B testing. Don't run multivariate tests if your store does fewer than roughly 1,000 orders per month. You'll dilute results and never reach significance. For statistical reasons, MVT requires a big website with heavy traffic, so it's not worthwhile for lower-traffic ecommerce sites.
For instance, while a simple A/B test might only have two variants, an MVT test might have a different variant for each of several changes to test all possible combinations. So, say you have two versions each of a headline, a subheadline, and a button. You'd have the equivalent of eight A/B test variants (2 × 2 × 2 combinations). You'd need a high volume of traffic to get enough eyes on each variant to produce accurate results.
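To make the arithmetic concrete, here is a minimal Python sketch that enumerates every combination for a headline, subheadline, and button with two versions each. The element names and copy are placeholders for illustration, not taken from a real test.

```python
from itertools import product

# Hypothetical copy for each element: two versions apiece.
elements = {
    "headline": ["Headline A", "Headline B"],
    "subheadline": ["Subheadline A", "Subheadline B"],
    "button": ["Button A", "Button B"],
}

# Each combination of one version per element is its own MVT variant.
combinations = list(product(*elements.values()))

print(len(combinations))  # 2 * 2 * 2 = 8
for combo in combinations:
    print(dict(zip(elements.keys(), combo)))
```

With an even split, each combination sees one-eighth of your traffic instead of the half that each side of a simple A/B test would get, which is why MVT needs so many more visitors.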
Testing without research is expensive. You're spending dev time, traffic, and time-to-learn on ideas with no basis. The brands that consistently win with A/B testing aren't the ones running the most tests. They're the ones whose test ideas come from actual customer data instead of hunches or competitor inspiration.
At SplitBase, we use our 3Ps research framework to ensure every test has a reason to exist:
Within Patterns, there are three sub-layers that give you the full picture:
This is also where our Testing Trifecta methodology comes in. The Trifecta combines quantitative and qualitative research to reveal not just what's hindering conversions, but why. Armed with that info, you can develop strong hypotheses to address the real issues instead of guessing at solutions.

Image description: SplitBase's Testing Trifecta methodology
With A/B testing, brands can test nearly anything: header copy, layout, graphics, buttons, offers, shipping price, or entirely new elements. With so many options, it can be challenging to know where to start. Here's a step-by-step guide to conducting A/B tests that produce results you can trust.
You may already have an idea of what you want to test. For example, you might have a landing page with a high bounce rate and think that tweaking the design could reduce it. While this may be true, relying on intuition can waste time and resources. By starting with research, you can develop a data-driven testing strategy.
For a homepage test, we typically use a three-tool approach to generate hypotheses from data, not opinion:
Two hours of this work usually produces 5 to 10 specific hypotheses and turns a coin-flip test into one with 70 to 80% confidence.
This quantitative research reveals the problem.
Qualitative research reveals the root cause of problem areas on your site. Here, brands conduct audience research to identify the desires and pain points affecting conversion.
There are a variety of methods to gather information about your customers, such as customer interviews, surveys, usability testing, and session recordings. Reviewing customer service chat logs is also helpful for understanding customer obstacles, and Shopify stores can capture high-quality feedback through post-purchase surveys, which can be automated with tools like KnoCommerce or Fairing.
By conducting both types of research, brands identify areas for improvement and potential solutions rooted in data. Next, it's time to put those solutions to the test.
Start by using your research to inform your hypothesis. For example, we partnered with Dr. Squatch, a personal care and organic soap brand, to conduct a site-wide audit. During our research, we noticed that many customers purchased several soaps at once. However, the product page lacked a quantity field. Instead, shoppers had to click "add to cart" multiple times. We hypothesized that adding a quantity field would increase quantities purchased and, therefore, boost average order value.

Image description: Adding quantities to Dr. Squatch's product pages.
Before you design a single test, you need to decide what deserves your limited traffic and dev resources. Most brands skip straight to innovation (new layouts, bold redesigns) and wonder why tests fail. The smarter move is to follow a hierarchy:
We prioritize test ideas using our IMPACT Prioritization Framework, which scores each idea across six dimensions: Importance (is it in a critical funnel area with high traffic value?), Motivation (is it supported by qualitative and quantitative findings?), Potential (is the change visible above the fold within five seconds?), Alignment (does it impact the brand's current focus metric?), Creative effort (how much copy and design work is involved?), and Technicality (how much dev time does it require?).
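As a rough illustration of how a scoring framework like this can rank a backlog, here is a minimal Python sketch. The 1-to-5 scale, the equal weighting, the convention that a higher score on the two effort dimensions means less work, and the example ideas and scores are all assumptions made for this sketch. They are not a description of how the framework is scored in practice.

```python
from dataclasses import dataclass


@dataclass
class ExperimentIdea:
    name: str
    importance: int       # critical funnel area with high traffic value?
    motivation: int       # supported by qualitative and quantitative findings?
    potential: int        # change visible above the fold within five seconds?
    alignment: int        # impacts the brand's current focus metric?
    creative_effort: int  # assumed: higher score = less copy and design work
    technicality: int     # assumed: higher score = less dev time

    def impact_score(self) -> int:
        # Assumed equal weighting across the six dimensions.
        return (self.importance + self.motivation + self.potential
                + self.alignment + self.creative_effort + self.technicality)


# Hypothetical backlog with made-up scores on an assumed 1-5 scale.
backlog = [
    ExperimentIdea("Redesign footer links", 2, 1, 1, 2, 3, 3),
    ExperimentIdea("Add quantity field to product page", 5, 5, 4, 5, 4, 4),
]

ranked = sorted(backlog, key=lambda idea: idea.impact_score(), reverse=True)
for idea in ranked:
    print(idea.impact_score(), idea.name)
```

However you weight the dimensions, the point is to score every idea the same way before it gets traffic, so the backlog is ordered by evidence rather than by whoever argued loudest.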
This framework keeps you from burning resources on low-probability tests and focuses your program on the ideas most likely to move revenue.
A/B testing requires careful research and planning to ensure that the results are statistically significant. (Statistical significance means the difference you observe between versions is unlikely to be due to random chance.) Outside events matter too: a press mention or unexpected news event during your testing window may skew your results.
A/B testing is similar to a science experiment. Brands must form a hypothesis, create control and variant samples, and set testing parameters. Here are some elements to focus on when designing your test:
If you didn't have the appropriate sample size or number of variations, or if you didn't run your test long enough, your results would be unreliable. Needless to say, making optimization decisions based on misleading data is more likely to hurt your conversion rate than help it.
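For a rough sense of what an appropriate sample size looks like, here is a minimal Python sketch using the standard normal-approximation formula for comparing two conversion rates. The 3% baseline, the 10% relative lift, and the 80% power setting are assumptions chosen for illustration. Your testing tool's calculator may use a different method and give a different figure.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_variation(baseline_rate: float, relative_lift: float,
                           alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate visitors needed per variation to detect a given lift."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Assumed example: 3% baseline conversion rate, hoping to detect a 10% relative lift.
small_lift = visitors_per_variation(0.03, 0.10)
# Same baseline, but a bolder change expected to produce a 30% relative lift.
big_lift = visitors_per_variation(0.03, 0.30)

print(small_lift, big_lift)
```

Under these assumptions, the 10% lift needs on the order of 50,000 visitors per variation, while the 30% lift needs far fewer. The smaller the expected effect, the more traffic the test consumes.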
Numerous A/B testing tools are available, including Intelligems, Convert, and VWO. These tools simplify creating, running, and analyzing A/B tests without advanced coding skills. Shopify stores looking to test things like shipping thresholds or subscription offers will need specialized apps such as Shipscout and Rebuy.
For tests that involve more than changing text, do not use the visual editor provided by A/B testing tools. They auto-generate code and often cause significant browser compatibility and code issues. Always use developers to code and validate your tests. We never skip developer QA at SplitBase. It's been instrumental in ensuring experiments launch bug-free and in providing us with accurate insights as quickly as possible.
When you're ready to run your test, split your audience into two or more randomized groups (A/B testing software automates this process). Group A will see the original version, and Group B will see the modified version. Ensure all key metrics are tracked and collect data for each group.
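Your testing tool handles the split for you, but the underlying idea is simple enough to sketch. The minimal Python example below hashes a visitor ID so that the same visitor always lands in the same group. The function name, the experiment label, and the use of SHA-256 are assumptions for illustration, not how any particular tool works.

```python
import hashlib


def assign_variant(visitor_id: str, experiment: str,
                   variants: tuple = ("A", "B")) -> str:
    """Deterministically map a visitor to a variant with an even split."""
    # Including the experiment name keeps assignments independent across tests.
    key = f"{experiment}:{visitor_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]


# Hypothetical visitor and experiment identifiers.
first_visit = assign_variant("visitor-1", "pdp-quantity-field")
return_visit = assign_variant("visitor-1", "pdp-quantity-field")
print(first_visit, return_visit)
```

Because the assignment depends only on the visitor ID and the experiment name, a returning visitor keeps seeing the same version, which keeps the two groups clean.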
Here's where discipline separates the $10M brands from the $100M ones. The biggest difference isn't tooling. It's patience. $10M brands check tests daily, want to stop early, and override data with gut feelings. $100M brands set a runtime, follow the plan, and let the data make the decision.
The number one reason tests fail isn't bad ideas. It's not enough traffic. Test fewer, bigger changes on higher-traffic pages. One bold homepage test teaches more than 10 micro-tests on low-traffic category pages.
Once your test is complete, study the results, but make sure you're looking at the right metric. Most brands default to conversion rate, but conversion rate can be gamed by dropping prices or removing upsells, which kills revenue in the process. Revenue per visitor (RPV) captures the full picture: more conversions and maintained or higher AOV.
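Here is a minimal Python sketch of why the two metrics can disagree. All of the figures are made up for illustration: the variant converts more visitors at a lower average order value.

```python
def conversion_rate(orders: int, visitors: int) -> float:
    return orders / visitors


def revenue_per_visitor(revenue: float, visitors: int) -> float:
    return revenue / visitors


# Hypothetical results: the variant drops prices and converts more visitors.
control = {"visitors": 10_000, "orders": 300, "revenue": 24_000.0}
variant = {"visitors": 10_000, "orders": 330, "revenue": 23_100.0}

for name, group in (("control", control), ("variant", variant)):
    cr = conversion_rate(group["orders"], group["visitors"])
    rpv = revenue_per_visitor(group["revenue"], group["visitors"])
    aov = group["revenue"] / group["orders"]
    print(f"{name}: CR={cr:.2%}  AOV=${aov:.2f}  RPV=${rpv:.2f}")
```

In this made-up case the variant "wins" on conversion rate but earns less per visitor, because RPV is conversion rate multiplied by AOV and the drop in order value outweighs the extra orders.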
Look at the metrics you pre-determined were key to your experiment. Was your hypothesis correct? If not, what else did you learn? Most A/B testing tools include analytics and reporting features that help you answer these questions.
A/B testing is an ongoing process, and every test is both a business decision and a research instrument. Industry win rate is about 1 in 7. A research-driven program runs 20 to 40%. But a win rate that's too high isn't necessarily good either, because it means you're not taking enough risk and you're just testing things you should implement outright.
A losing test that teaches you customers don't care about social proof on a particular page is more valuable than a winning test you can't explain. One experiment can lead to new questions and tests, so use your learnings to tweak and test different elements and maximize the effectiveness of your Shopify store.
To see results from split testing, you have to be smart about what you choose to experiment on. With A/B testing, brands can evaluate a variety of elements, including header copy, website layout, graphics, buttons, colors, and the checkout process. Here, we'll explore specific A/B tests worth running for your Shopify store.
Shipping thresholds are the minimum purchase amounts customers must meet to qualify for free or discounted shipping, and adjusting them can reduce cart abandonment while increasing AOV. Your shipping threshold will vary by your industry, product type, margins, and shipping costs. Start by looking at your average order value and determine a minimum purchase amount just above it.
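As a starting point, here is a minimal Python sketch that computes AOV from a list of order totals and proposes a few thresholds just above it. The order totals, the 10%, 20%, and 30% uplifts, and the rounding to the nearest $5 are assumptions for illustration. Your margins and shipping costs should drive the real candidates.

```python
from math import ceil


def candidate_thresholds(order_totals, uplifts=(0.10, 0.20, 0.30), round_to=5):
    """Return the AOV and thresholds set slightly above it, rounded up."""
    aov = sum(order_totals) / len(order_totals)
    thresholds = [ceil(aov * (1 + uplift) / round_to) * round_to
                  for uplift in uplifts]
    return aov, thresholds


# Hypothetical recent order totals in dollars.
orders = [42.0, 58.0, 61.0, 75.0, 64.0]
aov, thresholds = candidate_thresholds(orders)

print(aov, thresholds)  # 60.0 [70, 75, 80]
```

Each candidate then becomes a variant to test, rather than a number to roll out on a hunch.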
Use A/B testing to evaluate how your shipping thresholds will perform. For example, half your visitors receive free shipping on all orders while the other half receive free shipping on orders over $75.
Upselling and cross-selling are also valuable tools to increase AOV. Offering product bundling or subscriptions can motivate customers to buy more. For example, Curlsmith gives customers the option of a one-time purchase or a subscription at a discounted rate.

Image source: Curlsmith
Offering different quantities can also increase AOV. In the case of Dr. Squatch, we hypothesized that if the product page defaulted to two soaps, it would increase the revenue per user. Our A/B tests showed a 54% increase in revenue per user with the proposed change.

Image description: Dr. Squatch's product pages default to two bars of soap.
You can also A/B test where you cross-sell and upsell. Some places to consider adding cross-selling or upselling opportunities include:

Image description: Everlane displays a carousel of product suggestions on its cart pages.
Many brands use a mix of the above, which can provide some great inspiration for your own upsell and cross-sell flows. But, of course, testing is essential to understanding what your target customers respond to best.
Customers can sometimes be overwhelmed by the number of options on a website, and personalized product recommendations combat this decision fatigue. A/B testing helps brands determine the best way to suggest products to visitors, whether through messaging, product placement, product recommendation quizzes, or an AI-powered chat.

Image description: Curlsmith uses a pop-up to guide users toward a product quiz.

Image description: Haircare brand Amika includes an AI-powered chat on its homepage to suggest relevant product recommendations.
Product detail pages should help customers understand a product's value and guide them to the next stage of the shopping process, with clear calls to action (CTAs) that motivate purchase. These pages include a product's description, specifications, color options, sizing options, images, price, customer reviews, and more. A/B testing the PDP layout, copy, or other elements is key to optimizing these pages and driving more conversions.
For example, we used testing to help hair extension brand INH increase conversions by 26%. Using our Testing Trifecta process, which is rooted in deep customer research, we found that shoppers were confused about how to use the brand's products. They weren't reading the product descriptions on INH's product pages.
After testing various approaches to product demonstrations, including videos and images, we found that a combination of three GIFs worked best to boost conversion and return on ad spend (ROAS).

Image description: Including a product demonstration in INH's product detail pages.
That doesn't mean three GIFs on the product page are a universal solution for every brand with the same problem INH had. Ultimately, the brands that have the greatest success with testing and optimization develop test ideas by researching what does and doesn't work for their specific brands and customers.
A/B testing is deceptively simple, but if tests yield unreliable data, you could make changes to your Shopify store that don't align with customer preferences, threatening loyalty and revenue. Here are five best practices to follow.
All of the above can make or break your conversion rate optimization strategy.
To be clear, using a research-driven A/B testing framework doesn't guarantee that every hypothesis you come up with will be spot on. But it increases your chances of identifying effective solutions faster, and it compounds over time.
Our homepage "best path to purchase" approach alone has generated 6-figure lifts per month across beauty, fragrance, fashion, and apparel brands. And our win rate consistently runs above the 20-30% industry average because every test starts with research, not guesswork.
Don't forget that both winning and losing tests are valuable, as long as you learn from them. If an experiment increases conversions, great. Document the results and make a note of what you learned. If an A/B test doesn't work, analyze what went wrong and what did work. Tweak your hypothesis accordingly, relaunch the experiment, and compare the results. Then, rinse and repeat to get closer to your conversion goal.
A/B testing is a powerful tool for ecommerce brands to keep up with evolving customer preferences, improve the customer experience, increase ROI, and drive revenue. But creating a thoughtful A/B testing strategy is critical. That's where SplitBase comes in.
SplitBase is the leading conversion optimization and landing page agency for ecommerce brands. We've spent 10+ years working exclusively with DTC ecommerce brands doing 8 and 9 figures in revenue. We provide full-site A/B testing for your Shopify store, boosting customer acquisition, AOV, and conversions. Get a free ecommerce CRO proposal today to see how we can help optimize your Shopify store.
Most ecommerce brands run between 24 and 60 A/B tests per year, but volume isn't the point. A research-driven program that runs 20 focused tests will outperform 60 random ones because every hypothesis is rooted in actual customer data instead of gut instinct.
Revenue per visitor (RPV) beats conversion rate as your primary metric. Conversion rate can be gamed by dropping prices or removing upsells, which kills revenue. RPV captures the full picture because it accounts for both conversion volume and average order value in a single number.
Run tests for at least three to four full weeks, even if you hit your sample size early. You want at least 100 conversions per variation, 95% statistical significance, and enough time for weekly fluctuations to settle. Stopping early is the fastest way to act on data that isn't real.
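The three conditions above can be sketched as a single check. This minimal Python example uses a two-proportion z-test on conversion counts. The 21-day minimum (the low end of three to four weeks), the function names, and the example figures are assumptions for illustration, and your testing tool may calculate significance differently.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-proportion z-test using the pooled conversion rate."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


def ready_to_call(conv_a, n_a, conv_b, n_b, days_run,
                  min_days=21, min_conversions=100, alpha=0.05) -> bool:
    """True only when runtime, conversion volume, and significance all pass."""
    return (days_run >= min_days
            and min(conv_a, conv_b) >= min_conversions
            and two_sided_p_value(conv_a, n_a, conv_b, n_b) < alpha)


# Hypothetical results: 300 vs. 360 conversions from 10,000 visitors each.
print(ready_to_call(300, 10_000, 360, 10_000, days_run=10))
print(ready_to_call(300, 10_000, 360, 10_000, days_run=28))
```

The same numbers fail the check on day 10 and pass on day 28, so a significant result alone isn't enough until the runtime condition is also met.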
The number one reason isn't bad ideas. It's not enough traffic. Brands spread thin by running multiple micro-tests on low-traffic pages instead of running fewer, bigger tests on high-traffic pages where you can reach significance faster and learn something meaningful.
Yes, but you need to be strategic about it. If your store gets limited traffic, focus on testing bold, high-impact changes on your highest-traffic pages rather than subtle tweaks across many pages. And skip multivariate testing entirely until you're doing at least 1,000 orders per month, because you won't have the traffic to reach significance.
A/B testing compares two or more versions of a single element or page, while multivariate testing compares combinations of multiple changes at once. A/B testing requires less traffic and gives faster results. Multivariate testing is useful for fine-tuning after you've validated the bigger strategic bets, but it requires significantly more traffic to produce reliable data.