Shopify A/B Testing for 8 and 9 Figure DTC Brands: The Complete Guide

By Raphael Paulin-Daigle, Founder and CEO of SplitBase

You're spending more on ads every quarter, your CAC keeps climbing, and the pages you're sending traffic to aren't converting the way they should. So someone on the team suggests A/B testing. Good instinct, but here's the problem: most brands treat testing like a slot machine. They pick a random element, swap the headline, change a button color, and hope something sticks.

After 12+ years in CRO, one rule hasn't changed: better test ideas come from customer research, not best practices.

This guide breaks down how to run A/B tests that produce reliable, revenue-moving results for your Shopify store, from the research that feeds your hypotheses to the discipline that separates $10M brands from $100M ones.

What is A/B testing?

A/B testing, also known as split testing, compares two or more versions of a web page or landing page. At minimum, you'd create a control version (Version A) and a variant version (Version B). You'd then show these variants to two randomized groups of customers simultaneously to determine which one works best and make website optimizations accordingly.

When it's part of a comprehensive CRO strategy, A/B testing helps identify and validate site changes that drive real business outcomes:

  • Increased conversions: CRO aims to get more of your ideal customers to follow through with making a purchase when they visit your online store.
  • Boosted average order value (AOV): You can test different versions of product bundles, promotions, and pricing strategies to get shoppers to spend more per order.
  • Reduced bounce rates: Testing different elements is one step in ensuring that your website, landing pages included, grabs shoppers' attention and aligns with their needs.
  • Enhanced customer experience: A/B testing can help you identify aspects of your current user experience that cause frustration or confusion, or that could simply be more enjoyable or memorable. You can then use your test results to deliver better experiences.

A/B testing is not the only method for determining the impact of website changes, though.

A/B testing vs. multivariate testing

While slightly rarer, and only possible for bigger brands with a lot of traffic, multivariate testing (MVT) is an alternative to split testing.

With A/B testing, you test two or more versions of a page or a specific element to understand user behavior. By extension, you can develop hypotheses about how to influence that behavior to increase average order value or boost conversion rate, for example. These tests can get you insights faster than their multivariate counterparts, and because they typically test significant website changes, they often have a bigger impact.

Alternatively, multivariate testing compares different variants of a webpage or different combinations of changes to see which one performs best. For example, you might want to test combinations of different headlines, subheadlines, hero images, and buttons in the hero section of your site.

These two types of testing complement each other. A/B testing helps brands gather customer insights and develop overarching hypotheses, while MVT can help you further optimize and find the right combination of elements.

A/B/n testing (more than two variants) only makes sense when the variants answer the same question, like copy tweaks, positioning of a new element, or tactical design execution. When variants test different hypotheses (say, a video-focused PDP vs. a long-form storytelling PDP), those are strategies, not executions, and you should run them sequentially.

It must be said, however, that MVT isn't necessary for most brands doing minimal A/B testing. Don't run multivariate tests if your store does fewer than roughly 1,000 orders per month. You'll dilute results and never reach significance. For statistical reasons, MVT requires a big website with heavy traffic, so it's not worthwhile for lower-traffic ecommerce sites.

For instance, while a simple A/B test might only have two variants, a multivariate test needs a variant for every possible combination of changes. So, say you have two versions each of a headline, a subheadline, and a button. That's 2 × 2 × 2 = 8 combinations, the equivalent of eight A/B test variants. You'd need a high volume of traffic to get enough eyes on each variant to produce accurate results.
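To see how quickly combinations pile up, here is a minimal Python sketch. The element options and the monthly visitor figure are made up for illustration; the only point is that the number of variants is the product of the options per element, and each variant gets a smaller slice of traffic.

```python
from itertools import product

# Hypothetical options for a hero-section multivariate test.
elements = {
    "headline": ["Headline A", "Headline B"],
    "subheadline": ["Subheadline A", "Subheadline B"],
    "button": ["Button A", "Button B"],
}

# Every combination of one option per element is its own variant.
combinations = list(product(*elements.values()))
print(len(combinations))  # 2 * 2 * 2 = 8

# Traffic per variant shrinks as the number of combinations grows.
monthly_visitors = 80_000  # assumed figure, for illustration only
print(monthly_visitors // len(combinations))  # 10000 visitors per variant
```

Adding a third option to any one element would raise the count to 12, which is why MVT gets expensive in traffic so fast.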

Research first: the foundation of every test worth running

Testing without research is expensive. You're spending dev time, traffic, and time-to-learn on ideas with no basis. The brands that consistently win with A/B testing aren't the ones running the most tests. They're the ones whose test ideas come from actual customer data instead of hunches or competitor inspiration.

At SplitBase, we use our 3Ps research framework to ensure every test has a reason to exist:

  • Patterns: behavioral, voice, and resistance research that tells you what's working and what's not
  • Proof: testing itself, framed as research in its true form
  • Perception: the competitive landscape and brand positioning

Within Patterns, there are three sub-layers that give you the full picture:

  • Behavioral patterns (the quantitative what and where): GA4, heatmaps, scroll maps. The key question here is whether the issue is affecting the majority of customers or only a small percentage.
  • Voice patterns (the qualitative why): post-purchase open-ended surveys, customer interviews, website polls, review and comment mining. The most-used method at SplitBase is the post-purchase qualitative survey.
  • Resistance patterns: technical and usability friction that causes abandonment.

This is also where our Testing Trifecta methodology comes in. The Trifecta combines quantitative and qualitative research to reveal not just what's hindering conversions, but why. Armed with that info, you can develop strong hypotheses to address the real issues instead of guessing at solutions.

Image description: SplitBase's Testing Trifecta methodology

How to A/B test your Shopify store

With A/B testing, brands can test nearly anything: header copy, layout, graphics, buttons, offers, shipping price, or entirely new elements. With so many options, it can be challenging to know where to start. Here's a step-by-step guide to conducting A/B tests that produce results you can trust.

1. Conduct quantitative research

You may already have an idea of what you want to test. For example, you might have a landing page with a high bounce rate and think that tweaking the design could reduce it. While this may be true, relying on intuition can waste time and resources. By starting with research, you can develop a data-driven testing strategy.

For a homepage test, we typically use a three-tool approach to generate hypotheses from data, not opinion:

  1. GA4 free-form reports: page paths, revenue per user by channel and device, and session key event rate. What converts on paid mobile is often very different from what converts on organic desktop.
  2. Shopify analytics: top products first-time customers buy, cross-referenced with LTV. Your most popular product isn't always your best acquisition product.
  3. Heatmap tools: revenue per click, not just clicks or engagement. An element getting 5% of clicks but 30% of revenue should move up the page.

Two hours of this work usually produces 5 to 10 specific hypotheses and turns a coin-flip test into one with 70 to 80% confidence.

This quantitative research reveals the problem.

2. Conduct qualitative research

Qualitative research reveals the root cause of problem areas on your site. Here, brands conduct audience research to identify the desires and pain points affecting conversion.

There are a variety of methods to gather information about your customers, such as customer interviews, surveys, usability testing, and session recordings. Reviewing customer service chat logs is also helpful for understanding customer obstacles, and Shopify stores can capture high-quality feedback through post-purchase surveys, which can be automated with tools like KnoCommerce or Fairing.

3. Form a hypothesis

By conducting both types of research, brands identify areas for improvement and potential solutions rooted in data. Next, it's time to put those solutions to the test.

Start by using your research to inform your hypothesis. For example, we partnered with Dr. Squatch, a personal care and organic soap brand, to conduct a site-wide audit. During our research, we noticed that many customers purchased several soaps at once. However, the product page lacked a quantity field. Instead, shoppers had to click "add to cart" multiple times. We hypothesized that adding a quantity field would increase quantities purchased and, therefore, boost average order value.

Image description: Adding quantities to Dr. Squatch's product pages.

4. Prioritize what to test (the CRO Hierarchy)

Before you design a single test, you need to decide what deserves your limited traffic and dev resources. Most brands skip straight to innovation (new layouts, bold redesigns) and wonder why tests fail. The smarter move is to follow a hierarchy:

  1. Fix what's broken: bugs, UX issues, page speed problems. These aren't glamorous, but they're eating your conversions right now.
  2. Optimize what's working: high-traffic pages and key conversion points where even a small lift produces meaningful revenue.
  3. Innovate: new layouts, offers, and funnels. This is where the big swings live, but only after the foundation is solid.

We prioritize test ideas using our IMPACT Prioritization Framework, which scores each idea across six dimensions: Importance (is it in a critical funnel area with high traffic value?), Motivation (is it supported by qualitative and quantitative findings?), Potential (is the change visible above the fold within five seconds?), Alignment (does it impact the brand's current focus metric?), Creative effort (how much copy and design work is involved?), and Technicality (how much dev time does it require?).

This framework keeps you from burning resources on low-probability tests and focuses your program on the ideas most likely to move revenue.
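If you want to rank a backlog this way in a spreadsheet or script, a rough sketch could look like the following. The 1-to-5 scale, the equal weighting, and the two example ideas are assumptions made for illustration, not SplitBase's actual scoring rules. Creative effort and Technicality are scored so that a higher number means less work.

```python
from dataclasses import dataclass


@dataclass
class Idea:
    """One test idea scored 1-5 on each IMPACT dimension (assumed scale)."""
    name: str
    importance: int
    motivation: int
    potential: int
    alignment: int
    creative_effort: int  # assumption: 5 = very little copy/design work
    technicality: int     # assumption: 5 = very little dev time

    def impact_score(self) -> int:
        # Assumption: equal weights, simple sum across the six dimensions.
        return (
            self.importance + self.motivation + self.potential
            + self.alignment + self.creative_effort + self.technicality
        )


# Hypothetical backlog entries.
ideas = [
    Idea("Redesign footer links", 1, 2, 1, 2, 4, 5),
    Idea("Add quantity selector to PDP", 5, 5, 4, 5, 4, 4),
]

ranked = sorted(ideas, key=lambda idea: idea.impact_score(), reverse=True)
for idea in ranked:
    print(idea.name, idea.impact_score())
# Add quantity selector to PDP 27
# Redesign footer links 15
```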

5. Design your test

A/B testing requires careful research and planning to ensure that the results are statistically significant, meaning the difference you observe is unlikely to be explained by random chance alone. Outside events can distort results too: a press mention or unexpected news event during your testing window may skew your data.

A/B testing is similar to a science experiment. Brands must form a hypothesis, create control and variant samples, and set testing parameters. Here are some elements to focus on when designing your test:

  • The sample size: Start by identifying the right sample size to ensure your results are statistically significant. Many A/B testing tools will do this for you. Optimizely's Sample Size Calculator or CXL's Pre-Test Calculator are useful resources.
  • Number of variations: Your sample size will affect the number of variations you can test. For example, you'd need a larger sample size to test five variations of a landing page than you would if you planned to test only two variants.
  • The duration of the test: The length of the tests will vary based on your sample size and number of variations. Remember to always conduct tests in full-week increments and test over at least two business cycles.

If you didn't have the appropriate sample size or number of variations, or if you didn't run your test long enough, your results would be unreliable. Needless to say, making optimization decisions based on misleading data is more likely to hurt your conversion rate than help it.
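For a feel of what those calculators do under the hood, here is a minimal sketch using the standard two-proportion sample size approximation. It is not the exact method of any tool named above. The 95% significance level, 80% power, the three-week floor, and the example traffic numbers are all assumptions you would adjust.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate visitors needed per variant to detect a relative lift."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def duration_in_full_weeks(n_per_variant, variants, weekly_visitors, minimum_weeks=3):
    """Round the runtime up to whole weeks, with an assumed minimum runtime."""
    weeks = math.ceil(n_per_variant * variants / weekly_visitors)
    return max(weeks, minimum_weeks)


# Hypothetical store: 3% baseline conversion rate, hoping to detect a 10% relative lift.
n = sample_size_per_variant(baseline_rate=0.03, relative_lift=0.10)
print(n)
print(duration_in_full_weeks(n, variants=2, weekly_visitors=20_000))
print(duration_in_full_weeks(n, variants=5, weekly_visitors=20_000))
```

Running the last two lines side by side shows why more variations mean a longer test on the same traffic.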

6. Choose an A/B testing tool

Numerous A/B testing tools are available, including Intelligems, Convert, and VWO. These tools simplify creating, running, and analyzing A/B tests without advanced coding skills. Shopify stores looking to test things like shipping thresholds or subscription offers will need specialized apps such as Shipscout and Rebuy.

For tests that involve more than changing text, do not use the visual editor provided by A/B testing tools. They auto-generate code and often cause significant browser compatibility and code issues. Always use developers to code and validate your tests. We never skip developer QA at SplitBase. It's been instrumental in ensuring experiments launch bug-free and in providing us with accurate insights as quickly as possible.

7. Run your A/B test (with discipline)

When you're ready to run your test, split your audience into two or more randomized groups (A/B testing software automates this process). Group A will see the original version, and Group B will see the modified version. Ensure all key metrics are tracked and collect data for each group.
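Your testing tool handles the split for you, but it helps to understand the idea. One common approach is to hash a visitor ID so the same person always lands in the same group. The sketch below illustrates that idea only; it is not how any specific tool works, and the visitor ID format and experiment name are hypothetical.

```python
import hashlib


def assign_variant(visitor_id: str, experiment: str,
                   variants=("control", "variant_b")) -> str:
    """Deterministically map a visitor to a variant for one experiment."""
    key = f"{experiment}:{visitor_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest, 16) % len(variants)  # roughly even split
    return variants[bucket]


# The same visitor always sees the same version of the page.
print(assign_variant("visitor-123", "pdp-quantity-field"))
print(assign_variant("visitor-123", "pdp-quantity-field"))

# Across many visitors, the groups come out close to 50/50.
counts = {"control": 0, "variant_b": 0}
for i in range(10_000):
    counts[assign_variant(f"visitor-{i}", "pdp-quantity-field")] += 1
print(counts)
```

Including the experiment name in the hash keeps a visitor's group in one test from determining their group in the next.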

Here's where discipline separates the $10M brands from the $100M ones. The biggest difference isn't tooling. It's patience. $10M brands check tests daily, want to stop early, and override data with gut feelings. $100M brands set a runtime, follow the plan, and let the data make the decision.

The number one reason tests fail isn't bad ideas. It's not enough traffic. Test fewer, bigger changes on higher-traffic pages. One bold homepage test teaches more than 10 micro-tests on low-traffic category pages.

8. Analyze the results (measure RPV, not just conversion rate)

Once your test is complete, study the results, but make sure you're looking at the right metric. Most brands default to conversion rate, but conversion rate can be gamed by dropping prices or removing upsells, and it kills revenue in the process. Revenue per visitor (RPV) captures the full picture: more conversions and maintained or higher AOV.

Look at the metrics you pre-determined were key to your experiment. Was your hypothesis correct? If not, what else did you learn? Most A/B testing tools include analytics and reporting features that help you answer these questions.
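Here is a quick worked example of why revenue per visitor matters. The numbers are invented: imagine a variant that removes an upsell, so more people buy but each order is smaller.

```python
def summarize(visitors: int, orders: int, revenue: float) -> dict:
    """Compute conversion rate, AOV, and revenue per visitor (RPV)."""
    return {
        "conversion_rate": orders / visitors,
        "aov": revenue / orders,
        "rpv": revenue / visitors,
    }


# Hypothetical results for illustration only.
control = summarize(visitors=10_000, orders=300, revenue=18_000.0)
variant = summarize(visitors=10_000, orders=330, revenue=17_160.0)

for name, result in (("control", control), ("variant", variant)):
    print(
        f"{name}: CR {result['conversion_rate']:.2%}, "
        f"AOV ${result['aov']:.2f}, RPV ${result['rpv']:.2f}"
    )
# control: CR 3.00%, AOV $60.00, RPV $1.80
# variant: CR 3.30%, AOV $52.00, RPV $1.72
```

Judged on conversion rate, the variant looks like a 10% win. Judged on RPV, it loses money on every visitor.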

9. Frame outcomes as learning, then iterate

A/B testing is an ongoing process, and every test is both a business decision and a research instrument. Industry win rate is about 1 in 7. A research-driven program runs 20 to 40%. But a win rate that's too high isn't necessarily good either, because it means you're not taking enough risk and you're just testing things you should implement outright.

A losing test that teaches you customers don't care about social proof on a particular page is more valuable than a winning test you can't explain. One experiment can lead to new questions and tests, so use your learnings to tweak and test different elements and maximize the effectiveness of your Shopify store.

What should you A/B test?

To see results from split testing, you have to be smart about what you choose to experiment on. With A/B testing, brands can evaluate a variety of elements, including header copy, website layout, graphics, buttons, colors, and the checkout process. Here, we'll explore specific A/B tests worth running for your Shopify store.

How to test shipping thresholds

Shipping thresholds are the minimum purchase amounts customers must meet to qualify for free or discounted shipping, and adjusting them can reduce cart abandonment while increasing AOV. Your shipping threshold will vary by your industry, product type, margins, and shipping costs. Start by looking at your average order value and determine a minimum purchase amount just above it.

Use A/B testing to evaluate how your shipping thresholds will perform. For example, half your visitors receive free shipping on all orders while the other half receive free shipping on orders over $75.

How to test upsells and cross-sells

Upselling and cross-selling are also valuable tools to increase AOV. Offering product bundling or subscriptions can motivate customers to buy more. For example, Curlsmith gives customers the option of a one-time purchase or a subscription at a discounted rate.

Curlsmith subscription option

Image source: Curlsmith

Offering different quantities can also increase AOV. In the case of Dr. Squatch, we hypothesized that if the product page defaulted to two soaps, it would increase the revenue per user. Our A/B tests showed a 54% increase in revenue per user with the proposed change.

Image description: Dr. Squatch's product pages default to two bars of soap.

You can also A/B test where you cross-sell and upsell. Some places to consider adding cross-selling or upselling opportunities include:

  • Product pages: Include "Frequently Purchased Together" or "Customers Also Bought" sections on your product pages.
  • Cart page: For example, a fashion brand could use a "Complete the Look" section on its cart page to recommend related products.
  • Checkout pages: Provide customers with opportunities to add relevant products to their order before finalizing the purchase.
  • Thank-you pages: Use order confirmation pages to offer special offers and suggest complementary products.

Everlane product suggestions carousel in cart

Image description: Everlane displays a carousel of product suggestions on its cart pages.

Many brands use a mix of the above, which can provide some great inspiration for your own upsell and cross-sell flows. But, of course, testing is essential to understanding what your target customers respond to best.

How to test product recommendations

Customers can sometimes be overwhelmed by the number of options on a website, and personalized product recommendations combat this decision fatigue. A/B testing helps brands determine the best way to suggest products to visitors. This includes messaging, product placement, product recommendation quizzes, or using an AI-powered chat.

Curlsmith pop-up guiding users to a product quiz

Image description: Curlsmith uses a pop-up to guide users toward a product quiz.

Amika AI-powered chat for product recommendations

Image description: Haircare brand Amika includes an AI-powered chat on its homepage to suggest relevant product recommendations.

How to test product detail pages (PDPs)

Product detail pages should help customers understand a product's value and guide them to the next stage of the shopping process, with clear calls to action (CTAs) that motivate purchase. These pages include a product's description, specifications, color options, sizing options, images, price, customer reviews, and more. A/B testing the PDP layout, copy, or other elements is key to optimizing these pages and driving more conversions.

For example, we used testing to help hair extension brand INH increase conversions by 26%. Using our Testing Trifecta process, which is rooted in deep customer research, we found that shoppers were confused about how to use the brand's products. They weren't reading the product descriptions on INH's product pages.

After testing various approaches to product demonstrations, including videos and images, we found that a combination of three GIFs worked best to boost conversion and return on ad spend (ROAS).

Product demonstration in INH's product detail pages

Image description: Including a product demonstration in INH's product detail pages.

But this example is not to say that those three GIFs on the product page are a universal solution for brands that have the same problem INH did. Ultimately, the brands that have the greatest success with testing and optimization develop test ideas by researching what does and doesn't work for their specific brands and customers.

A/B testing tips and best practices

A/B testing is deceptively simple, but if tests yield unreliable data, you could make changes to your Shopify store that don't align with customer preferences, threatening loyalty and revenue. Here are five best practices to follow.

  1. Maintain a balanced approach: It can be tempting to focus only on tests that you think will yield dramatic results (e.g., redesigning your home page). But smaller tests, such as testing how you display your shipping fees, can be just as important and inform your overall messaging strategy. When prioritizing A/B tests, include a mix of large and small experiments.
  2. Get the timing right: We recommend running tests for at least three to four weeks, even if you reach the suggested sample size. Brands should see at least 100 conversions per variation and test for full weeks at a time, as performance can vary by day of the week.
  3. Keep your A/B tests focused: Stick to changes that directly relate to your hypothesis or a single problem you're looking to solve. To illustrate, say you find that customers don't buy because of a lack of trust. If you were to change your entire page layout, you wouldn't be able to pinpoint what tweaks increased or decreased that trust. On the flip side, focusing only on elements that could help build trust would let you see more clearly what works and what doesn't.
  4. Segment your audience: When analyzing data, segment your audience by criteria such as demographics, location, or behavior (e.g., new vs. returning visitors). This will enable you to understand how different audiences respond to your changes.
  5. Track your A/B tests: Record each test, including the hypothesis, control group, variation group, results, and insights. Keeping this log ensures you don't unknowingly run the same test twice and helps you refine your approach for future tests.

All of the above can make or break your conversion rate optimization strategy.
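The timing rules above are easy to turn into a checklist you run before calling a test. This is a small sketch of that idea, assuming a three-week minimum and 100 conversions per variation as the thresholds; the function name and its inputs are hypothetical.

```python
def ready_to_call(days_running: int, conversions_per_variation: list[int],
                  min_weeks: int = 3, min_conversions: int = 100) -> bool:
    """Return True only when every timing and volume rule is satisfied."""
    full_weeks = days_running % 7 == 0            # ends on a full-week boundary
    long_enough = days_running >= min_weeks * 7   # minimum runtime reached
    enough_conversions = all(
        count >= min_conversions for count in conversions_per_variation
    )
    return full_weeks and long_enough and enough_conversions


print(ready_to_call(21, [150, 140]))  # True
print(ready_to_call(14, [500, 480]))  # False: hit volume early, still too short
print(ready_to_call(24, [150, 140]))  # False: not a full week
print(ready_to_call(28, [150, 90]))   # False: one variation is under 100 conversions
```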

The payoff you can expect from research-driven testing

To be clear, using a research-driven A/B testing framework doesn't guarantee that every hypothesis you come up with will be spot on. But it increases your chances of identifying effective solutions faster, and it compounds over time.

Our homepage "best path to purchase" approach alone has generated 6-figure lifts per month across beauty, fragrance, fashion, and apparel brands. And our win rate consistently runs above the industry average because every test starts with research, not guesswork.

Don't forget that both winning and losing tests are valuable, as long as you learn from them. If an experiment increases conversions, great. Document the results and make a note of what you learned. If an A/B test doesn't work, analyze what went wrong and what did work. Tweak your hypothesis accordingly, relaunch the experiment, and compare the results. Then, rinse and repeat to get closer to your conversion goal.

Optimize your Shopify store

A/B testing is a powerful tool for ecommerce brands to keep up with evolving customer preferences, improve the customer experience, increase ROI, and drive revenue. But creating a thoughtful A/B testing strategy is critical. That's where SplitBase comes in.

SplitBase is the leading conversion optimization and landing page agency for ecommerce brands. We've spent 10+ years working exclusively with DTC ecommerce brands doing 8 and 9 figures in revenue. We provide full-site A/B testing for your Shopify store, boosting customer acquisition, AOV, and conversions. Get a free ecommerce CRO proposal today to see how we can help optimize your Shopify store.

Frequently asked questions

How many A/B tests should an ecommerce brand run per year?

Most ecommerce brands run between 24 and 60 A/B tests per year, but volume isn't the point. A research-driven program that runs 20 focused tests will outperform 60 random ones because every hypothesis is rooted in actual customer data instead of gut instinct.

What's the most important metric to track in an A/B test?

Revenue per visitor (RPV) beats conversion rate as your primary metric. Conversion rate can be gamed by dropping prices or removing upsells, which kills revenue. RPV captures the full picture because it accounts for both conversion volume and average order value in a single number.

How long should you run an A/B test on Shopify?

Run tests for at least three to four full weeks, even if you hit your sample size early. You want at least 100 conversions per variation, 95% statistical significance, and enough time for weekly fluctuations to settle. Stopping early is the fastest way to act on data that isn't real.
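If you're curious what "95% statistical significance" means in practice, the sketch below runs a standard two-proportion z-test on conversion counts. It is a simplified illustration with made-up numbers, not the exact statistics your testing tool uses, and it applies to conversion rate only (revenue per visitor needs a different kind of test).

```python
import math
from statistics import NormalDist


def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical results: 10,000 visitors per group.
clear_gap = two_proportion_p_value(300, 10_000, 360, 10_000)
small_gap = two_proportion_p_value(300, 10_000, 320, 10_000)

# A p-value below 0.05 corresponds to the 95% significance threshold.
print(clear_gap < 0.05)  # True
print(small_gap < 0.05)  # False
```

Passing this check is one condition, not the finish line. The runtime and conversion minimums still apply.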

Why do most A/B tests fail?

The number one reason isn't bad ideas. It's not enough traffic. Brands spread themselves thin by running multiple micro-tests on low-traffic pages instead of running fewer, bigger tests on high-traffic pages, where they can reach significance faster and learn something meaningful.

Can small Shopify stores benefit from A/B testing?

Yes, but you need to be strategic about it. If your store gets limited traffic, focus on testing bold, high-impact changes on your highest-traffic pages rather than subtle tweaks across many pages. And skip multivariate testing entirely until you're doing at least 1,000 orders per month, because you won't have the traffic to reach significance.

What's the difference between A/B testing and multivariate testing?

A/B testing compares two or more versions of a single element or page, while multivariate testing compares combinations of multiple changes at once. A/B testing requires less traffic and gives faster results. Multivariate testing is useful for fine-tuning after you've validated the bigger strategic bets, but it requires significantly more traffic to produce reliable data.
