Split Testing with Confidence | ThomasARTS



Constant testing in marketing campaigns is vital for optimizing performance. It helps us understand which platforms, audiences, creative, and back-end development should receive focused attention and budget.

The more we test our strategies through trial and error, the higher our confidence in recommending winning campaigns. We can analyze standard campaigns and observe which changes make the greatest impact. However, to be certain those observations are significant enough to justify a change in campaign practices, we suggest putting campaigns through a controlled test such as a split, or A/B, test.

A split, or A/B, test pits one concept against a copy of the same concept that differs by a single variable. Tests can be run on social or digital channels. The winning variation can then be carried into future campaigns as a best practice.

Best practices

Split testing divides your audience into random, non-overlapping groups. Each group is then shown one version of an ad, and the versions are identical except for one differentiating variable. The key to these tests is that the results are evaluated for statistical significance.
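As a rough illustration of what "random, non-overlapping groups" means, here is a minimal Python sketch. Ad platforms that offer split testing usually handle this assignment for you, so the function name, the seed, and the made-up user IDs below are assumptions for illustration only.

```python
import random

def split_audience(user_ids, seed=42):
    """Randomly split an audience into two non-overlapping groups."""
    ids = list(dict.fromkeys(user_ids))  # drop duplicate IDs, keep order
    rng = random.Random(seed)            # fixed seed makes the split reproducible
    rng.shuffle(ids)                     # random order removes any ordering bias
    midpoint = len(ids) // 2
    return ids[:midpoint], ids[midpoint:]

# Hypothetical audience of 1,000 users
audience = [f"user_{i}" for i in range(1000)]
group_a, group_b = split_audience(audience)
```

Because every user lands in exactly one group, no one can see both versions of the ad, which keeps the comparison clean.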

When we claim that a result is statistically significant, we are claiming that it is unlikely to be the product of chance alone. Tests should seek a high level of confidence that the difference in results occurred because of the change in variable and not because of random variation.
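One common way to put a number on this, assuming the KPI is a rate such as clicks per impression, is a pooled two-proportion z-test. The symbols are illustrative: x is the number of results and n is the number of people reached in groups A and B, and Φ is the standard normal cumulative distribution function. Reporting confidence as one minus the two-sided p-value is a common shorthand, used here as an assumption.

```latex
\hat{p}_A = \frac{x_A}{n_A}, \qquad
\hat{p}_B = \frac{x_B}{n_B}, \qquad
\hat{p} = \frac{x_A + x_B}{n_A + n_B}

z = \frac{\hat{p}_B - \hat{p}_A}
         {\sqrt{\hat{p}\,(1 - \hat{p})\left(\frac{1}{n_A} + \frac{1}{n_B}\right)}},
\qquad
\text{confidence} = 1 - p = 2\,\Phi(|z|) - 1
```

With hypothetical numbers, 100 results from 2,000 people in group A and 130 results from 2,000 people in group B, the pooled rate is 0.0575, z is about 2.04, and the confidence is roughly 96%.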

Performing a high-quality and accurate test requires the following best practices:

  • Form a formal, written hypothesis. Your hypothesis guides the campaign and answers both what will be tested and why. Without answers to these questions, the results will be difficult to interpret or act on. The hypothesis also clarifies how the results will affect your campaign practices and what, if anything, you plan to change with the new insights.
  • Split tests should test only one variable at a time. Testable variables include pieces of the ad creative (image, copy, ad type), audiences, placements, or the Call To Action (CTA). Testing more than one variable at a time makes it impossible to tell which change caused the difference in results, which invalidates the test.
  • The audiences that see each variation should be as random as possible. While you can still use a targeted audience based on your campaign’s objectives, each audience should be split into random, non-overlapping groups. This helps mitigate error caused by conditions such as the device type people are using, demographics, or other branded ads they might be seeing. Targeted audiences should not be included in any other branded campaigns while the test is running.
  • Aim for a high level of confidence that your test results would be consistent if the test were repeated under the same conditions. A minimum confidence level of 75% is recommended.
  • Gather as much data as possible. The best way to reach a high level of statistical significance is to analyze as much data as possible. A minimum of 200 results should be collected before choosing a winner, though the right number will depend on your campaign objectives and budget. A higher minimum should be considered before making any drastic changes to your campaign best practices.
  • Choose one KPI that will decide the winner. While other metrics might be available to inform the course of the campaign, it is important that only one is used to determine next steps, as the other results may not be statistically significant.
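The thresholds in the list above can be combined into a simple decision rule. The Python sketch below is one possible interpretation, not a definitive implementation: it assumes the single KPI is a rate (results divided by people reached), that the 200-result minimum applies to both variations combined, and that confidence is measured with a pooled two-proportion z-test. The function names are hypothetical.

```python
import math

def confidence_of_difference(results_a, shown_a, results_b, shown_b):
    """Two-sided confidence (1 - p-value) from a pooled two-proportion z-test."""
    rate_a = results_a / shown_a
    rate_b = results_b / shown_b
    pooled = (results_a + results_b) / (shown_a + shown_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / shown_a + 1 / shown_b))
    if std_err == 0:
        return 0.0  # no variation at all, so nothing can be concluded
    z = abs(rate_b - rate_a) / std_err
    # For a standard normal variable Z, P(|Z| <= z) = erf(z / sqrt(2))
    return math.erf(z / math.sqrt(2))

def pick_winner(results_a, shown_a, results_b, shown_b,
                min_results=200, min_confidence=0.75):
    """Apply the minimum-results and minimum-confidence checks to one KPI."""
    # Assumption: the 200-result minimum counts both variations together
    if results_a + results_b < min_results:
        return "keep collecting data"
    confidence = confidence_of_difference(results_a, shown_a, results_b, shown_b)
    if confidence < min_confidence:
        return "no clear winner"
    return "A" if results_a / shown_a > results_b / shown_b else "B"

# Hypothetical test: 100 of 2,000 people converted on A, 130 of 2,000 on B
decision = pick_winner(100, 2000, 130, 2000)
```

Raising `min_results` or `min_confidence` is the natural way to apply a stricter bar before making drastic changes to campaign practices.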

Who should consider a split test?

Split tests should be considered by brands that run similar recurring campaigns and could benefit from data-backed best practices. These brands have run variations of their tested variable in the past in an uncontrolled environment and are able to form a sound hypothesis. They also have processes in place to apply what the test teaches them to future campaigns. Split tests should not be conducted without a written hypothesis.

Split tests are recommended for campaigns that will run for longer than one month and can capture enough test results within a one-month period. Tests that need more time to gather enough results should consider a larger budget or a different key test metric.
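A quick back-of-the-envelope check can show whether a test is likely to fit in a month. The Python sketch below assumes you can estimate a cost per result from past campaigns. The function name and the dollar figures are hypothetical.

```python
import math

def days_to_reach_minimum(daily_budget, cost_per_result, min_results=200):
    """Estimate how many days it takes to collect min_results at a given spend."""
    results_per_day = daily_budget / cost_per_result
    return math.ceil(min_results / results_per_day)

# Hypothetical campaign: $50 per day at roughly $4 per result
days = days_to_reach_minimum(daily_budget=50.0, cost_per_result=4.0)
fits_in_one_month = days <= 30
```

If the estimate runs past 30 days, that is the signal to consider a larger budget or a key metric that accumulates results faster.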


October 18, 2022 by Cambria VandeMerwe

