How to A/B Test Email Sequences

What is an A/B Test?

An A/B test is a straightforward way to compare two versions of your messaging and optimize your email sequences so that future campaigns to similar audiences produce better results.

A/B testing lets you validate that a change to an email in your sequence actually improves your open rate or reply rate before you send that sequence to a larger group of leads.

1. Testing Open Rates vs. Reply Rates

For true A/B testing, you want to test either the open rate or the reply rate. You can't draw any valid conclusions if you try to test both at the same time.

You test open rate by keeping the same email body in both sequences (called "variations"), and you only change the subject lines of the emails. You test reply rate by using identical subject lines in both variations, but you change the body copy of the emails.

The best practice is to optimize the open rate first, and then run a second test to optimize the reply rate. In the first test, vary the subject lines and keep the body copy of both variations the same. In the second test, use the winning subject lines on both variations and vary the body copy to see which version gets the best replies.

2. Sample Size

The goal of an A/B test is to show a statistically significant difference between one variation of your email sequence and another so that you can achieve greater ROI on the leads you redeem in the future.

Your test may not be statistically significant if your sample size is too small or if the difference in results between your variations is not large enough. In that case, there is no way to statistically verify that the difference you observe isn't due to chance alone. You can use an A/B test calculator to help you figure this out, which we'll explain below.
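
To make the idea of "due to chance alone" concrete, here is a minimal sketch of a two-proportion z-test, one common way to check significance. It is not necessarily the exact method your A/B test calculator uses, and the function name and the counts are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(conversions_a, sent_a, conversions_b, sent_b):
    """Two-sided p-value for the difference between two conversion rates."""
    rate_a = conversions_a / sent_a
    rate_b = conversions_b / sent_b
    # Pooled rate under the assumption that both variations perform the same.
    pooled = (conversions_a + conversions_b) / (sent_a + sent_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_b))
    z = (rate_b - rate_a) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical counts: the same 25% vs. 30% open rates at two sample sizes.
small = two_proportion_p_value(25, 100, 30, 100)
large = two_proportion_p_value(250, 1000, 300, 1000)
print(f"100 emails per variation:  p = {small:.3f}")
print(f"1000 emails per variation: p = {large:.3f}")
```

A p-value below 0.05 corresponds to significance at the 95% confidence level. In this sketch the same difference in open rates falls short of that threshold with 100 emails per variation but clears it with 1,000, which is why sample size matters.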

You should also consider whether A/B testing is necessary at all. Let's say you only ever plan on sending 250 emails to a particular audience and, even if you expect a massive 50% lift from new copy, you would have to send 100 emails for sequence A and 100 emails for sequence B to have a statistically significant test. Would the improvement in opens or responses you get on the 50 emails that are left over really be worth the effort of testing? Probably not in that case.
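
The arithmetic behind that judgment can be sketched in a few lines. The 250/100/100 split and the 50% lift come from the scenario above; the 20% baseline conversion rate is purely an assumption for illustration.

```python
total_emails = 250
test_emails_per_variation = 100
expected_lift = 0.50
baseline_rate = 0.20  # hypothetical baseline conversion rate (assumption)

# Emails that could still benefit from the winning copy after the test.
remaining = total_emails - 2 * test_emails_per_variation

# Additional conversions the winning copy would add on those emails.
extra_conversions = remaining * baseline_rate * expected_lift

print(f"Emails left after the test: {remaining}")
print(f"Extra conversions from the winning copy: {extra_conversions:.1f}")
```

Under these assumptions the whole test buys only a handful of extra conversions, which is rarely worth the setup time.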

3. Using the A/B Test Calculator

Before starting an A/B test, you want to make sure that your sample size is large enough to produce a statistically significant result if you achieve the improvement (called "uplift" or "lift") you think your changes could bring about.

In the A/B test calculator, you can plug in the number of emails you want to send per variation in the "Unique visitors expected per variation" box. Then enter the number of conversions you expect from sending that many emails. You might know this from past emails or simply make an estimate. Enter this in the "Number of expected conversions Control" box, and enter the uplift you think your changes might achieve in the "Expected uplift (%)" box. A 10% uplift is generally considered the lowest number that is worthwhile to pursue, but that can depend on the scale of your campaigns.

For example, say I'm planning to send 1,000 emails per sequence (2,000 total) for my test. From sending emails in the past, I know that about 25% of people have been opening the emails from my A sequence, so I can expect 250 conversions (opens) from sending 1,000 emails with those subject lines. If I think better subject lines can boost opens (the conversion rate) by 30%, then the calculator shows that the test will be statistically significant, and I should proceed.

Note: Choose "Pre-test analysis" to see if your sample size will be significant before conducting a test. You'll generally want to choose a "Two-sided Hypothesis" and a Confidence level of at least 95%.
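
If you want to sanity-check a pre-test analysis yourself, the sketch below approximates the statistical power of a two-sided test from the same three inputs (emails per variation, expected control conversions, expected uplift). The function name `pretest_power` is hypothetical, the normal approximation is an assumption, and the calculator may compute its result differently.

```python
from math import sqrt
from statistics import NormalDist


def pretest_power(sent_per_variation, expected_conversions, uplift, confidence=0.95):
    """Approximate power of a two-sided two-proportion z-test."""
    normal = NormalDist()
    rate_a = expected_conversions / sent_per_variation
    rate_b = rate_a * (1 + uplift)
    pooled = (rate_a + rate_b) / 2
    # Standard error if there is no real difference, and if the uplift is real.
    se_null = sqrt(2 * pooled * (1 - pooled) / sent_per_variation)
    se_alt = sqrt((rate_a * (1 - rate_a) + rate_b * (1 - rate_b)) / sent_per_variation)
    z_critical = normal.inv_cdf(1 - (1 - confidence) / 2)
    # Chance that the observed difference clears the significance threshold.
    return normal.cdf((abs(rate_b - rate_a) - z_critical * se_null) / se_alt)


# 1,000 emails per variation, 250 expected opens, 30% expected uplift.
power = pretest_power(1000, 250, 0.30)
print(f"Chance of a significant result with a 30% uplift: {power:.0%}")

# The same audience with only a 10% uplift is much less likely to reach significance.
print(f"Chance of a significant result with a 10% uplift: {pretest_power(1000, 250, 0.10):.0%}")
```

A common convention, not something this article prescribes, is to aim for a power of at least 80% before running the test.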

4. Set up your A/B Test

First, you need to have created an audience and at least two email sequences to test. You should also have decided which variable you want to test and how large your audience needs to be for the test to be statistically significant.

Now, run each of the two email sequences to a random selection of the audience. We don't have a scientific way of randomizing your audience, but as long as you've used the same filters to build each audience, your selection should be fairly random. Keep the timing of the emails the same for both sequences and send them out.
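
If you have your lead list outside the app (for example, as an exported file), one way to get a properly random split is to shuffle the list and cut it in half before building the two audiences. This is a sketch of that idea, not a built-in feature; the `split_audience` helper and the example addresses are hypothetical.

```python
import random


def split_audience(leads, seed=None):
    """Shuffle leads and split them into two equally sized, non-overlapping groups."""
    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    shuffled = list(leads)     # copy so the original list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    # With an odd number of leads, the last one is left out to keep groups equal.
    return shuffled[:half], shuffled[half:2 * half]


leads = [f"lead{i}@example.com" for i in range(2000)]
group_a, group_b = split_audience(leads, seed=42)
print(len(group_a), len(group_b))
```

Each group then becomes the audience for one sequence.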

After both sequences have completed, check the results and see what the difference is in your chosen variable. Plug the numbers into the statistical calculator to see if the difference is statistically significant. If it is, congratulations! You've just found a better-performing subject line or email copy!

Let's check out an example.

In this example, if we were testing open rates to optimize subject lines, we would see that sequence #3 has a 12.3% lower conversion rate than sequence #2, and that the result is statistically significant according to our calculator. This means that the subject lines in sequence #2 are better because they led to more people opening the emails. Sending emails with the subject lines in sequence #2 to the rest of your audience should lead to more email opens.

To see this in the statistical calculator, you would enter the Contacted column from Growlabs results as the number of visitors and the Unique Opens column from Growlabs as the conversions for each sequence. You're calculating how many people opened the emails out of the group that was sent the emails.

If we were testing reply rate to optimize the body of the emails, we would see that sequence #3 has a 35.72% lower conversion rate than sequence #2 and that the difference is statistically significant. This means that the body copy of the emails in sequence #2 is better because, when people open them, they are more likely to reply positively. Sending emails with the email copy of sequence #2 to the rest of your audience should lead to more positive replies overall.

To see this in the statistical calculator, you would enter the Unique Opens column from Growlabs as the number of visitors and the Positive Replies column as the conversions for each sequence. You're calculating how many people positively replied out of the ones that opened the emails.
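
The two column mappings described above can be sketched as follows. The column names follow the Growlabs results table, but the numbers are invented for illustration and do not reproduce the example's figures; `compare_rates` is a hypothetical helper.

```python
def compare_rates(visitors_a, conversions_a, visitors_b, conversions_b):
    """Return both conversion rates and the relative change of B versus A."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    relative_change = (rate_b - rate_a) / rate_a
    return rate_a, rate_b, relative_change


# Hypothetical results rows (made-up numbers).
sequence_2 = {"contacted": 1000, "unique_opens": 400, "positive_replies": 40}
sequence_3 = {"contacted": 1000, "unique_opens": 350, "positive_replies": 28}

# Open-rate test: visitors = Contacted, conversions = Unique Opens.
_, _, open_change = compare_rates(
    sequence_2["contacted"], sequence_2["unique_opens"],
    sequence_3["contacted"], sequence_3["unique_opens"],
)

# Reply-rate test: visitors = Unique Opens, conversions = Positive Replies.
_, _, reply_change = compare_rates(
    sequence_2["unique_opens"], sequence_2["positive_replies"],
    sequence_3["unique_opens"], sequence_3["positive_replies"],
)

print(f"Open rate, sequence #3 vs. #2:  {open_change:+.1%}")
print(f"Reply rate, sequence #3 vs. #2: {reply_change:+.1%}")
```

Only the comparison that matches the variable you actually changed should be interpreted for a given test.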

Note: We cannot draw both of the conclusions above (that the subject lines are better and that the body copy is better) from a single test. Each of those paragraphs imagines that the only variable in the test was the subject line or the body copy, not both at the same time.

By the way, before running any A/B tests, we highly recommend that you test email deliverability first and make sure each variation is delivered equally well.

There you go! Now you can optimize your sequences and get the best possible ROI from your future sends!