Tiered Holdout Groups

02 Mar 2025 » MSA, Platform

I have recently written about testing, where I explained that everything in digital marketing should be tested. There are a couple of reasons for this: first, you have the means to test nearly everything; and second, it is a scientific approach. Now, you might be wondering: how do I run those tests? The answer is simple and, again, scientific: by creating holdout groups. Let me explain.

The Theory

I am not a statistics expert; I actually hated it in high school and college. However, the science behind holdout groups relies on comparing the behavior of a subset of your audience receiving a specific treatment versus another subset receiving no treatment. I did a quick Internet search and found that this is a somewhat controversial statement, and the idea is not perfect. That being said, I believe it’s the best approach we have.

As an example, in the pharmaceutical industry, new drugs must be tested before being released to the market. Volunteers are divided into two groups: one group receives a placebo, and the other receives the actual new medicine. Doctors then compare the results from each group to evaluate whether the medicine provides additional benefits beyond what the placebo group experienced.

Flat

Usually, holdout groups are not layered on top of each other. Here are two common “flat” approaches:

Ad-hoc

The simplest approach in digital marketing is to create an ad-hoc holdout group for every campaign. This is how Adobe Target works, where for every A/B test, it generates a holdout group based on random assignments to each experience. If you’re using Adobe Journey Optimizer (AJO), the situation is similar: you would add a condition activity to a journey and select the percentage split option.

The result is that you have multiple, inconsistent holdout groups across your campaigns. An individual is randomly assigned to a treatment or holdout group for each campaign, so the same person may be held out of one campaign and treated in another. Another potential issue is bias: a purely random split gives you no control over the composition of each group, so by chance it may over- or under-represent certain demographics. Do not get me wrong, this approach is still useful and has proven valuable to marketers.
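
To make the inconsistency concrete, here is a minimal Python sketch of per-campaign assignment. It is not how Adobe Target or AJO implement their splits; the `in_holdout` helper, the hash-based bucketing, and the campaign names are assumptions chosen only for illustration.

```python
import hashlib

def in_holdout(profile_id: str, campaign_id: str, holdout_pct: float) -> bool:
    """Hypothetical helper: bucket a profile for ONE campaign."""
    key = f"{campaign_id}:{profile_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Map the first 32 bits of the hash to a value in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return bucket < holdout_pct

profiles = [f"profile-{i}" for i in range(10_000)]

# Each campaign draws its own 10% holdout, independently of the others.
spring = {p for p in profiles if in_holdout(p, "spring-sale", 0.10)}
summer = {p for p in profiles if in_holdout(p, "summer-sale", 0.10)}
both = spring & summer

print(len(spring), len(summer), len(both))
```

Because the campaign identifier is part of the key, the two holdout groups overlap only by chance: most people held out of one campaign still receive the other.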

Universal

Another typical approach is to create a universal holdout group. Basically, you randomly select a subset of your customers from your Customer Relationship Management (CRM) system and mark them as being part of the holdout group. Then, for every campaign, you only target the complementary subset – the treatment group.

One benefit of this approach is that you can reduce bias in the selection of the groups. You can ensure that you are selecting a representative sample of each demographic (or any other set of variables) for the holdout group.
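
As a rough illustration of what "a representative sample of each demographic" could mean, the sketch below draws the same percentage from every stratum. The `stratified_holdout` function, the `age_band` attribute, and the 10% figure are all hypothetical; a real selection should be designed with your statistician.

```python
import random
from collections import defaultdict

def stratified_holdout(profiles, strata_key, holdout_pct, seed=42):
    """Return the IDs selected for the holdout, sampled per stratum."""
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    by_stratum = defaultdict(list)
    for profile in profiles:
        by_stratum[strata_key(profile)].append(profile["id"])

    holdout = set()
    for ids in by_stratum.values():
        # Take the same share from each stratum, not from the whole list.
        k = round(len(ids) * holdout_pct)
        holdout.update(rng.sample(ids, k))
    return holdout

# Made-up CRM extract: one quarter of profiles are in the younger band.
profiles = [
    {"id": i, "age_band": "18-34" if i % 4 == 0 else "35+"}
    for i in range(1_000)
]
holdout = stratified_holdout(profiles, lambda p: p["age_band"], 0.10)
treatment = {p["id"] for p in profiles} - holdout
```

With this data the holdout keeps the same one-to-three ratio between the two age bands as the full customer list, which a plain random draw would only match approximately.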

Tiered

The What

Before I explain what a tiered holdout group is, I want to clarify that the concept is not an established methodology. I have seen it in action with some customers in the last few years, but many others are still using ad-hoc or universal holdout groups. Also, an Internet search does not reveal much about this concept.

You must be wondering by now: “What is a tiered holdout group?” The idea is as follows:

1. You start by creating a global holdout group, and you never send them any marketing messages again (or perhaps limit it to a specific timeframe, like one year).
2. Then, for every marketing program, you take the treatment group (those not in the global holdout group) and create a program-specific holdout group. This second tier does not necessarily have to be per marketing program; it can be per channel or any other dimension that makes sense to you.
3. Finally, for every marketing activity or campaign, you create a third layer of holdout groups. To do so, you take the treatment group from step 2 and apply the business rules defined for the campaign. Then, split this new audience into activity/campaign-specific holdout and treatment groups.
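
The three tiers can be sketched in a few lines of Python. This is only an illustration of the nesting: the function names, percentages, and program names are made up, and it uses plain random draws for brevity where a real selection would likely be more careful.

```python
import random

def assign_tiers(profile_ids, programs, global_pct, program_pct, seed=7):
    """Tiers 1 and 2: return {profile_id: flags}."""
    rng = random.Random(seed)
    flags = {}
    for pid in profile_ids:
        global_holdout = rng.random() < global_pct
        flags[pid] = {
            "global_holdout": global_holdout,
            # Tier 2 is drawn only from the global treatment group.
            "program_holdout": {
                prog: (not global_holdout) and rng.random() < program_pct
                for prog in programs
            },
        }
    return flags

def campaign_split(flags, program, eligible, campaign_pct, seed=11):
    """Tier 3: split the eligible program-treatment audience."""
    rng = random.Random(seed)
    audience = [
        pid for pid, f in flags.items()
        if not f["global_holdout"]
        and not f["program_holdout"][program]
        and eligible(pid)  # the campaign's business rules
    ]
    holdout = {pid for pid in audience if rng.random() < campaign_pct}
    treatment = set(audience) - holdout
    return treatment, holdout

flags = assign_tiers(range(10_000), ["loyalty", "onboarding"], 0.05, 0.10)
treatment, holdout = campaign_split(
    flags, "loyalty", lambda pid: pid % 2 == 0, 0.10
)
```

Each tier only ever looks at the treatment group of the tier above it, so nobody in the global holdout group can end up in a program or campaign audience.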

Visually, it would look like this:

[Figure: Tiered Holdout Groups]

A pitfall to avoid is creating holdout groups that are too large. Remember that holdout groups do not receive marketing messages. If your marketing messages are successful, you will miss out on potential conversions from an overly large holdout group. Therefore, keep the size of the holdout groups to the minimum necessary to achieve statistical significance. Your statistics expert should be able to help you determine the appropriate sizes.
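
For a first estimate of "the minimum necessary", one common textbook approximation is the sample-size formula for comparing two conversion rates. The sketch below assumes equally sized groups, a two-sided test, and the conventional 5% significance level and 80% power; the conversion rates are invented, and holdout groups are usually smaller than treatment groups, so treat the result as a starting point for the conversation with your statistician.

```python
from math import ceil
from statistics import NormalDist

def min_group_size(p_holdout, p_treatment, alpha=0.05, power=0.80):
    """Approximate size needed PER GROUP to detect the given difference."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p_holdout * (1 - p_holdout) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_holdout
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)

# Hypothetical rates: 2.0% conversion without messages, 2.5% with them.
small_effect = min_group_size(0.020, 0.025)
# A larger expected difference needs far fewer people to detect.
large_effect = min_group_size(0.020, 0.040)
print(small_effect, large_effect)
```

The smaller the lift you expect, the larger the holdout group has to be, which is the tension between missed conversions and inconclusive results.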

The Why

I usually try to start with the “why” before explaining the subject. However, in this case, I thought it would be easier to understand the reasons behind it after explaining the concept.

If you are still reading, you probably already see the benefits of this approach. You can create multiple reports and analyses at different levels. You can take the first tier (the global holdout group) and run a report to show how much value the digital marketing department is bringing to your company. I would suggest that you use this information to your advantage: if the lift you are generating is clearly higher than the cost of the digital marketing function, you can easily justify maintaining or even growing the department’s size.

The next step would be to analyze the lift and confidence achieved by each of the marketing programs. Here, the goal is to evaluate whether a program is profitable, whether it should continue, or whether it should be optimized.

Finally, and I am sure you know where I am going with this, for each marketing activity, you do the same: calculate lift and confidence to evaluate its performance.
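
At any of the three tiers, the calculation has the same shape. The sketch below uses a two-proportion z-test and reports confidence as one minus the two-sided p-value; that definition of "confidence" is an assumption, the function name is hypothetical, and the conversion counts are invented.

```python
from math import sqrt
from statistics import NormalDist

def lift_and_confidence(treat_conv, treat_n, hold_conv, hold_n):
    """Relative lift of treatment over holdout, plus 1 - p-value."""
    if hold_conv == 0:
        raise ValueError("lift is undefined when the holdout has no conversions")
    p_t = treat_conv / treat_n
    p_h = hold_conv / hold_n
    lift = (p_t - p_h) / p_h

    # Pooled standard error under the "no difference" hypothesis.
    pooled = (treat_conv + hold_conv) / (treat_n + hold_n)
    se = sqrt(pooled * (1 - pooled) * (1 / treat_n + 1 / hold_n))
    z = (p_t - p_h) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return lift, 1 - p_value

# Hypothetical program: 3.0% conversion when treated, 2.5% when held out.
lift, confidence = lift_and_confidence(2_700, 90_000, 250, 10_000)
print(f"lift={lift:.1%} confidence={confidence:.1%}")
```

Running the same function on the global, program, and campaign groups gives the three levels of reporting described above.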

The How

For the global holdout group and the program-specific holdout groups, you need to select members carefully. Other than, perhaps, Data Distiller, none of the Adobe tools is designed to select these holdout groups.

My recommendation is to select them in your CRM before sending the data to Adobe. Your statistician should be heavily involved in designing the algorithm to choose members of the various holdout groups. You probably do not want to generate a purely random selection, as you would have no control over potential bias. I am not an expert here, so I cannot provide further advice on the selection process itself.

The output of the previous step should be a set of flags associated with each profile. These flags will identify whether an individual is part of the global holdout group or each of the program-specific holdout groups. It then becomes very easy to send these flags to Adobe Experience Platform (AEP) and use them in segment definitions.

Finally, for the last tier of holdout groups (activity/campaign-specific), you can rely on the corresponding Adobe tool. Let me show you how I would do it:

1. Create a Real-Time CDP (RTCDP) segment for the global holdout group. The segment definition would be as simple as everybody who has the global holdout flag set to True.
2. Create an RTCDP segment for the global treatment group, which is just anybody who is not in the global holdout group (or whose global holdout flag is False).
3. Create one RTCDP segment for each program-specific holdout group, with the conditions:
   - Member of the global treatment group; AND
   - Their program-specific holdout flag is set to True.
4. Create one RTCDP segment for each program-specific treatment group, with the conditions:
   - Member of the global treatment group; AND
   - Not a member of the program-specific holdout group (or flag is False).
5. For each campaign or activity, create an RTCDP segment based on:
   - The program-specific treatment group; AND
   - The business requirements for the segmentation.
6. In Adobe Target, apply the campaign-specific segment to the A/B test and set the percentage split defined for the activity-specific holdout group.
7. In AJO, create a segment-triggered journey using the campaign-specific segment and add a condition (split) activity with the percentage defined for the campaign-specific holdout group.
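
The segment logic above boils down to a handful of boolean conditions. The Python below only mirrors that logic; it is not RTCDP segment syntax, and the flag names (`global_holdout`, `program_holdout`) and the `opted_in` business rule are hypothetical.

```python
def in_global_holdout(profile):
    return profile["global_holdout"]

def in_global_treatment(profile):
    return not profile["global_holdout"]

def in_program_holdout(profile, program):
    # Global treatment AND program-specific holdout flag is True.
    return in_global_treatment(profile) and profile["program_holdout"].get(program, False)

def in_program_treatment(profile, program):
    # Global treatment AND program-specific holdout flag is False.
    return in_global_treatment(profile) and not profile["program_holdout"].get(program, False)

def in_campaign_audience(profile, program, business_rule):
    # Program treatment AND the campaign's own segmentation rules.
    return in_program_treatment(profile, program) and business_rule(profile)

profiles = [
    {"id": "a", "global_holdout": True, "program_holdout": {}, "opted_in": True},
    {"id": "b", "global_holdout": False, "program_holdout": {"loyalty": True}, "opted_in": True},
    {"id": "c", "global_holdout": False, "program_holdout": {"loyalty": False}, "opted_in": True},
    {"id": "d", "global_holdout": False, "program_holdout": {"loyalty": False}, "opted_in": False},
]
audience = [
    p["id"] for p in profiles
    if in_campaign_audience(p, "loyalty", lambda q: q["opted_in"])
]
```

Only the resulting campaign audience is then handed to Adobe Target or AJO, where the final percentage split creates the third tier.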

I am very conscious that the examples above are very basic and may have some gaps. You will need to tweak them according to your specific needs, but my goal was to give you an idea of how I would start.

Final Comments

While the idea is appealing, there are other details to consider:

- There is no true holdout group: people have multiple devices and accounts and, therefore, will sometimes be in the holdout group and sometimes in the treatment group.
- You should keep the size of the holdout groups to a minimum, so that you do not miss out on conversions.
- However, if a holdout group is too small, your results may be inconclusive.
- A tiered approach is not for everyone, as it is a complex solution to manage.

Photo by Quang Nguyen Vinh

