How A/B Tests Work in Maestra

A/B testing is a method for validating hypotheses about your marketing campaigns by comparing the performance of two or more variants, or by comparing a variant against a control group. Instead of guessing what works, you let real customer behavior decide. Performance is measured through metrics such as average order value, conversion rate, and—in some test types—average revenue per user (ARPU) and click-through conversion.

How an A/B test runs

Every A/B test in Maestra follows the same four stages:

Identify the audience

Choose who participates in the test—a segment, all visitors to a site, or users of a mobile app.

Split the audience into groups

Divide participants into two or more groups, one per variant. You can also include a control group that receives nothing.

Run the experiment

Each group experiences its assigned variant. Maestra tracks behavior throughout.

Compare results

Compare each group’s behavior against the metrics you defined for success and decide whether your hypothesis holds up.
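As a rough illustration of the last stage, the sketch below computes three of the metrics mentioned above for each group. The `GroupStats` shape and the `summarize` function are hypothetical and not part of Maestra. The formulas are the commonly used definitions (conversion rate = buyers / participants, average order value = revenue / orders, ARPU = revenue / participants), which may differ in detail from how Maestra calculates them.

```python
from dataclasses import dataclass


@dataclass
class GroupStats:
    participants: int   # customers assigned to the group
    buyers: int         # participants who placed at least one order
    orders: int         # total orders placed by the group
    revenue: float      # total revenue from those orders


def summarize(stats: GroupStats) -> dict[str, float]:
    """Return common A/B metrics for one group (0.0 when a denominator is zero)."""
    return {
        "conversion_rate": stats.buyers / stats.participants if stats.participants else 0.0,
        "average_order_value": stats.revenue / stats.orders if stats.orders else 0.0,
        "arpu": stats.revenue / stats.participants if stats.participants else 0.0,
    }


# Made-up numbers, for illustration only.
variant = summarize(GroupStats(participants=1000, buyers=50, orders=60, revenue=3000.0))
control = summarize(GroupStats(participants=1000, buyers=40, orders=44, revenue=2200.0))

for name in variant:
    print(f"{name}: variant={variant[name]:.3f} control={control[name]:.3f}")
```

In this made-up example the two groups have the same average order value, but the variant has a higher conversion rate and ARPU, which is why the choice of metric matters when you define success.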

Where you can run A/B tests

Maestra supports A/B testing across a range of channels and mechanics:

Workflow scenarios
Mobile apps (in-app messages)
Website personalization—popups, embedded blocks, and recommendation widgets
Website visitors as a whole
Customer segments, which you can plug into any marketing mechanic

What you configure in a test

Every test has the same core settings:

Setting	What it does
Hypothesis	The statement you want to prove or disprove.
Participants	The segment, site, or app whose audience joins the test.
Traffic distribution	The share of participants assigned to each variant, expressed as proportions (for example, 50/50 or 75/25).
Analytics metrics	The KPIs Maestra uses to decide which variant wins.
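To make the four settings concrete, here is a minimal sketch that models them as a Python data structure with basic validation. Maestra tests are configured in the product interface. The `ABTestConfig` class, its field names, and the example values are assumptions made for illustration and do not correspond to a Maestra API.

```python
from dataclasses import dataclass


@dataclass
class ABTestConfig:
    hypothesis: str                        # statement to prove or disprove
    participants: str                      # hypothetical label for a segment, site, or app
    traffic_distribution: dict[str, int]   # group name -> percent share
    metrics: list[str]                     # KPIs used to compare the groups

    def __post_init__(self) -> None:
        shares = self.traffic_distribution.values()
        if len(self.traffic_distribution) < 2:
            raise ValueError("a test needs at least two groups")
        if any(share <= 0 for share in shares):
            raise ValueError("every share must be positive")
        if sum(shares) != 100:
            raise ValueError("shares must add up to 100")
        if not self.metrics:
            raise ValueError("choose at least one metric")


config = ABTestConfig(
    hypothesis="A discount popup raises the conversion rate",
    participants="site:example-store",
    traffic_distribution={"popup": 75, "control": 25},
    metrics=["conversion_rate", "average_order_value"],
)
```

Validating the shares up front catches a split such as 75/30 before any participant is assigned.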

Things to keep in mind when reading results

Wholesale buyers and other outliers. When a test uses average order value or ARPU, Maestra excludes unusually high-revenue customers—wholesale buyers, for example—from the calculation so they don’t distort the result.
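The effect of a single outlier on average order value can be seen in a small sketch. The article does not say how Maestra decides which customers count as outliers, so the fixed `revenue_cap` below is purely an assumption for illustration, as are the function name and the sample data.

```python
def average_order_value(orders_by_customer: dict[str, list[float]], revenue_cap: float) -> float:
    """Average order value over customers whose total revenue is at most revenue_cap."""
    kept_orders = [
        total
        for totals in orders_by_customer.values()
        if sum(totals) <= revenue_cap   # drop customers above the (assumed) cap
        for total in totals
    ]
    return sum(kept_orders) / len(kept_orders) if kept_orders else 0.0


orders = {
    "retail-1": [40.0, 60.0],
    "retail-2": [50.0],
    "wholesale-1": [5000.0],
}

with_outlier = average_order_value(orders, revenue_cap=float("inf"))   # 1287.5
without_outlier = average_order_value(orders, revenue_cap=1000.0)      # 50.0
```

One wholesale order moves the average from 50.0 to 1287.5, which would swamp any realistic difference between variants.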

Tests with more than two variants need proportionally more participants. They can also end without a clear winner if variant pairs disagree with one another.

Uneven traffic splits (such as 75/25) take longer to reach significance than balanced splits (50/50), because the smaller variant accumulates participants more slowly.

Device-based assignment for site and personalization tests. Website and personalization tests assign participants by device. A customer who visits from multiple devices can land in different variants on each one and counts as a participant in every branch they hit. Orders are attributed to the device used during the customer’s most recent site visit.
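The consequence of device-based assignment can be sketched with hash-based bucketing, a common way to split traffic deterministically. This is not a description of Maestra's internal algorithm, which the article does not specify. The function name, test ID, and device IDs are hypothetical.

```python
import hashlib


def assign_variant(test_id: str, device_id: str, shares: dict[str, int]) -> str:
    """Map a device to a variant deterministically, respecting percent shares."""
    digest = hashlib.sha256(f"{test_id}:{device_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100   # stable bucket in the range 0..99
    upper = 0
    for variant, share in shares.items():
        upper += share
        if bucket < upper:
            return variant
    raise ValueError("shares must add up to 100")


shares = {"variant_a": 50, "variant_b": 50}

# The same customer on two devices: each device is bucketed on its own.
phone = assign_variant("popup-test", "device-phone-123", shares)
laptop = assign_variant("popup-test", "device-laptop-456", shares)
```

Because the phone and the laptop have different device IDs, they are bucketed independently and may land in different variants. That is why one customer can count as a participant in more than one branch.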

How to make tests finish faster

How quickly a test wraps up depends on three things:

Traffic volume. The more traffic to the surface you’re testing, the faster participants accumulate.
Number of variants. Fewer variants mean fewer participants needed overall.
Even distribution. Balanced splits (50/50, 33/33/34) fill every variant at the same rate—uneven splits drag out the slower branch.
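These three factors can be combined into a back-of-the-envelope estimate. The sketch below assumes, as a simplification, that every variant needs the same fixed number of participants and that traffic is constant. The real requirement depends on the size of the effect you are trying to detect, and Maestra's own finish-date estimate may be calculated differently. The function name and all numbers are illustrative.

```python
def days_to_fill(daily_traffic: int, shares: dict[str, int], needed_per_variant: int) -> int:
    """Days until the slowest variant has collected needed_per_variant participants."""
    slowest_share = min(shares.values())            # percent share of the smallest variant
    numerator = needed_per_variant * 100
    denominator = daily_traffic * slowest_share
    return -(-numerator // denominator)             # integer ceiling division


balanced = days_to_fill(2000, {"a": 50, "b": 50}, needed_per_variant=10_000)              # 10
uneven = days_to_fill(2000, {"a": 75, "b": 25}, needed_per_variant=10_000)                # 20
three_way = days_to_fill(2000, {"a": 33, "b": 33, "c": 34}, needed_per_variant=10_000)    # 16
```

With the same traffic, moving from 50/50 to 75/25 doubles the estimate, and adding a third variant raises it from 10 to 16 days.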

Running multiple tests at once

Running several A/B tests on the exact same audience at the same time creates interaction effects. You won’t get false winners, but each test will take longer to reach significance, and tests with similar hypotheses across different channels can still muddy the interpretation.

As a rule, keep only one site-wide personalization test running at any given time. Otherwise, the control group gets fragmented across overlapping tests and you can’t trust the comparison.

Audience fatigue

If you run A/B tests on the same audience back-to-back, give that audience a cooldown period between tests. Audiences that are tested too often—either in parallel or in rapid succession—produce unreliable results and hide real differences between variants.

Reading the report

A test is meant to run until it reaches statistical significance. Maestra doesn’t stop the test automatically at that point; instead, you get a notification when significance is reached so you can decide what to do next. Reports become available within 24 hours of launch, and an estimated finish date appears about a week after the test starts. Each report includes:

A graph for every metric in the test
Configuration details—segment, participant count, hypothesis
A date range and aggregation period you can adjust
A variant comparison table that includes revenue figures (in $)
A statistical-significance indicator for each metric
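To give a sense of what a significance indicator summarizes, the sketch below runs a two-proportion z-test on conversion rates. This is a standard textbook test chosen for illustration. The article does not state which statistical method Maestra uses, and the function name and numbers are hypothetical.

```python
import math


def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 1.0   # no variation at all, so no evidence of a difference
    z = (p_a - p_b) / std_err
    return math.erfc(abs(z) / math.sqrt(2))   # two-sided tail of the standard normal


# Same conversion rates (5% vs 4%), different sample sizes.
p_large = two_proportion_p_value(500, 10_000, 400, 10_000)   # below 0.05
p_small = two_proportion_p_value(50, 1_000, 40, 1_000)       # above 0.05
```

The same 5% versus 4% gap is significant at the usual 0.05 threshold with 10,000 participants per group but not with 1,000, which is why tests with too few participants can end without a winner.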

When there’s no winner

Sometimes a test ends without a clear winner. Common reasons:

The metric you chose isn’t sensitive enough to pick up the difference.
The mechanic only works for a narrower slice of the audience than the one you tested.
You didn’t have enough participants—especially likely with multi-variant tests or heavily skewed splits.
The mechanic was turned off before the test finished.
Seasonality or an outside event interfered with the results.
The mechanic genuinely has little or no effect.

A “no winner” result is still a result. It tells you the variants are interchangeable for this audience, on this metric, in this window—which is useful information when you’re deciding where to invest next.
