Read Results

This page helps you understand what each item on the Results tab means once your experiment is running. Walk through the page from top to bottom.

Why this matters: Your puzzle game has been running a new reward popup test for a week. The Results page shows green bars, gray bars, and various percentages — what do they mean? After reading this page, you will be able to confidently decide whether to ship, keep running, or end the experiment.

Still configuring the experiment? See Create and Launch an Experiment and Configure the Stats Engine and Validate Samples first.

Entry point

Click Experiments in the left navigation → select an experiment to open the detail page → switch to the Results tab. The experiment status is always shown in the page header; the Make Decision button also appears here while the experiment is running. Experiment Results tab entry

Sharing tip: The page URL encodes all the settings you are currently viewing. Send it to a teammate and they will see exactly the same view you see — no additional explanation needed.

Reading from top to bottom

A one-line conclusion at the top of the page that tells you the current state of the experiment.

What you see	What it means	Next step
Suggestion to add primary metrics	No primary metric selected	Add one to see comparison charts
Collecting data…	Not enough data yet	Let the experiment keep running
A better group has been found	A significant winner exists	Evaluate whether to Ship
No winning variant found	Ran to completion with no significant difference	Evaluate whether to Archive
Risk warning (red)	A high-risk condition was triggered	Stop reading the headline — investigate first

Game scenario: You launched a new onboarding A/B test yesterday and open Results today to see Collecting data… — this is normal. Most casual games need 7–14 days to accumulate enough signal; do not rush to conclusions.

2. Cumulative Exposures

Cumulative exposures chart At the top of the results page is the cumulative exposure trend chart. An exposure is a deduplicated count of experiment units that have entered the experiment — typically deduplicated users, or devices for device-level experiments. The timeline shows when the experiment started and how many exposures entered each day. You can view the rate at which users are assigned to each variant, the total cumulative exposures, and verify that the actual traffic split matches the target allocation configured in the experiment setup. Exposure chart detail When a Sample Ratio Mismatch (SRM) is detected, the chart marks the affected point in red. The cause is that the cumulative exposure proportion across variants does not match the traffic allocation configured in the experiment setup.

3. Three analysis views

View	Purpose
Basic Analysis (default)	"Did this experiment win?" — a head-to-head comparison of Control versus one Treatment
Explore	View metrics outside the experiment's metric set, or compare multiple Treatments simultaneously (up to 30 metrics × 5 variants)
HTE Analysis	Suspect the effect is concentrated in a specific player segment (e.g., payers vs. non-payers) and want to confirm

Basic Analysis

Basic analysis tab Displays the relative difference (lift) across all metrics you configured when creating the experiment. The experiment "Results" job runs daily, computing the difference between random variants (such as Treatment vs. Control) for each metric and performing statistical tests on the results.

The metrics table compares Control and Treatment row by row:

Column	Meaning
Metric name	May include a CUPED On / Off indicator
Baseline	The Control variant's value
Comparison	The Treatment variant's value
Relative difference	How much Treatment changed relative to Control, with confidence interval
Trends	A small sparkline showing how the difference has changed over time

Color coding:

Green — Treatment is significantly better than Control (won)
Red — Treatment is significantly worse than Control (problem)
Gray — No significant difference yet

Game scenario: Testing whether a spin-the-wheel animation improves D7 retention over a static chest. After 10 days, the retention row shows green at +2.3%, and guardrail metrics (ARPU, ad watch rate) are gray — a clean win; the spin wheel can be shipped with confidence.

Experiment metrics results table The relative difference (lift) formula is: Delta(%) = (Test - Control) / Control. The confidence interval is computed based on the selected significance level (default 95%).

CUPED indicator

CUPED On = variance reduction is active; results are more reliable
CUPED Off = this metric type is not supported; hover to see the reason

CUPED can typically let an experiment reach a conclusion 3–7 days earlier — by eliminating natural behavioral variation among players (heavy weekend play vs. occasional weekday sessions) to reduce noise.

Explore

Ad-hoc queries without changing the experiment configuration. Select metrics, select variants, set a date range, and click Query.

When to use it:

Compare all Treatments against Control in a single view
A teammate asks "did session length change?" — a metric not in the experiment's main metric set
Explore an idea on the fly without touching the experiment itself

Game scenario: The primary metric is D7 retention, but a monetization colleague wants to know whether the new popup affects IAP conversion rate. Open Explore, add the IAP conversion metric — no need to modify the experiment setup.

HTE (Heterogeneous Treatment Effects) analysis

Answers the question: "Is the effect the same for all players, or do different segments show different responses?"

Steps:

Select the baseline group at the top (defaults to Control)
Set a date range
To filter a specific segment, click Sample Filtering to add filter conditions
Under Metrics, add the metrics you care about (up to 10)
Under Group By, select a segmentation dimension (player level range, country, spending tier)
Click Query

Query results appear in the Metric Details section below, showing the treatment effect difference across each sub-segment. When there are multiple treatment variants, you can enable Multiple Comparison Correction in the top-right corner to control the false-positive risk.

Game scenario: A new difficulty curve shows a weak positive lift overall. But when segmented by player level, players at level 50+ show +5%, while new players show –1%. HTE surfaces this — if you had shipped to everyone without segmenting, new players would churn faster due to mismatched difficulty, and the "weak positive lift" in overall retention would have been an illusion driven by experienced players.

Experiment Backtrack: a second check on sample balance

Backtrack is a sample balance check that ABC runs automatically on every experiment. It answers: "Are the variants truly comparable over the recent traffic period?"

Results are shown in two places:

A small label in the experiment header (Normal / - - placeholder)
A standalone backtrack page listing detailed per-metric differences

Practical usage:

See Normal → you can trust the Suggestion and metrics table
See - - → the experiment has not yet accumulated enough data; check back in a few hours
Borderline result and unsure → switch to the 7-Day tab and compare direction and magnitude against the 3-day view
Just added a new metric → click Rerun Backtrack to recompute immediately

Backtrack vs. SRM: both check balance, but with different scopes — SRM covers the entire experiment from start to present; Backtrack covers only the most recent 3 or 7 days. For a long-running experiment that started cleanly but drifted later, SRM may pass while the 3-day Backtrack does not.

Full details in Configure the Stats Engine and Validate Samples.

Advanced Query: force a recompute

The Advanced Query button (next to the date picker) forces a fresh computation. Use it when:

You just added a new metric and do not want to wait for tomorrow's automated run
The metric definition or data source was updated and the numbers need to be refreshed

Click it, select the metrics to recompute, and click Query. To recompute from scratch across all data, enable the Rerun Data toggle.

Next step: make a decision

Once you have read the results, click the Make Decision button at the top of the experiment to ship the winning variant or archive the experiment. See End an Experiment and Make a Decision.

Layer Experiment

Read Results

Entry point

Reading from top to bottom