Sample Ratio Mismatch Checks

Sample Ratio Mismatch (SRM) is a common issue in online experiments: the observed allocation of randomization units to experimental groups does not match the intended allocation ratio, which can lead to biased estimates and incorrect conclusions.

What is sample ratio mismatch?

In an ideal online controlled experiment, randomization units, such as users, are randomly assigned to different groups (e.g., control and treatment) according to a predefined ratio. Sample ratio mismatch (SRM for short) occurs when the observed unit allocation deviates from this intended ratio by more than random chance can plausibly explain. For example, in a user-level experiment with a 50/50 split, roughly half of the users should be in the control group and the other half in the treatment group. If the observed allocation is 90/10, then it is very likely that SRM has occurred.

Why should I care?

Flaws in experiment design or its implementation

SRM often points to flaws in the experiment design or its implementation. When SRM is detected, it usually indicates that something has gone wrong in the process of assigning users to different experimental groups. For instance, bugs in the code responsible for user assignment can lead to incorrect group allocations, and race conditions or concurrency issues might cause users to be misassigned in high-traffic environments.

Potential selection bias

One of the fundamental principles underlying randomized experiments is the random assignment of individuals to either the control or the treatment group. This ensures that extraneous variables, which could potentially confound the results, are evenly distributed across groups. However, if flaws in the experimental design or its implementation cause SRM, this assumption may no longer hold. Such flaws can introduce selection bias into the experiment, so that observed effects are attributed to the treatment when they are in fact due to imbalances in group characteristics.

Ethical considerations

Ensuring that users are fairly and randomly assigned to different groups is not only a methodological necessity but also an ethical consideration. Proper randomization safeguards the integrity of the experiment by preventing systematic biases that could skew the results. When SRM occurs, some users may be unfairly excluded from certain groups or overrepresented in them, which compromises the representativeness and generalizability of the findings.

How does the check procedure work?

The Cumulative Exposures block

The Cumulative Exposures block provides a visual comparison of the cumulative unique exposed units across the groups in the current experiment. This visualization makes it easier to assess how exposures are distributed and to verify that each group is receiving its configured share of traffic over time.
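This page does not describe how the block computes its series. As a rough illustration only, the sketch below shows one way cumulative unique exposed units per group could be derived from raw exposure events. The function name `cumulative_unique_exposures` and the `(day, unit_id, group)` event shape are assumptions made for this example, not part of the product.

```python
from collections import defaultdict


def cumulative_unique_exposures(events):
    """Return {group: [(day, cumulative_unique_units), ...]} in day order.

    `events` is an iterable of (day, unit_id, group) tuples (hypothetical shape).
    A unit exposed on several days is counted once per group.
    """
    events = list(events)
    days = sorted({day for day, _, _ in events})
    groups = sorted({group for _, _, group in events})

    # Bucket events by day so each day is processed once.
    by_day = defaultdict(list)
    for day, unit_id, group in events:
        by_day[day].append((unit_id, group))

    seen = {g: set() for g in groups}   # units seen so far, per group
    series = {g: [] for g in groups}    # cumulative counts, per group
    for day in days:
        for unit_id, group in by_day[day]:
            seen[group].add(unit_id)
        for g in groups:
            series[g].append((day, len(seen[g])))
    return series


if __name__ == "__main__":
    sample = [
        ("2024-01-01", "u1", "control"),
        ("2024-01-01", "u2", "treatment"),
        ("2024-01-02", "u1", "control"),   # repeat exposure, not counted again
        ("2024-01-02", "u3", "control"),
    ]
    for group, points in cumulative_unique_exposures(sample).items():
        print(group, points)
```

The last point of each series is the count that would feed an SRM test against the configured traffic distribution.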

Pearson's chi-squared test

The Cumulative Exposures block performs a Pearson's chi-squared test with $α = 0.001$ on the collected cumulative unique units and the configured traffic distribution to determine the presence of SRM. This statistical test evaluates whether the observed distribution of exposures deviates significantly from the expected distribution: a p-value below $α$ means the deviation is unlikely to be due to chance alone and points to an imbalance in group assignments.
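To make the test concrete, here is a minimal sketch of an SRM check in plain Python. It computes the Pearson statistic $\chi^2 = \sum_i (O_i - E_i)^2 / E_i$, where $O_i$ is the observed unit count in group $i$ and $E_i$ is the count expected under the configured traffic distribution, and compares it against a chi-squared distribution with $k - 1$ degrees of freedom for $k$ groups. The helper names `chi2_sf` and `srm_check` are hypothetical, and the block's actual implementation may differ; only the threshold $α = 0.001$ comes from the text above.

```python
import math


def chi2_sf(x, df):
    """P(X > x) for a chi-squared variable with integer df >= 1 (closed forms)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * i + 1)
    return total


def srm_check(observed, ratios, alpha=0.001):
    """Return (statistic, p_value, srm_detected) for observed counts per group.

    `ratios` is the configured traffic distribution, e.g. [50, 50] or [0.5, 0.5].
    """
    if len(observed) != len(ratios) or len(observed) < 2:
        raise ValueError("need matching observed counts and ratios for >= 2 groups")
    if any(r <= 0 for r in ratios):
        raise ValueError("ratios must be positive")
    total = sum(observed)
    ratio_sum = sum(ratios)
    expected = [total * r / ratio_sum for r in ratios]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = chi2_sf(stat, len(observed) - 1)
    return stat, p_value, p_value < alpha


if __name__ == "__main__":
    print(srm_check([5100, 4900], [50, 50]))
    print(srm_check([9000, 1000], [50, 50]))
```

With a 50/50 configuration, counts of 5,100 and 4,900 give a statistic of 4.0 and a p-value of about 0.046, which is not flagged at $α = 0.001$, whereas counts of 9,000 and 1,000 are flagged. The strict threshold keeps false alarms rare when the check is evaluated repeatedly as exposures accumulate.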

Upon detection of SRM, a red indicator will appear on the graph, accompanied by a warning message. This alert signifies potential flaws in the experimental design or its implementation, which could compromise the validity of the experimental results. Addressing these issues is crucial to ensure the integrity and reliability of the findings.
