Warehouse Best Practices

This page records operational practices that pay off as experiment volume grows on a single warehouse connection.

Partitioning

Partition fact and assignment tables by the time grain ABetterChoice reads them at — typically daily — so each scheduled refresh prunes to the partitions it needs.
When you register a table, set its Partition Field (for example ds) and matching Partition Field Type (Date / Timestamp / Datetime) so the engine can prune partitions. See Defining tables for the field reference.
The demo project follows this pattern across all three table types — the assignment, fact, and user-property tables are each partitioned by date:

All three demo tables — assignment, fact, and user property — are registered from BigQuery and partitioned by date

For very large fact tables, build an upstream daily roll-up table and register that table as the metric source instead of the raw event table.
Keep the roll-up's grain at "user × day × event-of-interest" — finer grains rarely add statistical power.

Lean on the warehouse's incremental update features (table-level MERGEs, materialized views) to avoid full scans on every result refresh.
When the underlying fact table is append-only and date-partitioned, the scheduled refresh naturally touches only recent partitions.

The connection runs as a single Business Account that a platform proxy account impersonates. Keep the impersonation grant scoped to the dataset used for assignment tables and analysis data — nothing wider.
Treat the analysis dataset as production: lock down DDL, enable backups, and set a retention you can defend.
Only grant Dataproc permissions when your team uses HTE Analysis; leave that surface unconfigured otherwise.