Lattice Retail — synthetic sessions for recommender testing

The challenge

Lattice's recommender team had a uniquely retail problem: they wanted to test dozens of variants every week, but each variant required real session data to be evaluated — and their privacy team had drawn a hard line at sharing session-level clickstreams with any vendor or partner.

Their workaround was a static, 18-month-old sample of 200k sessions that everyone in the team shared. Vendors fitted to it. Internal experiments overfit to it. By 2025 the gap between offline AUC and online uplift had widened to the point that the team was shipping recommenders that looked great in the notebook and disappointed in production.

The solution

Lattice deployed Doppelset Engine into their data platform on Databricks. A daily job trains an event-stream doppel of the previous week's shopping sessions — preserving funnel structure, dwell-time distributions, conversion rates, and seasonal effects.

Every variant of the recommender now gets evaluated against 14 million fresh synthetic sessions per week. Because the synthetic dataset rotates weekly, no one can overfit to it — and crucially, vendors can run their evaluation on Lattice-supplied synthetic data without any production session ever leaving the data lake.

When a variant graduates, it's promoted to a sealed real-data evaluation harness for the final 24 hours of testing. The harness emits the same shape of report the synthetic harness does, so the team's reporting and dashboards are unchanged.
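
As a rough illustration of what "the same shape of report" could mean in practice, the sketch below has both harnesses return one shared record type, so downstream dashboards never need to know which data source produced a result. Everything here is an assumption made for illustration: `EvalReport`, `build_report`, the field names, and the `clicked` session flag are hypothetical and are not part of Doppelset Engine or Lattice's actual harness.

```python
from dataclasses import asdict, dataclass
from typing import Iterable, Mapping


@dataclass(frozen=True)
class EvalReport:
    """Hypothetical report shape shared by the synthetic and real-data harnesses."""
    variant: str
    data_source: str  # "synthetic" or "real" (assumed labels)
    sessions: int
    click_through_rate: float


def build_report(
    variant: str,
    data_source: str,
    sessions: Iterable[Mapping[str, bool]],
) -> EvalReport:
    """Summarise sessions into an EvalReport.

    Each session is assumed to carry a boolean "clicked" flag.
    """
    total = 0
    clicks = 0
    for session in sessions:
        total += 1
        if session["clicked"]:
            clicks += 1
    ctr = clicks / total if total else 0.0
    return EvalReport(variant, data_source, total, ctr)


# Toy inputs standing in for synthetic and sealed real-data sessions.
synthetic_report = build_report(
    "variant-a", "synthetic", [{"clicked": True}, {"clicked": False}]
)
real_report = build_report(
    "variant-a",
    "real",
    [{"clicked": True}, {"clicked": True}, {"clicked": False}, {"clicked": False}],
)

# Both harnesses emit identical fields, so reporting code is unchanged.
same_shape = asdict(synthetic_report).keys() == asdict(real_report).keys()
```

Keeping the record type identical across harnesses means the only thing that changes when a variant graduates is the `data_source` value, not the reporting code.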

"The minute we stopped overfitting to a stale sample, our offline numbers started predicting our online numbers. Doppelset isn't a privacy product to us; it's the only sane way to run a recommender team."

— Anya Beck, Director of Personalisation, Lattice Retail

The results

Offline-to-online gap closed

Validated recommenders now ship with an average 18% uplift in click-through against the previous baseline, and the gap between offline and online performance has shrunk by 60% relative to the static-sample era.

A vendor evaluation everyone trusts

Three external recommender vendors now run their evals on Lattice-supplied synthetic data. The procurement cycle for a new recommender vendor dropped from 11 to 4 weeks.

Privacy that scales with the team

Lattice can grow its recommender team without growing its privacy review pipeline — synthetic sessions don't require per-person access reviews.

Continuous learning, lossless

Because the synthetic dataset rotates weekly, every team starts on fresh data, and the long tail of seasonal events (sales, Cyber Week, regional holidays) shows up in the synthetic dataset within hours of appearing in production.

What's running

Doppelset Engine (Databricks) · Delta Lake · MLflow · Kafka · Okta SSO · Azure West Europe
