Customer story

Atlas Health

10 hospitals, one synthetic patient registry, zero PHI on the move.

Atlas Health needed to give 80+ external researchers cohort-level access to roughly nine years of admissions data without ever moving protected health information off-site. Doppelset became their research data lake.

At a glance

Industry: Healthcare
HQ: Madrid, Spain
Size: 11,400 employees · 10 hospitals
Stack: Doppelset Vault · Postgres · Iceberg

80% shorter cohort approval time

10⁻⁷ re-identification risk (audited)

23 days from pilot to production

The challenge

Atlas Health's data science team sat on roughly nine years of admissions, lab results, and imaging metadata across ten hospitals. Each new research collaboration meant a new data-sharing agreement, a new IRB cycle, and a new pseudonymisation pass by their compliance team. The median time to a cohort was eleven weeks.

Worse, every variant they generated drifted from production: masking erased rare conditions, k-anonymity collapsed long tails, and the resulting datasets were no longer useful for any model that needed to learn from outliers. The team was caught between two failure modes — too risky to share, or too lossy to learn from.
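To make the "too lossy" failure mode concrete, here is a minimal Python sketch of suppression-based k-anonymity on a single attribute. The diagnosis labels, the threshold k = 3, and the function name `suppress_rare` are illustrative assumptions, not Atlas's actual pipeline or data.

```python
from collections import Counter

def suppress_rare(records, k):
    """Drop every record whose value appears fewer than k times.

    Simplified, suppression-only k-anonymity on one attribute; real
    pipelines also generalise combinations of quasi-identifiers.
    """
    counts = Counter(records)
    return [r for r in records if counts[r] >= k]

# Toy, made-up labels: two common conditions and two rare ones.
diagnoses = ["pneumonia"] * 5 + ["sepsis"] * 4 + ["rare_a"] * 2 + ["rare_b"]

released = suppress_rare(diagnoses, k=3)

# Conditions that survive the pass, then conditions that vanish entirely.
print(sorted(set(released)))
print(sorted(set(diagnoses) - set(released)))
```

In this toy run both rare labels disappear from the released set, which is the long-tail collapse described above: a model trained on the released data never sees an outlier.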

The solution

Atlas deployed Doppelset Vault inside their own VPC, behind their existing data perimeter. Doppelset ingested the full nine-year admissions schema (47 tables, 312 foreign keys, 8 free-text columns) and trained a relational doppel with ε = 1.4.

Every Friday afternoon, the platform now refreshes a synthetic mirror of the previous week's admissions. Researchers query the synthetic dataset directly through Atlas's existing JupyterHub. No PHI ever leaves the cluster; only the synthetic mirror does.

Their compliance team built a one-pager for new collaborators: a hash of the synthetic dataset, a signed quality + privacy report from Doppelset, and a copy of the IRB blanket approval. Time from first email to first query: 14 days, end-to-end.
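The dataset hash on that one-pager lets a collaborator confirm they are querying the exact export that was reviewed. The story does not say which algorithm or file format Atlas uses, so the sketch below assumes SHA-256 over a single exported file; the helper name `sha256_of_file` and the throwaway demo file are hypothetical.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks
    so that large exports never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Demo on a throwaway file standing in for a synthetic export (assumption).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"synthetic rows would go here")
    demo_path = tmp.name

fingerprint = sha256_of_file(demo_path)
print(fingerprint)  # 64 hex characters identifying this exact file
os.remove(demo_path)
```

A collaborator would recompute the digest on their side and compare it with the value on the one-pager; any mismatch means the file differs from the one that was signed off.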

We used to spend more time pseudonymising than analysing. Doppelset turned that on its head — researchers get realistic data on Monday, and our DPO sleeps better.
Dr. Elin Hartmann, Head of Data Science, Atlas Health

The results

Cohorts in days, not weeks

Median time from researcher request to first cohort dropped from 11 weeks to 14 days, a reduction of roughly 80%. Most of the saving comes from the legal and pseudonymisation passes that are now skipped entirely.
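For readers who want to check the headline figure, the arithmetic is short. This sketch uses only the two numbers quoted above (11 weeks before, 14 days after); the variable names are illustrative.

```python
DAYS_PER_WEEK = 7

before_days = 11 * DAYS_PER_WEEK  # 11 weeks expressed in days
after_days = 14

# Relative reduction in median time to first cohort.
reduction = (before_days - after_days) / before_days
print(f"{before_days} days -> {after_days} days: {reduction:.1%} shorter")
```

The exact ratio is 63/77, or about 81.8%, which the headline rounds down to 80%.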

Models that still know the long tail

An internal sepsis-prediction model retrained on the synthetic mirror sat within 0.4 AUC points of the production version and, crucially, preserved the rare-condition signal that previous anonymisation passes had erased.

An audit trail that defends itself

Every weekly refresh ships with a signed quality + privacy receipt. Atlas's DPO now files a single quarterly report instead of per-project DPIAs.

Real PHI, never moved

Zero gigabytes of production PHI have crossed Atlas's data perimeter since deployment. The platform is fully VPC-resident; Doppelset sees telemetry, not data.

What's running

Doppelset Vault · Postgres · Iceberg · JupyterHub · Databricks · Okta SSO · AWS eu-west-1 (VPC)
